public inbox for binutils@sourceware.org
* [PATCH V5 0/9] Support Intel APX EGPR
@ 2023-12-28  1:27 Cui, Lili
  2023-12-28  1:27 ` [PATCH V5 1/9] Support APX GPR32 with rex2 prefix Cui, Lili
                   ` (8 more replies)
  0 siblings, 9 replies; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC
  To: binutils; +Cc: hongjiu.lu, jbeulich

Optimizations and fixes still needed in follow-up work:
1. The current implementation of VexVVVV needs to be optimized.
2. Convert vround* with EGPR operands to VRNDSCALE* instead of reporting an error.
3. Find a suitable variable to replace OperandConstraint=REX2_REQUIRED.
4. The current i386-gen.c does not handle "cpuid&(cpuid|APX_F)" correctly; a separate patch is required to fix this.

Cui, Lili (5):
  Support APX GPR32 with rex2 prefix
  Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
  Support APX GPR32 with extend evex prefix
  Add tests for APX GPR32 with extend evex prefix
  Support APX pushp/popp

Hu, Lin1 (2):
  Support APX NDD optimized encoding.
  Support APX JMPABS for disassembler

Mo, Zewei (1):
  Support APX Push2/Pop2

konglin1 (1):
  Support APX NDD

 gas/config/tc-i386.c                          | 461 ++++++++++--
 gas/doc/c-i386.texi                           |   7 +-
 gas/testsuite/gas/i386/apx-push2pop2-inval.l  |   5 +
 gas/testsuite/gas/i386/apx-push2pop2-inval.s  |   9 +
 gas/testsuite/gas/i386/i386.exp               |   1 +
 .../i386/ilp32/x86-64-opcode-inval-intel.d    |  47 +-
 .../gas/i386/ilp32/x86-64-opcode-inval.d      |  47 +-
 gas/testsuite/gas/i386/rex-bad.l              |   8 +-
 .../gas/i386/x86-64-apx-egpr-inval.l          | 202 +++++
 .../gas/i386/x86-64-apx-egpr-inval.s          | 209 ++++++
 .../gas/i386/x86-64-apx-egpr-promote-inval.l  |  20 +
 .../gas/i386/x86-64-apx-egpr-promote-inval.s  |  29 +
 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d |  20 +
 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s |  21 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.d   |  41 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.s   |  49 ++
 .../gas/i386/x86-64-apx-evex-promoted-intel.d | 318 ++++++++
 .../gas/i386/x86-64-apx-evex-promoted.d       | 318 ++++++++
 .../gas/i386/x86-64-apx-evex-promoted.s       | 314 ++++++++
 .../gas/i386/x86-64-apx-jmpabs-intel.d        |  12 +
 .../gas/i386/x86-64-apx-jmpabs-inval.d        |  40 +
 .../gas/i386/x86-64-apx-jmpabs-inval.s        |  15 +
 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d    |  12 +
 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s    |   5 +
 .../gas/i386/x86-64-apx-ndd-optimize.d        | 132 ++++
 .../gas/i386/x86-64-apx-ndd-optimize.s        | 125 ++++
 gas/testsuite/gas/i386/x86-64-apx-ndd.d       | 160 ++++
 gas/testsuite/gas/i386/x86-64-apx-ndd.s       | 155 ++++
 .../gas/i386/x86-64-apx-push2pop2-intel.d     |  42 ++
 .../gas/i386/x86-64-apx-push2pop2-inval.l     |  13 +
 .../gas/i386/x86-64-apx-push2pop2-inval.s     |  17 +
 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d |  42 ++
 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s |  39 +
 .../gas/i386/x86-64-apx-pushp-popp-intel.d    |  14 +
 .../gas/i386/x86-64-apx-pushp-popp-inval.l    |   5 +
 .../gas/i386/x86-64-apx-pushp-popp-inval.s    |   7 +
 .../gas/i386/x86-64-apx-pushp-popp.d          |  14 +
 .../gas/i386/x86-64-apx-pushp-popp.s          |   8 +
 gas/testsuite/gas/i386/x86-64-apx-rex2.d      |  83 +++
 gas/testsuite/gas/i386/x86-64-apx-rex2.s      |  85 +++
 gas/testsuite/gas/i386/x86-64-evex.d          |   2 +-
 .../gas/i386/x86-64-opcode-inval-intel.d      |  26 +-
 gas/testsuite/gas/i386/x86-64-opcode-inval.d  |  26 +-
 gas/testsuite/gas/i386/x86-64-opcode-inval.s  |   4 -
 gas/testsuite/gas/i386/x86-64-pseudos-bad.l   |  75 +-
 gas/testsuite/gas/i386/x86-64-pseudos-bad.s   |  74 ++
 gas/testsuite/gas/i386/x86-64-pseudos.d       |  63 ++
 gas/testsuite/gas/i386/x86-64-pseudos.s       |  64 ++
 gas/testsuite/gas/i386/x86-64.exp             |  20 +-
 include/opcode/i386.h                         |   4 +
 opcodes/i386-dis-evex-prefix.h                |  58 ++
 opcodes/i386-dis-evex-reg.h                   |  63 ++
 opcodes/i386-dis-evex-w.h                     |  10 +
 opcodes/i386-dis-evex-x86-64.h                |  50 ++
 opcodes/i386-dis-evex.h                       | 347 ++++++++-
 opcodes/i386-dis.c                            | 702 +++++++++++++-----
 opcodes/i386-gen.c                            |  52 +-
 opcodes/i386-opc.h                            |  27 +-
 opcodes/i386-opc.tbl                          | 204 ++++-
 opcodes/i386-reg.tbl                          |  64 ++
 60 files changed, 4637 insertions(+), 449 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.l
 create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.s
 create mode 100644 opcodes/i386-dis-evex-x86-64.h

-- 
2.25.1



* [PATCH V5 1/9] Support APX GPR32 with rex2 prefix
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:53   ` H.J. Lu
  2024-01-05 14:45   ` Jan Beulich
  2023-12-28  1:27 ` [PATCH V5 2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions Cui, Lili
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC
  To: binutils; +Cc: hongjiu.lu, jbeulich

APX uses the REX2 prefix to make the extended GPRs (EGPRs, r16-r31)
available to legacy instructions in opcode map 0 and map 1. Instructions
that do not support EGPRs are marked with the NoEgpr flag in i386-gen.c.
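
For readers who want to check the encodings in the new tests by hand, here
is a minimal standalone sketch of how the REX2 payload byte (the byte that
follows 0xd5) is put together. It only mirrors the layout documented in
build_rex2_prefix and the bit routing done by set_rex_rex2 in this patch;
the sketch_* names and SKETCH_* macros are hypothetical and not part of
binutils, and working from a flat 0-31 register number is a simplifying
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in the low nibble, following the classic REX.WRXB layout.
   These macro names are illustrative only.  */
#define SKETCH_REX_W 8
#define SKETCH_REX_R 4
#define SKETCH_REX_X 2
#define SKETCH_REX_B 1

/* Hypothetical helper: route one 5-bit GPR number (0-31) into the two
   accumulators.  Bit 3 of the number sets rex_bit in *rex, bit 4 sets
   the same bit in *rex2.  */
static void
sketch_set_rex_rex2 (unsigned int regnum, unsigned int rex_bit,
		     unsigned int *rex, unsigned int *rex2)
{
  assert (regnum < 32);
  if (regnum & 8)
    *rex |= rex_bit;
  if (regnum & 16)
    *rex2 |= rex_bit;
}

/* Hypothetical helper: compute the payload byte that follows 0xd5.
   Layout: | m | R4 X4 B4 | W R X B |, where map1 selects opcode map 1.  */
static uint8_t
sketch_rex2_payload (unsigned int map1, unsigned int rex2, unsigned int rex)
{
  return (uint8_t) (((map1 & 1) << 7) | ((rex2 & 7) << 4) | (rex & 15));
}
```

With these helpers, %r24 as the ModRM r/m operand of a map 0 instruction
yields payload 0x11, matching "d5 11 f6 c0 07" (test $0x7,%r24b) in
x86-64-apx-rex2.d, and %r16d as the ModRM reg operand of a map 1
instruction yields 0xc0, matching "d5 c0 af c0" (imul %eax,%r16d).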

gas/ChangeLog:

2023-12-28  Lingling Kong <lingling.kong@intel.com>
	    H.J. Lu  <hongjiu.lu@intel.com>
	    Lili Cui <lili.cui@intel.com>
	    Lin Hu   <lin1.hu@intel.com>

	* config/tc-i386.c
	(enum i386_error): Add unsupported_EGPR_for_addressing
	and invalid_pseudo_prefix.
	(struct _i386_insn): Add rex2 and rex2_encoding for
	gpr32.
	(cpu_arch): Add apx_f.
	(_is_cpu): Ditto.
	(register_number): Handle RegRex2 for gpr32.
	(is_apx_rex2_encoding): New func. Test rex2 prefix encoding.
	(build_rex2_prefix): New func. Build the rex2 prefix for legacy
	insns in opcode map 0/1 that use gpr32.
	(establish_rex): Handle rex2 and rex2_encoding.
	(optimize_encoding): Handle r16-r31 registers.
	(md_assemble): Handle apx encoding.
	(parse_insn): Handle Prefix_REX2.
	(check_EgprOperands): New func. Check if Egpr operands
	are valid for the instruction.
	(match_template): Handle Egpr operands check.
	(set_rex_rex2): New func. Set i.rex and i.rex2.
	(build_modrm_byte): Use set_rex_rex2.
	(output_insn): Handle rex2 2-byte prefix output.
	(check_register): Treat egprs as illegal without target apx,
	outside 64-bit mode, and with rex_prefix.
	* doc/c-i386.texi: Document .apx_f and the {rex2} pseudo prefix.
	* testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d: D5 valid
	in 64-bit mode.
	* testsuite/gas/i386/ilp32/x86-64-opcode-inval.d: Ditto.
	* testsuite/gas/i386/rex-bad.l: Adjust rex testcase.
	* testsuite/gas/i386/x86-64-opcode-inval-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-opcode-inval.d: Ditto.
	* testsuite/gas/i386/x86-64-opcode-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-pseudos-bad.l: Add illegal rex2 test.
	* testsuite/gas/i386/x86-64-pseudos-bad.s: Ditto.
	* testsuite/gas/i386/x86-64-pseudos.d: Add rex2 test.
	* testsuite/gas/i386/x86-64-pseudos.s: Ditto.
	* testsuite/gas/i386/x86-64.exp: Run APX tests.
	* testsuite/gas/i386/x86-64-apx-egpr-inval.l: New test.
	* testsuite/gas/i386/x86-64-apx-egpr-inval.s: New test.
	* testsuite/gas/i386/x86-64-apx-rex2.d: New test.
	* testsuite/gas/i386/x86-64-apx-rex2.s: New test.

include/ChangeLog:

	* opcode/i386.h (REX2_OPCODE): New.
	(REX2_M): Ditto.

opcodes/ChangeLog:

	* i386-dis.c (struct instr_info): Add erex for gpr32.
	Add last_erex_prefix for rex2 prefix.
	(REX2_M): Extend for gpr32.
	(PREFIX_REX2): Ditto.
	(PREFIX_REX2_ILLEGAL): Ditto.
	(ckprefix): Ditto.
	(prefix_name): Ditto.
	(print_insn): Ditto.
	(print_register): Ditto.
	(OP_E_memory): Ditto.
	(OP_REG): Ditto.
	(OP_EX): Ditto.
	* i386-gen.c (rex2_disallowed): Some instructions do not allow the rex2 prefix.
	(process_i386_opcode_modifier): Set NoEgpr for VEX and some special instructions.
	(output_i386_opcode): Handle if_entry_needs_special_handle.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Regenerated.
	* i386-opc.h (enum i386_cpu): Add CpuAPX_F.
	(NoEgpr): New.
	(Prefix_NoOptimize): Ditto.
	(Prefix_REX2): Ditto.
	(RegRex2): Ditto.
	* i386-opc.tbl: Add rex2 prefix.
	* i386-reg.tbl: Add egprs (r16-r31).
	* i386-tbl.h: Regenerated.
---
 gas/config/tc-i386.c                          | 178 ++++++++++--
 gas/doc/c-i386.texi                           |   7 +-
 .../i386/ilp32/x86-64-opcode-inval-intel.d    |  47 +---
 .../gas/i386/ilp32/x86-64-opcode-inval.d      |  47 +---
 gas/testsuite/gas/i386/rex-bad.l              |   8 +-
 .../gas/i386/x86-64-apx-egpr-inval.l          |  15 +
 .../gas/i386/x86-64-apx-egpr-inval.s          |  18 ++
 gas/testsuite/gas/i386/x86-64-apx-rex2.d      |  83 ++++++
 gas/testsuite/gas/i386/x86-64-apx-rex2.s      |  85 ++++++
 .../gas/i386/x86-64-opcode-inval-intel.d      |  26 +-
 gas/testsuite/gas/i386/x86-64-opcode-inval.d  |  26 +-
 gas/testsuite/gas/i386/x86-64-opcode-inval.s  |   4 -
 gas/testsuite/gas/i386/x86-64-pseudos-bad.l   |  75 ++++-
 gas/testsuite/gas/i386/x86-64-pseudos-bad.s   |  74 +++++
 gas/testsuite/gas/i386/x86-64-pseudos.d       |  21 ++
 gas/testsuite/gas/i386/x86-64-pseudos.s       |  21 ++
 gas/testsuite/gas/i386/x86-64.exp             |   2 +
 include/opcode/i386.h                         |   4 +
 opcodes/i386-dis.c                            | 257 ++++++++++++------
 opcodes/i386-gen.c                            |  50 +++-
 opcodes/i386-opc.h                            |  13 +-
 opcodes/i386-opc.tbl                          |  27 +-
 opcodes/i386-reg.tbl                          |  64 +++++
 23 files changed, 886 insertions(+), 266 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index cdd3b55c655..bb302f28add 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -239,6 +239,7 @@ enum i386_error
     bad_imm4,
     unsupported_with_intel_mnemonic,
     unsupported_syntax,
+    unsupported_EGPR_for_addressing,
     unsupported,
     unsupported_on_arch,
     unsupported_64bit,
@@ -249,6 +250,7 @@ enum i386_error
     invalid_vector_register_set,
     invalid_tmm_register_set,
     invalid_dest_and_src_register_set,
+    invalid_pseudo_prefix,
     unsupported_vector_index_register,
     unsupported_broadcast,
     broadcast_needed,
@@ -356,6 +358,7 @@ struct _i386_insn
     modrm_byte rm;
     rex_byte rex;
     rex_byte vrex;
+    rex_byte rex2;
     sib_byte sib;
     vex_prefix vex;
 
@@ -429,6 +432,9 @@ struct _i386_insn
     /* Prefer the REX byte in encoding.  */
     bool rex_encoding;
 
+    /* Prefer the REX2 prefix in encoding.  */
+    bool rex2_encoding;
+
     /* Disable instruction size optimization.  */
     bool no_optimize;
 
@@ -1149,6 +1155,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (pbndkb, PBNDKB, PBNDKB, false),
   VECARCH (avx10.1, AVX10_1, ANY_AVX512F, set),
   SUBARCH (user_msr, USER_MSR, USER_MSR, false),
+  SUBARCH (apx_f, APX_F, APX_F, false),
 };
 
 #undef SUBARCH
@@ -1664,6 +1671,7 @@ _is_cpu (const i386_cpu_attr *a, enum i386_cpu cpu)
     case CpuHLE:      return a->bitfield.cpuhle;
     case CpuAVX512F:  return a->bitfield.cpuavx512f;
     case CpuAVX512VL: return a->bitfield.cpuavx512vl;
+    case CpuAPX_F:    return a->bitfield.cpuapx_f;
     case Cpu64:       return a->bitfield.cpu64;
     case CpuNo64:     return a->bitfield.cpuno64;
     default:
@@ -2335,7 +2343,7 @@ register_number (const reg_entry *r)
   if (r->reg_flags & RegRex)
     nr += 8;
 
-  if (r->reg_flags & RegVRex)
+  if (r->reg_flags & (RegVRex | RegRex2))
     nr += 16;
 
   return nr;
@@ -3871,6 +3879,12 @@ is_any_vex_encoding (const insn_template *t)
   return t->opcode_modifier.vex || t->opcode_modifier.evex;
 }
 
+static INLINE bool
+is_apx_rex2_encoding (void)
+{
+  return i.rex2 || i.rex2_encoding;
+}
+
 static unsigned int
 get_broadcast_bytes (const insn_template *t, bool diag)
 {
@@ -4126,6 +4140,22 @@ build_evex_prefix (void)
     i.vex.bytes[3] |= i.mask.reg->reg_num;
 }
 
+/* Build (2 bytes) rex2 prefix.
+   | D5h |
+   | m | R4 X4 B4 | W R X B |
+
+   Rex2 reuses i.vex as they both encode i.tm.opcode_space in their prefixes.
+ */
+static void
+build_rex2_prefix (void)
+{
+  i.vex.length = 2;
+  i.vex.bytes[0] = 0xd5;
+  /* For the W R X B bits, the variables of rex prefix will be reused.  */
+  i.vex.bytes[1] = ((i.tm.opcode_space << 7)
+		    | (i.rex2 << 4) | i.rex);
+}
+
 static void establish_rex (void)
 {
   /* Note that legacy encodings have at most 2 non-immediate operands.  */
@@ -4140,13 +4170,16 @@ static void establish_rex (void)
      registers to new ones.  */
 
   if ((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
-       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0))
+       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0
+	   || i.rex2 != 0))
       || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
-	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0)))
+	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0
+	      || i.rex2 != 0)))
     {
       unsigned int x;
 
-      i.rex |= REX_OPCODE;
+      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
+	i.rex |= REX_OPCODE;
       for (x = first; x <= last; x++)
 	{
 	  /* Look for 8 bit operand that uses old registers.  */
@@ -4157,7 +4190,7 @@ static void establish_rex (void)
 	      /* In case it is "hi" register, give up.  */
 	      if (i.op[x].regs->reg_num > 3)
 		as_bad (_("can't encode register '%s%s' in an "
-			  "instruction requiring REX prefix"),
+			  "instruction requiring REX/REX2 prefix"),
 			register_prefix, i.op[x].regs->reg_name);
 
 	      /* Otherwise it is equivalent to the extended register.
@@ -4168,11 +4201,11 @@ static void establish_rex (void)
 	}
     }
 
-  if (i.rex == 0 && i.rex_encoding)
+   if (i.rex == 0 && i.rex2 == 0 && (i.rex_encoding || i.rex2_encoding))
     {
       /* Check if we can add a REX_OPCODE byte.  Look for 8 bit operand
 	 that uses legacy register.  If it is "hi" register, don't add
-	 the REX_OPCODE byte.  */
+	 rex and rex2 prefix.  */
       unsigned int x;
 
       for (x = first; x <= last; x++)
@@ -4183,6 +4216,7 @@ static void establish_rex (void)
 	  {
 	    gas_assert (!(i.op[x].regs->reg_flags & RegRex));
 	    i.rex_encoding = false;
+	    i.rex2_encoding = false;
 	    break;
 	  }
 
@@ -4190,8 +4224,14 @@ static void establish_rex (void)
 	i.rex = REX_OPCODE;
     }
 
-  if (i.rex != 0)
-    add_prefix (REX_OPCODE | i.rex);
+   if (is_apx_rex2_encoding ())
+     {
+       build_rex2_prefix ();
+       /* The individual REX.RXBW bits got consumed.  */
+       i.rex &= REX_OPCODE;
+     }
+   else if (i.rex != 0)
+     add_prefix (REX_OPCODE | i.rex);
 }
 
 static void
@@ -4457,14 +4497,22 @@ optimize_encoding (void)
 	  i.types[1].bitfield.byte = 1;
 	  /* Ignore the suffix.  */
 	  i.suffix = 0;
-	  /* Convert to byte registers.  */
+	  /* Convert to byte registers. 8-bit registers are special,
+	     RegRex64 and non-RegRex64 each have 8 registers.  */
 	  if (i.types[1].bitfield.word)
-	    j = 16;
-	  else if (i.types[1].bitfield.dword)
+	    /* 32 (or 40) 8-bit registers.  */
 	    j = 32;
+	  else if (i.types[1].bitfield.dword)
+	    /* 32 (or 40) 8-bit registers + 32 16-bit registers.  */
+	    j = 64;
 	  else
-	    j = 48;
-	  if (!(i.op[1].regs->reg_flags & RegRex) && base_regnum < 4)
+	    /* 32 (or 40) 8-bit registers + 32 16-bit registers
+	       + 32 32-bit registers.  */
+	    j = 96;
+
+	  /* In 64-bit mode, the following byte registers cannot be accessed
+	     if using the Rex and Rex2 prefix: AH, BH, CH, DH */
+	  if (!(i.op[1].regs->reg_flags & (RegRex | RegRex2)) && base_regnum < 4)
 	    j += 8;
 	  i.op[1].regs -= j;
 	}
@@ -5354,6 +5402,9 @@ md_assemble (char *line)
 	case unsupported_syntax:
 	  err_msg = _("unsupported syntax");
 	  break;
+	case unsupported_EGPR_for_addressing:
+	  err_msg = _("extended GPR cannot be used as base/index");
+	  break;
 	case unsupported:
 	  as_bad (_("unsupported instruction `%s'"),
 		  pass1_mnem ? pass1_mnem : insn_name (current_templates.start));
@@ -5407,6 +5458,9 @@ md_assemble (char *line)
 	case invalid_dest_and_src_register_set:
 	  err_msg = _("destination and source registers must be distinct");
 	  break;
+	case invalid_pseudo_prefix:
+	  err_msg = _("rex2 pseudo prefix cannot be used");
+	  break;
 	case unsupported_vector_index_register:
 	  err_msg = _("unsupported vector index register");
 	  break;
@@ -5662,6 +5716,13 @@ md_assemble (char *line)
 	  return;
 	}
 
+      /* Check for explicit REX2 prefix.  */
+      if (i.rex2_encoding)
+	{
+	  as_bad (_("{rex2} prefix invalid with `%s'"), insn_name (&i.tm));
+	  return;
+	}
+
       if (i.tm.opcode_modifier.vex)
 	build_vex_prefix (t);
       else
@@ -5868,6 +5929,10 @@ parse_insn (const char *line, char *mnemonic, bool prefix_only)
 		  /* {rex} */
 		  i.rex_encoding = true;
 		  break;
+		case Prefix_REX2:
+		  /* {rex2} */
+		  i.rex2_encoding = true;
+		  break;
 		case Prefix_NoOptimize:
 		  /* {nooptimize} */
 		  i.no_optimize = true;
@@ -7015,6 +7080,43 @@ VEX_check_encoding (const insn_template *t)
   return 0;
 }
 
+/* Check if Egprs operands are valid for the instruction.  */
+
+static bool
+check_EgprOperands (const insn_template *t)
+{
+  if (!t->opcode_modifier.noegpr)
+    return 0;
+
+  for (unsigned int op = 0; op < i.operands; op++)
+    {
+      if (i.types[op].bitfield.class != Reg)
+	continue;
+
+      if (i.op[op].regs->reg_flags & RegRex2)
+	{
+	  i.error = register_type_mismatch;
+	  return 1;
+	}
+    }
+
+  if ((i.index_reg && (i.index_reg->reg_flags & RegRex2))
+      || (i.base_reg && (i.base_reg->reg_flags & RegRex2)))
+    {
+      i.error = unsupported_EGPR_for_addressing;
+      return 1;
+    }
+
+  /* Check if pseudo prefix {rex2} is valid.  */
+  if (i.rex2_encoding)
+    {
+      i.error = invalid_pseudo_prefix;
+      return 1;
+    }
+
+  return 0;
+}
+
 /* Helper function for the progress() macro in match_template().  */
 static INLINE enum i386_error progress (enum i386_error new,
 					enum i386_error last,
@@ -7159,6 +7261,13 @@ match_template (char mnem_suffix)
 	      continue;
 	    }
 
+	  /* Check if pseudo prefix {rex2} is valid.  */
+	  if (t->opcode_modifier.noegpr && i.rex2_encoding)
+	    {
+	      specific_error = progress (invalid_pseudo_prefix);
+	      continue;
+	    }
+
 	  /* We've found a match; break out of loop.  */
 	  break;
 	}
@@ -7489,6 +7598,13 @@ match_template (char mnem_suffix)
 	  continue;
 	}
 
+      /* Check if EGPR operands(r16-r31) are valid.  */
+      if (check_EgprOperands (t))
+	{
+	  specific_error = progress (i.error);
+	  continue;
+	}
+
       /* Check whether to use the shorter VEX encoding for certain insns where
 	 the EVEX encoding comes first in the table.  This requires the respective
 	 AVX-* feature to be explicitly enabled.
@@ -8387,6 +8503,18 @@ static INLINE void set_rex_vrex (const reg_entry *r, unsigned int rex_bit,
 
   if (r->reg_flags & RegVRex)
     i.vrex |= rex_bit;
+
+  if (r->reg_flags & RegRex2)
+    i.rex2 |= rex_bit;
+}
+
+static INLINE void
+set_rex_rex2 (const reg_entry *r, unsigned int rex_bit)
+{
+  if ((r->reg_flags & RegRex) != 0)
+    i.rex |= rex_bit;
+  if ((r->reg_flags & RegRex2) != 0)
+    i.rex2 |= rex_bit;
 }
 
 static int
@@ -8870,8 +8998,7 @@ build_modrm_byte (void)
 		  i.rm.regmem = ESCAPE_TO_TWO_BYTE_ADDRESSING;
 		  i.types[op] = operand_type_and_not (i.types[op], anydisp);
 		  i.types[op].bitfield.disp32 = 1;
-		  if ((i.index_reg->reg_flags & RegRex) != 0)
-		    i.rex |= REX_X;
+		  set_rex_rex2 (i.index_reg, REX_X);
 		}
 	    }
 	  /* RIP addressing for 64bit mode.  */
@@ -8942,8 +9069,7 @@ build_modrm_byte (void)
 
 	      if (!i.tm.opcode_modifier.sib)
 		i.rm.regmem = i.base_reg->reg_num;
-	      if ((i.base_reg->reg_flags & RegRex) != 0)
-		i.rex |= REX_B;
+	      set_rex_rex2 (i.base_reg, REX_B);
 	      i.sib.base = i.base_reg->reg_num;
 	      /* x86-64 ignores REX prefix bit here to avoid decoder
 		 complications.  */
@@ -8981,8 +9107,7 @@ build_modrm_byte (void)
 		  else
 		    i.sib.index = i.index_reg->reg_num;
 		  i.rm.regmem = ESCAPE_TO_TWO_BYTE_ADDRESSING;
-		  if ((i.index_reg->reg_flags & RegRex) != 0)
-		    i.rex |= REX_X;
+		  set_rex_rex2 (i.index_reg, REX_X);
 		}
 
 	      if (i.disp_operands
@@ -10126,6 +10251,12 @@ output_insn (const struct last_insn *last_insn)
 	  for (j = ARRAY_SIZE (i.prefix), q = i.prefix; j > 0; j--, q++)
 	    if (*q)
 	      frag_opcode_byte (*q);
+
+	  if (is_apx_rex2_encoding ())
+	    {
+	      frag_opcode_byte (i.vex.bytes[0]);
+	      frag_opcode_byte (i.vex.bytes[1]);
+	    }
 	}
       else
 	{
@@ -14164,6 +14295,13 @@ static bool check_register (const reg_entry *r)
 	i.vec_encoding = vex_encoding_error;
     }
 
+  if (r->reg_flags & RegRex2)
+    {
+      if (!cpu_arch_flags.bitfield.cpuapx_f
+	  || flag_code != CODE_64BIT)
+	return false;
+    }
+
   if (((r->reg_flags & (RegRex64 | RegRex)) || r->reg_type.bitfield.qword)
       && (!cpu_arch_flags.bitfield.cpu64
 	  || r->reg_type.bitfield.class != RegCR
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 03ee980bef7..21f48c93300 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -217,6 +217,7 @@ accept various extension mnemonics.  For example,
 @code{avx10.1/256},
 @code{avx10.1/128},
 @code{user_msr},
+@code{apx_f},
 @code{amx_int8},
 @code{amx_bf16},
 @code{amx_fp16},
@@ -983,6 +984,10 @@ Different encoding options can be specified via pseudo prefixes:
 instructions (x86-64 only).  Note that this differs from the @samp{rex}
 prefix which generates REX prefix unconditionally.
 
+@item
+@samp{@{rex2@}} -- prefer REX2 prefix for integer and legacy vector
+instructions (APX_F only).
+
 @item
 @samp{@{nooptimize@}} -- disable instruction size optimization.
 @end itemize
@@ -1663,7 +1668,7 @@ supported on the CPU specified.  The choices for @var{cpu_type} are:
 @item @samp{.lwp} @tab @samp{.fma4} @tab @samp{.xop} @tab @samp{.cx16}
 @item @samp{.padlock} @tab @samp{.clzero} @tab @samp{.mwaitx} @tab @samp{.rdpru}
 @item @samp{.mcommit} @tab @samp{.sev_es} @tab @samp{.snp} @tab @samp{.invlpgb}
-@item @samp{.tlbsync}
+@item @samp{.tlbsync} @tab @samp{.apx_f}
 @end multitable
 
 Apart from the warning, there are only two other effects on
diff --git a/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d b/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d
index a2b09d2e74f..56834371133 100644
--- a/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d
+++ b/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d
@@ -2,49 +2,4 @@
 #as: --32
 #objdump: -dw -Mx86-64 -Mintel
 #name: x86-64 (ILP32) illegal opcodes (Intel mode)
-
-.*: +file format .*
-
-Disassembly of section .text:
-
-0+ <aaa>:
-[ 	]*[a-f0-9]+:	37                   	\(bad\)
-
-0+1 <aad0>:
-[ 	]*[a-f0-9]+:	d5                   	\(bad\)
-[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
-
-0+3 <aad1>:
-[ 	]*[a-f0-9]+:	d5                   	\(bad\)
-[ 	]*[a-f0-9]+:	02                   	.byte 0x2
-
-0+5 <aam0>:
-[ 	]*[a-f0-9]+:	d4                   	\(bad\)
-[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
-
-0+7 <aam1>:
-[ 	]*[a-f0-9]+:	d4                   	\(bad\)
-[ 	]*[a-f0-9]+:	02                   	.byte 0x2
-
-0+9 <aas>:
-[ 	]*[a-f0-9]+:	3f                   	\(bad\)
-
-0+a <bound>:
-[ 	]*[a-f0-9]+:	62                   	.byte 0x62
-[ 	]*[a-f0-9]+:	10                   	.byte 0x10
-
-0+c <daa>:
-[ 	]*[a-f0-9]+:	27                   	\(bad\)
-
-0+d <das>:
-[ 	]*[a-f0-9]+:	2f                   	\(bad\)
-
-0+e <into>:
-[ 	]*[a-f0-9]+:	ce                   	\(bad\)
-
-0+f <pusha>:
-[ 	]*[a-f0-9]+:	60                   	\(bad\)
-
-0+10 <popa>:
-[ 	]*[a-f0-9]+:	61                   	\(bad\)
-#pass
+#dump: ../x86-64-opcode-inval-intel.d
diff --git a/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval.d b/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval.d
index 5a17b0b412e..b5233a5cf93 100644
--- a/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval.d
+++ b/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval.d
@@ -2,49 +2,4 @@
 #as: --32
 #objdump: -dw -Mx86-64
 #name: x86-64 (ILP32) illegal opcodes
-
-.*: +file format .*
-
-Disassembly of section .text:
-
-0+ <aaa>:
-[ 	]*[a-f0-9]+:	37                   	\(bad\)
-
-0+1 <aad0>:
-[ 	]*[a-f0-9]+:	d5                   	\(bad\)
-[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
-
-0+3 <aad1>:
-[ 	]*[a-f0-9]+:	d5                   	\(bad\)
-[ 	]*[a-f0-9]+:	02                   	.byte 0x2
-
-0+5 <aam0>:
-[ 	]*[a-f0-9]+:	d4                   	\(bad\)
-[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
-
-0+7 <aam1>:
-[ 	]*[a-f0-9]+:	d4                   	\(bad\)
-[ 	]*[a-f0-9]+:	02                   	.byte 0x2
-
-0+9 <aas>:
-[ 	]*[a-f0-9]+:	3f                   	\(bad\)
-
-0+a <bound>:
-[ 	]*[a-f0-9]+:	62                   	.byte 0x62
-[ 	]*[a-f0-9]+:	10                   	.byte 0x10
-
-0+c <daa>:
-[ 	]*[a-f0-9]+:	27                   	\(bad\)
-
-0+d <das>:
-[ 	]*[a-f0-9]+:	2f                   	\(bad\)
-
-0+e <into>:
-[ 	]*[a-f0-9]+:	ce                   	\(bad\)
-
-0+f <pusha>:
-[ 	]*[a-f0-9]+:	60                   	\(bad\)
-
-0+10 <popa>:
-[ 	]*[a-f0-9]+:	61                   	\(bad\)
-#pass
+#dump: ../x86-64-opcode-inval.d
diff --git a/gas/testsuite/gas/i386/rex-bad.l b/gas/testsuite/gas/i386/rex-bad.l
index 407558ec541..abd4d3045d0 100644
--- a/gas/testsuite/gas/i386/rex-bad.l
+++ b/gas/testsuite/gas/i386/rex-bad.l
@@ -3,8 +3,8 @@
 .*:5: Error: same .*
 .*:6: Error: same .*
 .*:7: Error: same .*
-.*:9: Error: .* REX .*
-.*:10: Error: .* REX .*
-.*:12: Error: .* REX .*
-.*:13: Error: .* REX .*
+.*:9: Error: .* REX/REX2 .*
+.*:10: Error: .* REX/REX2 .*
+.*:12: Error: .* REX/REX2 .*
+.*:13: Error: .* REX/REX2 .*
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
new file mode 100644
index 00000000000..bb5c602a2e2
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
@@ -0,0 +1,15 @@
+.*: Assembler messages:
+.*:4: Error: bad register name `%r17d'
+.*:7: Error: extended GPR cannot be used as base/index for `xsave'
+.*:8: Error: extended GPR cannot be used as base/index for `xsave64'
+.*:9: Error: extended GPR cannot be used as base/index for `xrstor'
+.*:10: Error: extended GPR cannot be used as base/index for `xrstor64'
+.*:11: Error: extended GPR cannot be used as base/index for `xsaves'
+.*:12: Error: extended GPR cannot be used as base/index for `xsaves64'
+.*:13: Error: extended GPR cannot be used as base/index for `xrstors'
+.*:14: Error: extended GPR cannot be used as base/index for `xrstors64'
+.*:15: Error: extended GPR cannot be used as base/index for `xsaveopt'
+.*:16: Error: extended GPR cannot be used as base/index for `xsaveopt64'
+.*:17: Error: extended GPR cannot be used as base/index for `xsavec'
+.*:18: Error: extended GPR cannot be used as base/index for `xsavec64'
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
new file mode 100644
index 00000000000..bfb6b3fd03b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
@@ -0,0 +1,18 @@
+# Check illegal 64bit APX_F instructions
+	.text
+	.arch .noapx_f
+	test    $0x7, %r17d
+	.arch .apx_f
+	test    $0x7, %r17d
+	xsave (%r16, %rbx)
+	xsave64 (%r16, %r31)
+	xrstor (%r16, %rbx)
+	xrstor64 (%r16, %rbx)
+	xsaves (%rbx, %r16)
+	xsaves64 (%r16, %rbx)
+	xrstors (%rbx, %r31)
+	xrstors64 (%r16, %rbx)
+	xsaveopt (%r16, %rbx)
+	xsaveopt64 (%r16, %r31)
+	xsavec (%r16, %rbx)
+	xsavec64 (%r16, %r31)
diff --git a/gas/testsuite/gas/i386/x86-64-apx-rex2.d b/gas/testsuite/gas/i386/x86-64-apx-rex2.d
new file mode 100644
index 00000000000..e3cd534da11
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.d
@@ -0,0 +1,83 @@
+#as:
+#objdump: -dw
+#name: x86-64 APX_F use gpr32 with rex2 prefix
+#source: x86-64-apx-rex2.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+[	 ]*[a-f0-9]+:[	 ]*d5 11 f6 c0 07[	 ]+test   \$0x7,%r24b
+[	 ]*[a-f0-9]+:[	 ]*d5 11 f7 c0 07 00 00 00[	 ]+test   \$0x7,%r24d
+[	 ]*[a-f0-9]+:[	 ]*d5 19 f7 c0 07 00 00 00[	 ]+test   \$0x7,%r24
+[	 ]*[a-f0-9]+:[	 ]*66 d5 11 f7 c0 07 00[	 ]+test   \$0x7,%r24w
+[	 ]*[a-f0-9]+:[	 ]*44 0f af f8[	 ]+imul   %eax,%r15d
+[	 ]*[a-f0-9]+:[	 ]*d5 c0 af c0[	 ]+imul   %eax,%r16d
+[	 ]*[a-f0-9]+:[	 ]*d5 90 62 12[	 ]+punpckldq %mm2,\(%r18\)
+[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 00[	 ]+lea    \(%rax\),%r16d
+[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 08[	 ]+lea    \(%rax\),%r17d
+[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 10[	 ]+lea    \(%rax\),%r18d
+[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 18[	 ]+lea    \(%rax\),%r19d
+[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 20[	 ]+lea    \(%rax\),%r20d
+[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 28[	 ]+lea    \(%rax\),%r21d
+[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 30[	 ]+lea    \(%rax\),%r22d
+[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 38[	 ]+lea    \(%rax\),%r23d
+[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 00[	 ]+lea    \(%rax\),%r24d
+[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 08[	 ]+lea    \(%rax\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 10[	 ]+lea    \(%rax\),%r26d
+[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 18[	 ]+lea    \(%rax\),%r27d
+[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 20[	 ]+lea    \(%rax\),%r28d
+[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 28[	 ]+lea    \(%rax\),%r29d
+[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 30[	 ]+lea    \(%rax\),%r30d
+[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 38[	 ]+lea    \(%rax\),%r31d
+[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 05 00 00 00 00[	 ]+lea    0x0\(,%r16,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 0d 00 00 00 00[	 ]+lea    0x0\(,%r17,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 15 00 00 00 00[	 ]+lea    0x0\(,%r18,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 1d 00 00 00 00[	 ]+lea    0x0\(,%r19,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 25 00 00 00 00[	 ]+lea    0x0\(,%r20,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 2d 00 00 00 00[	 ]+lea    0x0\(,%r21,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 35 00 00 00 00[	 ]+lea    0x0\(,%r22,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 3d 00 00 00 00[	 ]+lea    0x0\(,%r23,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 05 00 00 00 00[	 ]+lea    0x0\(,%r24,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 0d 00 00 00 00[	 ]+lea    0x0\(,%r25,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 15 00 00 00 00[	 ]+lea    0x0\(,%r26,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 1d 00 00 00 00[	 ]+lea    0x0\(,%r27,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 25 00 00 00 00[	 ]+lea    0x0\(,%r28,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 2d 00 00 00 00[	 ]+lea    0x0\(,%r29,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 35 00 00 00 00[	 ]+lea    0x0\(,%r30,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 3d 00 00 00 00[	 ]+lea    0x0\(,%r31,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 00[	 ]+lea    \(%r16\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 01[	 ]+lea    \(%r17\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 02[	 ]+lea    \(%r18\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 03[	 ]+lea    \(%r19\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 04 24       	lea    \(%r20\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 45 00       	lea    0x0\(%r21\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 06[	 ]+lea    \(%r22\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 07[	 ]+lea    \(%r23\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 00[	 ]+lea    \(%r24\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 01[	 ]+lea    \(%r25\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 02[	 ]+lea    \(%r26\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 03[	 ]+lea    \(%r27\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 04 24       	lea    \(%r28\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 45 00       	lea    0x0\(%r29\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 06          	lea    \(%r30\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 07          	lea    \(%r31\),%eax
+[	 ]*[a-f0-9]+:[	 ]*4c 8d 38             	lea    \(%rax\),%r15
+[	 ]*[a-f0-9]+:[	 ]*d5 48 8d 00          	lea    \(%rax\),%r16
+[	 ]*[a-f0-9]+:[	 ]*49 8d 07             	lea    \(%r15\),%rax
+[	 ]*[a-f0-9]+:[	 ]*d5 18 8d 00          	lea    \(%r16\),%rax
+[	 ]*[a-f0-9]+:[	 ]*4a 8d 04 3d 00 00 00 00 	lea    0x0\(,%r15,1\),%rax
+[	 ]*[a-f0-9]+:[	 ]*d5 28 8d 04 05 00 00 00 00 	lea    0x0\(,%r16,1\),%rax
+[	 ]*[a-f0-9]+:[	 ]*d5 1c 03 00          	add    \(%r16\),%r8
+[	 ]*[a-f0-9]+:[	 ]*d5 1c 03 38          	add    \(%r16\),%r15
+[	 ]*[a-f0-9]+:[	 ]*d5 4a 8b 04 0d 00 00 00 00 	mov    0x0\(,%r9,1\),%r16
+[	 ]*[a-f0-9]+:[	 ]*d5 4a 8b 04 35 00 00 00 00 	mov    0x0\(,%r14,1\),%r16
+[	 ]*[a-f0-9]+:[	 ]*d5 4d 2b 3a          	sub    \(%r10\),%r31
+[	 ]*[a-f0-9]+:[	 ]*d5 4d 2b 7d 00       	sub    0x0\(%r13\),%r31
+[	 ]*[a-f0-9]+:[	 ]*d5 30 8d 44 20 01    	lea    0x1\(%r16,%r20,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 76 8d 7c 20 01    	lea    0x1\(%r16,%r28,1\),%r31d
+[	 ]*[a-f0-9]+:[	 ]*d5 12 8d 84 04 81 00 00 00 	lea    0x81\(%r20,%r8,1\),%eax
+[	 ]*[a-f0-9]+:[	 ]*d5 57 8d bc 04 81 00 00 00 	lea    0x81\(%r28,%r8,1\),%r31d
+#pass
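The expected encodings above can be checked by hand against the REX2 payload layout (the byte following 0xd5). The following is a minimal sketch, not code from this patch: the names rex2_fields, rex2_decode, rex2_reg and rex2_rm are hypothetical, and the bit order M0 R4 X4 B4 W R3 X3 B3 (high to low) is assumed from the encodings in this test.

```c
#include <stdint.h>

/* Hypothetical helper type: the eight bits of a REX2 payload byte.
   Assumed layout, high bit to low bit: M0 R4 X4 B4 W R3 X3 B3.  */
struct rex2_fields
{
  unsigned m0, r4, x4, b4, w, r3, x3, b3;
};

static struct rex2_fields
rex2_decode (uint8_t payload)
{
  struct rex2_fields f;

  f.m0 = (payload >> 7) & 1;
  f.r4 = (payload >> 6) & 1;
  f.x4 = (payload >> 5) & 1;
  f.b4 = (payload >> 4) & 1;
  f.w = (payload >> 3) & 1;
  f.r3 = (payload >> 2) & 1;
  f.x3 = (payload >> 1) & 1;
  f.b3 = payload & 1;
  return f;
}

/* 5-bit register number selected by ModRM.reg plus R3 and R4.  */
static unsigned
rex2_reg (uint8_t payload, uint8_t modrm)
{
  struct rex2_fields f = rex2_decode (payload);

  return ((modrm >> 3) & 7) | (f.r3 << 3) | (f.r4 << 4);
}

/* 5-bit register number selected by ModRM.rm plus B3 and B4.
   Only meaningful when ModRM.rm names a register directly, i.e.
   no SIB byte follows.  */
static unsigned
rex2_rm (uint8_t payload, uint8_t modrm)
{
  struct rex2_fields f = rex2_decode (payload);

  return (modrm & 7) | (f.b3 << 3) | (f.b4 << 4);
}
```

For example, "d5 44 8d 00" has payload 0x44 and ModRM 0x00, so rex2_reg gives 24, matching the expected %r24d, and "d5 11 8d 07" gives rex2_rm equal to 31, matching (%r31).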
diff --git a/gas/testsuite/gas/i386/x86-64-apx-rex2.s b/gas/testsuite/gas/i386/x86-64-apx-rex2.s
new file mode 100644
index 00000000000..eaaaaa77dd7
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.s
@@ -0,0 +1,85 @@
+# Check 64bit instructions with rex2 prefix encoding
+
+	.allow_index_reg
+	.text
+_start:
+         test	$0x7, %r24b
+         test	$0x7, %r24d
+         test	$0x7, %r24
+         test	$0x7, %r24w
+## REX2.M bit
+         imull	%eax, %r15d
+         imull	%eax, %r16d
+         punpckldq (%r18), %mm2
+## REX2.R4 bit
+         leal	(%rax), %r16d
+         leal	(%rax), %r17d
+         leal	(%rax), %r18d
+         leal	(%rax), %r19d
+         leal	(%rax), %r20d
+         leal	(%rax), %r21d
+         leal	(%rax), %r22d
+         leal	(%rax), %r23d
+         leal	(%rax), %r24d
+         leal	(%rax), %r25d
+         leal	(%rax), %r26d
+         leal	(%rax), %r27d
+         leal	(%rax), %r28d
+         leal	(%rax), %r29d
+         leal	(%rax), %r30d
+         leal	(%rax), %r31d
+## REX2.X4 bit
+         leal	(,%r16), %eax
+         leal	(,%r17), %eax
+         leal	(,%r18), %eax
+         leal	(,%r19), %eax
+         leal	(,%r20), %eax
+         leal	(,%r21), %eax
+         leal	(,%r22), %eax
+         leal	(,%r23), %eax
+         leal	(,%r24), %eax
+         leal	(,%r25), %eax
+         leal	(,%r26), %eax
+         leal	(,%r27), %eax
+         leal	(,%r28), %eax
+         leal	(,%r29), %eax
+         leal	(,%r30), %eax
+         leal	(,%r31), %eax
+## REX2.B4 bit
+         leal	(%r16), %eax
+         leal	(%r17), %eax
+         leal	(%r18), %eax
+         leal	(%r19), %eax
+         leal	(%r20), %eax
+         leal	(%r21), %eax
+         leal	(%r22), %eax
+         leal	(%r23), %eax
+         leal	(%r24), %eax
+         leal	(%r25), %eax
+         leal	(%r26), %eax
+         leal	(%r27), %eax
+         leal	(%r28), %eax
+         leal	(%r29), %eax
+         leal	(%r30), %eax
+         leal	(%r31), %eax
+## REX2.W bit
+         leaq	(%rax), %r15
+         leaq	(%rax), %r16
+         leaq	(%r15), %rax
+         leaq	(%r16), %rax
+         leaq	(,%r15), %rax
+         leaq	(,%r16), %rax
+## REX2.R3 bit
+         add    (%r16), %r8
+         add    (%r16), %r15
+## REX2.X3 bit
+         mov    (,%r9), %r16
+         mov    (,%r14), %r16
+## REX2.B3 bit
+	 sub   (%r10), %r31
+	 sub   (%r13), %r31
+## SIB
+         leal	1(%r16, %r20), %eax
+         leal	1(%r16, %r28), %r31d
+         leal	129(%r20, %r8), %eax
+         leal	129(%r28, %r8), %r31d
diff --git a/gas/testsuite/gas/i386/x86-64-opcode-inval-intel.d b/gas/testsuite/gas/i386/x86-64-opcode-inval-intel.d
index 6ee5b2f95ce..66c4d2cddc0 100644
--- a/gas/testsuite/gas/i386/x86-64-opcode-inval-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-opcode-inval-intel.d
@@ -10,41 +10,33 @@ Disassembly of section .text:
 0+ <aaa>:
 [ 	]*[a-f0-9]+:	37                   	\(bad\)
 
-0+1 <aad0>:
-[ 	]*[a-f0-9]+:	d5                   	\(bad\)
-[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
-
-0+3 <aad1>:
-[ 	]*[a-f0-9]+:	d5                   	\(bad\)
-[ 	]*[a-f0-9]+:	02                   	.byte 0x2
-
-0+5 <aam0>:
+0+1 <aam0>:
 [ 	]*[a-f0-9]+:	d4                   	\(bad\)
 [ 	]*[a-f0-9]+:	0a                   	.byte 0xa
 
-0+7 <aam1>:
+0+3 <aam1>:
 [ 	]*[a-f0-9]+:	d4                   	\(bad\)
 [ 	]*[a-f0-9]+:	02                   	.byte 0x2
 
-0+9 <aas>:
+0+5 <aas>:
 [ 	]*[a-f0-9]+:	3f                   	\(bad\)
 
-0+a <bound>:
+0+6 <bound>:
 [ 	]*[a-f0-9]+:	62                   	.byte 0x62
 [ 	]*[a-f0-9]+:	10                   	.byte 0x10
 
-0+c <daa>:
+0+8 <daa>:
 [ 	]*[a-f0-9]+:	27                   	\(bad\)
 
-0+d <das>:
+0+9 <das>:
 [ 	]*[a-f0-9]+:	2f                   	\(bad\)
 
-0+e <into>:
+0+a <into>:
 [ 	]*[a-f0-9]+:	ce                   	\(bad\)
 
-0+f <pusha>:
+0+b <pusha>:
 [ 	]*[a-f0-9]+:	60                   	\(bad\)
 
-0+10 <popa>:
+0+c <popa>:
 [ 	]*[a-f0-9]+:	61                   	\(bad\)
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-opcode-inval.d b/gas/testsuite/gas/i386/x86-64-opcode-inval.d
index 12f02c1766c..fbb850b56da 100644
--- a/gas/testsuite/gas/i386/x86-64-opcode-inval.d
+++ b/gas/testsuite/gas/i386/x86-64-opcode-inval.d
@@ -9,41 +9,33 @@ Disassembly of section .text:
 0+ <aaa>:
 [ 	]*[a-f0-9]+:	37                   	\(bad\)
 
-0+1 <aad0>:
-[ 	]*[a-f0-9]+:	d5                   	\(bad\)
-[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
-
-0+3 <aad1>:
-[ 	]*[a-f0-9]+:	d5                   	\(bad\)
-[ 	]*[a-f0-9]+:	02                   	.byte 0x2
-
-0+5 <aam0>:
+0+1 <aam0>:
 [ 	]*[a-f0-9]+:	d4                   	\(bad\)
 [ 	]*[a-f0-9]+:	0a                   	.byte 0xa
 
-0+7 <aam1>:
+0+3 <aam1>:
 [ 	]*[a-f0-9]+:	d4                   	\(bad\)
 [ 	]*[a-f0-9]+:	02                   	.byte 0x2
 
-0+9 <aas>:
+0+5 <aas>:
 [ 	]*[a-f0-9]+:	3f                   	\(bad\)
 
-0+a <bound>:
+0+6 <bound>:
 [ 	]*[a-f0-9]+:	62                   	.byte 0x62
 [ 	]*[a-f0-9]+:	10                   	.byte 0x10
 
-0+c <daa>:
+0+8 <daa>:
 [ 	]*[a-f0-9]+:	27                   	\(bad\)
 
-0+d <das>:
+0+9 <das>:
 [ 	]*[a-f0-9]+:	2f                   	\(bad\)
 
-0+e <into>:
+0+a <into>:
 [ 	]*[a-f0-9]+:	ce                   	\(bad\)
 
-0+f <pusha>:
+0+b <pusha>:
 [ 	]*[a-f0-9]+:	60                   	\(bad\)
 
-0+10 <popa>:
+0+c <popa>:
 [ 	]*[a-f0-9]+:	61                   	\(bad\)
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-opcode-inval.s b/gas/testsuite/gas/i386/x86-64-opcode-inval.s
index 6cbfe7705a8..fbcda3df773 100644
--- a/gas/testsuite/gas/i386/x86-64-opcode-inval.s
+++ b/gas/testsuite/gas/i386/x86-64-opcode-inval.s
@@ -2,10 +2,6 @@
 # All the followings are illegal opcodes for x86-64.
 aaa:
 	aaa
-aad0:
-	aad
-aad1:
-	aad $2
 aam0:
 	aam
 aam1:
diff --git a/gas/testsuite/gas/i386/x86-64-pseudos-bad.l b/gas/testsuite/gas/i386/x86-64-pseudos-bad.l
index 3f9f67fcf4b..a72f847085d 100644
--- a/gas/testsuite/gas/i386/x86-64-pseudos-bad.l
+++ b/gas/testsuite/gas/i386/x86-64-pseudos-bad.l
@@ -1,6 +1,71 @@
 .*: Assembler messages:
-.*:3: Error: .*`vmovaps'.*
-.*:4: Error: .*`vmovaps'.*
-.*:5: Error: .*`vmovaps'.*
-.*:6: Error: .*`vmovaps'.*
-.*:7: Error: .*`rorx'.*
+.*:[0-9]+: Error: .*`vmovaps'.*
+.*:[0-9]+: Error: .*`vmovaps'.*
+.*:[0-9]+: Error: .*`vmovaps'.*
+.*:[0-9]+: Error: .*`vmovaps'.*
+.*:[0-9]+: Error: .*`rorx'.*
+.*:[0-9]+: Error: .*`vmovaps'.*
+.*:[0-9]+: Error: .*`xsave'.*
+.*:[0-9]+: Error: .*`xsaves'.*
+.*:[0-9]+: Error: .*`xsaves64'.*
+.*:[0-9]+: Error: .*`xsavec'.*
+.*:[0-9]+: Error: .*`xrstors'.*
+.*:[0-9]+: Error: .*`xrstors64'.*
+.*:[0-9]+: Error: .*`mov'.*
+.*:[0-9]+: Error: .*`movabs'.*
+.*:[0-9]+: Error: .*`cmps'.*
+.*:[0-9]+: Error: .*`lods'.*
+.*:[0-9]+: Error: .*`lods'.*
+.*:[0-9]+: Error: .*`lods'.*
+.*:[0-9]+: Error: .*`movs'.*
+.*:[0-9]+: Error: .*`movs'.*
+.*:[0-9]+: Error: .*`scas'.*
+.*:[0-9]+: Error: .*`scas'.*
+.*:[0-9]+: Error: .*`scas'.*
+.*:[0-9]+: Error: .*`stos'.*
+.*:[0-9]+: Error: .*`stos'.*
+.*:[0-9]+: Error: .*`stos'.*
+.*:[0-9]+: Error: .*`jo'.*
+.*:[0-9]+: Error: .*`jno'.*
+.*:[0-9]+: Error: .*`jb'.*
+.*:[0-9]+: Error: .*`jae'.*
+.*:[0-9]+: Error: .*`je'.*
+.*:[0-9]+: Error: .*`jne'.*
+.*:[0-9]+: Error: .*`jbe'.*
+.*:[0-9]+: Error: .*`ja'.*
+.*:[0-9]+: Error: .*`js'.*
+.*:[0-9]+: Error: .*`jns'.*
+.*:[0-9]+: Error: .*`jp'.*
+.*:[0-9]+: Error: .*`jnp'.*
+.*:[0-9]+: Error: .*`jl'.*
+.*:[0-9]+: Error: .*`jge'.*
+.*:[0-9]+: Error: .*`jle'.*
+.*:[0-9]+: Error: .*`jg'.*
+.*:[0-9]+: Error: .*`jo'.*
+.*:[0-9]+: Error: .*`jno'.*
+.*:[0-9]+: Error: .*`jb'.*
+.*:[0-9]+: Error: .*`jae'.*
+.*:[0-9]+: Error: .*`je'.*
+.*:[0-9]+: Error: .*`jne'.*
+.*:[0-9]+: Error: .*`jbe'.*
+.*:[0-9]+: Error: .*`ja'.*
+.*:[0-9]+: Error: .*`js'.*
+.*:[0-9]+: Error: .*`jns'.*
+.*:[0-9]+: Error: .*`jp'.*
+.*:[0-9]+: Error: .*`jnp'.*
+.*:[0-9]+: Error: .*`jl'.*
+.*:[0-9]+: Error: .*`jge'.*
+.*:[0-9]+: Error: .*`jle'.*
+.*:[0-9]+: Error: .*`jg'.*
+.*:[0-9]+: Error: .*`in'.*
+.*:[0-9]+: Error: .*`in'.*
+.*:[0-9]+: Error: .*`out'.*
+.*:[0-9]+: Error: .*`out'.*
+.*:[0-9]+: Error: .*`jmp'.*
+.*:[0-9]+: Error: .*`loop'.*
+.*:[0-9]+: Error: .*`wrmsr'.*
+.*:[0-9]+: Error: .*`rdtsc'.*
+.*:[0-9]+: Error: .*`rdmsr'.*
+.*:[0-9]+: Error: .*`sysenter'.*
+.*:[0-9]+: Error: .*`sysexit'.*
+.*:[0-9]+: Error: .*`rdpmc'.*
diff --git a/gas/testsuite/gas/i386/x86-64-pseudos-bad.s b/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
index 3b923593a6a..54c17a9eab7 100644
--- a/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
+++ b/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
@@ -5,3 +5,77 @@ pseudos:
 	{rex} vmovaps %xmm7,%xmm2
 	{rex} vmovaps %xmm17,%xmm2
 	{rex} rorx $7,%eax,%ebx
+	{rex2} vmovaps %xmm7,%xmm2
+	{rex2} xsave (%rax)
+	{rex2} xsaves (%ecx)
+	{rex2} xsaves64 (%ecx)
+	{rex2} xsavec (%ecx)
+	{rex2} xrstors (%ecx)
+	{rex2} xrstors64 (%ecx)
+
+	#All opcodes in row 0xA* (map0) are illegal with a REX2 prefix.
+	#{rex2} test (0xa8) is a special case: it is remapped to test (0xf6).
+	{rex2} mov    0x90909090,%al
+	{rex2} movabs 0x1,%al
+	{rex2} cmpsb  %es:(%edi),%ds:(%esi)
+	{rex2} lodsb
+	{rex2} lods   %ds:(%esi),%al
+	{rex2} lodsb   (%esi)
+	{rex2} movs
+	{rex2} movs   (%esi), (%edi)
+	{rex2} scasl
+	{rex2} scas   %es:(%edi),%eax
+	{rex2} scasb   (%edi)
+	{rex2} stosb
+	{rex2} stosb   (%edi)
+	{rex2} stos   %eax,%es:(%edi)
+
+	#All opcodes in rows 0x7* (map0) and 0x8* (map1) are illegal with a REX2 prefix.
+	{rex2} jo     .+2-0x70
+	{rex2} jno    .+2-0x70
+	{rex2} jb     .+2-0x70
+	{rex2} jae    .+2-0x70
+	{rex2} je     .+2-0x70
+	{rex2} jne    .+2-0x70
+	{rex2} jbe    .+2-0x70
+	{rex2} ja     .+2-0x70
+	{rex2} js     .+2-0x70
+	{rex2} jns    .+2-0x70
+	{rex2} jp     .+2-0x70
+	{rex2} jnp    .+2-0x70
+	{rex2} jl     .+2-0x70
+	{rex2} jge    .+2-0x70
+	{rex2} jle    .+2-0x70
+	{rex2} jg     .+2-0x70
+	{rex2} jo     .+6+0x90909090
+	{rex2} jno    .+6+0x90909090
+	{rex2} jb     .+6+0x90909090
+	{rex2} jae    .+6+0x90909090
+	{rex2} je     .+6+0x90909090
+	{rex2} jne    .+6+0x90909090
+	{rex2} jbe    .+6+0x90909090
+	{rex2} ja     .+6+0x90909090
+	{rex2} js     .+6+0x90909090
+	{rex2} jns    .+6+0x90909090
+	{rex2} jp     .+6+0x90909090
+	{rex2} jnp    .+6+0x90909090
+	{rex2} jl     .+6+0x90909090
+	{rex2} jge    .+6+0x90909090
+	{rex2} jle    .+6+0x90909090
+	{rex2} jg     .+6+0x90909090
+
+	#All opcodes in row 0xE* (map0) are illegal with a REX2 prefix.
+	{rex2} in $0x90,%al
+	{rex2} in $0x90
+	{rex2} out $0x90,%al
+	{rex2} out $0x90
+	{rex2} jmp  *%eax
+	{rex2} loop foo
+
+	#All opcodes in row 0x3* (map1) are illegal with a REX2 prefix.
+	{rex2} wrmsr
+	{rex2} rdtsc
+	{rex2} rdmsr
+	{rex2} sysenter
+	{rex2} sysexitl
+	{rex2} rdpmc
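The row rules stated in the comments of this test can be summarised as a small predicate. This is only a sketch of those comments, with a hypothetical function name; the disassembler tables elsewhere in this patch flag individual entries with PREFIX_REX2_ILLEGAL rather than whole rows, and the assembler-side remapping of test (0xa8) is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return true if OPCODE in MAP (0 or 1) falls in
   a row that the test comments describe as illegal with a REX2 prefix:
   rows 0x7*, 0xA* and 0xE* of map0, rows 0x3* and 0x8* of map1.  */
static bool
rex2_row_illegal (unsigned map, uint8_t opcode)
{
  uint8_t row = opcode >> 4;

  if (map == 0)
    return row == 0x7 || row == 0xa || row == 0xe;
  if (map == 1)
    return row == 0x3 || row == 0x8;
  return false;
}
```

With this predicate, 0x70 (jo, map0) and 0x30 (wrmsr, map1) are rejected, while 0x8d (lea, map0) and 0xaf (imul, map1) are accepted, in line with the x86-64-apx-rex2 test.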
diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.d b/gas/testsuite/gas/i386/x86-64-pseudos.d
index 866a804ab92..19dcd8415ac 100644
--- a/gas/testsuite/gas/i386/x86-64-pseudos.d
+++ b/gas/testsuite/gas/i386/x86-64-pseudos.d
@@ -404,6 +404,18 @@ Disassembly of section .text:
  +[a-f0-9]+:	41 0f 28 10          	movaps \(%r8\),%xmm2
  +[a-f0-9]+:	40 0f 38 01 01       	rex phaddw \(%rcx\),%mm0
  +[a-f0-9]+:	41 0f 38 01 00       	phaddw \(%r8\),%mm0
+ +[a-f0-9]+:	88 c4                	mov    %al,%ah
+ +[a-f0-9]+:	d5 00 d3 e0          	{rex2 0x0} shl %cl,%eax
+ +[a-f0-9]+:	d5 00 38 ca          	{rex2 0x0} cmp %cl,%dl
+ +[a-f0-9]+:	d5 00 b3 01          	{rex2 0x0} mov \$(0x)?1,%bl
+ +[a-f0-9]+:	d5 00 89 c3          	{rex2 0x0} mov %eax,%ebx
+ +[a-f0-9]+:	d5 01 89 c6          	{rex2 0x1} mov %eax,%r14d
+ +[a-f0-9]+:	d5 01 89 00          	{rex2 0x1} mov %eax,\(%r8\)
+ +[a-f0-9]+:	d5 80 28 d7          	{rex2 0x80} movaps %xmm7,%xmm2
+ +[a-f0-9]+:	d5 84 28 e7          	{rex2 0x84} movaps %xmm7,%xmm12
+ +[a-f0-9]+:	d5 80 28 11          	{rex2 0x80} movaps \(%rcx\),%xmm2
+ +[a-f0-9]+:	d5 81 28 10          	{rex2 0x81} movaps \(%r8\),%xmm2
+ +[a-f0-9]+:	d5 80 d5 f0          	{rex2 0x80} pmullw %mm0,%mm6
  +[a-f0-9]+:	8a 45 00             	mov    0x0\(%rbp\),%al
  +[a-f0-9]+:	8a 45 00             	mov    0x0\(%rbp\),%al
  +[a-f0-9]+:	8a 85 00 00 00 00    	mov    0x0\(%rbp\),%al
@@ -458,6 +470,15 @@ Disassembly of section .text:
  +[a-f0-9]+:	41 0f 28 10          	movaps \(%r8\),%xmm2
  +[a-f0-9]+:	40 0f 38 01 01       	rex phaddw \(%rcx\),%mm0
  +[a-f0-9]+:	41 0f 38 01 00       	phaddw \(%r8\),%mm0
+ +[a-f0-9]+:	88 c4                	mov    %al,%ah
+ +[a-f0-9]+:	d5 00 89 c3          	{rex2 0x0} mov %eax,%ebx
+ +[a-f0-9]+:	d5 01 89 c6          	{rex2 0x1} mov %eax,%r14d
+ +[a-f0-9]+:	d5 01 89 00          	{rex2 0x1} mov %eax,\(%r8\)
+ +[a-f0-9]+:	d5 80 28 d7          	{rex2 0x80} movaps %xmm7,%xmm2
+ +[a-f0-9]+:	d5 84 28 e7          	{rex2 0x84} movaps %xmm7,%xmm12
+ +[a-f0-9]+:	d5 80 28 11          	{rex2 0x80} movaps \(%rcx\),%xmm2
+ +[a-f0-9]+:	d5 81 28 10          	{rex2 0x81} movaps \(%r8\),%xmm2
+ +[a-f0-9]+:	d5 80 d5 f0          	{rex2 0x80} pmullw %mm0,%mm6
  +[a-f0-9]+:	8a 45 00             	mov    0x0\(%rbp\),%al
  +[a-f0-9]+:	8a 45 00             	mov    0x0\(%rbp\),%al
  +[a-f0-9]+:	8a 85 00 00 00 00    	mov    0x0\(%rbp\),%al
diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.s b/gas/testsuite/gas/i386/x86-64-pseudos.s
index 06f0b62d049..5a53c363615 100644
--- a/gas/testsuite/gas/i386/x86-64-pseudos.s
+++ b/gas/testsuite/gas/i386/x86-64-pseudos.s
@@ -360,6 +360,18 @@ _start:
 	{rex} movaps (%r8),%xmm2
 	{rex} phaddw (%rcx),%mm0
 	{rex} phaddw (%r8),%mm0
+	{rex2} mov %al,%ah
+	{rex2} shl %cl, %eax
+	{rex2} cmp %cl, %dl
+	{rex2} mov $1, %bl
+	{rex2} movl %eax,%ebx
+	{rex2} movl %eax,%r14d
+	{rex2} movl %eax,(%r8)
+	{rex2} movaps %xmm7,%xmm2
+	{rex2} movaps %xmm7,%xmm12
+	{rex2} movaps (%rcx),%xmm2
+	{rex2} movaps (%r8),%xmm2
+	{rex2} pmullw %mm0,%mm6
 
 	movb (%rbp),%al
 	{disp8} movb (%rbp),%al
@@ -422,6 +434,15 @@ _start:
 	{rex} movaps xmm2,XMMWORD PTR [r8]
 	{rex} phaddw mm0,QWORD PTR [rcx]
 	{rex} phaddw mm0,QWORD PTR [r8]
+	{rex2} mov ah,al
+	{rex2} mov ebx,eax
+	{rex2} mov r14d,eax
+	{rex2} mov DWORD PTR [r8],eax
+	{rex2} movaps xmm2,xmm7
+	{rex2} movaps xmm12,xmm7
+	{rex2} movaps xmm2,XMMWORD PTR [rcx]
+	{rex2} movaps xmm2,XMMWORD PTR [r8]
+	{rex2} pmullw mm6,mm0
 
 	mov al, BYTE PTR [rbp]
 	{disp8} mov al, BYTE PTR [rbp]
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index e4b0cc8b85b..91c068d5b40 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -363,6 +363,8 @@ run_dump_test "x86-64-avx512f-rcigrne-intel"
 run_dump_test "x86-64-avx512f-rcigrne"
 run_dump_test "x86-64-avx512f-rcigru-intel"
 run_dump_test "x86-64-avx512f-rcigru"
+run_list_test "x86-64-apx-egpr-inval"
+run_dump_test "x86-64-apx-rex2"
 run_dump_test "x86-64-avx512f-rcigrz-intel"
 run_dump_test "x86-64-avx512f-rcigrz"
 run_dump_test "x86-64-clwb"
diff --git a/include/opcode/i386.h b/include/opcode/i386.h
index dec7652c1cc..2823d02c68a 100644
--- a/include/opcode/i386.h
+++ b/include/opcode/i386.h
@@ -112,9 +112,13 @@
 /* x86-64 extension prefix.  */
 #define REX_OPCODE	0x40
 
+#define REX2_OPCODE	0xd5
+
 /* Non-zero if OPCODE is the rex prefix.  */
 #define REX_PREFIX_P(opcode) (((opcode) & 0xf0) == REX_OPCODE)
 
+/* The M0 bit in the rex2 prefix selects map0 (clear) or map1 (set).  */
+#define REX2_M 0x8
 /* Indicates 64 bit operand size.  */
 #define REX_W	8
 /* High extension to reg field of modrm byte.  */
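The macros above can be exercised with a short sketch of how a REX2 payload byte might be split into a legacy-REX-like nibble and an extension nibble. The macro values are copied from this header; the struct and function names are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Values copied from include/opcode/i386.h as changed by this patch.  */
#define REX_OPCODE	0x40
#define REX2_OPCODE	0xd5
#define REX2_M		0x8
#define REX_W		8

/* Hypothetical helper type: the two nibbles of a REX2 payload byte.  */
struct rex2_split
{
  uint8_t rex2;		/* High nibble: M0 R4 X4 B4.  */
  uint8_t rex;		/* Low nibble: W R3 X3 B3, tagged with REX_OPCODE.  */
};

/* The REX2 prefix byte is only treated as a prefix in 64-bit mode;
   this sketch does not check the mode.  */
static int
is_rex2_prefix (uint8_t byte)
{
  return byte == REX2_OPCODE;
}

static struct rex2_split
split_rex2_payload (uint8_t payload)
{
  struct rex2_split s;

  s.rex2 = payload >> 4;
  s.rex = (payload & 0xf) | REX_OPCODE;
  return s;
}

/* Non-zero if the payload selects map1 (the 0x0f opcode map).  */
static int
rex2_selects_map1 (uint8_t payload)
{
  return (split_rex2_payload (payload).rex2 & REX2_M) != 0;
}
```

For the encoding "d5 c0 af c0" the payload 0xc0 selects map1, so 0xaf is looked up as a two-byte opcode (imul) even though no 0x0f byte is present.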
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index e78a2a9350e..4d6d547b2b6 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -144,6 +144,12 @@ struct instr_info
   /* Bits of REX we've already used.  */
   uint8_t rex_used;
 
+  /* Record M0 R4 X4 B4 bits for rex2.  */
+  unsigned char rex2;
+  /* Bits of rex2 we've already used.  */
+  unsigned char rex2_used;
+  unsigned char rex2_payload;
+
   bool need_modrm;
   unsigned char need_vex;
   bool has_sib;
@@ -169,6 +175,7 @@ struct instr_info
   signed char last_data_prefix;
   signed char last_addr_prefix;
   signed char last_rex_prefix;
+  signed char last_rex2_prefix;
   signed char last_seg_prefix;
   signed char fwait_prefix;
   /* The active segment register prefix.  */
@@ -265,8 +272,13 @@ struct dis_private {
   {							\
     if (value)						\
       {							\
-	if ((ins->rex & value))				\
+	if (ins->rex & value)				\
 	  ins->rex_used |= (value) | REX_OPCODE;	\
+	if (ins->rex2 & value)				\
+	  {						\
+	    ins->rex2_used |= (value);			\
+	    ins->rex_used |= REX_OPCODE;		\
+	  }						\
       }							\
     else						\
       ins->rex_used |= REX_OPCODE;			\
@@ -276,6 +288,7 @@ struct dis_private {
 #define EVEX_b_used 1
 #define EVEX_len_used 2
 
+
 /* Flags stored in PREFIXES.  */
 #define PREFIX_REPZ 1
 #define PREFIX_REPNZ 2
@@ -289,6 +302,7 @@ struct dis_private {
 #define PREFIX_DATA 0x200
 #define PREFIX_ADDR 0x400
 #define PREFIX_FWAIT 0x800
+#define PREFIX_REX2 0x1000
 
 /* Make sure that bytes from INFO->PRIVATE_DATA->BUFFER (inclusive)
    to ADDR (exclusive) are valid.  Returns true for success, false
@@ -370,6 +384,7 @@ fetch_error (const instr_info *ins)
 #define PREFIX_IGNORED_DATA	(PREFIX_DATA << PREFIX_IGNORED_SHIFT)
 #define PREFIX_IGNORED_ADDR	(PREFIX_ADDR << PREFIX_IGNORED_SHIFT)
 #define PREFIX_IGNORED_LOCK	(PREFIX_LOCK << PREFIX_IGNORED_SHIFT)
+#define PREFIX_REX2_ILLEGAL	(PREFIX_REX2 << PREFIX_IGNORED_SHIFT)
 
 /* Opcode prefixes.  */
 #define PREFIX_OPCODE		(PREFIX_REPZ \
@@ -1888,23 +1903,23 @@ static const struct dis386 dis386[] = {
   { "outs{b|}",		{ indirDXr, Xb }, 0 },
   { X86_64_TABLE (X86_64_6F) },
   /* 70 */
-  { "joH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jnoH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jbH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jaeH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jeH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jneH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jbeH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jaH",		{ Jb, BND, cond_jump_flag }, 0 },
+  { "joH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jnoH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jbH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jaeH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jeH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jneH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jbeH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jaH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
   /* 78 */
-  { "jsH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jnsH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jpH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jnpH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jlH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jgeH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jleH",		{ Jb, BND, cond_jump_flag }, 0 },
-  { "jgH",		{ Jb, BND, cond_jump_flag }, 0 },
+  { "jsH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jnsH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jpH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jnpH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jlH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jgeH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jleH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jgH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
   /* 80 */
   { REG_TABLE (REG_80) },
   { REG_TABLE (REG_81) },
@@ -1942,23 +1957,23 @@ static const struct dis386 dis386[] = {
   { "sahf",		{ XX }, 0 },
   { "lahf",		{ XX }, 0 },
   /* a0 */
-  { "mov%LB",		{ AL, Ob }, 0 },
-  { "mov%LS",		{ eAX, Ov }, 0 },
-  { "mov%LB",		{ Ob, AL }, 0 },
-  { "mov%LS",		{ Ov, eAX }, 0 },
-  { "movs{b|}",		{ Ybr, Xb }, 0 },
-  { "movs{R|}",		{ Yvr, Xv }, 0 },
-  { "cmps{b|}",		{ Xb, Yb }, 0 },
-  { "cmps{R|}",		{ Xv, Yv }, 0 },
+  { "mov%LB",		{ AL, Ob }, PREFIX_REX2_ILLEGAL },
+  { "mov%LS",		{ eAX, Ov }, PREFIX_REX2_ILLEGAL },
+  { "mov%LB",		{ Ob, AL }, PREFIX_REX2_ILLEGAL },
+  { "mov%LS",		{ Ov, eAX }, PREFIX_REX2_ILLEGAL },
+  { "movs{b|}",		{ Ybr, Xb }, PREFIX_REX2_ILLEGAL },
+  { "movs{R|}",		{ Yvr, Xv }, PREFIX_REX2_ILLEGAL },
+  { "cmps{b|}",		{ Xb, Yb }, PREFIX_REX2_ILLEGAL },
+  { "cmps{R|}",		{ Xv, Yv }, PREFIX_REX2_ILLEGAL },
   /* a8 */
-  { "testB",		{ AL, Ib }, 0 },
-  { "testS",		{ eAX, Iv }, 0 },
-  { "stosB",		{ Ybr, AL }, 0 },
-  { "stosS",		{ Yvr, eAX }, 0 },
-  { "lodsB",		{ ALr, Xb }, 0 },
-  { "lodsS",		{ eAXr, Xv }, 0 },
-  { "scasB",		{ AL, Yb }, 0 },
-  { "scasS",		{ eAX, Yv }, 0 },
+  { "testB",		{ AL, Ib }, PREFIX_REX2_ILLEGAL },
+  { "testS",		{ eAX, Iv }, PREFIX_REX2_ILLEGAL },
+  { "stosB",		{ Ybr, AL }, PREFIX_REX2_ILLEGAL },
+  { "stosS",		{ Yvr, eAX }, PREFIX_REX2_ILLEGAL },
+  { "lodsB",		{ ALr, Xb }, PREFIX_REX2_ILLEGAL },
+  { "lodsS",		{ eAXr, Xv }, PREFIX_REX2_ILLEGAL },
+  { "scasB",		{ AL, Yb }, PREFIX_REX2_ILLEGAL },
+  { "scasS",		{ eAX, Yv }, PREFIX_REX2_ILLEGAL },
   /* b0 */
   { "movB",		{ RMAL, Ib }, 0 },
   { "movB",		{ RMCL, Ib }, 0 },
@@ -2014,23 +2029,23 @@ static const struct dis386 dis386[] = {
   { FLOAT },
   { FLOAT },
   /* e0 */
-  { "loopneFH",		{ Jb, XX, loop_jcxz_flag }, 0 },
-  { "loopeFH",		{ Jb, XX, loop_jcxz_flag }, 0 },
-  { "loopFH",		{ Jb, XX, loop_jcxz_flag }, 0 },
-  { "jEcxzH",		{ Jb, XX, loop_jcxz_flag }, 0 },
-  { "inB",		{ AL, Ib }, 0 },
-  { "inG",		{ zAX, Ib }, 0 },
-  { "outB",		{ Ib, AL }, 0 },
-  { "outG",		{ Ib, zAX }, 0 },
+  { "loopneFH",		{ Jb, XX, loop_jcxz_flag }, PREFIX_REX2_ILLEGAL },
+  { "loopeFH",		{ Jb, XX, loop_jcxz_flag }, PREFIX_REX2_ILLEGAL },
+  { "loopFH",		{ Jb, XX, loop_jcxz_flag }, PREFIX_REX2_ILLEGAL },
+  { "jEcxzH",		{ Jb, XX, loop_jcxz_flag }, PREFIX_REX2_ILLEGAL },
+  { "inB",		{ AL, Ib }, PREFIX_REX2_ILLEGAL },
+  { "inG",		{ zAX, Ib }, PREFIX_REX2_ILLEGAL },
+  { "outB",		{ Ib, AL }, PREFIX_REX2_ILLEGAL },
+  { "outG",		{ Ib, zAX }, PREFIX_REX2_ILLEGAL },
   /* e8 */
   { X86_64_TABLE (X86_64_E8) },
   { X86_64_TABLE (X86_64_E9) },
   { X86_64_TABLE (X86_64_EA) },
-  { "jmp",		{ Jb, BND }, 0 },
-  { "inB",		{ AL, indirDX }, 0 },
-  { "inG",		{ zAX, indirDX }, 0 },
-  { "outB",		{ indirDX, AL }, 0 },
-  { "outG",		{ indirDX, zAX }, 0 },
+  { "jmp",		{ Jb, BND }, PREFIX_REX2_ILLEGAL },
+  { "inB",		{ AL, indirDX }, PREFIX_REX2_ILLEGAL },
+  { "inG",		{ zAX, indirDX }, PREFIX_REX2_ILLEGAL },
+  { "outB",		{ indirDX, AL }, PREFIX_REX2_ILLEGAL },
+  { "outG",		{ indirDX, zAX }, PREFIX_REX2_ILLEGAL },
   /* f0 */
   { Bad_Opcode },	/* lock prefix */
   { "int1",		{ XX }, 0 },
@@ -2107,12 +2122,12 @@ static const struct dis386 dis386_twobyte[] = {
   { PREFIX_TABLE (PREFIX_0F2E) },
   { PREFIX_TABLE (PREFIX_0F2F) },
   /* 30 */
-  { "wrmsr",		{ XX }, 0 },
-  { "rdtsc",		{ XX }, 0 },
-  { "rdmsr",		{ XX }, 0 },
-  { "rdpmc",		{ XX }, 0 },
-  { "sysenter",		{ SEP }, 0 },
-  { "sysexit%LQ",	{ SEP }, 0 },
+  { "wrmsr",		{ XX }, PREFIX_REX2_ILLEGAL },
+  { "rdtsc",		{ XX }, PREFIX_REX2_ILLEGAL },
+  { "rdmsr",		{ XX }, PREFIX_REX2_ILLEGAL },
+  { "rdpmc",		{ XX }, PREFIX_REX2_ILLEGAL },
+  { "sysenter",		{ SEP }, PREFIX_REX2_ILLEGAL },
+  { "sysexit%LQ",	{ SEP }, PREFIX_REX2_ILLEGAL },
   { Bad_Opcode },
   { "getsec",		{ XX }, 0 },
   /* 38 */
@@ -2197,23 +2212,23 @@ static const struct dis386 dis386_twobyte[] = {
   { PREFIX_TABLE (PREFIX_0F7E) },
   { PREFIX_TABLE (PREFIX_0F7F) },
   /* 80 */
-  { "joH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jnoH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jbH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jaeH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jeH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jneH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jbeH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jaH",		{ Jv, BND, cond_jump_flag }, 0 },
+  { "joH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jnoH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jbH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jaeH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jeH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jneH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jbeH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jaH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
   /* 88 */
-  { "jsH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jnsH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jpH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jnpH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jlH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jgeH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jleH",		{ Jv, BND, cond_jump_flag }, 0 },
-  { "jgH",		{ Jv, BND, cond_jump_flag }, 0 },
+  { "jsH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jnsH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jpH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jnpH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jlH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jgeH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jleH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
+  { "jgH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
   /* 90 */
   { "seto",		{ Eb }, 0 },
   { "setno",		{ Eb }, 0 },
@@ -2406,22 +2421,30 @@ static const char intel_index16[][6] = {
 
 static const char att_names64[][8] = {
   "%rax", "%rcx", "%rdx", "%rbx", "%rsp", "%rbp", "%rsi", "%rdi",
-  "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15"
+  "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15",
+  "%r16", "%r17", "%r18", "%r19", "%r20", "%r21", "%r22", "%r23",
+  "%r24", "%r25", "%r26", "%r27", "%r28", "%r29", "%r30", "%r31",
 };
 static const char att_names32[][8] = {
   "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi",
-  "%r8d", "%r9d", "%r10d", "%r11d", "%r12d", "%r13d", "%r14d", "%r15d"
+  "%r8d", "%r9d", "%r10d", "%r11d", "%r12d", "%r13d", "%r14d", "%r15d",
+  "%r16d", "%r17d", "%r18d", "%r19d", "%r20d", "%r21d", "%r22d", "%r23d",
+  "%r24d", "%r25d", "%r26d", "%r27d", "%r28d", "%r29d", "%r30d", "%r31d",
 };
 static const char att_names16[][8] = {
   "%ax", "%cx", "%dx", "%bx", "%sp", "%bp", "%si", "%di",
-  "%r8w", "%r9w", "%r10w", "%r11w", "%r12w", "%r13w", "%r14w", "%r15w"
+  "%r8w", "%r9w", "%r10w", "%r11w", "%r12w", "%r13w", "%r14w", "%r15w",
+  "%r16w", "%r17w", "%r18w", "%r19w", "%r20w", "%r21w", "%r22w", "%r23w",
+  "%r24w", "%r25w", "%r26w", "%r27w", "%r28w", "%r29w", "%r30w", "%r31w",
 };
 static const char att_names8[][8] = {
   "%al", "%cl", "%dl", "%bl", "%ah", "%ch", "%dh", "%bh",
 };
 static const char att_names8rex[][8] = {
   "%al", "%cl", "%dl", "%bl", "%spl", "%bpl", "%sil", "%dil",
-  "%r8b", "%r9b", "%r10b", "%r11b", "%r12b", "%r13b", "%r14b", "%r15b"
+  "%r8b", "%r9b", "%r10b", "%r11b", "%r12b", "%r13b", "%r14b", "%r15b",
+  "%r16b", "%r17b", "%r18b", "%r19b", "%r20b", "%r21b", "%r22b", "%r23b",
+  "%r24b", "%r25b", "%r26b", "%r27b", "%r28b", "%r29b", "%r30b", "%r31b",
 };
 static const char att_names_seg[][4] = {
   "%es", "%cs", "%ss", "%ds", "%fs", "%gs", "%?", "%?",
@@ -2810,9 +2833,9 @@ static const struct dis386 reg_table[][8] = {
     { Bad_Opcode },
     { "cmpxchg8b", { { CMPXCHG8B_Fixup, q_mode } }, 0 },
     { Bad_Opcode },
-    { "xrstors", { FXSAVE }, 0 },
-    { "xsavec", { FXSAVE }, 0 },
-    { "xsaves", { FXSAVE }, 0 },
+    { "xrstors", { FXSAVE }, PREFIX_REX2_ILLEGAL },
+    { "xsavec", { FXSAVE }, PREFIX_REX2_ILLEGAL },
+    { "xsaves", { FXSAVE }, PREFIX_REX2_ILLEGAL },
     { MOD_TABLE (MOD_0FC7_REG_6) },
     { MOD_TABLE (MOD_0FC7_REG_7) },
   },
@@ -3384,7 +3407,7 @@ static const struct dis386 prefix_table[][4] = {
 
   /* PREFIX_0FAE_REG_4_MOD_0 */
   {
-    { "xsave",	{ FXSAVE }, 0 },
+    { "xsave",	{ FXSAVE }, PREFIX_REX2_ILLEGAL },
     { "ptwrite{%LQ|}", { Edq }, 0 },
   },
 
@@ -3402,7 +3425,7 @@ static const struct dis386 prefix_table[][4] = {
 
   /* PREFIX_0FAE_REG_6_MOD_0 */
   {
-    { "xsaveopt",	{ FXSAVE }, PREFIX_OPCODE },
+    { "xsaveopt",	{ FXSAVE }, PREFIX_OPCODE | PREFIX_REX2_ILLEGAL },
     { "clrssbsy",	{ Mq }, PREFIX_OPCODE },
     { "clwb",	{ Mb }, PREFIX_OPCODE },
   },
@@ -4197,13 +4220,13 @@ static const struct dis386 x86_64_table[][2] = {
   /* X86_64_E8 */
   {
     { "callP",		{ Jv, BND }, 0 },
-    { "call@",		{ Jv, BND }, 0 }
+    { "call@",		{ Jv, BND }, PREFIX_REX2_ILLEGAL }
   },
 
   /* X86_64_E9 */
   {
     { "jmpP",		{ Jv, BND }, 0 },
-    { "jmp@",		{ Jv, BND }, 0 }
+    { "jmp@",		{ Jv, BND }, PREFIX_REX2_ILLEGAL }
   },
 
   /* X86_64_EA */
@@ -8184,7 +8207,7 @@ static const struct dis386 mod_table[][2] = {
   },
   {
     /* MOD_0FAE_REG_5 */
-    { "xrstor",		{ FXSAVE }, PREFIX_OPCODE },
+    { "xrstor",		{ FXSAVE }, PREFIX_OPCODE | PREFIX_REX2_ILLEGAL },
     { PREFIX_TABLE (PREFIX_0FAE_REG_5_MOD_3) },
   },
   {
@@ -8387,6 +8410,24 @@ ckprefix (instr_info *ins)
 	    return ckp_okay;
 	  ins->last_rex_prefix = i;
 	  break;
+	/* REX2 must be the last prefix. */
+	case REX2_OPCODE:
+	  if (ins->address_mode == mode_64bit)
+	    {
+	      if (ins->last_rex_prefix >= 0)
+		return ckp_bogus;
+
+	      ins->codep++;
+	      if (!fetch_code (ins->info, ins->codep + 1))
+		return ckp_fetch_error;
+	      ins->rex2_payload = *ins->codep;
+	      ins->rex2 = ins->rex2_payload >> 4;
+	      ins->rex = (ins->rex2_payload & 0xf) | REX_OPCODE;
+	      ins->codep++;
+	      ins->last_rex2_prefix = i;
+	      ins->all_prefixes[i] = REX2_OPCODE;
+	    }
+	  return ckp_okay;
 	case 0xf3:
 	  ins->prefixes |= PREFIX_REPZ;
 	  ins->last_repz_prefix = i;
@@ -8554,6 +8595,8 @@ prefix_name (enum address_mode mode, uint8_t pref, int sizeflag)
       return "bnd";
     case NOTRACK_PREFIX:
       return "notrack";
+    case REX2_OPCODE:
+      return "rex2";
     default:
       return NULL;
     }
@@ -9202,6 +9245,7 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
     .last_data_prefix = -1,
     .last_addr_prefix = -1,
     .last_rex_prefix = -1,
+    .last_rex2_prefix = -1,
     .last_seg_prefix = -1,
     .fwait_prefix = -1,
   };
@@ -9367,13 +9411,18 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       goto out;
     }
 
-  if (*ins.codep == 0x0f)
+  /* REX2.M in rex2 prefix represents map0 or map1.  */
+  if (ins.last_rex2_prefix < 0 ? *ins.codep == 0x0f : (ins.rex2 & REX2_M))
     {
       unsigned char threebyte;
 
-      ins.codep++;
-      if (!fetch_code (info, ins.codep + 1))
-	goto fetch_error_out;
+      if (!ins.rex2)
+	{
+	  ins.codep++;
+	  if (!fetch_code (info, ins.codep + 1))
+	    goto fetch_error_out;
+	}
+
       threebyte = *ins.codep;
       dp = &dis386_twobyte[threebyte];
       ins.need_modrm = twobyte_has_modrm[threebyte];
@@ -9529,7 +9578,15 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       goto out;
     }
 
-  switch (dp->prefix_requirement)
+  if ((dp->prefix_requirement & PREFIX_REX2_ILLEGAL)
+      && ins.last_rex2_prefix >= 0)
+    {
+      i386_dis_printf (info, dis_style_text, "(bad)");
+      ret = ins.end_codep - priv.the_buffer;
+      goto out;
+    }
+
+  switch (dp->prefix_requirement & ~PREFIX_REX2_ILLEGAL)
     {
     case PREFIX_DATA:
       /* If only the data prefix is marked as mandatory, its absence renders
@@ -9588,6 +9645,13 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       && !ins.need_vex && ins.last_rex_prefix >= 0)
     ins.all_prefixes[ins.last_rex_prefix] = 0;
 
+  /* Check if the REX2 prefix is used.  */
+  if (ins.last_rex2_prefix >= 0
+      && ((ins.rex2 & 0x7) ^ (ins.rex2_used & 0x7)) == 0
+      && (ins.rex ^ ins.rex_used) == 0
+      && (ins.rex2 & 0x7))
+    ins.all_prefixes[ins.last_rex2_prefix] = 0;
+
   /* Check if the SEG prefix is used.  */
   if ((ins.prefixes & (PREFIX_CS | PREFIX_SS | PREFIX_DS | PREFIX_ES
 		       | PREFIX_FS | PREFIX_GS)) != 0
@@ -9616,7 +9680,11 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
 	if (name == NULL)
 	  abort ();
 	prefix_length += strlen (name) + 1;
-	i386_dis_printf (info, dis_style_mnemonic, "%s ", name);
+	if (ins.all_prefixes[i] == REX2_OPCODE)
+	  i386_dis_printf (info, dis_style_mnemonic, "{%s 0x%x} ", name,
+			   (unsigned int) ins.rex2_payload);
+	else
+	  i386_dis_printf (info, dis_style_mnemonic, "%s ", name);
       }
 
   /* Check maximum code length.  */
@@ -11163,6 +11231,8 @@ print_register (instr_info *ins, unsigned int reg, unsigned int rexmask,
   USED_REX (rexmask);
   if (ins->rex & rexmask)
     reg += 8;
+  if (ins->rex2 & rexmask)
+    reg += 16;
 
   switch (bytemode)
     {
@@ -11170,7 +11240,7 @@ print_register (instr_info *ins, unsigned int reg, unsigned int rexmask,
     case b_swap_mode:
       if (reg & 4)
 	USED_REX (0);
-      if (ins->rex)
+      if (ins->rex || ins->rex2)
 	names = att_names8rex;
       else
 	names = att_names8;
@@ -11386,6 +11456,8 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
   int riprel = 0;
   int shift;
 
+  add += (ins->rex2 & REX_B) ? 16 : 0;
+
   if (ins->vex.evex)
     {
 
@@ -11559,6 +11631,9 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 		}
 	      break;
 	    default:
+	      if (ins->rex2 & REX_X)
+		vindex += 16;
+
 	      if (vindex != 4)
 		indexes = ins->address_mode == mode_64bit && !addr32flag
 			  ? att_names64 : att_names32;
@@ -11946,7 +12021,7 @@ static bool
 OP_REG (instr_info *ins, int code, int sizeflag)
 {
   const char *s;
-  int add;
+  int add = 0;
 
   switch (code)
     {
@@ -11959,8 +12034,8 @@ OP_REG (instr_info *ins, int code, int sizeflag)
   USED_REX (REX_B);
   if (ins->rex & REX_B)
     add = 8;
-  else
-    add = 0;
+  if (ins->rex2 & REX_B)
+    add += 16;
 
   switch (code)
     {
@@ -12674,6 +12749,8 @@ OP_EX (instr_info *ins, int bytemode, int sizeflag)
   USED_REX (REX_B);
   if (ins->rex & REX_B)
     reg += 8;
+  if (ins->rex2 & REX_B)
+    reg += 16;
   if (ins->vex.evex)
     {
       USED_REX (REX_X);
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 110a8371bd0..dd4850e1855 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -275,6 +275,8 @@ static const dependency isa_dependencies[] =
     "64" },
   { "USER_MSR",
     "64" },
+  { "APX_F",
+    "XSAVE|64" },
 };
 
 /* This array is populated as process_i386_initializers() walks cpu_flags[].  */
@@ -397,6 +399,7 @@ static bitfield cpu_flags[] =
   BITFIELD (FRED),
   BITFIELD (LKGS),
   BITFIELD (USER_MSR),
+  BITFIELD (APX_F),
   BITFIELD (MWAITX),
   BITFIELD (CLZERO),
   BITFIELD (OSPKE),
@@ -483,6 +486,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (Optimize),
   BITFIELD (Dialect),
   BITFIELD (ISA64),
+  BITFIELD (NoEgpr),
 };
 
 #define CLASS(n) #n, n
@@ -1069,10 +1073,44 @@ get_element_size (char **opnd, int lineno)
   return elem_size;
 }
 
+static bool
+rex2_disallowed (const unsigned long long opcode, unsigned int length,
+		 unsigned int space, const char *cpu_flags)
+{
+  /* Some opcodes encode a ModR/M-like byte directly in the opcode.  */
+  unsigned int base_opcode = opcode >> (8 * length - 8);
+
+  /* All opcodes in map0 0x4*, 0x7*, 0xa*, 0xe* and map1 0x3*, 0x8*
+     are reserved under REX2 and trigger #UD when prefixed with REX2.  */
+  if (space == 0)
+    switch (base_opcode >> 4)
+      {
+      case 0x4:
+      case 0x7:
+      case 0xA:
+      case 0xE:
+	return true;
+      default:
+	return false;
+    }
+
+  if (space == SPACE_0F)
+    switch (base_opcode >> 4)
+      {
+      case 0x3:
+      case 0x8:
+	return true;
+      default:
+	return false;
+      }
+
+  return false;
+}
+
 static void
 process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
 			      unsigned int prefix, const char *extension_opcode,
-			      char **opnd, int lineno)
+			      char **opnd, int lineno, bool rex2_disallowed)
 {
   char *str, *next, *last;
   bitfield modifiers [ARRAY_SIZE (opcode_modifiers)];
@@ -1199,6 +1237,12 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
 	  || modifiers[SAE].value))
     modifiers[EVex].value = EVEXDYN;
 
+  /* Vex, legacy map2 and map3 and rex2_disallowed do not support EGPR.
+     For templates supporting both Vex and EVex allowing EGPR.  */
+  if ((modifiers[Vex].value || space > SPACE_0F || rex2_disallowed)
+      && !modifiers[EVex].value)
+    modifiers[NoEgpr].value = 1;
+
   output_opcode_modifier (table, modifiers, ARRAY_SIZE (modifiers));
 }
 
@@ -1423,7 +1467,9 @@ output_i386_opcode (FILE *table, const char *name, char *str,
   free (ident);
 
   process_i386_opcode_modifier (table, opcode_modifier, space, prefix,
-				extension_opcode, operand_types, lineno);
+				extension_opcode, operand_types, lineno,
+				rex2_disallowed (opcode, length, space,
+						 cpu_flags));
 
   process_i386_cpu_flag (table, cpu_flags, NULL, ",", "    ", lineno, CpuMax);
 
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 03b02bd9681..8c967ea90b0 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -319,6 +319,8 @@ enum i386_cpu
   CpuAVX512F,
   /* Intel AVX-512 VL Instructions support required.  */
   CpuAVX512VL,
+  /* Intel APX_F Instructions support required.  */
+  CpuAPX_F,
   /* Not supported in the 64bit mode  */
   CpuNo64,
 
@@ -354,6 +356,7 @@ enum i386_cpu
 		   cpuhle:1, \
 		   cpuavx512f:1, \
 		   cpuavx512vl:1, \
+		   cpuapx_f:1, \
       /* NOTE: This field needs to remain last. */ \
 		   cpuno64:1
 
@@ -735,6 +738,11 @@ enum
 #define INTEL64		2
 #define INTEL64ONLY	3
   ISA64,
+
+  /* EGPRs (r16-r31) are illegal on this instruction.  We also use it to
+     judge whether the instruction supports pseudo-prefix {rex2}.  */
+  NoEgpr,
+
   /* The last bitfield in i386_opcode_modifier.  */
   Opcode_Modifier_Num
 };
@@ -779,6 +787,7 @@ typedef struct i386_opcode_modifier
   unsigned int optimize:1;
   unsigned int dialect:2;
   unsigned int isa64:2;
+  unsigned int noegpr:1;
 } i386_opcode_modifier;
 
 /* Operand classes.  */
@@ -993,7 +1002,8 @@ typedef struct insn_template
 #define Prefix_VEX3		6	/* {vex3} */
 #define Prefix_EVEX		7	/* {evex} */
 #define Prefix_REX		8	/* {rex} */
-#define Prefix_NoOptimize	9	/* {nooptimize} */
+#define Prefix_REX2		9	/* {rex2} */
+#define Prefix_NoOptimize	10	/* {nooptimize} */
 
   /* the bits in opcode_modifier are used to generate the final opcode from
      the base_opcode.  These bits also are used to detect alternate forms of
@@ -1020,6 +1030,7 @@ typedef struct
 #define RegRex	    0x1  /* Extended register.  */
 #define RegRex64    0x2  /* Extended 8 bit register.  */
 #define RegVRex	    0x4  /* Extended vector register.  */
+#define RegRex2	    0x8  /* Extended GPRs R16–R31 register.  */
   unsigned char reg_num;
 #define RegIP	((unsigned char ) ~0)
 /* EIZ and RIZ are fake index registers.  */
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 1e54717fa7e..37d3e8663bb 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -892,7 +892,7 @@ rex.wrxb, 0x4f, x64, NoSuf|IsPrefix, {}
 <pseudopfx:ident:cpu, disp8:Disp8:0, disp16:Disp16:No64, disp32:Disp32:i386, +
                       load:Load:0, store:Store:0, +
                       vex:VEX:0, vex2:VEX:0, vex3:VEX3:0, evex:EVEX:0, +
-                      rex:REX:x64, nooptimize:NoOptimize:0>
+                      rex:REX:x64, rex2:REX2:APX_F, nooptimize:NoOptimize:0>
 
 {<pseudopfx>}, PSEUDO_PREFIX/Prefix_<pseudopfx:ident>, <pseudopfx:cpu>, NoSuf|IsPrefix, {}
 
@@ -1425,16 +1425,17 @@ crc32, 0xf20f38f0, SSE4_2&x64, W|Modrm|No_wSuf|No_lSuf|No_sSuf, { Reg8|Reg64|Uns
 
 // xsave/xrstor New Instructions.
 
-xsave, 0xfae/4, Xsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Unspecified|BaseIndex }
-xsave64, 0xfae/4, Xsave&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
-xrstor, 0xfae/5, Xsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Unspecified|BaseIndex }
-xrstor64, 0xfae/5, Xsave&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
+xsave, 0xfae/4, Xsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoEgpr, { Unspecified|BaseIndex }
+xsave64, 0xfae/4, Xsave&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
+xrstor, 0xfae/5, Xsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoEgpr, { Unspecified|BaseIndex }
+xrstor64, 0xfae/5, Xsave&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
 xgetbv, 0xf01d0, Xsave, NoSuf, {}
 xsetbv, 0xf01d1, Xsave, NoSuf, {}
 
 // xsaveopt
-xsaveopt, 0xfae/6, Xsaveopt, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Unspecified|BaseIndex }
-xsaveopt64, 0xfae/6, Xsaveopt&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
+
+xsaveopt, 0xfae/6, Xsaveopt, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoEgpr, { Unspecified|BaseIndex }
+xsaveopt64, 0xfae/6, Xsaveopt&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
 
 // AES instructions.
 
@@ -2477,17 +2478,17 @@ clflushopt, 0x660fae/7, ClflushOpt, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex
 
 // XSAVES/XRSTORS instructions.
 
-xrstors, 0xfc7/3, XSAVES, Modrm|NoSuf, { Unspecified|BaseIndex }
-xrstors64, 0xfc7/3, XSAVES&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
-xsaves, 0xfc7/5, XSAVES, Modrm|NoSuf, { Unspecified|BaseIndex }
-xsaves64, 0xfc7/5, XSAVES&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
+xrstors, 0xfc7/3, XSAVES, Modrm|NoSuf|NoEgpr, { Unspecified|BaseIndex }
+xrstors64, 0xfc7/3, XSAVES&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
+xsaves, 0xfc7/5, XSAVES, Modrm|NoSuf|NoEgpr, { Unspecified|BaseIndex }
+xsaves64, 0xfc7/5, XSAVES&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
 
 // XSAVES instructions end.
 
 // XSAVEC instructions.
 
-xsavec, 0xfc7/4, XSAVEC, Modrm|NoSuf, { Unspecified|BaseIndex }
-xsavec64, 0xfc7/4, XSAVEC&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
+xsavec, 0xfc7/4, XSAVEC, Modrm|NoSuf|NoEgpr, { Unspecified|BaseIndex }
+xsavec64, 0xfc7/4, XSAVEC&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
 
 // XSAVEC instructions end.
 
diff --git a/opcodes/i386-reg.tbl b/opcodes/i386-reg.tbl
index 2ac56e3fd0b..8fead35e320 100644
--- a/opcodes/i386-reg.tbl
+++ b/opcodes/i386-reg.tbl
@@ -43,6 +43,22 @@ r12b, Class=Reg|Byte, RegRex|RegRex64, 4, Dw2Inval, Dw2Inval
 r13b, Class=Reg|Byte, RegRex|RegRex64, 5, Dw2Inval, Dw2Inval
 r14b, Class=Reg|Byte, RegRex|RegRex64, 6, Dw2Inval, Dw2Inval
 r15b, Class=Reg|Byte, RegRex|RegRex64, 7, Dw2Inval, Dw2Inval
+r16b, Class=Reg|Byte, RegRex2|RegRex64, 0, Dw2Inval, Dw2Inval
+r17b, Class=Reg|Byte, RegRex2|RegRex64, 1, Dw2Inval, Dw2Inval
+r18b, Class=Reg|Byte, RegRex2|RegRex64, 2, Dw2Inval, Dw2Inval
+r19b, Class=Reg|Byte, RegRex2|RegRex64, 3, Dw2Inval, Dw2Inval
+r20b, Class=Reg|Byte, RegRex2|RegRex64, 4, Dw2Inval, Dw2Inval
+r21b, Class=Reg|Byte, RegRex2|RegRex64, 5, Dw2Inval, Dw2Inval
+r22b, Class=Reg|Byte, RegRex2|RegRex64, 6, Dw2Inval, Dw2Inval
+r23b, Class=Reg|Byte, RegRex2|RegRex64, 7, Dw2Inval, Dw2Inval
+r24b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 0, Dw2Inval, Dw2Inval
+r25b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 1, Dw2Inval, Dw2Inval
+r26b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 2, Dw2Inval, Dw2Inval
+r27b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 3, Dw2Inval, Dw2Inval
+r28b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 4, Dw2Inval, Dw2Inval
+r29b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 5, Dw2Inval, Dw2Inval
+r30b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 6, Dw2Inval, Dw2Inval
+r31b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 7, Dw2Inval, Dw2Inval
 // 16 bit regs
 ax, Class=Reg|Instance=Accum|Word, 0, 0, Dw2Inval, Dw2Inval
 cx, Class=Reg|Word, 0, 1, Dw2Inval, Dw2Inval
@@ -60,6 +76,22 @@ r12w, Class=Reg|Word, RegRex, 4, Dw2Inval, Dw2Inval
 r13w, Class=Reg|Word, RegRex, 5, Dw2Inval, Dw2Inval
 r14w, Class=Reg|Word, RegRex, 6, Dw2Inval, Dw2Inval
 r15w, Class=Reg|Word, RegRex, 7, Dw2Inval, Dw2Inval
+r16w, Class=Reg|Word, RegRex2, 0, Dw2Inval, Dw2Inval
+r17w, Class=Reg|Word, RegRex2, 1, Dw2Inval, Dw2Inval
+r18w, Class=Reg|Word, RegRex2, 2, Dw2Inval, Dw2Inval
+r19w, Class=Reg|Word, RegRex2, 3, Dw2Inval, Dw2Inval
+r20w, Class=Reg|Word, RegRex2, 4, Dw2Inval, Dw2Inval
+r21w, Class=Reg|Word, RegRex2, 5, Dw2Inval, Dw2Inval
+r22w, Class=Reg|Word, RegRex2, 6, Dw2Inval, Dw2Inval
+r23w, Class=Reg|Word, RegRex2, 7, Dw2Inval, Dw2Inval
+r24w, Class=Reg|Word, RegRex2|RegRex, 0, Dw2Inval, Dw2Inval
+r25w, Class=Reg|Word, RegRex2|RegRex, 1, Dw2Inval, Dw2Inval
+r26w, Class=Reg|Word, RegRex2|RegRex, 2, Dw2Inval, Dw2Inval
+r27w, Class=Reg|Word, RegRex2|RegRex, 3, Dw2Inval, Dw2Inval
+r28w, Class=Reg|Word, RegRex2|RegRex, 4, Dw2Inval, Dw2Inval
+r29w, Class=Reg|Word, RegRex2|RegRex, 5, Dw2Inval, Dw2Inval
+r30w, Class=Reg|Word, RegRex2|RegRex, 6, Dw2Inval, Dw2Inval
+r31w, Class=Reg|Word, RegRex2|RegRex, 7, Dw2Inval, Dw2Inval
 // 32 bit regs
 eax, Class=Reg|Instance=Accum|Dword|BaseIndex, 0, 0, 0, Dw2Inval
 ecx, Class=Reg|Instance=RegC|Dword|BaseIndex, 0, 1, 1, Dw2Inval
@@ -77,6 +109,22 @@ r12d, Class=Reg|Dword|BaseIndex, RegRex, 4, Dw2Inval, Dw2Inval
 r13d, Class=Reg|Dword|BaseIndex, RegRex, 5, Dw2Inval, Dw2Inval
 r14d, Class=Reg|Dword|BaseIndex, RegRex, 6, Dw2Inval, Dw2Inval
 r15d, Class=Reg|Dword|BaseIndex, RegRex, 7, Dw2Inval, Dw2Inval
+r16d, Class=Reg|Dword|BaseIndex, RegRex2, 0, Dw2Inval, Dw2Inval
+r17d, Class=Reg|Dword|BaseIndex, RegRex2, 1, Dw2Inval, Dw2Inval
+r18d, Class=Reg|Dword|BaseIndex, RegRex2, 2, Dw2Inval, Dw2Inval
+r19d, Class=Reg|Dword|BaseIndex, RegRex2, 3, Dw2Inval, Dw2Inval
+r20d, Class=Reg|Dword|BaseIndex, RegRex2, 4, Dw2Inval, Dw2Inval
+r21d, Class=Reg|Dword|BaseIndex, RegRex2, 5, Dw2Inval, Dw2Inval
+r22d, Class=Reg|Dword|BaseIndex, RegRex2, 6, Dw2Inval, Dw2Inval
+r23d, Class=Reg|Dword|BaseIndex, RegRex2, 7, Dw2Inval, Dw2Inval
+r24d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 0, Dw2Inval, Dw2Inval
+r25d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 1, Dw2Inval, Dw2Inval
+r26d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 2, Dw2Inval, Dw2Inval
+r27d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 3, Dw2Inval, Dw2Inval
+r28d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 4, Dw2Inval, Dw2Inval
+r29d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 5, Dw2Inval, Dw2Inval
+r30d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 6, Dw2Inval, Dw2Inval
+r31d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 7, Dw2Inval, Dw2Inval
 rax, Class=Reg|Instance=Accum|Qword|BaseIndex, 0, 0, Dw2Inval, 0
 rcx, Class=Reg|Instance=RegC|Qword|BaseIndex, 0, 1, Dw2Inval, 2
 rdx, Class=Reg|Instance=RegD|Qword|BaseIndex, 0, 2, Dw2Inval, 1
@@ -93,6 +141,22 @@ r12, Class=Reg|Qword|BaseIndex, RegRex, 4, Dw2Inval, 12
 r13, Class=Reg|Qword|BaseIndex, RegRex, 5, Dw2Inval, 13
 r14, Class=Reg|Qword|BaseIndex, RegRex, 6, Dw2Inval, 14
 r15, Class=Reg|Qword|BaseIndex, RegRex, 7, Dw2Inval, 15
+r16, Class=Reg|Qword|BaseIndex, RegRex2, 0, Dw2Inval, 130
+r17, Class=Reg|Qword|BaseIndex, RegRex2, 1, Dw2Inval, 131
+r18, Class=Reg|Qword|BaseIndex, RegRex2, 2, Dw2Inval, 132
+r19, Class=Reg|Qword|BaseIndex, RegRex2, 3, Dw2Inval, 133
+r20, Class=Reg|Qword|BaseIndex, RegRex2, 4, Dw2Inval, 134
+r21, Class=Reg|Qword|BaseIndex, RegRex2, 5, Dw2Inval, 135
+r22, Class=Reg|Qword|BaseIndex, RegRex2, 6, Dw2Inval, 136
+r23, Class=Reg|Qword|BaseIndex, RegRex2, 7, Dw2Inval, 137
+r24, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 0, Dw2Inval, 138
+r25, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 1, Dw2Inval, 139
+r26, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 2, Dw2Inval, 140
+r27, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 3, Dw2Inval, 141
+r28, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 4, Dw2Inval, 142
+r29, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 5, Dw2Inval, 143
+r30, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 6, Dw2Inval, 144
+r31, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 7, Dw2Inval, 145
 // Vector mask registers.
 k0, Class=RegMask, 0, 0, 93, 118
 k1, Class=RegMask, 0, 1, 94, 119
-- 
2.25.1



* [PATCH V5 2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
  2023-12-28  1:27 ` [PATCH V5 1/9] Support APX GPR32 with rex2 prefix Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:54   ` H.J. Lu
  2023-12-28  1:27 ` [PATCH V5 3/9] Support APX GPR32 with extend evex prefix Cui, Lili
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC (permalink / raw)
  To: binutils; +Cc: hongjiu.lu, jbeulich

opcodes/ChangeLog:

	* i386-dis-evex.h: Add an empty EVEX_MAP4_ sub-table for legacy
	insns promoted to EVEX.
	* i386-dis.c: Add EVEX_MAP4.
---
 opcodes/i386-dis-evex.h | 291 ++++++++++++++++++++++++++++++++++++++++
 opcodes/i386-dis.c      |   1 +
 2 files changed, 292 insertions(+)

diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index e6295119d2b..7ad1edbe72d 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -872,6 +872,297 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
   },
+  /* EVEX_MAP4_ */
+  {
+    /* 00 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 08 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 10 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 18 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 20 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 28 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 30 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 38 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 40 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 48 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 50 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 58 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 60 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 68 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 70 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 78 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 80 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 88 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 90 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* 98 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* A0 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* A8 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* B0 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* B8 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* C0 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* C8 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* D0 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* D8 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* E0 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* E8 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* F0 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    /* F8 */
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+  },
   /* EVEX_MAP5_ */
   {
     /* 00 */
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 4d6d547b2b6..e006d869258 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1296,6 +1296,7 @@ enum
   EVEX_0F = 0,
   EVEX_0F38,
   EVEX_0F3A,
+  EVEX_MAP4,
   EVEX_MAP5,
   EVEX_MAP6,
 };
-- 
2.25.1



* [PATCH V5 3/9] Support APX GPR32 with extend evex prefix
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
  2023-12-28  1:27 ` [PATCH V5 1/9] Support APX GPR32 with rex2 prefix Cui, Lili
  2023-12-28  1:27 ` [PATCH V5 2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:54   ` H.J. Lu
  2023-12-28  1:27 ` [PATCH V5 4/9] Add tests for " Cui, Lili
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC (permalink / raw)
  To: binutils; +Cc: hongjiu.lu, jbeulich

This patch adds the non-ND, non-NF forms of the EVEX-promoted insns.

EVEX extension of legacy instructions:
  All promoted legacy instructions are placed in EVEX map 4, which is
  currently reserved.
EVEX extension of EVEX instructions:
  All existing EVEX instructions are extended by APX using the extended
  EVEX prefix, so that they can access all 32 GPRs.
EVEX extension of VEX instructions:
  Promoting a VEX instruction into the EVEX space does not change the map
  id, the opcode, or the operand encoding of the VEX instruction.

Note: The promoted versions of MOVBE will be extended to include the
  “MOVBE reg1, reg2” form.
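
As a rough illustration of the extended EVEX prefix described above, the
following C sketch mirrors the bit adjustments that build_apx_evex_prefix
in this patch applies on top of an ordinary 4-byte EVEX prefix.  The names
sketch_apx_evex_fixup and SK_REX_* are made up for this example and are not
part of binutils.  The polarity of each bit is taken from the patch's code
and has not been checked independently against the APX specification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names for this sketch; they reuse the usual REX
   bit positions (B = 1, X = 2, R = 4).  */
#define SK_REX_B 1u
#define SK_REX_X 2u
#define SK_REX_R 4u

/* Adjust an already-built 4-byte EVEX prefix so that it can address
   r16-r31.  EGPR_BITS says which of the R/X/B register fields need
   their extra high bit; VVVV_HIGH is nonzero when the register encoded
   in EVEX.vvvv is one of r16-r31.  */
static void
sketch_apx_evex_fixup (uint8_t bytes[4], unsigned int egpr_bits,
		       int vvvv_high)
{
  /* Every EVEX prefix starts with the 0x62 escape byte.  */
  assert (bytes[0] == 0x62);

  if (egpr_bits & SK_REX_R)
    bytes[1] &= (uint8_t) ~0x10u;	/* Clear bit 4 of payload byte 0.  */
  if (egpr_bits & SK_REX_B)
    bytes[1] |= 0x08u;			/* Set bit 3 of payload byte 0.  */
  if (egpr_bits & SK_REX_X)
    bytes[2] &= (uint8_t) ~0x04u;	/* Clear bit 2 of payload byte 1.  */
  if (vvvv_high)
    bytes[3] &= (uint8_t) ~0x08u;	/* Clear bit 3 of payload byte 2.  */
}
```

For instance, starting from the bytes 62 f4 7c 08 and requesting all three
of R, X and B plus a high vvvv register, the sketch yields 62 ec 78 00.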

  gas/ChangeLog:

  2023-12-28  Lingling Kong <lingling.kong@intel.com>
	      H.J. Lu  <hongjiu.lu@intel.com>
	      Lili Cui <lili.cui@intel.com>
	      Lin Hu   <lin1.hu@intel.com>

	* config/tc-i386.c
	(install_template): Handle dual VEX/EVEX templates involving APX_F.
	(is_apx_evex_encoding): Test for APX EVEX encoding.
	(build_apx_evex_prefix): Enable APX EVEX prefix.
	(md_assemble): Handle APX with EVEX encoding.
	(process_suffix): Handle APX map4 prefix.
	(check_register): Assign i.vec_encoding for APX EVEX instructions.
	* testsuite/gas/i386/x86-64-evex.d: Adjust test cases.
	* testsuite/gas/i386/x86-64.exp: Adjust x86-64-inval-movbe.

opcodes/ChangeLog:

	* i386-dis-evex-len.h: Handle EVEX_LEN_0F38F2, EVEX_LEN_0F38F3.
	* i386-dis-evex-prefix.h: Handle PREFIX_EVEX_0F38F2_L_0,
	PREFIX_EVEX_0F38F3_L_0, PREFIX_EVEX_MAP4_D8,
	PREFIX_EVEX_MAP4_DA, PREFIX_EVEX_MAP4_DB,
	PREFIX_EVEX_MAP4_DC, PREFIX_EVEX_MAP4_DD,
	PREFIX_EVEX_MAP4_DE, PREFIX_EVEX_MAP4_DF,
	PREFIX_EVEX_MAP4_F0, PREFIX_EVEX_MAP4_F1,
	PREFIX_EVEX_MAP4_F2, PREFIX_EVEX_MAP4_F8.
	* i386-dis-evex-reg.h: Handle REG_EVEX_0F38F3_L_0_P_0.
	* i386-dis-evex.h: Add EVEX_MAP4_ for legacy insns
	promoted to APX so that they can use GPR32.
	* i386-dis-evex-x86-64.h: Handle X86_64_EVEX_0F90,
	X86_64_EVEX_0F92, X86_64_EVEX_0F93, X86_64_EVEX_0F38F2,
	X86_64_EVEX_0F38F3, X86_64_EVEX_0F38F5, X86_64_EVEX_0F38F6,
	X86_64_EVEX_0F38F7, X86_64_EVEX_0F3AF0, X86_64_EVEX_0F91.
	* i386-dis.c
	(struct instr_info): Delete bool r.
	(PREFIX_NP_OR_DATA): New.
	(NO_PREFIX): New.
	(putop): Ditto.
	(X86_64_EVEX_FROM_VEX_TABLE): Ditto.
	(get_valid_dis386): Decode insn erex in the extended EVEX prefix.
	Handle EVEX_MAP4.
	(print_insn): Handle PREFIX_DATA_AND_NP_ONLY.
	(print_register): Handle decoding of APX instructions.
	(OP_E_memory): Ditto.
	(OP_G): Ditto.
	(OP_XMM): Ditto.
	(DistinctDest_Fixup): Ditto.
	* i386-gen.c (process_i386_opcode_modifier): Add EVEXMAP4.
	* i386-opc.h (SPACE_EVEXMAP4): Add for legacy insns
	promoted to EVEX.
	* i386-opc.tbl: Mark legacy and VEX insns that don't
	support GPR32, and add legacy insns (map2 / 3) promoted
	to EVEX.
---
 gas/config/tc-i386.c                 |  72 +++++++++++-
 gas/testsuite/gas/i386/x86-64-evex.d |   2 +-
 gas/testsuite/gas/i386/x86-64.exp    |   2 +-
 opcodes/i386-dis-evex-prefix.h       |  58 ++++++++++
 opcodes/i386-dis-evex-x86-64.h       |  50 +++++++++
 opcodes/i386-dis-evex.h              |  94 ++++++++--------
 opcodes/i386-dis.c                   | 160 +++++++++++++++++++++++----
 opcodes/i386-gen.c                   |   2 +
 opcodes/i386-opc.h                   |   6 +
 opcodes/i386-opc.tbl                 |  90 ++++++++++-----
 10 files changed, 433 insertions(+), 103 deletions(-)
 create mode 100644 opcodes/i386-dis-evex-x86-64.h

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index bb302f28add..7e62d08e9bd 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -435,6 +435,9 @@ struct _i386_insn
     /* Prefer the REX2 prefix in encoding.  */
     bool rex2_encoding;
 
+    /* Need to use an Egpr capable encoding (REX2 or EVEX).  */
+    bool has_egpr;
+
     /* Disable instruction size optimization.  */
     bool no_optimize;
 
@@ -3676,12 +3679,12 @@ install_template (const insn_template *t)
 
   /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
   if (t->opcode_modifier.vex && t->opcode_modifier.evex)
-  {
+    {
       if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
 	   || maybe_cpu (t, CpuFMA))
 	  && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
 	{
-	  if (need_evex_encoding ())
+	  if (need_evex_encoding () || i.has_egpr)
 	    {
 	      i.tm.opcode_modifier.vex = 0;
 	      i.tm.cpu.bitfield.cpuavx512f = i.tm.cpu_any.bitfield.cpuavx512f;
@@ -3698,7 +3701,19 @@ install_template (const insn_template *t)
 		gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa);
 	    }
 	}
-  }
+
+      if ((maybe_cpu (t, CpuCMPCCXADD) || maybe_cpu (t, CpuAMX_TILE)
+	   || maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512DQ)
+	   || maybe_cpu (t, CpuAVX512BW) || maybe_cpu (t, CpuBMI)
+	   || maybe_cpu (t, CpuBMI2))
+	  && maybe_cpu (t, CpuAPX_F))
+	{
+	  if (need_evex_encoding () || i.has_egpr)
+	    i.tm.opcode_modifier.vex = 0;
+	  else
+	    i.tm.opcode_modifier.evex = 0;
+	}
+    }
 
   /* Note that for pseudo prefixes this produces a length of 1. But for them
      the length isn't interesting at all.  */
@@ -3879,6 +3894,15 @@ is_any_vex_encoding (const insn_template *t)
   return t->opcode_modifier.vex || t->opcode_modifier.evex;
 }
 
+/* We can use this function only when the current encoding is evex.  */
+static INLINE bool
+is_apx_evex_encoding (void)
+{
+  return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
+    || (i.vex.register_specifier
+	&& (i.vex.register_specifier->reg_flags & RegRex2));
+}
+
 static INLINE bool
 is_apx_rex2_encoding (void)
 {
@@ -4156,6 +4180,27 @@ build_rex2_prefix (void)
 		    | (i.rex2 << 4) | i.rex);
 }
 
+/* Build the EVEX prefix (4-byte) for evex insn
+   | 62h |
+   | `R`X`B`R' | B'mmm |
+   | W | v`v`v`v | `x' | pp |
+   | z| L'L | b | `v | aaa |
+*/
+static void
+build_apx_evex_prefix (void)
+{
+  build_evex_prefix ();
+  if (i.rex2 & REX_R)
+    i.vex.bytes[1] &= ~0x10;
+  if (i.rex2 & REX_B)
+    i.vex.bytes[1] |= 0x08;
+  if (i.rex2 & REX_X)
+    i.vex.bytes[2] &= ~0x04;
+  if (i.vex.register_specifier
+      && i.vex.register_specifier->reg_flags & RegRex2)
+    i.vex.bytes[3] &= ~0x08;
+}
+
 static void establish_rex (void)
 {
   /* Note that legacy encodings have at most 2 non-immediate operands.  */
@@ -5723,13 +5768,18 @@ md_assemble (char *line)
 	  return;
 	}
 
-      if (i.tm.opcode_modifier.vex)
+      if (is_apx_evex_encoding ())
+	build_apx_evex_prefix ();
+      else if (i.tm.opcode_modifier.vex)
 	build_vex_prefix (t);
       else
 	build_evex_prefix ();
 
       /* The individual REX.RXBW bits got consumed.  */
       i.rex &= REX_OPCODE;
+
+      /* The rex2 bits got consumed.  */
+      i.rex2 = 0;
     }
 
   /* Handle conversion of 'int $3' --> special int3 insn.  */
@@ -8084,7 +8134,8 @@ process_suffix (void)
       if (i.suffix != QWORD_MNEM_SUFFIX
 	  && i.tm.opcode_modifier.mnemonicsize != IGNORESIZE
 	  && !i.tm.opcode_modifier.floatmf
-	  && !is_any_vex_encoding (&i.tm)
+	  && (!is_any_vex_encoding (&i.tm)
+	      || i.tm.opcode_space == SPACE_EVEXMAP4)
 	  && ((i.suffix == LONG_MNEM_SUFFIX) == (flag_code == CODE_16BIT)
 	      || (flag_code == CODE_64BIT
 		  && i.tm.opcode_modifier.jump == JUMP_BYTE)))
@@ -8094,7 +8145,14 @@ process_suffix (void)
 	  if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */
 	    prefix = ADDR_PREFIX_OPCODE;
 
-	  if (!add_prefix (prefix))
+	  /* For legacy insns promoted to EVEX (map 4), the data size prefix is
+	     recorded as the template's opcode prefix instead.  */
+	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
+	    {
+	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
+	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
+	    }
+	  else if (!add_prefix (prefix))
 	    return 0;
 	}
 
@@ -14300,6 +14358,8 @@ static bool check_register (const reg_entry *r)
       if (!cpu_arch_flags.bitfield.cpuapx_f
 	  || flag_code != CODE_64BIT)
 	return false;
+
+      i.has_egpr = true;
     }
 
   if (((r->reg_flags & (RegRex64 | RegRex)) || r->reg_type.bitfield.qword)
diff --git a/gas/testsuite/gas/i386/x86-64-evex.d b/gas/testsuite/gas/i386/x86-64-evex.d
index 041747db892..5d974c312da 100644
--- a/gas/testsuite/gas/i386/x86-64-evex.d
+++ b/gas/testsuite/gas/i386/x86-64-evex.d
@@ -17,6 +17,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 f1 d6 38 7b f0    	vcvtusi2ss %rax,\{rd-sae\},%xmm5,%xmm6
  +[a-f0-9]+:	62 f1 57 38 7b f0    	vcvtusi2sd %eax,\{rd-bad\},%xmm5,%xmm6
  +[a-f0-9]+:	62 f1 d7 38 7b f0    	vcvtusi2sd %rax,\{rd-sae\},%xmm5,%xmm6
- +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,\(bad\)
+ +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,%r16d
  +[a-f0-9]+:	62 e1 7c 08 c2 c0 00 	vcmpeqps %xmm0,%xmm0,\(bad\)
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 91c068d5b40..ffacc9c8e2b 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -250,7 +250,7 @@ run_dump_test "x86-64-sse-noavx"
 run_dump_test "x86-64-movbe"
 run_dump_test "x86-64-movbe-intel"
 run_dump_test "x86-64-movbe-suffix"
-run_list_test "x86-64-inval-movbe" "-al"
+run_list_test "x86-64-inval-movbe" "-march=+noapx_f -al"
 run_dump_test "x86-64-ept"
 run_dump_test "x86-64-ept-intel"
 run_list_test "x86-64-inval-ept" "-al"
diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
index 28da54922c7..54ed48c6952 100644
--- a/opcodes/i386-dis-evex-prefix.h
+++ b/opcodes/i386-dis-evex-prefix.h
@@ -338,6 +338,64 @@
     { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
     { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
   },
+  /* PREFIX_EVEX_MAP4_D8 */
+  {
+    { "sha1nexte", { XM, EXxmm }, 0 },
+    { REG_TABLE (REG_0F38D8_PREFIX_1) },
+  },
+  /* PREFIX_EVEX_MAP4_DA */
+  {
+    { "sha1msg2", { XM, EXxmm }, 0 },
+    { "encodekey128", { Gd, Rd }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DB */
+  {
+    { "sha256rnds2", { XM, EXxmm, XMM0 }, 0 },
+    { "encodekey256", { Gd, Rd }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DC */
+  {
+    { "sha256msg1", { XM, EXxmm }, 0 },
+    { "aesenc128kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DD */
+  {
+    { "sha256msg2", { XM, EXxmm }, 0 },
+    { "aesdec128kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DE */
+  {
+    { Bad_Opcode },
+    { "aesenc256kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DF */
+  {
+    { Bad_Opcode },
+    { "aesdec256kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F0 */
+  {
+    { "crc32A", { Gdq, Eb }, 0 },
+    { "invept", { Gm, Mo }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F1 */
+  {
+    { "crc32Q", { Gdq, Ev }, 0 },
+    { "invvpid", { Gm, Mo }, 0 },
+    { "crc32Q", { Gdq, Ev }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F2 */
+  {
+    { Bad_Opcode },
+    { "invpcid", { Gm, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F8 */
+  {
+    { Bad_Opcode },
+    { "enqcmds", { Gva, M }, 0 },
+    { "movdir64b", { Gva, M }, 0 },
+    { "enqcmd", { Gva, M }, 0 },
+  },
   /* PREFIX_EVEX_MAP5_10 */
   {
     { Bad_Opcode },
diff --git a/opcodes/i386-dis-evex-x86-64.h b/opcodes/i386-dis-evex-x86-64.h
new file mode 100644
index 00000000000..0d9d98a7691
--- /dev/null
+++ b/opcodes/i386-dis-evex-x86-64.h
@@ -0,0 +1,50 @@
+  /* X86_64_EVEX_0F90 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F90_L_0) },
+  },
+  /* X86_64_EVEX_0F91 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F91_L_0) },
+  },
+  /* X86_64_EVEX_0F92 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F92_L_0) },
+  },
+  /* X86_64_EVEX_0F93 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F93_L_0) },
+  },
+  /* X86_64_EVEX_0F38F2 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
+  },
+  /* X86_64_EVEX_0F38F3 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
+  },
+  /* X86_64_EVEX_0F38F5 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F5_L_0) },
+  },
+  /* X86_64_EVEX_0F38F6 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F6_L_0) },
+  },
+  /* X86_64_EVEX_0F38F7 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F7_L_0) },
+  },
+  /* X86_64_EVEX_0F3AF0 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) },
+  },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index 7ad1edbe72d..90c063b2188 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -164,10 +164,10 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 90 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F90) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F91) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F92) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F93) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -375,9 +375,9 @@ static const struct dis386 evex_table[][256] = {
     { "vpsllv%DQ",	{ XM, Vex, EXx }, PREFIX_DATA },
     /* 48 */
     { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F3849) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F384B) },
     { "vrcp14p%XW",	{ XM, EXx }, PREFIX_DATA },
     { "vrcp14s%XW",	{ XMScalar, VexScalar, EXdq }, PREFIX_DATA },
     { "vrsqrt14p%XW",	{ XM, EXx }, 0 },
@@ -545,32 +545,32 @@ static const struct dis386 evex_table[][256] = {
     { "%XEvaesdecY",	{ XM, Vex, EXx }, PREFIX_DATA },
     { "%XEvaesdeclastY", { XM, Vex, EXx }, PREFIX_DATA },
     /* E0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E0) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E1) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E2) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E3) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E4) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E5) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E6) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E7) },
     /* E8 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E8) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E9) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EA) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EB) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EC) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38ED) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EE) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EF) },
     /* F0 */
     { Bad_Opcode },
     { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F2) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F3) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F5) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F6) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F7) },
     /* F8 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -854,7 +854,7 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* F0 */
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F3AF0) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -983,13 +983,13 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 60 */
+    { "movbeS",	{ Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "movbeS",	{ Ev, Gv }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "wrussK",	{ M, Gdq }, PREFIX_DATA },
+    { PREFIX_TABLE (PREFIX_0F38F6) },
     { Bad_Opcode },
     /* 68 */
     { Bad_Opcode },
@@ -1113,19 +1113,19 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { "sha1rnds4",	{ XM, EXxmm, Ib }, NO_PREFIX },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* D8 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_D8) },
+    { "sha1msg1",	{ XM, EXxmm }, NO_PREFIX },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DA) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DB) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DC) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DD) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DE) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DF) },
     /* E0 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1145,20 +1145,20 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* F0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F0) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F1) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F2) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* F8 */
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
+    { "movdiri",	{ Mdq, Gdq }, NO_PREFIX },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_0F38FC) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index e006d869258..d4d32befcf9 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -132,6 +132,13 @@ enum x86_64_isa
   intel64
 };
 
+enum evex_type
+{
+  evex_default = 0,
+  evex_from_legacy,
+  evex_from_vex,
+};
+
 struct instr_info
 {
   enum address_mode address_mode;
@@ -212,7 +219,6 @@ struct instr_info
     int ll;
     bool w;
     bool evex;
-    bool r;
     bool v;
     bool zeroing;
     bool b;
@@ -220,6 +226,8 @@ struct instr_info
   }
   vex;
 
+  enum evex_type evex_type;
+
   /* Remember if the current op is a jump instruction.  */
   bool op_is_jump;
 
@@ -303,6 +311,8 @@ struct dis_private {
 #define PREFIX_ADDR 0x400
 #define PREFIX_FWAIT 0x800
 #define PREFIX_REX2 0x1000
+#define PREFIX_NP_OR_DATA 0x2000
+#define NO_PREFIX   0x4000
 
 /* Make sure that bytes from INFO->PRIVATE_DATA->BUFFER (inclusive)
    to ADDR (exclusive) are valid.  Returns true for success, false
@@ -800,6 +810,7 @@ enum
   USE_RM_TABLE,
   USE_PREFIX_TABLE,
   USE_X86_64_TABLE,
+  USE_X86_64_EVEX_FROM_VEX_TABLE,
   USE_3BYTE_TABLE,
   USE_XOP_8F_TABLE,
   USE_VEX_C4_TABLE,
@@ -818,6 +829,8 @@ enum
 #define RM_TABLE(I)		DIS386 (USE_RM_TABLE, (I))
 #define PREFIX_TABLE(I)		DIS386 (USE_PREFIX_TABLE, (I))
 #define X86_64_TABLE(I)		DIS386 (USE_X86_64_TABLE, (I))
+#define X86_64_EVEX_FROM_VEX_TABLE(I) \
+  DIS386 (USE_X86_64_EVEX_FROM_VEX_TABLE, (I))
 #define THREE_BYTE_TABLE(I)	DIS386 (USE_3BYTE_TABLE, (I))
 #define XOP_8F_TABLE()		DIS386 (USE_XOP_8F_TABLE, 0)
 #define VEX_C4_TABLE()		DIS386 (USE_VEX_C4_TABLE, 0)
@@ -866,7 +879,7 @@ enum
   REG_VEX_0F73,
   REG_VEX_0FAE,
   REG_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0,
-  REG_VEX_0F38F3_L_0,
+  REG_VEX_0F38F3_L_0_P_0,
   REG_VEX_MAP7_F8_L_0_W_0,
 
   REG_XOP_09_01_L_0,
@@ -878,7 +891,7 @@ enum
   REG_EVEX_0F72,
   REG_EVEX_0F73,
   REG_EVEX_0F38C6_L_2,
-  REG_EVEX_0F38C7_L_2
+  REG_EVEX_0F38C7_L_2,
 };
 
 enum
@@ -1094,6 +1107,8 @@ enum
   PREFIX_VEX_0F38CC,
   PREFIX_VEX_0F38CD,
   PREFIX_VEX_0F38DA_W_0,
+  PREFIX_VEX_0F38F2_L_0,
+  PREFIX_VEX_0F38F3_L_0,
   PREFIX_VEX_0F38F5_L_0,
   PREFIX_VEX_0F38F6_L_0,
   PREFIX_VEX_0F38F7_L_0,
@@ -1156,6 +1171,18 @@ enum
   PREFIX_EVEX_0F3A67,
   PREFIX_EVEX_0F3AC2,
 
+  PREFIX_EVEX_MAP4_D8,
+  PREFIX_EVEX_MAP4_DA,
+  PREFIX_EVEX_MAP4_DB,
+  PREFIX_EVEX_MAP4_DC,
+  PREFIX_EVEX_MAP4_DD,
+  PREFIX_EVEX_MAP4_DE,
+  PREFIX_EVEX_MAP4_DF,
+  PREFIX_EVEX_MAP4_F0,
+  PREFIX_EVEX_MAP4_F1,
+  PREFIX_EVEX_MAP4_F2,
+  PREFIX_EVEX_MAP4_F8,
+
   PREFIX_EVEX_MAP5_10,
   PREFIX_EVEX_MAP5_11,
   PREFIX_EVEX_MAP5_1D,
@@ -1267,7 +1294,19 @@ enum
   X86_64_VEX_0F38ED,
   X86_64_VEX_0F38EE,
   X86_64_VEX_0F38EF,
+
   X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
+
+  X86_64_EVEX_0F90,
+  X86_64_EVEX_0F91,
+  X86_64_EVEX_0F92,
+  X86_64_EVEX_0F93,
+  X86_64_EVEX_0F38F2,
+  X86_64_EVEX_0F38F3,
+  X86_64_EVEX_0F38F5,
+  X86_64_EVEX_0F38F6,
+  X86_64_EVEX_0F38F7,
+  X86_64_EVEX_0F3AF0,
 };
 
 enum
@@ -2882,12 +2921,12 @@ static const struct dis386 reg_table[][8] = {
   {
     { RM_TABLE (RM_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0_R_0) },
   },
-  /* REG_VEX_0F38F3_L_0 */
+  /* REG_VEX_0F38F3_L_0_P_0 */
   {
     { Bad_Opcode },
-    { "blsrS",		{ VexGdq, Edq }, PREFIX_OPCODE },
-    { "blsmskS",	{ VexGdq, Edq }, PREFIX_OPCODE },
-    { "blsiS",		{ VexGdq, Edq }, PREFIX_OPCODE },
+    { "blsrS",		{ VexGdq, Edq }, 0 },
+    { "blsmskS",	{ VexGdq, Edq }, 0 },
+    { "blsiS",		{ VexGdq, Edq }, 0 },
   },
   /* REG_VEX_MAP7_F8_L_0_W_0 */
   {
@@ -4035,6 +4074,16 @@ static const struct dis386 prefix_table[][4] = {
     { "vsm4rnds4", { XM, Vex, EXx }, 0 },
   },
 
+  /* PREFIX_VEX_0F38F2_L_0 */
+  {
+    { "andnS",		{ Gdq, VexGdq, Edq }, 0 },
+  },
+
+  /* PREFIX_VEX_0F38F3_L_0 */
+  {
+    { REG_TABLE (REG_VEX_0F38F3_L_0_P_0) },
+  },
+
   /* PREFIX_VEX_0F38F5_L_0 */
   {
     { "bzhiS",		{ Gdq, Edq, VexGdq }, 0 },
@@ -4527,6 +4576,7 @@ static const struct dis386 x86_64_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64) },
   },
 
+#include "i386-dis-evex-x86-64.h"
 };
 
 static const struct dis386 three_byte_table[][256] = {
@@ -7113,12 +7163,12 @@ static const struct dis386 vex_len_table[][2] = {
 
   /* VEX_LEN_0F38F2 */
   {
-    { "andnS",		{ Gdq, VexGdq, Edq }, PREFIX_OPCODE },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
   },
 
   /* VEX_LEN_0F38F3 */
   {
-    { REG_TABLE(REG_VEX_0F38F3_L_0) },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
   },
 
   /* VEX_LEN_0F38F5 */
@@ -8732,6 +8782,17 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
       dp = &prefix_table[dp->op[1].bytemode][vindex];
       break;
 
+    case USE_X86_64_EVEX_FROM_VEX_TABLE:
+      ins->evex_type = evex_from_vex;
+      /* EVEX from VEX instructions require that EVEX.z, EVEX.L'L, EVEX.b and
+	 the lower 2 bits of EVEX.aaa be 0.  */
+      if ((ins->vex.mask_register_specifier & 0x3) != 0
+	  || ins->vex.ll != 0
+	  || ins->vex.zeroing != 0
+	  || ins->vex.b)
+	return &bad_opcode;
+
+      /* Fall through.  */
     case USE_X86_64_TABLE:
       vindex = ins->address_mode == mode_64bit ? 1 : 0;
       dp = &x86_64_table[dp->op[1].bytemode][vindex];
@@ -8977,9 +9038,13 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
       if (!fetch_code (ins->info, ins->codep + 4))
 	return &err_opcode;
       /* The first byte after 0x62.  */
+      if (*ins->codep & 0x8)
+	ins->rex2 |= REX_B;
+      if (!(*ins->codep & 0x10))
+	ins->rex2 |= REX_R;
+
       ins->rex = ~(*ins->codep >> 5) & 0x7;
-      ins->vex.r = *ins->codep & 0x10;
-      switch ((*ins->codep & 0xf))
+      switch (*ins->codep & 0x7)
 	{
 	default:
 	  return &bad_opcode;
@@ -8992,6 +9057,12 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 	case 0x3:
 	  vex_table_index = EVEX_0F3A;
 	  break;
+	case 0x4:
+	  vex_table_index = EVEX_MAP4;
+	  ins->evex_type = evex_from_legacy;
+	  if (ins->address_mode != mode_64bit)
+	    return &bad_opcode;
+	  break;
 	case 0x5:
 	  vex_table_index = EVEX_MAP5;
 	  break;
@@ -9008,9 +9079,8 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 
       ins->vex.register_specifier = (~(*ins->codep >> 3)) & 0xf;
 
-      /* The U bit.  */
       if (!(*ins->codep & 0x4))
-	return &bad_opcode;
+	ins->rex2 |= REX_X;
 
       switch ((*ins->codep & 0x3))
 	{
@@ -9040,12 +9110,26 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 
       if (ins->address_mode != mode_64bit)
 	{
+	  /* Report bad for !evex_default and when two fixed values of evex
+	     change.  */
+	  if (ins->evex_type != evex_default
+	      || (ins->rex2 & (REX_B | REX_X)))
+	    return &bad_opcode;
 	  /* In 16/32-bit mode silently ignore following bits.  */
 	  ins->rex &= ~REX_B;
-	  ins->vex.r = true;
+	  ins->rex2 &= ~REX_R;
 	}
 
       ins->need_vex = 4;
+
+      /* EVEX from legacy instructions require that EVEX.z, EVEX.L'L and the
+	 lower 2 bits of EVEX.aaa be 0.  */
+      if (ins->evex_type == evex_from_legacy
+	  && ((ins->vex.mask_register_specifier & 0x3) != 0
+	      || ins->vex.ll != 0
+	      || ins->vex.zeroing != 0))
+	return &bad_opcode;
+
       ins->codep++;
       vindex = *ins->codep++;
       dp = &evex_table[vex_table_index][vindex];
@@ -9460,6 +9544,13 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       dp = get_valid_dis386 (dp, &ins);
       if (dp == &err_opcode)
 	goto fetch_error_out;
+
+      /* For APX instructions promoted from legacy maps 0/1, the embedded
+	 prefix is interpreted as the operand size override.  */
+      if (ins.evex_type == evex_from_legacy
+	  && ins.vex.prefix == DATA_PREFIX_OPCODE)
+	sizeflag ^= DFLAG;
+
       if (dp != NULL && putop (&ins, dp->name, sizeflag) == 0)
 	{
 	  if (!get_sib (&ins, sizeflag))
@@ -9639,6 +9730,25 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       if (ins.last_repnz_prefix >= 0)
 	ins.all_prefixes[ins.last_repnz_prefix] = 0xf2;
       break;
+
+    case PREFIX_NP_OR_DATA:
+      if (ins.vex.prefix == REPE_PREFIX_OPCODE
+	  || ins.vex.prefix == REPNE_PREFIX_OPCODE)
+	{
+	  i386_dis_printf (info, dis_style_text, "(bad)");
+	  ret = ins.end_codep - priv.the_buffer;
+	  goto out;
+	}
+      break;
+
+    case NO_PREFIX:
+      if (ins.vex.prefix)
+	{
+	  i386_dis_printf (info, dis_style_text, "(bad)");
+	  ret = ins.end_codep - priv.the_buffer;
+	  goto out;
+	}
+      break;
     }
 
   /* Check if the REX prefix is used.  */
@@ -10348,7 +10458,7 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
 		{
 		case 'X':
 		  if (!ins->vex.evex || ins->vex.b || ins->vex.ll >= 2
-		      || !ins->vex.r
+		      || (ins->rex2 & REX_R)
 		      || (ins->modrm.mod == 3 && (ins->rex & REX_X))
 		      || !ins->vex.v || ins->vex.mask_register_specifier)
 		    break;
@@ -11459,7 +11569,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 
   add += (ins->rex2 & REX_B) ? 16 : 0;
 
-  if (ins->vex.evex)
+  if (ins->vex.evex && ins->evex_type == evex_default)
     {
 
       /* Zeroing-masking is invalid for memory destinations. Set the flag
@@ -11603,6 +11713,13 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 		abort ();
 	      if (ins->vex.evex)
 		{
+		  /* S/G EVEX insns require EVEX.X4 not to be set.  */
+		  if (ins->rex2 & REX_X)
+		    {
+		      oappend (ins, "(bad)");
+		      return true;
+		    }
+
 		  if (!ins->vex.v)
 		    vindex += 16;
 		  check_gather = ins->obufp == ins->op_out[1];
@@ -11805,7 +11922,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 
 	      if (ins->rex & REX_R)
 	        modrm_reg += 8;
-	      if (!ins->vex.r)
+	      if (ins->rex2 & REX_R)
 	        modrm_reg += 16;
 	      if (vindex == modrm_reg)
 		oappend (ins, "/(bad)");
@@ -12011,10 +12128,7 @@ OP_indirE (instr_info *ins, int bytemode, int sizeflag)
 static bool
 OP_G (instr_info *ins, int bytemode, int sizeflag)
 {
-  if (ins->vex.evex && !ins->vex.r && ins->address_mode == mode_64bit)
-    oappend (ins, "(bad)");
-  else
-    print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
+  print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
   return true;
 }
 
@@ -12645,7 +12759,7 @@ OP_XMM (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
     reg += 8;
   if (ins->vex.evex)
     {
-      if (!ins->vex.r)
+      if (ins->rex2 & REX_R)
 	reg += 16;
     }
 
@@ -13652,7 +13766,7 @@ DistinctDest_Fixup (instr_info *ins, int bytemode, int sizeflag)
   /* Calc destination register number.  */
   if (ins->rex & REX_R)
     modrm_reg += 8;
-  if (!ins->vex.r)
+  if (ins->rex2 & REX_R)
     modrm_reg += 16;
 
   /* Calc src1 register number.  */
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index dd4850e1855..508b441a343 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -487,6 +487,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (Dialect),
   BITFIELD (ISA64),
   BITFIELD (NoEgpr),
+  BITFIELD (NF),
 };
 
 #define CLASS(n) #n, n
@@ -1120,6 +1121,7 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
     SPACE(0F),
     SPACE(0F38),
     SPACE(0F3A),
+    SPACE(EVEXMAP4),
     SPACE(EVEXMAP5),
     SPACE(EVEXMAP6),
     SPACE(VEXMAP7),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 8c967ea90b0..064ec48edad 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -743,6 +743,9 @@ enum
      whether the instruction supports pseudo-prefix {rex2}.  */
   NoEgpr,
 
+  /* No CSPAZO flags update indication.  */
+  NF,
+
   /* The last bitfield in i386_opcode_modifier.  */
   Opcode_Modifier_Num
 };
@@ -788,6 +791,7 @@ typedef struct i386_opcode_modifier
   unsigned int dialect:2;
   unsigned int isa64:2;
   unsigned int noegpr:1;
+  unsigned int nf:1;
 } i386_opcode_modifier;
 
 /* Operand classes.  */
@@ -963,6 +967,7 @@ typedef struct insn_template
      1: 0F opcode prefix / space.
      2: 0F38 opcode prefix / space.
      3: 0F3A opcode prefix / space.
+     4: EVEXMAP4 opcode prefix / space.
      5: EVEXMAP5 opcode prefix / space.
      6: EVEXMAP6 opcode prefix / space.
      7: VEXMAP7 opcode prefix / space.
@@ -974,6 +979,7 @@ typedef struct insn_template
 #define SPACE_0F	1
 #define SPACE_0F38	2
 #define SPACE_0F3A	3
+#define SPACE_EVEXMAP4	4
 #define SPACE_EVEXMAP5	5
 #define SPACE_EVEXMAP6	6
 #define SPACE_VEXMAP7	7
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 37d3e8663bb..11b8c0b63cb 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -113,6 +113,7 @@
 #define SpaceXOP09 OpcodeSpace=SPACE_XOP09
 #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
 
+#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
 #define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
 #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
 
@@ -139,6 +140,9 @@
 
 #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
 
+// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
+#define APX_F(cpuid) cpuid&(cpuid|APX_F)
+
 // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
 // the bit to mark commutative VEX encodings where swapping the source
 // operands may allow to switch from 3-byte to 2-byte VEX encoding.
@@ -194,6 +198,7 @@ mov, 0xf24, i386&No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Te
 
 // Move after swapping the bytes
 movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movbe, 0x60, Movbe&APX_F, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // Move with sign extend.
 movsb, 0xfbe, i386, Modrm|No_bSuf|No_sSuf, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
@@ -1315,13 +1320,16 @@ getsec, 0xf37, SMX, NoSuf, {}
 
 invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invept, 0xf3f0, EPT&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invvpid, 0xf3f1, EPT&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 
 // INVPCID instruction
 
 invpcid, 0x660f3882, INVPCID&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invpcid, 0x660f3882, INVPCID&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invpcid, 0xf3f2, INVPCID&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 
 // SSSE3 instructions.
 
@@ -1422,6 +1430,8 @@ pcmpistri<sse42>, 0x660f3a63, <sse42:cpu>, Modrm|<sse42:attr>|NoSuf, { Imm8, Reg
 pcmpistrm<sse42>, 0x660f3a62, <sse42:cpu>, Modrm|<sse42:attr>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 crc32, 0xf20f38f0, SSE4_2, W|Modrm|No_sSuf|No_qSuf, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
 crc32, 0xf20f38f0, SSE4_2&x64, W|Modrm|No_wSuf|No_lSuf|No_sSuf, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
+crc32, 0xf0, APX_F, W|Modrm|No_sSuf|No_qSuf|EVexMap4, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
+crc32, 0xf0, APX_F, W|Modrm|No_wSuf|No_lSuf|No_sSuf|EVexMap4, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
 
 // xsave/xrstor New Instructions.
 
@@ -1836,14 +1846,14 @@ xtest, 0xf01d6, HLE|RTM, NoSuf, {}
 
 // BMI2 instructions.
 
-bzhi, 0xf5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-mulx, 0xf2f6, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pdep, 0xf2f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pext, 0xf3f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-rorx, 0xf2f0, BMI2, Modrm|CheckOperandSize|Vex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-sarx, 0xf3f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shlx, 0x66f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shrx, 0xf2f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+bzhi, 0xf5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+mulx, 0xf2f6, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pdep, 0xf2f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pext, 0xf3f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+rorx, 0xf2f0, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+sarx, 0xf3f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shlx, 0x66f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shrx, 0xf2f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 
 // FMA4 instructions
 
@@ -1913,11 +1923,11 @@ lwpins, 0x12/0, LWP, Modrm|SpaceXOP0A|NoSuf|VexVVVV|Vex, { Imm32|Imm32S, Reg32|U
 
 // BMI instructions
 
-andn, 0xf2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsi, 0xf3/3, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsmsk, 0xf3/2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsr, 0xf3/1, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+andn, 0xf2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+bextr, 0xf7, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsi, 0xf3/3, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsmsk, 0xf3/2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsr, 0xf3/1, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 tzcnt, 0xf30fbc, BMI, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // TBM instructions
@@ -2046,13 +2056,21 @@ bndldx, 0x0f1a, MPX, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND }
 
 // SHA instructions.
 sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1nexte, 0xf38c8, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1msg1, 0xf38c9, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1msg2, 0xf38ca, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256msg1, 0xf38cc, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256msg1, 0xdc, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256msg2, 0xf38cd, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256msg2, 0xdd, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 
 // SHA512 instructions.
 
@@ -2114,9 +2132,9 @@ kor<bw>, 0x<bw:kpfx>45, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { R
 kxnor<bw>, 0x<bw:kpfx>46, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 kxor<bw>, 0x<bw:kpfx>47, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 
-kmov<bw>, 0x<bw:kpfx>90, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
-kmov<bw>, 0x<bw:kpfx>91, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, <bw:elem>|Unspecified|BaseIndex }
-kmov<bw>, 0x<bw:kpfx>92, <bw:kcpu>, D|Modrm|Vex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
+kmov<bw>, 0x<bw:kpfx>90, APX_F(<bw:kcpu>), Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
+kmov<bw>, 0x<bw:kpfx>91, APX_F(<bw:kcpu>), Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask, <bw:elem>|Unspecified|BaseIndex }
+kmov<bw>, 0x<bw:kpfx>92, APX_F(<bw:kcpu>), D|Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
 
 knot<bw>, 0x<bw:kpfx>44, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
 kortest<bw>, 0x<bw:kpfx>98, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
@@ -2591,9 +2609,9 @@ vpmovzxdq, 0x6635, AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift
 kadd<dq>, 0x<dq:kpfx>4a, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kand<dq>, 0x<dq:kpfx>41, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kandn<dq>, 0x<dq:kpfx>42, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf|Optimize, { RegMask, RegMask, RegMask }
-kmov<dq>, 0x<dq:kpfx>90, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
-kmov<dq>, 0x<dq:kpfx>91, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
-kmov<dq>, 0xf292, AVX512BW, D|Modrm|Vex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
+kmov<dq>, 0x<dq:kpfx>90, APX_F(AVX512BW), Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
+kmov<dq>, 0x<dq:kpfx>91, APX_F(AVX512BW), Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
+kmov<dq>, 0xf292, APX_F(AVX512BW), D|Modrm|Vex128|EVex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
 knot<dq>, 0x<dq:kpfx>44, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
 kor<dq>, 0x<dq:kpfx>45, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kortest<dq>, 0x<dq:kpfx>98, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
@@ -2992,9 +3010,13 @@ rdsspq, 0xf30f1e/1, SHSTK&x64, Modrm|NoSuf, { Reg64 }
 saveprevssp, 0xf30f01ea, SHSTK, NoSuf, {}
 rstorssp, 0xf30f01/5, SHSTK, Modrm|NoSuf, { Qword|Unspecified|BaseIndex }
 wrssd, 0x0f38f6, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32, Dword|Unspecified|BaseIndex }
+wrssd, 0x66, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4, { Reg32, Dword|Unspecified|BaseIndex }
 wrssq, 0x0f38f6, SHSTK&x64, Modrm|NoSuf|Size64, { Reg64, Qword|Unspecified|BaseIndex }
+wrssq, 0x66, SHSTK&APX_F, Modrm|NoSuf|Size64|EVexMap4, { Reg64, Qword|Unspecified|BaseIndex }
 wrussd, 0x660f38f5, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32, Dword|Unspecified|BaseIndex }
+wrussd, 0x6665, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4, { Reg32, Dword|Unspecified|BaseIndex }
 wrussq, 0x660f38f5, SHSTK&x64, Modrm|NoSuf, { Reg64, Qword|Unspecified|BaseIndex }
+wrussq, 0x6665, SHSTK&APX_F, Modrm|NoSuf|EVexMap4, { Reg64, Qword|Unspecified|BaseIndex }
 setssbsy, 0xf30f01e8, SHSTK, NoSuf, {}
 clrssbsy, 0xf30fae/6, SHSTK, Modrm|NoSuf, { Qword|Unspecified|BaseIndex }
 endbr64, 0xf30f1efa, IBT, NoSuf, {}
@@ -3042,7 +3064,9 @@ cldemote, 0x0f1c/0, CLDEMOTE, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
 // MOVDIR[I,64B] instructions.
 
 movdiri, 0xf38f9, MOVDIRI, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+movdiri, 0xf9, MOVDIRI&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 movdir64b, 0x660f38f8, MOVDIR64B, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movdir64b, 0x66f8, MOVDIR64B&APX_F, Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 
 // MOVEDIR instructions end.
 
@@ -3071,7 +3095,9 @@ vcvtneps2bf16<Vxy>, 0xf372, AVX_NE_CONVERT, Modrm|<Vxy:vex>|Space0F38|VexW0|NoSu
 // ENQCMD instructions.
 
 enqcmd, 0xf20f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+enqcmd, 0xf2f8, APX_F(ENQCMD), Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 enqcmds, 0xf30f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+enqcmds, 0xf3f8, APX_F(ENQCMD), Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 
 // ENQCMD instructions end.
 
@@ -3132,8 +3158,8 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
 
 // AMX instructions.
 
-ldtilecfg, 0x49/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
-sttilecfg, 0x6649/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
+ldtilecfg, 0x49/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
+sttilecfg, 0x6649/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
 
 tcmmimfp16ps, 0x666c, AMX_COMPLEX, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 tcmmrlfp16ps, 0x6c, AMX_COMPLEX, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
@@ -3145,9 +3171,9 @@ tdpbuud, 0x5e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
 tdpbusd, 0x665e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 tdpbsud, 0xf35e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 
-tileloadd, 0xf24b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
-tileloaddt1, 0x664b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
-tilestored, 0xf34b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }
+tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tileloaddt1, 0x664b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tilestored, 0xf34b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }
 
 tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {}
 
@@ -3159,15 +3185,25 @@ tilezero, 0xf249, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
 
 loadiwkey, 0xf30f38dc, KL, Load|Modrm|NoSuf, { RegXMM, RegXMM }
 encodekey128, 0xf30f38fa, KL, Modrm|NoSuf, { Reg32, Reg32 }
+encodekey128, 0xf3da, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32, Reg32 }
 encodekey256, 0xf30f38fb, KL, Modrm|NoSuf, { Reg32, Reg32 }
+encodekey256, 0xf3db, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32, Reg32 }
 aesenc128kl, 0xf30f38dc, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesenc128kl, 0xf3dc, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesdec128kl, 0xf30f38dd, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesdec128kl, 0xf3dd, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesenc256kl, 0xf30f38de, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesenc256kl, 0xf3de, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesdec256kl, 0xf30f38df, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesdec256kl, 0xf3df, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesencwide128kl, 0xf30f38d8/0, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesencwide128kl, 0xf3d8/0, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesdecwide128kl, 0xf30f38d8/1, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesdecwide128kl, 0xf3d8/1, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesencwide256kl, 0xf30f38d8/2, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesencwide256kl, 0xf3d8/2, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesdecwide256kl, 0xf30f38d8/3, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesdecwide256kl, 0xf3d8/3, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 
 // KEYLOCKER instructions end.
 
@@ -3315,7 +3351,7 @@ prefetchit1, 0xf18/6, PREFETCHI, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
 
 // CMPCCXADD instructions.
 
-cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD, Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+cmp<cc>xadd, 0x66e<cc:opc>, APX_F(CMPCCXADD), Modrm|Vex|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 
 // CMPCCXADD instructions end.
 
@@ -3335,9 +3371,13 @@ wrmsrlist, 0xf30f01c6, MSRLIST, NoSuf, {}
 // RAO-INT instructions.
 
 aadd, 0xf38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aadd, 0xfc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 aand, 0x660f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aand, 0x66fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 aor, 0xf20f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aor, 0xf2fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 axor, 0xf30f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+axor, 0xf3fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 
 // RAO-INT instructions end.
 
-- 
2.25.1



* [PATCH V5 4/9] Add tests for APX GPR32 with extend evex prefix
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
                   ` (2 preceding siblings ...)
  2023-12-28  1:27 ` [PATCH V5 3/9] Support APX GPR32 with extend evex prefix Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:54   ` H.J. Lu
  2023-12-28  1:27 ` [PATCH V5 5/9] Support APX NDD Cui, Lili
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC (permalink / raw)
  To: binutils; +Cc: hongjiu.lu, jbeulich

gas/ChangeLog:

2023-12-28 Lingling Kong <lingling.kong@intel.com>
	    H.J. Lu  <hongjiu.lu@intel.com>
	    Lili Cui <lili.cui@intel.com>
	    Lin Hu   <lin1.hu@intel.com>

	* testsuite/gas/i386/x86-64-apx-egpr-inval.l: Add insns that don't
	support gpr32.
	* testsuite/gas/i386/x86-64-apx-egpr-inval.s: Ditto.
	* testsuite/gas/i386/x86-64.exp: Add new test.
	* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: New test.
	* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: New test.
	* testsuite/gas/i386/x86-64-apx-evex-egpr.d: New test.
	* testsuite/gas/i386/x86-64-apx-evex-egpr.s: New test.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: New test.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: New test.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: New test.
	* testsuite/gas/i386/x86-64-apx-evex-promoted.d: New test.
	* testsuite/gas/i386/x86-64-apx-evex-promoted.s: New test.
---
 .../gas/i386/x86-64-apx-egpr-inval.l          | 187 ++++++++++
 .../gas/i386/x86-64-apx-egpr-inval.s          | 191 +++++++++++
 .../gas/i386/x86-64-apx-egpr-promote-inval.l  |  20 ++
 .../gas/i386/x86-64-apx-egpr-promote-inval.s  |  29 ++
 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d |  20 ++
 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s |  21 ++
 .../gas/i386/x86-64-apx-evex-promoted-bad.d   |  33 ++
 .../gas/i386/x86-64-apx-evex-promoted-bad.s   |  39 +++
 .../gas/i386/x86-64-apx-evex-promoted-intel.d | 318 ++++++++++++++++++
 .../gas/i386/x86-64-apx-evex-promoted.d       | 318 ++++++++++++++++++
 .../gas/i386/x86-64-apx-evex-promoted.s       | 314 +++++++++++++++++
 gas/testsuite/gas/i386/x86-64.exp             |   5 +
 12 files changed, 1495 insertions(+)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s

diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
index bb5c602a2e2..0472748978a 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
+++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
@@ -12,4 +12,191 @@
 .*:16: Error: extended GPR cannot be used as base/index for `xsaveopt64'
 .*:17: Error: extended GPR cannot be used as base/index for `xsavec'
 .*:18: Error: extended GPR cannot be used as base/index for `xsavec64'
+.*:20: Error: extended GPR cannot be used as base/index for `blendpd'
+.*:21: Error: extended GPR cannot be used as base/index for `blendps'
+.*:22: Error: extended GPR cannot be used as base/index for `blendvpd'
+.*:23: Error: extended GPR cannot be used as base/index for `blendvpd'
+.*:24: Error: extended GPR cannot be used as base/index for `blendvps'
+.*:25: Error: extended GPR cannot be used as base/index for `blendvps'
+.*:26: Error: extended GPR cannot be used as base/index for `dppd'
+.*:27: Error: extended GPR cannot be used as base/index for `dpps'
+.*:28: Error: register type mismatch for `extractps'
+.*:29: Error: extended GPR cannot be used as base/index for `extractps'
+.*:30: Error: extended GPR cannot be used as base/index for `insertps'
+.*:31: Error: extended GPR cannot be used as base/index for `movntdqa'
+.*:32: Error: extended GPR cannot be used as base/index for `mpsadbw'
+.*:33: Error: extended GPR cannot be used as base/index for `pabsb'
+.*:34: Error: extended GPR cannot be used as base/index for `pabsd'
+.*:35: Error: extended GPR cannot be used as base/index for `pabsw'
+.*:36: Error: extended GPR cannot be used as base/index for `packusdw'
+.*:37: Error: extended GPR cannot be used as base/index for `palignr'
+.*:38: Error: extended GPR cannot be used as base/index for `pblendvb'
+.*:39: Error: extended GPR cannot be used as base/index for `pblendvb'
+.*:40: Error: extended GPR cannot be used as base/index for `pblendw'
+.*:41: Error: extended GPR cannot be used as base/index for `pcmpeqq'
+.*:42: Error: extended GPR cannot be used as base/index for `pcmpestri'
+.*:43: Error: extended GPR cannot be used as base/index for `pcmpestrm'
+.*:44: Error: extended GPR cannot be used as base/index for `pcmpgtq'
+.*:45: Error: extended GPR cannot be used as base/index for `pcmpistri'
+.*:46: Error: extended GPR cannot be used as base/index for `pcmpistrm'
+.*:47: Error: register type mismatch for `pextrb'
+.*:48: Error: extended GPR cannot be used as base/index for `pextrb'
+.*:49: Error: extended GPR cannot be used as base/index for `pextrd'
+.*:50: Error: extended GPR cannot be used as base/index for `pextrq'
+.*:51: Error: extended GPR cannot be used as base/index for `pextrw'
+.*:52: Error: extended GPR cannot be used as base/index for `phaddd'
+.*:53: Error: extended GPR cannot be used as base/index for `phaddsw'
+.*:54: Error: extended GPR cannot be used as base/index for `phaddw'
+.*:55: Error: extended GPR cannot be used as base/index for `phminposuw'
+.*:56: Error: extended GPR cannot be used as base/index for `phsubw'
+.*:57: Error: register type mismatch for `pinsrb'
+.*:58: Error: extended GPR cannot be used as base/index for `pinsrb'
+.*:59: Error: register type mismatch for `pinsrd'
+.*:60: Error: extended GPR cannot be used as base/index for `pinsrd'
+.*:61: Error: register type mismatch for `pinsrq'
+.*:62: Error: extended GPR cannot be used as base/index for `pinsrq'
+.*:63: Error: extended GPR cannot be used as base/index for `pmaddubsw'
+.*:64: Error: extended GPR cannot be used as base/index for `pmaxsb'
+.*:65: Error: extended GPR cannot be used as base/index for `pmaxsd'
+.*:66: Error: extended GPR cannot be used as base/index for `pmaxud'
+.*:67: Error: extended GPR cannot be used as base/index for `pmaxuw'
+.*:68: Error: extended GPR cannot be used as base/index for `pminsb'
+.*:69: Error: extended GPR cannot be used as base/index for `pminsd'
+.*:70: Error: extended GPR cannot be used as base/index for `pminud'
+.*:71: Error: extended GPR cannot be used as base/index for `pminuw'
+.*:72: Error: extended GPR cannot be used as base/index for `pmovsxbd'
+.*:73: Error: extended GPR cannot be used as base/index for `pmovsxbq'
+.*:74: Error: extended GPR cannot be used as base/index for `pmovsxbw'
+.*:75: Error: extended GPR cannot be used as base/index for `pmovsxbw'
+.*:76: Error: extended GPR cannot be used as base/index for `pmovsxdq'
+.*:77: Error: extended GPR cannot be used as base/index for `pmovsxwd'
+.*:78: Error: extended GPR cannot be used as base/index for `pmovsxwq'
+.*:79: Error: extended GPR cannot be used as base/index for `pmovzxbd'
+.*:80: Error: extended GPR cannot be used as base/index for `pmovzxbq'
+.*:81: Error: extended GPR cannot be used as base/index for `pmovzxdq'
+.*:82: Error: extended GPR cannot be used as base/index for `pmovzxwd'
+.*:83: Error: extended GPR cannot be used as base/index for `pmovzxwq'
+.*:84: Error: extended GPR cannot be used as base/index for `pmuldq'
+.*:85: Error: extended GPR cannot be used as base/index for `pmulhrsw'
+.*:86: Error: extended GPR cannot be used as base/index for `pmulld'
+.*:87: Error: extended GPR cannot be used as base/index for `pshufb'
+.*:88: Error: extended GPR cannot be used as base/index for `psignb'
+.*:89: Error: extended GPR cannot be used as base/index for `psignd'
+.*:90: Error: extended GPR cannot be used as base/index for `psignw'
+.*:91: Error: extended GPR cannot be used as base/index for `roundpd'
+.*:92: Error: extended GPR cannot be used as base/index for `roundps'
+.*:93: Error: extended GPR cannot be used as base/index for `roundsd'
+.*:94: Error: extended GPR cannot be used as base/index for `roundss'
+.*:96: Error: extended GPR cannot be used as base/index for `aesdec'
+.*:97: Error: extended GPR cannot be used as base/index for `aesdeclast'
+.*:98: Error: extended GPR cannot be used as base/index for `aesenc'
+.*:99: Error: extended GPR cannot be used as base/index for `aesenclast'
+.*:100: Error: extended GPR cannot be used as base/index for `aesimc'
+.*:101: Error: extended GPR cannot be used as base/index for `aeskeygenassist'
+.*:102: Error: extended GPR cannot be used as base/index for `pclmulhqhqdq'
+.*:103: Error: extended GPR cannot be used as base/index for `pclmulhqlqdq'
+.*:104: Error: extended GPR cannot be used as base/index for `pclmullqhqdq'
+.*:105: Error: extended GPR cannot be used as base/index for `pclmullqlqdq'
+.*:106: Error: extended GPR cannot be used as base/index for `pclmulqdq'
+.*:108: Error: extended GPR cannot be used as base/index for `gf2p8affineinvqb'
+.*:109: Error: extended GPR cannot be used as base/index for `gf2p8affineqb'
+.*:110: Error: extended GPR cannot be used as base/index for `gf2p8mulb'
+.*:112: Error: extended GPR cannot be used as base/index for `vaesimc'
+.*:113: Error: extended GPR cannot be used as base/index for `vaeskeygenassist'
+.*:114: Error: extended GPR cannot be used as base/index for `vblendpd'
+.*:115: Error: extended GPR cannot be used as base/index for `vblendpd'
+.*:116: Error: extended GPR cannot be used as base/index for `vblendps'
+.*:117: Error: extended GPR cannot be used as base/index for `vblendps'
+.*:118: Error: extended GPR cannot be used as base/index for `vblendvpd'
+.*:119: Error: extended GPR cannot be used as base/index for `vblendvpd'
+.*:120: Error: extended GPR cannot be used as base/index for `vblendvps'
+.*:121: Error: extended GPR cannot be used as base/index for `vblendvps'
+.*:122: Error: extended GPR cannot be used as base/index for `vdppd'
+.*:123: Error: extended GPR cannot be used as base/index for `vdpps'
+.*:124: Error: extended GPR cannot be used as base/index for `vdpps'
+.*:125: Error: extended GPR cannot be used as base/index for `vhaddpd'
+.*:126: Error: extended GPR cannot be used as base/index for `vhaddpd'
+.*:127: Error: extended GPR cannot be used as base/index for `vhsubps'
+.*:128: Error: extended GPR cannot be used as base/index for `vhsubps'
+.*:129: Error: extended GPR cannot be used as base/index for `vlddqu'
+.*:130: Error: extended GPR cannot be used as base/index for `vlddqu'
+.*:131: Error: extended GPR cannot be used as base/index for `vldmxcsr'
+.*:132: Error: extended GPR cannot be used as base/index for `vmaskmovpd'
+.*:133: Error: extended GPR cannot be used as base/index for `vmaskmovpd'
+.*:134: Error: extended GPR cannot be used as base/index for `vmaskmovpd'
+.*:135: Error: extended GPR cannot be used as base/index for `vmaskmovpd'
+.*:136: Error: extended GPR cannot be used as base/index for `vmaskmovps'
+.*:137: Error: extended GPR cannot be used as base/index for `vmaskmovps'
+.*:138: Error: extended GPR cannot be used as base/index for `vmaskmovps'
+.*:139: Error: extended GPR cannot be used as base/index for `vmaskmovps'
+.*:140: Error: register type mismatch for `vmovmskpd'
+.*:141: Error: register type mismatch for `vmovmskpd'
+.*:142: Error: register type mismatch for `vmovmskps'
+.*:143: Error: register type mismatch for `vmovmskps'
+.*:144: Error: extended GPR cannot be used as base/index for `vpblendd'
+.*:145: Error: extended GPR cannot be used as base/index for `vpblendd'
+.*:146: Error: extended GPR cannot be used as base/index for `vpblendvb'
+.*:147: Error: extended GPR cannot be used as base/index for `vpblendvb'
+.*:148: Error: extended GPR cannot be used as base/index for `vpblendw'
+.*:149: Error: extended GPR cannot be used as base/index for `vpblendw'
+.*:150: Error: extended GPR cannot be used as base/index for `vpcmpeqb'
+.*:151: Error: extended GPR cannot be used as base/index for `vpcmpeqd'
+.*:152: Error: extended GPR cannot be used as base/index for `vpcmpeqq'
+.*:153: Error: extended GPR cannot be used as base/index for `vpcmpeqw'
+.*:154: Error: extended GPR cannot be used as base/index for `vpcmpestri'
+.*:155: Error: extended GPR cannot be used as base/index for `vpcmpestrm'
+.*:156: Error: extended GPR cannot be used as base/index for `vpcmpgtb'
+.*:157: Error: extended GPR cannot be used as base/index for `vpcmpgtd'
+.*:158: Error: extended GPR cannot be used as base/index for `vpcmpgtq'
+.*:159: Error: extended GPR cannot be used as base/index for `vpcmpgtw'
+.*:160: Error: extended GPR cannot be used as base/index for `vpcmpistri'
+.*:161: Error: extended GPR cannot be used as base/index for `vpcmpistrm'
+.*:162: Error: extended GPR cannot be used as base/index for `vperm2f128'
+.*:163: Error: extended GPR cannot be used as base/index for `vperm2i128'
+.*:164: Error: extended GPR cannot be used as base/index for `vphaddd'
+.*:165: Error: extended GPR cannot be used as base/index for `vphaddd'
+.*:166: Error: extended GPR cannot be used as base/index for `vphaddsw'
+.*:167: Error: extended GPR cannot be used as base/index for `vphaddsw'
+.*:168: Error: extended GPR cannot be used as base/index for `vphaddw'
+.*:169: Error: extended GPR cannot be used as base/index for `vphaddw'
+.*:170: Error: extended GPR cannot be used as base/index for `vphminposuw'
+.*:171: Error: extended GPR cannot be used as base/index for `vphsubd'
+.*:172: Error: extended GPR cannot be used as base/index for `vphsubd'
+.*:173: Error: extended GPR cannot be used as base/index for `vphsubsw'
+.*:174: Error: extended GPR cannot be used as base/index for `vphsubsw'
+.*:175: Error: extended GPR cannot be used as base/index for `vphsubw'
+.*:176: Error: extended GPR cannot be used as base/index for `vphsubw'
+.*:177: Error: extended GPR cannot be used as base/index for `vpmaskmovd'
+.*:178: Error: extended GPR cannot be used as base/index for `vpmaskmovd'
+.*:179: Error: extended GPR cannot be used as base/index for `vpmaskmovd'
+.*:180: Error: extended GPR cannot be used as base/index for `vpmaskmovd'
+.*:181: Error: extended GPR cannot be used as base/index for `vpmaskmovq'
+.*:182: Error: extended GPR cannot be used as base/index for `vpmaskmovq'
+.*:183: Error: extended GPR cannot be used as base/index for `vpmaskmovq'
+.*:184: Error: extended GPR cannot be used as base/index for `vpmaskmovq'
+.*:185: Error: register type mismatch for `vpmovmskb'
+.*:186: Error: register type mismatch for `vpmovmskb'
+.*:187: Error: extended GPR cannot be used as base/index for `vpsignb'
+.*:188: Error: extended GPR cannot be used as base/index for `vpsignb'
+.*:189: Error: extended GPR cannot be used as base/index for `vpsignd'
+.*:190: Error: extended GPR cannot be used as base/index for `vpsignd'
+.*:191: Error: extended GPR cannot be used as base/index for `vpsignw'
+.*:192: Error: extended GPR cannot be used as base/index for `vpsignw'
+.*:193: Error: extended GPR cannot be used as base/index for `vptest'
+.*:194: Error: extended GPR cannot be used as base/index for `vptest'
+.*:195: Error: extended GPR cannot be used as base/index for `vrcpps'
+.*:196: Error: extended GPR cannot be used as base/index for `vrcpps'
+.*:197: Error: extended GPR cannot be used as base/index for `vrcpss'
+.*:198: Error: extended GPR cannot be used as base/index for `vroundpd'
+.*:199: Error: extended GPR cannot be used as base/index for `vroundps'
+.*:200: Error: extended GPR cannot be used as base/index for `vroundsd'
+.*:201: Error: extended GPR cannot be used as base/index for `vroundss'
+.*:202: Error: extended GPR cannot be used as base/index for `vrsqrtps'
+.*:203: Error: extended GPR cannot be used as base/index for `vrsqrtps'
+.*:204: Error: extended GPR cannot be used as base/index for `vrsqrtss'
+.*:205: Error: extended GPR cannot be used as base/index for `vstmxcsr'
+.*:206: Error: extended GPR cannot be used as base/index for `vtestpd'
+.*:207: Error: extended GPR cannot be used as base/index for `vtestpd'
+.*:208: Error: extended GPR cannot be used as base/index for `vtestps'
+.*:209: Error: extended GPR cannot be used as base/index for `vtestps'
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
index bfb6b3fd03b..fde038d6b2f 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
+++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
@@ -16,3 +16,194 @@
 	xsaveopt64 (%r16, %r31)
 	xsavec (%r16, %rbx)
 	xsavec64 (%r16, %r31)
+#SSE
+	blendpd $100,(%r18),%xmm6
+	blendps $100,(%r18),%xmm6
+	blendvpd %xmm0,(%r19),%xmm6
+	blendvpd (%r19),%xmm6
+	blendvps %xmm0,(%r19),%xmm6
+	blendvps (%r19),%xmm6
+	dppd $100,(%r20),%xmm6
+	dpps $100,(%r20),%xmm6
+	extractps $100,%xmm4,%r21
+	extractps $100,%xmm4,(%r21)
+	insertps $100,(%r21),%xmm6
+	movntdqa (%r21),%xmm4
+	mpsadbw $100,(%r21),%xmm6
+	pabsb (%r17),%xmm0
+	pabsd (%r17),%xmm0
+	pabsw (%r17),%xmm0
+	packusdw (%r21),%xmm6
+	palignr $100,(%r17),%xmm6
+	pblendvb %xmm0,(%r22),%xmm6
+	pblendvb (%r22),%xmm6
+	pblendw $100,(%r22),%xmm6
+	pcmpeqq (%r22),%xmm6
+	pcmpestri $100,(%r25),%xmm6
+	pcmpestrm $100,(%r25),%xmm6
+	pcmpgtq (%r25),%xmm4
+	pcmpistri $100,(%r25),%xmm6
+	pcmpistrm $100,(%r25),%xmm6
+	pextrb $100,%xmm4,%r22
+	pextrb $100,%xmm4,(%r22)
+	pextrd $100,%xmm4,(%r22)
+	pextrq $100,%xmm4,(%r22)
+	pextrw $100,%xmm4,(%r22)
+	phaddd  (%r17),%xmm0
+	phaddsw (%r17),%xmm0
+	phaddw  (%r17),%xmm0
+	phminposuw (%r23),%xmm4
+	phsubw (%r17),%xmm0
+	pinsrb $100,%r23,%xmm4
+	pinsrb $100,(%r23),%xmm4
+	pinsrd $100, %r23d, %xmm4
+	pinsrd $100,(%r23),%xmm4
+	pinsrq $100, %r24, %xmm4
+	pinsrq $100,(%r24),%xmm4
+	pmaddubsw (%r17),%xmm0
+	pmaxsb (%r24),%xmm6
+	pmaxsd (%r24),%xmm6
+	pmaxud (%r24),%xmm6
+	pmaxuw (%r24),%xmm6
+	pminsb (%r24),%xmm6
+	pminsd (%r24),%xmm6
+	pminud (%r24),%xmm6
+	pminuw (%r24),%xmm6
+	pmovsxbd (%r24),%xmm4
+	pmovsxbq (%r24),%xmm4
+	pmovsxbw (%r24),%xmm4
+	pmovsxbw (%r24),%xmm4
+	pmovsxdq (%r24),%xmm4
+	pmovsxwd (%r24),%xmm4
+	pmovsxwq (%r24),%xmm4
+	pmovzxbd (%r24),%xmm4
+	pmovzxbq (%r24),%xmm4
+	pmovzxdq (%r24),%xmm4
+	pmovzxwd (%r24),%xmm4
+	pmovzxwq (%r24),%xmm4
+	pmuldq (%r24),%xmm4
+	pmulhrsw (%r17),%xmm0
+	pmulld (%r24),%xmm4
+	pshufb (%r17),%xmm0
+	psignb (%r17),%xmm0
+	psignd (%r17),%xmm0
+	psignw (%r17),%xmm0
+	roundpd $100,(%r24),%xmm6
+	roundps $100,(%r24),%xmm6
+	roundsd $100,(%r24),%xmm6
+	roundss $100,(%r24),%xmm6
+#AES
+	aesdec (%r26),%xmm6
+	aesdeclast (%r26),%xmm6
+	aesenc (%r26),%xmm6
+	aesenclast (%r26),%xmm6
+	aesimc (%r26),%xmm6
+	aeskeygenassist $100,(%r26),%xmm6
+	pclmulhqhqdq (%r26),%xmm6
+	pclmulhqlqdq (%r26),%xmm6
+	pclmullqhqdq (%r26),%xmm6
+	pclmullqlqdq (%r26),%xmm6
+	pclmulqdq $100,(%r26),%xmm6
+#GFNI
+	gf2p8affineinvqb $100,(%r26),%xmm6
+	gf2p8affineqb $100,(%r26),%xmm6
+	gf2p8mulb (%r26),%xmm6
+#VEX without evex
+	vaesimc (%r27), %xmm3
+	vaeskeygenassist $7,(%r27),%xmm3
+	vblendpd $7,(%r27),%xmm6,%xmm2
+	vblendpd $7,(%r27),%ymm6,%ymm2
+	vblendps $7,(%r27),%xmm6,%xmm2
+	vblendps $7,(%r27),%ymm6,%ymm2
+	vblendvpd %xmm4,(%r27),%xmm2,%xmm7
+	vblendvpd %ymm4,(%r27),%ymm2,%ymm7
+	vblendvps %xmm4,(%r27),%xmm2,%xmm7
+	vblendvps %ymm4,(%r27),%ymm2,%ymm7
+	vdppd $7,(%r27),%xmm6,%xmm2
+	vdpps $7,(%r27),%xmm6,%xmm2
+	vdpps $7,(%r27),%ymm6,%ymm2
+	vhaddpd (%r27),%xmm6,%xmm5
+	vhaddpd (%r27),%ymm6,%ymm5
+	vhsubps (%r27),%xmm6,%xmm5
+	vhsubps (%r27),%ymm6,%ymm5
+	vlddqu (%r27),%xmm4
+	vlddqu (%r27),%ymm4
+	vldmxcsr (%r27)
+	vmaskmovpd %xmm4,%xmm6,(%r27)
+	vmaskmovpd %ymm4,%ymm6,(%r27)
+	vmaskmovpd (%r27),%xmm4,%xmm6
+	vmaskmovpd (%r27),%ymm4,%ymm6
+	vmaskmovps %xmm4,%xmm6,(%r27)
+	vmaskmovps %ymm4,%ymm6,(%r27)
+	vmaskmovps (%r27),%xmm4,%xmm6
+	vmaskmovps (%r27),%ymm4,%ymm6
+	vmovmskpd %xmm4,%r27d
+	vmovmskpd %xmm8,%r27d
+	vmovmskps %xmm4,%r27d
+	vmovmskps %ymm8,%r27d
+	vpblendd $7,(%r27),%xmm6,%xmm2
+	vpblendd $7,(%r27),%ymm6,%ymm2
+	vpblendvb %xmm4,(%r27),%xmm2,%xmm7
+	vpblendvb %ymm4,(%r27),%ymm2,%ymm7
+	vpblendw $7,(%r27),%xmm6,%xmm2
+	vpblendw $7,(%r27),%ymm6,%ymm2
+	vpcmpeqb (%r26),%ymm6,%ymm2
+	vpcmpeqd (%r26),%ymm6,%ymm2
+	vpcmpeqq (%r16),%ymm6,%ymm2
+	vpcmpeqw (%r16),%ymm6,%ymm2
+	vpcmpestri $7,(%r27),%xmm6
+	vpcmpestrm $7,(%r27),%xmm6
+	vpcmpgtb (%r26),%ymm6,%ymm2
+	vpcmpgtd (%r26),%ymm6,%ymm2
+	vpcmpgtq (%r16),%ymm6,%ymm2
+	vpcmpgtw (%r16),%ymm6,%ymm2
+	vpcmpistri $100,(%r25),%xmm6
+	vpcmpistrm $100,(%r25),%xmm6
+	vperm2f128 $7,(%r27),%ymm6,%ymm2
+	vperm2i128 $7,(%r27),%ymm6,%ymm2
+	vphaddd (%r27),%xmm6,%xmm7
+	vphaddd (%r27),%ymm6,%ymm7
+	vphaddsw (%r27),%xmm6,%xmm7
+	vphaddsw (%r27),%ymm6,%ymm7
+	vphaddw (%r27),%xmm6,%xmm7
+	vphaddw (%r27),%ymm6,%ymm7
+	vphminposuw (%r27),%xmm6
+	vphsubd (%r27),%xmm6,%xmm7
+	vphsubd (%r27),%ymm6,%ymm7
+	vphsubsw (%r27),%xmm6,%xmm7
+	vphsubsw (%r27),%ymm6,%ymm7
+	vphsubw (%r27),%xmm6,%xmm7
+	vphsubw (%r27),%ymm6,%ymm7
+	vpmaskmovd %xmm4,%xmm6,(%r27)
+	vpmaskmovd %ymm4,%ymm6,(%r27)
+	vpmaskmovd (%r27),%xmm4,%xmm6
+	vpmaskmovd (%r27),%ymm4,%ymm6
+	vpmaskmovq %xmm4,%xmm6,(%r27)
+	vpmaskmovq %ymm4,%ymm6,(%r27)
+	vpmaskmovq (%r27),%xmm4,%xmm6
+	vpmaskmovq (%r27),%ymm4,%ymm6
+	vpmovmskb %xmm4,%r27
+	vpmovmskb %ymm4,%r27d
+	vpsignb (%r27),%xmm6,%xmm7
+	vpsignb (%r27),%xmm6,%xmm7
+	vpsignd (%r27),%xmm6,%xmm7
+	vpsignd (%r27),%xmm6,%xmm7
+	vpsignw (%r27),%xmm6,%xmm7
+	vpsignw (%r27),%xmm6,%xmm7
+	vptest (%r27),%ymm6
+	vptest (%r27),%xmm6
+	vrcpps (%r27),%xmm6
+	vrcpps (%r27),%ymm6
+	vrcpss (%r27),%xmm6,%xmm6
+	vroundpd $1,(%r24),%xmm6
+	vroundps $2,(%r24),%xmm6
+	vroundsd $3,(%r24),%xmm6,%xmm3
+	vroundss $4,(%r24),%xmm6,%xmm3
+	vrsqrtps (%r27),%xmm6
+	vrsqrtps (%r27),%ymm6
+	vrsqrtss (%r27),%xmm6,%xmm6
+	vstmxcsr (%r27)
+	vtestpd (%r27),%xmm6
+	vtestpd (%r27),%ymm6
+	vtestps (%r27),%xmm6
+	vtestps (%r27),%ymm6
diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
new file mode 100644
index 00000000000..f8701d7ec22
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
@@ -0,0 +1,20 @@
+.*: Assembler messages:
+.*:4: Error: `movbe' is not supported on `x86_64.nomovbe'
+.*:5: Error: `movbe' is not supported on `x86_64.nomovbe'
+.*:8: Error: `invept' is not supported on `x86_64.noept'
+.*:9: Error: `invept' is not supported on `x86_64.noept'
+.*:12: Error: `kmovq' is not supported on `x86_64.noavx512bw'
+.*:13: Error: `kmovq' is not supported on `x86_64.noavx512bw'
+.*:16: Error: `kmovb' is not supported on `x86_64.noavx512dq'
+.*:17: Error: `kmovb' is not supported on `x86_64.noavx512dq'
+.*:20: Error: `kmovw' is not supported on `x86_64.noavx512f'
+.*:21: Error: `kmovw' is not supported on `x86_64.noavx512f'
+.*:24: Error: `andn' is not supported on `x86_64.nobmi'
+.*:25: Error: `andn' is not supported on `x86_64.nobmi'
+.*:28: Error: `bzhi' is not supported on `x86_64.nobmi2'
+.*:29: Error: `bzhi' is not supported on `x86_64.nobmi2'
+GAS LISTING .*
+#...
+[ 	]*1[ 	]+\# Check illegal 64bit APX EVEX promoted instructions
+[ 	]*2[ 	]+\.text
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
new file mode 100644
index 00000000000..2ea47419b4d
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
@@ -0,0 +1,29 @@
+# Check illegal 64bit APX EVEX promoted instructions
+	.text
+	.arch .nomovbe
+	movbe (%r16), %r17
+	movbe (%rax), %rcx
+	.arch default
+	.arch .noept
+	invept (%r16), %r17
+	invept (%rax), %rcx
+	.arch default
+	.arch .noavx512bw
+	kmovq %k1, (%r16)
+	kmovq %k1, (%r8)
+	.arch default
+	.arch .noavx512dq
+	kmovb %k1, %r16d
+	kmovb %k1, %r8d
+	.arch default
+	.arch .noavx512f
+	kmovw %k1, %r16d
+	kmovw %k1, %r8d
+	.arch default
+	.arch .nobmi
+	andn %r16,%r15,%r11
+	andn %r15,%r15,%r11
+	.arch default
+	.arch .nobmi2
+	bzhi %r16,%r15,%r11
+	bzhi %r15,%r15,%r11
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d b/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
new file mode 100644
index 00000000000..c3c578675c0
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
@@ -0,0 +1,20 @@
+#as:
+#objdump: -dw
+#name: x86-64 APX old evex insn use gpr32 with extend-evex prefix
+#source: x86-64-apx-evex-egpr.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 fb 79 48 19 04 08 01[	 ]+vextractf32x4 \$0x1,%zmm0,\(%r16,%r17,1\)
+\s*[a-f0-9]+:\s*62 fa 79 48 5a 04 1a[	 ]+vbroadcasti32x4 \(%r18,%r19,1\),%zmm0
+\s*[a-f0-9]+:\s*62 eb 7d 08 17 c4 01[	 ]+vextractps \$0x1,%xmm16,%r20d
+\s*[a-f0-9]+:\s*62 69 97 00 2a f5[	 ]+vcvtsi2sd %r21,%xmm29,%xmm30
+\s*[a-f0-9]+:\s*67 62 fe 55 58 96 36[	 ]+vfmaddsub132ph \(%r22d\)\{1to32\},%zmm5,%zmm6
+\s*[a-f0-9]+:\s*62 81 fe 18 78 fe[	 ]+vcvttss2usi \{sae\},%xmm30,%r23
+\s*[a-f0-9]+:\s*62 25 10 47 58 b4 c5 00 00 00 10[	 ]+vaddph 0x10000000\(%rbp,%r24,8\),%zmm29,%zmm30\{%k7\}
+\s*[a-f0-9]+:\s*62 4d 7c 08 2f 71 7f[	 ]+vcomish 0xfe\(%r25\),%xmm30
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s b/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
new file mode 100644
index 00000000000..7d1c5de2b6d
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
@@ -0,0 +1,21 @@
+# Check that 64bit old EVEX instructions can use gpr32 with the extended EVEX prefix encoding
+
+	.allow_index_reg
+	.text
+_start:
+## DestMem
+	 vextractf32x4	$1, %zmm0, (%r16,%r17)
+## SrcMem
+	 vbroadcasti32x4	(%r18,%r19), %zmm0
+## DestReg
+	 vextractps	$1, %xmm16, %r20d
+## SrcReg
+	 vcvtsi2sdq      %r21, %xmm29, %xmm30
+## Broadcast
+	 vfmaddsub132ph  (%r22d){1to32}, %zmm5, %zmm6
+## SAE
+	 vcvttss2usi     {sae}, %xmm30, %r23
+## Masking
+	 vaddph  0x10000000(%rbp, %r24, 8), %zmm29, %zmm30{%k7}
+## Disp8memshift
+	 vcomish 254(%r25), %xmm30
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
new file mode 100644
index 00000000000..69b2d87f0f7
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
@@ -0,0 +1,33 @@
+#objdump: -dw
+#name: x86-64 EVEX-promoted bad
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+[ 	]*[a-f0-9]+:[ 	]+62 fc 7e 08 60[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+c7[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+62 fc 7f 08 60[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+c7[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+62 e2 f9 41 91 84[ 	]+vpgatherqq \(bad\),%zmm16\{%k1\}
+[ 	]*[a-f0-9]+:[ 	]+cd ff[ 	]+int    \$0xff
+[ 	]*[a-f0-9]+:[ 	]+62 fd 7d 08 60[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+c7[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+62 fc 7d[ 	]+\(bad\).*
+[ 	]*[a-f0-9]+:[ 	]+09 60 c7[ 	]+or     %esp,-0x39\(%rax\)
+[ 	]*[a-f0-9]+:[ 	]+62 fc 7d[ 	]+\(bad\).*
+[ 	]*[a-f0-9]+:[ 	]+28 60 c7[ 	]+.*
+[ 	]*[a-f0-9]+:[ 	]+62 fc 7d[ 	]+\(bad\).*
+[ 	]*[a-f0-9]+:[ 	]+8f[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+60[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+c7[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+62 f2 fc 09 f5[ 	]+\(bad\).*
+[ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
+[ 	]*[a-f0-9]+:[ 	]+62 f2 fc 28 f5[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
+[ 	]*[a-f0-9]+:[ 	]+62 f2 fc 8f f5[ 	]+\(bad\).*
+[ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
+[ 	]*[a-f0-9]+:[ 	]+62 f2 fc 18 f5[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
new file mode 100644
index 00000000000..719c4b6de53
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
@@ -0,0 +1,39 @@
+# Check illegal prefixes for 64bit EVEX-promoted instructions
+
+        .allow_index_reg
+        .text
+_start:
+	#movbe %r23w,%ax set EVEX.pp = f3.
+	.insn EVEX.L0.f3.M12.W0 0x60, %di, %ax
+
+	#movbe %r23w,%ax set EVEX.pp = f2.
+	.insn EVEX.L0.f2.M12.W0 0x60, %di, %ax
+
+	#VSIB vpgatherqq (%rbp,%zmm17,8),%zmm16{%k1} set EVEX.P[10] == 0
+	.byte 0x62, 0xe2, 0xf9, 0x41, 0x91, 0x84, 0xcd
+	.byte 0xff
+
+	#EVEX_MAP4 movbe %r23w,%ax set EVEX.mm == 0b01.
+	.insn EVEX.L0.66.M13.W0 0x60, %di, %ax
+
+	#EVEX_MAP4 movbe %r23w,%ax set EVEX.aaa[1:0] (P[17:16]) == 0b01
+	.insn EVEX.L0.66.M12.W0 0x60, %di, %ax{%k1}
+
+	#EVEX_MAP4 movbe %r18w,%ax set EVEX.L'L == 0b01.
+	.insn EVEX.L1.66.M12.W0 0x60, %di, %ax
+
+	#EVEX_MAP4 movbe %r18w,%ax set EVEX.z == 0b1.
+	.insn EVEX.L0.66.M12.W0 0x60, %di, %ax {%k7}{z}
+
+	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.aaa[1:0] (P[17:16])
+	#== 0b01
+	.insn EVEX.L0.NP.0f38.W1 0xf5, %rax, (%rax,%rbx), %rcx{%k1}
+
+	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[22:21](EVEX.L'L) == 0b01
+	.insn EVEX.L1.NP.0f38.W1 0xf5, %rax, (%rax,%rbx), %rcx
+
+	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[23](EVEX.z) == 0b1
+	.insn EVEX.L0.NP.0f38.W1 0xf5, %rax, (%rax,%rbx), %rcx {%k7}{z}
+
+	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[20](EVEX.b) == 0b1
+	.insn EVEX.L0.NP.0f38.W1 0xf5, %rax, (%rax,%rbx){1to8}, %rcx
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
new file mode 100644
index 00000000000..02e811de88d
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
@@ -0,0 +1,318 @@
+#as:
+#objdump: -dw -Mintel
+#name: x86_64 APX_F EVEX-Promoted insns (Intel disassembly)
+#source: x86-64-apx-evex-promoted.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 fc 8c 87 23 01 00 00[	 ]+aadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 fc bc 87 23 01 00 00[	 ]+aadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 fc 8c 87 23 01 00 00[	 ]+aand[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 fc bc 87 23 01 00 00[	 ]+aand[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dd b4 87 23 01 00 00[	 ]+aesdec128kl xmm22,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 df b4 87 23 01 00 00[	 ]+aesdec256kl xmm22,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 8c 87 23 01 00 00[	 ]+aesdecwide128kl[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 9c 87 23 01 00 00[	 ]+aesdecwide256kl[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dc b4 87 23 01 00 00[	 ]+aesenc128kl xmm22,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 de b4 87 23 01 00 00[	 ]+aesenc256kl xmm22,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 84 87 23 01 00 00[	 ]+aesencwide128kl[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 94 87 23 01 00 00[	 ]+aesencwide256kl[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 fc 8c 87 23 01 00 00[	 ]+aor[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c ff 08 fc bc 87 23 01 00 00[	 ]+aor[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 fc 8c 87 23 01 00 00[	 ]+axor[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 fc bc 87 23 01 00 00[	 ]+axor[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f7 d2[	 ]+bextr[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f7 94 87 23 01 00 00[	 ]+bextr[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f7 df[	 ]+bextr[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f7 bc 87 23 01 00 00[	 ]+bextr[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d9[	 ]+blsi[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 df[	 ]+blsi[	 ]+r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d1[	 ]+blsmsk[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 d7[	 ]+blsmsk[	 ]+r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 c9[	 ]+blsr[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 cf[	 ]+blsr[	 ]+r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f5 d2[	 ]+bzhi[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f5 94 87 23 01 00 00[	 ]+bzhi[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f5 df[	 ]+bzhi[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f5 bc 87 23 01 00 00[	 ]+bzhi[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e6 94 87 23 01 00 00[	 ]+cmpbexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e6 bc 87 23 01 00 00[	 ]+cmpbexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e2 94 87 23 01 00 00[	 ]+cmpbxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e2 bc 87 23 01 00 00[	 ]+cmpbxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ec 94 87 23 01 00 00[	 ]+cmplxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ec bc 87 23 01 00 00[	 ]+cmplxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e7 94 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e7 bc 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e3 94 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e3 bc 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ef 94 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ef bc 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ed 94 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ed bc 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e1 94 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e1 bc 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 eb 94 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 eb bc 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e9 94 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e9 bc 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e5 94 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e5 bc 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e0 94 87 23 01 00 00[	 ]+cmpoxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e0 bc 87 23 01 00 00[	 ]+cmpoxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ea 94 87 23 01 00 00[	 ]+cmppxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ea bc 87 23 01 00 00[	 ]+cmppxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e8 94 87 23 01 00 00[	 ]+cmpsxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e8 bc 87 23 01 00 00[	 ]+cmpsxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e4 94 87 23 01 00 00[	 ]+cmpzxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e4 bc 87 23 01 00 00[	 ]+cmpzxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 f7[	 ]+crc32[	 ]+r22,r31
+[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 37[	 ]+crc32[	 ]+r22,QWORD PTR \[r31\]
+[	 ]*[a-f0-9]+:[	 ]*62 ec fc 08 f0 cb[	 ]+crc32[	 ]+r17,r19b
+[	 ]*[a-f0-9]+:[	 ]*62 ec 7c 08 f0 eb[	 ]+crc32[	 ]+r21d,r19b
+[	 ]*[a-f0-9]+:[	 ]*62 fc 7c 08 f0 1b[	 ]+crc32[	 ]+ebx,BYTE PTR \[r19\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 ff[	 ]+crc32[	 ]+r23d,r31d
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 3f[	 ]+crc32[	 ]+r23d,DWORD PTR \[r31\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 ef[	 ]+crc32[	 ]+r21d,r31w
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 2f[	 ]+crc32[	 ]+r21d,WORD PTR \[r31\]
+[	 ]*[a-f0-9]+:[	 ]*62 e4 fc 08 f1 d0[	 ]+crc32[	 ]+r18,rax
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 da d1[	 ]+encodekey128[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 db d1[	 ]+encodekey256[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7f 08 f8 8c 87 23 01 00 00[	 ]+enqcmd[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 f8 bc 87 23 01 00 00[	 ]+enqcmd[	 ]+r31,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7e 08 f8 8c 87 23 01 00 00[	 ]+enqcmds[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 f8 bc 87 23 01 00 00[	 ]+enqcmds[	 ]+r31,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f0 bc 87 23 01 00 00[	 ]+invept[	 ]+r31,OWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f2 bc 87 23 01 00 00[	 ]+invpcid[	 ]+r31,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f1 bc 87 23 01 00 00[	 ]+invvpid[	 ]+r31,OWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 61 7d 08 93 cd[	 ]+kmovb[	 ]+r25d,k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 91 ac 87 23 01 00 00[	 ]+kmovb[	 ]+BYTE PTR \[r31\+rax\*4\+0x123\],k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 92 e9[	 ]+kmovb[	 ]+k5,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 90 ac 87 23 01 00 00[	 ]+kmovb[	 ]+k5,BYTE PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 61 7f 08 93 cd[	 ]+kmovd[	 ]+r25d,k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 91 ac 87 23 01 00 00[	 ]+kmovd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7f 08 92 e9[	 ]+kmovd[	 ]+k5,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 90 ac 87 23 01 00 00[	 ]+kmovd[	 ]+k5,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 61 ff 08 93 fd[	 ]+kmovq[	 ]+r31,k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 91 ac 87 23 01 00 00[	 ]+kmovq[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 ff 08 92 ef[	 ]+kmovq[	 ]+k5,r31
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 90 ac 87 23 01 00 00[	 ]+kmovq[	 ]+k5,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 61 7c 08 93 cd[	 ]+kmovw[	 ]+r25d,k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 91 ac 87 23 01 00 00[	 ]+kmovw[	 ]+WORD PTR \[r31\+rax\*4\+0x123\],k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 92 e9[	 ]+kmovw[	 ]+k5,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 90 ac 87 23 01 00 00[	 ]+kmovw[	 ]+k5,WORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 7c 08 49 84 87 23 01 00 00[	 ]+ldtilecfg[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 fc 7d 08 60 c2[	 ]+movbe[	 ]+ax,r18w
+[	 ]*[a-f0-9]+:[	 ]*62 ec 7d 08 61 94 80 23 01 00 00[	 ]+movbe[	 ]+WORD PTR \[r16\+rax\*4\+0x123\],r18w
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 61 94 87 23 01 00 00[	 ]+movbe[	 ]+WORD PTR \[r31\+rax\*4\+0x123\],r18w
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7c 08 60 d1[	 ]+movbe[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 6c 7c 08 61 8c 80 23 01 00 00[	 ]+movbe[	 ]+DWORD PTR \[r16\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5c fc 08 60 ff[	 ]+movbe[	 ]+r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 61 bc 80 23 01 00 00[	 ]+movbe[	 ]+QWORD PTR \[r16\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 61 bc 87 23 01 00 00[	 ]+movbe[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 60 bc 80 23 01 00 00[	 ]+movbe[	 ]+r31,QWORD PTR \[r16\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 60 94 87 23 01 00 00[	 ]+movbe[	 ]+r18w,WORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 60 8c 87 23 01 00 00[	 ]+movbe[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7d 08 f8 8c 87 23 01 00 00[	 ]+movdir64b[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 f8 bc 87 23 01 00 00[	 ]+movdir64b[	 ]+r31,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 f9 8c 87 23 01 00 00[	 ]+movdiri[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 f9 bc 87 23 01 00 00[	 ]+movdiri[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 6f 08 f5 d1[	 ]+pdep[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 08 f5 df[	 ]+pdep[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f5 94 87 23 01 00 00[	 ]+pdep[	 ]+edx,r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f5 bc 87 23 01 00 00[	 ]+pdep[	 ]+r15,r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 5a 6e 08 f5 d1[	 ]+pext[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 86 08 f5 df[	 ]+pext[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 36 00 f5 94 87 23 01 00 00[	 ]+pext[	 ]+edx,r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 5a 86 00 f5 bc 87 23 01 00 00[	 ]+pext[	 ]+r15,r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d9 f7[	 ]+sha1msg1 xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d9 b4 87 23 01 00 00[	 ]+sha1msg1 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 da f7[	 ]+sha1msg2 xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 da b4 87 23 01 00 00[	 ]+sha1msg2 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d8 f7[	 ]+sha1nexte xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d8 b4 87 23 01 00 00[	 ]+sha1nexte xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d4 f7 7b[	 ]+sha1rnds4 xmm22,xmm23,0x7b
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d4 b4 87 23 01 00 00 7b[	 ]+sha1rnds4 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\],0x7b
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dc f7[	 ]+sha256msg1 xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dc b4 87 23 01 00 00[	 ]+sha256msg1 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dd f7[	 ]+sha256msg2 xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dd b4 87 23 01 00 00[	 ]+sha256msg2 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 5c 7c 08 db a4 87 23 01 00 00[	 ]+sha256rnds2 xmm12,XMMWORD PTR \[r31\+rax\*4\+0x123\],xmm0
+[	 ]*[a-f0-9]+:[	 ]*62 72 35 00 f7 d2[	 ]+shlx[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 f7 94 87 23 01 00 00[	 ]+shlx[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 52 85 00 f7 df[	 ]+shlx[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 f7 bc 87 23 01 00 00[	 ]+shlx[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 72 37 00 f7 d2[	 ]+shrx[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f7 94 87 23 01 00 00[	 ]+shrx[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 52 87 00 f7 df[	 ]+shrx[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f7 bc 87 23 01 00 00[	 ]+shrx[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 49 84 87 23 01 00 00[	 ]+sttilecfg[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 7f 08 4b b4 87 23 01 00 00[	 ]+tileloadd tmm6,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 4b b4 87 23 01 00 00[	 ]+tileloaddt1 tmm6,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 7e 08 4b b4 87 23 01 00 00[	 ]+tilestored[	 ]+\[r31\+rax\*4\+0x123\],tmm6
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 66 8c 87 23 01 00 00[	 ]+wrssd[	 ]+\[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 66 bc 87 23 01 00 00[	 ]+wrssq[	 ]+\[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 65 8c 87 23 01 00 00[	 ]+wrussd[	 ]+\[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 65 bc 87 23 01 00 00[	 ]+wrussq[	 ]+\[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 fc 8c 87 23 01 00 00[	 ]+aadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 fc bc 87 23 01 00 00[	 ]+aadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 fc 8c 87 23 01 00 00[	 ]+aand[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 fc bc 87 23 01 00 00[	 ]+aand[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dd b4 87 23 01 00 00[	 ]+aesdec128kl xmm22,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 df b4 87 23 01 00 00[	 ]+aesdec256kl xmm22,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 8c 87 23 01 00 00[	 ]+aesdecwide128kl[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 9c 87 23 01 00 00[	 ]+aesdecwide256kl[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dc b4 87 23 01 00 00[	 ]+aesenc128kl xmm22,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 de b4 87 23 01 00 00[	 ]+aesenc256kl xmm22,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 84 87 23 01 00 00[	 ]+aesencwide128kl[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 94 87 23 01 00 00[	 ]+aesencwide256kl[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 fc 8c 87 23 01 00 00[	 ]+aor[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c ff 08 fc bc 87 23 01 00 00[	 ]+aor[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 fc 8c 87 23 01 00 00[	 ]+axor[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 fc bc 87 23 01 00 00[	 ]+axor[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f7 d2[	 ]+bextr[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f7 94 87 23 01 00 00[	 ]+bextr[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f7 df[	 ]+bextr[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f7 bc 87 23 01 00 00[	 ]+bextr[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d9[	 ]+blsi[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 df[	 ]+blsi[	 ]+r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d1[	 ]+blsmsk[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 d7[	 ]+blsmsk[	 ]+r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 c9[	 ]+blsr[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 cf[	 ]+blsr[	 ]+r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f5 d2[	 ]+bzhi[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f5 94 87 23 01 00 00[	 ]+bzhi[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f5 df[	 ]+bzhi[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f5 bc 87 23 01 00 00[	 ]+bzhi[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e6 94 87 23 01 00 00[	 ]+cmpbexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e6 bc 87 23 01 00 00[	 ]+cmpbexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e2 94 87 23 01 00 00[	 ]+cmpbxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e2 bc 87 23 01 00 00[	 ]+cmpbxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ec 94 87 23 01 00 00[	 ]+cmplxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ec bc 87 23 01 00 00[	 ]+cmplxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e7 94 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e7 bc 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e3 94 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e3 bc 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ef 94 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ef bc 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ed 94 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ed bc 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e1 94 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e1 bc 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 eb 94 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 eb bc 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e9 94 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e9 bc 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e5 94 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e5 bc 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e0 94 87 23 01 00 00[	 ]+cmpoxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e0 bc 87 23 01 00 00[	 ]+cmpoxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ea 94 87 23 01 00 00[	 ]+cmppxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ea bc 87 23 01 00 00[	 ]+cmppxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e8 94 87 23 01 00 00[	 ]+cmpsxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e8 bc 87 23 01 00 00[	 ]+cmpsxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e4 94 87 23 01 00 00[	 ]+cmpzxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e4 bc 87 23 01 00 00[	 ]+cmpzxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 f7[	 ]+crc32[	 ]+r22,r31
+[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 37[	 ]+crc32[	 ]+r22,QWORD PTR \[r31\]
+[	 ]*[a-f0-9]+:[	 ]*62 ec fc 08 f0 cb[	 ]+crc32[	 ]+r17,r19b
+[	 ]*[a-f0-9]+:[	 ]*62 ec 7c 08 f0 eb[	 ]+crc32[	 ]+r21d,r19b
+[	 ]*[a-f0-9]+:[	 ]*62 fc 7c 08 f0 1b[	 ]+crc32[	 ]+ebx,BYTE PTR \[r19\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 ff[	 ]+crc32[	 ]+r23d,r31d
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 3f[	 ]+crc32[	 ]+r23d,DWORD PTR \[r31\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 ef[	 ]+crc32[	 ]+r21d,r31w
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 2f[	 ]+crc32[	 ]+r21d,WORD PTR \[r31\]
+[	 ]*[a-f0-9]+:[	 ]*62 e4 fc 08 f1 d0[	 ]+crc32[	 ]+r18,rax
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 da d1[	 ]+encodekey128[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 db d1[	 ]+encodekey256[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7f 08 f8 8c 87 23 01 00 00[	 ]+enqcmd[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 f8 bc 87 23 01 00 00[	 ]+enqcmd[	 ]+r31,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7e 08 f8 8c 87 23 01 00 00[	 ]+enqcmds[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 f8 bc 87 23 01 00 00[	 ]+enqcmds[	 ]+r31,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f0 bc 87 23 01 00 00[	 ]+invept[	 ]+r31,OWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f2 bc 87 23 01 00 00[	 ]+invpcid[	 ]+r31,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f1 bc 87 23 01 00 00[	 ]+invvpid[	 ]+r31,OWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 61 7d 08 93 cd[	 ]+kmovb[	 ]+r25d,k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 91 ac 87 23 01 00 00[	 ]+kmovb[	 ]+BYTE PTR \[r31\+rax\*4\+0x123\],k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 92 e9[	 ]+kmovb[	 ]+k5,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 90 ac 87 23 01 00 00[	 ]+kmovb[	 ]+k5,BYTE PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 61 7f 08 93 cd[	 ]+kmovd[	 ]+r25d,k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 91 ac 87 23 01 00 00[	 ]+kmovd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7f 08 92 e9[	 ]+kmovd[	 ]+k5,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 90 ac 87 23 01 00 00[	 ]+kmovd[	 ]+k5,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 61 ff 08 93 fd[	 ]+kmovq[	 ]+r31,k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 91 ac 87 23 01 00 00[	 ]+kmovq[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 ff 08 92 ef[	 ]+kmovq[	 ]+k5,r31
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 90 ac 87 23 01 00 00[	 ]+kmovq[	 ]+k5,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 61 7c 08 93 cd[	 ]+kmovw[	 ]+r25d,k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 91 ac 87 23 01 00 00[	 ]+kmovw[	 ]+WORD PTR \[r31\+rax\*4\+0x123\],k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 92 e9[	 ]+kmovw[	 ]+k5,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 90 ac 87 23 01 00 00[	 ]+kmovw[	 ]+k5,WORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 7c 08 49 84 87 23 01 00 00[	 ]+ldtilecfg[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 fc 7d 08 60 c2[	 ]+movbe[	 ]+ax,r18w
+[	 ]*[a-f0-9]+:[	 ]*62 ec 7d 08 61 94 80 23 01 00 00[	 ]+movbe[	 ]+WORD PTR \[r16\+rax\*4\+0x123\],r18w
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 61 94 87 23 01 00 00[	 ]+movbe[	 ]+WORD PTR \[r31\+rax\*4\+0x123\],r18w
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7c 08 60 d1[	 ]+movbe[	 ]+edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 6c 7c 08 61 8c 80 23 01 00 00[	 ]+movbe[	 ]+DWORD PTR \[r16\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5c fc 08 60 ff[	 ]+movbe[	 ]+r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 61 bc 80 23 01 00 00[	 ]+movbe[	 ]+QWORD PTR \[r16\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 61 bc 87 23 01 00 00[	 ]+movbe[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 60 bc 80 23 01 00 00[	 ]+movbe[	 ]+r31,QWORD PTR \[r16\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 60 94 87 23 01 00 00[	 ]+movbe[	 ]+r18w,WORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 60 8c 87 23 01 00 00[	 ]+movbe[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7d 08 f8 8c 87 23 01 00 00[	 ]+movdir64b[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 f8 bc 87 23 01 00 00[	 ]+movdir64b[	 ]+r31,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 f9 8c 87 23 01 00 00[	 ]+movdiri[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 f9 bc 87 23 01 00 00[	 ]+movdiri[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 6f 08 f5 d1[	 ]+pdep[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 08 f5 df[	 ]+pdep[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f5 94 87 23 01 00 00[	 ]+pdep[	 ]+edx,r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f5 bc 87 23 01 00 00[	 ]+pdep[	 ]+r15,r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 5a 6e 08 f5 d1[	 ]+pext[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 86 08 f5 df[	 ]+pext[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 36 00 f5 94 87 23 01 00 00[	 ]+pext[	 ]+edx,r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 5a 86 00 f5 bc 87 23 01 00 00[	 ]+pext[	 ]+r15,r31,QWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d9 f7[	 ]+sha1msg1 xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d9 b4 87 23 01 00 00[	 ]+sha1msg1 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 da f7[	 ]+sha1msg2 xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 da b4 87 23 01 00 00[	 ]+sha1msg2 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d8 f7[	 ]+sha1nexte xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d8 b4 87 23 01 00 00[	 ]+sha1nexte xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d4 f7 7b[	 ]+sha1rnds4 xmm22,xmm23,0x7b
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d4 b4 87 23 01 00 00 7b[	 ]+sha1rnds4 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\],0x7b
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dc f7[	 ]+sha256msg1 xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dc b4 87 23 01 00 00[	 ]+sha256msg1 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dd f7[	 ]+sha256msg2 xmm22,xmm23
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dd b4 87 23 01 00 00[	 ]+sha256msg2 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 5c 7c 08 db a4 87 23 01 00 00[	 ]+sha256rnds2 xmm12,XMMWORD PTR \[r31\+rax\*4\+0x123\],xmm0
+[	 ]*[a-f0-9]+:[	 ]*62 72 35 00 f7 d2[	 ]+shlx[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 f7 94 87 23 01 00 00[	 ]+shlx[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 52 85 00 f7 df[	 ]+shlx[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 f7 bc 87 23 01 00 00[	 ]+shlx[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 72 37 00 f7 d2[	 ]+shrx[	 ]+r10d,edx,r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f7 94 87 23 01 00 00[	 ]+shrx[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 52 87 00 f7 df[	 ]+shrx[	 ]+r11,r15,r31
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f7 bc 87 23 01 00 00[	 ]+shrx[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 49 84 87 23 01 00 00[	 ]+sttilecfg[	 ]+\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 7f 08 4b b4 87 23 01 00 00[	 ]+tileloadd tmm6,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 4b b4 87 23 01 00 00[	 ]+tileloaddt1 tmm6,\[r31\+rax\*4\+0x123\]
+[	 ]*[a-f0-9]+:[	 ]*62 da 7e 08 4b b4 87 23 01 00 00[	 ]+tilestored[	 ]+\[r31\+rax\*4\+0x123\],tmm6
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 66 8c 87 23 01 00 00[	 ]+wrssd[	 ]+\[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 66 bc 87 23 01 00 00[	 ]+wrssq[	 ]+\[r31\+rax\*4\+0x123\],r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 65 8c 87 23 01 00 00[	 ]+wrussd[	 ]+\[r31\+rax\*4\+0x123\],r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 65 bc 87 23 01 00 00[	 ]+wrussq[	 ]+\[r31\+rax\*4\+0x123\],r31
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
new file mode 100644
index 00000000000..3a7dffc013b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
@@ -0,0 +1,318 @@
+#as:
+#objdump: -dw
+#name: x86_64 APX_F EVEX-Promoted insns
+#source: x86-64-apx-evex-promoted.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 fc 8c 87 23 01 00 00[	 ]+aadd[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 fc bc 87 23 01 00 00[	 ]+aadd[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 fc 8c 87 23 01 00 00[	 ]+aand[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 fc bc 87 23 01 00 00[	 ]+aand[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dd b4 87 23 01 00 00[	 ]+aesdec128kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 df b4 87 23 01 00 00[	 ]+aesdec256kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 8c 87 23 01 00 00[	 ]+aesdecwide128kl[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 9c 87 23 01 00 00[	 ]+aesdecwide256kl[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dc b4 87 23 01 00 00[	 ]+aesenc128kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 de b4 87 23 01 00 00[	 ]+aesenc256kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 84 87 23 01 00 00[	 ]+aesencwide128kl[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 94 87 23 01 00 00[	 ]+aesencwide256kl[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 fc 8c 87 23 01 00 00[	 ]+aor[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c ff 08 fc bc 87 23 01 00 00[	 ]+aor[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 fc 8c 87 23 01 00 00[	 ]+axor[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 fc bc 87 23 01 00 00[	 ]+axor[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f7 d2[	 ]+bextr[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f7 94 87 23 01 00 00[	 ]+bextr[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
+[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f7 df[	 ]+bextr[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f7 bc 87 23 01 00 00[	 ]+bextr[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d9[	 ]+blsi[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 df[	 ]+blsi[	 ]+%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+0x123\(%r31,%rax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d1[	 ]+blsmsk[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 d7[	 ]+blsmsk[	 ]+%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+0x123\(%r31,%rax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 c9[	 ]+blsr[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 cf[	 ]+blsr[	 ]+%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+0x123\(%r31,%rax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f5 d2[	 ]+bzhi[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f5 94 87 23 01 00 00[	 ]+bzhi[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
+[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f5 df[	 ]+bzhi[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f5 bc 87 23 01 00 00[	 ]+bzhi[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e6 94 87 23 01 00 00[	 ]+cmpbexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e6 bc 87 23 01 00 00[	 ]+cmpbexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e2 94 87 23 01 00 00[	 ]+cmpbxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e2 bc 87 23 01 00 00[	 ]+cmpbxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ec 94 87 23 01 00 00[	 ]+cmplxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ec bc 87 23 01 00 00[	 ]+cmplxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e7 94 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e7 bc 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e3 94 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e3 bc 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ef 94 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ef bc 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ed 94 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ed bc 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e1 94 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e1 bc 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 eb 94 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 eb bc 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e9 94 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e9 bc 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e5 94 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e5 bc 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e0 94 87 23 01 00 00[	 ]+cmpoxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e0 bc 87 23 01 00 00[	 ]+cmpoxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ea 94 87 23 01 00 00[	 ]+cmppxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ea bc 87 23 01 00 00[	 ]+cmppxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e8 94 87 23 01 00 00[	 ]+cmpsxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e8 bc 87 23 01 00 00[	 ]+cmpsxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e4 94 87 23 01 00 00[	 ]+cmpzxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e4 bc 87 23 01 00 00[	 ]+cmpzxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 f7[	 ]+crc32  %r31,%r22
+[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 37[	 ]+crc32q \(%r31\),%r22
+[	 ]*[a-f0-9]+:[	 ]*62 ec fc 08 f0 cb[	 ]+crc32  %r19b,%r17
+[	 ]*[a-f0-9]+:[	 ]*62 ec 7c 08 f0 eb[	 ]+crc32  %r19b,%r21d
+[	 ]*[a-f0-9]+:[	 ]*62 fc 7c 08 f0 1b[	 ]+crc32b \(%r19\),%ebx
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 ff[	 ]+crc32  %r31d,%r23d
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 3f[	 ]+crc32l \(%r31\),%r23d
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 ef[	 ]+crc32  %r31w,%r21d
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 2f[	 ]+crc32w \(%r31\),%r21d
+[	 ]*[a-f0-9]+:[	 ]*62 e4 fc 08 f1 d0[	 ]+crc32  %rax,%r18
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 da d1[	 ]+encodekey128[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 db d1[	 ]+encodekey256[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7f 08 f8 8c 87 23 01 00 00[	 ]+enqcmd[	 ]+0x123\(%r31d,%eax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 f8 bc 87 23 01 00 00[	 ]+enqcmd[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7e 08 f8 8c 87 23 01 00 00[	 ]+enqcmds[	 ]+0x123\(%r31d,%eax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 f8 bc 87 23 01 00 00[	 ]+enqcmds[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f0 bc 87 23 01 00 00[	 ]+invept[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f2 bc 87 23 01 00 00[	 ]+invpcid[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f1 bc 87 23 01 00 00[	 ]+invvpid[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 61 7d 08 93 cd[	 ]+kmovb[	 ]+%k5,%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 91 ac 87 23 01 00 00[	 ]+kmovb[	 ]+%k5,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 92 e9[	 ]+kmovb[	 ]+%r25d,%k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 90 ac 87 23 01 00 00[	 ]+kmovb[	 ]+0x123\(%r31,%rax,4\),%k5
+[	 ]*[a-f0-9]+:[	 ]*62 61 7f 08 93 cd[	 ]+kmovd[	 ]+%k5,%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 91 ac 87 23 01 00 00[	 ]+kmovd[	 ]+%k5,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7f 08 92 e9[	 ]+kmovd[	 ]+%r25d,%k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 90 ac 87 23 01 00 00[	 ]+kmovd[	 ]+0x123\(%r31,%rax,4\),%k5
+[	 ]*[a-f0-9]+:[	 ]*62 61 ff 08 93 fd[	 ]+kmovq[	 ]+%k5,%r31
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 91 ac 87 23 01 00 00[	 ]+kmovq[	 ]+%k5,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 d9 ff 08 92 ef[	 ]+kmovq[	 ]+%r31,%k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 90 ac 87 23 01 00 00[	 ]+kmovq[	 ]+0x123\(%r31,%rax,4\),%k5
+[	 ]*[a-f0-9]+:[	 ]*62 61 7c 08 93 cd[	 ]+kmovw[	 ]+%k5,%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 91 ac 87 23 01 00 00[	 ]+kmovw[	 ]+%k5,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 92 e9[	 ]+kmovw[	 ]+%r25d,%k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 90 ac 87 23 01 00 00[	 ]+kmovw[	 ]+0x123\(%r31,%rax,4\),%k5
+[	 ]*[a-f0-9]+:[	 ]*62 da 7c 08 49 84 87 23 01 00 00[	 ]+ldtilecfg[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 fc 7d 08 60 c2[	 ]+movbe[	 ]+%r18w,%ax
+[	 ]*[a-f0-9]+:[	 ]*62 ec 7d 08 61 94 80 23 01 00 00[	 ]+movbe[	 ]+%r18w,0x123\(%r16,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 61 94 87 23 01 00 00[	 ]+movbe[	 ]+%r18w,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7c 08 60 d1[	 ]+movbe[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 6c 7c 08 61 8c 80 23 01 00 00[	 ]+movbe[	 ]+%r25d,0x123\(%r16,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5c fc 08 60 ff[	 ]+movbe[	 ]+%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 61 bc 80 23 01 00 00[	 ]+movbe[	 ]+%r31,0x123\(%r16,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 61 bc 87 23 01 00 00[	 ]+movbe[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 60 bc 80 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r16,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 60 94 87 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r31,%rax,4\),%r18w
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 60 8c 87 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r31,%rax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7d 08 f8 8c 87 23 01 00 00[	 ]+movdir64b[	 ]+0x123\(%r31d,%eax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 f8 bc 87 23 01 00 00[	 ]+movdir64b[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 f9 8c 87 23 01 00 00[	 ]+movdiri[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 f9 bc 87 23 01 00 00[	 ]+movdiri[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 6f 08 f5 d1[	 ]+pdep[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 08 f5 df[	 ]+pdep[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f5 94 87 23 01 00 00[	 ]+pdep[	 ]+0x123\(%r31,%rax,4\),%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f5 bc 87 23 01 00 00[	 ]+pdep[	 ]+0x123\(%r31,%rax,4\),%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 5a 6e 08 f5 d1[	 ]+pext[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 86 08 f5 df[	 ]+pext[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 da 36 00 f5 94 87 23 01 00 00[	 ]+pext[	 ]+0x123\(%r31,%rax,4\),%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 5a 86 00 f5 bc 87 23 01 00 00[	 ]+pext[	 ]+0x123\(%r31,%rax,4\),%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d9 f7[	 ]+sha1msg1[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d9 b4 87 23 01 00 00[	 ]+sha1msg1[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 da f7[	 ]+sha1msg2[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 da b4 87 23 01 00 00[	 ]+sha1msg2[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d8 f7[	 ]+sha1nexte[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d8 b4 87 23 01 00 00[	 ]+sha1nexte[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d4 f7 7b[	 ]+sha1rnds4[	 ]+\$0x7b,%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d4 b4 87 23 01 00 00 7b[	 ]+sha1rnds4[	 ]+\$0x7b,0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dc f7[	 ]+sha256msg1[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dc b4 87 23 01 00 00[	 ]+sha256msg1[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dd f7[	 ]+sha256msg2[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dd b4 87 23 01 00 00[	 ]+sha256msg2[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 5c 7c 08 db a4 87 23 01 00 00[	 ]+sha256rnds2[	 ]+%xmm0,0x123\(%r31,%rax,4\),%xmm12
+[	 ]*[a-f0-9]+:[	 ]*62 72 35 00 f7 d2[	 ]+shlx[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 f7 94 87 23 01 00 00[	 ]+shlx[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
+[	 ]*[a-f0-9]+:[	 ]*62 52 85 00 f7 df[	 ]+shlx[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 f7 bc 87 23 01 00 00[	 ]+shlx[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
+[	 ]*[a-f0-9]+:[	 ]*62 72 37 00 f7 d2[	 ]+shrx[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f7 94 87 23 01 00 00[	 ]+shrx[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
+[	 ]*[a-f0-9]+:[	 ]*62 52 87 00 f7 df[	 ]+shrx[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f7 bc 87 23 01 00 00[	 ]+shrx[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 49 84 87 23 01 00 00[	 ]+sttilecfg[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 7f 08 4b b4 87 23 01 00 00[	 ]+tileloadd[	 ]+0x123\(%r31,%rax,4\),%tmm6
+[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 4b b4 87 23 01 00 00[	 ]+tileloaddt1[	 ]+0x123\(%r31,%rax,4\),%tmm6
+[	 ]*[a-f0-9]+:[	 ]*62 da 7e 08 4b b4 87 23 01 00 00[	 ]+tilestored[	 ]+%tmm6,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 66 8c 87 23 01 00 00[	 ]+wrssd[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 66 bc 87 23 01 00 00[	 ]+wrssq[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 65 8c 87 23 01 00 00[	 ]+wrussd[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 65 bc 87 23 01 00 00[	 ]+wrussq[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 fc 8c 87 23 01 00 00[	 ]+aadd[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 fc bc 87 23 01 00 00[	 ]+aadd[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 fc 8c 87 23 01 00 00[	 ]+aand[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 fc bc 87 23 01 00 00[	 ]+aand[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dd b4 87 23 01 00 00[	 ]+aesdec128kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 df b4 87 23 01 00 00[	 ]+aesdec256kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 8c 87 23 01 00 00[	 ]+aesdecwide128kl[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 9c 87 23 01 00 00[	 ]+aesdecwide256kl[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dc b4 87 23 01 00 00[	 ]+aesenc128kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 de b4 87 23 01 00 00[	 ]+aesenc256kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 84 87 23 01 00 00[	 ]+aesencwide128kl[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 94 87 23 01 00 00[	 ]+aesencwide256kl[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 fc 8c 87 23 01 00 00[	 ]+aor[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c ff 08 fc bc 87 23 01 00 00[	 ]+aor[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 fc 8c 87 23 01 00 00[	 ]+axor[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 fc bc 87 23 01 00 00[	 ]+axor[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f7 d2[	 ]+bextr[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f7 94 87 23 01 00 00[	 ]+bextr[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
+[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f7 df[	 ]+bextr[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f7 bc 87 23 01 00 00[	 ]+bextr[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d9[	 ]+blsi[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 df[	 ]+blsi[	 ]+%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+0x123\(%r31,%rax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d1[	 ]+blsmsk[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 d7[	 ]+blsmsk[	 ]+%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+0x123\(%r31,%rax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 c9[	 ]+blsr[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 cf[	 ]+blsr[	 ]+%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+0x123\(%r31,%rax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f5 d2[	 ]+bzhi[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f5 94 87 23 01 00 00[	 ]+bzhi[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
+[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f5 df[	 ]+bzhi[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f5 bc 87 23 01 00 00[	 ]+bzhi[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e6 94 87 23 01 00 00[	 ]+cmpbexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e6 bc 87 23 01 00 00[	 ]+cmpbexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e2 94 87 23 01 00 00[	 ]+cmpbxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e2 bc 87 23 01 00 00[	 ]+cmpbxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ec 94 87 23 01 00 00[	 ]+cmplxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ec bc 87 23 01 00 00[	 ]+cmplxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e7 94 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e7 bc 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e3 94 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e3 bc 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ef 94 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ef bc 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ed 94 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ed bc 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e1 94 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e1 bc 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 eb 94 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 eb bc 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e9 94 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e9 bc 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e5 94 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e5 bc 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e0 94 87 23 01 00 00[	 ]+cmpoxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e0 bc 87 23 01 00 00[	 ]+cmpoxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ea 94 87 23 01 00 00[	 ]+cmppxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ea bc 87 23 01 00 00[	 ]+cmppxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e8 94 87 23 01 00 00[	 ]+cmpsxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e8 bc 87 23 01 00 00[	 ]+cmpsxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e4 94 87 23 01 00 00[	 ]+cmpzxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e4 bc 87 23 01 00 00[	 ]+cmpzxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 f7[	 ]+crc32  %r31,%r22
+[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 37[	 ]+crc32q \(%r31\),%r22
+[	 ]*[a-f0-9]+:[	 ]*62 ec fc 08 f0 cb[	 ]+crc32  %r19b,%r17
+[	 ]*[a-f0-9]+:[	 ]*62 ec 7c 08 f0 eb[	 ]+crc32  %r19b,%r21d
+[	 ]*[a-f0-9]+:[	 ]*62 fc 7c 08 f0 1b[	 ]+crc32b \(%r19\),%ebx
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 ff[	 ]+crc32  %r31d,%r23d
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 3f[	 ]+crc32l \(%r31\),%r23d
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 ef[	 ]+crc32  %r31w,%r21d
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 2f[	 ]+crc32w \(%r31\),%r21d
+[	 ]*[a-f0-9]+:[	 ]*62 e4 fc 08 f1 d0[	 ]+crc32  %rax,%r18
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 da d1[	 ]+encodekey128[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 db d1[	 ]+encodekey256[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7f 08 f8 8c 87 23 01 00 00[	 ]+enqcmd[	 ]+0x123\(%r31d,%eax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 f8 bc 87 23 01 00 00[	 ]+enqcmd[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7e 08 f8 8c 87 23 01 00 00[	 ]+enqcmds[	 ]+0x123\(%r31d,%eax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 f8 bc 87 23 01 00 00[	 ]+enqcmds[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f0 bc 87 23 01 00 00[	 ]+invept[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f2 bc 87 23 01 00 00[	 ]+invpcid[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f1 bc 87 23 01 00 00[	 ]+invvpid[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 61 7d 08 93 cd[	 ]+kmovb[	 ]+%k5,%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 91 ac 87 23 01 00 00[	 ]+kmovb[	 ]+%k5,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 92 e9[	 ]+kmovb[	 ]+%r25d,%k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 90 ac 87 23 01 00 00[	 ]+kmovb[	 ]+0x123\(%r31,%rax,4\),%k5
+[	 ]*[a-f0-9]+:[	 ]*62 61 7f 08 93 cd[	 ]+kmovd[	 ]+%k5,%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 91 ac 87 23 01 00 00[	 ]+kmovd[	 ]+%k5,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7f 08 92 e9[	 ]+kmovd[	 ]+%r25d,%k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 90 ac 87 23 01 00 00[	 ]+kmovd[	 ]+0x123\(%r31,%rax,4\),%k5
+[	 ]*[a-f0-9]+:[	 ]*62 61 ff 08 93 fd[	 ]+kmovq[	 ]+%k5,%r31
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 91 ac 87 23 01 00 00[	 ]+kmovq[	 ]+%k5,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 d9 ff 08 92 ef[	 ]+kmovq[	 ]+%r31,%k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 90 ac 87 23 01 00 00[	 ]+kmovq[	 ]+0x123\(%r31,%rax,4\),%k5
+[	 ]*[a-f0-9]+:[	 ]*62 61 7c 08 93 cd[	 ]+kmovw[	 ]+%k5,%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 91 ac 87 23 01 00 00[	 ]+kmovw[	 ]+%k5,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 92 e9[	 ]+kmovw[	 ]+%r25d,%k5
+[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 90 ac 87 23 01 00 00[	 ]+kmovw[	 ]+0x123\(%r31,%rax,4\),%k5
+[	 ]*[a-f0-9]+:[	 ]*62 da 7c 08 49 84 87 23 01 00 00[	 ]+ldtilecfg[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 fc 7d 08 60 c2[	 ]+movbe[	 ]+%r18w,%ax
+[	 ]*[a-f0-9]+:[	 ]*62 ec 7d 08 61 94 80 23 01 00 00[	 ]+movbe[	 ]+%r18w,0x123\(%r16,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 61 94 87 23 01 00 00[	 ]+movbe[	 ]+%r18w,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 dc 7c 08 60 d1[	 ]+movbe[	 ]+%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 6c 7c 08 61 8c 80 23 01 00 00[	 ]+movbe[	 ]+%r25d,0x123\(%r16,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5c fc 08 60 ff[	 ]+movbe[	 ]+%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 61 bc 80 23 01 00 00[	 ]+movbe[	 ]+%r31,0x123\(%r16,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 61 bc 87 23 01 00 00[	 ]+movbe[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 60 bc 80 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r16,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 60 94 87 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r31,%rax,4\),%r18w
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 60 8c 87 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r31,%rax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7d 08 f8 8c 87 23 01 00 00[	 ]+movdir64b[	 ]+0x123\(%r31d,%eax,4\),%r25d
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 f8 bc 87 23 01 00 00[	 ]+movdir64b[	 ]+0x123\(%r31,%rax,4\),%r31
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 f9 8c 87 23 01 00 00[	 ]+movdiri[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 f9 bc 87 23 01 00 00[	 ]+movdiri[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 5a 6f 08 f5 d1[	 ]+pdep[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 08 f5 df[	 ]+pdep[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f5 94 87 23 01 00 00[	 ]+pdep[	 ]+0x123\(%r31,%rax,4\),%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f5 bc 87 23 01 00 00[	 ]+pdep[	 ]+0x123\(%r31,%rax,4\),%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 5a 6e 08 f5 d1[	 ]+pext[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 5a 86 08 f5 df[	 ]+pext[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 da 36 00 f5 94 87 23 01 00 00[	 ]+pext[	 ]+0x123\(%r31,%rax,4\),%r25d,%edx
+[	 ]*[a-f0-9]+:[	 ]*62 5a 86 00 f5 bc 87 23 01 00 00[	 ]+pext[	 ]+0x123\(%r31,%rax,4\),%r31,%r15
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d9 f7[	 ]+sha1msg1[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d9 b4 87 23 01 00 00[	 ]+sha1msg1[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 da f7[	 ]+sha1msg2[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 da b4 87 23 01 00 00[	 ]+sha1msg2[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d8 f7[	 ]+sha1nexte[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d8 b4 87 23 01 00 00[	 ]+sha1nexte[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d4 f7 7b[	 ]+sha1rnds4[	 ]+\$0x7b,%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d4 b4 87 23 01 00 00 7b[	 ]+sha1rnds4[	 ]+\$0x7b,0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dc f7[	 ]+sha256msg1[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dc b4 87 23 01 00 00[	 ]+sha256msg1[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dd f7[	 ]+sha256msg2[	 ]+%xmm23,%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dd b4 87 23 01 00 00[	 ]+sha256msg2[	 ]+0x123\(%r31,%rax,4\),%xmm22
+[	 ]*[a-f0-9]+:[	 ]*62 5c 7c 08 db a4 87 23 01 00 00[	 ]+sha256rnds2[	 ]+%xmm0,0x123\(%r31,%rax,4\),%xmm12
+[	 ]*[a-f0-9]+:[	 ]*62 72 35 00 f7 d2[	 ]+shlx[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 f7 94 87 23 01 00 00[	 ]+shlx[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
+[	 ]*[a-f0-9]+:[	 ]*62 52 85 00 f7 df[	 ]+shlx[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 f7 bc 87 23 01 00 00[	 ]+shlx[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
+[	 ]*[a-f0-9]+:[	 ]*62 72 37 00 f7 d2[	 ]+shrx[	 ]+%r25d,%edx,%r10d
+[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f7 94 87 23 01 00 00[	 ]+shrx[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
+[	 ]*[a-f0-9]+:[	 ]*62 52 87 00 f7 df[	 ]+shrx[	 ]+%r31,%r15,%r11
+[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f7 bc 87 23 01 00 00[	 ]+shrx[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
+[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 49 84 87 23 01 00 00[	 ]+sttilecfg[	 ]+0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 da 7f 08 4b b4 87 23 01 00 00[	 ]+tileloadd[	 ]+0x123\(%r31,%rax,4\),%tmm6
+[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 4b b4 87 23 01 00 00[	 ]+tileloaddt1[	 ]+0x123\(%r31,%rax,4\),%tmm6
+[	 ]*[a-f0-9]+:[	 ]*62 da 7e 08 4b b4 87 23 01 00 00[	 ]+tilestored[	 ]+%tmm6,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 66 8c 87 23 01 00 00[	 ]+wrssd[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 66 bc 87 23 01 00 00[	 ]+wrssq[	 ]+%r31,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 65 8c 87 23 01 00 00[	 ]+wrussd[	 ]+%r25d,0x123\(%r31,%rax,4\)
+[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 65 bc 87 23 01 00 00[	 ]+wrussq[	 ]+%r31,0x123\(%r31,%rax,4\)
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
new file mode 100644
index 00000000000..39752c27432
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
@@ -0,0 +1,314 @@
+# Check 64bit APX_F EVEX-Promoted instructions.
+
+	.text
+_start:
+	aadd	%r25d,0x123(%r31,%rax,4)
+	aadd	%r31,0x123(%r31,%rax,4)
+	aand	%r25d,0x123(%r31,%rax,4)
+	aand	%r31,0x123(%r31,%rax,4)
+	aesdec128kl	0x123(%r31,%rax,4),%xmm22
+	aesdec256kl	0x123(%r31,%rax,4),%xmm22
+	aesdecwide128kl	0x123(%r31,%rax,4)
+	aesdecwide256kl	0x123(%r31,%rax,4)
+	aesenc128kl	0x123(%r31,%rax,4),%xmm22
+	aesenc256kl	0x123(%r31,%rax,4),%xmm22
+	aesencwide128kl	0x123(%r31,%rax,4)
+	aesencwide256kl	0x123(%r31,%rax,4)
+	aor	%r25d,0x123(%r31,%rax,4)
+	aor	%r31,0x123(%r31,%rax,4)
+	axor	%r25d,0x123(%r31,%rax,4)
+	axor	%r31,0x123(%r31,%rax,4)
+	bextr	%r25d,%edx,%r10d
+	bextr	%r25d,0x123(%r31,%rax,4),%edx
+	bextr	%r31,%r15,%r11
+	bextr	%r31,0x123(%r31,%rax,4),%r15
+	blsi	%r25d,%edx
+	blsi	%r31,%r15
+	blsi	0x123(%r31,%rax,4),%r25d
+	blsi	0x123(%r31,%rax,4),%r31
+	blsmsk	%r25d,%edx
+	blsmsk	%r31,%r15
+	blsmsk	0x123(%r31,%rax,4),%r25d
+	blsmsk	0x123(%r31,%rax,4),%r31
+	blsr	%r25d,%edx
+	blsr	%r31,%r15
+	blsr	0x123(%r31,%rax,4),%r25d
+	blsr	0x123(%r31,%rax,4),%r31
+	bzhi	%r25d,%edx,%r10d
+	bzhi	%r25d,0x123(%r31,%rax,4),%edx
+	bzhi	%r31,%r15,%r11
+	bzhi	%r31,0x123(%r31,%rax,4),%r15
+	cmpbexadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpbexadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpbxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpbxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmplxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmplxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpnbexadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpnbexadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpnbxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpnbxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpnlexadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpnlexadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpnlxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpnlxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpnoxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpnoxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpnpxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpnpxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpnsxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpnsxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpnzxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpnzxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpoxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpoxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmppxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmppxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpsxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpsxadd	%r31,%r15,0x123(%r31,%rax,4)
+	cmpzxadd	%r25d,%edx,0x123(%r31,%rax,4)
+	cmpzxadd	%r31,%r15,0x123(%r31,%rax,4)
+	crc32q	%r31, %r22
+	crc32q	(%r31), %r22
+	crc32b	%r19b, %r17
+	crc32b	%r19b, %r21d
+	crc32b	(%r19),%ebx
+	crc32l	%r31d, %r23d
+	crc32l	(%r31), %r23d
+	crc32w	%r31w, %r21d
+	crc32w	(%r31),%r21d
+	crc32	%rax, %r18
+	encodekey128	%r25d,%edx
+	encodekey256	%r25d,%edx
+	enqcmd	0x123(%r31d,%eax,4),%r25d
+	enqcmd	0x123(%r31,%rax,4),%r31
+	enqcmds	0x123(%r31d,%eax,4),%r25d
+	enqcmds	0x123(%r31,%rax,4),%r31
+	invept	0x123(%r31,%rax,4),%r31
+	invpcid	0x123(%r31,%rax,4),%r31
+	invvpid	0x123(%r31,%rax,4),%r31
+	kmovb	%k5,%r25d
+	kmovb	%k5,0x123(%r31,%rax,4)
+	kmovb	%r25d,%k5
+	kmovb	0x123(%r31,%rax,4),%k5
+	kmovd	%k5,%r25d
+	kmovd	%k5,0x123(%r31,%rax,4)
+	kmovd	%r25d,%k5
+	kmovd	0x123(%r31,%rax,4),%k5
+	kmovq	%k5,%r31
+	kmovq	%k5,0x123(%r31,%rax,4)
+	kmovq	%r31,%k5
+	kmovq	0x123(%r31,%rax,4),%k5
+	kmovw	%k5,%r25d
+	kmovw	%k5,0x123(%r31,%rax,4)
+	kmovw	%r25d,%k5
+	kmovw	0x123(%r31,%rax,4),%k5
+	ldtilecfg	0x123(%r31,%rax,4)
+	movbe	%r18w,%ax
+	movbe	%r18w,0x123(%r16,%rax,4)
+	movbe	%r18w,0x123(%r31,%rax,4)
+	movbe	%r25d,%edx
+	movbe	%r25d,0x123(%r16,%rax,4)
+	movbe	%r31,%r15
+	movbe	%r31,0x123(%r16,%rax,4)
+	movbe	%r31,0x123(%r31,%rax,4)
+	movbe	0x123(%r16,%rax,4),%r31
+	movbe	0x123(%r31,%rax,4),%r18w
+	movbe	0x123(%r31,%rax,4),%r25d
+	movdir64b	0x123(%r31d,%eax,4),%r25d
+	movdir64b	0x123(%r31,%rax,4),%r31
+	movdiri	%r25d,0x123(%r31,%rax,4)
+	movdiri	%r31,0x123(%r31,%rax,4)
+	pdep	%r25d,%edx,%r10d
+	pdep	%r31,%r15,%r11
+	pdep	0x123(%r31,%rax,4),%r25d,%edx
+	pdep	0x123(%r31,%rax,4),%r31,%r15
+	pext	%r25d,%edx,%r10d
+	pext	%r31,%r15,%r11
+	pext	0x123(%r31,%rax,4),%r25d,%edx
+	pext	0x123(%r31,%rax,4),%r31,%r15
+	sha1msg1	%xmm23,%xmm22
+	sha1msg1	0x123(%r31,%rax,4),%xmm22
+	sha1msg2	%xmm23,%xmm22
+	sha1msg2	0x123(%r31,%rax,4),%xmm22
+	sha1nexte	%xmm23,%xmm22
+	sha1nexte	0x123(%r31,%rax,4),%xmm22
+	sha1rnds4	$0x7b,%xmm23,%xmm22
+	sha1rnds4	$0x7b,0x123(%r31,%rax,4),%xmm22
+	sha256msg1	%xmm23,%xmm22
+	sha256msg1	0x123(%r31,%rax,4),%xmm22
+	sha256msg2	%xmm23,%xmm22
+	sha256msg2	0x123(%r31,%rax,4),%xmm22
+	sha256rnds2	0x123(%r31,%rax,4),%xmm12
+	shlx	%r25d,%edx,%r10d
+	shlx	%r25d,0x123(%r31,%rax,4),%edx
+	shlx	%r31,%r15,%r11
+	shlx	%r31,0x123(%r31,%rax,4),%r15
+	shrx	%r25d,%edx,%r10d
+	shrx	%r25d,0x123(%r31,%rax,4),%edx
+	shrx	%r31,%r15,%r11
+	shrx	%r31,0x123(%r31,%rax,4),%r15
+	sttilecfg	0x123(%r31,%rax,4)
+	tileloadd	0x123(%r31,%rax,4),%tmm6
+	tileloaddt1	0x123(%r31,%rax,4),%tmm6
+	tilestored	%tmm6,0x123(%r31,%rax,4)
+	wrssd	%r25d,0x123(%r31,%rax,4)
+	wrssq	%r31,0x123(%r31,%rax,4)
+	wrussd	%r25d,0x123(%r31,%rax,4)
+	wrussq	%r31,0x123(%r31,%rax,4)
+
+	.intel_syntax noprefix
+	aadd	[r31+rax*4+0x123],r25d
+	aadd	[r31+rax*4+0x123],r31
+	aand	[r31+rax*4+0x123],r25d
+	aand	[r31+rax*4+0x123],r31
+	aesdec128kl	xmm22,[r31+rax*4+0x123]
+	aesdec256kl	xmm22,[r31+rax*4+0x123]
+	aesdecwide128kl	[r31+rax*4+0x123]
+	aesdecwide256kl	[r31+rax*4+0x123]
+	aesenc128kl	xmm22,[r31+rax*4+0x123]
+	aesenc256kl	xmm22,[r31+rax*4+0x123]
+	aesencwide128kl	[r31+rax*4+0x123]
+	aesencwide256kl	[r31+rax*4+0x123]
+	aor	[r31+rax*4+0x123],r25d
+	aor	[r31+rax*4+0x123],r31
+	axor	[r31+rax*4+0x123],r25d
+	axor	[r31+rax*4+0x123],r31
+	bextr	r10d,edx,r25d
+	bextr	edx, [r31+rax*4+0x123],r25d
+	bextr	r11,r15,r31
+	bextr	r15, [r31+rax*4+0x123],r31
+	blsi	edx,r25d
+	blsi	r15,r31
+	blsi	r25d, [r31+rax*4+0x123]
+	blsi	r31,  [r31+rax*4+0x123]
+	blsmsk	edx,r25d
+	blsmsk	r15,r31
+	blsmsk	r25d, [r31+rax*4+0x123]
+	blsmsk	r31,  [r31+rax*4+0x123]
+	blsr	edx,r25d
+	blsr	r15,r31
+	blsr	r25d, [r31+rax*4+0x123]
+	blsr	r31,  [r31+rax*4+0x123]
+	bzhi	r10d,edx,r25d
+	bzhi	edx, [r31+rax*4+0x123],r25d
+	bzhi	r11,r15,r31
+	bzhi	r15, [r31+rax*4+0x123],r31
+	cmpbexadd	 [r31+rax*4+0x123],edx,r25d
+	cmpbexadd	 [r31+rax*4+0x123],r15,r31
+	cmpbxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpbxadd	 [r31+rax*4+0x123],r15,r31
+	cmplxadd	 [r31+rax*4+0x123],edx,r25d
+	cmplxadd	 [r31+rax*4+0x123],r15,r31
+	cmpnbexadd	 [r31+rax*4+0x123],edx,r25d
+	cmpnbexadd	 [r31+rax*4+0x123],r15,r31
+	cmpnbxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpnbxadd	 [r31+rax*4+0x123],r15,r31
+	cmpnlexadd	 [r31+rax*4+0x123],edx,r25d
+	cmpnlexadd	 [r31+rax*4+0x123],r15,r31
+	cmpnlxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpnlxadd	 [r31+rax*4+0x123],r15,r31
+	cmpnoxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpnoxadd	 [r31+rax*4+0x123],r15,r31
+	cmpnpxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpnpxadd	 [r31+rax*4+0x123],r15,r31
+	cmpnsxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpnsxadd	 [r31+rax*4+0x123],r15,r31
+	cmpnzxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpnzxadd	 [r31+rax*4+0x123],r15,r31
+	cmpoxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpoxadd	 [r31+rax*4+0x123],r15,r31
+	cmppxadd	 [r31+rax*4+0x123],edx,r25d
+	cmppxadd	 [r31+rax*4+0x123],r15,r31
+	cmpsxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpsxadd	 [r31+rax*4+0x123],r15,r31
+	cmpzxadd	 [r31+rax*4+0x123],edx,r25d
+	cmpzxadd	 [r31+rax*4+0x123],r15,r31
+	crc32	r22,r31
+	crc32	r22,QWORD PTR [r31]
+	crc32	r17,r19b
+	crc32	r21d,r19b
+	crc32	ebx,BYTE PTR [r19]
+	crc32	r23d,r31d
+	crc32	r23d,DWORD PTR [r31]
+	crc32	r21d,r31w
+	crc32	r21d,WORD PTR [r31]
+	crc32	r18,rax
+	encodekey128	edx,r25d
+	encodekey256	edx,r25d
+	enqcmd	r25d,[r31d+eax*4+0x123]
+	enqcmd	r31,[r31+rax*4+0x123]
+	enqcmds	r25d,[r31d+eax*4+0x123]
+	enqcmds	r31,[r31+rax*4+0x123]
+	invept	r31,OWORD PTR [r31+rax*4+0x123]
+	invpcid	r31,[r31+rax*4+0x123]
+	invvpid	r31,OWORD PTR [r31+rax*4+0x123]
+	kmovb	r25d,k5
+	kmovb	BYTE PTR [r31+rax*4+0x123],k5
+	kmovb	k5,r25d
+	kmovb	k5,BYTE PTR [r31+rax*4+0x123]
+	kmovd	r25d,k5
+	kmovd	DWORD PTR [r31+rax*4+0x123],k5
+	kmovd	k5,r25d
+	kmovd	k5,DWORD PTR [r31+rax*4+0x123]
+	kmovq	r31,k5
+	kmovq	QWORD PTR [r31+rax*4+0x123],k5
+	kmovq	k5,r31
+	kmovq	k5,QWORD PTR [r31+rax*4+0x123]
+	kmovw	r25d,k5
+	kmovw	WORD PTR [r31+rax*4+0x123],k5
+	kmovw	k5,r25d
+	kmovw	k5,WORD PTR [r31+rax*4+0x123]
+	ldtilecfg	[r31+rax*4+0x123]
+	movbe	ax,r18w
+	movbe	WORD PTR [r16+rax*4+0x123],r18w
+	movbe	WORD PTR [r31+rax*4+0x123],r18w
+	movbe	edx,r25d
+	movbe	DWORD PTR [r16+rax*4+0x123],r25d
+	movbe	r15,r31
+	movbe	QWORD PTR [r16+rax*4+0x123],r31
+	movbe	QWORD PTR [r31+rax*4+0x123],r31
+	movbe	r31,QWORD PTR [r16+rax*4+0x123]
+	movbe	r18w,WORD PTR [r31+rax*4+0x123]
+	movbe	r25d,DWORD PTR [r31+rax*4+0x123]
+	movdir64b	r25d,[r31d+eax*4+0x123]
+	movdir64b	r31,[r31+rax*4+0x123]
+	movdiri	DWORD PTR [r31+rax*4+0x123],r25d
+	movdiri	QWORD PTR [r31+rax*4+0x123],r31
+	pdep	r10d,edx,r25d
+	pdep	r11,r15,r31
+	pdep	edx,r25d,DWORD PTR [r31+rax*4+0x123]
+	pdep	r15,r31,QWORD PTR [r31+rax*4+0x123]
+	pext	r10d,edx,r25d
+	pext	r11,r15,r31
+	pext	edx,r25d,DWORD PTR [r31+rax*4+0x123]
+	pext	r15,r31,QWORD PTR [r31+rax*4+0x123]
+	sha1msg1	xmm22,xmm23
+	sha1msg1	xmm22,XMMWORD PTR [r31+rax*4+0x123]
+	sha1msg2	xmm22,xmm23
+	sha1msg2	xmm22,XMMWORD PTR [r31+rax*4+0x123]
+	sha1nexte	xmm22,xmm23
+	sha1nexte	xmm22,XMMWORD PTR [r31+rax*4+0x123]
+	sha1rnds4	xmm22,xmm23,0x7b
+	sha1rnds4	xmm22,XMMWORD PTR [r31+rax*4+0x123],0x7b
+	sha256msg1	xmm22,xmm23
+	sha256msg1	xmm22,XMMWORD PTR [r31+rax*4+0x123]
+	sha256msg2	xmm22,xmm23
+	sha256msg2	xmm22,XMMWORD PTR [r31+rax*4+0x123]
+	sha256rnds2	xmm12,XMMWORD PTR [r31+rax*4+0x123]
+	shlx	r10d,edx,r25d
+	shlx	edx,DWORD PTR [r31+rax*4+0x123],r25d
+	shlx	r11,r15,r31
+	shlx	r15,QWORD PTR [r31+rax*4+0x123],r31
+	shrx	r10d,edx,r25d
+	shrx	edx,DWORD PTR [r31+rax*4+0x123],r25d
+	shrx	r11,r15,r31
+	shrx	r15,QWORD PTR [r31+rax*4+0x123],r31
+	sttilecfg	[r31+rax*4+0x123]
+	tileloadd	tmm6,[r31+rax*4+0x123]
+	tileloaddt1	tmm6,[r31+rax*4+0x123]
+	tilestored	[r31+rax*4+0x123],tmm6
+	wrssd	DWORD PTR [r31+rax*4+0x123],r25d
+	wrssq	QWORD PTR [r31+rax*4+0x123],r31
+	wrussd	DWORD PTR [r31+rax*4+0x123],r25d
+	wrussq	QWORD PTR [r31+rax*4+0x123],r31
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index ffacc9c8e2b..bfda747e02e 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -364,7 +364,12 @@ run_dump_test "x86-64-avx512f-rcigrne"
 run_dump_test "x86-64-avx512f-rcigru-intel"
 run_dump_test "x86-64-avx512f-rcigru"
 run_list_test "x86-64-apx-egpr-inval"
+run_dump_test "x86-64-apx-evex-promoted-bad"
+run_list_test "x86-64-apx-egpr-promote-inval" "-al"
 run_dump_test "x86-64-apx-rex2"
+run_dump_test "x86-64-apx-evex-promoted"
+run_dump_test "x86-64-apx-evex-promoted-intel"
+run_dump_test "x86-64-apx-evex-egpr"
 run_dump_test "x86-64-avx512f-rcigrz-intel"
 run_dump_test "x86-64-avx512f-rcigrz"
 run_dump_test "x86-64-clwb"
-- 
2.25.1



* [PATCH V5 5/9] Support APX NDD
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
                   ` (3 preceding siblings ...)
  2023-12-28  1:27 ` [PATCH V5 4/9] Add tests for " Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:55   ` H.J. Lu
  2023-12-28  1:27 ` [PATCH V5 6/9] Support APX Push2/Pop2 Cui, Lili
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC (permalink / raw)
  To: binutils; +Cc: hongjiu.lu, jbeulich, konglin1

From: konglin1 <lingling.kong@intel.com>

opcodes/ChangeLog:

	* opcodes/i386-dis-evex-reg.h: Handle REG_EVEX_MAP4_80,
	REG_EVEX_MAP4_81, REG_EVEX_MAP4_83, REG_EVEX_MAP4_F6,
	REG_EVEX_MAP4_F7, REG_EVEX_MAP4_FE, REG_EVEX_MAP4_FF.
	* opcodes/i386-dis-evex.h: Add NDD insn.
	* opcodes/i386-dis.c (nd): New define.
	(VexGb): Ditto.
	(VexGv): Ditto.
	(get_valid_dis386): Change for NDD decoding.
	(print_insn): Ditto.
	(putop): Ditto.
	(intel_operand_size): Ditto.
	(OP_E_memory): Ditto.
	(OP_VEX): Ditto.
	* opcodes/i386-opc.h (VexVVVV_DST): New.
	* opcodes/i386-opc.tbl: Add APX NDD instructions and adjust VexVVVV.
	* opcodes/i386-tbl.h: Regenerated.

gas/ChangeLog:

	* gas/config/tc-i386.c (operand_size_match):
	Support APX NDD insns with 3 operands.
	(build_apx_evex_prefix): Change for NDD encoding.
	(process_operands): Ditto.
	(build_modrm_byte): Ditto.
	(match_template): Support swapping the first two operands for
	APX NDD.
	* testsuite/gas/i386/x86-64.exp: Add x86-64-apx-ndd.
	* testsuite/gas/i386/x86-64-apx-ndd.d: New test.
	* testsuite/gas/i386/x86-64-apx-ndd.s: Ditto.
	* testsuite/gas/i386/x86-64-pseudos.d: Add test.
	* testsuite/gas/i386/x86-64-pseudos.s: Ditto.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Ditto.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.
---
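As a reading aid for the build_apx_evex_prefix hunk and the expectations
in x86-64-apx-ndd.d, here is a minimal sketch in C. It is not code from
this patch. It assumes the byte layout implied by the hunk: OR-ing 0x10
into i.vex.bytes[3] sets EVEX.ND, and clearing 0x08 in the same byte
selects a register from r16-r31. It also assumes the usual inverted
EVEX.vvvv field in bits 6:3 of the preceding byte. The helper names
evex_nd_set and evex_ndd_reg are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of binutils.  INSN points at the 0x62
   escape byte, so INSN[3] corresponds to i.vex.bytes[3] in gas.
   Return true when the EVEX.ND bit (0x10) is set.  */
static bool
evex_nd_set (const uint8_t *insn)
{
  return (insn[3] & 0x10) != 0;
}

/* Hypothetical helper: return the register number encoded in
   EVEX.V':vvvv.  Both fields are stored inverted; vvvv occupies
   bits 6:3 of INSN[2] and V' is bit 3 of INSN[3].  For an NDD insn
   this register is the new data destination.  */
static unsigned int
evex_ndd_reg (const uint8_t *insn)
{
  unsigned int vvvv = ((insn[2] >> 3) & 0xf) ^ 0xf;
  unsigned int v4 = ((insn[3] >> 3) & 0x1) ^ 0x1;

  return (v4 << 4) | vvvv;
}
```

Applied to two encodings from x86-64-apx-ndd.d: for 62 f4 e4 18 ff c0
(inc %rax,%rbx) evex_nd_set holds and evex_ndd_reg yields 3 (%rbx); for
62 44 fc 10 01 f8 (add %r31,%r8,%r16) it yields 16 (%r16).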
 gas/config/tc-i386.c                          |  62 +++++--
 .../gas/i386/x86-64-apx-evex-promoted-bad.d   |   3 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.s   |   3 +
 gas/testsuite/gas/i386/x86-64-apx-ndd.d       | 160 ++++++++++++++++
 gas/testsuite/gas/i386/x86-64-apx-ndd.s       | 155 ++++++++++++++++
 gas/testsuite/gas/i386/x86-64-pseudos.d       |  42 +++++
 gas/testsuite/gas/i386/x86-64-pseudos.s       |  43 +++++
 gas/testsuite/gas/i386/x86-64.exp             |   1 +
 opcodes/i386-dis-evex-reg.h                   |  54 ++++++
 opcodes/i386-dis-evex.h                       | 124 ++++++-------
 opcodes/i386-dis.c                            | 171 +++++++++++-------
 opcodes/i386-opc.h                            |   6 +-
 opcodes/i386-opc.tbl                          |  75 ++++++++
 13 files changed, 754 insertions(+), 145 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 7e62d08e9bd..99b484122e1 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -2239,8 +2239,10 @@ operand_size_match (const insn_template *t)
       unsigned int given = i.operands - j - 1;
 
       /* For FMA4 and XOP insns VEX.W controls just the first two
-	 register operands.  */
-      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP))
+	 register operands.  APX_F insns likewise just swap the two source
+	 operands, with the 3rd one being the destination.  */
+      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP)
+	  || is_cpu (t, CpuAPX_F))
 	given = j < 2 ? 1 - j : j;
 
       if (t->operand_types[j].bitfield.class == Reg
@@ -4199,6 +4201,11 @@ build_apx_evex_prefix (void)
   if (i.vex.register_specifier
       && i.vex.register_specifier->reg_flags & RegRex2)
     i.vex.bytes[3] &= ~0x08;
+
+  /* Encode the NDD bit of the instruction promoted from the legacy
+     space.  */
+  if (i.vex.register_specifier && i.tm.opcode_space == SPACE_EVEXMAP4)
+    i.vex.bytes[3] |= 0x10;
 }
 
 static void establish_rex (void)
@@ -7472,18 +7479,22 @@ match_template (char mnem_suffix)
 	     - the store form is requested, and the template is a load form,
 	     - the non-default (swapped) form is requested.  */
 	  overlap1 = operand_type_and (operand_types[0], operand_types[1]);
+
+	  j = i.operands - 1 - (t->opcode_space == SPACE_EVEXMAP4
+				&& t->opcode_modifier.vexvvvv);
+
 	  if (t->opcode_modifier.d && i.reg_operands == i.operands
 	      && !operand_type_all_zero (&overlap1))
 	    switch (i.dir_encoding)
 	      {
 	      case dir_encoding_load:
-		if (operand_type_check (operand_types[i.operands - 1], anymem)
+		if (operand_type_check (operand_types[j], anymem)
 		    || t->opcode_modifier.regmem)
 		  goto check_reverse;
 		break;
 
 	      case dir_encoding_store:
-		if (!operand_type_check (operand_types[i.operands - 1], anymem)
+		if (!operand_type_check (operand_types[j], anymem)
 		    && !t->opcode_modifier.regmem)
 		  goto check_reverse;
 		break;
@@ -7494,6 +7505,7 @@ match_template (char mnem_suffix)
 	      case dir_encoding_default:
 		break;
 	      }
+
 	  /* If we want store form, we skip the current load.  */
 	  if ((i.dir_encoding == dir_encoding_store
 	       || i.dir_encoding == dir_encoding_swap)
@@ -7523,11 +7535,13 @@ match_template (char mnem_suffix)
 		continue;
 	      /* Try reversing direction of operands.  */
 	      j = is_cpu (t, CpuFMA4)
-		  || is_cpu (t, CpuXOP) ? 1 : i.operands - 1;
+		  || is_cpu (t, CpuXOP)
+		  || is_cpu (t, CpuAPX_F) ? 1 : i.operands - 1;
 	      overlap0 = operand_type_and (i.types[0], operand_types[j]);
 	      overlap1 = operand_type_and (i.types[j], operand_types[0]);
 	      overlap2 = operand_type_and (i.types[1], operand_types[1]);
-	      gas_assert (t->operands != 3 || !check_register);
+	      gas_assert (t->operands != 3 || !check_register
+			  || is_cpu (t, CpuAPX_F));
 	      if (!operand_type_match (overlap0, i.types[0])
 		  || !operand_type_match (overlap1, i.types[j])
 		  || (t->operands == 3
@@ -7562,6 +7576,11 @@ match_template (char mnem_suffix)
 		  found_reverse_match = Opcode_VexW;
 		  goto check_operands_345;
 		}
+	      else if (is_cpu (t, CpuAPX_F) && i.operands == 3)
+		{
+		  found_reverse_match = Opcode_D;
+		  goto check_operands_345;
+		}
 	      else if (t->opcode_space != SPACE_BASE
 		       && (t->opcode_space != SPACE_0F
 			   /* MOV to/from CR/DR/TR, as an exception, follow
@@ -7743,6 +7762,9 @@ match_template (char mnem_suffix)
 
       i.tm.base_opcode ^= found_reverse_match;
 
+      if (i.tm.opcode_space == SPACE_EVEXMAP4)
+	goto swap_first_2;
+
       /* Certain SIMD insns have their load forms specified in the opcode
 	 table, and hence we need to _set_ RegMem instead of clearing it.
 	 We need to avoid setting the bit though on insns like KMOVW.  */
@@ -7762,6 +7784,7 @@ match_template (char mnem_suffix)
 	 flipping VEX.W.  */
       i.tm.opcode_modifier.vexw ^= VEXW0 ^ VEXW1;
 
+    swap_first_2:
       j = i.tm.operand_types[0].bitfield.imm8;
       i.tm.operand_types[j] = operand_types[j + 1];
       i.tm.operand_types[j + 1] = operand_types[j];
@@ -8583,12 +8606,9 @@ process_operands (void)
      unnecessary segment overrides.  */
   const reg_entry *default_seg = NULL;
 
-  /* We only need to check those implicit registers for instructions
-     with 3 operands or less.  */
-  if (i.operands <= 3)
-    for (unsigned int j = 0; j < i.operands; j++)
-      if (i.types[j].bitfield.instance != InstanceNone)
-	i.reg_operands--;
+  for (unsigned int j = 0; j < i.operands; j++)
+    if (i.types[j].bitfield.instance != InstanceNone)
+      i.reg_operands--;
 
   if (i.tm.opcode_modifier.sse2avx)
     {
@@ -8942,11 +8962,19 @@ build_modrm_byte (void)
 				     || i.vec_encoding == vex_encoding_evex));
     }
 
-  for (v = source + 1; v < dest; ++v)
-    if (v != reg_slot)
-      break;
-  if (v >= dest)
-    v = ~0;
+  if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
+    {
+      v = dest;
+      dest--;
+    }
+  else
+    {
+      for (v = source + 1; v < dest; ++v)
+	if (v != reg_slot)
+	  break;
+      if (v >= dest)
+	v = ~0;
+    }
   if (i.tm.extension_opcode != None)
     {
       if (dest != source)
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
index 69b2d87f0f7..ba14736c3a8 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
@@ -31,3 +31,6 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
 [ 	]*[a-f0-9]+:[ 	]+62 f2 fc 18 f5[ 	]+\(bad\)
 [ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
+[ 	]*[a-f0-9]+:[ 	]+62 f4 e4[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+08 ff[ 	]+.*
+[ 	]*[a-f0-9]+:[ 	]+04 08[ 	]+.*
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
index 719c4b6de53..fcbb1b93659 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
@@ -37,3 +37,6 @@ _start:
 
 	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[20](EVEX.b) == 0b1
 	.insn EVEX.L0.NP.0f38.W1 0xf5, %rax, (%rax,%rbx){1to8}, %rcx
+
+	#{evex} inc %rax %rbx EVEX.vvvv != 1111 && EVEX.ND = 0.
+	.insn EVEX.L0.NP.M4.W1 0xff/0, (%rax,%rcx), %rbx
diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd.d b/gas/testsuite/gas/i386/x86-64-apx-ndd.d
new file mode 100644
index 00000000000..73410606ce3
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-ndd.d
@@ -0,0 +1,160 @@
+#as:
+#objdump: -dw
+#name: x86-64 APX NDD instructions with evex prefix encoding
+#source: x86-64-apx-ndd.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 d0 34 12 	adc    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 10 f9    	adc    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 12 04 07 	adc    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 13 04 07 	adc    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 14 83 11 	adcl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 54 6d 10 66 c7    	adcx   %r15d,%r8d,%r18d
+\s*[a-f0-9]+:\s*62 14 f9 08 66 04 3f 	adcx   \(%r15,%r31,1\),%r8
+\s*[a-f0-9]+:\s*62 14 69 10 66 04 3f 	adcx   \(%r15,%r31,1\),%r8d,%r18d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 c0 34 12 	add    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 d4 fc 10 81 c7 33 44 34 12 	add    \$0x12344433,%r15,%r16
+\s*[a-f0-9]+:\s*62 d4 74 10 80 c5 34 	add    \$0x34,%r13b,%r17b
+\s*[a-f0-9]+:\s*62 f4 bc 18 81 c0 11 22 33 f4 	add    \$0xfffffffff4332211,%rax,%r8
+\s*[a-f0-9]+:\s*62 44 fc 10 01 f8    	add    %r31,%r8,%r16
+\s*[a-f0-9]+:\s*62 44 fc 10 01 38    	add    %r31,\(%r8\),%r16
+\s*[a-f0-9]+:\s*62 44 f8 10 01 3c c0 	add    %r31,\(%r8,%r16,8\),%r16
+\s*[a-f0-9]+:\s*62 44 7c 10 00 f8    	add    %r31b,%r8b,%r16b
+\s*[a-f0-9]+:\s*62 44 7c 10 01 f8    	add    %r31d,%r8d,%r16d
+\s*[a-f0-9]+:\s*62 44 7d 10 01 f8    	add    %r31w,%r8w,%r16w
+\s*[a-f0-9]+:\s*62 5c fc 10 03 07    	add    \(%r31\),%r8,%r16
+\s*[a-f0-9]+:\s*62 5c f8 10 03 84 07 90 90 00 00 	add    0x9090\(%r31,%r16,1\),%r8,%r16
+\s*[a-f0-9]+:\s*62 44 7c 10 00 f8    	add    %r31b,%r8b,%r16b
+\s*[a-f0-9]+:\s*62 44 7c 10 01 f8    	add    %r31d,%r8d,%r16d
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 04 83 11 	addl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 44 fc 10 01 f8    	add    %r31,%r8,%r16
+\s*[a-f0-9]+:\s*62 d4 fc 10 81 04 8f 33 44 34 12 	addq   \$0x12344433,\(%r15,%rcx,4\),%r16
+\s*[a-f0-9]+:\s*62 44 7d 10 01 f8    	add    %r31w,%r8w,%r16w
+\s*[a-f0-9]+:\s*62 54 6e 10 66 c7    	adox   %r15d,%r8d,%r18d
+\s*[a-f0-9]+:\s*62 5c fc 10 03 c7    	add    %r31,%r8,%r16
+\s*[a-f0-9]+:\s*62 44 fc 10 01 f8    	add    %r31,%r8,%r16
+\s*[a-f0-9]+:\s*62 14 fa 08 66 04 3f 	adox   \(%r15,%r31,1\),%r8
+\s*[a-f0-9]+:\s*62 14 6a 10 66 04 3f 	adox   \(%r15,%r31,1\),%r8d,%r18d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 e0 34 12 	and    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 20 f9    	and    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 22 04 07 	and    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 23 04 07 	and    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 24 83 11 	andl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 47 90 90 90 90 90 	cmova  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 43 90 90 90 90 90 	cmovae -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 42 90 90 90 90 90 	cmovb  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 46 90 90 90 90 90 	cmovbe -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 44 90 90 90 90 90 	cmove  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4f 90 90 90 90 90 	cmovg  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4d 90 90 90 90 90 	cmovge -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4c 90 90 90 90 90 	cmovl  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4e 90 90 90 90 90 	cmovle -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 45 90 90 90 90 90 	cmovne -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 41 90 90 90 90 90 	cmovno -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 49 90 90 90 90 90 	cmovns -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 40 90 90 90 90 90 	cmovo  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4a 90 90 90 90 90 	cmovp  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 48 90 90 90 90 90 	cmovs  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*62 f4 f4 10 ff c8    	dec    %rax,%r17
+\s*[a-f0-9]+:\s*62 9c 3c 18 fe 0c 27 	decb   \(%r31,%r12,1\),%r8b
+\s*[a-f0-9]+:\s*62 b4 b0 10 af 94 f8 09 09 00 00 	imul   0x909\(%rax,%r31,8\),%rdx,%r25
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 af 90 09 09 09 00 	imul   0x90909\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*62 dc fc 10 ff c7    	inc    %r31,%r16
+\s*[a-f0-9]+:\s*62 dc bc 18 ff c7    	inc    %r31,%r8
+\s*[a-f0-9]+:\s*62 f4 e4 18 ff c0    	inc    %rax,%rbx
+\s*[a-f0-9]+:\s*62 f4 f4 10 f7 d8    	neg    %rax,%r17
+\s*[a-f0-9]+:\s*62 9c 3c 18 f6 1c 27 	negb   \(%r31,%r12,1\),%r8b
+\s*[a-f0-9]+:\s*62 f4 f4 10 f7 d0    	not    %rax,%r17
+\s*[a-f0-9]+:\s*62 9c 3c 18 f6 14 27 	notb   \(%r31,%r12,1\),%r8b
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 c8 34 12 	or     \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 08 f9    	or     %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 0a 04 07 	or     \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 0b 04 07 	or     \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 0c 83 11 	orl    \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 d4 02 	rcl    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 d0    	rcl    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 10    	rclb   \$1,\(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 10 02 	rcll   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 10    	rclw   \$1,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 14 83 	rclw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 dc 02 	rcr    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 d8    	rcr    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 18    	rcrb   \$1,\(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 18 02 	rcrl   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 18    	rcrw   \$1,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 1c 83 	rcrw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 c4 02 	rol    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 c0    	rol    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 00    	rolb   \$1,\(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 00 02 	roll   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 00    	rolw   \$1,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 04 83 	rolw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 cc 02 	ror    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 c8    	ror    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 08    	rorb   \$1,\(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 08 02 	rorl   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 08    	rorw   \$1,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 0c 83 	rorw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 fc 02 	sar    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 f8    	sar    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 38    	sarb   \$1,\(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 38 02 	sarl   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 38    	sarw   \$1,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 3c 83 	sarw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 d8 34 12 	sbb    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 18 f9    	sbb    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 1a 04 07 	sbb    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 1b 04 07 	sbb    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 1c 83 11 	sbbl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 e4 02 	shl    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 e4 02 	shl    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e0    	shl    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e0    	shl    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 20    	shlb   \$1,\(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 20    	shlb   \$1,\(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 74 84 10 24 20 01 	shld   \$0x1,%r12,\(%rax\),%r31
+\s*[a-f0-9]+:\s*62 74 04 10 24 38 02 	shld   \$0x2,%r15d,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 54 05 10 24 c4 02 	shld   \$0x2,%r8w,%r12w,%r31w
+\s*[a-f0-9]+:\s*62 7c bc 18 a5 e0    	shld   %cl,%r12,%r16,%r8
+\s*[a-f0-9]+:\s*62 7c 05 10 a5 2c 83 	shld   %cl,%r13w,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 74 05 10 a5 08    	shld   %cl,%r9w,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 20 02 	shll   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 20 02 	shll   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 20    	shlw   \$1,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 20    	shlw   \$1,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 24 83 	shlw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 24 83 	shlw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 ec 02 	shr    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e8    	shr    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 28    	shrb   \$1,\(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 74 84 10 2c 20 01 	shrd   \$0x1,%r12,\(%rax\),%r31
+\s*[a-f0-9]+:\s*62 74 04 10 2c 38 02 	shrd   \$0x2,%r15d,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 54 05 10 2c c4 02 	shrd   \$0x2,%r8w,%r12w,%r31w
+\s*[a-f0-9]+:\s*62 7c bc 18 ad e0    	shrd   %cl,%r12,%r16,%r8
+\s*[a-f0-9]+:\s*62 7c 05 10 ad 2c 83 	shrd   %cl,%r13w,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 74 05 10 ad 08    	shrd   %cl,%r9w,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 28 02 	shrl   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 28    	shrw   \$1,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 2c 83 	shrw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 e8 34 12 	sub    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 28 f9    	sub    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 2a 04 07 	sub    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 2b 04 07 	sub    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 2c 83 11 	subl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 f0 34 12 	xor    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 30 f9    	xor    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 32 04 07 	xor    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 33 04 07 	xor    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 34 83 11 	xorl   \$0x11,\(%r19,%rax,4\),%r20d
diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd.s b/gas/testsuite/gas/i386/x86-64-apx-ndd.s
new file mode 100644
index 00000000000..4e248f737a9
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-ndd.s
@@ -0,0 +1,155 @@
+# Check 64-bit APX NDD instructions with EVEX prefix encoding
+
+	.allow_index_reg
+	.text
+_start:
+	adc    $0x1234,%ax,%r30w
+	adc    %r15b,%r17b,%r18b
+	adc    %r15d,(%r8),%r18d
+	adc    (%r15,%rax,1),%r16b,%r8b
+	adc    (%r15,%rax,1),%r16w,%r8w
+	adcl   $0x11,(%r19,%rax,4),%r20d
+	adcx   %r15d,%r8d,%r18d
+	adcx   (%r15,%r31,1),%r8
+	adcx   (%r15,%r31,1),%r8d,%r18d
+	add    $0x1234,%ax,%r30w
+	add    $0x12344433,%r15,%r16
+	add    $0x34,%r13b,%r17b
+	add    $0xfffffffff4332211,%rax,%r8
+	add    %r31,%r8,%r16
+	add    %r31,(%r8),%r16
+	add    %r31,(%r8,%r16,8),%r16
+	add    %r31b,%r8b,%r16b
+	add    %r31d,%r8d,%r16d
+	add    %r31w,%r8w,%r16w
+	add    (%r31),%r8,%r16
+	add    0x9090(%r31,%r16,1),%r8,%r16
+	addb   %r31b,%r8b,%r16b
+	addl   %r31d,%r8d,%r16d
+	addl   $0x11,(%r19,%rax,4),%r20d
+	addq   %r31,%r8,%r16
+	addq   $0x12344433,(%r15,%rcx,4),%r16
+	addw   %r31w,%r8w,%r16w
+	adox   %r15d,%r8d,%r18d
+	{load}  add    %r31,%r8,%r16
+	{store} add    %r31,%r8,%r16
+	adox   (%r15,%r31,1),%r8
+	adox   (%r15,%r31,1),%r8d,%r18d
+	and    $0x1234,%ax,%r30w
+	and    %r15b,%r17b,%r18b
+	and    %r15d,(%r8),%r18d
+	and    (%r15,%rax,1),%r16b,%r8b
+	and    (%r15,%rax,1),%r16w,%r8w
+	andl   $0x11,(%r19,%rax,4),%r20d
+	cmova  0x90909090(%eax),%edx,%r8d
+	cmovae 0x90909090(%eax),%edx,%r8d
+	cmovb  0x90909090(%eax),%edx,%r8d
+	cmovbe 0x90909090(%eax),%edx,%r8d
+	cmove  0x90909090(%eax),%edx,%r8d
+	cmovg  0x90909090(%eax),%edx,%r8d
+	cmovge 0x90909090(%eax),%edx,%r8d
+	cmovl  0x90909090(%eax),%edx,%r8d
+	cmovle 0x90909090(%eax),%edx,%r8d
+	cmovne 0x90909090(%eax),%edx,%r8d
+	cmovno 0x90909090(%eax),%edx,%r8d
+	cmovnp 0x90909090(%eax),%edx,%r8d
+	cmovns 0x90909090(%eax),%edx,%r8d
+	cmovo  0x90909090(%eax),%edx,%r8d
+	cmovp  0x90909090(%eax),%edx,%r8d
+	cmovs  0x90909090(%eax),%edx,%r8d
+	dec    %rax,%r17
+	decb   (%r31,%r12,1),%r8b
+	imul   0x909(%rax,%r31,8),%rdx,%r25
+	imul   0x90909(%eax),%edx,%r8d
+	inc    %r31,%r16
+	inc    %r31,%r8
+	inc    %rax,%rbx
+	neg    %rax,%r17
+	negb   (%r31,%r12,1),%r8b
+	not    %rax,%r17
+	notb   (%r31,%r12,1),%r8b
+	or     $0x1234,%ax,%r30w
+	or     %r15b,%r17b,%r18b
+	or     %r15d,(%r8),%r18d
+	or     (%r15,%rax,1),%r16b,%r8b
+	or     (%r15,%rax,1),%r16w,%r8w
+	orl    $0x11,(%r19,%rax,4),%r20d
+	rcl    $0x2,%r12b,%r31b
+	rcl    %cl,%r16b,%r8b
+	rclb   $0x1,(%rax),%r31b
+	rcll   $0x2,(%rax),%r31d
+	rclw   $0x1,(%rax),%r31w
+	rclw   %cl,(%r19,%rax,4),%r31w
+	rcr    $0x2,%r12b,%r31b
+	rcr    %cl,%r16b,%r8b
+	rcrb   $0x1,(%rax),%r31b
+	rcrl   $0x2,(%rax),%r31d
+	rcrw   $0x1,(%rax),%r31w
+	rcrw   %cl,(%r19,%rax,4),%r31w
+	rol    $0x2,%r12b,%r31b
+	rol    %cl,%r16b,%r8b
+	rolb   $0x1,(%rax),%r31b
+	roll   $0x2,(%rax),%r31d
+	rolw   $0x1,(%rax),%r31w
+	rolw   %cl,(%r19,%rax,4),%r31w
+	ror    $0x2,%r12b,%r31b
+	ror    %cl,%r16b,%r8b
+	rorb   $0x1,(%rax),%r31b
+	rorl   $0x2,(%rax),%r31d
+	rorw   $0x1,(%rax),%r31w
+	rorw   %cl,(%r19,%rax,4),%r31w
+	sar    $0x2,%r12b,%r31b
+	sar    %cl,%r16b,%r8b
+	sarb   $0x1,(%rax),%r31b
+	sarl   $0x2,(%rax),%r31d
+	sarw   $0x1,(%rax),%r31w
+	sarw   %cl,(%r19,%rax,4),%r31w
+	sbb    $0x1234,%ax,%r30w
+	sbb    %r15b,%r17b,%r18b
+	sbb    %r15d,(%r8),%r18d
+	sbb    (%r15,%rax,1),%r16b,%r8b
+	sbb    (%r15,%rax,1),%r16w,%r8w
+	sbbl   $0x11,(%r19,%rax,4),%r20d
+	shl    $0x2,%r12b,%r31b
+	shl    $0x2,%r12b,%r31b
+	shl    %cl,%r16b,%r8b
+	shl    %cl,%r16b,%r8b
+	shlb   $0x1,(%rax),%r31b
+	shlb   $0x1,(%rax),%r31b
+	shld   $0x1,%r12,(%rax),%r31
+	shld   $0x2,%r15d,(%rax),%r31d
+	shld   $0x2,%r8w,%r12w,%r31w
+	shld   %cl,%r12,%r16,%r8
+	shld   %cl,%r13w,(%r19,%rax,4),%r31w
+	shld   %cl,%r9w,(%rax),%r31w
+	shll   $0x2,(%rax),%r31d
+	shll   $0x2,(%rax),%r31d
+	shlw   $0x1,(%rax),%r31w
+	shlw   $0x1,(%rax),%r31w
+	shlw   %cl,(%r19,%rax,4),%r31w
+	shlw   %cl,(%r19,%rax,4),%r31w
+	shr    $0x2,%r12b,%r31b
+	shr    %cl,%r16b,%r8b
+	shrb   $0x1,(%rax),%r31b
+	shrd   $0x1,%r12,(%rax),%r31
+	shrd   $0x2,%r15d,(%rax),%r31d
+	shrd   $0x2,%r8w,%r12w,%r31w
+	shrd   %cl,%r12,%r16,%r8
+	shrd   %cl,%r13w,(%r19,%rax,4),%r31w
+	shrd   %cl,%r9w,(%rax),%r31w
+	shrl   $0x2,(%rax),%r31d
+	shrw   $0x1,(%rax),%r31w
+	shrw   %cl,(%r19,%rax,4),%r31w
+	sub    $0x1234,%ax,%r30w
+	sub    %r15b,%r17b,%r18b
+	sub    %r15d,(%r8),%r18d
+	sub    (%r15,%rax,1),%r16b,%r8b
+	sub    (%r15,%rax,1),%r16w,%r8w
+	subl   $0x11,(%r19,%rax,4),%r20d
+	xor    $0x1234,%ax,%r30w
+	xor    %r15b,%r17b,%r18b
+	xor    %r15d,(%r8),%r18d
+	xor    (%r15,%rax,1),%r16b,%r8b
+	xor    (%r15,%rax,1),%r16w,%r8w
+	xorl   $0x11,(%r19,%rax,4),%r20d
+
diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.d b/gas/testsuite/gas/i386/x86-64-pseudos.d
index 19dcd8415ac..c55e6f4b7aa 100644
--- a/gas/testsuite/gas/i386/x86-64-pseudos.d
+++ b/gas/testsuite/gas/i386/x86-64-pseudos.d
@@ -137,6 +137,48 @@ Disassembly of section .text:
  +[a-f0-9]+:	33 07                	xor    \(%rdi\),%eax
  +[a-f0-9]+:	31 07                	xor    %eax,\(%rdi\)
  +[a-f0-9]+:	33 07                	xor    \(%rdi\),%eax
+ +[a-f0-9]+:	62 44 fc 10 01 38    	add    %r31,\(%r8\),%r16
+ +[a-f0-9]+:	62 44 fc 10 03 38    	add    \(%r8\),%r31,%r16
+ +[a-f0-9]+:	62 44 fc 10 01 38    	add    %r31,\(%r8\),%r16
+ +[a-f0-9]+:	62 44 fc 10 03 38    	add    \(%r8\),%r31,%r16
+ +[a-f0-9]+:	62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 2b 38    	sub    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 2b 38    	sub    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 1b 38    	sbb    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 1b 38    	sbb    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 23 38    	and    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 23 38    	and    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 0b 38    	or     \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 0b 38    	or     \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 33 38    	xor    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 33 38    	xor    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 13 38    	adc    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 13 38    	adc    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 44 fc 10 01 f8    	add    %r31,%r8,%r16
+ +[a-f0-9]+:	62 5c fc 10 03 c7    	add    %r31,%r8,%r16
+ +[a-f0-9]+:	62 7c 6c 10 28 f9    	sub    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 2a cf    	sub    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 18 f9    	sbb    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 1a cf    	sbb    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 20 f9    	and    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 22 cf    	and    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 08 f9    	or     %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 0a cf    	or     %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 30 f9    	xor    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 32 cf    	xor    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 10 f9    	adc    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 12 cf    	adc    %r15b,%r17b,%r18b
  +[a-f0-9]+:	b0 12                	mov    \$0x12,%al
  +[a-f0-9]+:	b8 45 03 00 00       	mov    \$0x345,%eax
  +[a-f0-9]+:	b0 12                	mov    \$0x12,%al
diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.s b/gas/testsuite/gas/i386/x86-64-pseudos.s
index 5a53c363615..041f98e1939 100644
--- a/gas/testsuite/gas/i386/x86-64-pseudos.s
+++ b/gas/testsuite/gas/i386/x86-64-pseudos.s
@@ -134,6 +134,49 @@ _start:
 	{load} xor (%rdi), %eax
 	{store} xor %eax, (%rdi)
 	{store} xor (%rdi), %eax
+	{load}  add    %r31,(%r8),%r16
+	{load}	add    (%r8),%r31,%r16
+	{store} add    %r31,(%r8),%r16
+	{store}	add    (%r8),%r31,%r16
+	{load} 	sub    %r15d,(%r8),%r18d
+	{load}	sub    (%r8),%r15d,%r18d
+	{store} sub    %r15d,(%r8),%r18d
+	{store} sub    (%r8),%r15d,%r18d
+	{load} 	sbb    %r15d,(%r8),%r18d
+	{load}	sbb    (%r8),%r15d,%r18d
+	{store} sbb    %r15d,(%r8),%r18d
+	{store} sbb    (%r8),%r15d,%r18d
+	{load} 	and    %r15d,(%r8),%r18d
+	{load}	and    (%r8),%r15d,%r18d
+	{store} and    %r15d,(%r8),%r18d
+	{store} and    (%r8),%r15d,%r18d
+	{load} 	or     %r15d,(%r8),%r18d
+	{load}	or     (%r8),%r15d,%r18d
+	{store} or     %r15d,(%r8),%r18d
+	{store} or     (%r8),%r15d,%r18d
+	{load} 	xor    %r15d,(%r8),%r18d
+	{load}	xor    (%r8),%r15d,%r18d
+	{store} xor    %r15d,(%r8),%r18d
+	{store} xor    (%r8),%r15d,%r18d
+	{load} 	adc    %r15d,(%r8),%r18d
+	{load}	adc    (%r8),%r15d,%r18d
+	{store} adc    %r15d,(%r8),%r18d
+	{store} adc    (%r8),%r15d,%r18d
+
+	{store} add    %r31,%r8,%r16
+	{load}  add    %r31,%r8,%r16
+	{store} sub    %r15b,%r17b,%r18b
+	{load}	sub    %r15b,%r17b,%r18b
+	{store}	sbb    %r15b,%r17b,%r18b
+	{load}	sbb    %r15b,%r17b,%r18b
+	{store}	and    %r15b,%r17b,%r18b
+	{load}	and    %r15b,%r17b,%r18b
+	{store}	or     %r15b,%r17b,%r18b
+	{load}	or     %r15b,%r17b,%r18b
+	{store}	xor    %r15b,%r17b,%r18b
+	{load}	xor    %r15b,%r17b,%r18b
+	{store}	adc    %r15b,%r17b,%r18b
+	{load}	adc    %r15b,%r17b,%r18b
 
 	.irp m, mov, adc, add, and, cmp, or, sbb, sub, test, xor
 	\m	$0x12, %al
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index bfda747e02e..3a3438a5de3 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -370,6 +370,7 @@ run_dump_test "x86-64-apx-rex2"
 run_dump_test "x86-64-apx-evex-promoted"
 run_dump_test "x86-64-apx-evex-promoted-intel"
 run_dump_test "x86-64-apx-evex-egpr"
+run_dump_test "x86-64-apx-ndd"
 run_dump_test "x86-64-avx512f-rcigrz-intel"
 run_dump_test "x86-64-avx512f-rcigrz"
 run_dump_test "x86-64-clwb"
diff --git a/opcodes/i386-dis-evex-reg.h b/opcodes/i386-dis-evex-reg.h
index 2885063628b..cac3c39c4c5 100644
--- a/opcodes/i386-dis-evex-reg.h
+++ b/opcodes/i386-dis-evex-reg.h
@@ -49,3 +49,57 @@
     { "vscatterpf0qp%XW",  { MVexVSIBQWpX }, PREFIX_DATA },
     { "vscatterpf1qp%XW",  { MVexVSIBQWpX }, PREFIX_DATA },
   },
+  /* REG_EVEX_MAP4_80 */
+  {
+    { "addA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "orA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "adcA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "sbbA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "andA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "subA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "xorA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+  },
+  /* REG_EVEX_MAP4_81 */
+  {
+    { "addQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
+    { "orQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
+    { "adcQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
+    { "sbbQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
+    { "andQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
+    { "subQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
+    { "xorQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
+  },
+  /* REG_EVEX_MAP4_83 */
+  {
+    { "addQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
+    { "orQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
+    { "adcQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
+    { "sbbQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
+    { "andQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
+    { "subQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
+    { "xorQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
+  },
+  /* REG_EVEX_MAP4_F6 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { "notA",	{ VexGb, Eb }, NO_PREFIX },
+    { "negA",	{ VexGb, Eb }, NO_PREFIX },
+  },
+  /* REG_EVEX_MAP4_F7 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { "notQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
+    { "negQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
+  },
+  /* REG_EVEX_MAP4_FE */
+  {
+    { "incA",	{ VexGb, Eb }, NO_PREFIX },
+    { "decA",	{ VexGb, Eb }, NO_PREFIX },
+  },
+  /* REG_EVEX_MAP4_FF */
+  {
+    { "incQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
+    { "decQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
+  },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index 90c063b2188..a8a891d7f0e 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -875,64 +875,64 @@ static const struct dis386 evex_table[][256] = {
   /* EVEX_MAP4_ */
   {
     /* 00 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "addB",             { VexGb, Eb, Gb }, NO_PREFIX },
+    { "addS",             { VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
+    { "addB",             { VexGb, Gb, EbS }, NO_PREFIX },
+    { "addS",             { VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 08 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "orB",		{ VexGb, Eb, Gb }, NO_PREFIX },
+    { "orS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
+    { "orB",		{ VexGb, Gb, EbS }, NO_PREFIX },
+    { "orS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 10 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "adcB",		{ VexGb, Eb, Gb }, NO_PREFIX },
+    { "adcS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
+    { "adcB",		{ VexGb, Gb, EbS }, NO_PREFIX },
+    { "adcS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 18 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "sbbB",		{ VexGb, Eb, Gb }, NO_PREFIX },
+    { "sbbS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
+    { "sbbB",		{ VexGb, Gb, EbS }, NO_PREFIX },
+    { "sbbS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 20 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "andB",		{ VexGb, Eb, Gb }, NO_PREFIX },
+    { "andS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
+    { "andB",		{ VexGb, Gb, EbS }, NO_PREFIX },
+    { "andS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
+    { "shldS",		{ VexGv, Ev, Gv, Ib }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 28 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "subB",		{ VexGb, Eb, Gb }, NO_PREFIX },
+    { "subS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
+    { "subB",		{ VexGb, Gb, EbS }, NO_PREFIX },
+    { "subS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
+    { "shrdS",		{ VexGv, Ev, Gv, Ib }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 30 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "xorB",		{ VexGb, Eb, Gb }, NO_PREFIX },
+    { "xorS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
+    { "xorB",		{ VexGb, Gb, EbS }, NO_PREFIX },
+    { "xorS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -947,23 +947,23 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 40 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "%CFcmovoS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovnoS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovbS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovaeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmoveS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovneS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovbeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovaS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
     /* 48 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "%CFcmovsS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovnsS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovpS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovnpS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovlS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovgeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovleS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "%CFcmovgS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
     /* 50 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1019,10 +1019,10 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 80 */
+    { REG_TABLE (REG_EVEX_MAP4_80) },
+    { REG_TABLE (REG_EVEX_MAP4_81) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_EVEX_MAP4_83) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1060,7 +1060,7 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { "shldS",	{ VexGv, Ev, Gv, CL }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     /* A8 */
@@ -1069,9 +1069,9 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
+    { "shrdS",	{ VexGv, Ev, Gv, CL }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "imulS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
     /* B0 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1091,8 +1091,8 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* C0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_C0) },
+    { REG_TABLE (REG_C1) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1109,10 +1109,10 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* D0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_D0) },
+    { REG_TABLE (REG_D1) },
+    { REG_TABLE (REG_D2) },
+    { REG_TABLE (REG_D3) },
     { "sha1rnds4",	{ XM, EXxmm, Ib }, NO_PREFIX },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1151,8 +1151,8 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_EVEX_MAP4_F6) },
+    { REG_TABLE (REG_EVEX_MAP4_F7) },
     /* F8 */
     { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
     { "movdiri",	{ Mdq, Gdq }, NO_PREFIX },
@@ -1160,8 +1160,8 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { PREFIX_TABLE (PREFIX_0F38FC) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_EVEX_MAP4_FE) },
+    { REG_TABLE (REG_EVEX_MAP4_FF) },
   },
   /* EVEX_MAP5_ */
   {
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index d4d32befcf9..1bb2882d839 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -226,6 +226,9 @@ struct instr_info
   }
   vex;
 
+/* For APX EVEX-promoted prefix, EVEX.ND shares the same bit as vex.b.  */
+#define nd b
+
   enum evex_type evex_type;
 
   /* Remember if the current op is a jump instruction.  */
@@ -578,6 +581,8 @@ fetch_error (const instr_info *ins)
 #define VexGatherD { OP_VEX, vex_vsib_d_w_dq_mode }
 #define VexGatherQ { OP_VEX, vex_vsib_q_w_dq_mode }
 #define VexGdq { OP_VEX, dq_mode }
+#define VexGb { OP_VEX, b_mode }
+#define VexGv { OP_VEX, v_mode }
 #define VexTmm { OP_VEX, tmm_mode }
 #define XMVexI4 { OP_REG_VexI4, x_mode }
 #define XMVexScalarI4 { OP_REG_VexI4, scalar_mode }
@@ -892,6 +897,13 @@ enum
   REG_EVEX_0F73,
   REG_EVEX_0F38C6_L_2,
   REG_EVEX_0F38C7_L_2,
+  REG_EVEX_MAP4_80,
+  REG_EVEX_MAP4_81,
+  REG_EVEX_MAP4_83,
+  REG_EVEX_MAP4_F6,
+  REG_EVEX_MAP4_F7,
+  REG_EVEX_MAP4_FE,
+  REG_EVEX_MAP4_FF,
 };
 
 enum
@@ -2599,25 +2611,25 @@ static const struct dis386 reg_table[][8] = {
   },
   /* REG_C0 */
   {
-    { "rolA",	{ Eb, Ib }, 0 },
-    { "rorA",	{ Eb, Ib }, 0 },
-    { "rclA",	{ Eb, Ib }, 0 },
-    { "rcrA",	{ Eb, Ib }, 0 },
-    { "shlA",	{ Eb, Ib }, 0 },
-    { "shrA",	{ Eb, Ib }, 0 },
-    { "shlA",	{ Eb, Ib }, 0 },
-    { "sarA",	{ Eb, Ib }, 0 },
+    { "rolA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "rorA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "rclA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "rcrA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "shlA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "shrA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "shlA",	{ VexGb, Eb, Ib }, NO_PREFIX },
+    { "sarA",	{ VexGb, Eb, Ib }, NO_PREFIX },
   },
   /* REG_C1 */
   {
-    { "rolQ",	{ Ev, Ib }, 0 },
-    { "rorQ",	{ Ev, Ib }, 0 },
-    { "rclQ",	{ Ev, Ib }, 0 },
-    { "rcrQ",	{ Ev, Ib }, 0 },
-    { "shlQ",	{ Ev, Ib }, 0 },
-    { "shrQ",	{ Ev, Ib }, 0 },
-    { "shlQ",	{ Ev, Ib }, 0 },
-    { "sarQ",	{ Ev, Ib }, 0 },
+    { "rolQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
+    { "rorQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
+    { "rclQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
+    { "rcrQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
+    { "shlQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
+    { "shrQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
+    { "shlQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
+    { "sarQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
   },
   /* REG_C6 */
   {
@@ -2643,47 +2655,47 @@ static const struct dis386 reg_table[][8] = {
   },
   /* REG_D0 */
   {
-    { "rolA",	{ Eb, I1 }, 0 },
-    { "rorA",	{ Eb, I1 }, 0 },
-    { "rclA",	{ Eb, I1 }, 0 },
-    { "rcrA",	{ Eb, I1 }, 0 },
-    { "shlA",	{ Eb, I1 }, 0 },
-    { "shrA",	{ Eb, I1 }, 0 },
-    { "shlA",	{ Eb, I1 }, 0 },
-    { "sarA",	{ Eb, I1 }, 0 },
+    { "rolA",	{ VexGb, Eb, I1 }, NO_PREFIX },
+    { "rorA",	{ VexGb, Eb, I1 }, NO_PREFIX },
+    { "rclA",	{ VexGb, Eb, I1 }, NO_PREFIX },
+    { "rcrA",	{ VexGb, Eb, I1 }, NO_PREFIX },
+    { "shlA",	{ VexGb, Eb, I1 }, NO_PREFIX },
+    { "shrA",	{ VexGb, Eb, I1 }, NO_PREFIX },
+    { "shlA",	{ VexGb, Eb, I1 }, NO_PREFIX },
+    { "sarA",	{ VexGb, Eb, I1 }, NO_PREFIX },
   },
   /* REG_D1 */
   {
-    { "rolQ",	{ Ev, I1 }, 0 },
-    { "rorQ",	{ Ev, I1 }, 0 },
-    { "rclQ",	{ Ev, I1 }, 0 },
-    { "rcrQ",	{ Ev, I1 }, 0 },
-    { "shlQ",	{ Ev, I1 }, 0 },
-    { "shrQ",	{ Ev, I1 }, 0 },
-    { "shlQ",	{ Ev, I1 }, 0 },
-    { "sarQ",	{ Ev, I1 }, 0 },
+    { "rolQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
+    { "rorQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
+    { "rclQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
+    { "rcrQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
+    { "shlQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
+    { "shrQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
+    { "shlQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
+    { "sarQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
   },
   /* REG_D2 */
   {
-    { "rolA",	{ Eb, CL }, 0 },
-    { "rorA",	{ Eb, CL }, 0 },
-    { "rclA",	{ Eb, CL }, 0 },
-    { "rcrA",	{ Eb, CL }, 0 },
-    { "shlA",	{ Eb, CL }, 0 },
-    { "shrA",	{ Eb, CL }, 0 },
-    { "shlA",	{ Eb, CL }, 0 },
-    { "sarA",	{ Eb, CL }, 0 },
+    { "rolA",	{ VexGb, Eb, CL }, NO_PREFIX },
+    { "rorA",	{ VexGb, Eb, CL }, NO_PREFIX },
+    { "rclA",	{ VexGb, Eb, CL }, NO_PREFIX },
+    { "rcrA",	{ VexGb, Eb, CL }, NO_PREFIX },
+    { "shlA",	{ VexGb, Eb, CL }, NO_PREFIX },
+    { "shrA",	{ VexGb, Eb, CL }, NO_PREFIX },
+    { "shlA",	{ VexGb, Eb, CL }, NO_PREFIX },
+    { "sarA",	{ VexGb, Eb, CL }, NO_PREFIX },
   },
   /* REG_D3 */
   {
-    { "rolQ",	{ Ev, CL }, 0 },
-    { "rorQ",	{ Ev, CL }, 0 },
-    { "rclQ",	{ Ev, CL }, 0 },
-    { "rcrQ",	{ Ev, CL }, 0 },
-    { "shlQ",	{ Ev, CL }, 0 },
-    { "shrQ",	{ Ev, CL }, 0 },
-    { "shlQ",	{ Ev, CL }, 0 },
-    { "sarQ",	{ Ev, CL }, 0 },
+    { "rolQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
+    { "rorQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
+    { "rclQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
+    { "rcrQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
+    { "shlQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
+    { "shrQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
+    { "shlQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
+    { "sarQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
   },
   /* REG_F6 */
   {
@@ -3633,8 +3645,8 @@ static const struct dis386 prefix_table[][4] = {
   /* PREFIX_0F38F6 */
   {
     { "wrssK",	{ M, Gdq }, 0 },
-    { "adoxS",	{ Gdq, Edq}, 0 },
-    { "adcxS",	{ Gdq, Edq}, 0 },
+    { "adoxS",	{ VexGdq, Gdq, Edq}, 0 },
+    { "adcxS",	{ VexGdq, Gdq, Edq}, 0 },
     { Bad_Opcode },
   },
 
@@ -9120,6 +9132,12 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 	  ins->rex2 &= ~REX_R;
 	}
 
+      /* For EVEX from legacy instructions, when the EVEX.ND bit is 0,
+	 all bits of EVEX.vvvv and EVEX.V' must be 1.  */
+      if (ins->evex_type == evex_from_legacy && !ins->vex.nd
+	  && (ins->vex.register_specifier || !ins->vex.v))
+	return &bad_opcode;
+
       ins->need_vex = 4;
 
       /* EVEX from legacy instructions require that EVEX.z, EVEX.L’L and the
@@ -9137,8 +9155,10 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
       if (!fetch_modrm (ins))
 	return &err_opcode;
 
-      /* Set vector length.  */
-      if (ins->modrm.mod == 3 && ins->vex.b)
+      /* Set vector length.  For EVEX-promoted instructions, evex.ll == 0b00,
+	 which has the same encoding as vex.length == 128, so they can share
+	 the vex.length processing in OP_VEX.  */
+      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type != evex_from_legacy)
 	ins->vex.length = 512;
       else
 	{
@@ -9605,8 +9625,8 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
 	    }
 
 	  /* Check whether rounding control was enabled for an insn not
-	     supporting it.  */
-	  if (ins.modrm.mod == 3 && ins.vex.b
+	     supporting it, when evex.b is not treated as evex.nd.  */
+	  if (ins.modrm.mod == 3 && ins.vex.b && ins.evex_type == evex_default
 	      && !(ins.evex_used & EVEX_b_used))
 	    {
 	      for (i = 0; i < MAX_OPERANDS; ++i)
@@ -10499,16 +10519,23 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
 	  ins->used_prefixes |= (ins->prefixes & PREFIX_ADDR);
 	  break;
 	case 'F':
-	  if (ins->intel_syntax)
-	    break;
-	  if ((ins->prefixes & PREFIX_ADDR) || (sizeflag & SUFFIX_ALWAYS))
+	  if (l == 0)
 	    {
-	      if (sizeflag & AFLAG)
-		*ins->obufp++ = ins->address_mode == mode_64bit ? 'q' : 'l';
-	      else
-		*ins->obufp++ = ins->address_mode == mode_64bit ? 'l' : 'w';
-	      ins->used_prefixes |= (ins->prefixes & PREFIX_ADDR);
+	      if (ins->intel_syntax)
+		break;
+	      if ((ins->prefixes & PREFIX_ADDR) || (sizeflag & SUFFIX_ALWAYS))
+		{
+		  if (sizeflag & AFLAG)
+		    *ins->obufp++ = ins->address_mode == mode_64bit ? 'q' : 'l';
+		  else
+		    *ins->obufp++ = ins->address_mode == mode_64bit ? 'l' : 'w';
+		  ins->used_prefixes |= (ins->prefixes & PREFIX_ADDR);
+		}
 	    }
+	  else if (l == 1 && last[0] == 'C')
+	    break;
+	  else
+	    abort ();
 	  break;
 	case 'G':
 	  if (ins->intel_syntax || (ins->obufp[-1] != 's'
@@ -11072,7 +11099,8 @@ print_displacement (instr_info *ins, bfd_signed_vma val)
 static void
 intel_operand_size (instr_info *ins, int bytemode, int sizeflag)
 {
-  if (ins->vex.b)
+  /* Check if there is a broadcast, when evex.b is not treated as evex.nd.  */
+  if (ins->vex.b && ins->evex_type == evex_default)
     {
       if (!ins->vex.no_broadcast)
 	switch (bytemode)
@@ -11569,6 +11597,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 
   add += (ins->rex2 & REX_B) ? 16 : 0;
 
+  /* Handle EVEX encodings other than APX EVEX-promoted instructions.  */
   if (ins->vex.evex && ins->evex_type == evex_default)
     {
 
@@ -12004,7 +12033,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 	  print_operand_value (ins, disp & 0xffff, dis_style_text);
 	}
     }
-  if (ins->vex.b)
+  if (ins->vex.b && ins->evex_type == evex_default)
     {
       ins->evex_used |= EVEX_b_used;
 
@@ -13370,6 +13399,13 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
   if (!ins->need_vex)
     return true;
 
+  if (ins->evex_type == evex_from_legacy)
+    {
+      ins->evex_used |= EVEX_b_used;
+      if (!ins->vex.nd)
+	return true;
+    }
+
   reg = ins->vex.register_specifier;
   ins->vex.register_specifier = 0;
   if (ins->address_mode != mode_64bit)
@@ -13461,12 +13497,19 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
 	  names = att_names_xmm;
 	  ins->evex_used |= EVEX_len_used;
 	  break;
+	case v_mode:
 	case dq_mode:
 	  if (ins->rex & REX_W)
 	    names = att_names64;
+	  else if (bytemode == v_mode
+		   && !(sizeflag & DFLAG))
+	    names = att_names16;
 	  else
 	    names = att_names32;
 	  break;
+	case b_mode:
+	  names = att_names8rex;
+	  break;
 	case mask_bd_mode:
 	case mask_mode:
 	  if (reg > 0x7)
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 064ec48edad..9e8c827b934 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -638,8 +638,10 @@ enum
   Vex,
   /* How to encode VEX.vvvv:
      0: VEX.vvvv must be 1111b.
-     1: VEX.vvvv encodes one of the register operands.
+     1: VEX.vvvv encodes one of the src register operands.
+     2: VEX.vvvv encodes the dest register operand.
    */
+#define VexVVVV_DST   2
   VexVVVV,
   /* How the VEX.W bit is used:
      0: Set by the REX.W bit.
@@ -776,7 +778,7 @@ typedef struct i386_opcode_modifier
   unsigned int immext:1;
   unsigned int norex64:1;
   unsigned int vex:2;
-  unsigned int vexvvvv:1;
+  unsigned int vexvvvv:2;
   unsigned int vexw:2;
   unsigned int opcodeprefix:2;
   unsigned int sib:3;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 11b8c0b63cb..54c659099af 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -140,12 +140,16 @@
 
 #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
 
+#define DstVVVV VexVVVV=VexVVVV_DST
+
 // The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
 #define APX_F(cpuid) cpuid&(cpuid|APX_F)
 
 // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
 // the bit to mark commutative VEX encodings where swapping the source
 // operands may allow to switch from 3-byte to 2-byte VEX encoding.
+// Also re-use the bit to mark NDD insns where swapping the source operands
+// may allow switching from EVEX encoding to REX2 encoding.
 #define C StaticRounding
 
 #define FP 387|287|8087
@@ -292,26 +296,38 @@ std, 0xfd, 0, NoSuf, {}
 sti, 0xfb, 0, NoSuf, {}
 
 // Arithmetic.
+add, 0x0, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 add, 0x0, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+add, 0x83/0, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 add, 0x4, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+add, 0x80/0, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
 inc, 0x40, No64, No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32 }
+inc, 0xfe/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 inc, 0xfe/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+sub, 0x28, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sub, 0x28, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+sub, 0x83/5, APX_F, Modrm|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 sub, 0x83/5, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 sub, 0x2c, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+sub, 0x80/5, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sub, 0x80/5, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
 dec, 0x48, No64, No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32 }
+dec, 0xfe/1, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 dec, 0xfe/1, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+sbb, 0x18, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sbb, 0x18, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+sbb, 0x83/3, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 sbb, 0x83/3, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 sbb, 0x1c, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+sbb, 0x80/3, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sbb, 0x80/3, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+sbb, 0x80/3, APX_F, W|Modrm|EVexMap4|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
 cmp, 0x38, 0, D|W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 cmp, 0x83/7, 0, Modrm|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
@@ -322,30 +338,45 @@ test, 0x84, 0, D|W|C|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64, R
 test, 0xa8, 0, W|No_sSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 test, 0xf6/0, 0, W|Modrm|No_sSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+and, 0x20, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 and, 0x20, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+and, 0x83/4, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 and, 0x83/4, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock|Optimize, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 and, 0x24, 0, W|No_sSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+and, 0x80/4, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 and, 0x80/4, 0, W|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+or, 0x8, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 or, 0x8, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+or, 0x83/1, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 or, 0x83/1, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 or, 0xc, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+or, 0x80/1, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 or, 0x80/1, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+xor, 0x30, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 xor, 0x30, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+xor, 0x83/6, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 xor, 0x83/6, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 xor, 0x34, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+xor, 0x80/6, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 xor, 0x80/6, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
 // clr with 1 operand is really xor with 2 operands.
 clr, 0x30, 0, W|Modrm|No_sSuf|RegKludge|Optimize, { Reg8|Reg16|Reg32|Reg64 }
 
+adc, 0x10, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 adc, 0x10, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+adc, 0x83/2, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 adc, 0x83/2, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 adc, 0x14, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+adc, 0x80/2, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+neg, 0xf6/3, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 neg, 0xf6/3, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+
+not, 0xf6/2, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 not, 0xf6/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
 aaa, 0x37, No64, NoSuf, {}
@@ -379,6 +410,7 @@ cqto, 0x99, x64, Size64|NoSuf, {}
 // These multiplies can only be selected with single operand forms.
 mul, 0xf6/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 imul, 0xf6/5, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+imul, 0xaf, APX_F, C|Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64 }
 imul, 0xfaf, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 imul, 0x6b, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 imul, 0x69, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm16|Imm32|Imm32S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
@@ -393,52 +425,90 @@ div, 0xf6/6, 0, W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspe
 idiv, 0xf6/7, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 idiv, 0xf6/7, 0, W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
 
+rol, 0xd0/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rol, 0xd0/0, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rol, 0xc0/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rol, 0xc0/0, i186, W|Modrm|No_sSuf, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rol, 0xd2/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rol, 0xd2/0, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 rol, 0xd0/0, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+ror, 0xd0/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 ror, 0xd0/1, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+ror, 0xc0/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 ror, 0xc0/1, i186, W|Modrm|No_sSuf, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+ror, 0xd2/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 ror, 0xd2/1, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 ror, 0xd0/1, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rcl, 0xd0/2, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcl, 0xc0/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rcl, 0xc0/2, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcl, 0xc0/2, APX_F, W|Modrm|No_sSuf|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcl, 0xd2/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rcl, 0xd2/2, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcl, 0xd2/2, APX_F, W|Modrm|No_sSuf|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 rcl, 0xd0/2, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rcr, 0xd0/3, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcr, 0xc0/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rcr, 0xc0/3, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcr, 0xc0/3, APX_F, W|Modrm|No_sSuf|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcr, 0xd2/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 rcr, 0xd2/3, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcr, 0xd2/3, APX_F, W|Modrm|No_sSuf|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 rcr, 0xd0/3, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+sal, 0xd0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sal, 0xd0/4, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+sal, 0xc0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sal, 0xc0/4, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+sal, 0xd2/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sal, 0xd2/4, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 sal, 0xd0/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+shl, 0xd0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 shl, 0xd0/4, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+shl, 0xc0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 shl, 0xc0/4, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+shl, 0xd2/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 shl, 0xd2/4, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 shl, 0xd0/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+shr, 0xd0/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 shr, 0xd0/5, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+shr, 0xc0/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 shr, 0xc0/5, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+shr, 0xd2/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 shr, 0xd2/5, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 shr, 0xd0/5, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+sar, 0xd0/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sar, 0xd0/7, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+sar, 0xc0/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sar, 0xc0/7, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+sar, 0xd2/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 sar, 0xd2/7, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 sar, 0xd0/7, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+shld, 0x24, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 shld, 0xfa4, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+shld, 0xa5, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 shld, 0xfa5, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+shld, 0xa5, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 shld, 0xfa5, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
+shrd, 0x2c, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 shrd, 0xfac, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+shrd, 0xad, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 shrd, 0xfad, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
+shrd, 0xad, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 shrd, 0xfad, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
 
 // Control transfer instructions.
@@ -940,6 +1010,7 @@ ud2b, 0xfb9, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|U
 // 3rd official undefined instr (older CPUs don't take a ModR/M byte)
 ud0, 0xfff, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
+cmov<cc>, 0x4<cc:opc>, CMOV&APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64 }
 cmov<cc>, 0xf4<cc:opc>, CMOV, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 fcmovb, 0xda/0, i687, Modrm|NoSuf, { FloatReg, FloatAcc }
@@ -2031,8 +2102,12 @@ xcryptofb, 0xf30fa7e8, PadLock, NoSuf|RepPrefixOk, {}
 xstore, 0xfa7c0, PadLock, NoSuf|RepPrefixOk, {}
 
 // Multy-precision Add Carry, rdseed instructions.
+adcx, 0x6666, ADX&APX_F, C|Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|DstVVVV|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
 adcx, 0x660f38f6, ADX, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+adcx, 0x6666, ADX&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+adox, 0xf366, ADX&APX_F, C|Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|DstVVVV|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
 adox, 0xf30f38f6, ADX, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+adox, 0xf366, ADX&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 rdseed, 0xfc7/7, RdSeed, Modrm|NoSuf, { Reg16|Reg32|Reg64 }
 
 // SMAP instructions.
-- 
2.25.1



* [PATCH V5 6/9] Support APX Push2/Pop2
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
                   ` (4 preceding siblings ...)
  2023-12-28  1:27 ` [PATCH V5 5/9] Support APX NDD Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:55   ` H.J. Lu
  2023-12-28  1:27 ` [PATCH V5 7/9] Support APX pushp/popp Cui, Lili
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC (permalink / raw)
  To: binutils; +Cc: hongjiu.lu, jbeulich, Mo, Zewei

From: "Mo, Zewei" <zewei.mo@intel.com>

PPX functionality for PUSH/POP is not implemented in this patch
and will be implemented separately.

gas/ChangeLog:

2023-12-28  Zewei Mo <zewei.mo@intel.com>
            H.J. Lu  <hongjiu.lu@intel.com>
            Lili Cui <lili.cui@intel.com>

	* config/tc-i386.c (enum i386_error): New
	unsupported_rsp_register and invalid_dest_register_set.
	(md_assemble): Add handler for unsupported_rsp_register and
	invalid_dest_register_set.
	(check_APX_operands): Add invalid check for push2/pop2.
	(match_template): Handle check_APX_operands.
	* testsuite/gas/i386/i386.exp: Add apx-push2pop2 tests.
	* testsuite/gas/i386/x86-64.exp: Ditto.
	* testsuite/gas/i386/x86-64-apx-push2pop2.d: New test.
	* testsuite/gas/i386/x86-64-apx-push2pop2.s: Ditto.
	* testsuite/gas/i386/x86-64-apx-push2pop2-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-apx-push2pop2-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-apx-push2pop2-inval.s: Ditto.
	* testsuite/gas/i386/apx-push2pop2-inval.s: Ditto.
	* testsuite/gas/i386/apx-push2pop2-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Add bad
	testcases for POP2.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.

opcodes/ChangeLog:

	* i386-dis-evex-reg.h: Add REG_EVEX_MAP4_8F.
	* i386-dis-evex-w.h: Add EVEX_W_MAP4_8F_R_0 and EVEX_W_MAP4_FF_R_6.
	* i386-dis-evex.h: Add REG_EVEX_MAP4_8F.
	* i386-dis.c (PUSH2_POP2_Fixup): Add special handling for PUSH2/POP2.
	(get_valid_dis386): Add handler for vector length and address_mode for
	APX-Push2/Pop2 insn.
	(nd): Define nd as b for EVEX-promoted instructions.
	(OP_VEX): Add handler of 64-bit vvvv register for APX-Push2/Pop2 insn.
	* i386-gen.c: Add Push2Pop2 bitfield.
	* i386-opc.h: Regenerated.
	* i386-opc.tbl: Regenerated.
---
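The operand rules enforced by check_APX_operands can be sketched in
isolation as follows.  This is only an illustration of the rule, not the
gas code: the enum, RSP_REGNUM and check_push2pop2 are hypothetical
stand-ins, and the real function works on the global `i' and on
insn_template.  The order of the checks mirrors the patch: the POP2
distinct-destination test comes first, then the RSP test shared by
PUSH2 and POP2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes; gas itself sets i.error instead.  */
enum apx_check_result
{
  APX_OK,
  APX_SAME_DEST,   /* Corresponds to invalid_dest_register_set.  */
  APX_RSP_USED     /* Corresponds to unsupported_rsp_register.  */
};

#define RSP_REGNUM 4   /* Hardware register number of %rsp.  */

/* Validate the two register numbers (0..31) of a PUSH2/POP2 insn.  */
static enum apx_check_result
check_push2pop2 (bool is_pop, unsigned int reg0, unsigned int reg1)
{
  assert (reg0 < 32 && reg1 < 32);

  /* POP2 writes both operands, so they must be different registers.  */
  if (is_pop && reg0 == reg1)
    return APX_SAME_DEST;

  /* Neither PUSH2 nor POP2 may name the stack pointer.  */
  if (reg0 == RSP_REGNUM || reg1 == RSP_REGNUM)
    return APX_RSP_USED;

  return APX_OK;
}
```

For example, check_push2pop2 (true, 8, 8) yields APX_SAME_DEST, while
the same register pair is accepted for PUSH2.
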
 gas/config/tc-i386.c                          | 44 +++++++++++++++++++
 gas/testsuite/gas/i386/apx-push2pop2-inval.l  |  5 +++
 gas/testsuite/gas/i386/apx-push2pop2-inval.s  |  9 ++++
 gas/testsuite/gas/i386/i386.exp               |  1 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.d   |  5 +++
 .../gas/i386/x86-64-apx-evex-promoted-bad.s   |  7 +++
 .../gas/i386/x86-64-apx-push2pop2-intel.d     | 42 ++++++++++++++++++
 .../gas/i386/x86-64-apx-push2pop2-inval.l     | 13 ++++++
 .../gas/i386/x86-64-apx-push2pop2-inval.s     | 17 +++++++
 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d | 42 ++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s | 39 ++++++++++++++++
 gas/testsuite/gas/i386/x86-64.exp             |  3 ++
 opcodes/i386-dis-evex-reg.h                   |  9 ++++
 opcodes/i386-dis-evex-w.h                     | 10 +++++
 opcodes/i386-dis-evex.h                       |  2 +-
 opcodes/i386-dis.c                            | 31 +++++++++++++
 opcodes/i386-opc.tbl                          |  9 ++++
 17 files changed, 287 insertions(+), 1 deletion(-)
 create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.l
 create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
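
As a reading aid for the expected encodings in the new .d files, the
sketch below recovers the two register numbers from a 6-byte
register-form PUSH2/POP2 encoding.  It is not taken from i386-dis.c;
the struct and function names are hypothetical, and the bit positions
are my reading of the APX extended EVEX layout (inverted vvvv/V4 and
B3, non-inverted B4, ND in the EVEX.b position), cross-checked only
against the byte sequences listed in x86-64-apx-push2pop2-intel.d.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the decoded fields.  */
struct push2pop2_regs
{
  unsigned int vvvv_reg;   /* Register encoded in EVEX.vvvv plus V4.  */
  unsigned int rm_reg;     /* Register encoded in ModRM.rm plus B3/B4.  */
  unsigned int nd;         /* EVEX.ND, required to be 1 for these insns.  */
};

/* Decode a register-form encoding: 62 P0 P1 P2 opcode modrm.  */
static struct push2pop2_regs
decode_push2pop2 (const uint8_t insn[6])
{
  struct push2pop2_regs r;
  uint8_t p0 = insn[1], p1 = insn[2], p2 = insn[3], modrm = insn[5];

  assert (insn[0] == 0x62);       /* EVEX escape byte.  */
  assert ((p0 & 0x7) == 4);       /* Opcode map 4.  */
  assert ((modrm >> 6) == 3);     /* Register operand in ModRM.rm.  */

  /* vvvv (P1 bits 6:3) and V4 (P2 bit 3) are stored inverted.  */
  r.vvvv_reg = (((p1 >> 3) & 0xf) ^ 0xf) | ((((p2 >> 3) & 1) ^ 1) << 4);

  /* B3 (P0 bit 5) is stored inverted, B4 (P0 bit 3) is not.  */
  r.rm_reg = (modrm & 7) | ((((p0 >> 5) & 1) ^ 1) << 3)
	     | (((p0 >> 3) & 1) << 4);

  r.nd = (p2 >> 4) & 1;
  return r;
}
```

Under these assumptions, "62 fc 3c 18 ff f1" decodes to vvvv register 8
and rm register 17, which matches the "push2 r8,r17" line in the Intel
syntax test.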

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 99b484122e1..8af98e435ef 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -250,6 +250,7 @@ enum i386_error
     invalid_vector_register_set,
     invalid_tmm_register_set,
     invalid_dest_and_src_register_set,
+    invalid_dest_register_set,
     invalid_pseudo_prefix,
     unsupported_vector_index_register,
     unsupported_broadcast,
@@ -259,6 +260,7 @@ enum i386_error
     no_default_mask,
     unsupported_rc_sae,
     unsupported_vector_size,
+    unsupported_rsp_register,
     internal_error,
   };
 
@@ -5510,6 +5512,9 @@ md_assemble (char *line)
 	case invalid_dest_and_src_register_set:
 	  err_msg = _("destination and source registers must be distinct");
 	  break;
+	case invalid_dest_register_set:
+	  err_msg = _("two dest registers must be distinct");
+	  break;
 	case invalid_pseudo_prefix:
 	  err_msg = _("rex2 pseudo prefix cannot be used");
 	  break;
@@ -5538,6 +5543,9 @@ md_assemble (char *line)
 	  as_bad (_("vector size above %u required for `%s'"), 128u << vector_size,
 		  pass1_mnem ? pass1_mnem : insn_name (current_templates.start));
 	  return;
+	case unsupported_rsp_register:
+	  err_msg = _("'rsp' register cannot be used");
+	  break;
 	case internal_error:
 	  err_msg = _("internal error");
 	  break;
@@ -7174,6 +7182,35 @@ check_EgprOperands (const insn_template *t)
   return 0;
 }
 
+/* Check if APX operands are valid for the instruction.  */
+static bool
+check_APX_operands (const insn_template *t)
+{
+  /* Push2* and Pop2* cannot use RSP, and Pop2* cannot pop into the same
+     register twice.  */
+  switch (t->mnem_off)
+    {
+    case MN_pop2:
+    case MN_pop2p:
+      if (register_number (i.op[0].regs) == register_number (i.op[1].regs))
+	{
+	  i.error = invalid_dest_register_set;
+	  return 1;
+	}
+    /* fall through */
+    case MN_push2:
+    case MN_push2p:
+      if (register_number (i.op[0].regs) == 4
+	  || register_number (i.op[1].regs) == 4)
+	{
+	  i.error = unsupported_rsp_register;
+	  return 1;
+	}
+      break;
+    }
+  return 0;
+}
+
 /* Helper function for the progress() macro in match_template().  */
 static INLINE enum i386_error progress (enum i386_error new,
 					enum i386_error last,
@@ -7674,6 +7711,13 @@ match_template (char mnem_suffix)
 	  continue;
 	}
 
+      /* Check if APX operands are valid.  */
+      if (check_APX_operands (t))
+	{
+	  specific_error = progress (i.error);
+	  continue;
+	}
+
       /* Check whether to use the shorter VEX encoding for certain insns where
 	 the EVEX encoding comes first in the table.  This requires the respective
 	 AVX-* feature to be explicitly enabled.
diff --git a/gas/testsuite/gas/i386/apx-push2pop2-inval.l b/gas/testsuite/gas/i386/apx-push2pop2-inval.l
new file mode 100644
index 00000000000..a55a71520c8
--- /dev/null
+++ b/gas/testsuite/gas/i386/apx-push2pop2-inval.l
@@ -0,0 +1,5 @@
+.* Assembler messages:
+.*:6: Error: `push2' is only supported in 64-bit mode
+.*:7: Error: `push2p' is only supported in 64-bit mode
+.*:8: Error: `pop2' is only supported in 64-bit mode
+.*:9: Error: `pop2p' is only supported in 64-bit mode
diff --git a/gas/testsuite/gas/i386/apx-push2pop2-inval.s b/gas/testsuite/gas/i386/apx-push2pop2-inval.s
new file mode 100644
index 00000000000..77166327ed1
--- /dev/null
+++ b/gas/testsuite/gas/i386/apx-push2pop2-inval.s
@@ -0,0 +1,9 @@
+# Check 32bit APX-PUSH2/POP2 instructions
+
+	.allow_index_reg
+	.text
+_start:
+	push2 %rax, %rbx
+	push2p %rax, %rbx
+	pop2 %rax, %rbx
+	pop2p %rax, %rbx
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 3917be6be70..f9ee85b4bb3 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -511,6 +511,7 @@ if [gas_32_check] then {
     run_dump_test "sm4-intel"
     run_list_test "pbndkb-inval"
     run_list_test "user_msr-inval"
+    run_list_test "apx-push2pop2-inval"
     run_list_test "sg"
     run_dump_test "clzero"
     run_dump_test "invlpgb"
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
index ba14736c3a8..3bfb5dec202 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
@@ -34,3 +34,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:[ 	]+62 f4 e4[ 	]+\(bad\)
 [ 	]*[a-f0-9]+:[ 	]+08 ff[ 	]+.*
 [ 	]*[a-f0-9]+:[ 	]+04 08[ 	]+.*
+[ 	]*[a-f0-9]+:[ 	]+62 f4 3c[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+08 8f c0 ff ff ff[ 	]+or.*
+[ 	]*[a-f0-9]+:[ 	]+62 74 7c 18 8f c0[ 	]+pop2   %rax,\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+62 d4 3c 18 8f[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+c0[ 	]+.*
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
index fcbb1b93659..fde6736e9b2 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
@@ -40,3 +40,10 @@ _start:
 
 	#{evex} inc %rax %rbx EVEX.vvvv != 1111 && EVEX.ND = 0.
 	.insn EVEX.L0.NP.M4.W1 0xff/0, (%rax,%rcx), %rbx
+	# pop2 %rax, %r8 set EVEX.ND=0.
+	.insn EVEX.L0.M4.W0 0x8f/0,  %rax, %r8
+	.byte 0xff, 0xff, 0xff
+	# pop2 %rax, %r8 set EVEX.vvvv = 1111.
+	.insn EVEX.L0.M4.W0 0x8f,  %rax, {rn-sae},%r8
+	# pop2 %r8, %r8.
+	.insn EVEX.L0.M4.W0 0x8f/0,  %r8,{rn-sae}, %r8
diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
new file mode 100644
index 00000000000..46b21219582
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
@@ -0,0 +1,42 @@
+#as: --64
+#objdump: -dw -Mintel
+#name: i386 APX-push2pop2 insns (Intel disassembly)
+#source: x86-64-apx-push2pop2.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 f4 7c 18 ff f3\s+push2\s+rax,rbx
+\s*[a-f0-9]+:\s*62 fc 3c 18 ff f1\s+push2\s+r8,r17
+\s*[a-f0-9]+:\s*62 d4 04 10 ff f1\s+push2\s+r31,r9
+\s*[a-f0-9]+:\s*62 dc 3c 10 ff f7\s+push2\s+r24,r31
+\s*[a-f0-9]+:\s*62 f4 fc 18 ff f3\s+push2p\s+rax,rbx
+\s*[a-f0-9]+:\s*62 fc bc 18 ff f1\s+push2p\s+r8,r17
+\s*[a-f0-9]+:\s*62 d4 84 10 ff f1\s+push2p\s+r31,r9
+\s*[a-f0-9]+:\s*62 dc bc 10 ff f7\s+push2p\s+r24,r31
+\s*[a-f0-9]+:\s*62 f4 64 18 8f c0\s+pop2\s+rbx,rax
+\s*[a-f0-9]+:\s*62 d4 74 10 8f c0\s+pop2\s+r17,r8
+\s*[a-f0-9]+:\s*62 dc 34 18 8f c7\s+pop2\s+r9,r31
+\s*[a-f0-9]+:\s*62 dc 04 10 8f c0\s+pop2\s+r31,r24
+\s*[a-f0-9]+:\s*62 f4 e4 18 8f c0\s+pop2p\s+rbx,rax
+\s*[a-f0-9]+:\s*62 d4 f4 10 8f c0\s+pop2p\s+r17,r8
+\s*[a-f0-9]+:\s*62 dc b4 18 8f c7\s+pop2p\s+r9,r31
+\s*[a-f0-9]+:\s*62 dc 84 10 8f c0\s+pop2p\s+r31,r24
+\s*[a-f0-9]+:\s*62 f4 7c 18 ff f3\s+push2\s+rax,rbx
+\s*[a-f0-9]+:\s*62 fc 3c 18 ff f1\s+push2\s+r8,r17
+\s*[a-f0-9]+:\s*62 d4 04 10 ff f1\s+push2\s+r31,r9
+\s*[a-f0-9]+:\s*62 dc 3c 10 ff f7\s+push2\s+r24,r31
+\s*[a-f0-9]+:\s*62 f4 fc 18 ff f3\s+push2p\s+rax,rbx
+\s*[a-f0-9]+:\s*62 fc bc 18 ff f1\s+push2p\s+r8,r17
+\s*[a-f0-9]+:\s*62 d4 84 10 ff f1\s+push2p\s+r31,r9
+\s*[a-f0-9]+:\s*62 dc bc 10 ff f7\s+push2p\s+r24,r31
+\s*[a-f0-9]+:\s*62 f4 64 18 8f c0\s+pop2\s+rbx,rax
+\s*[a-f0-9]+:\s*62 d4 74 10 8f c0\s+pop2\s+r17,r8
+\s*[a-f0-9]+:\s*62 dc 34 18 8f c7\s+pop2\s+r9,r31
+\s*[a-f0-9]+:\s*62 dc 04 10 8f c0\s+pop2\s+r31,r24
+\s*[a-f0-9]+:\s*62 f4 e4 18 8f c0\s+pop2p\s+rbx,rax
+\s*[a-f0-9]+:\s*62 d4 f4 10 8f c0\s+pop2p\s+r17,r8
+\s*[a-f0-9]+:\s*62 dc b4 18 8f c7\s+pop2p\s+r9,r31
+\s*[a-f0-9]+:\s*62 dc 84 10 8f c0\s+pop2p\s+r31,r24
diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
new file mode 100644
index 00000000000..2cd142885a1
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
@@ -0,0 +1,13 @@
+.* Assembler messages:
+.*:6: Error: operand size mismatch for `push2'
+.*:7: Error: operand size mismatch for `push2'
+.*:8: Error: 'rsp' register cannot be used for `push2'
+.*:9: Error: 'rsp' register cannot be used for `push2'
+.*:10: Error: operand size mismatch for `push2p'
+.*:11: Error: 'rsp' register cannot be used for `push2p'
+.*:12: Error: operand size mismatch for `pop2'
+.*:13: Error: 'rsp' register cannot be used for `pop2'
+.*:14: Error: 'rsp' register cannot be used for `pop2'
+.*:15: Error: two dest registers must be distinct for `pop2'
+.*:16: Error: 'rsp' register cannot be used for `pop2p'
+.*:17: Error: two dest registers must be distinct for `pop2p'
diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
new file mode 100644
index 00000000000..83cef97d57e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
@@ -0,0 +1,17 @@
+# Check illegal APX-Push2Pop2 instructions
+
+	.allow_index_reg
+	.text
+_start:
+	push2  %ax, %bx
+	push2  %eax, %ebx
+	push2  %rsp, %r17
+	push2  %r17, %rsp
+	push2p %eax, %ebx
+	push2p %rsp, %r17
+	pop2   %ax, %bx
+	pop2   %rax, %rsp
+	pop2   %rsp, %rax
+	pop2   %r12, %r12
+	pop2p  %rax, %rsp
+	pop2p  %r12, %r12
diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2.d b/gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
new file mode 100644
index 00000000000..54f22a7f94e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
@@ -0,0 +1,42 @@
+#as: --64
+#objdump: -dw
+#name: x86_64 APX-push2pop2 insns
+#source: x86-64-apx-push2pop2.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 f4 7c 18 ff f3\s+push2\s+%rbx,%rax
+\s*[a-f0-9]+:\s*62 fc 3c 18 ff f1\s+push2\s+%r17,%r8
+\s*[a-f0-9]+:\s*62 d4 04 10 ff f1\s+push2\s+%r9,%r31
+\s*[a-f0-9]+:\s*62 dc 3c 10 ff f7\s+push2\s+%r31,%r24
+\s*[a-f0-9]+:\s*62 f4 fc 18 ff f3\s+push2p\s+%rbx,%rax
+\s*[a-f0-9]+:\s*62 fc bc 18 ff f1\s+push2p\s+%r17,%r8
+\s*[a-f0-9]+:\s*62 d4 84 10 ff f1\s+push2p\s+%r9,%r31
+\s*[a-f0-9]+:\s*62 dc bc 10 ff f7\s+push2p\s+%r31,%r24
+\s*[a-f0-9]+:\s*62 f4 64 18 8f c0\s+pop2\s+%rax,%rbx
+\s*[a-f0-9]+:\s*62 d4 74 10 8f c0\s+pop2\s+%r8,%r17
+\s*[a-f0-9]+:\s*62 dc 34 18 8f c7\s+pop2\s+%r31,%r9
+\s*[a-f0-9]+:\s*62 dc 04 10 8f c0\s+pop2\s+%r24,%r31
+\s*[a-f0-9]+:\s*62 f4 e4 18 8f c0\s+pop2p\s+%rax,%rbx
+\s*[a-f0-9]+:\s*62 d4 f4 10 8f c0\s+pop2p\s+%r8,%r17
+\s*[a-f0-9]+:\s*62 dc b4 18 8f c7\s+pop2p\s+%r31,%r9
+\s*[a-f0-9]+:\s*62 dc 84 10 8f c0\s+pop2p\s+%r24,%r31
+\s*[a-f0-9]+:\s*62 f4 7c 18 ff f3\s+push2\s+%rbx,%rax
+\s*[a-f0-9]+:\s*62 fc 3c 18 ff f1\s+push2\s+%r17,%r8
+\s*[a-f0-9]+:\s*62 d4 04 10 ff f1\s+push2\s+%r9,%r31
+\s*[a-f0-9]+:\s*62 dc 3c 10 ff f7\s+push2\s+%r31,%r24
+\s*[a-f0-9]+:\s*62 f4 fc 18 ff f3\s+push2p\s+%rbx,%rax
+\s*[a-f0-9]+:\s*62 fc bc 18 ff f1\s+push2p\s+%r17,%r8
+\s*[a-f0-9]+:\s*62 d4 84 10 ff f1\s+push2p\s+%r9,%r31
+\s*[a-f0-9]+:\s*62 dc bc 10 ff f7\s+push2p\s+%r31,%r24
+\s*[a-f0-9]+:\s*62 f4 64 18 8f c0\s+pop2\s+%rax,%rbx
+\s*[a-f0-9]+:\s*62 d4 74 10 8f c0\s+pop2\s+%r8,%r17
+\s*[a-f0-9]+:\s*62 dc 34 18 8f c7\s+pop2\s+%r31,%r9
+\s*[a-f0-9]+:\s*62 dc 04 10 8f c0\s+pop2\s+%r24,%r31
+\s*[a-f0-9]+:\s*62 f4 e4 18 8f c0\s+pop2p\s+%rax,%rbx
+\s*[a-f0-9]+:\s*62 d4 f4 10 8f c0\s+pop2p\s+%r8,%r17
+\s*[a-f0-9]+:\s*62 dc b4 18 8f c7\s+pop2p\s+%r31,%r9
+\s*[a-f0-9]+:\s*62 dc 84 10 8f c0\s+pop2p\s+%r24,%r31
diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2.s b/gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
new file mode 100644
index 00000000000..5c28c13ba2e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
@@ -0,0 +1,39 @@
+# Check 64bit APX-Push2Pop2 instructions
+
+	.allow_index_reg
+	.text
+_start:
+	push2 %rbx, %rax
+	push2 %r17, %r8
+	push2 %r9, %r31
+	push2 %r31, %r24
+	push2p %rbx, %rax
+	push2p %r17, %r8
+	push2p %r9, %r31
+	push2p %r31, %r24
+	pop2 %rax, %rbx
+	pop2 %r8, %r17
+	pop2 %r31, %r9
+	pop2 %r24, %r31
+	pop2p %rax, %rbx
+	pop2p %r8, %r17
+	pop2p %r31, %r9
+	pop2p %r24, %r31
+
+	.intel_syntax noprefix
+	push2 rax, rbx
+	push2 r8, r17
+	push2 r31, r9
+	push2 r24, r31
+	push2p rax, rbx
+	push2p r8, r17
+	push2p r31, r9
+	push2p r24, r31
+	pop2 rbx, rax
+	pop2 r17, r8
+	pop2 r9, r31
+	pop2 r31, r24
+	pop2p rbx, rax
+	pop2p r17, r8
+	pop2p r9, r31
+	pop2p r31, r24
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 3a3438a5de3..0e7b5d0c073 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -345,6 +345,9 @@ run_dump_test "x86-64-avx512dq-rcigrd-intel"
 run_dump_test "x86-64-avx512dq-rcigrd"
 run_dump_test "x86-64-avx512dq-rcigrne-intel"
 run_dump_test "x86-64-avx512dq-rcigrne"
+run_dump_test "x86-64-apx-push2pop2"
+run_dump_test "x86-64-apx-push2pop2-intel"
+run_list_test "x86-64-apx-push2pop2-inval"
 run_dump_test "x86-64-avx512dq-rcigru-intel"
 run_dump_test "x86-64-avx512dq-rcigru"
 run_dump_test "x86-64-avx512dq-rcigrz-intel"
diff --git a/opcodes/i386-dis-evex-reg.h b/opcodes/i386-dis-evex-reg.h
index cac3c39c4c5..81bb41646c5 100644
--- a/opcodes/i386-dis-evex-reg.h
+++ b/opcodes/i386-dis-evex-reg.h
@@ -79,6 +79,10 @@
     { "subQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
     { "xorQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
   },
+  /* REG_EVEX_MAP4_8F */
+  {
+    { VEX_W_TABLE (EVEX_W_MAP4_8F_R_0) },
+  },
   /* REG_EVEX_MAP4_F6 */
   {
     { Bad_Opcode },
@@ -102,4 +106,9 @@
   {
     { "incQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
     { "decQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { VEX_W_TABLE (EVEX_W_MAP4_FF_R_6) },
   },
diff --git a/opcodes/i386-dis-evex-w.h b/opcodes/i386-dis-evex-w.h
index b828277d413..12ab29544bb 100644
--- a/opcodes/i386-dis-evex-w.h
+++ b/opcodes/i386-dis-evex-w.h
@@ -442,6 +442,16 @@
     { Bad_Opcode },
     { "vpshrdw",   { XM, Vex, EXx, Ib }, 0 },
   },
+  /* EVEX_W_MAP4_8F_R_0 */
+  {
+    { "pop2", { { PUSH2_POP2_Fixup, q_mode}, Eq }, NO_PREFIX },
+    { "pop2p", { { PUSH2_POP2_Fixup, q_mode}, Eq }, NO_PREFIX },
+  },
+  /* EVEX_W_MAP4_FF_R_6 */
+  {
+    { "push2", { { PUSH2_POP2_Fixup, q_mode}, Eq }, 0 },
+    { "push2p", { { PUSH2_POP2_Fixup, q_mode}, Eq }, 0 },
+  },
   /* EVEX_W_MAP5_5B_P_0 */
   {
     { "vcvtdq2ph%XY",	{ XMxmmq, EXx, EXxEVexR }, 0 },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index a8a891d7f0e..4f2ec966457 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -1035,7 +1035,7 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_EVEX_MAP4_8F) },
     /* 90 */
     { Bad_Opcode },
     { Bad_Opcode },
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 1bb2882d839..9daef6fa9fd 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -105,6 +105,7 @@ static bool FXSAVE_Fixup (instr_info *, int, int);
 static bool MOVSXD_Fixup (instr_info *, int, int);
 static bool DistinctDest_Fixup (instr_info *, int, int);
 static bool PREFETCHI_Fixup (instr_info *, int, int);
+static bool PUSH2_POP2_Fixup (instr_info *, int, int);
 
 static void ATTRIBUTE_PRINTF_3 i386_dis_printf (const disassemble_info *,
 						enum disassembler_style,
@@ -900,6 +901,7 @@ enum
   REG_EVEX_MAP4_80,
   REG_EVEX_MAP4_81,
   REG_EVEX_MAP4_83,
+  REG_EVEX_MAP4_8F,
   REG_EVEX_MAP4_F6,
   REG_EVEX_MAP4_F7,
   REG_EVEX_MAP4_FE,
@@ -1739,6 +1741,9 @@ enum
   EVEX_W_0F3A70,
   EVEX_W_0F3A72,
 
+  EVEX_W_MAP4_8F_R_0,
+  EVEX_W_MAP4_FF_R_6,
+
   EVEX_W_MAP5_5B_P_0,
   EVEX_W_MAP5_7A_P_3,
 };
@@ -13510,6 +13515,9 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
 	case b_mode:
 	  names = att_names8rex;
 	  break;
+	case q_mode:
+	  names = att_names64;
+	  break;
 	case mask_bd_mode:
 	case mask_mode:
 	  if (reg > 0x7)
@@ -13894,3 +13902,26 @@ PREFETCHI_Fixup (instr_info *ins, int bytemode, int sizeflag)
 
   return OP_M (ins, bytemode, sizeflag);
 }
+
+static bool
+PUSH2_POP2_Fixup (instr_info *ins, int bytemode, int sizeflag)
+{
+  if (ins->modrm.mod != 3)
+    return true;
+
+  unsigned int vvvv_reg = ins->vex.register_specifier
+    | (!ins->vex.v << 4);
+  unsigned int rm_reg = ins->modrm.rm + (ins->rex & REX_B ? 8 : 0)
+    + (ins->rex2 & REX_B ? 16 : 0);
+
+  /* Push2/Pop2 cannot use RSP, and Pop2 needs two distinct registers.  */
+  if (!ins->vex.nd || vvvv_reg == 0x4 || rm_reg == 0x4
+      || (!ins->modrm.reg
+	  && vvvv_reg == rm_reg))
+    {
+      oappend (ins, "(bad)");
+      return true;
+    }
+
+  return OP_VEX (ins, bytemode, sizeflag);
+}
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 54c659099af..900ca36d286 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3480,3 +3480,12 @@ uwrmsr, 0xf30f38f8, USER_MSR, Modrm|NoSuf|NoRex64, { Reg64, Reg64 }
 uwrmsr, 0xf3f8/0, USER_MSR, Modrm|Vex128|VexMap7|VexW0|NoSuf, { Imm32, Reg64 }
 
 // USER_MSR instructions end.
+
+// APX Push2/Pop2 instructions.
+
+push2, 0xff/6, APX_F, Modrm|VexW0|EVex128|EVexMap4|VexVVVV|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
+push2p, 0xff/6, APX_F, Modrm|VexW1|EVex128|EVexMap4|VexVVVV|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
+pop2, 0x8f/0, APX_F, Modrm|VexW0|EVex128|EVexMap4|VexVVVV|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
+pop2p, 0x8f/0, APX_F, Modrm|VexW1|EVex128|EVexMap4|VexVVVV|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
+
+// APX Push2/Pop2 instructions end.
-- 
2.25.1



* [PATCH V5 7/9] Support APX pushp/popp
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
                   ` (5 preceding siblings ...)
  2023-12-28  1:27 ` [PATCH V5 6/9] Support APX Push2/Pop2 Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:56   ` H.J. Lu
  2023-12-28  1:27 ` [PATCH V5 8/9] Support APX NDD optimized encoding Cui, Lili
  2023-12-28  1:27 ` [PATCH V5 9/9] Support APX JMPABS for disassembler Cui, Lili
  8 siblings, 1 reply; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC (permalink / raw)
  To: binutils; +Cc: hongjiu.lu, jbeulich

gas/ChangeLog:

	* config/tc-i386.c (is_apx_rex2_encoding): Handle PUSHP/POPP, which
	require REX2 with REX2.W == 1.
	* testsuite/gas/i386/x86-64.exp: Add new test for PUSHP/POPP.
	* testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d: New test.
	* testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-apx-pushp-popp.d: Ditto.
	* testsuite/gas/i386/x86-64-apx-pushp-popp.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (putop): Print pushp and popp.
	* i386-opc.h (REX2_REQUIRED): New.
	* i386-opc.tbl: Add new insns.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Regenerated.
	* i386-tbl.h: Regenerated.
---
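As a reading aid for the expected bytes in the new tests (d5 08 50 for
pushp %rax, d5 19 57 for pushp %r31), here is a minimal sketch of how the
three bytes are formed. It is not code from gas: encode_rex2_push_pop is a
hypothetical helper, and the REX2 payload bit positions (W at bit 3, B3 at
bit 0, B4 at bit 4) are an assumption that happens to agree with the test
expectations in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only (not part of gas).
   REG is the GPR number 0..31, BASE is 0x50 (push) or 0x58 (pop).
   Writes the three encoded bytes to OUT.  */
static void
encode_rex2_push_pop (unsigned int reg, uint8_t base, uint8_t out[3])
{
  assert (reg < 32);
  assert (base == 0x50 || base == 0x58);

  uint8_t payload = 0x08;	/* REX2.W = 1 selects pushp/popp.  */
  if (reg & 8)
    payload |= 0x01;		/* Assumed position of REX2.B3.  */
  if (reg & 16)
    payload |= 0x10;		/* Assumed position of REX2.B4.  */

  out[0] = 0xd5;		/* REX2 prefix byte.  */
  out[1] = payload;
  out[2] = base | (reg & 7);	/* Low three register bits go in the opcode.  */
}
```

With reg = 31 and base = 0x58 this yields d5 19 5f, which is the popp %r31
line in x86-64-apx-pushp-popp.d.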
 gas/config/tc-i386.c                          |  3 +-
 .../gas/i386/x86-64-apx-pushp-popp-intel.d    | 14 +++++
 .../gas/i386/x86-64-apx-pushp-popp-inval.l    |  5 ++
 .../gas/i386/x86-64-apx-pushp-popp-inval.s    |  7 +++
 .../gas/i386/x86-64-apx-pushp-popp.d          | 14 +++++
 .../gas/i386/x86-64-apx-pushp-popp.s          |  8 +++
 gas/testsuite/gas/i386/x86-64.exp             |  3 +
 opcodes/i386-dis.c                            | 55 ++++++++++++-------
 opcodes/i386-opc.h                            |  2 +
 opcodes/i386-opc.tbl                          |  3 +
 10 files changed, 94 insertions(+), 20 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 8af98e435ef..640c6511f20 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -3910,7 +3910,8 @@ is_apx_evex_encoding (void)
 static INLINE bool
 is_apx_rex2_encoding (void)
 {
-  return i.rex2 || i.rex2_encoding;
+  return i.rex2 || i.rex2_encoding
+	|| i.tm.opcode_modifier.operandconstraint == REX2_REQUIRED;
 }
 
 static unsigned int
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
new file mode 100644
index 00000000000..44e3e96a5df
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
@@ -0,0 +1,14 @@
+#as:
+#objdump: -dw -Mintel
+#name: x86_64 APX_F pushp popp insns (Intel disassembly)
+#source: x86-64-apx-pushp-popp.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*d5 08 50[	 ]+pushp  rax
+\s*[a-f0-9]+:\s*d5 19 57[	 ]+pushp  r31
+\s*[a-f0-9]+:\s*d5 08 58[	 ]+popp   rax
+\s*[a-f0-9]+:\s*d5 19 5f[	 ]+popp   r31
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
new file mode 100644
index 00000000000..c4d774b9673
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
@@ -0,0 +1,5 @@
+.* Assembler messages:
+.*:4: Error: operand size mismatch for `pushp'
+.*:5: Error: operand size mismatch for `popp'
+.*:6: Error: operand size mismatch for `pushp'
+.*:7: Error: operand size mismatch for `popp'
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
new file mode 100644
index 00000000000..28ed5d8145a
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
@@ -0,0 +1,7 @@
+# Check that illegal APX_F pushp/popp operands are rejected.
+
+	.text
+	pushp %eax
+	popp  %eax
+	pushp (%rax)
+	popp  (%rax)
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
new file mode 100644
index 00000000000..b20e5ba9a35
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
@@ -0,0 +1,14 @@
+#as:
+#objdump: -dw
+#name: x86_64 APX_F pushp popp insns
+#source: x86-64-apx-pushp-popp.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*d5 08 50[ 	]+pushp  %rax
+\s*[a-f0-9]+:\s*d5 19 57[ 	]+pushp  %r31
+\s*[a-f0-9]+:\s*d5 08 58[ 	]+popp   %rax
+\s*[a-f0-9]+:\s*d5 19 5f[ 	]+popp   %r31
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
new file mode 100644
index 00000000000..0ea66d0e70c
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
@@ -0,0 +1,8 @@
+# Check 64bit APX_F pushp popp instructions
+
+       .text
+ _start:
+	pushp %rax
+	pushp %r31
+	popp  %rax
+	popp  %r31
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 0e7b5d0c073..1b13c52454e 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -348,6 +348,9 @@ run_dump_test "x86-64-avx512dq-rcigrne"
 run_dump_test "x86-64-apx-push2pop2"
 run_dump_test "x86-64-apx-push2pop2-intel"
 run_list_test "x86-64-apx-push2pop2-inval"
+run_dump_test "x86-64-apx-pushp-popp"
+run_dump_test "x86-64-apx-pushp-popp-intel"
+run_list_test "x86-64-apx-pushp-popp-inval"
 run_dump_test "x86-64-avx512dq-rcigru-intel"
 run_dump_test "x86-64-avx512dq-rcigru"
 run_dump_test "x86-64-avx512dq-rcigrz-intel"
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 9daef6fa9fd..e851fb376d9 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -301,6 +301,9 @@ struct dis_private {
 #define EVEX_len_used 2
 
 
+/* {rex2} is not printed when REX2_SPECIAL is set.  */
+#define REX2_SPECIAL 16
+
 /* Flags stored in PREFIXES.  */
 #define PREFIX_REPZ 1
 #define PREFIX_REPNZ 2
@@ -1924,23 +1927,23 @@ static const struct dis386 dis386[] = {
   { "dec{S|}",		{ RMeSI }, 0 },
   { "dec{S|}",		{ RMeDI }, 0 },
   /* 50 */
-  { "push{!P|}",		{ RMrAX }, 0 },
-  { "push{!P|}",		{ RMrCX }, 0 },
-  { "push{!P|}",		{ RMrDX }, 0 },
-  { "push{!P|}",		{ RMrBX }, 0 },
-  { "push{!P|}",		{ RMrSP }, 0 },
-  { "push{!P|}",		{ RMrBP }, 0 },
-  { "push{!P|}",		{ RMrSI }, 0 },
-  { "push{!P|}",		{ RMrDI }, 0 },
+  { "push!P",		{ RMrAX }, 0 },
+  { "push!P",		{ RMrCX }, 0 },
+  { "push!P",		{ RMrDX }, 0 },
+  { "push!P",		{ RMrBX }, 0 },
+  { "push!P",		{ RMrSP }, 0 },
+  { "push!P",		{ RMrBP }, 0 },
+  { "push!P",		{ RMrSI }, 0 },
+  { "push!P",		{ RMrDI }, 0 },
   /* 58 */
-  { "pop{!P|}",		{ RMrAX }, 0 },
-  { "pop{!P|}",		{ RMrCX }, 0 },
-  { "pop{!P|}",		{ RMrDX }, 0 },
-  { "pop{!P|}",		{ RMrBX }, 0 },
-  { "pop{!P|}",		{ RMrSP }, 0 },
-  { "pop{!P|}",		{ RMrBP }, 0 },
-  { "pop{!P|}",		{ RMrSI }, 0 },
-  { "pop{!P|}",		{ RMrDI }, 0 },
+  { "pop!P",		{ RMrAX }, 0 },
+  { "pop!P",		{ RMrCX }, 0 },
+  { "pop!P",		{ RMrDX }, 0 },
+  { "pop!P",		{ RMrBX }, 0 },
+  { "pop!P",		{ RMrSP }, 0 },
+  { "pop!P",		{ RMrBP }, 0 },
+  { "pop!P",		{ RMrSI }, 0 },
+  { "pop!P",		{ RMrDI }, 0 },
   /* 60 */
   { X86_64_TABLE (X86_64_60) },
   { X86_64_TABLE (X86_64_61) },
@@ -9783,9 +9786,10 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
 
   /* Check if the REX2 prefix is used.  */
   if (ins.last_rex2_prefix >= 0
-      && ((ins.rex2 & 0x7) ^ (ins.rex2_used & 0x7)) == 0
-      && (ins.rex ^ ins.rex_used) == 0
-      && (ins.rex2 & 0x7))
+      && ((ins.rex2 & REX2_SPECIAL)
+	  || (((ins.rex2 & 7) ^ (ins.rex2_used & 7)) == 0
+	      && (ins.rex ^ ins.rex_used) == 0
+	      && (ins.rex2 & 7))))
     ins.all_prefixes[ins.last_rex2_prefix] = 0;
 
   /* Check if the SEG prefix is used.  */
@@ -10632,6 +10636,19 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
 	case 'P':
 	  if (l == 0)
 	    {
+	      if (!cond && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
+		{
+		  /* For pushp and popp, print the 'p' suffix and do not
+		     print {rex2}.  */
+		  *ins->obufp++ = 'p';
+		  ins->rex2 |= REX2_SPECIAL;
+		  break;
+		}
+
+	      /* For "!P" print nothing else in Intel syntax.  */
+	      if (!cond && ins->intel_syntax)
+		break;
+
 	      if ((ins->modrm.mod == 3 || !cond)
 		  && !(sizeflag & SUFFIX_ALWAYS))
 		break;
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 9e8c827b934..8db6c51538a 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -579,6 +579,8 @@ enum
   /* Instrucion requires that destination must be distinct from source
      registers.  */
 #define DISTINCT_DEST 9
+  /* Instruction requires REX2 prefix.  */
+#define REX2_REQUIRED 10
   OperandConstraint,
   /* instruction ignores operand size prefix and in Intel mode ignores
      mnemonic size suffix check.  */
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 900ca36d286..edd9f73ae22 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -85,6 +85,7 @@
 #define RegKludge         OperandConstraint=REG_KLUDGE
 #define SwapSources       OperandConstraint=SWAP_SOURCES
 #define Ugh               OperandConstraint=UGH
+#define Rex2              OperandConstraint=REX2_REQUIRED
 
 #define ATTSyntax         Dialect=ATT_SYNTAX
 #define ATTMnemonic       Dialect=ATT_MNEMONIC
@@ -229,6 +230,7 @@ push, 0x68, i186&No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Imm16|Imm32 }
 push, 0x6, No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }
 // In 64bit mode, the operand size is implicitly 64bit.
 push, 0x50, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
+pushp, 0x50, APX_F, No_bSuf|No_wSuf|No_lSuf|No_sSuf|Rex2, { Reg64 }
 push, 0xff/6, x64, Modrm|DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64|Unspecified|BaseIndex }
 push, 0x6a, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Imm8S }
 push, 0x68, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Imm16|Imm32S }
@@ -242,6 +244,7 @@ pop, 0x8f/0, No64, Modrm|DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32|Unsp
 pop, 0x7, No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }
 // In 64bit mode, the operand size is implicitly 64bit.
 pop, 0x58, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
+popp, 0x58, APX_F, No_bSuf|No_wSuf|No_lSuf|No_sSuf|Rex2, { Reg64 }
 pop, 0x8f/0, x64, Modrm|DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64|Unspecified|BaseIndex }
 pop, 0xfa1, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { SReg }
 
-- 
2.25.1



* [PATCH V5 8/9] Support APX NDD optimized encoding.
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
                   ` (6 preceding siblings ...)
  2023-12-28  1:27 ` [PATCH V5 7/9] Support APX pushp/popp Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:56   ` H.J. Lu
  2024-01-05 14:36   ` Jan Beulich
  2023-12-28  1:27 ` [PATCH V5 9/9] Support APX JMPABS for disassembler Cui, Lili
  8 siblings, 2 replies; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC (permalink / raw)
  To: binutils; +Cc: hongjiu.lu, jbeulich, Hu, Lin1

From: "Hu, Lin1" <lin1.hu@intel.com>

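This patch optimizes an APX NDD insn whose destination is also one of
its sources into the shorter legacy encoding, for example:

add %r16, %r15, %r15 -> add %r16, %r15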

gas/ChangeLog:

	* config/tc-i386.c (check_Rex_required): New function.
	(can_convert_NDD_to_legacy): Ditto.
	(match_template): Rematch the template if an APX NDD insn can be
	optimized.
	* testsuite/gas/i386/x86-64.exp: Add test.
	* testsuite/gas/i386/x86-64-apx-ndd-optimize.d: New test.
	* testsuite/gas/i386/x86-64-apx-ndd-optimize.s: Ditto.
---
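To make the operand matching easier to review, here is a small standalone
model of the decision that can_convert_NDD_to_legacy makes. It is only a
sketch under simplifying assumptions: ndd_dest_match and NO_MATCH are
hypothetical names, operands are plain register numbers in AT&T order
(destination last), and a negative value stands for a non-register operand.
The NF check and the later template and size checks in match_template are
not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_MATCH (~0u)

/* Hypothetical model, for illustration only.  OPS holds NOPS operands in
   AT&T order; a negative entry means "not a register".  Returns the index
   of the source operand that equals the destination, or NO_MATCH.  */
static unsigned int
ndd_dest_match (const int *ops, unsigned int nops, bool commutative,
		int optimize)
{
  assert (nops >= 2);

  unsigned int dest = nops - 1;
  unsigned int src1 = nops - 2;
  unsigned int src2 = nops > 2 ? nops - 3 : 0;

  if (ops[dest] < 0)
    return NO_MATCH;

  /* The source next to the destination matches: drop the destination.  */
  if (ops[src1] == ops[dest])
    return src1;

  /* The first source matches: only usable when the two sources may be
     swapped, and only at the higher optimization level.  */
  if (optimize > 1 && commutative && ops[src2] == ops[dest])
    return src2;

  return NO_MATCH;
}
```

For add %r16, %r15, %r15 the model returns 1, so the destination can simply
be dropped. For add %r8, %r16, %r8 it returns 0 only when the insn is
commutative and optimize > 1, which is the case where the patch swaps the
two sources before dropping the destination.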
 gas/config/tc-i386.c                          | 104 ++++++++++++++
 .../gas/i386/x86-64-apx-ndd-optimize.d        | 132 ++++++++++++++++++
 .../gas/i386/x86-64-apx-ndd-optimize.s        | 125 +++++++++++++++++
 gas/testsuite/gas/i386/x86-64.exp             |   1 +
 4 files changed, 362 insertions(+)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 640c6511f20..3b0ba41cc72 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -7212,6 +7212,56 @@ check_APX_operands (const insn_template *t)
   return 0;
 }
 
+/* Check whether the instruction uses REX registers or the REX prefix.  */
+static bool
+check_Rex_required (void)
+{
+  for (unsigned int op = 0; op < i.operands; op++)
+    {
+      if (i.types[op].bitfield.class != Reg)
+	continue;
+
+      if (i.op[op].regs->reg_flags & (RegRex | RegRex64))
+	return true;
+    }
+
+  if ((i.index_reg && (i.index_reg->reg_flags & (RegRex | RegRex64)))
+      || (i.base_reg && (i.base_reg->reg_flags & (RegRex | RegRex64))))
+    return true;
+
+  /* Check whether the {rex} pseudo prefix was specified.  */
+  return i.rex_encoding;
+}
+
+/* Optimize APX NDD insns to legacy insns.  */
+static unsigned int
+can_convert_NDD_to_legacy (const insn_template *t)
+{
+  unsigned int match_dest_op = ~0;
+
+  if (!i.tm.opcode_modifier.nf
+      && i.reg_operands >= 2)
+    {
+      unsigned int dest = i.operands - 1;
+      unsigned int src1 = i.operands - 2;
+      unsigned int src2 = (i.operands > 3) ? i.operands - 3 : 0;
+
+      if (i.types[src1].bitfield.class == Reg
+	  && i.op[src1].regs == i.op[dest].regs)
+	match_dest_op = src1;
+      /* If the first operand is the same as the third operand, the
+	 insn can only be optimized if it is commutative, i.e. if
+	 swapping the first two operands does not change its
+	 semantics.  */
+      else if (optimize > 1
+	       && t->opcode_modifier.commutative
+	       && i.types[src2].bitfield.class == Reg
+	       && i.op[src2].regs == i.op[dest].regs)
+	match_dest_op = src2;
+    }
+  return match_dest_op;
+}
+
 /* Helper function for the progress() macro in match_template().  */
 static INLINE enum i386_error progress (enum i386_error new,
 					enum i386_error last,
@@ -7754,6 +7804,60 @@ match_template (char mnem_suffix)
 	  i.memshift = memshift;
 	}
 
+      /* If an NDD insn can be optimized to a legacy insn, e.g.
+	 add %r16, %r8, %r8 -> add %r16, %r8,
+	 add %r8, %r16, %r8 -> add %r16, %r8,
+	 then rematch the template.  The semantics are unchanged.  */
+      if (optimize
+	  && !i.no_optimize
+	  && i.vec_encoding != vex_encoding_evex
+	  && t + 1 < current_templates.end
+	  && !t[1].opcode_modifier.evex
+	  && t[1].opcode_space <= SPACE_0F38
+	  && t->opcode_modifier.vexvvvv == VexVVVV_DST
+	  && (i.types[i.operands - 1].bitfield.dword
+	      || i.types[i.operands - 1].bitfield.qword))
+	{
+	  unsigned int match_dest_op = can_convert_NDD_to_legacy (t);
+
+	  if (match_dest_op != (unsigned int) ~0)
+	    {
+	      size_match = true;
+	      /* Check via the first operand (AT&T order) that the next
+		 template takes the same input operands as the one just
+		 matched.  This guards against a new NDD insn being added
+		 at the wrong position.  */
+	      overlap0 = operand_type_and (i.types[0],
+					   t[1].operand_types[0]);
+	      if (t->opcode_modifier.d)
+		overlap1 = operand_type_and (i.types[0],
+					     t[1].operand_types[1]);
+	      if (!operand_type_match (overlap0, i.types[0])
+		  && (!t->opcode_modifier.d
+		      || !operand_type_match (overlap1, i.types[0])))
+		size_match = false;
+
+	      if (size_match
+		  && (t[1].opcode_space <= SPACE_0F
+		      /* Some non-legacy-map0/1 insns can be shorter when
+			 legacy-encoded and when no REX prefix is required.  */
+		      || (!check_EgprOperands (t + 1)
+			  && !check_Rex_required ()
+			  && !i.op[i.operands - 1].regs->reg_type.bitfield.qword)))
+		{
+		  if (i.operands > 2 && match_dest_op == i.operands - 3)
+		    swap_2_operands (match_dest_op, i.operands - 2);
+
+		  --i.operands;
+		  --i.reg_operands;
+
+		  specific_error = progress (internal_error);
+		  continue;
+		}
+
+	    }
+	}
+
       /* We've found a match; break out of loop.  */
       break;
     }
diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
new file mode 100644
index 00000000000..48f0f1ceee3
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
@@ -0,0 +1,132 @@
+#as: -Os
+#objdump: -drw
+#name: x86-64 APX NDD optimized encoding
+#source: x86-64-apx-ndd-optimize.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*d5 4d 01 f8          	add    %r31,%r8
+\s*[a-f0-9]+:\s*62 44 3c 18 00 f8    	add    %r31b,%r8b,%r8b
+\s*[a-f0-9]+:\s*d5 4d 01 f8          	add    %r31,%r8
+\s*[a-f0-9]+:\s*d5 1d 03 c7          	add    %r31,%r8
+\s*[a-f0-9]+:\s*d5 4d 03 38          	add    \(%r8\),%r31
+\s*[a-f0-9]+:\s*d5 1d 03 07          	add    \(%r31\),%r8
+\s*[a-f0-9]+:\s*49 81 c7 33 44 34 12 	add    \$0x12344433,%r15
+\s*[a-f0-9]+:\s*49 81 c0 11 22 33 f4 	add    \$0xfffffffff4332211,%r8
+\s*[a-f0-9]+:\s*d5 19 ff c7          	inc    %r31
+\s*[a-f0-9]+:\s*62 dc 04 10 fe c7    	inc    %r31b,%r31b
+\s*[a-f0-9]+:\s*d5 1c 29 f9          	sub    %r15,%r17
+\s*[a-f0-9]+:\s*62 7c 74 10 28 f9    	sub    %r15b,%r17b,%r17b
+\s*[a-f0-9]+:\s*62 54 84 18 29 38    	sub    %r15,\(%r8\),%r15
+\s*[a-f0-9]+:\s*d5 49 2b 04 07       	sub    \(%r15,%rax,1\),%r16
+\s*[a-f0-9]+:\s*d5 19 81 ee 34 12 00 00 	sub    \$0x1234,%r30
+\s*[a-f0-9]+:\s*d5 18 ff c9          	dec    %r17
+\s*[a-f0-9]+:\s*62 fc 74 10 fe c9    	dec    %r17b,%r17b
+\s*[a-f0-9]+:\s*d5 1c 19 f9          	sbb    %r15,%r17
+\s*[a-f0-9]+:\s*62 7c 74 10 18 f9    	sbb    %r15b,%r17b,%r17b
+\s*[a-f0-9]+:\s*62 54 84 18 19 38    	sbb    %r15,\(%r8\),%r15
+\s*[a-f0-9]+:\s*d5 49 1b 04 07       	sbb    \(%r15,%rax,1\),%r16
+\s*[a-f0-9]+:\s*d5 19 81 de 34 12 00 00 	sbb    \$0x1234,%r30
+\s*[a-f0-9]+:\s*d5 1c 21 f9          	and    %r15,%r17
+\s*[a-f0-9]+:\s*62 7c 74 10 20 f9    	and    %r15b,%r17b,%r17b
+\s*[a-f0-9]+:\s*4d 23 38             	and    \(%r8\),%r15
+\s*[a-f0-9]+:\s*d5 49 23 04 07       	and    \(%r15,%rax,1\),%r16
+\s*[a-f0-9]+:\s*d5 11 81 e6 34 12 00 00 	and    \$0x1234,%r30d
+\s*[a-f0-9]+:\s*d5 1c 09 f9          	or     %r15,%r17
+\s*[a-f0-9]+:\s*62 7c 74 10 08 f9    	or     %r15b,%r17b,%r17b
+\s*[a-f0-9]+:\s*4d 0b 38             	or     \(%r8\),%r15
+\s*[a-f0-9]+:\s*d5 49 0b 04 07       	or     \(%r15,%rax,1\),%r16
+\s*[a-f0-9]+:\s*d5 19 81 ce 34 12 00 00 	or     \$0x1234,%r30
+\s*[a-f0-9]+:\s*d5 1c 31 f9          	xor    %r15,%r17
+\s*[a-f0-9]+:\s*62 7c 74 10 30 f9    	xor    %r15b,%r17b,%r17b
+\s*[a-f0-9]+:\s*4d 33 38             	xor    \(%r8\),%r15
+\s*[a-f0-9]+:\s*d5 49 33 04 07       	xor    \(%r15,%rax,1\),%r16
+\s*[a-f0-9]+:\s*d5 19 81 f6 34 12 00 00 	xor    \$0x1234,%r30
+\s*[a-f0-9]+:\s*d5 1c 11 f9          	adc    %r15,%r17
+\s*[a-f0-9]+:\s*62 7c 74 10 10 f9    	adc    %r15b,%r17b,%r17b
+\s*[a-f0-9]+:\s*4d 13 38             	adc    \(%r8\),%r15
+\s*[a-f0-9]+:\s*d5 49 13 04 07       	adc    \(%r15,%rax,1\),%r16
+\s*[a-f0-9]+:\s*d5 19 81 d6 34 12 00 00 	adc    \$0x1234,%r30
+\s*[a-f0-9]+:\s*d5 18 f7 d9          	neg    %r17
+\s*[a-f0-9]+:\s*62 fc 74 10 f6 d9    	neg    %r17b,%r17b
+\s*[a-f0-9]+:\s*d5 18 f7 d1          	not    %r17
+\s*[a-f0-9]+:\s*62 fc 74 10 f6 d1    	not    %r17b,%r17b
+\s*[a-f0-9]+:\s*67 0f af 90 09 09 09 00 	imul   0x90909\(%eax\),%edx
+\s*[a-f0-9]+:\s*d5 aa af 94 f8 09 09 00 00 	imul   0x909\(%rax,%r31,8\),%rdx
+\s*[a-f0-9]+:\s*48 0f af d0          	imul   %rax,%rdx
+\s*[a-f0-9]+:\s*d5 19 d1 c7          	rol    \$1,%r31
+\s*[a-f0-9]+:\s*62 dc 04 10 d0 c7    	rol    \$1,%r31b,%r31b
+\s*[a-f0-9]+:\s*49 c1 c4 02          	rol    \$0x2,%r12
+\s*[a-f0-9]+:\s*62 d4 1c 18 c0 c4 02 	rol    \$0x2,%r12b,%r12b
+\s*[a-f0-9]+:\s*d5 19 d1 cf          	ror    \$1,%r31
+\s*[a-f0-9]+:\s*62 dc 04 10 d0 cf    	ror    \$1,%r31b,%r31b
+\s*[a-f0-9]+:\s*49 c1 cc 02          	ror    \$0x2,%r12
+\s*[a-f0-9]+:\s*62 d4 1c 18 c0 cc 02 	ror    \$0x2,%r12b,%r12b
+\s*[a-f0-9]+:\s*d5 19 d1 d7          	rcl    \$1,%r31
+\s*[a-f0-9]+:\s*62 dc 04 10 d0 d7    	rcl    \$1,%r31b,%r31b
+\s*[a-f0-9]+:\s*49 c1 d4 02          	rcl    \$0x2,%r12
+\s*[a-f0-9]+:\s*62 d4 1c 18 c0 d4 02 	rcl    \$0x2,%r12b,%r12b
+\s*[a-f0-9]+:\s*d5 19 d1 df          	rcr    \$1,%r31
+\s*[a-f0-9]+:\s*62 dc 04 10 d0 df    	rcr    \$1,%r31b,%r31b
+\s*[a-f0-9]+:\s*49 c1 dc 02          	rcr    \$0x2,%r12
+\s*[a-f0-9]+:\s*62 d4 1c 18 c0 dc 02 	rcr    \$0x2,%r12b,%r12b
+\s*[a-f0-9]+:\s*d5 19 d1 e7          	shl    \$1,%r31
+\s*[a-f0-9]+:\s*62 dc 04 10 d0 e7    	shl    \$1,%r31b,%r31b
+\s*[a-f0-9]+:\s*49 c1 e4 02          	shl    \$0x2,%r12
+\s*[a-f0-9]+:\s*62 d4 1c 18 c0 e4 02 	shl    \$0x2,%r12b,%r12b
+\s*[a-f0-9]+:\s*d5 19 d1 e7          	shl    \$1,%r31
+\s*[a-f0-9]+:\s*62 dc 04 10 d0 e7    	shl    \$1,%r31b,%r31b
+\s*[a-f0-9]+:\s*49 c1 e4 02          	shl    \$0x2,%r12
+\s*[a-f0-9]+:\s*62 d4 1c 18 c0 e4 02 	shl    \$0x2,%r12b,%r12b
+\s*[a-f0-9]+:\s*d5 19 d1 ef          	shr    \$1,%r31
+\s*[a-f0-9]+:\s*62 dc 04 10 d0 ef    	shr    \$1,%r31b,%r31b
+\s*[a-f0-9]+:\s*49 c1 ec 02          	shr    \$0x2,%r12
+\s*[a-f0-9]+:\s*62 d4 1c 18 c0 ec 02 	shr    \$0x2,%r12b,%r12b
+\s*[a-f0-9]+:\s*d5 19 d1 ff          	sar    \$1,%r31
+\s*[a-f0-9]+:\s*62 dc 04 10 d0 ff    	sar    \$1,%r31b,%r31b
+\s*[a-f0-9]+:\s*49 c1 fc 02          	sar    \$0x2,%r12
+\s*[a-f0-9]+:\s*62 d4 1c 18 c0 fc 02 	sar    \$0x2,%r12b,%r12b
+\s*[a-f0-9]+:\s*62 74 9c 18 24 20 01 	shld   \$0x1,%r12,\(%rax\),%r12
+\s*[a-f0-9]+:\s*4d 0f a4 c4 02       	shld   \$0x2,%r8,%r12
+\s*[a-f0-9]+:\s*62 54 bc 18 24 c4 02 	shld   \$0x2,%r8,%r12,%r8
+\s*[a-f0-9]+:\s*62 74 b4 18 a5 08    	shld   %cl,%r9,\(%rax\),%r9
+\s*[a-f0-9]+:\s*d5 9c a5 e0          	shld   %cl,%r12,%r16
+\s*[a-f0-9]+:\s*62 7c 9c 18 a5 e0    	shld   %cl,%r12,%r16,%r12
+\s*[a-f0-9]+:\s*62 74 9c 18 2c 20 01 	shrd   \$0x1,%r12,\(%rax\),%r12
+\s*[a-f0-9]+:\s*4d 0f ac ec 01       	shrd   \$0x1,%r13,%r12
+\s*[a-f0-9]+:\s*62 54 94 18 2c ec 01 	shrd   \$0x1,%r13,%r12,%r13
+\s*[a-f0-9]+:\s*62 74 b4 18 ad 08    	shrd   %cl,%r9,\(%rax\),%r9
+\s*[a-f0-9]+:\s*d5 9c ad e0          	shrd   %cl,%r12,%r16
+\s*[a-f0-9]+:\s*62 7c 9c 18 ad e0    	shrd   %cl,%r12,%r16,%r12
+\s*[a-f0-9]+:\s*67 0f 40 90 90 90 90 90 	cmovo  -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 41 90 90 90 90 90 	cmovno -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 42 90 90 90 90 90 	cmovb  -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 43 90 90 90 90 90 	cmovae -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 44 90 90 90 90 90 	cmove  -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 45 90 90 90 90 90 	cmovne -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 46 90 90 90 90 90 	cmovbe -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 47 90 90 90 90 90 	cmova  -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 48 90 90 90 90 90 	cmovs  -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 49 90 90 90 90 90 	cmovns -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 4a 90 90 90 90 90 	cmovp  -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 4c 90 90 90 90 90 	cmovl  -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 4d 90 90 90 90 90 	cmovge -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 4e 90 90 90 90 90 	cmovle -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*67 0f 4f 90 90 90 90 90 	cmovg  -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*66 0f 38 f6 c3       	adcx   %ebx,%eax
+\s*[a-f0-9]+:\s*66 0f 38 f6 c3       	adcx   %ebx,%eax
+\s*[a-f0-9]+:\s*62 f4 fd 18 66 c3    	adcx   %rbx,%rax,%rax
+\s*[a-f0-9]+:\s*62 74 3d 18 66 c0    	adcx   %eax,%r8d,%r8d
+\s*[a-f0-9]+:\s*62 d4 7d 18 66 c7    	adcx   %r15d,%eax,%eax
+\s*[a-f0-9]+:\s*67 66 0f 38 f6 04 0a 	adcx   \(%edx,%ecx,1\),%eax
+\s*[a-f0-9]+:\s*f3 0f 38 f6 c3       	adox   %ebx,%eax
+\s*[a-f0-9]+:\s*f3 0f 38 f6 c3       	adox   %ebx,%eax
+\s*[a-f0-9]+:\s*62 f4 fe 18 66 c3    	adox   %rbx,%rax,%rax
+\s*[a-f0-9]+:\s*62 74 3e 18 66 c0    	adox   %eax,%r8d,%r8d
+\s*[a-f0-9]+:\s*62 d4 7e 18 66 c7    	adox   %r15d,%eax,%eax
+\s*[a-f0-9]+:\s*67 f3 0f 38 f6 04 0a 	adox   \(%edx,%ecx,1\),%eax
diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
new file mode 100644
index 00000000000..6ffdf5a6390
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
@@ -0,0 +1,125 @@
+# Check 64bit APX NDD instructions with optimized encoding
+
+	.text
+_start:
+add    %r31,%r8,%r8
+addb   %r31b,%r8b,%r8b
+{store} add    %r31,%r8,%r8
+{load}  add    %r31,%r8,%r8
+add    %r31,(%r8),%r31
+add    (%r31),%r8,%r8
+add    $0x12344433,%r15,%r15
+add    $0xfffffffff4332211,%r8,%r8
+inc    %r31,%r31
+incb   %r31b,%r31b
+sub    %r15,%r17,%r17
+subb   %r15b,%r17b,%r17b
+sub    %r15,(%r8),%r15
+sub    (%r15,%rax,1),%r16,%r16
+sub    $0x1234,%r30,%r30
+dec    %r17,%r17
+decb   %r17b,%r17b
+sbb    %r15,%r17,%r17
+sbbb   %r15b,%r17b,%r17b
+sbb    %r15,(%r8),%r15
+sbb    (%r15,%rax,1),%r16,%r16
+sbb    $0x1234,%r30,%r30
+and    %r15,%r17,%r17
+andb   %r15b,%r17b,%r17b
+and    %r15,(%r8),%r15
+and    (%r15,%rax,1),%r16,%r16
+and    $0x1234,%r30,%r30
+or     %r15,%r17,%r17
+orb    %r15b,%r17b,%r17b
+or     %r15,(%r8),%r15
+or     (%r15,%rax,1),%r16,%r16
+or     $0x1234,%r30,%r30
+xor    %r15,%r17,%r17
+xorb   %r15b,%r17b,%r17b
+xor    %r15,(%r8),%r15
+xor    (%r15,%rax,1),%r16,%r16
+xor    $0x1234,%r30,%r30
+adc    %r15,%r17,%r17
+adcb   %r15b,%r17b,%r17b
+adc    %r15,(%r8),%r15
+adc    (%r15,%rax,1),%r16,%r16
+adc    $0x1234,%r30,%r30
+neg    %r17,%r17
+negb   %r17b,%r17b
+not    %r17,%r17
+notb   %r17b,%r17b
+imul   0x90909(%eax),%edx,%edx
+imul   0x909(%rax,%r31,8),%rdx,%rdx
+imul   %rdx,%rax,%rdx
+rol    $0x1,%r31,%r31
+rolb   $0x1,%r31b,%r31b
+rol    $0x2,%r12,%r12
+rolb   $0x2,%r12b,%r12b
+ror    $0x1,%r31,%r31
+rorb   $0x1,%r31b,%r31b
+ror    $0x2,%r12,%r12
+rorb   $0x2,%r12b,%r12b
+rcl    $0x1,%r31,%r31
+rclb   $0x1,%r31b,%r31b
+rcl    $0x2,%r12,%r12
+rclb   $0x2,%r12b,%r12b
+rcr    $0x1,%r31,%r31
+rcrb   $0x1,%r31b,%r31b
+rcr    $0x2,%r12,%r12
+rcrb   $0x2,%r12b,%r12b
+sal    $0x1,%r31,%r31
+salb   $0x1,%r31b,%r31b
+sal    $0x2,%r12,%r12
+salb   $0x2,%r12b,%r12b
+shl    $0x1,%r31,%r31
+shlb   $0x1,%r31b,%r31b
+shl    $0x2,%r12,%r12
+shlb   $0x2,%r12b,%r12b
+shr    $0x1,%r31,%r31
+shrb   $0x1,%r31b,%r31b
+shr    $0x2,%r12,%r12
+shrb   $0x2,%r12b,%r12b
+sar    $0x1,%r31,%r31
+sarb   $0x1,%r31b,%r31b
+sar    $0x2,%r12,%r12
+sarb   $0x2,%r12b,%r12b
+shld   $0x1,%r12,(%rax),%r12
+shld   $0x2,%r8,%r12,%r12
+shld   $0x2,%r8,%r12,%r8
+shld   %cl,%r9,(%rax),%r9
+shld   %cl,%r12,%r16,%r16
+shld   %cl,%r12,%r16,%r12
+shrd   $0x1,%r12,(%rax),%r12
+shrd   $0x1,%r13,%r12,%r12
+shrd   $0x1,%r13,%r12,%r13
+shrd   %cl,%r9,(%rax),%r9
+shrd   %cl,%r12,%r16,%r16
+shrd   %cl,%r12,%r16,%r12
+cmovo  0x90909090(%eax),%edx,%edx
+cmovno 0x90909090(%eax),%edx,%edx
+cmovb  0x90909090(%eax),%edx,%edx
+cmovae 0x90909090(%eax),%edx,%edx
+cmove  0x90909090(%eax),%edx,%edx
+cmovne 0x90909090(%eax),%edx,%edx
+cmovbe 0x90909090(%eax),%edx,%edx
+cmova  0x90909090(%eax),%edx,%edx
+cmovs  0x90909090(%eax),%edx,%edx
+cmovns 0x90909090(%eax),%edx,%edx
+cmovp  0x90909090(%eax),%edx,%edx
+cmovnp 0x90909090(%eax),%edx,%edx
+cmovl  0x90909090(%eax),%edx,%edx
+cmovge 0x90909090(%eax),%edx,%edx
+cmovle 0x90909090(%eax),%edx,%edx
+cmovg  0x90909090(%eax),%edx,%edx
+adcx   %ebx,%eax,%eax
+adcx   %eax,%ebx,%eax
+adcx   %rbx,%rax,%rax
+adcx   %eax,%r8d,%r8d
+adcx   %r15d,%eax,%eax
+adcx   (%edx,%ecx,1),%eax,%eax
+adox   %ebx,%eax,%eax
+adox   %eax,%ebx,%eax
+adox   %rbx,%rax,%rax
+adox   %eax,%r8d,%r8d
+adox   %r15d,%eax,%eax
+adox   (%edx,%ecx,1),%eax,%eax
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 1b13c52454e..2ba4c49417a 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -561,6 +561,7 @@ run_dump_test "x86-64-optimize-6"
 run_list_test "x86-64-optimize-7a" "-I${srcdir}/$subdir -march=+noavx -al"
 run_dump_test "x86-64-optimize-7b"
 run_list_test "x86-64-optimize-8" "-I${srcdir}/$subdir -march=+noavx2 -al"
+run_dump_test "x86-64-apx-ndd-optimize"
 run_dump_test "x86-64-align-branch-1a"
 run_dump_test "x86-64-align-branch-1b"
 run_dump_test "x86-64-align-branch-1c"
-- 
2.25.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH V5 9/9] Support APX JMPABS for disassembler
  2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
                   ` (7 preceding siblings ...)
  2023-12-28  1:27 ` [PATCH V5 8/9] Support APX NDD optimized encoding Cui, Lili
@ 2023-12-28  1:27 ` Cui, Lili
  2023-12-28  1:56   ` H.J. Lu
  2024-01-05 12:08   ` Jan Beulich
  8 siblings, 2 replies; 30+ messages in thread
From: Cui, Lili @ 2023-12-28  1:27 UTC (permalink / raw)
  To: binutils; +Cc: hongjiu.lu, jbeulich, Hu, Lin1

From: "Hu, Lin1" <lin1.hu@intel.com>

gas/ChangeLog:

	* testsuite/gas/i386/x86-64.exp: Run APX JMPABS tests.
	* testsuite/gas/i386/x86-64-apx-jmpabs-intel.d: New test.
	* testsuite/gas/i386/x86-64-apx-jmpabs-inval.d: Ditto.
	* testsuite/gas/i386/x86-64-apx-jmpabs-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-apx-jmpabs.d: Ditto.
	* testsuite/gas/i386/x86-64-apx-jmpabs.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (JMPABS_Fixup): New Fixup function to disassemble jmpabs.
	(print_insn): Exempt jmpabs from the PREFIX_REX2_ILLEGAL (#UD) check.
	(dis386): Modify the a1 entry to support jmpabs.
	* i386-mnem.h: Regenerated.
	* i386-opc.tbl: New insns.
	* i386-tbl.h: Regenerated.
---
 .../gas/i386/x86-64-apx-jmpabs-intel.d        | 12 ++++++
 .../gas/i386/x86-64-apx-jmpabs-inval.d        | 40 +++++++++++++++++++
 .../gas/i386/x86-64-apx-jmpabs-inval.s        | 15 +++++++
 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d    | 12 ++++++
 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s    |  5 +++
 gas/testsuite/gas/i386/x86-64.exp             |  3 ++
 opcodes/i386-dis.c                            | 37 ++++++++++++++++-
 7 files changed, 122 insertions(+), 2 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
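
For readers of the tests below, the byte layout they exercise can be sketched
as follows.  This is only an illustration, not the disassembler code:
decode_jmpabs is a hypothetical helper, and it assumes the REX2 payload layout
used elsewhere in this series (M0 in bit 7, W in bit 3) and an 11-byte
instruction (d5, payload, a1, 8-byte little-endian target).  A legacy prefix
in front (66/67/f2/f3/f0) is rejected here simply because the first byte is
then not d5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of binutils: recognize
   "d5 <payload> a1 <imm64>" and extract the absolute target.  */
static bool
decode_jmpabs (const uint8_t *buf, size_t len, uint64_t *target)
{
  uint64_t value = 0;
  int i;

  assert (buf != NULL && target != NULL);

  if (len < 11)
    return false;
  /* The REX2 prefix byte must come first; any legacy prefix fails here.  */
  if (buf[0] != 0xd5)
    return false;
  /* Assumed payload layout: bit 7 is M0, bit 3 is W; both must be 0.  */
  if ((buf[1] & 0x88) != 0)
    return false;
  if (buf[2] != 0xa1)
    return false;

  /* The 64-bit target is stored little-endian after the opcode.  */
  for (i = 7; i >= 0; i--)
    value = (value << 8) | buf[3 + i];

  *target = value;
  return true;
}
```

With the bytes from x86-64-apx-jmpabs.s this yields a target of 0x2, while the
REX2.W = 1 and prefixed sequences from x86-64-apx-jmpabs-inval.s are rejected.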

diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
new file mode 100644
index 00000000000..2b87f95532f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
@@ -0,0 +1,12 @@
+#as:
+#objdump: -dw -Mintel
+#name: x86_64 APX_F JMPABS insns (Intel disassembly)
+#source: x86-64-apx-jmpabs.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*d5 00 a1 02 00 00 00 00 00 00 00[	 ]+jmpabs 0x2
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
new file mode 100644
index 00000000000..86f313f0873
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
@@ -0,0 +1,40 @@
+#as: --64
+#objdump: -dw
+#name: illegal decoding of APX_F jmpabs insns
+#source: x86-64-apx-jmpabs-inval.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <.text>:
+\s*[a-f0-9]+:	66 d5 00 a1[  	]+\(bad\)
+\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	67 d5 00 a1[  	]+\(bad\)
+\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	f2 d5 00 a1[  	]+\(bad\)
+\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	f3 d5 00 a1[  	]+\(bad\)
+\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	f0 d5 00 a1[  	]+\(bad\)
+\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	d5 08 a1[  	]+\(bad\)
+\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
new file mode 100644
index 00000000000..de4440a5466
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
@@ -0,0 +1,15 @@
+# Check bytecode of APX_F jmpabs instructions with illegal encodings.
+
+	.text
+# With 66 prefix
+	.byte 0x66,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+# With 67 prefix
+	.byte 0x67,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+# With F2 prefix
+	.byte 0xf2,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+# With F3 prefix
+	.byte 0xf3,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+# With LOCK prefix
+	.byte 0xf0,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+# REX2.M0 = 0 REX2.W = 1
+	.byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs.d b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
new file mode 100644
index 00000000000..e95b54f5dab
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
@@ -0,0 +1,12 @@
+#as:
+#objdump: -dw
+#name: x86_64 APX_F JMPABS insns
+#source: x86-64-apx-jmpabs.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*d5 00 a1 02 00 00 00 00 00 00 00[	 ]+jmpabs \$0x2
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
new file mode 100644
index 00000000000..69ffb763260
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
@@ -0,0 +1,5 @@
+# Check 64bit APX_F JMPABS instructions
+
+	.text
+ _start:
+	.byte 0xd5,0x00,0xa1,0x02,0x00,0x00,0x00,0x00,0x00,0x00,0x00
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 2ba4c49417a..fa6a1c3c945 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -377,6 +377,9 @@ run_dump_test "x86-64-apx-evex-promoted"
 run_dump_test "x86-64-apx-evex-promoted-intel"
 run_dump_test "x86-64-apx-evex-egpr"
 run_dump_test "x86-64-apx-ndd"
+run_dump_test "x86-64-apx-jmpabs"
+run_dump_test "x86-64-apx-jmpabs-intel"
+run_dump_test "x86-64-apx-jmpabs-inval"
 run_dump_test "x86-64-avx512f-rcigrz-intel"
 run_dump_test "x86-64-avx512f-rcigrz"
 run_dump_test "x86-64-clwb"
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index e851fb376d9..b6d7e089823 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -106,6 +106,7 @@ static bool MOVSXD_Fixup (instr_info *, int, int);
 static bool DistinctDest_Fixup (instr_info *, int, int);
 static bool PREFETCHI_Fixup (instr_info *, int, int);
 static bool PUSH2_POP2_Fixup (instr_info *, int, int);
+static bool JMPABS_Fixup (instr_info *, int, int);
 
 static void ATTRIBUTE_PRINTF_3 i386_dis_printf (const disassemble_info *,
 						enum disassembler_style,
@@ -2018,7 +2019,7 @@ static const struct dis386 dis386[] = {
   { "lahf",		{ XX }, 0 },
   /* a0 */
   { "mov%LB",		{ AL, Ob }, PREFIX_REX2_ILLEGAL },
-  { "mov%LS",		{ eAX, Ov }, PREFIX_REX2_ILLEGAL },
+  { "mov%LS",		{ { JMPABS_Fixup, eAX_reg }, { JMPABS_Fixup, v_mode } }, PREFIX_REX2_ILLEGAL },
   { "mov%LB",		{ Ob, AL }, PREFIX_REX2_ILLEGAL },
   { "mov%LS",		{ Ov, eAX }, PREFIX_REX2_ILLEGAL },
   { "movs{b|}",		{ Ybr, Xb }, PREFIX_REX2_ILLEGAL },
@@ -9699,7 +9700,7 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
     }
 
   if ((dp->prefix_requirement & PREFIX_REX2_ILLEGAL)
-      && ins.last_rex2_prefix >= 0)
+      && ins.last_rex2_prefix >= 0 && (ins.rex2 & REX2_SPECIAL) == 0)
     {
       i386_dis_printf (info, dis_style_text, "(bad)");
       ret = ins.end_codep - priv.the_buffer;
@@ -13942,3 +13943,35 @@ PUSH2_POP2_Fixup (instr_info *ins, int bytemode, int sizeflag)
 
   return OP_VEX (ins, bytemode, sizeflag);
 }
+
+static bool
+JMPABS_Fixup (instr_info *ins, int bytemode, int sizeflag)
+{
+  if (ins->last_rex2_prefix >= 0)
+    {
+      uint64_t op;
+
+      if ((ins->prefixes & (PREFIX_OPCODE | PREFIX_ADDR | PREFIX_LOCK)) != 0x0
+	  || (ins->rex & REX_W) != 0x0)
+	{
+	  oappend (ins, "(bad)");
+	  return true;
+	}
+
+      if (bytemode == eAX_reg)
+	return true;
+
+      if (!get64 (ins, &op))
+	return false;
+
+      ins->mnemonicendp = stpcpy (ins->obuf, "jmpabs");
+      ins->rex2 |= REX2_SPECIAL;
+      oappend_immediate (ins, op);
+
+      return true;
+    }
+
+  if (bytemode == eAX_reg)
+    return OP_IMREG (ins, bytemode, sizeflag);
+  return OP_OFF64 (ins, bytemode, sizeflag);
+}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 1/9] Support APX GPR32 with rex2 prefix
  2023-12-28  1:27 ` [PATCH V5 1/9] Support APX GPR32 with rex2 prefix Cui, Lili
@ 2023-12-28  1:53   ` H.J. Lu
  2024-01-04  8:02     ` Jan Beulich
  2024-01-05 14:45   ` Jan Beulich
  1 sibling, 1 reply; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:53 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich

On Thu, Dec 28, 2023 at 01:27:06AM +0000, Cui, Lili wrote:
> APX uses the REX2 prefix to support EGPR for map0 and map1 of legacy
> instructions. We added the NoEgpr flag in i386-gen.c for instructions
> that do not support EGPR.
> 
> gas/ChangeLog:
> 
> 2023-12-28  Lingling Kong <lingling.kong@intel.com>
> 	    H.J. Lu  <hongjiu.lu@intel.com>
> 	    Lili Cui <lili.cui@intel.com>
> 	    Lin Hu   <lin1.hu@intel.com>
> 
> 	* config/tc-i386.c
> 	(enum i386_error): Add unsupported_EGPR_for_addressing
> 	and invalid_pseudo_prefix.
> 	(struct _i386_insn): Add rex2 and rex2_encoding for
> 	gpr32.
> 	(cpu_arch): Add apx_f.
> 	(is_cpu): Ditto.
> 	(register_number): Handle RegRex2 for gpr32.
> 	(is_apx_rex2_encoding): New func. Test rex2 prefix encoding.
> 	(build_rex2_prefix): New func. Build the rex2 prefix for legacy
> 	insns in opcode map 0/1 that use gpr32.
> 	(establish_rex): Handle rex2 and rex2_encoding.
> 	(optimize_encoding): Handle r16-r31 registers.
> 	(md_assemble): Handle apx encoding.
> 	(parse_insn): Handle Prefix_REX2.
> 	(check_EgprOperands): New func. Check if Egpr operands
> 	are valid for the instruction.
> 	(match_template): Handle Egpr operands check.
> 	(set_rex_rex2): New func. Set i.rex and i.rex2.
> 	(build_modrm_byte): Use set_rex_rex2.
> 	(output_insn): Handle rex2 2-byte prefix output.
> 	(check_register): Handle check egpr illegal without
> 	target apx, 64-bit mode and with rex_prefix.
> 	* doc/c-i386.texi: Document .apx.
> 	* testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d: D5 valid
> 	in 64-bit mode.
> 	* testsuite/gas/i386/ilp32/x86-64-opcode-inval.d: Ditto.
> 	* testsuite/gas/i386/rex-bad.l: Adjust rex testcase.
> 	* testsuite/gas/i386/x86-64-opcode-inval-intel.d: Ditto.
> 	* testsuite/gas/i386/x86-64-opcode-inval.d: Ditto.
> 	* testsuite/gas/i386/x86-64-opcode-inval.s: Ditto.
> 	* testsuite/gas/i386/x86-64-pseudos-bad.l: Add illegal rex2 test.
> 	* testsuite/gas/i386/x86-64-pseudos-bad.s: Ditto.
> 	* testsuite/gas/i386/x86-64-pseudos.d: Add rex2 test.
> 	* testsuite/gas/i386/x86-64-pseudos.s: Ditto.
> 	* testsuite/gas/i386/x86-64.exp: Run APX tests.
> 	* testsuite/gas/i386/x86-64-apx-egpr-inval.l: New test.
> 	* testsuite/gas/i386/x86-64-apx-egpr-inval.s: New test.
> 	* testsuite/gas/i386/x86-64-apx-rex2.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-rex2.s: New test.
> 
> include/ChangeLog:
> 
> 	* opcode/i386.h (REX2_OPCODE): New.
> 	(REX2_M): Ditto.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis.c (struct instr_info): Add rex2 for gpr32.
> 	Add last_rex2_prefix for rex2 prefix.
> 	(REX2_M): Extend for gpr32.
> 	(PREFIX_REX2): Ditto.
> 	(PREFIX_REX2_ILLEGAL): Ditto.
> 	(ckprefix): Ditto.
> 	(prefix_name): Ditto.
> 	(print_insn): Ditto.
> 	(print_register): Ditto.
> 	(OP_E_memory): Ditto.
> 	(OP_REG): Ditto.
> 	(OP_EX): Ditto.
> 	* i386-gen.c (rex2_disallowed): New. List instructions that do not allow the rex2 prefix.
> 	(process_i386_opcode_modifier): Set NoEgpr for VEX and some special instructions.
> 	(output_i386_opcode): Handle if_entry_needs_special_handle.
> 	* i386-init.h: Regenerated.
> 	* i386-mnem.h: Regenerated.
> 	* i386-opc.h (enum i386_cpu): Add CpuAPX_F.
> 	(NoEgpr): New.
> 	(Prefix_NoOptimize): Ditto.
> 	(Prefix_REX2): Ditto.
> 	(RegRex2): Ditto.
> 	* i386-opc.tbl: Add rex2 prefix.
> 	* i386-reg.tbl: Add egprs (r16-r31).
> 	* i386-tbl.h: Regenerated.
> ---
>  gas/config/tc-i386.c                          | 178 ++++++++++--
>  gas/doc/c-i386.texi                           |   7 +-
>  .../i386/ilp32/x86-64-opcode-inval-intel.d    |  47 +---
>  .../gas/i386/ilp32/x86-64-opcode-inval.d      |  47 +---
>  gas/testsuite/gas/i386/rex-bad.l              |   8 +-
>  .../gas/i386/x86-64-apx-egpr-inval.l          |  15 +
>  .../gas/i386/x86-64-apx-egpr-inval.s          |  18 ++
>  gas/testsuite/gas/i386/x86-64-apx-rex2.d      |  83 ++++++
>  gas/testsuite/gas/i386/x86-64-apx-rex2.s      |  85 ++++++
>  .../gas/i386/x86-64-opcode-inval-intel.d      |  26 +-
>  gas/testsuite/gas/i386/x86-64-opcode-inval.d  |  26 +-
>  gas/testsuite/gas/i386/x86-64-opcode-inval.s  |   4 -
>  gas/testsuite/gas/i386/x86-64-pseudos-bad.l   |  75 ++++-
>  gas/testsuite/gas/i386/x86-64-pseudos-bad.s   |  74 +++++
>  gas/testsuite/gas/i386/x86-64-pseudos.d       |  21 ++
>  gas/testsuite/gas/i386/x86-64-pseudos.s       |  21 ++
>  gas/testsuite/gas/i386/x86-64.exp             |   2 +
>  include/opcode/i386.h                         |   4 +
>  opcodes/i386-dis.c                            | 257 ++++++++++++------
>  opcodes/i386-gen.c                            |  50 +++-
>  opcodes/i386-opc.h                            |  13 +-
>  opcodes/i386-opc.tbl                          |  27 +-
>  opcodes/i386-reg.tbl                          |  64 +++++
>  23 files changed, 886 insertions(+), 266 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.s
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index cdd3b55c655..bb302f28add 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -239,6 +239,7 @@ enum i386_error
>      bad_imm4,
>      unsupported_with_intel_mnemonic,
>      unsupported_syntax,
> +    unsupported_EGPR_for_addressing,
>      unsupported,
>      unsupported_on_arch,
>      unsupported_64bit,
> @@ -249,6 +250,7 @@ enum i386_error
>      invalid_vector_register_set,
>      invalid_tmm_register_set,
>      invalid_dest_and_src_register_set,
> +    invalid_pseudo_prefix,
>      unsupported_vector_index_register,
>      unsupported_broadcast,
>      broadcast_needed,
> @@ -356,6 +358,7 @@ struct _i386_insn
>      modrm_byte rm;
>      rex_byte rex;
>      rex_byte vrex;
> +    rex_byte rex2;
>      sib_byte sib;
>      vex_prefix vex;
>  
> @@ -429,6 +432,9 @@ struct _i386_insn
>      /* Prefer the REX byte in encoding.  */
>      bool rex_encoding;
>  
> +    /* Prefer the REX2 prefix in encoding.  */
> +    bool rex2_encoding;
> +
>      /* Disable instruction size optimization.  */
>      bool no_optimize;
>  
> @@ -1149,6 +1155,7 @@ static const arch_entry cpu_arch[] =
>    SUBARCH (pbndkb, PBNDKB, PBNDKB, false),
>    VECARCH (avx10.1, AVX10_1, ANY_AVX512F, set),
>    SUBARCH (user_msr, USER_MSR, USER_MSR, false),
> +  SUBARCH (apx_f, APX_F, APX_F, false),
>  };
>  
>  #undef SUBARCH
> @@ -1664,6 +1671,7 @@ _is_cpu (const i386_cpu_attr *a, enum i386_cpu cpu)
>      case CpuHLE:      return a->bitfield.cpuhle;
>      case CpuAVX512F:  return a->bitfield.cpuavx512f;
>      case CpuAVX512VL: return a->bitfield.cpuavx512vl;
> +    case CpuAPX_F:    return a->bitfield.cpuapx_f;
>      case Cpu64:       return a->bitfield.cpu64;
>      case CpuNo64:     return a->bitfield.cpuno64;
>      default:
> @@ -2335,7 +2343,7 @@ register_number (const reg_entry *r)
>    if (r->reg_flags & RegRex)
>      nr += 8;
>  
> -  if (r->reg_flags & RegVRex)
> +  if (r->reg_flags & (RegVRex | RegRex2))
>      nr += 16;
>  
>    return nr;
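
The numbering in this hunk can be sketched in isolation as below.  This is a
minimal sketch under stated assumptions: the SK_* flag values are made up for
illustration (the real RegRex, RegVRex and RegRex2 constants live in
opcodes/i386-opc.h), and it assumes that r16-r23 carry RegRex2 alone while
r24-r31 carry both RegRex and RegRex2, since the i386-reg.tbl hunk is not
quoted at this point.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only.  */
enum { SK_REGREX = 1, SK_REGVREX = 2, SK_REGREX2 = 4 };

/* Mirror of the arithmetic in register_number (): the table keeps a
   3-bit reg_num and the flags supply bits 3 and 4 of the number.  */
static unsigned int
sketch_register_number (unsigned int reg_num, unsigned int reg_flags)
{
  unsigned int nr = reg_num;

  assert (reg_num < 8);

  if (reg_flags & SK_REGREX)
    nr += 8;

  if (reg_flags & (SK_REGVREX | SK_REGREX2))
    nr += 16;

  return nr;
}
```

Under those assumptions reg_num 1 with RegRex2 gives 17 (r17), and reg_num 7
with RegRex | RegRex2 gives 31 (r31).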
> @@ -3871,6 +3879,12 @@ is_any_vex_encoding (const insn_template *t)
>    return t->opcode_modifier.vex || t->opcode_modifier.evex;
>  }
>  
> +static INLINE bool
> +is_apx_rex2_encoding (void)
> +{
> +  return i.rex2 || i.rex2_encoding;
> +}
> +
>  static unsigned int
>  get_broadcast_bytes (const insn_template *t, bool diag)
>  {
> @@ -4126,6 +4140,22 @@ build_evex_prefix (void)
>      i.vex.bytes[3] |= i.mask.reg->reg_num;
>  }
>  
> +/* Build (2 bytes) rex2 prefix.
> +   | D5h |
> +   | m | R4 X4 B4 | W R X B |
> +
> +   Rex2 reuses i.vex as they both encode i.tm.opcode_space in their prefixes.
> + */
> +static void
> +build_rex2_prefix (void)
> +{
> +  i.vex.length = 2;
> +  i.vex.bytes[0] = 0xd5;
> +  /* For the W R X B bits, the variables of rex prefix will be reused.  */
> +  i.vex.bytes[1] = ((i.tm.opcode_space << 7)
> +		    | (i.rex2 << 4) | i.rex);
> +}
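
As a cross-check of the layout in the comment above, the second prefix byte
can be computed directly from register numbers.  This is a minimal sketch,
not the gas code: sketch_rex2_payload and its parameters are hypothetical,
with map being 0 or 1 and reg/index/base being full register numbers (0-31)
for the ModRM.reg, SIB.index and ModRM.rm/SIB.base positions (0 when unused).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build the REX2 payload byte
   | M0 | R4 X4 B4 | W R X B | from register numbers.  */
static uint8_t
sketch_rex2_payload (unsigned int map, bool w, unsigned int reg,
		     unsigned int index, unsigned int base)
{
  unsigned int rex, rex2;

  assert (map <= 1 && reg < 32 && index < 32 && base < 32);

  /* Low nibble: the classic REX W/R/X/B bits (bit 3 of each number).  */
  rex = ((w ? 8u : 0u)
	 | (((reg >> 3) & 1) << 2)
	 | (((index >> 3) & 1) << 1)
	 | ((base >> 3) & 1));

  /* Bits 6..4: R4/X4/B4 (bit 4 of each number).  */
  rex2 = ((((reg >> 4) & 1) << 2)
	  | (((index >> 4) & 1) << 1)
	  | ((base >> 4) & 1));

  return (uint8_t) ((map << 7) | (rex2 << 4) | rex);
}
```

For example, "adc %r15,%r17" (map 0, W set, reg = 15, base = 17) gives 0x1c,
and "imul 0x909(%rax,%r31,8),%rdx" (map 1, W set, reg = 2, index = 31,
base = 0) gives 0xaa, matching the "d5 1c" and "d5 aa" bytes expected in
x86-64-apx-ndd-optimize.d from patch 8/9.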
> +
>  static void establish_rex (void)
>  {
>    /* Note that legacy encodings have at most 2 non-immediate operands.  */
> @@ -4140,13 +4170,16 @@ static void establish_rex (void)
>       registers to new ones.  */
>  
>    if ((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
> -       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0))
> +       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0
> +	   || i.rex2 != 0))
>        || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
> -	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0)))
> +	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0
> +	      || i.rex2 != 0)))
>      {
>        unsigned int x;
>  
> -      i.rex |= REX_OPCODE;
> +      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
> +	i.rex |= REX_OPCODE;
>        for (x = first; x <= last; x++)
>  	{
>  	  /* Look for 8 bit operand that uses old registers.  */
> @@ -4157,7 +4190,7 @@ static void establish_rex (void)
>  	      /* In case it is "hi" register, give up.  */
>  	      if (i.op[x].regs->reg_num > 3)
>  		as_bad (_("can't encode register '%s%s' in an "
> -			  "instruction requiring REX prefix"),
> +			  "instruction requiring REX/REX2 prefix"),
>  			register_prefix, i.op[x].regs->reg_name);
>  
>  	      /* Otherwise it is equivalent to the extended register.
> @@ -4168,11 +4201,11 @@ static void establish_rex (void)
>  	}
>      }
>  
> -  if (i.rex == 0 && i.rex_encoding)
> +   if (i.rex == 0 && i.rex2 == 0 && (i.rex_encoding || i.rex2_encoding))
>      {
>        /* Check if we can add a REX_OPCODE byte.  Look for 8 bit operand
>  	 that uses legacy register.  If it is "hi" register, don't add
> -	 the REX_OPCODE byte.  */
> +	 rex and rex2 prefix.  */
>        unsigned int x;
>  
>        for (x = first; x <= last; x++)
> @@ -4183,6 +4216,7 @@ static void establish_rex (void)
>  	  {
>  	    gas_assert (!(i.op[x].regs->reg_flags & RegRex));
>  	    i.rex_encoding = false;
> +	    i.rex2_encoding = false;
>  	    break;
>  	  }
>  
> @@ -4190,8 +4224,14 @@ static void establish_rex (void)
>  	i.rex = REX_OPCODE;
>      }
>  
> -  if (i.rex != 0)
> -    add_prefix (REX_OPCODE | i.rex);
> +   if (is_apx_rex2_encoding ())
> +     {
> +       build_rex2_prefix ();
> +       /* The individual REX.RXBW bits got consumed.  */
> +       i.rex &= REX_OPCODE;
> +     }
> +   else if (i.rex != 0)
> +     add_prefix (REX_OPCODE | i.rex);
>  }
>  
>  static void
> @@ -4457,14 +4497,22 @@ optimize_encoding (void)
>  	  i.types[1].bitfield.byte = 1;
>  	  /* Ignore the suffix.  */
>  	  i.suffix = 0;
> -	  /* Convert to byte registers.  */
> +	  /* Convert to byte registers. 8-bit registers are special,
> +	     RegRex64 and non-RegRex64 each have 8 registers.  */
>  	  if (i.types[1].bitfield.word)
> -	    j = 16;
> -	  else if (i.types[1].bitfield.dword)
> +	    /* 32 (or 40) 8-bit registers.  */
>  	    j = 32;
> +	  else if (i.types[1].bitfield.dword)
> +	    /* 32 (or 40) 8-bit registers + 32 16-bit registers.  */
> +	    j = 64;
>  	  else
> -	    j = 48;
> -	  if (!(i.op[1].regs->reg_flags & RegRex) && base_regnum < 4)
> +	    /* 32 (or 40) 8-bit registers + 32 16-bit registers
> +	       + 32 32-bit registers.  */
> +	    j = 96;
> +
> +	  /* In 64-bit mode, the following byte registers cannot be accessed
> +	     if using the Rex and Rex2 prefix: AH, BH, CH, DH */
> +	  if (!(i.op[1].regs->reg_flags & (RegRex | RegRex2)) && base_regnum < 4)
>  	    j += 8;
>  	  i.op[1].regs -= j;
>  	}
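
The new offsets can be checked with a small model of the register table.
This is a sketch under an assumption not visible in this hunk: that the
table lists 40 byte registers (al..bh, then the RegRex64 set axl..dil, then
r8b..r15b, then r16b..r31b) followed by 32 word, 32 dword and 32 qword
registers in number order.  All sketch_* and SK_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed order of the byte registers at the start of the table.  */
static const char *const sketch_byte_regs[40] = {
  "al", "cl", "dl", "bl", "ah", "ch", "dh", "bh",
  "axl", "cxl", "dxl", "bxl", "spl", "bpl", "sil", "dil",
  "r8b", "r9b", "r10b", "r11b", "r12b", "r13b", "r14b", "r15b",
  "r16b", "r17b", "r18b", "r19b", "r20b", "r21b", "r22b", "r23b",
  "r24b", "r25b", "r26b", "r27b", "r28b", "r29b", "r30b", "r31b"
};

/* Assumed table indices where each wider register block starts.  */
enum { SK_WORD_BASE = 40, SK_DWORD_BASE = 72, SK_QWORD_BASE = 104 };

/* Map the table index of a 16/32/64-bit GPR to its byte register.  */
static unsigned int
sketch_byte_index (unsigned int idx)
{
  unsigned int j, n, base_regnum;
  bool extended;

  assert (idx >= SK_WORD_BASE && idx < SK_QWORD_BASE + 32);

  if (idx < SK_DWORD_BASE)
    j = 32;			/* word -> byte */
  else if (idx < SK_QWORD_BASE)
    j = 64;			/* dword -> byte */
  else
    j = 96;			/* qword -> byte */

  n = idx - j - 8;		/* logical register number, 0..31 */
  extended = n >= 8;		/* would carry RegRex and/or RegRex2 */
  base_regnum = n & 7;

  /* Only legacy a/c/d/b have a byte form that needs no REX prefix.  */
  if (!extended && base_regnum < 4)
    j += 8;

  return idx - j;
}
```

With this layout ax maps to al, esp maps to spl, and r16 maps to r16b.
Without the RegRex2 part of the condition, r16 (base_regnum 0) would take
the extra 8 and land on r8b.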
> @@ -5354,6 +5402,9 @@ md_assemble (char *line)
>  	case unsupported_syntax:
>  	  err_msg = _("unsupported syntax");
>  	  break;
> +	case unsupported_EGPR_for_addressing:
> +	  err_msg = _("extended GPR cannot be used as base/index");
> +	  break;
>  	case unsupported:
>  	  as_bad (_("unsupported instruction `%s'"),
>  		  pass1_mnem ? pass1_mnem : insn_name (current_templates.start));
> @@ -5407,6 +5458,9 @@ md_assemble (char *line)
>  	case invalid_dest_and_src_register_set:
>  	  err_msg = _("destination and source registers must be distinct");
>  	  break;
> +	case invalid_pseudo_prefix:
> +	  err_msg = _("rex2 pseudo prefix cannot be used");
> +	  break;
>  	case unsupported_vector_index_register:
>  	  err_msg = _("unsupported vector index register");
>  	  break;
> @@ -5662,6 +5716,13 @@ md_assemble (char *line)
>  	  return;
>  	}
>  
> +      /* Check for explicit REX2 prefix.  */
> +      if (i.rex2_encoding)
> +	{
> +	  as_bad (_("{rex2} prefix invalid with `%s'"), insn_name (&i.tm));
> +	  return;
> +	}
> +
>        if (i.tm.opcode_modifier.vex)
>  	build_vex_prefix (t);
>        else
> @@ -5868,6 +5929,10 @@ parse_insn (const char *line, char *mnemonic, bool prefix_only)
>  		  /* {rex} */
>  		  i.rex_encoding = true;
>  		  break;
> +		case Prefix_REX2:
> +		  /* {rex2} */
> +		  i.rex2_encoding = true;
> +		  break;
>  		case Prefix_NoOptimize:
>  		  /* {nooptimize} */
>  		  i.no_optimize = true;
> @@ -7015,6 +7080,43 @@ VEX_check_encoding (const insn_template *t)
>    return 0;
>  }
>  
> +/* Check if Egprs operands are valid for the instruction.  */
> +
> +static bool
> +check_EgprOperands (const insn_template *t)
> +{
> +  if (!t->opcode_modifier.noegpr)
> +    return 0;
> +
> +  for (unsigned int op = 0; op < i.operands; op++)
> +    {
> +      if (i.types[op].bitfield.class != Reg)
> +	continue;
> +
> +      if (i.op[op].regs->reg_flags & RegRex2)
> +	{
> +	  i.error = register_type_mismatch;
> +	  return 1;
> +	}
> +    }
> +
> +  if ((i.index_reg && (i.index_reg->reg_flags & RegRex2))
> +      || (i.base_reg && (i.base_reg->reg_flags & RegRex2)))
> +    {
> +      i.error = unsupported_EGPR_for_addressing;
> +      return 1;
> +    }
> +
> +  /* Check if pseudo prefix {rex2} is valid.  */
> +  if (i.rex2_encoding)
> +    {
> +      i.error = invalid_pseudo_prefix;
> +      return 1;
> +    }
> +
> +  return 0;
> +}
> +
>  /* Helper function for the progress() macro in match_template().  */
>  static INLINE enum i386_error progress (enum i386_error new,
>  					enum i386_error last,
> @@ -7159,6 +7261,13 @@ match_template (char mnem_suffix)
>  	      continue;
>  	    }
>  
> +	  /* Check if pseudo prefix {rex2} is valid.  */
> +	  if (t->opcode_modifier.noegpr && i.rex2_encoding)
> +	    {
> +	      specific_error = progress (invalid_pseudo_prefix);
> +	      continue;
> +	    }
> +
>  	  /* We've found a match; break out of loop.  */
>  	  break;
>  	}
> @@ -7489,6 +7598,13 @@ match_template (char mnem_suffix)
>  	  continue;
>  	}
>  
> +      /* Check if EGPR operands(r16-r31) are valid.  */
> +      if (check_EgprOperands (t))
> +	{
> +	  specific_error = progress (i.error);
> +	  continue;
> +	}
> +
>        /* Check whether to use the shorter VEX encoding for certain insns where
>  	 the EVEX encoding comes first in the table.  This requires the respective
>  	 AVX-* feature to be explicitly enabled.
> @@ -8387,6 +8503,18 @@ static INLINE void set_rex_vrex (const reg_entry *r, unsigned int rex_bit,
>  
>    if (r->reg_flags & RegVRex)
>      i.vrex |= rex_bit;
> +
> +  if (r->reg_flags & RegRex2)
> +    i.rex2 |= rex_bit;
> +}
> +
> +static INLINE void
> +set_rex_rex2 (const reg_entry *r, unsigned int rex_bit)
> +{
> +  if ((r->reg_flags & RegRex) != 0)
> +    i.rex |= rex_bit;
> +  if ((r->reg_flags & RegRex2) != 0)
> +    i.rex2 |= rex_bit;
>  }
>  
>  static int
> @@ -8870,8 +8998,7 @@ build_modrm_byte (void)
>  		  i.rm.regmem = ESCAPE_TO_TWO_BYTE_ADDRESSING;
>  		  i.types[op] = operand_type_and_not (i.types[op], anydisp);
>  		  i.types[op].bitfield.disp32 = 1;
> -		  if ((i.index_reg->reg_flags & RegRex) != 0)
> -		    i.rex |= REX_X;
> +		  set_rex_rex2 (i.index_reg, REX_X);
>  		}
>  	    }
>  	  /* RIP addressing for 64bit mode.  */
> @@ -8942,8 +9069,7 @@ build_modrm_byte (void)
>  
>  	      if (!i.tm.opcode_modifier.sib)
>  		i.rm.regmem = i.base_reg->reg_num;
> -	      if ((i.base_reg->reg_flags & RegRex) != 0)
> -		i.rex |= REX_B;
> +	      set_rex_rex2 (i.base_reg, REX_B);
>  	      i.sib.base = i.base_reg->reg_num;
>  	      /* x86-64 ignores REX prefix bit here to avoid decoder
>  		 complications.  */
> @@ -8981,8 +9107,7 @@ build_modrm_byte (void)
>  		  else
>  		    i.sib.index = i.index_reg->reg_num;
>  		  i.rm.regmem = ESCAPE_TO_TWO_BYTE_ADDRESSING;
> -		  if ((i.index_reg->reg_flags & RegRex) != 0)
> -		    i.rex |= REX_X;
> +		  set_rex_rex2 (i.index_reg, REX_X);
>  		}
>  
>  	      if (i.disp_operands
> @@ -10126,6 +10251,12 @@ output_insn (const struct last_insn *last_insn)
>  	  for (j = ARRAY_SIZE (i.prefix), q = i.prefix; j > 0; j--, q++)
>  	    if (*q)
>  	      frag_opcode_byte (*q);
> +
> +	  if (is_apx_rex2_encoding ())
> +	    {
> +	      frag_opcode_byte (i.vex.bytes[0]);
> +	      frag_opcode_byte (i.vex.bytes[1]);
> +	    }
>  	}
>        else
>  	{
> @@ -14164,6 +14295,13 @@ static bool check_register (const reg_entry *r)
>  	i.vec_encoding = vex_encoding_error;
>      }
>  
> +  if (r->reg_flags & RegRex2)
> +    {
> +      if (!cpu_arch_flags.bitfield.cpuapx_f
> +	  || flag_code != CODE_64BIT)
> +	return false;
> +    }
> +
>    if (((r->reg_flags & (RegRex64 | RegRex)) || r->reg_type.bitfield.qword)
>        && (!cpu_arch_flags.bitfield.cpu64
>  	  || r->reg_type.bitfield.class != RegCR
> diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
> index 03ee980bef7..21f48c93300 100644
> --- a/gas/doc/c-i386.texi
> +++ b/gas/doc/c-i386.texi
> @@ -217,6 +217,7 @@ accept various extension mnemonics.  For example,
>  @code{avx10.1/256},
>  @code{avx10.1/128},
>  @code{user_msr},
> +@code{apx_f},
>  @code{amx_int8},
>  @code{amx_bf16},
>  @code{amx_fp16},
> @@ -983,6 +984,10 @@ Different encoding options can be specified via pseudo prefixes:
>  instructions (x86-64 only).  Note that this differs from the @samp{rex}
>  prefix which generates REX prefix unconditionally.
>  
> +@item
> +@samp{@{rex2@}} -- prefer REX2 prefix for integer and legacy vector
> +instructions (APX_F only).
> +
>  @item
>  @samp{@{nooptimize@}} -- disable instruction size optimization.
>  @end itemize
> @@ -1663,7 +1668,7 @@ supported on the CPU specified.  The choices for @var{cpu_type} are:
>  @item @samp{.lwp} @tab @samp{.fma4} @tab @samp{.xop} @tab @samp{.cx16}
>  @item @samp{.padlock} @tab @samp{.clzero} @tab @samp{.mwaitx} @tab @samp{.rdpru}
>  @item @samp{.mcommit} @tab @samp{.sev_es} @tab @samp{.snp} @tab @samp{.invlpgb}
> -@item @samp{.tlbsync}
> +@item @samp{.tlbsync} @tab @samp{.apx_f}
>  @end multitable
>  
>  Apart from the warning, there are only two other effects on
> diff --git a/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d b/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d
> index a2b09d2e74f..56834371133 100644
> --- a/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d
> +++ b/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d
> @@ -2,49 +2,4 @@
>  #as: --32
>  #objdump: -dw -Mx86-64 -Mintel
>  #name: x86-64 (ILP32) illegal opcodes (Intel mode)
> -
> -.*: +file format .*
> -
> -Disassembly of section .text:
> -
> -0+ <aaa>:
> -[ 	]*[a-f0-9]+:	37                   	\(bad\)
> -
> -0+1 <aad0>:
> -[ 	]*[a-f0-9]+:	d5                   	\(bad\)
> -[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
> -
> -0+3 <aad1>:
> -[ 	]*[a-f0-9]+:	d5                   	\(bad\)
> -[ 	]*[a-f0-9]+:	02                   	.byte 0x2
> -
> -0+5 <aam0>:
> -[ 	]*[a-f0-9]+:	d4                   	\(bad\)
> -[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
> -
> -0+7 <aam1>:
> -[ 	]*[a-f0-9]+:	d4                   	\(bad\)
> -[ 	]*[a-f0-9]+:	02                   	.byte 0x2
> -
> -0+9 <aas>:
> -[ 	]*[a-f0-9]+:	3f                   	\(bad\)
> -
> -0+a <bound>:
> -[ 	]*[a-f0-9]+:	62                   	.byte 0x62
> -[ 	]*[a-f0-9]+:	10                   	.byte 0x10
> -
> -0+c <daa>:
> -[ 	]*[a-f0-9]+:	27                   	\(bad\)
> -
> -0+d <das>:
> -[ 	]*[a-f0-9]+:	2f                   	\(bad\)
> -
> -0+e <into>:
> -[ 	]*[a-f0-9]+:	ce                   	\(bad\)
> -
> -0+f <pusha>:
> -[ 	]*[a-f0-9]+:	60                   	\(bad\)
> -
> -0+10 <popa>:
> -[ 	]*[a-f0-9]+:	61                   	\(bad\)
> -#pass
> +#dump: ../x86-64-opcode-inval-intel.d
> diff --git a/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval.d b/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval.d
> index 5a17b0b412e..b5233a5cf93 100644
> --- a/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval.d
> +++ b/gas/testsuite/gas/i386/ilp32/x86-64-opcode-inval.d
> @@ -2,49 +2,4 @@
>  #as: --32
>  #objdump: -dw -Mx86-64
>  #name: x86-64 (ILP32) illegal opcodes
> -
> -.*: +file format .*
> -
> -Disassembly of section .text:
> -
> -0+ <aaa>:
> -[ 	]*[a-f0-9]+:	37                   	\(bad\)
> -
> -0+1 <aad0>:
> -[ 	]*[a-f0-9]+:	d5                   	\(bad\)
> -[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
> -
> -0+3 <aad1>:
> -[ 	]*[a-f0-9]+:	d5                   	\(bad\)
> -[ 	]*[a-f0-9]+:	02                   	.byte 0x2
> -
> -0+5 <aam0>:
> -[ 	]*[a-f0-9]+:	d4                   	\(bad\)
> -[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
> -
> -0+7 <aam1>:
> -[ 	]*[a-f0-9]+:	d4                   	\(bad\)
> -[ 	]*[a-f0-9]+:	02                   	.byte 0x2
> -
> -0+9 <aas>:
> -[ 	]*[a-f0-9]+:	3f                   	\(bad\)
> -
> -0+a <bound>:
> -[ 	]*[a-f0-9]+:	62                   	.byte 0x62
> -[ 	]*[a-f0-9]+:	10                   	.byte 0x10
> -
> -0+c <daa>:
> -[ 	]*[a-f0-9]+:	27                   	\(bad\)
> -
> -0+d <das>:
> -[ 	]*[a-f0-9]+:	2f                   	\(bad\)
> -
> -0+e <into>:
> -[ 	]*[a-f0-9]+:	ce                   	\(bad\)
> -
> -0+f <pusha>:
> -[ 	]*[a-f0-9]+:	60                   	\(bad\)
> -
> -0+10 <popa>:
> -[ 	]*[a-f0-9]+:	61                   	\(bad\)
> -#pass
> +#dump: ../x86-64-opcode-inval.d
> diff --git a/gas/testsuite/gas/i386/rex-bad.l b/gas/testsuite/gas/i386/rex-bad.l
> index 407558ec541..abd4d3045d0 100644
> --- a/gas/testsuite/gas/i386/rex-bad.l
> +++ b/gas/testsuite/gas/i386/rex-bad.l
> @@ -3,8 +3,8 @@
>  .*:5: Error: same .*
>  .*:6: Error: same .*
>  .*:7: Error: same .*
> -.*:9: Error: .* REX .*
> -.*:10: Error: .* REX .*
> -.*:12: Error: .* REX .*
> -.*:13: Error: .* REX .*
> +.*:9: Error: .* REX/REX2 .*
> +.*:10: Error: .* REX/REX2 .*
> +.*:12: Error: .* REX/REX2 .*
> +.*:13: Error: .* REX/REX2 .*
>  #pass
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
> new file mode 100644
> index 00000000000..bb5c602a2e2
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
> @@ -0,0 +1,15 @@
> +.*: Assembler messages:
> +.*:4: Error: bad register name `%r17d'
> +.*:7: Error: extended GPR cannot be used as base/index for `xsave'
> +.*:8: Error: extended GPR cannot be used as base/index for `xsave64'
> +.*:9: Error: extended GPR cannot be used as base/index for `xrstor'
> +.*:10: Error: extended GPR cannot be used as base/index for `xrstor64'
> +.*:11: Error: extended GPR cannot be used as base/index for `xsaves'
> +.*:12: Error: extended GPR cannot be used as base/index for `xsaves64'
> +.*:13: Error: extended GPR cannot be used as base/index for `xrstors'
> +.*:14: Error: extended GPR cannot be used as base/index for `xrstors64'
> +.*:15: Error: extended GPR cannot be used as base/index for `xsaveopt'
> +.*:16: Error: extended GPR cannot be used as base/index for `xsaveopt64'
> +.*:17: Error: extended GPR cannot be used as base/index for `xsavec'
> +.*:18: Error: extended GPR cannot be used as base/index for `xsavec64'
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
> new file mode 100644
> index 00000000000..bfb6b3fd03b
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
> @@ -0,0 +1,18 @@
> +# Check illegal 64bit APX_F instructions
> +	.text
> +	.arch .noapx_f
> +	test    $0x7, %r17d
> +	.arch .apx_f
> +	test    $0x7, %r17d
> +	xsave (%r16, %rbx)
> +	xsave64 (%r16, %r31)
> +	xrstor (%r16, %rbx)
> +	xrstor64 (%r16, %rbx)
> +	xsaves (%rbx, %r16)
> +	xsaves64 (%r16, %rbx)
> +	xrstors (%rbx, %r31)
> +	xrstors64 (%r16, %rbx)
> +	xsaveopt (%r16, %rbx)
> +	xsaveopt64 (%r16, %r31)
> +	xsavec (%r16, %rbx)
> +	xsavec64 (%r16, %r31)
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-rex2.d b/gas/testsuite/gas/i386/x86-64-apx-rex2.d
> new file mode 100644
> index 00000000000..e3cd534da11
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.d
> @@ -0,0 +1,83 @@
> +#as:
> +#objdump: -dw
> +#name: x86-64 APX_F use gpr32 with rex2 prefix
> +#source: x86-64-apx-rex2.s
> +
> +.*: +file format .*
> +
> +
> +Disassembly of section .text:
> +
> +0+ <_start>:
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 f6 c0 07[	 ]+test   \$0x7,%r24b
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 f7 c0 07 00 00 00[	 ]+test   \$0x7,%r24d
> +[	 ]*[a-f0-9]+:[	 ]*d5 19 f7 c0 07 00 00 00[	 ]+test   \$0x7,%r24
> +[	 ]*[a-f0-9]+:[	 ]*66 d5 11 f7 c0 07 00[	 ]+test   \$0x7,%r24w
> +[	 ]*[a-f0-9]+:[	 ]*44 0f af f8[	 ]+imul   %eax,%r15d
> +[	 ]*[a-f0-9]+:[	 ]*d5 c0 af c0[	 ]+imul   %eax,%r16d
> +[	 ]*[a-f0-9]+:[	 ]*d5 90 62 12[	 ]+punpckldq %mm2,\(%r18\)
> +[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 00[	 ]+lea    \(%rax\),%r16d
> +[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 08[	 ]+lea    \(%rax\),%r17d
> +[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 10[	 ]+lea    \(%rax\),%r18d
> +[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 18[	 ]+lea    \(%rax\),%r19d
> +[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 20[	 ]+lea    \(%rax\),%r20d
> +[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 28[	 ]+lea    \(%rax\),%r21d
> +[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 30[	 ]+lea    \(%rax\),%r22d
> +[	 ]*[a-f0-9]+:[	 ]*d5 40 8d 38[	 ]+lea    \(%rax\),%r23d
> +[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 00[	 ]+lea    \(%rax\),%r24d
> +[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 08[	 ]+lea    \(%rax\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 10[	 ]+lea    \(%rax\),%r26d
> +[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 18[	 ]+lea    \(%rax\),%r27d
> +[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 20[	 ]+lea    \(%rax\),%r28d
> +[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 28[	 ]+lea    \(%rax\),%r29d
> +[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 30[	 ]+lea    \(%rax\),%r30d
> +[	 ]*[a-f0-9]+:[	 ]*d5 44 8d 38[	 ]+lea    \(%rax\),%r31d
> +[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 05 00 00 00 00[	 ]+lea    0x0\(,%r16,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 0d 00 00 00 00[	 ]+lea    0x0\(,%r17,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 15 00 00 00 00[	 ]+lea    0x0\(,%r18,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 1d 00 00 00 00[	 ]+lea    0x0\(,%r19,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 25 00 00 00 00[	 ]+lea    0x0\(,%r20,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 2d 00 00 00 00[	 ]+lea    0x0\(,%r21,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 35 00 00 00 00[	 ]+lea    0x0\(,%r22,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 20 8d 04 3d 00 00 00 00[	 ]+lea    0x0\(,%r23,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 05 00 00 00 00[	 ]+lea    0x0\(,%r24,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 0d 00 00 00 00[	 ]+lea    0x0\(,%r25,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 15 00 00 00 00[	 ]+lea    0x0\(,%r26,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 1d 00 00 00 00[	 ]+lea    0x0\(,%r27,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 25 00 00 00 00[	 ]+lea    0x0\(,%r28,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 2d 00 00 00 00[	 ]+lea    0x0\(,%r29,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 35 00 00 00 00[	 ]+lea    0x0\(,%r30,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 22 8d 04 3d 00 00 00 00[	 ]+lea    0x0\(,%r31,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 00[	 ]+lea    \(%r16\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 01[	 ]+lea    \(%r17\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 02[	 ]+lea    \(%r18\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 03[	 ]+lea    \(%r19\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 04 24       	lea    \(%r20\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 45 00       	lea    0x0\(%r21\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 06[	 ]+lea    \(%r22\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 10 8d 07[	 ]+lea    \(%r23\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 00[	 ]+lea    \(%r24\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 01[	 ]+lea    \(%r25\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 02[	 ]+lea    \(%r26\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 03[	 ]+lea    \(%r27\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 04 24       	lea    \(%r28\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 45 00       	lea    0x0\(%r29\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 06          	lea    \(%r30\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 11 8d 07          	lea    \(%r31\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*4c 8d 38             	lea    \(%rax\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*d5 48 8d 00          	lea    \(%rax\),%r16
> +[	 ]*[a-f0-9]+:[	 ]*49 8d 07             	lea    \(%r15\),%rax
> +[	 ]*[a-f0-9]+:[	 ]*d5 18 8d 00          	lea    \(%r16\),%rax
> +[	 ]*[a-f0-9]+:[	 ]*4a 8d 04 3d 00 00 00 00 	lea    0x0\(,%r15,1\),%rax
> +[	 ]*[a-f0-9]+:[	 ]*d5 28 8d 04 05 00 00 00 00 	lea    0x0\(,%r16,1\),%rax
> +[	 ]*[a-f0-9]+:[	 ]*d5 1c 03 00          	add    \(%r16\),%r8
> +[	 ]*[a-f0-9]+:[	 ]*d5 1c 03 38          	add    \(%r16\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*d5 4a 8b 04 0d 00 00 00 00 	mov    0x0\(,%r9,1\),%r16
> +[	 ]*[a-f0-9]+:[	 ]*d5 4a 8b 04 35 00 00 00 00 	mov    0x0\(,%r14,1\),%r16
> +[	 ]*[a-f0-9]+:[	 ]*d5 4d 2b 3a          	sub    \(%r10\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*d5 4d 2b 7d 00       	sub    0x0\(%r13\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*d5 30 8d 44 20 01    	lea    0x1\(%r16,%r20,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 76 8d 7c 20 01    	lea    0x1\(%r16,%r28,1\),%r31d
> +[	 ]*[a-f0-9]+:[	 ]*d5 12 8d 84 04 81 00 00 00 	lea    0x81\(%r20,%r8,1\),%eax
> +[	 ]*[a-f0-9]+:[	 ]*d5 57 8d bc 04 81 00 00 00 	lea    0x81\(%r28,%r8,1\),%r31d
> +#pass
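
For anyone checking the expected bytes in this dump by hand, here is a minimal sketch of how the byte after 0xd5 splits into fields. The bit layout (M0, R4, X4, B4, W, R3, X3, B3 from bit 7 down to bit 0) is inferred from the encodings listed above. The names rex2_fields, rex2_decode and rex2_regno are hypothetical, made up for illustration, and are not part of the patch or of binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded payload; not a binutils type.  */
struct rex2_fields
{
  unsigned m0, r4, x4, b4, w, r3, x3, b3;
};

/* Split the byte that follows 0xd5 into its eight one-bit fields.  */
static struct rex2_fields
rex2_decode (uint8_t payload)
{
  struct rex2_fields f;

  f.m0 = (payload >> 7) & 1;	/* opcode map: 0 = map0, 1 = map1 (0x0f) */
  f.r4 = (payload >> 6) & 1;	/* bit 4 of ModRM.reg */
  f.x4 = (payload >> 5) & 1;	/* bit 4 of SIB.index */
  f.b4 = (payload >> 4) & 1;	/* bit 4 of ModRM.rm / SIB.base */
  f.w  = (payload >> 3) & 1;	/* 64-bit operand size */
  f.r3 = (payload >> 2) & 1;	/* bit 3 of ModRM.reg */
  f.x3 = (payload >> 1) & 1;	/* bit 3 of SIB.index */
  f.b3 = payload & 1;		/* bit 3 of ModRM.rm / SIB.base */
  return f;
}

/* Combine a 3-bit ModRM/SIB field with its two extension bits to get a
   register number in the range 0..31.  */
static unsigned
rex2_regno (unsigned hi4, unsigned hi3, unsigned field)
{
  assert (field < 8 && hi4 < 2 && hi3 < 2);
  return (hi4 << 4) | (hi3 << 3) | field;
}
```

Taking "d5 76 8d 7c 20 01" from the SIB group as an example: the payload is 0x76, ModRM.reg is 7, SIB.index is 4 and SIB.base is 0. With the sketch above these come out as register 31, index 28 and base 16, which matches the expected "lea 0x1(%r16,%r28,1),%r31d".
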
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-rex2.s b/gas/testsuite/gas/i386/x86-64-apx-rex2.s
> new file mode 100644
> index 00000000000..eaaaaa77dd7
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.s
> @@ -0,0 +1,85 @@
> +# Check 64bit instructions with rex2 prefix encoding
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +         test	$0x7, %r24b
> +         test	$0x7, %r24d
> +         test	$0x7, %r24
> +         test	$0x7, %r24w
> +## REX2.M bit
> +         imull	%eax, %r15d
> +         imull	%eax, %r16d
> +         punpckldq (%r18), %mm2
> +## REX2.R4 bit
> +         leal	(%rax), %r16d
> +         leal	(%rax), %r17d
> +         leal	(%rax), %r18d
> +         leal	(%rax), %r19d
> +         leal	(%rax), %r20d
> +         leal	(%rax), %r21d
> +         leal	(%rax), %r22d
> +         leal	(%rax), %r23d
> +         leal	(%rax), %r24d
> +         leal	(%rax), %r25d
> +         leal	(%rax), %r26d
> +         leal	(%rax), %r27d
> +         leal	(%rax), %r28d
> +         leal	(%rax), %r29d
> +         leal	(%rax), %r30d
> +         leal	(%rax), %r31d
> +## REX2.X4 bit
> +         leal	(,%r16), %eax
> +         leal	(,%r17), %eax
> +         leal	(,%r18), %eax
> +         leal	(,%r19), %eax
> +         leal	(,%r20), %eax
> +         leal	(,%r21), %eax
> +         leal	(,%r22), %eax
> +         leal	(,%r23), %eax
> +         leal	(,%r24), %eax
> +         leal	(,%r25), %eax
> +         leal	(,%r26), %eax
> +         leal	(,%r27), %eax
> +         leal	(,%r28), %eax
> +         leal	(,%r29), %eax
> +         leal	(,%r30), %eax
> +         leal	(,%r31), %eax
> +## REX2.B4 bit
> +         leal	(%r16), %eax
> +         leal	(%r17), %eax
> +         leal	(%r18), %eax
> +         leal	(%r19), %eax
> +         leal	(%r20), %eax
> +         leal	(%r21), %eax
> +         leal	(%r22), %eax
> +         leal	(%r23), %eax
> +         leal	(%r24), %eax
> +         leal	(%r25), %eax
> +         leal	(%r26), %eax
> +         leal	(%r27), %eax
> +         leal	(%r28), %eax
> +         leal	(%r29), %eax
> +         leal	(%r30), %eax
> +         leal	(%r31), %eax
> +## REX2.W bit
> +         leaq	(%rax), %r15
> +         leaq	(%rax), %r16
> +         leaq	(%r15), %rax
> +         leaq	(%r16), %rax
> +         leaq	(,%r15), %rax
> +         leaq	(,%r16), %rax
> +## REX2.R3 bit
> +         add    (%r16), %r8
> +         add    (%r16), %r15
> +## REX2.X3 bit
> +         mov    (,%r9), %r16
> +         mov    (,%r14), %r16
> +## REX2.B3 bit
> +	 sub   (%r10), %r31
> +	 sub   (%r13), %r31
> +## SIB
> +         leal	1(%r16, %r20), %eax
> +         leal	1(%r16, %r28), %r31d
> +         leal	129(%r20, %r8), %eax
> +         leal	129(%r28, %r8), %r31d
> diff --git a/gas/testsuite/gas/i386/x86-64-opcode-inval-intel.d b/gas/testsuite/gas/i386/x86-64-opcode-inval-intel.d
> index 6ee5b2f95ce..66c4d2cddc0 100644
> --- a/gas/testsuite/gas/i386/x86-64-opcode-inval-intel.d
> +++ b/gas/testsuite/gas/i386/x86-64-opcode-inval-intel.d
> @@ -10,41 +10,33 @@ Disassembly of section .text:
>  0+ <aaa>:
>  [ 	]*[a-f0-9]+:	37                   	\(bad\)
>  
> -0+1 <aad0>:
> -[ 	]*[a-f0-9]+:	d5                   	\(bad\)
> -[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
> -
> -0+3 <aad1>:
> -[ 	]*[a-f0-9]+:	d5                   	\(bad\)
> -[ 	]*[a-f0-9]+:	02                   	.byte 0x2
> -
> -0+5 <aam0>:
> +0+1 <aam0>:
>  [ 	]*[a-f0-9]+:	d4                   	\(bad\)
>  [ 	]*[a-f0-9]+:	0a                   	.byte 0xa
>  
> -0+7 <aam1>:
> +0+3 <aam1>:
>  [ 	]*[a-f0-9]+:	d4                   	\(bad\)
>  [ 	]*[a-f0-9]+:	02                   	.byte 0x2
>  
> -0+9 <aas>:
> +0+5 <aas>:
>  [ 	]*[a-f0-9]+:	3f                   	\(bad\)
>  
> -0+a <bound>:
> +0+6 <bound>:
>  [ 	]*[a-f0-9]+:	62                   	.byte 0x62
>  [ 	]*[a-f0-9]+:	10                   	.byte 0x10
>  
> -0+c <daa>:
> +0+8 <daa>:
>  [ 	]*[a-f0-9]+:	27                   	\(bad\)
>  
> -0+d <das>:
> +0+9 <das>:
>  [ 	]*[a-f0-9]+:	2f                   	\(bad\)
>  
> -0+e <into>:
> +0+a <into>:
>  [ 	]*[a-f0-9]+:	ce                   	\(bad\)
>  
> -0+f <pusha>:
> +0+b <pusha>:
>  [ 	]*[a-f0-9]+:	60                   	\(bad\)
>  
> -0+10 <popa>:
> +0+c <popa>:
>  [ 	]*[a-f0-9]+:	61                   	\(bad\)
>  #pass
> diff --git a/gas/testsuite/gas/i386/x86-64-opcode-inval.d b/gas/testsuite/gas/i386/x86-64-opcode-inval.d
> index 12f02c1766c..fbb850b56da 100644
> --- a/gas/testsuite/gas/i386/x86-64-opcode-inval.d
> +++ b/gas/testsuite/gas/i386/x86-64-opcode-inval.d
> @@ -9,41 +9,33 @@ Disassembly of section .text:
>  0+ <aaa>:
>  [ 	]*[a-f0-9]+:	37                   	\(bad\)
>  
> -0+1 <aad0>:
> -[ 	]*[a-f0-9]+:	d5                   	\(bad\)
> -[ 	]*[a-f0-9]+:	0a                   	.byte 0xa
> -
> -0+3 <aad1>:
> -[ 	]*[a-f0-9]+:	d5                   	\(bad\)
> -[ 	]*[a-f0-9]+:	02                   	.byte 0x2
> -
> -0+5 <aam0>:
> +0+1 <aam0>:
>  [ 	]*[a-f0-9]+:	d4                   	\(bad\)
>  [ 	]*[a-f0-9]+:	0a                   	.byte 0xa
>  
> -0+7 <aam1>:
> +0+3 <aam1>:
>  [ 	]*[a-f0-9]+:	d4                   	\(bad\)
>  [ 	]*[a-f0-9]+:	02                   	.byte 0x2
>  
> -0+9 <aas>:
> +0+5 <aas>:
>  [ 	]*[a-f0-9]+:	3f                   	\(bad\)
>  
> -0+a <bound>:
> +0+6 <bound>:
>  [ 	]*[a-f0-9]+:	62                   	.byte 0x62
>  [ 	]*[a-f0-9]+:	10                   	.byte 0x10
>  
> -0+c <daa>:
> +0+8 <daa>:
>  [ 	]*[a-f0-9]+:	27                   	\(bad\)
>  
> -0+d <das>:
> +0+9 <das>:
>  [ 	]*[a-f0-9]+:	2f                   	\(bad\)
>  
> -0+e <into>:
> +0+a <into>:
>  [ 	]*[a-f0-9]+:	ce                   	\(bad\)
>  
> -0+f <pusha>:
> +0+b <pusha>:
>  [ 	]*[a-f0-9]+:	60                   	\(bad\)
>  
> -0+10 <popa>:
> +0+c <popa>:
>  [ 	]*[a-f0-9]+:	61                   	\(bad\)
>  #pass
> diff --git a/gas/testsuite/gas/i386/x86-64-opcode-inval.s b/gas/testsuite/gas/i386/x86-64-opcode-inval.s
> index 6cbfe7705a8..fbcda3df773 100644
> --- a/gas/testsuite/gas/i386/x86-64-opcode-inval.s
> +++ b/gas/testsuite/gas/i386/x86-64-opcode-inval.s
> @@ -2,10 +2,6 @@
>  # All the followings are illegal opcodes for x86-64.
>  aaa:
>  	aaa
> -aad0:
> -	aad
> -aad1:
> -	aad $2
>  aam0:
>  	aam
>  aam1:
> diff --git a/gas/testsuite/gas/i386/x86-64-pseudos-bad.l b/gas/testsuite/gas/i386/x86-64-pseudos-bad.l
> index 3f9f67fcf4b..a72f847085d 100644
> --- a/gas/testsuite/gas/i386/x86-64-pseudos-bad.l
> +++ b/gas/testsuite/gas/i386/x86-64-pseudos-bad.l
> @@ -1,6 +1,71 @@
>  .*: Assembler messages:
> -.*:3: Error: .*`vmovaps'.*
> -.*:4: Error: .*`vmovaps'.*
> -.*:5: Error: .*`vmovaps'.*
> -.*:6: Error: .*`vmovaps'.*
> -.*:7: Error: .*`rorx'.*
> +.*:[0-9]+: Error: .*`vmovaps'.*
> +.*:[0-9]+: Error: .*`vmovaps'.*
> +.*:[0-9]+: Error: .*`vmovaps'.*
> +.*:[0-9]+: Error: .*`vmovaps'.*
> +.*:[0-9]+: Error: .*`rorx'.*
> +.*:[0-9]+: Error: .*`vmovaps'.*
> +.*:[0-9]+: Error: .*`xsave'.*
> +.*:[0-9]+: Error: .*`xsaves'.*
> +.*:[0-9]+: Error: .*`xsaves64'.*
> +.*:[0-9]+: Error: .*`xsavec'.*
> +.*:[0-9]+: Error: .*`xrstors'.*
> +.*:[0-9]+: Error: .*`xrstors64'.*
> +.*:[0-9]+: Error: .*`mov'.*
> +.*:[0-9]+: Error: .*`movabs'.*
> +.*:[0-9]+: Error: .*`cmps'.*
> +.*:[0-9]+: Error: .*`lods'.*
> +.*:[0-9]+: Error: .*`lods'.*
> +.*:[0-9]+: Error: .*`lods'.*
> +.*:[0-9]+: Error: .*`movs'.*
> +.*:[0-9]+: Error: .*`movs'.*
> +.*:[0-9]+: Error: .*`scas'.*
> +.*:[0-9]+: Error: .*`scas'.*
> +.*:[0-9]+: Error: .*`scas'.*
> +.*:[0-9]+: Error: .*`stos'.*
> +.*:[0-9]+: Error: .*`stos'.*
> +.*:[0-9]+: Error: .*`stos'.*
> +.*:[0-9]+: Error: .*`jo'.*
> +.*:[0-9]+: Error: .*`jno'.*
> +.*:[0-9]+: Error: .*`jb'.*
> +.*:[0-9]+: Error: .*`jae'.*
> +.*:[0-9]+: Error: .*`je'.*
> +.*:[0-9]+: Error: .*`jne'.*
> +.*:[0-9]+: Error: .*`jbe'.*
> +.*:[0-9]+: Error: .*`ja'.*
> +.*:[0-9]+: Error: .*`js'.*
> +.*:[0-9]+: Error: .*`jns'.*
> +.*:[0-9]+: Error: .*`jp'.*
> +.*:[0-9]+: Error: .*`jnp'.*
> +.*:[0-9]+: Error: .*`jl'.*
> +.*:[0-9]+: Error: .*`jge'.*
> +.*:[0-9]+: Error: .*`jle'.*
> +.*:[0-9]+: Error: .*`jg'.*
> +.*:[0-9]+: Error: .*`jo'.*
> +.*:[0-9]+: Error: .*`jno'.*
> +.*:[0-9]+: Error: .*`jb'.*
> +.*:[0-9]+: Error: .*`jae'.*
> +.*:[0-9]+: Error: .*`je'.*
> +.*:[0-9]+: Error: .*`jne'.*
> +.*:[0-9]+: Error: .*`jbe'.*
> +.*:[0-9]+: Error: .*`ja'.*
> +.*:[0-9]+: Error: .*`js'.*
> +.*:[0-9]+: Error: .*`jns'.*
> +.*:[0-9]+: Error: .*`jp'.*
> +.*:[0-9]+: Error: .*`jnp'.*
> +.*:[0-9]+: Error: .*`jl'.*
> +.*:[0-9]+: Error: .*`jge'.*
> +.*:[0-9]+: Error: .*`jle'.*
> +.*:[0-9]+: Error: .*`jg'.*
> +.*:[0-9]+: Error: .*`in'.*
> +.*:[0-9]+: Error: .*`in'.*
> +.*:[0-9]+: Error: .*`out'.*
> +.*:[0-9]+: Error: .*`out'.*
> +.*:[0-9]+: Error: .*`jmp'.*
> +.*:[0-9]+: Error: .*`loop'.*
> +.*:[0-9]+: Error: .*`wrmsr'.*
> +.*:[0-9]+: Error: .*`rdtsc'.*
> +.*:[0-9]+: Error: .*`rdmsr'.*
> +.*:[0-9]+: Error: .*`sysenter'.*
> +.*:[0-9]+: Error: .*`sysexit'.*
> +.*:[0-9]+: Error: .*`rdpmc'.*
> diff --git a/gas/testsuite/gas/i386/x86-64-pseudos-bad.s b/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
> index 3b923593a6a..54c17a9eab7 100644
> --- a/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
> +++ b/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
> @@ -5,3 +5,77 @@ pseudos:
>  	{rex} vmovaps %xmm7,%xmm2
>  	{rex} vmovaps %xmm17,%xmm2
>  	{rex} rorx $7,%eax,%ebx
> +	{rex2} vmovaps %xmm7,%xmm2
> +	{rex2} xsave (%rax)
> +	{rex2} xsaves (%ecx)
> +	{rex2} xsaves64 (%ecx)
> +	{rex2} xsavec (%ecx)
> +	{rex2} xrstors (%ecx)
> +	{rex2} xrstors64 (%ecx)
> +
> +	#All opcodes in the row 0xA* (map0) prefixed REX2 are illegal.
> +	#{rex2} test (0xa8) is a special case; it is remapped to test (0xf6).
> +	{rex2} mov    0x90909090,%al
> +	{rex2} movabs 0x1,%al
> +	{rex2} cmpsb  %es:(%edi),%ds:(%esi)
> +	{rex2} lodsb
> +	{rex2} lods   %ds:(%esi),%al
> +	{rex2} lodsb   (%esi)
> +	{rex2} movs
> +	{rex2} movs   (%esi), (%edi)
> +	{rex2} scasl
> +	{rex2} scas   %es:(%edi),%eax
> +	{rex2} scasb   (%edi)
> +	{rex2} stosb
> +	{rex2} stosb   (%edi)
> +	{rex2} stos   %eax,%es:(%edi)
> +
> +	#All opcodes in the row 0x7* (map0) and 0x8* (map1) prefixed REX2 are illegal.
> +	{rex2} jo     .+2-0x70
> +	{rex2} jno    .+2-0x70
> +	{rex2} jb     .+2-0x70
> +	{rex2} jae    .+2-0x70
> +	{rex2} je     .+2-0x70
> +	{rex2} jne    .+2-0x70
> +	{rex2} jbe    .+2-0x70
> +	{rex2} ja     .+2-0x70
> +	{rex2} js     .+2-0x70
> +	{rex2} jns    .+2-0x70
> +	{rex2} jp     .+2-0x70
> +	{rex2} jnp    .+2-0x70
> +	{rex2} jl     .+2-0x70
> +	{rex2} jge    .+2-0x70
> +	{rex2} jle    .+2-0x70
> +	{rex2} jg     .+2-0x70
> +	{rex2} jo     .+6+0x90909090
> +	{rex2} jno    .+6+0x90909090
> +	{rex2} jb     .+6+0x90909090
> +	{rex2} jae    .+6+0x90909090
> +	{rex2} je     .+6+0x90909090
> +	{rex2} jne    .+6+0x90909090
> +	{rex2} jbe    .+6+0x90909090
> +	{rex2} ja     .+6+0x90909090
> +	{rex2} js     .+6+0x90909090
> +	{rex2} jns    .+6+0x90909090
> +	{rex2} jp     .+6+0x90909090
> +	{rex2} jnp    .+6+0x90909090
> +	{rex2} jl     .+6+0x90909090
> +	{rex2} jge    .+6+0x90909090
> +	{rex2} jle    .+6+0x90909090
> +	{rex2} jg     .+6+0x90909090
> +
> +	#All opcodes in the row 0xE* (map0) prefixed REX2 are illegal.
> +	{rex2} in $0x90,%al
> +	{rex2} in $0x90
> +	{rex2} out $0x90,%al
> +	{rex2} out $0x90
> +	{rex2} jmp  *%eax
> +	{rex2} loop foo
> +
> +	#All opcodes in the row 0x3* (map1) prefixed REX2 are illegal.
> +	{rex2} wrmsr
> +	{rex2} rdtsc
> +	{rex2} rdmsr
> +	{rex2} sysenter
> +	{rex2} sysexitl
> +	{rex2} rdpmc
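
The row comments in this test can be summarised as a small predicate. This is only a sketch of the rule as those comments state it. The patch itself appears to flag individual templates (the noegpr check in match_template) and individual disassembler table entries (PREFIX_REX2_ILLEGAL) rather than computing anything from the opcode row. The name rex2_row_is_illegal is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch.  MAP is 0 for one-byte
   opcodes and 1 for opcodes behind the 0x0f escape.  Returns true if
   the test comments list the opcode's row as illegal with REX2.  */
static bool
rex2_row_is_illegal (unsigned map, uint8_t opcode)
{
  unsigned row = opcode >> 4;

  if (map == 0)
    return row == 0x7 || row == 0xa || row == 0xe;
  if (map == 1)
    return row == 0x3 || row == 0x8;
  return false;
}
```

Note that 0xa8 (test) falls in an illegal row under this rule. According to the comment in the test, the assembler handles that case by switching to the 0xf6 form instead of rejecting it, so a real implementation would need that exception.
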
> diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.d b/gas/testsuite/gas/i386/x86-64-pseudos.d
> index 866a804ab92..19dcd8415ac 100644
> --- a/gas/testsuite/gas/i386/x86-64-pseudos.d
> +++ b/gas/testsuite/gas/i386/x86-64-pseudos.d
> @@ -404,6 +404,18 @@ Disassembly of section .text:
>   +[a-f0-9]+:	41 0f 28 10          	movaps \(%r8\),%xmm2
>   +[a-f0-9]+:	40 0f 38 01 01       	rex phaddw \(%rcx\),%mm0
>   +[a-f0-9]+:	41 0f 38 01 00       	phaddw \(%r8\),%mm0
> + +[a-f0-9]+:	88 c4                	mov    %al,%ah
> + +[a-f0-9]+:	d5 00 d3 e0          	{rex2 0x0} shl %cl,%eax
> + +[a-f0-9]+:	d5 00 38 ca          	{rex2 0x0} cmp %cl,%dl
> + +[a-f0-9]+:	d5 00 b3 01          	{rex2 0x0} mov \$(0x)?1,%bl
> + +[a-f0-9]+:	d5 00 89 c3          	{rex2 0x0} mov %eax,%ebx
> + +[a-f0-9]+:	d5 01 89 c6          	{rex2 0x1} mov %eax,%r14d
> + +[a-f0-9]+:	d5 01 89 00          	{rex2 0x1} mov %eax,\(%r8\)
> + +[a-f0-9]+:	d5 80 28 d7          	{rex2 0x80} movaps %xmm7,%xmm2
> + +[a-f0-9]+:	d5 84 28 e7          	{rex2 0x84} movaps %xmm7,%xmm12
> + +[a-f0-9]+:	d5 80 28 11          	{rex2 0x80} movaps \(%rcx\),%xmm2
> + +[a-f0-9]+:	d5 81 28 10          	{rex2 0x81} movaps \(%r8\),%xmm2
> + +[a-f0-9]+:	d5 80 d5 f0          	{rex2 0x80} pmullw %mm0,%mm6
>   +[a-f0-9]+:	8a 45 00             	mov    0x0\(%rbp\),%al
>   +[a-f0-9]+:	8a 45 00             	mov    0x0\(%rbp\),%al
>   +[a-f0-9]+:	8a 85 00 00 00 00    	mov    0x0\(%rbp\),%al
> @@ -458,6 +470,15 @@ Disassembly of section .text:
>   +[a-f0-9]+:	41 0f 28 10          	movaps \(%r8\),%xmm2
>   +[a-f0-9]+:	40 0f 38 01 01       	rex phaddw \(%rcx\),%mm0
>   +[a-f0-9]+:	41 0f 38 01 00       	phaddw \(%r8\),%mm0
> + +[a-f0-9]+:	88 c4                	mov    %al,%ah
> + +[a-f0-9]+:	d5 00 89 c3          	{rex2 0x0} mov %eax,%ebx
> + +[a-f0-9]+:	d5 01 89 c6          	{rex2 0x1} mov %eax,%r14d
> + +[a-f0-9]+:	d5 01 89 00          	{rex2 0x1} mov %eax,\(%r8\)
> + +[a-f0-9]+:	d5 80 28 d7          	{rex2 0x80} movaps %xmm7,%xmm2
> + +[a-f0-9]+:	d5 84 28 e7          	{rex2 0x84} movaps %xmm7,%xmm12
> + +[a-f0-9]+:	d5 80 28 11          	{rex2 0x80} movaps \(%rcx\),%xmm2
> + +[a-f0-9]+:	d5 81 28 10          	{rex2 0x81} movaps \(%r8\),%xmm2
> + +[a-f0-9]+:	d5 80 d5 f0          	{rex2 0x80} pmullw %mm0,%mm6
>   +[a-f0-9]+:	8a 45 00             	mov    0x0\(%rbp\),%al
>   +[a-f0-9]+:	8a 45 00             	mov    0x0\(%rbp\),%al
>   +[a-f0-9]+:	8a 85 00 00 00 00    	mov    0x0\(%rbp\),%al
> diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.s b/gas/testsuite/gas/i386/x86-64-pseudos.s
> index 06f0b62d049..5a53c363615 100644
> --- a/gas/testsuite/gas/i386/x86-64-pseudos.s
> +++ b/gas/testsuite/gas/i386/x86-64-pseudos.s
> @@ -360,6 +360,18 @@ _start:
>  	{rex} movaps (%r8),%xmm2
>  	{rex} phaddw (%rcx),%mm0
>  	{rex} phaddw (%r8),%mm0
> +	{rex2} mov %al,%ah
> +	{rex2} shl %cl, %eax
> +	{rex2} cmp %cl, %dl
> +	{rex2} mov $1, %bl
> +	{rex2} movl %eax,%ebx
> +	{rex2} movl %eax,%r14d
> +	{rex2} movl %eax,(%r8)
> +	{rex2} movaps %xmm7,%xmm2
> +	{rex2} movaps %xmm7,%xmm12
> +	{rex2} movaps (%rcx),%xmm2
> +	{rex2} movaps (%r8),%xmm2
> +	{rex2} pmullw %mm0,%mm6
>  
>  	movb (%rbp),%al
>  	{disp8} movb (%rbp),%al
> @@ -422,6 +434,15 @@ _start:
>  	{rex} movaps xmm2,XMMWORD PTR [r8]
>  	{rex} phaddw mm0,QWORD PTR [rcx]
>  	{rex} phaddw mm0,QWORD PTR [r8]
> +	{rex2} mov ah,al
> +	{rex2} mov ebx,eax
> +	{rex2} mov r14d,eax
> +	{rex2} mov DWORD PTR [r8],eax
> +	{rex2} movaps xmm2,xmm7
> +	{rex2} movaps xmm12,xmm7
> +	{rex2} movaps xmm2,XMMWORD PTR [rcx]
> +	{rex2} movaps xmm2,XMMWORD PTR [r8]
> +	{rex2} pmullw mm6,mm0
>  
>  	mov al, BYTE PTR [rbp]
>  	{disp8} mov al, BYTE PTR [rbp]
> diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> index e4b0cc8b85b..91c068d5b40 100644
> --- a/gas/testsuite/gas/i386/x86-64.exp
> +++ b/gas/testsuite/gas/i386/x86-64.exp
> @@ -363,6 +363,8 @@ run_dump_test "x86-64-avx512f-rcigrne-intel"
>  run_dump_test "x86-64-avx512f-rcigrne"
>  run_dump_test "x86-64-avx512f-rcigru-intel"
>  run_dump_test "x86-64-avx512f-rcigru"
> +run_list_test "x86-64-apx-egpr-inval"
> +run_dump_test "x86-64-apx-rex2"
>  run_dump_test "x86-64-avx512f-rcigrz-intel"
>  run_dump_test "x86-64-avx512f-rcigrz"
>  run_dump_test "x86-64-clwb"
> diff --git a/include/opcode/i386.h b/include/opcode/i386.h
> index dec7652c1cc..2823d02c68a 100644
> --- a/include/opcode/i386.h
> +++ b/include/opcode/i386.h
> @@ -112,9 +112,13 @@
>  /* x86-64 extension prefix.  */
>  #define REX_OPCODE	0x40
>  
> +#define REX2_OPCODE	0xd5
> +
>  /* Non-zero if OPCODE is the rex prefix.  */
>  #define REX_PREFIX_P(opcode) (((opcode) & 0xf0) == REX_OPCODE)
>  
> +/* M0 bit in the rex2 prefix selects map0 (clear) or map1 (set).  */
> +#define REX2_M 0x8
>  /* Indicates 64 bit operand size.  */
>  #define REX_W	8
>  /* High extension to reg field of modrm byte.  */
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index e78a2a9350e..4d6d547b2b6 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -144,6 +144,12 @@ struct instr_info
>    /* Bits of REX we've already used.  */
>    uint8_t rex_used;
>  
> +  /* Record W R4 X4 B4 bits for rex2.  */
> +  unsigned char rex2;
> +  /* Bits of rex2 we've already used.  */
> +  unsigned char rex2_used;
> +  unsigned char rex2_payload;
> +
>    bool need_modrm;
>    unsigned char need_vex;
>    bool has_sib;
> @@ -169,6 +175,7 @@ struct instr_info
>    signed char last_data_prefix;
>    signed char last_addr_prefix;
>    signed char last_rex_prefix;
> +  signed char last_rex2_prefix;
>    signed char last_seg_prefix;
>    signed char fwait_prefix;
>    /* The active segment register prefix.  */
> @@ -265,8 +272,13 @@ struct dis_private {
>    {							\
>      if (value)						\
>        {							\
> -	if ((ins->rex & value))				\
> +	if (ins->rex & value)				\
>  	  ins->rex_used |= (value) | REX_OPCODE;	\
> +	if (ins->rex2 & value)				\
> +	  {						\
> +	    ins->rex2_used |= (value);			\
> +	    ins->rex_used |= REX_OPCODE;		\
> +	  }						\
>        }							\
>      else						\
>        ins->rex_used |= REX_OPCODE;			\
> @@ -276,6 +288,7 @@ struct dis_private {
>  #define EVEX_b_used 1
>  #define EVEX_len_used 2
>  
> +
>  /* Flags stored in PREFIXES.  */
>  #define PREFIX_REPZ 1
>  #define PREFIX_REPNZ 2
> @@ -289,6 +302,7 @@ struct dis_private {
>  #define PREFIX_DATA 0x200
>  #define PREFIX_ADDR 0x400
>  #define PREFIX_FWAIT 0x800
> +#define PREFIX_REX2 0x1000
>  
>  /* Make sure that bytes from INFO->PRIVATE_DATA->BUFFER (inclusive)
>     to ADDR (exclusive) are valid.  Returns true for success, false
> @@ -370,6 +384,7 @@ fetch_error (const instr_info *ins)
>  #define PREFIX_IGNORED_DATA	(PREFIX_DATA << PREFIX_IGNORED_SHIFT)
>  #define PREFIX_IGNORED_ADDR	(PREFIX_ADDR << PREFIX_IGNORED_SHIFT)
>  #define PREFIX_IGNORED_LOCK	(PREFIX_LOCK << PREFIX_IGNORED_SHIFT)
> +#define PREFIX_REX2_ILLEGAL	(PREFIX_REX2 << PREFIX_IGNORED_SHIFT)
>  
>  /* Opcode prefixes.  */
>  #define PREFIX_OPCODE		(PREFIX_REPZ \
> @@ -1888,23 +1903,23 @@ static const struct dis386 dis386[] = {
>    { "outs{b|}",		{ indirDXr, Xb }, 0 },
>    { X86_64_TABLE (X86_64_6F) },
>    /* 70 */
> -  { "joH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jnoH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jbH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jaeH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jeH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jneH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jbeH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jaH",		{ Jb, BND, cond_jump_flag }, 0 },
> +  { "joH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jnoH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jbH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jaeH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jeH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jneH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jbeH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jaH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
>    /* 78 */
> -  { "jsH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jnsH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jpH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jnpH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jlH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jgeH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jleH",		{ Jb, BND, cond_jump_flag }, 0 },
> -  { "jgH",		{ Jb, BND, cond_jump_flag }, 0 },
> +  { "jsH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jnsH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jpH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jnpH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jlH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jgeH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jleH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jgH",		{ Jb, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
>    /* 80 */
>    { REG_TABLE (REG_80) },
>    { REG_TABLE (REG_81) },
> @@ -1942,23 +1957,23 @@ static const struct dis386 dis386[] = {
>    { "sahf",		{ XX }, 0 },
>    { "lahf",		{ XX }, 0 },
>    /* a0 */
> -  { "mov%LB",		{ AL, Ob }, 0 },
> -  { "mov%LS",		{ eAX, Ov }, 0 },
> -  { "mov%LB",		{ Ob, AL }, 0 },
> -  { "mov%LS",		{ Ov, eAX }, 0 },
> -  { "movs{b|}",		{ Ybr, Xb }, 0 },
> -  { "movs{R|}",		{ Yvr, Xv }, 0 },
> -  { "cmps{b|}",		{ Xb, Yb }, 0 },
> -  { "cmps{R|}",		{ Xv, Yv }, 0 },
> +  { "mov%LB",		{ AL, Ob }, PREFIX_REX2_ILLEGAL },
> +  { "mov%LS",		{ eAX, Ov }, PREFIX_REX2_ILLEGAL },
> +  { "mov%LB",		{ Ob, AL }, PREFIX_REX2_ILLEGAL },
> +  { "mov%LS",		{ Ov, eAX }, PREFIX_REX2_ILLEGAL },
> +  { "movs{b|}",		{ Ybr, Xb }, PREFIX_REX2_ILLEGAL },
> +  { "movs{R|}",		{ Yvr, Xv }, PREFIX_REX2_ILLEGAL },
> +  { "cmps{b|}",		{ Xb, Yb }, PREFIX_REX2_ILLEGAL },
> +  { "cmps{R|}",		{ Xv, Yv }, PREFIX_REX2_ILLEGAL },
>    /* a8 */
> -  { "testB",		{ AL, Ib }, 0 },
> -  { "testS",		{ eAX, Iv }, 0 },
> -  { "stosB",		{ Ybr, AL }, 0 },
> -  { "stosS",		{ Yvr, eAX }, 0 },
> -  { "lodsB",		{ ALr, Xb }, 0 },
> -  { "lodsS",		{ eAXr, Xv }, 0 },
> -  { "scasB",		{ AL, Yb }, 0 },
> -  { "scasS",		{ eAX, Yv }, 0 },
> +  { "testB",		{ AL, Ib }, PREFIX_REX2_ILLEGAL },
> +  { "testS",		{ eAX, Iv }, PREFIX_REX2_ILLEGAL },
> +  { "stosB",		{ Ybr, AL }, PREFIX_REX2_ILLEGAL },
> +  { "stosS",		{ Yvr, eAX }, PREFIX_REX2_ILLEGAL },
> +  { "lodsB",		{ ALr, Xb }, PREFIX_REX2_ILLEGAL },
> +  { "lodsS",		{ eAXr, Xv }, PREFIX_REX2_ILLEGAL },
> +  { "scasB",		{ AL, Yb }, PREFIX_REX2_ILLEGAL },
> +  { "scasS",		{ eAX, Yv }, PREFIX_REX2_ILLEGAL },
>    /* b0 */
>    { "movB",		{ RMAL, Ib }, 0 },
>    { "movB",		{ RMCL, Ib }, 0 },
> @@ -2014,23 +2029,23 @@ static const struct dis386 dis386[] = {
>    { FLOAT },
>    { FLOAT },
>    /* e0 */
> -  { "loopneFH",		{ Jb, XX, loop_jcxz_flag }, 0 },
> -  { "loopeFH",		{ Jb, XX, loop_jcxz_flag }, 0 },
> -  { "loopFH",		{ Jb, XX, loop_jcxz_flag }, 0 },
> -  { "jEcxzH",		{ Jb, XX, loop_jcxz_flag }, 0 },
> -  { "inB",		{ AL, Ib }, 0 },
> -  { "inG",		{ zAX, Ib }, 0 },
> -  { "outB",		{ Ib, AL }, 0 },
> -  { "outG",		{ Ib, zAX }, 0 },
> +  { "loopneFH",		{ Jb, XX, loop_jcxz_flag }, PREFIX_REX2_ILLEGAL },
> +  { "loopeFH",		{ Jb, XX, loop_jcxz_flag }, PREFIX_REX2_ILLEGAL },
> +  { "loopFH",		{ Jb, XX, loop_jcxz_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jEcxzH",		{ Jb, XX, loop_jcxz_flag }, PREFIX_REX2_ILLEGAL },
> +  { "inB",		{ AL, Ib }, PREFIX_REX2_ILLEGAL },
> +  { "inG",		{ zAX, Ib }, PREFIX_REX2_ILLEGAL },
> +  { "outB",		{ Ib, AL }, PREFIX_REX2_ILLEGAL },
> +  { "outG",		{ Ib, zAX }, PREFIX_REX2_ILLEGAL },
>    /* e8 */
>    { X86_64_TABLE (X86_64_E8) },
>    { X86_64_TABLE (X86_64_E9) },
>    { X86_64_TABLE (X86_64_EA) },
> -  { "jmp",		{ Jb, BND }, 0 },
> -  { "inB",		{ AL, indirDX }, 0 },
> -  { "inG",		{ zAX, indirDX }, 0 },
> -  { "outB",		{ indirDX, AL }, 0 },
> -  { "outG",		{ indirDX, zAX }, 0 },
> +  { "jmp",		{ Jb, BND }, PREFIX_REX2_ILLEGAL },
> +  { "inB",		{ AL, indirDX }, PREFIX_REX2_ILLEGAL },
> +  { "inG",		{ zAX, indirDX }, PREFIX_REX2_ILLEGAL },
> +  { "outB",		{ indirDX, AL }, PREFIX_REX2_ILLEGAL },
> +  { "outG",		{ indirDX, zAX }, PREFIX_REX2_ILLEGAL },
>    /* f0 */
>    { Bad_Opcode },	/* lock prefix */
>    { "int1",		{ XX }, 0 },
> @@ -2107,12 +2122,12 @@ static const struct dis386 dis386_twobyte[] = {
>    { PREFIX_TABLE (PREFIX_0F2E) },
>    { PREFIX_TABLE (PREFIX_0F2F) },
>    /* 30 */
> -  { "wrmsr",		{ XX }, 0 },
> -  { "rdtsc",		{ XX }, 0 },
> -  { "rdmsr",		{ XX }, 0 },
> -  { "rdpmc",		{ XX }, 0 },
> -  { "sysenter",		{ SEP }, 0 },
> -  { "sysexit%LQ",	{ SEP }, 0 },
> +  { "wrmsr",		{ XX }, PREFIX_REX2_ILLEGAL },
> +  { "rdtsc",		{ XX }, PREFIX_REX2_ILLEGAL },
> +  { "rdmsr",		{ XX }, PREFIX_REX2_ILLEGAL },
> +  { "rdpmc",		{ XX }, PREFIX_REX2_ILLEGAL },
> +  { "sysenter",		{ SEP }, PREFIX_REX2_ILLEGAL },
> +  { "sysexit%LQ",	{ SEP }, PREFIX_REX2_ILLEGAL },
>    { Bad_Opcode },
>    { "getsec",		{ XX }, 0 },
>    /* 38 */
> @@ -2197,23 +2212,23 @@ static const struct dis386 dis386_twobyte[] = {
>    { PREFIX_TABLE (PREFIX_0F7E) },
>    { PREFIX_TABLE (PREFIX_0F7F) },
>    /* 80 */
> -  { "joH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jnoH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jbH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jaeH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jeH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jneH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jbeH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jaH",		{ Jv, BND, cond_jump_flag }, 0 },
> +  { "joH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jnoH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jbH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jaeH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jeH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jneH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jbeH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jaH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
>    /* 88 */
> -  { "jsH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jnsH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jpH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jnpH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jlH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jgeH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jleH",		{ Jv, BND, cond_jump_flag }, 0 },
> -  { "jgH",		{ Jv, BND, cond_jump_flag }, 0 },
> +  { "jsH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jnsH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jpH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jnpH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jlH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jgeH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jleH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
> +  { "jgH",		{ Jv, BND, cond_jump_flag }, PREFIX_REX2_ILLEGAL },
>    /* 90 */
>    { "seto",		{ Eb }, 0 },
>    { "setno",		{ Eb }, 0 },
> @@ -2406,22 +2421,30 @@ static const char intel_index16[][6] = {
>  
>  static const char att_names64[][8] = {
>    "%rax", "%rcx", "%rdx", "%rbx", "%rsp", "%rbp", "%rsi", "%rdi",
> -  "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15"
> +  "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15",
> +  "%r16", "%r17", "%r18", "%r19", "%r20", "%r21", "%r22", "%r23",
> +  "%r24", "%r25", "%r26", "%r27", "%r28", "%r29", "%r30", "%r31",
>  };
>  static const char att_names32[][8] = {
>    "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi",
> -  "%r8d", "%r9d", "%r10d", "%r11d", "%r12d", "%r13d", "%r14d", "%r15d"
> +  "%r8d", "%r9d", "%r10d", "%r11d", "%r12d", "%r13d", "%r14d", "%r15d",
> +  "%r16d", "%r17d", "%r18d", "%r19d", "%r20d", "%r21d", "%r22d", "%r23d",
> +  "%r24d", "%r25d", "%r26d", "%r27d", "%r28d", "%r29d", "%r30d", "%r31d",
>  };
>  static const char att_names16[][8] = {
>    "%ax", "%cx", "%dx", "%bx", "%sp", "%bp", "%si", "%di",
> -  "%r8w", "%r9w", "%r10w", "%r11w", "%r12w", "%r13w", "%r14w", "%r15w"
> +  "%r8w", "%r9w", "%r10w", "%r11w", "%r12w", "%r13w", "%r14w", "%r15w",
> +  "%r16w", "%r17w", "%r18w", "%r19w", "%r20w", "%r21w", "%r22w", "%r23w",
> +  "%r24w", "%r25w", "%r26w", "%r27w", "%r28w", "%r29w", "%r30w", "%r31w",
>  };
>  static const char att_names8[][8] = {
>    "%al", "%cl", "%dl", "%bl", "%ah", "%ch", "%dh", "%bh",
>  };
>  static const char att_names8rex[][8] = {
>    "%al", "%cl", "%dl", "%bl", "%spl", "%bpl", "%sil", "%dil",
> -  "%r8b", "%r9b", "%r10b", "%r11b", "%r12b", "%r13b", "%r14b", "%r15b"
> +  "%r8b", "%r9b", "%r10b", "%r11b", "%r12b", "%r13b", "%r14b", "%r15b",
> +  "%r16b", "%r17b", "%r18b", "%r19b", "%r20b", "%r21b", "%r22b", "%r23b",
> +  "%r24b", "%r25b", "%r26b", "%r27b", "%r28b", "%r29b", "%r30b", "%r31b",
>  };
>  static const char att_names_seg[][4] = {
>    "%es", "%cs", "%ss", "%ds", "%fs", "%gs", "%?", "%?",
> @@ -2810,9 +2833,9 @@ static const struct dis386 reg_table[][8] = {
>      { Bad_Opcode },
>      { "cmpxchg8b", { { CMPXCHG8B_Fixup, q_mode } }, 0 },
>      { Bad_Opcode },
> -    { "xrstors", { FXSAVE }, 0 },
> -    { "xsavec", { FXSAVE }, 0 },
> -    { "xsaves", { FXSAVE }, 0 },
> +    { "xrstors", { FXSAVE }, PREFIX_REX2_ILLEGAL },
> +    { "xsavec", { FXSAVE }, PREFIX_REX2_ILLEGAL },
> +    { "xsaves", { FXSAVE }, PREFIX_REX2_ILLEGAL },
>      { MOD_TABLE (MOD_0FC7_REG_6) },
>      { MOD_TABLE (MOD_0FC7_REG_7) },
>    },
> @@ -3384,7 +3407,7 @@ static const struct dis386 prefix_table[][4] = {
>  
>    /* PREFIX_0FAE_REG_4_MOD_0 */
>    {
> -    { "xsave",	{ FXSAVE }, 0 },
> +    { "xsave",	{ FXSAVE }, PREFIX_REX2_ILLEGAL },
>      { "ptwrite{%LQ|}", { Edq }, 0 },
>    },
>  
> @@ -3402,7 +3425,7 @@ static const struct dis386 prefix_table[][4] = {
>  
>    /* PREFIX_0FAE_REG_6_MOD_0 */
>    {
> -    { "xsaveopt",	{ FXSAVE }, PREFIX_OPCODE },
> +    { "xsaveopt",	{ FXSAVE }, PREFIX_OPCODE | PREFIX_REX2_ILLEGAL },
>      { "clrssbsy",	{ Mq }, PREFIX_OPCODE },
>      { "clwb",	{ Mb }, PREFIX_OPCODE },
>    },
> @@ -4197,13 +4220,13 @@ static const struct dis386 x86_64_table[][2] = {
>    /* X86_64_E8 */
>    {
>      { "callP",		{ Jv, BND }, 0 },
> -    { "call@",		{ Jv, BND }, 0 }
> +    { "call@",		{ Jv, BND }, PREFIX_REX2_ILLEGAL }
>    },
>  
>    /* X86_64_E9 */
>    {
>      { "jmpP",		{ Jv, BND }, 0 },
> -    { "jmp@",		{ Jv, BND }, 0 }
> +    { "jmp@",		{ Jv, BND }, PREFIX_REX2_ILLEGAL }
>    },
>  
>    /* X86_64_EA */
> @@ -8184,7 +8207,7 @@ static const struct dis386 mod_table[][2] = {
>    },
>    {
>      /* MOD_0FAE_REG_5 */
> -    { "xrstor",		{ FXSAVE }, PREFIX_OPCODE },
> +    { "xrstor",		{ FXSAVE }, PREFIX_OPCODE | PREFIX_REX2_ILLEGAL },
>      { PREFIX_TABLE (PREFIX_0FAE_REG_5_MOD_3) },
>    },
>    {
> @@ -8387,6 +8410,24 @@ ckprefix (instr_info *ins)
>  	    return ckp_okay;
>  	  ins->last_rex_prefix = i;
>  	  break;
> +	/* REX2 must be the last prefix. */
> +	case REX2_OPCODE:
> +	  if (ins->address_mode == mode_64bit)
> +	    {
> +	      if (ins->last_rex_prefix >= 0)
> +		return ckp_bogus;
> +
> +	      ins->codep++;
> +	      if (!fetch_code (ins->info, ins->codep + 1))
> +		return ckp_fetch_error;
> +	      ins->rex2_payload = *ins->codep;
> +	      ins->rex2 = ins->rex2_payload >> 4;
> +	      ins->rex = (ins->rex2_payload & 0xf) | REX_OPCODE;
> +	      ins->codep++;
> +	      ins->last_rex2_prefix = i;
> +	      ins->all_prefixes[i] = REX2_OPCODE;
> +	    }
> +	  return ckp_okay;
>  	case 0xf3:
>  	  ins->prefixes |= PREFIX_REPZ;
>  	  ins->last_repz_prefix = i;
> @@ -8554,6 +8595,8 @@ prefix_name (enum address_mode mode, uint8_t pref, int sizeflag)
>        return "bnd";
>      case NOTRACK_PREFIX:
>        return "notrack";
> +    case REX2_OPCODE:
> +      return "rex2";
>      default:
>        return NULL;
>      }
> @@ -9202,6 +9245,7 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>      .last_data_prefix = -1,
>      .last_addr_prefix = -1,
>      .last_rex_prefix = -1,
> +    .last_rex2_prefix = -1,
>      .last_seg_prefix = -1,
>      .fwait_prefix = -1,
>    };
> @@ -9367,13 +9411,18 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>        goto out;
>      }
>  
> -  if (*ins.codep == 0x0f)
> +  /* REX2.M in rex2 prefix represents map0 or map1.  */
> +  if (ins.last_rex2_prefix < 0 ? *ins.codep == 0x0f : (ins.rex2 & REX2_M))
>      {
>        unsigned char threebyte;
>  
> -      ins.codep++;
> -      if (!fetch_code (info, ins.codep + 1))
> -	goto fetch_error_out;
> +      if (!ins.rex2)
> +	{
> +	  ins.codep++;
> +	  if (!fetch_code (info, ins.codep + 1))
> +	    goto fetch_error_out;
> +	}
> +
>        threebyte = *ins.codep;
>        dp = &dis386_twobyte[threebyte];
>        ins.need_modrm = twobyte_has_modrm[threebyte];
> @@ -9529,7 +9578,15 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>        goto out;
>      }
>  
> -  switch (dp->prefix_requirement)
> +  if ((dp->prefix_requirement & PREFIX_REX2_ILLEGAL)
> +      && ins.last_rex2_prefix >= 0)
> +    {
> +      i386_dis_printf (info, dis_style_text, "(bad)");
> +      ret = ins.end_codep - priv.the_buffer;
> +      goto out;
> +    }
> +
> +  switch (dp->prefix_requirement & ~PREFIX_REX2_ILLEGAL)
>      {
>      case PREFIX_DATA:
>        /* If only the data prefix is marked as mandatory, its absence renders
> @@ -9588,6 +9645,13 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>        && !ins.need_vex && ins.last_rex_prefix >= 0)
>      ins.all_prefixes[ins.last_rex_prefix] = 0;
>  
> +  /* Check if the REX2 prefix is used.  */
> +  if (ins.last_rex2_prefix >= 0
> +      && ((ins.rex2 & 0x7) ^ (ins.rex2_used & 0x7)) == 0
> +      && (ins.rex ^ ins.rex_used) == 0
> +      && (ins.rex2 & 0x7))
> +    ins.all_prefixes[ins.last_rex2_prefix] = 0;
> +
>    /* Check if the SEG prefix is used.  */
>    if ((ins.prefixes & (PREFIX_CS | PREFIX_SS | PREFIX_DS | PREFIX_ES
>  		       | PREFIX_FS | PREFIX_GS)) != 0
> @@ -9616,7 +9680,11 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>  	if (name == NULL)
>  	  abort ();
>  	prefix_length += strlen (name) + 1;
> -	i386_dis_printf (info, dis_style_mnemonic, "%s ", name);
> +	if (ins.all_prefixes[i] == REX2_OPCODE)
> +	  i386_dis_printf (info, dis_style_mnemonic, "{%s 0x%x} ", name,
> +			   (unsigned int) ins.rex2_payload);
> +	else
> +	  i386_dis_printf (info, dis_style_mnemonic, "%s ", name);
>        }
>  
>    /* Check maximum code length.  */
> @@ -11163,6 +11231,8 @@ print_register (instr_info *ins, unsigned int reg, unsigned int rexmask,
>    USED_REX (rexmask);
>    if (ins->rex & rexmask)
>      reg += 8;
> +  if (ins->rex2 & rexmask)
> +    reg += 16;
>  
>    switch (bytemode)
>      {
> @@ -11170,7 +11240,7 @@ print_register (instr_info *ins, unsigned int reg, unsigned int rexmask,
>      case b_swap_mode:
>        if (reg & 4)
>  	USED_REX (0);
> -      if (ins->rex)
> +      if (ins->rex || ins->rex2)
>  	names = att_names8rex;
>        else
>  	names = att_names8;
> @@ -11386,6 +11456,8 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>    int riprel = 0;
>    int shift;
>  
> +  add += (ins->rex2 & REX_B) ? 16 : 0;
> +
>    if (ins->vex.evex)
>      {
>  
> @@ -11559,6 +11631,9 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  		}
>  	      break;
>  	    default:
> +	      if (ins->rex2 & REX_X)
> +		vindex += 16;
> +
>  	      if (vindex != 4)
>  		indexes = ins->address_mode == mode_64bit && !addr32flag
>  			  ? att_names64 : att_names32;
> @@ -11946,7 +12021,7 @@ static bool
>  OP_REG (instr_info *ins, int code, int sizeflag)
>  {
>    const char *s;
> -  int add;
> +  int add = 0;
>  
>    switch (code)
>      {
> @@ -11959,8 +12034,8 @@ OP_REG (instr_info *ins, int code, int sizeflag)
>    USED_REX (REX_B);
>    if (ins->rex & REX_B)
>      add = 8;
> -  else
> -    add = 0;
> +  if (ins->rex2 & REX_B)
> +    add += 16;
>  
>    switch (code)
>      {
> @@ -12674,6 +12749,8 @@ OP_EX (instr_info *ins, int bytemode, int sizeflag)
>    USED_REX (REX_B);
>    if (ins->rex & REX_B)
>      reg += 8;
> +  if (ins->rex2 & REX_B)
> +    reg += 16;
>    if (ins->vex.evex)
>      {
>        USED_REX (REX_X);
> diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> index 110a8371bd0..dd4850e1855 100644
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -275,6 +275,8 @@ static const dependency isa_dependencies[] =
>      "64" },
>    { "USER_MSR",
>      "64" },
> +  { "APX_F",
> +    "XSAVE|64" },
>  };
>  
>  /* This array is populated as process_i386_initializers() walks cpu_flags[].  */
> @@ -397,6 +399,7 @@ static bitfield cpu_flags[] =
>    BITFIELD (FRED),
>    BITFIELD (LKGS),
>    BITFIELD (USER_MSR),
> +  BITFIELD (APX_F),
>    BITFIELD (MWAITX),
>    BITFIELD (CLZERO),
>    BITFIELD (OSPKE),
> @@ -483,6 +486,7 @@ static bitfield opcode_modifiers[] =
>    BITFIELD (Optimize),
>    BITFIELD (Dialect),
>    BITFIELD (ISA64),
> +  BITFIELD (NoEgpr),
>  };
>  
>  #define CLASS(n) #n, n
> @@ -1069,10 +1073,44 @@ get_element_size (char **opnd, int lineno)
>    return elem_size;
>  }
>  
> +static bool
> +rex2_disallowed (const unsigned long long opcode, unsigned int length,
> +		 unsigned int space, const char *cpu_flags)
> +{
> +  /* Some opcodes encode a ModR/M-like byte directly in the opcode.  */
> +  unsigned int base_opcode = opcode >> (8 * length - 8);
> +
> +  /* All opcodes in map0 0x4*, 0x7*, 0xa*, 0xe* and map1 0x3*, 0x8*
> +     are reserved under REX2 and trigger #UD when prefixed with REX2.  */
> +  if (space == 0)
> +    switch (base_opcode >> 4)
> +      {
> +      case 0x4:
> +      case 0x7:
> +      case 0xA:
> +      case 0xE:
> +	return true;
> +      default:
> +	return false;
> +      }
> +
> +  if (space == SPACE_0F)
> +    switch (base_opcode >> 4)
> +      {
> +      case 0x3:
> +      case 0x8:
> +	return true;
> +      default:
> +	return false;
> +      }
> +
> +  return false;
> +}
> +
>  static void
>  process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
>  			      unsigned int prefix, const char *extension_opcode,
> -			      char **opnd, int lineno)
> +			      char **opnd, int lineno, bool rex2_disallowed)
>  {
>    char *str, *next, *last;
>    bitfield modifiers [ARRAY_SIZE (opcode_modifiers)];
> @@ -1199,6 +1237,12 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
>  	  || modifiers[SAE].value))
>      modifiers[EVex].value = EVEXDYN;
>  
> +  /* Vex, legacy map2 and map3, and rex2_disallowed templates do not support
> +     EGPR.  Templates supporting both Vex and EVex still allow EGPR.  */
> +  if ((modifiers[Vex].value || space > SPACE_0F || rex2_disallowed)
> +      && !modifiers[EVex].value)
> +    modifiers[NoEgpr].value = 1;
> +
>    output_opcode_modifier (table, modifiers, ARRAY_SIZE (modifiers));
>  }
>  
> @@ -1423,7 +1467,9 @@ output_i386_opcode (FILE *table, const char *name, char *str,
>    free (ident);
>  
>    process_i386_opcode_modifier (table, opcode_modifier, space, prefix,
> -				extension_opcode, operand_types, lineno);
> +				extension_opcode, operand_types, lineno,
> +				rex2_disallowed (opcode, length, space,
> +						 cpu_flags));
>  
>    process_i386_cpu_flag (table, cpu_flags, NULL, ",", "    ", lineno, CpuMax);
>  
> diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> index 03b02bd9681..8c967ea90b0 100644
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -319,6 +319,8 @@ enum i386_cpu
>    CpuAVX512F,
>    /* Intel AVX-512 VL Instructions support required.  */
>    CpuAVX512VL,
> +  /* Intel APX_F Instructions support required.  */
> +  CpuAPX_F,
>    /* Not supported in the 64bit mode  */
>    CpuNo64,
>  
> @@ -354,6 +356,7 @@ enum i386_cpu
>  		   cpuhle:1, \
>  		   cpuavx512f:1, \
>  		   cpuavx512vl:1, \
> +		   cpuapx_f:1, \
>        /* NOTE: This field needs to remain last. */ \
>  		   cpuno64:1
>  
> @@ -735,6 +738,11 @@ enum
>  #define INTEL64		2
>  #define INTEL64ONLY	3
>    ISA64,
> +
> +  /* EGPRs (r16-r31) are illegal on this instruction.  Also used to judge
> +     whether the instruction supports the pseudo-prefix {rex2}.  */
> +  NoEgpr,
> +
>    /* The last bitfield in i386_opcode_modifier.  */
>    Opcode_Modifier_Num
>  };
> @@ -779,6 +787,7 @@ typedef struct i386_opcode_modifier
>    unsigned int optimize:1;
>    unsigned int dialect:2;
>    unsigned int isa64:2;
> +  unsigned int noegpr:1;
>  } i386_opcode_modifier;
>  
>  /* Operand classes.  */
> @@ -993,7 +1002,8 @@ typedef struct insn_template
>  #define Prefix_VEX3		6	/* {vex3} */
>  #define Prefix_EVEX		7	/* {evex} */
>  #define Prefix_REX		8	/* {rex} */
> -#define Prefix_NoOptimize	9	/* {nooptimize} */
> +#define Prefix_REX2		9	/* {rex2} */
> +#define Prefix_NoOptimize	10	/* {nooptimize} */
>  
>    /* the bits in opcode_modifier are used to generate the final opcode from
>       the base_opcode.  These bits also are used to detect alternate forms of
> @@ -1020,6 +1030,7 @@ typedef struct
>  #define RegRex	    0x1  /* Extended register.  */
>  #define RegRex64    0x2  /* Extended 8 bit register.  */
>  #define RegVRex	    0x4  /* Extended vector register.  */
> +#define RegRex2	    0x8  /* Extended GPRs R16–R31 register.  */
>    unsigned char reg_num;
>  #define RegIP	((unsigned char ) ~0)
>  /* EIZ and RIZ are fake index registers.  */
> diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> index 1e54717fa7e..37d3e8663bb 100644
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -892,7 +892,7 @@ rex.wrxb, 0x4f, x64, NoSuf|IsPrefix, {}
>  <pseudopfx:ident:cpu, disp8:Disp8:0, disp16:Disp16:No64, disp32:Disp32:i386, +
>                        load:Load:0, store:Store:0, +
>                        vex:VEX:0, vex2:VEX:0, vex3:VEX3:0, evex:EVEX:0, +
> -                      rex:REX:x64, nooptimize:NoOptimize:0>
> +                      rex:REX:x64, rex2:REX2:APX_F, nooptimize:NoOptimize:0>
>  
>  {<pseudopfx>}, PSEUDO_PREFIX/Prefix_<pseudopfx:ident>, <pseudopfx:cpu>, NoSuf|IsPrefix, {}
>  
> @@ -1425,16 +1425,17 @@ crc32, 0xf20f38f0, SSE4_2&x64, W|Modrm|No_wSuf|No_lSuf|No_sSuf, { Reg8|Reg64|Uns
>  
>  // xsave/xrstor New Instructions.
>  
> -xsave, 0xfae/4, Xsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Unspecified|BaseIndex }
> -xsave64, 0xfae/4, Xsave&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
> -xrstor, 0xfae/5, Xsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Unspecified|BaseIndex }
> -xrstor64, 0xfae/5, Xsave&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
> +xsave, 0xfae/4, Xsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoEgpr, { Unspecified|BaseIndex }
> +xsave64, 0xfae/4, Xsave&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
> +xrstor, 0xfae/5, Xsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoEgpr, { Unspecified|BaseIndex }
> +xrstor64, 0xfae/5, Xsave&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
>  xgetbv, 0xf01d0, Xsave, NoSuf, {}
>  xsetbv, 0xf01d1, Xsave, NoSuf, {}
>  
>  // xsaveopt
> -xsaveopt, 0xfae/6, Xsaveopt, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Unspecified|BaseIndex }
> -xsaveopt64, 0xfae/6, Xsaveopt&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
> +
> +xsaveopt, 0xfae/6, Xsaveopt, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoEgpr, { Unspecified|BaseIndex }
> +xsaveopt64, 0xfae/6, Xsaveopt&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
>  
>  // AES instructions.
>  
> @@ -2477,17 +2478,17 @@ clflushopt, 0x660fae/7, ClflushOpt, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex
>  
>  // XSAVES/XRSTORS instructions.
>  
> -xrstors, 0xfc7/3, XSAVES, Modrm|NoSuf, { Unspecified|BaseIndex }
> -xrstors64, 0xfc7/3, XSAVES&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
> -xsaves, 0xfc7/5, XSAVES, Modrm|NoSuf, { Unspecified|BaseIndex }
> -xsaves64, 0xfc7/5, XSAVES&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
> +xrstors, 0xfc7/3, XSAVES, Modrm|NoSuf|NoEgpr, { Unspecified|BaseIndex }
> +xrstors64, 0xfc7/3, XSAVES&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
> +xsaves, 0xfc7/5, XSAVES, Modrm|NoSuf|NoEgpr, { Unspecified|BaseIndex }
> +xsaves64, 0xfc7/5, XSAVES&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
>  
>  // XSAVES instructions end.
>  
>  // XSAVEC instructions.
>  
> -xsavec, 0xfc7/4, XSAVEC, Modrm|NoSuf, { Unspecified|BaseIndex }
> -xsavec64, 0xfc7/4, XSAVEC&x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
> +xsavec, 0xfc7/4, XSAVEC, Modrm|NoSuf|NoEgpr, { Unspecified|BaseIndex }
> +xsavec64, 0xfc7/4, XSAVEC&x64, Modrm|NoSuf|Size64|NoEgpr, { Unspecified|BaseIndex }
>  
>  // XSAVEC instructions end.
>  
> diff --git a/opcodes/i386-reg.tbl b/opcodes/i386-reg.tbl
> index 2ac56e3fd0b..8fead35e320 100644
> --- a/opcodes/i386-reg.tbl
> +++ b/opcodes/i386-reg.tbl
> @@ -43,6 +43,22 @@ r12b, Class=Reg|Byte, RegRex|RegRex64, 4, Dw2Inval, Dw2Inval
>  r13b, Class=Reg|Byte, RegRex|RegRex64, 5, Dw2Inval, Dw2Inval
>  r14b, Class=Reg|Byte, RegRex|RegRex64, 6, Dw2Inval, Dw2Inval
>  r15b, Class=Reg|Byte, RegRex|RegRex64, 7, Dw2Inval, Dw2Inval
> +r16b, Class=Reg|Byte, RegRex2|RegRex64, 0, Dw2Inval, Dw2Inval
> +r17b, Class=Reg|Byte, RegRex2|RegRex64, 1, Dw2Inval, Dw2Inval
> +r18b, Class=Reg|Byte, RegRex2|RegRex64, 2, Dw2Inval, Dw2Inval
> +r19b, Class=Reg|Byte, RegRex2|RegRex64, 3, Dw2Inval, Dw2Inval
> +r20b, Class=Reg|Byte, RegRex2|RegRex64, 4, Dw2Inval, Dw2Inval
> +r21b, Class=Reg|Byte, RegRex2|RegRex64, 5, Dw2Inval, Dw2Inval
> +r22b, Class=Reg|Byte, RegRex2|RegRex64, 6, Dw2Inval, Dw2Inval
> +r23b, Class=Reg|Byte, RegRex2|RegRex64, 7, Dw2Inval, Dw2Inval
> +r24b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 0, Dw2Inval, Dw2Inval
> +r25b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 1, Dw2Inval, Dw2Inval
> +r26b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 2, Dw2Inval, Dw2Inval
> +r27b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 3, Dw2Inval, Dw2Inval
> +r28b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 4, Dw2Inval, Dw2Inval
> +r29b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 5, Dw2Inval, Dw2Inval
> +r30b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 6, Dw2Inval, Dw2Inval
> +r31b, Class=Reg|Byte, RegRex2|RegRex64|RegRex, 7, Dw2Inval, Dw2Inval
>  // 16 bit regs
>  ax, Class=Reg|Instance=Accum|Word, 0, 0, Dw2Inval, Dw2Inval
>  cx, Class=Reg|Word, 0, 1, Dw2Inval, Dw2Inval
> @@ -60,6 +76,22 @@ r12w, Class=Reg|Word, RegRex, 4, Dw2Inval, Dw2Inval
>  r13w, Class=Reg|Word, RegRex, 5, Dw2Inval, Dw2Inval
>  r14w, Class=Reg|Word, RegRex, 6, Dw2Inval, Dw2Inval
>  r15w, Class=Reg|Word, RegRex, 7, Dw2Inval, Dw2Inval
> +r16w, Class=Reg|Word, RegRex2, 0, Dw2Inval, Dw2Inval
> +r17w, Class=Reg|Word, RegRex2, 1, Dw2Inval, Dw2Inval
> +r18w, Class=Reg|Word, RegRex2, 2, Dw2Inval, Dw2Inval
> +r19w, Class=Reg|Word, RegRex2, 3, Dw2Inval, Dw2Inval
> +r20w, Class=Reg|Word, RegRex2, 4, Dw2Inval, Dw2Inval
> +r21w, Class=Reg|Word, RegRex2, 5, Dw2Inval, Dw2Inval
> +r22w, Class=Reg|Word, RegRex2, 6, Dw2Inval, Dw2Inval
> +r23w, Class=Reg|Word, RegRex2, 7, Dw2Inval, Dw2Inval
> +r24w, Class=Reg|Word, RegRex2|RegRex, 0, Dw2Inval, Dw2Inval
> +r25w, Class=Reg|Word, RegRex2|RegRex, 1, Dw2Inval, Dw2Inval
> +r26w, Class=Reg|Word, RegRex2|RegRex, 2, Dw2Inval, Dw2Inval
> +r27w, Class=Reg|Word, RegRex2|RegRex, 3, Dw2Inval, Dw2Inval
> +r28w, Class=Reg|Word, RegRex2|RegRex, 4, Dw2Inval, Dw2Inval
> +r29w, Class=Reg|Word, RegRex2|RegRex, 5, Dw2Inval, Dw2Inval
> +r30w, Class=Reg|Word, RegRex2|RegRex, 6, Dw2Inval, Dw2Inval
> +r31w, Class=Reg|Word, RegRex2|RegRex, 7, Dw2Inval, Dw2Inval
>  // 32 bit regs
>  eax, Class=Reg|Instance=Accum|Dword|BaseIndex, 0, 0, 0, Dw2Inval
>  ecx, Class=Reg|Instance=RegC|Dword|BaseIndex, 0, 1, 1, Dw2Inval
> @@ -77,6 +109,22 @@ r12d, Class=Reg|Dword|BaseIndex, RegRex, 4, Dw2Inval, Dw2Inval
>  r13d, Class=Reg|Dword|BaseIndex, RegRex, 5, Dw2Inval, Dw2Inval
>  r14d, Class=Reg|Dword|BaseIndex, RegRex, 6, Dw2Inval, Dw2Inval
>  r15d, Class=Reg|Dword|BaseIndex, RegRex, 7, Dw2Inval, Dw2Inval
> +r16d, Class=Reg|Dword|BaseIndex, RegRex2, 0, Dw2Inval, Dw2Inval
> +r17d, Class=Reg|Dword|BaseIndex, RegRex2, 1, Dw2Inval, Dw2Inval
> +r18d, Class=Reg|Dword|BaseIndex, RegRex2, 2, Dw2Inval, Dw2Inval
> +r19d, Class=Reg|Dword|BaseIndex, RegRex2, 3, Dw2Inval, Dw2Inval
> +r20d, Class=Reg|Dword|BaseIndex, RegRex2, 4, Dw2Inval, Dw2Inval
> +r21d, Class=Reg|Dword|BaseIndex, RegRex2, 5, Dw2Inval, Dw2Inval
> +r22d, Class=Reg|Dword|BaseIndex, RegRex2, 6, Dw2Inval, Dw2Inval
> +r23d, Class=Reg|Dword|BaseIndex, RegRex2, 7, Dw2Inval, Dw2Inval
> +r24d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 0, Dw2Inval, Dw2Inval
> +r25d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 1, Dw2Inval, Dw2Inval
> +r26d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 2, Dw2Inval, Dw2Inval
> +r27d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 3, Dw2Inval, Dw2Inval
> +r28d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 4, Dw2Inval, Dw2Inval
> +r29d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 5, Dw2Inval, Dw2Inval
> +r30d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 6, Dw2Inval, Dw2Inval
> +r31d, Class=Reg|Dword|BaseIndex, RegRex2|RegRex, 7, Dw2Inval, Dw2Inval
>  rax, Class=Reg|Instance=Accum|Qword|BaseIndex, 0, 0, Dw2Inval, 0
>  rcx, Class=Reg|Instance=RegC|Qword|BaseIndex, 0, 1, Dw2Inval, 2
>  rdx, Class=Reg|Instance=RegD|Qword|BaseIndex, 0, 2, Dw2Inval, 1
> @@ -93,6 +141,22 @@ r12, Class=Reg|Qword|BaseIndex, RegRex, 4, Dw2Inval, 12
>  r13, Class=Reg|Qword|BaseIndex, RegRex, 5, Dw2Inval, 13
>  r14, Class=Reg|Qword|BaseIndex, RegRex, 6, Dw2Inval, 14
>  r15, Class=Reg|Qword|BaseIndex, RegRex, 7, Dw2Inval, 15
> +r16, Class=Reg|Qword|BaseIndex, RegRex2, 0, Dw2Inval, 130
> +r17, Class=Reg|Qword|BaseIndex, RegRex2, 1, Dw2Inval, 131
> +r18, Class=Reg|Qword|BaseIndex, RegRex2, 2, Dw2Inval, 132
> +r19, Class=Reg|Qword|BaseIndex, RegRex2, 3, Dw2Inval, 133
> +r20, Class=Reg|Qword|BaseIndex, RegRex2, 4, Dw2Inval, 134
> +r21, Class=Reg|Qword|BaseIndex, RegRex2, 5, Dw2Inval, 135
> +r22, Class=Reg|Qword|BaseIndex, RegRex2, 6, Dw2Inval, 136
> +r23, Class=Reg|Qword|BaseIndex, RegRex2, 7, Dw2Inval, 137
> +r24, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 0, Dw2Inval, 138
> +r25, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 1, Dw2Inval, 139
> +r26, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 2, Dw2Inval, 140
> +r27, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 3, Dw2Inval, 141
> +r28, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 4, Dw2Inval, 142
> +r29, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 5, Dw2Inval, 143
> +r30, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 6, Dw2Inval, 144
> +r31, Class=Reg|Qword|BaseIndex, RegRex2|RegRex, 7, Dw2Inval, 145
>  // Vector mask registers.
>  k0, Class=RegMask, 0, 0, 93, 118
>  k1, Class=RegMask, 0, 1, 94, 119
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.

* Re: [PATCH V5 2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
  2023-12-28  1:27 ` [PATCH V5 2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions Cui, Lili
@ 2023-12-28  1:54   ` H.J. Lu
  0 siblings, 0 replies; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:54 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich

On Thu, Dec 28, 2023 at 01:27:07AM +0000, Cui, Lili wrote:
> opcodes/ChangeLog:
> 
> 	* i386-dis-evex.h: Add an empty EVEX_MAP4_ sub-table for
> 	legacy insns promoted to EVEX insns.
> 	* i386-dis.c: Add EVEX_MAP4.
> ---
>  opcodes/i386-dis-evex.h | 291 ++++++++++++++++++++++++++++++++++++++++
>  opcodes/i386-dis.c      |   1 +
>  2 files changed, 292 insertions(+)
> 
> diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
> index e6295119d2b..7ad1edbe72d 100644
> --- a/opcodes/i386-dis-evex.h
> +++ b/opcodes/i386-dis-evex.h
> @@ -872,6 +872,297 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>    },
> +  /* EVEX_MAP4_ */
> +  {
> +    /* 00 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 08 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 10 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 18 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 20 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 28 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 30 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 38 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 40 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 48 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 50 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 58 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 60 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 68 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 70 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 78 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 80 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 88 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 90 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* 98 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* A0 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* A8 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* B0 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* B8 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* C0 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* C8 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* D0 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* D8 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* E0 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* E8 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* F0 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    /* F8 */
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +  },
>    /* EVEX_MAP5_ */
>    {
>      /* 00 */
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index 4d6d547b2b6..e006d869258 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -1296,6 +1296,7 @@ enum
>    EVEX_0F = 0,
>    EVEX_0F38,
>    EVEX_0F3A,
> +  EVEX_MAP4,
>    EVEX_MAP5,
>    EVEX_MAP6,
>  };
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.

* Re: [PATCH V5 3/9] Support APX GPR32 with extend evex prefix
  2023-12-28  1:27 ` [PATCH V5 3/9] Support APX GPR32 with extend evex prefix Cui, Lili
@ 2023-12-28  1:54   ` H.J. Lu
  2023-12-28 13:48     ` Cui, Lili
  0 siblings, 1 reply; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:54 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich

On Thu, Dec 28, 2023 at 01:27:08AM +0000, Cui, Lili wrote:
> This patch adds the non-ND, non-NF forms of the EVEX-promoted insns.
> 
> EVEX extension of legacy instructions:
>   All promoted legacy instructions are placed in EVEX map 4, which is
>   currently reserved.
> EVEX extension of EVEX instructions:
>   All existing EVEX instructions are extended by APX using the extended
>   EVEX prefix, so that they can access all 32 GPRs.
> EVEX extension of VEX instructions:
>   Promoting a VEX instruction into the EVEX space does not change the map
>   id, the opcode, or the operand encoding of the VEX instruction.
> 
> Note: The promoted versions of MOVBE will be extended to include the “MOVBE
>   reg1, reg2” form.
> 
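For readers following the thread, the field layout that the disassembler hunk further down in this patch relies on can be sketched in isolation. This is a minimal standalone C sketch, not code from the patch: `struct evex_p0` and `decode_evex_p0` are hypothetical names, and the bit positions are read off the get_valid_dis386 changes (map id in bits 2:0, the extra B bit in bit 3 stored directly, the extra R bit in bit 4 stored inverted, and the classic R/X/B bits in bits 7:5 stored inverted).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields of the first byte after 0x62.  */
struct evex_p0
{
  unsigned map;   /* Bits 2:0: opcode map id (4 is the legacy-promoted map).  */
  bool b4;        /* Bit 3: extra base/rm register bit, stored as-is.  */
  bool r4;        /* Bit 4: extra reg-field bit, stored inverted.  */
  unsigned rxb;   /* Bits 7:5, inverted: R = 4, X = 2, B = 1.  */
};

/* Split the first EVEX payload byte the same way the patched
   disassembler does (illustrative helper, not part of binutils).  */
static struct evex_p0
decode_evex_p0 (uint8_t p0)
{
  struct evex_p0 f;

  f.map = p0 & 0x7;
  f.b4 = (p0 & 0x08) != 0;
  f.r4 = (p0 & 0x10) == 0;
  f.rxb = ~(p0 >> 5) & 0x7;
  return f;
}
```

For the byte 0xe1 from the adjusted x86-64-evex.d line (62 e1 7e 08 2d c0), this gives map 1 with r4 set and rxb clear, which, together with ModRM.reg = 0, is consistent with the new expected operand %r16d.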
>   gas/ChangeLog:
> 
>   2023-12-28  Lingling Kong <lingling.kong@intel.com>
> 	      H.J. Lu  <hongjiu.lu@intel.com>
> 	      Lili Cui <lili.cui@intel.com>
> 	      Lin Hu   <lin1.hu@intel.com>
> 
> 	* config/tc-i386.c
> 	(install_template): Handled APX combines.
> 	(is_apx_evex_encoding): Test apx evex encoding.
> 	(build_apx_evex_prefix): Enable APX evex prefix.
> 	(md_assemble): Handle apx with evex encoding.
> 	(process_suffix): Handle apx map4 prefix.
> 	(check_register): Assign i.vec_encoding for APX evex instructions.
> 	* testsuite/gas/i386/x86-64-evex.d: Adjust test cases.
> 	* testsuite/gas/i386/x86-64.exp: Adjust x86-64-inval-movbe.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis-evex-len.h: Handle EVEX_LEN_0F38F2, EVEX_LEN_0F38F3.
> 	* i386-dis-evex-prefix.h: Handle PREFIX_EVEX_0F38F2_L_0,
> 	PREFIX_EVEX_0F38F3_L_0, PREFIX_EVEX_MAP4_D8,
> 	PREFIX_EVEX_MAP4_DA, PREFIX_EVEX_MAP4_DB,
> 	PREFIX_EVEX_MAP4_DC, PREFIX_EVEX_MAP4_DD,
> 	PREFIX_EVEX_MAP4_DE, PREFIX_EVEX_MAP4_DF,
> 	PREFIX_EVEX_MAP4_F0, PREFIX_EVEX_MAP4_F1,
> 	PREFIX_EVEX_MAP4_F2, PREFIX_EVEX_MAP4_F8.
> 	* i386-dis-evex-reg.h: Handle REG_EVEX_0F38F3_L_0_P_0.
> 	* i386-dis-evex.h: Add EVEX_MAP4_ for legacy insn
> 	promote to apx to use gpr32
> 	* i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F90,
> 	X86_64_EVEX_0F92, X86_64_EVEX_0F93, X86_64_EVEX_0F38F2,
> 	X86_64_EVEX_0F38F3, X86_64_EVEX_0F38F5, X86_64_EVEX_0F38F6,
> 	X86_64_EVEX_0F38F7, X86_64_EVEX_0F3AF0, X86_64_EVEX_0F91.
> 	* i386-dis.c
> 	(struct instr_info): Deleted bool r.
> 	(PREFIX_NP_OR_DATA): New.
> 	(NO_PREFIX): New.
> 	(putop): Ditto.
> 	(X86_64_EVEX_FROM_VEX_TABLE): Ditto.
> 	(get_valid_dis386): Decode insn erex in extend evex prefix.
> 	Handle EVEX_MAP4
> 	(print_insn): Handle PREFIX_DATA_AND_NP_ONLY.
> 	(print_register): Handle apx instructions decode.
> 	(OP_E_memory): Ditto.
> 	(OP_G): Ditto.
> 	(OP_XMM): Ditto.
> 	(DistinctDest_Fixup): Ditto.
> 	* i386-gen.c (process_i386_opcode_modifier): Add EVEXMAP4.
> 	* i386-opc.h (SPACE_EVEXMAP4): Add legacy insn
> 	promote to evex.
> 	* i386-opc.tbl: Handle some legacy and vex insns don't
> 	support gpr32. And add some legacy insn (map2 / 3) promote
> 	to evex.
> ---
>  gas/config/tc-i386.c                 |  72 +++++++++++-
>  gas/testsuite/gas/i386/x86-64-evex.d |   2 +-
>  gas/testsuite/gas/i386/x86-64.exp    |   2 +-
>  opcodes/i386-dis-evex-prefix.h       |  58 ++++++++++
>  opcodes/i386-dis-evex-x86-64.h       |  50 +++++++++
>  opcodes/i386-dis-evex.h              |  94 ++++++++--------
>  opcodes/i386-dis.c                   | 160 +++++++++++++++++++++++----
>  opcodes/i386-gen.c                   |   2 +
>  opcodes/i386-opc.h                   |   6 +
>  opcodes/i386-opc.tbl                 |  90 ++++++++++-----
>  10 files changed, 433 insertions(+), 103 deletions(-)
>  create mode 100644 opcodes/i386-dis-evex-x86-64.h
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index bb302f28add..7e62d08e9bd 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -435,6 +435,9 @@ struct _i386_insn
>      /* Prefer the REX2 prefix in encoding.  */
>      bool rex2_encoding;
>  
> +    /* Need to use an Egpr capable encoding (REX2 or EVEX).  */
> +    bool has_egpr;
> +
>      /* Disable instruction size optimization.  */
>      bool no_optimize;
>  
> @@ -3676,12 +3679,12 @@ install_template (const insn_template *t)
>  
>    /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
>    if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> -  {
> +    {
>        if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
>  	   || maybe_cpu (t, CpuFMA))
>  	  && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
>  	{
> -	  if (need_evex_encoding ())
> +	  if (need_evex_encoding () || i.has_egpr)
>  	    {
>  	      i.tm.opcode_modifier.vex = 0;
>  	      i.tm.cpu.bitfield.cpuavx512f = i.tm.cpu_any.bitfield.cpuavx512f;
> @@ -3698,7 +3701,19 @@ install_template (const insn_template *t)
>  		gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa);
>  	    }
>  	}
> -  }
> +
> +      if ((maybe_cpu (t, CpuCMPCCXADD) || maybe_cpu (t, CpuAMX_TILE)
> +	   || maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512DQ)
> +	   || maybe_cpu (t, CpuAVX512BW) || maybe_cpu (t, CpuBMI)
> +	   || maybe_cpu (t, CpuBMI2))
> +	  && maybe_cpu (t, CpuAPX_F))
> +	{
> +	  if (need_evex_encoding () || i.has_egpr)
> +	    i.tm.opcode_modifier.vex = 0;
> +	  else
> +	    i.tm.opcode_modifier.evex = 0;
> +	}
> +    }
>  
>    /* Note that for pseudo prefixes this produces a length of 1. But for them
>       the length isn't interesting at all.  */
> @@ -3879,6 +3894,15 @@ is_any_vex_encoding (const insn_template *t)
>    return t->opcode_modifier.vex || t->opcode_modifier.evex;
>  }
>  
> +/* We can use this function only when the current encoding is evex.  */
> +static INLINE bool
> +is_apx_evex_encoding (void)
> +{
> +  return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
> +    || (i.vex.register_specifier
> +	&& (i.vex.register_specifier->reg_flags & RegRex2));
> +}
> +
>  static INLINE bool
>  is_apx_rex2_encoding (void)
>  {
> @@ -4156,6 +4180,27 @@ build_rex2_prefix (void)
>  		    | (i.rex2 << 4) | i.rex);
>  }
>  
> +/* Build the EVEX prefix (4-byte) for evex insn
> +   | 62h |
> +   | `R`X`B`R' | B'mmm |
> +   | W | v`v`v`v | `x' | pp |
> +   | z| L'L | b | `v | aaa |
> +*/
> +static void
> +build_apx_evex_prefix (void)
> +{
> +  build_evex_prefix ();
> +  if (i.rex2 & REX_R)
> +    i.vex.bytes[1] &= ~0x10;
> +  if (i.rex2 & REX_B)
> +    i.vex.bytes[1] |= 0x08;
> +  if (i.rex2 & REX_X)
> +    i.vex.bytes[2] &= ~0x04;
> +  if (i.vex.register_specifier
> +      && i.vex.register_specifier->reg_flags & RegRex2)
> +    i.vex.bytes[3] &= ~0x08;
> +}
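To make the four adjustments above easier to check by hand, here is a standalone sketch of the same bit operations. It is an illustration only: `apx_patch_evex`, the `vvvv_is_egpr` flag and the local `REX_*` defines are hypothetical stand-ins for the i.rex2 / i.vex state used in gas (the values 1/2/4 for REX_B/REX_X/REX_R are assumed to mirror the ones gas uses), and the input bytes are assumed to be whatever build_evex_prefix has already produced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the gas REX bit values; defined locally so the
   sketch is self-contained.  */
#define REX_B 1
#define REX_X 2
#define REX_R 4

/* Apply the APX-specific tweaks to an already built 4-byte EVEX prefix.
   BYTES[0] is 0x62; BYTES[1..3] are the three payload bytes.  */
static void
apx_patch_evex (uint8_t bytes[4], unsigned rex2, bool vvvv_is_egpr)
{
  if (rex2 & REX_R)
    bytes[1] = (uint8_t) (bytes[1] & ~0x10);  /* R4 is stored inverted.  */
  if (rex2 & REX_B)
    bytes[1] = (uint8_t) (bytes[1] | 0x08);   /* B4 is stored directly.  */
  if (rex2 & REX_X)
    bytes[2] = (uint8_t) (bytes[2] & ~0x04);  /* X4 is stored inverted.  */
  if (vvvv_is_egpr)
    bytes[3] = (uint8_t) (bytes[3] & ~0x08);  /* V4 is stored inverted.  */
}
```

Starting from 62 f1 7e 08 and applying only the REX_R adjustment gives 62 e1 7e 08, which matches the prefix bytes on the x86-64-evex.d line changed by this patch.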
> +
>  static void establish_rex (void)
>  {
>    /* Note that legacy encodings have at most 2 non-immediate operands.  */
> @@ -5723,13 +5768,18 @@ md_assemble (char *line)
>  	  return;
>  	}
>  
> -      if (i.tm.opcode_modifier.vex)
> +      if (is_apx_evex_encoding ())
> +	build_apx_evex_prefix ();
> +      else if (i.tm.opcode_modifier.vex)
>  	build_vex_prefix (t);
>        else
>  	build_evex_prefix ();
>  
>        /* The individual REX.RXBW bits got consumed.  */
>        i.rex &= REX_OPCODE;
> +
> +      /* The rex2 bits got consumed.  */
> +      i.rex2 = 0;
>      }
>  
>    /* Handle conversion of 'int $3' --> special int3 insn.  */
> @@ -8084,7 +8134,8 @@ process_suffix (void)
>        if (i.suffix != QWORD_MNEM_SUFFIX
>  	  && i.tm.opcode_modifier.mnemonicsize != IGNORESIZE
>  	  && !i.tm.opcode_modifier.floatmf
> -	  && !is_any_vex_encoding (&i.tm)
> +	  && (!is_any_vex_encoding (&i.tm)
> +	      || i.tm.opcode_space == SPACE_EVEXMAP4)
>  	  && ((i.suffix == LONG_MNEM_SUFFIX) == (flag_code == CODE_16BIT)
>  	      || (flag_code == CODE_64BIT
>  		  && i.tm.opcode_modifier.jump == JUMP_BYTE)))
> @@ -8094,7 +8145,14 @@ process_suffix (void)
>  	  if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */
>  	    prefix = ADDR_PREFIX_OPCODE;
>  
> -	  if (!add_prefix (prefix))
> +	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
> +	     needs to be adjusted.  */
> +	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
> +	    {
> +	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
> +	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
> +	    }
> +	  else if (!add_prefix (prefix))
>  	    return 0;
>  	}
>  
> @@ -14300,6 +14358,8 @@ static bool check_register (const reg_entry *r)
>        if (!cpu_arch_flags.bitfield.cpuapx_f
>  	  || flag_code != CODE_64BIT)
>  	return false;
> +
> +      i.has_egpr = true;
>      }
>  
>    if (((r->reg_flags & (RegRex64 | RegRex)) || r->reg_type.bitfield.qword)
> diff --git a/gas/testsuite/gas/i386/x86-64-evex.d b/gas/testsuite/gas/i386/x86-64-evex.d
> index 041747db892..5d974c312da 100644
> --- a/gas/testsuite/gas/i386/x86-64-evex.d
> +++ b/gas/testsuite/gas/i386/x86-64-evex.d
> @@ -17,6 +17,6 @@ Disassembly of section .text:
>   +[a-f0-9]+:	62 f1 d6 38 7b f0    	vcvtusi2ss %rax,\{rd-sae\},%xmm5,%xmm6
>   +[a-f0-9]+:	62 f1 57 38 7b f0    	vcvtusi2sd %eax,\{rd-bad\},%xmm5,%xmm6
>   +[a-f0-9]+:	62 f1 d7 38 7b f0    	vcvtusi2sd %rax,\{rd-sae\},%xmm5,%xmm6
> - +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,\(bad\)
> + +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,%r16d
>   +[a-f0-9]+:	62 e1 7c 08 c2 c0 00 	vcmpeqps %xmm0,%xmm0,\(bad\)
>  #pass
> diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> index 91c068d5b40..ffacc9c8e2b 100644
> --- a/gas/testsuite/gas/i386/x86-64.exp
> +++ b/gas/testsuite/gas/i386/x86-64.exp
> @@ -250,7 +250,7 @@ run_dump_test "x86-64-sse-noavx"
>  run_dump_test "x86-64-movbe"
>  run_dump_test "x86-64-movbe-intel"
>  run_dump_test "x86-64-movbe-suffix"
> -run_list_test "x86-64-inval-movbe" "-al"
> +run_list_test "x86-64-inval-movbe" "-march=+noapx_f -al"
>  run_dump_test "x86-64-ept"
>  run_dump_test "x86-64-ept-intel"
>  run_list_test "x86-64-inval-ept" "-al"
> diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
> index 28da54922c7..54ed48c6952 100644
> --- a/opcodes/i386-dis-evex-prefix.h
> +++ b/opcodes/i386-dis-evex-prefix.h
> @@ -338,6 +338,64 @@
>      { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
>      { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
>    },
> +  /* PREFIX_EVEX_MAP4_D8 */
> +  {
> +    { "sha1nexte", { XM, EXxmm }, 0 },
> +    { REG_TABLE (REG_0F38D8_PREFIX_1) },
> +  },
> +  /* PREFIX_EVEX_MAP4_DA */
> +  {
> +    { "sha1msg2", { XM, EXxmm }, 0 },
> +    { "encodekey128", { Gd, Rd }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_DB */
> +  {
> +    { "sha256rnds2", { XM, EXxmm, XMM0 }, 0 },
> +    { "encodekey256", { Gd, Rd }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_DC */
> +  {
> +    { "sha256msg1", { XM, EXxmm }, 0 },
> +    { "aesenc128kl", { XM, M }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_DD */
> +  {
> +    { "sha256msg2", { XM, EXxmm }, 0 },
> +    { "aesdec128kl", { XM, M }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_DE */
> +  {
> +    { Bad_Opcode },
> +    { "aesenc256kl", { XM, M }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_DF */
> +  {
> +    { Bad_Opcode },
> +    { "aesdec256kl", { XM, M }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_F0 */
> +  {
> +    { "crc32A", { Gdq, Eb }, 0 },
> +    { "invept", { Gm, Mo }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_F1 */
> +  {
> +    { "crc32Q", { Gdq, Ev }, 0 },
> +    { "invvpid", { Gm, Mo }, 0 },
> +    { "crc32Q", { Gdq, Ev }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_F2 */
> +  {
> +    { Bad_Opcode },
> +    { "invpcid", { Gm, M }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_F8 */
> +  {
> +    { Bad_Opcode },
> +    { "enqcmds", { Gva, M },  0 },
> +    { "movdir64b", { Gva, M }, 0 },
> +    { "enqcmd", { Gva, M }, 0 },
> +  },
>    /* PREFIX_EVEX_MAP5_10 */
>    {
>      { Bad_Opcode },
> diff --git a/opcodes/i386-dis-evex-x86-64.h b/opcodes/i386-dis-evex-x86-64.h
> new file mode 100644
> index 00000000000..0d9d98a7691
> --- /dev/null
> +++ b/opcodes/i386-dis-evex-x86-64.h
> @@ -0,0 +1,50 @@
> +  /* X86_64_EVEX_0F90 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F90_L_0) },
> +  },
> +  /* X86_64_EVEX_0F91 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F91_L_0) },
> +  },
> +  /* X86_64_EVEX_0F92 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F92_L_0) },
> +  },
> +  /* X86_64_EVEX_0F93 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F93_L_0) },
> +  },
> +  /* X86_64_EVEX_0F38F2 */
> +  {
> +    { Bad_Opcode },
> +    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
> +  },
> +  /* X86_64_EVEX_0F38F3 */
> +  {
> +    { Bad_Opcode },
> +    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
> +  },
> +  /* X86_64_EVEX_0F38F5 */
> +  {
> +    { Bad_Opcode },
> +    { PREFIX_TABLE (PREFIX_VEX_0F38F5_L_0) },
> +  },
> +  /* X86_64_EVEX_0F38F6 */
> +  {
> +    { Bad_Opcode },
> +    { PREFIX_TABLE(PREFIX_VEX_0F38F6_L_0) },
> +  },
> +  /* X86_64_EVEX_0F38F7 */
> +  {
> +    { Bad_Opcode },
> +    { PREFIX_TABLE(PREFIX_VEX_0F38F7_L_0) },
> +  },
> +  /* X86_64_EVEX_0F3AF0 */
> +  {
> +    { Bad_Opcode },
> +    { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) },
> +  },
> diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
> index 7ad1edbe72d..90c063b2188 100644
> --- a/opcodes/i386-dis-evex.h
> +++ b/opcodes/i386-dis-evex.h
> @@ -164,10 +164,10 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 90 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F90) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F91) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F92) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F93) },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -375,9 +375,9 @@ static const struct dis386 evex_table[][256] = {
>      { "vpsllv%DQ",	{ XM, Vex, EXx }, PREFIX_DATA },
>      /* 48 */
>      { Bad_Opcode },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F3849) },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F384B) },
>      { "vrcp14p%XW",	{ XM, EXx }, PREFIX_DATA },
>      { "vrcp14s%XW",	{ XMScalar, VexScalar, EXdq }, PREFIX_DATA },
>      { "vrsqrt14p%XW",	{ XM, EXx }, 0 },
> @@ -545,32 +545,32 @@ static const struct dis386 evex_table[][256] = {
>      { "%XEvaesdecY",	{ XM, Vex, EXx }, PREFIX_DATA },
>      { "%XEvaesdeclastY", { XM, Vex, EXx }, PREFIX_DATA },
>      /* E0 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E0) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E1) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E2) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E3) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E4) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E5) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E6) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E7) },
>      /* E8 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E8) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E9) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EA) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EB) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EC) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38ED) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EE) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EF) },
>      /* F0 */
>      { Bad_Opcode },
>      { Bad_Opcode },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F2) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F3) },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F5) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F6) },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F7) },
>      /* F8 */
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -854,7 +854,7 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* F0 */
> -    { Bad_Opcode },
> +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F3AF0) },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -983,13 +983,13 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 60 */
> +    { "movbeS",	{ Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "movbeS",	{ Ev, Gv }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "wrussK",	{ M, Gdq }, PREFIX_DATA },
> +    { PREFIX_TABLE (PREFIX_0F38F6) },
>      { Bad_Opcode },
>      /* 68 */
>      { Bad_Opcode },
> @@ -1113,19 +1113,19 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "sha1rnds4",	{ XM, EXxmm, Ib }, NO_PREFIX },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* D8 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_D8) },
> +    { "sha1msg1",	{ XM, EXxmm }, NO_PREFIX },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DA) },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DB) },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DC) },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DD) },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DE) },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DF) },
>      /* E0 */
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -1145,20 +1145,20 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* F0 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F0) },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F1) },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F2) },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* F8 */
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
> +    { "movdiri",	{ Mdq, Gdq }, NO_PREFIX },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { PREFIX_TABLE (PREFIX_0F38FC) },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index e006d869258..d4d32befcf9 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -132,6 +132,13 @@ enum x86_64_isa
>    intel64
>  };
>  
> +enum evex_type
> +{
> +  evex_default = 0,
> +  evex_from_legacy,
> +  evex_from_vex,
> +};
> +
>  struct instr_info
>  {
>    enum address_mode address_mode;
> @@ -212,7 +219,6 @@ struct instr_info
>      int ll;
>      bool w;
>      bool evex;
> -    bool r;
>      bool v;
>      bool zeroing;
>      bool b;
> @@ -220,6 +226,8 @@ struct instr_info
>    }
>    vex;
>  
> +  enum evex_type evex_type;
> +
>    /* Remember if the current op is a jump instruction.  */
>    bool op_is_jump;
>  
> @@ -303,6 +311,8 @@ struct dis_private {
>  #define PREFIX_ADDR 0x400
>  #define PREFIX_FWAIT 0x800
>  #define PREFIX_REX2 0x1000
> +#define PREFIX_NP_OR_DATA 0x2000
> +#define NO_PREFIX   0x4000
>  
>  /* Make sure that bytes from INFO->PRIVATE_DATA->BUFFER (inclusive)
>     to ADDR (exclusive) are valid.  Returns true for success, false
> @@ -800,6 +810,7 @@ enum
>    USE_RM_TABLE,
>    USE_PREFIX_TABLE,
>    USE_X86_64_TABLE,
> +  USE_X86_64_EVEX_FROM_VEX_TABLE,
>    USE_3BYTE_TABLE,
>    USE_XOP_8F_TABLE,
>    USE_VEX_C4_TABLE,
> @@ -818,6 +829,8 @@ enum
>  #define RM_TABLE(I)		DIS386 (USE_RM_TABLE, (I))
>  #define PREFIX_TABLE(I)		DIS386 (USE_PREFIX_TABLE, (I))
>  #define X86_64_TABLE(I)		DIS386 (USE_X86_64_TABLE, (I))
> +#define X86_64_EVEX_FROM_VEX_TABLE(I) \
> +  DIS386 (USE_X86_64_EVEX_FROM_VEX_TABLE, (I))
>  #define THREE_BYTE_TABLE(I)	DIS386 (USE_3BYTE_TABLE, (I))
>  #define XOP_8F_TABLE()		DIS386 (USE_XOP_8F_TABLE, 0)
>  #define VEX_C4_TABLE()		DIS386 (USE_VEX_C4_TABLE, 0)
> @@ -866,7 +879,7 @@ enum
>    REG_VEX_0F73,
>    REG_VEX_0FAE,
>    REG_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0,
> -  REG_VEX_0F38F3_L_0,
> +  REG_VEX_0F38F3_L_0_P_0,
>    REG_VEX_MAP7_F8_L_0_W_0,
>  
>    REG_XOP_09_01_L_0,
> @@ -878,7 +891,7 @@ enum
>    REG_EVEX_0F72,
>    REG_EVEX_0F73,
>    REG_EVEX_0F38C6_L_2,
> -  REG_EVEX_0F38C7_L_2
> +  REG_EVEX_0F38C7_L_2,
>  };
>  
>  enum
> @@ -1094,6 +1107,8 @@ enum
>    PREFIX_VEX_0F38CC,
>    PREFIX_VEX_0F38CD,
>    PREFIX_VEX_0F38DA_W_0,
> +  PREFIX_VEX_0F38F2_L_0,
> +  PREFIX_VEX_0F38F3_L_0,
>    PREFIX_VEX_0F38F5_L_0,
>    PREFIX_VEX_0F38F6_L_0,
>    PREFIX_VEX_0F38F7_L_0,
> @@ -1156,6 +1171,18 @@ enum
>    PREFIX_EVEX_0F3A67,
>    PREFIX_EVEX_0F3AC2,
>  
> +  PREFIX_EVEX_MAP4_D8,
> +  PREFIX_EVEX_MAP4_DA,
> +  PREFIX_EVEX_MAP4_DB,
> +  PREFIX_EVEX_MAP4_DC,
> +  PREFIX_EVEX_MAP4_DD,
> +  PREFIX_EVEX_MAP4_DE,
> +  PREFIX_EVEX_MAP4_DF,
> +  PREFIX_EVEX_MAP4_F0,
> +  PREFIX_EVEX_MAP4_F1,
> +  PREFIX_EVEX_MAP4_F2,
> +  PREFIX_EVEX_MAP4_F8,
> +
>    PREFIX_EVEX_MAP5_10,
>    PREFIX_EVEX_MAP5_11,
>    PREFIX_EVEX_MAP5_1D,
> @@ -1267,7 +1294,19 @@ enum
>    X86_64_VEX_0F38ED,
>    X86_64_VEX_0F38EE,
>    X86_64_VEX_0F38EF,
> +
>    X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
> +
> +  X86_64_EVEX_0F90,
> +  X86_64_EVEX_0F91,
> +  X86_64_EVEX_0F92,
> +  X86_64_EVEX_0F93,
> +  X86_64_EVEX_0F38F2,
> +  X86_64_EVEX_0F38F3,
> +  X86_64_EVEX_0F38F5,
> +  X86_64_EVEX_0F38F6,
> +  X86_64_EVEX_0F38F7,
> +  X86_64_EVEX_0F3AF0,
>  };
>  
>  enum
> @@ -2882,12 +2921,12 @@ static const struct dis386 reg_table[][8] = {
>    {
>      { RM_TABLE (RM_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0_R_0) },
>    },
> -  /* REG_VEX_0F38F3_L_0 */
> +  /* REG_VEX_0F38F3_L_0_P_0 */
>    {
>      { Bad_Opcode },
> -    { "blsrS",		{ VexGdq, Edq }, PREFIX_OPCODE },
> -    { "blsmskS",	{ VexGdq, Edq }, PREFIX_OPCODE },
> -    { "blsiS",		{ VexGdq, Edq }, PREFIX_OPCODE },
> +    { "blsrS",		{ VexGdq, Edq }, 0 },
> +    { "blsmskS",	{ VexGdq, Edq }, 0 },
> +    { "blsiS",		{ VexGdq, Edq }, 0 },
>    },
>    /* REG_VEX_MAP7_F8_L_0_W_0 */
>    {
> @@ -4035,6 +4074,16 @@ static const struct dis386 prefix_table[][4] = {
>      { "vsm4rnds4", { XM, Vex, EXx }, 0 },
>    },
>  
> +  /* PREFIX_VEX_0F38F2_L_0 */
> +  {
> +    { "andnS",          { Gdq, VexGdq, Edq }, 0 },
> +  },
> +
> +  /* PREFIX_VEX_0F38F3_L_0 */
> +  {
> +    { REG_TABLE (REG_VEX_0F38F3_L_0_P_0) },
> +  },
> +
>    /* PREFIX_VEX_0F38F5_L_0 */
>    {
>      { "bzhiS",		{ Gdq, Edq, VexGdq }, 0 },
> @@ -4527,6 +4576,7 @@ static const struct dis386 x86_64_table[][2] = {
>      { PREFIX_TABLE (PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64) },
>    },
>  
> +#include "i386-dis-evex-x86-64.h"
>  };
>  
>  static const struct dis386 three_byte_table[][256] = {
> @@ -7113,12 +7163,12 @@ static const struct dis386 vex_len_table[][2] = {
>  
>    /* VEX_LEN_0F38F2 */
>    {
> -    { "andnS",		{ Gdq, VexGdq, Edq }, PREFIX_OPCODE },
> +    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
>    },
>  
>    /* VEX_LEN_0F38F3 */
>    {
> -    { REG_TABLE(REG_VEX_0F38F3_L_0) },
> +    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
>    },
>  
>    /* VEX_LEN_0F38F5 */
> @@ -8732,6 +8782,17 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>        dp = &prefix_table[dp->op[1].bytemode][vindex];
>        break;
>  
> +    case USE_X86_64_EVEX_FROM_VEX_TABLE:
> +      ins->evex_type = evex_from_vex;
> +      /* EVEX from VEX instructions require that EVEX.z, EVEX.L’L, EVEX.b and
> +	 the lower 2 bits of EVEX.aaa be 0.  */
> +      if ((ins->vex.mask_register_specifier & 0x3) != 0
> +	  || ins->vex.ll != 0
> +	  || ins->vex.zeroing != 0
> +	  || ins->vex.b)
> +	return &bad_opcode;
> +
> +      /* Fall through.  */
>      case USE_X86_64_TABLE:
>        vindex = ins->address_mode == mode_64bit ? 1 : 0;
>        dp = &x86_64_table[dp->op[1].bytemode][vindex];
> @@ -8977,9 +9038,13 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>        if (!fetch_code (ins->info, ins->codep + 4))
>  	return &err_opcode;
>        /* The first byte after 0x62.  */
> +      if (*ins->codep & 0x8)
> +	ins->rex2 |= REX_B;
> +      if (!(*ins->codep & 0x10))
> +	ins->rex2 |= REX_R;
> +
>        ins->rex = ~(*ins->codep >> 5) & 0x7;
> -      ins->vex.r = *ins->codep & 0x10;
> -      switch ((*ins->codep & 0xf))
> +      switch (*ins->codep & 0x7)
>  	{
>  	default:
>  	  return &bad_opcode;
> @@ -8992,6 +9057,12 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>  	case 0x3:
>  	  vex_table_index = EVEX_0F3A;
>  	  break;
> +	case 0x4:
> +	  vex_table_index = EVEX_MAP4;
> +	  ins->evex_type = evex_from_legacy;
> +	  if (ins->address_mode != mode_64bit)
> +	    return &bad_opcode;
> +	  break;
>  	case 0x5:
>  	  vex_table_index = EVEX_MAP5;
>  	  break;
> @@ -9008,9 +9079,8 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>  
>        ins->vex.register_specifier = (~(*ins->codep >> 3)) & 0xf;
>  
> -      /* The U bit.  */
>        if (!(*ins->codep & 0x4))
> -	return &bad_opcode;
> +	ins->rex2 |= REX_X;
>  
>        switch ((*ins->codep & 0x3))
>  	{
> @@ -9040,12 +9110,26 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>  
>        if (ins->address_mode != mode_64bit)
>  	{
> +	  /* Report bad for !evex_default and when either of the two fixed EVEX
> +	     bits is changed.  */
> +	  if (ins->evex_type != evex_default
> +	      || (ins->rex2 & (REX_B | REX_X)))
> +	    return &bad_opcode;
>  	  /* In 16/32-bit mode silently ignore following bits.  */
>  	  ins->rex &= ~REX_B;
> -	  ins->vex.r = true;
> +	  ins->rex2 &= ~REX_R;
>  	}
>  
>        ins->need_vex = 4;
> +
> +      /* EVEX from legacy instructions require that EVEX.z, EVEX.L’L and the
> +	 lower 2 bits of EVEX.aaa be 0.  */
> +      if (ins->evex_type == evex_from_legacy
> +	  && ((ins->vex.mask_register_specifier & 0x3) != 0
> +	      || ins->vex.ll != 0
> +	      || ins->vex.zeroing != 0))
> +	return &bad_opcode;
> +
>        ins->codep++;
>        vindex = *ins->codep++;
>        dp = &evex_table[vex_table_index][vindex];
> @@ -9460,6 +9544,13 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>        dp = get_valid_dis386 (dp, &ins);
>        if (dp == &err_opcode)
>  	goto fetch_error_out;
> +
> +      /* For APX instructions promoted from legacy maps 0/1, embedded prefix
> +	 is interpreted as the operand size override.  */
> +      if (ins.evex_type == evex_from_legacy
> +	  && ins.vex.prefix == DATA_PREFIX_OPCODE)
> +	sizeflag ^= DFLAG;
> +
>        if (dp != NULL && putop (&ins, dp->name, sizeflag) == 0)
>  	{
>  	  if (!get_sib (&ins, sizeflag))
> @@ -9639,6 +9730,25 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>        if (ins.last_repnz_prefix >= 0)
>  	ins.all_prefixes[ins.last_repnz_prefix] = 0xf2;
>        break;
> +
> +    case PREFIX_NP_OR_DATA:
> +      if (ins.vex.prefix == REPE_PREFIX_OPCODE
> +	  || ins.vex.prefix == REPNE_PREFIX_OPCODE)
> +	{
> +	  i386_dis_printf (info, dis_style_text, "(bad)");
> +	  ret = ins.end_codep - priv.the_buffer;
> +	  goto out;
> +	}
> +      break;
> +
> +    case NO_PREFIX:
> +      if (ins.vex.prefix)
> +	{
> +	  i386_dis_printf (info, dis_style_text, "(bad)");
> +	  ret = ins.end_codep - priv.the_buffer;
> +	  goto out;
> +	}
> +      break;
>      }
>  
>    /* Check if the REX prefix is used.  */
> @@ -10348,7 +10458,7 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
>  		{
>  		case 'X':
>  		  if (!ins->vex.evex || ins->vex.b || ins->vex.ll >= 2
> -		      || !ins->vex.r
> +		      || (ins->rex2 & REX_R)
>  		      || (ins->modrm.mod == 3 && (ins->rex & REX_X))
>  		      || !ins->vex.v || ins->vex.mask_register_specifier)
>  		    break;
> @@ -11459,7 +11569,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  
>    add += (ins->rex2 & REX_B) ? 16 : 0;
>  
> -  if (ins->vex.evex)
> +  if (ins->vex.evex && ins->evex_type == evex_default)
>      {
>  
>        /* Zeroing-masking is invalid for memory destinations. Set the flag
> @@ -11603,6 +11713,13 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  		abort ();
>  	      if (ins->vex.evex)
>  		{
> +		  /* S/G EVEX insns require EVEX.X4 not to be set.  */
> +		  if (ins->rex2 & REX_X)
> +		    {
> +		      oappend (ins, "(bad)");
> +		      return true;
> +		    }
> +
>  		  if (!ins->vex.v)
>  		    vindex += 16;
>  		  check_gather = ins->obufp == ins->op_out[1];
> @@ -11805,7 +11922,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  
>  	      if (ins->rex & REX_R)
>  	        modrm_reg += 8;
> -	      if (!ins->vex.r)
> +	      if (ins->rex2 & REX_R)
>  	        modrm_reg += 16;
>  	      if (vindex == modrm_reg)
>  		oappend (ins, "/(bad)");
> @@ -12011,10 +12128,7 @@ OP_indirE (instr_info *ins, int bytemode, int sizeflag)
>  static bool
>  OP_G (instr_info *ins, int bytemode, int sizeflag)
>  {
> -  if (ins->vex.evex && !ins->vex.r && ins->address_mode == mode_64bit)
> -    oappend (ins, "(bad)");
> -  else
> -    print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
> +  print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
>    return true;
>  }
>  
> @@ -12645,7 +12759,7 @@ OP_XMM (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>      reg += 8;
>    if (ins->vex.evex)
>      {
> -      if (!ins->vex.r)
> +      if (ins->rex2 & REX_R)
>  	reg += 16;
>      }
>  
> @@ -13652,7 +13766,7 @@ DistinctDest_Fixup (instr_info *ins, int bytemode, int sizeflag)
>    /* Calc destination register number.  */
>    if (ins->rex & REX_R)
>      modrm_reg += 8;
> -  if (!ins->vex.r)
> +  if (ins->rex2 & REX_R)
>      modrm_reg += 16;
>  
>    /* Calc src1 register number.  */
> diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> index dd4850e1855..508b441a343 100644
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -487,6 +487,7 @@ static bitfield opcode_modifiers[] =
>    BITFIELD (Dialect),
>    BITFIELD (ISA64),
>    BITFIELD (NoEgpr),
> +  BITFIELD (NF),
>  };
>  
>  #define CLASS(n) #n, n
> @@ -1120,6 +1121,7 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
>      SPACE(0F),
>      SPACE(0F38),
>      SPACE(0F3A),
> +    SPACE(EVEXMAP4),
>      SPACE(EVEXMAP5),
>      SPACE(EVEXMAP6),
>      SPACE(VEXMAP7),
> diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> index 8c967ea90b0..064ec48edad 100644
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -743,6 +743,9 @@ enum
>       whether the instruction supports pseudo-prefix {rex2}.  */
>    NoEgpr,
>  
> +  /* No CSPAZO flags update indication.  */
> +  NF,
> +
>    /* The last bitfield in i386_opcode_modifier.  */
>    Opcode_Modifier_Num
>  };
> @@ -788,6 +791,7 @@ typedef struct i386_opcode_modifier
>    unsigned int dialect:2;
>    unsigned int isa64:2;
>    unsigned int noegpr:1;
> +  unsigned int nf:1;
>  } i386_opcode_modifier;
>  
>  /* Operand classes.  */
> @@ -963,6 +967,7 @@ typedef struct insn_template
>       1: 0F opcode prefix / space.
>       2: 0F38 opcode prefix / space.
>       3: 0F3A opcode prefix / space.
> +     4: EVEXMAP4 opcode prefix / space.
>       5: EVEXMAP5 opcode prefix / space.
>       6: EVEXMAP6 opcode prefix / space.
>       7: VEXMAP7 opcode prefix / space.
> @@ -974,6 +979,7 @@ typedef struct insn_template
>  #define SPACE_0F	1
>  #define SPACE_0F38	2
>  #define SPACE_0F3A	3
> +#define SPACE_EVEXMAP4	4
>  #define SPACE_EVEXMAP5	5
>  #define SPACE_EVEXMAP6	6
>  #define SPACE_VEXMAP7	7
> diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> index 37d3e8663bb..11b8c0b63cb 100644
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -113,6 +113,7 @@
>  #define SpaceXOP09 OpcodeSpace=SPACE_XOP09
>  #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
>  
> +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
>  #define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
>  #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
>  
> @@ -139,6 +140,9 @@
>  
>  #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
>  
> +// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
> +#define APX_F(cpuid) cpuid&(cpuid|APX_F)
> +
>  // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
>  // the bit to mark commutative VEX encodings where swapping the source
>  // operands may allow to switch from 3-byte to 2-byte VEX encoding.
> @@ -194,6 +198,7 @@ mov, 0xf24, i386&No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Te
>  
>  // Move after swapping the bytes
>  movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +movbe, 0x60, Movbe&APX_F, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  
>  // Move with sign extend.
>  movsb, 0xfbe, i386, Modrm|No_bSuf|No_sSuf, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> @@ -1315,13 +1320,16 @@ getsec, 0xf37, SMX, NoSuf, {}
>  
>  invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
>  invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invept, 0xf3f0, EPT&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
>  invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
>  invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invvpid, 0xf3f1, EPT&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
>  
>  // INVPCID instruction
>  
>  invpcid, 0x660f3882, INVPCID&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
>  invpcid, 0x660f3882, INVPCID&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invpcid, 0xf3f2, INVPCID&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
>  
>  // SSSE3 instructions.
>  
> @@ -1422,6 +1430,8 @@ pcmpistri<sse42>, 0x660f3a63, <sse42:cpu>, Modrm|<sse42:attr>|NoSuf, { Imm8, Reg
>  pcmpistrm<sse42>, 0x660f3a62, <sse42:cpu>, Modrm|<sse42:attr>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
>  crc32, 0xf20f38f0, SSE4_2, W|Modrm|No_sSuf|No_qSuf, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
>  crc32, 0xf20f38f0, SSE4_2&x64, W|Modrm|No_wSuf|No_lSuf|No_sSuf, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
> +crc32, 0xf0, APX_F, W|Modrm|No_sSuf|No_qSuf|EVexMap4, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
> +crc32, 0xf0, APX_F, W|Modrm|No_wSuf|No_lSuf|No_sSuf|EVexMap4, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
>  
>  // xsave/xrstor New Instructions.
>  
> @@ -1836,14 +1846,14 @@ xtest, 0xf01d6, HLE|RTM, NoSuf, {}
>  
>  // BMI2 instructions.
>  
> -bzhi, 0xf5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> -mulx, 0xf2f6, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> -pdep, 0xf2f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> -pext, 0xf3f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> -rorx, 0xf2f0, BMI2, Modrm|CheckOperandSize|Vex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> -sarx, 0xf3f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> -shlx, 0x66f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> -shrx, 0xf2f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +bzhi, 0xf5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +mulx, 0xf2f6, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> +pdep, 0xf2f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> +pext, 0xf3f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> +rorx, 0xf2f0, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +sarx, 0xf3f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +shlx, 0x66f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +shrx, 0xf2f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
>  
>  // FMA4 instructions
>  
> @@ -1913,11 +1923,11 @@ lwpins, 0x12/0, LWP, Modrm|SpaceXOP0A|NoSuf|VexVVVV|Vex, { Imm32|Imm32S, Reg32|U
>  
>  // BMI instructions
>  
> -andn, 0xf2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> -bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> -blsi, 0xf3/3, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> -blsmsk, 0xf3/2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> -blsr, 0xf3/1, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +andn, 0xf2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> +bextr, 0xf7, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +blsi, 0xf3/3, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +blsmsk, 0xf3/2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +blsr, 0xf3/1, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
>  tzcnt, 0xf30fbc, BMI, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  
>  // TBM instructions
> @@ -2046,13 +2056,21 @@ bndldx, 0x0f1a, MPX, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND }
>  
>  // SHA instructions.
>  sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
> +sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
>  sha1nexte, 0xf38c8, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> +sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
>  sha1msg1, 0xf38c9, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> +sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
>  sha1msg2, 0xf38ca, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> +sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
>  sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
> +sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
>  sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> +sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
>  sha256msg1, 0xf38cc, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> +sha256msg1, 0xdc, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
>  sha256msg2, 0xf38cd, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> +sha256msg2, 0xdd, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
>  
>  // SHA512 instructions.
>  
> @@ -2114,9 +2132,9 @@ kor<bw>, 0x<bw:kpfx>45, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { R
>  kxnor<bw>, 0x<bw:kpfx>46, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
>  kxor<bw>, 0x<bw:kpfx>47, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
>  
> -kmov<bw>, 0x<bw:kpfx>90, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
> -kmov<bw>, 0x<bw:kpfx>91, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, <bw:elem>|Unspecified|BaseIndex }
> -kmov<bw>, 0x<bw:kpfx>92, <bw:kcpu>, D|Modrm|Vex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
> +kmov<bw>, 0x<bw:kpfx>90, APX_F(<bw:kcpu>), Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
> +kmov<bw>, 0x<bw:kpfx>91, APX_F(<bw:kcpu>), Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask, <bw:elem>|Unspecified|BaseIndex }
> +kmov<bw>, 0x<bw:kpfx>92, APX_F(<bw:kcpu>), D|Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
>  
>  knot<bw>, 0x<bw:kpfx>44, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
>  kortest<bw>, 0x<bw:kpfx>98, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
> @@ -2591,9 +2609,9 @@ vpmovzxdq, 0x6635, AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift
>  kadd<dq>, 0x<dq:kpfx>4a, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
>  kand<dq>, 0x<dq:kpfx>41, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
>  kandn<dq>, 0x<dq:kpfx>42, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf|Optimize, { RegMask, RegMask, RegMask }
> -kmov<dq>, 0x<dq:kpfx>90, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
> -kmov<dq>, 0x<dq:kpfx>91, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
> -kmov<dq>, 0xf292, AVX512BW, D|Modrm|Vex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
> +kmov<dq>, 0x<dq:kpfx>90, APX_F(AVX512BW), Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
> +kmov<dq>, 0x<dq:kpfx>91, APX_F(AVX512BW), Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
> +kmov<dq>, 0xf292, APX_F(AVX512BW), D|Modrm|Vex128|EVex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
>  knot<dq>, 0x<dq:kpfx>44, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
>  kor<dq>, 0x<dq:kpfx>45, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
>  kortest<dq>, 0x<dq:kpfx>98, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
> @@ -2992,9 +3010,13 @@ rdsspq, 0xf30f1e/1, SHSTK&x64, Modrm|NoSuf, { Reg64 }
>  saveprevssp, 0xf30f01ea, SHSTK, NoSuf, {}
>  rstorssp, 0xf30f01/5, SHSTK, Modrm|NoSuf, { Qword|Unspecified|BaseIndex }
>  wrssd, 0x0f38f6, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32, Dword|Unspecified|BaseIndex }
> +wrssd, 0x66, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4, { Reg32, Dword|Unspecified|BaseIndex }
>  wrssq, 0x0f38f6, SHSTK&x64, Modrm|NoSuf|Size64, { Reg64, Qword|Unspecified|BaseIndex }
> +wrssq, 0x66, SHSTK&APX_F, Modrm|NoSuf|Size64|EVexMap4, { Reg64, Qword|Unspecified|BaseIndex }
>  wrussd, 0x660f38f5, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32, Dword|Unspecified|BaseIndex }
> +wrussd, 0x6665, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4, { Reg32, Dword|Unspecified|BaseIndex }
>  wrussq, 0x660f38f5, SHSTK&x64, Modrm|NoSuf, { Reg64, Qword|Unspecified|BaseIndex }
> +wrussq, 0x6665, SHSTK&APX_F, Modrm|NoSuf|EVexMap4, { Reg64, Qword|Unspecified|BaseIndex }
>  setssbsy, 0xf30f01e8, SHSTK, NoSuf, {}
>  clrssbsy, 0xf30fae/6, SHSTK, Modrm|NoSuf, { Qword|Unspecified|BaseIndex }
>  endbr64, 0xf30f1efa, IBT, NoSuf, {}
> @@ -3042,7 +3064,9 @@ cldemote, 0x0f1c/0, CLDEMOTE, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
>  // MOVDIR[I,64B] instructions.
>  
>  movdiri, 0xf38f9, MOVDIRI, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> +movdiri, 0xf9, MOVDIRI&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
>  movdir64b, 0x660f38f8, MOVDIR64B, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +movdir64b, 0x66f8, MOVDIR64B&APX_F, Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
>  
>  // MOVEDIR instructions end.
>  
> @@ -3071,7 +3095,9 @@ vcvtneps2bf16<Vxy>, 0xf372, AVX_NE_CONVERT, Modrm|<Vxy:vex>|Space0F38|VexW0|NoSu
>  // ENQCMD instructions.
>  
>  enqcmd, 0xf20f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +enqcmd, 0xf2f8, APX_F(ENQCMD), Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
>  enqcmds, 0xf30f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +enqcmds, 0xf3f8, APX_F(ENQCMD), Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
>  
>  // ENQCMD instructions end.
>  
> @@ -3132,8 +3158,8 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
>  
>  // AMX instructions.
>  
> -ldtilecfg, 0x49/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
> -sttilecfg, 0x6649/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
> +ldtilecfg, 0x49/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
> +sttilecfg, 0x6649/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
>  
>  tcmmimfp16ps, 0x666c, AMX_COMPLEX, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
>  tcmmrlfp16ps, 0x6c, AMX_COMPLEX, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> @@ -3145,9 +3171,9 @@ tdpbuud, 0x5e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
>  tdpbusd, 0x665e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
>  tdpbsud, 0xf35e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
>  
> -tileloadd, 0xf24b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> -tileloaddt1, 0x664b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> -tilestored, 0xf34b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }
> +tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> +tileloaddt1, 0x664b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> +tilestored, 0xf34b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }
>  
>  tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {}
>  
> @@ -3159,15 +3185,25 @@ tilezero, 0xf249, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
>  
>  loadiwkey, 0xf30f38dc, KL, Load|Modrm|NoSuf, { RegXMM, RegXMM }
>  encodekey128, 0xf30f38fa, KL, Modrm|NoSuf, { Reg32, Reg32 }
> +encodekey128, 0xf3da, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32, Reg32 }
>  encodekey256, 0xf30f38fb, KL, Modrm|NoSuf, { Reg32, Reg32 }
> +encodekey256, 0xf3db, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32, Reg32 }
>  aesenc128kl, 0xf30f38dc, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
> +aesenc128kl, 0xf3dc, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
>  aesdec128kl, 0xf30f38dd, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
> +aesdec128kl, 0xf3dd, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
>  aesenc256kl, 0xf30f38de, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
> +aesenc256kl, 0xf3de, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
>  aesdec256kl, 0xf30f38df, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
> +aesdec256kl, 0xf3df, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
>  aesencwide128kl, 0xf30f38d8/0, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
> +aesencwide128kl, 0xf3d8/0, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
>  aesdecwide128kl, 0xf30f38d8/1, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
> +aesdecwide128kl, 0xf3d8/1, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
>  aesencwide256kl, 0xf30f38d8/2, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
> +aesencwide256kl, 0xf3d8/2, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
>  aesdecwide256kl, 0xf30f38d8/3, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
> +aesdecwide256kl, 0xf3d8/3, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
>  
>  // KEYLOCKER instructions end.
>  
> @@ -3315,7 +3351,7 @@ prefetchit1, 0xf18/6, PREFETCHI, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
>  
>  // CMPCCXADD instructions.
>  
> -cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD, Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> +cmp<cc>xadd, 0x66e<cc:opc>, APX_F(CMPCCXADD), Modrm|Vex|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
>  
>  // CMPCCXADD instructions end.
>  
> @@ -3335,9 +3371,13 @@ wrmsrlist, 0xf30f01c6, MSRLIST, NoSuf, {}
>  // RAO-INT instructions.
>  
>  aadd, 0xf38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> +aadd, 0xfc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
>  aand, 0x660f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> +aand, 0x66fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
>  aor, 0xf20f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> +aor, 0xf2fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
>  axor, 0xf30f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> +axor, 0xf3fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
>  
>  // RAO-INT instructions end.
>  
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.
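
For readers following the get_valid_dis386 hunks quoted above, here is a
rough standalone sketch of how the first byte after 0x62 is split once
bit 3 stops being part of the map field, together with the extra field
check applied to insns promoted from legacy maps.  It is only an
illustration: the struct, the SK_* constants (assumed to mirror the
REX_B/REX_X/REX_R bit values) and the function names are hypothetical
and are not part of the patch or of opcodes/i386-dis.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the REX_* bits; values are an assumption.  */
enum { SK_REX_B = 1, SK_REX_X = 2, SK_REX_R = 4 };

/* Hypothetical container for the fields taken from the first payload byte.  */
struct evex_p0
{
  unsigned int rex;	/* Inverted R/X/B bits, like "ins->rex".  */
  unsigned int rex2;	/* B4 and inverted R4, like "ins->rex2".  */
  unsigned int map;	/* Low three bits: opcode map number.  */
};

/* Split the first byte after 0x62, following the quoted hunk.  */
struct evex_p0
decode_evex_p0 (uint8_t byte)
{
  struct evex_p0 f = { 0, 0, 0 };

  if (byte & 0x8)		/* B4 is stored non-inverted.  */
    f.rex2 |= SK_REX_B;
  if (!(byte & 0x10))		/* R4 is stored inverted.  */
    f.rex2 |= SK_REX_R;
  f.rex = ~(byte >> 5) & 0x7;	/* R, X, B are stored inverted.  */
  f.map = byte & 0x7;		/* Mask is 0x7 now, no longer 0xf.  */
  return f;
}

/* Field check for insns promoted from legacy maps: EVEX.z, EVEX.L'L and
   the lower 2 bits of EVEX.aaa all have to be 0.  */
bool
promoted_legacy_fields_ok (unsigned int aaa, unsigned int ll, bool z)
{
  return (aaa & 0x3) == 0 && ll == 0 && !z;
}

/* Small usage example exercising both helpers.  */
void
evex_sketch_selfcheck (void)
{
  struct evex_p0 f = decode_evex_p0 (0xf4);

  assert (f.map == 4 && f.rex == 0 && f.rex2 == 0);
  assert (promoted_legacy_fields_ok (0, 0, false));
  assert (!promoted_legacy_fields_ok (1, 0, false));
}
```

The sketch leaves out the rest of the decode (the X4 bit in the next
byte, the per-map table selection and the mode_64bit checks), so it
should be read next to the diff rather than instead of it.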

* Re: [PATCH V5 4/9] Add tests for APX GPR32 with extend evex prefix
  2023-12-28  1:27 ` [PATCH V5 4/9] Add tests for " Cui, Lili
@ 2023-12-28  1:54   ` H.J. Lu
  0 siblings, 0 replies; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:54 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich

On Thu, Dec 28, 2023 at 01:27:09AM +0000, Cui, Lili wrote:
> gas/ChangeLog:
> 
> 2023-12-28 Lingling Kong <lingling.kong@intel.com>
> 	    H.J. Lu  <hongjiu.lu@intel.com>
> 	    Lili Cui <lili.cui@intel.com>
> 	    Lin Hu   <lin1.hu@intel.com>
> 
> 	* testsuite/gas/i386/x86-64-apx-egpr-inval.l: Add some insns that
> 	don't support gpr32.
> 	* testsuite/gas/i386/x86-64-apx-egpr-inval.s: Ditto.
> 	* testsuite/gas/i386/x86-64.exp: Add new test.
> 	* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: New test.
> 	* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: New test.
> 	* testsuite/gas/i386/x86-64-apx-evex-egpr.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-evex-egpr.s: New test.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: New test.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted.s: New test.
> ---
>  .../gas/i386/x86-64-apx-egpr-inval.l          | 187 ++++++++++
>  .../gas/i386/x86-64-apx-egpr-inval.s          | 191 +++++++++++
>  .../gas/i386/x86-64-apx-egpr-promote-inval.l  |  20 ++
>  .../gas/i386/x86-64-apx-egpr-promote-inval.s  |  29 ++
>  gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d |  20 ++
>  gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s |  21 ++
>  .../gas/i386/x86-64-apx-evex-promoted-bad.d   |  33 ++
>  .../gas/i386/x86-64-apx-evex-promoted-bad.s   |  39 +++
>  .../gas/i386/x86-64-apx-evex-promoted-intel.d | 318 ++++++++++++++++++
>  .../gas/i386/x86-64-apx-evex-promoted.d       | 318 ++++++++++++++++++
>  .../gas/i386/x86-64-apx-evex-promoted.s       | 314 +++++++++++++++++
>  gas/testsuite/gas/i386/x86-64.exp             |   5 +
>  12 files changed, 1495 insertions(+)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
> 
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
> index bb5c602a2e2..0472748978a 100644
> --- a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
> +++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
> @@ -12,4 +12,191 @@
>  .*:16: Error: extended GPR cannot be used as base/index for `xsaveopt64'
>  .*:17: Error: extended GPR cannot be used as base/index for `xsavec'
>  .*:18: Error: extended GPR cannot be used as base/index for `xsavec64'
> +.*:20: Error: extended GPR cannot be used as base/index for `blendpd'
> +.*:21: Error: extended GPR cannot be used as base/index for `blendps'
> +.*:22: Error: extended GPR cannot be used as base/index for `blendvpd'
> +.*:23: Error: extended GPR cannot be used as base/index for `blendvpd'
> +.*:24: Error: extended GPR cannot be used as base/index for `blendvps'
> +.*:25: Error: extended GPR cannot be used as base/index for `blendvps'
> +.*:26: Error: extended GPR cannot be used as base/index for `dppd'
> +.*:27: Error: extended GPR cannot be used as base/index for `dpps'
> +.*:28: Error: register type mismatch for `extractps'
> +.*:29: Error: extended GPR cannot be used as base/index for `extractps'
> +.*:30: Error: extended GPR cannot be used as base/index for `insertps'
> +.*:31: Error: extended GPR cannot be used as base/index for `movntdqa'
> +.*:32: Error: extended GPR cannot be used as base/index for `mpsadbw'
> +.*:33: Error: extended GPR cannot be used as base/index for `pabsb'
> +.*:34: Error: extended GPR cannot be used as base/index for `pabsd'
> +.*:35: Error: extended GPR cannot be used as base/index for `pabsw'
> +.*:36: Error: extended GPR cannot be used as base/index for `packusdw'
> +.*:37: Error: extended GPR cannot be used as base/index for `palignr'
> +.*:38: Error: extended GPR cannot be used as base/index for `pblendvb'
> +.*:39: Error: extended GPR cannot be used as base/index for `pblendvb'
> +.*:40: Error: extended GPR cannot be used as base/index for `pblendw'
> +.*:41: Error: extended GPR cannot be used as base/index for `pcmpeqq'
> +.*:42: Error: extended GPR cannot be used as base/index for `pcmpestri'
> +.*:43: Error: extended GPR cannot be used as base/index for `pcmpestrm'
> +.*:44: Error: extended GPR cannot be used as base/index for `pcmpgtq'
> +.*:45: Error: extended GPR cannot be used as base/index for `pcmpistri'
> +.*:46: Error: extended GPR cannot be used as base/index for `pcmpistrm'
> +.*:47: Error: register type mismatch for `pextrb'
> +.*:48: Error: extended GPR cannot be used as base/index for `pextrb'
> +.*:49: Error: extended GPR cannot be used as base/index for `pextrd'
> +.*:50: Error: extended GPR cannot be used as base/index for `pextrq'
> +.*:51: Error: extended GPR cannot be used as base/index for `pextrw'
> +.*:52: Error: extended GPR cannot be used as base/index for `phaddd'
> +.*:53: Error: extended GPR cannot be used as base/index for `phaddsw'
> +.*:54: Error: extended GPR cannot be used as base/index for `phaddw'
> +.*:55: Error: extended GPR cannot be used as base/index for `phminposuw'
> +.*:56: Error: extended GPR cannot be used as base/index for `phsubw'
> +.*:57: Error: register type mismatch for `pinsrb'
> +.*:58: Error: extended GPR cannot be used as base/index for `pinsrb'
> +.*:59: Error: register type mismatch for `pinsrd'
> +.*:60: Error: extended GPR cannot be used as base/index for `pinsrd'
> +.*:61: Error: register type mismatch for `pinsrq'
> +.*:62: Error: extended GPR cannot be used as base/index for `pinsrq'
> +.*:63: Error: extended GPR cannot be used as base/index for `pmaddubsw'
> +.*:64: Error: extended GPR cannot be used as base/index for `pmaxsb'
> +.*:65: Error: extended GPR cannot be used as base/index for `pmaxsd'
> +.*:66: Error: extended GPR cannot be used as base/index for `pmaxud'
> +.*:67: Error: extended GPR cannot be used as base/index for `pmaxuw'
> +.*:68: Error: extended GPR cannot be used as base/index for `pminsb'
> +.*:69: Error: extended GPR cannot be used as base/index for `pminsd'
> +.*:70: Error: extended GPR cannot be used as base/index for `pminud'
> +.*:71: Error: extended GPR cannot be used as base/index for `pminuw'
> +.*:72: Error: extended GPR cannot be used as base/index for `pmovsxbd'
> +.*:73: Error: extended GPR cannot be used as base/index for `pmovsxbq'
> +.*:74: Error: extended GPR cannot be used as base/index for `pmovsxbw'
> +.*:75: Error: extended GPR cannot be used as base/index for `pmovsxbw'
> +.*:76: Error: extended GPR cannot be used as base/index for `pmovsxdq'
> +.*:77: Error: extended GPR cannot be used as base/index for `pmovsxwd'
> +.*:78: Error: extended GPR cannot be used as base/index for `pmovsxwq'
> +.*:79: Error: extended GPR cannot be used as base/index for `pmovzxbd'
> +.*:80: Error: extended GPR cannot be used as base/index for `pmovzxbq'
> +.*:81: Error: extended GPR cannot be used as base/index for `pmovzxdq'
> +.*:82: Error: extended GPR cannot be used as base/index for `pmovzxwd'
> +.*:83: Error: extended GPR cannot be used as base/index for `pmovzxwq'
> +.*:84: Error: extended GPR cannot be used as base/index for `pmuldq'
> +.*:85: Error: extended GPR cannot be used as base/index for `pmulhrsw'
> +.*:86: Error: extended GPR cannot be used as base/index for `pmulld'
> +.*:87: Error: extended GPR cannot be used as base/index for `pshufb'
> +.*:88: Error: extended GPR cannot be used as base/index for `psignb'
> +.*:89: Error: extended GPR cannot be used as base/index for `psignd'
> +.*:90: Error: extended GPR cannot be used as base/index for `psignw'
> +.*:91: Error: extended GPR cannot be used as base/index for `roundpd'
> +.*:92: Error: extended GPR cannot be used as base/index for `roundps'
> +.*:93: Error: extended GPR cannot be used as base/index for `roundsd'
> +.*:94: Error: extended GPR cannot be used as base/index for `roundss'
> +.*:96: Error: extended GPR cannot be used as base/index for `aesdec'
> +.*:97: Error: extended GPR cannot be used as base/index for `aesdeclast'
> +.*:98: Error: extended GPR cannot be used as base/index for `aesenc'
> +.*:99: Error: extended GPR cannot be used as base/index for `aesenclast'
> +.*:100: Error: extended GPR cannot be used as base/index for `aesimc'
> +.*:101: Error: extended GPR cannot be used as base/index for `aeskeygenassist'
> +.*:102: Error: extended GPR cannot be used as base/index for `pclmulhqhqdq'
> +.*:103: Error: extended GPR cannot be used as base/index for `pclmulhqlqdq'
> +.*:104: Error: extended GPR cannot be used as base/index for `pclmullqhqdq'
> +.*:105: Error: extended GPR cannot be used as base/index for `pclmullqlqdq'
> +.*:106: Error: extended GPR cannot be used as base/index for `pclmulqdq'
> +.*:108: Error: extended GPR cannot be used as base/index for `gf2p8affineinvqb'
> +.*:109: Error: extended GPR cannot be used as base/index for `gf2p8affineqb'
> +.*:110: Error: extended GPR cannot be used as base/index for `gf2p8mulb'
> +.*:112: Error: extended GPR cannot be used as base/index for `vaesimc'
> +.*:113: Error: extended GPR cannot be used as base/index for `vaeskeygenassist'
> +.*:114: Error: extended GPR cannot be used as base/index for `vblendpd'
> +.*:115: Error: extended GPR cannot be used as base/index for `vblendpd'
> +.*:116: Error: extended GPR cannot be used as base/index for `vblendps'
> +.*:117: Error: extended GPR cannot be used as base/index for `vblendps'
> +.*:118: Error: extended GPR cannot be used as base/index for `vblendvpd'
> +.*:119: Error: extended GPR cannot be used as base/index for `vblendvpd'
> +.*:120: Error: extended GPR cannot be used as base/index for `vblendvps'
> +.*:121: Error: extended GPR cannot be used as base/index for `vblendvps'
> +.*:122: Error: extended GPR cannot be used as base/index for `vdppd'
> +.*:123: Error: extended GPR cannot be used as base/index for `vdpps'
> +.*:124: Error: extended GPR cannot be used as base/index for `vdpps'
> +.*:125: Error: extended GPR cannot be used as base/index for `vhaddpd'
> +.*:126: Error: extended GPR cannot be used as base/index for `vhaddpd'
> +.*:127: Error: extended GPR cannot be used as base/index for `vhsubps'
> +.*:128: Error: extended GPR cannot be used as base/index for `vhsubps'
> +.*:129: Error: extended GPR cannot be used as base/index for `vlddqu'
> +.*:130: Error: extended GPR cannot be used as base/index for `vlddqu'
> +.*:131: Error: extended GPR cannot be used as base/index for `vldmxcsr'
> +.*:132: Error: extended GPR cannot be used as base/index for `vmaskmovpd'
> +.*:133: Error: extended GPR cannot be used as base/index for `vmaskmovpd'
> +.*:134: Error: extended GPR cannot be used as base/index for `vmaskmovpd'
> +.*:135: Error: extended GPR cannot be used as base/index for `vmaskmovpd'
> +.*:136: Error: extended GPR cannot be used as base/index for `vmaskmovps'
> +.*:137: Error: extended GPR cannot be used as base/index for `vmaskmovps'
> +.*:138: Error: extended GPR cannot be used as base/index for `vmaskmovps'
> +.*:139: Error: extended GPR cannot be used as base/index for `vmaskmovps'
> +.*:140: Error: register type mismatch for `vmovmskpd'
> +.*:141: Error: register type mismatch for `vmovmskpd'
> +.*:142: Error: register type mismatch for `vmovmskps'
> +.*:143: Error: register type mismatch for `vmovmskps'
> +.*:144: Error: extended GPR cannot be used as base/index for `vpblendd'
> +.*:145: Error: extended GPR cannot be used as base/index for `vpblendd'
> +.*:146: Error: extended GPR cannot be used as base/index for `vpblendvb'
> +.*:147: Error: extended GPR cannot be used as base/index for `vpblendvb'
> +.*:148: Error: extended GPR cannot be used as base/index for `vpblendw'
> +.*:149: Error: extended GPR cannot be used as base/index for `vpblendw'
> +.*:150: Error: extended GPR cannot be used as base/index for `vpcmpeqb'
> +.*:151: Error: extended GPR cannot be used as base/index for `vpcmpeqd'
> +.*:152: Error: extended GPR cannot be used as base/index for `vpcmpeqq'
> +.*:153: Error: extended GPR cannot be used as base/index for `vpcmpeqw'
> +.*:154: Error: extended GPR cannot be used as base/index for `vpcmpestri'
> +.*:155: Error: extended GPR cannot be used as base/index for `vpcmpestrm'
> +.*:156: Error: extended GPR cannot be used as base/index for `vpcmpgtb'
> +.*:157: Error: extended GPR cannot be used as base/index for `vpcmpgtd'
> +.*:158: Error: extended GPR cannot be used as base/index for `vpcmpgtq'
> +.*:159: Error: extended GPR cannot be used as base/index for `vpcmpgtw'
> +.*:160: Error: extended GPR cannot be used as base/index for `vpcmpistri'
> +.*:161: Error: extended GPR cannot be used as base/index for `vpcmpistrm'
> +.*:162: Error: extended GPR cannot be used as base/index for `vperm2f128'
> +.*:163: Error: extended GPR cannot be used as base/index for `vperm2i128'
> +.*:164: Error: extended GPR cannot be used as base/index for `vphaddd'
> +.*:165: Error: extended GPR cannot be used as base/index for `vphaddd'
> +.*:166: Error: extended GPR cannot be used as base/index for `vphaddsw'
> +.*:167: Error: extended GPR cannot be used as base/index for `vphaddsw'
> +.*:168: Error: extended GPR cannot be used as base/index for `vphaddw'
> +.*:169: Error: extended GPR cannot be used as base/index for `vphaddw'
> +.*:170: Error: extended GPR cannot be used as base/index for `vphminposuw'
> +.*:171: Error: extended GPR cannot be used as base/index for `vphsubd'
> +.*:172: Error: extended GPR cannot be used as base/index for `vphsubd'
> +.*:173: Error: extended GPR cannot be used as base/index for `vphsubsw'
> +.*:174: Error: extended GPR cannot be used as base/index for `vphsubsw'
> +.*:175: Error: extended GPR cannot be used as base/index for `vphsubw'
> +.*:176: Error: extended GPR cannot be used as base/index for `vphsubw'
> +.*:177: Error: extended GPR cannot be used as base/index for `vpmaskmovd'
> +.*:178: Error: extended GPR cannot be used as base/index for `vpmaskmovd'
> +.*:179: Error: extended GPR cannot be used as base/index for `vpmaskmovd'
> +.*:180: Error: extended GPR cannot be used as base/index for `vpmaskmovd'
> +.*:181: Error: extended GPR cannot be used as base/index for `vpmaskmovq'
> +.*:182: Error: extended GPR cannot be used as base/index for `vpmaskmovq'
> +.*:183: Error: extended GPR cannot be used as base/index for `vpmaskmovq'
> +.*:184: Error: extended GPR cannot be used as base/index for `vpmaskmovq'
> +.*:185: Error: register type mismatch for `vpmovmskb'
> +.*:186: Error: register type mismatch for `vpmovmskb'
> +.*:187: Error: extended GPR cannot be used as base/index for `vpsignb'
> +.*:188: Error: extended GPR cannot be used as base/index for `vpsignb'
> +.*:189: Error: extended GPR cannot be used as base/index for `vpsignd'
> +.*:190: Error: extended GPR cannot be used as base/index for `vpsignd'
> +.*:191: Error: extended GPR cannot be used as base/index for `vpsignw'
> +.*:192: Error: extended GPR cannot be used as base/index for `vpsignw'
> +.*:193: Error: extended GPR cannot be used as base/index for `vptest'
> +.*:194: Error: extended GPR cannot be used as base/index for `vptest'
> +.*:195: Error: extended GPR cannot be used as base/index for `vrcpps'
> +.*:196: Error: extended GPR cannot be used as base/index for `vrcpps'
> +.*:197: Error: extended GPR cannot be used as base/index for `vrcpss'
> +.*:198: Error: extended GPR cannot be used as base/index for `vroundpd'
> +.*:199: Error: extended GPR cannot be used as base/index for `vroundps'
> +.*:200: Error: extended GPR cannot be used as base/index for `vroundsd'
> +.*:201: Error: extended GPR cannot be used as base/index for `vroundss'
> +.*:202: Error: extended GPR cannot be used as base/index for `vrsqrtps'
> +.*:203: Error: extended GPR cannot be used as base/index for `vrsqrtps'
> +.*:204: Error: extended GPR cannot be used as base/index for `vrsqrtss'
> +.*:205: Error: extended GPR cannot be used as base/index for `vstmxcsr'
> +.*:206: Error: extended GPR cannot be used as base/index for `vtestpd'
> +.*:207: Error: extended GPR cannot be used as base/index for `vtestpd'
> +.*:208: Error: extended GPR cannot be used as base/index for `vtestps'
> +.*:209: Error: extended GPR cannot be used as base/index for `vtestps'
>  #pass
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
> index bfb6b3fd03b..fde038d6b2f 100644
> --- a/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
> +++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
> @@ -16,3 +16,194 @@
>  	xsaveopt64 (%r16, %r31)
>  	xsavec (%r16, %rbx)
>  	xsavec64 (%r16, %r31)
> +#SSE
> +	blendpd $100,(%r18),%xmm6
> +	blendps $100,(%r18),%xmm6
> +	blendvpd %xmm0,(%r19),%xmm6
> +	blendvpd (%r19),%xmm6
> +	blendvps %xmm0,(%r19),%xmm6
> +	blendvps (%r19),%xmm6
> +	dppd $100,(%r20),%xmm6
> +	dpps $100,(%r20),%xmm6
> +	extractps $100,%xmm4,%r21
> +	extractps $100,%xmm4,(%r21)
> +	insertps $100,(%r21),%xmm6
> +	movntdqa (%r21),%xmm4
> +	mpsadbw $100,(%r21),%xmm6
> +	pabsb (%r17),%xmm0
> +	pabsd (%r17),%xmm0
> +	pabsw (%r17),%xmm0
> +	packusdw (%r21),%xmm6
> +	palignr $100,(%r17),%xmm6
> +	pblendvb %xmm0,(%r22),%xmm6
> +	pblendvb (%r22),%xmm6
> +	pblendw $100,(%r22),%xmm6
> +	pcmpeqq (%r22),%xmm6
> +	pcmpestri $100,(%r25),%xmm6
> +	pcmpestrm $100,(%r25),%xmm6
> +	pcmpgtq (%r25),%xmm4
> +	pcmpistri $100,(%r25),%xmm6
> +	pcmpistrm $100,(%r25),%xmm6
> +	pextrb $100,%xmm4,%r22
> +	pextrb $100,%xmm4,(%r22)
> +	pextrd $100,%xmm4,(%r22)
> +	pextrq $100,%xmm4,(%r22)
> +	pextrw $100,%xmm4,(%r22)
> +	phaddd  (%r17),%xmm0
> +	phaddsw (%r17),%xmm0
> +	phaddw  (%r17),%xmm0
> +	phminposuw (%r23),%xmm4
> +	phsubw (%r17),%xmm0
> +	pinsrb $100,%r23,%xmm4
> +	pinsrb $100,(%r23),%xmm4
> +	pinsrd $100, %r23d, %xmm4
> +	pinsrd $100,(%r23),%xmm4
> +	pinsrq $100, %r24, %xmm4
> +	pinsrq $100,(%r24),%xmm4
> +	pmaddubsw (%r17),%xmm0
> +	pmaxsb (%r24),%xmm6
> +	pmaxsd (%r24),%xmm6
> +	pmaxud (%r24),%xmm6
> +	pmaxuw (%r24),%xmm6
> +	pminsb (%r24),%xmm6
> +	pminsd (%r24),%xmm6
> +	pminud (%r24),%xmm6
> +	pminuw (%r24),%xmm6
> +	pmovsxbd (%r24),%xmm4
> +	pmovsxbq (%r24),%xmm4
> +	pmovsxbw (%r24),%xmm4
> +	pmovsxbw (%r24),%xmm4
> +	pmovsxdq (%r24),%xmm4
> +	pmovsxwd (%r24),%xmm4
> +	pmovsxwq (%r24),%xmm4
> +	pmovzxbd (%r24),%xmm4
> +	pmovzxbq (%r24),%xmm4
> +	pmovzxdq (%r24),%xmm4
> +	pmovzxwd (%r24),%xmm4
> +	pmovzxwq (%r24),%xmm4
> +	pmuldq (%r24),%xmm4
> +	pmulhrsw (%r17),%xmm0
> +	pmulld (%r24),%xmm4
> +	pshufb (%r17),%xmm0
> +	psignb (%r17),%xmm0
> +	psignd (%r17),%xmm0
> +	psignw (%r17),%xmm0
> +	roundpd $100,(%r24),%xmm6
> +	roundps $100,(%r24),%xmm6
> +	roundsd $100,(%r24),%xmm6
> +	roundss $100,(%r24),%xmm6
> +#AES
> +	aesdec (%r26),%xmm6
> +	aesdeclast (%r26),%xmm6
> +	aesenc (%r26),%xmm6
> +	aesenclast (%r26),%xmm6
> +	aesimc (%r26),%xmm6
> +	aeskeygenassist $100,(%r26),%xmm6
> +	pclmulhqhqdq (%r26),%xmm6
> +	pclmulhqlqdq (%r26),%xmm6
> +	pclmullqhqdq (%r26),%xmm6
> +	pclmullqlqdq (%r26),%xmm6
> +	pclmulqdq $100,(%r26),%xmm6
> +#GFNI
> +	gf2p8affineinvqb $100,(%r26),%xmm6
> +	gf2p8affineqb $100,(%r26),%xmm6
> +	gf2p8mulb (%r26),%xmm6
> +#VEX without evex
> +	vaesimc (%r27), %xmm3
> +	vaeskeygenassist $7,(%r27),%xmm3
> +	vblendpd $7,(%r27),%xmm6,%xmm2
> +	vblendpd $7,(%r27),%ymm6,%ymm2
> +	vblendps $7,(%r27),%xmm6,%xmm2
> +	vblendps $7,(%r27),%ymm6,%ymm2
> +	vblendvpd %xmm4,(%r27),%xmm2,%xmm7
> +	vblendvpd %ymm4,(%r27),%ymm2,%ymm7
> +	vblendvps %xmm4,(%r27),%xmm2,%xmm7
> +	vblendvps %ymm4,(%r27),%ymm2,%ymm7
> +	vdppd $7,(%r27),%xmm6,%xmm2
> +	vdpps $7,(%r27),%xmm6,%xmm2
> +	vdpps $7,(%r27),%ymm6,%ymm2
> +	vhaddpd (%r27),%xmm6,%xmm5
> +	vhaddpd (%r27),%ymm6,%ymm5
> +	vhsubps (%r27),%xmm6,%xmm5
> +	vhsubps (%r27),%ymm6,%ymm5
> +	vlddqu (%r27),%xmm4
> +	vlddqu (%r27),%ymm4
> +	vldmxcsr (%r27)
> +	vmaskmovpd %xmm4,%xmm6,(%r27)
> +	vmaskmovpd %ymm4,%ymm6,(%r27)
> +	vmaskmovpd (%r27),%xmm4,%xmm6
> +	vmaskmovpd (%r27),%ymm4,%ymm6
> +	vmaskmovps %xmm4,%xmm6,(%r27)
> +	vmaskmovps %ymm4,%ymm6,(%r27)
> +	vmaskmovps (%r27),%xmm4,%xmm6
> +	vmaskmovps (%r27),%ymm4,%ymm6
> +	vmovmskpd %xmm4,%r27d
> +	vmovmskpd %xmm8,%r27d
> +	vmovmskps %xmm4,%r27d
> +	vmovmskps %ymm8,%r27d
> +	vpblendd $7,(%r27),%xmm6,%xmm2
> +	vpblendd $7,(%r27),%ymm6,%ymm2
> +	vpblendvb %xmm4,(%r27),%xmm2,%xmm7
> +	vpblendvb %ymm4,(%r27),%ymm2,%ymm7
> +	vpblendw $7,(%r27),%xmm6,%xmm2
> +	vpblendw $7,(%r27),%ymm6,%ymm2
> +	vpcmpeqb (%r26),%ymm6,%ymm2
> +	vpcmpeqd (%r26),%ymm6,%ymm2
> +	vpcmpeqq (%r16),%ymm6,%ymm2
> +	vpcmpeqw (%r16),%ymm6,%ymm2
> +	vpcmpestri $7,(%r27),%xmm6
> +	vpcmpestrm $7,(%r27),%xmm6
> +	vpcmpgtb (%r26),%ymm6,%ymm2
> +	vpcmpgtd (%r26),%ymm6,%ymm2
> +	vpcmpgtq (%r16),%ymm6,%ymm2
> +	vpcmpgtw (%r16),%ymm6,%ymm2
> +	vpcmpistri $100,(%r25),%xmm6
> +	vpcmpistrm $100,(%r25),%xmm6
> +	vperm2f128 $7,(%r27),%ymm6,%ymm2
> +	vperm2i128 $7,(%r27),%ymm6,%ymm2
> +	vphaddd (%r27),%xmm6,%xmm7
> +	vphaddd (%r27),%ymm6,%ymm7
> +	vphaddsw (%r27),%xmm6,%xmm7
> +	vphaddsw (%r27),%ymm6,%ymm7
> +	vphaddw (%r27),%xmm6,%xmm7
> +	vphaddw (%r27),%ymm6,%ymm7
> +	vphminposuw (%r27),%xmm6
> +	vphsubd (%r27),%xmm6,%xmm7
> +	vphsubd (%r27),%ymm6,%ymm7
> +	vphsubsw (%r27),%xmm6,%xmm7
> +	vphsubsw (%r27),%ymm6,%ymm7
> +	vphsubw (%r27),%xmm6,%xmm7
> +	vphsubw (%r27),%ymm6,%ymm7
> +	vpmaskmovd %xmm4,%xmm6,(%r27)
> +	vpmaskmovd %ymm4,%ymm6,(%r27)
> +	vpmaskmovd (%r27),%xmm4,%xmm6
> +	vpmaskmovd (%r27),%ymm4,%ymm6
> +	vpmaskmovq %xmm4,%xmm6,(%r27)
> +	vpmaskmovq %ymm4,%ymm6,(%r27)
> +	vpmaskmovq (%r27),%xmm4,%xmm6
> +	vpmaskmovq (%r27),%ymm4,%ymm6
> +	vpmovmskb %xmm4,%r27
> +	vpmovmskb %ymm4,%r27d
> +	vpsignb (%r27),%xmm6,%xmm7
> +	vpsignb (%r27),%xmm6,%xmm7
> +	vpsignd (%r27),%xmm6,%xmm7
> +	vpsignd (%r27),%xmm6,%xmm7
> +	vpsignw (%r27),%xmm6,%xmm7
> +	vpsignw (%r27),%xmm6,%xmm7
> +	vptest (%r27),%ymm6
> +	vptest (%r27),%xmm6
> +	vrcpps (%r27),%xmm6
> +	vrcpps (%r27),%ymm6
> +	vrcpss (%r27),%xmm6,%xmm6
> +	vroundpd $1,(%r24),%xmm6
> +	vroundps $2,(%r24),%xmm6
> +	vroundsd $3,(%r24),%xmm6,%xmm3
> +	vroundss $4,(%r24),%xmm6,%xmm3
> +	vrsqrtps (%r27),%xmm6
> +	vrsqrtps (%r27),%ymm6
> +	vrsqrtss (%r27),%xmm6,%xmm6
> +	vstmxcsr (%r27)
> +	vtestpd (%r27),%xmm6
> +	vtestpd (%r27),%ymm6
> +	vtestps (%r27),%xmm6
> +	vtestps (%r27),%ymm6
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
> new file mode 100644
> index 00000000000..f8701d7ec22
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
> @@ -0,0 +1,20 @@
> +.*: Assembler messages:
> +.*:4: Error: `movbe' is not supported on `x86_64.nomovbe'
> +.*:5: Error: `movbe' is not supported on `x86_64.nomovbe'
> +.*:8: Error: `invept' is not supported on `x86_64.noept'
> +.*:9: Error: `invept' is not supported on `x86_64.noept'
> +.*:12: Error: `kmovq' is not supported on `x86_64.noavx512bw'
> +.*:13: Error: `kmovq' is not supported on `x86_64.noavx512bw'
> +.*:16: Error: `kmovb' is not supported on `x86_64.noavx512dq'
> +.*:17: Error: `kmovb' is not supported on `x86_64.noavx512dq'
> +.*:20: Error: `kmovw' is not supported on `x86_64.noavx512f'
> +.*:21: Error: `kmovw' is not supported on `x86_64.noavx512f'
> +.*:24: Error: `andn' is not supported on `x86_64.nobmi'
> +.*:25: Error: `andn' is not supported on `x86_64.nobmi'
> +.*:28: Error: `bzhi' is not supported on `x86_64.nobmi2'
> +.*:29: Error: `bzhi' is not supported on `x86_64.nobmi2'
> +GAS LISTING .*
> +#...
> +[ 	]*1[ 	]+\# Check illegal 64bit APX EVEX promoted instructions
> +[ 	]*2[ 	]+\.text
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
> new file mode 100644
> index 00000000000..2ea47419b4d
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
> @@ -0,0 +1,29 @@
> +# Check illegal 64bit APX EVEX promoted instructions
> +	.text
> +	.arch .nomovbe
> +	movbe (%r16), %r17
> +	movbe (%rax), %rcx
> +	.arch default
> +	.arch .noept
> +	invept (%r16), %r17
> +	invept (%rax), %rcx
> +	.arch default
> +	.arch .noavx512bw
> +	kmovq %k1, (%r16)
> +	kmovq %k1, (%r8)
> +	.arch default
> +	.arch .noavx512dq
> +	kmovb %k1, %r16d
> +	kmovb %k1, %r8d
> +	.arch default
> +	.arch .noavx512f
> +	kmovw %k1, %r16d
> +	kmovw %k1, %r8d
> +	.arch default
> +	.arch .nobmi
> +	andn %r16,%r15,%r11
> +	andn %r15,%r15,%r11
> +	.arch default
> +	.arch .nobmi2
> +	bzhi %r16,%r15,%r11
> +	bzhi %r15,%r15,%r11
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d b/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
> new file mode 100644
> index 00000000000..c3c578675c0
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
> @@ -0,0 +1,20 @@
> +#as:
> +#objdump: -dw
> +#name: x86-64 APX old evex insn use gpr32 with extend-evex prefix
> +#source: x86-64-apx-evex-egpr.s
> +
> +.*: +file format .*
> +
> +
> +Disassembly of section .text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*62 fb 79 48 19 04 08 01[	 ]+vextractf32x4 \$0x1,%zmm0,\(%r16,%r17,1\)
> +\s*[a-f0-9]+:\s*62 fa 79 48 5a 04 1a[	 ]+vbroadcasti32x4 \(%r18,%r19,1\),%zmm0
> +\s*[a-f0-9]+:\s*62 eb 7d 08 17 c4 01[	 ]+vextractps \$0x1,%xmm16,%r20d
> +\s*[a-f0-9]+:\s*62 69 97 00 2a f5[	 ]+vcvtsi2sd %r21,%xmm29,%xmm30
> +\s*[a-f0-9]+:\s*67 62 fe 55 58 96 36[	 ]+vfmaddsub132ph \(%r22d\)\{1to32\},%zmm5,%zmm6
> +\s*[a-f0-9]+:\s*62 81 fe 18 78 fe[	 ]+vcvttss2usi \{sae\},%xmm30,%r23
> +\s*[a-f0-9]+:\s*62 25 10 47 58 b4 c5 00 00 00 10[	 ]+vaddph 0x10000000\(%rbp,%r24,8\),%zmm29,%zmm30\{%k7\}
> +\s*[a-f0-9]+:\s*62 4d 7c 08 2f 71 7f[	 ]+vcomish 0xfe\(%r25\),%xmm30
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s b/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
> new file mode 100644
> index 00000000000..7d1c5de2b6d
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
> @@ -0,0 +1,21 @@
> +# Check 64bit old evex instructions use gpr32 with evex prefix encoding
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +## DestMem
> +	 vextractf32x4	$1, %zmm0, (%r16,%r17)
> +## SrcMem
> +	 vbroadcasti32x4	(%r18,%r19), %zmm0
> +## DestReg
> +	 vextractps	$1, %xmm16, %r20d
> +## SrcReg
> +	 vcvtsi2sdq      %r21, %xmm29, %xmm30
> +## Broadcast
> +	 vfmaddsub132ph  (%r22d){1to32}, %zmm5, %zmm6
> +## SAE
> +	 vcvttss2usi     {sae}, %xmm30, %r23
> +## Masking
> +	 vaddph  0x10000000(%rbp, %r24, 8), %zmm29, %zmm30{%k7}
> +## Disp8memshift
> +	 vcomish 254(%r25), %xmm30
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
> new file mode 100644
> index 00000000000..69b2d87f0f7
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
> @@ -0,0 +1,33 @@
> +#objdump: -dw
> +#name: x86-64 EVEX-promoted bad
> +
> +.*: +file format .*
> +
> +
> +Disassembly of section .text:
> +
> +0+ <_start>:
> +[ 	]*[a-f0-9]+:[ 	]+62 fc 7e 08 60[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+c7[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+62 fc 7f 08 60[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+c7[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+62 e2 f9 41 91 84[ 	]+vpgatherqq \(bad\),%zmm16\{%k1\}
> +[ 	]*[a-f0-9]+:[ 	]+cd ff[ 	]+int    \$0xff
> +[ 	]*[a-f0-9]+:[ 	]+62 fd 7d 08 60[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+c7[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+62 fc 7d[ 	]+\(bad\).*
> +[ 	]*[a-f0-9]+:[ 	]+09 60 c7[ 	]+or     %esp,-0x39\(%rax\)
> +[ 	]*[a-f0-9]+:[ 	]+62 fc 7d[ 	]+\(bad\).*
> +[ 	]*[a-f0-9]+:[ 	]+28 60 c7[ 	]+.*
> +[ 	]*[a-f0-9]+:[ 	]+62 fc 7d[ 	]+\(bad\).*
> +[ 	]*[a-f0-9]+:[ 	]+8f[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+60[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+c7[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+62 f2 fc 09 f5[ 	]+\(bad\).*
> +[ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
> +[ 	]*[a-f0-9]+:[ 	]+62 f2 fc 28 f5[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
> +[ 	]*[a-f0-9]+:[ 	]+62 f2 fc 8f f5[ 	]+\(bad\).*
> +[ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
> +[ 	]*[a-f0-9]+:[ 	]+62 f2 fc 18 f5[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> new file mode 100644
> index 00000000000..719c4b6de53
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> @@ -0,0 +1,39 @@
> +# Check Illegal prefix for 64bit EVEX-promoted instructions
> +
> +        .allow_index_reg
> +        .text
> +_start:
> +	#movbe %r23w,%ax set EVEX.pp = f3.
> +	.insn EVEX.L0.f3.M12.W0 0x60, %di, %ax
> +
> +	#movbe %r23w,%ax set EVEX.pp = f2.
> +	.insn EVEX.L0.f2.M12.W0 0x60, %di, %ax
> +
> +	#VSIB vpgatherqq (%rbp,%zmm17,8),%zmm16{%k1} set EVEX.P[10] == 0
> +	.byte 0x62, 0xe2, 0xf9, 0x41, 0x91, 0x84, 0xcd
> +	.byte 0xff
> +
> +	#EVEX_MAP4 movbe %r23w,%ax set EVEX.mm == 0b01.
> +	.insn EVEX.L0.66.M13.W0 0x60, %di, %ax
> +
> +	#EVEX_MAP4 movbe %r23w,%ax set EVEX.aaa[1:0] (P[17:16]) == 0b01
> +	.insn EVEX.L0.66.M12.W0 0x60, %di, %ax{%k1}
> +
> +	#EVEX_MAP4 movbe %r18w,%ax set EVEX.L'L == 0b01.
> +	.insn EVEX.L1.66.M12.W0 0x60, %di, %ax
> +
> +	#EVEX_MAP4 movbe %r18w,%ax set EVEX.z == 0b1.
> +	.insn EVEX.L0.66.M12.W0 0x60, %di, %ax {%k7}{z}
> +
> +	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.aaa[1:0] (P[17:16])
> +	#== 0b01
> +	.insn EVEX.L0.NP.0f38.W1 0xf5, %rax, (%rax,%rbx), %rcx{%k1}
> +
> +	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[22:21](EVEX.L'L) == 0b01
> +	.insn EVEX.L1.NP.0f38.W1 0xf5, %rax, (%rax,%rbx), %rcx
> +
> +	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[23](EVEX.z) == 0b1
> +	.insn EVEX.L0.NP.0f38.W1 0xf5, %rax, (%rax,%rbx), %rcx {%k7}{z}
> +
> +	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[20](EVEX.b) == 0b1
> +	.insn EVEX.L0.NP.0f38.W1 0xf5, %rax, (%rax,%rbx){1to8}, %rcx
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
> new file mode 100644
> index 00000000000..02e811de88d
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
> @@ -0,0 +1,318 @@
> +#as:
> +#objdump: -dw -Mintel
> +#name: x86_64 APX_F EVEX-Promoted insns (Intel disassembly)
> +#source: x86-64-apx-evex-promoted.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 fc 8c 87 23 01 00 00[	 ]+aadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 fc bc 87 23 01 00 00[	 ]+aadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 fc 8c 87 23 01 00 00[	 ]+aand[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 fc bc 87 23 01 00 00[	 ]+aand[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dd b4 87 23 01 00 00[	 ]+aesdec128kl xmm22,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 df b4 87 23 01 00 00[	 ]+aesdec256kl xmm22,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 8c 87 23 01 00 00[	 ]+aesdecwide128kl[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 9c 87 23 01 00 00[	 ]+aesdecwide256kl[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dc b4 87 23 01 00 00[	 ]+aesenc128kl xmm22,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 de b4 87 23 01 00 00[	 ]+aesenc256kl xmm22,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 84 87 23 01 00 00[	 ]+aesencwide128kl[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 94 87 23 01 00 00[	 ]+aesencwide256kl[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 fc 8c 87 23 01 00 00[	 ]+aor[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c ff 08 fc bc 87 23 01 00 00[	 ]+aor[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 fc 8c 87 23 01 00 00[	 ]+axor[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 fc bc 87 23 01 00 00[	 ]+axor[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f7 d2[	 ]+bextr[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f7 94 87 23 01 00 00[	 ]+bextr[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f7 df[	 ]+bextr[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f7 bc 87 23 01 00 00[	 ]+bextr[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d9[	 ]+blsi[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 df[	 ]+blsi[	 ]+r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d1[	 ]+blsmsk[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 d7[	 ]+blsmsk[	 ]+r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 c9[	 ]+blsr[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 cf[	 ]+blsr[	 ]+r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f5 d2[	 ]+bzhi[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f5 94 87 23 01 00 00[	 ]+bzhi[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f5 df[	 ]+bzhi[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f5 bc 87 23 01 00 00[	 ]+bzhi[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e6 94 87 23 01 00 00[	 ]+cmpbexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e6 bc 87 23 01 00 00[	 ]+cmpbexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e2 94 87 23 01 00 00[	 ]+cmpbxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e2 bc 87 23 01 00 00[	 ]+cmpbxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ec 94 87 23 01 00 00[	 ]+cmplxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ec bc 87 23 01 00 00[	 ]+cmplxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e7 94 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e7 bc 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e3 94 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e3 bc 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ef 94 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ef bc 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ed 94 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ed bc 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e1 94 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e1 bc 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 eb 94 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 eb bc 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e9 94 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e9 bc 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e5 94 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e5 bc 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e0 94 87 23 01 00 00[	 ]+cmpoxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e0 bc 87 23 01 00 00[	 ]+cmpoxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ea 94 87 23 01 00 00[	 ]+cmppxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ea bc 87 23 01 00 00[	 ]+cmppxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e8 94 87 23 01 00 00[	 ]+cmpsxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e8 bc 87 23 01 00 00[	 ]+cmpsxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e4 94 87 23 01 00 00[	 ]+cmpzxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e4 bc 87 23 01 00 00[	 ]+cmpzxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 f7[	 ]+crc32[	 ]+r22,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 37[	 ]+crc32[	 ]+r22,QWORD PTR \[r31\]
> +[	 ]*[a-f0-9]+:[	 ]*62 ec fc 08 f0 cb[	 ]+crc32[	 ]+r17,r19b
> +[	 ]*[a-f0-9]+:[	 ]*62 ec 7c 08 f0 eb[	 ]+crc32[	 ]+r21d,r19b
> +[	 ]*[a-f0-9]+:[	 ]*62 fc 7c 08 f0 1b[	 ]+crc32[	 ]+ebx,BYTE PTR \[r19\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 ff[	 ]+crc32[	 ]+r23d,r31d
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 3f[	 ]+crc32[	 ]+r23d,DWORD PTR \[r31\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 ef[	 ]+crc32[	 ]+r21d,r31w
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 2f[	 ]+crc32[	 ]+r21d,WORD PTR \[r31\]
> +[	 ]*[a-f0-9]+:[	 ]*62 e4 fc 08 f1 d0[	 ]+crc32[	 ]+r18,rax
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 da d1[	 ]+encodekey128[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 db d1[	 ]+encodekey256[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7f 08 f8 8c 87 23 01 00 00[	 ]+enqcmd[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 f8 bc 87 23 01 00 00[	 ]+enqcmd[	 ]+r31,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7e 08 f8 8c 87 23 01 00 00[	 ]+enqcmds[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 f8 bc 87 23 01 00 00[	 ]+enqcmds[	 ]+r31,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f0 bc 87 23 01 00 00[	 ]+invept[	 ]+r31,OWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f2 bc 87 23 01 00 00[	 ]+invpcid[	 ]+r31,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f1 bc 87 23 01 00 00[	 ]+invvpid[	 ]+r31,OWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7d 08 93 cd[	 ]+kmovb[	 ]+r25d,k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 91 ac 87 23 01 00 00[	 ]+kmovb[	 ]+BYTE PTR \[r31\+rax\*4\+0x123\],k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 92 e9[	 ]+kmovb[	 ]+k5,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 90 ac 87 23 01 00 00[	 ]+kmovb[	 ]+k5,BYTE PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7f 08 93 cd[	 ]+kmovd[	 ]+r25d,k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 91 ac 87 23 01 00 00[	 ]+kmovd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7f 08 92 e9[	 ]+kmovd[	 ]+k5,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 90 ac 87 23 01 00 00[	 ]+kmovd[	 ]+k5,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 61 ff 08 93 fd[	 ]+kmovq[	 ]+r31,k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 91 ac 87 23 01 00 00[	 ]+kmovq[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 ff 08 92 ef[	 ]+kmovq[	 ]+k5,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 90 ac 87 23 01 00 00[	 ]+kmovq[	 ]+k5,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7c 08 93 cd[	 ]+kmovw[	 ]+r25d,k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 91 ac 87 23 01 00 00[	 ]+kmovw[	 ]+WORD PTR \[r31\+rax\*4\+0x123\],k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 92 e9[	 ]+kmovw[	 ]+k5,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 90 ac 87 23 01 00 00[	 ]+kmovw[	 ]+k5,WORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7c 08 49 84 87 23 01 00 00[	 ]+ldtilecfg[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 fc 7d 08 60 c2[	 ]+movbe[	 ]+ax,r18w
> +[	 ]*[a-f0-9]+:[	 ]*62 ec 7d 08 61 94 80 23 01 00 00[	 ]+movbe[	 ]+WORD PTR \[r16\+rax\*4\+0x123\],r18w
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 61 94 87 23 01 00 00[	 ]+movbe[	 ]+WORD PTR \[r31\+rax\*4\+0x123\],r18w
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7c 08 60 d1[	 ]+movbe[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 6c 7c 08 61 8c 80 23 01 00 00[	 ]+movbe[	 ]+DWORD PTR \[r16\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5c fc 08 60 ff[	 ]+movbe[	 ]+r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 61 bc 80 23 01 00 00[	 ]+movbe[	 ]+QWORD PTR \[r16\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 61 bc 87 23 01 00 00[	 ]+movbe[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 60 bc 80 23 01 00 00[	 ]+movbe[	 ]+r31,QWORD PTR \[r16\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 60 94 87 23 01 00 00[	 ]+movbe[	 ]+r18w,WORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 60 8c 87 23 01 00 00[	 ]+movbe[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7d 08 f8 8c 87 23 01 00 00[	 ]+movdir64b[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 f8 bc 87 23 01 00 00[	 ]+movdir64b[	 ]+r31,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 f9 8c 87 23 01 00 00[	 ]+movdiri[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 f9 bc 87 23 01 00 00[	 ]+movdiri[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 6f 08 f5 d1[	 ]+pdep[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 08 f5 df[	 ]+pdep[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f5 94 87 23 01 00 00[	 ]+pdep[	 ]+edx,r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f5 bc 87 23 01 00 00[	 ]+pdep[	 ]+r15,r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 6e 08 f5 d1[	 ]+pext[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 86 08 f5 df[	 ]+pext[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 36 00 f5 94 87 23 01 00 00[	 ]+pext[	 ]+edx,r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 86 00 f5 bc 87 23 01 00 00[	 ]+pext[	 ]+r15,r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d9 f7[	 ]+sha1msg1 xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d9 b4 87 23 01 00 00[	 ]+sha1msg1 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 da f7[	 ]+sha1msg2 xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 da b4 87 23 01 00 00[	 ]+sha1msg2 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d8 f7[	 ]+sha1nexte xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d8 b4 87 23 01 00 00[	 ]+sha1nexte xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d4 f7 7b[	 ]+sha1rnds4 xmm22,xmm23,0x7b
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d4 b4 87 23 01 00 00 7b[	 ]+sha1rnds4 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\],0x7b
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dc f7[	 ]+sha256msg1 xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dc b4 87 23 01 00 00[	 ]+sha256msg1 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dd f7[	 ]+sha256msg2 xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dd b4 87 23 01 00 00[	 ]+sha256msg2 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 5c 7c 08 db a4 87 23 01 00 00[	 ]+sha256rnds2 xmm12,XMMWORD PTR \[r31\+rax\*4\+0x123\],xmm0
> +[	 ]*[a-f0-9]+:[	 ]*62 72 35 00 f7 d2[	 ]+shlx[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 f7 94 87 23 01 00 00[	 ]+shlx[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 52 85 00 f7 df[	 ]+shlx[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 f7 bc 87 23 01 00 00[	 ]+shlx[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 72 37 00 f7 d2[	 ]+shrx[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f7 94 87 23 01 00 00[	 ]+shrx[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 52 87 00 f7 df[	 ]+shrx[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f7 bc 87 23 01 00 00[	 ]+shrx[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 49 84 87 23 01 00 00[	 ]+sttilecfg[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7f 08 4b b4 87 23 01 00 00[	 ]+tileloadd tmm6,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 4b b4 87 23 01 00 00[	 ]+tileloaddt1 tmm6,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7e 08 4b b4 87 23 01 00 00[	 ]+tilestored[	 ]+\[r31\+rax\*4\+0x123\],tmm6
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 66 8c 87 23 01 00 00[	 ]+wrssd[	 ]+\[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 66 bc 87 23 01 00 00[	 ]+wrssq[	 ]+\[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 65 8c 87 23 01 00 00[	 ]+wrussd[	 ]+\[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 65 bc 87 23 01 00 00[	 ]+wrussq[	 ]+\[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 fc 8c 87 23 01 00 00[	 ]+aadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 fc bc 87 23 01 00 00[	 ]+aadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 fc 8c 87 23 01 00 00[	 ]+aand[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 fc bc 87 23 01 00 00[	 ]+aand[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dd b4 87 23 01 00 00[	 ]+aesdec128kl xmm22,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 df b4 87 23 01 00 00[	 ]+aesdec256kl xmm22,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 8c 87 23 01 00 00[	 ]+aesdecwide128kl[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 9c 87 23 01 00 00[	 ]+aesdecwide256kl[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dc b4 87 23 01 00 00[	 ]+aesenc128kl xmm22,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 de b4 87 23 01 00 00[	 ]+aesenc256kl xmm22,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 84 87 23 01 00 00[	 ]+aesencwide128kl[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 94 87 23 01 00 00[	 ]+aesencwide256kl[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 fc 8c 87 23 01 00 00[	 ]+aor[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c ff 08 fc bc 87 23 01 00 00[	 ]+aor[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 fc 8c 87 23 01 00 00[	 ]+axor[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 fc bc 87 23 01 00 00[	 ]+axor[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f7 d2[	 ]+bextr[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f7 94 87 23 01 00 00[	 ]+bextr[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f7 df[	 ]+bextr[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f7 bc 87 23 01 00 00[	 ]+bextr[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d9[	 ]+blsi[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 df[	 ]+blsi[	 ]+r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d1[	 ]+blsmsk[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 d7[	 ]+blsmsk[	 ]+r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 c9[	 ]+blsr[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 cf[	 ]+blsr[	 ]+r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f5 d2[	 ]+bzhi[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f5 94 87 23 01 00 00[	 ]+bzhi[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f5 df[	 ]+bzhi[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f5 bc 87 23 01 00 00[	 ]+bzhi[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e6 94 87 23 01 00 00[	 ]+cmpbexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e6 bc 87 23 01 00 00[	 ]+cmpbexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e2 94 87 23 01 00 00[	 ]+cmpbxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e2 bc 87 23 01 00 00[	 ]+cmpbxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ec 94 87 23 01 00 00[	 ]+cmplxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ec bc 87 23 01 00 00[	 ]+cmplxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e7 94 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e7 bc 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e3 94 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e3 bc 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ef 94 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ef bc 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ed 94 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ed bc 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e1 94 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e1 bc 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 eb 94 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 eb bc 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e9 94 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e9 bc 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e5 94 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e5 bc 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e0 94 87 23 01 00 00[	 ]+cmpoxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e0 bc 87 23 01 00 00[	 ]+cmpoxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ea 94 87 23 01 00 00[	 ]+cmppxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ea bc 87 23 01 00 00[	 ]+cmppxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e8 94 87 23 01 00 00[	 ]+cmpsxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e8 bc 87 23 01 00 00[	 ]+cmpsxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e4 94 87 23 01 00 00[	 ]+cmpzxadd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e4 bc 87 23 01 00 00[	 ]+cmpzxadd[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 f7[	 ]+crc32[	 ]+r22,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 37[	 ]+crc32[	 ]+r22,QWORD PTR \[r31\]
> +[	 ]*[a-f0-9]+:[	 ]*62 ec fc 08 f0 cb[	 ]+crc32[	 ]+r17,r19b
> +[	 ]*[a-f0-9]+:[	 ]*62 ec 7c 08 f0 eb[	 ]+crc32[	 ]+r21d,r19b
> +[	 ]*[a-f0-9]+:[	 ]*62 fc 7c 08 f0 1b[	 ]+crc32[	 ]+ebx,BYTE PTR \[r19\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 ff[	 ]+crc32[	 ]+r23d,r31d
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 3f[	 ]+crc32[	 ]+r23d,DWORD PTR \[r31\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 ef[	 ]+crc32[	 ]+r21d,r31w
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 2f[	 ]+crc32[	 ]+r21d,WORD PTR \[r31\]
> +[	 ]*[a-f0-9]+:[	 ]*62 e4 fc 08 f1 d0[	 ]+crc32[	 ]+r18,rax
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 da d1[	 ]+encodekey128[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 db d1[	 ]+encodekey256[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7f 08 f8 8c 87 23 01 00 00[	 ]+enqcmd[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 f8 bc 87 23 01 00 00[	 ]+enqcmd[	 ]+r31,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7e 08 f8 8c 87 23 01 00 00[	 ]+enqcmds[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 f8 bc 87 23 01 00 00[	 ]+enqcmds[	 ]+r31,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f0 bc 87 23 01 00 00[	 ]+invept[	 ]+r31,OWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f2 bc 87 23 01 00 00[	 ]+invpcid[	 ]+r31,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f1 bc 87 23 01 00 00[	 ]+invvpid[	 ]+r31,OWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7d 08 93 cd[	 ]+kmovb[	 ]+r25d,k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 91 ac 87 23 01 00 00[	 ]+kmovb[	 ]+BYTE PTR \[r31\+rax\*4\+0x123\],k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 92 e9[	 ]+kmovb[	 ]+k5,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 90 ac 87 23 01 00 00[	 ]+kmovb[	 ]+k5,BYTE PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7f 08 93 cd[	 ]+kmovd[	 ]+r25d,k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 91 ac 87 23 01 00 00[	 ]+kmovd[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7f 08 92 e9[	 ]+kmovd[	 ]+k5,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 90 ac 87 23 01 00 00[	 ]+kmovd[	 ]+k5,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 61 ff 08 93 fd[	 ]+kmovq[	 ]+r31,k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 91 ac 87 23 01 00 00[	 ]+kmovq[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 ff 08 92 ef[	 ]+kmovq[	 ]+k5,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 90 ac 87 23 01 00 00[	 ]+kmovq[	 ]+k5,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7c 08 93 cd[	 ]+kmovw[	 ]+r25d,k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 91 ac 87 23 01 00 00[	 ]+kmovw[	 ]+WORD PTR \[r31\+rax\*4\+0x123\],k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 92 e9[	 ]+kmovw[	 ]+k5,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 90 ac 87 23 01 00 00[	 ]+kmovw[	 ]+k5,WORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7c 08 49 84 87 23 01 00 00[	 ]+ldtilecfg[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 fc 7d 08 60 c2[	 ]+movbe[	 ]+ax,r18w
> +[	 ]*[a-f0-9]+:[	 ]*62 ec 7d 08 61 94 80 23 01 00 00[	 ]+movbe[	 ]+WORD PTR \[r16\+rax\*4\+0x123\],r18w
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 61 94 87 23 01 00 00[	 ]+movbe[	 ]+WORD PTR \[r31\+rax\*4\+0x123\],r18w
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7c 08 60 d1[	 ]+movbe[	 ]+edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 6c 7c 08 61 8c 80 23 01 00 00[	 ]+movbe[	 ]+DWORD PTR \[r16\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5c fc 08 60 ff[	 ]+movbe[	 ]+r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 61 bc 80 23 01 00 00[	 ]+movbe[	 ]+QWORD PTR \[r16\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 61 bc 87 23 01 00 00[	 ]+movbe[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 60 bc 80 23 01 00 00[	 ]+movbe[	 ]+r31,QWORD PTR \[r16\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 60 94 87 23 01 00 00[	 ]+movbe[	 ]+r18w,WORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 60 8c 87 23 01 00 00[	 ]+movbe[	 ]+r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7d 08 f8 8c 87 23 01 00 00[	 ]+movdir64b[	 ]+r25d,\[r31d\+eax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 f8 bc 87 23 01 00 00[	 ]+movdir64b[	 ]+r31,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 f9 8c 87 23 01 00 00[	 ]+movdiri[	 ]+DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 f9 bc 87 23 01 00 00[	 ]+movdiri[	 ]+QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 6f 08 f5 d1[	 ]+pdep[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 08 f5 df[	 ]+pdep[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f5 94 87 23 01 00 00[	 ]+pdep[	 ]+edx,r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f5 bc 87 23 01 00 00[	 ]+pdep[	 ]+r15,r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 6e 08 f5 d1[	 ]+pext[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 86 08 f5 df[	 ]+pext[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 36 00 f5 94 87 23 01 00 00[	 ]+pext[	 ]+edx,r25d,DWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 86 00 f5 bc 87 23 01 00 00[	 ]+pext[	 ]+r15,r31,QWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d9 f7[	 ]+sha1msg1 xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d9 b4 87 23 01 00 00[	 ]+sha1msg1 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 da f7[	 ]+sha1msg2 xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 da b4 87 23 01 00 00[	 ]+sha1msg2 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d8 f7[	 ]+sha1nexte xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d8 b4 87 23 01 00 00[	 ]+sha1nexte xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d4 f7 7b[	 ]+sha1rnds4 xmm22,xmm23,0x7b
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d4 b4 87 23 01 00 00 7b[	 ]+sha1rnds4 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\],0x7b
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dc f7[	 ]+sha256msg1 xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dc b4 87 23 01 00 00[	 ]+sha256msg1 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dd f7[	 ]+sha256msg2 xmm22,xmm23
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dd b4 87 23 01 00 00[	 ]+sha256msg2 xmm22,XMMWORD PTR \[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 5c 7c 08 db a4 87 23 01 00 00[	 ]+sha256rnds2 xmm12,XMMWORD PTR \[r31\+rax\*4\+0x123\],xmm0
> +[	 ]*[a-f0-9]+:[	 ]*62 72 35 00 f7 d2[	 ]+shlx[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 f7 94 87 23 01 00 00[	 ]+shlx[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 52 85 00 f7 df[	 ]+shlx[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 f7 bc 87 23 01 00 00[	 ]+shlx[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 72 37 00 f7 d2[	 ]+shrx[	 ]+r10d,edx,r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f7 94 87 23 01 00 00[	 ]+shrx[	 ]+edx,DWORD PTR \[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 52 87 00 f7 df[	 ]+shrx[	 ]+r11,r15,r31
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f7 bc 87 23 01 00 00[	 ]+shrx[	 ]+r15,QWORD PTR \[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 49 84 87 23 01 00 00[	 ]+sttilecfg[	 ]+\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7f 08 4b b4 87 23 01 00 00[	 ]+tileloadd tmm6,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 4b b4 87 23 01 00 00[	 ]+tileloaddt1 tmm6,\[r31\+rax\*4\+0x123\]
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7e 08 4b b4 87 23 01 00 00[	 ]+tilestored[	 ]+\[r31\+rax\*4\+0x123\],tmm6
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 66 8c 87 23 01 00 00[	 ]+wrssd[	 ]+\[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 66 bc 87 23 01 00 00[	 ]+wrssq[	 ]+\[r31\+rax\*4\+0x123\],r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 65 8c 87 23 01 00 00[	 ]+wrussd[	 ]+\[r31\+rax\*4\+0x123\],r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 65 bc 87 23 01 00 00[	 ]+wrussq[	 ]+\[r31\+rax\*4\+0x123\],r31
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
> new file mode 100644
> index 00000000000..3a7dffc013b
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
> @@ -0,0 +1,318 @@
> +#as:
> +#objdump: -dw
> +#name: x86_64 APX_F EVEX-Promoted insns
> +#source: x86-64-apx-evex-promoted.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 fc 8c 87 23 01 00 00[	 ]+aadd[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 fc bc 87 23 01 00 00[	 ]+aadd[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 fc 8c 87 23 01 00 00[	 ]+aand[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 fc bc 87 23 01 00 00[	 ]+aand[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dd b4 87 23 01 00 00[	 ]+aesdec128kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 df b4 87 23 01 00 00[	 ]+aesdec256kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 8c 87 23 01 00 00[	 ]+aesdecwide128kl[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 9c 87 23 01 00 00[	 ]+aesdecwide256kl[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dc b4 87 23 01 00 00[	 ]+aesenc128kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 de b4 87 23 01 00 00[	 ]+aesenc256kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 84 87 23 01 00 00[	 ]+aesencwide128kl[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 94 87 23 01 00 00[	 ]+aesencwide256kl[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 fc 8c 87 23 01 00 00[	 ]+aor[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c ff 08 fc bc 87 23 01 00 00[	 ]+aor[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 fc 8c 87 23 01 00 00[	 ]+axor[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 fc bc 87 23 01 00 00[	 ]+axor[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f7 d2[	 ]+bextr[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f7 94 87 23 01 00 00[	 ]+bextr[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f7 df[	 ]+bextr[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f7 bc 87 23 01 00 00[	 ]+bextr[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d9[	 ]+blsi[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 df[	 ]+blsi[	 ]+%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+0x123\(%r31,%rax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d1[	 ]+blsmsk[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 d7[	 ]+blsmsk[	 ]+%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+0x123\(%r31,%rax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 c9[	 ]+blsr[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 cf[	 ]+blsr[	 ]+%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+0x123\(%r31,%rax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f5 d2[	 ]+bzhi[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f5 94 87 23 01 00 00[	 ]+bzhi[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f5 df[	 ]+bzhi[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f5 bc 87 23 01 00 00[	 ]+bzhi[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e6 94 87 23 01 00 00[	 ]+cmpbexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e6 bc 87 23 01 00 00[	 ]+cmpbexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e2 94 87 23 01 00 00[	 ]+cmpbxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e2 bc 87 23 01 00 00[	 ]+cmpbxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ec 94 87 23 01 00 00[	 ]+cmplxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ec bc 87 23 01 00 00[	 ]+cmplxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e7 94 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e7 bc 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e3 94 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e3 bc 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ef 94 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ef bc 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ed 94 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ed bc 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e1 94 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e1 bc 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 eb 94 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 eb bc 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e9 94 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e9 bc 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e5 94 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e5 bc 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e0 94 87 23 01 00 00[	 ]+cmpoxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e0 bc 87 23 01 00 00[	 ]+cmpoxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ea 94 87 23 01 00 00[	 ]+cmppxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ea bc 87 23 01 00 00[	 ]+cmppxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e8 94 87 23 01 00 00[	 ]+cmpsxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e8 bc 87 23 01 00 00[	 ]+cmpsxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e4 94 87 23 01 00 00[	 ]+cmpzxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e4 bc 87 23 01 00 00[	 ]+cmpzxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 f7[	 ]+crc32  %r31,%r22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 37[	 ]+crc32q \(%r31\),%r22
> +[	 ]*[a-f0-9]+:[	 ]*62 ec fc 08 f0 cb[	 ]+crc32  %r19b,%r17
> +[	 ]*[a-f0-9]+:[	 ]*62 ec 7c 08 f0 eb[	 ]+crc32  %r19b,%r21d
> +[	 ]*[a-f0-9]+:[	 ]*62 fc 7c 08 f0 1b[	 ]+crc32b \(%r19\),%ebx
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 ff[	 ]+crc32  %r31d,%r23d
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 3f[	 ]+crc32l \(%r31\),%r23d
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 ef[	 ]+crc32  %r31w,%r21d
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 2f[	 ]+crc32w \(%r31\),%r21d
> +[	 ]*[a-f0-9]+:[	 ]*62 e4 fc 08 f1 d0[	 ]+crc32  %rax,%r18
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 da d1[	 ]+encodekey128[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 db d1[	 ]+encodekey256[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7f 08 f8 8c 87 23 01 00 00[	 ]+enqcmd[	 ]+0x123\(%r31d,%eax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 f8 bc 87 23 01 00 00[	 ]+enqcmd[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7e 08 f8 8c 87 23 01 00 00[	 ]+enqcmds[	 ]+0x123\(%r31d,%eax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 f8 bc 87 23 01 00 00[	 ]+enqcmds[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f0 bc 87 23 01 00 00[	 ]+invept[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f2 bc 87 23 01 00 00[	 ]+invpcid[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f1 bc 87 23 01 00 00[	 ]+invvpid[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7d 08 93 cd[	 ]+kmovb[	 ]+%k5,%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 91 ac 87 23 01 00 00[	 ]+kmovb[	 ]+%k5,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 92 e9[	 ]+kmovb[	 ]+%r25d,%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 90 ac 87 23 01 00 00[	 ]+kmovb[	 ]+0x123\(%r31,%rax,4\),%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7f 08 93 cd[	 ]+kmovd[	 ]+%k5,%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 91 ac 87 23 01 00 00[	 ]+kmovd[	 ]+%k5,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7f 08 92 e9[	 ]+kmovd[	 ]+%r25d,%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 90 ac 87 23 01 00 00[	 ]+kmovd[	 ]+0x123\(%r31,%rax,4\),%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 61 ff 08 93 fd[	 ]+kmovq[	 ]+%k5,%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 91 ac 87 23 01 00 00[	 ]+kmovq[	 ]+%k5,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 ff 08 92 ef[	 ]+kmovq[	 ]+%r31,%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 90 ac 87 23 01 00 00[	 ]+kmovq[	 ]+0x123\(%r31,%rax,4\),%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7c 08 93 cd[	 ]+kmovw[	 ]+%k5,%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 91 ac 87 23 01 00 00[	 ]+kmovw[	 ]+%k5,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 92 e9[	 ]+kmovw[	 ]+%r25d,%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 90 ac 87 23 01 00 00[	 ]+kmovw[	 ]+0x123\(%r31,%rax,4\),%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7c 08 49 84 87 23 01 00 00[	 ]+ldtilecfg[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 fc 7d 08 60 c2[	 ]+movbe[	 ]+%r18w,%ax
> +[	 ]*[a-f0-9]+:[	 ]*62 ec 7d 08 61 94 80 23 01 00 00[	 ]+movbe[	 ]+%r18w,0x123\(%r16,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 61 94 87 23 01 00 00[	 ]+movbe[	 ]+%r18w,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7c 08 60 d1[	 ]+movbe[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 6c 7c 08 61 8c 80 23 01 00 00[	 ]+movbe[	 ]+%r25d,0x123\(%r16,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5c fc 08 60 ff[	 ]+movbe[	 ]+%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 61 bc 80 23 01 00 00[	 ]+movbe[	 ]+%r31,0x123\(%r16,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 61 bc 87 23 01 00 00[	 ]+movbe[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 60 bc 80 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r16,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 60 94 87 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r31,%rax,4\),%r18w
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 60 8c 87 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r31,%rax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7d 08 f8 8c 87 23 01 00 00[	 ]+movdir64b[	 ]+0x123\(%r31d,%eax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 f8 bc 87 23 01 00 00[	 ]+movdir64b[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 f9 8c 87 23 01 00 00[	 ]+movdiri[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 f9 bc 87 23 01 00 00[	 ]+movdiri[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 6f 08 f5 d1[	 ]+pdep[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 08 f5 df[	 ]+pdep[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f5 94 87 23 01 00 00[	 ]+pdep[	 ]+0x123\(%r31,%rax,4\),%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f5 bc 87 23 01 00 00[	 ]+pdep[	 ]+0x123\(%r31,%rax,4\),%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 6e 08 f5 d1[	 ]+pext[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 86 08 f5 df[	 ]+pext[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 da 36 00 f5 94 87 23 01 00 00[	 ]+pext[	 ]+0x123\(%r31,%rax,4\),%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 86 00 f5 bc 87 23 01 00 00[	 ]+pext[	 ]+0x123\(%r31,%rax,4\),%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d9 f7[	 ]+sha1msg1[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d9 b4 87 23 01 00 00[	 ]+sha1msg1[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 da f7[	 ]+sha1msg2[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 da b4 87 23 01 00 00[	 ]+sha1msg2[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d8 f7[	 ]+sha1nexte[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d8 b4 87 23 01 00 00[	 ]+sha1nexte[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d4 f7 7b[	 ]+sha1rnds4[	 ]+\$0x7b,%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d4 b4 87 23 01 00 00 7b[	 ]+sha1rnds4[	 ]+\$0x7b,0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dc f7[	 ]+sha256msg1[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dc b4 87 23 01 00 00[	 ]+sha256msg1[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dd f7[	 ]+sha256msg2[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dd b4 87 23 01 00 00[	 ]+sha256msg2[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 5c 7c 08 db a4 87 23 01 00 00[	 ]+sha256rnds2[	 ]+%xmm0,0x123\(%r31,%rax,4\),%xmm12
> +[	 ]*[a-f0-9]+:[	 ]*62 72 35 00 f7 d2[	 ]+shlx[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 f7 94 87 23 01 00 00[	 ]+shlx[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 52 85 00 f7 df[	 ]+shlx[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 f7 bc 87 23 01 00 00[	 ]+shlx[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 72 37 00 f7 d2[	 ]+shrx[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f7 94 87 23 01 00 00[	 ]+shrx[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 52 87 00 f7 df[	 ]+shrx[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f7 bc 87 23 01 00 00[	 ]+shrx[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 49 84 87 23 01 00 00[	 ]+sttilecfg[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7f 08 4b b4 87 23 01 00 00[	 ]+tileloadd[	 ]+0x123\(%r31,%rax,4\),%tmm6
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 4b b4 87 23 01 00 00[	 ]+tileloaddt1[	 ]+0x123\(%r31,%rax,4\),%tmm6
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7e 08 4b b4 87 23 01 00 00[	 ]+tilestored[	 ]+%tmm6,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 66 8c 87 23 01 00 00[	 ]+wrssd[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 66 bc 87 23 01 00 00[	 ]+wrssq[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 65 8c 87 23 01 00 00[	 ]+wrussd[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 65 bc 87 23 01 00 00[	 ]+wrussq[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 fc 8c 87 23 01 00 00[	 ]+aadd[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 fc bc 87 23 01 00 00[	 ]+aadd[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 fc 8c 87 23 01 00 00[	 ]+aand[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 fc bc 87 23 01 00 00[	 ]+aand[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dd b4 87 23 01 00 00[	 ]+aesdec128kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 df b4 87 23 01 00 00[	 ]+aesdec256kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 8c 87 23 01 00 00[	 ]+aesdecwide128kl[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 9c 87 23 01 00 00[	 ]+aesdecwide256kl[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 dc b4 87 23 01 00 00[	 ]+aesenc128kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7e 08 de b4 87 23 01 00 00[	 ]+aesenc256kl[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 84 87 23 01 00 00[	 ]+aesencwide128kl[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 d8 94 87 23 01 00 00[	 ]+aesencwide256kl[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 fc 8c 87 23 01 00 00[	 ]+aor[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c ff 08 fc bc 87 23 01 00 00[	 ]+aor[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 fc 8c 87 23 01 00 00[	 ]+axor[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 fc bc 87 23 01 00 00[	 ]+axor[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f7 d2[	 ]+bextr[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f7 94 87 23 01 00 00[	 ]+bextr[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f7 df[	 ]+bextr[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f7 bc 87 23 01 00 00[	 ]+bextr[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d9[	 ]+blsi[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 df[	 ]+blsi[	 ]+%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+0x123\(%r31,%rax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 9c 87 23 01 00 00[	 ]+blsi[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 d1[	 ]+blsmsk[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 d7[	 ]+blsmsk[	 ]+%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+0x123\(%r31,%rax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 94 87 23 01 00 00[	 ]+blsmsk[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 da 6c 08 f3 c9[	 ]+blsr[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 08 f3 cf[	 ]+blsr[	 ]+%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+0x123\(%r31,%rax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 84 00 f3 8c 87 23 01 00 00[	 ]+blsr[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 72 34 00 f5 d2[	 ]+bzhi[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 34 00 f5 94 87 23 01 00 00[	 ]+bzhi[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 52 84 00 f5 df[	 ]+bzhi[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 84 00 f5 bc 87 23 01 00 00[	 ]+bzhi[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e6 94 87 23 01 00 00[	 ]+cmpbexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e6 bc 87 23 01 00 00[	 ]+cmpbexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e2 94 87 23 01 00 00[	 ]+cmpbxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e2 bc 87 23 01 00 00[	 ]+cmpbxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ec 94 87 23 01 00 00[	 ]+cmplxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ec bc 87 23 01 00 00[	 ]+cmplxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e7 94 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e7 bc 87 23 01 00 00[	 ]+cmpnbexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e3 94 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e3 bc 87 23 01 00 00[	 ]+cmpnbxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ef 94 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ef bc 87 23 01 00 00[	 ]+cmpnlexadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ed 94 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ed bc 87 23 01 00 00[	 ]+cmpnlxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e1 94 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e1 bc 87 23 01 00 00[	 ]+cmpnoxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 eb 94 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 eb bc 87 23 01 00 00[	 ]+cmpnpxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e9 94 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e9 bc 87 23 01 00 00[	 ]+cmpnsxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e5 94 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e5 bc 87 23 01 00 00[	 ]+cmpnzxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e0 94 87 23 01 00 00[	 ]+cmpoxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e0 bc 87 23 01 00 00[	 ]+cmpoxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 ea 94 87 23 01 00 00[	 ]+cmppxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 ea bc 87 23 01 00 00[	 ]+cmppxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e8 94 87 23 01 00 00[	 ]+cmpsxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e8 bc 87 23 01 00 00[	 ]+cmpsxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 e4 94 87 23 01 00 00[	 ]+cmpzxadd[	 ]+%r25d,%edx,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 e4 bc 87 23 01 00 00[	 ]+cmpzxadd[	 ]+%r31,%r15,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 f7[	 ]+crc32  %r31,%r22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc fc 08 f1 37[	 ]+crc32q \(%r31\),%r22
> +[	 ]*[a-f0-9]+:[	 ]*62 ec fc 08 f0 cb[	 ]+crc32  %r19b,%r17
> +[	 ]*[a-f0-9]+:[	 ]*62 ec 7c 08 f0 eb[	 ]+crc32  %r19b,%r21d
> +[	 ]*[a-f0-9]+:[	 ]*62 fc 7c 08 f0 1b[	 ]+crc32b \(%r19\),%ebx
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 ff[	 ]+crc32  %r31d,%r23d
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 f1 3f[	 ]+crc32l \(%r31\),%r23d
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 ef[	 ]+crc32  %r31w,%r21d
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 f1 2f[	 ]+crc32w \(%r31\),%r21d
> +[	 ]*[a-f0-9]+:[	 ]*62 e4 fc 08 f1 d0[	 ]+crc32  %rax,%r18
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 da d1[	 ]+encodekey128[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7e 08 db d1[	 ]+encodekey256[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7f 08 f8 8c 87 23 01 00 00[	 ]+enqcmd[	 ]+0x123\(%r31d,%eax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7f 08 f8 bc 87 23 01 00 00[	 ]+enqcmd[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7e 08 f8 8c 87 23 01 00 00[	 ]+enqcmds[	 ]+0x123\(%r31d,%eax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7e 08 f8 bc 87 23 01 00 00[	 ]+enqcmds[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f0 bc 87 23 01 00 00[	 ]+invept[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f2 bc 87 23 01 00 00[	 ]+invpcid[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fe 08 f1 bc 87 23 01 00 00[	 ]+invvpid[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7d 08 93 cd[	 ]+kmovb[	 ]+%k5,%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 91 ac 87 23 01 00 00[	 ]+kmovb[	 ]+%k5,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 92 e9[	 ]+kmovb[	 ]+%r25d,%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7d 08 90 ac 87 23 01 00 00[	 ]+kmovb[	 ]+0x123\(%r31,%rax,4\),%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7f 08 93 cd[	 ]+kmovd[	 ]+%k5,%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 91 ac 87 23 01 00 00[	 ]+kmovd[	 ]+%k5,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7f 08 92 e9[	 ]+kmovd[	 ]+%r25d,%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fd 08 90 ac 87 23 01 00 00[	 ]+kmovd[	 ]+0x123\(%r31,%rax,4\),%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 61 ff 08 93 fd[	 ]+kmovq[	 ]+%k5,%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 91 ac 87 23 01 00 00[	 ]+kmovq[	 ]+%k5,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 ff 08 92 ef[	 ]+kmovq[	 ]+%r31,%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 fc 08 90 ac 87 23 01 00 00[	 ]+kmovq[	 ]+0x123\(%r31,%rax,4\),%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 61 7c 08 93 cd[	 ]+kmovw[	 ]+%k5,%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 91 ac 87 23 01 00 00[	 ]+kmovw[	 ]+%k5,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 92 e9[	 ]+kmovw[	 ]+%r25d,%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 d9 7c 08 90 ac 87 23 01 00 00[	 ]+kmovw[	 ]+0x123\(%r31,%rax,4\),%k5
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7c 08 49 84 87 23 01 00 00[	 ]+ldtilecfg[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 fc 7d 08 60 c2[	 ]+movbe[	 ]+%r18w,%ax
> +[	 ]*[a-f0-9]+:[	 ]*62 ec 7d 08 61 94 80 23 01 00 00[	 ]+movbe[	 ]+%r18w,0x123\(%r16,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 61 94 87 23 01 00 00[	 ]+movbe[	 ]+%r18w,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 dc 7c 08 60 d1[	 ]+movbe[	 ]+%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 6c 7c 08 61 8c 80 23 01 00 00[	 ]+movbe[	 ]+%r25d,0x123\(%r16,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5c fc 08 60 ff[	 ]+movbe[	 ]+%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 61 bc 80 23 01 00 00[	 ]+movbe[	 ]+%r31,0x123\(%r16,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 61 bc 87 23 01 00 00[	 ]+movbe[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 6c fc 08 60 bc 80 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r16,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7d 08 60 94 87 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r31,%rax,4\),%r18w
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 60 8c 87 23 01 00 00[	 ]+movbe[	 ]+0x123\(%r31,%rax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*67 62 4c 7d 08 f8 8c 87 23 01 00 00[	 ]+movdir64b[	 ]+0x123\(%r31d,%eax,4\),%r25d
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 f8 bc 87 23 01 00 00[	 ]+movdir64b[	 ]+0x123\(%r31,%rax,4\),%r31
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 f9 8c 87 23 01 00 00[	 ]+movdiri[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 f9 bc 87 23 01 00 00[	 ]+movdiri[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 6f 08 f5 d1[	 ]+pdep[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 08 f5 df[	 ]+pdep[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f5 94 87 23 01 00 00[	 ]+pdep[	 ]+0x123\(%r31,%rax,4\),%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f5 bc 87 23 01 00 00[	 ]+pdep[	 ]+0x123\(%r31,%rax,4\),%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 6e 08 f5 d1[	 ]+pext[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 86 08 f5 df[	 ]+pext[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 da 36 00 f5 94 87 23 01 00 00[	 ]+pext[	 ]+0x123\(%r31,%rax,4\),%r25d,%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 86 00 f5 bc 87 23 01 00 00[	 ]+pext[	 ]+0x123\(%r31,%rax,4\),%r31,%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d9 f7[	 ]+sha1msg1[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d9 b4 87 23 01 00 00[	 ]+sha1msg1[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 da f7[	 ]+sha1msg2[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 da b4 87 23 01 00 00[	 ]+sha1msg2[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d8 f7[	 ]+sha1nexte[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d8 b4 87 23 01 00 00[	 ]+sha1nexte[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 d4 f7 7b[	 ]+sha1rnds4[	 ]+\$0x7b,%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 d4 b4 87 23 01 00 00 7b[	 ]+sha1rnds4[	 ]+\$0x7b,0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dc f7[	 ]+sha256msg1[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dc b4 87 23 01 00 00[	 ]+sha256msg1[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 a4 7c 08 dd f7[	 ]+sha256msg2[	 ]+%xmm23,%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 cc 7c 08 dd b4 87 23 01 00 00[	 ]+sha256msg2[	 ]+0x123\(%r31,%rax,4\),%xmm22
> +[	 ]*[a-f0-9]+:[	 ]*62 5c 7c 08 db a4 87 23 01 00 00[	 ]+sha256rnds2[	 ]+%xmm0,0x123\(%r31,%rax,4\),%xmm12
> +[	 ]*[a-f0-9]+:[	 ]*62 72 35 00 f7 d2[	 ]+shlx[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 35 00 f7 94 87 23 01 00 00[	 ]+shlx[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 52 85 00 f7 df[	 ]+shlx[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 85 00 f7 bc 87 23 01 00 00[	 ]+shlx[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 72 37 00 f7 d2[	 ]+shrx[	 ]+%r25d,%edx,%r10d
> +[	 ]*[a-f0-9]+:[	 ]*62 da 37 00 f7 94 87 23 01 00 00[	 ]+shrx[	 ]+%r25d,0x123\(%r31,%rax,4\),%edx
> +[	 ]*[a-f0-9]+:[	 ]*62 52 87 00 f7 df[	 ]+shrx[	 ]+%r31,%r15,%r11
> +[	 ]*[a-f0-9]+:[	 ]*62 5a 87 00 f7 bc 87 23 01 00 00[	 ]+shrx[	 ]+%r31,0x123\(%r31,%rax,4\),%r15
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 49 84 87 23 01 00 00[	 ]+sttilecfg[	 ]+0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7f 08 4b b4 87 23 01 00 00[	 ]+tileloadd[	 ]+0x123\(%r31,%rax,4\),%tmm6
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7d 08 4b b4 87 23 01 00 00[	 ]+tileloaddt1[	 ]+0x123\(%r31,%rax,4\),%tmm6
> +[	 ]*[a-f0-9]+:[	 ]*62 da 7e 08 4b b4 87 23 01 00 00[	 ]+tilestored[	 ]+%tmm6,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7c 08 66 8c 87 23 01 00 00[	 ]+wrssd[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fc 08 66 bc 87 23 01 00 00[	 ]+wrssq[	 ]+%r31,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c 7d 08 65 8c 87 23 01 00 00[	 ]+wrussd[	 ]+%r25d,0x123\(%r31,%rax,4\)
> +[	 ]*[a-f0-9]+:[	 ]*62 4c fd 08 65 bc 87 23 01 00 00[	 ]+wrussq[	 ]+%r31,0x123\(%r31,%rax,4\)
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
> new file mode 100644
> index 00000000000..39752c27432
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
> @@ -0,0 +1,314 @@
> +# Check 64bit APX_F EVEX-Promoted instructions.
> +
> +	.text
> +_start:
> +	aadd	%r25d,0x123(%r31,%rax,4)
> +	aadd	%r31,0x123(%r31,%rax,4)
> +	aand	%r25d,0x123(%r31,%rax,4)
> +	aand	%r31,0x123(%r31,%rax,4)
> +	aesdec128kl	0x123(%r31,%rax,4),%xmm22
> +	aesdec256kl	0x123(%r31,%rax,4),%xmm22
> +	aesdecwide128kl	0x123(%r31,%rax,4)
> +	aesdecwide256kl	0x123(%r31,%rax,4)
> +	aesenc128kl	0x123(%r31,%rax,4),%xmm22
> +	aesenc256kl	0x123(%r31,%rax,4),%xmm22
> +	aesencwide128kl	0x123(%r31,%rax,4)
> +	aesencwide256kl	0x123(%r31,%rax,4)
> +	aor	%r25d,0x123(%r31,%rax,4)
> +	aor	%r31,0x123(%r31,%rax,4)
> +	axor	%r25d,0x123(%r31,%rax,4)
> +	axor	%r31,0x123(%r31,%rax,4)
> +	bextr	%r25d,%edx,%r10d
> +	bextr	%r25d,0x123(%r31,%rax,4),%edx
> +	bextr	%r31,%r15,%r11
> +	bextr	%r31,0x123(%r31,%rax,4),%r15
> +	blsi	%r25d,%edx
> +	blsi	%r31,%r15
> +	blsi	0x123(%r31,%rax,4),%r25d
> +	blsi	0x123(%r31,%rax,4),%r31
> +	blsmsk	%r25d,%edx
> +	blsmsk	%r31,%r15
> +	blsmsk	0x123(%r31,%rax,4),%r25d
> +	blsmsk	0x123(%r31,%rax,4),%r31
> +	blsr	%r25d,%edx
> +	blsr	%r31,%r15
> +	blsr	0x123(%r31,%rax,4),%r25d
> +	blsr	0x123(%r31,%rax,4),%r31
> +	bzhi	%r25d,%edx,%r10d
> +	bzhi	%r25d,0x123(%r31,%rax,4),%edx
> +	bzhi	%r31,%r15,%r11
> +	bzhi	%r31,0x123(%r31,%rax,4),%r15
> +	cmpbexadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpbexadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpbxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpbxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmplxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmplxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpnbexadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpnbexadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpnbxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpnbxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpnlexadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpnlexadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpnlxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpnlxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpnoxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpnoxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpnpxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpnpxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpnsxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpnsxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpnzxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpnzxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpoxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpoxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmppxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmppxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpsxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpsxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	cmpzxadd	%r25d,%edx,0x123(%r31,%rax,4)
> +	cmpzxadd	%r31,%r15,0x123(%r31,%rax,4)
> +	crc32q	%r31, %r22
> +	crc32q	(%r31), %r22
> +	crc32b	%r19b, %r17
> +	crc32b	%r19b, %r21d
> +	crc32b	(%r19),%ebx
> +	crc32l	%r31d, %r23d
> +	crc32l	(%r31), %r23d
> +	crc32w	%r31w, %r21d
> +	crc32w	(%r31),%r21d
> +	crc32	%rax, %r18
> +	encodekey128	%r25d,%edx
> +	encodekey256	%r25d,%edx
> +	enqcmd	0x123(%r31d,%eax,4),%r25d
> +	enqcmd	0x123(%r31,%rax,4),%r31
> +	enqcmds	0x123(%r31d,%eax,4),%r25d
> +	enqcmds	0x123(%r31,%rax,4),%r31
> +	invept	0x123(%r31,%rax,4),%r31
> +	invpcid	0x123(%r31,%rax,4),%r31
> +	invvpid	0x123(%r31,%rax,4),%r31
> +	kmovb	%k5,%r25d
> +	kmovb	%k5,0x123(%r31,%rax,4)
> +	kmovb	%r25d,%k5
> +	kmovb	0x123(%r31,%rax,4),%k5
> +	kmovd	%k5,%r25d
> +	kmovd	%k5,0x123(%r31,%rax,4)
> +	kmovd	%r25d,%k5
> +	kmovd	0x123(%r31,%rax,4),%k5
> +	kmovq	%k5,%r31
> +	kmovq	%k5,0x123(%r31,%rax,4)
> +	kmovq	%r31,%k5
> +	kmovq	0x123(%r31,%rax,4),%k5
> +	kmovw	%k5,%r25d
> +	kmovw	%k5,0x123(%r31,%rax,4)
> +	kmovw	%r25d,%k5
> +	kmovw	0x123(%r31,%rax,4),%k5
> +	ldtilecfg	0x123(%r31,%rax,4)
> +	movbe	%r18w,%ax
> +	movbe	%r18w,0x123(%r16,%rax,4)
> +	movbe	%r18w,0x123(%r31,%rax,4)
> +	movbe	%r25d,%edx
> +	movbe	%r25d,0x123(%r16,%rax,4)
> +	movbe	%r31,%r15
> +	movbe	%r31,0x123(%r16,%rax,4)
> +	movbe	%r31,0x123(%r31,%rax,4)
> +	movbe	0x123(%r16,%rax,4),%r31
> +	movbe	0x123(%r31,%rax,4),%r18w
> +	movbe	0x123(%r31,%rax,4),%r25d
> +	movdir64b	0x123(%r31d,%eax,4),%r25d
> +	movdir64b	0x123(%r31,%rax,4),%r31
> +	movdiri	%r25d,0x123(%r31,%rax,4)
> +	movdiri	%r31,0x123(%r31,%rax,4)
> +	pdep	%r25d,%edx,%r10d
> +	pdep	%r31,%r15,%r11
> +	pdep	0x123(%r31,%rax,4),%r25d,%edx
> +	pdep	0x123(%r31,%rax,4),%r31,%r15
> +	pext	%r25d,%edx,%r10d
> +	pext	%r31,%r15,%r11
> +	pext	0x123(%r31,%rax,4),%r25d,%edx
> +	pext	0x123(%r31,%rax,4),%r31,%r15
> +	sha1msg1	%xmm23,%xmm22
> +	sha1msg1	0x123(%r31,%rax,4),%xmm22
> +	sha1msg2	%xmm23,%xmm22
> +	sha1msg2	0x123(%r31,%rax,4),%xmm22
> +	sha1nexte	%xmm23,%xmm22
> +	sha1nexte	0x123(%r31,%rax,4),%xmm22
> +	sha1rnds4	$0x7b,%xmm23,%xmm22
> +	sha1rnds4	$0x7b,0x123(%r31,%rax,4),%xmm22
> +	sha256msg1	%xmm23,%xmm22
> +	sha256msg1	0x123(%r31,%rax,4),%xmm22
> +	sha256msg2	%xmm23,%xmm22
> +	sha256msg2	0x123(%r31,%rax,4),%xmm22
> +	sha256rnds2	0x123(%r31,%rax,4),%xmm12
> +	shlx	%r25d,%edx,%r10d
> +	shlx	%r25d,0x123(%r31,%rax,4),%edx
> +	shlx	%r31,%r15,%r11
> +	shlx	%r31,0x123(%r31,%rax,4),%r15
> +	shrx	%r25d,%edx,%r10d
> +	shrx	%r25d,0x123(%r31,%rax,4),%edx
> +	shrx	%r31,%r15,%r11
> +	shrx	%r31,0x123(%r31,%rax,4),%r15
> +	sttilecfg	0x123(%r31,%rax,4)
> +	tileloadd	0x123(%r31,%rax,4),%tmm6
> +	tileloaddt1	0x123(%r31,%rax,4),%tmm6
> +	tilestored	%tmm6,0x123(%r31,%rax,4)
> +	wrssd	%r25d,0x123(%r31,%rax,4)
> +	wrssq	%r31,0x123(%r31,%rax,4)
> +	wrussd	%r25d,0x123(%r31,%rax,4)
> +	wrussq	%r31,0x123(%r31,%rax,4)
> +
> +	.intel_syntax noprefix
> +	aadd	[r31+rax*4+0x123],r25d
> +	aadd	[r31+rax*4+0x123],r31
> +	aand	[r31+rax*4+0x123],r25d
> +	aand	[r31+rax*4+0x123],r31
> +	aesdec128kl	xmm22,[r31+rax*4+0x123]
> +	aesdec256kl	xmm22,[r31+rax*4+0x123]
> +	aesdecwide128kl	[r31+rax*4+0x123]
> +	aesdecwide256kl	[r31+rax*4+0x123]
> +	aesenc128kl	xmm22,[r31+rax*4+0x123]
> +	aesenc256kl	xmm22,[r31+rax*4+0x123]
> +	aesencwide128kl	[r31+rax*4+0x123]
> +	aesencwide256kl	[r31+rax*4+0x123]
> +	aor	[r31+rax*4+0x123],r25d
> +	aor	[r31+rax*4+0x123],r31
> +	axor	[r31+rax*4+0x123],r25d
> +	axor	[r31+rax*4+0x123],r31
> +	bextr	r10d,edx,r25d
> +	bextr	edx, [r31+rax*4+0x123],r25d
> +	bextr	r11,r15,r31
> +	bextr	r15, [r31+rax*4+0x123],r31
> +	blsi	edx,r25d
> +	blsi	r15,r31
> +	blsi	r25d, [r31+rax*4+0x123]
> +	blsi	r31,  [r31+rax*4+0x123]
> +	blsmsk	edx,r25d
> +	blsmsk	r15,r31
> +	blsmsk	r25d, [r31+rax*4+0x123]
> +	blsmsk	r31,  [r31+rax*4+0x123]
> +	blsr	edx,r25d
> +	blsr	r15,r31
> +	blsr	r25d, [r31+rax*4+0x123]
> +	blsr	r31,  [r31+rax*4+0x123]
> +	bzhi	r10d,edx,r25d
> +	bzhi	edx, [r31+rax*4+0x123],r25d
> +	bzhi	r11,r15,r31
> +	bzhi	r15, [r31+rax*4+0x123],r31
> +	cmpbexadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpbexadd	 [r31+rax*4+0x123],r15,r31
> +	cmpbxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpbxadd	 [r31+rax*4+0x123],r15,r31
> +	cmplxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmplxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpnbexadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpnbexadd	 [r31+rax*4+0x123],r15,r31
> +	cmpnbxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpnbxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpnlexadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpnlexadd	 [r31+rax*4+0x123],r15,r31
> +	cmpnlxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpnlxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpnoxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpnoxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpnpxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpnpxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpnsxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpnsxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpnzxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpnzxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpoxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpoxadd	 [r31+rax*4+0x123],r15,r31
> +	cmppxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmppxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpsxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpsxadd	 [r31+rax*4+0x123],r15,r31
> +	cmpzxadd	 [r31+rax*4+0x123],edx,r25d
> +	cmpzxadd	 [r31+rax*4+0x123],r15,r31
> +	crc32	r22,r31
> +	crc32	r22,QWORD PTR [r31]
> +	crc32	r17,r19b
> +	crc32	r21d,r19b
> +	crc32	ebx,BYTE PTR [r19]
> +	crc32	r23d,r31d
> +	crc32	r23d,DWORD PTR [r31]
> +	crc32	r21d,r31w
> +	crc32	r21d,WORD PTR [r31]
> +	crc32	r18,rax
> +	encodekey128	edx,r25d
> +	encodekey256	edx,r25d
> +	enqcmd	r25d,[r31d+eax*4+0x123]
> +	enqcmd	r31,[r31+rax*4+0x123]
> +	enqcmds	r25d,[r31d+eax*4+0x123]
> +	enqcmds	r31,[r31+rax*4+0x123]
> +	invept	r31,OWORD PTR [r31+rax*4+0x123]
> +	invpcid	r31,[r31+rax*4+0x123]
> +	invvpid	r31,OWORD PTR [r31+rax*4+0x123]
> +	kmovb	r25d,k5
> +	kmovb	BYTE PTR [r31+rax*4+0x123],k5
> +	kmovb	k5,r25d
> +	kmovb	k5,BYTE PTR [r31+rax*4+0x123]
> +	kmovd	r25d,k5
> +	kmovd	DWORD PTR [r31+rax*4+0x123],k5
> +	kmovd	k5,r25d
> +	kmovd	k5,DWORD PTR [r31+rax*4+0x123]
> +	kmovq	r31,k5
> +	kmovq	QWORD PTR [r31+rax*4+0x123],k5
> +	kmovq	k5,r31
> +	kmovq	k5,QWORD PTR [r31+rax*4+0x123]
> +	kmovw	r25d,k5
> +	kmovw	WORD PTR [r31+rax*4+0x123],k5
> +	kmovw	k5,r25d
> +	kmovw	k5,WORD PTR [r31+rax*4+0x123]
> +	ldtilecfg	[r31+rax*4+0x123]
> +	movbe	ax,r18w
> +	movbe	WORD PTR [r16+rax*4+0x123],r18w
> +	movbe	WORD PTR [r31+rax*4+0x123],r18w
> +	movbe	edx,r25d
> +	movbe	DWORD PTR [r16+rax*4+0x123],r25d
> +	movbe	r15,r31
> +	movbe	QWORD PTR [r16+rax*4+0x123],r31
> +	movbe	QWORD PTR [r31+rax*4+0x123],r31
> +	movbe	r31,QWORD PTR [r16+rax*4+0x123]
> +	movbe	r18w,WORD PTR [r31+rax*4+0x123]
> +	movbe	r25d,DWORD PTR [r31+rax*4+0x123]
> +	movdir64b	r25d,[r31d+eax*4+0x123]
> +	movdir64b	r31,[r31+rax*4+0x123]
> +	movdiri	DWORD PTR [r31+rax*4+0x123],r25d
> +	movdiri	QWORD PTR [r31+rax*4+0x123],r31
> +	pdep	r10d,edx,r25d
> +	pdep	r11,r15,r31
> +	pdep	edx,r25d,DWORD PTR [r31+rax*4+0x123]
> +	pdep	r15,r31,QWORD PTR [r31+rax*4+0x123]
> +	pext	r10d,edx,r25d
> +	pext	r11,r15,r31
> +	pext	edx,r25d,DWORD PTR [r31+rax*4+0x123]
> +	pext	r15,r31,QWORD PTR [r31+rax*4+0x123]
> +	sha1msg1	xmm22,xmm23
> +	sha1msg1	xmm22,XMMWORD PTR [r31+rax*4+0x123]
> +	sha1msg2	xmm22,xmm23
> +	sha1msg2	xmm22,XMMWORD PTR [r31+rax*4+0x123]
> +	sha1nexte	xmm22,xmm23
> +	sha1nexte	xmm22,XMMWORD PTR [r31+rax*4+0x123]
> +	sha1rnds4	xmm22,xmm23,0x7b
> +	sha1rnds4	xmm22,XMMWORD PTR [r31+rax*4+0x123],0x7b
> +	sha256msg1	xmm22,xmm23
> +	sha256msg1	xmm22,XMMWORD PTR [r31+rax*4+0x123]
> +	sha256msg2	xmm22,xmm23
> +	sha256msg2	xmm22,XMMWORD PTR [r31+rax*4+0x123]
> +	sha256rnds2	xmm12,XMMWORD PTR [r31+rax*4+0x123]
> +	shlx	r10d,edx,r25d
> +	shlx	edx,DWORD PTR [r31+rax*4+0x123],r25d
> +	shlx	r11,r15,r31
> +	shlx	r15,QWORD PTR [r31+rax*4+0x123],r31
> +	shrx	r10d,edx,r25d
> +	shrx	edx,DWORD PTR [r31+rax*4+0x123],r25d
> +	shrx	r11,r15,r31
> +	shrx	r15,QWORD PTR [r31+rax*4+0x123],r31
> +	sttilecfg	[r31+rax*4+0x123]
> +	tileloadd	tmm6,[r31+rax*4+0x123]
> +	tileloaddt1	tmm6,[r31+rax*4+0x123]
> +	tilestored	[r31+rax*4+0x123],tmm6
> +	wrssd	DWORD PTR [r31+rax*4+0x123],r25d
> +	wrssq	QWORD PTR [r31+rax*4+0x123],r31
> +	wrussd	DWORD PTR [r31+rax*4+0x123],r25d
> +	wrussq	QWORD PTR [r31+rax*4+0x123],r31
> diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> index ffacc9c8e2b..bfda747e02e 100644
> --- a/gas/testsuite/gas/i386/x86-64.exp
> +++ b/gas/testsuite/gas/i386/x86-64.exp
> @@ -364,7 +364,12 @@ run_dump_test "x86-64-avx512f-rcigrne"
>  run_dump_test "x86-64-avx512f-rcigru-intel"
>  run_dump_test "x86-64-avx512f-rcigru"
>  run_list_test "x86-64-apx-egpr-inval"
> +run_dump_test "x86-64-apx-evex-promoted-bad"
> +run_list_test "x86-64-apx-egpr-promote-inval" "-al"
>  run_dump_test "x86-64-apx-rex2"
> +run_dump_test "x86-64-apx-evex-promoted"
> +run_dump_test "x86-64-apx-evex-promoted-intel"
> +run_dump_test "x86-64-apx-evex-egpr"
>  run_dump_test "x86-64-avx512f-rcigrz-intel"
>  run_dump_test "x86-64-avx512f-rcigrz"
>  run_dump_test "x86-64-clwb"
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.


* Re: [PATCH V5 5/9] Support APX NDD
  2023-12-28  1:27 ` [PATCH V5 5/9] Support APX NDD Cui, Lili
@ 2023-12-28  1:55   ` H.J. Lu
  0 siblings, 0 replies; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:55 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich, konglin1

On Thu, Dec 28, 2023 at 01:27:10AM +0000, Cui, Lili wrote:
> From: konglin1 <lingling.kong@intel.com>
> 
> opcodes/ChangeLog:
> 
> 	* opcodes/i386-dis-evex-reg.h: Handle REG_EVEX_MAP4_80,
> 	REG_EVEX_MAP4_81, REG_EVEX_MAP4_83, REG_EVEX_MAP4_F6,
> 	REG_EVEX_MAP4_F7, REG_EVEX_MAP4_FE, REG_EVEX_MAP4_FF.
> 	* opcodes/i386-dis-evex.h: Add NDD insn.
> 	* opcodes/i386-dis.c (nd): New define.
> 	(VexGb): Ditto.
> 	(VexGv): Ditto.
> 	(get_valid_dis386): Change for NDD decode.
> 	(print_insn): Ditto.
> 	(putop): Ditto.
> 	(intel_operand_size): Ditto.
> 	(OP_E_memory): Ditto.
> 	(OP_VEX): Ditto.
> 	* opcodes/i386-opc.h (VexVVVV_DST): New.
> 	* opcodes/i386-opc.tbl: Add APX NDD instructions and adjust VexVVVV.
> 	* opcodes/i386-tbl.h: Regenerated.
> 
> gas/ChangeLog:
> 
> 	* gas/config/tc-i386.c (operand_size_match):
> 	Support APX NDD insns with three operands.
> 	(build_apx_evex_prefix): Change for NDD encode.
> 	(process_operands): Ditto.
> 	(build_modrm_byte): Ditto.
> 	(match_template): Support swapping the first two operands for
> 	APX NDD.
> 	* testsuite/gas/i386/x86-64.exp: Add x86-64-apx-ndd.
> 	* testsuite/gas/i386/x86-64-apx-ndd.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-ndd.s: Ditto.
> 	* testsuite/gas/i386/x86-64-pseudos.d: Add test.
> 	* testsuite/gas/i386/x86-64-pseudos.s: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.
> ---
>  gas/config/tc-i386.c                          |  62 +++++--
>  .../gas/i386/x86-64-apx-evex-promoted-bad.d   |   3 +
>  .../gas/i386/x86-64-apx-evex-promoted-bad.s   |   3 +
>  gas/testsuite/gas/i386/x86-64-apx-ndd.d       | 160 ++++++++++++++++
>  gas/testsuite/gas/i386/x86-64-apx-ndd.s       | 155 ++++++++++++++++
>  gas/testsuite/gas/i386/x86-64-pseudos.d       |  42 +++++
>  gas/testsuite/gas/i386/x86-64-pseudos.s       |  43 +++++
>  gas/testsuite/gas/i386/x86-64.exp             |   1 +
>  opcodes/i386-dis-evex-reg.h                   |  54 ++++++
>  opcodes/i386-dis-evex.h                       | 124 ++++++-------
>  opcodes/i386-dis.c                            | 171 +++++++++++-------
>  opcodes/i386-opc.h                            |   6 +-
>  opcodes/i386-opc.tbl                          |  75 ++++++++
>  13 files changed, 754 insertions(+), 145 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 7e62d08e9bd..99b484122e1 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -2239,8 +2239,10 @@ operand_size_match (const insn_template *t)
>        unsigned int given = i.operands - j - 1;
>  
>        /* For FMA4 and XOP insns VEX.W controls just the first two
> -	 register operands.  */
> -      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP))
> +	 register operands. And APX_F insns just swap the two source operands,
> +	 with the 3rd one being the destination.  */
> +      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP)
> +	  || is_cpu (t, CpuAPX_F))
>  	given = j < 2 ? 1 - j : j;
>  
>        if (t->operand_types[j].bitfield.class == Reg
> @@ -4199,6 +4201,11 @@ build_apx_evex_prefix (void)
>    if (i.vex.register_specifier
>        && i.vex.register_specifier->reg_flags & RegRex2)
>      i.vex.bytes[3] &= ~0x08;
> +
> +  /* Encode the NDD bit of the instruction promoted from the legacy
> +     space.  */
> +  if (i.vex.register_specifier && i.tm.opcode_space == SPACE_EVEXMAP4)
> +    i.vex.bytes[3] |= 0x10;
>  }
>  
>  static void establish_rex (void)
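For readers following the build_apx_evex_prefix hunk above, here is a
minimal standalone sketch of how the NDD register appears to land in the
prefix bytes. It is not the gas code: encode_ndd_register and its
parameters are hypothetical names, and the field layout (inverted vvvv
in bits 6:3 of byte 2, inverted V4 in bit 3 and ND in bit 4 of byte 3)
is my reading of the 0x08/0x10 masks in the patch and of the byte
patterns in x86-64-apx-ndd.d.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of gas.  Patches the NDD register
   number (0..31) into the third and fourth EVEX prefix bytes
   (byte2/byte3 here, mirroring i.vex.bytes[2] and i.vex.bytes[3]).  */
void
encode_ndd_register (unsigned reg, uint8_t *byte2, uint8_t *byte3)
{
  assert (reg < 32);

  /* Low four register bits go into vvvv, stored inverted.  */
  *byte2 = (uint8_t) ((*byte2 & ~0x78u) | ((~reg & 0xfu) << 3));

  /* Fifth register bit is stored inverted in bit 3 of byte 3, so it
     is cleared for r16..r31 (the RegRex2 case in the patch).  */
  if (reg & 0x10u)
    *byte3 &= (uint8_t) ~0x08u;
  else
    *byte3 |= 0x08u;

  /* ND bit, as set by the new hunk.  */
  *byte3 |= 0x10u;
}
```

Starting from byte2 = 0x84 and byte3 = 0x08, register 16 gives fc 10,
which matches the "62 44 fc 10 01 f8  add %r31,%r8,%r16" line of the
new test; register 3 gives e4 18, matching "inc %rax,%rbx".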
> @@ -7472,18 +7479,22 @@ match_template (char mnem_suffix)
>  	     - the store form is requested, and the template is a load form,
>  	     - the non-default (swapped) form is requested.  */
>  	  overlap1 = operand_type_and (operand_types[0], operand_types[1]);
> +
> +	  j = i.operands - 1 - (t->opcode_space == SPACE_EVEXMAP4
> +				&& t->opcode_modifier.vexvvvv);
> +
>  	  if (t->opcode_modifier.d && i.reg_operands == i.operands
>  	      && !operand_type_all_zero (&overlap1))
>  	    switch (i.dir_encoding)
>  	      {
>  	      case dir_encoding_load:
> -		if (operand_type_check (operand_types[i.operands - 1], anymem)
> +		if (operand_type_check (operand_types[j], anymem)
>  		    || t->opcode_modifier.regmem)
>  		  goto check_reverse;
>  		break;
>  
>  	      case dir_encoding_store:
> -		if (!operand_type_check (operand_types[i.operands - 1], anymem)
> +		if (!operand_type_check (operand_types[j], anymem)
>  		    && !t->opcode_modifier.regmem)
>  		  goto check_reverse;
>  		break;
> @@ -7494,6 +7505,7 @@ match_template (char mnem_suffix)
>  	      case dir_encoding_default:
>  		break;
>  	      }
> +
>  	  /* If we want store form, we skip the current load.  */
>  	  if ((i.dir_encoding == dir_encoding_store
>  	       || i.dir_encoding == dir_encoding_swap)
> @@ -7523,11 +7535,13 @@ match_template (char mnem_suffix)
>  		continue;
>  	      /* Try reversing direction of operands.  */
>  	      j = is_cpu (t, CpuFMA4)
> -		  || is_cpu (t, CpuXOP) ? 1 : i.operands - 1;
> +		  || is_cpu (t, CpuXOP)
> +		  || is_cpu (t, CpuAPX_F) ? 1 : i.operands - 1;
>  	      overlap0 = operand_type_and (i.types[0], operand_types[j]);
>  	      overlap1 = operand_type_and (i.types[j], operand_types[0]);
>  	      overlap2 = operand_type_and (i.types[1], operand_types[1]);
> -	      gas_assert (t->operands != 3 || !check_register);
> +	      gas_assert (t->operands != 3 || !check_register
> +			  || is_cpu (t, CpuAPX_F));
>  	      if (!operand_type_match (overlap0, i.types[0])
>  		  || !operand_type_match (overlap1, i.types[j])
>  		  || (t->operands == 3
> @@ -7562,6 +7576,11 @@ match_template (char mnem_suffix)
>  		  found_reverse_match = Opcode_VexW;
>  		  goto check_operands_345;
>  		}
> +	      else if (is_cpu (t, CpuAPX_F) && i.operands == 3)
> +		{
> +		  found_reverse_match = Opcode_D;
> +		  goto check_operands_345;
> +		}
>  	      else if (t->opcode_space != SPACE_BASE
>  		       && (t->opcode_space != SPACE_0F
>  			   /* MOV to/from CR/DR/TR, as an exception, follow
> @@ -7743,6 +7762,9 @@ match_template (char mnem_suffix)
>  
>        i.tm.base_opcode ^= found_reverse_match;
>  
> +      if (i.tm.opcode_space == SPACE_EVEXMAP4)
> +	goto swap_first_2;
> +
>        /* Certain SIMD insns have their load forms specified in the opcode
>  	 table, and hence we need to _set_ RegMem instead of clearing it.
>  	 We need to avoid setting the bit though on insns like KMOVW.  */
> @@ -7762,6 +7784,7 @@ match_template (char mnem_suffix)
>  	 flipping VEX.W.  */
>        i.tm.opcode_modifier.vexw ^= VEXW0 ^ VEXW1;
>  
> +    swap_first_2:
>        j = i.tm.operand_types[0].bitfield.imm8;
>        i.tm.operand_types[j] = operand_types[j + 1];
>        i.tm.operand_types[j + 1] = operand_types[j];
> @@ -8583,12 +8606,9 @@ process_operands (void)
>       unnecessary segment overrides.  */
>    const reg_entry *default_seg = NULL;
>  
> -  /* We only need to check those implicit registers for instructions
> -     with 3 operands or less.  */
> -  if (i.operands <= 3)
> -    for (unsigned int j = 0; j < i.operands; j++)
> -      if (i.types[j].bitfield.instance != InstanceNone)
> -	i.reg_operands--;
> +  for (unsigned int j = 0; j < i.operands; j++)
> +    if (i.types[j].bitfield.instance != InstanceNone)
> +      i.reg_operands--;
>  
>    if (i.tm.opcode_modifier.sse2avx)
>      {
> @@ -8942,11 +8962,19 @@ build_modrm_byte (void)
>  				     || i.vec_encoding == vex_encoding_evex));
>      }
>  
> -  for (v = source + 1; v < dest; ++v)
> -    if (v != reg_slot)
> -      break;
> -  if (v >= dest)
> -    v = ~0;
> +  if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
> +    {
> +      v = dest;
> +      dest-- ;
> +    }
> +  else
> +    {
> +      for (v = source + 1; v < dest; ++v)
> +	if (v != reg_slot)
> +	  break;
> +      if (v >= dest)
> +	v = ~0;
> +    }
>    if (i.tm.extension_opcode != None)
>      {
>        if (dest != source)
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
> index 69b2d87f0f7..ba14736c3a8 100644
> --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
> @@ -31,3 +31,6 @@ Disassembly of section .text:
>  [ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
>  [ 	]*[a-f0-9]+:[ 	]+62 f2 fc 18 f5[ 	]+\(bad\)
>  [ 	]*[a-f0-9]+:[ 	]+0c 18[ 	]+or.*
> +[ 	]*[a-f0-9]+:[ 	]+62 f4 e4[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+08 ff[ 	]+.*
> +[ 	]*[a-f0-9]+:[ 	]+04 08[ 	]+.*
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> index 719c4b6de53..fcbb1b93659 100644
> --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> @@ -37,3 +37,6 @@ _start:
>  
>  	#EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[20](EVEX.b) == 0b1
>  	.insn EVEX.L0.NP.0f38.W1 0xf5, %rax, (%rax,%rbx){1to8}, %rcx
> +
> +	#{evex} inc %rax %rbx EVEX.vvvv != 1111 && EVEX.ND = 0.
> +	.insn EVEX.L0.NP.M4.W1 0xff/0, (%rax,%rcx), %rbx
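The condition named in the test comment above (EVEX.vvvv != 1111 with
EVEX.ND = 0) can be sketched as a small decoder-side check. This is an
illustration only, not the i386-dis.c logic: ndd_fields_bad and
ndd_register are hypothetical names, and the bit positions are assumed
from the 0x08/0x10 masks used in build_apx_evex_prefix.

```c
#include <stdint.h>

/* Hypothetical helper.  byte2/byte3 are the third and fourth bytes of
   the EVEX prefix.  Returns nonzero for the combination the new
   "bad" test exercises: vvvv is used although ND is clear.  */
int
ndd_fields_bad (uint8_t byte2, uint8_t byte3)
{
  unsigned vvvv = (byte2 >> 3) & 0xfu;
  int nd = (byte3 & 0x10u) != 0;

  return !nd && vvvv != 0xfu;
}

/* Hypothetical helper.  Recovers the NDD register number from the
   inverted vvvv field plus the inverted fifth bit in byte 3.  */
unsigned
ndd_register (uint8_t byte2, uint8_t byte3)
{
  unsigned low = ((byte2 >> 3) & 0xfu) ^ 0xfu;
  unsigned high = (byte3 & 0x08u) ? 0u : 0x10u;

  return high | low;
}
```

With the .insn above (prefix bytes 62 f4 e4 08) the check fires, while
the well-formed "62 f4 e4 18 ff c0  inc %rax,%rbx" from
x86-64-apx-ndd.d passes and decodes to register 3 (%rbx).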
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd.d b/gas/testsuite/gas/i386/x86-64-apx-ndd.d
> new file mode 100644
> index 00000000000..73410606ce3
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-ndd.d
> @@ -0,0 +1,160 @@
> +#as:
> +#objdump: -dw
> +#name: x86-64 APX NDD instructions with evex prefix encoding
> +#source: x86-64-apx-ndd.s
> +
> +.*: +file format .*
> +
> +
> +Disassembly of section .text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*62 f4 0d 10 81 d0 34 12 	adc    \$0x1234,%ax,%r30w
> +\s*[a-f0-9]+:\s*62 7c 6c 10 10 f9    	adc    %r15b,%r17b,%r18b
> +\s*[a-f0-9]+:\s*62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
> +\s*[a-f0-9]+:\s*62 c4 3c 18 12 04 07 	adc    \(%r15,%rax,1\),%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 c4 3d 18 13 04 07 	adc    \(%r15,%rax,1\),%r16w,%r8w
> +\s*[a-f0-9]+:\s*62 fc 5c 10 83 14 83 11 	adcl   \$0x11,\(%r19,%rax,4\),%r20d
> +\s*[a-f0-9]+:\s*62 54 6d 10 66 c7    	adcx   %r15d,%r8d,%r18d
> +\s*[a-f0-9]+:\s*62 14 f9 08 66 04 3f 	adcx   \(%r15,%r31,1\),%r8
> +\s*[a-f0-9]+:\s*62 14 69 10 66 04 3f 	adcx   \(%r15,%r31,1\),%r8d,%r18d
> +\s*[a-f0-9]+:\s*62 f4 0d 10 81 c0 34 12 	add    \$0x1234,%ax,%r30w
> +\s*[a-f0-9]+:\s*62 d4 fc 10 81 c7 33 44 34 12 	add    \$0x12344433,%r15,%r16
> +\s*[a-f0-9]+:\s*62 d4 74 10 80 c5 34 	add    \$0x34,%r13b,%r17b
> +\s*[a-f0-9]+:\s*62 f4 bc 18 81 c0 11 22 33 f4 	add    \$0xfffffffff4332211,%rax,%r8
> +\s*[a-f0-9]+:\s*62 44 fc 10 01 f8    	add    %r31,%r8,%r16
> +\s*[a-f0-9]+:\s*62 44 fc 10 01 38    	add    %r31,\(%r8\),%r16
> +\s*[a-f0-9]+:\s*62 44 f8 10 01 3c c0 	add    %r31,\(%r8,%r16,8\),%r16
> +\s*[a-f0-9]+:\s*62 44 7c 10 00 f8    	add    %r31b,%r8b,%r16b
> +\s*[a-f0-9]+:\s*62 44 7c 10 01 f8    	add    %r31d,%r8d,%r16d
> +\s*[a-f0-9]+:\s*62 44 7d 10 01 f8    	add    %r31w,%r8w,%r16w
> +\s*[a-f0-9]+:\s*62 5c fc 10 03 07    	add    \(%r31\),%r8,%r16
> +\s*[a-f0-9]+:\s*62 5c f8 10 03 84 07 90 90 00 00 	add    0x9090\(%r31,%r16,1\),%r8,%r16
> +\s*[a-f0-9]+:\s*62 44 7c 10 00 f8    	add    %r31b,%r8b,%r16b
> +\s*[a-f0-9]+:\s*62 44 7c 10 01 f8    	add    %r31d,%r8d,%r16d
> +\s*[a-f0-9]+:\s*62 fc 5c 10 83 04 83 11 	addl   \$0x11,\(%r19,%rax,4\),%r20d
> +\s*[a-f0-9]+:\s*62 44 fc 10 01 f8    	add    %r31,%r8,%r16
> +\s*[a-f0-9]+:\s*62 d4 fc 10 81 04 8f 33 44 34 12 	addq   \$0x12344433,\(%r15,%rcx,4\),%r16
> +\s*[a-f0-9]+:\s*62 44 7d 10 01 f8    	add    %r31w,%r8w,%r16w
> +\s*[a-f0-9]+:\s*62 54 6e 10 66 c7    	adox   %r15d,%r8d,%r18d
> +\s*[a-f0-9]+:\s*62 5c fc 10 03 c7    	add    %r31,%r8,%r16
> +\s*[a-f0-9]+:\s*62 44 fc 10 01 f8    	add    %r31,%r8,%r16
> +\s*[a-f0-9]+:\s*62 14 fa 08 66 04 3f 	adox   \(%r15,%r31,1\),%r8
> +\s*[a-f0-9]+:\s*62 14 6a 10 66 04 3f 	adox   \(%r15,%r31,1\),%r8d,%r18d
> +\s*[a-f0-9]+:\s*62 f4 0d 10 81 e0 34 12 	and    \$0x1234,%ax,%r30w
> +\s*[a-f0-9]+:\s*62 7c 6c 10 20 f9    	and    %r15b,%r17b,%r18b
> +\s*[a-f0-9]+:\s*62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
> +\s*[a-f0-9]+:\s*62 c4 3c 18 22 04 07 	and    \(%r15,%rax,1\),%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 c4 3d 18 23 04 07 	and    \(%r15,%rax,1\),%r16w,%r8w
> +\s*[a-f0-9]+:\s*62 fc 5c 10 83 24 83 11 	andl   \$0x11,\(%r19,%rax,4\),%r20d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 47 90 90 90 90 90 	cmova  -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 43 90 90 90 90 90 	cmovae -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 42 90 90 90 90 90 	cmovb  -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 46 90 90 90 90 90 	cmovbe -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 44 90 90 90 90 90 	cmove  -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 4f 90 90 90 90 90 	cmovg  -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 4d 90 90 90 90 90 	cmovge -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 4c 90 90 90 90 90 	cmovl  -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 4e 90 90 90 90 90 	cmovle -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 45 90 90 90 90 90 	cmovne -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 41 90 90 90 90 90 	cmovno -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 49 90 90 90 90 90 	cmovns -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 40 90 90 90 90 90 	cmovo  -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 4a 90 90 90 90 90 	cmovp  -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 48 90 90 90 90 90 	cmovs  -0x6f6f6f70\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*62 f4 f4 10 ff c8    	dec    %rax,%r17
> +\s*[a-f0-9]+:\s*62 9c 3c 18 fe 0c 27 	decb   \(%r31,%r12,1\),%r8b
> +\s*[a-f0-9]+:\s*62 b4 b0 10 af 94 f8 09 09 00 00 	imul   0x909\(%rax,%r31,8\),%rdx,%r25
> +\s*[a-f0-9]+:\s*67 62 f4 3c 18 af 90 09 09 09 00 	imul   0x90909\(%eax\),%edx,%r8d
> +\s*[a-f0-9]+:\s*62 dc fc 10 ff c7    	inc    %r31,%r16
> +\s*[a-f0-9]+:\s*62 dc bc 18 ff c7    	inc    %r31,%r8
> +\s*[a-f0-9]+:\s*62 f4 e4 18 ff c0    	inc    %rax,%rbx
> +\s*[a-f0-9]+:\s*62 f4 f4 10 f7 d8    	neg    %rax,%r17
> +\s*[a-f0-9]+:\s*62 9c 3c 18 f6 1c 27 	negb   \(%r31,%r12,1\),%r8b
> +\s*[a-f0-9]+:\s*62 f4 f4 10 f7 d0    	not    %rax,%r17
> +\s*[a-f0-9]+:\s*62 9c 3c 18 f6 14 27 	notb   \(%r31,%r12,1\),%r8b
> +\s*[a-f0-9]+:\s*62 f4 0d 10 81 c8 34 12 	or     \$0x1234,%ax,%r30w
> +\s*[a-f0-9]+:\s*62 7c 6c 10 08 f9    	or     %r15b,%r17b,%r18b
> +\s*[a-f0-9]+:\s*62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
> +\s*[a-f0-9]+:\s*62 c4 3c 18 0a 04 07 	or     \(%r15,%rax,1\),%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 c4 3d 18 0b 04 07 	or     \(%r15,%rax,1\),%r16w,%r8w
> +\s*[a-f0-9]+:\s*62 fc 5c 10 83 0c 83 11 	orl    \$0x11,\(%r19,%rax,4\),%r20d
> +\s*[a-f0-9]+:\s*62 d4 04 10 c0 d4 02 	rcl    \$0x2,%r12b,%r31b
> +\s*[a-f0-9]+:\s*62 fc 3c 18 d2 d0    	rcl    %cl,%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 f4 04 10 d0 10    	rclb   \$1,\(%rax\),%r31b
> +\s*[a-f0-9]+:\s*62 f4 04 10 c1 10 02 	rcll   \$0x2,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 f4 05 10 d1 10    	rclw   \$1,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 fc 05 10 d3 14 83 	rclw   %cl,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 d4 04 10 c0 dc 02 	rcr    \$0x2,%r12b,%r31b
> +\s*[a-f0-9]+:\s*62 fc 3c 18 d2 d8    	rcr    %cl,%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 f4 04 10 d0 18    	rcrb   \$1,\(%rax\),%r31b
> +\s*[a-f0-9]+:\s*62 f4 04 10 c1 18 02 	rcrl   \$0x2,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 f4 05 10 d1 18    	rcrw   \$1,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 fc 05 10 d3 1c 83 	rcrw   %cl,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 d4 04 10 c0 c4 02 	rol    \$0x2,%r12b,%r31b
> +\s*[a-f0-9]+:\s*62 fc 3c 18 d2 c0    	rol    %cl,%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 f4 04 10 d0 00    	rolb   \$1,\(%rax\),%r31b
> +\s*[a-f0-9]+:\s*62 f4 04 10 c1 00 02 	roll   \$0x2,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 f4 05 10 d1 00    	rolw   \$1,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 fc 05 10 d3 04 83 	rolw   %cl,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 d4 04 10 c0 cc 02 	ror    \$0x2,%r12b,%r31b
> +\s*[a-f0-9]+:\s*62 fc 3c 18 d2 c8    	ror    %cl,%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 f4 04 10 d0 08    	rorb   \$1,\(%rax\),%r31b
> +\s*[a-f0-9]+:\s*62 f4 04 10 c1 08 02 	rorl   \$0x2,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 f4 05 10 d1 08    	rorw   \$1,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 fc 05 10 d3 0c 83 	rorw   %cl,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 d4 04 10 c0 fc 02 	sar    \$0x2,%r12b,%r31b
> +\s*[a-f0-9]+:\s*62 fc 3c 18 d2 f8    	sar    %cl,%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 f4 04 10 d0 38    	sarb   \$1,\(%rax\),%r31b
> +\s*[a-f0-9]+:\s*62 f4 04 10 c1 38 02 	sarl   \$0x2,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 f4 05 10 d1 38    	sarw   \$1,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 fc 05 10 d3 3c 83 	sarw   %cl,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 f4 0d 10 81 d8 34 12 	sbb    \$0x1234,%ax,%r30w
> +\s*[a-f0-9]+:\s*62 7c 6c 10 18 f9    	sbb    %r15b,%r17b,%r18b
> +\s*[a-f0-9]+:\s*62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
> +\s*[a-f0-9]+:\s*62 c4 3c 18 1a 04 07 	sbb    \(%r15,%rax,1\),%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 c4 3d 18 1b 04 07 	sbb    \(%r15,%rax,1\),%r16w,%r8w
> +\s*[a-f0-9]+:\s*62 fc 5c 10 83 1c 83 11 	sbbl   \$0x11,\(%r19,%rax,4\),%r20d
> +\s*[a-f0-9]+:\s*62 d4 04 10 c0 e4 02 	shl    \$0x2,%r12b,%r31b
> +\s*[a-f0-9]+:\s*62 d4 04 10 c0 e4 02 	shl    \$0x2,%r12b,%r31b
> +\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e0    	shl    %cl,%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e0    	shl    %cl,%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 f4 04 10 d0 20    	shlb   \$1,\(%rax\),%r31b
> +\s*[a-f0-9]+:\s*62 f4 04 10 d0 20    	shlb   \$1,\(%rax\),%r31b
> +\s*[a-f0-9]+:\s*62 74 84 10 24 20 01 	shld   \$0x1,%r12,\(%rax\),%r31
> +\s*[a-f0-9]+:\s*62 74 04 10 24 38 02 	shld   \$0x2,%r15d,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 54 05 10 24 c4 02 	shld   \$0x2,%r8w,%r12w,%r31w
> +\s*[a-f0-9]+:\s*62 7c bc 18 a5 e0    	shld   %cl,%r12,%r16,%r8
> +\s*[a-f0-9]+:\s*62 7c 05 10 a5 2c 83 	shld   %cl,%r13w,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 74 05 10 a5 08    	shld   %cl,%r9w,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 f4 04 10 c1 20 02 	shll   \$0x2,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 f4 04 10 c1 20 02 	shll   \$0x2,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 f4 05 10 d1 20    	shlw   \$1,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 f4 05 10 d1 20    	shlw   \$1,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 fc 05 10 d3 24 83 	shlw   %cl,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 fc 05 10 d3 24 83 	shlw   %cl,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 d4 04 10 c0 ec 02 	shr    \$0x2,%r12b,%r31b
> +\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e8    	shr    %cl,%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 f4 04 10 d0 28    	shrb   \$1,\(%rax\),%r31b
> +\s*[a-f0-9]+:\s*62 74 84 10 2c 20 01 	shrd   \$0x1,%r12,\(%rax\),%r31
> +\s*[a-f0-9]+:\s*62 74 04 10 2c 38 02 	shrd   \$0x2,%r15d,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 54 05 10 2c c4 02 	shrd   \$0x2,%r8w,%r12w,%r31w
> +\s*[a-f0-9]+:\s*62 7c bc 18 ad e0    	shrd   %cl,%r12,%r16,%r8
> +\s*[a-f0-9]+:\s*62 7c 05 10 ad 2c 83 	shrd   %cl,%r13w,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 74 05 10 ad 08    	shrd   %cl,%r9w,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 f4 04 10 c1 28 02 	shrl   \$0x2,\(%rax\),%r31d
> +\s*[a-f0-9]+:\s*62 f4 05 10 d1 28    	shrw   \$1,\(%rax\),%r31w
> +\s*[a-f0-9]+:\s*62 fc 05 10 d3 2c 83 	shrw   %cl,\(%r19,%rax,4\),%r31w
> +\s*[a-f0-9]+:\s*62 f4 0d 10 81 e8 34 12 	sub    \$0x1234,%ax,%r30w
> +\s*[a-f0-9]+:\s*62 7c 6c 10 28 f9    	sub    %r15b,%r17b,%r18b
> +\s*[a-f0-9]+:\s*62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
> +\s*[a-f0-9]+:\s*62 c4 3c 18 2a 04 07 	sub    \(%r15,%rax,1\),%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 c4 3d 18 2b 04 07 	sub    \(%r15,%rax,1\),%r16w,%r8w
> +\s*[a-f0-9]+:\s*62 fc 5c 10 83 2c 83 11 	subl   \$0x11,\(%r19,%rax,4\),%r20d
> +\s*[a-f0-9]+:\s*62 f4 0d 10 81 f0 34 12 	xor    \$0x1234,%ax,%r30w
> +\s*[a-f0-9]+:\s*62 7c 6c 10 30 f9    	xor    %r15b,%r17b,%r18b
> +\s*[a-f0-9]+:\s*62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
> +\s*[a-f0-9]+:\s*62 c4 3c 18 32 04 07 	xor    \(%r15,%rax,1\),%r16b,%r8b
> +\s*[a-f0-9]+:\s*62 c4 3d 18 33 04 07 	xor    \(%r15,%rax,1\),%r16w,%r8w
> +\s*[a-f0-9]+:\s*62 fc 5c 10 83 34 83 11 	xorl   \$0x11,\(%r19,%rax,4\),%r20d
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd.s b/gas/testsuite/gas/i386/x86-64-apx-ndd.s
> new file mode 100644
> index 00000000000..4e248f737a9
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-ndd.s
> @@ -0,0 +1,155 @@
> +# Check 64bit APX NDD instructions with evex prefix encoding
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +	adc    $0x1234,%ax,%r30w
> +	adc    %r15b,%r17b,%r18b
> +	adc    %r15d,(%r8),%r18d
> +	adc    (%r15,%rax,1),%r16b,%r8b
> +	adc    (%r15,%rax,1),%r16w,%r8w
> +	adcl   $0x11,(%r19,%rax,4),%r20d
> +	adcx   %r15d,%r8d,%r18d
> +	adcx   (%r15,%r31,1),%r8
> +	adcx   (%r15,%r31,1),%r8d,%r18d
> +	add    $0x1234,%ax,%r30w
> +	add    $0x12344433,%r15,%r16
> +	add    $0x34,%r13b,%r17b
> +	add    $0xfffffffff4332211,%rax,%r8
> +	add    %r31,%r8,%r16
> +	add    %r31,(%r8),%r16
> +	add    %r31,(%r8,%r16,8),%r16
> +	add    %r31b,%r8b,%r16b
> +	add    %r31d,%r8d,%r16d
> +	add    %r31w,%r8w,%r16w
> +	add    (%r31),%r8,%r16
> +	add    0x9090(%r31,%r16,1),%r8,%r16
> +	addb   %r31b,%r8b,%r16b
> +	addl   %r31d,%r8d,%r16d
> +	addl   $0x11,(%r19,%rax,4),%r20d
> +	addq   %r31,%r8,%r16
> +	addq   $0x12344433,(%r15,%rcx,4),%r16
> +	addw   %r31w,%r8w,%r16w
> +	adox   %r15d,%r8d,%r18d
> +	{load}  add    %r31,%r8,%r16
> +	{store} add    %r31,%r8,%r16
> +	adox   (%r15,%r31,1),%r8
> +	adox   (%r15,%r31,1),%r8d,%r18d
> +	and    $0x1234,%ax,%r30w
> +	and    %r15b,%r17b,%r18b
> +	and    %r15d,(%r8),%r18d
> +	and    (%r15,%rax,1),%r16b,%r8b
> +	and    (%r15,%rax,1),%r16w,%r8w
> +	andl   $0x11,(%r19,%rax,4),%r20d
> +	cmova  0x90909090(%eax),%edx,%r8d
> +	cmovae 0x90909090(%eax),%edx,%r8d
> +	cmovb  0x90909090(%eax),%edx,%r8d
> +	cmovbe 0x90909090(%eax),%edx,%r8d
> +	cmove  0x90909090(%eax),%edx,%r8d
> +	cmovg  0x90909090(%eax),%edx,%r8d
> +	cmovge 0x90909090(%eax),%edx,%r8d
> +	cmovl  0x90909090(%eax),%edx,%r8d
> +	cmovle 0x90909090(%eax),%edx,%r8d
> +	cmovne 0x90909090(%eax),%edx,%r8d
> +	cmovno 0x90909090(%eax),%edx,%r8d
> +	cmovnp 0x90909090(%eax),%edx,%r8d
> +	cmovns 0x90909090(%eax),%edx,%r8d
> +	cmovo  0x90909090(%eax),%edx,%r8d
> +	cmovp  0x90909090(%eax),%edx,%r8d
> +	cmovs  0x90909090(%eax),%edx,%r8d
> +	dec    %rax,%r17
> +	decb   (%r31,%r12,1),%r8b
> +	imul   0x909(%rax,%r31,8),%rdx,%r25
> +	imul   0x90909(%eax),%edx,%r8d
> +	inc    %r31,%r16
> +	inc    %r31,%r8
> +	inc    %rax,%rbx
> +	neg    %rax,%r17
> +	negb   (%r31,%r12,1),%r8b
> +	not    %rax,%r17
> +	notb   (%r31,%r12,1),%r8b
> +	or     $0x1234,%ax,%r30w
> +	or     %r15b,%r17b,%r18b
> +	or     %r15d,(%r8),%r18d
> +	or     (%r15,%rax,1),%r16b,%r8b
> +	or     (%r15,%rax,1),%r16w,%r8w
> +	orl    $0x11,(%r19,%rax,4),%r20d
> +	rcl    $0x2,%r12b,%r31b
> +	rcl    %cl,%r16b,%r8b
> +	rclb   $0x1,(%rax),%r31b
> +	rcll   $0x2,(%rax),%r31d
> +	rclw   $0x1,(%rax),%r31w
> +	rclw   %cl,(%r19,%rax,4),%r31w
> +	rcr    $0x2,%r12b,%r31b
> +	rcr    %cl,%r16b,%r8b
> +	rcrb   $0x1,(%rax),%r31b
> +	rcrl   $0x2,(%rax),%r31d
> +	rcrw   $0x1,(%rax),%r31w
> +	rcrw   %cl,(%r19,%rax,4),%r31w
> +	rol    $0x2,%r12b,%r31b
> +	rol    %cl,%r16b,%r8b
> +	rolb   $0x1,(%rax),%r31b
> +	roll   $0x2,(%rax),%r31d
> +	rolw   $0x1,(%rax),%r31w
> +	rolw   %cl,(%r19,%rax,4),%r31w
> +	ror    $0x2,%r12b,%r31b
> +	ror    %cl,%r16b,%r8b
> +	rorb   $0x1,(%rax),%r31b
> +	rorl   $0x2,(%rax),%r31d
> +	rorw   $0x1,(%rax),%r31w
> +	rorw   %cl,(%r19,%rax,4),%r31w
> +	sar    $0x2,%r12b,%r31b
> +	sar    %cl,%r16b,%r8b
> +	sarb   $0x1,(%rax),%r31b
> +	sarl   $0x2,(%rax),%r31d
> +	sarw   $0x1,(%rax),%r31w
> +	sarw   %cl,(%r19,%rax,4),%r31w
> +	sbb    $0x1234,%ax,%r30w
> +	sbb    %r15b,%r17b,%r18b
> +	sbb    %r15d,(%r8),%r18d
> +	sbb    (%r15,%rax,1),%r16b,%r8b
> +	sbb    (%r15,%rax,1),%r16w,%r8w
> +	sbbl   $0x11,(%r19,%rax,4),%r20d
> +	shl    $0x2,%r12b,%r31b
> +	shl    $0x2,%r12b,%r31b
> +	shl    %cl,%r16b,%r8b
> +	shl    %cl,%r16b,%r8b
> +	shlb   $0x1,(%rax),%r31b
> +	shlb   $0x1,(%rax),%r31b
> +	shld   $0x1,%r12,(%rax),%r31
> +	shld   $0x2,%r15d,(%rax),%r31d
> +	shld   $0x2,%r8w,%r12w,%r31w
> +	shld   %cl,%r12,%r16,%r8
> +	shld   %cl,%r13w,(%r19,%rax,4),%r31w
> +	shld   %cl,%r9w,(%rax),%r31w
> +	shll   $0x2,(%rax),%r31d
> +	shll   $0x2,(%rax),%r31d
> +	shlw   $0x1,(%rax),%r31w
> +	shlw   $0x1,(%rax),%r31w
> +	shlw   %cl,(%r19,%rax,4),%r31w
> +	shlw   %cl,(%r19,%rax,4),%r31w
> +	shr    $0x2,%r12b,%r31b
> +	shr    %cl,%r16b,%r8b
> +	shrb   $0x1,(%rax),%r31b
> +	shrd   $0x1,%r12,(%rax),%r31
> +	shrd   $0x2,%r15d,(%rax),%r31d
> +	shrd   $0x2,%r8w,%r12w,%r31w
> +	shrd   %cl,%r12,%r16,%r8
> +	shrd   %cl,%r13w,(%r19,%rax,4),%r31w
> +	shrd   %cl,%r9w,(%rax),%r31w
> +	shrl   $0x2,(%rax),%r31d
> +	shrw   $0x1,(%rax),%r31w
> +	shrw   %cl,(%r19,%rax,4),%r31w
> +	sub    $0x1234,%ax,%r30w
> +	sub    %r15b,%r17b,%r18b
> +	sub    %r15d,(%r8),%r18d
> +	sub    (%r15,%rax,1),%r16b,%r8b
> +	sub    (%r15,%rax,1),%r16w,%r8w
> +	subl   $0x11,(%r19,%rax,4),%r20d
> +	xor    $0x1234,%ax,%r30w
> +	xor    %r15b,%r17b,%r18b
> +	xor    %r15d,(%r8),%r18d
> +	xor    (%r15,%rax,1),%r16b,%r8b
> +	xor    (%r15,%rax,1),%r16w,%r8w
> +	xorl   $0x11,(%r19,%rax,4),%r20d
> +
> diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.d b/gas/testsuite/gas/i386/x86-64-pseudos.d
> index 19dcd8415ac..c55e6f4b7aa 100644
> --- a/gas/testsuite/gas/i386/x86-64-pseudos.d
> +++ b/gas/testsuite/gas/i386/x86-64-pseudos.d
> @@ -137,6 +137,48 @@ Disassembly of section .text:
>   +[a-f0-9]+:	33 07                	xor    \(%rdi\),%eax
>   +[a-f0-9]+:	31 07                	xor    %eax,\(%rdi\)
>   +[a-f0-9]+:	33 07                	xor    \(%rdi\),%eax
> + +[a-f0-9]+:	62 44 fc 10 01 38    	add    %r31,\(%r8\),%r16
> + +[a-f0-9]+:	62 44 fc 10 03 38    	add    \(%r8\),%r31,%r16
> + +[a-f0-9]+:	62 44 fc 10 01 38    	add    %r31,\(%r8\),%r16
> + +[a-f0-9]+:	62 44 fc 10 03 38    	add    \(%r8\),%r31,%r16
> + +[a-f0-9]+:	62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 2b 38    	sub    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 2b 38    	sub    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 1b 38    	sbb    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 1b 38    	sbb    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 23 38    	and    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 23 38    	and    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 0b 38    	or     \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 0b 38    	or     \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 33 38    	xor    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 33 38    	xor    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 13 38    	adc    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
> + +[a-f0-9]+:	62 54 6c 10 13 38    	adc    \(%r8\),%r15d,%r18d
> + +[a-f0-9]+:	62 44 fc 10 01 f8    	add    %r31,%r8,%r16
> + +[a-f0-9]+:	62 5c fc 10 03 c7    	add    %r31,%r8,%r16
> + +[a-f0-9]+:	62 7c 6c 10 28 f9    	sub    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 c4 6c 10 2a cf    	sub    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 7c 6c 10 18 f9    	sbb    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 c4 6c 10 1a cf    	sbb    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 7c 6c 10 20 f9    	and    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 c4 6c 10 22 cf    	and    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 7c 6c 10 08 f9    	or     %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 c4 6c 10 0a cf    	or     %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 7c 6c 10 30 f9    	xor    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 c4 6c 10 32 cf    	xor    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 7c 6c 10 10 f9    	adc    %r15b,%r17b,%r18b
> + +[a-f0-9]+:	62 c4 6c 10 12 cf    	adc    %r15b,%r17b,%r18b
>   +[a-f0-9]+:	b0 12                	mov    \$0x12,%al
>   +[a-f0-9]+:	b8 45 03 00 00       	mov    \$0x345,%eax
>   +[a-f0-9]+:	b0 12                	mov    \$0x12,%al
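The register-only {store}/{load} pairs in the hunk above differ in the
opcode byte (01 vs 03, 28 vs 2a, ...) and in the ModRM byte (f8 vs c7,
f9 vs cf). A minimal sketch of that relationship follows. The helper
names are hypothetical, OPCODE_D = 0x02 is assumed to correspond to
gas's Opcode_D, and the sketch ignores the matching swap of the
register-extension bits in the second prefix byte (44 vs 5c).

```c
#include <stdint.h>

/* Direction bit of the opcode byte; assumed to match Opcode_D.  */
#define OPCODE_D 0x02u

/* Hypothetical helper: flip between the store and load forms.  */
uint8_t
flip_direction (uint8_t opcode)
{
  return (uint8_t) (opcode ^ OPCODE_D);
}

/* Hypothetical helper: exchange the reg and rm fields of a ModRM
   byte, keeping the mod field.  Only meaningful for the
   register-direct form (mod == 3).  */
uint8_t
swap_modrm_regs (uint8_t modrm)
{
  uint8_t reg = (modrm >> 3) & 7u;
  uint8_t rm = modrm & 7u;

  return (uint8_t) ((modrm & 0xc0u) | (rm << 3) | reg);
}
```

For example, flip_direction (0x01) yields 0x03 and swap_modrm_regs (0xf8)
yields 0xc7, the two byte changes between the {store} and {load} forms
of "add %r31,%r8,%r16".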
> diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.s b/gas/testsuite/gas/i386/x86-64-pseudos.s
> index 5a53c363615..041f98e1939 100644
> --- a/gas/testsuite/gas/i386/x86-64-pseudos.s
> +++ b/gas/testsuite/gas/i386/x86-64-pseudos.s
> @@ -134,6 +134,49 @@ _start:
>  	{load} xor (%rdi), %eax
>  	{store} xor %eax, (%rdi)
>  	{store} xor (%rdi), %eax
> +	{load}  add    %r31,(%r8),%r16
> +	{load}	add    (%r8),%r31,%r16
> +	{store} add    %r31,(%r8),%r16
> +	{store}	add    (%r8),%r31,%r16
> +	{load} 	sub    %r15d,(%r8),%r18d
> +	{load}	sub    (%r8),%r15d,%r18d
> +	{store} sub    %r15d,(%r8),%r18d
> +	{store} sub    (%r8),%r15d,%r18d
> +	{load} 	sbb    %r15d,(%r8),%r18d
> +	{load}	sbb    (%r8),%r15d,%r18d
> +	{store} sbb    %r15d,(%r8),%r18d
> +	{store} sbb    (%r8),%r15d,%r18d
> +	{load} 	and    %r15d,(%r8),%r18d
> +	{load}	and    (%r8),%r15d,%r18d
> +	{store} and    %r15d,(%r8),%r18d
> +	{store} and    (%r8),%r15d,%r18d
> +	{load} 	or     %r15d,(%r8),%r18d
> +	{load}	or     (%r8),%r15d,%r18d
> +	{store} or     %r15d,(%r8),%r18d
> +	{store} or     (%r8),%r15d,%r18d
> +	{load} 	xor    %r15d,(%r8),%r18d
> +	{load}	xor    (%r8),%r15d,%r18d
> +	{store} xor    %r15d,(%r8),%r18d
> +	{store} xor    (%r8),%r15d,%r18d
> +	{load} 	adc    %r15d,(%r8),%r18d
> +	{load}	adc    (%r8),%r15d,%r18d
> +	{store} adc    %r15d,(%r8),%r18d
> +	{store} adc    (%r8),%r15d,%r18d
> +
> +	{store} add    %r31,%r8,%r16
> +	{load}  add    %r31,%r8,%r16
> +	{store} sub    %r15b,%r17b,%r18b
> +	{load}	sub    %r15b,%r17b,%r18b
> +	{store}	sbb    %r15b,%r17b,%r18b
> +	{load}	sbb    %r15b,%r17b,%r18b
> +	{store}	and    %r15b,%r17b,%r18b
> +	{load}	and    %r15b,%r17b,%r18b
> +	{store}	or     %r15b,%r17b,%r18b
> +	{load}	or     %r15b,%r17b,%r18b
> +	{store}	xor    %r15b,%r17b,%r18b
> +	{load}	xor    %r15b,%r17b,%r18b
> +	{store}	adc    %r15b,%r17b,%r18b
> +	{load}	adc    %r15b,%r17b,%r18b
>  
>  	.irp m, mov, adc, add, and, cmp, or, sbb, sub, test, xor
>  	\m	$0x12, %al
> diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> index bfda747e02e..3a3438a5de3 100644
> --- a/gas/testsuite/gas/i386/x86-64.exp
> +++ b/gas/testsuite/gas/i386/x86-64.exp
> @@ -370,6 +370,7 @@ run_dump_test "x86-64-apx-rex2"
>  run_dump_test "x86-64-apx-evex-promoted"
>  run_dump_test "x86-64-apx-evex-promoted-intel"
>  run_dump_test "x86-64-apx-evex-egpr"
> +run_dump_test "x86-64-apx-ndd"
>  run_dump_test "x86-64-avx512f-rcigrz-intel"
>  run_dump_test "x86-64-avx512f-rcigrz"
>  run_dump_test "x86-64-clwb"
> diff --git a/opcodes/i386-dis-evex-reg.h b/opcodes/i386-dis-evex-reg.h
> index 2885063628b..cac3c39c4c5 100644
> --- a/opcodes/i386-dis-evex-reg.h
> +++ b/opcodes/i386-dis-evex-reg.h
> @@ -49,3 +49,57 @@
>      { "vscatterpf0qp%XW",  { MVexVSIBQWpX }, PREFIX_DATA },
>      { "vscatterpf1qp%XW",  { MVexVSIBQWpX }, PREFIX_DATA },
>    },
> +  /* REG_EVEX_MAP4_80 */
> +  {
> +    { "addA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "orA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "adcA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "sbbA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "andA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "subA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "xorA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +  },
> +  /* REG_EVEX_MAP4_81 */
> +  {
> +    { "addQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
> +    { "orQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
> +    { "adcQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
> +    { "sbbQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
> +    { "andQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
> +    { "subQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
> +    { "xorQ",	{ VexGv, Ev, Iv }, PREFIX_NP_OR_DATA },
> +  },
> +  /* REG_EVEX_MAP4_83 */
> +  {
> +    { "addQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
> +    { "orQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
> +    { "adcQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
> +    { "sbbQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
> +    { "andQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
> +    { "subQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
> +    { "xorQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
> +  },
> +  /* REG_EVEX_MAP4_F6 */
> +  {
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { "notA",	{ VexGb, Eb }, NO_PREFIX },
> +    { "negA",	{ VexGb, Eb }, NO_PREFIX },
> +  },
> +  /* REG_EVEX_MAP4_F7 */
> +  {
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { "notQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
> +    { "negQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
> +  },
> +  /* REG_EVEX_MAP4_FE */
> +  {
> +    { "incA",	{ VexGb, Eb }, NO_PREFIX },
> +    { "decA",	{ VexGb, Eb }, NO_PREFIX },
> +  },
> +  /* REG_EVEX_MAP4_FF */
> +  {
> +    { "incQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
> +    { "decQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
> +  },
> diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
> index 90c063b2188..a8a891d7f0e 100644
> --- a/opcodes/i386-dis-evex.h
> +++ b/opcodes/i386-dis-evex.h
> @@ -875,64 +875,64 @@ static const struct dis386 evex_table[][256] = {
>    /* EVEX_MAP4_ */
>    {
>      /* 00 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "addB",             { VexGb, Eb, Gb }, NO_PREFIX },
> +    { "addS",             { VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
> +    { "addB",             { VexGb, Gb, EbS }, NO_PREFIX },
> +    { "addS",             { VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 08 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "orB",		{ VexGb, Eb, Gb }, NO_PREFIX },
> +    { "orS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
> +    { "orB",		{ VexGb, Gb, EbS }, NO_PREFIX },
> +    { "orS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 10 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "adcB",		{ VexGb, Eb, Gb }, NO_PREFIX },
> +    { "adcS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
> +    { "adcB",		{ VexGb, Gb, EbS }, NO_PREFIX },
> +    { "adcS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 18 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "sbbB",		{ VexGb, Eb, Gb }, NO_PREFIX },
> +    { "sbbS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
> +    { "sbbB",		{ VexGb, Gb, EbS }, NO_PREFIX },
> +    { "sbbS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 20 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "andB",		{ VexGb, Eb, Gb }, NO_PREFIX },
> +    { "andS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
> +    { "andB",		{ VexGb, Gb, EbS }, NO_PREFIX },
> +    { "andS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
> +    { "shldS",		{ VexGv, Ev, Gv, Ib }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 28 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "subB",		{ VexGb, Eb, Gb }, NO_PREFIX },
> +    { "subS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
> +    { "subB",		{ VexGb, Gb, EbS }, NO_PREFIX },
> +    { "subS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
> +    { "shrdS",		{ VexGv, Ev, Gv, Ib }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 30 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "xorB",		{ VexGb, Eb, Gb }, NO_PREFIX },
> +    { "xorS",		{ VexGv, Ev, Gv }, PREFIX_NP_OR_DATA },
> +    { "xorB",		{ VexGb, Gb, EbS }, NO_PREFIX },
> +    { "xorS",		{ VexGv, Gv, EvS }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -947,23 +947,23 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 40 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "%CFcmovoS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovnoS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovbS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovaeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmoveS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovneS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovbeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovaS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
>      /* 48 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "%CFcmovsS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovnsS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovpS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovnpS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovlS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovgeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovleS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
> +    { "%CFcmovgS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
>      /* 50 */
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -1019,10 +1019,10 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 80 */
> +    { REG_TABLE (REG_EVEX_MAP4_80) },
> +    { REG_TABLE (REG_EVEX_MAP4_81) },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { REG_TABLE (REG_EVEX_MAP4_83) },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -1060,7 +1060,7 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "shldS",	{ VexGv, Ev, Gv, CL }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* A8 */
> @@ -1069,9 +1069,9 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> +    { "shrdS",	{ VexGv, Ev, Gv, CL }, PREFIX_NP_OR_DATA },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "imulS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
>      /* B0 */
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -1091,8 +1091,8 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* C0 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { REG_TABLE (REG_C0) },
> +    { REG_TABLE (REG_C1) },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -1109,10 +1109,10 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* D0 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { REG_TABLE (REG_D0) },
> +    { REG_TABLE (REG_D1) },
> +    { REG_TABLE (REG_D2) },
> +    { REG_TABLE (REG_D3) },
>      { "sha1rnds4",	{ XM, EXxmm, Ib }, NO_PREFIX },
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -1151,8 +1151,8 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { REG_TABLE (REG_EVEX_MAP4_F6) },
> +    { REG_TABLE (REG_EVEX_MAP4_F7) },
>      /* F8 */
>      { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
>      { "movdiri",	{ Mdq, Gdq }, NO_PREFIX },
> @@ -1160,8 +1160,8 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { PREFIX_TABLE (PREFIX_0F38FC) },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { REG_TABLE (REG_EVEX_MAP4_FE) },
> +    { REG_TABLE (REG_EVEX_MAP4_FF) },
>    },
>    /* EVEX_MAP5_ */
>    {
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index d4d32befcf9..1bb2882d839 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -226,6 +226,9 @@ struct instr_info
>    }
>    vex;
>  
> +/* For APX EVEX-promoted prefix, EVEX.ND shares the same bit as vex.b.  */
> +#define nd b
> +
>    enum evex_type evex_type;
>  
>    /* Remember if the current op is a jump instruction.  */
> @@ -578,6 +581,8 @@ fetch_error (const instr_info *ins)
>  #define VexGatherD { OP_VEX, vex_vsib_d_w_dq_mode }
>  #define VexGatherQ { OP_VEX, vex_vsib_q_w_dq_mode }
>  #define VexGdq { OP_VEX, dq_mode }
> +#define VexGb { OP_VEX, b_mode }
> +#define VexGv { OP_VEX, v_mode }
>  #define VexTmm { OP_VEX, tmm_mode }
>  #define XMVexI4 { OP_REG_VexI4, x_mode }
>  #define XMVexScalarI4 { OP_REG_VexI4, scalar_mode }
> @@ -892,6 +897,13 @@ enum
>    REG_EVEX_0F73,
>    REG_EVEX_0F38C6_L_2,
>    REG_EVEX_0F38C7_L_2,
> +  REG_EVEX_MAP4_80,
> +  REG_EVEX_MAP4_81,
> +  REG_EVEX_MAP4_83,
> +  REG_EVEX_MAP4_F6,
> +  REG_EVEX_MAP4_F7,
> +  REG_EVEX_MAP4_FE,
> +  REG_EVEX_MAP4_FF,
>  };
>  
>  enum
> @@ -2599,25 +2611,25 @@ static const struct dis386 reg_table[][8] = {
>    },
>    /* REG_C0 */
>    {
> -    { "rolA",	{ Eb, Ib }, 0 },
> -    { "rorA",	{ Eb, Ib }, 0 },
> -    { "rclA",	{ Eb, Ib }, 0 },
> -    { "rcrA",	{ Eb, Ib }, 0 },
> -    { "shlA",	{ Eb, Ib }, 0 },
> -    { "shrA",	{ Eb, Ib }, 0 },
> -    { "shlA",	{ Eb, Ib }, 0 },
> -    { "sarA",	{ Eb, Ib }, 0 },
> +    { "rolA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "rorA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "rclA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "rcrA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "shlA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "shrA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "shlA",	{ VexGb, Eb, Ib }, NO_PREFIX },
> +    { "sarA",	{ VexGb, Eb, Ib }, NO_PREFIX },
>    },
>    /* REG_C1 */
>    {
> -    { "rolQ",	{ Ev, Ib }, 0 },
> -    { "rorQ",	{ Ev, Ib }, 0 },
> -    { "rclQ",	{ Ev, Ib }, 0 },
> -    { "rcrQ",	{ Ev, Ib }, 0 },
> -    { "shlQ",	{ Ev, Ib }, 0 },
> -    { "shrQ",	{ Ev, Ib }, 0 },
> -    { "shlQ",	{ Ev, Ib }, 0 },
> -    { "sarQ",	{ Ev, Ib }, 0 },
> +    { "rolQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
> +    { "rorQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
> +    { "rclQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
> +    { "rcrQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
> +    { "shlQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
> +    { "shrQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
> +    { "shlQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
> +    { "sarQ",	{ VexGv, Ev, Ib }, PREFIX_NP_OR_DATA },
>    },
>    /* REG_C6 */
>    {
> @@ -2643,47 +2655,47 @@ static const struct dis386 reg_table[][8] = {
>    },
>    /* REG_D0 */
>    {
> -    { "rolA",	{ Eb, I1 }, 0 },
> -    { "rorA",	{ Eb, I1 }, 0 },
> -    { "rclA",	{ Eb, I1 }, 0 },
> -    { "rcrA",	{ Eb, I1 }, 0 },
> -    { "shlA",	{ Eb, I1 }, 0 },
> -    { "shrA",	{ Eb, I1 }, 0 },
> -    { "shlA",	{ Eb, I1 }, 0 },
> -    { "sarA",	{ Eb, I1 }, 0 },
> +    { "rolA",	{ VexGb, Eb, I1 }, NO_PREFIX },
> +    { "rorA",	{ VexGb, Eb, I1 }, NO_PREFIX },
> +    { "rclA",	{ VexGb, Eb, I1 }, NO_PREFIX },
> +    { "rcrA",	{ VexGb, Eb, I1 }, NO_PREFIX },
> +    { "shlA",	{ VexGb, Eb, I1 }, NO_PREFIX },
> +    { "shrA",	{ VexGb, Eb, I1 }, NO_PREFIX },
> +    { "shlA",	{ VexGb, Eb, I1 }, NO_PREFIX },
> +    { "sarA",	{ VexGb, Eb, I1 }, NO_PREFIX },
>    },
>    /* REG_D1 */
>    {
> -    { "rolQ",	{ Ev, I1 }, 0 },
> -    { "rorQ",	{ Ev, I1 }, 0 },
> -    { "rclQ",	{ Ev, I1 }, 0 },
> -    { "rcrQ",	{ Ev, I1 }, 0 },
> -    { "shlQ",	{ Ev, I1 }, 0 },
> -    { "shrQ",	{ Ev, I1 }, 0 },
> -    { "shlQ",	{ Ev, I1 }, 0 },
> -    { "sarQ",	{ Ev, I1 }, 0 },
> +    { "rolQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
> +    { "rorQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
> +    { "rclQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
> +    { "rcrQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
> +    { "shlQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
> +    { "shrQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
> +    { "shlQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
> +    { "sarQ",	{ VexGv, Ev, I1 }, PREFIX_NP_OR_DATA },
>    },
>    /* REG_D2 */
>    {
> -    { "rolA",	{ Eb, CL }, 0 },
> -    { "rorA",	{ Eb, CL }, 0 },
> -    { "rclA",	{ Eb, CL }, 0 },
> -    { "rcrA",	{ Eb, CL }, 0 },
> -    { "shlA",	{ Eb, CL }, 0 },
> -    { "shrA",	{ Eb, CL }, 0 },
> -    { "shlA",	{ Eb, CL }, 0 },
> -    { "sarA",	{ Eb, CL }, 0 },
> +    { "rolA",	{ VexGb, Eb, CL }, NO_PREFIX },
> +    { "rorA",	{ VexGb, Eb, CL }, NO_PREFIX },
> +    { "rclA",	{ VexGb, Eb, CL }, NO_PREFIX },
> +    { "rcrA",	{ VexGb, Eb, CL }, NO_PREFIX },
> +    { "shlA",	{ VexGb, Eb, CL }, NO_PREFIX },
> +    { "shrA",	{ VexGb, Eb, CL }, NO_PREFIX },
> +    { "shlA",	{ VexGb, Eb, CL }, NO_PREFIX },
> +    { "sarA",	{ VexGb, Eb, CL }, NO_PREFIX },
>    },
>    /* REG_D3 */
>    {
> -    { "rolQ",	{ Ev, CL }, 0 },
> -    { "rorQ",	{ Ev, CL }, 0 },
> -    { "rclQ",	{ Ev, CL }, 0 },
> -    { "rcrQ",	{ Ev, CL }, 0 },
> -    { "shlQ",	{ Ev, CL }, 0 },
> -    { "shrQ",	{ Ev, CL }, 0 },
> -    { "shlQ",	{ Ev, CL }, 0 },
> -    { "sarQ",	{ Ev, CL }, 0 },
> +    { "rolQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
> +    { "rorQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
> +    { "rclQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
> +    { "rcrQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
> +    { "shlQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
> +    { "shrQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
> +    { "shlQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
> +    { "sarQ",	{ VexGv, Ev, CL }, PREFIX_NP_OR_DATA },
>    },
>    /* REG_F6 */
>    {
> @@ -3633,8 +3645,8 @@ static const struct dis386 prefix_table[][4] = {
>    /* PREFIX_0F38F6 */
>    {
>      { "wrssK",	{ M, Gdq }, 0 },
> -    { "adoxS",	{ Gdq, Edq}, 0 },
> -    { "adcxS",	{ Gdq, Edq}, 0 },
> +    { "adoxS",	{ VexGdq, Gdq, Edq}, 0 },
> +    { "adcxS",	{ VexGdq, Gdq, Edq}, 0 },
>      { Bad_Opcode },
>    },
>  
> @@ -9120,6 +9132,12 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>  	  ins->rex2 &= ~REX_R;
>  	}
>  
> +      /* EVEX from legacy instructions, when the EVEX.ND bit is 0,
> +	 all bits of EVEX.vvvv and EVEX.V' must be 1.  */
> +      if (ins->evex_type == evex_from_legacy && !ins->vex.nd
> +	  && (ins->vex.register_specifier || !ins->vex.v))
> +	return &bad_opcode;
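
For anyone following the check above, here is a minimal standalone sketch of the same predicate. It assumes that register_specifier holds the inverted vvvv field (so zero means all vvvv bits are 1) and that v holds the raw EVEX.V' bit, which is what the comment and the condition together imply. The struct vex_bits type and the nd_encoding_is_valid name are hypothetical and are not part of i386-dis.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few decoder fields the check reads.
   Assumption: register_specifier is the inverted vvvv field, so 0
   means "all vvvv bits are 1"; v is the raw EVEX.V' bit.  */
struct vex_bits
{
  unsigned int register_specifier;
  bool v;
  bool nd;
};

/* Return false for the encodings the hunk rejects: ND == 0 while
   vvvv or V' still selects a register.  */
static bool
nd_encoding_is_valid (const struct vex_bits *vex)
{
  assert (vex != NULL);
  if (!vex->nd && (vex->register_specifier || !vex->v))
    return false;
  return true;
}
```

With ND set, any vvvv/V' value passes this particular check, since the field then names the new data destination.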
> +
>        ins->need_vex = 4;
>  
>        /* EVEX from legacy instructions require that EVEX.z, EVEX.L’L and the
> @@ -9137,8 +9155,10 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>        if (!fetch_modrm (ins))
>  	return &err_opcode;
>  
> -      /* Set vector length.  */
> -      if (ins->modrm.mod == 3 && ins->vex.b)
> +      /* Set vector length. For EVEX-promoted instructions, evex.ll == 0b00,
> +	 which has the same encoding as vex.length == 128 and they can share
> +	 the same processing with vex.length in OP_VEX.  */
> +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type != evex_from_legacy)
>  	ins->vex.length = 512;
>        else
>  	{
> @@ -9605,8 +9625,8 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>  	    }
>  
>  	  /* Check whether rounding control was enabled for an insn not
> -	     supporting it.  */
> -	  if (ins.modrm.mod == 3 && ins.vex.b
> +	     supporting it, when evex.b is not treated as evex.nd.  */
> +	  if (ins.modrm.mod == 3 && ins.vex.b && ins.evex_type == evex_default
>  	      && !(ins.evex_used & EVEX_b_used))
>  	    {
>  	      for (i = 0; i < MAX_OPERANDS; ++i)
> @@ -10499,16 +10519,23 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
>  	  ins->used_prefixes |= (ins->prefixes & PREFIX_ADDR);
>  	  break;
>  	case 'F':
> -	  if (ins->intel_syntax)
> -	    break;
> -	  if ((ins->prefixes & PREFIX_ADDR) || (sizeflag & SUFFIX_ALWAYS))
> +	  if (l == 0)
>  	    {
> -	      if (sizeflag & AFLAG)
> -		*ins->obufp++ = ins->address_mode == mode_64bit ? 'q' : 'l';
> -	      else
> -		*ins->obufp++ = ins->address_mode == mode_64bit ? 'l' : 'w';
> -	      ins->used_prefixes |= (ins->prefixes & PREFIX_ADDR);
> +	      if (ins->intel_syntax)
> +		break;
> +	      if ((ins->prefixes & PREFIX_ADDR) || (sizeflag & SUFFIX_ALWAYS))
> +		{
> +		  if (sizeflag & AFLAG)
> +		    *ins->obufp++ = ins->address_mode == mode_64bit ? 'q' : 'l';
> +		  else
> +		    *ins->obufp++ = ins->address_mode == mode_64bit ? 'l' : 'w';
> +		  ins->used_prefixes |= (ins->prefixes & PREFIX_ADDR);
> +		}
>  	    }
> +	  else if (l == 1 && last[0] == 'C')
> +	    break;
> +	  else
> +	    abort ();
>  	  break;
>  	case 'G':
>  	  if (ins->intel_syntax || (ins->obufp[-1] != 's'
> @@ -11072,7 +11099,8 @@ print_displacement (instr_info *ins, bfd_signed_vma val)
>  static void
>  intel_operand_size (instr_info *ins, int bytemode, int sizeflag)
>  {
> -  if (ins->vex.b)
> +  /* Check if there is a broadcast, when evex.b is not treated as evex.nd.  */
> +  if (ins->vex.b && ins->evex_type == evex_default)
>      {
>        if (!ins->vex.no_broadcast)
>  	switch (bytemode)
> @@ -11569,6 +11597,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  
>    add += (ins->rex2 & REX_B) ? 16 : 0;
>  
> +  /* Handles EVEX other than APX EVEX-promoted instructions.  */
>    if (ins->vex.evex && ins->evex_type == evex_default)
>      {
>  
> @@ -12004,7 +12033,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  	  print_operand_value (ins, disp & 0xffff, dis_style_text);
>  	}
>      }
> -  if (ins->vex.b)
> +  if (ins->vex.b && ins->evex_type == evex_default)
>      {
>        ins->evex_used |= EVEX_b_used;
>  
> @@ -13370,6 +13399,13 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>    if (!ins->need_vex)
>      return true;
>  
> +  if (ins->evex_type == evex_from_legacy)
> +    {
> +      ins->evex_used |= EVEX_b_used;
> +      if (!ins->vex.nd)
> +	return true;
> +    }
> +
>    reg = ins->vex.register_specifier;
>    ins->vex.register_specifier = 0;
>    if (ins->address_mode != mode_64bit)
> @@ -13461,12 +13497,19 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>  	  names = att_names_xmm;
>  	  ins->evex_used |= EVEX_len_used;
>  	  break;
> +	case v_mode:
>  	case dq_mode:
>  	  if (ins->rex & REX_W)
>  	    names = att_names64;
> +	  else if (bytemode == v_mode
> +		   && !(sizeflag & DFLAG))
> +	    names = att_names16;
>  	  else
>  	    names = att_names32;
>  	  break;
> +	case b_mode:
> +	  names = att_names8rex;
> +	  break;
>  	case mask_bd_mode:
>  	case mask_mode:
>  	  if (reg > 0x7)
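
The register-name selection added to OP_VEX above can be summarised with a small sketch. It only models the width decision (REX.W first, then the 16-bit case for v_mode without DFLAG, else 32-bit, and 8-bit for b_mode). The enum ndd_mode type and the ndd_reg_width function are hypothetical names for illustration; the real code picks a names table rather than returning a width.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the operand byte modes touched by the hunk.  */
enum ndd_mode { ndd_b_mode, ndd_v_mode, ndd_dq_mode };

/* Return the register width in bits that the hunk's switch would pick
   for the vvvv-encoded NDD operand.  */
static int
ndd_reg_width (enum ndd_mode bytemode, bool rex_w, bool dflag)
{
  switch (bytemode)
    {
    case ndd_b_mode:
      return 8;			/* att_names8rex */
    case ndd_v_mode:
    case ndd_dq_mode:
      if (rex_w)
	return 64;		/* att_names64 */
      if (bytemode == ndd_v_mode && !dflag)
	return 16;		/* att_names16 */
      return 32;		/* att_names32 */
    default:
      assert (0 && "unhandled byte mode");
      return 0;
    }
}
```

Note that dq_mode never yields 16 bits in this sketch, matching the original behaviour before v_mode was added to the same case.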
> diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> index 064ec48edad..9e8c827b934 100644
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -638,8 +638,10 @@ enum
>    Vex,
>    /* How to encode VEX.vvvv:
>       0: VEX.vvvv must be 1111b.
> -     1: VEX.vvvv encodes one of the register operands.
> +     1: VEX.vvvv encodes one of the src register operands.
> +     2: VEX.vvvv encodes the dest register operand.
>     */
> +#define VexVVVV_DST   2
>    VexVVVV,
>    /* How the VEX.W bit is used:
>       0: Set by the REX.W bit.
> @@ -776,7 +778,7 @@ typedef struct i386_opcode_modifier
>    unsigned int immext:1;
>    unsigned int norex64:1;
>    unsigned int vex:2;
> -  unsigned int vexvvvv:1;
> +  unsigned int vexvvvv:2;
>    unsigned int vexw:2;
>    unsigned int opcodeprefix:2;
>    unsigned int sib:3;
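
The widening of vexvvvv from 1 to 2 bits is needed because the new VexVVVV_DST value is 2. The following sketch shows what would happen otherwise, using two hypothetical miniature structs (modifier_1bit and modifier_2bit) in place of i386_opcode_modifier; only the VexVVVV_DST value is taken from the patch.

```c
#include <assert.h>

#define VexVVVV_DST 2

/* Hypothetical miniatures of the modifier struct, before and after.  */
struct modifier_1bit { unsigned int vexvvvv:1; };
struct modifier_2bit { unsigned int vexvvvv:2; };

/* Store VALUE in a 1-bit unsigned bit-field; the value is reduced
   modulo 2, so VexVVVV_DST would silently become 0.  */
static unsigned int
store_in_1bit (unsigned int value)
{
  struct modifier_1bit m;
  m.vexvvvv = value;
  return m.vexvvvv;
}

/* Store VALUE in a 2-bit unsigned bit-field; values 0..3 survive.  */
static unsigned int
store_in_2bit (unsigned int value)
{
  struct modifier_2bit m;
  assert (value <= 3);
  m.vexvvvv = value;
  return m.vexvvvv;
}
```

In the 1-bit layout, a DstVVVV template would be indistinguishable from one where VEX.vvvv must be 1111b.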
> diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> index 11b8c0b63cb..54c659099af 100644
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -140,12 +140,16 @@
>  
>  #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
>  
> +#define DstVVVV VexVVVV=VexVVVV_DST
> +
>  // The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
>  #define APX_F(cpuid) cpuid&(cpuid|APX_F)
>  
>  // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
>  // the bit to mark commutative VEX encodings where swapping the source
>  // operands may allow to switch from 3-byte to 2-byte VEX encoding.
> +// Also re-use the bit to mark NDD insns where swapping the source operands
> +// may allow switching from EVEX encoding to REX2 encoding.
>  #define C StaticRounding
>  
>  #define FP 387|287|8087
> @@ -292,26 +296,38 @@ std, 0xfd, 0, NoSuf, {}
>  sti, 0xfb, 0, NoSuf, {}
>  
>  // Arithmetic.
> +add, 0x0, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  add, 0x0, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +add, 0x83/0, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  add, 0x4, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
> +add, 0x80/0, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64}
>  add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
>  inc, 0x40, No64, No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32 }
> +inc, 0xfe/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, {Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64}
>  inc, 0xfe/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +sub, 0x28, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64, }
>  sub, 0x28, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +sub, 0x83/5, APX_F, Modrm|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  sub, 0x83/5, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  sub, 0x2c, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
> +sub, 0x80/5, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sub, 0x80/5, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
>  dec, 0x48, No64, No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32 }
> +dec, 0xfe/1, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  dec, 0xfe/1, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +sbb, 0x18, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sbb, 0x18, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +sbb, 0x83/3, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  sbb, 0x83/3, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  sbb, 0x1c, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
> +sbb, 0x80/3, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sbb, 0x80/3, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +sbb, 0x80/3, APX_F, W|Modrm|EVexMap4|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
>  cmp, 0x38, 0, D|W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  cmp, 0x83/7, 0, Modrm|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> @@ -322,30 +338,45 @@ test, 0x84, 0, D|W|C|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64, R
>  test, 0xa8, 0, W|No_sSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
>  test, 0xf6/0, 0, W|Modrm|No_sSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +and, 0x20, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  and, 0x20, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +and, 0x83/4, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  and, 0x83/4, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock|Optimize, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  and, 0x24, 0, W|No_sSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
> +and, 0x80/4, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  and, 0x80/4, 0, W|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +or, 0x8, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  or, 0x8, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +or, 0x83/1, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  or, 0x83/1, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  or, 0xc, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
> +or, 0x80/1, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  or, 0x80/1, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +xor, 0x30, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  xor, 0x30, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +xor, 0x83/6, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  xor, 0x83/6, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  xor, 0x34, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
> +xor, 0x80/6, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  xor, 0x80/6, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
>  // clr with 1 operand is really xor with 2 operands.
>  clr, 0x30, 0, W|Modrm|No_sSuf|RegKludge|Optimize, { Reg8|Reg16|Reg32|Reg64 }
>  
> +adc, 0x10, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  adc, 0x10, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +adc, 0x83/2, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  adc, 0x83/2, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  adc, 0x14, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
> +adc, 0x80/2, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +neg, 0xf6/3, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  neg, 0xf6/3, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +
> +not, 0xf6/2, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  not, 0xf6/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
>  aaa, 0x37, No64, NoSuf, {}
> @@ -379,6 +410,7 @@ cqto, 0x99, x64, Size64|NoSuf, {}
>  // These multiplies can only be selected with single operand forms.
>  mul, 0xf6/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  imul, 0xf6/5, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +imul, 0xaf, APX_F, C|Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64 }
>  imul, 0xfaf, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  imul, 0x6b, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  imul, 0x69, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm16|Imm32|Imm32S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> @@ -393,52 +425,90 @@ div, 0xf6/6, 0, W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspe
>  idiv, 0xf6/7, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  idiv, 0xf6/7, 0, W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
>  
> +rol, 0xd0/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rol, 0xd0/0, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rol, 0xc0/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rol, 0xc0/0, i186, W|Modrm|No_sSuf, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rol, 0xd2/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rol, 0xd2/0, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  rol, 0xd0/0, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +ror, 0xd0/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  ror, 0xd0/1, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +ror, 0xc0/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  ror, 0xc0/1, i186, W|Modrm|No_sSuf, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +ror, 0xd2/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  ror, 0xd2/1, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  ror, 0xd0/1, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rcl, 0xd0/2, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcl, 0xc0/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rcl, 0xc0/2, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcl, 0xc0/2, APX_F, W|Modrm|No_sSuf|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcl, 0xd2/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rcl, 0xd2/2, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcl, 0xd2/2, APX_F, W|Modrm|No_sSuf|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  rcl, 0xd0/2, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rcr, 0xd0/3, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcr, 0xc0/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rcr, 0xc0/3, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcr, 0xc0/3, APX_F, W|Modrm|No_sSuf|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcr, 0xd2/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  rcr, 0xd2/3, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcr, 0xd2/3, APX_F, W|Modrm|No_sSuf|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  rcr, 0xd0/3, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +sal, 0xd0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sal, 0xd0/4, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +sal, 0xc0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sal, 0xc0/4, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +sal, 0xd2/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sal, 0xd2/4, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  sal, 0xd0/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +shl, 0xd0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  shl, 0xd0/4, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +shl, 0xc0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  shl, 0xc0/4, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +shl, 0xd2/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  shl, 0xd2/4, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  shl, 0xd0/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +shr, 0xd0/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  shr, 0xd0/5, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +shr, 0xc0/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  shr, 0xc0/5, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +shr, 0xd2/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  shr, 0xd2/5, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  shr, 0xd0/5, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +sar, 0xd0/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sar, 0xd0/7, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +sar, 0xc0/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sar, 0xc0/7, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +sar, 0xd2/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|DstVVVV|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>  sar, 0xd2/7, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  sar, 0xd0/7, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +shld, 0x24, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  shld, 0xfa4, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +shld, 0xa5, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  shld, 0xfa5, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +shld, 0xa5, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  shld, 0xfa5, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
> +shrd, 0x2c, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  shrd, 0xfac, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +shrd, 0xad, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  shrd, 0xfad, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +shrd, 0xad, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  shrd, 0xfad, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>  
>  // Control transfer instructions.
> @@ -940,6 +1010,7 @@ ud2b, 0xfb9, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|U
>  // 3rd official undefined instr (older CPUs don't take a ModR/M byte)
>  ud0, 0xfff, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  
> +cmov<cc>, 0x4<cc:opc>, CMOV&APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64 }
>  cmov<cc>, 0xf4<cc:opc>, CMOV, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  
>  fcmovb, 0xda/0, i687, Modrm|NoSuf, { FloatReg, FloatAcc }
> @@ -2031,8 +2102,12 @@ xcryptofb, 0xf30fa7e8, PadLock, NoSuf|RepPrefixOk, {}
>  xstore, 0xfa7c0, PadLock, NoSuf|RepPrefixOk, {}
>  
>  // Multy-precision Add Carry, rdseed instructions.
> +adcx, 0x6666, ADX&APX_F, C|Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|DstVVVV|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
>  adcx, 0x660f38f6, ADX, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +adcx, 0x6666, ADX&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +adox, 0xf366, ADX&APX_F, C|Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|DstVVVV|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
>  adox, 0xf30f38f6, ADX, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> +adox, 0xf366, ADX&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
>  rdseed, 0xfc7/7, RdSeed, Modrm|NoSuf, { Reg16|Reg32|Reg64 }
>  
>  // SMAP instructions.
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.


* Re: [PATCH V5 6/9] Support APX Push2/Pop2
  2023-12-28  1:27 ` [PATCH V5 6/9] Support APX Push2/Pop2 Cui, Lili
@ 2023-12-28  1:55   ` H.J. Lu
  0 siblings, 0 replies; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:55 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich, Mo, Zewei

On Thu, Dec 28, 2023 at 01:27:11AM +0000, Cui, Lili wrote:
> From: "Mo, Zewei" <zewei.mo@intel.com>
> 
> PPX functionality for PUSH/POP is not implemented in this patch
> and will be implemented separately.
> 
> gas/ChangeLog:
> 
> 2023-12-28  Zewei Mo <zewei.mo@intel.com>
>             H.J. Lu  <hongjiu.lu@intel.com>
>             Lili Cui <lili.cui@intel.com>
> 
> 	* config/tc-i386.c (enum i386_error): New
> 	unsupported_rsp_register and invalid_dest_register_set.
> 	(md_assemble): Add handler for unsupported_rsp_register and
> 	invalid_dest_register_set.
> 	(check_APX_operands): Add invalid check for push2/pop2.
> 	(match_template): Handle check_APX_operands.
> 	* testsuite/gas/i386/i386.exp: Add apx-push2pop2 tests.
> 	* testsuite/gas/i386/x86-64.exp: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-push2pop2.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-push2pop2.s: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-push2pop2-intel.d: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-push2pop2-inval.l: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-push2pop2-inval.s: Ditto.
> 	* testsuite/gas/i386/apx-push2pop2-inval.s: Ditto.
> 	* testsuite/gas/i386/apx-push2pop2-inval.l: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Add bad
> 	test cases for POP2.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis-evex-reg.h: Add REG_EVEX_MAP4_8F.
> 	* i386-dis-evex-w.h: Add EVEX_W_MAP4_8F_R_0 and EVEX_W_MAP4_FF_R_6.
> 	* i386-dis-evex.h: Add REG_EVEX_MAP4_8F.
> 	* i386-dis.c (PUSH2_POP2_Fixup): Add special handling for PUSH2/POP2.
> 	(get_valid_dis386): Add handler for vector length and address_mode for
> 	APX-Push2/Pop2 insn.
> 	(nd): Define nd as b for EVEX-promoted instructions.
> 	(OP_VEX): Add handler of 64-bit vvvv register for APX-Push2/Pop2 insn.
> 	* i386-gen.c: Add Push2Pop2 bitfield.
> 	* i386-opc.h: Regenerated.
> 	* i386-opc.tbl: Regenerated.
> ---
>  gas/config/tc-i386.c                          | 44 +++++++++++++++++++
>  gas/testsuite/gas/i386/apx-push2pop2-inval.l  |  5 +++
>  gas/testsuite/gas/i386/apx-push2pop2-inval.s  |  9 ++++
>  gas/testsuite/gas/i386/i386.exp               |  1 +
>  .../gas/i386/x86-64-apx-evex-promoted-bad.d   |  5 +++
>  .../gas/i386/x86-64-apx-evex-promoted-bad.s   |  7 +++
>  .../gas/i386/x86-64-apx-push2pop2-intel.d     | 42 ++++++++++++++++++
>  .../gas/i386/x86-64-apx-push2pop2-inval.l     | 13 ++++++
>  .../gas/i386/x86-64-apx-push2pop2-inval.s     | 17 +++++++
>  gas/testsuite/gas/i386/x86-64-apx-push2pop2.d | 42 ++++++++++++++++++
>  gas/testsuite/gas/i386/x86-64-apx-push2pop2.s | 39 ++++++++++++++++
>  gas/testsuite/gas/i386/x86-64.exp             |  3 ++
>  opcodes/i386-dis-evex-reg.h                   |  9 ++++
>  opcodes/i386-dis-evex-w.h                     | 10 +++++
>  opcodes/i386-dis-evex.h                       |  2 +-
>  opcodes/i386-dis.c                            | 31 +++++++++++++
>  opcodes/i386-opc.tbl                          |  9 ++++
>  17 files changed, 287 insertions(+), 1 deletion(-)
>  create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.l
>  create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 99b484122e1..8af98e435ef 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -250,6 +250,7 @@ enum i386_error
>      invalid_vector_register_set,
>      invalid_tmm_register_set,
>      invalid_dest_and_src_register_set,
> +    invalid_dest_register_set,
>      invalid_pseudo_prefix,
>      unsupported_vector_index_register,
>      unsupported_broadcast,
> @@ -259,6 +260,7 @@ enum i386_error
>      no_default_mask,
>      unsupported_rc_sae,
>      unsupported_vector_size,
> +    unsupported_rsp_register,
>      internal_error,
>    };
>  
> @@ -5510,6 +5512,9 @@ md_assemble (char *line)
>  	case invalid_dest_and_src_register_set:
>  	  err_msg = _("destination and source registers must be distinct");
>  	  break;
> +	case invalid_dest_register_set:
> +	  err_msg = _("two dest registers must be distinct");
> +	  break;
>  	case invalid_pseudo_prefix:
>  	  err_msg = _("rex2 pseudo prefix cannot be used");
>  	  break;
> @@ -5538,6 +5543,9 @@ md_assemble (char *line)
>  	  as_bad (_("vector size above %u required for `%s'"), 128u << vector_size,
>  		  pass1_mnem ? pass1_mnem : insn_name (current_templates.start));
>  	  return;
> +	case unsupported_rsp_register:
> +	  err_msg = _("'rsp' register cannot be used");
> +	  break;
>  	case internal_error:
>  	  err_msg = _("internal error");
>  	  break;
> @@ -7174,6 +7182,35 @@ check_EgprOperands (const insn_template *t)
>    return 0;
>  }
>  
> +/* Check if APX operands are valid for the instruction.  */
> +static bool
> +check_APX_operands (const insn_template *t)
> +{
> +  /* Push2* and Pop2* cannot use RSP, and Pop2* cannot pop into the same
> +     register twice.  */
> +  switch (t->mnem_off)
> +    {
> +    case MN_pop2:
> +    case MN_pop2p:
> +      if (register_number (i.op[0].regs) == register_number (i.op[1].regs))
> +	{
> +	  i.error = invalid_dest_register_set;
> +	  return 1;
> +	}
> +    /* fall through */
> +    case MN_push2:
> +    case MN_push2p:
> +      if (register_number (i.op[0].regs) == 4
> +	  || register_number (i.op[1].regs) == 4)
> +	{
> +	  i.error = unsupported_rsp_register;
> +	  return 1;
> +	}
> +      break;
> +    }
> +  return 0;
> +}
> +
>  /* Helper function for the progress() macro in match_template().  */
>  static INLINE enum i386_error progress (enum i386_error new,
>  					enum i386_error last,
> @@ -7674,6 +7711,13 @@ match_template (char mnem_suffix)
>  	  continue;
>  	}
>  
> +      /* Check if APX operands are valid.  */
> +      if (check_APX_operands (t))
> +	{
> +	  specific_error = progress (i.error);
> +	  continue;
> +	}
> +
>        /* Check whether to use the shorter VEX encoding for certain insns where
>  	 the EVEX encoding comes first in the table.  This requires the respective
>  	 AVX-* feature to be explicitly enabled.
> diff --git a/gas/testsuite/gas/i386/apx-push2pop2-inval.l b/gas/testsuite/gas/i386/apx-push2pop2-inval.l
> new file mode 100644
> index 00000000000..a55a71520c8
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/apx-push2pop2-inval.l
> @@ -0,0 +1,5 @@
> +.* Assembler messages:
> +.*:6: Error: `push2' is only supported in 64-bit mode
> +.*:7: Error: `push2p' is only supported in 64-bit mode
> +.*:8: Error: `pop2' is only supported in 64-bit mode
> +.*:9: Error: `pop2p' is only supported in 64-bit mode
> diff --git a/gas/testsuite/gas/i386/apx-push2pop2-inval.s b/gas/testsuite/gas/i386/apx-push2pop2-inval.s
> new file mode 100644
> index 00000000000..77166327ed1
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/apx-push2pop2-inval.s
> @@ -0,0 +1,9 @@
> +# Check 32bit APX-PUSH2/POP2 instructions
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +	push2 %rax, %rbx
> +	push2p %rax, %rbx
> +	pop2 %rax, %rbx
> +	pop2p %rax, %rbx
> diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
> index 3917be6be70..f9ee85b4bb3 100644
> --- a/gas/testsuite/gas/i386/i386.exp
> +++ b/gas/testsuite/gas/i386/i386.exp
> @@ -511,6 +511,7 @@ if [gas_32_check] then {
>      run_dump_test "sm4-intel"
>      run_list_test "pbndkb-inval"
>      run_list_test "user_msr-inval"
> +    run_list_test "apx-push2pop2-inval"
>      run_list_test "sg"
>      run_dump_test "clzero"
>      run_dump_test "invlpgb"
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
> index ba14736c3a8..3bfb5dec202 100644
> --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
> @@ -34,3 +34,8 @@ Disassembly of section .text:
>  [ 	]*[a-f0-9]+:[ 	]+62 f4 e4[ 	]+\(bad\)
>  [ 	]*[a-f0-9]+:[ 	]+08 ff[ 	]+.*
>  [ 	]*[a-f0-9]+:[ 	]+04 08[ 	]+.*
> +[ 	]*[a-f0-9]+:[ 	]+62 f4 3c[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+08 8f c0 ff ff ff[ 	]+or.*
> +[ 	]*[a-f0-9]+:[ 	]+62 74 7c 18 8f c0[ 	]+pop2   %rax,\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+62 d4 3c 18 8f[ 	]+\(bad\)
> +[ 	]*[a-f0-9]+:[ 	]+c0[ 	]+.*
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> index fcbb1b93659..fde6736e9b2 100644
> --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> @@ -40,3 +40,10 @@ _start:
>  
>  	#{evex} inc %rax %rbx EVEX.vvvv != 1111 && EVEX.ND = 0.
>  	.insn EVEX.L0.NP.M4.W1 0xff/0, (%rax,%rcx), %rbx
> +	# pop2 %rax, %r8 set EVEX.ND=0.
> +	.insn EVEX.L0.M4.W0 0x8f/0,  %rax, %r8
> +	.byte 0xff, 0xff, 0xff
> +	# pop2 %rax, %r8 set EVEX.vvvv = 1111.
> +	.insn EVEX.L0.M4.W0 0x8f,  %rax, {rn-sae},%r8
> +	# pop2 %r8, %r8.
> +	.insn EVEX.L0.M4.W0 0x8f/0,  %r8,{rn-sae}, %r8
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
> new file mode 100644
> index 00000000000..46b21219582
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
> @@ -0,0 +1,42 @@
> +#as: --64
> +#objdump: -dw -Mintel
> +#name: x86_64 APX-push2pop2 insns (Intel disassembly)
> +#source: x86-64-apx-push2pop2.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*62 f4 7c 18 ff f3\s+push2\s+rax,rbx
> +\s*[a-f0-9]+:\s*62 fc 3c 18 ff f1\s+push2\s+r8,r17
> +\s*[a-f0-9]+:\s*62 d4 04 10 ff f1\s+push2\s+r31,r9
> +\s*[a-f0-9]+:\s*62 dc 3c 10 ff f7\s+push2\s+r24,r31
> +\s*[a-f0-9]+:\s*62 f4 fc 18 ff f3\s+push2p\s+rax,rbx
> +\s*[a-f0-9]+:\s*62 fc bc 18 ff f1\s+push2p\s+r8,r17
> +\s*[a-f0-9]+:\s*62 d4 84 10 ff f1\s+push2p\s+r31,r9
> +\s*[a-f0-9]+:\s*62 dc bc 10 ff f7\s+push2p\s+r24,r31
> +\s*[a-f0-9]+:\s*62 f4 64 18 8f c0\s+pop2\s+rbx,rax
> +\s*[a-f0-9]+:\s*62 d4 74 10 8f c0\s+pop2\s+r17,r8
> +\s*[a-f0-9]+:\s*62 dc 34 18 8f c7\s+pop2\s+r9,r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 8f c0\s+pop2\s+r31,r24
> +\s*[a-f0-9]+:\s*62 f4 e4 18 8f c0\s+pop2p\s+rbx,rax
> +\s*[a-f0-9]+:\s*62 d4 f4 10 8f c0\s+pop2p\s+r17,r8
> +\s*[a-f0-9]+:\s*62 dc b4 18 8f c7\s+pop2p\s+r9,r31
> +\s*[a-f0-9]+:\s*62 dc 84 10 8f c0\s+pop2p\s+r31,r24
> +\s*[a-f0-9]+:\s*62 f4 7c 18 ff f3\s+push2\s+rax,rbx
> +\s*[a-f0-9]+:\s*62 fc 3c 18 ff f1\s+push2\s+r8,r17
> +\s*[a-f0-9]+:\s*62 d4 04 10 ff f1\s+push2\s+r31,r9
> +\s*[a-f0-9]+:\s*62 dc 3c 10 ff f7\s+push2\s+r24,r31
> +\s*[a-f0-9]+:\s*62 f4 fc 18 ff f3\s+push2p\s+rax,rbx
> +\s*[a-f0-9]+:\s*62 fc bc 18 ff f1\s+push2p\s+r8,r17
> +\s*[a-f0-9]+:\s*62 d4 84 10 ff f1\s+push2p\s+r31,r9
> +\s*[a-f0-9]+:\s*62 dc bc 10 ff f7\s+push2p\s+r24,r31
> +\s*[a-f0-9]+:\s*62 f4 64 18 8f c0\s+pop2\s+rbx,rax
> +\s*[a-f0-9]+:\s*62 d4 74 10 8f c0\s+pop2\s+r17,r8
> +\s*[a-f0-9]+:\s*62 dc 34 18 8f c7\s+pop2\s+r9,r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 8f c0\s+pop2\s+r31,r24
> +\s*[a-f0-9]+:\s*62 f4 e4 18 8f c0\s+pop2p\s+rbx,rax
> +\s*[a-f0-9]+:\s*62 d4 f4 10 8f c0\s+pop2p\s+r17,r8
> +\s*[a-f0-9]+:\s*62 dc b4 18 8f c7\s+pop2p\s+r9,r31
> +\s*[a-f0-9]+:\s*62 dc 84 10 8f c0\s+pop2p\s+r31,r24
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
> new file mode 100644
> index 00000000000..2cd142885a1
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
> @@ -0,0 +1,13 @@
> +.* Assembler messages:
> +.*:6: Error: operand size mismatch for `push2'
> +.*:7: Error: operand size mismatch for `push2'
> +.*:8: Error: 'rsp' register cannot be used for `push2'
> +.*:9: Error: 'rsp' register cannot be used for `push2'
> +.*:10: Error: operand size mismatch for `push2p'
> +.*:11: Error: 'rsp' register cannot be used for `push2p'
> +.*:12: Error: operand size mismatch for `pop2'
> +.*:13: Error: 'rsp' register cannot be used for `pop2'
> +.*:14: Error: 'rsp' register cannot be used for `pop2'
> +.*:15: Error: two dest registers must be distinct for `pop2'
> +.*:16: Error: 'rsp' register cannot be used for `pop2p'
> +.*:17: Error: two dest registers must be distinct for `pop2p'
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
> new file mode 100644
> index 00000000000..83cef97d57e
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
> @@ -0,0 +1,17 @@
> +# Check illegal APX-Push2Pop2 instructions
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +	push2  %ax, %bx
> +	push2  %eax, %ebx
> +	push2  %rsp, %r17
> +	push2  %r17, %rsp
> +	push2p %eax, %ebx
> +	push2p %rsp, %r17
> +	pop2   %ax, %bx
> +	pop2   %rax, %rsp
> +	pop2   %rsp, %rax
> +	pop2   %r12, %r12
> +	pop2p  %rax, %rsp
> +	pop2p  %r12, %r12
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2.d b/gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
> new file mode 100644
> index 00000000000..54f22a7f94e
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
> @@ -0,0 +1,42 @@
> +#as: --64
> +#objdump: -dw
> +#name: x86_64 APX-push2pop2 insns
> +#source: x86-64-apx-push2pop2.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*62 f4 7c 18 ff f3\s+push2\s+%rbx,%rax
> +\s*[a-f0-9]+:\s*62 fc 3c 18 ff f1\s+push2\s+%r17,%r8
> +\s*[a-f0-9]+:\s*62 d4 04 10 ff f1\s+push2\s+%r9,%r31
> +\s*[a-f0-9]+:\s*62 dc 3c 10 ff f7\s+push2\s+%r31,%r24
> +\s*[a-f0-9]+:\s*62 f4 fc 18 ff f3\s+push2p\s+%rbx,%rax
> +\s*[a-f0-9]+:\s*62 fc bc 18 ff f1\s+push2p\s+%r17,%r8
> +\s*[a-f0-9]+:\s*62 d4 84 10 ff f1\s+push2p\s+%r9,%r31
> +\s*[a-f0-9]+:\s*62 dc bc 10 ff f7\s+push2p\s+%r31,%r24
> +\s*[a-f0-9]+:\s*62 f4 64 18 8f c0\s+pop2\s+%rax,%rbx
> +\s*[a-f0-9]+:\s*62 d4 74 10 8f c0\s+pop2\s+%r8,%r17
> +\s*[a-f0-9]+:\s*62 dc 34 18 8f c7\s+pop2\s+%r31,%r9
> +\s*[a-f0-9]+:\s*62 dc 04 10 8f c0\s+pop2\s+%r24,%r31
> +\s*[a-f0-9]+:\s*62 f4 e4 18 8f c0\s+pop2p\s+%rax,%rbx
> +\s*[a-f0-9]+:\s*62 d4 f4 10 8f c0\s+pop2p\s+%r8,%r17
> +\s*[a-f0-9]+:\s*62 dc b4 18 8f c7\s+pop2p\s+%r31,%r9
> +\s*[a-f0-9]+:\s*62 dc 84 10 8f c0\s+pop2p\s+%r24,%r31
> +\s*[a-f0-9]+:\s*62 f4 7c 18 ff f3\s+push2\s+%rbx,%rax
> +\s*[a-f0-9]+:\s*62 fc 3c 18 ff f1\s+push2\s+%r17,%r8
> +\s*[a-f0-9]+:\s*62 d4 04 10 ff f1\s+push2\s+%r9,%r31
> +\s*[a-f0-9]+:\s*62 dc 3c 10 ff f7\s+push2\s+%r31,%r24
> +\s*[a-f0-9]+:\s*62 f4 fc 18 ff f3\s+push2p\s+%rbx,%rax
> +\s*[a-f0-9]+:\s*62 fc bc 18 ff f1\s+push2p\s+%r17,%r8
> +\s*[a-f0-9]+:\s*62 d4 84 10 ff f1\s+push2p\s+%r9,%r31
> +\s*[a-f0-9]+:\s*62 dc bc 10 ff f7\s+push2p\s+%r31,%r24
> +\s*[a-f0-9]+:\s*62 f4 64 18 8f c0\s+pop2\s+%rax,%rbx
> +\s*[a-f0-9]+:\s*62 d4 74 10 8f c0\s+pop2\s+%r8,%r17
> +\s*[a-f0-9]+:\s*62 dc 34 18 8f c7\s+pop2\s+%r31,%r9
> +\s*[a-f0-9]+:\s*62 dc 04 10 8f c0\s+pop2\s+%r24,%r31
> +\s*[a-f0-9]+:\s*62 f4 e4 18 8f c0\s+pop2p\s+%rax,%rbx
> +\s*[a-f0-9]+:\s*62 d4 f4 10 8f c0\s+pop2p\s+%r8,%r17
> +\s*[a-f0-9]+:\s*62 dc b4 18 8f c7\s+pop2p\s+%r31,%r9
> +\s*[a-f0-9]+:\s*62 dc 84 10 8f c0\s+pop2p\s+%r24,%r31
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-push2pop2.s b/gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
> new file mode 100644
> index 00000000000..5c28c13ba2e
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
> @@ -0,0 +1,39 @@
> +# Check 64bit APX-Push2Pop2 instructions
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +	push2 %rbx, %rax
> +	push2 %r17, %r8
> +	push2 %r9, %r31
> +	push2 %r31, %r24
> +	push2p %rbx, %rax
> +	push2p %r17, %r8
> +	push2p %r9, %r31
> +	push2p %r31, %r24
> +	pop2 %rax, %rbx
> +	pop2 %r8, %r17
> +	pop2 %r31, %r9
> +	pop2 %r24, %r31
> +	pop2p %rax, %rbx
> +	pop2p %r8, %r17
> +	pop2p %r31, %r9
> +	pop2p %r24, %r31
> +
> +	.intel_syntax noprefix
> +	push2 rax, rbx
> +	push2 r8, r17
> +	push2 r31, r9
> +	push2 r24, r31
> +	push2p rax, rbx
> +	push2p r8, r17
> +	push2p r31, r9
> +	push2p r24, r31
> +	pop2 rbx, rax
> +	pop2 r17, r8
> +	pop2 r9, r31
> +	pop2 r31, r24
> +	pop2p rbx, rax
> +	pop2p r17, r8
> +	pop2p r9, r31
> +	pop2p r31, r24
> diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> index 3a3438a5de3..0e7b5d0c073 100644
> --- a/gas/testsuite/gas/i386/x86-64.exp
> +++ b/gas/testsuite/gas/i386/x86-64.exp
> @@ -345,6 +345,9 @@ run_dump_test "x86-64-avx512dq-rcigrd-intel"
>  run_dump_test "x86-64-avx512dq-rcigrd"
>  run_dump_test "x86-64-avx512dq-rcigrne-intel"
>  run_dump_test "x86-64-avx512dq-rcigrne"
> +run_dump_test "x86-64-apx-push2pop2"
> +run_dump_test "x86-64-apx-push2pop2-intel"
> +run_list_test "x86-64-apx-push2pop2-inval"
>  run_dump_test "x86-64-avx512dq-rcigru-intel"
>  run_dump_test "x86-64-avx512dq-rcigru"
>  run_dump_test "x86-64-avx512dq-rcigrz-intel"
> diff --git a/opcodes/i386-dis-evex-reg.h b/opcodes/i386-dis-evex-reg.h
> index cac3c39c4c5..81bb41646c5 100644
> --- a/opcodes/i386-dis-evex-reg.h
> +++ b/opcodes/i386-dis-evex-reg.h
> @@ -79,6 +79,10 @@
>      { "subQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
>      { "xorQ",	{ VexGv, Ev, sIb }, PREFIX_NP_OR_DATA },
>    },
> +  /* REG_EVEX_MAP4_8F */
> +  {
> +    { VEX_W_TABLE (EVEX_W_MAP4_8F_R_0) },
> +  },
>    /* REG_EVEX_MAP4_F6 */
>    {
>      { Bad_Opcode },
> @@ -102,4 +106,9 @@
>    {
>      { "incQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
>      { "decQ",	{ VexGv, Ev }, PREFIX_NP_OR_DATA },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { VEX_W_TABLE (EVEX_W_MAP4_FF_R_6) },
>    },
> diff --git a/opcodes/i386-dis-evex-w.h b/opcodes/i386-dis-evex-w.h
> index b828277d413..12ab29544bb 100644
> --- a/opcodes/i386-dis-evex-w.h
> +++ b/opcodes/i386-dis-evex-w.h
> @@ -442,6 +442,16 @@
>      { Bad_Opcode },
>      { "vpshrdw",   { XM, Vex, EXx, Ib }, 0 },
>    },
> +  /* EVEX_W_MAP4_8F_R_0 */
> +  {
> +    { "pop2", { { PUSH2_POP2_Fixup, q_mode}, Eq }, NO_PREFIX },
> +    { "pop2p", { { PUSH2_POP2_Fixup, q_mode}, Eq }, NO_PREFIX },
> +  },
> +  /* EVEX_W_MAP4_FF_R_6 */
> +  {
> +    { "push2", { { PUSH2_POP2_Fixup, q_mode}, Eq }, 0 },
> +    { "push2p", { { PUSH2_POP2_Fixup, q_mode}, Eq }, 0 },
> +  },
>    /* EVEX_W_MAP5_5B_P_0 */
>    {
>      { "vcvtdq2ph%XY",	{ XMxmmq, EXx, EXxEVexR }, 0 },
> diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
> index a8a891d7f0e..4f2ec966457 100644
> --- a/opcodes/i386-dis-evex.h
> +++ b/opcodes/i386-dis-evex.h
> @@ -1035,7 +1035,7 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> +    { REG_TABLE (REG_EVEX_MAP4_8F) },
>      /* 90 */
>      { Bad_Opcode },
>      { Bad_Opcode },
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index 1bb2882d839..9daef6fa9fd 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -105,6 +105,7 @@ static bool FXSAVE_Fixup (instr_info *, int, int);
>  static bool MOVSXD_Fixup (instr_info *, int, int);
>  static bool DistinctDest_Fixup (instr_info *, int, int);
>  static bool PREFETCHI_Fixup (instr_info *, int, int);
> +static bool PUSH2_POP2_Fixup (instr_info *, int, int);
>  
>  static void ATTRIBUTE_PRINTF_3 i386_dis_printf (const disassemble_info *,
>  						enum disassembler_style,
> @@ -900,6 +901,7 @@ enum
>    REG_EVEX_MAP4_80,
>    REG_EVEX_MAP4_81,
>    REG_EVEX_MAP4_83,
> +  REG_EVEX_MAP4_8F,
>    REG_EVEX_MAP4_F6,
>    REG_EVEX_MAP4_F7,
>    REG_EVEX_MAP4_FE,
> @@ -1739,6 +1741,9 @@ enum
>    EVEX_W_0F3A70,
>    EVEX_W_0F3A72,
>  
> +  EVEX_W_MAP4_8F_R_0,
> +  EVEX_W_MAP4_FF_R_6,
> +
>    EVEX_W_MAP5_5B_P_0,
>    EVEX_W_MAP5_7A_P_3,
>  };
> @@ -13510,6 +13515,9 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>  	case b_mode:
>  	  names = att_names8rex;
>  	  break;
> +	case q_mode:
> +	  names = att_names64;
> +	  break;
>  	case mask_bd_mode:
>  	case mask_mode:
>  	  if (reg > 0x7)
> @@ -13894,3 +13902,26 @@ PREFETCHI_Fixup (instr_info *ins, int bytemode, int sizeflag)
>  
>    return OP_M (ins, bytemode, sizeflag);
>  }
> +
> +static bool
> +PUSH2_POP2_Fixup (instr_info *ins, int bytemode, int sizeflag)
> +{
> +  if (ins->modrm.mod != 3)
> +    return true;
> +
> +  unsigned int vvvv_reg = ins->vex.register_specifier
> +    | (!ins->vex.v << 4);
> +  unsigned int rm_reg = ins->modrm.rm + (ins->rex & REX_B ? 8 : 0)
> +    + (ins->rex2 & REX_B ? 16 : 0);
> +
> +  /* Push2/Pop2 cannot use RSP and Pop2 cannot pop two same registers.  */
> +  if (!ins->vex.nd || vvvv_reg == 0x4 || rm_reg == 0x4
> +      || (!ins->modrm.reg
> +	  && vvvv_reg == rm_reg))
> +    {
> +      oappend (ins, "(bad)");
> +      return true;
> +    }
> +
> +  return OP_VEX (ins, bytemode, sizeflag);
> +}
> diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> index 54c659099af..900ca36d286 100644
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3480,3 +3480,12 @@ uwrmsr, 0xf30f38f8, USER_MSR, Modrm|NoSuf|NoRex64, { Reg64, Reg64 }
>  uwrmsr, 0xf3f8/0, USER_MSR, Modrm|Vex128|VexMap7|VexW0|NoSuf, { Imm32, Reg64 }
>  
>  // USER_MSR instructions end.
> +
> +// APX Push2/Pop2 instructions.
> +
> +push2, 0xff/6, APX_F, Modrm|VexW0|EVex128|EVexMap4|VexVVVV|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
> +push2p, 0xff/6, APX_F, Modrm|VexW1|EVex128|EVexMap4|VexVVVV|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
> +pop2, 0x8f/0, APX_F, Modrm|VexW0|EVex128|EVexMap4|VexVVVV|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
> +pop2p, 0x8f/0, APX_F, Modrm|VexW1|EVex128|EVexMap4|VexVVVV|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
> +
> +// APX Push2/Pop2 instructions end.
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.
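
For readers following the thread, the operand check in PUSH2_POP2_Fixup above can be sketched in isolation. This is only an illustration: the struct and function names below are hypothetical, the register numbers are assumed to be already decoded (0..31), and the real fixup only performs the check when modrm.mod == 3.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the decoded fields.  This is not the
   real instr_info from opcodes/i386-dis.c.  */
struct push2pop2_fields
{
  unsigned int vvvv_reg;  /* Register selected by EVEX.vvvv and V4, 0..31.  */
  unsigned int rm_reg;    /* Register selected by ModRM.rm plus extension bits.  */
  bool nd;                /* EVEX.ND bit.  */
  bool is_pop2;           /* True for pop2/pop2p (ModRM.reg == 0).  */
};

/* Return true when the operands pass the checks quoted in the patch:
   ND must be set, neither register may be RSP (number 4), and pop2
   must not pop into the same register twice.  */
static bool
push2pop2_operands_ok (const struct push2pop2_fields *f)
{
  if (!f->nd)
    return false;
  if (f->vvvv_reg == 4 || f->rm_reg == 4)
    return false;
  if (f->is_pop2 && f->vvvv_reg == f->rm_reg)
    return false;
  return true;
}
```

In this sketch a rejected combination corresponds to the case where the disassembler prints "(bad)". Note that push2 with two identical registers passes, since the same-register restriction applies to pop2 only.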

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 7/9] Support APX pushp/popp
  2023-12-28  1:27 ` [PATCH V5 7/9] Support APX pushp/popp Cui, Lili
@ 2023-12-28  1:56   ` H.J. Lu
  0 siblings, 0 replies; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:56 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich

On Thu, Dec 28, 2023 at 01:27:12AM +0000, Cui, Lili wrote:
> gas/ChangeLog:
> 
> 	* config/tc-i386.c (process_operands): Handle "PUSHP/POPP requires
> 	rex2.w == 1."
> 	* testsuite/gas/i386/x86-64.exp: Add new test for PUSHP/POPP.
> 	* testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-pushp-popp.d: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-pushp-popp.s: Ditto.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis.c (putop): Print pushp and popp.
> 	* i386-opc.tbl: Add new insns.
> 	* i386-init.h: Regenerated.
> 	* i386-mnem.h: Regenerated.
> 	* i386-tbl.h: Regenerated.
> ---
>  gas/config/tc-i386.c                          |  3 +-
>  .../gas/i386/x86-64-apx-pushp-popp-intel.d    | 14 +++++
>  .../gas/i386/x86-64-apx-pushp-popp-inval.l    |  5 ++
>  .../gas/i386/x86-64-apx-pushp-popp-inval.s    |  7 +++
>  .../gas/i386/x86-64-apx-pushp-popp.d          | 14 +++++
>  .../gas/i386/x86-64-apx-pushp-popp.s          |  8 +++
>  gas/testsuite/gas/i386/x86-64.exp             |  3 +
>  opcodes/i386-dis.c                            | 55 ++++++++++++-------
>  opcodes/i386-opc.h                            |  2 +
>  opcodes/i386-opc.tbl                          |  3 +
>  10 files changed, 94 insertions(+), 20 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 8af98e435ef..640c6511f20 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -3910,7 +3910,8 @@ is_apx_evex_encoding (void)
>  static INLINE bool
>  is_apx_rex2_encoding (void)
>  {
> -  return i.rex2 || i.rex2_encoding;
> +  return i.rex2 || i.rex2_encoding
> +	|| i.tm.opcode_modifier.operandconstraint == REX2_REQUIRED;
>  }
>  
>  static unsigned int
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
> new file mode 100644
> index 00000000000..44e3e96a5df
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
> @@ -0,0 +1,14 @@
> +#as:
> +#objdump: -dw -Mintel
> +#name: x86_64 APX_F pushp popp insns (Intel disassembly)
> +#source: x86-64-apx-pushp-popp.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*d5 08 50[	 ]+pushp  rax
> +\s*[a-f0-9]+:\s*d5 19 57[	 ]+pushp  r31
> +\s*[a-f0-9]+:\s*d5 08 58[	 ]+popp   rax
> +\s*[a-f0-9]+:\s*d5 19 5f[	 ]+popp   r31
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
> new file mode 100644
> index 00000000000..c4d774b9673
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
> @@ -0,0 +1,5 @@
> +.* Assembler messages:
> +.*:4: Error: operand size mismatch for `pushp'
> +.*:5: Error: operand size mismatch for `popp'
> +.*:6: Error: operand size mismatch for `pushp'
> +.*:7: Error: operand size mismatch for `popp'
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
> new file mode 100644
> index 00000000000..28ed5d8145a
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
> @@ -0,0 +1,7 @@
> +# Check APX_F pushp/popp instructions with invalid operands.
> +
> +	.text
> +	pushp %eax
> +	popp  %eax
> +	pushp (%rax)
> +	popp  (%rax)
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
> new file mode 100644
> index 00000000000..b20e5ba9a35
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
> @@ -0,0 +1,14 @@
> +#as:
> +#objdump: -dw
> +#name: x86_64 APX_F pushp popp insns
> +#source: x86-64-apx-pushp-popp.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*d5 08 50[ 	]+pushp  %rax
> +\s*[a-f0-9]+:\s*d5 19 57[ 	]+pushp  %r31
> +\s*[a-f0-9]+:\s*d5 08 58[ 	]+popp   %rax
> +\s*[a-f0-9]+:\s*d5 19 5f[ 	]+popp   %r31
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
> new file mode 100644
> index 00000000000..0ea66d0e70c
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
> @@ -0,0 +1,8 @@
> +# Check 64bit APX_F pushp popp instructions
> +
> +       .text
> + _start:
> +	pushp %rax
> +	pushp %r31
> +	popp  %rax
> +	popp  %r31
> diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> index 0e7b5d0c073..1b13c52454e 100644
> --- a/gas/testsuite/gas/i386/x86-64.exp
> +++ b/gas/testsuite/gas/i386/x86-64.exp
> @@ -348,6 +348,9 @@ run_dump_test "x86-64-avx512dq-rcigrne"
>  run_dump_test "x86-64-apx-push2pop2"
>  run_dump_test "x86-64-apx-push2pop2-intel"
>  run_list_test "x86-64-apx-push2pop2-inval"
> +run_dump_test "x86-64-apx-pushp-popp"
> +run_dump_test "x86-64-apx-pushp-popp-intel"
> +run_list_test "x86-64-apx-pushp-popp-inval"
>  run_dump_test "x86-64-avx512dq-rcigru-intel"
>  run_dump_test "x86-64-avx512dq-rcigru"
>  run_dump_test "x86-64-avx512dq-rcigrz-intel"
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index 9daef6fa9fd..e851fb376d9 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -301,6 +301,9 @@ struct dis_private {
>  #define EVEX_len_used 2
>  
>  
> +/* {rex2} is not printed when the REX2_SPECIAL is set.  */
> +#define REX2_SPECIAL 16
> +
>  /* Flags stored in PREFIXES.  */
>  #define PREFIX_REPZ 1
>  #define PREFIX_REPNZ 2
> @@ -1924,23 +1927,23 @@ static const struct dis386 dis386[] = {
>    { "dec{S|}",		{ RMeSI }, 0 },
>    { "dec{S|}",		{ RMeDI }, 0 },
>    /* 50 */
> -  { "push{!P|}",		{ RMrAX }, 0 },
> -  { "push{!P|}",		{ RMrCX }, 0 },
> -  { "push{!P|}",		{ RMrDX }, 0 },
> -  { "push{!P|}",		{ RMrBX }, 0 },
> -  { "push{!P|}",		{ RMrSP }, 0 },
> -  { "push{!P|}",		{ RMrBP }, 0 },
> -  { "push{!P|}",		{ RMrSI }, 0 },
> -  { "push{!P|}",		{ RMrDI }, 0 },
> +  { "push!P",		{ RMrAX }, 0 },
> +  { "push!P",		{ RMrCX }, 0 },
> +  { "push!P",		{ RMrDX }, 0 },
> +  { "push!P",		{ RMrBX }, 0 },
> +  { "push!P",		{ RMrSP }, 0 },
> +  { "push!P",		{ RMrBP }, 0 },
> +  { "push!P",		{ RMrSI }, 0 },
> +  { "push!P",		{ RMrDI }, 0 },
>    /* 58 */
> -  { "pop{!P|}",		{ RMrAX }, 0 },
> -  { "pop{!P|}",		{ RMrCX }, 0 },
> -  { "pop{!P|}",		{ RMrDX }, 0 },
> -  { "pop{!P|}",		{ RMrBX }, 0 },
> -  { "pop{!P|}",		{ RMrSP }, 0 },
> -  { "pop{!P|}",		{ RMrBP }, 0 },
> -  { "pop{!P|}",		{ RMrSI }, 0 },
> -  { "pop{!P|}",		{ RMrDI }, 0 },
> +  { "pop!P",		{ RMrAX }, 0 },
> +  { "pop!P",		{ RMrCX }, 0 },
> +  { "pop!P",		{ RMrDX }, 0 },
> +  { "pop!P",		{ RMrBX }, 0 },
> +  { "pop!P",		{ RMrSP }, 0 },
> +  { "pop!P",		{ RMrBP }, 0 },
> +  { "pop!P",		{ RMrSI }, 0 },
> +  { "pop!P",		{ RMrDI }, 0 },
>    /* 60 */
>    { X86_64_TABLE (X86_64_60) },
>    { X86_64_TABLE (X86_64_61) },
> @@ -9783,9 +9786,10 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>  
>    /* Check if the REX2 prefix is used.  */
>    if (ins.last_rex2_prefix >= 0
> -      && ((ins.rex2 & 0x7) ^ (ins.rex2_used & 0x7)) == 0
> -      && (ins.rex ^ ins.rex_used) == 0
> -      && (ins.rex2 & 0x7))
> +      && ((ins.rex2 & REX2_SPECIAL)
> +	  || (((ins.rex2 & 7) ^ (ins.rex2_used & 7)) == 0
> +	      && (ins.rex ^ ins.rex_used) == 0
> +	      && (ins.rex2 & 7))))
>      ins.all_prefixes[ins.last_rex2_prefix] = 0;
>  
>    /* Check if the SEG prefix is used.  */
> @@ -10632,6 +10636,19 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
>  	case 'P':
>  	  if (l == 0)
>  	    {
> +	      if (!cond && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
> +		{
> +		  /* For pushp and popp, print 'p' and do not print
> +		     {rex2}.  */
> +		  *ins->obufp++ = 'p';
> +		  ins->rex2 |= REX2_SPECIAL;
> +		  break;
> +		}
> +
> +	      /* For "!P" print nothing else in Intel syntax.  */
> +	      if (!cond && ins->intel_syntax)
> +		break;
> +
>  	      if ((ins->modrm.mod == 3 || !cond)
>  		  && !(sizeflag & SUFFIX_ALWAYS))
>  		break;
> diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> index 9e8c827b934..8db6c51538a 100644
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -579,6 +579,8 @@ enum
>    /* Instrucion requires that destination must be distinct from source
>       registers.  */
>  #define DISTINCT_DEST 9
> +  /* Instruction requires REX2 prefix.  */
> +#define REX2_REQUIRED 10
>    OperandConstraint,
>    /* instruction ignores operand size prefix and in Intel mode ignores
>       mnemonic size suffix check.  */
> diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> index 900ca36d286..edd9f73ae22 100644
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -85,6 +85,7 @@
>  #define RegKludge         OperandConstraint=REG_KLUDGE
>  #define SwapSources       OperandConstraint=SWAP_SOURCES
>  #define Ugh               OperandConstraint=UGH
> +#define Rex2              OperandConstraint=REX2_REQUIRED
>  
>  #define ATTSyntax         Dialect=ATT_SYNTAX
>  #define ATTMnemonic       Dialect=ATT_MNEMONIC
> @@ -229,6 +230,7 @@ push, 0x68, i186&No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Imm16|Imm32 }
>  push, 0x6, No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }
>  // In 64bit mode, the operand size is implicitly 64bit.
>  push, 0x50, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
> +pushp, 0x50, APX_F, No_bSuf|No_wSuf|No_lSuf|No_sSuf|Rex2, { Reg64 }
>  push, 0xff/6, x64, Modrm|DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64|Unspecified|BaseIndex }
>  push, 0x6a, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Imm8S }
>  push, 0x68, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Imm16|Imm32S }
> @@ -242,6 +244,7 @@ pop, 0x8f/0, No64, Modrm|DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32|Unsp
>  pop, 0x7, No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }
>  // In 64bit mode, the operand size is implicitly 64bit.
>  pop, 0x58, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
> +popp, 0x58, APX_F, No_bSuf|No_wSuf|No_lSuf|No_sSuf|Rex2, { Reg64 }
>  pop, 0x8f/0, x64, Modrm|DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64|Unspecified|BaseIndex }
>  pop, 0xfa1, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { SReg }
>  
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.
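
The byte sequences in the pushp/popp tests above (for example d5 19 57 for "pushp %r31") can be reproduced with a small decoder. The sketch below is not taken from the patch: the names are hypothetical, and it assumes the REX2 payload layout M0, R4, X4, B4, W, R3, X3, B3 (bit 7 down to bit 0) as I understand it from the APX specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch.  */
struct pushpop_info
{
  bool valid;        /* Bytes form a REX2-prefixed push/pop of a register.  */
  bool is_pop;       /* Opcode is 0x58..0x5f rather than 0x50..0x57.  */
  bool has_p_suffix; /* REX2.W is set, so the mnemonic is pushp/popp.  */
  unsigned int reg;  /* Register number, 0..31.  */
};

/* Decode a three-byte sequence: REX2 prefix (0xd5), payload, opcode.  */
static struct pushpop_info
decode_rex2_pushpop (const uint8_t bytes[3])
{
  struct pushpop_info r = { false, false, false, 0 };
  uint8_t payload = bytes[1];
  uint8_t opcode = bytes[2];

  if (bytes[0] != 0xd5)
    return r;
  /* M0 (bit 7) selects legacy map 1; push/pop reg live in map 0.  */
  if (payload & 0x80)
    return r;
  if ((opcode & 0xf0) != 0x50)
    return r;

  r.valid = true;
  r.is_pop = (opcode & 0x08) != 0;
  r.has_p_suffix = (payload & 0x08) != 0;   /* W bit.  */
  r.reg = (opcode & 7)
	  | ((payload & 0x01) ? 8 : 0)      /* B3 bit.  */
	  | ((payload & 0x10) ? 16 : 0);    /* B4 bit.  */
  return r;
}
```

With these assumptions, d5 08 50 decodes as a push of register 0 (%rax) with W set, and d5 19 5f as a pop of register 31 with W set, which agrees with the expected disassembly in the .d files.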

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 8/9] Support APX NDD optimized encoding.
  2023-12-28  1:27 ` [PATCH V5 8/9] Support APX NDD optimized encoding Cui, Lili
@ 2023-12-28  1:56   ` H.J. Lu
  2024-01-05 14:36   ` Jan Beulich
  1 sibling, 0 replies; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:56 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich, Hu, Lin1

On Thu, Dec 28, 2023 at 01:27:13AM +0000, Cui, Lili wrote:
> From: "Hu, Lin1" <lin1.hu@intel.com>
> 
> This patch aims to optimize:
> 
> add %r16, %r15, %r15 -> add %r16, %r15
> 
> gas/ChangeLog:
> 
> 	* config/tc-i386.c (check_Rex_required): New function.
> 	(can_convert_NDD_to_legacy): Ditto.
> 	(match_template): Rematch the template if an APX NDD insn can be
> 	optimized.
> 	* testsuite/gas/i386/x86-64.exp: Add test.
> 	* testsuite/gas/i386/x86-64-apx-ndd-optimize.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-ndd-optimize.s: Ditto.
> ---
>  gas/config/tc-i386.c                          | 104 ++++++++++++++
>  .../gas/i386/x86-64-apx-ndd-optimize.d        | 132 ++++++++++++++++++
>  .../gas/i386/x86-64-apx-ndd-optimize.s        | 125 +++++++++++++++++
>  gas/testsuite/gas/i386/x86-64.exp             |   1 +
>  4 files changed, 362 insertions(+)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 640c6511f20..3b0ba41cc72 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -7212,6 +7212,56 @@ check_APX_operands (const insn_template *t)
>    return 0;
>  }
>  
> +/* Check if the instruction uses the REX registers or REX prefix.  */
> +static bool
> +check_Rex_required (void)
> +{
> +  for (unsigned int op = 0; op < i.operands; op++)
> +    {
> +      if (i.types[op].bitfield.class != Reg)
> +	continue;
> +
> +      if (i.op[op].regs->reg_flags & (RegRex | RegRex64))
> +	return true;
> +    }
> +
> +  if ((i.index_reg && (i.index_reg->reg_flags & (RegRex | RegRex64)))
> +      || (i.base_reg && (i.base_reg->reg_flags & (RegRex | RegRex64))))
> +    return true;
> +
> +  /* Check whether the pseudo prefix {rex} was specified.  */
> +  return i.rex_encoding;
> +}
> +
> +/* Optimize APX NDD insns to legacy insns.  */
> +static unsigned int
> +can_convert_NDD_to_legacy (const insn_template *t)
> +{
> +  unsigned int match_dest_op = ~0;
> +
> +  if (!i.tm.opcode_modifier.nf
> +      && i.reg_operands >= 2)
> +    {
> +      unsigned int dest = i.operands - 1;
> +      unsigned int src1 = i.operands - 2;
> +      unsigned int src2 = (i.operands > 3) ? i.operands - 3 : 0;
> +
> +      if (i.types[src1].bitfield.class == Reg
> +	  && i.op[src1].regs == i.op[dest].regs)
> +	match_dest_op = src1;
> +      /* If the first operand is the same as the third operand,
> +	 the instruction must be commutative, i.e. swapping the first
> +	 two operands must not change the semantics, in order to be
> +	 optimized.  */
> +      else if (optimize > 1
> +	       && t->opcode_modifier.commutative
> +	       && i.types[src2].bitfield.class == Reg
> +	       && i.op[src2].regs == i.op[dest].regs)
> +	match_dest_op = src2;
> +    }
> +  return match_dest_op;
> +}
> +
>  /* Helper function for the progress() macro in match_template().  */
>  static INLINE enum i386_error progress (enum i386_error new,
>  					enum i386_error last,
> @@ -7754,6 +7804,60 @@ match_template (char mnem_suffix)
>  	  i.memshift = memshift;
>  	}
>  
> +      /* If we can optimize a NDD insn to legacy insn, like
> +	 add %r16, %r8, %r8 -> add %r16, %r8,
> +	 add  %r8, %r16, %r8 -> add %r16, %r8, then rematch template.
> +	 Note that the semantics have not been changed.  */
> +      if (optimize
> +	  && !i.no_optimize
> +	  && i.vec_encoding != vex_encoding_evex
> +	  && t + 1 < current_templates.end
> +	  && !t[1].opcode_modifier.evex
> +	  && t[1].opcode_space <= SPACE_0F38
> +	  && t->opcode_modifier.vexvvvv == VexVVVV_DST
> +	  && (i.types[i.operands - 1].bitfield.dword
> +	      || i.types[i.operands - 1].bitfield.qword))
> +	{
> +	  unsigned int match_dest_op = can_convert_NDD_to_legacy (t);
> +
> +	  if (match_dest_op != (unsigned int) ~0)
> +	    {
> +	      size_match = true;
> +	      /* Ensure that the next template has the same input
> +		 operands as the original matching template by checking
> +		 the first operand (ATT).  This guards against someone
> +		 adding new NDD insns in the wrong position.  */
> +	      overlap0 = operand_type_and (i.types[0],
> +					   t[1].operand_types[0]);
> +	      if (t->opcode_modifier.d)
> +		overlap1 = operand_type_and (i.types[0],
> +					     t[1].operand_types[1]);
> +	      if (!operand_type_match (overlap0, i.types[0])
> +		  && (!t->opcode_modifier.d
> +		      || !operand_type_match (overlap1, i.types[0])))
> +		size_match = false;
> +
> +	      if (size_match
> +		  && (t[1].opcode_space <= SPACE_0F
> +		      /* Some non-legacy-map0/1 insns can be shorter when
> +			 legacy-encoded and when no REX prefix is required.  */
> +		      || (!check_EgprOperands (t + 1)
> +			  && !check_Rex_required ()
> +			  && !i.op[i.operands - 1].regs->reg_type.bitfield.qword)))
> +		{
> +		  if (i.operands > 2 && match_dest_op == i.operands - 3)
> +		    swap_2_operands (match_dest_op, i.operands - 2);
> +
> +		  --i.operands;
> +		  --i.reg_operands;
> +
> +		  specific_error = progress (internal_error);
> +		  continue;
> +		}
> +
> +	    }
> +	}
> +
>        /* We've found a match; break out of loop.  */
>        break;
>      }
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
> new file mode 100644
> index 00000000000..48f0f1ceee3
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
> @@ -0,0 +1,132 @@
> +#as: -Os
> +#objdump: -drw
> +#name: x86-64 APX NDD optimized encoding
> +#source: x86-64-apx-ndd-optimize.s
> +
> +.*: +file format .*
> +
> +
> +Disassembly of section .text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*d5 4d 01 f8          	add    %r31,%r8
> +\s*[a-f0-9]+:\s*62 44 3c 18 00 f8    	add    %r31b,%r8b,%r8b
> +\s*[a-f0-9]+:\s*d5 4d 01 f8          	add    %r31,%r8
> +\s*[a-f0-9]+:\s*d5 1d 03 c7          	add    %r31,%r8
> +\s*[a-f0-9]+:\s*d5 4d 03 38          	add    \(%r8\),%r31
> +\s*[a-f0-9]+:\s*d5 1d 03 07          	add    \(%r31\),%r8
> +\s*[a-f0-9]+:\s*49 81 c7 33 44 34 12 	add    \$0x12344433,%r15
> +\s*[a-f0-9]+:\s*49 81 c0 11 22 33 f4 	add    \$0xfffffffff4332211,%r8
> +\s*[a-f0-9]+:\s*d5 19 ff c7          	inc    %r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 fe c7    	inc    %r31b,%r31b
> +\s*[a-f0-9]+:\s*d5 1c 29 f9          	sub    %r15,%r17
> +\s*[a-f0-9]+:\s*62 7c 74 10 28 f9    	sub    %r15b,%r17b,%r17b
> +\s*[a-f0-9]+:\s*62 54 84 18 29 38    	sub    %r15,\(%r8\),%r15
> +\s*[a-f0-9]+:\s*d5 49 2b 04 07       	sub    \(%r15,%rax,1\),%r16
> +\s*[a-f0-9]+:\s*d5 19 81 ee 34 12 00 00 	sub    \$0x1234,%r30
> +\s*[a-f0-9]+:\s*d5 18 ff c9          	dec    %r17
> +\s*[a-f0-9]+:\s*62 fc 74 10 fe c9    	dec    %r17b,%r17b
> +\s*[a-f0-9]+:\s*d5 1c 19 f9          	sbb    %r15,%r17
> +\s*[a-f0-9]+:\s*62 7c 74 10 18 f9    	sbb    %r15b,%r17b,%r17b
> +\s*[a-f0-9]+:\s*62 54 84 18 19 38    	sbb    %r15,\(%r8\),%r15
> +\s*[a-f0-9]+:\s*d5 49 1b 04 07       	sbb    \(%r15,%rax,1\),%r16
> +\s*[a-f0-9]+:\s*d5 19 81 de 34 12 00 00 	sbb    \$0x1234,%r30
> +\s*[a-f0-9]+:\s*d5 1c 21 f9          	and    %r15,%r17
> +\s*[a-f0-9]+:\s*62 7c 74 10 20 f9    	and    %r15b,%r17b,%r17b
> +\s*[a-f0-9]+:\s*4d 23 38             	and    \(%r8\),%r15
> +\s*[a-f0-9]+:\s*d5 49 23 04 07       	and    \(%r15,%rax,1\),%r16
> +\s*[a-f0-9]+:\s*d5 11 81 e6 34 12 00 00 	and    \$0x1234,%r30d
> +\s*[a-f0-9]+:\s*d5 1c 09 f9          	or     %r15,%r17
> +\s*[a-f0-9]+:\s*62 7c 74 10 08 f9    	or     %r15b,%r17b,%r17b
> +\s*[a-f0-9]+:\s*4d 0b 38             	or     \(%r8\),%r15
> +\s*[a-f0-9]+:\s*d5 49 0b 04 07       	or     \(%r15,%rax,1\),%r16
> +\s*[a-f0-9]+:\s*d5 19 81 ce 34 12 00 00 	or     \$0x1234,%r30
> +\s*[a-f0-9]+:\s*d5 1c 31 f9          	xor    %r15,%r17
> +\s*[a-f0-9]+:\s*62 7c 74 10 30 f9    	xor    %r15b,%r17b,%r17b
> +\s*[a-f0-9]+:\s*4d 33 38             	xor    \(%r8\),%r15
> +\s*[a-f0-9]+:\s*d5 49 33 04 07       	xor    \(%r15,%rax,1\),%r16
> +\s*[a-f0-9]+:\s*d5 19 81 f6 34 12 00 00 	xor    \$0x1234,%r30
> +\s*[a-f0-9]+:\s*d5 1c 11 f9          	adc    %r15,%r17
> +\s*[a-f0-9]+:\s*62 7c 74 10 10 f9    	adc    %r15b,%r17b,%r17b
> +\s*[a-f0-9]+:\s*4d 13 38             	adc    \(%r8\),%r15
> +\s*[a-f0-9]+:\s*d5 49 13 04 07       	adc    \(%r15,%rax,1\),%r16
> +\s*[a-f0-9]+:\s*d5 19 81 d6 34 12 00 00 	adc    \$0x1234,%r30
> +\s*[a-f0-9]+:\s*d5 18 f7 d9          	neg    %r17
> +\s*[a-f0-9]+:\s*62 fc 74 10 f6 d9    	neg    %r17b,%r17b
> +\s*[a-f0-9]+:\s*d5 18 f7 d1          	not    %r17
> +\s*[a-f0-9]+:\s*62 fc 74 10 f6 d1    	not    %r17b,%r17b
> +\s*[a-f0-9]+:\s*67 0f af 90 09 09 09 00 	imul   0x90909\(%eax\),%edx
> +\s*[a-f0-9]+:\s*d5 aa af 94 f8 09 09 00 00 	imul   0x909\(%rax,%r31,8\),%rdx
> +\s*[a-f0-9]+:\s*48 0f af d0          	imul   %rax,%rdx
> +\s*[a-f0-9]+:\s*d5 19 d1 c7          	rol    \$1,%r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 d0 c7    	rol    \$1,%r31b,%r31b
> +\s*[a-f0-9]+:\s*49 c1 c4 02          	rol    \$0x2,%r12
> +\s*[a-f0-9]+:\s*62 d4 1c 18 c0 c4 02 	rol    \$0x2,%r12b,%r12b
> +\s*[a-f0-9]+:\s*d5 19 d1 cf          	ror    \$1,%r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 d0 cf    	ror    \$1,%r31b,%r31b
> +\s*[a-f0-9]+:\s*49 c1 cc 02          	ror    \$0x2,%r12
> +\s*[a-f0-9]+:\s*62 d4 1c 18 c0 cc 02 	ror    \$0x2,%r12b,%r12b
> +\s*[a-f0-9]+:\s*d5 19 d1 d7          	rcl    \$1,%r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 d0 d7    	rcl    \$1,%r31b,%r31b
> +\s*[a-f0-9]+:\s*49 c1 d4 02          	rcl    \$0x2,%r12
> +\s*[a-f0-9]+:\s*62 d4 1c 18 c0 d4 02 	rcl    \$0x2,%r12b,%r12b
> +\s*[a-f0-9]+:\s*d5 19 d1 df          	rcr    \$1,%r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 d0 df    	rcr    \$1,%r31b,%r31b
> +\s*[a-f0-9]+:\s*49 c1 dc 02          	rcr    \$0x2,%r12
> +\s*[a-f0-9]+:\s*62 d4 1c 18 c0 dc 02 	rcr    \$0x2,%r12b,%r12b
> +\s*[a-f0-9]+:\s*d5 19 d1 e7          	shl    \$1,%r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 d0 e7    	shl    \$1,%r31b,%r31b
> +\s*[a-f0-9]+:\s*49 c1 e4 02          	shl    \$0x2,%r12
> +\s*[a-f0-9]+:\s*62 d4 1c 18 c0 e4 02 	shl    \$0x2,%r12b,%r12b
> +\s*[a-f0-9]+:\s*d5 19 d1 e7          	shl    \$1,%r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 d0 e7    	shl    \$1,%r31b,%r31b
> +\s*[a-f0-9]+:\s*49 c1 e4 02          	shl    \$0x2,%r12
> +\s*[a-f0-9]+:\s*62 d4 1c 18 c0 e4 02 	shl    \$0x2,%r12b,%r12b
> +\s*[a-f0-9]+:\s*d5 19 d1 ef          	shr    \$1,%r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 d0 ef    	shr    \$1,%r31b,%r31b
> +\s*[a-f0-9]+:\s*49 c1 ec 02          	shr    \$0x2,%r12
> +\s*[a-f0-9]+:\s*62 d4 1c 18 c0 ec 02 	shr    \$0x2,%r12b,%r12b
> +\s*[a-f0-9]+:\s*d5 19 d1 ff          	sar    \$1,%r31
> +\s*[a-f0-9]+:\s*62 dc 04 10 d0 ff    	sar    \$1,%r31b,%r31b
> +\s*[a-f0-9]+:\s*49 c1 fc 02          	sar    \$0x2,%r12
> +\s*[a-f0-9]+:\s*62 d4 1c 18 c0 fc 02 	sar    \$0x2,%r12b,%r12b
> +\s*[a-f0-9]+:\s*62 74 9c 18 24 20 01 	shld   \$0x1,%r12,\(%rax\),%r12
> +\s*[a-f0-9]+:\s*4d 0f a4 c4 02       	shld   \$0x2,%r8,%r12
> +\s*[a-f0-9]+:\s*62 54 bc 18 24 c4 02 	shld   \$0x2,%r8,%r12,%r8
> +\s*[a-f0-9]+:\s*62 74 b4 18 a5 08    	shld   %cl,%r9,\(%rax\),%r9
> +\s*[a-f0-9]+:\s*d5 9c a5 e0          	shld   %cl,%r12,%r16
> +\s*[a-f0-9]+:\s*62 7c 9c 18 a5 e0    	shld   %cl,%r12,%r16,%r12
> +\s*[a-f0-9]+:\s*62 74 9c 18 2c 20 01 	shrd   \$0x1,%r12,\(%rax\),%r12
> +\s*[a-f0-9]+:\s*4d 0f ac ec 01       	shrd   \$0x1,%r13,%r12
> +\s*[a-f0-9]+:\s*62 54 94 18 2c ec 01 	shrd   \$0x1,%r13,%r12,%r13
> +\s*[a-f0-9]+:\s*62 74 b4 18 ad 08    	shrd   %cl,%r9,\(%rax\),%r9
> +\s*[a-f0-9]+:\s*d5 9c ad e0          	shrd   %cl,%r12,%r16
> +\s*[a-f0-9]+:\s*62 7c 9c 18 ad e0    	shrd   %cl,%r12,%r16,%r12
> +\s*[a-f0-9]+:\s*67 0f 40 90 90 90 90 90 	cmovo  -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 41 90 90 90 90 90 	cmovno -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 42 90 90 90 90 90 	cmovb  -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 43 90 90 90 90 90 	cmovae -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 44 90 90 90 90 90 	cmove  -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 45 90 90 90 90 90 	cmovne -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 46 90 90 90 90 90 	cmovbe -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 47 90 90 90 90 90 	cmova  -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 48 90 90 90 90 90 	cmovs  -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 49 90 90 90 90 90 	cmovns -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 4a 90 90 90 90 90 	cmovp  -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 4c 90 90 90 90 90 	cmovl  -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 4d 90 90 90 90 90 	cmovge -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 4e 90 90 90 90 90 	cmovle -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*67 0f 4f 90 90 90 90 90 	cmovg  -0x6f6f6f70\(%eax\),%edx
> +\s*[a-f0-9]+:\s*66 0f 38 f6 c3       	adcx   %ebx,%eax
> +\s*[a-f0-9]+:\s*66 0f 38 f6 c3       	adcx   %ebx,%eax
> +\s*[a-f0-9]+:\s*62 f4 fd 18 66 c3    	adcx   %rbx,%rax,%rax
> +\s*[a-f0-9]+:\s*62 74 3d 18 66 c0    	adcx   %eax,%r8d,%r8d
> +\s*[a-f0-9]+:\s*62 d4 7d 18 66 c7    	adcx   %r15d,%eax,%eax
> +\s*[a-f0-9]+:\s*67 66 0f 38 f6 04 0a 	adcx   \(%edx,%ecx,1\),%eax
> +\s*[a-f0-9]+:\s*f3 0f 38 f6 c3       	adox   %ebx,%eax
> +\s*[a-f0-9]+:\s*f3 0f 38 f6 c3       	adox   %ebx,%eax
> +\s*[a-f0-9]+:\s*62 f4 fe 18 66 c3    	adox   %rbx,%rax,%rax
> +\s*[a-f0-9]+:\s*62 74 3e 18 66 c0    	adox   %eax,%r8d,%r8d
> +\s*[a-f0-9]+:\s*62 d4 7e 18 66 c7    	adox   %r15d,%eax,%eax
> +\s*[a-f0-9]+:\s*67 f3 0f 38 f6 04 0a 	adox   \(%edx,%ecx,1\),%eax
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
> new file mode 100644
> index 00000000000..6ffdf5a6390
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
> @@ -0,0 +1,125 @@
> +# Check 64bit APX NDD instructions with optimized encoding
> +
> +	.text
> +_start:
> +add    %r31,%r8,%r8
> +addb   %r31b,%r8b,%r8b
> +{store} add    %r31,%r8,%r8
> +{load}  add    %r31,%r8,%r8
> +add    %r31,(%r8),%r31
> +add    (%r31),%r8,%r8
> +add    $0x12344433,%r15,%r15
> +add    $0xfffffffff4332211,%r8,%r8
> +inc    %r31,%r31
> +incb   %r31b,%r31b
> +sub    %r15,%r17,%r17
> +subb   %r15b,%r17b,%r17b
> +sub    %r15,(%r8),%r15
> +sub    (%r15,%rax,1),%r16,%r16
> +sub    $0x1234,%r30,%r30
> +dec    %r17,%r17
> +decb   %r17b,%r17b
> +sbb    %r15,%r17,%r17
> +sbbb   %r15b,%r17b,%r17b
> +sbb    %r15,(%r8),%r15
> +sbb    (%r15,%rax,1),%r16,%r16
> +sbb    $0x1234,%r30,%r30
> +and    %r15,%r17,%r17
> +andb   %r15b,%r17b,%r17b
> +and    %r15,(%r8),%r15
> +and    (%r15,%rax,1),%r16,%r16
> +and    $0x1234,%r30,%r30
> +or     %r15,%r17,%r17
> +orb    %r15b,%r17b,%r17b
> +or     %r15,(%r8),%r15
> +or     (%r15,%rax,1),%r16,%r16
> +or     $0x1234,%r30,%r30
> +xor    %r15,%r17,%r17
> +xorb   %r15b,%r17b,%r17b
> +xor    %r15,(%r8),%r15
> +xor    (%r15,%rax,1),%r16,%r16
> +xor    $0x1234,%r30,%r30
> +adc    %r15,%r17,%r17
> +adcb   %r15b,%r17b,%r17b
> +adc    %r15,(%r8),%r15
> +adc    (%r15,%rax,1),%r16,%r16
> +adc    $0x1234,%r30,%r30
> +neg    %r17,%r17
> +negb   %r17b,%r17b
> +not    %r17,%r17
> +notb   %r17b,%r17b
> +imul   0x90909(%eax),%edx,%edx
> +imul   0x909(%rax,%r31,8),%rdx,%rdx
> +imul   %rdx,%rax,%rdx
> +rol    $0x1,%r31,%r31
> +rolb   $0x1,%r31b,%r31b
> +rol    $0x2,%r12,%r12
> +rolb   $0x2,%r12b,%r12b
> +ror    $0x1,%r31,%r31
> +rorb   $0x1,%r31b,%r31b
> +ror    $0x2,%r12,%r12
> +rorb   $0x2,%r12b,%r12b
> +rcl    $0x1,%r31,%r31
> +rclb   $0x1,%r31b,%r31b
> +rcl    $0x2,%r12,%r12
> +rclb   $0x2,%r12b,%r12b
> +rcr    $0x1,%r31,%r31
> +rcrb   $0x1,%r31b,%r31b
> +rcr    $0x2,%r12,%r12
> +rcrb   $0x2,%r12b,%r12b
> +sal    $0x1,%r31,%r31
> +salb   $0x1,%r31b,%r31b
> +sal    $0x2,%r12,%r12
> +salb   $0x2,%r12b,%r12b
> +shl    $0x1,%r31,%r31
> +shlb   $0x1,%r31b,%r31b
> +shl    $0x2,%r12,%r12
> +shlb   $0x2,%r12b,%r12b
> +shr    $0x1,%r31,%r31
> +shrb   $0x1,%r31b,%r31b
> +shr    $0x2,%r12,%r12
> +shrb   $0x2,%r12b,%r12b
> +sar    $0x1,%r31,%r31
> +sarb   $0x1,%r31b,%r31b
> +sar    $0x2,%r12,%r12
> +sarb   $0x2,%r12b,%r12b
> +shld   $0x1,%r12,(%rax),%r12
> +shld   $0x2,%r8,%r12,%r12
> +shld   $0x2,%r8,%r12,%r8
> +shld   %cl,%r9,(%rax),%r9
> +shld   %cl,%r12,%r16,%r16
> +shld   %cl,%r12,%r16,%r12
> +shrd   $0x1,%r12,(%rax),%r12
> +shrd   $0x1,%r13,%r12,%r12
> +shrd   $0x1,%r13,%r12,%r13
> +shrd   %cl,%r9,(%rax),%r9
> +shrd   %cl,%r12,%r16,%r16
> +shrd   %cl,%r12,%r16,%r12
> +cmovo  0x90909090(%eax),%edx,%edx
> +cmovno 0x90909090(%eax),%edx,%edx
> +cmovb  0x90909090(%eax),%edx,%edx
> +cmovae 0x90909090(%eax),%edx,%edx
> +cmove  0x90909090(%eax),%edx,%edx
> +cmovne 0x90909090(%eax),%edx,%edx
> +cmovbe 0x90909090(%eax),%edx,%edx
> +cmova  0x90909090(%eax),%edx,%edx
> +cmovs  0x90909090(%eax),%edx,%edx
> +cmovns 0x90909090(%eax),%edx,%edx
> +cmovp  0x90909090(%eax),%edx,%edx
> +cmovnp 0x90909090(%eax),%edx,%edx
> +cmovl  0x90909090(%eax),%edx,%edx
> +cmovge 0x90909090(%eax),%edx,%edx
> +cmovle 0x90909090(%eax),%edx,%edx
> +cmovg  0x90909090(%eax),%edx,%edx
> +adcx   %ebx,%eax,%eax
> +adcx   %eax,%ebx,%eax
> +adcx   %rbx,%rax,%rax
> +adcx   %eax,%r8d,%r8d
> +adcx   %r15d,%eax,%eax
> +adcx   (%edx,%ecx,1),%eax,%eax
> +adox   %ebx,%eax,%eax
> +adox   %eax,%ebx,%eax
> +adox   %rbx,%rax,%rax
> +adox   %eax,%r8d,%r8d
> +adox   %r15d,%eax,%eax
> +adox   (%edx,%ecx,1),%eax,%eax
> diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> index 1b13c52454e..2ba4c49417a 100644
> --- a/gas/testsuite/gas/i386/x86-64.exp
> +++ b/gas/testsuite/gas/i386/x86-64.exp
> @@ -561,6 +561,7 @@ run_dump_test "x86-64-optimize-6"
>  run_list_test "x86-64-optimize-7a" "-I${srcdir}/$subdir -march=+noavx -al"
>  run_dump_test "x86-64-optimize-7b"
>  run_list_test "x86-64-optimize-8" "-I${srcdir}/$subdir -march=+noavx2 -al"
> +run_dump_test "x86-64-apx-ndd-optimize"
>  run_dump_test "x86-64-align-branch-1a"
>  run_dump_test "x86-64-align-branch-1b"
>  run_dump_test "x86-64-align-branch-1c"
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.
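
The operand matching done by can_convert_NDD_to_legacy and the follow-up in match_template can be illustrated with a standalone sketch. Everything here is a simplification under stated assumptions: operands are hypothetical register-name strings in AT&T order (sources first, destination last), NULL stands for a non-register operand, and the NF, template and encoding-size checks from the patch are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NO_MATCH (~0u)

/* Return the index of the source operand that equals the destination,
   or NO_MATCH.  The first source only qualifies for commutative insns
   and when optimize_level > 1, mirroring the quoted patch.  */
static unsigned int
find_matching_source (const char *ops[], unsigned int n_ops,
		      bool commutative, int optimize_level)
{
  if (n_ops < 2)
    return NO_MATCH;

  unsigned int dest = n_ops - 1;
  unsigned int src1 = n_ops - 2;
  unsigned int src2 = n_ops > 3 ? n_ops - 3 : 0;

  if (ops[dest] == NULL)
    return NO_MATCH;
  if (ops[src1] != NULL && strcmp (ops[src1], ops[dest]) == 0)
    return src1;
  if (optimize_level > 1 && commutative
      && ops[src2] != NULL && strcmp (ops[src2], ops[dest]) == 0)
    return src2;
  return NO_MATCH;
}

/* Drop the destination operand; if the match was the first of three
   operands, swap it next to the destination first.  Returns the new
   operand count.  */
static unsigned int
drop_destination (const char *ops[], unsigned int n_ops, unsigned int match)
{
  if (n_ops > 2 && match == n_ops - 3)
    {
      const char *tmp = ops[match];
      ops[match] = ops[n_ops - 2];
      ops[n_ops - 2] = tmp;
    }
  return n_ops - 1;
}
```

For example, { "r8", "r16", "r8" } with a commutative insn and optimize_level 2 matches index 0 and becomes the two-operand list { "r16", "r8" }, which corresponds to the "add %r8, %r16, %r8 -> add %r16, %r8" case in the patch comment.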

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 9/9] Support APX JMPABS for disassembler
  2023-12-28  1:27 ` [PATCH V5 9/9] Support APX JMPABS for disassembler Cui, Lili
@ 2023-12-28  1:56   ` H.J. Lu
  2024-01-05 12:08   ` Jan Beulich
  1 sibling, 0 replies; 30+ messages in thread
From: H.J. Lu @ 2023-12-28  1:56 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, jbeulich, Hu, Lin1

On Thu, Dec 28, 2023 at 01:27:14AM +0000, Cui, Lili wrote:
> From: "Hu, Lin1" <lin1.hu@intel.com>
> 
> gas/ChangeLog:
> 
> 	* testsuite/gas/i386/x86-64.exp: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-jmpabs-intel.d: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-jmpabs-inval.d: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-jmpabs-inval.s: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-jmpabs.d: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-jmpabs.s: Ditto.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis.c (JMPABS_Fixup): New Fixup function to disassemble jmpabs.
> 	(print_insn): Add #UD exception for jmpabs.
> 	(dis386): Modify the a1 entry to support jmpabs.
> 	* i386-mnem.h: Regenerated.
> 	* i386-opc.tbl: New insns.
> 	* i386-tbl.h: Regenerated.
> ---
>  .../gas/i386/x86-64-apx-jmpabs-intel.d        | 12 ++++++
>  .../gas/i386/x86-64-apx-jmpabs-inval.d        | 40 +++++++++++++++++++
>  .../gas/i386/x86-64-apx-jmpabs-inval.s        | 15 +++++++
>  gas/testsuite/gas/i386/x86-64-apx-jmpabs.d    | 12 ++++++
>  gas/testsuite/gas/i386/x86-64-apx-jmpabs.s    |  5 +++
>  gas/testsuite/gas/i386/x86-64.exp             |  3 ++
>  opcodes/i386-dis.c                            | 37 ++++++++++++++++-
>  7 files changed, 122 insertions(+), 2 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
> 
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
> new file mode 100644
> index 00000000000..2b87f95532f
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
> @@ -0,0 +1,12 @@
> +#as:
> +#objdump: -dw -Mintel
> +#name: x86_64 APX_F JMPABS insns (Intel disassembly)
> +#source: x86-64-apx-jmpabs.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*d5 00 a1 02 00 00 00 00 00 00 00[	 ]+jmpabs 0x2
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
> new file mode 100644
> index 00000000000..86f313f0873
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
> @@ -0,0 +1,40 @@
> +#as: --64
> +#objdump: -dw
> +#name: illegal decoding of APX_F jmpabs insns
> +#source: x86-64-apx-jmpabs-inval.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <.text>:
> +\s*[a-f0-9]+:	66 d5 00 a1[  	]+\(bad\)
> +\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	67 d5 00 a1[  	]+\(bad\)
> +\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	f2 d5 00 a1[  	]+\(bad\)
> +\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	f3 d5 00 a1[  	]+\(bad\)
> +\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	f0 d5 00 a1[  	]+\(bad\)
> +\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	d5 08 a1[  	]+\(bad\)
> +\s*[a-f0-9]+:	01 00[  	]+add    %eax,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +\s*[a-f0-9]+:	00 00[  	]+add    %al,\(%rax\)
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
> new file mode 100644
> index 00000000000..de4440a5466
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
> @@ -0,0 +1,15 @@
> +# Check bytecode of APX_F jmpabs instructions with illegal encode.
> +
> +	.text
> +# With 66 prefix
> +	.byte 0x66,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# With 67 prefix
> +	.byte 0x67,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# With F2 prefix
> +	.byte 0xf2,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# With F3 prefix
> +	.byte 0xf3,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# With LOCK prefix
> +	.byte 0xf0,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# REX2.M0 = 0 REX2.W = 1
> +	.byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs.d b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
> new file mode 100644
> index 00000000000..e95b54f5dab
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
> @@ -0,0 +1,12 @@
> +#as:
> +#objdump: -dw
> +#name: x86_64 APX_F JMPABS insns
> +#source: x86-64-apx-jmpabs.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*d5 00 a1 02 00 00 00 00 00 00 00[	 ]+jmpabs \$0x2
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
> new file mode 100644
> index 00000000000..69ffb763260
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
> @@ -0,0 +1,5 @@
> +# Check 64bit APX_F JMPABS instructions
> +
> +	.text
> + _start:
> +	.byte 0xd5,0x00,0xa1,0x02,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> index 2ba4c49417a..fa6a1c3c945 100644
> --- a/gas/testsuite/gas/i386/x86-64.exp
> +++ b/gas/testsuite/gas/i386/x86-64.exp
> @@ -377,6 +377,9 @@ run_dump_test "x86-64-apx-evex-promoted"
>  run_dump_test "x86-64-apx-evex-promoted-intel"
>  run_dump_test "x86-64-apx-evex-egpr"
>  run_dump_test "x86-64-apx-ndd"
> +run_dump_test "x86-64-apx-jmpabs"
> +run_dump_test "x86-64-apx-jmpabs-intel"
> +run_dump_test "x86-64-apx-jmpabs-inval"
>  run_dump_test "x86-64-avx512f-rcigrz-intel"
>  run_dump_test "x86-64-avx512f-rcigrz"
>  run_dump_test "x86-64-clwb"
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index e851fb376d9..b6d7e089823 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -106,6 +106,7 @@ static bool MOVSXD_Fixup (instr_info *, int, int);
>  static bool DistinctDest_Fixup (instr_info *, int, int);
>  static bool PREFETCHI_Fixup (instr_info *, int, int);
>  static bool PUSH2_POP2_Fixup (instr_info *, int, int);
> +static bool JMPABS_Fixup (instr_info *, int, int);
>  
>  static void ATTRIBUTE_PRINTF_3 i386_dis_printf (const disassemble_info *,
>  						enum disassembler_style,
> @@ -2018,7 +2019,7 @@ static const struct dis386 dis386[] = {
>    { "lahf",		{ XX }, 0 },
>    /* a0 */
>    { "mov%LB",		{ AL, Ob }, PREFIX_REX2_ILLEGAL },
> -  { "mov%LS",		{ eAX, Ov }, PREFIX_REX2_ILLEGAL },
> +  { "mov%LS",		{ { JMPABS_Fixup, eAX_reg }, { JMPABS_Fixup, v_mode } }, PREFIX_REX2_ILLEGAL },
>    { "mov%LB",		{ Ob, AL }, PREFIX_REX2_ILLEGAL },
>    { "mov%LS",		{ Ov, eAX }, PREFIX_REX2_ILLEGAL },
>    { "movs{b|}",		{ Ybr, Xb }, PREFIX_REX2_ILLEGAL },
> @@ -9699,7 +9700,7 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>      }
>  
>    if ((dp->prefix_requirement & PREFIX_REX2_ILLEGAL)
> -      && ins.last_rex2_prefix >= 0)
> +      && ins.last_rex2_prefix >= 0 && (ins.rex2 & REX2_SPECIAL) == 0)
>      {
>        i386_dis_printf (info, dis_style_text, "(bad)");
>        ret = ins.end_codep - priv.the_buffer;
> @@ -13942,3 +13943,35 @@ PUSH2_POP2_Fixup (instr_info *ins, int bytemode, int sizeflag)
>  
>    return OP_VEX (ins, bytemode, sizeflag);
>  }
> +
> +static bool
> +JMPABS_Fixup (instr_info *ins, int bytemode, int sizeflag)
> +{
> +  if (ins->last_rex2_prefix >= 0)
> +    {
> +      uint64_t op;
> +
> +      if ((ins->prefixes & (PREFIX_OPCODE | PREFIX_ADDR | PREFIX_LOCK)) != 0x0
> +	  || (ins->rex & REX_W) != 0x0)
> +	{
> +	  oappend (ins, "(bad)");
> +	  return true;
> +	}
> +
> +      if (bytemode == eAX_reg)
> +	return true;
> +
> +      if (!get64 (ins, &op))
> +	return false;
> +
> +      ins->mnemonicendp = stpcpy (ins->obuf, "jmpabs");
> +      ins->rex2 |= REX2_SPECIAL;
> +      oappend_immediate (ins, op);
> +
> +      return true;
> +    }
> +
> +  if (bytemode == eAX_reg)
> +    return OP_IMREG (ins, bytemode, sizeflag);
> +  return OP_OFF64 (ins, bytemode, sizeflag);
> +}
> -- 
> 2.25.1
> 

OK.

Thanks.

H.J.
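
For readers following the test files in this patch, here is a minimal
standalone sketch of the byte pattern they exercise: REX2 prefix 0xd5, a
payload byte, opcode 0xa1, then a 64-bit little-endian target.  It is not
binutils code.  The helper name decode_jmpabs is hypothetical, the W bit
position (0x08) is taken from the "REX2.W = 1" test vector, and the M0 bit
position (0x80) is an assumption about the REX2 payload layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of binutils: recognise the JMPABS byte
   pattern used in x86-64-apx-jmpabs.s and extract the 64-bit target.  */
static bool
decode_jmpabs (const uint8_t *buf, size_t len, uint64_t *target)
{
  assert (buf != NULL && target != NULL);

  /* REX2 prefix, payload byte, opcode a1, 8 immediate bytes.  A legacy
     prefix (66, 67, f2, f3, f0) in front means buf[0] != 0xd5, so those
     forms are rejected here, like the "(bad)" lines in the -inval test.  */
  if (len < 11 || buf[0] != 0xd5 || buf[2] != 0xa1)
    return false;

  /* REX2.W set (0x08) is rejected, as in the last -inval test vector.
     Assumption: M0 is payload bit 7 and would select another opcode map.  */
  if ((buf[1] & 0x08) != 0 || (buf[1] & 0x80) != 0)
    return false;

  /* Assemble the little-endian 64-bit immediate.  */
  uint64_t value = 0;
  for (int i = 7; i >= 0; i--)
    value = (value << 8) | buf[3 + i];

  *target = value;
  return true;
}
```

Fed the bytes from x86-64-apx-jmpabs.s, this yields target 0x2, which is
the operand shown in the expected "jmpabs" disassembly line.  The real
JMPABS_Fixup works on instr_info state rather than a raw buffer, so treat
this only as a reading aid for the test vectors.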


* RE: [PATCH V5 3/9] Support APX GPR32 with extend evex prefix
  2023-12-28  1:54   ` H.J. Lu
@ 2023-12-28 13:48     ` Cui, Lili
  0 siblings, 0 replies; 30+ messages in thread
From: Cui, Lili @ 2023-12-28 13:48 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils, Beulich, Jan

[-- Attachment #1: Type: text/plain, Size: 56831 bytes --]

This is what I checked in, thanks.

Lili.

> -----Original Message-----
> From: H.J. Lu <hjl.tools@gmail.com>
> Sent: Thursday, December 28, 2023 9:54 AM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: binutils@sourceware.org; Beulich, Jan <JBeulich@suse.com>
> Subject: Re: [PATCH V5 3/9] Support APX GPR32 with extend evex prefix
> 
> On Thu, Dec 28, 2023 at 01:27:08AM +0000, Cui, Lili wrote:
> > This patch adds the non-ND, non-NF forms of EVEX-promoted insns.
> >
> > EVEX extension of legacy instructions:
> >   All promoted legacy instructions are placed in EVEX map 4, which is
> >   currently reserved.
> > EVEX extension of EVEX instructions:
> >   All existing EVEX instructions are extended by APX using the extended
> >   EVEX prefix, so that they can access all 32 GPRs.
> > EVEX extension of VEX instructions:
> >   Promoting a VEX instruction into the EVEX space does not change the map
> >   id, the opcode, or the operand encoding of the VEX instruction.
> >
> > Note: The promoted versions of MOVBE will be extended to include the
> >   "MOVBE reg1, reg2" form.
> >
> >   gas/ChangeLog:
> >
> >   2023-12-28  Lingling Kong <lingling.kong@intel.com>
> > 	      H.J. Lu  <hongjiu.lu@intel.com>
> > 	      Lili Cui <lili.cui@intel.com>
> > 	      Lin Hu   <lin1.hu@intel.com>
> >
> > 	* config/tc-i386.c
> > 	(install_template): Handled APX combines.
> > 	(is_apx_evex_encoding): Test apx evex encoding.
> > 	(build_apx_evex_prefix): Enable APX evex prefix.
> > 	(md_assemble): Handle apx with evex encoding.
> > 	(process_suffix): Handle apx map4 prefix.
> > 	(check_register): Assign i.vec_encoding for APX evex instructions.
> > 	* testsuite/gas/i386/x86-64-evex.d: Adjust test cases.
> > 	* testsuite/gas/i386/x86-64.exp: Adjust x86-64-inval-movbe.
> >
> > opcodes/ChangeLog:
> >
> > 	* i386-dis-evex-len.h: Handle EVEX_LEN_0F38F2, EVEX_LEN_0F38F3.
> > 	* i386-dis-evex-prefix.h: Handle PREFIX_EVEX_0F38F2_L_0,
> > 	PREFIX_EVEX_0F38F3_L_0, PREFIX_EVEX_MAP4_D8,
> > 	PREFIX_EVEX_MAP4_DA, PREFIX_EVEX_MAP4_DB,
> > 	PREFIX_EVEX_MAP4_DC, PREFIX_EVEX_MAP4_DD,
> > 	PREFIX_EVEX_MAP4_DE, PREFIX_EVEX_MAP4_DF,
> > 	PREFIX_EVEX_MAP4_F0, PREFIX_EVEX_MAP4_F1,
> > 	PREFIX_EVEX_MAP4_F2, PREFIX_EVEX_MAP4_F8.
> > 	* i386-dis-evex-reg.h: Handle REG_EVEX_0F38F3_L_0_P_0.
> > 	* i386-dis-evex.h: Add EVEX_MAP4_ for legacy insns
> > 	promoted to APX to use gpr32.
> > 	* opcodes/i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F90,
> > 	X86_64_EVEX_0F92, X86_64_EVEX_0F93, X86_64_EVEX_0F38F2,
> > 	X86_64_EVEX_0F38F3, X86_64_EVEX_0F38F5, X86_64_EVEX_0F38F6,
> > 	X86_64_EVEX_0F38F7, X86_64_EVEX_0F3AF0, X86_64_EVEX_0F91.
> > 	* i386-dis.c
> > 	(struct instr_info): Deleted bool r.
> > 	(PREFIX_NP_OR_DATA): New.
> > 	(NO_PREFIX): New.
> > 	(putop): Ditto.
> > 	(X86_64_EVEX_FROM_VEX_TABLE): Ditto.
> > 	(get_valid_dis386): Decode insn erex in extend evex prefix.
> > 	Handle EVEX_MAP4
> > 	(print_insn): Handle PREFIX_DATA_AND_NP_ONLY.
> > 	(print_register): Handle apx instructions decode.
> > 	(OP_E_memory): Ditto.
> > 	(OP_G): Ditto.
> > 	(OP_XMM): Ditto.
> > 	(DistinctDest_Fixup): Ditto.
> > 	* i386-gen.c (process_i386_opcode_modifier): Add EVEXMAP4.
> > 	* i386-opc.h (SPACE_EVEXMAP4): Add for legacy insns
> > 	promoted to EVEX.
> > 	* i386-opc.tbl: Handle some legacy and VEX insns that don't
> > 	support gpr32.  Add some legacy insns (map2 / 3) promoted
> > 	to EVEX.
> > ---
> >  gas/config/tc-i386.c                 |  72 +++++++++++-
> >  gas/testsuite/gas/i386/x86-64-evex.d |   2 +-
> >  gas/testsuite/gas/i386/x86-64.exp    |   2 +-
> >  opcodes/i386-dis-evex-prefix.h       |  58 ++++++++++
> >  opcodes/i386-dis-evex-x86-64.h       |  50 +++++++++
> >  opcodes/i386-dis-evex.h              |  94 ++++++++--------
> >  opcodes/i386-dis.c                   | 160 +++++++++++++++++++++++----
> >  opcodes/i386-gen.c                   |   2 +
> >  opcodes/i386-opc.h                   |   6 +
> >  opcodes/i386-opc.tbl                 |  90 ++++++++++-----
> >  10 files changed, 433 insertions(+), 103 deletions(-)
> >  create mode 100644 opcodes/i386-dis-evex-x86-64.h
> >
> > diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> > index bb302f28add..7e62d08e9bd 100644
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -435,6 +435,9 @@ struct _i386_insn
> >      /* Prefer the REX2 prefix in encoding.  */
> >      bool rex2_encoding;
> >
> > +    /* Need to use an Egpr capable encoding (REX2 or EVEX).  */
> > +    bool has_egpr;
> > +
> >      /* Disable instruction size optimization.  */
> >      bool no_optimize;
> >
> > @@ -3676,12 +3679,12 @@ install_template (const insn_template *t)
> >
> >    /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
> >    if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> > -  {
> > +    {
> >        if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> >  	   || maybe_cpu (t, CpuFMA))
> >  	  && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
> >  	{
> > -	  if (need_evex_encoding ())
> > +	  if (need_evex_encoding () || i.has_egpr)
> >  	    {
> >  	      i.tm.opcode_modifier.vex = 0;
> >  	      i.tm.cpu.bitfield.cpuavx512f = i.tm.cpu_any.bitfield.cpuavx512f;
> > @@ -3698,7 +3701,19 @@ install_template (const insn_template *t)
> >  		gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa);
> >  	    }
> >  	}
> > -  }
> > +
> > +      if ((maybe_cpu (t, CpuCMPCCXADD) || maybe_cpu (t, CpuAMX_TILE)
> > +	   || maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512DQ)
> > +	   || maybe_cpu (t, CpuAVX512BW) || maybe_cpu (t, CpuBMI)
> > +	   || maybe_cpu (t, CpuBMI2))
> > +	  && maybe_cpu (t, CpuAPX_F))
> > +	{
> > +	  if (need_evex_encoding () || i.has_egpr)
> > +	    i.tm.opcode_modifier.vex = 0;
> > +	  else
> > +	    i.tm.opcode_modifier.evex = 0;
> > +	}
> > +    }
> >
> >    /* Note that for pseudo prefixes this produces a length of 1. But for them
> >       the length isn't interesting at all.  */
> > @@ -3879,6 +3894,15 @@ is_any_vex_encoding (const insn_template *t)
> >    return t->opcode_modifier.vex || t->opcode_modifier.evex;
> >  }
> >
> > +/* We can use this function only when the current encoding is evex.  */
> > +static INLINE bool
> > +is_apx_evex_encoding (void)
> > +{
> > +  return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
> > +    || (i.vex.register_specifier
> > +	&& (i.vex.register_specifier->reg_flags & RegRex2));
> > +}
> > +
> >  static INLINE bool
> >  is_apx_rex2_encoding (void)
> >  {
> > @@ -4156,6 +4180,27 @@ build_rex2_prefix (void)
> >  		    | (i.rex2 << 4) | i.rex);
> >  }
> >
> > +/* Build the EVEX prefix (4-byte) for evex insn
> > +   | 62h |
> > +   | `R`X`B`R' | B'mmm |
> > +   | W | v`v`v`v | `x' | pp |
> > +   | z| L'L | b | `v | aaa |
> > +*/
> > +static void
> > +build_apx_evex_prefix (void)
> > +{
> > +  build_evex_prefix ();
> > +  if (i.rex2 & REX_R)
> > +    i.vex.bytes[1] &= ~0x10;
> > +  if (i.rex2 & REX_B)
> > +    i.vex.bytes[1] |= 0x08;
> > +  if (i.rex2 & REX_X)
> > +    i.vex.bytes[2] &= ~0x04;
> > +  if (i.vex.register_specifier
> > +      && i.vex.register_specifier->reg_flags & RegRex2)
> > +    i.vex.bytes[3] &= ~0x08;
> > +}
> > +
> >  static void establish_rex (void)
> >  {
> >    /* Note that legacy encodings have at most 2 non-immediate operands.  */
> > @@ -5723,13 +5768,18 @@ md_assemble (char *line)
> >  	  return;
> >  	}
> >
> > -      if (i.tm.opcode_modifier.vex)
> > +      if (is_apx_evex_encoding ())
> > +	build_apx_evex_prefix ();
> > +      else if (i.tm.opcode_modifier.vex)
> >  	build_vex_prefix (t);
> >        else
> >  	build_evex_prefix ();
> >
> >        /* The individual REX.RXBW bits got consumed.  */
> >        i.rex &= REX_OPCODE;
> > +
> > +      /* The rex2 bits got consumed.  */
> > +      i.rex2 = 0;
> >      }
> >
> >    /* Handle conversion of 'int $3' --> special int3 insn.  */
> > @@ -8084,7 +8134,8 @@ process_suffix (void)
> >        if (i.suffix != QWORD_MNEM_SUFFIX
> >  	  && i.tm.opcode_modifier.mnemonicsize != IGNORESIZE
> >  	  && !i.tm.opcode_modifier.floatmf
> > -	  && !is_any_vex_encoding (&i.tm)
> > +	  && (!is_any_vex_encoding (&i.tm)
> > +	      || i.tm.opcode_space == SPACE_EVEXMAP4)
> >  	  && ((i.suffix == LONG_MNEM_SUFFIX) == (flag_code == CODE_16BIT)
> >  	      || (flag_code == CODE_64BIT
> >  		  && i.tm.opcode_modifier.jump == JUMP_BYTE)))
> > @@ -8094,7 +8145,14 @@ process_suffix (void)
> >  	  if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */
> >  	    prefix = ADDR_PREFIX_OPCODE;
> >
> > -	  if (!add_prefix (prefix))
> > +	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
> > +	     needs to be adjusted.  */
> > +	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
> > +	    {
> > +	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
> > +	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
> > +	    }
> > +	  else if (!add_prefix (prefix))
> >  	    return 0;
> >  	}
> >
> > @@ -14300,6 +14358,8 @@ static bool check_register (const reg_entry *r)
> >        if (!cpu_arch_flags.bitfield.cpuapx_f
> >  	  || flag_code != CODE_64BIT)
> >  	return false;
> > +
> > +      i.has_egpr = true;
> >      }
> >
> >    if (((r->reg_flags & (RegRex64 | RegRex)) || r->reg_type.bitfield.qword)
> > diff --git a/gas/testsuite/gas/i386/x86-64-evex.d b/gas/testsuite/gas/i386/x86-64-evex.d
> > index 041747db892..5d974c312da 100644
> > --- a/gas/testsuite/gas/i386/x86-64-evex.d
> > +++ b/gas/testsuite/gas/i386/x86-64-evex.d
> > @@ -17,6 +17,6 @@ Disassembly of section .text:
> >   +[a-f0-9]+:	62 f1 d6 38 7b f0    	vcvtusi2ss %rax,\{rd-sae\},%xmm5,%xmm6
> >   +[a-f0-9]+:	62 f1 57 38 7b f0    	vcvtusi2sd %eax,\{rd-bad\},%xmm5,%xmm6
> >   +[a-f0-9]+:	62 f1 d7 38 7b f0    	vcvtusi2sd %rax,\{rd-sae\},%xmm5,%xmm6
> > - +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,\(bad\)
> > + +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,%r16d
> >   +[a-f0-9]+:	62 e1 7c 08 c2 c0 00 	vcmpeqps %xmm0,%xmm0,\(bad\)
> >  #pass
> > diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> > index 91c068d5b40..ffacc9c8e2b 100644
> > --- a/gas/testsuite/gas/i386/x86-64.exp
> > +++ b/gas/testsuite/gas/i386/x86-64.exp
> > @@ -250,7 +250,7 @@ run_dump_test "x86-64-sse-noavx"
> >  run_dump_test "x86-64-movbe"
> >  run_dump_test "x86-64-movbe-intel"
> >  run_dump_test "x86-64-movbe-suffix"
> > -run_list_test "x86-64-inval-movbe" "-al"
> > +run_list_test "x86-64-inval-movbe" "-march=+noapx_f -al"
> >  run_dump_test "x86-64-ept"
> >  run_dump_test "x86-64-ept-intel"
> >  run_list_test "x86-64-inval-ept" "-al"
> > diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
> > index 28da54922c7..54ed48c6952 100644
> > --- a/opcodes/i386-dis-evex-prefix.h
> > +++ b/opcodes/i386-dis-evex-prefix.h
> > @@ -338,6 +338,64 @@
> >      { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
> >      { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
> >    },
> > +  /* PREFIX_EVEX_MAP4_D8 */
> > +  {
> > +    { "sha1nexte", { XM, EXxmm }, 0 },
> > +    { REG_TABLE (REG_0F38D8_PREFIX_1) },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DA */
> > +  {
> > +    { "sha1msg2", { XM, EXxmm }, 0 },
> > +    { "encodekey128", { Gd, Rd }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DB */
> > +  {
> > +    { "sha256rnds2", { XM, EXxmm, XMM0 }, 0 },
> > +    { "encodekey256", { Gd, Rd }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DC */
> > +  {
> > +    { "sha256msg1", { XM, EXxmm }, 0 },
> > +    { "aesenc128kl", { XM, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DD */
> > +  {
> > +    { "sha256msg2", { XM, EXxmm }, 0 },
> > +    { "aesdec128kl", { XM, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DE */
> > +  {
> > +    { Bad_Opcode },
> > +    { "aesenc256kl", { XM, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DF */
> > +  {
> > +    { Bad_Opcode },
> > +    { "aesdec256kl", { XM, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_F0 */
> > +  {
> > +    { "crc32A", { Gdq, Eb }, 0 },
> > +    { "invept", { Gm, Mo }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_F1 */
> > +  {
> > +    { "crc32Q", { Gdq, Ev }, 0 },
> > +    { "invvpid", { Gm, Mo }, 0 },
> > +    { "crc32Q", { Gdq, Ev }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_F2 */
> > +  {
> > +    { Bad_Opcode },
> > +    { "invpcid", { Gm, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_F8 */
> > +  {
> > +    { Bad_Opcode },
> > +    { "enqcmds", { Gva, M },  0 },
> > +    { "movdir64b", { Gva, M }, 0 },
> > +    { "enqcmd", { Gva, M }, 0 },
> > +  },
> >    /* PREFIX_EVEX_MAP5_10 */
> >    {
> >      { Bad_Opcode },
> > diff --git a/opcodes/i386-dis-evex-x86-64.h b/opcodes/i386-dis-evex-x86-64.h
> > new file mode 100644
> > index 00000000000..0d9d98a7691
> > --- /dev/null
> > +++ b/opcodes/i386-dis-evex-x86-64.h
> > @@ -0,0 +1,50 @@
> > +  /* X86_64_EVEX_0F90 */
> > +  {
> > +    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F90_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F91 */
> > +  {
> > +    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F91_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F92 */
> > +  {
> > +    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F92_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F93 */
> > +  {
> > +    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F93_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F2 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F3 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F5 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F5_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F6 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE(PREFIX_VEX_0F38F6_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F7 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE(PREFIX_VEX_0F38F7_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F3AF0 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) },
> > +  },
> > diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
> > index 7ad1edbe72d..90c063b2188 100644
> > --- a/opcodes/i386-dis-evex.h
> > +++ b/opcodes/i386-dis-evex.h
> > @@ -164,10 +164,10 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* 90 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F90) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F91) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F92) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F93) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -375,9 +375,9 @@ static const struct dis386 evex_table[][256] = {
> >      { "vpsllv%DQ",	{ XM, Vex, EXx }, PREFIX_DATA },
> >      /* 48 */
> >      { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F3849) },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F384B) },
> >      { "vrcp14p%XW",	{ XM, EXx }, PREFIX_DATA },
> >      { "vrcp14s%XW",	{ XMScalar, VexScalar, EXdq }, PREFIX_DATA },
> >      { "vrsqrt14p%XW",	{ XM, EXx }, 0 },
> > @@ -545,32 +545,32 @@ static const struct dis386 evex_table[][256] = {
> >      { "%XEvaesdecY",	{ XM, Vex, EXx }, PREFIX_DATA },
> >      { "%XEvaesdeclastY", { XM, Vex, EXx }, PREFIX_DATA },
> >      /* E0 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E0) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E1) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E2) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E3) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E4) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E5) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E6) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E7) },
> >      /* E8 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E8) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E9) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EA) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EB) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EC) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38ED) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EE) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EF) },
> >      /* F0 */
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F2) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F3) },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F5) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F6) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F7) },
> >      /* F8 */
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -854,7 +854,7 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* F0 */
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F3AF0) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -983,13 +983,13 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* 60 */
> > +    { "movbeS",	{ Gv, Ev }, PREFIX_NP_OR_DATA },
> > +    { "movbeS",	{ Ev, Gv }, PREFIX_NP_OR_DATA },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { "wrussK",	{ M, Gdq }, PREFIX_DATA },
> > +    { PREFIX_TABLE (PREFIX_0F38F6) },
> >      { Bad_Opcode },
> >      /* 68 */
> >      { Bad_Opcode },
> > @@ -1113,19 +1113,19 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { "sha1rnds4",	{ XM, EXxmm, Ib }, NO_PREFIX },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* D8 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_D8) },
> > +    { "sha1msg1",	{ XM, EXxmm }, NO_PREFIX },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DA) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DB) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DC) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DD) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DE) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DF) },
> >      /* E0 */
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -1145,20 +1145,20 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* F0 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F0) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F1) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F2) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* F8 */
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
> > +    { "movdiri",	{ Mdq, Gdq }, NO_PREFIX },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_0F38FC) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> > index e006d869258..d4d32befcf9 100644
> > --- a/opcodes/i386-dis.c
> > +++ b/opcodes/i386-dis.c
> > @@ -132,6 +132,13 @@ enum x86_64_isa
> >    intel64
> >  };
> >
> > +enum evex_type
> > +{
> > +  evex_default = 0,
> > +  evex_from_legacy,
> > +  evex_from_vex,
> > +};
> > +
> >  struct instr_info
> >  {
> >    enum address_mode address_mode;
> > @@ -212,7 +219,6 @@ struct instr_info
> >      int ll;
> >      bool w;
> >      bool evex;
> > -    bool r;
> >      bool v;
> >      bool zeroing;
> >      bool b;
> > @@ -220,6 +226,8 @@ struct instr_info
> >    }
> >    vex;
> >
> > +  enum evex_type evex_type;
> > +
> >    /* Remember if the current op is a jump instruction.  */
> >    bool op_is_jump;
> >
> > @@ -303,6 +311,8 @@ struct dis_private {
> >  #define PREFIX_ADDR 0x400
> >  #define PREFIX_FWAIT 0x800
> >  #define PREFIX_REX2 0x1000
> > +#define PREFIX_NP_OR_DATA 0x2000
> > +#define NO_PREFIX   0x4000
> >
> >  /* Make sure that bytes from INFO->PRIVATE_DATA->BUFFER (inclusive)
> >     to ADDR (exclusive) are valid.  Returns true for success, false
> > @@ -800,6 +810,7 @@ enum
> >    USE_RM_TABLE,
> >    USE_PREFIX_TABLE,
> >    USE_X86_64_TABLE,
> > +  USE_X86_64_EVEX_FROM_VEX_TABLE,
> >    USE_3BYTE_TABLE,
> >    USE_XOP_8F_TABLE,
> >    USE_VEX_C4_TABLE,
> > @@ -818,6 +829,8 @@ enum
> >  #define RM_TABLE(I)		DIS386 (USE_RM_TABLE, (I))
> >  #define PREFIX_TABLE(I)		DIS386 (USE_PREFIX_TABLE, (I))
> >  #define X86_64_TABLE(I)		DIS386 (USE_X86_64_TABLE, (I))
> > +#define X86_64_EVEX_FROM_VEX_TABLE(I) \
> > +  DIS386 (USE_X86_64_EVEX_FROM_VEX_TABLE, (I))
> >  #define THREE_BYTE_TABLE(I)	DIS386 (USE_3BYTE_TABLE, (I))
> >  #define XOP_8F_TABLE()		DIS386 (USE_XOP_8F_TABLE, 0)
> >  #define VEX_C4_TABLE()		DIS386 (USE_VEX_C4_TABLE, 0)
> > @@ -866,7 +879,7 @@ enum
> >    REG_VEX_0F73,
> >    REG_VEX_0FAE,
> >    REG_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0,
> > -  REG_VEX_0F38F3_L_0,
> > +  REG_VEX_0F38F3_L_0_P_0,
> >    REG_VEX_MAP7_F8_L_0_W_0,
> >
> >    REG_XOP_09_01_L_0,
> > @@ -878,7 +891,7 @@ enum
> >    REG_EVEX_0F72,
> >    REG_EVEX_0F73,
> >    REG_EVEX_0F38C6_L_2,
> > -  REG_EVEX_0F38C7_L_2
> > +  REG_EVEX_0F38C7_L_2,
> >  };
> >
> >  enum
> > @@ -1094,6 +1107,8 @@ enum
> >    PREFIX_VEX_0F38CC,
> >    PREFIX_VEX_0F38CD,
> >    PREFIX_VEX_0F38DA_W_0,
> > +  PREFIX_VEX_0F38F2_L_0,
> > +  PREFIX_VEX_0F38F3_L_0,
> >    PREFIX_VEX_0F38F5_L_0,
> >    PREFIX_VEX_0F38F6_L_0,
> >    PREFIX_VEX_0F38F7_L_0,
> > @@ -1156,6 +1171,18 @@ enum
> >    PREFIX_EVEX_0F3A67,
> >    PREFIX_EVEX_0F3AC2,
> >
> > +  PREFIX_EVEX_MAP4_D8,
> > +  PREFIX_EVEX_MAP4_DA,
> > +  PREFIX_EVEX_MAP4_DB,
> > +  PREFIX_EVEX_MAP4_DC,
> > +  PREFIX_EVEX_MAP4_DD,
> > +  PREFIX_EVEX_MAP4_DE,
> > +  PREFIX_EVEX_MAP4_DF,
> > +  PREFIX_EVEX_MAP4_F0,
> > +  PREFIX_EVEX_MAP4_F1,
> > +  PREFIX_EVEX_MAP4_F2,
> > +  PREFIX_EVEX_MAP4_F8,
> > +
> >    PREFIX_EVEX_MAP5_10,
> >    PREFIX_EVEX_MAP5_11,
> >    PREFIX_EVEX_MAP5_1D,
> > @@ -1267,7 +1294,19 @@ enum
> >    X86_64_VEX_0F38ED,
> >    X86_64_VEX_0F38EE,
> >    X86_64_VEX_0F38EF,
> > +
> >    X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
> > +
> > +  X86_64_EVEX_0F90,
> > +  X86_64_EVEX_0F91,
> > +  X86_64_EVEX_0F92,
> > +  X86_64_EVEX_0F93,
> > +  X86_64_EVEX_0F38F2,
> > +  X86_64_EVEX_0F38F3,
> > +  X86_64_EVEX_0F38F5,
> > +  X86_64_EVEX_0F38F6,
> > +  X86_64_EVEX_0F38F7,
> > +  X86_64_EVEX_0F3AF0,
> >  };
> >
> >  enum
> > @@ -2882,12 +2921,12 @@ static const struct dis386 reg_table[][8] = {
> >    {
> >      { RM_TABLE (RM_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0_R_0) },
> >    },
> > -  /* REG_VEX_0F38F3_L_0 */
> > +  /* REG_VEX_0F38F3_L_0_P_0 */
> >    {
> >      { Bad_Opcode },
> > -    { "blsrS",		{ VexGdq, Edq }, PREFIX_OPCODE },
> > -    { "blsmskS",	{ VexGdq, Edq }, PREFIX_OPCODE },
> > -    { "blsiS",		{ VexGdq, Edq }, PREFIX_OPCODE },
> > +    { "blsrS",		{ VexGdq, Edq }, 0 },
> > +    { "blsmskS",	{ VexGdq, Edq }, 0 },
> > +    { "blsiS",		{ VexGdq, Edq }, 0 },
> >    },
> >    /* REG_VEX_MAP7_F8_L_0_W_0 */
> >    {
> > @@ -4035,6 +4074,16 @@ static const struct dis386 prefix_table[][4] = {
> >      { "vsm4rnds4", { XM, Vex, EXx }, 0 },
> >    },
> >
> > +  /* PREFIX_VEX_0F38F2_L_0 */
> > +  {
> > +    { "andnS",          { Gdq, VexGdq, Edq }, 0 },
> > +  },
> > +
> > +  /* PREFIX_VEX_0F38F3_L_0 */
> > +  {
> > +    { REG_TABLE (REG_VEX_0F38F3_L_0_P_0) },
> > +  },
> > +
> >    /* PREFIX_VEX_0F38F5_L_0 */
> >    {
> >      { "bzhiS",		{ Gdq, Edq, VexGdq }, 0 },
> > @@ -4527,6 +4576,7 @@ static const struct dis386 x86_64_table[][2] = {
> >      { PREFIX_TABLE (PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64) },
> >    },
> >
> > +#include "i386-dis-evex-x86-64.h"
> >  };
> >
> >  static const struct dis386 three_byte_table[][256] = {
> > @@ -7113,12 +7163,12 @@ static const struct dis386 vex_len_table[][2] = {
> >
> >    /* VEX_LEN_0F38F2 */
> >    {
> > -    { "andnS",		{ Gdq, VexGdq, Edq }, PREFIX_OPCODE },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
> >    },
> >
> >    /* VEX_LEN_0F38F3 */
> >    {
> > -    { REG_TABLE(REG_VEX_0F38F3_L_0) },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
> >    },
> >
> >    /* VEX_LEN_0F38F5 */
> > @@ -8732,6 +8782,17 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
> >        dp = &prefix_table[dp->op[1].bytemode][vindex];
> >        break;
> >
> > +    case USE_X86_64_EVEX_FROM_VEX_TABLE:
> > +      ins->evex_type = evex_from_vex;
> > +      /* EVEX from VEX instructions require that EVEX.z, EVEX.L’L, EVEX.b and
> > +	 the lower 2 bits of EVEX.aaa must be 0.  */
> > +      if ((ins->vex.mask_register_specifier & 0x3) != 0
> > +	  || ins->vex.ll != 0
> > +	  || ins->vex.zeroing != 0
> > +	  || ins->vex.b)
> > +	return &bad_opcode;
> > +
> > +      /* Fall through.  */
> >      case USE_X86_64_TABLE:
> >        vindex = ins->address_mode == mode_64bit ? 1 : 0;
> >        dp = &x86_64_table[dp->op[1].bytemode][vindex];
> > @@ -8977,9 +9038,13 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
> >        if (!fetch_code (ins->info, ins->codep + 4))
> >  	return &err_opcode;
> >        /* The first byte after 0x62.  */
> > +      if (*ins->codep & 0x8)
> > +	ins->rex2 |= REX_B;
> > +      if (!(*ins->codep & 0x10))
> > +	ins->rex2 |= REX_R;
> > +
> >        ins->rex = ~(*ins->codep >> 5) & 0x7;
> > -      ins->vex.r = *ins->codep & 0x10;
> > -      switch ((*ins->codep & 0xf))
> > +      switch (*ins->codep & 0x7)
> >  	{
> >  	default:
> >  	  return &bad_opcode;
> > @@ -8992,6 +9057,12 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
> >  	case 0x3:
> >  	  vex_table_index = EVEX_0F3A;
> >  	  break;
> > +	case 0x4:
> > +	  vex_table_index = EVEX_MAP4;
> > +	  ins->evex_type = evex_from_legacy;
> > +	  if (ins->address_mode != mode_64bit)
> > +	    return &bad_opcode;
> > +	  break;
> >  	case 0x5:
> >  	  vex_table_index = EVEX_MAP5;
> >  	  break;
> > @@ -9008,9 +9079,8 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
> >
> >        ins->vex.register_specifier = (~(*ins->codep >> 3)) & 0xf;
> >
> > -      /* The U bit.  */
> >        if (!(*ins->codep & 0x4))
> > -	return &bad_opcode;
> > +	ins->rex2 |= REX_X;
> >
> >        switch ((*ins->codep & 0x3))
> >  	{
> > @@ -9040,12 +9110,26 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
> >
> >        if (ins->address_mode != mode_64bit)
> >  	{
> > +	  /* Report bad for !evex_default and when two fixed values of evex
> > +	     change.  */
> > +	  if (ins->evex_type != evex_default
> > +	      || (ins->rex2 & (REX_B | REX_X)))
> > +	    return &bad_opcode;
> >  	  /* In 16/32-bit mode silently ignore following bits.  */
> >  	  ins->rex &= ~REX_B;
> > -	  ins->vex.r = true;
> > +	  ins->rex2 &= ~REX_R;
> >  	}
> >
> >        ins->need_vex = 4;
> > +
> > +      /* EVEX from legacy instructions require that EVEX.z, EVEX.L’L and the
> > +	 lower 2 bits of EVEX.aaa must be 0.  */
> > +      if (ins->evex_type == evex_from_legacy
> > +	  && ((ins->vex.mask_register_specifier & 0x3) != 0
> > +	      || ins->vex.ll != 0
> > +	      || ins->vex.zeroing != 0))
> > +	return &bad_opcode;
> > +
> >        ins->codep++;
> >        vindex = *ins->codep++;
> >        dp = &evex_table[vex_table_index][vindex];
> > @@ -9460,6 +9544,13 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
> >        dp = get_valid_dis386 (dp, &ins);
> >        if (dp == &err_opcode)
> >  	goto fetch_error_out;
> > +
> > +      /* For APX instructions promoted from legacy maps 0/1, embedded prefix
> > +	 is interpreted as the operand size override.  */
> > +      if (ins.evex_type == evex_from_legacy
> > +	  && ins.vex.prefix == DATA_PREFIX_OPCODE)
> > +	sizeflag ^= DFLAG;
> > +
> >        if (dp != NULL && putop (&ins, dp->name, sizeflag) == 0)
> >  	{
> >  	  if (!get_sib (&ins, sizeflag))
> > @@ -9639,6 +9730,25 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
> >        if (ins.last_repnz_prefix >= 0)
> >  	ins.all_prefixes[ins.last_repnz_prefix] = 0xf2;
> >        break;
> > +
> > +    case PREFIX_NP_OR_DATA:
> > +      if (ins.vex.prefix == REPE_PREFIX_OPCODE
> > +	  || ins.vex.prefix == REPNE_PREFIX_OPCODE)
> > +	{
> > +	  i386_dis_printf (info, dis_style_text, "(bad)");
> > +	  ret = ins.end_codep - priv.the_buffer;
> > +	  goto out;
> > +	}
> > +      break;
> > +
> > +    case NO_PREFIX:
> > +      if (ins.vex.prefix)
> > +	{
> > +	  i386_dis_printf (info, dis_style_text, "(bad)");
> > +	  ret = ins.end_codep - priv.the_buffer;
> > +	  goto out;
> > +	}
> > +      break;
> >      }
> >
> >    /* Check if the REX prefix is used.  */
> > @@ -10348,7 +10458,7 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
> >  		{
> >  		case 'X':
> >  		  if (!ins->vex.evex || ins->vex.b || ins->vex.ll >= 2
> > -		      || !ins->vex.r
> > +		      || (ins->rex2 & REX_R)
> >  		      || (ins->modrm.mod == 3 && (ins->rex & REX_X))
> >  		      || !ins->vex.v || ins->vex.mask_register_specifier)
> >  		    break;
> > @@ -11459,7 +11569,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
> >
> >    add += (ins->rex2 & REX_B) ? 16 : 0;
> >
> > -  if (ins->vex.evex)
> > +  if (ins->vex.evex && ins->evex_type == evex_default)
> >      {
> >
> >        /* Zeroing-masking is invalid for memory destinations. Set the flag
> > @@ -11603,6 +11713,13 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
> >  		abort ();
> >  	      if (ins->vex.evex)
> >  		{
> > +		  /* S/G EVEX insns require EVEX.X4 not to be set.  */
> > +		  if (ins->rex2 & REX_X)
> > +		    {
> > +		      oappend (ins, "(bad)");
> > +		      return true;
> > +		    }
> > +
> >  		  if (!ins->vex.v)
> >  		    vindex += 16;
> >  		  check_gather = ins->obufp == ins->op_out[1];
> > @@ -11805,7 +11922,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
> >
> >  	      if (ins->rex & REX_R)
> >  	        modrm_reg += 8;
> > -	      if (!ins->vex.r)
> > +	      if (ins->rex2 & REX_R)
> >  	        modrm_reg += 16;
> >  	      if (vindex == modrm_reg)
> >  		oappend (ins, "/(bad)");
> > @@ -12011,10 +12128,7 @@ OP_indirE (instr_info *ins, int bytemode, int sizeflag)
> >  static bool
> >  OP_G (instr_info *ins, int bytemode, int sizeflag)
> >  {
> > -  if (ins->vex.evex && !ins->vex.r && ins->address_mode == mode_64bit)
> > -    oappend (ins, "(bad)");
> > -  else
> > -    print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
> > +  print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
> >    return true;
> >  }
> >
> > @@ -12645,7 +12759,7 @@ OP_XMM (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
> >      reg += 8;
> >    if (ins->vex.evex)
> >      {
> > -      if (!ins->vex.r)
> > +      if (ins->rex2 & REX_R)
> >  	reg += 16;
> >      }
> >
> > @@ -13652,7 +13766,7 @@ DistinctDest_Fixup (instr_info *ins, int bytemode, int sizeflag)
> >    /* Calc destination register number.  */
> >    if (ins->rex & REX_R)
> >      modrm_reg += 8;
> > -  if (!ins->vex.r)
> > +  if (ins->rex2 & REX_R)
> >      modrm_reg += 16;
> >
> >    /* Calc src1 register number.  */
> > diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> > index dd4850e1855..508b441a343 100644
> > --- a/opcodes/i386-gen.c
> > +++ b/opcodes/i386-gen.c
> > @@ -487,6 +487,7 @@ static bitfield opcode_modifiers[] =
> >    BITFIELD (Dialect),
> >    BITFIELD (ISA64),
> >    BITFIELD (NoEgpr),
> > +  BITFIELD (NF),
> >  };
> >
> >  #define CLASS(n) #n, n
> > @@ -1120,6 +1121,7 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
> >      SPACE(0F),
> >      SPACE(0F38),
> >      SPACE(0F3A),
> > +    SPACE(EVEXMAP4),
> >      SPACE(EVEXMAP5),
> >      SPACE(EVEXMAP6),
> >      SPACE(VEXMAP7),
> > diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> > index 8c967ea90b0..064ec48edad 100644
> > --- a/opcodes/i386-opc.h
> > +++ b/opcodes/i386-opc.h
> > @@ -743,6 +743,9 @@ enum
> >       whether the instruction supports pseudo-prefix {rex2}.  */
> >    NoEgpr,
> >
> > +  /* No CSPAZO flags update indication.  */
> > +  NF,
> > +
> >    /* The last bitfield in i386_opcode_modifier.  */
> >    Opcode_Modifier_Num
> >  };
> > @@ -788,6 +791,7 @@ typedef struct i386_opcode_modifier
> >    unsigned int dialect:2;
> >    unsigned int isa64:2;
> >    unsigned int noegpr:1;
> > +  unsigned int nf:1;
> >  } i386_opcode_modifier;
> >
> >  /* Operand classes.  */
> > @@ -963,6 +967,7 @@ typedef struct insn_template
> >       1: 0F opcode prefix / space.
> >       2: 0F38 opcode prefix / space.
> >       3: 0F3A opcode prefix / space.
> > +     4: EVEXMAP4 opcode prefix / space.
> >       5: EVEXMAP5 opcode prefix / space.
> >       6: EVEXMAP6 opcode prefix / space.
> >       7: VEXMAP7 opcode prefix / space.
> > @@ -974,6 +979,7 @@ typedef struct insn_template
> >  #define SPACE_0F	1
> >  #define SPACE_0F38	2
> >  #define SPACE_0F3A	3
> > +#define SPACE_EVEXMAP4	4
> >  #define SPACE_EVEXMAP5	5
> >  #define SPACE_EVEXMAP6	6
> >  #define SPACE_VEXMAP7	7
> > diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> > index 37d3e8663bb..11b8c0b63cb 100644
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -113,6 +113,7 @@
> >  #define SpaceXOP09 OpcodeSpace=SPACE_XOP09
> >  #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
> >
> > +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
> >  #define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
> >  #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
> >
> > @@ -139,6 +140,9 @@
> >
> >  #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
> >
> > +// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
> > +#define APX_F(cpuid) cpuid&(cpuid|APX_F)
> > +
> >  // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
> >  // the bit to mark commutative VEX encodings where swapping the source
> >  // operands may allow to switch from 3-byte to 2-byte VEX encoding.
> > @@ -194,6 +198,7 @@ mov, 0xf24, i386&No64,
> D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Te
> >
> >  // Move after swapping the bytes
> >  movbe, 0x0f38f0, Movbe,
> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf,
> { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > +movbe, 0x60, Movbe&APX_F,
> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4,
> { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> >
> >  // Move with sign extend.
> >  movsb, 0xfbe, i386, Modrm|No_bSuf|No_sSuf,
> { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > @@ -1315,13 +1320,16 @@ getsec, 0xf37, SMX, NoSuf, {}
> >
> >  invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf,
> { Oword|Unspecified|BaseIndex, Reg32 }
> >  invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64,
> { Oword|Unspecified|BaseIndex, Reg64 }
> > +invept, 0xf3f0, EPT&APX_F, Modrm|NoSuf|EVexMap4,
> { Oword|Unspecified|BaseIndex, Reg64 }
> >  invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf,
> { Oword|Unspecified|BaseIndex, Reg32 }
> >  invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64,
> { Oword|Unspecified|BaseIndex, Reg64 }
> > +invvpid, 0xf3f1, EPT&APX_F, Modrm|NoSuf|EVexMap4,
> { Oword|Unspecified|BaseIndex, Reg64 }
> >
> >  // INVPCID instruction
> >
> >  invpcid, 0x660f3882, INVPCID&No64, Modrm|IgnoreSize|NoSuf,
> { Oword|Unspecified|BaseIndex, Reg32 }
> >  invpcid, 0x660f3882, INVPCID&x64, Modrm|NoSuf|NoRex64,
> { Oword|Unspecified|BaseIndex, Reg64 }
> > +invpcid, 0xf3f2, INVPCID&APX_F, Modrm|NoSuf|EVexMap4,
> { Oword|Unspecified|BaseIndex, Reg64 }
> >
> >  // SSSE3 instructions.
> >
> > @@ -1422,6 +1430,8 @@ pcmpistri<sse42>, 0x660f3a63, <sse42:cpu>,
> Modrm|<sse42:attr>|NoSuf, { Imm8, Reg
> >  pcmpistrm<sse42>, 0x660f3a62, <sse42:cpu>,
> Modrm|<sse42:attr>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex,
> RegXMM }
> >  crc32, 0xf20f38f0, SSE4_2, W|Modrm|No_sSuf|No_qSuf,
> { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
> >  crc32, 0xf20f38f0, SSE4_2&x64, W|Modrm|No_wSuf|No_lSuf|No_sSuf,
> { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
> > +crc32, 0xf0, APX_F, W|Modrm|No_sSuf|No_qSuf|EVexMap4,
> { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
> > +crc32, 0xf0, APX_F, W|Modrm|No_wSuf|No_lSuf|No_sSuf|EVexMap4,
> { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
> >
> >  // xsave/xrstor New Instructions.
> >
> > @@ -1836,14 +1846,14 @@ xtest, 0xf01d6, HLE|RTM, NoSuf, {}
> >
> >  // BMI2 instructions.
> >
> > -bzhi, 0xf5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -mulx, 0xf2f6, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> > -pdep, 0xf2f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> > -pext, 0xf3f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> > -rorx, 0xf2f0, BMI2, Modrm|CheckOperandSize|Vex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -sarx, 0xf3f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -shlx, 0x66f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -shrx, 0xf2f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +bzhi, 0xf5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +mulx, 0xf2f6, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> > +pdep, 0xf2f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> > +pext, 0xf3f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> > +rorx, 0xf2f0, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +sarx, 0xf3f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +shlx, 0x66f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +shrx, 0xf2f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> >
> >  // FMA4 instructions
> >
> > @@ -1913,11 +1923,11 @@ lwpins, 0x12/0, LWP,
> Modrm|SpaceXOP0A|NoSuf|VexVVVV|Vex, { Imm32|Imm32S, Reg32|U
> >
> >  // BMI instructions
> >
> > -andn, 0xf2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> > -bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -blsi, 0xf3/3, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -blsmsk, 0xf3/2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -blsr, 0xf3/1, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +andn, 0xf2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
> > +bextr, 0xf7, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +blsi, 0xf3/3, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +blsmsk, 0xf3/2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +blsr, 0xf3/1, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> >  tzcnt, 0xf30fbc, BMI, Modrm|CheckOperandSize|No_bSuf|No_sSuf,
> { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> >
> >  // TBM instructions
> > @@ -2046,13 +2056,21 @@ bndldx, 0x0f1a, MPX,
> Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND }
> >
> >  // SHA instructions.
> >  sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S,
> RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S,
> RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1nexte, 0xf38c8, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1msg1, 0xf38c9, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1msg2, 0xf38ca, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword,
> RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256msg1, 0xf38cc, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha256msg1, 0xdc, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256msg2, 0xf38cd, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha256msg2, 0xdd, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >
> >  // SHA512 instructions.
> >
> > @@ -2114,9 +2132,9 @@ kor<bw>, 0x<bw:kpfx>45, <bw:kcpu>,
> Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { R
> >  kxnor<bw>, 0x<bw:kpfx>46, <bw:kcpu>,
> Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask,
> RegMask }
> >  kxor<bw>, 0x<bw:kpfx>47, <bw:kcpu>,
> Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask,
> RegMask }
> >
> > -kmov<bw>, 0x<bw:kpfx>90, <bw:kcpu>,
> Modrm|Vex128|Space0F|VexW0|NoSuf,
> { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
> > -kmov<bw>, 0x<bw:kpfx>91, <bw:kcpu>,
> Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask,
> <bw:elem>|Unspecified|BaseIndex }
> > -kmov<bw>, 0x<bw:kpfx>92, <bw:kcpu>,
> D|Modrm|Vex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
> > +kmov<bw>, 0x<bw:kpfx>90, APX_F(<bw:kcpu>),
> Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf,
> { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
> > +kmov<bw>, 0x<bw:kpfx>91, APX_F(<bw:kcpu>),
> Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask,
> <bw:elem>|Unspecified|BaseIndex }
> > +kmov<bw>, 0x<bw:kpfx>92, APX_F(<bw:kcpu>),
> D|Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
> >
> >  knot<bw>, 0x<bw:kpfx>44, <bw:kcpu>,
> Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
> >  kortest<bw>, 0x<bw:kpfx>98, <bw:kcpu>,
> Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
> > @@ -2591,9 +2609,9 @@ vpmovzxdq, 0x6635, AVX512VL,
> Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift
> >  kadd<dq>, 0x<dq:kpfx>4a, AVX512BW,
> Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask,
> RegMask }
> >  kand<dq>, 0x<dq:kpfx>41, AVX512BW,
> Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask,
> RegMask }
> >  kandn<dq>, 0x<dq:kpfx>42, AVX512BW,
> Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf|Optimize, { RegMask,
> RegMask, RegMask }
> > -kmov<dq>, 0x<dq:kpfx>90, AVX512BW,
> Modrm|Vex128|Space0F|VexW1|NoSuf,
> { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
> > -kmov<dq>, 0x<dq:kpfx>91, AVX512BW,
> Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask,
> <dq:elem>|Unspecified|BaseIndex }
> > -kmov<dq>, 0xf292, AVX512BW,
> D|Modrm|Vex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
> > +kmov<dq>, 0x<dq:kpfx>90, APX_F(AVX512BW),
> Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf,
> { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
> > +kmov<dq>, 0x<dq:kpfx>91, APX_F(AVX512BW),
> Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask,
> <dq:elem>|Unspecified|BaseIndex }
> > +kmov<dq>, 0xf292, APX_F(AVX512BW),
> D|Modrm|Vex128|EVex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>,
> RegMask }
> >  knot<dq>, 0x<dq:kpfx>44, AVX512BW,
> Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
> >  kor<dq>, 0x<dq:kpfx>45, AVX512BW,
> Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask,
> RegMask }
> >  kortest<dq>, 0x<dq:kpfx>98, AVX512BW,
> Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
> > @@ -2992,9 +3010,13 @@ rdsspq, 0xf30f1e/1, SHSTK&x64,
> Modrm|NoSuf, { Reg64 }
> >  saveprevssp, 0xf30f01ea, SHSTK, NoSuf, {}
> >  rstorssp, 0xf30f01/5, SHSTK, Modrm|NoSuf,
> { Qword|Unspecified|BaseIndex }
> >  wrssd, 0x0f38f6, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32,
> Dword|Unspecified|BaseIndex }
> > +wrssd, 0x66, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4,
> { Reg32, Dword|Unspecified|BaseIndex }
> >  wrssq, 0x0f38f6, SHSTK&x64, Modrm|NoSuf|Size64, { Reg64,
> Qword|Unspecified|BaseIndex }
> > +wrssq, 0x66, SHSTK&APX_F, Modrm|NoSuf|Size64|EVexMap4, { Reg64,
> Qword|Unspecified|BaseIndex }
> >  wrussd, 0x660f38f5, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32,
> Dword|Unspecified|BaseIndex }
> > +wrussd, 0x6665, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4,
> { Reg32, Dword|Unspecified|BaseIndex }
> >  wrussq, 0x660f38f5, SHSTK&x64, Modrm|NoSuf, { Reg64,
> Qword|Unspecified|BaseIndex }
> > +wrussq, 0x6665, SHSTK&APX_F, Modrm|NoSuf|EVexMap4, { Reg64,
> Qword|Unspecified|BaseIndex }
> >  setssbsy, 0xf30f01e8, SHSTK, NoSuf, {}
> >  clrssbsy, 0xf30fae/6, SHSTK, Modrm|NoSuf,
> { Qword|Unspecified|BaseIndex }
> >  endbr64, 0xf30f1efa, IBT, NoSuf, {}
> > @@ -3042,7 +3064,9 @@ cldemote, 0x0f1c/0, CLDEMOTE,
> Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
> >  // MOVDIR[I,64B] instructions.
> >
> >  movdiri, 0xf38f9, MOVDIRI,
> Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +movdiri, 0xf9, MOVDIRI&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> >  movdir64b, 0x660f38f8, MOVDIR64B, Modrm|AddrPrefixOpReg|NoSuf,
> { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > +movdir64b, 0x66f8, MOVDIR64B&APX_F,
> Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex,
> Reg32|Reg64 }
> >
> >  // MOVEDIR instructions end.
> >
> > @@ -3071,7 +3095,9 @@ vcvtneps2bf16<Vxy>, 0xf372,
> AVX_NE_CONVERT, Modrm|<Vxy:vex>|Space0F38|VexW0|NoSu
> >  // ENQCMD instructions.
> >
> >  enqcmd, 0xf20f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf,
> { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > +enqcmd, 0xf2f8, APX_F(ENQCMD),
> Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex,
> Reg32|Reg64 }
> >  enqcmds, 0xf30f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf,
> { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > +enqcmds, 0xf3f8, APX_F(ENQCMD),
> Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex,
> Reg32|Reg64 }
> >
> >  // ENQCMD instructions end.
> >
> > @@ -3132,8 +3158,8 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
> >
> >  // AMX instructions.
> >
> > -ldtilecfg, 0x49/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex }
> > -sttilecfg, 0x6649/0, AMX_TILE,
> Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
> > +ldtilecfg, 0x49/0, APX_F(AMX_TILE),
> Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex }
> > +sttilecfg, 0x6649/0, APX_F(AMX_TILE),
> Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex }
> >
> >  tcmmimfp16ps, 0x666c, AMX_COMPLEX,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> { RegTMM, RegTMM, RegTMM }
> >  tcmmrlfp16ps, 0x6c, AMX_COMPLEX,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> { RegTMM, RegTMM, RegTMM }
> > @@ -3145,9 +3171,9 @@ tdpbuud, 0x5e, AMX_INT8,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> >  tdpbusd, 0x665e, AMX_INT8,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> { RegTMM, RegTMM, RegTMM }
> >  tdpbsud, 0xf35e, AMX_INT8,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> { RegTMM, RegTMM, RegTMM }
> >
> > -tileloadd, 0xf24b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex, RegTMM }
> > -tileloaddt1, 0x664b, AMX_TILE,
> Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex,
> RegTMM }
> > -tilestored, 0xf34b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf,
> { RegTMM, Unspecified|BaseIndex }
> > +tileloadd, 0xf24b, APX_F(AMX_TILE),
> Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex, RegTMM }
> > +tileloaddt1, 0x664b, APX_F(AMX_TILE),
> Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex, RegTMM }
> > +tilestored, 0xf34b, APX_F(AMX_TILE),
> Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM,
> Unspecified|BaseIndex }
> >
> >  tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {}
> >
> > @@ -3159,15 +3185,25 @@ tilezero, 0xf249, AMX_TILE,
> Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
> >
> >  loadiwkey, 0xf30f38dc, KL, Load|Modrm|NoSuf, { RegXMM, RegXMM }
> >  encodekey128, 0xf30f38fa, KL, Modrm|NoSuf, { Reg32, Reg32 }
> > +encodekey128, 0xf3da, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32,
> Reg32 }
> >  encodekey256, 0xf30f38fb, KL, Modrm|NoSuf, { Reg32, Reg32 }
> > +encodekey256, 0xf3db, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32,
> Reg32 }
> >  aesenc128kl, 0xf30f38dc, KL, Modrm|NoSuf, { Unspecified|BaseIndex,
> RegXMM }
> > +aesenc128kl, 0xf3dc, KL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex, RegXMM }
> >  aesdec128kl, 0xf30f38dd, KL, Modrm|NoSuf, { Unspecified|BaseIndex,
> RegXMM }
> > +aesdec128kl, 0xf3dd, KL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex, RegXMM }
> >  aesenc256kl, 0xf30f38de, KL, Modrm|NoSuf, { Unspecified|BaseIndex,
> RegXMM }
> > +aesenc256kl, 0xf3de, KL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex, RegXMM }
> >  aesdec256kl, 0xf30f38df, KL, Modrm|NoSuf, { Unspecified|BaseIndex,
> RegXMM }
> > +aesdec256kl, 0xf3df, KL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex, RegXMM }
> >  aesencwide128kl, 0xf30f38d8/0, WideKL, Modrm|NoSuf,
> { Unspecified|BaseIndex }
> > +aesencwide128kl, 0xf3d8/0, WideKL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex }
> >  aesdecwide128kl, 0xf30f38d8/1, WideKL, Modrm|NoSuf,
> { Unspecified|BaseIndex }
> > +aesdecwide128kl, 0xf3d8/1, WideKL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex }
> >  aesencwide256kl, 0xf30f38d8/2, WideKL, Modrm|NoSuf,
> { Unspecified|BaseIndex }
> > +aesencwide256kl, 0xf3d8/2, WideKL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex }
> >  aesdecwide256kl, 0xf30f38d8/3, WideKL, Modrm|NoSuf,
> { Unspecified|BaseIndex }
> > +aesdecwide256kl, 0xf3d8/3, WideKL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex }
> >
> >  // KEYLOCKER instructions end.
> >
> > @@ -3315,7 +3351,7 @@ prefetchit1, 0xf18/6, PREFETCHI,
> Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
> >
> >  // CMPCCXADD instructions.
> >
> > -cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD, Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +cmp<cc>xadd, 0x66e<cc:opc>, APX_F(CMPCCXADD), Modrm|Vex|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> >
> >  // CMPCCXADD instructions end.
> >
> > @@ -3335,9 +3371,13 @@ wrmsrlist, 0xf30f01c6, MSRLIST, NoSuf, {}
> >  // RAO-INT instructions.
> >
> >  aadd, 0xf38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +aadd, 0xfc, RAO_INT&APX_F,
> Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >  aand, 0x660f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +aand, 0x66fc, RAO_INT&APX_F,
> Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >  aor, 0xf20f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +aor, 0xf2fc, RAO_INT&APX_F,
> Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >  axor, 0xf30f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +axor, 0xf3fc, RAO_INT&APX_F,
> Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >
> >  // RAO-INT instructions end.
> >
> > --
> > 2.25.1
> >
> 
> OK.
> 
> Thanks.
> 
> H.J.

[-- Attachment #2: 0001-Support-APX-GPR32-with-extend-evex-prefix.patch --]
[-- Type: application/octet-stream, Size: 51908 bytes --]

From 2c9fdb12f84f2ee3440d41c699c86351177f59ea Mon Sep 17 00:00:00 2001
From: "Cui, Lili" <lili.cui@intel.com>
Date: Thu, 28 Dec 2023 01:06:40 +0000
Subject: [PATCH] Support APX GPR32 with extend evex prefix
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch adds the non-ND, non-NF forms of the EVEX-promoted insns.

EVEX extension of legacy instructions:
  All promoted legacy instructions are placed in EVEX map 4, which is
  currently reserved.
EVEX extension of EVEX instructions:
  All existing EVEX instructions are extended by APX using the extended
  EVEX prefix, so that they can access all 32 GPRs.
EVEX extension of VEX instructions:
  Promoting a VEX instruction into the EVEX space does not change the map
  id, the opcode, or the operand encoding of the VEX instruction.

Note: The promoted versions of MOVBE will be extended to include the “MOVBE
  reg1, reg2” form.
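
As a rough illustration of the extended EVEX prefix described above, the
following C sketch shows how the four extra register bits (R4, X4, B4 and
V4) might be folded into an already-built 4-byte EVEX prefix.  It follows
the bit positions used by build_apx_evex_prefix in this patch, but the
SK_* constants and the function name are hypothetical names chosen for the
sketch, not binutils identifiers, and the sketch leaves out everything
else the assembler does.

```c
#include <stdint.h>

/* Hypothetical flag values for this sketch only; they follow the usual
   REX bit layout (B = bit 0, X = bit 1, R = bit 2).  */
enum { SK_REX_B = 1, SK_REX_X = 2, SK_REX_R = 4 };

/* Fold the APX-specific bits into a 4-byte EVEX prefix that was built
   without them.  BYTES[0] is the 0x62 escape byte.  VVVV_IS_EGPR is
   nonzero when the register encoded in EVEX.vvvv is one of r16-r31.  */
static void
sketch_apply_apx_bits (uint8_t bytes[4], unsigned int rex2, int vvvv_is_egpr)
{
  if (rex2 & SK_REX_R)
    bytes[1] &= (uint8_t) ~0x10;	/* R4 is stored inverted.  */
  if (rex2 & SK_REX_B)
    bytes[1] |= 0x08;			/* B4 is stored as-is.  */
  if (rex2 & SK_REX_X)
    bytes[2] &= (uint8_t) ~0x04;	/* X4 is stored inverted.  */
  if (vvvv_is_egpr)
    bytes[3] &= (uint8_t) ~0x08;	/* V4 is stored inverted.  */
}
```

For instance, starting from the prefix bytes 62 f4 7c 08 and setting all
four extra bits gives 62 ec 78 00.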

gas/ChangeLog:

  2023-12-28  Lingling Kong <lingling.kong@intel.com>
	      H.J. Lu  <hongjiu.lu@intel.com>
	      Lili Cui <lili.cui@intel.com>
	      Lin Hu   <lin1.hu@intel.com>

	* config/tc-i386.c (struct _i386_insn): Add has_egpr.
	(need_evex_encoding): Adjust for APX.
	(cpu_flags_match): Ditto.
	(install_template): Handle APX combinations.
	(is_apx_evex_encoding): Test APX EVEX encoding.
	(build_apx_evex_prefix): Enable APX EVEX prefix.
	(md_assemble): Handle apx with evex encoding.
	(process_suffix): Handle apx map4 prefix.
	(check_register): Set i.has_egpr for APX EGPR operands.
	* testsuite/gas/i386/x86-64-evex.d: Adjust test cases.
	* testsuite/gas/i386/x86-64.exp: Adjust x86-64-inval-movbe.

opcodes/ChangeLog:

	* i386-dis-evex-len.h: Handle EVEX_LEN_0F38F2, EVEX_LEN_0F38F3.
	* i386-dis-evex-prefix.h: Handle PREFIX_EVEX_0F38F2_L_0,
	PREFIX_EVEX_0F38F3_L_0, PREFIX_EVEX_MAP4_D8,
	PREFIX_EVEX_MAP4_DA, PREFIX_EVEX_MAP4_DB,
	PREFIX_EVEX_MAP4_DC, PREFIX_EVEX_MAP4_DD,
	PREFIX_EVEX_MAP4_DE, PREFIX_EVEX_MAP4_DF,
	PREFIX_EVEX_MAP4_F0, PREFIX_EVEX_MAP4_F1,
	PREFIX_EVEX_MAP4_F2, PREFIX_EVEX_MAP4_F8.
	* i386-dis-evex-reg.h: Handle REG_EVEX_0F38F3_L_0_P_0.
	* i386-dis-evex.h: Add EVEX_MAP4_ for legacy insns
	promoted to APX to use GPR32.
	* i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F90,
	X86_64_EVEX_0F92, X86_64_EVEX_0F93, X86_64_EVEX_0F38F2,
	X86_64_EVEX_0F38F3, X86_64_EVEX_0F38F5, X86_64_EVEX_0F38F6,
	X86_64_EVEX_0F38F7, X86_64_EVEX_0F3AF0, X86_64_EVEX_0F91.
	* i386-dis.c
	(struct instr_info): Delete bool r.
	(PREFIX_NP_OR_DATA): New.
	(NO_PREFIX): New.
	(putop): Ditto.
	(X86_64_EVEX_FROM_VEX_TABLE): Ditto.
	(get_valid_dis386): Decode insn erex in extended evex prefix.
	Handle EVEX_MAP4.
	(print_insn): Handle PREFIX_NP_OR_DATA and NO_PREFIX.
	(print_register): Handle APX instruction decoding.
	(OP_E_memory): Ditto.
	(OP_G): Ditto.
	(OP_XMM): Ditto.
	(DistinctDest_Fixup): Ditto.
	* i386-gen.c (process_i386_opcode_modifier): Add EVEXMAP4.
	* i386-opc.h (SPACE_EVEXMAP4): Add for legacy insns
	promoted to EVEX.
	* i386-opc.tbl: Handle some legacy and VEX insns that don't
	support GPR32.  Add some legacy insns (map 2 / 3) promoted
	to EVEX.
---
 gas/config/tc-i386.c                 |  85 ++++++++++++--
 gas/testsuite/gas/i386/x86-64-evex.d |   2 +-
 gas/testsuite/gas/i386/x86-64.exp    |   2 +-
 opcodes/i386-dis-evex-prefix.h       |  58 ++++++++++
 opcodes/i386-dis-evex-x86-64.h       |  50 +++++++++
 opcodes/i386-dis-evex.h              |  94 ++++++++--------
 opcodes/i386-dis.c                   | 160 +++++++++++++++++++++++----
 opcodes/i386-gen.c                   |   2 +
 opcodes/i386-opc.h                   |   6 +
 opcodes/i386-opc.tbl                 |  90 ++++++++++-----
 10 files changed, 440 insertions(+), 109 deletions(-)
 create mode 100644 opcodes/i386-dis-evex-x86-64.h

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 11b3927cd23..e41b79aef93 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -435,6 +435,9 @@ struct _i386_insn
     /* Prefer the REX2 prefix in encoding.  */
     bool rex2_encoding;
 
+    /* Need to use an Egpr capable encoding (REX2 or EVEX).  */
+    bool has_egpr;
+
     /* Disable instruction size optimization.  */
     bool no_optimize;
 
@@ -1862,10 +1865,11 @@ cpu_flags_and_not (i386_cpu_flags x, i386_cpu_flags y)
 
 static const i386_cpu_flags avx512 = CPU_ANY_AVX512F_FLAGS;
 
-static INLINE bool need_evex_encoding (void)
+static INLINE bool need_evex_encoding (const insn_template *t)
 {
   return i.vec_encoding == vex_encoding_evex
 	|| i.vec_encoding == vex_encoding_evex512
+	|| (t->opcode_modifier.vex && i.has_egpr)
 	|| i.mask.reg;
 }
 
@@ -1905,13 +1909,13 @@ cpu_flags_match (const insn_template *t)
       if ((any.bitfield.cpuavx || any.bitfield.cpuavx2 || any.bitfield.cpufma)
 	  && (any.bitfield.cpuavx512f || any.bitfield.cpuavx512vl))
 	{
-	  if (need_evex_encoding ())
+	  if (need_evex_encoding (t))
 	    {
 	      any.bitfield.cpuavx = 0;
 	      any.bitfield.cpuavx2 = 0;
 	      any.bitfield.cpufma = 0;
 	    }
-	  /* need_evex_encoding() isn't reliable before operands were
+	  /* need_evex_encoding(t) isn't reliable before operands were
 	     parsed.  */
 	  else if (i.operands)
 	    {
@@ -3676,12 +3680,12 @@ install_template (const insn_template *t)
 
   /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
   if (t->opcode_modifier.vex && t->opcode_modifier.evex)
-  {
+    {
       if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
 	   || maybe_cpu (t, CpuFMA))
 	  && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
 	{
-	  if (need_evex_encoding ())
+	  if (need_evex_encoding (t))
 	    {
 	      i.tm.opcode_modifier.vex = 0;
 	      i.tm.cpu.bitfield.cpuavx512f = i.tm.cpu_any.bitfield.cpuavx512f;
@@ -3698,7 +3702,19 @@ install_template (const insn_template *t)
 		gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa);
 	    }
 	}
-  }
+
+      if ((maybe_cpu (t, CpuCMPCCXADD) || maybe_cpu (t, CpuAMX_TILE)
+	   || maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512DQ)
+	   || maybe_cpu (t, CpuAVX512BW) || maybe_cpu (t, CpuBMI)
+	   || maybe_cpu (t, CpuBMI2))
+	  && maybe_cpu (t, CpuAPX_F))
+	{
+	  if (need_evex_encoding (t))
+	    i.tm.opcode_modifier.vex = 0;
+	  else
+	    i.tm.opcode_modifier.evex = 0;
+	}
+    }
 
   /* Note that for pseudo prefixes this produces a length of 1. But for them
      the length isn't interesting at all.  */
@@ -3879,6 +3895,15 @@ is_any_vex_encoding (const insn_template *t)
   return t->opcode_modifier.vex || t->opcode_modifier.evex;
 }
 
+/* We can use this function only when the current encoding is evex.  */
+static INLINE bool
+is_apx_evex_encoding (void)
+{
+  return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
+    || (i.vex.register_specifier
+	&& (i.vex.register_specifier->reg_flags & RegRex2));
+}
+
 static INLINE bool
 is_apx_rex2_encoding (void)
 {
@@ -4156,6 +4181,27 @@ build_rex2_prefix (void)
 		    | (i.rex2 << 4) | i.rex);
 }
 
+/* Build the EVEX prefix (4-byte) for evex insn
+   | 62h |
+   | `R`X`B`R' | B'mmm |
+   | W | v`v`v`v | `x' | pp |
+   | z| L'L | b | `v | aaa |
+*/
+static void
+build_apx_evex_prefix (void)
+{
+  build_evex_prefix ();
+  if (i.rex2 & REX_R)
+    i.vex.bytes[1] &= ~0x10;
+  if (i.rex2 & REX_B)
+    i.vex.bytes[1] |= 0x08;
+  if (i.rex2 & REX_X)
+    i.vex.bytes[2] &= ~0x04;
+  if (i.vex.register_specifier
+      && i.vex.register_specifier->reg_flags & RegRex2)
+    i.vex.bytes[3] &= ~0x08;
+}
+
 static void establish_rex (void)
 {
   /* Note that legacy encodings have at most 2 non-immediate operands.  */
@@ -5723,13 +5769,18 @@ md_assemble (char *line)
 	  return;
 	}
 
-      if (i.tm.opcode_modifier.vex)
+      if (is_apx_evex_encoding ())
+	build_apx_evex_prefix ();
+      else if (i.tm.opcode_modifier.vex)
 	build_vex_prefix (t);
       else
 	build_evex_prefix ();
 
       /* The individual REX.RXBW bits got consumed.  */
       i.rex &= REX_OPCODE;
+
+      /* The rex2 bits got consumed.  */
+      i.rex2 = 0;
     }
 
   /* Handle conversion of 'int $3' --> special int3 insn.  */
@@ -6648,7 +6699,7 @@ check_VecOperands (const insn_template *t)
   if (!cpu_flags_all_zero (&cpu)
       && !is_cpu (t, CpuAVX512VL)
       && !cpu_arch_flags.bitfield.cpuavx512vl
-      && (!t->opcode_modifier.vex || need_evex_encoding ()))
+      && (!t->opcode_modifier.vex || need_evex_encoding (t)))
     {
       for (op = 0; op < t->operands; ++op)
 	{
@@ -6960,7 +7011,7 @@ check_VecOperands (const insn_template *t)
   /* Check vector Disp8 operand.  */
   if (t->opcode_modifier.disp8memshift
       && (!t->opcode_modifier.vex
-          || need_evex_encoding ())
+	  || need_evex_encoding (t))
       && i.disp_encoding <= disp_encoding_8bit)
     {
       if (i.broadcast.type || i.broadcast.bytes)
@@ -7617,7 +7668,7 @@ match_template (char mnem_suffix)
       if ((t == current_templates.start || j > 1)
 	  && t->opcode_modifier.disp8memshift
 	  && !t->opcode_modifier.vex
-	  && !need_evex_encoding ()
+	  && !need_evex_encoding (t)
 	  && t + j < current_templates.end
 	  && t[j].opcode_modifier.vex)
 	{
@@ -8084,7 +8135,8 @@ process_suffix (void)
       if (i.suffix != QWORD_MNEM_SUFFIX
 	  && i.tm.opcode_modifier.mnemonicsize != IGNORESIZE
 	  && !i.tm.opcode_modifier.floatmf
-	  && !is_any_vex_encoding (&i.tm)
+	  && (!is_any_vex_encoding (&i.tm)
+	      || i.tm.opcode_space == SPACE_EVEXMAP4)
 	  && ((i.suffix == LONG_MNEM_SUFFIX) == (flag_code == CODE_16BIT)
 	      || (flag_code == CODE_64BIT
 		  && i.tm.opcode_modifier.jump == JUMP_BYTE)))
@@ -8094,7 +8146,14 @@ process_suffix (void)
 	  if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */
 	    prefix = ADDR_PREFIX_OPCODE;
 
-	  if (!add_prefix (prefix))
+	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
+	     needs to be adjusted.  */
+	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
+	    {
+	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
+	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
+	    }
+	  else if (!add_prefix (prefix))
 	    return 0;
 	}
 
@@ -14300,6 +14359,8 @@ static bool check_register (const reg_entry *r)
       if (!cpu_arch_flags.bitfield.cpuapx_f
 	  || flag_code != CODE_64BIT)
 	return false;
+
+      i.has_egpr = true;
     }
 
   if (((r->reg_flags & (RegRex64 | RegRex)) || r->reg_type.bitfield.qword)
diff --git a/gas/testsuite/gas/i386/x86-64-evex.d b/gas/testsuite/gas/i386/x86-64-evex.d
index 041747db892..5d974c312da 100644
--- a/gas/testsuite/gas/i386/x86-64-evex.d
+++ b/gas/testsuite/gas/i386/x86-64-evex.d
@@ -17,6 +17,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 f1 d6 38 7b f0    	vcvtusi2ss %rax,\{rd-sae\},%xmm5,%xmm6
  +[a-f0-9]+:	62 f1 57 38 7b f0    	vcvtusi2sd %eax,\{rd-bad\},%xmm5,%xmm6
  +[a-f0-9]+:	62 f1 d7 38 7b f0    	vcvtusi2sd %rax,\{rd-sae\},%xmm5,%xmm6
- +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,\(bad\)
+ +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,%r16d
  +[a-f0-9]+:	62 e1 7c 08 c2 c0 00 	vcmpeqps %xmm0,%xmm0,\(bad\)
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 91c068d5b40..ffacc9c8e2b 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -250,7 +250,7 @@ run_dump_test "x86-64-sse-noavx"
 run_dump_test "x86-64-movbe"
 run_dump_test "x86-64-movbe-intel"
 run_dump_test "x86-64-movbe-suffix"
-run_list_test "x86-64-inval-movbe" "-al"
+run_list_test "x86-64-inval-movbe" "-march=+noapx_f -al"
 run_dump_test "x86-64-ept"
 run_dump_test "x86-64-ept-intel"
 run_list_test "x86-64-inval-ept" "-al"
diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
index 28da54922c7..54ed48c6952 100644
--- a/opcodes/i386-dis-evex-prefix.h
+++ b/opcodes/i386-dis-evex-prefix.h
@@ -338,6 +338,64 @@
     { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
     { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
   },
+  /* PREFIX_EVEX_MAP4_D8 */
+  {
+    { "sha1nexte", { XM, EXxmm }, 0 },
+    { REG_TABLE (REG_0F38D8_PREFIX_1) },
+  },
+  /* PREFIX_EVEX_MAP4_DA */
+  {
+    { "sha1msg2", { XM, EXxmm }, 0 },
+    { "encodekey128", { Gd, Rd }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DB */
+  {
+    { "sha256rnds2", { XM, EXxmm, XMM0 }, 0 },
+    { "encodekey256", { Gd, Rd }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DC */
+  {
+    { "sha256msg1", { XM, EXxmm }, 0 },
+    { "aesenc128kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DD */
+  {
+    { "sha256msg2", { XM, EXxmm }, 0 },
+    { "aesdec128kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DE */
+  {
+    { Bad_Opcode },
+    { "aesenc256kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DF */
+  {
+    { Bad_Opcode },
+    { "aesdec256kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F0 */
+  {
+    { "crc32A", { Gdq, Eb }, 0 },
+    { "invept", { Gm, Mo }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F1 */
+  {
+    { "crc32Q", { Gdq, Ev }, 0 },
+    { "invvpid", { Gm, Mo }, 0 },
+    { "crc32Q", { Gdq, Ev }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F2 */
+  {
+    { Bad_Opcode },
+    { "invpcid", { Gm, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F8 */
+  {
+    { Bad_Opcode },
+    { "enqcmds", { Gva, M },  0 },
+    { "movdir64b", { Gva, M }, 0 },
+    { "enqcmd", { Gva, M }, 0 },
+  },
   /* PREFIX_EVEX_MAP5_10 */
   {
     { Bad_Opcode },
diff --git a/opcodes/i386-dis-evex-x86-64.h b/opcodes/i386-dis-evex-x86-64.h
new file mode 100644
index 00000000000..0d9d98a7691
--- /dev/null
+++ b/opcodes/i386-dis-evex-x86-64.h
@@ -0,0 +1,50 @@
+  /* X86_64_EVEX_0F90 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F90_L_0) },
+  },
+  /* X86_64_EVEX_0F91 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F91_L_0) },
+  },
+  /* X86_64_EVEX_0F92 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F92_L_0) },
+  },
+  /* X86_64_EVEX_0F93 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F93_L_0) },
+  },
+  /* X86_64_EVEX_0F38F2 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
+  },
+  /* X86_64_EVEX_0F38F3 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
+  },
+  /* X86_64_EVEX_0F38F5 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F5_L_0) },
+  },
+  /* X86_64_EVEX_0F38F6 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F6_L_0) },
+  },
+  /* X86_64_EVEX_0F38F7 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F7_L_0) },
+  },
+  /* X86_64_EVEX_0F3AF0 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) },
+  },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index 7ad1edbe72d..90c063b2188 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -164,10 +164,10 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 90 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F90) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F91) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F92) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F93) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -375,9 +375,9 @@ static const struct dis386 evex_table[][256] = {
     { "vpsllv%DQ",	{ XM, Vex, EXx }, PREFIX_DATA },
     /* 48 */
     { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F3849) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F384B) },
     { "vrcp14p%XW",	{ XM, EXx }, PREFIX_DATA },
     { "vrcp14s%XW",	{ XMScalar, VexScalar, EXdq }, PREFIX_DATA },
     { "vrsqrt14p%XW",	{ XM, EXx }, 0 },
@@ -545,32 +545,32 @@ static const struct dis386 evex_table[][256] = {
     { "%XEvaesdecY",	{ XM, Vex, EXx }, PREFIX_DATA },
     { "%XEvaesdeclastY", { XM, Vex, EXx }, PREFIX_DATA },
     /* E0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E0) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E1) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E2) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E3) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E4) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E5) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E6) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E7) },
     /* E8 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E8) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E9) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EA) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EB) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EC) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38ED) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EE) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EF) },
     /* F0 */
     { Bad_Opcode },
     { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F2) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F3) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F5) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F6) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F7) },
     /* F8 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -854,7 +854,7 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* F0 */
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F3AF0) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -983,13 +983,13 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 60 */
+    { "movbeS",	{ Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "movbeS",	{ Ev, Gv }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "wrussK",	{ M, Gdq }, PREFIX_DATA },
+    { PREFIX_TABLE (PREFIX_0F38F6) },
     { Bad_Opcode },
     /* 68 */
     { Bad_Opcode },
@@ -1113,19 +1113,19 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { "sha1rnds4",	{ XM, EXxmm, Ib }, NO_PREFIX },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* D8 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_D8) },
+    { "sha1msg1",	{ XM, EXxmm }, NO_PREFIX },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DA) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DB) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DC) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DD) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DE) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DF) },
     /* E0 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1145,20 +1145,20 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* F0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F0) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F1) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F2) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* F8 */
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
+    { "movdiri",	{ Mdq, Gdq }, NO_PREFIX },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_0F38FC) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index e006d869258..5a72a2030ae 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -132,6 +132,13 @@ enum x86_64_isa
   intel64
 };
 
+enum evex_type
+{
+  evex_default = 0,
+  evex_from_legacy,
+  evex_from_vex,
+};
+
 struct instr_info
 {
   enum address_mode address_mode;
@@ -212,7 +219,6 @@ struct instr_info
     int ll;
     bool w;
     bool evex;
-    bool r;
     bool v;
     bool zeroing;
     bool b;
@@ -220,6 +226,8 @@ struct instr_info
   }
   vex;
 
+  enum evex_type evex_type;
+
   /* Remember if the current op is a jump instruction.  */
   bool op_is_jump;
 
@@ -303,6 +311,8 @@ struct dis_private {
 #define PREFIX_ADDR 0x400
 #define PREFIX_FWAIT 0x800
 #define PREFIX_REX2 0x1000
+#define PREFIX_NP_OR_DATA 0x2000
+#define NO_PREFIX   0x4000
 
 /* Make sure that bytes from INFO->PRIVATE_DATA->BUFFER (inclusive)
    to ADDR (exclusive) are valid.  Returns true for success, false
@@ -800,6 +810,7 @@ enum
   USE_RM_TABLE,
   USE_PREFIX_TABLE,
   USE_X86_64_TABLE,
+  USE_X86_64_EVEX_FROM_VEX_TABLE,
   USE_3BYTE_TABLE,
   USE_XOP_8F_TABLE,
   USE_VEX_C4_TABLE,
@@ -818,6 +829,8 @@ enum
 #define RM_TABLE(I)		DIS386 (USE_RM_TABLE, (I))
 #define PREFIX_TABLE(I)		DIS386 (USE_PREFIX_TABLE, (I))
 #define X86_64_TABLE(I)		DIS386 (USE_X86_64_TABLE, (I))
+#define X86_64_EVEX_FROM_VEX_TABLE(I) \
+  DIS386 (USE_X86_64_EVEX_FROM_VEX_TABLE, (I))
 #define THREE_BYTE_TABLE(I)	DIS386 (USE_3BYTE_TABLE, (I))
 #define XOP_8F_TABLE()		DIS386 (USE_XOP_8F_TABLE, 0)
 #define VEX_C4_TABLE()		DIS386 (USE_VEX_C4_TABLE, 0)
@@ -866,7 +879,7 @@ enum
   REG_VEX_0F73,
   REG_VEX_0FAE,
   REG_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0,
-  REG_VEX_0F38F3_L_0,
+  REG_VEX_0F38F3_L_0_P_0,
   REG_VEX_MAP7_F8_L_0_W_0,
 
   REG_XOP_09_01_L_0,
@@ -878,7 +891,7 @@ enum
   REG_EVEX_0F72,
   REG_EVEX_0F73,
   REG_EVEX_0F38C6_L_2,
-  REG_EVEX_0F38C7_L_2
+  REG_EVEX_0F38C7_L_2,
 };
 
 enum
@@ -1094,6 +1107,8 @@ enum
   PREFIX_VEX_0F38CC,
   PREFIX_VEX_0F38CD,
   PREFIX_VEX_0F38DA_W_0,
+  PREFIX_VEX_0F38F2_L_0,
+  PREFIX_VEX_0F38F3_L_0,
   PREFIX_VEX_0F38F5_L_0,
   PREFIX_VEX_0F38F6_L_0,
   PREFIX_VEX_0F38F7_L_0,
@@ -1156,6 +1171,18 @@ enum
   PREFIX_EVEX_0F3A67,
   PREFIX_EVEX_0F3AC2,
 
+  PREFIX_EVEX_MAP4_D8,
+  PREFIX_EVEX_MAP4_DA,
+  PREFIX_EVEX_MAP4_DB,
+  PREFIX_EVEX_MAP4_DC,
+  PREFIX_EVEX_MAP4_DD,
+  PREFIX_EVEX_MAP4_DE,
+  PREFIX_EVEX_MAP4_DF,
+  PREFIX_EVEX_MAP4_F0,
+  PREFIX_EVEX_MAP4_F1,
+  PREFIX_EVEX_MAP4_F2,
+  PREFIX_EVEX_MAP4_F8,
+
   PREFIX_EVEX_MAP5_10,
   PREFIX_EVEX_MAP5_11,
   PREFIX_EVEX_MAP5_1D,
@@ -1267,7 +1294,19 @@ enum
   X86_64_VEX_0F38ED,
   X86_64_VEX_0F38EE,
   X86_64_VEX_0F38EF,
+
   X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
+
+  X86_64_EVEX_0F90,
+  X86_64_EVEX_0F91,
+  X86_64_EVEX_0F92,
+  X86_64_EVEX_0F93,
+  X86_64_EVEX_0F38F2,
+  X86_64_EVEX_0F38F3,
+  X86_64_EVEX_0F38F5,
+  X86_64_EVEX_0F38F6,
+  X86_64_EVEX_0F38F7,
+  X86_64_EVEX_0F3AF0,
 };
 
 enum
@@ -2882,12 +2921,12 @@ static const struct dis386 reg_table[][8] = {
   {
     { RM_TABLE (RM_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0_R_0) },
   },
-  /* REG_VEX_0F38F3_L_0 */
+  /* REG_VEX_0F38F3_L_0_P_0 */
   {
     { Bad_Opcode },
-    { "blsrS",		{ VexGdq, Edq }, PREFIX_OPCODE },
-    { "blsmskS",	{ VexGdq, Edq }, PREFIX_OPCODE },
-    { "blsiS",		{ VexGdq, Edq }, PREFIX_OPCODE },
+    { "blsrS",		{ VexGdq, Edq }, 0 },
+    { "blsmskS",	{ VexGdq, Edq }, 0 },
+    { "blsiS",		{ VexGdq, Edq }, 0 },
   },
   /* REG_VEX_MAP7_F8_L_0_W_0 */
   {
@@ -4035,6 +4074,16 @@ static const struct dis386 prefix_table[][4] = {
     { "vsm4rnds4", { XM, Vex, EXx }, 0 },
   },
 
+  /* PREFIX_VEX_0F38F2_L_0 */
+  {
+    { "andnS",          { Gdq, VexGdq, Edq }, 0 },
+  },
+
+  /* PREFIX_VEX_0F38F3_L_0 */
+  {
+    { REG_TABLE (REG_VEX_0F38F3_L_0_P_0) },
+  },
+
   /* PREFIX_VEX_0F38F5_L_0 */
   {
     { "bzhiS",		{ Gdq, Edq, VexGdq }, 0 },
@@ -4527,6 +4576,7 @@ static const struct dis386 x86_64_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64) },
   },
 
+#include "i386-dis-evex-x86-64.h"
 };
 
 static const struct dis386 three_byte_table[][256] = {
@@ -7113,12 +7163,12 @@ static const struct dis386 vex_len_table[][2] = {
 
   /* VEX_LEN_0F38F2 */
   {
-    { "andnS",		{ Gdq, VexGdq, Edq }, PREFIX_OPCODE },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
   },
 
   /* VEX_LEN_0F38F3 */
   {
-    { REG_TABLE(REG_VEX_0F38F3_L_0) },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
   },
 
   /* VEX_LEN_0F38F5 */
@@ -8732,6 +8782,17 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
       dp = &prefix_table[dp->op[1].bytemode][vindex];
       break;
 
+    case USE_X86_64_EVEX_FROM_VEX_TABLE:
+      ins->evex_type = evex_from_vex;
+      /* EVEX from VEX instructions require that EVEX.z, EVEX.L’L, EVEX.b and
+	 the lower 2 bits of EVEX.aaa must be 0.  */
+      if ((ins->vex.mask_register_specifier & 0x3) != 0
+	  || ins->vex.ll != 0
+	  || ins->vex.zeroing != 0
+	  || ins->vex.b)
+	return &bad_opcode;
+
+      /* Fall through.  */
     case USE_X86_64_TABLE:
       vindex = ins->address_mode == mode_64bit ? 1 : 0;
       dp = &x86_64_table[dp->op[1].bytemode][vindex];
@@ -8977,9 +9038,13 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
       if (!fetch_code (ins->info, ins->codep + 4))
 	return &err_opcode;
       /* The first byte after 0x62.  */
+      if (*ins->codep & 0x8)
+	ins->rex2 |= REX_B;
+      if (!(*ins->codep & 0x10))
+	ins->rex2 |= REX_R;
+
       ins->rex = ~(*ins->codep >> 5) & 0x7;
-      ins->vex.r = *ins->codep & 0x10;
-      switch ((*ins->codep & 0xf))
+      switch (*ins->codep & 0x7)
 	{
 	default:
 	  return &bad_opcode;
@@ -8992,6 +9057,12 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 	case 0x3:
 	  vex_table_index = EVEX_0F3A;
 	  break;
+	case 0x4:
+	  vex_table_index = EVEX_MAP4;
+	  ins->evex_type = evex_from_legacy;
+	  if (ins->address_mode != mode_64bit)
+	    return &bad_opcode;
+	  break;
 	case 0x5:
 	  vex_table_index = EVEX_MAP5;
 	  break;
@@ -9008,9 +9079,8 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 
       ins->vex.register_specifier = (~(*ins->codep >> 3)) & 0xf;
 
-      /* The U bit.  */
       if (!(*ins->codep & 0x4))
-	return &bad_opcode;
+	ins->rex2 |= REX_X;
 
       switch ((*ins->codep & 0x3))
 	{
@@ -9040,12 +9110,26 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 
       if (ins->address_mode != mode_64bit)
 	{
+	  /* Report bad for !evex_default and when two fixed values of evex
+	     change.  */
+	  if (ins->evex_type != evex_default
+	      || (ins->rex2 & (REX_B | REX_X)))
+	    return &bad_opcode;
 	  /* In 16/32-bit mode silently ignore following bits.  */
 	  ins->rex &= ~REX_B;
-	  ins->vex.r = true;
+	  ins->rex2 &= ~REX_R;
 	}
 
       ins->need_vex = 4;
+
+      /* EVEX from legacy instructions require that EVEX.z, EVEX.L’L and the
+	 lower 2 bits of EVEX.aaa must be 0.  */
+      if (ins->evex_type == evex_from_legacy
+	  && ((ins->vex.mask_register_specifier & 0x3) != 0
+	      || ins->vex.ll != 0
+	      || ins->vex.zeroing != 0))
+	return &bad_opcode;
+
       ins->codep++;
       vindex = *ins->codep++;
       dp = &evex_table[vex_table_index][vindex];
@@ -9460,6 +9544,13 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       dp = get_valid_dis386 (dp, &ins);
       if (dp == &err_opcode)
 	goto fetch_error_out;
+
+      /* For APX instructions promoted from legacy maps 0/1, embedded prefix
+	 is interpreted as the operand size override.  */
+      if (ins.evex_type == evex_from_legacy
+	  && ins.vex.prefix == DATA_PREFIX_OPCODE)
+	sizeflag ^= DFLAG;
+
       if (dp != NULL && putop (&ins, dp->name, sizeflag) == 0)
 	{
 	  if (!get_sib (&ins, sizeflag))
@@ -9639,6 +9730,25 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       if (ins.last_repnz_prefix >= 0)
 	ins.all_prefixes[ins.last_repnz_prefix] = 0xf2;
       break;
+
+    case PREFIX_NP_OR_DATA:
+      if (ins.vex.prefix == REPE_PREFIX_OPCODE
+	  || ins.vex.prefix == REPNE_PREFIX_OPCODE)
+	{
+	  i386_dis_printf (info, dis_style_text, "(bad)");
+	  ret = ins.end_codep - priv.the_buffer;
+	  goto out;
+	}
+      break;
+
+    case NO_PREFIX:
+      if (ins.vex.prefix)
+	{
+	  i386_dis_printf (info, dis_style_text, "(bad)");
+	  ret = ins.end_codep - priv.the_buffer;
+	  goto out;
+	}
+      break;
     }
 
   /* Check if the REX prefix is used.  */
@@ -10348,7 +10458,7 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
 		{
 		case 'X':
 		  if (!ins->vex.evex || ins->vex.b || ins->vex.ll >= 2
-		      || !ins->vex.r
+		      || (ins->rex2 & 7)
 		      || (ins->modrm.mod == 3 && (ins->rex & REX_X))
 		      || !ins->vex.v || ins->vex.mask_register_specifier)
 		    break;
@@ -11459,7 +11569,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 
   add += (ins->rex2 & REX_B) ? 16 : 0;
 
-  if (ins->vex.evex)
+  if (ins->vex.evex && ins->evex_type == evex_default)
     {
 
       /* Zeroing-masking is invalid for memory destinations. Set the flag
@@ -11603,6 +11713,13 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 		abort ();
 	      if (ins->vex.evex)
 		{
+		  /* S/G EVEX insns require EVEX.X4 not to be set.  */
+		  if (ins->rex2 & REX_X)
+		    {
+		      oappend (ins, "(bad)");
+		      return true;
+		    }
+
 		  if (!ins->vex.v)
 		    vindex += 16;
 		  check_gather = ins->obufp == ins->op_out[1];
@@ -11805,7 +11922,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 
 	      if (ins->rex & REX_R)
 	        modrm_reg += 8;
-	      if (!ins->vex.r)
+	      if (ins->rex2 & REX_R)
 	        modrm_reg += 16;
 	      if (vindex == modrm_reg)
 		oappend (ins, "/(bad)");
@@ -12011,10 +12128,7 @@ OP_indirE (instr_info *ins, int bytemode, int sizeflag)
 static bool
 OP_G (instr_info *ins, int bytemode, int sizeflag)
 {
-  if (ins->vex.evex && !ins->vex.r && ins->address_mode == mode_64bit)
-    oappend (ins, "(bad)");
-  else
-    print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
+  print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
   return true;
 }
 
@@ -12645,7 +12759,7 @@ OP_XMM (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
     reg += 8;
   if (ins->vex.evex)
     {
-      if (!ins->vex.r)
+      if (ins->rex2 & REX_R)
 	reg += 16;
     }
 
@@ -13652,7 +13766,7 @@ DistinctDest_Fixup (instr_info *ins, int bytemode, int sizeflag)
   /* Calc destination register number.  */
   if (ins->rex & REX_R)
     modrm_reg += 8;
-  if (!ins->vex.r)
+  if (ins->rex2 & REX_R)
     modrm_reg += 16;
 
   /* Calc src1 register number.  */
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index dd4850e1855..508b441a343 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -487,6 +487,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (Dialect),
   BITFIELD (ISA64),
   BITFIELD (NoEgpr),
+  BITFIELD (NF),
 };
 
 #define CLASS(n) #n, n
@@ -1120,6 +1121,7 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
     SPACE(0F),
     SPACE(0F38),
     SPACE(0F3A),
+    SPACE(EVEXMAP4),
     SPACE(EVEXMAP5),
     SPACE(EVEXMAP6),
     SPACE(VEXMAP7),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 8c967ea90b0..064ec48edad 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -743,6 +743,9 @@ enum
      whether the instruction supports pseudo-prefix {rex2}.  */
   NoEgpr,
 
+  /* No CSPAZO flags update indication.  */
+  NF,
+
   /* The last bitfield in i386_opcode_modifier.  */
   Opcode_Modifier_Num
 };
@@ -788,6 +791,7 @@ typedef struct i386_opcode_modifier
   unsigned int dialect:2;
   unsigned int isa64:2;
   unsigned int noegpr:1;
+  unsigned int nf:1;
 } i386_opcode_modifier;
 
 /* Operand classes.  */
@@ -963,6 +967,7 @@ typedef struct insn_template
      1: 0F opcode prefix / space.
      2: 0F38 opcode prefix / space.
      3: 0F3A opcode prefix / space.
+     4: EVEXMAP4 opcode prefix / space.
      5: EVEXMAP5 opcode prefix / space.
      6: EVEXMAP6 opcode prefix / space.
      7: VEXMAP7 opcode prefix / space.
@@ -974,6 +979,7 @@ typedef struct insn_template
 #define SPACE_0F	1
 #define SPACE_0F38	2
 #define SPACE_0F3A	3
+#define SPACE_EVEXMAP4	4
 #define SPACE_EVEXMAP5	5
 #define SPACE_EVEXMAP6	6
 #define SPACE_VEXMAP7	7
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 37d3e8663bb..11b8c0b63cb 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -113,6 +113,7 @@
 #define SpaceXOP09 OpcodeSpace=SPACE_XOP09
 #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
 
+#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
 #define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
 #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
 
@@ -139,6 +140,9 @@
 
 #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
 
+// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
+#define APX_F(cpuid) cpuid&(cpuid|APX_F)
+
 // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
 // the bit to mark commutative VEX encodings where swapping the source
 // operands may allow to switch from 3-byte to 2-byte VEX encoding.
@@ -194,6 +198,7 @@ mov, 0xf24, i386&No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Te
 
 // Move after swapping the bytes
 movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movbe, 0x60, Movbe&APX_F, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // Move with sign extend.
 movsb, 0xfbe, i386, Modrm|No_bSuf|No_sSuf, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
@@ -1315,13 +1320,16 @@ getsec, 0xf37, SMX, NoSuf, {}
 
 invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invept, 0xf3f0, EPT&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invvpid, 0xf3f1, EPT&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 
 // INVPCID instruction
 
 invpcid, 0x660f3882, INVPCID&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invpcid, 0x660f3882, INVPCID&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invpcid, 0xf3f2, INVPCID&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 
 // SSSE3 instructions.
 
@@ -1422,6 +1430,8 @@ pcmpistri<sse42>, 0x660f3a63, <sse42:cpu>, Modrm|<sse42:attr>|NoSuf, { Imm8, Reg
 pcmpistrm<sse42>, 0x660f3a62, <sse42:cpu>, Modrm|<sse42:attr>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 crc32, 0xf20f38f0, SSE4_2, W|Modrm|No_sSuf|No_qSuf, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
 crc32, 0xf20f38f0, SSE4_2&x64, W|Modrm|No_wSuf|No_lSuf|No_sSuf, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
+crc32, 0xf0, APX_F, W|Modrm|No_sSuf|No_qSuf|EVexMap4, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
+crc32, 0xf0, APX_F, W|Modrm|No_wSuf|No_lSuf|No_sSuf|EVexMap4, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
 
 // xsave/xrstor New Instructions.
 
@@ -1836,14 +1846,14 @@ xtest, 0xf01d6, HLE|RTM, NoSuf, {}
 
 // BMI2 instructions.
 
-bzhi, 0xf5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-mulx, 0xf2f6, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pdep, 0xf2f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pext, 0xf3f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-rorx, 0xf2f0, BMI2, Modrm|CheckOperandSize|Vex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-sarx, 0xf3f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shlx, 0x66f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shrx, 0xf2f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+bzhi, 0xf5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+mulx, 0xf2f6, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pdep, 0xf2f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pext, 0xf3f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+rorx, 0xf2f0, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+sarx, 0xf3f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shlx, 0x66f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shrx, 0xf2f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 
 // FMA4 instructions
 
@@ -1913,11 +1923,11 @@ lwpins, 0x12/0, LWP, Modrm|SpaceXOP0A|NoSuf|VexVVVV|Vex, { Imm32|Imm32S, Reg32|U
 
 // BMI instructions
 
-andn, 0xf2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsi, 0xf3/3, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsmsk, 0xf3/2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsr, 0xf3/1, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+andn, 0xf2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+bextr, 0xf7, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsi, 0xf3/3, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsmsk, 0xf3/2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsr, 0xf3/1, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 tzcnt, 0xf30fbc, BMI, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // TBM instructions
@@ -2046,13 +2056,21 @@ bndldx, 0x0f1a, MPX, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND }
 
 // SHA instructions.
 sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1nexte, 0xf38c8, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1msg1, 0xf38c9, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1msg2, 0xf38ca, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256msg1, 0xf38cc, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256msg1, 0xdc, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256msg2, 0xf38cd, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256msg2, 0xdd, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 
 // SHA512 instructions.
 
@@ -2114,9 +2132,9 @@ kor<bw>, 0x<bw:kpfx>45, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { R
 kxnor<bw>, 0x<bw:kpfx>46, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 kxor<bw>, 0x<bw:kpfx>47, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 
-kmov<bw>, 0x<bw:kpfx>90, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
-kmov<bw>, 0x<bw:kpfx>91, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, <bw:elem>|Unspecified|BaseIndex }
-kmov<bw>, 0x<bw:kpfx>92, <bw:kcpu>, D|Modrm|Vex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
+kmov<bw>, 0x<bw:kpfx>90, APX_F(<bw:kcpu>), Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
+kmov<bw>, 0x<bw:kpfx>91, APX_F(<bw:kcpu>), Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask, <bw:elem>|Unspecified|BaseIndex }
+kmov<bw>, 0x<bw:kpfx>92, APX_F(<bw:kcpu>), D|Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
 
 knot<bw>, 0x<bw:kpfx>44, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
 kortest<bw>, 0x<bw:kpfx>98, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
@@ -2591,9 +2609,9 @@ vpmovzxdq, 0x6635, AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift
 kadd<dq>, 0x<dq:kpfx>4a, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kand<dq>, 0x<dq:kpfx>41, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kandn<dq>, 0x<dq:kpfx>42, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf|Optimize, { RegMask, RegMask, RegMask }
-kmov<dq>, 0x<dq:kpfx>90, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
-kmov<dq>, 0x<dq:kpfx>91, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
-kmov<dq>, 0xf292, AVX512BW, D|Modrm|Vex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
+kmov<dq>, 0x<dq:kpfx>90, APX_F(AVX512BW), Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
+kmov<dq>, 0x<dq:kpfx>91, APX_F(AVX512BW), Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
+kmov<dq>, 0xf292, APX_F(AVX512BW), D|Modrm|Vex128|EVex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
 knot<dq>, 0x<dq:kpfx>44, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
 kor<dq>, 0x<dq:kpfx>45, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kortest<dq>, 0x<dq:kpfx>98, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
@@ -2992,9 +3010,13 @@ rdsspq, 0xf30f1e/1, SHSTK&x64, Modrm|NoSuf, { Reg64 }
 saveprevssp, 0xf30f01ea, SHSTK, NoSuf, {}
 rstorssp, 0xf30f01/5, SHSTK, Modrm|NoSuf, { Qword|Unspecified|BaseIndex }
 wrssd, 0x0f38f6, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32, Dword|Unspecified|BaseIndex }
+wrssd, 0x66, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4, { Reg32, Dword|Unspecified|BaseIndex }
 wrssq, 0x0f38f6, SHSTK&x64, Modrm|NoSuf|Size64, { Reg64, Qword|Unspecified|BaseIndex }
+wrssq, 0x66, SHSTK&APX_F, Modrm|NoSuf|Size64|EVexMap4, { Reg64, Qword|Unspecified|BaseIndex }
 wrussd, 0x660f38f5, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32, Dword|Unspecified|BaseIndex }
+wrussd, 0x6665, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4, { Reg32, Dword|Unspecified|BaseIndex }
 wrussq, 0x660f38f5, SHSTK&x64, Modrm|NoSuf, { Reg64, Qword|Unspecified|BaseIndex }
+wrussq, 0x6665, SHSTK&APX_F, Modrm|NoSuf|EVexMap4, { Reg64, Qword|Unspecified|BaseIndex }
 setssbsy, 0xf30f01e8, SHSTK, NoSuf, {}
 clrssbsy, 0xf30fae/6, SHSTK, Modrm|NoSuf, { Qword|Unspecified|BaseIndex }
 endbr64, 0xf30f1efa, IBT, NoSuf, {}
@@ -3042,7 +3064,9 @@ cldemote, 0x0f1c/0, CLDEMOTE, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
 // MOVDIR[I,64B] instructions.
 
 movdiri, 0xf38f9, MOVDIRI, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+movdiri, 0xf9, MOVDIRI&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 movdir64b, 0x660f38f8, MOVDIR64B, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movdir64b, 0x66f8, MOVDIR64B&APX_F, Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 
 // MOVEDIR instructions end.
 
@@ -3071,7 +3095,9 @@ vcvtneps2bf16<Vxy>, 0xf372, AVX_NE_CONVERT, Modrm|<Vxy:vex>|Space0F38|VexW0|NoSu
 // ENQCMD instructions.
 
 enqcmd, 0xf20f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+enqcmd, 0xf2f8, APX_F(ENQCMD), Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 enqcmds, 0xf30f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+enqcmds, 0xf3f8, APX_F(ENQCMD), Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 
 // ENQCMD instructions end.
 
@@ -3132,8 +3158,8 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
 
 // AMX instructions.
 
-ldtilecfg, 0x49/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
-sttilecfg, 0x6649/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
+ldtilecfg, 0x49/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
+sttilecfg, 0x6649/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
 
 tcmmimfp16ps, 0x666c, AMX_COMPLEX, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 tcmmrlfp16ps, 0x6c, AMX_COMPLEX, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
@@ -3145,9 +3171,9 @@ tdpbuud, 0x5e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
 tdpbusd, 0x665e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 tdpbsud, 0xf35e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 
-tileloadd, 0xf24b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
-tileloaddt1, 0x664b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
-tilestored, 0xf34b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }
+tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tileloaddt1, 0x664b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tilestored, 0xf34b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }
 
 tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {}
 
@@ -3159,15 +3185,25 @@ tilezero, 0xf249, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
 
 loadiwkey, 0xf30f38dc, KL, Load|Modrm|NoSuf, { RegXMM, RegXMM }
 encodekey128, 0xf30f38fa, KL, Modrm|NoSuf, { Reg32, Reg32 }
+encodekey128, 0xf3da, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32, Reg32 }
 encodekey256, 0xf30f38fb, KL, Modrm|NoSuf, { Reg32, Reg32 }
+encodekey256, 0xf3db, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32, Reg32 }
 aesenc128kl, 0xf30f38dc, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesenc128kl, 0xf3dc, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesdec128kl, 0xf30f38dd, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesdec128kl, 0xf3dd, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesenc256kl, 0xf30f38de, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesenc256kl, 0xf3de, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesdec256kl, 0xf30f38df, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesdec256kl, 0xf3df, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesencwide128kl, 0xf30f38d8/0, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesencwide128kl, 0xf3d8/0, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesdecwide128kl, 0xf30f38d8/1, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesdecwide128kl, 0xf3d8/1, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesencwide256kl, 0xf30f38d8/2, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesencwide256kl, 0xf3d8/2, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesdecwide256kl, 0xf30f38d8/3, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesdecwide256kl, 0xf3d8/3, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 
 // KEYLOCKER instructions end.
 
@@ -3315,7 +3351,7 @@ prefetchit1, 0xf18/6, PREFETCHI, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
 
 // CMPCCXADD instructions.
 
-cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD, Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+cmp<cc>xadd, 0x66e<cc:opc>, APX_F(CMPCCXADD), Modrm|Vex|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 
 // CMPCCXADD instructions end.
 
@@ -3335,9 +3371,13 @@ wrmsrlist, 0xf30f01c6, MSRLIST, NoSuf, {}
 // RAO-INT instructions.
 
 aadd, 0xf38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aadd, 0xfc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 aand, 0x660f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aand, 0x66fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 aor, 0xf20f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aor, 0xf2fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 axor, 0xf30f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+axor, 0xf3fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 
 // RAO-INT instructions end.
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH V5 1/9] Support APX GPR32 with rex2 prefix
  2023-12-28  1:53   ` H.J. Lu
@ 2024-01-04  8:02     ` Jan Beulich
  2024-01-04 11:27       ` Cui, Lili
  0 siblings, 1 reply; 30+ messages in thread
From: Jan Beulich @ 2024-01-04  8:02 UTC (permalink / raw)
  To: H.J. Lu, Cui, Lili; +Cc: binutils, Nick Clifton

On 28.12.2023 02:53, H.J. Lu wrote:
> On Thu, Dec 28, 2023 at 01:27:06AM +0000, Cui, Lili wrote:
>> APX uses the REX2 prefix to support EGPR for map0 and map1 of legacy
>> instructions. We added the NoEgpr flag in i386-gen.c for instructions
>> that do not support EGPR.
>>[...]
> 
> OK.

I can't believe you approve a patch (apparently even a whole series) without
any comment and without allowing me time to re-review it, when you _know_
that I put in a whole lot of effort to try to make this start out in good
shape, rather than having to clean up afterwards. This isn't to say the
patch still needs further adjustment - I simply didn't get to looking at the
series again; before the holidays I didn't even manage to get through all of
v4.

Lili, along these lines, while formally it may be okay for you to commit
with H.J.'s approval, imo common sense should have told you not to.

Further, the bigger a patch, the more important it is that you trim context
when replying. Every reader has to scroll through hundreds of lines just
to find there are no other comments, just the "OK".

Please be more considerate going forward,
Jan


* RE: [PATCH V5 1/9] Support APX GPR32 with rex2 prefix
  2024-01-04  8:02     ` Jan Beulich
@ 2024-01-04 11:27       ` Cui, Lili
  0 siblings, 0 replies; 30+ messages in thread
From: Cui, Lili @ 2024-01-04 11:27 UTC (permalink / raw)
  To: Beulich, Jan, H.J. Lu; +Cc: binutils, Nick Clifton

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Thursday, January 4, 2024 4:03 PM
> To: H.J. Lu <hjl.tools@gmail.com>; Cui, Lili <lili.cui@intel.com>
> Cc: binutils@sourceware.org; Nick Clifton <nickc@redhat.com>
> Subject: Re: [PATCH V5 1/9] Support APX GPR32 with rex2 prefix
> 
> On 28.12.2023 02:53, H.J. Lu wrote:
> > On Thu, Dec 28, 2023 at 01:27:06AM +0000, Cui, Lili wrote:
> >> APX uses the REX2 prefix to support EGPR for map0 and map1 of legacy
> >> instructions. We added the NoEgpr flag in i386-gen.c for instructions
> >> that do not support EGPR.
> >>[...]
> >
> > OK.
> 
> I can't believe you approve a patch (apparently even a whole series) without
> any comment and without allowing me time to re-review it, when you _know_
> that I put in a whole lot of effort to try to make this start out in good shape,
> rather than having to clean up afterwards. This isn't to say the patch still needs
> further adjustment - I simply didn't get to looking at the series again; before
> the holidays I didn't even manage to get through all of v4.
> 
> Lili, along these lines, while formally it may be okay for you to commit with
> H.J.'s approval, imo common sense should have told you not to.
> 
> Further, the bigger a patch, the more important it is that you trim context
> when replying. Every reader has to scroll through hundreds of lines just to find
> there are no other comments, just the "OK".
> 
> Please be more considerate going forward,
> Jan

Hi Jan,

Happy new year! I'm glad you're back from vacation.

Our APX patches have been under revision for three months, and there are no known errors affecting correctness now. Users interested in APX are waiting for it to become available on master. H.J. checked the patches thoroughly and found no major problems, so I committed them. There are still many aspects of the code that can be optimized; as APX development and our discussions continue, I will keep following up and implementing your suggestions.

We sincerely appreciate the time and effort you invested in reviewing these APX patches. I also learned a lot from you during this period. Wishing you a joyful New Year!

Regards,
Lili.



* Re: [PATCH V5 9/9] Support APX JMPABS for disassembler
  2023-12-28  1:27 ` [PATCH V5 9/9] Support APX JMPABS for disassembler Cui, Lili
  2023-12-28  1:56   ` H.J. Lu
@ 2024-01-05 12:08   ` Jan Beulich
  2024-01-08  2:32     ` Hu, Lin1
  1 sibling, 1 reply; 30+ messages in thread
From: Jan Beulich @ 2024-01-05 12:08 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hongjiu.lu, Hu, Lin1, binutils

On 28.12.2023 02:27, Cui, Lili wrote:
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
> @@ -0,0 +1,15 @@
> +# Check bytecode of APX_F jmpabs instructions with illegal encode.
> +
> +	.text
> +# With 66 prefix
> +	.byte 0x66,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# With 67 prefix
> +	.byte 0x67,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# With F2 prefix
> +	.byte 0xf2,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# With F3 prefix
> +	.byte 0xf3,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# With LOCK prefix
> +	.byte 0xf0,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +# REX2.M0 = 0 REX2.W = 1
> +	.byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00

Considering that I specifically asked that this use .insn, and that I
further took the time to make a patch to make .insn work with {rex2},
I find it rather poor that here and ...

> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
> @@ -0,0 +1,5 @@
> +# Check 64bit APX_F JMPABS instructions
> +
> +	.text
> + _start:
> +	.byte 0xd5,0x00,0xa1,0x02,0x00,0x00,0x00,0x00,0x00,0x00,0x00

... here it is still .byte that is being used.

Jan


* Re: [PATCH V5 8/9] Support APX NDD optimized encoding.
  2023-12-28  1:27 ` [PATCH V5 8/9] Support APX NDD optimized encoding Cui, Lili
  2023-12-28  1:56   ` H.J. Lu
@ 2024-01-05 14:36   ` Jan Beulich
  2024-01-08  2:49     ` Hu, Lin1
  1 sibling, 1 reply; 30+ messages in thread
From: Jan Beulich @ 2024-01-05 14:36 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hongjiu.lu, Hu, Lin1, binutils

On 28.12.2023 02:27, Cui, Lili wrote:
> @@ -7754,6 +7804,60 @@ match_template (char mnem_suffix)
>  	  i.memshift = memshift;
>  	}
>  
> +      /* If we can optimize a NDD insn to legacy insn, like
> +	 add %r16, %r8, %r8 -> add %r16, %r8,
> +	 add  %r8, %r16, %r8 -> add %r16, %r8, then rematch template.
> +	 Note that the semantics have not been changed.  */
> +      if (optimize
> +	  && !i.no_optimize
> +	  && i.vec_encoding != vex_encoding_evex
> +	  && t + 1 < current_templates.end
> +	  && !t[1].opcode_modifier.evex
> +	  && t[1].opcode_space <= SPACE_0F38
> +	  && t->opcode_modifier.vexvvvv == VexVVVV_DST
> +	  && (i.types[i.operands - 1].bitfield.dword
> +	      || i.types[i.operands - 1].bitfield.qword))

While you check the last operand's type here, ...

> +	{
> +	  unsigned int match_dest_op = can_convert_NDD_to_legacy (t);
> +
> +	  if (match_dest_op != (unsigned int) ~0)
> +	    {
> +	      size_match = true;
> +	      /* We ensure that the next template has the same input
> +		 operands as the original matching template by the first
> +		 opernd (ATT). To avoid someone support new NDD insns and
> +		 put it in the wrong position.  */
> +	      overlap0 = operand_type_and (i.types[0],
> +					   t[1].operand_types[0]);
> +	      if (t->opcode_modifier.d)
> +		overlap1 = operand_type_and (i.types[0],
> +					     t[1].operand_types[1]);
> +	      if (!operand_type_match (overlap0, i.types[0])
> +		  && (!t->opcode_modifier.d
> +		      || !operand_type_match (overlap1, i.types[0])))
> +		size_match = false;

.. why is it the first one's here? That may be a memory operand, which
in AT&T mode cannot possibly have a size.

Jan


* Re: [PATCH V5 1/9] Support APX GPR32 with rex2 prefix
  2023-12-28  1:27 ` [PATCH V5 1/9] Support APX GPR32 with rex2 prefix Cui, Lili
  2023-12-28  1:53   ` H.J. Lu
@ 2024-01-05 14:45   ` Jan Beulich
  2024-01-08  3:41     ` Cui, Lili
  1 sibling, 1 reply; 30+ messages in thread
From: Jan Beulich @ 2024-01-05 14:45 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hongjiu.lu, binutils

On 28.12.2023 02:27, Cui, Lili wrote:
> @@ -4168,11 +4201,11 @@ static void establish_rex (void)
>  	}
>      }
>  
> -  if (i.rex == 0 && i.rex_encoding)
> +   if (i.rex == 0 && i.rex2 == 0 && (i.rex_encoding || i.rex2_encoding))

Any reason the indentation of this line was changed (to now be wrong)?

> @@ -7015,6 +7080,43 @@ VEX_check_encoding (const insn_template *t)
>    return 0;
>  }
>  
> +/* Check if Egprs operands are valid for the instruction.  */
> +
> +static bool
> +check_EgprOperands (const insn_template *t)
> +{
> +  if (!t->opcode_modifier.noegpr)
> +    return 0;
> +
> +  for (unsigned int op = 0; op < i.operands; op++)
> +    {
> +      if (i.types[op].bitfield.class != Reg)
> +	continue;
> +
> +      if (i.op[op].regs->reg_flags & RegRex2)
> +	{
> +	  i.error = register_type_mismatch;
> +	  return 1;
> +	}
> +    }
> +
> +  if ((i.index_reg && (i.index_reg->reg_flags & RegRex2))
> +      || (i.base_reg && (i.base_reg->reg_flags & RegRex2)))
> +    {
> +      i.error = unsupported_EGPR_for_addressing;
> +      return 1;
> +    }
> +
> +  /* Check if pseudo prefix {rex2} is valid.  */
> +  if (i.rex2_encoding)
> +    {
> +      i.error = invalid_pseudo_prefix;
> +      return 1;
> +    }
> +
> +  return 0;
> +}

A function with return type "bool" would better return true/false, not
1/0, with "true" representing success / okay / good.
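A rough sketch of that convention, using simplified stand-in types rather than the real gas structures (the names egpr_operands_ok and struct operand are hypothetical and only illustrate the "true means valid" return sense; this is not the actual tc-i386.c code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a gas operand.  */
struct operand
{
  bool is_reg;   /* Operand is a general-purpose register.  */
  bool is_egpr;  /* Register is one of r16..r31.  */
};

/* Return true when the operands are acceptable for the template,
   false when an EGPR is used with a template that forbids it.  */
static bool
egpr_operands_ok (bool template_noegpr, const struct operand *ops,
		  size_t n_ops)
{
  assert (ops != NULL || n_ops == 0);

  /* Templates without the NoEgpr restriction accept any register.  */
  if (!template_noegpr)
    return true;

  for (size_t op = 0; op < n_ops; op++)
    if (ops[op].is_reg && ops[op].is_egpr)
      return false;

  return true;
}
```

With this sense, a caller reads naturally as "if (!egpr_operands_ok (...)) continue;", and the error code can still be recorded just before each "return false".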

Jan


* RE: [PATCH V5 9/9] Support APX JMPABS for disassembler
  2024-01-05 12:08   ` Jan Beulich
@ 2024-01-08  2:32     ` Hu, Lin1
  2024-01-08  7:41       ` Jan Beulich
  0 siblings, 1 reply; 30+ messages in thread
From: Hu, Lin1 @ 2024-01-08  2:32 UTC (permalink / raw)
  To: Beulich, Jan, Cui, Lili; +Cc: Lu, Hongjiu, binutils

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, January 5, 2024 8:09 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; Hu, Lin1 <lin1.hu@intel.com>;
> binutils@sourceware.org
> Subject: Re: [PATCH V5 9/9] Support APX JMPABS for disassembler
> 
> On 28.12.2023 02:27, Cui, Lili wrote:
> > --- /dev/null
> > +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
> > @@ -0,0 +1,15 @@
> > +# Check bytecode of APX_F jmpabs instructions with illegal encode.
> > +
> > +	.text
> > +# With 66 prefix
> > +	.byte 0x66,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> > +# With 67 prefix
> > +	.byte 0x67,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> > +# With F2 prefix
> > +	.byte 0xf2,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> > +# With F3 prefix
> > +	.byte 0xf3,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> > +# With LOCK prefix
> > +	.byte 0xf0,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> > +# REX2.M0 = 0 REX2.W = 1
> > +	.byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> 
> Considering that I specifically asked that this use .insn, and that I further took
> the time to make a patch to make .insn work with {rex2}, I find it rather poor
> that here and ...
> 
> > --- /dev/null
> > +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
> > @@ -0,0 +1,5 @@
> > +# Check 64bit APX_F JMPABS instructions
> > +
> > +	.text
> > + _start:
> > +	.byte 0xd5,0x00,0xa1,0x02,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> 
> ... here it is still .byte that is being used.
> 

I don't always keep track of which patches get pushed to binutils. We can upstream a follow-up fix patch like this:
        .text
 # With 66 prefix
-       .byte 0x66,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+       .insn {rex2} data16 0xa1, $1{:u64}
 # With 67 prefix
-       .byte 0x67,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+       .insn {rex2} addr32 0xa1, $1{:u64}
 # With F2 prefix
-       .byte 0xf2,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+       .insn {rex2} repne 0xa1, $1{:u64}
 # With F3 prefix
-       .byte 0xf3,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+       .insn {rex2} rep 0xa1, $1{:u64}
 # With LOCK prefix
-       .byte 0xf0,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+       .insn {rex2} lock 0xa1, $1{:u64}
 # REX2.M0 = 0 REX2.W = 1
-       .byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
+       .insn {rex2} 0x08,0xa1, $1{:u64}
+#.byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00

But the last test, "REX2.M0 = 0 REX2.W = 1", is invalid. Do you have any advice?

BRs,
Lin


* RE: [PATCH V5 8/9] Support APX NDD optimized encoding.
  2024-01-05 14:36   ` Jan Beulich
@ 2024-01-08  2:49     ` Hu, Lin1
  0 siblings, 0 replies; 30+ messages in thread
From: Hu, Lin1 @ 2024-01-08  2:49 UTC (permalink / raw)
  To: Beulich, Jan, Cui, Lili; +Cc: Lu, Hongjiu, binutils

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, January 5, 2024 10:36 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; Hu, Lin1 <lin1.hu@intel.com>;
> binutils@sourceware.org
> Subject: Re: [PATCH V5 8/9] Support APX NDD optimized encoding.
> 
> On 28.12.2023 02:27, Cui, Lili wrote:
> > @@ -7754,6 +7804,60 @@ match_template (char mnem_suffix)
> >  	  i.memshift = memshift;
> >  	}
> >
> > +      /* If we can optimize a NDD insn to legacy insn, like
> > +	 add %r16, %r8, %r8 -> add %r16, %r8,
> > +	 add  %r8, %r16, %r8 -> add %r16, %r8, then rematch template.
> > +	 Note that the semantics have not been changed.  */
> > +      if (optimize
> > +	  && !i.no_optimize
> > +	  && i.vec_encoding != vex_encoding_evex
> > +	  && t + 1 < current_templates.end
> > +	  && !t[1].opcode_modifier.evex
> > +	  && t[1].opcode_space <= SPACE_0F38
> > +	  && t->opcode_modifier.vexvvvv == VexVVVV_DST
> > +	  && (i.types[i.operands - 1].bitfield.dword
> > +	      || i.types[i.operands - 1].bitfield.qword))
> 
> While you check the last operand's type here, ...
> 
> > +	{
> > +	  unsigned int match_dest_op = can_convert_NDD_to_legacy (t);
> > +
> > +	  if (match_dest_op != (unsigned int) ~0)
> > +	    {
> > +	      size_match = true;
> > +	      /* We ensure that the next template has the same input
> > +		 operands as the original matching template by the first
> > +		 opernd (ATT). To avoid someone support new NDD insns and
> > +		 put it in the wrong position.  */
> > +	      overlap0 = operand_type_and (i.types[0],
> > +					   t[1].operand_types[0]);
> > +	      if (t->opcode_modifier.d)
> > +		overlap1 = operand_type_and (i.types[0],
> > +					     t[1].operand_types[1]);
> > +	      if (!operand_type_match (overlap0, i.types[0])
> > +		  && (!t->opcode_modifier.d
> > +		      || !operand_type_match (overlap1, i.types[0])))
> > +		size_match = false;
> 
> .. why is it the first one's here? That may be a memory operand, which in AT&T
> mode cannot possibly have a size.
> 

These two checks serve different purposes. I check the last operand because 8/16-bit operands don't have zero_upper behavior (as you mentioned). If the insn is an NDD insn, its last operand must be a register operand, so the memory operand problem does not arise there.

The reason for checking the first operand is stated in the comment; that check doesn't need to care about the size. It ensures that the original matching template has the same input operands as the next template (excluding the last operand), because the original template is an NDD template and the next template is its legacy version (provided future developers put them in the right order).
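
To illustrate the condition under which the rewrite applies, here is a minimal sketch in C based only on the comment in the quoted hunk. It is not the code from tc-i386.c: `ndd_matching_source` and `NO_MATCH` are hypothetical names, plain register numbers stand in for operands, and the `commutative` flag is an assumption standing in for the template's `d` modifier check.

```c
#include <stdbool.h>

/* Sentinel meaning "cannot be converted", mirroring the (unsigned int) ~0
   comparison in the quoted hunk.  */
#define NO_MATCH (~0u)

/* Operands are given in AT&T order: src0, src1, dst.
   Return the index of the source operand that equals the destination,
   or NO_MATCH if the NDD form cannot be folded into a two-operand form.  */
static unsigned int
ndd_matching_source (unsigned int src0, unsigned int src1,
                     unsigned int dst, bool commutative)
{
  /* add %r16, %r8, %r8 -> add %r16, %r8  */
  if (src1 == dst)
    return 1;

  /* add %r8, %r16, %r8 -> add %r16, %r8, only if the sources may swap.  */
  if (commutative && src0 == dst)
    return 0;

  return NO_MATCH;
}
```

With register numbers 8 and 16, `ndd_matching_source (16, 8, 8, true)` selects operand 1 and `ndd_matching_source (8, 16, 8, true)` selects operand 0, while a non-commutative insn in the second form is left alone.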

BRs,
Lin


* RE: [PATCH V5 1/9] Support APX GPR32 with rex2 prefix
  2024-01-05 14:45   ` Jan Beulich
@ 2024-01-08  3:41     ` Cui, Lili
  0 siblings, 0 replies; 30+ messages in thread
From: Cui, Lili @ 2024-01-08  3:41 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, binutils



> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, January 5, 2024 10:45 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; binutils@sourceware.org
> Subject: Re: [PATCH V5 1/9] Support APX GPR32 with rex2 prefix
> 
> On 28.12.2023 02:27, Cui, Lili wrote:
> > @@ -4168,11 +4201,11 @@ static void establish_rex (void)
> >  	}
> >      }
> >
> > -  if (i.rex == 0 && i.rex_encoding)
> > +   if (i.rex == 0 && i.rex2 == 0 && (i.rex_encoding || i.rex2_encoding))
> 
> Any reason the indentation of this line was changed (to now be wrong)?
> 

Ok, I will fix it.

> > @@ -7015,6 +7080,43 @@ VEX_check_encoding (const insn_template *t)
> >    return 0;
> >  }
> >
> > +/* Check if Egprs operands are valid for the instruction.  */
> > +
> > +static bool
> > +check_EgprOperands (const insn_template *t)
> > +{
> > +  if (!t->opcode_modifier.noegpr)
> > +    return 0;
> > +
> > +  for (unsigned int op = 0; op < i.operands; op++)
> > +    {
> > +      if (i.types[op].bitfield.class != Reg)
> > +	continue;
> > +
> > +      if (i.op[op].regs->reg_flags & RegRex2)
> > +	{
> > +	  i.error = register_type_mismatch;
> > +	  return 1;
> > +	}
> > +    }
> > +
> > +  if ((i.index_reg && (i.index_reg->reg_flags & RegRex2))
> > +      || (i.base_reg && (i.base_reg->reg_flags & RegRex2)))
> > +    {
> > +      i.error = unsupported_EGPR_for_addressing;
> > +      return 1;
> > +    }
> > +
> > +  /* Check if pseudo prefix {rex2} is valid.  */
> > +  if (i.rex2_encoding)
> > +    {
> > +      i.error = invalid_pseudo_prefix;
> > +      return 1;
> > +    }
> > +
> > +  return 0;
> > +}
> 
> A function with return type "bool" would better return true/false, not 1/0,
> with "true" representing success / okay / good.
> 
OK.
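
For reference, a minimal sketch in C of what the function could look like with the polarity flipped so that "true" means the operands are fine. This is only an illustration, not the planned patch: `reg_sketch`, `insn_sketch`, `REG_REX2`, `uses_egpr` and `egpr_operands_ok` are hypothetical stand-ins for the real gas types and the global `i`, and the error codes set through `i.error` are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real code uses reg_entry and the global `i`.  */
#define REG_REX2 0x1u

struct reg_sketch
{
  unsigned int reg_flags;
};

struct insn_sketch
{
  unsigned int operands;
  const struct reg_sketch *regs[3];	/* NULL for non-register operands.  */
  const struct reg_sketch *base_reg;
  const struct reg_sketch *index_reg;
  bool rex2_encoding;
};

/* Return true if R is present and is one of the extended GPRs.  */
static bool
uses_egpr (const struct reg_sketch *r)
{
  return r != NULL && (r->reg_flags & REG_REX2) != 0;
}

/* Return true if the operands are acceptable, false otherwise.
   NOEGPR says whether the matched template forbids EGPRs.  */
static bool
egpr_operands_ok (const struct insn_sketch *insn, bool noegpr)
{
  if (!noegpr)
    return true;

  for (unsigned int op = 0; op < insn->operands; op++)
    if (uses_egpr (insn->regs[op]))
      return false;

  if (uses_egpr (insn->base_reg) || uses_egpr (insn->index_reg))
    return false;

  /* A {rex2} pseudo prefix is not acceptable here either.  */
  return !insn->rex2_encoding;
}
```

A caller would then test `!egpr_operands_ok (...)` where it previously tested for a non-zero return value.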

Thanks,
Lili.


* Re: [PATCH V5 9/9] Support APX JMPABS for disassembler
  2024-01-08  2:32     ` Hu, Lin1
@ 2024-01-08  7:41       ` Jan Beulich
  2024-01-08  7:44         ` Hu, Lin1
  0 siblings, 1 reply; 30+ messages in thread
From: Jan Beulich @ 2024-01-08  7:41 UTC (permalink / raw)
  To: Hu, Lin1; +Cc: Lu, Hongjiu, binutils, Cui, Lili

On 08.01.2024 03:32, Hu, Lin1 wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Friday, January 5, 2024 8:09 PM
>> To: Cui, Lili <lili.cui@intel.com>
>> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; Hu, Lin1 <lin1.hu@intel.com>;
>> binutils@sourceware.org
>> Subject: Re: [PATCH V5 9/9] Support APX JMPABS for disassembler
>>
>> On 28.12.2023 02:27, Cui, Lili wrote:
>>> --- /dev/null
>>> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
>>> @@ -0,0 +1,15 @@
>>> +# Check bytecode of APX_F jmpabs instructions with illegal encode.
>>> +
>>> +	.text
>>> +# With 66 prefix
>>> +	.byte 0x66,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
>>> +# With 67 prefix
>>> +	.byte 0x67,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
>>> +# With F2 prefix
>>> +	.byte 0xf2,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
>>> +# With F3 prefix
>>> +	.byte 0xf3,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
>>> +# With LOCK prefix
>>> +	.byte 0xf0,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
>>> +# REX2.M0 = 0 REX2.W = 1
>>> +	.byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
>>
>> Considering that I specifically asked that this use .insn, and that I further took
>> the time to make a patch to make .insn work with {rex2}, I find it rather poor
>> that here and ...
>>
>>> --- /dev/null
>>> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
>>> @@ -0,0 +1,5 @@
>>> +# Check 64bit APX_F JMPABS instructions
>>> +
>>> +	.text
>>> + _start:
>>> +	.byte 0xd5,0x00,0xa1,0x02,0x00,0x00,0x00,0x00,0x00,0x00,0x00
>>
>> ... here it is still .byte that is being used.
>>
> 
> I'm not always keeping my eye on what patches push in Binutils.

That's not a general requirement of course, but when it specifically is
work done for you, I would have expected it to be recognized and then
leveraged.

> We can upstream a new fix patch like this. 
>         .text
>  # With 66 prefix
> -       .byte 0x66,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +       .insn {rex2} data16 0xa1, $1{:u64}
>  # With 67 prefix
> -       .byte 0x67,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +       .insn {rex2} addr32 0xa1, $1{:u64}
>  # With F2 prefix
> -       .byte 0xf2,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +       .insn {rex2} repne 0xa1, $1{:u64}
>  # With F3 prefix
> -       .byte 0xf3,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +       .insn {rex2} rep 0xa1, $1{:u64}
>  # With LOCK prefix
> -       .byte 0xf0,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +       .insn {rex2} lock 0xa1, $1{:u64}
>  # REX2.M0 = 0 REX2.W = 1
> -       .byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> +       .insn {rex2} 0x08,0xa1, $1{:u64}
> +#.byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> 
> But the last test " REX2.M0 = 0 REX2.W = 1" is invalid, do you have some advise?

Well, no, as long as {rex2} cannot specify any of the payload bits, and when
there are no operands controlling the individual bit (due to there not being
any register/memory operands), it can't be easily expressed using .insn.
Further work would be required to permit that, but for the time being in
_such_ cases it is (of course) okay to use .byte.
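
To spell out which bits are meant, here is a small C sketch that decodes the payload byte following 0xd5. It assumes the REX2 payload layout as I read it from the APX specification (M0, R4, X4, B4, W, R3, X3, B3 from bit 7 down to bit 0); `rex2_payload` and `decode_rex2_payload` are hypothetical names, not anything in gas or opcodes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of the REX2 payload byte, i.e. the byte following 0xd5.
   Assumed layout, bit 7 down to bit 0: M0 R4 X4 B4 W R3 X3 B3.  */
struct rex2_payload
{
  bool m0, r4, x4, b4, w, r3, x3, b3;
};

/* Split a payload byte into its individual bits.  */
static struct rex2_payload
decode_rex2_payload (uint8_t byte)
{
  struct rex2_payload p = {
    .m0 = (byte >> 7) & 1,
    .r4 = (byte >> 6) & 1,
    .x4 = (byte >> 5) & 1,
    .b4 = (byte >> 4) & 1,
    .w  = (byte >> 3) & 1,
    .r3 = (byte >> 2) & 1,
    .x3 = (byte >> 1) & 1,
    .b3 = byte & 1,
  };
  return p;
}
```

Under that assumption the payload 0x08 from the last test decodes to W = 1 with every other bit clear, which matches its "REX2.M0 = 0 REX2.W = 1" comment, and none of those bits is tied to an operand in the JMPABS case.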

Jan


* RE: [PATCH V5 9/9] Support APX JMPABS for disassembler
  2024-01-08  7:41       ` Jan Beulich
@ 2024-01-08  7:44         ` Hu, Lin1
  0 siblings, 0 replies; 30+ messages in thread
From: Hu, Lin1 @ 2024-01-08  7:44 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, binutils, Cui, Lili

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, January 8, 2024 3:41 PM
> To: Hu, Lin1 <lin1.hu@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; binutils@sourceware.org; Cui, Lili
> <lili.cui@intel.com>
> Subject: Re: [PATCH V5 9/9] Support APX JMPABS for disassembler
> 
> On 08.01.2024 03:32, Hu, Lin1 wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Friday, January 5, 2024 8:09 PM
> >> To: Cui, Lili <lili.cui@intel.com>
> >> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; Hu, Lin1 <lin1.hu@intel.com>;
> >> binutils@sourceware.org
> >> Subject: Re: [PATCH V5 9/9] Support APX JMPABS for disassembler
> >>
> >> On 28.12.2023 02:27, Cui, Lili wrote:
> >>> --- /dev/null
> >>> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
> >>> @@ -0,0 +1,15 @@
> >>> +# Check bytecode of APX_F jmpabs instructions with illegal encode.
> >>> +
> >>> +	.text
> >>> +# With 66 prefix
> >>> +	.byte 0x66,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> >>> +# With 67 prefix
> >>> +	.byte 0x67,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> >>> +# With F2 prefix
> >>> +	.byte 0xf2,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> >>> +# With F3 prefix
> >>> +	.byte 0xf3,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> >>> +# With LOCK prefix
> >>> +	.byte 0xf0,0xd5,0x00,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> >>> +# REX2.M0 = 0 REX2.W = 1
> >>> +	.byte 0xd5,0x08,0xa1,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> >>
> >> Considering that I specifically asked that this use .insn, and that I
> >> further took the time to make a patch to make .insn work with {rex2},
> >> I find it rather poor that here and ...
> >>
> >>> --- /dev/null
> >>> +++ b/gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
> >>> @@ -0,0 +1,5 @@
> >>> +# Check 64bit APX_F JMPABS instructions
> >>> +
> >>> +	.text
> >>> + _start:
> >>> +	.byte 0xd5,0x00,0xa1,0x02,0x00,0x00,0x00,0x00,0x00,0x00,0x00
> >>
> >> ... here it is still .byte that is being used.
> >>
> >
> > I'm not always keeping my eye on what patches push in Binutils.
> 
> That's not a general requirement of course, but when it specifically is work done
> for you, I would have expected it to be recognized and then leveraged.
> 

OK, I will upstream another fix patch.

BRs,
Lin


Thread overview: 30+ messages
2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
2023-12-28  1:27 ` [PATCH V5 1/9] Support APX GPR32 with rex2 prefix Cui, Lili
2023-12-28  1:53   ` H.J. Lu
2024-01-04  8:02     ` Jan Beulich
2024-01-04 11:27       ` Cui, Lili
2024-01-05 14:45   ` Jan Beulich
2024-01-08  3:41     ` Cui, Lili
2023-12-28  1:27 ` [PATCH V5 2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions Cui, Lili
2023-12-28  1:54   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 3/9] Support APX GPR32 with extend evex prefix Cui, Lili
2023-12-28  1:54   ` H.J. Lu
2023-12-28 13:48     ` Cui, Lili
2023-12-28  1:27 ` [PATCH V5 4/9] Add tests for " Cui, Lili
2023-12-28  1:54   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 5/9] Support APX NDD Cui, Lili
2023-12-28  1:55   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 6/9] Support APX Push2/Pop2 Cui, Lili
2023-12-28  1:55   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 7/9] Support APX pushp/popp Cui, Lili
2023-12-28  1:56   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 8/9] Support APX NDD optimized encoding Cui, Lili
2023-12-28  1:56   ` H.J. Lu
2024-01-05 14:36   ` Jan Beulich
2024-01-08  2:49     ` Hu, Lin1
2023-12-28  1:27 ` [PATCH V5 9/9] Support APX JMPABS for disassembler Cui, Lili
2023-12-28  1:56   ` H.J. Lu
2024-01-05 12:08   ` Jan Beulich
2024-01-08  2:32     ` Hu, Lin1
2024-01-08  7:41       ` Jan Beulich
2024-01-08  7:44         ` Hu, Lin1