public inbox for binutils@sourceware.org
From: "Cui, Lili" <lili.cui@intel.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: "binutils@sourceware.org" <binutils@sourceware.org>,
	"Beulich, Jan" <JBeulich@suse.com>
Subject: RE: [PATCH V5 3/9] Support APX GPR32 with extend evex prefix
Date: Thu, 28 Dec 2023 13:48:06 +0000
Message-ID: <SJ0PR11MB560016482CD2704C3DF4636F9E9EA@SJ0PR11MB5600.namprd11.prod.outlook.com>
In-Reply-To: <ZYzVUWHGdKzU_CyV@gmail.com>

This is what I checked in, thanks.

Lili.

> -----Original Message-----
> From: H.J. Lu <hjl.tools@gmail.com>
> Sent: Thursday, December 28, 2023 9:54 AM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: binutils@sourceware.org; Beulich, Jan <JBeulich@suse.com>
> Subject: Re: [PATCH V5 3/9] Support APX GPR32 with extend evex prefix
> 
> On Thu, Dec 28, 2023 at 01:27:08AM +0000, Cui, Lili wrote:
> > > This patch adds the non-ND, non-NF forms of the EVEX-promoted insns.
> >
> > EVEX extension of legacy instructions:
> >   All promoted legacy instructions are placed in EVEX map 4, which is
> >   currently reserved.
> > EVEX extension of EVEX instructions:
> >   All existing EVEX instructions are extended by APX using the extended
> >   EVEX prefix, so that they can access all 32 GPRs.
> > EVEX extension of VEX instructions:
> >   Promoting a VEX instruction into the EVEX space does not change the map
> >   id, the opcode, or the operand encoding of the VEX instruction.
> >
> > > Note: The promoted versions of MOVBE will be extended to include the
> > >   “MOVBE reg1, reg2”.
> >
> >   gas/ChangeLog:
> >
> >   2023-12-28  Lingling Kong <lingling.kong@intel.com>
> > 	      H.J. Lu  <hongjiu.lu@intel.com>
> > 	      Lili Cui <lili.cui@intel.com>
> > 	      Lin Hu   <lin1.hu@intel.com>
> >
> > 	* config/tc-i386.c
> > 	(install_template): Handled APX combines.
> > 	(is_apx_evex_encoding): Test apx evex encoding.
> > > 	(build_apx_evex_prefix): Enable APX evex prefix.
> > 	(md_assemble): Handle apx with evex encoding.
> > 	(process_suffix): Handle apx map4 prefix.
> > 	(check_register): Assign i.vec_encoding for APX evex instructions.
> > 	* testsuite/gas/i386/x86-64-evex.d: Adjust test cases.
> > 	* testsuite/gas/i386/x86-64.exp: Adjust x86-64-inval-movbe.
> >
> > opcodes/ChangeLog:
> >
> > 	* i386-dis-evex-len.h: Handle EVEX_LEN_0F38F2, EVEX_LEN_0F38F3.
> > 	* i386-dis-evex-prefix.h: Handle PREFIX_EVEX_0F38F2_L_0,
> > 	PREFIX_EVEX_0F38F3_L_0, PREFIX_EVEX_MAP4_D8,
> > 	PREFIX_EVEX_MAP4_DA, PREFIX_EVEX_MAP4_DB,
> > 	PREFIX_EVEX_MAP4_DC, PREFIX_EVEX_MAP4_DD,
> > 	PREFIX_EVEX_MAP4_DE, PREFIX_EVEX_MAP4_DF,
> > 	PREFIX_EVEX_MAP4_F0, PREFIX_EVEX_MAP4_F1,
> > 	PREFIX_EVEX_MAP4_F2, PREFIX_EVEX_MAP4_F8.
> > 	* i386-dis-evex-reg.h: Handle REG_EVEX_0F38F3_L_0_P_0.
> > > 	* i386-dis-evex.h: Add EVEX_MAP4_ for legacy insns
> > > 	promoted to apx to use gpr32.
> > > 	* i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F90,
> > > 	X86_64_EVEX_0F92, X86_64_EVEX_0F93, X86_64_EVEX_0F38F2,
> > > 	X86_64_EVEX_0F38F3, X86_64_EVEX_0F38F5, X86_64_EVEX_0F38F6,
> > > 	X86_64_EVEX_0F38F7, X86_64_EVEX_0F3AF0, X86_64_EVEX_0F91.
> > 	* i386-dis.c
> > 	(struct instr_info): Deleted bool r.
> > 	(PREFIX_NP_OR_DATA): New.
> > 	(NO_PREFIX): New.
> > 	(putop): Ditto.
> > > 	(X86_64_EVEX_FROM_VEX_TABLE): Ditto.
> > 	(get_valid_dis386): Decode insn erex in extend evex prefix.
> > 	Handle EVEX_MAP4
> > 	(print_insn): Handle PREFIX_DATA_AND_NP_ONLY.
> > 	(print_register): Handle apx instructions decode.
> > > 	(OP_E_memory): Ditto.
> > > 	(OP_G): Ditto.
> > > 	(OP_XMM): Ditto.
> > > 	(DistinctDest_Fixup): Ditto.
> > 	* i386-gen.c (process_i386_opcode_modifier): Add EVEXMAP4.
> > > 	* i386-opc.h (SPACE_EVEXMAP4): Add for legacy insns
> > > 	promoted to evex.
> > > 	* i386-opc.tbl: Handle some legacy and vex insns that don't
> > > 	support gpr32.  Add some legacy insns (map2 / 3) promoted
> > > 	to evex.
> > ---
> >  gas/config/tc-i386.c                 |  72 +++++++++++-
> >  gas/testsuite/gas/i386/x86-64-evex.d |   2 +-
> >  gas/testsuite/gas/i386/x86-64.exp    |   2 +-
> >  opcodes/i386-dis-evex-prefix.h       |  58 ++++++++++
> >  opcodes/i386-dis-evex-x86-64.h       |  50 +++++++++
> >  opcodes/i386-dis-evex.h              |  94 ++++++++--------
> >  opcodes/i386-dis.c                   | 160 +++++++++++++++++++++++----
> >  opcodes/i386-gen.c                   |   2 +
> >  opcodes/i386-opc.h                   |   6 +
> >  opcodes/i386-opc.tbl                 |  90 ++++++++++-----
> >  10 files changed, 433 insertions(+), 103 deletions(-)
> >  create mode 100644 opcodes/i386-dis-evex-x86-64.h
> >
> > diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> > index bb302f28add..7e62d08e9bd 100644
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -435,6 +435,9 @@ struct _i386_insn
> >      /* Prefer the REX2 prefix in encoding.  */
> >      bool rex2_encoding;
> >
> > +    /* Need to use an Egpr capable encoding (REX2 or EVEX).  */
> > +    bool has_egpr;
> > +
> >      /* Disable instruction size optimization.  */
> >      bool no_optimize;
> >
> > @@ -3676,12 +3679,12 @@ install_template (const insn_template *t)
> >
> >    /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
> >    if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> > -  {
> > +    {
> >        if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> >  	   || maybe_cpu (t, CpuFMA))
> >  	  && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
> >  	{
> > -	  if (need_evex_encoding ())
> > +	  if (need_evex_encoding () || i.has_egpr)
> >  	    {
> >  	      i.tm.opcode_modifier.vex = 0;
> >  	      i.tm.cpu.bitfield.cpuavx512f = i.tm.cpu_any.bitfield.cpuavx512f;
> > @@ -3698,7 +3701,19 @@ install_template (const insn_template *t)
> >  		gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa);
> >  	    }
> >  	}
> > -  }
> > +
> > +      if ((maybe_cpu (t, CpuCMPCCXADD) || maybe_cpu (t, CpuAMX_TILE)
> > +	   || maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512DQ)
> > +	   || maybe_cpu (t, CpuAVX512BW) || maybe_cpu (t, CpuBMI)
> > +	   || maybe_cpu (t, CpuBMI2))
> > +	  && maybe_cpu (t, CpuAPX_F))
> > +	{
> > +	  if (need_evex_encoding () || i.has_egpr)
> > +	    i.tm.opcode_modifier.vex = 0;
> > +	  else
> > +	    i.tm.opcode_modifier.evex = 0;
> > +	}
> > +    }
> >
> >    /* Note that for pseudo prefixes this produces a length of 1. But for them
> >       the length isn't interesting at all.  */
> > @@ -3879,6 +3894,15 @@ is_any_vex_encoding (const insn_template *t)
> >    return t->opcode_modifier.vex || t->opcode_modifier.evex;
> >  }
> >
> > +/* We can use this function only when the current encoding is evex.  */
> > +static INLINE bool
> > +is_apx_evex_encoding (void)
> > +{
> > +  return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
> > +    || (i.vex.register_specifier
> > +	&& (i.vex.register_specifier->reg_flags & RegRex2));
> > +}
> > +
> >  static INLINE bool
> >  is_apx_rex2_encoding (void)
> >  {
> > @@ -4156,6 +4180,27 @@ build_rex2_prefix (void)
> >  		    | (i.rex2 << 4) | i.rex);
> >  }
> >
> > +/* Build the EVEX prefix (4-byte) for evex insn
> > +   | 62h |
> > +   | `R`X`B`R' | B'mmm |
> > +   | W | v`v`v`v | `x' | pp |
> > +   | z| L'L | b | `v | aaa |
> > +*/
> > +static void
> > +build_apx_evex_prefix (void)
> > +{
> > +  build_evex_prefix ();
> > +  if (i.rex2 & REX_R)
> > +    i.vex.bytes[1] &= ~0x10;
> > +  if (i.rex2 & REX_B)
> > +    i.vex.bytes[1] |= 0x08;
> > +  if (i.rex2 & REX_X)
> > +    i.vex.bytes[2] &= ~0x04;
> > +  if (i.vex.register_specifier
> > +      && i.vex.register_specifier->reg_flags & RegRex2)
> > +    i.vex.bytes[3] &= ~0x08;
> > +}
> > +
> >  static void establish_rex (void)
> >  {
> >    /* Note that legacy encodings have at most 2 non-immediate operands.  */
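
A note for readers of the archive on the build_apx_evex_prefix hunk above: the function first builds a regular EVEX prefix and then adjusts four bits for the extended registers. Below is a minimal standalone sketch of just those adjustments, as I read the hunk. The function name, the SK_* constants (stand-ins for REX_B / REX_X / REX_R) and the vvvv_is_egpr flag are made up for illustration; this is not the gas code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the REX-style bit names used in the patch.  */
enum { SK_REX_B = 1, SK_REX_X = 2, SK_REX_R = 4 };

/* Adjust an already-built 4-byte EVEX prefix (prefix[0] is 0x62) for
   extended GPRs.  rex2 carries the "register number >= 16" bits, and
   vvvv_is_egpr says whether the vvvv operand is one of r16-r31.  */
static void
sketch_apply_apx_bits (uint8_t prefix[4], unsigned rex2, bool vvvv_is_egpr)
{
  if (rex2 & SK_REX_R)
    prefix[1] &= (uint8_t) ~0x10;	/* R4 is stored inverted.  */
  if (rex2 & SK_REX_B)
    prefix[1] |= 0x08;			/* B4 is stored as-is.  */
  if (rex2 & SK_REX_X)
    prefix[2] &= (uint8_t) ~0x04;	/* X4 is stored inverted.  */
  if (vvvv_is_egpr)
    prefix[3] &= (uint8_t) ~0x08;	/* V4 is stored inverted.  */
}
```

Note the asymmetry: three of the four bits are cleared to signal an extended register, while B4 is set. The sketch assumes the base prefix already has the inverted bits at 1 and B4 at 0, which is what the "&= ~" / "|=" pairing in the hunk suggests.
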
> > @@ -5723,13 +5768,18 @@ md_assemble (char *line)
> >  	  return;
> >  	}
> >
> > -      if (i.tm.opcode_modifier.vex)
> > +      if (is_apx_evex_encoding ())
> > +	build_apx_evex_prefix ();
> > +      else if (i.tm.opcode_modifier.vex)
> >  	build_vex_prefix (t);
> >        else
> >  	build_evex_prefix ();
> >
> >        /* The individual REX.RXBW bits got consumed.  */
> >        i.rex &= REX_OPCODE;
> > +
> > +      /* The rex2 bits got consumed.  */
> > +      i.rex2 = 0;
> >      }
> >
> >    /* Handle conversion of 'int $3' --> special int3 insn.  */
> > @@ -8084,7 +8134,8 @@ process_suffix (void)
> >        if (i.suffix != QWORD_MNEM_SUFFIX
> >  	  && i.tm.opcode_modifier.mnemonicsize != IGNORESIZE
> >  	  && !i.tm.opcode_modifier.floatmf
> > -	  && !is_any_vex_encoding (&i.tm)
> > +	  && (!is_any_vex_encoding (&i.tm)
> > +	      || i.tm.opcode_space == SPACE_EVEXMAP4)
> >  	  && ((i.suffix == LONG_MNEM_SUFFIX) == (flag_code == CODE_16BIT)
> >  	      || (flag_code == CODE_64BIT
> >  		  && i.tm.opcode_modifier.jump == JUMP_BYTE)))
> > @@ -8094,7 +8145,14 @@ process_suffix (void)
> >  	  if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */
> >  	    prefix = ADDR_PREFIX_OPCODE;
> >
> > -	  if (!add_prefix (prefix))
> > +	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
> > +	     needs to be adjusted.  */
> > +	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
> > +	    {
> > +	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
> > +	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
> > +	    }
> > +	  else if (!add_prefix (prefix))
> >  	    return 0;
> >  	}
> >
> > @@ -14300,6 +14358,8 @@ static bool check_register (const reg_entry *r)
> >        if (!cpu_arch_flags.bitfield.cpuapx_f
> >  	  || flag_code != CODE_64BIT)
> >  	return false;
> > +
> > +      i.has_egpr = true;
> >      }
> >
> >    if (((r->reg_flags & (RegRex64 | RegRex)) || r->reg_type.bitfield.qword)
> > > diff --git a/gas/testsuite/gas/i386/x86-64-evex.d b/gas/testsuite/gas/i386/x86-64-evex.d
> > index 041747db892..5d974c312da 100644
> > --- a/gas/testsuite/gas/i386/x86-64-evex.d
> > +++ b/gas/testsuite/gas/i386/x86-64-evex.d
> > @@ -17,6 +17,6 @@ Disassembly of section .text:
> > >   +[a-f0-9]+:	62 f1 d6 38 7b f0    	vcvtusi2ss %rax,\{rd-sae\},%xmm5,%xmm6
> > >   +[a-f0-9]+:	62 f1 57 38 7b f0    	vcvtusi2sd %eax,\{rd-bad\},%xmm5,%xmm6
> > >   +[a-f0-9]+:	62 f1 d7 38 7b f0    	vcvtusi2sd %rax,\{rd-sae\},%xmm5,%xmm6
> > - +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,\(bad\)
> > + +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,%r16d
> >   +[a-f0-9]+:	62 e1 7c 08 c2 c0 00 	vcmpeqps %xmm0,%xmm0,\(bad\)
> >  #pass
> > > diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
> > index 91c068d5b40..ffacc9c8e2b 100644
> > --- a/gas/testsuite/gas/i386/x86-64.exp
> > +++ b/gas/testsuite/gas/i386/x86-64.exp
> > @@ -250,7 +250,7 @@ run_dump_test "x86-64-sse-noavx"
> >  run_dump_test "x86-64-movbe"
> >  run_dump_test "x86-64-movbe-intel"
> >  run_dump_test "x86-64-movbe-suffix"
> > -run_list_test "x86-64-inval-movbe" "-al"
> > +run_list_test "x86-64-inval-movbe" "-march=+noapx_f -al"
> >  run_dump_test "x86-64-ept"
> >  run_dump_test "x86-64-ept-intel"
> >  run_list_test "x86-64-inval-ept" "-al"
> > diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
> > index 28da54922c7..54ed48c6952 100644
> > --- a/opcodes/i386-dis-evex-prefix.h
> > +++ b/opcodes/i386-dis-evex-prefix.h
> > @@ -338,6 +338,64 @@
> >      { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
> >      { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
> >    },
> > +  /* PREFIX_EVEX_MAP4_D8 */
> > +  {
> > +    { "sha1nexte", { XM, EXxmm }, 0 },
> > +    { REG_TABLE (REG_0F38D8_PREFIX_1) },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DA */
> > +  {
> > +    { "sha1msg2", { XM, EXxmm }, 0 },
> > +    { "encodekey128", { Gd, Rd }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DB */
> > +  {
> > +    { "sha256rnds2", { XM, EXxmm, XMM0 }, 0 },
> > +    { "encodekey256", { Gd, Rd }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DC */
> > +  {
> > +    { "sha256msg1", { XM, EXxmm }, 0 },
> > +    { "aesenc128kl", { XM, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DD */
> > +  {
> > +    { "sha256msg2", { XM, EXxmm }, 0 },
> > +    { "aesdec128kl", { XM, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DE */
> > +  {
> > +    { Bad_Opcode },
> > +    { "aesenc256kl", { XM, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_DF */
> > +  {
> > +    { Bad_Opcode },
> > +    { "aesdec256kl", { XM, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_F0 */
> > +  {
> > +    { "crc32A", { Gdq, Eb }, 0 },
> > +    { "invept", { Gm, Mo }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_F1 */
> > +  {
> > +    { "crc32Q", { Gdq, Ev }, 0 },
> > +    { "invvpid", { Gm, Mo }, 0 },
> > +    { "crc32Q", { Gdq, Ev }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_F2 */
> > +  {
> > +    { Bad_Opcode },
> > +    { "invpcid", { Gm, M }, 0 },
> > +  },
> > +  /* PREFIX_EVEX_MAP4_F8 */
> > +  {
> > +    { Bad_Opcode },
> > +    { "enqcmds", { Gva, M },  0 },
> > +    { "movdir64b", { Gva, M }, 0 },
> > +    { "enqcmd", { Gva, M }, 0 },
> > +  },
> >    /* PREFIX_EVEX_MAP5_10 */
> >    {
> >      { Bad_Opcode },
> > > diff --git a/opcodes/i386-dis-evex-x86-64.h b/opcodes/i386-dis-evex-x86-64.h
> > new file mode 100644
> > index 00000000000..0d9d98a7691
> > --- /dev/null
> > +++ b/opcodes/i386-dis-evex-x86-64.h
> > @@ -0,0 +1,50 @@
> > +  /* X86_64_EVEX_0F90 */
> > +  {
> > +    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F90_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F91 */
> > +  {
> > +    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F91_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F92 */
> > +  {
> > +    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F92_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F93 */
> > +  {
> > +    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F93_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F2 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F3 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F5 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F5_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F6 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE(PREFIX_VEX_0F38F6_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F38F7 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE(PREFIX_VEX_0F38F7_L_0) },
> > +  },
> > +  /* X86_64_EVEX_0F3AF0 */
> > +  {
> > +    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) },
> > +  },
> > diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
> > index 7ad1edbe72d..90c063b2188 100644
> > --- a/opcodes/i386-dis-evex.h
> > +++ b/opcodes/i386-dis-evex.h
> > @@ -164,10 +164,10 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* 90 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F90) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F91) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F92) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F93) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -375,9 +375,9 @@ static const struct dis386 evex_table[][256] = {
> >      { "vpsllv%DQ",	{ XM, Vex, EXx }, PREFIX_DATA },
> >      /* 48 */
> >      { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F3849) },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F384B) },
> >      { "vrcp14p%XW",	{ XM, EXx }, PREFIX_DATA },
> >      { "vrcp14s%XW",	{ XMScalar, VexScalar, EXdq }, PREFIX_DATA },
> >      { "vrsqrt14p%XW",	{ XM, EXx }, 0 },
> > @@ -545,32 +545,32 @@ static const struct dis386 evex_table[][256] = {
> >      { "%XEvaesdecY",	{ XM, Vex, EXx }, PREFIX_DATA },
> >      { "%XEvaesdeclastY", { XM, Vex, EXx }, PREFIX_DATA },
> >      /* E0 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E0) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E1) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E2) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E3) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E4) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E5) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E6) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E7) },
> >      /* E8 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E8) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E9) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EA) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EB) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EC) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38ED) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EE) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EF) },
> >      /* F0 */
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F2) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F3) },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F5) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F6) },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F7) },
> >      /* F8 */
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -854,7 +854,7 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* F0 */
> > -    { Bad_Opcode },
> > +    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F3AF0) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -983,13 +983,13 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* 60 */
> > +    { "movbeS",	{ Gv, Ev }, PREFIX_NP_OR_DATA },
> > +    { "movbeS",	{ Ev, Gv }, PREFIX_NP_OR_DATA },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { "wrussK",	{ M, Gdq }, PREFIX_DATA },
> > +    { PREFIX_TABLE (PREFIX_0F38F6) },
> >      { Bad_Opcode },
> >      /* 68 */
> >      { Bad_Opcode },
> > @@ -1113,19 +1113,19 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { "sha1rnds4",	{ XM, EXxmm, Ib }, NO_PREFIX },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* D8 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_D8) },
> > +    { "sha1msg1",	{ XM, EXxmm }, NO_PREFIX },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DA) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DB) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DC) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DD) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DE) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DF) },
> >      /* E0 */
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -1145,20 +1145,20 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* F0 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F0) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F1) },
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F2) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* F8 */
> > +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
> > +    { "movdiri",	{ Mdq, Gdq }, NO_PREFIX },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { PREFIX_TABLE (PREFIX_0F38FC) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> > index e006d869258..d4d32befcf9 100644
> > --- a/opcodes/i386-dis.c
> > +++ b/opcodes/i386-dis.c
> > @@ -132,6 +132,13 @@ enum x86_64_isa
> >    intel64
> >  };
> >
> > +enum evex_type
> > +{
> > +  evex_default = 0,
> > +  evex_from_legacy,
> > +  evex_from_vex,
> > +};
> > +
> >  struct instr_info
> >  {
> >    enum address_mode address_mode;
> > @@ -212,7 +219,6 @@ struct instr_info
> >      int ll;
> >      bool w;
> >      bool evex;
> > -    bool r;
> >      bool v;
> >      bool zeroing;
> >      bool b;
> > @@ -220,6 +226,8 @@ struct instr_info
> >    }
> >    vex;
> >
> > +  enum evex_type evex_type;
> > +
> >    /* Remember if the current op is a jump instruction.  */
> >    bool op_is_jump;
> >
> > @@ -303,6 +311,8 @@ struct dis_private {
> >  #define PREFIX_ADDR 0x400
> >  #define PREFIX_FWAIT 0x800
> >  #define PREFIX_REX2 0x1000
> > +#define PREFIX_NP_OR_DATA 0x2000
> > +#define NO_PREFIX   0x4000
> >
> >  /* Make sure that bytes from INFO->PRIVATE_DATA->BUFFER (inclusive)
> >     to ADDR (exclusive) are valid.  Returns true for success, false
> > @@ -800,6 +810,7 @@ enum
> >    USE_RM_TABLE,
> >    USE_PREFIX_TABLE,
> >    USE_X86_64_TABLE,
> > +  USE_X86_64_EVEX_FROM_VEX_TABLE,
> >    USE_3BYTE_TABLE,
> >    USE_XOP_8F_TABLE,
> >    USE_VEX_C4_TABLE,
> > @@ -818,6 +829,8 @@ enum
> >  #define RM_TABLE(I)		DIS386 (USE_RM_TABLE, (I))
> >  #define PREFIX_TABLE(I)		DIS386 (USE_PREFIX_TABLE, (I))
> >  #define X86_64_TABLE(I)		DIS386 (USE_X86_64_TABLE, (I))
> > +#define X86_64_EVEX_FROM_VEX_TABLE(I) \
> > +  DIS386 (USE_X86_64_EVEX_FROM_VEX_TABLE, (I))
> >  #define THREE_BYTE_TABLE(I)	DIS386 (USE_3BYTE_TABLE, (I))
> >  #define XOP_8F_TABLE()		DIS386 (USE_XOP_8F_TABLE, 0)
> >  #define VEX_C4_TABLE()		DIS386 (USE_VEX_C4_TABLE, 0)
> > @@ -866,7 +879,7 @@ enum
> >    REG_VEX_0F73,
> >    REG_VEX_0FAE,
> >    REG_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0,
> > -  REG_VEX_0F38F3_L_0,
> > +  REG_VEX_0F38F3_L_0_P_0,
> >    REG_VEX_MAP7_F8_L_0_W_0,
> >
> >    REG_XOP_09_01_L_0,
> > @@ -878,7 +891,7 @@ enum
> >    REG_EVEX_0F72,
> >    REG_EVEX_0F73,
> >    REG_EVEX_0F38C6_L_2,
> > -  REG_EVEX_0F38C7_L_2
> > +  REG_EVEX_0F38C7_L_2,
> >  };
> >
> >  enum
> > @@ -1094,6 +1107,8 @@ enum
> >    PREFIX_VEX_0F38CC,
> >    PREFIX_VEX_0F38CD,
> >    PREFIX_VEX_0F38DA_W_0,
> > +  PREFIX_VEX_0F38F2_L_0,
> > +  PREFIX_VEX_0F38F3_L_0,
> >    PREFIX_VEX_0F38F5_L_0,
> >    PREFIX_VEX_0F38F6_L_0,
> >    PREFIX_VEX_0F38F7_L_0,
> > @@ -1156,6 +1171,18 @@ enum
> >    PREFIX_EVEX_0F3A67,
> >    PREFIX_EVEX_0F3AC2,
> >
> > +  PREFIX_EVEX_MAP4_D8,
> > +  PREFIX_EVEX_MAP4_DA,
> > +  PREFIX_EVEX_MAP4_DB,
> > +  PREFIX_EVEX_MAP4_DC,
> > +  PREFIX_EVEX_MAP4_DD,
> > +  PREFIX_EVEX_MAP4_DE,
> > +  PREFIX_EVEX_MAP4_DF,
> > +  PREFIX_EVEX_MAP4_F0,
> > +  PREFIX_EVEX_MAP4_F1,
> > +  PREFIX_EVEX_MAP4_F2,
> > +  PREFIX_EVEX_MAP4_F8,
> > +
> >    PREFIX_EVEX_MAP5_10,
> >    PREFIX_EVEX_MAP5_11,
> >    PREFIX_EVEX_MAP5_1D,
> > @@ -1267,7 +1294,19 @@ enum
> >    X86_64_VEX_0F38ED,
> >    X86_64_VEX_0F38EE,
> >    X86_64_VEX_0F38EF,
> > +
> >    X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
> > +
> > +  X86_64_EVEX_0F90,
> > +  X86_64_EVEX_0F91,
> > +  X86_64_EVEX_0F92,
> > +  X86_64_EVEX_0F93,
> > +  X86_64_EVEX_0F38F2,
> > +  X86_64_EVEX_0F38F3,
> > +  X86_64_EVEX_0F38F5,
> > +  X86_64_EVEX_0F38F6,
> > +  X86_64_EVEX_0F38F7,
> > +  X86_64_EVEX_0F3AF0,
> >  };
> >
> >  enum
> > @@ -2882,12 +2921,12 @@ static const struct dis386 reg_table[][8] = {
> >    {
> >      { RM_TABLE (RM_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0_R_0) },
> >    },
> > -  /* REG_VEX_0F38F3_L_0 */
> > +  /* REG_VEX_0F38F3_L_0_P_0 */
> >    {
> >      { Bad_Opcode },
> > -    { "blsrS",		{ VexGdq, Edq }, PREFIX_OPCODE },
> > -    { "blsmskS",	{ VexGdq, Edq }, PREFIX_OPCODE },
> > -    { "blsiS",		{ VexGdq, Edq }, PREFIX_OPCODE },
> > +    { "blsrS",		{ VexGdq, Edq }, 0 },
> > +    { "blsmskS",	{ VexGdq, Edq }, 0 },
> > +    { "blsiS",		{ VexGdq, Edq }, 0 },
> >    },
> >    /* REG_VEX_MAP7_F8_L_0_W_0 */
> >    {
> > @@ -4035,6 +4074,16 @@ static const struct dis386 prefix_table[][4] = {
> >      { "vsm4rnds4", { XM, Vex, EXx }, 0 },
> >    },
> >
> > +  /* PREFIX_VEX_0F38F2_L_0 */
> > +  {
> > +    { "andnS",          { Gdq, VexGdq, Edq }, 0 },
> > +  },
> > +
> > +  /* PREFIX_VEX_0F38F3_L_0 */
> > +  {
> > +    { REG_TABLE (REG_VEX_0F38F3_L_0_P_0) },
> > +  },
> > +
> >    /* PREFIX_VEX_0F38F5_L_0 */
> >    {
> >      { "bzhiS",		{ Gdq, Edq, VexGdq }, 0 },
> > @@ -4527,6 +4576,7 @@ static const struct dis386 x86_64_table[][2] = {
> >      { PREFIX_TABLE (PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64) },
> >    },
> >
> > +#include "i386-dis-evex-x86-64.h"
> >  };
> >
> >  static const struct dis386 three_byte_table[][256] = {
> > @@ -7113,12 +7163,12 @@ static const struct dis386 vex_len_table[][2] =
> {
> >
> >    /* VEX_LEN_0F38F2 */
> >    {
> > -    { "andnS",		{ Gdq, VexGdq, Edq }, PREFIX_OPCODE },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
> >    },
> >
> >    /* VEX_LEN_0F38F3 */
> >    {
> > -    { REG_TABLE(REG_VEX_0F38F3_L_0) },
> > +    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
> >    },
> >
> >    /* VEX_LEN_0F38F5 */
> > @@ -8732,6 +8782,17 @@ get_valid_dis386 (const struct dis386 *dp,
> instr_info *ins)
> >        dp = &prefix_table[dp->op[1].bytemode][vindex];
> >        break;
> >
> > +    case USE_X86_64_EVEX_FROM_VEX_TABLE:
> > +      ins->evex_type = evex_from_vex;
> > > +      /* EVEX from VEX instructions require that EVEX.z, EVEX.L’L, EVEX.b and
> > > +	 the lower 2 bits of EVEX.aaa must be 0.  */
> > +      if ((ins->vex.mask_register_specifier & 0x3) != 0
> > +	  || ins->vex.ll != 0
> > +	  || ins->vex.zeroing != 0
> > +	  || ins->vex.b)
> > +	return &bad_opcode;
> > +
> > +      /* Fall through.  */
> >      case USE_X86_64_TABLE:
> >        vindex = ins->address_mode == mode_64bit ? 1 : 0;
> >        dp = &x86_64_table[dp->op[1].bytemode][vindex];
> > @@ -8977,9 +9038,13 @@ get_valid_dis386 (const struct dis386 *dp,
> instr_info *ins)
> >        if (!fetch_code (ins->info, ins->codep + 4))
> >  	return &err_opcode;
> >        /* The first byte after 0x62.  */
> > +      if (*ins->codep & 0x8)
> > +	ins->rex2 |= REX_B;
> > +      if (!(*ins->codep & 0x10))
> > +	ins->rex2 |= REX_R;
> > +
> >        ins->rex = ~(*ins->codep >> 5) & 0x7;
> > -      ins->vex.r = *ins->codep & 0x10;
> > -      switch ((*ins->codep & 0xf))
> > +      switch (*ins->codep & 0x7)
> >  	{
> >  	default:
> >  	  return &bad_opcode;
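
A note on the hunk above: with this change the disassembler takes the map number from the low three bits of the first byte after 0x62 and records bits 3 and 4 as B4 and R4 in ins->rex2. Below is a small sketch of that split, written from my reading of the hunk. The struct, the function name and the SK_* constants are hypothetical names for illustration only.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the REX-style bit names used in the patch.  */
enum { SK_REX_B = 1, SK_REX_X = 2, SK_REX_R = 4 };

struct sketch_p0
{
  unsigned rex;		/* R3/X3/B3, un-inverted.  */
  unsigned rex2;	/* R4/B4 collected from this byte.  */
  unsigned map;		/* Opcode map number (low three bits).  */
};

/* Split the first EVEX payload byte the way the hunk above does.  */
static struct sketch_p0
sketch_decode_p0 (uint8_t byte)
{
  struct sketch_p0 out = { 0, 0, 0 };

  if (byte & 0x08)
    out.rex2 |= SK_REX_B;	/* B4 is stored as-is.  */
  if (!(byte & 0x10))
    out.rex2 |= SK_REX_R;	/* R4 is stored inverted.  */
  out.rex = ~(unsigned) (byte >> 5) & 0x7;
  out.map = byte & 0x7;
  return out;
}
```

For the 62 e1 7e 08 2d c0 line in x86-64-evex.d, the byte 0xe1 gives map 1 with R4 set, which is consistent with that test now expecting %r16d where it used to expect (bad).
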
> > @@ -8992,6 +9057,12 @@ get_valid_dis386 (const struct dis386 *dp,
> instr_info *ins)
> >  	case 0x3:
> >  	  vex_table_index = EVEX_0F3A;
> >  	  break;
> > +	case 0x4:
> > +	  vex_table_index = EVEX_MAP4;
> > +	  ins->evex_type = evex_from_legacy;
> > +	  if (ins->address_mode != mode_64bit)
> > +	    return &bad_opcode;
> > +	  break;
> >  	case 0x5:
> >  	  vex_table_index = EVEX_MAP5;
> >  	  break;
> > @@ -9008,9 +9079,8 @@ get_valid_dis386 (const struct dis386 *dp,
> instr_info *ins)
> >
> >        ins->vex.register_specifier = (~(*ins->codep >> 3)) & 0xf;
> >
> > -      /* The U bit.  */
> >        if (!(*ins->codep & 0x4))
> > -	return &bad_opcode;
> > +	ins->rex2 |= REX_X;
> >
> >        switch ((*ins->codep & 0x3))
> >  	{
> > @@ -9040,12 +9110,26 @@ get_valid_dis386 (const struct dis386 *dp,
> instr_info *ins)
> >
> >        if (ins->address_mode != mode_64bit)
> >  	{
> > > +	  /* Report bad for !evex_default and when two fixed values of evex
> > > +	     change.  */
> > +	  if (ins->evex_type != evex_default
> > +	      || (ins->rex2 & (REX_B | REX_X)))
> > +	    return &bad_opcode;
> >  	  /* In 16/32-bit mode silently ignore following bits.  */
> >  	  ins->rex &= ~REX_B;
> > -	  ins->vex.r = true;
> > +	  ins->rex2 &= ~REX_R;
> >  	}
> >
> >        ins->need_vex = 4;
> > +
> > +      /* EVEX from legacy instructions require that EVEX.z, EVEX.L’L and the
> > +	 lower 2 bits of EVEX.aaa must be 0.  */
> > +      if (ins->evex_type == evex_from_legacy
> > +	  && ((ins->vex.mask_register_specifier & 0x3) != 0
> > +	      || ins->vex.ll != 0
> > +	      || ins->vex.zeroing != 0))
> > +	return &bad_opcode;
> > +
> >        ins->codep++;
> >        vindex = *ins->codep++;
> >        dp = &evex_table[vex_table_index][vindex];
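
A note on the two validity checks added in get_valid_dis386: promoted-from-legacy insns must have EVEX.z, EVEX.L'L and the low two bits of EVEX.aaa clear, and promoted-from-VEX insns additionally need EVEX.b clear. The sketch below folds both checks into one predicate for illustration. In the patch they sit in two different places, and every name below is hypothetical.

```c
#include <stdbool.h>

enum sketch_evex_type
{
  SK_EVEX_DEFAULT,
  SK_EVEX_FROM_LEGACY,
  SK_EVEX_FROM_VEX
};

struct sketch_evex_fields
{
  unsigned aaa;	/* Mask register specifier.  */
  unsigned ll;	/* Vector length bits L'L.  */
  bool z;	/* Zeroing.  */
  bool b;	/* Broadcast / rounding control.  */
};

/* Return true if the fixed-value fields are acceptable for the given
   kind of promoted instruction.  Regular EVEX insns are not modelled
   here and always pass.  */
static bool
sketch_promoted_fields_ok (enum sketch_evex_type type,
			   const struct sketch_evex_fields *f)
{
  if (type == SK_EVEX_DEFAULT)
    return true;
  if ((f->aaa & 0x3) != 0 || f->ll != 0 || f->z)
    return false;
  if (type == SK_EVEX_FROM_VEX && f->b)
    return false;
  return true;
}
```

Only the low two bits of aaa are tested, matching the "& 0x3" in the patch; the sketch makes no claim about what the remaining bit is used for.
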
> > @@ -9460,6 +9544,13 @@ print_insn (bfd_vma pc, disassemble_info *info,
> int intel_syntax)
> >        dp = get_valid_dis386 (dp, &ins);
> >        if (dp == &err_opcode)
> >  	goto fetch_error_out;
> > +
> > > +      /* For APX instructions promoted from legacy maps 0/1, embedded prefix
> > > +	 is interpreted as the operand size override.  */
> > +      if (ins.evex_type == evex_from_legacy
> > +	  && ins.vex.prefix == DATA_PREFIX_OPCODE)
> > +	sizeflag ^= DFLAG;
> > +
> >        if (dp != NULL && putop (&ins, dp->name, sizeflag) == 0)
> >  	{
> >  	  if (!get_sib (&ins, sizeflag))
> > @@ -9639,6 +9730,25 @@ print_insn (bfd_vma pc, disassemble_info *info,
> int intel_syntax)
> >        if (ins.last_repnz_prefix >= 0)
> >  	ins.all_prefixes[ins.last_repnz_prefix] = 0xf2;
> >        break;
> > +
> > +    case PREFIX_NP_OR_DATA:
> > +      if (ins.vex.prefix == REPE_PREFIX_OPCODE
> > +	  || ins.vex.prefix == REPNE_PREFIX_OPCODE)
> > +	{
> > +	  i386_dis_printf (info, dis_style_text, "(bad)");
> > +	  ret = ins.end_codep - priv.the_buffer;
> > +	  goto out;
> > +	}
> > +      break;
> > +
> > +    case NO_PREFIX:
> > +      if (ins.vex.prefix)
> > +	{
> > +	  i386_dis_printf (info, dis_style_text, "(bad)");
> > +	  ret = ins.end_codep - priv.the_buffer;
> > +	  goto out;
> > +	}
> > +      break;
> >      }
> >
> >    /* Check if the REX prefix is used.  */
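
A note on the two new prefix classes handled in print_insn above: PREFIX_NP_OR_DATA rejects an embedded F3/F2 prefix, and NO_PREFIX rejects any embedded prefix at all. A minimal sketch of those two rules follows, assuming (as the hunk suggests) that the embedded prefix is held as its opcode byte, with 0 meaning none. The enum and function names are invented for the sketch.

```c
#include <stdbool.h>

/* Prefix opcode bytes; 0 means no embedded prefix.  */
enum { SK_DATA = 0x66, SK_REPNE = 0xf2, SK_REPE = 0xf3 };

enum sketch_prefix_rule
{
  SK_NP_OR_DATA,	/* Only "no prefix" or 0x66 is acceptable.  */
  SK_NO_PREFIX		/* No embedded prefix is acceptable.  */
};

/* Return true if the embedded prefix is allowed under the given rule;
   false corresponds to the "(bad)" paths in the hunk.  */
static bool
sketch_prefix_ok (enum sketch_prefix_rule rule, unsigned embedded_prefix)
{
  switch (rule)
    {
    case SK_NP_OR_DATA:
      return embedded_prefix != SK_REPE && embedded_prefix != SK_REPNE;
    case SK_NO_PREFIX:
      return embedded_prefix == 0;
    }
  return false;
}
```
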
> > @@ -10348,7 +10458,7 @@ putop (instr_info *ins, const char
> *in_template, int sizeflag)
> >  		{
> >  		case 'X':
> >  		  if (!ins->vex.evex || ins->vex.b || ins->vex.ll >= 2
> > -		      || !ins->vex.r
> > +		      || (ins->rex2 & REX_R)
> >  		      || (ins->modrm.mod == 3 && (ins->rex & REX_X))
> >  		      || !ins->vex.v || ins->vex.mask_register_specifier)
> >  		    break;
> > @@ -11459,7 +11569,7 @@ OP_E_memory (instr_info *ins, int bytemode,
> int sizeflag)
> >
> >    add += (ins->rex2 & REX_B) ? 16 : 0;
> >
> > -  if (ins->vex.evex)
> > +  if (ins->vex.evex && ins->evex_type == evex_default)
> >      {
> >
> >        /* Zeroing-masking is invalid for memory destinations. Set the flag
> > @@ -11603,6 +11713,13 @@ OP_E_memory (instr_info *ins, int
> bytemode, int sizeflag)
> >  		abort ();
> >  	      if (ins->vex.evex)
> >  		{
> > +		  /* S/G EVEX insns require EVEX.X4 not to be set.  */
> > +		  if (ins->rex2 & REX_X)
> > +		    {
> > +		      oappend (ins, "(bad)");
> > +		      return true;
> > +		    }
> > +
> >  		  if (!ins->vex.v)
> >  		    vindex += 16;
> >  		  check_gather = ins->obufp == ins->op_out[1];
> > @@ -11805,7 +11922,7 @@ OP_E_memory (instr_info *ins, int bytemode,
> int sizeflag)
> >
> >  	      if (ins->rex & REX_R)
> >  	        modrm_reg += 8;
> > -	      if (!ins->vex.r)
> > +	      if (ins->rex2 & REX_R)
> >  	        modrm_reg += 16;
> >  	      if (vindex == modrm_reg)
> >  		oappend (ins, "/(bad)");
> > @@ -12011,10 +12128,7 @@ OP_indirE (instr_info *ins, int bytemode, int
> sizeflag)
> >  static bool
> >  OP_G (instr_info *ins, int bytemode, int sizeflag)
> >  {
> > -  if (ins->vex.evex && !ins->vex.r && ins->address_mode == mode_64bit)
> > -    oappend (ins, "(bad)");
> > -  else
> > -    print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
> > +  print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
> >    return true;
> >  }
> >
> > @@ -12645,7 +12759,7 @@ OP_XMM (instr_info *ins, int bytemode, int
> sizeflag ATTRIBUTE_UNUSED)
> >      reg += 8;
> >    if (ins->vex.evex)
> >      {
> > -      if (!ins->vex.r)
> > +      if (ins->rex2 & REX_R)
> >  	reg += 16;
> >      }
> >
> > @@ -13652,7 +13766,7 @@ DistinctDest_Fixup (instr_info *ins, int
> bytemode, int sizeflag)
> >    /* Calc destination register number.  */
> >    if (ins->rex & REX_R)
> >      modrm_reg += 8;
> > -  if (!ins->vex.r)
> > +  if (ins->rex2 & REX_R)
> >      modrm_reg += 16;
> >
> >    /* Calc src1 register number.  */
> > diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> > index dd4850e1855..508b441a343 100644
> > --- a/opcodes/i386-gen.c
> > +++ b/opcodes/i386-gen.c
> > @@ -487,6 +487,7 @@ static bitfield opcode_modifiers[] =
> >    BITFIELD (Dialect),
> >    BITFIELD (ISA64),
> >    BITFIELD (NoEgpr),
> > +  BITFIELD (NF),
> >  };
> >
> >  #define CLASS(n) #n, n
> > @@ -1120,6 +1121,7 @@ process_i386_opcode_modifier (FILE *table, char
> *mod, unsigned int space,
> >      SPACE(0F),
> >      SPACE(0F38),
> >      SPACE(0F3A),
> > +    SPACE(EVEXMAP4),
> >      SPACE(EVEXMAP5),
> >      SPACE(EVEXMAP6),
> >      SPACE(VEXMAP7),
> > diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> > index 8c967ea90b0..064ec48edad 100644
> > --- a/opcodes/i386-opc.h
> > +++ b/opcodes/i386-opc.h
> > @@ -743,6 +743,9 @@ enum
> >       whether the instruction supports pseudo-prefix {rex2}.  */
> >    NoEgpr,
> >
> > +  /* No CSPAZO flags update indication.  */
> > +  NF,
> > +
> >    /* The last bitfield in i386_opcode_modifier.  */
> >    Opcode_Modifier_Num
> >  };
> > @@ -788,6 +791,7 @@ typedef struct i386_opcode_modifier
> >    unsigned int dialect:2;
> >    unsigned int isa64:2;
> >    unsigned int noegpr:1;
> > +  unsigned int nf:1;
> >  } i386_opcode_modifier;
> >
> >  /* Operand classes.  */
> > @@ -963,6 +967,7 @@ typedef struct insn_template
> >       1: 0F opcode prefix / space.
> >       2: 0F38 opcode prefix / space.
> >       3: 0F3A opcode prefix / space.
> > +     4: EVEXMAP4 opcode prefix / space.
> >       5: EVEXMAP5 opcode prefix / space.
> >       6: EVEXMAP6 opcode prefix / space.
> >       7: VEXMAP7 opcode prefix / space.
> > @@ -974,6 +979,7 @@ typedef struct insn_template
> >  #define SPACE_0F	1
> >  #define SPACE_0F38	2
> >  #define SPACE_0F3A	3
> > +#define SPACE_EVEXMAP4	4
> >  #define SPACE_EVEXMAP5	5
> >  #define SPACE_EVEXMAP6	6
> >  #define SPACE_VEXMAP7	7
> > diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> > index 37d3e8663bb..11b8c0b63cb 100644
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -113,6 +113,7 @@
> >  #define SpaceXOP09 OpcodeSpace=SPACE_XOP09
> >  #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
> >
> > +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
> >  #define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
> >  #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
> >
> > @@ -139,6 +140,9 @@
> >
> >  #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
> >
> > +// The template supports VEX format for cpuid and EVEX format for cpuid &
> apx_f.
> > +#define APX_F(cpuid) cpuid&(cpuid|APX_F)
> > +
> >  // The EVEX purpose of StaticRounding appears only together with SAE. Re-
> use
> >  // the bit to mark commutative VEX encodings where swapping the source
> >  // operands may allow to switch from 3-byte to 2-byte VEX encoding.
> > @@ -194,6 +198,7 @@ mov, 0xf24, i386&No64,
> D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Te
> >
> >  // Move after swapping the bytes
> >  movbe, 0x0f38f0, Movbe,
> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf,
> { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > +movbe, 0x60, Movbe&APX_F,
> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4,
> { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> >
> >  // Move with sign extend.
> >  movsb, 0xfbe, i386, Modrm|No_bSuf|No_sSuf,
> { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > @@ -1315,13 +1320,16 @@ getsec, 0xf37, SMX, NoSuf, {}
> >
> >  invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf,
> { Oword|Unspecified|BaseIndex, Reg32 }
> >  invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64,
> { Oword|Unspecified|BaseIndex, Reg64 }
> > +invept, 0xf3f0, EPT&APX_F, Modrm|NoSuf|EVexMap4,
> { Oword|Unspecified|BaseIndex, Reg64 }
> >  invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf,
> { Oword|Unspecified|BaseIndex, Reg32 }
> >  invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64,
> { Oword|Unspecified|BaseIndex, Reg64 }
> > +invvpid, 0xf3f1, EPT&APX_F, Modrm|NoSuf|EVexMap4,
> { Oword|Unspecified|BaseIndex, Reg64 }
> >
> >  // INVPCID instruction
> >
> >  invpcid, 0x660f3882, INVPCID&No64, Modrm|IgnoreSize|NoSuf,
> { Oword|Unspecified|BaseIndex, Reg32 }
> >  invpcid, 0x660f3882, INVPCID&x64, Modrm|NoSuf|NoRex64,
> { Oword|Unspecified|BaseIndex, Reg64 }
> > +invpcid, 0xf3f2, INVPCID&APX_F, Modrm|NoSuf|EVexMap4,
> { Oword|Unspecified|BaseIndex, Reg64 }
> >
> >  // SSSE3 instructions.
> >
> > @@ -1422,6 +1430,8 @@ pcmpistri<sse42>, 0x660f3a63, <sse42:cpu>,
> Modrm|<sse42:attr>|NoSuf, { Imm8, Reg
> >  pcmpistrm<sse42>, 0x660f3a62, <sse42:cpu>,
> Modrm|<sse42:attr>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex,
> RegXMM }
> >  crc32, 0xf20f38f0, SSE4_2, W|Modrm|No_sSuf|No_qSuf,
> { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
> >  crc32, 0xf20f38f0, SSE4_2&x64, W|Modrm|No_wSuf|No_lSuf|No_sSuf,
> { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
> > +crc32, 0xf0, APX_F, W|Modrm|No_sSuf|No_qSuf|EVexMap4,
> { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
> > +crc32, 0xf0, APX_F, W|Modrm|No_wSuf|No_lSuf|No_sSuf|EVexMap4,
> { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
> >
> >  // xsave/xrstor New Instructions.
> >
> > @@ -1836,14 +1846,14 @@ xtest, 0xf01d6, HLE|RTM, NoSuf, {}
> >
> >  // BMI2 instructions.
> >
> > -bzhi, 0xf5, BMI2,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No
> _bSuf|No_wSuf|No_sSuf, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -mulx, 0xf2f6, BMI2,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wS
> uf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64,
> Reg32|Reg64 }
> > -pdep, 0xf2f5, BMI2,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wS
> uf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64,
> Reg32|Reg64 }
> > -pext, 0xf3f5, BMI2,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wS
> uf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64,
> Reg32|Reg64 }
> > -rorx, 0xf2f0, BMI2,
> Modrm|CheckOperandSize|Vex128|Space0F3A|No_bSuf|No_wSuf|No_sSu
> f, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -sarx, 0xf3f7, BMI2,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No
> _bSuf|No_wSuf|No_sSuf, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -shlx, 0x66f7, BMI2,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No
> _bSuf|No_wSuf|No_sSuf, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -shrx, 0xf2f7, BMI2,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No
> _bSuf|No_wSuf|No_sSuf, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +bzhi, 0xf5, APX_F(BMI2),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSo
> urces|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +mulx, 0xf2f6, APX_F(BMI2),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSu
> f|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64,
> Reg32|Reg64 }
> > +pdep, 0xf2f5, APX_F(BMI2),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSu
> f|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64,
> Reg32|Reg64 }
> > +pext, 0xf3f5, APX_F(BMI2),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSu
> f|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64,
> Reg32|Reg64 }
> > +rorx, 0xf2f0, APX_F(BMI2),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F3A|No_bSuf|No_wSu
> f|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex,
> Reg32|Reg64 }
> > +sarx, 0xf3f7, APX_F(BMI2),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSo
> urces|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +shlx, 0x66f7, APX_F(BMI2),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSo
> urces|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +shrx, 0xf2f7, APX_F(BMI2),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSo
> urces|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> >
> >  // FMA4 instructions
> >
> > @@ -1913,11 +1923,11 @@ lwpins, 0x12/0, LWP,
> Modrm|SpaceXOP0A|NoSuf|VexVVVV|Vex, { Imm32|Imm32S, Reg32|U
> >
> >  // BMI instructions
> >
> > -andn, 0xf2, BMI,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wS
> uf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64,
> Reg32|Reg64 }
> > -bextr, 0xf7, BMI,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No
> _bSuf|No_wSuf|No_sSuf, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -blsi, 0xf3/3, BMI,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wS
> uf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -blsmsk, 0xf3/2, BMI,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wS
> uf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > -blsr, 0xf3/1, BMI,
> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wS
> uf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +andn, 0xf2, APX_F(BMI),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSu
> f|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex,
> Reg32|Reg64, Reg32|Reg64 }
> > +bextr, 0xf7, APX_F(BMI),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSo
> urces|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64,
> Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > +blsi, 0xf3/3, APX_F(BMI),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSu
> f|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex,
> Reg32|Reg64 }
> > +blsmsk, 0xf3/2, APX_F(BMI),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSu
> f|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex,
> Reg32|Reg64 }
> > +blsr, 0xf3/1, APX_F(BMI),
> Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSu
> f|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex,
> Reg32|Reg64 }
> >  tzcnt, 0xf30fbc, BMI, Modrm|CheckOperandSize|No_bSuf|No_sSuf,
> { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> >
> >  // TBM instructions
> > @@ -2046,13 +2056,21 @@ bndldx, 0x0f1a, MPX,
> Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND }
> >
> >  // SHA instructions.
> >  sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S,
> RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S,
> RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1nexte, 0xf38c8, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1msg1, 0xf38c9, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1msg2, 0xf38ca, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword,
> RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256msg1, 0xf38cc, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha256msg1, 0xdc, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256msg2, 0xf38cd, SHA, Modrm|NoSuf,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha256msg2, 0xdd, SHA&APX_F, Modrm|NoSuf|EVexMap4,
> { RegXMM|Unspecified|BaseIndex, RegXMM }
> >
> >  // SHA512 instructions.
> >
> > @@ -2114,9 +2132,9 @@ kor<bw>, 0x<bw:kpfx>45, <bw:kcpu>,
> Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { R
> >  kxnor<bw>, 0x<bw:kpfx>46, <bw:kcpu>,
> Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask,
> RegMask }
> >  kxor<bw>, 0x<bw:kpfx>47, <bw:kcpu>,
> Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask,
> RegMask }
> >
> > -kmov<bw>, 0x<bw:kpfx>90, <bw:kcpu>,
> Modrm|Vex128|Space0F|VexW0|NoSuf,
> { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
> > -kmov<bw>, 0x<bw:kpfx>91, <bw:kcpu>,
> Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask,
> <bw:elem>|Unspecified|BaseIndex }
> > -kmov<bw>, 0x<bw:kpfx>92, <bw:kcpu>,
> D|Modrm|Vex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
> > +kmov<bw>, 0x<bw:kpfx>90, APX_F(<bw:kcpu>),
> Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf,
> { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
> > +kmov<bw>, 0x<bw:kpfx>91, APX_F(<bw:kcpu>),
> Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask,
> <bw:elem>|Unspecified|BaseIndex }
> > +kmov<bw>, 0x<bw:kpfx>92, APX_F(<bw:kcpu>),
> D|Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
> >
> >  knot<bw>, 0x<bw:kpfx>44, <bw:kcpu>,
> Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
> >  kortest<bw>, 0x<bw:kpfx>98, <bw:kcpu>,
> Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
> > @@ -2591,9 +2609,9 @@ vpmovzxdq, 0x6635, AVX512VL,
> Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift
> >  kadd<dq>, 0x<dq:kpfx>4a, AVX512BW,
> Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask,
> RegMask }
> >  kand<dq>, 0x<dq:kpfx>41, AVX512BW,
> Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask,
> RegMask }
> >  kandn<dq>, 0x<dq:kpfx>42, AVX512BW,
> Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf|Optimize, { RegMask,
> RegMask, RegMask }
> > -kmov<dq>, 0x<dq:kpfx>90, AVX512BW,
> Modrm|Vex128|Space0F|VexW1|NoSuf,
> { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
> > -kmov<dq>, 0x<dq:kpfx>91, AVX512BW,
> Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask,
> <dq:elem>|Unspecified|BaseIndex }
> > -kmov<dq>, 0xf292, AVX512BW,
> D|Modrm|Vex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
> > +kmov<dq>, 0x<dq:kpfx>90, APX_F(AVX512BW),
> Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf,
> { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
> > +kmov<dq>, 0x<dq:kpfx>91, APX_F(AVX512BW),
> Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask,
> <dq:elem>|Unspecified|BaseIndex }
> > +kmov<dq>, 0xf292, APX_F(AVX512BW),
> D|Modrm|Vex128|EVex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>,
> RegMask }
> >  knot<dq>, 0x<dq:kpfx>44, AVX512BW,
> Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
> >  kor<dq>, 0x<dq:kpfx>45, AVX512BW,
> Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask,
> RegMask }
> >  kortest<dq>, 0x<dq:kpfx>98, AVX512BW,
> Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
> > @@ -2992,9 +3010,13 @@ rdsspq, 0xf30f1e/1, SHSTK&x64,
> Modrm|NoSuf, { Reg64 }
> >  saveprevssp, 0xf30f01ea, SHSTK, NoSuf, {}
> >  rstorssp, 0xf30f01/5, SHSTK, Modrm|NoSuf,
> { Qword|Unspecified|BaseIndex }
> >  wrssd, 0x0f38f6, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32,
> Dword|Unspecified|BaseIndex }
> > +wrssd, 0x66, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4,
> { Reg32, Dword|Unspecified|BaseIndex }
> >  wrssq, 0x0f38f6, SHSTK&x64, Modrm|NoSuf|Size64, { Reg64,
> Qword|Unspecified|BaseIndex }
> > +wrssq, 0x66, SHSTK&APX_F, Modrm|NoSuf|Size64|EVexMap4, { Reg64,
> Qword|Unspecified|BaseIndex }
> >  wrussd, 0x660f38f5, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32,
> Dword|Unspecified|BaseIndex }
> > +wrussd, 0x6665, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4,
> { Reg32, Dword|Unspecified|BaseIndex }
> >  wrussq, 0x660f38f5, SHSTK&x64, Modrm|NoSuf, { Reg64,
> Qword|Unspecified|BaseIndex }
> > +wrussq, 0x6665, SHSTK&APX_F, Modrm|NoSuf|EVexMap4, { Reg64,
> Qword|Unspecified|BaseIndex }
> >  setssbsy, 0xf30f01e8, SHSTK, NoSuf, {}
> >  clrssbsy, 0xf30fae/6, SHSTK, Modrm|NoSuf,
> { Qword|Unspecified|BaseIndex }
> >  endbr64, 0xf30f1efa, IBT, NoSuf, {}
> > @@ -3042,7 +3064,9 @@ cldemote, 0x0f1c/0, CLDEMOTE,
> Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
> >  // MOVDIR[I,64B] instructions.
> >
> >  movdiri, 0xf38f9, MOVDIRI,
> Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +movdiri, 0xf9, MOVDIRI&APX_F,
> Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMa
> p4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> >  movdir64b, 0x660f38f8, MOVDIR64B, Modrm|AddrPrefixOpReg|NoSuf,
> { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > +movdir64b, 0x66f8, MOVDIR64B&APX_F,
> Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex,
> Reg32|Reg64 }
> >
> >  // MOVEDIR instructions end.
> >
> > @@ -3071,7 +3095,9 @@ vcvtneps2bf16<Vxy>, 0xf372,
> AVX_NE_CONVERT, Modrm|<Vxy:vex>|Space0F38|VexW0|NoSu
> >  // ENQCMD instructions.
> >
> >  enqcmd, 0xf20f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf,
> { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > +enqcmd, 0xf2f8, APX_F(ENQCMD),
> Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex,
> Reg32|Reg64 }
> >  enqcmds, 0xf30f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf,
> { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> > +enqcmds, 0xf3f8, APX_F(ENQCMD),
> Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex,
> Reg32|Reg64 }
> >
> >  // ENQCMD instructions end.
> >
> > @@ -3132,8 +3158,8 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
> >
> >  // AMX instructions.
> >
> > -ldtilecfg, 0x49/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex }
> > -sttilecfg, 0x6649/0, AMX_TILE,
> Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
> > +ldtilecfg, 0x49/0, APX_F(AMX_TILE),
> Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex }
> > +sttilecfg, 0x6649/0, APX_F(AMX_TILE),
> Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex }
> >
> >  tcmmimfp16ps, 0x666c, AMX_COMPLEX,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> { RegTMM, RegTMM, RegTMM }
> >  tcmmrlfp16ps, 0x6c, AMX_COMPLEX,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> { RegTMM, RegTMM, RegTMM }
> > @@ -3145,9 +3171,9 @@ tdpbuud, 0x5e, AMX_INT8,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> >  tdpbusd, 0x665e, AMX_INT8,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> { RegTMM, RegTMM, RegTMM }
> >  tdpbsud, 0xf35e, AMX_INT8,
> Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> { RegTMM, RegTMM, RegTMM }
> >
> > -tileloadd, 0xf24b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex, RegTMM }
> > -tileloaddt1, 0x664b, AMX_TILE,
> Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex,
> RegTMM }
> > -tilestored, 0xf34b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf,
> { RegTMM, Unspecified|BaseIndex }
> > +tileloadd, 0xf24b, APX_F(AMX_TILE),
> Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex, RegTMM }
> > +tileloaddt1, 0x664b, APX_F(AMX_TILE),
> Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf,
> { Unspecified|BaseIndex, RegTMM }
> > +tilestored, 0xf34b, APX_F(AMX_TILE),
> Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM,
> Unspecified|BaseIndex }
> >
> >  tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {}
> >
> > @@ -3159,15 +3185,25 @@ tilezero, 0xf249, AMX_TILE,
> Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
> >
> >  loadiwkey, 0xf30f38dc, KL, Load|Modrm|NoSuf, { RegXMM, RegXMM }
> >  encodekey128, 0xf30f38fa, KL, Modrm|NoSuf, { Reg32, Reg32 }
> > +encodekey128, 0xf3da, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32,
> Reg32 }
> >  encodekey256, 0xf30f38fb, KL, Modrm|NoSuf, { Reg32, Reg32 }
> > +encodekey256, 0xf3db, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32,
> Reg32 }
> >  aesenc128kl, 0xf30f38dc, KL, Modrm|NoSuf, { Unspecified|BaseIndex,
> RegXMM }
> > +aesenc128kl, 0xf3dc, KL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex, RegXMM }
> >  aesdec128kl, 0xf30f38dd, KL, Modrm|NoSuf, { Unspecified|BaseIndex,
> RegXMM }
> > +aesdec128kl, 0xf3dd, KL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex, RegXMM }
> >  aesenc256kl, 0xf30f38de, KL, Modrm|NoSuf, { Unspecified|BaseIndex,
> RegXMM }
> > +aesenc256kl, 0xf3de, KL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex, RegXMM }
> >  aesdec256kl, 0xf30f38df, KL, Modrm|NoSuf, { Unspecified|BaseIndex,
> RegXMM }
> > +aesdec256kl, 0xf3df, KL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex, RegXMM }
> >  aesencwide128kl, 0xf30f38d8/0, WideKL, Modrm|NoSuf,
> { Unspecified|BaseIndex }
> > +aesencwide128kl, 0xf3d8/0, WideKL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex }
> >  aesdecwide128kl, 0xf30f38d8/1, WideKL, Modrm|NoSuf,
> { Unspecified|BaseIndex }
> > +aesdecwide128kl, 0xf3d8/1, WideKL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex }
> >  aesencwide256kl, 0xf30f38d8/2, WideKL, Modrm|NoSuf,
> { Unspecified|BaseIndex }
> > +aesencwide256kl, 0xf3d8/2, WideKL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex }
> >  aesdecwide256kl, 0xf30f38d8/3, WideKL, Modrm|NoSuf,
> { Unspecified|BaseIndex }
> > +aesdecwide256kl, 0xf3d8/3, WideKL&APX_F, Modrm|NoSuf|EVexMap4,
> { Unspecified|BaseIndex }
> >
> >  // KEYLOCKER instructions end.
> >
> > @@ -3315,7 +3351,7 @@ prefetchit1, 0xf18/6, PREFETCHI,
> Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
> >
> >  // CMPCCXADD instructions.
> >
> > -cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD,
> Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +cmp<cc>xadd, 0x66e<cc:opc>, APX_F(CMPCCXADD),
> Modrm|Vex|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSiz
> e|NoSuf, { Reg32|Reg64, Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >
> >  // CMPCCXADD instructions end.
> >
> > @@ -3335,9 +3371,13 @@ wrmsrlist, 0xf30f01c6, MSRLIST, NoSuf, {}
> >  // RAO-INT instructions.
> >
> >  aadd, 0xf38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +aadd, 0xfc, RAO_INT&APX_F,
> Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >  aand, 0x660f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +aand, 0x66fc, RAO_INT&APX_F,
> Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >  aor, 0xf20f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +aor, 0xf2fc, RAO_INT&APX_F,
> Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >  axor, 0xf30f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf,
> { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> > +axor, 0xf3fc, RAO_INT&APX_F,
> Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64,
> Dword|Qword|Unspecified|BaseIndex }
> >
> >  // RAO-INT instructions end.
> >
> > --
> > 2.25.1
> >
> 
> OK.
> 
> Thanks.
> 
> H.J.
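
For readers following the disassembler hunks quoted above (OP_E_memory, OP_XMM, DistinctDest_Fixup), the register-number calculation can be sketched on its own. This is only an illustration, not code from the patch: the function name and the SKETCH_REX_R constant are made up here, and the value 4 for the R bit is an assumption about how the bit is laid out. The idea it shows is that the inverted vex.r test is replaced by a check of the R bit kept in rex2, which contributes 16 on top of the 8 contributed by the R bit kept in rex.

```c
/* Hypothetical stand-in for the R bit mask; the value is an assumption
   made for this sketch and is not taken from the patch.  */
#define SKETCH_REX_R 4u

/* Combine the 3-bit ModRM.reg field with the two extension bits.
   REX.R adds 8; the extra R bit carried in rex2 adds 16, which is how
   registers 16..31 become reachable.  */
static unsigned int
sketch_modrm_reg (unsigned int modrm_reg, unsigned int rex,
                  unsigned int rex2)
{
  unsigned int reg = modrm_reg & 7u;

  if (rex & SKETCH_REX_R)
    reg += 8;
  if (rex2 & SKETCH_REX_R)
    reg += 16;
  return reg;
}
```

Called as sketch_modrm_reg (7, SKETCH_REX_R, SKETCH_REX_R), the sketch yields register number 31; with both extension bits clear it returns the ModRM.reg field unchanged.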

[-- Attachment #2: 0001-Support-APX-GPR32-with-extend-evex-prefix.patch --]
[-- Type: application/octet-stream, Size: 51908 bytes --]

From 2c9fdb12f84f2ee3440d41c699c86351177f59ea Mon Sep 17 00:00:00 2001
From: "Cui, Lili" <lili.cui@intel.com>
Date: Thu, 28 Dec 2023 01:06:40 +0000
Subject: [PATCH] Support APX GPR32 with extend evex prefix
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch adds non-ND, non-NF forms of EVEX promotion insn.

EVEX extension of legacy instructions:
  All promoted legacy instructions are placed in EVEX map 4, which is
  currently reserved.
EVEX extension of EVEX instructions:
  All existing EVEX instructions are extended by APX using the extended
  EVEX prefix, so that they can access all 32 GPRs.
EVEX extension of VEX instructions:
  Promoting a VEX instruction into the EVEX space does not change the map
  id, the opcode, or the operand encoding of the VEX instruction.

Note: The promoted versions of MOVBE will be extended to include the “MOVBE
  reg1, reg2”.

  gas/ChangeLog:

  2023-12-28  Lingling Kong <lingling.kong@intel.com>
	      H.J. Lu  <hongjiu.lu@intel.com>
	      Lili Cui <lili.cui@intel.com>
	      Lin Hu   <lin1.hu@intel.com>

	* config/tc-i386.c (struct _i386_insn): Add has_egpr.
	(need_evex_encoding): Adjusted for apx.
	(cpu_flags_match): Ditto.
	(install_template): Handled APX combines.
	(is_apx_evex_encoding): Test apx evex encoding.
	(build_apx_evex_prefix): Enable APX evex prefix.
	(md_assemble): Handle apx with evex encoding.
	(process_suffix): Handle apx map4 prefix.
	(check_register): Assign i.vec_encoding for APX evex instructions.
	* testsuite/gas/i386/x86-64-evex.d: Adjust test cases.
	* testsuite/gas/i386/x86-64.exp: Adjust x86-64-inval-movbe.

opcodes/ChangeLog:

	* i386-dis-evex-len.h: Handle EVEX_LEN_0F38F2, EVEX_LEN_0F38F3.
	* i386-dis-evex-prefix.h: Handle PREFIX_EVEX_0F38F2_L_0,
	PREFIX_EVEX_0F38F3_L_0, PREFIX_EVEX_MAP4_D8,
	PREFIX_EVEX_MAP4_DA, PREFIX_EVEX_MAP4_DB,
	PREFIX_EVEX_MAP4_DC, PREFIX_EVEX_MAP4_DD,
	PREFIX_EVEX_MAP4_DE, PREFIX_EVEX_MAP4_DF,
	PREFIX_EVEX_MAP4_F0, PREFIX_EVEX_MAP4_F1,
	PREFIX_EVEX_MAP4_F2, PREFIX_EVEX_MAP4_F8.
	* i386-dis-evex-reg.h: Handle REG_EVEX_0F38F3_L_0_P_0.
	* i386-dis-evex.h: Add EVEX_MAP4_ for legacy insns
	promoted to APX to use GPR32.
	* opcodes/i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F90,
	X86_64_EVEX_0F92, X86_64_EVEX_0F93, X86_64_EVEX_0F38F2,
	X86_64_EVEX_0F38F3, X86_64_EVEX_0F38F5, X86_64_EVEX_0F38F6,
	X86_64_EVEX_0F38F7, X86_64_EVEX_0F3AF0, X86_64_EVEX_0F91.
	* i386-dis.c
	(struct instr_info): Deleted bool r.
	(PREFIX_NP_OR_DATA): New.
	(NO_PREFIX): New.
	(putop): Ditto.
	(X86_64_EVEX_FROM_VEX_TABLE): Ditto.
	(get_valid_dis386): Decode insn erex in extend evex prefix.
	Handle EVEX_MAP4.
	(print_insn): Handle PREFIX_DATA_AND_NP_ONLY.
	(print_register): Handle apx instructions decode.
	(OP_E_memory): Ditto.
	(OP_G): Ditto.
	(OP_XMM): Ditto.
	(DistinctDest_Fixup): Ditto.
	* i386-gen.c (process_i386_opcode_modifier): Add EVEXMAP4.
	* i386-opc.h (SPACE_EVEXMAP4): Add for legacy insns
	promoted to EVEX.
	* i386-opc.tbl: Handle some legacy and VEX insns that don't
	support GPR32, and add some legacy insns (map 2 / 3) promoted
	to EVEX.
---
 gas/config/tc-i386.c                 |  85 ++++++++++++--
 gas/testsuite/gas/i386/x86-64-evex.d |   2 +-
 gas/testsuite/gas/i386/x86-64.exp    |   2 +-
 opcodes/i386-dis-evex-prefix.h       |  58 ++++++++++
 opcodes/i386-dis-evex-x86-64.h       |  50 +++++++++
 opcodes/i386-dis-evex.h              |  94 ++++++++--------
 opcodes/i386-dis.c                   | 160 +++++++++++++++++++++++----
 opcodes/i386-gen.c                   |   2 +
 opcodes/i386-opc.h                   |   6 +
 opcodes/i386-opc.tbl                 |  90 ++++++++++-----
 10 files changed, 440 insertions(+), 109 deletions(-)
 create mode 100644 opcodes/i386-dis-evex-x86-64.h
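
Note for readers: the bit adjustments that build_apx_evex_prefix makes in the gas hunk below can be tried in isolation. The following is a rough sketch only. The function name, the SKETCH_* masks and their values, and the vvvv_is_egpr flag are all invented for illustration, and the sample prefix bytes used in the usage note are an assumed starting point rather than the output of the real build_evex_prefix. The masks 0x10, 0x08, 0x04 and 0x08 and the direction of each change (clear or set) are the ones the patch uses.

```c
/* Hypothetical bit masks standing in for REX_R / REX_X / REX_B; the
   values are assumptions made for this sketch.  */
#define SKETCH_REX_R 4u
#define SKETCH_REX_X 2u
#define SKETCH_REX_B 1u

/* Apply the APX-specific adjustments to an already built 4-byte EVEX
   prefix.  bytes[0] is the 0x62 escape; bytes[1..3] are the payload
   bytes.  vvvv_is_egpr is nonzero when the register encoded in vvvv
   is one of the extended GPRs.  */
static void
sketch_apply_apx_bits (unsigned char bytes[4], unsigned int rex2,
                       int vvvv_is_egpr)
{
  if (rex2 & SKETCH_REX_R)
    bytes[1] &= (unsigned char) ~0x10u;  /* stored inverted: clear */
  if (rex2 & SKETCH_REX_B)
    bytes[1] |= 0x08u;                   /* stored as-is: set */
  if (rex2 & SKETCH_REX_X)
    bytes[2] &= (unsigned char) ~0x04u;  /* stored inverted: clear */
  if (vvvv_is_egpr)
    bytes[3] &= (unsigned char) ~0x08u;  /* stored inverted: clear */
}
```

Starting from the assumed bytes { 0x62, 0xf4, 0x7c, 0x08 } with all three rex2 bits set and an extended register in vvvv, the sketch produces { 0x62, 0xec, 0x78, 0x00 }. With rex2 equal to 0 and vvvv_is_egpr equal to 0 the bytes are left untouched, which matches the patch falling back to the plain EVEX prefix in that case.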

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 11b3927cd23..e41b79aef93 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -435,6 +435,9 @@ struct _i386_insn
     /* Prefer the REX2 prefix in encoding.  */
     bool rex2_encoding;
 
+    /* Need to use an Egpr capable encoding (REX2 or EVEX).  */
+    bool has_egpr;
+
     /* Disable instruction size optimization.  */
     bool no_optimize;
 
@@ -1862,10 +1865,11 @@ cpu_flags_and_not (i386_cpu_flags x, i386_cpu_flags y)
 
 static const i386_cpu_flags avx512 = CPU_ANY_AVX512F_FLAGS;
 
-static INLINE bool need_evex_encoding (void)
+static INLINE bool need_evex_encoding (const insn_template *t)
 {
   return i.vec_encoding == vex_encoding_evex
 	|| i.vec_encoding == vex_encoding_evex512
+	|| (t->opcode_modifier.vex && i.has_egpr)
 	|| i.mask.reg;
 }
 
@@ -1905,13 +1909,13 @@ cpu_flags_match (const insn_template *t)
       if ((any.bitfield.cpuavx || any.bitfield.cpuavx2 || any.bitfield.cpufma)
 	  && (any.bitfield.cpuavx512f || any.bitfield.cpuavx512vl))
 	{
-	  if (need_evex_encoding ())
+	  if (need_evex_encoding (t))
 	    {
 	      any.bitfield.cpuavx = 0;
 	      any.bitfield.cpuavx2 = 0;
 	      any.bitfield.cpufma = 0;
 	    }
-	  /* need_evex_encoding() isn't reliable before operands were
+	  /* need_evex_encoding(t) isn't reliable before operands were
 	     parsed.  */
 	  else if (i.operands)
 	    {
@@ -3676,12 +3680,12 @@ install_template (const insn_template *t)
 
   /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
   if (t->opcode_modifier.vex && t->opcode_modifier.evex)
-  {
+    {
       if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
 	   || maybe_cpu (t, CpuFMA))
 	  && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
 	{
-	  if (need_evex_encoding ())
+	  if (need_evex_encoding (t))
 	    {
 	      i.tm.opcode_modifier.vex = 0;
 	      i.tm.cpu.bitfield.cpuavx512f = i.tm.cpu_any.bitfield.cpuavx512f;
@@ -3698,7 +3702,19 @@ install_template (const insn_template *t)
 		gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa);
 	    }
 	}
-  }
+
+      if ((maybe_cpu (t, CpuCMPCCXADD) || maybe_cpu (t, CpuAMX_TILE)
+	   || maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512DQ)
+	   || maybe_cpu (t, CpuAVX512BW) || maybe_cpu (t, CpuBMI)
+	   || maybe_cpu (t, CpuBMI2))
+	  && maybe_cpu (t, CpuAPX_F))
+	{
+	  if (need_evex_encoding (t))
+	    i.tm.opcode_modifier.vex = 0;
+	  else
+	    i.tm.opcode_modifier.evex = 0;
+	}
+    }
 
   /* Note that for pseudo prefixes this produces a length of 1. But for them
      the length isn't interesting at all.  */
@@ -3879,6 +3895,15 @@ is_any_vex_encoding (const insn_template *t)
   return t->opcode_modifier.vex || t->opcode_modifier.evex;
 }
 
+/* We can use this function only when the current encoding is evex.  */
+static INLINE bool
+is_apx_evex_encoding (void)
+{
+  return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
+    || (i.vex.register_specifier
+	&& (i.vex.register_specifier->reg_flags & RegRex2));
+}
+
 static INLINE bool
 is_apx_rex2_encoding (void)
 {
@@ -4156,6 +4181,27 @@ build_rex2_prefix (void)
 		    | (i.rex2 << 4) | i.rex);
 }
 
+/* Build the EVEX prefix (4-byte) for evex insn
+   | 62h |
+   | `R`X`B`R' | B'mmm |
+   | W | v`v`v`v | `x' | pp |
+   | z| L'L | b | `v | aaa |
+*/
+static void
+build_apx_evex_prefix (void)
+{
+  build_evex_prefix ();
+  if (i.rex2 & REX_R)
+    i.vex.bytes[1] &= ~0x10;
+  if (i.rex2 & REX_B)
+    i.vex.bytes[1] |= 0x08;
+  if (i.rex2 & REX_X)
+    i.vex.bytes[2] &= ~0x04;
+  if (i.vex.register_specifier
+      && i.vex.register_specifier->reg_flags & RegRex2)
+    i.vex.bytes[3] &= ~0x08;
+}
+
 static void establish_rex (void)
 {
   /* Note that legacy encodings have at most 2 non-immediate operands.  */
@@ -5723,13 +5769,18 @@ md_assemble (char *line)
 	  return;
 	}
 
-      if (i.tm.opcode_modifier.vex)
+      if (is_apx_evex_encoding ())
+	build_apx_evex_prefix ();
+      else if (i.tm.opcode_modifier.vex)
 	build_vex_prefix (t);
       else
 	build_evex_prefix ();
 
       /* The individual REX.RXBW bits got consumed.  */
       i.rex &= REX_OPCODE;
+
+      /* The rex2 bits got consumed.  */
+      i.rex2 = 0;
     }
 
   /* Handle conversion of 'int $3' --> special int3 insn.  */
@@ -6648,7 +6699,7 @@ check_VecOperands (const insn_template *t)
   if (!cpu_flags_all_zero (&cpu)
       && !is_cpu (t, CpuAVX512VL)
       && !cpu_arch_flags.bitfield.cpuavx512vl
-      && (!t->opcode_modifier.vex || need_evex_encoding ()))
+      && (!t->opcode_modifier.vex || need_evex_encoding (t)))
     {
       for (op = 0; op < t->operands; ++op)
 	{
@@ -6960,7 +7011,7 @@ check_VecOperands (const insn_template *t)
   /* Check vector Disp8 operand.  */
   if (t->opcode_modifier.disp8memshift
       && (!t->opcode_modifier.vex
-          || need_evex_encoding ())
+	  || need_evex_encoding (t))
       && i.disp_encoding <= disp_encoding_8bit)
     {
       if (i.broadcast.type || i.broadcast.bytes)
@@ -7617,7 +7668,7 @@ match_template (char mnem_suffix)
       if ((t == current_templates.start || j > 1)
 	  && t->opcode_modifier.disp8memshift
 	  && !t->opcode_modifier.vex
-	  && !need_evex_encoding ()
+	  && !need_evex_encoding (t)
 	  && t + j < current_templates.end
 	  && t[j].opcode_modifier.vex)
 	{
@@ -8084,7 +8135,8 @@ process_suffix (void)
       if (i.suffix != QWORD_MNEM_SUFFIX
 	  && i.tm.opcode_modifier.mnemonicsize != IGNORESIZE
 	  && !i.tm.opcode_modifier.floatmf
-	  && !is_any_vex_encoding (&i.tm)
+	  && (!is_any_vex_encoding (&i.tm)
+	      || i.tm.opcode_space == SPACE_EVEXMAP4)
 	  && ((i.suffix == LONG_MNEM_SUFFIX) == (flag_code == CODE_16BIT)
 	      || (flag_code == CODE_64BIT
 		  && i.tm.opcode_modifier.jump == JUMP_BYTE)))
@@ -8094,7 +8146,14 @@ process_suffix (void)
 	  if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */
 	    prefix = ADDR_PREFIX_OPCODE;
 
-	  if (!add_prefix (prefix))
+	  /* For legacy insns promoted to EVEX (APX) the data size prefix is
+	     encoded in the EVEX prefix rather than emitted separately.  */
+	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
+	    {
+	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
+	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
+	    }
+	  else if (!add_prefix (prefix))
 	    return 0;
 	}
 
@@ -14300,6 +14359,8 @@ static bool check_register (const reg_entry *r)
       if (!cpu_arch_flags.bitfield.cpuapx_f
 	  || flag_code != CODE_64BIT)
 	return false;
+
+      i.has_egpr = true;
     }
 
   if (((r->reg_flags & (RegRex64 | RegRex)) || r->reg_type.bitfield.qword)
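
A standalone sketch of the bit adjustments that build_apx_evex_prefix
makes on top of an already built EVEX prefix, for readers who want to try
the encoding outside of gas. The names sk_apply_apx_bits, SK_REX_* and the
vvvv_is_egpr parameter are hypothetical; the masks are the ones used in the
hunk above, and the SK_REX_* values are assumed to mirror the usual REX bit
layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the REX bit values used by the assembler.  */
enum { SK_REX_B = 1, SK_REX_X = 2, SK_REX_R = 4 };

/* BYTES holds an already built 4-byte EVEX prefix (0x62 first).
   REX2 carries the extra register-number bits gathered from the operands.
   VVVV_IS_EGPR is true when the EVEX.vvvv operand is one of r16-r31.  */
static void
sk_apply_apx_bits (uint8_t bytes[4], unsigned int rex2, bool vvvv_is_egpr)
{
  if (rex2 & SK_REX_R)
    bytes[1] &= (uint8_t) ~0x10;	/* R4 is stored inverted.  */
  if (rex2 & SK_REX_B)
    bytes[1] |= 0x08;			/* B4 is stored as is.  */
  if (rex2 & SK_REX_X)
    bytes[2] &= (uint8_t) ~0x04;	/* X4 is stored inverted.  */
  if (vvvv_is_egpr)
    bytes[3] &= (uint8_t) ~0x08;	/* V4 is stored inverted.  */
}
```

Starting from 62 f1 7c 08 and passing SK_REX_R | SK_REX_B turns the second
byte into e9, while the other bytes stay unchanged.
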
diff --git a/gas/testsuite/gas/i386/x86-64-evex.d b/gas/testsuite/gas/i386/x86-64-evex.d
index 041747db892..5d974c312da 100644
--- a/gas/testsuite/gas/i386/x86-64-evex.d
+++ b/gas/testsuite/gas/i386/x86-64-evex.d
@@ -17,6 +17,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 f1 d6 38 7b f0    	vcvtusi2ss %rax,\{rd-sae\},%xmm5,%xmm6
  +[a-f0-9]+:	62 f1 57 38 7b f0    	vcvtusi2sd %eax,\{rd-bad\},%xmm5,%xmm6
  +[a-f0-9]+:	62 f1 d7 38 7b f0    	vcvtusi2sd %rax,\{rd-sae\},%xmm5,%xmm6
- +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,\(bad\)
+ +[a-f0-9]+:	62 e1 7e 08 2d c0    	vcvtss2si %xmm0,%r16d
  +[a-f0-9]+:	62 e1 7c 08 c2 c0 00 	vcmpeqps %xmm0,%xmm0,\(bad\)
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 91c068d5b40..ffacc9c8e2b 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -250,7 +250,7 @@ run_dump_test "x86-64-sse-noavx"
 run_dump_test "x86-64-movbe"
 run_dump_test "x86-64-movbe-intel"
 run_dump_test "x86-64-movbe-suffix"
-run_list_test "x86-64-inval-movbe" "-al"
+run_list_test "x86-64-inval-movbe" "-march=+noapx_f -al"
 run_dump_test "x86-64-ept"
 run_dump_test "x86-64-ept-intel"
 run_list_test "x86-64-inval-ept" "-al"
diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
index 28da54922c7..54ed48c6952 100644
--- a/opcodes/i386-dis-evex-prefix.h
+++ b/opcodes/i386-dis-evex-prefix.h
@@ -338,6 +338,64 @@
     { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
     { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
   },
+  /* PREFIX_EVEX_MAP4_D8 */
+  {
+    { "sha1nexte", { XM, EXxmm }, 0 },
+    { REG_TABLE (REG_0F38D8_PREFIX_1) },
+  },
+  /* PREFIX_EVEX_MAP4_DA */
+  {
+    { "sha1msg2", { XM, EXxmm }, 0 },
+    { "encodekey128", { Gd, Rd }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DB */
+  {
+    { "sha256rnds2", { XM, EXxmm, XMM0 }, 0 },
+    { "encodekey256", { Gd, Rd }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DC */
+  {
+    { "sha256msg1", { XM, EXxmm }, 0 },
+    { "aesenc128kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DD */
+  {
+    { "sha256msg2", { XM, EXxmm }, 0 },
+    { "aesdec128kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DE */
+  {
+    { Bad_Opcode },
+    { "aesenc256kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_DF */
+  {
+    { Bad_Opcode },
+    { "aesdec256kl", { XM, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F0 */
+  {
+    { "crc32A", { Gdq, Eb }, 0 },
+    { "invept", { Gm, Mo }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F1 */
+  {
+    { "crc32Q", { Gdq, Ev }, 0 },
+    { "invvpid", { Gm, Mo }, 0 },
+    { "crc32Q", { Gdq, Ev }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F2 */
+  {
+    { Bad_Opcode },
+    { "invpcid", { Gm, M }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_F8 */
+  {
+    { Bad_Opcode },
+    { "enqcmds", { Gva, M },  0 },
+    { "movdir64b", { Gva, M }, 0 },
+    { "enqcmd", { Gva, M }, 0 },
+  },
   /* PREFIX_EVEX_MAP5_10 */
   {
     { Bad_Opcode },
diff --git a/opcodes/i386-dis-evex-x86-64.h b/opcodes/i386-dis-evex-x86-64.h
new file mode 100644
index 00000000000..0d9d98a7691
--- /dev/null
+++ b/opcodes/i386-dis-evex-x86-64.h
@@ -0,0 +1,50 @@
+  /* X86_64_EVEX_0F90 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F90_L_0) },
+  },
+  /* X86_64_EVEX_0F91 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F91_L_0) },
+  },
+  /* X86_64_EVEX_0F92 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F92_L_0) },
+  },
+  /* X86_64_EVEX_0F93 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F93_L_0) },
+  },
+  /* X86_64_EVEX_0F38F2 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
+  },
+  /* X86_64_EVEX_0F38F3 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
+  },
+  /* X86_64_EVEX_0F38F5 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F5_L_0) },
+  },
+  /* X86_64_EVEX_0F38F6 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE(PREFIX_VEX_0F38F6_L_0) },
+  },
+  /* X86_64_EVEX_0F38F7 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE(PREFIX_VEX_0F38F7_L_0) },
+  },
+  /* X86_64_EVEX_0F3AF0 */
+  {
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) },
+  },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index 7ad1edbe72d..90c063b2188 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -164,10 +164,10 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 90 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F90) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F91) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F92) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F93) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -375,9 +375,9 @@ static const struct dis386 evex_table[][256] = {
     { "vpsllv%DQ",	{ XM, Vex, EXx }, PREFIX_DATA },
     /* 48 */
     { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F3849) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F384B) },
     { "vrcp14p%XW",	{ XM, EXx }, PREFIX_DATA },
     { "vrcp14s%XW",	{ XMScalar, VexScalar, EXdq }, PREFIX_DATA },
     { "vrsqrt14p%XW",	{ XM, EXx }, 0 },
@@ -545,32 +545,32 @@ static const struct dis386 evex_table[][256] = {
     { "%XEvaesdecY",	{ XM, Vex, EXx }, PREFIX_DATA },
     { "%XEvaesdeclastY", { XM, Vex, EXx }, PREFIX_DATA },
     /* E0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E0) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E1) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E2) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E3) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E4) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E5) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E6) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E7) },
     /* E8 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E8) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38E9) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EA) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EB) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EC) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38ED) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EE) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F38EF) },
     /* F0 */
     { Bad_Opcode },
     { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F2) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F3) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F5) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F6) },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F38F7) },
     /* F8 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -854,7 +854,7 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* F0 */
-    { Bad_Opcode },
+    { X86_64_EVEX_FROM_VEX_TABLE (X86_64_EVEX_0F3AF0) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -983,13 +983,13 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 60 */
+    { "movbeS",	{ Gv, Ev }, PREFIX_NP_OR_DATA },
+    { "movbeS",	{ Ev, Gv }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "wrussK",	{ M, Gdq }, PREFIX_DATA },
+    { PREFIX_TABLE (PREFIX_0F38F6) },
     { Bad_Opcode },
     /* 68 */
     { Bad_Opcode },
@@ -1113,19 +1113,19 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { "sha1rnds4",	{ XM, EXxmm, Ib }, NO_PREFIX },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* D8 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_D8) },
+    { "sha1msg1",	{ XM, EXxmm }, NO_PREFIX },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DA) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DB) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DC) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DD) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DE) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_DF) },
     /* E0 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1145,20 +1145,20 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* F0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F0) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F1) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F2) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* F8 */
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
+    { "movdiri",	{ Mdq, Gdq }, NO_PREFIX },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_0F38FC) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index e006d869258..5a72a2030ae 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -132,6 +132,13 @@ enum x86_64_isa
   intel64
 };
 
+enum evex_type
+{
+  evex_default = 0,
+  evex_from_legacy,
+  evex_from_vex,
+};
+
 struct instr_info
 {
   enum address_mode address_mode;
@@ -212,7 +219,6 @@ struct instr_info
     int ll;
     bool w;
     bool evex;
-    bool r;
     bool v;
     bool zeroing;
     bool b;
@@ -220,6 +226,8 @@ struct instr_info
   }
   vex;
 
+  enum evex_type evex_type;
+
   /* Remember if the current op is a jump instruction.  */
   bool op_is_jump;
 
@@ -303,6 +311,8 @@ struct dis_private {
 #define PREFIX_ADDR 0x400
 #define PREFIX_FWAIT 0x800
 #define PREFIX_REX2 0x1000
+#define PREFIX_NP_OR_DATA 0x2000
+#define NO_PREFIX   0x4000
 
 /* Make sure that bytes from INFO->PRIVATE_DATA->BUFFER (inclusive)
    to ADDR (exclusive) are valid.  Returns true for success, false
@@ -800,6 +810,7 @@ enum
   USE_RM_TABLE,
   USE_PREFIX_TABLE,
   USE_X86_64_TABLE,
+  USE_X86_64_EVEX_FROM_VEX_TABLE,
   USE_3BYTE_TABLE,
   USE_XOP_8F_TABLE,
   USE_VEX_C4_TABLE,
@@ -818,6 +829,8 @@ enum
 #define RM_TABLE(I)		DIS386 (USE_RM_TABLE, (I))
 #define PREFIX_TABLE(I)		DIS386 (USE_PREFIX_TABLE, (I))
 #define X86_64_TABLE(I)		DIS386 (USE_X86_64_TABLE, (I))
+#define X86_64_EVEX_FROM_VEX_TABLE(I) \
+  DIS386 (USE_X86_64_EVEX_FROM_VEX_TABLE, (I))
 #define THREE_BYTE_TABLE(I)	DIS386 (USE_3BYTE_TABLE, (I))
 #define XOP_8F_TABLE()		DIS386 (USE_XOP_8F_TABLE, 0)
 #define VEX_C4_TABLE()		DIS386 (USE_VEX_C4_TABLE, 0)
@@ -866,7 +879,7 @@ enum
   REG_VEX_0F73,
   REG_VEX_0FAE,
   REG_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0,
-  REG_VEX_0F38F3_L_0,
+  REG_VEX_0F38F3_L_0_P_0,
   REG_VEX_MAP7_F8_L_0_W_0,
 
   REG_XOP_09_01_L_0,
@@ -878,7 +891,7 @@ enum
   REG_EVEX_0F72,
   REG_EVEX_0F73,
   REG_EVEX_0F38C6_L_2,
-  REG_EVEX_0F38C7_L_2
+  REG_EVEX_0F38C7_L_2,
 };
 
 enum
@@ -1094,6 +1107,8 @@ enum
   PREFIX_VEX_0F38CC,
   PREFIX_VEX_0F38CD,
   PREFIX_VEX_0F38DA_W_0,
+  PREFIX_VEX_0F38F2_L_0,
+  PREFIX_VEX_0F38F3_L_0,
   PREFIX_VEX_0F38F5_L_0,
   PREFIX_VEX_0F38F6_L_0,
   PREFIX_VEX_0F38F7_L_0,
@@ -1156,6 +1171,18 @@ enum
   PREFIX_EVEX_0F3A67,
   PREFIX_EVEX_0F3AC2,
 
+  PREFIX_EVEX_MAP4_D8,
+  PREFIX_EVEX_MAP4_DA,
+  PREFIX_EVEX_MAP4_DB,
+  PREFIX_EVEX_MAP4_DC,
+  PREFIX_EVEX_MAP4_DD,
+  PREFIX_EVEX_MAP4_DE,
+  PREFIX_EVEX_MAP4_DF,
+  PREFIX_EVEX_MAP4_F0,
+  PREFIX_EVEX_MAP4_F1,
+  PREFIX_EVEX_MAP4_F2,
+  PREFIX_EVEX_MAP4_F8,
+
   PREFIX_EVEX_MAP5_10,
   PREFIX_EVEX_MAP5_11,
   PREFIX_EVEX_MAP5_1D,
@@ -1267,7 +1294,19 @@ enum
   X86_64_VEX_0F38ED,
   X86_64_VEX_0F38EE,
   X86_64_VEX_0F38EF,
+
   X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
+
+  X86_64_EVEX_0F90,
+  X86_64_EVEX_0F91,
+  X86_64_EVEX_0F92,
+  X86_64_EVEX_0F93,
+  X86_64_EVEX_0F38F2,
+  X86_64_EVEX_0F38F3,
+  X86_64_EVEX_0F38F5,
+  X86_64_EVEX_0F38F6,
+  X86_64_EVEX_0F38F7,
+  X86_64_EVEX_0F3AF0,
 };
 
 enum
@@ -2882,12 +2921,12 @@ static const struct dis386 reg_table[][8] = {
   {
     { RM_TABLE (RM_VEX_0F3849_X86_64_L_0_W_0_M_1_P_0_R_0) },
   },
-  /* REG_VEX_0F38F3_L_0 */
+  /* REG_VEX_0F38F3_L_0_P_0 */
   {
     { Bad_Opcode },
-    { "blsrS",		{ VexGdq, Edq }, PREFIX_OPCODE },
-    { "blsmskS",	{ VexGdq, Edq }, PREFIX_OPCODE },
-    { "blsiS",		{ VexGdq, Edq }, PREFIX_OPCODE },
+    { "blsrS",		{ VexGdq, Edq }, 0 },
+    { "blsmskS",	{ VexGdq, Edq }, 0 },
+    { "blsiS",		{ VexGdq, Edq }, 0 },
   },
   /* REG_VEX_MAP7_F8_L_0_W_0 */
   {
@@ -4035,6 +4074,16 @@ static const struct dis386 prefix_table[][4] = {
     { "vsm4rnds4", { XM, Vex, EXx }, 0 },
   },
 
+  /* PREFIX_VEX_0F38F2_L_0 */
+  {
+    { "andnS",          { Gdq, VexGdq, Edq }, 0 },
+  },
+
+  /* PREFIX_VEX_0F38F3_L_0 */
+  {
+    { REG_TABLE (REG_VEX_0F38F3_L_0_P_0) },
+  },
+
   /* PREFIX_VEX_0F38F5_L_0 */
   {
     { "bzhiS",		{ Gdq, Edq, VexGdq }, 0 },
@@ -4527,6 +4576,7 @@ static const struct dis386 x86_64_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64) },
   },
 
+#include "i386-dis-evex-x86-64.h"
 };
 
 static const struct dis386 three_byte_table[][256] = {
@@ -7113,12 +7163,12 @@ static const struct dis386 vex_len_table[][2] = {
 
   /* VEX_LEN_0F38F2 */
   {
-    { "andnS",		{ Gdq, VexGdq, Edq }, PREFIX_OPCODE },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F2_L_0) },
   },
 
   /* VEX_LEN_0F38F3 */
   {
-    { REG_TABLE(REG_VEX_0F38F3_L_0) },
+    { PREFIX_TABLE (PREFIX_VEX_0F38F3_L_0) },
   },
 
   /* VEX_LEN_0F38F5 */
@@ -8732,6 +8782,17 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
       dp = &prefix_table[dp->op[1].bytemode][vindex];
       break;
 
+    case USE_X86_64_EVEX_FROM_VEX_TABLE:
+      ins->evex_type = evex_from_vex;
+      /* EVEX from VEX instructions require that EVEX.z, EVEX.L'L, EVEX.b and
+	 the lower 2 bits of EVEX.aaa be 0.  */
+      if ((ins->vex.mask_register_specifier & 0x3) != 0
+	  || ins->vex.ll != 0
+	  || ins->vex.zeroing != 0
+	  || ins->vex.b)
+	return &bad_opcode;
+
+      /* Fall through.  */
     case USE_X86_64_TABLE:
       vindex = ins->address_mode == mode_64bit ? 1 : 0;
       dp = &x86_64_table[dp->op[1].bytemode][vindex];
@@ -8977,9 +9038,13 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
       if (!fetch_code (ins->info, ins->codep + 4))
 	return &err_opcode;
       /* The first byte after 0x62.  */
+      if (*ins->codep & 0x8)
+	ins->rex2 |= REX_B;
+      if (!(*ins->codep & 0x10))
+	ins->rex2 |= REX_R;
+
       ins->rex = ~(*ins->codep >> 5) & 0x7;
-      ins->vex.r = *ins->codep & 0x10;
-      switch ((*ins->codep & 0xf))
+      switch (*ins->codep & 0x7)
 	{
 	default:
 	  return &bad_opcode;
@@ -8992,6 +9057,12 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 	case 0x3:
 	  vex_table_index = EVEX_0F3A;
 	  break;
+	case 0x4:
+	  vex_table_index = EVEX_MAP4;
+	  ins->evex_type = evex_from_legacy;
+	  if (ins->address_mode != mode_64bit)
+	    return &bad_opcode;
+	  break;
 	case 0x5:
 	  vex_table_index = EVEX_MAP5;
 	  break;
@@ -9008,9 +9079,8 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 
       ins->vex.register_specifier = (~(*ins->codep >> 3)) & 0xf;
 
-      /* The U bit.  */
       if (!(*ins->codep & 0x4))
-	return &bad_opcode;
+	ins->rex2 |= REX_X;
 
       switch ((*ins->codep & 0x3))
 	{
@@ -9040,12 +9110,26 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 
       if (ins->address_mode != mode_64bit)
 	{
+	  /* Report bad for !evex_default and when two fixed values of evex
+	     change.  */
+	  if (ins->evex_type != evex_default
+	      || (ins->rex2 & (REX_B | REX_X)))
+	    return &bad_opcode;
 	  /* In 16/32-bit mode silently ignore following bits.  */
 	  ins->rex &= ~REX_B;
-	  ins->vex.r = true;
+	  ins->rex2 &= ~REX_R;
 	}
 
       ins->need_vex = 4;
+
+      /* EVEX from legacy instructions require that EVEX.z, EVEX.L'L and the
+	 lower 2 bits of EVEX.aaa be 0.  */
+      if (ins->evex_type == evex_from_legacy
+	  && ((ins->vex.mask_register_specifier & 0x3) != 0
+	      || ins->vex.ll != 0
+	      || ins->vex.zeroing != 0))
+	return &bad_opcode;
+
       ins->codep++;
       vindex = *ins->codep++;
       dp = &evex_table[vex_table_index][vindex];
@@ -9460,6 +9544,13 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       dp = get_valid_dis386 (dp, &ins);
       if (dp == &err_opcode)
 	goto fetch_error_out;
+
+      /* For APX instructions promoted from legacy maps 0/1, the embedded
+	 prefix is interpreted as the operand size override.  */
+      if (ins.evex_type == evex_from_legacy
+	  && ins.vex.prefix == DATA_PREFIX_OPCODE)
+	sizeflag ^= DFLAG;
+
       if (dp != NULL && putop (&ins, dp->name, sizeflag) == 0)
 	{
 	  if (!get_sib (&ins, sizeflag))
@@ -9639,6 +9730,25 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
       if (ins.last_repnz_prefix >= 0)
 	ins.all_prefixes[ins.last_repnz_prefix] = 0xf2;
       break;
+
+    case PREFIX_NP_OR_DATA:
+      if (ins.vex.prefix == REPE_PREFIX_OPCODE
+	  || ins.vex.prefix == REPNE_PREFIX_OPCODE)
+	{
+	  i386_dis_printf (info, dis_style_text, "(bad)");
+	  ret = ins.end_codep - priv.the_buffer;
+	  goto out;
+	}
+      break;
+
+    case NO_PREFIX:
+      if (ins.vex.prefix)
+	{
+	  i386_dis_printf (info, dis_style_text, "(bad)");
+	  ret = ins.end_codep - priv.the_buffer;
+	  goto out;
+	}
+      break;
     }
 
   /* Check if the REX prefix is used.  */
@@ -10348,7 +10458,7 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
 		{
 		case 'X':
 		  if (!ins->vex.evex || ins->vex.b || ins->vex.ll >= 2
-		      || !ins->vex.r
+		      || (ins->rex2 & 7)
 		      || (ins->modrm.mod == 3 && (ins->rex & REX_X))
 		      || !ins->vex.v || ins->vex.mask_register_specifier)
 		    break;
@@ -11459,7 +11569,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 
   add += (ins->rex2 & REX_B) ? 16 : 0;
 
-  if (ins->vex.evex)
+  if (ins->vex.evex && ins->evex_type == evex_default)
     {
 
       /* Zeroing-masking is invalid for memory destinations. Set the flag
@@ -11603,6 +11713,13 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 		abort ();
 	      if (ins->vex.evex)
 		{
+		  /* S/G EVEX insns require EVEX.X4 not to be set.  */
+		  if (ins->rex2 & REX_X)
+		    {
+		      oappend (ins, "(bad)");
+		      return true;
+		    }
+
 		  if (!ins->vex.v)
 		    vindex += 16;
 		  check_gather = ins->obufp == ins->op_out[1];
@@ -11805,7 +11922,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 
 	      if (ins->rex & REX_R)
 	        modrm_reg += 8;
-	      if (!ins->vex.r)
+	      if (ins->rex2 & REX_R)
 	        modrm_reg += 16;
 	      if (vindex == modrm_reg)
 		oappend (ins, "/(bad)");
@@ -12011,10 +12128,7 @@ OP_indirE (instr_info *ins, int bytemode, int sizeflag)
 static bool
 OP_G (instr_info *ins, int bytemode, int sizeflag)
 {
-  if (ins->vex.evex && !ins->vex.r && ins->address_mode == mode_64bit)
-    oappend (ins, "(bad)");
-  else
-    print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
+  print_register (ins, ins->modrm.reg, REX_R, bytemode, sizeflag);
   return true;
 }
 
@@ -12645,7 +12759,7 @@ OP_XMM (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
     reg += 8;
   if (ins->vex.evex)
     {
-      if (!ins->vex.r)
+      if (ins->rex2 & REX_R)
 	reg += 16;
     }
 
@@ -13652,7 +13766,7 @@ DistinctDest_Fixup (instr_info *ins, int bytemode, int sizeflag)
   /* Calc destination register number.  */
   if (ins->rex & REX_R)
     modrm_reg += 8;
-  if (!ins->vex.r)
+  if (ins->rex2 & REX_R)
     modrm_reg += 16;
 
   /* Calc src1 register number.  */
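
The decoder side of the i386-dis.c changes above can be summarised with a
small standalone sketch. It takes the two payload bytes that follow 0x62 and
splits them the way the patched get_valid_dis386 does: inverted R/X/B into
rex, the new R4/X4/B4 bits into rex2, a 3-bit map id, vvvv and pp. The struct
skd_evex_fields, the function skd_decode_evex and the SKD_REX_* constants are
hypothetical names for illustration only; the real code stores these values
in instr_info.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the REX bit values used by the disassembler.  */
enum { SKD_REX_B = 1, SKD_REX_X = 2, SKD_REX_R = 4 };

struct skd_evex_fields
{
  unsigned int rex;	/* Inverted R, X, B from payload byte 0.  */
  unsigned int rex2;	/* R4, X4, B4 (the APX additions).  */
  unsigned int map;	/* Opcode map id, 3 bits.  */
  unsigned int vvvv;	/* Un-inverted vvvv register specifier.  */
  unsigned int pp;	/* Embedded prefix selector.  */
};

/* P0 and P1 are the first and second bytes after 0x62.  */
static struct skd_evex_fields
skd_decode_evex (uint8_t p0, uint8_t p1)
{
  struct skd_evex_fields f = { 0, 0, 0, 0, 0 };

  if (p0 & 0x08)		/* B4 is stored as is.  */
    f.rex2 |= SKD_REX_B;
  if (!(p0 & 0x10))		/* R4 is stored inverted.  */
    f.rex2 |= SKD_REX_R;
  f.rex = ~(unsigned int) (p0 >> 5) & 0x7;
  f.map = p0 & 0x7;

  f.vvvv = ~(unsigned int) (p1 >> 3) & 0xf;
  if (!(p1 & 0x04))		/* X4 is stored inverted.  */
    f.rex2 |= SKD_REX_X;
  f.pp = p1 & 0x3;
  return f;
}
```

For the bytes e1 7e from the updated x86-64-evex.d expectation, this yields
map 1 with only the R4 bit set in rex2, which is consistent with the
destination now being printed as %r16d.
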
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index dd4850e1855..508b441a343 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -487,6 +487,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (Dialect),
   BITFIELD (ISA64),
   BITFIELD (NoEgpr),
+  BITFIELD (NF),
 };
 
 #define CLASS(n) #n, n
@@ -1120,6 +1121,7 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space,
     SPACE(0F),
     SPACE(0F38),
     SPACE(0F3A),
+    SPACE(EVEXMAP4),
     SPACE(EVEXMAP5),
     SPACE(EVEXMAP6),
     SPACE(VEXMAP7),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 8c967ea90b0..064ec48edad 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -743,6 +743,9 @@ enum
      whether the instruction supports pseudo-prefix {rex2}.  */
   NoEgpr,
 
+  /* No CSPAZO flags update indication.  */
+  NF,
+
   /* The last bitfield in i386_opcode_modifier.  */
   Opcode_Modifier_Num
 };
@@ -788,6 +791,7 @@ typedef struct i386_opcode_modifier
   unsigned int dialect:2;
   unsigned int isa64:2;
   unsigned int noegpr:1;
+  unsigned int nf:1;
 } i386_opcode_modifier;
 
 /* Operand classes.  */
@@ -963,6 +967,7 @@ typedef struct insn_template
      1: 0F opcode prefix / space.
      2: 0F38 opcode prefix / space.
      3: 0F3A opcode prefix / space.
+     4: EVEXMAP4 opcode prefix / space.
      5: EVEXMAP5 opcode prefix / space.
      6: EVEXMAP6 opcode prefix / space.
      7: VEXMAP7 opcode prefix / space.
@@ -974,6 +979,7 @@ typedef struct insn_template
 #define SPACE_0F	1
 #define SPACE_0F38	2
 #define SPACE_0F3A	3
+#define SPACE_EVEXMAP4	4
 #define SPACE_EVEXMAP5	5
 #define SPACE_EVEXMAP6	6
 #define SPACE_VEXMAP7	7
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 37d3e8663bb..11b8c0b63cb 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -113,6 +113,7 @@
 #define SpaceXOP09 OpcodeSpace=SPACE_XOP09
 #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
 
+#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
 #define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
 #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
 
@@ -139,6 +140,9 @@
 
 #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
 
+// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
+#define APX_F(cpuid) cpuid&(cpuid|APX_F)
+
 // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
 // the bit to mark commutative VEX encodings where swapping the source
 // operands may allow to switch from 3-byte to 2-byte VEX encoding.
@@ -194,6 +198,7 @@ mov, 0xf24, i386&No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Te
 
 // Move after swapping the bytes
 movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movbe, 0x60, Movbe&APX_F, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // Move with sign extend.
 movsb, 0xfbe, i386, Modrm|No_bSuf|No_sSuf, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
@@ -1315,13 +1320,16 @@ getsec, 0xf37, SMX, NoSuf, {}
 
 invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invept, 0xf3f0, EPT&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invvpid, 0xf3f1, EPT&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 
 // INVPCID instruction
 
 invpcid, 0x660f3882, INVPCID&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invpcid, 0x660f3882, INVPCID&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
+invpcid, 0xf3f2, INVPCID&APX_F, Modrm|NoSuf|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
 
 // SSSE3 instructions.
 
@@ -1422,6 +1430,8 @@ pcmpistri<sse42>, 0x660f3a63, <sse42:cpu>, Modrm|<sse42:attr>|NoSuf, { Imm8, Reg
 pcmpistrm<sse42>, 0x660f3a62, <sse42:cpu>, Modrm|<sse42:attr>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 crc32, 0xf20f38f0, SSE4_2, W|Modrm|No_sSuf|No_qSuf, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
 crc32, 0xf20f38f0, SSE4_2&x64, W|Modrm|No_wSuf|No_lSuf|No_sSuf, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
+crc32, 0xf0, APX_F, W|Modrm|No_sSuf|No_qSuf|EVexMap4, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
+crc32, 0xf0, APX_F, W|Modrm|No_wSuf|No_lSuf|No_sSuf|EVexMap4, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
 
 // xsave/xrstor New Instructions.
 
@@ -1836,14 +1846,14 @@ xtest, 0xf01d6, HLE|RTM, NoSuf, {}
 
 // BMI2 instructions.
 
-bzhi, 0xf5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-mulx, 0xf2f6, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pdep, 0xf2f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pext, 0xf3f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-rorx, 0xf2f0, BMI2, Modrm|CheckOperandSize|Vex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-sarx, 0xf3f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shlx, 0x66f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shrx, 0xf2f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+bzhi, 0xf5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+mulx, 0xf2f6, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pdep, 0xf2f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pext, 0xf3f5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+rorx, 0xf2f0, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+sarx, 0xf3f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shlx, 0x66f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shrx, 0xf2f7, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 
 // FMA4 instructions
 
@@ -1913,11 +1923,11 @@ lwpins, 0x12/0, LWP, Modrm|SpaceXOP0A|NoSuf|VexVVVV|Vex, { Imm32|Imm32S, Reg32|U
 
 // BMI instructions
 
-andn, 0xf2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsi, 0xf3/3, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsmsk, 0xf3/2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsr, 0xf3/1, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+andn, 0xf2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+bextr, 0xf7, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsi, 0xf3/3, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsmsk, 0xf3/2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsr, 0xf3/1, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 tzcnt, 0xf30fbc, BMI, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // TBM instructions
@@ -2046,13 +2056,21 @@ bndldx, 0x0f1a, MPX, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND }
 
 // SHA instructions.
 sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1nexte, 0xf38c8, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1msg1, 0xf38c9, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha1msg2, 0xf38ca, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256rnds2, 0xdb, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256msg1, 0xf38cc, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256msg1, 0xdc, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 sha256msg2, 0xf38cd, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+sha256msg2, 0xdd, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
 
 // SHA512 instructions.
 
@@ -2114,9 +2132,9 @@ kor<bw>, 0x<bw:kpfx>45, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { R
 kxnor<bw>, 0x<bw:kpfx>46, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 kxor<bw>, 0x<bw:kpfx>47, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 
-kmov<bw>, 0x<bw:kpfx>90, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
-kmov<bw>, 0x<bw:kpfx>91, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, <bw:elem>|Unspecified|BaseIndex }
-kmov<bw>, 0x<bw:kpfx>92, <bw:kcpu>, D|Modrm|Vex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
+kmov<bw>, 0x<bw:kpfx>90, APX_F(<bw:kcpu>), Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
+kmov<bw>, 0x<bw:kpfx>91, APX_F(<bw:kcpu>), Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { RegMask, <bw:elem>|Unspecified|BaseIndex }
+kmov<bw>, 0x<bw:kpfx>92, APX_F(<bw:kcpu>), D|Modrm|Vex128|EVex128|Space0F|VexW0|NoSuf, { Reg32, RegMask }
 
 knot<bw>, 0x<bw:kpfx>44, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
 kortest<bw>, 0x<bw:kpfx>98, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
@@ -2591,9 +2609,9 @@ vpmovzxdq, 0x6635, AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift
 kadd<dq>, 0x<dq:kpfx>4a, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kand<dq>, 0x<dq:kpfx>41, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kandn<dq>, 0x<dq:kpfx>42, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf|Optimize, { RegMask, RegMask, RegMask }
-kmov<dq>, 0x<dq:kpfx>90, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
-kmov<dq>, 0x<dq:kpfx>91, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
-kmov<dq>, 0xf292, AVX512BW, D|Modrm|Vex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
+kmov<dq>, 0x<dq:kpfx>90, APX_F(AVX512BW), Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
+kmov<dq>, 0x<dq:kpfx>91, APX_F(AVX512BW), Modrm|Vex128|EVex128|Space0F|VexW1|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
+kmov<dq>, 0xf292, APX_F(AVX512BW), D|Modrm|Vex128|EVex128|Space0F|<dq:vexw64>|NoSuf, { <dq:gpr>, RegMask }
 knot<dq>, 0x<dq:kpfx>44, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
 kor<dq>, 0x<dq:kpfx>45, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|NoSuf, { RegMask, RegMask, RegMask }
 kortest<dq>, 0x<dq:kpfx>98, AVX512BW, Modrm|Vex128|Space0F|VexW1|NoSuf, { RegMask, RegMask }
@@ -2992,9 +3010,13 @@ rdsspq, 0xf30f1e/1, SHSTK&x64, Modrm|NoSuf, { Reg64 }
 saveprevssp, 0xf30f01ea, SHSTK, NoSuf, {}
 rstorssp, 0xf30f01/5, SHSTK, Modrm|NoSuf, { Qword|Unspecified|BaseIndex }
 wrssd, 0x0f38f6, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32, Dword|Unspecified|BaseIndex }
+wrssd, 0x66, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4, { Reg32, Dword|Unspecified|BaseIndex }
 wrssq, 0x0f38f6, SHSTK&x64, Modrm|NoSuf|Size64, { Reg64, Qword|Unspecified|BaseIndex }
+wrssq, 0x66, SHSTK&APX_F, Modrm|NoSuf|Size64|EVexMap4, { Reg64, Qword|Unspecified|BaseIndex }
 wrussd, 0x660f38f5, SHSTK, Modrm|IgnoreSize|NoSuf, { Reg32, Dword|Unspecified|BaseIndex }
+wrussd, 0x6665, SHSTK&APX_F, Modrm|IgnoreSize|NoSuf|EVexMap4, { Reg32, Dword|Unspecified|BaseIndex }
 wrussq, 0x660f38f5, SHSTK&x64, Modrm|NoSuf, { Reg64, Qword|Unspecified|BaseIndex }
+wrussq, 0x6665, SHSTK&APX_F, Modrm|NoSuf|EVexMap4, { Reg64, Qword|Unspecified|BaseIndex }
 setssbsy, 0xf30f01e8, SHSTK, NoSuf, {}
 clrssbsy, 0xf30fae/6, SHSTK, Modrm|NoSuf, { Qword|Unspecified|BaseIndex }
 endbr64, 0xf30f1efa, IBT, NoSuf, {}
@@ -3042,7 +3064,9 @@ cldemote, 0x0f1c/0, CLDEMOTE, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
 // MOVDIR[I,64B] instructions.
 
 movdiri, 0xf38f9, MOVDIRI, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+movdiri, 0xf9, MOVDIRI&APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 movdir64b, 0x660f38f8, MOVDIR64B, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movdir64b, 0x66f8, MOVDIR64B&APX_F, Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 
 // MOVEDIR instructions end.
 
@@ -3071,7 +3095,9 @@ vcvtneps2bf16<Vxy>, 0xf372, AVX_NE_CONVERT, Modrm|<Vxy:vex>|Space0F38|VexW0|NoSu
 // ENQCMD instructions.
 
 enqcmd, 0xf20f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+enqcmd, 0xf2f8, APX_F(ENQCMD), Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 enqcmds, 0xf30f38f8, ENQCMD, Modrm|AddrPrefixOpReg|NoSuf, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+enqcmds, 0xf3f8, APX_F(ENQCMD), Modrm|AddrPrefixOpReg|NoSuf|EVexMap4, { Unspecified|BaseIndex, Reg32|Reg64 }
 
 // ENQCMD instructions end.
 
@@ -3132,8 +3158,8 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
 
 // AMX instructions.
 
-ldtilecfg, 0x49/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
-sttilecfg, 0x6649/0, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
+ldtilecfg, 0x49/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
+sttilecfg, 0x6649/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
 
 tcmmimfp16ps, 0x666c, AMX_COMPLEX, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 tcmmrlfp16ps, 0x6c, AMX_COMPLEX, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
@@ -3145,9 +3171,9 @@ tdpbuud, 0x5e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
 tdpbusd, 0x665e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 tdpbsud, 0xf35e, AMX_INT8, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 
-tileloadd, 0xf24b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
-tileloaddt1, 0x664b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
-tilestored, 0xf34b, AMX_TILE, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }
+tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tileloaddt1, 0x664b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tilestored, 0xf34b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }
 
 tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {}
 
@@ -3159,15 +3185,25 @@ tilezero, 0xf249, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
 
 loadiwkey, 0xf30f38dc, KL, Load|Modrm|NoSuf, { RegXMM, RegXMM }
 encodekey128, 0xf30f38fa, KL, Modrm|NoSuf, { Reg32, Reg32 }
+encodekey128, 0xf3da, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32, Reg32 }
 encodekey256, 0xf30f38fb, KL, Modrm|NoSuf, { Reg32, Reg32 }
+encodekey256, 0xf3db, KL&APX_F, Modrm|NoSuf|EVexMap4, { Reg32, Reg32 }
 aesenc128kl, 0xf30f38dc, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesenc128kl, 0xf3dc, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesdec128kl, 0xf30f38dd, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesdec128kl, 0xf3dd, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesenc256kl, 0xf30f38de, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesenc256kl, 0xf3de, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesdec256kl, 0xf30f38df, KL, Modrm|NoSuf, { Unspecified|BaseIndex, RegXMM }
+aesdec256kl, 0xf3df, KL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex, RegXMM }
 aesencwide128kl, 0xf30f38d8/0, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesencwide128kl, 0xf3d8/0, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesdecwide128kl, 0xf30f38d8/1, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesdecwide128kl, 0xf3d8/1, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesencwide256kl, 0xf30f38d8/2, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesencwide256kl, 0xf3d8/2, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 aesdecwide256kl, 0xf30f38d8/3, WideKL, Modrm|NoSuf, { Unspecified|BaseIndex }
+aesdecwide256kl, 0xf3d8/3, WideKL&APX_F, Modrm|NoSuf|EVexMap4, { Unspecified|BaseIndex }
 
 // KEYLOCKER instructions end.
 
@@ -3315,7 +3351,7 @@ prefetchit1, 0xf18/6, PREFETCHI, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
 
 // CMPCCXADD instructions.
 
-cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD, Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+cmp<cc>xadd, 0x66e<cc:opc>, APX_F(CMPCCXADD), Modrm|Vex|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 
 // CMPCCXADD instructions end.
 
@@ -3335,9 +3371,13 @@ wrmsrlist, 0xf30f01c6, MSRLIST, NoSuf, {}
 // RAO-INT instructions.
 
 aadd, 0xf38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aadd, 0xfc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 aand, 0x660f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aand, 0x66fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 aor, 0xf20f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+aor, 0xf2fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 axor, 0xf30f38fc, RAO_INT, Modrm|IgnoreSize|CheckOperandSize|NoSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+axor, 0xf3fc, RAO_INT&APX_F, Modrm|IgnoreSize|CheckOperandSize|NoSuf|EVexMap4, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 
 // RAO-INT instructions end.
 
-- 
2.25.1


Thread overview: 30+ messages
2023-12-28  1:27 [PATCH V5 0/9] Support Intel APX EGPR Cui, Lili
2023-12-28  1:27 ` [PATCH V5 1/9] Support APX GPR32 with rex2 prefix Cui, Lili
2023-12-28  1:53   ` H.J. Lu
2024-01-04  8:02     ` Jan Beulich
2024-01-04 11:27       ` Cui, Lili
2024-01-05 14:45   ` Jan Beulich
2024-01-08  3:41     ` Cui, Lili
2023-12-28  1:27 ` [PATCH V5 2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions Cui, Lili
2023-12-28  1:54   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 3/9] Support APX GPR32 with extend evex prefix Cui, Lili
2023-12-28  1:54   ` H.J. Lu
2023-12-28 13:48     ` Cui, Lili [this message]
2023-12-28  1:27 ` [PATCH V5 4/9] Add tests for " Cui, Lili
2023-12-28  1:54   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 5/9] Support APX NDD Cui, Lili
2023-12-28  1:55   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 6/9] Support APX Push2/Pop2 Cui, Lili
2023-12-28  1:55   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 7/9] Support APX pushp/popp Cui, Lili
2023-12-28  1:56   ` H.J. Lu
2023-12-28  1:27 ` [PATCH V5 8/9] Support APX NDD optimized encoding Cui, Lili
2023-12-28  1:56   ` H.J. Lu
2024-01-05 14:36   ` Jan Beulich
2024-01-08  2:49     ` Hu, Lin1
2023-12-28  1:27 ` [PATCH V5 9/9] Support APX JMPABS for disassembler Cui, Lili
2023-12-28  1:56   ` H.J. Lu
2024-01-05 12:08   ` Jan Beulich
2024-01-08  2:32     ` Hu, Lin1
2024-01-08  7:41       ` Jan Beulich
2024-01-08  7:44         ` Hu, Lin1
