public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/4] x86: operand handling consolidation
@ 2017-12-15  9:54 Jan Beulich
  2017-12-15 10:32 ` [PATCH 1/4] x86: replace Reg8, Reg16, Reg32, and Reg64 Jan Beulich
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Jan Beulich @ 2017-12-15  9:54 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

This is another step towards a better maintainable opcode table.
The 4th patch does only a first part of the then possible template
reduction. Something similar ought to be possible for AVX512 ones,
just that the broadcasting then needs dealing with in tc-i386.c.

As to further table size reduction, there are two choices:
- keep Reg8 etc and drop now redundant Byte from such
  templates
- replace Reg8 etc by just Reg, keeping Byte etc
I'd prefer the former (and the 4th patch moves along those
lines for RegXMM and RegYMM), not the least for the bigger
win in readability.

1: replace Reg8, Reg16, Reg32, and Reg64
2: drop FloatReg and FloatAcc
3: fold RegXMM/RegYMM/RegZMM into RegSIMD
4: fold certain AVX and AVX2 templates

Just for reference I'm also going to send a 5th patch which I've
used to find the templates which need splitting (in patch 1, other
than I had feared there were no splits necessary in patch 3). I
didn't want to rely on just the testsuite to spot them all. I don't
anticipate that this patch will want committing, but if others
think it's worthwhile I could certainly bring it into committable
shape.

Jan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/4] x86: replace Reg8, Reg16, Reg32, and Reg64
  2017-12-15  9:54 [PATCH 0/4] x86: operand handling consolidation Jan Beulich
@ 2017-12-15 10:32 ` Jan Beulich
  2017-12-15 12:46   ` H.J. Lu
  2017-12-15 10:33 ` [PATCH 2/4] x86: drop FloatReg and FloatAcc Jan Beulich
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2017-12-15 10:32 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

Use a combination of a single new Reg bit and Byte, Word, Dword, or
Qword instead.

Besides shrinking the number of operand type bits this has the benefit
of making register handling more similar to accumulator handling (a
generic flag is being accompanied by a "size qualifier"). It requires,
however, to split a few insn templates, as it is no longer correct to
have combinations like Reg32|Reg64|Byte. This slight growth in size will
hopefully be outweighed by this change paving the road for folding a
presumably much larger number of templates later on.

gas/
2017-12-15  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (operand_type_check, pi): Switch .reg<N> to
	just .reg.
	(operand_size_match): Qualify .anysize check with .reg one.
	Extend .acc check to also cover .reg.
	(operand_type_register_match): Drop m0 and m1 parameters. Switch
	.reg<N> to .byte/.word/.dword/.qword. Drop .acc special
	handling. 
	(md_assemble): Expand .reg8 checks to .reg plus .bytes ones.
	(optimize_imm, process_suffix, check_byte_reg, check_long_reg,
	check_qword_reg, check_word_reg): Expand .reg<N> checks to .reg
	plus size ones.
	(match_template): Drop arguments from calls to
	operand_type_register_match().
	(build_modrm_byte, i386_addressing_mode, i386_index_check,
	parse_real_register): Replace .reg<N> checks.
	* config/tc-i386-intel.c (i386_intel_simplify,
	i386_intel_operand): Switch .reg16 to .word.

opcodes/
2017-12-15  Jan Beulich  <jbeulich@suse.com>

	* i386-gen.c (operand_type_shorthands): New.
	(opcode_modifiers): Replace Reg<N> with just Reg.
	(set_bitfield_from_cpu_flag_init): Rename to
	set_bitfield_from_shorthand. Drop value parameter. Process
	operand_type_shorthands.
	(set_bitfield): Adjust call accordingly.
	* i386-opc.h (enum of operand types): Replace Reg<N> with just
	Reg.
	(union i386_operand_type): Replace reg<N> with just reg.
	* i386-opc.tbl (extractps, pextrb, pextrw, pinsrb, pinsrw,
	vextractps, vpextrb, vpextrw, vpinsrb, vpinsrw): Split into
	separate register and memory forms.
	* i386-reg.tbl (al): Drop Byte.
	(ax): Drop Word.
	(eax): Drop Dword.
	(rax): Drop Qword.
	* i386-init.h, i386-tbl.h: Re-generate.

--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -451,7 +451,7 @@ static int i386_intel_simplify (expressi
 	    {
 	      resolve_expression (scale);
 	      if (scale->X_op != O_constant
-		  || intel_state.index->reg_type.bitfield.reg16)
+		  || intel_state.index->reg_type.bitfield.word)
 		scale->X_add_number = 0;
 	      intel_state.scale_factor *= scale->X_add_number;
 	    }
@@ -897,8 +897,8 @@ i386_intel_operand (char *operand_string
 	 mode we have to do this here.  */
       if (intel_state.base
 	  && intel_state.index
-	  && intel_state.base->reg_type.bitfield.reg16
-	  && intel_state.index->reg_type.bitfield.reg16
+	  && intel_state.base->reg_type.bitfield.word
+	  && intel_state.index->reg_type.bitfield.word
 	  && intel_state.base->reg_num >= 6
 	  && intel_state.index->reg_num < 6)
 	{
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1775,10 +1775,7 @@ operand_type_check (i386_operand_type t,
   switch (c)
     {
     case reg:
-      return (t.bitfield.reg8
-	      || t.bitfield.reg16
-	      || t.bitfield.reg32
-	      || t.bitfield.reg64);
+      return t.bitfield.reg;
 
     case imm:
       return (t.bitfield.imm8
@@ -1867,10 +1864,12 @@ operand_size_match (const insn_template
   /* Check memory and accumulator operand size.  */
   for (j = 0; j < i.operands; j++)
     {
-      if (t->operand_types[j].bitfield.anysize)
+      if (!i.types[j].bitfield.reg && t->operand_types[j].bitfield.anysize)
 	continue;
 
-      if (t->operand_types[j].bitfield.acc && !match_reg_size (t, j))
+      if ((t->operand_types[j].bitfield.reg
+	   || t->operand_types[j].bitfield.acc)
+	  && !match_reg_size (t, j))
 	{
 	  match = 0;
 	  break;
@@ -1898,7 +1897,8 @@ mismatch:
   match = 1;
   for (j = 0; j < 2; j++)
     {
-      if (t->operand_types[j].bitfield.acc
+      if ((t->operand_types[j].bitfield.reg
+	   || t->operand_types[j].bitfield.acc)
 	  && !match_reg_size (t, j ? 0 : 1))
 	goto mismatch;
 
@@ -1940,14 +1940,11 @@ mismatch:
 }
 
 /* If given types g0 and g1 are registers they must be of the same type
-   unless the expected operand type register overlap is null.
-   Note that Acc in a template matches every size of reg.  */
+   unless the expected operand type register overlap is null.  */
 
 static INLINE int
-operand_type_register_match (i386_operand_type m0,
-			     i386_operand_type g0,
+operand_type_register_match (i386_operand_type g0,
 			     i386_operand_type t0,
-			     i386_operand_type m1,
 			     i386_operand_type g1,
 			     i386_operand_type t1)
 {
@@ -1957,32 +1954,16 @@ operand_type_register_match (i386_operan
   if (!operand_type_check (g1, reg))
     return 1;
 
-  if (g0.bitfield.reg8 == g1.bitfield.reg8
-      && g0.bitfield.reg16 == g1.bitfield.reg16
-      && g0.bitfield.reg32 == g1.bitfield.reg32
-      && g0.bitfield.reg64 == g1.bitfield.reg64)
+  if (g0.bitfield.byte == g1.bitfield.byte
+      && g0.bitfield.word == g1.bitfield.word
+      && g0.bitfield.dword == g1.bitfield.dword
+      && g0.bitfield.qword == g1.bitfield.qword)
     return 1;
 
-  if (m0.bitfield.acc)
-    {
-      t0.bitfield.reg8 = 1;
-      t0.bitfield.reg16 = 1;
-      t0.bitfield.reg32 = 1;
-      t0.bitfield.reg64 = 1;
-    }
-
-  if (m1.bitfield.acc)
-    {
-      t1.bitfield.reg8 = 1;
-      t1.bitfield.reg16 = 1;
-      t1.bitfield.reg32 = 1;
-      t1.bitfield.reg64 = 1;
-    }
-
-  if (!(t0.bitfield.reg8 & t1.bitfield.reg8)
-      && !(t0.bitfield.reg16 & t1.bitfield.reg16)
-      && !(t0.bitfield.reg32 & t1.bitfield.reg32)
-      && !(t0.bitfield.reg64 & t1.bitfield.reg64))
+  if (!(t0.bitfield.byte & t1.bitfield.byte)
+      && !(t0.bitfield.word & t1.bitfield.word)
+      && !(t0.bitfield.dword & t1.bitfield.dword)
+      && !(t0.bitfield.qword & t1.bitfield.qword))
     return 1;
 
   i.error = register_type_mismatch;
@@ -2821,10 +2802,7 @@ pi (char *line, i386_insn *x)
       fprintf (stdout, "    #%d:  ", j + 1);
       pt (x->types[j]);
       fprintf (stdout, "\n");
-      if (x->types[j].bitfield.reg8
-	  || x->types[j].bitfield.reg16
-	  || x->types[j].bitfield.reg32
-	  || x->types[j].bitfield.reg64
+      if (x->types[j].bitfield.reg
 	  || x->types[j].bitfield.regmmx
 	  || x->types[j].bitfield.regxmm
 	  || x->types[j].bitfield.regymm
@@ -3856,12 +3834,12 @@ md_assemble (char *line)
      instruction already has a prefix, we need to convert old
      registers to new ones.  */
 
-  if ((i.types[0].bitfield.reg8
+  if ((i.types[0].bitfield.reg && i.types[0].bitfield.byte
        && (i.op[0].regs->reg_flags & RegRex64) != 0)
-      || (i.types[1].bitfield.reg8
+      || (i.types[1].bitfield.reg && i.types[1].bitfield.byte
 	  && (i.op[1].regs->reg_flags & RegRex64) != 0)
-      || ((i.types[0].bitfield.reg8
-	   || i.types[1].bitfield.reg8)
+      || (((i.types[0].bitfield.reg && i.types[0].bitfield.byte)
+	   || (i.types[1].bitfield.reg && i.types[1].bitfield.byte))
 	  && i.rex != 0))
     {
       int x;
@@ -3870,7 +3848,7 @@ md_assemble (char *line)
       for (x = 0; x < 2; x++)
 	{
 	  /* Look for 8 bit operand that uses old registers.  */
-	  if (i.types[x].bitfield.reg8
+	  if (i.types[x].bitfield.reg && i.types[x].bitfield.byte
 	      && (i.op[x].regs->reg_flags & RegRex64) == 0)
 	    {
 	      /* In case it is "hi" register, give up.  */
@@ -4377,22 +4355,22 @@ optimize_imm (void)
 	 but the following works for instructions with immediates.
 	 In any case, we can't set i.suffix yet.  */
       for (op = i.operands; --op >= 0;)
-	if (i.types[op].bitfield.reg8)
+	if (i.types[op].bitfield.reg && i.types[op].bitfield.byte)
 	  {
 	    guess_suffix = BYTE_MNEM_SUFFIX;
 	    break;
 	  }
-	else if (i.types[op].bitfield.reg16)
+	else if (i.types[op].bitfield.reg && i.types[op].bitfield.word)
 	  {
 	    guess_suffix = WORD_MNEM_SUFFIX;
 	    break;
 	  }
-	else if (i.types[op].bitfield.reg32)
+	else if (i.types[op].bitfield.reg && i.types[op].bitfield.dword)
 	  {
 	    guess_suffix = LONG_MNEM_SUFFIX;
 	    break;
 	  }
-	else if (i.types[op].bitfield.reg64)
+	else if (i.types[op].bitfield.reg && i.types[op].bitfield.qword)
 	  {
 	    guess_suffix = QWORD_MNEM_SUFFIX;
 	    break;
@@ -5084,9 +5062,9 @@ match_template (char mnem_suffix)
 	  if (!operand_type_match (overlap0, i.types[0])
 	      || !operand_type_match (overlap1, i.types[1])
 	      || (check_register
-		  && !operand_type_register_match (overlap0, i.types[0],
+		  && !operand_type_register_match (i.types[0],
 						   operand_types[0],
-						   overlap1, i.types[1],
+						   i.types[1],
 						   operand_types[1])))
 	    {
 	      /* Check if other direction is valid ...  */
@@ -5100,10 +5078,8 @@ check_reverse:
 	      if (!operand_type_match (overlap0, i.types[0])
 		  || !operand_type_match (overlap1, i.types[1])
 		  || (check_register
-		      && !operand_type_register_match (overlap0,
-						       i.types[0],
+		      && !operand_type_register_match (i.types[0],
 						       operand_types[1],
-						       overlap1,
 						       i.types[1],
 						       operand_types[0])))
 		{
@@ -5144,10 +5120,8 @@ check_reverse:
 		{
 		case 5:
 		  if (!operand_type_match (overlap4, i.types[4])
-		      || !operand_type_register_match (overlap3,
-						       i.types[3],
+		      || !operand_type_register_match (i.types[3],
 						       operand_types[3],
-						       overlap4,
 						       i.types[4],
 						       operand_types[4]))
 		    continue;
@@ -5155,10 +5129,8 @@ check_reverse:
 		case 4:
 		  if (!operand_type_match (overlap3, i.types[3])
 		      || (check_register
-			  && !operand_type_register_match (overlap2,
-							   i.types[2],
+			  && !operand_type_register_match (i.types[2],
 							   operand_types[2],
-							   overlap3,
 							   i.types[3],
 							   operand_types[3])))
 		    continue;
@@ -5170,10 +5142,8 @@ check_reverse:
 		     register consistency between operands 2 and 3.  */
 		  if (!operand_type_match (overlap2, i.types[2])
 		      || (check_register
-			  && !operand_type_register_match (overlap1,
-							   i.types[1],
+			  && !operand_type_register_match (i.types[1],
 							   operand_types[1],
-							   overlap2,
 							   i.types[2],
 							   operand_types[2])))
 		    continue;
@@ -5381,16 +5351,16 @@ process_suffix (void)
 	     type. */
 	  if (i.tm.base_opcode == 0xf20f38f1)
 	    {
-	      if (i.types[0].bitfield.reg16)
+	      if (i.types[0].bitfield.reg && i.types[0].bitfield.word)
 		i.suffix = WORD_MNEM_SUFFIX;
-	      else if (i.types[0].bitfield.reg32)
+	      else if (i.types[0].bitfield.reg && i.types[0].bitfield.dword)
 		i.suffix = LONG_MNEM_SUFFIX;
-	      else if (i.types[0].bitfield.reg64)
+	      else if (i.types[0].bitfield.reg && i.types[0].bitfield.qword)
 		i.suffix = QWORD_MNEM_SUFFIX;
 	    }
 	  else if (i.tm.base_opcode == 0xf20f38f0)
 	    {
-	      if (i.types[0].bitfield.reg8)
+	      if (i.types[0].bitfield.reg && i.types[0].bitfield.byte)
 		i.suffix = BYTE_MNEM_SUFFIX;
 	    }
 
@@ -5411,22 +5381,22 @@ process_suffix (void)
 		if (!i.tm.operand_types[op].bitfield.inoutportreg
 		    && !i.tm.operand_types[op].bitfield.shiftcount)
 		  {
-		    if (i.types[op].bitfield.reg8)
+		    if (i.types[op].bitfield.reg && i.types[op].bitfield.byte)
 		      {
 			i.suffix = BYTE_MNEM_SUFFIX;
 			break;
 		      }
-		    else if (i.types[op].bitfield.reg16)
+		    if (i.types[op].bitfield.reg && i.types[op].bitfield.word)
 		      {
 			i.suffix = WORD_MNEM_SUFFIX;
 			break;
 		      }
-		    else if (i.types[op].bitfield.reg32)
+		    if (i.types[op].bitfield.reg && i.types[op].bitfield.dword)
 		      {
 			i.suffix = LONG_MNEM_SUFFIX;
 			break;
 		      }
-		    else if (i.types[op].bitfield.reg64)
+		    if (i.types[op].bitfield.reg && i.types[op].bitfield.qword)
 		      {
 			i.suffix = QWORD_MNEM_SUFFIX;
 			break;
@@ -5583,9 +5553,9 @@ process_suffix (void)
 	  /* The address size override prefix changes the size of the
 	     first operand.  */
 	  if ((flag_code == CODE_32BIT
-	       && i.op->regs[0].reg_type.bitfield.reg16)
+	       && i.op->regs[0].reg_type.bitfield.word)
 	      || (flag_code != CODE_32BIT
-		  && i.op->regs[0].reg_type.bitfield.reg32))
+		  && i.op->regs[0].reg_type.bitfield.dword))
 	    if (!add_prefix (ADDR_PREFIX_OPCODE))
 	      return 0;
 	}
@@ -5642,10 +5612,14 @@ check_byte_reg (void)
 
   for (op = i.operands; --op >= 0;)
     {
+      /* Skip non-register operands. */
+      if (!i.types[op].bitfield.reg)
+	continue;
+
       /* If this is an eight bit register, it's OK.  If it's the 16 or
 	 32 bit version of an eight bit register, we will just use the
 	 low portion, and that's OK too.  */
-      if (i.types[op].bitfield.reg8)
+      if (i.types[op].bitfield.byte)
 	continue;
 
       /* I/O port address operands are OK too.  */
@@ -5656,9 +5630,9 @@ check_byte_reg (void)
       if (i.tm.base_opcode == 0xf20f38f0)
 	continue;
 
-      if ((i.types[op].bitfield.reg16
-	   || i.types[op].bitfield.reg32
-	   || i.types[op].bitfield.reg64)
+      if ((i.types[op].bitfield.word
+	   || i.types[op].bitfield.dword
+	   || i.types[op].bitfield.qword)
 	  && i.op[op].regs->reg_num < 4
 	  /* Prohibit these changes in 64bit mode, since the lowering
 	     would be more complicated.  */
@@ -5668,7 +5642,7 @@ check_byte_reg (void)
 	  if (!quiet_warnings)
 	    as_warn (_("using `%s%s' instead of `%s%s' due to `%c' suffix"),
 		     register_prefix,
-		     (i.op[op].regs + (i.types[op].bitfield.reg16
+		     (i.op[op].regs + (i.types[op].bitfield.word
 				       ? REGNAM_AL - REGNAM_AX
 				       : REGNAM_AL - REGNAM_EAX))->reg_name,
 		     register_prefix,
@@ -5678,9 +5652,7 @@ check_byte_reg (void)
 	  continue;
 	}
       /* Any other register is bad.  */
-      if (i.types[op].bitfield.reg16
-	  || i.types[op].bitfield.reg32
-	  || i.types[op].bitfield.reg64
+      if (i.types[op].bitfield.reg
 	  || i.types[op].bitfield.regmmx
 	  || i.types[op].bitfield.regxmm
 	  || i.types[op].bitfield.regymm
@@ -5710,12 +5682,16 @@ check_long_reg (void)
   int op;
 
   for (op = i.operands; --op >= 0;)
+    /* Skip non-register operands. */
+    if (!i.types[op].bitfield.reg)
+      continue;
     /* Reject eight bit registers, except where the template requires
        them. (eg. movzb)  */
-    if (i.types[op].bitfield.reg8
-	&& (i.tm.operand_types[op].bitfield.reg16
-	    || i.tm.operand_types[op].bitfield.reg32
-	    || i.tm.operand_types[op].bitfield.acc))
+    else if (i.types[op].bitfield.byte
+	     && (i.tm.operand_types[op].bitfield.reg
+		 || i.tm.operand_types[op].bitfield.acc)
+	     && (i.tm.operand_types[op].bitfield.word
+		 || i.tm.operand_types[op].bitfield.dword))
       {
 	as_bad (_("`%s%s' not allowed with `%s%c'"),
 		register_prefix,
@@ -5726,9 +5702,10 @@ check_long_reg (void)
       }
     /* Warn if the e prefix on a general reg is missing.  */
     else if ((!quiet_warnings || flag_code == CODE_64BIT)
-	     && i.types[op].bitfield.reg16
-	     && (i.tm.operand_types[op].bitfield.reg32
-		 || i.tm.operand_types[op].bitfield.acc))
+	     && i.types[op].bitfield.word
+	     && (i.tm.operand_types[op].bitfield.reg
+		 || i.tm.operand_types[op].bitfield.acc)
+	     && i.tm.operand_types[op].bitfield.dword)
       {
 	/* Prohibit these changes in the 64bit mode, since the
 	   lowering is more complicated.  */
@@ -5747,9 +5724,10 @@ check_long_reg (void)
 #endif
       }
     /* Warn if the r prefix on a general reg is present.  */
-    else if (i.types[op].bitfield.reg64
-	     && (i.tm.operand_types[op].bitfield.reg32
-		 || i.tm.operand_types[op].bitfield.acc))
+    else if (i.types[op].bitfield.qword
+	     && (i.tm.operand_types[op].bitfield.reg
+		 || i.tm.operand_types[op].bitfield.acc)
+	     && i.tm.operand_types[op].bitfield.dword)
       {
 	if (intel_syntax
 	    && i.tm.opcode_modifier.toqword
@@ -5775,12 +5753,16 @@ check_qword_reg (void)
   int op;
 
   for (op = i.operands; --op >= 0; )
+    /* Skip non-register operands. */
+    if (!i.types[op].bitfield.reg)
+      continue;
     /* Reject eight bit registers, except where the template requires
        them. (eg. movzb)  */
-    if (i.types[op].bitfield.reg8
-	&& (i.tm.operand_types[op].bitfield.reg16
-	    || i.tm.operand_types[op].bitfield.reg32
-	    || i.tm.operand_types[op].bitfield.acc))
+    else if (i.types[op].bitfield.byte
+	     && (i.tm.operand_types[op].bitfield.reg
+		 || i.tm.operand_types[op].bitfield.acc)
+	     && (i.tm.operand_types[op].bitfield.word
+		 || i.tm.operand_types[op].bitfield.dword))
       {
 	as_bad (_("`%s%s' not allowed with `%s%c'"),
 		register_prefix,
@@ -5790,10 +5772,11 @@ check_qword_reg (void)
 	return 0;
       }
     /* Warn if the r prefix on a general reg is missing.  */
-    else if ((i.types[op].bitfield.reg16
-	      || i.types[op].bitfield.reg32)
-	     && (i.tm.operand_types[op].bitfield.reg64
-		 || i.tm.operand_types[op].bitfield.acc))
+    else if ((i.types[op].bitfield.word
+	      || i.types[op].bitfield.dword)
+	     && (i.tm.operand_types[op].bitfield.reg
+		 || i.tm.operand_types[op].bitfield.acc)
+	     && i.tm.operand_types[op].bitfield.qword)
       {
 	/* Prohibit these changes in the 64bit mode, since the
 	   lowering is more complicated.  */
@@ -5820,12 +5803,16 @@ check_word_reg (void)
 {
   int op;
   for (op = i.operands; --op >= 0;)
+    /* Skip non-register operands. */
+    if (!i.types[op].bitfield.reg)
+      continue;
     /* Reject eight bit registers, except where the template requires
        them. (eg. movzb)  */
-    if (i.types[op].bitfield.reg8
-	&& (i.tm.operand_types[op].bitfield.reg16
-	    || i.tm.operand_types[op].bitfield.reg32
-	    || i.tm.operand_types[op].bitfield.acc))
+    else if (i.types[op].bitfield.byte
+	     && (i.tm.operand_types[op].bitfield.reg
+		 || i.tm.operand_types[op].bitfield.acc)
+	     && (i.tm.operand_types[op].bitfield.word
+		 || i.tm.operand_types[op].bitfield.dword))
       {
 	as_bad (_("`%s%s' not allowed with `%s%c'"),
 		register_prefix,
@@ -5836,10 +5823,11 @@ check_word_reg (void)
       }
     /* Warn if the e or r prefix on a general reg is present.  */
     else if ((!quiet_warnings || flag_code == CODE_64BIT)
-	     && (i.types[op].bitfield.reg32
-		 || i.types[op].bitfield.reg64)
-	     && (i.tm.operand_types[op].bitfield.reg16
-		 || i.tm.operand_types[op].bitfield.acc))
+	     && (i.types[op].bitfield.dword
+		 || i.types[op].bitfield.qword)
+	     && (i.tm.operand_types[op].bitfield.reg
+		 || i.tm.operand_types[op].bitfield.acc)
+	     && i.tm.operand_types[op].bitfield.word)
       {
 	/* Prohibit these changes in the 64bit mode, since the
 	   lowering is more complicated.  */
@@ -6458,8 +6446,8 @@ build_modrm_byte (void)
 	      op = i.tm.operand_types[vvvv];
 	      op.bitfield.regmem = 0;
 	      if ((dest + 1) >= i.operands
-		  || (!op.bitfield.reg32
-		      && !op.bitfield.reg64
+		  || ((!op.bitfield.reg
+		       || (!op.bitfield.dword && !op.bitfield.qword))
 		      && !operand_type_equal (&op, &regxmm)
 		      && !operand_type_equal (&op, &regymm)
 		      && !operand_type_equal (&op, &regzmm)
@@ -6639,7 +6627,7 @@ build_modrm_byte (void)
 	      if (! i.disp_operands)
 		fake_zero_displacement = 1;
 	    }
-	  else if (i.base_reg->reg_type.bitfield.reg16)
+	  else if (i.base_reg->reg_type.bitfield.word)
 	    {
 	      gas_assert (!i.tm.opcode_modifier.vecsib);
 	      switch (i.base_reg->reg_num)
@@ -6819,10 +6807,7 @@ build_modrm_byte (void)
 	  unsigned int vex_reg = ~0;
 
 	  for (op = 0; op < i.operands; op++)
-	    if (i.types[op].bitfield.reg8
-		|| i.types[op].bitfield.reg16
-		|| i.types[op].bitfield.reg32
-		|| i.types[op].bitfield.reg64
+	    if (i.types[op].bitfield.reg
 		|| i.types[op].bitfield.regmmx
 		|| i.types[op].bitfield.regxmm
 		|| i.types[op].bitfield.regymm
@@ -6892,8 +6877,8 @@ build_modrm_byte (void)
 	    {
 	      i386_operand_type *type = &i.tm.operand_types[vex_reg];
 
-	      if (type->bitfield.reg32 != 1
-		  && type->bitfield.reg64 != 1
+	      if ((!type->bitfield.reg
+		   || (!type->bitfield.dword && !type->bitfield.qword))
 		  && !operand_type_equal (type, &regxmm)
 		  && !operand_type_equal (type, &regymm)
 		  && !operand_type_equal (type, &regzmm)
@@ -7351,7 +7336,7 @@ check_prefix:
 	     ==> need second modrm byte.  */
 	  if (i.rm.regmem == ESCAPE_TO_TWO_BYTE_ADDRESSING
 	      && i.rm.mode != 3
-	      && !(i.base_reg && i.base_reg->reg_type.bitfield.reg16))
+	      && !(i.base_reg && i.base_reg->reg_type.bitfield.word))
 	    FRAG_APPEND_1_CHAR ((i.sib.base << 0
 				 | i.sib.index << 3
 				 | i.sib.scale << 6));
@@ -8620,10 +8605,10 @@ i386_addressing_mode (void)
 	    {
 	      if (addr_reg->reg_num == RegEip
 		  || addr_reg->reg_num == RegEiz
-		  || addr_reg->reg_type.bitfield.reg32)
+		  || addr_reg->reg_type.bitfield.dword)
 		addr_mode = CODE_32BIT;
 	      else if (flag_code != CODE_64BIT
-		       && addr_reg->reg_type.bitfield.reg16)
+		       && addr_reg->reg_type.bitfield.word)
 		addr_mode = CODE_16BIT;
 
 	      if (addr_mode != flag_code)
@@ -8702,10 +8687,10 @@ i386_index_check (const char *operand_st
 	  if (i.mem_operands
 	      && i.base_reg
 	      && !((addr_mode == CODE_64BIT
-		    && i.base_reg->reg_type.bitfield.reg64)
+		    && i.base_reg->reg_type.bitfield.qword)
 		   || (addr_mode == CODE_32BIT
-		       ? i.base_reg->reg_type.bitfield.reg32
-		       : i.base_reg->reg_type.bitfield.reg16)))
+		       ? i.base_reg->reg_type.bitfield.dword
+		       : i.base_reg->reg_type.bitfield.word)))
 	    goto bad_address;
 
 	  as_warn (_("`%s' is not valid here (expected `%c%s%s%c')"),
@@ -8731,8 +8716,8 @@ bad_address:
 	  /* 32-bit/64-bit checks.  */
 	  if ((i.base_reg
 	       && (addr_mode == CODE_64BIT
-		   ? !i.base_reg->reg_type.bitfield.reg64
-		   : !i.base_reg->reg_type.bitfield.reg32)
+		   ? !i.base_reg->reg_type.bitfield.qword
+		   : !i.base_reg->reg_type.bitfield.dword)
 	       && (i.index_reg
 		   || (i.base_reg->reg_num
 		       != (addr_mode == CODE_64BIT ? RegRip : RegEip))))
@@ -8741,9 +8726,9 @@ bad_address:
 		  && !i.index_reg->reg_type.bitfield.regymm
 		  && !i.index_reg->reg_type.bitfield.regzmm
 		  && ((addr_mode == CODE_64BIT
-		       ? !(i.index_reg->reg_type.bitfield.reg64
+		       ? !(i.index_reg->reg_type.bitfield.qword
 			   || i.index_reg->reg_num == RegRiz)
-		       : !(i.index_reg->reg_type.bitfield.reg32
+		       : !(i.index_reg->reg_type.bitfield.dword
 			   || i.index_reg->reg_num == RegEiz))
 		      || !i.index_reg->reg_type.bitfield.baseindex)))
 	    goto bad_address;
@@ -8769,10 +8754,10 @@ bad_address:
 	{
 	  /* 16-bit checks.  */
 	  if ((i.base_reg
-	       && (!i.base_reg->reg_type.bitfield.reg16
+	       && (!i.base_reg->reg_type.bitfield.word
 		   || !i.base_reg->reg_type.bitfield.baseindex))
 	      || (i.index_reg
-		  && (!i.index_reg->reg_type.bitfield.reg16
+		  && (!i.index_reg->reg_type.bitfield.word
 		      || !i.index_reg->reg_type.bitfield.baseindex
 		      || !(i.base_reg
 			   && i.base_reg->reg_num < 6
@@ -9775,7 +9760,7 @@ parse_real_register (char *reg_string, c
   if (operand_type_all_zero (&r->reg_type))
     return (const reg_entry *) NULL;
 
-  if ((r->reg_type.bitfield.reg32
+  if ((r->reg_type.bitfield.dword
        || r->reg_type.bitfield.sreg3
        || r->reg_type.bitfield.control
        || r->reg_type.bitfield.debug
@@ -9824,7 +9809,7 @@ parse_real_register (char *reg_string, c
     }
 
   if (((r->reg_flags & (RegRex64 | RegRex))
-       || r->reg_type.bitfield.reg64)
+       || r->reg_type.bitfield.qword)
       && (!cpu_arch_flags.bitfield.cpulm
 	  || !operand_type_equal (&r->reg_type, &control))
       && flag_code != CODE_64BIT)
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -335,6 +335,14 @@ static initializer cpu_flag_init[] =
     "CpuAVX512_BITALG" },
 };
 
+static const initializer operand_type_shorthands[] =
+{
+  { "Reg8",     "Reg|Byte" },
+  { "Reg16",    "Reg|Word" },
+  { "Reg32",    "Reg|Dword" },
+  { "Reg64",    "Reg|Qword" },
+};
+
 static initializer operand_type_init[] =
 {
   { "OPERAND_TYPE_NONE",
@@ -631,10 +639,7 @@ static bitfield opcode_modifiers[] =
 
 static bitfield operand_types[] =
 {
-  BITFIELD (Reg8),
-  BITFIELD (Reg16),
-  BITFIELD (Reg32),
-  BITFIELD (Reg64),
+  BITFIELD (Reg),
   BITFIELD (FloatReg),
   BITFIELD (RegMMX),
   BITFIELD (RegXMM),
@@ -789,9 +794,8 @@ next_field (char *str, char sep, char **
 static void set_bitfield (char *, bitfield *, int, unsigned int, int);
 
 static int
-set_bitfield_from_cpu_flag_init (char *f, bitfield *array,
-				 int value, unsigned int size,
-				 int lineno)
+set_bitfield_from_shorthand (char *f, bitfield *array, unsigned int size,
+			     int lineno)
 {
   char *str, *next, *last;
   unsigned int i;
@@ -812,6 +816,22 @@ set_bitfield_from_cpu_flag_init (char *f
 	return 0;
       }
 
+  for (i = 0; i < ARRAY_SIZE (operand_type_shorthands); i++)
+    if (strcmp (operand_type_shorthands[i].name, f) == 0)
+      {
+	/* Turn on selective bits.  */
+	char *init = xstrdup (operand_type_shorthands[i].init);
+	last = init + strlen (init);
+	for (next = init; next && next < last; )
+	  {
+	    str = next_field (next, '|', &next, last);
+	    if (str)
+	      set_bitfield (str, array, 1, size, lineno);
+	  }
+	free (init);
+	return 0;
+      }
+
   return -1;
 }
 
@@ -862,8 +882,8 @@ set_bitfield (char *f, bitfield *array,
 	}
     }
 
-  /* Handle CPU_XXX_FLAGS.  */
-  if (!set_bitfield_from_cpu_flag_init (f, array, value, size, lineno))
+  /* Handle shorthands.  */
+  if (value == 1 && !set_bitfield_from_shorthand (f, array, size, lineno))
     return;
 
   if (lineno != -1)
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -685,14 +685,8 @@ typedef struct i386_opcode_modifier
 
 enum
 {
-  /* 8bit register */
-  Reg8 = 0,
-  /* 16bit register */
-  Reg16,
-  /* 32bit register */
-  Reg32,
-  /* 64bit register */
-  Reg64,
+  /* Register (qualified by Byte, Word, etc) */
+  Reg = 0,
   /* Floating pointer stack register */
   FloatReg,
   /* MMX register */
@@ -814,10 +808,7 @@ typedef union i386_operand_type
 {
   struct
     {
-      unsigned int reg8:1;
-      unsigned int reg16:1;
-      unsigned int reg32:1;
-      unsigned int reg64:1;
+      unsigned int reg:1;
       unsigned int floatreg:1;
       unsigned int regmmx:1;
       unsigned int regxmm:1;
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1269,13 +1269,18 @@ pavgw, 2, 0xfe3, None, 2, CpuSSE|Cpu3dno
 pavgw, 2, 0x66e3, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pavgw, 2, 0x660fe3, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pextrw, 3, 0x66c5, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64 }
-pextrw, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64|Word|Unspecified|BaseIndex }
+pextrw, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64|RegMem }
+pextrw, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Word|Unspecified|BaseIndex }
 pextrw, 3, 0x660fc5, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
-pextrw, 3, 0x660f3a15, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|Word|Unspecified|BaseIndex }
+pextrw, 3, 0x660f3a15, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|RegMem }
+pextrw, 3, 0x660f3a15, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Word|Unspecified|BaseIndex }
 pextrw, 3, 0xfc5, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|NoAVX, { Imm8, RegMMX, Reg32|Reg64 }
-pinsrw, 3, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, Reg32|Reg64|Word|Unspecified|BaseIndex, RegXMM }
-pinsrw, 3, 0x660fc4, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64|Word|Unspecified|BaseIndex, RegXMM }
-pinsrw, 3, 0xfc4, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|NoAVX, { Imm8, Reg32|Reg64|Word|Unspecified|BaseIndex, RegMMX }
+pinsrw, 3, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, Reg32|Reg64, RegXMM }
+pinsrw, 3, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|SSE2AVX, { Imm8, Word|Unspecified|BaseIndex, RegXMM }
+pinsrw, 3, 0x660fc4, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64, RegXMM }
+pinsrw, 3, 0x660fc4, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM }
+pinsrw, 3, 0xfc4, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|NoAVX, { Imm8, Reg32|Reg64, RegMMX }
+pinsrw, 3, 0xfc4, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoAVX, { Imm8, Word|Unspecified|BaseIndex, RegMMX }
 pmaxsw, 2, 0x66ee, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pmaxsw, 2, 0x660fee, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pmaxsw, 2, 0xfee, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoAVX, { Qword|Unspecified|BaseIndex|RegMMX, RegMMX }
@@ -1656,8 +1661,10 @@ dppd, 3, 0x6641, None, 1, CpuAVX, Modrm|
 dppd, 3, 0x660f3a41, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 dpps, 3, 0x6640, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 dpps, 3, 0x660f3a40, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-extractps, 3, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64|Dword|Unspecified|BaseIndex }
-extractps, 3, 0x660f3a17, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|Dword|Unspecified|BaseIndex }
+extractps, 3, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
+extractps, 3, 0x6617, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg64|RegMem }
+extractps, 3, 0x660f3a17, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
+extractps, 3, 0x660f3a17, None, 3, CpuSSE4_1|Cpu64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg64|RegMem }
 insertps, 3, 0x6621, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
 insertps, 3, 0x660f3a21, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
 movntdqa, 2, 0x662a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Xmmword|Unspecified|BaseIndex, RegXMM }
@@ -1674,16 +1681,20 @@ pblendw, 3, 0x660e, None, 1, CpuAVX, Mod
 pblendw, 3, 0x660f3a0e, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pcmpeqq, 2, 0x6629, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pcmpeqq, 2, 0x660f3829, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-pextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64|Byte|Unspecified|BaseIndex }
-pextrb, 3, 0x660f3a14, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|Byte|Unspecified|BaseIndex }
+pextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64|RegMem }
+pextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Byte|Unspecified|BaseIndex }
+pextrb, 3, 0x660f3a14, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|RegMem }
+pextrb, 3, 0x660f3a14, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Byte|Unspecified|BaseIndex }
 pextrd, 3, 0x6616, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
 pextrd, 3, 0x660f3a16, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
 pextrq, 3, 0x6616, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|Size64|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Reg64|Qword|Unspecified|BaseIndex }
 pextrq, 3, 0x660f3a16, None, 3, CpuSSE4_1|Cpu64, Modrm|Size64|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg64|Qword|Unspecified|BaseIndex }
 phminposuw, 2, 0x6641, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 phminposuw, 2, 0x660f3841, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-pinsrb, 3, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, Reg32|Reg64|Byte|Unspecified|BaseIndex, RegXMM }
-pinsrb, 3, 0x660f3a20, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64|Byte|Unspecified|BaseIndex, RegXMM }
+pinsrb, 3, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, Reg32|Reg64, RegXMM }
+pinsrb, 3, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Byte|Unspecified|BaseIndex, RegXMM }
+pinsrb, 3, 0x660f3a20, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64, RegXMM }
+pinsrb, 3, 0x660f3a20, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM }
 pinsrd, 3, 0x6622, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Reg32|Dword|Unspecified|BaseIndex, RegXMM }
 pinsrd, 3, 0x660f3a22, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg32|Dword|Unspecified|BaseIndex, RegXMM }
 pinsrq, 3, 0x6622, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|Size64|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Reg64|Qword|Unspecified|BaseIndex, RegXMM }
@@ -2103,7 +2114,8 @@ vdppd, 4, 0x6641, None, 1, CpuAVX, Modrm
 vdpps, 4, 0x6640, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vdpps, 4, 0x6640, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vextractf128, 3, 0x6619, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM, Xmmword|Unspecified|BaseIndex|RegXMM }
-vextractps, 3, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|Dword|Unspecified|BaseIndex }
+vextractps, 3, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
+vextractps, 3, 0x6617, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg64|RegMem }
 vhaddpd, 3, 0x667c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vhaddpd, 3, 0x667c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vhaddps, 3, 0xf27c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
@@ -2266,11 +2278,13 @@ vpermilps, 3, 0x660c, None, 1, CpuAVX, M
 vpermilps, 3, 0x660c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpermilps, 3, 0x6604, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpermilps, 3, 0x6604, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vpextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|Byte|Unspecified|BaseIndex }
+vpextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|RegMem }
+vpextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Byte|Unspecified|BaseIndex }
 vpextrd, 3, 0x6616, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
 vpextrq, 3, 0x6616, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|Size64|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg64|Qword|Unspecified|BaseIndex }
 vpextrw, 3, 0x66c5, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
-vpextrw, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|Word|Unspecified|BaseIndex }
+vpextrw, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|RegMem }
+vpextrw, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Word|Unspecified|BaseIndex }
 vphaddd, 3, 0x6602, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vphaddsw, 3, 0x6603, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vphaddw, 3, 0x6601, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
@@ -2278,10 +2292,12 @@ vphminposuw, 2, 0x6641, None, 1, CpuAVX,
 vphsubd, 3, 0x6606, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vphsubsw, 3, 0x6607, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vphsubw, 3, 0x6605, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpinsrb, 4, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64|Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpinsrb, 4, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrb, 4, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpinsrd, 4, 0x6622, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg32|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpinsrq, 4, 0x6622, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|Size64|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg64|Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpinsrw, 4, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpinsrw, 4, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrw, 4, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpmaddubsw, 3, 0x6604, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpmaddwd, 3, 0x66f5, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpmaxsb, 3, 0x663c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
@@ -3779,7 +3795,8 @@ vextractf64x4, 3, 0x661B, None, 1, CpuAV
 vextracti64x4, 3, 0x663B, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=2|VexW=2|VecESize=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegZMM, RegYMM|RegMem }
 vextracti64x4, 3, 0x663B, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=2|VexOpcode=2|VexW=2|VecESize=1|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegZMM, YMMword|Unspecified|BaseIndex }
 
-vextractps, 3, 0x6617, None, 1, CpuAVX512F, Modrm|EVex=4|VexOpcode=2|Disp8MemShift=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg64|Reg32|Dword|Unspecified|BaseIndex }
+vextractps, 3, 0x6617, None, 1, CpuAVX512F, Modrm|EVex=4|VexOpcode=2|Disp8MemShift=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
+vextractps, 3, 0x6617, None, 1, CpuAVX512F|Cpu64, Modrm|EVex=4|VexOpcode=2|Disp8MemShift=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg64|RegMem }
 
 vfixupimmpd, 4, 0x6654, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=2|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vfixupimmpd, 5, 0x6654, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=2|VexVVVV=1|VexW=2|VecESize=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SAE, { Imm8, Imm8, RegZMM, RegZMM, RegZMM }
@@ -5708,12 +5725,16 @@ vpsrldq, 3, 0x6673, 3, 1, CpuAVX512BW|Cp
 vpsrldq, 3, 0x6673, 3, 1, CpuAVX512BW|CpuAVX512VL, Modrm|EVex=3|VexOpcode=0|VexVVVV=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, YMMword|Unspecified|BaseIndex, RegYMM }
 
 vpextrw, 3, 0x66C5, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Reg64 }
-vpinsrw, 4, 0x66C4, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=0|VexVVVV=1|Disp8MemShift=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg32|Reg64|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpinsrw, 4, 0x66C4, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=0|VexVVVV=1|Disp8MemShift=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrw, 4, 0x66C4, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=0|VexVVVV=1|Disp8MemShift=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vpextrb, 3, 0x6614, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Reg64|Byte|Unspecified|BaseIndex }
-vpinsrb, 4, 0x6620, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg32|Reg64|Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpextrb, 3, 0x6614, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Reg64|RegMem }
+vpextrb, 3, 0x6614, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Byte|Unspecified|BaseIndex }
+vpinsrb, 4, 0x6620, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrb, 4, 0x6620, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vpextrw, 3, 0x6615, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|Disp8MemShift=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Reg64|Word|Unspecified|BaseIndex }
+vpextrw, 3, 0x6615, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|Disp8MemShift=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Reg64|RegMem }
+vpextrw, 3, 0x6615, None, 1, CpuAVX512BW, Modrm|EVex=4|VexOpcode=2|Disp8MemShift=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Word|Unspecified|BaseIndex }
 
 vpmaddwd, 3, 0x66F5, None, 1, CpuAVX512BW, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpmaddwd, 3, 0x66F5, None, 1, CpuAVX512BW|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
--- a/opcodes/i386-reg.tbl
+++ b/opcodes/i386-reg.tbl
@@ -21,7 +21,7 @@
 // Make %st first as we test for it.
 st, FloatReg|FloatAcc, 0, 0, 11, 33
 // 8 bit regs
-al, Reg8|Acc|Byte, 0, 0, Dw2Inval, Dw2Inval
+al, Reg8|Acc, 0, 0, Dw2Inval, Dw2Inval
 cl, Reg8|ShiftCount, 0, 1, Dw2Inval, Dw2Inval
 dl, Reg8, 0, 2, Dw2Inval, Dw2Inval
 bl, Reg8, 0, 3, Dw2Inval, Dw2Inval
@@ -46,7 +46,7 @@ r13b, Reg8, RegRex|RegRex64, 5, Dw2Inval
 r14b, Reg8, RegRex|RegRex64, 6, Dw2Inval, Dw2Inval
 r15b, Reg8, RegRex|RegRex64, 7, Dw2Inval, Dw2Inval
 // 16 bit regs
-ax, Reg16|Acc|Word, 0, 0, Dw2Inval, Dw2Inval
+ax, Reg16|Acc, 0, 0, Dw2Inval, Dw2Inval
 cx, Reg16, 0, 1, Dw2Inval, Dw2Inval
 dx, Reg16|InOutPortReg, 0, 2, Dw2Inval, Dw2Inval
 bx, Reg16|BaseIndex, 0, 3, Dw2Inval, Dw2Inval
@@ -63,7 +63,7 @@ r13w, Reg16, RegRex, 5, Dw2Inval, Dw2Inv
 r14w, Reg16, RegRex, 6, Dw2Inval, Dw2Inval
 r15w, Reg16, RegRex, 7, Dw2Inval, Dw2Inval
 // 32 bit regs
-eax, Reg32|BaseIndex|Acc|Dword, 0, 0, 0, Dw2Inval
+eax, Reg32|BaseIndex|Acc, 0, 0, 0, Dw2Inval
 ecx, Reg32|BaseIndex, 0, 1, 1, Dw2Inval
 edx, Reg32|BaseIndex, 0, 2, 2, Dw2Inval
 ebx, Reg32|BaseIndex, 0, 3, 3, Dw2Inval
@@ -79,7 +79,7 @@ r12d, Reg32|BaseIndex, RegRex, 4, Dw2Inv
 r13d, Reg32|BaseIndex, RegRex, 5, Dw2Inval, Dw2Inval
 r14d, Reg32|BaseIndex, RegRex, 6, Dw2Inval, Dw2Inval
 r15d, Reg32|BaseIndex, RegRex, 7, Dw2Inval, Dw2Inval
-rax, Reg64|BaseIndex|Acc|Qword, 0, 0, Dw2Inval, 0
+rax, Reg64|BaseIndex|Acc, 0, 0, Dw2Inval, 0
 rcx, Reg64|BaseIndex, 0, 1, Dw2Inval, 2
 rdx, Reg64|BaseIndex, 0, 2, Dw2Inval, 1
 rbx, Reg64|BaseIndex, 0, 3, Dw2Inval, 3


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/4] x86: drop FloatReg and FloatAcc
  2017-12-15  9:54 [PATCH 0/4] x86: operand handling consolidation Jan Beulich
  2017-12-15 10:32 ` [PATCH 1/4] x86: replace Reg8, Reg16, Reg32, and Reg64 Jan Beulich
@ 2017-12-15 10:33 ` Jan Beulich
  2017-12-15 12:47   ` H.J. Lu
  2017-12-15 10:34 ` [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD Jan Beulich
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2017-12-15 10:33 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

Express them as Reg|Tbyte and Acc|Tbyte respectively.

gas/
2017-12-15  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (operand_type_check): Extend comment.
	(match_reg_size): Also check .tbyte.
	(match_mem_size): No longer check .tbyte here.
	(md_assemble): Drop .floatacc check.
	(check_byte_reg): Drop .floatreg and .floatacc checks.
	(process_operands, parse_real_register): Replace .floatreg
	check.

opcodes/
2017-12-15  Jan Beulich  <jbeulich@suse.com>

	* i386-gen.c (operand_type_shorthands): Add FloatAcc and
	FloatReg.
	(operand_types): Drop FloatAcc and FloatReg.
	* i386-opc.h (enum of operand types): Likewise. Extend comment.
	(union i386_operand_type): Drop floatacc and floatreg.
	* i386-reg.tbl (st, st(0)): Replace FloatAcc by Acc.
	* i386-init.h, i386-tbl.h: Re-generate.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1807,7 +1807,7 @@ operand_type_check (i386_operand_type t,
   return 0;
 }
 
-/* Return 1 if there is no conflict in 8bit/16bit/32bit/64bit on
+/* Return 1 if there is no conflict in 8bit/16bit/32bit/64bit/80bit on
    operand J for instruction template T.  */
 
 static INLINE int
@@ -1820,7 +1820,9 @@ match_reg_size (const insn_template *t,
 	   || (i.types[j].bitfield.dword
 	       && !t->operand_types[j].bitfield.dword)
 	   || (i.types[j].bitfield.qword
-	       && !t->operand_types[j].bitfield.qword));
+	       && !t->operand_types[j].bitfield.qword)
+	   || (i.types[j].bitfield.tbyte
+	       && !t->operand_types[j].bitfield.tbyte));
 }
 
 /* Return 1 if there is no conflict in any size on operand J for
@@ -1835,8 +1837,6 @@ match_mem_size (const insn_template *t,
 		&& !t->operand_types[j].bitfield.unspecified)
 	       || (i.types[j].bitfield.fword
 		   && !t->operand_types[j].bitfield.fword)
-	       || (i.types[j].bitfield.tbyte
-		   && !t->operand_types[j].bitfield.tbyte)
 	       || (i.types[j].bitfield.xmmword
 		   && !t->operand_types[j].bitfield.xmmword)
 	       || (i.types[j].bitfield.ymmword
@@ -3768,8 +3768,7 @@ md_assemble (char *line)
     for (j = 0; j < i.operands; j++)
       if (i.types[j].bitfield.inoutportreg
 	  || i.types[j].bitfield.shiftcount
-	  || i.types[j].bitfield.acc
-	  || i.types[j].bitfield.floatacc)
+	  || i.types[j].bitfield.acc)
 	i.reg_operands--;
 
   /* ImmExt should be processed after SSE2AVX.  */
@@ -5661,9 +5660,7 @@ check_byte_reg (void)
 	  || i.types[op].bitfield.sreg3
 	  || i.types[op].bitfield.control
 	  || i.types[op].bitfield.debug
-	  || i.types[op].bitfield.test
-	  || i.types[op].bitfield.floatreg
-	  || i.types[op].bitfield.floatacc)
+	  || i.types[op].bitfield.test)
 	{
 	  as_bad (_("`%s%s' not allowed with `%s%c'"),
 		  register_prefix,
@@ -6122,7 +6119,7 @@ duplicate:
 	     0 or 1.  */
 	  unsigned int op;
 
-	  if (i.types[0].bitfield.floatreg
+	  if ((i.types[0].bitfield.reg && i.types[0].bitfield.tbyte)
 	      || operand_type_check (i.types[0], reg))
 	    op = 0;
 	  else
@@ -9768,7 +9765,7 @@ parse_real_register (char *reg_string, c
       && !cpu_arch_flags.bitfield.cpui386)
     return (const reg_entry *) NULL;
 
-  if (r->reg_type.bitfield.floatreg
+  if (r->reg_type.bitfield.tbyte
       && !cpu_arch_flags.bitfield.cpu8087
       && !cpu_arch_flags.bitfield.cpu287
       && !cpu_arch_flags.bitfield.cpu387)
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -341,6 +341,8 @@ static const initializer operand_type_sh
   { "Reg16",    "Reg|Word" },
   { "Reg32",    "Reg|Dword" },
   { "Reg64",    "Reg|Qword" },
+  { "FloatAcc", "Acc|Tbyte" },
+  { "FloatReg", "Reg|Tbyte" },
 };
 
 static initializer operand_type_init[] =
@@ -640,7 +642,6 @@ static bitfield opcode_modifiers[] =
 static bitfield operand_types[] =
 {
   BITFIELD (Reg),
-  BITFIELD (FloatReg),
   BITFIELD (RegMMX),
   BITFIELD (RegXMM),
   BITFIELD (RegYMM),
@@ -667,7 +668,6 @@ static bitfield operand_types[] =
   BITFIELD (SReg2),
   BITFIELD (SReg3),
   BITFIELD (Acc),
-  BITFIELD (FloatAcc),
   BITFIELD (JumpAbsolute),
   BITFIELD (EsSeg),
   BITFIELD (RegMem),
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -687,8 +687,6 @@ enum
 {
   /* Register (qualified by Byte, Word, etc) */
   Reg = 0,
-  /* Floating pointer stack register */
-  FloatReg,
   /* MMX register */
   RegMMX,
   /* SSE register */
@@ -740,10 +738,8 @@ enum
   Disp32S,
   /* 64 bit displacement */
   Disp64,
-  /* Accumulator %al/%ax/%eax/%rax */
+  /* Accumulator %al/%ax/%eax/%rax/%st(0) */
   Acc,
-  /* Floating pointer top stack register %st(0) */
-  FloatAcc,
   /* Register which can be used for base or index in memory operand.  */
   BaseIndex,
   /* Register to hold in/out port addr = dx */
@@ -809,7 +805,6 @@ typedef union i386_operand_type
   struct
     {
       unsigned int reg:1;
-      unsigned int floatreg:1;
       unsigned int regmmx:1;
       unsigned int regxmm:1;
       unsigned int regymm:1;
@@ -833,7 +828,6 @@ typedef union i386_operand_type
       unsigned int disp32s:1;
       unsigned int disp64:1;
       unsigned int acc:1;
-      unsigned int floatacc:1;
       unsigned int baseindex:1;
       unsigned int inoutportreg:1;
       unsigned int shiftcount:1;
--- a/opcodes/i386-reg.tbl
+++ b/opcodes/i386-reg.tbl
@@ -19,7 +19,7 @@
 // 02110-1301, USA.
 
 // Make %st first as we test for it.
-st, FloatReg|FloatAcc, 0, 0, 11, 33
+st, FloatReg|Acc, 0, 0, 11, 33
 // 8 bit regs
 al, Reg8|Acc, 0, 0, Dw2Inval, Dw2Inval
 cl, Reg8|ShiftCount, 0, 1, Dw2Inval, Dw2Inval
@@ -292,7 +292,7 @@ eip, BaseIndex, RegRex64, RegEip, 8, Dw2
 riz, BaseIndex, RegRex64, RegRiz, Dw2Inval, Dw2Inval
 eiz, BaseIndex, 0, RegEiz, Dw2Inval, Dw2Inval
 // fp regs.
-st(0), FloatReg|FloatAcc, 0, 0, 11, 33
+st(0), FloatReg|Acc, 0, 0, 11, 33
 st(1), FloatReg, 0, 1, 12, 34
 st(2), FloatReg, 0, 2, 13, 35
 st(3), FloatReg, 0, 3, 14, 36


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD
  2017-12-15  9:54 [PATCH 0/4] x86: operand handling consolidation Jan Beulich
  2017-12-15 10:32 ` [PATCH 1/4] x86: replace Reg8, Reg16, Reg32, and Reg64 Jan Beulich
  2017-12-15 10:33 ` [PATCH 2/4] x86: drop FloatReg and FloatAcc Jan Beulich
@ 2017-12-15 10:34 ` Jan Beulich
  2017-12-15 12:50   ` H.J. Lu
  2017-12-15 10:35 ` [PATCH 4/4] x86: fold certain AVX and AVX2 templates Jan Beulich
  2017-12-18  8:03 ` [PATCH 5/4] x86: helper changes to i386-gen.c Jan Beulich
  4 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2017-12-15 10:34 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

... qualified by their respective sizes, allowing to drop FirstXmm0 at
the same time.

gas/
2017-12-15  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (match_simd_size): New.
	(match_mem_size): Use it.
	(operand_size_match): Likewise. Split .reg and .acc checks.
	(pi, check_VecOperands, match_template, check_byte_reg,
	check_long_reg, check_qword_reg, build_modrm_byte,
	parse_real_register): Replace .regxmm, .regymm, and .regzmm
	checks.
	(md_assemble): Qualify .acc check with .xmmword one.
	(bad_implicit_operand): Delete.
	(process_operands): Replace .firstxmm0 checks with .acc plus
	.xmmword ones. Drop now pointless assertions. Convert .acc to
	.regsimd.
	* config/tc-i386-intel.c (i386_intel_simplify_register): Replace
	.regxmm, .regymm, and .regzmm checks.
	* testsuite/gas/i386/x86-64-specific-reg.l: Adjust expectations.

opcodes/
2017-12-15  Jan Beulich  <jbeulich@suse.com>

	* i386-gen.c (operand_type_shorthands): Add RegXMM, RegYMM, and
	RegZMM.
	(opcode_modifiers): Drop FirstXmm0.
	(operand_types): Replace RegXMM, RegYMM, and RegZMM with just
	RegSIMD.
	* i386-opc.h (enum of opcode modifiers): Drop FirstXmm0.
	(struct i386_opcode_modifier): Drop firstxmm0.
	(enum of operand types): Replace RegXMM, RegYMM, and RegZMM with
	just RegSIMD. Extend comment.
	(union i386_operand_type): Replace regxmm, regymm, and regzmm
	with just regsimd.
	* i386-opc.tbl (blendvpd, blendvps, pblendvb, sha256rnds2): Use
	Acc|Xmmword.
	* i386-reg.tbl (xmm0): Add Acc.
	* i386-init.h, i386-tbl.h: Re-generate.

--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -286,9 +286,9 @@ i386_intel_simplify_register (expression
       i.op[this_operand].regs = i386_regtab + reg_num;
     }
   else if (!intel_state.index
-	   && (i386_regtab[reg_num].reg_type.bitfield.regxmm
-	       || i386_regtab[reg_num].reg_type.bitfield.regymm
-	       || i386_regtab[reg_num].reg_type.bitfield.regzmm
+	   && (i386_regtab[reg_num].reg_type.bitfield.xmmword
+	       || i386_regtab[reg_num].reg_type.bitfield.ymmword
+	       || i386_regtab[reg_num].reg_type.bitfield.zmmword
 	       || i386_regtab[reg_num].reg_num == RegRiz
 	       || i386_regtab[reg_num].reg_num == RegEiz))
     intel_state.index = i386_regtab + reg_num;
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1825,6 +1825,20 @@ match_reg_size (const insn_template *t,
 	       && !t->operand_types[j].bitfield.tbyte));
 }
 
+/* Return 1 if there is no conflict in SIMD register on
+   operand J for instruction template T.  */
+
+static INLINE int
+match_simd_size (const insn_template *t, unsigned int j)
+{
+  return !((i.types[j].bitfield.xmmword
+	    && !t->operand_types[j].bitfield.xmmword)
+	   || (i.types[j].bitfield.ymmword
+	       && !t->operand_types[j].bitfield.ymmword)
+	   || (i.types[j].bitfield.zmmword
+	       && !t->operand_types[j].bitfield.zmmword));
+}
+
 /* Return 1 if there is no conflict in any size on operand J for
    instruction template T.  */
 
@@ -1837,12 +1851,17 @@ match_mem_size (const insn_template *t,
 		&& !t->operand_types[j].bitfield.unspecified)
 	       || (i.types[j].bitfield.fword
 		   && !t->operand_types[j].bitfield.fword)
-	       || (i.types[j].bitfield.xmmword
-		   && !t->operand_types[j].bitfield.xmmword)
-	       || (i.types[j].bitfield.ymmword
-		   && !t->operand_types[j].bitfield.ymmword)
-	       || (i.types[j].bitfield.zmmword
-		   && !t->operand_types[j].bitfield.zmmword)));
+	       /* For scalar opcode templates to allow register and memory
+		  operands at the same time, some special casing is needed
+		  here.  */
+	       || ((t->operand_types[j].bitfield.regsimd
+		    && !t->opcode_modifier.broadcast
+		    && (t->operand_types[j].bitfield.dword
+			|| t->operand_types[j].bitfield.qword))
+		   ? (i.types[j].bitfield.xmmword
+		      || i.types[j].bitfield.ymmword
+		      || i.types[j].bitfield.zmmword)
+		   : !match_simd_size(t, j))));
 }
 
 /* Return 1 if there is no size conflict on any operands for
@@ -1864,17 +1883,31 @@ operand_size_match (const insn_template
   /* Check memory and accumulator operand size.  */
   for (j = 0; j < i.operands; j++)
     {
-      if (!i.types[j].bitfield.reg && t->operand_types[j].bitfield.anysize)
+      if (!i.types[j].bitfield.reg && !i.types[j].bitfield.regsimd
+	  && t->operand_types[j].bitfield.anysize)
 	continue;
 
-      if ((t->operand_types[j].bitfield.reg
-	   || t->operand_types[j].bitfield.acc)
+      if (t->operand_types[j].bitfield.reg
 	  && !match_reg_size (t, j))
 	{
 	  match = 0;
 	  break;
 	}
 
+      if (t->operand_types[j].bitfield.regsimd
+	  && !match_simd_size (t, j))
+	{
+	  match = 0;
+	  break;
+	}
+
+      if (t->operand_types[j].bitfield.acc
+	  && (!match_reg_size (t, j) || !match_simd_size (t, j)))
+	{
+	  match = 0;
+	  break;
+	}
+
       if (i.types[j].bitfield.mem && !match_mem_size (t, j))
 	{
 	  match = 0;
@@ -2804,9 +2837,7 @@ pi (char *line, i386_insn *x)
       fprintf (stdout, "\n");
       if (x->types[j].bitfield.reg
 	  || x->types[j].bitfield.regmmx
-	  || x->types[j].bitfield.regxmm
-	  || x->types[j].bitfield.regymm
-	  || x->types[j].bitfield.regzmm
+	  || x->types[j].bitfield.regsimd
 	  || x->types[j].bitfield.sreg2
 	  || x->types[j].bitfield.sreg3
 	  || x->types[j].bitfield.control
@@ -3768,7 +3799,7 @@ md_assemble (char *line)
     for (j = 0; j < i.operands; j++)
       if (i.types[j].bitfield.inoutportreg
 	  || i.types[j].bitfield.shiftcount
-	  || i.types[j].bitfield.acc)
+	  || (i.types[j].bitfield.acc && !i.types[j].bitfield.xmmword))
 	i.reg_operands--;
 
   /* ImmExt should be processed after SSE2AVX.  */
@@ -4576,9 +4607,9 @@ check_VecOperands (const insn_template *
   /* Without VSIB byte, we can't have a vector register for index.  */
   if (!t->opcode_modifier.vecsib
       && i.index_reg
-      && (i.index_reg->reg_type.bitfield.regxmm
-	  || i.index_reg->reg_type.bitfield.regymm
-	  || i.index_reg->reg_type.bitfield.regzmm))
+      && (i.index_reg->reg_type.bitfield.xmmword
+	  || i.index_reg->reg_type.bitfield.ymmword
+	  || i.index_reg->reg_type.bitfield.zmmword))
     {
       i.error = unsupported_vector_index_register;
       return 1;
@@ -4598,11 +4629,11 @@ check_VecOperands (const insn_template *
     {
       if (!i.index_reg
 	  || !((t->opcode_modifier.vecsib == VecSIB128
-		&& i.index_reg->reg_type.bitfield.regxmm)
+		&& i.index_reg->reg_type.bitfield.xmmword)
 	       || (t->opcode_modifier.vecsib == VecSIB256
-		   && i.index_reg->reg_type.bitfield.regymm)
+		   && i.index_reg->reg_type.bitfield.ymmword)
 	       || (t->opcode_modifier.vecsib == VecSIB512
-		   && i.index_reg->reg_type.bitfield.regzmm)))
+		   && i.index_reg->reg_type.bitfield.zmmword)))
       {
 	i.error = invalid_vsib_address;
 	return 1;
@@ -4611,10 +4642,12 @@ check_VecOperands (const insn_template *
       gas_assert (i.reg_operands == 2 || i.mask);
       if (i.reg_operands == 2 && !i.mask)
 	{
-	  gas_assert (i.types[0].bitfield.regxmm
-		      || i.types[0].bitfield.regymm);
-	  gas_assert (i.types[2].bitfield.regxmm
-		      || i.types[2].bitfield.regymm);
+	  gas_assert (i.types[0].bitfield.regsimd);
+	  gas_assert (i.types[0].bitfield.xmmword
+		      || i.types[0].bitfield.ymmword);
+	  gas_assert (i.types[2].bitfield.regsimd);
+	  gas_assert (i.types[2].bitfield.xmmword
+		      || i.types[2].bitfield.ymmword);
 	  if (operand_check == check_none)
 	    return 0;
 	  if (register_number (i.op[0].regs)
@@ -4633,9 +4666,10 @@ check_VecOperands (const insn_template *
 	}
       else if (i.reg_operands == 1 && i.mask)
 	{
-	  if ((i.types[1].bitfield.regxmm
-	       || i.types[1].bitfield.regymm
-	       || i.types[1].bitfield.regzmm)
+	  if (i.types[1].bitfield.regsimd
+	      && (i.types[1].bitfield.xmmword
+	          || i.types[1].bitfield.ymmword
+	          || i.types[1].bitfield.zmmword)
 	      && (register_number (i.op[1].regs)
 		  == register_number (i.index_reg)))
 	    {
@@ -4941,13 +4975,9 @@ match_template (char mnem_suffix)
 		 && !intel_float_operand (t->name))
 	      : intel_float_operand (t->name) != 2)
 	  && ((!operand_types[0].bitfield.regmmx
-	       && !operand_types[0].bitfield.regxmm
-	       && !operand_types[0].bitfield.regymm
-	       && !operand_types[0].bitfield.regzmm)
+	       && !operand_types[0].bitfield.regsimd)
 	      || (!operand_types[t->operands > 1].bitfield.regmmx
-		  && !operand_types[t->operands > 1].bitfield.regxmm
-		  && !operand_types[t->operands > 1].bitfield.regymm
-		  && !operand_types[t->operands > 1].bitfield.regzmm))
+		  && !operand_types[t->operands > 1].bitfield.regsimd))
 	  && (t->base_opcode != 0x0fc7
 	      || t->extension_opcode != 1 /* cmpxchg8b */))
 	continue;
@@ -4960,9 +4990,9 @@ match_template (char mnem_suffix)
 		      && !intel_float_operand (t->name))
 		   : intel_float_operand (t->name) != 2)
 	       && ((!operand_types[0].bitfield.regmmx
-		    && !operand_types[0].bitfield.regxmm)
+		    && !operand_types[0].bitfield.regsimd)
 		   || (!operand_types[t->operands > 1].bitfield.regmmx
-		       && !operand_types[t->operands > 1].bitfield.regxmm)))
+		       && !operand_types[t->operands > 1].bitfield.regsimd)))
 	continue;
 
       /* Do not verify operands when there are none.  */
@@ -5653,9 +5683,7 @@ check_byte_reg (void)
       /* Any other register is bad.  */
       if (i.types[op].bitfield.reg
 	  || i.types[op].bitfield.regmmx
-	  || i.types[op].bitfield.regxmm
-	  || i.types[op].bitfield.regymm
-	  || i.types[op].bitfield.regzmm
+	  || i.types[op].bitfield.regsimd
 	  || i.types[op].bitfield.sreg2
 	  || i.types[op].bitfield.sreg3
 	  || i.types[op].bitfield.control
@@ -5728,7 +5756,7 @@ check_long_reg (void)
       {
 	if (intel_syntax
 	    && i.tm.opcode_modifier.toqword
-	    && !i.types[0].bitfield.regxmm)
+	    && !i.types[0].bitfield.regsimd)
 	  {
 	    /* Convert to QWORD.  We want REX byte. */
 	    i.suffix = QWORD_MNEM_SUFFIX;
@@ -5779,7 +5807,7 @@ check_qword_reg (void)
 	   lowering is more complicated.  */
 	if (intel_syntax
 	    && i.tm.opcode_modifier.todword
-	    && !i.types[0].bitfield.regxmm)
+	    && !i.types[0].bitfield.regsimd)
 	  {
 	    /* Convert to DWORD.  We don't want REX byte. */
 	    i.suffix = LONG_MNEM_SUFFIX;
@@ -5930,20 +5958,6 @@ finalize_imm (void)
 }
 
 static int
-bad_implicit_operand (int xmm)
-{
-  const char *ireg = xmm ? "xmm0" : "ymm0";
-
-  if (intel_syntax)
-    as_bad (_("the last operand of `%s' must be `%s%s'"),
-	    i.tm.name, register_prefix, ireg);
-  else
-    as_bad (_("the first operand of `%s' must be `%s%s'"),
-	    i.tm.name, register_prefix, ireg);
-  return 0;
-}
-
-static int
 process_operands (void)
 {
   /* Default segment register this instruction will use for memory
@@ -5962,17 +5976,15 @@ process_operands (void)
 		  && MAX_OPERANDS > dupl
 		  && operand_type_equal (&i.types[dest], &regxmm));
 
-      if (i.tm.opcode_modifier.firstxmm0)
+      if (i.tm.operand_types[0].bitfield.acc
+	  && i.tm.operand_types[0].bitfield.xmmword)
 	{
-	  /* The first operand is implicit and must be xmm0.  */
-	  gas_assert (operand_type_equal (&i.types[0], &regxmm));
-	  if (register_number (i.op[0].regs) != 0)
-	    return bad_implicit_operand (1);
-
 	  if (i.tm.opcode_modifier.vexsources == VEX3SOURCES)
 	    {
 	      /* Keep xmm0 for instructions with VEX prefix and 3
 		 sources.  */
+	      i.tm.operand_types[0].bitfield.acc = 0;
+	      i.tm.operand_types[0].bitfield.regsimd = 1;
 	      goto duplicate;
 	    }
 	  else
@@ -6032,18 +6044,11 @@ duplicate:
        if (i.tm.opcode_modifier.immext)
 	 process_immext ();
     }
-  else if (i.tm.opcode_modifier.firstxmm0)
+  else if (i.tm.operand_types[0].bitfield.acc
+	   && i.tm.operand_types[0].bitfield.xmmword)
     {
       unsigned int j;
 
-      /* The first operand is implicit and must be xmm0/ymm0/zmm0.  */
-      gas_assert (i.reg_operands
-		  && (operand_type_equal (&i.types[0], &regxmm)
-		      || operand_type_equal (&i.types[0], &regymm)
-		      || operand_type_equal (&i.types[0], &regzmm)));
-      if (register_number (i.op[0].regs) != 0)
-	return bad_implicit_operand (i.types[0].bitfield.regxmm);
-
       for (j = 1; j < i.operands; j++)
 	{
 	  i.op[j - 1] = i.op[j];
@@ -6806,10 +6811,8 @@ build_modrm_byte (void)
 	  for (op = 0; op < i.operands; op++)
 	    if (i.types[op].bitfield.reg
 		|| i.types[op].bitfield.regmmx
-		|| i.types[op].bitfield.regxmm
-		|| i.types[op].bitfield.regymm
+		|| i.types[op].bitfield.regsimd
 		|| i.types[op].bitfield.regbnd
-		|| i.types[op].bitfield.regzmm
 		|| i.types[op].bitfield.regmask
 		|| i.types[op].bitfield.sreg2
 		|| i.types[op].bitfield.sreg3
@@ -8719,9 +8722,9 @@ bad_address:
 		   || (i.base_reg->reg_num
 		       != (addr_mode == CODE_64BIT ? RegRip : RegEip))))
 	      || (i.index_reg
-		  && !i.index_reg->reg_type.bitfield.regxmm
-		  && !i.index_reg->reg_type.bitfield.regymm
-		  && !i.index_reg->reg_type.bitfield.regzmm
+		  && !i.index_reg->reg_type.bitfield.xmmword
+		  && !i.index_reg->reg_type.bitfield.ymmword
+		  && !i.index_reg->reg_type.bitfield.zmmword
 		  && ((addr_mode == CODE_64BIT
 		       ? !(i.index_reg->reg_type.bitfield.qword
 			   || i.index_reg->reg_num == RegRiz)
@@ -9774,13 +9777,13 @@ parse_real_register (char *reg_string, c
   if (r->reg_type.bitfield.regmmx && !cpu_arch_flags.bitfield.cpuregmmx)
     return (const reg_entry *) NULL;
 
-  if (r->reg_type.bitfield.regxmm && !cpu_arch_flags.bitfield.cpuregxmm)
+  if (r->reg_type.bitfield.xmmword && !cpu_arch_flags.bitfield.cpuregxmm)
     return (const reg_entry *) NULL;
 
-  if (r->reg_type.bitfield.regymm && !cpu_arch_flags.bitfield.cpuregymm)
+  if (r->reg_type.bitfield.ymmword && !cpu_arch_flags.bitfield.cpuregymm)
     return (const reg_entry *) NULL;
 
-  if (r->reg_type.bitfield.regzmm && !cpu_arch_flags.bitfield.cpuregzmm)
+  if (r->reg_type.bitfield.zmmword && !cpu_arch_flags.bitfield.cpuregzmm)
     return (const reg_entry *) NULL;
 
   if (r->reg_type.bitfield.regmask
--- a/gas/testsuite/gas/i386/x86-64-specific-reg.l
+++ b/gas/testsuite/gas/i386/x86-64-specific-reg.l
@@ -315,62 +315,62 @@
 .*:[0-9]*: Error: .*r15.* 2 .*invlpga.*
 .*:[0-9]*: Error: .*r15.* 1 .*skinit.*
 # xmm1
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm2
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm3
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm4
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm5
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm6
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm7
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm8
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm9
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm10
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm11
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm12
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm13
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm14
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
 # xmm15
-.*:[0-9]*: Error: .*blendvpd.*xmm0.*
-.*:[0-9]*: Error: .*blendvps.*xmm0.*
-.*:[0-9]*: Error: .*pblendv.*xmm0.*
+.*:[0-9]*: Error: .*blendvpd.*
+.*:[0-9]*: Error: .*blendvps.*
+.*:[0-9]*: Error: .*pblendv.*
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -343,6 +343,9 @@ static const initializer operand_type_sh
   { "Reg64",    "Reg|Qword" },
   { "FloatAcc", "Acc|Tbyte" },
   { "FloatReg", "Reg|Tbyte" },
+  { "RegXMM",   "RegSIMD|Xmmword" },
+  { "RegYMM",   "RegSIMD|Ymmword" },
+  { "RegZMM",   "RegSIMD|Zmmword" },
 };
 
 static initializer operand_type_init[] =
@@ -601,7 +604,6 @@ static bitfield opcode_modifiers[] =
   BITFIELD (NoTrackPrefixOk),
   BITFIELD (IsLockable),
   BITFIELD (RegKludge),
-  BITFIELD (FirstXmm0),
   BITFIELD (Implicit1stXmm0),
   BITFIELD (RepPrefixOk),
   BITFIELD (HLEPrefixOk),
@@ -643,9 +645,7 @@ static bitfield operand_types[] =
 {
   BITFIELD (Reg),
   BITFIELD (RegMMX),
-  BITFIELD (RegXMM),
-  BITFIELD (RegYMM),
-  BITFIELD (RegZMM),
+  BITFIELD (RegSIMD),
   BITFIELD (RegMask),
   BITFIELD (Imm1),
   BITFIELD (Imm8),
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -430,8 +430,6 @@ enum
   /* fake an extra reg operand for clr, imul and special register
      processing for some instructions.  */
   RegKludge,
-  /* The first operand must be xmm0 */
-  FirstXmm0,
   /* An implicit xmm0 as the first operand */
   Implicit1stXmm0,
   /* The HLE prefix is OK:
@@ -643,7 +641,6 @@ typedef struct i386_opcode_modifier
   unsigned int notrackprefixok:1;
   unsigned int islockable:1;
   unsigned int regkludge:1;
-  unsigned int firstxmm0:1;
   unsigned int implicit1stxmm0:1;
   unsigned int hleprefixok:2;
   unsigned int repprefixok:1;
@@ -689,12 +686,8 @@ enum
   Reg = 0,
   /* MMX register */
   RegMMX,
-  /* SSE register */
-  RegXMM,
-  /* AVX registers */
-  RegYMM,
-  /* AVX512 registers */
-  RegZMM,
+  /* Vector registers */
+  RegSIMD,
   /* Vector Mask registers */
   RegMask,
   /* Control register */
@@ -738,7 +731,7 @@ enum
   Disp32S,
   /* 64 bit displacement */
   Disp64,
-  /* Accumulator %al/%ax/%eax/%rax/%st(0) */
+  /* Accumulator %al/%ax/%eax/%rax/%st(0)/%xmm0 */
   Acc,
   /* Register which can be used for base or index in memory operand.  */
   BaseIndex,
@@ -806,9 +799,7 @@ typedef union i386_operand_type
     {
       unsigned int reg:1;
       unsigned int regmmx:1;
-      unsigned int regxmm:1;
-      unsigned int regymm:1;
-      unsigned int regzmm:1;
+      unsigned int regsimd:1;
       unsigned int regmask:1;
       unsigned int control:1;
       unsigned int debug:1;
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1649,13 +1649,13 @@ blendpd, 3, 0x660d, None, 1, CpuAVX, Mod
 blendpd, 3, 0x660f3a0d, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 blendps, 3, 0x660c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 blendps, 3, 0x660f3a0c, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-blendvpd, 3, 0x664b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|FirstXmm0|VexImmExt|SSE2AVX, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
+blendvpd, 3, 0x664b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt|SSE2AVX, { Acc|Xmmword, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 blendvpd, 2, 0x664b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Implicit1stXmm0|VexImmExt|SSE2AVX, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-blendvpd, 3, 0x660f3815, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|FirstXmm0, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
+blendvpd, 3, 0x660f3815, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Acc|Xmmword, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 blendvpd, 2, 0x660f3815, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-blendvps, 3, 0x664a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|FirstXmm0|VexImmExt|SSE2AVX, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
+blendvps, 3, 0x664a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt|SSE2AVX, { Acc|Xmmword, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 blendvps, 2, 0x664a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Implicit1stXmm0|VexImmExt|SSE2AVX, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-blendvps, 3, 0x660f3814, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|FirstXmm0, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
+blendvps, 3, 0x660f3814, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Acc|Xmmword, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 blendvps, 2, 0x660f3814, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 dppd, 3, 0x6641, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 dppd, 3, 0x660f3a41, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
@@ -1673,9 +1673,9 @@ mpsadbw, 3, 0x6642, None, 1, CpuAVX, Mod
 mpsadbw, 3, 0x660f3a42, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 packusdw, 2, 0x662b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 packusdw, 2, 0x660f382b, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-pblendvb, 3, 0x664c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|FirstXmm0|VexImmExt|SSE2AVX, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
+pblendvb, 3, 0x664c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt|SSE2AVX, { Acc|Xmmword, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pblendvb, 2, 0x664c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Implicit1stXmm0|VexImmExt|SSE2AVX, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-pblendvb, 3, 0x660f3810, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|FirstXmm0, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
+pblendvb, 3, 0x660f3810, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Acc|Xmmword, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pblendvb, 2, 0x660f3810, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pblendw, 3, 0x660e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 pblendw, 3, 0x660f3a0e, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
@@ -3134,7 +3134,7 @@ sha1rnds4, 3, 0xf3acc, None, 3, CpuSHA,
 sha1nexte, 2, 0xf38c8, None, 3, CpuSHA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|RegXMM|Disp8|Disp16|Disp32|Disp32S|Unspecified|BaseIndex, RegXMM }
 sha1msg1, 2, 0xf38c9, None, 3, CpuSHA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|RegXMM|Disp8|Disp16|Disp32|Disp32S|Unspecified|BaseIndex, RegXMM }
 sha1msg2, 2, 0xf38ca, None, 3, CpuSHA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|RegXMM|Disp8|Disp16|Disp32|Disp32S|Unspecified|BaseIndex, RegXMM }
-sha256rnds2, 3, 0xf38cb, None, 3, CpuSHA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|FirstXmm0, { RegXMM, Xmmword|RegXMM|Disp8|Disp16|Disp32|Disp32S|Unspecified|BaseIndex, RegXMM }
+sha256rnds2, 3, 0xf38cb, None, 3, CpuSHA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Acc|Xmmword, Xmmword|RegXMM|Disp8|Disp16|Disp32|Disp32S|Unspecified|BaseIndex, RegXMM }
 sha256rnds2, 2, 0xf38cb, None, 3, CpuSHA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|RegXMM|Disp8|Disp16|Disp32|Disp32S|Unspecified|BaseIndex, RegXMM }
 sha256msg1, 2, 0xf38cc, None, 3, CpuSHA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|RegXMM|Disp8|Disp16|Disp32|Disp32S|Unspecified|BaseIndex, RegXMM }
 sha256msg2, 2, 0xf38cd, None, 3, CpuSHA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|RegXMM|Disp8|Disp16|Disp32|Disp32S|Unspecified|BaseIndex, RegXMM }
--- a/opcodes/i386-reg.tbl
+++ b/opcodes/i386-reg.tbl
@@ -180,7 +180,7 @@ mm4, RegMMX, 0, 4, 33, 45
 mm5, RegMMX, 0, 5, 34, 46
 mm6, RegMMX, 0, 6, 35, 47
 mm7, RegMMX, 0, 7, 36, 48
-xmm0, RegXMM, 0, 0, 21, 17
+xmm0, RegXMM|Acc, 0, 0, 21, 17
 xmm1, RegXMM, 0, 1, 22, 18
 xmm2, RegXMM, 0, 2, 23, 19
 xmm3, RegXMM, 0, 3, 24, 20


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/4] x86: fold certain AVX and AVX2 templates
  2017-12-15  9:54 [PATCH 0/4] x86: operand handling consolidation Jan Beulich
                   ` (2 preceding siblings ...)
  2017-12-15 10:34 ` [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD Jan Beulich
@ 2017-12-15 10:35 ` Jan Beulich
  2017-12-15 13:10   ` H.J. Lu
                     ` (2 more replies)
  2017-12-18  8:03 ` [PATCH 5/4] x86: helper changes to i386-gen.c Jan Beulich
  4 siblings, 3 replies; 20+ messages in thread
From: Jan Beulich @ 2017-12-15 10:35 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

Just like for instructions in GPRs, there's no need to have separate
templates for otherwise identical insns acting on XMM or YMM registers
(or memory of the same size).

gas/
2017-12-15  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (regymm, regzmm): Delete.
	(operand_type_register_match). Extend comment. Also handle some
	memory operands here. Extend to cover .regsimd.
	(build_vex_prefix): Derive vector_length from actual operand
	size.
	(process_operands, build_modrm_byte): Use .regsimd.

opcodes/
2017-12-15  Jan Beulich  <jbeulich@suse.com>

	* i386-gen.c (operand_type_init): Delete OPERAND_TYPE_REGYMM and
	OPERAND_TYPE_REGZMM entries.
	* i386-opc.h (enum of opcode modifiers): Extend comment.
	i386-opc.tbl (vaddpd, vaddps, vaddsubpd, vaddsubps, vandnpd,
	vandnps, vandpd, vandps, vblendpd, vblendps, vblendvpd,
	vblendvps, vbroadcastss, vcmpeq_ospd, vcmpeq_osps, vcmpeqpd,
	vcmpeqps, vcmpeq_uqpd, vcmpeq_uqps, vcmpeq_uspd, vcmpeq_usps,
	vcmpfalse_ospd, vcmpfalse_osps, vcmpfalsepd, vcmpfalseps,
	vcmpge_oqpd, vcmpge_oqps, vcmpgepd, vcmpgeps, vcmpgt_oqpd,
	vcmpgt_oqps, vcmpgtpd, vcmpgtps, vcmple_oqpd, vcmple_oqps,
	vcmplepd, vcmpleps, vcmplt_oqpd, vcmplt_oqps, vcmpltpd,
	vcmpltps, vcmpneq_oqpd, vcmpneq_oqps, vcmpneq_ospd,
	vcmpneq_osps, vcmpneqpd, vcmpneqps, vcmpneq_uspd, vcmpneq_usps,
	vcmpngepd, vcmpngeps, vcmpnge_uqpd, vcmpnge_uqps, vcmpngtpd,
	vcmpngtps, vcmpngt_uqpd, vcmpngt_uqps, vcmpnlepd, vcmpnleps,
	vcmpnle_uqpd, vcmpnle_uqps, vcmpnltpd, vcmpnltps, vcmpnlt_uqpd,
	vcmpnlt_uqps, vcmpordpd, vcmpordps, vcmpord_spd, vcmpord_sps,
	vcmppd, vcmpps, vcmptruepd, vcmptrueps, vcmptrue_uspd,
	vcmptrue_usps, vcmpunordpd, vcmpunordps, vcmpunord_spd,
	vcmpunord_sps, vcvtdq2ps, vcvtpd2dq, vcvtpd2ps, vcvtps2dq,
	vcvttpd2dq, vcvttps2dq, vdivpd, vdivps, vdpps, vhaddpd, vhaddps,
	vhsubpd, vhsubps, vlddqu, vmaskmovpd, vmaskmovps, vmaxpd,
	vmaxps, vminpd, vminps, vmovapd, vmovaps, vmovdqa, vmovdqu,
	vmovmskpd, vmovmskps, vmovntdq, vmovntpd, vmovntps, vmovshdup,
	vmovsldup, vmovupd, vmovups, vmulpd, vmulps, vorpd, vorps,
	vpermilpd, vpermilps, vptest, vrcpps, vroundpd, vroundps,
	vrsqrtps, vshufpd, vshufps, vsqrtpd, vsqrtps, vsubpd, vsubps,
	vtestpd, vtestps, vunpckhpd, vunpckhps, vunpcklpd, vunpcklps,
	vxorpd, vxorps, vpblendd, vpbroadcastb, vpbroadcastd,
	vpbroadcastw, vpbroadcastq, vpmaskmovd, vpmaskmovq, vpsllvd,
	vpsllvq, vpsravd, vpsravq, vpsrlvd, vpsrlvq): Fold 128- and
	256-bit forms. Use CheckRegSize instead of IgnoreSize where
	appropriate. Drop Xmmword and Ymmword from the results where
	possible.
	* i386-tbl.h: Re-generate.
---
For some yet to be understood reason folding the memory forms of
vcvtpd2ps doesn't work (some Intel mode ymmword ptr forms produce
128-bit insns).

Similarly I didn't figure out yet the reason for an anomaly when the
"unspecified" checks in operand_type_register_match() are missing: In
that case I've observed errors on vaddsubp{s,d}, but not on e.g.
vaddp{s,d} with identical operands.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1747,8 +1747,6 @@ static const i386_operand_type disp16_32
 static const i386_operand_type anydisp
   = OPERAND_TYPE_ANYDISP;
 static const i386_operand_type regxmm = OPERAND_TYPE_REGXMM;
-static const i386_operand_type regymm = OPERAND_TYPE_REGYMM;
-static const i386_operand_type regzmm = OPERAND_TYPE_REGZMM;
 static const i386_operand_type regmask = OPERAND_TYPE_REGMASK;
 static const i386_operand_type imm8 = OPERAND_TYPE_IMM8;
 static const i386_operand_type imm8s = OPERAND_TYPE_IMM8S;
@@ -1973,7 +1971,9 @@ mismatch:
 }
 
 /* If given types g0 and g1 are registers they must be of the same type
-   unless the expected operand type register overlap is null.  */
+   unless the expected operand type register overlap is null.
+   Memory operand size of certain SIMD instructions is also being checked
+   here.  */
 
 static INLINE int
 operand_type_register_match (i386_operand_type g0,
@@ -1981,22 +1981,36 @@ operand_type_register_match (i386_operan
 			     i386_operand_type g1,
 			     i386_operand_type t1)
 {
-  if (!operand_type_check (g0, reg))
+  if (!g0.bitfield.reg
+      && !g0.bitfield.regsimd
+      && (!operand_type_check (g0, anymem)
+	  || g0.bitfield.unspecified
+	  || !t0.bitfield.regsimd))
     return 1;
 
-  if (!operand_type_check (g1, reg))
+  if (!g1.bitfield.reg
+      && !g1.bitfield.regsimd
+      && (!operand_type_check (g1, anymem)
+	  || g1.bitfield.unspecified
+	  || !t1.bitfield.regsimd))
     return 1;
 
   if (g0.bitfield.byte == g1.bitfield.byte
       && g0.bitfield.word == g1.bitfield.word
       && g0.bitfield.dword == g1.bitfield.dword
-      && g0.bitfield.qword == g1.bitfield.qword)
+      && g0.bitfield.qword == g1.bitfield.qword
+      && g0.bitfield.xmmword == g1.bitfield.xmmword
+      && g0.bitfield.ymmword == g1.bitfield.ymmword
+      && g0.bitfield.zmmword == g1.bitfield.zmmword)
     return 1;
 
   if (!(t0.bitfield.byte & t1.bitfield.byte)
       && !(t0.bitfield.word & t1.bitfield.word)
       && !(t0.bitfield.dword & t1.bitfield.dword)
-      && !(t0.bitfield.qword & t1.bitfield.qword))
+      && !(t0.bitfield.qword & t1.bitfield.qword)
+      && !(t0.bitfield.xmmword & t1.bitfield.xmmword)
+      && !(t0.bitfield.ymmword & t1.bitfield.ymmword)
+      && !(t0.bitfield.zmmword & t1.bitfield.zmmword))
     return 1;
 
   i.error = register_type_mismatch;
@@ -3240,8 +3254,22 @@ build_vex_prefix (const insn_template *t
 
   if (i.tm.opcode_modifier.vex == VEXScalar)
     vector_length = avxscalar;
+  else if (i.tm.opcode_modifier.vex == VEX256)
+    vector_length = 1;
   else
-    vector_length = i.tm.opcode_modifier.vex == VEX256 ? 1 : 0;
+    {
+      unsigned int op;
+
+      vector_length = 0;
+      for (op = 0; op < t->operands; ++op)
+	if (t->operand_types[op].bitfield.xmmword
+	    && t->operand_types[op].bitfield.ymmword
+	    && i.types[op].bitfield.ymmword)
+	  {
+	    vector_length = 1;
+	    break;
+	  }
+    }
 
   switch ((i.tm.base_opcode >> 8) & 0xff)
     {
@@ -6066,10 +6094,7 @@ duplicate:
   else if (i.tm.opcode_modifier.implicitquadgroup)
     {
       /* The second operand must be {x,y,z}mmN, where N is a multiple of 4. */
-      gas_assert (i.operands >= 2
-          && (operand_type_equal (&i.types[1], &regxmm)
-              || operand_type_equal (&i.types[1], &regymm)
-              || operand_type_equal (&i.types[1], &regzmm)));
+      gas_assert (i.operands >= 2 && i.types[1].bitfield.regsimd);
       unsigned int regnum = register_number (i.op[1].regs);
       unsigned int first_reg_in_group = regnum & ~3;
       unsigned int last_reg_in_group = first_reg_in_group + 3;
@@ -6230,9 +6255,7 @@ build_modrm_byte (void)
                           && i.types[0].bitfield.vec_imm4
                           && (i.tm.opcode_modifier.vexw == VEXW0
                               || i.tm.opcode_modifier.vexw == VEXW1)
-                          && (operand_type_equal (&i.tm.operand_types[dest], &regxmm)
-                              || operand_type_equal (&i.tm.operand_types[dest], &regymm)
-                              || operand_type_equal (&i.tm.operand_types[dest], &regzmm)))));
+                          && i.tm.operand_types[dest].bitfield.regsimd)));
 
       if (i.imm_operands == 0)
         {
@@ -6264,12 +6287,7 @@ build_modrm_byte (void)
               nds = tmp;
             }
 
-          gas_assert (operand_type_equal (&i.tm.operand_types[reg_slot],
-					  &regxmm)
-                      || operand_type_equal (&i.tm.operand_types[reg_slot],
-                                             &regymm)
-                      || operand_type_equal (&i.tm.operand_types[reg_slot],
-                                             &regzmm));
+          gas_assert (i.tm.operand_types[reg_slot].bitfield.regsimd);
           exp->X_op = O_constant;
           exp->X_add_number = register_number (i.op[reg_slot].regs) << 4;
 	  gas_assert ((i.op[reg_slot].regs->reg_flags & RegVRex) == 0);
@@ -6311,22 +6329,13 @@ build_modrm_byte (void)
               i.types[imm_slot].bitfield.imm8 = 1;
             }
 
-          gas_assert (operand_type_equal (&i.tm.operand_types[reg_slot],
-					  &regxmm)
-		      || operand_type_equal (&i.tm.operand_types[reg_slot],
-					     &regymm)
-		      || operand_type_equal (&i.tm.operand_types[reg_slot],
-					     &regzmm));
+          gas_assert (i.tm.operand_types[reg_slot].bitfield.regsimd);
           i.op[imm_slot].imms->X_add_number
               |= register_number (i.op[reg_slot].regs) << 4;
 	  gas_assert ((i.op[reg_slot].regs->reg_flags & RegVRex) == 0);
         }
 
-      gas_assert (operand_type_equal (&i.tm.operand_types[nds], &regxmm)
-                  || operand_type_equal (&i.tm.operand_types[nds],
-                                         &regymm)
-                  || operand_type_equal (&i.tm.operand_types[nds],
-                                         &regzmm));
+      gas_assert (i.tm.operand_types[nds].bitfield.regsimd);
       i.vex.register_specifier = i.op[nds].regs;
     }
   else
@@ -6450,9 +6459,7 @@ build_modrm_byte (void)
 	      if ((dest + 1) >= i.operands
 		  || ((!op.bitfield.reg
 		       || (!op.bitfield.dword && !op.bitfield.qword))
-		      && !operand_type_equal (&op, &regxmm)
-		      && !operand_type_equal (&op, &regymm)
-		      && !operand_type_equal (&op, &regzmm)
+		      && !op.bitfield.regsimd
 		      && !operand_type_equal (&op, &regmask)))
 		abort ();
 	      i.vex.register_specifier = i.op[vvvv].regs;
@@ -6879,9 +6886,7 @@ build_modrm_byte (void)
 
 	      if ((!type->bitfield.reg
 		   || (!type->bitfield.dword && !type->bitfield.qword))
-		  && !operand_type_equal (type, &regxmm)
-		  && !operand_type_equal (type, &regymm)
-		  && !operand_type_equal (type, &regzmm)
+		  && !type->bitfield.regsimd
 		  && !operand_type_equal (type, &regmask))
 		abort ();
 
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -412,10 +412,6 @@ static initializer operand_type_init[] =
     "RegMMX" },
   { "OPERAND_TYPE_REGXMM",
     "RegXMM" },
-  { "OPERAND_TYPE_REGYMM",
-    "RegYMM" },
-  { "OPERAND_TYPE_REGZMM",
-    "RegZMM" },
   { "OPERAND_TYPE_REGMASK",
     "RegMask" },
   { "OPERAND_TYPE_ESSEG",
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -461,7 +461,7 @@ enum
   /* deprecated fp insn, gets a warning */
   Ugh,
   /* insn has VEX prefix:
-	1: 128bit VEX prefix.
+	1: 128bit VEX prefix (or operand dependent).
 	2: 256bit VEX prefix.
 	3: Scalar VEX prefix.
    */
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1836,231 +1836,152 @@ gf2p8mulb, 2, 0x660f38cf, None, 3, CpuGF
 
 // AVX instructions.
 
-vaddpd, 3, 0x6658, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaddpd, 3, 0x6658, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vaddps, 3, 0x58, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaddps, 3, 0x58, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vaddpd, 3, 0x6658, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vaddps, 3, 0x58, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vaddsd, 3, 0xf258, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vaddss, 3, 0xf358, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaddsubpd, 3, 0x66d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaddsubpd, 3, 0x66d0, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vaddsubps, 3, 0xf2d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaddsubps, 3, 0xf2d0, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vandnpd, 3, 0x6655, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vandnpd, 3, 0x6655, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vandnps, 3, 0x55, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vandnps, 3, 0x55, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vandpd, 3, 0x6654, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vandpd, 3, 0x6654, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vandps, 3, 0x54, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vandps, 3, 0x54, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vblendpd, 4, 0x660d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vblendpd, 4, 0x660d, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vblendps, 4, 0x660c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vblendps, 4, 0x660c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vblendvpd, 4, 0x664b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vblendvpd, 4, 0x664b, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexSources=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vblendvps, 4, 0x664a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vblendvps, 4, 0x664a, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexSources=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vaddsubpd, 3, 0x66d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vaddsubps, 3, 0xf2d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vandnpd, 3, 0x6655, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vandnps, 3, 0x55, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vandpd, 3, 0x6654, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vandps, 3, 0x54, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vblendpd, 4, 0x660d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vblendps, 4, 0x660c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vblendvpd, 4, 0x664b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vblendvps, 4, 0x664a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|VexSources=2|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vbroadcastf128, 2, 0x661a, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 2, 0x6619, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegYMM }
-vbroadcastss, 2, 0x6618, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex, RegXMM }
-vbroadcastss, 2, 0x6618, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex, RegYMM }
-vcmpeq_ospd, 3, 0x66c2, 0x10, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeq_ospd, 3, 0x66c2, 0x10, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpeq_osps, 3, 0xc2, 0x10, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeq_osps, 3, 0xc2, 0x10, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vbroadcastss, 2, 0x6618, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex, RegXMM|RegYMM }
+vcmpeq_ospd, 3, 0x66c2, 0x10, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpeq_osps, 3, 0xc2, 0x10, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpeq_ossd, 3, 0xf2c2, 0x10, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpeq_osss, 3, 0xf3c2, 0x10, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeqpd, 3, 0x66c2, 0x0, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeqpd, 3, 0x66c2, 0x0, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpeqps, 3, 0xc2, 0x0, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeqps, 3, 0xc2, 0x0, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpeqpd, 3, 0x66c2, 0x0, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpeqps, 3, 0xc2, 0x0, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpeqsd, 3, 0xf2c2, 0x0, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpeqss, 3, 0xf3c2, 0x0, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeq_uqpd, 3, 0x66c2, 0x8, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeq_uqpd, 3, 0x66c2, 0x8, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpeq_uqps, 3, 0xc2, 0x8, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeq_uqps, 3, 0xc2, 0x8, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpeq_uqpd, 3, 0x66c2, 0x8, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpeq_uqps, 3, 0xc2, 0x8, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpeq_uqsd, 3, 0xf2c2, 0x8, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpeq_uqss, 3, 0xf3c2, 0x8, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeq_uspd, 3, 0x66c2, 0x18, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeq_uspd, 3, 0x66c2, 0x18, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpeq_usps, 3, 0xc2, 0x18, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpeq_usps, 3, 0xc2, 0x18, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpeq_uspd, 3, 0x66c2, 0x18, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpeq_usps, 3, 0xc2, 0x18, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpeq_ussd, 3, 0xf2c2, 0x18, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpeq_usss, 3, 0xf3c2, 0x18, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpfalse_ospd, 3, 0x66c2, 0x1b, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpfalse_ospd, 3, 0x66c2, 0x1b, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpfalse_osps, 3, 0xc2, 0x1b, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpfalse_osps, 3, 0xc2, 0x1b, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpfalse_ospd, 3, 0x66c2, 0x1b, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpfalse_osps, 3, 0xc2, 0x1b, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpfalse_ossd, 3, 0xf2c2, 0x1b, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpfalse_osss, 3, 0xf3c2, 0x1b, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpfalsepd, 3, 0x66c2, 0xb, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpfalsepd, 3, 0x66c2, 0xb, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpfalseps, 3, 0xc2, 0xb, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpfalseps, 3, 0xc2, 0xb, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpfalsepd, 3, 0x66c2, 0xb, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpfalseps, 3, 0xc2, 0xb, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpfalsesd, 3, 0xf2c2, 0xb, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpfalsess, 3, 0xf3c2, 0xb, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpge_oqpd, 3, 0x66c2, 0x1d, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpge_oqpd, 3, 0x66c2, 0x1d, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpge_oqps, 3, 0xc2, 0x1d, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpge_oqps, 3, 0xc2, 0x1d, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpge_oqpd, 3, 0x66c2, 0x1d, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpge_oqps, 3, 0xc2, 0x1d, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpge_oqsd, 3, 0xf2c2, 0x1d, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpge_oqss, 3, 0xf3c2, 0x1d, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgepd, 3, 0x66c2, 0xd, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgepd, 3, 0x66c2, 0xd, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpgeps, 3, 0xc2, 0xd, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgeps, 3, 0xc2, 0xd, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpgepd, 3, 0x66c2, 0xd, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpgeps, 3, 0xc2, 0xd, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpgesd, 3, 0xf2c2, 0xd, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpgess, 3, 0xf3c2, 0xd, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgt_oqpd, 3, 0x66c2, 0x1e, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgt_oqpd, 3, 0x66c2, 0x1e, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpgt_oqps, 3, 0xc2, 0x1e, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgt_oqps, 3, 0xc2, 0x1e, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpgt_oqpd, 3, 0x66c2, 0x1e, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpgt_oqps, 3, 0xc2, 0x1e, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpgt_oqsd, 3, 0xf2c2, 0x1e, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpgt_oqss, 3, 0xf3c2, 0x1e, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgtpd, 3, 0x66c2, 0xe, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgtpd, 3, 0x66c2, 0xe, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpgtps, 3, 0xc2, 0xe, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpgtps, 3, 0xc2, 0xe, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpgtpd, 3, 0x66c2, 0xe, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpgtps, 3, 0xc2, 0xe, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpgtsd, 3, 0xf2c2, 0xe, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpgtss, 3, 0xf3c2, 0xe, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmple_oqpd, 3, 0x66c2, 0x12, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmple_oqpd, 3, 0x66c2, 0x12, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmple_oqps, 3, 0xc2, 0x12, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmple_oqps, 3, 0xc2, 0x12, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmple_oqpd, 3, 0x66c2, 0x12, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmple_oqps, 3, 0xc2, 0x12, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmple_oqsd, 3, 0xf2c2, 0x12, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmple_oqss, 3, 0xf3c2, 0x12, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmplepd, 3, 0x66c2, 0x2, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmplepd, 3, 0x66c2, 0x2, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpleps, 3, 0xc2, 0x2, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpleps, 3, 0xc2, 0x2, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmplepd, 3, 0x66c2, 0x2, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpleps, 3, 0xc2, 0x2, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmplesd, 3, 0xf2c2, 0x2, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpless, 3, 0xf3c2, 0x2, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmplt_oqpd, 3, 0x66c2, 0x11, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmplt_oqpd, 3, 0x66c2, 0x11, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmplt_oqps, 3, 0xc2, 0x11, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmplt_oqps, 3, 0xc2, 0x11, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmplt_oqpd, 3, 0x66c2, 0x11, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmplt_oqps, 3, 0xc2, 0x11, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmplt_oqsd, 3, 0xf2c2, 0x11, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmplt_oqss, 3, 0xf3c2, 0x11, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpltpd, 3, 0x66c2, 0x1, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpltpd, 3, 0x66c2, 0x1, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpltps, 3, 0xc2, 0x1, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpltps, 3, 0xc2, 0x1, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpltpd, 3, 0x66c2, 0x1, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpltps, 3, 0xc2, 0x1, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpltsd, 3, 0xf2c2, 0x1, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpltss, 3, 0xf3c2, 0x1, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_oqpd, 3, 0x66c2, 0xc, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_oqpd, 3, 0x66c2, 0xc, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpneq_oqps, 3, 0xc2, 0xc, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_oqps, 3, 0xc2, 0xc, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpneq_oqpd, 3, 0x66c2, 0xc, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpneq_oqps, 3, 0xc2, 0xc, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpneq_oqsd, 3, 0xf2c2, 0xc, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpneq_oqss, 3, 0xf3c2, 0xc, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_ospd, 3, 0x66c2, 0x1c, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_ospd, 3, 0x66c2, 0x1c, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpneq_osps, 3, 0xc2, 0x1c, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_osps, 3, 0xc2, 0x1c, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpneq_ospd, 3, 0x66c2, 0x1c, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpneq_osps, 3, 0xc2, 0x1c, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpneq_ossd, 3, 0xf2c2, 0x1c, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpneq_osss, 3, 0xf3c2, 0x1c, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneqpd, 3, 0x66c2, 0x4, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneqpd, 3, 0x66c2, 0x4, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpneqps, 3, 0xc2, 0x4, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneqps, 3, 0xc2, 0x4, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpneqpd, 3, 0x66c2, 0x4, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpneqps, 3, 0xc2, 0x4, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpneqsd, 3, 0xf2c2, 0x4, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpneqss, 3, 0xf3c2, 0x4, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_uspd, 3, 0x66c2, 0x14, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_uspd, 3, 0x66c2, 0x14, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpneq_usps, 3, 0xc2, 0x14, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpneq_usps, 3, 0xc2, 0x14, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpneq_uspd, 3, 0x66c2, 0x14, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpneq_usps, 3, 0xc2, 0x14, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpneq_ussd, 3, 0xf2c2, 0x14, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpneq_usss, 3, 0xf3c2, 0x14, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngepd, 3, 0x66c2, 0x9, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngepd, 3, 0x66c2, 0x9, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpngeps, 3, 0xc2, 0x9, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngeps, 3, 0xc2, 0x9, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpngepd, 3, 0x66c2, 0x9, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpngeps, 3, 0xc2, 0x9, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpngesd, 3, 0xf2c2, 0x9, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpngess, 3, 0xf3c2, 0x9, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnge_uqpd, 3, 0x66c2, 0x19, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnge_uqpd, 3, 0x66c2, 0x19, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpnge_uqps, 3, 0xc2, 0x19, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnge_uqps, 3, 0xc2, 0x19, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpnge_uqpd, 3, 0x66c2, 0x19, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpnge_uqps, 3, 0xc2, 0x19, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpnge_uqsd, 3, 0xf2c2, 0x19, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpnge_uqss, 3, 0xf3c2, 0x19, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngtpd, 3, 0x66c2, 0xa, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngtpd, 3, 0x66c2, 0xa, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpngtps, 3, 0xc2, 0xa, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngtps, 3, 0xc2, 0xa, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpngtpd, 3, 0x66c2, 0xa, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpngtps, 3, 0xc2, 0xa, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpngtsd, 3, 0xf2c2, 0xa, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpngtss, 3, 0xf3c2, 0xa, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngt_uqpd, 3, 0x66c2, 0x1a, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngt_uqpd, 3, 0x66c2, 0x1a, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpngt_uqps, 3, 0xc2, 0x1a, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpngt_uqps, 3, 0xc2, 0x1a, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpngt_uqpd, 3, 0x66c2, 0x1a, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpngt_uqps, 3, 0xc2, 0x1a, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpngt_uqsd, 3, 0xf2c2, 0x1a, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpngt_uqss, 3, 0xf3c2, 0x1a, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnlepd, 3, 0x66c2, 0x6, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnlepd, 3, 0x66c2, 0x6, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpnleps, 3, 0xc2, 0x6, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnleps, 3, 0xc2, 0x6, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpnlepd, 3, 0x66c2, 0x6, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpnleps, 3, 0xc2, 0x6, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpnlesd, 3, 0xf2c2, 0x6, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpnless, 3, 0xf3c2, 0x6, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnle_uqpd, 3, 0x66c2, 0x16, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnle_uqpd, 3, 0x66c2, 0x16, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpnle_uqps, 3, 0xc2, 0x16, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnle_uqps, 3, 0xc2, 0x16, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpnle_uqpd, 3, 0x66c2, 0x16, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpnle_uqps, 3, 0xc2, 0x16, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpnle_uqsd, 3, 0xf2c2, 0x16, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpnle_uqss, 3, 0xf3c2, 0x16, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnltpd, 3, 0x66c2, 0x5, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnltpd, 3, 0x66c2, 0x5, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpnltps, 3, 0xc2, 0x5, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnltps, 3, 0xc2, 0x5, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpnltpd, 3, 0x66c2, 0x5, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpnltps, 3, 0xc2, 0x5, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpnltsd, 3, 0xf2c2, 0x5, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpnltss, 3, 0xf3c2, 0x5, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnlt_uqpd, 3, 0x66c2, 0x15, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnlt_uqpd, 3, 0x66c2, 0x15, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpnlt_uqps, 3, 0xc2, 0x15, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpnlt_uqps, 3, 0xc2, 0x15, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpnlt_uqpd, 3, 0x66c2, 0x15, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpnlt_uqps, 3, 0xc2, 0x15, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpnlt_uqsd, 3, 0xf2c2, 0x15, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpnlt_uqss, 3, 0xf3c2, 0x15, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpordpd, 3, 0x66c2, 0x7, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpordpd, 3, 0x66c2, 0x7, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpordps, 3, 0xc2, 0x7, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpordps, 3, 0xc2, 0x7, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpordpd, 3, 0x66c2, 0x7, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpordps, 3, 0xc2, 0x7, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpordsd, 3, 0xf2c2, 0x7, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpord_spd, 3, 0x66c2, 0x17, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpord_spd, 3, 0x66c2, 0x17, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpord_sps, 3, 0xc2, 0x17, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpord_sps, 3, 0xc2, 0x17, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpord_spd, 3, 0x66c2, 0x17, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpord_sps, 3, 0xc2, 0x17, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpordss, 3, 0xf3c2, 0x7, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpord_ssd, 3, 0xf2c2, 0x17, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpord_sss, 3, 0xf3c2, 0x17, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmppd, 4, 0x66c2, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmppd, 4, 0x66c2, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpps, 4, 0xc2, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpps, 4, 0xc2, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmppd, 4, 0x66c2, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpps, 4, 0xc2, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpsd, 4, 0xf2c2, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpss, 4, 0xf3c2, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmptruepd, 3, 0x66c2, 0xf, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmptruepd, 3, 0x66c2, 0xf, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmptrueps, 3, 0xc2, 0xf, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmptrueps, 3, 0xc2, 0xf, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmptruepd, 3, 0x66c2, 0xf, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmptrueps, 3, 0xc2, 0xf, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmptruesd, 3, 0xf2c2, 0xf, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmptruess, 3, 0xf3c2, 0xf, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmptrue_uspd, 3, 0x66c2, 0x1f, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmptrue_uspd, 3, 0x66c2, 0x1f, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmptrue_usps, 3, 0xc2, 0x1f, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmptrue_usps, 3, 0xc2, 0x1f, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmptrue_uspd, 3, 0x66c2, 0x1f, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmptrue_usps, 3, 0xc2, 0x1f, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmptrue_ussd, 3, 0xf2c2, 0x1f, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmptrue_usss, 3, 0xf3c2, 0x1f, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpunordpd, 3, 0x66c2, 0x3, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpunordpd, 3, 0x66c2, 0x3, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpunordps, 3, 0xc2, 0x3, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpunordps, 3, 0xc2, 0x3, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpunordpd, 3, 0x66c2, 0x3, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpunordps, 3, 0xc2, 0x3, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpunordsd, 3, 0xf2c2, 0x3, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpunord_spd, 3, 0x66c2, 0x13, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpunord_spd, 3, 0x66c2, 0x13, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vcmpunord_sps, 3, 0xc2, 0x13, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcmpunord_sps, 3, 0xc2, 0x13, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vcmpunord_spd, 3, 0x66c2, 0x13, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmpunord_sps, 3, 0xc2, 0x13, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmpunordss, 3, 0xf3c2, 0x3, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpunord_ssd, 3, 0xf2c2, 0x13, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcmpunord_sss, 3, 0xf3c2, 0x13, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
@@ -2068,22 +1989,17 @@ vcomisd, 2, 0x662f, None, 1, CpuAVX, Mod
 vcomiss, 2, 0x2f, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvtdq2pd, 2, 0xf3e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvtdq2pd, 2, 0xf3e6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegYMM }
-vcvtdq2ps, 2, 0x5b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vcvtdq2ps, 2, 0x5b, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vcvtpd2dq, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
-vcvtpd2dq, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Xmmword|BaseIndex, RegXMM }
-vcvtpd2dq, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegXMM }
-vcvtpd2dq, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Ymmword|BaseIndex, RegXMM }
+vcvtdq2ps, 2, 0x5b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vcvtpd2dq, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, RegXMM }
+vcvtpd2dq, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Xmmword|Ymmword|BaseIndex, RegXMM }
 vcvtpd2dqx, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvtpd2dqy, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex|RegYMM, RegXMM }
-vcvtpd2ps, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
+vcvtpd2ps, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, RegXMM }
 vcvtpd2ps, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Xmmword|BaseIndex, RegXMM }
-vcvtpd2ps, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegXMM }
 vcvtpd2ps, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Ymmword|BaseIndex, RegXMM }
 vcvtpd2psx, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvtpd2psy, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex|RegYMM, RegXMM }
-vcvtps2dq, 2, 0x665b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vcvtps2dq, 2, 0x665b, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vcvtps2dq, 2, 0x665b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vcvtps2pd, 2, 0x5a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvtps2pd, 2, 0x5a, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegYMM }
 vcvtsd2si, 2, 0xf22d, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|ToDword, { Qword|Unspecified|BaseIndex|RegXMM, Reg32|Reg64 }
@@ -2094,70 +2010,47 @@ vcvtsi2ss, 3, 0xf32a, None, 1, CpuAVX|Cp
 vcvtsi2ss, 3, 0xf32a, None, 1, CpuAVX|Cpu64, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vcvtss2sd, 3, 0xf35a, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcvtss2si, 2, 0xf32d, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|ToQword, { Dword|Unspecified|BaseIndex|RegXMM, Reg32|Reg64 }
-vcvttpd2dq, 2, 0x66e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
-vcvttpd2dq, 2, 0x66e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Intelsyntax, { Xmmword|BaseIndex, RegXMM }
-vcvttpd2dq, 2, 0x66e6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegXMM }
-vcvttpd2dq, 2, 0x66e6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Intelsyntax, { Ymmword|BaseIndex, RegXMM }
+vcvttpd2dq, 2, 0x66e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, RegXMM }
+vcvttpd2dq, 2, 0x66e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Intelsyntax, { Xmmword|Ymmword|BaseIndex, RegXMM }
 vcvttpd2dqx, 2, 0x66e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTsyntax, { Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvttpd2dqy, 2, 0x66e6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTsyntax, { Unspecified|BaseIndex|RegYMM, RegXMM }
-vcvttps2dq, 2, 0xf35b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vcvttps2dq, 2, 0xf35b, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vcvttps2dq, 2, 0xf35b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vcvttsd2si, 2, 0xf22c, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|ToDword, { Qword|Unspecified|BaseIndex|RegXMM, Reg32|Reg64 }
 vcvttss2si, 2, 0xf32c, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|ToQword, { Dword|Unspecified|BaseIndex|RegXMM, Reg32|Reg64 }
-vdivpd, 3, 0x665e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vdivpd, 3, 0x665e, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vdivps, 3, 0x5e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vdivps, 3, 0x5e, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vdivpd, 3, 0x665e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vdivps, 3, 0x5e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vdivsd, 3, 0xf25e, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vdivss, 3, 0xf35e, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vdppd, 4, 0x6641, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vdpps, 4, 0x6640, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vdpps, 4, 0x6640, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vdpps, 4, 0x6640, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vextractf128, 3, 0x6619, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM, Xmmword|Unspecified|BaseIndex|RegXMM }
 vextractps, 3, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
 vextractps, 3, 0x6617, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg64|RegMem }
-vhaddpd, 3, 0x667c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vhaddpd, 3, 0x667c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vhaddps, 3, 0xf27c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vhaddps, 3, 0xf27c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vhsubpd, 3, 0x667d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vhsubpd, 3, 0x667d, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vhsubps, 3, 0xf27d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vhsubps, 3, 0xf27d, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vhaddpd, 3, 0x667c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vhaddps, 3, 0xf27c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vhsubpd, 3, 0x667d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vhsubps, 3, 0xf27d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vinsertf128, 4, 0x6618, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM }
 vinsertps, 4, 0x6621, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vlddqu, 2, 0xf2f0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegXMM }
-vlddqu, 2, 0xf2f0, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex, RegYMM }
+vlddqu, 2, 0xf2f0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM }
 vldmxcsr, 1, 0xae, 0x2, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex }
 vmaskmovdqu, 2, 0x66f7, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
-vmaskmovpd, 3, 0x662f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, Xmmword|Unspecified|BaseIndex }
-vmaskmovpd, 3, 0x662f, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, Ymmword|Unspecified|BaseIndex }
-vmaskmovpd, 3, 0x662d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vmaskmovpd, 3, 0x662d, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex, RegYMM, RegYMM }
-vmaskmovps, 3, 0x662e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, Xmmword|Unspecified|BaseIndex }
-vmaskmovps, 3, 0x662e, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, Ymmword|Unspecified|BaseIndex }
-vmaskmovps, 3, 0x662c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vmaskmovps, 3, 0x662c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex, RegYMM, RegYMM }
-vmaxpd, 3, 0x665f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vmaxpd, 3, 0x665f, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vmaxps, 3, 0x5f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vmaxps, 3, 0x5f, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vmaskmovpd, 3, 0x662f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
+vmaskmovpd, 3, 0x662d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vmaskmovps, 3, 0x662e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
+vmaskmovps, 3, 0x662c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vmaxpd, 3, 0x665f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vmaxps, 3, 0x5f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vmaxsd, 3, 0xf25f, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vmaxss, 3, 0xf35f, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vminpd, 3, 0x665d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vminpd, 3, 0x665d, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vminps, 3, 0x5d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vminps, 3, 0x5d, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vminpd, 3, 0x665d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vminps, 3, 0x5d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vminsd, 3, 0xf25d, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vminss, 3, 0xf35d, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vmovapd, 2, 0x6628, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovapd, 2, 0x6629, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM }
-vmovapd, 2, 0x6628, None, 1, CpuAVX, Load|Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vmovapd, 2, 0x6629, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM }
-vmovaps, 2, 0x28, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovaps, 2, 0x29, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM }
-vmovaps, 2, 0x28, None, 1, CpuAVX, Load|Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vmovaps, 2, 0x29, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM }
+vmovapd, 2, 0x6628, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vmovapd, 2, 0x6629, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM }
+vmovaps, 2, 0x28, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vmovaps, 2, 0x29, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM }
 // vmovd really shouldn't allow for 64bit operand (vmovq is the right
 // mnemonic for copying between Reg64/Mem64 and RegXMM, as is mandated
 // by Intel AVX spec).  To avoid extra template in gcc x86 backend and
@@ -2169,14 +2062,10 @@ vmovd, 2, 0x667e, None, 1, CpuAVX, Modrm
 vmovd, 2, 0x667e, None, 1, CpuAVX|Cpu64, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { RegXMM, Qword|Reg64|BaseIndex }
 vmovddup, 2, 0xf212, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vmovddup, 2, 0xf212, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vmovdqa, 2, 0x666f, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovdqa, 2, 0x667f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM }
-vmovdqa, 2, 0x666f, None, 1, CpuAVX, Load|Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vmovdqa, 2, 0x667f, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM }
-vmovdqu, 2, 0xf36f, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovdqu, 2, 0xf37f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM }
-vmovdqu, 2, 0xf36f, None, 1, CpuAVX, Load|Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vmovdqu, 2, 0xf37f, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM }
+vmovdqa, 2, 0x666f, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vmovdqa, 2, 0x667f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM }
+vmovdqu, 2, 0xf36f, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vmovdqu, 2, 0xf37f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM }
 vmovhlps, 3, 0x12, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
 vmovhpd, 3, 0x6616, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vmovhpd, 2, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Qword|Unspecified|BaseIndex }
@@ -2187,17 +2076,12 @@ vmovlpd, 3, 0x6612, None, 1, CpuAVX, Mod
 vmovlpd, 2, 0x6613, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Qword|Unspecified|BaseIndex }
 vmovlps, 3, 0x12, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vmovlps, 2, 0x13, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Qword|Unspecified|BaseIndex }
-vmovmskpd, 2, 0x6650, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
-vmovmskpd, 2, 0x6650, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegYMM, Reg32|Reg64 }
-vmovmskps, 2, 0x50, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
-vmovmskps, 2, 0x50, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegYMM, Reg32|Reg64 }
-vmovntdq, 2, 0x66e7, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
-vmovntdq, 2, 0x66e7, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex }
+vmovmskpd, 2, 0x6650, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM|RegYMM, Reg32|Reg64 }
+vmovmskps, 2, 0x50, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM|RegYMM, Reg32|Reg64 }
+vmovntdq, 2, 0x66e7, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
 vmovntdqa, 2, 0x662a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegXMM }
-vmovntpd, 2, 0x662b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
-vmovntpd, 2, 0x662b, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex }
-vmovntps, 2, 0x2b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
-vmovntps, 2, 0x2b, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex }
+vmovntpd, 2, 0x662b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
+vmovntps, 2, 0x2b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
 vmovq, 2, 0xf37e, None, 1, CpuAVX, Load|Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vmovq, 2, 0x66d6, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
 vmovq, 2, 0x666e, None, 1, CpuAVX|Cpu64, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|Qword|Unspecified|BaseIndex, RegXMM }
@@ -2206,33 +2090,23 @@ vmovsd, 2, 0xf211, None, 1, CpuAVX, Modr
 vmovsd, 2, 0xf210, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 vmovsd, 3, 0xf210, None, 1, CpuAVX, Load|Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
 vmovsd, 3, 0xf211, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM|RegMem }
-vmovshdup, 2, 0xf316, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovshdup, 2, 0xf316, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vmovsldup, 2, 0xf312, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovsldup, 2, 0xf312, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vmovshdup, 2, 0xf316, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vmovsldup, 2, 0xf312, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vmovss, 2, 0xf311, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Dword|Unspecified|BaseIndex }
 vmovss, 2, 0xf310, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex, RegXMM }
 vmovss, 3, 0xf310, None, 1, CpuAVX, Load|Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
 vmovss, 3, 0xf311, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM|RegMem }
-vmovupd, 2, 0x6610, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovupd, 2, 0x6611, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM }
-vmovupd, 2, 0x6610, None, 1, CpuAVX, Load|Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vmovupd, 2, 0x6611, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM }
-vmovups, 2, 0x10, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovups, 2, 0x11, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex|RegXMM }
-vmovups, 2, 0x10, None, 1, CpuAVX, Load|Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vmovups, 2, 0x11, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM }
+vmovupd, 2, 0x6610, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vmovupd, 2, 0x6611, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM }
+vmovups, 2, 0x10, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vmovups, 2, 0x11, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM }
 vmpsadbw, 4, 0x6642, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vmulpd, 3, 0x6659, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vmulpd, 3, 0x6659, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vmulps, 3, 0x59, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vmulps, 3, 0x59, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vmulpd, 3, 0x6659, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vmulps, 3, 0x59, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vmulsd, 3, 0xf259, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vmulss, 3, 0xf359, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vorpd, 3, 0x6656, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vorpd, 3, 0x6656, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vorps, 3, 0x56, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vorps, 3, 0x56, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vorpd, 3, 0x6656, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vorps, 3, 0x56, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpabsb, 2, 0x661c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpabsd, 2, 0x661e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpabsw, 2, 0x661d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
@@ -2270,14 +2144,10 @@ vpcmpgtw, 3, 0x6665, None, 1, CpuAVX, Mo
 vpcmpistri, 3, 0x6663, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpistrm, 3, 0x6662, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vperm2f128, 4, 0x6606, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermilpd, 3, 0x660d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpermilpd, 3, 0x660d, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermilpd, 3, 0x6605, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpermilpd, 3, 0x6605, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vpermilps, 3, 0x660c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpermilps, 3, 0x660c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermilps, 3, 0x6604, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpermilps, 3, 0x6604, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vpermilpd, 3, 0x660d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpermilpd, 3, 0x6605, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vpermilps, 3, 0x660c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpermilps, 3, 0x6604, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vpextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64|RegMem }
 vpextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Byte|Unspecified|BaseIndex }
 vpextrd, 3, 0x6616, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
@@ -2367,8 +2237,7 @@ vpsubsw, 3, 0x66e9, None, 1, CpuAVX, Mod
 vpsubusb, 3, 0x66d8, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpsubusw, 3, 0x66d9, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpsubw, 3, 0x66f9, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vptest, 2, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vptest, 2, 0x6617, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vptest, 2, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vpunpckhbw, 3, 0x6668, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpunpckhdq, 3, 0x666a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpunpckhqdq, 3, 0x666d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
@@ -2378,53 +2247,35 @@ vpunpckldq, 3, 0x6662, None, 1, CpuAVX,
 vpunpcklqdq, 3, 0x666c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpunpcklwd, 3, 0x6661, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpxor, 3, 0x66ef, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vrcpps, 2, 0x53, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vrcpps, 2, 0x53, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vrcpps, 2, 0x53, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vrcpss, 3, 0xf353, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vroundpd, 3, 0x6609, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vroundpd, 3, 0x6609, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vroundps, 3, 0x6608, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vroundps, 3, 0x6608, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vroundpd, 3, 0x6609, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vroundps, 3, 0x6608, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vroundsd, 4, 0x660b, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vroundss, 4, 0x660a, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vrsqrtps, 2, 0x52, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vrsqrtps, 2, 0x52, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vrsqrtps, 2, 0x52, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vrsqrtss, 3, 0xf352, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vshufpd, 4, 0x66c6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vshufpd, 4, 0x66c6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vshufps, 4, 0xc6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vshufps, 4, 0xc6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vsqrtpd, 2, 0x6651, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vsqrtpd, 2, 0x6651, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vsqrtps, 2, 0x51, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vsqrtps, 2, 0x51, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vshufpd, 4, 0x66c6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vshufps, 4, 0xc6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vsqrtpd, 2, 0x6651, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vsqrtps, 2, 0x51, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vsqrtsd, 3, 0xf251, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vsqrtss, 3, 0xf351, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vstmxcsr, 1, 0xae, 0x3, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex }
-vsubpd, 3, 0x665c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vsubpd, 3, 0x665c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vsubps, 3, 0x5c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vsubps, 3, 0x5c, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vsubpd, 3, 0x665c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vsubps, 3, 0x5c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vsubsd, 3, 0xf25c, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vsubss, 3, 0xf35c, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vtestpd, 2, 0x660f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vtestpd, 2, 0x660f, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
-vtestps, 2, 0x660e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vtestps, 2, 0x660e, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
+vtestpd, 2, 0x660f, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vtestps, 2, 0x660e, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vucomisd, 2, 0x662e, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vucomiss, 2, 0x2e, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vunpckhpd, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vunpckhpd, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vunpckhps, 3, 0x15, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vunpckhps, 3, 0x15, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vunpcklpd, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vunpcklpd, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vunpcklps, 3, 0x14, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vunpcklps, 3, 0x14, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vxorpd, 3, 0x6657, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vxorpd, 3, 0x6657, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vxorps, 3, 0x57, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vxorps, 3, 0x57, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vunpckhpd, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vunpckhps, 3, 0x15, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vunpcklpd, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vunpcklps, 3, 0x14, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vxorpd, 3, 0x6657, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vxorps, 3, 0x57, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vzeroall, 0, 0x77, None, 1, CpuAVX, Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 vzeroupper, 0, 0x77, None, 1, CpuAVX, Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 
@@ -2551,18 +2402,12 @@ vpxor, 3, 0x66ef, None, 1, CpuAVX2, Modr
 
 vbroadcasti128, 2, 0x665A, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 2, 0x6619, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegYMM }
-vbroadcastss, 2, 0x6618, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
-vbroadcastss, 2, 0x6618, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegYMM }
-vpblendd, 4, 0x6602, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpblendd, 4, 0x6602, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpbroadcastb, 2, 0x6678, None, 1, CpuAVX2, Modrm|Vex=1|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Byte|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpbroadcastb, 2, 0x6678, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Byte|Unspecified|BaseIndex|RegXMM, RegYMM }
-vpbroadcastd, 2, 0x6658, None, 1, CpuAVX2, Modrm|Vex=1|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpbroadcastd, 2, 0x6658, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegYMM }
-vpbroadcastq, 2, 0x6659, None, 1, CpuAVX2, Modrm|Vex=1|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { QWord|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpbroadcastq, 2, 0x6659, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { QWord|Unspecified|BaseIndex|RegXMM, RegYMM }
-vpbroadcastw, 2, 0x6679, None, 1, CpuAVX2, Modrm|Vex=1|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpbroadcastw, 2, 0x6679, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Word|Unspecified|BaseIndex|RegXMM, RegYMM }
+vbroadcastss, 2, 0x6618, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM|RegYMM }
+vpblendd, 4, 0x6602, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpbroadcastb, 2, 0x6678, None, 1, CpuAVX2, Modrm|Vex=1|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Byte|Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
+vpbroadcastd, 2, 0x6658, None, 1, CpuAVX2, Modrm|Vex=1|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
+vpbroadcastq, 2, 0x6659, None, 1, CpuAVX2, Modrm|Vex=1|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { QWord|Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
+vpbroadcastw, 2, 0x6679, None, 1, CpuAVX2, Modrm|Vex=1|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
 vperm2i128, 4, 0x6646, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpermd, 3, 0x6636, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpermpd, 3, 0x6601, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
@@ -2570,24 +2415,15 @@ vpermps, 3, 0x6616, None, 1, CpuAVX2, Mo
 vpermq, 3, 0x6600, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM }
 vextracti128, 3, 0x6639, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM, Xmmword|Unspecified|BaseIndex|RegXMM }
 vinserti128, 4, 0x6638, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM }
-vpmaskmovd, 3, 0x668e, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, Xmmword|Unspecified|BaseIndex }
-vpmaskmovd, 3, 0x668e, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, Ymmword|Unspecified|BaseIndex }
-vpmaskmovd, 3, 0x668c, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmaskmovd, 3, 0x668c, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex, RegYMM, RegYMM }
-vpmaskmovq, 3, 0x668e, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, Xmmword|Unspecified|BaseIndex }
-vpmaskmovq, 3, 0x668e, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, Ymmword|Unspecified|BaseIndex }
-vpmaskmovq, 3, 0x668c, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmaskmovq, 3, 0x668c, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex, RegYMM, RegYMM }
-vpsllvd, 3, 0x6647, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpsllvd, 3, 0x6647, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpsllvq, 3, 0x6647, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpsllvq, 3, 0x6647, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpsravd, 3, 0x6646, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpsravd, 3, 0x6646, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpsrlvd, 3, 0x6645, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpsrlvd, 3, 0x6645, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpsrlvq, 3, 0x6645, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpsrlvq, 3, 0x6645, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vpmaskmovd, 3, 0x668e, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
+vpmaskmovd, 3, 0x668c, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmaskmovq, 3, 0x668e, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
+vpmaskmovq, 3, 0x668c, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsllvd, 3, 0x6647, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsllvq, 3, 0x6647, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsravd, 3, 0x6646, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrlvd, 3, 0x6645, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrlvq, 3, 0x6645, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // AVX gather instructions
 vgatherdpd, 3, 0x6692, None, 1, CpuAVX2, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VecSIB=1, { RegXMM, Qword|Unspecified|BaseIndex, RegXMM }


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] x86: replace Reg8, Reg16, Reg32, and Reg64
  2017-12-15 10:32 ` [PATCH 1/4] x86: replace Reg8, Reg16, Reg32, and Reg64 Jan Beulich
@ 2017-12-15 12:46   ` H.J. Lu
  0 siblings, 0 replies; 20+ messages in thread
From: H.J. Lu @ 2017-12-15 12:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Fri, Dec 15, 2017 at 2:32 AM, Jan Beulich <JBeulich@suse.com> wrote:
> Use a combination of a single new Reg bit and Byte, Word, Dword, or
> Qword instead.
>
> Besides shrinking the number of operand type bits this has the benefit
> of making register handling more similar to accumulator handling (a
> generic flag is being accompanied by a "size qualifier"). It requires,
> however, to split a few insn templates, as it is no longer correct to
> have combinations like Reg32|Reg64|Byte. This slight growth in size will
> hopefully be outweighed by this change paving the road for folding a
> presumably much larger number of templates later on.
>
> gas/
> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (operand_type_check, pi): Switch .reg<N> to
>         just .reg.
>         (operand_size_match): Qualify .anysize check with .reg one.
>         Extend .acc check to also cover .reg.
>         (operand_type_register_match): Drop m0 and m1 parameters. Switch
>         .reg<N> to .byte/.word/.dword/.qword. Drop .acc special
>         handling.
>         (md_assemble): Expand .reg8 checks to .reg plus .bytes ones.
>         (optimize_imm, process_suffix, check_byte_reg, check_long_reg,
>         check_qword_reg, check_word_reg): Expand .reg<N> checks to .reg
>         plus size ones.
>         (match_template): Drop arguments from calls to
>         operand_type_register_match().
>         (build_modrm_byte, i386_addressing_mode, i386_index_check,
>         parse_real_register): Replace .reg<N> checks.
>         * config/tc-i386-intel.c (i386_intel_simplify,
>         i386_intel_operand): Switch .reg16 to .word.
>
> opcodes/
> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-gen.c (operand_type_shorthands): New.
>         (opcode_modifiers): Replace Reg<N> with just Reg.
>         (set_bitfield_from_cpu_flag_init): Rename to
>         set_bitfield_from_shorthand. Drop value parameter. Process
>         operand_type_shorthands.
>         (set_bitfield): Adjust call accordingly.
>         * i386-opc.h (enum of operand types): Replace Reg<N> with just
>         Reg.
>         (union i386_operand_type): Replace reg<N> with just reg.
>         * i386-opc.tbl (extractps, pextrb, pextrw, pinsrb, pinsrw,
>         vextractps, vpextrb, vpextrw, vpinsrb, vpinsrw): Split into
>         separate register and memory forms.
>         * i386-reg.tbl (al): Drop Byte.
>         (ax): Drop Word.
>         (eax): Drop Dword.
>         (rax): Drop Qword.
>         * i386-init.h, i386-tbl.h: Re-generate.
>

OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] x86: drop FloatReg and FloatAcc
  2017-12-15 10:33 ` [PATCH 2/4] x86: drop FloatReg and FloatAcc Jan Beulich
@ 2017-12-15 12:47   ` H.J. Lu
  0 siblings, 0 replies; 20+ messages in thread
From: H.J. Lu @ 2017-12-15 12:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Fri, Dec 15, 2017 at 2:33 AM, Jan Beulich <JBeulich@suse.com> wrote:
> Express them as Reg|Tbyte and Acc|Tbyte respectively.
>
> gas/
> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (operand_type_check): Extend comment.
>         (match_reg_size): Also check .tbyte.
>         (match_mem_size): No longer check .tbyte here.
>         (md_assemble): Drop .floatacc check.
>         (check_byte_reg): Drop .floatreg and .floatacc checks.
>         (process_operands, parse_real_register): Replace .floatreg
>         check.
>
> opcodes/
> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-gen.c (operand_type_shorthands): Add FloatAcc and
>         FloatReg.
>         (operand_types): Drop FloatAcc and FloatReg.
>         * i386-opc.h (enum of operand types): Likewise. Extend comment.
>         (union i386_operand_type): Drop floatacc and floatreg.
>         * i386-reg.tbl (st, st(0)): Replace FloatAcc by Acc.
>         * i386-init.h, i386-tbl.h: Re-generate.
>

OK.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD
  2017-12-15 10:34 ` [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD Jan Beulich
@ 2017-12-15 12:50   ` H.J. Lu
  2017-12-15 16:22     ` Jan Beulich
  0 siblings, 1 reply; 20+ messages in thread
From: H.J. Lu @ 2017-12-15 12:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Fri, Dec 15, 2017 at 2:34 AM, Jan Beulich <JBeulich@suse.com> wrote:
> ... qualified by their respective sizes, allowing to drop FirstXmm0 at
> the same time.
>
> gas/
> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (match_simd_size): New.
>         (match_mem_size): Use it.
>         (operand_size_match): Likewise. Split .reg and .acc checks.
>         (pi, check_VecOperands, match_template, check_byte_reg,
>         check_long_reg, check_qword_reg, build_modrm_byte,
>         parse_real_register): Replace .regxmm, .regymm, and .regzmm
>         checks.
>         (md_assemble): Qualify .acc check with .xmmword one.
>         (bad_implicit_operand): Delete.
>         (process_operands): Replace .firstxmm0 checks with .acc plus
>         .xmmword ones. Drop now pointless assertions. Convert .acc to
>         .regsimd.
>         * config/tc-i386-intel.c (i386_intel_simplify_register): Replace
>         .regxmm, .regymm, and .regzmm checks.
>         * testsuite/gas/i386/x86-64-specific-reg.l: Adjust expectations.
>
> opcodes/
> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-gen.c (operand_type_shorthands): Add RegXMM, RegYMM, and
>         RegZMM.
>         (opcode_modifiers): Drop FirstXmm0.
>         (operand_types): Replace RegXMM, RegYMM, and RegZMM with just
>         RegSIMD.
>         * i386-opc.h (enum of opcode modifiers): Drop FirstXmm0.
>         (struct i386_opcode_modifier): Drop firstxmm0.
>         (enum of operand types): Replace RegXMM, RegYMM, and RegZMM with
>         just RegSIMD. Extend comment.
>         (union i386_operand_type): Replace regxmm, regymm, and regzmm
>         with just regsimd.
>         * i386-opc.tbl (blendvpd, blendvps, pblendvb, sha256rnds2): Use
>         Acc|Xmmword.
>         * i386-reg.tbl (xmm0): Add Acc.
>         * i386-init.h, i386-tbl.h: Re-generate.
>
> --- a/gas/config/tc-i386-intel.c
> +++ b/gas/config/tc-i386-intel.c
> @@ -286,9 +286,9 @@ i386_intel_simplify_register (expression
>        i.op[this_operand].regs = i386_regtab + reg_num;
>      }
>    else if (!intel_state.index
> -          && (i386_regtab[reg_num].reg_type.bitfield.regxmm
> -              || i386_regtab[reg_num].reg_type.bitfield.regymm
> -              || i386_regtab[reg_num].reg_type.bitfield.regzmm
> +          && (i386_regtab[reg_num].reg_type.bitfield.xmmword
> +              || i386_regtab[reg_num].reg_type.bitfield.ymmword
> +              || i386_regtab[reg_num].reg_type.bitfield.zmmword
>                || i386_regtab[reg_num].reg_num == RegRiz
>                || i386_regtab[reg_num].reg_num == RegEiz))
>      intel_state.index = i386_regtab + reg_num;
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -1825,6 +1825,20 @@ match_reg_size (const insn_template *t,
>                && !t->operand_types[j].bitfield.tbyte));
>  }
>
> +/* Return 1 if there is no conflict in SIMD register on
> +   operand J for instruction template T.  */
> +
> +static INLINE int
> +match_simd_size (const insn_template *t, unsigned int j)
> +{
> +  return !((i.types[j].bitfield.xmmword
> +           && !t->operand_types[j].bitfield.xmmword)
> +          || (i.types[j].bitfield.ymmword
> +              && !t->operand_types[j].bitfield.ymmword)
> +          || (i.types[j].bitfield.zmmword
> +              && !t->operand_types[j].bitfield.zmmword));
> +}
> +
>  /* Return 1 if there is no conflict in any size on operand J for
>     instruction template T.  */
>
> @@ -1837,12 +1851,17 @@ match_mem_size (const insn_template *t,
>                 && !t->operand_types[j].bitfield.unspecified)
>                || (i.types[j].bitfield.fword
>                    && !t->operand_types[j].bitfield.fword)
> -              || (i.types[j].bitfield.xmmword
> -                  && !t->operand_types[j].bitfield.xmmword)
> -              || (i.types[j].bitfield.ymmword
> -                  && !t->operand_types[j].bitfield.ymmword)
> -              || (i.types[j].bitfield.zmmword
> -                  && !t->operand_types[j].bitfield.zmmword)));
> +              /* For scalar opcode templates to allow register and memory
> +                 operands at the same time, some special casing is needed
> +                 here.  */
> +              || ((t->operand_types[j].bitfield.regsimd
> +                   && !t->opcode_modifier.broadcast
> +                   && (t->operand_types[j].bitfield.dword
> +                       || t->operand_types[j].bitfield.qword))
> +                  ? (i.types[j].bitfield.xmmword
> +                     || i.types[j].bitfield.ymmword
> +                     || i.types[j].bitfield.zmmword)
> +                  : !match_simd_size(t, j))));
>  }
>
>  /* Return 1 if there is no size conflict on any operands for
> @@ -1864,17 +1883,31 @@ operand_size_match (const insn_template
>    /* Check memory and accumulator operand size.  */
>    for (j = 0; j < i.operands; j++)
>      {
> -      if (!i.types[j].bitfield.reg && t->operand_types[j].bitfield.anysize)
> +      if (!i.types[j].bitfield.reg && !i.types[j].bitfield.regsimd
> +         && t->operand_types[j].bitfield.anysize)
>         continue;
>
> -      if ((t->operand_types[j].bitfield.reg
> -          || t->operand_types[j].bitfield.acc)
> +      if (t->operand_types[j].bitfield.reg
>           && !match_reg_size (t, j))
>         {
>           match = 0;
>           break;
>         }
>
> +      if (t->operand_types[j].bitfield.regsimd
> +         && !match_simd_size (t, j))
> +       {
> +         match = 0;
> +         break;
> +       }
> +
> +      if (t->operand_types[j].bitfield.acc
> +         && (!match_reg_size (t, j) || !match_simd_size (t, j)))
> +       {
> +         match = 0;
> +         break;
> +       }
> +
>        if (i.types[j].bitfield.mem && !match_mem_size (t, j))
>         {
>           match = 0;
> @@ -2804,9 +2837,7 @@ pi (char *line, i386_insn *x)
>        fprintf (stdout, "\n");
>        if (x->types[j].bitfield.reg
>           || x->types[j].bitfield.regmmx
> -         || x->types[j].bitfield.regxmm
> -         || x->types[j].bitfield.regymm
> -         || x->types[j].bitfield.regzmm
> +         || x->types[j].bitfield.regsimd
>           || x->types[j].bitfield.sreg2
>           || x->types[j].bitfield.sreg3
>           || x->types[j].bitfield.control
> @@ -3768,7 +3799,7 @@ md_assemble (char *line)
>      for (j = 0; j < i.operands; j++)
>        if (i.types[j].bitfield.inoutportreg
>           || i.types[j].bitfield.shiftcount
> -         || i.types[j].bitfield.acc)
> +         || (i.types[j].bitfield.acc && !i.types[j].bitfield.xmmword))
>         i.reg_operands--;
>
>    /* ImmExt should be processed after SSE2AVX.  */
> @@ -4576,9 +4607,9 @@ check_VecOperands (const insn_template *
>    /* Without VSIB byte, we can't have a vector register for index.  */
>    if (!t->opcode_modifier.vecsib
>        && i.index_reg
> -      && (i.index_reg->reg_type.bitfield.regxmm
> -         || i.index_reg->reg_type.bitfield.regymm
> -         || i.index_reg->reg_type.bitfield.regzmm))
> +      && (i.index_reg->reg_type.bitfield.xmmword
> +         || i.index_reg->reg_type.bitfield.ymmword
> +         || i.index_reg->reg_type.bitfield.zmmword))
>      {
>        i.error = unsupported_vector_index_register;
>        return 1;
> @@ -4598,11 +4629,11 @@ check_VecOperands (const insn_template *
>      {
>        if (!i.index_reg
>           || !((t->opcode_modifier.vecsib == VecSIB128
> -               && i.index_reg->reg_type.bitfield.regxmm)
> +               && i.index_reg->reg_type.bitfield.xmmword)
>                || (t->opcode_modifier.vecsib == VecSIB256
> -                  && i.index_reg->reg_type.bitfield.regymm)
> +                  && i.index_reg->reg_type.bitfield.ymmword)
>                || (t->opcode_modifier.vecsib == VecSIB512
> -                  && i.index_reg->reg_type.bitfield.regzmm)))
> +                  && i.index_reg->reg_type.bitfield.zmmword)))
>        {
>         i.error = invalid_vsib_address;
>         return 1;
> @@ -4611,10 +4642,12 @@ check_VecOperands (const insn_template *
>        gas_assert (i.reg_operands == 2 || i.mask);
>        if (i.reg_operands == 2 && !i.mask)
>         {
> -         gas_assert (i.types[0].bitfield.regxmm
> -                     || i.types[0].bitfield.regymm);
> -         gas_assert (i.types[2].bitfield.regxmm
> -                     || i.types[2].bitfield.regymm);
> +         gas_assert (i.types[0].bitfield.regsimd);
> +         gas_assert (i.types[0].bitfield.xmmword
> +                     || i.types[0].bitfield.ymmword);
> +         gas_assert (i.types[2].bitfield.regsimd);
> +         gas_assert (i.types[2].bitfield.xmmword
> +                     || i.types[2].bitfield.ymmword);
>           if (operand_check == check_none)
>             return 0;
>           if (register_number (i.op[0].regs)
> @@ -4633,9 +4666,10 @@ check_VecOperands (const insn_template *
>         }
>        else if (i.reg_operands == 1 && i.mask)
>         {
> -         if ((i.types[1].bitfield.regxmm
> -              || i.types[1].bitfield.regymm
> -              || i.types[1].bitfield.regzmm)
> +         if (i.types[1].bitfield.regsimd
> +             && (i.types[1].bitfield.xmmword
> +                 || i.types[1].bitfield.ymmword
> +                 || i.types[1].bitfield.zmmword)
>               && (register_number (i.op[1].regs)
>                   == register_number (i.index_reg)))
>             {
> @@ -4941,13 +4975,9 @@ match_template (char mnem_suffix)
>                  && !intel_float_operand (t->name))
>               : intel_float_operand (t->name) != 2)
>           && ((!operand_types[0].bitfield.regmmx
> -              && !operand_types[0].bitfield.regxmm
> -              && !operand_types[0].bitfield.regymm
> -              && !operand_types[0].bitfield.regzmm)
> +              && !operand_types[0].bitfield.regsimd)
>               || (!operand_types[t->operands > 1].bitfield.regmmx
> -                 && !operand_types[t->operands > 1].bitfield.regxmm
> -                 && !operand_types[t->operands > 1].bitfield.regymm
> -                 && !operand_types[t->operands > 1].bitfield.regzmm))
> +                 && !operand_types[t->operands > 1].bitfield.regsimd))
>           && (t->base_opcode != 0x0fc7
>               || t->extension_opcode != 1 /* cmpxchg8b */))
>         continue;
> @@ -4960,9 +4990,9 @@ match_template (char mnem_suffix)
>                       && !intel_float_operand (t->name))
>                    : intel_float_operand (t->name) != 2)
>                && ((!operand_types[0].bitfield.regmmx
> -                   && !operand_types[0].bitfield.regxmm)
> +                   && !operand_types[0].bitfield.regsimd)
>                    || (!operand_types[t->operands > 1].bitfield.regmmx
> -                      && !operand_types[t->operands > 1].bitfield.regxmm)))
> +                      && !operand_types[t->operands > 1].bitfield.regsimd)))
>         continue;
>
>        /* Do not verify operands when there are none.  */
> @@ -5653,9 +5683,7 @@ check_byte_reg (void)
>        /* Any other register is bad.  */
>        if (i.types[op].bitfield.reg
>           || i.types[op].bitfield.regmmx
> -         || i.types[op].bitfield.regxmm
> -         || i.types[op].bitfield.regymm
> -         || i.types[op].bitfield.regzmm
> +         || i.types[op].bitfield.regsimd
>           || i.types[op].bitfield.sreg2
>           || i.types[op].bitfield.sreg3
>           || i.types[op].bitfield.control
> @@ -5728,7 +5756,7 @@ check_long_reg (void)
>        {
>         if (intel_syntax
>             && i.tm.opcode_modifier.toqword
> -           && !i.types[0].bitfield.regxmm)
> +           && !i.types[0].bitfield.regsimd)
>           {
>             /* Convert to QWORD.  We want REX byte. */
>             i.suffix = QWORD_MNEM_SUFFIX;
> @@ -5779,7 +5807,7 @@ check_qword_reg (void)
>            lowering is more complicated.  */
>         if (intel_syntax
>             && i.tm.opcode_modifier.todword
> -           && !i.types[0].bitfield.regxmm)
> +           && !i.types[0].bitfield.regsimd)
>           {
>             /* Convert to DWORD.  We don't want REX byte. */
>             i.suffix = LONG_MNEM_SUFFIX;
> @@ -5930,20 +5958,6 @@ finalize_imm (void)
>  }
>
>  static int
> -bad_implicit_operand (int xmm)
> -{
> -  const char *ireg = xmm ? "xmm0" : "ymm0";
> -
> -  if (intel_syntax)
> -    as_bad (_("the last operand of `%s' must be `%s%s'"),
> -           i.tm.name, register_prefix, ireg);
> -  else
> -    as_bad (_("the first operand of `%s' must be `%s%s'"),
> -           i.tm.name, register_prefix, ireg);
> -  return 0;
> -}
> -

Will we miss the assembly operand error checking?

-- 
H.J.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] x86: fold certain AVX and AVX2 templates
  2017-12-15 10:35 ` [PATCH 4/4] x86: fold certain AVX and AVX2 templates Jan Beulich
@ 2017-12-15 13:10   ` H.J. Lu
  2017-12-15 16:32     ` Jan Beulich
  2017-12-21 10:07   ` Jan Beulich
  2017-12-22  8:54   ` Alan Modra
  2 siblings, 1 reply; 20+ messages in thread
From: H.J. Lu @ 2017-12-15 13:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Fri, Dec 15, 2017 at 2:35 AM, Jan Beulich <JBeulich@suse.com> wrote:
> Just like for instructions in GPRs, there's no need to have separate
> templates for otherwise identical insns acting on XMM or YMM registers
> (or memory of the same size).
>
> gas/
> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (regymm, regzmm): Delete.
>         (operand_type_register_match). Extend comment. Also handle some
>         memory operands here. Extend to cover .regsimd.
>         (build_vex_prefix): Derive vector_length from actual operand
>         size.
>         (process_operands, build_modrm_byte): Use .regsimd.
>
> opcodes/
> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-gen.c (operand_type_init): Delete OPERAND_TYPE_REGYMM and
>         OPERAND_TYPE_REGZMM entries.
>         * i386-opc.h (enum of opcode modifiers): Extend comment.
>         i386-opc.tbl (vaddpd, vaddps, vaddsubpd, vaddsubps, vandnpd,
>         vandnps, vandpd, vandps, vblendpd, vblendps, vblendvpd,
>         vblendvps, vbroadcastss, vcmpeq_ospd, vcmpeq_osps, vcmpeqpd,
>         vcmpeqps, vcmpeq_uqpd, vcmpeq_uqps, vcmpeq_uspd, vcmpeq_usps,
>         vcmpfalse_ospd, vcmpfalse_osps, vcmpfalsepd, vcmpfalseps,
>         vcmpge_oqpd, vcmpge_oqps, vcmpgepd, vcmpgeps, vcmpgt_oqpd,
>         vcmpgt_oqps, vcmpgtpd, vcmpgtps, vcmple_oqpd, vcmple_oqps,
>         vcmplepd, vcmpleps, vcmplt_oqpd, vcmplt_oqps, vcmpltpd,
>         vcmpltps, vcmpneq_oqpd, vcmpneq_oqps, vcmpneq_ospd,
>         vcmpneq_osps, vcmpneqpd, vcmpneqps, vcmpneq_uspd, vcmpneq_usps,
>         vcmpngepd, vcmpngeps, vcmpnge_uqpd, vcmpnge_uqps, vcmpngtpd,
>         vcmpngtps, vcmpngt_uqpd, vcmpngt_uqps, vcmpnlepd, vcmpnleps,
>         vcmpnle_uqpd, vcmpnle_uqps, vcmpnltpd, vcmpnltps, vcmpnlt_uqpd,
>         vcmpnlt_uqps, vcmpordpd, vcmpordps, vcmpord_spd, vcmpord_sps,
>         vcmppd, vcmpps, vcmptruepd, vcmptrueps, vcmptrue_uspd,
>         vcmptrue_usps, vcmpunordpd, vcmpunordps, vcmpunord_spd,
>         vcmpunord_sps, vcvtdq2ps, vcvtpd2dq, vcvtpd2ps, vcvtps2dq,
>         vcvttpd2dq, vcvttps2dq, vdivpd, vdivps, vdpps, vhaddpd, vhaddps,
>         vhsubpd, vhsubps, vlddqu, vmaskmovpd, vmaskmovps, vmaxpd,
>         vmaxps, vminpd, vminps, vmovapd, vmovaps, vmovdqa, vmovdqu,
>         vmovmskpd, vmovmskps, vmovntdq, vmovntpd, vmovntps, vmovshdup,
>         vmovsldup, vmovupd, vmovups, vmulpd, vmulps, vorpd, vorps,
>         vpermilpd, vpermilps, vptest, vrcpps, vroundpd, vroundps,
>         vrsqrtps, vshufpd, vshufps, vsqrtpd, vsqrtps, vsubpd, vsubps,
>         vtestpd, vtestps, vunpckhpd, vunpckhps, vunpcklpd, vunpcklps,
>         vxorpd, vxorps, vpblendd, vpbroadcastb, vpbroadcastd,
>         vpbroadcastw, vpbroadcastq, vpmaskmovd, vpmaskmovq, vpsllvd,
>         vpsllvq, vpsravd, vpsravq, vpsrlvd, vpsrlvq): Fold 128- and
>         256-bit forms. Use CheckRegSize instead of IgnoreSize where
>         appropriate. Drop Xmmword and Ymmword from the results where
>         possible.
>         * i386-tbl.h: Re-generate.
> ---
> For some yet to be understood reason folding the memory forms of
> vcvtpd2ps doesn't work (some Intel mode ymmword ptr forms produce
> 128-bit insns).

Integer extension instructions also take 2 register operands of different
sizes.  How are they handled?

> Similarly I didn't figure out yet the reason for an anomaly when the
> "unspecified" checks in operand_type_register_match() are missing: In
> that case I've observed errors on vaddsubp{s,d}, but not on e.g.
> vaddp{s,d} with identical operands.

Please open a bug with a testcase.

-- 
H.J.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD
  2017-12-15 12:50   ` H.J. Lu
@ 2017-12-15 16:22     ` Jan Beulich
  2017-12-15 16:28       ` H.J. Lu
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2017-12-15 16:22 UTC (permalink / raw)
  To: hjl.tools; +Cc: binutils

>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 1:50 PM >>>
>On Fri, Dec 15, 2017 at 2:34 AM, Jan Beulich <JBeulich@suse.com> wrote:
>> @@ -5930,20 +5958,6 @@ finalize_imm (void)
>>  }
>>
>>  static int
>> -bad_implicit_operand (int xmm)
>> -{
>> -  const char *ireg = xmm ? "xmm0" : "ymm0";
>> -
>> -  if (intel_syntax)
>> -    as_bad (_("the last operand of `%s' must be `%s%s'"),
>> -           i.tm.name, register_prefix, ireg);
>> -  else
>> -    as_bad (_("the first operand of `%s' must be `%s%s'"),
>> -           i.tm.name, register_prefix, ireg);
>> -  return 0;
>> -}
>> -
>
>Will we miss the assembly operand error checking?

Not sure I understand what you mean. As the altered test case shows, error
messages are still present, just that they don't mention %xmm0 anymore. If
you mean something else, please explain. But keep in mind that things are
now more similar to GPR accumulator (i.e. %al etc) or FPU (%st(0)) handling,
where there is no such problem either. The template now requires %xmm0 to
be used.

Jan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD
  2017-12-15 16:22     ` Jan Beulich
@ 2017-12-15 16:28       ` H.J. Lu
  2017-12-15 16:36         ` Jan Beulich
  0 siblings, 1 reply; 20+ messages in thread
From: H.J. Lu @ 2017-12-15 16:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Fri, Dec 15, 2017 at 8:22 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 1:50 PM >>>
>>On Fri, Dec 15, 2017 at 2:34 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>> @@ -5930,20 +5958,6 @@ finalize_imm (void)
>>>  }
>>>
>>>  static int
>>> -bad_implicit_operand (int xmm)
>>> -{
>>> -  const char *ireg = xmm ? "xmm0" : "ymm0";
>>> -
>>> -  if (intel_syntax)
>>> -    as_bad (_("the last operand of `%s' must be `%s%s'"),
>>> -           i.tm.name, register_prefix, ireg);
>>> -  else
>>> -    as_bad (_("the first operand of `%s' must be `%s%s'"),
>>> -           i.tm.name, register_prefix, ireg);
>>> -  return 0;
>>> -}
>>> -
>>
>>Will we miss the assembly operand error checking?
>
> Not sure I understand what you mean. As the altered test case shows, error
> messages are still present, just that they don't mention %xmm0 anymore. If
> you mean something else, please explain. But keep in mind that things are
> now more similar to GPR accumulator (i.e. %al etc) or FPU (%st(0)) handling,
> where there is no such problem either. The template now requires %xmm0 to
> be used.

It is hard to tell what the error message is from

*:[0-9]*: Error: .*blendvpd.*

We used to get

The last operand of ....

What do we get now instead?

-- 
H.J.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] x86: fold certain AVX and AVX2 templates
  2017-12-15 13:10   ` H.J. Lu
@ 2017-12-15 16:32     ` Jan Beulich
  2017-12-15 16:49       ` H.J. Lu
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2017-12-15 16:32 UTC (permalink / raw)
  To: hjl.tools; +Cc: binutils

>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 2:10 PM >>>
>On Fri, Dec 15, 2017 at 2:35 AM, Jan Beulich <JBeulich@suse.com> wrote:
>> Just like for instructions in GPRs, there's no need to have separate
>> templates for otherwise identical insns acting on XMM or YMM registers
>> (or memory of the same size).
>>
>> gas/
>> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>>
>>         * config/tc-i386.c (regymm, regzmm): Delete.
>>         (operand_type_register_match). Extend comment. Also handle some
>>         memory operands here. Extend to cover .regsimd.
>>         (build_vex_prefix): Derive vector_length from actual operand
>>         size.
>>         (process_operands, build_modrm_byte): Use .regsimd.
>>
>> opcodes/
>> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>>
>>         * i386-gen.c (operand_type_init): Delete OPERAND_TYPE_REGYMM and
>>         OPERAND_TYPE_REGZMM entries.
>>         * i386-opc.h (enum of opcode modifiers): Extend comment.
>>         i386-opc.tbl (vaddpd, vaddps, vaddsubpd, vaddsubps, vandnpd,
>>         vandnps, vandpd, vandps, vblendpd, vblendps, vblendvpd,
>>         vblendvps, vbroadcastss, vcmpeq_ospd, vcmpeq_osps, vcmpeqpd,
>>         vcmpeqps, vcmpeq_uqpd, vcmpeq_uqps, vcmpeq_uspd, vcmpeq_usps,
>>         vcmpfalse_ospd, vcmpfalse_osps, vcmpfalsepd, vcmpfalseps,
>>         vcmpge_oqpd, vcmpge_oqps, vcmpgepd, vcmpgeps, vcmpgt_oqpd,
>>         vcmpgt_oqps, vcmpgtpd, vcmpgtps, vcmple_oqpd, vcmple_oqps,
>>         vcmplepd, vcmpleps, vcmplt_oqpd, vcmplt_oqps, vcmpltpd,
>>         vcmpltps, vcmpneq_oqpd, vcmpneq_oqps, vcmpneq_ospd,
>>         vcmpneq_osps, vcmpneqpd, vcmpneqps, vcmpneq_uspd, vcmpneq_usps,
>>         vcmpngepd, vcmpngeps, vcmpnge_uqpd, vcmpnge_uqps, vcmpngtpd,
>>         vcmpngtps, vcmpngt_uqpd, vcmpngt_uqps, vcmpnlepd, vcmpnleps,
>>         vcmpnle_uqpd, vcmpnle_uqps, vcmpnltpd, vcmpnltps, vcmpnlt_uqpd,
>>         vcmpnlt_uqps, vcmpordpd, vcmpordps, vcmpord_spd, vcmpord_sps,
>>         vcmppd, vcmpps, vcmptruepd, vcmptrueps, vcmptrue_uspd,
>>         vcmptrue_usps, vcmpunordpd, vcmpunordps, vcmpunord_spd,
>>         vcmpunord_sps, vcvtdq2ps, vcvtpd2dq, vcvtpd2ps, vcvtps2dq,
>>         vcvttpd2dq, vcvttps2dq, vdivpd, vdivps, vdpps, vhaddpd, vhaddps,
>>         vhsubpd, vhsubps, vlddqu, vmaskmovpd, vmaskmovps, vmaxpd,
>>         vmaxps, vminpd, vminps, vmovapd, vmovaps, vmovdqa, vmovdqu,
>>         vmovmskpd, vmovmskps, vmovntdq, vmovntpd, vmovntps, vmovshdup,
>>         vmovsldup, vmovupd, vmovups, vmulpd, vmulps, vorpd, vorps,
>>         vpermilpd, vpermilps, vptest, vrcpps, vroundpd, vroundps,
>>         vrsqrtps, vshufpd, vshufps, vsqrtpd, vsqrtps, vsubpd, vsubps,
>>         vtestpd, vtestps, vunpckhpd, vunpckhps, vunpcklpd, vunpcklps,
>>         vxorpd, vxorps, vpblendd, vpbroadcastb, vpbroadcastd,
>>         vpbroadcastw, vpbroadcastq, vpmaskmovd, vpmaskmovq, vpsllvd,
>>         vpsllvq, vpsravd, vpsravq, vpsrlvd, vpsrlvq): Fold 128- and
>>         256-bit forms. Use CheckRegSize instead of IgnoreSize where
>>         appropriate. Drop Xmmword and Ymmword from the results where
>>         possible.
>>         * i386-tbl.h: Re-generate.
>> ---
>> For some yet to be understood reason folding the memory forms of
>> vcvtpd2ps doesn't work (some Intel mode ymmword ptr forms produce
>> 128-bit insns).
>
>Integer extension instructions also take 2 register operands of different
>sizes.  How are they handled?

As per the list of changes insns, conversions to/from scalar int aren't being
folded, so their handling doesn't change. And quite obviously so, since no
matter what the GPR size, the other side is an xmmword (register or
memory), while here I'm folding only templates where one used xmmword
and the other ymmword.

>> Similarly I didn't figure out yet the reason for an anomaly when the
>> "unspecified" checks in operand_type_register_match() are missing: In
>> that case I've observed errors on vaddsubp{s,d}, but not on e.g.
>> vaddp{s,d} with identical operands.
>
>Please open a bug with a testcase.

You perhaps misunderstood: I've observed this issue while putting together
the patch here. I'm not aware of an issue without the patch applied, nor with
the patch in its current form applied. I'm merely pointing out that there is
a _possible_ issue pointed out by this anomaly. This could e.g. be the result
of some latent bug somewhere which was triggered by the not-yet-correct
patch. I'm intending to investigate this, but I can't predict when this will be;
I've put the note here in case the observation triggers something for you or
anyone else who reads this, which might then help me save some time
needlessly investigating what's going on there.

Jan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD
  2017-12-15 16:28       ` H.J. Lu
@ 2017-12-15 16:36         ` Jan Beulich
  2017-12-15 16:41           ` H.J. Lu
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2017-12-15 16:36 UTC (permalink / raw)
  To: hjl.tools; +Cc: binutils

>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 5:28 PM >>>
>On Fri, Dec 15, 2017 at 8:22 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 1:50 PM >>>
>>>On Fri, Dec 15, 2017 at 2:34 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> @@ -5930,20 +5958,6 @@ finalize_imm (void)
>>>>  }
>>>>
>>>>  static int
>>>> -bad_implicit_operand (int xmm)
>>>> -{
>>>> -  const char *ireg = xmm ? "xmm0" : "ymm0";
>>>> -
>>>> -  if (intel_syntax)
>>>> -    as_bad (_("the last operand of `%s' must be `%s%s'"),
>>>> -           i.tm.name, register_prefix, ireg);
>>>> -  else
>>>> -    as_bad (_("the first operand of `%s' must be `%s%s'"),
>>>> -           i.tm.name, register_prefix, ireg);
>>>> -  return 0;
>>>> -}
>>>> -
>>>
>>>Will we miss the assembly operand error checking?
>>
>> Not sure I understand what you mean. As the altered test case shows, error
>> messages are still present, just that they don't mention %xmm0 anymore. If
>> you mean something else, please explain. But keep in mind that things are
>> now more similar to GPR accumulator (i.e. %al etc) or FPU (%st(0)) handling,
>> where there is no such problem either. The template now requires %xmm0 to
>> be used.
>
>It is hard to tell what the error message is from
>
>*:[0-9]*: Error: .*blendvpd.*
>
>We used to get
>
>The last operand of ....
>
>What do we get now instead?

"operand type mismatch for ..." iirc. I don't think it was appropriate for the
earlier more specific error message to be there in the first place - as said, this
is now and should always have been similar to e.g. FPU insns requiring %st(0)
as one of their operands, where %st(0) isn't being mentioned in the diagnostic
either afair.

Jan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD
  2017-12-15 16:36         ` Jan Beulich
@ 2017-12-15 16:41           ` H.J. Lu
  0 siblings, 0 replies; 20+ messages in thread
From: H.J. Lu @ 2017-12-15 16:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Fri, Dec 15, 2017 at 8:36 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 5:28 PM >>>
>>On Fri, Dec 15, 2017 at 8:22 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 1:50 PM >>>
>>>>On Fri, Dec 15, 2017 at 2:34 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> @@ -5930,20 +5958,6 @@ finalize_imm (void)
>>>>>  }
>>>>>
>>>>>  static int
>>>>> -bad_implicit_operand (int xmm)
>>>>> -{
>>>>> -  const char *ireg = xmm ? "xmm0" : "ymm0";
>>>>> -
>>>>> -  if (intel_syntax)
>>>>> -    as_bad (_("the last operand of `%s' must be `%s%s'"),
>>>>> -           i.tm.name, register_prefix, ireg);
>>>>> -  else
>>>>> -    as_bad (_("the first operand of `%s' must be `%s%s'"),
>>>>> -           i.tm.name, register_prefix, ireg);
>>>>> -  return 0;
>>>>> -}
>>>>> -
>>>>
>>>>Will we miss the assembly operand error checking?
>>>
>>> Not sure I understand what you mean. As the altered test case shows, error
>>> messages are still present, just that they don't mention %xmm0 anymore. If
>>> you mean something else, please explain. But keep in mind that things are
>>> now more similar to GPR accumulator (i.e. %al etc) or FPU (%st(0)) handling,
>>> where there is no such problem either. The template now requires %xmm0 to
>>> be used.
>>
>>It is hard to tell what the error message is from
>>
>>*:[0-9]*: Error: .*blendvpd.*
>>
>>We used to get
>>
>>The last operand of ....
>>
>>What do we get now instead?
>
> "operand type mismatch for ..." iirc. I don't think it was appropriate for the
> earlier more specific error message to be there in the first place - as said, this
> is now and should always have been similar to e.g. FPU insns requiring %st(0)
> as one of their operands, where %st(0) isn't being mentioned in the diagnostic
> either afair.
>

Patch is OK then.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] x86: fold certain AVX and AVX2 templates
  2017-12-15 16:32     ` Jan Beulich
@ 2017-12-15 16:49       ` H.J. Lu
  0 siblings, 0 replies; 20+ messages in thread
From: H.J. Lu @ 2017-12-15 16:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Fri, Dec 15, 2017 at 8:32 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 2:10 PM >>>
>>On Fri, Dec 15, 2017 at 2:35 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>> Just like for instructions in GPRs, there's no need to have separate
>>> templates for otherwise identical insns acting on XMM or YMM registers
>>> (or memory of the same size).
>>>
>>> gas/
>>> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>>>
>>>         * config/tc-i386.c (regymm, regzmm): Delete.
>>>         (operand_type_register_match). Extend comment. Also handle some
>>>         memory operands here. Extend to cover .regsimd.
>>>         (build_vex_prefix): Derive vector_length from actual operand
>>>         size.
>>>         (process_operands, build_modrm_byte): Use .regsimd.
>>>
>>> opcodes/
>>> 2017-12-15  Jan Beulich  <jbeulich@suse.com>
>>>
>>>         * i386-gen.c (operand_type_init): Delete OPERAND_TYPE_REGYMM and
>>>         OPERAND_TYPE_REGZMM entries.
>>>         * i386-opc.h (enum of opcode modifiers): Extend comment.
>>>         i386-opc.tbl (vaddpd, vaddps, vaddsubpd, vaddsubps, vandnpd,
>>>         vandnps, vandpd, vandps, vblendpd, vblendps, vblendvpd,
>>>         vblendvps, vbroadcastss, vcmpeq_ospd, vcmpeq_osps, vcmpeqpd,
>>>         vcmpeqps, vcmpeq_uqpd, vcmpeq_uqps, vcmpeq_uspd, vcmpeq_usps,
>>>         vcmpfalse_ospd, vcmpfalse_osps, vcmpfalsepd, vcmpfalseps,
>>>         vcmpge_oqpd, vcmpge_oqps, vcmpgepd, vcmpgeps, vcmpgt_oqpd,
>>>         vcmpgt_oqps, vcmpgtpd, vcmpgtps, vcmple_oqpd, vcmple_oqps,
>>>         vcmplepd, vcmpleps, vcmplt_oqpd, vcmplt_oqps, vcmpltpd,
>>>         vcmpltps, vcmpneq_oqpd, vcmpneq_oqps, vcmpneq_ospd,
>>>         vcmpneq_osps, vcmpneqpd, vcmpneqps, vcmpneq_uspd, vcmpneq_usps,
>>>         vcmpngepd, vcmpngeps, vcmpnge_uqpd, vcmpnge_uqps, vcmpngtpd,
>>>         vcmpngtps, vcmpngt_uqpd, vcmpngt_uqps, vcmpnlepd, vcmpnleps,
>>>         vcmpnle_uqpd, vcmpnle_uqps, vcmpnltpd, vcmpnltps, vcmpnlt_uqpd,
>>>         vcmpnlt_uqps, vcmpordpd, vcmpordps, vcmpord_spd, vcmpord_sps,
>>>         vcmppd, vcmpps, vcmptruepd, vcmptrueps, vcmptrue_uspd,
>>>         vcmptrue_usps, vcmpunordpd, vcmpunordps, vcmpunord_spd,
>>>         vcmpunord_sps, vcvtdq2ps, vcvtpd2dq, vcvtpd2ps, vcvtps2dq,
>>>         vcvttpd2dq, vcvttps2dq, vdivpd, vdivps, vdpps, vhaddpd, vhaddps,
>>>         vhsubpd, vhsubps, vlddqu, vmaskmovpd, vmaskmovps, vmaxpd,
>>>         vmaxps, vminpd, vminps, vmovapd, vmovaps, vmovdqa, vmovdqu,
>>>         vmovmskpd, vmovmskps, vmovntdq, vmovntpd, vmovntps, vmovshdup,
>>>         vmovsldup, vmovupd, vmovups, vmulpd, vmulps, vorpd, vorps,
>>>         vpermilpd, vpermilps, vptest, vrcpps, vroundpd, vroundps,
>>>         vrsqrtps, vshufpd, vshufps, vsqrtpd, vsqrtps, vsubpd, vsubps,
>>>         vtestpd, vtestps, vunpckhpd, vunpckhps, vunpcklpd, vunpcklps,
>>>         vxorpd, vxorps, vpblendd, vpbroadcastb, vpbroadcastd,
>>>         vpbroadcastw, vpbroadcastq, vpmaskmovd, vpmaskmovq, vpsllvd,
>>>         vpsllvq, vpsravd, vpsravq, vpsrlvd, vpsrlvq): Fold 128- and
>>>         256-bit forms. Use CheckRegSize instead of IgnoreSize where
>>>         appropriate. Drop Xmmword and Ymmword from the results where
>>>         possible.
>>>         * i386-tbl.h: Re-generate.
>>> ---
>>> For some yet to be understood reason folding the memory forms of
>>> vcvtpd2ps doesn't work (some Intel mode ymmword ptr forms produce
>>> 128-bit insns).
>>
>>Integer extension instructions also take 2 register operands of different
>>sizes.  How are they handled?
>
> As per the list of changes insns, conversions to/from scalar int aren't being
> folded, so their handling doesn't change. And quite obviously so, since no
> matter what the GPR size, the other side is an xmmword (register or
> memory), while here I'm folding only templates where one used xmmword
> and the other ymmword.
>
>>> Similarly I didn't figure out yet the reason for an anomaly when the
>>> "unspecified" checks in operand_type_register_match() are missing: In
>>> that case I've observed errors on vaddsubp{s,d}, but not on e.g.
>>> vaddp{s,d} with identical operands.
>>
>>Please open a bug with a testcase.
>
> You perhaps misunderstood: I've observed this issue while putting together
> the patch here. I'm not aware of an issue without the patch applied, nor with
> the patch in its current form applied. I'm merely pointing out that there is
> a _possible_ issue pointed out by this anomaly. This could e.g. be the result
> of some latent bug somewhere which was triggered by the not-yet-correct
> patch. I'm intending to investigate this, but I can't predict when this will be;
> I've put the note here in case the observation triggers something for you or
> anyone else who reads this, which might then help me save some time
> needlessly investigating what's going on there.
>

Patch is OK then.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 5/4] x86: helper changes to i386-gen.c
  2017-12-15  9:54 [PATCH 0/4] x86: operand handling consolidation Jan Beulich
                   ` (3 preceding siblings ...)
  2017-12-15 10:35 ` [PATCH 4/4] x86: fold certain AVX and AVX2 templates Jan Beulich
@ 2017-12-18  8:03 ` Jan Beulich
  2017-12-18 11:25   ` H.J. Lu
  4 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2017-12-18  8:03 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

I've just realized that I forgot to send the helper patch I've been
using, presumably just for reference (as mentioned in the overview
mail, I could certainly transform this into a properly shaped,
committable patch). Note that the opcode table adjustments are
to avoid false positives; the entries changed ae bogus anyway,
but that'll be the subject of my default-operand-size series which
will still take some time until it is ready for posting.

Jan

--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -1082,7 +1082,11 @@ process_i386_operand_type (FILE *table,
 
   if (strcmp (op, "0"))
     {
-      int baseindex = 0;
+      int baseindex = 0, disp = 0;
+      int reg = 0, reg8 = 0, reg16 = 0, reg32 = 0, reg64 = 0;
+      int simd = 0, regxmm = 0, regymm = 0, regzmm = 0;
+      int byte = 0, word = 0, dword = 0, qword = 0, anysize = 0;
+      int xmmword = 0, ymmword = 0, zmmword = 0;
 
       last = op + strlen (op);
       for (next = op; next && next < last; )
@@ -1093,6 +1097,38 @@ process_i386_operand_type (FILE *table,
 	      set_bitfield (str, types, 1, ARRAY_SIZE (types), lineno);
 	      if (strcasecmp(str, "BaseIndex") == 0)
 		baseindex = 1;
+	      else if (strncasecmp(str, "Disp", 4) == 0)
+		disp = 1;
+	      else if (strcasecmp(str, "Reg8") == 0)
+		reg8 = reg = 1;
+	      else if (strcasecmp(str, "Reg16") == 0)
+		reg16 = reg = 1;
+	      else if (strcasecmp(str, "Reg32") == 0)
+		reg32 = reg = 1;
+	      else if (strcasecmp(str, "Reg64") == 0)
+		reg64 = reg = 1;
+	      else if (strcasecmp(str, "RegXMM") == 0)
+		regxmm = simd = 1;
+	      else if (strcasecmp(str, "RegYMM") == 0)
+		regymm = simd = 1;
+	      else if (strcasecmp(str, "RegZMM") == 0)
+		regzmm = simd = 1;
+	      else if (strcasecmp(str, "Byte") == 0)
+		byte = 1;
+	      else if (strcasecmp(str, "Word") == 0)
+		word = 1;
+	      else if (strcasecmp(str, "Dword") == 0)
+		dword = 1;
+	      else if (strcasecmp(str, "Qword") == 0)
+		qword = 1;
+	      else if (strcasecmp(str, "Xmmword") == 0)
+		xmmword = 1;
+	      else if (strcasecmp(str, "Ymmword") == 0)
+		ymmword = 1;
+	      else if (strcasecmp(str, "Zmmword") == 0)
+		zmmword = 1;
+	      else if (strcasecmp(str, "Anysize") == 0)
+		anysize = 1;
 	    }
 	}
 
@@ -1106,6 +1142,38 @@ process_i386_operand_type (FILE *table,
 	  if (!active_cpu_flags.bitfield.cpuno64)
 	    set_bitfield("Disp32S", types, 1, ARRAY_SIZE (types), lineno);
 	}
+
+      if (stage == stage_opcodes && (baseindex || disp))
+	{
+	  if (reg8 && !byte && !anysize)
+	    fprintf (stderr, "%s:%d: Reg8 but no Byte mem\n", filename, lineno);
+	  if (reg16 && !word && !anysize)
+	    fprintf (stderr, "%s:%d: Reg16 but no Word mem\n", filename, lineno);
+	  if (reg32 && !dword && !anysize)
+	    fprintf (stderr, "%s:%d: Reg32 but no Dword mem\n", filename, lineno);
+	  if (reg64 && !qword && !anysize)
+	    fprintf (stderr, "%s:%d: Reg64 but no Qword mem\n", filename, lineno);
+	  if (regxmm && !xmmword && !anysize && (ymmword || zmmword))
+	    fprintf (stderr, "%s:%d: RegXMM but no Xmmword mem\n", filename, lineno);
+	  if (regymm && !ymmword && !anysize && (xmmword || zmmword))
+	    fprintf (stderr, "%s:%d: RegYMM but no Ymmword mem\n", filename, lineno);
+	  if (regzmm && !zmmword && !anysize && (xmmword || ymmword))
+	    fprintf (stderr, "%s:%d: RegZMM but no Zmmword mem\n", filename, lineno);
+	}
+      if (!reg8 && byte && reg)
+	fprintf (stderr, "%s:%d: Byte and RegN but no Reg8\n", filename, lineno);
+      if (!reg16 && word && reg)
+	fprintf (stderr, "%s:%d: Word and RegN but no Reg16\n", filename, lineno);
+      if (!reg32 && dword && reg)
+	fprintf (stderr, "%s:%d: Dword and RegN but no Reg32\n", filename, lineno);
+      if (!reg64 && qword && reg)
+	fprintf (stderr, "%s:%d: Qword and RegN but no Reg64\n", filename, lineno);
+      if (!regxmm && xmmword && simd)
+	fprintf (stderr, "%s:%d: Xmmword and Reg?MM but no RegXMM\n", filename, lineno);
+      if (!regymm && ymmword && simd)
+	fprintf (stderr, "%s:%d: Ymmword and Reg?MM but no RegYMM\n", filename, lineno);
+      if (!regzmm && zmmword && simd)
+	fprintf (stderr, "%s:%d: Zmmword and Reg?MM but no RegZMM\n", filename, lineno);
     }
   output_operand_type (table, types, ARRAY_SIZE (types), stage,
 		       indent);
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -66,9 +66,9 @@ movsbq, 2, 0xfbe, None, 2, Cpu64, Modrm|
 movswq, 2, 0xfbf, None, 2, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg16|Word|Unspecified|BaseIndex, Reg64 }
 movslq, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg32|Dword|Unspecified|BaseIndex, Reg64 }
 // Intel Syntax next 3 insns
-movsx, 2, 0xfbe, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
-movsx, 2, 0xfbf, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg16|Unspecified|BaseIndex, Reg32|Reg64 }
-movsx, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64|ATTSyntax, { Reg32|Unspecified|BaseIndex, Reg64 }
+movsx, 2, 0xfbe, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg|Byte|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movsx, 2, 0xfbf, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg|Word|Unspecified|BaseIndex, Reg32|Reg64 }
+movsx, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64|ATTSyntax, { Reg|Dword|Unspecified|BaseIndex, Reg64 }
 movsx, 2, 0xfbe, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Reg8|Byte|BaseIndex, Reg16|Reg32|Reg64 }
 movsx, 2, 0xfbf, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Reg16|Word|BaseIndex, Reg32|Reg64 }
 movsx, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64|IntelSyntax, { Reg32|Dword|BaseIndex, Reg64 }
@@ -79,8 +79,8 @@ movzb, 2, 0xfb6, None, 2, Cpu386, Modrm|
 movzw, 2, 0xfb7, None, 2, Cpu386, Modrm|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg16|Word|Unspecified|BaseIndex, Reg32|Reg64 }
 // Intel Syntax next 2 insns (the 64-bit variants are not particulary
 // useful since the zero extend 32->64 is implicit, but we can encode them).
-movzx, 2, 0xfb6, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
-movzx, 2, 0xfb7, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg16|Unspecified|BaseIndex, Reg32|Reg64 }
+movzx, 2, 0xfb6, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg|Byte|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movzx, 2, 0xfb7, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg|Word|Unspecified|BaseIndex, Reg32|Reg64 }
 movzx, 2, 0xfb6, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Reg8|Byte|BaseIndex, Reg16|Reg32|Reg64 }
 movzx, 2, 0xfb7, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Reg16|Word|BaseIndex, Reg32|Reg64 }
 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/4] x86: helper changes to i386-gen.c
  2017-12-18  8:03 ` [PATCH 5/4] x86: helper changes to i386-gen.c Jan Beulich
@ 2017-12-18 11:25   ` H.J. Lu
  0 siblings, 0 replies; 20+ messages in thread
From: H.J. Lu @ 2017-12-18 11:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Mon, Dec 18, 2017 at 12:03 AM, Jan Beulich <JBeulich@suse.com> wrote:
> I've just realized that I forgot to send the helper patch I've been
> using, presumably just for reference (as mentioned in the overview
> mail, I could certainly transform this into a properly shaped,
> committable patch). Note that the opcode table adjustments are
> to avoid false positives; the entries changed ae bogus anyway,
> but that'll be the subject of my default-operand-size series which
> will still take some time until it is ready for posting.
>
> Jan
>
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -1082,7 +1082,11 @@ process_i386_operand_type (FILE *table,
>
>    if (strcmp (op, "0"))
>      {
> -      int baseindex = 0;
> +      int baseindex = 0, disp = 0;
> +      int reg = 0, reg8 = 0, reg16 = 0, reg32 = 0, reg64 = 0;
> +      int simd = 0, regxmm = 0, regymm = 0, regzmm = 0;
> +      int byte = 0, word = 0, dword = 0, qword = 0, anysize = 0;
> +      int xmmword = 0, ymmword = 0, zmmword = 0;
>
>        last = op + strlen (op);
>        for (next = op; next && next < last; )
> @@ -1093,6 +1097,38 @@ process_i386_operand_type (FILE *table,
>               set_bitfield (str, types, 1, ARRAY_SIZE (types), lineno);
>               if (strcasecmp(str, "BaseIndex") == 0)
>                 baseindex = 1;
> +             else if (strncasecmp(str, "Disp", 4) == 0)
> +               disp = 1;
> +             else if (strcasecmp(str, "Reg8") == 0)
> +               reg8 = reg = 1;
> +             else if (strcasecmp(str, "Reg16") == 0)
> +               reg16 = reg = 1;
> +             else if (strcasecmp(str, "Reg32") == 0)
> +               reg32 = reg = 1;
> +             else if (strcasecmp(str, "Reg64") == 0)
> +               reg64 = reg = 1;
> +             else if (strcasecmp(str, "RegXMM") == 0)
> +               regxmm = simd = 1;
> +             else if (strcasecmp(str, "RegYMM") == 0)
> +               regymm = simd = 1;
> +             else if (strcasecmp(str, "RegZMM") == 0)
> +               regzmm = simd = 1;
> +             else if (strcasecmp(str, "Byte") == 0)
> +               byte = 1;
> +             else if (strcasecmp(str, "Word") == 0)
> +               word = 1;
> +             else if (strcasecmp(str, "Dword") == 0)
> +               dword = 1;
> +             else if (strcasecmp(str, "Qword") == 0)
> +               qword = 1;
> +             else if (strcasecmp(str, "Xmmword") == 0)
> +               xmmword = 1;
> +             else if (strcasecmp(str, "Ymmword") == 0)
> +               ymmword = 1;
> +             else if (strcasecmp(str, "Zmmword") == 0)
> +               zmmword = 1;
> +             else if (strcasecmp(str, "Anysize") == 0)
> +               anysize = 1;
>             }
>         }
>
> @@ -1106,6 +1142,38 @@ process_i386_operand_type (FILE *table,
>           if (!active_cpu_flags.bitfield.cpuno64)
>             set_bitfield("Disp32S", types, 1, ARRAY_SIZE (types), lineno);
>         }
> +
> +      if (stage == stage_opcodes && (baseindex || disp))
> +       {
> +         if (reg8 && !byte && !anysize)
> +           fprintf (stderr, "%s:%d: Reg8 but no Byte mem\n", filename, lineno);
> +         if (reg16 && !word && !anysize)
> +           fprintf (stderr, "%s:%d: Reg16 but no Word mem\n", filename, lineno);
> +         if (reg32 && !dword && !anysize)
> +           fprintf (stderr, "%s:%d: Reg32 but no Dword mem\n", filename, lineno);
> +         if (reg64 && !qword && !anysize)
> +           fprintf (stderr, "%s:%d: Reg64 but no Qword mem\n", filename, lineno);
> +         if (regxmm && !xmmword && !anysize && (ymmword || zmmword))
> +           fprintf (stderr, "%s:%d: RegXMM but no Xmmword mem\n", filename, lineno);
> +         if (regymm && !ymmword && !anysize && (xmmword || zmmword))
> +           fprintf (stderr, "%s:%d: RegYMM but no Ymmword mem\n", filename, lineno);
> +         if (regzmm && !zmmword && !anysize && (xmmword || ymmword))
> +           fprintf (stderr, "%s:%d: RegZMM but no Zmmword mem\n", filename, lineno);
> +       }
> +      if (!reg8 && byte && reg)
> +       fprintf (stderr, "%s:%d: Byte and RegN but no Reg8\n", filename, lineno);
> +      if (!reg16 && word && reg)
> +       fprintf (stderr, "%s:%d: Word and RegN but no Reg16\n", filename, lineno);
> +      if (!reg32 && dword && reg)
> +       fprintf (stderr, "%s:%d: Dword and RegN but no Reg32\n", filename, lineno);
> +      if (!reg64 && qword && reg)
> +       fprintf (stderr, "%s:%d: Qword and RegN but no Reg64\n", filename, lineno);
> +      if (!regxmm && xmmword && simd)
> +       fprintf (stderr, "%s:%d: Xmmword and Reg?MM but no RegXMM\n", filename, lineno);
> +      if (!regymm && ymmword && simd)
> +       fprintf (stderr, "%s:%d: Ymmword and Reg?MM but no RegYMM\n", filename, lineno);
> +      if (!regzmm && zmmword && simd)
> +       fprintf (stderr, "%s:%d: Zmmword and Reg?MM but no RegZMM\n", filename, lineno);
>      }
>    output_operand_type (table, types, ARRAY_SIZE (types), stage,
>                        indent);
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -66,9 +66,9 @@ movsbq, 2, 0xfbe, None, 2, Cpu64, Modrm|
>  movswq, 2, 0xfbf, None, 2, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg16|Word|Unspecified|BaseIndex, Reg64 }
>  movslq, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg32|Dword|Unspecified|BaseIndex, Reg64 }
>  // Intel Syntax next 3 insns
> -movsx, 2, 0xfbe, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> -movsx, 2, 0xfbf, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg16|Unspecified|BaseIndex, Reg32|Reg64 }
> -movsx, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64|ATTSyntax, { Reg32|Unspecified|BaseIndex, Reg64 }
> +movsx, 2, 0xfbe, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg|Byte|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +movsx, 2, 0xfbf, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg|Word|Unspecified|BaseIndex, Reg32|Reg64 }
> +movsx, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64|ATTSyntax, { Reg|Dword|Unspecified|BaseIndex, Reg64 }
>  movsx, 2, 0xfbe, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Reg8|Byte|BaseIndex, Reg16|Reg32|Reg64 }
>  movsx, 2, 0xfbf, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Reg16|Word|BaseIndex, Reg32|Reg64 }
>  movsx, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64|IntelSyntax, { Reg32|Dword|BaseIndex, Reg64 }
> @@ -79,8 +79,8 @@ movzb, 2, 0xfb6, None, 2, Cpu386, Modrm|
>  movzw, 2, 0xfb7, None, 2, Cpu386, Modrm|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg16|Word|Unspecified|BaseIndex, Reg32|Reg64 }
>  // Intel Syntax next 2 insns (the 64-bit variants are not particulary
>  // useful since the zero extend 32->64 is implicit, but we can encode them).
> -movzx, 2, 0xfb6, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> -movzx, 2, 0xfb7, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg16|Unspecified|BaseIndex, Reg32|Reg64 }
> +movzx, 2, 0xfb6, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg|Byte|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +movzx, 2, 0xfb7, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Reg|Word|Unspecified|BaseIndex, Reg32|Reg64 }
>  movzx, 2, 0xfb6, None, 2, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Reg8|Byte|BaseIndex, Reg16|Reg32|Reg64 }
>  movzx, 2, 0xfb7, None, 2, Cpu386, Modrm|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Reg16|Word|BaseIndex, Reg32|Reg64 }
>
>

OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] x86: fold certain AVX and AVX2 templates
  2017-12-15 10:35 ` [PATCH 4/4] x86: fold certain AVX and AVX2 templates Jan Beulich
  2017-12-15 13:10   ` H.J. Lu
@ 2017-12-21 10:07   ` Jan Beulich
  2017-12-22  8:54   ` Alan Modra
  2 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2017-12-21 10:07 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

>>> On 15.12.17 at 11:35, <JBeulich@suse.com> wrote:
> Similarly I didn't figure out yet the reason for an anomaly when the
> "unspecified" checks in operand_type_register_match() are missing: In
> that case I've observed errors on vaddsubp{s,d}, but not on e.g.
> vaddp{s,d} with identical operands.

This turns out to have an easy explanation: The consequences
of the missing checks were simply invisible for vaddp{s,d} and
alike, as only error messages resulted in the testsuite, but no
generated code differences could be looked at. Once I've looked
at the produced code, vaddp{s,d} simply had matches on the
AVX512VL templates, while for insns without direct AVX512VL
equivalent another match can't be found, and hence an error
is being reported.

Bottom line - nothing odd anywhere with or without the (complete)
patch.

Jan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] x86: fold certain AVX and AVX2 templates
  2017-12-15 10:35 ` [PATCH 4/4] x86: fold certain AVX and AVX2 templates Jan Beulich
  2017-12-15 13:10   ` H.J. Lu
  2017-12-21 10:07   ` Jan Beulich
@ 2017-12-22  8:54   ` Alan Modra
  2018-01-02 10:49     ` Jan Beulich
  2 siblings, 1 reply; 20+ messages in thread
From: Alan Modra @ 2017-12-22  8:54 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils, H.J. Lu

On Fri, Dec 15, 2017 at 03:35:13AM -0700, Jan Beulich wrote:
> 	* i386-gen.c (operand_type_init): Delete OPERAND_TYPE_REGYMM and
> 	OPERAND_TYPE_REGZMM entries.

This change should have resulted in a regen of opcodes/i386-init.h,
and there are uses of the defines in gas/config/tc-i386.c (inside
#ifdef DEBUG386 so you won't normally see a compilation error).

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] x86: fold certain AVX and AVX2 templates
  2017-12-22  8:54   ` Alan Modra
@ 2018-01-02 10:49     ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2018-01-02 10:49 UTC (permalink / raw)
  To: Alan Modra; +Cc: H.J. Lu, binutils

>>> On 22.12.17 at 09:54, <amodra@gmail.com> wrote:
> On Fri, Dec 15, 2017 at 03:35:13AM -0700, Jan Beulich wrote:
>> 	* i386-gen.c (operand_type_init): Delete OPERAND_TYPE_REGYMM and
>> 	OPERAND_TYPE_REGZMM entries.
> 
> This change should have resulted in a regen of opcodes/i386-init.h,
> and there are uses of the defines in gas/config/tc-i386.c (inside
> #ifdef DEBUG386 so you won't normally see a compilation error).

Indeed. I had realized the missing update to the generated header,
but had meant to update it with further changes to come hopefully
during the next weeks, as the leftover wouldn't have caused any
problems. But with the symbols actually used, I've now pushed a
partial revert. I'd like to note though that pt() as a whole doesn't
really look to be correct anymore; I've taken a note to look into
getting this to work right again.

Jan

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2018-01-02 10:49 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-15  9:54 [PATCH 0/4] x86: operand handling consolidation Jan Beulich
2017-12-15 10:32 ` [PATCH 1/4] x86: replace Reg8, Reg16, Reg32, and Reg64 Jan Beulich
2017-12-15 12:46   ` H.J. Lu
2017-12-15 10:33 ` [PATCH 2/4] x86: drop FloatReg and FloatAcc Jan Beulich
2017-12-15 12:47   ` H.J. Lu
2017-12-15 10:34 ` [PATCH 3/4] x86: fold RegXMM/RegYMM/RegZMM into RegSIMD Jan Beulich
2017-12-15 12:50   ` H.J. Lu
2017-12-15 16:22     ` Jan Beulich
2017-12-15 16:28       ` H.J. Lu
2017-12-15 16:36         ` Jan Beulich
2017-12-15 16:41           ` H.J. Lu
2017-12-15 10:35 ` [PATCH 4/4] x86: fold certain AVX and AVX2 templates Jan Beulich
2017-12-15 13:10   ` H.J. Lu
2017-12-15 16:32     ` Jan Beulich
2017-12-15 16:49       ` H.J. Lu
2017-12-21 10:07   ` Jan Beulich
2017-12-22  8:54   ` Alan Modra
2018-01-02 10:49     ` Jan Beulich
2017-12-18  8:03 ` [PATCH 5/4] x86: helper changes to i386-gen.c Jan Beulich
2017-12-18 11:25   ` H.J. Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).