public inbox for binutils@sourceware.org
* [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments
@ 2020-03-04  9:32 Jan Beulich
  2020-03-04  9:41 ` [PATCH 1/9] x86: refine TPAUSE and UMWAIT Jan Beulich
                   ` (9 more replies)
  0 siblings, 10 replies; 37+ messages in thread
From: Jan Beulich @ 2020-03-04  9:32 UTC
  To: binutils; +Cc: H.J. Lu

The multi-purpose IgnoreSize attribute has traditionally been misused
in various places. Changes over the last months and years have also
made it superfluous in some places, while in others the need for it
was not recognized. Before actually thinking about splitting the
attribute such that each part serves only one purpose, try to clean
things up to get a better understanding of the legitimate uses and
hence the overall needs.

1: refine TPAUSE and UMWAIT
2: add missing IgnoreSize
3: correct MPX insn w/o base or index encoding in 16-bit mode
4: drop Rex64 attribute
5: replace NoRex64 on VEX-encoded insns
6: don't accept FI{LD,STP,STTP}LL in Intel syntax mode
7: fold (supposed to be) identical code
8: drop/replace IgnoreSize
9: reduce amount of various VCVT* templates

Jan

* [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
@ 2020-03-04  9:41 ` Jan Beulich
  2020-03-04 11:37   ` H.J. Lu
  2020-03-04  9:42 ` [PATCH 2/9] x86: add missing IgnoreSize Jan Beulich
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04  9:41 UTC
  To: binutils; +Cc: H.J. Lu

Allowing 64-bit registers is misleading here: Elsewhere these get allowed
when there's no difference between the two variants, because writes to
32-bit destination registers zero the upper halves of the full registers
in 64-bit mode. Here, however, they're source registers, and hence
specifying 64-bit registers would leave it ambiguous whether the upper
32 bits actually matter.

Additionally, for proper code generation in 16-bit mode, IgnoreSize is
needed on both.

And finally, just like for e.g. MONITOR/MWAIT, add variants with all
input registers explicitly specified.
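
As a reading aid for the test expectations further down (this sketch is
not part of the patch, and modrm_reg_form is a made-up helper name): both
insns use opcode extension /6 with a register operand, so the last byte
of each encoding is a ModRM byte with mod = 3, reg = 6, and rm holding
the number of the one explicitly encoded register. The %edx and %eax
operands of the 3-operand forms are implicit and contribute no bytes.

```c
#include <assert.h>

/* Standard x86 numbering of the 32-bit general purpose registers.  */
enum reg32 { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };

/* Build a register-form ModRM byte: mod = 3, reg = opcode extension,
   rm = register number.  Hypothetical helper, for illustration only.  */
static unsigned char
modrm_reg_form (unsigned int extension, enum reg32 rm)
{
  assert (extension < 8);
  return (unsigned char) (0xc0 | (extension << 3) | rm);
}
```

With extension 6 this yields 0xf1 for %ecx, 0xf3 for %ebx, 0xf6 for %esi,
and 0xf7 for %edi, which is the final byte of the corresponding expected
encodings.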

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (md_assemble): Also exclude tpause and umwait
	from having their operands swapped.
	* testsuite/gas/i386/waitpkg.s,
	testsuite/gas/i386/x86-64-waitpkg.s: Add tpause and umwait
	3-operand cases.
	* testsuite/gas/i386/waitpkg.d,
	testsuite/gas/i386/waitpkg-intel.d,
	testsuite/gas/i386/x86-64-waitpkg.d,
	testsuite/gas/i386/x86-64-waitpkg-intel.d: Adjust expectations.

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-opc.tbl (tpause, umwait): Add IgnoreSize. Add 3-operand
	template.
	* i386-tbl.h: Re-generate.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4328,16 +4328,19 @@ md_assemble (char *line)
   /* Now we've parsed the mnemonic into a set of templates, and have the
      operands at hand.  */
 
-  /* All Intel opcodes have reversed operands except for "bound", "enter"
-     "monitor*", and "mwait*".  We also don't reverse intersegment "jmp"
-     and "call" instructions with 2 immediate operands so that the immediate
-     segment precedes the offset, as it does when in AT&T mode. */
+  /* All Intel opcodes have reversed operands except for "bound", "enter",
+     "monitor*", "mwait*", "tpause", and "umwait".  We also don't reverse
+     intersegment "jmp" and "call" instructions with 2 immediate operands so
+     that the immediate segment precedes the offset, as it does when in AT&T
+     mode.  */
   if (intel_syntax
       && i.operands > 1
       && (strcmp (mnemonic, "bound") != 0)
       && (strcmp (mnemonic, "invlpga") != 0)
       && (strncmp (mnemonic, "monitor", 7) != 0)
       && (strncmp (mnemonic, "mwait", 5) != 0)
+      && (strcmp (mnemonic, "tpause") != 0)
+      && (strcmp (mnemonic, "umwait") != 0)
       && !(operand_type_check (i.types[0], imm)
 	   && operand_type_check (i.types[1], imm)))
     swap_operands ();
--- a/gas/testsuite/gas/i386/waitpkg-intel.d
+++ b/gas/testsuite/gas/i386/waitpkg-intel.d
@@ -12,5 +12,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*f3 0f ae f0[ 	]*umonitor eax
 [ 	]*[a-f0-9]+:[ 	]*67 f3 0f ae f1[ 	]*umonitor cx
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f3[ 	]*umwait ebx
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause ecx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause ebx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait edi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause edi
 #pass
--- a/gas/testsuite/gas/i386/waitpkg.d
+++ b/gas/testsuite/gas/i386/waitpkg.d
@@ -12,5 +12,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*f3 0f ae f0[ 	]*umonitor %eax
 [ 	]*[a-f0-9]+:[ 	]*67 f3 0f ae f1[ 	]*umonitor %cx
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait %ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f3[ 	]*umwait %ebx
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause %ecx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause %ebx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait %edi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause %edi
 #pass
--- a/gas/testsuite/gas/i386/waitpkg.s
+++ b/gas/testsuite/gas/i386/waitpkg.s
@@ -5,4 +5,11 @@ _start:
 	umonitor %eax
 	umonitor %cx
 	umwait %ecx
+	umwait %ebx, %edx, %eax
 	tpause %ecx
+	tpause %ebx, %edx, %eax
+
+	.intel_syntax noprefix
+
+	umwait edi, edx, eax
+	tpause edi, edx, eax
--- a/gas/testsuite/gas/i386/x86-64-waitpkg-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-waitpkg-intel.d
@@ -13,11 +13,11 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*f3 41 0f ae f2[ 	]*umonitor r10
 [ 	]*[a-f0-9]+:[ 	]*67 f3 41 0f ae f2[ 	]*umonitor r10d
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait ecx
-[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait ecx
-[ 	]*[a-f0-9]+:[ 	]*f2 41 0f ae f2[ 	]*umwait r10d
 [ 	]*[a-f0-9]+:[ 	]*f2 41 0f ae f2[ 	]*umwait r10d
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait edi
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause ecx
-[ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause ecx
-[ 	]*[a-f0-9]+:[ 	]*66 41 0f ae f2[ 	]*tpause r10d
 [ 	]*[a-f0-9]+:[ 	]*66 41 0f ae f2[ 	]*tpause r10d
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause edi
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f6[ 	]*umwait esi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f6[ 	]*tpause esi
 #pass
--- a/gas/testsuite/gas/i386/x86-64-waitpkg.d
+++ b/gas/testsuite/gas/i386/x86-64-waitpkg.d
@@ -13,11 +13,11 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*f3 41 0f ae f2[ 	]*umonitor %r10
 [ 	]*[a-f0-9]+:[ 	]*67 f3 41 0f ae f2[ 	]*umonitor %r10d
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait %ecx
-[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait %ecx
-[ 	]*[a-f0-9]+:[ 	]*f2 41 0f ae f2[ 	]*umwait %r10d
 [ 	]*[a-f0-9]+:[ 	]*f2 41 0f ae f2[ 	]*umwait %r10d
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait %edi
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause %ecx
-[ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause %ecx
-[ 	]*[a-f0-9]+:[ 	]*66 41 0f ae f2[ 	]*tpause %r10d
 [ 	]*[a-f0-9]+:[ 	]*66 41 0f ae f2[ 	]*tpause %r10d
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause %edi
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f6[ 	]*umwait %esi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f6[ 	]*tpause %esi
 #pass
--- a/gas/testsuite/gas/i386/x86-64-waitpkg.s
+++ b/gas/testsuite/gas/i386/x86-64-waitpkg.s
@@ -6,10 +6,13 @@ _start:
 	umonitor %r10
 	umonitor %r10d
 	umwait %ecx
-	umwait %rcx
-	umwait %r10
 	umwait %r10d
+	umwait %edi, %edx, %eax
 	tpause %ecx
-	tpause %rcx
-	tpause %r10
 	tpause %r10d
+	tpause %edi, %edx, %eax
+
+	.intel_syntax noprefix
+
+	umwait esi, edx, eax
+	tpause esi, edx, eax
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -4763,10 +4763,10 @@ pconfig, 0, 0x0f01c5, None, 3, CpuPCONFI
 // WAITPKG instructions.
 
 umonitor, 1, 0xf30fae, 0x6, 2, CpuWAITPKG, Modrm|AddrPrefixOpReg, { Reg16|Reg32|Reg64 }
-
-tpause, 1, 0x660fae, 0x6, 2, CpuWAITPKG, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Reg32|Reg64 }
-
-umwait, 1, 0xf20fae, 0x6, 2, CpuWAITPKG, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Reg32|Reg64 }
+tpause, 1, 0x660fae, 0x6, 2, CpuWAITPKG, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32 }
+tpause, 3, 0x660fae, 0x6, 2, CpuWAITPKG, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32, RegD|Dword, Acc|Dword }
+umwait, 1, 0xf20fae, 0x6, 2, CpuWAITPKG, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32 }
+umwait, 3, 0xf20fae, 0x6, 2, CpuWAITPKG, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32, RegD|Dword, Acc|Dword }
 
 // WAITPKG instructions end.
 

* [PATCH 2/9] x86: add missing IgnoreSize
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
  2020-03-04  9:41 ` [PATCH 1/9] x86: refine TPAUSE and UMWAIT Jan Beulich
@ 2020-03-04  9:42 ` Jan Beulich
  2020-03-04 11:40   ` H.J. Lu
  2020-03-04  9:43 ` [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode Jan Beulich
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04  9:42 UTC
  To: binutils; +Cc: H.J. Lu

For proper code generation in 16-bit mode (or to avoid the "same type of
prefix used twice" diagnostic there), IgnoreSize is needed on certain
templates which allow only 32-bit (and possibly 64-bit) operands.
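
A deliberately simplified model of the effect (an illustration under
stated assumptions, not the logic in tc-i386.c; all names below are
hypothetical): without IgnoreSize, a 32-bit register operand in 16-bit
code asks for an operand size prefix (0x66). Where the opcode already
carries 0x66 as a mandatory prefix, that request collides with the one
already present, hence the diagnostic; elsewhere an unwanted prefix byte
would end up in the encoding.

```c
#include <assert.h>
#include <stdbool.h>

/* Default operand size for a given code size: 16 bits in 16-bit code,
   32 bits in 32- and 64-bit code.  */
static int
default_operand_bits (int code_bits)
{
  assert (code_bits == 16 || code_bits == 32 || code_bits == 64);
  return code_bits == 16 ? 16 : 32;
}

/* Model only: would a 0x66 operand size prefix be requested for a
   16- or 32-bit register operand?  IGNORE_SIZE stands in for the
   template attribute of the same name.  */
static bool
wants_opsize_prefix (int code_bits, int operand_bits, bool ignore_size)
{
  assert (operand_bits == 16 || operand_bits == 32);
  if (ignore_size)
    return false;
  return operand_bits != default_operand_bits (code_bits);
}
```

In this model wants_opsize_prefix (16, 32, false) is true, while the same
call with ignore_size set is false, which is the behavior the added
attribute is meant to achieve for the insns listed below.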

Beyond adding tests for the previously broken cases, also add ones for
the previously working cases where IgnoreSize is needed for the same
reason (leaving out MPX for now, as that'll require an assembler change
first). Some tests also get adjusted slightly, such that re-using the
same source for 16-bit code generation testing becomes easier.
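
A note for reading the adjusted expectations: as the dumps below suggest,
the disassembler decodes everything as 32-bit code. In the part
re-assembled as 16-bit code each 32-bit memory operand gets a 0x67
prefix, which the disassembler then takes to select 16-bit addressing.
The same ModRM byte therefore shows up with a different base/index pair,
and any SIB or displacement bytes not consumed by 16-bit addressing are
shown as separate insns (hlt, nop, das, ...). The sketch below (the
helper names are hypothetical, the table is the standard 16-bit ModRM
one) maps the r/m field accordingly.

```c
#include <assert.h>
#include <string.h>

/* Base/index combination selected by the r/m field of a ModRM byte
   under 16-bit addressing.  For mod == 0 and r/m == 6 there is no
   register at all, just a 16-bit displacement.  */
static const char *
rm16_name (unsigned char modrm)
{
  static const char *const names[8] = {
    "bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"
  };
  unsigned int mod = modrm >> 6;
  unsigned int rm = modrm & 7;

  assert (mod != 3);	/* mod == 3 is a register operand, not memory.  */
  if (mod == 0 && rm == 6)
    return "disp16";
  return names[rm];
}

/* Convenience wrapper: does MODRM select the EXPECTED combination?  */
static int
rm16_is (unsigned char modrm, const char *expected)
{
  return strcmp (rm16_name (modrm), expected) == 0;
}
```

For example, ModRM byte 0x42 maps to bp+si and 0x54 to si, in line with
the adcx/adox expectations for the 16-bit half of adx.s.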

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* testsuite/gas/i386/adx.s, testsuite/gas/i386/cet.s,
	testsuite/gas/i386/ept.s, testsuite/gas/i386/fsgs.s,
	testsuite/gas/i386/invpcid.s, testsuite/gas/i386/movdir.s,
	testsuite/gas/i386/ptwrite.s, testsuite/gas/i386/vmx.s,
	testsuite/gas/i386/waitpkg.s: Re-assemble some of the source as
	16-bit code.
	* testsuite/gas/i386/code16.s: Add CR, DR, and TR access cases
	as well as a BSWAP one.
	* testsuite/gas/i386/rdpid.s: Add 16-bit case.
	* testsuite/gas/i386/sse2-16bit.s: Cover more insns.
	* testsuite/gas/i386/adx-intel.d, testsuite/gas/i386/adx.d,
	testsuite/gas/i386/cet-intel.d, testsuite/gas/i386/cet.d,
	testsuite/gas/i386/code16.d, testsuite/gas/i386/ept-intel.d,
	testsuite/gas/i386/ept.d, testsuite/gas/i386/fsgs-intel.d,
	testsuite/gas/i386/fsgs.d, testsuite/gas/i386/invpcid-intel.d,
	testsuite/gas/i386/invpcid.d, testsuite/gas/i386/movdir-intel.d,
	testsuite/gas/i386/movdir.d, testsuite/gas/i386/ptwrite-intel.d,
	testsuite/gas/i386/ptwrite.d, testsuite/gas/i386/rdpid-intel.d,
	testsuite/gas/i386/rdpid.d, testsuite/gas/i386/sse2-16bit.d,
	testsuite/gas/i386/vmx.d, testsuite/gas/i386/waitpkg-intel.d,
	testsuite/gas/i386/waitpkg.d: Adjust expectations.

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-opc.tbl (movmskps, mwait, vmread, vmwrite, invept,
	invvpid, invpcid, rdfsbase, rdgsbase, wrfsbase, wrgsbase, adcx,
	adox, mwaitx, rdpid, movdiri): Add IgnoreSize.
	(ptwrite): Split into non-64-bit and 64-bit forms.
	* i386-tbl.h: Re-generate.

--- a/gas/testsuite/gas/i386/adx-intel.d
+++ b/gas/testsuite/gas/i386/adx-intel.d
@@ -20,12 +20,22 @@ Disassembly of section .text:
 [       ]*[a-f0-9]+:	f3 0f 38 f6 00       	adox   eax,DWORD PTR \[eax\]
 [       ]*[a-f0-9]+:	f3 0f 38 f6 ca       	adox   ecx,edx
 [       ]*[a-f0-9]+:	f3 0f 38 f6 00       	adox   eax,DWORD PTR \[eax\]
-[       ]*[a-f0-9]+:	66 0f 38 f6 82 8f 01 00 00 	adcx   eax,DWORD PTR \[edx\+0x18f\]
+[       ]*[a-f0-9]+:	66 0f 38 f6 42 24    	adcx   eax,DWORD PTR \[edx\+0x24\]
 [       ]*[a-f0-9]+:	66 0f 38 f6 d1       	adcx   edx,ecx
-[       ]*[a-f0-9]+:	66 0f 38 f6 94 f4 c0 1d fe ff 	adcx   edx,DWORD PTR \[esp\+esi\*8-0x1e240\]
+[       ]*[a-f0-9]+:	66 0f 38 f6 54 f4 f4 	adcx   edx,DWORD PTR \[esp\+esi\*8-0xc\]
 [       ]*[a-f0-9]+:	66 0f 38 f6 00       	adcx   eax,DWORD PTR \[eax\]
-[       ]*[a-f0-9]+:	f3 0f 38 f6 82 8f 01 00 00 	adox   eax,DWORD PTR \[edx\+0x18f\]
+[       ]*[a-f0-9]+:	f3 0f 38 f6 42 24    	adox   eax,DWORD PTR \[edx\+0x24\]
 [       ]*[a-f0-9]+:	f3 0f 38 f6 d1       	adox   edx,ecx
-[       ]*[a-f0-9]+:	f3 0f 38 f6 94 f4 c0 1d fe ff 	adox   edx,DWORD PTR \[esp\+esi\*8-0x1e240\]
+[       ]*[a-f0-9]+:	f3 0f 38 f6 54 f4 f4 	adox   edx,DWORD PTR \[esp\+esi\*8-0xc\]
 [       ]*[a-f0-9]+:	f3 0f 38 f6 00       	adox   eax,DWORD PTR \[eax\]
+[       ]*[a-f0-9]+:	67 66 0f 38 f6 42 24 	adcx   eax,DWORD PTR \[bp\+si\+0x24\]
+[       ]*[a-f0-9]+:	66 0f 38 f6 d1       	adcx   edx,ecx
+[       ]*[a-f0-9]+:	67 66 0f 38 f6 54 f4 	adcx   edx,DWORD PTR \[si-0xc\]
+[       ]*[a-f0-9]+:	f4                   	hlt *
+[       ]*[a-f0-9]+:	67 66 0f 38 f6 00    	adcx   eax,DWORD PTR \[bx\+si\]
+[       ]*[a-f0-9]+:	67 f3 0f 38 f6 42 24 	adox   eax,DWORD PTR \[bp\+si\+0x24\]
+[       ]*[a-f0-9]+:	f3 0f 38 f6 d1       	adox   edx,ecx
+[       ]*[a-f0-9]+:	67 f3 0f 38 f6 54 f4 	adox   edx,DWORD PTR \[si-0xc\]
+[       ]*[a-f0-9]+:	f4                   	hlt *
+[       ]*[a-f0-9]+:	67 f3 0f 38 f6 00    	adox   eax,DWORD PTR \[bx\+si\]
 #pass
--- a/gas/testsuite/gas/i386/adx.d
+++ b/gas/testsuite/gas/i386/adx.d
@@ -19,12 +19,22 @@ Disassembly of section .text:
 [       ]*[a-f0-9]+:	f3 0f 38 f6 00       	adox   \(%eax\),%eax
 [       ]*[a-f0-9]+:	f3 0f 38 f6 ca       	adox   %edx,%ecx
 [       ]*[a-f0-9]+:	f3 0f 38 f6 00       	adox   \(%eax\),%eax
-[       ]*[a-f0-9]+:	66 0f 38 f6 82 8f 01 00 00 	adcx   0x18f\(%edx\),%eax
+[       ]*[a-f0-9]+:	66 0f 38 f6 42 24    	adcx   0x24\(%edx\),%eax
 [       ]*[a-f0-9]+:	66 0f 38 f6 d1       	adcx   %ecx,%edx
-[       ]*[a-f0-9]+:	66 0f 38 f6 94 f4 c0 1d fe ff 	adcx   -0x1e240\(%esp,%esi,8\),%edx
+[       ]*[a-f0-9]+:	66 0f 38 f6 54 f4 f4 	adcx   -0xc\(%esp,%esi,8\),%edx
 [       ]*[a-f0-9]+:	66 0f 38 f6 00       	adcx   \(%eax\),%eax
-[       ]*[a-f0-9]+:	f3 0f 38 f6 82 8f 01 00 00 	adox   0x18f\(%edx\),%eax
+[       ]*[a-f0-9]+:	f3 0f 38 f6 42 24    	adox   0x24\(%edx\),%eax
 [       ]*[a-f0-9]+:	f3 0f 38 f6 d1       	adox   %ecx,%edx
-[       ]*[a-f0-9]+:	f3 0f 38 f6 94 f4 c0 1d fe ff 	adox   -0x1e240\(%esp,%esi,8\),%edx
+[       ]*[a-f0-9]+:	f3 0f 38 f6 54 f4 f4 	adox   -0xc\(%esp,%esi,8\),%edx
 [       ]*[a-f0-9]+:	f3 0f 38 f6 00       	adox   \(%eax\),%eax
+[       ]*[a-f0-9]+:	67 66 0f 38 f6 42 24 	adcx   0x24\(%bp,%si\),%eax
+[       ]*[a-f0-9]+:	66 0f 38 f6 d1       	adcx   %ecx,%edx
+[       ]*[a-f0-9]+:	67 66 0f 38 f6 54 f4 	adcx   -0xc\(%si\),%edx
+[       ]*[a-f0-9]+:	f4                   	hlt *
+[       ]*[a-f0-9]+:	67 66 0f 38 f6 00    	adcx   \(%bx,%si\),%eax
+[       ]*[a-f0-9]+:	67 f3 0f 38 f6 42 24 	adox   0x24\(%bp,%si\),%eax
+[       ]*[a-f0-9]+:	f3 0f 38 f6 d1       	adox   %ecx,%edx
+[       ]*[a-f0-9]+:	67 f3 0f 38 f6 54 f4 	adox   -0xc\(%si\),%edx
+[       ]*[a-f0-9]+:	f4                   	hlt *
+[       ]*[a-f0-9]+:	67 f3 0f 38 f6 00    	adox   \(%bx,%si\),%eax
 #pass
--- a/gas/testsuite/gas/i386/adx.s
+++ b/gas/testsuite/gas/i386/adx.s
@@ -17,14 +17,17 @@ _start:
         adoxl   (%eax), %eax
 
 	.intel_syntax noprefix
+	.rept 2
 
-        adcx    eax, DWORD PTR [edx+399]
+        adcx    eax, DWORD PTR [edx+36]
         adcx    edx, ecx
-        adcx    edx, DWORD PTR [esp+esi*8-123456]
+        adcx    edx, DWORD PTR [esp+esi*8-12]
         adcx    eax, DWORD PTR [eax]
 
-        adox    eax, DWORD PTR [edx+399]
+        adox    eax, DWORD PTR [edx+36]
         adox    edx, ecx
-        adox    edx, DWORD PTR [esp+esi*8-123456]
+        adox    edx, DWORD PTR [esp+esi*8-12]
         adox    eax, DWORD PTR [eax]
 
+	.code16
+	.endr
--- a/gas/testsuite/gas/i386/cet-intel.d
+++ b/gas/testsuite/gas/i386/cet-intel.d
@@ -21,13 +21,29 @@ Disassembly of section .text:
  +[a-f0-9]+:	f3 0f ae e9          	incsspd ecx
  +[a-f0-9]+:	f3 0f 1e c9          	rdsspd ecx
  +[a-f0-9]+:	f3 0f 01 ea          	saveprevssp 
- +[a-f0-9]+:	f3 0f 01 2c 01       	rstorssp QWORD PTR \[ecx\+eax\*1\]
+ +[a-f0-9]+:	f3 0f 01 6c 01 90    	rstorssp QWORD PTR \[ecx\+eax\*1-0x70\]
  +[a-f0-9]+:	0f 38 f6 02          	wrssd  \[edx\],eax
  +[a-f0-9]+:	0f 38 f6 10          	wrssd  \[eax\],edx
  +[a-f0-9]+:	66 0f 38 f5 14 2f    	wrussd \[edi\+ebp\*1\],edx
- +[a-f0-9]+:	66 0f 38 f5 3c 2a    	wrussd \[edx\+ebp\*1\],edi
+ +[a-f0-9]+:	66 0f 38 f5 3c 0e    	wrussd \[esi\+ecx\*1\],edi
  +[a-f0-9]+:	f3 0f 01 e8          	setssbsy 
- +[a-f0-9]+:	f3 0f ae 34 04       	clrssbsy QWORD PTR \[esp\+eax\*1\]
+ +[a-f0-9]+:	f3 0f ae 34 44       	clrssbsy QWORD PTR \[esp\+eax\*2\]
  +[a-f0-9]+:	f3 0f 1e fa          	endbr64 
  +[a-f0-9]+:	f3 0f 1e fb          	endbr32 
+ +[a-f0-9]+:	f3 0f ae e9          	incsspd ecx
+ +[a-f0-9]+:	f3 0f 1e c9          	rdsspd ecx
+ +[a-f0-9]+:	f3 0f 01 ea          	saveprevssp *
+ +[a-f0-9]+:	67 f3 0f 01 6c 01    	rstorssp QWORD PTR \[si\+0x1\]
+ +[a-f0-9]+:	90                   	nop *
+ +[a-f0-9]+:	67 0f 38 f6 02       	wrssd  \[bp\+si\],eax
+ +[a-f0-9]+:	67 0f 38 f6 10       	wrssd  \[bx\+si\],edx
+ +[a-f0-9]+:	67 66 0f 38 f5 14    	wrussd \[si\],edx
+ +[a-f0-9]+:	2f                   	das *
+ +[a-f0-9]+:	67 66 0f 38 f5 3c    	wrussd \[si\],edi
+ +[a-f0-9]+:	0e                   	push   cs
+ +[a-f0-9]+:	f3 0f 01 e8          	setssbsy *
+ +[a-f0-9]+:	67 f3 0f ae 34       	clrssbsy QWORD PTR \[si\]
+ +[a-f0-9]+:	44                   	inc    esp
+ +[a-f0-9]+:	f3 0f 1e fa          	endbr64 *
+ +[a-f0-9]+:	f3 0f 1e fb          	endbr32 *
 #pass
--- a/gas/testsuite/gas/i386/cet.d
+++ b/gas/testsuite/gas/i386/cet.d
@@ -19,13 +19,29 @@ Disassembly of section .text:
  +[a-f0-9]+:	f3 0f ae e9          	incsspd %ecx
  +[a-f0-9]+:	f3 0f 1e c9          	rdsspd %ecx
  +[a-f0-9]+:	f3 0f 01 ea          	saveprevssp 
- +[a-f0-9]+:	f3 0f 01 2c 01       	rstorssp \(%ecx,%eax,1\)
+ +[a-f0-9]+:	f3 0f 01 6c 01 90    	rstorssp -0x70\(%ecx,%eax,1\)
  +[a-f0-9]+:	0f 38 f6 02          	wrssd  %eax,\(%edx\)
  +[a-f0-9]+:	0f 38 f6 10          	wrssd  %edx,\(%eax\)
  +[a-f0-9]+:	66 0f 38 f5 14 2f    	wrussd %edx,\(%edi,%ebp,1\)
- +[a-f0-9]+:	66 0f 38 f5 3c 2a    	wrussd %edi,\(%edx,%ebp,1\)
+ +[a-f0-9]+:	66 0f 38 f5 3c 0e    	wrussd %edi,\(%esi,%ecx,1\)
  +[a-f0-9]+:	f3 0f 01 e8          	setssbsy 
- +[a-f0-9]+:	f3 0f ae 34 04       	clrssbsy \(%esp,%eax,1\)
+ +[a-f0-9]+:	f3 0f ae 34 44       	clrssbsy \(%esp,%eax,2\)
  +[a-f0-9]+:	f3 0f 1e fa          	endbr64 
  +[a-f0-9]+:	f3 0f 1e fb          	endbr32 
+ +[a-f0-9]+:	f3 0f ae e9          	incsspd %ecx
+ +[a-f0-9]+:	f3 0f 1e c9          	rdsspd %ecx
+ +[a-f0-9]+:	f3 0f 01 ea          	saveprevssp *
+ +[a-f0-9]+:	67 f3 0f 01 6c 01    	rstorssp 0x1\(%si\)
+ +[a-f0-9]+:	90                   	nop *
+ +[a-f0-9]+:	67 0f 38 f6 02       	wrssd  %eax,\(%bp,%si\)
+ +[a-f0-9]+:	67 0f 38 f6 10       	wrssd  %edx,\(%bx,%si\)
+ +[a-f0-9]+:	67 66 0f 38 f5 14    	wrussd %edx,\(%si\)
+ +[a-f0-9]+:	2f                   	das *
+ +[a-f0-9]+:	67 66 0f 38 f5 3c    	wrussd %edi,\(%si\)
+ +[a-f0-9]+:	0e                   	push   %cs
+ +[a-f0-9]+:	f3 0f 01 e8          	setssbsy *
+ +[a-f0-9]+:	67 f3 0f ae 34       	clrssbsy \(%si\)
+ +[a-f0-9]+:	44                   	inc    %esp
+ +[a-f0-9]+:	f3 0f 1e fa          	endbr64 *
+ +[a-f0-9]+:	f3 0f 1e fb          	endbr32 *
 #pass
--- a/gas/testsuite/gas/i386/cet.s
+++ b/gas/testsuite/gas/i386/cet.s
@@ -13,15 +13,18 @@ _start:
 	endbr32
 
 	.intel_syntax noprefix
+	.rept 2
 	incsspd ecx
 	rdsspd ecx
 	saveprevssp
-	rstorssp QWORD PTR [ecx + eax]
+	rstorssp QWORD PTR [ecx + eax - 0x70]
 	wrssd [edx],eax
 	wrssd dword ptr [eax],edx
 	wrussd [edi + ebp],edx
-	wrussd dword ptr [edx + ebp],edi
+	wrussd dword ptr [esi + ecx],edi
 	setssbsy
-	clrssbsy QWORD PTR [esp + eax]
+	clrssbsy QWORD PTR [esp + eax * 2]
 	endbr64
 	endbr32
+	.code16
+	.endr
--- a/gas/testsuite/gas/i386/code16.d
+++ b/gas/testsuite/gas/i386/code16.d
@@ -10,6 +10,13 @@ Disassembly of section .text:
  +[a-f0-9]+:	f3 66 a7             	repz cmpsl %es:\(%di\),%ds:\(%si\)
  +[a-f0-9]+:	66 f3 a5             	rep movsl %ds:\(%si\),%es:\(%di\)
  +[a-f0-9]+:	66 f3 a7             	repz cmpsl %es:\(%di\),%ds:\(%si\)
+ +[a-f0-9]+:	0f 20 d1             	mov    %cr2,%ecx
+ +[a-f0-9]+:	0f 22 d1             	mov    %ecx,%cr2
+ +[a-f0-9]+:	0f 21 d1             	mov    %d[br]2,%ecx
+ +[a-f0-9]+:	0f 23 d1             	mov    %ecx,%d[br]2
+ +[a-f0-9]+:	0f 24 d1             	mov    %tr2,%ecx
+ +[a-f0-9]+:	0f 26 d1             	mov    %ecx,%tr2
+ +[a-f0-9]+:	66 0f c9             	bswap  %ecx
  +[a-f0-9]+:	66 f3 a5             	rep movsl %ds:\(%si\),%es:\(%di\)
  +[a-f0-9]+:	66 f3 a7             	repz cmpsl %es:\(%di\),%ds:\(%si\)
 #pass
--- a/gas/testsuite/gas/i386/code16.s
+++ b/gas/testsuite/gas/i386/code16.s
@@ -4,6 +4,18 @@
 	rep; cmpsd
 	rep movsd %ds:(%si),%es:(%di)
 	rep cmpsd %es:(%di),%ds:(%si)
+
+	mov	%cr2, %ecx
+	mov	%ecx, %cr2
+
+	mov	%dr2, %ecx
+	mov	%ecx, %dr2
+
+	mov	%tr2, %ecx
+	mov	%ecx, %tr2
+
+	bswap	%ecx
+
 	.intel_syntax noprefix
 	rep movsd dword ptr es:[di], dword ptr ds:[si]
 	rep cmpsd dword ptr ds:[si], dword ptr es:[di]
--- a/gas/testsuite/gas/i386/ept-intel.d
+++ b/gas/testsuite/gas/i386/ept-intel.d
@@ -11,4 +11,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	66 0f 38 81 19       	invvpid ebx,OWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	66 0f 38 80 19       	invept ebx,OWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	66 0f 38 81 19       	invvpid ebx,OWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:	67 66 0f 38 80 19    	invept ebx,OWORD PTR \[bx\+di\]
+[ 	]*[a-f0-9]+:	67 66 0f 38 81 19    	invvpid ebx,OWORD PTR \[bx\+di\]
+[ 	]*[a-f0-9]+:	67 66 0f 38 80 19    	invept ebx,OWORD PTR \[bx\+di\]
+[ 	]*[a-f0-9]+:	67 66 0f 38 81 19    	invvpid ebx,OWORD PTR \[bx\+di\]
 #pass
--- a/gas/testsuite/gas/i386/ept.d
+++ b/gas/testsuite/gas/i386/ept.d
@@ -10,4 +10,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	66 0f 38 81 19       	invvpid \(%ecx\),%ebx
 [ 	]*[a-f0-9]+:	66 0f 38 80 19       	invept \(%ecx\),%ebx
 [ 	]*[a-f0-9]+:	66 0f 38 81 19       	invvpid \(%ecx\),%ebx
+[ 	]*[a-f0-9]+:	67 66 0f 38 80 19    	invept \(%bx,%di\),%ebx
+[ 	]*[a-f0-9]+:	67 66 0f 38 81 19    	invvpid \(%bx,%di\),%ebx
+[ 	]*[a-f0-9]+:	67 66 0f 38 80 19    	invept \(%bx,%di\),%ebx
+[ 	]*[a-f0-9]+:	67 66 0f 38 81 19    	invvpid \(%bx,%di\),%ebx
 #pass
--- a/gas/testsuite/gas/i386/ept.s
+++ b/gas/testsuite/gas/i386/ept.s
@@ -1,9 +1,15 @@
 # Check EPT instructions
 	.text
 _start:
+	.rept 2
+
 	invept	(%ecx), %ebx
 	invvpid	(%ecx), %ebx
 
 	.intel_syntax noprefix
 	invept ebx, oword ptr [ecx]
 	invvpid ebx, oword ptr [ecx]
+
+	.att_syntax prefix
+	.code16
+	.endr
--- a/gas/testsuite/gas/i386/fsgs-intel.d
+++ b/gas/testsuite/gas/i386/fsgs-intel.d
@@ -16,4 +16,12 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f3 0f ae cb          	rdgsbase ebx
 [ 	]*[a-f0-9]+:	f3 0f ae d3          	wrfsbase ebx
 [ 	]*[a-f0-9]+:	f3 0f ae db          	wrgsbase ebx
+[ 	]*[a-f0-9]+:	f3 0f ae c3          	rdfsbase ebx
+[ 	]*[a-f0-9]+:	f3 0f ae cb          	rdgsbase ebx
+[ 	]*[a-f0-9]+:	f3 0f ae d3          	wrfsbase ebx
+[ 	]*[a-f0-9]+:	f3 0f ae db          	wrgsbase ebx
+[ 	]*[a-f0-9]+:	f3 0f ae c3          	rdfsbase ebx
+[ 	]*[a-f0-9]+:	f3 0f ae cb          	rdgsbase ebx
+[ 	]*[a-f0-9]+:	f3 0f ae d3          	wrfsbase ebx
+[ 	]*[a-f0-9]+:	f3 0f ae db          	wrgsbase ebx
 #pass
--- a/gas/testsuite/gas/i386/fsgs.d
+++ b/gas/testsuite/gas/i386/fsgs.d
@@ -15,4 +15,12 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f3 0f ae cb          	rdgsbase %ebx
 [ 	]*[a-f0-9]+:	f3 0f ae d3          	wrfsbase %ebx
 [ 	]*[a-f0-9]+:	f3 0f ae db          	wrgsbase %ebx
+[ 	]*[a-f0-9]+:	f3 0f ae c3          	rdfsbase %ebx
+[ 	]*[a-f0-9]+:	f3 0f ae cb          	rdgsbase %ebx
+[ 	]*[a-f0-9]+:	f3 0f ae d3          	wrfsbase %ebx
+[ 	]*[a-f0-9]+:	f3 0f ae db          	wrgsbase %ebx
+[ 	]*[a-f0-9]+:	f3 0f ae c3          	rdfsbase %ebx
+[ 	]*[a-f0-9]+:	f3 0f ae cb          	rdgsbase %ebx
+[ 	]*[a-f0-9]+:	f3 0f ae d3          	wrfsbase %ebx
+[ 	]*[a-f0-9]+:	f3 0f ae db          	wrgsbase %ebx
 #pass
--- a/gas/testsuite/gas/i386/fsgs.s
+++ b/gas/testsuite/gas/i386/fsgs.s
@@ -2,6 +2,7 @@
 
 	.text
 foo:
+	.rept 2
 	rdfsbase %ebx
 	rdgsbase %ebx
 	wrfsbase %ebx
@@ -12,3 +13,7 @@ foo:
 	rdgsbase ebx
 	wrfsbase ebx
 	wrgsbase ebx
+
+	.att_syntax prefix
+	.code16
+	.endr
--- a/gas/testsuite/gas/i386/invpcid-intel.d
+++ b/gas/testsuite/gas/i386/invpcid-intel.d
@@ -12,4 +12,7 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	66 0f 38 82 10       	invpcid edx,\[eax\]
 [ 	]*[a-f0-9]+:	66 0f 38 82 10       	invpcid edx,\[eax\]
 [ 	]*[a-f0-9]+:	66 0f 38 82 10       	invpcid edx,\[eax\]
+[ 	]*[a-f0-9]+:	67 66 0f 38 82 10    	invpcid edx,\[bx\+si\]
+[ 	]*[a-f0-9]+:	67 66 0f 38 82 10    	invpcid edx,\[bx\+si\]
+[ 	]*[a-f0-9]+:	67 66 0f 38 82 10    	invpcid edx,\[bx\+si\]
 #pass
--- a/gas/testsuite/gas/i386/invpcid.d
+++ b/gas/testsuite/gas/i386/invpcid.d
@@ -11,4 +11,7 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	66 0f 38 82 10       	invpcid \(%eax\),%edx
 [ 	]*[a-f0-9]+:	66 0f 38 82 10       	invpcid \(%eax\),%edx
 [ 	]*[a-f0-9]+:	66 0f 38 82 10       	invpcid \(%eax\),%edx
+[ 	]*[a-f0-9]+:	67 66 0f 38 82 10    	invpcid \(%bx,%si\),%edx
+[ 	]*[a-f0-9]+:	67 66 0f 38 82 10    	invpcid \(%bx,%si\),%edx
+[ 	]*[a-f0-9]+:	67 66 0f 38 82 10    	invpcid \(%bx,%si\),%edx
 #pass
--- a/gas/testsuite/gas/i386/invpcid.s
+++ b/gas/testsuite/gas/i386/invpcid.s
@@ -2,8 +2,14 @@
 
 	.text
 foo:
+	.rept 2
+
 	invpcid	(%eax), %edx
 
 	.intel_syntax noprefix
 	invpcid	edx,[eax]
 	invpcid	edx,oword ptr [eax]
+
+	.att_syntax prefix
+	.code16
+	.endr
--- a/gas/testsuite/gas/i386/movdir-intel.d
+++ b/gas/testsuite/gas/i386/movdir-intel.d
@@ -16,4 +16,11 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*0f 38 f9 01[ 	]*movdiri DWORD PTR \[ecx\],eax
 [ 	]*[a-f0-9]+:[ 	]*66 0f 38 f8 01[ 	]*movdir64b eax,\[ecx\]
 [ 	]*[a-f0-9]+:[ 	]*67 66 0f 38 f8 04[ 	]*movdir64b ax,\[si\]
+[ 	]*[a-f0-9]+:[ 	]*67 0f 38 f9 01[ 	]*movdiri DWORD PTR \[bx\+di\],eax
+[ 	]*[a-f0-9]+:[ 	]*67 66 0f 38 f8 01[ 	]*movdir64b ax,\[bx\+di\]
+[ 	]*[a-f0-9]+:[ 	]*66 0f 38 f8 04 67[ 	]*movdir64b eax,\[edi\+eiz\*2\]
+[ 	]*[a-f0-9]+:[ 	]*0f 38 f9 01[ 	]*movdiri DWORD PTR \[ecx\],eax
+[ 	]*[a-f0-9]+:[ 	]*67 0f 38 f9 01[ 	]*movdiri DWORD PTR \[bx\+di\],eax
+[ 	]*[a-f0-9]+:[ 	]*67 66 0f 38 f8 01[ 	]*movdir64b ax,\[bx\+di\]
+[ 	]*[a-f0-9]+:[ 	]*66 0f 38 f8 04 90[ 	]*movdir64b eax,\[eax\+edx\*4\]
 #pass
--- a/gas/testsuite/gas/i386/movdir.d
+++ b/gas/testsuite/gas/i386/movdir.d
@@ -16,4 +16,11 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*0f 38 f9 01[ 	]*movdiri %eax,\(%ecx\)
 [ 	]*[a-f0-9]+:[ 	]*66 0f 38 f8 01[ 	]*movdir64b \(%ecx\),%eax
 [ 	]*[a-f0-9]+:[ 	]*67 66 0f 38 f8 04[ 	]*movdir64b \(%si\),%ax
+[ 	]*[a-f0-9]+:[ 	]*67 0f 38 f9 01[ 	]*movdiri %eax,\(%bx,%di\)
+[ 	]*[a-f0-9]+:[ 	]*67 66 0f 38 f8 01[ 	]*movdir64b \(%bx,%di\),%ax
+[ 	]*[a-f0-9]+:[ 	]*66 0f 38 f8 04 67[ 	]*movdir64b \(%edi,%eiz,2\),%eax
+[ 	]*[a-f0-9]+:[ 	]*0f 38 f9 01[ 	]*movdiri %eax,\(%ecx\)
+[ 	]*[a-f0-9]+:[ 	]*67 0f 38 f9 01[ 	]*movdiri %eax,\(%bx,%di\)
+[ 	]*[a-f0-9]+:[ 	]*67 66 0f 38 f8 01[ 	]*movdir64b \(%bx,%di\),%ax
+[ 	]*[a-f0-9]+:[ 	]*66 0f 38 f8 04 90[ 	]*movdir64b \(%eax,%edx,4\),%eax
 #pass
--- a/gas/testsuite/gas/i386/movdir.s
+++ b/gas/testsuite/gas/i386/movdir.s
@@ -3,6 +3,7 @@
 	.allow_index_reg
 	.text
 _start:
+	.rept 2
 	movdiri %eax, (%ecx)
 	movdir64b (%ecx),%eax
 	movdir64b (%si),%ax
@@ -12,3 +13,9 @@ _start:
 	movdiri dword ptr [ecx], eax
 	movdir64b eax,[ecx]
 	movdir64b ax,[si]
+
+	.att_syntax prefix
+	.code16
+	.endr
+
+	nop
--- a/gas/testsuite/gas/i386/ptwrite-intel.d
+++ b/gas/testsuite/gas/i386/ptwrite-intel.d
@@ -16,4 +16,11 @@ Disassembly of section \.text:
  +[a-f0-9]+:	f3 0f ae e1          	ptwrite ecx
  +[a-f0-9]+:	f3 0f ae 21          	ptwrite DWORD PTR \[ecx\]
  +[a-f0-9]+:	f3 0f ae 21          	ptwrite DWORD PTR \[ecx\]
+ +[a-f0-9]+:	f3 0f ae e1          	ptwrite ecx
+ +[a-f0-9]+:	f3 0f ae e1          	ptwrite ecx
+ +[a-f0-9]+:	67 f3 0f ae 21       	ptwrite DWORD PTR \[bx\+di\]
+ +[a-f0-9]+:	67 f3 0f ae 21       	ptwrite DWORD PTR \[bx\+di\]
+ +[a-f0-9]+:	f3 0f ae e1          	ptwrite ecx
+ +[a-f0-9]+:	67 f3 0f ae 21       	ptwrite DWORD PTR \[bx\+di\]
+ +[a-f0-9]+:	67 f3 0f ae 21       	ptwrite DWORD PTR \[bx\+di\]
 #pass
--- a/gas/testsuite/gas/i386/ptwrite.d
+++ b/gas/testsuite/gas/i386/ptwrite.d
@@ -16,4 +16,11 @@ Disassembly of section \.text:
  +[a-f0-9]+:	f3 0f ae e1          	ptwrite %ecx
  +[a-f0-9]+:	f3 0f ae 21          	ptwritel \(%ecx\)
  +[a-f0-9]+:	f3 0f ae 21          	ptwritel \(%ecx\)
+ +[a-f0-9]+:	f3 0f ae e1          	ptwrite %ecx
+ +[a-f0-9]+:	f3 0f ae e1          	ptwrite %ecx
+ +[a-f0-9]+:	67 f3 0f ae 21       	ptwritel \(%bx,%di\)
+ +[a-f0-9]+:	67 f3 0f ae 21       	ptwritel \(%bx,%di\)
+ +[a-f0-9]+:	f3 0f ae e1          	ptwrite %ecx
+ +[a-f0-9]+:	67 f3 0f ae 21       	ptwritel \(%bx,%di\)
+ +[a-f0-9]+:	67 f3 0f ae 21       	ptwritel \(%bx,%di\)
 #pass
--- a/gas/testsuite/gas/i386/ptwrite.s
+++ b/gas/testsuite/gas/i386/ptwrite.s
@@ -2,6 +2,7 @@
 
 	.text
 _start:
+	.rept 2
 	ptwrite %ecx
 	ptwritel %ecx
 	ptwrite (%ecx)
@@ -11,3 +12,7 @@ _start:
 	ptwrite ecx
 	ptwrite [ecx]
 	ptwrite DWORD PTR [ecx]
+
+	.att_syntax prefix
+	.code16
+	.endr
--- a/gas/testsuite/gas/i386/rdpid-intel.d
+++ b/gas/testsuite/gas/i386/rdpid-intel.d
@@ -8,4 +8,5 @@ Disassembly of section .text:
 
 0+ <_start>:
 [ 	]*[a-f0-9]+:[ 	]*f3 0f c7 f8[ 	]*rdpid  eax
+[ 	]*[a-f0-9]+:[ 	]*f3 0f c7 f9[ 	]*rdpid  ecx
 #pass
--- a/gas/testsuite/gas/i386/rdpid.d
+++ b/gas/testsuite/gas/i386/rdpid.d
@@ -8,4 +8,5 @@ Disassembly of section .text:
 
 0+ <_start>:
 [ 	]*[a-f0-9]+:[ 	]*f3 0f c7 f8[ 	]*rdpid  %eax
+[ 	]*[a-f0-9]+:[ 	]*f3 0f c7 f9[ 	]*rdpid  %ecx
 #pass
--- a/gas/testsuite/gas/i386/rdpid.s
+++ b/gas/testsuite/gas/i386/rdpid.s
@@ -3,3 +3,6 @@
 	.text
 _start:
 	rdpid %eax
+
+	.code16
+	rdpid %ecx
--- a/gas/testsuite/gas/i386/sse2-16bit.d
+++ b/gas/testsuite/gas/i386/sse2-16bit.d
@@ -164,4 +164,23 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	66 0f fb c1          	psubq  %xmm1,%xmm0
 [ 	]*[a-f0-9]+:	67 66 0f fb 00       	psubq  \(%eax\),%xmm0
 [ 	]*[a-f0-9]+:	0f 58 2f             	addps  \(%bx\),%xmm5
+[ 	]*[a-f0-9]+:	f3 0f 2a d9          	cvtsi2ss %ecx,%xmm3
+[ 	]*[a-f0-9]+:	f3 0f 2d cb          	cvtss2si %xmm3,%ecx
+[ 	]*[a-f0-9]+:	f3 0f 2c cb          	cvttss2si %xmm3,%ecx
+[ 	]*[a-f0-9]+:	66 0f 3a 17 ca 00    	extractps \$0x0,%xmm1,%edx
+[ 	]*[a-f0-9]+:	0f 50 ca             	movmskps %xmm2,%ecx
+[ 	]*[a-f0-9]+:	66 0f 3a 14 ca 00    	pextrb \$0x0,%xmm1,%edx
+[ 	]*[a-f0-9]+:	66 0f 3a 16 ca 00    	pextrd \$0x0,%xmm1,%edx
+[ 	]*[a-f0-9]+:	0f c5 d1 00          	pextrw \$0x0,%mm1,%edx
+[ 	]*[a-f0-9]+:	66 0f c5 d1 00       	pextrw \$0x0,%xmm1,%edx
+[ 	]*[a-f0-9]+:	66 0f 3a 20 d1 00    	pinsrb \$0x0,%ecx,%xmm2
+[ 	]*[a-f0-9]+:	66 0f 3a 22 d1 00    	pinsrd \$0x0,%ecx,%xmm2
+[ 	]*[a-f0-9]+:	0f c4 d1 00          	pinsrw \$0x0,%ecx,%mm2
+[ 	]*[a-f0-9]+:	66 0f c4 d1 00       	pinsrw \$0x0,%ecx,%xmm2
+[ 	]*[a-f0-9]+:	66 0f d7 d3          	pmovmskb %xmm3,%edx
+[ 	]*[a-f0-9]+:	f3 0f 2a 05          	cvtsi2ssl? \(%di\),%xmm0
+[ 	]*[a-f0-9]+:	66 0f 3a 17 0d 00    	extractps \$0x0,%xmm1,\(%di\)
+[ 	]*[a-f0-9]+:	66 0f 3a 21 05 00    	insertps \$0x0,\(%di\),%xmm0
+[ 	]*[a-f0-9]+:	66 0f 3a 16 0d 00    	pextrd \$0x0,%xmm1,\(%di\)
+[ 	]*[a-f0-9]+:	66 0f 3a 22 05 00    	pinsrd \$0x0,\(%di\),%xmm0
 #pass
--- a/gas/testsuite/gas/i386/sse2-16bit.s
+++ b/gas/testsuite/gas/i386/sse2-16bit.s
@@ -4,4 +4,26 @@
 	.include "sse2.s"
 	.att_syntax prefix
 
+	# also a few SSE* insns
 	addps (%bx),%xmm5
+	cvtsi2ss %ecx,%xmm3
+	cvtss2si %xmm3,%ecx
+	cvttss2si %xmm3,%ecx
+	extractps $0,%xmm1,%edx
+	movmskps %xmm2,%ecx
+	pextrb $0,%xmm1,%edx
+	pextrd $0,%xmm1,%edx
+	pextrw $0,%mm1,%edx
+	pextrw $0,%xmm1,%edx
+	pinsrb $0,%ecx,%xmm2
+	pinsrd $0,%ecx,%xmm2
+	pinsrw $0,%ecx,%mm2
+	pinsrw $0,%ecx,%xmm2
+	pmovmskb %xmm3,%edx
+
+	.intel_syntax noprefix
+	cvtsi2ss xmm0, dword ptr [di]
+	extractps dword ptr [di], xmm1, 0
+	insertps xmm0, dword ptr [di], 0
+	pextrd dword ptr [di], xmm1, 0
+	pinsrd xmm0, dword ptr [di], 0
--- a/gas/testsuite/gas/i386/vmx.d
+++ b/gas/testsuite/gas/i386/vmx.d
@@ -22,4 +22,20 @@ Disassembly of section .text:
   29:	0f 79 d8 [ 	]*vmwrite %eax,%ebx
   2c:	0f 79 18 [ 	]*vmwrite \(%eax\),%ebx
   2f:	0f 79 18 [ 	]*vmwrite \(%eax\),%ebx
-	...
+[ 	]*[a-f0-9]+:	0f 01 c1[ 	]*vmcall *
+[ 	]*[a-f0-9]+:	0f 01 c2[ 	]*vmlaunch *
+[ 	]*[a-f0-9]+:	0f 01 c3[ 	]*vmresume *
+[ 	]*[a-f0-9]+:	0f 01 c4[ 	]*vmxoff *
+[ 	]*[a-f0-9]+:	67 66 0f c7 30[ 	]*vmclear \(%bx,%si\)
+[ 	]*[a-f0-9]+:	67 0f c7 30[ 	]*vmptrld \(%bx,%si\)
+[ 	]*[a-f0-9]+:	67 0f c7 38[ 	]*vmptrst \(%bx,%si\)
+[ 	]*[a-f0-9]+:	67 f3 0f c7 30[ 	]*vmxon  \(%bx,%si\)
+[ 	]*[a-f0-9]+:	0f 78 c3[ 	]*vmread %eax,%ebx
+[ 	]*[a-f0-9]+:	0f 78 c3[ 	]*vmread %eax,%ebx
+[ 	]*[a-f0-9]+:	67 0f 78 03[ 	]*vmread %eax,\(%bp,%di\)
+[ 	]*[a-f0-9]+:	67 0f 78 03[ 	]*vmread %eax,\(%bp,%di\)
+[ 	]*[a-f0-9]+:	0f 79 d8[ 	]*vmwrite %eax,%ebx
+[ 	]*[a-f0-9]+:	0f 79 d8[ 	]*vmwrite %eax,%ebx
+[ 	]*[a-f0-9]+:	67 0f 79 18[ 	]*vmwrite \(%bx,%si\),%ebx
+[ 	]*[a-f0-9]+:	67 0f 79 18[ 	]*vmwrite \(%bx,%si\),%ebx
+#pass
--- a/gas/testsuite/gas/i386/vmx.s
+++ b/gas/testsuite/gas/i386/vmx.s
@@ -2,6 +2,8 @@
 
 	.text
 foo:
+	.rept 2
+
 	vmcall
 	vmlaunch
 	vmresume
@@ -18,4 +20,7 @@ foo:
 	vmwritel %eax,%ebx
 	vmwrite (%eax),%ebx
 	vmwritel (%eax),%ebx
+
+	.code16
+	.endr
 	.p2align	4,0
--- a/gas/testsuite/gas/i386/waitpkg-intel.d
+++ b/gas/testsuite/gas/i386/waitpkg-intel.d
@@ -17,4 +17,12 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause ebx
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait edi
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause edi
+[ 	]*[a-f0-9]+:[ 	]*67 f3 0f ae f0[ 	]*umonitor ax
+[ 	]*[a-f0-9]+:[ 	]*f3 0f ae f1[ 	]*umonitor ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f3[ 	]*umwait ebx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause ecx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause ebx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait edi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause edi
 #pass
--- a/gas/testsuite/gas/i386/waitpkg.d
+++ b/gas/testsuite/gas/i386/waitpkg.d
@@ -17,4 +17,12 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause %ebx
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait %edi
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause %edi
+[ 	]*[a-f0-9]+:[ 	]*67 f3 0f ae f0[ 	]*umonitor %ax
+[ 	]*[a-f0-9]+:[ 	]*f3 0f ae f1[ 	]*umonitor %ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait %ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f3[ 	]*umwait %ebx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause %ecx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause %ebx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait %edi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause %edi
 #pass
--- a/gas/testsuite/gas/i386/waitpkg.s
+++ b/gas/testsuite/gas/i386/waitpkg.s
@@ -2,6 +2,7 @@
 
 	.text
 _start:
+	.rept 2
 	umonitor %eax
 	umonitor %cx
 	umwait %ecx
@@ -13,3 +14,7 @@ _start:
 
 	umwait edi, edx, eax
 	tpause edi, edx, eax
+
+	.att_syntax prefix
+	.code16
+	.endr
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1294,7 +1294,7 @@ movlps, 2, 0x12, None, 1, CpuAVX, Modrm|
 movlps, 2, 0x13, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex }
 movlps, 2, 0xf12, None, 2, CpuSSE, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 movmskps, 2, 0x50, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { RegXMM, Reg32|Reg64 }
-movmskps, 2, 0xf50, None, 2, CpuSSE, Modrm|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
+movmskps, 2, 0xf50, None, 2, CpuSSE, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
 movntps, 2, 0x2b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Xmmword|Unspecified|BaseIndex }
 movntps, 2, 0xf2b, None, 2, CpuSSE, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
 movntq, 2, 0xfe7, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoAVX, { RegMMX, Qword|Unspecified|BaseIndex }
@@ -1588,7 +1588,7 @@ movsldup, 2, 0xf30f12, None, 2, CpuSSE3,
 mwait, 0, 0xf01c9, None, 3, CpuSSE3, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoAVX, { 0 }
 // mwait is very special. AX and CX are always 32 bits.
 // The 64-bit form exists only for compatibility with older gas.
-mwait, 2, 0xf01c9, None, 3, CpuSSE3, CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|NoAVX, { Acc|Dword|Qword, RegC|Dword|Qword }
+mwait, 2, 0xf01c9, None, 3, CpuSSE3, CheckRegSize|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|NoAVX, { Acc|Dword|Qword, RegC|Dword|Qword }
 
 // VMX instructions.
 
@@ -1598,9 +1598,9 @@ vmlaunch, 0, 0xf01c2, None, 3, CpuVMX, N
 vmresume, 0, 0xf01c3, None, 3, CpuVMX, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 vmptrld, 1, 0xfc7, 0x6, 2, CpuVMX, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
 vmptrst, 1, 0xfc7, 0x7, 2, CpuVMX, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
-vmread, 2, 0xf78, None, 2, CpuVMX|CpuNo64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32, Reg32|Dword|Unspecified|BaseIndex }
+vmread, 2, 0xf78, None, 2, CpuVMX|CpuNo64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32, Reg32|Unspecified|BaseIndex }
 vmread, 2, 0xf78, None, 2, CpuVMX|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Reg64, Reg64|Qword|Unspecified|BaseIndex }
-vmwrite, 2, 0xf79, None, 2, CpuVMX|CpuNo64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Dword|Unspecified|BaseIndex, Reg32 }
+vmwrite, 2, 0xf79, None, 2, CpuVMX|CpuNo64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, Reg32 }
 vmwrite, 2, 0xf79, None, 2, CpuVMX|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Reg64|Qword|Unspecified|BaseIndex, Reg64 }
 vmxoff, 0, 0xf01c4, None, 3, CpuVMX, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 vmxon, 1, 0xf30fc7, 0x6, 2, CpuVMX, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
@@ -1615,14 +1615,14 @@ getsec, 0, 0xf37, None, 2, CpuSMX, No_bS
 
 // EPT instructions.
 
-invept, 2, 0x660f3880, None, 3, CpuEPT|CpuNo64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf, { Oword|Unspecified|BaseIndex, Reg32 }
+invept, 2, 0x660f3880, None, 3, CpuEPT|CpuNo64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invept, 2, 0x660f3880, None, 3, CpuEPT|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
-invvpid, 2, 0x660f3881, None, 3, CpuEPT|CpuNo64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf, { Oword|Unspecified|BaseIndex, Reg32 }
+invvpid, 2, 0x660f3881, None, 3, CpuEPT|CpuNo64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invvpid, 2, 0x660f3881, None, 3, CpuEPT|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
 
 // INVPCID instruction
 
-invpcid, 2, 0x660f3882, None, 3, CpuINVPCID|CpuNo64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf, { Oword|Unspecified|BaseIndex, Reg32 }
+invpcid, 2, 0x660f3882, None, 3, CpuINVPCID|CpuNo64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf, { Oword|Unspecified|BaseIndex, Reg32 }
 invpcid, 2, 0x660f3882, None, 3, CpuINVPCID|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_qSuf|No_sSuf|No_ldSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
 
 // SSSE3 instructions.
@@ -2485,11 +2485,11 @@ vgf2p8mulb, 3, 0x66cf, None, 1, CpuAVX|C
 
 // FSGSBASE, RDRND and F16C
 
-rdfsbase, 1, 0xf30fae, 0x0, 2, CpuFSGSBase, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64 }
-rdgsbase, 1, 0xf30fae, 0x1, 2, CpuFSGSBase, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64 }
+rdfsbase, 1, 0xf30fae, 0x0, 2, CpuFSGSBase, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64 }
+rdgsbase, 1, 0xf30fae, 0x1, 2, CpuFSGSBase, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64 }
 rdrand, 1, 0xfc7, 0x6, 2, CpuRdRnd, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg16|Reg32|Reg64 }
-wrfsbase, 1, 0xf30fae, 0x2, 2, CpuFSGSBase, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64 }
-wrgsbase, 1, 0xf30fae, 0x3, 2, CpuFSGSBase, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64 }
+wrfsbase, 1, 0xf30fae, 0x2, 2, CpuFSGSBase, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64 }
+wrgsbase, 1, 0xf30fae, 0x3, 2, CpuFSGSBase, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64 }
 vcvtph2ps, 2, 0x6613, None, 1, CpuF16C, Modrm|Vex|VexOpcode=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvtph2ps, 2, 0x6613, None, 1, CpuF16C, Modrm|Vex=2|VexOpcode=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }
 vcvtps2ph, 3, 0x661d, None, 1, CpuF16C, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
@@ -2880,8 +2880,8 @@ xcryptofb, 0, 0xf30fa7e8, None, 3, CpuPa
 xstore, 0, 0xfa7c0, None, 3, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, { 0 }
 
 // Multy-precision Add Carry, rdseed instructions.
-adcx, 2, 0x660f38f6, None, 3, CpuADX, Modrm|CheckRegSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, Reg32|Reg64 }
-adox, 2, 0xf30f38f6, None, 3, CpuADX, Modrm|CheckRegSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, Reg32|Reg64 }
+adcx, 2, 0x660f38f6, None, 3, CpuADX, Modrm|CheckRegSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+adox, 2, 0xf30f38f6, None, 3, CpuADX, Modrm|CheckRegSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 rdseed, 1, 0xfc7, 0x7, 2, CpuRdSeed, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg16|Reg32|Reg64 }
 
 // SMAP instructions.
@@ -4702,7 +4702,7 @@ monitorx, 3, 0xf01fa, None, 3, CpuMWAITX
 
 mwaitx, 0, 0xf01fb, None, 3, CpuMWAITX, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 // The 64-bit form exists only for compatibility with older gas.
-mwaitx, 3, 0xf01fb, None, 3, CpuMWAITX, CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Acc|Dword|Qword, RegC|Dword|Qword, RegB|Dword|Qword }
+mwaitx, 3, 0xf01fb, None, 3, CpuMWAITX, CheckRegSize|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Acc|Dword|Qword, RegC|Dword|Qword, RegB|Dword|Qword }
 
 // MONITORX/MWAITX instructions end
 
@@ -4715,14 +4715,15 @@ wrpkru, 0, 0xf01ef, None, 3, CpuOSPKE, N
 
 // RDPID instructions.
 
-rdpid, 1, 0xf30fc7, 0x7, 2, CpuRDPID|CpuNo64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32 }
+rdpid, 1, 0xf30fc7, 0x7, 2, CpuRDPID|CpuNo64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32 }
 rdpid, 1, 0xf30fc7, 0x7, 2, CpuRDPID|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Reg64 }
 
 // RDPID instructions end.
 
 // PTWRITE instructions.
 
-ptwrite, 1, 0xf30fae, 0x4, 2, CpuPTWRITE, Modrm|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex }
+ptwrite, 1, 0xf30fae, 0x4, 2, CpuPTWRITE|CpuNo64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex }
+ptwrite, 1, 0xf30fae, 0x4, 2, CpuPTWRITE|Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64|Unspecified|BaseIndex }
 
 // PTWRITE instructions end.
 
@@ -4778,8 +4779,7 @@ cldemote, 1, 0x0f1c, 0x0, 2, CpuCLDEMOTE
 
 // MOVDIR[I,64B] instructions.
 
-movdiri, 2, 0xf38f9, None, 3, CpuMOVDIRI, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
-
+movdiri, 2, 0xf38f9, None, 3, CpuMOVDIRI, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 movdir64b, 2, 0x660f38f8, None, 3, CpuMOVDIR64B, Modrm|AddrPrefixOpReg, { Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // MOVEDIR instructions end.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
  2020-03-04  9:41 ` [PATCH 1/9] x86: refine TPAUSE and UMWAIT Jan Beulich
  2020-03-04  9:42 ` [PATCH 2/9] x86: add missing IgnoreSize Jan Beulich
@ 2020-03-04  9:43 ` Jan Beulich
  2020-03-04 11:46   ` H.J. Lu
  2020-03-04  9:44 ` [PATCH 4/9] x86: drop Rex64 attribute Jan Beulich
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04  9:43 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

Since 16-bit addressing isn't allowed, Disp32 needs to be forced; Disp16
fails to match the templates.

The SDM leaves open whether BNDC[LNU] with a GPR operand require an
operand size override; this aspect is therefore left untouched here.
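
For illustration only, the addressing-mode decision described above can be
sketched stand-alone as follows. This is not the gas code: the enum, the
struct, and choose_addr_mode() are hypothetical names standing in for
flag_code, i.prefix[ADDR_PREFIX], and the template's CpuMPX flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for gas's CODE_16BIT / CODE_32BIT / CODE_64BIT.  */
enum code_mode { MODE_16BIT, MODE_32BIT, MODE_64BIT };

struct addr_choice
{
  enum code_mode addr_mode;	/* Effective addressing mode.  */
  bool add_addr_prefix;		/* Assembler has to add a 0x67 prefix.  */
};

/* Pick the addressing mode for one memory operand.
   HAS_ADDR_PREFIX: the user already wrote an address size prefix.
   IS_MPX: the insn template belongs to the MPX extension.
   HAS_BASE_OR_INDEX: the operand names a base and/or index register.  */
static struct addr_choice
choose_addr_mode (enum code_mode code, bool has_addr_prefix,
		  bool is_mpx, bool has_base_or_index)
{
  struct addr_choice c = { code, false };

  if (has_addr_prefix)
    /* An explicit prefix toggles between 16- and 32-bit addressing.  */
    c.addr_mode = code == MODE_32BIT ? MODE_16BIT : MODE_32BIT;
  else if (code == MODE_16BIT && is_mpx && !has_base_or_index)
    {
      /* Displacement-only MPX operand in 16-bit code: force 32-bit
	 addressing, so that Disp32 (not Disp16) gets selected.  */
      c.addr_mode = MODE_32BIT;
      c.add_addr_prefix = true;
    }

  return c;
}
```

With this sketch, choose_addr_mode (MODE_16BIT, false, true, false) yields
32-bit addressing plus an added prefix, while an operand with a base or
index register keeps 16-bit addressing, so that the existing "16-bit
addressing not allowed" diagnostic can still be issued later.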

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (i386_addressing_mode): Force 32-bit
	addressing for MPX insns without base/index.
	* testsuite/gas/i386/mpx-16bit.s,
	* testsuite/gas/i386/mpx-16bit.d: New.
	* testsuite/gas/i386/i386.exp: Run new test.

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-dis.c (OP_E_memory): Exclude recording of used address
	prefix for "bnd" modes only in 64-bit mode. Don't decode 16-bit
	addressed memory operands for MPX insns.
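
As a rough sketch of the two disassembler-side rules listed above (not the
actual OP_E_memory() code; the enum and both helper names are made up for
this example, and "bnd bytemode" stands for any of v_bnd_mode,
v_bndmk_mode, bnd_mode, and bnd_swap_mode):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the disassembler's address_mode.  */
enum dis_mode { DIS_16BIT, DIS_32BIT, DIS_64BIT };

/* An MPX memory operand decoded with 16-bit addressing cannot be
   represented and is to be printed as "(bad)".  */
static bool
mpx_operand_is_bad (bool bnd_bytemode, bool addr16)
{
  return bnd_bytemode && addr16;
}

/* Whether an address size prefix gets recorded as used.
   HAS_ADDR_COMPONENT covers base, index, forced addr32, and RIP-relative
   operands.  Only in 64-bit mode are the "bnd" bytemodes excluded.  */
static bool
addr_prefix_is_used (bool bnd_bytemode, enum dis_mode mode,
		     bool has_addr_component)
{
  return has_addr_component && (mode != DIS_64BIT || !bnd_bytemode);
}
```

The second helper mirrors the reworked condition in the hunk below: outside
of 64-bit mode the prefix is recorded for MPX operands too, which is what
lets the 0x67 bytes in the new 16-bit test be consumed without a stray
prefix being reported.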

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -10297,6 +10297,21 @@ i386_addressing_mode (void)
 
   if (i.prefix[ADDR_PREFIX])
     addr_mode = flag_code == CODE_32BIT ? CODE_16BIT : CODE_32BIT;
+  else if (flag_code == CODE_16BIT
+	   && current_templates->start->cpu_flags.bitfield.cpumpx
+	   /* Avoid replacing the "16-bit addressing not allowed" diagnostic
+	      from md_assemble() by "is not a valid base/index expression"
+	      when there is a base and/or index.  */
+	   && !i.types[this_operand].bitfield.baseindex)
+    {
+      /* MPX insn memory operands with neither base nor index must be forced
+	 to use 32-bit addressing in 16-bit mode.  */
+      addr_mode = CODE_32BIT;
+      i.prefix[ADDR_PREFIX] = ADDR_PREFIX_OPCODE;
+      ++i.prefixes;
+      gas_assert (!i.types[this_operand].bitfield.disp16);
+      gas_assert (!i.types[this_operand].bitfield.disp32);
+    }
   else
     {
       addr_mode = flag_code;
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -329,6 +329,7 @@ if [expr ([istarget "i*86-*-*"] ||  [ist
     run_list_test "mpx-inval-1" "-al"
     run_list_test "mpx-inval-2" "-al"
     run_dump_test "mpx-add-bnd-prefix"
+    run_dump_test "mpx-16bit"
     run_list_test "bnd" "-al"
     run_dump_test "sha"
     run_dump_test "clflushopt"
--- /dev/null
+++ b/gas/testsuite/gas/i386/mpx-16bit.d
@@ -0,0 +1,145 @@
+#as: -I${srcdir}/$subdir
+#objdump: -drw -Mi8086
+#name: i386 MPX (16-bit)
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <start>:
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 08       	bndmk  \(%eax\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 0d 99 03 00 00 	addr32 bndmk 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 4a 03    	bndmk  0x3\(%edx\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 0c 08    	bndmk  \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 0c 0d 00 00 00 00 	bndmk  0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 4c 01 03 	bndmk  0x3\(%ecx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 08       	bndmov \(%eax\),%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 0d 99 03 00 00 	addr32 bndmov 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 52 03    	bndmov 0x3\(%edx\),%bnd2
+[ 	]*[a-f0-9]+:	67 66 0f 1a 14 10    	bndmov \(%eax,%edx,1\),%bnd2
+[ 	]*[a-f0-9]+:	67 66 0f 1a 14 05 00 00 00 00 	bndmov 0x0\(,%eax,1\),%bnd2
+[ 	]*[a-f0-9]+:	67 66 0f 1a 4c 01 03 	bndmov 0x3\(%ecx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	66 0f 1a c2          	bndmov %bnd2,%bnd0
+[ 	]*[a-f0-9]+:	67 66 0f 1b 08       	bndmov %bnd1,\(%eax\)
+[ 	]*[a-f0-9]+:	67 66 0f 1b 0d 99 03 00 00 	addr32 bndmov %bnd1,0x399
+[ 	]*[a-f0-9]+:	67 66 0f 1b 52 03    	bndmov %bnd2,0x3\(%edx\)
+[ 	]*[a-f0-9]+:	67 66 0f 1b 14 10    	bndmov %bnd2,\(%eax,%edx,1\)
+[ 	]*[a-f0-9]+:	67 66 0f 1b 14 05 00 00 00 00 	bndmov %bnd2,0x0\(,%eax,1\)
+[ 	]*[a-f0-9]+:	67 66 0f 1b 4c 01 03 	bndmov %bnd1,0x3\(%ecx,%eax,1\)
+[ 	]*[a-f0-9]+:	66 0f 1a d0          	bndmov %bnd0,%bnd2
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 09       	bndcl  \(%ecx\),%bnd1
+[ 	]*[a-f0-9]+:	f3 0f 1a c9          	bndcl  %ecx,%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 0d 99 03 00 00 	addr32 bndcl 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 4a 03    	bndcl  0x3\(%edx\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 0c 08    	bndcl  \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 0c 0d 00 00 00 00 	bndcl  0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 4c 01 03 	bndcl  0x3\(%ecx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 09       	bndcu  \(%ecx\),%bnd1
+[ 	]*[a-f0-9]+:	f2 0f 1a c9          	bndcu  %ecx,%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 0d 99 03 00 00 	addr32 bndcu 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 4a 03    	bndcu  0x3\(%edx\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 0c 08    	bndcu  \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 0c 0d 00 00 00 00 	bndcu  0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 4c 01 03 	bndcu  0x3\(%ecx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 09       	bndcn  \(%ecx\),%bnd1
+[ 	]*[a-f0-9]+:	f2 0f 1b c9          	bndcn  %ecx,%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 0d 99 03 00 00 	addr32 bndcn 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 4a 03    	bndcn  0x3\(%edx\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 0c 08    	bndcn  \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 0c 0d 00 00 00 00 	bndcn  0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 4c 01 03 	bndcn  0x3\(%ecx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 0f 1b 44 18 03    	bndstx %bnd0,0x3\(%eax,%ebx,1\)
+[ 	]*[a-f0-9]+:	67 0f 1b 54 13 03    	bndstx %bnd2,0x3\(%ebx,%edx,1\)
+[ 	]*[a-f0-9]+:	67 0f 1b 14 15 03 00 00 00 	bndstx %bnd2,0x3\(,%edx,1\)
+[ 	]*[a-f0-9]+:	67 0f 1b 9a 99 03 00 00 	bndstx %bnd3,0x399\(%edx\)
+[ 	]*[a-f0-9]+:	67 0f 1b 93 34 12 00 00 	bndstx %bnd2,0x1234\(%ebx\)
+[ 	]*[a-f0-9]+:	67 0f 1b 53 03       	bndstx %bnd2,0x3\(%ebx\)
+[ 	]*[a-f0-9]+:	67 0f 1b 0a          	bndstx %bnd1,\(%edx\)
+[ 	]*[a-f0-9]+:	67 0f 1a 44 18 03    	bndldx 0x3\(%eax,%ebx,1\),%bnd0
+[ 	]*[a-f0-9]+:	67 0f 1a 54 13 03    	bndldx 0x3\(%ebx,%edx,1\),%bnd2
+[ 	]*[a-f0-9]+:	67 0f 1a 14 15 03 00 00 00 	bndldx 0x3\(,%edx,1\),%bnd2
+[ 	]*[a-f0-9]+:	67 0f 1a 9a 99 03 00 00 	bndldx 0x399\(%edx\),%bnd3
+[ 	]*[a-f0-9]+:	67 0f 1a 93 34 12 00 00 	bndldx 0x1234\(%ebx\),%bnd2
+[ 	]*[a-f0-9]+:	67 0f 1a 53 03       	bndldx 0x3\(%ebx\),%bnd2
+[ 	]*[a-f0-9]+:	67 0f 1a 0a          	bndldx \(%edx\),%bnd1
+[ 	]*[a-f0-9]+:	f2 e8 91 01          	bnd call [a-f0-9]+ <foo>
+[ 	]*[a-f0-9]+:	67 f2 ff 10          	bnd call \*\(%eax\)
+[ 	]*[a-f0-9]+:	f2 0f 84 88 01       	bnd je [a-f0-9]+ <foo>
+[ 	]*[a-f0-9]+:	f2 e9 84 01          	bnd jmp [a-f0-9]+ <foo>
+[ 	]*[a-f0-9]+:	67 f2 ff 21          	bnd jmp \*\(%ecx\)
+[ 	]*[a-f0-9]+:	f2 c3                	bnd ret *
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 08       	bndmk  \(%eax\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 0d 99 03 00 00 	addr32 bndmk 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 49 03    	bndmk  0x3\(%ecx\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 0c 08    	bndmk  \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 0c 0d 00 00 00 00 	bndmk  0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1b 4c 02 03 	bndmk  0x3\(%edx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 08       	bndmov \(%eax\),%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 0d 99 03 00 00 	addr32 bndmov 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 49 03    	bndmov 0x3\(%ecx\),%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 0c 08    	bndmov \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 0c 0d 00 00 00 00 	bndmov 0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 66 0f 1a 4c 02 03 	bndmov 0x3\(%edx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	66 0f 1a c1          	bndmov %bnd1,%bnd0
+[ 	]*[a-f0-9]+:	67 66 0f 1b 08       	bndmov %bnd1,\(%eax\)
+[ 	]*[a-f0-9]+:	67 66 0f 1b 0d 99 03 00 00 	addr32 bndmov %bnd1,0x399
+[ 	]*[a-f0-9]+:	67 66 0f 1b 49 03    	bndmov %bnd1,0x3\(%ecx\)
+[ 	]*[a-f0-9]+:	67 66 0f 1b 0c 08    	bndmov %bnd1,\(%eax,%ecx,1\)
+[ 	]*[a-f0-9]+:	67 66 0f 1b 0c 0d 00 00 00 00 	bndmov %bnd1,0x0\(,%ecx,1\)
+[ 	]*[a-f0-9]+:	67 66 0f 1b 4c 02 03 	bndmov %bnd1,0x3\(%edx,%eax,1\)
+[ 	]*[a-f0-9]+:	66 0f 1a c8          	bndmov %bnd0,%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 08       	bndcl  \(%eax\),%bnd1
+[ 	]*[a-f0-9]+:	f3 0f 1a c9          	bndcl  %ecx,%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 0d 99 03 00 00 	addr32 bndcl 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 49 03    	bndcl  0x3\(%ecx\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 0c 08    	bndcl  \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 0c 0d 00 00 00 00 	bndcl  0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f3 0f 1a 4c 02 03 	bndcl  0x3\(%edx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 08       	bndcu  \(%eax\),%bnd1
+[ 	]*[a-f0-9]+:	f2 0f 1a c9          	bndcu  %ecx,%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 0d 99 03 00 00 	addr32 bndcu 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 49 03    	bndcu  0x3\(%ecx\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 0c 08    	bndcu  \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 0c 0d 00 00 00 00 	bndcu  0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1a 4c 02 03 	bndcu  0x3\(%edx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 08       	bndcn  \(%eax\),%bnd1
+[ 	]*[a-f0-9]+:	f2 0f 1b c9          	bndcn  %ecx,%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 0d 99 03 00 00 	addr32 bndcn 0x399,%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 49 03    	bndcn  0x3\(%ecx\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 0c 08    	bndcn  \(%eax,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 0c 0d 00 00 00 00 	bndcn  0x0\(,%ecx,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 f2 0f 1b 4c 02 03 	bndcn  0x3\(%edx,%eax,1\),%bnd1
+[ 	]*[a-f0-9]+:	67 0f 1b 44 18 03    	bndstx %bnd0,0x3\(%eax,%ebx,1\)
+[ 	]*[a-f0-9]+:	67 0f 1b 54 13 03    	bndstx %bnd2,0x3\(%ebx,%edx,1\)
+[ 	]*[a-f0-9]+:	67 0f 1b 14 0d 00 00 00 00 	bndstx %bnd2,0x0\(,%ecx,1\)
+[ 	]*[a-f0-9]+:	67 0f 1b 9a 99 03 00 00 	bndstx %bnd3,0x399\(%edx\)
+[ 	]*[a-f0-9]+:	67 0f 1b 14 1d 03 00 00 00 	bndstx %bnd2,0x3\(,%ebx,1\)
+[ 	]*[a-f0-9]+:	67 0f 1b 0a          	bndstx %bnd1,\(%edx\)
+[ 	]*[a-f0-9]+:	67 0f 1a 44 18 03    	bndldx 0x3\(%eax,%ebx,1\),%bnd0
+[ 	]*[a-f0-9]+:	67 0f 1a 54 13 03    	bndldx 0x3\(%ebx,%edx,1\),%bnd2
+[ 	]*[a-f0-9]+:	67 0f 1a 14 0d 00 00 00 00 	bndldx 0x0\(,%ecx,1\),%bnd2
+[ 	]*[a-f0-9]+:	67 0f 1a 9a 99 03 00 00 	bndldx 0x399\(%edx\),%bnd3
+[ 	]*[a-f0-9]+:	67 0f 1a 14 1d 03 00 00 00 	bndldx 0x3\(,%ebx,1\),%bnd2
+[ 	]*[a-f0-9]+:	67 0f 1a 0a          	bndldx \(%edx\),%bnd1
+[ 	]*[a-f0-9]+:	f2 e8 10 00          	bnd call [a-f0-9]+ <foo>
+[ 	]*[a-f0-9]+:	66 f2 ff d0          	bnd calll? \*%eax
+[ 	]*[a-f0-9]+:	f2 74 09             	bnd je [a-f0-9]+ <foo>
+[ 	]*[a-f0-9]+:	f2 eb 06             	bnd jmp [a-f0-9]+ <foo>
+[ 	]*[a-f0-9]+:	66 f2 ff e1          	bnd jmpl? \*%ecx
+[ 	]*[a-f0-9]+:	f2 c3                	bnd ret *
+
+[a-f0-9]+ <foo>:
+[ 	]*[a-f0-9]+:	f2 c3                	bnd ret *
+
+[a-f0-9]+ <bad>:
+#...
+[a-f0-9]+ <bad16>:
+[ 	]*[a-f0-9]+:	f3 0f 1b 00          	bndmk  \(bad\),%bnd0
+[ 	]*[a-f0-9]+:	66 0f 1a 00          	bndmov \(bad\),%bnd0
+[ 	]*[a-f0-9]+:	f3 0f 1a 00          	bndcl  \(bad\),%bnd0
+[ 	]*[a-f0-9]+:	f2 0f 1b 00          	bndcn  \(bad\),%bnd0
+[ 	]*[a-f0-9]+:	f2 0f 1a 00          	bndcu  \(bad\),%bnd0
+[ 	]*[a-f0-9]+:	0f 1b 00             	bndstx %bnd0,\(bad\)
+[ 	]*[a-f0-9]+:	0f 1a 00             	bndldx \(bad\),%bnd0
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/mpx-16bit.s
@@ -0,0 +1,13 @@
+	.code16
+	.include "mpx.s"
+
+	.att_syntax prefix
+	.code32
+bad16: # 16-bit addressing mode seen by the disassembler
+	bndmk	(%eax), %bnd0
+	bndmov	(%eax), %bnd0
+	bndcl	(%eax), %bnd0
+	bndcn	(%eax), %bnd0
+	bndcu	(%eax), %bnd0
+	bndstx	%bnd0, (%eax)
+	bndldx	(%eax), %bnd0
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -14272,10 +14272,11 @@ OP_E_memory (int bytemode, int sizeflag)
 	  }
 
       if ((havebase || haveindex || needindex || needaddr32 || riprel)
-	  && (bytemode != v_bnd_mode)
-	  && (bytemode != v_bndmk_mode)
-	  && (bytemode != bnd_mode)
-	  && (bytemode != bnd_swap_mode))
+	  && (address_mode != mode_64bit
+	      || ((bytemode != v_bnd_mode)
+		  && (bytemode != v_bndmk_mode)
+		  && (bytemode != bnd_mode)
+		  && (bytemode != bnd_swap_mode))))
 	used_prefixes |= PREFIX_ADDR;
 
       if (havedisp || (intel_syntax && riprel))
@@ -14356,6 +14357,14 @@ OP_E_memory (int bytemode, int sizeflag)
 	    }
 	}
     }
+  else if (bytemode == v_bnd_mode
+	   || bytemode == v_bndmk_mode
+	   || bytemode == bnd_mode
+	   || bytemode == bnd_swap_mode)
+    {
+      oappend ("(bad)");
+      return;
+    }
   else
     {
       /* 16 bit address mode */

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 4/9] x86: drop Rex64 attribute
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
                   ` (2 preceding siblings ...)
  2020-03-04  9:43 ` [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode Jan Beulich
@ 2020-03-04  9:44 ` Jan Beulich
  2020-03-04 11:47   ` H.J. Lu
  2020-03-04  9:45 ` [PATCH 6/9] x86: don't accept FI{LD,STP,STTP}LL in Intel syntax mode Jan Beulich
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04  9:44 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

It is almost entirely redundant with Size64, and the sole case (CRC32)
where direct replacement isn't possible can easily be taken care of in
another way.
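
To make the CRC32 special case concrete, here is a minimal sketch of the
intended effect. REX_OPCODE and REX_W carry the usual x86-64 values (0x40
and 8); rex_for_crc32() is a hypothetical helper for this example only and
does not exist in gas.

```c
#include <stdbool.h>
#include <stdint.h>

#define REX_OPCODE 0x40	/* Fixed upper nibble of every REX prefix.  */
#define REX_W      8	/* 64-bit operand size bit.  */

/* Return the REX prefix byte a CRC32 insn needs because of its operand
   size, or 0 if none is needed on that account.  What matters is the
   width of the destination register, not that of the source: a byte
   source paired with a 64-bit destination still requires REX.W.  */
static uint8_t
rex_for_crc32 (bool dest_is_64bit)
{
  uint8_t rex = 0;

  if (dest_is_64bit)
    rex |= REX_W;

  return rex ? (uint8_t) (REX_OPCODE | rex) : 0;
}
```

This is why Size64 cannot simply take the place of Rex64 on the CRC32
templates: the source operand size still has to be derived in the normal
way, while REX.W follows from the destination alone.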

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (md_assemble): Drop use of rex64.
	(process_suffix): Force REX.W for 64-bit CRC32.

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-gen.c (opcode_modifiers): Remove Rex64 field.
	* i386-opc.h (Rex64): Delete.
	(struct i386_opcode_modifier): Remove rex64 field.
	* i386-opc.tbl (crc32): Drop Rex64.
	Replace Rex64 with Size64 everywhere else.
	* i386-tbl.h: Re-generate.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4546,9 +4546,6 @@ md_assemble (char *line)
       i.op[0].disps->X_op = O_symbol;
     }
 
-  if (i.tm.opcode_modifier.rex64)
-    i.rex |= REX_W;
-
   /* For 8 bit registers we need an empty rex prefix.  Also if the
      instruction already has a prefix, we need to convert old
      registers to new ones.  */
@@ -6317,6 +6314,11 @@ process_suffix (void)
 	  || (i.tm.base_opcode == 0x63 && i.tm.cpu_flags.bitfield.cpu64))
 	--i.operands;
 
+      /* crc32 needs REX.W set regardless of suffix / source operand size.  */
+      if (i.tm.base_opcode == 0xf20f38f0
+          && i.tm.operand_types[1].bitfield.qword)
+        i.rex |= REX_W;
+
       /* If there's no instruction mnemonic suffix we try to invent one
 	 based on GPR operands.  */
       if (!i.suffix)
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -649,7 +649,6 @@ static bitfield opcode_modifiers[] =
   BITFIELD (IsPrefix),
   BITFIELD (ImmExt),
   BITFIELD (NoRex64),
-  BITFIELD (Rex64),
   BITFIELD (Ugh),
   BITFIELD (Vex),
   BITFIELD (VexVVVV),
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -496,8 +496,6 @@ enum
   ImmExt,
   /* instruction don't need Rex64 prefix.  */
   NoRex64,
-  /* instruction require Rex64 prefix.  */
-  Rex64,
   /* deprecated fp insn, gets a warning */
   Ugh,
   /* insn has VEX prefix:
@@ -689,7 +687,6 @@ typedef struct i386_opcode_modifier
   unsigned int isprefix:1;
   unsigned int immext:1;
   unsigned int norex64:1;
-  unsigned int rex64:1;
   unsigned int ugh:1;
   unsigned int vex:2;
   unsigned int vexvvvv:2;
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -134,9 +134,9 @@ movbe, 2, 0x0f38f1, None, 3, CpuMovbe, M
 movsbl, 2, 0xfbe, None, 2, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Byte|Unspecified|BaseIndex, Reg32 }
 movsbw, 2, 0xfbe, None, 2, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Byte|Unspecified|BaseIndex, Reg16 }
 movswl, 2, 0xfbf, None, 2, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg16|Word|Unspecified|BaseIndex, Reg32 }
-movsbq, 2, 0xfbe, None, 2, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg8|Byte|Unspecified|BaseIndex, Reg64 }
-movswq, 2, 0xfbf, None, 2, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg16|Word|Unspecified|BaseIndex, Reg64 }
-movslq, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg32|Dword|Unspecified|BaseIndex, Reg64 }
+movsbq, 2, 0xfbe, None, 2, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg8|Byte|Unspecified|BaseIndex, Reg64 }
+movswq, 2, 0xfbf, None, 2, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg16|Word|Unspecified|BaseIndex, Reg64 }
+movslq, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg32|Dword|Unspecified|BaseIndex, Reg64 }
 movsx, 2, 0xfbe, None, 2, Cpu386, W|Modrm|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Reg16|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 movsx, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
 movsxd, 2, 0x63, None, 1, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
@@ -919,9 +919,9 @@ sysenter, 0, 0xf34, None, 2, Cpu686|CpuN
 sysexit, 0, 0xf35, None, 2, Cpu64, Intel64Only|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 sysexit, 0, 0xf35, None, 2, Cpu686|CpuNo64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 fxsave, 1, 0xfae, 0x0, 2, CpuFXSR, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Unspecified|BaseIndex }
-fxsave64, 1, 0xfae, 0x0, 2, CpuFXSR|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Unspecified|BaseIndex }
+fxsave64, 1, 0xfae, 0x0, 2, CpuFXSR|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Unspecified|BaseIndex }
 fxrstor, 1, 0xfae, 0x1, 2, CpuFXSR, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Unspecified|BaseIndex }
-fxrstor64, 1, 0xfae, 0x1, 2, CpuFXSR|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Unspecified|BaseIndex }
+fxrstor64, 1, 0xfae, 0x1, 2, CpuFXSR|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Unspecified|BaseIndex }
 rdpmc, 0, 0xf33, None, 2, Cpu686, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 // official undefined instr.
 ud2, 0, 0xf0b, None, 2, Cpu186, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
@@ -1014,11 +1014,11 @@ emms, 0, 0xf77, None, 2, CpuMMX, No_bSuf
 // spec). AMD's spec, having been in existence for much longer, failed to
 // recognize that and specified movd for 32- and 64-bit operations.
 movd, 2, 0x666e, None, 1, CpuAVX, D|Modrm|Vex=1|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Reg32|Unspecified|BaseIndex, RegXMM }
-movd, 2, 0x666e, None, 1, CpuAVX|Cpu64, D|Modrm|Vex=1|VexOpcode=0|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
+movd, 2, 0x666e, None, 1, CpuAVX|Cpu64, D|Modrm|Vex=1|VexOpcode=0|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
 movd, 2, 0x660f6e, None, 2, CpuSSE2, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
-movd, 2, 0x660f6e, None, 2, CpuSSE2|Cpu64, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg64|BaseIndex, RegXMM }
+movd, 2, 0x660f6e, None, 2, CpuSSE2|Cpu64, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegXMM }
 movd, 2, 0xf6e, None, 2, CpuMMX, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegMMX }
-movd, 2, 0xf6e, None, 2, CpuMMX|Cpu64, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg64|BaseIndex, RegMMX }
+movd, 2, 0xf6e, None, 2, CpuMMX|Cpu64, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegMMX }
 // In the 64bit mode the short form mov immediate is redefined to have
 // 64bit displacement value.  We put the 64bit displacement first and
 // we only mark constants larger than 32bit as Disp64.
@@ -1558,7 +1558,7 @@ addsubpd, 2, 0x66d0, None, 1, CpuAVX, Mo
 addsubpd, 2, 0x660fd0, None, 2, CpuSSE3, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 addsubps, 2, 0xf2d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
 addsubps, 2, 0xf20fd0, None, 2, CpuSSE3, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
-cmpxchg16b, 1, 0xfc7, 0x1, 2, CpuCX16|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64|IsLockable, { Oword|Unspecified|BaseIndex }
+cmpxchg16b, 1, 0xfc7, 0x1, 2, CpuCX16|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|IsLockable, { Oword|Unspecified|BaseIndex }
 fisttp, 1, 0xdf, 0x1, 1, CpuFISTTP, Modrm|FloatMF|No_bSuf|No_wSuf|No_qSuf|No_ldSuf, { Word|Dword|Unspecified|BaseIndex }
 fisttp, 1, 0xdd, 0x1, 1, CpuFISTTP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
 fisttpll, 1, 0xdd, 0x1, 1, CpuFISTTP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex }
@@ -1804,20 +1804,20 @@ pcmpistri, 3, 0x660f3a63, None, 3, CpuSS
 pcmpistrm, 3, 0x6662, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 pcmpistrm, 3, 0x660f3a62, None, 3, CpuSSE4_2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 crc32, 2, 0xf20f38f0, None, 3, CpuSSE4_2, W|Modrm|No_sSuf|No_qSuf|No_ldSuf|NoAVX, { Reg8|Reg16|Reg32|Unspecified|BaseIndex, Reg32 }
-crc32, 2, 0xf20f38f0, None, 3, CpuSSE4_2|Cpu64, W|Modrm|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|Rex64|NoAVX, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
+crc32, 2, 0xf20f38f0, None, 3, CpuSSE4_2|Cpu64, W|Modrm|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoAVX, { Reg8|Reg64|Unspecified|BaseIndex, Reg64 }
 
 // xsave/xrstor New Instructions.
 
 xsave, 1, 0xfae, 0x4, 2, CpuXsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Unspecified|BaseIndex }
-xsave64, 1, 0xfae, 0x4, 2, CpuXsave|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Unspecified|BaseIndex }
+xsave64, 1, 0xfae, 0x4, 2, CpuXsave|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Unspecified|BaseIndex }
 xrstor, 1, 0xfae, 0x5, 2, CpuXsave, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Unspecified|BaseIndex }
-xrstor64, 1, 0xfae, 0x5, 2, CpuXsave|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Unspecified|BaseIndex }
+xrstor64, 1, 0xfae, 0x5, 2, CpuXsave|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Unspecified|BaseIndex }
 xgetbv, 0, 0xf01d0, None, 3, CpuXsave, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 xsetbv, 0, 0xf01d1, None, 3, CpuXsave, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 
 // xsaveopt
 xsaveopt, 1, 0xfae, 0x6, 2, CpuXsaveopt, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Unspecified|BaseIndex }
-xsaveopt64, 1, 0xfae, 0x6, 2, CpuXsaveopt|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Unspecified|BaseIndex }
+xsaveopt64, 1, 0xfae, 0x6, 2, CpuXsaveopt|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Unspecified|BaseIndex }
 
 // AES instructions.
 
@@ -4052,16 +4052,16 @@ clflushopt, 1, 0x660fae, 0x7, 2, CpuClfl
 // XSAVES/XRSTORS instructions.
 
 xrstors, 1, 0xfc7, 0x3, 2, CpuXSAVES, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex }
-xrstors64, 1, 0xfc7, 0x3, 2, CpuXSAVES|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Unspecified|BaseIndex }
+xrstors64, 1, 0xfc7, 0x3, 2, CpuXSAVES|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Unspecified|BaseIndex }
 xsaves, 1, 0xfc7, 0x5, 2, CpuXSAVES, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex }
-xsaves64, 1, 0xfc7, 0x5, 2, CpuXSAVES|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Unspecified|BaseIndex }
+xsaves64, 1, 0xfc7, 0x5, 2, CpuXSAVES|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Unspecified|BaseIndex }
 
 // XSAVES instructions end.
 
 // XSAVEC instructions.
 
 xsavec, 1, 0xfc7, 0x4, 2, CpuXSAVEC, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex }
-xsavec64, 1, 0xfc7, 0x4, 2, CpuXSAVEC|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Unspecified|BaseIndex }
+xsavec64, 1, 0xfc7, 0x4, 2, CpuXSAVEC|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Unspecified|BaseIndex }
 
 // XSAVEC instructions end.
 
@@ -4736,9 +4736,9 @@ rdsspq, 1, 0xf30f1e, 0x1, 2, CpuSHSTK|Cp
 saveprevssp, 0, 0xf30f01ea, None, 3, CpuSHSTK, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 rstorssp, 1, 0xf30f01, 0x5, 2, CpuSHSTK, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
 wrssd, 2, 0x0f38f6, None, 3, CpuSHSTK, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32, Dword|Unspecified|BaseIndex }
-wrssq, 2, 0x0f38f6, None, 3, CpuSHSTK|Cpu64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg64, Qword|Unspecified|BaseIndex }
+wrssq, 2, 0x0f38f6, None, 3, CpuSHSTK|Cpu64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64, Qword|Unspecified|BaseIndex }
 wrussd, 2, 0x660f38f5, None, 3, CpuSHSTK, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32, Dword|Unspecified|BaseIndex }
-wrussq, 2, 0x660f38f5, None, 3, CpuSHSTK|Cpu64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Rex64, { Reg64, Qword|Unspecified|BaseIndex }
+wrussq, 2, 0x660f38f5, None, 3, CpuSHSTK|Cpu64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64, Qword|Unspecified|BaseIndex }
 setssbsy, 0, 0xf30f01e8, None, 3, CpuSHSTK, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 clrssbsy, 1, 0xf30fae, 0x6, 2, CpuSHSTK, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
 endbr64, 0, 0xf30f1efa, None, 3, CpuIBT, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }

* [PATCH 6/9] x86: don't accept FI{LD,STP,STTP}LL in Intel syntax mode
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
                   ` (3 preceding siblings ...)
  2020-03-04  9:44 ` [PATCH 4/9] x86: drop Rex64 attribute Jan Beulich
@ 2020-03-04  9:45 ` Jan Beulich
  2020-03-04 11:55   ` H.J. Lu
  2020-03-04  9:46 ` [PATCH 7/9] x86: fold (supposed to be) identical code Jan Beulich
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04  9:45 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

As of commit dc2be329b950 ("i386: Only check suffix in instruction
mnemonic") these have been accepted even with a "qword ptr" operand size
specifier, but in 64-bit mode a REX prefix (with REX.W set) is now
wrongly emitted in this case. These aren't Intel syntax mnemonics, so
rather than fixing code generation, let's simply reject them. As a
result, the Qword attribute can then be dropped, too.
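
For illustration only (not part of the patch): a minimal C sketch of how
a per-template syntax restriction such as ATTSyntax can make the matcher
skip a template in Intel syntax mode. The struct and function names are
hypothetical stand-ins; the real insn_template and match_template() are
considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-in for an opcode template; the real
   insn_template carries many more fields.  */
struct tmpl_sketch
{
  const char *name;
  bool att_syntax;	/* Template is usable in AT&T syntax mode only.  */
  bool intel_syntax;	/* Template is usable in Intel syntax mode only.  */
};

/* Return true when the template has to be skipped in the current
   syntax mode.  */
static bool
skip_for_syntax (const struct tmpl_sketch *t, bool intel_mode)
{
  assert (t != NULL);
  return intel_mode ? t->att_syntax : t->intel_syntax;
}
```

With att_syntax set, a template like the one for fildll would be
skipped in Intel syntax mode while remaining usable in AT&T mode; a
template with neither flag set (like plain fild) is usable in both.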

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-opc.tbl (fildll, fistpll, fisttpll): Add ATTSyntax.
	* i386-tbl.h: Re-generate.

--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -606,7 +606,7 @@ fld, 1, 0xd9c0, None, 2, CpuFP, IgnoreSi
 fld, 1, 0xdb, 0x5, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf, { Tbyte|Unspecified|BaseIndex }
 fild, 1, 0xdf, 0x0, 1, CpuFP, Modrm|FloatMF|No_bSuf|No_wSuf|No_qSuf|No_ldSuf, { Word|Dword|Unspecified|BaseIndex }
 fild, 1, 0xdf, 0x5, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
-fildll, 1, 0xdf, 0x5, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex }
+fildll, 1, 0xdf, 0x5, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex }
 fldt, 1, 0xdb, 0x5, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Tbyte|Unspecified|BaseIndex }
 fbld, 1, 0xdf, 0x4, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf, { Tbyte|Unspecified|BaseIndex }
 
@@ -624,7 +624,7 @@ fstp, 1, 0xddd8, None, 2, CpuFP, IgnoreS
 fstp, 1, 0xdb, 0x7, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf, { Tbyte|Unspecified|BaseIndex }
 fistp, 1, 0xdf, 0x3, 1, CpuFP, Modrm|FloatMF|No_bSuf|No_wSuf|No_qSuf|No_ldSuf, { Word|Dword|Unspecified|BaseIndex }
 fistp, 1, 0xdf, 0x7, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
-fistpll, 1, 0xdf, 0x7, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex }
+fistpll, 1, 0xdf, 0x7, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex }
 fstpt, 1, 0xdb, 0x7, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Tbyte|Unspecified|BaseIndex }
 fbstp, 1, 0xdf, 0x6, 1, CpuFP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf, { Tbyte|Unspecified|BaseIndex }
 
@@ -1561,7 +1561,7 @@ addsubps, 2, 0xf20fd0, None, 2, CpuSSE3,
 cmpxchg16b, 1, 0xfc7, 0x1, 2, CpuCX16|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|IsLockable, { Oword|Unspecified|BaseIndex }
 fisttp, 1, 0xdf, 0x1, 1, CpuFISTTP, Modrm|FloatMF|No_bSuf|No_wSuf|No_qSuf|No_ldSuf, { Word|Dword|Unspecified|BaseIndex }
 fisttp, 1, 0xdd, 0x1, 1, CpuFISTTP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex }
-fisttpll, 1, 0xdd, 0x1, 1, CpuFISTTP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex }
+fisttpll, 1, 0xdd, 0x1, 1, CpuFISTTP, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex }
 haddpd, 2, 0x667c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
 haddpd, 2, 0x660f7c, None, 2, CpuSSE3, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 haddps, 2, 0xf27c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }

* [PATCH 7/9] x86: fold (supposed to be) identical code
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
                   ` (4 preceding siblings ...)
  2020-03-04  9:45 ` [PATCH 6/9] x86: don't accept FI{LD,STP,STTP}LL in Intel syntax mode Jan Beulich
@ 2020-03-04  9:46 ` Jan Beulich
  2020-03-04 11:56   ` H.J. Lu
  2020-03-04  9:47 ` [PATCH 9/9] x86: reduce amount of various VCVT* templates Jan Beulich
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04  9:46 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

The Q and L suffix exclusion checks in match_template() ought to be
(kept) in sync as far as their FPU and SIMD aspects go. This was
already violated by only the Q one checking for active broadcast.
Convert the code such that there'll be only one instance of the logic,
all the more so as the logic is liable to need further refinement /
extension subsequently. (The alternative would be to drop all SIMD-ness
from the L part, but it is in principle possible to enable all sorts of
SIMD support with just a pre-386 CPU, via suitable .arch directives.)
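
For illustration only (not part of the patch): a minimal C sketch of
the shape of the folded condition. All names here are hypothetical; in
particular the single fpu_or_simd flag stands in for the shared
IgnoreSize / broadcast / float-operand / MMX-or-SIMD-register tests,
which the real code spells out in full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs; these are not gas identifiers.  */
struct suffix_ctx
{
  char suffix;		/* 'q', 'l', or 0 for none.  */
  bool code_64bit;	/* Assembling for 64-bit mode.  */
  bool cpu_i386;	/* i386 or later is enabled.  */
  bool is_cmpxchg8b;	/* Template is base opcode 0x0fc7 /1.  */
  bool fpu_or_simd;	/* The shared FPU / SIMD exemption applies.  */
};

/* Return true when the template is to be rejected for the suffix.  */
static bool
reject_suffix (const struct suffix_ctx *c)
{
  bool q_bad, l_bad;

  assert (c != NULL);
  q_bad = c->suffix == 'q' && !c->code_64bit && !c->is_cmpxchg8b;
  l_bad = c->suffix == 'l' && !c->cpu_i386;
  /* The exemption is evaluated once and covers both suffixes.  */
  return (q_bad || l_bad) && !c->fpu_or_simd;
}
```

Because the exemption appears only once, a later refinement of it
cannot end up applied to just one of the two suffix cases.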

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (match_template): Fold duplicate code in
	logic rejecting certain suffixes in certain modes. Drop
	pointless "else".

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -5853,43 +5853,30 @@ match_template (char mnem_suffix)
       for (j = 0; j < MAX_OPERANDS; j++)
 	operand_types[j] = t->operand_types[j];
 
-      /* In general, don't allow 64-bit operands in 32-bit mode.  */
-      if (i.suffix == QWORD_MNEM_SUFFIX
-	  && flag_code != CODE_64BIT
+      /* In general, don't allow
+	 - 64-bit operands outside of 64-bit mode,
+	 - 32-bit operands on pre-386.  */
+      if (((i.suffix == QWORD_MNEM_SUFFIX
+	    && flag_code != CODE_64BIT
+	    && (t->base_opcode != 0x0fc7
+		|| t->extension_opcode != 1 /* cmpxchg8b */))
+	   || (i.suffix == LONG_MNEM_SUFFIX
+	       && !cpu_arch_flags.bitfield.cpui386))
 	  && (intel_syntax
 	      ? (t->opcode_modifier.mnemonicsize != IGNORESIZE
-	         && !t->opcode_modifier.broadcast
+		 && !t->opcode_modifier.broadcast
 		 && !intel_float_operand (t->name))
 	      : intel_float_operand (t->name) != 2)
 	  && ((operand_types[0].bitfield.class != RegMMX
 	       && operand_types[0].bitfield.class != RegSIMD)
 	      || (operand_types[t->operands > 1].bitfield.class != RegMMX
-		  && operand_types[t->operands > 1].bitfield.class != RegSIMD))
-	  && (t->base_opcode != 0x0fc7
-	      || t->extension_opcode != 1 /* cmpxchg8b */))
-	continue;
-
-      /* In general, don't allow 32-bit operands on pre-386.  */
-      else if (i.suffix == LONG_MNEM_SUFFIX
-	       && !cpu_arch_flags.bitfield.cpui386
-	       && (intel_syntax
-		   ? (t->opcode_modifier.mnemonicsize != IGNORESIZE
-		      && !intel_float_operand (t->name))
-		   : intel_float_operand (t->name) != 2)
-	       && ((operand_types[0].bitfield.class != RegMMX
-		    && operand_types[0].bitfield.class != RegSIMD)
-		   || (operand_types[t->operands > 1].bitfield.class != RegMMX
-		       && operand_types[t->operands > 1].bitfield.class
-			  != RegSIMD)))
+		  && operand_types[t->operands > 1].bitfield.class != RegSIMD)))
 	continue;
 
       /* Do not verify operands when there are none.  */
-      else
-	{
-	  if (!t->operands)
-	    /* We've found a match; break out of loop.  */
-	    break;
-	}
+      if (!t->operands)
+	/* We've found a match; break out of loop.  */
+	break;
 
       if (!t->opcode_modifier.jump
 	  || t->opcode_modifier.jump == JUMP_ABSOLUTE)

* [PATCH 9/9] x86: reduce amount of various VCVT* templates
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
                   ` (5 preceding siblings ...)
  2020-03-04  9:46 ` [PATCH 7/9] x86: fold (supposed to be) identical code Jan Beulich
@ 2020-03-04  9:47 ` Jan Beulich
  2020-03-04 12:00   ` H.J. Lu
  2020-03-04 10:15 ` [PATCH 8/9] x86: drop/replace IgnoreSize Jan Beulich
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04  9:47 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

Presumably as a result of various changes over the last several months,
and - for some of them - with a generalization of logic in
match_mem_size() plus mirroring of this generalization into the
broadcast handling logic of check_VecOperands(), various register-only
templates can be folded into their respective memory forms. This in
particular then also allows dropping a few more instances of IgnoreSize.
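
For illustration only (not part of the patch): a minimal C sketch
contrasting the old and new element-size conditions in
match_mem_size(). The struct elem_sizes type and both function names
are hypothetical; the real code tests bits of the template's operand
type directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the element-size bits of an operand type.  */
struct elem_sizes
{
  unsigned int byte : 1;
  unsigned int word : 1;
  unsigned int dword : 1;
  unsigned int qword : 1;
};

/* Old form: any element size counts, but never on broadcast templates.  */
static bool
old_check (const struct elem_sizes *s, unsigned int broadcast)
{
  assert (s != NULL);
  return !broadcast && (s->byte || s->word || s->dword || s->qword);
}

/* New form: a broadcast template's first size bit is taken to be the
   broadcast element size, so only sizes beyond that one count.  */
static bool
new_check (const struct elem_sizes *s, unsigned int broadcast)
{
  assert (s != NULL);
  return s->byte + s->word + s->dword + s->qword > !!broadcast;
}
```

The two forms differ only for broadcast templates carrying more than
one element size, e.g. a folded operand of the form
RegXMM|Dword|Qword|Unspecified|BaseIndex: there the old form yields
false, while the new one yields true.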

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (match_mem_size): Generalize broadcast special
	casing.
	(check_VecOperands): Zap xmmword/ymmword/zmmword when more than
	one of byte/word/dword/qword is set alongside a SIMD register in
	a template's operand.

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-opc.tbl (vcvtdq2pd, vcvtps2pd, vcvtudq2pd, vcvtps2ph,
	vcvtps2qq, vcvtps2uqq, vcvttps2qq, vcvttps2uqq): Fold separate
	register and memory source templates. Replace VexW= by VexW*
	where applicable.
	* i386-tbl.h: Re-generate.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -2124,11 +2124,11 @@ match_mem_size (const insn_template *t,
 		  here.  Also for v{,p}broadcast*, {,v}pmov{s,z}*, and
 		  down-conversion vpmov*.  */
 	       || ((t->operand_types[wanted].bitfield.class == RegSIMD
-		    && !t->opcode_modifier.broadcast
-		    && (t->operand_types[wanted].bitfield.byte
-			|| t->operand_types[wanted].bitfield.word
-			|| t->operand_types[wanted].bitfield.dword
-			|| t->operand_types[wanted].bitfield.qword))
+		    && t->operand_types[wanted].bitfield.byte
+		       + t->operand_types[wanted].bitfield.word
+		       + t->operand_types[wanted].bitfield.dword
+		       + t->operand_types[wanted].bitfield.qword
+		       > !!t->opcode_modifier.broadcast)
 		   ? (i.types[given].bitfield.xmmword
 		      || i.types[given].bitfield.ymmword
 		      || i.types[given].bitfield.zmmword)
@@ -5512,6 +5512,16 @@ check_VecOperands (const insn_template *
 	}
 
       overlap = operand_type_and (type, t->operand_types[op]);
+      if (t->operand_types[op].bitfield.class == RegSIMD
+	  && t->operand_types[op].bitfield.byte
+	     + t->operand_types[op].bitfield.word
+	     + t->operand_types[op].bitfield.dword
+	     + t->operand_types[op].bitfield.qword > 1)
+	{
+	  overlap.bitfield.xmmword = 0;
+	  overlap.bitfield.ymmword = 0;
+	  overlap.bitfield.zmmword = 0;
+	}
       if (operand_type_all_zero (&overlap))
 	  goto bad_broadcast;
 
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -2016,9 +2016,8 @@ vcmpunord_ssd, 3, 0xf2c2, 0x13, 1, CpuAV
 vcmpunord_sss, 3, 0xf3c2, 0x13, 1, CpuAVX, Modrm|C|Vex=3|VexOpcode=0|VexVVVV|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcomisd, 2, 0x662f, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vcomiss, 2, 0x2f, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vcvtdq2pd, 2, 0xf3e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM|RegYMM }
-vcvtdq2pd, 2, 0xf3e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
-vcvtdq2pd, 2, 0xf3e6, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
+vcvtdq2pd, 2, 0xf3e6, None, 1, CpuAVX, Modrm|Vex128|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
+vcvtdq2pd, 2, 0xf3e6, None, 1, CpuAVX, Modrm|Vex256|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
 vcvtdq2ps, 2, 0x5b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vcvtpd2dq, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM }
 vcvtpd2dq, 2, 0xf2e6, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { RegXMM|RegYMM, RegXMM }
@@ -2029,9 +2028,8 @@ vcvtpd2ps, 2, 0x665a, None, 1, CpuAVX, M
 vcvtpd2psx, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvtpd2psy, 2, 0x665a, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Unspecified|BaseIndex|RegYMM, RegXMM }
 vcvtps2dq, 2, 0x665b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vcvtps2pd, 2, 0x5a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM|RegYMM }
-vcvtps2pd, 2, 0x5a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
-vcvtps2pd, 2, 0x5a, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
+vcvtps2pd, 2, 0x5a, None, 1, CpuAVX, Modrm|Vex128|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
+vcvtps2pd, 2, 0x5a, None, 1, CpuAVX, Modrm|Vex256|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
 vcvtsd2si, 2, 0xf22d, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|ToDword, { Qword|Unspecified|BaseIndex|RegXMM, Reg32|Reg64 }
 vcvtsd2ss, 3, 0xf25a, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcvtsi2sd, 3, 0xf22a, None, 1, CpuAVX, Modrm|VexLIG|VexOpcode=0|VexVVVV|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
@@ -4105,12 +4103,10 @@ vscatterdps, 2, 0x66A2, None, 1, CpuAVX5
 vscatterqps, 2, 0x66A3, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=2|NoDefMask|VexOpcode=1|VexW0|Disp8MemShift=2|VecSIB=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Dword|Unspecified|BaseIndex }
 vscatterqps, 2, 0x66A3, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=2|NoDefMask|VexOpcode=1|VexW0|Disp8MemShift=2|VecSIB=2|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Dword|Unspecified|BaseIndex }
 
-vcvtdq2pd, 2, 0xF3E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM|RegYMM }
-vcvtdq2pd, 2, 0xF3E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Qword|Unspecified|BaseIndex, RegXMM }
-vcvtdq2pd, 2, 0xF3E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|XMMword|Unspecified|BaseIndex, RegYMM }
-vcvtudq2pd, 2, 0xF37A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM|RegYMM }
-vcvtudq2pd, 2, 0xF37A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Qword|Unspecified|BaseIndex, RegXMM }
-vcvtudq2pd, 2, 0xF37A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|XMMword|Unspecified|BaseIndex, RegYMM }
+vcvtdq2pd, 2, 0xF3E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex128|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+vcvtdq2pd, 2, 0xF3E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex256|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
+vcvtudq2pd, 2, 0xF37A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex128|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+vcvtudq2pd, 2, 0xF37A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex256|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 
 vcvtpd2dq, 2, 0xF2E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|Masking=3|VexOpcode=0|VexW=2|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { RegXMM|RegYMM|Qword|Unspecified|BaseIndex, RegXMM }
 vcvtpd2dq, 2, 0xF2E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|Masking=3|VexOpcode=0|VexW=2|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { RegXMM|RegYMM|Qword|BaseIndex, RegXMM }
@@ -4130,13 +4126,11 @@ vcvtpd2udqy, 2, 0x79, None, 1, CpuAVX512
 vcvtph2ps, 2, 0x6613, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=1|VexW0|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 vcvtph2ps, 2, 0x6613, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=1|VexW=1|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
 
-vcvtps2pd, 2, 0x5A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM|RegYMM }
-vcvtps2pd, 2, 0x5A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Qword|Unspecified|BaseIndex, RegXMM }
-vcvtps2pd, 2, 0x5A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|XMMword|Unspecified|BaseIndex, RegYMM }
-
-vcvtps2ph, 3, 0x661D, None, 1, CpuAVX512F|CpuAVX512VL, RegMem|Masking=3|VexOpcode=2|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|RegYMM, RegXMM }
-vcvtps2ph, 3, 0x661D, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=2|VexOpcode=2|VexW=1|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Qword|Unspecified|BaseIndex }
-vcvtps2ph, 3, 0x661D, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=2|VexOpcode=2|VexW=1|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM, XMMword|Unspecified|BaseIndex }
+vcvtps2pd, 2, 0x5A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex128|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+vcvtps2pd, 2, 0x5A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex256|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
+
+vcvtps2ph, 3, 0x661D, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex128|MaskingMorZ|VexOpcode=2|VexW0|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, RegXMM|Qword|Unspecified|BaseIndex }
+vcvtps2ph, 3, 0x661D, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex256|MaskingMorZ|VexOpcode=2|VexW0|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM, RegXMM|Unspecified|BaseIndex }
 
 vcvttpd2dq, 2, 0x66E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|Masking=3|VexOpcode=0|VexW=2|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { RegXMM|RegYMM|Unspecified|Qword|BaseIndex, RegXMM }
 vcvttpd2dq, 2, 0x66E6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|Masking=3|VexOpcode=0|VexW=2|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { RegXMM|RegYMM|Qword|BaseIndex, RegXMM }
@@ -4462,15 +4456,13 @@ vcvtpd2uqq, 2, 0x6679, None, 1, CpuAVX51
 vcvtpd2uqq, 3, 0x6679, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=2|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|StaticRounding|SAE, { Imm8, RegZMM, RegZMM }
 
 vcvtps2qq, 2, 0x667B, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=5|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM }
-vcvtps2qq, 2, 0x667B, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
-vcvtps2qq, 2, 0x667B, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Qword|Unspecified|BaseIndex, RegXMM }
-vcvtps2qq, 2, 0x667B, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 vcvtps2qq, 3, 0x667B, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|StaticRounding|SAE, { Imm8, RegYMM, RegZMM }
+vcvtps2qq, 2, 0x667B, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex128|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+vcvtps2qq, 2, 0x667B, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex256|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 vcvtps2uqq, 2, 0x6679, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=5|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM }
-vcvtps2uqq, 2, 0x6679, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
-vcvtps2uqq, 2, 0x6679, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Qword|Unspecified|BaseIndex, RegXMM }
-vcvtps2uqq, 2, 0x6679, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 vcvtps2uqq, 3, 0x6679, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|StaticRounding|SAE, { Imm8, RegYMM, RegZMM }
+vcvtps2uqq, 2, 0x6679, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex128|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+vcvtps2uqq, 2, 0x6679, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex256|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 
 vcvtqq2pd, 2, 0xF3E6, None, 1, CpuAVX512DQ, Modrm|Masking=3|VexOpcode=0|VexW=2|Broadcast|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vcvtqq2pd, 3, 0xF3E6, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=2|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|StaticRounding|SAE, { Imm8, RegZMM, RegZMM }
@@ -4490,15 +4482,13 @@ vcvttpd2uqq, 2, 0x6678, None, 1, CpuAVX5
 vcvttpd2uqq, 3, 0x6678, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=2|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SAE, { Imm8, RegZMM, RegZMM }
 
 vcvttps2qq, 2, 0x667A, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=5|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM }
-vcvttps2qq, 2, 0x667A, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
-vcvttps2qq, 2, 0x667A, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Qword|Unspecified|BaseIndex, RegXMM }
-vcvttps2qq, 2, 0x667A, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 vcvttps2qq, 3, 0x667A, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SAE, { Imm8, RegYMM, RegZMM }
+vcvttps2qq, 2, 0x667A, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex128|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+vcvttps2qq, 2, 0x667A, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex256|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 vcvttps2uqq, 2, 0x6678, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=5|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM }
-vcvttps2uqq, 2, 0x6678, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM }
-vcvttps2uqq, 2, 0x6678, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Qword|Unspecified|BaseIndex, RegXMM }
-vcvttps2uqq, 2, 0x6678, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexW=1|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 vcvttps2uqq, 3, 0x6678, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SAE, { Imm8, RegYMM, RegZMM }
+vcvttps2uqq, 2, 0x6678, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex128|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+vcvttps2uqq, 2, 0x6678, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex256|Masking=3|VexOpcode=0|VexW0|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 
 vcvtuqq2ps, 2, 0xF27A, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=2|Broadcast|Disp8MemShift=6|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|Unspecified|BaseIndex, RegYMM }
 vcvtuqq2ps, 3, 0xF27A, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexW=2|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|StaticRounding|SAE, { Imm8, RegZMM, RegYMM }


* [PATCH 8/9] x86: drop/replace IgnoreSize
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
                   ` (6 preceding siblings ...)
  2020-03-04  9:47 ` [PATCH 9/9] x86: reduce amount of various VCVT* templates Jan Beulich
@ 2020-03-04 10:15 ` Jan Beulich
  2020-03-04 11:59   ` H.J. Lu
  2020-03-04 10:19 ` [PATCH 5/9] x86: replace NoRex64 on VEX-encoded insns Jan Beulich
  2020-03-05  8:07 ` [PATCH v1.1 1/9] x86: refine TPAUSE and UMWAIT Jan Beulich
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04 10:15 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

[-- Attachment #1: Type: text/plain, Size: 2167 bytes --]

Even after commit dc2be329b950 ("i386: Only check suffix in instruction
mnemonic"), by which many of its uses have become unnecessary (some were
unnecessary even before), IgnoreSize is still used for various slightly
different purposes:
- to suppress emission of an operand size prefix,
- in Intel syntax mode to zap "derived" suffixes in certain cases and to
  skip certain checks of remaining "derived" suffixes,
- to suppress ambiguous operand size / missing suffix diagnostics,
- for prefixes to suppress the "stand-alone ... prefix" warning.
Drop entirely unnecessary ones and where possible also replace instances
by the more focused (because of having just a single purpose) NoRex64.

To further restrict when IgnoreSize is needed, also generalize the logic
deciding when to skip a template because of a present or derived L or Q
suffix, by having it skip immediate operands. Additionally consider mask
registers and VecSIB there.

Note that for the time being the attribute needs to be kept in place on
MMX/SSE/etc insns (but not on VEX/EVEX encoded ones unless an operand
template of them allows for only non-SIMD-register actuals) allowing for
Dword operands - the logic when to emit a data size prefix would need
further adjustment first.

Note also that the memory forms of {,v}pinsrw get their permission for
an L or Q suffix dropped. I can only assume that it being this way was a
cut-and-paste mistake from the register forms, as the latter
specifically have NoRex64 set, and the {,v}pextrw counterparts don't
allow these suffixes either.

Convert VexW= again to their respective VexW* on lines touched anyway.

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (match_template): Extend code in logic
	rejecting certain suffixes in certain modes to also cover mask
	register use and VecSIB. Drop special casing of broadcast. Skip
	immediates in the check.

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-opc.tbl: Drop IgnoreSize from various SIMD insns. Replace
	VexW= by VexW* and VexVVVV=1 by just VexVVVV where applicable.
	* i386-tbl.h: Re-generate.

[resend with actual patch data as compressed attachment, for size reasons]

[-- Attachment #2: binutils-master-x86-stray-IgnoreSize.patch.bz2 --]
[-- Type: application/octet-stream, Size: 17000 bytes --]


* [PATCH 5/9] x86: replace NoRex64 on VEX-encoded insns
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
                   ` (7 preceding siblings ...)
  2020-03-04 10:15 ` [PATCH 8/9] x86: drop/replace IgnoreSize Jan Beulich
@ 2020-03-04 10:19 ` Jan Beulich
  2020-03-04 11:51   ` H.J. Lu
  2020-03-05  8:07 ` [PATCH v1.1 1/9] x86: refine TPAUSE and UMWAIT Jan Beulich
  9 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04 10:19 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

When the template specifies any of the possible VexW settings, we can
use this instead of a separate NoRex64 to suppress the setting of REX_W.
Note that this ends up addressing an inconsistency between VEX- and
EVEX-encoded VEXTRACTPS, VPEXTR{B,W}, and VPINSR{B,W} - while the former
avoided setting VEX.W, the latter pointlessly set EVEX.W whenever there
was a 64-bit GPR operand. Adjust the testcases to cover both 32- and
64-bit GPR operands.
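
For illustration only (not part of the patch): the change in the expected
dump bytes from "62 63 fd 08 ..." to "62 63 7d 08 ..." is the EVEX.W bit
being cleared. The sketch below assumes the usual EVEX layout, in which
the third byte of the encoding holds W in its top bit, followed by vvvv,
a fixed 1 and pp. The helper names evex_w and evex_clear_w are made up
for this example and do not exist in binutils.

```c
#include <stdint.h>

/* Hypothetical helper, not part of binutils: return the W bit of an EVEX
   prefix, given the third byte of the encoding (the one after 0x62 and
   the R/X/B byte).  Assumed layout, MSB first: W vvvv 1 pp.  */
static int
evex_w (uint8_t p1)
{
  return (p1 >> 7) & 1;
}

/* Hypothetical helper: clear only the W bit, leaving vvvv and pp alone.  */
static uint8_t
evex_clear_w (uint8_t p1)
{
  return (uint8_t) (p1 & 0x7f);
}
```

With these, evex_w (0xfd) is 1 and evex_w (0x7d) is 0, and
evex_clear_w (0xfd) yields 0x7d, matching the adjusted expectations.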

Convert VexW= to their respective VexW* on lines touched anyway.

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (process_suffix): Exclude !vexw insns
	alongside !norex64 ones.
	* testsuite/gas/i386/x86-64-avx512bw.s: Test VPEXTR* and VPINSR*
	with both 32- and 64-bit GPR operands.
	* testsuite/gas/i386/x86-64-avx512f.s: Test VEXTRACTPS with both
	32- and 64-bit GPR operands.
	* testsuite/gas/i386/x86-64-avx512bw-intel.d,
	testsuite/gas/i386/x86-64-avx512bw.d,
	testsuite/gas/i386/x86-64-avx512f-intel.d,
	testsuite/gas/i386/x86-64-avx512f.d: Adjust expectations.

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-opc.tbl (movq): Drop NoRex64 from XMM/XMM SSE2AVX variants.
	(movmskps, pextrw, pinsrw, pmovmskb, movmskpd, extractps,
	pextrb, pinsrb, roundsd): Drop NoRex64 and where applicable use
	VexW0 on SSE2AVX variants.
	(vmovq): Drop NoRex64 from XMM/XMM variants.
	(vextractps, vmovmskpd, vmovmskps, vpextrb, vpextrw, vpinsrb,
	vpinsrw, vpmovmskb, vroundsd): Drop NoRex64 and where
	applicable use VexW0.
	* i386-tbl.h: Re-generate.
---
In principle this paves the way for folding NoRex64 with some VEX-only
attribute bit, as there's no VEX-or-alike insn left with NoRex64 set
(and following the underlying model there also isn't going to be).
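
To make the equivalent role of the two attributes concrete, here is a
stand-alone sketch of the condition as it reads after this patch. It is
a simplification under stated assumptions: struct tmpl_bits and
wants_rex_w are hypothetical names for this example, only the two
attribute bits of interest are modelled, and the xchg %rax,%rax special
case as well as the rest of process_suffix are left out.

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-in for the template attribute bits
   consulted by process_suffix(); the real structure has many more
   fields.  */
struct tmpl_bits
{
  bool norex64;	/* Template carries NoRex64.  */
  bool vexw;	/* Template specifies any VexW setting.  */
};

/* Sketch of the patched condition: a Q suffix in 64-bit mode requests
   REX.W only if neither attribute is set on the template.  */
static bool
wants_rex_w (bool q_suffix, bool code_64bit, const struct tmpl_bits *t)
{
  return q_suffix && code_64bit && !t->norex64 && !t->vexw;
}
```

In this model a template with either bit set never gets REX.W requested,
which is why NoRex64 becomes redundant wherever a VexW setting is given.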

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -6646,6 +6646,7 @@ process_suffix (void)
       if (i.suffix == QWORD_MNEM_SUFFIX
 	  && flag_code == CODE_64BIT
 	  && !i.tm.opcode_modifier.norex64
+	  && !i.tm.opcode_modifier.vexw
 	  /* Special case for xchg %rax,%rax.  It is NOP and doesn't
 	     need rex64. */
 	  && ! (i.operands == 2
--- a/gas/testsuite/gas/i386/x86-64-avx512bw-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-avx512bw-intel.d
@@ -229,9 +229,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 b2 00 20 00 00[ 	]*vpblendmw zmm30,zmm29,ZMMWORD PTR \[rdx\+0x2000\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 72 80[ 	]*vpblendmw zmm30,zmm29,ZMMWORD PTR \[rdx-0x2000\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 b2 c0 df ff ff[ 	]*vpblendmw zmm30,zmm29,ZMMWORD PTR \[rdx-0x2040\]
-[ 	]*[a-f0-9]+:[ 	]*62 63 fd 08 14 e8 ab[ 	]*vpextrb rax,xmm29,0xab
-[ 	]*[a-f0-9]+:[ 	]*62 63 fd 08 14 e8 7b[ 	]*vpextrb rax,xmm29,0x7b
-[ 	]*[a-f0-9]+:[ 	]*62 43 fd 08 14 e8 7b[ 	]*vpextrb r8,xmm29,0x7b
+[ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 e8 ab[ 	]*vpextrb eax,xmm29,0xab
+[ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 e8 7b[ 	]*vpextrb eax,xmm29,0x7b
+[ 	]*[a-f0-9]+:[ 	]*62 43 7d 08 14 e8 7b[ 	]*vpextrb r8d,xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 29 7b[ 	]*vpextrb BYTE PTR \[rcx\],xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 23 7d 08 14 ac f0 23 01 00 00 7b[ 	]*vpextrb BYTE PTR \[rax\+r14\*8\+0x123\],xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 6a 7f 7b[ 	]*vpextrb BYTE PTR \[rdx\+0x7f\],xmm29,0x7b
@@ -244,9 +244,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 aa 00 01 00 00 7b[ 	]*vpextrw WORD PTR \[rdx\+0x100\],xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 6a 80 7b[ 	]*vpextrw WORD PTR \[rdx-0x100\],xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 aa fe fe ff ff 7b[ 	]*vpextrw WORD PTR \[rdx-0x102\],xmm29,0x7b
-[ 	]*[a-f0-9]+:[ 	]*62 91 fd 08 c5 c6 ab[ 	]*vpextrw rax,xmm30,0xab
-[ 	]*[a-f0-9]+:[ 	]*62 91 fd 08 c5 c6 7b[ 	]*vpextrw rax,xmm30,0x7b
-[ 	]*[a-f0-9]+:[ 	]*62 11 fd 08 c5 c6 7b[ 	]*vpextrw r8,xmm30,0x7b
+[ 	]*[a-f0-9]+:[ 	]*62 91 7d 08 c5 c6 ab[ 	]*vpextrw eax,xmm30,0xab
+[ 	]*[a-f0-9]+:[ 	]*62 91 7d 08 c5 c6 7b[ 	]*vpextrw eax,xmm30,0x7b
+[ 	]*[a-f0-9]+:[ 	]*62 11 7d 08 c5 c6 7b[ 	]*vpextrw r8d,xmm30,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f0 ab[ 	]*vpinsrb xmm30,xmm29,eax,0xab
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f0 7b[ 	]*vpinsrb xmm30,xmm29,eax,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f5 7b[ 	]*vpinsrb xmm30,xmm29,ebp,0x7b
@@ -1076,9 +1076,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 b2 00 20 00 00[ 	]*vpblendmw zmm30,zmm29,ZMMWORD PTR \[rdx\+0x2000\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 72 80[ 	]*vpblendmw zmm30,zmm29,ZMMWORD PTR \[rdx-0x2000\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 b2 c0 df ff ff[ 	]*vpblendmw zmm30,zmm29,ZMMWORD PTR \[rdx-0x2040\]
-[ 	]*[a-f0-9]+:[ 	]*62 63 fd 08 14 e8 ab[ 	]*vpextrb rax,xmm29,0xab
-[ 	]*[a-f0-9]+:[ 	]*62 63 fd 08 14 e8 7b[ 	]*vpextrb rax,xmm29,0x7b
-[ 	]*[a-f0-9]+:[ 	]*62 43 fd 08 14 e8 7b[ 	]*vpextrb r8,xmm29,0x7b
+[ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 e8 ab[ 	]*vpextrb eax,xmm29,0xab
+[ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 e8 7b[ 	]*vpextrb eax,xmm29,0x7b
+[ 	]*[a-f0-9]+:[ 	]*62 43 7d 08 14 e8 7b[ 	]*vpextrb r8d,xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 29 7b[ 	]*vpextrb BYTE PTR \[rcx\],xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 23 7d 08 14 ac f0 34 12 00 00 7b[ 	]*vpextrb BYTE PTR \[rax\+r14\*8\+0x1234\],xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 6a 7f 7b[ 	]*vpextrb BYTE PTR \[rdx\+0x7f\],xmm29,0x7b
@@ -1091,9 +1091,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 aa 00 01 00 00 7b[ 	]*vpextrw WORD PTR \[rdx\+0x100\],xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 6a 80 7b[ 	]*vpextrw WORD PTR \[rdx-0x100\],xmm29,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 aa fe fe ff ff 7b[ 	]*vpextrw WORD PTR \[rdx-0x102\],xmm29,0x7b
-[ 	]*[a-f0-9]+:[ 	]*62 91 fd 08 c5 c6 ab[ 	]*vpextrw rax,xmm30,0xab
-[ 	]*[a-f0-9]+:[ 	]*62 91 fd 08 c5 c6 7b[ 	]*vpextrw rax,xmm30,0x7b
-[ 	]*[a-f0-9]+:[ 	]*62 11 fd 08 c5 c6 7b[ 	]*vpextrw r8,xmm30,0x7b
+[ 	]*[a-f0-9]+:[ 	]*62 91 7d 08 c5 c6 ab[ 	]*vpextrw eax,xmm30,0xab
+[ 	]*[a-f0-9]+:[ 	]*62 91 7d 08 c5 c6 7b[ 	]*vpextrw eax,xmm30,0x7b
+[ 	]*[a-f0-9]+:[ 	]*62 11 7d 08 c5 c6 7b[ 	]*vpextrw r8d,xmm30,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f0 ab[ 	]*vpinsrb xmm30,xmm29,eax,0xab
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f0 7b[ 	]*vpinsrb xmm30,xmm29,eax,0x7b
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f5 7b[ 	]*vpinsrb xmm30,xmm29,ebp,0x7b
--- a/gas/testsuite/gas/i386/x86-64-avx512bw.d
+++ b/gas/testsuite/gas/i386/x86-64-avx512bw.d
@@ -229,9 +229,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 b2 00 20 00 00[ 	]*vpblendmw 0x2000\(%rdx\),%zmm29,%zmm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 72 80[ 	]*vpblendmw -0x2000\(%rdx\),%zmm29,%zmm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 b2 c0 df ff ff[ 	]*vpblendmw -0x2040\(%rdx\),%zmm29,%zmm30
-[ 	]*[a-f0-9]+:[ 	]*62 63 fd 08 14 e8 ab[ 	]*vpextrb \$0xab,%xmm29,%rax
-[ 	]*[a-f0-9]+:[ 	]*62 63 fd 08 14 e8 7b[ 	]*vpextrb \$0x7b,%xmm29,%rax
-[ 	]*[a-f0-9]+:[ 	]*62 43 fd 08 14 e8 7b[ 	]*vpextrb \$0x7b,%xmm29,%r8
+[ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 e8 ab[ 	]*vpextrb \$0xab,%xmm29,%eax
+[ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 e8 7b[ 	]*vpextrb \$0x7b,%xmm29,%eax
+[ 	]*[a-f0-9]+:[ 	]*62 43 7d 08 14 e8 7b[ 	]*vpextrb \$0x7b,%xmm29,%r8d
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 29 7b[ 	]*vpextrb \$0x7b,%xmm29,\(%rcx\)
 [ 	]*[a-f0-9]+:[ 	]*62 23 7d 08 14 ac f0 23 01 00 00 7b[ 	]*vpextrb \$0x7b,%xmm29,0x123\(%rax,%r14,8\)
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 6a 7f 7b[ 	]*vpextrb \$0x7b,%xmm29,0x7f\(%rdx\)
@@ -244,9 +244,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 aa 00 01 00 00 7b[ 	]*vpextrw \$0x7b,%xmm29,0x100\(%rdx\)
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 6a 80 7b[ 	]*vpextrw \$0x7b,%xmm29,-0x100\(%rdx\)
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 aa fe fe ff ff 7b[ 	]*vpextrw \$0x7b,%xmm29,-0x102\(%rdx\)
-[ 	]*[a-f0-9]+:[ 	]*62 91 fd 08 c5 c6 ab[ 	]*vpextrw \$0xab,%xmm30,%rax
-[ 	]*[a-f0-9]+:[ 	]*62 91 fd 08 c5 c6 7b[ 	]*vpextrw \$0x7b,%xmm30,%rax
-[ 	]*[a-f0-9]+:[ 	]*62 11 fd 08 c5 c6 7b[ 	]*vpextrw \$0x7b,%xmm30,%r8
+[ 	]*[a-f0-9]+:[ 	]*62 91 7d 08 c5 c6 ab[ 	]*vpextrw \$0xab,%xmm30,%eax
+[ 	]*[a-f0-9]+:[ 	]*62 91 7d 08 c5 c6 7b[ 	]*vpextrw \$0x7b,%xmm30,%eax
+[ 	]*[a-f0-9]+:[ 	]*62 11 7d 08 c5 c6 7b[ 	]*vpextrw \$0x7b,%xmm30,%r8d
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f0 ab[ 	]*vpinsrb \$0xab,%eax,%xmm29,%xmm30
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f0 7b[ 	]*vpinsrb \$0x7b,%eax,%xmm29,%xmm30
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f5 7b[ 	]*vpinsrb \$0x7b,%ebp,%xmm29,%xmm30
@@ -1076,9 +1076,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 b2 00 20 00 00[ 	]*vpblendmw 0x2000\(%rdx\),%zmm29,%zmm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 72 80[ 	]*vpblendmw -0x2000\(%rdx\),%zmm29,%zmm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 95 40 66 b2 c0 df ff ff[ 	]*vpblendmw -0x2040\(%rdx\),%zmm29,%zmm30
-[ 	]*[a-f0-9]+:[ 	]*62 63 fd 08 14 e8 ab[ 	]*vpextrb \$0xab,%xmm29,%rax
-[ 	]*[a-f0-9]+:[ 	]*62 63 fd 08 14 e8 7b[ 	]*vpextrb \$0x7b,%xmm29,%rax
-[ 	]*[a-f0-9]+:[ 	]*62 43 fd 08 14 e8 7b[ 	]*vpextrb \$0x7b,%xmm29,%r8
+[ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 e8 ab[ 	]*vpextrb \$0xab,%xmm29,%eax
+[ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 e8 7b[ 	]*vpextrb \$0x7b,%xmm29,%eax
+[ 	]*[a-f0-9]+:[ 	]*62 43 7d 08 14 e8 7b[ 	]*vpextrb \$0x7b,%xmm29,%r8d
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 29 7b[ 	]*vpextrb \$0x7b,%xmm29,\(%rcx\)
 [ 	]*[a-f0-9]+:[ 	]*62 23 7d 08 14 ac f0 34 12 00 00 7b[ 	]*vpextrb \$0x7b,%xmm29,0x1234\(%rax,%r14,8\)
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 14 6a 7f 7b[ 	]*vpextrb \$0x7b,%xmm29,0x7f\(%rdx\)
@@ -1091,9 +1091,9 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 aa 00 01 00 00 7b[ 	]*vpextrw \$0x7b,%xmm29,0x100\(%rdx\)
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 6a 80 7b[ 	]*vpextrw \$0x7b,%xmm29,-0x100\(%rdx\)
 [ 	]*[a-f0-9]+:[ 	]*62 63 7d 08 15 aa fe fe ff ff 7b[ 	]*vpextrw \$0x7b,%xmm29,-0x102\(%rdx\)
-[ 	]*[a-f0-9]+:[ 	]*62 91 fd 08 c5 c6 ab[ 	]*vpextrw \$0xab,%xmm30,%rax
-[ 	]*[a-f0-9]+:[ 	]*62 91 fd 08 c5 c6 7b[ 	]*vpextrw \$0x7b,%xmm30,%rax
-[ 	]*[a-f0-9]+:[ 	]*62 11 fd 08 c5 c6 7b[ 	]*vpextrw \$0x7b,%xmm30,%r8
+[ 	]*[a-f0-9]+:[ 	]*62 91 7d 08 c5 c6 ab[ 	]*vpextrw \$0xab,%xmm30,%eax
+[ 	]*[a-f0-9]+:[ 	]*62 91 7d 08 c5 c6 7b[ 	]*vpextrw \$0x7b,%xmm30,%eax
+[ 	]*[a-f0-9]+:[ 	]*62 11 7d 08 c5 c6 7b[ 	]*vpextrw \$0x7b,%xmm30,%r8d
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f0 ab[ 	]*vpinsrb \$0xab,%eax,%xmm29,%xmm30
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f0 7b[ 	]*vpinsrb \$0x7b,%eax,%xmm29,%xmm30
 [ 	]*[a-f0-9]+:[ 	]*62 63 15 00 20 f5 7b[ 	]*vpinsrb \$0x7b,%ebp,%xmm29,%xmm30
--- a/gas/testsuite/gas/i386/x86-64-avx512bw.s
+++ b/gas/testsuite/gas/i386/x86-64-avx512bw.s
@@ -223,7 +223,7 @@ _start:
 	vpblendmw	8192(%rdx), %zmm29, %zmm30	 # AVX512BW
 	vpblendmw	-8192(%rdx), %zmm29, %zmm30	 # AVX512BW Disp8
 	vpblendmw	-8256(%rdx), %zmm29, %zmm30	 # AVX512BW
-	vpextrb	$0xab, %xmm29, %rax	 # AVX512BW
+	vpextrb	$0xab, %xmm29, %eax	 # AVX512BW
 	vpextrb	$123, %xmm29, %rax	 # AVX512BW
 	vpextrb	$123, %xmm29, %r8	 # AVX512BW
 	vpextrb	$123, %xmm29, (%rcx)	 # AVX512BW
@@ -238,13 +238,13 @@ _start:
 	vpextrw	$123, %xmm29, 256(%rdx)	 # AVX512BW
 	vpextrw	$123, %xmm29, -256(%rdx)	 # AVX512BW Disp8
 	vpextrw	$123, %xmm29, -258(%rdx)	 # AVX512BW
-	vpextrw	$0xab, %xmm30, %rax	 # AVX512BW
+	vpextrw	$0xab, %xmm30, %eax	 # AVX512BW
 	vpextrw	$123, %xmm30, %rax	 # AVX512BW
 	vpextrw	$123, %xmm30, %r8	 # AVX512BW
 	vpinsrb	$0xab, %eax, %xmm29, %xmm30	 # AVX512BW
-	vpinsrb	$123, %eax, %xmm29, %xmm30	 # AVX512BW
+	vpinsrb	$123, %rax, %xmm29, %xmm30	 # AVX512BW
 	vpinsrb	$123, %ebp, %xmm29, %xmm30	 # AVX512BW
-	vpinsrb	$123, %r13d, %xmm29, %xmm30	 # AVX512BW
+	vpinsrb	$123, %r13, %xmm29, %xmm30	 # AVX512BW
 	vpinsrb	$123, (%rcx), %xmm29, %xmm30	 # AVX512BW
 	vpinsrb	$123, 0x123(%rax,%r14,8), %xmm29, %xmm30	 # AVX512BW
 	vpinsrb	$123, 127(%rdx), %xmm29, %xmm30	 # AVX512BW Disp8
@@ -252,9 +252,9 @@ _start:
 	vpinsrb	$123, -128(%rdx), %xmm29, %xmm30	 # AVX512BW Disp8
 	vpinsrb	$123, -129(%rdx), %xmm29, %xmm30	 # AVX512BW
 	vpinsrw	$0xab, %eax, %xmm29, %xmm30	 # AVX512BW
-	vpinsrw	$123, %eax, %xmm29, %xmm30	 # AVX512BW
+	vpinsrw	$123, %rax, %xmm29, %xmm30	 # AVX512BW
 	vpinsrw	$123, %ebp, %xmm29, %xmm30	 # AVX512BW
-	vpinsrw	$123, %r13d, %xmm29, %xmm30	 # AVX512BW
+	vpinsrw	$123, %r13, %xmm29, %xmm30	 # AVX512BW
 	vpinsrw	$123, (%rcx), %xmm29, %xmm30	 # AVX512BW
 	vpinsrw	$123, 0x123(%rax,%r14,8), %xmm29, %xmm30	 # AVX512BW
 	vpinsrw	$123, 254(%rdx), %xmm29, %xmm30	 # AVX512BW Disp8
@@ -1072,7 +1072,7 @@ _start:
 	vpblendmw	zmm30, zmm29, ZMMWORD PTR [rdx+8192]	 # AVX512BW
 	vpblendmw	zmm30, zmm29, ZMMWORD PTR [rdx-8192]	 # AVX512BW Disp8
 	vpblendmw	zmm30, zmm29, ZMMWORD PTR [rdx-8256]	 # AVX512BW
-	vpextrb	rax, xmm29, 0xab	 # AVX512BW
+	vpextrb	eax, xmm29, 0xab	 # AVX512BW
 	vpextrb	rax, xmm29, 123	 # AVX512BW
 	vpextrb	r8, xmm29, 123	 # AVX512BW
 	vpextrb	BYTE PTR [rcx], xmm29, 123	 # AVX512BW
@@ -1087,13 +1087,13 @@ _start:
 	vpextrw	WORD PTR [rdx+256], xmm29, 123	 # AVX512BW
 	vpextrw	WORD PTR [rdx-256], xmm29, 123	 # AVX512BW Disp8
 	vpextrw	WORD PTR [rdx-258], xmm29, 123	 # AVX512BW
-	vpextrw	rax, xmm30, 0xab	 # AVX512BW
+	vpextrw	eax, xmm30, 0xab	 # AVX512BW
 	vpextrw	rax, xmm30, 123	 # AVX512BW
 	vpextrw	r8, xmm30, 123	 # AVX512BW
 	vpinsrb	xmm30, xmm29, eax, 0xab	 # AVX512BW
-	vpinsrb	xmm30, xmm29, eax, 123	 # AVX512BW
+	vpinsrb	xmm30, xmm29, rax, 123	 # AVX512BW
 	vpinsrb	xmm30, xmm29, ebp, 123	 # AVX512BW
-	vpinsrb	xmm30, xmm29, r13d, 123	 # AVX512BW
+	vpinsrb	xmm30, xmm29, r13, 123	 # AVX512BW
 	vpinsrb	xmm30, xmm29, BYTE PTR [rcx], 123	 # AVX512BW
 	vpinsrb	xmm30, xmm29, BYTE PTR [rax+r14*8+0x1234], 123	 # AVX512BW
 	vpinsrb	xmm30, xmm29, BYTE PTR [rdx+127], 123	 # AVX512BW Disp8
@@ -1101,9 +1101,9 @@ _start:
 	vpinsrb	xmm30, xmm29, BYTE PTR [rdx-128], 123	 # AVX512BW Disp8
 	vpinsrb	xmm30, xmm29, BYTE PTR [rdx-129], 123	 # AVX512BW
 	vpinsrw	xmm30, xmm29, eax, 0xab	 # AVX512BW
-	vpinsrw	xmm30, xmm29, eax, 123	 # AVX512BW
+	vpinsrw	xmm30, xmm29, rax, 123	 # AVX512BW
 	vpinsrw	xmm30, xmm29, ebp, 123	 # AVX512BW
-	vpinsrw	xmm30, xmm29, r13d, 123	 # AVX512BW
+	vpinsrw	xmm30, xmm29, r13, 123	 # AVX512BW
 	vpinsrw	xmm30, xmm29, WORD PTR [rcx], 123	 # AVX512BW
 	vpinsrw	xmm30, xmm29, WORD PTR [rax+r14*8+0x1234], 123	 # AVX512BW
 	vpinsrw	xmm30, xmm29, WORD PTR [rdx+254], 123	 # AVX512BW Disp8
--- a/gas/testsuite/gas/i386/x86-64-avx512f-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-avx512f-intel.d
@@ -2709,9 +2709,9 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	62 03 fd 4f 3b ee ab 	vextracti64x4 ymm30\{k7\},zmm29,0xab
 [ 	]*[a-f0-9]+:	62 03 fd cf 3b ee ab 	vextracti64x4 ymm30\{k7\}\{z\},zmm29,0xab
 [ 	]*[a-f0-9]+:	62 03 fd 4f 3b ee 7b 	vextracti64x4 ymm30\{k7\},zmm29,0x7b
-[ 	]*[a-f0-9]+:	62 63 fd 08 17 e8 ab 	vextractps rax,xmm29,0xab
-[ 	]*[a-f0-9]+:	62 63 fd 08 17 e8 7b 	vextractps rax,xmm29,0x7b
-[ 	]*[a-f0-9]+:	62 43 fd 08 17 e8 7b 	vextractps r8,xmm29,0x7b
+[ 	]*[a-f0-9]+:	62 63 7d 08 17 e8 ab 	vextractps eax,xmm29,0xab
+[ 	]*[a-f0-9]+:	62 63 7d 08 17 e8 7b 	vextractps eax,xmm29,0x7b
+[ 	]*[a-f0-9]+:	62 43 7d 08 17 e8 7b 	vextractps r8d,xmm29,0x7b
 [ 	]*[a-f0-9]+:	62 63 7d 08 17 29 7b 	vextractps DWORD PTR \[rcx\],xmm29,0x7b
 [ 	]*[a-f0-9]+:	62 23 7d 08 17 ac f0 23 01 00 00 7b 	vextractps DWORD PTR \[rax\+r14\*8\+0x123\],xmm29,0x7b
 [ 	]*[a-f0-9]+:	62 63 7d 08 17 6a 7f 7b 	vextractps DWORD PTR \[rdx\+0x1fc\],xmm29,0x7b
@@ -9730,9 +9730,9 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	62 03 fd 4f 3b ee ab 	vextracti64x4 ymm30\{k7\},zmm29,0xab
 [ 	]*[a-f0-9]+:	62 03 fd cf 3b ee ab 	vextracti64x4 ymm30\{k7\}\{z\},zmm29,0xab
 [ 	]*[a-f0-9]+:	62 03 fd 4f 3b ee 7b 	vextracti64x4 ymm30\{k7\},zmm29,0x7b
-[ 	]*[a-f0-9]+:	62 63 fd 08 17 e8 ab 	vextractps rax,xmm29,0xab
-[ 	]*[a-f0-9]+:	62 63 fd 08 17 e8 7b 	vextractps rax,xmm29,0x7b
-[ 	]*[a-f0-9]+:	62 43 fd 08 17 e8 7b 	vextractps r8,xmm29,0x7b
+[ 	]*[a-f0-9]+:	62 63 7d 08 17 e8 ab 	vextractps eax,xmm29,0xab
+[ 	]*[a-f0-9]+:	62 63 7d 08 17 e8 7b 	vextractps eax,xmm29,0x7b
+[ 	]*[a-f0-9]+:	62 43 7d 08 17 e8 7b 	vextractps r8d,xmm29,0x7b
 [ 	]*[a-f0-9]+:	62 63 7d 08 17 29 7b 	vextractps DWORD PTR \[rcx\],xmm29,0x7b
 [ 	]*[a-f0-9]+:	62 23 7d 08 17 ac f0 34 12 00 00 7b 	vextractps DWORD PTR \[rax\+r14\*8\+0x1234\],xmm29,0x7b
 [ 	]*[a-f0-9]+:	62 63 7d 08 17 6a 7f 7b 	vextractps DWORD PTR \[rdx\+0x1fc\],xmm29,0x7b
--- a/gas/testsuite/gas/i386/x86-64-avx512f.d
+++ b/gas/testsuite/gas/i386/x86-64-avx512f.d
@@ -2708,9 +2708,9 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	62 03 fd 4f 3b ee ab 	vextracti64x4 \$0xab,%zmm29,%ymm30\{%k7\}
 [ 	]*[a-f0-9]+:	62 03 fd cf 3b ee ab 	vextracti64x4 \$0xab,%zmm29,%ymm30\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:	62 03 fd 4f 3b ee 7b 	vextracti64x4 \$0x7b,%zmm29,%ymm30\{%k7\}
-[ 	]*[a-f0-9]+:	62 63 fd 08 17 e8 ab 	vextractps \$0xab,%xmm29,%rax
-[ 	]*[a-f0-9]+:	62 63 fd 08 17 e8 7b 	vextractps \$0x7b,%xmm29,%rax
-[ 	]*[a-f0-9]+:	62 43 fd 08 17 e8 7b 	vextractps \$0x7b,%xmm29,%r8
+[ 	]*[a-f0-9]+:	62 63 7d 08 17 e8 ab 	vextractps \$0xab,%xmm29,%eax
+[ 	]*[a-f0-9]+:	62 63 7d 08 17 e8 7b 	vextractps \$0x7b,%xmm29,%eax
+[ 	]*[a-f0-9]+:	62 43 7d 08 17 e8 7b 	vextractps \$0x7b,%xmm29,%r8d
 [ 	]*[a-f0-9]+:	62 63 7d 08 17 29 7b 	vextractps \$0x7b,%xmm29,\(%rcx\)
 [ 	]*[a-f0-9]+:	62 23 7d 08 17 ac f0 23 01 00 00 7b 	vextractps \$0x7b,%xmm29,0x123\(%rax,%r14,8\)
 [ 	]*[a-f0-9]+:	62 63 7d 08 17 6a 7f 7b 	vextractps \$0x7b,%xmm29,0x1fc\(%rdx\)
@@ -9729,9 +9729,9 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	62 03 fd 4f 3b ee ab 	vextracti64x4 \$0xab,%zmm29,%ymm30\{%k7\}
 [ 	]*[a-f0-9]+:	62 03 fd cf 3b ee ab 	vextracti64x4 \$0xab,%zmm29,%ymm30\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:	62 03 fd 4f 3b ee 7b 	vextracti64x4 \$0x7b,%zmm29,%ymm30\{%k7\}
-[ 	]*[a-f0-9]+:	62 63 fd 08 17 e8 ab 	vextractps \$0xab,%xmm29,%rax
-[ 	]*[a-f0-9]+:	62 63 fd 08 17 e8 7b 	vextractps \$0x7b,%xmm29,%rax
-[ 	]*[a-f0-9]+:	62 43 fd 08 17 e8 7b 	vextractps \$0x7b,%xmm29,%r8
+[ 	]*[a-f0-9]+:	62 63 7d 08 17 e8 ab 	vextractps \$0xab,%xmm29,%eax
+[ 	]*[a-f0-9]+:	62 63 7d 08 17 e8 7b 	vextractps \$0x7b,%xmm29,%eax
+[ 	]*[a-f0-9]+:	62 43 7d 08 17 e8 7b 	vextractps \$0x7b,%xmm29,%r8d
 [ 	]*[a-f0-9]+:	62 63 7d 08 17 29 7b 	vextractps \$0x7b,%xmm29,\(%rcx\)
 [ 	]*[a-f0-9]+:	62 23 7d 08 17 ac f0 34 12 00 00 7b 	vextractps \$0x7b,%xmm29,0x1234\(%rax,%r14,8\)
 [ 	]*[a-f0-9]+:	62 63 7d 08 17 6a 7f 7b 	vextractps \$0x7b,%xmm29,0x1fc\(%rdx\)
--- a/gas/testsuite/gas/i386/x86-64-avx512f.s
+++ b/gas/testsuite/gas/i386/x86-64-avx512f.s
@@ -2953,7 +2953,7 @@ _start:
 	vextracti64x4	$0xab, %zmm29, %ymm30{%k7}{z}	 # AVX512F
 	vextracti64x4	$123, %zmm29, %ymm30{%k7}	 # AVX512F
 
-	vextractps	$0xab, %xmm29, %rax	 # AVX512F
+	vextractps	$0xab, %xmm29, %eax	 # AVX512F
 	vextractps	$123, %xmm29, %rax	 # AVX512F
 	vextractps	$123, %xmm29, %r8	 # AVX512F
 	vextractps	$123, %xmm29, (%rcx)	 # AVX512F
@@ -10611,7 +10611,7 @@ _start:
 	vextracti64x4	ymm30{k7}{z}, zmm29, 0xab	 # AVX512F
 	vextracti64x4	ymm30{k7}, zmm29, 123	 # AVX512F
 
-	vextractps	rax, xmm29, 0xab	 # AVX512F
+	vextractps	eax, xmm29, 0xab	 # AVX512F
 	vextractps	rax, xmm29, 123	 # AVX512F
 	vextractps	r8, xmm29, 123	 # AVX512F
 	vextractps	DWORD PTR [rcx], xmm29, 123	 # AVX512F
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1026,8 +1026,8 @@ movq, 2, 0xa1, None, 1, Cpu64, D|Size64|
 movq, 2, 0x89, None, 1, Cpu64, D|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixOk=3, { Reg64, Reg64|Unspecified|Qword|BaseIndex }
 movq, 2, 0xc7, 0x0, 1, Cpu64, Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixOk=3|Optimize, { Imm32S, Reg64|Qword|Unspecified|BaseIndex }
 movq, 2, 0xb8, None, 1, Cpu64, Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
-movq, 2, 0xf37e, None, 1, CpuAVX, Load|Modrm|Vex=1|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
-movq, 2, 0x66d6, None, 1, CpuAVX, Modrm|Vex=1|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
+movq, 2, 0xf37e, None, 1, CpuAVX, Load|Modrm|Vex=1|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
+movq, 2, 0x66d6, None, 1, CpuAVX, Modrm|Vex=1|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
 movq, 2, 0x666e, None, 1, CpuAVX|Cpu64, D|Modrm|Vex=1|VexOpcode=0|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Unspecified|BaseIndex, RegXMM }
 movq, 2, 0xf30f7e, None, 2, CpuSSE2, Load|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Unspecified|Qword|BaseIndex|RegXMM, RegXMM }
 movq, 2, 0x660fd6, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { RegXMM, Unspecified|Qword|BaseIndex|RegXMM }
@@ -1293,7 +1293,7 @@ movlhps, 2, 0xf16, None, 2, CpuSSE, Modr
 movlps, 2, 0x12, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
 movlps, 2, 0x13, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex }
 movlps, 2, 0xf12, None, 2, CpuSSE, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
-movmskps, 2, 0x50, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { RegXMM, Reg32|Reg64 }
+movmskps, 2, 0x50, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW0|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|SSE2AVX, { RegXMM, Reg32|Reg64 }
 movmskps, 2, 0xf50, None, 2, CpuSSE, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
 movntps, 2, 0x2b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Xmmword|Unspecified|BaseIndex }
 movntps, 2, 0xf2b, None, 2, CpuSSE, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
@@ -1317,14 +1317,14 @@ pavgb, 2, 0x660fe0, None, 2, CpuSSE2, Mo
 pavgw, 2, 0xfe3, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoAVX, { Qword|Unspecified|BaseIndex|RegMMX, RegMMX }
 pavgw, 2, 0x66e3, None, 1, CpuAVX, Modrm|C|Vex|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pavgw, 2, 0x660fe3, None, 2, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
-pextrw, 3, 0x66c5, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64 }
-pextrw, 3, 0x6615, None, 1, CpuAVX, RegMem|Vex|VexOpcode=2|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64 }
+pextrw, 3, 0x66c5, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexW0|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64 }
+pextrw, 3, 0x6615, None, 1, CpuAVX, RegMem|Vex|VexOpcode=2|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64 }
 pextrw, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Word|Unspecified|BaseIndex }
 pextrw, 3, 0x660fc5, None, 2, CpuSSE2, Load|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
 pextrw, 3, 0x660f3a15, None, 3, CpuSSE4_1, RegMem|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
 pextrw, 3, 0x660f3a15, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Word|Unspecified|BaseIndex }
 pextrw, 3, 0xfc5, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|NoAVX, { Imm8, RegMMX, Reg32|Reg64 }
-pinsrw, 3, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, Reg32|Reg64, RegXMM }
+pinsrw, 3, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW0|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|SSE2AVX, { Imm8, Reg32|Reg64, RegXMM }
 pinsrw, 3, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|SSE2AVX, { Imm8, Word|Unspecified|BaseIndex, RegXMM }
 pinsrw, 3, 0x660fc4, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64, RegXMM }
 pinsrw, 3, 0x660fc4, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM }
@@ -1342,7 +1342,7 @@ pminsw, 2, 0xfea, None, 2, CpuSSE|Cpu3dn
 pminub, 2, 0x66da, None, 1, CpuAVX, Modrm|C|Vex|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pminub, 2, 0x660fda, None, 2, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pminub, 2, 0xfda, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoAVX, { Qword|Unspecified|BaseIndex|RegMMX, RegMMX }
-pmovmskb, 2, 0x66d7, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { RegXMM, Reg32|Reg64 }
+pmovmskb, 2, 0x66d7, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|SSE2AVX, { RegXMM, Reg32|Reg64 }
 pmovmskb, 2, 0x660fd7, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
 pmovmskb, 2, 0xfd7, None, 2, CpuSSE|Cpu3dnowA, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|NoAVX, { RegMMX, Reg32|Reg64 }
 pmulhuw, 2, 0x66e4, None, 1, CpuAVX, Modrm|C|Vex|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1464,7 +1464,7 @@ movhpd, 2, 0x660f16, None, 2, CpuSSE2, D
 movlpd, 2, 0x6612, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
 movlpd, 2, 0x6613, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex }
 movlpd, 2, 0x660f12, None, 2, CpuSSE2, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
-movmskpd, 2, 0x6650, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64|SSE2AVX, { RegXMM, Reg32|Reg64 }
+movmskpd, 2, 0x6650, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|SSE2AVX, { RegXMM, Reg32|Reg64 }
 movmskpd, 2, 0x660f50, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
 movntpd, 2, 0x662b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Xmmword|Unspecified|BaseIndex }
 movntpd, 2, 0x660f2b, None, 2, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
@@ -1695,7 +1695,7 @@ dppd, 3, 0x660f3a41, None, 3, CpuSSE4_1,
 dpps, 3, 0x6640, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 dpps, 3, 0x660f3a40, None, 3, CpuSSE4_1, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 extractps, 3, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
-extractps, 3, 0x6617, None, 1, CpuAVX|Cpu64, RegMem|Vex|VexOpcode=2|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg64 }
+extractps, 3, 0x6617, None, 1, CpuAVX|Cpu64, RegMem|Vex|VexOpcode=2|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Reg64 }
 extractps, 3, 0x660f3a17, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
 extractps, 3, 0x660f3a17, None, 3, CpuSSE4_1|Cpu64, RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg64 }
 insertps, 3, 0x6621, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
@@ -1714,7 +1714,7 @@ pblendw, 3, 0x660e, None, 1, CpuAVX, Mod
 pblendw, 3, 0x660f3a0e, None, 3, CpuSSE4_1, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 pcmpeqq, 2, 0x6629, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pcmpeqq, 2, 0x660f3829, None, 3, CpuSSE4_1, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
-pextrb, 3, 0x6614, None, 1, CpuAVX, RegMem|Vex|VexOpcode=2|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64 }
+pextrb, 3, 0x6614, None, 1, CpuAVX, RegMem|Vex|VexOpcode=2|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Reg32|Reg64 }
 pextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM, Byte|Unspecified|BaseIndex }
 pextrb, 3, 0x660f3a14, None, 3, CpuSSE4_1, RegMem|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
 pextrb, 3, 0x660f3a14, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Byte|Unspecified|BaseIndex }
@@ -1724,7 +1724,7 @@ pextrq, 3, 0x6616, None, 1, CpuAVX|Cpu64
 pextrq, 3, 0x660f3a16, None, 3, CpuSSE4_1|Cpu64, Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg64|Unspecified|BaseIndex }
 phminposuw, 2, 0x6641, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
 phminposuw, 2, 0x660f3841, None, 3, CpuSSE4_1, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
-pinsrb, 3, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, Reg32|Reg64, RegXMM }
+pinsrb, 3, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Reg32|Reg64, RegXMM }
 pinsrb, 3, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Byte|Unspecified|BaseIndex, RegXMM }
 pinsrb, 3, 0x660f3a20, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64, RegXMM }
 pinsrb, 3, 0x660f3a20, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM }
@@ -1782,7 +1782,7 @@ roundpd, 3, 0x6609, None, 1, CpuAVX, Mod
 roundpd, 3, 0x660f3a09, None, 3, CpuSSE4_1, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 roundps, 3, 0x6608, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 roundps, 3, 0x660f3a08, None, 3, CpuSSE4_1, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
-roundsd, 3, 0x660b, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
+roundsd, 3, 0x660b, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=2|VexVVVV=1|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 roundsd, 3, 0x660f3a0b, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 roundss, 3, 0x660a, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
 roundss, 3, 0x660f3a0a, None, 3, CpuSSE4_1, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
@@ -2055,7 +2055,7 @@ vdppd, 4, 0x6641, None, 1, CpuAVX, Modrm
 vdpps, 4, 0x6640, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vextractf128, 3, 0x6619, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=2|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM }
 vextractps, 3, 0x6617, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
-vextractps, 3, 0x6617, None, 1, CpuAVX|Cpu64, RegMem|Vex|VexOpcode=2|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg64 }
+vextractps, 3, 0x6617, None, 1, CpuAVX|Cpu64, RegMem|Vex|VexOpcode=2|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg64 }
 vhaddpd, 3, 0x667c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vhaddps, 3, 0xf27c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vhsubpd, 3, 0x667d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
@@ -2100,14 +2100,14 @@ vmovlpd, 3, 0x6612, None, 1, CpuAVX, Mod
 vmovlpd, 2, 0x6613, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Qword|Unspecified|BaseIndex }
 vmovlps, 3, 0x12, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vmovlps, 2, 0x13, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Qword|Unspecified|BaseIndex }
-vmovmskpd, 2, 0x6650, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM|RegYMM, Reg32|Reg64 }
-vmovmskps, 2, 0x50, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM|RegYMM, Reg32|Reg64 }
+vmovmskpd, 2, 0x6650, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { RegXMM|RegYMM, Reg32|Reg64 }
+vmovmskps, 2, 0x50, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { RegXMM|RegYMM, Reg32|Reg64 }
 vmovntdq, 2, 0x66e7, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
 vmovntdqa, 2, 0x662a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex, RegXMM }
 vmovntpd, 2, 0x662b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
 vmovntps, 2, 0x2b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
-vmovq, 2, 0xf37e, None, 1, CpuAVX, Load|Modrm|Vex=1|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vmovq, 2, 0x66d6, None, 1, CpuAVX, Modrm|Vex=1|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
+vmovq, 2, 0xf37e, None, 1, CpuAVX, Load|Modrm|Vex=1|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
+vmovq, 2, 0x66d6, None, 1, CpuAVX, Modrm|Vex=1|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
 vmovq, 2, 0x666e, None, 1, CpuAVX|Cpu64, D|Modrm|Vex=1|VexOpcode=0|VexW=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|Unspecified|BaseIndex, RegXMM }
 vmovsd, 2, 0xf210, None, 1, CpuAVX, D|Modrm|Vex=3|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 vmovsd, 3, 0xf210, None, 1, CpuAVX, D|Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
@@ -2165,12 +2165,12 @@ vpermilpd, 3, 0x660d, None, 1, CpuAVX, M
 vpermilpd, 3, 0x6605, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vpermilps, 3, 0x660c, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpermilps, 3, 0x6604, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vpextrb, 3, 0x6614, None, 1, CpuAVX, RegMem|Vex|VexOpcode=2|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
+vpextrb, 3, 0x6614, None, 1, CpuAVX, RegMem|Vex|VexOpcode=2|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Reg64 }
 vpextrb, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Byte|Unspecified|BaseIndex }
 vpextrd, 3, 0x6616, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
 vpextrq, 3, 0x6616, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg64|Unspecified|BaseIndex }
-vpextrw, 3, 0x66c5, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
-vpextrw, 3, 0x6615, None, 1, CpuAVX, RegMem|Vex|VexOpcode=2|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
+vpextrw, 3, 0x66c5, None, 1, CpuAVX, Load|Modrm|Vex|VexOpcode=0|VexWIG|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Reg64 }
+vpextrw, 3, 0x6615, None, 1, CpuAVX, RegMem|Vex|VexOpcode=2|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Reg32|Reg64 }
 vpextrw, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, Word|Unspecified|BaseIndex }
 vphaddd, 3, 0x6602, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vphaddsw, 3, 0x6603, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
@@ -2179,11 +2179,11 @@ vphminposuw, 2, 0x6641, None, 1, CpuAVX,
 vphsubd, 3, 0x6606, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vphsubsw, 3, 0x6607, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vphsubw, 3, 0x6605, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpinsrb, 4, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrb, 4, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
 vpinsrb, 4, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpinsrd, 4, 0x6622, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg32|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpinsrq, 4, 0x6622, None, 1, CpuAVX|Cpu64, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpinsrw, 4, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrw, 4, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
 vpinsrw, 4, 0x66c4, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpmaddubsw, 3, 0x6604, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpmaddwd, 3, 0x66f5, None, 1, CpuAVX, Modrm|C|Vex|VexOpcode=0|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
@@ -2199,7 +2199,7 @@ vpminsw, 3, 0x66ea, None, 1, CpuAVX, Mod
 vpminub, 3, 0x66da, None, 1, CpuAVX, Modrm|C|Vex|VexOpcode=0|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpminud, 3, 0x663b, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vpminuw, 3, 0x663a, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpmovmskb, 2, 0x66d7, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
+vpmovmskb, 2, 0x66d7, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { RegXMM, Reg32|Reg64 }
 vpmovsxbd, 2, 0x6621, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpmovsxbq, 2, 0x6622, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpmovsxbw, 2, 0x6620, None, 1, CpuAVX, Modrm|Vex|VexOpcode=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
@@ -2268,7 +2268,7 @@ vrcpps, 2, 0x53, None, 1, CpuAVX, Modrm|
 vrcpss, 3, 0xf353, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vroundpd, 3, 0x6609, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vroundps, 3, 0x6608, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vroundsd, 4, 0x660b, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=2|VexVVVV=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vroundsd, 4, 0x660b, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=2|VexVVVV=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vroundss, 4, 0x660a, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=2|VexVVVV=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vrsqrtps, 2, 0x52, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexWIG|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vrsqrtss, 3, 0xf352, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
@@ -2350,7 +2350,7 @@ vpminsw, 3, 0x66ea, None, 1, CpuAVX2, Mo
 vpminub, 3, 0x66da, None, 1, CpuAVX2, Modrm|C|Vex=2|VexOpcode=0|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpminud, 3, 0x663b, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpminuw, 3, 0x663a, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexVVVV=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpmovmskb, 2, 0x66d7, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegYMM, Reg32|Reg64 }
+vpmovmskb, 2, 0x66d7, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { RegYMM, Reg32|Reg64 }
 vpmovsxbd, 2, 0x6621, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM }
 vpmovsxbq, 2, 0x6622, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexWIG|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { DWord|Unspecified|BaseIndex|RegXMM, RegYMM }
 vpmovsxbw, 2, 0x6620, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=1|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-04  9:41 ` [PATCH 1/9] x86: refine TPAUSE and UMWAIT Jan Beulich
@ 2020-03-04 11:37   ` H.J. Lu
  2020-03-04 11:40     ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
> when there's no difference between either variant, because of 32-bit
> destination registers having their upper halves zeroed in 64-bit mode.
> Here, however, they're source registers, and hence specifying 64-bit
> registers would lead to the ambiguity of whether the upper 32 bits
> actually matter.
>
> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
> needed on both.

Are there testcases to show IgnoreSize is needed on them?

> And finally, just like for e.g. MONITOR/MWAIT, add variants with all
> input registers explicitly specified.
>
> gas/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (md_assemble): Also exclude tpause and umwait
>         from having their operands swapped.
>         * testsuite/gas/i386/waitpkg.s,
>         testsuite/gas/i386/x86-64-waitpkg.s: Add tpause and umwait
>         3-operand cases.
>         * testsuite/gas/i386/waitpkg.d,
>         testsuite/gas/i386/waitpkg-intel.d,
>         testsuite/gas/i386/x86-64-waitpkg.d,
>         testsuite/gas/i386/x86-64-waitpkg-intel.d: Adjust expectations.
>
> opcodes/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-opc.tbl (tpause, umwait): Add IgnoreSize. Add 3-operand
>         template.
>         * i386-tbl.h: Re-generate.


--
H.J.
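
To illustrate the register argument quoted above, here is a toy C model. It is
not gas code, and the type and function names are made up for this sketch. A
32-bit destination write clears bits 63:32, so either register name describes
the same result. A 32-bit source read ignores whatever the upper half holds,
which is why naming a 64-bit source register would be misleading.

```c
#include <stdint.h>

/* Hypothetical model of one 64-bit general purpose register. */
typedef struct { uint64_t bits; } gpr64;

/* Writing a 32-bit destination in 64-bit mode zeroes bits 63:32. */
void write32(gpr64 *r, uint32_t value)
{
    r->bits = (uint64_t)value;
}

/* Reading a 32-bit source only looks at bits 31:0; the upper half is
   silently ignored. */
uint32_t read32(const gpr64 *r)
{
    return (uint32_t)r->bits;
}
```

With the upper half set, read32() returns only the low 32 bits; after
write32() the whole register equals the zero-extended value.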

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-04 11:37   ` H.J. Lu
@ 2020-03-04 11:40     ` Jan Beulich
  2020-03-04 11:44       ` H.J. Lu
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04 11:40 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 04.03.2020 12:36, H.J. Lu wrote:
> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
>> when there's no difference between either variant, because of 32-bit
>> destination registers having their upper halves zeroed in 64-bit mode.
>> Here, however, they're source registers, and hence specifying 64-bit
>> registers would lead to the ambiguity of whether the upper 32 bits
>> actually matter.
>>
>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
>> needed on both.
> 
> Are there testcases to show IgnoreSize is needed on them?

The situation with 16-bit test cases is rather poor anyway. I didn't
consider it reasonable to add such very special ones when far more
general ones don't exist. But if your question is to mean you demand
such to be added, then I'll (somewhat hesitantly) add/extend some.
Please clarify.

Jan

* Re: [PATCH 2/9] x86: add missing IgnoreSize
  2020-03-04  9:42 ` [PATCH 2/9] x86: add missing IgnoreSize Jan Beulich
@ 2020-03-04 11:40   ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> For proper code generation in 16-bit mode (or to avoid the "same type of
> prefix used twice" diagnostic there), IgnoreSize is needed on certain
> templates allowing for just 32-(and maybe 64-)bit operands.
>
> Beyond adding tests for the previously broken cases, also add ones for
> the previously working cases where IgnoreSize is needed for the same
> reason (leaving out MPX for now, as that'll require an assembler change
> first). Some minor adjustments to tests get done such that re-use of the
> same code for 16-bit code generation testing becomes easier.
>
> gas/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * testsuite/gas/i386/adx.s, testsuite/gas/i386/cet.s,
>         testsuite/gas/i386/ept.s, testsuite/gas/i386/fsgs.s,
>         testsuite/gas/i386/invpcid.s, testsuite/gas/i386/movdir.s,
>         testsuite/gas/i386/ptwrite.s, testsuite/gas/i386/vmx.s,
>         testsuite/gas/i386/waitpkg.s: Re-assemble some of the source as
>         16-bit code.
>         * testsuite/gas/i386/code16.s: Add CR, DR, and TR access cases
>         as well as a BSWAP one.
>         * testsuite/gas/i386/rdpid.s: Add 16-bit case.
>         * testsuite/gas/i386/sse2-16bit.s: Cover more insns.
>         * testsuite/gas/i386/adx-intel.d, testsuite/gas/i386/adx.d,
>         testsuite/gas/i386/cet-intel.d, testsuite/gas/i386/cet.d,
>         testsuite/gas/i386/code16.d, testsuite/gas/i386/ept-intel.d,
>         testsuite/gas/i386/ept.d, testsuite/gas/i386/fsgs-intel.d,
>         testsuite/gas/i386/fsgs.d, testsuite/gas/i386/invpcid-intel.d,
>         testsuite/gas/i386/invpcid.d, testsuite/gas/i386/movdir-intel.d,
>         testsuite/gas/i386/movdir.d, testsuite/gas/i386/ptwrite-intel.d,
>         testsuite/gas/i386/ptwrite.d, testsuite/gas/i386/rdpid-intel.d,
>         testsuite/gas/i386/rdpid.d, testsuite/gas/i386/sse2-16bit.d,
>         testsuite/gas/i386/vmx.d, testsuite/gas/i386/waitpkg-intel.d,
>         testsuite/gas/i386/waitpkg.d: Adjust expectations.
>
> opcodes/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-opc.tbl (movmskps, mwait, vmread, vmwrite, invept,
>         invvpid, invpcid, rdfsbase, rdgsbase, wrfsbase, wrgsbase, adcx,
>         adox, mwaitx, rdpid, movdiri): Add IgnoreSize.
>         (ptwrite): Split into non-64-bit and 64-bit forms.
>         * i386-tbl.h: Re-generate.
>

OK.

Thanks.

-- 
H.J.
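
The role of IgnoreSize described in the quoted commit message can be sketched
as follows. This is a simplified model rather than the assembler's actual
logic; the function and parameter names are invented. It assumes that the
"same type of prefix used twice" diagnostic corresponds to an opcode that
already carries a 0x66 byte receiving a second one.

```c
#include <stdbool.h>

enum code_mode { MODE_16BIT, MODE_32BIT };

/* Returns how many 0x66 bytes would precede the opcode.  A result of 2
   stands for the "same type of prefix used twice" situation.  */
int data_prefix_count(enum code_mode mode, bool operand_is_32bit,
                      bool ignore_size, bool opcode_has_66)
{
    int count = opcode_has_66 ? 1 : 0;

    /* An operand size differing from the mode's default size would
       normally request an operand size override.  */
    bool mismatch = (mode == MODE_16BIT) == operand_is_32bit;

    /* IgnoreSize says the register size has no bearing on the encoding,
       so no override is emitted.  */
    if (mismatch && !ignore_size)
        ++count;

    return count;
}
```

In this model a 32-bit-only insn assembled in 16-bit mode gets a stray 0x66
without IgnoreSize, and none with it.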

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-04 11:40     ` Jan Beulich
@ 2020-03-04 11:44       ` H.J. Lu
  2020-03-05  8:08         ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 04.03.2020 12:36, H.J. Lu wrote:
> > On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
> >> when there's no difference between either variant, because of 32-bit
> >> destination registers having their upper halves zeroed in 64-bit mode.
> >> Here, however, they're source registers, and hence specifying 64-bit
> >> registers would lead to the ambiguity of whether the upper 32 bits
> >> actually matter.
> >>
> >> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
> >> needed on both.
> >
> > Are there testcases to show IgnoreSize is needed on them?
>
> The situation with 16-bit test cases is rather poor anyway. I didn't
> consider it reasonable to add such very special ones when far more
> general ones don't exist. But if your question is to mean you demand

Let's start from somewhere.

> such to be added, then I'll (somewhat hesitantly) add/extend some.
> Please clarify.

Please add testcases.

Thanks.

-- 
H.J.

* Re: [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode
  2020-03-04  9:43 ` [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode Jan Beulich
@ 2020-03-04 11:46   ` H.J. Lu
  2020-03-04 11:50     ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 1:38 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Since 16-bit addressing isn't allowed, Disp32 needs to be forced; Disp16
> fails to match the templates.
>
> The SDM leaves open whether BNDC[LNU] with a GPR operand require an
> operand size override; this aspect is therefore left untouched here.
>
> gas/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (i386_addressing_mode): Force 32-bit
>         addressing for MPX insns without base/index.
>         * testsuite/gas/i386/mpx-16bit.s,
>         testsuite/gas/i386/mpx-16bit.d: New.
>         * testsuite/gas/i386/i386.exp: Run new test.
>
> opcodes/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-dis.c (OP_E_memory): Exclude recording of used address
>         prefix for "bnd" modes only in 64-bit mode. Don't decode 16-bit
>         addressed memory operands for MPX insns.
>
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -10297,6 +10297,21 @@ i386_addressing_mode (void)
>
>    if (i.prefix[ADDR_PREFIX])
>      addr_mode = flag_code == CODE_32BIT ? CODE_16BIT : CODE_32BIT;
> +  else if (flag_code == CODE_16BIT
> +          && current_templates->start->cpu_flags.bitfield.cpumpx
> +          /* Avoid replacing the "16-bit addressing not allowed" diagnostic
> +             from md_assemble() by "is not a valid base/index expression"
> +             when there is a base and/or index.  */
> +          && !i.types[this_operand].bitfield.baseindex)
> +    {
> +      /* MPX insn memory operands with neither base nor index must be forced
> +        to use 32-bit addressing in 16-bit mode.  */
> +      addr_mode = CODE_32BIT;
> +      i.prefix[ADDR_PREFIX] = ADDR_PREFIX_OPCODE;
> +      ++i.prefixes;
> +      gas_assert (!i.types[this_operand].bitfield.disp16);
> +      gas_assert (!i.types[this_operand].bitfield.disp32);
> +    }
>    else
>      {

Since MPX isn't available in 16-bit mode, should they be disallowed? Given that
MPX has been deprecated, I prefer an error here.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 4/9] x86: drop Rex64 attribute
  2020-03-04  9:44 ` [PATCH 4/9] x86: drop Rex64 attribute Jan Beulich
@ 2020-03-04 11:47   ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 1:38 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> It is almost entirely redundant with Size64, and the sole case (CRC32)
> where direct replacement isn't possible can easily be taken care of in
> another way.
>
> gas/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (md_assemble): Drop use of rex64.
>         (process_suffix): Force REX.W for 64-bit CRC32.
>
> opcodes/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-gen.c (opcode_modifiers): Remove Rex64 field.
>         * i386-opc.h (Rex64): Delete.
>         (struct i386_opcode_modifier): Remove rex64 field.
>         * i386-opc.tbl (crc32): Drop Rex64.
>         Replace Rex64 with Size64 everywhere else.
>         * i386-tbl.h: Re-generate.
>

OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode
  2020-03-04 11:46   ` H.J. Lu
@ 2020-03-04 11:50     ` Jan Beulich
  2020-03-04 11:55       ` H.J. Lu
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04 11:50 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 04.03.2020 12:45, H.J. Lu wrote:
> On Wed, Mar 4, 2020 at 1:38 AM Jan Beulich <jbeulich@suse.com> wrote:
>> --- a/gas/config/tc-i386.c
>> +++ b/gas/config/tc-i386.c
>> @@ -10297,6 +10297,21 @@ i386_addressing_mode (void)
>>
>>    if (i.prefix[ADDR_PREFIX])
>>      addr_mode = flag_code == CODE_32BIT ? CODE_16BIT : CODE_32BIT;
>> +  else if (flag_code == CODE_16BIT
>> +          && current_templates->start->cpu_flags.bitfield.cpumpx
>> +          /* Avoid replacing the "16-bit addressing not allowed" diagnostic
>> +             from md_assemble() by "is not a valid base/index expression"
>> +             when there is a base and/or index.  */
>> +          && !i.types[this_operand].bitfield.baseindex)
>> +    {
>> +      /* MPX insn memory operands with neither base nor index must be forced
>> +        to use 32-bit addressing in 16-bit mode.  */
>> +      addr_mode = CODE_32BIT;
>> +      i.prefix[ADDR_PREFIX] = ADDR_PREFIX_OPCODE;
>> +      ++i.prefixes;
>> +      gas_assert (!i.types[this_operand].bitfield.disp16);
>> +      gas_assert (!i.types[this_operand].bitfield.disp32);
>> +    }
>>    else
>>      {
> 
> Since MPX isn't available in 16-bit mode, should they be disallowed?

How is it not available? As per my understanding, one just needs
to use 32-bit addressing.

> Given that MPX has been deprecated, I prefer an error here.

The use of "here" is confusing - just for the broken case (no
base/index), or for MPX insns in general? (Asking just in case
my understanding expressed above is wrong.)

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 5/9] x86: replace NoRex64 on VEX-encoded insns
  2020-03-04 10:19 ` [PATCH 5/9] x86: replace NoRex64 on VEX-encoded insns Jan Beulich
@ 2020-03-04 11:51   ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 2:06 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> When the template specifies any of the possible VexW settings, we can
> use this instead of a separate NoRex64 to suppress the setting of REX_W.
> Note that this ends up addressing an inconsistency between VEX- and
> EVEX-encoded VEXTRACTPS, VPEXTR{B,W}, and VPINSR{B,W} - while the former
> avoided setting VEX.W, the latter pointlessly set EVEX.W when there is a
> 64-bit GPR operand. Adjust the testcase to cover both cases.
>
> Convert VexW= to their respective VexW* on lines touched anyway.
>
> gas/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (process_suffix): Exclude !vexw insns
>         alongside !norex64 ones.
>         * testsuite/gas/i386/x86-64-avx512bw.s: Test VPEXTR* and VPINSR*
>         with both 32- and 64-bit GPR operands.
>         * testsuite/gas/i386/x86-64-avx512f.s: Test VEXTRACTPS with both
>         32- and 64-bit GPR operands.
>         * testsuite/gas/i386/x86-64-avx512bw-intel.d,
>         testsuite/gas/i386/x86-64-avx512bw.d,
>         testsuite/gas/i386/x86-64-avx512f-intel.d,
>         testsuite/gas/i386/x86-64-avx512f.d: Adjust expectations.
>
> opcodes/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-opc.tbl (movq): Drop NoRex64 from XMM/XMM SSE2AVX variants.
>         (movmskps, pextrw, pinsrw, pmovmskb, movmskpd, extractps,
>         pextrb, pinsrb, roundsd): Drop NoRex64 and where applicable use
>         VexW0 on SSE2AVX variants.
>         (vmovq): Drop NoRex64 from XMM/XMM variants.
>         (vextractps, vmovmskpd, vmovmskps, vpextrb, vpextrw, vpinsrb,
>         vpinsrw, vpmovmskb, vroundsd): Drop NoRex64 and where
>         applicable use VexW0.
>         * i386-tbl.h: Re-generate.
> ---
> In principle this paves the way for folding NoRex64 with some VEX-only
> attribute bit, as there's no VEX-or-alike insn left with NoRex64 set
> (and following the underlying model there also isn't going to be).

OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/9] x86: don't accept FI{LD,STP,STTP}LL in Intel syntax mode
  2020-03-04  9:45 ` [PATCH 6/9] x86: don't accept FI{LD,STP,STTP}LL in Intel syntax mode Jan Beulich
@ 2020-03-04 11:55   ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 1:39 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> As of commit dc2be329b950 ("i386: Only check suffix in instruction
> mnemonic") these have been accepted even with "qword ptr" operand size
> specifier, but in 64-bit mode they're now wrongly having a REX prefix
> (with REX.W set) emitted in this case. These aren't Intel syntax
> mnemonics, so rather than fixing code generation, let's simply reject
> them. As a result, the Qword attribute can then be dropped, too.
>
> opcodes/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-opc.tbl (fildll, fistpll, fisttpll): Add ATTSyntax.
>         * i386-tbl.h: Re-generate.
>

OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode
  2020-03-04 11:50     ` Jan Beulich
@ 2020-03-04 11:55       ` H.J. Lu
  2020-03-04 12:58         ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 3:50 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 04.03.2020 12:45, H.J. Lu wrote:
> > On Wed, Mar 4, 2020 at 1:38 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> --- a/gas/config/tc-i386.c
> >> +++ b/gas/config/tc-i386.c
> >> @@ -10297,6 +10297,21 @@ i386_addressing_mode (void)
> >>
> >>    if (i.prefix[ADDR_PREFIX])
> >>      addr_mode = flag_code == CODE_32BIT ? CODE_16BIT : CODE_32BIT;
> >> +  else if (flag_code == CODE_16BIT
> >> +          && current_templates->start->cpu_flags.bitfield.cpumpx
> >> +          /* Avoid replacing the "16-bit addressing not allowed" diagnostic
> >> +             from md_assemble() by "is not a valid base/index expression"
> >> +             when there is a base and/or index.  */
> >> +          && !i.types[this_operand].bitfield.baseindex)
> >> +    {
> >> +      /* MPX insn memory operands with neither base nor index must be forced
> >> +        to use 32-bit addressing in 16-bit mode.  */
> >> +      addr_mode = CODE_32BIT;
> >> +      i.prefix[ADDR_PREFIX] = ADDR_PREFIX_OPCODE;
> >> +      ++i.prefixes;
> >> +      gas_assert (!i.types[this_operand].bitfield.disp16);
> >> +      gas_assert (!i.types[this_operand].bitfield.disp32);
> >> +    }
> >>    else
> >>      {
> >
> > Since MPX isn't available in 16-bit mode, should they be disallowed?
>
> How is it not available? As per my understanding, one just needs
> to use 32-bit addressing.

0x67 prefix is special for MPX.  It can't be used as address prefix on MPX
instructions.

> > Given that MPX has been deprecated, I prefer an error here.
>
> The use of "here" is confusing - just for the broken case (no
> base/index), or for MPX insns in general? (Asking just in case
> my understanding expressed above is wrong.)

flag_code == CODE_16BIT && current_templates->start->cpu_flags.bitfield.cpumpx

should be an error.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 7/9] x86: fold (supposed to be) identical code
  2020-03-04  9:46 ` [PATCH 7/9] x86: fold (supposed to be) identical code Jan Beulich
@ 2020-03-04 11:56   ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 1:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> The Q and L suffix exclusion checks in match_template() ought to be
> (kept) in sync as far as their FPU and SIMD aspects go. This was
> already violated by only the Q one checking for active broadcast.
> Convert the code such that there'll be only one instance of the logic,
> the more that subsequently the logic is liable to need further
> refinement / extension. (The alternative would be to drop all SIMD-ness
> from the L part, but it is in principle possible to enable all sorts of
> SIMD support with just a pre-386 CPU, via suitable .arch directives.)
>
> gas/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (match_template): Fold duplicate code in
>         logic rejecting certain suffixes in certain modes. Drop
>         pointless "else".
>

OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 8/9] x86: drop/replace IgnoreSize
  2020-03-04 10:15 ` [PATCH 8/9] x86: drop/replace IgnoreSize Jan Beulich
@ 2020-03-04 11:59   ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 11:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 2:03 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Even after commit dc2be329b950 ("i386: Only check suffix in instruction
> mnemonic"), by which many of its uses have become unnecessary (some were
> unnecessary even before), IgnoreSize is still used for various slightly
> different purposes:
> - to suppress emission of an operand size prefix,
> - in Intel syntax mode to zap "derived" suffixes in certain cases and to
>   skip certain checks of remaining "derived" suffixes,
> - to suppress ambiguous operand size / missing suffix diagnostics,
> - for prefixes to suppress the "stand-alone ... prefix" warning.
> Drop entirely unnecessary ones and where possible also replace instances
> by the more focused (because of having just a single purpose) NoRex64.
>
> To further restrict when IgnoreSize is needed, also generalize the logic
> when to skip a template because of a present or derived L or Q suffix,
> by skipping immediate operands. Additionally consider mask registers and
> VecSIB there.
>
> Note that for the time being the attribute needs to be kept in place on
> MMX/SSE/etc insns (but not on VEX/EVEX encoded ones unless an operand
> template of them allows for only non-SIMD-register actuals) allowing for
> Dword operands - the logic when to emit a data size prefix would need
> further adjustment first.
>
> Note also that the memory forms of {,v}pinsrw get their permission for
> an L or Q suffix dropped. I can only assume that it being this way was a
> cut-and-paste mistake from the register forms, as the latter
> specifically have NoRex64 set, and the {,v}pextrw counterparts don't
> allow these suffixes either.
>
> Convert VexW= again to their respective VexW* on lines touched anyway.
>
> gas/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (match_template): Extend code in logic
>         rejecting certain suffixes in certain modes to also cover mask
>         register use and VecSIB. Drop special casing of broadcast. Skip
>         immediates in the check.
>
> opcodes/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-opc.tbl: Drop IgnoreSize from various SIMD insns. Replace
>         VexW= by VexW* and VexVVVV=1 by just VexVVVV where applicable.
>         * i386-tbl.h: Re-generate.
>

OK.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 9/9] x86: reduce amount of various VCVT* templates
  2020-03-04  9:47 ` [PATCH 9/9] x86: reduce amount of various VCVT* templates Jan Beulich
@ 2020-03-04 12:00   ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 12:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 1:41 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Presumably as a result of various changes over the last several months,
> and - for some of them - with a generalization of logic in
> match_mem_size() plus mirroring of this generalization into the
> broadcast handling logic of check_VecOperands(), various register-only
> templates can be folded into their respective memory forms. This in
> particular then also allows dropping a few more instances of IgnoreSize.
>
> gas/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (match_mem_size): Generalize broadcast special
>         casing.
>         (check_VecOperands): Zap xmmword/ymmword/zmmword when more than
>         one of byte/word/dword/qword is set alongside a SIMD register in
>         a template's operand.
>
> opcodes/
> 2020-03-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-opc.tbl (vcvtdq2pd, vcvtps2pd, vcvtudq2pd, vcvtps2ph,
>         vcvtps2qq, vcvtps2uqq, vcvttps2qq, vcvttps2uqq): Fold separate
>         register and memory source templates. Replace VexW= by VexW*
>         where applicable.
>         * i386-tbl.h: Re-generate.
>

OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode
  2020-03-04 11:55       ` H.J. Lu
@ 2020-03-04 12:58         ` Jan Beulich
  2020-03-04 13:26           ` H.J. Lu
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-04 12:58 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 04.03.2020 12:54, H.J. Lu wrote:
> On Wed, Mar 4, 2020 at 3:50 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 04.03.2020 12:45, H.J. Lu wrote:
>>> On Wed, Mar 4, 2020 at 1:38 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> --- a/gas/config/tc-i386.c
>>>> +++ b/gas/config/tc-i386.c
>>>> @@ -10297,6 +10297,21 @@ i386_addressing_mode (void)
>>>>
>>>>    if (i.prefix[ADDR_PREFIX])
>>>>      addr_mode = flag_code == CODE_32BIT ? CODE_16BIT : CODE_32BIT;
>>>> +  else if (flag_code == CODE_16BIT
>>>> +          && current_templates->start->cpu_flags.bitfield.cpumpx
>>>> +          /* Avoid replacing the "16-bit addressing not allowed" diagnostic
>>>> +             from md_assemble() by "is not a valid base/index expression"
>>>> +             when there is a base and/or index.  */
>>>> +          && !i.types[this_operand].bitfield.baseindex)
>>>> +    {
>>>> +      /* MPX insn memory operands with neither base nor index must be forced
>>>> +        to use 32-bit addressing in 16-bit mode.  */
>>>> +      addr_mode = CODE_32BIT;
>>>> +      i.prefix[ADDR_PREFIX] = ADDR_PREFIX_OPCODE;
>>>> +      ++i.prefixes;
>>>> +      gas_assert (!i.types[this_operand].bitfield.disp16);
>>>> +      gas_assert (!i.types[this_operand].bitfield.disp32);
>>>> +    }
>>>>    else
>>>>      {
>>>
>>> Since MPX isn't available in 16-bit mode, should they be disallowed?
>>
>> How is it not available? As per my understanding, one just needs
>> to use 32-bit addressing.
> 
> 0x67 prefix is special for MPX.  It can't be used as address prefix on MPX
> instructions.

It not only can, but is required to be in 16-bit mode. Let me quote
BNDMK's SDM page:

Protected Mode Exceptions
#UD If the LOCK prefix is used.
If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.
If 67H prefix is not used and CS.D=0.
If 67H prefix is used and CS.D=1.

Real-Address Mode Exceptions
#UD If the LOCK prefix is used.
If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.
If 16-bit addressing is used.

Virtual-8086 Mode Exceptions
#UD If the LOCK prefix is used.
If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.
If 16-bit addressing is used.

It is quite clear to me from this that (a) MPX is allowed
in 16-bit mode (and even in all forms of it, other than
e.g. VEX/EVEX-encoded insns) and (b) the 67 prefix acts
as a normal address size override there. Its use simply is
mandatory in 16-bit mode.
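
For illustration only (this is not code from the patch or from gas), the
rule in the exception lists quoted above can be sketched in C. The helper
names effective_addr_bits() and mpx_mem_operand_is_ud() are hypothetical,
and the sketch assumes a legacy code segment whose default address size
is 16 bits (CS.D=0, real and virtual-8086 mode) or 32 bits (CS.D=1):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not gas code): effective address size for a
   legacy code segment whose default size is 16 (CS.D=0) or 32 (CS.D=1)
   bits.  The 67H prefix selects the other size.  */
static int
effective_addr_bits (int default_bits, bool prefix_67)
{
  assert (default_bits == 16 || default_bits == 32);
  if (!prefix_67)
    return default_bits;
  return default_bits == 16 ? 32 : 16;
}

/* Per the exception lists quoted above, an MPX insn with a memory
   operand raises #UD exactly when 16-bit addressing would be used.  */
static bool
mpx_mem_operand_is_ud (int default_bits, bool prefix_67)
{
  return effective_addr_bits (default_bits, prefix_67) == 16;
}
```

With a 16-bit default the only non-faulting combination is the one with
the 67H prefix present, which is why the assembler has to emit it.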

>>> Given that MPX has been deprecated, I prefer an error here.
>>
>> The use of "here" is confusing - just for the broken case (no
>> base/index), or for MPX insns in general? (Asking just in case
>> my understanding expressed above is wrong.)
> 
> flag_code == CODE_16BIT && current_templates->start->cpu_flags.bitfield.cpumpx
> 
> should be an error.

As per above, I see no reason for such behavior.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode
  2020-03-04 12:58         ` Jan Beulich
@ 2020-03-04 13:26           ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-04 13:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Mar 4, 2020 at 4:58 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 04.03.2020 12:54, H.J. Lu wrote:
> > On Wed, Mar 4, 2020 at 3:50 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 04.03.2020 12:45, H.J. Lu wrote:
> >>> On Wed, Mar 4, 2020 at 1:38 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> --- a/gas/config/tc-i386.c
> >>>> +++ b/gas/config/tc-i386.c
> >>>> @@ -10297,6 +10297,21 @@ i386_addressing_mode (void)
> >>>>
> >>>>    if (i.prefix[ADDR_PREFIX])
> >>>>      addr_mode = flag_code == CODE_32BIT ? CODE_16BIT : CODE_32BIT;
> >>>> +  else if (flag_code == CODE_16BIT
> >>>> +          && current_templates->start->cpu_flags.bitfield.cpumpx
> >>>> +          /* Avoid replacing the "16-bit addressing not allowed" diagnostic
> >>>> +             from md_assemble() by "is not a valid base/index expression"
> >>>> +             when there is a base and/or index.  */
> >>>> +          && !i.types[this_operand].bitfield.baseindex)
> >>>> +    {
> >>>> +      /* MPX insn memory operands with neither base nor index must be forced
> >>>> +        to use 32-bit addressing in 16-bit mode.  */
> >>>> +      addr_mode = CODE_32BIT;
> >>>> +      i.prefix[ADDR_PREFIX] = ADDR_PREFIX_OPCODE;
> >>>> +      ++i.prefixes;
> >>>> +      gas_assert (!i.types[this_operand].bitfield.disp16);
> >>>> +      gas_assert (!i.types[this_operand].bitfield.disp32);
> >>>> +    }
> >>>>    else
> >>>>      {
> >>>
> >>> Since MPX isn't available in 16-bit mode, should they be disallowed?
> >>
> >> How is it not available? As per my understanding, one just needs
> >> to use 32-bit addressing.
> >
> > 0x67 prefix is special for MPX.  It can't be used as address prefix on MPX
> > instructions.
>
> It not only can, but is required to be in 16-bit mode. Let me quote
> BNDMK's SDM page:
>
> Protected Mode Exceptions
> #UD If the LOCK prefix is used.
> If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.
> If 67H prefix is not used and CS.D=0.
> If 67H prefix is used and CS.D=1.
>
> Real-Address Mode Exceptions
> #UD If the LOCK prefix is used.
> If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.
> If 16-bit addressing is used.
>
> Virtual-8086 Mode Exceptions
> #UD If the LOCK prefix is used.
> If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.
> If 16-bit addressing is used.
>
> It is quite clear to me from this that (a) MPX is allowed
> in 16-bit mode (and even in all forms of it, other than
> e.g. VEX/EVEX-encoded insns) and (b) the 67 prefix acts
> as a normal address size override there. Its use simply is
> mandatory in 16-bit mode.
>
> >>> Given that MPX has been deprecated, I prefer an error here.
> >>
> >> The use of "here" is confusing - just for the broken case (no
> >> base/index), or for MPX insns in general? (Asking just in case
> >> my understanding expressed above is wrong.)
> >
> > flag_code == CODE_16BIT && current_templates->start->cpu_flags.bitfield.cpumpx
> >
> > should be an error.
>
> As per above, I see no reason for such behavior.
>

Patch is OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v1.1 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
                   ` (8 preceding siblings ...)
  2020-03-04 10:19 ` [PATCH 5/9] x86: replace NoRex64 on VEX-encoded insns Jan Beulich
@ 2020-03-05  8:07 ` Jan Beulich
  9 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2020-03-05  8:07 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

Allowing 64-bit registers is misleading here: Elsewhere these get allowed
when there's no difference between either variant, because of 32-bit
destination registers having their upper halves zeroed in 64-bit mode.
Here, however, they're source registers, and hence specifying 64-bit
registers would lead to the ambiguity of whether the upper 32 bits
actually matter.

Additionally, for proper code generation in 16-bit mode, IgnoreSize is
needed on both.

And finally, just like for e.g. MONITOR/MWAIT, add variants with all
input registers explicitly specified.

gas/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (md_assemble): Also exclude tpause and umwait
	from having their operands swapped.
	* testsuite/gas/i386/waitpkg.s,
	testsuite/gas/i386/x86-64-waitpkg.s: Add tpause and umwait
	3-operand cases as well as testing of 16-bit code generation.
	* testsuite/gas/i386/waitpkg.d,
	testsuite/gas/i386/waitpkg-intel.d,
	testsuite/gas/i386/x86-64-waitpkg.d,
	testsuite/gas/i386/x86-64-waitpkg-intel.d: Adjust expectations.

opcodes/
2020-03-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-opc.tbl (tpause, umwait): Add IgnoreSize. Add 3-operand
	template.
	* i386-tbl.h: Re-generate.
---
v1.1: Move 16-bit WaitPKG testing here (from "x86: add missing
      IgnoreSize"). (I'm not going to re-send the other patches of
      this series.)
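
      For reading the adjusted expectations below, here is a rough C
      sketch (illustrative only, not part of the patch) of how the
      byte sequences in the .d files are formed. encode_waitpkg() is a
      hypothetical helper, not a gas function; it assumes the operand
      is given as a plain register number, with 8..15 meaningful only
      in 64-bit mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder for the register operand of tpause / umwait,
   for illustration only.  PREFIX is 0x66 for tpause or 0xf2 for umwait;
   REG is the 32-bit register number (0 = eax ... 7 = edi, 8..15 =
   r8d..r15d).  Returns the number of bytes written to OUT, which must
   have room for 5.  */
static size_t
encode_waitpkg (uint8_t prefix, unsigned reg, uint8_t *out)
{
  size_t n = 0;

  assert (prefix == 0x66 || prefix == 0xf2);
  assert (reg < 16);

  out[n++] = prefix;		/* Mandatory prefix, part of the opcode.  */
  if (reg >= 8)
    out[n++] = 0x41;		/* REX.B extends ModRM.rm.  */
  out[n++] = 0x0f;
  out[n++] = 0xae;
  out[n++] = 0xc0 | (6 << 3) | (reg & 7);	/* mod=3, reg=/6, rm=REG.  */
  return n;
}
```

      E.g. umwait %ecx comes out as f2 0f ae f1 and tpause %r10d as
      66 41 0f ae f2, matching the expectations. The second .rept
      iteration in waitpkg.s runs under .code16 and expects the same
      bytes, i.e. no additional operand size prefix.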

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4328,16 +4328,19 @@ md_assemble (char *line)
   /* Now we've parsed the mnemonic into a set of templates, and have the
      operands at hand.  */
 
-  /* All Intel opcodes have reversed operands except for "bound", "enter"
-     "monitor*", and "mwait*".  We also don't reverse intersegment "jmp"
-     and "call" instructions with 2 immediate operands so that the immediate
-     segment precedes the offset, as it does when in AT&T mode. */
+  /* All Intel opcodes have reversed operands except for "bound", "enter",
+     "monitor*", "mwait*", "tpause", and "umwait".  We also don't reverse
+     intersegment "jmp" and "call" instructions with 2 immediate operands so
+     that the immediate segment precedes the offset, as it does when in AT&T
+     mode.  */
   if (intel_syntax
       && i.operands > 1
       && (strcmp (mnemonic, "bound") != 0)
       && (strcmp (mnemonic, "invlpga") != 0)
       && (strncmp (mnemonic, "monitor", 7) != 0)
       && (strncmp (mnemonic, "mwait", 5) != 0)
+      && (strcmp (mnemonic, "tpause") != 0)
+      && (strcmp (mnemonic, "umwait") != 0)
       && !(operand_type_check (i.types[0], imm)
 	   && operand_type_check (i.types[1], imm)))
     swap_operands ();
--- a/gas/testsuite/gas/i386/waitpkg-intel.d
+++ b/gas/testsuite/gas/i386/waitpkg-intel.d
@@ -12,5 +12,17 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*f3 0f ae f0[ 	]*umonitor eax
 [ 	]*[a-f0-9]+:[ 	]*67 f3 0f ae f1[ 	]*umonitor cx
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f3[ 	]*umwait ebx
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause ecx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause ebx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait edi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause edi
+[ 	]*[a-f0-9]+:[ 	]*67 f3 0f ae f0[ 	]*umonitor ax
+[ 	]*[a-f0-9]+:[ 	]*f3 0f ae f1[ 	]*umonitor ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f3[ 	]*umwait ebx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause ecx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause ebx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait edi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause edi
 #pass
--- a/gas/testsuite/gas/i386/waitpkg.d
+++ b/gas/testsuite/gas/i386/waitpkg.d
@@ -12,5 +12,17 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*f3 0f ae f0[ 	]*umonitor %eax
 [ 	]*[a-f0-9]+:[ 	]*67 f3 0f ae f1[ 	]*umonitor %cx
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait %ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f3[ 	]*umwait %ebx
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause %ecx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause %ebx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait %edi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause %edi
+[ 	]*[a-f0-9]+:[ 	]*67 f3 0f ae f0[ 	]*umonitor %ax
+[ 	]*[a-f0-9]+:[ 	]*f3 0f ae f1[ 	]*umonitor %ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait %ecx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f3[ 	]*umwait %ebx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause %ecx
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f3[ 	]*tpause %ebx
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait %edi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause %edi
 #pass
--- a/gas/testsuite/gas/i386/waitpkg.s
+++ b/gas/testsuite/gas/i386/waitpkg.s
@@ -2,7 +2,19 @@
 
 	.text
 _start:
+	.rept 2
 	umonitor %eax
 	umonitor %cx
 	umwait %ecx
+	umwait %ebx, %edx, %eax
 	tpause %ecx
+	tpause %ebx, %edx, %eax
+
+	.intel_syntax noprefix
+
+	umwait edi, edx, eax
+	tpause edi, edx, eax
+
+	.att_syntax prefix
+	.code16
+	.endr
--- a/gas/testsuite/gas/i386/x86-64-waitpkg-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-waitpkg-intel.d
@@ -13,11 +13,11 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*f3 41 0f ae f2[ 	]*umonitor r10
 [ 	]*[a-f0-9]+:[ 	]*67 f3 41 0f ae f2[ 	]*umonitor r10d
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait ecx
-[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait ecx
-[ 	]*[a-f0-9]+:[ 	]*f2 41 0f ae f2[ 	]*umwait r10d
 [ 	]*[a-f0-9]+:[ 	]*f2 41 0f ae f2[ 	]*umwait r10d
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait edi
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause ecx
-[ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause ecx
-[ 	]*[a-f0-9]+:[ 	]*66 41 0f ae f2[ 	]*tpause r10d
 [ 	]*[a-f0-9]+:[ 	]*66 41 0f ae f2[ 	]*tpause r10d
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause edi
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f6[ 	]*umwait esi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f6[ 	]*tpause esi
 #pass
--- a/gas/testsuite/gas/i386/x86-64-waitpkg.d
+++ b/gas/testsuite/gas/i386/x86-64-waitpkg.d
@@ -13,11 +13,11 @@ Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*f3 41 0f ae f2[ 	]*umonitor %r10
 [ 	]*[a-f0-9]+:[ 	]*67 f3 41 0f ae f2[ 	]*umonitor %r10d
 [ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait %ecx
-[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f1[ 	]*umwait %ecx
-[ 	]*[a-f0-9]+:[ 	]*f2 41 0f ae f2[ 	]*umwait %r10d
 [ 	]*[a-f0-9]+:[ 	]*f2 41 0f ae f2[ 	]*umwait %r10d
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f7[ 	]*umwait %edi
 [ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause %ecx
-[ 	]*[a-f0-9]+:[ 	]*66 0f ae f1[ 	]*tpause %ecx
-[ 	]*[a-f0-9]+:[ 	]*66 41 0f ae f2[ 	]*tpause %r10d
 [ 	]*[a-f0-9]+:[ 	]*66 41 0f ae f2[ 	]*tpause %r10d
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f7[ 	]*tpause %edi
+[ 	]*[a-f0-9]+:[ 	]*f2 0f ae f6[ 	]*umwait %esi
+[ 	]*[a-f0-9]+:[ 	]*66 0f ae f6[ 	]*tpause %esi
 #pass
--- a/gas/testsuite/gas/i386/x86-64-waitpkg.s
+++ b/gas/testsuite/gas/i386/x86-64-waitpkg.s
@@ -6,10 +6,13 @@ _start:
 	umonitor %r10
 	umonitor %r10d
 	umwait %ecx
-	umwait %rcx
-	umwait %r10
 	umwait %r10d
+	umwait %edi, %edx, %eax
 	tpause %ecx
-	tpause %rcx
-	tpause %r10
 	tpause %r10d
+	tpause %edi, %edx, %eax
+
+	.intel_syntax noprefix
+
+	umwait esi, edx, eax
+	tpause esi, edx, eax
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -4763,10 +4763,10 @@ pconfig, 0, 0x0f01c5, None, 3, CpuPCONFI
 // WAITPKG instructions.
 
 umonitor, 1, 0xf30fae, 0x6, 2, CpuWAITPKG, Modrm|AddrPrefixOpReg, { Reg16|Reg32|Reg64 }
-
-tpause, 1, 0x660fae, 0x6, 2, CpuWAITPKG, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Reg32|Reg64 }
-
-umwait, 1, 0xf20fae, 0x6, 2, CpuWAITPKG, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Reg32|Reg64 }
+tpause, 1, 0x660fae, 0x6, 2, CpuWAITPKG, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32 }
+tpause, 3, 0x660fae, 0x6, 2, CpuWAITPKG, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32, RegD|Dword, Acc|Dword }
+umwait, 1, 0xf20fae, 0x6, 2, CpuWAITPKG, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32 }
+umwait, 3, 0xf20fae, 0x6, 2, CpuWAITPKG, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32, RegD|Dword, Acc|Dword }
 
 // WAITPKG instructions end.
 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-04 11:44       ` H.J. Lu
@ 2020-03-05  8:08         ` Jan Beulich
  2020-03-05 14:05           ` H.J. Lu
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-05  8:08 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 04.03.2020 12:44, H.J. Lu wrote:
> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 04.03.2020 12:36, H.J. Lu wrote:
>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
>>>> when there's no difference between either variant, because of 32-bit
>>>> destination registers having their upper halves zeroed in 64-bit mode.
>>>> Here, however, they're source registers, and hence specifying 64-bit
>>>> registers would lead to the ambiguity of whether the upper 32 bits
>>>> actually matter.
>>>>
>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
>>>> needed on both.
>>>
>>> Are there testcases to show IgnoreSize is needed on them?
>>
>> The situation with 16-bit test cases is rather poor anyway. I didn't
>> consider it reasonable to add such very special ones when far more
>> general ones don't exist. But if your question is to mean you demand
> 
> Let's start from somewhere.
> 
>> such to be added, then I'll (somewhat hesitantly) add/extend some.
>> Please clarify.
> 
> Please add testcases.

Actually they were there, in patch 2. I've moved them to this patch
and have just sent v1.1 for just this one patch.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05  8:08         ` Jan Beulich
@ 2020-03-05 14:05           ` H.J. Lu
  2020-03-05 14:08             ` Jan Beulich
  2020-03-05 15:22             ` Jan Beulich
  0 siblings, 2 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-05 14:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 04.03.2020 12:44, H.J. Lu wrote:
> > On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 04.03.2020 12:36, H.J. Lu wrote:
> >>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
> >>>> when there's no difference between either variant, because of 32-bit
> >>>> destination registers having their upper halves zeroed in 64-bit mode.
> >>>> Here, however, they're source registers, and hence specifying 64-bit
> >>>> registers would lead to the ambiguity of whether the upper 32 bits
> >>>> actually matter.
> >>>>
> >>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
> >>>> needed on both.
> >>>
> >>> Are there testcases to show IgnoreSize is needed on them?
> >>
> >> The situation with 16-bit test cases is rather poor anyway. I didn't
> >> consider it reasonable to add such very special ones when far more
> >> general ones don't exist. But if your question is to mean you demand
> >
> > Let's start from somewhere.
> >
> >> such to be added, then I'll (somewhat hesitantly) add/extend some.
> >> Please clarify.
> >
> > Please add testcases.
>
> Actually they were there, in patch 2. I've moved them to this patch
> and have just sent v1.1 for just this one patch.

Do we need to adjust disassembler for 16-bit mode?

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 14:05           ` H.J. Lu
@ 2020-03-05 14:08             ` Jan Beulich
  2020-03-05 14:38               ` H.J. Lu
  2020-03-05 15:22             ` Jan Beulich
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-05 14:08 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 05.03.2020 15:04, H.J. Lu wrote:
> On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 04.03.2020 12:44, H.J. Lu wrote:
>>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 04.03.2020 12:36, H.J. Lu wrote:
>>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
>>>>>> when there's no difference between either variant, because of 32-bit
>>>>>> destination registers having their upper halves zeroed in 64-bit mode.
>>>>>> Here, however, they're source registers, and hence specifying 64-bit
>>>>>> registers would lead to the ambiguity of whether the upper 32 bits
>>>>>> actually matter.
>>>>>>
>>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
>>>>>> needed on both.
>>>>>
>>>>> Are there testcases to show IgnoreSize is needed on them?
>>>>
>>>> The situation with 16-bit test cases is rather poor anyway. I didn't
>>>> consider it reasonable to add such very special ones when far more
>>>> general ones don't exist. But if your question is to mean you demand
>>>
>>> Let's start from somewhere.
>>>
>>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
>>>> Please clarify.
>>>
>>> Please add testcases.
>>
>> Actually they were there, in patch 2. I've moved them to this patch
>> and have just sent v1.1 for just this one patch.
> 
> Do we need to adjust disassembler for 16-bit mode?

I haven't checked, but (a) this would be an independent change
and (b) it would likely affect more than just the insns here
(see e.g. other parts of the series).

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 14:08             ` Jan Beulich
@ 2020-03-05 14:38               ` H.J. Lu
  2020-03-05 14:51                 ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: H.J. Lu @ 2020-03-05 14:38 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Thu, Mar 5, 2020 at 6:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2020 15:04, H.J. Lu wrote:
> > On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 04.03.2020 12:44, H.J. Lu wrote:
> >>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> On 04.03.2020 12:36, H.J. Lu wrote:
> >>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
> >>>>>> when there's no difference between either variant, because of 32-bit
> >>>>>> destination registers having their upper halves zeroed in 64-bit mode.
> >>>>>> Here, however, they're source registers, and hence specifying 64-bit
> >>>>>> registers would lead to the ambiguity of whether the upper 32 bits
> >>>>>> actually matter.
> >>>>>>
> >>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
> >>>>>> needed on both.
> >>>>>
> >>>>> Are there testcases to show IgnoreSize is needed on them?
> >>>>
> >>>> The situation with 16-bit test cases is rather poor anyway. I didn't
> >>>> consider it reasonable to add such very special ones when far more
> >>>> general ones don't exist. But if your question is to mean you demand
> >>>
> >>> Let's start from somewhere.
> >>>
> >>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
> >>>> Please clarify.
> >>>
> >>> Please add testcases.
> >>
> >> Actually they were there, in patch 2. I've moved them to this patch
> >> and have just sent v1.1 for just this one patch.
> >
> > Do we need to adjust disassembler for 16-bit mode?
>
> I haven't checked, but (a) this would be an independent change
> and (b) it would likely affect more than just the insns here
> (see e.g. other parts of the series).
>

Disassembler should be adjusted when IgnoreSize is added
to tpause and umwait.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 14:38               ` H.J. Lu
@ 2020-03-05 14:51                 ` Jan Beulich
  2020-03-05 14:54                   ` H.J. Lu
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-05 14:51 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 05.03.2020 15:37, H.J. Lu wrote:
> On Thu, Mar 5, 2020 at 6:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 05.03.2020 15:04, H.J. Lu wrote:
>>> On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 04.03.2020 12:44, H.J. Lu wrote:
>>>>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 04.03.2020 12:36, H.J. Lu wrote:
>>>>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
>>>>>>>> when there's no difference between either variant, because of 32-bit
>>>>>>>> destination registers having their upper halves zeroed in 64-bit mode.
>>>>>>>> Here, however, they're source registers, and hence specifying 64-bit
>>>>>>>> registers would lead to the ambiguity of whether the upper 32 bits
>>>>>>>> actually matter.
>>>>>>>>
>>>>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
>>>>>>>> needed on both.
>>>>>>>
>>>>>>> Are there testcases to show IgnoreSize is needed on them?
>>>>>>
>>>>>> The situation with 16-bit test cases is rather poor anyway. I didn't
>>>>>> consider it reasonable to add such very special ones when far more
>>>>>> general ones don't exist. But if your question is to mean you demand
>>>>>
>>>>> Let's start from somewhere.
>>>>>
>>>>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
>>>>>> Please clarify.
>>>>>
>>>>> Please add testcases.
>>>>
>>>> Actually they were there, in patch 2. I've moved them to this patch
>>>> and have just sent v1.1 for just this one patch.
>>>
>>> Do we need to adjust disassembler for 16-bit mode?
>>
>> I haven't checked, but (a) this would be an independent change
>> and (b) it would likely affect more than just the insns here
>> (see e.g. other parts of the series).
> 
> Disassembler should be adjusted when IgnoreSize is added
> to tpause and umwait.

But IgnoreSize has nothing to do with the disassembler. Right
now all I'm after is getting the assembler to produce correct
code. There are test cases to verify this is the case. Whether
the disassembler deals with any of this correctly (including
also the insns getting IgnoreSize added by "x86: add missing
IgnoreSize", which you've already approved) is an entirely
separate topic, which I may be willing to look into down the
road, but not now and here.

More generally, when anyone fixes one bug, why should they get
penalized to have their bug fix accepted only when they take
the - perhaps significant amount of - time to also fix another,
unrelated bug? In the case here, it is definitely not my fault
if 16-bit handling is in a bad state. And I can't afford fixing
all issues there are in one go.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 14:51                 ` Jan Beulich
@ 2020-03-05 14:54                   ` H.J. Lu
  2020-03-05 15:16                     ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: H.J. Lu @ 2020-03-05 14:54 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Thu, Mar 5, 2020 at 6:51 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2020 15:37, H.J. Lu wrote:
> > On Thu, Mar 5, 2020 at 6:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 05.03.2020 15:04, H.J. Lu wrote:
> >>> On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 04.03.2020 12:44, H.J. Lu wrote:
> >>>>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>> On 04.03.2020 12:36, H.J. Lu wrote:
> >>>>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
> >>>>>>>> when there's no difference between either variant, because of 32-bit
> >>>>>>>> destination registers having their upper halves zeroed in 64-bit mode.
> >>>>>>>> Here, however, they're source registers, and hence specifying 64-bit
> >>>>>>>> registers would lead to the ambiguity of whether the upper 32 bits
> >>>>>>>> actually matter.
> >>>>>>>>
> >>>>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
> >>>>>>>> needed on both.
> >>>>>>>
> >>>>>>> Are there testcases to show IgnoreSize is needed on them?
> >>>>>>
> >>>>>> The situation with 16-bit test cases is rather poor anyway. I didn't
> >>>>>> consider it reasonable to add such very special ones when far more
> >>>>>> general ones don't exist. But if your question is to mean you demand
> >>>>>
> >>>>> Let's start from somewhere.
> >>>>>
> >>>>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
> >>>>>> Please clarify.
> >>>>>
> >>>>> Please add testcases.
> >>>>
> >>>> Actually they were there, in patch 2. I've moved them to this patch
> >>>> and have just sent v1.1 for just this one patch.
> >>>
> >>> Do we need to adjust disassembler for 16-bit mode?
> >>
> >> I haven't checked, but (a) this would be an independent change
> >> and (b) it would likely affect more than just the insns here
> >> (see e.g. other parts of the series).
> >
> > Disassembler should be adjusted when IgnoreSize is added
> > to tpause and umwait.
>
> But IgnoreSize has nothing to do with the disassembler. Right
> now all I'm after is getting the assembler to produce correct
> code. There are test cases to verify this is the case. Whether
> the disassembler deals with any of this correctly (including
> also the insns getting IgnoreSize added by "x86: add missing
> IgnoreSize", which you've already approved) is an entirely
> separate topic, which I may be willing to look into down the
> road, but not now and here.
>
> More generally, when anyone fixes one bug, why should they get
> penalized to have their bug fix accepted only when they take
> the - perhaps significant amount of - time to also fix another,
> unrelated bug? In the case here, it is definitely not my fault
> if 16-bit handling is in a bad state. And I can't afford fixing
> all issues there are in one go.
>

Then don't add IgnoreSize to tpause and umwait for now.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 14:54                   ` H.J. Lu
@ 2020-03-05 15:16                     ` Jan Beulich
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2020-03-05 15:16 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 05.03.2020 15:53, H.J. Lu wrote:
> On Thu, Mar 5, 2020 at 6:51 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 05.03.2020 15:37, H.J. Lu wrote:
>>> On Thu, Mar 5, 2020 at 6:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 05.03.2020 15:04, H.J. Lu wrote:
>>>>> On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>
>>>>>> On 04.03.2020 12:44, H.J. Lu wrote:
>>>>>>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>> On 04.03.2020 12:36, H.J. Lu wrote:
>>>>>>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
>>>>>>>>>> when there's no difference between either variant, because of 32-bit
>>>>>>>>>> destination registers having their upper halves zeroed in 64-bit mode.
>>>>>>>>>> Here, however, they're source registers, and hence specifying 64-bit
>>>>>>>>>> registers would lead to the ambiguity of whether the upper 32 bits
>>>>>>>>>> actually matter.
>>>>>>>>>>
>>>>>>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
>>>>>>>>>> needed on both.
>>>>>>>>>
>>>>>>>>> Are there testcases to show IgnoreSize is needed on them?
>>>>>>>>
>>>>>>>> The situation with 16-bit test cases is rather poor anyway. I didn't
>>>>>>>> consider it reasonable to add such very special ones when far more
>>>>>>>> general ones don't exist. But if your question is to mean you demand
>>>>>>>
>>>>>>> Let's start from somewhere.
>>>>>>>
>>>>>>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
>>>>>>>> Please clarify.
>>>>>>>
>>>>>>> Please add testcases.
>>>>>>
>>>>>> Actually they were there, in patch 2. I've moved them to this patch
>>>>>> and have just sent v1.1 for just this one patch.
>>>>>
>>>>> Do we need to adjust disassembler for 16-bit mode?
>>>>
>>>> I haven't checked, but (a) this would be an independent change
>>>> and (b) it would likely affect more than just the insns here
>>>> (see e.g. other parts of the series).
>>>
>>> Disassembler should be adjusted when IgnoreSize is added
>>> to tpause and umwait.
>>
>> But IgnoreSize has nothing to do with the disassembler. Right
>> now all I'm after is getting the assembler to produce correct
>> code. There are test cases to verify this is the case. Whether
>> the disassembler deals with any of this correctly (including
>> also the insns getting IgnoreSize added by "x86: add missing
>> IgnoreSize", which you've already approved) is an entirely
>> separate topic, which I may be willing to look into down the
>> road, but not now and here.
>>
>> More generally, when anyone fixes one bug, why should they get
>> penalized to have their bug fix accepted only when they take
>> the - perhaps significant amount of - time to also fix another,
>> unrelated bug? In the case here, it is definitely not my fault
>> if 16-bit handling is in a bad state. And I can't afford fixing
>> all issues there are in one go.
> 
> Then don't add IgnoreSize to tpause and umwait for now.

You're kidding. Once again - where is the connection between
adding IgnoreSize and disassembler behavior? I want code to
be generated correctly, no more and no less.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 14:05           ` H.J. Lu
  2020-03-05 14:08             ` Jan Beulich
@ 2020-03-05 15:22             ` Jan Beulich
  2020-03-05 15:37               ` H.J. Lu
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-05 15:22 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 05.03.2020 15:04, H.J. Lu wrote:
> On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 04.03.2020 12:44, H.J. Lu wrote:
>>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 04.03.2020 12:36, H.J. Lu wrote:
>>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
>>>>>> when there's no difference between either variant, because of 32-bit
>>>>>> destination registers having their upper halves zeroed in 64-bit mode.
>>>>>> Here, however, they're source registers, and hence specifying 64-bit
>>>>>> registers would lead to the ambiguity of whether the upper 32 bits
>>>>>> actually matter.
>>>>>>
>>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
>>>>>> needed on both.
>>>>>
>>>>> Are there testcases to show IgnoreSize is needed on them?
>>>>
>>>> The situation with 16-bit test cases is rather poor anyway. I didn't
>>>> consider it reasonable to add such very special ones when far more
>>>> general ones don't exist. But if your question is to mean you demand
>>>
>>> Let's start from somewhere.
>>>
>>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
>>>> Please clarify.
>>>
>>> Please add testcases.
>>
>> Actually they were there, in patch 2. I've moved them to this patch
>> and have just sent v1.1 for just this one patch.
> 
> Do we need to adjust disassembler for 16-bit mode?

I've checked now - no, we don't.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 15:22             ` Jan Beulich
@ 2020-03-05 15:37               ` H.J. Lu
  2020-03-05 15:42                 ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: H.J. Lu @ 2020-03-05 15:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Thu, Mar 5, 2020 at 7:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2020 15:04, H.J. Lu wrote:
> > On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 04.03.2020 12:44, H.J. Lu wrote:
> >>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> On 04.03.2020 12:36, H.J. Lu wrote:
> >>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
> >>>>>> when there's no difference between either variant, because of 32-bit
> >>>>>> destination registers having their upper halves zeroed in 64-bit mode.
> >>>>>> Here, however, they're source registers, and hence specifying 64-bit
> >>>>>> registers would lead to the ambiguity of whether the upper 32 bits
> >>>>>> actually matter.
> >>>>>>
> >>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
> >>>>>> needed on both.
> >>>>>
> >>>>> Are there testcases to show IgnoreSize is needed on them?
> >>>>
> >>>> The situation with 16-bit test cases is rather poor anyway. I didn't
> >>>> consider it reasonable to add such very special ones when far more
> >>>> general ones don't exist. But if your question is to mean you demand
> >>>
> >>> Let's start from somewhere.
> >>>
> >>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
> >>>> Please clarify.
> >>>
> >>> Please add testcases.
> >>
> >> Actually they were there, in patch 2. I've moved them to this patch
> >> and have just sent v1.1 for just this one patch.
> >
> > Do we need to adjust disassembler for 16-bit mode?
>
> I've checked now - no, we don't.
>

So disassembler has been correct.  Is the bug only in assembler?
What is the difference in encoding before and after your change?

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 15:37               ` H.J. Lu
@ 2020-03-05 15:42                 ` Jan Beulich
  2020-03-05 16:00                   ` H.J. Lu
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2020-03-05 15:42 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 05.03.2020 16:37, H.J. Lu wrote:
> On Thu, Mar 5, 2020 at 7:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 05.03.2020 15:04, H.J. Lu wrote:
>>> On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 04.03.2020 12:44, H.J. Lu wrote:
>>>>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 04.03.2020 12:36, H.J. Lu wrote:
>>>>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
>>>>>>>> when there's no difference between either variant, because of 32-bit
>>>>>>>> destination registers having their upper halves zeroed in 64-bit mode.
>>>>>>>> Here, however, they're source registers, and hence specifying 64-bit
>>>>>>>> registers would lead to the ambiguity of whether the upper 32 bits
>>>>>>>> actually matter.
>>>>>>>>
>>>>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
>>>>>>>> needed on both.
>>>>>>>
>>>>>>> Are there testcases to show IgnoreSize is needed on them?
>>>>>>
>>>>>> The situation with 16-bit test cases is rather poor anyway. I didn't
>>>>>> consider it reasonable to add such very special ones when far more
>>>>>> general ones don't exist. But if your question is to mean you demand
>>>>>
>>>>> Let's start from somewhere.
>>>>>
>>>>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
>>>>>> Please clarify.
>>>>>
>>>>> Please add testcases.
>>>>
>>>> Actually they were there, in patch 2. I've moved them to this patch
>>>> and have just sent v1.1 for just this one patch.
>>>
>>> Do we need to adjust disassembler for 16-bit mode?
>>
>> I've checked now - no, we don't.
>>
> 
> So disassembler has been correct.  Is the bug only in assembler?
> What is the difference in encoding before and after your change?

UMWAIT gets a stray 66 prefix emitted, and TPAUSE fails to
assemble ("same type of prefix used twice").
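
To make the two failure modes concrete, here is a toy model in Python. It
is a sketch under stated assumptions, not gas's actual logic: the
function name encode(), the ignore_size flag, and the placement of the
stray prefix ahead of the mandatory one are illustrative only. The
opcode bytes are the documented encodings (TPAUSE is 66 0F AE /6,
UMWAIT is F2 0F AE /6).

```python
# Toy model only; this is NOT how gas is implemented.
# It illustrates why a 32-bit register operand in 16-bit code is a problem
# when the template lacks IgnoreSize.

OPSIZE_PREFIX = 0x66

# mnemonic -> (mandatory prefix, opcode bytes, ModRM reg-field extension)
INSNS = {
    "tpause": (0x66, (0x0F, 0xAE), 6),
    "umwait": (0xF2, (0x0F, 0xAE), 6),
}


def encode(mnemonic, reg_num, code_bits, ignore_size):
    """Encode `mnemonic %r32` (hypothetical helper, register given by number)."""
    mandatory, opcode, ext = INSNS[mnemonic]
    prefixes = [mandatory]

    # Without IgnoreSize, a 32-bit register in 16-bit code is taken as a
    # request for an operand-size override.
    if code_bits == 16 and not ignore_size:
        if OPSIZE_PREFIX in prefixes:
            # TPAUSE: 66 is already present as the mandatory prefix.
            raise ValueError("same type of prefix used twice")
        # UMWAIT: a stray 66 ends up in the output (position is an assumption).
        prefixes.insert(0, OPSIZE_PREFIX)

    # Register-direct ModRM: mod=11, reg=opcode extension, rm=register number.
    modrm = 0xC0 | (ext << 3) | (reg_num & 7)
    return bytes(prefixes) + bytes(opcode) + bytes([modrm])
```

With ignore_size=True both insns encode identically in 16-bit and 32-bit
code, which is the outcome the patch is after; with ignore_size=False the
model reproduces the stray prefix on UMWAIT and the error on TPAUSE.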

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/9] x86: refine TPAUSE and UMWAIT
  2020-03-05 15:42                 ` Jan Beulich
@ 2020-03-05 16:00                   ` H.J. Lu
  0 siblings, 0 replies; 37+ messages in thread
From: H.J. Lu @ 2020-03-05 16:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Thu, Mar 5, 2020 at 7:42 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.03.2020 16:37, H.J. Lu wrote:
> > On Thu, Mar 5, 2020 at 7:22 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 05.03.2020 15:04, H.J. Lu wrote:
> >>> On Thu, Mar 5, 2020 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 04.03.2020 12:44, H.J. Lu wrote:
> >>>>> On Wed, Mar 4, 2020 at 3:40 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>> On 04.03.2020 12:36, H.J. Lu wrote:
> >>>>>>> On Wed, Mar 4, 2020 at 1:37 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>> Allowing 64-bit registers is misleading here: Elsewhere these get allowed
> >>>>>>>> when there's no difference between either variant, because of 32-bit
> >>>>>>>> destination registers having their upper halves zeroed in 64-bit mode.
> >>>>>>>> Here, however, they're source registers, and hence specifying 64-bit
> >>>>>>>> registers would lead to the ambiguity of whether the upper 32 bits
> >>>>>>>> actually matter.
> >>>>>>>>
> >>>>>>>> Additionally, for proper code generation in 16-bit mode, IgnoreSize is
> >>>>>>>> needed on both.
> >>>>>>>
> >>>>>>> Are there testcases to show IgnoreSize is needed on them?
> >>>>>>
> >>>>>> The situation with 16-bit test cases is rather poor anyway. I didn't
> >>>>>> consider it reasonable to add such very special ones when far more
> >>>>>> general ones don't exist. But if your question is to mean you demand
> >>>>>
> >>>>> Let's start from somewhere.
> >>>>>
> >>>>>> such to be added, then I'll (somewhat hesitantly) add/extend some.
> >>>>>> Please clarify.
> >>>>>
> >>>>> Please add testcases.
> >>>>
> >>>> Actually they were there, in patch 2. I've moved them to this patch
> >>>> and have just sent v1.1 for just this one patch.
> >>>
> >>> Do we need to adjust disassembler for 16-bit mode?
> >>
> >> I've checked now - no, we don't.
> >>
> >
> > So disassembler has been correct.  Is the bug only in assembler?
> > What is the difference in encoding before and after your change?
>
> UMWAIT gets a stray 66 prefix emitted, and TPAUSE fails to
> assemble ("same type of prefix used twice").
>

Patch is OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2020-03-05 16:00 UTC | newest]

Thread overview: 37+ messages
2020-03-04  9:32 [PATCH 0/9] x86: (mainly) misc IgnoreSize related adjustments Jan Beulich
2020-03-04  9:41 ` [PATCH 1/9] x86: refine TPAUSE and UMWAIT Jan Beulich
2020-03-04 11:37   ` H.J. Lu
2020-03-04 11:40     ` Jan Beulich
2020-03-04 11:44       ` H.J. Lu
2020-03-05  8:08         ` Jan Beulich
2020-03-05 14:05           ` H.J. Lu
2020-03-05 14:08             ` Jan Beulich
2020-03-05 14:38               ` H.J. Lu
2020-03-05 14:51                 ` Jan Beulich
2020-03-05 14:54                   ` H.J. Lu
2020-03-05 15:16                     ` Jan Beulich
2020-03-05 15:22             ` Jan Beulich
2020-03-05 15:37               ` H.J. Lu
2020-03-05 15:42                 ` Jan Beulich
2020-03-05 16:00                   ` H.J. Lu
2020-03-04  9:42 ` [PATCH 2/9] x86: add missing IgnoreSize Jan Beulich
2020-03-04 11:40   ` H.J. Lu
2020-03-04  9:43 ` [PATCH 3/9] x86: correct MPX insn w/o base or index encoding in 16-bit mode Jan Beulich
2020-03-04 11:46   ` H.J. Lu
2020-03-04 11:50     ` Jan Beulich
2020-03-04 11:55       ` H.J. Lu
2020-03-04 12:58         ` Jan Beulich
2020-03-04 13:26           ` H.J. Lu
2020-03-04  9:44 ` [PATCH 4/9] x86: drop Rex64 attribute Jan Beulich
2020-03-04 11:47   ` H.J. Lu
2020-03-04  9:45 ` [PATCH 6/9] x86: don't accept FI{LD,STP,STTP}LL in Intel syntax mode Jan Beulich
2020-03-04 11:55   ` H.J. Lu
2020-03-04  9:46 ` [PATCH 7/9] x86: fold (supposed to be) identical code Jan Beulich
2020-03-04 11:56   ` H.J. Lu
2020-03-04  9:47 ` [PATCH 9/9] x86: reduce amount of various VCVT* templates Jan Beulich
2020-03-04 12:00   ` H.J. Lu
2020-03-04 10:15 ` [PATCH 8/9] x86: drop/replace IgnoreSize Jan Beulich
2020-03-04 11:59   ` H.J. Lu
2020-03-04 10:19 ` [PATCH 5/9] x86: replace NoRex64 on VEX-encoded insns Jan Beulich
2020-03-04 11:51   ` H.J. Lu
2020-03-05  8:07 ` [PATCH v1.1 1/9] x86: refine TPAUSE and UMWAIT Jan Beulich
