public inbox for binutils@sourceware.org
* x86: Support Intel AVX VNNI
@ 2020-10-14  6:37 Cui, Lili
  2020-10-14 10:34 ` H.J. Lu
  2020-10-14 13:12 ` Jan Beulich
  0 siblings, 2 replies; 44+ messages in thread
From: Cui, Lili @ 2020-10-14  6:37 UTC (permalink / raw)
  To: binutils

Hi all,
 
This patch enables binutils support for AVX-VNNI, the AVX (VEX-encoded)
versions of the Vector Neural Network Instructions.
For more details please refer to https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

make check-gas passes.
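
For example, the assembler now picks the encoding as follows (a minimal
sketch; the byte sequences are the ones expected by the new testcases):

	vpdpbusd  %xmm2, %xmm4, %xmm2         # default: EVEX (62 f2 5d 08 50 d2), AVX512_VNNI form
	{evex} vpdpbusd %xmm2, %xmm4, %xmm2   # explicit EVEX, same encoding
	{vex}  vpdpbusd %xmm2, %xmm4, %xmm2   # pseudo prefix: VEX (c4 e2 59 50 d2), AVX_VNNI form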

Subject: [PATCH] x86: Support Intel AVX VNNI

Intel AVX VNNI instructions are marked with CpuVEX_PREFIX.  Without the
pseudo {vex} prefix, mnemonics of Intel VNNI instructions are encoded
with the EVEX prefix.  The pseudo {vex} prefix can be used to encode
mnemonics of Intel VNNI instructions with the VEX prefix.

gas/
	* config/tc-i386.c (cpu_arch): Add .avx_vnni and noavx_vnni.
	(cpu_flags_match): Support CpuVEX_PREFIX.
	* doc/c-i386.texi: Document .avx_vnni, noavx_vnni and how to
	encode Intel VNNI instructions with VEX prefix.
	* testsuite/gas/i386/avx-vnni.d: New file.
	* testsuite/gas/i386/avx-vnni.s: Likewise.
	* testsuite/gas/i386/x86-64-avx-vnni.d: Likewise.
	* testsuite/gas/i386/x86-64-avx-vnni.s: Likewise.
	* testsuite/gas/i386/i386.exp: Run AVX VNNI tests.

opcodes/
	* i386-dis.c (PREFIX_VEX_0F3850): New.
	(PREFIX_VEX_0F3851): Likewise.
	(PREFIX_VEX_0F3852): Likewise.
	(PREFIX_VEX_0F3853): Likewise.
	(VEX_W_0F3850_P_2): Likewise.
	(VEX_W_0F3851_P_2): Likewise.
	(VEX_W_0F3852_P_2): Likewise.
	(VEX_W_0F3853_P_2): Likewise.
	(prefix_table): Add PREFIX_VEX_0F3850, PREFIX_VEX_0F3851,
	PREFIX_VEX_0F3852 and PREFIX_VEX_0F3853.
	(vex_table): Add VEX_W_0F3850_P_2, VEX_W_0F3851_P_2,
	VEX_W_0F3852_P_2 and VEX_W_0F3853_P_2.
	(putop): Add support for "XV" to print "{vex3}" pseudo prefix.
	* i386-gen.c (cpu_flag_init): Clear the CpuAVX_VNNI bit in
	CPU_UNKNOWN_FLAGS.  Add CPU_AVX_VNNI_FLAGS and
	CPU_ANY_AVX_VNNI_FLAGS.
	(cpu_flags): Add CpuAVX_VNNI and CpuVEX_PREFIX.
	* i386-opc.h (CpuAVX_VNNI): New.
	(CpuVEX_PREFIX): Likewise.
	(i386_cpu_flags): Add cpuavx_vnni and cpuvex_prefix.
	* i386-opc.tbl: Add Intel AVX VNNI instructions.
	* i386-init.h: Regenerated.
	* i386-tbl.h: Likewise.
---
 gas/config/tc-i386.c                     | 12 +++++-
 gas/doc/c-i386.texi                      |  8 +++-
 gas/testsuite/gas/i386/avx-vnni.d        | 43 ++++++++++++++++++++++
 gas/testsuite/gas/i386/avx-vnni.s        | 22 +++++++++++
 gas/testsuite/gas/i386/i386.exp          |  2 +
 gas/testsuite/gas/i386/x86-64-avx-vnni.d | 47 ++++++++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-avx-vnni.s | 23 ++++++++++++
 opcodes/i386-dis.c                       | 43 +++++++++++++++++++---
 opcodes/i386-gen.c                       |  6 +++
 opcodes/i386-opc.h                       |  6 +++
 opcodes/i386-opc.tbl                     | 10 +++++
 11 files changed, 214 insertions(+), 8 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/avx-vnni.d
 create mode 100644 gas/testsuite/gas/i386/avx-vnni.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index b1e8f7cf1f..101980c45a 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1180,6 +1180,8 @@ static const arch_entry cpu_arch[] =
     CPU_AVX512_VNNI_FLAGS, 0 },
   { STRING_COMMA_LEN (".avx512_bitalg"), PROCESSOR_UNKNOWN,
     CPU_AVX512_BITALG_FLAGS, 0 },
+  { STRING_COMMA_LEN (".avx_vnni"), PROCESSOR_UNKNOWN,
+    CPU_AVX_VNNI_FLAGS, 0 },
   { STRING_COMMA_LEN (".clzero"), PROCESSOR_UNKNOWN,
     CPU_CLZERO_FLAGS, 0 },
   { STRING_COMMA_LEN (".mwaitx"), PROCESSOR_UNKNOWN,
@@ -1276,6 +1278,7 @@ static const noarch_entry cpu_noarch[] =
   { STRING_COMMA_LEN ("noavx512_vbmi2"), CPU_ANY_AVX512_VBMI2_FLAGS },
   { STRING_COMMA_LEN ("noavx512_vnni"), CPU_ANY_AVX512_VNNI_FLAGS },
   { STRING_COMMA_LEN ("noavx512_bitalg"), CPU_ANY_AVX512_BITALG_FLAGS },
+  { STRING_COMMA_LEN ("noavx_vnni"), CPU_ANY_AVX_VNNI_FLAGS },
   { STRING_COMMA_LEN ("noibt"), CPU_ANY_IBT_FLAGS },
   { STRING_COMMA_LEN ("noshstk"), CPU_ANY_SHSTK_FLAGS },
   { STRING_COMMA_LEN ("noamx_int8"), CPU_ANY_AMX_INT8_FLAGS },
@@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
       cpu = cpu_flags_and (x, cpu);
       if (!cpu_flags_all_zero (&cpu))
 	{
-	  if (x.bitfield.cpuavx)
+	  if (x.bitfield.cpuvex_prefix)
+	    {
+	      /* We need to check a few extra flags with VEX_PREFIX.  */
+	      if (i.vec_encoding == vex_encoding_vex
+		  || i.vec_encoding == vex_encoding_vex3)
+		match |= CPU_FLAGS_ARCH_MATCH;
+	    }
+	  else if (x.bitfield.cpuavx)
 	    {
 	      /* We need to check a few extra flags with AVX.  */
 	      if (cpu.bitfield.cpuavx
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 776fed8ed5..a6075874d2 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -211,6 +211,7 @@ accept various extension mnemonics.  For example,
 @code{avx512_vp2intersect},
 @code{tdx},
 @code{avx512_bf16},
+@code{avx_vnni},
 @code{noavx512f},
 @code{noavx512cd},
 @code{noavx512er},
@@ -229,6 +230,7 @@ accept various extension mnemonics.  For example,
 @code{noavx512_vp2intersect},
 @code{notdx},
 @code{noavx512_bf16},
+@code{noavx_vnni},
 @code{noenqcmd},
 @code{noserialize},
 @code{notsxldtrk},
@@ -857,6 +859,10 @@ prefix which generates REX prefix unconditionally.
 @samp{@{nooptimize@}} -- disable instruction size optimization.
 @end itemize
 
+Mnemonics of Intel VNNI instructions are encoded with the EVEX prefix
+by default.  The pseudo @samp{@{vex@}} prefix can be used to encode
+mnemonics of Intel VNNI instructions with the VEX prefix.
+
 @cindex conversion instructions, i386
 @cindex i386 conversion instructions
 @cindex conversion instructions, x86-64
@@ -1505,7 +1511,7 @@ supported on the CPU specified.  The choices for @var{cpu_type} are:
 @item @samp{.avx512vbmi} @tab @samp{.avx512_4fmaps} @tab @samp{.avx512_4vnniw}
 @item @samp{.avx512_vpopcntdq} @tab @samp{.avx512_vbmi2} @tab @samp{.avx512_vnni}
 @item @samp{.avx512_bitalg} @tab @samp{.avx512_bf16} @tab @samp{.avx512_vp2intersect}
-@item @samp{.tdx}
+@item @samp{.tdx} @tab @samp{.avx_vnni}
 @item @samp{.clwb} @tab @samp{.rdpid} @tab @samp{.ptwrite} @tab @item @samp{.ibt}
 @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote}
 @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq}
diff --git a/gas/testsuite/gas/i386/avx-vnni.d b/gas/testsuite/gas/i386/avx-vnni.d
new file mode 100644
index 0000000000..6d6e779d6e
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-vnni.d
@@ -0,0 +1,43 @@
+#objdump: -dw
+#name: i386 AVX VNNI insns
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
+#pass
diff --git a/gas/testsuite/gas/i386/avx-vnni.s b/gas/testsuite/gas/i386/avx-vnni.s
new file mode 100644
index 0000000000..4ddc733040
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-vnni.s
@@ -0,0 +1,22 @@
+	.allow_index_reg
+
+.macro test_insn mnemonic
+	\mnemonic	%xmm2, %xmm4, %xmm2
+	{evex} \mnemonic %xmm2, %xmm4, %xmm2
+	{vex}  \mnemonic %xmm2, %xmm4, %xmm2
+	{vex2} \mnemonic %xmm2, %xmm4, %xmm2
+	{vex3} \mnemonic %xmm2, %xmm4, %xmm2
+	{vex}  \mnemonic (%ecx), %xmm4, %xmm2
+	{vex2} \mnemonic (%ecx), %xmm4, %xmm2
+	{vex3} \mnemonic (%ecx), %xmm4, %xmm2
+.endm
+
+	.text
+_start:
+	test_insn vpdpbusd
+	test_insn vpdpwssd
+	test_insn vpdpbusds
+	test_insn vpdpwssds
+
+	.arch .avx_vnni
+	 vpdpbusd	%xmm2, %xmm4, %xmm2
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 8645f3061c..8cb31ac2d5 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -458,6 +458,7 @@ if [gas_32_check] then {
     run_dump_test "avx512_bf16"
     run_dump_test "avx512_bf16_vl"
     run_list_test "avx512_bf16_vl-inval"
+    run_dump_test "avx-vnni"
     run_list_test "sg"
     run_dump_test "clzero"
     run_dump_test "disassem"
@@ -1074,6 +1075,7 @@ if [gas_64_check] then {
     run_dump_test "x86-64-avx512_bf16"
     run_dump_test "x86-64-avx512_bf16_vl"
     run_list_test "x86-64-avx512_bf16_vl-inval"
+    run_dump_test "x86-64-avx-vnni"
     run_dump_test "x86-64-clzero"
     run_dump_test "x86-64-mwaitx-bdver4"
     run_list_test "x86-64-mwaitx-reg"
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
new file mode 100644
index 0000000000..ebb0ebf02c
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
@@ -0,0 +1,47 @@
+#objdump: -dw
+#name: x86-64 AVX VNNI insns
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	62 b2 5d 08 50 d6    	vpdpbusd %xmm22,%xmm4,%xmm2
+ +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	62 b2 5d 08 52 d6    	vpdpwssd %xmm22,%xmm4,%xmm2
+ +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	62 b2 5d 08 51 d6    	vpdpbusds %xmm22,%xmm4,%xmm2
+ +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	62 b2 5d 08 53 d6    	vpdpwssds %xmm22,%xmm4,%xmm2
+ +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.s b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
new file mode 100644
index 0000000000..7f47bf684b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
@@ -0,0 +1,23 @@
+	.allow_index_reg
+
+.macro test_insn mnemonic
+	\mnemonic	 %xmm12, %xmm4, %xmm2
+	{evex} \mnemonic %xmm12, %xmm4, %xmm2
+	{vex}  \mnemonic %xmm12, %xmm4, %xmm2
+	{vex2} \mnemonic %xmm12, %xmm4, %xmm2
+	{vex3} \mnemonic %xmm12, %xmm4, %xmm2
+	{vex}  \mnemonic (%rcx), %xmm4, %xmm2
+	{vex2} \mnemonic (%rcx), %xmm4, %xmm2
+	{vex3} \mnemonic (%rcx), %xmm4, %xmm2
+	\mnemonic	 %xmm22, %xmm4, %xmm2
+.endm
+
+	.text
+_start:
+	test_insn vpdpbusd
+	test_insn vpdpwssd
+	test_insn vpdpbusds
+	test_insn vpdpwssds
+
+	.arch .avx_vnni
+	vpdpbusd	%xmm12, %xmm4, %xmm2
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 4d8f4f4cc2..b7793164f8 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1500,6 +1500,10 @@ enum
   VEX_W_0F384B_X86_64_P_1,
   VEX_W_0F384B_X86_64_P_2,
   VEX_W_0F384B_X86_64_P_3,
+  VEX_W_0F3850,
+  VEX_W_0F3851,
+  VEX_W_0F3852,
+  VEX_W_0F3853,
   VEX_W_0F3858,
   VEX_W_0F3859,
   VEX_W_0F385A_M_0_L_0,
@@ -1787,6 +1791,7 @@ struct dis386 {
    "XZ" => print 'x', 'y', or 'z' if suffix_always is true or no
 	   register operands and no broadcast.
    "XW" => print 's', 'd' depending on the VEX.W bit (for FMA)
+   "XV" => print "{vex3}" pseudo prefix
    "LQ" => print 'l' ('d' in Intel mode) or 'q' for memory operand, cond
 	   being false, or no operand at all in 64bit mode, or if suffix_always
 	   is true.
@@ -6156,10 +6161,10 @@ static const struct dis386 vex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 50 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F3850) },
+    { VEX_W_TABLE (VEX_W_0F3851) },
+    { VEX_W_TABLE (VEX_W_0F3852) },
+    { VEX_W_TABLE (VEX_W_0F3853) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -7690,6 +7695,22 @@ static const struct dis386 vex_w_table[][2] = {
     /* VEX_W_0F384B_X86_64_P_3 */
     { MOD_TABLE (MOD_VEX_0F384B_X86_64_P_3_W_0) },
   },
+  {
+    /* VEX_W_0F3850 */
+    { "%XV vpdpbusd",	{ XM, Vex, EXx }, 0 },
+  },
+  {
+    /* VEX_W_0F3851 */
+    { "%XV vpdpbusds",	{ XM, Vex, EXx }, 0 },
+  },
+  {
+    /* VEX_W_0F3852 */
+    { "%XV vpdpwssd",	{ XM, Vex, EXx }, 0 },
+  },
+  {
+    /* VEX_W_0F3853 */
+    { "%XV vpdpwssds",	{ XM, Vex, EXx }, 0 },
+  },
   {
     /* VEX_W_0F3858 */
     { "vpbroadcastd", { XM, EXxmm_md }, PREFIX_DATA },
@@ -10934,9 +10955,19 @@ putop (const char *in_template, int sizeflag)
 	case 'V':
 	  if (l == 0)
 	    abort ();
-	  else if (l == 1 && last[0] == 'L')
+	  else if (l == 1
+		   && (last[0] == 'L' || last[0] == 'X'))
 	    {
-	      if (rex & REX_W)
+	      if (last[0] == 'X')
+		{
+		  *obufp++ = '{';
+		  *obufp++ = 'v';
+		  *obufp++ = 'e';
+		  *obufp++ = 'x';
+		  *obufp++ = '3';
+		  *obufp++ = '}';
+		}
+	      else if (rex & REX_W)
 		{
 		  *obufp++ = 'a';
 		  *obufp++ = 'b';
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 81c68cdf43..f1385f4cd0 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -207,6 +207,8 @@ static initializer cpu_flag_init[] =
     "CPU_SSE4_2_FLAGS|CPU_XSAVE_FLAGS|CpuAVX" },
   { "CPU_AVX2_FLAGS",
     "CPU_AVX_FLAGS|CpuAVX2" },
+  { "CPU_AVX_VNNI_FLAGS",
+    "CPU_AVX2_FLAGS|CpuAVX_VNNI" },
   { "CPU_AVX512F_FLAGS",
     "CPU_AVX2_FLAGS|CpuAVX512F" },
   { "CPU_AVX512CD_FLAGS",
@@ -401,6 +403,8 @@ static initializer cpu_flag_init[] =
     "CpuAMX_BF16" },
   { "CPU_ANY_AMX_TILE_FLAGS",
     "CpuAMX_TILE|CpuAMX_INT8|CpuAMX_BF16" },
+  { "CPU_ANY_AVX_VNNI_FLAGS",
+    "CpuAVX_VNNI|CpuVEX_PREFIX" },
   { "CPU_ANY_MOVDIRI_FLAGS",
     "CpuMOVDIRI" },
   { "CPU_ANY_MOVDIR64B_FLAGS",
@@ -624,6 +628,8 @@ static bitfield cpu_flags[] =
   BITFIELD (CpuAVX512_BF16),
   BITFIELD (CpuAVX512_VP2INTERSECT),
   BITFIELD (CpuTDX),
+  BITFIELD (CpuAVX_VNNI),
+  BITFIELD (CpuVEX_PREFIX),
   BITFIELD (CpuMWAITX),
   BITFIELD (CpuCLZERO),
   BITFIELD (CpuOSPKE),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index e783683d0c..094bcb6a0d 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -212,6 +212,10 @@ enum
   CpuAVX512_VP2INTERSECT,
   /* TDX Instructions support required.  */
   CpuTDX,
+  /* Intel AVX VNNI Instructions support required.  */
+  CpuAVX_VNNI,
+  /* Intel AVX Instructions support via {vex} prefix required.  */
+  CpuVEX_PREFIX,
   /* mwaitx instruction required */
   CpuMWAITX,
   /* Clzero instruction required */
@@ -378,6 +382,8 @@ typedef union i386_cpu_flags
       unsigned int cpuavx512_bf16:1;
       unsigned int cpuavx512_vp2intersect:1;
       unsigned int cputdx:1;
+      unsigned int cpuavx_vnni:1;
+      unsigned int cpuvex_prefix:1;
       unsigned int cpumwaitx:1;
       unsigned int cpuclzero:1;
       unsigned int cpuospke:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 2c7184b0e9..5e48b5567a 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3904,6 +3904,16 @@ vpshrdw, 4, 0x6672, None, 1, CpuAVX512_VBMI2, Modrm|Masking=3|OpcodePrefix=2|Vex
 
 // AVX512_VBMI2 instructions end
 
+// AVX_VNNI instructions
+
+vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+
+vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+
+// AVX_VNNI instructions end
+
 // AVX512_VNNI instructions
 
 vpdpbusd, 3, 0x6650, None, 1, CpuAVX512_VNNI, Modrm|Masking=3|OpcodePrefix=1|VexVVVV=1|VexW=1|Broadcast|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-- 
2.17.1

Thanks,
Lili.

* Re: x86: Support Intel AVX VNNI
  2020-10-14  6:37 x86: Support Intel AVX VNNI Cui, Lili
@ 2020-10-14 10:34 ` H.J. Lu
  2020-10-14 13:12 ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-14 10:34 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, Jan Beulich

On Tue, Oct 13, 2020 at 11:37 PM Cui, Lili <lili.cui@intel.com> wrote:
>
> Hi all,
>
> This patch enables binutils support for AVX-VNNI, the AVX (VEX-encoded)
> versions of the Vector Neural Network Instructions.
> For more details please refer to https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
>
> make check-gas passes.
>
> Subject: [PATCH] x86: Support Intel AVX VNNI
>
> Intel AVX VNNI instructions are marked with CpuVEX_PREFIX.  Without the
> pseudo {vex} prefix, mnemonics of Intel VNNI instructions are encoded
> with the EVEX prefix.  The pseudo {vex} prefix can be used to encode
> mnemonics of Intel VNNI instructions with the VEX prefix.
>
> gas/
>         * config/tc-i386.c (cpu_arch): Add .avx_vnni and noavx_vnni.
>         (cpu_flags_match): Support CpuVEX_PREFIX.
>         * doc/c-i386.texi: Document .avx_vnni, noavx_vnni and how to
>         encode Intel VNNI instructions with VEX prefix.
>         * testsuite/gas/i386/avx-vnni.d: New file.
>         * testsuite/gas/i386/avx-vnni.s: Likewise.
>         * testsuite/gas/i386/x86-64-avx-vnni.d: Likewise.
>         * testsuite/gas/i386/x86-64-avx-vnni.s: Likewise.
>         * testsuite/gas/i386/i386.exp: Run AVX VNNI tests.
>
> opcodes/
>         * i386-dis.c (PREFIX_VEX_0F3850): New.
>         (PREFIX_VEX_0F3851): Likewise.
>         (PREFIX_VEX_0F3852): Likewise.
>         (PREFIX_VEX_0F3853): Likewise.
>         (VEX_W_0F3850_P_2): Likewise.
>         (VEX_W_0F3851_P_2): Likewise.
>         (VEX_W_0F3852_P_2): Likewise.
>         (VEX_W_0F3853_P_2): Likewise.
>         (prefix_table): Add PREFIX_VEX_0F3850, PREFIX_VEX_0F3851,
>         PREFIX_VEX_0F3852 and PREFIX_VEX_0F3853.
>         (vex_table): Add VEX_W_0F3850_P_2, VEX_W_0F3851_P_2,
>         VEX_W_0F3852_P_2 and VEX_W_0F3853_P_2.
>         (putop): Add support for "XV" to print "{vex3}" pseudo prefix.
>         * i386-gen.c (cpu_flag_init): Clear the CpuAVX_VNNI bit in
>         CPU_UNKNOWN_FLAGS.  Add CPU_AVX_VNNI_FLAGS and
>         CPU_ANY_AVX_VNNI_FLAGS.
>         (cpu_flags): Add CpuAVX_VNNI and CpuVEX_PREFIX.
>         * i386-opc.h (CpuAVX_VNNI): New.
>         (CpuVEX_PREFIX): Likewise.
>         (i386_cpu_flags): Add cpuavx_vnni and cpuvex_prefix.
>         * i386-opc.tbl: Add Intel AVX VNNI instructions.
>         * i386-init.h: Regenerated.
>         * i386-tbl.h: Likewise.

OK.

Thanks.

-- 
H.J.

* Re: x86: Support Intel AVX VNNI
  2020-10-14  6:37 x86: Support Intel AVX VNNI Cui, Lili
  2020-10-14 10:34 ` H.J. Lu
@ 2020-10-14 13:12 ` Jan Beulich
  2020-10-14 13:28   ` H.J. Lu
  1 sibling, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-14 13:12 UTC (permalink / raw)
  To: Cui, Lili; +Cc: binutils, H. J. Lu

On 14.10.2020 08:37, Cui, Lili wrote:
> Subject: [PATCH] x86: Support Intel AVX VNNI
> 
> Intel AVX VNNI instructions are marked with CpuVEX_PREFIX.  Without the
> pseudo {vex} prefix, mnemonics of Intel VNNI instructions are encoded
> with the EVEX prefix.  The pseudo {vex} prefix can be used to encode
> mnemonics of Intel VNNI instructions with the VEX prefix.

I take it this is (on the gas side) to avoid breaking existing code
by suddenly switching from EVEX- to VEX-encoding. Ugly, but well ...
(I imply that there are or are going to be CPUs with AVX512_VNNI but
without AVX_VNNI, or else all of this would be quite pointless.) On
the disassembler side, though, I don't see at all why the {vex3}
thingy needs printing. We don't do so for other mnemonics allowing
perhaps all of 2-byte VEX, 3-byte VEX, and EVEX encoding.

I wonder anyway why these separate encodings are being introduced
after their EVEX ones. Is this just because in many cases the
encodings are shorter? If so, why isn't this done more consistently
for all AVX512 insns not (yet) having VEX encoded counterparts?

> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
>        cpu = cpu_flags_and (x, cpu);
>        if (!cpu_flags_all_zero (&cpu))
>  	{
> -	  if (x.bitfield.cpuavx)
> +	  if (x.bitfield.cpuvex_prefix)
> +	    {
> +	      /* We need to check a few extra flags with VEX_PREFIX.  */
> +	      if (i.vec_encoding == vex_encoding_vex
> +		  || i.vec_encoding == vex_encoding_vex3)
> +		match |= CPU_FLAGS_ARCH_MATCH;
> +	    }
> +	  else if (x.bitfield.cpuavx)

Is this (including the new cpuvex_prefix attribute, which imo shouldn't
be a Cpu* bit) really needed? Couldn't you achieve the same by placing
the templates _after_ the AVX512 counterparts? Iirc templates get
tried in order, and the first match wins. The {vex3} prefix would then
prevent a match on the EVEX-encoded AVX512_VNNI templates.

> --- /dev/null
> +++ b/gas/testsuite/gas/i386/avx-vnni.s
> @@ -0,0 +1,22 @@
> +	.allow_index_reg
> +
> +.macro test_insn mnemonic
> +	\mnemonic	%xmm2, %xmm4, %xmm2
> +	{evex} \mnemonic %xmm2, %xmm4, %xmm2
> +	{vex}  \mnemonic %xmm2, %xmm4, %xmm2
> +	{vex2} \mnemonic %xmm2, %xmm4, %xmm2
> +	{vex3} \mnemonic %xmm2, %xmm4, %xmm2
> +	{vex}  \mnemonic (%ecx), %xmm4, %xmm2
> +	{vex2} \mnemonic (%ecx), %xmm4, %xmm2
> +	{vex3} \mnemonic (%ecx), %xmm4, %xmm2
> +.endm

I question the {vex2} cases here: Both {vex} and {vex3} are
mandatory prefixes, while {vex2} has been deprecated (and it's
not even documented anymore). If anything, {vex2} should result
in an error here. But for simplicity omitting the tests now
(allowing the case to become an error later on) would be fine
with me.

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3904,6 +3904,16 @@ vpshrdw, 4, 0x6672, None, 1, CpuAVX512_VBMI2, Modrm|Masking=3|OpcodePrefix=2|Vex
>  
>  // AVX512_VBMI2 instructions end
>  
> +// AVX_VNNI instructions
> +
> +vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
> +
> +vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }

At the very least please use more "modern" (and more meaningful /
readable) constructs where available, e.g. VexW0 instead of VexW=1.
But in new code I'd like to further encourage to avoid any
attributes of the form <name>=<value>, and instead introduce
suitable #define-s instead. Since we're already transforming the
code (slowly, but still), this will reduce future churn. (For
VexVVVV=1 please note that this is equivalent to just VexVVVV
anyway.)

Jan

* Re: x86: Support Intel AVX VNNI
  2020-10-14 13:12 ` Jan Beulich
@ 2020-10-14 13:28   ` H.J. Lu
  2020-10-15  7:10     ` Cui, Lili
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-14 13:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Wed, Oct 14, 2020 at 6:12 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 14.10.2020 08:37, Cui, Lili wrote:
> > Subject: [PATCH] x86: Support Intel AVX VNNI
> >
> > Intel AVX VNNI instructions are marked with CpuVEX_PREFIX.  Without the
> > pseudo {vex} prefix, mnemonics of Intel VNNI instructions are encoded
> > with the EVEX prefix.  The pseudo {vex} prefix can be used to encode
> > mnemonics of Intel VNNI instructions with the VEX prefix.
>
> I take it this is (on the gas side) to avoid breaking existing code
> by suddenly switching from EVEX- to VEX-encoding. Ugly, but well ...
> (I imply that there are or are going to be CPUs with AVX512_VNNI but

Cascadelake has AVX512 VNNI.

> without AVX_VNNI, or else all of this would be quite pointless.) On
> the disassembler side, though, I don't see at all why the {vex3}
> thingy needs printing. We don't do so for other mnemonics allowing
> perhaps all of 2-byte VEX, 3-byte VEX, and EVEX encoding.
>
> I wonder anyway why these separate encodings are being introduced
> after their EVEX ones. Is this just because in many cases the
> encodings are shorter? If so, why isn't this done more consistently
> for all AVX512 insns not (yet) having VEX encoded counterparts?

Good questions.

> > @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> >        cpu = cpu_flags_and (x, cpu);
> >        if (!cpu_flags_all_zero (&cpu))
> >       {
> > -       if (x.bitfield.cpuavx)
> > +       if (x.bitfield.cpuvex_prefix)
> > +         {
> > +           /* We need to check a few extra flags with VEX_PREFIX.  */
> > +           if (i.vec_encoding == vex_encoding_vex
> > +               || i.vec_encoding == vex_encoding_vex3)
> > +             match |= CPU_FLAGS_ARCH_MATCH;
> > +         }
> > +       else if (x.bitfield.cpuavx)
>
> Is this (including the new cpuvex_prefix attribute, which imo shouldn't
> be a Cpu* bit) really needed? Couldn't you achieve the same by placing
> the templates _after_ the AVX512 counterparts? Iirc templates get
> tried in order, and the first match wins. The {vex3} prefix would then
> prevent a match on the EVEX-encoded AVX512_VNNI templates.

Lili, please look into it.

> > --- /dev/null
> > +++ b/gas/testsuite/gas/i386/avx-vnni.s
> > @@ -0,0 +1,22 @@
> > +     .allow_index_reg
> > +
> > +.macro test_insn mnemonic
> > +     \mnemonic       %xmm2, %xmm4, %xmm2
> > +     {evex} \mnemonic %xmm2, %xmm4, %xmm2
> > +     {vex}  \mnemonic %xmm2, %xmm4, %xmm2
> > +     {vex2} \mnemonic %xmm2, %xmm4, %xmm2
> > +     {vex3} \mnemonic %xmm2, %xmm4, %xmm2
> > +     {vex}  \mnemonic (%ecx), %xmm4, %xmm2
> > +     {vex2} \mnemonic (%ecx), %xmm4, %xmm2
> > +     {vex3} \mnemonic (%ecx), %xmm4, %xmm2
> > +.endm
>
> I question the {vex2} cases here: Both {vex} and {vex3} are
> mandatory prefixes, while {vex2} has been deprecated (and it's
> not even documented anymore). If anything, {vex2} should result
> in an error here. But for simplicity omitting the tests now
> (allowing the case to become an error later on) would be fine
> with me.

Lili, let's drop {vex2} test.

> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3904,6 +3904,16 @@ vpshrdw, 4, 0x6672, None, 1, CpuAVX512_VBMI2, Modrm|Masking=3|OpcodePrefix=2|Vex
> >
> >  // AVX512_VBMI2 instructions end
> >
> > +// AVX_VNNI instructions
> > +
> > +vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
> > +vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
> > +
> > +vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
> > +vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
>
> At the very least please use more "modern" (and more meaningful /
> readable) constructs where available, e.g. VexW0 instead of VexW=1.
> But in new code I'd like to further encourage to avoid any
> attributes of the form <name>=<value>, and instead introduce
> suitable #define-s instead. Since we're already transforming the
> code (slowly, but still), this will reduce future churn. (For
> VexVVVV=1 please note that this is equivalent to just VexVVVV
> anyway.)

Lili, please adjust them.

-- 
H.J.

* RE: x86: Support Intel AVX VNNI
  2020-10-14 13:28   ` H.J. Lu
@ 2020-10-15  7:10     ` Cui, Lili
  2020-10-15  7:24       ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Cui, Lili @ 2020-10-15  7:10 UTC (permalink / raw)
  To: H.J. Lu, Jan Beulich; +Cc: binutils

> > > @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> > >        cpu = cpu_flags_and (x, cpu);
> > >        if (!cpu_flags_all_zero (&cpu))
> > >       {
> > > -       if (x.bitfield.cpuavx)
> > > +       if (x.bitfield.cpuvex_prefix)
> > > +         {
> > > +           /* We need to check a few extra flags with VEX_PREFIX.  */
> > > +           if (i.vec_encoding == vex_encoding_vex
> > > +               || i.vec_encoding == vex_encoding_vex3)
> > > +             match |= CPU_FLAGS_ARCH_MATCH;
> > > +         }
> > > +       else if (x.bitfield.cpuavx)
> >
> > Is this (including the new cpuvex_prefix attribute, which imo
> > shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
> > by placing the templates _after_ the AVX512 counterparts? Iirc
> > templates get tried in order, and the first match wins. The {vex3}
> > prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
> templates.
> 
> Lili, please look into it.
> 

I added an invalid test for it; we need the cpuvex_prefix attribute for the scenario below.

.arch .noavx512_vnni
vpdpbusd %xmm2,%xmm4,%xmm2 

Without the pseudo {vex} prefix, this instruction should be encoded with the
EVEX prefix, so we should report an error for it.  I renamed CpuVEX_PREFIX to
PseudoVexPrefix and moved it into the opcode_modifier bits, thanks.
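
Concretely, the intent is roughly as follows (a sketch based on the new
avx-vnni-inval test; the {vex} line is my illustration of the accepted
form, not part of the test):

	.arch .noavx512_vnni
	vpdpbusd %xmm2, %xmm4, %xmm2         # Error: unsupported instruction `vpdpbusd'
	{vex} vpdpbusd %xmm2, %xmm4, %xmm2   # accepted: VEX-encoded AVX_VNNI form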

> > I question the {vex2} cases here: Bot {vex} and {vex3} are mandatory
> > prefixes, while {vex2} has been deprecated (and it's not even
> > documented anymore). If anything, {vex2} should result in an error
> > here. But for simplicity omitting the tests now (allowing the case to
> > become an error later on) would be fine with me.
> 
> Lili, let's drop {vex2} test.
> 
Done.

> > At the very least please use more "modern" (and more meaningful /
> > readable) constructs where available, e.g. VexW0 instead of VexW=1.
> > But in new code I'd like to further encourage to avoid any attributes
> > of the form <name>=<value>, and instead introduce suitable #define-s
> > instead. Since we're already transforming the code (slowly, but
> > still), this will reduce future churn. (For
> > VexVVVV=1 please note that this is equivalent to just VexVVVV
> > anyway.)
> 
> Lili, please adjust them.
> 
Done.

Subject: [PATCH] Enhancement for avx-vnni patch

1. Rename CpuVEX_PREFIX to PseudoVexPrefix and
   move it from cpu_flags to opcode_modifiers.
2. Delete {vex2} invalid tests.
3. Use VexW0 and VexVVVV in the AVX-VNNI instructions.

gas/
	* config/tc-i386.c: Move Pseudo Prefix check to match_template.
	* testsuite/gas/i386/avx-vnni-inval.l: New file.
	* testsuite/gas/i386/avx-vnni-inval.s: Likewise.
	* testsuite/gas/i386/avx-vnni.d: Delete invalid {vex2} test.
	* testsuite/gas/i386/avx-vnni.s: Likewise.
	* testsuite/gas/i386/i386.exp: Add AVX VNNI invalid tests.
	* testsuite/gas/i386/x86-64-avx-vnni-inval.l: New file.
	* testsuite/gas/i386/x86-64-avx-vnni-inval.s: Likewise.
	* testsuite/gas/i386/x86-64-avx-vnni.d: Delete invalid {vex2} test.
	* testsuite/gas/i386/x86-64-avx-vnni.s: Likewise.

opcodes/
	* i386-opc.tbl: Rename CpuVEX_PREFIX to PseudoVexPrefix
	and move it from cpu_flags to opcode_modifiers.
	Use VexW0 and VexVVVV in the AVX-VNNI instructions.
	* i386-gen.c: Likewise.
	* i386-opc.h: Likewise.
	* i386-init.h: Regenerated.
	* i386-tbl.h: Likewise.
---
 gas/config/tc-i386.c                           | 16 ++++++++--------
 gas/testsuite/gas/i386/avx-vnni-inval.l        |  3 +++
 gas/testsuite/gas/i386/avx-vnni-inval.s        |  9 +++++++++
 gas/testsuite/gas/i386/avx-vnni.d              |  8 --------
 gas/testsuite/gas/i386/avx-vnni.s              |  2 --
 gas/testsuite/gas/i386/i386.exp                |  2 ++
 gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l |  5 +++++
 gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s | 11 +++++++++++
 gas/testsuite/gas/i386/x86-64-avx-vnni.d       |  8 --------
 gas/testsuite/gas/i386/x86-64-avx-vnni.s       |  2 --
 opcodes/i386-gen.c                             |  4 ++--
 opcodes/i386-opc.h                             |  6 +++---
 opcodes/i386-opc.tbl                           |  8 ++++----
 13 files changed, 47 insertions(+), 37 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/avx-vnni-inval.l
 create mode 100644 gas/testsuite/gas/i386/avx-vnni-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index a081064aba..487454f24a 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1973,14 +1973,7 @@ cpu_flags_match (const insn_template *t)
       cpu = cpu_flags_and (x, cpu);
       if (!cpu_flags_all_zero (&cpu))
 	{
-	  if (x.bitfield.cpuvex_prefix)
-	    {
-	      /* We need to check a few extra flags with VEX_PREFIX.  */
-	      if (i.vec_encoding == vex_encoding_vex
-		  || i.vec_encoding == vex_encoding_vex3)
-		match |= CPU_FLAGS_ARCH_MATCH;
-	    }
-	  else if (x.bitfield.cpuavx)
+	  if (x.bitfield.cpuavx)
 	    {
 	      /* We need to check a few extra flags with AVX.  */
 	      if (cpu.bitfield.cpuavx
@@ -6265,6 +6258,13 @@ match_template (char mnem_suffix)
       if (cpu_flags_match (t) != CPU_FLAGS_PERFECT_MATCH)
 	continue;
 
+      /* Check Pseudo Prefix.  */
+      i.error = unsupported;
+      if (t->opcode_modifier.pseudovexprefix
+	  && !(i.vec_encoding == vex_encoding_vex
+	      || i.vec_encoding == vex_encoding_vex3))
+	continue;
+
       /* Check AT&T mnemonic.   */
       i.error = unsupported_with_intel_mnemonic;
       if (intel_mnemonic && t->opcode_modifier.attmnemonic)
diff --git a/gas/testsuite/gas/i386/avx-vnni-inval.l b/gas/testsuite/gas/i386/avx-vnni-inval.l
new file mode 100644
index 0000000000..972f31f082
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-vnni-inval.l
@@ -0,0 +1,3 @@
+.* Assembler messages:
+.*:6: Error: unsupported instruction `vpdpbusd'
+.*:9: Error: unsupported instruction `vpdpbusd'
diff --git a/gas/testsuite/gas/i386/avx-vnni-inval.s b/gas/testsuite/gas/i386/avx-vnni-inval.s
new file mode 100644
index 0000000000..2b4cf0bf9b
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-vnni-inval.s
@@ -0,0 +1,9 @@
+# Check illegal use of AVX-VNNI instructions
+
+	.text
+	.arch .noavx512_vnni
+_start:
+	vpdpbusd %xmm2,%xmm4,%xmm2
+
+	.intel_syntax noprefix
+	vpdpbusd %xmm2,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/avx-vnni.d b/gas/testsuite/gas/i386/avx-vnni.d
index 6d6e779d6e..6e31528cf2 100644
--- a/gas/testsuite/gas/i386/avx-vnni.d
+++ b/gas/testsuite/gas/i386/avx-vnni.d
@@ -11,32 +11,24 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/avx-vnni.s b/gas/testsuite/gas/i386/avx-vnni.s
index 4ddc733040..b37bc85c3a 100644
--- a/gas/testsuite/gas/i386/avx-vnni.s
+++ b/gas/testsuite/gas/i386/avx-vnni.s
@@ -4,10 +4,8 @@
 	\mnemonic	%xmm2, %xmm4, %xmm2
 	{evex} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm2, %xmm4, %xmm2
-	{vex2} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex3} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex}  \mnemonic (%ecx), %xmm4, %xmm2
-	{vex2} \mnemonic (%ecx), %xmm4, %xmm2
 	{vex3} \mnemonic (%ecx), %xmm4, %xmm2
 .endm
 
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index f5727678e2..068813d77f 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -459,6 +459,7 @@ if [gas_32_check] then {
     run_dump_test "avx512_bf16_vl"
     run_list_test "avx512_bf16_vl-inval"
     run_dump_test "avx-vnni"
+    run_list_test "avx-vnni-inval"
     run_list_test "sg"
     run_dump_test "clzero"
     run_dump_test "disassem"
@@ -1077,6 +1078,7 @@ if [gas_64_check] then {
     run_dump_test "x86-64-avx512_bf16_vl"
     run_list_test "x86-64-avx512_bf16_vl-inval"
     run_dump_test "x86-64-avx-vnni"
+    run_list_test "x86-64-avx-vnni-inval"
     run_dump_test "x86-64-clzero"
     run_dump_test "x86-64-mwaitx-bdver4"
     run_list_test "x86-64-mwaitx-reg"
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l
new file mode 100644
index 0000000000..e05764d95f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l
@@ -0,0 +1,5 @@
+.* Assembler messages:
+.*:6: Error: unsupported instruction `vpdpbusds'
+.*:7: Error: unsupported instruction `vpdpbusds'
+.*:10: Error: unsupported instruction `vpdpbusds'
+.*:11: Error: unsupported instruction `vpdpbusds'
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s
new file mode 100644
index 0000000000..8d165bc0f0
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s
@@ -0,0 +1,11 @@
+# Check illegal use of AVX-VNNI instructions
+
+	.text
+	.arch .noavx512_vnni
+_start:
+	vpdpbusds %xmm2, %xmm4, %xmm2
+	vpdpbusds %xmm22, %xmm4, %xmm2
+
+	.intel_syntax noprefix
+	vpdpbusds xmm2, xmm4, xmm2
+	vpdpbusds xmm2, xmm4, xmm22
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
index ebb0ebf02c..c4474739ed 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.d
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
@@ -11,8 +11,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 50 d6    	vpdpbusd %xmm22,%xmm4,%xmm2
@@ -20,8 +18,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 52 d6    	vpdpwssd %xmm22,%xmm4,%xmm2
@@ -29,8 +25,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 51 d6    	vpdpbusds %xmm22,%xmm4,%xmm2
@@ -38,8 +32,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 53 d6    	vpdpwssds %xmm22,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.s b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
index 7f47bf684b..95b6dc2ef3 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.s
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
@@ -4,10 +4,8 @@
 	\mnemonic	 %xmm12, %xmm4, %xmm2
 	{evex} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm12, %xmm4, %xmm2
-	{vex2} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex3} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic (%rcx), %xmm4, %xmm2
-	{vex2} \mnemonic (%rcx), %xmm4, %xmm2
 	{vex3} \mnemonic (%rcx), %xmm4, %xmm2
 	\mnemonic	 %xmm22, %xmm4, %xmm2
 .endm
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index fc42088638..c3f0181329 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -408,7 +408,7 @@ static initializer cpu_flag_init[] =
   { "CPU_ANY_AMX_TILE_FLAGS",
     "CpuAMX_TILE|CpuAMX_INT8|CpuAMX_BF16" },
   { "CPU_ANY_AVX_VNNI_FLAGS",
-    "CpuAVX_VNNI|CpuVEX_PREFIX" },
+    "CpuAVX_VNNI" },
   { "CPU_ANY_MOVDIRI_FLAGS",
     "CpuMOVDIRI" },
   { "CPU_ANY_UINTR_FLAGS",
@@ -637,7 +637,6 @@ static bitfield cpu_flags[] =
   BITFIELD (CpuAVX512_VP2INTERSECT),
   BITFIELD (CpuTDX),
   BITFIELD (CpuAVX_VNNI),
-  BITFIELD (CpuVEX_PREFIX),
   BITFIELD (CpuMWAITX),
   BITFIELD (CpuCLZERO),
   BITFIELD (CpuOSPKE),
@@ -708,6 +707,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (ImmExt),
   BITFIELD (NoRex64),
   BITFIELD (Ugh),
+  BITFIELD (PseudoVexPrefix),
   BITFIELD (Vex),
   BITFIELD (VexVVVV),
   BITFIELD (VexW),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 2e90c58421..ce2a1a5b47 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -214,8 +214,6 @@ enum
   CpuTDX,
   /* Intel AVX VNNI Instructions support required.  */
   CpuAVX_VNNI,
-  /* Intel AVX Instructions support via {vex} prefix required.  */
-  CpuVEX_PREFIX,
   /* mwaitx instruction required */
   CpuMWAITX,
   /* Clzero instruction required */
@@ -387,7 +385,6 @@ typedef union i386_cpu_flags
       unsigned int cpuavx512_vp2intersect:1;
       unsigned int cputdx:1;
       unsigned int cpuavx_vnni:1;
-      unsigned int cpuvex_prefix:1;
       unsigned int cpumwaitx:1;
       unsigned int cpuclzero:1;
       unsigned int cpuospke:1;
@@ -534,6 +531,8 @@ enum
   NoRex64,
   /* deprecated fp insn, gets a warning */
   Ugh,
+  /* Intel AVX Instructions support via {vex} prefix */
+  PseudoVexPrefix,
   /* insn has VEX prefix:
 	1: 128bit VEX prefix (or operand dependent).
 	2: 256bit VEX prefix.
@@ -739,6 +738,7 @@ typedef struct i386_opcode_modifier
   unsigned int immext:1;
   unsigned int norex64:1;
   unsigned int ugh:1;
+  unsigned int pseudovexprefix:1;
   unsigned int vex:2;
   unsigned int vexvvvv:2;
   unsigned int vexw:2;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 6745eff26c..56c2838991 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3906,11 +3906,11 @@ vpshrdw, 4, 0x6672, None, 1, CpuAVX512_VBMI2, Modrm|Masking=3|OpcodePrefix=2|Vex
 
 // AVX_VNNI instructions
 
-vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
-vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // AVX_VNNI instructions end
-- 
2.17.1

Thanks,
Lili.

new file mode 100644
index 0000000000..e05764d95f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l
@@ -0,0 +1,5 @@
+.* Assembler messages:
+.*:6: Error: unsupported instruction `vpdpbusds'
+.*:7: Error: unsupported instruction `vpdpbusds'
+.*:10: Error: unsupported instruction `vpdpbusds'
+.*:11: Error: unsupported instruction `vpdpbusds'
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s
new file mode 100644
index 0000000000..8d165bc0f0
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s
@@ -0,0 +1,11 @@
+# Check illegal in AVXVNNI instructions
+
+	.text
+	.arch .noavx512_vnni
+_start:
+	vpdpbusds %xmm2, %xmm4, %xmm2
+	vpdpbusds %xmm22, %xmm4, %xmm2
+
+	.intel_syntax noprefix
+	vpdpbusds xmm2, xmm4, xmm2
+	vpdpbusds xmm2, xmm4, xmm22
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
index ebb0ebf02c..c4474739ed 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.d
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
@@ -11,8 +11,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 50 d6    	vpdpbusd %xmm22,%xmm4,%xmm2
@@ -20,8 +18,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 52 d6    	vpdpwssd %xmm22,%xmm4,%xmm2
@@ -29,8 +25,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 51 d6    	vpdpbusds %xmm22,%xmm4,%xmm2
@@ -38,8 +32,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 53 d6    	vpdpwssds %xmm22,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.s b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
index 7f47bf684b..95b6dc2ef3 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.s
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
@@ -4,10 +4,8 @@
 	\mnemonic	 %xmm12, %xmm4, %xmm2
 	{evex} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm12, %xmm4, %xmm2
-	{vex2} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex3} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic (%rcx), %xmm4, %xmm2
-	{vex2} \mnemonic (%rcx), %xmm4, %xmm2
 	{vex3} \mnemonic (%rcx), %xmm4, %xmm2
 	\mnemonic	 %xmm22, %xmm4, %xmm2
 .endm
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index fc42088638..c3f0181329 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -408,7 +408,7 @@ static initializer cpu_flag_init[] =
   { "CPU_ANY_AMX_TILE_FLAGS",
     "CpuAMX_TILE|CpuAMX_INT8|CpuAMX_BF16" },
   { "CPU_ANY_AVX_VNNI_FLAGS",
-    "CpuAVX_VNNI|CpuVEX_PREFIX" },
+    "CpuAVX_VNNI" },
   { "CPU_ANY_MOVDIRI_FLAGS",
     "CpuMOVDIRI" },
   { "CPU_ANY_UINTR_FLAGS",
@@ -637,7 +637,6 @@ static bitfield cpu_flags[] =
   BITFIELD (CpuAVX512_VP2INTERSECT),
   BITFIELD (CpuTDX),
   BITFIELD (CpuAVX_VNNI),
-  BITFIELD (CpuVEX_PREFIX),
   BITFIELD (CpuMWAITX),
   BITFIELD (CpuCLZERO),
   BITFIELD (CpuOSPKE),
@@ -708,6 +707,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (ImmExt),
   BITFIELD (NoRex64),
   BITFIELD (Ugh),
+  BITFIELD (PseudoVexPrefix),
   BITFIELD (Vex),
   BITFIELD (VexVVVV),
   BITFIELD (VexW),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 2e90c58421..ce2a1a5b47 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -214,8 +214,6 @@ enum
   CpuTDX,
   /* Intel AVX VNNI Instructions support required.  */
   CpuAVX_VNNI,
-  /* Intel AVX Instructions support via {vex} prefix required.  */
-  CpuVEX_PREFIX,
   /* mwaitx instruction required */
   CpuMWAITX,
   /* Clzero instruction required */
@@ -387,7 +385,6 @@ typedef union i386_cpu_flags
       unsigned int cpuavx512_vp2intersect:1;
       unsigned int cputdx:1;
       unsigned int cpuavx_vnni:1;
-      unsigned int cpuvex_prefix:1;
       unsigned int cpumwaitx:1;
       unsigned int cpuclzero:1;
       unsigned int cpuospke:1;
@@ -534,6 +531,8 @@ enum
   NoRex64,
   /* deprecated fp insn, gets a warning */
   Ugh,
+  /* Intel AVX Instructions support via {vex} prefix */
+  PseudoVexPrefix,
   /* insn has VEX prefix:
 	1: 128bit VEX prefix (or operand dependent).
 	2: 256bit VEX prefix.
@@ -739,6 +738,7 @@ typedef struct i386_opcode_modifier
   unsigned int immext:1;
   unsigned int norex64:1;
   unsigned int ugh:1;
+  unsigned int pseudovexprefix:1;
   unsigned int vex:2;
   unsigned int vexvvvv:2;
   unsigned int vexw:2;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 6745eff26c..56c2838991 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3906,11 +3906,11 @@ vpshrdw, 4, 0x6672, None, 1, CpuAVX512_VBMI2, Modrm|Masking=3|OpcodePrefix=2|Vex
 
 // AVX_VNNI instructions
 
-vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
-vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // AVX_VNNI instructions end
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15  7:10     ` Cui, Lili
@ 2020-10-15  7:24       ` Jan Beulich
  2020-10-15 11:15         ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-15  7:24 UTC (permalink / raw)
  To: Cui, Lili; +Cc: H.J. Lu, binutils

On 15.10.2020 09:10, Cui, Lili wrote:
>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
>>>>        cpu = cpu_flags_and (x, cpu);
>>>>        if (!cpu_flags_all_zero (&cpu))
>>>>       {
>>>> -       if (x.bitfield.cpuavx)
>>>> +       if (x.bitfield.cpuvex_prefix)
>>>> +         {
>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
>>>> +           if (i.vec_encoding == vex_encoding_vex
>>>> +               || i.vec_encoding == vex_encoding_vex3)
>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
>>>> +         }
>>>> +       else if (x.bitfield.cpuavx)
>>>
>>> Is this (including the new cpuvex_prefix attribute, which imo
>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
>>> by placing the templates _after_ the AVX512 counterparts? Iirc
>>> templates get tried in order, and the first match wins. The {vex3}
>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
>> templates.
>>
>> Lili, please look into it.
>>
> 
> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
> 
> .arch .noavx512_vnni
> vpdpbusd %xmm2,%xmm4,%xmm2 
> 
> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix. 
> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
> and move it into opcode_modifier bit, thanks.

I disagree, unless AVX-VNNI was specified to have a dependency on
AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
that another reason for introducing these encodings may be to allow
their use on AVX512-incapable hardware). The above very much should
result in the VEX encoding despite the absence of a {vex} prefix.
It's really only the default case of everything being enabled where
the pseudo-prefix should be mandated. This particularly implies
that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
need for the pseudo prefix.
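
For reference, the contested case looks like this (a minimal sketch for
illustration; the error text and the byte sequence are the ones given in
the avx-vnni-inval.l and avx-vnni.d test expectations earlier in the
thread):

	.text
	.arch .noavx512_vnni
_start:
	# With the patch as posted, gas rejects the next line with
	# "Error: unsupported instruction `vpdpbusd'", since no {vex}
	# pseudo prefix is given.  Under the behaviour argued for above
	# it would instead assemble to the VEX form, i.e. the same
	# bytes as the line after it.
	vpdpbusd %xmm2, %xmm4, %xmm2
	# Explicit request for the AVX-VNNI (VEX) encoding; expected to
	# still assemble, producing c4 e2 59 50 d2.
	{vex} vpdpbusd %xmm2, %xmm4, %xmm2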

> Subject: [PATCH] Enhancement for avx-vnni patch

This title, and the fact that the original patch was already
committed, point out once again that a little bit of time should be
given for people to look at proposed changes. I typically wait at
least overnight (a day on the other side of the ocean) before I
commit changes that aren't sufficiently trivial, no matter how
quickly an ack arrives.

Irrespective of this - thanks for doing this incremental change.

> --- /dev/null
> +++ b/gas/testsuite/gas/i386/avx-vnni-inval.s
> @@ -0,0 +1,9 @@
> +# Check illegal in AVXVNNI instructions
> +
> +	.text
> +	.arch .noavx512_vnni
> +_start:
> +	vpdpbusd %xmm2,%xmm4,%xmm2
> +
> +	.intel_syntax noprefix
> +	vpdpbusd %xmm2,%xmm4,%xmm2

I question the need for Intel syntax tests in test cases like this
one.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15  7:24       ` Jan Beulich
@ 2020-10-15 11:15         ` H.J. Lu
  2020-10-15 11:45           ` Cui, Lili
  2020-10-15 12:28           ` Jan Beulich
  0 siblings, 2 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-15 11:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.10.2020 09:10, Cui, Lili wrote:
> >>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> >>>>        cpu = cpu_flags_and (x, cpu);
> >>>>        if (!cpu_flags_all_zero (&cpu))
> >>>>       {
> >>>> -       if (x.bitfield.cpuavx)
> >>>> +       if (x.bitfield.cpuvex_prefix)
> >>>> +         {
> >>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
> >>>> +           if (i.vec_encoding == vex_encoding_vex
> >>>> +               || i.vec_encoding == vex_encoding_vex3)
> >>>> +             match |= CPU_FLAGS_ARCH_MATCH;
> >>>> +         }
> >>>> +       else if (x.bitfield.cpuavx)
> >>>
> >>> Is this (including the new cpuvex_prefix attribute, which imo
> >>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
> >>> by placing the templates _after_ the AVX512 counterparts? Iirc
> >>> templates get tried in order, and the first match wins. The {vex3}
> >>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
> >> templates.
> >>
> >> Lili, please look into it.
> >>
> >
> > I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
> >
> > .arch .noavx512_vnni
> > vpdpbusd %xmm2,%xmm4,%xmm2
> >
> > As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
> > we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
> > and move it into opcode_modifier bit, thanks.
>
> I disagree, unless AVX-VNNI was specified to have a dependency on
> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> that another reason for introducing these encodings may be to allow
> their use on AVX512-incapable hardware). The above very much should
> result in the VEX encoding despite the absence of a {vex} prefix.
> It's really only the default case of everything being enabled where
> the pseudo-prefix should be mandated. This particularly implies
> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
> need for the pseudo prefix.

AVX VNNI always requires the {vex} prefix.  It isn't optional.
It is similar to

vmovdqu32 %xmm5, %xmm6

vs

vmovdqu %xmm5, %xmm6

It is the 32 suffix vs the {vex} prefix.
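
In assembler input the distinction looks like this (a minimal sketch;
the byte sequences are the ones listed in the avx-vnni.d test
expectations above):

	# bare mnemonic: AVX512_VNNI, EVEX encoding (62 f2 5d 08 50 d2)
	vpdpbusd %xmm2, %xmm4, %xmm2
	# mandatory pseudo prefix: AVX_VNNI, VEX encoding (c4 e2 59 50 d2)
	{vex} vpdpbusd %xmm2, %xmm4, %xmm2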

> > Subject: [PATCH] Enhancement for avx-vnni patch
>
> This title and the fact that the original patch was committed
> points out once again that a little bit of time should be given
> for people to look at proposed changes. I typically give it at
> least over night (day on the other side of the ocean) until I
> commit not sufficiently trivial changes, no matter how quickly an
> ack arrives.
>
> Irrespective of this - thanks for doing this incremental change.
>
> > --- /dev/null
> > +++ b/gas/testsuite/gas/i386/avx-vnni-inval.s
> > @@ -0,0 +1,9 @@
> > +# Check illegal in AVXVNNI instructions
> > +
> > +     .text
> > +     .arch .noavx512_vnni
> > +_start:
> > +     vpdpbusd %xmm2,%xmm4,%xmm2
> > +
> > +     .intel_syntax noprefix
> > +     vpdpbusd %xmm2,%xmm4,%xmm2
>
> I question the need for Intel syntax tests in test cases like this
> one.

Please only keep the AT&T syntax test.


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: x86: Support Intel AVX VNNI
  2020-10-15 11:15         ` H.J. Lu
@ 2020-10-15 11:45           ` Cui, Lili
  2020-10-16  2:05             ` H.J. Lu
  2020-10-15 12:28           ` Jan Beulich
  1 sibling, 1 reply; 44+ messages in thread
From: Cui, Lili @ 2020-10-15 11:45 UTC (permalink / raw)
  To: H.J. Lu, Jan Beulich; +Cc: binutils

[-- Attachment #1: Type: text/plain, Size: 2202 bytes --]

> > >>> Is this (including the new cpuvex_prefix attribute, which imo
> > >>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the
> > >>> same by placing the templates _after_ the AVX512 counterparts?
> > >>> Iirc templates get tried in order, and the first match wins. The
> > >>> {vex3} prefix would then prevent a match on the EVEX-encoded
> > >>> AVX512_VNNI
> > >> templates.
> > >>
> > >> Lili, please look into it.
> > >>
> > >
> > > I add an invalid test for it, we need cpuvex_prefix attribute for under
> scenario.
> > >
> > > .arch .noavx512_vnni
> > > vpdpbusd %xmm2,%xmm4,%xmm2
> > >
> > > As without the pseudo {vex} prefix, this instruction should be encoded
> with EVEX prefix.
> > > we should report error for it, I rename CpuVEX_PREFIX to
> > > PseudoVexPrefix and move it into opcode_modifier bit, thanks.
> >
> > I disagree, unless AVX-VNNI was specified to have a dependency on
> > AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> > that another reason for introducing these encodings may be to allow
> > their use on AVX512-incapable hardware). The above very much should
> > result in the VEX encoding despite the absence of a {vex} prefix.
> > It's really only the default case of everything being enabled where
> > the pseudo-prefix should be mandated. This particularly implies that
> > an explicit ".arch .avx_vnni" ought to _also_ eliminate the need for
> > the pseudo prefix.
> 
> AVX VNNI always requires the {vex} prefix.  It isn't optional.
> It is similar to
> 
> vmovdqu32 %xmm5, %xmm6
> 
> vs
> 
> vmovdqu %xmm5, %xmm6
> 
> It is the 32 suffix vs the {vex} prefix.
> 
> > > --- /dev/null
> > > +++ b/gas/testsuite/gas/i386/avx-vnni-inval.s
> > > @@ -0,0 +1,9 @@
> > > +# Check illegal in AVXVNNI instructions
> > > +
> > > +     .text
> > > +     .arch .noavx512_vnni
> > > +_start:
> > > +     vpdpbusd %xmm2,%xmm4,%xmm2
> > > +
> > > +     .intel_syntax noprefix
> > > +     vpdpbusd %xmm2,%xmm4,%xmm2
> >
> > I question the need for Intel syntax tests in test cases like this
> > one.
> 
> Please only keep the AT&T syntax test.

Done.
Thanks,
Lili.

> --
> H.J.

[-- Attachment #2: 0001-Enhancement-for-avx-vnni-patch.patch --]
[-- Type: application/octet-stream, Size: 16399 bytes --]

From 4ec744a744f7afe5a4173a77b6904fa407a800f6 Mon Sep 17 00:00:00 2001
From: "Cui,Lili" <lili.cui@intel.com>
Date: Thu, 15 Oct 2020 10:45:08 +0800
Subject: [PATCH] Enhancement for avx-vnni patch

1. Rename CpuVEX_PREFIX to PseudoVexPrefix and
   move it from cpu_flags to opcode_modifiers.
2. Delete {vex2} invalid test.
3. Use VexW0 and VexVVVV in the AVX-VNNI instructions.

gas/
	* config/tc-i386.c: Move Pseudo Prefix check to match_template.
	* testsuite/gas/i386/avx-vnni-inval.l: New file.
	* testsuite/gas/i386/avx-vnni-inval.s: Likewise.
	* testsuite/gas/i386/avx-vnni.d: Delete invalid {vex2} test.
	* testsuite/gas/i386/avx-vnni.s: Likewise.
	* testsuite/gas/i386/i386.exp: Add AVX VNNI invalid tests.
	* testsuite/gas/i386/x86-64-avx-vnni-inval.l: New file.
	* testsuite/gas/i386/x86-64-avx-vnni-inval.s: Likewise.
	* testsuite/gas/i386/x86-64-avx-vnni.d: Delete invalid {vex2} test.
	* testsuite/gas/i386/x86-64-avx-vnni.s: Likewise.

opcodes/
	* i386-opc.tbl: Rename CpuVEX_PREFIX to PseudoVexPrefix
	and move it from cpu_flags to opcode_modifiers.
	Use VexW0 and VexVVVV in the AVX-VNNI instructions.
	* i386-gen.c: Likewise.
	* i386-opc.h: Likewise.
	* i386-init.h: Regenerated.
	* i386-tbl.h: Likewise.
---
 gas/config/tc-i386.c                           | 16 ++++++++--------
 gas/testsuite/gas/i386/avx-vnni-inval.l        |  2 ++
 gas/testsuite/gas/i386/avx-vnni-inval.s        |  6 ++++++
 gas/testsuite/gas/i386/avx-vnni.d              |  8 --------
 gas/testsuite/gas/i386/avx-vnni.s              |  2 --
 gas/testsuite/gas/i386/i386.exp                |  2 ++
 gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l |  3 +++
 gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s |  7 +++++++
 gas/testsuite/gas/i386/x86-64-avx-vnni.d       |  8 --------
 gas/testsuite/gas/i386/x86-64-avx-vnni.s       |  2 --
 opcodes/i386-gen.c                             |  4 ++--
 opcodes/i386-opc.h                             |  6 +++---
 opcodes/i386-opc.tbl                           |  8 ++++----
 13 files changed, 37 insertions(+), 37 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/avx-vnni-inval.l
 create mode 100644 gas/testsuite/gas/i386/avx-vnni-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index a081064aba..487454f24a 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1973,14 +1973,7 @@ cpu_flags_match (const insn_template *t)
       cpu = cpu_flags_and (x, cpu);
       if (!cpu_flags_all_zero (&cpu))
 	{
-	  if (x.bitfield.cpuvex_prefix)
-	    {
-	      /* We need to check a few extra flags with VEX_PREFIX.  */
-	      if (i.vec_encoding == vex_encoding_vex
-		  || i.vec_encoding == vex_encoding_vex3)
-		match |= CPU_FLAGS_ARCH_MATCH;
-	    }
-	  else if (x.bitfield.cpuavx)
+	  if (x.bitfield.cpuavx)
 	    {
 	      /* We need to check a few extra flags with AVX.  */
 	      if (cpu.bitfield.cpuavx
@@ -6265,6 +6258,13 @@ match_template (char mnem_suffix)
       if (cpu_flags_match (t) != CPU_FLAGS_PERFECT_MATCH)
 	continue;
 
+      /* Check Pseudo Prefix.  */
+      i.error = unsupported;
+      if (t->opcode_modifier.pseudovexprefix
+	  && !(i.vec_encoding == vex_encoding_vex
+	      || i.vec_encoding == vex_encoding_vex3))
+	continue;
+
       /* Check AT&T mnemonic.   */
       i.error = unsupported_with_intel_mnemonic;
       if (intel_mnemonic && t->opcode_modifier.attmnemonic)
diff --git a/gas/testsuite/gas/i386/avx-vnni-inval.l b/gas/testsuite/gas/i386/avx-vnni-inval.l
new file mode 100644
index 0000000000..c55f1003f0
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-vnni-inval.l
@@ -0,0 +1,2 @@
+.* Assembler messages:
+.*:6: Error: unsupported instruction `vpdpbusd'
diff --git a/gas/testsuite/gas/i386/avx-vnni-inval.s b/gas/testsuite/gas/i386/avx-vnni-inval.s
new file mode 100644
index 0000000000..c06babff43
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-vnni-inval.s
@@ -0,0 +1,6 @@
+# Check illegal in AVXVNNI instructions
+
+	.text
+	.arch .noavx512_vnni
+_start:
+	vpdpbusd %xmm2,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/avx-vnni.d b/gas/testsuite/gas/i386/avx-vnni.d
index 6d6e779d6e..6e31528cf2 100644
--- a/gas/testsuite/gas/i386/avx-vnni.d
+++ b/gas/testsuite/gas/i386/avx-vnni.d
@@ -11,32 +11,24 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/avx-vnni.s b/gas/testsuite/gas/i386/avx-vnni.s
index 4ddc733040..b37bc85c3a 100644
--- a/gas/testsuite/gas/i386/avx-vnni.s
+++ b/gas/testsuite/gas/i386/avx-vnni.s
@@ -4,10 +4,8 @@
 	\mnemonic	%xmm2, %xmm4, %xmm2
 	{evex} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm2, %xmm4, %xmm2
-	{vex2} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex3} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex}  \mnemonic (%ecx), %xmm4, %xmm2
-	{vex2} \mnemonic (%ecx), %xmm4, %xmm2
 	{vex3} \mnemonic (%ecx), %xmm4, %xmm2
 .endm
 
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index f5727678e2..068813d77f 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -459,6 +459,7 @@ if [gas_32_check] then {
     run_dump_test "avx512_bf16_vl"
     run_list_test "avx512_bf16_vl-inval"
     run_dump_test "avx-vnni"
+    run_list_test "avx-vnni-inval"
     run_list_test "sg"
     run_dump_test "clzero"
     run_dump_test "disassem"
@@ -1077,6 +1078,7 @@ if [gas_64_check] then {
     run_dump_test "x86-64-avx512_bf16_vl"
     run_list_test "x86-64-avx512_bf16_vl-inval"
     run_dump_test "x86-64-avx-vnni"
+    run_list_test "x86-64-avx-vnni-inval"
     run_dump_test "x86-64-clzero"
     run_dump_test "x86-64-mwaitx-bdver4"
     run_list_test "x86-64-mwaitx-reg"
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l
new file mode 100644
index 0000000000..a276b3775b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.l
@@ -0,0 +1,3 @@
+.* Assembler messages:
+.*:6: Error: unsupported instruction `vpdpbusds'
+.*:7: Error: unsupported instruction `vpdpbusds'
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s
new file mode 100644
index 0000000000..f621ef4be2
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-inval.s
@@ -0,0 +1,7 @@
+# Check illegal in AVXVNNI instructions
+
+	.text
+	.arch .noavx512_vnni
+_start:
+	vpdpbusds %xmm2, %xmm4, %xmm2
+	vpdpbusds %xmm22, %xmm4, %xmm2
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
index ebb0ebf02c..c4474739ed 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.d
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
@@ -11,8 +11,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 50 d6    	vpdpbusd %xmm22,%xmm4,%xmm2
@@ -20,8 +18,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 52 d6    	vpdpwssd %xmm22,%xmm4,%xmm2
@@ -29,8 +25,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 51 d6    	vpdpbusds %xmm22,%xmm4,%xmm2
@@ -38,8 +32,6 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 53 d6    	vpdpwssds %xmm22,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.s b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
index 7f47bf684b..95b6dc2ef3 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.s
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
@@ -4,10 +4,8 @@
 	\mnemonic	 %xmm12, %xmm4, %xmm2
 	{evex} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm12, %xmm4, %xmm2
-	{vex2} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex3} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic (%rcx), %xmm4, %xmm2
-	{vex2} \mnemonic (%rcx), %xmm4, %xmm2
 	{vex3} \mnemonic (%rcx), %xmm4, %xmm2
 	\mnemonic	 %xmm22, %xmm4, %xmm2
 .endm
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index fc42088638..c3f0181329 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -408,7 +408,7 @@ static initializer cpu_flag_init[] =
   { "CPU_ANY_AMX_TILE_FLAGS",
     "CpuAMX_TILE|CpuAMX_INT8|CpuAMX_BF16" },
   { "CPU_ANY_AVX_VNNI_FLAGS",
-    "CpuAVX_VNNI|CpuVEX_PREFIX" },
+    "CpuAVX_VNNI" },
   { "CPU_ANY_MOVDIRI_FLAGS",
     "CpuMOVDIRI" },
   { "CPU_ANY_UINTR_FLAGS",
@@ -637,7 +637,6 @@ static bitfield cpu_flags[] =
   BITFIELD (CpuAVX512_VP2INTERSECT),
   BITFIELD (CpuTDX),
   BITFIELD (CpuAVX_VNNI),
-  BITFIELD (CpuVEX_PREFIX),
   BITFIELD (CpuMWAITX),
   BITFIELD (CpuCLZERO),
   BITFIELD (CpuOSPKE),
@@ -708,6 +707,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (ImmExt),
   BITFIELD (NoRex64),
   BITFIELD (Ugh),
+  BITFIELD (PseudoVexPrefix),
   BITFIELD (Vex),
   BITFIELD (VexVVVV),
   BITFIELD (VexW),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 2e90c58421..ce2a1a5b47 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -214,8 +214,6 @@ enum
   CpuTDX,
   /* Intel AVX VNNI Instructions support required.  */
   CpuAVX_VNNI,
-  /* Intel AVX Instructions support via {vex} prefix required.  */
-  CpuVEX_PREFIX,
   /* mwaitx instruction required */
   CpuMWAITX,
   /* Clzero instruction required */
@@ -387,7 +385,6 @@ typedef union i386_cpu_flags
       unsigned int cpuavx512_vp2intersect:1;
       unsigned int cputdx:1;
       unsigned int cpuavx_vnni:1;
-      unsigned int cpuvex_prefix:1;
       unsigned int cpumwaitx:1;
       unsigned int cpuclzero:1;
       unsigned int cpuospke:1;
@@ -534,6 +531,8 @@ enum
   NoRex64,
   /* deprecated fp insn, gets a warning */
   Ugh,
+  /* Intel AVX Instructions support via {vex} prefix */
+  PseudoVexPrefix,
   /* insn has VEX prefix:
 	1: 128bit VEX prefix (or operand dependent).
 	2: 256bit VEX prefix.
@@ -739,6 +738,7 @@ typedef struct i386_opcode_modifier
   unsigned int immext:1;
   unsigned int norex64:1;
   unsigned int ugh:1;
+  unsigned int pseudovexprefix:1;
   unsigned int vex:2;
   unsigned int vexvvvv:2;
   unsigned int vexw:2;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 6745eff26c..56c2838991 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3906,11 +3906,11 @@ vpshrdw, 4, 0x6672, None, 1, CpuAVX512_VBMI2, Modrm|Masking=3|OpcodePrefix=2|Vex
 
 // AVX_VNNI instructions
 
-vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbusd, 3, 0x6650, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssd, 3, 0x6652, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
-vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI|CpuVEX_PREFIX, Modrm|Vex|OpcodePrefix=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbusds, 3, 0x6651, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssds, 3, 0x6653, None, 1, CpuAVX_VNNI, Modrm|Vex|PseudoVexPrefix|OpcodePrefix=1|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // AVX_VNNI instructions end
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 11:15         ` H.J. Lu
  2020-10-15 11:45           ` Cui, Lili
@ 2020-10-15 12:28           ` Jan Beulich
  2020-10-15 12:38             ` H.J. Lu
  1 sibling, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-15 12:28 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 15.10.2020 13:15, H.J. Lu wrote:
> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 15.10.2020 09:10, Cui, Lili wrote:
>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
>>>>>>        cpu = cpu_flags_and (x, cpu);
>>>>>>        if (!cpu_flags_all_zero (&cpu))
>>>>>>       {
>>>>>> -       if (x.bitfield.cpuavx)
>>>>>> +       if (x.bitfield.cpuvex_prefix)
>>>>>> +         {
>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
>>>>>> +           if (i.vec_encoding == vex_encoding_vex
>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
>>>>>> +         }
>>>>>> +       else if (x.bitfield.cpuavx)
>>>>>
>>>>> Is this (including the new cpuvex_prefix attribute, which imo
>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
>>>>> templates get tried in order, and the first match wins. The {vex3}
>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
>>>> templates.
>>>>
>>>> Lili, please look into it.
>>>>
>>>
>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
>>>
>>> .arch .noavx512_vnni
>>> vpdpbusd %xmm2,%xmm4,%xmm2
>>>
>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
>>> and move it into opcode_modifier bit, thanks.
>>
>> I disagree, unless AVX-VNNI was specified to have a dependency on
>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
>> that another reason for introducing these encodings may be to allow
>> their use on AVX512-incapable hardware). The above very much should
>> result in the VEX encoding despite the absence of a {vex} prefix.
>> It's really only the default case of everything being enabled where
>> the pseudo-prefix should be mandated. This particularly implies
>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
>> need for the pseudo prefix.
> 
> AVX VNNI always requires the {vex} prefix.  It isn't optional.

That's said or written where? These are new insns with - afaict - no
specification beyond the ISA extensions doc. There's nothing like
that said there afaics.

> It is similar to
> 
> vmovdqu32 %xmm5, %xmm6
> 
> vs
> 
> vmovdqu %xmm5, %xmm6
> 
> It is the 32 suffix vs the {vex} prefix.

I don't see the similarity. The 32 / 64 suffix in the EVEX encoding
controls EVEX.W. There's nothing similar here.
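
Spelled out, the comparison being made is (a minimal sketch; the
vmovdqu forms are the ones quoted above, with vmovdqu64 added here for
the W=1 case):

	vmovdqu   %xmm5, %xmm6	# VEX only - no EVEX form of this mnemonic
	vmovdqu32 %xmm5, %xmm6	# EVEX, EVEX.W = 0
	vmovdqu64 %xmm5, %xmm6	# EVEX, EVEX.W = 1
	# whereas vpdpbusd vs {vex} vpdpbusd selects between two
	# encodings of what is otherwise the same operation.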

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 12:28           ` Jan Beulich
@ 2020-10-15 12:38             ` H.J. Lu
  2020-10-15 15:22               ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-15 12:38 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.10.2020 13:15, H.J. Lu wrote:
> > On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 15.10.2020 09:10, Cui, Lili wrote:
> >>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> >>>>>>        cpu = cpu_flags_and (x, cpu);
> >>>>>>        if (!cpu_flags_all_zero (&cpu))
> >>>>>>       {
> >>>>>> -       if (x.bitfield.cpuavx)
> >>>>>> +       if (x.bitfield.cpuvex_prefix)
> >>>>>> +         {
> >>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
> >>>>>> +           if (i.vec_encoding == vex_encoding_vex
> >>>>>> +               || i.vec_encoding == vex_encoding_vex3)
> >>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
> >>>>>> +         }
> >>>>>> +       else if (x.bitfield.cpuavx)
> >>>>>
> >>>>> Is this (including the new cpuvex_prefix attribute, which imo
> >>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
> >>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
> >>>>> templates get tried in order, and the first match wins. The {vex3}
> >>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
> >>>> templates.
> >>>>
> >>>> Lili, please look into it.
> >>>>
> >>>
> >>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
> >>>
> >>> .arch .noavx512_vnni
> >>> vpdpbusd %xmm2,%xmm4,%xmm2
> >>>
> >>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
> >>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
> >>> and move it into opcode_modifier bit, thanks.
> >>
> >> I disagree, unless AVX-VNNI was specified to have a dependency on
> >> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> >> that another reason for introducing these encodings may be to allow
> >> their use on AVX512-incapable hardware). The above very much should
> >> result in the VEX encoding despite the absence of a {vex} prefix.
> >> It's really only the default case of everything being enabled where
> >> the pseudo-prefix should be mandated. This particularly implies
> >> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
> >> need for the pseudo prefix.
> >
> > AVX VNNI always requires the {vex} prefix.  It isn't optional.
>
> That's said or written where? These are new insns with - afaict - no
> specification beyond the ISA extensions doc. There's nothing like

This is true.  When we implemented AVX VNNI, we decided that
the {vex} prefix is mandatory so that

vpdpbusd %xmm2,%xmm4,%xmm2

always means the EVEX encoding.

> that said there afaics.
>
> > It is similar to
> >
> > vmovdqu32 %xmm5, %xmm6
> >
> > vs
> >
> > vmovdqu %xmm5, %xmm6
> >
> > It is the 32 suffix vs the {vex} prefix.
>
> I don't see the similarity. the 32 / 64 suffix in the EVEX encoding
> controls EVEX.W. There's nothing similar here.
>

There are no EVEX vmovdqu instructions, just like there are no
AVX VNNI without {vex}.

-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 12:38             ` H.J. Lu
@ 2020-10-15 15:22               ` Jan Beulich
  2020-10-15 15:23                 ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-15 15:22 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 15.10.2020 14:38, H.J. Lu wrote:
> On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 15.10.2020 13:15, H.J. Lu wrote:
>>> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 15.10.2020 09:10, Cui, Lili wrote:
>>>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
>>>>>>>>        cpu = cpu_flags_and (x, cpu);
>>>>>>>>        if (!cpu_flags_all_zero (&cpu))
>>>>>>>>       {
>>>>>>>> -       if (x.bitfield.cpuavx)
>>>>>>>> +       if (x.bitfield.cpuvex_prefix)
>>>>>>>> +         {
>>>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
>>>>>>>> +           if (i.vec_encoding == vex_encoding_vex
>>>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
>>>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
>>>>>>>> +         }
>>>>>>>> +       else if (x.bitfield.cpuavx)
>>>>>>>
>>>>>>> Is this (including the new cpuvex_prefix attribute, which imo
>>>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
>>>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
>>>>>>> templates get tried in order, and the first match wins. The {vex3}
>>>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
>>>>>> templates.
>>>>>>
>>>>>> Lili, please look into it.
>>>>>>
>>>>>
>>>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
>>>>>
>>>>> .arch .noavx512_vnni
>>>>> vpdpbusd %xmm2,%xmm4,%xmm2
>>>>>
>>>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
>>>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
>>>>> and move it into opcode_modifier bit, thanks.
>>>>
>>>> I disagree, unless AVX-VNNI was specified to have a dependency on
>>>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
>>>> that another reason for introducing these encodings may be to allow
>>>> their use on AVX512-incapable hardware). The above very much should
>>>> result in the VEX encoding despite the absence of a {vex} prefix.
>>>> It's really only the default case of everything being enabled where
>>>> the pseudo-prefix should be mandated. This particularly implies
>>>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
>>>> need for the pseudo prefix.
>>>
>>> AVX VNNI always requires the {vex} prefix.  It isn't optional.
>>
>> That's said or written where? These are new insns with - afaict - no
>> specification beyond the ISA extensions doc. There's nothing like
> 
> This is true.  When we implemented AVX VNNI, we decided that
> the {vex} prefix is mandatory so that
> 
> vpdpbusd %xmm2,%xmm4,%xmm2
> 
> always mean EVEX encoding.

And this decision was discussed internally at Intel, and other
community members get no say at all?

>> that said there afaics.
>>
>>> It is similar to
>>>
>>> vmovdqu32 %xmm5, %xmm6
>>>
>>> vs
>>>
>>> vmovdqu %xmm5, %xmm6
>>>
>>> It is the 32 suffix vs the {vex} prefix.
>>
>> I don't see the similarity. the 32 / 64 suffix in the EVEX encoding
>> controls EVEX.W. There's nothing similar here.
>>
> 
> There are no EVEX vmovdqu instructions,

Right, another reason why the comparison isn't a helpful one.

Jan

> just like there are no
> AVX VNNI without {vex}.
> 


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 15:22               ` Jan Beulich
@ 2020-10-15 15:23                 ` H.J. Lu
  2020-10-15 15:26                   ` H.J. Lu
  2020-10-15 15:28                   ` Jan Beulich
  0 siblings, 2 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-15 15:23 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Thu, Oct 15, 2020 at 8:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.10.2020 14:38, H.J. Lu wrote:
> > On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 15.10.2020 13:15, H.J. Lu wrote:
> >>> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> On 15.10.2020 09:10, Cui, Lili wrote:
> >>>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> >>>>>>>>        cpu = cpu_flags_and (x, cpu);
> >>>>>>>>        if (!cpu_flags_all_zero (&cpu))
> >>>>>>>>       {
> >>>>>>>> -       if (x.bitfield.cpuavx)
> >>>>>>>> +       if (x.bitfield.cpuvex_prefix)
> >>>>>>>> +         {
> >>>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
> >>>>>>>> +           if (i.vec_encoding == vex_encoding_vex
> >>>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
> >>>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
> >>>>>>>> +         }
> >>>>>>>> +       else if (x.bitfield.cpuavx)
> >>>>>>>
> >>>>>>> Is this (including the new cpuvex_prefix attribute, which imo
> >>>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
> >>>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
> >>>>>>> templates get tried in order, and the first match wins. The {vex3}
> >>>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
> >>>>>> templates.
> >>>>>>
> >>>>>> Lili, please look into it.
> >>>>>>
> >>>>>
> >>>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
> >>>>>
> >>>>> .arch .noavx512_vnni
> >>>>> vpdpbusd %xmm2,%xmm4,%xmm2
> >>>>>
> >>>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
> >>>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
> >>>>> and move it into opcode_modifier bit, thanks.
> >>>>
> >>>> I disagree, unless AVX-VNNI was specified to have a dependency on
> >>>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> >>>> that another reason for introducing these encodings may be to allow
> >>>> their use on AVX512-incapable hardware). The above very much should
> >>>> result in the VEX encoding despite the absence of a {vex} prefix.
> >>>> It's really only the default case of everything being enabled where
> >>>> the pseudo-prefix should be mandated. This particularly implies
> >>>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
> >>>> need for the pseudo prefix.
> >>>
> >>> AVX VNNI always requires the {vex} prefix.  It isn't optional.
> >>
> >> That's said or written where? These are new insns with - afaict - no
> >> specification beyond the ISA extensions doc. There's nothing like
> >
> > This is true.  When we implemented AVX VNNI, we decided that
> > the {vex} prefix is mandatory so that
> >
> > vpdpbusd %xmm2,%xmm4,%xmm2
> >
> > always mean EVEX encoding.
>
> And this decision was discussed internally at Intel, and other

Internally.

> community members get no say at all?
>
> >> that said there afaics.
> >>
> >>> It is similar to
> >>>
> >>> vmovdqu32 %xmm5, %xmm6
> >>>
> >>> vs
> >>>
> >>> vmovdqu %xmm5, %xmm6
> >>>
> >>> It is the 32 suffix vs the {vex} prefix.
> >>
> >> I don't see the similarity. the 32 / 64 suffix in the EVEX encoding
> >> controls EVEX.W. There's nothing similar here.
> >>
> >
> > There are no EVEX vmovdqu instructions,
>
> Right, another reason why the comparison isn't a helpful one.
>
> Jan
>
> > just like there are no
> > AVX VNNI without {vex}.
> >
>


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 15:23                 ` H.J. Lu
@ 2020-10-15 15:26                   ` H.J. Lu
  2020-10-15 15:28                   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-15 15:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Thu, Oct 15, 2020 at 8:23 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Thu, Oct 15, 2020 at 8:22 AM Jan Beulich <jbeulich@suse.com> wrote:
> >
> > On 15.10.2020 14:38, H.J. Lu wrote:
> > > On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
> > >>
> > >> On 15.10.2020 13:15, H.J. Lu wrote:
> > >>> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
> > >>>> On 15.10.2020 09:10, Cui, Lili wrote:
> > >>>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> > >>>>>>>>        cpu = cpu_flags_and (x, cpu);
> > >>>>>>>>        if (!cpu_flags_all_zero (&cpu))
> > >>>>>>>>       {
> > >>>>>>>> -       if (x.bitfield.cpuavx)
> > >>>>>>>> +       if (x.bitfield.cpuvex_prefix)
> > >>>>>>>> +         {
> > >>>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
> > >>>>>>>> +           if (i.vec_encoding == vex_encoding_vex
> > >>>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
> > >>>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
> > >>>>>>>> +         }
> > >>>>>>>> +       else if (x.bitfield.cpuavx)
> > >>>>>>>
> > >>>>>>> Is this (including the new cpuvex_prefix attribute, which imo
> > >>>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
> > >>>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
> > >>>>>>> templates get tried in order, and the first match wins. The {vex3}
> > >>>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
> > >>>>>> templates.
> > >>>>>>
> > >>>>>> Lili, please look into it.
> > >>>>>>
> > >>>>>
> > >>>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
> > >>>>>
> > >>>>> .arch .noavx512_vnni
> > >>>>> vpdpbusd %xmm2,%xmm4,%xmm2
> > >>>>>
> > >>>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
> > >>>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
> > >>>>> and move it into opcode_modifier bit, thanks.
> > >>>>
> > >>>> I disagree, unless AVX-VNNI was specified to have a dependency on
> > >>>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> > >>>> that another reason for introducing these encodings may be to allow
> > >>>> their use on AVX512-incapable hardware). The above very much should
> > >>>> result in the VEX encoding despite the absence of a {vex} prefix.
> > >>>> It's really only the default case of everything being enabled where
> > >>>> the pseudo-prefix should be mandated. This particularly implies
> > >>>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
> > >>>> need for the pseudo prefix.
> > >>>
> > >>> AVX VNNI always requires the {vex} prefix.  It isn't optional.
> > >>
> > >> That's said or written where? These are new insns with - afaict - no
> > >> specification beyond the ISA extensions doc. There's nothing like
> > >
> > > This is true.  When we implemented AVX VNNI, we decided that
> > > the {vex} prefix is mandatory so that
> > >
> > > vpdpbusd %xmm2,%xmm4,%xmm2
> > >
> > > always mean EVEX encoding.
> >
> > And this decision was discussed internally at Intel, and other
>
> Internally.
>

We considered different mnemonics for AVX VNNI.  The final
decision was to use the mandatory {vex} prefix.


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 15:23                 ` H.J. Lu
  2020-10-15 15:26                   ` H.J. Lu
@ 2020-10-15 15:28                   ` Jan Beulich
  2020-10-15 15:34                     ` H.J. Lu
  1 sibling, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-15 15:28 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 15.10.2020 17:23, H.J. Lu wrote:
> On Thu, Oct 15, 2020 at 8:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 15.10.2020 14:38, H.J. Lu wrote:
>>> On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 15.10.2020 13:15, H.J. Lu wrote:
>>>>> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 15.10.2020 09:10, Cui, Lili wrote:
>>>>>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
>>>>>>>>>>        cpu = cpu_flags_and (x, cpu);
>>>>>>>>>>        if (!cpu_flags_all_zero (&cpu))
>>>>>>>>>>       {
>>>>>>>>>> -       if (x.bitfield.cpuavx)
>>>>>>>>>> +       if (x.bitfield.cpuvex_prefix)
>>>>>>>>>> +         {
>>>>>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
>>>>>>>>>> +           if (i.vec_encoding == vex_encoding_vex
>>>>>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
>>>>>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
>>>>>>>>>> +         }
>>>>>>>>>> +       else if (x.bitfield.cpuavx)
>>>>>>>>>
>>>>>>>>> Is this (including the new cpuvex_prefix attribute, which imo
>>>>>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
>>>>>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
>>>>>>>>> templates get tried in order, and the first match wins. The {vex3}
>>>>>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
>>>>>>>> templates.
>>>>>>>>
>>>>>>>> Lili, please look into it.
>>>>>>>>
>>>>>>>
>>>>>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
>>>>>>>
>>>>>>> .arch .noavx512_vnni
>>>>>>> vpdpbusd %xmm2,%xmm4,%xmm2
>>>>>>>
>>>>>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
>>>>>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
>>>>>>> and move it into opcode_modifier bit, thanks.
>>>>>>
>>>>>> I disagree, unless AVX-VNNI was specified to have a dependency on
>>>>>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
>>>>>> that another reason for introducing these encodings may be to allow
>>>>>> their use on AVX512-incapable hardware). The above very much should
>>>>>> result in the VEX encoding despite the absence of a {vex} prefix.
>>>>>> It's really only the default case of everything being enabled where
>>>>>> the pseudo-prefix should be mandated. This particularly implies
>>>>>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
>>>>>> need for the pseudo prefix.
>>>>>
>>>>> AVX VNNI always requires the {vex} prefix.  It isn't optional.
>>>>
>>>> That's said or written where? These are new insns with - afaict - no
>>>> specification beyond the ISA extensions doc. There's nothing like
>>>
>>> This is true.  When we implemented AVX VNNI, we decided that
>>> the {vex} prefix is mandatory so that
>>>
>>> vpdpbusd %xmm2,%xmm4,%xmm2
>>>
>>> always mean EVEX encoding.
>>
>> And this decision was discussed internally at Intel, and other
> 
> Internally.
> 
>> community members get no say at all?

Can such discussions please be held in public for open source projects?
I continue to think that the behavior as implemented is not the best
possible choice. Therefore I'd like to at least hear the arguments that
led to this decision.

Thanks, Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 15:28                   ` Jan Beulich
@ 2020-10-15 15:34                     ` H.J. Lu
  2020-10-15 16:04                       ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-15 15:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Thu, Oct 15, 2020 at 8:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.10.2020 17:23, H.J. Lu wrote:
> > On Thu, Oct 15, 2020 at 8:22 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 15.10.2020 14:38, H.J. Lu wrote:
> >>> On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 15.10.2020 13:15, H.J. Lu wrote:
> >>>>> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>> On 15.10.2020 09:10, Cui, Lili wrote:
> >>>>>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> >>>>>>>>>>        cpu = cpu_flags_and (x, cpu);
> >>>>>>>>>>        if (!cpu_flags_all_zero (&cpu))
> >>>>>>>>>>       {
> >>>>>>>>>> -       if (x.bitfield.cpuavx)
> >>>>>>>>>> +       if (x.bitfield.cpuvex_prefix)
> >>>>>>>>>> +         {
> >>>>>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
> >>>>>>>>>> +           if (i.vec_encoding == vex_encoding_vex
> >>>>>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
> >>>>>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
> >>>>>>>>>> +         }
> >>>>>>>>>> +       else if (x.bitfield.cpuavx)
> >>>>>>>>>
> >>>>>>>>> Is this (including the new cpuvex_prefix attribute, which imo
> >>>>>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
> >>>>>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
> >>>>>>>>> templates get tried in order, and the first match wins. The {vex3}
> >>>>>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
> >>>>>>>> templates.
> >>>>>>>>
> >>>>>>>> Lili, please look into it.
> >>>>>>>>
> >>>>>>>
> >>>>>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
> >>>>>>>
> >>>>>>> .arch .noavx512_vnni
> >>>>>>> vpdpbusd %xmm2,%xmm4,%xmm2
> >>>>>>>
> >>>>>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
> >>>>>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
> >>>>>>> and move it into opcode_modifier bit, thanks.
> >>>>>>
> >>>>>> I disagree, unless AVX-VNNI was specified to have a dependency on
> >>>>>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> >>>>>> that another reason for introducing these encodings may be to allow
> >>>>>> their use on AVX512-incapable hardware). The above very much should
> >>>>>> result in the VEX encoding despite the absence of a {vex} prefix.
> >>>>>> It's really only the default case of everything being enabled where
> >>>>>> the pseudo-prefix should be mandated. This particularly implies
> >>>>>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
> >>>>>> need for the pseudo prefix.
> >>>>>
> >>>>> AVX VNNI always requires the {vex} prefix.  It isn't optional.
> >>>>
> >>>> That's said or written where? These are new insns with - afaict - no
> >>>> specification beyond the ISA extensions doc. There's nothing like
> >>>
> >>> This is true.  When we implemented AVX VNNI, we decided that
> >>> the {vex} prefix is mandatory so that
> >>>
> >>> vpdpbusd %xmm2,%xmm4,%xmm2
> >>>
> >>> always mean EVEX encoding.
> >>
> >> And this decision was discussed internally at Intel, and other
> >
> > Internally.
> >
> >> community members get no say at all?
>
> Please can such discussions be had in the public for open source projects?

The discussion happened around the time when I added

commit 86fa6981e7487e2c2df4337aa75ed2d93c32eaf2
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Mar 9 09:58:46 2017 -0800

    X86: Add pseudo prefixes to control encoding

    Many x86 instructions have more than one encodings.  Assembler picks
    the default one, usually the shortest one.  Although the ".s", ".d8"
    and ".d32" suffixes can be used to swap register operands or specify
    displacement size, they aren't very flexible.  This patch adds pseudo
    prefixes, {xxx}, to control instruction encoding.  The available
    pseudo prefixes are {disp8}, {disp32}, {load}, {store}, {vex2}, {vex3}
    and {evex}.  Pseudo prefixes are preferred over the ".s", ".d8" and
    ".d32" suffixes, which are deprecated.

It wasn't practical to discuss it in public.
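
As a rough sketch of how those pseudo prefixes are used (my own
examples, not taken from the testsuite; the exact insns don't matter):

        {load}  mov %eax, %ebx                  # force the load form (opcode 8b) of reg-to-reg mov
        {store} mov %eax, %ebx                  # force the store form (opcode 89)
        {disp32} mov %eax, 4(%ebp)              # use a 32-bit displacement even though disp8 would fit
        {vex3}  vandps %xmm1, %xmm2, %xmm3      # force the 3-byte VEX prefix
        {evex}  vaddps %xmm1, %xmm2, %xmm3      # force EVEX encoding (EVEX.128 needs AVX512VL at run time)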

> I continue to think that the behavior as implemented is not the best
> possible choice. Therefore I'd like to at least hear the arguments that
> led to this decision.
>

Please send me your detailed comments.  I will forward them to our
internal group.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 15:34                     ` H.J. Lu
@ 2020-10-15 16:04                       ` Jan Beulich
  2020-10-15 16:15                         ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-15 16:04 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 15.10.2020 17:34, H.J. Lu wrote:
> On Thu, Oct 15, 2020 at 8:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 15.10.2020 17:23, H.J. Lu wrote:
>>> On Thu, Oct 15, 2020 at 8:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 15.10.2020 14:38, H.J. Lu wrote:
>>>>> On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>
>>>>>> On 15.10.2020 13:15, H.J. Lu wrote:
>>>>>>> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>> On 15.10.2020 09:10, Cui, Lili wrote:
>>>>>>>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
>>>>>>>>>>>>        cpu = cpu_flags_and (x, cpu);
>>>>>>>>>>>>        if (!cpu_flags_all_zero (&cpu))
>>>>>>>>>>>>       {
>>>>>>>>>>>> -       if (x.bitfield.cpuavx)
>>>>>>>>>>>> +       if (x.bitfield.cpuvex_prefix)
>>>>>>>>>>>> +         {
>>>>>>>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
>>>>>>>>>>>> +           if (i.vec_encoding == vex_encoding_vex
>>>>>>>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
>>>>>>>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
>>>>>>>>>>>> +         }
>>>>>>>>>>>> +       else if (x.bitfield.cpuavx)
>>>>>>>>>>>
>>>>>>>>>>> Is this (including the new cpuvex_prefix attribute, which imo
>>>>>>>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
>>>>>>>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
>>>>>>>>>>> templates get tried in order, and the first match wins. The {vex3}
>>>>>>>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
>>>>>>>>>> templates.
>>>>>>>>>>
>>>>>>>>>> Lili, please look into it.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
>>>>>>>>>
>>>>>>>>> .arch .noavx512_vnni
>>>>>>>>> vpdpbusd %xmm2,%xmm4,%xmm2
>>>>>>>>>
>>>>>>>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
>>>>>>>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
>>>>>>>>> and move it into opcode_modifier bit, thanks.
>>>>>>>>
>>>>>>>> I disagree, unless AVX-VNNI was specified to have a dependency on
>>>>>>>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
>>>>>>>> that another reason for introducing these encodings may be to allow
>>>>>>>> their use on AVX512-incapable hardware). The above very much should
>>>>>>>> result in the VEX encoding despite the absence of a {vex} prefix.
>>>>>>>> It's really only the default case of everything being enabled where
>>>>>>>> the pseudo-prefix should be mandated. This particularly implies
>>>>>>>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
>>>>>>>> need for the pseudo prefix.
>>>>>>>
>>>>>>> AVX VNNI always requires the {vex} prefix.  It isn't optional.
>>>>>>
>>>>>> That's said or written where? These are new insns with - afaict - no
>>>>>> specification beyond the ISA extensions doc. There's nothing like
>>>>>
>>>>> This is true.  When we implemented AVX VNNI, we decided that
>>>>> the {vex} prefix is mandatory so that
>>>>>
>>>>> vpdpbusd %xmm2,%xmm4,%xmm2
>>>>>
>>>>> always mean EVEX encoding.
>>>>
>>>> And this decision was discussed internally at Intel, and other
>>>
>>> Internally.
>>>
>>>> community members get no say at all?
>>
>> Please can such discussions be had in the public for open source projects?
> 
> The discussion happened around the time when I added
> 
> commit 86fa6981e7487e2c2df4337aa75ed2d93c32eaf2
> Author: H.J. Lu <hjl.tools@gmail.com>
> Date:   Thu Mar 9 09:58:46 2017 -0800
> 
>     X86: Add pseudo prefixes to control encoding
> 
>     Many x86 instructions have more than one encodings.  Assembler picks
>     the default one, usually the shortest one.  Although the ".s", ".d8"
>     and ".d32" suffixes can be used to swap register operands or specify
>     displacement size, they aren't very flexible.  This patch adds pseudo
>     prefixes, {xxx}, to control instruction encoding.  The available
>     pseudo prefixes are {disp8}, {disp32}, {load}, {store}, {vex2}, {vex3}
>     and {evex}.  Pseudo prefixes are preferred over the ".s", ".d8" and
>     ".d32" suffixes, which are deprecated.
> 
> It wasn't practical to discuss it in public.

Wow.

>> I continue to think that the behavior as implemented is not the best
>> possible choice. Therefore I'd like to at least hear the arguments that
>> led to this decision.
> 
> Please send me your detailed comments.  I will forward it to our internal
> group.

I've given my two points already - there are two cases where the
pseudo prefix shouldn't be required. Plus, as also said, the
disassembler shouldn't display it by default.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 16:04                       ` Jan Beulich
@ 2020-10-15 16:15                         ` H.J. Lu
  2020-10-16  6:10                           ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-15 16:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Thu, Oct 15, 2020 at 9:04 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.10.2020 17:34, H.J. Lu wrote:
> > On Thu, Oct 15, 2020 at 8:28 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 15.10.2020 17:23, H.J. Lu wrote:
> >>> On Thu, Oct 15, 2020 at 8:22 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 15.10.2020 14:38, H.J. Lu wrote:
> >>>>> On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>
> >>>>>> On 15.10.2020 13:15, H.J. Lu wrote:
> >>>>>>> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>> On 15.10.2020 09:10, Cui, Lili wrote:
> >>>>>>>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> >>>>>>>>>>>>        cpu = cpu_flags_and (x, cpu);
> >>>>>>>>>>>>        if (!cpu_flags_all_zero (&cpu))
> >>>>>>>>>>>>       {
> >>>>>>>>>>>> -       if (x.bitfield.cpuavx)
> >>>>>>>>>>>> +       if (x.bitfield.cpuvex_prefix)
> >>>>>>>>>>>> +         {
> >>>>>>>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
> >>>>>>>>>>>> +           if (i.vec_encoding == vex_encoding_vex
> >>>>>>>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
> >>>>>>>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
> >>>>>>>>>>>> +         }
> >>>>>>>>>>>> +       else if (x.bitfield.cpuavx)
> >>>>>>>>>>>
> >>>>>>>>>>> Is this (including the new cpuvex_prefix attribute, which imo
> >>>>>>>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
> >>>>>>>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
> >>>>>>>>>>> templates get tried in order, and the first match wins. The {vex3}
> >>>>>>>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
> >>>>>>>>>> templates.
> >>>>>>>>>>
> >>>>>>>>>> Lili, please look into it.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
> >>>>>>>>>
> >>>>>>>>> .arch .noavx512_vnni
> >>>>>>>>> vpdpbusd %xmm2,%xmm4,%xmm2
> >>>>>>>>>
> >>>>>>>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
> >>>>>>>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
> >>>>>>>>> and move it into opcode_modifier bit, thanks.
> >>>>>>>>
> >>>>>>>> I disagree, unless AVX-VNNI was specified to have a dependency on
> >>>>>>>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> >>>>>>>> that another reason for introducing these encodings may be to allow
> >>>>>>>> their use on AVX512-incapable hardware). The above very much should
> >>>>>>>> result in the VEX encoding despite the absence of a {vex} prefix.
> >>>>>>>> It's really only the default case of everything being enabled where
> >>>>>>>> the pseudo-prefix should be mandated. This particularly implies
> >>>>>>>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
> >>>>>>>> need for the pseudo prefix.
> >>>>>>>
> >>>>>>> AVX VNNI always requires the {vex} prefix.  It isn't optional.
> >>>>>>
> >>>>>> That's said or written where? These are new insns with - afaict - no
> >>>>>> specification beyond the ISA extensions doc. There's nothing like
> >>>>>
> >>>>> This is true.  When we implemented AVX VNNI, we decided that
> >>>>> the {vex} prefix is mandatory so that
> >>>>>
> >>>>> vpdpbusd %xmm2,%xmm4,%xmm2
> >>>>>
> >>>>> always mean EVEX encoding.
> >>>>
> >>>> And this decision was discussed internally at Intel, and other
> >>>
> >>> Internally.
> >>>
> >>>> community members get no say at all?
> >>
> >> Please can such discussions be had in the public for open source projects?
> >
> > The discussion happened around the time when I added
> >
> > commit 86fa6981e7487e2c2df4337aa75ed2d93c32eaf2
> > Author: H.J. Lu <hjl.tools@gmail.com>
> > Date:   Thu Mar 9 09:58:46 2017 -0800
> >
> >     X86: Add pseudo prefixes to control encoding
> >
> >     Many x86 instructions have more than one encodings.  Assembler picks
> >     the default one, usually the shortest one.  Although the ".s", ".d8"
> >     and ".d32" suffixes can be used to swap register operands or specify
> >     displacement size, they aren't very flexible.  This patch adds pseudo
> >     prefixes, {xxx}, to control instruction encoding.  The available
> >     pseudo prefixes are {disp8}, {disp32}, {load}, {store}, {vex2}, {vex3}
> >     and {evex}.  Pseudo prefixes are preferred over the ".s", ".d8" and
> >     ".d32" suffixes, which are deprecated.
> >
> > It wasn't practical to discuss it in public.
>
> Wow.
>
> >> I continue to think that the behavior as implemented is not the best
> >> possible choice. Therefore I'd like to at least hear the arguments that
> >> led to this decision.
> >
> > Please send me your detailed comments.  I will forward it to our internal
> > group.
>
> I've given my two points already - there are two cases where the
> pseudo prefix shouldn't be required. Plus, as also said, the
> disassembler shouldn't display it by default.
>

It will take much more than that to have any impact on a
decision made years ago.


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 11:45           ` Cui, Lili
@ 2020-10-16  2:05             ` H.J. Lu
  0 siblings, 0 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-16  2:05 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Jan Beulich, binutils

On Thu, Oct 15, 2020 at 4:46 AM Cui, Lili <lili.cui@intel.com> wrote:
>
> > > >>> Is this (including the new cpuvex_prefix attribute, which imo
> > > >>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the
> > > >>> same by placing the templates _after_ the AVX512 counterparts?
> > > >>> Iirc templates get tried in order, and the first match wins. The
> > > >>> {vex3} prefix would then prevent a match on the EVEX-encoded
> > > >>> AVX512_VNNI
> > > >> templates.
> > > >>
> > > >> Lili, please look into it.
> > > >>
> > > >
> > > > I add an invalid test for it, we need cpuvex_prefix attribute for under
> > scenario.
> > > >
> > > > .arch .noavx512_vnni
> > > > vpdpbusd %xmm2,%xmm4,%xmm2
> > > >
> > > > As without the pseudo {vex} prefix, this instruction should be encoded
> > with EVEX prefix.
> > > > we should report error for it, I rename CpuVEX_PREFIX to
> > > > PseudoVexPrefix and move it into opcode_modifier bit, thanks.
> > >
> > > I disagree, unless AVX-VNNI was specified to have a dependency on
> > > AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> > > that another reason for introducing these encodings may be to allow
> > > their use on AVX512-incapable hardware). The above very much should
> > > result in the VEX encoding despite the absence of a {vex} prefix.
> > > It's really only the default case of everything being enabled where
> > > the pseudo-prefix should be mandated. This particularly implies that
> > > an explicit ".arch .avx_vnni" ought to _also_ eliminate the need for
> > > the pseudo prefix.
> >
> > AVX VNNI always requires the {vex} prefix.  It isn't optional.
> > It is similar to
> >
> > vmovdqu32 %xmm5, %xmm6
> >
> > vs
> >
> > vmovdqu %xmm5, %xmm6
> >
> > It is the 32 suffix vs the {vex} prefix.
> >
> > > > --- /dev/null
> > > > +++ b/gas/testsuite/gas/i386/avx-vnni-inval.s
> > > > @@ -0,0 +1,9 @@
> > > > +# Check illegal in AVXVNNI instructions
> > > > +
> > > > +     .text
> > > > +     .arch .noavx512_vnni
> > > > +_start:
> > > > +     vpdpbusd %xmm2,%xmm4,%xmm2
> > > > +
> > > > +     .intel_syntax noprefix
> > > > +     vpdpbusd %xmm2,%xmm4,%xmm2
> > >
> > > I question the need for Intel syntax tests in test cases like this
> > > one.
> >
> > Please only keep the AT&T syntax test.
>
> Done.
> Thanks,
> Lili.

OK.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-15 16:15                         ` H.J. Lu
@ 2020-10-16  6:10                           ` Jan Beulich
  2020-10-16 18:07                             ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-16  6:10 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 15.10.2020 18:15, H.J. Lu wrote:
> On Thu, Oct 15, 2020 at 9:04 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 15.10.2020 17:34, H.J. Lu wrote:
>>> On Thu, Oct 15, 2020 at 8:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> I continue to think that the behavior as implemented is not the best
>>>> possible choice. Therefore I'd like to at least hear the arguments that
>>>> led to this decision.
>>>
>>> Please send me your detailed comments.  I will forward it to our internal
>>> group.
>>
>> I've given my two points already - there are two cases where the
>> pseudo prefix shouldn't be required. Plus, as also said, the
>> disassembler shouldn't display it by default.
> 
> It needs to be much more than that to have any impact on a
> decision made years ago.

I'm sorry, but again: a decision made internally, years ago or not,
cannot possibly be the final one in an open source world.  It shouldn't
even require me to provide extended counter-arguments when the basic
request is simply to supply the reasoning behind that decision.  Maybe
once I know the train of thought I'll agree (and withdraw my
counter-arguments)?

H.J., let me be very clear: Since there's a general pattern here in
that it often looks like technical disagreement gets resolved simply
by more or less harsh discarding of arguments (and, not just once,
deliberate introduction of bugs), I'm very willing to let this
escalate, as here you even prevent a technical discussion by hiding
your arguments. The way you drive things in certain cases is, imo,
not how things ought to be done for open source projects. And yes -
I'm not forgetting that you're the maintainer, and hence you get the
final say. (I wonder though whether, given my work over the last
years, I shouldn't have my maintainership area extended beyond Intel
syntax aspects, e.g. to all of x86's gas/ and opcodes/.)

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-16  6:10                           ` Jan Beulich
@ 2020-10-16 18:07                             ` H.J. Lu
  2020-10-19  6:28                               ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-16 18:07 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Thu, Oct 15, 2020 at 11:10 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.10.2020 18:15, H.J. Lu wrote:
> > On Thu, Oct 15, 2020 at 9:04 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 15.10.2020 17:34, H.J. Lu wrote:
> >>> On Thu, Oct 15, 2020 at 8:28 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> I continue to think that the behavior as implemented is not the best
> >>>> possible choice. Therefore I'd like to at least hear the arguments that
> >>>> led to this decision.
> >>>
> >>> Please send me your detailed comments.  I will forward it to our internal
> >>> group.
> >>
> >> I've given my two points already - there are two cases where the
> >> pseudo prefix shouldn't be required. Plus, as also said, the
> >> disassembler shouldn't display it by default.
> >
> > It needs to be much more than that to have any impact on a
> > decision made years ago.
>
> I'm sorry, but again: A decision made internally, years ago or not,
> cannot possibly be the final one in an open source world. It shouldn't
> even need me to provide extended arguments against, when the basic
> request is to first of all supply the reasoning behind that decision.
> Maybe once I know the the train of thought, I agree (and withdraw my
> counter arguments)?
>
> H.J., let me be very clear: Since there's a general pattern here in
> that it often looks like technical disagreement gets resolved simply
> by more or less harsh discarding of arguments (and, not just once,
> deliberate introduction of bugs), I'm very willing to let this
> escalate, as here you even prevent a technical discussion by hiding
> your arguments. The way you drive things in certain cases is, imo,
> not how things ought to be done for open source projects. And yes -
> I'm not forgetting that you're the maintainer, and hence you get the
> final say. (I wonder though whether, given my work over the last
> years, I shouldn't have my maintainership area extended beyond Intel
> syntax aspects, e.g. to all of x86's gas/ and opcodes/.)
>

When AVX VNNI was added, we could either use different mnemonics
from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
mandatory to avoid any confusion.
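
I.e. the parallel with the existing AVX/AVX512 split looks like this
(reusing the vmovdqu example from earlier in this thread):

        vmovdqu32 %xmm5, %xmm6                  # EVEX-only AVX512VL mnemonic
        vmovdqu   %xmm5, %xmm6                  # VEX-encoded AVX mnemonic
        vpdpbusd       %xmm2, %xmm4, %xmm2      # stays EVEX (AVX512_VNNI)
        {vex} vpdpbusd %xmm2, %xmm4, %xmm2      # {vex} selects the AVX_VNNI form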

-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-16 18:07                             ` H.J. Lu
@ 2020-10-19  6:28                               ` Jan Beulich
  2020-10-19  8:26                                 ` Cui, Lili
  2020-10-19 12:24                                 ` H.J. Lu
  0 siblings, 2 replies; 44+ messages in thread
From: Jan Beulich @ 2020-10-19  6:28 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 16.10.2020 20:07, H.J. Lu wrote:
> When AVX VNNI was added, we could either use different mnemonics
> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
> mandatory to avoid any confusion.

What confusion could there be when a person has given suitable
explicit .arch directives? And how is {vex3} in disassembler
output helping in any way, when comparing to all other AVX+
insns which have AVX512VL counterparts? (Apart from that I'd
further question why it needs to be {vex3} when {vex} would
suffice, but I'd like to see this dropped altogether anyway,
except perhaps in some non-default mode, where it then should
be output consistently.)

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: x86: Support Intel AVX VNNI
  2020-10-19  6:28                               ` Jan Beulich
@ 2020-10-19  8:26                                 ` Cui, Lili
  2020-10-19 11:19                                   ` Jan Beulich
  2020-10-19 12:24                                 ` H.J. Lu
  1 sibling, 1 reply; 44+ messages in thread
From: Cui, Lili @ 2020-10-19  8:26 UTC (permalink / raw)
  To: Jan Beulich, H.J. Lu; +Cc: binutils

> On 16.10.2020 20:07, H.J. Lu wrote:
> > When AVX VNNI was added, we could either use different mnemonics from
> > AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
> > mandatory to avoid any confusion.
> 
> What confusion could there be when a person has given suitable
> explicit .arch directives? And how is {vex3} in disassembler output helping in
> any way, when comparing to all other AVX+ insns which have AVX512VL
> counterparts? (Apart from that I'd further question why it needs to be {vex3}
> when {vex} would suffice, but I'd like to see this dropped altogether anyway,
> except perhaps in some non-default mode, where it then should be output
> consistently.)
> 
> Jan

Hi Jan,

About " why it needs to be {vex3} when {vex} would suffice ", I think AVX_VNNI  can
not be expressed in {vex} format, because all AVX_VNNI instructions need the bitfield  
"m-mmmm" of {vex3} to express 0F38, thanks.

Lili.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19  8:26                                 ` Cui, Lili
@ 2020-10-19 11:19                                   ` Jan Beulich
  0 siblings, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2020-10-19 11:19 UTC (permalink / raw)
  To: Cui, Lili; +Cc: H.J. Lu, binutils

On 19.10.2020 10:26, Cui, Lili wrote:
>> On 16.10.2020 20:07, H.J. Lu wrote:
>>> When AVX VNNI was added, we could either use different mnemonics from
>>> AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
>>> mandatory to avoid any confusion.
>>
>> What confusion could there be when a person has given suitable
>> explicit .arch directives? And how is {vex3} in disassembler output helping in
>> any way, when comparing to all other AVX+ insns which have AVX512VL
>> counterparts? (Apart from that I'd further question why it needs to be {vex3}
>> when {vex} would suffice, but I'd like to see this dropped altogether anyway,
>> except perhaps in some non-default mode, where it then should be output
>> consistently.)
> 
> About " why it needs to be {vex3} when {vex} would suffice ", I think AVX_VNNI  can
> not be expressed in {vex} format, because all AVX_VNNI instructions need the bitfield  
> "m-mmmm" of {vex3} to express 0F38, thanks.

{vex} doesn't mean "2-byte VEX", but "either form of VEX". All
it precludes is legacy or EVEX encoding. {vex2} was deprecated
because where possible gas will pick the 2-byte encoding anyway
(and iirc {vex2} also mistakenly had [almost] the meaning of
{vex}).
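
So something like the following ought to be fine (I didn't re-run this
here, but it matches how the testcases in the original patch behave;
vandps is just an arbitrary map-0F example):

        {vex} vpdpbusd %xmm2, %xmm4, %xmm2      # gas emits the 3-byte (c4) form, as map 0F38 requires it
        {vex} vandps   %xmm1, %xmm2, %xmm3      # the 2-byte (c5) form suffices and gets picked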

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19  6:28                               ` Jan Beulich
  2020-10-19  8:26                                 ` Cui, Lili
@ 2020-10-19 12:24                                 ` H.J. Lu
  2020-10-19 13:22                                   ` Jan Beulich
  2020-10-23  1:57                                   ` Cui, Lili
  1 sibling, 2 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-19 12:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 16.10.2020 20:07, H.J. Lu wrote:
> > When AVX VNNI was added, we could either use different mnemonics
> > from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
> > mandatory to avoid any confusion.
>
> What confusion could there be when a person has given suitable
> explicit .arch directives? And how is {vex3} in disassembler

When debugging assembly code under GDB, one shouldn't have to guess
how it was assembled.

> output helping in any way, when comparing to all other AVX+
> insns which have AVX512VL counterparts? (Apart from that I'd
> further question why it needs to be {vex3} when {vex} would
> suffice, but I'd like to see this dropped altogether anyway,
> except perhaps in some non-default mode, where it then should
> be output consistently.)

Yes, {vex3} can be dropped from AVX VNNI tests.

-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 12:24                                 ` H.J. Lu
@ 2020-10-19 13:22                                   ` Jan Beulich
  2020-10-19 13:37                                     ` H.J. Lu
  2020-10-23  1:57                                   ` Cui, Lili
  1 sibling, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-19 13:22 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 19.10.2020 14:24, H.J. Lu wrote:
> On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 16.10.2020 20:07, H.J. Lu wrote:
>>> When AVX VNNI was added, we could either use different mnemonics
>>> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
>>> mandatory to avoid any confusion.
>>
>> What confusion could there be when a person has given suitable
>> explicit .arch directives? And how is {vex3} in disassembler
> 
> When debugging assembly codes under GDB, one shouldn't guess
> how they are assembled.

And in gdb how do you tell

	vaddps %xmm0, %xmm1, %xmm2

from

	{evex} vaddps %xmm0, %xmm1, %xmm2

or

	{vex3} vaddps %xmm0, %xmm1, %xmm2

This is exactly the same as the case at hand, just that the order in
which the two encodings were introduced is reversed.  But the order of
introduction shouldn't matter to the (long term) behavior of a tool.

Jan

>> output helping in any way, when comparing to all other AVX+
>> insns which have AVX512VL counterparts? (Apart from that I'd
>> further question why it needs to be {vex3} when {vex} would
>> suffice, but I'd like to see this dropped altogether anyway,
>> except perhaps in some non-default mode, where it then should
>> be output consistently.)
> 
> Yes, {vex3} can be dropped from AVX VNNI tests.
> 


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 13:22                                   ` Jan Beulich
@ 2020-10-19 13:37                                     ` H.J. Lu
  2020-10-19 13:40                                       ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-19 13:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Mon, Oct 19, 2020 at 6:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.10.2020 14:24, H.J. Lu wrote:
> > On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 16.10.2020 20:07, H.J. Lu wrote:
> >>> When AVX VNNI was added, we could either use different mnemonics
> >>> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
> >>> mandatory to avoid any confusion.
> >>
> >> What confusion could there be when a person has given suitable
> >> explicit .arch directives? And how is {vex3} in disassembler
> >
> > When debugging assembly codes under GDB, one shouldn't guess
> > how they are assembled.
>
> And in gdb how do you tell
>
>         vaddps %xmm0, %xmm1, %xmm2
>
> from
>
>         {evex} vaddps %xmm0, %xmm1, %xmm2
>
> or
>
>         {vex3} vaddps %xmm0, %xmm1, %xmm2
>
> This is exactly the same as the case at hand, just that the
> order in time in which both encodings where introduced is
> reversed. But the order of introduction time shouldn't matter
> to the (long term) behavior of a tool.

When assembly sources are assembled with -g, one can
debug assembly code in GDB just like C code:

Breakpoint 1, __memcmp_avx2_movbe ()
    at ../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:64
64 movl %edx, %edx
(gdb) list
59 ENTRY (MEMCMP)
60 # ifdef USE_AS_WMEMCMP
61 shl $2, %RDX_LP
62 # elif defined __ILP32__
63 /* Clear the upper 32 bits.  */
64 movl %edx, %edx
65 # endif
66 cmp $VEC_SIZE, %RDX_LP
67 jb L(less_vec)
68
(gdb)

-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 13:37                                     ` H.J. Lu
@ 2020-10-19 13:40                                       ` Jan Beulich
  2020-10-19 13:43                                         ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-19 13:40 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 19.10.2020 15:37, H.J. Lu wrote:
> On Mon, Oct 19, 2020 at 6:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 19.10.2020 14:24, H.J. Lu wrote:
>>> On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 16.10.2020 20:07, H.J. Lu wrote:
>>>>> When AVX VNNI was added, we could either use different mnemonics
>>>>> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
>>>>> mandatory to avoid any confusion.
>>>>
>>>> What confusion could there be when a person has given suitable
>>>> explicit .arch directives? And how is {vex3} in disassembler
>>>
>>> When debugging assembly codes under GDB, one shouldn't guess
>>> how they are assembled.
>>
>> And in gdb how do you tell
>>
>>         vaddps %xmm0, %xmm1, %xmm2
>>
>> from
>>
>>         {evex} vaddps %xmm0, %xmm1, %xmm2
>>
>> or
>>
>>         {vex3} vaddps %xmm0, %xmm1, %xmm2
>>
>> This is exactly the same as the case at hand, just that the
>> order in time in which both encodings where introduced is
>> reversed. But the order of introduction time shouldn't matter
>> to the (long term) behavior of a tool.
> 
> When assembly sources are assembled with -g, one can
> debug assembly codes in GDB like C:
> 
> Breakpoint 1, __memcmp_avx2_movbe ()
>     at ../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:64
> 64 movl %edx, %edx
> (gdb) list
> 59 ENTRY (MEMCMP)
> 60 # ifdef USE_AS_WMEMCMP
> 61 shl $2, %RDX_LP
> 62 # elif defined __ILP32__
> 63 /* Clear the upper 32 bits.  */
> 64 movl %edx, %edx
> 65 # endif
> 66 cmp $VEC_SIZE, %RDX_LP
> 67 jb L(less_vec)
> 68
> (gdb)

I know, but how is this related to what I've said?

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 13:40                                       ` Jan Beulich
@ 2020-10-19 13:43                                         ` H.J. Lu
  2020-10-19 14:11                                           ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-19 13:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Mon, Oct 19, 2020 at 6:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.10.2020 15:37, H.J. Lu wrote:
> > On Mon, Oct 19, 2020 at 6:22 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 19.10.2020 14:24, H.J. Lu wrote:
> >>> On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 16.10.2020 20:07, H.J. Lu wrote:
> >>>>> When AVX VNNI was added, we could either use different mnemonics
> >>>>> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
> >>>>> mandatory to avoid any confusion.
> >>>>
> >>>> What confusion could there be when a person has given suitable
> >>>> explicit .arch directives? And how is {vex3} in disassembler
> >>>
> >>> When debugging assembly codes under GDB, one shouldn't guess
> >>> how they are assembled.
> >>
> >> And in gdb how do you tell
> >>
> >>         vaddps %xmm0, %xmm1, %xmm2
> >>
> >> from
> >>
> >>         {evex} vaddps %xmm0, %xmm1, %xmm2
> >>
> >> or
> >>
> >>         {vex3} vaddps %xmm0, %xmm1, %xmm2
> >>
> >> This is exactly the same as the case at hand, just that the
> >> order in time in which both encodings where introduced is
> >> reversed. But the order of introduction time shouldn't matter
> >> to the (long term) behavior of a tool.
> >
> > When assembly sources are assembled with -g, one can
> > debug assembly codes in GDB like C:
> >
> > Breakpoint 1, __memcmp_avx2_movbe ()
> >     at ../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:64
> > 64 movl %edx, %edx
> > (gdb) list
> > 59 ENTRY (MEMCMP)
> > 60 # ifdef USE_AS_WMEMCMP
> > 61 shl $2, %RDX_LP
> > 62 # elif defined __ILP32__
> > 63 /* Clear the upper 32 bits.  */
> > 64 movl %edx, %edx
> > 65 # endif
> > 66 cmp $VEC_SIZE, %RDX_LP
> > 67 jb L(less_vec)
> > 68
> > (gdb)
>
> I know, but how is this related to what I've said?
>

vaddps %xmm0, %xmm1, %xmm2

is EVEX.

{vex3}/{vex} vaddps %xmm0, %xmm1, %xmm2

is VEX.


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 13:43                                         ` H.J. Lu
@ 2020-10-19 14:11                                           ` Jan Beulich
  2020-10-19 14:21                                             ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-19 14:11 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 19.10.2020 15:43, H.J. Lu wrote:
> On Mon, Oct 19, 2020 at 6:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 19.10.2020 15:37, H.J. Lu wrote:
>>> On Mon, Oct 19, 2020 at 6:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 19.10.2020 14:24, H.J. Lu wrote:
>>>>> On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>
>>>>>> On 16.10.2020 20:07, H.J. Lu wrote:
>>>>>>> When AVX VNNI was added, we could either use different mnemonics
>>>>>>> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
>>>>>>> mandatory to avoid any confusion.
>>>>>>
>>>>>> What confusion could there be when a person has given suitable
>>>>>> explicit .arch directives? And how is {vex3} in disassembler
>>>>>
>>>>> When debugging assembly codes under GDB, one shouldn't guess
>>>>> how they are assembled.
>>>>
>>>> And in gdb how do you tell
>>>>
>>>>         vaddps %xmm0, %xmm1, %xmm2
>>>>
>>>> from
>>>>
>>>>         {evex} vaddps %xmm0, %xmm1, %xmm2
>>>>
>>>> or
>>>>
>>>>         {vex3} vaddps %xmm0, %xmm1, %xmm2
>>>>
>>>> This is exactly the same as the case at hand, just that the
>>>> order in time in which both encodings where introduced is
>>>> reversed. But the order of introduction time shouldn't matter
>>>> to the (long term) behavior of a tool.
>>>
>>> When assembly sources are assembled with -g, one can
>>> debug assembly codes in GDB like C:
>>>
>>> Breakpoint 1, __memcmp_avx2_movbe ()
>>>     at ../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:64
>>> 64 movl %edx, %edx
>>> (gdb) list
>>> 59 ENTRY (MEMCMP)
>>> 60 # ifdef USE_AS_WMEMCMP
>>> 61 shl $2, %RDX_LP
>>> 62 # elif defined __ILP32__
>>> 63 /* Clear the upper 32 bits.  */
>>> 64 movl %edx, %edx
>>> 65 # endif
>>> 66 cmp $VEC_SIZE, %RDX_LP
>>> 67 jb L(less_vec)
>>> 68
>>> (gdb)
>>
>> I know, but how is this related to what I've said?
>>
> 
> vaddps %xmm0, %xmm1, %xmm2
> 
> is EVEX.
> 
> {vex3}/{vex} vaddps %xmm0, %xmm1, %xmm2
> 
> is VEX.

But that's not what the disassembler produces, nor what it
should produce. There's no need for {vex} on the assembler
side (as encoding defaults to the VEX variant for all AVX
and AVX2 insns), and hence the disassembler also doesn't
(and shouldn't) output {vex}. If anything, in an extended
mode, {vex3} / {evex} could be produced for insns where
multiple encodings are possible, to disambiguate them.
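
For comparison, for a pre-EVEX insn the assembler already behaves like
this (nothing new, just to illustrate the point):

        vaddps %xmm0, %xmm1, %xmm2              # plain AVX source: VEX (2-byte) encoding by default
        {evex} vaddps %xmm0, %xmm1, %xmm2       # EVEX only when explicitly requested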

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 14:11                                           ` Jan Beulich
@ 2020-10-19 14:21                                             ` H.J. Lu
  2020-10-19 14:55                                               ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-19 14:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Mon, Oct 19, 2020 at 7:11 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.10.2020 15:43, H.J. Lu wrote:
> > On Mon, Oct 19, 2020 at 6:40 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 19.10.2020 15:37, H.J. Lu wrote:
> >>> On Mon, Oct 19, 2020 at 6:22 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 19.10.2020 14:24, H.J. Lu wrote:
> >>>>> On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>
> >>>>>> On 16.10.2020 20:07, H.J. Lu wrote:
> >>>>>>> When AVX VNNI was added, we could either use different mnemonics
> >>>>>>> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
> >>>>>>> mandatory to avoid any confusion.
> >>>>>>
> >>>>>> What confusion could there be when a person has given suitable
> >>>>>> explicit .arch directives? And how is {vex3} in disassembler
> >>>>>
> >>>>> When debugging assembly codes under GDB, one shouldn't guess
> >>>>> how they are assembled.
> >>>>
> >>>> And in gdb how do you tell
> >>>>
> >>>>         vaddps %xmm0, %xmm1, %xmm2
> >>>>
> >>>> from
> >>>>
> >>>>         {evex} vaddps %xmm0, %xmm1, %xmm2
> >>>>
> >>>> or
> >>>>
> >>>>         {vex3} vaddps %xmm0, %xmm1, %xmm2
> >>>>
> >>>> This is exactly the same as the case at hand, just that the
> >>>> order in time in which both encodings where introduced is
> >>>> reversed. But the order of introduction time shouldn't matter
> >>>> to the (long term) behavior of a tool.
> >>>
> >>> When assembly sources are assembled with -g, one can
> >>> debug assembly codes in GDB like C:
> >>>
> >>> Breakpoint 1, __memcmp_avx2_movbe ()
> >>>     at ../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:64
> >>> 64 movl %edx, %edx
> >>> (gdb) list
> >>> 59 ENTRY (MEMCMP)
> >>> 60 # ifdef USE_AS_WMEMCMP
> >>> 61 shl $2, %RDX_LP
> >>> 62 # elif defined __ILP32__
> >>> 63 /* Clear the upper 32 bits.  */
> >>> 64 movl %edx, %edx
> >>> 65 # endif
> >>> 66 cmp $VEC_SIZE, %RDX_LP
> >>> 67 jb L(less_vec)
> >>> 68
> >>> (gdb)
> >>
> >> I know, but how is this related to what I've said?
> >>
> >
> > vaddps %xmm0, %xmm1, %xmm2
> >
> > is EVEX.
> >
> > {vex3}/{vex} vaddps %xmm0, %xmm1, %xmm2
> >
> > is VEX.
>
> But that's not what the disassmebler produces, nor what it

This is what GDB shows for assembly sources.

> should produce. There's no need for {vex} on the assembler
> side (as encoding defaults to the VEX variant for all AVX
> and AVX2 insns), and hence the disassembler also doesn't
> (and shouldn't) output {vex}. If anything, in an extended
> mode, {vex3} / {evex} could be produced for insns where
> multiple encodings are possible, to disambiguate them.
>
> Jan



-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 14:21                                             ` H.J. Lu
@ 2020-10-19 14:55                                               ` Jan Beulich
  2020-10-19 19:52                                                 ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-19 14:55 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 19.10.2020 16:21, H.J. Lu wrote:
> On Mon, Oct 19, 2020 at 7:11 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 19.10.2020 15:43, H.J. Lu wrote:
>>> On Mon, Oct 19, 2020 at 6:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 19.10.2020 15:37, H.J. Lu wrote:
>>>>> On Mon, Oct 19, 2020 at 6:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>
>>>>>> On 19.10.2020 14:24, H.J. Lu wrote:
>>>>>>> On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>>
>>>>>>>> On 16.10.2020 20:07, H.J. Lu wrote:
>>>>>>>>> When AVX VNNI was added, we could either use different mnemonics
>>>>>>>>> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
>>>>>>>>> mandatory to avoid any confusion.
>>>>>>>>
>>>>>>>> What confusion could there be when a person has given suitable
>>>>>>>> explicit .arch directives? And how is {vex3} in disassembler
>>>>>>>
>>>>>>> When debugging assembly codes under GDB, one shouldn't guess
>>>>>>> how they are assembled.
>>>>>>
>>>>>> And in gdb how do you tell
>>>>>>
>>>>>>         vaddps %xmm0, %xmm1, %xmm2
>>>>>>
>>>>>> from
>>>>>>
>>>>>>         {evex} vaddps %xmm0, %xmm1, %xmm2
>>>>>>
>>>>>> or
>>>>>>
>>>>>>         {vex3} vaddps %xmm0, %xmm1, %xmm2
>>>>>>
>>>>>> This is exactly the same as the case at hand, just that the
>>>>>> order in time in which both encodings where introduced is
>>>>>> reversed. But the order of introduction time shouldn't matter
>>>>>> to the (long term) behavior of a tool.
>>>>>
>>>>> When assembly sources are assembled with -g, one can
>>>>> debug assembly codes in GDB like C:
>>>>>
>>>>> Breakpoint 1, __memcmp_avx2_movbe ()
>>>>>     at ../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:64
>>>>> 64 movl %edx, %edx
>>>>> (gdb) list
>>>>> 59 ENTRY (MEMCMP)
>>>>> 60 # ifdef USE_AS_WMEMCMP
>>>>> 61 shl $2, %RDX_LP
>>>>> 62 # elif defined __ILP32__
>>>>> 63 /* Clear the upper 32 bits.  */
>>>>> 64 movl %edx, %edx
>>>>> 65 # endif
>>>>> 66 cmp $VEC_SIZE, %RDX_LP
>>>>> 67 jb L(less_vec)
>>>>> 68
>>>>> (gdb)
>>>>
>>>> I know, but how is this related to what I've said?
>>>>
>>>
>>> vaddps %xmm0, %xmm1, %xmm2
>>>
>>> is EVEX.
>>>
>>> {vex3}/{vex} vaddps %xmm0, %xmm1, %xmm2
>>>
>>> is VEX.
>>
>> But that's not what the disassmebler produces, nor what it
> 
> This is what GDB shows in assembly source.

The %XV prefix was introduced only with the patch we're discussing
here, iirc - how would ordinary AVX insns like vaddps have gained
these undue prefixes? And if you truly mean "assembly sources"
(which isn't what we've been discussing, or at least which isn't
what I had been talking about), then wouldn't gdb show whatever
there was in the sources, without adding or removing any prefixes?

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 14:55                                               ` Jan Beulich
@ 2020-10-19 19:52                                                 ` H.J. Lu
  2020-10-20  8:00                                                   ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-19 19:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Mon, Oct 19, 2020 at 7:55 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.10.2020 16:21, H.J. Lu wrote:
> > On Mon, Oct 19, 2020 at 7:11 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 19.10.2020 15:43, H.J. Lu wrote:
> >>> On Mon, Oct 19, 2020 at 6:40 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 19.10.2020 15:37, H.J. Lu wrote:
> >>>>> On Mon, Oct 19, 2020 at 6:22 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>
> >>>>>> On 19.10.2020 14:24, H.J. Lu wrote:
> >>>>>>> On Sun, Oct 18, 2020 at 11:28 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>>
> >>>>>>>> On 16.10.2020 20:07, H.J. Lu wrote:
> >>>>>>>>> When AVX VNNI was added, we could either use different mnemonics
> >>>>>>>>> from AVX512 VNNI or a {vex} prefix.  We went with {vex} and made it
> >>>>>>>>> mandatory to avoid any confusion.
> >>>>>>>>
> >>>>>>>> What confusion could there be when a person has given suitable
> >>>>>>>> explicit .arch directives? And how is {vex3} in disassembler
> >>>>>>>
> >>>>>>> When debugging assembly codes under GDB, one shouldn't guess
> >>>>>>> how they are assembled.
> >>>>>>
> >>>>>> And in gdb how do you tell
> >>>>>>
> >>>>>>         vaddps %xmm0, %xmm1, %xmm2
> >>>>>>
> >>>>>> from
> >>>>>>
> >>>>>>         {evex} vaddps %xmm0, %xmm1, %xmm2
> >>>>>>
> >>>>>> or
> >>>>>>
> >>>>>>         {vex3} vaddps %xmm0, %xmm1, %xmm2
> >>>>>>
> >>>>>> This is exactly the same as the case at hand, just that the
> >>>>>> order in time in which both encodings where introduced is
> >>>>>> reversed. But the order of introduction time shouldn't matter
> >>>>>> to the (long term) behavior of a tool.
> >>>>>
> >>>>> When assembly sources are assembled with -g, one can
> >>>>> debug assembly codes in GDB like C:
> >>>>>
> >>>>> Breakpoint 1, __memcmp_avx2_movbe ()
> >>>>>     at ../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:64
> >>>>> 64 movl %edx, %edx
> >>>>> (gdb) list
> >>>>> 59 ENTRY (MEMCMP)
> >>>>> 60 # ifdef USE_AS_WMEMCMP
> >>>>> 61 shl $2, %RDX_LP
> >>>>> 62 # elif defined __ILP32__
> >>>>> 63 /* Clear the upper 32 bits.  */
> >>>>> 64 movl %edx, %edx
> >>>>> 65 # endif
> >>>>> 66 cmp $VEC_SIZE, %RDX_LP
> >>>>> 67 jb L(less_vec)
> >>>>> 68
> >>>>> (gdb)
> >>>>
> >>>> I know, but how is this related to what I've said?
> >>>>
> >>>
> >>> vaddps %xmm0, %xmm1, %xmm2
> >>>
> >>> is EVEX.
> >>>
> >>> {vex3}/{vex} vaddps %xmm0, %xmm1, %xmm2
> >>>
> >>> is VEX.
> >>
> >> But that's not what the disassmebler produces, nor what it
> >
> > This is what GDB shows in assembly source.
>
> The %VX prefix was introduced only with the patch we're discussing
> here, iirc - how would ordinary AVX insns like vaddps have gained
> these undue prefixes? And if you truly mean "assembly sources"
> (which isn't what we've been discussing, or at least which isn't
> what I had been talking about), then wouldn't gdb show whatever
> there was in the sources, without adding or removing any prefixes?
>

I probably lost track.  Let me rephrase it: the {vex} prefix
for AVX VNNI is mandatory.


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-19 19:52                                                 ` H.J. Lu
@ 2020-10-20  8:00                                                   ` Jan Beulich
  2020-10-20 17:17                                                     ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-20  8:00 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 19.10.2020 21:52, H.J. Lu wrote:
> I probably lost the track.   Let me rephrase it again.  The {vex}
> prefix for AVX VNNI is mandatory.

And let me re-ask: Why? You've made your point for the gas side
(which I don't agree with, but that's a different aspect). I'm
still not understanding your reasoning on the disassembler side.
This is why I did point out that for e.g. vaddps the different
possible encodings can't be told apart either.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-20  8:00                                                   ` Jan Beulich
@ 2020-10-20 17:17                                                     ` H.J. Lu
  2020-10-21  7:07                                                       ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-20 17:17 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Tue, Oct 20, 2020 at 1:00 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.10.2020 21:52, H.J. Lu wrote:
> > I probably lost the track.   Let me rephrase it again.  The {vex}
> > prefix for AVX VNNI is mandatory.
>
> And let me re-ask: Why? You've made your point for the gas side
> (which I don't agree with, but that's a different aspect). I'm
> still not understanding your reasoning on the disassembler side.
> This is why I did point out that for e.g. vaddps the different
> possible encodings can't be told apart either.

We should be able to tell VEX from EVEX in the output of

$ objdump -dw --no-show-raw-insn
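
i.e. without the raw bytes only the printed pseudo prefix tells the two
encodings apart, roughly (mnemonic columns as in the avx-vnni.d
expectations):

        vpdpbusd %xmm2,%xmm4,%xmm2              <- EVEX-encoded AVX512_VNNI form
        {vex3} vpdpbusd %xmm2,%xmm4,%xmm2       <- VEX-encoded AVX_VNNI form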


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-20 17:17                                                     ` H.J. Lu
@ 2020-10-21  7:07                                                       ` Jan Beulich
  0 siblings, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2020-10-21  7:07 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 20.10.2020 19:17, H.J. Lu wrote:
> On Tue, Oct 20, 2020 at 1:00 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 19.10.2020 21:52, H.J. Lu wrote:
>>> I probably lost the track.   Let me rephrase it again.  The {vex}
>>> prefix for AVX VNNI is mandatory.
>>
>> And let me re-ask: Why? You've made your point for the gas side
>> (which I don't agree with, but that's a different aspect). I'm
>> still not understanding your reasoning on the disassembler side.
>> This is why I did point out that for e.g. vaddps the different
>> possible encodings can't be told apart either.
> 
> We should enable to tell VEX from EVEX in the output of
> 
> $ objdump -dw --no-show-raw-insn

But not by default (and --show-raw-insn is off by default), only if
some option (possibly --no-show-raw-insn, but I'm not convinced of such
a "reuse") was specified explicitly - just as other cases of multiple
possible encodings don't get disambiguated by default (simply because
by default it doesn't matter which encoding was chosen).
swap_operand(), for example, gets enabled by -Msuffix, and I think
extending the meaning of that option would be more sensible.

But the main point in the context here is: The printing of {vex}
for AVX-VNNI insns should _also_ be similarly conditional.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: x86: Support Intel AVX VNNI
  2020-10-19 12:24                                 ` H.J. Lu
  2020-10-19 13:22                                   ` Jan Beulich
@ 2020-10-23  1:57                                   ` Cui, Lili
  2020-10-23  2:10                                     ` H.J. Lu
  2020-10-23  7:04                                     ` Jan Beulich
  1 sibling, 2 replies; 44+ messages in thread
From: Cui, Lili @ 2020-10-23  1:57 UTC (permalink / raw)
  To: H.J. Lu, Jan Beulich; +Cc: binutils


> > output helping in any way, when comparing to all other AVX+ insns
> > which have AVX512VL counterparts? (Apart from that I'd further
> > question why it needs to be {vex3} when {vex} would suffice, but I'd
> > like to see this dropped altogether anyway, except perhaps in some
> > non-default mode, where it then should be output consistently.)
> 
> Yes, {vex3} can be dropped from AVX VNNI tests.
> 
> --
> H.J.

Hi all,

Here is the patch to delete {vex3} from the AVX_VNNI tests, thanks.

[PATCH] Delete {vex3} in AVX_VNNI tests

gas/

	* testsuite/gas/i386/avx-vnni.s: Delete {vex3} related test.
	* testsuite/gas/i386/avx-vnni.d: Adjust to the change of avx-vnni.s.
	* testsuite/gas/i386/x86-64-avx-vnni.s: Delete {vex3} related test.
	* testsuite/gas/i386/x86-64-avx-vnni.d: Adjust to the change of x86-64-avx-vnni.s.
---
 gas/testsuite/gas/i386/avx-vnni.d        | 8 --------
 gas/testsuite/gas/i386/avx-vnni.s        | 2 --
 gas/testsuite/gas/i386/x86-64-avx-vnni.d | 8 --------
 gas/testsuite/gas/i386/x86-64-avx-vnni.s | 2 --
 4 files changed, 20 deletions(-)

diff --git a/gas/testsuite/gas/i386/avx-vnni.d b/gas/testsuite/gas/i386/avx-vnni.d
index 6e31528cf2..99cf91e4cc 100644
--- a/gas/testsuite/gas/i386/avx-vnni.d
+++ b/gas/testsuite/gas/i386/avx-vnni.d
@@ -10,26 +10,18 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
 #pass
diff --git a/gas/testsuite/gas/i386/avx-vnni.s b/gas/testsuite/gas/i386/avx-vnni.s
index b37bc85c3a..e68fd25d0e 100644
--- a/gas/testsuite/gas/i386/avx-vnni.s
+++ b/gas/testsuite/gas/i386/avx-vnni.s
@@ -4,9 +4,7 @@
 	\mnemonic	%xmm2, %xmm4, %xmm2
 	{evex} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm2, %xmm4, %xmm2
-	{vex3} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex}  \mnemonic (%ecx), %xmm4, %xmm2
-	{vex3} \mnemonic (%ecx), %xmm4, %xmm2
 .endm
 
 	.text
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
index c4474739ed..16802b55b2 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.d
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
@@ -10,29 +10,21 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 50 d6    	vpdpbusd %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 52 d6    	vpdpwssd %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 51 d6    	vpdpbusds %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 53 d6    	vpdpwssds %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.s b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
index 95b6dc2ef3..8b3feaa4ff 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.s
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
@@ -4,9 +4,7 @@
 	\mnemonic	 %xmm12, %xmm4, %xmm2
 	{evex} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm12, %xmm4, %xmm2
-	{vex3} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic (%rcx), %xmm4, %xmm2
-	{vex3} \mnemonic (%rcx), %xmm4, %xmm2
 	\mnemonic	 %xmm22, %xmm4, %xmm2
 .endm
 
-- 
2.17.1

Thanks,
Lili.

[-- Attachment #2: 0001-Delete-vex3-in-AVX_VNNI-test.patch --]
[-- Type: application/octet-stream, Size: 6040 bytes --]

From 3678e8ffc40e933b57fdf313170d0e40f151a718 Mon Sep 17 00:00:00 2001
From: "Cui,Lili" <lili.cui@intel.com>
Date: Thu, 22 Oct 2020 07:55:02 +0800
Subject: [PATCH] Delete {vex3} in AVX_VNNI test

gas/

	* testsuite/gas/i386/avx-vnni.s: Delete {vex3} related test.
	* testsuite/gas/i386/avx-vnni.d: Adjust to the change of avx-vnni.s.
	* testsuite/gas/i386/x86-64-avx-vnni.s: Delete {vex3} related test.
	* testsuite/gas/i386/x86-64-avx-vnni.d: Adjust to the change of x86-64-avx-vnni.s.
---
 gas/testsuite/gas/i386/avx-vnni.d        | 8 --------
 gas/testsuite/gas/i386/avx-vnni.s        | 2 --
 gas/testsuite/gas/i386/x86-64-avx-vnni.d | 8 --------
 gas/testsuite/gas/i386/x86-64-avx-vnni.s | 2 --
 4 files changed, 20 deletions(-)

diff --git a/gas/testsuite/gas/i386/avx-vnni.d b/gas/testsuite/gas/i386/avx-vnni.d
index 6e31528cf2..99cf91e4cc 100644
--- a/gas/testsuite/gas/i386/avx-vnni.d
+++ b/gas/testsuite/gas/i386/avx-vnni.d
@@ -10,26 +10,18 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
 #pass
diff --git a/gas/testsuite/gas/i386/avx-vnni.s b/gas/testsuite/gas/i386/avx-vnni.s
index b37bc85c3a..e68fd25d0e 100644
--- a/gas/testsuite/gas/i386/avx-vnni.s
+++ b/gas/testsuite/gas/i386/avx-vnni.s
@@ -4,9 +4,7 @@
 	\mnemonic	%xmm2, %xmm4, %xmm2
 	{evex} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm2, %xmm4, %xmm2
-	{vex3} \mnemonic %xmm2, %xmm4, %xmm2
 	{vex}  \mnemonic (%ecx), %xmm4, %xmm2
-	{vex3} \mnemonic (%ecx), %xmm4, %xmm2
 .endm
 
 	.text
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
index c4474739ed..16802b55b2 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.d
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
@@ -10,29 +10,21 @@ Disassembly of section .text:
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 50 d6    	vpdpbusd %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 52 d6    	vpdpwssd %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 51 d6    	vpdpbusds %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 53 d6    	vpdpwssds %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.s b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
index 95b6dc2ef3..8b3feaa4ff 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.s
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.s
@@ -4,9 +4,7 @@
 	\mnemonic	 %xmm12, %xmm4, %xmm2
 	{evex} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic %xmm12, %xmm4, %xmm2
-	{vex3} \mnemonic %xmm12, %xmm4, %xmm2
 	{vex}  \mnemonic (%rcx), %xmm4, %xmm2
-	{vex3} \mnemonic (%rcx), %xmm4, %xmm2
 	\mnemonic	 %xmm22, %xmm4, %xmm2
 .endm
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-23  1:57                                   ` Cui, Lili
@ 2020-10-23  2:10                                     ` H.J. Lu
  2020-10-23  7:04                                     ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-23  2:10 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Jan Beulich, binutils

On Thu, Oct 22, 2020 at 6:57 PM Cui, Lili <lili.cui@intel.com> wrote:
>
> > > output helping in any way, when comparing to all other AVX+ insns
> > > which have AVX512VL counterparts? (Apart from that I'd further
> > > question why it needs to be {vex3} when {vex} would suffice, but I'd
> > > like to see this dropped altogether anyway, except perhaps in some
> > > non-default mode, where it then should be output consistently.)
> >
> > Yes, {vex3} can be dropped from AVX VNNI tests.
> >
> > --
> > H.J.
>
> Hi all,
>
> Here is the patch to delete {vex3} from the AVX_VNNI tests, thanks.
>
> [PATCH] Delete {vex3} in AVX_VNNI test
>
> gas/
>
>         * testsuite/gas/i386/avx-vnni.s: Delete {vex3} related test.
>         * testsuite/gas/i386/avx-vnni.d: Adjust to the change of avx-vnni.s.
>         * testsuite/gas/i386/x86-64-avx-vnni.s: Delete {vex3} related test.
>         * testsuite/gas/i386/x86-64-avx-vnni.d: Adjust to the change of x86-64-avx-vnni.s.

OK.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-23  1:57                                   ` Cui, Lili
  2020-10-23  2:10                                     ` H.J. Lu
@ 2020-10-23  7:04                                     ` Jan Beulich
  2020-10-23 13:17                                       ` H.J. Lu
  1 sibling, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-23  7:04 UTC (permalink / raw)
  To: Cui, Lili, H.J. Lu; +Cc: binutils

On 23.10.2020 03:57, Cui, Lili wrote:
>>> output helping in any way, when comparing to all other AVX+ insns
>>> which have AVX512VL counterparts? (Apart from that I'd further
>>> question why it needs to be {vex3} when {vex} would suffice, but I'd
>>> like to see this dropped altogether anyway, except perhaps in some
>>> non-default mode, where it then should be output consistently.)
>>
>> Yes, {vex3} can be dropped from AVX VNNI tests.

Okay, so we must have been talking past one another. I don't
see any good in dropping the tests, and this isn't what I did
suggest or talk about. {vex3} should be tested to be
properly accepted by the assembler, just like {vex}. What I
was saying is that _objdump output_ should have {vex} dropped,
and that it should have been {vex} instead of {vex3} there in
the first place.

Hence faod: I think the change here is wrong and should either
not be committed or reverted. (Oddly enough there have been no
Intel syntax checks of any prefix uses at all - I would
otherwise have outright nak-ed the change.)

And just to repeat - there might then want to be a non-default
disassembly mode which allows printing {vex} and alike. There
(and only there) it should then be the shortest unambiguous
form of prefix which gets output (in particular meaning no
prefix at all if an insn is unambiguous without; just like for
insn suffixes in AT&T mode there may then be a yet more verbose
mode where {vex} and alike get output unconditionally).
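
A hypothetical illustration of that rule, borrowing operands from the
x86-64 testcase in this thread (a sketch of such a verbose mode, not
current objdump behavior):

	vpdpbusd %xmm22,%xmm4,%xmm2		# xmm22 requires EVEX anyway - unambiguous, no prefix printed
	{vex}  vpdpbusd %xmm12,%xmm4,%xmm2	# xmm12 fits both encodings - a marker is needed to tell them apart
	{evex} vpdpbusd %xmm12,%xmm4,%xmm2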

I can only re-iterate that this feature addition points out a
fundamental shortcoming: What would have needed to be
established first (and in public!) is an underlying abstract
model individually for both assembler and disassembler, that
we want each one to honor respectively. Once such a model is
agreed upon, verifying whether a particular change meets the
specification would be much easier. The primary goal of such
an abstract model would be consistency. As said in many other
contexts, being consistent is the only way to avoid surprises
to the user. I've been eliminating quite a few inconsistencies
over the last couple of years; some further attempts of doing
so were refused (or reverted) for reasons I didn't buy.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-23  7:04                                     ` Jan Beulich
@ 2020-10-23 13:17                                       ` H.J. Lu
  2020-10-23 13:36                                         ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: H.J. Lu @ 2020-10-23 13:17 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Fri, Oct 23, 2020 at 12:04 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 23.10.2020 03:57, Cui, Lili wrote:
> >>> output helping in any way, when comparing to all other AVX+ insns
> >>> which have AVX512VL counterparts? (Apart from that I'd further
> >>> question why it needs to be {vex3} when {vex} would suffice, but I'd
> >>> like to see this dropped altogether anyway, except perhaps in some
> >>> non-default mode, where it then should be output consistently.)
> >>
> >> Yes, {vex3} can be dropped from AVX VNNI tests.
>
> Okay, so we must have been talking past one another. I don't
> see any good in dropping the tests, and this isn't what I did
> suggest or talk about. {vex3} should be tested to be
> properly accepted by the assembler, just like {vex}. What I
> was saying is that _objdump output_ should have {vex} dropped,
> and that it should have been {vex} instead of {vex3} there in
> the first place.

{vex} and {vex3} aren't new.   The new behavior of AVX VNNI is that
{vex} or {vex3} is now mandatory for AVX VNNI.  Since {vex} or {vex3}
is mandatory for AVX VNNI, it shouldn't be dropped in disassembler
output.
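
A minimal sketch of the assembler side, mirroring the avx-vnni.s/.d
testcases earlier in this thread (AT&T syntax, no .arch restriction):

	vpdpbusd  %xmm2, %xmm4, %xmm2		# no pseudo prefix: EVEX 62 f2 5d 08 50 d2 (AVX512-VNNI)
	{evex} vpdpbusd  %xmm2, %xmm4, %xmm2	# explicit EVEX, same bytes
	{vex}  vpdpbusd  %xmm2, %xmm4, %xmm2	# VEX c4 e2 59 50 d2, the AVX VNNI form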

> Hence faod: I think the change here is wrong and should either
> not be committed or reverted. (Oddly enough there have been no
> Intel syntax checks of any prefix uses at all - I would
> otherwise have outright nak-ed the change.)

We can change disassembler output from {vex3} to {vex} and
put back the {vex3} tests in AVX VNNI.

> And just to repeat - there might then want to be a non-default
> disassembly mode which allows printing {vex} and alike. There
> (and only there) it should then be the shortest unambiguous
> form of prefix which gets output (in particular meaning no
> prefix at all if an insn is unambiguous without; just like for
> insn suffixes in AT&T mode there may then be a yet more verbose
> mode where {vex} and alike get output unconditionally).

BTW, there are limitations in Intel syntax when used with GCC:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

I don't know if it can be fixed.

> I can only re-iterate that this feature addition points out a
> fundamental shortcoming: What would have needed to be
> established first (and in public!) is an underlying abstract
> model individually for both assembler and disassembler, that
> we want each one to honor respectively. Once such a model is
> agreed upon, verifying whether a particular change meets the
> specification would be much easier. The primary goal of such
> an abstract model would be consistency. As said in many other
> contexts, being consistent is the only way to avoid surprises
> to the user. I've been eliminating quite a few inconsistencies
> over the last couple of years; some further attempts of doing
> so were refused (or reverted) for reasons I didn't buy.
>

x86 is still evolving.  We are trying to do things reasonably.

-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-23 13:17                                       ` H.J. Lu
@ 2020-10-23 13:36                                         ` Jan Beulich
  2020-10-23 21:30                                           ` H.J. Lu
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2020-10-23 13:36 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 23.10.2020 15:17, H.J. Lu wrote:
> On Fri, Oct 23, 2020 at 12:04 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 23.10.2020 03:57, Cui, Lili wrote:
>>>>> output helping in any way, when comparing to all other AVX+ insns
>>>>> which have AVX512VL counterparts? (Apart from that I'd further
>>>>> question why it needs to be {vex3} when {vex} would suffice, but I'd
>>>>> like to see this dropped altogether anyway, except perhaps in some
>>>>> non-default mode, where it then should be output consistently.)
>>>>
>>>> Yes, {vex3} can be dropped from AVX VNNI tests.
>>
>> Okay, so we must have been talking past one another. I don't
>> see any good in dropping the tests, and this isn't what I did
>> suggest or talk about. {vex3} should be tested to be
>> properly accepted by the assembler, just like {vex}. What I
>> was saying is that _objdump output_ should have {vex} dropped,
>> and that it should have been {vex} instead of {vex3} there in
>> the first place.
> 
> {vex} and {vex3} aren't new.   The new behavior of AVX VNNI is that
> {vex} or {vex3} is now mandatory for AVX VNNI.  Since {vex} or {vex3}
> is mandatory for AVX VNNI, it shouldn't be dropped in disassembler
> output.

They're not helpful in the disassembler output, and their adding
is inconsistent with other cases where {vex} / {vex3} / {evex}
aren't being displayed despite being necessary in gas to achieve
the respective encoding.

>> Hence faod: I think the change here is wrong and should either
>> not be committed or reverted. (Oddly enough there have been no
>> Intel syntax checks of any prefix uses at all - I would
>> otherwise have outright nak-ed the change.)
> 
> We can change disassembler output from {vex3} to {vex} and
> put back the {vex3} tests in AVX VNNI.

Good, thanks.

>> And just to repeat - there might then want to be a non-default
>> disassembly mode which allows printing {vex} and alike. There
>> (and only there) it should then be the shortest unambiguous
>> form of prefix which gets output (in particular meaning no
>> prefix at all if an insn is unambiguous without; just like for
>> insn suffixes in AT&T mode there may then be a yet more verbose
>> mode where {vex} and alike get output unconditionally).
> 
> BTW, there are limitations in Intel syntax when used with GCC:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929
> 
> I don't know if it can be fixed.

I've seen the notification, but didn't read the bug itself yet.
From the title I'm inclined to say it can't be fixed in the
general case. It may be possible to work around some common cases.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-23 13:36                                         ` Jan Beulich
@ 2020-10-23 21:30                                           ` H.J. Lu
  2020-10-26  1:54                                             ` Cui, Lili
  2020-10-26  8:46                                             ` Jan Beulich
  0 siblings, 2 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-23 21:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cui, Lili, binutils

On Fri, Oct 23, 2020 at 6:36 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 23.10.2020 15:17, H.J. Lu wrote:
> > On Fri, Oct 23, 2020 at 12:04 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 23.10.2020 03:57, Cui, Lili wrote:
> >>>>> output helping in any way, when comparing to all other AVX+ insns
> >>>>> which have AVX512VL counterparts? (Apart from that I'd further
> >>>>> question why it needs to be {vex3} when {vex} would suffice, but I'd
> >>>>> like to see this dropped altogether anyway, except perhaps in some
> >>>>> non-default mode, where it then should be output consistently.)
> >>>>
> >>>> Yes, {vex3} can be dropped from AVX VNNI tests.
> >>
> >> Okay, so we must have been talking past one another. I don't
> >> see any good in dropping the tests, and this isn't what I did
> >> suggest or talk about. {vex3} should be tested to be
> >> properly accepted by the assembler, just like {vex}. What I
> >> was saying is that _objdump output_ should have {vex} dropped,
> >> and that it should have been {vex} instead of {vex3} there in
> >> the first place.
> >
> > {vex} and {vex3} aren't new.   The new behavior of AVX VNNI is that
> > {vex} or {vex3} is now mandatory for AVX VNNI.  Since {vex} or {vex3}
> > is mandatory for AVX VNNI, it shouldn't be dropped in disassembler
> > output.
>
> They're not helpful in the disassembler output, and their adding
> is inconsistent with other cases where {vex} / {vex3} / {evex}
> aren't being displayed despite being necessary in gas to achieve
> the respective encoding.

AVX VNNI is an anomaly.  We can't apply the same rule to it.
The {vex} prefix is needed for both assembler input and disassembler
output.

> >> Hence faod: I think the change here is wrong and should either
> >> not be committed or reverted. (Oddly enough there have been no
> >> Intel syntax checks of any prefix uses at all - I would
> >> otherwise have outright nak-ed the change.)
> >
> > We can change disassembler output from {vex3} to {vex} and
> > put back the {vex3} tests in AVX VNNI.
>
> Good, thanks.
>

Lili, please prepare a patch.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: x86: Support Intel AVX VNNI
  2020-10-23 21:30                                           ` H.J. Lu
@ 2020-10-26  1:54                                             ` Cui, Lili
  2020-10-26  1:57                                               ` H.J. Lu
  2020-10-26  8:46                                             ` Jan Beulich
  1 sibling, 1 reply; 44+ messages in thread
From: Cui, Lili @ 2020-10-26  1:54 UTC (permalink / raw)
  To: H.J. Lu, Jan Beulich; +Cc: binutils

[-- Attachment #1: Type: text/plain, Size: 8257 bytes --]


> > >> Hence faod: I think the change here is wrong and should either not
> > >> be committed or reverted. (Oddly enough there have been no Intel
> > >> syntax checks of any prefix uses at all - I would otherwise have
> > >> outright nak-ed the change.)
> > >
> > > We can change disassembler output from {vex3} to {vex} and put back
> > > the {vex3} tests in AVX VNNI.
> >
> > Good, thanks.
> >
> 
> Lili, please prepare a patch.

Here is the patch; it keeps the {vex3} assembler input tests and changes the disassembler output, thanks.

[PATCH] Change avxvnni disassembler output from {vex3} to {vex}

gas/

	* testsuite/gas/i386/avx-vnni.d: Change pseudo prefix from
	{vex3} to {vex}.
	* testsuite/gas/i386/x86-64-avx-vnni.d: Likewise.

opcodes/

	* i386-dis.c: Change "XV" to print "{vex}" pseudo prefix.
---
 gas/testsuite/gas/i386/avx-vnni.d        | 32 ++++++++++++------------
 gas/testsuite/gas/i386/x86-64-avx-vnni.d | 32 ++++++++++++------------
 opcodes/i386-dis.c                       |  1 -
 3 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/gas/testsuite/gas/i386/avx-vnni.d b/gas/testsuite/gas/i386/avx-vnni.d
index 6e31528cf2..7d20c80973 100644
--- a/gas/testsuite/gas/i386/avx-vnni.d
+++ b/gas/testsuite/gas/i386/avx-vnni.d
@@ -9,27 +9,27 @@ Disassembly of section .text:
 0+ <_start>:
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex\} vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex\} vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex\} vpdpbusd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex\} vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex\} vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex\} vpdpwssd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex\} vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex\} vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex\} vpdpbusds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex\} vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex\} vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex\} vpdpwssds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
index c4474739ed..6b3acab5d5 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.d
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
@@ -9,31 +9,31 @@ Disassembly of section .text:
 0+ <_start>:
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex\} vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex\} vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex\} vpdpbusd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 50 d6    	vpdpbusd %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex\} vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex\} vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex\} vpdpwssd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 52 d6    	vpdpwssd %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex\} vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex\} vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex\} vpdpbusds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 51 d6    	vpdpbusds %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex\} vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex\} vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex\} vpdpwssds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 53 d6    	vpdpwssds %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
 #pass
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 068858b1e7..9338b1f375 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -11091,7 +11091,6 @@ putop (const char *in_template, int sizeflag)
 		  *obufp++ = 'v';
 		  *obufp++ = 'e';
 		  *obufp++ = 'x';
-		  *obufp++ = '3';
 		  *obufp++ = '}';
 		}
 	      else if (rex & REX_W)
-- 
2.17.1

Thanks,
Lili.


[-- Attachment #2: 0001-Change-avxvnni-disassembler-output-from-vex3-to-vex.patch --]
[-- Type: application/octet-stream, Size: 7729 bytes --]

From d815d7ffc65f4a0f89fb658cdc167103a3201065 Mon Sep 17 00:00:00 2001
From: "Cui,Lili" <lili.cui@intel.com>
Date: Mon, 26 Oct 2020 09:35:26 +0800
Subject: [PATCH] Change avxvnni disassembler output from {vex3} to {vex}

gas/

	* testsuite/gas/i386/avx-vnni.d: Change pseudo prefix from
	{vex3} to {vex}.
	* testsuite/gas/i386/x86-64-avx-vnni.d: Likewise.

opcodes/

	* i386-dis.c: Change "XV" to print "{vex}" pseudo prefix.
---
 gas/testsuite/gas/i386/avx-vnni.d        | 32 ++++++++++++------------
 gas/testsuite/gas/i386/x86-64-avx-vnni.d | 32 ++++++++++++------------
 opcodes/i386-dis.c                       |  1 -
 3 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/gas/testsuite/gas/i386/avx-vnni.d b/gas/testsuite/gas/i386/avx-vnni.d
index 6e31528cf2..7d20c80973 100644
--- a/gas/testsuite/gas/i386/avx-vnni.d
+++ b/gas/testsuite/gas/i386/avx-vnni.d
@@ -9,27 +9,27 @@ Disassembly of section .text:
 0+ <_start>:
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex\} vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 d2       	\{vex\} vpdpbusd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex\} vpdpbusd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex\} vpdpbusd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 52 d2    	vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex\} vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 d2       	\{vex\} vpdpwssd %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex\} vpdpwssd \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex\} vpdpwssd \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 51 d2    	vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex\} vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 d2       	\{vex\} vpdpbusds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex\} vpdpbusds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex\} vpdpbusds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 53 d2    	vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex\} vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 d2       	\{vex\} vpdpwssds %xmm2,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex\} vpdpwssds \(%ecx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex\} vpdpwssds \(%ecx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 f2 5d 08 50 d2    	vpdpbusd %xmm2,%xmm4,%xmm2
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
index c4474739ed..6b3acab5d5 100644
--- a/gas/testsuite/gas/i386/x86-64-avx-vnni.d
+++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
@@ -9,31 +9,31 @@ Disassembly of section .text:
 0+ <_start>:
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 50 11       	\{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex\} vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 50 d4       	\{vex\} vpdpbusd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex\} vpdpbusd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 50 11       	\{vex\} vpdpbusd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 50 d6    	vpdpbusd %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 52 d4    	vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 52 11       	\{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex\} vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 52 d4       	\{vex\} vpdpwssd %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex\} vpdpwssd \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 52 11       	\{vex\} vpdpwssd \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 52 d6    	vpdpwssd %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 51 d4    	vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 51 11       	\{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex\} vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 51 d4       	\{vex\} vpdpbusds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex\} vpdpbusds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 51 11       	\{vex\} vpdpbusds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 51 d6    	vpdpbusds %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 53 d4    	vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
- +[a-f0-9]+:	c4 e2 59 53 11       	\{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex\} vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 c2 59 53 d4       	\{vex\} vpdpwssds %xmm12,%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex\} vpdpwssds \(%rcx\),%xmm4,%xmm2
+ +[a-f0-9]+:	c4 e2 59 53 11       	\{vex\} vpdpwssds \(%rcx\),%xmm4,%xmm2
  +[a-f0-9]+:	62 b2 5d 08 53 d6    	vpdpwssds %xmm22,%xmm4,%xmm2
  +[a-f0-9]+:	62 d2 5d 08 50 d4    	vpdpbusd %xmm12,%xmm4,%xmm2
 #pass
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 068858b1e7..9338b1f375 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -11091,7 +11091,6 @@ putop (const char *in_template, int sizeflag)
 		  *obufp++ = 'v';
 		  *obufp++ = 'e';
 		  *obufp++ = 'x';
-		  *obufp++ = '3';
 		  *obufp++ = '}';
 		}
 	      else if (rex & REX_W)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-26  1:54                                             ` Cui, Lili
@ 2020-10-26  1:57                                               ` H.J. Lu
  0 siblings, 0 replies; 44+ messages in thread
From: H.J. Lu @ 2020-10-26  1:57 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Jan Beulich, binutils

On Sun, Oct 25, 2020 at 6:55 PM Cui, Lili <lili.cui@intel.com> wrote:
>
>
> > > >> Hence faod: I think the change here is wrong and should either not
> > > >> be committed or reverted. (Oddly enough there have been no Intel
> > > >> syntax checks of any prefix uses at all - I would otherwise have
> > > >> outright nak-ed the change.)
> > > >
> > > > We can change disassembler output from {vex3} to {vex} and put back
> > > > the {vex3} tests in AVX VNNI.
> > >
> > > Good, thanks.
> > >
> >
> > Lili, please prepare a patch.
>
> Here is the patch; it keeps the {vex3} assembler input tests and changes the disassembler output, thanks.
>
> [PATCH] Change avxvnni disassembler output from {vex3} to {vex}
>
> gas/
>
>         * testsuite/gas/i386/avx-vnni.d: Change pseudo prefix from
>         {vex3} to {vex}.
>         * testsuite/gas/i386/x86-64-avx-vnni.d: Likewise.
>
> opcodes/
>
>         * i386-dis.c: Change "XV" to print "{vex}" pseudo prefix.
> ---
>  gas/testsuite/gas/i386/avx-vnni.d        | 32 ++++++++++++------------
>  gas/testsuite/gas/i386/x86-64-avx-vnni.d | 32 ++++++++++++------------
>  opcodes/i386-dis.c                       |  1 -
>  3 files changed, 32 insertions(+), 33 deletions(-)
>
> diff --git a/gas/testsuite/gas/i386/avx-vnni.d b/gas/testsuite/gas/i386/avx-vnni.d
> index 6e31528cf2..7d20c80973 100644
> --- a/gas/testsuite/gas/i386/avx-vnni.d
> +++ b/gas/testsuite/gas/i386/avx-vnni.d
> @@ -9,27 +9,27 @@ Disassembly of section .text:
>  0+ <_start>:
>   +[a-f0-9]+:   62 f2 5d 08 50 d2       vpdpbusd %xmm2,%xmm4,%xmm2
>   +[a-f0-9]+:   62 f2 5d 08 50 d2       vpdpbusd %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 50 d2          \{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 50 d2          \{vex3\} vpdpbusd %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 50 11          \{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 50 11          \{vex3\} vpdpbusd \(%ecx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 50 d2          \{vex\} vpdpbusd %xmm2,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 50 d2          \{vex\} vpdpbusd %xmm2,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 50 11          \{vex\} vpdpbusd \(%ecx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 50 11          \{vex\} vpdpbusd \(%ecx\),%xmm4,%xmm2
>   +[a-f0-9]+:   62 f2 5d 08 52 d2       vpdpwssd %xmm2,%xmm4,%xmm2
>   +[a-f0-9]+:   62 f2 5d 08 52 d2       vpdpwssd %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 52 d2          \{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 52 d2          \{vex3\} vpdpwssd %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 52 11          \{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 52 11          \{vex3\} vpdpwssd \(%ecx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 52 d2          \{vex\} vpdpwssd %xmm2,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 52 d2          \{vex\} vpdpwssd %xmm2,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 52 11          \{vex\} vpdpwssd \(%ecx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 52 11          \{vex\} vpdpwssd \(%ecx\),%xmm4,%xmm2
>   +[a-f0-9]+:   62 f2 5d 08 51 d2       vpdpbusds %xmm2,%xmm4,%xmm2
>   +[a-f0-9]+:   62 f2 5d 08 51 d2       vpdpbusds %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 51 d2          \{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 51 d2          \{vex3\} vpdpbusds %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 51 11          \{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 51 11          \{vex3\} vpdpbusds \(%ecx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 51 d2          \{vex\} vpdpbusds %xmm2,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 51 d2          \{vex\} vpdpbusds %xmm2,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 51 11          \{vex\} vpdpbusds \(%ecx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 51 11          \{vex\} vpdpbusds \(%ecx\),%xmm4,%xmm2
>   +[a-f0-9]+:   62 f2 5d 08 53 d2       vpdpwssds %xmm2,%xmm4,%xmm2
>   +[a-f0-9]+:   62 f2 5d 08 53 d2       vpdpwssds %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 53 d2          \{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 53 d2          \{vex3\} vpdpwssds %xmm2,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 53 11          \{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 53 11          \{vex3\} vpdpwssds \(%ecx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 53 d2          \{vex\} vpdpwssds %xmm2,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 53 d2          \{vex\} vpdpwssds %xmm2,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 53 11          \{vex\} vpdpwssds \(%ecx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 53 11          \{vex\} vpdpwssds \(%ecx\),%xmm4,%xmm2
>   +[a-f0-9]+:   62 f2 5d 08 50 d2       vpdpbusd %xmm2,%xmm4,%xmm2
>  #pass
> diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni.d b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
> index c4474739ed..6b3acab5d5 100644
> --- a/gas/testsuite/gas/i386/x86-64-avx-vnni.d
> +++ b/gas/testsuite/gas/i386/x86-64-avx-vnni.d
> @@ -9,31 +9,31 @@ Disassembly of section .text:
>  0+ <_start>:
>   +[a-f0-9]+:   62 d2 5d 08 50 d4       vpdpbusd %xmm12,%xmm4,%xmm2
>   +[a-f0-9]+:   62 d2 5d 08 50 d4       vpdpbusd %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 c2 59 50 d4          \{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 c2 59 50 d4          \{vex3\} vpdpbusd %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 50 11          \{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 50 11          \{vex3\} vpdpbusd \(%rcx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 c2 59 50 d4          \{vex\} vpdpbusd %xmm12,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 c2 59 50 d4          \{vex\} vpdpbusd %xmm12,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 50 11          \{vex\} vpdpbusd \(%rcx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 50 11          \{vex\} vpdpbusd \(%rcx\),%xmm4,%xmm2
>   +[a-f0-9]+:   62 b2 5d 08 50 d6       vpdpbusd %xmm22,%xmm4,%xmm2
>   +[a-f0-9]+:   62 d2 5d 08 52 d4       vpdpwssd %xmm12,%xmm4,%xmm2
>   +[a-f0-9]+:   62 d2 5d 08 52 d4       vpdpwssd %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 c2 59 52 d4          \{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 c2 59 52 d4          \{vex3\} vpdpwssd %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 52 11          \{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 52 11          \{vex3\} vpdpwssd \(%rcx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 c2 59 52 d4          \{vex\} vpdpwssd %xmm12,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 c2 59 52 d4          \{vex\} vpdpwssd %xmm12,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 52 11          \{vex\} vpdpwssd \(%rcx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 52 11          \{vex\} vpdpwssd \(%rcx\),%xmm4,%xmm2
>   +[a-f0-9]+:   62 b2 5d 08 52 d6       vpdpwssd %xmm22,%xmm4,%xmm2
>   +[a-f0-9]+:   62 d2 5d 08 51 d4       vpdpbusds %xmm12,%xmm4,%xmm2
>   +[a-f0-9]+:   62 d2 5d 08 51 d4       vpdpbusds %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 c2 59 51 d4          \{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 c2 59 51 d4          \{vex3\} vpdpbusds %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 51 11          \{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 51 11          \{vex3\} vpdpbusds \(%rcx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 c2 59 51 d4          \{vex\} vpdpbusds %xmm12,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 c2 59 51 d4          \{vex\} vpdpbusds %xmm12,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 51 11          \{vex\} vpdpbusds \(%rcx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 51 11          \{vex\} vpdpbusds \(%rcx\),%xmm4,%xmm2
>   +[a-f0-9]+:   62 b2 5d 08 51 d6       vpdpbusds %xmm22,%xmm4,%xmm2
>   +[a-f0-9]+:   62 d2 5d 08 53 d4       vpdpwssds %xmm12,%xmm4,%xmm2
>   +[a-f0-9]+:   62 d2 5d 08 53 d4       vpdpwssds %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 c2 59 53 d4          \{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 c2 59 53 d4          \{vex3\} vpdpwssds %xmm12,%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 53 11          \{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
> - +[a-f0-9]+:   c4 e2 59 53 11          \{vex3\} vpdpwssds \(%rcx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 c2 59 53 d4          \{vex\} vpdpwssds %xmm12,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 c2 59 53 d4          \{vex\} vpdpwssds %xmm12,%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 53 11          \{vex\} vpdpwssds \(%rcx\),%xmm4,%xmm2
> + +[a-f0-9]+:   c4 e2 59 53 11          \{vex\} vpdpwssds \(%rcx\),%xmm4,%xmm2
>   +[a-f0-9]+:   62 b2 5d 08 53 d6       vpdpwssds %xmm22,%xmm4,%xmm2
>   +[a-f0-9]+:   62 d2 5d 08 50 d4       vpdpbusd %xmm12,%xmm4,%xmm2
>  #pass
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index 068858b1e7..9338b1f375 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -11091,7 +11091,6 @@ putop (const char *in_template, int sizeflag)
>                   *obufp++ = 'v';
>                   *obufp++ = 'e';
>                   *obufp++ = 'x';
> -                 *obufp++ = '3';
>                   *obufp++ = '}';
>                 }
>               else if (rex & REX_W)
> --
> 2.17.1
>
> Thanks,
> Lili.
>

OK.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: x86: Support Intel AVX VNNI
  2020-10-23 21:30                                           ` H.J. Lu
  2020-10-26  1:54                                             ` Cui, Lili
@ 2020-10-26  8:46                                             ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2020-10-26  8:46 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Cui, Lili, binutils

On 23.10.2020 23:30, H.J. Lu wrote:
> On Fri, Oct 23, 2020 at 6:36 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 23.10.2020 15:17, H.J. Lu wrote:
>>> On Fri, Oct 23, 2020 at 12:04 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 23.10.2020 03:57, Cui, Lili wrote:
>>>>>>> output helping in any way, when comparing to all other AVX+ insns
>>>>>>> which have AVX512VL counterparts? (Apart from that I'd further
>>>>>>> question why it needs to be {vex3} when {vex} would suffice, but I'd
>>>>>>> like to see this dropped altogether anyway, except perhaps in some
>>>>>>> non-default mode, where it then should be output consistently.)
>>>>>>
>>>>>> Yes, {vex3} can be dropped from AVX VNNI tests.
>>>>
>>>> Okay, so we must have been talking past one another. I don't
>>>> see any good in dropping the tests, and this isn't what I did
>>>> suggest or talk about. {vex3} should be tested to be
>>>> properly accepted by the assembler, just like {vex}. What I
>>>> was saying is that _objdump output_ should have {vex} dropped,
>>>> and that it should have been {vex} instead of {vex3} there in
>>>> the first place.
>>>
>>> {vex} and {vex3} aren't new.   The new behavior of AVX VNNI is that
>>> {vex} or {vex3} is now mandatory for AVX VNNI.  Since {vex} or {vex3}
>>> is mandatory for AVX VNNI, it shouldn't be dropped in disassembler
>>> output.
>>
>> They're not helpful in the disassembler output, and their adding
>> is inconsistent with other cases where {vex} / {vex3} / {evex}
>> aren't being displayed despite being necessary in gas to achieve
>> the respective encoding.
> 
> AVX VNNI is an anomaly.  We can't apply the same rule to it.
> The {vex} prefix is needed for both assembler input and disassembler
> output.

What I continue to be missing is the "why" aspect on the disassembler
side.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2020-10-26  8:46 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-14  6:37 x86: Support Intel AVX VNNI Cui, Lili
2020-10-14 10:34 ` H.J. Lu
2020-10-14 13:12 ` Jan Beulich
2020-10-14 13:28   ` H.J. Lu
2020-10-15  7:10     ` Cui, Lili
2020-10-15  7:24       ` Jan Beulich
2020-10-15 11:15         ` H.J. Lu
2020-10-15 11:45           ` Cui, Lili
2020-10-16  2:05             ` H.J. Lu
2020-10-15 12:28           ` Jan Beulich
2020-10-15 12:38             ` H.J. Lu
2020-10-15 15:22               ` Jan Beulich
2020-10-15 15:23                 ` H.J. Lu
2020-10-15 15:26                   ` H.J. Lu
2020-10-15 15:28                   ` Jan Beulich
2020-10-15 15:34                     ` H.J. Lu
2020-10-15 16:04                       ` Jan Beulich
2020-10-15 16:15                         ` H.J. Lu
2020-10-16  6:10                           ` Jan Beulich
2020-10-16 18:07                             ` H.J. Lu
2020-10-19  6:28                               ` Jan Beulich
2020-10-19  8:26                                 ` Cui, Lili
2020-10-19 11:19                                   ` Jan Beulich
2020-10-19 12:24                                 ` H.J. Lu
2020-10-19 13:22                                   ` Jan Beulich
2020-10-19 13:37                                     ` H.J. Lu
2020-10-19 13:40                                       ` Jan Beulich
2020-10-19 13:43                                         ` H.J. Lu
2020-10-19 14:11                                           ` Jan Beulich
2020-10-19 14:21                                             ` H.J. Lu
2020-10-19 14:55                                               ` Jan Beulich
2020-10-19 19:52                                                 ` H.J. Lu
2020-10-20  8:00                                                   ` Jan Beulich
2020-10-20 17:17                                                     ` H.J. Lu
2020-10-21  7:07                                                       ` Jan Beulich
2020-10-23  1:57                                   ` Cui, Lili
2020-10-23  2:10                                     ` H.J. Lu
2020-10-23  7:04                                     ` Jan Beulich
2020-10-23 13:17                                       ` H.J. Lu
2020-10-23 13:36                                         ` Jan Beulich
2020-10-23 21:30                                           ` H.J. Lu
2020-10-26  1:54                                             ` Cui, Lili
2020-10-26  1:57                                               ` H.J. Lu
2020-10-26  8:46                                             ` Jan Beulich
