public inbox for binutils@sourceware.org
* [PATCH] Support Intel AMX-COMPLEX
@ 2023-04-03  7:11 Haochen Jiang
  2023-04-04  7:35 ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Haochen Jiang @ 2023-04-03  7:11 UTC
  To: binutils; +Cc: jbeulich, hjl.tools

Hi all,

This patch adds support for the Intel AMX-COMPLEX instructions.

It is based on the newly released Intel Architecture Instruction Set
Extensions and Future Features programming reference.

The document is available at:
https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Tested on x86_64-pc-linux-gnu. Ok for trunk?

BRs,
Haochen

gas/ChangeLog:

	* NEWS: Support Intel AMX-COMPLEX.
	* config/tc-i386.c: Add amx_complex.
	* doc/c-i386.texi: Document .amx_complex.
	* testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests.
	* testsuite/gas/i386/amx-complex-inval.l: New test.
	* testsuite/gas/i386/amx-complex-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New.
	(PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto.
	(X86_64_VEX_0F386C): Ditto.
	(VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto.
	(VEX_W_0F386C_X86_64): Ditto.
	(mod_table): Add MOD_VEX_0F386C_X86_64_W_0.
	(prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0.
	(x86_64_table): Add X86_64_VEX_0F386C.
	(vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1.
	(vex_w_table): Add VEX_W_0F386C_X86_64.
	* i386-gen.c (isa_dependencies): Add AMX_COMPLEX.
	(cpu_flags): Ditto.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (CpuAMX_COMPLEX): New.
	(i386_cpu_flags): Add cpuamx_complex.
	* i386-opc.tbl: Add AMX-COMPLEX instructions.
	* i386-tbl.h: Regenerated.
---
 gas/NEWS                                      |    2 +
 gas/config/tc-i386.c                          |    1 +
 gas/doc/c-i386.texi                           |    4 +-
 gas/testsuite/gas/i386/amx-complex-inval.l    |    3 +
 gas/testsuite/gas/i386/amx-complex-inval.s    |    7 +
 gas/testsuite/gas/i386/i386.exp               |    3 +
 .../gas/i386/x86-64-amx-complex-intel.d       |   18 +
 gas/testsuite/gas/i386/x86-64-amx-complex.d   |   15 +
 gas/testsuite/gas/i386/x86-64-amx-complex.s   |   15 +
 opcodes/i386-dis.c                            |   34 +-
 opcodes/i386-gen.c                            |    3 +
 opcodes/i386-init.h                           |  542 +-
 opcodes/i386-mnem.h                           | 1098 +--
 opcodes/i386-opc.h                            |    3 +
 opcodes/i386-opc.tbl                          |    7 +
 opcodes/i386-tbl.h                            | 7834 +++++++++--------
 16 files changed, 4878 insertions(+), 4711 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.l
 create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.s

diff --git a/gas/NEWS b/gas/NEWS
index f95383e83af..42a2005d7c9 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,7 @@
 -*- text -*-
 
+* Add support for Intel AMX-COMPLEX instructions.
+
 * Add SME2 support to the AArch64 port.
 
 * A new .insn directive is recognized by x86 gas.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index ea2ed0d818e..ea5705da4af 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1113,6 +1113,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (amx_int8, AMX_INT8, ANY_AMX_INT8, false),
   SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false),
   SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false),
+  SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false),
   SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false),
   SUBARCH (movdiri, MOVDIRI, MOVDIRI, false),
   SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false),
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 617cbd46cb7..15d060b2a33 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -208,6 +208,7 @@ accept various extension mnemonics.  For example,
 @code{amx_int8},
 @code{amx_bf16},
 @code{amx_fp16},
+@code{amx_complex},
 @code{amx_tile},
 @code{vmx},
 @code{vmfunc},
@@ -1636,7 +1637,8 @@ supported on the CPU specified.  The choices for @var{cpu_type} are:
 @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote}
 @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq}
 @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
-@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @tab @samp{.amx_tile}
+@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16}
+@item @samp{.amx_complex} @tab @samp{.amx_tile}
 @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset}
 @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
 @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme}
diff --git a/gas/testsuite/gas/i386/amx-complex-inval.l b/gas/testsuite/gas/i386/amx-complex-inval.l
new file mode 100644
index 00000000000..df6713c5d8b
--- /dev/null
+++ b/gas/testsuite/gas/i386/amx-complex-inval.l
@@ -0,0 +1,3 @@
+.* Assembler messages:
+.*:6: Error: `tcmmimfp16ps' is only supported in 64-bit mode
+.*:7: Error: `tcmmrlfp16ps' is only supported in 64-bit mode
diff --git a/gas/testsuite/gas/i386/amx-complex-inval.s b/gas/testsuite/gas/i386/amx-complex-inval.s
new file mode 100644
index 00000000000..b1bbf32585b
--- /dev/null
+++ b/gas/testsuite/gas/i386/amx-complex-inval.s
@@ -0,0 +1,7 @@
+# Check Illegal AMX-COMPLEX instructions
+
+	.allow_index_reg
+	.text
+_start:
+	tcmmimfp16ps	%tmm1, %tmm2, %tmm3
+	tcmmrlfp16ps	%tmm1, %tmm2, %tmm3
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index c44f071a0e2..c098ce2185a 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -493,6 +493,7 @@ if [gas_32_check] then {
     run_dump_test "avx-ne-convert-intel"
     run_dump_test "raoint"
     run_dump_test "raoint-intel"
+    run_list_test "amx-complex-inval"
     run_list_test "sg"
     run_dump_test "clzero"
     run_dump_test "invlpgb"
@@ -1183,6 +1184,8 @@ if [gas_64_check] then {
     run_dump_test "x86-64-avx-ne-convert-intel"
     run_dump_test "x86-64-raoint"
     run_dump_test "x86-64-raoint-intel"
+    run_dump_test "x86-64-amx-complex"
+    run_dump_test "x86-64-amx-complex-intel"
     run_dump_test "x86-64-clzero"
     run_dump_test "x86-64-mwaitx-bdver4"
     run_list_test "x86-64-mwaitx-reg"
diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
new file mode 100644
index 00000000000..8f2e015104f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
@@ -0,0 +1,18 @@
+#as:
+#objdump: -dw -Mintel
+#name: x86_64 AMX-COMPLEX insns (Intel disassembly)
+#source: x86-64-amx-complex.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4
+\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1
+\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4
+\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1
+\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4
+\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1
+\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4
+\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1
diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.d b/gas/testsuite/gas/i386/x86-64-amx-complex.d
new file mode 100644
index 00000000000..b2157960027
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-complex.d
@@ -0,0 +1,15 @@
+#as:
+#objdump: -dw
+#name: x86_64 AMX-COMPLEX insns
+#source: x86-64-amx-complex.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps %tmm4,%tmm5,%tmm6
+\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps %tmm1,%tmm2,%tmm3
+\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps %tmm4,%tmm5,%tmm6
+\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps %tmm1,%tmm2,%tmm3
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.s b/gas/testsuite/gas/i386/x86-64-amx-complex.s
new file mode 100644
index 00000000000..56f1a00fa9e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-complex.s
@@ -0,0 +1,15 @@
+# Check 64bit AMX-COMPLEX instructions
+
+	.allow_index_reg
+	.text
+_start:
+	tcmmimfp16ps	%tmm4, %tmm5, %tmm6	 #AMX-COMPLEX
+	tcmmimfp16ps	%tmm1, %tmm2, %tmm3	 #AMX-COMPLEX
+	tcmmrlfp16ps	%tmm4, %tmm5, %tmm6	 #AMX-COMPLEX
+	tcmmrlfp16ps	%tmm1, %tmm2, %tmm3	 #AMX-COMPLEX
+
+.intel_syntax noprefix
+	tcmmimfp16ps	tmm6, tmm5, tmm4	 #AMX-COMPLEX
+	tcmmimfp16ps	tmm3, tmm2, tmm1	 #AMX-COMPLEX
+	tcmmrlfp16ps	tmm6, tmm5, tmm4	 #AMX-COMPLEX
+	tcmmrlfp16ps	tmm3, tmm2, tmm1	 #AMX-COMPLEX
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index a414e8c9b1e..d6b0fdd4ba3 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -943,6 +943,7 @@ enum
   MOD_VEX_0F385E_X86_64_P_1_W_0,
   MOD_VEX_0F385E_X86_64_P_2_W_0,
   MOD_VEX_0F385E_X86_64_P_3_W_0,
+  MOD_VEX_0F386C_X86_64_W_0,
   MOD_VEX_0F388C,
   MOD_VEX_0F388E,
   MOD_VEX_0F3A30_L_0,
@@ -1145,6 +1146,7 @@ enum
   PREFIX_VEX_0F3851_W_0,
   PREFIX_VEX_0F385C_X86_64,
   PREFIX_VEX_0F385E_X86_64,
+  PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0,
   PREFIX_VEX_0F3872,
   PREFIX_VEX_0F38B0_W_0,
   PREFIX_VEX_0F38B1_W_0,
@@ -1298,6 +1300,7 @@ enum
   X86_64_VEX_0F384B,
   X86_64_VEX_0F385C,
   X86_64_VEX_0F385E,
+  X86_64_VEX_0F386C,
   X86_64_VEX_0F38E0,
   X86_64_VEX_0F38E1,
   X86_64_VEX_0F38E2,
@@ -1398,6 +1401,7 @@ enum
   VEX_LEN_0F385E_X86_64_P_1_W_0_M_0,
   VEX_LEN_0F385E_X86_64_P_2_W_0_M_0,
   VEX_LEN_0F385E_X86_64_P_3_W_0_M_0,
+  VEX_LEN_0F386C_X86_64_W_0_M_1,
   VEX_LEN_0F38DB,
   VEX_LEN_0F38F2,
   VEX_LEN_0F38F3,
@@ -1565,6 +1569,7 @@ enum
   VEX_W_0F385E_X86_64_P_1,
   VEX_W_0F385E_X86_64_P_2,
   VEX_W_0F385E_X86_64_P_3,
+  VEX_W_0F386C_X86_64,
   VEX_W_0F3872_P_1,
   VEX_W_0F3878,
   VEX_W_0F3879,
@@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = {
     { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) },
   },
 
+  /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */
+  {
+    { "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 },
+    { Bad_Opcode },
+    { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 },
+  },
+
   /* PREFIX_VEX_0F3872 */
   {
     { Bad_Opcode },
@@ -4486,6 +4498,12 @@ static const struct dis386 x86_64_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_0F385E_X86_64) },
   },
 
+  /* X86_64_VEX_0F386C */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F386C_X86_64) },
+  },
+
   /* X86_64_VEX_0F38E0 */
   {
     { Bad_Opcode },
@@ -6461,7 +6479,7 @@ static const struct dis386 vex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_VEX_0F386C) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -7181,6 +7199,11 @@ static const struct dis386 vex_len_table[][2] = {
     { "tdpbssd", {TMM, EXtmm, VexTmm }, 0 },
   },
 
+  /* VEX_LEN_0F386C_X86_64_W_0_M_1 */
+  {
+    { PREFIX_TABLE (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0) },
+  },
+
   /* VEX_LEN_0F38DB */
   {
     { "vaesimc",	{ XM, EXx }, PREFIX_DATA },
@@ -7849,6 +7872,10 @@ static const struct dis386 vex_w_table[][2] = {
     /* VEX_W_0F385E_X86_64_P_3 */
     { MOD_TABLE (MOD_VEX_0F385E_X86_64_P_3_W_0) },
   },
+  {
+    /* VEX_W_0F386C_X86_64 */
+    { MOD_TABLE (MOD_VEX_0F386C_X86_64_W_0) },
+  },
   {
     /* VEX_W_0F3872_P_1 */
     { "%XVvcvtneps2bf16%XY", { XMM, EXx }, 0 },
@@ -8696,6 +8723,11 @@ static const struct dis386 mod_table[][2] = {
     { Bad_Opcode },
     { VEX_LEN_TABLE (VEX_LEN_0F385E_X86_64_P_3_W_0_M_0) },
   },
+  {
+    /* MOD_VEX_0F386C_X86_64_W_0 */
+    { Bad_Opcode },
+    { VEX_LEN_TABLE (VEX_LEN_0F386C_X86_64_W_0_M_1) },
+  },
   {
     /* MOD_VEX_0F388C */
     { "vpmaskmov%DQ",	{ XM, Vex, Mx }, PREFIX_DATA },
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 489ae3429c9..c2ac3c6832d 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -240,6 +240,8 @@ static const dependency isa_dependencies[] =
     "AMX_TILE" },
   { "AMX_FP16",
     "AMX_TILE" },
+  { "AMX_COMPLEX",
+    "AMX_TILE" },
   { "KL",
     "SSE2" },
   { "WIDEKL",
@@ -378,6 +380,7 @@ static bitfield cpu_flags[] =
   BITFIELD (AMX_INT8),
   BITFIELD (AMX_BF16),
   BITFIELD (AMX_FP16),
+  BITFIELD (AMX_COMPLEX),
   BITFIELD (AMX_TILE),
   BITFIELD (MOVDIRI),
   BITFIELD (MOVDIR64B),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 23d93ae6f81..46a36c3e965 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -248,6 +248,8 @@ enum
   CpuAMX_BF16,
   /* AMX-FP16 instructions required */
   CpuAMX_FP16,
+  /* Intel AMX-COMPLEX Instructions support required.  */
+  CpuAMX_COMPLEX,
   /* AMX-TILE instructions required */
   CpuAMX_TILE,
   /* GFNI instructions required */
@@ -432,6 +434,7 @@ typedef union i386_cpu_flags
       unsigned int cpuamx_int8:1;
       unsigned int cpuamx_bf16:1;
       unsigned int cpuamx_fp16:1;
+      unsigned int cpuamx_complex:1;
       unsigned int cpuamx_tile:1;
       unsigned int cpugfni:1;
       unsigned int cpuvaes:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 9cc909925f4..240e1783d0d 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
 
 // AMX instructions end.
 
+// AMX-COMPLEX instructions.
+
+tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+
+// AMX-COMPLEX instructions end.
+
 // KEYLOCKER instructions.
 
 loadiwkey, 0xf30f38dc, KL, Load|Modrm|NoSuf, { RegXMM, RegXMM }
-- 
2.31.1
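
For readers following the i386-dis.c hunks above: the five new table entries form a single lookup chain for VEX.0F38 opcode 0x6c (x86_64_table, then vex_w_table, mod_table, vex_len_table and finally prefix_table). The Python sketch below is only a model of that chain, not code from binutils. The names `lookup`, `disassemble`, `PP_TO_PREFIX_ROW` and `PREFIX_ROWS` are made up for illustration, and the sketch assumes the C4 form of the VEX prefix with the inverted R/X/B bits all set, as in the encodings expected by x86-64-amx-complex.d.

```python
# Illustrative model only: the real tables live in opcodes/i386-dis.c.

# VEX.pp value -> row of a prefix_table entry (0 = none, 1 = F3, 2 = 66, 3 = F2).
PP_TO_PREFIX_ROW = {0b00: 0, 0b01: 2, 0b10: 1, 0b11: 3}

# PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 as added by the patch
# (None stands for Bad_Opcode or an unpopulated slot).
PREFIX_ROWS = ["tcmmrlfp16ps", None, "tcmmimfp16ps", None]


def lookup(is_64bit, vex_w, modrm_mod, vex_l, vex_pp):
    """Walk the table chain; return a mnemonic or None for a bad opcode."""
    if not is_64bit:      # x86_64_table: the non-64-bit slot is Bad_Opcode
        return None
    if vex_w != 0:        # vex_w_table: only the W=0 slot is populated
        return None
    if modrm_mod != 3:    # mod_table: the memory-operand slot is Bad_Opcode
        return None
    if vex_l != 0:        # vex_len_table: only the L=0 slot is populated
        return None
    return PREFIX_ROWS[PP_TO_PREFIX_ROW[vex_pp]]


def disassemble(code, is_64bit=True):
    """Render a 5-byte C4-form encoding in AT&T syntax (sketch).

    The inverted R/X/B bits are ignored here, i.e. assumed to be set.
    """
    c4, b1, b2, opcode, modrm = code
    if c4 != 0xC4 or (b1 & 0x1F) != 0x02 or opcode != 0x6C:
        raise ValueError("not a C4-form VEX.0F38 0x6c encoding")
    vex_w = b2 >> 7
    vvvv = ~(b2 >> 3) & 0xF          # vvvv is stored inverted
    vex_l = (b2 >> 2) & 1
    vex_pp = b2 & 3
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    mnemonic = lookup(is_64bit, vex_w, mod, vex_l, vex_pp)
    if mnemonic is None:
        return "(bad)"
    # AT&T order as in the .d file: vvvv source, ModRM.rm source, ModRM.reg dest.
    return f"{mnemonic} %tmm{vvvv},%tmm{rm},%tmm{reg}"


if __name__ == "__main__":
    for text in ("c4 e2 59 6c f5", "c4 e2 71 6c da",
                 "c4 e2 58 6c f5", "c4 e2 70 6c da"):
        print(text, "->", disassemble(bytes.fromhex(text)))
```

Each test in the chain corresponds to one suffix in the enumerator name PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0, which is why the name is that long.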



* Re: [PATCH] Support Intel AMX-COMPLEX
  2023-04-03  7:11 [PATCH] Support Intel AMX-COMPLEX Haochen Jiang
@ 2023-04-04  7:35 ` Jan Beulich
  2023-04-04  8:41   ` Jiang, Haochen
  2023-04-06  7:17   ` [PATCH v2] " Haochen Jiang
  0 siblings, 2 replies; 7+ messages in thread
From: Jan Beulich @ 2023-04-04  7:35 UTC
  To: Haochen Jiang; +Cc: hjl.tools, binutils

On 03.04.2023 09:11, Haochen Jiang wrote:
> @@ -1183,6 +1184,8 @@ if [gas_64_check] then {
>      run_dump_test "x86-64-avx-ne-convert-intel"
>      run_dump_test "x86-64-raoint"
>      run_dump_test "x86-64-raoint-intel"
> +    run_dump_test "x86-64-amx-complex"
> +    run_dump_test "x86-64-amx-complex-intel"
>      run_dump_test "x86-64-clzero"
>      run_dump_test "x86-64-mwaitx-bdver4"
>      run_list_test "x86-64-mwaitx-reg"

There are constraints on operand combinations, like for tdp*, which want
testing here as well (both the assembler and disassembler sides) imo.

> @@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = {
>      { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) },
>    },
>  
> +  /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */
> +  {
> +    { "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> +    { Bad_Opcode },
> +    { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> +  },

You could avoid going through vex_w_table[] by making use of %XS here.
(I guess I'll make a similar change for tdp*16ps, but - to avoid
causing conflicts - perhaps only once yours has gone in.)

> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -248,6 +248,8 @@ enum
>    CpuAMX_BF16,
>    /* AMX-FP16 instructions required */
>    CpuAMX_FP16,
> +  /* Intel AMX-COMPLEX Instructions support required.  */
> +  CpuAMX_COMPLEX,
>    /* AMX-TILE instructions required */
>    CpuAMX_TILE,
>    /* GFNI instructions required */

In line with adjacent comments, please omit "Intel" and "support" from
the comment, and don't start "instructions" with a capital letter. Plus
while the full stop is in line with general comment style, looking at
adjacent comments here it probably also wants omitting.

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
>  
>  // AMX instructions end.
>  
> +// AMX-COMPLEX instructions.
> +
> +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> +
> +// AMX-COMPLEX instructions end.

I think these would better not have their own comment-bounded group, but
go inside the "AMX instructions" section (which already covers all AMX-*).

Jan


* RE: [PATCH] Support Intel AMX-COMPLEX
  2023-04-04  7:35 ` Jan Beulich
@ 2023-04-04  8:41   ` Jiang, Haochen
  2023-04-06  7:17   ` [PATCH v2] " Haochen Jiang
  1 sibling, 0 replies; 7+ messages in thread
From: Jiang, Haochen @ 2023-04-04  8:41 UTC
  To: Beulich, Jan; +Cc: hjl.tools, binutils

> On 03.04.2023 09:11, Haochen Jiang wrote:
> > @@ -1183,6 +1184,8 @@ if [gas_64_check] then {
> >      run_dump_test "x86-64-avx-ne-convert-intel"
> >      run_dump_test "x86-64-raoint"
> >      run_dump_test "x86-64-raoint-intel"
> > +    run_dump_test "x86-64-amx-complex"
> > +    run_dump_test "x86-64-amx-complex-intel"
> >      run_dump_test "x86-64-clzero"
> >      run_dump_test "x86-64-mwaitx-bdver4"
> >      run_list_test "x86-64-mwaitx-reg"
> 
> There are constraints on operand combinations, like for tdp*, which want
> testing here as well (both the assembler and disassembler sides) imo.

I just saw those testcases; I will add similar ones in the v2 patch, as was
done for tdp*. Thanks for the reminder.

> 
> > @@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = {
> >      { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) },
> >    },
> >
> > +  /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */
> > +  {
> > +    { "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> > +    { Bad_Opcode },
> > +    { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> > +  },
> 
> You could avoid going through vex_w_table[] by making use of %XS here.
> (I guess I'll make a similar change for tdp*16ps, but - to avoid causing conflicts
> - perhaps only once yours went in.)

I will leave this to you; using %XS does eliminate the pass through the W table.

> 
> > --- a/opcodes/i386-opc.h
> > +++ b/opcodes/i386-opc.h
> > @@ -248,6 +248,8 @@ enum
> >    CpuAMX_BF16,
> >    /* AMX-FP16 instructions required */
> >    CpuAMX_FP16,
> > +  /* Intel AMX-COMPLEX Instructions support required.  */
> > +  CpuAMX_COMPLEX,
> >    /* AMX-TILE instructions required */
> >    CpuAMX_TILE,
> >    /* GFNI instructions required */
> 
> In line with adjacent comments, please omit "Intel" and "support" from the
> comment, and don't start "instructions" with a capital letter. Plus while the
> full stop is in line with general comment style, looking at adjacent comments
> here it probably also wants omitting.

OK, will do that in the v2 patch.

> 
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
> >
> >  // AMX instructions end.
> >
> > +// AMX-COMPLEX instructions.
> > +
> > +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > +
> > +// AMX-COMPLEX instructions end.
> 
> I think these would better not have their own comment-bounded group, but
> go inside the "AMX instructions" section (which already covers all AMX-*).

I will put them in alphabetical order in the v2 patch, which means before tdp*.

I really appreciate your review and will send the v2 patch soon.

Haochen

> 
> Jan


* [PATCH v2] Support Intel AMX-COMPLEX
  2023-04-04  7:35 ` Jan Beulich
  2023-04-04  8:41   ` Jiang, Haochen
@ 2023-04-06  7:17   ` Haochen Jiang
  2023-04-06  9:45     ` Jan Beulich
  2023-04-07 15:49     ` H.J. Lu
  1 sibling, 2 replies; 7+ messages in thread
From: Haochen Jiang @ 2023-04-06  7:17 UTC
  To: binutils, jbeulich; +Cc: hjl.tools

Hi all,

The v2 patch makes several changes:

1.
> > @@ -1183,6 +1184,8 @@ if [gas_64_check] then {
> >      run_dump_test "x86-64-avx-ne-convert-intel"
> >      run_dump_test "x86-64-raoint"
> >      run_dump_test "x86-64-raoint-intel"
> > +    run_dump_test "x86-64-amx-complex"
> > +    run_dump_test "x86-64-amx-complex-intel"
> >      run_dump_test "x86-64-clzero"
> >      run_dump_test "x86-64-mwaitx-bdver4"
> >      run_list_test "x86-64-mwaitx-reg"
> 
> There are constraints on operand combinations, like for tdp*, which want
> testing here as well (both the assembler and disassembler sides) imo.

Added the x86-64-amx-complex-bad testcases. The order of operands 2 and 3 is
still reversed here; we can fix that once PR30317 is resolved.

2.
> > --- a/opcodes/i386-opc.h
> > +++ b/opcodes/i386-opc.h
> > @@ -248,6 +248,8 @@ enum
> >    CpuAMX_BF16,
> >    /* AMX-FP16 instructions required */
> >    CpuAMX_FP16,
> > +  /* Intel AMX-COMPLEX Instructions support required.  */
> > +  CpuAMX_COMPLEX,
> >    /* AMX-TILE instructions required */
> >    CpuAMX_TILE,
> >    /* GFNI instructions required */
> 
> In line with adjacent comments, please omit "Intel" and "support" from the
> comment, and don't start "instructions" with a capital letter. Plus while the
> full stop is in line with general comment style, looking at adjacent comments
> here it probably also wants omitting.

Adjusted the comment here.

3.
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
> >
> >  // AMX instructions end.
> >
> > +// AMX-COMPLEX instructions.
> > +
> > +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > +
> > +// AMX-COMPLEX instructions end.
>
> I think these would better not have their own comment-bounded group, but
> go inside the "AMX instructions" section (which already covers all AMX-*).

Put them in alphabetical order among the AMX instructions.
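
As a reading aid for the x86-64-amx-complex-bad testcases from item 1, the sketch below splits the two payload bytes of a C4-form VEX prefix into their fields and reports which field the test treats as illegal. It is a hedged illustration in Python, not binutils code. The helper names `vex3_fields` and `violations` are hypothetical, and the list of rules is simply read off the comments in x86-64-amx-complex-bad.s, not taken from the ISE document.

```python
def vex3_fields(b1, b2):
    """Split the two bytes following 0xC4 into VEX fields.

    R, X, B and vvvv are kept exactly as stored in the prefix, i.e. in
    their inverted form, matching the wording of the test comments.
    """
    return {
        "R": b1 >> 7,
        "X": (b1 >> 6) & 1,
        "B": (b1 >> 5) & 1,
        "map": b1 & 0x1F,
        "W": b2 >> 7,
        "vvvv": (b2 >> 3) & 0xF,
        "L": (b2 >> 2) & 1,
        "pp": b2 & 3,
    }


def violations(b1, b2):
    """List the fields that the -bad test marks as illegal (sketch)."""
    f = vex3_fields(b1, b2)
    found = []
    if f["W"] != 0:
        found.append("VEX.W = 1")
    if f["L"] != 0:
        found.append("VEX.L = 1")
    if f["R"] == 0:                  # stored 0 selects a ModRM.reg register >= 8
        found.append("VEX.R = 0")
    if f["B"] == 0:                  # stored 0 selects a ModRM.rm register >= 8
        found.append("VEX.B = 0")
    if f["vvvv"] & 0x8 == 0:         # stored top bit 0 selects a register >= 8
        found.append(f"VEX.vvvv = {f['vvvv']:04b}")
    return found


if __name__ == "__main__":
    # Prefix payload bytes taken from x86-64-amx-complex-bad.d.
    for b1, b2 in ((0xE2, 0xD9), (0xE2, 0x5D), (0x62, 0x59),
                   (0xC2, 0x59), (0xE2, 0x31), (0xE2, 0x59)):
        print(f"c4 {b1:02x} {b2:02x}:", violations(b1, b2) or "ok")
```

The last pair (e2 59) is the prefix of the valid tcmmimfp16ps %tmm4,%tmm5,%tmm6 encoding and is included as a control.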

These changes can be seen in the patch below. Thanks for your review!

Thx,
Haochen

gas/ChangeLog:

	* NEWS: Support Intel AMX-COMPLEX.
	* config/tc-i386.c: Add amx_complex.
	* doc/c-i386.texi: Document .amx_complex.
	* testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests.
	* testsuite/gas/i386/amx-complex-inval.l: New test.
	* testsuite/gas/i386/amx-complex-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex-bad.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex-bad.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New.
	(PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto.
	(X86_64_VEX_0F386C): Ditto.
	(VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto.
	(VEX_W_0F386C_X86_64): Ditto.
	(mod_table): Add MOD_VEX_0F386C_X86_64_W_0.
	(prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0.
	(x86_64_table): Add X86_64_VEX_0F386C.
	(vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1.
	(vex_w_table): Add VEX_W_0F386C_X86_64.
	* i386-gen.c (isa_dependencies): Add AMX_COMPLEX.
	(cpu_flags): Ditto.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (CpuAMX_COMPLEX): New.
	(i386_cpu_flags): Add cpuamx_complex.
	* i386-opc.tbl: Add AMX-COMPLEX instructions.
	* i386-tbl.h: Regenerated.
---
 gas/NEWS                                      |    2 +
 gas/config/tc-i386.c                          |    1 +
 gas/doc/c-i386.texi                           |    4 +-
 gas/testsuite/gas/i386/amx-complex-inval.l    |    3 +
 gas/testsuite/gas/i386/amx-complex-inval.s    |    7 +
 gas/testsuite/gas/i386/i386.exp               |    4 +
 .../gas/i386/x86-64-amx-complex-bad.d         |   19 +
 .../gas/i386/x86-64-amx-complex-bad.s         |   17 +
 .../gas/i386/x86-64-amx-complex-intel.d       |   18 +
 gas/testsuite/gas/i386/x86-64-amx-complex.d   |   15 +
 gas/testsuite/gas/i386/x86-64-amx-complex.s   |   15 +
 opcodes/i386-dis.c                            |   34 +-
 opcodes/i386-gen.c                            |    3 +
 opcodes/i386-init.h                           |  542 +-
 opcodes/i386-mnem.h                           | 1098 +--
 opcodes/i386-opc.h                            |    3 +
 opcodes/i386-opc.tbl                          |    3 +
 opcodes/i386-tbl.h                            | 7836 +++++++++--------
 18 files changed, 4912 insertions(+), 4712 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.l
 create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.s

diff --git a/gas/NEWS b/gas/NEWS
index f95383e83af..42a2005d7c9 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,7 @@
 -*- text -*-
 
+* Add support for Intel AMX-COMPLEX instructions.
+
 * Add SME2 support to the AArch64 port.
 
 * A new .insn directive is recognized by x86 gas.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index ea2ed0d818e..ea5705da4af 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1113,6 +1113,7 @@ static const arch_entry cpu_arch[] =
   SUBARCH (amx_int8, AMX_INT8, ANY_AMX_INT8, false),
   SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false),
   SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false),
+  SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false),
   SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false),
   SUBARCH (movdiri, MOVDIRI, MOVDIRI, false),
   SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false),
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 617cbd46cb7..15d060b2a33 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -208,6 +208,7 @@ accept various extension mnemonics.  For example,
 @code{amx_int8},
 @code{amx_bf16},
 @code{amx_fp16},
+@code{amx_complex},
 @code{amx_tile},
 @code{vmx},
 @code{vmfunc},
@@ -1636,7 +1637,8 @@ supported on the CPU specified.  The choices for @var{cpu_type} are:
 @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote}
 @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq}
 @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
-@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @tab @samp{.amx_tile}
+@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16}
+@item @samp{.amx_complex} @tab @samp{.amx_tile}
 @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset}
 @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
 @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme}
diff --git a/gas/testsuite/gas/i386/amx-complex-inval.l b/gas/testsuite/gas/i386/amx-complex-inval.l
new file mode 100644
index 00000000000..df6713c5d8b
--- /dev/null
+++ b/gas/testsuite/gas/i386/amx-complex-inval.l
@@ -0,0 +1,3 @@
+.* Assembler messages:
+.*:6: Error: `tcmmimfp16ps' is only supported in 64-bit mode
+.*:7: Error: `tcmmrlfp16ps' is only supported in 64-bit mode
diff --git a/gas/testsuite/gas/i386/amx-complex-inval.s b/gas/testsuite/gas/i386/amx-complex-inval.s
new file mode 100644
index 00000000000..b1bbf32585b
--- /dev/null
+++ b/gas/testsuite/gas/i386/amx-complex-inval.s
@@ -0,0 +1,7 @@
+# Check Illegal AMX-COMPLEX instructions
+
+	.allow_index_reg
+	.text
+_start:
+	tcmmimfp16ps	%tmm1, %tmm2, %tmm3
+	tcmmrlfp16ps	%tmm1, %tmm2, %tmm3
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index c44f071a0e2..6d326b49a39 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -493,6 +493,7 @@ if [gas_32_check] then {
     run_dump_test "avx-ne-convert-intel"
     run_dump_test "raoint"
     run_dump_test "raoint-intel"
+    run_list_test "amx-complex-inval"
     run_list_test "sg"
     run_dump_test "clzero"
     run_dump_test "invlpgb"
@@ -1183,6 +1184,9 @@ if [gas_64_check] then {
     run_dump_test "x86-64-avx-ne-convert-intel"
     run_dump_test "x86-64-raoint"
     run_dump_test "x86-64-raoint-intel"
+    run_dump_test "x86-64-amx-complex"
+    run_dump_test "x86-64-amx-complex-intel"
+    run_dump_test "x86-64-amx-complex-bad"
     run_dump_test "x86-64-clzero"
     run_dump_test "x86-64-mwaitx-bdver4"
     run_list_test "x86-64-mwaitx-reg"
diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d
new file mode 100644
index 00000000000..646015ca9bb
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d
@@ -0,0 +1,19 @@
+#as:
+#objdump: -drw
+#name: x86_64 Illegal AMX-COMPLEX insns
+#source: x86-64-amx-complex-bad.s
+
+.*: +file format .*
+
+
+Disassembly of section \.text:
+
+0+ <\.text>:
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 6c[ 	]*\(bad\)[ 	]*
+[ 	]*[a-f0-9]+:[ 	]*f5[ 	]*cmc.*
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 5d 6c[ 	]*\(bad\)[ 	]*
+[ 	]*[a-f0-9]+:[ 	]*f5[ 	]*cmc.*
+[ 	]*[a-f0-9]+:[ 	]*c4 62 59 6c f5[ 	]*tcmmimfp16ps %tmm4,%tmm5,\(bad\)
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 59 6c f5[ 	]*tcmmimfp16ps %tmm4,\(bad\),%tmm6
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 31 6c f5[ 	]*tcmmimfp16ps \(bad\),%tmm5,%tmm6
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s
new file mode 100644
index 00000000000..b2e55b13825
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s
@@ -0,0 +1,17 @@
+# Check Illegal 64bit AMX-COMPLEX instructions
+
+.text
+	#tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.W = 1 (illegal value).
+	.insn VEX.128.66.0F38.W1 0x6c, %tmm5, %tmm4, %tmm6
+
+	#tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.L = 1 (illegal value).
+	.insn VEX.256.66.0F38.W0 0x6c, %tmm5, %tmm4, %tmm6
+
+	#tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.R = 0 (illegal value).
+	.insn VEX.128.66.0F38.W0 0x6c, %xmm5, %xmm4, %xmm14
+
+	#tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.B = 0 (illegal value).
+	.insn VEX.128.66.0F38.W0 0x6c, %xmm13, %xmm4, %xmm6
+
+	#tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.VVVV = 0110 (illegal value).
+	.insn VEX.128.66.0F38.W0 0x6c, %xmm5, %xmm9, %xmm6
diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
new file mode 100644
index 00000000000..8f2e015104f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
@@ -0,0 +1,18 @@
+#as:
+#objdump: -dw -Mintel
+#name: x86_64 AMX-COMPLEX insns (Intel disassembly)
+#source: x86-64-amx-complex.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4
+\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1
+\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4
+\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1
+\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4
+\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1
+\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4
+\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1
diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.d b/gas/testsuite/gas/i386/x86-64-amx-complex.d
new file mode 100644
index 00000000000..b2157960027
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-complex.d
@@ -0,0 +1,15 @@
+#as:
+#objdump: -dw
+#name: x86_64 AMX-COMPLEX insns
+#source: x86-64-amx-complex.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps %tmm4,%tmm5,%tmm6
+\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps %tmm1,%tmm2,%tmm3
+\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps %tmm4,%tmm5,%tmm6
+\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps %tmm1,%tmm2,%tmm3
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.s b/gas/testsuite/gas/i386/x86-64-amx-complex.s
new file mode 100644
index 00000000000..56f1a00fa9e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-complex.s
@@ -0,0 +1,15 @@
+# Check 64bit AMX-COMPLEX instructions
+
+	.allow_index_reg
+	.text
+_start:
+	tcmmimfp16ps	%tmm4, %tmm5, %tmm6	 #AMX-COMPLEX
+	tcmmimfp16ps	%tmm1, %tmm2, %tmm3	 #AMX-COMPLEX
+	tcmmrlfp16ps	%tmm4, %tmm5, %tmm6	 #AMX-COMPLEX
+	tcmmrlfp16ps	%tmm1, %tmm2, %tmm3	 #AMX-COMPLEX
+
+.intel_syntax noprefix
+	tcmmimfp16ps	tmm6, tmm5, tmm4	 #AMX-COMPLEX
+	tcmmimfp16ps	tmm3, tmm2, tmm1	 #AMX-COMPLEX
+	tcmmrlfp16ps	tmm6, tmm5, tmm4	 #AMX-COMPLEX
+	tcmmrlfp16ps	tmm3, tmm2, tmm1	 #AMX-COMPLEX
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index a414e8c9b1e..d6b0fdd4ba3 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -943,6 +943,7 @@ enum
   MOD_VEX_0F385E_X86_64_P_1_W_0,
   MOD_VEX_0F385E_X86_64_P_2_W_0,
   MOD_VEX_0F385E_X86_64_P_3_W_0,
+  MOD_VEX_0F386C_X86_64_W_0,
   MOD_VEX_0F388C,
   MOD_VEX_0F388E,
   MOD_VEX_0F3A30_L_0,
@@ -1145,6 +1146,7 @@ enum
   PREFIX_VEX_0F3851_W_0,
   PREFIX_VEX_0F385C_X86_64,
   PREFIX_VEX_0F385E_X86_64,
+  PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0,
   PREFIX_VEX_0F3872,
   PREFIX_VEX_0F38B0_W_0,
   PREFIX_VEX_0F38B1_W_0,
@@ -1298,6 +1300,7 @@ enum
   X86_64_VEX_0F384B,
   X86_64_VEX_0F385C,
   X86_64_VEX_0F385E,
+  X86_64_VEX_0F386C,
   X86_64_VEX_0F38E0,
   X86_64_VEX_0F38E1,
   X86_64_VEX_0F38E2,
@@ -1398,6 +1401,7 @@ enum
   VEX_LEN_0F385E_X86_64_P_1_W_0_M_0,
   VEX_LEN_0F385E_X86_64_P_2_W_0_M_0,
   VEX_LEN_0F385E_X86_64_P_3_W_0_M_0,
+  VEX_LEN_0F386C_X86_64_W_0_M_1,
   VEX_LEN_0F38DB,
   VEX_LEN_0F38F2,
   VEX_LEN_0F38F3,
@@ -1565,6 +1569,7 @@ enum
   VEX_W_0F385E_X86_64_P_1,
   VEX_W_0F385E_X86_64_P_2,
   VEX_W_0F385E_X86_64_P_3,
+  VEX_W_0F386C_X86_64,
   VEX_W_0F3872_P_1,
   VEX_W_0F3878,
   VEX_W_0F3879,
@@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = {
     { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) },
   },
 
+  /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */
+  {
+    { "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 },
+    { Bad_Opcode },
+    { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 },
+  },
+
   /* PREFIX_VEX_0F3872 */
   {
     { Bad_Opcode },
@@ -4486,6 +4498,12 @@ static const struct dis386 x86_64_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_0F385E_X86_64) },
   },
 
+  /* X86_64_VEX_0F386C */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F386C_X86_64) },
+  },
+
   /* X86_64_VEX_0F38E0 */
   {
     { Bad_Opcode },
@@ -6461,7 +6479,7 @@ static const struct dis386 vex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_VEX_0F386C) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -7181,6 +7199,11 @@ static const struct dis386 vex_len_table[][2] = {
     { "tdpbssd", {TMM, EXtmm, VexTmm }, 0 },
   },
 
+  /* VEX_LEN_0F386C_X86_64_W_0_M_1 */
+  {
+    { PREFIX_TABLE (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0) },
+  },
+
   /* VEX_LEN_0F38DB */
   {
     { "vaesimc",	{ XM, EXx }, PREFIX_DATA },
@@ -7849,6 +7872,10 @@ static const struct dis386 vex_w_table[][2] = {
     /* VEX_W_0F385E_X86_64_P_3 */
     { MOD_TABLE (MOD_VEX_0F385E_X86_64_P_3_W_0) },
   },
+  {
+    /* VEX_W_0F386C_X86_64 */
+    { MOD_TABLE (MOD_VEX_0F386C_X86_64_W_0) },
+  },
   {
     /* VEX_W_0F3872_P_1 */
     { "%XVvcvtneps2bf16%XY", { XMM, EXx }, 0 },
@@ -8696,6 +8723,11 @@ static const struct dis386 mod_table[][2] = {
     { Bad_Opcode },
     { VEX_LEN_TABLE (VEX_LEN_0F385E_X86_64_P_3_W_0_M_0) },
   },
+  {
+    /* MOD_VEX_0F386C_X86_64_W_0 */
+    { Bad_Opcode },
+    { VEX_LEN_TABLE (VEX_LEN_0F386C_X86_64_W_0_M_1) },
+  },
   {
     /* MOD_VEX_0F388C */
     { "vpmaskmov%DQ",	{ XM, Vex, Mx }, PREFIX_DATA },
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 489ae3429c9..c2ac3c6832d 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -240,6 +240,8 @@ static const dependency isa_dependencies[] =
     "AMX_TILE" },
   { "AMX_FP16",
     "AMX_TILE" },
+  { "AMX_COMPLEX",
+    "AMX_TILE" },
   { "KL",
     "SSE2" },
   { "WIDEKL",
@@ -378,6 +380,7 @@ static bitfield cpu_flags[] =
   BITFIELD (AMX_INT8),
   BITFIELD (AMX_BF16),
   BITFIELD (AMX_FP16),
+  BITFIELD (AMX_COMPLEX),
   BITFIELD (AMX_TILE),
   BITFIELD (MOVDIRI),
   BITFIELD (MOVDIR64B),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 23d93ae6f81..b17e8341aa2 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -248,6 +248,8 @@ enum
   CpuAMX_BF16,
   /* AMX-FP16 instructions required */
   CpuAMX_FP16,
+  /* AMX-COMPLEX instructions required.  */
+  CpuAMX_COMPLEX,
   /* AMX-TILE instructions required */
   CpuAMX_TILE,
   /* GFNI instructions required */
@@ -432,6 +434,7 @@ typedef union i386_cpu_flags
       unsigned int cpuamx_int8:1;
       unsigned int cpuamx_bf16:1;
       unsigned int cpuamx_fp16:1;
+      unsigned int cpuamx_complex:1;
       unsigned int cpuamx_tile:1;
       unsigned int cpugfni:1;
       unsigned int cpuvaes:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 9cc909925f4..15d48eeb4c7 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3146,6 +3146,9 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
 ldtilecfg, 0x49/0, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
 sttilecfg, 0x6649/0, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
 
+tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+
 tdpbf16ps, 0xf35c, AMX_BF16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 tdpfp16ps, 0xf25c, AMX_FP16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 tdpbssd, 0xf25e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
-- 
2.31.1



* Re: [PATCH v2] Support Intel AMX-COMPLEX
  2023-04-06  7:17   ` [PATCH v2] " Haochen Jiang
@ 2023-04-06  9:45     ` Jan Beulich
  2023-04-07  1:59       ` Jiang, Haochen
  2023-04-07 15:49     ` H.J. Lu
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2023-04-06  9:45 UTC (permalink / raw)
  To: Haochen Jiang; +Cc: hjl.tools, binutils

On 06.04.2023 09:17, Haochen Jiang wrote:
> gas/ChangeLog:
> 
> 	* NEWS: Support Intel AMX-COMPLEX.
> 	* config/tc-i386.c: Add amx_complex.
> 	* doc/c-i386.texi: Document .amx_complex.
> 	* testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests.
> 	* testsuite/gas/i386/amx-complex-inval.l: New test.
> 	* testsuite/gas/i386/amx-complex-inval.s: Ditto.
> 	* testsuite/gas/i386/x86-64-amx-complex-bad.d: Ditto.
> 	* testsuite/gas/i386/x86-64-amx-complex-bad.s: Ditto.
> 	* testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto.
> 	* testsuite/gas/i386/x86-64-amx-complex.d: Ditto.
> 	* testsuite/gas/i386/x86-64-amx-complex.s: Ditto.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New.
> 	(PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto.
> 	(X86_64_VEX_0F386C): Ditto.
> 	(VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto.
> 	(VEX_W_0F386C_X86_64): Ditto.
> 	(mod_table): Add MOD_VEX_0F386C_X86_64_W_0.
> 	(prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0.
> 	(x86_64_table): Add X86_64_VEX_0F386C.
> 	(vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1.
> 	(vex_w_table): Add VEX_W_0F386C_X86_64.
> 	* i386-gen.c (cpu_flag_init): Add CPU_AMX_COMPLEX_FLAGS and
> 	CPU_ANY_AMX_COMPLEX_FLAGS.
> 	* i386-init.h: Regenerated.
> 	* i386-mnem.h: Ditto.
> 	* i386-opc.h (CpuAMX_COMPLEX): New.
> 	(i386_cpu_flags): Add cpuamx_complex.
> 	* i386-opc.tbl: Add AMX-COMPLEX instructions.
> 	* i386-tbl.h: Regenerated.
> ---
>  gas/NEWS                                      |    2 +
>  gas/config/tc-i386.c                          |    1 +
>  gas/doc/c-i386.texi                           |    4 +-
>  gas/testsuite/gas/i386/amx-complex-inval.l    |    3 +
>  gas/testsuite/gas/i386/amx-complex-inval.s    |    7 +
>  gas/testsuite/gas/i386/i386.exp               |    4 +
>  .../gas/i386/x86-64-amx-complex-bad.d         |   19 +
>  .../gas/i386/x86-64-amx-complex-bad.s         |   17 +
>  .../gas/i386/x86-64-amx-complex-intel.d       |   18 +
>  gas/testsuite/gas/i386/x86-64-amx-complex.d   |   15 +
>  gas/testsuite/gas/i386/x86-64-amx-complex.s   |   15 +
>  opcodes/i386-dis.c                            |   34 +-
>  opcodes/i386-gen.c                            |    3 +
>  opcodes/i386-init.h                           |  542 +-
>  opcodes/i386-mnem.h                           | 1098 +--
>  opcodes/i386-opc.h                            |    3 +
>  opcodes/i386-opc.tbl                          |    3 +
>  opcodes/i386-tbl.h                            | 7836 +++++++++--------
>  18 files changed, 4912 insertions(+), 4712 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.l
>  create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.s

Okay.

That said, like AMX-FP16 this one also omits x86-64-amx-inval.s-like
checks (that very testcase could easily be extended instead of making
yet further tiny new ones); even the original AMX work checked only
an AMX-INT8 insn there (I'm specifically after the all-operands-must-
be-distinct checking), but not AMX-BF16. Would be nice if we could
gain additions for both (all three) in a subsequent patch.

Jan
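
For illustration only, the all-operands-must-be-distinct rule mentioned
above can be modeled as below. This is a minimal sketch, not the actual
gas implementation: the helper name tile_operands_distinct and the
NUM_TILE_REGS macro are hypothetical, and tile registers are represented
simply by their numbers (0 for %tmm0 up to 7 for %tmm7).

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_TILE_REGS 8  /* Assumption: %tmm0 .. %tmm7.  */

/* Hypothetical helper, not part of gas: return true when the three tile
   register numbers of an AMX multiply insn are pairwise distinct.  */
static bool
tile_operands_distinct (unsigned int src1, unsigned int src2,
			unsigned int dst)
{
  assert (src1 < NUM_TILE_REGS && src2 < NUM_TILE_REGS
	  && dst < NUM_TILE_REGS);
  return src1 != src2 && src1 != dst && src2 != dst;
}
```

Under this model "tcmmimfp16ps %tmm4, %tmm5, %tmm6" passes, while a form
that repeats a register, such as "%tmm4, %tmm4, %tmm6", would be flagged.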


* RE: [PATCH v2] Support Intel AMX-COMPLEX
  2023-04-06  9:45     ` Jan Beulich
@ 2023-04-07  1:59       ` Jiang, Haochen
  0 siblings, 0 replies; 7+ messages in thread
From: Jiang, Haochen @ 2023-04-07  1:59 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: hjl.tools, binutils

> That said, like AMX-FP16 this one also omits x86-64-amx-inval.s-like checks
> (that very testcase could easily be extended instead of making yet further
> tiny new ones); even the original AMX work checked only an AMX-INT8 insn
> there (I'm specifically after the all-operands-must-be-distinct checking), but
> not AMX-BF16. Would be nice if we could gain additions for both (all three) in
> a subsequent patch.

I will add those invalid-operand AMX testcases, modeled on the existing ones, in the next few days.

Thx,
Haochen

> 
> Jan


* Re: [PATCH v2] Support Intel AMX-COMPLEX
  2023-04-06  7:17   ` [PATCH v2] " Haochen Jiang
  2023-04-06  9:45     ` Jan Beulich
@ 2023-04-07 15:49     ` H.J. Lu
  1 sibling, 0 replies; 7+ messages in thread
From: H.J. Lu @ 2023-04-07 15:49 UTC (permalink / raw)
  To: Haochen Jiang; +Cc: binutils, jbeulich

On Thu, Apr 6, 2023 at 12:19 AM Haochen Jiang <haochen.jiang@intel.com> wrote:
>
> Hi all,
>
> The v2 patch makes several changes:
>
> 1.
> > > @@ -1183,6 +1184,8 @@ if [gas_64_check] then {
> > >      run_dump_test "x86-64-avx-ne-convert-intel"
> > >      run_dump_test "x86-64-raoint"
> > >      run_dump_test "x86-64-raoint-intel"
> > > +    run_dump_test "x86-64-amx-complex"
> > > +    run_dump_test "x86-64-amx-complex-intel"
> > >      run_dump_test "x86-64-clzero"
> > >      run_dump_test "x86-64-mwaitx-bdver4"
> > >      run_list_test "x86-64-mwaitx-reg"
> >
> > There are constraints on operand combinations, like for tdp*, which want
> > testing here as well (both the assembler and disassembler sides) imo.
>
> Added x86-64-amx-complex-bad testcases. The order of operands 2 and 3 is
> still reversed here. We can fix that once PR30317 is resolved.
>
> 2.
> > > --- a/opcodes/i386-opc.h
> > > +++ b/opcodes/i386-opc.h
> > > @@ -248,6 +248,8 @@ enum
> > >    CpuAMX_BF16,
> > >    /* AMX-FP16 instructions required */
> > >    CpuAMX_FP16,
> > > +  /* Intel AMX-COMPLEX Instructions support required.  */
> > > + CpuAMX_COMPLEX,
> > >    /* AMX-TILE instructions required */
> > >    CpuAMX_TILE,
> > >    /* GFNI instructions required */
> >
> > In line with adjacent comments, please omit "Intel" and "support" from the
> > > comment, and don't start "instructions" with a capital letter. Plus while the
> > full stop is in line with general comment style, looking at adjacent comments
> > here it probably also wants omitting.
>
> Adjusted the comment here.
>
> 3.
> > > --- a/opcodes/i386-opc.tbl
> > > +++ b/opcodes/i386-opc.tbl
> > > > @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
> > > >
> > > >  // AMX instructions end.
> > > >
> > > > +// AMX-COMPLEX instructions.
> > > > +
> > > > +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > > > +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > > +
> > > +// AMX-COMPLEX instructions end.
> >
> > I think these would better not have their own comment-bounded group, but
> > go inside the "AMX instructions" sections (which already covers all AMX-*).
>
> Put them in alphabetical order among the AMX instructions.
>
> These changes can be seen in the patch below. Thanks for your review!
>
> Thx,
> Haochen
>
> gas/ChangeLog:
>
>         * NEWS: Support Intel AMX-COMPLEX.
>         * config/tc-i386.c: Add amx_complex.
>         * doc/c-i386.texi: Document .amx_complex.
>         * testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests.
>         * testsuite/gas/i386/amx-complex-inval.l: New test.
>         * testsuite/gas/i386/amx-complex-inval.s: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-bad.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-bad.s: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex.s: Ditto.
>
> opcodes/ChangeLog:
>
>         * i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New.
>         (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto.
>         (X86_64_VEX_0F386C): Ditto.
>         (VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto.
>         (VEX_W_0F386C_X86_64): Ditto.
>         (mod_table): Add MOD_VEX_0F386C_X86_64_W_0.
>         (prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0.
>         (x86_64_table): Add X86_64_VEX_0F386C.
>         (vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1.
>         (vex_w_table): Add VEX_W_0F386C_X86_64.
>         * i386-gen.c (cpu_flag_init): Add CPU_AMX_COMPLEX_FLAGS and
>         CPU_ANY_AMX_COMPLEX_FLAGS.
>         * i386-init.h: Regenerated.
>         * i386-mnem.h: Ditto.
>         * i386-opc.h (CpuAMX_COMPLEX): New.
>         (i386_cpu_flags): Add cpuamx_complex.
>         * i386-opc.tbl: Add AMX-COMPLEX instructions.
>         * i386-tbl.h: Regenerated.
> ---
>  gas/NEWS                                      |    2 +
>  gas/config/tc-i386.c                          |    1 +
>  gas/doc/c-i386.texi                           |    4 +-
>  gas/testsuite/gas/i386/amx-complex-inval.l    |    3 +
>  gas/testsuite/gas/i386/amx-complex-inval.s    |    7 +
>  gas/testsuite/gas/i386/i386.exp               |    4 +
>  .../gas/i386/x86-64-amx-complex-bad.d         |   19 +
>  .../gas/i386/x86-64-amx-complex-bad.s         |   17 +
>  .../gas/i386/x86-64-amx-complex-intel.d       |   18 +
>  gas/testsuite/gas/i386/x86-64-amx-complex.d   |   15 +
>  gas/testsuite/gas/i386/x86-64-amx-complex.s   |   15 +
>  opcodes/i386-dis.c                            |   34 +-
>  opcodes/i386-gen.c                            |    3 +
>  opcodes/i386-init.h                           |  542 +-
>  opcodes/i386-mnem.h                           | 1098 +--
>  opcodes/i386-opc.h                            |    3 +
>  opcodes/i386-opc.tbl                          |    3 +
>  opcodes/i386-tbl.h                            | 7836 +++++++++--------
>  18 files changed, 4912 insertions(+), 4712 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.l
>  create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.s
>
> diff --git a/gas/NEWS b/gas/NEWS
> index f95383e83af..42a2005d7c9 100644
> --- a/gas/NEWS
> +++ b/gas/NEWS
> @@ -1,5 +1,7 @@
>  -*- text -*-
>
> +* Add support for Intel AMX-COMPLEX instructions.
> +
>  * Add SME2 support to the AArch64 port.
>
>  * A new .insn directive is recognized by x86 gas.
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index ea2ed0d818e..ea5705da4af 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -1113,6 +1113,7 @@ static const arch_entry cpu_arch[] =
>    SUBARCH (amx_int8, AMX_INT8, ANY_AMX_INT8, false),
>    SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false),
>    SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false),
> +  SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false),
>    SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false),
>    SUBARCH (movdiri, MOVDIRI, MOVDIRI, false),
>    SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false),
> diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
> index 617cbd46cb7..15d060b2a33 100644
> --- a/gas/doc/c-i386.texi
> +++ b/gas/doc/c-i386.texi
> @@ -208,6 +208,7 @@ accept various extension mnemonics.  For example,
>  @code{amx_int8},
>  @code{amx_bf16},
>  @code{amx_fp16},
> +@code{amx_complex},
>  @code{amx_tile},
>  @code{vmx},
>  @code{vmfunc},
> @@ -1636,7 +1637,8 @@ supported on the CPU specified.  The choices for @var{cpu_type} are:
>  @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote}
>  @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq}
>  @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
> -@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @tab @samp{.amx_tile}
> +@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16}
> +@item @samp{.amx_complex} @tab @samp{.amx_tile}
>  @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset}
>  @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
>  @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme}
> diff --git a/gas/testsuite/gas/i386/amx-complex-inval.l b/gas/testsuite/gas/i386/amx-complex-inval.l
> new file mode 100644
> index 00000000000..df6713c5d8b
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/amx-complex-inval.l
> @@ -0,0 +1,3 @@
> +.* Assembler messages:
> +.*:6: Error: `tcmmimfp16ps' is only supported in 64-bit mode
> +.*:7: Error: `tcmmrlfp16ps' is only supported in 64-bit mode
> diff --git a/gas/testsuite/gas/i386/amx-complex-inval.s b/gas/testsuite/gas/i386/amx-complex-inval.s
> new file mode 100644
> index 00000000000..b1bbf32585b
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/amx-complex-inval.s
> @@ -0,0 +1,7 @@
> +# Check Illegal AMX-COMPLEX instructions
> +
> +       .allow_index_reg
> +       .text
> +_start:
> +       tcmmimfp16ps    %tmm1, %tmm2, %tmm3
> +       tcmmrlfp16ps    %tmm1, %tmm2, %tmm3
> diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
> index c44f071a0e2..6d326b49a39 100644
> --- a/gas/testsuite/gas/i386/i386.exp
> +++ b/gas/testsuite/gas/i386/i386.exp
> @@ -493,6 +493,7 @@ if [gas_32_check] then {
>      run_dump_test "avx-ne-convert-intel"
>      run_dump_test "raoint"
>      run_dump_test "raoint-intel"
> +    run_list_test "amx-complex-inval"
>      run_list_test "sg"
>      run_dump_test "clzero"
>      run_dump_test "invlpgb"
> @@ -1183,6 +1184,9 @@ if [gas_64_check] then {
>      run_dump_test "x86-64-avx-ne-convert-intel"
>      run_dump_test "x86-64-raoint"
>      run_dump_test "x86-64-raoint-intel"
> +    run_dump_test "x86-64-amx-complex"
> +    run_dump_test "x86-64-amx-complex-intel"
> +    run_dump_test "x86-64-amx-complex-bad"
>      run_dump_test "x86-64-clzero"
>      run_dump_test "x86-64-mwaitx-bdver4"
>      run_list_test "x86-64-mwaitx-reg"
> diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d
> new file mode 100644
> index 00000000000..646015ca9bb
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d
> @@ -0,0 +1,19 @@
> +#as:
> +#objdump: -drw
> +#name: x86_64 Illegal AMX-COMPLEX insns
> +#source: x86-64-amx-complex-bad.s
> +
> +.*: +file format .*
> +
> +
> +Disassembly of section \.text:
> +
> +0+ <\.text>:
> +[      ]*[a-f0-9]+:[   ]*c4 e2 d9 6c[  ]*\(bad\)[      ]*
> +[      ]*[a-f0-9]+:[   ]*f5[   ]*cmc.*
> +[      ]*[a-f0-9]+:[   ]*c4 e2 5d 6c[  ]*\(bad\)[      ]*
> +[      ]*[a-f0-9]+:[   ]*f5[   ]*cmc.*
> +[      ]*[a-f0-9]+:[   ]*c4 62 59 6c f5[       ]*tcmmimfp16ps %tmm4,%tmm5,\(bad\)
> +[      ]*[a-f0-9]+:[   ]*c4 c2 59 6c f5[       ]*tcmmimfp16ps %tmm4,\(bad\),%tmm6
> +[      ]*[a-f0-9]+:[   ]*c4 e2 31 6c f5[       ]*tcmmimfp16ps \(bad\),%tmm5,%tmm6
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s
> new file mode 100644
> index 00000000000..b2e55b13825
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s
> @@ -0,0 +1,17 @@
> +# Check Illegal 64bit AMX-COMPLEX instructions
> +
> +.text
> +       #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.W = 1 (illegal value).
> +       .insn VEX.128.66.0F38.W1 0x6c, %tmm5, %tmm4, %tmm6
> +
> +       #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.L = 1 (illegal value).
> +       .insn VEX.256.66.0F38.W0 0x6c, %tmm5, %tmm4, %tmm6
> +
> +       #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.R = 0 (illegal value).
> +       .insn VEX.128.66.0F38.W0 0x6c, %xmm5, %xmm4, %xmm14
> +
> +       #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.B = 0 (illegal value).
> +       .insn VEX.128.66.0F38.W0 0x6c, %xmm13, %xmm4, %xmm6
> +
> +       #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.VVVV = 0110 (illegal value).
> +       .insn VEX.128.66.0F38.W0 0x6c, %xmm5, %xmm9, %xmm6
> diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
> new file mode 100644
> index 00000000000..8f2e015104f
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
> @@ -0,0 +1,18 @@
> +#as:
> +#objdump: -dw -Mintel
> +#name: x86_64 AMX-COMPLEX insns (Intel disassembly)
> +#source: x86-64-amx-complex.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4
> +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1
> +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4
> +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1
> +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4
> +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1
> +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4
> +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1
> diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.d b/gas/testsuite/gas/i386/x86-64-amx-complex.d
> new file mode 100644
> index 00000000000..b2157960027
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-amx-complex.d
> @@ -0,0 +1,15 @@
> +#as:
> +#objdump: -dw
> +#name: x86_64 AMX-COMPLEX insns
> +#source: x86-64-amx-complex.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps %tmm4,%tmm5,%tmm6
> +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps %tmm1,%tmm2,%tmm3
> +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps %tmm4,%tmm5,%tmm6
> +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps %tmm1,%tmm2,%tmm3
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.s b/gas/testsuite/gas/i386/x86-64-amx-complex.s
> new file mode 100644
> index 00000000000..56f1a00fa9e
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-amx-complex.s
> @@ -0,0 +1,15 @@
> +# Check 64bit AMX-COMPLEX instructions
> +
> +       .allow_index_reg
> +       .text
> +_start:
> +       tcmmimfp16ps    %tmm4, %tmm5, %tmm6      #AMX-COMPLEX
> +       tcmmimfp16ps    %tmm1, %tmm2, %tmm3      #AMX-COMPLEX
> +       tcmmrlfp16ps    %tmm4, %tmm5, %tmm6      #AMX-COMPLEX
> +       tcmmrlfp16ps    %tmm1, %tmm2, %tmm3      #AMX-COMPLEX
> +
> +.intel_syntax noprefix
> +       tcmmimfp16ps    tmm6, tmm5, tmm4         #AMX-COMPLEX
> +       tcmmimfp16ps    tmm3, tmm2, tmm1         #AMX-COMPLEX
> +       tcmmrlfp16ps    tmm6, tmm5, tmm4         #AMX-COMPLEX
> +       tcmmrlfp16ps    tmm3, tmm2, tmm1         #AMX-COMPLEX
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index a414e8c9b1e..d6b0fdd4ba3 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -943,6 +943,7 @@ enum
>    MOD_VEX_0F385E_X86_64_P_1_W_0,
>    MOD_VEX_0F385E_X86_64_P_2_W_0,
>    MOD_VEX_0F385E_X86_64_P_3_W_0,
> +  MOD_VEX_0F386C_X86_64_W_0,
>    MOD_VEX_0F388C,
>    MOD_VEX_0F388E,
>    MOD_VEX_0F3A30_L_0,
> @@ -1145,6 +1146,7 @@ enum
>    PREFIX_VEX_0F3851_W_0,
>    PREFIX_VEX_0F385C_X86_64,
>    PREFIX_VEX_0F385E_X86_64,
> +  PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0,
>    PREFIX_VEX_0F3872,
>    PREFIX_VEX_0F38B0_W_0,
>    PREFIX_VEX_0F38B1_W_0,
> @@ -1298,6 +1300,7 @@ enum
>    X86_64_VEX_0F384B,
>    X86_64_VEX_0F385C,
>    X86_64_VEX_0F385E,
> +  X86_64_VEX_0F386C,
>    X86_64_VEX_0F38E0,
>    X86_64_VEX_0F38E1,
>    X86_64_VEX_0F38E2,
> @@ -1398,6 +1401,7 @@ enum
>    VEX_LEN_0F385E_X86_64_P_1_W_0_M_0,
>    VEX_LEN_0F385E_X86_64_P_2_W_0_M_0,
>    VEX_LEN_0F385E_X86_64_P_3_W_0_M_0,
> +  VEX_LEN_0F386C_X86_64_W_0_M_1,
>    VEX_LEN_0F38DB,
>    VEX_LEN_0F38F2,
>    VEX_LEN_0F38F3,
> @@ -1565,6 +1569,7 @@ enum
>    VEX_W_0F385E_X86_64_P_1,
>    VEX_W_0F385E_X86_64_P_2,
>    VEX_W_0F385E_X86_64_P_3,
> +  VEX_W_0F386C_X86_64,
>    VEX_W_0F3872_P_1,
>    VEX_W_0F3878,
>    VEX_W_0F3879,
> @@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = {
>      { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) },
>    },
>
> +  /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */
> +  {
> +    { "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> +    { Bad_Opcode },
> +    { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> +  },
> +
>    /* PREFIX_VEX_0F3872 */
>    {
>      { Bad_Opcode },
> @@ -4486,6 +4498,12 @@ static const struct dis386 x86_64_table[][2] = {
>      { PREFIX_TABLE (PREFIX_VEX_0F385E_X86_64) },
>    },
>
> +  /* X86_64_VEX_0F386C */
> +  {
> +    { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F386C_X86_64) },
> +  },
> +
>    /* X86_64_VEX_0F38E0 */
>    {
>      { Bad_Opcode },
> @@ -6461,7 +6479,7 @@ static const struct dis386 vex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> +    { X86_64_TABLE (X86_64_VEX_0F386C) },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -7181,6 +7199,11 @@ static const struct dis386 vex_len_table[][2] = {
>      { "tdpbssd", {TMM, EXtmm, VexTmm }, 0 },
>    },
>
> +  /* VEX_LEN_0F386C_X86_64_W_0_M_1 */
> +  {
> +    { PREFIX_TABLE (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0) },
> +  },
> +
>    /* VEX_LEN_0F38DB */
>    {
>      { "vaesimc",       { XM, EXx }, PREFIX_DATA },
> @@ -7849,6 +7872,10 @@ static const struct dis386 vex_w_table[][2] = {
>      /* VEX_W_0F385E_X86_64_P_3 */
>      { MOD_TABLE (MOD_VEX_0F385E_X86_64_P_3_W_0) },
>    },
> +  {
> +    /* VEX_W_0F386C_X86_64 */
> +    { MOD_TABLE (MOD_VEX_0F386C_X86_64_W_0) },
> +  },
>    {
>      /* VEX_W_0F3872_P_1 */
>      { "%XVvcvtneps2bf16%XY", { XMM, EXx }, 0 },
> @@ -8696,6 +8723,11 @@ static const struct dis386 mod_table[][2] = {
>      { Bad_Opcode },
>      { VEX_LEN_TABLE (VEX_LEN_0F385E_X86_64_P_3_W_0_M_0) },
>    },
> +  {
> +    /* MOD_VEX_0F386C_X86_64_W_0 */
> +    { Bad_Opcode },
> +    { VEX_LEN_TABLE (VEX_LEN_0F386C_X86_64_W_0_M_1) },
> +  },
>    {
>      /* MOD_VEX_0F388C */
>      { "vpmaskmov%DQ",  { XM, Vex, Mx }, PREFIX_DATA },
> diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> index 489ae3429c9..c2ac3c6832d 100644
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -240,6 +240,8 @@ static const dependency isa_dependencies[] =
>      "AMX_TILE" },
>    { "AMX_FP16",
>      "AMX_TILE" },
> +  { "AMX_COMPLEX",
> +    "AMX_TILE" },
>    { "KL",
>      "SSE2" },
>    { "WIDEKL",
> @@ -378,6 +380,7 @@ static bitfield cpu_flags[] =
>    BITFIELD (AMX_INT8),
>    BITFIELD (AMX_BF16),
>    BITFIELD (AMX_FP16),
> +  BITFIELD (AMX_COMPLEX),
>    BITFIELD (AMX_TILE),
>    BITFIELD (MOVDIRI),
>    BITFIELD (MOVDIR64B),
> diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> index 23d93ae6f81..b17e8341aa2 100644
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -248,6 +248,8 @@ enum
>    CpuAMX_BF16,
>    /* AMX-FP16 instructions required */
>    CpuAMX_FP16,
> +  /* AMX-COMPLEX instructions required.  */
> +  CpuAMX_COMPLEX,
>    /* AMX-TILE instructions required */
>    CpuAMX_TILE,
>    /* GFNI instructions required */
> @@ -432,6 +434,7 @@ typedef union i386_cpu_flags
>        unsigned int cpuamx_int8:1;
>        unsigned int cpuamx_bf16:1;
>        unsigned int cpuamx_fp16:1;
> +      unsigned int cpuamx_complex:1;
>        unsigned int cpuamx_tile:1;
>        unsigned int cpugfni:1;
>        unsigned int cpuvaes:1;
> diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> index 9cc909925f4..15d48eeb4c7 100644
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3146,6 +3146,9 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
>  ldtilecfg, 0x49/0, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
>  sttilecfg, 0x6649/0, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
>
> +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> +
>  tdpbf16ps, 0xf35c, AMX_BF16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
>  tdpfp16ps, 0xf25c, AMX_FP16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
>  tdpbssd, 0xf25e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> --
> 2.31.1
>

OK.

Thanks.

-- 
H.J.
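
As a reading aid for the byte sequences expected by
x86-64-amx-complex-bad.d, the sketch below splits a 3-byte (0xc4) VEX
encoding into its fields and applies the constraints that test
exercises (VEX.W, VEX.L, and tile register numbers above 7). It is an
illustration under stated assumptions, not the disassembler's code: the
names vex3_fields, decode_vex3 and tcmm_encoding_ok are hypothetical,
and the input is assumed to be exactly five bytes (prefix, two payload
bytes, opcode, ModRM).

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 3-byte VEX prefix plus the ModRM byte.  */
struct vex3_fields
{
  unsigned int r, x, b;      /* Extension bits, already un-inverted.  */
  unsigned int map;          /* mmmmm: 2 selects the 0F38 opcode map.  */
  unsigned int w, l;         /* VEX.W and VEX.L.  */
  unsigned int pp;           /* 0 = none, 1 = 66, 2 = F3, 3 = F2.  */
  unsigned int vvvv;         /* Register number, already un-inverted.  */
  unsigned int mod, reg, rm; /* reg and rm include the R/B extension.  */
};

/* insn layout: c4, RXBmmmmm, WvvvvLpp, opcode, ModRM.  */
static struct vex3_fields
decode_vex3 (const uint8_t insn[5])
{
  struct vex3_fields f;

  assert (insn[0] == 0xc4);
  /* R, X, B and vvvv are stored inverted in the encoding.  */
  f.r = ((insn[1] >> 7) & 1) ^ 1;
  f.x = ((insn[1] >> 6) & 1) ^ 1;
  f.b = ((insn[1] >> 5) & 1) ^ 1;
  f.map = insn[1] & 0x1f;
  f.w = (insn[2] >> 7) & 1;
  f.vvvv = (~(insn[2] >> 3)) & 0xf;
  f.l = (insn[2] >> 2) & 1;
  f.pp = insn[2] & 3;
  f.mod = (insn[4] >> 6) & 3;
  f.reg = ((insn[4] >> 3) & 7) | (f.r << 3);
  f.rm = (insn[4] & 7) | (f.b << 3);
  return f;
}

/* Hypothetical check: W and L must be 0, the ModRM must be in register
   form, and all three register numbers must name %tmm0 .. %tmm7.  */
static int
tcmm_encoding_ok (const struct vex3_fields *f)
{
  return f->map == 2 && f->w == 0 && f->l == 0 && f->mod == 3
	 && f->reg < 8 && f->rm < 8 && f->vvvv < 8;
}
```

Applied to "c4 e2 59 6c f5" this yields pp = 1 (66), vvvv = 4, rm = 5 and
reg = 6, matching "tcmmimfp16ps %tmm4,%tmm5,%tmm6". The five variants in
the -bad test each break exactly one field: 0xd9 sets W, 0x5d sets L,
0x62 clears the stored R bit (reg becomes 14), 0xc2 clears the stored B
bit (rm becomes 13), and 0x31 stores vvvv as 0110 (register 9).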


end of thread, other threads:[~2023-04-07 15:50 UTC | newest]

Thread overview: 7+ messages
2023-04-03  7:11 [PATCH] Support Intel AMX-COMPLEX Haochen Jiang
2023-04-04  7:35 ` Jan Beulich
2023-04-04  8:41   ` Jiang, Haochen
2023-04-06  7:17   ` [PATCH v2] " Haochen Jiang
2023-04-06  9:45     ` Jan Beulich
2023-04-07  1:59       ` Jiang, Haochen
2023-04-07 15:49     ` H.J. Lu
