* [PATCH] Support Intel AMX-COMPLEX
From: Haochen Jiang @ 2023-04-03 7:11 UTC
To: binutils; +Cc: jbeulich, hjl.tools

Hi all,

This patch adds support for the Intel AMX-COMPLEX instructions. It is
based on the newly released Intel Architecture Instruction Set
Extensions and Future Features document, available at:

https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Tested on x86_64-pc-linux-gnu. Ok for trunk?

BRs,
Haochen

gas/ChangeLog:

	* NEWS: Support Intel AMX-COMPLEX.
	* config/tc-i386.c: Add amx_complex.
	* doc/c-i386.texi: Document .amx_complex.
	* testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests.
	* testsuite/gas/i386/amx-complex-inval.l: New test.
	* testsuite/gas/i386/amx-complex-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New.
	(PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto.
	(X86_64_VEX_0F386C): Ditto.
	(VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto.
	(VEX_W_0F386C_X86_64): Ditto.
	(mod_table): Add MOD_VEX_0F386C_X86_64_W_0.
	(prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0.
	(x86_64_table): Add X86_64_VEX_0F386C.
	(vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1.
	(vex_w_table): Add VEX_W_0F386C_X86_64.
	* i386-gen.c (cpu_flag_init): Add CPU_AMX_COMPLEX_FLAGS and
	CPU_ANY_AMX_COMPLEX_FLAGS.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (CpuAMX_COMPLEX): New.
	(i386_cpu_flags): Add cpuamx_complex.
	* i386-opc.tbl: Add AMX-COMPLEX instructions.
	* i386-tbl.h: Regenerated.
--- gas/NEWS | 2 + gas/config/tc-i386.c | 1 + gas/doc/c-i386.texi | 4 +- gas/testsuite/gas/i386/amx-complex-inval.l | 3 + gas/testsuite/gas/i386/amx-complex-inval.s | 7 + gas/testsuite/gas/i386/i386.exp | 3 + .../gas/i386/x86-64-amx-complex-intel.d | 18 + gas/testsuite/gas/i386/x86-64-amx-complex.d | 15 + gas/testsuite/gas/i386/x86-64-amx-complex.s | 15 + opcodes/i386-dis.c | 34 +- opcodes/i386-gen.c | 3 + opcodes/i386-init.h | 542 +- opcodes/i386-mnem.h | 1098 +-- opcodes/i386-opc.h | 3 + opcodes/i386-opc.tbl | 7 + opcodes/i386-tbl.h | 7834 +++++++++-------- 16 files changed, 4878 insertions(+), 4711 deletions(-) create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.l create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.s create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-intel.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.s diff --git a/gas/NEWS b/gas/NEWS index f95383e83af..42a2005d7c9 100644 --- a/gas/NEWS +++ b/gas/NEWS @@ -1,5 +1,7 @@ -*- text -*- +* Add support for Intel AMX-COMPLEX instructions. + * Add SME2 support to the AArch64 port. * A new .insn directive is recognized by x86 gas. diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index ea2ed0d818e..ea5705da4af 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1113,6 +1113,7 @@ static const arch_entry cpu_arch[] = SUBARCH (amx_int8, AMX_INT8, ANY_AMX_INT8, false), SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false), SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false), + SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false), SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false), SUBARCH (movdiri, MOVDIRI, MOVDIRI, false), SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false), diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index 617cbd46cb7..15d060b2a33 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -208,6 +208,7 @@ accept various extension mnemonics. 
For example, @code{amx_int8}, @code{amx_bf16}, @code{amx_fp16}, +@code{amx_complex}, @code{amx_tile}, @code{vmx}, @code{vmfunc}, @@ -1636,7 +1637,8 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote} @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq} @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} -@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @tab @samp{.amx_tile} +@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} +@item @samp{.amx_complex} @tab @samp{.amx_tile} @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} diff --git a/gas/testsuite/gas/i386/amx-complex-inval.l b/gas/testsuite/gas/i386/amx-complex-inval.l new file mode 100644 index 00000000000..df6713c5d8b --- /dev/null +++ b/gas/testsuite/gas/i386/amx-complex-inval.l @@ -0,0 +1,3 @@ +.* Assembler messages: +.*:6: Error: `tcmmimfp16ps' is only supported in 64-bit mode +.*:7: Error: `tcmmrlfp16ps' is only supported in 64-bit mode diff --git a/gas/testsuite/gas/i386/amx-complex-inval.s b/gas/testsuite/gas/i386/amx-complex-inval.s new file mode 100644 index 00000000000..b1bbf32585b --- /dev/null +++ b/gas/testsuite/gas/i386/amx-complex-inval.s @@ -0,0 +1,7 @@ +# Check Illegal AMX-COMPLEX instructions + + .allow_index_reg + .text +_start: + tcmmimfp16ps %tmm1, %tmm2, %tmm3 + tcmmrlfp16ps %tmm1, %tmm2, %tmm3 diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index c44f071a0e2..c098ce2185a 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -493,6 +493,7 @@ if [gas_32_check] then { run_dump_test "avx-ne-convert-intel" run_dump_test "raoint" run_dump_test "raoint-intel" + run_list_test 
"amx-complex-inval" run_list_test "sg" run_dump_test "clzero" run_dump_test "invlpgb" @@ -1183,6 +1184,8 @@ if [gas_64_check] then { run_dump_test "x86-64-avx-ne-convert-intel" run_dump_test "x86-64-raoint" run_dump_test "x86-64-raoint-intel" + run_dump_test "x86-64-amx-complex" + run_dump_test "x86-64-amx-complex-intel" run_dump_test "x86-64-clzero" run_dump_test "x86-64-mwaitx-bdver4" run_list_test "x86-64-mwaitx-reg" diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d new file mode 100644 index 00000000000..8f2e015104f --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d @@ -0,0 +1,18 @@ +#as: +#objdump: -dw -Mintel +#name: x86_64 AMX-COMPLEX insns (Intel disassembly) +#source: x86-64-amx-complex.s + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1 diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.d b/gas/testsuite/gas/i386/x86-64-amx-complex.d new file mode 100644 index 00000000000..b2157960027 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-complex.d @@ -0,0 +1,15 @@ +#as: +#objdump: -dw +#name: x86_64 AMX-COMPLEX insns +#source: x86-64-amx-complex.s + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps 
%tmm1,%tmm2,%tmm3 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.s b/gas/testsuite/gas/i386/x86-64-amx-complex.s new file mode 100644 index 00000000000..56f1a00fa9e --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-complex.s @@ -0,0 +1,15 @@ +# Check 64bit AMX-COMPLEX instructions + + .allow_index_reg + .text +_start: + tcmmimfp16ps %tmm4, %tmm5, %tmm6 #AMX-COMPLEX + tcmmimfp16ps %tmm1, %tmm2, %tmm3 #AMX-COMPLEX + tcmmrlfp16ps %tmm4, %tmm5, %tmm6 #AMX-COMPLEX + tcmmrlfp16ps %tmm1, %tmm2, %tmm3 #AMX-COMPLEX + +.intel_syntax noprefix + tcmmimfp16ps tmm6, tmm5, tmm4 #AMX-COMPLEX + tcmmimfp16ps tmm3, tmm2, tmm1 #AMX-COMPLEX + tcmmrlfp16ps tmm6, tmm5, tmm4 #AMX-COMPLEX + tcmmrlfp16ps tmm3, tmm2, tmm1 #AMX-COMPLEX diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index a414e8c9b1e..d6b0fdd4ba3 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -943,6 +943,7 @@ enum MOD_VEX_0F385E_X86_64_P_1_W_0, MOD_VEX_0F385E_X86_64_P_2_W_0, MOD_VEX_0F385E_X86_64_P_3_W_0, + MOD_VEX_0F386C_X86_64_W_0, MOD_VEX_0F388C, MOD_VEX_0F388E, MOD_VEX_0F3A30_L_0, @@ -1145,6 +1146,7 @@ enum PREFIX_VEX_0F3851_W_0, PREFIX_VEX_0F385C_X86_64, PREFIX_VEX_0F385E_X86_64, + PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0, PREFIX_VEX_0F3872, PREFIX_VEX_0F38B0_W_0, PREFIX_VEX_0F38B1_W_0, @@ -1298,6 +1300,7 @@ enum X86_64_VEX_0F384B, X86_64_VEX_0F385C, X86_64_VEX_0F385E, + X86_64_VEX_0F386C, X86_64_VEX_0F38E0, X86_64_VEX_0F38E1, X86_64_VEX_0F38E2, @@ -1398,6 +1401,7 @@ enum VEX_LEN_0F385E_X86_64_P_1_W_0_M_0, VEX_LEN_0F385E_X86_64_P_2_W_0_M_0, VEX_LEN_0F385E_X86_64_P_3_W_0_M_0, + VEX_LEN_0F386C_X86_64_W_0_M_1, VEX_LEN_0F38DB, VEX_LEN_0F38F2, VEX_LEN_0F38F3, @@ -1565,6 +1569,7 @@ enum VEX_W_0F385E_X86_64_P_1, VEX_W_0F385E_X86_64_P_2, VEX_W_0F385E_X86_64_P_3, + VEX_W_0F386C_X86_64, VEX_W_0F3872_P_1, VEX_W_0F3878, VEX_W_0F3879, @@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = { { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) }, }, + /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */ + { + 
{ "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 }, + { Bad_Opcode }, + { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 }, + }, + /* PREFIX_VEX_0F3872 */ { { Bad_Opcode }, @@ -4486,6 +4498,12 @@ static const struct dis386 x86_64_table[][2] = { { PREFIX_TABLE (PREFIX_VEX_0F385E_X86_64) }, }, + /* X86_64_VEX_0F386C */ + { + { Bad_Opcode }, + { VEX_W_TABLE (VEX_W_0F386C_X86_64) }, + }, + /* X86_64_VEX_0F38E0 */ { { Bad_Opcode }, @@ -6461,7 +6479,7 @@ static const struct dis386 vex_table[][256] = { { Bad_Opcode }, { Bad_Opcode }, { Bad_Opcode }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_VEX_0F386C) }, { Bad_Opcode }, { Bad_Opcode }, { Bad_Opcode }, @@ -7181,6 +7199,11 @@ static const struct dis386 vex_len_table[][2] = { { "tdpbssd", {TMM, EXtmm, VexTmm }, 0 }, }, + /* VEX_LEN_0F386C_X86_64_W_0_M_1 */ + { + { PREFIX_TABLE (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0) }, + }, + /* VEX_LEN_0F38DB */ { { "vaesimc", { XM, EXx }, PREFIX_DATA }, @@ -7849,6 +7872,10 @@ static const struct dis386 vex_w_table[][2] = { /* VEX_W_0F385E_X86_64_P_3 */ { MOD_TABLE (MOD_VEX_0F385E_X86_64_P_3_W_0) }, }, + { + /* VEX_W_0F386C_X86_64 */ + { MOD_TABLE (MOD_VEX_0F386C_X86_64_W_0) }, + }, { /* VEX_W_0F3872_P_1 */ { "%XVvcvtneps2bf16%XY", { XMM, EXx }, 0 }, @@ -8696,6 +8723,11 @@ static const struct dis386 mod_table[][2] = { { Bad_Opcode }, { VEX_LEN_TABLE (VEX_LEN_0F385E_X86_64_P_3_W_0_M_0) }, }, + { + /* MOD_VEX_0F386C_X86_64_W_0 */ + { Bad_Opcode }, + { VEX_LEN_TABLE (VEX_LEN_0F386C_X86_64_W_0_M_1) }, + }, { /* MOD_VEX_0F388C */ { "vpmaskmov%DQ", { XM, Vex, Mx }, PREFIX_DATA }, diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c index 489ae3429c9..c2ac3c6832d 100644 --- a/opcodes/i386-gen.c +++ b/opcodes/i386-gen.c @@ -240,6 +240,8 @@ static const dependency isa_dependencies[] = "AMX_TILE" }, { "AMX_FP16", "AMX_TILE" }, + { "AMX_COMPLEX", + "AMX_TILE" }, { "KL", "SSE2" }, { "WIDEKL", @@ -378,6 +380,7 @@ static bitfield cpu_flags[] = BITFIELD (AMX_INT8), BITFIELD (AMX_BF16), BITFIELD (AMX_FP16), + 
BITFIELD (AMX_COMPLEX), BITFIELD (AMX_TILE), BITFIELD (MOVDIRI), BITFIELD (MOVDIR64B), diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h index 23d93ae6f81..46a36c3e965 100644 --- a/opcodes/i386-opc.h +++ b/opcodes/i386-opc.h @@ -248,6 +248,8 @@ enum CpuAMX_BF16, /* AMX-FP16 instructions required */ CpuAMX_FP16, + /* Intel AMX-COMPLEX Instructions support required. */ + CpuAMX_COMPLEX, /* AMX-TILE instructions required */ CpuAMX_TILE, /* GFNI instructions required */ @@ -432,6 +434,7 @@ typedef union i386_cpu_flags unsigned int cpuamx_int8:1; unsigned int cpuamx_bf16:1; unsigned int cpuamx_fp16:1; + unsigned int cpuamx_complex:1; unsigned int cpuamx_tile:1; unsigned int cpugfni:1; unsigned int cpuvaes:1; diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index 9cc909925f4..240e1783d0d 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM } // AMX instructions end. +// AMX-COMPLEX instructions. + +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } + +// AMX-COMPLEX instructions end. + // KEYLOCKER instructions. loadiwkey, 0xf30f38dc, KL, Load|Modrm|NoSuf, { RegXMM, RegXMM } -- 2.31.1 ^ permalink raw reply [flat|nested] 7+ messages in thread
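As a reading aid for the expected byte sequences in the dump tests above, here is a minimal sketch of how the register-only form of these two instructions could be encoded by hand. It is not binutils code: the function name `encode_vex_0f38_6c` and its parameter layout are hypothetical, and the sketch assumes only the standard three-byte VEX layout (inverted R/X/B and vvvv fields, map 0F38, W = 0, L = 0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of binutils: encode the register-only
   form VEX.128.{NP,66}.0F38.W0 6C /r into five bytes.
   pp   - VEX "pp" field (0 = no prefix, 1 = 0x66)
   reg  - tile register placed in ModRM.reg  (0..7)
   rm   - tile register placed in ModRM.rm   (0..7)
   vvvv - tile register placed in VEX.vvvv   (0..7)  */
static void
encode_vex_0f38_6c (uint8_t out[5], unsigned pp, unsigned reg,
                    unsigned rm, unsigned vvvv)
{
  assert (pp <= 3 && reg <= 7 && rm <= 7 && vvvv <= 7);

  out[0] = 0xc4;                 /* three-byte VEX escape */
  out[1] = 0xe0 | 0x02;          /* inverted R, X, B all set; map 0F38 */
  /* W = 0 (bit 7), inverted vvvv (bits 6..3), L = 0 (bit 2), pp (bits 1..0) */
  out[2] = (uint8_t) (((~vvvv & 0xfu) << 3) | pp);
  out[3] = 0x6c;                 /* opcode */
  out[4] = (uint8_t) (0xc0u | (reg << 3) | rm);   /* ModRM, mod = 3 */
}
```

Under these assumptions, the AT&T operand order `%tmm4, %tmm5, %tmm6` corresponds to vvvv = 4, rm = 5, reg = 6, which gives the bytes `c4 e2 59 6c f5` listed for tcmmimfp16ps in x86-64-amx-complex.d.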
* Re: [PATCH] Support Intel AMX-COMPLEX
From: Jan Beulich @ 2023-04-04 7:35 UTC
To: Haochen Jiang; +Cc: hjl.tools, binutils

On 03.04.2023 09:11, Haochen Jiang wrote:
> @@ -1183,6 +1184,8 @@ if [gas_64_check] then {
>      run_dump_test "x86-64-avx-ne-convert-intel"
>      run_dump_test "x86-64-raoint"
>      run_dump_test "x86-64-raoint-intel"
> +    run_dump_test "x86-64-amx-complex"
> +    run_dump_test "x86-64-amx-complex-intel"
>      run_dump_test "x86-64-clzero"
>      run_dump_test "x86-64-mwaitx-bdver4"
>      run_list_test "x86-64-mwaitx-reg"

There are constraints on operand combinations, like for tdp*, which want
testing here as well (both the assembler and disassembler sides) imo.

> @@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = {
>      { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) },
>    },
>
> +  /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */
> +  {
> +    { "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> +    { Bad_Opcode },
> +    { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> +  },

You could avoid going through vex_w_table[] by making use of %XS here.
(I guess I'll make a similar change for tdp*16ps, but - to avoid causing
conflicts - perhaps only once yours went in.)

> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -248,6 +248,8 @@ enum
>    CpuAMX_BF16,
>    /* AMX-FP16 instructions required */
>    CpuAMX_FP16,
> +  /* Intel AMX-COMPLEX Instructions support required.  */
> +  CpuAMX_COMPLEX,
>    /* AMX-TILE instructions required */
>    CpuAMX_TILE,
>    /* GFNI instructions required */

In line with adjacent comments, please omit "Intel" and "support" from the
comment, and don't start "instructions" with a capital letter. Plus while
the full stop is in line with general comment style, looking at adjacent
comments here it probably also wants omitting.

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
>
>  // AMX instructions end.
>
> +// AMX-COMPLEX instructions.
> +
> +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> +
> +// AMX-COMPLEX instructions end.

I think these would better not have their own comment-bounded group, but
go inside the "AMX instructions" section (which already covers all AMX-*).

Jan
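The chain of disassembler tables under discussion (x86_64_table, vex_w_table, mod_table, vex_len_table, prefix_table) can be hard to follow from the diff alone. The following is a simplified model of that chain for VEX map 0F38, opcode 0x6c, written as plain conditionals. It is a sketch, not the real table walker: the function name `lookup_vex_0f38_6c` is hypothetical, and it assumes that table slots the patch leaves unset behave like `Bad_Opcode`.

```c
#include <assert.h>
#include <string.h>   /* strcmp, for callers comparing the result */

/* Hypothetical model of the lookup chain added by the patch.
   mode64 - nonzero when disassembling 64-bit code
   w, l   - VEX.W and VEX.L bits
   mod    - ModRM.mod field (0..3)
   pp     - VEX.pp field (0 = none, 1 = 66, 2 = F3, 3 = F2)
   Returns the mnemonic, or "(bad)" when any step rejects the encoding.  */
static const char *
lookup_vex_0f38_6c (int mode64, unsigned w, unsigned mod,
                    unsigned l, unsigned pp)
{
  /* prefix_table column order: no prefix, F3, 66, F2.  The patch fills
     the first three columns; the fourth is assumed to be bad.  */
  static const char *const by_column[4] =
    { "tcmmrlfp16ps", "(bad)", "tcmmimfp16ps", "(bad)" };
  /* Map VEX.pp to the prefix_table column.  */
  static const unsigned pp_to_column[4] = { 0, 2, 1, 3 };

  assert (pp <= 3);

  if (!mode64)      /* x86_64_table: only the 64-bit slot is populated */
    return "(bad)";
  if (w != 0)       /* vex_w_table: only W0 is populated */
    return "(bad)";
  if (mod != 3)     /* mod_table: register form only */
    return "(bad)";
  if (l != 0)       /* vex_len_table: only L0 is populated */
    return "(bad)";
  return by_column[pp_to_column[pp]];
}
```

In this model the W check is a separate step, which is the step the review suggests could be folded away.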
* RE: [PATCH] Support Intel AMX-COMPLEX
From: Jiang, Haochen @ 2023-04-04 8:41 UTC
To: Beulich, Jan; +Cc: hjl.tools, binutils

> On 03.04.2023 09:11, Haochen Jiang wrote:
> > @@ -1183,6 +1184,8 @@ if [gas_64_check] then {
> >      run_dump_test "x86-64-avx-ne-convert-intel"
> >      run_dump_test "x86-64-raoint"
> >      run_dump_test "x86-64-raoint-intel"
> > +    run_dump_test "x86-64-amx-complex"
> > +    run_dump_test "x86-64-amx-complex-intel"
> >      run_dump_test "x86-64-clzero"
> >      run_dump_test "x86-64-mwaitx-bdver4"
> >      run_list_test "x86-64-mwaitx-reg"
>
> There are constraints on operand combinations, like for tdp*, which want
> testing here as well (both the assembler and disassembler sides) imo.

I just saw those testcases; I will add them in the v2 patch, just as was
done for tdp*. Thanks for the reminder.

>
> > @@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = {
> >      { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) },
> >    },
> >
> > +  /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */
> > +  {
> > +    { "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> > +    { Bad_Opcode },
> > +    { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> > +  },
>
> You could avoid going through vex_w_table[] by making use of %XS here.
> (I guess I'll make a similar change for tdp*16ps, but - to avoid causing
> conflicts - perhaps only once yours went in.)

I will leave this to you; using %XS does eliminate the W table pass.

>
> > --- a/opcodes/i386-opc.h
> > +++ b/opcodes/i386-opc.h
> > @@ -248,6 +248,8 @@ enum
> >    CpuAMX_BF16,
> >    /* AMX-FP16 instructions required */
> >    CpuAMX_FP16,
> > +  /* Intel AMX-COMPLEX Instructions support required.  */
> > +  CpuAMX_COMPLEX,
> >    /* AMX-TILE instructions required */
> >    CpuAMX_TILE,
> >    /* GFNI instructions required */
>
> In line with adjacent comments, please omit "Intel" and "support" from the
> comment, and don't start "instructions" with a capital letter. Plus while
> the full stop is in line with general comment style, looking at adjacent
> comments here it probably also wants omitting.

Ok, I will do that in the v2 patch.

>
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
> >
> >  // AMX instructions end.
> >
> > +// AMX-COMPLEX instructions.
> > +
> > +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > +
> > +// AMX-COMPLEX instructions end.
>
> I think these would better not have their own comment-bounded group, but
> go inside the "AMX instructions" sections (which already covers all AMX-*).

I will put them in alphabetical order in the v2 patch, which means before
tdp*.

I really appreciate your review and will send the v2 patch soon.

Haochen

>
> Jan
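One possible reading of the disassembler-side checks requested above is that every VEX field of the five-byte encoding has to be in range for tile registers. The sketch below decodes those fields and reports whether an encoding passes. It is not binutils code: `struct vex3_fields`, `decode_vex3` and `amx_complex_encoding_ok` are hypothetical names, and the rule that only tmm0..tmm7 exist is the only architectural assumption. The byte sequences used to exercise it are the ones that appear in the x86-64-amx-complex-bad test of the v2 patch later in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the fields of a three-byte VEX prefix
   followed by an opcode byte and a ModRM byte.  */
struct vex3_fields
{
  unsigned map, w, vvvv, l, pp, mod, reg, rm;
};

/* Split c4 <b1> <b2> <opcode> <modrm> into fields.  The R, B and vvvv
   fields are stored inverted in the prefix, so they are flipped here.  */
static struct vex3_fields
decode_vex3 (const uint8_t b[5])
{
  struct vex3_fields f;
  unsigned inv1 = b[1] ^ 0xffu;   /* un-invert R (bit 7) and B (bit 5) */
  unsigned inv2 = b[2] ^ 0xffu;   /* un-invert vvvv (bits 6..3) */

  assert (b[0] == 0xc4);
  f.map = b[1] & 0x1fu;
  f.w = b[2] >> 7;
  f.vvvv = (inv2 >> 3) & 0xfu;
  f.l = (b[2] >> 2) & 1u;
  f.pp = b[2] & 3u;
  f.mod = b[4] >> 6;
  f.reg = ((b[4] >> 3) & 7u) | (((inv1 >> 7) & 1u) << 3);
  f.rm = (b[4] & 7u) | (((inv1 >> 5) & 1u) << 3);
  return f;
}

/* Nonzero when the fields describe a register-only map-0F38 encoding
   with W = 0, L = 0 and all three register numbers below 8.  */
static int
amx_complex_encoding_ok (const struct vex3_fields *f)
{
  return f->map == 2 && f->w == 0 && f->l == 0 && f->mod == 3
         && f->reg < 8 && f->rm < 8 && f->vvvv < 8
         && (f->pp == 0 || f->pp == 1);
}
```

The decoder keeps field extraction separate from the validity rule so that each rejected case (W, L, R, B, vvvv) can be inspected individually.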
* [PATCH v2] Support Intel AMX-COMPLEX
From: Haochen Jiang @ 2023-04-06 7:17 UTC
To: binutils, jbeulich; +Cc: hjl.tools

Hi all,

The v2 patch makes several changes:

1.

> > @@ -1183,6 +1184,8 @@ if [gas_64_check] then {
> >      run_dump_test "x86-64-avx-ne-convert-intel"
> >      run_dump_test "x86-64-raoint"
> >      run_dump_test "x86-64-raoint-intel"
> > +    run_dump_test "x86-64-amx-complex"
> > +    run_dump_test "x86-64-amx-complex-intel"
> >      run_dump_test "x86-64-clzero"
> >      run_dump_test "x86-64-mwaitx-bdver4"
> >      run_list_test "x86-64-mwaitx-reg"
>
> There are constraints on operand combinations, like for tdp*, which want
> testing here as well (both the assembler and disassembler sides) imo.

Added the x86-64-amx-complex-bad testcases. The order of operands 2 and 3
remains reversed here; we can fix that once PR30317 is resolved.

2.

> > --- a/opcodes/i386-opc.h
> > +++ b/opcodes/i386-opc.h
> > @@ -248,6 +248,8 @@ enum
> >    CpuAMX_BF16,
> >    /* AMX-FP16 instructions required */
> >    CpuAMX_FP16,
> > +  /* Intel AMX-COMPLEX Instructions support required.  */
> > +  CpuAMX_COMPLEX,
> >    /* AMX-TILE instructions required */
> >    CpuAMX_TILE,
> >    /* GFNI instructions required */
>
> In line with adjacent comments, please omit "Intel" and "support" from the
> comment, and don't start "instructions" with a capital letter. Plus while
> the full stop is in line with general comment style, looking at adjacent
> comments here it probably also wants omitting.

Adjusted the comment here.

3.

> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
> >
> >  // AMX instructions end.
> >
> > +// AMX-COMPLEX instructions.
> > +
> > +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
> > +
> > +// AMX-COMPLEX instructions end.
>
> I think these would better not have their own comment-bounded group, but
> go inside the "AMX instructions" sections (which already covers all AMX-*).

Put them in alphabetical order among the AMX instructions.

These changes can be seen in the patch below. Thanks for your review!

Thx,
Haochen

gas/ChangeLog:

	* NEWS: Support Intel AMX-COMPLEX.
	* config/tc-i386.c: Add amx_complex.
	* doc/c-i386.texi: Document .amx_complex.
	* testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests.
	* testsuite/gas/i386/amx-complex-inval.l: New test.
	* testsuite/gas/i386/amx-complex-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex-bad.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex-bad.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-complex.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New.
	(PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto.
	(X86_64_VEX_0F386C): Ditto.
	(VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto.
	(VEX_W_0F386C_X86_64): Ditto.
	(mod_table): Add MOD_VEX_0F386C_X86_64_W_0.
	(prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0.
	(x86_64_table): Add X86_64_VEX_0F386C.
	(vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1.
	(vex_w_table): Add VEX_W_0F386C_X86_64.
	* i386-gen.c (cpu_flag_init): Add CPU_AMX_COMPLEX_FLAGS and
	CPU_ANY_AMX_COMPLEX_FLAGS.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (CpuAMX_COMPLEX): New.
	(i386_cpu_flags): Add cpuamx_complex.
	* i386-opc.tbl: Add AMX-COMPLEX instructions.
	* i386-tbl.h: Regenerated.
--- gas/NEWS | 2 + gas/config/tc-i386.c | 1 + gas/doc/c-i386.texi | 4 +- gas/testsuite/gas/i386/amx-complex-inval.l | 3 + gas/testsuite/gas/i386/amx-complex-inval.s | 7 + gas/testsuite/gas/i386/i386.exp | 4 + .../gas/i386/x86-64-amx-complex-bad.d | 19 + .../gas/i386/x86-64-amx-complex-bad.s | 17 + .../gas/i386/x86-64-amx-complex-intel.d | 18 + gas/testsuite/gas/i386/x86-64-amx-complex.d | 15 + gas/testsuite/gas/i386/x86-64-amx-complex.s | 15 + opcodes/i386-dis.c | 34 +- opcodes/i386-gen.c | 3 + opcodes/i386-init.h | 542 +- opcodes/i386-mnem.h | 1098 +-- opcodes/i386-opc.h | 3 + opcodes/i386-opc.tbl | 3 + opcodes/i386-tbl.h | 7836 +++++++++-------- 18 files changed, 4912 insertions(+), 4712 deletions(-) create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.l create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.s create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.s create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-intel.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.s diff --git a/gas/NEWS b/gas/NEWS index f95383e83af..42a2005d7c9 100644 --- a/gas/NEWS +++ b/gas/NEWS @@ -1,5 +1,7 @@ -*- text -*- +* Add support for Intel AMX-COMPLEX instructions. + * Add SME2 support to the AArch64 port. * A new .insn directive is recognized by x86 gas. 
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index ea2ed0d818e..ea5705da4af 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1113,6 +1113,7 @@ static const arch_entry cpu_arch[] = SUBARCH (amx_int8, AMX_INT8, ANY_AMX_INT8, false), SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false), SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false), + SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false), SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false), SUBARCH (movdiri, MOVDIRI, MOVDIRI, false), SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false), diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index 617cbd46cb7..15d060b2a33 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -208,6 +208,7 @@ accept various extension mnemonics. For example, @code{amx_int8}, @code{amx_bf16}, @code{amx_fp16}, +@code{amx_complex}, @code{amx_tile}, @code{vmx}, @code{vmfunc}, @@ -1636,7 +1637,8 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote} @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq} @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} -@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @tab @samp{.amx_tile} +@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} +@item @samp{.amx_complex} @tab @samp{.amx_tile} @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} diff --git a/gas/testsuite/gas/i386/amx-complex-inval.l b/gas/testsuite/gas/i386/amx-complex-inval.l new file mode 100644 index 00000000000..df6713c5d8b --- /dev/null +++ b/gas/testsuite/gas/i386/amx-complex-inval.l @@ -0,0 +1,3 @@ +.* Assembler messages: +.*:6: Error: `tcmmimfp16ps' is only supported in 64-bit mode +.*:7: Error: 
`tcmmrlfp16ps' is only supported in 64-bit mode diff --git a/gas/testsuite/gas/i386/amx-complex-inval.s b/gas/testsuite/gas/i386/amx-complex-inval.s new file mode 100644 index 00000000000..b1bbf32585b --- /dev/null +++ b/gas/testsuite/gas/i386/amx-complex-inval.s @@ -0,0 +1,7 @@ +# Check Illegal AMX-COMPLEX instructions + + .allow_index_reg + .text +_start: + tcmmimfp16ps %tmm1, %tmm2, %tmm3 + tcmmrlfp16ps %tmm1, %tmm2, %tmm3 diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index c44f071a0e2..6d326b49a39 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -493,6 +493,7 @@ if [gas_32_check] then { run_dump_test "avx-ne-convert-intel" run_dump_test "raoint" run_dump_test "raoint-intel" + run_list_test "amx-complex-inval" run_list_test "sg" run_dump_test "clzero" run_dump_test "invlpgb" @@ -1183,6 +1184,9 @@ if [gas_64_check] then { run_dump_test "x86-64-avx-ne-convert-intel" run_dump_test "x86-64-raoint" run_dump_test "x86-64-raoint-intel" + run_dump_test "x86-64-amx-complex" + run_dump_test "x86-64-amx-complex-intel" + run_dump_test "x86-64-amx-complex-bad" run_dump_test "x86-64-clzero" run_dump_test "x86-64-mwaitx-bdver4" run_list_test "x86-64-mwaitx-reg" diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d new file mode 100644 index 00000000000..646015ca9bb --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d @@ -0,0 +1,19 @@ +#as: +#objdump: -drw +#name: x86_64 Illegal AMX-COMPLEX insns +#source: x86-64-amx-complex-bad.s + +.*: +file format .* + + +Disassembly of section \.text: + +0+ <\.text>: +[ ]*[a-f0-9]+:[ ]*c4 e2 d9 6c[ ]*\(bad\)[ ]* +[ ]*[a-f0-9]+:[ ]*f5[ ]*cmc.* +[ ]*[a-f0-9]+:[ ]*c4 e2 5d 6c[ ]*\(bad\)[ ]* +[ ]*[a-f0-9]+:[ ]*f5[ ]*cmc.* +[ ]*[a-f0-9]+:[ ]*c4 62 59 6c f5[ ]*tcmmimfp16ps %tmm4,%tmm5,\(bad\) +[ ]*[a-f0-9]+:[ ]*c4 c2 59 6c f5[ ]*tcmmimfp16ps %tmm4,\(bad\),%tmm6 +[ ]*[a-f0-9]+:[ ]*c4 e2 31 6c f5[ 
]*tcmmimfp16ps \(bad\),%tmm5,%tmm6 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s new file mode 100644 index 00000000000..b2e55b13825 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s @@ -0,0 +1,17 @@ +# Check Illegal 64bit AMX-COMPLEX instructions + +.text + #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.W = 1 (illegal value). + .insn VEX.128.66.0F38.W1 0x6c, %tmm5, %tmm4, %tmm6 + + #tcmmimfp16ps %tmm4,%tmm4,%tmm6 set VEX.L = 1 (illegal value). + .insn VEX.256.66.0F38.W0 0x6c, %tmm5, %tmm4, %tmm6 + + #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.R = 0 (illegal value). + .insn VEX.128.66.0F38.W0 0x6c, %xmm5, %xmm4, %xmm14 + + #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.B = 0 (illegal value). + .insn VEX.128.66.0F38.W0 0x6c, %xmm13, %xmm4, %xmm6 + + #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.VVVV = 0110 (illegal value). + .insn VEX.128.66.0F38.W0 0x6c, %xmm5, %xmm9, %xmm6 diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d new file mode 100644 index 00000000000..8f2e015104f --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d @@ -0,0 +1,18 @@ +#as: +#objdump: -dw -Mintel +#name: x86_64 AMX-COMPLEX insns (Intel disassembly) +#source: x86-64-amx-complex.s + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1 diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.d b/gas/testsuite/gas/i386/x86-64-amx-complex.d new 
file mode 100644 index 00000000000..b2157960027 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-complex.d @@ -0,0 +1,15 @@ +#as: +#objdump: -dw +#name: x86_64 AMX-COMPLEX insns +#source: x86-64-amx-complex.s + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps %tmm1,%tmm2,%tmm3 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.s b/gas/testsuite/gas/i386/x86-64-amx-complex.s new file mode 100644 index 00000000000..56f1a00fa9e --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-complex.s @@ -0,0 +1,15 @@ +# Check 64bit AMX-COMPLEX instructions + + .allow_index_reg + .text +_start: + tcmmimfp16ps %tmm4, %tmm5, %tmm6 #AMX-COMPLEX + tcmmimfp16ps %tmm1, %tmm2, %tmm3 #AMX-COMPLEX + tcmmrlfp16ps %tmm4, %tmm5, %tmm6 #AMX-COMPLEX + tcmmrlfp16ps %tmm1, %tmm2, %tmm3 #AMX-COMPLEX + +.intel_syntax noprefix + tcmmimfp16ps tmm6, tmm5, tmm4 #AMX-COMPLEX + tcmmimfp16ps tmm3, tmm2, tmm1 #AMX-COMPLEX + tcmmrlfp16ps tmm6, tmm5, tmm4 #AMX-COMPLEX + tcmmrlfp16ps tmm3, tmm2, tmm1 #AMX-COMPLEX diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index a414e8c9b1e..d6b0fdd4ba3 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -943,6 +943,7 @@ enum MOD_VEX_0F385E_X86_64_P_1_W_0, MOD_VEX_0F385E_X86_64_P_2_W_0, MOD_VEX_0F385E_X86_64_P_3_W_0, + MOD_VEX_0F386C_X86_64_W_0, MOD_VEX_0F388C, MOD_VEX_0F388E, MOD_VEX_0F3A30_L_0, @@ -1145,6 +1146,7 @@ enum PREFIX_VEX_0F3851_W_0, PREFIX_VEX_0F385C_X86_64, PREFIX_VEX_0F385E_X86_64, + PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0, PREFIX_VEX_0F3872, PREFIX_VEX_0F38B0_W_0, PREFIX_VEX_0F38B1_W_0, @@ -1298,6 +1300,7 @@ enum X86_64_VEX_0F384B, X86_64_VEX_0F385C, X86_64_VEX_0F385E, + X86_64_VEX_0F386C, X86_64_VEX_0F38E0, X86_64_VEX_0F38E1, X86_64_VEX_0F38E2, @@ -1398,6 +1401,7 @@ 
enum VEX_LEN_0F385E_X86_64_P_1_W_0_M_0, VEX_LEN_0F385E_X86_64_P_2_W_0_M_0, VEX_LEN_0F385E_X86_64_P_3_W_0_M_0, + VEX_LEN_0F386C_X86_64_W_0_M_1, VEX_LEN_0F38DB, VEX_LEN_0F38F2, VEX_LEN_0F38F3, @@ -1565,6 +1569,7 @@ enum VEX_W_0F385E_X86_64_P_1, VEX_W_0F385E_X86_64_P_2, VEX_W_0F385E_X86_64_P_3, + VEX_W_0F386C_X86_64, VEX_W_0F3872_P_1, VEX_W_0F3878, VEX_W_0F3879, @@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = { { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) }, }, + /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */ + { + { "tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 }, + { Bad_Opcode }, + { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 }, + }, + /* PREFIX_VEX_0F3872 */ { { Bad_Opcode }, @@ -4486,6 +4498,12 @@ static const struct dis386 x86_64_table[][2] = { { PREFIX_TABLE (PREFIX_VEX_0F385E_X86_64) }, }, + /* X86_64_VEX_0F386C */ + { + { Bad_Opcode }, + { VEX_W_TABLE (VEX_W_0F386C_X86_64) }, + }, + /* X86_64_VEX_0F38E0 */ { { Bad_Opcode }, @@ -6461,7 +6479,7 @@ static const struct dis386 vex_table[][256] = { { Bad_Opcode }, { Bad_Opcode }, { Bad_Opcode }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_VEX_0F386C) }, { Bad_Opcode }, { Bad_Opcode }, { Bad_Opcode }, @@ -7181,6 +7199,11 @@ static const struct dis386 vex_len_table[][2] = { { "tdpbssd", {TMM, EXtmm, VexTmm }, 0 }, }, + /* VEX_LEN_0F386C_X86_64_W_0_M_1 */ + { + { PREFIX_TABLE (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0) }, + }, + /* VEX_LEN_0F38DB */ { { "vaesimc", { XM, EXx }, PREFIX_DATA }, @@ -7849,6 +7872,10 @@ static const struct dis386 vex_w_table[][2] = { /* VEX_W_0F385E_X86_64_P_3 */ { MOD_TABLE (MOD_VEX_0F385E_X86_64_P_3_W_0) }, }, + { + /* VEX_W_0F386C_X86_64 */ + { MOD_TABLE (MOD_VEX_0F386C_X86_64_W_0) }, + }, { /* VEX_W_0F3872_P_1 */ { "%XVvcvtneps2bf16%XY", { XMM, EXx }, 0 }, @@ -8696,6 +8723,11 @@ static const struct dis386 mod_table[][2] = { { Bad_Opcode }, { VEX_LEN_TABLE (VEX_LEN_0F385E_X86_64_P_3_W_0_M_0) }, }, + { + /* MOD_VEX_0F386C_X86_64_W_0 */ + { Bad_Opcode }, + { VEX_LEN_TABLE 
(VEX_LEN_0F386C_X86_64_W_0_M_1) }, + }, { /* MOD_VEX_0F388C */ { "vpmaskmov%DQ", { XM, Vex, Mx }, PREFIX_DATA }, diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c index 489ae3429c9..c2ac3c6832d 100644 --- a/opcodes/i386-gen.c +++ b/opcodes/i386-gen.c @@ -240,6 +240,8 @@ static const dependency isa_dependencies[] = "AMX_TILE" }, { "AMX_FP16", "AMX_TILE" }, + { "AMX_COMPLEX", + "AMX_TILE" }, { "KL", "SSE2" }, { "WIDEKL", @@ -378,6 +380,7 @@ static bitfield cpu_flags[] = BITFIELD (AMX_INT8), BITFIELD (AMX_BF16), BITFIELD (AMX_FP16), + BITFIELD (AMX_COMPLEX), BITFIELD (AMX_TILE), BITFIELD (MOVDIRI), BITFIELD (MOVDIR64B), diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h index 23d93ae6f81..b17e8341aa2 100644 --- a/opcodes/i386-opc.h +++ b/opcodes/i386-opc.h @@ -248,6 +248,8 @@ enum CpuAMX_BF16, /* AMX-FP16 instructions required */ CpuAMX_FP16, + /* AMX-COMPLEX instructions required. */ + CpuAMX_COMPLEX, /* AMX-TILE instructions required */ CpuAMX_TILE, /* GFNI instructions required */ @@ -432,6 +434,7 @@ typedef union i386_cpu_flags unsigned int cpuamx_int8:1; unsigned int cpuamx_bf16:1; unsigned int cpuamx_fp16:1; + unsigned int cpuamx_complex:1; unsigned int cpuamx_tile:1; unsigned int cpugfni:1; unsigned int cpuvaes:1; diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index 9cc909925f4..15d48eeb4c7 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -3146,6 +3146,9 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {} ldtilecfg, 0x49/0, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex } sttilecfg, 0x6649/0, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex } +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } + tdpbf16ps, 0xf35c, AMX_BF16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, 
RegTMM, RegTMM } tdpfp16ps, 0xf25c, AMX_FP16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } tdpbssd, 0xf25e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } -- 2.31.1
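For readers who want to check the expected encodings in the .d files above by hand, here is a minimal Python sketch (not part of the patch, and not binutils code) that splits a three-byte VEX encoding into its fields and applies the same sequence of checks as the new table nesting: VEX.W, then ModRM.mod, then VEX.L, then the prefix. The names `decode_vex3`, `describe` and `MNEMONICS` are made up for this illustration, and the sketch assumes 64-bit mode and a fixed 5-byte input.

```python
# Illustrative only: field extraction for C4-prefixed (three-byte VEX)
# encodings such as "c4 e2 59 6c f5". All names here are hypothetical.

# Mnemonics keyed by VEX.pp, mirroring the new prefix_table entry:
# no prefix -> tcmmrlfp16ps, 66 prefix (pp == 1) -> tcmmimfp16ps.
MNEMONICS = {0: "tcmmrlfp16ps", 1: "tcmmimfp16ps"}


def decode_vex3(code: bytes) -> dict:
    """Split a 5-byte C4-prefixed encoding into VEX, opcode and ModRM fields."""
    if len(code) != 5 or code[0] != 0xC4:
        raise ValueError("expected a 5-byte, C4-prefixed encoding")
    b1, b2, opcode, modrm = code[1], code[2], code[3], code[4]
    # R, B and vvvv are stored inverted (one's complement) in the prefix.
    r = ((b1 >> 7) & 1) ^ 1
    b = ((b1 >> 5) & 1) ^ 1
    return {
        "map": b1 & 0x1F,                      # 2 selects the 0F38 map
        "w": (b2 >> 7) & 1,
        "vvvv": (~(b2 >> 3)) & 0xF,
        "l": (b2 >> 2) & 1,
        "pp": b2 & 3,
        "opcode": opcode,
        "mod": modrm >> 6,
        "reg": ((modrm >> 3) & 7) | (r << 3),  # extended by VEX.R
        "rm": (modrm & 7) | (b << 3),          # extended by VEX.B
    }


def describe(code: bytes) -> str:
    """Render an AT&T-style line for opcode 0F38 6C, or "(bad)"."""
    fields = decode_vex3(code)
    if fields["map"] != 2 or fields["opcode"] != 0x6C:
        return "(not 0F38 6C)"
    # Same order as the table chain: W, then mod, then L, then prefix.
    if fields["w"] != 0 or fields["mod"] != 3 or fields["l"] != 0:
        return "(bad)"
    if fields["pp"] not in MNEMONICS:
        return "(bad)"

    def tmm(number: int) -> str:
        # Only tmm0..tmm7 exist; anything above prints as (bad).
        return "%tmm" + str(number) if number < 8 else "(bad)"

    # AT&T operand order as printed in x86-64-amx-complex.d: vvvv, rm, reg.
    operands = ",".join(tmm(fields[k]) for k in ("vvvv", "rm", "reg"))
    return MNEMONICS[fields["pp"]] + " " + operands


if __name__ == "__main__":
    for text in ("c4e2596cf5", "c4e2586cf5", "c462596cf5", "c4e2316cf5"):
        print(text, "->", describe(bytes.fromhex(text)))
```

Note that for the VEX.W = 1 and VEX.L = 1 cases the real disassembler reports (bad) after four bytes and then decodes the trailing f5 separately as cmc, as x86-64-amx-complex-bad.d expects; the sketch simply returns "(bad)" for the whole input.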
* Re: [PATCH v2] Support Intel AMX-COMPLEX
  2023-04-06 7:17 ` [PATCH v2] " Haochen Jiang
@ 2023-04-06 9:45 ` Jan Beulich
  2023-04-07 1:59 ` Jiang, Haochen
  2023-04-07 15:49 ` H.J. Lu
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2023-04-06 9:45 UTC
To: Haochen Jiang; +Cc: hjl.tools, binutils

On 06.04.2023 09:17, Haochen Jiang wrote:
> gas/ChangeLog:
>
>         * NEWS: Support Intel AMX-COMPLEX.
>         * config/tc-i386.c: Add amx_complex.
>         * doc/c-i386.texi: Document .amx_complex.
>         * testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests.
>         * testsuite/gas/i386/amx-complex-inval.l: New test.
>         * testsuite/gas/i386/amx-complex-inval.s: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-bad.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-bad.s: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex.s: Ditto.
>
> opcodes/ChangeLog:
>
>         * i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New.
>         (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto.
>         (X86_64_VEX_0F386C): Ditto.
>         (VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto.
>         (VEX_W_0F386C_X86_64): Ditto.
>         (mod_table): Add MOD_VEX_0F386C_X86_64_W_0.
>         (prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0.
>         (x86_64_table): Add X86_64_VEX_0F386C.
>         (vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1.
>         (vex_w_table): Add VEX_W_0F386C_X86_64.
>         * i386-gen.c (cpu_flag_init): Add CPU_AMX_COMPLEX_FLAGS and
>         CPU_ANY_AMX_COMPLEX_FLAGS.
>         * i386-init.h: Regenerated.
>         * i386-mnem.h: Ditto.
>         * i386-opc.h (CpuAMX_COMPLEX): New.
>         (i386_cpu_flags): Add cpuamx_complex.
>         * i386-opc.tbl: Add AMX-COMPLEX instructions.
>         * i386-tbl.h: Regenerated.
> ---
>  gas/NEWS                                      |    2 +
>  gas/config/tc-i386.c                          |    1 +
>  gas/doc/c-i386.texi                           |    4 +-
>  gas/testsuite/gas/i386/amx-complex-inval.l    |    3 +
>  gas/testsuite/gas/i386/amx-complex-inval.s    |    7 +
>  gas/testsuite/gas/i386/i386.exp               |    4 +
>  .../gas/i386/x86-64-amx-complex-bad.d         |   19 +
>  .../gas/i386/x86-64-amx-complex-bad.s         |   17 +
>  .../gas/i386/x86-64-amx-complex-intel.d       |   18 +
>  gas/testsuite/gas/i386/x86-64-amx-complex.d   |   15 +
>  gas/testsuite/gas/i386/x86-64-amx-complex.s   |   15 +
>  opcodes/i386-dis.c                            |   34 +-
>  opcodes/i386-gen.c                            |    3 +
>  opcodes/i386-init.h                           |  542 +-
>  opcodes/i386-mnem.h                           | 1098 +--
>  opcodes/i386-opc.h                            |    3 +
>  opcodes/i386-opc.tbl                          |    3 +
>  opcodes/i386-tbl.h                            | 7836 +++++++++--------
>  18 files changed, 4912 insertions(+), 4712 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.l
>  create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.s

Okay.

That said, like AMX-FP16 this one also omits x86-64-amx-inval.s-like checks
(that very testcase could easily be extended instead of making yet further
tiny new ones); even the original AMX work checked only an AMX-INT8 insn
there (I'm specifically after the all-operands-must-be-distinct checking),
but not AMX-BF16. Would be nice if we could gain additions for both (all
three) in a subsequent patch.

Jan
* RE: [PATCH v2] Support Intel AMX-COMPLEX
  2023-04-06 9:45 ` Jan Beulich
@ 2023-04-07 1:59 ` Jiang, Haochen
  0 siblings, 0 replies; 7+ messages in thread
From: Jiang, Haochen @ 2023-04-07 1:59 UTC
To: Beulich, Jan; +Cc: hjl.tools, binutils

> That said, like AMX-FP16 this one also omits x86-64-amx-inval.s-like checks
> (that very testcase could easily be extended instead of making yet further
> tiny new ones); even the original AMX work checked only an AMX-INT8 insn
> there (I'm specifically after the all-operands-must-be-distinct checking), but
> not AMX-BF16. Would be nice if we could gain additions for both (all three) in
> a subsequent patch.

I will add those inval AMX testcases, modeled on the existing ones, in the
next few days.

Thx,
Haochen

>
> Jan
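The all-operands-must-be-distinct rule Jan asks to have tested can be sketched as follows. This is only an illustration in Python, not the assembler's code: `tile_operands_ok` is a hypothetical helper, and the sketch assumes, as the review comment implies for the existing AMX-INT8 check, that the three tile operands of these instructions must each name a different register in the range tmm0..tmm7.

```python
# Illustrative only: the kind of operand constraint an "inval" testcase
# would exercise. The helper name and interface are hypothetical.

def tile_operands_ok(dst: int, src1: int, src2: int) -> bool:
    """Return True if all three tile numbers are in 0..7 and pairwise distinct."""
    regs = (dst, src1, src2)
    in_range = all(0 <= reg <= 7 for reg in regs)
    return in_range and len(set(regs)) == 3


if __name__ == "__main__":
    # Candidate operand triples in (dst, src1, src2) order.
    for triple in [(6, 5, 4), (6, 6, 4), (6, 5, 5), (4, 5, 4)]:
        verdict = "ok" if tile_operands_ok(*triple) else "rejected"
        print(triple, verdict)
```

A follow-up test in the style of x86-64-amx-inval.s would presumably feed the assembler one line per repeated-register combination for each of the AMX-BF16, AMX-FP16 and AMX-COMPLEX mnemonics.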
* Re: [PATCH v2] Support Intel AMX-COMPLEX
  2023-04-06 7:17 ` [PATCH v2] " Haochen Jiang
  2023-04-06 9:45 ` Jan Beulich
@ 2023-04-07 15:49 ` H.J. Lu
  1 sibling, 0 replies; 7+ messages in thread
From: H.J. Lu @ 2023-04-07 15:49 UTC
To: Haochen Jiang; +Cc: binutils, jbeulich

On Thu, Apr 6, 2023 at 12:19 AM Haochen Jiang <haochen.jiang@intel.com> wrote:
>
> Hi all,
>
> The v2 patch makes several changes:
>
> 1.
> > > @@ -1183,6 +1184,8 @@ if [gas_64_check] then {
> > >      run_dump_test "x86-64-avx-ne-convert-intel"
> > >      run_dump_test "x86-64-raoint"
> > >      run_dump_test "x86-64-raoint-intel"
> > > +    run_dump_test "x86-64-amx-complex"
> > > +    run_dump_test "x86-64-amx-complex-intel"
> > >      run_dump_test "x86-64-clzero"
> > >      run_dump_test "x86-64-mwaitx-bdver4"
> > >      run_list_test "x86-64-mwaitx-reg"
> >
> > There are constraints on operand combinations, like for tdp*, which want
> > testing here as well (both the assembler and disassembler sides) imo.
>
> Added x86-64-amx-complex-bad testcases. The order of operands 2 and 3 is
> still reversed here. We can fix that once PR30317 is resolved.
>
> 2.
> > > --- a/opcodes/i386-opc.h
> > > +++ b/opcodes/i386-opc.h
> > > @@ -248,6 +248,8 @@ enum
> > >    CpuAMX_BF16,
> > >    /* AMX-FP16 instructions required */
> > >    CpuAMX_FP16,
> > > +  /* Intel AMX-COMPLEX Instructions support required. */
> > > +  CpuAMX_COMPLEX,
> > >    /* AMX-TILE instructions required */
> > >    CpuAMX_TILE,
> > >    /* GFNI instructions required */
> >
> > In line with adjacent comments, please omit "Intel" and "support" from the
> > comment, and don't start "instructions" with a capital letter. Plus while the
> > full stop is in line with general comment style, looking at adjacent comments
> > here it probably also wants omitting.
>
> Adjusted the comment here.
>
> 3.
> > > --- a/opcodes/i386-opc.tbl
> > > +++ b/opcodes/i386-opc.tbl
> > > @@ -3163,6 +3163,13 @@ tilezero, 0xf249, AMX_TILE|x64,
> > > Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM }
> > >
> > > // AMX instructions end.
> > >
> > > +// AMX-COMPLEX instructions.
> > > +
> > > +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64,
> > > +Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> > { RegTMM,
> > > +RegTMM, RegTMM } tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64,
> > > +Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf,
> > { RegTMM,
> > > +RegTMM, RegTMM }
> > > +
> > > +// AMX-COMPLEX instructions end.
> >
> > I think these would better not have their own comment-bounded group, but
> > go inside the "AMX instructions" sections (which already covers all AMX-*).
>
> Put them in alphabetical order with the other AMX instructions.
>
> These changes can be seen in the patch below. Thanks for your review!
>
> Thx,
> Haochen
>
> gas/ChangeLog:
>
>         * NEWS: Support Intel AMX-COMPLEX.
>         * config/tc-i386.c: Add amx_complex.
>         * doc/c-i386.texi: Document .amx_complex.
>         * testsuite/gas/i386/i386.exp: Run AMX-COMPLEX tests.
>         * testsuite/gas/i386/amx-complex-inval.l: New test.
>         * testsuite/gas/i386/amx-complex-inval.s: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-bad.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-bad.s: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex-intel.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex.d: Ditto.
>         * testsuite/gas/i386/x86-64-amx-complex.s: Ditto.
>
> opcodes/ChangeLog:
>
>         * i386-dis.c (MOD_VEX_0F386C_X86_64_W_0): New.
>         (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0): Ditto.
>         (X86_64_VEX_0F386C): Ditto.
>         (VEX_LEN_0F386C_X86_64_W_0_M_1): Ditto.
>         (VEX_W_0F386C_X86_64): Ditto.
>         (mod_table): Add MOD_VEX_0F386C_X86_64_W_0.
>         (prefix_table): Add PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0.
>         (x86_64_table): Add X86_64_VEX_0F386C.
>         (vex_len_table): Add VEX_LEN_0F386C_X86_64_W_0_M_1.
>         (vex_w_table): Add VEX_W_0F386C_X86_64.
> * i386-gen.c (cpu_flag_init): Add CPU_AMX_COMPLEX_FLAGS and > CPU_ANY_AMX_COMPLEX_FLAGS. > * i386-init.h: Regenerated. > * i386-mnem.h: Ditto. > * i386-opc.h (CpuAMX_COMPLEX): New. > (i386_cpu_flags): Add cpuamx_complex. > * i386-opc.tbl: Add AMX-COMPLEX instructions. > * i386-tbl.h: Regenerated. > --- > gas/NEWS | 2 + > gas/config/tc-i386.c | 1 + > gas/doc/c-i386.texi | 4 +- > gas/testsuite/gas/i386/amx-complex-inval.l | 3 + > gas/testsuite/gas/i386/amx-complex-inval.s | 7 + > gas/testsuite/gas/i386/i386.exp | 4 + > .../gas/i386/x86-64-amx-complex-bad.d | 19 + > .../gas/i386/x86-64-amx-complex-bad.s | 17 + > .../gas/i386/x86-64-amx-complex-intel.d | 18 + > gas/testsuite/gas/i386/x86-64-amx-complex.d | 15 + > gas/testsuite/gas/i386/x86-64-amx-complex.s | 15 + > opcodes/i386-dis.c | 34 +- > opcodes/i386-gen.c | 3 + > opcodes/i386-init.h | 542 +- > opcodes/i386-mnem.h | 1098 +-- > opcodes/i386-opc.h | 3 + > opcodes/i386-opc.tbl | 3 + > opcodes/i386-tbl.h | 7836 +++++++++-------- > 18 files changed, 4912 insertions(+), 4712 deletions(-) > create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.l > create mode 100644 gas/testsuite/gas/i386/amx-complex-inval.s > create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.d > create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-bad.s > create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex-intel.d > create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.d > create mode 100644 gas/testsuite/gas/i386/x86-64-amx-complex.s > > diff --git a/gas/NEWS b/gas/NEWS > index f95383e83af..42a2005d7c9 100644 > --- a/gas/NEWS > +++ b/gas/NEWS > @@ -1,5 +1,7 @@ > -*- text -*- > > +* Add support for Intel AMX-COMPLEX instructions. > + > * Add SME2 support to the AArch64 port. > > * A new .insn directive is recognized by x86 gas. 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c > index ea2ed0d818e..ea5705da4af 100644 > --- a/gas/config/tc-i386.c > +++ b/gas/config/tc-i386.c > @@ -1113,6 +1113,7 @@ static const arch_entry cpu_arch[] = > SUBARCH (amx_int8, AMX_INT8, ANY_AMX_INT8, false), > SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false), > SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false), > + SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false), > SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false), > SUBARCH (movdiri, MOVDIRI, MOVDIRI, false), > SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false), > diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi > index 617cbd46cb7..15d060b2a33 100644 > --- a/gas/doc/c-i386.texi > +++ b/gas/doc/c-i386.texi > @@ -208,6 +208,7 @@ accept various extension mnemonics. For example, > @code{amx_int8}, > @code{amx_bf16}, > @code{amx_fp16}, > +@code{amx_complex}, > @code{amx_tile}, > @code{vmx}, > @code{vmfunc}, > @@ -1636,7 +1637,8 @@ supported on the CPU specified. 
The choices for @var{cpu_type} are: > @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote} > @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq} > @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} > -@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @tab @samp{.amx_tile} > +@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} > +@item @samp{.amx_complex} @tab @samp{.amx_tile} > @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} > @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} > @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} > diff --git a/gas/testsuite/gas/i386/amx-complex-inval.l b/gas/testsuite/gas/i386/amx-complex-inval.l > new file mode 100644 > index 00000000000..df6713c5d8b > --- /dev/null > +++ b/gas/testsuite/gas/i386/amx-complex-inval.l > @@ -0,0 +1,3 @@ > +.* Assembler messages: > +.*:6: Error: `tcmmimfp16ps' is only supported in 64-bit mode > +.*:7: Error: `tcmmrlfp16ps' is only supported in 64-bit mode > diff --git a/gas/testsuite/gas/i386/amx-complex-inval.s b/gas/testsuite/gas/i386/amx-complex-inval.s > new file mode 100644 > index 00000000000..b1bbf32585b > --- /dev/null > +++ b/gas/testsuite/gas/i386/amx-complex-inval.s > @@ -0,0 +1,7 @@ > +# Check Illegal AMX-COMPLEX instructions > + > + .allow_index_reg > + .text > +_start: > + tcmmimfp16ps %tmm1, %tmm2, %tmm3 > + tcmmrlfp16ps %tmm1, %tmm2, %tmm3 > diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp > index c44f071a0e2..6d326b49a39 100644 > --- a/gas/testsuite/gas/i386/i386.exp > +++ b/gas/testsuite/gas/i386/i386.exp > @@ -493,6 +493,7 @@ if [gas_32_check] then { > run_dump_test "avx-ne-convert-intel" > run_dump_test "raoint" > run_dump_test "raoint-intel" > + run_list_test "amx-complex-inval" > run_list_test "sg" > run_dump_test "clzero" > run_dump_test "invlpgb" > @@ -1183,6 
+1184,9 @@ if [gas_64_check] then { > run_dump_test "x86-64-avx-ne-convert-intel" > run_dump_test "x86-64-raoint" > run_dump_test "x86-64-raoint-intel" > + run_dump_test "x86-64-amx-complex" > + run_dump_test "x86-64-amx-complex-intel" > + run_dump_test "x86-64-amx-complex-bad" > run_dump_test "x86-64-clzero" > run_dump_test "x86-64-mwaitx-bdver4" > run_list_test "x86-64-mwaitx-reg" > diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d > new file mode 100644 > index 00000000000..646015ca9bb > --- /dev/null > +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.d > @@ -0,0 +1,19 @@ > +#as: > +#objdump: -drw > +#name: x86_64 Illegal AMX-COMPLEX insns > +#source: x86-64-amx-complex-bad.s > + > +.*: +file format .* > + > + > +Disassembly of section \.text: > + > +0+ <\.text>: > +[ ]*[a-f0-9]+:[ ]*c4 e2 d9 6c[ ]*\(bad\)[ ]* > +[ ]*[a-f0-9]+:[ ]*f5[ ]*cmc.* > +[ ]*[a-f0-9]+:[ ]*c4 e2 5d 6c[ ]*\(bad\)[ ]* > +[ ]*[a-f0-9]+:[ ]*f5[ ]*cmc.* > +[ ]*[a-f0-9]+:[ ]*c4 62 59 6c f5[ ]*tcmmimfp16ps %tmm4,%tmm5,\(bad\) > +[ ]*[a-f0-9]+:[ ]*c4 c2 59 6c f5[ ]*tcmmimfp16ps %tmm4,\(bad\),%tmm6 > +[ ]*[a-f0-9]+:[ ]*c4 e2 31 6c f5[ ]*tcmmimfp16ps \(bad\),%tmm5,%tmm6 > +#pass > diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s > new file mode 100644 > index 00000000000..b2e55b13825 > --- /dev/null > +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-bad.s > @@ -0,0 +1,17 @@ > +# Check Illegal 64bit AMX-COMPLEX instructions > + > +.text > + #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.W = 1 (illegal value). > + .insn VEX.128.66.0F38.W1 0x6c, %tmm5, %tmm4, %tmm6 > + > + #tcmmimfp16ps %tmm4,%tmm4,%tmm6 set VEX.L = 1 (illegal value). > + .insn VEX.256.66.0F38.W0 0x6c, %tmm5, %tmm4, %tmm6 > + > + #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.R = 0 (illegal value). > + .insn VEX.128.66.0F38.W0 0x6c, %xmm5, %xmm4, %xmm14 > + > + #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.B = 0 (illegal value). 
> + .insn VEX.128.66.0F38.W0 0x6c, %xmm13, %xmm4, %xmm6 > + > + #tcmmimfp16ps %tmm4,%tmm5,%tmm6 set VEX.VVVV = 0110 (illegal value). > + .insn VEX.128.66.0F38.W0 0x6c, %xmm5, %xmm9, %xmm6 > diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d > new file mode 100644 > index 00000000000..8f2e015104f > --- /dev/null > +++ b/gas/testsuite/gas/i386/x86-64-amx-complex-intel.d > @@ -0,0 +1,18 @@ > +#as: > +#objdump: -dw -Mintel > +#name: x86_64 AMX-COMPLEX insns (Intel disassembly) > +#source: x86-64-amx-complex.s > + > +.*: +file format .* > + > +Disassembly of section \.text: > + > +0+ <_start>: > +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4 > +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1 > +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4 > +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1 > +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps tmm6,tmm5,tmm4 > +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps tmm3,tmm2,tmm1 > +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps tmm6,tmm5,tmm4 > +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps tmm3,tmm2,tmm1 > diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.d b/gas/testsuite/gas/i386/x86-64-amx-complex.d > new file mode 100644 > index 00000000000..b2157960027 > --- /dev/null > +++ b/gas/testsuite/gas/i386/x86-64-amx-complex.d > @@ -0,0 +1,15 @@ > +#as: > +#objdump: -dw > +#name: x86_64 AMX-COMPLEX insns > +#source: x86-64-amx-complex.s > + > +.*: +file format .* > + > +Disassembly of section \.text: > + > +0+ <_start>: > +\s*[a-f0-9]+:\s*c4 e2 59 6c f5\s+tcmmimfp16ps %tmm4,%tmm5,%tmm6 > +\s*[a-f0-9]+:\s*c4 e2 71 6c da\s+tcmmimfp16ps %tmm1,%tmm2,%tmm3 > +\s*[a-f0-9]+:\s*c4 e2 58 6c f5\s+tcmmrlfp16ps %tmm4,%tmm5,%tmm6 > +\s*[a-f0-9]+:\s*c4 e2 70 6c da\s+tcmmrlfp16ps %tmm1,%tmm2,%tmm3 > +#pass > diff --git a/gas/testsuite/gas/i386/x86-64-amx-complex.s b/gas/testsuite/gas/i386/x86-64-amx-complex.s > new file mode 
100644 > index 00000000000..56f1a00fa9e > --- /dev/null > +++ b/gas/testsuite/gas/i386/x86-64-amx-complex.s > @@ -0,0 +1,15 @@ > +# Check 64bit AMX-COMPLEX instructions > + > + .allow_index_reg > + .text > +_start: > + tcmmimfp16ps %tmm4, %tmm5, %tmm6 #AMX-COMPLEX > + tcmmimfp16ps %tmm1, %tmm2, %tmm3 #AMX-COMPLEX > + tcmmrlfp16ps %tmm4, %tmm5, %tmm6 #AMX-COMPLEX > + tcmmrlfp16ps %tmm1, %tmm2, %tmm3 #AMX-COMPLEX > + > +.intel_syntax noprefix > + tcmmimfp16ps tmm6, tmm5, tmm4 #AMX-COMPLEX > + tcmmimfp16ps tmm3, tmm2, tmm1 #AMX-COMPLEX > + tcmmrlfp16ps tmm6, tmm5, tmm4 #AMX-COMPLEX > + tcmmrlfp16ps tmm3, tmm2, tmm1 #AMX-COMPLEX > diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c > index a414e8c9b1e..d6b0fdd4ba3 100644 > --- a/opcodes/i386-dis.c > +++ b/opcodes/i386-dis.c > @@ -943,6 +943,7 @@ enum > MOD_VEX_0F385E_X86_64_P_1_W_0, > MOD_VEX_0F385E_X86_64_P_2_W_0, > MOD_VEX_0F385E_X86_64_P_3_W_0, > + MOD_VEX_0F386C_X86_64_W_0, > MOD_VEX_0F388C, > MOD_VEX_0F388E, > MOD_VEX_0F3A30_L_0, > @@ -1145,6 +1146,7 @@ enum > PREFIX_VEX_0F3851_W_0, > PREFIX_VEX_0F385C_X86_64, > PREFIX_VEX_0F385E_X86_64, > + PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0, > PREFIX_VEX_0F3872, > PREFIX_VEX_0F38B0_W_0, > PREFIX_VEX_0F38B1_W_0, > @@ -1298,6 +1300,7 @@ enum > X86_64_VEX_0F384B, > X86_64_VEX_0F385C, > X86_64_VEX_0F385E, > + X86_64_VEX_0F386C, > X86_64_VEX_0F38E0, > X86_64_VEX_0F38E1, > X86_64_VEX_0F38E2, > @@ -1398,6 +1401,7 @@ enum > VEX_LEN_0F385E_X86_64_P_1_W_0_M_0, > VEX_LEN_0F385E_X86_64_P_2_W_0_M_0, > VEX_LEN_0F385E_X86_64_P_3_W_0_M_0, > + VEX_LEN_0F386C_X86_64_W_0_M_1, > VEX_LEN_0F38DB, > VEX_LEN_0F38F2, > VEX_LEN_0F38F3, > @@ -1565,6 +1569,7 @@ enum > VEX_W_0F385E_X86_64_P_1, > VEX_W_0F385E_X86_64_P_2, > VEX_W_0F385E_X86_64_P_3, > + VEX_W_0F386C_X86_64, > VEX_W_0F3872_P_1, > VEX_W_0F3878, > VEX_W_0F3879, > @@ -4119,6 +4124,13 @@ static const struct dis386 prefix_table[][4] = { > { VEX_W_TABLE (VEX_W_0F385E_X86_64_P_3) }, > }, > > + /* PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0 */ > + { > + { 
"tcmmrlfp16ps", { TMM, EXtmm, VexTmm }, 0 }, > + { Bad_Opcode }, > + { "tcmmimfp16ps", { TMM, EXtmm, VexTmm }, 0 }, > + }, > + > /* PREFIX_VEX_0F3872 */ > { > { Bad_Opcode }, > @@ -4486,6 +4498,12 @@ static const struct dis386 x86_64_table[][2] = { > { PREFIX_TABLE (PREFIX_VEX_0F385E_X86_64) }, > }, > > + /* X86_64_VEX_0F386C */ > + { > + { Bad_Opcode }, > + { VEX_W_TABLE (VEX_W_0F386C_X86_64) }, > + }, > + > /* X86_64_VEX_0F38E0 */ > { > { Bad_Opcode }, > @@ -6461,7 +6479,7 @@ static const struct dis386 vex_table[][256] = { > { Bad_Opcode }, > { Bad_Opcode }, > { Bad_Opcode }, > - { Bad_Opcode }, > + { X86_64_TABLE (X86_64_VEX_0F386C) }, > { Bad_Opcode }, > { Bad_Opcode }, > { Bad_Opcode }, > @@ -7181,6 +7199,11 @@ static const struct dis386 vex_len_table[][2] = { > { "tdpbssd", {TMM, EXtmm, VexTmm }, 0 }, > }, > > + /* VEX_LEN_0F386C_X86_64_W_0_M_1 */ > + { > + { PREFIX_TABLE (PREFIX_VEX_0F386C_X86_64_W_0_M_1_L_0) }, > + }, > + > /* VEX_LEN_0F38DB */ > { > { "vaesimc", { XM, EXx }, PREFIX_DATA }, > @@ -7849,6 +7872,10 @@ static const struct dis386 vex_w_table[][2] = { > /* VEX_W_0F385E_X86_64_P_3 */ > { MOD_TABLE (MOD_VEX_0F385E_X86_64_P_3_W_0) }, > }, > + { > + /* VEX_W_0F386C_X86_64 */ > + { MOD_TABLE (MOD_VEX_0F386C_X86_64_W_0) }, > + }, > { > /* VEX_W_0F3872_P_1 */ > { "%XVvcvtneps2bf16%XY", { XMM, EXx }, 0 }, > @@ -8696,6 +8723,11 @@ static const struct dis386 mod_table[][2] = { > { Bad_Opcode }, > { VEX_LEN_TABLE (VEX_LEN_0F385E_X86_64_P_3_W_0_M_0) }, > }, > + { > + /* MOD_VEX_0F386C_X86_64_W_0 */ > + { Bad_Opcode }, > + { VEX_LEN_TABLE (VEX_LEN_0F386C_X86_64_W_0_M_1) }, > + }, > { > /* MOD_VEX_0F388C */ > { "vpmaskmov%DQ", { XM, Vex, Mx }, PREFIX_DATA }, > diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c > index 489ae3429c9..c2ac3c6832d 100644 > --- a/opcodes/i386-gen.c > +++ b/opcodes/i386-gen.c > @@ -240,6 +240,8 @@ static const dependency isa_dependencies[] = > "AMX_TILE" }, > { "AMX_FP16", > "AMX_TILE" }, > + { "AMX_COMPLEX", > + "AMX_TILE" }, > { 
"KL", > "SSE2" }, > { "WIDEKL", > @@ -378,6 +380,7 @@ static bitfield cpu_flags[] = > BITFIELD (AMX_INT8), > BITFIELD (AMX_BF16), > BITFIELD (AMX_FP16), > + BITFIELD (AMX_COMPLEX), > BITFIELD (AMX_TILE), > BITFIELD (MOVDIRI), > BITFIELD (MOVDIR64B), > diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h > index 23d93ae6f81..b17e8341aa2 100644 > --- a/opcodes/i386-opc.h > +++ b/opcodes/i386-opc.h > @@ -248,6 +248,8 @@ enum > CpuAMX_BF16, > /* AMX-FP16 instructions required */ > CpuAMX_FP16, > + /* AMX-COMPLEX instructions required. */ > + CpuAMX_COMPLEX, > /* AMX-TILE instructions required */ > CpuAMX_TILE, > /* GFNI instructions required */ > @@ -432,6 +434,7 @@ typedef union i386_cpu_flags > unsigned int cpuamx_int8:1; > unsigned int cpuamx_bf16:1; > unsigned int cpuamx_fp16:1; > + unsigned int cpuamx_complex:1; > unsigned int cpuamx_tile:1; > unsigned int cpugfni:1; > unsigned int cpuvaes:1; > diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl > index 9cc909925f4..15d48eeb4c7 100644 > --- a/opcodes/i386-opc.tbl > +++ b/opcodes/i386-opc.tbl > @@ -3146,6 +3146,9 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {} > ldtilecfg, 0x49/0, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex } > sttilecfg, 0x6649/0, AMX_TILE|x64, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex } > > +tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } > +tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } > + > tdpbf16ps, 0xf35c, AMX_BF16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } > tdpfp16ps, 0xf25c, AMX_FP16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } > tdpbssd, 0xf25e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM } > -- > 2.31.1 > OK. Thanks. -- H.J. 
Thread overview: 7+ messages
2023-04-03 7:11 [PATCH] Support Intel AMX-COMPLEX Haochen Jiang
2023-04-04 7:35 ` Jan Beulich
2023-04-04 8:41 ` Jiang, Haochen
2023-04-06 7:17 ` [PATCH v2] " Haochen Jiang
2023-04-06 9:45 ` Jan Beulich
2023-04-07 1:59 ` Jiang, Haochen
2023-04-07 15:49 ` H.J. Lu