From: Jan Beulich <jbeulich@suse.com>
To: Haochen Jiang <haochen.jiang@intel.com>
Cc: hjl.tools@gmail.com, ludloff@gmail.com, binutils@sourceware.org
Subject: Re: [PATCH v2] Support Intel AMX-AVX512
Date: Mon, 6 Jan 2025 17:29:15 +0100 [thread overview]
Message-ID: <3d06c14f-592a-4e16-878b-e69f8d0a605f@suse.com> (raw)
In-Reply-To: <20250103025029.1909253-1-haochen.jiang@intel.com>
On 03.01.2025 03:50, Haochen Jiang wrote:
> Hi all,
>
> Although there is still opensin AMX-AVX512 encoding, I would like
> to first send out the v2 patch with the current encodings since in
> this patch, there are also other parts need to be reviewed.
>
> Patch descrption and changes are embedded following.
>
> The encoding issue I mentioned previously is on tcvtrowps2[bf16,ph][h,l].
> For Reg32 part, it is ok. However, for Imm8 part, under current HW design,
> it is split to two opcodes. It is not an ideal design. Due to Christmas/
> New Year Holiday, the answer for whether it could be changed is still
> delayed for now. I will update that as soon as I get the answer.
>
> Thx,
> Haochen
>
> ---
>
> Changes in v2:
>
> - Pull out all GPR mode out of vex length switch in OP_VEX to make
> it more general.
> - Remove invalid test for 32-bit.
> - Reuse VexGdq for operands.
> - Update the mnemonics from tcvtrowps2pbf16[h,l] to tcvtrowps2bf16[h,l]
> according to ISE056.
>
> ---
>
> This patch will support AMX-AVX512. In disassmbler, we pull out all
> GPR mode out of the vex length switch to make it more general.
>
> ---
>
> gas/ChangeLog:
>
> * config/tc-i386.c: Add amx_avx512.
> * doc/c-i386.texi: Document .amx_avx512.
> * testsuite/gas/i386/x86-64.exp: Run AMX-AVX512 tests.
> * testsuite/gas/i386/x86-64-amx-avx512-intel.d: New test.
> * testsuite/gas/i386/x86-64-amx-avx512.d: Ditto.
> * testsuite/gas/i386/x86-64-amx-avx512.s: Ditto.
>
> opcodes/ChangeLog:
>
> * i386-dis-evex-len.h: Add EVEX_LEN_0F384A_X86_64_W_0,
> EVEX_LEN_0F386D_X86_64_W_0, EVEX_LEN_0F3A07_X86_64_W_0,
> EVEX_LEN_0F3A77_X86_64_W_0.
> * i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F384A_W_0_L_2,
> PREFIX_EVEX_0F386D_W_0_L_2, PREFIX_EVEX_0F3A07_W_0_L_2,
> PREFIX_EVEX_0F3A77_W_0_L_2.
> * i386-dis-evex-w.h: Add EVEX_W_0F384A_X86_64, EVEX_W_0F386D_X86_64,
> EVEX_W_0F3A07_X86_64, EVEX_W_0F3A77_X86_64.
> * i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F384A, X86_64_EVEX_0F386D,
> X86_64_EVEX_0F3A07, X86_64_EVEX_0F3A77.
> * i386-dis-evex.h: Ditto.
> * i386-dis.c (EVEX_LEN_0F384A_X86_64_W_0): New.
> (EVEX_LEN_0F386D_X86_64_W_0): Ditto.
> (EVEX_LEN_0F3A07_X86_64_W_0): Ditto.
> (EVEX_LEN_0F3A77_X86_64_W_0): Ditto.
> (MOD_EVEX_0F384A_X86_64_W_0): Ditto.
> (MOD_EVEX_0F386D_X86_64_W_0): Ditto.
> (MOD_EVEX_0F3A07_X86_64_W_0): Ditto.
> (MOD_EVEX_0F3A77_X86_64_W_0): Ditto.
> (PREFIX_EVEX_0F384A_W_0_L_2): Ditto.
> (PREFIX_EVEX_0F386D_W_0_L_2): Ditto.
> (PREFIX_EVEX_0F3A07_W_0_L_2): Ditto.
> (PREFIX_EVEX_0F3A77_W_0_L_2): Ditto.
> (EVEX_W_0F384A_X86_64): Ditto.
> (EVEX_W_0F386D_X86_64): Ditto.
> (EVEX_W_0F3A07_X86_64): Ditto.
> (EVEX_W_0F3A77_X86_64): Ditto.
> (X86_64_EVEX_0F384A): Ditto.
> (X86_64_EVEX_0F386D): Ditto.
> (X86_64_EVEX_0F3A07): Ditto.
> (X86_64_EVEX_0F3A77): Ditto.
> (OP_VEX): Pull out all GPR mode out of the vector length switch.
> * i386-gen.c (isa_dependencies): Add AMX-AVX512.
> (cpu_flags): Ditto.
> * i386-init.h: Regenerated.
> * i386-mnem.h: Ditto.
> * i386-opc.h (CpuAMX_AVX512): New.
> (i386_cpu_flags): Add cpuamx_avx512.
> * i386-opc.tbl: Add AMX-AVX512 instructions.
> * i386-tbl.h: Regenerated.
> ---
> gas/config/tc-i386.c | 1 +
> gas/doc/c-i386.texi | 4 +-
> .../gas/i386/x86-64-amx-avx512-intel.d | 35 +
> gas/testsuite/gas/i386/x86-64-amx-avx512.d | 34 +
> gas/testsuite/gas/i386/x86-64-amx-avx512.s | 55 +
> gas/testsuite/gas/i386/x86-64.exp | 2 +
> opcodes/i386-dis-evex-len.h | 23 +
> opcodes/i386-dis-evex-prefix.h | 27 +
> opcodes/i386-dis-evex-w.h | 12 +
> opcodes/i386-dis-evex-x86-64.h | 15 +
> opcodes/i386-dis-evex.h | 6 +-
> opcodes/i386-dis.c | 52 +-
> opcodes/i386-gen.c | 3 +
> opcodes/i386-init.h | 718 ++---
> opcodes/i386-mnem.h | 2576 +++++++++--------
> opcodes/i386-opc.h | 3 +
> opcodes/i386-opc.tbl | 15 +
> opcodes/i386-tbl.h | 415 ++-
> 18 files changed, 2210 insertions(+), 1786 deletions(-)
> create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d
> create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.d
> create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.s
It is again nowhere in the patch metadata that you put down what other
non-upstream patch(es) this one goes on top of. This is important info
for a reviewer. Since this isn't the first time, let me make it quite
clear: Going forward I may simply refuse to review (reject) patches
with unclear dependencies.
> @@ -14070,6 +14083,29 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
> return true;
> }
>
> + switch (bytemode)
> + {
> + case v_mode:
> + case dq_mode:
> + if (ins->rex & REX_W)
> + names = att_names64;
> + else if (bytemode == v_mode
> + && !(sizeflag & DFLAG))
> + names = att_names16;
> + else
> + names = att_names32;
> + oappend_register (ins, names[reg]);
> + return true;
> + case b_mode:
> + names = att_names8rex;
> + oappend_register (ins, names[reg]);
> + return true;
> + case q_mode:
> + names = att_names64;
> + oappend_register (ins, names[reg]);
> + return true;
> + }
I think there are two ways to improve legibility here: Either pull out the
call to oappend_register() (and the return), or avoid using the "names"
local var in the latter two cases.
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3243,6 +3243,21 @@ t2rpntlvw<z>rs<loc>, 0x<z:pfx>f8 | <loc:opc>, AMX_TRANSPOSE&APX_F(AMX_MOVRS), Si
> tileloaddrs, 0xf24a, APX_F(AMX_MOVRS), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> tileloaddrst1, 0x664a, APX_F(AMX_MOVRS), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>
> +tcvtrowd2ps, 0xf34a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowd2ps, 0xf307, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +
> +tcvtrowps2bf16h, 0xf26d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2bf16h, 0xf207, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2bf16l, 0xf36d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2bf16l, 0xf377, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2phh, 0x6d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2phh, 0x07, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2phl, 0x666d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2phl, 0xf277, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +
> +tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
Just to double check: The AVX10.2/256 case really is of no interest for this
feature / these insns, and your designers would rather introduce yet another
CPUID flag in case it turned out desirable later on?
Jan
next prev parent reply other threads:[~2025-01-06 16:29 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-03 2:50 Haochen Jiang
2025-01-06 16:29 ` Jan Beulich [this message]
2025-01-07 2:50 ` Jiang, Haochen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3d06c14f-592a-4e16-878b-e69f8d0a605f@suse.com \
--to=jbeulich@suse.com \
--cc=binutils@sourceware.org \
--cc=haochen.jiang@intel.com \
--cc=hjl.tools@gmail.com \
--cc=ludloff@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).