public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Jan Beulich <jbeulich@suse.com>
To: Haochen Jiang <haochen.jiang@intel.com>
Cc: hjl.tools@gmail.com, ludloff@gmail.com, binutils@sourceware.org
Subject: Re: [PATCH v2] Support Intel AMX-AVX512
Date: Mon, 6 Jan 2025 17:29:15 +0100	[thread overview]
Message-ID: <3d06c14f-592a-4e16-878b-e69f8d0a605f@suse.com> (raw)
In-Reply-To: <20250103025029.1909253-1-haochen.jiang@intel.com>

On 03.01.2025 03:50, Haochen Jiang wrote:
> Hi all,
> 
> Although there is still opensin AMX-AVX512 encoding, I would like
> to first send out the v2 patch with the current encodings since in
> this patch, there are also other parts need to be reviewed.
> 
> Patch descrption and changes are embedded following.
> 
> The encoding issue I mentioned previously is on tcvtrowps2[bf16,ph][h,l].
> For Reg32 part, it is ok. However, for Imm8 part, under current HW design,
> it is split to two opcodes. It is not an ideal design. Due to Christmas/
> New Year Holiday, the answer for whether it could be changed is still
> delayed for now. I will update that as soon as I get the answer.
> 
> Thx,
> Haochen
> 
> ---
> 
> Changes in v2:
> 
>   - Pull out all GPR mode out of vex length switch in OP_VEX to make
>     it more general.
>   - Remove invalid test for 32-bit.
>   - Reuse VexGdq for operands.
>   - Update the mnemonics from tcvtrowps2pbf16[h,l] to tcvtrowps2bf16[h,l]
>     according to ISE056.
> 
> ---
> 
> This patch will support AMX-AVX512. In disassmbler, we pull out all
> GPR mode out of the vex length switch to make it more general.
> 
> ---
> 
> gas/ChangeLog:
> 
> 	* config/tc-i386.c: Add amx_avx512.
> 	* doc/c-i386.texi: Document .amx_avx512.
> 	* testsuite/gas/i386/x86-64.exp: Run AMX-AVX512 tests.
> 	* testsuite/gas/i386/x86-64-amx-avx512-intel.d: New test.
> 	* testsuite/gas/i386/x86-64-amx-avx512.d: Ditto.
> 	* testsuite/gas/i386/x86-64-amx-avx512.s: Ditto.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis-evex-len.h: Add EVEX_LEN_0F384A_X86_64_W_0,
> 	EVEX_LEN_0F386D_X86_64_W_0, EVEX_LEN_0F3A07_X86_64_W_0,
> 	EVEX_LEN_0F3A77_X86_64_W_0.
> 	* i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F384A_W_0_L_2,
> 	PREFIX_EVEX_0F386D_W_0_L_2, PREFIX_EVEX_0F3A07_W_0_L_2,
> 	PREFIX_EVEX_0F3A77_W_0_L_2.
> 	* i386-dis-evex-w.h: Add EVEX_W_0F384A_X86_64, EVEX_W_0F386D_X86_64,
> 	EVEX_W_0F3A07_X86_64, EVEX_W_0F3A77_X86_64.
> 	* i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F384A, X86_64_EVEX_0F386D,
> 	X86_64_EVEX_0F3A07, X86_64_EVEX_0F3A77.
> 	* i386-dis-evex.h: Ditto.
> 	* i386-dis.c (EVEX_LEN_0F384A_X86_64_W_0): New.
> 	(EVEX_LEN_0F386D_X86_64_W_0): Ditto.
> 	(EVEX_LEN_0F3A07_X86_64_W_0): Ditto.
> 	(EVEX_LEN_0F3A77_X86_64_W_0): Ditto.
> 	(MOD_EVEX_0F384A_X86_64_W_0): Ditto.
> 	(MOD_EVEX_0F386D_X86_64_W_0): Ditto.
> 	(MOD_EVEX_0F3A07_X86_64_W_0): Ditto.
> 	(MOD_EVEX_0F3A77_X86_64_W_0): Ditto.
> 	(PREFIX_EVEX_0F384A_W_0_L_2): Ditto.
> 	(PREFIX_EVEX_0F386D_W_0_L_2): Ditto.
> 	(PREFIX_EVEX_0F3A07_W_0_L_2): Ditto.
> 	(PREFIX_EVEX_0F3A77_W_0_L_2): Ditto.
> 	(EVEX_W_0F384A_X86_64): Ditto.
> 	(EVEX_W_0F386D_X86_64): Ditto.
> 	(EVEX_W_0F3A07_X86_64): Ditto.
> 	(EVEX_W_0F3A77_X86_64): Ditto.
> 	(X86_64_EVEX_0F384A): Ditto.
> 	(X86_64_EVEX_0F386D): Ditto.
> 	(X86_64_EVEX_0F3A07): Ditto.
> 	(X86_64_EVEX_0F3A77): Ditto.
> 	(OP_VEX): Pull out all GPR mode out of the vector length switch.
> 	* i386-gen.c (isa_dependencies): Add AMX-AVX512.
> 	(cpu_flags): Ditto.
> 	* i386-init.h: Regenerated.
> 	* i386-mnem.h: Ditto.
> 	* i386-opc.h (CpuAMX_AVX512): New.
> 	(i386_cpu_flags): Add cpuamx_avx512.
> 	* i386-opc.tbl: Add AMX-AVX512 instructions.
> 	* i386-tbl.h: Regenerated.
> ---
>  gas/config/tc-i386.c                          |    1 +
>  gas/doc/c-i386.texi                           |    4 +-
>  .../gas/i386/x86-64-amx-avx512-intel.d        |   35 +
>  gas/testsuite/gas/i386/x86-64-amx-avx512.d    |   34 +
>  gas/testsuite/gas/i386/x86-64-amx-avx512.s    |   55 +
>  gas/testsuite/gas/i386/x86-64.exp             |    2 +
>  opcodes/i386-dis-evex-len.h                   |   23 +
>  opcodes/i386-dis-evex-prefix.h                |   27 +
>  opcodes/i386-dis-evex-w.h                     |   12 +
>  opcodes/i386-dis-evex-x86-64.h                |   15 +
>  opcodes/i386-dis-evex.h                       |    6 +-
>  opcodes/i386-dis.c                            |   52 +-
>  opcodes/i386-gen.c                            |    3 +
>  opcodes/i386-init.h                           |  718 ++---
>  opcodes/i386-mnem.h                           | 2576 +++++++++--------
>  opcodes/i386-opc.h                            |    3 +
>  opcodes/i386-opc.tbl                          |   15 +
>  opcodes/i386-tbl.h                            |  415 ++-
>  18 files changed, 2210 insertions(+), 1786 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.s

It is again nowhere in the patch metadata that you put down what other
non-upstream patch(es) this one goes on top of. This is important info
for a reviewer. Since this isn't the first time, let me make it quite
clear: Going forward I may simply refuse to review (reject) patches
with unclear dependencies.

> @@ -14070,6 +14083,29 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>        return true;
>      }
>  
> +  switch (bytemode)
> +    {
> +      case v_mode:
> +      case dq_mode:
> +	if (ins->rex & REX_W)
> +	  names = att_names64;
> +	else if (bytemode == v_mode
> +		  && !(sizeflag & DFLAG))
> +	  names = att_names16;
> +	else
> +	  names = att_names32;
> +	oappend_register (ins, names[reg]);
> +	return true;
> +      case b_mode:
> +	names = att_names8rex;
> +	oappend_register (ins, names[reg]);
> +	return true;
> +      case q_mode:
> +	names = att_names64;
> +	oappend_register (ins, names[reg]);
> +	return true;
> +    }

I think there are two ways to improve legibility here: Either pull out the
call to oappend_register() (and the return), or avoid using the "names"
local var in the latter two cases.

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3243,6 +3243,21 @@ t2rpntlvw<z>rs<loc>, 0x<z:pfx>f8 | <loc:opc>, AMX_TRANSPOSE&APX_F(AMX_MOVRS), Si
>  tileloaddrs, 0xf24a, APX_F(AMX_MOVRS), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>  tileloaddrst1, 0x664a, APX_F(AMX_MOVRS), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>  
> +tcvtrowd2ps, 0xf34a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowd2ps, 0xf307, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +
> +tcvtrowps2bf16h, 0xf26d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2bf16h, 0xf207, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2bf16l, 0xf36d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2bf16l, 0xf377, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2phh, 0x6d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2phh, 0x07, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2phl, 0x666d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2phl, 0xf277, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +
> +tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }

Just to double check: The AVX10.2/256 case really is of no interest for this
feature / these insns, and your designers would rather introduce yet another
CPUID flag in case it turned out desirable later on?

Jan

  reply	other threads:[~2025-01-06 16:29 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-03  2:50 Haochen Jiang
2025-01-06 16:29 ` Jan Beulich [this message]
2025-01-07  2:50   ` Jiang, Haochen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3d06c14f-592a-4e16-878b-e69f8d0a605f@suse.com \
    --to=jbeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=haochen.jiang@intel.com \
    --cc=hjl.tools@gmail.com \
    --cc=ludloff@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).