public inbox for binutils@sourceware.org
From: Jan Beulich <jbeulich@suse.com>
To: Haochen Jiang <haochen.jiang@intel.com>
Cc: hjl.tools@gmail.com, wwwhhhyyy <hongyu.wang@intel.com>,
	binutils@sourceware.org
Subject: Re: [PATCH 01/10] Support Intel AVX-IFMA
Date: Fri, 14 Oct 2022 11:52:34 +0200
Message-ID: <863655db-f202-477f-c638-00773c25886c@suse.com>
In-Reply-To: <20221014091248.4920-2-haochen.jiang@intel.com>

On 14.10.2022 11:12, Haochen Jiang wrote:
> From: wwwhhhyyy <hongyu.wang@intel.com>
> 
> x86: Support Intel AVX-IFMA
> 
> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
> encoding for Intel IFMA instructions.

I firmly object to the proliferation of this mis-feature. As expressed
before for AVX-VNNI, as long as the user has disabled AVX512 (or
respective sub-features thereof), there should be no need to use {vex} in
the source code. There's also no reason at all to make the disassembler
print {vex} prefixes - we don't do so for any other insns (apart from
AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
(when none of the EVEX-specific features is used).

I actually have a patch queued to undo the odd behavior for AVX-VNNI, at
least on the assembler side (which also drops the PseudoVexPrefix
attribute).
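
To make the intended assembler behavior concrete, here is a minimal sketch
of the encoding choice argued for above. It is not binutils code: the enum,
the struct and choose_ifma_encoding() are hypothetical names, and it
ignores EVEX-only operands (masking, ZMM or upper registers), which would
force EVEX regardless.

```c
#include <stdbool.h>

/* Hypothetical types for illustration; not taken from gas.  */
enum encoding { ENC_NONE, ENC_VEX, ENC_EVEX };

struct cpu_features
{
  bool avx_ifma;     /* VEX-encoded vpmadd52{l,h}uq is enabled.  */
  bool avx512_ifma;  /* EVEX-encoded vpmadd52{l,h}uq is enabled.  */
};

/* Pick an encoding for vpmadd52{l,h}uq.  An explicit {vex} request is
   honored whenever the VEX form is enabled.  Without it, EVEX is used
   while AVX512-IFMA is enabled; once the user has disabled it, the VEX
   form is chosen with no pseudo prefix needed.  */
static enum encoding
choose_ifma_encoding (struct cpu_features f, bool vex_pseudo_prefix)
{
  if (vex_pseudo_prefix)
    return f.avx_ifma ? ENC_VEX : ENC_NONE;
  if (f.avx512_ifma)
    return ENC_EVEX;
  if (f.avx_ifma)
    return ENC_VEX;
  return ENC_NONE;
}
```

With AVX512-IFMA disabled and AVX-IFMA enabled, the sketch returns ENC_VEX
for a plain "vpmadd52luq" without any {vex} in the source.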

> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -1526,6 +1526,8 @@ enum
>    VEX_W_0F385E_X86_64_P_3,
>    VEX_W_0F3878,
>    VEX_W_0F3879,
> +  VEX_W_0F38B4,
> +  VEX_W_0F38B5,
>    VEX_W_0F38CF,
>    VEX_W_0F3A00_L_1,
>    VEX_W_0F3A01_L_1,
> @@ -6293,8 +6295,8 @@ static const struct dis386 vex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F38B4) },
> +    { VEX_W_TABLE (VEX_W_0F38B5) },
>      { "vfmaddsub231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
>      { "vfmsubadd231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
>      /* b8 */
> @@ -7599,6 +7601,16 @@ static const struct dis386 vex_w_table[][2] = {
>      /* VEX_W_0F3879 */
>      { "vpbroadcastw",	{ XM, EXw }, PREFIX_DATA },
>    },
> +  {
> +    /* VEX_W_0F38B4 */
> +    { Bad_Opcode },
> +    { "%XV vpmadd52luq",	{ XM, Vex, EXx }, PREFIX_DATA },
> +  },
> +  {
> +    /* VEX_W_0F38B5 */
> +    { Bad_Opcode },
> +    { "%XV vpmadd52huq",	{ XM, Vex, EXx }, PREFIX_DATA },
> +  },

Irrespective of the aspect mentioned at the top I think this is yet
another case where VEX and EVEX table entries can be shared. This would
(if the {vex} printing really needs retaining for whatever obscure
reason) merely require the processing of %XV to do nothing for EVEX-
encoded insns, plus of course the separating blank would then also need
to be included in the processing of %XV.
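
A rough sketch of what that %XV handling could look like, assuming %XV
expands to the literal "{vex}" and that a shared table entry would be
spelled "%XVvpmadd52luq" (no blank in the template). expand_xv() is a
hypothetical helper, not the actual macro processing in i386-dis.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: expand a leading "%XV" in a mnemonic template.
   For VEX-encoded insns it becomes "{vex} " (the separating blank is
   emitted here, not stored in the template); for EVEX-encoded insns it
   expands to nothing, so one table entry can serve both encodings.  */
static void
expand_xv (const char *tmpl, bool is_evex, char *out, size_t outsz)
{
  const char *prefix = "";

  if (strncmp (tmpl, "%XV", 3) == 0)
    {
      tmpl += 3;
      if (!is_evex)
        prefix = "{vex} ";
    }
  snprintf (out, outsz, "%s%s", prefix, tmpl);
}
```

Under these assumptions the same template yields "{vex} vpmadd52luq" for
the VEX form and plain "vpmadd52luq" for the EVEX form.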

I guess I'll make a patch to fold the AVX-VNNI and AVX512-VNNI entries,
which you could then re-base on top of.

> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -245,6 +245,8 @@ static initializer cpu_flag_init[] =
>      "CPU_AVX512F_FLAGS|CpuAVX512_BF16" },
>    { "CPU_AVX512_FP16_FLAGS",
>      "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" },
> +  { "CPU_AVX_IFMA_FLAGS",
> +    "CPU_AVX2_FLAGS|CpuAVX_IFMA" },
>    { "CPU_IAMCU_FLAGS",
>      "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
>    { "CPU_ADX_FLAGS",
> @@ -439,6 +441,8 @@ static initializer cpu_flag_init[] =
>      "CpuHRESET" },
>    { "CPU_ANY_AVX512_FP16_FLAGS",
>      "CpuAVX512_FP16" },
> +  { "CPU_ANY_AVX_IFMA_FLAGS",
> +    "CpuAVX_IFMA" },

If AVX2 is taken as a prereq feature, then CPU_ANY_AVX2_FLAGS also needs
adjustment, such that disabling of AVX2 also results in disabling of
AVX-IFMA. (The same issue actually exists for AVX-VNNI afaics.)
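
The dependency can be illustrated with a small bitmask model. The macro
names mirror the initializer names in the patch, but the bit values and the
contents of the AVX2 entries are assumptions made for this sketch; the real
CPU_ANY_AVX2_FLAGS covers further dependents that are left out here, and
enable_feature() / disable_feature() are hypothetical helpers.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.  */
#define CPU_AVX       (UINT32_C (1) << 0)
#define CPU_AVX2      (UINT32_C (1) << 1)
#define CPU_AVX_IFMA  (UINT32_C (1) << 2)

/* Enabling a feature also enables its prerequisites.  */
#define CPU_AVX2_FLAGS      (CPU_AVX | CPU_AVX2)
#define CPU_AVX_IFMA_FLAGS  (CPU_AVX2_FLAGS | CPU_AVX_IFMA)

/* Disabling a feature must also disable everything depending on it,
   so the AVX2 "any" mask has to include the AVX-IFMA one.  */
#define CPU_ANY_AVX_IFMA_FLAGS  (CPU_AVX_IFMA)
#define CPU_ANY_AVX2_FLAGS      (CPU_AVX2 | CPU_ANY_AVX_IFMA_FLAGS)

static uint32_t
enable_feature (uint32_t enabled, uint32_t flags)
{
  return enabled | flags;
}

static uint32_t
disable_feature (uint32_t enabled, uint32_t any_flags)
{
  return enabled & ~any_flags;
}
```

In this model, enabling AVX-IFMA and then disabling AVX2 leaves only AVX
set; without the adjustment, AVX-IFMA would remain enabled while its
prerequisite is gone.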

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3263,3 +3263,10 @@ vrsqrtph, 0x664e, None, CpuAVX512_FP16, Modrm|Masking=3|EVexMap6|VexW0|Broadcast
>  vrsqrtsh, 0x664f, None, CpuAVX512_FP16, Modrm|EVexLIG|Masking=3|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
>  
>  // FP16 (HFNI) instructions end.
> +
> +// AVX_IFMA instructions.

Nit: Perhaps better to use AVX-IFMA here, but I see we already have many
examples of the (needless) use of underscores like this.

> +vpmadd52huq, 0x66B5, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpmadd52luq, 0x66B4, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }

Please use plain VexVVVV (without =1) - we want to have as little clutter as
possible on these usually already overlong lines.

Jan
