From: "H.J. Lu" <hjl.tools@gmail.com>
To: "Jiang, Haochen" <haochen.jiang@intel.com>
Cc: "Beulich, Jan" <JBeulich@suse.com>,
"Wang, Hongyu" <hongyu.wang@intel.com>,
"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH 01/10] Support Intel AVX-IFMA
Date: Mon, 24 Oct 2022 12:09:51 -0700
Message-ID: <CAMe9rOruSFbjC0MNdUCGk7U0zrf4f7hdJZiuZTQwhT2OZZainQ@mail.gmail.com>
In-Reply-To: <SA1PR11MB5946DDA650C165540BACD061EC2E9@SA1PR11MB5946.namprd11.prod.outlook.com>
On Sun, Oct 23, 2022 at 10:53 PM Jiang, Haochen <haochen.jiang@intel.com> wrote:
>
> > -----Original Message-----
> > From: Jan Beulich <jbeulich@suse.com>
> > Sent: Friday, October 14, 2022 5:53 PM
> > To: Jiang, Haochen <haochen.jiang@intel.com>
> > Cc: hjl.tools@gmail.com; Wang, Hongyu <hongyu.wang@intel.com>;
> > binutils@sourceware.org
> > Subject: Re: [PATCH 01/10] Support Intel AVX-IFMA
> >
> > On 14.10.2022 11:12, Haochen Jiang wrote:
> > > From: wwwhhhyyy <hongyu.wang@intel.com>
> > >
> > > x86: Support Intel AVX-IFMA
> > >
> > > Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> > > cleared by default. Without {vex} pseudo prefix, Intel IFMA instructions
> > > are encoded with EVEX prefix. {vex} pseudo prefix will turn on VEX
> > > encoding for Intel IFMA instructions.
> >
> > I firmly object to the proliferation of this mis-feature. As expressed
> > before for AVX-VNNI, as long as the user has disabled AVX512 (or
> > respective sub-features thereof), there should be no need to use {vex} in
> > the source code. There's also no reason at all to make the disassembler
> > print {vex} prefixes - we don't do so for any other insns (apart from
> > AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
> > (when none of the EVEX-specific features is used).
> >
> > I actually have a patch queued to undo the odd behavior for AVX-VNNI, at
> > least on the assembler side (which also drops the PseudoVexPrefix
> > attribute).
>
> I have rebased the patch onto the latest trunk and removed PseudoVexPrefix
> from the table. I also added some testcases, following what your patch did.
>
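For readers following the thread, the behavior argued for above can be sketched roughly as follows. This is a minimal illustration, not code from the patch or from gas: the names `struct features`, `enum encoding` and `pick_encoding` are hypothetical. It assumes EVEX remains the default when the AVX512 variant is enabled, and that VEX is chosen without any {vex} pseudo prefix once AVX512 has been disabled.

```c
#include <stdbool.h>

/* Hypothetical result type; gas represents encodings differently.  */
enum encoding { ENC_NONE, ENC_VEX, ENC_EVEX };

/* Hypothetical view of the features the user has enabled.  */
struct features
{
  bool avx_ifma;     /* VEX-encoded vpmadd52{l,h}uq available.  */
  bool avx512ifma;   /* EVEX-encoded vpmadd52{l,h}uq available.  */
};

/* Pick an encoding for vpmadd52luq/vpmadd52huq.  */
static enum encoding
pick_encoding (struct features f, bool vex_pseudo_prefix)
{
  /* An explicit {vex} always asks for VEX, if that is available.  */
  if (vex_pseudo_prefix)
    return f.avx_ifma ? ENC_VEX : ENC_NONE;

  /* Without a pseudo prefix, EVEX stays the default when enabled.  */
  if (f.avx512ifma)
    return ENC_EVEX;

  /* AVX512 disabled: VEX is the only candidate, so no {vex} is needed.  */
  if (f.avx_ifma)
    return ENC_VEX;

  return ENC_NONE;
}
```

With both features enabled the sketch still needs {vex} to reach the VEX form; only the "AVX512 disabled" case changes.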
> >
> > > --- a/opcodes/i386-dis.c
> > > +++ b/opcodes/i386-dis.c
> > > @@ -1526,6 +1526,8 @@ enum
> > > VEX_W_0F385E_X86_64_P_3,
> > > VEX_W_0F3878,
> > > VEX_W_0F3879,
> > > + VEX_W_0F38B4,
> > > + VEX_W_0F38B5,
> > > VEX_W_0F38CF,
> > > VEX_W_0F3A00_L_1,
> > > VEX_W_0F3A01_L_1,
> > > @@ -6293,8 +6295,8 @@ static const struct dis386 vex_table[][256] = {
> > > { Bad_Opcode },
> > > { Bad_Opcode },
> > > { Bad_Opcode },
> > > - { Bad_Opcode },
> > > - { Bad_Opcode },
> > > + { VEX_W_TABLE (VEX_W_0F38B4) },
> > > + { VEX_W_TABLE (VEX_W_0F38B5) },
> > > { "vfmaddsub231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
> > > { "vfmsubadd231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
> > > /* b8 */
> > > @@ -7599,6 +7601,16 @@ static const struct dis386 vex_w_table[][2] = {
> > > /* VEX_W_0F3879 */
> > > { "vpbroadcastw", { XM, EXw }, PREFIX_DATA },
> > > },
> > > + {
> > > + /* VEX_W_0F38B4 */
> > > + { Bad_Opcode },
> > > + { "%XV vpmadd52luq", { XM, Vex, EXx }, PREFIX_DATA },
> > > + },
> > > + {
> > > + /* VEX_W_0F38B5 */
> > > + { Bad_Opcode },
> > > + { "%XV vpmadd52huq", { XM, Vex, EXx }, PREFIX_DATA },
> > > + },
> >
> > Irrespective of the aspect mentioned at the top I think this is yet
> > another case where VEX and EVEX table entries can be shared. This would
> > (if the {vex} printing really needs retaining for whatever obscure
> > reason) merely require the processing of %XV to do nothing for EVEX-
> > encoded insns, plus of course the separating blank would then also need
> > to be included in the processing of %XV.
> >
> > I guess I'll make a patch to fold the AVX-VNNI and AVX512-VNNI entries,
> > which you could then re-base on top of.
>
> Folded the AVX512IFMA and AVX-IFMA table entries.
>
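The suggestion quoted above (let %XV do nothing for EVEX-encoded insns and have it emit the separating blank itself) can be illustrated with a small stand-alone sketch. It is not the i386-dis.c implementation: `enum insn_encoding` and `expand_xv` are hypothetical names, and the real disassembler processes template characters quite differently.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical encoding kinds; i386-dis.c tracks this in its own state.  */
enum insn_encoding { ENC_VEX, ENC_EVEX };

/* Expand a template such as "%XVvpmadd52luq" into BUF.
   "%XV" becomes "{vex} " (blank included) for VEX-encoded insns and
   expands to nothing for EVEX-encoded ones, so a single table entry
   can serve both encodings.  */
static void
expand_xv (const char *tmpl, enum insn_encoding enc, char *buf, size_t size)
{
  static const char marker[] = "%XV";
  const char *prefix = "";

  if (strncmp (tmpl, marker, sizeof marker - 1) == 0)
    {
      tmpl += sizeof marker - 1;
      if (enc == ENC_VEX)
        prefix = "{vex} ";
    }

  snprintf (buf, size, "%s%s", prefix, tmpl);
}
```

Because the blank is produced by the %XV expansion, the shared template carries no stray leading space in the EVEX case.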
> >
> > > --- a/opcodes/i386-gen.c
> > > +++ b/opcodes/i386-gen.c
> > > @@ -245,6 +245,8 @@ static initializer cpu_flag_init[] =
> > > "CPU_AVX512F_FLAGS|CpuAVX512_BF16" },
> > > { "CPU_AVX512_FP16_FLAGS",
> > > "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" },
> > > + { "CPU_AVX_IFMA_FLAGS",
> > > + "CPU_AVX2_FLAGS|CpuAVX_IFMA" },
> > > { "CPU_IAMCU_FLAGS",
> > > "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
> > > { "CPU_ADX_FLAGS",
> > > @@ -439,6 +441,8 @@ static initializer cpu_flag_init[] =
> > > "CpuHRESET" },
> > > { "CPU_ANY_AVX512_FP16_FLAGS",
> > > "CpuAVX512_FP16" },
> > > + { "CPU_ANY_AVX_IFMA_FLAGS",
> > > + "CpuAVX_IFMA" },
> >
> > If AVX2 is taken as a prereq feature, then CPU_ANY_AVX2_FLAGS also needs
> > adjustment, such that disabling of AVX2 also results in disabling of
> > AVX-IFMA. (The same issue actually exists for AVX-VNNI afaics.)
> >
>
> Added AVX-IFMA to CPU_ANY_AVX2_FLAGS.
>
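The dependency issue discussed above can be shown with plain bit masks. This is only a sketch: the `FEAT_*` bits and the helper functions are hypothetical and much simpler than the i386_cpu_flags machinery generated by i386-gen.c. The point is that the "disable" mask of a prerequisite has to list every feature that depends on it.

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only.  */
#define FEAT_AVX       (1u << 0)
#define FEAT_AVX2      (1u << 1)
#define FEAT_AVX_VNNI  (1u << 2)
#define FEAT_AVX_IFMA  (1u << 3)

/* Enabling AVX-IFMA pulls in its prerequisites
   (compare CPU_AVX_IFMA_FLAGS in the patch).  */
#define ENABLE_AVX_IFMA   (FEAT_AVX | FEAT_AVX2 | FEAT_AVX_IFMA)

/* Disabling AVX2 must also clear the features built on top of it
   (compare CPU_ANY_AVX2_FLAGS).  */
#define DISABLE_AVX2      (FEAT_AVX2 | FEAT_AVX_VNNI | FEAT_AVX_IFMA)

/* Disabling AVX-IFMA alone leaves its prerequisites untouched
   (compare CPU_ANY_AVX_IFMA_FLAGS).  */
#define DISABLE_AVX_IFMA  (FEAT_AVX_IFMA)

/* Set every bit in MASK.  */
static uint32_t
enable_features (uint32_t cur, uint32_t mask)
{
  return cur | mask;
}

/* Clear every bit in MASK.  */
static uint32_t
disable_features (uint32_t cur, uint32_t mask)
{
  return cur & ~mask;
}
```

If `DISABLE_AVX2` omitted `FEAT_AVX_IFMA`, disabling AVX2 would leave AVX-IFMA enabled without its prerequisite, which is the inconsistency pointed out in the review.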
> > > --- a/opcodes/i386-opc.tbl
> > > +++ b/opcodes/i386-opc.tbl
> > > @@ -3263,3 +3263,10 @@ vrsqrtph, 0x664e, None, CpuAVX512_FP16, Modrm|Masking=3|EVexMap6|VexW0|Broadcast
> > > vrsqrtsh, 0x664f, None, CpuAVX512_FP16, Modrm|EVexLIG|Masking=3|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
> > >
> > > // FP16 (HFNI) instructions end.
> > > +
> > > +// AVX_IFMA instructions.
> >
> > Nit: Perhaps better use AVX-IFMA here, but I see we're having many examples
> > of the (needless) use of underscores like this.
> >
> > > +vpmadd52huq, 0x66B5, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> > > +vpmadd52luq, 0x66B4, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> >
> > Please use plain VexVVVV (without =1) - we want to have as little clutter as
> > possible on these usually already overlong lines.
>
> Changed to VexVVVV.
>
> Thanks for your review; please let me know if anything still needs to be changed.
OK.
Thanks.
> Haochen
>
> >
> > Jan
--
H.J.