Re: [PATCH 04/10] Support Intel CMPccXADD

public inbox for binutils@sourceware.org

From: "H.J. Lu" <hjl.tools@gmail.com>
To: "Jiang, Haochen" <haochen.jiang@intel.com>
Cc: "Beulich, Jan" <JBeulich@suse.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH 04/10] Support Intel CMPccXADD
Date: Mon, 24 Oct 2022 12:12:46 -0700
Message-ID: <CAMe9rOprGvemZMhjByHLydV55dUpsqDCRe_w5E8h5QQ5jmHdBg@mail.gmail.com>
In-Reply-To: <SA1PR11MB59469C5F459A411FF484C8E4EC2E9@SA1PR11MB5946.namprd11.prod.outlook.com>

On Sun, Oct 23, 2022 at 7:30 PM Jiang, Haochen <haochen.jiang@intel.com> wrote:
>
> > -----Original Message-----
> > From: Jan Beulich <jbeulich@suse.com>
> > Sent: Friday, October 14, 2022 9:47 PM
> > To: Jiang, Haochen <haochen.jiang@intel.com>
> > Cc: hjl.tools@gmail.com; binutils@sourceware.org
> > Subject: Re: [PATCH 04/10] Support Intel CMPccXADD
> >
> > On 14.10.2022 11:12, Haochen Jiang wrote:
> > > --- a/gas/NEWS
> > > +++ b/gas/NEWS
> > > @@ -1,5 +1,7 @@
> > >  -*- text -*-
> > >
> > > +* Add support for Intel CMPccXADD instructions.
> > > +
> > >  * Add support for Intel AVX-NE-CONVERT instructions.
> > >
> > >  * Add support for Intel AVX-VNNI-INT8 instructions.
> >
> > I wonder if all of these really need a separate line.
> >
> > > --- a/gas/config/tc-i386.c
> > > +++ b/gas/config/tc-i386.c
> > > @@ -1097,6 +1097,7 @@ static const arch_entry cpu_arch[] =
> > >    SUBARCH (avx_ifma, AVX_IFMA, ANY_AVX_IFMA, false),
> > >    SUBARCH (avx_vnni_int8, AVX_VNNI_INT8, ANY_AVX_VNNI_INT8, false),
> > >    SUBARCH (avx_ne_convert, AVX_NE_CONVERT, ANY_AVX_NE_CONVERT, false),
> > > +  SUBARCH (cmpccxadd, CMPCCXADD, ANY_CMPCCXADD, false)
> > >  };
> >
> > No need for ANY_CMPCCXADD, unless you _know_ dependent features will
> > appear.
> > See e.g. FSGSBASE, i.e. you can use CMPCCXADD twice here.
>
> Changed to CMPCCXADD. BTW, do we need to remove them in i386-gen.c?

Since it isn't used, please remove it.

> >
> > > --- a/opcodes/i386-dis.c
> > > +++ b/opcodes/i386-dis.c
> > > @@ -366,6 +366,7 @@ fetch_data (struct disassemble_info *info, bfd_byte *addr)
> > >  #define Ma { OP_M, a_mode }
> > >  #define Mb { OP_M, b_mode }
> > >  #define Md { OP_M, d_mode }
> > > +#define Mdq { OP_M, dq_mode }
> >
> > You're decoding via mod_table[], so I don't think you need this. Or (perhaps
> > better) vice versa - keep this (if there's no pre-existing one that fits) and avoid
> > the decode step through mod_table[].
>
> Yes, OP_M checks ModRM by itself, so I removed the pass through mod_table[].
>
> >
> > > @@ -939,6 +940,22 @@ enum
> > >    MOD_VEX_0F388E,
> > >    MOD_VEX_0F38B0,
> > >    MOD_VEX_0F38B1,
> > > +  MOD_VEX_0F38E0_X86_64,
> > > +  MOD_VEX_0F38E1_X86_64,
> > > +  MOD_VEX_0F38E2_X86_64,
> > > +  MOD_VEX_0F38E3_X86_64,
> > > +  MOD_VEX_0F38E4_X86_64,
> > > +  MOD_VEX_0F38E5_X86_64,
> > > +  MOD_VEX_0F38E6_X86_64,
> > > +  MOD_VEX_0F38E7_X86_64,
> > > +  MOD_VEX_0F38E8_X86_64,
> > > +  MOD_VEX_0F38E9_X86_64,
> > > +  MOD_VEX_0F38EA_X86_64,
> > > +  MOD_VEX_0F38EB_X86_64,
> > > +  MOD_VEX_0F38EC_X86_64,
> > > +  MOD_VEX_0F38ED_X86_64,
> > > +  MOD_VEX_0F38EE_X86_64,
> > > +  MOD_VEX_0F38EF_X86_64,
> >
> > Hmm, I really need to split off (and re-submit) the re-usable parts of
> > "x86-64: Intel64 adjustments for conditional jumps" (see
> > https://sourceware.org/pipermail/binutils/2020-July/112365.html), to avoid the
> > need for 16 almost identical entries of several kinds throughout this patch.
>
> For now I am still using 16 almost identical entries. We can leave this for
> further discussion.
>
> >
> > > @@ -8480,6 +8609,70 @@ static const struct dis386 mod_table[][2] = {
> > >      /* MOD_VEX_0F38B1*/
> > >      { VEX_W_TABLE (VEX_W_0F38B1) },
> > >    },
> > > +  {
> > > +    /* MOD_VEX_0F38E0_X86_64 */
> > > > +    { "cmpoxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
> > > > +  },
> > > > +  {
> > > > +    /* MOD_VEX_0F38E1_X86_64 */
> > > > +    { "cmpnoxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
> > > > +  },
> > > > +  {
> > > > +    /* MOD_VEX_0F38E2_X86_64 */
> > > > +    { "cmpbxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
> > > > +  },
> > > > +  {
> > > > +    /* MOD_VEX_0F38E3_X86_64 */
> > > > +    { "cmpnbxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
> >
> > I understand the ISA extensions document names the insn this way and doesn't
> > list cmpaexadd (same for other aliases), but I think this is a mistake in the doc.
> > I've raised a respective question in the ISA extensions forum: I think
> > representation of conditions to check for should be uniform among insns, and
> > hence it should be "ae" here. (That would also be the effect if you
> > used %C<whatever> here.)
>
> I changed the table to use <cc>, so that when a developer writes something like
> cmpaexadd, the assembler also recognizes it. The disassembler still
> prints only one name per condition.
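
As an illustration of the alias question above (not code from the patch), here is a minimal C sketch of how several condition suffixes can map to one opcode. It assumes the usual x86 condition-code ordering and that the opcode byte is 0xe0 plus the 4-bit condition code, which matches the 0F38 E0..E3 entries and the 0x66e6 value for cmpbexadd quoted in this thread. The names cc_names, cc_code and cmpccxadd_opcode are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Condition suffixes grouped by their 4-bit code.  Each row lists the
   aliases that share one encoding; unused slots are NULL.  Hypothetical
   table for illustration, not the one generated from i386-opc.tbl.  */
static const char *const cc_names[16][3] = {
  { "o" }, { "no" }, { "b", "c", "nae" }, { "nb", "nc", "ae" },
  { "z", "e" }, { "nz", "ne" }, { "be", "na" }, { "nbe", "a" },
  { "s" }, { "ns" }, { "p", "pe" }, { "np", "po" },
  { "l", "nge" }, { "nl", "ge" }, { "le", "ng" }, { "nle", "g" },
};

/* Return the 4-bit condition code for a suffix such as "ae",
   or -1 if the suffix is not a known condition.  */
static int
cc_code (const char *suffix)
{
  for (int code = 0; code < 16; code++)
    for (size_t i = 0; i < 3 && cc_names[code][i] != NULL; i++)
      if (strcmp (cc_names[code][i], suffix) == 0)
	return code;
  return -1;
}

/* Opcode byte for cmp<cc>xadd, assuming base 0xe0 plus the code.  */
static unsigned int
cmpccxadd_opcode (int code)
{
  assert (code >= 0 && code <= 0xf);
  return 0xe0u | (unsigned int) code;
}
```

With such a table, cc_code ("ae") and cc_code ("nb") both yield 3, so cmpaexadd and cmpnbxadd would assemble to the same opcode, while a disassembler would still have to pick one spelling per row.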
>
> >
> > > @@ -433,7 +436,7 @@ typedef union i386_cpu_flags
> > >        unsigned int cpu64:1;
> > >        unsigned int cpuno64:1;
> > >  #ifdef CpuUnused
> > > -      unsigned int unused:(CpuNumOfBits - CpuUnused);
> > > +      // unsigned int unused:(CpuNumOfBits - CpuUnused);
> > >  #endif
> >
> > No - you should instead comment out the #define of CpuUnused - see the
> > comment there.
>
> Fixed this wrong comment.
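
To make the CpuUnused point concrete, here is a small self-contained C sketch. It is an assumption-laden model, not the real opcodes/i386-opc.h: the feature enum is shortened to five hypothetical entries and demo_cpu_flags is an invented type. The idea it shows is that the padding bit-field exists only while CpuUnused is defined, so when the last feature bit exactly fills the final word (which would make the named bit-field zero width and fail to compile), the #define is what gets commented out, not the field.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical, heavily shortened feature list.  */
enum { CpuFeatureA, CpuFeatureB, CpuFeatureC, Cpu64, CpuNo64 };

/* The last bit in use, and the storage rounded up to whole words.  */
#define CpuMax CpuNo64
#define CpuNumOfUints (CpuMax / (sizeof (unsigned int) * CHAR_BIT) + 1)
#define CpuNumOfBits (CpuNumOfUints * sizeof (unsigned int) * CHAR_BIT)

/* First unused bit.  Comment this line out when CpuMax + 1 equals
   CpuNumOfBits, i.e. when no padding is left.  */
#define CpuUnused (CpuMax + 1)

typedef union
{
  struct
  {
    unsigned int cpufeaturea:1;
    unsigned int cpufeatureb:1;
    unsigned int cpufeaturec:1;
    unsigned int cpu64:1;
    unsigned int cpuno64:1;
#ifdef CpuUnused
    /* Padding up to the end of the last word.  */
    unsigned int unused:(CpuNumOfBits - CpuUnused);
#endif
  } bitfield;
  unsigned int array[CpuNumOfUints];
} demo_cpu_flags;

/* The bit-field view and the array view should cover the same storage.  */
static_assert (sizeof (demo_cpu_flags)
	       == sizeof (unsigned int) * CpuNumOfUints,
	       "bit-fields must fit the backing array");
```

In this model five bits are used, so on a target with 32-bit unsigned int the padding field is 27 bits wide.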
>
> >
> > > --- a/opcodes/i386-opc.tbl
> > > +++ b/opcodes/i386-opc.tbl
> > > > @@ -3296,3 +3296,24 @@ vpdpbsud, 0xf350, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|Chec
> > > >  vpdpbsuds, 0xf351, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> > >
> > >  // AVX_VNNI_INT8 instructions end.
> > > +
> > > +// CMPCCXADD instructions.
> > > +
> > > > +cmpbexadd, 0x66e6, None, CpuCMPCCXADD|Cpu64, Modrm|Vex128|Space0F38|VexVVVV=1|SwapSources|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> >
> > Along the lines of the earlier comment - you want to use the <cc> template here,
> > eliminating the need for 16 almost identical lines _and_ supplying all condition
> > code representation in one go.
>
> Mentioned above.
>
> >
> > Apart from that you forgot CheckRegSize here afaict. And please again VexVVVV
> > alone, without =1. Also for non-vector insns perhaps better plain Vex instead of
> > Vex128. Further these insns should allow for l and q suffixes in AT&T mode.
>
> Done.
>
> >
> > And finally - is SwapSources really appropriate to use here? There's only one
> > pure source operand, the other two are also serving as destinations.
> > I wonder whether an attribute is necessary here in the first place: Vex- encoded
> > insns with a memory destination never have two further register operands, so
> > that property should suffice for identifying the case in build_modrm_byte().
> > Alternatively you could also simply use the CPU flag.
>
> We may need a special identifier for CMPccXADD, since it has VVVV at
> operand 3, whereas VVVV is always at operand 2 for all other insns that
> have it. That is the reason we reuse SwapSources. It might not be quite
> the same as the original meaning, but we want to avoid adding a bit
> for this very rare case. Do we need to change that?
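
For reference, the alternative Jan describes (recognizing the case from the operands instead of from a template attribute) could look roughly like the sketch below. Everything here is hypothetical: struct demo_insn, enum operand_kind and vvvv_follows_memory_destination are invented for illustration and are not the data structures used by build_modrm_byte(). Operands are assumed to be stored in AT&T order, destination last.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand classification.  */
enum operand_kind { OPERAND_REG, OPERAND_MEM };

/* Hypothetical, minimal view of a matched instruction.  */
struct demo_insn
{
  bool vex_encoded;
  unsigned int num_operands;
  enum operand_kind operands[3];	/* AT&T order, destination last.  */
};

/* True for a VEX-encoded insn with a memory destination plus two
   register operands, the property suggested as sufficient to single
   out the CMPccXADD operand layout.  */
static bool
vvvv_follows_memory_destination (const struct demo_insn *insn)
{
  assert (insn->num_operands <= 3);
  if (!insn->vex_encoded || insn->num_operands != 3)
    return false;
  return (insn->operands[2] == OPERAND_MEM
	  && insn->operands[0] == OPERAND_REG
	  && insn->operands[1] == OPERAND_REG);
}
```

Whether such a check is preferable to reusing SwapSources or testing the CPU flag is exactly the open question in this part of the thread; the sketch only shows that no extra template bit is needed to express the condition.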
>
> Haochen
>
> >
> > Jan



-- 
H.J.
