> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, October 14, 2022 9:47 PM
> To: Jiang, Haochen <haochen.jiang@intel.com>
> Cc: hjl.tools@gmail.com; binutils@sourceware.org
> Subject: Re: [PATCH 04/10] Support Intel CMPccXADD
> 
> On 14.10.2022 11:12, Haochen Jiang wrote:
> > --- a/gas/NEWS
> > +++ b/gas/NEWS
> > @@ -1,5 +1,7 @@
> >  -*- text -*-
> >
> > +* Add support for Intel CMPccXADD instructions.
> > +
> >  * Add support for Intel AVX-NE-CONVERT instructions.
> >
> >  * Add support for Intel AVX-VNNI-INT8 instructions.
> 
> I wonder if all of these really need a separate line.
> 
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -1097,6 +1097,7 @@ static const arch_entry cpu_arch[] =
> >    SUBARCH (avx_ifma, AVX_IFMA, ANY_AVX_IFMA, false),
> >    SUBARCH (avx_vnni_int8, AVX_VNNI_INT8, ANY_AVX_VNNI_INT8, false),
> >    SUBARCH (avx_ne_convert, AVX_NE_CONVERT, ANY_AVX_NE_CONVERT,
> > false),
> > +  SUBARCH (cmpccxadd, CMPCCXADD, ANY_CMPCCXADD, false)
> >  };
> 
> No need for ANY_CMPCCXADD, unless you _know_ dependent features will
> appear.
> See e.g. FSGSBASE, i.e. you can use CMPCCXADD twice here.

Changed to CMPCCXADD. BTW, do we need to remove them in i386-gen.c?

> 
> > --- a/opcodes/i386-dis.c
> > +++ b/opcodes/i386-dis.c
> > @@ -366,6 +366,7 @@ fetch_data (struct disassemble_info *info,
> > bfd_byte *addr)  #define Ma { OP_M, a_mode }  #define Mb { OP_M,
> > b_mode }  #define Md { OP_M, d_mode }
> > +#define Mdq { OP_M, dq_mode }
> 
> You're decoding via mod_table[], so I don't think you need this. Or (perhaps
> better) vice versa - keep this (if there's no pre-existing one that fits) and avoid
> the decode step through mod_table[].

Yes, OP_M will check modrm by itself, removed the pass of mod_table.

> 
> > @@ -939,6 +940,22 @@ enum
> >    MOD_VEX_0F388E,
> >    MOD_VEX_0F38B0,
> >    MOD_VEX_0F38B1,
> > +  MOD_VEX_0F38E0_X86_64,
> > +  MOD_VEX_0F38E1_X86_64,
> > +  MOD_VEX_0F38E2_X86_64,
> > +  MOD_VEX_0F38E3_X86_64,
> > +  MOD_VEX_0F38E4_X86_64,
> > +  MOD_VEX_0F38E5_X86_64,
> > +  MOD_VEX_0F38E6_X86_64,
> > +  MOD_VEX_0F38E7_X86_64,
> > +  MOD_VEX_0F38E8_X86_64,
> > +  MOD_VEX_0F38E9_X86_64,
> > +  MOD_VEX_0F38EA_X86_64,
> > +  MOD_VEX_0F38EB_X86_64,
> > +  MOD_VEX_0F38EC_X86_64,
> > +  MOD_VEX_0F38ED_X86_64,
> > +  MOD_VEX_0F38EE_X86_64,
> > +  MOD_VEX_0F38EF_X86_64,
> 
> Hmm, I really need to split off (and re-submit) the re-usable parts of
> "x86-64: Intel64 adjustments for conditional jumps" (see
> https://sourceware.org/pipermail/binutils/2020-July/112365.html), to avoid the
> need for 16 almost identical entries of several kinds throughout this patch.

Currently I am still using 16 almost identical entries. We can put this in further
discussion.

> 
> > @@ -8480,6 +8609,70 @@ static const struct dis386 mod_table[][2] = {
> >      /* MOD_VEX_0F38B1*/
> >      { VEX_W_TABLE (VEX_W_0F38B1) },
> >    },
> > +  {
> > +    /* MOD_VEX_0F38E0_X86_64 */
> > +    { "cmpoxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },  },  {
> > +    /* MOD_VEX_0F38E1_X86_64 */
> > +    { "cmpnoxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },  },  {
> > +    /* MOD_VEX_0F38E2_X86_64 */
> > +    { "cmpbxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },  },  {
> > +    /* MOD_VEX_0F38E3_X86_64 */
> > +    { "cmpnbxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
> 
> I understand the ISA extensions document names the insn this way and doesn't
> list cmpaexadd (same for other aliases), but I think this is a mistake in the doc.
> I've raised a respective question in the ISA extensions forum: I think
> representation of conditions to check for should be uniform among insns, and
> hence it should be "ae" here. (That would also be the effect if you
> used %C<whatever> here.)

I changed the table to <cc> so that when developer input something like
cmpaexadd, assembler could also recognize it. For the disassembler part, it still
only have one identical name.

> 
> > @@ -433,7 +436,7 @@ typedef union i386_cpu_flags
> >        unsigned int cpu64:1;
> >        unsigned int cpuno64:1;
> >  #ifdef CpuUnused
> > -      unsigned int unused:(CpuNumOfBits - CpuUnused);
> > +      // unsigned int unused:(CpuNumOfBits - CpuUnused);
> >  #endif
> 
> No - you should instead comment out the #define of CpuUnused - see the
> comment there.

Fixed this wrong comment.

> 
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3296,3 +3296,24 @@ vpdpbsud, 0xf350, None, CpuAVX_VNNI_INT8,
> > Modrm|Vex|Space0F38|VexVVVV|VexW0|Chec
> >  vpdpbsuds, 0xf351, None, CpuAVX_VNNI_INT8,
> >
> Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|N
> o_lSuf
> > |No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex,
> > RegXMM|RegYMM, RegXMM|RegYMM }
> >
> >  // AVX_VNNI_INT8 instructions end.
> > +
> > +// CMPCCXADD instructions.
> > +
> > +cmpbexadd, 0x66e6, None, CpuCMPCCXADD|Cpu64,
> >
> +Modrm|Vex128|Space0F38|VexVVVV=1|SwapSources|No_bSuf|No_wSuf|No
> _lSuf|
> > +No_sSuf|No_qSuf|No_ldSuf, { Reg32|Reg64, Reg32|Reg64,
> > +Dword|Qword|Unspecified|BaseIndex }
> 
> Along the lines of the earlier comment - you want to use the <cc> template here,
> eliminating the need for 16 almost identical lines _and_ supplying all condition
> code representation in one go.

Mentioned above.

> 
> Apart from that you forgot CheckRegSize here afaict. And please again VexVVVV
> alone, without =1. Also for non-vector insns perhaps better plain Vex instead of
> Vex128. Further these insns should allow for l and q suffixes in AT&T mode.

Done.

> 
> And finally - is SwapSources really appropriate to use here? There's only one
> pure source operand, the other two are also serving as destinations.
> I wonder whether an attribute is necessary here in the first place: Vex- encoded
> insns with a memory destination never have two further register operands, so
> that property should suffice for identifying the case in build_modrm_byte().
> Alternatively you could also simply use the CPU flag.

We may need a special identifier for CMPccXADD since we have VVVV at
operand 3, where it is always at operand 2 for all other insts which
have VVVV. That is the reason we reuse SwapSources. It might be not
that same as the original meaning. But we want to avoid adding a bit
for this very rare case. Do we need to change that?

Haochen

> 
> Jan