From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [PATCH v4 3/9] Support APX GPR32 with extend evex prefix
Date: Mon, 25 Dec 2023 12:23:23 +0000
Message-ID: <SJ0PR11MB5600E0C7483760E9F5FB5B979E99A@SJ0PR11MB5600.namprd11.prod.outlook.com>
In-Reply-To: <1a796802-5267-4c88-abc4-dda4bdd262cc@suse.com>
> On 19.12.2023 13:12, Cui, Lili wrote:
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -89,6 +89,7 @@
> > /* This matches the C -> StaticRounding alias in the opcode table. */
> > #define commutative staticrounding
> >
> > +#define APX_F(cpuid) (maybe_cpu (t, CpuAPX_F) && maybe_cpu (t, cpuid))
>
> Why is this still here? I said more than once that it's not helpful to have. As can
> be seen ...
>
> > @@ -3673,7 +3674,7 @@ install_template (const insn_template *t)
> >
> > /* Dual VEX/EVEX templates need stripping one of the possible variants. */
> > if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> > - {
> > + {
> > if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> > || maybe_cpu (t, CpuFMA))
> > && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
> > @@ -3695,7 +3696,15 @@ install_template (const insn_template *t)
> > gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa);
> > }
> > }
> > - }
> > +
> > + if (APX_F(CpuCMPCCXADD) || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F)
> > + || APX_F(CpuAVX512DQ) || APX_F(CpuAVX512BW) || APX_F(CpuBMI)
> > + || APX_F(CpuBMI2))
>
> ... right here: There's no point in checking CpuAPX_F a whopping 7 times.
>
> > + if (need_evex_encoding ())
> > + i.tm.opcode_modifier.vex = 0;
> > + else
> > + i.tm.opcode_modifier.evex = 0;
> > + }
>
> I'm also pretty sure that I asked before that such nested if/else please have
> proper braces for the body of the outer if().
>
Sorry, I missed that email.
Changed to:
  if ((maybe_cpu (t, CpuCMPCCXADD) || maybe_cpu (t, CpuAMX_TILE)
       || maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512DQ)
       || maybe_cpu (t, CpuAVX512BW) || maybe_cpu (t, CpuBMI)
       || maybe_cpu (t, CpuBMI2))
      && maybe_cpu (t, CpuAPX_F))
    {
      if (need_evex_encoding () || i.has_egpr)
        i.tm.opcode_modifier.vex = 0;
      else
        i.tm.opcode_modifier.evex = 0;
    }
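
As a quick sanity check on the rewritten condition, here is a minimal standalone sketch (not gas code) showing that testing CpuAPX_F once gives the same result as repeating it in every alternative. The flag values, maybe_cpu_stub and the helper names are hypothetical stand-ins, and only two of the seven feature bits are modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the gas CPU feature bits; the real ones come
   from the generated opcode tables.  */
enum
{
  CPU_APX_F = 1 << 0,
  CPU_BMI = 1 << 1,
  CPU_BMI2 = 1 << 2
};

/* Stub for maybe_cpu (): true if the template's flag set contains FLAG.  */
static bool
maybe_cpu_stub (unsigned int tmpl_flags, unsigned int flag)
{
  return (tmpl_flags & flag) != 0;
}

/* v4 form: the APX_F test is repeated inside every alternative.  */
static bool
check_repeated (unsigned int t)
{
  return (maybe_cpu_stub (t, CPU_APX_F) && maybe_cpu_stub (t, CPU_BMI))
         || (maybe_cpu_stub (t, CPU_APX_F) && maybe_cpu_stub (t, CPU_BMI2));
}

/* Revised form: the APX_F test is hoisted out and done once.  */
static bool
check_hoisted (unsigned int t)
{
  return (maybe_cpu_stub (t, CPU_BMI) || maybe_cpu_stub (t, CPU_BMI2))
         && maybe_cpu_stub (t, CPU_APX_F);
}

/* True if both forms agree for every combination of the three bits.  */
static bool
forms_agree (void)
{
  for (unsigned int t = 0; t < 8; t++)
    if (check_repeated (t) != check_hoisted (t))
      return false;
  return true;
}
```

Calling forms_agree () walks all eight flag combinations, so the two spellings can be compared without building gas.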
> To say it very clearly again: When you submit a new version, _all_ prior review
> comments should be addressed. Whether that's verbally (by explaining why a
> change cannot be made) or by adjusting the code is another matter. I said
> before that reviewing this work has proven extremely time consuming. I
> shouldn't be required to needlessly put in yet more time, just to re-spot and
> re-comment things already pointed out.
>
> > @@ -3876,6 +3885,15 @@ is_any_vex_encoding (const insn_template *t)
> > return t->opcode_modifier.vex || t->opcode_modifier.evex;
> > }
> >
> > +/* We can use this function only when the current encoding is evex. */
> > +static INLINE bool
> > +is_apx_evex_encoding (void)
> > +{
> > + return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
> > + || (i.vex.register_specifier
> > + && i.vex.register_specifier->reg_flags & RegRex2);
>
> Nit: Parentheses please around the & expression.
>
Done.
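
For reference, a small standalone sketch of the precedence point, assuming nothing beyond standard C: & binds tighter than &&, so the parentheses do not change the result and only make the intent explicit. REG_REX2_STUB and struct reg_stub are hypothetical stand-ins, not the gas definitions.

```c
#include <stdbool.h>
#include <stddef.h>

#define REG_REX2_STUB 0x10 /* Hypothetical bit value, not the gas RegRex2.  */

struct reg_stub
{
  unsigned int reg_flags;
};

/* Without the extra parentheses: relies on & binding tighter than &&.  */
static bool
uses_rex2_bare (const struct reg_stub *r)
{
  return r && r->reg_flags & REG_REX2_STUB;
}

/* With the parentheses asked for in review: same result, clearer intent.  */
static bool
uses_rex2_parens (const struct reg_stub *r)
{
  return r && (r->reg_flags & REG_REX2_STUB);
}
```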
> > @@ -8097,7 +8142,11 @@ process_suffix (void)
> > if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */
> > prefix = ADDR_PREFIX_OPCODE;
> >
> > - if (!add_prefix (prefix))
> > + /* The DATA PREFIX of EVEX promoted from legacy APX instructions
> > + needs to be adjusted. */
> > + if (i.tm.opcode_space == SPACE_EVEXMAP4)
> > + i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
>
> Feels like I did ask before: What if i.tm.opcode_modifier.opcodeprefix is
> already set? Aiui that would be a bug, but one that's likely easy to introduce
> and hard to find. IOW - better assert the field is clear before filling it?
>
Yes, something like that, which is why we moved it here from elsewhere. Added:
  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
     needs to be adjusted.  */
  if (i.tm.opcode_space == SPACE_EVEXMAP4)
    {
      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
    }
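
The same assert-before-fill pattern in isolation, as a minimal sketch: it uses the standard assert () in place of gas_assert (), and struct tmpl_stub and the two prefix constants are hypothetical stand-ins for the real template fields.

```c
#include <assert.h>

/* Hypothetical values; the real PREFIX_* constants come from the opcode
   headers.  */
enum { PREFIX_NONE_STUB = 0, PREFIX_0X66_STUB = 1 };

struct tmpl_stub
{
  unsigned int opcodeprefix;
};

/* Set the 0x66 prefix, insisting that nothing was stored there before.
   A second call on the same template trips the assertion instead of
   silently overwriting an earlier value.  */
static void
set_data_prefix (struct tmpl_stub *tm)
{
  assert (tm->opcodeprefix == PREFIX_NONE_STUB);
  tm->opcodeprefix = PREFIX_0X66_STUB;
}
```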
> > @@ -14293,6 +14342,12 @@ static bool check_register (const reg_entry *r)
> > if (!cpu_arch_flags.bitfield.cpuapx_f
> > || flag_code != CODE_64BIT)
> > return false;
> > +
> > + /* When using RegRex2, dual VEX/EVEX templates need to be marked as EVEX.
> > + For the later install_template function. */
> > + if (current_templates.start->opcode_modifier.vex
> > + && current_templates.start->opcode_modifier.evex)
> > + i.vec_encoding = vex_encoding_evex;
> > }
>
> Just to state it again - no use of current_templates in this function, please.
>
Removed, as mentioned in my previous comment.
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -106,16 +106,6 @@
> > #define HLEPrefixRelease PrefixOk=PrefixHLERelease
> > #define NoTrackPrefixOk PrefixOk=PrefixNoTrack
> >
> > -#define Space0F OpcodeSpace=SPACE_0F
> > -#define Space0F38 OpcodeSpace=SPACE_0F38
> > -#define Space0F3A OpcodeSpace=SPACE_0F3A
> > -#define SpaceXOP08 OpcodeSpace=SPACE_XOP08
> > -#define SpaceXOP09 OpcodeSpace=SPACE_XOP09
> > -#define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
> > -
> > -#define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
> > -#define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
> > -
>
> Why are you moving these, leaving ...
>
> > #define VexMap7 OpcodeSpace=SPACE_VEXMAP7
>
Done.
> ... this one disconnected? IOW - I see no need for the moving, but if there is a
> need, then this one also needs moving (see how it'll become relevant for
> USER_MSR+APX_F now). Specifically ...
>
> > @@ -137,11 +127,25 @@
> > #define EVexLIG EVex=EVEXLIG
> > #define EVexDYN EVex=EVEXDYN
> >
> > +#define Space0F OpcodeSpace=SPACE_0F
> > +#define Space0F38 OpcodeSpace=SPACE_0F38
> > +#define Space0F3A OpcodeSpace=SPACE_0F3A
> > +#define SpaceXOP08 OpcodeSpace=SPACE_XOP08
> > +#define SpaceXOP09 OpcodeSpace=SPACE_XOP09
> > +#define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
> > +
> > +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
>
> ... there's no need for this to live after EVex128 was #define-d.
>
Yes, done.
> > +#define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
> > +#define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
> > +
> > #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
> >
> > #define Vsz256 Vsz=VSZ256
> > #define Vsz512 Vsz=VSZ512
> >
> > +// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
> > +#define APX_F(cpuid) cpuid&(cpuid|APX_F)
>
> I think the comment wants to go into further detail. Please can you go back to
> read what I said when I suggested this construct, in particular regarding the
> stripping then done? However, with you not having found a need to fiddle
> with cpu_flags_match(), I wonder if this construct is needed in the first place.
> The earlier suggestion was entirely based on the assumption that stripping
> similar to that for other combined VEX/EVEX templates would be needed here,
> t
>
Seeing this, I realized the problem and checked opcodes/i386-tbl.h. For the following entry we want both CpuAPX_F and CpuBMI set to 1, but gen.c does not seem to support the "cpuid&(cpuid|APX_F)" format: in fact bzhi ends up with CpuBMI set to 1 and CpuAPX_F set to 0. I'm not familiar with the relevant logic in gen.c and don't know how to debug it. When you have time, could you help take a look?
bzhi, 0xf5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> > @@ -2049,13 +2059,20 @@ bndldx, 0x0f1a, MPX, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND }
> >
> > // SHA instructions.
> > sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
> > sha1nexte, 0xf38c8, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > sha1msg1, 0xf38c9, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > sha1msg2, 0xf38ca, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
>
> What about this form? It surely also wants to gain an EVEX counterpart.
>
Added.
Thanks,
Lili.