RE: [PATCH v4 3/9] Support APX GPR32 with extend evex prefix

From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [PATCH v4 3/9] Support APX GPR32 with extend evex prefix
Date: Mon, 25 Dec 2023 12:23:23 +0000
Message-ID: <SJ0PR11MB5600E0C7483760E9F5FB5B979E99A@SJ0PR11MB5600.namprd11.prod.outlook.com>
In-Reply-To: <1a796802-5267-4c88-abc4-dda4bdd262cc@suse.com>

> On 19.12.2023 13:12, Cui, Lili wrote:
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -89,6 +89,7 @@
> >  /* This matches the C -> StaticRounding alias in the opcode table.  */
> >  #define commutative staticrounding
> >
> > +#define APX_F(cpuid) (maybe_cpu (t, CpuAPX_F) && maybe_cpu (t, cpuid))
> 
> Why is this still here? I said more than once that it's not helpful to have. As can
> be seen ...
> 
> > @@ -3673,7 +3674,7 @@ install_template (const insn_template *t)
> >
> >    /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
> >    if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> > -  {
> > +    {
> >        if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> >  	   || maybe_cpu (t, CpuFMA))
> >  	  && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
> > @@ -3695,7 +3696,15 @@ install_template (const insn_template *t)
> >  		gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa);
> >  	    }
> >  	}
> > -  }
> > +
> > +      if (APX_F(CpuCMPCCXADD) || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F)
> > +	  || APX_F(CpuAVX512DQ) || APX_F(CpuAVX512BW) || APX_F(CpuBMI)
> > +	  || APX_F(CpuBMI2))
> 
> ... right here: There's no point in checking CpuAPX_F a whopping 7 times.
> 
> > +	if (need_evex_encoding ())
> > +	  i.tm.opcode_modifier.vex = 0;
> > +	else
> > +	  i.tm.opcode_modifier.evex = 0;
> > +    }
> 
> I'm also pretty sure that I asked before that such nested if/else please have
> proper braces for the body of the outer if().
> 
Sorry, I missed that email.

Changed to:

      if ((maybe_cpu (t, CpuCMPCCXADD) || maybe_cpu (t, CpuAMX_TILE)
           || maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512DQ)
           || maybe_cpu (t, CpuAVX512BW) || maybe_cpu (t, CpuBMI)
           || maybe_cpu (t, CpuBMI2))
          && maybe_cpu (t, CpuAPX_F))
        {
          if (need_evex_encoding () || i.has_egpr)
            i.tm.opcode_modifier.vex = 0;
          else
            i.tm.opcode_modifier.evex = 0;
        }
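
For readers following along, the decision in the changed code can be modelled in isolation. This is only a minimal sketch under stated assumptions: `struct tmpl_bits`, its `apx_dual` field and `strip_dual_encoding` are hypothetical stand-ins (not gas names), and the list of maybe_cpu () checks is collapsed into a single flag.

```c
#include <stdbool.h>

/* Hypothetical, pared-down stand-in for the template bits involved.  */
struct tmpl_bits
{
  bool vex;       /* Template may be encoded with VEX.  */
  bool evex;      /* Template may be encoded with EVEX.  */
  bool apx_dual;  /* Assumed: one of the listed features plus APX_F.  */
};

/* Keep exactly one encoding of a dual VEX/EVEX template: EVEX when it
   was requested or an extended GPR is in use, VEX otherwise.  */
static void
strip_dual_encoding (struct tmpl_bits *t, bool evex_requested, bool has_egpr)
{
  if (!(t->vex && t->evex && t->apx_dual))
    return;  /* Not a dual APX template, leave it untouched.  */

  if (evex_requested || has_egpr)
    t->vex = false;
  else
    t->evex = false;
}
```

With both bits set on entry, an extended GPR operand leaves only `evex` set, while the default case leaves only `vex` set.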

> To say it very clearly again: When you submit a new version, _all_ prior review
> comments should be addressed. Whether that's verbally (by explaining why a
> change cannot be made) or by adjusting the code is another matter. I said
> before that reviewing this work has proven extremely time consuming. I
> shouldn't be required to needlessly put in yet more time, just to re-spot and
> re-comment things already pointed out.
> 
> > @@ -3876,6 +3885,15 @@ is_any_vex_encoding (const insn_template *t)
> >    return t->opcode_modifier.vex || t->opcode_modifier.evex;
> >  }
> >
> > +/* We can use this function only when the current encoding is evex.  */
> > +static INLINE bool
> > +is_apx_evex_encoding (void)
> > +{
> > +  return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
> > +    || (i.vex.register_specifier
> > +	&& i.vex.register_specifier->reg_flags & RegRex2);
> 
> Nit: Parentheses please around the & expression.
> 
Done.
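
For reference, in C the binary `&` operator binds more tightly than `&&`, so the added parentheses do not change the result; they only make the grouping explicit. A minimal sketch, where `struct reg_sketch`, `REG_REX2_SKETCH` and `uses_rex2_reg` are hypothetical names used for illustration and not the gas definitions:

```c
#include <stdbool.h>
#include <stddef.h>

#define REG_REX2_SKETCH 0x4u  /* Hypothetical flag value.  */

/* Hypothetical stand-in for a register entry.  */
struct reg_sketch
{
  unsigned reg_flags;
};

/* True when a register is present and carries the flag.  The inner
   parentheses spell out that the mask test is one operand of &&.  */
static bool
uses_rex2_reg (const struct reg_sketch *r)
{
  return r != NULL && (r->reg_flags & REG_REX2_SKETCH);
}
```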

> > @@ -8097,7 +8142,11 @@ process_suffix (void)
> >  	  if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */
> >  	    prefix = ADDR_PREFIX_OPCODE;
> >
> > -	  if (!add_prefix (prefix))
> > +	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
> > +	     needs to be adjusted.  */
> > +	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
> > +	    i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
> 
> Feels like I did ask before: What if i.tm.opcode_modifier.opcodeprefix is
> already set? Aiui that would be a bug, but one that's likely easy to introduce
> and hard to find. IOW - better assert the field is clear before filling it?
>
Yes, something like that, which is why we moved it here from elsewhere.  Added the assertion:

          /* The DATA PREFIX of EVEX promoted from legacy APX instructions
             needs to be adjusted.  */
          if (i.tm.opcode_space == SPACE_EVEXMAP4)
            {
              gas_assert (!i.tm.opcode_modifier.opcodeprefix);
              i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
            }
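
The same assert-then-set pattern, reduced to a standalone sketch. The names below (`struct modifier_sketch`, `PREFIX_NONE_SKETCH`, `PREFIX_0X66_SKETCH`, `set_data_prefix`) are hypothetical, and plain assert () stands in for gas_assert ():

```c
#include <assert.h>

/* Hypothetical prefix values, for illustration only.  */
enum prefix_sketch
{
  PREFIX_NONE_SKETCH = 0,
  PREFIX_0X66_SKETCH = 1
};

/* Hypothetical stand-in for the opcode modifier.  */
struct modifier_sketch
{
  unsigned opcodeprefix;
};

/* Fill in the data prefix, but trap if something was already stored
   there, so a silent overwrite shows up as an assertion failure.  */
static void
set_data_prefix (struct modifier_sketch *m)
{
  assert (m->opcodeprefix == PREFIX_NONE_SKETCH);
  m->opcodeprefix = PREFIX_0X66_SKETCH;
}
```

Calling it a second time on the same object would trip the assertion, which is the kind of easy-to-introduce bug the review comment is worried about.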

> > @@ -14293,6 +14342,12 @@ static bool check_register (const reg_entry *r)
> >        if (!cpu_arch_flags.bitfield.cpuapx_f
> >  	  || flag_code != CODE_64BIT)
> >  	return false;
> > +
> > +      /* When using RegRex2, dual VEX/EVEX templates need to be marked as EVEX.
> > +	 For the later install_template function.  */
> > +      if (current_templates.start->opcode_modifier.vex
> > +	  && current_templates.start->opcode_modifier.evex)
> > +	i.vec_encoding = vex_encoding_evex;
> >      }
> 
> Just to state it again - no use of current_templates in this function, please.
> 
Removed, as mentioned in my previous reply.

> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -106,16 +106,6 @@
> >  #define HLEPrefixRelease PrefixOk=PrefixHLERelease
> >  #define NoTrackPrefixOk  PrefixOk=PrefixNoTrack
> >
> > -#define Space0F    OpcodeSpace=SPACE_0F
> > -#define Space0F38  OpcodeSpace=SPACE_0F38
> > -#define Space0F3A  OpcodeSpace=SPACE_0F3A
> > -#define SpaceXOP08 OpcodeSpace=SPACE_XOP08
> > -#define SpaceXOP09 OpcodeSpace=SPACE_XOP09
> > -#define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
> > -
> > -#define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
> > -#define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
> > -
> > -
> 
> Why are you moving these, leaving ...
> 
> >  #define VexMap7 OpcodeSpace=SPACE_VEXMAP7
>
Done.

> ... this one disconnected? IOW - I see no need for the moving, but if there is a
> need, then this one also needs moving (see how it'll become relevant for
> USER_MSR+APX_F now). Specifically ...
> 
> > @@ -137,11 +127,25 @@
> >  #define EVexLIG EVex=EVEXLIG
> >  #define EVexDYN EVex=EVEXDYN
> >
> > +#define Space0F    OpcodeSpace=SPACE_0F
> > +#define Space0F38  OpcodeSpace=SPACE_0F38
> > +#define Space0F3A  OpcodeSpace=SPACE_0F3A
> > +#define SpaceXOP08 OpcodeSpace=SPACE_XOP08
> > +#define SpaceXOP09 OpcodeSpace=SPACE_XOP09
> > +#define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
> > +
> > +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
> 
> ... there's no need for this to live after EVex128 was #define-d.
> 
Yes, done.

> > +#define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
> > +#define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
> > +
> >  #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
> >
> >  #define Vsz256 Vsz=VSZ256
> >  #define Vsz512 Vsz=VSZ512
> >
> > +// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
> > +#define APX_F(cpuid) cpuid&(cpuid|APX_F)
> 
> I think the comment wants to go into further detail. Please can you go back to
> read what I said when I suggested this construct, in particular regarding the
> stripping then done? However, with you not having found a need to fiddle
> with cpu_flags_match(), I wonder if this construct is needed in the first place.
> The earlier suggestion was entirely based on the assumption that stripping
> similar to that for other combined VEX/EVEX templates would be needed here,
> t
> 

Seeing this, I realized there is a problem and checked opcodes/i386-tbl.h. For the following entry we want both CpuAPX_F and CpuBMI2 set to 1, but gen.c does not seem to support the "cpuid&(cpuid|APX_F)" format: bzhi ends up with CpuBMI2 set to 1 and CpuAPX_F set to 0. I'm not familiar with the relevant logic in gen.c and don't know how to debug it. When you have time, could you help take a look?

bzhi, 0xf5, APX_F(BMI2), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
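
To make the question concrete, here is a small sketch of two ways "cpuid&(cpuid|APX_F)" could be read. Read as plain boolean logic, the expression reduces to just cpuid (absorption), so APX_F has no effect on the result. The second half is only an assumption for illustration, not a statement about what gen.c does: it models a template with a separate "all of" set and "any of" set, loosely following the i.tm.cpu / i.tm.cpu_any pair visible in the quoted code. All names below are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical feature bits, for illustration only.  */
enum
{
  FEAT_BMI2 = 1 << 0,
  FEAT_APX_F = 1 << 1
};

/* Plain boolean reading of "BMI2&(BMI2|APX_F)": the APX_F term is
   absorbed, so the result always equals the BMI2 bit.  */
static bool
plain_reading (unsigned enabled)
{
  bool bmi2 = (enabled & FEAT_BMI2) != 0;
  bool apx_f = (enabled & FEAT_APX_F) != 0;

  return bmi2 && (bmi2 || apx_f);
}

/* Assumed two-set model: terms outside the parentheses go into
   all_of, terms inside go into any_of.  */
struct tmpl_cpu_sketch
{
  unsigned all_of;
  unsigned any_of;
};

/* A feature "may be involved" when either set mentions it.  */
static bool
may_involve (const struct tmpl_cpu_sketch *t, unsigned feat)
{
  return ((t->all_of | t->any_of) & feat) != 0;
}
```

Under that assumed model, an entry written as APX_F(BMI2) would show APX_F as clear when only the "all of" set is inspected, while a check across both sets would still see it.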


> > @@ -2049,13 +2059,20 @@ bndldx, 0x0f1a, MPX, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND }
> >
> >  // SHA instructions.
> >  sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1nexte, 0xf38c8, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1msg1, 0xf38c9, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha1msg2, 0xf38ca, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> > +sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM }
> >  sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
> 
> What about this form? It surely also wants to gain an EVEX counterpart.
> 

Added.

Thanks,
Lili.
