RE: [PATCH v3 4/9] Support APX GPR32 with extend evex prefix

From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [PATCH v3 4/9] Support APX GPR32 with extend evex prefix
Date: Tue, 12 Dec 2023 12:58:02 +0000
Message-ID: <SJ0PR11MB5600FA4CA703647E7195D90F9E8EA@SJ0PR11MB5600.namprd11.prod.outlook.com>
In-Reply-To: <0bb5fbcd-f58e-48ad-a5ee-3413b026f903@suse.com>

> 
> >>> @@ -14233,6 +14276,12 @@ static bool check_register (const reg_entry *r)
> >>>        if (!cpu_arch_flags.bitfield.cpuapx_f
> >>>  	  || flag_code != CODE_64BIT)
> >>>  	return false;
> >>> +
> >>> +      /* When using RegRex2, dual VEX/EVEX templates need to be marked as EVEX.
> >>> +	 For the later install_template function.  */
> >>> +      if (current_templates->start->opcode_modifier.vex
> >>> +	  && current_templates->start->opcode_modifier.evex)
> >>> +	i.vec_encoding = vex_encoding_evex;
> >>
> >> I'm afraid I don't understand the 2nd sentence of the comment. This
> >> may be related to my question regarding cpu_flags_match() further up.
> >>
> >> The first sentence isn't quite correct either - you don't mark any
> >> template here (and you can't, because we don't even know yet which
> >> template we're going to use).
> >>
> >> Finally - do you really need the .evex check here? (I won't exclude
> >> that this yields a better diagnostic in certain cases, but this wants
> >> clarifying if so.)
> >>
> >
> > If you look at install_template(), you'll see that before this function we
> > need to know if the current encoding is evex.
> 
> "This function" being check_register()? If so, then no, we can't know up front
> whether EVEX encoding is going to be needed, as operand parsing happens
> ahead of template selection. If instead you mean "that function" and hence
> install_template(), then yes, we need to know whether to use EVEX there.
> Yet how does that result in a need for the .evex check here? (Or maybe your
> reply was really to the first of the three parts of my earlier one?)
>

I agree with you; putting them here is unreasonable.

For example:

vtestps (%r27),%ymm6

Here we should report that the EGPR is unsupported. But without the .evex check, the assembler instead reports "Error: no EVEX encoding for `vtestps'".

> But anyway - as said earlier on, using current_templates here looks wrong in
> the first place. check_register() deals with only a register, without regard to
> the context it is used in (with the sole exception of allow_pseudo_reg).
> May I remind you that earlier on I already indicated that I suspect you'll need
> a new enumerator to put in i.vec_encoding for this new purpose?
> 

If we don't put it in check_register(), we need to add a for loop at the beginning of install_template() to check for RegRex2. Do you think that is okay? Or should I create a separate function for it?

  for (unsigned int op = 0; op < i.operands; op++)
    {
      if (i.types[op].bitfield.class != Reg)
        continue;

      if (i.op[op].regs->reg_flags & RegRex2)
        i.vec_encoding = vex_encoding_evex;
    }

  if ((i.index_reg && (i.index_reg->reg_flags & RegRex2))
      || (i.base_reg && (i.base_reg->reg_flags & RegRex2)))
    i.vec_encoding = vex_encoding_evex;
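
The "create a function for it" variant could look roughly like the sketch below. It is only an illustration under stated assumptions: struct insn_stub, uses_egpr() and select_encoding() are hypothetical names, the types are cut-down stand-ins for the real gas structures, and the RegRex2 bit value is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in definitions (assumptions for this sketch only); the real
   ones in gas carry many more fields and different values.  */
#define RegRex2 0x1
#define MAX_OPERANDS 5

enum operand_class { ClassNone, Reg };
enum vec_encoding { vex_encoding_default, vex_encoding_evex };

typedef struct { unsigned int reg_flags; } reg_entry;

struct insn_stub
{
  unsigned int operands;
  enum operand_class op_class[MAX_OPERANDS];
  const reg_entry *op_reg[MAX_OPERANDS];
  const reg_entry *base_reg;
  const reg_entry *index_reg;
  enum vec_encoding vec_encoding;
};

/* Return true if any register operand, or the base/index register of a
   memory operand, is flagged RegRex2.  */
static bool
uses_egpr (const struct insn_stub *insn)
{
  assert (insn->operands <= MAX_OPERANDS);

  for (unsigned int op = 0; op < insn->operands; op++)
    if (insn->op_class[op] == Reg
        && (insn->op_reg[op]->reg_flags & RegRex2))
      return true;

  return (insn->base_reg && (insn->base_reg->reg_flags & RegRex2))
         || (insn->index_reg && (insn->index_reg->reg_flags & RegRex2));
}

/* Hypothetical caller: force EVEX once, before template installation.  */
static void
select_encoding (struct insn_stub *insn)
{
  if (uses_egpr (insn))
    insn->vec_encoding = vex_encoding_evex;
}
```

Keeping the scan in one predicate means check_register() stays context-free and the caller decides what to do with the answer.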


> > We need to check opcode_modifier.evex here, it is a fix for issues caused by
> > the merge of VEX and EVEX.
> >   if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> >     {
> >       if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || AVX512F(CpuFMA)
> >           || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || APX_F(CpuCMPCCXADD)
> >           || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || APX_F(CpuAVX512DQ)
> >           || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || APX_F(CpuBMI2))
> >         {
> >           if (need_evex_encoding ())
> >             {
> >[...]
> >>> @@ -1319,13 +1320,16 @@ getsec, 0xf37, SMX, NoSuf, {}
> >>>
> >>>  invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
> >>>  invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> >>> +invept, 0xf3f0, EPT&APX_F, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
> >>>  invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
> >>>  invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> >>> +invvpid, 0xf3f1, EPT&APX_F, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
> >>
> >> Seeing these: Are there any Map4 encodings which aren't EVex128? If
> >> not (and if you're also not hiddenly aware of some appearing in the
> >> near future), please consider making EVexMap4 include this right
> >> away. Even if in the longer run other encodings appear, it'll then be
> >> easy to simply replace all the
> >> EVexMap4 uses in a purely mechanical way. Until then shorter template
> >> lines are preferable.
> >>
> >
> > Would you mind defining it this way? Since #define EVex128 is behind it.
> > Considering that you don't like unnecessary changes.
> >
> > +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex=EVEX128
> 
> The order of #define-s doesn't matter. There's no reason not to use EVex128
> here even if it's #define-d only a few lines later.
> 

OK

#define EVex128 EVex=EVEX128
#define EVex256 EVex=EVEX256
#define EVex512 EVex=EVEX512
#define EVexLIG EVex=EVEXLIG
#define EVexDYN EVex=EVEXDYN

+#define Space0F    OpcodeSpace=SPACE_0F
+#define Space0F38  OpcodeSpace=SPACE_0F38
+#define Space0F3A  OpcodeSpace=SPACE_0F3A
+#define SpaceXOP08 OpcodeSpace=SPACE_XOP08
+#define SpaceXOP09 OpcodeSpace=SPACE_XOP09
+#define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
+
+#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128
+#define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
+#define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
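
The point about #define ordering can be checked with the C preprocessor directly, since a macro body is only expanded where the macro is used. A minimal sketch, assuming made-up integer values in place of the table's symbolic attributes (evex_map4_value() is a hypothetical helper for illustration):

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical numeric stand-ins; the real table uses symbolic
   OpcodeSpace=... / EVex=... attribute assignments, not integers.  */
enum { EVEX128 = 1, SPACE_EVEXMAP4 = 0x40 };

/* EVexMap4 mentions EVex128 before EVex128 has been defined.  */
#define EVexMap4 (SPACE_EVEXMAP4 | EVex128)
#define EVex128  EVEX128

/* Expansion happens here, at the point of use, where both macros exist.  */
static_assert (EVexMap4 == (0x40 | 1), "EVexMap4 picks up the later EVex128");

static int
evex_map4_value (void)
{
  return EVexMap4;
}
```

So EVexMap4 can refer to EVex128 wherever the two definitions sit relative to each other, as long as both precede the first use.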

> >>> Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> >>> +bzhi, 0xf5, BMI2&(BMI2|APX_F), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> >>
> >> Hmm, I had specifically suggested a pre-processor macro to use in
> >> place of the open-coded BMI2&(BMI2|APX_F). Is there a reason you
> >> didn't use that (here and below)?
> >
> > There are many different types of combinations, and each combination
> > appears relatively few times, so I think adding a #define for each combination
> > feels a bit wasteful.
> 
> I never suggested using multiple #define-s. I suggested a single APX_F()
> macro which would be used uniformly here and elsewhere (here:
> APX_F(BMI2)).
> And that macro would come with a comment explaining why the expression
> is the (seemingly strange) way it is. Right now there's no such explanation
> anywhere, and it would also be hard to find a good (central) place where to
> put it.
> 

Oh, I see what you mean now.
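
One reason the open-coded form looks strange is that, read as ordinary Boolean logic, x&(x|APX_F) reduces to plain x (the absorption law). The sketch below shows a single parameterized macro and checks that identity. It is a C analogue with plain booleans, not the table syntax; apx_f_matches() and check_absorption() are hypothetical names, and how the table generator actually treats the expression is not described in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical C analogue of a single APX_F() table macro.  In the
   opcode table the operands are CPU feature names; here they are plain
   booleans, and `apx_f' is taken from the scope where the macro is used.  */
#define APX_F(cpu) ((cpu) && ((cpu) || apx_f))

static bool
apx_f_matches (bool bmi2, bool apx_f)
{
  return APX_F (bmi2);
}

/* Check that APX_F(x) equals x for every input combination, i.e. that
   the expression is redundant when read as pure Boolean logic.  */
static void
check_absorption (void)
{
  for (int bmi2 = 0; bmi2 <= 1; bmi2++)
    for (int apx_f = 0; apx_f <= 1; apx_f++)
      assert (apx_f_matches (bmi2, apx_f) == (bool) bmi2);
}
```

That redundancy is exactly why a comment next to the one macro definition would help: it is the only place a reader would look for the reason the expression is written this way.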

Thanks,
Lili.
