Re: [PATCH V2] Support APX NF

From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: "hjl.tools@gmail.com" <hjl.tools@gmail.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH V2] Support APX NF
Date: Tue, 12 Mar 2024 14:53:15 +0100
Message-ID: <aec6a7bd-e1e4-4b7e-9a3f-027a28542286@suse.com>
In-Reply-To: <SJ0PR11MB5600769BDA7D0E2FB5355F709E2B2@SJ0PR11MB5600.namprd11.prod.outlook.com>

On 12.03.2024 14:22, Cui, Lili wrote:
>>> --- a/opcodes/i386-opc.tbl
>>> +++ b/opcodes/i386-opc.tbl
>>> @@ -310,32 +310,42 @@ sti, 0xfb, 0, NoSuf, {}
>>>  // Arithmetic.
>>>  add, 0x0, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
>>>  add, 0x0, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>> +add, 0x0, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>>  add, 0x83/0, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>>>  add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>> +add, 0x83/0, APX_F, Modrm|No_bSuf|No_sSuf|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>>  add, 0x4, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
>>>  add, 0x80/0, APX_F, W|Modrm|CheckOperandSize|No_sSuf|DstVVVV|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64}
>>>  add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>> +add, 0x80/0, APX_F, W|Modrm|No_sSuf|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>
>> As before, this (and similar new ones) can also be picked up without use of
>> {nf}, which will want testing. If you deliberately defer adding such a test, this
>> needs saying in the description.
>>
> 
> Ok.
> 
>> One further remark: In the course of adding APX support the set of templates
>> for ADD and its sibling ALU ops was bumped from 4 to 10. There was quite a
>> bit of redundancy between these 7 groups before, but it's growing much
>> worse now. Did you consider templatizing them up front? This may even be
>> worthwhile for just the ...
>>
> 
> I tried adding something like this, but it doesn't work. It seems that templates only support differentiation based on the mnemonic.
> 
> +<cpu1:attr, 0:HLEPrefixLock, APX_F:EVexMap4|NF>
> +<cpu2:attr, 0:HLEPrefixLock|Optimize, APX_F:EVexMap4|NF>
> +
>  // Arithmetic.
>  add, 0x0, APX_F, D|C|W|CheckOperandSize|Modrm|No_sSuf|DstVVVV|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
> -add, 0x0, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> -add, 0x0, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
> +add<cpu1>, 0x0, 0, D|W|CheckOperandSize|Modrm|No_sSuf|<cpu1:attr>, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }

Differentiating on something other than the mnemonic is possible (see
e.g. the mmx and sse templates, which append nothing to the mnemonics),
but that is not what is wanted here. The goal of the suggested template
is to have something that covers all 7 mnemonics in one go, including
their legacy forms.
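
To make the "all 7 mnemonics in one go" idea concrete, here is a toy Python sketch of per-mnemonic expansion. It is neither how i386-gen processes templates nor the syntax i386-opc.tbl uses; `ALU`, `expand` and the `<mnem>`/`<base>`/`<ext>` placeholders are made up for this illustration, and attribute differences between the individual mnemonics are ignored. It relies only on the regular x86 encoding of the seven ops, where the reg/mem opcode base is 8 times the /digit used with 0x80 and 0x83.

```python
# Hypothetical model: each ALU mnemonic maps to its ModRM /digit.
# The reg/mem opcode base is that digit shifted left by 3 (add=0x00 ... xor=0x30).
ALU = {"add": 0, "or": 1, "adc": 2, "sbb": 3, "and": 4, "sub": 5, "xor": 6}


def expand(pattern):
    """Instantiate `pattern` once per ALU mnemonic.

    Placeholders (invented for this sketch):
      <mnem>  the mnemonic
      <base>  opcode of the reg/mem form, e.g. 0x28 for sub
      <ext>   the /digit used with the 0x80 and 0x83 immediate forms
    """
    lines = []
    for mnem, n in ALU.items():
        line = (pattern.replace("<mnem>", mnem)
                       .replace("<base>", f"0x{n << 3:x}")
                       .replace("<ext>", str(n)))
        lines.append(line)
    return lines


# One pattern line yields seven table lines, one per mnemonic.
for line in expand("<mnem>, 0x80/<ext>, 0, W|Modrm|No_sSuf|HLEPrefixLock, "
                   "{ Imm8|Imm16|Imm32|Imm32S, "
                   "Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }"):
    print(line)
```

The point of the sketch is only that a single pattern, keyed on the mnemonic, can stand in for seven near-identical groups of lines; a real template would additionally have to express which mnemonics get which attributes.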

>>> @@ -434,24 +464,34 @@ imul, 0x69, i186, Modrm|No_bSuf|No_sSuf|RegKludge, { Imm16|Imm32|Imm32S, Reg16|R
>>>
>>>  div, 0xf6/6, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>>  div, 0xf6/6, 0, W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
>>> +div, 0xf6/6, APX_F, W|Modrm|No_sSuf|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>>  idiv, 0xf6/7, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>>  idiv, 0xf6/7, 0, W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
>>> +idiv, 0xf6/7, APX_F, W|Modrm|No_sSuf|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Unspecified|BaseIndex }
>>
>> What about the secondary forms with an explicit destination operand?
> 
> The doc does not extend the second format. Maybe it is related to Acc.

It certainly is.

> I found that basically all instructions involving Acc are not extended, except SHA.

Well, the SDM doesn't specify DIV or IDIV with a second operand.
Therefore the issue here isn't conformance with the documentation,
but consistency with the pre-existing gas extensions.

>>> @@ -2027,6 +2090,7 @@ blsi, 0xf3/3, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVV
>>>  blsmsk, 0xf3/2, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
>>>  blsr, 0xf3/1, APX_F(BMI), Modrm|CheckOperandSize|Vex128|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf|NF, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
>>>  tzcnt, 0xf30fbc, BMI, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>>> +tzcnt, 0xf4, BMI&APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4|NF, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>>
>> Like e.g. here, ...
>>
>>> @@ -2104,9 +2168,11 @@ insertq, 0xf20f78, SSE4a, Modrm|NoSuf, { Imm8, Imm8, RegXMM, RegXMM }
>>>
>>>  // LZCNT instruction
>>>  lzcnt, 0xf30fbd, LZCNT, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>>> +lzcnt, 0xf5, LZCNT|APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4|NF, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>>
>> ... I think you mean LZCNT&APX_F here.
>>
>>>  // POPCNT instruction
>>>  popcnt, 0xf30fb8, POPCNT, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>>> +popcnt, 0x88, POPCNT|APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4|NF, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>>
>> Whereas here it ought to be just APX_F if the doc is to be trusted.
> 
> I think they should be LZCNT&APX_F and POPCNT&APX_F. Why do you think POPCNT is special? I suspect I've missed something.

Unless you're looking at a newer version of the APX spec than me,
you'll observe the difference when looking at the respective insn
pages: For LZCNT both CPUID bits are mentioned, while for POPCNT
only APX_F is there.
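
A small Python sketch of why the operator in the CPU column matters, assuming (as this exchange implies) that `A|B` is satisfied when either feature is enabled while `A&B` needs both. The function names and the set-of-strings model are invented for illustration; this is not gas code.

```python
def any_of(enabled, *features):
    """Model of 'A|B': usable if at least one listed feature is enabled."""
    return any(f in enabled for f in features)


def all_of(enabled, *features):
    """Model of 'A&B': usable only if every listed feature is enabled."""
    return all(f in enabled for f in features)


# Hypothetical configuration: LZCNT enabled, APX_F not.
enabled = {"LZCNT"}

# With '|', the EVEX-encoded NF template would be selectable without APX_F.
print(any_of(enabled, "LZCNT", "APX_F"))  # True

# With '&', it is correctly rejected until APX_F is enabled as well.
print(all_of(enabled, "LZCNT", "APX_F"))  # False
```

Under the same model, a template gated on APX_F alone (the POPCNT case per the insn page) would be written with a single feature and would not depend on POPCNT being enabled.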

Jan

