public inbox for binutils@sourceware.org
From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH v3 5/9] Add tests for APX GPR32 with extend evex prefix
Date: Mon, 11 Dec 2023 09:55:08 +0100
Message-ID: <28164078-97c4-412f-8195-68bb2c6404da@suse.com>
In-Reply-To: <SJ0PR11MB5600C5B1C25CF973594A8CF99E8FA@SJ0PR11MB5600.namprd11.prod.outlook.com>

On 11.12.2023 07:16, Cui, Lili wrote:
>> On 24.11.2023 08:02, Cui, Lili wrote:
>>> +#VEX without evex
>>> +	vaesimc (%r27), %xmm3
>>> +	vaeskeygenassist $7,(%r27),%xmm3
>>> +	vblendpd $7,(%r27),%xmm6,%xmm2
>>> +	vblendpd $7,(%r27),%ymm6,%ymm2
>>> +	vblendps $7,(%r27),%xmm6,%xmm2
>>> +	vblendps $7,(%r27),%ymm6,%ymm2
>>> +	vblendvpd %xmm4,(%r27),%xmm2,%xmm7
>>> +	vblendvpd %ymm4,(%r27),%ymm2,%ymm7
>>> +	vblendvps %xmm4,(%r27),%xmm2,%xmm7
>>> +	vblendvps %ymm4,(%r27),%ymm2,%ymm7
>>> +	vdppd $7,(%r27),%xmm6,%xmm2
>>> +	vdpps $7,(%r27),%xmm6,%xmm2
>>> +	vdpps $7,(%r27),%ymm6,%ymm2
>>> +	vhaddpd (%r27),%xmm6,%xmm5
>>> +	vhaddpd (%r27),%ymm6,%ymm5
>>> +	vhsubps (%r27),%xmm6,%xmm5
>>> +	vhsubps (%r27),%ymm6,%ymm5
>>> +	vlddqu (%r27),%xmm4
>>> +	vlddqu (%r27),%ymm4
>>> +	vldmxcsr (%r27)
>>> +	vmaskmovpd %xmm4,%xmm6,(%r27)
>>> +	vmaskmovpd %ymm4,%ymm6,(%r27)
>>> +	vmaskmovpd (%r27),%xmm4,%xmm6
>>> +	vmaskmovpd (%r27),%ymm4,%ymm6
>>> +	vmaskmovps %xmm4,%xmm6,(%r27)
>>> +	vmaskmovps %ymm4,%ymm6,(%r27)
>>> +	vmaskmovps (%r27),%xmm4,%xmm6
>>> +	vmaskmovps (%r27),%ymm4,%ymm6
>>> +	vmovmskpd %xmm4,%r27d
>>> +	vmovmskpd %xmm8,%r27d
>>> +	vmovmskps %xmm4,%r27d
>>> +	vmovmskps %ymm8,%r27d
>>> +	vpblendd $7,(%r27),%xmm6,%xmm2
>>> +	vpblendd $7,(%r27),%ymm6,%ymm2
>>> +	vpblendvb %xmm4,(%r27),%xmm2,%xmm7
>>> +	vpblendvb %ymm4,(%r27),%ymm2,%ymm7
>>> +	vpblendw $7,(%r27),%xmm6,%xmm2
>>> +	vpblendw $7,(%r27),%ymm6,%ymm2
>>> +	vpcmpeqb (%r26),%ymm6,%ymm2
>>> +	vpcmpeqd (%r26),%ymm6,%ymm2
>>> +	vpcmpeqq (%r16),%ymm6,%ymm2
>>> +	vpcmpeqw (%r16),%ymm6,%ymm2
>>> +	vpcmpestri $7,(%r27),%xmm6
>>> +	vpcmpestrm $7,(%r27),%xmm6
>>> +	vpcmpgtb (%r26),%ymm6,%ymm2
>>> +	vpcmpgtd (%r26),%ymm6,%ymm2
>>> +	vpcmpgtq (%r16),%ymm6,%ymm2
>>> +	vpcmpgtw (%r16),%ymm6,%ymm2
>>> +	vpcmpistri $100,(%r25),%xmm6
>>> +	vpcmpistrm $100,(%r25),%xmm6
>>> +	vperm2f128 $7,(%r27),%ymm6,%ymm2
>>> +	vperm2i128 $7,(%r27),%ymm6,%ymm2
>>> +	vphaddd (%r27),%xmm6,%xmm7
>>> +	vphaddd (%r27),%ymm6,%ymm7
>>> +	vphaddsw (%r27),%xmm6,%xmm7
>>> +	vphaddsw (%r27),%ymm6,%ymm7
>>> +	vphaddw (%r27),%xmm6,%xmm7
>>> +	vphaddw (%r27),%ymm6,%ymm7
>>> +	vphminposuw (%r27),%xmm6
>>> +	vphsubd (%r27),%xmm6,%xmm7
>>> +	vphsubd (%r27),%ymm6,%ymm7
>>> +	vphsubsw (%r27),%xmm6,%xmm7
>>> +	vphsubsw (%r27),%ymm6,%ymm7
>>> +	vphsubw (%r27),%xmm6,%xmm7
>>> +	vphsubw (%r27),%ymm6,%ymm7
>>> +	vpmaskmovd %xmm4,%xmm6,(%r27)
>>> +	vpmaskmovd %ymm4,%ymm6,(%r27)
>>> +	vpmaskmovd (%r27),%xmm4,%xmm6
>>> +	vpmaskmovd (%r27),%ymm4,%ymm6
>>> +	vpmaskmovq %xmm4,%xmm6,(%r27)
>>> +	vpmaskmovq %ymm4,%ymm6,(%r27)
>>> +	vpmaskmovq (%r27),%xmm4,%xmm6
>>> +	vpmaskmovq (%r27),%ymm4,%ymm6
>>> +	vpmovmskb %xmm4,%r27
>>> +	vpmovmskb %ymm4,%r27d
>>> +	vpsignb (%r27),%xmm6,%xmm7
>>> +	vpsignb (%r27),%xmm6,%xmm7
>>> +	vpsignd (%r27),%xmm6,%xmm7
>>> +	vpsignd (%r27),%xmm6,%xmm7
>>> +	vpsignw (%r27),%xmm6,%xmm7
>>> +	vpsignw (%r27),%xmm6,%xmm7
>>> +	vptest (%r27),%ymm6
>>> +	vrcpps (%r27),%xmm6
>>> +	vrcpps (%r27),%ymm6
>>> +	vrcpss (%r27),%xmm6,%xmm6
>>> +	vroundpd $1,(%r24),%xmm6
>>> +	vroundps $2,(%r24),%xmm6
>>> +	vroundsd $3,(%r24),%xmm6,%xmm3
>>> +	vroundss $4,(%r24),%xmm6,%xmm3
>>
>> There's still the pending question of whether these really need to be treated
>> as invalid (rather than being converted to VRNDSCALE*). Also (to a lesser
>> degree) for {LD,ST}MXCSR.
>>
> 
> GCC already performs these conversions, and many instructions require this. it has converted vstmxcsr/vldmxcsr to ldmxcsr/stmxcsr under APX.

What other instructions are covered by "many"? I don't see a similar pattern
applying to any instructions other than the named ones.

Also, how does it help an assembler programmer if gcc already does the
conversion? Or even a C programmer using inline assembly? It's still not
really clear to me how inline assembly is going to be dealt with in a
fully flexible, yet sufficiently restrictive way. Hence any help that
can be provided to avoid non-standard constructs ought to be put in place
(imo).

>>> --- /dev/null
>>> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
>>> @@ -0,0 +1,28 @@
>>> +# Check Illegal prefix for 64bit EVEX-promoted instructions
>>> +
>>> +        .allow_index_reg
>>> +        .text
>>> +_start:
>>> +	#movbe %r23w,%ax set EVEX.pp = f3 (illegal value).
>>> +	.insn EVEX.L0.f3.M12.W0 0x60, %di, %ax
>>> +	#movbe %r23w,%ax set EVEX.pp = f2 (illegal value).
>>> +	.insn EVEX.L0.f2.M12.W0 0x60, %di, %ax
>>> +	#VSIB vpgatherqq 0x7b(%rbp,%zmm17,8),%zmm16{%k1} set EVEX.P[10] == 0
>>> +	#(illegal value).
>>> +	.byte 0x62, 0xe2, 0xf9, 0x41, 0x91, 0x84, 0xcd, 0x7b, 0x00, 0x00, 0x00
>>> +	.byte 0xff
>>
>> For the purpose of this test (whatever P[10] again is) you don't need a 32-bit
>> displacement, do you? Shorter is (almost always) better in such tests.
>>
> 
> P[10] is a fixed value, in normal EVEX format we don't use this bit.  Dropped 0x7b.
> 
>>> +	#EVEX_MAP4 movbe %r23w,%ax set EVEX.mm == b01 (illegal value).
>>> +	.insn EVEX.L0.66.M13.W0 0x60, %di, %ax
>>> +	#EVEX_MAP4 movbe %r23w,%ax set EVEX.aa(P[17:16]) == b01 (illegal value).
>>
>> There's aaa, but no aa afaik.
>>
> 
> Change it to EVEX.a1a0, aaa is split into two parts in EVEX-promoted format, a3 is NF and a1a0 is a fixed value.
> 
> EVEX.a1a0 (P[17:16]) == b01

But a1a0 isn't a term the documentation uses either. Just to repeat an earlier
request of mine: These comments need to be easy to decipher and follow.
Hence they want to use terminology that is as easily understandable as possible.
One way to express what you're after may be "EVEX.aaa[1:0] (P[17:16])".
I'm sure there are further ways that stay in line with what the SDM uses.
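
To illustrate the P[...] numbering these comments rely on, here is a minimal
Python sketch that splits the three EVEX payload bytes into named fields. The
bit positions are the ones the comments in this thread use (P[10], P[17:16],
P[20], P[22:21], P[23]); the helper name evex_fields is made up for this
illustration, and the sample bytes are the three that follow 0x62 in the
quoted VSIB test.

```python
# Sketch only: split the EVEX payload P[23:0] into the fields that the
# test comments refer to. evex_fields() is a hypothetical helper name.

def evex_fields(p0: int, p1: int, p2: int) -> dict:
    """p0, p1, p2 are the three bytes following the 0x62 escape byte."""
    p = p0 | (p1 << 8) | (p2 << 16)  # P[7:0], P[15:8], P[23:16]

    def bits(hi: int, lo: int) -> int:
        # Extract P[hi:lo] as an unsigned integer.
        return (p >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        "pp": bits(9, 8),
        "P[10]": bits(10, 10),
        "aaa": bits(18, 16),
        "aaa[1:0]": bits(17, 16),
        "b": bits(20, 20),
        "L'L": bits(22, 21),
        "z": bits(23, 23),
    }


# Payload bytes taken from the quoted ".byte 0x62, 0xe2, 0xf9, 0x41, ..." line.
fields = evex_fields(0xE2, 0xF9, 0x41)
for name, value in fields.items():
    print(f"EVEX.{name} = {value:#b}")
```

For those bytes P[10] comes out as 0, which is the condition that test is
meant to exercise.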

>>> +	.insn EVEX.L0.66.M12.W0 0x60, %di, %ax{%k1}
>>> +	#EVEX_MAP4 movbe %r18w,%ax set EVEX.zL'L == 0b11 (illegal value).
>>
>> How's z relevant when the value is just a 2-bit one? And then z should likely
>> have a separate test (also for the from-VEX case below)?
>>
> 
> Modified it and added EVEX.z testcase for MAP4 and from-VEX.
> 
>>> +	.insn EVEX.L0.66.M12.W0 0x60, %di, {rd-sae}, %ax
>>> +	#EVEX from VEX bzhi %ebx,%eax,%ecx EVEX.P[17:16](EVEX.aa) == 1 (illegal value).
>>> +	.insn EVEX.L0.NP.0f38.W0 0xf5, %eax, %ebx, %ecx{%k1}
>>> +	.byte 0xff, 0xff, 0xff
>>> +	#EVEX from VEX bzhi %ebx,%eax,%ecx EVEX.P[22:21](EVEX.L’L) == 1 (illegal value).
>>> +	.insn EVEX.L0.NP.0f38.W0 0xf5, %eax, {rd-sae}, %ebx, %ecx
>>> +	.byte 0xff, 0xff, 0xff
>>
>> If you arranged for a ModR/M byte of 0xc9 (among other possibilities) in both
>> of these cases, you could avoid the .byte lines altogether afaict.
>>
>  
> Use other value instead of 0xc9,
> 
>         #EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[17:16](EVEX.aa) == 0b01
>         #(illegal value).
>         .insn EVEX.L0.NP.0f38.W0 0xf5, %rax, (%rax,%rbx), %rcx{%k1}
>         #EVEX from VEX bzhi %rax,(%rax,%rbx),%ecx EVEX.P[22:21](EVEX.L’L) == 0b01
>         #(illegal value).
>         .insn EVEX.L1.NP.0f38.W0 0xf5, %rax, (%rax,%rbx), %rcx
>         #EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[23](EVEX.z) == 0b1
>         #(illegal value).
>         .insn EVEX.L0.NP.0f38.W0 0xf5, %rax, (%rax,%rbx), %rcx {%k7}{z}
>         #EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[20](EVEX.b) == 0b1
>         #(illegal value).
>         .insn EVEX.L0.NP.0f38.W0 0xf5, %rax ,(%rax,%rbx){1to8}, %rcx       

Hmm, yes, these are memory operands now. I didn't check what ModR/M bytes these
specifically encode to, but with the .byte gone I expect things are better now.
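
For reference, a minimal sketch of how the ModR/M and SIB bytes for these
operands could be worked out by hand. It assumes the usual register numbers
(%rax = 0, %rcx = 1, %rbx = 3) and that %rcx lands in ModRM.reg, as it does
for BZHI; this has not been checked against gas output, and the helper names
are hypothetical.

```python
# Sketch only: compose and split ModR/M / SIB bytes. Helper names are
# made up; register numbers are the standard x86-64 low-3-bit encodings.

RAX, RCX, RBX = 0, 1, 3


def modrm(mod: int, reg: int, rm: int) -> int:
    # mod in bits 7:6, reg in bits 5:3, rm in bits 2:0
    return (mod << 6) | (reg << 3) | rm


def sib(scale_log2: int, index: int, base: int) -> int:
    # scale in bits 7:6, index in bits 5:3, base in bits 2:0
    return (scale_log2 << 6) | (index << 3) | base


def split_modrm(byte: int) -> tuple:
    # Returns (mod, reg, rm)
    return byte >> 6, (byte >> 3) & 7, byte & 7


# (%rax,%rbx) with %rcx in ModRM.reg (an assumption, see above):
# mod = 00 (no displacement), rm = 100 (SIB byte follows).
modrm_byte = modrm(0b00, RCX, 0b100)
sib_byte = sib(0, RBX, RAX)
print(f"ModR/M = {modrm_byte:#04x}, SIB = {sib_byte:#04x}")

# The register-only form suggested earlier: 0xc9 has mod = 11, reg = rm = 1.
print(split_modrm(0xC9))
```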

Btw, readability of these would greatly improve if there was a blank line
between each .insn and the following comment. That way what belongs together
and what is separate can be spotted at first glance.

Jan
