From: Jan Beulich
To: "Cui, Lili"
Cc: "Lu, Hongjiu", "binutils@sourceware.org"
Subject: Re: [PATCH v3 5/9] Add tests for APX GPR32 with extend evex prefix
Date: Mon, 11 Dec 2023 09:55:08 +0100
Message-ID: <28164078-97c4-412f-8195-68bb2c6404da@suse.com>
References: <20231124070213.3886483-1-lili.cui@intel.com>
 <20231124070213.3886483-5-lili.cui@intel.com>
 <1e96d6fc-e657-4235-ac11-4cc6772effcc@suse.com>

On 11.12.2023 07:16, Cui, Lili wrote:
>> On 24.11.2023 08:02, Cui, Lili wrote:
>>> +#VEX without evex
>>> + vaesimc (%r27), %xmm3
>>> + vaeskeygenassist $7,(%r27),%xmm3
>>> + vblendpd $7,(%r27),%xmm6,%xmm2
>>> + vblendpd $7,(%r27),%ymm6,%ymm2
>>> + vblendps $7,(%r27),%xmm6,%xmm2
>>> + vblendps $7,(%r27),%ymm6,%ymm2
>>> + vblendvpd %xmm4,(%r27),%xmm2,%xmm7
>>> + vblendvpd %ymm4,(%r27),%ymm2,%ymm7
>>> + vblendvps %xmm4,(%r27),%xmm2,%xmm7
>>> + vblendvps %ymm4,(%r27),%ymm2,%ymm7
>>> + vdppd $7,(%r27),%xmm6,%xmm2
>>> + vdpps $7,(%r27),%xmm6,%xmm2
>>> + vdpps $7,(%r27),%ymm6,%ymm2
>>> + vhaddpd (%r27),%xmm6,%xmm5
>>> + vhaddpd (%r27),%ymm6,%ymm5
>>> + vhsubps (%r27),%xmm6,%xmm5
>>> + vhsubps (%r27),%ymm6,%ymm5
>>> + vlddqu (%r27),%xmm4
>>> + vlddqu (%r27),%ymm4
>>> + vldmxcsr (%r27)
>>> + vmaskmovpd %xmm4,%xmm6,(%r27)
>>> + vmaskmovpd %ymm4,%ymm6,(%r27)
>>> + vmaskmovpd (%r27),%xmm4,%xmm6
>>> + vmaskmovpd (%r27),%ymm4,%ymm6
>>> + vmaskmovps %xmm4,%xmm6,(%r27)
>>> + vmaskmovps %ymm4,%ymm6,(%r27)
>>> + vmaskmovps (%r27),%xmm4,%xmm6
>>> + vmaskmovps (%r27),%ymm4,%ymm6
>>> + vmovmskpd %xmm4,%r27d
>>> + vmovmskpd %xmm8,%r27d
>>> + vmovmskps %xmm4,%r27d
>>> + vmovmskps %ymm8,%r27d
>>> + vpblendd $7,(%r27),%xmm6,%xmm2
>>> + vpblendd $7,(%r27),%ymm6,%ymm2
>>> + vpblendvb %xmm4,(%r27),%xmm2,%xmm7
>>> + vpblendvb %ymm4,(%r27),%ymm2,%ymm7
>>> + vpblendw $7,(%r27),%xmm6,%xmm2
>>> + vpblendw $7,(%r27),%ymm6,%ymm2
>>> + vpcmpeqb (%r26),%ymm6,%ymm2
>>> + vpcmpeqd (%r26),%ymm6,%ymm2
>>> + vpcmpeqq (%r16),%ymm6,%ymm2
>>> + vpcmpeqw (%r16),%ymm6,%ymm2
>>> + vpcmpestri $7,(%r27),%xmm6
>>> + vpcmpestrm $7,(%r27),%xmm6
>>> + vpcmpgtb (%r26),%ymm6,%ymm2
>>> + vpcmpgtd (%r26),%ymm6,%ymm2
>>> + vpcmpgtq (%r16),%ymm6,%ymm2
>>> + vpcmpgtw (%r16),%ymm6,%ymm2
>>> + vpcmpistri $100,(%r25),%xmm6
>>> + vpcmpistrm $100,(%r25),%xmm6
>>> + vperm2f128 $7,(%r27),%ymm6,%ymm2
>>> + vperm2i128 $7,(%r27),%ymm6,%ymm2
>>> + vphaddd (%r27),%xmm6,%xmm7
>>> + vphaddd (%r27),%ymm6,%ymm7
>>> + vphaddsw (%r27),%xmm6,%xmm7
>>> + vphaddsw (%r27),%ymm6,%ymm7
>>> + vphaddw (%r27),%xmm6,%xmm7
>>> + vphaddw (%r27),%ymm6,%ymm7
>>> + vphminposuw (%r27),%xmm6
>>> + vphsubd (%r27),%xmm6,%xmm7
>>> + vphsubd (%r27),%ymm6,%ymm7
>>> + vphsubsw (%r27),%xmm6,%xmm7
>>> + vphsubsw (%r27),%ymm6,%ymm7
>>> + vphsubw (%r27),%xmm6,%xmm7
>>> + vphsubw (%r27),%ymm6,%ymm7
>>> + vpmaskmovd %xmm4,%xmm6,(%r27)
>>> + vpmaskmovd %ymm4,%ymm6,(%r27)
>>> + vpmaskmovd (%r27),%xmm4,%xmm6
>>> + vpmaskmovd (%r27),%ymm4,%ymm6
>>> + vpmaskmovq %xmm4,%xmm6,(%r27)
>>> + vpmaskmovq %ymm4,%ymm6,(%r27)
>>> + vpmaskmovq (%r27),%xmm4,%xmm6
>>> + vpmaskmovq (%r27),%ymm4,%ymm6
>>> + vpmovmskb %xmm4,%r27
>>> + vpmovmskb %ymm4,%r27d
>>> + vpsignb (%r27),%xmm6,%xmm7
>>> + vpsignb (%r27),%xmm6,%xmm7
>>> + vpsignd (%r27),%xmm6,%xmm7
>>> + vpsignd (%r27),%xmm6,%xmm7
>>> + vpsignw (%r27),%xmm6,%xmm7
>>> + vpsignw (%r27),%xmm6,%xmm7
>>> + vptest (%r27),%ymm6
>>> + vrcpps (%r27),%xmm6
>>> + vrcpps (%r27),%ymm6
>>> + vrcpss (%r27),%xmm6,%xmm6
>>> + vroundpd $1,(%r24),%xmm6
>>> + vroundps $2,(%r24),%xmm6
>>> + vroundsd $3,(%r24),%xmm6,%xmm3
>>> + vroundss $4,(%r24),%xmm6,%xmm3
>>
>> There's still the pending question of whether these really need to be treated
>> as invalid (rather than being converted to VRNDSCALE*).
>> Also (to a lesser degree) for {LD,ST}MXCSR.
>>
>
> GCC already performs these conversions, and many instructions require this. it has converted vstmxcsr/vldmxcsr to ldmxcsr/stmxcsr under APX.

What other instructions are covered by "many"? I don't see a similar pattern
applying for other than the named ones.

Also, how does it help an assembler programmer if gcc already does the
conversion? Or even a C programmer using inline assembly? It's still not
really clear to me how inline assembly is going to be dealt with in a fully
flexible, yet sufficiently restricting way. Hence any help that can be provided
to avoid non-standard constructs ought to be put in place (imo).

>>> --- /dev/null
>>> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
>>> @@ -0,0 +1,28 @@
>>> +# Check Illegal prefix for 64bit EVEX-promoted instructions
>>> +
>>> + .allow_index_reg
>>> + .text
>>> +_start:
>>> + #movbe %r23w,%ax set EVEX.pp = f3 (illegal value).
>>> + .insn EVEX.L0.f3.M12.W0 0x60, %di, %ax
>>> + #movbe %r23w,%ax set EVEX.pp = f2 (illegal value).
>>> + .insn EVEX.L0.f2.M12.W0 0x60, %di, %ax
>>> + #VSIB vpgatherqq 0x7b(%rbp,%zmm17,8),%zmm16{%k1} set EVEX.P[10] == 0
>>> + #(illegal value).
>>> + .byte 0x62, 0xe2, 0xf9, 0x41, 0x91, 0x84, 0xcd, 0x7b, 0x00, 0x00, 0x00
>>> + .byte 0xff
>>
>> For the purpose of this test (whatever P[10] again is) you don't need a 32-bit
>> displacement, do you? Shorter is (almost always) better in such tests.
>>
>
> P[10] is a fixed value, in normal EVEX format we don't use this bit. Dropped 0x7b.
>
>>> + #EVEX_MAP4 movbe %r23w,%ax set EVEX.mm == b01 (illegal value).
>>> + .insn EVEX.L0.66.M13.W0 0x60, %di, %ax
>>> + #EVEX_MAP4 movbe %r23w,%ax set EVEX.aa(P[17:16]) == b01 (illegal value).
>>
>> There's aaa, but no aa afaik.
>>
>
> Change it to EVEX.a1a0, aaa is split into two parts in EVEX-promoted format, a3 is NF and a1a0 is a fixed value.
>
> EVEX.a1a0 (P[17:16]) == b01

But a1a0 isn't a term documentation uses either.
Just to repeat an earlier request of mine: These comments need to be easy to
decipher and follow. Hence they want to use as easily understandable
terminology as possible. One way to express what you're after may be
"EVEX.aaa[1:0] (P[17:16])". I'm sure there are further ways while staying in
line with what the SDM uses.

>>> + .insn EVEX.L0.66.M12.W0 0x60, %di, %ax{%k1}
>>> + #EVEX_MAP4 movbe %r18w,%ax set EVEX.zL'L == 0b11 (illegal value).
>>
>> How's z relevant when the value is just a 2-bit one? And then z should likely
>> have a separate test (also for the from-VEX case below)?
>>
>
> Modified it and added EVEX.z testcase for MAP4 and from-VEX.
>
>>> + .insn EVEX.L0.66.M12.W0 0x60, %di, {rd-sae}, %ax
>>> + #EVEX from VEX bzhi %ebx,%eax,%ecx EVEX.P[17:16](EVEX.aa) == 1 (illegal value).
>>> + .insn EVEX.L0.NP.0f38.W0 0xf5, %eax, %ebx, %ecx{%k1}
>>> + .byte 0xff, 0xff, 0xff
>>> + #EVEX from VEX bzhi %ebx,%eax,%ecx EVEX.P[22:21](EVEX.L’L) == 1 (illegal value).
>>> + .insn EVEX.L0.NP.0f38.W0 0xf5, %eax, {rd-sae}, %ebx, %ecx
>>> + .byte 0xff, 0xff, 0xff
>>
>> If you arranged for a ModR/M byte of 0xc9 (among other possibilities) in both
>> of these cases, you could avoid the .byte lines altogether afaict.
>>
>
> Use other value instead of 0xc9,
>
> #EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[17:16](EVEX.aa) == 0b01
> #(illegal value).
> .insn EVEX.L0.NP.0f38.W0 0xf5, %rax, (%rax,%rbx), %rcx{%k1}
> #EVEX from VEX bzhi %rax,(%rax,%rbx),%ecx EVEX.P[22:21](EVEX.L’L) == 0b01
> #(illegal value).
> .insn EVEX.L1.NP.0f38.W0 0xf5, %rax, (%rax,%rbx), %rcx
> #EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[23](EVEX.z) == 0b1
> #(illegal value).
> .insn EVEX.L0.NP.0f38.W0 0xf5, %rax, (%rax,%rbx), %rcx {%k7}{z}
> #EVEX from VEX bzhi %rax,(%rax,%rbx),%rcx EVEX.P[20](EVEX.b) == 0b1
> #(illegal value).
> .insn EVEX.L0.NP.0f38.W0 0xf5, %rax ,(%rax,%rbx){1to8}, %rcx

Hmm, yes, these are memory operands now.
I didn't check what ModR/M bytes these specifically encode to, but with the
.byte gone I expect things are better now.

Btw, readability of these would greatly improve if between each .insn and the
following comment there was a blank line. That way what belongs together and
what is separate can be spotted at first glance.

Jan
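
For readers trying to map the P[n] positions used in the test comments onto
the raw bytes, here is a minimal sketch in Python. It is not part of the patch
or of binutils: the helper name decode_evex_payload is hypothetical, and it
assumes the usual numbering in which P[7:0] is the first payload byte after
0x62, P[15:8] the second and P[23:16] the third. The sample bytes are the ones
from the quoted vpgatherqq .byte line.

```python
def decode_evex_payload(p0: int, p1: int, p2: int) -> dict:
    """Split the three EVEX payload bytes (those following 0x62) into fields.

    Hypothetical helper for illustration only; field names follow the
    terminology used in the thread (aaa, L'L, z, b, pp, mmm).
    """
    # Assumption: P[7:0] is the first payload byte, P[23:16] the third.
    p = p0 | (p1 << 8) | (p2 << 16)

    def bits(hi: int, lo: int) -> int:
        """Return P[hi:lo] as an unsigned integer."""
        return (p >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        "mmm": bits(2, 0),        # opcode map select
        "pp": bits(9, 8),         # implied SIMD prefix
        "P[10]": bits(10, 10),    # the bit the first bad-encoding test clears
        "W": bits(15, 15),
        "aaa": bits(18, 16),      # opmask register
        "aaa[1:0]": bits(17, 16), # i.e. P[17:16]
        "b": bits(20, 20),        # P[20]
        "L'L": bits(22, 21),      # P[22:21]
        "z": bits(23, 23),        # P[23]
    }


# Payload bytes from: .byte 0x62, 0xe2, 0xf9, 0x41, 0x91, ...
fields = decode_evex_payload(0xE2, 0xF9, 0x41)
for name, value in fields.items():
    print(f"{name:9} = {value:#x}")
```

With these bytes P[10] comes out as 0, which is the condition the test comment
describes, while aaa is 1 (the {%k1} mask) and L'L is 2 (512-bit vector length).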