From: Haochen Jiang <haochen.jiang@intel.com>
To: binutils@sourceware.org
Cc: jbeulich@suse.com, hjl.tools@gmail.com
Subject: [PATCH] x86: Eliminate unnecessary {evex} prefixes (was: x86: Add %ME for instructions do not need {evex} prefix with memory)
Date: Tue, 3 Dec 2024 17:17:33 +0800
Message-ID: <20241203091733.1978559-1-haochen.jiang@intel.com>
From: "H.J. Lu" <hjl.tools@gmail.com>
Hi all,
I renamed the patch; its previous name was
"x86: Add %ME for instructions do not need {evex} prefix with memory".
The changes and the patch description are embedded below.
Tested on x86_64-pc-linux-gnu. Ok for trunk?
Thx,
Haochen
---
Changes in v3:
- Further split table for vpsra{d,q} to fully eliminate {evex} prefix
for vpsraq. Also add testcases for vpsraq.
---
Several instructions, including vps{l,r}l{d,q,w,dq} and vpsra{d,w}, have
no VEX-encoded version of the following form:
vpsrlw $0x1f,(%r15,%rcx,4),%xmm0
Thus, the {evex} prefix should not be printed when their second operand
is memory, while it is still needed when the second operand is a
register. Add a new macro %ME to solve this problem.
For vpsraq, there is no VEX version, so the {evex} prefix should always
be eliminated.
gas/ChangeLog:
PR binutils/32403
* testsuite/gas/i386/i386.exp: Run new test.
* testsuite/gas/i386/x86-64.exp: Ditto.
* testsuite/gas/i386/evex-only.d: New test.
* testsuite/gas/i386/evex-only.s: Ditto.
* testsuite/gas/i386/x86-64-evex-only.d: Ditto.
* testsuite/gas/i386/x86-64-evex-only.s: Ditto.
opcodes/ChangeLog:
PR binutils/32403
* i386-dis-evex-reg.h: Use %ME instead of %XE for vps{l,r}l{w,dq}
and vpsraw. Split table for vpsra{d,q}.
* i386-dis-evex-w.h: Use %ME instead of %XE for vps{l,r}l{d,q}
and vpsrad. Eliminate vpsraq {evex} prefix.
* i386-dis-evex.h: Split table for vpsra{d,q}.
* i386-dis.c (EVEX_W_0F72_R_4): New.
(EVEX_W_0FE2): Ditto.
(struct dis386): Add comment for %ME.
(putop): Handle %ME.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
---
gas/testsuite/gas/i386/evex-only.d | 24 +++++++++++++++++++++++
gas/testsuite/gas/i386/evex-only.s | 18 +++++++++++++++++
gas/testsuite/gas/i386/i386.exp | 1 +
gas/testsuite/gas/i386/x86-64-evex-only.d | 24 +++++++++++++++++++++++
gas/testsuite/gas/i386/x86-64-evex-only.s | 18 +++++++++++++++++
gas/testsuite/gas/i386/x86-64.exp | 1 +
opcodes/i386-dis-evex-reg.h | 12 ++++++------
opcodes/i386-dis-evex-w.h | 18 +++++++++++++----
opcodes/i386-dis-evex.h | 2 +-
opcodes/i386-dis.c | 8 ++++++++
10 files changed, 115 insertions(+), 11 deletions(-)
create mode 100644 gas/testsuite/gas/i386/evex-only.d
create mode 100644 gas/testsuite/gas/i386/evex-only.s
create mode 100644 gas/testsuite/gas/i386/x86-64-evex-only.d
create mode 100644 gas/testsuite/gas/i386/x86-64-evex-only.s
diff --git a/gas/testsuite/gas/i386/evex-only.d b/gas/testsuite/gas/i386/evex-only.d
new file mode 100644
index 00000000000..373715d664c
--- /dev/null
+++ b/gas/testsuite/gas/i386/evex-only.d
@@ -0,0 +1,24 @@
+#objdump: -dw
+#name: AVX512 instructions do not need {evex} prefix with memory
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 f1 7d 08 71 14 88 1f\s+vpsrlw\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 71 24 88 1f\s+vpsraw\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 71 34 88 1f\s+vpsllw\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 72 24 88 1f\s+vpsrad\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 72 24 88 1f\s+vpsraq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 72 e1 1f\s+vpsraq\s+\$0x1f,%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 f1 f5 08 e2 04 88\s+vpsraq\s+\(%eax,%ecx,4\),%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 f1 f5 08 e2 c2\s+vpsraq\s+%xmm2,%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 73 1c 88 1f\s+vpsrldq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 73 3c 88 1f\s+vpslldq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 72 14 88 1f\s+vpsrld\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 72 34 88 1f\s+vpslld\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 73 14 88 1f\s+vpsrlq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 73 34 88 1f\s+vpsllq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+#pass
diff --git a/gas/testsuite/gas/i386/evex-only.s b/gas/testsuite/gas/i386/evex-only.s
new file mode 100644
index 00000000000..e198c19e089
--- /dev/null
+++ b/gas/testsuite/gas/i386/evex-only.s
@@ -0,0 +1,18 @@
+# Check instructions do not need {evex} prefix under memory operand
+
+ .text
+_start:
+ vpsrlw $0x1f,(%eax,%ecx,4),%xmm0
+ vpsraw $0x1f,(%eax,%ecx,4),%xmm0
+ vpsllw $0x1f,(%eax,%ecx,4),%xmm0
+ vpsrad $0x1f,(%eax,%ecx,4),%xmm0
+ vpsraq $0x1f,(%eax,%ecx,4),%xmm0
+ vpsraq $0x1f,%xmm1,%xmm0
+ vpsraq (%eax,%ecx,4),%xmm1,%xmm0
+ vpsraq %xmm2,%xmm1,%xmm0
+ vpsrldq $0x1f,(%eax,%ecx,4),%xmm0
+ vpslldq $0x1f,(%eax,%ecx,4),%xmm0
+ vpsrld $0x1f,(%eax,%ecx,4),%xmm0
+ vpslld $0x1f,(%eax,%ecx,4),%xmm0
+ vpsrlq $0x1f,(%eax,%ecx,4),%xmm0
+ vpsllq $0x1f,(%eax,%ecx,4),%xmm0
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 34352026ccc..c74f1412dcc 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -294,6 +294,7 @@ if [gas_32_check] then {
run_dump_test "evex-lig-2"
run_dump_test "evex-wig1"
run_dump_test "evex-wig1-intel"
+ run_dump_test "evex-only"
run_dump_test "evex-no-scale-32"
run_dump_test "sse2avx"
run_dump_test "unaligned-vector-move"
diff --git a/gas/testsuite/gas/i386/x86-64-evex-only.d b/gas/testsuite/gas/i386/x86-64-evex-only.d
new file mode 100644
index 00000000000..16565b81fe0
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-evex-only.d
@@ -0,0 +1,24 @@
+#objdump: -dw
+#name: x86-64 AVX512 instructions do not need {evex} prefix with memory
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 d1 7d 08 71 14 8f 1f\s+vpsrlw\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 71 24 8f 1f\s+vpsraw\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 71 34 8f 1f\s+vpsllw\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 72 24 8f 1f\s+vpsrad\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 fd 08 72 24 8f 1f\s+vpsraq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 72 e1 1f\s+vpsraq\s+\$0x1f,%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 d1 f5 08 e2 04 8f\s+vpsraq\s+\(%r15,%rcx,4\),%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 f1 f5 08 e2 c2\s+vpsraq\s+%xmm2,%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 73 1c 8f 1f\s+vpsrldq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 73 3c 8f 1f\s+vpslldq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 72 14 8f 1f\s+vpsrld\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 72 34 8f 1f\s+vpslld\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 fd 08 73 14 8f 1f\s+vpsrlq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 fd 08 73 34 8f 1f\s+vpsllq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-evex-only.s b/gas/testsuite/gas/i386/x86-64-evex-only.s
new file mode 100644
index 00000000000..f8024fd6016
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-evex-only.s
@@ -0,0 +1,18 @@
+# Check instructions do not need {evex} prefix under memory operand
+
+ .text
+_start:
+ vpsrlw $0x1f,(%r15,%rcx,4),%xmm0
+ vpsraw $0x1f,(%r15,%rcx,4),%xmm0
+ vpsllw $0x1f,(%r15,%rcx,4),%xmm0
+ vpsrad $0x1f,(%r15,%rcx,4),%xmm0
+ vpsraq $0x1f,(%r15,%rcx,4),%xmm0
+ vpsraq $0x1f,%xmm1,%xmm0
+ vpsraq (%r15,%rcx,4),%xmm1,%xmm0
+ vpsraq %xmm2,%xmm1,%xmm0
+ vpsrldq $0x1f,(%r15,%rcx,4),%xmm0
+ vpslldq $0x1f,(%r15,%rcx,4),%xmm0
+ vpsrld $0x1f,(%r15,%rcx,4),%xmm0
+ vpslld $0x1f,(%r15,%rcx,4),%xmm0
+ vpsrlq $0x1f,(%r15,%rcx,4),%xmm0
+ vpsllq $0x1f,(%r15,%rcx,4),%xmm0
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index fee227d2a4d..d8b767c3d54 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -242,6 +242,7 @@ run_dump_test "x86-64-evex-lig-2"
run_dump_test "x86-64-evex-wig1"
run_dump_test "x86-64-evex-wig1-intel"
run_dump_test "x86-64-evex-wig2"
+run_dump_test "x86-64-evex-only"
run_dump_test "evex-no-scale-64"
run_dump_test "x86-64-sse2avx"
run_dump_test "x86-64-unaligned-vector-move"
diff --git a/opcodes/i386-dis-evex-reg.h b/opcodes/i386-dis-evex-reg.h
index eda0e824aef..893ec4f83f2 100644
--- a/opcodes/i386-dis-evex-reg.h
+++ b/opcodes/i386-dis-evex-reg.h
@@ -2,11 +2,11 @@
{
{ Bad_Opcode },
{ Bad_Opcode },
- { "%XEvpsrlw", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpsrlw", { Vex, EXx, Ib }, PREFIX_DATA },
{ Bad_Opcode },
- { "%XEvpsraw", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpsraw", { Vex, EXx, Ib }, PREFIX_DATA },
{ Bad_Opcode },
- { "%XEvpsllw", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpsllw", { Vex, EXx, Ib }, PREFIX_DATA },
},
/* REG_EVEX_0F72 */
{
@@ -14,7 +14,7 @@
{ "vprol%DQ", { Vex, EXx, Ib }, PREFIX_DATA },
{ VEX_W_TABLE (EVEX_W_0F72_R_2) },
{ Bad_Opcode },
- { "%XEvpsra%DQ", { Vex, EXx, Ib }, PREFIX_DATA },
+ { VEX_W_TABLE (EVEX_W_0F72_R_4) },
{ Bad_Opcode },
{ VEX_W_TABLE (EVEX_W_0F72_R_6) },
},
@@ -23,11 +23,11 @@
{ Bad_Opcode },
{ Bad_Opcode },
{ VEX_W_TABLE (EVEX_W_0F73_R_2) },
- { "%XEvpsrldqY", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpsrldqY", { Vex, EXx, Ib }, PREFIX_DATA },
{ Bad_Opcode },
{ Bad_Opcode },
{ VEX_W_TABLE (EVEX_W_0F73_R_6) },
- { "%XEvpslldqY", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpslldqY", { Vex, EXx, Ib }, PREFIX_DATA },
},
/* REG_EVEX_0F38C6_L_2 */
{
diff --git a/opcodes/i386-dis-evex-w.h b/opcodes/i386-dis-evex-w.h
index 27053b49b9c..30af9ddd698 100644
--- a/opcodes/i386-dis-evex-w.h
+++ b/opcodes/i386-dis-evex-w.h
@@ -50,21 +50,26 @@
},
/* EVEX_W_0F72_R_2 */
{
- { "%XEvpsrld", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpsrld", { Vex, EXx, Ib }, PREFIX_DATA },
+ },
+ /* EVEX_W_0F72_R_4 */
+ {
+ { "%MEvpsrad", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "vpsraq", { Vex, EXx, Ib }, PREFIX_DATA },
},
/* EVEX_W_0F72_R_6 */
{
- { "%XEvpslld", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpslld", { Vex, EXx, Ib }, PREFIX_DATA },
},
/* EVEX_W_0F73_R_2 */
{
{ Bad_Opcode },
- { "%XEvpsrlq", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpsrlq", { Vex, EXx, Ib }, PREFIX_DATA },
},
/* EVEX_W_0F73_R_6 */
{
{ Bad_Opcode },
- { "%XEvpsllq", { Vex, EXx, Ib }, PREFIX_DATA },
+ { "%MEvpsllq", { Vex, EXx, Ib }, PREFIX_DATA },
},
/* EVEX_W_0F76 */
{
@@ -149,6 +154,11 @@
{ Bad_Opcode },
{ VEX_LEN_TABLE (VEX_LEN_0FD6) },
},
+ /* EVEX_W_0FE2 */
+ {
+ { "%XEvpsrad", { XM, Vex, EXxmm }, PREFIX_DATA },
+ { "vpsraq", { XM, Vex, EXxmm }, PREFIX_DATA },
+ },
/* EVEX_W_0FE6_P_1 */
{
{ "%XEvcvtdq2pd", { XM, EXEvexHalfBcstXmmq }, 0 },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index 9e0546e66ca..af3bd21de9d 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -256,7 +256,7 @@ static const struct dis386 evex_table[][256] = {
/* E0 */
{ "%XEvpavgb", { XM, Vex, EXx }, PREFIX_DATA },
{ "%XEvpsraw", { XM, Vex, EXxmm }, PREFIX_DATA },
- { "%XEvpsra%DQ", { XM, Vex, EXxmm }, PREFIX_DATA },
+ { VEX_W_TABLE (EVEX_W_0FE2) },
{ "%XEvpavgw", { XM, Vex, EXx }, PREFIX_DATA },
{ "%XEvpmulhuw", { XM, Vex, EXx }, PREFIX_DATA },
{ "%XEvpmulhw", { XM, Vex, EXx }, PREFIX_DATA },
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index ea3a8e2f860..e9471a14f70 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1657,6 +1657,7 @@ enum
EVEX_W_0F6F_P_3,
EVEX_W_0F70_P_2,
EVEX_W_0F72_R_2,
+ EVEX_W_0F72_R_4,
EVEX_W_0F72_R_6,
EVEX_W_0F73_R_2,
EVEX_W_0F73_R_6,
@@ -1677,6 +1678,7 @@ enum
EVEX_W_0FD3,
EVEX_W_0FD4,
EVEX_W_0FD6,
+ EVEX_W_0FE2,
EVEX_W_0FE6_P_1,
EVEX_W_0FE7,
EVEX_W_0FF2,
@@ -1819,6 +1821,8 @@ struct dis386 {
"XV" => print "{vex} " pseudo prefix
"XE" => print "{evex} " pseudo prefix if no EVEX-specific functionality is
is used by an EVEX-encoded (AVX512VL) instruction.
+ "ME" => Similar to "XE", but only print "{evex} " when there is no
+ memory operand.
"NF" => print "{nf} " pseudo prefix when EVEX.NF = 1 and print "{evex} "
pseudo prefix when instructions without NF, EGPR and VVVV,
"NE" => don't print "{evex} " pseudo prefix for some special instructions
@@ -10594,6 +10598,10 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
{
switch (last[0])
{
+ case 'M':
+ if (ins->modrm.mod != 3)
+ break;
+ /* Fall through. */
case 'X':
if (!ins->vex.evex || ins->vex.b || ins->vex.ll >= 2
|| (ins->rex2 & 7)
--
2.31.1
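
For illustration only (this is not part of the patch): a minimal sketch of
the rule that %ME adds on top of %XE. The struct and the function name are
hypothetical, and the single uses_evex_only flag is an assumed stand-in for
the real conditions checked in putop (), which are more involved.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the decoder state; the real
   instr_info in opcodes/i386-dis.c carries many more fields.  */
struct insn_model
{
  unsigned int modrm_mod;   /* ModRM.mod: 3 means a register operand.  */
  bool is_evex;             /* Instruction is EVEX-encoded.  */
  bool uses_evex_only;      /* Assumed flag: some EVEX-specific
                               functionality is in use.  */
};

/* Return true when "{evex} " would be printed for template letter
   KIND ('X' for %XE, 'M' for %ME).  */
static bool
prints_evex_prefix (char kind, const struct insn_model *insn)
{
  switch (kind)
    {
    case 'M':
      /* %ME: with a memory operand there is no VEX counterpart,
         so the pseudo prefix is never needed.  */
      if (insn->modrm_mod != 3)
        return false;
      /* Fall through.  */
    case 'X':
      return insn->is_evex && !insn->uses_evex_only;
    default:
      return false;
    }
}
```

With this model, an EVEX-encoded vpsrlw with a memory operand (mod != 3)
gets no "{evex} " under 'M', while the register form (mod == 3) is treated
exactly like 'X'.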