From: Haochen Jiang <haochen.jiang@intel.com>
To: binutils@sourceware.org
Cc: jbeulich@suse.com, hjl.tools@gmail.com
Subject: [PATCH] x86: Eliminate unnecessary {evex} prefixes (was: x86: Add %ME for instructions do not need {evex} prefix with memory)
Date: Tue,  3 Dec 2024 17:17:33 +0800
Message-ID: <20241203091733.1978559-1-haochen.jiang@intel.com>

From: "H.J. Lu" <hjl.tools@gmail.com>

Hi all,

I renamed the patch; its previous name was
"x86: Add %ME for instructions do not need {evex} prefix with memory".
The changes and the patch description are embedded below.

Tested on x86_64-pc-linux-gnu. Ok for trunk?

Thx,
Haochen

---

Changes in v3:

  - Further split table for vpsra{d,q} to fully eliminate {evex} prefix
    for vpsraq. Also add testcases for vpsraq.

---

For several instructions, including vps{l,r}l{d,q,w,dq} and vpsra{d,w},
the VEX encoding has no form like the following:

	vpsrlw $0x1f,(%r15,%rcx,4),%xmm0

Thus the {evex} prefix should not be printed when their second operand
is memory, while it is still needed when the second operand is a
register. Add a new macro, %ME, to handle this.
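
As a rough illustration of the rule above (a sketch only, not the code in
opcodes/i386-dis.c): ModRM.mod is the top two bits of the ModRM byte, and
mod == 3 means the r/m operand is a register. The helper names below are
hypothetical. 0x14 is the ModRM byte from the vpsrlw memory test in this
patch; 0xd1 is an assumed register-form byte (mod=3, reg=2, rm=1) picked
only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of i386-dis.c: extract the 2-bit
   ModRM.mod field (bits 7:6).  mod == 3 selects a register operand,
   any other value selects a memory operand.  */
static unsigned int
modrm_mod (uint8_t modrm)
{
  return (modrm >> 6) & 3;
}

/* Hypothetical model of the %ME gate only.  Returns false when the
   operand is memory, in which case "{evex} " is never printed.
   Returns true when the operand is a register, in which case the
   existing %XE conditions (not modelled here) still have to pass.  */
static bool
me_falls_through_to_xe (uint8_t modrm)
{
  return modrm_mod (modrm) == 3;
}

/* Small usage example.  */
void
me_gate_demo (void)
{
  /* 62 f1 7d 08 71 14 88 1f: vpsrlw $0x1f,(%eax,%ecx,4),%xmm0  */
  assert (!me_falls_through_to_xe (0x14));
  /* Assumed register form, e.g. vpsrlw $0x1f,%xmm1,%xmm0  */
  assert (me_falls_through_to_xe (0xd1));
}
```

A true result only means the %XE conditions get evaluated; it does not by
itself mean "{evex} " is printed.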

vpsraq has no VEX encoding at all, so the {evex} prefix should never be
printed for it.
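
To make the vpsra{d,q} split concrete, here is a minimal sketch of
selecting on EVEX.W, assuming the usual EVEX layout where W is bit 7 of
the third prefix byte. The function and array names are hypothetical and
the array only mirrors the two entries of the new EVEX_W_0F72_R_4 table;
the real lookup goes through the VEX_W_TABLE machinery. The byte
sequences are taken from the test expectations in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: EVEX is 0x62 followed by three payload bytes.
   EVEX.W is bit 7 of the second payload byte, i.e. evex[2].  */
static unsigned int
evex_w (const uint8_t *evex)
{
  return (evex[2] >> 7) & 1;
}

/* Stand-in for the two entries of EVEX_W_0F72_R_4, indexed by W.
   W=0 is vpsrad, which keeps the %ME handling; W=1 is vpsraq, which
   carries no pseudo-prefix macro at all.  */
static const char *const vpsra_imm_templates[2] = {
  "%MEvpsrad",
  "vpsraq",
};

const char *
vpsra_imm_template (const uint8_t *evex)
{
  return vpsra_imm_templates[evex_w (evex)];
}

/* Small usage example with prefixes from evex-only.d.  */
void
vpsra_split_demo (void)
{
  const uint8_t vpsrad_pfx[] = { 0x62, 0xf1, 0x7d, 0x08 };
  const uint8_t vpsraq_pfx[] = { 0x62, 0xf1, 0xfd, 0x08 };

  assert (evex_w (vpsrad_pfx) == 0);
  assert (evex_w (vpsraq_pfx) == 1);
}
```

With W=1 the template has no %XE or %ME macro, so the operand kind no
longer matters for vpsraq.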

gas/ChangeLog:

	PR binutils/32403
	* testsuite/gas/i386/i386.exp: Run new test.
	* testsuite/gas/i386/x86-64.exp: Ditto.
	* testsuite/gas/i386/evex-only.d: New test.
	* testsuite/gas/i386/evex-only.s: Ditto.
	* testsuite/gas/i386/x86-64-evex-only.d: Ditto.
	* testsuite/gas/i386/x86-64-evex-only.s: Ditto.

opcodes/ChangeLog:

	PR binutils/32403
	* i386-dis-evex-reg.h: Use %ME instead of %XE for vps{l,r}l{w,dq}
	and vpsraw. Split table for vpsra{d,q}.
	* i386-dis-evex-w.h: Use %ME instead of %XE for vps{l,r}l{d,q}
	and vpsrad. Eliminate vpsraq {evex} prefix.
	* i386-dis-evex.h: Split table for vpsra{d,q}.
	* i386-dis.c (EVEX_W_0F72_R_4): New.
	(EVEX_W_0FE2): Ditto.
	(struct dis386): Add comment for %ME.
	(putop): Handle %ME.

Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
---
 gas/testsuite/gas/i386/evex-only.d        | 24 +++++++++++++++++++++++
 gas/testsuite/gas/i386/evex-only.s        | 18 +++++++++++++++++
 gas/testsuite/gas/i386/i386.exp           |  1 +
 gas/testsuite/gas/i386/x86-64-evex-only.d | 24 +++++++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-evex-only.s | 18 +++++++++++++++++
 gas/testsuite/gas/i386/x86-64.exp         |  1 +
 opcodes/i386-dis-evex-reg.h               | 12 ++++++------
 opcodes/i386-dis-evex-w.h                 | 18 +++++++++++++----
 opcodes/i386-dis-evex.h                   |  2 +-
 opcodes/i386-dis.c                        |  8 ++++++++
 10 files changed, 115 insertions(+), 11 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/evex-only.d
 create mode 100644 gas/testsuite/gas/i386/evex-only.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-evex-only.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-evex-only.s

diff --git a/gas/testsuite/gas/i386/evex-only.d b/gas/testsuite/gas/i386/evex-only.d
new file mode 100644
index 00000000000..373715d664c
--- /dev/null
+++ b/gas/testsuite/gas/i386/evex-only.d
@@ -0,0 +1,24 @@
+#objdump: -dw
+#name: AVX512 instructions do not need {evex} prefix with memory
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 f1 7d 08 71 14 88 1f\s+vpsrlw\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 71 24 88 1f\s+vpsraw\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 71 34 88 1f\s+vpsllw\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 72 24 88 1f\s+vpsrad\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 72 24 88 1f\s+vpsraq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 72 e1 1f\s+vpsraq\s+\$0x1f,%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 f1 f5 08 e2 04 88\s+vpsraq\s+\(%eax,%ecx,4\),%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 f1 f5 08 e2 c2\s+vpsraq\s+%xmm2,%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 73 1c 88 1f\s+vpsrldq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 73 3c 88 1f\s+vpslldq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 72 14 88 1f\s+vpsrld\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 7d 08 72 34 88 1f\s+vpslld\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 73 14 88 1f\s+vpsrlq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 73 34 88 1f\s+vpsllq\s+\$0x1f,\(%eax,%ecx,4\),%xmm0
+#pass
diff --git a/gas/testsuite/gas/i386/evex-only.s b/gas/testsuite/gas/i386/evex-only.s
new file mode 100644
index 00000000000..e198c19e089
--- /dev/null
+++ b/gas/testsuite/gas/i386/evex-only.s
@@ -0,0 +1,18 @@
+# Check instructions do not need {evex} prefix under memory operand
+
+	.text
+_start:
+	vpsrlw	$0x1f,(%eax,%ecx,4),%xmm0
+	vpsraw	$0x1f,(%eax,%ecx,4),%xmm0
+	vpsllw	$0x1f,(%eax,%ecx,4),%xmm0
+	vpsrad	$0x1f,(%eax,%ecx,4),%xmm0
+	vpsraq	$0x1f,(%eax,%ecx,4),%xmm0
+	vpsraq	$0x1f,%xmm1,%xmm0
+	vpsraq	(%eax,%ecx,4),%xmm1,%xmm0
+	vpsraq	%xmm2,%xmm1,%xmm0
+	vpsrldq	$0x1f,(%eax,%ecx,4),%xmm0
+	vpslldq	$0x1f,(%eax,%ecx,4),%xmm0
+	vpsrld	$0x1f,(%eax,%ecx,4),%xmm0
+	vpslld	$0x1f,(%eax,%ecx,4),%xmm0
+	vpsrlq	$0x1f,(%eax,%ecx,4),%xmm0
+	vpsllq	$0x1f,(%eax,%ecx,4),%xmm0
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 34352026ccc..c74f1412dcc 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -294,6 +294,7 @@ if [gas_32_check] then {
     run_dump_test "evex-lig-2"
     run_dump_test "evex-wig1"
     run_dump_test "evex-wig1-intel"
+    run_dump_test "evex-only"
     run_dump_test "evex-no-scale-32"
     run_dump_test "sse2avx"
     run_dump_test "unaligned-vector-move"
diff --git a/gas/testsuite/gas/i386/x86-64-evex-only.d b/gas/testsuite/gas/i386/x86-64-evex-only.d
new file mode 100644
index 00000000000..16565b81fe0
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-evex-only.d
@@ -0,0 +1,24 @@
+#objdump: -dw
+#name: x86-64 AVX512 instructions do not need {evex} prefix with memory
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 d1 7d 08 71 14 8f 1f\s+vpsrlw\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 71 24 8f 1f\s+vpsraw\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 71 34 8f 1f\s+vpsllw\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 72 24 8f 1f\s+vpsrad\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 fd 08 72 24 8f 1f\s+vpsraq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 f1 fd 08 72 e1 1f\s+vpsraq\s+\$0x1f,%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 d1 f5 08 e2 04 8f\s+vpsraq\s+\(%r15,%rcx,4\),%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 f1 f5 08 e2 c2\s+vpsraq\s+%xmm2,%xmm1,%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 73 1c 8f 1f\s+vpsrldq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 73 3c 8f 1f\s+vpslldq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 72 14 8f 1f\s+vpsrld\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 7d 08 72 34 8f 1f\s+vpslld\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 fd 08 73 14 8f 1f\s+vpsrlq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+\s*[a-f0-9]+:\s*62 d1 fd 08 73 34 8f 1f\s+vpsllq\s+\$0x1f,\(%r15,%rcx,4\),%xmm0
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-evex-only.s b/gas/testsuite/gas/i386/x86-64-evex-only.s
new file mode 100644
index 00000000000..f8024fd6016
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-evex-only.s
@@ -0,0 +1,18 @@
+# Check instructions do not need {evex} prefix under memory operand
+
+	.text
+_start:
+	vpsrlw	$0x1f,(%r15,%rcx,4),%xmm0
+	vpsraw	$0x1f,(%r15,%rcx,4),%xmm0
+	vpsllw	$0x1f,(%r15,%rcx,4),%xmm0
+	vpsrad	$0x1f,(%r15,%rcx,4),%xmm0
+	vpsraq	$0x1f,(%r15,%rcx,4),%xmm0
+	vpsraq	$0x1f,%xmm1,%xmm0
+	vpsraq	(%r15,%rcx,4),%xmm1,%xmm0
+	vpsraq	%xmm2,%xmm1,%xmm0
+	vpsrldq	$0x1f,(%r15,%rcx,4),%xmm0
+	vpslldq	$0x1f,(%r15,%rcx,4),%xmm0
+	vpsrld	$0x1f,(%r15,%rcx,4),%xmm0
+	vpslld	$0x1f,(%r15,%rcx,4),%xmm0
+	vpsrlq	$0x1f,(%r15,%rcx,4),%xmm0
+	vpsllq	$0x1f,(%r15,%rcx,4),%xmm0
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index fee227d2a4d..d8b767c3d54 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -242,6 +242,7 @@ run_dump_test "x86-64-evex-lig-2"
 run_dump_test "x86-64-evex-wig1"
 run_dump_test "x86-64-evex-wig1-intel"
 run_dump_test "x86-64-evex-wig2"
+run_dump_test "x86-64-evex-only"
 run_dump_test "evex-no-scale-64"
 run_dump_test "x86-64-sse2avx"
 run_dump_test "x86-64-unaligned-vector-move"
diff --git a/opcodes/i386-dis-evex-reg.h b/opcodes/i386-dis-evex-reg.h
index eda0e824aef..893ec4f83f2 100644
--- a/opcodes/i386-dis-evex-reg.h
+++ b/opcodes/i386-dis-evex-reg.h
@@ -2,11 +2,11 @@
   {
     { Bad_Opcode },
     { Bad_Opcode },
-    { "%XEvpsrlw",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpsrlw",	{ Vex, EXx, Ib }, PREFIX_DATA },
     { Bad_Opcode },
-    { "%XEvpsraw",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpsraw",	{ Vex, EXx, Ib }, PREFIX_DATA },
     { Bad_Opcode },
-    { "%XEvpsllw",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpsllw",	{ Vex, EXx, Ib }, PREFIX_DATA },
   },
   /* REG_EVEX_0F72 */
   {
@@ -14,7 +14,7 @@
     { "vprol%DQ",	{ Vex, EXx, Ib }, PREFIX_DATA },
     { VEX_W_TABLE (EVEX_W_0F72_R_2) },
     { Bad_Opcode },
-    { "%XEvpsra%DQ",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { VEX_W_TABLE (EVEX_W_0F72_R_4) },
     { Bad_Opcode },
     { VEX_W_TABLE (EVEX_W_0F72_R_6) },
   },
@@ -23,11 +23,11 @@
     { Bad_Opcode },
     { Bad_Opcode },
     { VEX_W_TABLE (EVEX_W_0F73_R_2) },
-    { "%XEvpsrldqY",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpsrldqY",	{ Vex, EXx, Ib }, PREFIX_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { VEX_W_TABLE (EVEX_W_0F73_R_6) },
-    { "%XEvpslldqY",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpslldqY",	{ Vex, EXx, Ib }, PREFIX_DATA },
   },
   /* REG_EVEX_0F38C6_L_2 */
   {
diff --git a/opcodes/i386-dis-evex-w.h b/opcodes/i386-dis-evex-w.h
index 27053b49b9c..30af9ddd698 100644
--- a/opcodes/i386-dis-evex-w.h
+++ b/opcodes/i386-dis-evex-w.h
@@ -50,21 +50,26 @@
   },
   /* EVEX_W_0F72_R_2 */
   {
-    { "%XEvpsrld",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpsrld",	{ Vex, EXx, Ib }, PREFIX_DATA },
+  },
+  /* EVEX_W_0F72_R_4 */
+  {
+    { "%MEvpsrad",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "vpsraq",	{ Vex, EXx, Ib }, PREFIX_DATA },
   },
   /* EVEX_W_0F72_R_6 */
   {
-    { "%XEvpslld",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpslld",	{ Vex, EXx, Ib }, PREFIX_DATA },
   },
   /* EVEX_W_0F73_R_2 */
   {
     { Bad_Opcode },
-    { "%XEvpsrlq",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpsrlq",	{ Vex, EXx, Ib }, PREFIX_DATA },
   },
   /* EVEX_W_0F73_R_6 */
   {
     { Bad_Opcode },
-    { "%XEvpsllq",	{ Vex, EXx, Ib }, PREFIX_DATA },
+    { "%MEvpsllq",	{ Vex, EXx, Ib }, PREFIX_DATA },
   },
   /* EVEX_W_0F76 */
   {
@@ -149,6 +154,11 @@
     { Bad_Opcode },
     { VEX_LEN_TABLE (VEX_LEN_0FD6) },
   },
+  /* EVEX_W_0FE2 */
+  {
+    { "%XEvpsrad",	{ XM, Vex, EXxmm }, PREFIX_DATA },
+    { "vpsraq",	{ XM, Vex, EXxmm }, PREFIX_DATA },
+  },
   /* EVEX_W_0FE6_P_1 */
   {
     { "%XEvcvtdq2pd",	{ XM, EXEvexHalfBcstXmmq }, 0 },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index 9e0546e66ca..af3bd21de9d 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -256,7 +256,7 @@ static const struct dis386 evex_table[][256] = {
     /* E0 */
     { "%XEvpavgb",	{ XM, Vex, EXx }, PREFIX_DATA },
     { "%XEvpsraw",	{ XM, Vex, EXxmm }, PREFIX_DATA },
-    { "%XEvpsra%DQ",	{ XM, Vex, EXxmm }, PREFIX_DATA },
+    { VEX_W_TABLE (EVEX_W_0FE2) },
     { "%XEvpavgw",	{ XM, Vex, EXx }, PREFIX_DATA },
     { "%XEvpmulhuw",	{ XM, Vex, EXx }, PREFIX_DATA },
     { "%XEvpmulhw",	{ XM, Vex, EXx }, PREFIX_DATA },
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index ea3a8e2f860..e9471a14f70 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1657,6 +1657,7 @@ enum
   EVEX_W_0F6F_P_3,
   EVEX_W_0F70_P_2,
   EVEX_W_0F72_R_2,
+  EVEX_W_0F72_R_4,
   EVEX_W_0F72_R_6,
   EVEX_W_0F73_R_2,
   EVEX_W_0F73_R_6,
@@ -1677,6 +1678,7 @@ enum
   EVEX_W_0FD3,
   EVEX_W_0FD4,
   EVEX_W_0FD6,
+  EVEX_W_0FE2,
   EVEX_W_0FE6_P_1,
   EVEX_W_0FE7,
   EVEX_W_0FF2,
@@ -1819,6 +1821,8 @@ struct dis386 {
    "XV" => print "{vex} " pseudo prefix
    "XE" => print "{evex} " pseudo prefix if no EVEX-specific functionality is
 	   is used by an EVEX-encoded (AVX512VL) instruction.
+   "ME" => Similar to "XE", but only print "{evex} " when there is no
+	   memory operand.
    "NF" => print "{nf} " pseudo prefix when EVEX.NF = 1 and print "{evex} "
 	   pseudo prefix when instructions without NF, EGPR and VVVV,
    "NE" => don't print "{evex} " pseudo prefix for some special instructions
@@ -10594,6 +10598,10 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
 	    {
 	      switch (last[0])
 		{
+		case 'M':
+		  if (ins->modrm.mod != 3)
+		    break;
+		/* Fall through.  */
 		case 'X':
 		  if (!ins->vex.evex || ins->vex.b || ins->vex.ll >= 2
 		      || (ins->rex2 & 7)
-- 
2.31.1

