public inbox for binutils@sourceware.org
* [PATCH 0/3] Support APX zero-upper
@ 2024-05-20  6:21 Cui, Lili
  2024-05-20  6:22 ` [PATCH 1/3] x86: Split REX/REX2 old registers judgment Cui, Lili
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Cui, Lili @ 2024-05-20  6:21 UTC (permalink / raw)
  To: binutils; +Cc: hjl.tools, jbeulich

To handle the EVEX old register check together with the REX/REX2 one, I added a new patch 1/3 that splits "REX/REX2 old register checking" and "add empty rex prefix" into two separate branches. Patch 2/3 then extends the check to the EVEX format.

1. Adjusted the indentation of test cases in patch 2/3 (previously 1/2).
2. Removed the redundancy of modifying {evex} in patch 3/3 (previously 2/2).
3. Since Intel will not update the new setzu form in the spec, I will not add set<cc> reg32/reg64 support in this patch.
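
For readers unfamiliar with the feature, here is a minimal sketch (not part of the series, and not binutils code) of what "zero-upper" means for an 8-bit or 16-bit destination, based on the comment added to putop() in patch 3/3. The function name write_partial and its parameters are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of writing an 8-bit or 16-bit result into a
   64-bit general purpose register.  Not binutils code.  */
static uint64_t
write_partial (uint64_t old_reg, uint64_t value, unsigned width, bool zu)
{
  assert (width == 8 || width == 16);
  uint64_t mask = (UINT64_C (1) << width) - 1;

  if (zu)
    /* Zero-upper form (imulzu, setzu<cc>): the bits above the operand
       size are cleared.  */
    return value & mask;

  /* Without ZU: the bits above the operand size are preserved.  */
  return (old_reg & ~mask) | (value & mask);
}
```

With zu set, a 16-bit result of 0x1234 leaves the full register equal to 0x1234; without it, the upper 48 bits keep their previous contents.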

Cui, Lili (3):
  x86: Split REX/REX2 old registers judgment.
  Add check for 8-bit old registers in EVEX format
  Support APX zero-upper

 gas/config/tc-i386.c                         |  33 +++---
 gas/testsuite/gas/i386/rex-bad.l             |   8 +-
 gas/testsuite/gas/i386/x86-64-apx-inval.l    |   3 +
 gas/testsuite/gas/i386/x86-64-apx-inval.s    |   2 +
 gas/testsuite/gas/i386/x86-64-apx-zu-intel.d |  78 +++++++++++++
 gas/testsuite/gas/i386/x86-64-apx-zu-inval.l |  25 +++++
 gas/testsuite/gas/i386/x86-64-apx-zu-inval.s |  28 +++++
 gas/testsuite/gas/i386/x86-64-apx-zu.d       |  78 +++++++++++++
 gas/testsuite/gas/i386/x86-64-apx-zu.s       |  73 ++++++++++++
 gas/testsuite/gas/i386/x86-64.exp            |   3 +
 opcodes/i386-dis-evex-prefix.h               | 112 +++++++++++++++++++
 opcodes/i386-dis-evex.h                      |  36 +++---
 opcodes/i386-dis.c                           |  29 +++++
 opcodes/i386-gen.c                           |   1 +
 opcodes/i386-opc.h                           |   4 +
 opcodes/i386-opc.tbl                         |   6 +
 16 files changed, 480 insertions(+), 39 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu.s

-- 
2.34.1



* [PATCH 1/3] x86: Split REX/REX2 old registers judgment.
  2024-05-20  6:21 [PATCH 0/3] Support APX zero-upper Cui, Lili
@ 2024-05-20  6:22 ` Cui, Lili
  2024-05-21 12:18   ` Jan Beulich
  2024-05-20  6:22 ` [PATCH 2/3] Add check for 8-bit old registers in EVEX format Cui, Lili
  2024-05-20  6:22 ` [PATCH 3/3] Support APX zero-upper Cui, Lili
  2 siblings, 1 reply; 13+ messages in thread
From: Cui, Lili @ 2024-05-20  6:22 UTC (permalink / raw)
  To: binutils; +Cc: hjl.tools, jbeulich

Split "REX/REX2 old register checking" and "add empty rex prefix"
into two separate branches.

gas/ChangeLog:

        * config/tc-i386.c (establish_rex): Split the old register check
        from the empty REX prefix handling.
---
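The split described above can be sketched as a standalone model. This is only an illustration under simplifying assumptions: struct byte_operand and both function names are hypothetical, and the real code works on i.types[], i.op[] and i.rex/i.rex2 in tc-i386.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified operand description.  */
struct byte_operand
{
  bool is_byte_reg;   /* class == Reg && bitfield.byte */
  bool needs_rex;     /* stands in for the RegRex64 flag, e.g. %sil */
};

/* Branch 1: an empty REX prefix is added only when the first or last
   operand is a byte register carrying RegRex64 and no REX, REX2 or
   VEX/EVEX encoding is already in use.  */
static bool
need_empty_rex (const struct byte_operand *first,
                const struct byte_operand *last,
                bool have_rex, bool rex2_encoding, bool vex_encoding)
{
  assert (first != NULL && last != NULL);
  return ((first->is_byte_reg && first->needs_rex)
          || (last->is_byte_reg && last->needs_rex))
         && !rex2_encoding && !vex_encoding && !have_rex;
}

/* Branch 2: old byte registers are converted (or rejected) whenever a
   REX or REX2 prefix ends up being present.  */
static bool
must_check_old_regs (bool have_rex, bool have_rex2)
{
  return have_rex || have_rex2;
}
```

In this model, an operand pair such as %al and %sil with no other prefix makes branch 1 fire. Since that sets the REX prefix, branch 2 then runs as well.
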
 gas/config/tc-i386.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 339e849a971..2fbd90bedb8 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4303,22 +4303,20 @@ static void establish_rex (void)
   /* Respect a user-specified REX prefix.  */
   i.rex |= i.prefix[REX_PREFIX] & REX_OPCODE;
 
-  /* For 8 bit registers we need an empty rex prefix.  Also if the
-     instruction already has a prefix, we need to convert old
-     registers to new ones.  */
-
-  if ((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
-       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0
-	   || i.rex2 != 0))
-      || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
-	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0
-	      || i.rex2 != 0)))
-    {
-      unsigned int x;
-
-      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
-	i.rex |= REX_OPCODE;
-      for (x = first; x <= last; x++)
+  /* For 8 bit registers without a prefix, we need an empty rex prefix.  */
+  if (((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
+	&& ((i.op[first].regs->reg_flags & RegRex64) != 0 ))
+       || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
+	   && (i.op[last].regs->reg_flags & RegRex64) != 0 ))
+      && !is_apx_rex2_encoding () && !is_any_vex_encoding (&i.tm) && !i.rex)
+    i.rex |= REX_OPCODE;
+
+  /* For REX/REX2 prefix instructions, we need to convert old registers
+     (AL, CL, DL and BL) to new ones (AXL, CXL, DXL and BXL) and report bad
+     for AH, CH, DH and BH.  */
+  if (i.rex || i.rex2)
+    {
+      for (unsigned int x = first; x <= last; x++)
 	{
 	  /* Look for 8 bit operand that uses old registers.  */
 	  if (i.types[x].bitfield.class == Reg && i.types[x].bitfield.byte
-- 
2.34.1



* [PATCH 2/3] Add check for 8-bit old registers in EVEX format
  2024-05-20  6:21 [PATCH 0/3] Support APX zero-upper Cui, Lili
  2024-05-20  6:22 ` [PATCH 1/3] x86: Split REX/REX2 old registers judgment Cui, Lili
@ 2024-05-20  6:22 ` Cui, Lili
  2024-05-21 12:24   ` Jan Beulich
  2024-05-20  6:22 ` [PATCH 3/3] Support APX zero-upper Cui, Lili
  2 siblings, 1 reply; 13+ messages in thread
From: Cui, Lili @ 2024-05-20  6:22 UTC (permalink / raw)
  To: binutils; +Cc: hjl.tools, jbeulich

Since APX allows legacy instructions to be encoded with EVEX, we need to
check for the 8-bit old registers in EVEX format as well.  Also adjust the
expected test results accordingly.

gas/ChangeLog:

        * config/tc-i386.c (establish_rex): Add invalid check for old byte
        registers in EVEX format.
        * testsuite/gas/i386/rex-bad.l: Adjust expected error messages.
        * testsuite/gas/i386/x86-64-apx-inval.l: Add new test.
        * testsuite/gas/i386/x86-64-apx-inval.s: Ditto.
---
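As a rough standalone model of the extended check (an illustration only; the enum and function names are hypothetical), the decision for an old 8-bit register can be written as below. It assumes the usual gas numbering where reg_num 0..3 are AL/CL/DL/BL and 4..7 are AH/CH/DH/BH, which is what the "reg_num > 3" test in the hunk relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes for this sketch.  */
enum old_reg_check
{
  OLD_REG_OK,        /* no REX/REX2/EVEX involved, nothing to do */
  OLD_REG_CONVERT,   /* AL/CL/DL/BL: use the equivalent new register */
  OLD_REG_BAD        /* AH/CH/DH/BH: cannot be encoded */
};

/* Applies only to old byte registers (those without RegRex64).  */
static enum old_reg_check
check_old_byte_reg (unsigned reg_num, bool have_rex, bool have_rex2,
                    bool is_evex)
{
  assert (reg_num < 8);

  if (!(have_rex || have_rex2 || is_evex))
    return OLD_REG_OK;

  /* "hi" registers have no encoding once such a prefix is present.  */
  if (reg_num > 3)
    return OLD_REG_BAD;

  return OLD_REG_CONVERT;
}
```

Under this model, "{evex} adc %dl,%ah" from the new test maps to check_old_byte_reg (4, false, false, true), which yields OLD_REG_BAD and corresponds to the new error message.
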
 gas/config/tc-i386.c                      | 6 +++---
 gas/testsuite/gas/i386/rex-bad.l          | 8 ++++----
 gas/testsuite/gas/i386/x86-64-apx-inval.l | 3 +++
 gas/testsuite/gas/i386/x86-64-apx-inval.s | 2 ++
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 2fbd90bedb8..5606049c054 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4311,10 +4311,10 @@ static void establish_rex (void)
       && !is_apx_rex2_encoding () && !is_any_vex_encoding (&i.tm) && !i.rex)
     i.rex |= REX_OPCODE;
 
-  /* For REX/REX2 prefix instructions, we need to convert old registers
+  /* For REX/REX2/EVEX prefix instructions, we need to convert old registers
      (AL, CL, DL and BL) to new ones (AXL, CXL, DXL and BXL) and report bad
      for AH, CH, DH and BH.  */
-  if (i.rex || i.rex2)
+  if (i.rex || i.rex2 || i.tm.opcode_modifier.evex)
     {
       for (unsigned int x = first; x <= last; x++)
 	{
@@ -4326,7 +4326,7 @@ static void establish_rex (void)
 	      /* In case it is "hi" register, give up.  */
 	      if (i.op[x].regs->reg_num > 3)
 		as_bad (_("can't encode register '%s%s' in an "
-			  "instruction requiring REX/REX2 prefix"),
+			  "instruction requiring REX/REX2/EVEX prefix"),
 			register_prefix, i.op[x].regs->reg_name);
 
 	      /* Otherwise it is equivalent to the extended register.
diff --git a/gas/testsuite/gas/i386/rex-bad.l b/gas/testsuite/gas/i386/rex-bad.l
index abd4d3045d0..100eda558b7 100644
--- a/gas/testsuite/gas/i386/rex-bad.l
+++ b/gas/testsuite/gas/i386/rex-bad.l
@@ -3,8 +3,8 @@
 .*:5: Error: same .*
 .*:6: Error: same .*
 .*:7: Error: same .*
-.*:9: Error: .* REX/REX2 .*
-.*:10: Error: .* REX/REX2 .*
-.*:12: Error: .* REX/REX2 .*
-.*:13: Error: .* REX/REX2 .*
+.*:9: Error: .* REX/REX2/EVEX .*
+.*:10: Error: .* REX/REX2/EVEX .*
+.*:12: Error: .* REX/REX2/EVEX .*
+.*:13: Error: .* REX/REX2/EVEX .*
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-inval.l b/gas/testsuite/gas/i386/x86-64-apx-inval.l
index 7a870b27b72..4948c520481 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-inval.l
+++ b/gas/testsuite/gas/i386/x86-64-apx-inval.l
@@ -12,3 +12,6 @@
 .*:13: Error: \{nf\} unsupported for `mulx'
 .*:14: Error: \{nf\} cannot be combined with \{vex\}/\{vex3\}
 .*:15: Error: \{nf\} cannot be combined with \{vex\}/\{vex3\}
+.*:16: Error: can't encode register '%ah' in an instruction requiring REX/REX2/EVEX prefix
+.*:17: Error: can't encode register '%ah' in an instruction requiring REX/REX2/EVEX prefix
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-inval.s b/gas/testsuite/gas/i386/x86-64-apx-inval.s
index 0487b885ec8..3d69deabe4d 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-inval.s
+++ b/gas/testsuite/gas/i386/x86-64-apx-inval.s
@@ -13,3 +13,5 @@
 	{nf} mulx %r15,%r15,%r11
 	{nf} {vex} bextr %ecx, %edx, %r10d
 	{vex} {nf} bextr %ecx, %edx, %r10d
+	{nf} add %dl,%ah
+	{evex} adc %dl,%ah
-- 
2.34.1



* [PATCH 3/3] Support APX zero-upper
  2024-05-20  6:21 [PATCH 0/3] Support APX zero-upper Cui, Lili
  2024-05-20  6:22 ` [PATCH 1/3] x86: Split REX/REX2 old registers judgment Cui, Lili
  2024-05-20  6:22 ` [PATCH 2/3] Add check for 8-bit old registers in EVEX format Cui, Lili
@ 2024-05-20  6:22 ` Cui, Lili
  2024-05-22  6:21   ` Jan Beulich
  2 siblings, 1 reply; 13+ messages in thread
From: Cui, Lili @ 2024-05-20  6:22 UTC (permalink / raw)
  To: binutils; +Cc: hjl.tools, jbeulich

gas/ChangeLog:

        * config/tc-i386.c (build_apx_evex_prefix): Handle ZU.
        * testsuite/gas/i386/x86-64.exp: Added new tests for ZU.
        * testsuite/gas/i386/x86-64-apx-zu-intel.d: New test.
        * testsuite/gas/i386/x86-64-apx-zu-inval.l: Ditto.
        * testsuite/gas/i386/x86-64-apx-zu-inval.s: Ditto.
        * testsuite/gas/i386/x86-64-apx-zu.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-zu.s: Ditto.

opcodes/ChangeLog:

        * i386-dis-evex-prefix.h: Handle PREFIX_EVEX_MAP4_40 ~
        PREFIX_EVEX_MAP4_4F.
        * i386-dis-evex.h: Ditto.
        * i386-dis.c (struct dis386): Add new macro 'ZU'.
        (putop): Handle %ZU.
        * i386-gen.c: Added ZU.
        * i386-opc.h: Ditto.
        * i386-opc.tbl: Added new templates to support ZU.
---
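The encoding can be illustrated with a small decoder for the byte patterns used in the new tests. This is a sketch, not the binutils implementation: the macro and function names are hypothetical, the 0x10 value comes from the build_apx_evex_prefix hunk, and the 0x04 NF value is inferred from the 0x18 versus 0x1c bytes in the expected dumps. The helper assumes the EVEX prefix (0x62) is the first byte, so forms with a leading 0x67 prefix would need to be skipped past first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define EVEX_ND_ZU_BIT 0x10   /* i.vex.bytes[3] |= 0x10 in the patch */
#define EVEX_NF_BIT    0x04   /* inferred from the test encodings */

/* True if the ND/ZU bit is set in the fourth EVEX byte.  */
static bool
evex_zu (const uint8_t *insn)
{
  assert (insn[0] == 0x62);
  return (insn[3] & EVEX_ND_ZU_BIT) != 0;
}

/* True if the NF bit is set in the fourth EVEX byte.  */
static bool
evex_nf (const uint8_t *insn)
{
  assert (insn[0] == 0x62);
  return (insn[3] & EVEX_NF_BIT) != 0;
}

/* Mimic the effect of the %ZU macro: "zu" is inserted between the
   base mnemonic and its condition suffix when the bit is set.  */
static void
zu_mnemonic (char *buf, size_t len, const char *base, const char *cc,
             bool zu)
{
  snprintf (buf, len, "%s%s%s", base, zu ? "zu" : "", cc);
}
```

For example, the bytes 62 f4 7d 18 6b c3 0a decode with ZU set and NF clear (imulzu), while 62 f4 7f 08 4f c0 has both clear ({evex} setg).
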
 gas/config/tc-i386.c                         |   5 +-
 gas/testsuite/gas/i386/x86-64-apx-zu-intel.d |  78 +++++++++++++
 gas/testsuite/gas/i386/x86-64-apx-zu-inval.l |  25 +++++
 gas/testsuite/gas/i386/x86-64-apx-zu-inval.s |  28 +++++
 gas/testsuite/gas/i386/x86-64-apx-zu.d       |  78 +++++++++++++
 gas/testsuite/gas/i386/x86-64-apx-zu.s       |  73 ++++++++++++
 gas/testsuite/gas/i386/x86-64.exp            |   3 +
 opcodes/i386-dis-evex-prefix.h               | 112 +++++++++++++++++++
 opcodes/i386-dis-evex.h                      |  36 +++---
 opcodes/i386-dis.c                           |  29 +++++
 opcodes/i386-gen.c                           |   1 +
 opcodes/i386-opc.h                           |   4 +
 opcodes/i386-opc.tbl                         |   6 +
 13 files changed, 458 insertions(+), 20 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-zu.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 5606049c054..c0625fc210e 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4285,8 +4285,9 @@ build_apx_evex_prefix (void)
     i.vex.bytes[3] &= ~0x08;
 
   /* Encode the NDD bit of the instruction promoted from the legacy
-     space.  */
-  if (i.vex.register_specifier && i.tm.opcode_space == SPACE_EVEXMAP4)
+     space.  ZU shares the same bit with NDD.  */
+  if ((i.vex.register_specifier && i.tm.opcode_space == SPACE_EVEXMAP4)
+      || i.tm.opcode_modifier.zu)
     i.vex.bytes[3] |= 0x10;
 
   /* Encode the NF bit.  */
diff --git a/gas/testsuite/gas/i386/x86-64-apx-zu-intel.d b/gas/testsuite/gas/i386/x86-64-apx-zu-intel.d
new file mode 100644
index 00000000000..9b61d50fd2f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-zu-intel.d
@@ -0,0 +1,78 @@
+#as:
+#objdump: -dw -Mintel
+#name: x86-64 APX ZU instructions with evex prefix encoding(Intel disassembly)
+#source: x86-64-apx-zu.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 f4 7d 18 6b c3 0a[ 	]+imulzu ax,bx,0xa
+\s*[a-f0-9]+:\s*67 62 f4 7d 18 6b 00 0a[ 	]+imulzu ax,WORD PTR \[eax\],0xa
+\s*[a-f0-9]+:\s*62 ec 7d 18 6b c1 0a[ 	]+imulzu r16w,r17w,0xa
+\s*[a-f0-9]+:\s*67 62 6c 7d 18 6b 38 0a[ 	]+imulzu r31w,WORD PTR \[r16d\],0xa
+\s*[a-f0-9]+:\s*62 4c 7d 18 6b ff 0a[ 	]+imulzu r31w,r31w,0xa
+\s*[a-f0-9]+:\s*62 f4 7d 18 69 c3 82 23[ 	]+imulzu ax,bx,0x2382
+\s*[a-f0-9]+:\s*67 62 f4 7d 18 69 00 82 23[ 	]+imulzu ax,WORD PTR \[eax\],0x2382
+\s*[a-f0-9]+:\s*62 ec 7d 18 69 c1 82 23[ 	]+imulzu r16w,r17w,0x2382
+\s*[a-f0-9]+:\s*67 62 6c 7d 18 69 38 82 23[ 	]+imulzu r31w,WORD PTR \[r16d\],0x2382
+\s*[a-f0-9]+:\s*62 4c 7d 18 69 ff 82 23[ 	]+imulzu r31w,r31w,0x2382
+\s*[a-f0-9]+:\s*62 f4 7d 1c 6b c3 0a[ 	]+\{nf\} imulzu ax,bx,0xa
+\s*[a-f0-9]+:\s*67 62 f4 7d 1c 6b 00 0a[ 	]+\{nf\} imulzu ax,WORD PTR \[eax\],0xa
+\s*[a-f0-9]+:\s*62 f4 7d 0c 6b c3 0a[ 	]+\{nf\} imul ax,bx,0xa
+\s*[a-f0-9]+:\s*67 62 f4 7d 0c 6b 00 0a[ 	]+\{nf\} imul ax,WORD PTR \[eax\],0xa
+\s*[a-f0-9]+:\s*62 f4 7d 08 6b c3 0a[ 	]+\{evex\} imul ax,bx,0xa
+\s*[a-f0-9]+:\s*67 62 f4 7d 08 6b 00 0a[ 	]+\{evex\} imul ax,WORD PTR \[eax\],0xa
+\s*[a-f0-9]+:\s*62 f4 7f 18 41 c0[ 	]+setzuno al
+\s*[a-f0-9]+:\s*62 f4 7f 18 42 c3[ 	]+setzub bl
+\s*[a-f0-9]+:\s*62 d4 7f 18 43 c0[ 	]+setzuae r8b
+\s*[a-f0-9]+:\s*62 d4 7f 18 44 c1[ 	]+setzue r9b
+\s*[a-f0-9]+:\s*62 d4 7f 18 45 c2[ 	]+setzune r10b
+\s*[a-f0-9]+:\s*62 fc 7f 18 46 c0[ 	]+setzube r16b
+\s*[a-f0-9]+:\s*62 fc 7f 18 47 c1[ 	]+setzua r17b
+\s*[a-f0-9]+:\s*62 fc 7f 18 48 c2[ 	]+setzus r18b
+\s*[a-f0-9]+:\s*62 fc 7f 18 49 c3[ 	]+setzuns r19b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4a c4[ 	]+setzup r20b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4b c5[ 	]+setzunp r21b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4c c6[ 	]+setzul r22b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4d c7[ 	]+setzuge r23b
+\s*[a-f0-9]+:\s*62 dc 7f 18 4e c0[ 	]+setzule r24b
+\s*[a-f0-9]+:\s*62 dc 7f 18 4f c1[ 	]+setzug r25b
+\s*[a-f0-9]+:\s*d5 91 9f c1[ 	]+setg   r25b
+\s*[a-f0-9]+:\s*62 f4 7f 08 4f c0[ 	]+\{evex\} setg al
+\s*[a-f0-9]+:\s*62 f4 7d 18 6b c3 0a[ 	]+imulzu ax,bx,0xa
+\s*[a-f0-9]+:\s*67 62 f4 7d 18 6b 00 0a[ 	]+imulzu ax,WORD PTR \[eax\],0xa
+\s*[a-f0-9]+:\s*62 ec 7d 18 6b c1 0a[ 	]+imulzu r16w,r17w,0xa
+\s*[a-f0-9]+:\s*67 62 6c 7d 18 6b 38 0a[ 	]+imulzu r31w,WORD PTR \[r16d\],0xa
+\s*[a-f0-9]+:\s*62 4c 7d 18 6b ff 0a[ 	]+imulzu r31w,r31w,0xa
+\s*[a-f0-9]+:\s*62 f4 7d 18 69 c3 82 23[ 	]+imulzu ax,bx,0x2382
+\s*[a-f0-9]+:\s*67 62 f4 7d 18 69 00 82 23[ 	]+imulzu ax,WORD PTR \[eax\],0x2382
+\s*[a-f0-9]+:\s*62 ec 7d 18 69 c1 82 23[ 	]+imulzu r16w,r17w,0x2382
+\s*[a-f0-9]+:\s*67 62 6c 7d 18 69 38 82 23[ 	]+imulzu r31w,WORD PTR \[r16d\],0x2382
+\s*[a-f0-9]+:\s*62 4c 7d 18 69 ff 82 23[ 	]+imulzu r31w,r31w,0x2382
+\s*[a-f0-9]+:\s*62 f4 7d 1c 6b c3 0a[ 	]+\{nf\} imulzu ax,bx,0xa
+\s*[a-f0-9]+:\s*67 62 f4 7d 1c 6b 00 0a[ 	]+\{nf\} imulzu ax,WORD PTR \[eax\],0xa
+\s*[a-f0-9]+:\s*62 f4 7d 0c 6b c3 0a[ 	]+\{nf\} imul ax,bx,0xa
+\s*[a-f0-9]+:\s*67 62 f4 7d 0c 6b 00 0a[ 	]+\{nf\} imul ax,WORD PTR \[eax\],0xa
+\s*[a-f0-9]+:\s*62 f4 7d 08 6b c3 0a[ 	]+\{evex\} imul ax,bx,0xa
+\s*[a-f0-9]+:\s*67 62 f4 7d 08 6b 00 0a[ 	]+\{evex\} imul ax,WORD PTR \[eax\],0xa
+\s*[a-f0-9]+:\s*62 f4 7f 18 41 c0[ 	]+setzuno al
+\s*[a-f0-9]+:\s*62 f4 7f 18 42 c3[ 	]+setzub bl
+\s*[a-f0-9]+:\s*62 d4 7f 18 43 c0[ 	]+setzuae r8b
+\s*[a-f0-9]+:\s*62 d4 7f 18 44 c1[ 	]+setzue r9b
+\s*[a-f0-9]+:\s*62 d4 7f 18 45 c2[ 	]+setzune r10b
+\s*[a-f0-9]+:\s*62 fc 7f 18 46 c0[ 	]+setzube r16b
+\s*[a-f0-9]+:\s*62 fc 7f 18 47 c1[ 	]+setzua r17b
+\s*[a-f0-9]+:\s*62 fc 7f 18 48 c2[ 	]+setzus r18b
+\s*[a-f0-9]+:\s*62 fc 7f 18 49 c3[ 	]+setzuns r19b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4a c4[ 	]+setzup r20b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4b c5[ 	]+setzunp r21b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4c c6[ 	]+setzul r22b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4d c7[ 	]+setzuge r23b
+\s*[a-f0-9]+:\s*62 dc 7f 18 4e c0[ 	]+setzule r24b
+\s*[a-f0-9]+:\s*62 dc 7f 18 4f c1[ 	]+setzug r25b
+\s*[a-f0-9]+:\s*d5 91 9f c1[ 	]+setg   r25b
+\s*[a-f0-9]+:\s*62 f4 7f 08 4f c0[ 	]+\{evex\} setg al
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-zu-inval.l b/gas/testsuite/gas/i386/x86-64-apx-zu-inval.l
new file mode 100644
index 00000000000..d58549208ab
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-zu-inval.l
@@ -0,0 +1,25 @@
+.* Assembler messages:
+.*:6: Error: invalid instruction suffix for `imulzu'
+.*:7: Error: no such instruction: `imulzud \$0xa,%ebx,%eax'
+.*:8: Error: operand size mismatch for `imulzu'
+.*:9: Error: invalid instruction suffix for `imulzu'
+.*:10: Error: no such instruction: `imulzud \$0xaaaa,%ebx,%eax'
+.*:11: Error: operand size mismatch for `imulzu'
+.*:12: Error: operand size mismatch for `imulzu'
+.*:13: Error: operand size mismatch for `imulzu'
+.*:14: Error: can't encode register '%ah' in an instruction requiring REX/REX2/EVEX prefix
+.*:15: Error: operand size mismatch for `setzub'
+.*:16: Error: operand size mismatch for `setzuae'
+.*:17: Error: operand size mismatch for `setzue'
+.*:18: Error: operand size mismatch for `setzune'
+.*:19: Error: operand size mismatch for `setzube'
+.*:20: Error: operand size mismatch for `setzua'
+.*:21: Error: operand size mismatch for `setzus'
+.*:22: Error: operand size mismatch for `setzuns'
+.*:23: Error: operand size mismatch for `setzup'
+.*:24: Error: operand size mismatch for `setzunp'
+.*:25: Error: operand size mismatch for `setzul'
+.*:26: Error: operand size mismatch for `setzuge'
+.*:27: Error: operand size mismatch for `setzule'
+.*:28: Error: operand size mismatch for `setzug'
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-zu-inval.s b/gas/testsuite/gas/i386/x86-64-apx-zu-inval.s
new file mode 100644
index 00000000000..bd81b41ff3f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-zu-inval.s
@@ -0,0 +1,28 @@
+# Check illegal APX-ZU instructions
+
+	.allow_index_reg
+	.text
+_start:
+	imulzub $0xa,%bl,%al
+	imulzud $0xa,%ebx,%eax
+	imulzu $0xa,%rbx,%rax
+	imulzub $0xaaaa,%bl,%al
+	imulzud $0xaaaa,%ebx,%eax
+	imulzu $0xaaaa,%rbx,%rax
+	imulzu $0xaaaa,%ebx,%rax
+	imulzu $0xaaaa,%ebx,%rax
+	setzuno  %ah
+	setzub   %bx
+	setzuae  %r8w
+	setzue   %r9
+	setzune  %r10d
+	setzube  %eax
+	setzua   %ebx
+	setzus   %r18w
+	setzuns  %r19w
+	setzup   %r20d
+	setzunp  %r21w
+	setzul   %r22w
+	setzuge  %r23d
+	setzule  %r24w
+	setzug   %r25w
diff --git a/gas/testsuite/gas/i386/x86-64-apx-zu.d b/gas/testsuite/gas/i386/x86-64-apx-zu.d
new file mode 100644
index 00000000000..c6b0cc45d76
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-zu.d
@@ -0,0 +1,78 @@
+#as:
+#objdump: -dw
+#name: x86-64 APX ZU instructions with evex prefix encoding
+#source: x86-64-apx-zu.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 f4 7d 18 6b c3 0a[ 	]+imulzu \$0xa,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 18 6b 00 0a[ 	]+imulzu \$0xa,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 ec 7d 18 6b c1 0a[ 	]+imulzu \$0xa,%r17w,%r16w
+\s*[a-f0-9]+:\s*67 62 6c 7d 18 6b 38 0a[ 	]+imulzu \$0xa,\(%r16d\),%r31w
+\s*[a-f0-9]+:\s*62 4c 7d 18 6b ff 0a[ 	]+imulzu \$0xa,%r31w,%r31w
+\s*[a-f0-9]+:\s*62 f4 7d 18 69 c3 82 23[ 	]+imulzu \$0x2382,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 18 69 00 82 23[ 	]+imulzu \$0x2382,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 ec 7d 18 69 c1 82 23[ 	]+imulzu \$0x2382,%r17w,%r16w
+\s*[a-f0-9]+:\s*67 62 6c 7d 18 69 38 82 23[ 	]+imulzu \$0x2382,\(%r16d\),%r31w
+\s*[a-f0-9]+:\s*62 4c 7d 18 69 ff 82 23[ 	]+imulzu \$0x2382,%r31w,%r31w
+\s*[a-f0-9]+:\s*62 f4 7d 1c 6b c3 0a[ 	]+\{nf\} imulzu \$0xa,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 1c 6b 00 0a[ 	]+\{nf\} imulzu \$0xa,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 f4 7d 0c 6b c3 0a[ 	]+\{nf\} imul \$0xa,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 0c 6b 00 0a[ 	]+\{nf\} imul \$0xa,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 f4 7d 08 6b c3 0a[ 	]+\{evex\} imul \$0xa,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 08 6b 00 0a[ 	]+\{evex\} imul \$0xa,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 f4 7f 18 41 c0[ 	]+setzuno %al
+\s*[a-f0-9]+:\s*62 f4 7f 18 42 c3[ 	]+setzub %bl
+\s*[a-f0-9]+:\s*62 d4 7f 18 43 c0[ 	]+setzuae %r8b
+\s*[a-f0-9]+:\s*62 d4 7f 18 44 c1[ 	]+setzue %r9b
+\s*[a-f0-9]+:\s*62 d4 7f 18 45 c2[ 	]+setzune %r10b
+\s*[a-f0-9]+:\s*62 fc 7f 18 46 c0[ 	]+setzube %r16b
+\s*[a-f0-9]+:\s*62 fc 7f 18 47 c1[ 	]+setzua %r17b
+\s*[a-f0-9]+:\s*62 fc 7f 18 48 c2[ 	]+setzus %r18b
+\s*[a-f0-9]+:\s*62 fc 7f 18 49 c3[ 	]+setzuns %r19b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4a c4[ 	]+setzup %r20b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4b c5[ 	]+setzunp %r21b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4c c6[ 	]+setzul %r22b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4d c7[ 	]+setzuge %r23b
+\s*[a-f0-9]+:\s*62 dc 7f 18 4e c0[ 	]+setzule %r24b
+\s*[a-f0-9]+:\s*62 dc 7f 18 4f c1[ 	]+setzug %r25b
+\s*[a-f0-9]+:\s*d5 91 9f c1[ 	]+setg   %r25b
+\s*[a-f0-9]+:\s*62 f4 7f 08 4f c0[ 	]+\{evex\} setg %al
+\s*[a-f0-9]+:\s*62 f4 7d 18 6b c3 0a[ 	]+imulzu \$0xa,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 18 6b 00 0a[ 	]+imulzu \$0xa,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 ec 7d 18 6b c1 0a[ 	]+imulzu \$0xa,%r17w,%r16w
+\s*[a-f0-9]+:\s*67 62 6c 7d 18 6b 38 0a[ 	]+imulzu \$0xa,\(%r16d\),%r31w
+\s*[a-f0-9]+:\s*62 4c 7d 18 6b ff 0a[ 	]+imulzu \$0xa,%r31w,%r31w
+\s*[a-f0-9]+:\s*62 f4 7d 18 69 c3 82 23[ 	]+imulzu \$0x2382,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 18 69 00 82 23[ 	]+imulzu \$0x2382,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 ec 7d 18 69 c1 82 23[ 	]+imulzu \$0x2382,%r17w,%r16w
+\s*[a-f0-9]+:\s*67 62 6c 7d 18 69 38 82 23[ 	]+imulzu \$0x2382,\(%r16d\),%r31w
+\s*[a-f0-9]+:\s*62 4c 7d 18 69 ff 82 23[ 	]+imulzu \$0x2382,%r31w,%r31w
+\s*[a-f0-9]+:\s*62 f4 7d 1c 6b c3 0a[ 	]+\{nf\} imulzu \$0xa,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 1c 6b 00 0a[ 	]+\{nf\} imulzu \$0xa,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 f4 7d 0c 6b c3 0a[ 	]+\{nf\} imul \$0xa,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 0c 6b 00 0a[  	]+\{nf\} imul \$0xa,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 f4 7d 08 6b c3 0a[ 	]+\{evex\} imul \$0xa,%bx,%ax
+\s*[a-f0-9]+:\s*67 62 f4 7d 08 6b 00 0a[ 	]+\{evex\} imul \$0xa,\(%eax\),%ax
+\s*[a-f0-9]+:\s*62 f4 7f 18 41 c0[ 	]+setzuno %al
+\s*[a-f0-9]+:\s*62 f4 7f 18 42 c3[ 	]+setzub %bl
+\s*[a-f0-9]+:\s*62 d4 7f 18 43 c0[ 	]+setzuae %r8b
+\s*[a-f0-9]+:\s*62 d4 7f 18 44 c1[ 	]+setzue %r9b
+\s*[a-f0-9]+:\s*62 d4 7f 18 45 c2[ 	]+setzune %r10b
+\s*[a-f0-9]+:\s*62 fc 7f 18 46 c0[ 	]+setzube %r16b
+\s*[a-f0-9]+:\s*62 fc 7f 18 47 c1[ 	]+setzua %r17b
+\s*[a-f0-9]+:\s*62 fc 7f 18 48 c2[ 	]+setzus %r18b
+\s*[a-f0-9]+:\s*62 fc 7f 18 49 c3[ 	]+setzuns %r19b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4a c4[ 	]+setzup %r20b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4b c5[ 	]+setzunp %r21b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4c c6[ 	]+setzul %r22b
+\s*[a-f0-9]+:\s*62 fc 7f 18 4d c7[ 	]+setzuge %r23b
+\s*[a-f0-9]+:\s*62 dc 7f 18 4e c0[ 	]+setzule %r24b
+\s*[a-f0-9]+:\s*62 dc 7f 18 4f c1[ 	]+setzug %r25b
+\s*[a-f0-9]+:\s*d5 91 9f c1[ 	]+setg   %r25b
+\s*[a-f0-9]+:\s*62 f4 7f 08 4f c0[ 	]+\{evex\} setg %al
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-zu.s b/gas/testsuite/gas/i386/x86-64-apx-zu.s
new file mode 100644
index 00000000000..0c287b8aa89
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-zu.s
@@ -0,0 +1,73 @@
+# Check 64bit APX ZU instructions with evex prefix encoding
+
+	.allow_index_reg
+	.text
+_start:
+	imulzu $10, %bx, %ax
+	imulzu $10, (%eax), %ax
+	imulzu $10, %r17w, %r16w
+	imulzu $10, (%r16d), %r31w
+	imulzu $10, %r31w
+	imulzu $9090, %bx, %ax
+	imulzu $9090, (%eax), %ax
+	imulzu $9090, %r17w, %r16w
+	imulzu $9090, (%r16d), %r31w
+	imulzu $9090, %r31w
+	{nf}   imulzu $10, %bx, %ax
+	{nf}   imulzu $10, (%eax), %ax
+	{nf}   imul $10, %bx, %ax       # ZU=0
+	{nf}   imul $10, (%eax), %ax    # ZU=0
+	{evex} imul $10, %bx, %ax       # ZU=0
+	{evex} imul $10, (%eax), %ax    # ZU=0
+	setzuno  %al
+	setzub   %bl
+	setzuae  %r8b
+	setzue   %r9b
+	setzune  %r10b
+	setzube  %r16b
+	setzua   %r17b
+	setzus   %r18b
+	setzuns  %r19b
+	setzup   %r20b
+	setzunp  %r21b
+	setzul   %r22b
+	setzuge  %r23b
+	setzule  %r24b
+	setzug   %r25b
+	setg     %r25b
+	{evex} setg  %al
+
+	.intel_syntax noprefix
+	imulzu ax,bx, 10
+	imulzu ax,WORD PTR [eax],10
+	imulzu r16w,r17w,10
+	imulzu r31w,WORD PTR [r16d],10
+	imulzu r31w,r31w,10
+	imulzu ax,bx,9090
+	imulzu ax,WORD PTR [eax],9090
+	imulzu r16w,r17w,9090
+	imulzu r31w,WORD PTR [r16d],9090
+	imulzu r31w,r31w,9090
+	{nf} imulzu ax,bx, 10
+	{nf} imulzu ax,WORD PTR [eax],10
+	{nf} imul ax,bx, 10               # ZU=0
+	{nf} imul ax,WORD PTR [eax],10    # ZU=0
+	{evex} imul ax,bx, 10             # ZU=0
+	{evex} imul ax,WORD PTR [eax],10  # ZU=0
+	setzuno  al
+	setzub   bl
+	setzuae  r8b
+	setzue   r9b
+	setzune  r10b
+	setzube  r16b
+	setzua   r17b
+	setzus   r18b
+	setzuns  r19b
+	setzup   r20b
+	setzunp  r21b
+	setzul   r22b
+	setzuge  r23b
+	setzule  r24b
+	setzug   r25b
+	setg     r25b
+	{evex} setg  al
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index a50865f6b21..ef1ad2dfe8a 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -389,6 +389,9 @@ run_dump_test "x86-64-apx-jmpabs-intel"
 run_dump_test "x86-64-apx-jmpabs-inval"
 run_dump_test "x86-64-apx-nf"
 run_dump_test "x86-64-apx-nf-intel"
+run_dump_test "x86-64-apx-zu"
+run_dump_test "x86-64-apx-zu-intel"
+run_list_test "x86-64-apx-zu-inval"
 run_dump_test "x86-64-apx_f-evex"
 run_dump_test "sse2avx-apx"
 run_dump_test "sse2avx-evex"
diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
index 5e6423790b5..947fd868dd6 100644
--- a/opcodes/i386-dis-evex-prefix.h
+++ b/opcodes/i386-dis-evex-prefix.h
@@ -338,6 +338,118 @@
     { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
     { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
   },
+  /* PREFIX_EVEX_MAP4_40 */
+  {
+    { "%CFcmovoS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovoS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUo",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_41 */
+  {
+    { "%CFcmovnoS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovnoS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUno",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_42 */
+  {
+    { "%CFcmovbS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovbS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUb",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_43 */
+  {
+    { "%CFcmovaeS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovaeS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUae",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_44 */
+  {
+    { "%CFcmoveS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmoveS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUe",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_45 */
+  {
+    { "%CFcmovneS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovneS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUne",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_46 */
+  {
+    { "%CFcmovbeS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovbeS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUbe",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_47 */
+  {
+    { "%CFcmovaS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovaS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUa",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_48 */
+  {
+    { "%CFcmovsS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovsS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUs",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_49 */
+  {
+    { "%CFcmovnsS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovnsS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUns",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_4A */
+  {
+    { "%CFcmovpS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovpS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUp",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_4B */
+  {
+    { "%CFcmovnpS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovnpS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUnp",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_4C */
+  {
+    { "%CFcmovlS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovlS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUl",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_4D */
+  {
+    { "%CFcmovgeS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovgeS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUge",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_4E */
+  {
+    { "%CFcmovleS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovleS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUle",	{ Eb }, 0 },
+  },
+  /* PREFIX_EVEX_MAP4_4F */
+  {
+    { "%CFcmovgS",	{ VexGv, Gv, Ev }, 0 },
+    { Bad_Opcode },
+    { "%CFcmovgS",	{ VexGv, Gv, Ev }, 0 },
+    { "set%ZUg",	{ Eb }, 0 },
+  },
   /* PREFIX_EVEX_MAP4_F0 */
   {
     { "crc32A", { Gdq, Eb }, 0 },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index b158f2b0b89..ebb3cc20aea 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -947,23 +947,23 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 40 */
-    { "%CFcmovoS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovnoS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovbS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovaeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmoveS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovneS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovbeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovaS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_40) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_41) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_42) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_43) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_44) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_45) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_46) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_47) },
     /* 48 */
-    { "%CFcmovsS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovnsS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovpS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovnpS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovlS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovgeS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovleS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
-    { "%CFcmovgS",	{ VexGv, Gv, Ev }, PREFIX_NP_OR_DATA },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_48) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_49) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_4A) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_4B) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_4C) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_4D) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_4E) },
+    { PREFIX_TABLE (PREFIX_EVEX_MAP4_4F) },
     /* 50 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -993,9 +993,9 @@ static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     /* 68 */
     { Bad_Opcode },
-    { "%NFimulS",	{ Gv, Ev, Iv }, PREFIX_NP_OR_DATA },
+    { "%NFimul%ZUS",	{ Gv, Ev, Iv }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
-    { "%NFimulS",	{ Gv, Ev, sIb }, PREFIX_NP_OR_DATA },
+    { "%NFimul%ZUS",	{ Gv, Ev, sIb }, PREFIX_NP_OR_DATA },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 53f2d6cf6d8..1cbd18398c2 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1199,6 +1199,22 @@ enum
   PREFIX_EVEX_0F3A67,
   PREFIX_EVEX_0F3AC2,
 
+  PREFIX_EVEX_MAP4_40,
+  PREFIX_EVEX_MAP4_41,
+  PREFIX_EVEX_MAP4_42,
+  PREFIX_EVEX_MAP4_43,
+  PREFIX_EVEX_MAP4_44,
+  PREFIX_EVEX_MAP4_45,
+  PREFIX_EVEX_MAP4_46,
+  PREFIX_EVEX_MAP4_47,
+  PREFIX_EVEX_MAP4_48,
+  PREFIX_EVEX_MAP4_49,
+  PREFIX_EVEX_MAP4_4A,
+  PREFIX_EVEX_MAP4_4B,
+  PREFIX_EVEX_MAP4_4C,
+  PREFIX_EVEX_MAP4_4D,
+  PREFIX_EVEX_MAP4_4E,
+  PREFIX_EVEX_MAP4_4F,
   PREFIX_EVEX_MAP4_F0,
   PREFIX_EVEX_MAP4_F1,
   PREFIX_EVEX_MAP4_F2,
@@ -1816,6 +1832,7 @@ struct dis386 {
 	   instruction.
    "NF" => print "{nf} " pseudo prefix when EVEX.NF = 1 and print "{evex} "
 	   pseudo prefix when instructions without NF, EGPR and VVVV,
+   "ZU" => print 'zu' if EVEX.ZU=1.
    "YK" keep unused, to avoid ambiguity with the combined use of Y and K.
    "YX" keep unused, to avoid ambiguity with the combined use of Y and X.
    "LQ" => print 'l' ('d' in Intel mode) or 'q' for memory operand, cond
@@ -10877,6 +10894,18 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
 	      abort ();
 	    }
 	  break;
+	case 'U':
+	  if (l == 1 && (last[0] == 'Z'))
+	    {
+	      /* Although IMUL/SETcc does not support NDD, the EVEX.ND bit is
+		 used to control whether its destination register has its upper
+		 bits zeroed when OSIZE is 16b/8b.  */
+	      if (ins->vex.nd)
+		oappend (ins, "zu");
+	    }
+	  else
+	    abort ();
+	  break;
 	case 'V':
 	  if (l == 0)
 	    {
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 35e2f27df3b..fb4e78df3a2 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -491,6 +491,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (NoEgpr),
   BITFIELD (NF),
   BITFIELD (Rex2),
+  BITFIELD (ZU),
 };
 
 #define CLASS(n) #n, n
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index fa482ca3d37..b63ad7c13f5 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -754,6 +754,9 @@ enum
   /* Instrucion requires REX2 prefix.  */
   Rex2,
 
+  /* Support zero upper */
+  ZU,
+
   /* The last bitfield in i386_opcode_modifier.  */
   Opcode_Modifier_Num
 };
@@ -801,6 +804,7 @@ typedef struct i386_opcode_modifier
   unsigned int noegpr:1;
   unsigned int nf:1;
   unsigned int rex2:1;
+  unsigned int zu:1;
 } i386_opcode_modifier;
 
 /* Operand classes.  */
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 0bcda7ea8e1..faaa92c8d11 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -400,15 +400,19 @@ imul, 0xfaf, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|U
 imul, 0xaf, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4|NF, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 imul, 0x6b, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 imul, 0x6b, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+imulzu, 0x6b, APX_F, Modrm|No_bSuf|No_sSuf|EVexMap4|NF|ZU, { Imm8S, Reg16|Unspecified|BaseIndex, Reg16 }
 imul, 0x69, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm16|Imm32|Imm32S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 imul, 0x69, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVexMap4|NF, { Imm16|Imm32|Imm32S, Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+imulzu, 0x69, APX_F, Modrm|No_bSuf|No_sSuf|EVexMap4|NF|ZU, { Imm16, Reg16|Unspecified|BaseIndex, Reg16 }
 // imul with 2 operands mimics imul with 3 by putting the register in
 // both i.rm.reg & i.rm.regmem fields.  RegKludge enables this
 // transformation.
 imul, 0x6b, i186, Modrm|No_bSuf|No_sSuf|RegKludge, { Imm8S, Reg16|Reg32|Reg64 }
 imul, 0x6b, APX_F, Modrm|No_bSuf|No_sSuf|RegKludge|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64 }
+imulzu, 0x6b, APX_F, Modrm|No_bSuf|No_sSuf|RegKludge|EVexMap4|NF|ZU, { Imm8S, Reg16 }
 imul, 0x69, i186, Modrm|No_bSuf|No_sSuf|RegKludge, { Imm16|Imm32|Imm32S, Reg16|Reg32|Reg64 }
 imul, 0x69, APX_F, Modrm|No_bSuf|No_sSuf|RegKludge|EVexMap4|NF, { Imm16|Imm32|Imm32S, Reg16|Reg32|Reg64 }
+imulzu, 0x69, APX_F, Modrm|No_bSuf|No_sSuf|RegKludge|EVexMap4|NF|ZU, { Imm16, Reg16 }
 
 <mul>
 
@@ -529,6 +533,8 @@ loopne, 0xe0, x64, JumpByte|No_bSuf|No_wSuf|No_sSuf|NoRex64, { Disp8 }
 
 // Set byte on flag instructions.
 set<cc>, 0xf9<cc:opc>/0, i386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf, { Reg8|Unspecified|BaseIndex }
+set<cc>, 0xf24<cc:opc>/0, APX_F, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|VexWIG|EVexMap4, { Reg8 }
+setzu<cc>, 0xf24<cc:opc>/0, APX_F, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|VexWIG|EVexMap4|ZU, { Reg8 }
 
 // String manipulation.
 cmps, 0xa6, 0, W|No_sSuf|RepPrefixOk, {}
-- 
2.34.1


* Re: [PATCH 1/3] x86: Split REX/REX2 old registers judgment.
  2024-05-20  6:22 ` [PATCH 1/3] x86: Split REX/REX2 old registers judgment Cui, Lili
@ 2024-05-21 12:18   ` Jan Beulich
  2024-05-22  1:33     ` Cui, Lili
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2024-05-21 12:18 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hjl.tools, binutils

On 20.05.2024 08:22, Cui, Lili wrote:
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -4303,22 +4303,20 @@ static void establish_rex (void)
>    /* Respect a user-specified REX prefix.  */
>    i.rex |= i.prefix[REX_PREFIX] & REX_OPCODE;
>  
> -  /* For 8 bit registers we need an empty rex prefix.  Also if the
> -     instruction already has a prefix, we need to convert old
> -     registers to new ones.  */
> -
> -  if ((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
> -       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0
> -	   || i.rex2 != 0))
> -      || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
> -	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0
> -	      || i.rex2 != 0)))
> -    {
> -      unsigned int x;
> -
> -      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
> -	i.rex |= REX_OPCODE;
> -      for (x = first; x <= last; x++)
> +  /* For 8 bit registers without a prefix, we need an empty rex prefix.  */

As you're touching this comment, can you please also correct it? It's
not all 8-bit registers which are affected here (as expressed by the
RegRex64 checks).

> +  if (((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
> +	&& ((i.op[first].regs->reg_flags & RegRex64) != 0 ))

Nit: Please parenthesize this line and ...

> +       || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
> +	   && (i.op[last].regs->reg_flags & RegRex64) != 0 ))

... this one the same way, preferably like the latter. Also please omit
the blanks before the closing parentheses.

> +      && !is_apx_rex2_encoding () && !is_any_vex_encoding (&i.tm) && !i.rex)

Is the !i.rex part needed? It wasn't there before. If it's needed, it
would imo better come ahead of the APX/VEX checks.

> +    i.rex |= REX_OPCODE;
> +
> +  /* For REX/REX2 prefix instructions, we need to convert old registers
> +     (AL, CL, DL and BL) to new ones (AXL, CXL, DXL and BXL) and report bad
> +     for AH, CH, DH and BH.  */

"report bad for" is a little odd; how about simply "reject"?

Okay with respective adjustments.
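
For illustration, the rule described in the quoted comment can be sketched
stand-alone as below. This is only a sketch, not the gas code: the enum, the
function name and the has_prefix parameter are made up for the example, and
the register numbers are simply the usual 3-bit encodings of the legacy
8-bit registers.

```c
#include <stdbool.h>

/* Hypothetical result codes for this sketch; gas itself calls as_bad ().  */
enum demo_byte_reg_result
{
  DEMO_REG_UNCHANGED,  /* No REX/REX2 prefix: leave the register alone.  */
  DEMO_REG_CONVERTED,  /* AL/CL/DL/BL: switch to AXL/CXL/DXL/BXL.  */
  DEMO_REG_REJECTED    /* AH/CH/DH/BH: not encodable with such a prefix.  */
};

/* REG_NUM is the 3-bit encoding of a legacy 8-bit register
   (AL=0, CL=1, DL=2, BL=3, AH=4, CH=5, DH=6, BH=7).  */
static enum demo_byte_reg_result
demo_check_legacy_byte_reg (unsigned int reg_num, bool has_prefix)
{
  if (!has_prefix)
    return DEMO_REG_UNCHANGED;
  /* "Hi" registers cannot be expressed once a prefix is present.  */
  if (reg_num > 3)
    return DEMO_REG_REJECTED;
  return DEMO_REG_CONVERTED;
}
```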

Jan

> +  if (i.rex || i.rex2)
> +    {
> +      for (unsigned int x = first; x <= last; x++)
>  	{
>  	  /* Look for 8 bit operand that uses old registers.  */
>  	  if (i.types[x].bitfield.class == Reg && i.types[x].bitfield.byte


* Re: [PATCH 2/3] Add check for 8-bit old registers in EVEX format
  2024-05-20  6:22 ` [PATCH 2/3] Add check for 8-bit old registers in EVEX format Cui, Lili
@ 2024-05-21 12:24   ` Jan Beulich
  2024-05-22  2:20     ` Cui, Lili
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2024-05-21 12:24 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hjl.tools, binutils

On 20.05.2024 08:22, Cui, Lili wrote:
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -4311,10 +4311,10 @@ static void establish_rex (void)
>        && !is_apx_rex2_encoding () && !is_any_vex_encoding (&i.tm) && !i.rex)
>      i.rex |= REX_OPCODE;
>  
> -  /* For REX/REX2 prefix instructions, we need to convert old registers
> +  /* For REX/REX2/EVEX prefix instructions, we need to convert old registers
>       (AL, CL, DL and BL) to new ones (AXL, CXL, DXL and BXL) and report bad
>       for AH, CH, DH and BH.  */
> -  if (i.rex || i.rex2)
> +  if (i.rex || i.rex2 || i.tm.opcode_modifier.evex)
>      {
>        for (unsigned int x = first; x <= last; x++)
>  	{
> @@ -4326,7 +4326,7 @@ static void establish_rex (void)
>  	      /* In case it is "hi" register, give up.  */
>  	      if (i.op[x].regs->reg_num > 3)
>  		as_bad (_("can't encode register '%s%s' in an "
> -			  "instruction requiring REX/REX2 prefix"),
> +			  "instruction requiring REX/REX2/EVEX prefix"),
>  			register_prefix, i.op[x].regs->reg_name);
>  
>  	      /* Otherwise it is equivalent to the extended register.
> --- a/gas/testsuite/gas/i386/rex-bad.l
> +++ b/gas/testsuite/gas/i386/rex-bad.l
> @@ -3,8 +3,8 @@
>  .*:5: Error: same .*
>  .*:6: Error: same .*
>  .*:7: Error: same .*
> -.*:9: Error: .* REX/REX2 .*
> -.*:10: Error: .* REX/REX2 .*
> -.*:12: Error: .* REX/REX2 .*
> -.*:13: Error: .* REX/REX2 .*
> +.*:9: Error: .* REX/REX2/EVEX .*
> +.*:10: Error: .* REX/REX2/EVEX .*
> +.*:12: Error: .* REX/REX2/EVEX .*
> +.*:13: Error: .* REX/REX2/EVEX .*

To avoid these diagnostics getting yet more diffuse, can you please switch
to

		as_bad (_("can't encode register '%s%s' in an "
			  "instruction requiring %s prefix"),
			register_prefix, i.op[x].regs->reg_name,
			i.tm.opcode_modifier.evex ? "EVEX" : "REX/REX2");

Okay with that adjustment (or a substantially similar one; I didn't test the
above, after all).
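
As a stand-alone sketch of the message selection suggested above (not the
gas code: demo_prefix_name and demo_format_error are hypothetical helpers,
and snprintf merely stands in for as_bad):

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: name only the prefix kind that actually applies,
   so the diagnostic does not list every possible prefix.  */
static const char *
demo_prefix_name (bool is_evex)
{
  return is_evex ? "EVEX" : "REX/REX2";
}

/* Format the diagnostic into BUF; returns what snprintf returns.  */
static int
demo_format_error (char *buf, size_t size, const char *reg_prefix,
                   const char *reg_name, bool is_evex)
{
  return snprintf (buf, size,
                   "can't encode register '%s%s' in an "
                   "instruction requiring %s prefix",
                   reg_prefix, reg_name, demo_prefix_name (is_evex));
}
```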

Jan

* RE: [PATCH 1/3] x86: Split REX/REX2 old registers judgment.
  2024-05-21 12:18   ` Jan Beulich
@ 2024-05-22  1:33     ` Cui, Lili
  2024-05-22  5:49       ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Cui, Lili @ 2024-05-22  1:33 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: hjl.tools, binutils


> On 20.05.2024 08:22, Cui, Lili wrote:
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -4303,22 +4303,20 @@ static void establish_rex (void)
> >    /* Respect a user-specified REX prefix.  */
> >    i.rex |= i.prefix[REX_PREFIX] & REX_OPCODE;
> >
> > -  /* For 8 bit registers we need an empty rex prefix.  Also if the
> > -     instruction already has a prefix, we need to convert old
> > -     registers to new ones.  */
> > -
> > -  if ((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
> > -       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0
> > -	   || i.rex2 != 0))
> > -      || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
> > -	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0
> > -	      || i.rex2 != 0)))
> > -    {
> > -      unsigned int x;
> > -
> > -      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
> > -	i.rex |= REX_OPCODE;
> > -      for (x = first; x <= last; x++)
> > +  /* For 8 bit registers without a prefix, we need an empty rex
> > + prefix.  */
> 
> As you're touching this comment, can you please also correct it? It's not all 8-
> bit registers which are affected here (as expressed by the
> RegRex64 checks).
> 
Done.

> > +  if (((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
> > +	&& ((i.op[first].regs->reg_flags & RegRex64) != 0 ))
> 
> Nit: Please parenthesize this line and ...
> 
> > +       || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
> > +	   && (i.op[last].regs->reg_flags & RegRex64) != 0 ))
> 
> ... this one the same way, preferably like the latter. Also please omit the blanks
> before the closing parentheses.
> 
Done.

> > +      && !is_apx_rex2_encoding () && !is_any_vex_encoding (&i.tm) &&
> > + !i.rex)
> 
> Is the !i.rex part needed? It wasn't there before. If it's needed, it would imo
> better come ahead of the APX/VEX checks.
> 
Yes, I added it because when i.rex is already set there is no need to overwrite it. I will move it ahead of the APX/VEX checks.

> > +    i.rex |= REX_OPCODE;
> > +
> > +  /* For REX/REX2 prefix instructions, we need to convert old registers
> > +     (AL, CL, DL and BL) to new ones (AXL, CXL, DXL and BXL) and report bad
> > +     for AH, CH, DH and BH.  */
> 
> "report bad for" is a little odd; how about simply "reject"?
> 
Done.

> Okay with respective adjustments.

Thanks,
Lili.

* RE: [PATCH 2/3] Add check for 8-bit old registers in EVEX format
  2024-05-21 12:24   ` Jan Beulich
@ 2024-05-22  2:20     ` Cui, Lili
  0 siblings, 0 replies; 13+ messages in thread
From: Cui, Lili @ 2024-05-22  2:20 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: hjl.tools, binutils

> To avoid these diagnostics getting yet more diffuse, can you please switch to
> 
> 		as_bad (_("can't encode register '%s%s' in an "
> 			  "instruction requiring %s prefix"),
> 			register_prefix, i.op[x].regs->reg_name,
> 			i.tm.opcode_modifier.evex ? "EVEX" : "REX/REX2");
> 
> Okay with that adjustment (or a substantially similar one; I didn't test the
> above, after all).
> 
It's correct. Thanks for the suggestion; I have submitted both patches.

Lili.

* Re: [PATCH 1/3] x86: Split REX/REX2 old registers judgment.
  2024-05-22  1:33     ` Cui, Lili
@ 2024-05-22  5:49       ` Jan Beulich
  2024-05-22  6:11         ` Cui, Lili
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2024-05-22  5:49 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hjl.tools, binutils

On 22.05.2024 03:33, Cui, Lili wrote:
> 
>> On 20.05.2024 08:22, Cui, Lili wrote:
>>> --- a/gas/config/tc-i386.c
>>> +++ b/gas/config/tc-i386.c
>>> @@ -4303,22 +4303,20 @@ static void establish_rex (void)
>>>    /* Respect a user-specified REX prefix.  */
>>>    i.rex |= i.prefix[REX_PREFIX] & REX_OPCODE;
>>>
>>> -  /* For 8 bit registers we need an empty rex prefix.  Also if the
>>> -     instruction already has a prefix, we need to convert old
>>> -     registers to new ones.  */
>>> -
>>> -  if ((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
>>> -       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0
>>> -	   || i.rex2 != 0))
>>> -      || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
>>> -	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0
>>> -	      || i.rex2 != 0)))
>>> -    {
>>> -      unsigned int x;
>>> -
>>> -      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
>>> -	i.rex |= REX_OPCODE;
>>> -      for (x = first; x <= last; x++)
>>> +  /* For 8 bit registers without a prefix, we need an empty rex
>>> + prefix.  */
>>
>> As you're touching this comment, can you please also correct it? It's not all 8-
>> bit registers which are affected here (as expressed by the
>> RegRex64 checks).
>>
> Done.
> 
>>> +  if (((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
>>> +	&& ((i.op[first].regs->reg_flags & RegRex64) != 0 ))
>>
>> Nit: Please parenthesize this line and ...
>>
>>> +       || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
>>> +	   && (i.op[last].regs->reg_flags & RegRex64) != 0 ))
>>
>> ... this one the same way, preferably like the latter. Also please omit the blanks
>> before the closing parentheses.
>>
> Done.
> 
>>> +      && !is_apx_rex2_encoding () && !is_any_vex_encoding (&i.tm) &&
>>> + !i.rex)
>>
>> Is the !i.rex part needed? It wasn't there before. If it's needed, it would imo
>> better come ahead of the APX/VEX checks.
>>
> Yes, I added it because there is already i.rex and we don't need to overwrite it. And I will put it ahead of the APX/VEX checks.

I don't think conditionals should be added in such cases. Resulting code
will do better without the extra conditional branch, when the update is
really benign in the case where (here) i.rex is already set. See, btw,
your own response to a pretty similar question of mine:
https://sourceware.org/pipermail/binutils/2024-May/134157.html
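
To illustrate why the unguarded update is harmless, here is a small sketch.
It assumes that the prefix byte is emitted as REX_OPCODE | i.rex whenever
i.rex is non-zero; the DEMO_ names and the value 0x40 are assumptions of the
example, not quotes from the source.

```c
#include <stdbool.h>

/* Assumed value of the fixed REX opcode bits for this sketch.  */
#define DEMO_REX_OPCODE 0x40u

/* Variant with the extra !rex check, as in the posted patch.  */
static unsigned int
demo_update_guarded (unsigned int rex, bool need_empty_rex)
{
  if (need_empty_rex && !rex)
    rex |= DEMO_REX_OPCODE;
  return rex;
}

/* Variant without the extra check.  */
static unsigned int
demo_update_unguarded (unsigned int rex, bool need_empty_rex)
{
  if (need_empty_rex)
    rex |= DEMO_REX_OPCODE;
  return rex;
}

/* Assumed emission rule: no prefix byte when REX is zero, otherwise the
   accumulated bits with the fixed opcode bits ORed in.  */
static unsigned int
demo_prefix_byte (unsigned int rex)
{
  return rex ? (rex | DEMO_REX_OPCODE) : 0;
}
```

Under that assumption both variants yield the same prefix byte for every
input, so the additional conditional buys nothing.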

Jan

* RE: [PATCH 1/3] x86: Split REX/REX2 old registers judgment.
  2024-05-22  5:49       ` Jan Beulich
@ 2024-05-22  6:11         ` Cui, Lili
  2024-05-22  6:22           ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Cui, Lili @ 2024-05-22  6:11 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: hjl.tools, binutils

> On 22.05.2024 03:33, Cui, Lili wrote:
> >
> >> On 20.05.2024 08:22, Cui, Lili wrote:
> >>> --- a/gas/config/tc-i386.c
> >>> +++ b/gas/config/tc-i386.c
> >>> @@ -4303,22 +4303,20 @@ static void establish_rex (void)
> >>>    /* Respect a user-specified REX prefix.  */
> >>>    i.rex |= i.prefix[REX_PREFIX] & REX_OPCODE;
> >>>
> >>> -  /* For 8 bit registers we need an empty rex prefix.  Also if the
> >>> -     instruction already has a prefix, we need to convert old
> >>> -     registers to new ones.  */
> >>> -
> >>> -  if ((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
> >>> -       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0
> >>> -	   || i.rex2 != 0))
> >>> -      || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
> >>> -	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0
> >>> -	      || i.rex2 != 0)))
> >>> -    {
> >>> -      unsigned int x;
> >>> -
> >>> -      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
> >>> -	i.rex |= REX_OPCODE;
> >>> -      for (x = first; x <= last; x++)
> >>> +  /* For 8 bit registers without a prefix, we need an empty rex
> >>> + prefix.  */
> >>
> >> As you're touching this comment, can you please also correct it? It's
> >> not all 8- bit registers which are affected here (as expressed by the
> >> RegRex64 checks).
> >>
> > Done.
> >
> >>> +  if (((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
> >>> +	&& ((i.op[first].regs->reg_flags & RegRex64) != 0 ))
> >>
> >> Nit: Please parenthesize this line and ...
> >>
> >>> +       || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
> >>> +	   && (i.op[last].regs->reg_flags & RegRex64) != 0 ))
> >>
> >> ... this one the same way, preferably like the latter. Also please
> >> omit the blanks before the closing parentheses.
> >>
> > Done.
> >
> >>> +      && !is_apx_rex2_encoding () && !is_any_vex_encoding (&i.tm)
> >>> + &&
> >>> + !i.rex)
> >>
> >> Is the !i.rex part needed? It wasn't there before. If it's needed, it
> >> would imo better come ahead of the APX/VEX checks.
> >>
> > Yes, I added it because there is already i.rex and we don't need to overwrite
> it. And I will put it ahead of the APX/VEX checks.
> 
> I don't think conditionals should be added in such cases. Resulting code will do
> better without the extra conditional branch, when the update is really benign
> in the case where (here) i.rex is already set. See, btw, your own response to a
> pretty similar question of mine:
> https://sourceware.org/pipermail/binutils/2024-May/134157.html
> 
Yes, these are similar issues. Since I have just pushed it to trunk, I'll create a follow-up patch to remove it.

Lili.

* Re: [PATCH 3/3] Support APX zero-upper
  2024-05-20  6:22 ` [PATCH 3/3] Support APX zero-upper Cui, Lili
@ 2024-05-22  6:21   ` Jan Beulich
  2024-05-22  8:05     ` Cui, Lili
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2024-05-22  6:21 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hjl.tools, binutils

On 20.05.2024 08:22, Cui, Lili wrote:
> gas/ChangeLog:

I did specifically ask for (at least) one thing to be added to the description
here. You put that in the cover letter, which means it'll not end up in the
eventual commit. Please put it here. And as before, please try to get used to
never submitting patches without any description, unless the title alone
really describes it all (including e.g. implementation decisions taken). What
you ...

>         * config/tc-i386.c (build_apx_evex_prefix): Handle ZU.
>         * testsuite/gas/i386/x86-64.exp: Added new tests for ZU.
>         * testsuite/gas/i386/x86-64.exp: Added new tests for ZU.
>         * testsuite/gas/i386/x86-64-apx-zu-intel.d: New test.
>         * testsuite/gas/i386/x86-64-apx-zu-inval.l: Ditto.
>         * testsuite/gas/i386/x86-64-apx-zu-inval.s: Ditto.
>         * testsuite/gas/i386/x86-64-apx-zu.d: Ditto.
>         * testsuite/gas/i386/x86-64-apx-zu.s: Ditto.
> 
> opcodes/ChangeLog:
> 
>         * i386-dis-evex-prefix.h: Handle PREFIX_EVEX_MAP4_40 ~
>         PREFIX_EVEX_MAP4_4F.
>         * i386-dis-evex.h: Ditto.
>         * i386-dis.c (struct dis386): Add new macro 'ZU'.
>         (putop): Handle %ZU.
>         * i386-gen.c: Added ZU.
>         * i386-opc.h: Ditto.
>         * i386-opc.tbl: Added new templates to support ZU.

... mechanically put here is useful to see _what_ was changed, but it's
entirely unhelpful when one wants to understand _why_ things were done
(perhaps a certain way).

> @@ -10877,6 +10894,18 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
>  	      abort ();
>  	    }
>  	  break;
> +	case 'U':
> +	  if (l == 1 && (last[0] == 'Z'))
> +	    {
> +	      /* Although IMUL/SETcc does not support NDD, the EVEX.ND bit is
> +		 used to control whether its destination register has its upper
> +		 bits zeroed when OSIZE is 16b/8b.  */
> +	      if (ins->vex.nd)
> +		oappend (ins, "zu");

I find the OSIZE part of the comment in particular somewhat misleading. For
SETcc, operand size (irrespective of any further insn attributes like
[embedded] prefixes) is 8 bits, so there is nothing to say in this regard at
all. And for IMUL, while EVEX.ZU indeed has an effect for 16-bit operand size
only, what about the bit being set when operand size is 32 or 64 bits? If
the doc is to be trusted, then this is benign, and afaict you follow that by
emitting "zu" in such cases as well. It's just that the comment suggests
otherwise.
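
To make the point concrete, a minimal sketch of the behaviour as the patch
reads: "zu" is appended whenever the bit is set, independent of operand
size. struct demo_insn and demo_append_zu are hypothetical stand-ins for
instr_info and the %ZU case in putop (), not the actual disassembler code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the disassembler state; only the EVEX.ND bit
   (reused as EVEX.ZU for IMUL/SETcc) matters here.  */
struct demo_insn
{
  bool evex_nd;
};

/* Append "zu" to MNEMONIC whenever the bit is set.  Operand size is
   deliberately not consulted.  SIZE is the capacity of MNEMONIC.  */
static void
demo_append_zu (const struct demo_insn *insn, char *mnemonic, size_t size)
{
  /* Room for the existing text, "zu" and the terminating NUL.  */
  assert (strlen (mnemonic) + 3 <= size);
  if (insn->evex_nd)
    strcat (mnemonic, "zu");
}
```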

Once again: Okay with respective adjustments.

Jan

* Re: [PATCH 1/3] x86: Split REX/REX2 old registers judgment.
  2024-05-22  6:11         ` Cui, Lili
@ 2024-05-22  6:22           ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2024-05-22  6:22 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hjl.tools, binutils

On 22.05.2024 08:11, Cui, Lili wrote:
>> On 22.05.2024 03:33, Cui, Lili wrote:
>>>> On 20.05.2024 08:22, Cui, Lili wrote:
>>>>> --- a/gas/config/tc-i386.c
>>>>> +++ b/gas/config/tc-i386.c
>>>>> @@ -4303,22 +4303,20 @@ static void establish_rex (void)
>>>>>    /* Respect a user-specified REX prefix.  */
>>>>>    i.rex |= i.prefix[REX_PREFIX] & REX_OPCODE;
>>>>>
>>>>> -  /* For 8 bit registers we need an empty rex prefix.  Also if the
>>>>> -     instruction already has a prefix, we need to convert old
>>>>> -     registers to new ones.  */
>>>>> -
>>>>> -  if ((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
>>>>> -       && ((i.op[first].regs->reg_flags & RegRex64) != 0 || i.rex != 0
>>>>> -	   || i.rex2 != 0))
>>>>> -      || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
>>>>> -	  && ((i.op[last].regs->reg_flags & RegRex64) != 0 || i.rex != 0
>>>>> -	      || i.rex2 != 0)))
>>>>> -    {
>>>>> -      unsigned int x;
>>>>> -
>>>>> -      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
>>>>> -	i.rex |= REX_OPCODE;
>>>>> -      for (x = first; x <= last; x++)
>>>>> +  /* For 8 bit registers without a prefix, we need an empty rex
>>>>> + prefix.  */
>>>>
>>>> As you're touching this comment, can you please also correct it? It's
>>>> not all 8- bit registers which are affected here (as expressed by the
>>>> RegRex64 checks).
>>>>
>>> Done.
>>>
>>>>> +  if (((i.types[first].bitfield.class == Reg && i.types[first].bitfield.byte
>>>>> +	&& ((i.op[first].regs->reg_flags & RegRex64) != 0 ))
>>>>
>>>> Nit: Please parenthesize this line and ...
>>>>
>>>>> +       || (i.types[last].bitfield.class == Reg && i.types[last].bitfield.byte
>>>>> +	   && (i.op[last].regs->reg_flags & RegRex64) != 0 ))
>>>>
>>>> ... this one the same way, preferably like the latter. Also please
>>>> omit the blanks before the closing parentheses.
>>>>
>>> Done.
>>>
>>>>> +      && !is_apx_rex2_encoding () && !is_any_vex_encoding (&i.tm)
>>>>> + &&
>>>>> + !i.rex)
>>>>
>>>> Is the !i.rex part needed? It wasn't there before. If it's needed, it
>>>> would imo better come ahead of the APX/VEX checks.
>>>>
>>> Yes, I added it because there is already i.rex and we don't need to overwrite
>> it. And I will put it ahead of the APX/VEX checks.
>>
>> I don't think conditionals should be added in such cases. Resulting code will do
>> better without the extra conditional branch, when the update is really benign
>> in the case where (here) i.rex is already set. See, btw, your own response to a
>> pretty similar question of mine:
>> https://sourceware.org/pipermail/binutils/2024-May/134157.html
>>
> Yes, they are the similar issues. Since I just pushed it to the trunk, I'll create a patch to remove it.

Thanks, and pre-approved.

Jan

* RE: [PATCH 3/3] Support APX zero-upper
  2024-05-22  6:21   ` Jan Beulich
@ 2024-05-22  8:05     ` Cui, Lili
  0 siblings, 0 replies; 13+ messages in thread
From: Cui, Lili @ 2024-05-22  8:05 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: hjl.tools, binutils

> I did specifically ask for (at least) one thing to be added to the description
> here. You put that in the cover letter, which means it'll not end up in the
> eventual commit. Please put it here. And as before, please try to get used to
> not, ever, submit patches without any description, unless the title alone really
> describes it all (including e.g. implementation decisions taken). What you ...
> 
> >         * config/tc-i386.c (build_apx_evex_prefix): Handle ZU.
> >         * testsuite/gas/i386/x86-64.exp: Added new tests for ZU.
> >         * testsuite/gas/i386/x86-64.exp: Added new tests for ZU.
> >         * testsuite/gas/i386/x86-64-apx-zu-intel.d: New test.
> >         * testsuite/gas/i386/x86-64-apx-zu-inval.l: Ditto.
> >         * testsuite/gas/i386/x86-64-apx-zu-inval.s: Ditto.
> >         * testsuite/gas/i386/x86-64-apx-zu.d: Ditto.
> >         * testsuite/gas/i386/x86-64-apx-zu.s: Ditto.
> >
> > opcodes/ChangeLog:
> >
> >         * i386-dis-evex-prefix.h: Handle PREFIX_EVEX_MAP4_40 ~
> >         PREFIX_EVEX_MAP4_4F.
> >         * i386-dis-evex.h: Ditto.
> >         * i386-dis.c (struct dis386): Add new macro 'ZU'.
> >         (putop): Handle %ZU.
> >         * i386-gen.c: Added ZU.
> >         * i386-opc.h: Ditto.
> >         * i386-opc.tbl: Added new templates to support ZU.
> 
> ... mechanically put here is useful to see _what_ was changed, but it's entirely
> unhelpful when one wants to understand _why_ things were done (perhaps a
> certain way).
> 

Added. I'll pay attention to this in future patches.

> > @@ -10877,6 +10894,18 @@ putop (instr_info *ins, const char
> *in_template, int sizeflag)
> >  	      abort ();
> >  	    }
> >  	  break;
> > +	case 'U':
> > +	  if (l == 1 && (last[0] == 'Z'))
> > +	    {
> > +	      /* Although IMUL/SETcc does not support NDD, the EVEX.ND bit is
> > +		 used to control whether its destination register has its upper
> > +		 bits zeroed when OSIZE is 16b/8b.  */
> > +	      if (ins->vex.nd)
> > +		oappend (ins, "zu");
> 
> I find in particular the OSIZE part of the comment somewhat misleading. For
> SETcc operand size (irrespective of any further insn attributes like [embedded]
> prefixes) is 8 bits. Nothing to say there at all in this regard.
> And for IMUL while EVEX.ZU indeed has an effect for 16-bit operand size only,
> what about the bit being set when operand size is 32 or 64 bits? If the doc is
> to be trusted, then this is benign, and afaict you follow that by emitting "zu" in
> such cases as well. Just that the comment suggests otherwise.
> 
Removed "when OSIZE is 16b/8b".

> Once again: Okay with respective adjustments.

Thanks!

Lili.

Thread overview: 13+ messages
2024-05-20  6:21 [PATCH 0/3] Support APX zero-upper Cui, Lili
2024-05-20  6:22 ` [PATCH 1/3] x86: Split REX/REX2 old registers judgment Cui, Lili
2024-05-21 12:18   ` Jan Beulich
2024-05-22  1:33     ` Cui, Lili
2024-05-22  5:49       ` Jan Beulich
2024-05-22  6:11         ` Cui, Lili
2024-05-22  6:22           ` Jan Beulich
2024-05-20  6:22 ` [PATCH 2/3] Add check for 8-bit old registers in EVEX format Cui, Lili
2024-05-21 12:24   ` Jan Beulich
2024-05-22  2:20     ` Cui, Lili
2024-05-20  6:22 ` [PATCH 3/3] Support APX zero-upper Cui, Lili
2024-05-22  6:21   ` Jan Beulich
2024-05-22  8:05     ` Cui, Lili