public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Support APX PUSHP/POPP
@ 2023-11-27 12:31 Cui, Lili
  2023-11-27 12:56 ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Cui, Lili @ 2023-11-27 12:31 UTC (permalink / raw)
  To: binutils; +Cc: jbeulich, hongjiu.lu, ccoutant

Hi Jan,
   this patch is to add APX pushp/popp. I created a new bitfile rex2 for pushp/popp in i386_opcode_modifier. I wonder if you think this is too expensive and want to replace it with a special handle. Also in decoder, no regular fixup is added. Looking forward to your comments.

Thanks,
Lili.

gas/ChangeLog:

        * config/tc-i386.c (is_apx_rex2_encoding): Added rex2.
        * testsuite/gas/i386/x86-64.exp: Add new test for pushp and popp.
        * testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d: New test.
        * testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l: Ditto.
        * testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s: Ditto.
        * testsuite/gas/i386/x86-64-apx-pushp-popp.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-pushp-popp.s: Ditto.

opcodes/ChangeLog:

        * i386-dis.c (putop): print pushp and popp.
        * i386-gen.c: Added new BITFIELD for Rex2.
        * i386-opc.h: Ditto.
        * i386-opc.tbl: Added new insns.
        * i386-init.h : Regenerated.
        * i386-mnem.h : Regenerated.
        * i386-tbl.h: Regenerated.
---
 gas/config/tc-i386.c                          |  2 +-
 .../gas/i386/x86-64-apx-pushp-popp-intel.d    | 14 ++++++
 .../gas/i386/x86-64-apx-pushp-popp-inval.l    |  5 +++
 .../gas/i386/x86-64-apx-pushp-popp-inval.s    |  7 +++
 .../gas/i386/x86-64-apx-pushp-popp.d          | 14 ++++++
 .../gas/i386/x86-64-apx-pushp-popp.s          |  8 ++++
 gas/testsuite/gas/i386/x86-64.exp             |  3 ++
 opcodes/i386-dis.c                            | 45 ++++++++++++-------
 opcodes/i386-gen.c                            |  1 +
 opcodes/i386-opc.h                            |  3 ++
 opcodes/i386-opc.tbl                          |  2 +
 11 files changed, 87 insertions(+), 17 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 0dde2a9ad44..0321f366f48 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -3892,7 +3892,7 @@ is_apx_evex_encoding (void)
 static INLINE bool
 is_apx_rex2_encoding (void)
 {
-  return i.rex2 || i.rex2_encoding;
+  return i.rex2 || i.rex2_encoding || i.tm.opcode_modifier.rex2;
 }
 
 static unsigned int
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
new file mode 100644
index 00000000000..44e3e96a5df
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
@@ -0,0 +1,14 @@
+#as:
+#objdump: -dw -Mintel
+#name: x86_64 APX_F pushp popp insns (Intel disassembly)
+#source: x86-64-apx-pushp-popp.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*d5 08 50[	 ]+pushp  rax
+\s*[a-f0-9]+:\s*d5 19 57[	 ]+pushp  r31
+\s*[a-f0-9]+:\s*d5 08 58[	 ]+popp   rax
+\s*[a-f0-9]+:\s*d5 19 5f[	 ]+popp   r31
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
new file mode 100644
index 00000000000..c4d774b9673
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
@@ -0,0 +1,5 @@
+.* Assembler messages:
+.*:4: Error: operand size mismatch for `pushp'
+.*:5: Error: operand size mismatch for `popp'
+.*:6: Error: operand size mismatch for `pushp'
+.*:7: Error: operand size mismatch for `popp'
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
new file mode 100644
index 00000000000..28ed5d8145a
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
@@ -0,0 +1,7 @@
+# Check bytecode of APX_F pushp popp instructions with illegal instructions.
+
+	.text
+	pushp %eax
+	popp  %eax
+	pushp (%rax)
+	popp  (%rax)
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
new file mode 100644
index 00000000000..b20e5ba9a35
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
@@ -0,0 +1,14 @@
+#as:
+#objdump: -dw
+#name: x86_64 APX_F pushp popp insns
+#source: x86-64-apx-pushp-popp.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*d5 08 50[ 	]+pushp  %rax
+\s*[a-f0-9]+:\s*d5 19 57[ 	]+pushp  %r31
+\s*[a-f0-9]+:\s*d5 08 58[ 	]+popp   %rax
+\s*[a-f0-9]+:\s*d5 19 5f[ 	]+popp   %r31
diff --git a/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
new file mode 100644
index 00000000000..0ea66d0e70c
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
@@ -0,0 +1,8 @@
+# Check 64bit APX_F pushp popp instructions
+
+       .text
+ _start:
+	pushp %rax
+	pushp %r31
+	popp  %rax
+	popp  %r31
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 8b41f9891a5..74433a20e24 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -348,6 +348,9 @@ run_dump_test "x86-64-avx512dq-rcigrne"
 run_dump_test "x86-64-apx-push2pop2"
 run_dump_test "x86-64-apx-push2pop2-intel"
 run_list_test "x86-64-apx-push2pop2-inval"
+run_dump_test "x86-64-apx-pushp-popp"
+run_dump_test "x86-64-apx-pushp-popp-intel"
+run_list_test "x86-64-apx-pushp-popp-inval"
 run_dump_test "x86-64-avx512dq-rcigru-intel"
 run_dump_test "x86-64-avx512dq-rcigru"
 run_dump_test "x86-64-avx512dq-rcigrz-intel"
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index b33b44d7c27..6542e9b81dc 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1931,23 +1931,23 @@ static const struct dis386 dis386[] = {
   { "dec{S|}",		{ RMeSI }, 0 },
   { "dec{S|}",		{ RMeDI }, 0 },
   /* 50 */
-  { "push{!P|}",		{ RMrAX }, 0 },
-  { "push{!P|}",		{ RMrCX }, 0 },
-  { "push{!P|}",		{ RMrDX }, 0 },
-  { "push{!P|}",		{ RMrBX }, 0 },
-  { "push{!P|}",		{ RMrSP }, 0 },
-  { "push{!P|}",		{ RMrBP }, 0 },
-  { "push{!P|}",		{ RMrSI }, 0 },
-  { "push{!P|}",		{ RMrDI }, 0 },
+  { "push!P",		{ RMrAX }, 0 },
+  { "push!P",		{ RMrCX }, 0 },
+  { "push!P",		{ RMrDX }, 0 },
+  { "push!P",		{ RMrBX }, 0 },
+  { "push!P",		{ RMrSP }, 0 },
+  { "push!P",		{ RMrBP }, 0 },
+  { "push!P",		{ RMrSI }, 0 },
+  { "push!P",		{ RMrDI }, 0 },
   /* 58 */
-  { "pop{!P|}",		{ RMrAX }, 0 },
-  { "pop{!P|}",		{ RMrCX }, 0 },
-  { "pop{!P|}",		{ RMrDX }, 0 },
-  { "pop{!P|}",		{ RMrBX }, 0 },
-  { "pop{!P|}",		{ RMrSP }, 0 },
-  { "pop{!P|}",		{ RMrBP }, 0 },
-  { "pop{!P|}",		{ RMrSI }, 0 },
-  { "pop{!P|}",		{ RMrDI }, 0 },
+  { "pop!P",		{ RMrAX }, 0 },
+  { "pop!P",		{ RMrCX }, 0 },
+  { "pop!P",		{ RMrDX }, 0 },
+  { "pop!P",		{ RMrBX }, 0 },
+  { "pop!P",		{ RMrSP }, 0 },
+  { "pop!P",		{ RMrBP }, 0 },
+  { "pop!P",		{ RMrSI }, 0 },
+  { "pop!P",		{ RMrDI }, 0 },
   /* 60 */
   { X86_64_TABLE (X86_64_60) },
   { X86_64_TABLE (X86_64_61) },
@@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
 	case 'P':
 	  if (l == 0)
 	    {
+	      /* For pushp and popp, do not print {rex2} for them.  */
+	      if (ins->address_mode == mode_64bit && !cond
+		  && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
+		{
+		  *ins->obufp++ = 'p';
+		  ins->rex2 |= 16;
+		  break;
+		}
+
+	      /* If "!p" printis nothing in intel_syntax.  */
+	      if (!cond && ins->intel_syntax)
+		break;
+
 	      if ((ins->modrm.mod == 3 || !cond)
 		  && !(sizeflag & SUFFIX_ALWAYS))
 		break;
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 7dab744134f..7efb064b9ca 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -471,6 +471,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (IsPrefix),
   BITFIELD (ImmExt),
   BITFIELD (NoRex64),
+  BITFIELD (Rex2),
   BITFIELD (Vex),
   BITFIELD (VexVVVV),
   BITFIELD (VexW),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index edd59dd67ea..575ef123ee6 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -627,6 +627,8 @@ enum
   ImmExt,
   /* instruction don't need Rex64 prefix.  */
   NoRex64,
+  /* instruction need Rex2 prefix.  */
+  Rex2,
   /* insn has VEX prefix:
 	1: 128bit VEX prefix (or operand dependent).
 	2: 256bit VEX prefix.
@@ -787,6 +789,7 @@ typedef struct i386_opcode_modifier
   unsigned int isprefix:1;
   unsigned int immext:1;
   unsigned int norex64:1;
+  unsigned int rex2:1;
   unsigned int vex:2;
   unsigned int vexvvvv:2;
   unsigned int vexw:2;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 642e519fe3a..539837e17ab 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -225,6 +225,7 @@ push, 0x68, i186&No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Imm16|Imm32 }
 push, 0x6, No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }
 // In 64bit mode, the operand size is implicitly 64bit.
 push, 0x50, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
+pushp, 0x50, APX_F, No_bSuf|No_lSuf|No_sSuf|Rex2, {Reg64 }
 push, 0xff/6, x64, Modrm|DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64|Word|Qword|Unspecified|BaseIndex }
 push, 0x6a, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Imm8S }
 push, 0x68, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Imm16|Imm32S }
@@ -238,6 +239,7 @@ pop, 0x8f/0, No64, Modrm|DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32|Word
 pop, 0x7, No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }
 // In 64bit mode, the operand size is implicitly 64bit.
 pop, 0x58, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
+popp, 0x58, APX_F, No_bSuf|No_lSuf|No_sSuf|Rex2, { Reg64 }
 pop, 0x8f/0, x64, Modrm|DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64|Word|Qword|Unspecified|BaseIndex }
 pop, 0xfa1, x64, DefaultSize|No_bSuf|No_lSuf|No_sSuf|NoRex64, { SReg }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Support APX PUSHP/POPP
  2023-11-27 12:31 [PATCH] Support APX PUSHP/POPP Cui, Lili
@ 2023-11-27 12:56 ` Jan Beulich
  2023-11-27 13:45   ` Cui, Lili
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2023-11-27 12:56 UTC (permalink / raw)
  To: Cui, Lili; +Cc: hongjiu.lu, ccoutant, binutils

On 27.11.2023 13:31, Cui, Lili wrote:
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -1931,23 +1931,23 @@ static const struct dis386 dis386[] = {
>    { "dec{S|}",		{ RMeSI }, 0 },
>    { "dec{S|}",		{ RMeDI }, 0 },
>    /* 50 */
> -  { "push{!P|}",		{ RMrAX }, 0 },
> -  { "push{!P|}",		{ RMrCX }, 0 },
> -  { "push{!P|}",		{ RMrDX }, 0 },
> -  { "push{!P|}",		{ RMrBX }, 0 },
> -  { "push{!P|}",		{ RMrSP }, 0 },
> -  { "push{!P|}",		{ RMrBP }, 0 },
> -  { "push{!P|}",		{ RMrSI }, 0 },
> -  { "push{!P|}",		{ RMrDI }, 0 },
> +  { "push!P",		{ RMrAX }, 0 },
> +  { "push!P",		{ RMrCX }, 0 },
> +  { "push!P",		{ RMrDX }, 0 },
> +  { "push!P",		{ RMrBX }, 0 },
> +  { "push!P",		{ RMrSP }, 0 },
> +  { "push!P",		{ RMrBP }, 0 },
> +  { "push!P",		{ RMrSI }, 0 },
> +  { "push!P",		{ RMrDI }, 0 },
>    /* 58 */
> -  { "pop{!P|}",		{ RMrAX }, 0 },
> -  { "pop{!P|}",		{ RMrCX }, 0 },
> -  { "pop{!P|}",		{ RMrDX }, 0 },
> -  { "pop{!P|}",		{ RMrBX }, 0 },
> -  { "pop{!P|}",		{ RMrSP }, 0 },
> -  { "pop{!P|}",		{ RMrBP }, 0 },
> -  { "pop{!P|}",		{ RMrSI }, 0 },
> -  { "pop{!P|}",		{ RMrDI }, 0 },
> +  { "pop!P",		{ RMrAX }, 0 },
> +  { "pop!P",		{ RMrCX }, 0 },
> +  { "pop!P",		{ RMrDX }, 0 },
> +  { "pop!P",		{ RMrBX }, 0 },
> +  { "pop!P",		{ RMrSP }, 0 },
> +  { "pop!P",		{ RMrBP }, 0 },
> +  { "pop!P",		{ RMrSI }, 0 },
> +  { "pop!P",		{ RMrDI }, 0 },
>    /* 60 */
>    { X86_64_TABLE (X86_64_60) },
>    { X86_64_TABLE (X86_64_61) },
> @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
>  	case 'P':
>  	  if (l == 0)
>  	    {
> +	      /* For pushp and popp, do not print {rex2} for them.  */
> +	      if (ins->address_mode == mode_64bit && !cond

I don't think the mode_64bit check is needed here, as without that REX.W
cannot possibly be set (nor can a REX2 prefix be present).

> +		  && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
> +		{
> +		  *ins->obufp++ = 'p';
> +		  ins->rex2 |= 16;

Please no new use of magic constants. Have a #define with a suitable name,
and use that here. Also I think the comment you have ahead of the if()
actually belongs here?

> +		  break;
> +		}
> +
> +	      /* If "!p" printis nothing in intel_syntax.  */
> +	      if (!cond && ins->intel_syntax)
> +		break;

The comment isn't quite right ('p' is printed). Also (nit) "prints".

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -225,6 +225,7 @@ push, 0x68, i186&No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Imm16|Imm32 }
>  push, 0x6, No64, DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }
>  // In 64bit mode, the operand size is implicitly 64bit.
>  push, 0x50, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
> +pushp, 0x50, APX_F, No_bSuf|No_lSuf|No_sSuf|Rex2, {Reg64 }

Since Reg16 isn't allowed, you also want No_wSuf here (and below). Note
also the missing blank after the opening figure brace.

The new Rex2 attribute is not only wasteful (it can easily be a new
enumerator used with OperandConstraint), but also misleading. We don't
just need REX2 here, but we need it with REX2.W set. Even if from the
tc-i386.c changes it looks as if that was happening implicitly
(presumably due to the absence of NoRex64), naming still needs to
properly reflect the purpose.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] Support APX PUSHP/POPP
  2023-11-27 12:56 ` Jan Beulich
@ 2023-11-27 13:45   ` Cui, Lili
  2023-11-27 14:06     ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Cui, Lili @ 2023-11-27 13:45 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, ccoutant, binutils

> On 27.11.2023 13:31, Cui, Lili wrote:
> > --- a/opcodes/i386-dis.c
> > +++ b/opcodes/i386-dis.c
> > @@ -1931,23 +1931,23 @@ static const struct dis386 dis386[] = {
> >    { "dec{S|}",		{ RMeSI }, 0 },
> >    { "dec{S|}",		{ RMeDI }, 0 },
> >    /* 50 */
> > -  { "push{!P|}",		{ RMrAX }, 0 },
> > -  { "push{!P|}",		{ RMrCX }, 0 },
> > -  { "push{!P|}",		{ RMrDX }, 0 },
> > -  { "push{!P|}",		{ RMrBX }, 0 },
> > -  { "push{!P|}",		{ RMrSP }, 0 },
> > -  { "push{!P|}",		{ RMrBP }, 0 },
> > -  { "push{!P|}",		{ RMrSI }, 0 },
> > -  { "push{!P|}",		{ RMrDI }, 0 },
> > +  { "push!P",		{ RMrAX }, 0 },
> > +  { "push!P",		{ RMrCX }, 0 },
> > +  { "push!P",		{ RMrDX }, 0 },
> > +  { "push!P",		{ RMrBX }, 0 },
> > +  { "push!P",		{ RMrSP }, 0 },
> > +  { "push!P",		{ RMrBP }, 0 },
> > +  { "push!P",		{ RMrSI }, 0 },
> > +  { "push!P",		{ RMrDI }, 0 },
> >    /* 58 */
> > -  { "pop{!P|}",		{ RMrAX }, 0 },
> > -  { "pop{!P|}",		{ RMrCX }, 0 },
> > -  { "pop{!P|}",		{ RMrDX }, 0 },
> > -  { "pop{!P|}",		{ RMrBX }, 0 },
> > -  { "pop{!P|}",		{ RMrSP }, 0 },
> > -  { "pop{!P|}",		{ RMrBP }, 0 },
> > -  { "pop{!P|}",		{ RMrSI }, 0 },
> > -  { "pop{!P|}",		{ RMrDI }, 0 },
> > +  { "pop!P",		{ RMrAX }, 0 },
> > +  { "pop!P",		{ RMrCX }, 0 },
> > +  { "pop!P",		{ RMrDX }, 0 },
> > +  { "pop!P",		{ RMrBX }, 0 },
> > +  { "pop!P",		{ RMrSP }, 0 },
> > +  { "pop!P",		{ RMrBP }, 0 },
> > +  { "pop!P",		{ RMrSI }, 0 },
> > +  { "pop!P",		{ RMrDI }, 0 },
> >    /* 60 */
> >    { X86_64_TABLE (X86_64_60) },
> >    { X86_64_TABLE (X86_64_61) },
> > @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char
> *in_template, int sizeflag)
> >  	case 'P':
> >  	  if (l == 0)
> >  	    {
> > +	      /* For pushp and popp, do not print {rex2} for them.  */
> > +	      if (ins->address_mode == mode_64bit && !cond
> 
> I don't think the mode_64bit check is needed here, as without that REX.W
> cannot possibly be set (nor can a REX2 prefix be present).

Done.

> 
> > +		  && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
> > +		{
> > +		  *ins->obufp++ = 'p';
> > +		  ins->rex2 |= 16;
> 
> Please no new use of magic constants. Have a #define with a suitable name,
> and use that here. Also I think the comment you have ahead of the if() actually
> belongs here?
> 

How about " #define IMPLICIT_REX2 16",  PUSHP/POPP can share it with JMPABS.


> > +		  break;
> > +		}
> > +
> > +	      /* If "!p" printis nothing in intel_syntax.  */
> > +	      if (!cond && ins->intel_syntax)
> > +		break;
> 
> The comment isn't quite right ('p' is printed). Also (nit) "prints".
> 

Added "!cond"  here, since just removed {} for !p,  I think P has its way of handling intel_syntax.

> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -225,6 +225,7 @@ push, 0x68, i186&No64,
> > DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Imm16|Imm32 }  push, 0x6,
> No64,
> > DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }  // In 64bit mode, the
> operand size is implicitly 64bit.
> >  push, 0x50, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
> > +pushp, 0x50, APX_F, No_bSuf|No_lSuf|No_sSuf|Rex2, {Reg64 }
> 
> Since Reg16 isn't allowed, you also want No_wSuf here (and below). Note also
> the missing blank after the opening figure brace.
> 

Done.

> The new Rex2 attribute is not only wasteful (it can easily be a new enumerator
> used with OperandConstraint), but also misleading. We don't just need REX2
> here, but we need it with REX2.W set. Even if from the tc-i386.c changes it
> looks as if that was happening implicitly (presumably due to the absence of
> NoRex64), naming still needs to properly reflect the purpose.
> 

You are right, rex2.w is set in process_suffix, since there is no NoRex64. 

How about Rex2W? 
if (i.tm.opcode_modifier.rex2w)
{
    i.rex2_encoding = true;
    i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
}


Or just add a special judgment?

if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp)
{
    i.rex2_encoding = true;
    i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
}

By the way, Is our APX patches (V3)  review still in progress? Since we have to commit them into the trunk before the next release, time is a bit tight ( many people may have vacation plans at the end of the year).

Thanks,
Lili.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Support APX PUSHP/POPP
  2023-11-27 13:45   ` Cui, Lili
@ 2023-11-27 14:06     ` Jan Beulich
  2023-11-28  2:32       ` Cui, Lili
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2023-11-27 14:06 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Lu, Hongjiu, ccoutant, binutils

On 27.11.2023 14:45, Cui, Lili wrote:
>> On 27.11.2023 13:31, Cui, Lili wrote:
>>> --- a/opcodes/i386-dis.c
>>> +++ b/opcodes/i386-dis.c
>>> @@ -1931,23 +1931,23 @@ static const struct dis386 dis386[] = {
>>>    { "dec{S|}",		{ RMeSI }, 0 },
>>>    { "dec{S|}",		{ RMeDI }, 0 },
>>>    /* 50 */
>>> -  { "push{!P|}",		{ RMrAX }, 0 },
>>> -  { "push{!P|}",		{ RMrCX }, 0 },
>>> -  { "push{!P|}",		{ RMrDX }, 0 },
>>> -  { "push{!P|}",		{ RMrBX }, 0 },
>>> -  { "push{!P|}",		{ RMrSP }, 0 },
>>> -  { "push{!P|}",		{ RMrBP }, 0 },
>>> -  { "push{!P|}",		{ RMrSI }, 0 },
>>> -  { "push{!P|}",		{ RMrDI }, 0 },
>>> +  { "push!P",		{ RMrAX }, 0 },
>>> +  { "push!P",		{ RMrCX }, 0 },
>>> +  { "push!P",		{ RMrDX }, 0 },
>>> +  { "push!P",		{ RMrBX }, 0 },
>>> +  { "push!P",		{ RMrSP }, 0 },
>>> +  { "push!P",		{ RMrBP }, 0 },
>>> +  { "push!P",		{ RMrSI }, 0 },
>>> +  { "push!P",		{ RMrDI }, 0 },
>>>    /* 58 */
>>> -  { "pop{!P|}",		{ RMrAX }, 0 },
>>> -  { "pop{!P|}",		{ RMrCX }, 0 },
>>> -  { "pop{!P|}",		{ RMrDX }, 0 },
>>> -  { "pop{!P|}",		{ RMrBX }, 0 },
>>> -  { "pop{!P|}",		{ RMrSP }, 0 },
>>> -  { "pop{!P|}",		{ RMrBP }, 0 },
>>> -  { "pop{!P|}",		{ RMrSI }, 0 },
>>> -  { "pop{!P|}",		{ RMrDI }, 0 },
>>> +  { "pop!P",		{ RMrAX }, 0 },
>>> +  { "pop!P",		{ RMrCX }, 0 },
>>> +  { "pop!P",		{ RMrDX }, 0 },
>>> +  { "pop!P",		{ RMrBX }, 0 },
>>> +  { "pop!P",		{ RMrSP }, 0 },
>>> +  { "pop!P",		{ RMrBP }, 0 },
>>> +  { "pop!P",		{ RMrSI }, 0 },
>>> +  { "pop!P",		{ RMrDI }, 0 },
>>>    /* 60 */
>>>    { X86_64_TABLE (X86_64_60) },
>>>    { X86_64_TABLE (X86_64_61) },
>>> @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char
>> *in_template, int sizeflag)
>>>  	case 'P':
>>>  	  if (l == 0)
>>>  	    {
>>> +	      /* For pushp and popp, do not print {rex2} for them.  */
>>> +	      if (ins->address_mode == mode_64bit && !cond
>>
>> I don't think the mode_64bit check is needed here, as without that REX.W
>> cannot possibly be set (nor can a REX2 prefix be present).
> 
> Done.
> 
>>
>>> +		  && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
>>> +		{
>>> +		  *ins->obufp++ = 'p';
>>> +		  ins->rex2 |= 16;
>>
>> Please no new use of magic constants. Have a #define with a suitable name,
>> and use that here. Also I think the comment you have ahead of the if() actually
>> belongs here?
>>
> 
> How about " #define IMPLICIT_REX2 16",  PUSHP/POPP can share it with JMPABS.

But what's implicit about the REX2 prefix here?

>>> +		  break;
>>> +		}
>>> +
>>> +	      /* If "!p" printis nothing in intel_syntax.  */
>>> +	      if (!cond && ins->intel_syntax)
>>> +		break;
>>
>> The comment isn't quite right ('p' is printed). Also (nit) "prints".
>>
> 
> Added "!cond"  here, since just removed {} for !p,  I think P has its way of handling intel_syntax.

All fine. Yet still the comment should reflect reality.

>>> --- a/opcodes/i386-opc.tbl
>>> +++ b/opcodes/i386-opc.tbl
>>> @@ -225,6 +225,7 @@ push, 0x68, i186&No64,
>>> DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Imm16|Imm32 }  push, 0x6,
>> No64,
>>> DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }  // In 64bit mode, the
>> operand size is implicitly 64bit.
>>>  push, 0x50, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
>>> +pushp, 0x50, APX_F, No_bSuf|No_lSuf|No_sSuf|Rex2, {Reg64 }
>>
>> Since Reg16 isn't allowed, you also want No_wSuf here (and below). Note also
>> the missing blank after the opening figure brace.
>>
> 
> Done.
> 
>> The new Rex2 attribute is not only wasteful (it can easily be a new enumerator
>> used with OperandConstraint), but also misleading. We don't just need REX2
>> here, but we need it with REX2.W set. Even if from the tc-i386.c changes it
>> looks as if that was happening implicitly (presumably due to the absence of
>> NoRex64), naming still needs to properly reflect the purpose.
>>
> 
> You are right, rex2.w is set in process_suffix, since there is no NoRex64. 
> 
> How about Rex2W? 
> if (i.tm.opcode_modifier.rex2w)
> {
>     i.rex2_encoding = true;
>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> }
> 
> 
> Or just add a special judgment?
> 
> if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp)
> {
>     i.rex2_encoding = true;
>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> }

I'd prefer the latter over a new attribute (albeit once again with the
comment actually matching code), and perhaps I view the latter equal to
my earlier suggestion.

> By the way, Is our APX patches (V3)  review still in progress? Since we have to commit them into the trunk before the next release, time is a bit tight ( many people may have vacation plans at the end of the year).

Hmm, not exactly sure what to say. You posted v3 only on Friday. Plus
I have spent an enormous amount of time reviewing v1 and v2. Please be
patient, I have this queued for another round of review. And no, there
is no "we have to ... before the next release". You (Intel) may want to,
but with a large new feature like this it had to be clear that getting
all pieces in will take time.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] Support APX PUSHP/POPP
  2023-11-27 14:06     ` Jan Beulich
@ 2023-11-28  2:32       ` Cui, Lili
  2023-11-28  8:34         ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Cui, Lili @ 2023-11-28  2:32 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, ccoutant, binutils

> >>> --- a/opcodes/i386-dis.c
> >>> +++ b/opcodes/i386-dis.c
> >>> @@ -1931,23 +1931,23 @@ static const struct dis386 dis386[] = {
> >>>    { "dec{S|}",		{ RMeSI }, 0 },
> >>>    { "dec{S|}",		{ RMeDI }, 0 },
> >>>    /* 50 */
> >>> -  { "push{!P|}",		{ RMrAX }, 0 },
> >>> -  { "push{!P|}",		{ RMrCX }, 0 },
> >>> -  { "push{!P|}",		{ RMrDX }, 0 },
> >>> -  { "push{!P|}",		{ RMrBX }, 0 },
> >>> -  { "push{!P|}",		{ RMrSP }, 0 },
> >>> -  { "push{!P|}",		{ RMrBP }, 0 },
> >>> -  { "push{!P|}",		{ RMrSI }, 0 },
> >>> -  { "push{!P|}",		{ RMrDI }, 0 },
> >>> +  { "push!P",		{ RMrAX }, 0 },
> >>> +  { "push!P",		{ RMrCX }, 0 },
> >>> +  { "push!P",		{ RMrDX }, 0 },
> >>> +  { "push!P",		{ RMrBX }, 0 },
> >>> +  { "push!P",		{ RMrSP }, 0 },
> >>> +  { "push!P",		{ RMrBP }, 0 },
> >>> +  { "push!P",		{ RMrSI }, 0 },
> >>> +  { "push!P",		{ RMrDI }, 0 },
> >>>    /* 58 */
> >>> -  { "pop{!P|}",		{ RMrAX }, 0 },
> >>> -  { "pop{!P|}",		{ RMrCX }, 0 },
> >>> -  { "pop{!P|}",		{ RMrDX }, 0 },
> >>> -  { "pop{!P|}",		{ RMrBX }, 0 },
> >>> -  { "pop{!P|}",		{ RMrSP }, 0 },
> >>> -  { "pop{!P|}",		{ RMrBP }, 0 },
> >>> -  { "pop{!P|}",		{ RMrSI }, 0 },
> >>> -  { "pop{!P|}",		{ RMrDI }, 0 },
> >>> +  { "pop!P",		{ RMrAX }, 0 },
> >>> +  { "pop!P",		{ RMrCX }, 0 },
> >>> +  { "pop!P",		{ RMrDX }, 0 },
> >>> +  { "pop!P",		{ RMrBX }, 0 },
> >>> +  { "pop!P",		{ RMrSP }, 0 },
> >>> +  { "pop!P",		{ RMrBP }, 0 },
> >>> +  { "pop!P",		{ RMrSI }, 0 },
> >>> +  { "pop!P",		{ RMrDI }, 0 },
> >>>    /* 60 */
> >>>    { X86_64_TABLE (X86_64_60) },
> >>>    { X86_64_TABLE (X86_64_61) },
> >>> @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char
> >> *in_template, int sizeflag)
> >>>  	case 'P':
> >>>  	  if (l == 0)
> >>>  	    {
> >>> +	      /* For pushp and popp, do not print {rex2} for them.  */
> >>> +	      if (ins->address_mode == mode_64bit && !cond
> >>
> >> I don't think the mode_64bit check is needed here, as without that
> >> REX.W cannot possibly be set (nor can a REX2 prefix be present).
> >
> > Done.
> >
> >>
> >>> +		  && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
> >>> +		{
> >>> +		  *ins->obufp++ = 'p';
> >>> +		  ins->rex2 |= 16;
> >>
> >> Please no new use of magic constants. Have a #define with a suitable
> >> name, and use that here. Also I think the comment you have ahead of
> >> the if() actually belongs here?
> >>
> >
> > How about " #define IMPLICIT_REX2 16",  PUSHP/POPP can share it with
> JMPABS.
> 
> But what's implicit about the REX2 prefix here?
> 

PUSHP/POPP requires rex2.w ==1 and we don't want to print {rex2} for it. JMPABS has the same need, so I want to define a common macro for them, or NO_NEED_PRINT_REX2?

> >>> +		  break;
> >>> +		}
> >>> +
> >>> +	      /* If "!p" printis nothing in intel_syntax.  */
> >>> +	      if (!cond && ins->intel_syntax)
> >>> +		break;
> >>
> >> The comment isn't quite right ('p' is printed). Also (nit) "prints".
> >>
> >
> > Added "!cond"  here, since just removed {} for !p,  I think P has its way of
> handling intel_syntax.
> 
> All fine. Yet still the comment should reflect reality.
> 

Changed.

> >>> --- a/opcodes/i386-opc.tbl
> >>> +++ b/opcodes/i386-opc.tbl
> >>> @@ -225,6 +225,7 @@ push, 0x68, i186&No64,
> >>> DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Imm16|Imm32 }  push, 0x6,
> >> No64,
> >>> DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }  // In 64bit mode, the
> >> operand size is implicitly 64bit.
> >>>  push, 0x50, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
> >>> +pushp, 0x50, APX_F, No_bSuf|No_lSuf|No_sSuf|Rex2, {Reg64 }
> >>
> >> Since Reg16 isn't allowed, you also want No_wSuf here (and below).
> >> Note also the missing blank after the opening figure brace.
> >>
> >
> > Done.
> >
> >> The new Rex2 attribute is not only wasteful (it can easily be a new
> >> enumerator used with OperandConstraint), but also misleading. We
> >> don't just need REX2 here, but we need it with REX2.W set. Even if
> >> from the tc-i386.c changes it looks as if that was happening
> >> implicitly (presumably due to the absence of NoRex64), naming still needs
> to properly reflect the purpose.
> >>
> >
> > You are right, rex2.w is set in process_suffix, since there is no NoRex64.
> >
> > How about Rex2W?
> > if (i.tm.opcode_modifier.rex2w)
> > {
> >     i.rex2_encoding = true;
> >     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> > }
> >
> >
> > Or just add a special judgment?
> >
> > if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp) {
> >     i.rex2_encoding = true;
> >     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> > }
> 
> I'd prefer the latter over a new attribute (albeit once again with the comment
> actually matching code), and perhaps I view the latter equal to my earlier
> suggestion.
> 

Ok, put them in process_operands, there's already a special handler here to change the prefix.

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 0dde2a9ad44..2f2b1b04d10 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -8715,6 +8715,13 @@ process_operands (void)
       i.tm.operands++;
     }

+  /* PUSHP/POPP requires rex2.w == 1.  */
+  if (i.tm.mnem_off == MN_pushp || i.tm.mnem_off == MN_popp)
+    {
+      i.rex2_encoding = true;
+      i.rex |= REX_W;
+    }
+
   if (i.tm.opcode_modifier.sse2avx && i.tm.opcode_modifier.vexvvvv)
     {

> > By the way, Is our APX patches (V3)  review still in progress? Since we have to
> commit them into the trunk before the next release, time is a bit tight ( many
> people may have vacation plans at the end of the year).
> 
> Hmm, not exactly sure what to say. You posted v3 only on Friday. Plus I have
> spent an enormous amount of time reviewing v1 and v2. Please be patient, I
> have this queued for another round of review. And no, there is no "we have
> to ... before the next release". You (Intel) may want to, but with a large new
> feature like this it had to be clear that getting all pieces in will take time.
> 

Maybe what I said is inappropriate and makes you uncomfortable, so just follow your own pace. It's not easy to review the code of such a large function, and I really know it will take up a lot of your time, at least I can't do such a detailed inspection as you do.

Thanks,
Lili.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Support APX PUSHP/POPP
  2023-11-28  2:32       ` Cui, Lili
@ 2023-11-28  8:34         ` Jan Beulich
  2023-11-28 13:14           ` Cui, Lili
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2023-11-28  8:34 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Lu, Hongjiu, ccoutant, binutils

On 28.11.2023 03:32, Cui, Lili wrote:
>>>>> --- a/opcodes/i386-dis.c
>>>>> +++ b/opcodes/i386-dis.c
>>>>> @@ -1931,23 +1931,23 @@ static const struct dis386 dis386[] = {
>>>>>    { "dec{S|}",		{ RMeSI }, 0 },
>>>>>    { "dec{S|}",		{ RMeDI }, 0 },
>>>>>    /* 50 */
>>>>> -  { "push{!P|}",		{ RMrAX }, 0 },
>>>>> -  { "push{!P|}",		{ RMrCX }, 0 },
>>>>> -  { "push{!P|}",		{ RMrDX }, 0 },
>>>>> -  { "push{!P|}",		{ RMrBX }, 0 },
>>>>> -  { "push{!P|}",		{ RMrSP }, 0 },
>>>>> -  { "push{!P|}",		{ RMrBP }, 0 },
>>>>> -  { "push{!P|}",		{ RMrSI }, 0 },
>>>>> -  { "push{!P|}",		{ RMrDI }, 0 },
>>>>> +  { "push!P",		{ RMrAX }, 0 },
>>>>> +  { "push!P",		{ RMrCX }, 0 },
>>>>> +  { "push!P",		{ RMrDX }, 0 },
>>>>> +  { "push!P",		{ RMrBX }, 0 },
>>>>> +  { "push!P",		{ RMrSP }, 0 },
>>>>> +  { "push!P",		{ RMrBP }, 0 },
>>>>> +  { "push!P",		{ RMrSI }, 0 },
>>>>> +  { "push!P",		{ RMrDI }, 0 },
>>>>>    /* 58 */
>>>>> -  { "pop{!P|}",		{ RMrAX }, 0 },
>>>>> -  { "pop{!P|}",		{ RMrCX }, 0 },
>>>>> -  { "pop{!P|}",		{ RMrDX }, 0 },
>>>>> -  { "pop{!P|}",		{ RMrBX }, 0 },
>>>>> -  { "pop{!P|}",		{ RMrSP }, 0 },
>>>>> -  { "pop{!P|}",		{ RMrBP }, 0 },
>>>>> -  { "pop{!P|}",		{ RMrSI }, 0 },
>>>>> -  { "pop{!P|}",		{ RMrDI }, 0 },
>>>>> +  { "pop!P",		{ RMrAX }, 0 },
>>>>> +  { "pop!P",		{ RMrCX }, 0 },
>>>>> +  { "pop!P",		{ RMrDX }, 0 },
>>>>> +  { "pop!P",		{ RMrBX }, 0 },
>>>>> +  { "pop!P",		{ RMrSP }, 0 },
>>>>> +  { "pop!P",		{ RMrBP }, 0 },
>>>>> +  { "pop!P",		{ RMrSI }, 0 },
>>>>> +  { "pop!P",		{ RMrDI }, 0 },
>>>>>    /* 60 */
>>>>>    { X86_64_TABLE (X86_64_60) },
>>>>>    { X86_64_TABLE (X86_64_61) },
>>>>> @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char
>>>> *in_template, int sizeflag)
>>>>>  	case 'P':
>>>>>  	  if (l == 0)
>>>>>  	    {
>>>>> +	      /* For pushp and popp, do not print {rex2} for them.  */
>>>>> +	      if (ins->address_mode == mode_64bit && !cond
>>>>
>>>> I don't think the mode_64bit check is needed here, as without that
>>>> REX.W cannot possibly be set (nor can a REX2 prefix be present).
>>>
>>> Done.
>>>
>>>>
>>>>> +		  && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
>>>>> +		{
>>>>> +		  *ins->obufp++ = 'p';
>>>>> +		  ins->rex2 |= 16;
>>>>
>>>> Please no new use of magic constants. Have a #define with a suitable
>>>> name, and use that here. Also I think the comment you have ahead of
>>>> the if() actually belongs here?
>>>>
>>>
>>> How about " #define IMPLICIT_REX2 16",  PUSHP/POPP can share it with
>> JMPABS.
>>
>> But what's implicit about the REX2 prefix here?
> 
> PUSHP/POPP requires rex2.w ==1 and we don't want to print {rex2} for it. JMPABS has the same need, so I want to define a common macro for them, or NO_NEED_PRINT_REX2?

Yes, we want a shared constant here (as much as we want a shared way of
dealing with the need to emit REX2 in the assembler). Still, the naming
wants to fit not only the goal (to suppress the printing of {rex2}), but
also where the bit is actually stored (in ->rex2). Therefore its name
wants to fit with the other names used for bits in that field. Which
(to me) first of all means it wants to start with REX_ (or REX2_).
REX_SPECIAL or REX2_SPECIAL might be an option, but might also be too
generic (i.e. becoming an issue down the road). But I think this should
give you an idea ...

Then again, why is this constant needed for PUSHP/POPP, which have
REX2.W set anyway, and hence that bit alone (when properly marked as
consumed) should already allow to omit {rex2}. The question of adding
a new constant (and how to suitably name it) would then be fully
constrained to the JMPABS patch (for now, i.e. until such time that a
2nd insn appears which requires an otherwise empty REX2 prefix).

>>>>> --- a/opcodes/i386-opc.tbl
>>>>> +++ b/opcodes/i386-opc.tbl
>>>>> @@ -225,6 +225,7 @@ push, 0x68, i186&No64,
>>>>> DefaultSize|No_bSuf|No_sSuf|No_qSuf, { Imm16|Imm32 }  push, 0x6,
>>>> No64,
>>>>> DefaultSize|No_bSuf|No_sSuf|No_qSuf, { SReg }  // In 64bit mode, the
>>>> operand size is implicitly 64bit.
>>>>>  push, 0x50, x64, No_bSuf|No_lSuf|No_sSuf|NoRex64, { Reg16|Reg64 }
>>>>> +pushp, 0x50, APX_F, No_bSuf|No_lSuf|No_sSuf|Rex2, {Reg64 }
>>>>
>>>> Since Reg16 isn't allowed, you also want No_wSuf here (and below).
>>>> Note also the missing blank after the opening figure brace.
>>>>
>>>
>>> Done.
>>>
>>>> The new Rex2 attribute is not only wasteful (it can easily be a new
>>>> enumerator used with OperandConstraint), but also misleading. We
>>>> don't just need REX2 here, but we need it with REX2.W set. Even if
>>>> from the tc-i386.c changes it looks as if that was happening
>>>> implicitly (presumably due to the absence of NoRex64), naming still needs
>> to properly reflect the purpose.
>>>>
>>>
>>> You are right, rex2.w is set in process_suffix, since there is no NoRex64.
>>>
>>> How about Rex2W?
>>> if (i.tm.opcode_modifier.rex2w)
>>> {
>>>     i.rex2_encoding = true;
>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
>>> }
>>>
>>>
>>> Or just add a special judgment?
>>>
>>> if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp) {
>>>     i.rex2_encoding = true;
>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
>>> }
>>
>> I'd prefer the latter over a new attribute (albeit once again with the comment
>> actually matching code), and perhaps I view the latter equal to my earlier
>> suggestion.
>>
> 
> Ok, put them in process_operands, there's already a special handler here to change the prefix.
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 0dde2a9ad44..2f2b1b04d10 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -8715,6 +8715,13 @@ process_operands (void)
>        i.tm.operands++;
>      }
> 
> +  /* PUSHP/POPP requires rex2.w == 1.  */
> +  if (i.tm.mnem_off == MN_pushp || i.tm.mnem_off == MN_popp)
> +    {
> +      i.rex2_encoding = true;
> +      i.rex |= REX_W;
> +    }

Well, I'm sorry for not considering JMPABS earlier on, but with that
also needing dealing with, I think I view my alternative suggestion as
preferable. That'll scale better when also considering that down the
road further such insns may appear. Whether it's actually
OperandConstraint that we leverage here is secondary (it's not ideal
because there's nothing operand related here). I'd be perfectly okay
with some other attribute being suitably overloaded, whereas I
continue to think that introducing new attributes should preferably
be limited to either cases where more than just two or three
templates use them or cases where otherwise it's impossible to avoid
ambiguities. From earlier changes of mine the underlying reason
ought to be pretty clear: Each new attribute consumes storage, and
with thousands of templates growth of storage requirements should
be balanced with how frequently an attribute is actually going to
have a non-zero value. For example, with is_evex_encoding() gone a
brief inspection suggests that it might be possible to overload
Masking (or maybe Broadcast): They're applicable to EVEX templates
only, and the class of insns we're discussing here is never going to
be EVEX (or VEX). IOW not much different from the overloading of
StaticRounding. Such an overload may then well be named Rex2 (as you
had it, and considering its intended use also for JMPABS, plus
taking into consideration that REX2.W will be set simply because of
the absence of NoRex64).

The other issue I have with your approach is that you again (ab)use
i.rex2_encoding: As expressed before, that's representing {rex2}, i.e.
a weak request (aka hint). Whereas the two insns here _require_ a REX2
prefix. Doing so may end up being tolerable, but would require extra
justification and/or commenting.

Finally putting the logic in process_operands() is also misleading:
There's no processing of operands here. Pre-existing abuse of the
function (as you appear to indicate exists, albeit I didn't go check)
is not really a good excuse, I'm afraid.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] Support APX PUSHP/POPP
  2023-11-28  8:34         ` Jan Beulich
@ 2023-11-28 13:14           ` Cui, Lili
  2023-11-28 13:54             ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Cui, Lili @ 2023-11-28 13:14 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, ccoutant, binutils

> 
> >>>>> @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char
> >>>> *in_template, int sizeflag)
> >>>>>  	case 'P':
> >>>>>  	  if (l == 0)
> >>>>>  	    {
> >>>>> +	      /* For pushp and popp, do not print {rex2} for them.  */
> >>>>> +	      if (ins->address_mode == mode_64bit && !cond
> >>>>
> >>>> I don't think the mode_64bit check is needed here, as without that
> >>>> REX.W cannot possibly be set (nor can a REX2 prefix be present).
> >>>
> >>> Done.
> >>>
> >>>>
> >>>>> +		  && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
> >>>>> +		{
> >>>>> +		  *ins->obufp++ = 'p';
> >>>>> +		  ins->rex2 |= 16;
> >>>>
> >>>> Please no new use of magic constants. Have a #define with a
> >>>> suitable name, and use that here. Also I think the comment you have
> >>>> ahead of the if() actually belongs here?
> >>>>
> >>>
> >>> How about " #define IMPLICIT_REX2 16",  PUSHP/POPP can share it with
> >> JMPABS.
> >>
> >> But what's implicit about the REX2 prefix here?
> >
> > PUSHP/POPP requires rex2.w ==1 and we don't want to print {rex2} for it.
> JMPABS has the same need, so I want to define a common macro for them, or
> NO_NEED_PRINT_REX2?
> 
> Yes, we want a shared constant here (as much as we want a shared way of
> dealing with the need to emit REX2 in the assembler). Still, the naming wants
> to fit not only the goal (to suppress the printing of {rex2}), but also where the
> bit is actually stored (in ->rex2). Therefore its name wants to fit with the other
> names used for bits in that field. Which (to me) first of all means it wants to
> start with REX_ (or REX2_).
> REX_SPECIAL or REX2_SPECIAL might be an option, but might also be too
> generic (i.e. becoming an issue down the road). But I think this should give you
> an idea ...
> 

How about REX2_ IGNORED ?

> Then again, why is this constant needed for PUSHP/POPP, which have REX2.W
> set anyway, and hence that bit alone (when properly marked as
> consumed) should already allow to omit {rex2}. The question of adding a new
> constant (and how to suitably name it) would then be fully constrained to the
> JMPABS patch (for now, i.e. until such time that a 2nd insn appears which
> requires an otherwise empty REX2 prefix).
> 

The current implementation is that no matter whether rex.w is consumed or not, {rex2} will be printed as long as there is no Egpr. Of course, this place may be changed as discussed later.

> >>>
> >>>> The new Rex2 attribute is not only wasteful (it can easily be a new
> >>>> enumerator used with OperandConstraint), but also misleading. We
> >>>> don't just need REX2 here, but we need it with REX2.W set. Even if
> >>>> from the tc-i386.c changes it looks as if that was happening
> >>>> implicitly (presumably due to the absence of NoRex64), naming still
> >>>> needs
> >> to properly reflect the purpose.
> >>>>
> >>>
> >>> You are right, rex2.w is set in process_suffix, since there is no NoRex64.
> >>>
> >>> How about Rex2W?
> >>> if (i.tm.opcode_modifier.rex2w)
> >>> {
> >>>     i.rex2_encoding = true;
> >>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> >>> }
> >>>
> >>>
> >>> Or just add a special judgment?
> >>>
> >>> if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp) {
> >>>     i.rex2_encoding = true;
> >>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> >>> }
> >>
> >> I'd prefer the latter over a new attribute (albeit once again with
> >> the comment actually matching code), and perhaps I view the latter
> >> equal to my earlier suggestion.
> >>
> >
> > Ok, put them in process_operands, there's already a special handler here to
> change the prefix.
> >
> > diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index
> > 0dde2a9ad44..2f2b1b04d10 100644
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -8715,6 +8715,13 @@ process_operands (void)
> >        i.tm.operands++;
> >      }
> >
> > +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off ==
> > + MN_pushp || i.tm.mnem_off == MN_popp)
> > +    {
> > +      i.rex2_encoding = true;
> > +      i.rex |= REX_W;
> > +    }
> 
> Well, I'm sorry for not considering JMPABS earlier on, but with that also
> needing dealing with, I think I view my alternative suggestion as preferable.
> That'll scale better when also considering that down the road further such
> insns may appear. Whether it's actually OperandConstraint that we leverage
> here is secondary (it's not ideal because there's nothing operand related here).
> I'd be perfectly okay with some other attribute being suitably overloaded,
> whereas I continue to think that introducing new attributes should preferably
> be limited to either cases where more than just two or three templates use
> them or cases where otherwise it's impossible to avoid ambiguities. From
> earlier changes of mine the underlying reason ought to be pretty clear: Each
> new attribute consumes storage, and with thousands of templates growth of
> storage requirements should be balanced with how frequently an attribute is
> actually going to have a non-zero value. For example, with is_evex_encoding()
> gone a brief inspection suggests that it might be possible to overload Masking
> (or maybe Broadcast): They're applicable to EVEX templates only, and the class
> of insns we're discussing here is never going to be EVEX (or VEX). IOW not
> much different from the overloading of StaticRounding. Such an overload may
> then well be named Rex2 (as you had it, and considering its intended use also
> for JMPABS, plus taking into consideration that REX2.W will be set simply
> because of the absence of NoRex64).
>

Haha, StaticRounding is really special, I tried "#define Rex2Req Masking" and found that it will be used in i386-gen.c to identify EVEX (Broadcast...), then I tried VexW and SIB found that they are all used without precheck whether it was an vex instruction. Finally I wanted to re-use StaticRounding and found out that hulin already uses it for legacy insns.

Thanks,
Lili.

> The other issue I have with your approach is that you again (ab)use
> i.rex2_encoding: As expressed before, that's representing {rex2}, i.e.
> a weak request (aka hint). Whereas the two insns here _require_ a REX2
> prefix. Doing so may end up being tolerable, but would require extra
> justification and/or commenting.
> 
> Finally putting the logic in process_operands() is also misleading:
> There's no processing of operands here. Pre-existing abuse of the function (as
> you appear to indicate exists, albeit I didn't go check) is not really a good
> excuse, I'm afraid.
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Support APX PUSHP/POPP
  2023-11-28 13:14           ` Cui, Lili
@ 2023-11-28 13:54             ` Jan Beulich
  2023-11-29  3:08               ` Cui, Lili
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2023-11-28 13:54 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Lu, Hongjiu, ccoutant, binutils

On 28.11.2023 14:14, Cui, Lili wrote:
>>
>>>>>>> @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char
>>>>>> *in_template, int sizeflag)
>>>>>>>  	case 'P':
>>>>>>>  	  if (l == 0)
>>>>>>>  	    {
>>>>>>> +	      /* For pushp and popp, do not print {rex2} for them.  */
>>>>>>> +	      if (ins->address_mode == mode_64bit && !cond
>>>>>>
>>>>>> I don't think the mode_64bit check is needed here, as without that
>>>>>> REX.W cannot possibly be set (nor can a REX2 prefix be present).
>>>>>
>>>>> Done.
>>>>>
>>>>>>
>>>>>>> +		  && ins->last_rex2_prefix >= 0 && (ins->rex & REX_W))
>>>>>>> +		{
>>>>>>> +		  *ins->obufp++ = 'p';
>>>>>>> +		  ins->rex2 |= 16;
>>>>>>
>>>>>> Please no new use of magic constants. Have a #define with a
>>>>>> suitable name, and use that here. Also I think the comment you have
>>>>>> ahead of the if() actually belongs here?
>>>>>>
>>>>>
>>>>> How about " #define IMPLICIT_REX2 16",  PUSHP/POPP can share it with
>>>> JMPABS.
>>>>
>>>> But what's implicit about the REX2 prefix here?
>>>
>>> PUSHP/POPP requires rex2.w ==1 and we don't want to print {rex2} for it.
>> JMPABS has the same need, so I want to define a common macro for them, or
>> NO_NEED_PRINT_REX2?
>>
>> Yes, we want a shared constant here (as much as we want a shared way of
>> dealing with the need to emit REX2 in the assembler). Still, the naming wants
>> to fit not only the goal (to suppress the printing of {rex2}), but also where the
>> bit is actually stored (in ->rex2). Therefore its name wants to fit with the other
>> names used for bits in that field. Which (to me) first of all means it wants to
>> start with REX_ (or REX2_).
>> REX_SPECIAL or REX2_SPECIAL might be an option, but might also be too
>> generic (i.e. becoming an issue down the road). But I think this should give you
>> an idea ...
> 
> How about REX2_ IGNORED ?

Perhaps. It's no better or worse than REX2_SPECIAL. Just make sure you
add a comment explaining what it's to be used for.

>> Then again, why is this constant needed for PUSHP/POPP, which have REX2.W
>> set anyway, and hence that bit alone (when properly marked as
>> consumed) should already allow to omit {rex2}. The question of adding a new
>> constant (and how to suitably name it) would then be fully constrained to the
>> JMPABS patch (for now, i.e. until such time that a 2nd insn appears which
>> requires an otherwise empty REX2 prefix).
>>
> 
> The current implementation is that no matter whether rex.w is consumed or not, {rex2} will be printed as long as there is no Egpr. Of course, this place may be changed as discussed later.

That's a bug then. If there is a REX2 prefix with just REX2.W set, and
if that bit is properly consumed, no {rex2} should be printed imo. That's
despite the same thing being expressable (in the common case; not here)
with REX.W. Anything beyond that depends on the wider question of how
to deal with _unused_ REX2 payload bits.

>>>>>> The new Rex2 attribute is not only wasteful (it can easily be a new
>>>>>> enumerator used with OperandConstraint), but also misleading. We
>>>>>> don't just need REX2 here, but we need it with REX2.W set. Even if
>>>>>> from the tc-i386.c changes it looks as if that was happening
>>>>>> implicitly (presumably due to the absence of NoRex64), naming still
>>>>>> needs
>>>> to properly reflect the purpose.
>>>>>>
>>>>>
>>>>> You are right, rex2.w is set in process_suffix, since there is no NoRex64.
>>>>>
>>>>> How about Rex2W?
>>>>> if (i.tm.opcode_modifier.rex2w)
>>>>> {
>>>>>     i.rex2_encoding = true;
>>>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
>>>>> }
>>>>>
>>>>>
>>>>> Or just add a special judgment?
>>>>>
>>>>> if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp) {
>>>>>     i.rex2_encoding = true;
>>>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
>>>>> }
>>>>
>>>> I'd prefer the latter over a new attribute (albeit once again with
>>>> the comment actually matching code), and perhaps I view the latter
>>>> equal to my earlier suggestion.
>>>>
>>>
>>> Ok, put them in process_operands, there's already a special handler here to
>> change the prefix.
>>>
>>> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index
>>> 0dde2a9ad44..2f2b1b04d10 100644
>>> --- a/gas/config/tc-i386.c
>>> +++ b/gas/config/tc-i386.c
>>> @@ -8715,6 +8715,13 @@ process_operands (void)
>>>        i.tm.operands++;
>>>      }
>>>
>>> +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off ==
>>> + MN_pushp || i.tm.mnem_off == MN_popp)
>>> +    {
>>> +      i.rex2_encoding = true;
>>> +      i.rex |= REX_W;
>>> +    }
>>
>> Well, I'm sorry for not considering JMPABS earlier on, but with that also
>> needing dealing with, I think I view my alternative suggestion as preferable.
>> That'll scale better when also considering that down the road further such
>> insns may appear. Whether it's actually OperandConstraint that we leverage
>> here is secondary (it's not ideal because there's nothing operand related here).
>> I'd be perfectly okay with some other attribute being suitably overloaded,
>> whereas I continue to think that introducing new attributes should preferably
>> be limited to either cases where more than just two or three templates use
>> them or cases where otherwise it's impossible to avoid ambiguities. From
>> earlier changes of mine the underlying reason ought to be pretty clear: Each
>> new attribute consumes storage, and with thousands of templates growth of
>> storage requirements should be balanced with how frequently an attribute is
>> actually going to have a non-zero value. For example, with is_evex_encoding()
>> gone a brief inspection suggests that it might be possible to overload Masking
>> (or maybe Broadcast): They're applicable to EVEX templates only, and the class
>> of insns we're discussing here is never going to be EVEX (or VEX). IOW not
>> much different from the overloading of StaticRounding. Such an overload may
>> then well be named Rex2 (as you had it, and considering its intended use also
>> for JMPABS, plus taking into consideration that REX2.W will be set simply
>> because of the absence of NoRex64).
>>
> 
> Haha, StaticRounding is really special, I tried "#define Rex2Req Masking" and found that it will be used in i386-gen.c to identify EVEX (Broadcast...), then I tried VexW and SIB found that they are all used without precheck whether it was an vex instruction. Finally I wanted to re-use StaticRounding and found out that hulin already uses it for legacy insns.

Hmm, I'm sorry for the trouble. I'm inclined to say OperandConstraint with a
new #define it is then. Once everything's in I could then still see whether
I can (reasonably) make e.g. Masking work here.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] Support APX PUSHP/POPP
  2023-11-28 13:54             ` Jan Beulich
@ 2023-11-29  3:08               ` Cui, Lili
  2023-11-29  8:29                 ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Cui, Lili @ 2023-11-29  3:08 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, ccoutant, binutils

> On 28.11.2023 14:14, Cui, Lili wrote:
> >>
> >>>>>>> @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char
> >>>>>> *in_template, int sizeflag)
> >>>>>>>  	case 'P':
> >>>>>>>  	  if (l == 0)
> >>>>>>>  	    {
> >>>>>>> +	      /* For pushp and popp, do not print {rex2} for them.  */
> >>>>>>> +	      if (ins->address_mode == mode_64bit && !cond
> >>>>>>
> >>>>>> I don't think the mode_64bit check is needed here, as without
> >>>>>> that REX.W cannot possibly be set (nor can a REX2 prefix be present).
> >>>>>
> >>>>> Done.
> >>>>>
> >>>>>>
> >>>>>>> +		  && ins->last_rex2_prefix >= 0 && (ins->rex &
> REX_W))
> >>>>>>> +		{
> >>>>>>> +		  *ins->obufp++ = 'p';
> >>>>>>> +		  ins->rex2 |= 16;
> >>>>>>
> >>>>>> Please no new use of magic constants. Have a #define with a
> >>>>>> suitable name, and use that here. Also I think the comment you
> >>>>>> have ahead of the if() actually belongs here?
> >>>>>>
> >>>>>
> >>>>> How about " #define IMPLICIT_REX2 16",  PUSHP/POPP can share it
> >>>>> with
> >>>> JMPABS.
> >>>>
> >>>> But what's implicit about the REX2 prefix here?
> >>>
> >>> PUSHP/POPP requires rex2.w ==1 and we don't want to print {rex2} for it.
> >> JMPABS has the same need, so I want to define a common macro for
> >> them, or NO_NEED_PRINT_REX2?
> >>
> >> Yes, we want a shared constant here (as much as we want a shared way
> >> of dealing with the need to emit REX2 in the assembler). Still, the
> >> naming wants to fit not only the goal (to suppress the printing of
> >> {rex2}), but also where the bit is actually stored (in ->rex2).
> >> Therefore its name wants to fit with the other names used for bits in
> >> that field. Which (to me) first of all means it wants to start with REX_ (or
> REX2_).
> >> REX_SPECIAL or REX2_SPECIAL might be an option, but might also be too
> >> generic (i.e. becoming an issue down the road). But I think this
> >> should give you an idea ...
> >
> > How about REX2_ IGNORED ?
> 
> Perhaps. It's no better or worse than REX2_SPECIAL. Just make sure you add a
> comment explaining what it's to be used for.
> 

Done.

> >> Then again, why is this constant needed for PUSHP/POPP, which have
> >> REX2.W set anyway, and hence that bit alone (when properly marked as
> >> consumed) should already allow to omit {rex2}. The question of adding
> >> a new constant (and how to suitably name it) would then be fully
> >> constrained to the JMPABS patch (for now, i.e. until such time that a
> >> 2nd insn appears which requires an otherwise empty REX2 prefix).
> >>
> >
> > The current implementation is that no matter whether rex.w is consumed or
> not, {rex2} will be printed as long as there is no Egpr. Of course, this place may
> be changed as discussed later.
> 
> That's a bug then. If there is a REX2 prefix with just REX2.W set, and if that bit
> is properly consumed, no {rex2} should be printed imo. That's despite the
> same thing being expressable (in the common case; not here) with REX.W.
> Anything beyond that depends on the wider question of how to deal with
> _unused_ REX2 payload bits.
> 

For the last two instructions, although rex.w is consumed, we still need to print {rex2} to distinguish them.

0000000000000000 <_start>:
   0:   48 50                                 rex.W push %rax
   2:   d5 08 50                           pushp  %rax
   5:   48 6a 01                           rex.W push $0x1
   8:   d5 08 6a 01                     {rex2} push $0x1
   c:   48 c7 00 12 00 00 00                 movq   $0x12,(%rax)
  13:   d5 08 c7 00 12 00 00 00         {rex2} movq $0x12,(%rax)


> >>>>>> The new Rex2 attribute is not only wasteful (it can easily be a
> >>>>>> new enumerator used with OperandConstraint), but also misleading.
> >>>>>> We don't just need REX2 here, but we need it with REX2.W set.
> >>>>>> Even if from the tc-i386.c changes it looks as if that was
> >>>>>> happening implicitly (presumably due to the absence of NoRex64),
> >>>>>> naming still needs
> >>>> to properly reflect the purpose.
> >>>>>>
> >>>>>
> >>>>> You are right, rex2.w is set in process_suffix, since there is no NoRex64.
> >>>>>
> >>>>> How about Rex2W?
> >>>>> if (i.tm.opcode_modifier.rex2w)
> >>>>> {
> >>>>>     i.rex2_encoding = true;
> >>>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> >>>>> }
> >>>>>
> >>>>>
> >>>>> Or just add a special judgment?
> >>>>>
> >>>>> if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp) {
> >>>>>     i.rex2_encoding = true;
> >>>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> >>>>> }
> >>>>
> >>>> I'd prefer the latter over a new attribute (albeit once again with
> >>>> the comment actually matching code), and perhaps I view the latter
> >>>> equal to my earlier suggestion.
> >>>>
> >>>
> >>> Ok, put them in process_operands, there's already a special handler
> >>> here to
> >> change the prefix.
> >>>
> >>> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index
> >>> 0dde2a9ad44..2f2b1b04d10 100644
> >>> --- a/gas/config/tc-i386.c
> >>> +++ b/gas/config/tc-i386.c
> >>> @@ -8715,6 +8715,13 @@ process_operands (void)
> >>>        i.tm.operands++;
> >>>      }
> >>>
> >>> +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off ==
> >>> + MN_pushp || i.tm.mnem_off == MN_popp)
> >>> +    {
> >>> +      i.rex2_encoding = true;
> >>> +      i.rex |= REX_W;
> >>> +    }
> >>
> >> Well, I'm sorry for not considering JMPABS earlier on, but with that
> >> also needing dealing with, I think I view my alternative suggestion as
> preferable.
> >> That'll scale better when also considering that down the road further
> >> such insns may appear. Whether it's actually OperandConstraint that
> >> we leverage here is secondary (it's not ideal because there's nothing
> operand related here).
> >> I'd be perfectly okay with some other attribute being suitably
> >> overloaded, whereas I continue to think that introducing new
> >> attributes should preferably be limited to either cases where more
> >> than just two or three templates use them or cases where otherwise
> >> it's impossible to avoid ambiguities. From earlier changes of mine
> >> the underlying reason ought to be pretty clear: Each new attribute
> >> consumes storage, and with thousands of templates growth of storage
> >> requirements should be balanced with how frequently an attribute is
> >> actually going to have a non-zero value. For example, with
> >> is_evex_encoding() gone a brief inspection suggests that it might be
> >> possible to overload Masking (or maybe Broadcast): They're applicable
> >> to EVEX templates only, and the class of insns we're discussing here
> >> is never going to be EVEX (or VEX). IOW not much different from the
> >> overloading of StaticRounding. Such an overload may then well be
> >> named Rex2 (as you had it, and considering its intended use also for
> JMPABS, plus taking into consideration that REX2.W will be set simply because
> of the absence of NoRex64).
> >>
> >
> > Haha, StaticRounding is really special, I tried "#define Rex2Req Masking" and
> found that it will be used in i386-gen.c to identify EVEX (Broadcast...), then I
> tried VexW and SIB found that they are all used without precheck whether it
> was an vex instruction. Finally I wanted to re-use StaticRounding and found
> out that hulin already uses it for legacy insns.
> 
> Hmm, I'm sorry for the trouble. I'm inclined to say OperandConstraint with a
> new #define it is then. Once everything's in I could then still see whether I can
> (reasonably) make e.g. Masking work here.
> 

Then I will create a new attribute Rex2 for it.

Thanks,
Lili.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Support APX PUSHP/POPP
  2023-11-29  3:08               ` Cui, Lili
@ 2023-11-29  8:29                 ` Jan Beulich
  2023-11-29 10:38                   ` Cui, Lili
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2023-11-29  8:29 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Lu, Hongjiu, ccoutant, binutils

On 29.11.2023 04:08, Cui, Lili wrote:
>> On 28.11.2023 14:14, Cui, Lili wrote:
>>>>
>>>>>>>>> @@ -10621,6 +10621,19 @@ putop (instr_info *ins, const char
>>>>>>>> *in_template, int sizeflag)
>>>>>>>>>  	case 'P':
>>>>>>>>>  	  if (l == 0)
>>>>>>>>>  	    {
>>>>>>>>> +	      /* For pushp and popp, do not print {rex2} for them.  */
>>>>>>>>> +	      if (ins->address_mode == mode_64bit && !cond
>>>>>>>>
>>>>>>>> I don't think the mode_64bit check is needed here, as without
>>>>>>>> that REX.W cannot possibly be set (nor can a REX2 prefix be present).
>>>>>>>
>>>>>>> Done.
>>>>>>>
>>>>>>>>
>>>>>>>>> +		  && ins->last_rex2_prefix >= 0 && (ins->rex &
>> REX_W))
>>>>>>>>> +		{
>>>>>>>>> +		  *ins->obufp++ = 'p';
>>>>>>>>> +		  ins->rex2 |= 16;
>>>>>>>>
>>>>>>>> Please no new use of magic constants. Have a #define with a
>>>>>>>> suitable name, and use that here. Also I think the comment you
>>>>>>>> have ahead of the if() actually belongs here?
>>>>>>>>
>>>>>>>
>>>>>>> How about " #define IMPLICIT_REX2 16",  PUSHP/POPP can share it
>>>>>>> with
>>>>>> JMPABS.
>>>>>>
>>>>>> But what's implicit about the REX2 prefix here?
>>>>>
>>>>> PUSHP/POPP requires rex2.w ==1 and we don't want to print {rex2} for it.
>>>> JMPABS has the same need, so I want to define a common macro for
>>>> them, or NO_NEED_PRINT_REX2?
>>>>
>>>> Yes, we want a shared constant here (as much as we want a shared way
>>>> of dealing with the need to emit REX2 in the assembler). Still, the
>>>> naming wants to fit not only the goal (to suppress the printing of
>>>> {rex2}), but also where the bit is actually stored (in ->rex2).
>>>> Therefore its name wants to fit with the other names used for bits in
>>>> that field. Which (to me) first of all means it wants to start with REX_ (or
>> REX2_).
>>>> REX_SPECIAL or REX2_SPECIAL might be an option, but might also be too
>>>> generic (i.e. becoming an issue down the road). But I think this
>>>> should give you an idea ...
>>>
>>> How about REX2_ IGNORED ?
>>
>> Perhaps. It's no better or worse than REX2_SPECIAL. Just make sure you add a
>> comment explaining what it's to be used for.
>>
> 
> Done.
> 
>>>> Then again, why is this constant needed for PUSHP/POPP, which have
>>>> REX2.W set anyway, and hence that bit alone (when properly marked as
>>>> consumed) should already allow to omit {rex2}. The question of adding
>>>> a new constant (and how to suitably name it) would then be fully
>>>> constrained to the JMPABS patch (for now, i.e. until such time that a
>>>> 2nd insn appears which requires an otherwise empty REX2 prefix).
>>>>
>>>
>>> The current implementation is that no matter whether rex.w is consumed or
>> not, {rex2} will be printed as long as there is no Egpr. Of course, this place may
>> be changed as discussed later.
>>
>> That's a bug then. If there is a REX2 prefix with just REX2.W set, and if that bit
>> is properly consumed, no {rex2} should be printed imo. That's despite the
>> same thing being expressable (in the common case; not here) with REX.W.
>> Anything beyond that depends on the wider question of how to deal with
>> _unused_ REX2 payload bits.
>>
> 
> For the last two instructions, although rex.w is consumed, we still need to print {rex2} to distinguish them.
> 
> 0000000000000000 <_start>:
>    0:   48 50                                 rex.W push %rax
>    2:   d5 08 50                           pushp  %rax
>    5:   48 6a 01                           rex.W push $0x1
>    8:   d5 08 6a 01                     {rex2} push $0x1
>    c:   48 c7 00 12 00 00 00                 movq   $0x12,(%rax)
>   13:   d5 08 c7 00 12 00 00 00         {rex2} movq $0x12,(%rax)

Well, no, I don't really agree (especially not for the last one). But
that goes back to the more general question on the printing of {rex2}
without also indicating unconsumed bits.

>>>>>>>> The new Rex2 attribute is not only wasteful (it can easily be a
>>>>>>>> new enumerator used with OperandConstraint), but also misleading.
>>>>>>>> We don't just need REX2 here, but we need it with REX2.W set.
>>>>>>>> Even if from the tc-i386.c changes it looks as if that was
>>>>>>>> happening implicitly (presumably due to the absence of NoRex64),
>>>>>>>> naming still needs
>>>>>> to properly reflect the purpose.
>>>>>>>>
>>>>>>>
>>>>>>> You are right, rex2.w is set in process_suffix, since there is no NoRex64.
>>>>>>>
>>>>>>> How about Rex2W?
>>>>>>> if (i.tm.opcode_modifier.rex2w)
>>>>>>> {
>>>>>>>     i.rex2_encoding = true;
>>>>>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> Or just add a special judgment?
>>>>>>>
>>>>>>> if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp) {
>>>>>>>     i.rex2_encoding = true;
>>>>>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
>>>>>>> }
>>>>>>
>>>>>> I'd prefer the latter over a new attribute (albeit once again with
>>>>>> the comment actually matching code), and perhaps I view the latter
>>>>>> equal to my earlier suggestion.
>>>>>>
>>>>>
>>>>> Ok, put them in process_operands, there's already a special handler
>>>>> here to
>>>> change the prefix.
>>>>>
>>>>> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index
>>>>> 0dde2a9ad44..2f2b1b04d10 100644
>>>>> --- a/gas/config/tc-i386.c
>>>>> +++ b/gas/config/tc-i386.c
>>>>> @@ -8715,6 +8715,13 @@ process_operands (void)
>>>>>        i.tm.operands++;
>>>>>      }
>>>>>
>>>>> +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off ==
>>>>> + MN_pushp || i.tm.mnem_off == MN_popp)
>>>>> +    {
>>>>> +      i.rex2_encoding = true;
>>>>> +      i.rex |= REX_W;
>>>>> +    }
>>>>
>>>> Well, I'm sorry for not considering JMPABS earlier on, but with that
>>>> also needing dealing with, I think I view my alternative suggestion as
>> preferable.
>>>> That'll scale better when also considering that down the road further
>>>> such insns may appear. Whether it's actually OperandConstraint that
>>>> we leverage here is secondary (it's not ideal because there's nothing
>> operand related here).
>>>> I'd be perfectly okay with some other attribute being suitably
>>>> overloaded, whereas I continue to think that introducing new
>>>> attributes should preferably be limited to either cases where more
>>>> than just two or three templates use them or cases where otherwise
>>>> it's impossible to avoid ambiguities. From earlier changes of mine
>>>> the underlying reason ought to be pretty clear: Each new attribute
>>>> consumes storage, and with thousands of templates growth of storage
>>>> requirements should be balanced with how frequently an attribute is
>>>> actually going to have a non-zero value. For example, with
>>>> is_evex_encoding() gone a brief inspection suggests that it might be
>>>> possible to overload Masking (or maybe Broadcast): They're applicable
>>>> to EVEX templates only, and the class of insns we're discussing here
>>>> is never going to be EVEX (or VEX). IOW not much different from the
>>>> overloading of StaticRounding. Such an overload may then well be
>>>> named Rex2 (as you had it, and considering its intended use also for
>> JMPABS, plus taking into consideration that REX2.W will be set simply because
>> of the absence of NoRex64).
>>>>
>>>
>>> Haha, StaticRounding is really special, I tried "#define Rex2Req Masking" and
>> found that it will be used in i386-gen.c to identify EVEX (Broadcast...), then I
>> tried VexW and SIB found that they are all used without precheck whether it
>> was an vex instruction. Finally I wanted to re-use StaticRounding and found
>> out that hulin already uses it for legacy insns.
>>
>> Hmm, I'm sorry for the trouble. I'm inclined to say OperandConstraint with a
>> new #define it is then. Once everything's in I could then still see whether I can
>> (reasonably) make e.g. Masking work here.
>>
> 
> Then I will create a new attribute Rex2 for it.

???

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] Support APX PUSHP/POPP
  2023-11-29  8:29                 ` Jan Beulich
@ 2023-11-29 10:38                   ` Cui, Lili
  2023-11-29 11:01                     ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Cui, Lili @ 2023-11-29 10:38 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, ccoutant, binutils

> >>>> Then again, why is this constant needed for PUSHP/POPP, which have
> >>>> REX2.W set anyway, and hence that bit alone (when properly marked
> >>>> as
> >>>> consumed) should already allow to omit {rex2}. The question of
> >>>> adding a new constant (and how to suitably name it) would then be
> >>>> fully constrained to the JMPABS patch (for now, i.e. until such
> >>>> time that a 2nd insn appears which requires an otherwise empty REX2
> prefix).
> >>>>
> >>>
> >>> The current implementation is that no matter whether rex.w is
> >>> consumed or
> >> not, {rex2} will be printed as long as there is no Egpr. Of course,
> >> this place may be changed as discussed later.
> >>
> >> That's a bug then. If there is a REX2 prefix with just REX2.W set,
> >> and if that bit is properly consumed, no {rex2} should be printed
> >> imo. That's despite the same thing being expressable (in the common case;
> not here) with REX.W.
> >> Anything beyond that depends on the wider question of how to deal
> >> with _unused_ REX2 payload bits.
> >>
> >
> > For the last two instructions, although rex.w is consumed, we still need to
> print {rex2} to distinguish them.
> >
> > 0000000000000000 <_start>:
> >    0:   48 50                                 rex.W push %rax
> >    2:   d5 08 50                           pushp  %rax
> >    5:   48 6a 01                           rex.W push $0x1
> >    8:   d5 08 6a 01                     {rex2} push $0x1
> >    c:   48 c7 00 12 00 00 00                 movq   $0x12,(%rax)
> >   13:   d5 08 c7 00 12 00 00 00         {rex2} movq $0x12,(%rax)
> 
> Well, no, I don't really agree (especially not for the last one). But that goes
> back to the more general question on the printing of {rex2} without also
> indicating unconsumed bits.
> 

For the last one, rex2.w is consumed, but we still need to print {rex2} to distinguish it from rex, but now the problem is that we have different views on the specific information printed out, which comes back to the original question, Let’s wait for HJ for a few days. For the current patch I prefer to use REX2_SPECIAL, it needs this special handle.

> >>>>>>>> The new Rex2 attribute is not only wasteful (it can easily be a
> >>>>>>>> new enumerator used with OperandConstraint), but also
> misleading.
> >>>>>>>> We don't just need REX2 here, but we need it with REX2.W set.
> >>>>>>>> Even if from the tc-i386.c changes it looks as if that was
> >>>>>>>> happening implicitly (presumably due to the absence of
> >>>>>>>> NoRex64), naming still needs
> >>>>>> to properly reflect the purpose.
> >>>>>>>>
> >>>>>>>
> >>>>>>> You are right, rex2.w is set in process_suffix, since there is no
> NoRex64.
> >>>>>>>
> >>>>>>> How about Rex2W?
> >>>>>>> if (i.tm.opcode_modifier.rex2w)
> >>>>>>> {
> >>>>>>>     i.rex2_encoding = true;
> >>>>>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> >>>>>>> }
> >>>>>>>
> >>>>>>>
> >>>>>>> Or just add a special judgment?
> >>>>>>>
> >>>>>>> if (t->mnem_off == MN_pushp || t->mnem_off == MN_popp) {
> >>>>>>>     i.rex2_encoding = true;
> >>>>>>>     i.rex |= REX_W;   // add NoRex64 back, and set REX_W here.
> >>>>>>> }
> >>>>>>
> >>>>>> I'd prefer the latter over a new attribute (albeit once again
> >>>>>> with the comment actually matching code), and perhaps I view the
> >>>>>> latter equal to my earlier suggestion.
> >>>>>>
> >>>>>
> >>>>> Ok, put them in process_operands, there's already a special
> >>>>> handler here to
> >>>> change the prefix.
> >>>>>
> >>>>> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index
> >>>>> 0dde2a9ad44..2f2b1b04d10 100644
> >>>>> --- a/gas/config/tc-i386.c
> >>>>> +++ b/gas/config/tc-i386.c
> >>>>> @@ -8715,6 +8715,13 @@ process_operands (void)
> >>>>>        i.tm.operands++;
> >>>>>      }
> >>>>>
> >>>>> +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off ==
> >>>>> + MN_pushp || i.tm.mnem_off == MN_popp)
> >>>>> +    {
> >>>>> +      i.rex2_encoding = true;
> >>>>> +      i.rex |= REX_W;
> >>>>> +    }
> >>>>
> >>>> Well, I'm sorry for not considering JMPABS earlier on, but with
> >>>> that also needing dealing with, I think I view my alternative
> >>>> suggestion as
> >> preferable.
> >>>> That'll scale better when also considering that down the road
> >>>> further such insns may appear. Whether it's actually
> >>>> OperandConstraint that we leverage here is secondary (it's not
> >>>> ideal because there's nothing
> >> operand related here).
> >>>> I'd be perfectly okay with some other attribute being suitably
> >>>> overloaded, whereas I continue to think that introducing new
> >>>> attributes should preferably be limited to either cases where more
> >>>> than just two or three templates use them or cases where otherwise
> >>>> it's impossible to avoid ambiguities. From earlier changes of mine
> >>>> the underlying reason ought to be pretty clear: Each new attribute
> >>>> consumes storage, and with thousands of templates growth of storage
> >>>> requirements should be balanced with how frequently an attribute is
> >>>> actually going to have a non-zero value. For example, with
> >>>> is_evex_encoding() gone a brief inspection suggests that it might
> >>>> be possible to overload Masking (or maybe Broadcast): They're
> >>>> applicable to EVEX templates only, and the class of insns we're
> >>>> discussing here is never going to be EVEX (or VEX). IOW not much
> >>>> different from the overloading of StaticRounding. Such an overload
> >>>> may then well be named Rex2 (as you had it, and considering its
> >>>> intended use also for
> >> JMPABS, plus taking into consideration that REX2.W will be set simply
> >> because of the absence of NoRex64).
> >>>>
> >>>
> >>> Haha, StaticRounding is really special, I tried "#define Rex2Req
> >>> Masking" and
> >> found that it will be used in i386-gen.c to identify EVEX
> >> (Broadcast...), then I tried VexW and SIB found that they are all
> >> used without precheck whether it was an vex instruction. Finally I
> >> wanted to re-use StaticRounding and found out that hulin already uses it
> for legacy insns.
> >>
> >> Hmm, I'm sorry for the trouble. I'm inclined to say OperandConstraint
> >> with a new #define it is then. Once everything's in I could then
> >> still see whether I can
> >> (reasonably) make e.g. Masking work here.
> >>
> >
> > Then I will create a new attribute Rex2 for it.
> 
> ???
> 

I can't use "#define Rex2Req Masking" instead of creating a new bitfile for Rex2Req, then I have to create a new bitfile for Rex2Req , or you want to use i.tm.mnem_off to handle it? both are ok to me.

Lili.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Support APX PUSHP/POPP
  2023-11-29 10:38                   ` Cui, Lili
@ 2023-11-29 11:01                     ` Jan Beulich
  2023-11-29 13:02                       ` Cui, Lili
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2023-11-29 11:01 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Lu, Hongjiu, ccoutant, binutils

On 29.11.2023 11:38, Cui, Lili wrote:
>>>>>>> --- a/gas/config/tc-i386.c
>>>>>>> +++ b/gas/config/tc-i386.c
>>>>>>> @@ -8715,6 +8715,13 @@ process_operands (void)
>>>>>>>        i.tm.operands++;
>>>>>>>      }
>>>>>>>
>>>>>>> +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off ==
>>>>>>> + MN_pushp || i.tm.mnem_off == MN_popp)
>>>>>>> +    {
>>>>>>> +      i.rex2_encoding = true;
>>>>>>> +      i.rex |= REX_W;
>>>>>>> +    }
>>>>>>
>>>>>> Well, I'm sorry for not considering JMPABS earlier on, but with
>>>>>> that also needing dealing with, I think I view my alternative
>>>>>> suggestion as
>>>> preferable.
>>>>>> That'll scale better when also considering that down the road
>>>>>> further such insns may appear. Whether it's actually
>>>>>> OperandConstraint that we leverage here is secondary (it's not
>>>>>> ideal because there's nothing
>>>> operand related here).
>>>>>> I'd be perfectly okay with some other attribute being suitably
>>>>>> overloaded, whereas I continue to think that introducing new
>>>>>> attributes should preferably be limited to either cases where more
>>>>>> than just two or three templates use them or cases where otherwise
>>>>>> it's impossible to avoid ambiguities. From earlier changes of mine
>>>>>> the underlying reason ought to be pretty clear: Each new attribute
>>>>>> consumes storage, and with thousands of templates growth of storage
>>>>>> requirements should be balanced with how frequently an attribute is
>>>>>> actually going to have a non-zero value. For example, with
>>>>>> is_evex_encoding() gone a brief inspection suggests that it might
>>>>>> be possible to overload Masking (or maybe Broadcast): They're
>>>>>> applicable to EVEX templates only, and the class of insns we're
>>>>>> discussing here is never going to be EVEX (or VEX). IOW not much
>>>>>> different from the overloading of StaticRounding. Such an overload
>>>>>> may then well be named Rex2 (as you had it, and considering its
>>>>>> intended use also for
>>>> JMPABS, plus taking into consideration that REX2.W will be set simply
>>>> because of the absence of NoRex64).
>>>>>>
>>>>>
>>>>> Haha, StaticRounding is really special, I tried "#define Rex2Req
>>>>> Masking" and
>>>> found that it will be used in i386-gen.c to identify EVEX
>>>> (Broadcast...), then I tried VexW and SIB found that they are all
>>>> used without precheck whether it was an vex instruction. Finally I
>>>> wanted to re-use StaticRounding and found out that hulin already uses it
>> for legacy insns.
>>>>
>>>> Hmm, I'm sorry for the trouble. I'm inclined to say OperandConstraint
>>>> with a new #define it is then.

Did you miss this before ...

>>>> Once everything's in I could then
>>>> still see whether I can
>>>> (reasonably) make e.g. Masking work here.
>>>>
>>>
>>> Then I will create a new attribute Rex2 for it.

... saying this and ...

>> ???
>>
> 
> I can't use "#define Rex2Req Masking" instead of creating a new bitfile for Rex2Req, then I have to create a new bitfile for Rex2Req , or you want to use i.tm.mnem_off to handle it? both are ok to me.

... this? IOW something along the lines of

#define Rex2              OperandConstraint=REX2

(REX2 may be too short a name for the necessary new #define).

And as indicated I'd subsequently see about reusing Masking (or some
other of the EVEX-only attributes) anyway.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] Support APX PUSHP/POPP
  2023-11-29 11:01                     ` Jan Beulich
@ 2023-11-29 13:02                       ` Cui, Lili
  2023-11-30  9:02                         ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Cui, Lili @ 2023-11-29 13:02 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, ccoutant, binutils

> On 29.11.2023 11:38, Cui, Lili wrote:
> >>>>>>> --- a/gas/config/tc-i386.c
> >>>>>>> +++ b/gas/config/tc-i386.c
> >>>>>>> @@ -8715,6 +8715,13 @@ process_operands (void)
> >>>>>>>        i.tm.operands++;
> >>>>>>>      }
> >>>>>>>
> >>>>>>> +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off ==
> >>>>>>> + MN_pushp || i.tm.mnem_off == MN_popp)
> >>>>>>> +    {
> >>>>>>> +      i.rex2_encoding = true;
> >>>>>>> +      i.rex |= REX_W;
> >>>>>>> +    }
> >>>>>>
> >>>>>> Well, I'm sorry for not considering JMPABS earlier on, but with
> >>>>>> that also needing dealing with, I think I view my alternative
> >>>>>> suggestion as
> >>>> preferable.
> >>>>>> That'll scale better when also considering that down the road
> >>>>>> further such insns may appear. Whether it's actually
> >>>>>> OperandConstraint that we leverage here is secondary (it's not
> >>>>>> ideal because there's nothing
> >>>> operand related here).
> >>>>>> I'd be perfectly okay with some other attribute being suitably
> >>>>>> overloaded, whereas I continue to think that introducing new
> >>>>>> attributes should preferably be limited to either cases where
> >>>>>> more than just two or three templates use them or cases where
> >>>>>> otherwise it's impossible to avoid ambiguities. From earlier
> >>>>>> changes of mine the underlying reason ought to be pretty clear:
> >>>>>> Each new attribute consumes storage, and with thousands of
> >>>>>> templates growth of storage requirements should be balanced with
> >>>>>> how frequently an attribute is actually going to have a non-zero
> >>>>>> value. For example, with
> >>>>>> is_evex_encoding() gone a brief inspection suggests that it might
> >>>>>> be possible to overload Masking (or maybe Broadcast): They're
> >>>>>> applicable to EVEX templates only, and the class of insns we're
> >>>>>> discussing here is never going to be EVEX (or VEX). IOW not much
> >>>>>> different from the overloading of StaticRounding. Such an
> >>>>>> overload may then well be named Rex2 (as you had it, and
> >>>>>> considering its intended use also for
> >>>> JMPABS, plus taking into consideration that REX2.W will be set
> >>>> simply because of the absence of NoRex64).
> >>>>>>
> >>>>>
> >>>>> Haha, StaticRounding is really special, I tried "#define Rex2Req
> >>>>> Masking" and
> >>>> found that it will be used in i386-gen.c to identify EVEX
> >>>> (Broadcast...), then I tried VexW and SIB found that they are all
> >>>> used without precheck whether it was an vex instruction. Finally I
> >>>> wanted to re-use StaticRounding and found out that hulin already
> >>>> uses it
> >> for legacy insns.
> >>>>
> >>>> Hmm, I'm sorry for the trouble. I'm inclined to say
> >>>> OperandConstraint with a new #define it is then.
> 
> Did you miss this before ...
> 
> >>>> Once everything's in I could then
> >>>> still see whether I can
> >>>> (reasonably) make e.g. Masking work here.
> >>>>
> >>>
> >>> Then I will create a new attribute Rex2 for it.
> 
> ... saying this and ...
> 
> >> ???
> >>
> >
> > I can't use "#define Rex2Req Masking" instead of creating a new bitfile for
> Rex2Req, then I have to create a new bitfile for Rex2Req , or you want to use
> i.tm.mnem_off to handle it? both are ok to me.
> 
> ... this? IOW something along the lines of
> 
> #define Rex2              OperandConstraint=REX2
> 
> (REX2 may be too short a name for the necessary new #define).
> 
> And as indicated I'd subsequently see about reusing Masking (or some other
> of the EVEX-only attributes) anyway.
> 

Oh, I didn't understand OperandConstraint correctly. REX2 conflict with  "pseudopfx: rex2:REX2:APX_F", I changed it to REX2_PREFIX( or, REX_REQ ?).

#define Rex2              OperandConstraint=REX2_PREFIX

Thanks,
Lili.




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Support APX PUSHP/POPP
  2023-11-29 13:02                       ` Cui, Lili
@ 2023-11-30  9:02                         ` Jan Beulich
  2023-11-30 11:19                           ` Cui, Lili
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2023-11-30  9:02 UTC (permalink / raw)
  To: Cui, Lili; +Cc: Lu, Hongjiu, ccoutant, binutils

On 29.11.2023 14:02, Cui, Lili wrote:
>> On 29.11.2023 11:38, Cui, Lili wrote:
>>>>>>>>> --- a/gas/config/tc-i386.c
>>>>>>>>> +++ b/gas/config/tc-i386.c
>>>>>>>>> @@ -8715,6 +8715,13 @@ process_operands (void)
>>>>>>>>>        i.tm.operands++;
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off ==
>>>>>>>>> + MN_pushp || i.tm.mnem_off == MN_popp)
>>>>>>>>> +    {
>>>>>>>>> +      i.rex2_encoding = true;
>>>>>>>>> +      i.rex |= REX_W;
>>>>>>>>> +    }
>>>>>>>>
>>>>>>>> Well, I'm sorry for not considering JMPABS earlier on, but with
>>>>>>>> that also needing dealing with, I think I view my alternative
>>>>>>>> suggestion as
>>>>>> preferable.
>>>>>>>> That'll scale better when also considering that down the road
>>>>>>>> further such insns may appear. Whether it's actually
>>>>>>>> OperandConstraint that we leverage here is secondary (it's not
>>>>>>>> ideal because there's nothing
>>>>>> operand related here).
>>>>>>>> I'd be perfectly okay with some other attribute being suitably
>>>>>>>> overloaded, whereas I continue to think that introducing new
>>>>>>>> attributes should preferably be limited to either cases where
>>>>>>>> more than just two or three templates use them or cases where
>>>>>>>> otherwise it's impossible to avoid ambiguities. From earlier
>>>>>>>> changes of mine the underlying reason ought to be pretty clear:
>>>>>>>> Each new attribute consumes storage, and with thousands of
>>>>>>>> templates growth of storage requirements should be balanced with
>>>>>>>> how frequently an attribute is actually going to have a non-zero
>>>>>>>> value. For example, with
>>>>>>>> is_evex_encoding() gone a brief inspection suggests that it might
>>>>>>>> be possible to overload Masking (or maybe Broadcast): They're
>>>>>>>> applicable to EVEX templates only, and the class of insns we're
>>>>>>>> discussing here is never going to be EVEX (or VEX). IOW not much
>>>>>>>> different from the overloading of StaticRounding. Such an
>>>>>>>> overload may then well be named Rex2 (as you had it, and
>>>>>>>> considering its intended use also for
>>>>>> JMPABS, plus taking into consideration that REX2.W will be set
>>>>>> simply because of the absence of NoRex64).
>>>>>>>>
>>>>>>>
>>>>>>> Haha, StaticRounding is really special, I tried "#define Rex2Req
>>>>>>> Masking" and
>>>>>> found that it will be used in i386-gen.c to identify EVEX
>>>>>> (Broadcast...), then I tried VexW and SIB found that they are all
>>>>>> used without precheck whether it was an vex instruction. Finally I
>>>>>> wanted to re-use StaticRounding and found out that hulin already
>>>>>> uses it
>>>> for legacy insns.
>>>>>>
>>>>>> Hmm, I'm sorry for the trouble. I'm inclined to say
>>>>>> OperandConstraint with a new #define it is then.
>>
>> Did you miss this before ...
>>
>>>>>> Once everything's in I could then
>>>>>> still see whether I can
>>>>>> (reasonably) make e.g. Masking work here.
>>>>>>
>>>>>
>>>>> Then I will create a new attribute Rex2 for it.
>>
>> ... saying this and ...
>>
>>>> ???
>>>>
>>>
>>> I can't use "#define Rex2Req Masking" instead of creating a new bitfile for
>> Rex2Req, then I have to create a new bitfile for Rex2Req , or you want to use
>> i.tm.mnem_off to handle it? both are ok to me.
>>
>> ... this? IOW something along the lines of
>>
>> #define Rex2              OperandConstraint=REX2
>>
>> (REX2 may be too short a name for the necessary new #define).
>>
>> And as indicated I'd subsequently see about reusing Masking (or some other
>> of the EVEX-only attributes) anyway.
>>
> 
> Oh, I didn't understand OperandConstraint correctly. REX2 conflict with  "pseudopfx: rex2:REX2:APX_F", I changed it to REX2_PREFIX( or, REX_REQ ?).
> 
> #define Rex2              OperandConstraint=REX2_PREFIX

REX2_REQUIRED perhaps?

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] Support APX PUSHP/POPP
  2023-11-30  9:02                         ` Jan Beulich
@ 2023-11-30 11:19                           ` Cui, Lili
  0 siblings, 0 replies; 15+ messages in thread
From: Cui, Lili @ 2023-11-30 11:19 UTC (permalink / raw)
  To: Beulich, Jan; +Cc: Lu, Hongjiu, ccoutant, binutils

> On 29.11.2023 14:02, Cui, Lili wrote:
> >> On 29.11.2023 11:38, Cui, Lili wrote:
> >>>>>>>>> --- a/gas/config/tc-i386.c
> >>>>>>>>> +++ b/gas/config/tc-i386.c
> >>>>>>>>> @@ -8715,6 +8715,13 @@ process_operands (void)
> >>>>>>>>>        i.tm.operands++;
> >>>>>>>>>      }
> >>>>>>>>>
> >>>>>>>>> +  /* PUSHP/POPP requires rex2.w == 1.  */  if (i.tm.mnem_off
> >>>>>>>>> + == MN_pushp || i.tm.mnem_off == MN_popp)
> >>>>>>>>> +    {
> >>>>>>>>> +      i.rex2_encoding = true;
> >>>>>>>>> +      i.rex |= REX_W;
> >>>>>>>>> +    }
> >>>>>>>>
> >>>>>>>> Well, I'm sorry for not considering JMPABS earlier on, but with
> >>>>>>>> that also needing dealing with, I think I view my alternative
> >>>>>>>> suggestion as
> >>>>>> preferable.
> >>>>>>>> That'll scale better when also considering that down the road
> >>>>>>>> further such insns may appear. Whether it's actually
> >>>>>>>> OperandConstraint that we leverage here is secondary (it's not
> >>>>>>>> ideal because there's nothing
> >>>>>> operand related here).
> >>>>>>>> I'd be perfectly okay with some other attribute being suitably
> >>>>>>>> overloaded, whereas I continue to think that introducing new
> >>>>>>>> attributes should preferably be limited to either cases where
> >>>>>>>> more than just two or three templates use them or cases where
> >>>>>>>> otherwise it's impossible to avoid ambiguities. From earlier
> >>>>>>>> changes of mine the underlying reason ought to be pretty clear:
> >>>>>>>> Each new attribute consumes storage, and with thousands of
> >>>>>>>> templates growth of storage requirements should be balanced
> >>>>>>>> with how frequently an attribute is actually going to have a
> >>>>>>>> non-zero value. For example, with
> >>>>>>>> is_evex_encoding() gone a brief inspection suggests that it
> >>>>>>>> might be possible to overload Masking (or maybe Broadcast):
> >>>>>>>> They're applicable to EVEX templates only, and the class of
> >>>>>>>> insns we're discussing here is never going to be EVEX (or VEX).
> >>>>>>>> IOW not much different from the overloading of StaticRounding.
> >>>>>>>> Such an overload may then well be named Rex2 (as you had it,
> >>>>>>>> and considering its intended use also for
> >>>>>> JMPABS, plus taking into consideration that REX2.W will be set
> >>>>>> simply because of the absence of NoRex64).
> >>>>>>>>
> >>>>>>>
> >>>>>>> Haha, StaticRounding is really special, I tried "#define Rex2Req
> >>>>>>> Masking" and
> >>>>>> found that it will be used in i386-gen.c to identify EVEX
> >>>>>> (Broadcast...), then I tried VexW and SIB found that they are all
> >>>>>> used without precheck whether it was an vex instruction. Finally
> >>>>>> I wanted to re-use StaticRounding and found out that hulin
> >>>>>> already uses it
> >>>> for legacy insns.
> >>>>>>
> >>>>>> Hmm, I'm sorry for the trouble. I'm inclined to say
> >>>>>> OperandConstraint with a new #define it is then.
> >>
> >> Did you miss this before ...
> >>
> >>>>>> Once everything's in I could then still see whether I can
> >>>>>> (reasonably) make e.g. Masking work here.
> >>>>>>
> >>>>>
> >>>>> Then I will create a new attribute Rex2 for it.
> >>
> >> ... saying this and ...
> >>
> >>>> ???
> >>>>
> >>>
> >>> I can't use "#define Rex2Req Masking" instead of creating a new
> >>> bitfile for
> >> Rex2Req, then I have to create a new bitfile for Rex2Req , or you
> >> want to use i.tm.mnem_off to handle it? both are ok to me.
> >>
> >> ... this? IOW something along the lines of
> >>
> >> #define Rex2              OperandConstraint=REX2
> >>
> >> (REX2 may be too short a name for the necessary new #define).
> >>
> >> And as indicated I'd subsequently see about reusing Masking (or some
> >> other of the EVEX-only attributes) anyway.
> >>
> >
> > Oh, I didn't understand OperandConstraint correctly. REX2 conflict with
> "pseudopfx: rex2:REX2:APX_F", I changed it to REX2_PREFIX( or, REX_REQ ?).
> >
> > #define Rex2              OperandConstraint=REX2_PREFIX
> 
> REX2_REQUIRED perhaps?
> 

Ok

Thanks,
Lili.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-11-30 11:19 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-27 12:31 [PATCH] Support APX PUSHP/POPP Cui, Lili
2023-11-27 12:56 ` Jan Beulich
2023-11-27 13:45   ` Cui, Lili
2023-11-27 14:06     ` Jan Beulich
2023-11-28  2:32       ` Cui, Lili
2023-11-28  8:34         ` Jan Beulich
2023-11-28 13:14           ` Cui, Lili
2023-11-28 13:54             ` Jan Beulich
2023-11-29  3:08               ` Cui, Lili
2023-11-29  8:29                 ` Jan Beulich
2023-11-29 10:38                   ` Cui, Lili
2023-11-29 11:01                     ` Jan Beulich
2023-11-29 13:02                       ` Cui, Lili
2023-11-30  9:02                         ` Jan Beulich
2023-11-30 11:19                           ` Cui, Lili

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).