From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH V2 3/8] Support APX GPR32 with extend evex prefix
Date: Mon, 6 Nov 2023 18:07:12 +0100 [thread overview]
Message-ID: <2bdf64b9-f0fb-4d16-0a82-9edb36fa409f@suse.com> (raw)
In-Reply-To: <SJ0PR11MB56006942F1BEB2EF7BBF292C9EA5A@SJ0PR11MB5600.namprd11.prod.outlook.com>
On 03.11.2023 17:50, Cui, Lili wrote:
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -3672,9 +3672,10 @@ install_template (const insn_template *t)
> /* Dual VEX/EVEX templates need stripping one of the possible variants. */
> if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> {
> - if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> - || maybe_cpu (t, CpuFMA))
> - && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
> + if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> + || maybe_cpu (t, CpuFMA) || maybe_cpu (t, CpuAMX_TILE))
> + && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)
> + || maybe_cpu (t, CpuAPX_F)))
Something's odd with formatting (indentation) here. I first thought I
had screwed up in my patch, but looking back things appear to be
correct there.
> @@ -3689,12 +3690,11 @@ install_template (const insn_template *t)
> i.tm.cpu.bitfield.cpuavx = 1;
> else
> {
> - gas_assert (!i.tm.cpu.bitfield.isa);
> i.tm.cpu.bitfield.isa = i.tm.cpu_any.bitfield.isa;
You may not remove the assertion here, or else the assignment could
silently overwrite a value. Can you explain why you did the removal?
> @@ -3885,6 +3885,14 @@ is_any_vex_encoding (const insn_template *t)
> return t->opcode_modifier.vex || is_evex_encoding (t);
> }
>
> +static INLINE bool
> +is_any_apx_evex_encoding (void)
> +{
> + return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
> + || (i.vex.register_specifier
> + && i.vex.register_specifier->reg_flags & RegRex2);
> +}
The use of i.rex2 here doesn't fit the name; the sole user has first
checked that no legacy encoding is going to be used, and that's a
prereq here. Such a prereq needs spelling out, such that one can be
easily aware when possibly adding another caller.
Also, what does "any" stand for in the name here ...
> static INLINE bool
> is_any_apx_rex2_encoding (void)
... and also the one here? In is_any_vex_encoding() it refers to
VEX/XOP/EVEX.
> @@ -4161,6 +4169,27 @@ build_rex2_prefix (void)
> | (i.rex2 << 4) | i.rex);
> }
>
> +/* Build the EVEX prefix (4-byte) for evex insn
> + | 62h |
> + | `R`X`B`R' | B'mmm |
> + | W | v`v`v`v | `x' | pp |
> + | z| L'L | b | `v | aaa |
> +*/
> +static void
> +build_apx_evex_prefix (void)
> +{
> + build_evex_prefix ();
> + if (i.rex2 & REX_R)
> + i.vex.bytes[1] &= 0xef;
I think this would read easier as ~0x10 (similarly below then). I also
think ...
> + if (i.vex.register_specifier
> + && register_number (i.vex.register_specifier) > 0xf)
> + i.vex.bytes[3] &= 0xf7;
> + if (i.rex2 & REX_B)
> + i.vex.bytes[1] |= 0x08;
... it would help if the byte 1 updates were kept together (the compiler
may even produce better code then), and maybe ...
> + if (i.rex2 & REX_X)
> + i.vex.bytes[2] &= 0xfb;
... the byte 2 update ahead of the byte 3 one.
Also why do you use register_number() here, and not the RegRex2 flag?
> @@ -5624,19 +5653,42 @@ md_assemble (char *line)
> }
>
> /* Check for explicit REX2 prefix. */
> - if (i.rex2 || i.rex2_encoding)
> + if (i.rex2_encoding)
This change ...
> {
> as_bad (_("REX2 prefix invalid with `%s'"), insn_name (&i.tm));
... further invalidates this message. Yet iirc you said you removed the
i.rex2 part of the check anyway from patch 1.
> return;
> }
>
> - if (i.tm.opcode_modifier.vex)
> + if (is_any_apx_evex_encoding ())
> + {
> + if (i.tm.opcode_space == SPACE_EVEXMAP4 && (i.prefix[DATA_PREFIX] != 0))
> + {
> + i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
Perhaps better assert that no other embedded prefix was already recorded
here?
> + i.prefix[DATA_PREFIX] = 0;
> + }
> +
> + build_apx_evex_prefix ();
> +
> + /* Encode the NDD bit of the instruction promoted from the legacy
> + space. */
> + if (i.vex.register_specifier && i.tm.opcode_space == SPACE_EVEXMAP4)
> + i.vex.bytes[3] |= 0x10;
Why the restriction to map 3? And why is this not part of
build_apx_evex_prefix()?
> + /* Encode the NF bit of the instruction promoted from legacy and vex
> + space. */
> + if (i.has_nf)
> + i.vex.bytes[3] |= 0x04;
This wants to move to the patch actually introducing NF handling.
> @@ -5663,16 +5715,17 @@ md_assemble (char *line)
> instruction already has a prefix, we need to convert old
> registers to new ones. */
>
> - if ((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte
> - && (i.op[0].regs->reg_flags & RegRex64) != 0)
> - || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte
> - && (i.op[1].regs->reg_flags & RegRex64) != 0)
> - || (((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte)
> - || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte))
> - && (i.rex != 0 || i.rex2 != 0)))
> + if (!is_any_vex_encoding (&i.tm)
> + && ((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte
> + && (i.op[0].regs->reg_flags & RegRex64) != 0)
> + || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte
> + && (i.op[1].regs->reg_flags & RegRex64) != 0)
> + || (((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte)
> + || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte))
> + && (i.rex != 0 || i.rex2 != 0))))
I'm a little puzzled by this (hard to read) adjustment:
is_any_vex_encoding() basically excludes most new templates added here,
yet for those permitting Reg8 you still need to enforce the EVEX-imposed
restriction.
> @@ -7043,7 +7096,7 @@ VEX_check_encoding (const insn_template *t)
> static int
> check_EgprOperands (const insn_template *t)
> {
> - if (t->opcode_modifier.noegpr)
> + if (t->opcode_modifier.noegpr && !need_evex_encoding())
> {
> for (unsigned int op = 0; op < i.operands; op++)
> {
What is this change about?
> @@ -14252,6 +14306,9 @@ static bool check_register (const reg_entry *r)
>
> if (r->reg_flags & RegRex2)
> {
> + if (is_evex_encoding (current_templates->start))
> + i.vec_encoding = vex_encoding_evex;
What if the APX template isn't first in the group?
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -975,6 +975,7 @@ typedef struct insn_template
> 1: 0F opcode prefix / space.
> 2: 0F38 opcode prefix / space.
> 3: 0F3A opcode prefix / space.
> + 4: EVEXMAP4 opcode prefix / space.
> 5: EVEXMAP5 opcode prefix / space.
> 6: EVEXMAP6 opcode prefix / space.
> 7: VEXMAP7 opcode prefix / space.
> @@ -986,6 +987,7 @@ typedef struct insn_template
> #define SPACE_0F 1
> #define SPACE_0F38 2
> #define SPACE_0F3A 3
> +#define SPACE_EVEXMAP4 4
> #define SPACE_EVEXMAP5 5
> #define SPACE_EVEXMAP6 6
> #define SPACE_VEXMAP7 7
Nit: Please can padding here match surrounding code?
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -109,6 +109,7 @@
> #define SpaceXOP09 OpcodeSpace=SPACE_XOP09
> #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
>
> +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4
> #define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
> #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
>
> @@ -138,6 +139,8 @@
> #define Vsz256 Vsz=VSZ256
> #define Vsz512 Vsz=VSZ512
>
> +#define APX_F_64 APX_F&x64
What's wrong with
#define APX_F APX_F&x64
? Oh, I see - this wouldn't build with the uses in the AMX-TILE insns.
Yet as said elsewhere, it may be better if we make the 64-bit-mode-
only-ISAs' dependencies (not just for APX) implicit as far as the
opcode table goes.
> @@ -338,6 +342,7 @@ adc, 0x14, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
> adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
>
> neg, 0xf6/3, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> +
> not, 0xf6/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
>
> aaa, 0x37, No64, NoSuf, {}
Nit: In an already overlarge patch it is extremely helpful if any
unrelated changed could be omitted.
> @@ -1316,13 +1321,16 @@ getsec, 0xf37, SMX, NoSuf, {}
>
> invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
> invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invept, 0xf3f0, APX_F_64&EPT, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
> invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
> invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invvpid, 0xf3f1, APX_F_64&EPT, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
>
> // INVPCID instruction
>
> invpcid, 0x660f3882, INVPCID&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
> invpcid, 0x660f3882, INVPCID&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invpcid, 0xf3f2, APX_F_64&INVPCID, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
While ordering doesn't matter functionality wise, overall readability
imo would end up improved if you had EPT&APX_F and INVPCID&APX_f here
(and the likewise elsewhere, albeit it looks in other cases you already
have things the other way round).
> @@ -3310,6 +3364,7 @@ prefetchit1, 0xf18/6, PREFETCHI&x64, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex
> // CMPCCXADD instructions.
>
> cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD&x64, Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> +cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD&x64&APX_F_64, Modrm|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
Nit: Redundant x64.
Jan
next prev parent reply other threads:[~2023-11-06 17:07 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-03 16:50 Cui, Lili
2023-11-06 17:07 ` Jan Beulich [this message]
2023-11-13 5:53 ` Cui, Lili
2023-11-13 8:34 ` Jan Beulich
2023-11-14 3:12 ` Cui, Lili
2023-11-14 10:29 ` Jan Beulich
2023-11-15 8:39 ` Cui, Lili
2023-11-07 13:29 ` Jan Beulich
2023-11-09 8:38 ` Cui, Lili
2023-11-09 11:07 ` Jan Beulich
2023-11-09 11:12 ` Jan Beulich
2023-11-07 14:53 ` Jan Beulich
2023-11-09 12:31 ` Cui, Lili
2023-11-09 13:05 ` Jan Beulich
2023-11-09 14:57 ` Cui, Lili
2023-11-09 15:39 ` Jan Beulich
2023-11-14 7:42 ` Cui, Lili
2023-11-14 10:40 ` Jan Beulich
2023-11-14 14:46 ` Cui, Lili
2023-11-15 6:03 ` Cui, Lili
2023-11-15 9:11 ` Jan Beulich
2023-11-15 11:43 ` Cui, Lili
2023-11-16 13:57 ` Jan Beulich
2023-11-16 15:10 ` Cui, Lili
2023-11-16 15:15 ` Jan Beulich
2023-11-16 16:12 ` Cui, Lili
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2bdf64b9-f0fb-4d16-0a82-9edb36fa409f@suse.com \
--to=jbeulich@suse.com \
--cc=binutils@sourceware.org \
--cc=hongjiu.lu@intel.com \
--cc=lili.cui@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).