Re: [PATCH V2 3/8] Support APX GPR32 with extend evex prefix

public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed

From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH V2 3/8] Support APX GPR32 with extend evex prefix
Date: Mon, 6 Nov 2023 18:07:12 +0100	[thread overview]
Message-ID: <2bdf64b9-f0fb-4d16-0a82-9edb36fa409f@suse.com> (raw)
In-Reply-To: <SJ0PR11MB56006942F1BEB2EF7BBF292C9EA5A@SJ0PR11MB5600.namprd11.prod.outlook.com>

On 03.11.2023 17:50, Cui, Lili wrote:
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -3672,9 +3672,10 @@ install_template (const insn_template *t)
>    /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
>    if (t->opcode_modifier.vex && t->opcode_modifier.evex)
>    {
> -      if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> -	   || maybe_cpu (t, CpuFMA))
> -	  && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
> +    if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
> +	 || maybe_cpu (t, CpuFMA) ||  maybe_cpu (t, CpuAMX_TILE))
> +	&& (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)
> +	    || maybe_cpu (t, CpuAPX_F)))

Something's odd with formatting (indentation) here. I first thought I
had screwed up in my patch, but looking back things appear to be
correct there.

> @@ -3689,12 +3690,11 @@ install_template (const insn_template *t)
>  		i.tm.cpu.bitfield.cpuavx = 1;
>  	      else
>  		{
> -		  gas_assert (!i.tm.cpu.bitfield.isa);
>  		  i.tm.cpu.bitfield.isa = i.tm.cpu_any.bitfield.isa;

You may not remove the assertion here, or else the assignment could
silently overwrite a value. Can you explain why you did the removal?

> @@ -3885,6 +3885,14 @@ is_any_vex_encoding (const insn_template *t)
>    return t->opcode_modifier.vex || is_evex_encoding (t);
>  }
>  
> +static INLINE bool
> +is_any_apx_evex_encoding (void)
> +{
> +  return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4 
> +    || (i.vex.register_specifier
> +	&& i.vex.register_specifier->reg_flags & RegRex2);
> +}

The use of i.rex2 here doesn't fit the name; the sole user has first
checked that no legacy encoding is going to be used, and that's a
prereq here. Such a prereq needs spelling out, such that one can be
easily aware when possibly adding another caller.

Also, what does "any" stand for in the name here ...

>  static INLINE bool
>  is_any_apx_rex2_encoding (void)

... and also the one here? In is_any_vex_encoding() it refers to
VEX/XOP/EVEX.

> @@ -4161,6 +4169,27 @@ build_rex2_prefix (void)
>  		    | (i.rex2 << 4) | i.rex);
>  }
>  
> +/* Build the EVEX prefix (4-byte) for evex insn
> +   | 62h |
> +   | `R`X`B`R' | B'mmm |
> +   | W | v`v`v`v | `x' | pp |
> +   | z| L'L | b | `v | aaa |
> +*/
> +static void
> +build_apx_evex_prefix (void)
> +{
> +  build_evex_prefix ();
> +  if (i.rex2 & REX_R)
> +    i.vex.bytes[1] &= 0xef;

I think this would read easier as ~0x10 (similarly below then). I also
think ...

> +  if (i.vex.register_specifier
> +      && register_number (i.vex.register_specifier) > 0xf)
> +    i.vex.bytes[3] &= 0xf7;
> +  if (i.rex2 & REX_B)
> +    i.vex.bytes[1] |= 0x08;

... it would help if the byte 1 updates were kept together (the compiler
may even produce better code then), and maybe ...

> +  if (i.rex2 & REX_X)
> +    i.vex.bytes[2] &= 0xfb;

... the byte 2 update ahead of the byte 3 one.

Also why do you use register_number() here, and not the RegRex2 flag?

> @@ -5624,19 +5653,42 @@ md_assemble (char *line)
>  	}
>  
>        /* Check for explicit REX2 prefix.  */
> -      if (i.rex2 || i.rex2_encoding)
> +      if (i.rex2_encoding)

This change ...

>  	{
>  	  as_bad (_("REX2 prefix invalid with `%s'"), insn_name (&i.tm));

... further invalidates this message. Yet iirc you said you removed the
i.rex2 part of the check anyway from patch 1.

>  	  return;
>  	}
>  
> -      if (i.tm.opcode_modifier.vex)
> +      if (is_any_apx_evex_encoding ())
> +	{
> +	  if (i.tm.opcode_space == SPACE_EVEXMAP4 && (i.prefix[DATA_PREFIX] != 0))
> +	    {
> +	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;

Perhaps better assert that no other embedded prefix was already recorded
here?

> +	      i.prefix[DATA_PREFIX] = 0;
> +	    }
> +
> +	  build_apx_evex_prefix ();
> +
> +	  /* Encode the NDD bit of the instruction promoted from the legacy
> +	     space.  */
> +	  if (i.vex.register_specifier && i.tm.opcode_space == SPACE_EVEXMAP4)
> +	    i.vex.bytes[3] |= 0x10;

Why the restriction to map 3? And why is this not part of
build_apx_evex_prefix()?

> +	  /* Encode the NF bit of the instruction promoted from legacy and vex
> +	     space.  */
> +	  if (i.has_nf)
> +	    i.vex.bytes[3] |= 0x04;

This wants to move to the patch actually introducing NF handling.

> @@ -5663,16 +5715,17 @@ md_assemble (char *line)
>       instruction already has a prefix, we need to convert old
>       registers to new ones.  */
>  
> -  if ((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte
> -       && (i.op[0].regs->reg_flags & RegRex64) != 0)
> -      || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte
> -	  && (i.op[1].regs->reg_flags & RegRex64) != 0)
> -      || (((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte)
> -	   || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte))
> -	  && (i.rex != 0 || i.rex2 != 0)))
> +  if (!is_any_vex_encoding (&i.tm)
> +      && ((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte
> +	   && (i.op[0].regs->reg_flags & RegRex64) != 0)
> +	  || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte
> +	      && (i.op[1].regs->reg_flags & RegRex64) != 0)
> +	  || (((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte)
> +	       || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte))
> +	      && (i.rex != 0 || i.rex2 != 0))))

I'm a little puzzled by this (hard to read) adjustment:
is_any_vex_encoding() basically excludes most new templates added here,
yet for those permitting Reg8 you still need to enforce the EVEX-imposed
restriction.

> @@ -7043,7 +7096,7 @@ VEX_check_encoding (const insn_template *t)
>  static int
>  check_EgprOperands (const insn_template *t)
>  {
> -  if (t->opcode_modifier.noegpr)
> +  if (t->opcode_modifier.noegpr && !need_evex_encoding())
>      {
>        for (unsigned int op = 0; op < i.operands; op++)
>  	{

What is this change about?

> @@ -14252,6 +14306,9 @@ static bool check_register (const reg_entry *r)
>  
>    if (r->reg_flags & RegRex2)
>      {
> +      if (is_evex_encoding (current_templates->start))
> +	i.vec_encoding = vex_encoding_evex;

What if the APX template isn't first in the group?

> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -975,6 +975,7 @@ typedef struct insn_template
>       1: 0F opcode prefix / space.
>       2: 0F38 opcode prefix / space.
>       3: 0F3A opcode prefix / space.
> +     4: EVEXMAP4 opcode prefix / space.
>       5: EVEXMAP5 opcode prefix / space.
>       6: EVEXMAP6 opcode prefix / space.
>       7: VEXMAP7 opcode prefix / space.
> @@ -986,6 +987,7 @@ typedef struct insn_template
>  #define SPACE_0F	1
>  #define SPACE_0F38	2
>  #define SPACE_0F3A	3
> +#define SPACE_EVEXMAP4  4
>  #define SPACE_EVEXMAP5	5
>  #define SPACE_EVEXMAP6	6
>  #define SPACE_VEXMAP7	7

Nit: Please can padding here match surrounding code?

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -109,6 +109,7 @@
>  #define SpaceXOP09 OpcodeSpace=SPACE_XOP09
>  #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A
>  
> +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4
>  #define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5
>  #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6
>  
> @@ -138,6 +139,8 @@
>  #define Vsz256 Vsz=VSZ256
>  #define Vsz512 Vsz=VSZ512
>  
> +#define APX_F_64 APX_F&x64

What's wrong with

#define APX_F APX_F&x64

? Oh, I see - this wouldn't build with the uses in the AMX-TILE insns.

Yet as said elsewhere, it may be better if we make the 64-bit-mode-
only-ISAs' dependencies (not just for APX) implicit as far as the
opcode table goes.

> @@ -338,6 +342,7 @@ adc, 0x14, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
>  adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
>  
>  neg, 0xf6/3, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> +
>  not, 0xf6/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
>  
>  aaa, 0x37, No64, NoSuf, {}

Nit: In an already overlarge patch it is extremely helpful if any
unrelated changed could be omitted.

> @@ -1316,13 +1321,16 @@ getsec, 0xf37, SMX, NoSuf, {}
>  
>  invept, 0x660f3880, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
>  invept, 0x660f3880, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invept, 0xf3f0, APX_F_64&EPT, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
>  invvpid, 0x660f3881, EPT&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
>  invvpid, 0x660f3881, EPT&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invvpid, 0xf3f1, APX_F_64&EPT, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }
>  
>  // INVPCID instruction
>  
>  invpcid, 0x660f3882, INVPCID&No64, Modrm|IgnoreSize|NoSuf, { Oword|Unspecified|BaseIndex, Reg32 }
>  invpcid, 0x660f3882, INVPCID&x64, Modrm|NoSuf|NoRex64, { Oword|Unspecified|BaseIndex, Reg64 }
> +invpcid, 0xf3f2, APX_F_64&INVPCID, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecified|BaseIndex, Reg64 }

While ordering doesn't matter functionality wise, overall readability
imo would end up improved if you had EPT&APX_F and INVPCID&APX_f here
(and the likewise elsewhere, albeit it looks in other cases you already
have things the other way round).

> @@ -3310,6 +3364,7 @@ prefetchit1, 0xf18/6, PREFETCHI&x64, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex
>  // CMPCCXADD instructions.
>  
>  cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD&x64, Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
> +cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD&x64&APX_F_64, Modrm|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }

Nit: Redundant x64.

Jan

next prev parent reply	other threads:[~2023-11-06 17:07 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-03 16:50 Cui, Lili
2023-11-06 17:07 ` Jan Beulich [this message]
2023-11-13  5:53   ` Cui, Lili
2023-11-13  8:34     ` Jan Beulich
2023-11-14  3:12       ` Cui, Lili
2023-11-14 10:29         ` Jan Beulich
2023-11-15  8:39           ` Cui, Lili
2023-11-07 13:29 ` Jan Beulich
2023-11-09  8:38   ` Cui, Lili
2023-11-09 11:07     ` Jan Beulich
2023-11-09 11:12     ` Jan Beulich
2023-11-07 14:53 ` Jan Beulich
2023-11-09 12:31   ` Cui, Lili
2023-11-09 13:05     ` Jan Beulich
2023-11-09 14:57       ` Cui, Lili
2023-11-09 15:39         ` Jan Beulich
2023-11-14  7:42   ` Cui, Lili
2023-11-14 10:40     ` Jan Beulich
2023-11-14 14:46       ` Cui, Lili
2023-11-15  6:03   ` Cui, Lili
2023-11-15  9:11     ` Jan Beulich
2023-11-15 11:43       ` Cui, Lili
2023-11-16 13:57         ` Jan Beulich
2023-11-16 15:10           ` Cui, Lili
2023-11-16 15:15             ` Jan Beulich
2023-11-16 16:12           ` Cui, Lili

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2bdf64b9-f0fb-4d16-0a82-9edb36fa409f@suse.com \
    --to=jbeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hongjiu.lu@intel.com \
    --cc=lili.cui@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).