Re: [PATCH V2 3/8] Support APX GPR32 with extend evex prefix

public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed

From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: Re: [PATCH V2 3/8] Support APX GPR32 with extend evex prefix
Date: Tue, 7 Nov 2023 15:53:43 +0100	[thread overview]
Message-ID: <4aebadde-7cba-050e-d2eb-188d6216c566@suse.com> (raw)
In-Reply-To: <SJ0PR11MB56006942F1BEB2EF7BBF292C9EA5A@SJ0PR11MB5600.namprd11.prod.outlook.com>

On 03.11.2023 17:50, Cui, Lili wrote:
> --- a/gas/testsuite/gas/i386/x86-64-inval-movbe.l
> +++ b/gas/testsuite/gas/i386/x86-64-inval-movbe.l
> @@ -1,29 +1,30 @@
>  .*: Assembler messages:
> -.*:4: Error: .*
>  .*:5: Error: .*
>  .*:6: Error: .*
>  .*:7: Error: .*
>  .*:8: Error: .*
> -.*:11: Error: .*
> +.*:9: Error: .*
>  .*:12: Error: .*
>  .*:13: Error: .*
>  .*:14: Error: .*
>  .*:15: Error: .*
> +.*:16: Error: .*
>  GAS LISTING .*
>  
>  
>  [ 	]*1[ 	]+\# Check illegal movbe in 64bit mode\.
>  [ 	]*2[ 	]+\.text
> -[ 	]*3[ 	]+foo:
> -[ 	]*4[ 	]+movbe	\(%rcx\),%bl
> -[ 	]*5[ 	]+movbe	%ecx,%ebx
> -[ 	]*6[ 	]+movbe	%bx,%rcx
> -[ 	]*7[ 	]+movbe	%rbx,%rcx
> -[ 	]*8[ 	]+movbe	%bl,\(%rcx\)
> -[ 	]*9[ 	]+
> -[ 	]*10[ 	]+\.intel_syntax noprefix
> -[ 	]*11[ 	]+movbe bl, byte ptr \[rcx\]
> -[ 	]*12[ 	]+movbe ebx, ecx
> -[ 	]*13[ 	]+movbe rcx, bx
> -[ 	]*14[ 	]+movbe rcx, rbx
> -[ 	]*15[ 	]+movbe byte ptr \[rcx\], bl
> +[ 	]*3[ 	]+\.arch \.noapx_f
> +[ 	]*4[ 	]+foo:
> +[ 	]*5[ 	]+movbe	\(%rcx\),%bl
> +[ 	]*6[ 	]+movbe	%ecx,%ebx
> +[ 	]*7[ 	]+movbe	%bx,%rcx
> +[ 	]*8[ 	]+movbe	%rbx,%rcx
> +[ 	]*9[ 	]+movbe	%bl,\(%rcx\)
> +[ 	]*10[ 	]+
> +[ 	]*11[ 	]+\.intel_syntax noprefix
> +[ 	]*12[ 	]+movbe bl, byte ptr \[rcx\]
> +[ 	]*13[ 	]+movbe ebx, ecx
> +[ 	]*14[ 	]+movbe rcx, bx
> +[ 	]*15[ 	]+movbe rcx, rbx
> +[ 	]*16[ 	]+movbe byte ptr \[rcx\], bl

To avoid the need to fiddle with this file, did you consider changing
the test's command line options instead? In any event ...

> --- a/gas/testsuite/gas/i386/x86-64-inval-movbe.s
> +++ b/gas/testsuite/gas/i386/x86-64-inval-movbe.s
> @@ -1,5 +1,6 @@
>  # Check illegal movbe in 64bit mode.
>  	.text
> +	.arch .noapx_f
>  foo:
>  	movbe	(%rcx),%bl
>  	movbe	%ecx,%ebx

... the comment here wants expanding, to (briefly) mention the deliberate
exclusion of APX.

> --- a/opcodes/i386-dis-evex-len.h
> +++ b/opcodes/i386-dis-evex-len.h
> @@ -62,6 +62,16 @@ static const struct dis386 evex_len_table[][3] = {
>      { REG_TABLE (REG_EVEX_0F38C7_L_2) },
>    },
>  
> +  /* EVEX_LEN_0F38F2 */
> +  {
> +    { "andnS",		{ Gdq, VexGdq, Edq }, 0 },
> +  },

There's no sign of a prefix decode step here.

> --- a/opcodes/i386-dis-evex-mod.h
> +++ b/opcodes/i386-dis-evex-mod.h
> @@ -1 +1,43 @@
>  /* Nothing at present.  */

This comment needs to go away when new stuff is added here. However, I'm
sure I requested before that no new entries be put here which have only
one of their two slots populated. The reg-only and mem-only aspects can
be expressed via proper choice of operand specifiers, at least in almost
all cases. Note how you already use ...

> +  /* MOD_EVEX_MAP4_DA_PREFIX_1 */
> +  {
> +    { Bad_Opcode },
> +    { "encodekey128", { Gd, Ed }, 0 },
> +  },
> +  /* MOD_EVEX_MAP4_DB_PREFIX_1 */
> +  {
> +    { Bad_Opcode },
> +    { "encodekey256", { Gd, Ed }, 0 },
> +  },
> +  /* MOD_EVEX_MAP4_DC_PREFIX_1 */
> +  {
> +    { "aesenc128kl",    { XM, M }, 0 },
> +  },
> +  /* MOD_EVEX_MAP4_DD_PREFIX_1 */
> +  {
> +    { "aesdec128kl",    { XM, M }, 0 },
> +  },
> +  /* MOD_EVEX_MAP4_DE_PREFIX_1 */
> +  {
> +    { "aesenc256kl",    { XM, M }, 0 },
> +  },
> +  /* MOD_EVEX_MAP4_DF_PREFIX_1 */
> +  {
> +    { "aesdec256kl",    { XM, M }, 0 },
> +  },
> +  /* MOD_EVEX_MAP4_F8_PREFIX_1 */
> +  {
> +    { "enqcmds",	{ Gva, M },  0 },
> +  },
> +  /* MOD_EVEX_MAP4_F8_PREFIX_2 */
> +  {
> +    { "movdir64b",	{ Gva, M }, 0 },
> +  },
> +  /* MOD_EVEX_MAP4_F8_PREFIX_3 */
> +  {
> +    { "enqcmd",		{ Gva, M }, 0 },
> +  },

... M in many entries anyway. These can move one level up without needing
further adjustment.

For all of the above, however, an EVEX.W decode step looks to be missing.
Interestingly the doc consistently omits the (presumably) .W0 suffix for
these, having merely a trailing dot there. The issue (doc and/or code) is
present elsewhere as well.

> +  /* MOD_EVEX_MAP4_F9 */
> +  {
> +    { "movdiri",	{ Edq, Gdq }, 0 },
> +  },

Missing PREFIX_OPCODE?

> --- a/opcodes/i386-dis-evex-prefix.h
> +++ b/opcodes/i386-dis-evex-prefix.h
> @@ -338,6 +338,75 @@
>      { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
>      { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
>    },
> +  /* PREFIX_EVEX_MAP4_66 */
> +  {
> +    { "wrssK",	{ M, Gdq }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_D8 */
> +  {
> +    { "sha1nexte", { XM, EXxmm }, 0 },
> +    { REG_TABLE (REG_EVEX_MAP4_D8_PREFIX_1) },
> +  },
> +  /* PREFIX_EVEX_MAP4_DA */
> +  {
> +    { "sha1msg2", { XM, EXxmm }, 0 },
> +    { MOD_TABLE (MOD_EVEX_MAP4_DA_PREFIX_1) },
> +  },
> +  /* PREFIX_EVEX_MAP4_DB */
> +  {
> +    { "sha256rnds2", { XM, EXxmm, XMM0 }, 0 },
> +    { MOD_TABLE (MOD_EVEX_MAP4_DB_PREFIX_1) },
> +  },
> +  /* PREFIX_EVEX_MAP4_DC */
> +  {
> +    { "sha256msg1", { XM, EXxmm }, 0 },
> +    { MOD_TABLE (MOD_EVEX_MAP4_DC_PREFIX_1) },
> +  },
> +  /* PREFIX_EVEX_MAP4_DD */
> +  {
> +    { "sha256msg2", { XM, EXxmm }, 0 },
> +    { MOD_TABLE (MOD_EVEX_MAP4_DD_PREFIX_1) },
> +  },
> +  /* PREFIX_EVEX_MAP4_DE */
> +  {
> +    { Bad_Opcode },
> +    { MOD_TABLE (MOD_EVEX_MAP4_DE_PREFIX_1) },
> +  },
> +  /* PREFIX_EVEX_MAP4_DF */
> +  {
> +    { Bad_Opcode },
> +    { MOD_TABLE (MOD_EVEX_MAP4_DF_PREFIX_1) },
> +  },
> +  /* PREFIX_EVEX_MAP4_F0 */
> +  {
> +    { "crc32A",	{ Gdq, Eb }, 0 },
> +    { "invept",	{ Gm, Mo }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_F1 */
> +  {
> +    { "crc32Q",	{ Gdq, Ev }, 0 },
> +    { "invvpid", { Gm, Mo }, 0 },
> +    { "crc32Q",	{ Gdq, Ev }, 0 },
> +  },
> +  /* PREFIX_EVEX_MAP4_F2 */
> +  {
> +    { Bad_Opcode },
> +    { "invpcid", { Gm, M }, 0 },
> +  },

As opposed to the entries further up, INV* are indeed WIG, so validly
don't decode EVEX.W.

> +  /* PREFIX_EVEX_MAP4_F8 */
> +  {
> +    { Bad_Opcode },
> +    { MOD_TABLE (MOD_EVEX_MAP4_F8_PREFIX_1) },
> +    { MOD_TABLE (MOD_EVEX_MAP4_F8_PREFIX_2) },
> +    { MOD_TABLE (MOD_EVEX_MAP4_F8_PREFIX_3) },
> +  },
> +  /* PREFIX_EVEX_MAP4_FC */
> +  {
> +    { "aadd",	{ Mdq, Gdq }, 0 },
> +    { "axor",	{ Mdq, Gdq }, 0 },
> +    { "aand",	{ Mdq, Gdq }, 0 },
> +    { "aor",	{ Mdq, Gdq }, 0 },
> +  },

This looks to match the PREFIX_0F38FC entry.

> --- a/opcodes/i386-dis-evex-reg.h
> +++ b/opcodes/i386-dis-evex-reg.h
> @@ -49,3 +49,17 @@
>      { "vscatterpf0qp%XW",  { MVexVSIBQWpX }, PREFIX_DATA },
>      { "vscatterpf1qp%XW",  { MVexVSIBQWpX }, PREFIX_DATA },
>    },
> +  /* REG_EVEX_0F38F3_L_0 */
> +  {
> +    { Bad_Opcode },
> +    { "blsrS",		{ VexGdq, Edq }, 0 },
> +    { "blsmskS",	{ VexGdq, Edq }, 0 },
> +    { "blsiS",		{ VexGdq, Edq }, 0 },
> +  },

Compared to the REG_VEX_0F38F3_L_0 entry this lacks PREFIX_OPCODE, but
then the question is why you do not re-use that entry in the first
place.

> +  /* REG_EVEX_MAP4_D8_PREFIX_1 */
> +  {
> +    { "aesencwide128kl",	{ M }, 0 },
> +    { "aesdecwide128kl",	{ M }, 0 },
> +    { "aesencwide256kl",	{ M }, 0 },
> +    { "aesdecwide256kl",	{ M }, 0 },
> +  },

How is this different from the REG_0F38D8_PREFIX_1 entry?

> --- /dev/null
> +++ b/opcodes/i386-dis-evex-x86-64.h
> @@ -0,0 +1,140 @@
> +  /* X86_64_EVEX_0F90 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_LEN_TABLE (VEX_LEN_0F90) },
> +  },
> +  /* X86_64_EVEX_0F91 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_LEN_TABLE (VEX_LEN_0F91) },
> +  },
> +  /* X86_64_EVEX_0F92 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_LEN_TABLE (VEX_LEN_0F92) },
> +  },
> +  /* X86_64_EVEX_0F93 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_LEN_TABLE (VEX_LEN_0F93) },
> +  },
> +  /* X86_64_EVEX_0F3849 */
> +  {
> +    { Bad_Opcode },
> +    { VEX_LEN_TABLE (VEX_LEN_0F3849_X86_64) },
> +  },
> +  /* X86_64_EVEX_0F384B */
> +  {
> +    { Bad_Opcode },
> +    { VEX_LEN_TABLE (VEX_LEN_0F384B_X86_64) },
> +  },
> +  /* X86_64_EVEX_0F38E0 */
> +  {
> +    { Bad_Opcode },
> +    { "cmpoxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
> +  },

This and its sibling entries look to again fully match X86_64_VEX_0F38E<n>.

> --- a/opcodes/i386-dis-evex.h
> +++ b/opcodes/i386-dis-evex.h
>[...]
> @@ -1113,19 +1113,19 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "sha1rnds4", { XM, EXxmm, Ib }, 0 },

PREFIX_OPCODE? EVEX.W?

>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* D8 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { PREFIX_TABLE (PREFIX_EVEX_MAP4_D8) },
> +    { "sha1msg1", { XM, EXxmm }, 0 },

Again.

> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
>[...]
> @@ -4522,12 +4594,13 @@ static const struct dis386 x86_64_table[][2] = {
>      { "cmpnlexadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
>    },
>  
> +#include "i386-dis-evex-x86-64.h"
> +
>    /* X86_64_VEX_MAP7_F8_L_0_W_0_R_0 */
>    {
>      { Bad_Opcode },
>      { PREFIX_TABLE (PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64) },
>    },
> -
>  };

Please can EVEX entries come after VEX ones?

> @@ -8979,9 +9055,13 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>        if (!fetch_code (ins->info, ins->codep + 4))
>  	return &err_opcode;
>        /* The first byte after 0x62.  */
> +      if (*ins->codep & 0x8)
> +	ins->rex2 |= REX_B;
> +      if (!(*ins->codep & 0x10))
> +	ins->rex2 |= REX_R;

In patch 1 you check whether REX2 bits are properly consumed. While
in principle I agree with re-using ->rex2 here, don't you need to take
precautions to not print a random {rex2} (or whichever way it was
expressed) when one of these bits ends up not contributing to any
operand?

> @@ -8994,12 +9074,19 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>  	case 0x3:
>  	  vex_table_index = EVEX_0F3A;
>  	  break;
> +	case 0x4:
> +	  vex_table_index = EVEX_MAP4;
> +	  ins->evex_type = evex_from_legacy;
> +	  break;
>  	case 0x5:
>  	  vex_table_index = EVEX_MAP5;
>  	  break;
>  	case 0x6:
>  	  vex_table_index = EVEX_MAP6;
>  	  break;
> +	case 0x7:
> +	  vex_table_index = EVEX_MAP7;
> +	  break;
>  	}

How is map7 coming into play here?

> @@ -9042,12 +9128,31 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>  
>        if (ins->address_mode != mode_64bit)
>  	{
> +	  if (ins->evex_type != evex_default
> +	      || (ins->rex2 & (REX_B | REX_X)))
> +	    return &bad_opcode;
>  	  /* In 16/32-bit mode silently ignore following bits.  */
>  	  ins->rex &= ~REX_B;
> -	  ins->vex.r = true;
> +	  ins->rex2 &= ~REX_R;
>  	}
>  
>        ins->need_vex = 4;
> +
> +      /* EVEX from legacy instructions require that EVEX.L'L, EVEX.z and the
> +	 lower 2 bits of EVEX.aaa must be 0.
> +	 EVEX from evex instrucions require that EVEX.L'L and the lower 2 bits of
> +	 EVEX.aaa must be 0.  */

EVEX from VEX you mean? Also, what about EVEX.z for those? Really the sole
difference between the two forms is EVEX.nd. The doc mentions EVEX.l for
the from-VEX case, but afaics there's no insn actually having that bit set
(which also matches what you say above).

> +      if (ins->evex_type == evex_from_legacy || ins->evex_type == evex_from_vex)
> +	{
> +	  if ((*ins->codep & 0x3) != 0
> +	      || (*ins->codep >> 6 & 0x3) != 0
> +	      || (ins->evex_type == evex_from_legacy
> +		  && (*ins->codep >> 5 & 0x1) != 0)

Please can the shifts be parenthesized agaist the &-s? Albeit - this is
perhaps easier to read as just &-s anyway.

> +	      || (ins->evex_type == evex_from_vex
> +		  && !ins->vex.b))

This doesn't look to match any part of the comment.

> @@ -9640,6 +9752,19 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>        if (ins.last_repnz_prefix >= 0)
>  	ins.all_prefixes[ins.last_repnz_prefix] = 0xf2;
>        break;
> +
> +    case PREFIX_NP_OR_DATA:
> +      if (ins.vex.prefix & ~DATA_PREFIX_OPCODE)
> +	{
> +	  i386_dis_printf (info, dis_style_text, "(bad)");
> +	  ret = ins.end_codep - priv.the_buffer;
> +	  goto out;
> +	}
> +      break;
> +
> +    default:
> +      break;
> +
>      }

Why suddenly a default case? And why an extra blank line below it?

> @@ -11242,7 +11367,7 @@ print_register (instr_info *ins, unsigned int reg, unsigned int rexmask,
>      case b_swap_mode:
>        if (reg & 4)
>  	USED_REX (0);
> -      if (ins->rex)
> +      if (ins->rex || ins->rex2)
>  	names = att_names8rex;
>        else
>  	names = att_names8;

Isn't this already needed in the REX2 patch?

> @@ -11460,7 +11585,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  
>    add += (ins->rex2 & REX_B) ? 16 : 0;
>  
> -  if (ins->vex.evex)
> +  if (ins->vex.evex && ins->evex_type == evex_default)
>      {
>  
>        /* Zeroing-masking is invalid for memory destinations. Set the flag
> @@ -11608,6 +11733,13 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  		abort ();
>  	      if (ins->vex.evex)
>  		{
> +		  /* S/G EVEX insns require REX_X not to be set.  */
> +		  if (ins->rex2 & REX_X)

REX_X in a comment can only mean REX.X. You mean EVEX.X4 though, I think
(which internally is latched into the REX_X bit of ->rex2), and that also
needs to be set, not clear.

Jan

next prev parent reply	other threads:[~2023-11-07 14:53 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-03 16:50 Cui, Lili
2023-11-06 17:07 ` Jan Beulich
2023-11-13  5:53   ` Cui, Lili
2023-11-13  8:34     ` Jan Beulich
2023-11-14  3:12       ` Cui, Lili
2023-11-14 10:29         ` Jan Beulich
2023-11-15  8:39           ` Cui, Lili
2023-11-07 13:29 ` Jan Beulich
2023-11-09  8:38   ` Cui, Lili
2023-11-09 11:07     ` Jan Beulich
2023-11-09 11:12     ` Jan Beulich
2023-11-07 14:53 ` Jan Beulich [this message]
2023-11-09 12:31   ` Cui, Lili
2023-11-09 13:05     ` Jan Beulich
2023-11-09 14:57       ` Cui, Lili
2023-11-09 15:39         ` Jan Beulich
2023-11-14  7:42   ` Cui, Lili
2023-11-14 10:40     ` Jan Beulich
2023-11-14 14:46       ` Cui, Lili
2023-11-15  6:03   ` Cui, Lili
2023-11-15  9:11     ` Jan Beulich
2023-11-15 11:43       ` Cui, Lili
2023-11-16 13:57         ` Jan Beulich
2023-11-16 15:10           ` Cui, Lili
2023-11-16 15:15             ` Jan Beulich
2023-11-16 16:12           ` Cui, Lili

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4aebadde-7cba-050e-d2eb-188d6216c566@suse.com \
    --to=jbeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hongjiu.lu@intel.com \
    --cc=lili.cui@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).