public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: hongjiu.lu@intel.com, binutils@sourceware.org
Subject: Re: [PATCH v4 1/9] Support APX GPR32 with rex2 prefix
Date: Fri, 22 Dec 2023 14:08:01 +0100	[thread overview]
Message-ID: <ec294971-631c-4fa8-8635-a096f8316fe7@suse.com> (raw)
In-Reply-To: <20231219121218.974012-2-lili.cui@intel.com>

On 19.12.2023 13:12, Cui, Lili wrote:
> @@ -5283,6 +5321,9 @@ md_assemble (char *line)
>  	case unsupported_syntax:
>  	  err_msg = _("unsupported syntax");
>  	  break;
> +	case unsupported_EGPR_for_addressing:
> +	  err_msg = _("extended GPR cannot be used as base/index");
> +	  break;

While this one's now suitable for the as_bad() below the switch, ...

> @@ -5336,6 +5377,9 @@ md_assemble (char *line)
>  	case invalid_dest_and_src_register_set:
>  	  err_msg = _("destination and source registers must be distinct");
>  	  break;
> +	case invalid_pseudo_prefix:
> +	  err_msg = _("rex2 pseudo prefix cannot be used here");
> +	  break;

... this one still doesn't really fit the "... for `<insn>'" there. At
least the "here" needs dropping.

> @@ -5630,11 +5681,12 @@ md_assemble (char *line)
>  	  && (i.op[1].regs->reg_flags & RegRex64) != 0)
>        || (((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte)
>  	   || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte))
> -	  && i.rex != 0))
> +	  && (i.rex != 0 || i.rex2 != 0)))
>      {
>        int x;
>  
> -      i.rex |= REX_OPCODE;
> +      if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm))
> +	i.rex |= REX_OPCODE;
>        for (x = 0; x < 2; x++)
>  	{
>  	  /* Look for 8 bit operand that uses old registers.  */
> @@ -5645,7 +5697,7 @@ md_assemble (char *line)
>  	      /* In case it is "hi" register, give up.  */
>  	      if (i.op[x].regs->reg_num > 3)
>  		as_bad (_("can't encode register '%s%s' in an "
> -			  "instruction requiring REX prefix."),
> +			  "instruction requiring REX/REX2 prefix."),
>  			register_prefix, i.op[x].regs->reg_name);
>  
>  	      /* Otherwise it is equivalent to the extended register.
> @@ -5657,11 +5709,11 @@ md_assemble (char *line)
>  	}
>      }
>  
> -  if (i.rex == 0 && i.rex_encoding)
> +  if (i.rex == 0 && i.rex2 == 0 && (i.rex_encoding || i.rex2_encoding))
>      {
>        /* Check if we can add a REX_OPCODE byte.  Look for 8 bit operand
>  	 that uses legacy register.  If it is "hi" register, don't add
> -	 the REX_OPCODE byte.  */
> +	 rex and rex2 prefix.  */
>        int x;
>        for (x = 0; x < 2; x++)
>  	if (i.types[x].bitfield.class == Reg
> @@ -5671,6 +5723,7 @@ md_assemble (char *line)
>  	  {
>  	    gas_assert (!(i.op[x].regs->reg_flags & RegRex));
>  	    i.rex_encoding = false;
> +	    i.rex2_encoding = false;
>  	    break;
>  	  }
>  
> @@ -5678,7 +5731,13 @@ md_assemble (char *line)
>  	i.rex = REX_OPCODE;
>      }
>  
> -  if (i.rex != 0)
> +  if (is_apx_rex2_encoding ())
> +    {
> +      build_rex2_prefix ();
> +      /* The individual REX.RXBW bits got consumed.  */
> +      i.rex &= REX_OPCODE;
> +    }
> +  else if (i.rex != 0)
>      add_prefix (REX_OPCODE | i.rex);
>  
>    insert_lfence_before (last_insn);

All of this will need re-basing over "x86: properly respect rex/{rex}",
with the result (I hope) that .insn will then also be covered REX2-wise.

> @@ -5752,6 +5811,20 @@ parse_insn (const char *line, char *mnemonic, bool prefix_only)
>  	    goto too_long;
>  	  *mnem_p = '\0';
>  
> +	  /* Point l at the closing brace if there's no other separator.  */
> +	  if (*l != END_OF_INSN && !is_space_char (*l)
> +	      && *l != PREFIX_SEPARATOR)
> +	    --l;
> +	}
> +      /* Skip the immediate 0x** of {rex2 0x00} prefix.  */
> +      else if (*mnemonic == '{'&& is_space_char (*l))

Nit: Missing blank.

> +	{
> +	  while ( *l != '}')

Nit: Stray blank.

> +	    ++l;

What if there's no '}' on the line?

> +	  *mnem_p++ = *l++;
> +	  if (mnem_p >= mnemonic + MAX_MNEM_SIZE)
> +	    goto too_long;
> +	  *mnem_p = '\0';

You skip everything, not just 0xNN. You also skip stuff after other pseudo
prefixes, if I'm not mistaken. That's both too lax. However, skipping
isn't an option here anyway. Either we actually respect what the user has
written, or we error out. Skipping therefore is an option only if the
provided expression (not just plain number!) evaluates to 0. I don't
understand anyway why this code was added: When I asked about the specific
plans, H.J. clearly said the form with a constant would be a disassembler-
only thing for now.

> @@ -7005,6 +7082,43 @@ VEX_check_encoding (const insn_template *t)
>    return 0;
>  }
>  
> +/* Check if Egprs operands are valid for the instruction.  */
> +
> +static int
> +check_EgprOperands (const insn_template *t)

Hmm, I thought I had asked before to make functions with boolean return
values have a return type of bool, and then use "true" for success. An
alternative would be to return the error indicator, rather than putting
it in i.error here.

Then again I realize this is in line with VEX_check_encoding() and
check_VecOperands() (which I think would better be changed, but anyway).

> @@ -7149,6 +7263,14 @@ match_template (char mnem_suffix)
>  	      continue;
>  	    }
>  
> +	  /* Check if pseudo prefix {rex2} is valid.  */
> +	  if (t->opcode_modifier.noegpr && i.rex2_encoding)
> +	    {
> +	      i.error = invalid_pseudo_prefix;

What is this needed for? I.e. why can't you pass the value ...

> +	      specific_error = progress (i.error);

... in here directly (as is done elsewhere as well)?

> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.s
> @@ -0,0 +1,86 @@
> +# Check 64bit instructions with rex2 prefix encoding
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +         test	$0x7, %r24b
> +         test	$0x7, %r24d
> +         test	$0x7, %r24
> +         test	$0x7, %r24w
> +## REX2.M bit
> +         imull	%eax, %r15d
> +         imull	%eax, %r16d
> +         punpckldq (%r18), %mm2
> +## REX2.R4 bit
> +         leal	(%rax), %r16d
> +         leal	(%rax), %r17d
> +         leal	(%rax), %r18d
> +         leal	(%rax), %r19d
> +         leal	(%rax), %r20d
> +         leal	(%rax), %r21d
> +         leal	(%rax), %r22d
> +         leal	(%rax), %r23d
> +         leal	(%rax), %r24d
> +         leal	(%rax), %r25d
> +         leal	(%rax), %r26d
> +         leal	(%rax), %r27d
> +         leal	(%rax), %r28d
> +         leal	(%rax), %r29d
> +         leal	(%rax), %r30d
> +         leal	(%rax), %r31d
> +## REX2.X4 bit
> +         leal	(,%r16), %eax
> +         leal	(,%r17), %eax
> +         leal	(,%r18), %eax
> +         leal	(,%r19), %eax
> +         leal	(,%r20), %eax
> +         leal	(,%r21), %eax
> +         leal	(,%r22), %eax
> +         leal	(,%r23), %eax
> +         leal	(,%r24), %eax
> +         leal	(,%r25), %eax
> +         leal	(,%r26), %eax
> +         leal	(,%r27), %eax
> +         leal	(,%r28), %eax
> +         leal	(,%r29), %eax
> +         leal	(,%r30), %eax
> +         leal	(,%r31), %eax
> +## REX2.B4 bit
> +         leal	(%r16), %eax
> +         leal	(%r17), %eax
> +         leal	(%r18), %eax
> +         leal	(%r19), %eax
> +         leal	(%r20), %eax
> +         leal	(%r21), %eax
> +         leal	(%r22), %eax
> +         leal	(%r23), %eax
> +         leal	(%r24), %eax
> +         leal	(%r25), %eax
> +         leal	(%r26), %eax
> +         leal	(%r27), %eax
> +         leal	(%r28), %eax
> +         leal	(%r29), %eax
> +         leal	(%r30), %eax
> +         leal	(%r31), %eax
> +## REX2.W bit
> +         leaq	(%rax), %r15
> +         leaq	(%rax), %r16
> +         leaq	(%r15), %rax
> +         leaq	(%r16), %rax
> +         leaq	(,%r15), %rax
> +         leaq	(,%r16), %rax
> +## REX2.R3 bit
> +         add    (%r16), %r8
> +         add    (%r16), %r15
> +## REX2.X3 bit
> +         mov    (,%r9), %r16
> +         mov    (,%r14), %r16
> +## REX2.B3 bit
> +	 sub   (%r10), %r31
> +	 sub   (%r13), %r31

Nit: There's still an indentation anomaly here.

> --- a/gas/testsuite/gas/i386/x86-64-inval-pseudo.s
> +++ b/gas/testsuite/gas/i386/x86-64-inval-pseudo.s
> @@ -1,4 +1,8 @@
>  	.text
>  	{disp16} movb (%ebp),%al
>  	{disp16} movb (%rbp),%al
> +
> +	/* Instruction not support APX.  */
> +	{rex2} xsave (%r15, %rbx)
> +	{rex2} xsave64 (%r15, %rbx)
>  	.p2align 4,0

Aren't these dealt with (in a more complete fashion) in x86-64-pseudos-bad.s?

> --- a/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
> +++ b/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
> @@ -5,3 +5,77 @@ pseudos:
>  	{rex} vmovaps %xmm7,%xmm2
>  	{rex} vmovaps %xmm17,%xmm2
>  	{rex} rorx $7,%eax,%ebx
> +	{rex2} vmovaps %xmm7,%xmm2
> +	{rex2} xsave (%rax)
> +	{rex2} xsaves (%ecx)
> +	{rex2} xsaves64 (%ecx)
> +	{rex2} xsavec (%ecx)
> +	{rex2} xrstors (%ecx)
> +	{rex2} xrstors64 (%ecx)

Here.

> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -144,6 +144,12 @@ struct instr_info
>    /* Bits of REX we've already used.  */
>    uint8_t rex_used;
>  
> +  /* Record W R4 X4 B4 bits for rex2.  */
> +  unsigned char rex2;
> +  /* Bits of REX2 we've already used.  */
> +  unsigned char rex2_used;

When you say REX2, one ought to be permitted to imply you mean the
prefix, not the struct field. That's ambiguous here, though - bit
positions used match those in rex2, not those in the REX2 payload.
IOW either you use lower case to make more obvious that you mean the
other struct field, or you say e.g. "REX2 prefix bits we've already
used." Albeit that would still be imprecise, as other REX2 prefix
bits' use is recorded in rex_used.

> @@ -265,8 +272,13 @@ struct dis_private {
>    {							\
>      if (value)						\
>        {							\
> -	if ((ins->rex & value))				\
> +	if (ins->rex & value)				\
>  	  ins->rex_used |= (value) | REX_OPCODE;	\

Like is done here, ...

> +	if (ins->rex2 & value)				\
> +	  {						\
> +	    ins->rex2_used |= value;			\

other uses of "value" also want parenthesizing, unless not used with
any kind of operator (e.g. in the if() above).

> @@ -276,6 +288,9 @@ struct dis_private {
>  #define EVEX_b_used 1
>  #define EVEX_len_used 2
>  
> +/* M0 in rex2 prefix represents map0 or map1.  */
> +#define REX2_M 0x8

Extending an earlier comment: This really should go next to REX_W and
friends. In principle the assembler could want to use this constant as
well, hence why it would better go the opcode/i386.h anyway.

> @@ -4196,19 +4221,19 @@ static const struct dis386 x86_64_table[][2] = {
>  
>    /* X86_64_E8 */
>    {
> -    { "callP",		{ Jv, BND }, 0 },
> -    { "call@",		{ Jv, BND }, 0 }
> +    { "callP",		{ Jv, BND }, PREFIX_REX2_ILLEGAL },

This, ...

> +    { "call@",		{ Jv, BND }, PREFIX_REX2_ILLEGAL }
>    },
>  
>    /* X86_64_E9 */
>    {
> -    { "jmpP",		{ Jv, BND }, 0 },
> -    { "jmp@",		{ Jv, BND }, 0 }
> +    { "jmpP",		{ Jv, BND }, PREFIX_REX2_ILLEGAL },

... this, and ...

> +    { "jmp@",		{ Jv, BND }, PREFIX_REX2_ILLEGAL }
>    },
>  
>    /* X86_64_EA */
>    {
> -    { "{l|}jmp{P|}", { Ap }, 0 },
> +    { "{l|}jmp{P|}", { Ap }, PREFIX_REX2_ILLEGAL },
>    },

... this change isn't really needed, is it? The marker is only needed
on 64-bit insns (i.e. respective slots 2 of x86_64_table[] entries).

Jan

  reply	other threads:[~2023-12-22 13:08 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-19 12:12 [PATCH v4 0/9] Support Intel APX EGPR Cui, Lili
2023-12-19 12:12 ` [PATCH v4 1/9] Support APX GPR32 with rex2 prefix Cui, Lili
2023-12-22 13:08   ` Jan Beulich [this message]
2023-12-25  6:14     ` Cui, Lili
2024-01-04  8:57       ` Jan Beulich
2023-12-19 12:12 ` [PATCH v4 2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions Cui, Lili
2023-12-19 12:12 ` [PATCH v4 3/9] Support APX GPR32 with extend evex prefix Cui, Lili
2023-12-22 13:49   ` Jan Beulich
2023-12-25 12:23     ` Cui, Lili
2024-01-04  9:08       ` Jan Beulich
2024-01-04 12:32         ` Cui, Lili
2024-01-04 12:55           ` Jan Beulich
2023-12-22 14:19   ` Jan Beulich
2023-12-26  7:00     ` Cui, Lili
2024-01-04  9:01       ` Jan Beulich
2024-01-04 12:47         ` Cui, Lili
2023-12-19 12:12 ` [PATCH v4 4/9] Add tests for " Cui, Lili
2023-12-22 14:41   ` Jan Beulich
2023-12-25 13:40     ` Cui, Lili
2024-01-04  9:16       ` Jan Beulich
2024-01-05  6:58         ` Cui, Lili
2023-12-19 12:12 ` [PATCH v4 5/9] Support APX NDD Cui, Lili
2023-12-19 12:12 ` [PATCH v4 6/9] Support APX Push2/Pop2 Cui, Lili
2023-12-19 12:12 ` [PATCH v4 7/9] Support APX PUSHP/POPP Cui, Lili
2023-12-19 12:12 ` [PATCH v4 `8/9] Support APX NDD optimized encoding Cui, Lili
2023-12-19 12:12 ` [PATCH v4 9/9] Support APX JMPABS for disassembler Cui, Lili
2023-12-19 12:35 ` [PATCH v4 0/9] Support Intel APX EGPR Jan Beulich
2023-12-20  8:50   ` Cui, Lili
2023-12-20  8:57     ` Jan Beulich
2023-12-20 10:42       ` Cui, Lili
2023-12-20 11:00         ` Jan Beulich
2023-12-20 11:50           ` Cui, Lili
2023-12-20 12:01             ` Jan Beulich
2023-12-20 12:16               ` Cui, Lili

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ec294971-631c-4fa8-8635-a096f8316fe7@suse.com \
    --to=jbeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hongjiu.lu@intel.com \
    --cc=lili.cui@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).