Re: [PATCH 6/8] Support APX Push2/Pop2

public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed

From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: hongjiu.lu@intel.com, "Mo, Zewei" <zewei.mo@intel.com>,
	binutils@sourceware.org
Subject: Re: [PATCH 6/8] Support APX Push2/Pop2
Date: Thu, 28 Sep 2023 13:37:38 +0200	[thread overview]
Message-ID: <153e1263-bbc9-1af6-a11e-58489153d95f@suse.com> (raw)
In-Reply-To: <20230919152527.497773-7-lili.cui@intel.com>

On 19.09.2023 17:25, Cui, Lili wrote:
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -5667,6 +5667,22 @@ md_assemble (char *line)
>        i.rex &= REX_OPCODE;
>      }
>  
> +  if (i.tm.opcode_modifier.push2pop2)
> +    {
> +      i.imm_operands = 0;

Why?

> +      unsigned int reg1 = register_number (i.op[0].regs);
> +      unsigned int reg2 = register_number (i.op[1].regs);
> +
> +      /* Push2/Pop2 cannot use RSP and Pop2 cannot pop two same registers.  */
> +      if (reg1 == 0x4 || reg2 == 0x4)
> +	as_bad (_("%s for `%s'"), _("invalid register operand"),
> +		insn_name (current_templates->start));
> +
> +      if ((i.tm.mnem_off == MN_pop2 || i.tm.mnem_off == MN_pop2p) && reg1 == reg2)

This would be easier with a single opcode check on the lhs of the &&.

> +	as_bad (_("%s for `%s'"), _("invalid register operand"),
> +		insn_name (current_templates->start));
> +    }

Both error messages want to be more specific, such that it's clear
what exactly is wrong. Also a string literal for %s is slightly odd,
and may pose issues to translators.

Furthermore there are pre-existing insns with restrictions on register
operands. I wonder whether these new checks wouldn't better be put
close to those.

> @@ -7100,7 +7116,11 @@ optimize_NDD_to_nonNDD (const insn_template *t)
>        && t->opcode_space == SPACE_EVEXMAP4
>        && i.reg_operands >= 2
>        && (i.types[i.operands - 1].bitfield.dword
> -	  || i.types[i.operands - 1].bitfield.qword))
> +	  || i.types[i.operands - 1].bitfield.qword)
> +      && (t->mnem_off != MN_pop2
> +	  && t->mnem_off != MN_pop2p
> +	  && t->mnem_off != MN_push2
> +	  && t->mnem_off != MN_push2p))

If an explicit check is needed here, why not use the push2pop2 attribute?

> @@ -8912,7 +8932,13 @@ build_modrm_byte (void)
>        dest = ~0;

This already sets dest ...

>      }
>    gas_assert (source < dest);
> -  if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
> +  if (i.tm.opcode_modifier.push2pop2)
> +    {
> +      v = 1;
> +      dest = (unsigned int) ~0;

... to the intended value. Furthermore, doesn't ...

> +      source = 0;
> +    }
> +  else if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
>        && source != op)
>      {
>        unsigned int tmp = source;

... this logic already take care of the wanted swapping, provided the
templates get SwapSources added? Hmm, odd - you already have that
attribute on the two POP2 templates, but not the two PUSH2 ones. Yet both
use identical operand (encoding) order.

> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-decode-inval.d
> @@ -0,0 +1,29 @@
> +#as: --64
> +#objdump: -dw
> +#name: illegal decoding of APX-push2pop2 insns

Decoding cannot be illegal. Either you mean encoding, or you mean
"decoding of illegal APX-push2pop2 insn forms".

> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-decode-inval.s
> @@ -0,0 +1,19 @@
> +# Check illegal bytecode of APX-Push2Pop2 instructions
> +# pop2 %rax, %rbx

What's illegal here? From ...

> +# pop2 %rax, %rsp
> +# push2 %rsp, %r17
> +# pop2 %r12, %r12
> +# pop2 %r31, %r31
> +
> +	.allow_index_reg
> +	.text
> +popnd0:
> +	.byte 0x62,0xF4,0x64,0x08,0x8F,0xC0

... the label name I guess EVEX.nd=0, but that needs saying. (As mentioned
earlier, the comments want moving next to the code emission anyway, and
.byte wants avoiding it at all possible.)

> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
> @@ -0,0 +1,13 @@
> +# Check illegal APX-Push2Pop2 instructions
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +	push2 %eax, %ebx
> +	pop2 %rax, %rsp
> +	push2 %rsp, %r17
> +	pop2 %r12, %r12
> +	push2p %eax, %ebx
> +	pop2p %rax, %rsp
> +	push2p %rsp, %r17
> +	pop2p %r12, %r12

Please make sure that at least one of the forms uses %rsp once as first
and once as second operand for both push and pop.

> --- a/opcodes/i386-dis-evex-x86.h
> +++ b/opcodes/i386-dis-evex-x86.h
> @@ -138,3 +138,13 @@
>      { Bad_Opcode },
>      { VEX_LEN_TABLE (VEX_LEN_0F3AF0) },
>    },
> +  /* X86_64_EVEX_MAP4_8F*/

Nit: Please make well-formed comments (...

> +  {
> +    { Bad_Opcode },
> +    { EVEX_LEN_TABLE (EVEX_LEN_MAP4_8F_X86_64) },
> +  },
> +  /* X86_64_EVEX_MAP4_FF_R_6*/

... i.e. also here and possibly elsewhere).

> @@ -1341,6 +1354,9 @@ enum
>    X86_64_EVEX_0F38F6,
>    X86_64_EVEX_0F38F7,
>    X86_64_EVEX_0F3AF0,
> +
> +  X86_64_EVEX_MAP4_8F,
> +  X86_64_EVEX_MAP4_FF_R_6,
>  };

This might indicate a problem in patch 4: Why is the x86-64 decode step
entirely missing there? Or, if correct there, why is it needed here? For
push and pop here I'd expect decode order and hence enumerators to be
entirely consistent.

> @@ -1537,7 +1553,10 @@ enum
>    EVEX_LEN_0F3A39,
>    EVEX_LEN_0F3A3A,
>    EVEX_LEN_0F3A3B,
> -  EVEX_LEN_0F3A43
> +  EVEX_LEN_0F3A43,
> +
> +  EVEX_LEN_MAP4_8F_X86_64,
> +  EVEX_LEN_MAP4_FF_R_6_X86_64,
>  };

Prior changes didn't find it necessary to handle EVEX.l through a table
lookup - why is this needed here? Map4 has uniform requirements, and an
earlier patch added respective checking, iirc.

> @@ -8757,10 +8779,24 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>        dp = &prefix_table[dp->op[1].bytemode][vindex];
>        break;
>  
> +    case USE_X86_64_EVEX_PUSH2_TABLE:
> +    case USE_X86_64_EVEX_POP2_TABLE:
> +	ins->evex_type = evex_push2_pop2;
> +      unsigned int vvvv_reg = ins->vex.register_specifier
> +			      | !ins->vex.v << 4;
> +      unsigned int rm_reg = ins->modrm.rm + (ins->rex & REX_B ? 8 : 0)
> +			    + (ins->rex2 & REX_B ? 16 : 0);
> +      if (!ins->vex.b || vvvv_reg == 0x4 || rm_reg == 0x4
> +	  || (dp->op[0].bytemode == USE_X86_64_EVEX_POP2_TABLE
> +	      && vvvv_reg == rm_reg))
> +	  return &bad_opcode;
> +      goto use_x86_64_table;

I don't think this is the way to handle such restrictions. Since this
likely can't very well be handled in existing operand handlers, so far
the approach was to introduce new ...Fixup() ones. That'll also
produce more helpful output, e.g. "push2 (bad)", rather than ".byte ..."
with no hint at approximately what insn this was. New USE_..._TABLE
should imo really only appear when truly new tables are introduced.

>      case USE_X86_64_EVEX_FROM_VEX_TABLE:
>        ins->evex_type = evex_from_vex;
>        /* Fall through.  */
>      case USE_X86_64_TABLE:
> +use_x86_64_table:

While this is going to go away with the comment above, as a general
remark: Within a switch(), please align labels used by "goto" from one
case to another with the adjacent case label(s). Elsewhere, for
"diff -p", please indent labels by at least one blank.

> @@ -9570,7 +9606,8 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>  	  /* Check whether rounding control was enabled for an insn not
>  	     supporting it.  */
>  	  if (ins.modrm.mod == 3 && ins.vex.b
> -	      && !(ins.evex_used & EVEX_b_used))
> +	      && !(ins.evex_used & EVEX_b_used)
> +	      && ins.evex_type != evex_push2_pop2)
>  	    {

Looks like addressing a comment on an earlier patch will render this
change (and hence the new evex_push2_pop2 enumerator) unnecessary. If
not, I'd ask whether you wouldn't better set EVEX_b_used at an
appropriate point.

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3555,3 +3555,9 @@ eretu, 0xf30f01ca, FRED|x64, NoSuf, {}
>  
>  // FRED instructions end.
>  
> +// APX Push2/Pop2 instruction.
> +
> +push2, 0xff/6, APX_F|x64, Modrm|VexW0|EVex128|Push2Pop2|EVexMap4|VexVVVV|No_bSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
> +push2p, 0xff/6, APX_F|x64, Modrm|VexW1|EVex128|Push2Pop2|EVexMap4|VexVVVV|No_bSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
> +pop2, 0x8f/0, APX_F|x64, Modrm|VexW0|EVex128|Push2Pop2|SwapSources|EVexMap4|VexVVVV|No_bSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }
> +pop2p, 0x8f/0, APX_F|x64, Modrm|VexW1|EVex128|Push2Pop2|SwapSources|EVexMap4|VexVVVV|No_bSuf|No_lSuf|No_sSuf, { Reg64, Reg64 }

Missing No_wSuf in all 4 entries?

As to the suffixing 'p' - may I ask in how far that's aligned with other
assemblers? The doc doesn't even give a hint as to what mnemonic
representation should be used (personally I had been taking .x into
consideration). Furthermore the earlier REX2 patch also didn't deal with
the PPX functionality for PUSH/POP, unless I missed something.

Jan

next prev parent reply	other threads:[~2023-09-28 11:37 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-19 15:25 [PATCH 0/8] [RFC] Support Intel APX EGPR Cui, Lili
2023-09-19 15:25 ` [PATCH 1/8] Support APX GPR32 with rex2 prefix Cui, Lili
2023-09-21 15:27   ` Jan Beulich
2023-09-27 15:57     ` Cui, Lili
2023-09-21 15:51   ` Jan Beulich
2023-09-27 15:59     ` Cui, Lili
2023-09-28  8:02       ` Jan Beulich
2023-10-07  3:27         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 2/8] Support APX GPR32 with extend evex prefix Cui, Lili
2023-09-22 10:12   ` Jan Beulich
2023-10-17 15:48     ` Cui, Lili
2023-10-18  6:40       ` Jan Beulich
2023-10-18 10:44         ` Cui, Lili
2023-10-18 10:50           ` Jan Beulich
2023-09-22 10:50   ` Jan Beulich
2023-10-17 15:50     ` Cui, Lili
2023-10-17 16:11       ` Jan Beulich
2023-10-18  2:02         ` Cui, Lili
2023-10-18  6:10           ` Jan Beulich
2023-09-25  6:03   ` Jan Beulich
2023-10-17 15:52     ` Cui, Lili
2023-10-17 16:12       ` Jan Beulich
2023-10-18  6:31         ` Cui, Lili
2023-10-18  6:47           ` Jan Beulich
2023-10-18  7:52             ` Cui, Lili
2023-10-18  8:21               ` Jan Beulich
2023-10-18 11:30                 ` Cui, Lili
2023-10-19 11:58                   ` Cui, Lili
2023-10-19 15:24                     ` Jan Beulich
2023-10-19 16:38                       ` Cui, Lili
2023-10-20  6:25                         ` Jan Beulich
2023-10-22 14:33                           ` Cui, Lili
2023-09-19 15:25 ` [PATCH 3/8] Add tests for " Cui, Lili
2023-09-27 13:11   ` Jan Beulich
2023-10-17 15:53     ` FW: " Cui, Lili
2023-10-17 16:19       ` Jan Beulich
2023-10-18  2:32         ` Cui, Lili
2023-10-18  6:05           ` Jan Beulich
2023-10-18  7:16             ` Cui, Lili
2023-10-18  8:05               ` Jan Beulich
2023-10-18 11:26                 ` Cui, Lili
2023-10-18 12:06                   ` Jan Beulich
2023-10-25 16:03                     ` Cui, Lili
2023-09-27 13:19   ` Jan Beulich
2023-09-19 15:25 ` [PATCH 4/8] Support APX NDD Cui, Lili
2023-09-27 14:44   ` Jan Beulich
2023-10-22 14:05     ` Cui, Lili
2023-10-23  7:12       ` Jan Beulich
2023-10-25  8:10         ` Cui, Lili
2023-10-25  8:47           ` Jan Beulich
2023-10-25 15:49             ` Cui, Lili
2023-10-25 15:59               ` Jan Beulich
2023-09-28  7:57   ` Jan Beulich
2023-10-22 14:57     ` Cui, Lili
2023-10-24 11:39     ` Cui, Lili
2023-10-24 11:58       ` Jan Beulich
2023-10-25 15:29         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 5/8] Support APX NDD optimized encoding Cui, Lili
2023-09-28  9:29   ` Jan Beulich
2023-10-23  2:57     ` Hu, Lin1
2023-10-23  7:23       ` Jan Beulich
2023-10-23  7:50         ` Hu, Lin1
2023-10-23  8:15           ` Jan Beulich
2023-10-24  1:40             ` Hu, Lin1
2023-10-24  6:03               ` Jan Beulich
2023-10-24  6:08                 ` Hu, Lin1
2023-10-23  3:07     ` [PATCH-V2] " Hu, Lin1
2023-10-23  3:30     ` [PATCH 5/8] [v2] " Hu, Lin1
2023-10-23  7:26       ` Jan Beulich
2023-09-19 15:25 ` [PATCH 6/8] Support APX Push2/Pop2 Cui, Lili
2023-09-28 11:37   ` Jan Beulich [this message]
2023-10-30 15:21     ` Cui, Lili
2023-10-30 15:31       ` Jan Beulich
2023-11-20 13:05         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 7/8] Support APX NF Cui, Lili
2023-09-25  6:07   ` Jan Beulich
2023-09-28 12:42   ` Jan Beulich
2023-11-02 10:15     ` Cui, Lili
2023-11-02 10:23       ` Jan Beulich
2023-11-02 10:46         ` Cui, Lili
2023-12-12  2:59           ` H.J. Lu
2023-09-19 15:25 ` [PATCH 8/8] Support APX JMPABS Cui, Lili
2023-09-28 13:11   ` Jan Beulich
2023-11-02  2:32     ` Hu, Lin1
2023-11-02 11:29 [PATCH v2 0/8] Support Intel APX EGPR Cui, Lili
2023-11-02 11:29 ` [PATCH 6/8] Support APX Push2/Pop2 Cui, Lili
2023-11-08 11:44   ` Jan Beulich
2023-11-08 12:52     ` Jan Beulich
2023-11-22  5:48     ` Cui, Lili
2023-11-22  8:53       ` Jan Beulich
2023-11-22 12:26         ` Cui, Lili
2023-11-09  9:57   ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=153e1263-bbc9-1af6-a11e-58489153d95f@suse.com \
    --to=jbeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hongjiu.lu@intel.com \
    --cc=lili.cui@intel.com \
    --cc=zewei.mo@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).