public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"Mo, Zewei" <zewei.mo@intel.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [PATCH 6/8] Support APX Push2/Pop2
Date: Mon, 30 Oct 2023 15:21:40 +0000	[thread overview]
Message-ID: <SJ0PR11MB5600CCCE672EE2DA150969E19EA1A@SJ0PR11MB5600.namprd11.prod.outlook.com> (raw)
In-Reply-To: <153e1263-bbc9-1af6-a11e-58489153d95f@suse.com>

> Subject: Re: [PATCH 6/8] Support APX Push2/Pop2
> 
> On 19.09.2023 17:25, Cui, Lili wrote:
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -5667,6 +5667,22 @@ md_assemble (char *line)
> >        i.rex &= REX_OPCODE;
> >      }
> >
> > +  if (i.tm.opcode_modifier.push2pop2)
> > +    {
> > +      i.imm_operands = 0;
> 
> Why?

Removed, it is redundant.

> 
> > +      unsigned int reg1 = register_number (i.op[0].regs);
> > +      unsigned int reg2 = register_number (i.op[1].regs);
> > +
> > +      /* Push2/Pop2 cannot use RSP and Pop2 cannot pop two same
> registers.  */
> > +      if (reg1 == 0x4 || reg2 == 0x4)
> > +	as_bad (_("%s for `%s'"), _("invalid register operand"),
> > +		insn_name (current_templates->start));
> > +
> > +      if ((i.tm.mnem_off == MN_pop2 || i.tm.mnem_off == MN_pop2p) &&
> > + reg1 == reg2)
> 
> This would be easier with a single opcode check on the lhs of the &&.
> 

Done.

> > +	as_bad (_("%s for `%s'"), _("invalid register operand"),
> > +		insn_name (current_templates->start));
> > +    }
> 

> Both error messages want to be more specific, such that it's clear what exactly
> is wrong. Also a string literal for %s is slightly odd, and may pose issues to
> translators.
> 
> Furthermore there are pre-existing insns with restrictions on register
> operands. I wonder whether these new checks wouldn't better be put close to
> those.
> 

Done.

> > @@ -7100,7 +7116,11 @@ optimize_NDD_to_nonNDD (const insn_template
> *t)
> >        && t->opcode_space == SPACE_EVEXMAP4
> >        && i.reg_operands >= 2
> >        && (i.types[i.operands - 1].bitfield.dword
> > -	  || i.types[i.operands - 1].bitfield.qword))
> > +	  || i.types[i.operands - 1].bitfield.qword)
> > +      && (t->mnem_off != MN_pop2
> > +	  && t->mnem_off != MN_pop2p
> > +	  && t->mnem_off != MN_push2
> > +	  && t->mnem_off != MN_push2p))
> 
> If an explicit check is needed here, why not use the push2pop2 attribute?
> 

We will move it to NDD optimization encoding patch. and use i.tm.opcode_modifier.push2pop2 instead.

> > @@ -8912,7 +8932,13 @@ build_modrm_byte (void)
> >        dest = ~0;
> 
> This already sets dest ...
> 
> >      }
> >    gas_assert (source < dest);
> > -  if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
> > +  if (i.tm.opcode_modifier.push2pop2)
> > +    {
> > +      v = 1;
> > +      dest = (unsigned int) ~0;
> 
> ... to the intended value. Furthermore, doesn't ...
> 
> > +      source = 0;
> > +    }
> > +  else if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
> >        && source != op)
> >      {
> >        unsigned int tmp = source;
> 
> ... this logic already take care of the wanted swapping, provided the templates
> get SwapSources added? Hmm, odd - you already have that attribute on the
> two POP2 templates, but not the two PUSH2 ones. Yet both use identical
> operand (encoding) order.

Push2 and Pop2 have extension opcode,  so they don't need to change anything in this function, and removed SWAP_SOURCES for POP2.

      if (i.tm.extension_opcode != None)
        {
          if (dest != source)
            v = dest;
          dest = ~0;

> 
> > --- /dev/null
> > +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-decode-inval.d
> > @@ -0,0 +1,29 @@
> > +#as: --64
> > +#objdump: -dw
> > +#name: illegal decoding of APX-push2pop2 insns
> 
> Decoding cannot be illegal. Either you mean encoding, or you mean "decoding
> of illegal APX-push2pop2 insn forms".
> 

It should be "illegal encoding of APX-push2pop2 insns".

> > --- /dev/null
> > +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-decode-inval.s
> > @@ -0,0 +1,19 @@
> > +# Check illegal bytecode of APX-Push2Pop2 instructions # pop2 %rax,
> > +%rbx
> 
> What's illegal here? From ...
> 

> > +# pop2 %rax, %rsp
> > +# push2 %rsp, %r17
> > +# pop2 %r12, %r12
> > +# pop2 %r31, %r31
> > +
> > +	.allow_index_reg
> > +	.text
> > +popnd0:
> > +	.byte 0x62,0xF4,0x64,0x08,0x8F,0xC0
> 
> ... the label name I guess EVEX.nd=0, but that needs saying. (As mentioned
> earlier, the comments want moving next to the code emission anyway,
> and .byte wants avoiding it at all possible.)
> 

Removed this invalid test file, some instructions were duplicated with tx86-64-apx-push2pop2-inval.s and moved some .byte (unavoidable but maybe not necessary) tests to x86-64-apx-evex-promoted-bad.s

> > --- /dev/null
> > +++ b/gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
> > @@ -0,0 +1,13 @@
> > +# Check illegal APX-Push2Pop2 instructions
> > +
> > +	.allow_index_reg
> > +	.text
> > +_start:
> > +	push2 %eax, %ebx
> > +	pop2 %rax, %rsp
> > +	push2 %rsp, %r17
> > +	pop2 %r12, %r12
> > +	push2p %eax, %ebx
> > +	pop2p %rax, %rsp
> > +	push2p %rsp, %r17
> > +	pop2p %r12, %r12
> 
> Please make sure that at least one of the forms uses %rsp once as first and
> once as second operand for both push and pop.
> 

Done.

> > --- a/opcodes/i386-dis-evex-x86.h
> > +++ b/opcodes/i386-dis-evex-x86.h
> > @@ -138,3 +138,13 @@
> >      { Bad_Opcode },
> >      { VEX_LEN_TABLE (VEX_LEN_0F3AF0) },
> >    },
> > +  /* X86_64_EVEX_MAP4_8F*/
> 
> Nit: Please make well-formed comments (...
> 

Done.

> > +  {
> > +    { Bad_Opcode },
> > +    { EVEX_LEN_TABLE (EVEX_LEN_MAP4_8F_X86_64) },  },
> > +  /* X86_64_EVEX_MAP4_FF_R_6*/
> 
> ... i.e. also here and possibly elsewhere).
> 

Done.

> > @@ -1341,6 +1354,9 @@ enum
> >    X86_64_EVEX_0F38F6,
> >    X86_64_EVEX_0F38F7,
> >    X86_64_EVEX_0F3AF0,
> > +
> > +  X86_64_EVEX_MAP4_8F,
> > +  X86_64_EVEX_MAP4_FF_R_6,
> >  };
> 
> This might indicate a problem in patch 4: Why is the x86-64 decode step
> entirely missing there? Or, if correct there, why is it needed here? For push
> and pop here I'd expect decode order and hence enumerators to be entirely
> consistent.
> 

Confirmed with patch 2/8 and 4/8, all MAP4 are missing in x86-64 table. Added x86-64 check for all MAP4 in get_valid_dis386. Deleted push2 and  pop2 here.

        case 0x4:
          vex_table_index = EVEX_MAP4;
          ins->evex_type = evex_from_legacy;+
 +       if (ins->address_mode != mode_64bit)
 +         return &bad_opcode;
          break;

> > @@ -1537,7 +1553,10 @@ enum
> >    EVEX_LEN_0F3A39,
> >    EVEX_LEN_0F3A3A,
> >    EVEX_LEN_0F3A3B,
> > -  EVEX_LEN_0F3A43
> > +  EVEX_LEN_0F3A43,
> > +
> > +  EVEX_LEN_MAP4_8F_X86_64,
> > +  EVEX_LEN_MAP4_FF_R_6_X86_64,
> >  };
> 
> Prior changes didn't find it necessary to handle EVEX.l through a table lookup
> - why is this needed here? Map4 has uniform requirements, and an earlier
> patch added respective checking, iirc.
> 

Removed EVEX.l , and added check for it.

> > @@ -8757,10 +8779,24 @@ get_valid_dis386 (const struct dis386 *dp,
> instr_info *ins)
> >        dp = &prefix_table[dp->op[1].bytemode][vindex];
> >        break;
> >
> > +    case USE_X86_64_EVEX_PUSH2_TABLE:
> > +    case USE_X86_64_EVEX_POP2_TABLE:
> > +	ins->evex_type = evex_push2_pop2;
> > +      unsigned int vvvv_reg = ins->vex.register_specifier
> > +			      | !ins->vex.v << 4;
> > +      unsigned int rm_reg = ins->modrm.rm + (ins->rex & REX_B ? 8 : 0)
> > +			    + (ins->rex2 & REX_B ? 16 : 0);
> > +      if (!ins->vex.b || vvvv_reg == 0x4 || rm_reg == 0x4
> > +	  || (dp->op[0].bytemode == USE_X86_64_EVEX_POP2_TABLE
> > +	      && vvvv_reg == rm_reg))
> > +	  return &bad_opcode;
> > +      goto use_x86_64_table;
> 
> I don't think this is the way to handle such restrictions. Since this likely can't
> very well be handled in existing operand handlers, so far the approach was to
> introduce new ...Fixup() ones. That'll also produce more helpful output, e.g.
> "push2 (bad)", rather than ".byte ..."
> with no hint at approximately what insn this was. New USE_..._TABLE should
> imo really only appear when truly new tables are introduced.
> 

Put them in PUSH2_POP2_Fixup and removed USE_X86_64_EVEX_PUSH2_TABLE and USE_X86_64_EVEX_POP2_TABLE.

> >      case USE_X86_64_EVEX_FROM_VEX_TABLE:
> >        ins->evex_type = evex_from_vex;
> >        /* Fall through.  */
> >      case USE_X86_64_TABLE:
> > +use_x86_64_table:
> 
> While this is going to go away with the comment above, as a general
> remark: Within a switch(), please align labels used by "goto" from one case to
> another with the adjacent case label(s). Elsewhere, for "diff -p", please indent
> labels by at least one blank.
> 

Yes, they were deleted.

> > @@ -9570,7 +9606,8 @@ print_insn (bfd_vma pc, disassemble_info *info,
> int intel_syntax)
> >  	  /* Check whether rounding control was enabled for an insn not
> >  	     supporting it.  */
> >  	  if (ins.modrm.mod == 3 && ins.vex.b
> > -	      && !(ins.evex_used & EVEX_b_used))
> > +	      && !(ins.evex_used & EVEX_b_used)
> > +	      && ins.evex_type != evex_push2_pop2)
> >  	    {
> 
> Looks like addressing a comment on an earlier patch will render this change
> (and hence the new evex_push2_pop2 enumerator) unnecessary. If not, I'd
> ask whether you wouldn't better set EVEX_b_used at an appropriate point.
> 

Yes, it is redundant here, removed it. 

> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3555,3 +3555,9 @@ eretu, 0xf30f01ca, FRED|x64, NoSuf, {}
> >
> >  // FRED instructions end.
> >
> > +// APX Push2/Pop2 instruction.
> > +
> > +push2, 0xff/6, APX_F|x64,
> >
> +Modrm|VexW0|EVex128|Push2Pop2|EVexMap4|VexVVVV|No_bSuf|No_lSuf
> |No_sSu
> > +f, { Reg64, Reg64 } push2p, 0xff/6, APX_F|x64,
> >
> +Modrm|VexW1|EVex128|Push2Pop2|EVexMap4|VexVVVV|No_bSuf|No_lSuf
> |No_sSu
> > +f, { Reg64, Reg64 } pop2, 0x8f/0, APX_F|x64,
> >
> +Modrm|VexW0|EVex128|Push2Pop2|SwapSources|EVexMap4|VexVVVV|No_
> bSuf|No
> > +_lSuf|No_sSuf, { Reg64, Reg64 } pop2p, 0x8f/0, APX_F|x64,
> >
> +Modrm|VexW1|EVex128|Push2Pop2|SwapSources|EVexMap4|VexVVVV|No_
> bSuf|No
> > +_lSuf|No_sSuf, { Reg64, Reg64 }
> 
> Missing No_wSuf in all 4 entries?
>
 
Added.

> As to the suffixing 'p' - may I ask in how far that's aligned with other
> assemblers? The doc doesn't even give a hint as to what mnemonic
> representation should be used (personally I had been taking .x into
> consideration). Furthermore the earlier REX2 patch also didn't deal with the
> PPX functionality for PUSH/POP, unless I missed something.
>

Pop2p and Push2p are listed in sections 9.1 and 9.2 of the doc. PPX functionality for PUSH/POP is not implemented in this patch, It will be implemented separately later. I'll note it in the Push2/pop2 commit log.

EVEX.LLZ.NP.MAP4.W1 8F 11:000:bbb
POP2P {NF=0} {ND=1} r64, r64

EVEX.LLZ.NP.MAP4.W1 FF 11:110:bbb
PUSH2P {NF=0} {ND=1} r64, r64

Thanks,
Lili.

  reply	other threads:[~2023-10-30 15:21 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-19 15:25 [PATCH 0/8] [RFC] Support Intel APX EGPR Cui, Lili
2023-09-19 15:25 ` [PATCH 1/8] Support APX GPR32 with rex2 prefix Cui, Lili
2023-09-21 15:27   ` Jan Beulich
2023-09-27 15:57     ` Cui, Lili
2023-09-21 15:51   ` Jan Beulich
2023-09-27 15:59     ` Cui, Lili
2023-09-28  8:02       ` Jan Beulich
2023-10-07  3:27         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 2/8] Support APX GPR32 with extend evex prefix Cui, Lili
2023-09-22 10:12   ` Jan Beulich
2023-10-17 15:48     ` Cui, Lili
2023-10-18  6:40       ` Jan Beulich
2023-10-18 10:44         ` Cui, Lili
2023-10-18 10:50           ` Jan Beulich
2023-09-22 10:50   ` Jan Beulich
2023-10-17 15:50     ` Cui, Lili
2023-10-17 16:11       ` Jan Beulich
2023-10-18  2:02         ` Cui, Lili
2023-10-18  6:10           ` Jan Beulich
2023-09-25  6:03   ` Jan Beulich
2023-10-17 15:52     ` Cui, Lili
2023-10-17 16:12       ` Jan Beulich
2023-10-18  6:31         ` Cui, Lili
2023-10-18  6:47           ` Jan Beulich
2023-10-18  7:52             ` Cui, Lili
2023-10-18  8:21               ` Jan Beulich
2023-10-18 11:30                 ` Cui, Lili
2023-10-19 11:58                   ` Cui, Lili
2023-10-19 15:24                     ` Jan Beulich
2023-10-19 16:38                       ` Cui, Lili
2023-10-20  6:25                         ` Jan Beulich
2023-10-22 14:33                           ` Cui, Lili
2023-09-19 15:25 ` [PATCH 3/8] Add tests for " Cui, Lili
2023-09-27 13:11   ` Jan Beulich
2023-10-17 15:53     ` FW: " Cui, Lili
2023-10-17 16:19       ` Jan Beulich
2023-10-18  2:32         ` Cui, Lili
2023-10-18  6:05           ` Jan Beulich
2023-10-18  7:16             ` Cui, Lili
2023-10-18  8:05               ` Jan Beulich
2023-10-18 11:26                 ` Cui, Lili
2023-10-18 12:06                   ` Jan Beulich
2023-10-25 16:03                     ` Cui, Lili
2023-09-27 13:19   ` Jan Beulich
2023-09-19 15:25 ` [PATCH 4/8] Support APX NDD Cui, Lili
2023-09-27 14:44   ` Jan Beulich
2023-10-22 14:05     ` Cui, Lili
2023-10-23  7:12       ` Jan Beulich
2023-10-25  8:10         ` Cui, Lili
2023-10-25  8:47           ` Jan Beulich
2023-10-25 15:49             ` Cui, Lili
2023-10-25 15:59               ` Jan Beulich
2023-09-28  7:57   ` Jan Beulich
2023-10-22 14:57     ` Cui, Lili
2023-10-24 11:39     ` Cui, Lili
2023-10-24 11:58       ` Jan Beulich
2023-10-25 15:29         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 5/8] Support APX NDD optimized encoding Cui, Lili
2023-09-28  9:29   ` Jan Beulich
2023-10-23  2:57     ` Hu, Lin1
2023-10-23  7:23       ` Jan Beulich
2023-10-23  7:50         ` Hu, Lin1
2023-10-23  8:15           ` Jan Beulich
2023-10-24  1:40             ` Hu, Lin1
2023-10-24  6:03               ` Jan Beulich
2023-10-24  6:08                 ` Hu, Lin1
2023-10-23  3:07     ` [PATCH-V2] " Hu, Lin1
2023-10-23  3:30     ` [PATCH 5/8] [v2] " Hu, Lin1
2023-10-23  7:26       ` Jan Beulich
2023-09-19 15:25 ` [PATCH 6/8] Support APX Push2/Pop2 Cui, Lili
2023-09-28 11:37   ` Jan Beulich
2023-10-30 15:21     ` Cui, Lili [this message]
2023-10-30 15:31       ` Jan Beulich
2023-11-20 13:05         ` Cui, Lili
2023-09-19 15:25 ` [PATCH 7/8] Support APX NF Cui, Lili
2023-09-25  6:07   ` Jan Beulich
2023-09-28 12:42   ` Jan Beulich
2023-11-02 10:15     ` Cui, Lili
2023-11-02 10:23       ` Jan Beulich
2023-11-02 10:46         ` Cui, Lili
2023-12-12  2:59           ` H.J. Lu
2023-09-19 15:25 ` [PATCH 8/8] Support APX JMPABS Cui, Lili
2023-09-28 13:11   ` Jan Beulich
2023-11-02  2:32     ` Hu, Lin1
2023-11-02 11:29 [PATCH v2 0/8] Support Intel APX EGPR Cui, Lili
2023-11-02 11:29 ` [PATCH 6/8] Support APX Push2/Pop2 Cui, Lili
2023-11-08 11:44   ` Jan Beulich
2023-11-08 12:52     ` Jan Beulich
2023-11-22  5:48     ` Cui, Lili
2023-11-22  8:53       ` Jan Beulich
2023-11-22 12:26         ` Cui, Lili
2023-11-09  9:57   ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=SJ0PR11MB5600CCCE672EE2DA150969E19EA1A@SJ0PR11MB5600.namprd11.prod.outlook.com \
    --to=lili.cui@intel.com \
    --cc=JBeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hongjiu.lu@intel.com \
    --cc=zewei.mo@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).