RE: [PATCH 1/8] Support APX GPR32 with rex2 prefix

public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed

From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"ccoutant@gmail.com" <ccoutant@gmail.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [PATCH 1/8] Support APX GPR32 with rex2 prefix
Date: Fri, 10 Nov 2023 07:11:15 +0000	[thread overview]
Message-ID: <SJ0PR11MB5600BFDD77AEFE660762F5339EAEA@SJ0PR11MB5600.namprd11.prod.outlook.com> (raw)
In-Reply-To: <b417ccc8-0a18-1319-ce62-3efadf560cdf@suse.com>

> Subject: Re: [PATCH 1/8] Support APX GPR32 with rex2 prefix
> 
> On 09.11.2023 14:27, Cui, Lili wrote:
> >>>> Also is this, ...
> >>>>
> >>>>>      {
> >>>>>        unsigned char threebyte;
> >>>>>
> >>>>> -      ins.codep++;
> >>>>> -      if (!fetch_code (info, ins.codep + 1))
> >>>>> -	goto fetch_error_out;
> >>>>> +      if (!ins.rex2)
> >>>>> +	{
> >>>>> +	  ins.codep++;
> >>>>> +	  if (!fetch_code (info, ins.codep + 1))
> >>>>> +	    goto fetch_error_out;
> >>>>> +	}
> >>>>>        threebyte = *ins.codep;
> >>>>>        dp = &dis386_twobyte[threebyte];
> >>>>>        ins.need_modrm = twobyte_has_modrm[threebyte];
> >>>>
> >>>> ... all the way to here, really correct for d5 00 0f?
> >>>>
> >>>
> >>> I think the 0f here must indicate that it is the first byte of the
> >>> legacy map1
> >> instruction, meaning legacy map0 does not have 0f opcode. If this
> >> instruction has a rex2 prefix, rex2.w must be 1 and should be d5 80.
> >> If a bad binary does appear, our original code also has the same issue.
> >>>
> >>> static const struct dis386 dis386[] = { ...
> >>> / * 0f  */
> >>> { Bad_Opcode },       /* 0x0f extended opcode escape */
> >>
> >> No, this entry simply will never be used, because of how decoding is done.
> >> My comment was about what's going to happen if you encounter the d5
> >> 00 0f byte sequence. That's _not_ an indication to use map1 for
> >> decoding, nor to read another opcode byte. In this case the table
> >> entry you quote above will need to come into play, not any entry from
> >> dis386_twobyte[]. (As long as both are Bad_Opcode the difference may
> >> not even be noticeable, but it would be a latent trap for someone to
> >> fall into down the road.)
> >>
> >
> >
> >   /* REX2.M in rex2 prefix represents map0 or map1.  */
> >   if (*ins.codep == 0x0f || (ins.rex2 & REX2_M))
> >     {
> >       unsigned char threebyte;
> >
> >       if (!ins.rex2)
> >         {
> >           ins.codep++;
> >           if (!fetch_code (info, ins.codep + 1))
> >             goto fetch_error_out;                                                      ---> When there
> are no bytes after 0f, it will jump to fetch error, but no error will be reported.
> >         }
> >       threebyte = *ins.codep;
> >       dp = &dis386_twobyte[threebyte];
> >       ins.need_modrm = twobyte_has_modrm[threebyte];
> >       ins.codep++;
> >     }
> >
> > For d5 00 0f
> > Decode to:
> >    0:   d5                      rex2
> >    1:   00 0f                   add    %cl,(%rdi)
> 
> But this would better have d5 00 0f all on the first line (it definitely needs to
> have d5 00 on the same line, as the bytes belong together), as opposed to ...
> 
> > For 40 0f
> > Decode to:
> >    0:   40                      rex
> >    1:   0f                      .byte 0xf
> 
> ... this where there truly is a known missing byte before we could proceed
> further. (It's still a little questionable to print REX separately in this case, but
> that's the way the binutils disassembler has always worked.)
> 
> Yet to restate - to see what I mean, you'd need to populate at least one of the
> two 0f slots in the mentioned arrays. What I'm suspecting from the code as
> this patch version has it is that d5 00 0f would wrongly descend into
> dis386_twobyte[]. Yet you can tell that from it correctly using dis386[] only if
> the two 0f slots of these arrays are meaningfully different (or by actually
> looking at things in e.g.
> a debugger).
> 

I'm confused here, for d5 00 0f when it fetches the next byte after 0f it will find there is no byte there and then go to fetch_error_out and then it will return from print_insn and I don't have a chance to do anything for it. It cannot reach dis386_twobyte[]. 

> >>>>> @@ -9513,6 +9572,13 @@ print_insn (bfd_vma pc, disassemble_info
> >>>>> *info,
> >>>> int intel_syntax)
> >>>>>        && !ins.need_vex && ins.last_rex_prefix >= 0)
> >>>>>      ins.all_prefixes[ins.last_rex_prefix] = 0;
> >>>>>
> >>>>> +  /* Check if the REX2 prefix is used.  */
> >>>>> +  if (ins.last_rex2_prefix >= 0
> >>>>> +      && ((((ins.rex2 & 0x7) ^ (ins.rex2_used & 0x7)) == 0
> >>>>> +	   && (ins.rex2 & 0x7))
> >>>>
> >>>> DYM ((ins.rex2 & 7) & ~(ins.rex2_used & 7)) != 0
> >>>>
> >>>
> >>> Here's an example of a negative scenario, when ins.rex2 == 1 and
> >> ins.rex2_used == 1, we want to clear last_rex2_prefix, because it has
> >> egpr and we don't want to add {rex2} to it.
> >>
> >> Well, that would be dealt with as well by the simpler code I
> >> suggested, wouldn't it?
> >>
> >
> > No, for d510 ,  ((ins.rex2 & 7) & ~(ins.rex2_used & 7)) == 0. Anyway, I want to
> delete them. I don't see any point in it at all.
> 
> Hmm, I guess I'm confused. How would you present unconsumed
> REX2.{R,X,B}{3,4} then?
> 

For rex2.R4X4B4. I can't imagine why they set RXB but didn't consume it at the end. Since rex2 has no content listed, so I think it is not useful to have rex2_used.
For rex2.WR3X3B3, I think it can be ignored in rex2, which doesn't list what's there.

We currently use the following rules to disambiguate.

When the instruction has egpr, it means it has rex2 prefix and we don't print {rex2} for it. When it also has an evex version, we add {evex} to the evex instruction.
When the instruction has no egpr, we need to print {rex2} for it. In case of evex instruction, we need to print {evex} for it.

            adc     $0x7b,%r8b
{rex2} adc     $0x7b,%r8b
            adc     $0x7b,%r16b
{evex} adc     $0x7b,%r16b
{evex} adc     $0x7b,%r8b

> >>>>> @@ -11086,8 +11155,11 @@ print_register (instr_info *ins, unsigned
> >>>>> int
> >>>> reg, unsigned int rexmask,
> >>>>>      ins->illegal_masking = true;
> >>>>>
> >>>>>    USED_REX (rexmask);
> >>>>> +  USED_REX2 (rexmask);
> >>>>
> >>>> Do both really need tracking separately? Whatever consumes REX.B
> >>>> will also consume REX2.B4, an so on.
> >>>>
> >>> I was confused here, I think we only need to print {rex2} for the
> >>> upper 4 bits
> >> == *000, which means egpr is not used and we need to use {rex2} to
> >> distinguish it from legacy encoding.  maybe we don’t need ((ins.rex2
> >> & 0x7) ^ (ins.rex2_used & 0x7)) == 0, and nor USED_REX2 (rexmask). I
> >> intend to delete them.
> >>>
> >>> +  /* Check if the REX2 prefix is used.  */
> >>> +  if (ins.last_rex2_prefix >= 0
> >>> +      && ((((ins.rex2 & 0x7) ^ (ins.rex2_used & 0x7)) == 0
> >>> +	   && (ins.rex2 & 0x7))
> >>
> >> But that's the same you had before. I'm afraid I don't see what
> >> you're trying to tell me.
> >>
> > After removing  " ((ins.rex2 & 0x7) ^ (ins.rex2_used & 0x7)) == 0 ",
> > the code changes to
> >
> >   +  /* Check if the REX2 prefix is used.  */
> >   +  if (ins.last_rex2_prefix >= 0 && (ins.rex2 & 0x7))
> >
> > When it is true, decode will not print the {rex2} for this insn.
> 
> Yet ins.rex2 having any of the low 3 bits set says nothing about whether every
> one of these was consumed while processing operands / suffixes.
> You need to consult .rex{,2}_used; my earlier point was merely that you don't
> need a separate .rex2_used; the bits in .rex_used are all you require to get
> this right (as a consumer of, say, REX.X / REX2.X3 is also a consumer of
> REX2.X4; leaving aside EVEX encoded insns for the moment).
> 

Here are some cases:
0:   41 0f a8                  rex.B push %gs         ---> rex.B was not consumed, rex will print it.
0:   d5 01 0f a8             {rex2} push %gs      ----> Without egpr, we need to print {rex2} for it. But we can't see anything about REX2.B3 from the prefix.
4:   d5 19 58                  pop    %r24             ----> There is egpr, we know it uses rex2 prefix. We cannot see the information of the lower 4 bits of rex2

It is not helpful to judge rex_used in rex2.

Thanks.
Lili.

next prev parent reply	other threads:[~2023-11-10  7:11 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-02 11:29 [PATCH v2 0/8] Support Intel APX EGPR Cui, Lili
2023-11-02 11:29 ` [PATCH 1/8] Support APX GPR32 with rex2 prefix Cui, Lili
2023-11-02 17:05   ` Jan Beulich
2023-11-03  6:20     ` Cui, Lili
2023-11-03 13:05     ` Jan Beulich
2023-11-03 14:19   ` Jan Beulich
2023-11-06 15:20     ` Cui, Lili
2023-11-06 16:08       ` Jan Beulich
2023-11-07  8:16         ` Cui, Lili
2023-11-07 10:43           ` Jan Beulich
2023-11-07 15:31             ` Cui, Lili
2023-11-07 15:43               ` Jan Beulich
2023-11-07 15:53                 ` Cui, Lili
2023-11-06 15:02   ` Jan Beulich
2023-11-07  8:06     ` Cui, Lili
2023-11-07 10:20       ` Jan Beulich
2023-11-07 14:32         ` Cui, Lili
2023-11-07 15:08           ` Jan Beulich
2023-11-06 15:39   ` Jan Beulich
2023-11-09  8:02     ` Cui, Lili
2023-11-09 10:52       ` Jan Beulich
2023-11-09 13:27         ` Cui, Lili
2023-11-09 15:22           ` Jan Beulich
2023-11-10  7:11             ` Cui, Lili [this message]
2023-11-10  9:14               ` Jan Beulich
2023-11-10  9:21                 ` Jan Beulich
2023-11-10 12:38                   ` Cui, Lili
2023-12-14 10:13                   ` Cui, Lili
2023-12-18 15:24                     ` Jan Beulich
2023-12-18 16:23                       ` H.J. Lu
2023-11-10  9:47                 ` Cui, Lili
2023-11-10  9:57                   ` Jan Beulich
2023-11-10 12:05                     ` Cui, Lili
2023-11-10 12:35                       ` Jan Beulich
2023-11-13  0:18                         ` Cui, Lili
2023-11-02 11:29 ` [PATCH 2/8] Created an empty EVEX_MAP4_ sub-table for EVEX instructions Cui, Lili
2023-11-02 11:29 ` [PATCH 3/8] Support APX GPR32 with extend evex prefix Cui, Lili
2023-11-02 11:29 ` [PATCH 4/8] Add tests for " Cui, Lili
2023-11-08  9:11   ` Jan Beulich
2023-11-15 14:56     ` Cui, Lili
2023-11-16  9:17       ` Jan Beulich
2023-11-16 15:34     ` Cui, Lili
2023-11-16 16:50       ` Jan Beulich
2023-11-17 12:42         ` Cui, Lili
2023-11-17 14:38           ` Jan Beulich
2023-11-22 13:40             ` Cui, Lili
2023-11-02 11:29 ` [PATCH 5/8] Support APX NDD Cui, Lili
2023-11-08 10:39   ` Jan Beulich
2023-11-20  1:19     ` Cui, Lili
2023-11-08 11:13   ` Jan Beulich
2023-11-20 12:36     ` Cui, Lili
2023-11-20 16:33       ` Jan Beulich
2023-11-22  7:46         ` Cui, Lili
2023-11-22  8:47           ` Jan Beulich
2023-11-22 10:45             ` Cui, Lili
2023-11-23 10:57               ` Jan Beulich
2023-11-23 12:14                 ` Cui, Lili
2023-11-24  6:56                 ` [PATCH v3 0/9] Support Intel APX EGPR Cui, Lili
2023-12-07  8:17                   ` Cui, Lili
2023-12-07  8:33                     ` Cui, Lili
2023-11-09  9:37   ` [PATCH 5/8] Support APX NDD Jan Beulich
2023-11-20  1:33     ` Cui, Lili
2023-11-20  8:19       ` Jan Beulich
2023-11-20 12:54         ` Cui, Lili
2023-11-20 16:43           ` Jan Beulich
2023-11-02 11:29 ` [PATCH 6/8] Support APX Push2/Pop2 Cui, Lili
2023-11-08 11:44   ` Jan Beulich
2023-11-08 12:52     ` Jan Beulich
2023-11-22  5:48     ` Cui, Lili
2023-11-22  8:53       ` Jan Beulich
2023-11-22 12:26         ` Cui, Lili
2023-11-09  9:57   ` Jan Beulich
2023-11-02 11:29 ` [PATCH 7/8] Support APX NDD optimized encoding Cui, Lili
2023-11-09 10:36   ` Jan Beulich
2023-11-10  5:43     ` Hu, Lin1
2023-11-10  9:54       ` Jan Beulich
2023-11-14  2:28         ` Hu, Lin1
2023-11-14 10:50           ` Jan Beulich
2023-11-15  2:52             ` Hu, Lin1
2023-11-15  8:57               ` Jan Beulich
2023-11-15  2:59             ` [PATCH][v3] " Hu, Lin1
2023-11-15  9:34               ` Jan Beulich
2023-11-17  7:24                 ` Hu, Lin1
2023-11-17  9:47                   ` Jan Beulich
2023-11-20  3:28                     ` Hu, Lin1
2023-11-20  8:34                       ` Jan Beulich
2023-11-14  2:58         ` [PATCH 1/2] Reorder APX insns in i386.tbl Hu, Lin1
2023-11-14 11:20           ` Jan Beulich
2023-11-15  1:49             ` Hu, Lin1
2023-11-15  8:52               ` Jan Beulich
2023-11-17  3:27                 ` Hu, Lin1
2023-11-02 11:29 ` [PATCH 8/8] Support APX JMPABS Cui, Lili
2023-11-09 12:59   ` Jan Beulich
2023-11-14  3:26     ` Hu, Lin1
2023-11-14 11:15       ` Jan Beulich
2023-11-24  5:40         ` Hu, Lin1
2023-11-24  7:21           ` Jan Beulich
2023-11-27  2:16             ` Hu, Lin1
2023-11-27  8:03               ` Jan Beulich
2023-11-27  8:46                 ` Hu, Lin1
2023-11-27  8:54                   ` Jan Beulich
2023-11-27  9:03                     ` Hu, Lin1
2023-11-27 10:32                       ` Jan Beulich
2023-12-04  7:33                         ` Hu, Lin1
2023-11-02 13:22 ` [PATCH v2 0/8] Support Intel APX EGPR Jan Beulich
2023-11-03 16:42   ` Cui, Lili
2023-11-06  7:30     ` Jan Beulich
2023-11-06 14:20       ` Cui, Lili
2023-11-06 14:44         ` Jan Beulich
2023-11-06 16:03           ` Cui, Lili
2023-11-06 16:10             ` Jan Beulich
2023-11-07  1:53               ` Cui, Lili
2023-11-07 10:11                 ` Jan Beulich
  -- strict thread matches above, loose matches on Subject: below --
2023-09-19 15:25 [PATCH 0/8] [RFC] " Cui, Lili
2023-09-19 15:25 ` [PATCH 1/8] Support APX GPR32 with rex2 prefix Cui, Lili
2023-09-21 15:27   ` Jan Beulich
2023-09-27 15:57     ` Cui, Lili
2023-09-21 15:51   ` Jan Beulich
2023-09-27 15:59     ` Cui, Lili
2023-09-28  8:02       ` Jan Beulich
2023-10-07  3:27         ` Cui, Lili

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=SJ0PR11MB5600BFDD77AEFE660762F5339EAEA@SJ0PR11MB5600.namprd11.prod.outlook.com \
    --to=lili.cui@intel.com \
    --cc=JBeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=ccoutant@gmail.com \
    --cc=hongjiu.lu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).