From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com>,
"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [PATCH v3 2/9] Support APX GPR32 with rex2 prefix
Date: Wed, 6 Dec 2023 12:43:50 +0000
Message-ID: <SJ0PR11MB5600FDF8200EEDEF70714AE79E84A@SJ0PR11MB5600.namprd11.prod.outlook.com>
In-Reply-To: <9578ce1e-233e-4c9a-8003-a619a685c236@suse.com>
> On 05.12.2023 14:31, Cui, Lili wrote:
> >> On 24.11.2023 08:02, Cui, Lili wrote:
> >>> @@ -4120,6 +4134,21 @@ build_evex_prefix (void)
> >>> i.vex.bytes[3] |= i.mask.reg->reg_num; }
> >>>
> >>> +/* Build (2 bytes) rex2 prefix.
> >>> + | D5h |
> >>> + | m | R4 X4 B4 | W R X B |
> >>> +*/
> >>> +static void
> >>> +build_rex2_prefix (void)
> >>> +{
> >>> + /* Rex2 reuses i.vex because they handle i.tm.opcode_space the
> >>> +same. */
> >>
> >> How do they handle it the same? (Also I don't think this is useful as
> >> a code comment; it instead belongs in the description imo.)
> >>
> >
> > Moved the comment to the functions description.
> >
> > /* Build (2 bytes) rex2 prefix.
> > | D5h |
> > | m | R4 X4 B4 | W R X B |
> >
> > Rex2 reuses i.vex as they handle i.tm.opcode_space the same way.
> > */ static void build_rex2_prefix (void)
> >
> >
> > In function "output_insn", some handle like this.
> >
> > if (!i.vex.length)
> > switch (i.tm.opcode_space)
> > {
> > case SPACE_BASE:
> > break;
> > case SPACE_0F:
> > ++j;
> > break;
> > case SPACE_0F38:
> > case SPACE_0F3A:
> > j += 2;
> > break;
> > default:
> > abort ();
> > }
> > .....
> > if (!i.vex.length
> > && i.tm.opcode_space != SPACE_BASE)
> > {
> > *p++ = 0x0f;
> > if (i.tm.opcode_space != SPACE_0F)
> > *p++ = i.tm.opcode_space == SPACE_0F38
> > ? 0x38 : 0x3a;
> > }
>
> Oh, I see. That's pretty remote. How about replacing "the same way"?
> Perhaps
> "Rex2 reuses i.vex as they both encode i.tm.opcode_space in their prefixes"?
>
Done.
> While in that form it's fine to remain in a code comment, just a general
> clarification: When I say something wants saying in the "description", it's
> (almost) always that I mean the patch description, not anything else.
>
I see.
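
For reference, the layout in the comment above (D5h followed by a payload byte holding m, R4 X4 B4, W R X B from the most significant bit down) can be illustrated with a small standalone sketch. This is not the gas implementation: the struct, its field names and encode_rex2 are hypothetical, and it only packs bits, without touching i.vex or i.tm.opcode_space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the REX2 payload bits (names are
   illustrative and do not exist in gas).  */
struct rex2_fields
{
  bool m;           /* Opcode map select: 0 = map 0, 1 = map 1 (0x0f).  */
  bool r4, x4, b4;  /* Extra (fifth) register-number bits.  */
  bool w, r3, x3, b3; /* Same meaning as W, R, X, B in a REX prefix.  */
};

/* Pack the fields into the two prefix bytes: 0xd5, then
   m | R4 X4 B4 | W R X B from bit 7 down to bit 0.  */
static void
encode_rex2 (const struct rex2_fields *f, uint8_t out[2])
{
  out[0] = 0xd5;
  out[1] = (uint8_t) (f->m << 7 | f->r4 << 6 | f->x4 << 5 | f->b4 << 4
		      | f->w << 3 | f->r3 << 2 | f->x3 << 1 | f->b3);
}
```

With only m and W set, the payload byte is 0x88. The m bit is what the reworded comment refers to: the prefix itself carries the opcode space, so no separate 0x0f escape byte is emitted.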
> >> I did comment on, in particular, the 8-bit register counts before.
> >> Afaict the comments above are nevertheless unchanged and hence still
> >> not really correct.
> >>
> >
> > Changed to :
> >
> > if (flag_code == CODE_64BIT || base_regnum < 4)
> > {
> > i.types[1].bitfield.byte = 1;
> > /* Ignore the suffix. */
> > i.suffix = 0;
> > /* Convert to byte registers. 8-bit registers are special,
> > RegRex64 and non-RegRex64 each have 8 registers. */
> > if (i.types[1].bitfield.word)
> > /* 32 (or 40) 8-bit registers. */
> > j = 32;
> > else if (i.types[1].bitfield.dword)
> > /* 32 (or 40)8-bit registers + 32 16-bit registers. */
>
> Nit: Missing blank.
>
Done.
> > j = 64;
> > else
> > /* 32 (or 40) 8-bit registers + 32 16-bit registers
> > + 32 32-bit registers. */
> > j = 96;
> >
> > if (!(i.op[1].regs->reg_flags & (RegRex | RegRex2)) && base_regnum < 4)
> > j += 8;
> > i.op[1].regs -= j;
> > }
>
> I won't insist on further changes, but imo as you're adding comments, also
> adding a comment to this last if() (which finally takes care of the 8-bit reg
> special case) would be advisable.
>
Added.
/* In 64-bit mode, the byte registers AH, BH, CH and DH cannot be
accessed when a REX or REX2 prefix is used. */
if (!(i.op[1].regs->reg_flags & (RegRex | RegRex2)) && base_regnum < 4)
j += 8;
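
As an illustration of the hardware rule that comment relies on (not of the gas register table or the index arithmetic above), here is a minimal sketch. The function name byte_reg_name and both tables are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical lookup, independent of gas: name of the 8-bit register
   selected by a 3-bit register encoding in 64-bit mode.  Encodings 4-7
   mean AH/CH/DH/BH without a REX or REX2 prefix, and SPL/BPL/SIL/DIL
   once such a prefix is present.  */
static const char *
byte_reg_name (unsigned int encoding, bool has_rex_or_rex2)
{
  static const char *const legacy[8]
    = { "al", "cl", "dl", "bl", "ah", "ch", "dh", "bh" };
  static const char *const with_rex[8]
    = { "al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil" };

  return (has_rex_or_rex2 ? with_rex : legacy)[encoding & 7];
}
```

Encoding 4 therefore names %ah without a prefix and %spl with one, which is why the first four registers need the special case when no Rex/Rex2 register is involved.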
> >>> +/* Check if Egprs operands are valid for the instruction. */
> >>> +
> >>> +static int
> >>> +check_EgprOperands (const insn_template *t) {
> >>> + if (!t->opcode_modifier.noegpr)
> >>> + return 0;
> >>> +
> >>> + for (unsigned int op = 0; op < i.operands; op++)
> >>> + {
> >>> + if (i.types[op].bitfield.class != Reg
> >>> + /* Special case for (%dx) while doing input/output op */
> >>> + || i.input_output_operand)
> >>
> >> Didn't we agree that this extra condition isn't necessary, once the
> >> producer site correctly updates all state (which was supposed to be
> >> done in a small prereq patch)?
> >>
> >
> > I tried adding "Unspecified | BaseIndex" to the InOutPortReg, then some
> > related instructions had two memory operands, so it raised a lot of invalid
> > test case fail, and more ugly code needed to be added. In the end, I felt that
> > this simple modification might be better.
>
> Changing InOutPortReg of course isn't going to be easy. But that also wasn't
> what we had discussed. Instead (I thought) we agreed on ...
>
> > @@ -13137,6 +13137,7 @@ i386_att_operand (char *operand_string)
> > && !operand_type_check (i.types[this_operand], disp))
> > {
> > i.types[this_operand] = i.base_reg->reg_type;
> > + i.types[this_operand].bitfield.class = 0;
> > i.input_output_operand = true;
> > return 1;
>
> amending this code to also correctly set i.op[].regs. Perhaps it would also be
> best to actually clear i.base_reg (for there not being any memory operand).
> (FTAOD: All of this in a separate prereq patch, not here. The code creating
> inconsistent state has been a [latent] bug for a long time.)
>
Added i.base_reg = NULL. I'm only showing it here for discussion; I'll create a separate prereq patch for it.
@@ -13016,6 +13016,8 @@ i386_att_operand (char *operand_string)
&& !operand_type_check (i.types[this_operand], disp))
{
i.types[this_operand] = i.base_reg->reg_type;
+ i.types[this_operand].bitfield.class = 0;
+ i.base_reg = NULL;
i.input_output_operand = true;
return 1;
}
> >>> --- a/gas/doc/c-i386.texi
> >>> +++ b/gas/doc/c-i386.texi
> >>> @@ -217,6 +217,7 @@ accept various extension mnemonics. For
> >>> example, @code{avx10.1/256}, @code{avx10.1/128}, @code{user_msr},
> >>> +@code{apx_f},
> >>> @code{amx_int8},
> >>> @code{amx_bf16},
> >>> @code{amx_fp16},
> >>> @@ -983,6 +984,9 @@ Different encoding options can be specified via pseudo prefixes:
> >>> instructions (x86-64 only). Note that this differs from the
> >>> @samp{rex} prefix which generates REX prefix unconditionally.
> >>>
> >>> +@item
> >>> +@samp{@{rex2@}} -- encode with REX2 prefix
> >>
> >> This isn't in line with what's said for {rex}. Iirc we were in
> >> agreement that we want both to behave consistently. In which case
> >> documentation also needs to describe them consistently.
> >>
> >
> > Changed to
> >
> > @item
> > @samp{@{rex2@}} -- prefer REX2 prefix for integer and legacy vector
> > instructions (APX_F only). Note that this differs from the
> > @samp{rex2} prefix which generates REX2 prefix unconditionally.
>
> Except there's no "rex2" prefix according to the present implementation.
>
Removed that sentence, since the current implementation has no such prefix.
@item
@samp{@{rex2@}} -- prefer REX2 prefix for integer and legacy vector
instructions (APX_F only).
> >>> --- a/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
> >>> +++ b/gas/testsuite/gas/i386/x86-64-pseudos-bad.s
> >>> @@ -5,3 +5,61 @@ pseudos:
> >>> {rex} vmovaps %xmm7,%xmm2
> >>> {rex} vmovaps %xmm17,%xmm2
> >>> {rex} rorx $7,%eax,%ebx
> >>> + {rex2} vmovaps %xmm7,%xmm2
> >>> + {rex2} xsave (%rax)
> >>> + {rex2} xsaves (%ecx)
> >>> + {rex2} xsaves64 (%ecx)
> >>> + {rex2} xsavec (%ecx)
> >>> + {rex2} xrstors (%ecx)
> >>> + {rex2} xrstors64 (%ecx)
> >>> +
> >>> + #All opcodes in the row 0xa* prefixed REX2 are illegal.
> >>> + #{rex2} test (0xa8) is a special case, it will remap to test (0xf6)
> >>> + {rex2} mov 0x90909090,%al
> >>> + {rex2} movabs 0x1,%al
> >>> + {rex2} cmpsb %es:(%edi),%ds:(%esi)
> >>> + {rex2} lodsb
> >>> + {rex2} lods %ds:(%esi),%al
> >>> + {rex2} lodsb (%esi)
> >>> + {rex2} movs
> >>> + {rex2} movs (%esi), (%edi)
> >>> + {rex2} scasl
> >>> + {rex2} scas %es:(%edi),%eax
> >>> + {rex2} scasb (%edi)
> >>> + {rex2} stosb
> >>> + {rex2} stosb (%edi)
> >>> + {rex2} stos %eax,%es:(%edi)
> >>> +
> >>> + #All opcodes in the row 0x7* prefixed REX2 are illegal.
> >>
> >> This also covers map 1 row 8, doesn't it?
> >>
> >
> > No, I didn't find 0xf8* in opcode table.
>
> Assuming (again) you mean 0x0f 0x8*, how did you not find it? Or wait,
> depends on what "opcode table" here means: The manual's or
> opcodes/i386-opc.tbl? The latter of course doesn't have them, as they're ...
>
> >>> + {rex2} jo .+2-0x70
> >>> + {rex2} jno .+2-0x70
> >>> + {rex2} jb .+2-0x70
> >>> + {rex2} jae .+2-0x70
> >>> + {rex2} je .+2-0x70
> >>> + {rex2} jne .+2-0x70
> >>> + {rex2} jbe .+2-0x70
> >>> + {rex2} ja .+2-0x70
> >>> + {rex2} js .+2-0x70
> >>> + {rex2} jns .+2-0x70
> >>> + {rex2} jp .+2-0x70
> >>> + {rex2} jnp .+2-0x70
> >>> + {rex2} jl .+2-0x70
> >>> + {rex2} jge .+2-0x70
> >>> + {rex2} jle .+2-0x70
> >>> + {rex2} jg .+2-0x70
>
> ... the disp32/disp16 forms of these branches, which are created only during
> relaxation.
>
Oh, I see. I found them in the SDM and added test cases for them.
#All opcodes in the row 0x8* (map1) prefixed REX2 are illegal.
{rex2} jo .+6+0x90909090
{rex2} jno .+6+0x90909090
{rex2} jb .+6+0x90909090
{rex2} jae .+6+0x90909090
{rex2} je .+6+0x90909090
{rex2} jne .+6+0x90909090
{rex2} jbe .+6+0x90909090
{rex2} ja .+6+0x90909090
{rex2} js .+6+0x90909090
{rex2} jns .+6+0x90909090
{rex2} jp .+6+0x90909090
{rex2} jnp .+6+0x90909090
{rex2} jl .+6+0x90909090
{rex2} jge .+6+0x90909090
{rex2} jle .+6+0x90909090
{rex2} jg .+6+0x90909090
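
To make the relation between the two test groups explicit, here is a minimal sketch (a hypothetical encoder, not gas code, and without any REX2 prefix) of the two conditional-branch forms. Condition codes 0 to 15 follow the order used in the tests above (jo, jno, jb, ..., jg). The short form is opcode 0x70 + cc with an 8-bit displacement (map 0, row 0x7*); the near form is 0x0f, 0x80 + cc with a 32-bit displacement (map 1, row 0x8*).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder: emit a conditional branch for condition code CC
   with displacement DISP measured from the end of the instruction.
   Returns the number of bytes written to OUT.  */
static size_t
encode_jcc (unsigned int cc, int32_t disp, uint8_t out[6])
{
  cc &= 0xf;

  /* Short form: 0x70 + cc, rel8.  */
  if (disp >= INT8_MIN && disp <= INT8_MAX)
    {
      out[0] = (uint8_t) (0x70 | cc);
      out[1] = (uint8_t) disp;
      return 2;
    }

  /* Near form: 0x0f, 0x80 + cc, rel32 (little endian).  */
  out[0] = 0x0f;
  out[1] = (uint8_t) (0x80 | cc);
  for (int k = 0; k < 4; k++)
    out[2 + k] = (uint8_t) ((uint32_t) disp >> (8 * k));
  return 6;
}
```

A target of .+2-0x70 corresponds to a displacement of -0x70 for the 2-byte form, which fits in 8 bits, while the .+6+0x90909090 targets are far enough away that only the 6-byte form can be used.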
> >>> + /* All opcodes listed map0 0x4*, 0x7*, 0xa*, 0xe* and map1 0x3*, 0x8*
> >>> + are reserved under REX2 and triggers #UD when prefixed with
> >>> + REX2 */ if (space == 0)
> >>> + switch (opcode >> 4)
> >>
> >> Both here and ...
> >>
> >>> + {
> >>> + case 0x4:
> >>> + case 0x7:
> >>> + case 0xA:
> >>> + case 0xE:
> >>> + return true;
> >>> + default:
> >>> + return false;
> >>> + }
> >>> +
> >>> + if (space == SPACE_0F)
> >>> + switch (opcode >> 4)
> >>
> >> ... here, don't you also need to mask off further bits? There are
> >> quite a few opcodes which have a kind-of ModR/M byte encoded directly
> >> in the opcode, for example.
> >>
> >
> > Thanks for reminding. Added the code like this.
> >
> > /* Some opcodes encode a ModR/M byte directly in the opcode. */
> > unsigned long long
> > base_opcode = (length > 1) ? opcode >> (8 * length - 8) : opcode;
>
> Can length be 0? I didn't think so, and then
>
> base_opcode = opcode >> (8 * length - 8);
>
> would be all you need.
>
Yes, that's a good way to do it.
> Also in the comment, I think it would be slightly better to say "ModR/M-like
> byte".
>
Done.
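
Putting the pieces discussed above together, the check could look roughly like the standalone sketch below. It is only an approximation under stated assumptions: the function name reserved_under_rex2 and the MAP0/MAP1 constants are hypothetical stand-ins for the opcode-space values, and length is assumed to be at least 1.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the opcode-space values used in the patch.  */
enum { MAP0 = 0, MAP1 = 1 };

/* Return true if the opcode lies in a row that is reserved under REX2:
   map 0 rows 0x4*, 0x7*, 0xa*, 0xe* and map 1 rows 0x3*, 0x8*.  */
static bool
reserved_under_rex2 (unsigned int space, unsigned long long opcode,
		     unsigned int length)
{
  /* Some opcodes encode a ModR/M-like byte directly in the opcode, so
     keep only the leading byte.  LENGTH is assumed to be >= 1.  */
  unsigned long long base_opcode = opcode >> (8 * length - 8);

  if (space == MAP0)
    switch (base_opcode >> 4)
      {
      case 0x4:
      case 0x7:
      case 0xa:
      case 0xe:
	return true;
      default:
	return false;
      }

  if (space == MAP1)
    switch (base_opcode >> 4)
      {
      case 0x3:
      case 0x8:
	return true;
      default:
	return false;
      }

  return false;
}
```

A one-byte opcode 0x70 in map 0 and a one-byte opcode 0x80 in map 1 are both reported as reserved, and a two-byte value whose leading byte is 0x70 is classified by that leading byte alone.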
Thanks,
Lili.