From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: hjl.tools@gmail.com, binutils@sourceware.org
Subject: Re: [PATCH] x86: Optimize the encoder of the vvvv register
Date: Fri, 19 Apr 2024 12:05:11 +0200 [thread overview]
Message-ID: <013ad260-9b28-496e-bbaf-c5a066774e99@suse.com> (raw)
In-Reply-To: <20240419073657.2418102-1-lili.cui@intel.com>
On 19.04.2024 09:36, Cui, Lili wrote:
> This patch wants to optimize the encoder of the vvvv register.
> Previously we used Vexvvvv, SWAP_SOURCES and extension_opcode
> to help encode the vvvv register, this patch simplified the
> logic to only use vexvvvv and added appropriate Vexvvvv values
> for the related instructions.
This looks to be a good move, yet you're not fully leveraging the potential:
Afaict all uses of SwapSources now go away. Hence SwapSources itself should
go away too, together with SWAP_SOURCES.
Beyond that largely (but not only) cosmetic comments:
> @@ -10426,40 +10426,35 @@ build_modrm_byte (void)
> || i.encoding == encoding_evex));
> }
>
> - if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
> + switch (i.tm.opcode_modifier.vexvvvv)
> {
> - v = dest;
> - dest-- ;
> - }
> - else
> - {
> - for (v = source + 1; v < dest; ++v)
> - if (v != reg_slot)
> + case VexVVVV_SRC2:
> + if (source != op)
> + {
> + v = source++;
> break;
> - if (v >= dest)
> - v = ~0;
> - }
> - if (i.tm.extension_opcode != None)
> - {
> - if (dest != source)
> - v = dest;
> - dest = ~0;
> + }
> + /* For XOP: vpshl* and vpsha*. */
> + else
> + /* Fall through. */
> + case VexVVVV_SRC1:
> + v = dest - 1;
Indentation here wants to fit the "else", not the "case ...". Even better
would be to avoid the "else", seeing that there already is a "break" inside
the if()'s body.
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -640,10 +640,13 @@ enum
> Vex,
> /* How to encode VEX.vvvv:
> 0: VEX.vvvv must be 1111b.
> - 1: VEX.vvvv encodes one of the src register operands.
> - 2: VEX.vvvv encodes the dest register operand.
> + 1: VEX.vvvv encodes the src1 register operand.
> + 2: VEX.vvvv encodes the src2 register operand.
> + 3: VEX.vvvv encodes the dest register operand.
> */
> -#define VexVVVV_DST 2
> +#define VexVVVV_SRC1 1
> +#define VexVVVV_SRC2 2
> +#define VexVVVV_DST 3
> VexVVVV,
While I'm not overly fussed on the names used here, ...
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -141,7 +141,9 @@
>
> #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
>
> -#define DstVVVV VexVVVV=VexVVVV_DST
> +#define VexVVVV_Src1 VexVVVV=VexVVVV_SRC1
> +#define VexVVVV_Src2 VexVVVV=VexVVVV_SRC2
> +#define VexVVVV_Dst VexVVVV=VexVVVV_DST
... I am here, due to the line length issues we already have. Please can
you keep DstVVVV as a name (thus reducing the churn on the table below)
and add Src1VVVV and Src2VVVV, all being 4 characters shorter than what
you presently have?
> @@ -1000,13 +1002,13 @@ pause, 0xf390, i186, NoSuf, {}
>
> // MMX/SSE2 instructions.
>
> -<mmx:cpu:pfx:attr:reg:mem, +
> - $avx:AVX:66:Vex128|VexVVVV|VexW0|SSE2AVX:RegXMM:Xmmword, +
> - $sse:SSE2:66::RegXMM:Xmmword, +
> - $mmx:MMX:::RegMMX:Qword>
> +<mmx:cpu:pfx:attr:scal:reg:mem, +
> + $avx:AVX:66:Vex128|VexVVVV_Src1|VexW0|SSE2AVX:Vex128|VexVVVV_Dst|VexW0|SSE2AVX:RegXMM:Xmmword, +
> + $sse:SSE2:66:::RegXMM:Xmmword, +
> + $mmx:MMX::::RegMMX:Qword>
>
> <sse2:cpu:attr:scal:vvvv, +
> - $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, +
> + $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV_Src1, +
> $sse:SSE2:::>
>
> <bw:opc:vexw:elem:kcpu:kpfx:cpubmi, +
> @@ -1058,7 +1060,7 @@ pmulhw<mmx>, 0x<mmx:pfx>0fe5, <mmx:cpu>, Modrm|<mmx:attr>|C|NoSuf, { <mmx:reg>|<
> pmullw<mmx>, 0x<mmx:pfx>0fd5, <mmx:cpu>, Modrm|<mmx:attr>|C|NoSuf, { <mmx:reg>|<mmx:mem>|Unspecified|BaseIndex, <mmx:reg> }
> por<mmx>, 0x<mmx:pfx>0feb, <mmx:cpu>, Modrm|<mmx:attr>|C|NoSuf, { <mmx:reg>|<mmx:mem>|Unspecified|BaseIndex, <mmx:reg> }
> psllw<mmx>, 0x<mmx:pfx>0ff1, <mmx:cpu>, Modrm|<mmx:attr>|NoSuf, { <mmx:reg>|<mmx:mem>|Unspecified|BaseIndex, <mmx:reg> }
> -psllw<mmx>, 0x<mmx:pfx>0f71/6, <mmx:cpu>, Modrm|<mmx:attr>|NoSuf, { Imm8, <mmx:reg> }
> +psllw<mmx>, 0x<mmx:pfx>0f71/6, <mmx:cpu>, Modrm|<mmx:scal>|NoSuf, { Imm8, <mmx:reg> }
This is not a scalar instruction, hence "scal" as a template parameter name
is misleading. It's not really clear to me anyway why this needs fiddling
with - there was no SwapSources here, and none of its siblings are being
touched either.
To help recognizing such anomalies (possible problems), could I talk you
into splitting the patch in two pieces? First a purely mechanical one
introducing (perhaps simply as an alias of VexVVVV) / using Src1VVVV
wherever it is meant to be used. Then the remaining changes, with a much
smaller diff on the actual opcode templates, in the 2nd one.
Jan
next prev parent reply other threads:[~2024-04-19 10:05 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-19 7:36 Cui, Lili
2024-04-19 10:05 ` Jan Beulich [this message]
2024-04-22 8:42 ` Cui, Lili
2024-04-22 8:48 ` Jan Beulich
2024-04-22 10:07 ` Cui, Lili
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=013ad260-9b28-496e-bbaf-c5a066774e99@suse.com \
--to=jbeulich@suse.com \
--cc=binutils@sourceware.org \
--cc=hjl.tools@gmail.com \
--cc=lili.cui@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).