public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Jan Beulich <jbeulich@suse.com>
To: "Cui, Lili" <lili.cui@intel.com>
Cc: hjl.tools@gmail.com, binutils@sourceware.org
Subject: Re: [PATCH] x86: Optimize the encoder of the vvvv register
Date: Fri, 19 Apr 2024 12:05:11 +0200	[thread overview]
Message-ID: <013ad260-9b28-496e-bbaf-c5a066774e99@suse.com> (raw)
In-Reply-To: <20240419073657.2418102-1-lili.cui@intel.com>

On 19.04.2024 09:36, Cui, Lili wrote:
> This patch wants to optimize the encoder of the vvvv register.
> Previously we used Vexvvvv, SWAP_SOURCES and extension_opcode
> to help encode the vvvv register, this patch simplified the
> logic to only use vexvvvv and added appropriate Vexvvvv values
> for the related instructions.

This looks to be a good move, yet you're not fully leveraging the potential:
Afaict all uses of SwapSources now go away. Hence SwapSources itself should
go away too, together with SWAP_SOURCES.

Beyond that largely (but not only) cosmetic comments:

> @@ -10426,40 +10426,35 @@ build_modrm_byte (void)
>  				     || i.encoding == encoding_evex));
>      }
>  
> -  if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
> +  switch (i.tm.opcode_modifier.vexvvvv)
>      {
> -      v = dest;
> -      dest-- ;
> -    }
> -  else
> -    {
> -      for (v = source + 1; v < dest; ++v)
> -	if (v != reg_slot)
> +    case VexVVVV_SRC2:
> +      if (source != op)
> +	{
> +	  v = source++;
>  	  break;
> -      if (v >= dest)
> -	v = ~0;
> -    }
> -  if (i.tm.extension_opcode != None)
> -    {
> -      if (dest != source)
> -	v = dest;
> -      dest = ~0;
> +	}
> +      /* For XOP: vpshl* and vpsha*.  */
> +      else
> +      /* Fall through.  */
> +    case VexVVVV_SRC1:
> +      v =  dest - 1;

Indentation here wants to fit the "else", not the "case ...". Even better
would be to avoid the "else", seeing that there already is a "break" inside
the if()'s body.

> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -640,10 +640,13 @@ enum
>    Vex,
>    /* How to encode VEX.vvvv:
>       0: VEX.vvvv must be 1111b.
> -     1: VEX.vvvv encodes one of the src register operands.
> -     2: VEX.vvvv encodes the dest register operand.
> +     1: VEX.vvvv encodes the src1 register operand.
> +     2: VEX.vvvv encodes the src2 register operand.
> +     3: VEX.vvvv encodes the dest register operand.
>     */
> -#define VexVVVV_DST   2
> +#define VexVVVV_SRC1   1
> +#define VexVVVV_SRC2   2
> +#define VexVVVV_DST    3
>    VexVVVV,

While I'm not overly fussed on the names used here, ...

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -141,7 +141,9 @@
>  
>  #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
>  
> -#define DstVVVV VexVVVV=VexVVVV_DST
> +#define VexVVVV_Src1 VexVVVV=VexVVVV_SRC1
> +#define VexVVVV_Src2 VexVVVV=VexVVVV_SRC2
> +#define VexVVVV_Dst  VexVVVV=VexVVVV_DST

... I am here, due to the line length issues we already have. Please can
you keep DstVVVV as a name (thus reducing the churn on the table below)
and add Src1VVVV and Src2VVVV, all being 4 characters shorter than what
you presently have?

> @@ -1000,13 +1002,13 @@ pause, 0xf390, i186, NoSuf, {}
>  
>  // MMX/SSE2 instructions.
>  
> -<mmx:cpu:pfx:attr:reg:mem, +
> -    $avx:AVX:66:Vex128|VexVVVV|VexW0|SSE2AVX:RegXMM:Xmmword, +
> -    $sse:SSE2:66::RegXMM:Xmmword, +
> -    $mmx:MMX:::RegMMX:Qword>
> +<mmx:cpu:pfx:attr:scal:reg:mem, +
> +    $avx:AVX:66:Vex128|VexVVVV_Src1|VexW0|SSE2AVX:Vex128|VexVVVV_Dst|VexW0|SSE2AVX:RegXMM:Xmmword, +
> +    $sse:SSE2:66:::RegXMM:Xmmword, +
> +    $mmx:MMX::::RegMMX:Qword>
>  
>  <sse2:cpu:attr:scal:vvvv, +
> -    $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, +
> +    $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV_Src1, +
>      $sse:SSE2:::>
>  
>  <bw:opc:vexw:elem:kcpu:kpfx:cpubmi, +
> @@ -1058,7 +1060,7 @@ pmulhw<mmx>, 0x<mmx:pfx>0fe5, <mmx:cpu>, Modrm|<mmx:attr>|C|NoSuf, { <mmx:reg>|<
>  pmullw<mmx>, 0x<mmx:pfx>0fd5, <mmx:cpu>, Modrm|<mmx:attr>|C|NoSuf, { <mmx:reg>|<mmx:mem>|Unspecified|BaseIndex, <mmx:reg> }
>  por<mmx>, 0x<mmx:pfx>0feb, <mmx:cpu>, Modrm|<mmx:attr>|C|NoSuf, { <mmx:reg>|<mmx:mem>|Unspecified|BaseIndex, <mmx:reg> }
>  psllw<mmx>, 0x<mmx:pfx>0ff1, <mmx:cpu>, Modrm|<mmx:attr>|NoSuf, { <mmx:reg>|<mmx:mem>|Unspecified|BaseIndex, <mmx:reg> }
> -psllw<mmx>, 0x<mmx:pfx>0f71/6, <mmx:cpu>, Modrm|<mmx:attr>|NoSuf, { Imm8, <mmx:reg> }
> +psllw<mmx>, 0x<mmx:pfx>0f71/6, <mmx:cpu>, Modrm|<mmx:scal>|NoSuf, { Imm8, <mmx:reg> }

This is not a scalar instruction, hence "scal" as a template parameter name
is misleading. It's not really clear to me anyway why this needs fiddling
with - there was no SwapSources here, and none of its siblings are being
touched either.

To help recognizing such anomalies (possible problems), could I talk you
into splitting the patch in two pieces? First a purely mechanical one
introducing (perhaps simply as an alias of VexVVVV) / using Src1VVVV
wherever it is meant to be used. Then the remaining changes, with a much
smaller diff on the actual opcode templates, in the 2nd one.

Jan

  reply	other threads:[~2024-04-19 10:05 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-19  7:36 Cui, Lili
2024-04-19 10:05 ` Jan Beulich [this message]
2024-04-22  8:42   ` Cui, Lili
2024-04-22  8:48     ` Jan Beulich
2024-04-22 10:07       ` Cui, Lili

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=013ad260-9b28-496e-bbaf-c5a066774e99@suse.com \
    --to=jbeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hjl.tools@gmail.com \
    --cc=lili.cui@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).