public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "hjl.tools@gmail.com" <hjl.tools@gmail.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [PATCH 2/3] x86: Drop SwapSources
Date: Mon, 29 Apr 2024 12:23:22 +0000	[thread overview]
Message-ID: <SJ0PR11MB5600990BB714D915F249D4B79E1B2@SJ0PR11MB5600.namprd11.prod.outlook.com> (raw)
In-Reply-To: <11644d0a-1b51-4c31-82ac-92bb39fd59df@suse.com>

> On 28.04.2024 06:47, Cui, Lili wrote:
> >> On 26.04.2024 10:14, Cui, Lili wrote:
> >>>> On 24.04.2024 09:23, Cui, Lili wrote:
> >>>>> --- a/gas/config/tc-i386.c
> >>>>> +++ b/gas/config/tc-i386.c
> >>>>> @@ -10434,6 +10434,14 @@ build_modrm_byte (void)
> >>>>>
> >>>>>    switch (i.tm.opcode_modifier.vexvvvv)
> >>>>>      {
> >>>>> +    case VexVVVV_SRC2:
> >>>>> +      if (source != op)
> >>>>> +	{
> >>>>> +	  v = source++;
> >>>>> +	  break;
> >>>>> +	}
> >>>>> +      /* For XOP: vpshl* and vpsha*.  */
> >>>>> +      /* Fall through.  */
> >>>>>      case VexVVVV_SRC1:
> >>>>
> >>>> This falling-through is odd and hence needs a better comment (then
> >>>> also covering vprot*, which afaict is similarly affected). The
> >>>> reason for this is the XOP.W-controlled operand swapping, if I'm
> >>>> not mistaken? In which case perhaps instead of the fall-through
> >>>> here the logic swapping the operands should replace VexVVVV_SRC2 by
> >> VexVVVV_SRC1?
> >>>>
> >>>
> >>> Yes, vprot* should be included, and it is related to
> >>> XOP.W-controlled
> >> operand swapping, the comments says " /* Only the first two register
> >> operands need reversing, alongside flipping VEX.W.  */ ",  But there
> >> is actually a memory operand, not two register operands.
> >>>
> >>> I think VexVVVV_SRC2 makes more sense here, it matches the actual
> >> situation, we want to use vvvv to encode the first operand.
> >>>
> >>> Opcode table:
> >>> vprot<xop>, 0x90 | <xop:opc>, XOP,
> >>> D|Modrm|Vex128|SpaceXOP09|VexVVVV_Src2|VexW0|NoSuf,
> { RegXMM,
> >>> RegXMM|Unspecified|BaseIndex, RegXMM }
> >>>
> >>> testcase:
> >>> vprotb (%rax),%xmm12,%xmm15
> >>> vprotb %xmm15,(%r12),%xmm0
> >>
> >> VexVVVV_Src2 is appropriate for the latter, yes, but not for the
> >> former. That uses VexVVVV_Src1 layout. Hence my suggestion to replace
> >> the attribute when swapping operands.
> >>
> >
> > If replace the Src2VVVV| VexW0 with Src1VVVV| VexW1 and swapping
> operands. We can put VexVVVV_SRC1 before VexVVVV_SRC2, but we still
> need to add "(!is_cpu (&i.tm, CpuXOP) || source == op" under
> VexVVVV_SRC1 , and match_template also needs to be adjusted (I made a
> simple modification and it still failed, I think continuing like this may go
> against the original intention).
> >
> >   switch (i.tm.opcode_modifier.vexvvvv)
> >     {
> >     /* VEX.vvvv encodes the first source register operand.  */
> >     case VexVVVV_SRC1:
> >       if (!is_cpu (&i.tm, CpuXOP) || source == op)
> >         {
> >           v =  dest - 1;
> >           break;
> >         }
> >     /* For XOP: vpshl*, vpsha* and vprot*.  */
> >     /* Fall through.  */
> >     /* VEX.vvvv encodes the last source register operand.  */
> >     case VexVVVV_SRC2:
> >       v = source++;
> >       break;
> >     /* VEX.vvvv encodes the destination register operand.  */
> >     case VexVVVV_DST:
> >       v = dest--;
> >       break;
> >     default:
> >       v = ~0;
> >       break;
> >      }
> >
> > Do you think we should add a separate patch 4 for XOP that removes the
> special handling in match_template and completes its template? so we don't
> have to add special handling for src1vvvv or src2vvvv. This might go against
> your desire to reduce template size, but it would help simplify the logic. I'd
> like to know your thoughts.
> 
> Indeed. You'd effectively revert earlier folding that I did. And the adjustment I
> suggested earlier ought to be small/simple enough.
> 

So, I continued working on the previous suggestion. With the following modification and it worked.

@@ -8932,7 +8932,7 @@ match_template (char mnem_suffix)
                          || is_cpu (t, CpuAPX_F));
              if (!operand_type_match (overlap0, i.types[0])
                  || !operand_type_match (overlap1, i.types[j])
-                 || (t->operands == 3
+                 || (t->operands == 3 && !is_cpu (t, CpuXOP)
                      && !operand_type_match (overlap2, i.types[1]))
                  || (check_register
                      && !operand_type_register_match (i.types[0],

But I found that there are 4 test files that failed, I didn't find the doc on how to encode vprotb but I guess that is because I changed the default template from Src2VVVV| VexW0 to Src1VVVV| VexW1, then all the related test cases needed to be modified. Do you have any comments here? 

regexp "^[      ]*[a-f0-9]+:    8f e9 40 90 d8[         ]+vprotb %xmm7,%xmm0,%xmm3$"
line   "    11ed:       8f e9 f8 90 df          vprotb %xmm7,%xmm0,%xmm3"

Thanks,
Lili.

  reply	other threads:[~2024-04-29 12:23 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-24  7:23 [PATCH 0/3] x86: Optimize the encoder of the vvvv register Cui, Lili
2024-04-24  7:23 ` [PATCH 1/3] x86: Use vexvvvv to encode " Cui, Lili
2024-04-24  7:52   ` Jan Beulich
2024-04-25 13:14     ` Cui, Lili
2024-04-25 13:22       ` Jan Beulich
2024-04-26  5:33         ` Cui, Lili
2024-04-26  6:52           ` Jan Beulich
2024-04-24  7:23 ` [PATCH 2/3] x86: Drop SwapSources Cui, Lili
2024-04-24  8:07   ` Jan Beulich
2024-04-26  8:14     ` Cui, Lili
2024-04-26 10:37       ` Jan Beulich
2024-04-28  4:47         ` Cui, Lili
2024-04-29  6:40           ` Jan Beulich
2024-04-29 12:23             ` Cui, Lili [this message]
2024-04-29 13:08               ` Jan Beulich
2024-04-29 13:41                 ` Cui, Lili
2024-04-29 13:49                   ` Jan Beulich
2024-04-30  2:56                     ` Cui, Lili
2024-04-30  6:18                       ` Jan Beulich
2024-04-30  7:34                         ` Cui, Lili
2024-04-30  9:22                           ` Jan Beulich
2024-04-24  7:23 ` [PATCH 3/3] x86: Drop using extension_opcode to encode vvvv register Cui, Lili
2024-04-24  8:19   ` Jan Beulich
2024-04-24  8:27     ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=SJ0PR11MB5600990BB714D915F249D4B79E1B2@SJ0PR11MB5600.namprd11.prod.outlook.com \
    --to=lili.cui@intel.com \
    --cc=JBeulich@suse.com \
    --cc=binutils@sourceware.org \
    --cc=hjl.tools@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).