public inbox for binutils@sourceware.org
From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "H.J. Lu" <hjl.tools@gmail.com>, Binutils <binutils@sourceware.org>
Subject: RE: [PATCH 4/5] x86/APX: extend SSE2AVX coverage
Date: Wed, 3 Apr 2024 09:22:02 +0000
Message-ID: <SJ0PR11MB5600AA11B09850D13E41C0FC9E3D2@SJ0PR11MB5600.namprd11.prod.outlook.com>
In-Reply-To: <7add52dd-e2ab-4a65-8636-f5bb41d4d45c@suse.com>

> On 03.04.2024 09:59, Cui, Lili wrote:
> >>> This conversion is clever, although the mnemonic has changed, but
> >>> considering it is controlled by -msse2avx, maybe we can mention in
> >>> the option that it might change the mnemonic. Judging from the option
> >>> name alone, it is difficult for users to predict that the mnemonic
> >>> will change (traditionally, it seems to just add V).
> >>
> >> I don't think doc adjustment is needed here. We already have at least
> >> one example where the mnemonic also changes: CVTPI2PD -> VCVTDQ2PD.
> >>
> >
> > Oh, there has been such a conversion before. Another thing that comes to
> > mind is that sse2avx was previously used to support sse to vex conversion.
> > This option works on machines that don't support evex. We now extend sse
> > to evex, which makes this option unavailable on machines that do not
> > support the evex instruction (e.g. hybrid machines like Alderlake). Do you
> > think we should add a new option?
> 
> That's a question I've tentatively answered with "No". SSE => VEX requires
> systems supporting AVX. SSE-with-eGPR requires systems with APX.
> SSE-with-eGPR => EVEX similarly can rely on APX being there, and I expect all
> such systems will support at least AVX10/128. If that is deemed a wrong
> assumption, then indeed we may need to consider adding a new option (but
> not -msse2avx512 as you suggest further down, as SSE only ever covers
> 128-bit operations; -msse2avx10 maybe).
> 

Yes, I was wrong: only eGPRs trigger the SSE-to-EVEX conversion. Your assumption is correct.

> >>>> Should we also convert %xmm<N>-only templates (to consistently
> >>>> permit use of {evex})? Or should we reject use of {evex}, but then
> >>>> also that of {vex}/{vex3}?
> >>>
> >>> Do you mean SHA and KeyLocker?
> >>
> >> No, I mean templates with all XMM operands and no memory ones. Such
> >> don't use eGPR-s, yet could be converted to their EVEX counterparts,
> >> too (by way of the programmer adding {evex} to the _legacy_ insn).
> >> Hence the question on how to treat {evex} there, and then also {vex}
> >> / {vex3}. Take, for example, MOVHLPS or MOVLHPS.
> >
> > I'm not sure if you want to support this conversion under -msse2avx. I think
> > this conversion is only used by people writing assembler by hand.
> 
> Aiui -msse2avx is there mainly for hand-written assembly. Compilers will do
> better insn selection on their own anyway.
> 
> > As for adding a prefix to convert sse to vex or evex, I think this requirement
> > doesn't make much sense at the moment, maybe in the future if evex is faster
> > than the vex instruction we can provide an option like sse2avx512 to achieve
> > this conversion.
> 
> That's not my point. Consider this example:
> 
> 	.text
> sse2avx:
> 		movlhps	%xmm0, %xmm1
> 	{vex}	movlhps	%xmm0, %xmm1
> 	{vex3}	movlhps	%xmm0, %xmm1
> 	{evex}	movlhps	%xmm0, %xmm1
> 
> 		movlps	(%rax), %xmm1
> 	{vex}	movlps	(%rax), %xmm1
> 	{vex3}	movlps	(%rax), %xmm1
> 	{evex}	movlps	(%rax), %xmm1
> 
> Other than the {evex}-prefixed lines, everything assembles smoothly prior to
> the patch here. IOW even {vex3} has an effect on the non-VEX mnemonic.
> With my patch as it is now, the 2nd {evex}-prefixed line assembles fine, while
> the 1st doesn't. This is simply inconsistent. Hence why I see two
> options: Disallow all three pseudo-prefixes on legacy mnemonics, or permit
> {evex} consistently, too.
> 

Oh, I see now; thank you for the detailed explanation. Permitting {evex} consistently would make binutils more robust, and since some of these forms are already supported, I think it would be nice to support the rest if it doesn't require too much effort.

Thanks,
Lili.
