From: "Cui, Lili" <lili.cui@intel.com>
To: "Beulich, Jan" <JBeulich@suse.com>
Cc: "H.J. Lu" <hjl.tools@gmail.com>, Binutils <binutils@sourceware.org>
Subject: RE: [PATCH 4/5] x86/APX: extend SSE2AVX coverage
Date: Wed, 3 Apr 2024 09:22:02 +0000
Message-ID: <SJ0PR11MB5600AA11B09850D13E41C0FC9E3D2@SJ0PR11MB5600.namprd11.prod.outlook.com>
In-Reply-To: <7add52dd-e2ab-4a65-8636-f5bb41d4d45c@suse.com>
> On 03.04.2024 09:59, Cui, Lili wrote:
> >>> This conversion is clever. Although the mnemonic changes, since the
> >>> conversion is controlled by -msse2avx, maybe we should mention in the
> >>> option's documentation that it may change the mnemonic. Judging from
> >>> the option name alone, it is hard for users to predict that the
> >>> mnemonic will change (traditionally, it seems to just prepend a V).
> >>
> >> I don't think a doc adjustment is needed here. We already have at
> >> least one example where the mnemonic also changes: CVTPI2PD ->
> >> VCVTDQ2PD.
> >>
> >
> > Oh, there has been such a conversion before. Another thing that comes
> > to mind is that -msse2avx was previously used only for SSE to VEX
> > conversion, so the option works on machines that don't support EVEX.
> > We now extend SSE to EVEX, which makes this option unusable on
> > machines that do not support EVEX encodings (e.g. hybrid machines
> > like Alder Lake). Do you think we should add a new option?
>
> That's a question I've tentatively answered with "No". SSE => VEX requires
> systems supporting AVX. SSE-with-eGPR requires systems with APX.
> SSE-with-eGPR => EVEX similarly can rely on APX being there, and I expect all
> such systems will support at least AVX10/128. If that is deemed a wrong
> assumption, then indeed we may need to consider adding a new option (but
> not -msse2avx512 as you suggest further down, as SSE only ever covers 128-
> bit operations; -msse2avx10 maybe).
>
Yes, I was wrong; only eGPRs trigger the SSE to EVEX conversion. Your assumption is correct.
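
To make sure I have it right, a minimal sketch of the case as I now
understand it (the exact encodings chosen are my assumption, not taken
from the patch):

  # with -msse2avx; %r16 is an APX eGPR, unreachable from VEX encodings
  movups (%r16), %xmm1  # has to become EVEX vmovups (%r16), %xmm1
  movups (%rax), %xmm1  # can stay VEX vmovups (%rax), %xmm1
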
> >>>> Should we also convert %xmm<N>-only templates (to consistently
> >>>> permit use of {evex})? Or should we reject use of {evex}, but then
> >>>> also that of {vex}/{vex3}?
> >>>
> >>> Do you mean SHA and KeyLocker?
> >>
> >> No, I mean templates with all XMM operands and no memory ones. Such
> >> don't use eGPR-s, yet could be converted to their EVEX counterparts,
> >> too (by way of the programmer adding {evex} to the _legacy_ insn).
> >> Hence the question on how to treat {evex} there, and then also {vex}
> >> / {vex3}. Take, for example, MOVHLPS or MOVLHPS.
> >
> > I'm not sure if you want to support this conversion under -msse2avx.
> > I think this conversion is only used by people writing assembly by
> > hand.
>
> Aiui -msse2avx is there mainly for hand-written assembly. Compilers will do
> better insn selection on their own anyway.
>
> > As for adding a prefix to convert SSE to VEX or EVEX, I think this
> > requirement doesn't make much sense at the moment. Maybe in the
> > future, if EVEX forms turn out faster than VEX ones, we could provide
> > an option like -msse2avx512 to achieve this conversion.
>
> That's not my point. Consider this example:
>
> .text
> sse2avx:
> movlhps %xmm0, %xmm1
> {vex} movlhps %xmm0, %xmm1
> {vex3} movlhps %xmm0, %xmm1
> {evex} movlhps %xmm0, %xmm1
>
> movlps (%rax), %xmm1
> {vex} movlps (%rax), %xmm1
> {vex3} movlps (%rax), %xmm1
> {evex} movlps (%rax), %xmm1
>
> Other than the {evex}-prefixed lines, everything assembles smoothly prior to
> the patch here. IOW even {vex3} has an effect on the non-VEX mnemonic.
> With my patch as it is now, the 2nd {evex}-prefixed line assembles fine, while
> the 1st doesn't. This is simply inconsistent. Hence why I see two
> options: Disallow all three pseudo-prefixes on legacy mnemonics, or permit
> {evex} consistently, too.
>
Oh, I see; thank you for the detailed explanation. Permitting {evex} consistently would increase the robustness of binutils, and since we already support some of these pseudo-prefixes on legacy mnemonics, it would be nice to support {evex} there too if it doesn't require too much effort.
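
If we take the consistent-{evex} route, then (as I understand it,
untested) both of your example blocks would assemble uniformly, e.g.:

  {evex} movlhps %xmm0, %xmm1  # would yield EVEX vmovlhps, matching the memory form
  {evex} movlps (%rax), %xmm1  # already accepted with your patch

while the other route would reject {vex}/{vex3}/{evex} on all legacy
mnemonics alike.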
Thanks,
Lili.
Thread overview: 21+ messages
2024-03-22 9:25 [PATCH 0/5] x86/APX: respect -msse2avx Jan Beulich
2024-03-22 9:27 ` [PATCH 1/5] x86/SSE2AVX: respect prefixes Jan Beulich
2024-03-27 8:47 ` Cui, Lili
2024-03-27 11:31 ` Jan Beulich
2024-03-22 9:27 ` [PATCH 2/5] x86/SSE2AVX: move checking Jan Beulich
2024-03-27 9:38 ` Cui, Lili
2024-03-22 9:27 ` [PATCH 3/5] x86: zap value-less Disp8MemShift from non-EVEX templates Jan Beulich
2024-03-22 9:28 ` [PATCH 4/5] x86/APX: extend SSE2AVX coverage Jan Beulich
2024-03-29 9:10 ` Cui, Lili
2024-04-02 8:48 ` Jan Beulich
2024-04-03 7:59 ` Cui, Lili
2024-04-03 8:19 ` Jan Beulich
2024-04-03 9:17 ` Jiang, Haochen
2024-04-03 9:29 ` Cui, Lili
2024-04-03 10:22 ` Jan Beulich
2024-04-03 9:22 ` Cui, Lili [this message]
2024-04-05 7:09 ` Jan Beulich
2024-04-07 1:48 ` Cui, Lili
2024-04-08 6:25 ` Jan Beulich
2024-04-08 7:38 ` Cui, Lili
2024-03-22 9:29 ` [PATCH 5/5] x86: tidy <sse*> templates Jan Beulich