Re: [PATCH 3/5] x86: support AVX10.1/512

From: Jan Beulich <jbeulich@suse.com>
To: "Jiang, Haochen" <haochen.jiang@intel.com>
Cc: Binutils <binutils@sourceware.org>, "H.J. Lu" <hjl.tools@gmail.com>
Subject: Re: [PATCH 3/5] x86: support AVX10.1/512
Date: Tue, 5 Sep 2023 09:25:47 +0200
Message-ID: <15eaf902-08c3-11b2-2e0b-f4ff0adce83f@suse.com>
In-Reply-To: <SA1PR11MB59466BAF2ACD296B95C9225DECE8A@SA1PR11MB5946.namprd11.prod.outlook.com>

On 05.09.2023 09:04, Jiang, Haochen wrote:
>>>> Actually there's something similar with AVX10 itself: AVX512F includes
>>>> equivalents right away of what comes under separate extensions for AVX:
>>>> F16C and FMA. AVX10, otoh, is presently specified to only guarantee
>>>> AVX and AVX2. Does that mean that VEX-encoded vfm{add,sub}* and ps<-ph
>>>> conversion insns aren't guaranteed to also be available? Doesn't seem
>>>> logical to me, so I'm inclined to make FMA and F16C prereqs of AVX10.1
>>>> as well (or alternatively of AVX512F, but I think this would have
>>>> undesirable effects). AVX2 isn't an explicit prereq only because it
>>>> already is one of AVX512F.
>>>
>>> I suppose AVX10 should only enable EVEX encodings; it has nothing
>>> to do with the VEX encodings.
>>>
>>> For those independent VEX ISAs, if AVX512F doesn't enable them, neither does AVX10.
>>>
>>> Actually, it's not only F16C and FMA: under AVX10, ISAs like AVX-VNNI and
>>> AVX-IFMA are also not enabled.
>>
>> The difference to the AVX-* ones you mention is important here: AVX-VNNI
>> (taking that as example) isn't a feature that had equivalent EVEX
>> encodings added right in AVX512F. So I'd like to ask that you re-consider
> 
> I see your point since here we are just focusing on features introduced in
> AVX512F. But I still would like to mention AVX-VNNI below just for discussion.
> 
>> what you said. Also think about what the compiler does (which doesn't
>> emit .arch directives to limit the usable ISA extensions) when just
>> -mavx512vl is passed to it: VEX-encoded vfm{add,sub}* would then still
>> result (to prevent that, the compiler would need to further emit {evex}
>> pseudo-prefixes). IOW in the compiler there is such an implication already
>> anyway.
> 
> For FMA, GCC has the following comment:
> 
> ;; The standard names for scalar FMA are only available with SSE math enabled.
> ;; CPUID bit AVX512F enables evex encoded scalar and 512-bit fma.  It doesn't
> ;; care about FMA bit, so we enable fma for TARGET_AVX512F even when TARGET_FMA
> ;; and TARGET_FMA4 are both false.
> ;; TODO: In theory AVX512F does not automatically imply FMA, and without FMA
> ;; one must force the EVEX encoding of the fma insns.  Ideally we'd improve
> ;; GAS to allow proper prefix selection.  However, for the moment all hardware
> ;; that supports AVX512F also supports FMA so we can ignore this for now.

Interesting. I wonder what gas improvement is being thought about here, when
gcc doesn't emit .arch.

> Although the pattern is split between FMA/FMA4 and AVX512F, the code itself won't
> actually emit an {evex} prefix with the mnemonic when only AVX512F is enabled, since
> there is no real hardware that would require codegen to do so.
> 
> For F16C, the pattern isn't even split, so the situation is the same as for FMA/FMA4.
> 
> Therefore, I suppose it could be ok for AVX10 to imply FMA/F16C in gas for simplicity. But
> let's wait for H.J.'s opinion on that.

Okay, I'll submit v2 then with this just as a remark for the time being.
Luckily in the follow-on work where I ran into this I now no longer depend
on there being such an explicit connection. (Whether what I'm doing there
is acceptable will need to be seen.)
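
As a rough illustration of the implication discussed here, the prerequisite
relationship can be sketched as a small closure over feature bits. This is
not the actual gas dependency table; the flag names, the prereqs[] array and
enable_with_prereqs() are hypothetical, and the AVX10.1 entry encodes the
proposal (FMA and F16C as extra prereqs), not current behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
enum {
    F_AVX     = 1u << 0,
    F_AVX2    = 1u << 1,
    F_FMA     = 1u << 2,
    F_F16C    = 1u << 3,
    F_AVX512F = 1u << 4,
    F_AVX10_1 = 1u << 5
};

struct prereq {
    uint32_t feature;   /* when this feature is enabled ...   */
    uint32_t needs;     /* ... these get enabled along with it */
};

/* Assumed dependency table, not copied from binutils. */
static const struct prereq prereqs[] = {
    { F_AVX2,    F_AVX },
    { F_FMA,     F_AVX },
    { F_F16C,    F_AVX },
    { F_AVX512F, F_AVX2 },
    /* The proposal: AVX2 comes in via AVX512F, FMA and F16C are explicit. */
    { F_AVX10_1, F_AVX512F | F_FMA | F_F16C },
};

/* Repeatedly add prerequisites until nothing changes (fixed point). */
static uint32_t enable_with_prereqs(uint32_t enabled)
{
    int changed;

    do {
        changed = 0;
        for (size_t i = 0; i < sizeof(prereqs) / sizeof(prereqs[0]); i++) {
            if ((enabled & prereqs[i].feature)
                && (enabled & prereqs[i].needs) != prereqs[i].needs) {
                enabled |= prereqs[i].needs;
                changed = 1;
            }
        }
    } while (changed);

    return enabled;
}
```

With this table, enabling only F_AVX10_1 pulls in all six bits, whereas
enabling only F_AVX512F yields AVX512F, AVX2 and AVX but neither FMA nor
F16C, which is the asymmetry the question above is about.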

> As for AVX-VNNI, it was introduced in Sapphire Rapids, i.e. before AVX10.1 was introduced
> (Granite Rapids), which means that hardware with AVX10.1 will always have AVX-VNNI as
> well. So there might be a chance to have AVX10.1 imply AVX-VNNI in the compiler,
> but we could defer that discussion until everything in AVX10.1 is settled in the community.

Hmm, yes. An implication from making it another prereq is that with AVX10.1
explicitly enabled, VEX encodings then ought to be preferred over the EVEX
ones (for being shorter), except when Disp8-scaling helps shorten a memory
reference. That'll for sure require extra code in tc-i386.c, so would likely
want to be a separate patch then. (Actually I think we should already do so
anyway when AVX-VNNI is explicitly enabled.)
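
To make the size trade-off concrete, here is a minimal sketch of the length
comparison, not code from tc-i386.c. It assumes a 2- or 3-byte VEX prefix
versus a 4-byte EVEX prefix, identical opcode/ModR/M/SIB bytes in both forms,
a base register that allows a zero displacement to be omitted, and no
RIP-relative addressing. disp_bytes() and evex_is_shorter() are hypothetical
names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes needed for the displacement of a memory operand.
   scale is 1 for VEX encodings and N (the Disp8*N factor) for EVEX:
   an 8-bit displacement is usable only if disp is a multiple of scale
   and disp / scale fits in a signed byte. */
static unsigned int disp_bytes(int32_t disp, unsigned int scale)
{
    assert(scale != 0);

    if (disp == 0)
        return 0;
    if (disp % (int32_t)scale == 0) {
        int32_t q = disp / (int32_t)scale;

        if (q >= -128 && q <= 127)
            return 1;
    }
    return 4;
}

/* True if the EVEX form is strictly shorter than the VEX form.
   vex_prefix_len is 2 or 3; the EVEX prefix is always 4 bytes.
   Only prefix and displacement differ under the stated assumptions. */
static bool evex_is_shorter(int32_t disp, unsigned int vex_prefix_len,
                            unsigned int evex_n)
{
    unsigned int vex_len = vex_prefix_len + disp_bytes(disp, 1);
    unsigned int evex_len = 4 + disp_bytes(disp, evex_n);

    return evex_len < vex_len;
}
```

For a 256-bit full-vector operand (N = 32) at displacement 0x100, VEX needs
a 32-bit displacement while EVEX gets by with a scaled 8-bit one, so EVEX
wins even against a 2-byte VEX prefix; at displacement 0x40 both fit in 8
bits and VEX is shorter.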

I'd then further raise the same question towards AVX-IFMA.

Jan
