From: Matthias Kretz <m.kretz@gsi.de>
To: Matthias Kretz <m.kretz@gsi.de>,
Srinivas Yadav <vasusrinivas.vasu14@gmail.com>,
<libstdc++@gcc.gnu.org>, <gcc-patches@gcc.gnu.org>,
<richard.sandiford@arm.com>
Subject: Re: [PATCH] libstdc++: add ARM SVE support to std::experimental::simd
Date: Thu, 28 Mar 2024 15:48:21 +0100
Message-ID: <25137674.EfDdHjke4D@excalibur>
In-Reply-To: <mpt4jcr4x37.fsf@arm.com>
On Wednesday, 27 March 2024 14:34:52 CET Richard Sandiford wrote:
> Matthias Kretz <m.kretz@gsi.de> writes:
> > The big issue here is that, IIUC, a user (and the simd library) cannot do
> > the right thing at the moment. There simply isn't enough context
> > information available when parsing the <experimental/simd> header. I.e.
> > on definition of the class template there's no facility to take
> > target_clones or SME "streaming" mode into account. Consequently, if we
> > want the library to be fit for SME, then we need more language
> > extension(s) to make it work.
>
> Yeah. I think the same applies to plain SVE.
With "plain SVE" you mean the *scalable* part of it, right? BTW, I've
experimented with implementing simd<T> basically as
  template <typename T, int N>
    class simd
    {
      alignas(bit_ceil(sizeof(T) * N)) T data[N];
    };
See here: https://compiler-explorer.com/z/WW6KqanTW
Maybe the compiler can get better at optimizing this approach. But for now
it's not a solution for a *scalable* variant, because all code is going to
be load/store bound from the get-go.
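A minimal sketch of the array-based approach described above, for readers who want something compilable. The names `sketch::array_simd` and `sketch::bit_ceil` are hypothetical and are not taken from the patch or from the Compiler Explorer link; the local `bit_ceil` only stands in for `std::bit_ceil` so that the sketch also builds as C++17.

```cpp
#include <cstddef>

namespace sketch
{
  // Local stand-in for std::bit_ceil: smallest power of two >= x.
  constexpr std::size_t bit_ceil(std::size_t x)
  {
    std::size_t r = 1;
    while (r < x)
      r <<= 1;
    return r;
  }

  // Hypothetical array-backed simd type: the data lives in memory,
  // over-aligned to the next power of two of its size in bytes.
  template <typename T, int N>
    class array_simd
    {
      alignas(bit_ceil(sizeof(T) * N)) T data[N];

    public:
      static constexpr int size() { return N; }

      array_simd() : data{} {}

      // Broadcast constructor: every element gets the same value.
      explicit array_simd(T x)
      {
        for (int i = 0; i < N; ++i)
          data[i] = x;
      }

      T operator[](int i) const { return data[i]; }

      // Element-wise addition; the optimizer is expected to vectorize
      // the loop, but every operand still round-trips through memory
      // unless the compiler manages to keep `data` in registers.
      friend array_simd operator+(const array_simd& a, const array_simd& b)
      {
        array_simd r;
        for (int i = 0; i < N; ++i)
          r.data[i] = a.data[i] + b.data[i];
        return r;
      }
    };
}
```

The per-element loops are where the load/store pressure mentioned above comes from: nothing in the type tells the compiler that the object should live in a vector register.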
@Srinivas: See the guard variables for __index0123? They need to go. I believe
you can and should declare them `constexpr`.
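To illustrate the point about guard variables: the sketch below assumes the guards come from a function-local static with a non-constant initializer, which is the usual cause. The actual definition of __index0123 in the patch is not reproduced here, and all names in the sketch are hypothetical.

```cpp
#include <array>

// Deliberately not constexpr, so the initializer below is not a
// constant expression.
inline int runtime_iota(int i) { return i; }

inline const std::array<int, 4>& index_table_guarded()
{
  // Dynamic initialization of a local static: the compiler typically
  // emits a guard variable and a thread-safe "already initialized?"
  // check that runs on every call.
  static const std::array<int, 4> table
    = {runtime_iota(0), runtime_iota(1), runtime_iota(2), runtime_iota(3)};
  return table;
}

inline const std::array<int, 4>& index_table_constexpr()
{
  // Constant initialization: the table can be placed in read-only
  // data and no guard variable is needed.
  static constexpr std::array<int, 4> table = {0, 1, 2, 3};
  return table;
}
```

Both functions return the same values; the difference only shows up in the generated code.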
> It seems reasonable to
> have functions whose implementation is specialised for a specific SVE
> length, with that function being selected at runtime where appropriate.
> Those functions needn't (in principle) be in separate TUs. The “best”
> definition of native<float> then becomes a per-function property rather
> than a per-TU property.
Hmm, I never considered this; but can one actually write fixed-length SVE code
if -msve-vector-bits is not given? Then it's certainly possible to write a
single TU with a runtime dispatch for all different SVE-widths. (This is less
interesting on x86 where we need to dispatch on ISA extensions *and* vector
width. It's much simpler (and safer) to compile a TU multiple times,
restricted to a certain set of ISA extensions and then dispatch to the right
translation from some general code section.)
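A rough sketch of such a dispatch from a general code section, under stated assumptions: the two namespaces stand in for the same source compiled twice with different -m flags (here they are plain C++ so the example fits in one file), and the names `impl_baseline`, `impl_wide`, `select_sum` and `dispatched_sum` are hypothetical. `__builtin_cpu_supports` is the GCC/Clang x86 builtin; on other targets the sketch falls back to the baseline.

```cpp
#include <cstddef>

// Stand-in for the TU compiled with the baseline ISA.
namespace impl_baseline
{
  inline float sum(const float* p, std::size_t n)
  {
    float r = 0;
    for (std::size_t i = 0; i < n; ++i)
      r += p[i];
    return r;
  }
}

// Stand-in for the same TU compiled with, e.g., -mavx2.
namespace impl_wide
{
  inline float sum(const float* p, std::size_t n)
  {
    float acc[4] = {0, 0, 0, 0};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
      for (int k = 0; k < 4; ++k)
        acc[k] += p[i + k];
    float r = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; ++i)   // remainder
      r += p[i];
    return r;
  }
}

using sum_fn = float (*)(const float*, std::size_t);

// Pick the variant once, based on what the running CPU supports.
inline sum_fn select_sum()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  if (__builtin_cpu_supports("avx2"))
    return impl_wide::sum;
#endif
  return impl_baseline::sum;
}

// General code section: callers only ever see this entry point.
inline float dispatched_sum(const float* p, std::size_t n)
{
  static const sum_fn fn = select_sum();  // resolved on first call
  return fn(p, n);
}
```

Only the entry point is ISA-neutral; everything behind the function pointer may assume the selected extension set, which is why compiling whole TUs per variant is the simpler arrangement.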
> As you note later, I think the same thing would apply to x86_64.
Yes. I don't think "same" is the case (yet) but it's very similar. Once ARM is
at SVE9 😉 and binaries need to support HW from SVE2 up to SVE9 it gets closer
to "same".
> > The big issue I see here is that currently all of std::* is declared
> > without an arm_streaming or arm_streaming_compatible. Thus, IIUC, you
> > can't use anything from the standard library in streaming mode. Since
> > that also applies to std::experimental::simd, we're not creating a new
> > footgun, only missing out on potential users?
>
> Kind-of. However, we can inline a non-streaming function into a streaming
> function if that doesn't change defined behaviour. And that's important
> in practice for C++, since most trivial inline functions will not be
> marked streaming-compatible despite being so in practice.
Ah, good to know that the compiler takes a pragmatic approach here. But I
imagine this could become a source of confusion for users.
> > [...]
> > the compiler *must* virally apply target_clones to all functions it calls.
> > And member functions must either also get cloned as functions, or the
> > whole type must be cloned (as in the std::simd case, where the sizeof
> > needs to change). 😳
> Yeah, tricky :)
>
> It's also not just about vector widths. The target-clones case also has
> the problem that you cannot detect at include time which features are
> available. E.g. “do I have SVE2-specific instructions?” becomes a
> contextual question rather than a global question.
>
> Fortunately, this should just be a missed optimisation. But it would be
> nice if uses of std::simd in SVE2 clones could take advantage of SVE2-only
> instructions, even if SVE2 wasn't enabled at include time.
Exactly. Even if we solve the scalable vector-length question, the
target_clones question stays relevant.
So far my best answer, for x86 at least, is to compile the SIMD code multiple
times into different shared libraries. And then let the dynamic linker pick
the right library variant depending on the CPU. I'd be happy to have something
simpler and working right out of the box.
Best,
Matthias
--
──────────────────────────────────────────────────────────────────────────
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Center for Heavy Ion Research https://gsi.de
std::simd
──────────────────────────────────────────────────────────────────────────