From: Matthias Kretz <m.kretz@gsi.de>
To: <gcc-patches@gcc.gnu.org>
Cc: Srinivas Yadav <vasusrinivas.vasu14@gmail.com>,
<gcc-patches@gcc.gnu.org>, <libstdc++@gcc.gnu.org>,
<richard.sandiford@arm.com>, Andrew Pinski <pinskia@gmail.com>
Subject: Re: [PATCH] libstdc++: add ARM SVE support to std::experimental::simd
Date: Thu, 18 Jan 2024 09:40:51 +0100
Message-ID: <4123346.6PsWsQAL7t@centauriprime>
In-Reply-To: <CA+=Sn1nTL8LLLFNmj2797a5p-3WLCVuEXukoPXe=GykNGep2Vw@mail.gmail.com>
On Thursday, 18 January 2024 08:40:48 CET Andrew Pinski wrote:
> On Wed, Jan 17, 2024 at 11:28 PM Matthias Kretz <m.kretz@gsi.de> wrote:
> > template <typename T>
> > struct Point
> > {
> >   T x, y, z;
> >
> >   T distance_to_origin() {
> >     return sqrt(x * x + y * y + z * z);
> >   }
> > };
> >
> > Point<float> is one point in 3D space, Point<simd<float>> stores multiple
> > points in 3D space and can work on them in parallel.
> >
> > This implies that simd<T> must have a sizeof. C++ is unlikely to get
> > sizeless types (the discussions were long, there were many papers, ...).
> > Should sizeless types in C++ ever happen, then composition is likely going
> > to be constrained to the last data member.
>
> Even this is a bad design in general for simd. It means the code needs
> to know the size.
Yes and no. The developer writes size-agnostic code. The person compiling the
code chooses the size (via -m flags) and thus the compiler sees fixed-size
code.
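
To make that concrete, here is a minimal sketch, assuming libstdc++'s
<experimental/simd> and C++17 or later. The Point template is the one quoted
above; the source never names a width, and floatv::size() is fixed only once
the target flags are chosen. The names floatv, scalar_distance,
vector_distance, lanes_all_equal and self_check are illustrative only, not
part of any proposal or patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <experimental/simd>

namespace stdx = std::experimental;

// The number of lanes is whatever the target flags select at compile time.
using floatv = stdx::native_simd<float>;

template <typename T>
struct Point
{
  T x, y, z;

  T distance_to_origin() const {
    using std::sqrt;  // scalar case; the simd overload is found via ADL
    return sqrt(x * x + y * y + z * z);
  }
};

// One point in 3D space.
inline float scalar_distance(float x, float y, float z)
{
  return Point<float>{x, y, z}.distance_to_origin();
}

// floatv::size() points at once; for brevity every lane holds the same point.
inline floatv vector_distance(float x, float y, float z)
{
  return Point<floatv>{floatv(x), floatv(y), floatv(z)}.distance_to_origin();
}

// True if every lane of v equals expected.
inline bool lanes_all_equal(const floatv& v, float expected)
{
  for (std::size_t i = 0; i < floatv::size(); ++i)
    if (v[i] != expected)
      return false;
  return true;
}

// Same source, same result, regardless of how many lanes floatv has.
inline void self_check()
{
  assert(scalar_distance(3.0f, 4.0f, 12.0f) == 13.0f);
  assert(lanes_all_equal(vector_distance(3.0f, 4.0f, 12.0f), 13.0f));
}
```

Building this once per target (on x86, say, with -msse2 and again with
-mavx2) changes floatv::size() but not a single line of the source.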
> Also AoS vs SoA is always an interesting point here. In some cases you
> want an array of structs
> for speed and Point<simd<float>> does not work there at all. I guess
> this is all water under the bridge with how folks design code.
> You are basically pushing the AoSoA idea here, which is a much worse idea
> than before.
I like to call it "array of vectorized struct" (AoVS) instead of AoSoA to
emphasize the compiler-flags dependent memory layout.
I've been doing a lot of heterogeneous SIMD programming since 2009, starting
with an outer loop vectorization across many TUs of a high-energy physics code
targeting Intel Larrabee (pre-AVX512 ISA) and SSE2 with one source. In all
these years my experience has been that, if the problem allows, AoVS is best
in terms of performance and code generality & readability. I'd be interested
to learn why you think differently.
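
A rough sketch of what I mean by AoVS, again assuming libstdc++'s
<experimental/simd> and C++17 (for allocation of over-aligned types). The
helper to_aovs and the alias PointV are made up for this illustration. Each
array element holds floatv::size() points, so the memory layout follows the
compiler flags:

```cpp
#include <cassert>
#include <cstddef>
#include <experimental/simd>
#include <vector>

namespace stdx = std::experimental;
using floatv = stdx::native_simd<float>;

template <typename T>
struct Point
{
  T x, y, z;
};

// One AoVS element: all x lanes, then all y lanes, then all z lanes.
using PointV = Point<floatv>;
static_assert(sizeof(PointV) == 3 * sizeof(floatv));

// Pack scalar points into AoVS storage; unused tail lanes are set to zero.
inline std::vector<PointV> to_aovs(const std::vector<Point<float>>& in)
{
  constexpr std::size_t W = floatv::size();
  const PointV zero{floatv(0.0f), floatv(0.0f), floatv(0.0f)};
  std::vector<PointV> out((in.size() + W - 1) / W, zero);
  for (std::size_t i = 0; i < in.size(); ++i) {
    PointV& chunk = out[i / W];
    chunk.x[i % W] = in[i].x;  // lane i % W of chunk i / W
    chunk.y[i % W] = in[i].y;
    chunk.z[i % W] = in[i].z;
  }
  assert(out.size() * W >= in.size());
  return out;
}
```

Real code would normally produce the data in this layout from the start
instead of converting element by element; the conversion is only there to
show where each scalar ends up.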
> That being said, sometimes it is not a vector of N elements you want to
> work on but rather 1/2/3 vectors of N elements. Seems like this is
> just pushing the idea of one vector of one type of element, which
> again is the wrong push.
I might have misunderstood. You're saying that sometimes I want a <float, 8>
even though my target CPU only has <float, 4> registers? Yes! The
std::experimental::simd spec and implementation aren't good enough in that
area yet, but the C++26 paper(s) and my prototype implementation provide
perfect SIMD + ILP translation of the expressed data-parallelism.
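
For illustration only, a small sketch with today's std::experimental::simd
(libstdc++, C++17 assumed), where the width is requested explicitly instead
of taken from the target. The alias float8 and the function sum_of_squares
are invented for this example. It shows only that the request is expressible
and portable; it makes no claim about how well a given target maps it to
registers.

```cpp
#include <cassert>
#include <experimental/simd>

namespace stdx = std::experimental;

// Eight floats requested explicitly, independent of the native register width.
using float8 = stdx::fixed_size_simd<float, 8>;
static_assert(float8::size() == 8);

// Sum of squares of eight values.
inline float sum_of_squares(const float (&data)[8])
{
  float8 v(data, stdx::element_aligned);  // load all eight elements
  return stdx::reduce(v * v);             // horizontal add of the products
}
```

On a target with only <float, 4> registers the implementation has to spread
the eight lanes over more than one register; the source stays the same.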
> Moreover, I guess pushing one idea of SIMD is worse than pushing
> any idea of SIMD. For mathematical code, it is better for the compiler
> to do the vectorization than for the user to try to be semi-portable
> between different targets.
I guess I agree with that statement. But I wouldn't, in general, call the use
of simd<T> "the user try[ing] to be semi-portable". In my experience, working
on physics code - a lot of math - using simd<T> (as intended) is better in
terms of performance and performance portability. As always, abuse is possible
...
> This is what was learned on Fortran but I guess
> some folks in C++ like to expose the underlying HW instead of
> thinking high level here.
The C++ approach is to "leave no room for a lower-level language" while
designing for high-level abstractions / usage.
> > With the above as our design constraints, SVE at first seems to be a bad
> > fit for implementing std::simd. However, if (at least initially) we accept
> > the need for different binaries for different SVE implementations, then you
> > can look at the "scalable" part of SVE as an efficient way of reducing the
> > number of opcodes necessary for supporting all kinds of different vector
> > lengths. But otherwise you can treat it as fixed-size registers - which it
> > is for a given CPU. In the case of a multi-CPU shared-memory system (e.g.
> > RDMA between different ARM implementations) all you need is a different
> > name for incompatible types. So std::simd<float> on SVE256 must have a
> > different name on SVE512. Same for std::simd<float, 8> (which is currently
> > not the case with Srinivas's patch, I think, and needs to be resolved).
>
> For SVE that is a bad design. It means the code is not portable at all.
When you say "code" you mean "source code", not binaries, right? I don't see
how that follows.
- Matthias
--
──────────────────────────────────────────────────────────────────────────
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Center for Heavy Ion Research https://gsi.de
std::simd
──────────────────────────────────────────────────────────────────────────