Re: [PATCH] libstdc++: add ARM SVE support to std::experimental::simd

public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed

From: Matthias Kretz <m.kretz@gsi.de>
To: Srinivas Yadav <vasusrinivas.vasu14@gmail.com>,
	<libstdc++@gcc.gnu.org>, <gcc-patches@gcc.gnu.org>,
	<richard.sandiford@arm.com>
Subject: Re: [PATCH] libstdc++: add ARM SVE support to std::experimental::simd
Date: Thu, 18 Jan 2024 07:54:32 +0100	[thread overview]
Message-ID: <3296207.VqM8IeB0Os@centauriprime> (raw)
In-Reply-To: <mpt8r62kwxi.fsf@arm.com>

On Sunday, 10 December 2023 14:29:45 CET Richard Sandiford wrote:
> Thanks for the patch and sorry for the slow review.

Sorry for my slow reaction. I needed a long vacation. For now I'll focus on 
the design question wrt. multi-arch compilation.

> I can only comment on the usage of SVE, rather than on the scaffolding
> around it.  Hopefully Jonathan or others can comment on the rest.

That's very useful!

> The main thing that worries me is:
> 
> #if _GLIBCXX_SIMD_HAVE_SVE
> constexpr inline int __sve_vectorized_size_bytes = __ARM_FEATURE_SVE_BITS/8;
> #else
> constexpr inline int __sve_vectorized_size_bytes = 0;
> #endif
> 
> Although -msve-vector-bits is currently a per-TU setting, that isn't
> necessarily going to be the case in future.

This is a topic that I care about a lot... as simd user, implementer, and WG21 
proposal author. Are you thinking of a plan to implement the target_clones 
function attribute for different SVE lengths? Or does it go further than that? 
PR83875 is raising the same issue and solution ideas for x86. If I understand 
your concern correctly, then the issue you're raising exists in the same form 
for x86.

If anyone is interested in working on a "translation phase 7 replacement" for 
compiler flags macros I'd be happy to give some input of what I believe is 
necessary to make target_clones work with std(x)::simd. This seems to be about 
constant expressions that return compiler-internal state - probably similar to 
how static reflection needs to work.

For a sketch of a direction: what I'm already doing in 
std::experimental::simd, is to tag all non-always_inline function names with a 
bitflag, representing a relevant subset of -f and -m flags. That way, I'm 
guarding against surprises on linking TUs compiled with different flags.

> Ideally it would be
> possible to define different implementations of a function with
> different (fixed) vector lengths within the same TU.  The value at
> the point that the header file is included is then somewhat arbitrary.
> 
> So rather than have:
> >  using __neon128 = _Neon<16>;
> >  using __neon64 = _Neon<8>;
> > 
> > +using __sve = _Sve<>;
> 
> would it be possible instead to have:
> 
>   using __sve128 = _Sve<128>;
>   using __sve256 = _Sve<256>;
>   ...etc...
> 
> ?  Code specialised for 128-bit vectors could then use __sve128 and
> code specialised for 256-bit vectors could use __sve256.

Hmm, as things stand we'd need two numbers, IIUC:
_Sve<NumberOfUsedBytes, SizeofRegister>

On x86, "NumberOfUsedBytes" is sufficient, because 33-64 implies zmm registers 
(and -mavx512f), 17-32 implies ymm, and <=16 implies xmm (except where it 
doesn't ;) ).

> Perhaps that's not possible as things stand, but it'd be interesting
> to know the exact failure mode if so.  Either way, it would be good to
> remove direct uses of __ARM_FEATURE_SVE_BITS from simd_sve.h if possible,
> and instead rely on information derived from template parameters.

The TS spec requires std::experimental::native_simd<int> to basically give you 
the largest, most efficient, full SIMD register. (And it can't be a sizeless 
type because they don't exist in C++). So how would you do that without 
looking at __ARM_FEATURE_SVE_BITS in the simd implementation?

> It should be possible to use SVE to optimise some of the __neon*
> implementations, which has the advantage of working even for VLA SVE.
> That's obviously a separate patch, though.  Just saying for the record.

I learned that NVidia Grace CPUs alias NEON and SVE registers. But I must 
assume that other SVE implementations (especially those with 
__ARM_FEATURE_SVE_BITS > 128) don't do that and might incur a significant 
latency when going from a NEON register to an SVE register and back (which 
each requires a store-load, IIUC). So are you thinking of implementing 
everything via SVE? That would break ABI, no?

- Matthias

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Center for Heavy Ion Research               https://gsi.de
 std::simd
──────────────────────────────────────────────────────────────────────────

next prev parent reply	other threads:[~2024-01-18  7:28 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-24 15:59 Srinivas Yadav
2023-12-10 13:29 ` Richard Sandiford
2023-12-11 11:02   ` Richard Sandiford
2024-01-04  7:42   ` Srinivas Yadav
2024-01-04  9:10     ` Andrew Pinski
2024-01-18  7:27       ` Matthias Kretz
2024-01-18  7:40         ` Andrew Pinski
2024-01-18  8:40           ` Matthias Kretz
2024-01-18  6:54   ` Matthias Kretz [this message]
2024-01-23 20:57     ` Richard Sandiford
2024-03-27 11:53       ` Matthias Kretz
2024-03-27 13:34         ` Richard Sandiford
2024-03-28 14:48           ` Matthias Kretz
2024-02-09 14:28   ` [PATCH v2] " Srinivas Yadav Singanaboina
2024-03-08  9:57     ` Matthias Kretz
2024-03-27  9:50       ` Jonathan Wakely
2024-03-27 10:07         ` Richard Sandiford
2024-03-27 10:30           ` Matthias Kretz
2024-03-27 12:13             ` Richard Sandiford
2024-03-27 12:47               ` Jonathan Wakely
2024-03-27 14:18         ` Matthias Kretz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3296207.VqM8IeB0Os@centauriprime \
    --to=m.kretz@gsi.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libstdc++@gcc.gnu.org \
    --cc=richard.sandiford@arm.com \
    --cc=vasusrinivas.vasu14@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).