public inbox for gcc@gcc.gnu.org
* Re: Determining maximum vector length supported by the CPU?
@ 2019-05-22  8:36 Martin Reinecke
  2019-05-22  9:18 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Reinecke @ 2019-05-22  8:36 UTC (permalink / raw)
  To: gcc; +Cc: m.kretz 

Hi Matthias!

> I agree, we need more information from the compiler. Esp. whether the user 
> specified `-mprefer-avx128` or `-mprefer-vector-width=none/128/256/512`.
> OTOH `-msve-vector-bits=N` is reported as __ARM_FEATURE_SVE_BITS. So that's 
> covered.

Almost ... except that I'd need a platform-agnostic definition. The
point is that the code does not care about the underlying hardware at
all, only about the vector length it supports.

> Related: PR83875 - because while we're adding things in that area, it'd be 
> nice if they worked with target clones as well.

Yes, this is a problem I've come across as well in the past.
(https://gcc.gnu.org/ml/gcc-help/2018-10/msg00118.html)

> Are you aware of std::experimental::simd? It didn't make GCC 9.1, but you
> can easily patch your (installed) libstdc++ using
> https://github.com/VcDevel/std-simd.

This looks extremely interesting! I have to look at it in more detail,
but this might be the way to go in the future.
However, the code I'm working on may be incorporated into numpy/scipy at
some point, and the minimum required compilers for these packages are
pretty old. I can't expect more than vanilla C++11 support there.

Cheers,
  Martin


* Re: Determining maximum vector length supported by the CPU?
  2019-05-22  8:36 Determining maximum vector length supported by the CPU? Martin Reinecke
@ 2019-05-22  9:18 ` Richard Biener
  2019-05-22  9:27   ` Matthias Kretz
  2019-05-22  9:27   ` Martin Reinecke
  0 siblings, 2 replies; 7+ messages in thread
From: Richard Biener @ 2019-05-22  9:18 UTC (permalink / raw)
  To: Martin Reinecke; +Cc: GCC Development, m.kretz

On Wed, May 22, 2019 at 10:36 AM Martin Reinecke
<martin@mpa-garching.mpg.de> wrote:
>
> Hi Matthias!
>
> > I agree, we need more information from the compiler. Esp. whether the user
> > specified `-mprefer-avx128` or `-mprefer-vector-width=none/128/256/512`.
> > OTOH `-msve-vector-bits=N` is reported as __ARM_FEATURE_SVE_BITS. So that's
> > covered.
>
> Almost ... except that I'd need a platform-agnostic definition. The
> point is that the code does not care about the underlying hardware at
> all, only about the vector length it supports.

And then you run into AVX + SSE vs. AVX2 + SSE cases where the (optimal) length
depends on the component type...

I wonder if we'd want to have an 'auto' length instead ;)

I suppose exposing a __BIGGEST_VECTOR__ might be possible (not for SVE
though?).

> > Related: PR83875 - because while we're adding things in that area, it'd be
> > nice if they worked with target clones as well.
>
> Yes, this is a problem I've come across as well in the past.
> (https://gcc.gnu.org/ml/gcc-help/2018-10/msg00118.html)
>
> > Are you aware of std::experimental::simd? It didn't make GCC 9.1, but you
> > can easily patch your (installed) libstdc++ using
> > https://github.com/VcDevel/std-simd.
>
> This looks extremely interesting! I have to look at it in more detail,
> but this might be the way to go in the future.
> However, the code I'm working on may be incorporated into numpy/scipy at
> some point, and the minimum required compilers for these packages are
> pretty old. I can't expect more than vanilla C++11 support there.
>
> Cheers,
>   Martin
>


* Re: Determining maximum vector length supported by the CPU?
  2019-05-22  9:18 ` Richard Biener
@ 2019-05-22  9:27   ` Matthias Kretz
  2019-05-22  9:27   ` Martin Reinecke
  1 sibling, 0 replies; 7+ messages in thread
From: Matthias Kretz @ 2019-05-22  9:27 UTC (permalink / raw)
  To: GCC Development

On Wednesday, 22 May 2019 11:17:57 CEST Richard Biener wrote:
> On Wed, May 22, 2019 at 10:36 AM Martin Reinecke
> <martin@mpa-garching.mpg.de> wrote:
> > Hi Matthias!
> > 
> > > I agree, we need more information from the compiler. Esp. whether the user
> > > specified `-mprefer-avx128` or `-mprefer-vector-width=none/128/256/512`.
> > > OTOH `-msve-vector-bits=N` is reported as __ARM_FEATURE_SVE_BITS. So that's
> > > covered.
> > 
> > Almost ... except that I'd need a platform-agnostic definition. The
> > point is that the code does not care about the underlying hardware at
> > all, only about the vector length it supports.
> 
> And then you run into AVX + SSE vs. AVX2 + SSE cases where the (optimal)
> length depends on the component type...
> 
> I wonder if we'd want to have an 'auto' length instead ;)
> 
> I suppose exposing a __BIGGEST_VECTOR__ might be possible (not for SVE
> though?).

AVX vs. AVX2 is a valid question. std::experimental::simd solves it by having 
the size depend on the element type. So I agree, `vector_size(auto)` seems 
like a possible solution. One could then take the sizeof if the number is 
important to know.
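
To illustrate the "take the sizeof" idea with what exists today: the sketch
below uses the current fixed-size `vector_size(N)` attribute, since
`vector_size(auto)` is only a suggestion in this thread. The names `vfloat`,
`vdouble` and `lane_count` are made up for the example. It shows that for a
given byte size the number of lanes depends on the element type.

```cpp
#include <cstddef>

// Two 16-byte vector builtins with different element types (GCC/Clang
// extension). The byte count is fixed by hand here; it is not "auto".
typedef float  vfloat  __attribute__((vector_size(16)));
typedef double vdouble __attribute__((vector_size(16)));

// Hypothetical helper: number of lanes of vector type V with element type E.
template <typename V, typename E>
constexpr std::size_t lane_count() {
  return sizeof(V) / sizeof(E);
}

// sizeof reports the vector size in bytes; the lane count follows from it.
static_assert(sizeof(vfloat) == 16, "vector_size(16) yields 16 bytes");
static_assert(lane_count<vfloat, float>() == 4, "four floats in 16 bytes");
static_assert(lane_count<vdouble, double>() == 2, "two doubles in 16 bytes");
```

With an automatic size, the same `sizeof` arithmetic would presumably apply;
only the hard-coded 16 would go away.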

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                                https://kretzfamily.de
 GSI Helmholtzzentrum für Schwerionenforschung             https://gsi.de
 SIMD easy and portable                     https://github.com/VcDevel/Vc
──────────────────────────────────────────────────────────────────────────


* Re: Determining maximum vector length supported by the CPU?
  2019-05-22  9:18 ` Richard Biener
  2019-05-22  9:27   ` Matthias Kretz
@ 2019-05-22  9:27   ` Martin Reinecke
  2019-05-22  9:37     ` Matthias Kretz
  1 sibling, 1 reply; 7+ messages in thread
From: Martin Reinecke @ 2019-05-22  9:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development, m.kretz



On 5/22/19 11:17 AM, Richard Biener wrote:

>> Almost ... except that I'd need a platform-agnostic definition. The
>> point is that the code does not care about the underlying hardware at
>> all, only about the vector length it supports.
> 
> And then you run into AVX + SSE vs. AVX2 + SSE cases where the (optimal) length
> depends on the component type...

You mean different vector lengths for float, double, int etc?
I would be fine with having different macros for those, if necessary.

> I wonder if we'd want to have an 'auto' length instead ;)

I know it's weird, but for my code (which is definitely not a synthetic
test case) that would work perfectly :)
If you'd like to have a look:
https://gitlab.mpcdf.mpg.de/mtr/pocketfft/tree/cpp
(the only platform-dependent part is on lines 82-95 of the header file).

Still, I would need a way to determine how long the vectors actually
are. But it would probably be enough to measure this at runtime then.
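
One way to query this at runtime today, on x86 only, is the GCC/Clang builtin
`__builtin_cpu_supports`. The following is a minimal sketch under that
assumption; `runtime_vector_bytes` is a made-up name, and returning 0 on other
targets is just a placeholder. Note that it reports what the running CPU
offers, which may differ from what the compiler was told to target.

```cpp
#include <cstddef>

// Hypothetical helper: widest SIMD register width (in bytes) that the running
// x86 CPU reports. Returns 0 on non-x86 targets or when no known SIMD
// extension is found.
inline std::size_t runtime_vector_bytes() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();  // initialise the CPU feature data used below
  if (__builtin_cpu_supports("avx512f")) return 64;
  if (__builtin_cpu_supports("avx")) return 32;
  if (__builtin_cpu_supports("sse2")) return 16;
  return 0;
#else
  return 0;  // placeholder: no runtime query implemented for this target
#endif
}
```

This does not solve the portability problem, because the feature names are
still architecture-specific; it only moves the check from compile time to
run time.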

Cheers,
  Martin


* Re: Determining maximum vector length supported by the CPU?
  2019-05-22  9:27   ` Martin Reinecke
@ 2019-05-22  9:37     ` Matthias Kretz
  0 siblings, 0 replies; 7+ messages in thread
From: Matthias Kretz @ 2019-05-22  9:37 UTC (permalink / raw)
  To: Martin Reinecke, GCC Development

On Wednesday, 22 May 2019 11:27:25 CEST Martin Reinecke wrote:
> Still, I would need a way to determine how long the vectors actually
> are. But it would probably be enough to measure this at runtime then.

FWIW, something that took me way too long to figure out: You can use vector
builtins very conveniently in C++ with the traits I define at
https://github.com/VcDevel/std-simd/blob/59e6348a9d34b4ef4f5ef1fc4f423dd75e1987f3/experimental/bits/simd.h#L925
(ignore _SimdWrapper: I'm working on phasing it out after I discovered the
_VectorTraits solution). You'll need __vector_type, __is_vector_type and
_VectorTraits and then you can write:

template <typename T, typename VT = _VectorTraits<T>>
void f(T x) {
  using element_type = typename VT::value_type;
  constexpr auto N = VT::_S_width;
  ...
}

f(x) only participates in overload resolution if x is a vector builtin type
because otherwise _VectorTraits<T> leads to a substitution failure (SFINAE).
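
For readers who cannot pull in the std-simd header, the same pattern can be
sketched in plain C++11. This is not the std-simd implementation: the names
`is_subscriptable`, `is_vector_builtin`, `vector_traits`, `lanes` and `float4`
are invented here, and the detection rests on the assumption that the only
subscriptable types that are neither classes, unions, pointers nor arrays are
GCC/Clang vector builtins.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Does `x[0]` compile for an lvalue x of type T?
template <typename T, typename = void>
struct is_subscriptable : std::false_type {};
template <typename T>
struct is_subscriptable<T, decltype(void(std::declval<T&>()[0]))>
    : std::true_type {};

// Heuristic (assumption of this sketch, see lead-in).
template <typename T>
struct is_vector_builtin
    : std::integral_constant<bool, is_subscriptable<T>::value &&
                                       !std::is_class<T>::value &&
                                       !std::is_union<T>::value &&
                                       !std::is_pointer<T>::value &&
                                       !std::is_array<T>::value> {};

// Element type and lane count of a vector builtin.
template <typename T>
struct vector_traits_impl {
  using value_type = typename std::remove_cv<typename std::remove_reference<
      decltype(std::declval<T&>()[0])>::type>::type;
  static constexpr std::size_t width = sizeof(T) / sizeof(value_type);
};

// The alias is ill-formed for non-vector types, which triggers SFINAE.
template <typename T>
using vector_traits =
    typename std::enable_if<is_vector_builtin<T>::value,
                            vector_traits_impl<T>>::type;

typedef float float4 __attribute__((vector_size(16)));

// Viable only for vector builtins.
template <typename T, typename VT = vector_traits<T>>
std::size_t lanes(T) { return VT::width; }

// Fallback chosen for every other argument type.
inline std::size_t lanes(...) { return 1; }

static_assert(vector_traits<float4>::width == 4, "four floats in 16 bytes");
static_assert(!is_vector_builtin<double>::value, "a scalar is not a vector");
```

Called with a `float4`, `lanes` picks the template and returns the lane count;
called with a `double`, the template drops out of overload resolution and the
fallback is used.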

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                                https://kretzfamily.de
 GSI Helmholtzzentrum für Schwerionenforschung             https://gsi.de
 SIMD easy and portable                     https://github.com/VcDevel/Vc
──────────────────────────────────────────────────────────────────────────


* Re: Determining maximum vector length supported by the CPU?
  2019-05-22  6:39 Martin Reinecke
@ 2019-05-22  8:09 ` Matthias Kretz
  0 siblings, 0 replies; 7+ messages in thread
From: Matthias Kretz @ 2019-05-22  8:09 UTC (permalink / raw)
  To: gcc

Hi Martin,

I agree, we need more information from the compiler. Esp. whether the user 
specified `-mprefer-avx128` or `-mprefer-vector-width=none/128/256/512`.
OTOH `-msve-vector-bits=N` is reported as __ARM_FEATURE_SVE_BITS. So that's 
covered.

Related: PR83875 - because while we're adding things in that area, it'd be 
nice if they worked with target clones as well.

Are you aware of std::experimental::simd? It didn't make GCC 9.1, but you
can easily patch your (installed) libstdc++ using
https://github.com/VcDevel/std-simd.

Cheers,
  Matthias

On Wednesday, 22 May 2019 08:39:25 CEST Martin Reinecke wrote:
> [Disclaimer: I sent this to gcc-help two weeks ago, but didn't get an
> answer. Maybe the topic is more suited for the main gcc list ... I
> really think the feature in question would be extremely useful to have,
> and easy to add!]
> 
> Hi,
> 
> I'm currently writing an FFT library which tries to make use of SIMD
> instructions and uses a lot of variables with
>  __attribute__ ((vector_size (xyz)))
> 
> The resulting source is nicely portable and architecture-independent -
> except for the one place where I need to determine the maximum
> hardware-supported vector length on the target CPU.
> 
> This currently looks like
> 
> #if defined(__AVX__)
> constexpr int veclen=32;
> #elif defined(__SSE2__)
> constexpr int veclen=16;
> [...]
> 
> This approach requires me to add an #ifdef for many architectures, most
> of which I cannot really test on ... and new architectures will be
> unsupported by default.
> 
> Hence my question: is there a way in gcc to determine the hardware
> vector length for the architecture the compiler is currently targeting?
> Some predefined macro like
> 
> HARDWARE_VECTOR_LENGTH_IN_BYTES
> 
> which is 32 for AVX, 16 for SSE2, and has proper values for Neon, VSX
> etc. etc.
> 
> If this is not provided at the moment, would it be possible to add this
> in the future? This could massively simplify writing and maintaining
> multi-platform SIMD code.
> 
> Thanks,
>   Martin

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                                https://kretzfamily.de
 GSI Helmholtzzentrum für Schwerionenforschung             https://gsi.de
 SIMD easy and portable                     https://github.com/VcDevel/Vc
──────────────────────────────────────────────────────────────────────────


* Determining maximum vector length supported by the CPU?
@ 2019-05-22  6:39 Martin Reinecke
  2019-05-22  8:09 ` Matthias Kretz
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Reinecke @ 2019-05-22  6:39 UTC (permalink / raw)
  To: gcc

[Disclaimer: I sent this to gcc-help two weeks ago, but didn't get an
answer. Maybe the topic is more suited for the main gcc list ... I
really think the feature in question would be extremely useful to have,
and easy to add!]

Hi,

I'm currently writing an FFT library which tries to make use of SIMD
instructions and uses a lot of variables with
 __attribute__ ((vector_size (xyz)))

The resulting source is nicely portable and architecture-independent -
except for the one place where I need to determine the maximum
hardware-supported vector length on the target CPU.

This currently looks like

#if defined(__AVX__)
constexpr int veclen=32;
#elif defined(__SSE2__)
constexpr int veclen=16;
[...]

This approach requires me to add an #ifdef for many architectures, most
of which I cannot really test on ... and new architectures will be
unsupported by default.

Hence my question: is there a way in gcc to determine the hardware
vector length for the architecture the compiler is currently targeting?
Some predefined macro like

HARDWARE_VECTOR_LENGTH_IN_BYTES

which is 32 for AVX, 16 for SSE2, and has proper values for Neon, VSX
etc. etc.

If this is not provided at the moment, would it be possible to add this
in the future? This could massively simplify writing and maintaining
multi-platform SIMD code.

Thanks,
  Martin
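
For reference, here is roughly what such an #ifdef chain looks like once a few
more targets and a fallback are spelled out. It is a sketch, not the code from
the library: the target list is incomplete, the 16-byte fallback is an
assumption (GCC lowers generic vectors to narrower operations when the
hardware has no matching registers), and `dvec` is a made-up name. As noted in
the replies, a single number is also a simplification, since with AVX but no
AVX2 the best width differs between floating-point and integer elements.

```cpp
#include <cstddef>

// Feature macros predefined by GCC/Clang for the respective targets.
#if defined(__AVX512F__)
constexpr int veclen = 64;
#elif defined(__AVX__)
constexpr int veclen = 32;
#elif defined(__SSE2__)
constexpr int veclen = 16;
#elif defined(__ARM_NEON)
constexpr int veclen = 16;
#elif defined(__VSX__) || defined(__ALTIVEC__)
constexpr int veclen = 16;
#else
constexpr int veclen = 16;  // assumed fallback for unknown targets
#endif

// The constant can be fed straight into the vector_size attribute.
typedef double dvec __attribute__((vector_size(veclen)));

static_assert(sizeof(dvec) == static_cast<std::size_t>(veclen),
              "vector type has the selected byte length");
```

Every new architecture needs another branch here, which is exactly the
maintenance burden a compiler-provided macro would remove.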
