Re: Limits and feature test macroses for vector extension

From: Richard Biener <richard.guenther@gmail.com>
To: Nikita Zlobin <nick87720z@gmail.com>
Cc: gcc@gcc.gnu.org
Subject: Re: Limits and feature test macroses for vector extension
Date: Mon, 9 Jan 2023 08:55:35 +0100
Message-ID: <CAFiYyc3Eg_S5kk473H8n_D5V2YXOOVJnZBnV8NJ+=-HD5bnSdQ@mail.gmail.com>
In-Reply-To: <CAGaigJORdgraoEjU_4Va3-SuJups9Zsgf=EmG+Y1BmzerO_yqQ@mail.gmail.com>

On Sun, Jan 1, 2023 at 3:54 PM Nikita Zlobin via Gcc <gcc@gcc.gnu.org> wrote:
>
> The vector extension is great because it allows controllable
> vectorization without dealing with each SIMD ISA separately. When
> used properly, it gives better performance than
> auto-vectorization. However, there is one issue.
>
> While it is possible to check whether a specific SIMD ISA, used as a
> backend for the vector extension, is supported, there is no similar
> facility for the vector extension itself. The only way to make it
> configurable without manually checking each ISA is to add, e.g., a
> configure parameter --vector-size=<bytes>, with a comment good enough
> for the user to understand what belongs there (the value to be passed
> to __attribute__((vector_size(bytes)))).
>
> My first approach was to look for a way to autodetect the
> configuration, e.g. with autoconf, like this (not ideal, just a start):
>
> gcc -march=native -E -v - < /dev/null 2>&1 | awk 'BEGIN{ arr[0]=0;
> delete arr[0]; } /cc1/{ for (i=1; i<=NF; i++){ if ($i ~ /-mno-/)
> continue; switch ($i){ case /-m(mmx|3dnow|vis)/: arr[8]=1; break; case
> /-m(sse|altivec)/: arr[16]=1; break; case /^-mavx[2]?$/: arr[32]=1;
> break; case /-mavx512/: arr[64]=1; break; } }; for (j in arr) print
> j; }'

There's -Wvector-operation-performance which will diagnose cases
where GCC decomposes larger into smaller vectors or even to scalar
operations.  That might be of some help here as well.
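
To see the diagnostic in action, here is a minimal sketch (the names
v8sf and axpy are made up for this example). It uses a 32-byte vector
type; compiling it with something like
"gcc -O2 -Wvector-operation-performance -c axpy.c" for a target
without 32-byte vector registers should report that the operation is
expanded into smaller pieces, while adding e.g. -mavx on x86 should
silence it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 32-byte vector holding eight floats (typedef name chosen for this sketch). */
typedef float v8sf __attribute__((vector_size(32)));

/* y[i] += a * x[i] for n elements; this sketch requires n to be a multiple of 8. */
void axpy(float *y, const float *x, float a, size_t n)
{
    assert(n % 8 == 0);
    for (size_t i = 0; i < n; i += 8) {
        v8sf vx, vy;
        memcpy(&vx, x + i, sizeof vx);  /* load without alignment assumptions */
        memcpy(&vy, y + i, sizeof vy);
        vy += a * vx;                   /* scalar a is broadcast to all lanes */
        memcpy(y + i, &vy, sizeof vy);  /* store the result back */
    }
}
```

The vectors are kept local and moved with memcpy so that no 32-byte
vector crosses a function boundary, which avoids unrelated ABI warnings.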

> However, I discovered that I have no idea how to detect the NEON vector
> size this way (or even its presence). One answer suggested
> checking feature test macros. After trying this command:
>
> gcc -march=native -dM -E - </dev/null | less
>
> I discovered that other ISAs, like MMX, SSE and AVX, have similar
> feature test macros, e.g. __MMX__, __SSE2__, __AVX__. This means
> that a simple C header with __GNU_SOURCE would be enough to check for
> each ISA without calling functions from the Target Builtins extension.
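
Such a header could look roughly like the sketch below. The ISA macros
tested are the compiler's predefined ones; NATIVE_VECTOR_BYTES,
native_vf and native_vf_lanes are names invented for this example, and
the 16-byte fallback is an arbitrary assumption.

```c
#include <assert.h>

/* Pick a vector size in bytes from predefined ISA macros.
   NATIVE_VECTOR_BYTES is a hypothetical name, not something GCC provides. */
#if defined(__AVX512F__)
#  define NATIVE_VECTOR_BYTES 64
#elif defined(__AVX__)
#  define NATIVE_VECTOR_BYTES 32   /* float/double only; 32-byte integer ops need __AVX2__ */
#elif defined(__SSE2__) || defined(__ARM_NEON) || defined(__ALTIVEC__)
#  define NATIVE_VECTOR_BYTES 16
#elif defined(__MMX__)
#  define NATIVE_VECTOR_BYTES 8
#else
#  define NATIVE_VECTOR_BYTES 16   /* assumed fallback; the compiler lowers it as needed */
#endif

/* Float vector of the selected size. */
typedef float native_vf __attribute__((vector_size(NATIVE_VECTOR_BYTES)));

static_assert(sizeof(native_vf) == NATIVE_VECTOR_BYTES,
              "vector type must have the selected size");

/* Number of float lanes in native_vf. */
int native_vf_lanes(void)
{
    return (int)(sizeof(native_vf) / sizeof(float));
}
```

This only reflects the flags the translation unit is compiled with
(e.g. -march=native), not what the CPU supports at run time.
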
>
> However, that is not the end of it. Some ISAs support only a limited
> set of element types in vectors. E.g., MMX and 3DNow! don't support integer.
> This may be an issue if an integer implementation of some code performs
> better than a floating-point one (even with the same data
> width). This calls for real feature test macros representing the
> data types supported by the available SIMD ISAs.
>
> E.g., for simple vector sizes, it could be done with an array (example):
>
> #define __EXT_VECTOR_SIZEV (int[]){64, 128, 256, 512}
>
> with the array length determined as sizeof(vec) / sizeof(vec[0])
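
For illustration, the array idea can be tried today with user-defined
macros. In this sketch MY_VECTOR_SIZEV, MY_VECTOR_SIZEC and
largest_vector_size are hypothetical names, and the listed sizes are
simply copied from the example above (in bits).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed macro; sizes in bits. */
#define MY_VECTOR_SIZEV ((const int[]){64, 128, 256, 512})
/* Element count via the usual sizeof idiom; sizeof does not decay the array. */
#define MY_VECTOR_SIZEC (sizeof(MY_VECTOR_SIZEV) / sizeof(MY_VECTOR_SIZEV[0]))

/* Return the largest listed size not exceeding limit_bits, or 0 if none fits. */
int largest_vector_size(int limit_bits)
{
    assert(MY_VECTOR_SIZEC > 0);
    int best = 0;
    for (size_t i = 0; i < MY_VECTOR_SIZEC; i++) {
        int size = MY_VECTOR_SIZEV[i];
        if (size <= limit_bits && size > best)
            best = size;
    }
    return best;
}
```

One limitation of this form: a compound literal is not a preprocessor
constant, so it cannot be tested in #if and only helps at compile or
run time inside C code.
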
>
> But for an exact check of supported data types, there are several options:
>
> 1. Use per-type feature test macros: __V8SI16__, __V8UI16__,
> __V8F16__, __V4SI32__, __V4UI32__, __V4F32__, __V2SI64__,
> __V2F64__....
> (I read on Wikipedia that some ISAs restrict the underlying int size to
> 32 bits, without 64-bit support.)
>
> 2. Extend the array of supported lengths into a 2D matrix of supported
> vector size + element type combinations. This could use a
> zero-terminated array to mark the end of the real value sequence. The
> first subarray holds the vector sizes, while each following subarray
> corresponds to one value from the first. Their elements are int fields
> combining a bit width with bit flags that indicate float/int and (for
> int) signed/unsigned.
>
> Though who knows whether complex numbers might eventually
> appear in this list :D . Well, even without that, this could be a tricky
> approach.
>
> 3. A variation of the 2nd option: per-type lists of vector
> sizes rather than per-vector-size lists of data types. This could be more
> practical, since algorithms would rather need the available vector sizes
> for the specific data types they use.
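
A rough sketch of what the third variant could look like as plain C
data follows. Everything in it is hypothetical: the enum, the table
name, and the table contents, which are made-up values (resembling a
target with 32-byte float but only 16-byte integer vectors), not
anything GCC reports.

```c
#include <assert.h>
#include <stddef.h>

/* Element kinds covered by this sketch (hypothetical enumeration). */
enum elem_kind { ELEM_F32, ELEM_S32, ELEM_U32, ELEM_KIND_COUNT };

#define MAX_SIZES 4

/* Per-type, zero-terminated lists of vector sizes in bytes.
   The values are illustrative assumptions only. */
static const int vec_sizes[ELEM_KIND_COUNT][MAX_SIZES] = {
    [ELEM_F32] = {16, 32, 0},
    [ELEM_S32] = {16, 0},
    [ELEM_U32] = {16, 0},
};

/* Return the widest listed vector size in bytes for kind, or 0 if none. */
int widest_vector_for(enum elem_kind kind)
{
    assert((int)kind >= 0 && (int)kind < ELEM_KIND_COUNT);
    int widest = 0;
    for (size_t i = 0; i < MAX_SIZES && vec_sizes[kind][i] != 0; i++) {
        if (vec_sizes[kind][i] > widest)
            widest = vec_sizes[kind][i];
    }
    return widest;
}
```

An algorithm working on 32-bit integers would then ask for
widest_vector_for(ELEM_S32) and never see sizes that exist only for
floats.
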
>
> As for relying on vector size subdivision when there is no
> corresponding ISA support: I only got worse performance that way.
> I'm not sure, though, that it isn't a gcc bug: if the 2
> subvectors exist at the same time, then it could simply be that too
> many SIMD registers are in use. If they were processed in sequence,
> this probably should not hurt performance (I never tried writing the
> intrinsics code manually).

In general you'll figure that writing generic vector code is as hard
as autovectorizing scalar code...

Richard.