public inbox for gcc-bugs@sourceware.org
* [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd
@ 2022-12-21  4:48 luoyonggang at gmail dot com
  2022-12-21  7:44 ` [Bug target/108191] " jakub at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: luoyonggang at gmail dot com @ 2022-12-21  4:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

            Bug ID: 108191
           Summary: Add support to usage of *intrin.h without -mavx512f
                    -mavx512cd
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luoyonggang at gmail dot com
  Target Milestone: ---

The goal is to get the following command to work:
```
gcc -fPIC -O2 -D__SSE3__=1 -D__SSSE3__=1 \
-D__SSE4_1__=1 -D__SSE4_2__=1 -D__SSE4A__=1 \
-D__POPCNT__=1 -D__XSAVE__=1 -D__CRC32__=1 \
-D__AVX__=1 -D__AVX2__=1 \
-D__FP_FAST_FMAF32=1 \
-D__FP_FAST_FMAF64=1 \
-D__FP_FAST_FMAF=1 \
-D__FP_FAST_FMAF32x=1 \
-D__AVX512F__=1 -D__AVX512CD__=1 test.c
```
That is, generate code for SSE2 only while still being able to use
#include <x86intrin.h>
and select the wider instruction sets at run time.

MSVC can already do this. If GCC supported it too, we could reduce the use of
inline assembly (which MSVC x64 does not support) and so reduce code
complexity.

The content of test.c is:
```
#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <x86intrin.h>
#endif

#include <math.h>

static inline int
util_iround(float f)
{
   __m128 m = _mm_set_ss(f);
   return _mm_cvtss_i32(m);
}

int util_iround_outside(int x, float y) {
    return x + util_iround(y);
}
```

The compile error is something like:
```
In file included from
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/immintrin.h:35,
                 from
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/x86intrin.h:32,
                 from test.c:4:
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:
In function '_mm_addsub_ps':
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:53:3:
error: cannot convert a value of type 'int' to vector type '__vector(4) float'
which has different size
   53 |   return (__m128) __builtin_ia32_addsubps ((__v4sf)__X, (__v4sf)__Y);
      |   ^~~~~~
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:
In function '_mm_hadd_ps':
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:59:3:
error: cannot convert a value of type 'int' to vector type '__vector(4) float'
which has different size
   59 |   return (__m128) __builtin_ia32_haddps ((__v4sf)__X, (__v4sf)__Y);
      |   ^~~~~~
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:
In function '_mm_hsub_ps':
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:65:3:
error: cannot convert a value of type 'int' to vector type '__vector(4) float'
which has different size
   65 |   return (__m128) __builtin_ia32_hsubps ((__v4sf)__X, (__v4sf)__Y);
```


* [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
  2022-12-21  4:48 [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd luoyonggang at gmail dot com
@ 2022-12-21  7:44 ` jakub at gcc dot gnu.org
  2022-12-21  7:55 ` luoyonggang at gmail dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-12-21  7:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
You are lying to the compiler, don't.  In GCC you can #include <x86intrin.h>
with SSE2 only and later in say __attribute__((target ("avx512cd"))) function
use avx512f/avx512cd intrinsics, no need to do what you show above.


* [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
  2022-12-21  4:48 [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd luoyonggang at gmail dot com
  2022-12-21  7:44 ` [Bug target/108191] " jakub at gcc dot gnu.org
@ 2022-12-21  7:55 ` luoyonggang at gmail dot com
  2022-12-21  7:57 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: luoyonggang at gmail dot com @ 2022-12-21  7:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

--- Comment #2 from 罗勇刚(Yonggang Luo) <luoyonggang at gmail dot com> ---
(In reply to Jakub Jelinek from comment #1)
> You are lying to the compiler, don't.  In GCC you can #include <x86intrin.h>
> with SSE2 only and later in say __attribute__((target ("avx512cd")))
> function use avx512f/avx512cd intrinsics, no need to do what you show
> above.

Can you be more specific and show me the code? Thanks :)


* [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
  2022-12-21  4:48 [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd luoyonggang at gmail dot com
  2022-12-21  7:44 ` [Bug target/108191] " jakub at gcc dot gnu.org
  2022-12-21  7:55 ` luoyonggang at gmail dot com
@ 2022-12-21  7:57 ` rguenth at gcc dot gnu.org
  2022-12-21  7:58 ` luoyonggang at gmail dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-12-21  7:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I suppose the issue will be that __attribute__((target)) isn't supported by
MSVC?  But indeed this isn't something we are going to support.  Note another
way is to put the functions into different translation units.


* [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
  2022-12-21  4:48 [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd luoyonggang at gmail dot com
                   ` (2 preceding siblings ...)
  2022-12-21  7:57 ` rguenth at gcc dot gnu.org
@ 2022-12-21  7:58 ` luoyonggang at gmail dot com
  2022-12-21  8:05 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: luoyonggang at gmail dot com @ 2022-12-21  7:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

--- Comment #4 from 罗勇刚(Yonggang Luo) <luoyonggang at gmail dot com> ---
(In reply to Richard Biener from comment #3)
> I suppose the issue will be that __attribute__((target)) isn't supported by
> MSVC?  But indeed this isn't something we are going to support.  Note
> another way is to put the functions into different translation units.

GCC alone is enough; there is no need to care about MSVC. MSVC supports this
without an attribute, and we can use a macro to deal with that.
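
A minimal sketch of such a macro, under stated assumptions: `UTIL_TARGET` and `util_fadd_256` are hypothetical names, and the example uses "avx" rather than "avx512f" only so that it runs on more machines. On GCC the macro expands to the target attribute; on MSVC it expands to nothing.

```c
#if defined(_MSC_VER)
#include <intrin.h>
/* MSVC accepts the intrinsics without a per-function target. */
#define UTIL_TARGET(isa)
#else
#include <x86intrin.h>
/* GCC: enable the ISA extension for the annotated function only. */
#define UTIL_TARGET(isa) __attribute__((target(isa)))
#endif

/* Hypothetical helper: c = a + b for 8 floats, using unaligned loads and
   stores so that any float pointer is acceptable. */
UTIL_TARGET("avx")
void util_fadd_256(const float *a, const float *b, float *c)
{
    __m256 av = _mm256_loadu_ps(a);
    __m256 bv = _mm256_loadu_ps(b);
    _mm256_storeu_ps(c, _mm256_add_ps(av, bv));
}
```

The file itself is still compiled for the baseline ISA, so callers need a run-time check (on GCC, for example, `__builtin_cpu_supports("avx")`) before calling `util_fadd_256`.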


* [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
  2022-12-21  4:48 [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd luoyonggang at gmail dot com
                   ` (3 preceding siblings ...)
  2022-12-21  7:58 ` luoyonggang at gmail dot com
@ 2022-12-21  8:05 ` rguenth at gcc dot gnu.org
  2022-12-21  9:29 ` luoyonggang at gmail dot com
  2022-12-21  9:47 ` jakub at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-12-21  8:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to 罗勇刚(Yonggang Luo) from comment #2)
> (In reply to Jakub Jelinek from comment #1)
> > You are lying to the compiler, don't.  In GCC you can #include <x86intrin.h>
> > with SSE2 only and later in say __attribute__((target ("avx512cd")))
> > function use avx512f/avx512cd intrinsics, no need to do what you show
> > above.
> 
> > Can you be more specific and show me the code? Thanks :)

#include <x86intrin.h>

int __attribute__((target("avx512f"))) foo(float f)
{
  __m128 m = _mm_set_ss(f);
  return _mm_cvtss_i32(m);
}

results in (with just SSE2):

foo:
.LFB6614:
        .cfi_startproc
        vcvtss2sil      %xmm0, %eax
        ret
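
Since foo is compiled with AVX-512F enabled, it should only be called on CPUs that support it. A minimal sketch of guarding the call at run time, assuming GCC's `__builtin_cpu_supports` and an x86-64 baseline; `foo_baseline` and `foo_dispatch` are hypothetical names that do not come from this thread:

```c
#include <x86intrin.h>

/* Same function as above: AVX-512F is enabled for this function only. */
static int __attribute__((target("avx512f"))) foo(float f)
{
    __m128 m = _mm_set_ss(f);
    return _mm_cvtss_i32(m);
}

/* Hypothetical fallback compiled for the baseline ISA (SSE on x86-64). */
static int foo_baseline(float f)
{
    return _mm_cvtss_si32(_mm_set_ss(f));
}

/* Hypothetical dispatcher: ask the running CPU before taking the
   AVX-512F path, and fall back to the baseline version otherwise. */
int foo_dispatch(float f)
{
    if (__builtin_cpu_supports("avx512f"))
        return foo(f);
    return foo_baseline(f);
}
```

Both paths convert with the current rounding mode, so the result is the same whichever one is taken.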


* [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
  2022-12-21  4:48 [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd luoyonggang at gmail dot com
                   ` (4 preceding siblings ...)
  2022-12-21  8:05 ` rguenth at gcc dot gnu.org
@ 2022-12-21  9:29 ` luoyonggang at gmail dot com
  2022-12-21  9:47 ` jakub at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: luoyonggang at gmail dot com @ 2022-12-21  9:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

--- Comment #6 from 罗勇刚(Yonggang Luo) <luoyonggang at gmail dot com> ---
Is the following valid usage? It compiles properly.

```

/* compile args:  -fPIC -O2 -D__SSE3__=1 -D__SSSE3__=1 -D__SSE4_1__=1
   -D__SSE4_2__=1 -D__SSE4A__=1 -D__POPCNT__=1 -D__XSAVE__=1 -D__CRC32__=1
   -D__AVX__=1 -D__AVX2__=1 -D__FP_FAST_FMAF32=1 -D__FP_FAST_FMAF64=1
   -D__FP_FAST_FMAF=1 -D__FP_FAST_FMAF32x=1 -D__AVX512F__=1 -D__AVX512CD__=1 */
#include <math.h>

#pragma GCC push_options
#pragma GCC target("avx512f")
#pragma GCC target("avx512cd")
#pragma GCC target("sse4a")

#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <x86intrin.h>
#endif

#pragma GCC pop_options


#pragma GCC push_options
#pragma GCC target("avx512f")
#pragma GCC target("avx512cd")
#pragma GCC target("sse4a")

void util_fadd_512(float *a, float *b, float *c) {
    /* c = a + b; all three pointers must be 64-byte aligned */
    __m512 av = _mm512_load_ps(a);
    __m512 bv = _mm512_load_ps(b);
    __m512 cv = _mm512_add_ps(av, bv);
    _mm512_store_ps(c, cv);
}
static inline int
util_iround(float f)
{
   __m128 m = _mm_set_ss(f);
   return _mm_cvtss_i32(m);
}

#pragma GCC pop_options

int util_iround_outside(int x, float y) {
    return x + util_iround(y);
}
float util_fadd(float a, float b) {
   return a + b;
}
```


* [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
  2022-12-21  4:48 [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd luoyonggang at gmail dot com
                   ` (5 preceding siblings ...)
  2022-12-21  9:29 ` luoyonggang at gmail dot com
@ 2022-12-21  9:47 ` jakub at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-12-21  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to 罗勇刚(Yonggang Luo) from comment #6)
> Is the following valid usage? It compiles properly.

No, this is invalid.
> 
> ```
> 
> // compile args:  -fPIC -O2 -D__SSE3__=1 -D__SSSE3__=1 -D__SSE4_1__=1
> -D__SSE4_2__=1 -D__SSE4A__=1 -D__POPCNT__=1 -D__XSAVE__=1 -D__CRC32__=1
> -D__AVX__=1 -D__AVX2__=1 -D__FP_FAST_FMAF32=1 -D__FP_FAST_FMAF64=1
> -D__FP_FAST_FMAF=1 -D__FP_FAST_FMAF32x=1 -D__AVX512F__=1 -D__AVX512CD__=1

Use only -fPIC -O2 here and none of the -D arguments; all of them are internal
GCC macros that users shouldn't redefine.
Besides, they aren't needed.

> #include <math.h>
> 
> #pragma GCC push_options
> #pragma GCC target("avx512f")
> #pragma GCC target("avx512cd")
> #pragma GCC target("sse4a")
> 
> #if defined(_MSC_VER)
> #include <intrin.h>
> #else
> #include <x86intrin.h>
> #endif
> 
> #pragma GCC pop_options

You can do it, but for GCC it is completely useless; you can just
#include <x86intrin.h> without anything further.

> #pragma GCC push_options
> #pragma GCC target("avx512f")
> #pragma GCC target("avx512cd")
> #pragma GCC target("sse4a")

This is certainly fine, but avx512f in there isn't needed, that is implied by
avx512cd.
Though I don't see anything avx512cd-ish or sse4a-ish in there.
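
To illustrate the point that avx512cd implies avx512f, here is a minimal sketch in which only "avx512cd" is requested and the AVX-512F intrinsics still compile. It uses the attribute form instead of the pragma. `util_fadd_16` and `util_fadd_16_base` are hypothetical names, and unaligned loads replace the aligned ones so that any pointer is acceptable.

```c
#include <x86intrin.h>

/* Only "avx512cd" is requested; the AVX-512F intrinsics below are still
   usable because avx512f is enabled along with avx512cd. */
static __attribute__((target("avx512cd")))
void util_fadd_512(const float *a, const float *b, float *c)
{
    /* c = a + b for 16 floats */
    __m512 av = _mm512_loadu_ps(a);
    __m512 bv = _mm512_loadu_ps(b);
    _mm512_storeu_ps(c, _mm512_add_ps(av, bv));
}

/* Hypothetical scalar fallback for CPUs without AVX-512. */
static void util_fadd_16_base(const float *a, const float *b, float *c)
{
    for (int i = 0; i < 16; i++)
        c[i] = a[i] + b[i];
}

/* Hypothetical dispatcher: choose the implementation at run time. */
void util_fadd_16(const float *a, const float *b, float *c)
{
    if (__builtin_cpu_supports("avx512cd"))
        util_fadd_512(a, b, c);
    else
        util_fadd_16_base(a, b, c);
}
```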
> 
> void util_fadd_512(float *a, float *b, float *c) {
>     /* c = a + b; all three pointers must be 64-byte aligned */
>     __m512 av = _mm512_load_ps(a);
>     __m512 bv = _mm512_load_ps(b);
>     __m512 cv = _mm512_add_ps(av, bv);
>     _mm512_store_ps(c, cv);
> }
> static inline int
> util_iround(float f)
> {
>    __m128 m = _mm_set_ss(f);
>    return _mm_cvtss_i32(m);
> }
> 
> #pragma GCC pop_options
> 
> int util_iround_outside(int x, float y) {
>     return x + util_iround(y);
> }
> float util_fadd(float a, float b) {
>    return a + b;
> }
> ```

That said, code compiled with the avx512cd etc. target won't be inlined into
code compiled without it.
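
One way to live with that is to move the whole loop into a function that has the same target, so the target boundary is crossed once per array rather than once per element. A minimal sketch under stated assumptions: all function names are hypothetical, and "avx2" stands in for "avx512cd" only so that the example runs on more machines.

```c
#include <stddef.h>
#include <x86intrin.h>

/* Compiled with AVX2 enabled. GCC will not inline this into a caller
   compiled without AVX2, so from baseline code each use is a real call. */
static inline int __attribute__((target("avx2"))) util_iround_avx2(float f)
{
    return _mm_cvtss_si32(_mm_set_ss(f));
}

/* The loop has the same target, so the helper may be inlined here. */
static __attribute__((target("avx2")))
void util_iround_array_avx2(const float *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = util_iround_avx2(in[i]);
}

/* Baseline loop for CPUs without AVX2 (SSE is assumed, as on x86-64). */
static void util_iround_array_base(const float *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = _mm_cvtss_si32(_mm_set_ss(in[i]));
}

/* Baseline entry point: one run-time check per array. */
void util_iround_array(const float *in, int *out, size_t n)
{
    if (__builtin_cpu_supports("avx2"))
        util_iround_array_avx2(in, out, n);
    else
        util_iround_array_base(in, out, n);
}
```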
