* [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
2022-12-21 4:48 [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd luoyonggang at gmail dot com
@ 2022-12-21 7:44 ` jakub at gcc dot gnu.org
2022-12-21 7:55 ` luoyonggang at gmail dot com
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-12-21 7:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
You are lying to the compiler; don't. In GCC you can #include <x86intrin.h>
with only SSE2 enabled and later, in a function marked say
__attribute__((target ("avx512cd"))), use avx512f/avx512cd intrinsics. There
is no need to do what you show above.
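
A minimal sketch of that approach, assuming GCC on x86 and a plain command line with no -mavx512* flags. The function names (lzcnt16, lzcnt16_avx512, lzcnt16_scalar) are hypothetical, and the runtime check with __builtin_cpu_supports is an addition so that the AVX-512 path only runs on CPUs that have it:

```c
#include <stdint.h>
#include <x86intrin.h>

/* Hypothetical helper: leading-zero count of 16 values at once.
   Only this function is compiled with AVX-512CD enabled. */
__attribute__((target("avx512cd")))
static void lzcnt16_avx512(const uint32_t *in, uint32_t *out)
{
    __m512i v = _mm512_loadu_si512(in);              /* unaligned load */
    _mm512_storeu_si512(out, _mm512_lzcnt_epi32(v)); /* AVX-512CD intrinsic */
}

/* Portable fallback compiled with the default (e.g. SSE2-only) options. */
static void lzcnt16_scalar(const uint32_t *in, uint32_t *out)
{
    for (int i = 0; i < 16; i++)
        out[i] = in[i] ? (uint32_t)__builtin_clz(in[i]) : 32u;
}

/* Hypothetical dispatcher: picks the implementation at run time. */
void lzcnt16(const uint32_t *in, uint32_t *out)
{
    if (__builtin_cpu_supports("avx512cd"))
        lzcnt16_avx512(in, out);
    else
        lzcnt16_scalar(in, out);
}
```

Callers use lzcnt16() only; the translation unit as a whole stays at the baseline ISA.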
From: luoyonggang at gmail dot com @ 2022-12-21 7:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191
--- Comment #2 from 罗勇刚(Yonggang Luo) <luoyonggang at gmail dot com> ---
(In reply to Jakub Jelinek from comment #1)
> You are lying to the compiler; don't. In GCC you can #include <x86intrin.h>
> with only SSE2 enabled and later, in a function marked say
> __attribute__((target ("avx512cd"))), use avx512f/avx512cd intrinsics. There
> is no need to do what you show above.
Can you be more specific and show me the code? Thanks :)
From: rguenth at gcc dot gnu.org @ 2022-12-21 7:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Resolution|--- |WONTFIX
Status|UNCONFIRMED |RESOLVED
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I suppose the issue will be that __attribute__((target)) isn't supported by
MSVC? But indeed this isn't something we are going to support. Note another
way is to put the functions into different translation units.
From: luoyonggang at gmail dot com @ 2022-12-21 7:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191
--- Comment #4 from 罗勇刚(Yonggang Luo) <luoyonggang at gmail dot com> ---
(In reply to Richard Biener from comment #3)
> I suppose the issue will be that __attribute__((target)) isn't supported by
> MSVC? But indeed this isn't something we are going to support. Note
> another way is to put the functions into different translation units.
GCC is enough; there is no need to care about MSVC. MSVC supports this without
the attribute, and we can use a macro to deal with that.
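
One way such a macro could look, as a sketch only. UTIL_TARGET is a hypothetical name, the MSVC branch relies on the statement above that MSVC accepts the intrinsics without any attribute, and unaligned loads are used here so the example does not depend on 64-byte aligned buffers:

```c
#if defined(__GNUC__) || defined(__clang__)
#  include <x86intrin.h>
/* GCC-style compilers: enable the ISA for the annotated function only. */
#  define UTIL_TARGET(isa) __attribute__((target(isa)))
#else
#  include <intrin.h>
/* Assumed: MSVC needs no per-function annotation. */
#  define UTIL_TARGET(isa)
#endif

/* c = a + b on 16 floats; the caller must check for AVX-512F support. */
UTIL_TARGET("avx512f")
void util_fadd_512(const float *a, const float *b, float *c)
{
    __m512 av = _mm512_loadu_ps(a);
    __m512 bv = _mm512_loadu_ps(b);
    _mm512_storeu_ps(c, _mm512_add_ps(av, bv));
}
```

With GCC this builds without -mavx512f and without any -D__AVX512F__ definitions.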
From: rguenth at gcc dot gnu.org @ 2022-12-21 8:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to 罗勇刚(Yonggang Luo) from comment #2)
> (In reply to Jakub Jelinek from comment #1)
> > You are lying to the compiler; don't. In GCC you can #include <x86intrin.h>
> > with only SSE2 enabled and later, in a function marked say
> > __attribute__((target ("avx512cd"))), use avx512f/avx512cd intrinsics. There
> > is no need to do what you show above.
>
> Can you be more specific, show me the code, thanks:)

#include <x86intrin.h>

int __attribute__((target("avx512f"))) foo(float f)
{
  __m128 m = _mm_set_ss(f);
  return _mm_cvtss_i32(m);
}

results in (with just SSE2):

foo:
.LFB6614:
        .cfi_startproc
        vcvtss2sil      %xmm0, %eax
        ret
From: luoyonggang at gmail dot com @ 2022-12-21 9:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191
--- Comment #6 from 罗勇刚(Yonggang Luo) <luoyonggang at gmail dot com> ---
Is the following valid usage? It compiles properly.
```
// compile args: -fPIC -O2 -D__SSE3__=1 -D__SSSE3__=1 -D__SSE4_1__=1
-D__SSE4_2__=1 -D__SSE4A__=1 -D__POPCNT__=1 -D__XSAVE__=1 -D__CRC32__=1
-D__AVX__=1 -D__AVX2__=1 -D__FP_FAST_FMAF32=1 -D__FP_FAST_FMAF64=1
-D__FP_FAST_FMAF=1 -D__FP_FAST_FMAF32x=1 -D__AVX512F__=1 -D__AVX512CD__=1
#include <math.h>
#pragma GCC push_options
#pragma GCC target("avx512f")
#pragma GCC target("avx512cd")
#pragma GCC target("sse4a")
#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <x86intrin.h>
#endif
#pragma GCC pop_options
#pragma GCC push_options
#pragma GCC target("avx512f")
#pragma GCC target("avx512cd")
#pragma GCC target("sse4a")
void util_fadd_512(float *a, float *b, float *c) {
  /* c = a + b */
  __m512 av = _mm512_load_ps(a);
  __m512 bv = _mm512_load_ps(b);
  __m512 cv = _mm512_add_ps(av, bv);
  _mm512_store_ps(c, cv);
}
static inline int
util_iround(float f)
{
  __m128 m = _mm_set_ss(f);
  return _mm_cvtss_i32(m);
}
#pragma GCC pop_options
int util_iround_outside(int x, float y) {
return x + util_iround(y);
}
float util_fadd(float a, float b) {
return a + b;
}
```
From: jakub at gcc dot gnu.org @ 2022-12-21 9:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to 罗勇刚(Yonggang Luo) from comment #6)
> Is the following valid usage? It compiles properly.
No, this is invalid.
>
> ```
>
> // compile args: -fPIC -O2 -D__SSE3__=1 -D__SSSE3__=1 -D__SSE4_1__=1
> -D__SSE4_2__=1 -D__SSE4A__=1 -D__POPCNT__=1 -D__XSAVE__=1 -D__CRC32__=1
> -D__AVX__=1 -D__AVX2__=1 -D__FP_FAST_FMAF32=1 -D__FP_FAST_FMAF64=1
> -D__FP_FAST_FMAF=1 -D__FP_FAST_FMAF32x=1 -D__AVX512F__=1 -D__AVX512CD__=1
Use only -fPIC -O2 here and none of the -D arguments: all of them are internal
GCC macros that users shouldn't redefine.
Besides, they aren't needed.
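
A small sketch of why the -D flags are unnecessary, assuming GCC with its integrated preprocessor and no -m or -D options on the command line: inside a #pragma GCC target region, GCC is expected to define the matching ISA macro on its own. AVX512F_SEEN_INSIDE and avx512f_seen_inside are hypothetical names used only for this illustration:

```c
#pragma GCC push_options
#pragma GCC target("avx512f")
/* Within this region GCC itself is expected to provide __AVX512F__. */
#ifdef __AVX512F__
#define AVX512F_SEEN_INSIDE 1
#else
#define AVX512F_SEEN_INSIDE 0
#endif
#pragma GCC pop_options

/* Hypothetical helper: reports what the preprocessor saw in the region. */
int avx512f_seen_inside(void)
{
    return AVX512F_SEEN_INSIDE;
}
```

GCC's own intrinsic headers wrap their sections in the same kind of pragma region, which is why including <x86intrin.h> works without extra flags.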
> #include <math.h>
>
> #pragma GCC push_options
> #pragma GCC target("avx512f")
> #pragma GCC target("avx512cd")
> #pragma GCC target("sse4a")
>
> #if defined(_MSC_VER)
> #include <intrin.h>
> #else
> #include <x86intrin.h>
> #endif
>
> #pragma GCC pop_options
You can do this, but for GCC it is completely useless; you can just
#include <x86intrin.h> without anything further.
> #pragma GCC push_options
> #pragma GCC target("avx512f")
> #pragma GCC target("avx512cd")
> #pragma GCC target("sse4a")
This is certainly fine, but avx512f isn't needed there, since it is implied by
avx512cd.
Though I don't see anything avx512cd- or sse4a-specific in the code.
>
> void util_fadd_512(float *a, float *b, float *c) {
>   /* c = a + b */
>   __m512 av = _mm512_load_ps(a);
>   __m512 bv = _mm512_load_ps(b);
>   __m512 cv = _mm512_add_ps(av, bv);
>   _mm512_store_ps(c, cv);
> }
> static inline int
> util_iround(float f)
> {
>   __m128 m = _mm_set_ss(f);
>   return _mm_cvtss_i32(m);
> }
>
> #pragma GCC pop_options
>
> int util_iround_outside(int x, float y) {
> return x + util_iround(y);
> }
> float util_fadd(float a, float b) {
> return a + b;
> }
> ```
That said, code compiled with an avx512cd etc. target won't be inlined into
code compiled without it.
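
In the code above that means util_iround() stays a real call when used from util_iround_outside(). One possible way to live with this, sketched under the assumption of GCC on x86, is to keep the whole loop inside the target function so the call boundary is crossed once per array rather than once per element. All names here (add_arrays, add_arrays_avx512, add_arrays_scalar) are hypothetical:

```c
#include <stddef.h>
#include <x86intrin.h>

/* The entire loop lives in the AVX-512 function, so nothing from it
   needs to be inlined into baseline code. */
__attribute__((target("avx512f")))
static void add_arrays_avx512(const float *a, const float *b, float *c, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512 av = _mm512_loadu_ps(a + i);
        __m512 bv = _mm512_loadu_ps(b + i);
        _mm512_storeu_ps(c + i, _mm512_add_ps(av, bv));
    }
    for (; i < n; i++)   /* tail elements that do not fill a full vector */
        c[i] = a[i] + b[i];
}

/* Baseline version for CPUs without AVX-512F. */
static void add_arrays_scalar(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* One runtime check and one non-inlined call per array. */
void add_arrays(const float *a, const float *b, float *c, size_t n)
{
    if (__builtin_cpu_supports("avx512f"))
        add_arrays_avx512(a, b, c, n);
    else
        add_arrays_scalar(a, b, c, n);
}
```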