From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/108191] Add support to usage of *intrin.h without -mavx512f -mavx512cd
Date: Wed, 21 Dec 2022 09:47:46 +0000
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: WONTFIX

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

--- Comment #7 from Jakub Jelinek ---
(In reply to 罗勇刚(Yonggang Luo) from comment #6)
> Is the following command are valid usage? It's compiled properly

No, this is invalid.
>
> ```
>
> // compile args: -fPIC -O2 -D__SSE3__=1 -D__SSSE3__=1 -D__SSE4_1__=1
> -D__SSE4_2__=1 -D__SSE4A__=1 -D__POPCNT__=1 -D__XSAVE__=1 -D__CRC32__=1
> -D__AVX__=1 -D__AVX2__=1 -D__FP_FAST_FMAF32=1 -D__FP_FAST_FMAF64=1
> -D__FP_FAST_FMAF=1 -D__FP_FAST_FMAF32x=1 -D__AVX512F__=1 -D__AVX512CD__=1

Only -fPIC -O2 here, none of the -D arguments, all of them are internal GCC
macros that shouldn't be redefined by users.  Plus it isn't needed.

> #include
>
> #pragma GCC push_options
> #pragma GCC target("avx512f")
> #pragma GCC target("avx512cd")
> #pragma GCC target("sse4a")
>
> #if defined(_MSC_VER)
> #include
> #else
> #include
> #endif
>
> #pragma GCC pop_options

You can do it, but for GCC it is completely useless, you can just
#include
without anything further.

> #pragma GCC push_options
> #pragma GCC target("avx512f")
> #pragma GCC target("avx512cd")
> #pragma GCC target("sse4a")

This is certainly fine, but avx512f in there isn't needed, that is implied by
avx512cd.  Though, I don't see anything avx512cd nor sse4a-ish in there.

>
> void util_fadd_512(float *a, float *b, float *c) {
>   /* a = b + c */
>   __m512 av = _mm512_load_ps(a);
>   __m512 bv = _mm512_load_ps(b);
>   __m512 cv = _mm512_add_ps(av, bv);
>   _mm512_store_ps(c, cv);
> }
> static inline int
> util_iround(float f)
> {
>   __m128 m = _mm_set_ss(f);
>   return _mm_cvtss_i32(m);
> }
>
> #pragma GCC pop_options
>
> int util_iround_outside(int x, float y) {
>   return x + util_iround(y);
> }
> float util_fadd(float a, float b) {
>   return a + b;
> }
> ```

That said, code with avx512cd etc. target won't inline into code without it.
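
To make the points above concrete, here is a minimal sketch, not code from the bug report. The helper names (add16_avx512, add16_scalar, add16, add16_selfcheck) are hypothetical, and the choice of immintrin.h and of a target attribute instead of the push_options/target pragmas is mine. It is meant to be built with plain -O2 and no -D or -m options. Only one function is compiled for AVX-512F, and it is reached through a runtime check. Because that function has a different target than its caller, it stays an out-of-line call rather than being inlined. Unaligned loads and stores are used so the sketch does not assume 64-byte aligned buffers.

```c
#include <assert.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: only this function is compiled for AVX-512F.
   It will not be inlined into callers built without that target. */
static __attribute__((target("avx512f"))) void
add16_avx512(const float *a, const float *b, float *c)
{
    __m512 av = _mm512_loadu_ps(a);
    __m512 bv = _mm512_loadu_ps(b);
    _mm512_storeu_ps(c, _mm512_add_ps(av, bv));   /* c = a + b */
}
#endif

/* Portable fallback for CPUs (or architectures) without AVX-512F. */
static void
add16_scalar(const float *a, const float *b, float *c)
{
    for (int i = 0; i < 16; i++)
        c[i] = a[i] + b[i];
}

/* Hypothetical dispatcher: picks the AVX-512F version only when the
   CPU running the program reports support for it. */
static void
add16(const float *a, const float *b, float *c)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx512f")) {
        add16_avx512(a, b, c);
        return;
    }
#endif
    add16_scalar(a, b, c);
}

/* Small self-check: both paths must agree with c[i] = a[i] + b[i]. */
static void
add16_selfcheck(void)
{
    float a[16], b[16], c[16];
    for (int i = 0; i < 16; i++) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
    }
    add16(a, b, c);
    for (int i = 0; i < 16; i++)
        assert(c[i] == 3.0f * (float)i);
}
```

If the whole translation unit is only ever run on AVX-512 hardware, the dispatcher is unnecessary and building that file with -mavx512f is the simpler route.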