From mboxrd@z Thu Jan  1 00:00:00 1970
From: "luoyonggang at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/108191] New: Add support to usage of *intrin.h without -mavx512f -mavx512cd
Date: Wed, 21 Dec 2022 04:48:24 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: luoyonggang at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone
Content-Type: text/plain; charset="UTF-8"
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

            Bug ID: 108191
           Summary: Add support to usage of *intrin.h without -mavx512f
                    -mavx512cd
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luoyonggang at gmail dot com
  Target Milestone: ---

The goal is to make the following command work:
```
gcc -fPIC -O2 -D__SSE3__=1 -D__SSSE3__=1 \
    -D__SSE4_1__=1 \
    -D__SSE4_2__=1 -D__SSE4A__=1 \
    -D__POPCNT__=1 -D__XSAVE__=1 -D__CRC32__=1 \
    -D__AVX__=1 -D__AVX2__=1 \
    -D__FP_FAST_FMAF32=1 \
    -D__FP_FAST_FMAF64=1 \
    -D__FP_FAST_FMAF=1 \
    -D__FP_FAST_FMAF32x=1 \
    -D__AVX512F__=1 -D__AVX512CD__=1 test.c
```

That is, the compiler should generate code for SSE2 only, while the source can
still #include the *intrin.h headers and choose the wider instruction sets
through runtime checks.

MSVC can already do this. If GCC supported it as well, we could reduce our use
of inline assembly, which MSVC (x64) does not support, and so reduce code
complexity.

The content of test.c is:
```
#if defined(_MSC_VER)
#include
#else
#include <x86intrin.h>
#endif
#include

static inline int
util_iround(float f)
{
   __m128 m = _mm_set_ss(f);
   return _mm_cvtss_i32(m);
}

int util_iround_outside(int x, float y) {
   return x + util_iround(y);
}
```

The compile errors look like this:
```
In file included from C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/immintrin.h:35,
                 from C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/x86intrin.h:32,
                 from test.c:4:
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h: In function '_mm_addsub_ps':
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:53:3: error: cannot convert a value of type 'int' to vector type '__vector(4) float' which has different size
   53 |   return (__m128) __builtin_ia32_addsubps ((__v4sf)__X, (__v4sf)__Y);
      |   ^~~~~~
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h: In function '_mm_hadd_ps':
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:59:3: error: cannot convert a value of type 'int' to vector type '__vector(4) float' which has different size
   59 |   return (__m128) __builtin_ia32_haddps ((__v4sf)__X, (__v4sf)__Y);
      |   ^~~~~~
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h: In function '_mm_hsub_ps':
C:/CI-Tools/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/pmmintrin.h:65:3: error: cannot convert a value of type 'int' to vector type '__vector(4) float' which has different size
   65 |   return (__m128) __builtin_ia32_hsubps ((__v4sf)__X, (__v4sf)__Y);
```
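
For comparison, here is a minimal sketch of the per-function route that GCC
already offers, assuming GCC 4.9 or later (or a compatible Clang) on x86. The
file is built with no -m flags, one function opts into AVX2 through the target
attribute, and __builtin_cpu_supports picks the path at run time. The
hand-written -D__SSE3__=1 style defines are left out, since they appear to
make the intrinsics headers skip their own target pragmas. The names sum8,
sum8_avx2 and sum8_scalar are hypothetical and are not part of test.c.

```c
#include <immintrin.h>

/* Hypothetical AVX2 path: the target attribute enables AVX2 (and the
   older SSE levels it implies) for this one function only. */
__attribute__((target("avx2")))
static int sum8_avx2(const int *v)
{
    __m256i x  = _mm256_loadu_si256((const __m256i *)v);
    __m128i lo = _mm256_castsi256_si128(x);       /* elements 0..3 */
    __m128i hi = _mm256_extracti128_si256(x, 1);  /* elements 4..7 */
    __m128i s  = _mm_add_epi32(lo, hi);           /* four partial sums */
    s = _mm_hadd_epi32(s, s);                     /* two partial sums */
    s = _mm_hadd_epi32(s, s);                     /* total in lane 0 */
    return _mm_cvtsi128_si32(s);
}

/* Hypothetical fallback, compiled for the baseline target. */
static int sum8_scalar(const int *v)
{
    int total = 0;
    for (int i = 0; i < 8; i++)
        total += v[i];
    return total;
}

/* Runtime dispatch: take the AVX2 path only when the CPU reports it. */
int sum8(const int *v)
{
    __builtin_cpu_init();  /* only strictly needed before constructors run */
    if (__builtin_cpu_supports("avx2"))
        return sum8_avx2(v);
    return sum8_scalar(v);
}
```

Built with a plain `gcc -O2`, sum8 on the eight values 1 through 8 returns 36
on either path. The cost compared with the request above is that every
function using wider intrinsics has to carry its own target attribute.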