From: "hbucher at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/109047] New: Harmonize __attribute__((target_clones)) requirement in function prototype
Date: Tue, 07 Mar 2023 06:01:03 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109047

            Bug ID: 109047
           Summary: Harmonize __attribute__((target_clones)) requirement in function prototype
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hbucher at gmail dot com
  Target Milestone: ---

Let's say I have a library that I want to share with multiple targets:

```c++
#include <array>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

__attribute__((target_clones("default","arch=core2","arch=znver2")))
Vector multiply(const Matrix& m, const Vector& v) {
    Vector r;
    r[0] = v[0] * m[0] + v[1] * m[2];
    r[1] = v[0] * m[1] + v[1] * m[3];
    return r;
}
```

and I want to use it as below:

```c++
#include <array>
#include <cstdio>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

Vector multiply(const Matrix& m, const Vector& v);

int main() {
    Matrix m{1,2,3,4};
    Vector v{1,2};
    Vector r = multiply(m,v);
    printf( "%f %f\n", r[0], r[1] );
}
```

Godbolt project: https://godbolt.org/z/3hd4MrzsG

GCC will happily compile and link the above two files together, but clang++
will complain about undefined references.

```
ld: CMakeFiles/example.dir/example.cpp.o: in function `main':
example.cpp:(.text+0x2a): undefined reference to `multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&)'
```

To make this work with clang++, I have to repeat all the targets in the
prototype as well:

```c++
#include <array>
#include <cstdio>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

__attribute__((target_clones("default","arch=core2","arch=znver2")))
Vector multiply(const Matrix& m, const Vector& v);

int main() {
    Matrix m{1,2,3,4};
    Vector v{1,2};
    Vector r = multiply(m,v);
    printf( "%f %f\n", r[0], r[1] );
}
```

Looking at the generated object files, it seems that the main difference is
that GCC generates an indirect link for the naked prototype, while clang
generates an indirect link to .ifunc, and that requires knowledge of the
target attributes.

```
$ diff nm.gcc nm.clang
3,4d2
< U _GLOBAL_OFFSET_TABLE_
< U __stack_chk_fail
6,13c4,11
< i multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&)
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_cascadelake.3]
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_core2.0
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_haswell.2]
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_sandybridge.1]
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver1.4
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver2.5
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .default.6]
---
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_cascadelake.3]
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_core2.0
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_haswell.2]
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_sandybridge.1]
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver1.4
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver2.5
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .default.6]
> i multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .ifunc]
20d17
< r std::piecewise_construct
```

Is this behavior by design?