public inbox for gcc-bugs@sourceware.org
* [Bug c++/109047] New: Harmonize __attribute__((target_clones)) requirement in function prototype
@ 2023-03-07  6:01 hbucher at gmail dot com
  2023-03-07  6:14 ` [Bug c++/109047] " pinskia at gcc dot gnu.org
  2023-03-07  6:29 ` pinskia at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: hbucher at gmail dot com @ 2023-03-07  6:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109047

            Bug ID: 109047
           Summary: Harmonize __attribute__((target_clones)) requirement
                    in function prototype
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hbucher at gmail dot com
  Target Milestone: ---

Let's say I have a library that I want to share with multiple targets:
```c++
#include <array>
#include <cstdio>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

__attribute__((target_clones("default","arch=core2","arch=znver2")))
Vector multiply(const Matrix& m, const Vector& v) {
    Vector r;
    r[0] = v[0] * m[0] + v[1] * m[2];
    r[1] = v[0] * m[1] + v[1] * m[3];
    return r;
}
```
and I want to use it as below:
```c++
#include <array>
#include <cstdio>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

Vector multiply(const Matrix& m, const Vector& v);

int main() {
    Matrix m{1,2,3,4};
    Vector v{1,2};
    Vector r = multiply(m,v);
    printf( "%f %f\n", r[0], r[1] );
}
```
Godbolt project: https://godbolt.org/z/3hd4MrzsG

GCC will be happy to compile and link the above two files together but clang++
will complain about undefined references. 
```
ld: CMakeFiles/example.dir/example.cpp.o: in function `main':
example.cpp:(.text+0x2a): undefined reference to `multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&)'
```
To make this work with clang++, I have to repeat all the targets in the prototype as well as in the definition:
```c++
#include <array>
#include <cstdio>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

__attribute__((target_clones("default","arch=core2","arch=znver2")))
Vector multiply(const Matrix& m, const Vector& v);

int main() {
    Matrix m{1,2,3,4};
    Vector v{1,2};
    Vector r = multiply(m,v);
    printf( "%f %f\n", r[0], r[1] );
}
```
Looking at the generated object files, it seems that the main difference is
that GCC generates an indirect link for the naked prototype, while clang
generates an indirect link to .ifunc, and that requires knowledge of the target
attributes.
```
$ diff nm.gcc nm.clang
3,4d2
<                 U _GLOBAL_OFFSET_TABLE_
<                 U __stack_chk_fail
6,13c4,11
< i multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&)
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_cascadelake.3]
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_core2.0
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_haswell.2]
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_sandybridge.1]
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver1.4
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver2.5
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .default.6]
---
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_cascadelake.3]
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_core2.0
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_haswell.2]
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_sandybridge.1]
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver1.4
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver2.5
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .default.6]
> i multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .ifunc]
20d17
< r std::piecewise_construct
```
Is this behavior by design?


* [Bug c++/109047] Harmonize __attribute__((target_clones)) requirement in function prototype
  2023-03-07  6:01 [Bug c++/109047] New: Harmonize __attribute__((target_clones)) requirement in function prototype hbucher at gmail dot com
@ 2023-03-07  6:14 ` pinskia at gcc dot gnu.org
  2023-03-07  6:29 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-07  6:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109047

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Yes, this is the correct behavior for this attribute.

https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Common-Function-Attributes.html#index-target_005fclones-function-attribute

"It also creates a resolver function (see the ifunc attribute above) that
dynamically selects a clone suitable for current architecture. The resolver is
created only if there is a usage of a function with target_clones attribute.
"

This has been this way since target_clones support was added in GCC 6:
r6-4443-g3b1661a9b93fe8 .

It looks like clang did not implement it the same way; please report it to them.
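
To illustrate the resolver idea described in the quoted documentation, here is a minimal hand-written sketch of runtime dispatch. It is not what GCC emits: target_clones relies on an ifunc symbol that the dynamic loader resolves, whereas this sketch selects a function pointer on first call. The names `multiply_default`, `multiply_avx2`, `resolve_multiply` and `MultiplyFn` are hypothetical, and the "avx2" variant simply forwards to the default one because a real clone would be the same source compiled for a different architecture.

```cpp
#include <array>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

// Baseline implementation, standing in for the "default" clone.
static Vector multiply_default(const Matrix& m, const Vector& v) {
    return {v[0] * m[0] + v[1] * m[2], v[0] * m[1] + v[1] * m[3]};
}

// Hypothetical stand-in for an architecture-specific clone. A real clone
// would be the same code compiled with different target options.
static Vector multiply_avx2(const Matrix& m, const Vector& v) {
    return multiply_default(m, v);
}

using MultiplyFn = Vector (*)(const Matrix&, const Vector&);

// Hand-written "resolver": picks an implementation based on the running CPU.
// The CPU query builtins are only available on x86, hence the guard.
static MultiplyFn resolve_multiply() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        return multiply_avx2;
    }
#endif
    return multiply_default;
}

// Public entry point with a plain prototype; callers need no attribute.
Vector multiply(const Matrix& m, const Vector& v) {
    static const MultiplyFn impl = resolve_multiply();  // resolved once
    return impl(m, v);
}
```

In this sketch the caller only ever sees the plain `multiply` prototype, which mirrors the point that the choice of targets is a detail of the definition rather than of the declaration.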


* [Bug c++/109047] Harmonize __attribute__((target_clones)) requirement in function prototype
  2023-03-07  6:01 [Bug c++/109047] New: Harmonize __attribute__((target_clones)) requirement in function prototype hbucher at gmail dot com
  2023-03-07  6:14 ` [Bug c++/109047] " pinskia at gcc dot gnu.org
@ 2023-03-07  6:29 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-07  6:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109047

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Moreover, the idea of this attribute is that you only need to say on the
definition whether there are going to be multiple targets. Otherwise you would
get different behavior across targets, and you would export different symbols
depending on whether a given version of the code uses the attribute. E.g. you
have a shared library with a static ABI.

