public inbox for libc-alpha@sourceware.org
From: Michael Meissner <meissner@linux.ibm.com>
To: Peter Bergner <bergner@linux.ibm.com>
Cc: Zack Weinberg <zack@owlfolio.org>,
	Richard Henderson <richard.henderson@linaro.org>,
	libc-alpha@sourceware.org,
	Michael Meissner <meissner@linux.ibm.com>
Subject: Re: Maybe we should get rid of ifuncs
Date: Wed, 1 May 2024 22:59:37 -0400
Message-ID: <ZjMBmQs042EvN8ZS@cowardly-lion.the-meissners.org>
In-Reply-To: <71a749ba-d843-424a-9a41-1d20f6be685c@linux.ibm.com>

On Sat, Apr 27, 2024 at 07:24:05PM -0500, Peter Bergner wrote:
> On 4/24/24 9:43 AM, Zack Weinberg wrote:
> > I'm very curious what the plan for function multiversioning in GCC
> > and LLVM is, and how close to declarative it gets.
> 
> GCC (at least on powerpc) already supports it via the target_clones
> attribute.  See gcc/testsuite/gcc.target/powerpc/clone*.c for examples.
> Basically, it looks like (from clone3.c):
> 
> __attribute__((target_clones("cpu=power10,cpu=power9,default")))
> long mod_func (long a, long b)
> {
>   return (a % b) + s;
> }
> 
> long mod_func_or (long a, long b, long c)
> {
>   return mod_func (a, b) | c;
> }
> 
> 
> Mike knows how this works better than I, but GCC automatically emits an
> ifunc resolver for the different clones and looks to use the HWCAP*
> architecture mask associated with the cpu we're compiling for.
> The "default" function being called in the case our ifunc resolver
> doesn't match any of the HWCAP* masks from the cpus we're compiling
> for.

Sorry, I've been in and out of the hospital with my wife.

> Mike, it seems like this is more of a "cpu" clone and not a true HWCAP
> test, so this specific thing doesn't (at least currently) work for
> something like __attribute__((target_clones("vsx,mma,default"))) ?
> Or did I misread the code?

There are 3 things GCC provides:

1) The ability to write an ifunc function.  Any call to func is always
indirect.  The loader calls resolver at program/shared library load to get the
address of the function to use:

	extern int func_power10 (void);
	extern int func_power9 (void);
	extern int func_default (void);

	int func (void) __attribute__ ((__ifunc__ ("resolver")));

	void *
	resolver (void)
	{
	  if (__builtin_cpu_supports ("arch_3_1"))
	    return (void *) func_power10;

	  else if (__builtin_cpu_supports ("arch_3_00"))
	    return (void *) func_power9;

	  else
	    return (void *) func_default;
	}
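
For readers without a powerpc toolchain, the resolve-once idea can be modelled
in portable C with a plain function pointer.  This is only a sketch: a real
ifunc is resolved by the dynamic loader rather than on first call, and every
demo_* name below is hypothetical, including demo_cpu_supports_fast, which
stands in for __builtin_cpu_supports.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_func_t)(void);

/* Counts resolver invocations so the "resolved once" behaviour is visible. */
int demo_resolver_calls = 0;

static int demo_func_fast(void)    { return 2; }
static int demo_func_default(void) { return 1; }

/* Hypothetical stand-in for __builtin_cpu_supports(); always answers "no". */
static int demo_cpu_supports_fast(void) { return 0; }

/* Plays the role of the ifunc resolver: picks one implementation. */
static demo_func_t demo_resolver(void)
{
  demo_resolver_calls++;
  if (demo_cpu_supports_fast())
    return demo_func_fast;
  return demo_func_default;
}

/* Cached choice; with a real ifunc the loader fills in this slot instead. */
static demo_func_t demo_func_impl = NULL;

int demo_func(void)
{
  if (demo_func_impl == NULL)
    demo_func_impl = demo_resolver();
  assert(demo_func_impl != NULL);
  return demo_func_impl();   /* every call is indirect */
}
```

Calling demo_func repeatedly runs the resolver only on the first call; later
calls go straight through the cached pointer.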

2) The ability to change the target defaults for a particular function:

	int func_power10 (void) __attribute__((__target__("cpu=power10")));

	int func_power10 (void)
	{
	  // this function will be compiled for power10
	}

GCC accepts the names inside __attribute__ either bare or with 2 prefix and 2
suffix underscores (e.g. target or __target__).  I prefer to always use the
underscore form in case the user has defined a 'target' macro, since the text
within attributes is subject to macro replacement.
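
A small illustration of the macro-replacement hazard, using the noinline
attribute as a portable stand-in for target (the macro and the function
demo_increment are hypothetical):

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical user macro that collides with an attribute name.  It is
   defined after the includes so that system headers are unaffected. */
#define noinline 1

/* Writing __attribute__((noinline)) here would expand to __attribute__((1))
   and fail to compile.  The underscored spelling is not touched by the macro. */
__attribute__((__noinline__))
int demo_increment(int x)
{
  assert(x < INT_MAX);   /* guard against signed overflow */
  return x + 1;
}
```

The same reasoning applies to __target__ and __target_clones__.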

An alternative is to use #pragmas to change the defaults for a bit:

	#pragma GCC push_options
	#pragma GCC target ("cpu=power10")

	int func_power10 (void)
	{
	  // compiled with power10 options
	}

	#pragma GCC target ("cpu=power9")

	int func_power9 (void)
	{
	  // compiled with power9 options
	}

	#pragma GCC pop_options

	int func_default (void)
	{
	  // compiled with the default options
	}


3) The ability to use target clones, where the compiler constructs the ifunc
function, and recompiles the function multiple times with different target
defaults.

	extern int func (void)
	  __attribute__((__target_clones__("cpu=power10,cpu=power9,default")));

	int func (void) {
	  // 3 versions of func are compiled along with an ifunc resolver.
	}

Note, 'default' must always be listed in the target clones.  Each clone can
only specify one option (i.e. you can't combine something like -mcpu=power9 and
-mtune=power10 in a single clone).  So in practice, only -mcpu=<xxx> options are
useful.

If we need finer-grained support, we could add -mcpu options that add or
subtract options.

The automatic ifunc resolver only looks at hwcap/hwcap2 bits, and it sorts the
clones so that it checks for power10 first, etc.  At present, we have target
clone support for:

	power6
	power7
	power8
	power9
	power10

Note since there is no real hwcap bit for power11, with my current patches for
power11, if you do:

	extern int func (void)
	  __attribute__((__target_clones__("cpu=power11,cpu=power10,cpu=power9,default")));

it will compile both power11 and power10 clones, but the resolver will only
call the power10 clone because we don't have a separate hwcap bit for power11
(that I know of).  If we do have a separate hwcap bit, it is easy to add
support for power11.
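
The newest-first check, and the power11 fall-through, can be sketched in plain
C as follows.  This is not the code GCC generates: the DEMO_CAP_* bits are
made-up values rather than the kernel's real hwcap/hwcap2 masks, and the table
and demo_pick_clone are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up capability bits for this sketch; NOT the real hwcap values. */
#define DEMO_CAP_POWER9   (1u << 0)
#define DEMO_CAP_POWER10  (1u << 1)

struct demo_clone {
  const char *name;
  unsigned required_cap;   /* 0 means no capability bit exists for this cpu */
};

/* Sorted newest first, mirroring the order the resolver checks in. */
static const struct demo_clone demo_clones[] = {
  { "power11", 0 },                  /* no separate bit, so never selected */
  { "power10", DEMO_CAP_POWER10 },
  { "power9",  DEMO_CAP_POWER9 },
};

/* Returns the name of the first clone whose bit is set, else "default". */
const char *demo_pick_clone(const struct demo_clone *clones, size_t n,
                            unsigned caps)
{
  assert(clones != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    unsigned need = clones[i].required_cap;
    if (need != 0 && (caps & need) == need)
      return clones[i].name;
  }
  return "default";
}

/* Convenience wrapper over the table above. */
const char *demo_pick(unsigned caps)
{
  return demo_pick_clone(demo_clones,
                         sizeof demo_clones / sizeof demo_clones[0], caps);
}
```

With both demo bits set, demo_pick returns "power10" even though a power11
entry is listed first, because that entry has no bit to test.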

Now, one thing that I thought had been done, but that appears to no longer be
done, is updating the predefined macros (i.e. _ARCH_PWR10, etc.) when compiling
each target clone, so #ifdefs that test them don't change per clone.

> 
> I'll note I'm pretty sure we (IBM/powerpc) have added ifunc usage to
> OpenBLAS and some other libraries outside of glibc.
> 
> 
> Peter
> 
> 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com
