Hi,

I have made the following changes in this new patch, which is attached:

* Use the target attribute itself to create function versions.
* Handle any number of ISA names and arch= args to the target attribute,
  generating the right dispatchers.
* Integrate with the CPU runtime detection checked in this week.
* Overload resolution: If the caller's target matches the target of any
  function version, a direct call to that version is generated, with no
  need to go through the dispatcher.

Patch also available for review here:
http://codereview.appspot.com/5752064

Thanks,
-Sri.

On Fri, Mar 9, 2012 at 12:04 PM, Sriraman Tallam wrote:
> Hi Richard,
>
>  Here is a more detailed overview of the front-end description:
>
> * Tracking decls that correspond to function versions of function
> name, say "foo":
>
> When the front-end sees a decl for "foo" with "targetv" attributes, it
> tags it as a function version. To prevent duplicate definition errors
> with other versions of "foo", I change the "decls_match" function in
> cp/decl.c to return false when 2 decls have the same signature but
> different targetv attributes. This makes all function versions of
> "foo" get added to the overload list of "foo".
>
> To expand further, different targetv attributes are detected by
> sorting the arguments to targetv and comparing the results.
>
> * Change the assembler names of the function versions.
>
> The front-end changes the assembler names of the function versions by
> tagging the sorted list of args to "targetv" to the function name of
> "foo". For example, the assembler name of "void foo () __attribute__
> ((targetv ("sse4")))" will become _Z3foov.sse4.
>
> * Separately group all function versions of "foo" together, in multiversion.c:
>
> File multiversion.c maintains a hashtab, decl_version_htab, that maps
> the default function decl of "foo" to the list of all other versions
> of this function "foo". This is meant to be used when creating the
> dispatcher for this function.
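To make the naming rule described above concrete, here is a minimal
standalone sketch in C. It is not the patch's code. It assumes the rule as
stated in the overview: '=' in the attribute string becomes '_', the
comma-separated args are sorted and joined with '_', and the result is
appended to the mangled name after a '.'. The helper name versioned_name is
made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings.  */
static int
cmp_str (const void *a, const void *b)
{
  return strcmp (*(char *const *) a, *(char *const *) b);
}

/* Hypothetical helper, not part of the patch.  Returns a malloc'd
   string "BASE.SUFFIX", where SUFFIX is ATTR with '=' replaced by '_'
   and its comma-separated tokens sorted and joined by '_'.  */
static char *
versioned_name (const char *base, const char *attr)
{
  size_t n = 0, cap = 1, i;
  char *copy, *tok, *out, **toks;

  assert (base != NULL && attr != NULL);

  copy = malloc (strlen (attr) + 1);
  assert (copy != NULL);
  strcpy (copy, attr);

  /* Replace '=' and count commas to bound the number of tokens.  */
  for (i = 0; copy[i] != '\0'; i++)
    {
      if (copy[i] == '=')
        copy[i] = '_';
      else if (copy[i] == ',')
        cap++;
    }

  toks = malloc (cap * sizeof *toks);
  assert (toks != NULL);

  /* strtok skips empty tokens, so count what it actually returns.  */
  for (tok = strtok (copy, ","); tok != NULL; tok = strtok (NULL, ","))
    toks[n++] = tok;
  qsort (toks, n, sizeof *toks, cmp_str);

  /* '.' plus the joined tokens never exceeds strlen (attr) + 1 chars.  */
  out = malloc (strlen (base) + strlen (attr) + 2);
  assert (out != NULL);
  strcpy (out, base);
  for (i = 0; i < n; i++)
    {
      strcat (out, i == 0 ? "." : "_");
      strcat (out, toks[i]);
    }

  free (toks);
  free (copy);
  return out;
}
```

With this sketch, versioned_name ("_Z3foov", "sse4") gives "_Z3foov.sse4",
and two attribute strings that list the same args in a different order give
the same name, which is what lets the sorted string identify a version.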
>
> * Overload resolution:
>
>  Function "build_over_call" in cp/call.c sees a call to function
> "foo", which is multi-versioned. The overload resolution happens in
> function "joust" in "cp/call.c". Here, the call to "foo" has all
> possible versions of "foo" as candidates. Currently, "joust" returns
> the default version of "foo" as the winning candidate. But
> "build_over_call" realizes that this is a versioned function and
> replaces the call-site of foo with an "ifunc" call for foo, by querying
> a function in "multiversion.c" which builds the ifunc decl. After
> this, all call-sites of "foo" contain the call to the ifunc.
>
> Notice that, for calls from an sse function to a versioned function
> with an sse variant, I can modify "joust" to return the "sse" function
> version rather than the default and not replace this call with an
> ifunc. To do this, I must pass the target attributes of the caller to
> "joust" and check if the target attributes also match any version.
>
> * Creating the dispatcher:
>
> The dispatcher is independently created in a new pass, called
> "pass_dispatch_versions", that runs immediately after cfg and cgraph are
> created. The dispatcher looks at all possible versions and queries the
> target to give it the CPU detection predicates it must use to dispatch
> each version. Then, the dispatcher body is created and the ifunc is
> mapped to use this dispatcher.
>
> Notice that only the dispatcher creation is done after the front-end.
> Everything else occurs in the front-end itself. I could have created
> the dispatcher also in the front-end. I did not do so because I
> thought keeping it as a separate pass made sense to easily add more
> dispatch mechanisms, for example replacing IFUNC, when it is not
> available, with control-flow that makes direct calls to the function
> versions. Also, making the dispatcher after "cfg" is created was easy.
>
> Thanks,
> -Sri.
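As a rough illustration of the two decisions described above (dispatch at
run time, or a direct call when the caller's target already matches a
version), here is a small portable sketch in plain C. It does not use IFUNC
and is not the code the patch generates. Everything in it is an assumption
made for the example: the struct layout, the names resolve_foo and
direct_callee, and the fake_cpu_is_* predicates, which stand in for
whatever CPU detection predicates the target hook would supply. The
versions return different values only so the selection is observable, and
targets are compared with a plain strcmp rather than on sorted attribute
strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*foo_fn) (void);

/* One entry per specialized version: its targetv string, the function
   itself, and a predicate standing in for the target's CPU check.  */
struct version
{
  const char *target;
  foo_fn fn;
  int (*cpu_matches) (void);
};

static int foo_default (void) { return 0; }
static int foo_corei7 (void) { return 1; }
static int foo_core2 (void) { return 2; }

/* Placeholder predicates: pretend we are running on a core2.  */
static int fake_cpu_is_corei7 (void) { return 0; }
static int fake_cpu_is_core2 (void) { return 1; }

static const struct version foo_versions[] = {
  { "arch=corei7", foo_corei7, fake_cpu_is_corei7 },
  { "arch=core2", foo_core2, fake_cpu_is_core2 },
};

#define N_VERSIONS (sizeof foo_versions / sizeof foo_versions[0])

/* Run-time dispatch: return the first version whose predicate holds,
   falling back to the default version.  */
static foo_fn
resolve_foo (void)
{
  size_t i;

  for (i = 0; i < N_VERSIONS; i++)
    if (foo_versions[i].cpu_matches ())
      return foo_versions[i].fn;
  return foo_default;
}

/* Compile-time choice: if the caller's target string equals that of a
   version, that version can be called directly.  NULL means the call
   has to go through the dispatcher.  */
static foo_fn
direct_callee (const char *caller_target)
{
  size_t i;

  if (caller_target == NULL)
    return NULL;
  for (i = 0; i < N_VERSIONS; i++)
    if (strcmp (caller_target, foo_versions[i].target) == 0)
      return foo_versions[i].fn;
  return NULL;
}
```

In this sketch a caller compiled for "arch=corei7" gets foo_corei7 from
direct_callee and needs no dispatch, while a caller with no target
attribute gets NULL and calls through resolve_foo, which picks foo_core2
under the fake predicates above.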
>
>
> On Wed, Mar 7, 2012 at 6:05 AM, Richard Guenther wrote:
>> On Wed, Mar 7, 2012 at 1:46 AM, Sriraman Tallam wrote:
>>> User directed Function Multiversioning (MV) via Function Overloading
>>> ====================================================================
>>>
>>> This patch adds support for user directed function MV via function overloading.
>>> For more detailed description:
>>> http://gcc.gnu.org/ml/gcc/2012-03/msg00074.html
>>>
>>>
>>> Here is an example program with function versions:
>>>
>>> int foo ();  /* Default version */
>>> int foo () __attribute__ ((targetv("arch=corei7")));/*Specialized for corei7 */
>>> int foo () __attribute__ ((targetv("arch=core2")));/*Specialized for core2 */
>>>
>>> int main ()
>>> {
>>>  int (*p)() = &foo;
>>>  return foo () + (*p)();
>>> }
>>>
>>> int foo ()
>>> {
>>>  return 0;
>>> }
>>>
>>> int __attribute__ ((targetv("arch=corei7")))
>>> foo ()
>>> {
>>>  return 0;
>>> }
>>>
>>> int __attribute__ ((targetv("arch=core2")))
>>> foo ()
>>> {
>>>  return 0;
>>> }
>>>
>>> The above example has foo defined 3 times, but all 3 definitions of foo are
>>> different versions of the same function. The calls to foo in main, directly and
>>> via a pointer, are calls to the multi-versioned function foo which is dispatched
>>> to the right foo at run-time.
>>>
>>> Function versions must have the same signature but must differ in the specifier
>>> string provided to a new attribute called "targetv", which is nothing but the
>>> target attribute with an extra specification to indicate a version. Any number
>>> of versions can be created using the targetv attribute but it is mandatory to
>>> have one function without the attribute, which is treated as the default
>>> version.
>>>
>>> The dispatching is done using the IFUNC mechanism to keep the dispatch overhead
>>> low. The compiler creates a dispatcher function which checks the CPU type and
>>> calls the right version of foo. The dispatching code checks for the platform
>>> type and calls the first version that matches. The default function is called if
>>> no specialized version is appropriate for execution.
>>>
>>> The pointer to foo is made to be the address of the dispatcher function, so that
>>> it is unique and calls made via the pointer also work correctly. The assembler
>>> names of the various versions of foo are made different, by tagging
>>> the specifier strings, to keep them unique.  A specific version can be called
>>> directly by creating an alias to its assembler name. For instance, to call the
>>> corei7 version directly, make an alias:
>>> int foo_corei7 () __attribute__((alias ("_Z3foov.arch_corei7")));
>>> and then call foo_corei7.
>>>
>>> Note that using IFUNC blocks inlining of versioned functions. I had implemented
>>> an optimization earlier to do hot path cloning to allow versioned functions to
>>> be inlined. Please see: http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02285.html
>>> In the next iteration, I plan to merge these two. With that, hot code paths with
>>> versioned functions will be cloned so that versioned functions can be inlined.
>>
>> Note that inlining of functions with the target attribute is limited as well,
>> but your issue is that of the indirect dispatch as ...
>>
>> You don't give an overview of the frontend implementation.  Thus I have
>> extracted the following
>>
>>  - the FE does not really know about the "overloading", nor can it directly
>>   resolve calls from a "sse" function to another "sse" function without going
>>   through the 2nd IFUNC
>>
>>  - cgraph also does not know about the "overloading", so it cannot do such
>>   "devirtualization" either
>>
>> you seem to have implemented something in between a pure frontend
>> solution and a proper middle-end solution.  For optimization and eventually
>> automatically selecting functions for cloning (like, callees of a manual "sse"
>> versioned function should be cloned?) it would be nice if the cgraph would
>> know about the different versions and their relationships (and the dispatcher).
>> Especially the cgraph code should know the functions are semantically
>> equivalent (I suppose we should require that).  The IFUNC should be
>> generated by cgraph / target code, similar to how we generate C++ thunks.
>>
>> Honza, any suggestions on what the FE side of such cgraph infrastructure
>> should look like and how we should encode the target bits?
>>
>> Thanks,
>> Richard.
>>
>>>        * doc/tm.texi.in: Add description for TARGET_DISPATCH_VERSION.
>>>        * doc/tm.texi: Regenerate.
>>>        * c-family/c-common.c (handle_targetv_attribute): New function.
>>>        * target.def (dispatch_version): New target hook.
>>>        * tree.h (DECL_FUNCTION_VERSIONED): New macro.
>>>        (tree_function_decl): New bit-field versioned_function.
>>>        * tree-pass.h (pass_dispatch_versions): New pass.
>>>        * multiversion.c: New file.
>>>        * multiversion.h: New file.
>>>        * cgraphunit.c: Include multiversion.h
>>>        (cgraph_finalize_function): Change assembler names of versioned
>>>        functions.
>>>        * cp/class.c: Include multiversion.h
>>>        (add_method): Aggregate function versions. Change assembler names of
>>>        versioned functions.
>>>        (resolve_address_of_overloaded_function): Match address of function
>>>        version with default function.  Return address of ifunc dispatcher
>>>        for address of versioned functions.
>>>        * cp/decl.c (decls_match): Make decls unmatched for versioned
>>>        functions.
>>>        (duplicate_decls): Remove ambiguity for versioned functions. Notify
>>>        of deleted function version decls.
>>>        (start_decl): Change assembler name of versioned functions.
>>>        (start_function): Change assembler name of versioned functions.
>>>        (cxx_comdat_group): Make comdat group of versioned functions be the
>>>        same.
>>>        * cp/semantics.c (expand_or_defer_fn_1): Mark as needed versioned >>>        functions that are also marked inline. >>>        * cp/decl2.c: Include multiversion.h >>>        (check_classfn): Check attributes of versioned functions for match. >>>        * cp/call.c: Include multiversion.h >>>        (build_over_call): Make calls to multiversioned functions to call the >>>        dispatcher. >>>        (joust): For calls to multi-versioned functions, make the default >>>        function win. >>>        * timevar.def (TV_MULTIVERSION_DISPATCH): New time var. >>>        * varasm.c (finish_aliases_1): Check if the alias points to a function >>>        with a body before giving an error. >>>        * Makefile.in: Add multiversion.o >>>        * passes.c: Add pass_dispatch_versions to the pass list. >>>        * config/i386/i386.c (add_condition_to_bb): New function. >>>        (get_builtin_code_for_version): New function. >>>        (ix86_dispatch_version): New function. >>>        (TARGET_DISPATCH_VERSION): New macro. >>>        * testsuite/g++.dg/mv1.C: New test. >>> >>> Index: doc/tm.texi >>> =================================================================== >>> --- doc/tm.texi (revision 184971) >>> +++ doc/tm.texi (working copy) >>> @@ -10995,6 +10995,14 @@ The result is another tree containing a simplified >>>  call's result.  If @var{ignore} is true the value will be ignored. >>>  @end deftypefn >>> >>> +@deftypefn {Target Hook} int TARGET_DISPATCH_VERSION (tree @var{dispatch_decl}, void *@var{fndecls}, basic_block *@var{empty_bb}) >>> +For multi-versioned function, this hook sets up the dispatcher. >>> +@var{dispatch_decl} is the function that will be used to dispatch the >>> +version. @var{fndecls} are the function choices for dispatch. >>> +@var{empty_bb} is an basic block in @var{dispatch_decl} where the >>> +code to do the dispatch will be added. 
>>> +@end deftypefn >>> + >>>  @deftypefn {Target Hook} {const char *} TARGET_INVALID_WITHIN_DOLOOP (const_rtx @var{insn}) >>> >>>  Take an instruction in @var{insn} and return NULL if it is valid within a >>> Index: doc/tm.texi.in >>> =================================================================== >>> --- doc/tm.texi.in      (revision 184971) >>> +++ doc/tm.texi.in      (working copy) >>> @@ -10873,6 +10873,14 @@ The result is another tree containing a simplified >>>  call's result.  If @var{ignore} is true the value will be ignored. >>>  @end deftypefn >>> >>> +@hook TARGET_DISPATCH_VERSION >>> +For multi-versioned function, this hook sets up the dispatcher. >>> +@var{dispatch_decl} is the function that will be used to dispatch the >>> +version. @var{fndecls} are the function choices for dispatch. >>> +@var{empty_bb} is an basic block in @var{dispatch_decl} where the >>> +code to do the dispatch will be added. >>> +@end deftypefn >>> + >>>  @hook TARGET_INVALID_WITHIN_DOLOOP >>> >>>  Take an instruction in @var{insn} and return NULL if it is valid within a >>> Index: c-family/c-common.c >>> =================================================================== >>> --- c-family/c-common.c (revision 184971) >>> +++ c-family/c-common.c (working copy) >>> @@ -315,6 +315,7 @@ static tree check_case_value (tree); >>>  static bool check_case_bounds (tree, tree, tree *, tree *); >>> >>>  static tree handle_packed_attribute (tree *, tree, tree, int, bool *); >>> +static tree handle_targetv_attribute (tree *, tree, tree, int, bool *); >>>  static tree handle_nocommon_attribute (tree *, tree, tree, int, bool *); >>>  static tree handle_common_attribute (tree *, tree, tree, int, bool *); >>>  static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *); >>> @@ -604,6 +605,8 @@ const struct attribute_spec c_common_attribute_tab >>>  { >>>   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler, >>>        affects_type_identity } */ >>> +  { 
"targetv",               1, -1, true, false, false, >>> +                             handle_targetv_attribute, false }, >>>   { "packed",                 0, 0, false, false, false, >>>                              handle_packed_attribute , false}, >>>   { "nocommon",               0, 0, true,  false, false, >>> @@ -5869,6 +5872,54 @@ handle_packed_attribute (tree *node, tree name, tr >>>   return NULL_TREE; >>>  } >>> >>> +/* The targetv attribue is used to specify a function version >>> +   targeted to specific platform types.  The "targetv" attributes >>> +   have to be valid "target" attributes.  NODE should always point >>> +   to a FUNCTION_DECL.  ARGS contain the arguments to "targetv" >>> +   which should be valid arguments to attribute "target" too. >>> +   Check handle_target_attribute for FLAGS and NO_ADD_ATTRS.  */ >>> + >>> +static tree >>> +handle_targetv_attribute (tree *node, tree name, >>> +                         tree args, >>> +                         int flags, >>> +                         bool *no_add_attrs) >>> +{ >>> +  const char *attr_str = NULL; >>> +  gcc_assert (TREE_CODE (*node) == FUNCTION_DECL); >>> +  gcc_assert (args != NULL); >>> + >>> +  /* This is a function version.  */ >>> +  DECL_FUNCTION_VERSIONED (*node) = 1; >>> + >>> +  attr_str = TREE_STRING_POINTER (TREE_VALUE (args)); >>> + >>> +  /* Check if multiple sets of target attributes are there.  This >>> +     is not supported now.   In future, this will be supported by >>> +     cloning this function for each set.  */ >>> +  if (TREE_CHAIN (args) != NULL) >>> +    warning (OPT_Wattributes, "%qE attribute has multiple sets which " >>> +            "is not supported", name); >>> + >>> +  if (attr_str == NULL >>> +      || strstr (attr_str, "arch=") == NULL) >>> +    error_at (DECL_SOURCE_LOCATION (*node), >>> +             "Versioning supported only on \"arch=\" for now"); >>> + >>> +  /* targetv attributes must translate into target attributes.  
*/ >>> +  handle_target_attribute (node, get_identifier ("target"), args, flags, >>> +                          no_add_attrs); >>> + >>> +  if (*no_add_attrs) >>> +    warning (OPT_Wattributes, "%qE attribute has no effect", name); >>> + >>> +  /* This is necessary to keep the attribute tagged to the decl >>> +     all the time.  */ >>> +  *no_add_attrs = false; >>> + >>> +  return NULL_TREE; >>> +} >>> + >>>  /* Handle a "nocommon" attribute; arguments as in >>>    struct attribute_spec.handler.  */ >>> >>> Index: target.def >>> =================================================================== >>> --- target.def  (revision 184971) >>> +++ target.def  (working copy) >>> @@ -1249,6 +1249,15 @@ DEFHOOK >>>  tree, (tree fndecl, int n_args, tree *argp, bool ignore), >>>  hook_tree_tree_int_treep_bool_null) >>> >>> +/* Target hook to generate the dispatching code for calls to multi-versioned >>> +   functions.  DISPATCH_DECL is the function that will have the dispatching >>> +   logic.  FNDECLS are the list of choices for dispatch and EMPTY_BB is the >>> +   basic bloc in DISPATCH_DECL which will contain the code.  */ >>> +DEFHOOK >>> +(dispatch_version, >>> + "", >>> + int, (tree dispatch_decl, void *fndecls, basic_block *empty_bb), NULL) >>> + >>>  /* Returns a code for a target-specific builtin that implements >>>    reciprocal of the function, or NULL_TREE if not available.  */ >>>  DEFHOOK >>> Index: tree.h >>> =================================================================== >>> --- tree.h      (revision 184971) >>> +++ tree.h      (working copy) >>> @@ -3532,6 +3532,12 @@ extern VEC(tree, gc) **decl_debug_args_insert (tre >>>  #define DECL_FUNCTION_SPECIFIC_OPTIMIZATION(NODE) \ >>>    (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_optimization) >>> >>> +/* In FUNCTION_DECL, this is set if this function has other versions generated >>> +   using "targetv" attributes.  
The default version is the one which does not >>> +   have any "targetv" attribute set. */ >>> +#define DECL_FUNCTION_VERSIONED(NODE)\ >>> +   (FUNCTION_DECL_CHECK (NODE)->function_decl.versioned_function) >>> + >>>  /* FUNCTION_DECL inherits from DECL_NON_COMMON because of the use of the >>>    arguments/result/saved_tree fields by front ends.   It was either inherit >>>    FUNCTION_DECL from non_common, or inherit non_common from FUNCTION_DECL, >>> @@ -3576,8 +3582,8 @@ struct GTY(()) tree_function_decl { >>>   unsigned looping_const_or_pure_flag : 1; >>>   unsigned has_debug_args_flag : 1; >>>   unsigned tm_clone_flag : 1; >>> - >>> -  /* 1 bit left */ >>> +  unsigned versioned_function : 1; >>> +  /* No bits left.  */ >>>  }; >>> >>>  /* The source language of the translation-unit.  */ >>> Index: tree-pass.h >>> =================================================================== >>> --- tree-pass.h (revision 184971) >>> +++ tree-pass.h (working copy) >>> @@ -455,6 +455,7 @@ extern struct gimple_opt_pass pass_tm_memopt; >>>  extern struct gimple_opt_pass pass_tm_edges; >>>  extern struct gimple_opt_pass pass_split_functions; >>>  extern struct gimple_opt_pass pass_feedback_split_functions; >>> +extern struct gimple_opt_pass pass_dispatch_versions; >>> >>>  /* IPA Passes */ >>>  extern struct simple_ipa_opt_pass pass_ipa_lower_emutls; >>> Index: multiversion.c >>> =================================================================== >>> --- multiversion.c      (revision 0) >>> +++ multiversion.c      (revision 0) >>> @@ -0,0 +1,798 @@ >>> +/* Function Multiversioning. >>> +   Copyright (C) 2012 Free Software Foundation, Inc. >>> +   Contributed by Sriraman Tallam (tmsriram@google.com) >>> + >>> +This file is part of GCC. >>> + >>> +GCC is free software; you can redistribute it and/or modify it under >>> +the terms of the GNU General Public License as published by the Free >>> +Software Foundation; either version 3, or (at your option) any later >>> +version. 
>>> + >>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY >>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or >>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License >>> +for more details. >>> + >>> +You should have received a copy of the GNU General Public License >>> +along with GCC; see the file COPYING3.  If not see >>> +. */ >>> + >>> +/* Holds the state for multi-versioned functions here. The front-end >>> +   updates the state as and when function versions are encountered. >>> +   This is then used to generate the dispatch code.  Also, the >>> +   optimization passes to clone hot paths involving versioned functions >>> +   will be done here. >>> + >>> +   Function versions are created by using the same function signature but >>> +   also tagging attribute "targetv" to specify the platform type for which >>> +   the version must be executed.  Here is an example: >>> + >>> +   int foo () >>> +   { >>> +     printf ("Execute as default"); >>> +     return 0; >>> +   } >>> + >>> +   int  __attribute__ ((targetv ("arch=corei7"))) >>> +   foo () >>> +   { >>> +     printf ("Execute for corei7"); >>> +     return 0; >>> +   } >>> + >>> +   int main () >>> +   { >>> +     return foo (); >>> +   } >>> + >>> +   The call to foo in main is replaced with a call to an IFUNC function that >>> +   contains the dispatch code to call the correct function version at >>> +   run-time.  
*/ >>> + >>> + >>> +#include "config.h" >>> +#include "system.h" >>> +#include "coretypes.h" >>> +#include "tm.h" >>> +#include "tree.h" >>> +#include "tree-inline.h" >>> +#include "langhooks.h" >>> +#include "flags.h" >>> +#include "cgraph.h" >>> +#include "diagnostic.h" >>> +#include "toplev.h" >>> +#include "timevar.h" >>> +#include "params.h" >>> +#include "fibheap.h" >>> +#include "intl.h" >>> +#include "tree-pass.h" >>> +#include "hashtab.h" >>> +#include "coverage.h" >>> +#include "ggc.h" >>> +#include "tree-flow.h" >>> +#include "rtl.h" >>> +#include "ipa-prop.h" >>> +#include "basic-block.h" >>> +#include "toplev.h" >>> +#include "dbgcnt.h" >>> +#include "tree-dump.h" >>> +#include "output.h" >>> +#include "vecprim.h" >>> +#include "gimple-pretty-print.h" >>> +#include "ipa-inline.h" >>> +#include "target.h" >>> +#include "multiversion.h" >>> + >>> +typedef void * void_p; >>> + >>> +DEF_VEC_P (void_p); >>> +DEF_VEC_ALLOC_P (void_p, heap); >>> + >>> +/* Each function decl that is a function version gets an instance of this >>> +   structure.   Since this is called by the front-end, decl merging can >>> +   happen, where a decl created for a new declaration is merged with >>> +   the old. In this case, the new decl is deleted and the IS_DELETED >>> +   field is set for the struct instance corresponding to the new decl. >>> +   IFUNC_DECL is the decl of the ifunc function for default decls. >>> +   IFUNC_RESOLVER_DECL is the decl of the dispatch function.  VERSIONS >>> +   is a vector containing the list of function versions  that are >>> +   the candidates for dispatch.  */ >>> + >>> +typedef struct version_function_d { >>> +  tree decl; >>> +  tree ifunc_decl; >>> +  tree ifunc_resolver_decl; >>> +  VEC (void_p, heap) *versions; >>> +  bool is_deleted; >>> +} version_function; >>> + >>> +/* Hashmap has an entry for every function decl that has other function >>> +   versions.  
For function decls that are the default, it also stores the >>> +   list of all the other function versions.  Each entry is a structure >>> +   of type version_function_d.  */ >>> +static htab_t decl_version_htab = NULL; >>> + >>> +/* Hashtable helpers for decl_version_htab. */ >>> + >>> +static hashval_t >>> +decl_version_htab_hash_descriptor (const void *p) >>> +{ >>> +  const version_function *t = (const version_function *) p; >>> +  return htab_hash_pointer (t->decl); >>> +} >>> + >>> +/* Hashtable helper for decl_version_htab. */ >>> + >>> +static int >>> +decl_version_htab_eq_descriptor (const void *p1, const void *p2) >>> +{ >>> +  const version_function *t1 = (const version_function *) p1; >>> +  return htab_eq_pointer ((const void_p) t1->decl, p2); >>> +} >>> + >>> +/* Create the decl_version_htab.  */ >>> +static void >>> +create_decl_version_htab (void) >>> +{ >>> +  if (decl_version_htab == NULL) >>> +    decl_version_htab = htab_create (10, decl_version_htab_hash_descriptor, >>> +                                    decl_version_htab_eq_descriptor, NULL); >>> +} >>> + >>> +/* Creates an instance of version_function for decl DECL.  */ >>> + >>> +static version_function* >>> +new_version_function (const tree decl) >>> +{ >>> +  version_function *v; >>> +  v = (version_function *)xmalloc(sizeof (version_function)); >>> +  v->decl = decl; >>> +  v->ifunc_decl = NULL; >>> +  v->ifunc_resolver_decl = NULL; >>> +  v->versions = NULL; >>> +  v->is_deleted = false; >>> +  return v; >>> +} >>> + >>> +/* Comparator function to be used in qsort routine to sort attribute >>> +   specification strings to "targetv".  */ >>> + >>> +static int >>> +attr_strcmp (const void *v1, const void *v2) >>> +{ >>> +  const char *c1 = *(char *const*)v1; >>> +  const char *c2 = *(char *const*)v2; >>> +  return strcmp (c1, c2); >>> +} >>> + >>> +/* STR is the argument to targetv attribute.  
This function tokenizes >>> +   the comma separated arguments, sorts them and returns a string which >>> +   is a unique identifier for the comma separated arguments.  */ >>> + >>> +static char * >>> +sorted_attr_string (const char *str) >>> +{ >>> +  char **args = NULL; >>> +  char *attr_str, *ret_str; >>> +  char *attr = NULL; >>> +  unsigned int argnum = 1; >>> +  unsigned int i; >>> + >>> +  for (i = 0; i < strlen (str); i++) >>> +    if (str[i] == ',') >>> +      argnum++; >>> + >>> +  attr_str = (char *)xmalloc (strlen (str) + 1); >>> +  strcpy (attr_str, str); >>> + >>> +  for (i = 0; i < strlen (attr_str); i++) >>> +    if (attr_str[i] == '=') >>> +      attr_str[i] = '_'; >>> + >>> +  if (argnum == 1) >>> +    return attr_str; >>> + >>> +  args = (char **)xmalloc (argnum * sizeof (char *)); >>> + >>> +  i = 0; >>> +  attr = strtok (attr_str, ","); >>> +  while (attr != NULL) >>> +    { >>> +      args[i] = attr; >>> +      i++; >>> +      attr = strtok (NULL, ","); >>> +    } >>> + >>> +  qsort (args, argnum, sizeof (char*), attr_strcmp); >>> + >>> +  ret_str = (char *)xmalloc (strlen (str) + 1); >>> +  strcpy (ret_str, args[0]); >>> +  for (i = 1; i < argnum; i++) >>> +    { >>> +      strcat (ret_str, "_"); >>> +      strcat (ret_str, args[i]); >>> +    } >>> + >>> +  free (args); >>> +  free (attr_str); >>> +  return ret_str; >>> +} >>> + >>> +/* Returns true when only one of DECL1 and DECL2 is marked with "targetv" >>> +   or if the "targetv" attribute strings of DECL1 and DECL2 dont match.  
*/ >>> + >>> +bool >>> +has_different_version_attributes (const tree decl1, const tree decl2) >>> +{ >>> +  tree attr1, attr2; >>> +  char *c1, *c2; >>> +  bool ret = false; >>> + >>> +  if (TREE_CODE (decl1) != FUNCTION_DECL >>> +      || TREE_CODE (decl2) != FUNCTION_DECL) >>> +    return false; >>> + >>> +  attr1 = lookup_attribute ("targetv", DECL_ATTRIBUTES (decl1)); >>> +  attr2 = lookup_attribute ("targetv", DECL_ATTRIBUTES (decl2)); >>> + >>> +  if (attr1 == NULL_TREE && attr2 == NULL_TREE) >>> +    return false; >>> + >>> +  if ((attr1 == NULL_TREE && attr2 != NULL_TREE) >>> +      || (attr1 != NULL_TREE && attr2 == NULL_TREE)) >>> +    return true; >>> + >>> +  c1 = sorted_attr_string ( >>> +       TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr1)))); >>> +  c2 = sorted_attr_string ( >>> +       TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr2)))); >>> + >>> +  if (strcmp (c1, c2) != 0) >>> +     ret = true; >>> + >>> +  free (c1); >>> +  free (c2); >>> + >>> +  return ret; >>> +} >>> + >>> +/* If this decl corresponds to a function and has "targetv" attribute, >>> +   append the attribute string to its assembler name.  
*/ >>> + >>> +void >>> +version_assembler_name (const tree decl) >>> +{ >>> +  tree version_attr; >>> +  const char *orig_name, *version_string, *attr_str; >>> +  char *assembler_name; >>> +  tree assembler_name_tree; >>> + >>> +  if (TREE_CODE (decl) != FUNCTION_DECL >>> +      || DECL_ASSEMBLER_NAME_SET_P (decl) >>> +      || !DECL_FUNCTION_VERSIONED (decl)) >>> +    return; >>> + >>> +  if (DECL_DECLARED_INLINE_P (decl) >>> +      &&lookup_attribute ("gnu_inline", >>> +                         DECL_ATTRIBUTES (decl))) >>> +    error_at (DECL_SOURCE_LOCATION (decl), >>> +             "Function versions cannot be marked as gnu_inline," >>> +             " bodies have to be generated\n"); >>> + >>> +  if (DECL_VIRTUAL_P (decl) >>> +      || DECL_VINDEX (decl)) >>> +    error_at (DECL_SOURCE_LOCATION (decl), >>> +             "Virtual function versioning not supported\n"); >>> + >>> +  version_attr = lookup_attribute ("targetv", DECL_ATTRIBUTES (decl)); >>> +  /* targetv attribute string is NULL for default functions.  */ >>> +  if (version_attr == NULL_TREE) >>> +    return; >>> + >>> +  orig_name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); >>> +  version_string >>> +    = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (version_attr))); >>> + >>> +  attr_str = sorted_attr_string (version_string); >>> +  assembler_name = (char *) xmalloc (strlen (orig_name) >>> +                                    + strlen (attr_str) + 2); >>> + >>> +  sprintf (assembler_name, "%s.%s", orig_name, attr_str); >>> +  if (dump_file) >>> +    fprintf (dump_file, "Assembler name set to %s for function version %s\n", >>> +            assembler_name, IDENTIFIER_POINTER (DECL_NAME (decl))); >>> +  assembler_name_tree = get_identifier (assembler_name); >>> +  SET_DECL_ASSEMBLER_NAME (decl, assembler_name_tree); >>> +} >>> + >>> +/* Returns true if decl is multi-versioned and DECL is the default function, >>> +   that is it is not tagged with "targetv" attribute.  
*/ >>> + >>> +bool >>> +is_default_function (const tree decl) >>> +{ >>> +  return (TREE_CODE (decl) == FUNCTION_DECL >>> +         && DECL_FUNCTION_VERSIONED (decl) >>> +         && (lookup_attribute ("targetv", DECL_ATTRIBUTES (decl)) >>> +             == NULL_TREE)); >>> +} >>> + >>> +/* For function decl DECL, find the version_function struct in the >>> +   decl_version_htab.  */ >>> + >>> +static version_function * >>> +find_function_version (const tree decl) >>> +{ >>> +  void *slot; >>> + >>> +  if (!DECL_FUNCTION_VERSIONED (decl)) >>> +    return NULL; >>> + >>> +  if (!decl_version_htab) >>> +    return NULL; >>> + >>> +  slot = htab_find_with_hash (decl_version_htab, decl, >>> +                              htab_hash_pointer (decl)); >>> + >>> +  if (slot != NULL) >>> +    return (version_function *)slot; >>> + >>> +  return NULL; >>> +} >>> + >>> +/* Record DECL as a function version by creating a version_function struct >>> +   for it and storing it in the hashtable.  */ >>> + >>> +static version_function * >>> +add_function_version (const tree decl) >>> +{ >>> +  void **slot; >>> +  version_function *v; >>> + >>> +  if (!DECL_FUNCTION_VERSIONED (decl)) >>> +    return NULL; >>> + >>> +  create_decl_version_htab (); >>> + >>> +  slot = htab_find_slot_with_hash (decl_version_htab, (const void_p)decl, >>> +                                   htab_hash_pointer ((const void_p)decl), >>> +                                  INSERT); >>> + >>> +  if (*slot != NULL) >>> +    return (version_function *)*slot; >>> + >>> +  v = new_version_function (decl); >>> +  *slot = v; >>> + >>> +  return v; >>> +} >>> + >>> +/* Push V into VEC only if it is not already present.  
*/ >>> + >>> +static void >>> +push_function_version (version_function *v, VEC (void_p, heap) *vec) >>> +{ >>> +  int ix; >>> +  void_p ele; >>> +  for (ix = 0; VEC_iterate (void_p, vec, ix, ele); ++ix) >>> +    { >>> +      if (ele == (void_p)v) >>> +        return; >>> +    } >>> + >>> +  VEC_safe_push (void_p, heap, vec, (void*)v); >>> +} >>> + >>> +/* Mark DECL as deleted.  This is called by the front-end when a duplicate >>> +   decl is merged with the original decl and the duplicate decl is deleted. >>> +   This function marks the duplicate_decl as invalid.  Called by >>> +   duplicate_decls in cp/decl.c.  */ >>> + >>> +void >>> +mark_delete_decl_version (const tree decl) >>> +{ >>> +  version_function *decl_v; >>> + >>> +  decl_v = find_function_version (decl); >>> + >>> +  if (decl_v == NULL) >>> +    return; >>> + >>> +  decl_v->is_deleted = true; >>> + >>> +  if (is_default_function (decl) >>> +      && decl_v->versions != NULL) >>> +    { >>> +      VEC_truncate (void_p, decl_v->versions, 0); >>> +      VEC_free (void_p, heap, decl_v->versions); >>> +    } >>> +} >>> + >>> +/* Mark DECL1 and DECL2 to be function versions in the same group.  One >>> +   of DECL1 and DECL2 must be the default, otherwise this function does >>> +   nothing.  This function aggregates the versions.  */ >>> + >>> +int >>> +group_function_versions (const tree decl1, const tree decl2) >>> +{ >>> +  tree default_decl, version_decl; >>> +  version_function *default_v, *version_v; >>> + >>> +  gcc_assert (DECL_FUNCTION_VERSIONED (decl1) >>> +             && DECL_FUNCTION_VERSIONED (decl2)); >>> + >>> +  /* The version decls are added only to the default decl.  */ >>> +  if (!is_default_function (decl1) >>> +      && !is_default_function (decl2)) >>> +    return 0; >>> + >>> +  /* This can happen with duplicate declarations.  Just ignore.  
*/ >>> +  if (is_default_function (decl1) >>> +      && is_default_function (decl2)) >>> +    return 0; >>> + >>> +  default_decl = (is_default_function (decl1)) ? decl1 : decl2; >>> +  version_decl = (default_decl == decl1) ? decl2 : decl1; >>> + >>> +  gcc_assert (default_decl != version_decl); >>> +  create_decl_version_htab (); >>> + >>> +  /* If the version function is found, it has been added.  */ >>> +  if (find_function_version (version_decl)) >>> +    return 0; >>> + >>> +  default_v = add_function_version (default_decl); >>> +  version_v = add_function_version (version_decl); >>> + >>> +  if (default_v->versions == NULL) >>> +    default_v->versions = VEC_alloc (void_p, heap, 1); >>> + >>> +  push_function_version (version_v, default_v->versions); >>> +  return 0; >>> +} >>> + >>> +/* Makes a function attribute of the form NAME(ARG_NAME) and chains >>> +   it to CHAIN.  */ >>> + >>> +static tree >>> +make_attribute (const char *name, const char *arg_name, tree chain) >>> +{ >>> +  tree attr_name; >>> +  tree attr_arg_name; >>> +  tree attr_args; >>> +  tree attr; >>> + >>> +  attr_name = get_identifier (name); >>> +  attr_arg_name = build_string (strlen (arg_name), arg_name); >>> +  attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE); >>> +  attr = tree_cons (attr_name, attr_args, chain); >>> +  return attr; >>> +} >>> + >>> +/* Return a new name by appending SUFFIX to the DECL name.  If >>> +   make_unique is true, append the full path name.  */ >>> + >>> +static char * >>> +make_name (tree decl, const char *suffix, bool make_unique) >>> +{ >>> +  char *global_var_name; >>> +  int name_len; >>> +  const char *name; >>> +  const char *unique_name = NULL; >>> + >>> +  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); >>> + >>> +  /* Get a unique name that can be used globally without any chances >>> +     of collision at link time.  
*/ >>> +  if (make_unique) >>> +    unique_name = IDENTIFIER_POINTER (get_file_function_name ("\0")); >>> + >>> +  name_len = strlen (name) + strlen (suffix) + 2; >>> + >>> +  if (make_unique) >>> +    name_len += strlen (unique_name) + 1; >>> +  global_var_name = (char *) xmalloc (name_len); >>> + >>> +  /* Use '.' to concatenate names as it is demangler friendly.  */ >>> +  if (make_unique) >>> +      snprintf (global_var_name, name_len, "%s.%s.%s", name, >>> +               unique_name, suffix); >>> +  else >>> +      snprintf (global_var_name, name_len, "%s.%s", name, suffix); >>> + >>> +  return global_var_name; >>> +} >>> + >>> +/* Make the resolver function decl for ifunc (IFUNC_DECL) to dispatch >>> +   the versions of multi-versioned function DEFAULT_DECL.  Create an >>> +   empty basic block in the resolver and store the pointer in >>> +   EMPTY_BB.  Return the decl of the resolver function.  */ >>> + >>> +static tree >>> +make_ifunc_resolver_func (const tree default_decl, >>> +                         const tree ifunc_decl, >>> +                         basic_block *empty_bb) >>> +{ >>> +  char *resolver_name; >>> +  tree decl, type, decl_name, t; >>> +  basic_block new_bb; >>> +  tree old_current_function_decl; >>> +  bool make_unique = false; >>> + >>> +  /* IFUNCs have to be globally visible.  So, if the default_decl is >>> +     not, then the name of the IFUNC should be made unique.  */ >>> +  if (TREE_PUBLIC (default_decl) == 0) >>> +    make_unique = true; >>> + >>> +  /* Append the filename to the resolver function if the versions are >>> +     not externally visible.  This is because the resolver function has >>> +     to be externally visible for the loader to find it.  So, appending >>> +     the filename will prevent conflicts with a resolver function from >>> +     another module which is based on the same version name.  
*/ >>> +  resolver_name = make_name (default_decl, "resolver", make_unique); >>> + >>> +  /* The resolver function should return a (void *). */ >>> +  type = build_function_type_list (ptr_type_node, NULL_TREE); >>> + >>> +  decl = build_fn_decl (resolver_name, type); >>> +  decl_name = get_identifier (resolver_name); >>> +  SET_DECL_ASSEMBLER_NAME (decl, decl_name); >>> + >>> +  DECL_NAME (decl) = decl_name; >>> +  TREE_USED (decl) = TREE_USED (default_decl); >>> +  DECL_ARTIFICIAL (decl) = 1; >>> +  DECL_IGNORED_P (decl) = 0; >>> +  /* IFUNC resolvers have to be externally visible.  */ >>> +  TREE_PUBLIC (decl) = 1; >>> +  DECL_UNINLINABLE (decl) = 1; >>> + >>> +  DECL_EXTERNAL (decl) = DECL_EXTERNAL (default_decl); >>> +  DECL_EXTERNAL (ifunc_decl) = 0; >>> + >>> +  DECL_CONTEXT (decl) = NULL_TREE; >>> +  DECL_INITIAL (decl) = make_node (BLOCK); >>> +  DECL_STATIC_CONSTRUCTOR (decl) = 0; >>> +  TREE_READONLY (decl) = 0; >>> +  DECL_PURE_P (decl) = 0; >>> +  DECL_COMDAT (decl) = DECL_COMDAT (default_decl); >>> +  if (DECL_COMDAT_GROUP (default_decl)) >>> +    { >>> +      make_decl_one_only (decl, DECL_COMDAT_GROUP (default_decl)); >>> +    } >>> +  /* Build result decl and add to function_decl. 
*/ >>> +  t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node); >>> +  DECL_ARTIFICIAL (t) = 1; >>> +  DECL_IGNORED_P (t) = 1; >>> +  DECL_RESULT (decl) = t; >>> + >>> +  gimplify_function_tree (decl); >>> +  old_current_function_decl = current_function_decl; >>> +  push_cfun (DECL_STRUCT_FUNCTION (decl)); >>> +  current_function_decl = decl; >>> +  init_empty_tree_cfg_for_function (DECL_STRUCT_FUNCTION (decl)); >>> +  cfun->curr_properties |= >>> +    (PROP_gimple_lcf | PROP_gimple_leh | PROP_cfg | PROP_referenced_vars | >>> +     PROP_ssa); >>> +  new_bb = create_empty_bb (ENTRY_BLOCK_PTR); >>> +  make_edge (ENTRY_BLOCK_PTR, new_bb, EDGE_FALLTHRU); >>> +  make_edge (new_bb, EXIT_BLOCK_PTR, 0); >>> +  *empty_bb = new_bb; >>> + >>> +  cgraph_add_new_function (decl, true); >>> +  cgraph_call_function_insertion_hooks (cgraph_get_create_node (decl)); >>> +  cgraph_analyze_function (cgraph_get_create_node (decl)); >>> +  cgraph_mark_needed_node (cgraph_get_create_node (decl)); >>> + >>> +  if (DECL_COMDAT_GROUP (default_decl)) >>> +    { >>> +      gcc_assert (cgraph_get_node (default_decl)); >>> +      cgraph_add_to_same_comdat_group (cgraph_get_node (decl), >>> +                                      cgraph_get_node (default_decl)); >>> +    } >>> + >>> +  pop_cfun (); >>> +  current_function_decl = old_current_function_decl; >>> + >>> +  gcc_assert (ifunc_decl != NULL); >>> +  DECL_ATTRIBUTES (ifunc_decl) >>> +    = make_attribute ("ifunc", resolver_name, DECL_ATTRIBUTES (ifunc_decl)); >>> +  assemble_alias (ifunc_decl, get_identifier (resolver_name)); >>> +  return decl; >>> +} >>> + >>> +/* Make an ifunc declaration for the multi-versioned function DECL.  Calls to >>> +   DECL function will be replaced with calls to the ifunc.  Return the decl >>> +   of the ifunc created.  
*/ >>> + >>> +static tree >>> +make_ifunc_func (const tree decl) >>> +{ >>> +  tree ifunc_decl; >>> +  char *ifunc_name, *resolver_name; >>> +  tree fn_type, ifunc_type; >>> +  bool make_unique = false; >>> + >>> +  if (TREE_PUBLIC (decl) == 0) >>> +    make_unique = true; >>> + >>> +  ifunc_name = make_name (decl, "ifunc", make_unique); >>> +  resolver_name = make_name (decl, "resolver", make_unique); >>> +  gcc_assert (resolver_name); >>> + >>> +  fn_type = TREE_TYPE (decl); >>> +  ifunc_type = build_function_type (TREE_TYPE (fn_type), >>> +                                   TYPE_ARG_TYPES (fn_type)); >>> + >>> +  ifunc_decl = build_fn_decl (ifunc_name, ifunc_type); >>> +  TREE_USED (ifunc_decl) = 1; >>> +  DECL_CONTEXT (ifunc_decl) = NULL_TREE; >>> +  DECL_INITIAL (ifunc_decl) = error_mark_node; >>> +  DECL_ARTIFICIAL (ifunc_decl) = 1; >>> +  /* Mark this ifunc as external, the resolver will flip it again if >>> +     it gets generated.  */ >>> +  DECL_EXTERNAL (ifunc_decl) = 1; >>> +  /* IFUNCs have to be externally visible.  */ >>> +  TREE_PUBLIC (ifunc_decl) = 1; >>> + >>> +  return ifunc_decl; >>> +} >>> + >>> +/* For multi-versioned function decl, which should also be the default, >>> +   return the decl of the ifunc resolver, create it if it does not >>> +   exist.  */ >>> + >>> +tree >>> +get_ifunc_for_version (const tree decl) >>> +{ >>> +  version_function *decl_v; >>> +  int ix; >>> +  void_p ele; >>> + >>> +  /* DECL has to be the default version, otherwise it is missing and >>> +     that is not allowed.  
*/ >>> +  if (!is_default_function (decl)) >>> +    { >>> +      error_at (DECL_SOURCE_LOCATION (decl), "Default version not found"); >>> +      return decl; >>> +    } >>> + >>> +  decl_v = find_function_version (decl); >>> +  gcc_assert (decl_v != NULL); >>> +  if (decl_v->ifunc_decl == NULL) >>> +    { >>> +      tree ifunc_decl; >>> +      ifunc_decl = make_ifunc_func (decl); >>> +      decl_v->ifunc_decl = ifunc_decl; >>> +    } >>> + >>> +  if (cgraph_get_node (decl)) >>> +    cgraph_mark_needed_node (cgraph_get_node (decl)); >>> + >>> +  for (ix = 0; VEC_iterate (void_p, decl_v->versions, ix, ele); ++ix) >>> +    { >>> +      version_function *v = (version_function *) ele; >>> +      gcc_assert (v->decl != NULL); >>> +      if (cgraph_get_node (v->decl)) >>> +       cgraph_mark_needed_node (cgraph_get_node (v->decl)); >>> +    } >>> + >>> +  return decl_v->ifunc_decl; >>> +} >>> + >>> +/* Generate the dispatching code to dispatch multi-versioned function >>> +   DECL.  Make a new function decl for dispatching and call the target >>> +   hook to process the "targetv" attributes and provide the code to >>> +   dispatch the right function at run-time.  
*/ >>> + >>> +static tree >>> +make_ifunc_resolver_for_version (const tree decl) >>> +{ >>> +  version_function *decl_v; >>> +  tree ifunc_resolver_decl, ifunc_decl; >>> +  basic_block empty_bb; >>> +  int ix; >>> +  void_p ele; >>> +  VEC (tree, heap) *fn_ver_vec = NULL; >>> + >>> +  gcc_assert (is_default_function (decl)); >>> + >>> +  decl_v = find_function_version (decl); >>> +  gcc_assert (decl_v != NULL); >>> + >>> +  if (decl_v->ifunc_resolver_decl != NULL) >>> +    return decl_v->ifunc_resolver_decl; >>> + >>> +  ifunc_decl = decl_v->ifunc_decl; >>> + >>> +  if (ifunc_decl == NULL) >>> +    ifunc_decl = decl_v->ifunc_decl = make_ifunc_func (decl); >>> + >>> +  ifunc_resolver_decl = make_ifunc_resolver_func (decl, ifunc_decl, >>> +                                                 &empty_bb); >>> + >>> +  fn_ver_vec = VEC_alloc (tree, heap, 2); >>> +  VEC_safe_push (tree, heap, fn_ver_vec, decl); >>> + >>> +  for (ix = 0; VEC_iterate (void_p, decl_v->versions, ix, ele); ++ix) >>> +    { >>> +      version_function *v = (version_function *) ele; >>> +      gcc_assert (v->decl != NULL); >>> +      /* Check for virtual functions here again, as by this time it should >>> +        have been determined if this function needs a vtable index or >>> +        not.  This happens for methods in derived classes that override >>> +        virtual methods in base classes but are not explicitly marked as >>> +        virtual.  
*/ >>> +      if (DECL_VINDEX (v->decl)) >>> +        error_at (DECL_SOURCE_LOCATION (v->decl), >>> +                 "Virtual function versioning not supported\n"); >>> +      if (!v->is_deleted) >>> +       VEC_safe_push (tree, heap, fn_ver_vec, v->decl); >>> +    } >>> + >>> +  gcc_assert (targetm.dispatch_version); >>> +  targetm.dispatch_version (ifunc_resolver_decl, fn_ver_vec, &empty_bb); >>> +  decl_v->ifunc_resolver_decl = ifunc_resolver_decl; >>> + >>> +  return ifunc_resolver_decl; >>> +} >>> + >>> +/* Main entry point to pass_dispatch_versions. For multi-versioned functions, >>> +   generate the dispatching code.  */ >>> + >>> +static unsigned int >>> +do_dispatch_versions (void) >>> +{ >>> +  /* A new pass for generating dispatch code for multi-versioned functions. >>> +     Other forms of dispatch can be added when ifunc support is not available >>> +     like just calling the function directly after checking for target type. >>> +     Currently, dispatching is done through IFUNC.  This pass will become >>> +     more meaningful when other dispatch mechanisms are added.  */ >>> + >>> +  /* Cloning a function to produce more versions will happen here when the >>> +     user requests that via the targetv attribute. For example, >>> +     int foo () __attribute__ ((targetv(("arch=core2"), ("arch=corei7")))); >>> +     means that the user wants the same body of foo to be versioned for core2 >>> +     and corei7.  In that case, this function will be cloned during this >>> +     pass.  
*/ >>> + >>> +  if (DECL_FUNCTION_VERSIONED (current_function_decl) >>> +      && is_default_function (current_function_decl)) >>> +    { >>> +      tree decl = make_ifunc_resolver_for_version (current_function_decl); >>> +      if (dump_file && decl) >>> +       dump_function_to_file (decl, dump_file, TDF_BLOCKS); >>> +    } >>> +  return 0; >>> +} >>> + >>> +static  bool >>> +gate_dispatch_versions (void) >>> +{ >>> +  return true; >>> +} >>> + >>> +/* A pass to generate the dispatch code to execute the appropriate version >>> +   of a multi-versioned function at run-time.  */ >>> + >>> +struct gimple_opt_pass pass_dispatch_versions = >>> +{ >>> + { >>> +  GIMPLE_PASS, >>> +  "dispatch_multiversion_functions",    /* name */ >>> +  gate_dispatch_versions,              /* gate */ >>> +  do_dispatch_versions,                        /* execute */ >>> +  NULL,                                        /* sub */ >>> +  NULL,                                        /* next */ >>> +  0,                                   /* static_pass_number */ >>> +  TV_MULTIVERSION_DISPATCH,            /* tv_id */ >>> +  PROP_cfg,                            /* properties_required */ >>> +  PROP_cfg,                            /* properties_provided */ >>> +  0,                                   /* properties_destroyed */ >>> +  0,                                   /* todo_flags_start */ >>> +  TODO_dump_func |                     /* todo_flags_finish */ >>> +  TODO_cleanup_cfg | TODO_dump_cgraph >>> + } >>> +}; >>> Index: cgraphunit.c >>> =================================================================== >>> --- cgraphunit.c        (revision 184971) >>> +++ cgraphunit.c        (working copy) >>> @@ -141,6 +141,7 @@ along with GCC; see the file COPYING3.  
If not see >>>  #include "ipa-inline.h" >>>  #include "ipa-utils.h" >>>  #include "lto-streamer.h" >>> +#include "multiversion.h" >>> >>>  static void cgraph_expand_all_functions (void); >>>  static void cgraph_mark_functions_to_output (void); >>> @@ -343,6 +344,13 @@ cgraph_finalize_function (tree decl, bool nested) >>>       node->local.redefined_extern_inline = true; >>>     } >>> >>> +  /* If this is a function version and not the default, change the >>> +     assembler name of this function.  The DECL names of function >>> +     versions are the same, only the assembler names are made unique. >>> +     The assembler name is changed by appending the string from >>> +     the "targetv" attribute.  */ >>> +  version_assembler_name (decl); >>> + >>>   notice_global_symbol (decl); >>>   node->local.finalized = true; >>>   node->lowered = DECL_STRUCT_FUNCTION (decl)->cfg != NULL; >>> Index: multiversion.h >>> =================================================================== >>> --- multiversion.h      (revision 0) >>> +++ multiversion.h      (revision 0) >>> @@ -0,0 +1,52 @@ >>> +/* Function Multiversioning. >>> +   Copyright (C) 2012 Free Software Foundation, Inc. >>> +   Contributed by Sriraman Tallam (tmsriram@google.com) >>> + >>> +This file is part of GCC. >>> + >>> +GCC is free software; you can redistribute it and/or modify it under >>> +the terms of the GNU General Public License as published by the Free >>> +Software Foundation; either version 3, or (at your option) any later >>> +version. >>> + >>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY >>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or >>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License >>> +for more details. >>> + >>> +You should have received a copy of the GNU General Public License >>> +along with GCC; see the file COPYING3.  If not see >>> +. 
*/ >>> + >>> +/* This is the header file which provides the functions to keep track >>> +   of functions that are multi-versioned and to generate the dispatch >>> +   code to call the right version at run-time.  */ >>> + >>> +#ifndef GCC_MULTIVERSION_H >>> +#define GCC_MULTIVERSION_H >>> + >>> +#include "tree.h" >>> + >>> +/* Mark DECL1 and DECL2 as function versions.  */ >>> +int group_function_versions (const tree decl1, const tree decl2); >>> + >>> +/* Mark DECL as deleted and no longer a version.  */ >>> +void mark_delete_decl_version (const tree decl); >>> + >>> +/* Returns true if DECL is the default version to be executed if all >>> +   other versions are inappropriate at run-time.  */ >>> +bool is_default_function (const tree decl); >>> + >>> +/* Gets the IFUNC dispatcher for this multi-versioned function DECL. DECL >>> +   must be the default function in the multi-versioned group.  */ >>> +tree get_ifunc_for_version (const tree decl); >>> + >>> +/* Returns true when only one of DECL1 and DECL2 is marked with "targetv" >>> +   or if the "targetv" attribute strings of DECL1 and DECL2 don't match.  */ >>> +bool has_different_version_attributes (const tree decl1, const tree decl2); >>> + >>> +/* If DECL is a function version and not the default version, the assembler >>> +   name of DECL is changed to include the attribute string to keep the >>> +   name unambiguous.  */ >>> +void version_assembler_name (const tree decl); >>> +#endif >>> Index: cp/class.c >>> =================================================================== >>> --- cp/class.c  (revision 184971) >>> +++ cp/class.c  (working copy) >>> @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see >>>  #include "tree-dump.h" >>>  #include "splay-tree.h" >>>  #include "pointer-set.h" >>> +#include "multiversion.h" >>> >>>  /* The number of nested classes being processed.  If we are not in the >>>    scope of any class, this is zero.  
*/ >>> @@ -1092,7 +1093,20 @@ add_method (tree type, tree method, tree using_dec >>>              || same_type_p (TREE_TYPE (fn_type), >>>                              TREE_TYPE (method_type)))) >>>        { >>> -         if (using_decl) >>> +         /* For function versions, their parms and types match >>> +            but they are not duplicates.  Record function versions >>> +            as and when they are found.  */ >>> +         if (TREE_CODE (fn) == FUNCTION_DECL >>> +             && TREE_CODE (method) == FUNCTION_DECL >>> +             && (DECL_FUNCTION_VERSIONED (fn) >>> +                 || DECL_FUNCTION_VERSIONED (method))) >>> +           { >>> +             DECL_FUNCTION_VERSIONED (fn) = 1; >>> +             DECL_FUNCTION_VERSIONED (method) = 1; >>> +             group_function_versions (fn, method); >>> +             continue; >>> +           } >>> +         else if (using_decl) >>>            { >>>              if (DECL_CONTEXT (fn) == type) >>>                /* Defer to the local function.  */ >>> @@ -1150,6 +1164,13 @@ add_method (tree type, tree method, tree using_dec >>>   else >>>     /* Replace the current slot.  */ >>>     VEC_replace (tree, method_vec, slot, overload); >>> + >>> +  /* Change the assembler name of method here if it has "targetv" >>> +     attributes.  Since all versions have the same mangled name, >>> +     their assembler name is changed by appending the string from >>> +     the "targetv" attribute. */ >>> +  version_assembler_name (method); >>> + >>>   return true; >>>  } >>> >>> @@ -6890,8 +6911,11 @@ resolve_address_of_overloaded_function (tree targe >>>          if (DECL_ANTICIPATED (fn)) >>>            continue; >>> >>> -         /* See if there's a match.  */ >>> -         if (same_type_p (target_fn_type, static_fn_type (fn))) >>> +         /* See if there's a match.   For functions that are multi-versioned >>> +            match it to the default function.  
*/ >>> +         if (same_type_p (target_fn_type, static_fn_type (fn)) >>> +             && (!DECL_FUNCTION_VERSIONED (fn) >>> +                 || is_default_function (fn))) >>>            matches = tree_cons (fn, NULL_TREE, matches); >>>        } >>>     } >>> @@ -7053,6 +7077,21 @@ resolve_address_of_overloaded_function (tree targe >>>       perform_or_defer_access_check (access_path, fn, fn); >>>     } >>> >>> +  /* If a pointer to a function that is multi-versioned is requested, the >>> +     pointer to the dispatcher function is returned instead.  This works >>> +     well because indirectly calling the function will dispatch the right >>> +     function version at run-time. Also, the function address is kept >>> +     unique.  */ >>> +  if (DECL_FUNCTION_VERSIONED (fn) >>> +      && is_default_function (fn)) >>> +    { >>> +      tree ifunc_decl; >>> +      ifunc_decl = get_ifunc_for_version (fn); >>> +      gcc_assert (ifunc_decl != NULL); >>> +      mark_used (fn); >>> +      return build_fold_addr_expr (ifunc_decl); >>> +    } >>> + >>>   if (TYPE_PTRFN_P (target_type) || TYPE_PTRMEMFUNC_P (target_type)) >>>     return cp_build_addr_expr (fn, flags); >>>   else >>> Index: cp/decl.c >>> =================================================================== >>> --- cp/decl.c   (revision 184971) >>> +++ cp/decl.c   (working copy) >>> @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see >>>  #include "pointer-set.h" >>>  #include "splay-tree.h" >>>  #include "plugin.h" >>> +#include "multiversion.h" >>> >>>  /* Possible cases of bad specifiers type used by bad_specifiers. */ >>>  enum bad_spec_place { >>> @@ -972,6 +973,23 @@ decls_match (tree newdecl, tree olddecl) >>>       if (t1 != t2) >>>        return 0; >>> >>> +      /* The decls don't match if they correspond to two different versions >>> +        of the same function.  
*/ >>> +      if (compparms (p1, p2) >>> +         && same_type_p (TREE_TYPE (f1), TREE_TYPE (f2)) >>> +         && (DECL_FUNCTION_VERSIONED (newdecl) >>> +             || DECL_FUNCTION_VERSIONED (olddecl)) >>> +         && has_different_version_attributes (newdecl, olddecl)) >>> +       { >>> +         /* One of the decls could be the default without the "targetv" >>> +            attribute. Set it to be a versioned function here.  */ >>> +         DECL_FUNCTION_VERSIONED (newdecl) = 1; >>> +         DECL_FUNCTION_VERSIONED (olddecl) = 1; >>> +         /* Accumulate all the versions of a function.  */ >>> +         group_function_versions (olddecl, newdecl); >>> +         return 0; >>> +       } >>> + >>>       if (CP_DECL_CONTEXT (newdecl) != CP_DECL_CONTEXT (olddecl) >>>          && ! (DECL_EXTERN_C_P (newdecl) >>>                && DECL_EXTERN_C_P (olddecl))) >>> @@ -1482,7 +1500,11 @@ duplicate_decls (tree newdecl, tree olddecl, bool >>>              error ("previous declaration %q+#D here", olddecl); >>>              return NULL_TREE; >>>            } >>> -         else if (compparms (TYPE_ARG_TYPES (TREE_TYPE (newdecl)), >>> +         /* For function versions, params and types match, but they >>> +            are not ambiguous.  */ >>> +         else if ((!DECL_FUNCTION_VERSIONED (newdecl) >>> +                   && !DECL_FUNCTION_VERSIONED (olddecl)) >>> +                  && compparms (TYPE_ARG_TYPES (TREE_TYPE (newdecl)), >>>                              TYPE_ARG_TYPES (TREE_TYPE (olddecl)))) >>>            { >>>              error ("new declaration %q#D", newdecl); >>> @@ -2250,6 +2272,16 @@ duplicate_decls (tree newdecl, tree olddecl, bool >>>   else if (DECL_PRESERVE_P (newdecl)) >>>     DECL_PRESERVE_P (olddecl) = 1; >>> >>> +  /* If the olddecl is a version, so is the newdecl.  
*/ >>> +  if (TREE_CODE (newdecl) == FUNCTION_DECL >>> +      && DECL_FUNCTION_VERSIONED (olddecl)) >>> +    { >>> +      DECL_FUNCTION_VERSIONED (newdecl) = 1; >>> +      /* Record that newdecl is not a valid version and has >>> +        been deleted.  */ >>> +      mark_delete_decl_version (newdecl); >>> +    } >>> + >>>   if (TREE_CODE (newdecl) == FUNCTION_DECL) >>>     { >>>       int function_size; >>> @@ -4512,6 +4544,10 @@ start_decl (const cp_declarator *declarator, >>>   /* Enter this declaration into the symbol table.  */ >>>   decl = maybe_push_decl (decl); >>> >>> +  /* If this decl is a function version and not the default, its assembler >>> +     name has to be changed.  */ >>> +  version_assembler_name (decl); >>> + >>>   if (processing_template_decl) >>>     decl = push_template_decl (decl); >>>   if (decl == error_mark_node) >>> @@ -13019,6 +13055,10 @@ start_function (cp_decl_specifier_seq *declspecs, >>>     gcc_assert (same_type_p (TREE_TYPE (TREE_TYPE (decl1)), >>>                             integer_type_node)); >>> >>> +  /* If this decl is a function version and not the default, its assembler >>> +     name has to be changed.  
*/ >>> +  version_assembler_name (decl1); >>> + >>>   start_preparsed_function (decl1, attrs, /*flags=*/SF_DEFAULT); >>> >>>   return 1; >>> @@ -13960,6 +14000,11 @@ cxx_comdat_group (tree decl) >>>            break; >>>        } >>>       name = DECL_ASSEMBLER_NAME (decl); >>> +      if (TREE_CODE (decl) == FUNCTION_DECL >>> +         && DECL_FUNCTION_VERSIONED (decl)) >>> +       name = DECL_NAME (decl); >>> +      else >>> +        name = DECL_ASSEMBLER_NAME (decl); >>>     } >>> >>>   return name; >>> Index: cp/semantics.c >>> =================================================================== >>> --- cp/semantics.c      (revision 184971) >>> +++ cp/semantics.c      (working copy) >>> @@ -3783,8 +3783,11 @@ expand_or_defer_fn_1 (tree fn) >>>       /* If the user wants us to keep all inline functions, then mark >>>         this function as needed so that finish_file will make sure to >>>         output it later.  Similarly, all dllexport'd functions must >>> -        be emitted; there may be callers in other DLLs.  */ >>> -      if ((flag_keep_inline_functions >>> +        be emitted; there may be callers in other DLLs. >>> +        Also, mark this function as needed if it is marked inline but >>> +        is a multi-versioned function.  */ >>> +      if (((flag_keep_inline_functions >>> +           || DECL_FUNCTION_VERSIONED (fn)) >>>           && DECL_DECLARED_INLINE_P (fn) >>>           && !DECL_REALLY_EXTERN (fn)) >>>          || (flag_keep_inline_dllexport >>> Index: cp/decl2.c >>> =================================================================== >>> --- cp/decl2.c  (revision 184971) >>> +++ cp/decl2.c  (working copy) >>> @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  
If not see >>>  #include "splay-tree.h" >>>  #include "langhooks.h" >>>  #include "c-family/c-ada-spec.h" >>> +#include "multiversion.h" >>> >>>  extern cpp_reader *parse_in; >>> >>> @@ -674,9 +675,13 @@ check_classfn (tree ctype, tree function, tree tem >>>          if (is_template != (TREE_CODE (fndecl) == TEMPLATE_DECL)) >>>            continue; >>> >>> +         /* While finding a match, same types and params are not enough >>> +            if the function is versioned.  Also check version ("targetv") >>> +            attributes.  */ >>>          if (same_type_p (TREE_TYPE (TREE_TYPE (function)), >>>                           TREE_TYPE (TREE_TYPE (fndecl))) >>>              && compparms (p1, p2) >>> +             && !has_different_version_attributes (function, fndecl) >>>              && (!is_template >>>                  || comp_template_parms (template_parms, >>>                                          DECL_TEMPLATE_PARMS (fndecl))) >>> Index: cp/call.c >>> =================================================================== >>> --- cp/call.c   (revision 184971) >>> +++ cp/call.c   (working copy) >>> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see >>>  #include "langhooks.h" >>>  #include "c-family/c-objc.h" >>>  #include "timevar.h" >>> +#include "multiversion.h" >>> >>>  /* The various kinds of conversion.  */ >>> >>> @@ -6730,6 +6731,17 @@ build_over_call (struct z_candidate *cand, int fla >>>   if (!already_used) >>>     mark_used (fn); >>> >>> +  /* For a call to a multi-versioned function, the call should actually be to >>> +     the dispatcher.  
*/ >>> +  if (DECL_FUNCTION_VERSIONED (fn)) >>> +    { >>> +      tree ifunc_decl; >>> +      ifunc_decl = get_ifunc_for_version (fn); >>> +      gcc_assert (ifunc_decl != NULL); >>> +      return build_call_expr_loc_array (UNKNOWN_LOCATION, ifunc_decl, >>> +                                       nargs, argarray); >>> +    } >>> + >>>   if (DECL_VINDEX (fn) && (flags & LOOKUP_NONVIRTUAL) == 0) >>>     { >>>       tree t; >>> @@ -7980,6 +7992,30 @@ joust (struct z_candidate *cand1, struct z_candida >>>   size_t i; >>>   size_t len; >>> >>> +  /* For Candidates of a multi-versioned function, the one marked default >>> +     wins.  This is because the default decl is used as key to aggregate >>> +     all the other versions provided for it in multiversion.c.  When >>> +     generating the actual call, the appropriate dispatcher is created >>> +     to call the right function version at run-time.  */ >>> + >>> +  if ((TREE_CODE (cand1->fn) == FUNCTION_DECL >>> +       && DECL_FUNCTION_VERSIONED (cand1->fn)) >>> +      ||(TREE_CODE (cand2->fn) == FUNCTION_DECL >>> +        && DECL_FUNCTION_VERSIONED (cand2->fn))) >>> +    { >>> +      if (is_default_function (cand1->fn)) >>> +       { >>> +          mark_used (cand2->fn); >>> +         return 1; >>> +       } >>> +      if (is_default_function (cand2->fn)) >>> +       { >>> +          mark_used (cand1->fn); >>> +         return -1; >>> +       } >>> +      return 0; >>> +    } >>> + >>>   /* Candidates that involve bad conversions are always worse than those >>>      that don't.  
*/ >>>   if (cand1->viable > cand2->viable) >>> Index: timevar.def >>> =================================================================== >>> --- timevar.def (revision 184971) >>> +++ timevar.def (working copy) >>> @@ -253,6 +253,7 @@ DEFTIMEVAR (TV_TREE_IFCOMBINE        , "tree if-co >>>  DEFTIMEVAR (TV_TREE_UNINIT           , "uninit var analysis") >>>  DEFTIMEVAR (TV_PLUGIN_INIT           , "plugin initialization") >>>  DEFTIMEVAR (TV_PLUGIN_RUN            , "plugin execution") >>> +DEFTIMEVAR (TV_MULTIVERSION_DISPATCH , "multiversion dispatch") >>> >>>  /* Everything else in rest_of_compilation not included above.  */ >>>  DEFTIMEVAR (TV_EARLY_LOCAL          , "early local passes") >>> Index: varasm.c >>> =================================================================== >>> --- varasm.c    (revision 184971) >>> +++ varasm.c    (working copy) >>> @@ -5755,6 +5755,8 @@ finish_aliases_1 (void) >>>        } >>>       else if (! (p->emitted_diags & ALIAS_DIAG_TO_EXTERN) >>>               && DECL_EXTERNAL (target_decl) >>> +              && (TREE_CODE (target_decl) != FUNCTION_DECL >>> +                  || !DECL_STRUCT_FUNCTION (target_decl)) >>>               /* We use local aliases for C++ thunks to force the tailcall >>>                  to bind locally.  This is a hack - to keep it working do >>>                  the following (which is not strictly correct).  
*/ >>> Index: Makefile.in >>> =================================================================== >>> --- Makefile.in (revision 184971) >>> +++ Makefile.in (working copy) >>> @@ -1298,6 +1298,7 @@ OBJS = \ >>>        mcf.o \ >>>        mode-switching.o \ >>>        modulo-sched.o \ >>> +       multiversion.o \ >>>        omega.o \ >>>        omp-low.o \ >>>        optabs.o \ >>> @@ -3030,6 +3031,11 @@ ree.o : ree.c $(CONFIG_H) $(SYSTEM_H) coretypes.h >>>    $(DF_H) $(TIMEVAR_H) tree-pass.h $(RECOG_H) $(EXPR_H) \ >>>    $(REGS_H) $(TREE_H) $(TM_P_H) insn-config.h $(INSN_ATTR_H) $(DIAGNOSTIC_CORE_H) \ >>>    $(TARGET_H) $(OPTABS_H) insn-codes.h rtlhooks-def.h $(PARAMS_H) $(CGRAPH_H) >>> +multiversion.o : multiversion.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ >>> +   $(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \ >>> +   $(DIAGNOSTIC_H) $(FIBHEAP_H) $(PARAMS_H) $(TIMEVAR_H) tree-pass.h \ >>> +   $(HASHTAB_H) $(COVERAGE_H) $(GGC_H) $(TREE_FLOW_H) $(RTL_H) $(IPA_PROP_H) \ >>> +   $(BASIC_BLOCK_H) $(TOPLEV_H) $(TREE_DUMP_H) ipa-inline.h >>>  cprop.o : cprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ >>>    $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(GGC_H) \ >>>    $(RECOG_H) $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) output.h toplev.h $(DIAGNOSTIC_CORE_H) \ >>> Index: passes.c >>> =================================================================== >>> --- passes.c    (revision 184971) >>> +++ passes.c    (working copy) >>> @@ -1190,6 +1190,7 @@ init_optimization_passes (void) >>>   NEXT_PASS (pass_build_cfg); >>>   NEXT_PASS (pass_warn_function_return); >>>   NEXT_PASS (pass_build_cgraph_edges); >>> +  NEXT_PASS (pass_dispatch_versions); >>>   *p = NULL; >>> >>>   /* Interprocedural optimization passes.  
*/
>>> Index: config/i386/i386.c
>>> ===================================================================
>>> --- config/i386/i386.c  (revision 184971)
>>> +++ config/i386/i386.c  (working copy)
>>> @@ -27446,6 +27473,593 @@ ix86_init_mmx_sse_builtins (void)
>>>     }
>>>  }
>>>
>>> +/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL
>>> +   to return a pointer to VERSION_DECL if the outcome of the function
>>> +   PREDICATE_DECL is true.  This function will be called during version
>>> +   dispatch to decide which function version to execute.  It returns the
>>> +   basic block at the end to which more conditions can be added.  */
>>> +
>>> +static basic_block
>>> +add_condition_to_bb (tree function_decl, tree version_decl,
>>> +                    basic_block new_bb, tree predicate_decl)
>>> +{
>>> +  gimple return_stmt;
>>> +  tree convert_expr, result_var;
>>> +  gimple convert_stmt;
>>> +  gimple call_cond_stmt;
>>> +  gimple if_else_stmt;
>>> +
>>> +  basic_block bb1, bb2, bb3;
>>> +  edge e12, e23;
>>> +
>>> +  tree cond_var;
>>> +  gimple_seq gseq;
>>> +
>>> +  tree old_current_function_decl;
>>> +
>>> +  old_current_function_decl = current_function_decl;
>>> +  push_cfun (DECL_STRUCT_FUNCTION (function_decl));
>>> +  current_function_decl = function_decl;
>>> +
>>> +  gcc_assert (new_bb != NULL);
>>> +  gseq = bb_seq (new_bb);
>>> +
>>> +
>>> +  convert_expr = build1 (CONVERT_EXPR, ptr_type_node,
>>> +                        build_fold_addr_expr (version_decl));
>>> +  result_var = create_tmp_var (ptr_type_node, NULL);
>>> +  convert_stmt = gimple_build_assign (result_var, convert_expr);
>>> +  return_stmt = gimple_build_return (result_var);
>>> +
>>> +  if (predicate_decl == NULL_TREE)
>>> +    {
>>> +      gimple_seq_add_stmt (&gseq, convert_stmt);
>>> +      gimple_seq_add_stmt (&gseq, return_stmt);
>>> +      set_bb_seq (new_bb, gseq);
>>> +      gimple_set_bb (convert_stmt, new_bb);
>>> +      gimple_set_bb (return_stmt, new_bb);
>>> +      pop_cfun ();
>>> +      current_function_decl = old_current_function_decl;
>>> +      return new_bb;
>>> +    }
>>> +
>>> +  cond_var = create_tmp_var (integer_type_node, NULL);
>>> +  call_cond_stmt = gimple_build_call (predicate_decl, 0);
>>> +  gimple_call_set_lhs (call_cond_stmt, cond_var);
>>> +
>>> +  gimple_set_block (call_cond_stmt, DECL_INITIAL (function_decl));
>>> +  gimple_set_bb (call_cond_stmt, new_bb);
>>> +  gimple_seq_add_stmt (&gseq, call_cond_stmt);
>>> +
>>> +  if_else_stmt = gimple_build_cond (GT_EXPR, cond_var,
>>> +                                   integer_zero_node,
>>> +                                   NULL_TREE, NULL_TREE);
>>> +  gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl));
>>> +  gimple_set_bb (if_else_stmt, new_bb);
>>> +  gimple_seq_add_stmt (&gseq, if_else_stmt);
>>> +
>>> +  gimple_seq_add_stmt (&gseq, convert_stmt);
>>> +  gimple_seq_add_stmt (&gseq, return_stmt);
>>> +  set_bb_seq (new_bb, gseq);
>>> +
>>> +  bb1 = new_bb;
>>> +  e12 = split_block (bb1, if_else_stmt);
>>> +  bb2 = e12->dest;
>>> +  e12->flags &= ~EDGE_FALLTHRU;
>>> +  e12->flags |= EDGE_TRUE_VALUE;
>>> +
>>> +  e23 = split_block (bb2, return_stmt);
>>> +
>>> +  gimple_set_bb (convert_stmt, bb2);
>>> +  gimple_set_bb (return_stmt, bb2);
>>> +
>>> +  bb3 = e23->dest;
>>> +  make_edge (bb1, bb3, EDGE_FALSE_VALUE);
>>> +
>>> +  remove_edge (e23);
>>> +  make_edge (bb2, EXIT_BLOCK_PTR, 0);
>>> +
>>> +  rebuild_cgraph_edges ();
>>> +
>>> +  pop_cfun ();
>>> +  current_function_decl = old_current_function_decl;
>>> +
>>> +  return bb3;
>>> +}
>>> +
>>> +/* This parses the attribute arguments to targetv in DECL and determines
>>> +   the right builtin to use to match the platform specification.
>>> +   For now, only one target argument ("arch=") is allowed.  
*/
>>> +
>>> +static enum ix86_builtins
>>> +get_builtin_code_for_version (tree decl)
>>> +{
>>> +  tree attrs;
>>> +  struct cl_target_option cur_target;
>>> +  tree target_node;
>>> +  struct cl_target_option *new_target;
>>> +  enum ix86_builtins builtin_code = IX86_BUILTIN_MAX;
>>> +
>>> +  attrs = lookup_attribute ("targetv", DECL_ATTRIBUTES (decl));
>>> +  gcc_assert (attrs != NULL);
>>> +
>>> +  cl_target_option_save (&cur_target, &global_options);
>>> +
>>> +  target_node = ix86_valid_target_attribute_tree
>>> +                 (TREE_VALUE (TREE_VALUE (attrs)));
>>> +
>>> +  gcc_assert (target_node);
>>> +  new_target = TREE_TARGET_OPTION (target_node);
>>> +  gcc_assert (new_target);
>>> +
>>> +  if (new_target->arch_specified && new_target->arch > 0)
>>> +    {
>>> +      switch (new_target->arch)
>>> +        {
>>> +       case 1:
>>> +       case 2:
>>> +       case 3:
>>> +       case 4:
>>> +       case 5:
>>> +       case 6:
>>> +       case 7:
>>> +       case 8:
>>> +       case 9:
>>> +       case 10:
>>> +       case 11:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_INTEL;
>>> +         break;
>>> +       case 12:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_INTEL_CORE2;
>>> +         break;
>>> +       case 13:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_INTEL_COREI7;
>>> +         break;
>>> +       case 14:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_INTEL_ATOM;
>>> +         break;
>>> +       case 15:
>>> +       case 16:
>>> +       case 17:
>>> +       case 18:
>>> +       case 19:
>>> +       case 20:
>>> +       case 21:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMD;
>>> +         break;
>>> +       case 22:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMDFAM10H;
>>> +         break;
>>> +       case 23:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1;
>>> +         break;
>>> +       case 24:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2;
>>> +         break;
>>> +       case 25: /* What is btver1 ? */
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMD;
>>> +         break;
>>> +       }
>>> +    }
>>> +
>>> +  cl_target_option_restore (&global_options, &cur_target);
>>> +  if (builtin_code == IX86_BUILTIN_MAX)
>>> +      error_at (DECL_SOURCE_LOCATION (decl),
>>> +               "No dispatcher found for the versioning attributes");
>>> +
>>> +  return builtin_code;
>>> +}
>>> +
>>> +/* This is the target hook to generate the dispatch function for
>>> +   multi-versioned functions.  DISPATCH_DECL is the function which will
>>> +   contain the dispatch logic.  FNDECLS are the function choices for
>>> +   dispatch, and is a tree chain.  EMPTY_BB is the basic block pointer
>>> +   in DISPATCH_DECL in which the dispatch code is generated.  */
>>> +
>>> +static int
>>> +ix86_dispatch_version (tree dispatch_decl,
>>> +                      void *fndecls_p,
>>> +                      basic_block *empty_bb)
>>> +{
>>> +  tree default_decl;
>>> +  gimple ifunc_cpu_init_stmt;
>>> +  gimple_seq gseq;
>>> +  tree old_current_function_decl;
>>> +  int ix;
>>> +  tree ele;
>>> +  VEC (tree, heap) *fndecls;
>>> +
>>> +  gcc_assert (dispatch_decl != NULL
>>> +             && fndecls_p != NULL
>>> +             && empty_bb != NULL);
>>> +
>>> +  /* fndecls_p is actually a vector.  */
>>> +  fndecls = (VEC (tree, heap) *)fndecls_p;
>>> +
>>> +  /* At least one more version other than the default.  */
>>> +  gcc_assert (VEC_length (tree, fndecls) >= 2);
>>> +
>>> +  /* The first version in the vector is the default decl.  
*/
>>> +  default_decl = VEC_index (tree, fndecls, 0);
>>> +
>>> +  old_current_function_decl = current_function_decl;
>>> +  push_cfun (DECL_STRUCT_FUNCTION (dispatch_decl));
>>> +  current_function_decl = dispatch_decl;
>>> +
>>> +  gseq = bb_seq (*empty_bb);
>>> +  ifunc_cpu_init_stmt = gimple_build_call_vec (
>>> +                     ix86_builtins [(int) IX86_BUILTIN_CPU_INIT], NULL);
>>> +  gimple_seq_add_stmt (&gseq, ifunc_cpu_init_stmt);
>>> +  gimple_set_bb (ifunc_cpu_init_stmt, *empty_bb);
>>> +  set_bb_seq (*empty_bb, gseq);
>>> +
>>> +  pop_cfun ();
>>> +  current_function_decl = old_current_function_decl;
>>> +
>>> +
>>> +  for (ix = 1; VEC_iterate (tree, fndecls, ix, ele); ++ix)
>>> +    {
>>> +      tree version_decl = ele;
>>> +      /* Get attribute string, parse it and find the right predicate decl.
>>> +         The predicate function could be a lengthy combination of many
>>> +        features, like arch-type and various isa-variants.  For now, only
>>> +        check the arch-type.  */
>>> +      tree predicate_decl = ix86_builtins [
>>> +                       get_builtin_code_for_version (version_decl)];
>>> +      *empty_bb = add_condition_to_bb (dispatch_decl, version_decl, *empty_bb,
>>> +                                      predicate_decl);
>>> +
>>> +    }
>>> +  /* dispatch default version at the end.  
*/
>>> +  *empty_bb = add_condition_to_bb (dispatch_decl, default_decl, *empty_bb,
>>> +                                  NULL);
>>> +  return 0;
>>> +}
>>>
>>> @@ -38610,6 +39269,12 @@ ix86_autovectorize_vector_sizes (void)
>>>  #undef TARGET_BUILD_BUILTIN_VA_LIST
>>>  #define TARGET_BUILD_BUILTIN_VA_LIST ix86_build_builtin_va_list
>>>
>>> +#undef TARGET_DISPATCH_VERSION
>>> +#define TARGET_DISPATCH_VERSION ix86_dispatch_version
>>> +
>>>  #undef TARGET_ENUM_VA_LIST_P
>>>  #define TARGET_ENUM_VA_LIST_P ix86_enum_va_list
>>>
>>> Index: testsuite/g++.dg/mv1.C
>>> ===================================================================
>>> --- testsuite/g++.dg/mv1.C      (revision 0)
>>> +++ testsuite/g++.dg/mv1.C      (revision 0)
>>> @@ -0,0 +1,23 @@
>>> +/* Simple test case to check if Multiversioning works.  */
>>> +/* { dg-do run } */
>>> +/* { dg-options "-O2" } */
>>> +
>>> +int foo ();
>>> +int foo () __attribute__ ((targetv("arch=corei7")));
>>> +
>>> +int main ()
>>> +{
>>> +  int (*p)() = &foo;
>>> +  return foo () + (*p)();
>>> +}
>>> +
>>> +int foo ()
>>> +{
>>> +  return 0;
>>> +}
>>> +
>>> +int __attribute__ ((targetv("arch=corei7")))
>>> +foo ()
>>> +{
>>> +  return 0;
>>> +}
>>>
>>>
>>> --
>>> This patch is available for review at http://codereview.appspot.com/5752064
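
For readers following the quoted patch, here is a rough sketch in plain C of what the dispatcher body built by ix86_dispatch_version and add_condition_to_bb appears to amount to for the mv1.C test: one CPU-init call, one "predicate > 0" test per non-default version, and the default version returned last. Every name here (cpu_init_stub, cpu_is_corei7_stub, fake_cpu_is_corei7, foo_default, foo_corei7, foo_dispatcher) is a hypothetical stand-in and not part of the patch. The real code emits GIMPLE, calls the ix86 CPU-detection builtins, and returns a generic pointer rather than a typed function pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the CPU-detection builtins used by the patch
   (IX86_BUILTIN_CPU_INIT, IX86_BUILTIN_CPU_IS_INTEL_COREI7).  A settable
   flag replaces real CPU detection so that both paths can be exercised.  */
static int fake_cpu_is_corei7 = 0;

static void cpu_init_stub (void) { /* real code would probe the CPU here */ }
static int cpu_is_corei7_stub (void) { return fake_cpu_is_corei7; }

/* Two versions of "foo": the default and the arch=corei7 one.  They return
   different values only so that the chosen version is observable.  */
static int foo_default (void) { return 0; }
static int foo_corei7 (void) { return 1; }

typedef int (*foo_fn) (void);

/* Assumed shape of the generated dispatcher: initialise CPU detection,
   test each version's predicate in order, fall back to the default.  */
static foo_fn
foo_dispatcher (void)
{
  cpu_init_stub ();

  /* Mirrors gimple_build_cond (GT_EXPR, cond_var, integer_zero_node, ...).  */
  if (cpu_is_corei7_stub () > 0)
    return foo_corei7;

  /* Mirrors the final add_condition_to_bb call with a NULL predicate.  */
  return foo_default;
}
```

With IFUNC the selection would be made once, at symbol resolution time. In this sketch a caller would simply invoke foo_dispatcher () and call through the returned pointer.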