public inbox for gcc-patches@gcc.gnu.org
From: Xinliang David Li <davidxl@google.com>
To: Sriraman Tallam <tmsriram@google.com>
Cc: Richard Guenther <richard.guenther@gmail.com>,
	reply@codereview.appspotmail.com, 	gcc-patches@gcc.gnu.org
Subject: Re: User directed Function Multiversioning via Function Overloading (issue5752064)
Date: Thu, 08 Mar 2012 21:37:00 -0000
Message-ID: <CAAkRFZ+s2-fvR5CovaJZF4yJdiwpT1M73ADafAXkeVU9+At+zA@mail.gmail.com>
In-Reply-To: <CAAs8Hmzawe6KhQkTwM0jtmXkK+Cch9EtG5BMwZ6aNzUmtoFhdg@mail.gmail.com>

On Wed, Mar 7, 2012 at 11:08 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Wed, Mar 7, 2012 at 6:05 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Wed, Mar 7, 2012 at 1:46 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> User directed Function Multiversioning (MV) via Function Overloading
>>> ====================================================================
>>>
>>> This patch adds support for user directed function MV via function overloading.
>>> For more detailed description:
>>> http://gcc.gnu.org/ml/gcc/2012-03/msg00074.html
>>>
>>>
>>> Here is an example program with function versions:
>>>
>>> int foo ();  /* Default version */
>>> int foo () __attribute__ ((targetv("arch=corei7")));/*Specialized for corei7 */
>>> int foo () __attribute__ ((targetv("arch=core2")));/*Specialized for core2 */
>>>
>>> int main ()
>>> {
>>>  int (*p)() = &foo;
>>>  return foo () + (*p)();
>>> }
>>>
>>> int foo ()
>>> {
>>>  return 0;
>>> }
>>>
>>> int __attribute__ ((targetv("arch=corei7")))
>>> foo ()
>>> {
>>>  return 0;
>>> }
>>>
>>> int __attribute__ ((targetv("arch=core2")))
>>> foo ()
>>> {
>>>  return 0;
>>> }
>>>
>>> The above example has foo defined 3 times, but all 3 definitions of foo are
>>> different versions of the same function. The calls to foo in main, directly and
>>> via a pointer, are calls to the multi-versioned function foo, which is dispatched
>>> to the right foo at run-time.
>>>
>>> Function versions must have the same signature but must differ in the specifier
>>> string provided to a new attribute called "targetv", which is nothing but the
>>> target attribute with an extra specification to indicate a version. Any number
>>> of versions can be created using the targetv attribute but it is mandatory to
>>> have one function without the attribute, which is treated as the default
>>> version.
>>>
>>> The dispatching is done using the IFUNC mechanism to keep the dispatch overhead
>>> low. The compiler creates a dispatcher function which checks the CPU type and
>>> calls the right version of foo. The dispatching code checks for the platform
>>> type and calls the first version that matches. The default function is called if
>>> no specialized version is appropriate for execution.
>>>
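For illustration only, the dispatch order described above (test each
specialized version in turn, take the first match, otherwise fall back to
the default) can be sketched in plain C.  This is a hand-written stand-in,
not the code the patch generates: it does not use IFUNC, the table
foo_versions and the helper resolve_foo are hypothetical names, the cpu
string stands in for a real run-time CPU check, and the versions return
distinct values only so that the choice is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pointer to one version of foo.  */
typedef int (*foo_fn) (void);

/* One candidate: the "arch=" value it targets and its body.
   Hypothetical structure, for illustration only.  */
struct version
{
  const char *arch;
  foo_fn fn;
};

static int foo_default (void) { return 0; }
static int foo_corei7 (void) { return 1; }
static int foo_core2 (void) { return 2; }

/* Specialized versions, in the order they are tested.  */
static const struct version foo_versions[] = {
  { "corei7", foo_corei7 },
  { "core2", foo_core2 },
};

/* Return the first version whose arch matches CPU, or the default
   version when none matches.  */
static foo_fn
resolve_foo (const char *cpu)
{
  size_t i;

  assert (cpu != NULL);
  for (i = 0; i < sizeof foo_versions / sizeof foo_versions[0]; i++)
    if (strcmp (foo_versions[i].arch, cpu) == 0)
      return foo_versions[i].fn;
  return foo_default;
}
```

With this table, resolve_foo ("core2") returns foo_core2, and a name that
is not listed, such as "atom", returns foo_default.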
>>> The pointer to foo is made to be the address of the dispatcher function, so that
>>> it is unique and calls made via the pointer also work correctly. The assembler
>>> names of the various versions of foo are made different, by tagging them
>>> with the specifier strings, to keep them unique.  A specific version can be called
>>> directly by creating an alias to its assembler name. For instance, to call the
>>> corei7 version directly, make an alias:
>>> int foo_corei7 () __attribute__((alias ("_Z3foov.arch_corei7")));
>>> and then call foo_corei7.
>>>
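As a rough sketch of how such an assembler name is formed: the helper
below, versioned_name, is a hypothetical stand-alone approximation of what
sorted_attr_string and version_assembler_name in the patch do together.
It replaces each '=' with '_', sorts the comma-separated pieces of the
specifier, joins them with '_', and appends the result to the base name
after a '.'.  For the base name _Z3foov and the specifier "arch=corei7"
this gives the _Z3foov.arch_corei7 used in the alias above.  It is written
under those assumptions and is not the patch's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings.  */
static int
cmp_str (const void *a, const void *b)
{
  return strcmp (*(char *const *) a, *(char *const *) b);
}

/* Build "BASE.TAG", where TAG is SPEC with every '=' replaced by '_'
   and its comma-separated pieces sorted and joined by '_'.
   The caller frees the result.  Hypothetical helper.  */
static char *
versioned_name (const char *base, const char *spec)
{
  size_t nparts = 1, used = 0, i;
  char *copy, *tok, *out;
  char **parts;

  assert (base != NULL && spec != NULL);

  copy = malloc (strlen (spec) + 1);
  if (copy == NULL)
    abort ();
  strcpy (copy, spec);

  /* Count the pieces and rewrite '=' in one pass.  */
  for (i = 0; copy[i] != '\0'; i++)
    {
      if (copy[i] == ',')
        nparts++;
      else if (copy[i] == '=')
        copy[i] = '_';
    }

  parts = malloc (nparts * sizeof *parts);
  if (parts == NULL)
    abort ();
  for (tok = strtok (copy, ","); tok != NULL; tok = strtok (NULL, ","))
    parts[used++] = tok;
  qsort (parts, used, sizeof *parts, cmp_str);

  /* BASE, '.', a tag no longer than SPEC, and the terminating NUL.  */
  out = malloc (strlen (base) + strlen (spec) + 2);
  if (out == NULL)
    abort ();
  strcpy (out, base);
  strcat (out, ".");
  for (i = 0; i < used; i++)
    {
      if (i > 0)
        strcat (out, "_");
      strcat (out, parts[i]);
    }

  free (parts);
  free (copy);
  return out;
}
```

Sorting the pieces means that two specifiers which differ only in the
order of their comma-separated parts map to the same name.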
>>> Note that using IFUNC blocks inlining of versioned functions. I had implemented
>>> an optimization earlier to do hot path cloning to allow versioned functions to
>>> be inlined. Please see: http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02285.html
>>> In the next iteration, I plan to merge these two. With that, hot code paths with
>>> versioned functions will be cloned so that versioned functions can be inlined.
>>
>> Note that inlining of functions with the target attribute is limited as well,
>> but your issue is that of the indirect dispatch as ...
>>
>> You don't give an overview of the frontend implementation.  Thus I have
>> extracted the following
>>
>>  - the FE does not really know about the "overloading", nor can it directly
>>   resolve calls from a "sse" function to another "sse" function without going
>>   through the 2nd IFUNC
>
> This is a good point but I can change function joust, where the
> overload candidate is selected, to return the decl of the versioned
> function with matching target attributes as that of the callee. That
> will solve this problem. I have to treat the target attributes as an
> additional criterion for a match in overload resolution. The front end
> *does know* about the overloading; it is a question of doing the
> overload resolution correctly, right?  This is easy when there is no
> cloning involved.

Should this be covered by a new IFUNC folding rule? FE just needs to
generate dummy code.

>
> When cloning of a version is required, it gets complicated since the
> FE must clone and produce the bodies. Once all the bodies are
> available, the overload resolution can do the right thing.
>

How can you safely clone a function without knowing if the versioned
body is available in another module?

David

>>
>>  - cgraph also does not know about the "overloading", so it cannot do such
>>   "devirtualization" either
>>
>> you seem to have implemented something in between a pure frontend
>> solution and a proper middle-end solution.
>
> The only thing I delayed is the code generation of the dispatcher. I
> thought it is better to have this come later, after cfg and cgraph are
> generated, so that multiple dispatching mechanisms could be
> implemented.
>
>> For optimization and eventually
>> automatically selecting functions for cloning (like, callees of a manual "sse"
>> versioned function should be cloned?) it would be nice if the cgraph would
>> know about the different versions and their relationships (and the dispatcher).
>> Especially the cgraph code should know the functions are semantically
>> equivalent (I suppose we should require that).  The IFUNC should be
>> generated by cgraph / target code, similar to how we generate C++ thunks.
>>
>> Honza, any suggestions on what the FE side of such cgraph infrastructure
>> should look like and how we should encode the target bits?
>>
>> Thanks,
>> Richard.
>>
>>>        * doc/tm.texi.in: Add description for TARGET_DISPATCH_VERSION.
>>>        * doc/tm.texi: Regenerate.
>>>        * c-family/c-common.c (handle_targetv_attribute): New function.
>>>        * target.def (dispatch_version): New target hook.
>>>        * tree.h (DECL_FUNCTION_VERSIONED): New macro.
>>>        (tree_function_decl): New bit-field versioned_function.
>>>        * tree-pass.h (pass_dispatch_versions): New pass.
>>>        * multiversion.c: New file.
>>>        * multiversion.h: New file.
>>>        * cgraphunit.c: Include multiversion.h
>>>        (cgraph_finalize_function): Change assembler names of versioned
>>>        functions.
>>>        * cp/class.c: Include multiversion.h
>>>        (add_method): Aggregate function versions. Change assembler names of
>>>        versioned functions.
>>>        (resolve_address_of_overloaded_function): Match address of function
>>>        version with default function.  Return address of ifunc dispatcher
>>>        for address of versioned functions.
>>>        * cp/decl.c (decls_match): Make decls unmatched for versioned
>>>        functions.
>>>        (duplicate_decls): Remove ambiguity for versioned functions. Notify
>>>        of deleted function version decls.
>>>        (start_decl): Change assembler name of versioned functions.
>>>        (start_function): Change assembler name of versioned functions.
>>>        (cxx_comdat_group): Make comdat group of versioned functions be the
>>>        same.
>>>        * cp/semantics.c (expand_or_defer_fn_1): Mark as needed versioned
>>>        functions that are also marked inline.
>>>        * cp/decl2.c: Include multiversion.h
>>>        (check_classfn): Check attributes of versioned functions for match.
>>>        * cp/call.c: Include multiversion.h
>>>        (build_over_call): Make calls to multiversioned functions to call the
>>>        dispatcher.
>>>        (joust): For calls to multi-versioned functions, make the default
>>>        function win.
>>>        * timevar.def (TV_MULTIVERSION_DISPATCH): New time var.
>>>        * varasm.c (finish_aliases_1): Check if the alias points to a function
>>>        with a body before giving an error.
>>>        * Makefile.in: Add multiversion.o
>>>        * passes.c: Add pass_dispatch_versions to the pass list.
>>>        * config/i386/i386.c (add_condition_to_bb): New function.
>>>        (get_builtin_code_for_version): New function.
>>>        (ix86_dispatch_version): New function.
>>>        (TARGET_DISPATCH_VERSION): New macro.
>>>        * testsuite/g++.dg/mv1.C: New test.
>>>
>>> Index: doc/tm.texi
>>> ===================================================================
>>> --- doc/tm.texi (revision 184971)
>>> +++ doc/tm.texi (working copy)
>>> @@ -10995,6 +10995,14 @@ The result is another tree containing a simplified
>>>  call's result.  If @var{ignore} is true the value will be ignored.
>>>  @end deftypefn
>>>
>>> +@deftypefn {Target Hook} int TARGET_DISPATCH_VERSION (tree @var{dispatch_decl}, void *@var{fndecls}, basic_block *@var{empty_bb})
>>> +For a multi-versioned function, this hook sets up the dispatcher.
>>> +@var{dispatch_decl} is the function that will be used to dispatch the
>>> +version. @var{fndecls} are the function choices for dispatch.
>>> +@var{empty_bb} is a basic block in @var{dispatch_decl} where the
>>> +code to do the dispatch will be added.
>>> +@end deftypefn
>>> +
>>>  @deftypefn {Target Hook} {const char *} TARGET_INVALID_WITHIN_DOLOOP (const_rtx @var{insn})
>>>
>>>  Take an instruction in @var{insn} and return NULL if it is valid within a
>>> Index: doc/tm.texi.in
>>> ===================================================================
>>> --- doc/tm.texi.in      (revision 184971)
>>> +++ doc/tm.texi.in      (working copy)
>>> @@ -10873,6 +10873,14 @@ The result is another tree containing a simplified
>>>  call's result.  If @var{ignore} is true the value will be ignored.
>>>  @end deftypefn
>>>
>>> +@hook TARGET_DISPATCH_VERSION
>>> +For a multi-versioned function, this hook sets up the dispatcher.
>>> +@var{dispatch_decl} is the function that will be used to dispatch the
>>> +version. @var{fndecls} are the function choices for dispatch.
>>> +@var{empty_bb} is a basic block in @var{dispatch_decl} where the
>>> +code to do the dispatch will be added.
>>> +@end deftypefn
>>> +
>>>  @hook TARGET_INVALID_WITHIN_DOLOOP
>>>
>>>  Take an instruction in @var{insn} and return NULL if it is valid within a
>>> Index: c-family/c-common.c
>>> ===================================================================
>>> --- c-family/c-common.c (revision 184971)
>>> +++ c-family/c-common.c (working copy)
>>> @@ -315,6 +315,7 @@ static tree check_case_value (tree);
>>>  static bool check_case_bounds (tree, tree, tree *, tree *);
>>>
>>>  static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
>>> +static tree handle_targetv_attribute (tree *, tree, tree, int, bool *);
>>>  static tree handle_nocommon_attribute (tree *, tree, tree, int, bool *);
>>>  static tree handle_common_attribute (tree *, tree, tree, int, bool *);
>>>  static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
>>> @@ -604,6 +605,8 @@ const struct attribute_spec c_common_attribute_tab
>>>  {
>>>   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
>>>        affects_type_identity } */
>>> +  { "targetv",               1, -1, true, false, false,
>>> +                             handle_targetv_attribute, false },
>>>   { "packed",                 0, 0, false, false, false,
>>>                              handle_packed_attribute , false},
>>>   { "nocommon",               0, 0, true,  false, false,
>>> @@ -5869,6 +5872,54 @@ handle_packed_attribute (tree *node, tree name, tr
>>>   return NULL_TREE;
>>>  }
>>>
>>> +/* The targetv attribute is used to specify a function version
>>> +   targeted to specific platform types.  The "targetv" attributes
>>> +   have to be valid "target" attributes.  NODE should always point
>>> +   to a FUNCTION_DECL.  ARGS contain the arguments to "targetv"
>>> +   which should be valid arguments to attribute "target" too.
>>> +   Check handle_target_attribute for FLAGS and NO_ADD_ATTRS.  */
>>> +
>>> +static tree
>>> +handle_targetv_attribute (tree *node, tree name,
>>> +                         tree args,
>>> +                         int flags,
>>> +                         bool *no_add_attrs)
>>> +{
>>> +  const char *attr_str = NULL;
>>> +  gcc_assert (TREE_CODE (*node) == FUNCTION_DECL);
>>> +  gcc_assert (args != NULL);
>>> +
>>> +  /* This is a function version.  */
>>> +  DECL_FUNCTION_VERSIONED (*node) = 1;
>>> +
>>> +  attr_str = TREE_STRING_POINTER (TREE_VALUE (args));
>>> +
>>> +  /* Check if multiple sets of target attributes are there.  This
>>> +     is not supported now.   In future, this will be supported by
>>> +     cloning this function for each set.  */
>>> +  if (TREE_CHAIN (args) != NULL)
>>> +    warning (OPT_Wattributes, "%qE attribute has multiple sets which "
>>> +            "is not supported", name);
>>> +
>>> +  if (attr_str == NULL
>>> +      || strstr (attr_str, "arch=") == NULL)
>>> +    error_at (DECL_SOURCE_LOCATION (*node),
>>> +             "Versioning supported only on \"arch=\" for now");
>>> +
>>> +  /* targetv attributes must translate into target attributes.  */
>>> +  handle_target_attribute (node, get_identifier ("target"), args, flags,
>>> +                          no_add_attrs);
>>> +
>>> +  if (*no_add_attrs)
>>> +    warning (OPT_Wattributes, "%qE attribute has no effect", name);
>>> +
>>> +  /* This is necessary to keep the attribute tagged to the decl
>>> +     all the time.  */
>>> +  *no_add_attrs = false;
>>> +
>>> +  return NULL_TREE;
>>> +}
>>> +
>>>  /* Handle a "nocommon" attribute; arguments as in
>>>    struct attribute_spec.handler.  */
>>>
>>> Index: target.def
>>> ===================================================================
>>> --- target.def  (revision 184971)
>>> +++ target.def  (working copy)
>>> @@ -1249,6 +1249,15 @@ DEFHOOK
>>>  tree, (tree fndecl, int n_args, tree *argp, bool ignore),
>>>  hook_tree_tree_int_treep_bool_null)
>>>
>>> +/* Target hook to generate the dispatching code for calls to multi-versioned
>>> +   functions.  DISPATCH_DECL is the function that will have the dispatching
>>> +   logic.  FNDECLS are the list of choices for dispatch and EMPTY_BB is the
>>> +   basic block in DISPATCH_DECL which will contain the code.  */
>>> +DEFHOOK
>>> +(dispatch_version,
>>> + "",
>>> + int, (tree dispatch_decl, void *fndecls, basic_block *empty_bb), NULL)
>>> +
>>>  /* Returns a code for a target-specific builtin that implements
>>>    reciprocal of the function, or NULL_TREE if not available.  */
>>>  DEFHOOK
>>> Index: tree.h
>>> ===================================================================
>>> --- tree.h      (revision 184971)
>>> +++ tree.h      (working copy)
>>> @@ -3532,6 +3532,12 @@ extern VEC(tree, gc) **decl_debug_args_insert (tre
>>>  #define DECL_FUNCTION_SPECIFIC_OPTIMIZATION(NODE) \
>>>    (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_optimization)
>>>
>>> +/* In FUNCTION_DECL, this is set if this function has other versions generated
>>> +   using "targetv" attributes.  The default version is the one which does not
>>> +   have any "targetv" attribute set. */
>>> +#define DECL_FUNCTION_VERSIONED(NODE)\
>>> +   (FUNCTION_DECL_CHECK (NODE)->function_decl.versioned_function)
>>> +
>>>  /* FUNCTION_DECL inherits from DECL_NON_COMMON because of the use of the
>>>    arguments/result/saved_tree fields by front ends.   It was either inherit
>>>    FUNCTION_DECL from non_common, or inherit non_common from FUNCTION_DECL,
>>> @@ -3576,8 +3582,8 @@ struct GTY(()) tree_function_decl {
>>>   unsigned looping_const_or_pure_flag : 1;
>>>   unsigned has_debug_args_flag : 1;
>>>   unsigned tm_clone_flag : 1;
>>> -
>>> -  /* 1 bit left */
>>> +  unsigned versioned_function : 1;
>>> +  /* No bits left.  */
>>>  };
>>>
>>>  /* The source language of the translation-unit.  */
>>> Index: tree-pass.h
>>> ===================================================================
>>> --- tree-pass.h (revision 184971)
>>> +++ tree-pass.h (working copy)
>>> @@ -455,6 +455,7 @@ extern struct gimple_opt_pass pass_tm_memopt;
>>>  extern struct gimple_opt_pass pass_tm_edges;
>>>  extern struct gimple_opt_pass pass_split_functions;
>>>  extern struct gimple_opt_pass pass_feedback_split_functions;
>>> +extern struct gimple_opt_pass pass_dispatch_versions;
>>>
>>>  /* IPA Passes */
>>>  extern struct simple_ipa_opt_pass pass_ipa_lower_emutls;
>>> Index: multiversion.c
>>> ===================================================================
>>> --- multiversion.c      (revision 0)
>>> +++ multiversion.c      (revision 0)
>>> @@ -0,0 +1,798 @@
>>> +/* Function Multiversioning.
>>> +   Copyright (C) 2012 Free Software Foundation, Inc.
>>> +   Contributed by Sriraman Tallam (tmsriram@google.com)
>>> +
>>> +This file is part of GCC.
>>> +
>>> +GCC is free software; you can redistribute it and/or modify it under
>>> +the terms of the GNU General Public License as published by the Free
>>> +Software Foundation; either version 3, or (at your option) any later
>>> +version.
>>> +
>>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> +for more details.
>>> +
>>> +You should have received a copy of the GNU General Public License
>>> +along with GCC; see the file COPYING3.  If not see
>>> +<http://www.gnu.org/licenses/>. */
>>> +
>>> +/* Holds the state for multi-versioned functions here. The front-end
>>> +   updates the state as and when function versions are encountered.
>>> +   This is then used to generate the dispatch code.  Also, the
>>> +   optimization passes to clone hot paths involving versioned functions
>>> +   will be done here.
>>> +
>>> +   Function versions are created by using the same function signature but
>>> +   also tagging attribute "targetv" to specify the platform type for which
>>> +   the version must be executed.  Here is an example:
>>> +
>>> +   int foo ()
>>> +   {
>>> +     printf ("Execute as default");
>>> +     return 0;
>>> +   }
>>> +
>>> +   int  __attribute__ ((targetv ("arch=corei7")))
>>> +   foo ()
>>> +   {
>>> +     printf ("Execute for corei7");
>>> +     return 0;
>>> +   }
>>> +
>>> +   int main ()
>>> +   {
>>> +     return foo ();
>>> +   }
>>> +
>>> +   The call to foo in main is replaced with a call to an IFUNC function that
>>> +   contains the dispatch code to call the correct function version at
>>> +   run-time.  */
>>> +
>>> +
>>> +#include "config.h"
>>> +#include "system.h"
>>> +#include "coretypes.h"
>>> +#include "tm.h"
>>> +#include "tree.h"
>>> +#include "tree-inline.h"
>>> +#include "langhooks.h"
>>> +#include "flags.h"
>>> +#include "cgraph.h"
>>> +#include "diagnostic.h"
>>> +#include "toplev.h"
>>> +#include "timevar.h"
>>> +#include "params.h"
>>> +#include "fibheap.h"
>>> +#include "intl.h"
>>> +#include "tree-pass.h"
>>> +#include "hashtab.h"
>>> +#include "coverage.h"
>>> +#include "ggc.h"
>>> +#include "tree-flow.h"
>>> +#include "rtl.h"
>>> +#include "ipa-prop.h"
>>> +#include "basic-block.h"
>>> +#include "toplev.h"
>>> +#include "dbgcnt.h"
>>> +#include "tree-dump.h"
>>> +#include "output.h"
>>> +#include "vecprim.h"
>>> +#include "gimple-pretty-print.h"
>>> +#include "ipa-inline.h"
>>> +#include "target.h"
>>> +#include "multiversion.h"
>>> +
>>> +typedef void * void_p;
>>> +
>>> +DEF_VEC_P (void_p);
>>> +DEF_VEC_ALLOC_P (void_p, heap);
>>> +
>>> +/* Each function decl that is a function version gets an instance of this
>>> +   structure.   Since this is called by the front-end, decl merging can
>>> +   happen, where a decl created for a new declaration is merged with
>>> +   the old. In this case, the new decl is deleted and the IS_DELETED
>>> +   field is set for the struct instance corresponding to the new decl.
>>> +   IFUNC_DECL is the decl of the ifunc function for default decls.
>>> +   IFUNC_RESOLVER_DECL is the decl of the dispatch function.  VERSIONS
>>> +   is a vector containing the list of function versions that are
>>> +   the candidates for dispatch.  */
>>> +
>>> +typedef struct version_function_d {
>>> +  tree decl;
>>> +  tree ifunc_decl;
>>> +  tree ifunc_resolver_decl;
>>> +  VEC (void_p, heap) *versions;
>>> +  bool is_deleted;
>>> +} version_function;
>>> +
>>> +/* Hashmap has an entry for every function decl that has other function
>>> +   versions.  For function decls that are the default, it also stores the
>>> +   list of all the other function versions.  Each entry is a structure
>>> +   of type version_function_d.  */
>>> +static htab_t decl_version_htab = NULL;
>>> +
>>> +/* Hashtable helpers for decl_version_htab. */
>>> +
>>> +static hashval_t
>>> +decl_version_htab_hash_descriptor (const void *p)
>>> +{
>>> +  const version_function *t = (const version_function *) p;
>>> +  return htab_hash_pointer (t->decl);
>>> +}
>>> +
>>> +/* Hashtable helper for decl_version_htab. */
>>> +
>>> +static int
>>> +decl_version_htab_eq_descriptor (const void *p1, const void *p2)
>>> +{
>>> +  const version_function *t1 = (const version_function *) p1;
>>> +  return htab_eq_pointer ((const void_p) t1->decl, p2);
>>> +}
>>> +
>>> +/* Create the decl_version_htab.  */
>>> +static void
>>> +create_decl_version_htab (void)
>>> +{
>>> +  if (decl_version_htab == NULL)
>>> +    decl_version_htab = htab_create (10, decl_version_htab_hash_descriptor,
>>> +                                    decl_version_htab_eq_descriptor, NULL);
>>> +}
>>> +
>>> +/* Creates an instance of version_function for decl DECL.  */
>>> +
>>> +static version_function*
>>> +new_version_function (const tree decl)
>>> +{
>>> +  version_function *v;
>>> +  v = (version_function *)xmalloc(sizeof (version_function));
>>> +  v->decl = decl;
>>> +  v->ifunc_decl = NULL;
>>> +  v->ifunc_resolver_decl = NULL;
>>> +  v->versions = NULL;
>>> +  v->is_deleted = false;
>>> +  return v;
>>> +}
>>> +
>>> +/* Comparator function to be used in qsort routine to sort attribute
>>> +   specification strings to "targetv".  */
>>> +
>>> +static int
>>> +attr_strcmp (const void *v1, const void *v2)
>>> +{
>>> +  const char *c1 = *(char *const*)v1;
>>> +  const char *c2 = *(char *const*)v2;
>>> +  return strcmp (c1, c2);
>>> +}
>>> +
>>> +/* STR is the argument to targetv attribute.  This function tokenizes
>>> +   the comma separated arguments, sorts them and returns a string which
>>> +   is a unique identifier for the comma separated arguments.  */
>>> +
>>> +static char *
>>> +sorted_attr_string (const char *str)
>>> +{
>>> +  char **args = NULL;
>>> +  char *attr_str, *ret_str;
>>> +  char *attr = NULL;
>>> +  unsigned int argnum = 1;
>>> +  unsigned int i;
>>> +
>>> +  for (i = 0; i < strlen (str); i++)
>>> +    if (str[i] == ',')
>>> +      argnum++;
>>> +
>>> +  attr_str = (char *)xmalloc (strlen (str) + 1);
>>> +  strcpy (attr_str, str);
>>> +
>>> +  for (i = 0; i < strlen (attr_str); i++)
>>> +    if (attr_str[i] == '=')
>>> +      attr_str[i] = '_';
>>> +
>>> +  if (argnum == 1)
>>> +    return attr_str;
>>> +
>>> +  args = (char **)xmalloc (argnum * sizeof (char *));
>>> +
>>> +  i = 0;
>>> +  attr = strtok (attr_str, ",");
>>> +  while (attr != NULL)
>>> +    {
>>> +      args[i] = attr;
>>> +      i++;
>>> +      attr = strtok (NULL, ",");
>>> +    }
>>> +
>>> +  qsort (args, argnum, sizeof (char*), attr_strcmp);
>>> +
>>> +  ret_str = (char *)xmalloc (strlen (str) + 1);
>>> +  strcpy (ret_str, args[0]);
>>> +  for (i = 1; i < argnum; i++)
>>> +    {
>>> +      strcat (ret_str, "_");
>>> +      strcat (ret_str, args[i]);
>>> +    }
>>> +
>>> +  free (args);
>>> +  free (attr_str);
>>> +  return ret_str;
>>> +}
>>> +
>>> +/* Returns true when only one of DECL1 and DECL2 is marked with "targetv"
>>> +   or if the "targetv" attribute strings of DECL1 and DECL2 don't match.  */
>>> +
>>> +bool
>>> +has_different_version_attributes (const tree decl1, const tree decl2)
>>> +{
>>> +  tree attr1, attr2;
>>> +  char *c1, *c2;
>>> +  bool ret = false;
>>> +
>>> +  if (TREE_CODE (decl1) != FUNCTION_DECL
>>> +      || TREE_CODE (decl2) != FUNCTION_DECL)
>>> +    return false;
>>> +
>>> +  attr1 = lookup_attribute ("targetv", DECL_ATTRIBUTES (decl1));
>>> +  attr2 = lookup_attribute ("targetv", DECL_ATTRIBUTES (decl2));
>>> +
>>> +  if (attr1 == NULL_TREE && attr2 == NULL_TREE)
>>> +    return false;
>>> +
>>> +  if ((attr1 == NULL_TREE && attr2 != NULL_TREE)
>>> +      || (attr1 != NULL_TREE && attr2 == NULL_TREE))
>>> +    return true;
>>> +
>>> +  c1 = sorted_attr_string (
>>> +       TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr1))));
>>> +  c2 = sorted_attr_string (
>>> +       TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr2))));
>>> +
>>> +  if (strcmp (c1, c2) != 0)
>>> +     ret = true;
>>> +
>>> +  free (c1);
>>> +  free (c2);
>>> +
>>> +  return ret;
>>> +}
>>> +
>>> +/* If this decl corresponds to a function and has "targetv" attribute,
>>> +   append the attribute string to its assembler name.  */
>>> +
>>> +void
>>> +version_assembler_name (const tree decl)
>>> +{
>>> +  tree version_attr;
>>> +  const char *orig_name, *version_string, *attr_str;
>>> +  char *assembler_name;
>>> +  tree assembler_name_tree;
>>> +
>>> +  if (TREE_CODE (decl) != FUNCTION_DECL
>>> +      || DECL_ASSEMBLER_NAME_SET_P (decl)
>>> +      || !DECL_FUNCTION_VERSIONED (decl))
>>> +    return;
>>> +
>>> +  if (DECL_DECLARED_INLINE_P (decl)
>>> +      && lookup_attribute ("gnu_inline",
>>> +                         DECL_ATTRIBUTES (decl)))
>>> +    error_at (DECL_SOURCE_LOCATION (decl),
>>> +             "Function versions cannot be marked as gnu_inline,"
>>> +             " bodies have to be generated\n");
>>> +
>>> +  if (DECL_VIRTUAL_P (decl)
>>> +      || DECL_VINDEX (decl))
>>> +    error_at (DECL_SOURCE_LOCATION (decl),
>>> +             "Virtual function versioning not supported\n");
>>> +
>>> +  version_attr = lookup_attribute ("targetv", DECL_ATTRIBUTES (decl));
>>> +  /* targetv attribute string is NULL for default functions.  */
>>> +  if (version_attr == NULL_TREE)
>>> +    return;
>>> +
>>> +  orig_name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
>>> +  version_string
>>> +    = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (version_attr)));
>>> +
>>> +  attr_str = sorted_attr_string (version_string);
>>> +  assembler_name = (char *) xmalloc (strlen (orig_name)
>>> +                                    + strlen (attr_str) + 2);
>>> +
>>> +  sprintf (assembler_name, "%s.%s", orig_name, attr_str);
>>> +  if (dump_file)
>>> +    fprintf (dump_file, "Assembler name set to %s for function version %s\n",
>>> +            assembler_name, IDENTIFIER_POINTER (DECL_NAME (decl)));
>>> +  assembler_name_tree = get_identifier (assembler_name);
>>> +  SET_DECL_ASSEMBLER_NAME (decl, assembler_name_tree);
>>> +}
>>> +
>>> +/* Returns true if DECL is multi-versioned and is the default function,
>>> +   that is, it is not tagged with the "targetv" attribute.  */
>>> +
>>> +bool
>>> +is_default_function (const tree decl)
>>> +{
>>> +  return (TREE_CODE (decl) == FUNCTION_DECL
>>> +         && DECL_FUNCTION_VERSIONED (decl)
>>> +         && (lookup_attribute ("targetv", DECL_ATTRIBUTES (decl))
>>> +             == NULL_TREE));
>>> +}
>>> +
>>> +/* For function decl DECL, find the version_function struct in the
>>> +   decl_version_htab.  */
>>> +
>>> +static version_function *
>>> +find_function_version (const tree decl)
>>> +{
>>> +  void *slot;
>>> +
>>> +  if (!DECL_FUNCTION_VERSIONED (decl))
>>> +    return NULL;
>>> +
>>> +  if (!decl_version_htab)
>>> +    return NULL;
>>> +
>>> +  slot = htab_find_with_hash (decl_version_htab, decl,
>>> +                              htab_hash_pointer (decl));
>>> +
>>> +  if (slot != NULL)
>>> +    return (version_function *)slot;
>>> +
>>> +  return NULL;
>>> +}
>>> +
>>> +/* Record DECL as a function version by creating a version_function struct
>>> +   for it and storing it in the hashtable.  */
>>> +
>>> +static version_function *
>>> +add_function_version (const tree decl)
>>> +{
>>> +  void **slot;
>>> +  version_function *v;
>>> +
>>> +  if (!DECL_FUNCTION_VERSIONED (decl))
>>> +    return NULL;
>>> +
>>> +  create_decl_version_htab ();
>>> +
>>> +  slot = htab_find_slot_with_hash (decl_version_htab, (const void_p)decl,
>>> +                                   htab_hash_pointer ((const void_p)decl),
>>> +                                  INSERT);
>>> +
>>> +  if (*slot != NULL)
>>> +    return (version_function *)*slot;
>>> +
>>> +  v = new_version_function (decl);
>>> +  *slot = v;
>>> +
>>> +  return v;
>>> +}
>>> +
>>> +/* Push V into VEC only if it is not already present.  */
>>> +
>>> +static void
>>> +push_function_version (version_function *v, VEC (void_p, heap) *vec)
>>> +{
>>> +  int ix;
>>> +  void_p ele;
>>> +  for (ix = 0; VEC_iterate (void_p, vec, ix, ele); ++ix)
>>> +    {
>>> +      if (ele == (void_p)v)
>>> +        return;
>>> +    }
>>> +
>>> +  VEC_safe_push (void_p, heap, vec, (void*)v);
>>> +}
>>> +
>>> +/* Mark DECL as deleted.  This is called by the front-end when a duplicate
>>> +   decl is merged with the original decl and the duplicate decl is deleted.
>>> +   This function marks the duplicate_decl as invalid.  Called by
>>> +   duplicate_decls in cp/decl.c.  */
>>> +
>>> +void
>>> +mark_delete_decl_version (const tree decl)
>>> +{
>>> +  version_function *decl_v;
>>> +
>>> +  decl_v = find_function_version (decl);
>>> +
>>> +  if (decl_v == NULL)
>>> +    return;
>>> +
>>> +  decl_v->is_deleted = true;
>>> +
>>> +  if (is_default_function (decl)
>>> +      && decl_v->versions != NULL)
>>> +    {
>>> +      VEC_truncate (void_p, decl_v->versions, 0);
>>> +      VEC_free (void_p, heap, decl_v->versions);
>>> +    }
>>> +}
>>> +
>>> +/* Mark DECL1 and DECL2 to be function versions in the same group.  One
>>> +   of DECL1 and DECL2 must be the default, otherwise this function does
>>> +   nothing.  This function aggregates the versions.  */
>>> +
>>> +int
>>> +group_function_versions (const tree decl1, const tree decl2)
>>> +{
>>> +  tree default_decl, version_decl;
>>> +  version_function *default_v, *version_v;
>>> +
>>> +  gcc_assert (DECL_FUNCTION_VERSIONED (decl1)
>>> +             && DECL_FUNCTION_VERSIONED (decl2));
>>> +
>>> +  /* The version decls are added only to the default decl.  */
>>> +  if (!is_default_function (decl1)
>>> +      && !is_default_function (decl2))
>>> +    return 0;
>>> +
>>> +  /* This can happen with duplicate declarations.  Just ignore.  */
>>> +  if (is_default_function (decl1)
>>> +      && is_default_function (decl2))
>>> +    return 0;
>>> +
>>> +  default_decl = (is_default_function (decl1)) ? decl1 : decl2;
>>> +  version_decl = (default_decl == decl1) ? decl2 : decl1;
>>> +
>>> +  gcc_assert (default_decl != version_decl);
>>> +  create_decl_version_htab ();
>>> +
>>> +  /* If the version function is found, it has been added.  */
>>> +  if (find_function_version (version_decl))
>>> +    return 0;
>>> +
>>> +  default_v = add_function_version (default_decl);
>>> +  version_v = add_function_version (version_decl);
>>> +
>>> +  if (default_v->versions == NULL)
>>> +    default_v->versions = VEC_alloc (void_p, heap, 1);
>>> +
>>> +  push_function_version (version_v, default_v->versions);
>>> +  return 0;
>>> +}
>>> +
>>> +/* Makes a function attribute of the form NAME(ARG_NAME) and chains
>>> +   it to CHAIN.  */
>>> +
>>> +static tree
>>> +make_attribute (const char *name, const char *arg_name, tree chain)
>>> +{
>>> +  tree attr_name;
>>> +  tree attr_arg_name;
>>> +  tree attr_args;
>>> +  tree attr;
>>> +
>>> +  attr_name = get_identifier (name);
>>> +  attr_arg_name = build_string (strlen (arg_name), arg_name);
>>> +  attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE);
>>> +  attr = tree_cons (attr_name, attr_args, chain);
>>> +  return attr;
>>> +}
>>> +
>>> +/* Return a new name by appending SUFFIX to the DECL name.  If
>>> +   MAKE_UNIQUE is true, also include a file-specific unique name.  */
>>> +
>>> +static char *
>>> +make_name (tree decl, const char *suffix, bool make_unique)
>>> +{
>>> +  char *global_var_name;
>>> +  int name_len;
>>> +  const char *name;
>>> +  const char *unique_name = NULL;
>>> +
>>> +  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
>>> +
>>> +  /* Get a unique name that can be used globally without any chances
>>> +     of collision at link time.  */
>>> +  if (make_unique)
>>> +    unique_name = IDENTIFIER_POINTER (get_file_function_name ("\0"));
>>> +
>>> +  name_len = strlen (name) + strlen (suffix) + 2;
>>> +
>>> +  if (make_unique)
>>> +    name_len += strlen (unique_name) + 1;
>>> +  global_var_name = (char *) xmalloc (name_len);
>>> +
>>> +  /* Use '.' to concatenate names as it is demangler friendly.  */
>>> +  if (make_unique)
>>> +    snprintf (global_var_name, name_len, "%s.%s.%s", name,
>>> +             unique_name, suffix);
>>> +  else
>>> +    snprintf (global_var_name, name_len, "%s.%s", name, suffix);
>>> +
>>> +  return global_var_name;
>>> +}
>>> +
>>> +/* Make the resolver function decl for ifunc (IFUNC_DECL) to dispatch
>>> +   the versions of multi-versioned function DEFAULT_DECL.  Create an
>>> +   empty basic block in the resolver and store the pointer in
>>> +   EMPTY_BB.  Return the decl of the resolver function.  */
>>> +
>>> +static tree
>>> +make_ifunc_resolver_func (const tree default_decl,
>>> +                         const tree ifunc_decl,
>>> +                         basic_block *empty_bb)
>>> +{
>>> +  char *resolver_name;
>>> +  tree decl, type, decl_name, t;
>>> +  basic_block new_bb;
>>> +  tree old_current_function_decl;
>>> +  bool make_unique = false;
>>> +
>>> +  /* IFUNCs have to be globally visible.  So, if the default_decl is
>>> +     not, then the name of the IFUNC should be made unique.  */
>>> +  if (TREE_PUBLIC (default_decl) == 0)
>>> +    make_unique = true;
>>> +
>>> +  /* Append the filename to the resolver function if the versions are
>>> +     not externally visible.  This is because the resolver function has
>>> +     to be externally visible for the loader to find it.  So, appending
>>> +     the filename will prevent conflicts with a resolver function from
>>> +     another module which is based on the same version name.  */
>>> +  resolver_name = make_name (default_decl, "resolver", make_unique);
>>> +
>>> +  /* The resolver function should return a (void *). */
>>> +  type = build_function_type_list (ptr_type_node, NULL_TREE);
>>> +
>>> +  decl = build_fn_decl (resolver_name, type);
>>> +  decl_name = get_identifier (resolver_name);
>>> +  SET_DECL_ASSEMBLER_NAME (decl, decl_name);
>>> +
>>> +  DECL_NAME (decl) = decl_name;
>>> +  TREE_USED (decl) = TREE_USED (default_decl);
>>> +  DECL_ARTIFICIAL (decl) = 1;
>>> +  DECL_IGNORED_P (decl) = 0;
>>> +  /* IFUNC resolvers have to be externally visible.  */
>>> +  TREE_PUBLIC (decl) = 1;
>>> +  DECL_UNINLINABLE (decl) = 1;
>>> +
>>> +  DECL_EXTERNAL (decl) = DECL_EXTERNAL (default_decl);
>>> +  DECL_EXTERNAL (ifunc_decl) = 0;
>>> +
>>> +  DECL_CONTEXT (decl) = NULL_TREE;
>>> +  DECL_INITIAL (decl) = make_node (BLOCK);
>>> +  DECL_STATIC_CONSTRUCTOR (decl) = 0;
>>> +  TREE_READONLY (decl) = 0;
>>> +  DECL_PURE_P (decl) = 0;
>>> +  DECL_COMDAT (decl) = DECL_COMDAT (default_decl);
>>> +  if (DECL_COMDAT_GROUP (default_decl))
>>> +    {
>>> +      make_decl_one_only (decl, DECL_COMDAT_GROUP (default_decl));
>>> +    }
>>> +  /* Build result decl and add to function_decl. */
>>> +  t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node);
>>> +  DECL_ARTIFICIAL (t) = 1;
>>> +  DECL_IGNORED_P (t) = 1;
>>> +  DECL_RESULT (decl) = t;
>>> +
>>> +  gimplify_function_tree (decl);
>>> +  old_current_function_decl = current_function_decl;
>>> +  push_cfun (DECL_STRUCT_FUNCTION (decl));
>>> +  current_function_decl = decl;
>>> +  init_empty_tree_cfg_for_function (DECL_STRUCT_FUNCTION (decl));
>>> +  cfun->curr_properties |=
>>> +    (PROP_gimple_lcf | PROP_gimple_leh | PROP_cfg | PROP_referenced_vars |
>>> +     PROP_ssa);
>>> +  new_bb = create_empty_bb (ENTRY_BLOCK_PTR);
>>> +  make_edge (ENTRY_BLOCK_PTR, new_bb, EDGE_FALLTHRU);
>>> +  make_edge (new_bb, EXIT_BLOCK_PTR, 0);
>>> +  *empty_bb = new_bb;
>>> +
>>> +  cgraph_add_new_function (decl, true);
>>> +  cgraph_call_function_insertion_hooks (cgraph_get_create_node (decl));
>>> +  cgraph_analyze_function (cgraph_get_create_node (decl));
>>> +  cgraph_mark_needed_node (cgraph_get_create_node (decl));
>>> +
>>> +  if (DECL_COMDAT_GROUP (default_decl))
>>> +    {
>>> +      gcc_assert (cgraph_get_node (default_decl));
>>> +      cgraph_add_to_same_comdat_group (cgraph_get_node (decl),
>>> +                                      cgraph_get_node (default_decl));
>>> +    }
>>> +
>>> +  pop_cfun ();
>>> +  current_function_decl = old_current_function_decl;
>>> +
>>> +  gcc_assert (ifunc_decl != NULL);
>>> +  DECL_ATTRIBUTES (ifunc_decl)
>>> +    = make_attribute ("ifunc", resolver_name, DECL_ATTRIBUTES (ifunc_decl));
>>> +  assemble_alias (ifunc_decl, get_identifier (resolver_name));
>>> +  return decl;
>>> +}
>>> +
>>> +/* Make an ifunc declaration for the multi-versioned function DECL.  Calls to
>>> +   DECL function will be replaced with calls to the ifunc.  Return the decl
>>> +   of the ifunc created.  */
>>> +
>>> +static tree
>>> +make_ifunc_func (const tree decl)
>>> +{
>>> +  tree ifunc_decl;
>>> +  char *ifunc_name, *resolver_name;
>>> +  tree fn_type, ifunc_type;
>>> +  bool make_unique = false;
>>> +
>>> +  if (TREE_PUBLIC (decl) == 0)
>>> +    make_unique = true;
>>> +
>>> +  ifunc_name = make_name (decl, "ifunc", make_unique);
>>> +  resolver_name = make_name (decl, "resolver", make_unique);
>>> +  gcc_assert (resolver_name);
>>> +
>>> +  fn_type = TREE_TYPE (decl);
>>> +  ifunc_type = build_function_type (TREE_TYPE (fn_type),
>>> +                                   TYPE_ARG_TYPES (fn_type));
>>> +
>>> +  ifunc_decl = build_fn_decl (ifunc_name, ifunc_type);
>>> +  TREE_USED (ifunc_decl) = 1;
>>> +  DECL_CONTEXT (ifunc_decl) = NULL_TREE;
>>> +  DECL_INITIAL (ifunc_decl) = error_mark_node;
>>> +  DECL_ARTIFICIAL (ifunc_decl) = 1;
>>> +  /* Mark this ifunc as external; the resolver will flip it again if
>>> +     it gets generated.  */
>>> +  DECL_EXTERNAL (ifunc_decl) = 1;
>>> +  /* IFUNCs have to be externally visible.  */
>>> +  TREE_PUBLIC (ifunc_decl) = 1;
>>> +
>>> +  return ifunc_decl;
>>> +}
>>> +
>>> +/* For multi-versioned function DECL, which must also be the default,
>>> +   return the decl of the ifunc dispatcher, creating it if it does not
>>> +   exist.  */
>>> +
>>> +tree
>>> +get_ifunc_for_version (const tree decl)
>>> +{
>>> +  version_function *decl_v;
>>> +  int ix;
>>> +  void_p ele;
>>> +
>>> +  /* DECL has to be the default version, otherwise it is missing and
>>> +     that is not allowed.  */
>>> +  if (!is_default_function (decl))
>>> +    {
>>> +      error_at (DECL_SOURCE_LOCATION (decl), "default version not found");
>>> +      return decl;
>>> +    }
>>> +
>>> +  decl_v = find_function_version (decl);
>>> +  gcc_assert (decl_v != NULL);
>>> +  if (decl_v->ifunc_decl == NULL)
>>> +    {
>>> +      tree ifunc_decl;
>>> +      ifunc_decl = make_ifunc_func (decl);
>>> +      decl_v->ifunc_decl = ifunc_decl;
>>> +    }
>>> +
>>> +  if (cgraph_get_node (decl))
>>> +    cgraph_mark_needed_node (cgraph_get_node (decl));
>>> +
>>> +  for (ix = 0; VEC_iterate (void_p, decl_v->versions, ix, ele); ++ix)
>>> +    {
>>> +      version_function *v = (version_function *) ele;
>>> +      gcc_assert (v->decl != NULL);
>>> +      if (cgraph_get_node (v->decl))
>>> +       cgraph_mark_needed_node (cgraph_get_node (v->decl));
>>> +    }
>>> +
>>> +  return decl_v->ifunc_decl;
>>> +}
>>> +
>>> +/* Generate the dispatching code to dispatch multi-versioned function
>>> +   DECL.  Make a new function decl for dispatching and call the target
>>> +   hook to process the "targetv" attributes and provide the code to
>>> +   dispatch the right function at run-time.  */
>>> +
>>> +static tree
>>> +make_ifunc_resolver_for_version (const tree decl)
>>> +{
>>> +  version_function *decl_v;
>>> +  tree ifunc_resolver_decl, ifunc_decl;
>>> +  basic_block empty_bb;
>>> +  int ix;
>>> +  void_p ele;
>>> +  VEC (tree, heap) *fn_ver_vec = NULL;
>>> +
>>> +  gcc_assert (is_default_function (decl));
>>> +
>>> +  decl_v = find_function_version (decl);
>>> +  gcc_assert (decl_v != NULL);
>>> +
>>> +  if (decl_v->ifunc_resolver_decl != NULL)
>>> +    return decl_v->ifunc_resolver_decl;
>>> +
>>> +  ifunc_decl = decl_v->ifunc_decl;
>>> +
>>> +  if (ifunc_decl == NULL)
>>> +    ifunc_decl = decl_v->ifunc_decl = make_ifunc_func (decl);
>>> +
>>> +  ifunc_resolver_decl = make_ifunc_resolver_func (decl, ifunc_decl,
>>> +                                                 &empty_bb);
>>> +
>>> +  fn_ver_vec = VEC_alloc (tree, heap, 2);
>>> +  VEC_safe_push (tree, heap, fn_ver_vec, decl);
>>> +
>>> +  for (ix = 0; VEC_iterate (void_p, decl_v->versions, ix, ele); ++ix)
>>> +    {
>>> +      version_function *v = (version_function *) ele;
>>> +      gcc_assert (v->decl != NULL);
>>> +      /* Check for virtual functions here again, as by this time it should
>>> +        have been determined if this function needs a vtable index or
>>> +        not.  This happens for methods in derived classes that override
>>> +        virtual methods in base classes but are not explicitly marked as
>>> +        virtual.  */
>>> +      if (DECL_VINDEX (v->decl))
>>> +        error_at (DECL_SOURCE_LOCATION (v->decl),
>>> +                 "virtual function versioning not supported");
>>> +      if (!v->is_deleted)
>>> +       VEC_safe_push (tree, heap, fn_ver_vec, v->decl);
>>> +    }
>>> +
>>> +  gcc_assert (targetm.dispatch_version);
>>> +  targetm.dispatch_version (ifunc_resolver_decl, fn_ver_vec, &empty_bb);
>>> +  decl_v->ifunc_resolver_decl = ifunc_resolver_decl;
>>> +
>>> +  return ifunc_resolver_decl;
>>> +}
>>> +
>>> +/* Main entry point to pass_dispatch_versions. For multi-versioned functions,
>>> +   generate the dispatching code.  */
>>> +
>>> +static unsigned int
>>> +do_dispatch_versions (void)
>>> +{
>>> +  /* A new pass for generating dispatch code for multi-versioned functions.
>>> +     Currently, dispatching is done through IFUNC.  Other forms of dispatch
>>> +     can be added for targets without ifunc support, such as calling the
>>> +     function directly after checking the target type.  This pass will
>>> +     become more meaningful when other dispatch mechanisms are added.  */
>>> +
>>> +  /* Cloning a function to produce more versions will happen here when the
>>> +     user requests that via the targetv attribute. For example,
>>> +     int foo () __attribute__ ((targetv(("arch=core2"), ("arch=corei7"))));
>>> +     means that the user wants the same body of foo to be versioned for core2
>>> +     and corei7.  In that case, this function will be cloned during this
>>> +     pass.  */
>>> +
>>> +  if (DECL_FUNCTION_VERSIONED (current_function_decl)
>>> +      && is_default_function (current_function_decl))
>>> +    {
>>> +      tree decl = make_ifunc_resolver_for_version (current_function_decl);
>>> +      if (dump_file && decl)
>>> +       dump_function_to_file (decl, dump_file, TDF_BLOCKS);
>>> +    }
>>> +  return 0;
>>> +}
>>> +
>>> +static bool
>>> +gate_dispatch_versions (void)
>>> +{
>>> +  return true;
>>> +}
>>> +
>>> +/* A pass to generate the dispatch code to execute the appropriate version
>>> +   of a multi-versioned function at run-time.  */
>>> +
>>> +struct gimple_opt_pass pass_dispatch_versions =
>>> +{
>>> + {
>>> +  GIMPLE_PASS,
>>> +  "dispatch_multiversion_functions",    /* name */
>>> +  gate_dispatch_versions,              /* gate */
>>> +  do_dispatch_versions,                        /* execute */
>>> +  NULL,                                        /* sub */
>>> +  NULL,                                        /* next */
>>> +  0,                                   /* static_pass_number */
>>> +  TV_MULTIVERSION_DISPATCH,            /* tv_id */
>>> +  PROP_cfg,                            /* properties_required */
>>> +  PROP_cfg,                            /* properties_provided */
>>> +  0,                                   /* properties_destroyed */
>>> +  0,                                   /* todo_flags_start */
>>> +  TODO_dump_func |                     /* todo_flags_finish */
>>> +  TODO_cleanup_cfg | TODO_dump_cgraph
>>> + }
>>> +};
>>> Index: cgraphunit.c
>>> ===================================================================
>>> --- cgraphunit.c        (revision 184971)
>>> +++ cgraphunit.c        (working copy)
>>> @@ -141,6 +141,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "ipa-inline.h"
>>>  #include "ipa-utils.h"
>>>  #include "lto-streamer.h"
>>> +#include "multiversion.h"
>>>
>>>  static void cgraph_expand_all_functions (void);
>>>  static void cgraph_mark_functions_to_output (void);
>>> @@ -343,6 +344,13 @@ cgraph_finalize_function (tree decl, bool nested)
>>>       node->local.redefined_extern_inline = true;
>>>     }
>>>
>>> +  /* If this is a function version and not the default, change the
>>> +     assembler name of this function.  The DECL names of function
>>> +     versions are the same, only the assembler names are made unique.
>>> +     The assembler name is changed by appending the string from
>>> +     the "targetv" attribute.  */
>>> +  version_assembler_name (decl);
>>> +
>>>   notice_global_symbol (decl);
>>>   node->local.finalized = true;
>>>   node->lowered = DECL_STRUCT_FUNCTION (decl)->cfg != NULL;
>>> Index: multiversion.h
>>> ===================================================================
>>> --- multiversion.h      (revision 0)
>>> +++ multiversion.h      (revision 0)
>>> @@ -0,0 +1,52 @@
>>> +/* Function Multiversioning.
>>> +   Copyright (C) 2012 Free Software Foundation, Inc.
>>> +   Contributed by Sriraman Tallam (tmsriram@google.com)
>>> +
>>> +This file is part of GCC.
>>> +
>>> +GCC is free software; you can redistribute it and/or modify it under
>>> +the terms of the GNU General Public License as published by the Free
>>> +Software Foundation; either version 3, or (at your option) any later
>>> +version.
>>> +
>>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> +for more details.
>>> +
>>> +You should have received a copy of the GNU General Public License
>>> +along with GCC; see the file COPYING3.  If not see
>>> +<http://www.gnu.org/licenses/>. */
>>> +
>>> +/* This is the header file which provides the functions to keep track
>>> +   of functions that are multi-versioned and to generate the dispatch
>>> +   code to call the right version at run-time.  */
>>> +
>>> +#ifndef GCC_MULTIVERSION_H
>>> +#define GCC_MULTIVERSION_H
>>> +
>>> +#include "tree.h"
>>> +
>>> +/* Mark DECL1 and DECL2 as function versions.  */
>>> +int group_function_versions (const tree decl1, const tree decl2);
>>> +
>>> +/* Mark DECL as deleted and no longer a version.  */
>>> +void mark_delete_decl_version (const tree decl);
>>> +
>>> +/* Returns true if DECL is the default version to be executed if all
>>> +   other versions are inappropriate at run-time.  */
>>> +bool is_default_function (const tree decl);
>>> +
>>> +/* Gets the IFUNC dispatcher for this multi-versioned function DECL. DECL
>>> +   must be the default function in the multi-versioned group.  */
>>> +tree get_ifunc_for_version (const tree decl);
>>> +
>>> +/* Returns true when only one of DECL1 and DECL2 is marked with "targetv"
>>> +   or if the "targetv" attribute strings of DECL1 and DECL2 don't match.  */
>>> +bool has_different_version_attributes (const tree decl1, const tree decl2);
>>> +
>>> +/* If DECL is a function version and not the default version, the assembler
>>> +   name of DECL is changed to include the attribute string to keep the
>>> +   name unambiguous.  */
>>> +void version_assembler_name (const tree decl);
>>> +#endif
>>> Index: cp/class.c
>>> ===================================================================
>>> --- cp/class.c  (revision 184971)
>>> +++ cp/class.c  (working copy)
>>> @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "tree-dump.h"
>>>  #include "splay-tree.h"
>>>  #include "pointer-set.h"
>>> +#include "multiversion.h"
>>>
>>>  /* The number of nested classes being processed.  If we are not in the
>>>    scope of any class, this is zero.  */
>>> @@ -1092,7 +1093,20 @@ add_method (tree type, tree method, tree using_dec
>>>              || same_type_p (TREE_TYPE (fn_type),
>>>                              TREE_TYPE (method_type))))
>>>        {
>>> -         if (using_decl)
>>> +         /* For function versions, their parms and types match
>>> +            but they are not duplicates.  Record function versions
>>> +            as and when they are found.  */
>>> +         if (TREE_CODE (fn) == FUNCTION_DECL
>>> +             && TREE_CODE (method) == FUNCTION_DECL
>>> +             && (DECL_FUNCTION_VERSIONED (fn)
>>> +                 || DECL_FUNCTION_VERSIONED (method)))
>>> +           {
>>> +             DECL_FUNCTION_VERSIONED (fn) = 1;
>>> +             DECL_FUNCTION_VERSIONED (method) = 1;
>>> +             group_function_versions (fn, method);
>>> +             continue;
>>> +           }
>>> +         else if (using_decl)
>>>            {
>>>              if (DECL_CONTEXT (fn) == type)
>>>                /* Defer to the local function.  */
>>> @@ -1150,6 +1164,13 @@ add_method (tree type, tree method, tree using_dec
>>>   else
>>>     /* Replace the current slot.  */
>>>     VEC_replace (tree, method_vec, slot, overload);
>>> +
>>> +  /* Change the assembler name of method here if it has "targetv"
>>> +     attributes.  Since all versions have the same mangled name,
>>> +     their assembler name is changed by appending the string from
>>> +     the "targetv" attribute. */
>>> +  version_assembler_name (method);
>>> +
>>>   return true;
>>>  }
>>>
>>> @@ -6890,8 +6911,11 @@ resolve_address_of_overloaded_function (tree targe
>>>          if (DECL_ANTICIPATED (fn))
>>>            continue;
>>>
>>> -         /* See if there's a match.  */
>>> -         if (same_type_p (target_fn_type, static_fn_type (fn)))
>>> +         /* See if there's a match.  For multi-versioned functions,
>>> +            match only the default version.  */
>>> +         if (same_type_p (target_fn_type, static_fn_type (fn))
>>> +             && (!DECL_FUNCTION_VERSIONED (fn)
>>> +                 || is_default_function (fn)))
>>>            matches = tree_cons (fn, NULL_TREE, matches);
>>>        }
>>>     }
>>> @@ -7053,6 +7077,21 @@ resolve_address_of_overloaded_function (tree targe
>>>       perform_or_defer_access_check (access_path, fn, fn);
>>>     }
>>>
>>> +  /* If a pointer to a function that is multi-versioned is requested, the
>>> +     pointer to the dispatcher function is returned instead.  This works
>>> +     well because indirectly calling the function will dispatch the right
>>> +     function version at run-time. Also, the function address is kept
>>> +     unique.  */
>>> +  if (DECL_FUNCTION_VERSIONED (fn)
>>> +      && is_default_function (fn))
>>> +    {
>>> +      tree ifunc_decl;
>>> +      ifunc_decl = get_ifunc_for_version (fn);
>>> +      gcc_assert (ifunc_decl != NULL);
>>> +      mark_used (fn);
>>> +      return build_fold_addr_expr (ifunc_decl);
>>> +    }
>>> +
>>>   if (TYPE_PTRFN_P (target_type) || TYPE_PTRMEMFUNC_P (target_type))
>>>     return cp_build_addr_expr (fn, flags);
>>>   else
>>> Index: cp/decl.c
>>> ===================================================================
>>> --- cp/decl.c   (revision 184971)
>>> +++ cp/decl.c   (working copy)
>>> @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "pointer-set.h"
>>>  #include "splay-tree.h"
>>>  #include "plugin.h"
>>> +#include "multiversion.h"
>>>
>>>  /* Possible cases of bad specifiers type used by bad_specifiers. */
>>>  enum bad_spec_place {
>>> @@ -972,6 +973,23 @@ decls_match (tree newdecl, tree olddecl)
>>>       if (t1 != t2)
>>>        return 0;
>>>
>>> +      /* The decls don't match if they correspond to two different versions
>>> +        of the same function.  */
>>> +      if (compparms (p1, p2)
>>> +         && same_type_p (TREE_TYPE (f1), TREE_TYPE (f2))
>>> +         && (DECL_FUNCTION_VERSIONED (newdecl)
>>> +             || DECL_FUNCTION_VERSIONED (olddecl))
>>> +         && has_different_version_attributes (newdecl, olddecl))
>>> +       {
>>> +         /* One of the decls could be the default without the "targetv"
>>> +            attribute. Set it to be a versioned function here.  */
>>> +         DECL_FUNCTION_VERSIONED (newdecl) = 1;
>>> +         DECL_FUNCTION_VERSIONED (olddecl) = 1;
>>> +         /* Accumulate all the versions of a function.  */
>>> +         group_function_versions (olddecl, newdecl);
>>> +         return 0;
>>> +       }
>>> +
>>>       if (CP_DECL_CONTEXT (newdecl) != CP_DECL_CONTEXT (olddecl)
>>>          && ! (DECL_EXTERN_C_P (newdecl)
>>>                && DECL_EXTERN_C_P (olddecl)))
>>> @@ -1482,7 +1500,11 @@ duplicate_decls (tree newdecl, tree olddecl, bool
>>>              error ("previous declaration %q+#D here", olddecl);
>>>              return NULL_TREE;
>>>            }
>>> -         else if (compparms (TYPE_ARG_TYPES (TREE_TYPE (newdecl)),
>>> +         /* For function versions, params and types match, but they
>>> +            are not ambiguous.  */
>>> +         else if ((!DECL_FUNCTION_VERSIONED (newdecl)
>>> +                   && !DECL_FUNCTION_VERSIONED (olddecl))
>>> +                  && compparms (TYPE_ARG_TYPES (TREE_TYPE (newdecl)),
>>>                              TYPE_ARG_TYPES (TREE_TYPE (olddecl))))
>>>            {
>>>              error ("new declaration %q#D", newdecl);
>>> @@ -2250,6 +2272,16 @@ duplicate_decls (tree newdecl, tree olddecl, bool
>>>   else if (DECL_PRESERVE_P (newdecl))
>>>     DECL_PRESERVE_P (olddecl) = 1;
>>>
>>> +  /* If the olddecl is a version, so is the newdecl.  */
>>> +  if (TREE_CODE (newdecl) == FUNCTION_DECL
>>> +      && DECL_FUNCTION_VERSIONED (olddecl))
>>> +    {
>>> +      DECL_FUNCTION_VERSIONED (newdecl) = 1;
>>> +      /* Record that newdecl is not a valid version and has
>>> +        been deleted.  */
>>> +      mark_delete_decl_version (newdecl);
>>> +    }
>>> +
>>>   if (TREE_CODE (newdecl) == FUNCTION_DECL)
>>>     {
>>>       int function_size;
>>> @@ -4512,6 +4544,10 @@ start_decl (const cp_declarator *declarator,
>>>   /* Enter this declaration into the symbol table.  */
>>>   decl = maybe_push_decl (decl);
>>>
>>> +  /* If this decl is a function version and not the default, its assembler
>>> +     name has to be changed.  */
>>> +  version_assembler_name (decl);
>>> +
>>>   if (processing_template_decl)
>>>     decl = push_template_decl (decl);
>>>   if (decl == error_mark_node)
>>> @@ -13019,6 +13055,10 @@ start_function (cp_decl_specifier_seq *declspecs,
>>>     gcc_assert (same_type_p (TREE_TYPE (TREE_TYPE (decl1)),
>>>                             integer_type_node));
>>>
>>> +  /* If this decl is a function version and not the default, its assembler
>>> +     name has to be changed.  */
>>> +  version_assembler_name (decl1);
>>> +
>>>   start_preparsed_function (decl1, attrs, /*flags=*/SF_DEFAULT);
>>>
>>>   return 1;
>>> @@ -13960,6 +14000,11 @@ cxx_comdat_group (tree decl)
>>>            break;
>>>        }
>>>       name = DECL_ASSEMBLER_NAME (decl);
>>> +      if (TREE_CODE (decl) == FUNCTION_DECL
>>> +         && DECL_FUNCTION_VERSIONED (decl))
>>> +       name = DECL_NAME (decl);
>>> +      else
>>> +        name = DECL_ASSEMBLER_NAME (decl);
>>>     }
>>>
>>>   return name;
>>> Index: cp/semantics.c
>>> ===================================================================
>>> --- cp/semantics.c      (revision 184971)
>>> +++ cp/semantics.c      (working copy)
>>> @@ -3783,8 +3783,11 @@ expand_or_defer_fn_1 (tree fn)
>>>       /* If the user wants us to keep all inline functions, then mark
>>>         this function as needed so that finish_file will make sure to
>>>         output it later.  Similarly, all dllexport'd functions must
>>> -        be emitted; there may be callers in other DLLs.  */
>>> -      if ((flag_keep_inline_functions
>>> +        be emitted; there may be callers in other DLLs.
>>> +        Also, mark this function as needed if it is marked inline but
>>> +        is a multi-versioned function.  */
>>> +      if (((flag_keep_inline_functions
>>> +           || DECL_FUNCTION_VERSIONED (fn))
>>>           && DECL_DECLARED_INLINE_P (fn)
>>>           && !DECL_REALLY_EXTERN (fn))
>>>          || (flag_keep_inline_dllexport
>>> Index: cp/decl2.c
>>> ===================================================================
>>> --- cp/decl2.c  (revision 184971)
>>> +++ cp/decl2.c  (working copy)
>>> @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "splay-tree.h"
>>>  #include "langhooks.h"
>>>  #include "c-family/c-ada-spec.h"
>>> +#include "multiversion.h"
>>>
>>>  extern cpp_reader *parse_in;
>>>
>>> @@ -674,9 +675,13 @@ check_classfn (tree ctype, tree function, tree tem
>>>          if (is_template != (TREE_CODE (fndecl) == TEMPLATE_DECL))
>>>            continue;
>>>
>>> +         /* While finding a match, same types and params are not enough
>>> +            if the function is versioned.  Also check version ("targetv")
>>> +            attributes.  */
>>>          if (same_type_p (TREE_TYPE (TREE_TYPE (function)),
>>>                           TREE_TYPE (TREE_TYPE (fndecl)))
>>>              && compparms (p1, p2)
>>> +             && !has_different_version_attributes (function, fndecl)
>>>              && (!is_template
>>>                  || comp_template_parms (template_parms,
>>>                                          DECL_TEMPLATE_PARMS (fndecl)))
>>> Index: cp/call.c
>>> ===================================================================
>>> --- cp/call.c   (revision 184971)
>>> +++ cp/call.c   (working copy)
>>> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "langhooks.h"
>>>  #include "c-family/c-objc.h"
>>>  #include "timevar.h"
>>> +#include "multiversion.h"
>>>
>>>  /* The various kinds of conversion.  */
>>>
>>> @@ -6730,6 +6731,17 @@ build_over_call (struct z_candidate *cand, int fla
>>>   if (!already_used)
>>>     mark_used (fn);
>>>
>>> +  /* For a call to a multi-versioned function, the call should actually be to
>>> +     the dispatcher.  */
>>> +  if (DECL_FUNCTION_VERSIONED (fn))
>>> +    {
>>> +      tree ifunc_decl;
>>> +      ifunc_decl = get_ifunc_for_version (fn);
>>> +      gcc_assert (ifunc_decl != NULL);
>>> +      return build_call_expr_loc_array (UNKNOWN_LOCATION, ifunc_decl,
>>> +                                       nargs, argarray);
>>> +    }
>>> +
>>>   if (DECL_VINDEX (fn) && (flags & LOOKUP_NONVIRTUAL) == 0)
>>>     {
>>>       tree t;
>>> @@ -7980,6 +7992,30 @@ joust (struct z_candidate *cand1, struct z_candida
>>>   size_t i;
>>>   size_t len;
>>>
>>> +  /* For candidates of a multi-versioned function, the one marked default
>>> +     wins.  This is because the default decl is used as key to aggregate
>>> +     all the other versions provided for it in multiversion.c.  When
>>> +     generating the actual call, the appropriate dispatcher is created
>>> +     to call the right function version at run-time.  */
>>> +
>>> +  if ((TREE_CODE (cand1->fn) == FUNCTION_DECL
>>> +       && DECL_FUNCTION_VERSIONED (cand1->fn))
>>> +      || (TREE_CODE (cand2->fn) == FUNCTION_DECL
>>> +          && DECL_FUNCTION_VERSIONED (cand2->fn)))
>>> +    {
>>> +      if (is_default_function (cand1->fn))
>>> +       {
>>> +         mark_used (cand2->fn);
>>> +         return 1;
>>> +       }
>>> +      if (is_default_function (cand2->fn))
>>> +       {
>>> +         mark_used (cand1->fn);
>>> +         return -1;
>>> +       }
>>> +      return 0;
>>> +    }
>>> +
>>>   /* Candidates that involve bad conversions are always worse than those
>>>      that don't.  */
>>>   if (cand1->viable > cand2->viable)
>>> Index: timevar.def
>>> ===================================================================
>>> --- timevar.def (revision 184971)
>>> +++ timevar.def (working copy)
>>> @@ -253,6 +253,7 @@ DEFTIMEVAR (TV_TREE_IFCOMBINE        , "tree if-co
>>>  DEFTIMEVAR (TV_TREE_UNINIT           , "uninit var analysis")
>>>  DEFTIMEVAR (TV_PLUGIN_INIT           , "plugin initialization")
>>>  DEFTIMEVAR (TV_PLUGIN_RUN            , "plugin execution")
>>> +DEFTIMEVAR (TV_MULTIVERSION_DISPATCH , "multiversion dispatch")
>>>
>>>  /* Everything else in rest_of_compilation not included above.  */
>>>  DEFTIMEVAR (TV_EARLY_LOCAL          , "early local passes")
>>> Index: varasm.c
>>> ===================================================================
>>> --- varasm.c    (revision 184971)
>>> +++ varasm.c    (working copy)
>>> @@ -5755,6 +5755,8 @@ finish_aliases_1 (void)
>>>        }
>>>       else if (! (p->emitted_diags & ALIAS_DIAG_TO_EXTERN)
>>>               && DECL_EXTERNAL (target_decl)
>>> +              && (TREE_CODE (target_decl) != FUNCTION_DECL
>>> +                  || !DECL_STRUCT_FUNCTION (target_decl))
>>>               /* We use local aliases for C++ thunks to force the tailcall
>>>                  to bind locally.  This is a hack - to keep it working do
>>>                  the following (which is not strictly correct).  */
>>> Index: Makefile.in
>>> ===================================================================
>>> --- Makefile.in (revision 184971)
>>> +++ Makefile.in (working copy)
>>> @@ -1298,6 +1298,7 @@ OBJS = \
>>>        mcf.o \
>>>        mode-switching.o \
>>>        modulo-sched.o \
>>> +       multiversion.o \
>>>        omega.o \
>>>        omp-low.o \
>>>        optabs.o \
>>> @@ -3030,6 +3031,11 @@ ree.o : ree.c $(CONFIG_H) $(SYSTEM_H) coretypes.h
>>>    $(DF_H) $(TIMEVAR_H) tree-pass.h $(RECOG_H) $(EXPR_H) \
>>>    $(REGS_H) $(TREE_H) $(TM_P_H) insn-config.h $(INSN_ATTR_H) $(DIAGNOSTIC_CORE_H) \
>>>    $(TARGET_H) $(OPTABS_H) insn-codes.h rtlhooks-def.h $(PARAMS_H) $(CGRAPH_H)
>>> +multiversion.o : multiversion.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
>>> +   $(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \
>>> +   $(DIAGNOSTIC_H) $(FIBHEAP_H) $(PARAMS_H) $(TIMEVAR_H) tree-pass.h \
>>> +   $(HASHTAB_H) $(COVERAGE_H) $(GGC_H) $(TREE_FLOW_H) $(RTL_H) $(IPA_PROP_H) \
>>> +   $(BASIC_BLOCK_H) $(TOPLEV_H) $(TREE_DUMP_H) ipa-inline.h
>>>  cprop.o : cprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
>>>    $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(GGC_H) \
>>>    $(RECOG_H) $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) output.h toplev.h $(DIAGNOSTIC_CORE_H) \
>>> Index: passes.c
>>> ===================================================================
>>> --- passes.c    (revision 184971)
>>> +++ passes.c    (working copy)
>>> @@ -1190,6 +1190,7 @@ init_optimization_passes (void)
>>>   NEXT_PASS (pass_build_cfg);
>>>   NEXT_PASS (pass_warn_function_return);
>>>   NEXT_PASS (pass_build_cgraph_edges);
>>> +  NEXT_PASS (pass_dispatch_versions);
>>>   *p = NULL;
>>>
>>>   /* Interprocedural optimization passes.  */
>>> Index: config/i386/i386.c
>>> ===================================================================
>>> --- config/i386/i386.c  (revision 184971)
>>> +++ config/i386/i386.c  (working copy)
>>> @@ -27446,6 +27473,593 @@ ix86_init_mmx_sse_builtins (void)
>>>     }
>>>  }
>>>
>>> +/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL
>>> +   to return a pointer to VERSION_DECL if the outcome of the function
>>> +   PREDICATE_DECL is true.  This function will be called during version
>>> +   dispatch to decide which function version to execute.  It returns the
>>> +   basic block at the end to which more conditions can be added.  */
>>> +
>>> +static basic_block
>>> +add_condition_to_bb (tree function_decl, tree version_decl,
>>> +                    basic_block new_bb, tree predicate_decl)
>>> +{
>>> +  gimple return_stmt;
>>> +  tree convert_expr, result_var;
>>> +  gimple convert_stmt;
>>> +  gimple call_cond_stmt;
>>> +  gimple if_else_stmt;
>>> +
>>> +  basic_block bb1, bb2, bb3;
>>> +  edge e12, e23;
>>> +
>>> +  tree cond_var;
>>> +  gimple_seq gseq;
>>> +
>>> +  tree old_current_function_decl;
>>> +
>>> +  old_current_function_decl = current_function_decl;
>>> +  push_cfun (DECL_STRUCT_FUNCTION (function_decl));
>>> +  current_function_decl = function_decl;
>>> +
>>> +  gcc_assert (new_bb != NULL);
>>> +  gseq = bb_seq (new_bb);
>>> +
>>> +
>>> +  convert_expr = build1 (CONVERT_EXPR, ptr_type_node,
>>> +                        build_fold_addr_expr (version_decl));
>>> +  result_var = create_tmp_var (ptr_type_node, NULL);
>>> +  convert_stmt = gimple_build_assign (result_var, convert_expr);
>>> +  return_stmt = gimple_build_return (result_var);
>>> +
>>> +  if (predicate_decl == NULL_TREE)
>>> +    {
>>> +      gimple_seq_add_stmt (&gseq, convert_stmt);
>>> +      gimple_seq_add_stmt (&gseq, return_stmt);
>>> +      set_bb_seq (new_bb, gseq);
>>> +      gimple_set_bb (convert_stmt, new_bb);
>>> +      gimple_set_bb (return_stmt, new_bb);
>>> +      pop_cfun ();
>>> +      current_function_decl = old_current_function_decl;
>>> +      return new_bb;
>>> +    }
>>> +
>>> +  cond_var = create_tmp_var (integer_type_node, NULL);
>>> +  call_cond_stmt = gimple_build_call (predicate_decl, 0);
>>> +  gimple_call_set_lhs (call_cond_stmt, cond_var);
>>> +
>>> +  gimple_set_block (call_cond_stmt, DECL_INITIAL (function_decl));
>>> +  gimple_set_bb (call_cond_stmt, new_bb);
>>> +  gimple_seq_add_stmt (&gseq, call_cond_stmt);
>>> +
>>> +  if_else_stmt = gimple_build_cond (GT_EXPR, cond_var,
>>> +                                   integer_zero_node,
>>> +                                   NULL_TREE, NULL_TREE);
>>> +  gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl));
>>> +  gimple_set_bb (if_else_stmt, new_bb);
>>> +  gimple_seq_add_stmt (&gseq, if_else_stmt);
>>> +
>>> +  gimple_seq_add_stmt (&gseq, convert_stmt);
>>> +  gimple_seq_add_stmt (&gseq, return_stmt);
>>> +  set_bb_seq (new_bb, gseq);
>>> +
>>> +  bb1 = new_bb;
>>> +  e12 = split_block (bb1, if_else_stmt);
>>> +  bb2 = e12->dest;
>>> +  e12->flags &= ~EDGE_FALLTHRU;
>>> +  e12->flags |= EDGE_TRUE_VALUE;
>>> +
>>> +  e23 = split_block (bb2, return_stmt);
>>> +
>>> +  gimple_set_bb (convert_stmt, bb2);
>>> +  gimple_set_bb (return_stmt, bb2);
>>> +
>>> +  bb3 = e23->dest;
>>> +  make_edge (bb1, bb3, EDGE_FALSE_VALUE);
>>> +
>>> +  remove_edge (e23);
>>> +  make_edge (bb2, EXIT_BLOCK_PTR, 0);
>>> +
>>> +  rebuild_cgraph_edges ();
>>> +
>>> +  pop_cfun ();
>>> +  current_function_decl = old_current_function_decl;
>>> +
>>> +  return bb3;
>>> +}
>>> +
>>> +/* This parses the attribute arguments to targetv in DECL and determines
>>> +   the right builtin to use to match the platform specification.
>>> +   For now, only one target argument ("arch=") is allowed.  */
>>> +
>>> +static enum ix86_builtins
>>> +get_builtin_code_for_version (tree decl)
>>> +{
>>> +  tree attrs;
>>> +  struct cl_target_option cur_target;
>>> +  tree target_node;
>>> +  struct cl_target_option *new_target;
>>> +  enum ix86_builtins builtin_code = IX86_BUILTIN_MAX;
>>> +
>>> +  attrs = lookup_attribute ("targetv", DECL_ATTRIBUTES (decl));
>>> +  gcc_assert (attrs != NULL);
>>> +
>>> +  cl_target_option_save (&cur_target, &global_options);
>>> +
>>> +  target_node = ix86_valid_target_attribute_tree
>>> +                 (TREE_VALUE (TREE_VALUE (attrs)));
>>> +
>>> +  gcc_assert (target_node);
>>> +  new_target = TREE_TARGET_OPTION (target_node);
>>> +  gcc_assert (new_target);
>>> +
>>> +  if (new_target->arch_specified && new_target->arch > 0)
>>> +    {
>>> +      switch (new_target->arch)
>>> +        {
>>> +       case 1:
>>> +       case 2:
>>> +       case 3:
>>> +       case 4:
>>> +       case 5:
>>> +       case 6:
>>> +       case 7:
>>> +       case 8:
>>> +       case 9:
>>> +       case 10:
>>> +       case 11:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_INTEL;
>>> +         break;
>>> +       case 12:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_INTEL_CORE2;
>>> +         break;
>>> +       case 13:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_INTEL_COREI7;
>>> +         break;
>>> +       case 14:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_INTEL_ATOM;
>>> +         break;
>>> +       case 15:
>>> +       case 16:
>>> +       case 17:
>>> +       case 18:
>>> +       case 19:
>>> +       case 20:
>>> +       case 21:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMD;
>>> +         break;
>>> +       case 22:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMDFAM10H;
>>> +         break;
>>> +       case 23:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1;
>>> +         break;
>>> +       case 24:
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2;
>>> +         break;
>>> +       case 25: /* What is btver1 ? */
>>> +         builtin_code = IX86_BUILTIN_CPU_IS_AMD;
>>> +         break;
>>> +       }
>>> +    }
>>> +
>>> +  cl_target_option_restore (&global_options, &cur_target);
>>> +  if (builtin_code == IX86_BUILTIN_MAX)
>>> +      error_at (DECL_SOURCE_LOCATION (decl),
>>> +               "No dispatcher found for the versioning attributes");
>>> +
>>> +  return builtin_code;
>>> +}
>>> +
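
A table keyed on the -march name might be easier to review than the bare case numbers in the switch above. This is only a sketch under assumptions: the enum, the table and check_for_arch are hypothetical names that do not appear in the patch, and pairing each check with an -march string is my guess at what the numbered cases stand for.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the IX86_BUILTIN_CPU_IS_* codes.  */
enum cpu_check
{
  CHECK_NONE,
  CHECK_INTEL_CORE2,
  CHECK_INTEL_COREI7,
  CHECK_INTEL_ATOM,
  CHECK_AMDFAM10H,
  CHECK_AMDFAM15H_BDVER1,
  CHECK_AMDFAM15H_BDVER2
};

struct arch_check
{
  const char *arch;       /* Value given to "arch=" in the attribute.  */
  enum cpu_check check;   /* Run-time check that selects this version.  */
};

/* Assumed pairing of -march names with checks; not taken from the patch.  */
static const struct arch_check arch_checks[] =
{
  { "core2", CHECK_INTEL_CORE2 },
  { "corei7", CHECK_INTEL_COREI7 },
  { "atom", CHECK_INTEL_ATOM },
  { "amdfam10", CHECK_AMDFAM10H },
  { "bdver1", CHECK_AMDFAM15H_BDVER1 },
  { "bdver2", CHECK_AMDFAM15H_BDVER2 }
};

/* Return the check for ARCH, or CHECK_NONE when no entry matches.  */
static enum cpu_check
check_for_arch (const char *arch)
{
  size_t i;

  for (i = 0; i < sizeof arch_checks / sizeof arch_checks[0]; ++i)
    if (strcmp (arch_checks[i].arch, arch) == 0)
      return arch_checks[i].check;
  return CHECK_NONE;
}
```

With a table like this, an unknown name falls through to CHECK_NONE, which plays the role IX86_BUILTIN_MAX plays in the function above.
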
>>> +/* This is the target hook to generate the dispatch function for
>>> +   multi-versioned functions.  DISPATCH_DECL is the function which will
>>> +   contain the dispatch logic.  FNDECLS are the function choices for
>>> +   dispatch, and is a tree chain.  EMPTY_BB is the basic block pointer
>>> +   in DISPATCH_DECL in which the dispatch code is generated.  */
>>> +
>>> +static int
>>> +ix86_dispatch_version (tree dispatch_decl,
>>> +                      void *fndecls_p,
>>> +                      basic_block *empty_bb)
>>> +{
>>> +  tree default_decl;
>>> +  gimple ifunc_cpu_init_stmt;
>>> +  gimple_seq gseq;
>>> +  tree old_current_function_decl;
>>> +  int ix;
>>> +  tree ele;
>>> +  VEC (tree, heap) *fndecls;
>>> +
>>> +  gcc_assert (dispatch_decl != NULL
>>> +             && fndecls_p != NULL
>>> +             && empty_bb != NULL);
>>> +
>>> +  /* fndecls_p is actually a vector.  */
>>> +  fndecls = (VEC (tree, heap) *)fndecls_p;
>>> +
>>> +  /* At least one more version other than the default.  */
>>> +  gcc_assert (VEC_length (tree, fndecls) >= 2);
>>> +
>>> +  /* The first version in the vector is the default decl.  */
>>> +  default_decl = VEC_index (tree, fndecls, 0);
>>> +
>>> +  old_current_function_decl = current_function_decl;
>>> +  push_cfun (DECL_STRUCT_FUNCTION (dispatch_decl));
>>> +  current_function_decl = dispatch_decl;
>>> +
>>> +  gseq = bb_seq (*empty_bb);
>>> +  ifunc_cpu_init_stmt = gimple_build_call_vec (
>>> +                     ix86_builtins [(int) IX86_BUILTIN_CPU_INIT], NULL);
>>> +  gimple_seq_add_stmt (&gseq, ifunc_cpu_init_stmt);
>>> +  gimple_set_bb (ifunc_cpu_init_stmt, *empty_bb);
>>> +  set_bb_seq (*empty_bb, gseq);
>>> +
>>> +  pop_cfun ();
>>> +  current_function_decl = old_current_function_decl;
>>> +
>>> +
>>> +  for (ix = 1; VEC_iterate (tree, fndecls, ix, ele); ++ix)
>>> +    {
>>> +      tree version_decl = ele;
>>> +      /* Get attribute string, parse it and find the right predicate decl.
>>> +         The predicate function could be a lengthy combination of many
>>> +        features, like arch-type and various isa-variants.  For now, only
>>> +        check the arch-type.  */
>>> +      tree predicate_decl = ix86_builtins [
>>> +                       get_builtin_code_for_version (version_decl)];
>>> +      *empty_bb = add_condition_to_bb (dispatch_decl, version_decl, *empty_bb,
>>> +                                      predicate_decl);
>>> +
>>> +    }
>>> +  /* Dispatch the default version at the end.  */
>>> +  *empty_bb = add_condition_to_bb (dispatch_decl, default_decl, *empty_bb,
>>> +                                  NULL);
>>> +  return 0;
>>> +}
>>>
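
For readers following the CFG construction, the control flow that ix86_dispatch_version builds appears to amount to a chain of predicate tests with the default version returned last. Below is a rough C model of that behaviour, assuming each predicate is a zero-argument function returning int (as the gimple_build_call with 0 arguments and the GT_EXPR test against zero suggest). Every name in it is a hypothetical stand-in rather than a builtin from the patch, and the initial IX86_BUILTIN_CPU_INIT call is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*version_fn) (void);
typedef int (*predicate_fn) (void);

/* One specialized version and the run-time test that selects it.  */
struct version_entry
{
  predicate_fn predicate;
  version_fn version;
};

/* Test the N entries in order and return the first version whose
   predicate yields a value greater than zero; otherwise return
   DEFAULT_VERSION, mirroring the default being dispatched last.  */
static version_fn
select_version (const struct version_entry *entries, size_t n,
                version_fn default_version)
{
  size_t i;

  assert (default_version != NULL);
  for (i = 0; i < n; ++i)
    if (entries[i].predicate () > 0)
      return entries[i].version;
  return default_version;
}

/* Hypothetical predicates and versions, for illustration only.  */
static int never_matches (void) { return 0; }
static int always_matches (void) { return 1; }
static int foo_default (void) { return 0; }
static int foo_corei7 (void) { return 1; }
static int foo_core2 (void) { return 2; }
```

A caller would fill an array of version_entry with the specialized versions, pass it with foo_default, and call through the returned pointer. The order of the array decides which version wins when more than one predicate is true.
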
>>> @@ -38610,6 +39269,12 @@ ix86_autovectorize_vector_sizes (void)
>>>  #undef TARGET_BUILD_BUILTIN_VA_LIST
>>>  #define TARGET_BUILD_BUILTIN_VA_LIST ix86_build_builtin_va_list
>>>
>>> +#undef TARGET_DISPATCH_VERSION
>>> +#define TARGET_DISPATCH_VERSION ix86_dispatch_version
>>> +
>>>  #undef TARGET_ENUM_VA_LIST_P
>>>  #define TARGET_ENUM_VA_LIST_P ix86_enum_va_list
>>>
>>> Index: testsuite/g++.dg/mv1.C
>>> ===================================================================
>>> --- testsuite/g++.dg/mv1.C      (revision 0)
>>> +++ testsuite/g++.dg/mv1.C      (revision 0)
>>> @@ -0,0 +1,23 @@
>>> +/* Simple test case to check if Multiversioning works.  */
>>> +/* { dg-do run } */
>>> +/* { dg-options "-O2" } */
>>> +
>>> +int foo ();
>>> +int foo () __attribute__ ((targetv("arch=corei7")));
>>> +
>>> +int main ()
>>> +{
>>> +  int (*p)() = &foo;
>>> +  return foo () + (*p)();
>>> +}
>>> +
>>> +int foo ()
>>> +{
>>> +  return 0;
>>> +}
>>> +
>>> +int __attribute__ ((targetv("arch=corei7")))
>>> +foo ()
>>> +{
>>> +  return 0;
>>> +}
>>>
>>>
>>> --
>>> This patch is available for review at http://codereview.appspot.com/5752064
