public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [0/3] target_version and aarch64 function multiversioning
@ 2023-10-18 15:39 Andrew Carlotti
  2023-10-18 15:42 ` [1/3] Add support for target_version attribute Andrew Carlotti
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Andrew Carlotti @ 2023-10-18 15:39 UTC (permalink / raw)
  To: gcc-patches

This series adds support for function multiversioning on aarch64.  There are a
few minor issues in patch 2/3, that I intend to fix in future versions or
follow-up patches.  I also have some open questions about the correctness of
existing function multiversioning implementations [1], that could affect some
details of this patch series.

Patches 1/3 and 2/3 both pass regression testing on x86.  Patch 2/3 requires
adding function multiversioning tests to aarch64, which I haven't included yet.
Patch 3/3 demonstrates a potential approach for improving consistency of symbol
naming between target_clones and target/target_version multiversioning, but
would require agreement on how to resolve some of the issues discussed in [1].

Thanks,
Andrew


[1] https://gcc.gnu.org/pipermail/gcc/2023-October/242686.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [1/3] Add support for target_version attribute
  2023-10-18 15:39 [0/3] target_version and aarch64 function multiversioning Andrew Carlotti
@ 2023-10-18 15:42 ` Andrew Carlotti
  2023-10-19  7:04   ` Richard Biener
  2023-10-26 18:41   ` Richard Sandiford
  2023-10-18 15:44 ` [2/3] [aarch64] Add function multiversioning support Andrew Carlotti
  2023-10-18 15:44 ` [3/3] WIP/RFC: Fix name mangling for target_clones Andrew Carlotti
  2 siblings, 2 replies; 10+ messages in thread
From: Andrew Carlotti @ 2023-10-18 15:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: jason, nathan, rguenther, richard.sandiford

This patch adds support for the "target_version" attribute to the middle
end and the C++ frontend, which will be used to implement function
multiversioning in the aarch64 backend.

Note that C++ is currently the only frontend which supports
multiversioning using the "target" attribute, whereas the
"target_clones" attribute is additionally supported in C, D and Ada.
Support for the target_version attribute will be extended to C at a
later date.

Targets that currently use the "target" attribute for function
multiversioning (i.e. i386 and rs6000) are not affected by this patch.


I could have implemented the target hooks slightly differently, by reusing the
valid_attribute_p hook and adding attribute name checks to each backend
implementation (c.f. the aarch64 implementation in patch 2/3).  Would this be
preferable?

Otherwise, is this ok for master?


gcc/c-family/ChangeLog:

	* c-attribs.cc (handle_target_version_attribute): New.
	(c_common_attribute_table): Add target_version.
	(handle_target_clones_attribute): Add conflict with
	target_version attribute.

gcc/ChangeLog:

	* attribs.cc (is_function_default_version): Update comment to
	specify incompatibility with target_version attributes.
	* cgraphclones.cc (cgraph_node::create_version_clone_with_body):
	Call valid_version_attribute_p for target_version attributes.
	* target.def (valid_version_attribute_p): New hook.
	(expanded_clones_attribute): New hook.
	* doc/tm.texi.in: Add new hooks.
	* doc/tm.texi: Regenerate.
	* multiple_target.cc (create_dispatcher_calls): Remove redundant
	is_function_default_version check.
	(expand_target_clones): Use target hook for attribute name.
	* targhooks.cc (default_target_option_valid_version_attribute_p):
	New.
	* targhooks.h (default_target_option_valid_version_attribute_p):
	New.
	* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
	target_version attributes.

gcc/cp/ChangeLog:

	* decl2.cc (check_classfn): Update comment to include
	target_version attributes.


diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index b1300018d1e8ed8e02ded1ea721dc192a6d32a49..a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1233,8 +1233,9 @@ make_dispatcher_decl (const tree decl)
   return func_decl;  
 }
 
-/* Returns true if decl is multi-versioned and DECL is the default function,
-   that is it is not tagged with target specific optimization.  */
+/* Returns true if DECL is multi-versioned using the target attribute, and this
+   is the default version.  This function can only be used for targets that do
+   not support the "target_version" attribute.  */
 
 bool
 is_function_default_version (const tree decl)
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 072cfb69147bd6b314459c0bd48a0c1fb92d3e4d..1a224c036277d51ab4dc0d33a403177bd226e48a 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -148,6 +148,7 @@ static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
 static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool *);
 static tree handle_assume_attribute (tree *, tree, tree, int, bool *);
 static tree handle_target_attribute (tree *, tree, tree, int, bool *);
+static tree handle_target_version_attribute (tree *, tree, tree, int, bool *);
 static tree handle_target_clones_attribute (tree *, tree, tree, int, bool *);
 static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
 static tree ignore_attribute (tree *, tree, tree, int, bool *);
@@ -480,6 +481,8 @@ const struct attribute_spec c_common_attribute_table[] =
 			      handle_error_attribute, NULL },
   { "target",                 1, -1, true, false, false, false,
 			      handle_target_attribute, NULL },
+  { "target_version",         1, -1, true, false, false, false,
+			      handle_target_version_attribute, NULL },
   { "target_clones",          1, -1, true, false, false, false,
 			      handle_target_clones_attribute, NULL },
   { "optimize",               1, -1, true, false, false, false,
@@ -5569,6 +5572,45 @@ handle_target_attribute (tree *node, tree name, tree args, int flags,
   return NULL_TREE;
 }
 
+/* Handle a "target_version" attribute.  */
+
+static tree
+handle_target_version_attribute (tree *node, tree name, tree args, int flags,
+				  bool *no_add_attrs)
+{
+  /* Ensure we have a function type.  */
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored", name);
+      *no_add_attrs = true;
+    }
+  else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node)))
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
+		   "with %qs attribute", name, "target_clones");
+      *no_add_attrs = true;
+    }
+  else if (!targetm.target_option.valid_version_attribute_p (*node, name, args,
+							     flags))
+    *no_add_attrs = true;
+
+  /* Check that there's no empty string in values of the attribute.  */
+  for (tree t = args; t != NULL_TREE; t = TREE_CHAIN (t))
+    {
+      tree value = TREE_VALUE (t);
+      if (TREE_CODE (value) == STRING_CST
+	  && TREE_STRING_LENGTH (value) == 1
+	  && TREE_STRING_POINTER (value)[0] == '\0')
+	{
+	  warning (OPT_Wattributes,
+		   "empty string in attribute %<target_version%>");
+	  *no_add_attrs = true;
+	}
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a "target_clones" attribute.  */
 
 static tree
@@ -5601,6 +5643,12 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 		   "with %qs attribute", name, "target");
 	  *no_add_attrs = true;
 	}
+      else if (lookup_attribute ("target_version", DECL_ATTRIBUTES (*node)))
+	{
+	  warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
+		   "with %qs attribute", name, "target_version");
+	  *no_add_attrs = true;
+	}
       else if (get_target_clone_attr_len (args) == -1)
 	{
 	  warning (OPT_Wattributes,
diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index 29d28ef895a73a223695cbb86aafbc845bbe7688..8af6b23d8c0306920e0fdcb3559ef047a16689f4 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-eh.h"
 #include "tree-cfg.h"
 #include "tree-inline.h"
+#include "attribs.h"
 #include "dumpfile.h"
 #include "gimple-pretty-print.h"
 #include "alloc-pool.h"
@@ -1048,7 +1049,17 @@ cgraph_node::create_version_clone_with_body
       location_t saved_loc = input_location;
       tree v = TREE_VALUE (target_attributes);
       input_location = DECL_SOURCE_LOCATION (new_decl);
-      bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 1);
+      bool r;
+      tree name_id = get_attribute_name (target_attributes);
+      const char* name_str = IDENTIFIER_POINTER (name_id);
+      if (strcmp (name_str, "target") == 0)
+	r = targetm.target_option.valid_attribute_p (new_decl, name_id, v, 1);
+      else if (strcmp (name_str, "target_version") == 0)
+	r = targetm.target_option.valid_version_attribute_p (new_decl, name_id,
+							     v, 1);
+      else
+	gcc_assert(false);
+
       input_location = saved_loc;
       if (!r)
 	return NULL;
diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 9594be4092c3c00fddc9d4c6da5931ea3b7e8792..ec78d5a5440bedd360ac8e5bc44e164da3dab410 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -829,8 +829,8 @@ check_classfn (tree ctype, tree function, tree template_parms)
       tree c2 = get_constraints (fndecl);
 
       /* While finding a match, same types and params are not enough
-	 if the function is versioned.  Also check version ("target")
-	 attributes.  */
+	 if the function is versioned.  Also check for different target
+	 specific attributes.  */
       if (same_type_p (TREE_TYPE (TREE_TYPE (function)),
 		       TREE_TYPE (TREE_TYPE (fndecl)))
 	  && compparms (p1, p2)
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 33e7ffc8af5a9d48430145ef9b8e7924613b7dd7..97b3e9c31631a943d95f1cf7739716a574afcfb7 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10507,6 +10507,23 @@ the function declaration to hold a pointer to a target-specific
 @code{struct cl_target_option} structure.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P (tree @var{fndecl}, tree @var{name}, tree @var{args}, int @var{flags})
+This hook is called to parse @code{attribute(target_version("..."))},
+which allows setting target-specific options on individual function versions.
+These function-specific options may differ
+from the options specified on the command line.  The hook should return
+@code{true} if the options are valid.
+
+The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in
+the function declaration to hold a pointer to a target-specific
+@code{struct cl_target_option} structure.
+@end deftypefn
+
+@deftypevr {Target Hook} {const char *} TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
+Contains the name of the attribute used for the version description string
+when expanding clones for a function with the target_clones attribute.
+@end deftypevr
+
 @deftypefn {Target Hook} void TARGET_OPTION_SAVE (struct cl_target_option *@var{ptr}, struct gcc_options *@var{opts}, struct gcc_options *@var{opts_set})
 This hook is called to save any additional target-specific information
 in the @code{struct cl_target_option} structure for function-specific
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index c98b2447e28aa17996b1cbf8af7ed02d70db54f2..56fa3de6bba06bc0ac124bb3a41324be1997e209 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -6979,6 +6979,10 @@ on this implementation detail.
 
 @hook TARGET_OPTION_VALID_ATTRIBUTE_P
 
+@hook TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P
+
+@hook TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
+
 @hook TARGET_OPTION_SAVE
 
 @hook TARGET_OPTION_RESTORE
diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
index a2ed048d7dd28ec470953fcd8a0dc86817e4b7dc..3db57c2b13d612a37240d9dcf58ad21b2286633c 100644
--- a/gcc/multiple_target.cc
+++ b/gcc/multiple_target.cc
@@ -66,10 +66,6 @@ create_dispatcher_calls (struct cgraph_node *node)
 {
   ipa_ref *ref;
 
-  if (!DECL_FUNCTION_VERSIONED (node->decl)
-      || !is_function_default_version (node->decl))
-    return;
-
   if (!targetm.has_ifunc_p ())
     {
       error_at (DECL_SOURCE_LOCATION (node->decl),
@@ -377,6 +373,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
       return false;
     }
 
+  const char *new_attr_name = targetm.target_option.expanded_clones_attribute;
   cgraph_function_version_info *decl1_v = NULL;
   cgraph_function_version_info *decl2_v = NULL;
   cgraph_function_version_info *before = NULL;
@@ -392,7 +389,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
       char *attr = attrs[i];
 
       /* Create new target clone.  */
-      tree attributes = make_attribute ("target", attr,
+      tree attributes = make_attribute (new_attr_name, attr,
 					DECL_ATTRIBUTES (node->decl));
 
       char *suffix = XNEWVEC (char, strlen (attr) + 1);
@@ -430,7 +427,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
   XDELETEVEC (attr_str);
 
   /* Setting new attribute to initial function.  */
-  tree attributes = make_attribute ("target", "default",
+  tree attributes = make_attribute (new_attr_name, "default",
 				    DECL_ATTRIBUTES (node->decl));
   DECL_ATTRIBUTES (node->decl) = attributes;
   node->local = false;
diff --git a/gcc/target.def b/gcc/target.def
index cda6c51e5167f85625168c7c26b777d6c8ccad82..39acea04db01ebaf918910b7dd73d397de6a84ec 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6492,6 +6492,31 @@ the function declaration to hold a pointer to a target-specific\n\
  bool, (tree fndecl, tree name, tree args, int flags),
  default_target_option_valid_attribute_p)
 
+/* Function to validate the attribute((target_version(...))) strings.  If
+   the option is validated, the hook should also fill in
+   DECL_FUNCTION_SPECIFIC_TARGET in the function decl node.  */
+DEFHOOK
+(valid_version_attribute_p,
+ "This hook is called to parse @code{attribute(target_version(\"...\"))},\n\
+which allows setting target-specific options on individual function versions.\n\
+These function-specific options may differ\n\
+from the options specified on the command line.  The hook should return\n\
+@code{true} if the options are valid.\n\
+\n\
+The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in\n\
+the function declaration to hold a pointer to a target-specific\n\
+@code{struct cl_target_option} structure.",
+ bool, (tree fndecl, tree name, tree args, int flags),
+ default_target_option_valid_version_attribute_p)
+
+/* Attribute to be used when expanding clones for functions with
+   target_clones attribute.  */
+DEFHOOKPOD
+(expanded_clones_attribute,
+ "Contains the name of the attribute used for the version description string\n\
+when expanding clones for a function with the target_clones attribute.",
+ const char *, "target")
+
 /* Function to save any extra target state in the target options structure.  */
 DEFHOOK
 (save,
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 1a0db8dddd594d9b1fb04ae0d9a66ad6b7a396dc..0efc993d82ef59b581a1df74ee0de71135a28703 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -192,6 +192,7 @@ extern bool default_hard_regno_scratch_ok (unsigned int);
 extern bool default_mode_dependent_address_p (const_rtx, addr_space_t);
 extern bool default_new_address_profitable_p (rtx, rtx_insn *, rtx);
 extern bool default_target_option_valid_attribute_p (tree, tree, tree, int);
+extern bool default_target_option_valid_version_attribute_p (tree, tree, tree, int);
 extern bool default_target_option_pragma_parse (tree, tree);
 extern bool default_target_can_inline_p (tree, tree);
 extern bool default_update_ipa_fn_target_info (unsigned int &, const gimple *);
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index e190369f87a92e6a92372dc348d9374c3a965c0a..7fc7bf455e80c333cced1bac7085210c2b108f8d 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1787,7 +1787,19 @@ default_target_option_valid_attribute_p (tree ARG_UNUSED (fndecl),
 					 int ARG_UNUSED (flags))
 {
   warning (OPT_Wattributes,
-	   "target attribute is not supported on this machine");
+	   "%<target%> attribute is not supported on this machine");
+
+  return false;
+}
+
+bool
+default_target_option_valid_version_attribute_p (tree ARG_UNUSED (fndecl),
+						 tree ARG_UNUSED (name),
+						 tree ARG_UNUSED (args),
+						 int ARG_UNUSED (flags))
+{
+  warning (OPT_Wattributes,
+	   "%<target_version%> attribute is not supported on this machine");
 
   return false;
 }
diff --git a/gcc/tree.h b/gcc/tree.h
index 0b72663e6a1a94406127f6253460f498b7a3ea9c..ebd89ce79566c350eaaab210c0dca3cc1ac2048e 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3438,8 +3438,8 @@ extern vec<tree, va_gc> **decl_debug_args_insert (tree);
    (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_optimization)
 
 /* In FUNCTION_DECL, this is set if this function has other versions generated
-   using "target" attributes.  The default version is the one which does not
-   have any "target" attribute set. */
+   to support different architecture feature sets, e.g. using "target" or
+   "target_version" attributes.  */
 #define DECL_FUNCTION_VERSIONED(NODE)\
    (FUNCTION_DECL_CHECK (NODE)->function_decl.versioned_function)
 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [2/3] [aarch64] Add function multiversioning support
  2023-10-18 15:39 [0/3] target_version and aarch64 function multiversioning Andrew Carlotti
  2023-10-18 15:42 ` [1/3] Add support for target_version attribute Andrew Carlotti
@ 2023-10-18 15:44 ` Andrew Carlotti
  2023-10-30 22:35   ` Richard Sandiford
  2023-10-18 15:44 ` [3/3] WIP/RFC: Fix name mangling for target_clones Andrew Carlotti
  2 siblings, 1 reply; 10+ messages in thread
From: Andrew Carlotti @ 2023-10-18 15:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.earnshaw, richard.sandiford

This adds initial support for function multiversion on aarch64 using the
target_version and target_clones attributes. This mostly follows the
Beta specification in the ACLE [1], with a few diffences that remain to
be fixed:

- Symbol mangling for target_clones differs from that for target_version
  and does not match the mangling specified in the ACLE. This
  inconsistency is also present in i386 and rs6000 mangling.
- The target_clones attribute does not currently support an implicit
  "default" version.
- Unrecognised target names in a target_clones attribute should be
  ignored (with an optional warning), but currently cause an error to be
  raised instead.
- There is no option to disable function multiversioning at compile
  time.
- There is no support for function multiversioning in C, since this is
  not yet enabled in the frontend. On the other hand, this patch
  happens to enable multiversioning in Ada and D as well, using their
  existing frontend support.

This patch relies on adding functionality to libgcc, to support:
- struct { unsigned long long features; } __aarch64_cpu_features;
- void __init_cpu_features (void);
- void __init_cpu_features_resolver (unsigned long hwcap,
				     const __ifunc_arg_t *arg);
This support matches the interface currently used in LLVM's compiler-rt,
and will be implemented in a future patch (which will be merged before
merging this patch).

This version of the patch incorrectly uses __init_cpu_features in the
ifunc resolvers, which could lead to invalid library calls at load time.
I will fix this to use __init_cpu_features_resolver in a future version
of the patch.

[1] https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning

gcc/ChangeLog:

	* attribs.cc (decl_attributes): Pass attribute name to target
	hook.
	* config/aarch64/aarch64.cc
	(aarch64_process_target_version_attr): New.
	(aarch64_option_valid_attribute_p): Add check and support for
	target_version attribute.
	(enum CPUFeatures): New list of for bitmask positions.
	(aarch64_fmv_feature_data): New.
	(get_feature_bit): New.
	(get_feature_mask_for_version): New.
	(compare_feature_masks): New.
	(aarch64_compare_version_priority): New.
	(make_resolver_func): New.
	(add_condition_to_bb): New.
	(compare_feature_version_info): New.
	(dispatch_function_versions): New.
	(aarch64_generate_version_dispatcher_body): New.
	(aarch64_get_function_versions_dispatcher): New.
	(aarch64_common_function_versions): New.
	(aarch64_mangle_decl_assembler_name): New.
	(TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): New implementation.
	(TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE): New implementation.
	(TARGET_OPTION_FUNCTION_VERSIONS): New implementation.
	(TARGET_COMPARE_VERSION_PRIORITY): New implementation.
	(TARGET_GENERATE_VERSION_DISPATCHER_BODY): New implementation.
	(TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New implementation.
	(TARGET_MANGLE_DECL_ASSEMBLER_NAME): New implementation.


diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6..cc935b502028392ebdc105f940900f01f79196a7 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -657,7 +657,8 @@ decl_attributes (tree *node, tree attributes, int flags,
      options to the attribute((target(...))) list.  */
   if (TREE_CODE (*node) == FUNCTION_DECL
       && current_target_pragma
-      && targetm.target_option.valid_attribute_p (*node, NULL_TREE,
+      && targetm.target_option.valid_attribute_p (*node,
+						  get_identifier("target"),
 						  current_target_pragma, 0))
     {
       tree cur_attr = lookup_attribute ("target", attributes);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 9c3c0e705e2e6ea3b55b4a5f1e7d3360f91eb51d..ca0e2a2507ffdbf99e17b77240504bf2d175b9c0 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -19088,11 +19088,70 @@ aarch64_process_target_attr (tree args)
   return true;
 }
 
+/* Parse the tree in ARGS that contains the targeti_version attribute
+   information and update the global target options space.  */
+
+bool
+aarch64_process_target_version_attr (tree args)
+{
+  if (TREE_CODE (args) == TREE_LIST)
+    {
+      if (TREE_CHAIN (args))
+	{
+	  error ("attribute %<target_version%> has multiple values");
+	  return false;
+	}
+      args = TREE_VALUE (args);
+    }
+
+  if (!args || TREE_CODE (args) != STRING_CST)
+    {
+      error ("attribute %<target_version%> argument not a string");
+      return false;
+    }
+
+  const char *str = TREE_STRING_POINTER (args);
+  if (strcmp (str, "default") == 0)
+    return true;
+
+  auto with_plus = std::string ("+") + str;
+  enum aarch_parse_opt_result parse_res;
+  auto isa_flags = aarch64_asm_isa_flags;
+
+  std::string invalid_extension;
+  parse_res = aarch64_parse_extension (with_plus.c_str(), &isa_flags,
+				       &invalid_extension);
+
+  if (parse_res == AARCH_PARSE_OK)
+    {
+      aarch64_set_asm_isa_flags (isa_flags);
+      return true;
+    }
+
+  switch (parse_res)
+    {
+      case AARCH_PARSE_MISSING_ARG:
+	error ("missing value in %<target_version%> attribute");
+	break;
+
+      case AARCH_PARSE_INVALID_FEATURE:
+	error ("invalid feature modifier %qs of value %qs in "
+	       "%<target_version%> attribute", invalid_extension.c_str (),
+	       with_plus.c_str());
+	break;
+
+      default:
+	gcc_unreachable ();
+    }
+
+  return false;
+}
+
 /* Implement TARGET_OPTION_VALID_ATTRIBUTE_P.  This is used to
    process attribute ((target ("..."))).  */
 
 static bool
-aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
+aarch64_option_valid_attribute_p (tree fndecl, tree name, tree args, int)
 {
   struct cl_target_option cur_target;
   bool ret;
@@ -19100,13 +19159,22 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
   tree new_target, new_optimize;
   tree existing_target = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
 
+  bool target_version_p;
+  const char *attr_name = IDENTIFIER_POINTER (name);
+  if (strcmp (attr_name, "target") == 0)
+    target_version_p = false;
+  else if (strcmp (attr_name, "target_version") == 0)
+    target_version_p = true;
+  else
+    gcc_assert (false);
+
   /* If what we're processing is the current pragma string then the
      target option node is already stored in target_option_current_node
      by aarch64_pragma_target_parse in aarch64-c.cc.  Use that to avoid
      having to re-parse the string.  This is especially useful to keep
      arm_neon.h compile times down since that header contains a lot
      of intrinsics enclosed in pragmas.  */
-  if (!existing_target && args == current_target_pragma)
+  if (!target_version_p && !existing_target && args == current_target_pragma)
     {
       DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_option_current_node;
       return true;
@@ -19142,7 +19210,25 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
     cl_target_option_restore (&global_options, &global_options_set,
 			      TREE_TARGET_OPTION (target_option_current_node));
 
-  ret = aarch64_process_target_attr (args);
+  if (!target_version_p)
+    {
+      ret = aarch64_process_target_attr (args);
+      if (ret)
+	{
+	  tree version_attr = lookup_attribute ("target_version",
+						DECL_ATTRIBUTES (fndecl));
+	  if (version_attr != NULL_TREE)
+	    {
+	      /* Reapply any target_version attribute after target attribute.
+		 This should be equivalent to applying the target_version once
+		 after processing all target attributes.  */
+	      tree version_args = TREE_VALUE (version_attr);
+	      ret = aarch64_process_target_version_attr (version_args);
+	    }
+	}
+    }
+  else
+    ret = aarch64_process_target_version_attr (args);
 
   /* Set up any additional state.  */
   if (ret)
@@ -19173,6 +19259,730 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
   return ret;
 }
 
+/* This enum needs to match the enum used in libgcc cpuinfo.c.  */
+//TODO: Does this clash with or overlap an existing list of target features?
+enum CPUFeatures {
+  FEAT_RNG,
+  FEAT_FLAGM,
+  FEAT_FLAGM2,
+  FEAT_FP16FML,
+  FEAT_DOTPROD,
+  FEAT_SM4,
+  FEAT_RDM,
+  FEAT_LSE,
+  FEAT_FP,
+  FEAT_SIMD,
+  FEAT_CRC,
+  FEAT_SHA1,
+  FEAT_SHA2,
+  FEAT_SHA3,
+  FEAT_AES,
+  FEAT_PMULL,
+  FEAT_FP16,
+  FEAT_DIT,
+  FEAT_DPB,
+  FEAT_DPB2,
+  FEAT_JSCVT,
+  FEAT_FCMA,
+  FEAT_RCPC,
+  FEAT_RCPC2,
+  FEAT_FRINTTS,
+  FEAT_DGH,
+  FEAT_I8MM,
+  FEAT_BF16,
+  FEAT_EBF16,
+  FEAT_RPRES,
+  FEAT_SVE,
+  FEAT_SVE_BF16,
+  FEAT_SVE_EBF16,
+  FEAT_SVE_I8MM,
+  FEAT_SVE_F32MM,
+  FEAT_SVE_F64MM,
+  FEAT_SVE2,
+  FEAT_SVE_AES,
+  FEAT_SVE_PMULL128,
+  FEAT_SVE_BITPERM,
+  FEAT_SVE_SHA3,
+  FEAT_SVE_SM4,
+  FEAT_SME,
+  FEAT_MEMTAG,
+  FEAT_MEMTAG2,
+  FEAT_MEMTAG3,
+  FEAT_SB,
+  FEAT_PREDRES,
+  FEAT_SSBS,
+  FEAT_SSBS2,
+  FEAT_BTI,
+  FEAT_LS64,
+  FEAT_LS64_V,
+  FEAT_LS64_ACCDATA,
+  FEAT_WFXT,
+  FEAT_SME_F64,
+  FEAT_SME_I64,
+  FEAT_SME2,
+  FEAT_RCPC3, //TODO: Check this index - needs to agree with LLVM.
+  FEAT_MAX
+};
+
+typedef struct
+{
+  const char *name;
+  int priority;
+  unsigned long long feature_mask;
+} aarch64_fmv_feature_datum;
+
+/* List these in priority order, to make it easier to sort target strings.  */
+static aarch64_fmv_feature_datum aarch64_fmv_feature_data[] = {
+  {"default",        0, 0ULL},
+  {"rng",            10, 1ULL << FEAT_RNG},
+  {"flagm",          20, 1ULL << FEAT_FLAGM},
+  {"flagm2",         30, 1ULL << FEAT_FLAGM2},
+  {"fp16fml",        40, 1ULL << FEAT_FP16FML},
+  {"dotprod",        50, 1ULL << FEAT_DOTPROD},
+  {"sm4",            60, 1ULL << FEAT_SM4},
+  {"rdm",            70, 1ULL << FEAT_RDM},
+  {"lse",            80, 1ULL << FEAT_LSE},
+  {"fp",             90, 1ULL << FEAT_FP},
+  {"simd",          100, 1ULL << FEAT_SIMD},
+  {"crc",           110, 1ULL << FEAT_CRC},
+  {"sha1",          120, 1ULL << FEAT_SHA1},
+  {"sha2",          130, 1ULL << FEAT_SHA2},
+  {"sha3",          140, 1ULL << FEAT_SHA3},
+  {"aes",           150, 1ULL << FEAT_AES},
+  {"pmull",         160, 1ULL << FEAT_PMULL},
+  {"fp16",          170, 1ULL << FEAT_FP16},
+  {"dit",           180, 1ULL << FEAT_DIT},
+  {"dpb",           190, 1ULL << FEAT_DPB},
+  {"dpb2",          200, 1ULL << FEAT_DPB2},
+  {"jscvt",         210, 1ULL << FEAT_JSCVT},
+  {"fcma",          220, 1ULL << FEAT_FCMA},
+  {"rcpc",          230, 1ULL << FEAT_RCPC},
+  {"rcpc2",         240, 1ULL << FEAT_RCPC2},
+  {"rcpc3",         241, 1ULL << FEAT_RCPC3},
+  {"frintts",       250, 1ULL << FEAT_FRINTTS},
+  {"dgh",           260, 1ULL << FEAT_DGH},
+  {"i8mm",          270, 1ULL << FEAT_I8MM},
+  {"bf16",          280, 1ULL << FEAT_BF16},
+  {"ebf16",         290, 1ULL << FEAT_EBF16},
+  {"rpres",         300, 1ULL << FEAT_RPRES},
+  {"sve",           310, 1ULL << FEAT_SVE},
+  {"sve-bf16",      320, 1ULL << FEAT_SVE_BF16},
+  {"sve-ebf16",     330, 1ULL << FEAT_SVE_EBF16},
+  {"sve-i8mm",      340, 1ULL << FEAT_SVE_I8MM},
+  {"f32mm",         350, 1ULL << FEAT_SVE_F32MM},
+  {"f64mm",         360, 1ULL << FEAT_SVE_F64MM},
+  {"sve2",          370, 1ULL << FEAT_SVE2},
+  {"sve2-aes",      380, 1ULL << FEAT_SVE_AES},
+  {"sve2-pmull128", 390, 1ULL << FEAT_SVE_PMULL128},
+  {"sve2-bitperm",  400, 1ULL << FEAT_SVE_BITPERM},
+  {"sve2-sha3",     410, 1ULL << FEAT_SVE_SHA3},
+  {"sve2-sm4",      420, 1ULL << FEAT_SVE_SM4},
+  {"sme",           430, 1ULL << FEAT_SME},
+  {"memtag",        440, 1ULL << FEAT_MEMTAG},
+  {"memtag2",       450, 1ULL << FEAT_MEMTAG2},
+  {"memtag3",       460, 1ULL << FEAT_MEMTAG3},
+  {"sb",            470, 1ULL << FEAT_SB},
+  {"predres",       480, 1ULL << FEAT_PREDRES},
+  {"ssbs",          490, 1ULL << FEAT_SSBS},
+  {"ssbs2",         500, 1ULL << FEAT_SSBS2},
+  {"bti",           510, 1ULL << FEAT_BTI},
+  {"ls64",          520, 1ULL << FEAT_LS64},
+  {"ls64_v",        530, 1ULL << FEAT_LS64_V},
+  {"ls64_accdata",  540, 1ULL << FEAT_LS64_ACCDATA},
+  {"wfxt",          550, 1ULL << FEAT_WFXT},
+  {"sme-f64f64",    560, 1ULL << FEAT_SME_F64},
+  {"sme-i16i64",    570, 1ULL << FEAT_SME_I64},
+  {"sme2",          580, 1ULL << FEAT_SME2}
+};
+
+/* Look up a single feature name, and return the bitmask.  */
+unsigned long long
+get_feature_bit (char *name)
+{
+  /* Skip default entry here.  */
+  for (int i = 1; i < FEAT_MAX; i++)
+      if (strcmp(aarch64_fmv_feature_data[i].name, name) == 0)
+	return aarch64_fmv_feature_data[i].feature_mask;
+  return 0;
+}
+
+/* This parses the attribute arguments to target_version in DECL and the
+   feature mask required to select those targets.  No adjustments are made to
+   add or remove redundant feature requirements.  */
+
+unsigned long long
+get_feature_mask_for_version (tree decl)
+{
+  tree version_attr = lookup_attribute ("target_version", DECL_ATTRIBUTES (decl));
+  if (version_attr == NULL)
+    return 0;
+
+  const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE
+						    (version_attr)));
+  if (strcmp (version_string, "default") == 0
+      || strcmp (version_string, "") == 0)
+    return 0;
+
+  int attr_len = strlen (version_string);
+
+  char *feature_string = XNEWVEC (char, attr_len+ 1);
+  strcpy (feature_string, version_string);
+
+  int count = 1;
+  for (int i = 0; i < attr_len; i++)
+    {
+      if (feature_string[i] == '+')
+	{
+	  feature_string[i] = '\0';
+	  count++;
+	}
+    }
+
+  unsigned long long feature_mask = 0ULL;
+  char *cur_feature = feature_string;
+  for (int i = 0; i < count; i++)
+    {
+      unsigned long long feature_bit = get_feature_bit (cur_feature);
+      if (feature_bit == 0)
+	{
+	  /* TODO: For target_clones, we should just ignore this version
+	     instead.  */
+	  error_at (DECL_SOURCE_LOCATION (decl), 0,
+		      "Unrecognised feature %s in function version string",
+		      cur_feature);
+	  feature_mask = -1ULL;
+	}
+      feature_mask |= feature_bit;
+      cur_feature += strlen(cur_feature) + 1;
+    }
+  XDELETEVEC (feature_string);
+  return feature_mask;
+}
+
+/* Compare priorities of two feature masks. Return:
+     1: mask1 is higher priority
+    -1: mask2 is higher priority
+     0: masks are equal.  */
+
+int
+compare_feature_masks (unsigned long long mask1, unsigned long long mask2)
+{
+  int pop1 = __builtin_popcountll(mask1);
+  int pop2 = __builtin_popcountll(mask2);
+  if (pop1 > pop2)
+    return 1;
+  if (pop2 > pop1)
+    return -1;
+
+  unsigned long long diff_mask = mask1 ^ mask2;
+  if (diff_mask == 0ULL)
+    return 0;
+  for (int i = FEAT_MAX - 1; i > 0; i--)
+    {
+      unsigned long long bit_mask = aarch64_fmv_feature_data[i].feature_mask;
+      if (diff_mask & bit_mask)
+	return (mask1 & bit_mask) ? 1 : -1;
+    }
+  gcc_unreachable();
+}
+
+int
+aarch64_compare_version_priority (tree decl1, tree decl2)
+{
+  unsigned long long mask1 = get_feature_mask_for_version (decl1);
+  unsigned long long mask2 = get_feature_mask_for_version (decl2);
+
+  return compare_feature_masks (mask1, mask2);
+}
+
+/* Make the resolver function decl to dispatch the versions of
+   a multi-versioned function,  DEFAULT_DECL.  IFUNC_ALIAS_DECL is
+   ifunc alias that will point to the created resolver.  Create an
+   empty basic block in the resolver and store the pointer in
+   EMPTY_BB.  Return the decl of the resolver function.  */
+
+static tree
+make_resolver_func (const tree default_decl,
+		    const tree ifunc_alias_decl,
+		    basic_block *empty_bb)
+{
+  tree decl, type, t;
+
+  /* Create resolver function name based on default_decl.  */
+  tree decl_name = clone_function_name (default_decl, "resolver");
+  const char *resolver_name = IDENTIFIER_POINTER (decl_name);
+
+  /* The resolver function should return a (void *). */
+  type = build_function_type_list (ptr_type_node, NULL_TREE);
+
+  decl = build_fn_decl (resolver_name, type);
+  SET_DECL_ASSEMBLER_NAME (decl, decl_name);
+
+  DECL_NAME (decl) = decl_name;
+  TREE_USED (decl) = 1;
+  DECL_ARTIFICIAL (decl) = 1;
+  DECL_IGNORED_P (decl) = 1;
+  TREE_PUBLIC (decl) = 0;
+  DECL_UNINLINABLE (decl) = 1;
+
+  /* Resolver is not external, body is generated.  */
+  DECL_EXTERNAL (decl) = 0;
+  DECL_EXTERNAL (ifunc_alias_decl) = 0;
+
+  DECL_CONTEXT (decl) = NULL_TREE;
+  DECL_INITIAL (decl) = make_node (BLOCK);
+  DECL_STATIC_CONSTRUCTOR (decl) = 0;
+
+  if (DECL_COMDAT_GROUP (default_decl)
+      || TREE_PUBLIC (default_decl))
+    {
+      /* In this case, each translation unit with a call to this
+	 versioned function will put out a resolver.  Ensure it
+	 is comdat to keep just one copy.  */
+      DECL_COMDAT (decl) = 1;
+      make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl));
+    }
+  else
+    TREE_PUBLIC (ifunc_alias_decl) = 0;
+
+  /* Build result decl and add to function_decl. */
+  t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node);
+  DECL_CONTEXT (t) = decl;
+  DECL_ARTIFICIAL (t) = 1;
+  DECL_IGNORED_P (t) = 1;
+  DECL_RESULT (decl) = t;
+
+  gimplify_function_tree (decl);
+  push_cfun (DECL_STRUCT_FUNCTION (decl));
+  *empty_bb = init_lowered_empty_function (decl, false,
+					   profile_count::uninitialized ());
+
+  cgraph_node::add_new_function (decl, true);
+  symtab->call_cgraph_insertion_hooks (cgraph_node::get_create (decl));
+
+  pop_cfun ();
+
+  gcc_assert (ifunc_alias_decl != NULL);
+  /* Mark ifunc_alias_decl as "ifunc" with resolver as resolver_name.  */
+  DECL_ATTRIBUTES (ifunc_alias_decl)
+    = make_attribute ("ifunc", resolver_name,
+		      DECL_ATTRIBUTES (ifunc_alias_decl));
+
+  /* Create the alias for dispatch to resolver here.  */
+  cgraph_node::create_same_body_alias (ifunc_alias_decl, decl);
+  return decl;
+}
+
+/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL
+   to return a pointer to VERSION_DECL if all feature bits specified in
+   FEATURE_MASK are not set in MASK_VAR.  This function will be called during
+   version dispatch to decide which function version to execute.  It returns
+   the basic block at the end, to which more conditions can be added.  */
+static basic_block
+add_condition_to_bb (tree function_decl, tree version_decl,
+		     unsigned long long feature_mask,
+		     tree mask_var, basic_block new_bb)
+{
+  gimple *return_stmt;
+  tree convert_expr, result_var;
+  gimple *convert_stmt;
+  gimple *if_else_stmt;
+
+  basic_block bb1, bb2, bb3;
+  edge e12, e23;
+
+  gimple_seq gseq;
+
+  push_cfun (DECL_STRUCT_FUNCTION (function_decl));
+
+  gcc_assert (new_bb != NULL);
+  gseq = bb_seq (new_bb);
+
+
+  convert_expr = build1 (CONVERT_EXPR, ptr_type_node,
+			 build_fold_addr_expr (version_decl));
+  result_var = create_tmp_var (ptr_type_node);
+  convert_stmt = gimple_build_assign (result_var, convert_expr);
+  return_stmt = gimple_build_return (result_var);
+
+
+  if (feature_mask == 0)
+    {
+      /* Default version.  */
+      gimple_seq_add_stmt (&gseq, convert_stmt);
+      gimple_seq_add_stmt (&gseq, return_stmt);
+      set_bb_seq (new_bb, gseq);
+      gimple_set_bb (convert_stmt, new_bb);
+      gimple_set_bb (return_stmt, new_bb);
+      pop_cfun ();
+      return new_bb;
+    }
+
+  tree and_expr_var = create_tmp_var (long_long_unsigned_type_node);
+  tree and_expr = build2 (BIT_AND_EXPR,
+			  long_long_unsigned_type_node,
+			  mask_var,
+			  build_int_cst (long_long_unsigned_type_node,
+					 feature_mask));
+  gimple *and_stmt = gimple_build_assign (and_expr_var, and_expr);
+  gimple_set_block (and_stmt, DECL_INITIAL (function_decl));
+  gimple_set_bb (and_stmt, new_bb);
+  gimple_seq_add_stmt (&gseq, and_stmt);
+
+  tree zero_llu = build_int_cst (long_long_unsigned_type_node, 0);
+  if_else_stmt = gimple_build_cond (EQ_EXPR, and_expr_var, zero_llu,
+				    NULL_TREE, NULL_TREE);
+  gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl));
+  gimple_set_bb (if_else_stmt, new_bb);
+  gimple_seq_add_stmt (&gseq, if_else_stmt);
+
+  gimple_seq_add_stmt (&gseq, convert_stmt);
+  gimple_seq_add_stmt (&gseq, return_stmt);
+  set_bb_seq (new_bb, gseq);
+
+  bb1 = new_bb;
+  e12 = split_block (bb1, if_else_stmt);
+  bb2 = e12->dest;
+  e12->flags &= ~EDGE_FALLTHRU;
+  e12->flags |= EDGE_TRUE_VALUE;
+
+  e23 = split_block (bb2, return_stmt);
+
+  gimple_set_bb (convert_stmt, bb2);
+  gimple_set_bb (return_stmt, bb2);
+
+  bb3 = e23->dest;
+  make_edge (bb1, bb3, EDGE_FALSE_VALUE);
+
+  remove_edge (e23);
+  make_edge (bb2, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
+
+  pop_cfun ();
+
+  return bb3;
+}
+
+/* Used when sorting the decls into dispatch order.  */
+static int compare_feature_version_info (const void *p1, const void *p2)
+{
+  typedef struct _function_version_info
+    {
+      tree version_decl;
+      unsigned long long feature_mask;
+    } function_version_info;
+  const function_version_info v1 = *(const function_version_info *)p1;
+  const function_version_info v2 = *(const function_version_info *)p2;
+  return - compare_feature_masks (v1.feature_mask, v2.feature_mask);
+}
+
+static int
+dispatch_function_versions (tree dispatch_decl,
+			    void *fndecls_p,
+			    basic_block *empty_bb)
+{
+  gimple *ifunc_cpu_init_stmt;
+  gimple_seq gseq;
+  int ix;
+  tree ele;
+  vec<tree> *fndecls;
+  unsigned int num_versions = 0;
+  unsigned int actual_versions = 0;
+  unsigned int i;
+
+  struct _function_version_info
+    {
+      tree version_decl;
+      unsigned long long feature_mask;
+    }*function_version_info;
+
+  gcc_assert (dispatch_decl != NULL
+	      && fndecls_p != NULL
+	      && empty_bb != NULL);
+
+  /*fndecls_p is actually a vector.  */
+  fndecls = static_cast<vec<tree> *> (fndecls_p);
+
+  /* At least one more version other than the default.  */
+  num_versions = fndecls->length ();
+  gcc_assert (num_versions >= 2);
+
+  function_version_info = (struct _function_version_info *)
+    XNEWVEC (struct _function_version_info, (num_versions));
+
+  push_cfun (DECL_STRUCT_FUNCTION (dispatch_decl));
+
+  gseq = bb_seq (*empty_bb);
+  /* Function version dispatch is via IFUNC.  IFUNC resolvers fire before
+     constructors, so explicity call __builtin_cpu_init here.  */
+  tree init_fn_type = build_function_type_list (void_type_node, NULL);
+  tree init_fn_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL,
+				  get_identifier ("init_cpu_features"),
+				  init_fn_type);
+  ifunc_cpu_init_stmt = gimple_build_call (init_fn_decl, 0);
+  gimple_seq_add_stmt (&gseq, ifunc_cpu_init_stmt);
+  gimple_set_bb (ifunc_cpu_init_stmt, *empty_bb);
+
+  /* Build the struct type for __aarch64_cpu_features.  */
+  tree global_type = lang_hooks.types.make_type (RECORD_TYPE);
+  tree field1 = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
+			    get_identifier ("features"),
+			    long_long_unsigned_type_node);
+  DECL_FIELD_CONTEXT (field1) = global_type;
+  TYPE_FIELDS (global_type) = field1;
+  layout_type (global_type);
+
+  tree global_var = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier
+				("__aarch64_cpu_features"), global_type);
+  DECL_EXTERNAL (global_var) = 1;
+  tree mask_var = create_tmp_var (long_long_unsigned_type_node);
+
+  tree component_expr = build3 (COMPONENT_REF, long_long_unsigned_type_node,
+				global_var, field1, NULL_TREE);
+  gimple *component_stmt = gimple_build_assign (mask_var, component_expr);
+  gimple_set_block (component_stmt, DECL_INITIAL (dispatch_decl));
+  gimple_set_bb (component_stmt, *empty_bb);
+  gimple_seq_add_stmt (&gseq, component_stmt);
+
+  tree not_expr = build1 (BIT_NOT_EXPR, long_long_unsigned_type_node, mask_var);
+  gimple *not_stmt = gimple_build_assign (mask_var, not_expr);
+  gimple_set_block (not_stmt, DECL_INITIAL (dispatch_decl));
+  gimple_set_bb (not_stmt, *empty_bb);
+  gimple_seq_add_stmt (&gseq, not_stmt);
+
+  set_bb_seq (*empty_bb, gseq);
+
+  pop_cfun ();
+
+  for (ix = 0; fndecls->iterate (ix, &ele); ++ix)
+    {
+      tree version_decl = ele;
+      unsigned long long feature_mask;
+      /* Get attribute string, parse it and find the right features.  */
+      feature_mask = get_feature_mask_for_version (version_decl);
+      function_version_info [actual_versions].version_decl = version_decl;
+      function_version_info [actual_versions].feature_mask = feature_mask;
+      actual_versions++;
+    }
+
+  /* Sort the versions according to descending order of dispatch priority.  */
+  qsort (function_version_info, actual_versions,
+	 sizeof (struct _function_version_info), compare_feature_version_info);
+
+  for (i = 0; i < actual_versions; ++i)
+    *empty_bb = add_condition_to_bb (dispatch_decl,
+				     function_version_info[i].version_decl,
+				     function_version_info[i].feature_mask,
+				     mask_var,
+				     *empty_bb);
+
+  free (function_version_info);
+  return 0;
+}
+
+
+tree
+aarch64_generate_version_dispatcher_body (void *node_p)
+{
+  tree resolver_decl;
+  basic_block empty_bb;
+  tree default_ver_decl;
+  struct cgraph_node *versn;
+  struct cgraph_node *node;
+
+  struct cgraph_function_version_info *node_version_info = NULL;
+  struct cgraph_function_version_info *versn_info = NULL;
+
+  node = (cgraph_node *)node_p;
+
+  node_version_info = node->function_version ();
+  gcc_assert (node->dispatcher_function
+	      && node_version_info != NULL);
+
+  if (node_version_info->dispatcher_resolver)
+    return node_version_info->dispatcher_resolver;
+
+  /* The first version in the chain corresponds to the default version.  */
+  default_ver_decl = node_version_info->next->this_node->decl;
+
+  /* node is going to be an alias, so remove the finalized bit.  */
+  node->definition = false;
+
+  resolver_decl = make_resolver_func (default_ver_decl,
+				      node->decl, &empty_bb);
+
+  node_version_info->dispatcher_resolver = resolver_decl;
+
+  push_cfun (DECL_STRUCT_FUNCTION (resolver_decl));
+
+  auto_vec<tree, 2> fn_ver_vec;
+
+  for (versn_info = node_version_info->next; versn_info;
+       versn_info = versn_info->next)
+    {
+      versn = versn_info->this_node;
+      /* Check for virtual functions here again, as by this time it should
+	 have been determined if this function needs a vtable index or
+	 not.  This happens for methods in derived classes that override
+	 virtual methods in base classes but are not explicitly marked as
+	 virtual.  */
+      if (DECL_VINDEX (versn->decl))
+	sorry ("virtual function multiversioning not supported");
+
+      fn_ver_vec.safe_push (versn->decl);
+    }
+
+  dispatch_function_versions (resolver_decl, &fn_ver_vec, &empty_bb);
+  cgraph_edge::rebuild_edges ();
+  pop_cfun ();
+  return resolver_decl;
+}
+
+/* Make a dispatcher declaration for the multi-versioned function DECL.
+   Calls to DECL function will be replaced with calls to the dispatcher
+   by the front-end.  Returns the decl of the dispatcher function.  */
+
+tree
+aarch64_get_function_versions_dispatcher (void *decl)
+{
+  tree fn = (tree) decl;
+  struct cgraph_node *node = NULL;
+  struct cgraph_node *default_node = NULL;
+  struct cgraph_function_version_info *node_v = NULL;
+  struct cgraph_function_version_info *first_v = NULL;
+
+  tree dispatch_decl = NULL;
+
+  struct cgraph_function_version_info *default_version_info = NULL;
+
+  gcc_assert (fn != NULL && DECL_FUNCTION_VERSIONED (fn));
+
+  node = cgraph_node::get (fn);
+  gcc_assert (node != NULL);
+
+  node_v = node->function_version ();
+  gcc_assert (node_v != NULL);
+
+  if (node_v->dispatcher_resolver != NULL)
+    return node_v->dispatcher_resolver;
+
+  /* Find the default version and make it the first node.  */
+  first_v = node_v;
+  /* Go to the beginning of the chain.  */
+  while (first_v->prev != NULL)
+    first_v = first_v->prev;
+  default_version_info = first_v;
+  while (default_version_info != NULL)
+    {
+      if (get_feature_mask_for_version
+	    (default_version_info->this_node->decl) == 0ULL)
+	break;
+      default_version_info = default_version_info->next;
+    }
+
+  /* If there is no default node, just return NULL.  */
+  if (default_version_info == NULL)
+    return NULL;
+
+  /* Make default info the first node.  */
+  if (first_v != default_version_info)
+    {
+      default_version_info->prev->next = default_version_info->next;
+      if (default_version_info->next)
+	default_version_info->next->prev = default_version_info->prev;
+      first_v->prev = default_version_info;
+      default_version_info->next = first_v;
+      default_version_info->prev = NULL;
+    }
+
+  default_node = default_version_info->this_node;
+
+  if (targetm.has_ifunc_p ())
+    {
+      struct cgraph_function_version_info *it_v = NULL;
+      struct cgraph_node *dispatcher_node = NULL;
+      struct cgraph_function_version_info *dispatcher_version_info = NULL;
+
+      /* Right now, the dispatching is done via ifunc.  */
+      dispatch_decl = make_dispatcher_decl (default_node->decl);
+      TREE_NOTHROW (dispatch_decl) = TREE_NOTHROW (fn);
+
+      dispatcher_node = cgraph_node::get_create (dispatch_decl);
+      gcc_assert (dispatcher_node != NULL);
+      dispatcher_node->dispatcher_function = 1;
+      dispatcher_version_info
+	= dispatcher_node->insert_new_function_version ();
+      dispatcher_version_info->next = default_version_info;
+      dispatcher_node->definition = 1;
+
+      /* Set the dispatcher for all the versions.  */
+      it_v = default_version_info;
+      while (it_v != NULL)
+	{
+	  it_v->dispatcher_resolver = dispatch_decl;
+	  it_v = it_v->next;
+	}
+    }
+  else
+    {
+      error_at (DECL_SOURCE_LOCATION (default_node->decl),
+		"multiversioning needs %<ifunc%> which is not supported "
+		"on this target");
+    }
+
+  return dispatch_decl;
+}
+
+bool
+aarch64_common_function_versions (tree fn1, tree fn2)
+{
+  if (TREE_CODE (fn1) != FUNCTION_DECL
+      || TREE_CODE (fn2) != FUNCTION_DECL)
+    return false;
+
+  return (aarch64_compare_version_priority (fn1, fn2) != 0);
+}
+
+
+tree
+aarch64_mangle_decl_assembler_name (tree decl, tree id)
+{
+  /* For function version, add the target suffix to the assembler name.  */
+  if (TREE_CODE (decl) == FUNCTION_DECL
+      && DECL_FUNCTION_VERSIONED (decl))
+    {
+      unsigned long long feature_mask = get_feature_mask_for_version (decl);
+
+      /* No suffix for the default version.  */
+      if (feature_mask == 0ULL)
+	return id;
+
+      char suffix[2048];
+      int pos = 0;
+      const char *base = IDENTIFIER_POINTER (id);
+
+      for (int i = 1; i < FEAT_MAX; i++)
+	{
+	  if (feature_mask & aarch64_fmv_feature_data[i].feature_mask)
+	    {
+	      suffix[pos] = 'M';
+	      strcpy (&suffix[pos+1], aarch64_fmv_feature_data[i].name);
+	      pos += strlen(aarch64_fmv_feature_data[i].name) + 1;
+	    }
+	}
+      suffix[pos] = '\0';
+
+      char *ret = XNEWVEC (char, strlen (base) + strlen (suffix) + 3);
+      sprintf (ret, "%s._%s", base, suffix);
+
+      if (DECL_ASSEMBLER_NAME_SET_P (decl))
+	SET_DECL_RTL (decl, NULL);
+
+      id = get_identifier (ret);
+    }
+  return id;
+}
+
+
 /* Helper for aarch64_can_inline_p.  In the case where CALLER and CALLEE are
    tri-bool options (yes, no, don't care) and the default value is
    DEF, determine whether to reject inlining.  */
@@ -27804,6 +28614,12 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_OPTION_VALID_ATTRIBUTE_P
 #define TARGET_OPTION_VALID_ATTRIBUTE_P aarch64_option_valid_attribute_p
 
+#undef TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P
+#define TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P aarch64_option_valid_attribute_p
+
+#undef TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
+#define TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE "target_version"
+
 #undef TARGET_SET_CURRENT_FUNCTION
 #define TARGET_SET_CURRENT_FUNCTION aarch64_set_current_function
 
@@ -28128,6 +28944,24 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_CONST_ANCHOR
 #define TARGET_CONST_ANCHOR 0x1000000
 
+#undef TARGET_OPTION_FUNCTION_VERSIONS
+#define TARGET_OPTION_FUNCTION_VERSIONS aarch64_common_function_versions
+
+#undef TARGET_COMPARE_VERSION_PRIORITY
+#define TARGET_COMPARE_VERSION_PRIORITY aarch64_compare_version_priority
+
+#undef TARGET_GENERATE_VERSION_DISPATCHER_BODY
+#define TARGET_GENERATE_VERSION_DISPATCHER_BODY \
+  aarch64_generate_version_dispatcher_body
+
+#undef TARGET_GET_FUNCTION_VERSIONS_DISPATCHER
+#define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \
+  aarch64_get_function_versions_dispatcher
+
+#undef TARGET_MANGLE_DECL_ASSEMBLER_NAME
+#define TARGET_MANGLE_DECL_ASSEMBLER_NAME aarch64_mangle_decl_assembler_name
+
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-aarch64.h"

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [3/3] WIP/RFC: Fix name mangling for target_clones
  2023-10-18 15:39 [0/3] target_version and aarch64 function multiversioning Andrew Carlotti
  2023-10-18 15:42 ` [1/3] Add support for target_version attribute Andrew Carlotti
  2023-10-18 15:44 ` [2/3] [aarch64] Add function multiversioning support Andrew Carlotti
@ 2023-10-18 15:44 ` Andrew Carlotti
  2 siblings, 0 replies; 10+ messages in thread
From: Andrew Carlotti @ 2023-10-18 15:44 UTC (permalink / raw)
  To: gcc-patches

This is a partial patch to make the mangling of function version names
for target_clones match those generated using the target or
target_version attributes.  It modifies the name of function versions,
but does not yet rename the resolved symbol, resulting in a duplicate
symbol name (and an error at assembly time).


Is this sort of approach ok?  Should I create an extra target hook to be called
here, so that the target_clones mangling can be target-specific but not
necessarily the same as for target attribute versioning?


diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index 8af6b23d8c0306920e0fdcb3559ef047a16689f4..15672c02c6f9d6043a36bf081067f08d1ab834e5 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -1033,11 +1033,6 @@ cgraph_node::create_version_clone_with_body
   else
     new_decl = copy_node (old_decl);
 
-  /* Generate a new name for the new version. */
-  tree fnname = (version_decl ? clone_function_name_numbered (old_decl, suffix)
-		: clone_function_name (old_decl, suffix));
-  DECL_NAME (new_decl) = fnname;
-  SET_DECL_ASSEMBLER_NAME (new_decl, fnname);
   SET_DECL_RTL (new_decl, NULL);
 
   DECL_VIRTUAL_P (new_decl) = 0;
@@ -1065,6 +1060,24 @@ cgraph_node::create_version_clone_with_body
 	return NULL;
     }
 
+  /* Generate a new name for the new version. */
+  if (version_decl)
+    {
+      tree fnname = (clone_function_name_numbered (old_decl, suffix));
+      DECL_NAME (new_decl) = fnname;
+      SET_DECL_ASSEMBLER_NAME (new_decl, fnname);
+    }
+  else
+    {
+      /* Add target version mangling.  We assume that the target hook will
+	 produce the same mangled name as it would have produced if the decl
+	 had already been versioned when the hook was previously called.  */
+      tree fnname = DECL_ASSEMBLER_NAME (old_decl);
+      DECL_NAME (new_decl) = fnname;
+      fnname = targetm.mangle_decl_assembler_name (new_decl, fnname);
+      SET_DECL_ASSEMBLER_NAME (new_decl, fnname);
+    }
+
   /* When the old decl was a con-/destructor make sure the clone isn't.  */
   DECL_STATIC_CONSTRUCTOR (new_decl) = 0;
   DECL_STATIC_DESTRUCTOR (new_decl) = 0;
diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
index 3db57c2b13d612a37240d9dcf58ad21b2286633c..d9aec9a5ab532701b4a1877b440f3a553ffa28e2 100644
--- a/gcc/multiple_target.cc
+++ b/gcc/multiple_target.cc
@@ -162,7 +162,12 @@ create_dispatcher_calls (struct cgraph_node *node)
 	}
     }
 
-  tree fname = clone_function_name (node->decl, "default");
+  /* Add version mangling to default decl name.  We assume that the target
+     hook will produce the same mangled name as it would have produced if the
+     decl had already been versioned when the hook was previously called.  */
+  tree fname = DECL_ASSEMBLER_NAME (node->decl);
+  DECL_NAME (node->decl) = fname;
+  fname = targetm.mangle_decl_assembler_name (node->decl, fname);
   symtab->change_decl_assembler_name (node->decl, fname);
 
   if (node->definition)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [1/3] Add support for target_version attribute
  2023-10-18 15:42 ` [1/3] Add support for target_version attribute Andrew Carlotti
@ 2023-10-19  7:04   ` Richard Biener
  2023-10-19 16:13     ` Andrew Carlotti
  2023-10-26 18:41   ` Richard Sandiford
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2023-10-19  7:04 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches, jason, nathan, richard.sandiford

On Wed, 18 Oct 2023, Andrew Carlotti wrote:

> This patch adds support for the "target_version" attribute to the middle
> end and the C++ frontend, which will be used to implement function
> multiversioning in the aarch64 backend.
> 
> Note that C++ is currently the only frontend which supports
> multiversioning using the "target" attribute, whereas the
> "target_clones" attribute is additionally supported in C, D and Ada.
> Support for the target_version attribute will be extended to C at a
> later date.
> 
> Targets that currently use the "target" attribute for function
> multiversioning (i.e. i386 and rs6000) are not affected by this patch.
> 
> 
> I could have implemented the target hooks slightly differently, by reusing the
> valid_attribute_p hook and adding attribute name checks to each backend
> implementation (c.f. the aarch64 implementation in patch 2/3).  Would this be
> preferable?
> 
> Otherwise, is this ok for master?

This lacks user-level documentation in doc/extend.texi (where
target_clones is documented).

Was there any discussion/description of why target_clones cannot
be made work for aarch64?

Richard.

> 
> gcc/c-family/ChangeLog:
> 
> 	* c-attribs.cc (handle_target_version_attribute): New.
> 	(c_common_attribute_table): Add target_version.
> 	(handle_target_clones_attribute): Add conflict with
> 	target_version attribute.
> 
> gcc/ChangeLog:
> 
> 	* attribs.cc (is_function_default_version): Update comment to
> 	specify incompatibility with target_version attributes.
> 	* cgraphclones.cc (cgraph_node::create_version_clone_with_body):
> 	Call valid_version_attribute_p for target_version attributes.
> 	* target.def (valid_version_attribute_p): New hook.
> 	(expanded_clones_attribute): New hook.
> 	* doc/tm.texi.in: Add new hooks.
> 	* doc/tm.texi: Regenerate.
> 	* multiple_target.cc (create_dispatcher_calls): Remove redundant
> 	is_function_default_version check.
> 	(expand_target_clones): Use target hook for attribute name.
> 	* targhooks.cc (default_target_option_valid_version_attribute_p):
> 	New.
> 	* targhooks.h (default_target_option_valid_version_attribute_p):
> 	New.
> 	* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
> 	target_version attributes.
> 
> gcc/cp/ChangeLog:
> 
> 	* decl2.cc (check_classfn): Update comment to include
> 	target_version attributes.
> 
> 
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index b1300018d1e8ed8e02ded1ea721dc192a6d32a49..a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -1233,8 +1233,9 @@ make_dispatcher_decl (const tree decl)
>    return func_decl;  
>  }
>  
> -/* Returns true if decl is multi-versioned and DECL is the default function,
> -   that is it is not tagged with target specific optimization.  */
> +/* Returns true if DECL is multi-versioned using the target attribute, and this
> +   is the default version.  This function can only be used for targets that do
> +   not support the "target_version" attribute.  */
>  
>  bool
>  is_function_default_version (const tree decl)
> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> index 072cfb69147bd6b314459c0bd48a0c1fb92d3e4d..1a224c036277d51ab4dc0d33a403177bd226e48a 100644
> --- a/gcc/c-family/c-attribs.cc
> +++ b/gcc/c-family/c-attribs.cc
> @@ -148,6 +148,7 @@ static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_assume_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_target_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_target_version_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_target_clones_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
>  static tree ignore_attribute (tree *, tree, tree, int, bool *);
> @@ -480,6 +481,8 @@ const struct attribute_spec c_common_attribute_table[] =
>  			      handle_error_attribute, NULL },
>    { "target",                 1, -1, true, false, false, false,
>  			      handle_target_attribute, NULL },
> +  { "target_version",         1, -1, true, false, false, false,
> +			      handle_target_version_attribute, NULL },
>    { "target_clones",          1, -1, true, false, false, false,
>  			      handle_target_clones_attribute, NULL },
>    { "optimize",               1, -1, true, false, false, false,
> @@ -5569,6 +5572,45 @@ handle_target_attribute (tree *node, tree name, tree args, int flags,
>    return NULL_TREE;
>  }
>  
> +/* Handle a "target_version" attribute.  */
> +
> +static tree
> +handle_target_version_attribute (tree *node, tree name, tree args, int flags,
> +				  bool *no_add_attrs)
> +{
> +  /* Ensure we have a function type.  */
> +  if (TREE_CODE (*node) != FUNCTION_DECL)
> +    {
> +      warning (OPT_Wattributes, "%qE attribute ignored", name);
> +      *no_add_attrs = true;
> +    }
> +  else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node)))
> +    {
> +      warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
> +		   "with %qs attribute", name, "target_clones");
> +      *no_add_attrs = true;
> +    }
> +  else if (!targetm.target_option.valid_version_attribute_p (*node, name, args,
> +							     flags))
> +    *no_add_attrs = true;
> +
> +  /* Check that there's no empty string in values of the attribute.  */
> +  for (tree t = args; t != NULL_TREE; t = TREE_CHAIN (t))
> +    {
> +      tree value = TREE_VALUE (t);
> +      if (TREE_CODE (value) == STRING_CST
> +	  && TREE_STRING_LENGTH (value) == 1
> +	  && TREE_STRING_POINTER (value)[0] == '\0')
> +	{
> +	  warning (OPT_Wattributes,
> +		   "empty string in attribute %<target_version%>");
> +	  *no_add_attrs = true;
> +	}
> +    }
> +
> +  return NULL_TREE;
> +}
> +
>  /* Handle a "target_clones" attribute.  */
>  
>  static tree
> @@ -5601,6 +5643,12 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
>  		   "with %qs attribute", name, "target");
>  	  *no_add_attrs = true;
>  	}
> +      else if (lookup_attribute ("target_version", DECL_ATTRIBUTES (*node)))
> +	{
> +	  warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
> +		   "with %qs attribute", name, "target_version");
> +	  *no_add_attrs = true;
> +	}
>        else if (get_target_clone_attr_len (args) == -1)
>  	{
>  	  warning (OPT_Wattributes,
> diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
> index 29d28ef895a73a223695cbb86aafbc845bbe7688..8af6b23d8c0306920e0fdcb3559ef047a16689f4 100644
> --- a/gcc/cgraphclones.cc
> +++ b/gcc/cgraphclones.cc
> @@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-eh.h"
>  #include "tree-cfg.h"
>  #include "tree-inline.h"
> +#include "attribs.h"
>  #include "dumpfile.h"
>  #include "gimple-pretty-print.h"
>  #include "alloc-pool.h"
> @@ -1048,7 +1049,17 @@ cgraph_node::create_version_clone_with_body
>        location_t saved_loc = input_location;
>        tree v = TREE_VALUE (target_attributes);
>        input_location = DECL_SOURCE_LOCATION (new_decl);
> -      bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 1);
> +      bool r;
> +      tree name_id = get_attribute_name (target_attributes);
> +      const char* name_str = IDENTIFIER_POINTER (name_id);
> +      if (strcmp (name_str, "target") == 0)
> +	r = targetm.target_option.valid_attribute_p (new_decl, name_id, v, 1);
> +      else if (strcmp (name_str, "target_version") == 0)
> +	r = targetm.target_option.valid_version_attribute_p (new_decl, name_id,
> +							     v, 1);
> +      else
> +	gcc_assert(false);
> +
>        input_location = saved_loc;
>        if (!r)
>  	return NULL;
> diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> index 9594be4092c3c00fddc9d4c6da5931ea3b7e8792..ec78d5a5440bedd360ac8e5bc44e164da3dab410 100644
> --- a/gcc/cp/decl2.cc
> +++ b/gcc/cp/decl2.cc
> @@ -829,8 +829,8 @@ check_classfn (tree ctype, tree function, tree template_parms)
>        tree c2 = get_constraints (fndecl);
>  
>        /* While finding a match, same types and params are not enough
> -	 if the function is versioned.  Also check version ("target")
> -	 attributes.  */
> +	 if the function is versioned.  Also check for different target
> +	 specific attributes.  */
>        if (same_type_p (TREE_TYPE (TREE_TYPE (function)),
>  		       TREE_TYPE (TREE_TYPE (fndecl)))
>  	  && compparms (p1, p2)
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 33e7ffc8af5a9d48430145ef9b8e7924613b7dd7..97b3e9c31631a943d95f1cf7739716a574afcfb7 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -10507,6 +10507,23 @@ the function declaration to hold a pointer to a target-specific
>  @code{struct cl_target_option} structure.
>  @end deftypefn
>  
> +@deftypefn {Target Hook} bool TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P (tree @var{fndecl}, tree @var{name}, tree @var{args}, int @var{flags})
> +This hook is called to parse @code{attribute(target_version("..."))},
> +which allows setting target-specific options on individual function versions.
> +These function-specific options may differ
> +from the options specified on the command line.  The hook should return
> +@code{true} if the options are valid.
> +
> +The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in
> +the function declaration to hold a pointer to a target-specific
> +@code{struct cl_target_option} structure.
> +@end deftypefn
> +
> +@deftypevr {Target Hook} {const char *} TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> +Contains the name of the attribute used for the version description string
> +when expanding clones for a function with the target_clones attribute.
> +@end deftypevr
> +
>  @deftypefn {Target Hook} void TARGET_OPTION_SAVE (struct cl_target_option *@var{ptr}, struct gcc_options *@var{opts}, struct gcc_options *@var{opts_set})
>  This hook is called to save any additional target-specific information
>  in the @code{struct cl_target_option} structure for function-specific
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index c98b2447e28aa17996b1cbf8af7ed02d70db54f2..56fa3de6bba06bc0ac124bb3a41324be1997e209 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -6979,6 +6979,10 @@ on this implementation detail.
>  
>  @hook TARGET_OPTION_VALID_ATTRIBUTE_P
>  
> +@hook TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P
> +
> +@hook TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> +
>  @hook TARGET_OPTION_SAVE
>  
>  @hook TARGET_OPTION_RESTORE
> diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
> index a2ed048d7dd28ec470953fcd8a0dc86817e4b7dc..3db57c2b13d612a37240d9dcf58ad21b2286633c 100644
> --- a/gcc/multiple_target.cc
> +++ b/gcc/multiple_target.cc
> @@ -66,10 +66,6 @@ create_dispatcher_calls (struct cgraph_node *node)
>  {
>    ipa_ref *ref;
>  
> -  if (!DECL_FUNCTION_VERSIONED (node->decl)
> -      || !is_function_default_version (node->decl))
> -    return;
> -
>    if (!targetm.has_ifunc_p ())
>      {
>        error_at (DECL_SOURCE_LOCATION (node->decl),
> @@ -377,6 +373,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
>        return false;
>      }
>  
> +  const char *new_attr_name = targetm.target_option.expanded_clones_attribute;
>    cgraph_function_version_info *decl1_v = NULL;
>    cgraph_function_version_info *decl2_v = NULL;
>    cgraph_function_version_info *before = NULL;
> @@ -392,7 +389,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
>        char *attr = attrs[i];
>  
>        /* Create new target clone.  */
> -      tree attributes = make_attribute ("target", attr,
> +      tree attributes = make_attribute (new_attr_name, attr,
>  					DECL_ATTRIBUTES (node->decl));
>  
>        char *suffix = XNEWVEC (char, strlen (attr) + 1);
> @@ -430,7 +427,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
>    XDELETEVEC (attr_str);
>  
>    /* Setting new attribute to initial function.  */
> -  tree attributes = make_attribute ("target", "default",
> +  tree attributes = make_attribute (new_attr_name, "default",
>  				    DECL_ATTRIBUTES (node->decl));
>    DECL_ATTRIBUTES (node->decl) = attributes;
>    node->local = false;
> diff --git a/gcc/target.def b/gcc/target.def
> index cda6c51e5167f85625168c7c26b777d6c8ccad82..39acea04db01ebaf918910b7dd73d397de6a84ec 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -6492,6 +6492,31 @@ the function declaration to hold a pointer to a target-specific\n\
>   bool, (tree fndecl, tree name, tree args, int flags),
>   default_target_option_valid_attribute_p)
>  
> +/* Function to validate the attribute((target_version(...))) strings.  If
> +   the option is validated, the hook should also fill in
> +   DECL_FUNCTION_SPECIFIC_TARGET in the function decl node.  */
> +DEFHOOK
> +(valid_version_attribute_p,
> + "This hook is called to parse @code{attribute(target_version(\"...\"))},\n\
> +which allows setting target-specific options on individual function versions.\n\
> +These function-specific options may differ\n\
> +from the options specified on the command line.  The hook should return\n\
> +@code{true} if the options are valid.\n\
> +\n\
> +The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in\n\
> +the function declaration to hold a pointer to a target-specific\n\
> +@code{struct cl_target_option} structure.",
> + bool, (tree fndecl, tree name, tree args, int flags),
> + default_target_option_valid_version_attribute_p)
> +
> +/* Attribute to be used when expanding clones for functions with
> +   target_clones attribute.  */
> +DEFHOOKPOD
> +(expanded_clones_attribute,
> + "Contains the name of the attribute used for the version description string\n\
> +when expanding clones for a function with the target_clones attribute.",
> + const char *, "target")
> +
>  /* Function to save any extra target state in the target options structure.  */
>  DEFHOOK
>  (save,
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index 1a0db8dddd594d9b1fb04ae0d9a66ad6b7a396dc..0efc993d82ef59b581a1df74ee0de71135a28703 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -192,6 +192,7 @@ extern bool default_hard_regno_scratch_ok (unsigned int);
>  extern bool default_mode_dependent_address_p (const_rtx, addr_space_t);
>  extern bool default_new_address_profitable_p (rtx, rtx_insn *, rtx);
>  extern bool default_target_option_valid_attribute_p (tree, tree, tree, int);
> +extern bool default_target_option_valid_version_attribute_p (tree, tree, tree, int);
>  extern bool default_target_option_pragma_parse (tree, tree);
>  extern bool default_target_can_inline_p (tree, tree);
>  extern bool default_update_ipa_fn_target_info (unsigned int &, const gimple *);
> diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> index e190369f87a92e6a92372dc348d9374c3a965c0a..7fc7bf455e80c333cced1bac7085210c2b108f8d 100644
> --- a/gcc/targhooks.cc
> +++ b/gcc/targhooks.cc
> @@ -1787,7 +1787,19 @@ default_target_option_valid_attribute_p (tree ARG_UNUSED (fndecl),
>  					 int ARG_UNUSED (flags))
>  {
>    warning (OPT_Wattributes,
> -	   "target attribute is not supported on this machine");
> +	   "%<target%> attribute is not supported on this machine");
> +
> +  return false;
> +}
> +
> +bool
> +default_target_option_valid_version_attribute_p (tree ARG_UNUSED (fndecl),
> +						 tree ARG_UNUSED (name),
> +						 tree ARG_UNUSED (args),
> +						 int ARG_UNUSED (flags))
> +{
> +  warning (OPT_Wattributes,
> +	   "%<target_version%> attribute is not supported on this machine");
>  
>    return false;
>  }
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 0b72663e6a1a94406127f6253460f498b7a3ea9c..ebd89ce79566c350eaaab210c0dca3cc1ac2048e 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -3438,8 +3438,8 @@ extern vec<tree, va_gc> **decl_debug_args_insert (tree);
>     (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_optimization)
>  
>  /* In FUNCTION_DECL, this is set if this function has other versions generated
> -   using "target" attributes.  The default version is the one which does not
> -   have any "target" attribute set. */
> +   to support different architecture feature sets, e.g. using "target" or
> +   "target_version" attributes.  */
>  #define DECL_FUNCTION_VERSIONED(NODE)\
>     (FUNCTION_DECL_CHECK (NODE)->function_decl.versioned_function)
>  
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [1/3] Add support for target_version attribute
  2023-10-19  7:04   ` Richard Biener
@ 2023-10-19 16:13     ` Andrew Carlotti
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Carlotti @ 2023-10-19 16:13 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, jason, nathan, richard.sandiford

On Thu, Oct 19, 2023 at 07:04:09AM +0000, Richard Biener wrote:
> On Wed, 18 Oct 2023, Andrew Carlotti wrote:
> 
> > This patch adds support for the "target_version" attribute to the middle
> > end and the C++ frontend, which will be used to implement function
> > multiversioning in the aarch64 backend.
> > 
> > Note that C++ is currently the only frontend which supports
> > multiversioning using the "target" attribute, whereas the
> > "target_clones" attribute is additionally supported in C, D and Ada.
> > Support for the target_version attribute will be extended to C at a
> > later date.
> > 
> > Targets that currently use the "target" attribute for function
> > multiversioning (i.e. i386 and rs6000) are not affected by this patch.
> > 
> > 
> > I could have implemented the target hooks slightly differently, by reusing the
> > valid_attribute_p hook and adding attribute name checks to each backend
> > implementation (c.f. the aarch64 implementation in patch 2/3).  Would this be
> > preferable?
> > 
> > Otherwise, is this ok for master?
> 
> This lacks user-level documentation in doc/extend.texi (where
> target_clones is documented).

Good point.  I'll add documentation updates as a separate patch in the series
(rather than documenting the state after this patch, in which the attribute is
supported on zero targets).  I think the existing documentation for target and
target_clones needs some improvement as well.

> Was there any discussion/description of why target_clones cannot
> be made work for aarch64?
> 
> Richard.

The second patch in this series does include support for target_clones on
aarch64.  However, the support in that patch is not fully compliant with our
ACLE specification.  I also have some unresolved questions about the
correctness of current function multiversioning implementations using ifuncs
across translation units, which could affect how we want to implement it for
aarch64.

Andrew

> > 
> > gcc/c-family/ChangeLog:
> > 
> > 	* c-attribs.cc (handle_target_version_attribute): New.
> > 	(c_common_attribute_table): Add target_version.
> > 	(handle_target_clones_attribute): Add conflict with
> > 	target_version attribute.
> > 
> > gcc/ChangeLog:
> > 
> > 	* attribs.cc (is_function_default_version): Update comment to
> > 	specify incompatibility with target_version attributes.
> > 	* cgraphclones.cc (cgraph_node::create_version_clone_with_body):
> > 	Call valid_version_attribute_p for target_version attributes.
> > 	* target.def (valid_version_attribute_p): New hook.
> > 	(expanded_clones_attribute): New hook.
> > 	* doc/tm.texi.in: Add new hooks.
> > 	* doc/tm.texi: Regenerate.
> > 	* multiple_target.cc (create_dispatcher_calls): Remove redundant
> > 	is_function_default_version check.
> > 	(expand_target_clones): Use target hook for attribute name.
> > 	* targhooks.cc (default_target_option_valid_version_attribute_p):
> > 	New.
> > 	* targhooks.h (default_target_option_valid_version_attribute_p):
> > 	New.
> > 	* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
> > 	target_version attributes.
> > 
> > gcc/cp/ChangeLog:
> > 
> > 	* decl2.cc (check_classfn): Update comment to include
> > 	target_version attributes.
> > 
> > 
> > diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> > index b1300018d1e8ed8e02ded1ea721dc192a6d32a49..a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6 100644
> > --- a/gcc/attribs.cc
> > +++ b/gcc/attribs.cc
> > @@ -1233,8 +1233,9 @@ make_dispatcher_decl (const tree decl)
> >    return func_decl;  
> >  }
> >  
> > -/* Returns true if decl is multi-versioned and DECL is the default function,
> > -   that is it is not tagged with target specific optimization.  */
> > +/* Returns true if DECL is multi-versioned using the target attribute, and this
> > +   is the default version.  This function can only be used for targets that do
> > +   not support the "target_version" attribute.  */
> >  
> >  bool
> >  is_function_default_version (const tree decl)
> > diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> > index 072cfb69147bd6b314459c0bd48a0c1fb92d3e4d..1a224c036277d51ab4dc0d33a403177bd226e48a 100644
> > --- a/gcc/c-family/c-attribs.cc
> > +++ b/gcc/c-family/c-attribs.cc
> > @@ -148,6 +148,7 @@ static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_assume_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_target_attribute (tree *, tree, tree, int, bool *);
> > +static tree handle_target_version_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_target_clones_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
> >  static tree ignore_attribute (tree *, tree, tree, int, bool *);
> > @@ -480,6 +481,8 @@ const struct attribute_spec c_common_attribute_table[] =
> >  			      handle_error_attribute, NULL },
> >    { "target",                 1, -1, true, false, false, false,
> >  			      handle_target_attribute, NULL },
> > +  { "target_version",         1, -1, true, false, false, false,
> > +			      handle_target_version_attribute, NULL },
> >    { "target_clones",          1, -1, true, false, false, false,
> >  			      handle_target_clones_attribute, NULL },
> >    { "optimize",               1, -1, true, false, false, false,
> > @@ -5569,6 +5572,45 @@ handle_target_attribute (tree *node, tree name, tree args, int flags,
> >    return NULL_TREE;
> >  }
> >  
> > +/* Handle a "target_version" attribute.  */
> > +
> > +static tree
> > +handle_target_version_attribute (tree *node, tree name, tree args, int flags,
> > +				  bool *no_add_attrs)
> > +{
> > +  /* Ensure we have a function type.  */
> > +  if (TREE_CODE (*node) != FUNCTION_DECL)
> > +    {
> > +      warning (OPT_Wattributes, "%qE attribute ignored", name);
> > +      *no_add_attrs = true;
> > +    }
> > +  else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node)))
> > +    {
> > +      warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
> > +		   "with %qs attribute", name, "target_clones");
> > +      *no_add_attrs = true;
> > +    }
> > +  else if (!targetm.target_option.valid_version_attribute_p (*node, name, args,
> > +							     flags))
> > +    *no_add_attrs = true;
> > +
> > +  /* Check that there's no empty string in values of the attribute.  */
> > +  for (tree t = args; t != NULL_TREE; t = TREE_CHAIN (t))
> > +    {
> > +      tree value = TREE_VALUE (t);
> > +      if (TREE_CODE (value) == STRING_CST
> > +	  && TREE_STRING_LENGTH (value) == 1
> > +	  && TREE_STRING_POINTER (value)[0] == '\0')
> > +	{
> > +	  warning (OPT_Wattributes,
> > +		   "empty string in attribute %<target_version%>");
> > +	  *no_add_attrs = true;
> > +	}
> > +    }
> > +
> > +  return NULL_TREE;
> > +}
> > +
> >  /* Handle a "target_clones" attribute.  */
> >  
> >  static tree
> > @@ -5601,6 +5643,12 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
> >  		   "with %qs attribute", name, "target");
> >  	  *no_add_attrs = true;
> >  	}
> > +      else if (lookup_attribute ("target_version", DECL_ATTRIBUTES (*node)))
> > +	{
> > +	  warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
> > +		   "with %qs attribute", name, "target_version");
> > +	  *no_add_attrs = true;
> > +	}
> >        else if (get_target_clone_attr_len (args) == -1)
> >  	{
> >  	  warning (OPT_Wattributes,
> > diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
> > index 29d28ef895a73a223695cbb86aafbc845bbe7688..8af6b23d8c0306920e0fdcb3559ef047a16689f4 100644
> > --- a/gcc/cgraphclones.cc
> > +++ b/gcc/cgraphclones.cc
> > @@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "tree-eh.h"
> >  #include "tree-cfg.h"
> >  #include "tree-inline.h"
> > +#include "attribs.h"
> >  #include "dumpfile.h"
> >  #include "gimple-pretty-print.h"
> >  #include "alloc-pool.h"
> > @@ -1048,7 +1049,17 @@ cgraph_node::create_version_clone_with_body
> >        location_t saved_loc = input_location;
> >        tree v = TREE_VALUE (target_attributes);
> >        input_location = DECL_SOURCE_LOCATION (new_decl);
> > -      bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 1);
> > +      bool r;
> > +      tree name_id = get_attribute_name (target_attributes);
> > +      const char* name_str = IDENTIFIER_POINTER (name_id);
> > +      if (strcmp (name_str, "target") == 0)
> > +	r = targetm.target_option.valid_attribute_p (new_decl, name_id, v, 1);
> > +      else if (strcmp (name_str, "target_version") == 0)
> > +	r = targetm.target_option.valid_version_attribute_p (new_decl, name_id,
> > +							     v, 1);
> > +      else
> > +	gcc_assert(false);
> > +
> >        input_location = saved_loc;
> >        if (!r)
> >  	return NULL;
> > diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> > index 9594be4092c3c00fddc9d4c6da5931ea3b7e8792..ec78d5a5440bedd360ac8e5bc44e164da3dab410 100644
> > --- a/gcc/cp/decl2.cc
> > +++ b/gcc/cp/decl2.cc
> > @@ -829,8 +829,8 @@ check_classfn (tree ctype, tree function, tree template_parms)
> >        tree c2 = get_constraints (fndecl);
> >  
> >        /* While finding a match, same types and params are not enough
> > -	 if the function is versioned.  Also check version ("target")
> > -	 attributes.  */
> > +	 if the function is versioned.  Also check for different target
> > +	 specific attributes.  */
> >        if (same_type_p (TREE_TYPE (TREE_TYPE (function)),
> >  		       TREE_TYPE (TREE_TYPE (fndecl)))
> >  	  && compparms (p1, p2)
> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > index 33e7ffc8af5a9d48430145ef9b8e7924613b7dd7..97b3e9c31631a943d95f1cf7739716a574afcfb7 100644
> > --- a/gcc/doc/tm.texi
> > +++ b/gcc/doc/tm.texi
> > @@ -10507,6 +10507,23 @@ the function declaration to hold a pointer to a target-specific
> >  @code{struct cl_target_option} structure.
> >  @end deftypefn
> >  
> > +@deftypefn {Target Hook} bool TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P (tree @var{fndecl}, tree @var{name}, tree @var{args}, int @var{flags})
> > +This hook is called to parse @code{attribute(target_version("..."))},
> > +which allows setting target-specific options on individual function versions.
> > +These function-specific options may differ
> > +from the options specified on the command line.  The hook should return
> > +@code{true} if the options are valid.
> > +
> > +The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in
> > +the function declaration to hold a pointer to a target-specific
> > +@code{struct cl_target_option} structure.
> > +@end deftypefn
> > +
> > +@deftypevr {Target Hook} {const char *} TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> > +Contains the name of the attribute used for the version description string
> > +when expanding clones for a function with the target_clones attribute.
> > +@end deftypevr
> > +
> >  @deftypefn {Target Hook} void TARGET_OPTION_SAVE (struct cl_target_option *@var{ptr}, struct gcc_options *@var{opts}, struct gcc_options *@var{opts_set})
> >  This hook is called to save any additional target-specific information
> >  in the @code{struct cl_target_option} structure for function-specific
> > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> > index c98b2447e28aa17996b1cbf8af7ed02d70db54f2..56fa3de6bba06bc0ac124bb3a41324be1997e209 100644
> > --- a/gcc/doc/tm.texi.in
> > +++ b/gcc/doc/tm.texi.in
> > @@ -6979,6 +6979,10 @@ on this implementation detail.
> >  
> >  @hook TARGET_OPTION_VALID_ATTRIBUTE_P
> >  
> > +@hook TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P
> > +
> > +@hook TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> > +
> >  @hook TARGET_OPTION_SAVE
> >  
> >  @hook TARGET_OPTION_RESTORE
> > diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
> > index a2ed048d7dd28ec470953fcd8a0dc86817e4b7dc..3db57c2b13d612a37240d9dcf58ad21b2286633c 100644
> > --- a/gcc/multiple_target.cc
> > +++ b/gcc/multiple_target.cc
> > @@ -66,10 +66,6 @@ create_dispatcher_calls (struct cgraph_node *node)
> >  {
> >    ipa_ref *ref;
> >  
> > -  if (!DECL_FUNCTION_VERSIONED (node->decl)
> > -      || !is_function_default_version (node->decl))
> > -    return;
> > -
> >    if (!targetm.has_ifunc_p ())
> >      {
> >        error_at (DECL_SOURCE_LOCATION (node->decl),
> > @@ -377,6 +373,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
> >        return false;
> >      }
> >  
> > +  const char *new_attr_name = targetm.target_option.expanded_clones_attribute;
> >    cgraph_function_version_info *decl1_v = NULL;
> >    cgraph_function_version_info *decl2_v = NULL;
> >    cgraph_function_version_info *before = NULL;
> > @@ -392,7 +389,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
> >        char *attr = attrs[i];
> >  
> >        /* Create new target clone.  */
> > -      tree attributes = make_attribute ("target", attr,
> > +      tree attributes = make_attribute (new_attr_name, attr,
> >  					DECL_ATTRIBUTES (node->decl));
> >  
> >        char *suffix = XNEWVEC (char, strlen (attr) + 1);
> > @@ -430,7 +427,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
> >    XDELETEVEC (attr_str);
> >  
> >    /* Setting new attribute to initial function.  */
> > -  tree attributes = make_attribute ("target", "default",
> > +  tree attributes = make_attribute (new_attr_name, "default",
> >  				    DECL_ATTRIBUTES (node->decl));
> >    DECL_ATTRIBUTES (node->decl) = attributes;
> >    node->local = false;
> > diff --git a/gcc/target.def b/gcc/target.def
> > index cda6c51e5167f85625168c7c26b777d6c8ccad82..39acea04db01ebaf918910b7dd73d397de6a84ec 100644
> > --- a/gcc/target.def
> > +++ b/gcc/target.def
> > @@ -6492,6 +6492,31 @@ the function declaration to hold a pointer to a target-specific\n\
> >   bool, (tree fndecl, tree name, tree args, int flags),
> >   default_target_option_valid_attribute_p)
> >  
> > +/* Function to validate the attribute((target_version(...))) strings.  If
> > +   the option is validated, the hook should also fill in
> > +   DECL_FUNCTION_SPECIFIC_TARGET in the function decl node.  */
> > +DEFHOOK
> > +(valid_version_attribute_p,
> > + "This hook is called to parse @code{attribute(target_version(\"...\"))},\n\
> > +which allows setting target-specific options on individual function versions.\n\
> > +These function-specific options may differ\n\
> > +from the options specified on the command line.  The hook should return\n\
> > +@code{true} if the options are valid.\n\
> > +\n\
> > +The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in\n\
> > +the function declaration to hold a pointer to a target-specific\n\
> > +@code{struct cl_target_option} structure.",
> > + bool, (tree fndecl, tree name, tree args, int flags),
> > + default_target_option_valid_version_attribute_p)
> > +
> > +/* Attribute to be used when expanding clones for functions with
> > +   target_clones attribute.  */
> > +DEFHOOKPOD
> > +(expanded_clones_attribute,
> > + "Contains the name of the attribute used for the version description string\n\
> > +when expanding clones for a function with the target_clones attribute.",
> > + const char *, "target")
> > +
> >  /* Function to save any extra target state in the target options structure.  */
> >  DEFHOOK
> >  (save,
> > diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> > index 1a0db8dddd594d9b1fb04ae0d9a66ad6b7a396dc..0efc993d82ef59b581a1df74ee0de71135a28703 100644
> > --- a/gcc/targhooks.h
> > +++ b/gcc/targhooks.h
> > @@ -192,6 +192,7 @@ extern bool default_hard_regno_scratch_ok (unsigned int);
> >  extern bool default_mode_dependent_address_p (const_rtx, addr_space_t);
> >  extern bool default_new_address_profitable_p (rtx, rtx_insn *, rtx);
> >  extern bool default_target_option_valid_attribute_p (tree, tree, tree, int);
> > +extern bool default_target_option_valid_version_attribute_p (tree, tree, tree, int);
> >  extern bool default_target_option_pragma_parse (tree, tree);
> >  extern bool default_target_can_inline_p (tree, tree);
> >  extern bool default_update_ipa_fn_target_info (unsigned int &, const gimple *);
> > diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> > index e190369f87a92e6a92372dc348d9374c3a965c0a..7fc7bf455e80c333cced1bac7085210c2b108f8d 100644
> > --- a/gcc/targhooks.cc
> > +++ b/gcc/targhooks.cc
> > @@ -1787,7 +1787,19 @@ default_target_option_valid_attribute_p (tree ARG_UNUSED (fndecl),
> >  					 int ARG_UNUSED (flags))
> >  {
> >    warning (OPT_Wattributes,
> > -	   "target attribute is not supported on this machine");
> > +	   "%<target%> attribute is not supported on this machine");
> > +
> > +  return false;
> > +}
> > +
> > +bool
> > +default_target_option_valid_version_attribute_p (tree ARG_UNUSED (fndecl),
> > +						 tree ARG_UNUSED (name),
> > +						 tree ARG_UNUSED (args),
> > +						 int ARG_UNUSED (flags))
> > +{
> > +  warning (OPT_Wattributes,
> > +	   "%<target_version%> attribute is not supported on this machine");
> >  
> >    return false;
> >  }
> > diff --git a/gcc/tree.h b/gcc/tree.h
> > index 0b72663e6a1a94406127f6253460f498b7a3ea9c..ebd89ce79566c350eaaab210c0dca3cc1ac2048e 100644
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -3438,8 +3438,8 @@ extern vec<tree, va_gc> **decl_debug_args_insert (tree);
> >     (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_optimization)
> >  
> >  /* In FUNCTION_DECL, this is set if this function has other versions generated
> > -   using "target" attributes.  The default version is the one which does not
> > -   have any "target" attribute set. */
> > +   to support different architecture feature sets, e.g. using "target" or
> > +   "target_version" attributes.  */
> >  #define DECL_FUNCTION_VERSIONED(NODE)\
> >     (FUNCTION_DECL_CHECK (NODE)->function_decl.versioned_function)
> >  
> > 
> 
> -- 
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [1/3] Add support for target_version attribute
  2023-10-18 15:42 ` [1/3] Add support for target_version attribute Andrew Carlotti
  2023-10-19  7:04   ` Richard Biener
@ 2023-10-26 18:41   ` Richard Sandiford
  2023-11-03 12:43     ` Andrew Carlotti
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Sandiford @ 2023-10-26 18:41 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches, jason, nathan, rguenther

Andrew Carlotti <andrew.carlotti@arm.com> writes:
> This patch adds support for the "target_version" attribute to the middle
> end and the C++ frontend, which will be used to implement function
> multiversioning in the aarch64 backend.
>
> Note that C++ is currently the only frontend which supports
> multiversioning using the "target" attribute, whereas the
> "target_clones" attribute is additionally supported in C, D and Ada.
> Support for the target_version attribute will be extended to C at a
> later date.
>
> Targets that currently use the "target" attribute for function
> multiversioning (i.e. i386 and rs6000) are not affected by this patch.
>
>
> I could have implemented the target hooks slightly differently, by reusing the
> valid_attribute_p hook and adding attribute name checks to each backend
> implementation (c.f. the aarch64 implementation in patch 2/3).  Would this be
> preferable?

Having as much as possible in target-independent code seems better
to me FWIW.  On that basis:

>
> Otherwise, is this ok for master?
>
>
> gcc/c-family/ChangeLog:
>
> 	* c-attribs.cc (handle_target_version_attribute): New.
> 	(c_common_attribute_table): Add target_version.
> 	(handle_target_clones_attribute): Add conflict with
> 	target_version attribute.
>
> gcc/ChangeLog:
>
> 	* attribs.cc (is_function_default_version): Update comment to
> 	specify incompatibility with target_version attributes.
> 	* cgraphclones.cc (cgraph_node::create_version_clone_with_body):
> 	Call valid_version_attribute_p for target_version attributes.
> 	* target.def (valid_version_attribute_p): New hook.
> 	(expanded_clones_attribute): New hook.
> 	* doc/tm.texi.in: Add new hooks.
> 	* doc/tm.texi: Regenerate.
> 	* multiple_target.cc (create_dispatcher_calls): Remove redundant
> 	is_function_default_version check.
> 	(expand_target_clones): Use target hook for attribute name.
> 	* targhooks.cc (default_target_option_valid_version_attribute_p):
> 	New.
> 	* targhooks.h (default_target_option_valid_version_attribute_p):
> 	New.
> 	* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
> 	target_version attributes.
>
> gcc/cp/ChangeLog:
>
> 	* decl2.cc (check_classfn): Update comment to include
> 	target_version attributes.
>
>
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index b1300018d1e8ed8e02ded1ea721dc192a6d32a49..a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -1233,8 +1233,9 @@ make_dispatcher_decl (const tree decl)
>    return func_decl;  
>  }
>  
> -/* Returns true if decl is multi-versioned and DECL is the default function,
> -   that is it is not tagged with target specific optimization.  */
> +/* Returns true if DECL is multi-versioned using the target attribute, and this
> +   is the default version.  This function can only be used for targets that do
> +   not support the "target_version" attribute.  */
>  
>  bool
>  is_function_default_version (const tree decl)
> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> index 072cfb69147bd6b314459c0bd48a0c1fb92d3e4d..1a224c036277d51ab4dc0d33a403177bd226e48a 100644
> --- a/gcc/c-family/c-attribs.cc
> +++ b/gcc/c-family/c-attribs.cc
> @@ -148,6 +148,7 @@ static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_assume_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_target_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_target_version_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_target_clones_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
>  static tree ignore_attribute (tree *, tree, tree, int, bool *);
> @@ -480,6 +481,8 @@ const struct attribute_spec c_common_attribute_table[] =
>  			      handle_error_attribute, NULL },
>    { "target",                 1, -1, true, false, false, false,
>  			      handle_target_attribute, NULL },
> +  { "target_version",         1, -1, true, false, false, false,
> +			      handle_target_version_attribute, NULL },
>    { "target_clones",          1, -1, true, false, false, false,
>  			      handle_target_clones_attribute, NULL },
>    { "optimize",               1, -1, true, false, false, false,
> @@ -5569,6 +5572,45 @@ handle_target_attribute (tree *node, tree name, tree args, int flags,
>    return NULL_TREE;
>  }
>  
> +/* Handle a "target_version" attribute.  */
> +
> +static tree
> +handle_target_version_attribute (tree *node, tree name, tree args, int flags,
> +				  bool *no_add_attrs)
> +{
> +  /* Ensure we have a function type.  */
> +  if (TREE_CODE (*node) != FUNCTION_DECL)
> +    {
> +      warning (OPT_Wattributes, "%qE attribute ignored", name);
> +      *no_add_attrs = true;
> +    }
> +  else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node)))
> +    {
> +      warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
> +		   "with %qs attribute", name, "target_clones");
> +      *no_add_attrs = true;
> +    }
> +  else if (!targetm.target_option.valid_version_attribute_p (*node, name, args,
> +							     flags))
> +    *no_add_attrs = true;
> +
> +  /* Check that there's no empty string in values of the attribute.  */
> +  for (tree t = args; t != NULL_TREE; t = TREE_CHAIN (t))
> +    {
> +      tree value = TREE_VALUE (t);
> +      if (TREE_CODE (value) == STRING_CST
> +	  && TREE_STRING_LENGTH (value) == 1
> +	  && TREE_STRING_POINTER (value)[0] == '\0')
> +	{
> +	  warning (OPT_Wattributes,
> +		   "empty string in attribute %<target_version%>");
> +	  *no_add_attrs = true;
> +	}
> +    }

would it make sense to do the empty string test first, and only pass
the vetted arguments to the target hook?  Also, a Google search suggests
that there aren't any pre-existing, conflicting uses of "target_version"
that take multiple arguments.  So could this code check that there
is exactly one argument (by changing 1, -1 to 1, 1 in the spec above)
and then require it to be a nonempty string?  It could then pass the
string itself to the target hook (probably as a const char *).

(FWIW, it doesn't look like the Clang documentation has kept the door
open to multiple arguments.)

I wonder if we could use attribute_spec::exclusions to describe the
mutual exclusion with "target_clones".  It doesn't look like the
existing code does, though, so maybe not.

I couldn't see anything that forbids a combination of "target" and
"target_version".  Should that combination be allowed?  In some ways
it makes conceptual sense, since using "target" is like changing the
command-line options.  But I suppose we'd then need to diagnose conflicts
and deal with ordering issues.  So perhaps "target" should be made
mutually exclusive as well.

Thanks,
Richard

> +
> +  return NULL_TREE;
> +}
> +
>  /* Handle a "target_clones" attribute.  */
>  
>  static tree
> @@ -5601,6 +5643,12 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
>  		   "with %qs attribute", name, "target");
>  	  *no_add_attrs = true;
>  	}
> +      else if (lookup_attribute ("target_version", DECL_ATTRIBUTES (*node)))
> +	{
> +	  warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
> +		   "with %qs attribute", name, "target_version");
> +	  *no_add_attrs = true;
> +	}
>        else if (get_target_clone_attr_len (args) == -1)
>  	{
>  	  warning (OPT_Wattributes,
> diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
> index 29d28ef895a73a223695cbb86aafbc845bbe7688..8af6b23d8c0306920e0fdcb3559ef047a16689f4 100644
> --- a/gcc/cgraphclones.cc
> +++ b/gcc/cgraphclones.cc
> @@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-eh.h"
>  #include "tree-cfg.h"
>  #include "tree-inline.h"
> +#include "attribs.h"
>  #include "dumpfile.h"
>  #include "gimple-pretty-print.h"
>  #include "alloc-pool.h"
> @@ -1048,7 +1049,17 @@ cgraph_node::create_version_clone_with_body
>        location_t saved_loc = input_location;
>        tree v = TREE_VALUE (target_attributes);
>        input_location = DECL_SOURCE_LOCATION (new_decl);
> -      bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 1);
> +      bool r;
> +      tree name_id = get_attribute_name (target_attributes);
> +      const char* name_str = IDENTIFIER_POINTER (name_id);
> +      if (strcmp (name_str, "target") == 0)
> +	r = targetm.target_option.valid_attribute_p (new_decl, name_id, v, 1);
> +      else if (strcmp (name_str, "target_version") == 0)
> +	r = targetm.target_option.valid_version_attribute_p (new_decl, name_id,
> +							     v, 1);
> +      else
> +	gcc_assert(false);
> +
>        input_location = saved_loc;
>        if (!r)
>  	return NULL;
> diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> index 9594be4092c3c00fddc9d4c6da5931ea3b7e8792..ec78d5a5440bedd360ac8e5bc44e164da3dab410 100644
> --- a/gcc/cp/decl2.cc
> +++ b/gcc/cp/decl2.cc
> @@ -829,8 +829,8 @@ check_classfn (tree ctype, tree function, tree template_parms)
>        tree c2 = get_constraints (fndecl);
>  
>        /* While finding a match, same types and params are not enough
> -	 if the function is versioned.  Also check version ("target")
> -	 attributes.  */
> +	 if the function is versioned.  Also check for different target
> +	 specific attributes.  */
>        if (same_type_p (TREE_TYPE (TREE_TYPE (function)),
>  		       TREE_TYPE (TREE_TYPE (fndecl)))
>  	  && compparms (p1, p2)
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 33e7ffc8af5a9d48430145ef9b8e7924613b7dd7..97b3e9c31631a943d95f1cf7739716a574afcfb7 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -10507,6 +10507,23 @@ the function declaration to hold a pointer to a target-specific
>  @code{struct cl_target_option} structure.
>  @end deftypefn
>  
> +@deftypefn {Target Hook} bool TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P (tree @var{fndecl}, tree @var{name}, tree @var{args}, int @var{flags})
> +This hook is called to parse @code{attribute(target_version("..."))},
> +which allows setting target-specific options on individual function versions.
> +These function-specific options may differ
> +from the options specified on the command line.  The hook should return
> +@code{true} if the options are valid.
> +
> +The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in
> +the function declaration to hold a pointer to a target-specific
> +@code{struct cl_target_option} structure.
> +@end deftypefn
> +
> +@deftypevr {Target Hook} {const char *} TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> +Contains the name of the attribute used for the version description string
> +when expanding clones for a function with the target_clones attribute.
> +@end deftypevr
> +
>  @deftypefn {Target Hook} void TARGET_OPTION_SAVE (struct cl_target_option *@var{ptr}, struct gcc_options *@var{opts}, struct gcc_options *@var{opts_set})
>  This hook is called to save any additional target-specific information
>  in the @code{struct cl_target_option} structure for function-specific
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index c98b2447e28aa17996b1cbf8af7ed02d70db54f2..56fa3de6bba06bc0ac124bb3a41324be1997e209 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -6979,6 +6979,10 @@ on this implementation detail.
>  
>  @hook TARGET_OPTION_VALID_ATTRIBUTE_P
>  
> +@hook TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P
> +
> +@hook TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> +
>  @hook TARGET_OPTION_SAVE
>  
>  @hook TARGET_OPTION_RESTORE
> diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
> index a2ed048d7dd28ec470953fcd8a0dc86817e4b7dc..3db57c2b13d612a37240d9dcf58ad21b2286633c 100644
> --- a/gcc/multiple_target.cc
> +++ b/gcc/multiple_target.cc
> @@ -66,10 +66,6 @@ create_dispatcher_calls (struct cgraph_node *node)
>  {
>    ipa_ref *ref;
>  
> -  if (!DECL_FUNCTION_VERSIONED (node->decl)
> -      || !is_function_default_version (node->decl))
> -    return;
> -
>    if (!targetm.has_ifunc_p ())
>      {
>        error_at (DECL_SOURCE_LOCATION (node->decl),
> @@ -377,6 +373,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
>        return false;
>      }
>  
> +  const char *new_attr_name = targetm.target_option.expanded_clones_attribute;
>    cgraph_function_version_info *decl1_v = NULL;
>    cgraph_function_version_info *decl2_v = NULL;
>    cgraph_function_version_info *before = NULL;
> @@ -392,7 +389,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
>        char *attr = attrs[i];
>  
>        /* Create new target clone.  */
> -      tree attributes = make_attribute ("target", attr,
> +      tree attributes = make_attribute (new_attr_name, attr,
>  					DECL_ATTRIBUTES (node->decl));
>  
>        char *suffix = XNEWVEC (char, strlen (attr) + 1);
> @@ -430,7 +427,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
>    XDELETEVEC (attr_str);
>  
>    /* Setting new attribute to initial function.  */
> -  tree attributes = make_attribute ("target", "default",
> +  tree attributes = make_attribute (new_attr_name, "default",
>  				    DECL_ATTRIBUTES (node->decl));
>    DECL_ATTRIBUTES (node->decl) = attributes;
>    node->local = false;
> diff --git a/gcc/target.def b/gcc/target.def
> index cda6c51e5167f85625168c7c26b777d6c8ccad82..39acea04db01ebaf918910b7dd73d397de6a84ec 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -6492,6 +6492,31 @@ the function declaration to hold a pointer to a target-specific\n\
>   bool, (tree fndecl, tree name, tree args, int flags),
>   default_target_option_valid_attribute_p)
>  
> +/* Function to validate the attribute((target_version(...))) strings.  If
> +   the option is validated, the hook should also fill in
> +   DECL_FUNCTION_SPECIFIC_TARGET in the function decl node.  */
> +DEFHOOK
> +(valid_version_attribute_p,
> + "This hook is called to parse @code{attribute(target_version(\"...\"))},\n\
> +which allows setting target-specific options on individual function versions.\n\
> +These function-specific options may differ\n\
> +from the options specified on the command line.  The hook should return\n\
> +@code{true} if the options are valid.\n\
> +\n\
> +The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in\n\
> +the function declaration to hold a pointer to a target-specific\n\
> +@code{struct cl_target_option} structure.",
> + bool, (tree fndecl, tree name, tree args, int flags),
> + default_target_option_valid_version_attribute_p)
> +
> +/* Attribute to be used when expanding clones for functions with
> +   target_clones attribute.  */
> +DEFHOOKPOD
> +(expanded_clones_attribute,
> + "Contains the name of the attribute used for the version description string\n\
> +when expanding clones for a function with the target_clones attribute.",
> + const char *, "target")
> +
>  /* Function to save any extra target state in the target options structure.  */
>  DEFHOOK
>  (save,
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index 1a0db8dddd594d9b1fb04ae0d9a66ad6b7a396dc..0efc993d82ef59b581a1df74ee0de71135a28703 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -192,6 +192,7 @@ extern bool default_hard_regno_scratch_ok (unsigned int);
>  extern bool default_mode_dependent_address_p (const_rtx, addr_space_t);
>  extern bool default_new_address_profitable_p (rtx, rtx_insn *, rtx);
>  extern bool default_target_option_valid_attribute_p (tree, tree, tree, int);
> +extern bool default_target_option_valid_version_attribute_p (tree, tree, tree, int);
>  extern bool default_target_option_pragma_parse (tree, tree);
>  extern bool default_target_can_inline_p (tree, tree);
>  extern bool default_update_ipa_fn_target_info (unsigned int &, const gimple *);
> diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> index e190369f87a92e6a92372dc348d9374c3a965c0a..7fc7bf455e80c333cced1bac7085210c2b108f8d 100644
> --- a/gcc/targhooks.cc
> +++ b/gcc/targhooks.cc
> @@ -1787,7 +1787,19 @@ default_target_option_valid_attribute_p (tree ARG_UNUSED (fndecl),
>  					 int ARG_UNUSED (flags))
>  {
>    warning (OPT_Wattributes,
> -	   "target attribute is not supported on this machine");
> +	   "%<target%> attribute is not supported on this machine");
> +
> +  return false;
> +}
> +
> +bool
> +default_target_option_valid_version_attribute_p (tree ARG_UNUSED (fndecl),
> +						 tree ARG_UNUSED (name),
> +						 tree ARG_UNUSED (args),
> +						 int ARG_UNUSED (flags))
> +{
> +  warning (OPT_Wattributes,
> +	   "%<target_version%> attribute is not supported on this machine");
>  
>    return false;
>  }
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 0b72663e6a1a94406127f6253460f498b7a3ea9c..ebd89ce79566c350eaaab210c0dca3cc1ac2048e 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -3438,8 +3438,8 @@ extern vec<tree, va_gc> **decl_debug_args_insert (tree);
>     (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_optimization)
>  
>  /* In FUNCTION_DECL, this is set if this function has other versions generated
> -   using "target" attributes.  The default version is the one which does not
> -   have any "target" attribute set. */
> +   to support different architecture feature sets, e.g. using "target" or
> +   "target_version" attributes.  */
>  #define DECL_FUNCTION_VERSIONED(NODE)\
>     (FUNCTION_DECL_CHECK (NODE)->function_decl.versioned_function)
>  

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [2/3] [aarch64] Add function multiversioning support
  2023-10-18 15:44 ` [2/3] [aarch64] Add function multiversioning support Andrew Carlotti
@ 2023-10-30 22:35   ` Richard Sandiford
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Sandiford @ 2023-10-30 22:35 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches, richard.earnshaw

Andrew Carlotti <andrew.carlotti@arm.com> writes:
> This adds initial support for function multiversion on aarch64 using the
> target_version and target_clones attributes. This mostly follows the
> Beta specification in the ACLE [1], with a few diffences that remain to
> be fixed:
>
> - Symbol mangling for target_clones differs from that for target_version
>   and does not match the mangling specified in the ACLE. This
>   inconsistency is also present in i386 and rs6000 mangling.
> - The target_clones attribute does not currently support an implicit
>   "default" version.
> - Unrecognised target names in a target_clones attribute should be
>   ignored (with an optional warning), but currently cause an error to be
>   raised instead.
> - There is no option to disable function multiversioning at compile
>   time.
> - There is no support for function multiversioning in C, since this is
>   not yet enabled in the frontend. On the other hand, this patch
>   happens to enable multiversioning in Ada and D as well, using their
>   existing frontend support.
>
> This patch relies on adding functionality to libgcc, to support:
> - struct { unsigned long long features; } __aarch64_cpu_features;
> - void __init_cpu_features (void);
> - void __init_cpu_features_resolver (unsigned long hwcap,
> 				     const __ifunc_arg_t *arg);
> This support matches the interface currently used in LLVM's compiler-rt,
> and will be implemented in a future patch (which will be merged before
> merging this patch).
>
> This version of the patch incorrectly uses __init_cpu_features in the
> ifunc resolvers, which could lead to invalid library calls at load time.
> I will fix this to use __init_cpu_features_resolver in a future version
> of the patch.
>
> [1] https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning
>
> gcc/ChangeLog:
>
> 	* attribs.cc (decl_attributes): Pass attribute name to target
> 	hook.
> 	* config/aarch64/aarch64.cc
> 	(aarch64_process_target_version_attr): New.
> 	(aarch64_option_valid_attribute_p): Add check and support for
> 	target_version attribute.
> 	(enum CPUFeatures): New list of for bitmask positions.
> 	(aarch64_fmv_feature_data): New.
> 	(get_feature_bit): New.
> 	(get_feature_mask_for_version): New.
> 	(compare_feature_masks): New.
> 	(aarch64_compare_version_priority): New.
> 	(make_resolver_func): New.
> 	(add_condition_to_bb): New.
> 	(compare_feature_version_info): New.
> 	(dispatch_function_versions): New.
> 	(aarch64_generate_version_dispatcher_body): New.
> 	(aarch64_get_function_versions_dispatcher): New.
> 	(aarch64_common_function_versions): New.
> 	(aarch64_mangle_decl_assembler_name): New.
> 	(TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): New implementation.
> 	(TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE): New implementation.
> 	(TARGET_OPTION_FUNCTION_VERSIONS): New implementation.
> 	(TARGET_COMPARE_VERSION_PRIORITY): New implementation.
> 	(TARGET_GENERATE_VERSION_DISPATCHER_BODY): New implementation.
> 	(TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New implementation.
> 	(TARGET_MANGLE_DECL_ASSEMBLER_NAME): New implementation.

Nice.  Mostly LGTM, but some comments below.

> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6..cc935b502028392ebdc105f940900f01f79196a7 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -657,7 +657,8 @@ decl_attributes (tree *node, tree attributes, int flags,
>       options to the attribute((target(...))) list.  */
>    if (TREE_CODE (*node) == FUNCTION_DECL
>        && current_target_pragma
> -      && targetm.target_option.valid_attribute_p (*node, NULL_TREE,
> +      && targetm.target_option.valid_attribute_p (*node,
> +						  get_identifier("target"),
>  						  current_target_pragma, 0))
>      {
>        tree cur_attr = lookup_attribute ("target", attributes);
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 9c3c0e705e2e6ea3b55b4a5f1e7d3360f91eb51d..ca0e2a2507ffdbf99e17b77240504bf2d175b9c0 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -19088,11 +19088,70 @@ aarch64_process_target_attr (tree args)
>    return true;
>  }
>  
> +/* Parse the tree in ARGS that contains the targeti_version attribute
> +   information and update the global target options space.  */
> +
> +bool
> +aarch64_process_target_version_attr (tree args)
> +{
> +  if (TREE_CODE (args) == TREE_LIST)
> +    {
> +      if (TREE_CHAIN (args))
> +	{
> +	  error ("attribute %<target_version%> has multiple values");
> +	  return false;
> +	}
> +      args = TREE_VALUE (args);
> +    }
> +
> +  if (!args || TREE_CODE (args) != STRING_CST)
> +    {
> +      error ("attribute %<target_version%> argument not a string");
> +      return false;
> +    }
> +
> +  const char *str = TREE_STRING_POINTER (args);
> +  if (strcmp (str, "default") == 0)
> +    return true;
> +
> +  auto with_plus = std::string ("+") + str;
> +  enum aarch_parse_opt_result parse_res;
> +  auto isa_flags = aarch64_asm_isa_flags;
> +
> +  std::string invalid_extension;
> +  parse_res = aarch64_parse_extension (with_plus.c_str(), &isa_flags,
> +				       &invalid_extension);
> +
> +  if (parse_res == AARCH_PARSE_OK)
> +    {
> +      aarch64_set_asm_isa_flags (isa_flags);
> +      return true;
> +    }
> +
> +  switch (parse_res)
> +    {
> +      case AARCH_PARSE_MISSING_ARG:
> +	error ("missing value in %<target_version%> attribute");
> +	break;
> +
> +      case AARCH_PARSE_INVALID_FEATURE:
> +	error ("invalid feature modifier %qs of value %qs in "
> +	       "%<target_version%> attribute", invalid_extension.c_str (),
> +	       with_plus.c_str());
> +	break;
> +
> +      default:
> +	gcc_unreachable ();
> +    }
> +
> +  return false;
> +}
> +
>  /* Implement TARGET_OPTION_VALID_ATTRIBUTE_P.  This is used to
>     process attribute ((target ("..."))).  */
>  
>  static bool
> -aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
> +aarch64_option_valid_attribute_p (tree fndecl, tree name, tree args, int)
>  {
>    struct cl_target_option cur_target;
>    bool ret;
> @@ -19100,13 +19159,22 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
>    tree new_target, new_optimize;
>    tree existing_target = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
>  
> +  bool target_version_p;
> +  const char *attr_name = IDENTIFIER_POINTER (name);
> +  if (strcmp (attr_name, "target") == 0)
> +    target_version_p = false;
> +  else if (strcmp (attr_name, "target_version") == 0)
> +    target_version_p = true;
> +  else
> +    gcc_assert (false);
> +
>    /* If what we're processing is the current pragma string then the
>       target option node is already stored in target_option_current_node
>       by aarch64_pragma_target_parse in aarch64-c.cc.  Use that to avoid
>       having to re-parse the string.  This is especially useful to keep
>       arm_neon.h compile times down since that header contains a lot
>       of intrinsics enclosed in pragmas.  */
> -  if (!existing_target && args == current_target_pragma)
> +  if (!target_version_p && !existing_target && args == current_target_pragma)
>      {
>        DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_option_current_node;
>        return true;
> @@ -19142,7 +19210,25 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
>      cl_target_option_restore (&global_options, &global_options_set,
>  			      TREE_TARGET_OPTION (target_option_current_node));
>  
> -  ret = aarch64_process_target_attr (args);
> +  if (!target_version_p)
> +    {
> +      ret = aarch64_process_target_attr (args);
> +      if (ret)
> +	{
> +	  tree version_attr = lookup_attribute ("target_version",
> +						DECL_ATTRIBUTES (fndecl));
> +	  if (version_attr != NULL_TREE)
> +	    {
> +	      /* Reapply any target_version attribute after target attribute.
> +		 This should be equivalent to applying the target_version once
> +		 after processing all target attributes.  */
> +	      tree version_args = TREE_VALUE (version_attr);
> +	      ret = aarch64_process_target_version_attr (version_args);
> +	    }
> +	}
> +    }
> +  else
> +    ret = aarch64_process_target_version_attr (args);
>  
>    /* Set up any additional state.  */
>    if (ret)
> @@ -19173,6 +19259,730 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
>    return ret;
>  }
>  
> +/* This enum needs to match the enum used in libgcc cpuinfo.c.  */
> +//TODO: Does this clash with or overlap an existing list of target features?

Yeah, good question :)  Might be worth sticking it in an anonymous namespace
just in case.  We can do something more drastic if that turns out not
to be enough.

libgcc should have access to gcc/ headers, so I think we could probably
share this enum between them.

> +enum CPUFeatures {
> +  FEAT_RNG,
> +  FEAT_FLAGM,
> +  FEAT_FLAGM2,
> +  FEAT_FP16FML,
> +  FEAT_DOTPROD,
> +  FEAT_SM4,
> +  FEAT_RDM,
> +  FEAT_LSE,
> +  FEAT_FP,
> +  FEAT_SIMD,
> +  FEAT_CRC,
> +  FEAT_SHA1,
> +  FEAT_SHA2,
> +  FEAT_SHA3,
> +  FEAT_AES,
> +  FEAT_PMULL,
> +  FEAT_FP16,
> +  FEAT_DIT,
> +  FEAT_DPB,
> +  FEAT_DPB2,
> +  FEAT_JSCVT,
> +  FEAT_FCMA,
> +  FEAT_RCPC,
> +  FEAT_RCPC2,
> +  FEAT_FRINTTS,
> +  FEAT_DGH,
> +  FEAT_I8MM,
> +  FEAT_BF16,
> +  FEAT_EBF16,
> +  FEAT_RPRES,
> +  FEAT_SVE,
> +  FEAT_SVE_BF16,
> +  FEAT_SVE_EBF16,
> +  FEAT_SVE_I8MM,
> +  FEAT_SVE_F32MM,
> +  FEAT_SVE_F64MM,
> +  FEAT_SVE2,
> +  FEAT_SVE_AES,
> +  FEAT_SVE_PMULL128,
> +  FEAT_SVE_BITPERM,
> +  FEAT_SVE_SHA3,
> +  FEAT_SVE_SM4,
> +  FEAT_SME,
> +  FEAT_MEMTAG,
> +  FEAT_MEMTAG2,
> +  FEAT_MEMTAG3,
> +  FEAT_SB,
> +  FEAT_PREDRES,
> +  FEAT_SSBS,
> +  FEAT_SSBS2,
> +  FEAT_BTI,
> +  FEAT_LS64,
> +  FEAT_LS64_V,
> +  FEAT_LS64_ACCDATA,
> +  FEAT_WFXT,
> +  FEAT_SME_F64,
> +  FEAT_SME_I64,
> +  FEAT_SME2,
> +  FEAT_RCPC3, //TODO: Check this index - needs to agree with LLVM.
> +  FEAT_MAX
> +};
> +
> +typedef struct
> +{
> +  const char *name;
> +  int priority;
> +  unsigned long long feature_mask;
> +} aarch64_fmv_feature_datum;
> +
> +/* List these in priority order, to make it easier to sort target strings.  */
> +static aarch64_fmv_feature_datum aarch64_fmv_feature_data[] = {
> +  {"default",        0, 0ULL},
> +  {"rng",            10, 1ULL << FEAT_RNG},
> +  {"flagm",          20, 1ULL << FEAT_FLAGM},
> +  {"flagm2",         30, 1ULL << FEAT_FLAGM2},
> +  {"fp16fml",        40, 1ULL << FEAT_FP16FML},
> +  {"dotprod",        50, 1ULL << FEAT_DOTPROD},
> +  {"sm4",            60, 1ULL << FEAT_SM4},
> +  {"rdm",            70, 1ULL << FEAT_RDM},
> +  {"lse",            80, 1ULL << FEAT_LSE},
> +  {"fp",             90, 1ULL << FEAT_FP},
> +  {"simd",          100, 1ULL << FEAT_SIMD},
> +  {"crc",           110, 1ULL << FEAT_CRC},
> +  {"sha1",          120, 1ULL << FEAT_SHA1},
> +  {"sha2",          130, 1ULL << FEAT_SHA2},
> +  {"sha3",          140, 1ULL << FEAT_SHA3},
> +  {"aes",           150, 1ULL << FEAT_AES},
> +  {"pmull",         160, 1ULL << FEAT_PMULL},
> +  {"fp16",          170, 1ULL << FEAT_FP16},
> +  {"dit",           180, 1ULL << FEAT_DIT},
> +  {"dpb",           190, 1ULL << FEAT_DPB},
> +  {"dpb2",          200, 1ULL << FEAT_DPB2},
> +  {"jscvt",         210, 1ULL << FEAT_JSCVT},
> +  {"fcma",          220, 1ULL << FEAT_FCMA},
> +  {"rcpc",          230, 1ULL << FEAT_RCPC},
> +  {"rcpc2",         240, 1ULL << FEAT_RCPC2},
> +  {"rcpc3",         241, 1ULL << FEAT_RCPC3},
> +  {"frintts",       250, 1ULL << FEAT_FRINTTS},
> +  {"dgh",           260, 1ULL << FEAT_DGH},
> +  {"i8mm",          270, 1ULL << FEAT_I8MM},
> +  {"bf16",          280, 1ULL << FEAT_BF16},
> +  {"ebf16",         290, 1ULL << FEAT_EBF16},
> +  {"rpres",         300, 1ULL << FEAT_RPRES},
> +  {"sve",           310, 1ULL << FEAT_SVE},
> +  {"sve-bf16",      320, 1ULL << FEAT_SVE_BF16},

I suppose we need a way of handling things like this.  We deliberately
don't have a +sve-bf16 feature flag because it's implied by the
combination +sve+bf16.

We probably need to do something similar to LLVM and have a mapping
from these strings to a set of supported feature flags, either as
a string or (preferably) as an aarch64_feature_flags.

> +  {"sve-ebf16",     330, 1ULL << FEAT_SVE_EBF16},
> +  {"sve-i8mm",      340, 1ULL << FEAT_SVE_I8MM},
> +  {"f32mm",         350, 1ULL << FEAT_SVE_F32MM},
> +  {"f64mm",         360, 1ULL << FEAT_SVE_F64MM},
> +  {"sve2",          370, 1ULL << FEAT_SVE2},
> +  {"sve2-aes",      380, 1ULL << FEAT_SVE_AES},
> +  {"sve2-pmull128", 390, 1ULL << FEAT_SVE_PMULL128},
> +  {"sve2-bitperm",  400, 1ULL << FEAT_SVE_BITPERM},
> +  {"sve2-sha3",     410, 1ULL << FEAT_SVE_SHA3},
> +  {"sve2-sm4",      420, 1ULL << FEAT_SVE_SM4},
> +  {"sme",           430, 1ULL << FEAT_SME},
> +  {"memtag",        440, 1ULL << FEAT_MEMTAG},
> +  {"memtag2",       450, 1ULL << FEAT_MEMTAG2},
> +  {"memtag3",       460, 1ULL << FEAT_MEMTAG3},
> +  {"sb",            470, 1ULL << FEAT_SB},
> +  {"predres",       480, 1ULL << FEAT_PREDRES},
> +  {"ssbs",          490, 1ULL << FEAT_SSBS},
> +  {"ssbs2",         500, 1ULL << FEAT_SSBS2},
> +  {"bti",           510, 1ULL << FEAT_BTI},
> +  {"ls64",          520, 1ULL << FEAT_LS64},
> +  {"ls64_v",        530, 1ULL << FEAT_LS64_V},
> +  {"ls64_accdata",  540, 1ULL << FEAT_LS64_ACCDATA},
> +  {"wfxt",          550, 1ULL << FEAT_WFXT},
> +  {"sme-f64f64",    560, 1ULL << FEAT_SME_F64},
> +  {"sme-i16i64",    570, 1ULL << FEAT_SME_I64},
> +  {"sme2",          580, 1ULL << FEAT_SME2}
> +};

Do we anticipate any mask having more than one bit set?  I wonder if
aarch64_fmv_feature_datum should have a CPUFeatures instead, since
that would be more future-proof when we run out of bits.

Also, if we did that, it might make sense to put the list of features
in a .def file.  It would then be possible to use that to define the
enum and aarch64_fmv_feature_data, which would help to make sure that
they stay in sync.

> +
> +/* Look up a single feature name, and return the bitmask.  */
> +unsigned long long
> +get_feature_bit (char *name)
> +{
> +  /* Skip default entry here.  */
> +  for (int i = 1; i < FEAT_MAX; i++)
> +      if (strcmp(aarch64_fmv_feature_data[i].name, name) == 0)
> +	return aarch64_fmv_feature_data[i].feature_mask;

Formatting nit (sorry): the if and return are indented too far.
Missing space between "strcmp" and "(".

> +  return 0;
> +}
> +
> +/* This parses the attribute arguments to target_version in DECL and the
> +   feature mask required to select those targets.  No adjustments are made to
> +   add or remove redundant feature requirements.  */
> +
> +unsigned long long

This is probably personal preference, but how about hiding this behind
a typedef now?  I assume even unsigned long long won't be enough for long.
And it's harder to retroactively find which unsigned long longs are
for this and which are for something else.

> +get_feature_mask_for_version (tree decl)
> +{
> +  tree version_attr = lookup_attribute ("target_version", DECL_ATTRIBUTES (decl));

Formatting nit: long line.

> +  if (version_attr == NULL)
> +    return 0;
> +
> +  const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE
> +						    (version_attr)));
> +  if (strcmp (version_string, "default") == 0
> +      || strcmp (version_string, "") == 0)
> +    return 0;
> +
> +  int attr_len = strlen (version_string);
> +
> +  char *feature_string = XNEWVEC (char, attr_len+ 1);
> +  strcpy (feature_string, version_string);

There's a checked xstrdup we can use for this.

> +
> +  int count = 1;
> +  for (int i = 0; i < attr_len; i++)
> +    {
> +      if (feature_string[i] == '+')
> +	{
> +	  feature_string[i] = '\0';
> +	  count++;
> +	}
> +    }
> +
> +  unsigned long long feature_mask = 0ULL;
> +  char *cur_feature = feature_string;
> +  for (int i = 0; i < count; i++)
> +    {
> +      unsigned long long feature_bit = get_feature_bit (cur_feature);
> +      if (feature_bit == 0)
> +	{
> +	  /* TODO: For target_clones, we should just ignore this version
> +	     instead.  */
> +	  error_at (DECL_SOURCE_LOCATION (decl), 0,
> +		      "Unrecognised feature %s in function version string",
> +		      cur_feature);

GCC's convention is that error messages start with lower case.

> +	  feature_mask = -1ULL;
> +	}
> +      feature_mask |= feature_bit;
> +      cur_feature += strlen(cur_feature) + 1;

Formatting nit: missing space between "strlen" and "(".

> +    }
> +  XDELETEVEC (feature_string);
> +  return feature_mask;
> +}
> +
> +/* Compare priorities of two feature masks. Return:
> +     1: mask1 is higher priority
> +    -1: mask2 is higher priority
> +     0: masks are equal.  */
> +
> +int
> +compare_feature_masks (unsigned long long mask1, unsigned long long mask2)
> +{
> +  int pop1 = __builtin_popcountll(mask1);
> +  int pop2 = __builtin_popcountll(mask2);

Not sure we can use __builtin_popcountll unconditionally.  Probably best
to use popcount_hwi instead.

> +  if (pop1 > pop2)
> +    return 1;
> +  if (pop2 > pop1)
> +    return -1;

I suppose this is implementing the ACLE rule:

4. Select the most specific version else

But I wonder if the rule is the right approach.  E.g. simd+fp16 would
be considered more specific than sve, even though sve implies fp16.

The behaviour without this seems reasonable on the face of it
(rank by the highest priority point of difference), but I've not
thought about it much.

> +  unsigned long long diff_mask = mask1 ^ mask2;
> +  if (diff_mask == 0ULL)
> +    return 0;
> +  for (int i = FEAT_MAX - 1; i > 0; i--)
> +    {
> +      unsigned long long bit_mask = aarch64_fmv_feature_data[i].feature_mask;
> +      if (diff_mask & bit_mask)
> +	return (mask1 & bit_mask) ? 1 : -1;
> +    }
> +  gcc_unreachable();
> +}
> +
> +int
> +aarch64_compare_version_priority (tree decl1, tree decl2)
> +{
> +  unsigned long long mask1 = get_feature_mask_for_version (decl1);
> +  unsigned long long mask2 = get_feature_mask_for_version (decl2);
> +
> +  return compare_feature_masks (mask1, mask2);
> +}
> +
> +/* Make the resolver function decl to dispatch the versions of
> +   a multi-versioned function,  DEFAULT_DECL.  IFUNC_ALIAS_DECL is
> +   ifunc alias that will point to the created resolver.  Create an
> +   empty basic block in the resolver and store the pointer in
> +   EMPTY_BB.  Return the decl of the resolver function.  */
> +
> +static tree
> +make_resolver_func (const tree default_decl,
> +		    const tree ifunc_alias_decl,
> +		    basic_block *empty_bb)
> +{
> +  tree decl, type, t;
> +
> +  /* Create resolver function name based on default_decl.  */
> +  tree decl_name = clone_function_name (default_decl, "resolver");
> +  const char *resolver_name = IDENTIFIER_POINTER (decl_name);
> +
> +  /* The resolver function should return a (void *). */
> +  type = build_function_type_list (ptr_type_node, NULL_TREE);
> +
> +  decl = build_fn_decl (resolver_name, type);
> +  SET_DECL_ASSEMBLER_NAME (decl, decl_name);
> +
> +  DECL_NAME (decl) = decl_name;
> +  TREE_USED (decl) = 1;
> +  DECL_ARTIFICIAL (decl) = 1;
> +  DECL_IGNORED_P (decl) = 1;
> +  TREE_PUBLIC (decl) = 0;
> +  DECL_UNINLINABLE (decl) = 1;
> +
> +  /* Resolver is not external, body is generated.  */
> +  DECL_EXTERNAL (decl) = 0;
> +  DECL_EXTERNAL (ifunc_alias_decl) = 0;
> +
> +  DECL_CONTEXT (decl) = NULL_TREE;
> +  DECL_INITIAL (decl) = make_node (BLOCK);
> +  DECL_STATIC_CONSTRUCTOR (decl) = 0;
> +
> +  if (DECL_COMDAT_GROUP (default_decl)
> +      || TREE_PUBLIC (default_decl))
> +    {
> +      /* In this case, each translation unit with a call to this
> +	 versioned function will put out a resolver.  Ensure it
> +	 is comdat to keep just one copy.  */
> +      DECL_COMDAT (decl) = 1;
> +      make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl));
> +    }
> +  else
> +    TREE_PUBLIC (ifunc_alias_decl) = 0;
> +
> +  /* Build result decl and add to function_decl. */
> +  t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node);
> +  DECL_CONTEXT (t) = decl;
> +  DECL_ARTIFICIAL (t) = 1;
> +  DECL_IGNORED_P (t) = 1;
> +  DECL_RESULT (decl) = t;
> +
> +  gimplify_function_tree (decl);
> +  push_cfun (DECL_STRUCT_FUNCTION (decl));
> +  *empty_bb = init_lowered_empty_function (decl, false,
> +					   profile_count::uninitialized ());
> +
> +  cgraph_node::add_new_function (decl, true);
> +  symtab->call_cgraph_insertion_hooks (cgraph_node::get_create (decl));
> +
> +  pop_cfun ();
> +
> +  gcc_assert (ifunc_alias_decl != NULL);
> +  /* Mark ifunc_alias_decl as "ifunc" with resolver as resolver_name.  */
> +  DECL_ATTRIBUTES (ifunc_alias_decl)
> +    = make_attribute ("ifunc", resolver_name,
> +		      DECL_ATTRIBUTES (ifunc_alias_decl));
> +
> +  /* Create the alias for dispatch to resolver here.  */
> +  cgraph_node::create_same_body_alias (ifunc_alias_decl, decl);
> +  return decl;
> +}

It looks like this is identical to the x86 version, is that right?
If so, I think it'd be worth moving it to target-independent code.

> +
> +/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL
> +   to return a pointer to VERSION_DECL if all feature bits specified in
> +   FEATURE_MASK are not set in MASK_VAR.  This function will be called during
> +   version dispatch to decide which function version to execute.  It returns
> +   the basic block at the end, to which more conditions can be added.  */
> +static basic_block
> +add_condition_to_bb (tree function_decl, tree version_decl,
> +		     unsigned long long feature_mask,
> +		     tree mask_var, basic_block new_bb)
> +{
> +  gimple *return_stmt;
> +  tree convert_expr, result_var;
> +  gimple *convert_stmt;
> +  gimple *if_else_stmt;
> +
> +  basic_block bb1, bb2, bb3;
> +  edge e12, e23;
> +
> +  gimple_seq gseq;
> +
> +  push_cfun (DECL_STRUCT_FUNCTION (function_decl));
> +
> +  gcc_assert (new_bb != NULL);
> +  gseq = bb_seq (new_bb);
> +
> +
> +  convert_expr = build1 (CONVERT_EXPR, ptr_type_node,
> +			 build_fold_addr_expr (version_decl));
> +  result_var = create_tmp_var (ptr_type_node);
> +  convert_stmt = gimple_build_assign (result_var, convert_expr);
> +  return_stmt = gimple_build_return (result_var);
> +
> +
> +  if (feature_mask == 0)
> +    {
> +      /* Default version.  */
> +      gimple_seq_add_stmt (&gseq, convert_stmt);
> +      gimple_seq_add_stmt (&gseq, return_stmt);
> +      set_bb_seq (new_bb, gseq);
> +      gimple_set_bb (convert_stmt, new_bb);
> +      gimple_set_bb (return_stmt, new_bb);
> +      pop_cfun ();
> +      return new_bb;
> +    }
> +
> +  tree and_expr_var = create_tmp_var (long_long_unsigned_type_node);
> +  tree and_expr = build2 (BIT_AND_EXPR,
> +			  long_long_unsigned_type_node,
> +			  mask_var,
> +			  build_int_cst (long_long_unsigned_type_node,
> +					 feature_mask));
> +  gimple *and_stmt = gimple_build_assign (and_expr_var, and_expr);
> +  gimple_set_block (and_stmt, DECL_INITIAL (function_decl));
> +  gimple_set_bb (and_stmt, new_bb);
> +  gimple_seq_add_stmt (&gseq, and_stmt);
> +
> +  tree zero_llu = build_int_cst (long_long_unsigned_type_node, 0);
> +  if_else_stmt = gimple_build_cond (EQ_EXPR, and_expr_var, zero_llu,
> +				    NULL_TREE, NULL_TREE);
> +  gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl));
> +  gimple_set_bb (if_else_stmt, new_bb);
> +  gimple_seq_add_stmt (&gseq, if_else_stmt);
> +
> +  gimple_seq_add_stmt (&gseq, convert_stmt);
> +  gimple_seq_add_stmt (&gseq, return_stmt);
> +  set_bb_seq (new_bb, gseq);
> +
> +  bb1 = new_bb;
> +  e12 = split_block (bb1, if_else_stmt);
> +  bb2 = e12->dest;
> +  e12->flags &= ~EDGE_FALLTHRU;
> +  e12->flags |= EDGE_TRUE_VALUE;
> +
> +  e23 = split_block (bb2, return_stmt);
> +
> +  gimple_set_bb (convert_stmt, bb2);
> +  gimple_set_bb (return_stmt, bb2);
> +
> +  bb3 = e23->dest;
> +  make_edge (bb1, bb3, EDGE_FALSE_VALUE);
> +
> +  remove_edge (e23);
> +  make_edge (bb2, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
> +
> +  pop_cfun ();
> +
> +  return bb3;
> +}

It looks like there's quite a bit of common code here too,
but also significant differences.

Perhaps the target-indendent code could contain a base class for generating
resolvers that targets can customise via derivation?  Just an idea though.
Maybe the amount of common code beyond make_resolver_func doesn't make
it worth it.

> +
> +/* Used when sorting the decls into dispatch order.  */
> +static int compare_feature_version_info (const void *p1, const void *p2)
> +{
> +  typedef struct _function_version_info
> +    {
> +      tree version_decl;
> +      unsigned long long feature_mask;
> +    } function_version_info;

There's no need for the typedef struct construct now that the
codebase is C++.

> +  const function_version_info v1 = *(const function_version_info *)p1;
> +  const function_version_info v2 = *(const function_version_info *)p2;
> +  return - compare_feature_masks (v1.feature_mask, v2.feature_mask);
> +}
> +
> +static int
> +dispatch_function_versions (tree dispatch_decl,
> +			    void *fndecls_p,
> +			    basic_block *empty_bb)
> +{
> +  gimple *ifunc_cpu_init_stmt;
> +  gimple_seq gseq;
> +  int ix;
> +  tree ele;
> +  vec<tree> *fndecls;
> +  unsigned int num_versions = 0;
> +  unsigned int actual_versions = 0;
> +  unsigned int i;
> +
> +  struct _function_version_info
> +    {
> +      tree version_decl;
> +      unsigned long long feature_mask;
> +    }*function_version_info;
> +
> +  gcc_assert (dispatch_decl != NULL
> +	      && fndecls_p != NULL
> +	      && empty_bb != NULL);
> +
> +  /*fndecls_p is actually a vector.  */
> +  fndecls = static_cast<vec<tree> *> (fndecls_p);
> +
> +  /* At least one more version other than the default.  */
> +  num_versions = fndecls->length ();
> +  gcc_assert (num_versions >= 2);
> +
> +  function_version_info = (struct _function_version_info *)
> +    XNEWVEC (struct _function_version_info, (num_versions));
> +
> +  push_cfun (DECL_STRUCT_FUNCTION (dispatch_decl));
> +
> +  gseq = bb_seq (*empty_bb);
> +  /* Function version dispatch is via IFUNC.  IFUNC resolvers fire before
> +     constructors, so explicity call __builtin_cpu_init here.  */

Comment needs updating for AArch64 (but I realise you said in the
covering note that this wasn't final).

> +  tree init_fn_type = build_function_type_list (void_type_node, NULL);
> +  tree init_fn_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL,
> +				  get_identifier ("init_cpu_features"),
> +				  init_fn_type);
> +  ifunc_cpu_init_stmt = gimple_build_call (init_fn_decl, 0);
> +  gimple_seq_add_stmt (&gseq, ifunc_cpu_init_stmt);
> +  gimple_set_bb (ifunc_cpu_init_stmt, *empty_bb);
> +
> +  /* Build the struct type for __aarch64_cpu_features.  */
> +  tree global_type = lang_hooks.types.make_type (RECORD_TYPE);
> +  tree field1 = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
> +			    get_identifier ("features"),
> +			    long_long_unsigned_type_node);
> +  DECL_FIELD_CONTEXT (field1) = global_type;
> +  TYPE_FIELDS (global_type) = field1;
> +  layout_type (global_type);
> +
> +  tree global_var = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier
> +				("__aarch64_cpu_features"), global_type);
> +  DECL_EXTERNAL (global_var) = 1;
> +  tree mask_var = create_tmp_var (long_long_unsigned_type_node);
> +
> +  tree component_expr = build3 (COMPONENT_REF, long_long_unsigned_type_node,
> +				global_var, field1, NULL_TREE);
> +  gimple *component_stmt = gimple_build_assign (mask_var, component_expr);
> +  gimple_set_block (component_stmt, DECL_INITIAL (dispatch_decl));
> +  gimple_set_bb (component_stmt, *empty_bb);
> +  gimple_seq_add_stmt (&gseq, component_stmt);
> +
> +  tree not_expr = build1 (BIT_NOT_EXPR, long_long_unsigned_type_node, mask_var);
> +  gimple *not_stmt = gimple_build_assign (mask_var, not_expr);
> +  gimple_set_block (not_stmt, DECL_INITIAL (dispatch_decl));
> +  gimple_set_bb (not_stmt, *empty_bb);
> +  gimple_seq_add_stmt (&gseq, not_stmt);
> +
> +  set_bb_seq (*empty_bb, gseq);
> +
> +  pop_cfun ();
> +
> +  for (ix = 0; fndecls->iterate (ix, &ele); ++ix)

Very minor, but we can use:

  for (tree version_decl : *fndecls)

now.

Thanks,
Richard

> +    {
> +      tree version_decl = ele;
> +      unsigned long long feature_mask;
> +      /* Get attribute string, parse it and find the right features.  */
> +      feature_mask = get_feature_mask_for_version (version_decl);
> +      function_version_info [actual_versions].version_decl = version_decl;
> +      function_version_info [actual_versions].feature_mask = feature_mask;
> +      actual_versions++;
> +    }
> +
> +  /* Sort the versions according to descending order of dispatch priority.  */
> +  qsort (function_version_info, actual_versions,
> +	 sizeof (struct _function_version_info), compare_feature_version_info);
> +
> +  for (i = 0; i < actual_versions; ++i)
> +    *empty_bb = add_condition_to_bb (dispatch_decl,
> +				     function_version_info[i].version_decl,
> +				     function_version_info[i].feature_mask,
> +				     mask_var,
> +				     *empty_bb);
> +
> +  free (function_version_info);
> +  return 0;
> +}
> +
> +
> +tree
> +aarch64_generate_version_dispatcher_body (void *node_p)
> +{
> +  tree resolver_decl;
> +  basic_block empty_bb;
> +  tree default_ver_decl;
> +  struct cgraph_node *versn;
> +  struct cgraph_node *node;
> +
> +  struct cgraph_function_version_info *node_version_info = NULL;
> +  struct cgraph_function_version_info *versn_info = NULL;
> +
> +  node = (cgraph_node *)node_p;
> +
> +  node_version_info = node->function_version ();
> +  gcc_assert (node->dispatcher_function
> +	      && node_version_info != NULL);
> +
> +  if (node_version_info->dispatcher_resolver)
> +    return node_version_info->dispatcher_resolver;
> +
> +  /* The first version in the chain corresponds to the default version.  */
> +  default_ver_decl = node_version_info->next->this_node->decl;
> +
> +  /* node is going to be an alias, so remove the finalized bit.  */
> +  node->definition = false;
> +
> +  resolver_decl = make_resolver_func (default_ver_decl,
> +				      node->decl, &empty_bb);
> +
> +  node_version_info->dispatcher_resolver = resolver_decl;
> +
> +  push_cfun (DECL_STRUCT_FUNCTION (resolver_decl));
> +
> +  auto_vec<tree, 2> fn_ver_vec;
> +
> +  for (versn_info = node_version_info->next; versn_info;
> +       versn_info = versn_info->next)
> +    {
> +      versn = versn_info->this_node;
> +      /* Check for virtual functions here again, as by this time it should
> +	 have been determined if this function needs a vtable index or
> +	 not.  This happens for methods in derived classes that override
> +	 virtual methods in base classes but are not explicitly marked as
> +	 virtual.  */
> +      if (DECL_VINDEX (versn->decl))
> +	sorry ("virtual function multiversioning not supported");
> +
> +      fn_ver_vec.safe_push (versn->decl);
> +    }
> +
> +  dispatch_function_versions (resolver_decl, &fn_ver_vec, &empty_bb);
> +  cgraph_edge::rebuild_edges ();
> +  pop_cfun ();
> +  return resolver_decl;
> +}
> +
> +/* Make a dispatcher declaration for the multi-versioned function DECL.
> +   Calls to DECL function will be replaced with calls to the dispatcher
> +   by the front-end.  Returns the decl of the dispatcher function.  */
> +
> +tree
> +aarch64_get_function_versions_dispatcher (void *decl)
> +{
> +  tree fn = (tree) decl;
> +  struct cgraph_node *node = NULL;
> +  struct cgraph_node *default_node = NULL;
> +  struct cgraph_function_version_info *node_v = NULL;
> +  struct cgraph_function_version_info *first_v = NULL;
> +
> +  tree dispatch_decl = NULL;
> +
> +  struct cgraph_function_version_info *default_version_info = NULL;
> +
> +  gcc_assert (fn != NULL && DECL_FUNCTION_VERSIONED (fn));
> +
> +  node = cgraph_node::get (fn);
> +  gcc_assert (node != NULL);
> +
> +  node_v = node->function_version ();
> +  gcc_assert (node_v != NULL);
> +
> +  if (node_v->dispatcher_resolver != NULL)
> +    return node_v->dispatcher_resolver;
> +
> +  /* Find the default version and make it the first node.  */
> +  first_v = node_v;
> +  /* Go to the beginning of the chain.  */
> +  while (first_v->prev != NULL)
> +    first_v = first_v->prev;
> +  default_version_info = first_v;
> +  while (default_version_info != NULL)
> +    {
> +      if (get_feature_mask_for_version
> +	    (default_version_info->this_node->decl) == 0ULL)
> +	break;
> +      default_version_info = default_version_info->next;
> +    }
> +
> +  /* If there is no default node, just return NULL.  */
> +  if (default_version_info == NULL)
> +    return NULL;
> +
> +  /* Make default info the first node.  */
> +  if (first_v != default_version_info)
> +    {
> +      default_version_info->prev->next = default_version_info->next;
> +      if (default_version_info->next)
> +	default_version_info->next->prev = default_version_info->prev;
> +      first_v->prev = default_version_info;
> +      default_version_info->next = first_v;
> +      default_version_info->prev = NULL;
> +    }
> +
> +  default_node = default_version_info->this_node;
> +
> +  if (targetm.has_ifunc_p ())
> +    {
> +      struct cgraph_function_version_info *it_v = NULL;
> +      struct cgraph_node *dispatcher_node = NULL;
> +      struct cgraph_function_version_info *dispatcher_version_info = NULL;
> +
> +      /* Right now, the dispatching is done via ifunc.  */
> +      dispatch_decl = make_dispatcher_decl (default_node->decl);
> +      TREE_NOTHROW (dispatch_decl) = TREE_NOTHROW (fn);
> +
> +      dispatcher_node = cgraph_node::get_create (dispatch_decl);
> +      gcc_assert (dispatcher_node != NULL);
> +      dispatcher_node->dispatcher_function = 1;
> +      dispatcher_version_info
> +	= dispatcher_node->insert_new_function_version ();
> +      dispatcher_version_info->next = default_version_info;
> +      dispatcher_node->definition = 1;
> +
> +      /* Set the dispatcher for all the versions.  */
> +      it_v = default_version_info;
> +      while (it_v != NULL)
> +	{
> +	  it_v->dispatcher_resolver = dispatch_decl;
> +	  it_v = it_v->next;
> +	}
> +    }
> +  else
> +    {
> +      error_at (DECL_SOURCE_LOCATION (default_node->decl),
> +		"multiversioning needs %<ifunc%> which is not supported "
> +		"on this target");
> +    }
> +
> +  return dispatch_decl;
> +}
> +
> +bool
> +aarch64_common_function_versions (tree fn1, tree fn2)
> +{
> +  if (TREE_CODE (fn1) != FUNCTION_DECL
> +      || TREE_CODE (fn2) != FUNCTION_DECL)
> +    return false;
> +
> +  return (aarch64_compare_version_priority (fn1, fn2) != 0);
> +}
> +
> +
> +tree
> +aarch64_mangle_decl_assembler_name (tree decl, tree id)
> +{
> +  /* For function version, add the target suffix to the assembler name.  */
> +  if (TREE_CODE (decl) == FUNCTION_DECL
> +      && DECL_FUNCTION_VERSIONED (decl))
> +    {
> +      unsigned long long feature_mask = get_feature_mask_for_version (decl);
> +
> +      /* No suffix for the default version.  */
> +      if (feature_mask == 0ULL)
> +	return id;
> +
> +      char suffix[2048];
> +      int pos = 0;
> +      const char *base = IDENTIFIER_POINTER (id);
> +
> +      for (int i = 1; i < FEAT_MAX; i++)
> +	{
> +	  if (feature_mask & aarch64_fmv_feature_data[i].feature_mask)
> +	    {
> +	      suffix[pos] = 'M';
> +	      strcpy (&suffix[pos+1], aarch64_fmv_feature_data[i].name);
> +	      pos += strlen(aarch64_fmv_feature_data[i].name) + 1;
> +	    }
> +	}
> +      suffix[pos] = '\0';
> +
> +      char *ret = XNEWVEC (char, strlen (base) + strlen (suffix) + 3);
> +      sprintf (ret, "%s._%s", base, suffix);
> +
> +      if (DECL_ASSEMBLER_NAME_SET_P (decl))
> +	SET_DECL_RTL (decl, NULL);
> +
> +      id = get_identifier (ret);
> +    }
> +  return id;
> +}
> +
> +
>  /* Helper for aarch64_can_inline_p.  In the case where CALLER and CALLEE are
>     tri-bool options (yes, no, don't care) and the default value is
>     DEF, determine whether to reject inlining.  */
> @@ -27804,6 +28614,12 @@ aarch64_libgcc_floating_mode_supported_p
>  #undef TARGET_OPTION_VALID_ATTRIBUTE_P
>  #define TARGET_OPTION_VALID_ATTRIBUTE_P aarch64_option_valid_attribute_p
>  
> +#undef TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P
> +#define TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P aarch64_option_valid_attribute_p
> +
> +#undef TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> +#define TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE "target_version"
> +
>  #undef TARGET_SET_CURRENT_FUNCTION
>  #define TARGET_SET_CURRENT_FUNCTION aarch64_set_current_function
>  
> @@ -28128,6 +28944,24 @@ aarch64_libgcc_floating_mode_supported_p
>  #undef TARGET_CONST_ANCHOR
>  #define TARGET_CONST_ANCHOR 0x1000000
>  
> +#undef TARGET_OPTION_FUNCTION_VERSIONS
> +#define TARGET_OPTION_FUNCTION_VERSIONS aarch64_common_function_versions
> +
> +#undef TARGET_COMPARE_VERSION_PRIORITY
> +#define TARGET_COMPARE_VERSION_PRIORITY aarch64_compare_version_priority
> +
> +#undef TARGET_GENERATE_VERSION_DISPATCHER_BODY
> +#define TARGET_GENERATE_VERSION_DISPATCHER_BODY \
> +  aarch64_generate_version_dispatcher_body
> +
> +#undef TARGET_GET_FUNCTION_VERSIONS_DISPATCHER
> +#define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \
> +  aarch64_get_function_versions_dispatcher
> +
> +#undef TARGET_MANGLE_DECL_ASSEMBLER_NAME
> +#define TARGET_MANGLE_DECL_ASSEMBLER_NAME aarch64_mangle_decl_assembler_name
> +
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>  
>  #include "gt-aarch64.h"

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [1/3] Add support for target_version attribute
  2023-10-26 18:41   ` Richard Sandiford
@ 2023-11-03 12:43     ` Andrew Carlotti
  2023-11-05 20:18       ` Richard Sandiford
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Carlotti @ 2023-11-03 12:43 UTC (permalink / raw)
  To: gcc-patches, jason, nathan, rguenther, richard.sandiford

On Thu, Oct 26, 2023 at 07:41:09PM +0100, Richard Sandiford wrote:
> Andrew Carlotti <andrew.carlotti@arm.com> writes:
> > This patch adds support for the "target_version" attribute to the middle
> > end and the C++ frontend, which will be used to implement function
> > multiversioning in the aarch64 backend.
> >
> > Note that C++ is currently the only frontend which supports
> > multiversioning using the "target" attribute, whereas the
> > "target_clones" attribute is additionally supported in C, D and Ada.
> > Support for the target_version attribute will be extended to C at a
> > later date.
> >
> > Targets that currently use the "target" attribute for function
> > multiversioning (i.e. i386 and rs6000) are not affected by this patch.
> >
> >
> > I could have implemented the target hooks slightly differently, by reusing the
> > valid_attribute_p hook and adding attribute name checks to each backend
> > implementation (c.f. the aarch64 implementation in patch 2/3).  Would this be
> > preferable?
> 
> Having as much as possible in target-independent code seems better
> to me FWIW.  On that basis:
> 
> >
> > Otherwise, is this ok for master?
> >
> >
> > gcc/c-family/ChangeLog:
> >
> > 	* c-attribs.cc (handle_target_version_attribute): New.
> > 	(c_common_attribute_table): Add target_version.
> > 	(handle_target_clones_attribute): Add conflict with
> > 	target_version attribute.
> >
> > gcc/ChangeLog:
> >
> > 	* attribs.cc (is_function_default_version): Update comment to
> > 	specify incompatibility with target_version attributes.
> > 	* cgraphclones.cc (cgraph_node::create_version_clone_with_body):
> > 	Call valid_version_attribute_p for target_version attributes.
> > 	* target.def (valid_version_attribute_p): New hook.
> > 	(expanded_clones_attribute): New hook.
> > 	* doc/tm.texi.in: Add new hooks.
> > 	* doc/tm.texi: Regenerate.
> > 	* multiple_target.cc (create_dispatcher_calls): Remove redundant
> > 	is_function_default_version check.
> > 	(expand_target_clones): Use target hook for attribute name.
> > 	* targhooks.cc (default_target_option_valid_version_attribute_p):
> > 	New.
> > 	* targhooks.h (default_target_option_valid_version_attribute_p):
> > 	New.
> > 	* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
> > 	target_version attributes.
> >
> > gcc/cp/ChangeLog:
> >
> > 	* decl2.cc (check_classfn): Update comment to include
> > 	target_version attributes.
> >
> >
> > diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> > index b1300018d1e8ed8e02ded1ea721dc192a6d32a49..a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6 100644
> > --- a/gcc/attribs.cc
> > +++ b/gcc/attribs.cc
> > @@ -1233,8 +1233,9 @@ make_dispatcher_decl (const tree decl)
> >    return func_decl;  
> >  }
> >  
> > -/* Returns true if decl is multi-versioned and DECL is the default function,
> > -   that is it is not tagged with target specific optimization.  */
> > +/* Returns true if DECL is multi-versioned using the target attribute, and this
> > +   is the default version.  This function can only be used for targets that do
> > +   not support the "target_version" attribute.  */
> >  
> >  bool
> >  is_function_default_version (const tree decl)
> > diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> > index 072cfb69147bd6b314459c0bd48a0c1fb92d3e4d..1a224c036277d51ab4dc0d33a403177bd226e48a 100644
> > --- a/gcc/c-family/c-attribs.cc
> > +++ b/gcc/c-family/c-attribs.cc
> > @@ -148,6 +148,7 @@ static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_assume_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_target_attribute (tree *, tree, tree, int, bool *);
> > +static tree handle_target_version_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_target_clones_attribute (tree *, tree, tree, int, bool *);
> >  static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
> >  static tree ignore_attribute (tree *, tree, tree, int, bool *);
> > @@ -480,6 +481,8 @@ const struct attribute_spec c_common_attribute_table[] =
> >  			      handle_error_attribute, NULL },
> >    { "target",                 1, -1, true, false, false, false,
> >  			      handle_target_attribute, NULL },
> > +  { "target_version",         1, -1, true, false, false, false,
> > +			      handle_target_version_attribute, NULL },
> >    { "target_clones",          1, -1, true, false, false, false,
> >  			      handle_target_clones_attribute, NULL },
> >    { "optimize",               1, -1, true, false, false, false,
> > @@ -5569,6 +5572,45 @@ handle_target_attribute (tree *node, tree name, tree args, int flags,
> >    return NULL_TREE;
> >  }
> >  
> > +/* Handle a "target_version" attribute.  */
> > +
> > +static tree
> > +handle_target_version_attribute (tree *node, tree name, tree args, int flags,
> > +				  bool *no_add_attrs)
> > +{
> > +  /* Ensure we have a function type.  */
> > +  if (TREE_CODE (*node) != FUNCTION_DECL)
> > +    {
> > +      warning (OPT_Wattributes, "%qE attribute ignored", name);
> > +      *no_add_attrs = true;
> > +    }
> > +  else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node)))
> > +    {
> > +      warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
> > +		   "with %qs attribute", name, "target_clones");
> > +      *no_add_attrs = true;
> > +    }
> > +  else if (!targetm.target_option.valid_version_attribute_p (*node, name, args,
> > +							     flags))
> > +    *no_add_attrs = true;
> > +
> > +  /* Check that there's no empty string in values of the attribute.  */
> > +  for (tree t = args; t != NULL_TREE; t = TREE_CHAIN (t))
> > +    {
> > +      tree value = TREE_VALUE (t);
> > +      if (TREE_CODE (value) == STRING_CST
> > +	  && TREE_STRING_LENGTH (value) == 1
> > +	  && TREE_STRING_POINTER (value)[0] == '\0')
> > +	{
> > +	  warning (OPT_Wattributes,
> > +		   "empty string in attribute %<target_version%>");
> > +	  *no_add_attrs = true;
> > +	}
> > +    }
> 
> would it make sense to do the empty string test first, and only pass
> the vetted arguments to the target hook?  Also, a Google search suggests
> that there aren't any pre-existing, conflicting uses of "target_version"
> that take multiple arguments.  So could this code check that there
> is exactly one argument (by changing 1, -1 to 1, 1 in the spec above)
> and then require it to be a nonempty string?  It could then pass the
> string itself to the target hook (probably as a const char *).
> 
> (FWIW, it doesn't look like the Clang documentation has kept the door
> open to multiple arguments.)
> 
> I wonder if we could use attribute_spec::exclusions to describe the
> mutual exclusion with "target_clones".  It doesn't look like the
> existing code does, though, so maybe not.
> 
> I couldn't see anything that forbids a combination of "target" and
> "target_version".  Should that combination be allowed?  In some ways
> it makes conceptual sense, since using "target" is like changing the
> command-line options.  But I suppose we'd then need to diagnose conflicts
> and deal with ordering issues.  So perhaps "target" should be made
> mutually exclusive as well.

My aarch64 backend pass deals with backend issues by always applying
target_version attribute changes after target_attribute changes.  I don't think
there's any additional conflicts to worry about, since adding a target_version
is simply equivalent to enabling extra features in the target string.

A similar thing would work for target_clones, but I didn't initially do that
because it would require making the frontend exclusions target dependant.
However, I think it's just a case of checking the backend hook to see whether
target_clones gets expanded to target attributes or not.

Clang currently disallows combining target attributes with either
target_version or target_clones.  However, I think it's worth being able to
combine these attributes.  For example, it could be useful to use the target
attribute to select different tuning for an sve target_version.

> Thanks,
> Richard
> 
> > +
> > +  return NULL_TREE;
> > +}
> > +
> >  /* Handle a "target_clones" attribute.  */
> >  
> >  static tree
> > @@ -5601,6 +5643,12 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
> >  		   "with %qs attribute", name, "target");
> >  	  *no_add_attrs = true;
> >  	}
> > +      else if (lookup_attribute ("target_version", DECL_ATTRIBUTES (*node)))
> > +	{
> > +	  warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
> > +		   "with %qs attribute", name, "target_version");
> > +	  *no_add_attrs = true;
> > +	}
> >        else if (get_target_clone_attr_len (args) == -1)
> >  	{
> >  	  warning (OPT_Wattributes,
> > diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
> > index 29d28ef895a73a223695cbb86aafbc845bbe7688..8af6b23d8c0306920e0fdcb3559ef047a16689f4 100644
> > --- a/gcc/cgraphclones.cc
> > +++ b/gcc/cgraphclones.cc
> > @@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "tree-eh.h"
> >  #include "tree-cfg.h"
> >  #include "tree-inline.h"
> > +#include "attribs.h"
> >  #include "dumpfile.h"
> >  #include "gimple-pretty-print.h"
> >  #include "alloc-pool.h"
> > @@ -1048,7 +1049,17 @@ cgraph_node::create_version_clone_with_body
> >        location_t saved_loc = input_location;
> >        tree v = TREE_VALUE (target_attributes);
> >        input_location = DECL_SOURCE_LOCATION (new_decl);
> > -      bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 1);
> > +      bool r;
> > +      tree name_id = get_attribute_name (target_attributes);
> > +      const char* name_str = IDENTIFIER_POINTER (name_id);
> > +      if (strcmp (name_str, "target") == 0)
> > +	r = targetm.target_option.valid_attribute_p (new_decl, name_id, v, 1);
> > +      else if (strcmp (name_str, "target_version") == 0)
> > +	r = targetm.target_option.valid_version_attribute_p (new_decl, name_id,
> > +							     v, 1);
> > +      else
> > +	gcc_assert(false);
> > +
> >        input_location = saved_loc;
> >        if (!r)
> >  	return NULL;
> > diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> > index 9594be4092c3c00fddc9d4c6da5931ea3b7e8792..ec78d5a5440bedd360ac8e5bc44e164da3dab410 100644
> > --- a/gcc/cp/decl2.cc
> > +++ b/gcc/cp/decl2.cc
> > @@ -829,8 +829,8 @@ check_classfn (tree ctype, tree function, tree template_parms)
> >        tree c2 = get_constraints (fndecl);
> >  
> >        /* While finding a match, same types and params are not enough
> > -	 if the function is versioned.  Also check version ("target")
> > -	 attributes.  */
> > +	 if the function is versioned.  Also check for different target
> > +	 specific attributes.  */
> >        if (same_type_p (TREE_TYPE (TREE_TYPE (function)),
> >  		       TREE_TYPE (TREE_TYPE (fndecl)))
> >  	  && compparms (p1, p2)
> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > index 33e7ffc8af5a9d48430145ef9b8e7924613b7dd7..97b3e9c31631a943d95f1cf7739716a574afcfb7 100644
> > --- a/gcc/doc/tm.texi
> > +++ b/gcc/doc/tm.texi
> > @@ -10507,6 +10507,23 @@ the function declaration to hold a pointer to a target-specific
> >  @code{struct cl_target_option} structure.
> >  @end deftypefn
> >  
> > +@deftypefn {Target Hook} bool TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P (tree @var{fndecl}, tree @var{name}, tree @var{args}, int @var{flags})
> > +This hook is called to parse @code{attribute(target_version("..."))},
> > +which allows setting target-specific options on individual function versions.
> > +These function-specific options may differ
> > +from the options specified on the command line.  The hook should return
> > +@code{true} if the options are valid.
> > +
> > +The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in
> > +the function declaration to hold a pointer to a target-specific
> > +@code{struct cl_target_option} structure.
> > +@end deftypefn
> > +
> > +@deftypevr {Target Hook} {const char *} TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> > +Contains the name of the attribute used for the version description string
> > +when expanding clones for a function with the target_clones attribute.
> > +@end deftypevr
> > +
> >  @deftypefn {Target Hook} void TARGET_OPTION_SAVE (struct cl_target_option *@var{ptr}, struct gcc_options *@var{opts}, struct gcc_options *@var{opts_set})
> >  This hook is called to save any additional target-specific information
> >  in the @code{struct cl_target_option} structure for function-specific
> > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> > index c98b2447e28aa17996b1cbf8af7ed02d70db54f2..56fa3de6bba06bc0ac124bb3a41324be1997e209 100644
> > --- a/gcc/doc/tm.texi.in
> > +++ b/gcc/doc/tm.texi.in
> > @@ -6979,6 +6979,10 @@ on this implementation detail.
> >  
> >  @hook TARGET_OPTION_VALID_ATTRIBUTE_P
> >  
> > +@hook TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P
> > +
> > +@hook TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> > +
> >  @hook TARGET_OPTION_SAVE
> >  
> >  @hook TARGET_OPTION_RESTORE
> > diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
> > index a2ed048d7dd28ec470953fcd8a0dc86817e4b7dc..3db57c2b13d612a37240d9dcf58ad21b2286633c 100644
> > --- a/gcc/multiple_target.cc
> > +++ b/gcc/multiple_target.cc
> > @@ -66,10 +66,6 @@ create_dispatcher_calls (struct cgraph_node *node)
> >  {
> >    ipa_ref *ref;
> >  
> > -  if (!DECL_FUNCTION_VERSIONED (node->decl)
> > -      || !is_function_default_version (node->decl))
> > -    return;
> > -
> >    if (!targetm.has_ifunc_p ())
> >      {
> >        error_at (DECL_SOURCE_LOCATION (node->decl),
> > @@ -377,6 +373,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
> >        return false;
> >      }
> >  
> > +  const char *new_attr_name = targetm.target_option.expanded_clones_attribute;
> >    cgraph_function_version_info *decl1_v = NULL;
> >    cgraph_function_version_info *decl2_v = NULL;
> >    cgraph_function_version_info *before = NULL;
> > @@ -392,7 +389,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
> >        char *attr = attrs[i];
> >  
> >        /* Create new target clone.  */
> > -      tree attributes = make_attribute ("target", attr,
> > +      tree attributes = make_attribute (new_attr_name, attr,
> >  					DECL_ATTRIBUTES (node->decl));
> >  
> >        char *suffix = XNEWVEC (char, strlen (attr) + 1);
> > @@ -430,7 +427,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
> >    XDELETEVEC (attr_str);
> >  
> >    /* Setting new attribute to initial function.  */
> > -  tree attributes = make_attribute ("target", "default",
> > +  tree attributes = make_attribute (new_attr_name, "default",
> >  				    DECL_ATTRIBUTES (node->decl));
> >    DECL_ATTRIBUTES (node->decl) = attributes;
> >    node->local = false;
> > diff --git a/gcc/target.def b/gcc/target.def
> > index cda6c51e5167f85625168c7c26b777d6c8ccad82..39acea04db01ebaf918910b7dd73d397de6a84ec 100644
> > --- a/gcc/target.def
> > +++ b/gcc/target.def
> > @@ -6492,6 +6492,31 @@ the function declaration to hold a pointer to a target-specific\n\
> >   bool, (tree fndecl, tree name, tree args, int flags),
> >   default_target_option_valid_attribute_p)
> >  
> > +/* Function to validate the attribute((target_version(...))) strings.  If
> > +   the option is validated, the hook should also fill in
> > +   DECL_FUNCTION_SPECIFIC_TARGET in the function decl node.  */
> > +DEFHOOK
> > +(valid_version_attribute_p,
> > + "This hook is called to parse @code{attribute(target_version(\"...\"))},\n\
> > +which allows setting target-specific options on individual function versions.\n\
> > +These function-specific options may differ\n\
> > +from the options specified on the command line.  The hook should return\n\
> > +@code{true} if the options are valid.\n\
> > +\n\
> > +The hook should set the @code{DECL_FUNCTION_SPECIFIC_TARGET} field in\n\
> > +the function declaration to hold a pointer to a target-specific\n\
> > +@code{struct cl_target_option} structure.",
> > + bool, (tree fndecl, tree name, tree args, int flags),
> > + default_target_option_valid_version_attribute_p)
> > +
> > +/* Attribute to be used when expanding clones for functions with
> > +   target_clones attribute.  */
> > +DEFHOOKPOD
> > +(expanded_clones_attribute,
> > + "Contains the name of the attribute used for the version description string\n\
> > +when expanding clones for a function with the target_clones attribute.",
> > + const char *, "target")
> > +
> >  /* Function to save any extra target state in the target options structure.  */
> >  DEFHOOK
> >  (save,
> > diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> > index 1a0db8dddd594d9b1fb04ae0d9a66ad6b7a396dc..0efc993d82ef59b581a1df74ee0de71135a28703 100644
> > --- a/gcc/targhooks.h
> > +++ b/gcc/targhooks.h
> > @@ -192,6 +192,7 @@ extern bool default_hard_regno_scratch_ok (unsigned int);
> >  extern bool default_mode_dependent_address_p (const_rtx, addr_space_t);
> >  extern bool default_new_address_profitable_p (rtx, rtx_insn *, rtx);
> >  extern bool default_target_option_valid_attribute_p (tree, tree, tree, int);
> > +extern bool default_target_option_valid_version_attribute_p (tree, tree, tree, int);
> >  extern bool default_target_option_pragma_parse (tree, tree);
> >  extern bool default_target_can_inline_p (tree, tree);
> >  extern bool default_update_ipa_fn_target_info (unsigned int &, const gimple *);
> > diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> > index e190369f87a92e6a92372dc348d9374c3a965c0a..7fc7bf455e80c333cced1bac7085210c2b108f8d 100644
> > --- a/gcc/targhooks.cc
> > +++ b/gcc/targhooks.cc
> > @@ -1787,7 +1787,19 @@ default_target_option_valid_attribute_p (tree ARG_UNUSED (fndecl),
> >  					 int ARG_UNUSED (flags))
> >  {
> >    warning (OPT_Wattributes,
> > -	   "target attribute is not supported on this machine");
> > +	   "%<target%> attribute is not supported on this machine");
> > +
> > +  return false;
> > +}
> > +
> > +bool
> > +default_target_option_valid_version_attribute_p (tree ARG_UNUSED (fndecl),
> > +						 tree ARG_UNUSED (name),
> > +						 tree ARG_UNUSED (args),
> > +						 int ARG_UNUSED (flags))
> > +{
> > +  warning (OPT_Wattributes,
> > +	   "%<target_version%> attribute is not supported on this machine");
> >  
> >    return false;
> >  }
> > diff --git a/gcc/tree.h b/gcc/tree.h
> > index 0b72663e6a1a94406127f6253460f498b7a3ea9c..ebd89ce79566c350eaaab210c0dca3cc1ac2048e 100644
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -3438,8 +3438,8 @@ extern vec<tree, va_gc> **decl_debug_args_insert (tree);
> >     (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_optimization)
> >  
> >  /* In FUNCTION_DECL, this is set if this function has other versions generated
> > -   using "target" attributes.  The default version is the one which does not
> > -   have any "target" attribute set. */
> > +   to support different architecture feature sets, e.g. using "target" or
> > +   "target_version" attributes.  */
> >  #define DECL_FUNCTION_VERSIONED(NODE)\
> >     (FUNCTION_DECL_CHECK (NODE)->function_decl.versioned_function)
> >  

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [1/3] Add support for target_version attribute
  2023-11-03 12:43     ` Andrew Carlotti
@ 2023-11-05 20:18       ` Richard Sandiford
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Sandiford @ 2023-11-05 20:18 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches, jason, nathan, rguenther

Andrew Carlotti <andrew.carlotti@arm.com> writes:
> On Thu, Oct 26, 2023 at 07:41:09PM +0100, Richard Sandiford wrote:
>> Andrew Carlotti <andrew.carlotti@arm.com> writes:
>> > This patch adds support for the "target_version" attribute to the middle
>> > end and the C++ frontend, which will be used to implement function
>> > multiversioning in the aarch64 backend.
>> >
>> > Note that C++ is currently the only frontend which supports
>> > multiversioning using the "target" attribute, whereas the
>> > "target_clones" attribute is additionally supported in C, D and Ada.
>> > Support for the target_version attribute will be extended to C at a
>> > later date.
>> >
>> > Targets that currently use the "target" attribute for function
>> > multiversioning (i.e. i386 and rs6000) are not affected by this patch.
>> >
>> >
>> > I could have implemented the target hooks slightly differently, by reusing the
>> > valid_attribute_p hook and adding attribute name checks to each backend
>> > implementation (c.f. the aarch64 implementation in patch 2/3).  Would this be
>> > preferable?
>> 
>> Having as much as possible in target-independent code seems better
>> to me FWIW.  On that basis:
>> 
>> >
>> > Otherwise, is this ok for master?
>> >
>> >
>> > gcc/c-family/ChangeLog:
>> >
>> > 	* c-attribs.cc (handle_target_version_attribute): New.
>> > 	(c_common_attribute_table): Add target_version.
>> > 	(handle_target_clones_attribute): Add conflict with
>> > 	target_version attribute.
>> >
>> > gcc/ChangeLog:
>> >
>> > 	* attribs.cc (is_function_default_version): Update comment to
>> > 	specify incompatibility with target_version attributes.
>> > 	* cgraphclones.cc (cgraph_node::create_version_clone_with_body):
>> > 	Call valid_version_attribute_p for target_version attributes.
>> > 	* target.def (valid_version_attribute_p): New hook.
>> > 	(expanded_clones_attribute): New hook.
>> > 	* doc/tm.texi.in: Add new hooks.
>> > 	* doc/tm.texi: Regenerate.
>> > 	* multiple_target.cc (create_dispatcher_calls): Remove redundant
>> > 	is_function_default_version check.
>> > 	(expand_target_clones): Use target hook for attribute name.
>> > 	* targhooks.cc (default_target_option_valid_version_attribute_p):
>> > 	New.
>> > 	* targhooks.h (default_target_option_valid_version_attribute_p):
>> > 	New.
>> > 	* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
>> > 	target_version attributes.
>> >
>> > gcc/cp/ChangeLog:
>> >
>> > 	* decl2.cc (check_classfn): Update comment to include
>> > 	target_version attributes.
>> >
>> >
>> > diff --git a/gcc/attribs.cc b/gcc/attribs.cc
>> > index b1300018d1e8ed8e02ded1ea721dc192a6d32a49..a3c4a81e8582ea4fd06b9518bf51fad7c998ddd6 100644
>> > --- a/gcc/attribs.cc
>> > +++ b/gcc/attribs.cc
>> > @@ -1233,8 +1233,9 @@ make_dispatcher_decl (const tree decl)
>> >    return func_decl;  
>> >  }
>> >  
>> > -/* Returns true if decl is multi-versioned and DECL is the default function,
>> > -   that is it is not tagged with target specific optimization.  */
>> > +/* Returns true if DECL is multi-versioned using the target attribute, and this
>> > +   is the default version.  This function can only be used for targets that do
>> > +   not support the "target_version" attribute.  */
>> >  
>> >  bool
>> >  is_function_default_version (const tree decl)
>> > diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
>> > index 072cfb69147bd6b314459c0bd48a0c1fb92d3e4d..1a224c036277d51ab4dc0d33a403177bd226e48a 100644
>> > --- a/gcc/c-family/c-attribs.cc
>> > +++ b/gcc/c-family/c-attribs.cc
>> > @@ -148,6 +148,7 @@ static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
>> >  static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool *);
>> >  static tree handle_assume_attribute (tree *, tree, tree, int, bool *);
>> >  static tree handle_target_attribute (tree *, tree, tree, int, bool *);
>> > +static tree handle_target_version_attribute (tree *, tree, tree, int, bool *);
>> >  static tree handle_target_clones_attribute (tree *, tree, tree, int, bool *);
>> >  static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
>> >  static tree ignore_attribute (tree *, tree, tree, int, bool *);
>> > @@ -480,6 +481,8 @@ const struct attribute_spec c_common_attribute_table[] =
>> >  			      handle_error_attribute, NULL },
>> >    { "target",                 1, -1, true, false, false, false,
>> >  			      handle_target_attribute, NULL },
>> > +  { "target_version",         1, -1, true, false, false, false,
>> > +			      handle_target_version_attribute, NULL },
>> >    { "target_clones",          1, -1, true, false, false, false,
>> >  			      handle_target_clones_attribute, NULL },
>> >    { "optimize",               1, -1, true, false, false, false,
>> > @@ -5569,6 +5572,45 @@ handle_target_attribute (tree *node, tree name, tree args, int flags,
>> >    return NULL_TREE;
>> >  }
>> >  
>> > +/* Handle a "target_version" attribute.  */
>> > +
>> > +static tree
>> > +handle_target_version_attribute (tree *node, tree name, tree args, int flags,
>> > +				  bool *no_add_attrs)
>> > +{
>> > +  /* Ensure we have a function type.  */
>> > +  if (TREE_CODE (*node) != FUNCTION_DECL)
>> > +    {
>> > +      warning (OPT_Wattributes, "%qE attribute ignored", name);
>> > +      *no_add_attrs = true;
>> > +    }
>> > +  else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node)))
>> > +    {
>> > +      warning (OPT_Wattributes, "%qE attribute ignored due to conflict "
>> > +		   "with %qs attribute", name, "target_clones");
>> > +      *no_add_attrs = true;
>> > +    }
>> > +  else if (!targetm.target_option.valid_version_attribute_p (*node, name, args,
>> > +							     flags))
>> > +    *no_add_attrs = true;
>> > +
>> > +  /* Check that there's no empty string in values of the attribute.  */
>> > +  for (tree t = args; t != NULL_TREE; t = TREE_CHAIN (t))
>> > +    {
>> > +      tree value = TREE_VALUE (t);
>> > +      if (TREE_CODE (value) == STRING_CST
>> > +	  && TREE_STRING_LENGTH (value) == 1
>> > +	  && TREE_STRING_POINTER (value)[0] == '\0')
>> > +	{
>> > +	  warning (OPT_Wattributes,
>> > +		   "empty string in attribute %<target_version%>");
>> > +	  *no_add_attrs = true;
>> > +	}
>> > +    }
>> 
>> would it make sense to do the empty string test first, and only pass
>> the vetted arguments to the target hook?  Also, a Google search suggests
>> that there aren't any pre-existing, conflicting uses of "target_version"
>> that take multiple arguments.  So could this code check that there
>> is exactly one argument (by changing 1, -1 to 1, 1 in the spec above)
>> and then require it to be a nonempty string?  It could then pass the
>> string itself to the target hook (probably as a const char *).
>> 
>> (FWIW, it doesn't look like the Clang documentation has kept the door
>> open to multiple arguments.)
>> 
>> I wonder if we could use attribute_spec::exclusions to describe the
>> mutual exclusion with "target_clones".  It doesn't look like the
>> existing code does, though, so maybe not.
>> 
>> I couldn't see anything that forbids a combination of "target" and
>> "target_version".  Should that combination be allowed?  In some ways
>> it makes conceptual sense, since using "target" is like changing the
>> command-line options.  But I suppose we'd then need to diagnose conflicts
>> and deal with ordering issues.  So perhaps "target" should be made
>> mutually exclusive as well.
>
> My aarch64 backend pass deals with backend issues by always applying
> target_version attribute changes after target_attribute changes.  I don't think
> there's any additional conflicts to worry about, since adding a target_version
> is simply equivalent to enabling extra features in the target string.
>
> A similar thing would work for target_clones, but I didn't initially do that
> because it would require making the frontend exclusions target dependant.
> However, I think it's just a case of checking the backend hook to see whether
> target_clones gets expanded to target attributes or not.
>
> Clang currently disallows combining target attributes with either
> target_version or target_clones.  However, I think it's worth being able to
> combine these attributes.  For example, it could be useful to use the target
> attribute to select different tuning for an sve target_version.

Ah, ok, thanks for the explanation.  That approach sounds good to me.

Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-11-05 20:18 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-18 15:39 [0/3] target_version and aarch64 function multiversioning Andrew Carlotti
2023-10-18 15:42 ` [1/3] Add support for target_version attribute Andrew Carlotti
2023-10-19  7:04   ` Richard Biener
2023-10-19 16:13     ` Andrew Carlotti
2023-10-26 18:41   ` Richard Sandiford
2023-11-03 12:43     ` Andrew Carlotti
2023-11-05 20:18       ` Richard Sandiford
2023-10-18 15:44 ` [2/3] [aarch64] Add function multiversioning support Andrew Carlotti
2023-10-30 22:35   ` Richard Sandiford
2023-10-18 15:44 ` [3/3] WIP/RFC: Fix name mangling for target_clones Andrew Carlotti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).