public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC/PATCH] ipa-inline: Add target info into fn summary [PR102059]
@ 2021-09-01  7:02 Kewen.Lin
  2021-09-02  9:25 ` Richard Biener
  2021-09-02 17:44 ` Segher Boessenkool
  0 siblings, 2 replies; 8+ messages in thread
From: Kewen.Lin @ 2021-09-01  7:02 UTC (permalink / raw)
  To: GCC Patches
  Cc: Richard Biener, Jan Hubicka, Martin Liška,
	Segher Boessenkool, Bill Schmidt, fweimer, mjambor

[-- Attachment #1: Type: text/plain, Size: 3944 bytes --]

Hi!

Power ISA 2.07 (Power8) introduces transactional memory feature
but ISA3.1 (Power10) removes it.  It exposes one troublesome
issue as PR102059 shows.  Users define some function with
target pragma cpu=power10 then it calls one function with
attribute always_inline which inherits command line option
cpu=power8 which enables HTM implicitly.  The current isa_flags
check doesn't allow this inlining due to "target specific
option mismatch" and error mesasge is emitted.

Normally, the callee function isn't intended to exploit HTM
feature, but the default flag setting make it look it has.
As Richi raised in the PR, we have fp_expressions flag in
function summary, and allow us to check the function actually
contains any floating point expressions to avoid overkill.
So this patch follows the similar idea but is more target
specific, for this rs6000 port specific requirement on HTM
feature check, we would like to check rs6000 specific HTM
built-in functions and inline assembly, it allows targets
to do their own customized checks and updates.

It introduces two target hooks need_ipa_fn_target_info and
update_ipa_fn_target_info.  The former allows target to do
some previous check and decides to collect target specific
information for this function or not.  For some special case,
it can predict the analysis result and push it early without
any scannings.  The latter allows the analyze_function_body
to pass gimple stmts down just like fp_expressions handlings,
target can do its own tricks.  I put them as one hook initially
with one boolean to indicates whether it's initial time, but
the code looks a bit ugly, to separate them seems to have
better readability.

To make it simple, this patch uses HOST_WIDE_INT to record the
flags just like what we use for isa_flags.  For rs6000's HTM
need, one HOST_WIDE_INT variable is quite enough, but it seems
good to have one auto_vec for scalability as I noticed some
targets have more than one HOST_WIDE_INT flag.  For now, this
target information collection is only for always_inline function,
function ipa_merge_fn_summary_after_inlining deals with target
information merging.

The patch has been bootstrapped and regress-tested on
powerpc64le-linux-gnu Power9.

Is it on the right track?

Any comments are highly appreciated!

BR,
Kewen
------
gcc/ChangeLog:

	PR ipa/102059
	* config/rs6000/rs6000-call.c (rs6000_builtin_mask_set_p): New
	function.
	* config/rs6000/rs6000-internal.h (rs6000_builtin_mask_set_p): New
	declare.
	* config/rs6000/rs6000.c (TARGET_NEED_IPA_FN_TARGET_INFO): New macro.
	(TARGET_UPDATE_IPA_FN_TARGET_INFO): Likewise.
	(rs6000_need_ipa_fn_target_info): New function.
	(rs6000_update_ipa_fn_target_info): Likewise.
	(rs6000_can_inline_p): Adjust for ipa function summary target info.
	* ipa-fnsummary.c (ipa_dump_fn_summary): Adjust for ipa function
	summary target info.
	(analyze_function_body): Adjust for ipa function summary target
	info and call hook rs6000_need_ipa_fn_target_info and
	rs6000_update_ipa_fn_target_info.
	(ipa_merge_fn_summary_after_inlining): Adjust for ipa function
	summary target info.
	(inline_read_section): Likewise.
	(ipa_fn_summary_write): Likewise.
	* ipa-fnsummary.h (ipa_fn_summary::target_info): New member.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in (TARGET_UPDATE_IPA_FN_TARGET_INFO): Document new
	hook.
	(TARGET_NEED_IPA_FN_TARGET_INFO): Likewise.
	* target.def (update_ipa_fn_target_info): New hook.
	(need_ipa_fn_target_info): Likewise.
	* targhooks.c (default_need_ipa_fn_target_info): New function.
	(default_update_ipa_fn_target_info): Likewise.
	* targhooks.h (default_update_ipa_fn_target_info): New declare.
	(default_need_ipa_fn_target_info): Likewise.

gcc/testsuite/ChangeLog:

	PR ipa/102059
	* gcc.dg/lto/pr102059_0.c: New test.
	* gcc.dg/lto/pr102059_1.c: New test.
	* gcc.dg/lto/pr102059_2.c: New test.
	* gcc.target/powerpc/pr102059-5.c: New test.
	* gcc.target/powerpc/pr102059-6.c: New test.


[-- Attachment #2: 0003-ipa-inline-Add-target-info-to-fnsummary.patch --]
[-- Type: text/plain, Size: 21339 bytes --]

---
 gcc/config/rs6000/rs6000-call.c               | 11 +++
 gcc/config/rs6000/rs6000-internal.h           |  1 +
 gcc/config/rs6000/rs6000.c                    | 94 ++++++++++++++++++-
 gcc/doc/tm.texi                               | 31 ++++++
 gcc/doc/tm.texi.in                            |  4 +
 gcc/ipa-fnsummary.c                           | 53 +++++++++++
 gcc/ipa-fnsummary.h                           |  6 +-
 gcc/testsuite/gcc.dg/lto/pr102059_0.c         | 12 +++
 gcc/testsuite/gcc.dg/lto/pr102059_1.c         |  9 ++
 gcc/testsuite/gcc.dg/lto/pr102059_2.c         | 11 +++
 gcc/testsuite/gcc.target/powerpc/pr102059-5.c | 21 +++++
 gcc/testsuite/gcc.target/powerpc/pr102059-6.c | 21 +++++
 gcc/target.def                                | 35 +++++++
 gcc/targhooks.c                               | 16 ++++
 gcc/targhooks.h                               |  3 +
 15 files changed, 322 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr102059_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr102059_1.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr102059_2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-5.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-6.c

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index b92928c891a..0fbdab15e67 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13642,6 +13642,17 @@ rs6000_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
   return rs6000_builtin_decls[code];
 }
 
+/* Return true if the builtin with CODE has any mask bits set
+   which are specified by MASK.  */
+
+bool
+rs6000_builtin_mask_set_p (unsigned code, HOST_WIDE_INT mask)
+{
+  gcc_assert (code < RS6000_BUILTIN_COUNT);
+  HOST_WIDE_INT fnmask = rs6000_builtin_info[code].mask;
+  return fnmask & mask;
+}
+
 static void
 altivec_init_builtins (void)
 {
diff --git a/gcc/config/rs6000/rs6000-internal.h b/gcc/config/rs6000/rs6000-internal.h
index 88cf9bd5692..fc71b5637f1 100644
--- a/gcc/config/rs6000/rs6000-internal.h
+++ b/gcc/config/rs6000/rs6000-internal.h
@@ -179,6 +179,7 @@ extern tree rs6000_fold_builtin (tree fndecl ATTRIBUTE_UNUSED,
 			         int n_args ATTRIBUTE_UNUSED,
 			         tree *args ATTRIBUTE_UNUSED,
 			         bool ignore ATTRIBUTE_UNUSED);
+extern bool rs6000_builtin_mask_set_p (unsigned code, HOST_WIDE_INT mask);
 
 #if TARGET_ELF
 extern bool rs6000_passes_ieee128;
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index c2582a3efab..2ee12275b78 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -71,6 +71,9 @@
 #include "tree-vector-builder.h"
 #include "context.h"
 #include "tree-pass.h"
+#include "symbol-summary.h"
+#include "ipa-prop.h"
+#include "ipa-fnsummary.h"
 #include "except.h"
 #if TARGET_XCOFF
 #include "xcoffout.h"  /* get declarations of xcoff_*_section_name */
@@ -1780,6 +1783,12 @@ static const struct attribute_spec rs6000_attribute_table[] =
 
 #undef TARGET_INVALID_CONVERSION
 #define TARGET_INVALID_CONVERSION rs6000_invalid_conversion
+
+#undef TARGET_NEED_IPA_FN_TARGET_INFO
+#define TARGET_NEED_IPA_FN_TARGET_INFO rs6000_need_ipa_fn_target_info
+
+#undef TARGET_UPDATE_IPA_FN_TARGET_INFO
+#define TARGET_UPDATE_IPA_FN_TARGET_INFO rs6000_update_ipa_fn_target_info
 \f
 
 /* Processor table.  */
@@ -25052,7 +25061,70 @@ rs6000_generate_version_dispatcher_body (void *node_p)
   return resolver;
 }
 
-\f
+/* Hook to decide if we need to scan function gimples to collect
+   target specific information for inlining, push the initial flag
+   into INFO if we want to make use of target info.  */
+
+static bool
+rs6000_need_ipa_fn_target_info (const_tree decl, vec<HOST_WIDE_INT> &info)
+{
+  tree target = DECL_FUNCTION_SPECIFIC_TARGET (decl);
+  if (!target)
+    target = target_option_default_node;
+  struct cl_target_option *opts = TREE_TARGET_OPTION (target);
+
+  /* See PR102059, we only handle HTM for now, so will only do
+     the consequent scannings when HTM feature enabled.  */
+  if (opts->x_rs6000_isa_flags & OPTION_MASK_HTM)
+    {
+      info.safe_push (0);
+      return true;
+    }
+
+  /* The given function disables HTM explicitly, it's safe to be
+     inlined from HTM feature's perspective and don't need any
+     further checkings.  */
+  if (opts->x_rs6000_isa_flags_explicit & OPTION_MASK_HTM)
+    {
+      info.safe_push (0);
+      return false;
+    }
+
+  return false;
+}
+
+/* Hook to update target specific information INFO for inlining by
+   scanning the given STMT.  Return false if we don't need to scan
+   any more.  */
+
+static bool
+rs6000_update_ipa_fn_target_info (vec<HOST_WIDE_INT> &info, const gimple *stmt)
+{
+  /* Assume inline asm can use any instruction features.  */
+  if (gimple_code (stmt) == GIMPLE_ASM)
+    {
+      info[0] = -1;
+      return false;
+    }
+  else if (gimple_code (stmt) == GIMPLE_CALL)
+    {
+      tree fndecl = gimple_call_fndecl (stmt);
+      if (fndecl && fndecl_built_in_p (fndecl, BUILT_IN_MD))
+	{
+	  enum rs6000_builtins fcode =
+	    (enum rs6000_builtins) DECL_MD_FUNCTION_CODE (fndecl);
+	  /* HTM bifs definitely exploit HTM insns.  */
+	  if (rs6000_builtin_mask_set_p ((unsigned) fcode, RS6000_BTM_HTM))
+	    {
+	      info[0] |= OPTION_MASK_HTM;
+	      return false;
+	    }
+	}
+    }
+
+  return true;
+}
+
 /* Hook to determine if one function can safely inline another.  */
 
 static bool
@@ -25085,10 +25157,22 @@ rs6000_can_inline_p (tree caller, tree callee)
      | OPTION_MASK_P10_FUSION_ADDLOG | OPTION_MASK_P10_FUSION_2ADD
      | OPTION_MASK_PCREL_OPT);
 
-  if (always_inline) {
-    caller_isa &= ~always_inline_safe_mask;
-    callee_isa &= ~always_inline_safe_mask;
-  }
+  if (always_inline)
+    {
+      cgraph_node *callee_node = cgraph_node::get (callee);
+      if (ipa_fn_summaries && ipa_fn_summaries->get (callee_node) != NULL)
+	{
+	  if (dump_file)
+	    ipa_dump_fn_summary (dump_file, callee_node);
+	  const vec<HOST_WIDE_INT> &info =
+	    ipa_fn_summaries->get (callee_node)->target_info;
+	  if (!info.is_empty ())
+	    always_inline_safe_mask |= ~info[0] & OPTION_MASK_HTM;
+	}
+
+      caller_isa &= ~always_inline_safe_mask;
+      callee_isa &= ~always_inline_safe_mask;
+    }
 
   /* The callee's options must be a subset of the caller's options, i.e.
      a vsx function may inline an altivec function, but a no-vsx function
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c8f4abe3e41..4af5e3e6819 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10743,6 +10743,37 @@ default, inlining is not allowed if the callee function has function
 specific target options and the caller does not use the same options.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_UPDATE_IPA_FN_TARGET_INFO (vec<HOST_WIDE_INT>& @var{vec}, const gimple* @var{stmt})
+Allow target to analyze all gimple statements for the given function to
+record and update some target specific information for inlining.  A typical
+example is that a caller with one isa feature disabled is normally not
+allowed to inline a callee with that same isa feature enabled even which is
+attributed by always_inline, but with the conservative analysis on all
+statements of the callee if we are able to guarantee the callee does not
+exploit any instructions from the mismatch isa feature, it would be safe to
+allow the caller to inline the callee.
+@var{vec} is one @code{HOST_WIDE_INT} element vector to record information
+in which one set bit indicates one corresponding feature is detected in the
+analysis, @var{stmt} is the statement being analyzed.  Return true if target
+still need to analyze the subsequent statements, otherwise return false to
+stop subsequent analysis.
+The default version of this hook returns false.
+@end deftypefn
+
+@deftypefn {Target Hook} bool TARGET_NEED_IPA_FN_TARGET_INFO (const_tree @var{decl}, vec<HOST_WIDE_INT>& @var{vec})
+Allow target to check early whether it is necessary to analyze all gimple
+statements in the given function to update target specific information for
+inlining.  See hook @code{update_ipa_fn_target_info} for usage example of
+target specific information.  This hook is expected to be invoked ahead of
+the iterating with hook @code{update_ipa_fn_target_info}.
+@var{decl} is the function being analyzed, @var{vec} is the same as what
+in hook @code{update_ipa_fn_target_info}, target can do one time update
+into @var{vec} without iterating for some case.  Return true if target
+decides to analyze all gimple statements to collect information, otherwise
+return false.
+The default version of this hook returns false.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_RELAYOUT_FUNCTION (tree @var{fndecl})
 This target hook fixes function @var{fndecl} after attributes are processed.
 Default does nothing. On ARM, the default function's alignment is updated
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 9c4b5016053..2b013649a45 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7196,6 +7196,10 @@ on this implementation detail.
 
 @hook TARGET_CAN_INLINE_P
 
+@hook TARGET_UPDATE_IPA_FN_TARGET_INFO
+
+@hook TARGET_NEED_IPA_FN_TARGET_INFO
+
 @hook TARGET_RELAYOUT_FUNCTION
 
 @node Emulated TLS
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index cf80ce3c040..1e5f29464e7 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
+#include "target.h"
 #include "tree.h"
 #include "gimple.h"
 #include "alloc-pool.h"
@@ -1146,6 +1147,16 @@ ipa_dump_fn_summary (FILE *f, struct cgraph_node *node)
 	  fprintf (f, "  calls:\n");
 	  dump_ipa_call_summary (f, 4, node, s);
 	  fprintf (f, "\n");
+	  HOST_WIDE_INT flags;
+	  for (int i = 0; s->target_info.iterate (i, &flags); i++)
+	    {
+	      if (i == 0)
+		{
+		  fprintf (f, "  target_info flags:");
+		}
+	      fprintf (f, " [%d]: " HOST_WIDE_INT_PRINT_HEX "\n", i, flags);
+	    }
+	  fprintf (f, "\n");
 	}
       else
 	fprintf (f, "IPA summary for %s is missing.\n", node->dump_name ());
@@ -2608,6 +2619,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
   info->conds = NULL;
   info->size_time_table.release ();
   info->call_size_time_table.release ();
+  info->target_info.release();
 
   /* When optimizing and analyzing for IPA inliner, initialize loop optimizer
      so we can produce proper inline hints.
@@ -2659,6 +2671,14 @@ analyze_function_body (struct cgraph_node *node, bool early)
 			   bb_predicate,
 		           bb_predicate);
 
+  /* Only look for target information for inlinable always_inline functions.  */
+  bool scan_for_target_info =
+    (info->inlinable
+     && DECL_DISREGARD_INLINE_LIMITS (node->decl)
+     && lookup_attribute ("always_inline", DECL_ATTRIBUTES (node->decl))
+     && targetm.target_option.need_ipa_fn_target_info (node->decl,
+						       info->target_info));
+
   if (fbi.info)
     compute_bb_predicates (&fbi, node, info, params_summary);
   const profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
@@ -2876,6 +2896,10 @@ analyze_function_body (struct cgraph_node *node, bool early)
 		  if (dump_file)
 		    fprintf (dump_file, "   fp_expression set\n");
 		}
+	      if (scan_for_target_info)
+		scan_for_target_info =
+		  targetm.target_option.update_ipa_fn_target_info
+		  (info->target_info, stmt);
 	    }
 
 	  /* Account cost of address calculations in the statements.  */
@@ -4073,6 +4097,22 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge)
 
   info->fp_expressions |= callee_info->fp_expressions;
 
+  /* For now, only always_inline functions have the target info.  */
+  if (callee_info->target_info.is_empty ())
+    {
+      if (!info->target_info.is_empty ())
+	/* We don't have target info from the inlined callee, discard the
+	   current one to keep conservative.  */
+	info->target_info.truncate (0);
+    }
+  else if (!info->target_info.is_empty ())
+    {
+      unsigned len = info->target_info.length ();
+      gcc_assert (len == callee_info->target_info.length ());
+      for (unsigned i = 0; i < len; i++)
+	info->target_info[i] |= callee_info->target_info[i];
+    }
+
   if (callee_info->conds)
     {
       ipa_auto_call_arg_values avals;
@@ -4534,6 +4574,15 @@ inline_read_section (struct lto_file_decl_data *file_data, const char *data,
 	  if (info)
 	    info->builtin_constant_p_parms.quick_push (parm);
 	}
+      count2 = streamer_read_uhwi (&ib);
+      if (info && count2)
+	info->target_info.reserve_exact (count2);
+      for (j = 0; j < count2; j++)
+	{
+	  HOST_WIDE_INT flag = streamer_read_hwi (&ib);
+	  if (info)
+	    info->target_info.quick_push (flag);
+	}
       for (e = node->callees; e; e = e->next_callee)
 	read_ipa_call_summary (&ib, e, info != NULL);
       for (e = node->indirect_calls; e; e = e->next_callee)
@@ -4713,6 +4762,10 @@ ipa_fn_summary_write (void)
 	  for (i = 0; info->builtin_constant_p_parms.iterate (i, &ip);
 	       i++)
 	    streamer_write_uhwi (ob, ip);
+	  streamer_write_uhwi (ob, info->target_info.length ());
+	  HOST_WIDE_INT hwi;
+	  for (i = 0; info->target_info.iterate (i, &hwi); i++)
+	    streamer_write_hwi (ob, hwi);
 	  for (edge = cnode->callees; edge; edge = edge->next_callee)
 	    write_ipa_call_summary (ob, edge);
 	  for (edge = cnode->indirect_calls; edge; edge = edge->next_callee)
diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h
index 78399b0b9bb..300b8da4507 100644
--- a/gcc/ipa-fnsummary.h
+++ b/gcc/ipa-fnsummary.h
@@ -131,7 +131,7 @@ public:
       time (0), conds (NULL),
       size_time_table (), call_size_time_table (vNULL),
       loop_iterations (NULL), loop_strides (NULL),
-      builtin_constant_p_parms (vNULL),
+      builtin_constant_p_parms (vNULL), target_info (),
       growth (0), scc_no (0)
   {
   }
@@ -146,6 +146,7 @@ public:
     call_size_time_table (vNULL),
     loop_iterations (s.loop_iterations), loop_strides (s.loop_strides),
     builtin_constant_p_parms (s.builtin_constant_p_parms),
+    target_info (s.target_info.copy()),
     growth (s.growth), scc_no (s.scc_no)
   {}
 
@@ -193,6 +194,9 @@ public:
   vec<ipa_freqcounting_predicate, va_gc> *loop_strides;
   /* Parameters tested by builtin_constant_p.  */
   vec<int, va_heap, vl_ptr> GTY((skip)) builtin_constant_p_parms;
+  /* Like fp_expressions, but it's to hold some target specific information,
+     such as some target specific isa flags.  */
+  auto_vec<HOST_WIDE_INT> GTY((skip)) target_info;
   /* Estimated growth for inlining all copies of the function before start
      of small functions inlining.
      This value will get out of date as the callers are duplicated, but
diff --git a/gcc/testsuite/gcc.dg/lto/pr102059_0.c b/gcc/testsuite/gcc.dg/lto/pr102059_0.c
new file mode 100644
index 00000000000..f164d96637e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr102059_0.c
@@ -0,0 +1,12 @@
+/* { dg-lto-do link } */
+/* { dg-skip-if "power only" { ! { powerpc*-*-* } } } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -Wno-attributes" } } */
+
+int __attribute__ ((always_inline))
+foo1 (int *b)
+{
+  *b += 100;
+  return *b;
+}
+
diff --git a/gcc/testsuite/gcc.dg/lto/pr102059_1.c b/gcc/testsuite/gcc.dg/lto/pr102059_1.c
new file mode 100644
index 00000000000..7e31fc7fbd9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr102059_1.c
@@ -0,0 +1,9 @@
+extern int foo1 (int *b);
+
+int __attribute__ ((always_inline)) foo2 (int *b)
+{
+  int res = foo1 (b);
+  *b += res;
+  return *b;
+}
+
diff --git a/gcc/testsuite/gcc.dg/lto/pr102059_2.c b/gcc/testsuite/gcc.dg/lto/pr102059_2.c
new file mode 100644
index 00000000000..bfa1e62fb7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr102059_2.c
@@ -0,0 +1,11 @@
+extern int foo2 (int *b);
+
+#pragma GCC target "cpu=power10"
+__attribute__ ((always_inline))
+int
+main (int *a)
+{
+  *a = foo2 (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-5.c b/gcc/testsuite/gcc.target/powerpc/pr102059-5.c
new file mode 100644
index 00000000000..b969c4c4750
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-5.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -Wno-attributes" } */
+
+/* Verify the reduced case from PR102059 won't fail.  */
+
+__attribute__ ((always_inline)) int
+foo (int *b)
+{
+  *b += 10;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int
+bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-6.c b/gcc/testsuite/gcc.target/powerpc/pr102059-6.c
new file mode 100644
index 00000000000..52e009289f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-6.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mno-htm" } */
+
+/* Verify target info for inlining still works even if callee
+   disables HTM explicitly while caller enables it.  */
+
+static inline int __attribute__ ((always_inline))
+foo (int *b)
+{
+  *b += 10;
+  return *b;
+}
+
+#pragma GCC target "htm"
+int
+bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}
+
diff --git a/gcc/target.def b/gcc/target.def
index 2e40448e6c5..92e4b0be667 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6567,6 +6567,41 @@ specific target options and the caller does not use the same options.",
  bool, (tree caller, tree callee),
  default_target_can_inline_p)
 
+DEFHOOK
+(update_ipa_fn_target_info,
+ "Allow target to analyze all gimple statements for the given function to\n\
+record and update some target specific information for inlining.  A typical\n\
+example is that a caller with one isa feature disabled is normally not\n\
+allowed to inline a callee with that same isa feature enabled even which is\n\
+attributed by always_inline, but with the conservative analysis on all\n\
+statements of the callee if we are able to guarantee the callee does not\n\
+exploit any instructions from the mismatch isa feature, it would be safe to\n\
+allow the caller to inline the callee.\n\
+@var{vec} is one @code{HOST_WIDE_INT} element vector to record information\n\
+in which one set bit indicates one corresponding feature is detected in the\n\
+analysis, @var{stmt} is the statement being analyzed.  Return true if target\n\
+still need to analyze the subsequent statements, otherwise return false to\n\
+stop subsequent analysis.\n\
+The default version of this hook returns false.",
+ bool, (vec<HOST_WIDE_INT>& vec, const gimple* stmt),
+ default_update_ipa_fn_target_info)
+
+DEFHOOK
+(need_ipa_fn_target_info,
+ "Allow target to check early whether it is necessary to analyze all gimple\n\
+statements in the given function to update target specific information for\n\
+inlining.  See hook @code{update_ipa_fn_target_info} for usage example of\n\
+target specific information.  This hook is expected to be invoked ahead of\n\
+the iterating with hook @code{update_ipa_fn_target_info}.\n\
+@var{decl} is the function being analyzed, @var{vec} is the same as what\n\
+in hook @code{update_ipa_fn_target_info}, target can do one time update\n\
+into @var{vec} without iterating for some case.  Return true if target\n\
+decides to analyze all gimple statements to collect information, otherwise\n\
+return false.\n\
+The default version of this hook returns false.",
+ bool, (const_tree decl, vec<HOST_WIDE_INT>& vec),
+ default_need_ipa_fn_target_info)
+
 DEFHOOK
 (relayout_function,
 "This target hook fixes function @var{fndecl} after attributes are processed.\n\
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 44a1facedcf..b14f5e22ba8 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1757,6 +1757,22 @@ default_target_can_inline_p (tree caller, tree callee)
   return callee_opts == caller_opts;
 }
 
+/* By default, return false to not need to collect any target information
+   for inlining.  Target maintainer should re-define the hook if the
+   target want to take advantage of it.  */
+
+bool
+default_need_ipa_fn_target_info (const_tree, vec<HOST_WIDE_INT> &)
+{
+  return false;
+}
+
+bool
+default_update_ipa_fn_target_info (vec<HOST_WIDE_INT> &, const gimple *)
+{
+  return false;
+}
+
 /* If the machine does not have a case insn that compares the bounds,
    this means extra overhead for dispatch tables, which raises the
    threshold for using them.  */
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index f70a307d26c..7950f907d27 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -193,6 +193,9 @@ extern bool default_new_address_profitable_p (rtx, rtx_insn *, rtx);
 extern bool default_target_option_valid_attribute_p (tree, tree, tree, int);
 extern bool default_target_option_pragma_parse (tree, tree);
 extern bool default_target_can_inline_p (tree, tree);
+extern bool default_update_ipa_fn_target_info (vec<HOST_WIDE_INT> &,
+					       const gimple *);
+extern bool default_need_ipa_fn_target_info (const_tree, vec<HOST_WIDE_INT> &);
 extern bool default_valid_pointer_mode (scalar_int_mode);
 extern bool default_ref_may_alias_errno (class ao_ref *);
 extern scalar_int_mode default_addr_space_pointer_mode (addr_space_t);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-09-03  3:09 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-01  7:02 [RFC/PATCH] ipa-inline: Add target info into fn summary [PR102059] Kewen.Lin
2021-09-02  9:25 ` Richard Biener
2021-09-02 11:13   ` Kewen.Lin
2021-09-02 11:51     ` Richard Biener
2021-09-02 13:10       ` Kewen.Lin
2021-09-02 13:21         ` Richard Biener
2021-09-02 17:44 ` Segher Boessenkool
2021-09-03  3:09   ` Kewen.Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).