public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
@ 2021-09-01  6:55 Kewen.Lin
  2021-09-15  8:42 ` PING^1 " Kewen.Lin
  2021-11-29 16:57 ` Segher Boessenkool
  0 siblings, 2 replies; 17+ messages in thread
From: Kewen.Lin @ 2021-09-01  6:55 UTC (permalink / raw)
  To: GCC Patches
  Cc: Segher Boessenkool, David Edelsohn, Bill Schmidt, Michael Meissner

[-- Attachment #1: Type: text/plain, Size: 1289 bytes --]

Hi!

This patch is to fix the inconsistent behaviors for non-LTO mode
and LTO mode.  As Martin pointed out, currently the function
rs6000_can_inline_p simply makes it inlinable if callee_tree is
NULL, but it's wrong, we should use the command line options
from target_option_default_node as default.  It also replaces
rs6000_isa_flags with the one from target_option_default_node
when caller_tree is NULL as rs6000_isa_flags could probably
change since initialization.

It also extends the scope of the check for the case that callee
has explicit set options, for test case pr102059-2.c inlining can
happen unexpectedly before, it's fixed accordingly.

As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
can be neglected for inlining, this patch also exludes them when
the callee is attributed by always_inline.

Bootstrapped and regtested on powerpc64le-linux-gnu Power9.

BR,
Kewen
-----
gcc/ChangeLog:

	PR ipa/102059
	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
	target_option_default_node and consider always_inline_safe flags.

gcc/testsuite/ChangeLog:

	PR ipa/102059
	* gcc.target/powerpc/pr102059-1.c: New test.
	* gcc.target/powerpc/pr102059-2.c: New test.
	* gcc.target/powerpc/pr102059-3.c: New test.
	* gcc.target/powerpc/pr102059-4.c: New test.

[-- Attachment #2: 0001-rs6000-Fix-some-issues-in-rs6000_can_inline_p-PR1020.patch --]
[-- Type: text/plain, Size: 9742 bytes --]

---
 gcc/config/rs6000/rs6000.c                    | 87 +++++++++++------
 gcc/testsuite/gcc.target/powerpc/pr102059-1.c | 24 +++++
 gcc/testsuite/gcc.target/powerpc/pr102059-2.c | 20 ++++
 gcc/testsuite/gcc.target/powerpc/pr102059-3.c | 95 +++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr102059-4.c | 22 +++++
 5 files changed, 221 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-4.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 46b8909104e..c2582a3efab 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -25058,45 +25058,78 @@ rs6000_generate_version_dispatcher_body (void *node_p)
 static bool
 rs6000_can_inline_p (tree caller, tree callee)
 {
-  bool ret = false;
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
 
-  /* If the callee has no option attributes, then it is ok to inline.  */
+  /* If the caller/callee has option attributes, then use them.
+     Otherwise, use the command line options.  */
   if (!callee_tree)
-    ret = true;
+    callee_tree = target_option_default_node;
+  if (!caller_tree)
+    caller_tree = target_option_default_node;
+
+  struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
+  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+  HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags;
+  HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
+
+  bool always_inline =
+    (DECL_DISREGARD_INLINE_LIMITS (callee)
+     && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)));
+
+  /* Some flags such as fusion can be tolerated for always inlines.  */
+  unsigned HOST_WIDE_INT always_inline_safe_mask =
+    (MASK_P8_FUSION | MASK_P10_FUSION | OPTION_MASK_SAVE_TOC_INDIRECT
+     | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION_LD_CMPI
+     | OPTION_MASK_P10_FUSION_2LOGICAL | OPTION_MASK_P10_FUSION_LOGADD
+     | OPTION_MASK_P10_FUSION_ADDLOG | OPTION_MASK_P10_FUSION_2ADD
+     | OPTION_MASK_PCREL_OPT);
+
+  if (always_inline) {
+    caller_isa &= ~always_inline_safe_mask;
+    callee_isa &= ~always_inline_safe_mask;
+  }
 
-  else
+  /* The callee's options must be a subset of the caller's options, i.e.
+     a vsx function may inline an altivec function, but a no-vsx function
+     must not inline a vsx function.  */
+  if ((caller_isa & callee_isa) != callee_isa)
     {
-      HOST_WIDE_INT caller_isa;
-      struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
-      HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
-      HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
-
-      /* If the caller has option attributes, then use them.
-	 Otherwise, use the command line options.  */
-      if (caller_tree)
-	caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags;
-      else
-	caller_isa = rs6000_isa_flags;
+      if (TARGET_DEBUG_TARGET)
+	fprintf (stderr,
+		 "rs6000_can_inline_p:, caller %s, callee %s, cannot "
+		 "inline since callee's options set isn't a subset of "
+		 "caller's options set.\n",
+		 get_decl_name (caller), get_decl_name (callee));
+      return false;
+    }
 
-      /* The callee's options must be a subset of the caller's options, i.e.
-	 a vsx function may inline an altivec function, but a no-vsx function
-	 must not inline a vsx function.  However, for those options that the
-	 callee has explicitly enabled or disabled, then we must enforce that
-	 the callee's and caller's options match exactly; see PR70010.  */
-      if (((caller_isa & callee_isa) == callee_isa)
-	  && (caller_isa & explicit_isa) == (callee_isa & explicit_isa))
-	ret = true;
+  /* For those options that the callee has explicitly enabled or disabled,
+     then we must enforce that the callee's and caller's options match
+     exactly; see PR70010.  */
+  HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
+  if (always_inline)
+    explicit_isa &= ~always_inline_safe_mask;
+  if ((caller_isa & explicit_isa) != (callee_isa & explicit_isa))
+    {
+      if (TARGET_DEBUG_TARGET)
+	fprintf (stderr,
+		 "rs6000_can_inline_p:, caller %s, callee %s, cannot "
+		 "inline since callee's options set isn't a subset of "
+		 "caller's options set by considering callee's "
+		 "explicitly set options.\n",
+		 get_decl_name (caller), get_decl_name (callee));
+      return false;
     }
 
   if (TARGET_DEBUG_TARGET)
-    fprintf (stderr, "rs6000_can_inline_p:, caller %s, callee %s, %s inline\n",
-	     get_decl_name (caller), get_decl_name (callee),
-	     (ret ? "can" : "cannot"));
+    fprintf (stderr,
+	     "rs6000_can_inline_p:, caller %s, callee %s, can inline.\n",
+	     get_decl_name (caller), get_decl_name (callee));
 
-  return ret;
+  return true;
 }
+
 \f
 /* Allocate a stack temp and fixup the address so it meets the particular
    memory requirements (either offetable or REG+REG addressing).  */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-1.c b/gcc/testsuite/gcc.target/powerpc/pr102059-1.c
new file mode 100644
index 00000000000..d2a002cf141
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_htm_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+
+/* Verify it emits inlining error msg at non-LTO mode.  */
+
+#include <htmintrin.h>
+
+static inline int __attribute__ ((always_inline))
+foo (int *b) /* { dg-error "inlining failed in call to .* target specific option mismatch" } */
+{
+  int res = _HTM_STATE(__builtin_ttest());
+  *b += res;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int
+bar (int *a)
+{
+  *a = foo (a); /* { dg-message "called from here" } */
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-2.c b/gcc/testsuite/gcc.target/powerpc/pr102059-2.c
new file mode 100644
index 00000000000..1d5d6c38bf3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-vsx" } */
+
+/* Verify it emits inlining error msg when the callee has explicit
+   disabling option from command line.  */
+
+vector int c, a, b;
+
+static inline void __attribute__ ((__always_inline__))
+foo () /* { dg-error "inlining failed in call to .* target specific option mismatch" } */
+{
+  c = a + b;
+}
+
+__attribute__ ((target ("vsx")))
+int main ()
+{
+  foo (); /* { dg-message "called from here" } */
+  c = a + b;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-3.c b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
new file mode 100644
index 00000000000..9684cab986a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
@@ -0,0 +1,95 @@
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -Wno-attributes" } */
+
+/* Verify it doesn't emit inlining error msg since some mismatched
+   features are considered as safe for always_inline.  */
+
+/* 1. Callee enables Power8 fusion implicitly, while caller
+      with Power9 doesn't support power8 fusion at all.  */
+
+__attribute__ ((always_inline))
+int callee1 (int *b)
+{
+  *b += 1;
+  return *b;
+}
+
+#pragma GCC target "cpu=power9"
+int caller1 (int *a)
+{
+  *a = callee1 (a);
+  return 0;
+}
+
+/* 2. Caller enables indirect toc save feature while callee
+      disables it explicitly.  */
+
+#pragma GCC target "save-toc-indirect"
+__attribute__ ((always_inline))
+int callee2 (int *b)
+{
+  *b += 2;
+  return *b;
+}
+
+#pragma GCC target "no-save-toc-indirect"
+int caller2 (int *a)
+{
+  *a = callee2 (a);
+  return 0;
+}
+
+/* 3. Caller disables Power10 fusion explicitly, while callee
+      still supports it as Power10 turns it on by default.  */
+
+#pragma GCC target "cpu=power10"
+__attribute__ ((always_inline))
+int callee3 (int *b)
+{
+  *b += 3;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10,no-power10-fusion"
+int caller3 (int *a)
+{
+  *a = callee3 (a);
+  return 0;
+}
+
+/* 4. Caller enables Power10 fusion implicitly, while callee
+      disables it explicitly.  */
+
+#pragma GCC target "no-power10-fusion"
+__attribute__ ((always_inline))
+int callee4 (int *b)
+{
+  *b += 4;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int caller4 (int *a)
+{
+  *a = callee4 (a);
+  return 0;
+}
+
+/* 5. Caller disables pcrel-opt while callee enables it explicitly.  */
+
+#pragma GCC target "cpu=power10,no-pcrel-opt"
+__attribute__ ((always_inline))
+int callee5 (int *b)
+{
+  *b += 5;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10,pcrel-opt"
+int caller5 (int *a)
+{
+  *a = callee5 (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-4.c b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
new file mode 100644
index 00000000000..0f27f2ce7d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mno-power8-fusion -Wno-attributes" } */
+
+/* Verify it doesn't emit inlining error msg since the flag power8
+   fusion is considered as safe for always_inline, it's still safe
+   even the flag is set explicitly.  */
+
+__attribute__ ((always_inline))
+int foo (int *b)
+{
+  *b += 10;
+  return *b;
+}
+
+#pragma GCC target "power8-fusion"
+int bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}
+
-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* PING^1 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-09-01  6:55 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059] Kewen.Lin
@ 2021-09-15  8:42 ` Kewen.Lin
  2021-09-28  8:58   ` PING^2 " Kewen.Lin
  2021-11-29 16:57 ` Segher Boessenkool
  1 sibling, 1 reply; 17+ messages in thread
From: Kewen.Lin @ 2021-09-15  8:42 UTC (permalink / raw)
  To: GCC Patches
  Cc: Michael Meissner, Bill Schmidt, David Edelsohn, Segher Boessenkool

Hi!

Gentle ping this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html

BR,
Kewen

on 2021/9/1 下午2:55, Kewen.Lin via Gcc-patches wrote:
> Hi!
> 
> This patch is to fix the inconsistent behaviors for non-LTO mode
> and LTO mode.  As Martin pointed out, currently the function
> rs6000_can_inline_p simply makes it inlinable if callee_tree is
> NULL, but it's wrong, we should use the command line options
> from target_option_default_node as default.  It also replaces
> rs6000_isa_flags with the one from target_option_default_node
> when caller_tree is NULL as rs6000_isa_flags could probably
> change since initialization.
> 
> It also extends the scope of the check for the case that callee
> has explicit set options, for test case pr102059-2.c inlining can
> happen unexpectedly before, it's fixed accordingly.
> 
> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
> can be neglected for inlining, this patch also exludes them when
> the callee is attributed by always_inline.
> 
> Bootstrapped and regtested on powerpc64le-linux-gnu Power9.
> 
> BR,
> Kewen
> -----
> gcc/ChangeLog:
> 
> 	PR ipa/102059
> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
> 	target_option_default_node and consider always_inline_safe flags.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR ipa/102059
> 	* gcc.target/powerpc/pr102059-1.c: New test.
> 	* gcc.target/powerpc/pr102059-2.c: New test.
> 	* gcc.target/powerpc/pr102059-3.c: New test.
> 	* gcc.target/powerpc/pr102059-4.c: New test.
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* PING^2 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-09-15  8:42 ` PING^1 " Kewen.Lin
@ 2021-09-28  8:58   ` Kewen.Lin
  2021-10-13  2:33     ` PING^3 " Kewen.Lin
  0 siblings, 1 reply; 17+ messages in thread
From: Kewen.Lin @ 2021-09-28  8:58 UTC (permalink / raw)
  To: GCC Patches
  Cc: Michael Meissner, Bill Schmidt, Segher Boessenkool, David Edelsohn

Hi,

Gentle ping this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html

One related patch [1] is ready to commit, whose test cases rely on
this patch if no changes are applied to them.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579658.html

BR,
Kewen

on 2021/9/15 下午4:42, Kewen.Lin via Gcc-patches wrote:
> Hi!
> 
> Gentle ping this patch:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html
> 
> BR,
> Kewen
> 
> on 2021/9/1 下午2:55, Kewen.Lin via Gcc-patches wrote:
>> Hi!
>>
>> This patch is to fix the inconsistent behaviors for non-LTO mode
>> and LTO mode.  As Martin pointed out, currently the function
>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>> NULL, but it's wrong, we should use the command line options
>> from target_option_default_node as default.  It also replaces
>> rs6000_isa_flags with the one from target_option_default_node
>> when caller_tree is NULL as rs6000_isa_flags could probably
>> change since initialization.
>>
>> It also extends the scope of the check for the case that callee
>> has explicit set options, for test case pr102059-2.c inlining can
>> happen unexpectedly before, it's fixed accordingly.
>>
>> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
>> can be neglected for inlining, this patch also exludes them when
>> the callee is attributed by always_inline.
>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu Power9.
>>
>> BR,
>> Kewen
>> -----
>> gcc/ChangeLog:
>>
>> 	PR ipa/102059
>> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>> 	target_option_default_node and consider always_inline_safe flags.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR ipa/102059
>> 	* gcc.target/powerpc/pr102059-1.c: New test.
>> 	* gcc.target/powerpc/pr102059-2.c: New test.
>> 	* gcc.target/powerpc/pr102059-3.c: New test.
>> 	* gcc.target/powerpc/pr102059-4.c: New test.
>>
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* PING^3 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-09-28  8:58   ` PING^2 " Kewen.Lin
@ 2021-10-13  2:33     ` Kewen.Lin
  2021-10-20  9:30       ` PING^4 " Kewen.Lin
  0 siblings, 1 reply; 17+ messages in thread
From: Kewen.Lin @ 2021-10-13  2:33 UTC (permalink / raw)
  To: GCC Patches
  Cc: Michael Meissner, Bill Schmidt, David Edelsohn, Segher Boessenkool

Hi,

Gentle ping this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html

One related patch [1] is ready to commit, whose test cases rely on
this patch if no changes are applied to them.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579658.html

BR,
Kewen

>> on 2021/9/1 下午2:55, Kewen.Lin via Gcc-patches wrote:
>>> Hi!
>>>
>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>> and LTO mode.  As Martin pointed out, currently the function
>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>> NULL, but it's wrong, we should use the command line options
>>> from target_option_default_node as default.  It also replaces
>>> rs6000_isa_flags with the one from target_option_default_node
>>> when caller_tree is NULL as rs6000_isa_flags could probably
>>> change since initialization.
>>>
>>> It also extends the scope of the check for the case that callee
>>> has explicit set options, for test case pr102059-2.c inlining can
>>> happen unexpectedly before, it's fixed accordingly.
>>>
>>> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
>>> can be neglected for inlining, this patch also exludes them when
>>> the callee is attributed by always_inline.
>>>
>>> Bootstrapped and regtested on powerpc64le-linux-gnu Power9.
>>>
>>> BR,
>>> Kewen
>>> -----
>>> gcc/ChangeLog:
>>>
>>> 	PR ipa/102059
>>> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>>> 	target_option_default_node and consider always_inline_safe flags.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 	PR ipa/102059
>>> 	* gcc.target/powerpc/pr102059-1.c: New test.
>>> 	* gcc.target/powerpc/pr102059-2.c: New test.
>>> 	* gcc.target/powerpc/pr102059-3.c: New test.
>>> 	* gcc.target/powerpc/pr102059-4.c: New test.
>>>
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* PING^4 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-10-13  2:33     ` PING^3 " Kewen.Lin
@ 2021-10-20  9:30       ` Kewen.Lin
  2021-11-04 10:57         ` PING^5 " Kewen.Lin
  0 siblings, 1 reply; 17+ messages in thread
From: Kewen.Lin @ 2021-10-20  9:30 UTC (permalink / raw)
  To: GCC Patches
  Cc: Michael Meissner, Bill Schmidt, Segher Boessenkool, David Edelsohn

Hi,

Gentle ping this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html

One related patch [1] is ready to commit, whose test cases rely on
this patch if no changes are applied to them.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579658.html

BR,
Kewen

>>> on 2021/9/1 下午2:55, Kewen.Lin via Gcc-patches wrote:
>>>> Hi!
>>>>
>>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>>> and LTO mode.  As Martin pointed out, currently the function
>>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>>> NULL, but it's wrong, we should use the command line options
>>>> from target_option_default_node as default.  It also replaces
>>>> rs6000_isa_flags with the one from target_option_default_node
>>>> when caller_tree is NULL as rs6000_isa_flags could probably
>>>> change since initialization.
>>>>
>>>> It also extends the scope of the check for the case that callee
>>>> has explicit set options, for test case pr102059-2.c inlining can
>>>> happen unexpectedly before, it's fixed accordingly.
>>>>
>>>> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
>>>> can be neglected for inlining, this patch also exludes them when
>>>> the callee is attributed by always_inline.
>>>>
>>>> Bootstrapped and regtested on powerpc64le-linux-gnu Power9.
>>>>
>>>> BR,
>>>> Kewen
>>>> -----
>>>> gcc/ChangeLog:
>>>>
>>>> 	PR ipa/102059
>>>> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>>>> 	target_option_default_node and consider always_inline_safe flags.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 	PR ipa/102059
>>>> 	* gcc.target/powerpc/pr102059-1.c: New test.
>>>> 	* gcc.target/powerpc/pr102059-2.c: New test.
>>>> 	* gcc.target/powerpc/pr102059-3.c: New test.
>>>> 	* gcc.target/powerpc/pr102059-4.c: New test.
>>>>
>>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* PING^5 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-10-20  9:30       ` PING^4 " Kewen.Lin
@ 2021-11-04 10:57         ` Kewen.Lin
  2021-11-22  2:24           ` PING^6 " Kewen.Lin
  0 siblings, 1 reply; 17+ messages in thread
From: Kewen.Lin @ 2021-11-04 10:57 UTC (permalink / raw)
  To: GCC Patches
  Cc: Michael Meissner, Bill Schmidt, David Edelsohn, Segher Boessenkool

Hi,

Gentle ping this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html

One related patch [1] is ready to commit, whose test cases rely on
this patch if no changes are applied to them.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579658.html

BR,
Kewen

>>>> on 2021/9/1 下午2:55, Kewen.Lin via Gcc-patches wrote:
>>>>> Hi!
>>>>>
>>>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>>>> and LTO mode.  As Martin pointed out, currently the function
>>>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>>>> NULL, but it's wrong, we should use the command line options
>>>>> from target_option_default_node as default.  It also replaces
>>>>> rs6000_isa_flags with the one from target_option_default_node
>>>>> when caller_tree is NULL as rs6000_isa_flags could probably
>>>>> change since initialization.
>>>>>
>>>>> It also extends the scope of the check for the case that callee
>>>>> has explicit set options, for test case pr102059-2.c inlining can
>>>>> happen unexpectedly before, it's fixed accordingly.
>>>>>
>>>>> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
>>>>> can be neglected for inlining, this patch also exludes them when
>>>>> the callee is attributed by always_inline.
>>>>>
>>>>> Bootstrapped and regtested on powerpc64le-linux-gnu Power9.
>>>>>
>>>>> BR,
>>>>> Kewen
>>>>> -----
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 	PR ipa/102059
>>>>> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>>>>> 	target_option_default_node and consider always_inline_safe flags.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 	PR ipa/102059
>>>>> 	* gcc.target/powerpc/pr102059-1.c: New test.
>>>>> 	* gcc.target/powerpc/pr102059-2.c: New test.
>>>>> 	* gcc.target/powerpc/pr102059-3.c: New test.
>>>>> 	* gcc.target/powerpc/pr102059-4.c: New test.
>>>>>
>>>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* PING^6 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-11-04 10:57         ` PING^5 " Kewen.Lin
@ 2021-11-22  2:24           ` Kewen.Lin
  0 siblings, 0 replies; 17+ messages in thread
From: Kewen.Lin @ 2021-11-22  2:24 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, Bill Schmidt, David Edelsohn, GCC Patches

Hi,

Gentle ping this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html

One related patch [1] is ready to commit, whose test cases rely on
this patch if no changes are applied to them.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579658.html

BR,
Kewen

>>>>> on 2021/9/1 下午2:55, Kewen.Lin via Gcc-patches wrote:
>>>>>> Hi!
>>>>>>
>>>>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>>>>> and LTO mode.  As Martin pointed out, currently the function
>>>>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>>>>> NULL, but it's wrong, we should use the command line options
>>>>>> from target_option_default_node as default.  It also replaces
>>>>>> rs6000_isa_flags with the one from target_option_default_node
>>>>>> when caller_tree is NULL as rs6000_isa_flags could probably
>>>>>> change since initialization.
>>>>>>
>>>>>> It also extends the scope of the check for the case that callee
>>>>>> has explicit set options, for test case pr102059-2.c inlining can
>>>>>> happen unexpectedly before, it's fixed accordingly.
>>>>>>
>>>>>> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
>>>>>> can be neglected for inlining, this patch also exludes them when
>>>>>> the callee is attributed by always_inline.
>>>>>>
>>>>>> Bootstrapped and regtested on powerpc64le-linux-gnu Power9.
>>>>>>
>>>>>> BR,
>>>>>> Kewen
>>>>>> -----
>>>>>> gcc/ChangeLog:
>>>>>>
>>>>>> 	PR ipa/102059
>>>>>> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>>>>>> 	target_option_default_node and consider always_inline_safe flags.
>>>>>>
>>>>>> gcc/testsuite/ChangeLog:
>>>>>>
>>>>>> 	PR ipa/102059
>>>>>> 	* gcc.target/powerpc/pr102059-1.c: New test.
>>>>>> 	* gcc.target/powerpc/pr102059-2.c: New test.
>>>>>> 	* gcc.target/powerpc/pr102059-3.c: New test.
>>>>>> 	* gcc.target/powerpc/pr102059-4.c: New test.
>>>>>>
>>>>>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-09-01  6:55 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059] Kewen.Lin
  2021-09-15  8:42 ` PING^1 " Kewen.Lin
@ 2021-11-29 16:57 ` Segher Boessenkool
  2021-12-03  0:51   ` Michael Meissner
  2021-12-03  3:46   ` [PATCH v2] " Kewen.Lin
  1 sibling, 2 replies; 17+ messages in thread
From: Segher Boessenkool @ 2021-11-29 16:57 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, David Edelsohn, Bill Schmidt, Michael Meissner

Hi!

On Wed, Sep 01, 2021 at 02:55:51PM +0800, Kewen.Lin wrote:
> This patch is to fix the inconsistent behaviors for non-LTO mode
> and LTO mode.  As Martin pointed out, currently the function
> rs6000_can_inline_p simply makes it inlinable if callee_tree is
> NULL, but it's wrong, we should use the command line options
> from target_option_default_node as default.

This is not documented.

> It also replaces
> rs6000_isa_flags with the one from target_option_default_node
> when caller_tree is NULL as rs6000_isa_flags could probably
> change since initialization.

Is this enough then?  Are there no other places that use
rs6000_isa_flags?  Is it correct for us to have that variable at all
anymore?  Etc.

> +  bool always_inline =
> +    (DECL_DISREGARD_INLINE_LIMITS (callee)
> +     && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)));

The "=" should start a line, not end it.  And please don't use
unnecessary parens.

> +  /* Some flags such as fusion can be tolerated for always inlines.  */
> +  unsigned HOST_WIDE_INT always_inline_safe_mask =
> +    (MASK_P8_FUSION | MASK_P10_FUSION | OPTION_MASK_SAVE_TOC_INDIRECT
> +     | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION_LD_CMPI
> +     | OPTION_MASK_P10_FUSION_2LOGICAL | OPTION_MASK_P10_FUSION_LOGADD
> +     | OPTION_MASK_P10_FUSION_ADDLOG | OPTION_MASK_P10_FUSION_2ADD
> +     | OPTION_MASK_PCREL_OPT);

Same.

The fusion ones are obvious.  The other two are not.  Please put those
two more obviously separate (not somewhere in the sea of fusion
options), and some comment wouldn't hurt either.  You can make it
separate statements even, make it really stand out.

Why are there OPTION_MASKs for separate P10 fusion types here, as well as
MASK_P10_FUSION?

> +
> +  if (always_inline) {
> +    caller_isa &= ~always_inline_safe_mask;
> +    callee_isa &= ~always_inline_safe_mask;
> +  }

"{" starts a new line, indented.


Segher

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-11-29 16:57 ` Segher Boessenkool
@ 2021-12-03  0:51   ` Michael Meissner
  2021-12-03  3:30     ` Kewen.Lin
  2021-12-03  3:46   ` [PATCH v2] " Kewen.Lin
  1 sibling, 1 reply; 17+ messages in thread
From: Michael Meissner @ 2021-12-03  0:51 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Kewen.Lin, GCC Patches, David Edelsohn, Bill Schmidt, Michael Meissner

On Mon, Nov 29, 2021 at 10:57:12AM -0600, Segher Boessenkool wrote:
> Why are there OPTION_MASKs for separate P10 fusion types here, as well as
> MASK_P10_FUSION?

Well going back in time, before we used rs6000_isa_flags, we used the default
flag word for MASK arguments.  Unfortunately, the default flag word is only
32-bits, and we needed more mask bits, so we moved to rs6000_isa_flags, which
is HOST_WIDE_INT.

Unfortunately, the options infrastructure used 'OPTION_MASK_<foo>' instead of
'MASK_<foo>'.  So we have a bunch of macros in rs6000.h that map 'MASK_<foo>'
to 'OPTION_MASK_<foo>'.

We should clean this up and use 'OPTION_MASK_<foo>' everywhere, but so far it
hasn't percolated to the top as being important enough to do.

As new options are added, people just clone the code and add new macros, even
though in theory nobody should be using MASK_P10_FUSION_<foo>.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-12-03  0:51   ` Michael Meissner
@ 2021-12-03  3:30     ` Kewen.Lin
  0 siblings, 0 replies; 17+ messages in thread
From: Kewen.Lin @ 2021-12-03  3:30 UTC (permalink / raw)
  To: Michael Meissner
  Cc: Segher Boessenkool, GCC Patches, Bill Schmidt, David Edelsohn

Hi Mike,

on 2021/12/3 上午8:51, Michael Meissner wrote:
> On Mon, Nov 29, 2021 at 10:57:12AM -0600, Segher Boessenkool wrote:
>> Why are there OPTION_MASKs for separate P10 fusion types here, as well as
>> MASK_P10_FUSION?
> 
> Well going back in time, before we used rs6000_isa_flags, we used the default
> flag word for MASK arguments.  Unfortunately, the default flag word is only
> 32-bits, and we needed more mask bits, so we moved to rs6000_isa_flags, which
> is HOST_WIDE_INT.
> 
> Unfortunately, the options infrastructure used 'OPTION_MASK_<foo>' instead of
> 'MASK_<foo>'.  So we have a bunch of macros in rs6000.h that map 'MASK_<foo>'
> to 'OPTION_MASK_<foo>'.
> 
> We should clean this up and use 'OPTION_MASK_<foo>' everywhere, but so far it
> hasn't percolated to the top as being important enough to do.
> 
> As new options are added, people just clone the code and add new macros, even
> though in theory nobody should be using MASK_P10_FUSION_<foo>.
> 

Thanks for the explanation on the history.  Fortunately now for Power10 fusion,
we have only one defined MASK_P10_FUSION but not the other MASK_P10_FUSION_<...>.

As you explained, I will use OPTION_MASK_P10_FUSION and OPTION_MASK_P8_FUSION
instead.

BR,
Kewen

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-11-29 16:57 ` Segher Boessenkool
  2021-12-03  0:51   ` Michael Meissner
@ 2021-12-03  3:46   ` Kewen.Lin
  2021-12-03 23:23     ` Peter Bergner
  2021-12-06 13:06     ` Segher Boessenkool
  1 sibling, 2 replies; 17+ messages in thread
From: Kewen.Lin @ 2021-12-03  3:46 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: GCC Patches, David Edelsohn, Bill Schmidt, Michael Meissner

[-- Attachment #1: Type: text/plain, Size: 5253 bytes --]

Hi Segher,

Thanks for the review!

on 2021/11/30 上午12:57, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Sep 01, 2021 at 02:55:51PM +0800, Kewen.Lin wrote:
>> This patch is to fix the inconsistent behaviors for non-LTO mode
>> and LTO mode.  As Martin pointed out, currently the function
>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>> NULL, but it's wrong, we should use the command line options
>> from target_option_default_node as default.
> 
> This is not documented.
> 

Yeah, but according to the document for the target attribute [1],
"Multiple target back ends implement the target attribute to specify
that a function is to be compiled with different target options than
specified on the command line. The original target command-line options
are ignored. ", it seems to say the function without any target
attribute/pragma will be compiled with target options specified on the
command line.  I think it's a normal expectation for users.

Excepting for the inconsistent behaviors between LTO and non-LTO,
it can also make the below case different.

// Option: -O2 -mcpu=power8 -mhtm

int foo(int *b) {
  *b += __builtin_ttest();
  return *b;
}

__attribute__((target("no-htm"))) int bar(int *a) {
  *a = foo(a);
  return 0;
}

Without this fix, bar inlines foo but get the error as below:

In function ‘foo’,
    inlined from ‘bar’ at test.c:8:8:
test.c:3:9: error: ‘__builtin_ttest’ requires the ‘-mhtm’ option
    3 |   *b += __builtin_ttest();
      |         ^~~~~~~~~~~~~~~~~

Since bar doesn't support foo's required HTM feature.

With this fix, bar doesn't inline foo and the case gets compiled well.

I've added this test case as pr102059-5.c.

[1] https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

>> It also replaces
>> rs6000_isa_flags with the one from target_option_default_node
>> when caller_tree is NULL as rs6000_isa_flags could probably
>> change since initialization.
> 
> Is this enough then?  Are there no other places that use
> rs6000_isa_flags?  Is it correct for us to have that variable at all
> anymore?  Etc.
> 

Good question, I think the answer is yes.  I digged into it and found
the current target option save/restore scheme would keep rs6000_isa_flags
to be the same as the one saved in target_option_default_node for the context
of inlining.  So I put one assertion as below:

    if (!caller_tree) {
      caller_tree = target_option_default_node;
      gcc_assert (rs6000_isa_flags
                  == TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags);
    }

And kicked off regression testings on both BE and LE, the result showed
only one same failure, which seems to be a latent bug.  I've filed PR103515
for it.

This bug also indicates it's better to use target_option_default_node
rather than rs6000_isa_flags when the caller_tree is NULL.  Since
the target_option_default_node is straightforward and doesn't rely
on the assumption that rs6000_isa_flags would be kept as the default
one correctly, it doesn't suffer from bugs like PR103515.

>> +  bool always_inline =
>> +    (DECL_DISREGARD_INLINE_LIMITS (callee)
>> +     && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)));
> 
> The "=" should start a line, not end it.  And please don't use
> unnecessary parens.
> 
>> +  /* Some flags such as fusion can be tolerated for always inlines.  */
>> +  unsigned HOST_WIDE_INT always_inline_safe_mask =
>> +    (MASK_P8_FUSION | MASK_P10_FUSION | OPTION_MASK_SAVE_TOC_INDIRECT
>> +     | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION_LD_CMPI
>> +     | OPTION_MASK_P10_FUSION_2LOGICAL | OPTION_MASK_P10_FUSION_LOGADD
>> +     | OPTION_MASK_P10_FUSION_ADDLOG | OPTION_MASK_P10_FUSION_2ADD
>> +     | OPTION_MASK_PCREL_OPT);
> 
> Same.
> 

Both above and here fixed.

> The fusion ones are obvious.  The other two are not.  Please put those
> two more obviously separate (not somewhere in the sea of fusion
> options), and some comment wouldn't hurt either.  You can make it
> separate statements even, make it really stand out.
> 

Fixed by splitting it into: fusion_options_mask and optimization_options_mask.
I'm not happy for the later name, but poor to get a better in mind.  Do you
have any suggestions on the name and words?

> Why are there OPTION_MASKs for separate P10 fusion types here, as well as
> MASK_P10_FUSION?
> 

Mike helped to explain the history, I've updated all of them using OPTION_MASK_
to avoid potential confusion.

>> +
>> +  if (always_inline) {
>> +    caller_isa &= ~always_inline_safe_mask;
>> +    callee_isa &= ~always_inline_safe_mask;
>> +  }
> 
> "{" starts a new line, indented.
> 

Fixed.  I refined this part a bit since caller doesn't need to take actions.

The new version V2 is attached.  Tested as before.

BR,
Kewen
---
gcc/ChangeLog:

	PR ipa/102059
	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
	target_option_default_node and consider always_inline_safe flags.

gcc/testsuite/ChangeLog:

	PR ipa/102059
	* gcc.target/powerpc/pr102059-1.c: New test.
	* gcc.target/powerpc/pr102059-2.c: New test.
	* gcc.target/powerpc/pr102059-3.c: New test.
	* gcc.target/powerpc/pr102059-4.c: New test.
	* gcc.target/powerpc/pr102059-5.c: New test.

-----


[-- Attachment #2: PR102059_v2.patch --]
[-- Type: text/plain, Size: 10949 bytes --]

---
 gcc/config/rs6000/rs6000.c                    | 95 +++++++++++++------
 gcc/testsuite/gcc.target/powerpc/pr102059-1.c | 24 +++++
 gcc/testsuite/gcc.target/powerpc/pr102059-2.c | 20 ++++
 gcc/testsuite/gcc.target/powerpc/pr102059-3.c | 95 +++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr102059-4.c | 22 +++++
 gcc/testsuite/gcc.target/powerpc/pr102059-5.c | 22 +++++
 6 files changed, 250 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-5.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 289c1b3df24..8d4258e58b1 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -25539,45 +25539,84 @@ rs6000_generate_version_dispatcher_body (void *node_p)
 static bool
 rs6000_can_inline_p (tree caller, tree callee)
 {
-  bool ret = false;
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
 
-  /* If the callee has no option attributes, then it is ok to inline.  */
+  /* If the caller/callee has option attributes, then use them.
+     Otherwise, use the command line options.  */
   if (!callee_tree)
-    ret = true;
+    callee_tree = target_option_default_node;
+  if (!caller_tree)
+    caller_tree = target_option_default_node;
+
+  struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
+  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+  HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags;
+  HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
+
+  bool always_inline
+    = DECL_DISREGARD_INLINE_LIMITS (callee)
+      && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee));
+
+  /* Fusion option masks.  */
+  unsigned HOST_WIDE_INT fusion_options_mask
+    = OPTION_MASK_P8_FUSION | OPTION_MASK_P10_FUSION | OPTION_MASK_P8_FUSION_SIGN
+      | OPTION_MASK_P10_FUSION_LD_CMPI | OPTION_MASK_P10_FUSION_2LOGICAL
+      | OPTION_MASK_P10_FUSION_LOGADD | OPTION_MASK_P10_FUSION_ADDLOG
+      | OPTION_MASK_P10_FUSION_2ADD;
+
+  /* Like fusion, some option masks which are just for optimization.  */
+  unsigned HOST_WIDE_INT optimization_options_mask
+    = OPTION_MASK_SAVE_TOC_INDIRECT | OPTION_MASK_PCREL_OPT;
+
+  /* Some features can be tolerated for always inlines.  */
+  unsigned HOST_WIDE_INT always_inline_safe_mask
+    = fusion_options_mask | optimization_options_mask;
+
+  /* To adjust callee is enough since we just check for subset.  */
+  if (always_inline)
+    callee_isa &= ~always_inline_safe_mask;
+
+  /* The callee's options must be a subset of the caller's options, i.e.
+     a vsx function may inline an altivec function, but a no-vsx function
+     must not inline a vsx function.  */
+  if ((caller_isa & callee_isa) != callee_isa)
+    {
+      if (TARGET_DEBUG_TARGET)
+	fprintf (stderr,
+		 "rs6000_can_inline_p:, caller %s, callee %s, cannot "
+		 "inline since callee's options set isn't a subset of "
+		 "caller's options set.\n",
+		 get_decl_name (caller), get_decl_name (callee));
+      return false;
+    }
 
-  else
+  /* For those options that the callee has explicitly enabled or disabled,
+     then we must enforce that the callee's and caller's options match
+     exactly; see PR70010.  */
+  HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
+  if (always_inline)
+    explicit_isa &= ~always_inline_safe_mask;
+  if ((caller_isa & explicit_isa) != (callee_isa & explicit_isa))
     {
-      HOST_WIDE_INT caller_isa;
-      struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
-      HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
-      HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
-
-      /* If the caller has option attributes, then use them.
-	 Otherwise, use the command line options.  */
-      if (caller_tree)
-	caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags;
-      else
-	caller_isa = rs6000_isa_flags;
-
-      /* The callee's options must be a subset of the caller's options, i.e.
-	 a vsx function may inline an altivec function, but a no-vsx function
-	 must not inline a vsx function.  However, for those options that the
-	 callee has explicitly enabled or disabled, then we must enforce that
-	 the callee's and caller's options match exactly; see PR70010.  */
-      if (((caller_isa & callee_isa) == callee_isa)
-	  && (caller_isa & explicit_isa) == (callee_isa & explicit_isa))
-	ret = true;
+      if (TARGET_DEBUG_TARGET)
+	fprintf (stderr,
+		 "rs6000_can_inline_p:, caller %s, callee %s, cannot "
+		 "inline since callee's options set isn't a subset of "
+		 "caller's options set by considering callee's "
+		 "explicitly set options.\n",
+		 get_decl_name (caller), get_decl_name (callee));
+      return false;
     }
 
   if (TARGET_DEBUG_TARGET)
-    fprintf (stderr, "rs6000_can_inline_p:, caller %s, callee %s, %s inline\n",
-	     get_decl_name (caller), get_decl_name (callee),
-	     (ret ? "can" : "cannot"));
+    fprintf (stderr,
+	     "rs6000_can_inline_p:, caller %s, callee %s, can inline.\n",
+	     get_decl_name (caller), get_decl_name (callee));
 
-  return ret;
+  return true;
 }
+
 \f
 /* Allocate a stack temp and fixup the address so it meets the particular
    memory requirements (either offetable or REG+REG addressing).  */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-1.c b/gcc/testsuite/gcc.target/powerpc/pr102059-1.c
new file mode 100644
index 00000000000..d2a002cf141
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_htm_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+
+/* Verify it emits inlining error msg at non-LTO mode.  */
+
+#include <htmintrin.h>
+
+static inline int __attribute__ ((always_inline))
+foo (int *b) /* { dg-error "inlining failed in call to .* target specific option mismatch" } */
+{
+  int res = _HTM_STATE(__builtin_ttest());
+  *b += res;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int
+bar (int *a)
+{
+  *a = foo (a); /* { dg-message "called from here" } */
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-2.c b/gcc/testsuite/gcc.target/powerpc/pr102059-2.c
new file mode 100644
index 00000000000..1d5d6c38bf3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-vsx" } */
+
+/* Verify it emits inlining error msg when the callee has explicit
+   disabling option from command line.  */
+
+vector int c, a, b;
+
+static inline void __attribute__ ((__always_inline__))
+foo () /* { dg-error "inlining failed in call to .* target specific option mismatch" } */
+{
+  c = a + b;
+}
+
+__attribute__ ((target ("vsx")))
+int main ()
+{
+  foo (); /* { dg-message "called from here" } */
+  c = a + b;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-3.c b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
new file mode 100644
index 00000000000..9684cab986a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
@@ -0,0 +1,95 @@
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -Wno-attributes" } */
+
+/* Verify it doesn't emit inlining error msg since some mismatched
+   features are considered as safe for always_inline.  */
+
+/* 1. Callee enables Power8 fusion implicitly, while caller
+      with Power9 doesn't support power8 fusion at all.  */
+
+__attribute__ ((always_inline))
+int callee1 (int *b)
+{
+  *b += 1;
+  return *b;
+}
+
+#pragma GCC target "cpu=power9"
+int caller1 (int *a)
+{
+  *a = callee1 (a);
+  return 0;
+}
+
+/* 2. Caller enables indirect toc save feature while callee
+      disables it explicitly.  */
+
+#pragma GCC target "save-toc-indirect"
+__attribute__ ((always_inline))
+int callee2 (int *b)
+{
+  *b += 2;
+  return *b;
+}
+
+#pragma GCC target "no-save-toc-indirect"
+int caller2 (int *a)
+{
+  *a = callee2 (a);
+  return 0;
+}
+
+/* 3. Caller disables Power10 fusion explicitly, while callee
+      still supports it as Power10 turns it on by default.  */
+
+#pragma GCC target "cpu=power10"
+__attribute__ ((always_inline))
+int callee3 (int *b)
+{
+  *b += 3;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10,no-power10-fusion"
+int caller3 (int *a)
+{
+  *a = callee3 (a);
+  return 0;
+}
+
+/* 4. Caller enables Power10 fusion implicitly, while callee
+      disables it explicitly.  */
+
+#pragma GCC target "no-power10-fusion"
+__attribute__ ((always_inline))
+int callee4 (int *b)
+{
+  *b += 4;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int caller4 (int *a)
+{
+  *a = callee4 (a);
+  return 0;
+}
+
+/* 5. Caller disables pcrel-opt while callee enables it explicitly.  */
+
+#pragma GCC target "cpu=power10,no-pcrel-opt"
+__attribute__ ((always_inline))
+int callee5 (int *b)
+{
+  *b += 5;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10,pcrel-opt"
+int caller5 (int *a)
+{
+  *a = callee5 (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-4.c b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
new file mode 100644
index 00000000000..0f27f2ce7d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mno-power8-fusion -Wno-attributes" } */
+
+/* Verify it doesn't emit inlining error msg since the flag power8
+   fusion is considered as safe for always_inline, it's still safe
+   even the flag is set explicitly.  */
+
+__attribute__ ((always_inline))
+int foo (int *b)
+{
+  *b += 10;
+  return *b;
+}
+
+#pragma GCC target "power8-fusion"
+int bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-5.c b/gcc/testsuite/gcc.target/powerpc/pr102059-5.c
new file mode 100644
index 00000000000..826ce88025f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-5.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_htm_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mhtm" } */
+
+/* Verify the inlining won't perform when the callee requires
+   some target feature which isn't supported by caller, even
+   if the callee doesn't have any target attributes or pragmas.
+   If the inlining performs here, the compilation will fail.  */
+
+int
+foo (int *b)
+{
+  *b += __builtin_ttest ();
+  return *b;
+}
+
+__attribute__ ((target ("no-htm"))) int
+bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-12-03  3:46   ` [PATCH v2] " Kewen.Lin
@ 2021-12-03 23:23     ` Peter Bergner
  2021-12-06  9:35       ` Martin Liška
  2021-12-06 13:06     ` Segher Boessenkool
  1 sibling, 1 reply; 17+ messages in thread
From: Peter Bergner @ 2021-12-03 23:23 UTC (permalink / raw)
  To: Kewen.Lin, Segher Boessenkool
  Cc: Bill Schmidt, GCC Patches, David Edelsohn, Michael Meissner,
	Martin Liška, Richard Biener

On 12/2/21 9:46 PM, Kewen.Lin via Gcc-patches wrote:
> on 2021/11/30 上午12:57, Segher Boessenkool wrote:
>> On Wed, Sep 01, 2021 at 02:55:51PM +0800, Kewen.Lin wrote:
>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>> and LTO mode.  As Martin pointed out, currently the function
>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>> NULL, but it's wrong, we should use the command line options
>>> from target_option_default_node as default.
>>
>> This is not documented.
>>
> 
> Yeah, but according to the document for the target attribute [1],
> "Multiple target back ends implement the target attribute to specify
> that a function is to be compiled with different target options than
> specified on the command line. The original target command-line options
> are ignored. ", it seems to say the function without any target
> attribute/pragma will be compiled with target options specified on the
> command line.  I think it's a normal expectation for users.
> 
> Excepting for the inconsistent behaviors between LTO and non-LTO,
> it can also make the below case different.

I thought Martin and richi mentioned that target attribute options
are treated as if they are appended to the end of the command line
options, so they can potentially override earlier options, but they
don't actually ignore them?

Peter


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-12-03 23:23     ` Peter Bergner
@ 2021-12-06  9:35       ` Martin Liška
  2021-12-06 14:40         ` Peter Bergner
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Liška @ 2021-12-06  9:35 UTC (permalink / raw)
  To: Peter Bergner, Kewen.Lin, Segher Boessenkool
  Cc: Bill Schmidt, GCC Patches, David Edelsohn, Michael Meissner,
	Richard Biener

On 12/4/21 00:23, Peter Bergner wrote:
> On 12/2/21 9:46 PM, Kewen.Lin via Gcc-patches wrote:
>> on 2021/11/30 上午12:57, Segher Boessenkool wrote:
>>> On Wed, Sep 01, 2021 at 02:55:51PM +0800, Kewen.Lin wrote:
>>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>>> and LTO mode.  As Martin pointed out, currently the function
>>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>>> NULL, but it's wrong, we should use the command line options
>>>> from target_option_default_node as default.
>>>
>>> This is not documented.
>>>
>>
>> Yeah, but according to the document for the target attribute [1],
>> "Multiple target back ends implement the target attribute to specify
>> that a function is to be compiled with different target options than
>> specified on the command line. The original target command-line options
>> are ignored. ", it seems to say the function without any target
>> attribute/pragma will be compiled with target options specified on the
>> command line.  I think it's a normal expectation for users.
>>
>> Excepting for the inconsistent behaviors between LTO and non-LTO,
>> it can also make the below case different.
> 
> I thought Martin and richi mentioned that target attribute options
> are treated as if they are appended to the end of the command line
> options, so they can potentially override earlier options, but they
> don't actually ignore them?

No, the described behavior is true for optimize attribute:

optimize (string, …)
...
The optimize attribute arguments of a function behave behave as if appended to the command-line.

but:

target (string, …)
...
The original target command-line options are ignored.

As seen here:
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

Cheers,
Martin

> 
> Peter
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-12-03  3:46   ` [PATCH v2] " Kewen.Lin
  2021-12-03 23:23     ` Peter Bergner
@ 2021-12-06 13:06     ` Segher Boessenkool
  2021-12-06 14:52       ` Peter Bergner
  2021-12-07  3:42       ` Kewen.Lin
  1 sibling, 2 replies; 17+ messages in thread
From: Segher Boessenkool @ 2021-12-06 13:06 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, David Edelsohn, Bill Schmidt, Michael Meissner

On Fri, Dec 03, 2021 at 11:46:53AM +0800, Kewen.Lin wrote:
> >> This patch is to fix the inconsistent behaviors for non-LTO mode
> >> and LTO mode.  As Martin pointed out, currently the function
> >> rs6000_can_inline_p simply makes it inlinable if callee_tree is
> >> NULL, but it's wrong, we should use the command line options
> >> from target_option_default_node as default.
> > 
> > This is not documented.
> 
> Yeah, but according to the document for the target attribute [1],
> "Multiple target back ends implement the target attribute to specify
> that a function is to be compiled with different target options than
> specified on the command line. The original target command-line options
> are ignored. ", it seems to say the function without any target
> attribute/pragma will be compiled with target options specified on the
> command line.  I think it's a normal expectation for users.

The whole target_option_default_node is not documented, I meant.

> Excepting for the inconsistent behaviors between LTO and non-LTO,
> it can also make the below case different.
> 
> // Option: -O2 -mcpu=power8 -mhtm
> 
> int foo(int *b) {
>   *b += __builtin_ttest();
>   return *b;
> }
> 
> __attribute__((target("no-htm"))) int bar(int *a) {
>   *a = foo(a);
>   return 0;
> }
> 
> Without this fix, bar inlines foo but get the error as below:
> 
> In function ‘foo’,
>     inlined from ‘bar’ at test.c:8:8:
> test.c:3:9: error: ‘__builtin_ttest’ requires the ‘-mhtm’ option
>     3 |   *b += __builtin_ttest();
>       |         ^~~~~~~~~~~~~~~~~
> 
> Since bar doesn't support foo's required HTM feature.
> 
> With this fix, bar doesn't inline foo and the case gets compiled well.

And if you inline something no-htm into something that allows htm?  That
should work fine, be allowed just fine.

> >> It also replaces
> >> rs6000_isa_flags with the one from target_option_default_node
> >> when caller_tree is NULL as rs6000_isa_flags could probably
> >> change since initialization.
> > 
> > Is this enough then?  Are there no other places that use
> > rs6000_isa_flags?  Is it correct for us to have that variable at all
> > anymore?  Etc.
> 
> Good question, I think the answer is yes.  I digged into it and found
> the current target option save/restore scheme would keep rs6000_isa_flags
> to be the same as the one saved in target_option_default_node for the context
> of inlining.  So I put one assertion as below:
> 
>     if (!caller_tree) {
>       caller_tree = target_option_default_node;
>       gcc_assert (rs6000_isa_flags
>                   == TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags);
>     }
> 
> And kicked off regression testings on both BE and LE, the result showed
> only one same failure, which seems to be a latent bug.  I've filed PR103515
> for it.

Ah cool :-)  Please send a patch to add that asser then (well, once we
can bootstrap with it ;-) )

> This bug also indicates it's better to use target_option_default_node
> rather than rs6000_isa_flags when the caller_tree is NULL.  Since
> the target_option_default_node is straightforward and doesn't rely
> on the assumption that rs6000_isa_flags would be kept as the default
> one correctly, it doesn't suffer from bugs like PR103515.

So we should not ever use rs6000_isa_flags anymore?

> > The fusion ones are obvious.  The other two are not.  Please put those
> > two more obviously separate (not somewhere in the sea of fusion
> > options), and some comment wouldn't hurt either.  You can make it
> > separate statements even, make it really stand out.
> > 
> 
> Fixed by splitting it into: fusion_options_mask and optimization_options_mask.
> I'm not happy for the later name, but poor to get a better in mind.  Do you
> have any suggestions on the name and words?

You dont have to split it into variables, as you found out then you get
the usual naming problem.  But you can just split it in the code:

  if (important_condition
      || another_important_one
      /* comment explaining things */
      || bla1 || bla2 || bla3 || bla4 || bla5)

> > Why are there OPTION_MASKs for separate P10 fusion types here, as well as
> > MASK_P10_FUSION?
> 
> Mike helped to explain the history, I've updated all of them using OPTION_MASK_
> to avoid potential confusion.

That is one thing, sure, but why are both needed?  Both the "main" flag,
and the "details" flags.

(The latter should soon go away btw).


Segher

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-12-06  9:35       ` Martin Liška
@ 2021-12-06 14:40         ` Peter Bergner
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Bergner @ 2021-12-06 14:40 UTC (permalink / raw)
  To: Martin Liška, Kewen.Lin, Segher Boessenkool
  Cc: Bill Schmidt, GCC Patches, David Edelsohn, Michael Meissner,
	Richard Biener

On 12/6/21 3:35 AM, Martin Liška wrote:
> On 12/4/21 00:23, Peter Bergner wrote:
>> I thought Martin and richi mentioned that target attribute options
>> are treated as if they are appended to the end of the command line
>> options, so they can potentially override earlier options, but they
>> don't actually ignore them?
> 
> No, the described behavior is true for optimize attribute:
> 
> optimize (string, …)
> ...
> The optimize attribute arguments of a function behave behave as if appended to the command-line.
> 
> but:
> 
> target (string, …)
> ...
> The original target command-line options are ignored.

Ok, you learn something new every day.  Thanks for setting me straight!

Peter



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-12-06 13:06     ` Segher Boessenkool
@ 2021-12-06 14:52       ` Peter Bergner
  2021-12-07  3:42       ` Kewen.Lin
  1 sibling, 0 replies; 17+ messages in thread
From: Peter Bergner @ 2021-12-06 14:52 UTC (permalink / raw)
  To: Segher Boessenkool, Kewen.Lin
  Cc: Bill Schmidt, GCC Patches, David Edelsohn, Michael Meissner

On 12/6/21 7:06 AM, Segher Boessenkool wrote:
> On Fri, Dec 03, 2021 at 11:46:53AM +0800, Kewen.Lin wrot
>> Without this fix, bar inlines foo but get the error as below:
>>
>> In function ‘foo’,
>>     inlined from ‘bar’ at test.c:8:8:
>> test.c:3:9: error: ‘__builtin_ttest’ requires the ‘-mhtm’ option
>>     3 |   *b += __builtin_ttest();
>>       |         ^~~~~~~~~~~~~~~~~
>>
>> Since bar doesn't support foo's required HTM feature.
>>
>> With this fix, bar doesn't inline foo and the case gets compiled well.
> 
> And if you inline something no-htm into something that allows htm?  That
> should work fine, be allowed just fine.

It is only ok to inline a function to be compiled with no-htm into a
function that will be compiled with htm if the no-htm function is
being compiled with no-htm by default.  If the user explicitly used
-fno-htm on the function, then we should not inline the htm function
into it.  Ditto for the other way around.  Meaning, the caller's
option flags should be a superset of the callee's flags, but the
explicit flags used on both functions need to match exactly.

Peter


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]
  2021-12-06 13:06     ` Segher Boessenkool
  2021-12-06 14:52       ` Peter Bergner
@ 2021-12-07  3:42       ` Kewen.Lin
  1 sibling, 0 replies; 17+ messages in thread
From: Kewen.Lin @ 2021-12-07  3:42 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: GCC Patches, David Edelsohn, Bill Schmidt, Michael Meissner, Peter

[-- Attachment #1: Type: text/plain, Size: 7096 bytes --]

Hi Segher,

on 2021/12/6 下午9:06, Segher Boessenkool wrote:
> On Fri, Dec 03, 2021 at 11:46:53AM +0800, Kewen.Lin wrote:
>>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>>> and LTO mode.  As Martin pointed out, currently the function
>>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>>> NULL, but it's wrong, we should use the command line options
>>>> from target_option_default_node as default.
>>>
>>> This is not documented.
>>
>> Yeah, but according to the document for the target attribute [1],
>> "Multiple target back ends implement the target attribute to specify
>> that a function is to be compiled with different target options than
>> specified on the command line. The original target command-line options
>> are ignored. ", it seems to say the function without any target
>> attribute/pragma will be compiled with target options specified on the
>> command line.  I think it's a normal expectation for users.
> 
> The whole target_option_default_node is not documented, I meant.
> 

Ah, you meant that.  Yeah, it can be improved separately.

>> Excepting for the inconsistent behaviors between LTO and non-LTO,
>> it can also make the below case different.
>>
>> // Option: -O2 -mcpu=power8 -mhtm
>>
>> int foo(int *b) {
>>   *b += __builtin_ttest();
>>   return *b;
>> }
>>
>> __attribute__((target("no-htm"))) int bar(int *a) {
>>   *a = foo(a);
>>   return 0;
>> }
>>
>> Without this fix, bar inlines foo but get the error as below:
>>
>> In function ‘foo’,
>>     inlined from ‘bar’ at test.c:8:8:
>> test.c:3:9: error: ‘__builtin_ttest’ requires the ‘-mhtm’ option
>>     3 |   *b += __builtin_ttest();
>>       |         ^~~~~~~~~~~~~~~~~
>>
>> Since bar doesn't support foo's required HTM feature.
>>
>> With this fix, bar doesn't inline foo and the case gets compiled well.
> 
> And if you inline something no-htm into something that allows htm?  That
> should work fine, be allowed just fine.
> 

Thanks for helping to answer this, Peter!

>>>> It also replaces
>>>> rs6000_isa_flags with the one from target_option_default_node
>>>> when caller_tree is NULL as rs6000_isa_flags could probably
>>>> change since initialization.
>>>
>>> Is this enough then?  Are there no other places that use
>>> rs6000_isa_flags?  Is it correct for us to have that variable at all
>>> anymore?  Etc.
>>
>> Good question, I think the answer is yes.  I digged into it and found
>> the current target option save/restore scheme would keep rs6000_isa_flags
>> to be the same as the one saved in target_option_default_node for the context
>> of inlining.  So I put one assertion as below:
>>
>>     if (!caller_tree) {
>>       caller_tree = target_option_default_node;
>>       gcc_assert (rs6000_isa_flags
>>                   == TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags);
>>     }
>>
>> And kicked off regression testings on both BE and LE, the result showed
>> only one same failure, which seems to be a latent bug.  I've filed PR103515
>> for it.
> 
> Ah cool :-)  Please send a patch to add that asser then (well, once we
> can bootstrap with it ;-) )
> 

OK.

>> This bug also indicates it's better to use target_option_default_node
>> rather than rs6000_isa_flags when the caller_tree is NULL.  Since
>> the target_option_default_node is straightforward and doesn't rely
>> on the assumption that rs6000_isa_flags would be kept as the default
>> one correctly, it doesn't suffer from bugs like PR103515.
> 
> So we should not ever use rs6000_isa_flags anymore?
> 

If the question is for the scope in function rs6000_can_inline_p, yes.

Before this patch, the only use of rs6000_isa_flags is:

   /* If the caller has option attributes, then use them.
      Otherwise, use the command line options.  */
      if (caller_tree)
	caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags;
      else
	caller_isa = rs6000_isa_flags;

This patch changes this part to be like (logically):

      if (caller_tree)
	caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags;
      else
	caller_isa = TREE_TARGET_OPTION (target_option_default_node)->x_rs6000_isa_flags;

If the question is for the others out of function rs6000_can_inline_p, no.
The rs6000_isa_flags holds the flags for the current context of either top level
or some given function decl.  As function rs6000_set_current_function shows, we
already consider target_option_default_node for the case that there is NULL
DECL_FUNCTION_SPECIFIC_TARGET for one decl.

>>> The fusion ones are obvious.  The other two are not.  Please put those
>>> two more obviously separate (not somewhere in the sea of fusion
>>> options), and some comment wouldn't hurt either.  You can make it
>>> separate statements even, make it really stand out.
>>>
>>
>> Fixed by splitting it into: fusion_options_mask and optimization_options_mask.
>> I'm not happy for the later name, but poor to get a better in mind.  Do you
>> have any suggestions on the name and words?
> 
> You dont have to split it into variables, as you found out then you get
> the usual naming problem.  But you can just split it in the code:
> 
>   if (important_condition
>       || another_important_one
>       /* comment explaining things */
>       || bla1 || bla2 || bla3 || bla4 || bla5)
> 

Got it, thanks.  Removed those two variables and updated it to:

  /* Some features can be tolerated for always inlines.  */
  unsigned HOST_WIDE_INT always_inline_safe_mask
    /* Fusion option masks.  */
    = OPTION_MASK_P8_FUSION | OPTION_MASK_P10_FUSION
      | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION
      | OPTION_MASK_P10_FUSION_LD_CMPI | OPTION_MASK_P10_FUSION_2LOGICAL
      | OPTION_MASK_P10_FUSION_LOGADD | OPTION_MASK_P10_FUSION_ADDLOG
      | OPTION_MASK_P10_FUSION_2ADD
      /* Like fusion, some option masks which are just for optimization.  */
      | OPTION_MASK_SAVE_TOC_INDIRECT | OPTION_MASK_PCREL_OPT;

>>> Why are there OPTION_MASKs for separate P10 fusion types here, as well as
>>> MASK_P10_FUSION?
>>
>> Mike helped to explain the history, I've updated all of them using OPTION_MASK_
>> to avoid potential confusion.
> 
> That is one thing, sure, but why are both needed?  Both the "main" flag,
> and the "details" flags.

Because they are taken as independent bits in the below checking with bitwise
manipulation, like:

  (caller_isa & callee_isa) == callee_isa)

> 
> (The latter should soon go away btw).
> 
> 
> Segher
> 

The updated patch is attached, just changing always_inline_safe_mask
should be no any functional changes comparing to the previous version.

Does it look good to you?

BR,
Kewen
---
gcc/ChangeLog:

	PR ipa/102059
	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
	target_option_default_node and consider always_inline_safe flags.

gcc/testsuite/ChangeLog:

	PR ipa/102059
	* gcc.target/powerpc/pr102059-1.c: New test.
	* gcc.target/powerpc/pr102059-2.c: New test.
	* gcc.target/powerpc/pr102059-3.c: New test.
	* gcc.target/powerpc/pr102059-4.c: New test.
	* gcc.target/powerpc/pr102059-5.c: New test.

-----

[-- Attachment #2: PR102059_v3.patch --]
[-- Type: text/plain, Size: 10829 bytes --]

---
 gcc/config/rs6000/rs6000.c                    | 91 ++++++++++++------
 gcc/testsuite/gcc.target/powerpc/pr102059-1.c | 24 +++++
 gcc/testsuite/gcc.target/powerpc/pr102059-2.c | 20 ++++
 gcc/testsuite/gcc.target/powerpc/pr102059-3.c | 95 +++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr102059-4.c | 22 +++++
 gcc/testsuite/gcc.target/powerpc/pr102059-5.c | 22 +++++
 6 files changed, 246 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-5.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 289c1b3df24..856bd271a14 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -25539,45 +25539,80 @@ rs6000_generate_version_dispatcher_body (void *node_p)
 static bool
 rs6000_can_inline_p (tree caller, tree callee)
 {
-  bool ret = false;
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
 
-  /* If the callee has no option attributes, then it is ok to inline.  */
+  /* If the caller/callee has option attributes, then use them.
+     Otherwise, use the command line options.  */
   if (!callee_tree)
-    ret = true;
+    callee_tree = target_option_default_node;
+  if (!caller_tree)
+    caller_tree = target_option_default_node;
+
+  struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
+  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+  HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags;
+  HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
+
+  bool always_inline
+    = DECL_DISREGARD_INLINE_LIMITS (callee)
+      && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee));
+
+  /* Some features can be tolerated for always inlines.  */
+  unsigned HOST_WIDE_INT always_inline_safe_mask
+    /* Fusion option masks.  */
+    = OPTION_MASK_P8_FUSION | OPTION_MASK_P10_FUSION
+      | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION
+      | OPTION_MASK_P10_FUSION_LD_CMPI | OPTION_MASK_P10_FUSION_2LOGICAL
+      | OPTION_MASK_P10_FUSION_LOGADD | OPTION_MASK_P10_FUSION_ADDLOG
+      | OPTION_MASK_P10_FUSION_2ADD
+      /* Like fusion, some option masks which are just for optimization.  */
+      | OPTION_MASK_SAVE_TOC_INDIRECT | OPTION_MASK_PCREL_OPT;
+
+  /* To adjust callee is enough since we just check for subset.  */
+  if (always_inline)
+    callee_isa &= ~always_inline_safe_mask;
+
+  /* The callee's options must be a subset of the caller's options, i.e.
+     a vsx function may inline an altivec function, but a no-vsx function
+     must not inline a vsx function.  */
+  if ((caller_isa & callee_isa) != callee_isa)
+    {
+      if (TARGET_DEBUG_TARGET)
+	fprintf (stderr,
+		 "rs6000_can_inline_p:, caller %s, callee %s, cannot "
+		 "inline since callee's options set isn't a subset of "
+		 "caller's options set.\n",
+		 get_decl_name (caller), get_decl_name (callee));
+      return false;
+    }
 
-  else
+  /* For those options that the callee has explicitly enabled or disabled,
+     then we must enforce that the callee's and caller's options match
+     exactly; see PR70010.  */
+  HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
+  if (always_inline)
+    explicit_isa &= ~always_inline_safe_mask;
+  if ((caller_isa & explicit_isa) != (callee_isa & explicit_isa))
     {
-      HOST_WIDE_INT caller_isa;
-      struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
-      HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
-      HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
-
-      /* If the caller has option attributes, then use them.
-	 Otherwise, use the command line options.  */
-      if (caller_tree)
-	caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags;
-      else
-	caller_isa = rs6000_isa_flags;
-
-      /* The callee's options must be a subset of the caller's options, i.e.
-	 a vsx function may inline an altivec function, but a no-vsx function
-	 must not inline a vsx function.  However, for those options that the
-	 callee has explicitly enabled or disabled, then we must enforce that
-	 the callee's and caller's options match exactly; see PR70010.  */
-      if (((caller_isa & callee_isa) == callee_isa)
-	  && (caller_isa & explicit_isa) == (callee_isa & explicit_isa))
-	ret = true;
+      if (TARGET_DEBUG_TARGET)
+	fprintf (stderr,
+		 "rs6000_can_inline_p:, caller %s, callee %s, cannot "
+		 "inline since callee's options set isn't a subset of "
+		 "caller's options set by considering callee's "
+		 "explicitly set options.\n",
+		 get_decl_name (caller), get_decl_name (callee));
+      return false;
     }
 
   if (TARGET_DEBUG_TARGET)
-    fprintf (stderr, "rs6000_can_inline_p:, caller %s, callee %s, %s inline\n",
-	     get_decl_name (caller), get_decl_name (callee),
-	     (ret ? "can" : "cannot"));
+    fprintf (stderr,
+	     "rs6000_can_inline_p:, caller %s, callee %s, can inline.\n",
+	     get_decl_name (caller), get_decl_name (callee));
 
-  return ret;
+  return true;
 }
+
 \f
 /* Allocate a stack temp and fixup the address so it meets the particular
    memory requirements (either offetable or REG+REG addressing).  */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-1.c b/gcc/testsuite/gcc.target/powerpc/pr102059-1.c
new file mode 100644
index 00000000000..d2a002cf141
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_htm_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+
+/* Verify it emits inlining error msg at non-LTO mode.  */
+
+#include <htmintrin.h>
+
+static inline int __attribute__ ((always_inline))
+foo (int *b) /* { dg-error "inlining failed in call to .* target specific option mismatch" } */
+{
+  int res = _HTM_STATE(__builtin_ttest());
+  *b += res;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int
+bar (int *a)
+{
+  *a = foo (a); /* { dg-message "called from here" } */
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-2.c b/gcc/testsuite/gcc.target/powerpc/pr102059-2.c
new file mode 100644
index 00000000000..1d5d6c38bf3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-vsx" } */
+
+/* Verify it emits inlining error msg when the callee has explicit
+   disabling option from command line.  */
+
+vector int c, a, b;
+
+static inline void __attribute__ ((__always_inline__))
+foo () /* { dg-error "inlining failed in call to .* target specific option mismatch" } */
+{
+  c = a + b;
+}
+
+__attribute__ ((target ("vsx")))
+int main ()
+{
+  foo (); /* { dg-message "called from here" } */
+  c = a + b;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-3.c b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
new file mode 100644
index 00000000000..9684cab986a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
@@ -0,0 +1,95 @@
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -Wno-attributes" } */
+
+/* Verify it doesn't emit inlining error msg since some mismatched
+   features are considered as safe for always_inline.  */
+
+/* 1. Callee enables Power8 fusion implicitly, while caller
+      with Power9 doesn't support power8 fusion at all.  */
+
+__attribute__ ((always_inline))
+int callee1 (int *b)
+{
+  *b += 1;
+  return *b;
+}
+
+#pragma GCC target "cpu=power9"
+int caller1 (int *a)
+{
+  *a = callee1 (a);
+  return 0;
+}
+
+/* 2. Caller enables indirect toc save feature while callee
+      disables it explicitly.  */
+
+#pragma GCC target "save-toc-indirect"
+__attribute__ ((always_inline))
+int callee2 (int *b)
+{
+  *b += 2;
+  return *b;
+}
+
+#pragma GCC target "no-save-toc-indirect"
+int caller2 (int *a)
+{
+  *a = callee2 (a);
+  return 0;
+}
+
+/* 3. Caller disables Power10 fusion explicitly, while callee
+      still supports it as Power10 turns it on by default.  */
+
+#pragma GCC target "cpu=power10"
+__attribute__ ((always_inline))
+int callee3 (int *b)
+{
+  *b += 3;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10,no-power10-fusion"
+int caller3 (int *a)
+{
+  *a = callee3 (a);
+  return 0;
+}
+
+/* 4. Caller enables Power10 fusion implicitly, while callee
+      disables it explicitly.  */
+
+#pragma GCC target "no-power10-fusion"
+__attribute__ ((always_inline))
+int callee4 (int *b)
+{
+  *b += 4;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int caller4 (int *a)
+{
+  *a = callee4 (a);
+  return 0;
+}
+
+/* 5. Caller disables pcrel-opt while callee enables it explicitly.  */
+
+#pragma GCC target "cpu=power10,no-pcrel-opt"
+__attribute__ ((always_inline))
+int callee5 (int *b)
+{
+  *b += 5;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10,pcrel-opt"
+int caller5 (int *a)
+{
+  *a = callee5 (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-4.c b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
new file mode 100644
index 00000000000..0f27f2ce7d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mno-power8-fusion -Wno-attributes" } */
+
+/* Verify it doesn't emit inlining error msg since the flag power8
+   fusion is considered as safe for always_inline, it's still safe
+   even the flag is set explicitly.  */
+
+__attribute__ ((always_inline))
+int foo (int *b)
+{
+  *b += 10;
+  return *b;
+}
+
+#pragma GCC target "power8-fusion"
+int bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-5.c b/gcc/testsuite/gcc.target/powerpc/pr102059-5.c
new file mode 100644
index 00000000000..826ce88025f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-5.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_htm_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mhtm" } */
+
+/* Verify the inlining won't perform when the callee requires
+   some target feature which isn't supported by caller, even
+   if the callee doesn't have any target attributes or pragmas.
+   If the inlining performs here, the compilation will fail.  */
+
+int
+foo (int *b)
+{
+  *b += __builtin_ttest ();
+  return *b;
+}
+
+__attribute__ ((target ("no-htm"))) int
+bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-12-07  3:42 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-01  6:55 [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059] Kewen.Lin
2021-09-15  8:42 ` PING^1 " Kewen.Lin
2021-09-28  8:58   ` PING^2 " Kewen.Lin
2021-10-13  2:33     ` PING^3 " Kewen.Lin
2021-10-20  9:30       ` PING^4 " Kewen.Lin
2021-11-04 10:57         ` PING^5 " Kewen.Lin
2021-11-22  2:24           ` PING^6 " Kewen.Lin
2021-11-29 16:57 ` Segher Boessenkool
2021-12-03  0:51   ` Michael Meissner
2021-12-03  3:30     ` Kewen.Lin
2021-12-03  3:46   ` [PATCH v2] " Kewen.Lin
2021-12-03 23:23     ` Peter Bergner
2021-12-06  9:35       ` Martin Liška
2021-12-06 14:40         ` Peter Bergner
2021-12-06 13:06     ` Segher Boessenkool
2021-12-06 14:52       ` Peter Bergner
2021-12-07  3:42       ` Kewen.Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).