public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]
@ 2023-01-04  9:20 Kewen.Lin
  2023-01-04 10:46 ` Segher Boessenkool
  0 siblings, 1 reply; 5+ messages in thread
From: Kewen.Lin @ 2023-01-04  9:20 UTC (permalink / raw)
  To: GCC Patches; +Cc: Segher Boessenkool, David Edelsohn, Peter Bergner

Hi,

As Honza pointed out in [1], the current uses of function
optimize_function_for_speed_p in rs6000_option_override_internal
are too early, since the query results from the functions
optimize_function_for_{speed,size}_p could be changed later due
to profile feedback and some function attributes handlings etc.

This patch is to move optimize_function_for_speed_p to all the
use places of the corresponding flags, which follows the existing
practices.  Maybe we can cache it somewhere at an appropriate
timing, but that's another thing.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607527.html

Bootstrapped and regtested on powerpc64-linux-gnu P8 and
powerpc64le-linux-gnu P9 and P10.

I'm going to push this soon if no objections.

BR,
Kewen
-----
	PR target/108184

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_option_override_internal): Remove
	all optimize_function_for_speed_p uses.
	* config/rs6000/predicates.md (fusion_gpr_mem_load): Call
	optimize_function_for_speed_p along with TARGET_P8_FUSION_SIGN.
	(fusion_gpr_load_p): Likewise.
	(expand_fusion_gpr_load): Likewise.
	(rs6000_call_aix): Call optimize_function_for_speed_p along with
	TARGET_SAVE_TOC_INDIRECT.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr108184.c: New test.
---
 gcc/config/rs6000/predicates.md             |  4 +++-
 gcc/config/rs6000/rs6000.cc                 | 16 ++++++++++------
 gcc/testsuite/gcc.target/powerpc/pr108184.c | 15 +++++++++++++++
 3 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108184.c

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index b1fcc69bb60..11f1779e7bf 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1878,7 +1878,9 @@ (define_predicate "fusion_gpr_mem_load"

   /* Handle sign/zero extend.  */
   if (GET_CODE (op) == ZERO_EXTEND
-      || (TARGET_P8_FUSION_SIGN && GET_CODE (op) == SIGN_EXTEND))
+      || (TARGET_P8_FUSION_SIGN
+	  && GET_CODE (op) == SIGN_EXTEND
+	  && optimize_function_for_speed_p (cfun)))
     {
       op = XEXP (op, 0);
       mode = GET_MODE (op);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index eb7ad5e954f..bbf829f45d9 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3978,8 +3978,7 @@ rs6000_option_override_internal (bool global_init_p)
   /* If we can shrink-wrap the TOC register save separately, then use
      -msave-toc-indirect unless explicitly disabled.  */
   if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
-      && flag_shrink_wrap_separate
-      && optimize_function_for_speed_p (cfun))
+      && flag_shrink_wrap_separate)
     rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;

   /* Enable power8 fusion if we are tuning for power8, even if we aren't
@@ -4013,7 +4012,6 @@ rs6000_option_override_internal (bool global_init_p)
      zero extending load, and an explicit sign extension.  */
   if (TARGET_P8_FUSION
       && !(rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION_SIGN)
-      && optimize_function_for_speed_p (cfun)
       && optimize >= 3)
     rs6000_isa_flags |= OPTION_MASK_P8_FUSION_SIGN;

@@ -25604,7 +25602,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)

 	  /* Can we optimize saving the TOC in the prologue or
 	     do we need to do it at every call?  */
-	  if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
+	  if (TARGET_SAVE_TOC_INDIRECT
+	      && !cfun->calls_alloca
+	      && optimize_function_for_speed_p (cfun))
 	    cfun->machine->save_toc_in_prologue = true;
 	  else
 	    {
@@ -27471,7 +27471,9 @@ fusion_gpr_load_p (rtx addis_reg,	/* register set via addis.  */

   /* Allow sign/zero extension.  */
   if (GET_CODE (mem) == ZERO_EXTEND
-      || (GET_CODE (mem) == SIGN_EXTEND && TARGET_P8_FUSION_SIGN))
+      || (GET_CODE (mem) == SIGN_EXTEND
+	  && TARGET_P8_FUSION_SIGN
+	  && optimize_function_for_speed_p (cfun)))
     mem = XEXP (mem, 0);

   if (!MEM_P (mem))
@@ -27535,7 +27537,9 @@ expand_fusion_gpr_load (rtx *operands)
   enum rtx_code extend = UNKNOWN;

   if (GET_CODE (orig_mem) == ZERO_EXTEND
-      || (TARGET_P8_FUSION_SIGN && GET_CODE (orig_mem) == SIGN_EXTEND))
+      || (TARGET_P8_FUSION_SIGN
+	  && GET_CODE (orig_mem) == SIGN_EXTEND
+	  && optimize_function_for_speed_p (cfun)))
     {
       extend = GET_CODE (orig_mem);
       orig_mem = XEXP (orig_mem, 0);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108184.c b/gcc/testsuite/gcc.target/powerpc/pr108184.c
new file mode 100644
index 00000000000..8f1e91d9258
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr108184.c
@@ -0,0 +1,15 @@
+/* Only possible to fuse sign extended loads with the addis when
+   optimize >= 3 and Power8 fusion takes effects.  */
+/* { dg-options "-mdejagnu-tune=power8 -O3" } */
+
+/* Verify it doesn't optimize this function for speed as it's cold,
+   by checking it doesn't try to fuse sign extended loads with addis,
+   which results in a zero extended load and a sign extension.  */
+
+__attribute__ ((cold)) int
+fusion_short (short *p)
+{
+  return p[0x12345];
+}
+
+/* { dg-final { scan-assembler-not {\mextsh\M} } } */
--
2.27.0

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-01-05  4:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-04  9:20 [PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184] Kewen.Lin
2023-01-04 10:46 ` Segher Boessenkool
2023-01-04 12:15   ` Kewen.Lin
2023-01-04 14:02     ` Segher Boessenkool
2023-01-05  4:04       ` Kewen.Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).