public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jakub Jelinek <jakub@redhat.com>
To: Vladimir Makarov <vmakarov@redhat.com>,
	Jeff Law <jeffreyalaw@gmail.com>,
	Richard Sandiford <richard.sandiford@arm.com>
Cc: gcc-patches@gcc.gnu.org,
	Surya Kumari Jangala <jskumari@linux.ibm.com>,
	Peter Bergner <bergner@linux.ibm.com>
Subject: [PATCH] lra, v2: emit caller-save register spills before call insn [PR116028]
Date: Fri, 21 Mar 2025 14:00:18 +0100	[thread overview]
Message-ID: <Z91i4oSOe2t9Wsu5@tucnak> (raw)

Hi!

Here is an updated version of Surya's PR116028 fix from August, which got
reverted because it caused bootstrap failures on aarch64, later on bootstrap
comparison errors there as well and problems on other targets as well.

The changes compared to the
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/658973.html
version are:
1) the reason for aarch64 miscompilations and later on bootstrap comparison
   issues as can be seen on the pr118615.c testcase in the patch was that
   when curr_insn is a JUMP_INSN or some cases of CALL_INSNs,
   split_if_necessary is called with before_p true and if it is successful,
   the code set use_insn = PREV_INSN (curr_insn); instead of use_insn =
   curr_insn; and that use_insn is then what is passed to
   add_next_usage_insn; now, if the patch decides to emit the save
   instruction(s) before the first call after curr_insn in the ebb rather
   than before the JUMP_INSN/CALL_INSN, PREV_INSN (curr_insn) is some random
   insn before it, not anything related to the split_reg actions.
   If it is e.g. a DEBUG_INSN in one case vs. some unrelated other insn
   otherwise, that can affect further split_reg within the same function
2) as suggested by Surya in PR118615, it makes no sense to try to change
   behavior if the first call after curr_insn is in the same bb as curr_insn
3) split_reg is actually called sometimes from within inherit_in_ebb but
   sometimes from elsewhere; trying to use whatever last call to
   inherit_in_ebb saw last is a sure way to run into wrong-code issues,
   so instead of clearing the rtx var at the start of inherit_in_ebb it is
   now cleared at the end of it
4) calling the var latest_call_insn was weird, inherit_in_ebb walks the ebb
   backwards, so what the var contains is the first call insn within the
   ebb (after curr_insn)
5) the patch was using
   lra_process_new_insns (PREV_INSN (latest_call_insn), NULL, save,
                          "Add save<-reg");
   to emit the save insn before latest_call_insn.  That feels quite weird
   given that latest_call_insn has explicit support for adding stuff
   before some insn or after some insn, adding something before some
   insn doesn't really need to be done as addition after PREV_INSN
6) some formatting nits + new testcase + removal of xfail even on arm32

Bootstrapped/regtested on x86_64-linux/i686-linux (my usual
--enable-checking=yes,rtl,extra builds), aarch64-linux (normal default
bootstrap) and our distro scratch build
({x86_64,i686,aarch64,powerpc64le,s390x}-linux --enable-checking=release
LTO profiledbootstrap/regtest), I think Sam James tested on 32-bit arm
too.
On aarch64-linux this results in
-FAIL: gcc.dg/pr10474.c scan-rtl-dump pro_and_epilogue "Performing shrink-wrapping"

I admit I don't know the code well nor understood everything it is doing.

I have some concerns:
1) I wonder if there is a guarantee that first_call_insn if non-NULL will be
   always in between curr_insn and usage_insn when call_save_p; I'd hope
   yes because if usage_insn is before first_call_insn in the ebb,
   presumably it wouldn't need to find call save regs because the range
   wouldn't cross any calls
2) I wonder whether it wouldn't be better instead of inserting the saves
   before first_call_insn insert it at the start of the bb containing that
   call (after labels of course); emitting it right before a call could
   mislead code looking for argument slot initialization of the call
3) even when avoiding the use_insn = PREV_INSN (curr_insn);, I wonder
   if it is ok to use use_insn equal to curr_insn rather than the insns
   far later where we actually inserted it, but primarily because I don't
   understand the code much; I think for the !before_p case it is doing
   similar thing  on a shorter distance, the saves were emitted after
   curr_insn and we record it on curr_insn

2025-03-21  Surya Kumari Jangala  <jskumari@linux.ibm.com>
	    Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/116028
	PR rtl-optimization/118615
	* lra-constraints.cc (first_call_insn): New variable.
	(split_reg): Spill register before first_call_insn if call_save_p
	and the call is in a different bb in the ebb.
	(split_if_necessary): Formatting fix.
	(inherit_in_ebb): Set first_call_insn when handling a CALL_INSN.
	For successful split_if_necessary with before_p, only change
	use_insn if it emitted any new instructions before curr_insn.
	Clear first_call_insn before returning.
	
	* gcc.dg/ira-shrinkwrap-prep-1.c: Remove xfail for powerpc.
	* gcc.dg/pr10474.c: Remove xfail for powerpc and arm.
	* gcc.dg/pr118615.c: New test.

--- gcc/lra-constraints.cc.jj	2025-03-19 19:20:41.644440691 +0100
+++ gcc/lra-constraints.cc	2025-03-20 18:40:04.188299643 +0100
@@ -152,6 +152,9 @@ static machine_mode curr_operand_mode[MA
    (e.g. constant) and whose subreg is given operand of the current
    insn.  VOIDmode in all other cases.  */
 static machine_mode original_subreg_reg_mode[MAX_RECOG_OPERANDS];
+/* The first call insn after curr_insn within the EBB during inherit_in_ebb
+   or NULL outside of that function.  */
+static rtx_insn *first_call_insn;
 
 \f
 
@@ -6373,12 +6376,26 @@ split_reg (bool before_p, int original_r
   lra_process_new_insns (as_a <rtx_insn *> (usage_insn),
 			 after_p ? NULL : restore,
 			 after_p ? restore : NULL,
-			 call_save_p
-			 ?  "Add reg<-save" : "Add reg<-split");
-  lra_process_new_insns (insn, before_p ? save : NULL,
-			 before_p ? NULL : save,
-			 call_save_p
-			 ?  "Add save<-reg" : "Add split<-reg");
+			 call_save_p ? "Add reg<-save" : "Add reg<-split");
+  if (call_save_p
+      && first_call_insn != NULL
+      && BLOCK_FOR_INSN (first_call_insn) != BLOCK_FOR_INSN (insn))
+    /* PR116028: If original_regno is a pseudo that has been assigned a
+       call-save hard register, then emit the spill insn before the call
+       insn 'first_call_insn' instead of adjacent to 'insn'.  If 'insn'
+       and 'first_call_insn' belong to the same EBB but to two separate
+       BBs, and if 'insn' is present in the entry BB, then generating the
+       spill insn in the entry BB can prevent shrink wrap from happening.
+       This is because the spill insn references the stack pointer and
+       hence the prolog gets generated in the entry BB itself.  It is
+       also more efficient to generate the spill before
+       'first_call_insn' as the spill now occurs only in the path
+       containing the call.  */
+    lra_process_new_insns (first_call_insn, save, NULL, "Add save<-reg");
+  else
+    lra_process_new_insns (insn, before_p ? save : NULL,
+			   before_p ? NULL : save,
+			   call_save_p ? "Add save<-reg" : "Add split<-reg");
   if (nregs > 1 || original_regno < FIRST_PSEUDO_REGISTER)
     /* If we are trying to split multi-register.  We should check
        conflicts on the next assignment sub-pass.  IRA can allocate on
@@ -6484,7 +6501,7 @@ split_if_necessary (int regno, machine_m
 		&& (INSN_UID (XEXP (next_usage_insns, 0)) < max_uid)))
 	&& need_for_split_p (potential_reload_hard_regs, regno + i)
 	&& split_reg (before_p, regno + i, insn, next_usage_insns, NULL))
-    res = true;
+      res = true;
   return res;
 }
 
@@ -7074,6 +7092,7 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 	      last_call_for_abi[callee_abi.id ()] = calls_num;
 	      full_and_partial_call_clobbers
 		|= callee_abi.full_and_partial_reg_clobbers ();
+	      first_call_insn = curr_insn;
 	      if ((cheap = find_reg_note (curr_insn,
 					  REG_RETURNED, NULL_RTX)) != NULL_RTX
 		  && ((cheap = XEXP (cheap, 0)), true)
@@ -7142,6 +7161,7 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 		    {
 		      bool before_p;
 		      rtx_insn *use_insn = curr_insn;
+		      rtx_insn *prev_insn = PREV_INSN (curr_insn);
 
 		      before_p = (JUMP_P (curr_insn)
 				  || (CALL_P (curr_insn) && reg->type == OP_IN));
@@ -7156,7 +7176,7 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 			  change_p = true;
 			  /* Invalidate. */
 			  usage_insns[src_regno].check = 0;
-			  if (before_p)
+			  if (before_p && PREV_INSN (curr_insn) != prev_insn)
 			    use_insn = PREV_INSN (curr_insn);
 			}
 		      if (NONDEBUG_INSN_P (curr_insn))
@@ -7278,6 +7298,7 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 	    }
 	}
     }
+  first_call_insn = NULL;
   return change_p;
 }
 
--- gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c.jj	2024-07-01 11:28:23.278230620 +0200
+++ gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c	2025-03-20 19:50:49.369486995 +0100
@@ -26,4 +26,4 @@ bar (long a)
 
 /* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira" } } */
 /* { dg-final { scan-rtl-dump "Split live-range of register" "ira" { xfail { ! aarch64*-*-* } } } } */
-/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue" { xfail powerpc*-*-* } } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue" } } */
--- gcc/testsuite/gcc.dg/pr10474.c.jj	2025-03-20 19:50:49.379486857 +0100
+++ gcc/testsuite/gcc.dg/pr10474.c	2025-03-21 09:33:23.808051052 +0100
@@ -12,5 +12,4 @@ void f(int *i)
 	}
 }
 
-/* XFAIL due to PR70681.  */ 
-/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  { xfail arm*-*-* powerpc*-*-* } } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue" } } */
--- gcc/testsuite/gcc.dg/pr118615.c.jj	2025-03-21 09:44:26.502876412 +0100
+++ gcc/testsuite/gcc.dg/pr118615.c	2025-03-21 09:45:22.447104189 +0100
@@ -0,0 +1,24 @@
+/* PR rtl-optimization/118615 */
+/* { dg-do compile { target scheduling } } */
+/* { dg-options "-O2 -fcompare-debug -fschedule-insns" } */
+
+void foo (void);
+void bar (int);
+void baz (int *);
+int j;
+
+void
+qux (int k, int *m)
+{
+  int n;
+  if (k)
+    {
+      foo ();
+      if (m)
+	{
+	  bar (j);
+	  baz (m);
+	}
+    }
+  baz (&n);
+}

	Jakub


             reply	other threads:[~2025-03-21 13:00 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-21 13:00 Jakub Jelinek [this message]
2025-03-21 16:45 ` Jakub Jelinek
2025-03-21 19:01 ` Vladimir Makarov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z91i4oSOe2t9Wsu5@tucnak \
    --to=jakub@redhat.com \
    --cc=bergner@linux.ibm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=jskumari@linux.ibm.com \
    --cc=richard.sandiford@arm.com \
    --cc=vmakarov@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).