public inbox for gcc-patches@gcc.gnu.org
* [x86] Cache result of expensive_function_p between frame layouts
@ 2019-09-30 14:11 Richard Sandiford
  2019-09-30 17:57 ` Jan Hubicka
  0 siblings, 1 reply; 2+ messages in thread
From: Richard Sandiford @ 2019-09-30 14:11 UTC
  To: gcc-patches; +Cc: hubicka, ubizjak

ix86_compute_frame_layout sets use_fast_prologue_epilogue if
the function isn't more expensive than a certain threshold,
where the threshold depends on the number of saved registers.
However, the RA is allowed to insert and delete instructions
as it goes along, which can change whether this threshold is
crossed or not.

I hit this with an RA change I'm working on.  Rematerialisation
was able to remove an instruction and avoid a spill, which happened
to bring the size of the function below the threshold.  But since
nothing legitimately frame-related had changed, there was no need for
the RA to lay out the frame again.  We then failed the final sanity
check in lra_eliminate.
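
To illustrate the failure mode and the fix, here is a minimal standalone
sketch of the caching idea.  It is not the i386 code: model_expensive_p,
struct model_frame, model_use_fast_prologue and the limit of 20 are all
hypothetical stand-ins, and the function size is passed in explicitly
rather than counted from the insn stream.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the real predicate: treat a function as
   "expensive" once its size plus the register-save weight reaches an
   arbitrary limit.  The limit of 20 is an assumption for this sketch.  */
bool
model_expensive_p (int function_size, int count)
{
  return function_size + count >= 20;
}

/* Hypothetical, cut-down analogue of the two new ix86_frame fields.
   Invariant: EXPENSIVE_P is the answer that was computed the last time
   the predicate was evaluated for EXPENSIVE_COUNT.  */
struct model_frame
{
  bool expensive_p;
  int expensive_count;
};

/* Return the fast-prologue decision.  The predicate is re-evaluated only
   when COUNT differs from the cached count, so a change in FUNCTION_SIZE
   alone cannot flip the answer.  */
bool
model_use_fast_prologue (struct model_frame *frame, int function_size,
			 int count)
{
  if (count != frame->expensive_count)
    {
      frame->expensive_count = count;
      frame->expensive_p = model_expensive_p (function_size, count);
    }
  return !frame->expensive_p;
}
```

Starting from a zero-initialised model_frame, a call with size 18 and
count 4 records the function as expensive.  A later call with size 15 and
the same count returns the same answer, even though an uncached check
would now fall below the limit.  Only a change in count triggers a fresh
evaluation.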

Tested on x86_64-linux-gnu.  OK to install?

Richard


2019-09-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/i386/i386.h (ix86_frame::expensive_p): New field.
	(ix86_frame::expensive_count): Likewise.
	* config/i386/i386.c (ix86_compute_frame_layout): Make the choice
	of use_fast_prologue_epilogue robust against incidental changes
	in function size.

Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	2019-09-26 08:37:44.000000000 +0100
+++ gcc/config/i386/i386.h	2019-09-30 15:07:51.784114465 +0100
@@ -2643,6 +2643,11 @@ struct GTY(()) ix86_frame
   /* When save_regs_using_mov is set, emit prologue using
      move instead of push instructions.  */
   bool save_regs_using_mov;
+
+  /* Assume without checking that:
+       EXPENSIVE_P = expensive_function_p (EXPENSIVE_COUNT).  */
+  bool expensive_p;
+  int expensive_count;
 };
 
 /* Machine specific frame tracking during prologue/epilogue generation.  All
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	2019-09-26 08:37:44.000000000 +0100
+++ gcc/config/i386/i386.c	2019-09-30 15:07:51.784114465 +0100
@@ -5876,7 +5876,14 @@ ix86_compute_frame_layout (void)
 	 case function is known to be outside hot spot (this is known with
 	 feedback only).  Weight the size of function by number of registers
 	 to save as it is cheap to use one or two push instructions but very
-	 slow to use many of them.  */
+	 slow to use many of them.
+
+	 Calling this hook multiple times with the same frame requirements
+	 must produce the same layout, since the RA might otherwise be
+	 unable to reach a fixed point or might fail its final sanity checks.
+	 This means that once we've assumed that a function does or doesn't
+	 have a particular size, we have to stick to that assumption
+	 regardless of how the function has changed since.  */
       if (count)
 	count = (count - 1) * FAST_PROLOGUE_INSN_COUNT;
       if (node->frequency < NODE_FREQUENCY_NORMAL
@@ -5884,8 +5891,14 @@ ix86_compute_frame_layout (void)
 	      && node->frequency < NODE_FREQUENCY_HOT))
 	m->use_fast_prologue_epilogue = false;
       else
-	m->use_fast_prologue_epilogue
-	   = !expensive_function_p (count);
+	{
+	  if (count != frame->expensive_count)
+	    {
+	      frame->expensive_count = count;
+	      frame->expensive_p = expensive_function_p (count);
+	    }
+	  m->use_fast_prologue_epilogue = !frame->expensive_p;
+	}
     }
 
   frame->save_regs_using_mov

* Re: [x86] Cache result of expensive_function_p between frame layouts
  2019-09-30 14:11 [x86] Cache result of expensive_function_p between frame layouts Richard Sandiford
@ 2019-09-30 17:57 ` Jan Hubicka
  0 siblings, 0 replies; 2+ messages in thread
From: Jan Hubicka @ 2019-09-30 17:57 UTC
  To: gcc-patches, ubizjak, richard.sandiford

> ix86_compute_frame_layout sets use_fast_prologue_epilogue if
> the function isn't more expensive than a certain threshold,
> where the threshold depends on the number of saved registers.
> However, the RA is allowed to insert and delete instructions
> as it goes along, which can change whether this threshold is
> crossed or not.
> 
> I hit this with an RA change I'm working on.  Rematerialisation
> was able to remove an instruction and avoid a spill, which happened
> to bring the size of the function below the threshold.  But since
> nothing legitimately frame-related had changed, there was no need for
> the RA to lay out the frame again.  We then failed the final sanity
> check in lra_eliminate.
> 
> Tested on x86_64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2019-09-30  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* config/i386/i386.h (ix86_frame::expensive_p): New field.
> 	(ix86_frame::expensive_count): Likewise.
> 	* config/i386/i386.c (ix86_compute_frame_layout): Make the choice
> 	of use_fast_prologue_epilogue robust against incidental changes
> 	in function size.
OK,
Honza
