public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Fix (work around) LRA infinite loop, PR78911
@ 2017-03-03 13:36 Bernd Schmidt
  2017-03-03 16:57 ` Vladimir Makarov
  0 siblings, 1 reply; 2+ messages in thread
From: Bernd Schmidt @ 2017-03-03 13:36 UTC (permalink / raw)
  To: GCC Patches, Vladimir Makarov

[-- Attachment #1: Type: text/plain, Size: 2052 bytes --]

In this PR, we have an endless cycle in LRA, generating ever more 
instructions. The relevant portions of the dump seem to be these:

********** Local #9: **********
       Creating newreg=130 from oldreg=128, assigning class CREG to r130
    18: 
{r117:DI=unspec/v[[r129:SI*0x8+r123:SI],r117:DI,r122:SI,r130:SI,0x8005] 
62;[r129:SI*0x8+r123:SI]=unspec/v[0] 62;flags:CCZ=unspec/v[0] 62;}

********** Assignment #7: **********
	 Assigning to 130 (cl=CREG, orig=114, freq=1148, tfirst=130, tfreq=1148)...
   2nd iter for reload pseudo assignments:
	 Reload r130 assignment failure
	  Spill  r87(hr=5, freq=4296)
	  Spill  r123(hr=4, freq=1148)
	  Spill reload  r129(hr=2, freq=1148)

********** Local #11: **********

	   Spilling non-eliminable hard regs: 6
       Creating newreg=131, assigning class GENERAL_REGS to address r131
	   Change to class INDEX_REGS for r131

********** Assignment #8: **********

	 Assigning to 131 (cl=INDEX_REGS, orig=131, freq=1148, tfirst=131, 
tfreq=1148)...
	 Trying 0: spill 117(freq=1722)	 Now best 0(cost=574, bad_spills=0, 
insn_pseudos=1)

	 Trying 1: spill 117(freq=1722)
	 Trying 2: spill 130(freq=1148)	 Now best 2(cost=0, bad_spills=0, 
insn_pseudos=1)

	 Trying 3: spill 122(freq=1148)
	 Trying 4: spill 129(freq=1148)

********** Local #12: **********
       Creating newreg=133 from oldreg=130, assigning class CREG to r133

What seems to happen is that in the "Assignemnt #8" phase, we're 
considering one new reload reg only, and we end up spilling r130, which 
only allows a register from a single class. Then we reload r130 again 
and make r133, and the cycle starts.

Reload is designed in a way to avoid cycles and to process all reloads 
for an insn in order of ascending class so as to avoid this kind of 
issue. With LRA I'm really not sure how to fix this properly, but the 
following patch seems to cure the PR at least, by recognizing when we're 
about to spill a reload reg whose assigned class contains only contains 
a single register.

Bootstrapped and tested on x86_64-linux, ok?


Bernd

[-- Attachment #2: lra-singleton.diff --]
[-- Type: text/x-patch, Size: 3765 bytes --]

	PR rtl-optimization/78911
	* lra-assigns.c (must_not_spill_p): New function.
	(spill_for): Use it.

	PR rtl-optimization/78911
	* gcc.target/i386/pr78911-1.c: New test.
	* gcc.target/i386/pr78911-2.c: New test.

Index: gcc/lra-assigns.c
===================================================================
--- gcc/lra-assigns.c	(revision 245685)
+++ gcc/lra-assigns.c	(working copy)
@@ -889,6 +889,30 @@ assign_temporarily (int regno, int hard_
   live_pseudos_reg_renumber[regno] = hard_regno;
 }
 
+/* Return true iff there is a reason why pseudo SPILL_REGNO should not
+   be spilled.  */
+static bool
+must_not_spill_p (unsigned spill_regno)
+{
+  if ((pic_offset_table_rtx != NULL
+       && spill_regno == REGNO (pic_offset_table_rtx))
+      || ((int) spill_regno >= lra_constraint_new_regno_start
+	  && ! bitmap_bit_p (&lra_inheritance_pseudos, spill_regno)
+	  && ! bitmap_bit_p (&lra_split_regs, spill_regno)
+	  && ! bitmap_bit_p (&lra_subreg_reload_pseudos, spill_regno)
+	  && ! bitmap_bit_p (&lra_optional_reload_pseudos, spill_regno)))
+    return true;
+  /* A reload pseudo that requires a singleton register class should
+     not be spilled.
+     FIXME: this mitigates the issue on certain i386 patterns, but
+     does not solve the general case where existing reloads fully
+     cover a limited register class.  */
+  if (!bitmap_bit_p (&non_reload_pseudos, spill_regno)
+      && reg_class_size [reg_preferred_class (spill_regno)] == 1)
+    return true;
+  return false;
+}
+
 /* Array used for sorting reload pseudos for subsequent allocation
    after spilling some pseudo.	*/
 static int *sorted_reload_pseudos;
@@ -960,13 +984,7 @@ spill_for (int regno, bitmap spilled_pse
       /* Spill pseudos.	 */
       static_p = false;
       EXECUTE_IF_SET_IN_BITMAP (&spill_pseudos_bitmap, 0, spill_regno, bi)
-	if ((pic_offset_table_rtx != NULL
-	     && spill_regno == REGNO (pic_offset_table_rtx))
-	    || ((int) spill_regno >= lra_constraint_new_regno_start
-		&& ! bitmap_bit_p (&lra_inheritance_pseudos, spill_regno)
-		&& ! bitmap_bit_p (&lra_split_regs, spill_regno)
-		&& ! bitmap_bit_p (&lra_subreg_reload_pseudos, spill_regno)
-		&& ! bitmap_bit_p (&lra_optional_reload_pseudos, spill_regno)))
+	if (must_not_spill_p (spill_regno))
 	  goto fail;
 	else if (non_spilled_static_chain_regno_p (spill_regno))
 	  static_p = true;
Index: gcc/testsuite/gcc.target/i386/pr78911-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr78911-1.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr78911-1.c	(working copy)
@@ -0,0 +1,22 @@
+/* PR rtl-optimization/78911 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-strict-aliasing -fno-omit-frame-pointer" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+/* { dg-additional-options "-march=pentium-m" { target ia32 } } */
+
+int a, b, d, e;
+long long *c;
+
+static int
+foo (long long *x)
+{
+  return __sync_val_compare_and_swap (x, b, a);
+}
+
+void
+bar (void)
+{
+  if (!c)
+    return;
+  e = foo (&c[d]);
+}
Index: gcc/testsuite/gcc.target/i386/pr78911-2.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr78911-2.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr78911-2.c	(working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-omit-frame-pointer" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+/* { dg-additional-options "-march=i686" { target ia32 } } */
+
+long long *a, *b, c;
+int d, e;
+int baz (void);
+
+static inline long long
+foo (long long *x)
+{
+  return __sync_val_compare_and_swap (x, 0, 0);
+}
+
+void
+bar ()
+{
+  int f = baz ();
+  c = foo (&a[f]);
+  if (c)
+    e = d;
+  a = b;
+}

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Fix (work around) LRA infinite loop, PR78911
  2017-03-03 13:36 Fix (work around) LRA infinite loop, PR78911 Bernd Schmidt
@ 2017-03-03 16:57 ` Vladimir Makarov
  0 siblings, 0 replies; 2+ messages in thread
From: Vladimir Makarov @ 2017-03-03 16:57 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches



On 03/03/2017 08:36 AM, Bernd Schmidt wrote:
>
> Reload is designed in a way to avoid cycles and to process all reloads 
> for an insn in order of ascending class so as to avoid this kind of 
> issue. With LRA I'm really not sure how to fix this properly, but the 
> following patch seems to cure the PR at least, by recognizing when 
> we're about to spill a reload reg whose assigned class contains only 
> contains a single register.
>
I had conversations with several people who worked on different RAs and 
got a conclusion that looping is a pretty general problem in iterative 
classical RAs.  Unfortunately, it is not discussed in a literature and 
ad-hoc approaches are used (that is why it is probably not discussed in 
scientific literature).

On first stages of LRA development, looping was a very frequent bug and 
also add-hoc approaches were used to fix it.  The major way is to 
predict that given decision can result in repeating patterns and to 
avoid such decision (e.g. lra_constraints.c::process_alt_operands 
contains code).

Reasons for looping can have different nature and now I have no 
systematic solution on my mind to avoid all of them.
> Bootstrapped and tested on x86_64-linux, ok?
>
>
Yes.  Thank you for working on this, Bernd.  The issue mentioned in 
FIXME part of the comment can be addressed later if we have cases for 
which your fix will be not enough.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2017-03-03 16:57 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-03 13:36 Fix (work around) LRA infinite loop, PR78911 Bernd Schmidt
2017-03-03 16:57 ` Vladimir Makarov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).