public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Don't add REG_EQUAL notes in fwprop for paradoxical subregs (PR rtl-optimization/70574)
@ 2016-04-07 21:56 Jakub Jelinek
  2016-04-08 11:18 ` Bernd Schmidt
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Jelinek @ 2016-04-07 21:56 UTC (permalink / raw)
  To: Jeff Law, Bernd Schmidt; +Cc: gcc-patches

Hi!

The following testcase is miscompiled, because we have:
(set (reg:SI ...) (subreg:SI (reg:QI (...)) 0))
instruction and the fwprop attempts to propagate (const_int -1)
into the reg:QI use in there, but gives up because costs don't say it is
beneficial and adds instead REG_EQUAL (const_int -1) note on the insn.
That is wrong though, it is fine to optimize this insn to
(set (reg:SI ...) (const_int -1))
because the higher bits in the paradoxical subreg are undefined,
but as they are undefined, if the subreg is kept, those bits can be
anything.  If we say the subreg is equal to (const_int -1), it means
e.g. CSE2 can replace other places that need SImode -1 with the SET_DEST of
this insn, but then it really depends on what bits actually end up in the
register.  If we are unlucky, and it is e.g. spilled during LRA and reloaded
using QImode, the upper bits can be anything.

Not sure if this patch catches everything though, perhaps there could be
e.g.
(set (reg:SI ...) (plus:SI ((subreg:SI (reg:QI ...) 0) (const_int ...)))
and we'd still assign REG_EQUAL note.  So maybe instead we should walk the
*loc expression and look for paradoxical subregs, and for each of them, if
we find the DF_REF_REG (use) mentioned in their operand, clear
set_reg_equal.  Though of course, if DF_REF_REG (use) itself is a
paradoxical subreg, we could clear set_reg_equal without any walking.

2016-04-07  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/70574
	* fwprop.c (forward_propagate_and_simplify): Don't add
	REG_EQUAL note if DF_REF_REG (use) is a paradoxical subreg.

	* gcc.target/i386/avx2-pr70574.c: New test.

--- gcc/fwprop.c.jj	2016-01-04 14:55:53.000000000 +0100
+++ gcc/fwprop.c	2016-04-07 18:01:42.953844357 +0200
@@ -1213,7 +1213,7 @@ forward_propagate_and_simplify (df_ref u
   rtx_insn *use_insn = DF_REF_INSN (use);
   rtx use_set = single_set (use_insn);
   rtx src, reg, new_rtx, *loc;
-  bool set_reg_equal;
+  bool set_reg_equal = true;
   machine_mode mode;
   int asm_use = -1;
 
@@ -1240,7 +1240,15 @@ forward_propagate_and_simplify (df_ref u
   /* Check if the use has a subreg, but the def had the whole reg.  Unlike the
      previous case, the optimization is possible and often useful indeed.  */
   else if (GET_CODE (reg) == SUBREG && REG_P (SET_DEST (def_set)))
-    reg = SUBREG_REG (reg);
+    {
+      /* If the use is a paradoxical subreg, make sure we don't add a
+	 REG_EQUAL note for it, because it is not equivalent, it is one
+	 possible value for it, but we can't rely on it holding that value.
+	 See PR70574.  */
+      if (paradoxical_subreg_p (reg))
+	set_reg_equal = false;
+      reg = SUBREG_REG (reg);
+    }
 
   /* Make sure that we can treat REG as having the same mode as the
      source of DEF_SET.  */
@@ -1301,13 +1309,13 @@ forward_propagate_and_simplify (df_ref u
 	 otherwise.  We also don't want to install a note if we are merely
 	 propagating a pseudo since verifying that this pseudo isn't dead
 	 is a pain; moreover such a note won't help anything.  */
-      set_reg_equal = (note == NULL_RTX
-		       && REG_P (SET_DEST (use_set))
-		       && !REG_P (src)
-		       && !(GET_CODE (src) == SUBREG
-			    && REG_P (SUBREG_REG (src)))
-		       && !reg_mentioned_p (SET_DEST (use_set),
-					    SET_SRC (use_set)));
+      set_reg_equal &= (note == NULL_RTX
+			&& REG_P (SET_DEST (use_set))
+			&& !REG_P (src)
+			&& !(GET_CODE (src) == SUBREG
+			     && REG_P (SUBREG_REG (src)))
+			&& !reg_mentioned_p (SET_DEST (use_set),
+					     SET_SRC (use_set)));
     }
 
   if (GET_MODE (*loc) == VOIDmode)
--- gcc/testsuite/gcc.target/i386/avx2-pr70574.c.jj	2016-04-07 18:09:25.788519218 +0200
+++ gcc/testsuite/gcc.target/i386/avx2-pr70574.c	2016-04-07 18:09:21.825573327 +0200
@@ -0,0 +1,26 @@
+/* PR rtl-optimization/70574 */
+/* { dg-do run { target lp64 } } */
+/* { dg-require-effective-target avx2 } */
+/* { dg-options "-O -frerun-cse-after-loop -fno-tree-ccp -mcmodel=medium -mavx2" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+
+#include "avx2-check.h"
+
+typedef char A __attribute__((vector_size (32)));
+typedef short B __attribute__((vector_size (32)));
+
+int
+foo (int x, __int128 y, __int128 z, A w)
+{
+  y <<= 64;
+  w *= (A) { 0, -1, z, 0, ~y };
+  return w[0] + ((B) { x, 0, y, 0, -1 } | 1)[4];
+}
+
+static void
+avx2_test ()
+{
+  int x = foo (0, 0, 0, (A) {});
+  if (x != -1)
+    __builtin_abort ();
+}

	Jakub

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Don't add REG_EQUAL notes in fwprop for paradoxical subregs (PR rtl-optimization/70574)
  2016-04-07 21:56 [PATCH] Don't add REG_EQUAL notes in fwprop for paradoxical subregs (PR rtl-optimization/70574) Jakub Jelinek
@ 2016-04-08 11:18 ` Bernd Schmidt
  2016-04-08 17:05   ` Jakub Jelinek
  0 siblings, 1 reply; 4+ messages in thread
From: Bernd Schmidt @ 2016-04-08 11:18 UTC (permalink / raw)
  To: Jakub Jelinek, Jeff Law; +Cc: gcc-patches

On 04/07/2016 11:56 PM, Jakub Jelinek wrote:
> Not sure if this patch catches everything though, perhaps there could be
> e.g.
> (set (reg:SI ...) (plus:SI ((subreg:SI (reg:QI ...) 0) (const_int ...)))
> and we'd still assign REG_EQUAL note.  So maybe instead we should walk the
> *loc expression and look for paradoxical subregs, and for each of them, if
> we find the DF_REF_REG (use) mentioned in their operand, clear
> set_reg_equal.  Though of course, if DF_REF_REG (use) itself is a
> paradoxical subreg, we could clear set_reg_equal without any walking.

It seems like something like that could happen.

How much do we lose if we just don't make new REG_EQUAL notes here?


Bernd

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Don't add REG_EQUAL notes in fwprop for paradoxical subregs (PR rtl-optimization/70574)
  2016-04-08 11:18 ` Bernd Schmidt
@ 2016-04-08 17:05   ` Jakub Jelinek
  2016-04-08 17:17     ` Bernd Schmidt
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Jelinek @ 2016-04-08 17:05 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Jeff Law, gcc-patches

On Fri, Apr 08, 2016 at 01:18:50PM +0200, Bernd Schmidt wrote:
> On 04/07/2016 11:56 PM, Jakub Jelinek wrote:
> >Not sure if this patch catches everything though, perhaps there could be
> >e.g.
> >(set (reg:SI ...) (plus:SI ((subreg:SI (reg:QI ...) 0) (const_int ...)))
> >and we'd still assign REG_EQUAL note.  So maybe instead we should walk the
> >*loc expression and look for paradoxical subregs, and for each of them, if
> >we find the DF_REF_REG (use) mentioned in their operand, clear
> >set_reg_equal.  Though of course, if DF_REF_REG (use) itself is a
> >paradoxical subreg, we could clear set_reg_equal without any walking.
> 
> It seems like something like that could happen.
> 
> How much do we lose if we just don't make new REG_EQUAL notes here?

After IRC discussions, I've bootstrapped/regtested following patch that
just punts if *loc contains any paradoxical subregs, together with
additional statistics gathering that proved that the new testcase is
the only spot in which this patch makes a difference on x86_64-linux
and i686-linux bootstrap/regtest.

Of course on other targets it might affect more.

Ok for trunk?

2016-04-08  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/70574
	* fwprop.c (forward_propagate_and_simplify): Don't add
	REG_EQUAL note if DF_REF_REG (use) is a paradoxical subreg.
	(try_fwprop_subst): Don't add REG_EQUAL note if there are any
	paradoxical subregs within *loc.

	* gcc.target/i386/avx2-pr70574.c: New test.

--- gcc/fwprop.c.jj	2016-04-07 23:27:41.396310248 +0200
+++ gcc/fwprop.c	2016-04-08 15:51:09.164706722 +0200
@@ -999,10 +999,27 @@ try_fwprop_subst (df_ref use, rtx *loc,
 	 making a new one if one does not already exist.  */
       if (set_reg_equal)
 	{
-	  if (dump_file)
-	    fprintf (dump_file, " Setting REG_EQUAL note\n");
+	  /* If there are any paradoxical SUBREGs, don't add REG_EQUAL note,
+	     because the bits in there can be anything and so might not
+	     match the REG_EQUAL note content.  See PR70574.  */
+	  subrtx_var_iterator::array_type array;
+	  FOR_EACH_SUBRTX_VAR (iter, array, *loc, NONCONST)
+	    {
+	      rtx x = *iter;
+	      if (SUBREG_P (x) && paradoxical_subreg_p (x))
+		{
+		  set_reg_equal = false;
+		  break;
+		}
+	    }
+
+	  if (set_reg_equal)
+	    {
+	      if (dump_file)
+		fprintf (dump_file, " Setting REG_EQUAL note\n");
 
-	  note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (new_rtx));
+	      note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (new_rtx));
+	    }
 	}
     }
 
@@ -1300,14 +1317,19 @@ forward_propagate_and_simplify (df_ref u
 	 that isn't mentioned in USE_SET, as the note would be invalid
 	 otherwise.  We also don't want to install a note if we are merely
 	 propagating a pseudo since verifying that this pseudo isn't dead
-	 is a pain; moreover such a note won't help anything.  */
+	 is a pain; moreover such a note won't help anything.
+	 If the use is a paradoxical subreg, make sure we don't add a
+	 REG_EQUAL note for it, because it is not equivalent, it is one
+	 possible value for it, but we can't rely on it holding that value.
+	 See PR70574.  */
       set_reg_equal = (note == NULL_RTX
 		       && REG_P (SET_DEST (use_set))
 		       && !REG_P (src)
 		       && !(GET_CODE (src) == SUBREG
 			    && REG_P (SUBREG_REG (src)))
 		       && !reg_mentioned_p (SET_DEST (use_set),
-					    SET_SRC (use_set)));
+					    SET_SRC (use_set))
+		       && !paradoxical_subreg_p (DF_REF_REG (use)));
     }
 
   if (GET_MODE (*loc) == VOIDmode)
--- gcc/testsuite/gcc.target/i386/avx2-pr70574.c.jj	2016-04-08 13:32:04.525196849 +0200
+++ gcc/testsuite/gcc.target/i386/avx2-pr70574.c	2016-04-08 13:32:04.525196849 +0200
@@ -0,0 +1,26 @@
+/* PR rtl-optimization/70574 */
+/* { dg-do run { target lp64 } } */
+/* { dg-require-effective-target avx2 } */
+/* { dg-options "-O -frerun-cse-after-loop -fno-tree-ccp -mcmodel=medium -mavx2" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+
+#include "avx2-check.h"
+
+typedef char A __attribute__((vector_size (32)));
+typedef short B __attribute__((vector_size (32)));
+
+int
+foo (int x, __int128 y, __int128 z, A w)
+{
+  y <<= 64;
+  w *= (A) { 0, -1, z, 0, ~y };
+  return w[0] + ((B) { x, 0, y, 0, -1 } | 1)[4];
+}
+
+static void
+avx2_test ()
+{
+  int x = foo (0, 0, 0, (A) {});
+  if (x != -1)
+    __builtin_abort ();
+}


	Jakub

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Don't add REG_EQUAL notes in fwprop for paradoxical subregs (PR rtl-optimization/70574)
  2016-04-08 17:05   ` Jakub Jelinek
@ 2016-04-08 17:17     ` Bernd Schmidt
  0 siblings, 0 replies; 4+ messages in thread
From: Bernd Schmidt @ 2016-04-08 17:17 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jeff Law, gcc-patches

On 04/08/2016 07:05 PM, Jakub Jelinek wrote:
> After IRC discussions, I've bootstrapped/regtested following patch that
> just punts if *loc contains any paradoxical subregs, together with
> additional statistics gathering that proved that the new testcase is
> the only spot in which this patch makes a difference on x86_64-linux
> and i686-linux bootstrap/regtest.
>
> Of course on other targets it might affect more.

I ran the test with just not generating any REG_EQUAL notes here in 
fwprop, and out of 4492 source files only three showed code generation 
differences, which suggests we're not getting much value out of these.

Two of the cases where the same missed tree optimization, for which I've 
filed PR70600.

> 2016-04-08  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR rtl-optimization/70574
> 	* fwprop.c (forward_propagate_and_simplify): Don't add
> 	REG_EQUAL note if DF_REF_REG (use) is a paradoxical subreg.
> 	(try_fwprop_subst): Don't add REG_EQUAL note if there are any
> 	paradoxical subregs within *loc.
>
> 	* gcc.target/i386/avx2-pr70574.c: New test.

Ok.


Bernd

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-04-08 17:17 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-07 21:56 [PATCH] Don't add REG_EQUAL notes in fwprop for paradoxical subregs (PR rtl-optimization/70574) Jakub Jelinek
2016-04-08 11:18 ` Bernd Schmidt
2016-04-08 17:05   ` Jakub Jelinek
2016-04-08 17:17     ` Bernd Schmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).