public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix REE with vector modes (PR rtl-optimization/70542)
@ 2016-04-05 16:56 Jakub Jelinek
  2016-04-05 17:02 ` Jeff Law
  0 siblings, 1 reply; 2+ messages in thread
From: Jakub Jelinek @ 2016-04-05 16:56 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Hi!

As mentioned in PR64286 already, unlike integral modes, vector modes
don't have the property that lowpart subreg of sign/zero extended
value contains the original non-extended bits, so if we touch some
definition, we really have to change all uses.  Except that we really
don't have infrastructure/code to make sure we change either none or
all within a single transaction, and there is lots of reasons why it could
fail, more with copy_needed cases as in the testcase below, where we have:
  (set (reg:V8HI xmm6) (reg:V8HI xmm10))
  (set (reg:V8SI xmm3) (sign_extend:V8SI (reg:V8HI xmm6)))
...
// in another bb later on
  (set (reg:V8SI ...) (sign_extend:V8SI (reg:V8HI xmm6)))
(why CSE has not managed to optimize this is strange, something to look
for GCC7).  Without this patch we optimize the first two insns into:
  (set (reg:V8SI xmm3) (sign_extend:V8HI xmm10))
  (set (reg:V8SI xmm6) (reg:V8SI xmm3))
but then fail to tweak the last insn, because the definition has been
already modified and it is another copy_needed case.

So, IMHO without big changes we really can't guarantee all or nothing,
and thus this patch changes the PR64286 fix to be more conservative,
verify the def is used just in a single non-debug insn, which we can then
safely modify.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-04-05  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/70542
	* ree.c (add_removable_extension): For VECTOR_MODE_P punt
	if there are any uses other than insn or debug insns.

	* gcc.dg/torture/pr70542.c: New test.
	* gcc.target/i386/avx2-pr70542.c: New test.

--- gcc/ree.c.jj	2016-01-04 14:55:54.000000000 +0100
+++ gcc/ree.c	2016-04-05 14:38:08.219486904 +0200
@@ -1025,11 +1025,11 @@ add_removable_extension (const_rtx expr,
 	    return;
 	  }
 	/* For vector mode extensions, ensure that all uses of the
-	   XEXP (src, 0) register are the same extension (both code
-	   and to which mode), as unlike integral extensions lowpart
-	   subreg of the sign/zero extended register are not equal
-	   to the original register, so we have to change all uses or
-	   none.  */
+	   XEXP (src, 0) register are in insn or debug insns, as unlike
+	   integral extensions lowpart subreg of the sign/zero extended
+	   register are not equal to the original register, so we have
+	   to change all uses or none and the current code isn't able
+	   to change them all at once in one transaction.  */
 	else if (VECTOR_MODE_P (GET_MODE (XEXP (src, 0))))
 	  {
 	    if (idx == 0)
@@ -1046,15 +1046,7 @@ add_removable_extension (const_rtx expr,
 			break;
 		      }
 		    rtx_insn *use_insn = DF_REF_INSN (ref_link->ref);
-		    const_rtx use_set;
-		    if (use_insn == insn || DEBUG_INSN_P (use_insn))
-		      continue;
-		    if (!(use_set = single_set (use_insn))
-			|| !REG_P (SET_DEST (use_set))
-			|| GET_MODE (SET_DEST (use_set)) != GET_MODE (dest)
-			|| GET_CODE (SET_SRC (use_set)) != code
-			|| !rtx_equal_p (XEXP (SET_SRC (use_set), 0),
-					 XEXP (src, 0)))
+		    if (use_insn != insn && !DEBUG_INSN_P (use_insn))
 		      {
 			idx = -1U;
 			break;
--- gcc/testsuite/gcc.dg/torture/pr70542.c.jj	2016-04-05 13:05:53.925334083 +0200
+++ gcc/testsuite/gcc.dg/torture/pr70542.c	2016-04-05 13:05:35.000000000 +0200
@@ -0,0 +1,31 @@
+/* PR rtl-optimization/70542 */
+/* { dg-do run } */
+
+int a[113], d[113];
+short b[113], c[113], e[113];
+
+int
+main ()
+{
+  int i;
+  long j;
+  for (i = 0; i < 113; ++i)
+    {
+      a[i] = -636544305;
+      b[i] = -31804;
+    }
+  for (j = 1; j <= 112; ++j)
+    {
+      c[j] = b[j] >> ((a[j] & 1587842570) - 1510214139);
+      if (a[j])
+	d[j] = j;
+      e[j] = 7 << ((2312631697 - b[j]) - 2312663500);
+    }
+  asm volatile ("" : : : "memory");
+  if (c[0] || d[0] || e[0])
+    __builtin_abort ();
+  for (i = 1; i <= 112; ++i)
+    if (c[i] != -1 || d[i] != i || e[i] != 14)
+      __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/avx2-pr70542.c.jj	2016-04-05 13:06:29.154855716 +0200
+++ gcc/testsuite/gcc.target/i386/avx2-pr70542.c	2016-04-05 13:07:01.779412722 +0200
@@ -0,0 +1,16 @@
+/* PR tree-optimization/70542 */
+/* { dg-do run } */
+/* { dg-options "-O3 -mavx2" } */
+/* { dg-require-effective-target avx2 } */
+
+#include "avx2-check.h"
+
+#define main() do_main ()
+
+#include "../../gcc.dg/torture/pr70542.c"
+
+static void
+avx2_test (void)
+{
+  do_main ();
+}

	Jakub

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] Fix REE with vector modes (PR rtl-optimization/70542)
  2016-04-05 16:56 [PATCH] Fix REE with vector modes (PR rtl-optimization/70542) Jakub Jelinek
@ 2016-04-05 17:02 ` Jeff Law
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff Law @ 2016-04-05 17:02 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On 04/05/2016 10:56 AM, Jakub Jelinek wrote:
> Hi!
>
> As mentioned in PR64286 already, unlike integral modes, vector modes
> don't have the property that lowpart subreg of sign/zero extended
> value contains the original non-extended bits, so if we touch some
> definition, we really have to change all uses.  Except that we really
> don't have infrastructure/code to make sure we change either none or
> all within a single transaction, and there is lots of reasons why it could
> fail, more with copy_needed cases as in the testcase below, where we have:
>    (set (reg:V8HI xmm6) (reg:V8HI xmm10))
>    (set (reg:V8SI xmm3) (sign_extend:V8SI (reg:V8HI xmm6)))
> ...
> // in another bb later on
>    (set (reg:V8SI ...) (sign_extend:V8SI (reg:V8HI xmm6)))
> (why CSE has not managed to optimize this is strange, something to look
> for GCC7).  Without this patch we optimize the first two insns into:
>    (set (reg:V8SI xmm3) (sign_extend:V8HI xmm10))
>    (set (reg:V8SI xmm6) (reg:V8SI xmm3))
> but then fail to tweak the last insn, because the definition has been
> already modified and it is another copy_needed case.
>
> So, IMHO without big changes we really can't guarantee all or nothing,
> and thus this patch changes the PR64286 fix to be more conservative,
> verify the def is used just in a single non-debug insn, which we can then
> safely modify.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-04-05  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR rtl-optimization/70542
> 	* ree.c (add_removable_extension): For VECTOR_MODE_P punt
> 	if there are any uses other than insn or debug insns.
>
> 	* gcc.dg/torture/pr70542.c: New test.
> 	* gcc.target/i386/avx2-pr70542.c: New test.
OK.
jeff

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2016-04-05 17:02 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-05 16:56 [PATCH] Fix REE with vector modes (PR rtl-optimization/70542) Jakub Jelinek
2016-04-05 17:02 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).