public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PR tree-optimization/61009] Do not use a block as a joiner if it is too big for threading
@ 2014-05-09  4:54 Jeff Law
  0 siblings, 0 replies; only message in thread
From: Jeff Law @ 2014-05-09  4:54 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1319 bytes --]



This was yet another problem issue with threading through a loop 
backedge and finding equivalences that should have been invalidated.

In this instance, we were trying to thread through a large block.  When 
we hit the statement threshold, thread_through_normal_block returned and 
thus statements later in the block which would have invalidated 
equivalences that we no longer valid after following a backedge were 
never examined.

So those equivalences were still in the tables.  We then proceeded to 
try and use the block as a joiner and thread through one or more of its 
successors.

That's, of course, stupid.  If we determined that the block was too big 
for normal threading, we certainly don't want to thread through it as a 
joiner either since that duplicates the block as well.  So it's both a 
codesize and correctness issue.

This patch changes thread_through_normal_block to signal to its caller 
that the block was not fully processed due to code growth 
considerations.  When that happens, we avoid trying to thread through 
the block's successors.  That fixes both the code growth problem as well 
as the correctness issue.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
Installed on the trunk.  Will backport this and the other jump threading 
fix to the 4.9 branch shortly.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: patch --]
[-- Type: text/plain; charset=us-ascii; name="patch", Size: 7076 bytes --]

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8e8b76e..0b27fc8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2014-05-08  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/61009
+	* tree-ssa-threadedge.c (thread_through_normal_block): Return a
+	tri-state rather than a boolean.  When a block is too big to
+	thread through, inform caller via negative return value.
+	(thread_across_edge): If a block was too big for normal threading,
+	then it's too big for a joiner too, so remove temporary equivalences
+	and return immediately.
+
 2014-05-08  Manuel López-Ibáñez  <manu@gcc.gnu.org>
 	    Matthias Klose  <doko@ubuntu.com>
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 2dcf9dc..959763f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-05-08  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/61009
+	* g++.dg/tree-ssa/pr61009.C: New test.
+
 2014-05-08  Matthias Klose  <doko@ubuntu.com>
 
 	PR driver/61106
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr61009.C b/gcc/testsuite/g++.dg/tree-ssa/pr61009.C
new file mode 100644
index 0000000..4e7bb1a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr61009.C
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -std=c++11 -fno-strict-aliasing -fdump-tree-dom1" } */
+
+#include <stdio.h>
+struct Field {
+ virtual int Compare(void*, void*);
+};
+extern int NKF, NR;
+extern int idxs[];
+extern Field* the_field;
+extern int *incs;
+extern char** fptrs;
+inline int doCmp(int this_row_offset, int field_idx) {
+ void *p = fptrs[field_idx] + this_row_offset * incs[field_idx];
+ return the_field->Compare(p,0);
+}
+bool  Test(void) {
+
+ int row_offset = 0;
+
+ for (; row_offset < NR; ++row_offset) {
+
+   bool is_different = false;
+   for (int j = 0; j < NKF ; ++j) {
+     int field_idx = idxs[j];
+     int cmp = doCmp(row_offset, field_idx);
+     fprintf (stderr, "cmp=%d\n",cmp);
+
+     if (cmp == 0) {
+       continue;
+     }
+     if (cmp > 0) {
+       is_different = true;
+       break;
+     } else {
+       fprintf (stderr, "Incorrect\n");
+       return false;
+     }
+   }
+   if (!is_different) {
+
+     return false;
+   }
+ }
+
+ return true;
+}
+
+// The block ending with cmp == 0 should not be threaded.  ie,
+// there should be a single == 0 comparison in the dump file.
+
+// { dg-final { scan-tree-dump-times "== 0" 1 "dom1" } }
+// { dg-final { cleanup-tree-dump "dom1" } }
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 7621348..8e628d5 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -966,9 +966,14 @@ thread_around_empty_blocks (edge taken_edge,
    SIMPLIFY is a pass-specific function used to simplify statements.
 
    Our caller is responsible for restoring the state of the expression
-   and const_and_copies stacks.  */
+   and const_and_copies stacks.
 
-static bool
+   Positive return value is success.  Zero return value is failure, but
+   the block can still be duplicated as a joiner in a jump thread path,
+   negative indicates the block should not be duplicated and thus is not
+   suitable for a joiner in a jump threading path.  */
+
+static int
 thread_through_normal_block (edge e,
 			     gimple dummy_cond,
 			     bool handle_dominating_asserts,
@@ -990,7 +995,7 @@ thread_through_normal_block (edge e,
   /* PHIs create temporary equivalences.  */
   if (!record_temporary_equivalences_from_phis (e, stack, *backedge_seen_p,
 						src_map, dst_map))
-    return false;
+    return 0;
 
   /* Now walk each statement recording any context sensitive
      temporary equivalences we can detect.  */
@@ -998,8 +1003,16 @@ thread_through_normal_block (edge e,
     = record_temporary_equivalences_from_stmts_at_dest (e, stack, simplify,
 							*backedge_seen_p,
 							src_map, dst_map);
+
+  /* If we didn't look at all the statements, the most likely reason is
+     there were too many and thus duplicating this block is not profitable.
+
+     Also note if we do not look at all the statements, then we may not
+     have invalidated equivalences that are no longer valid if we threaded
+     around a loop.  Thus we must signal to our caller that this block
+     is not suitable for use as a joiner in a threading path.  */
   if (!stmt)
-    return false;
+    return -1;
 
   /* If we stopped at a COND_EXPR or SWITCH_EXPR, see if we know which arm
      will be taken.  */
@@ -1023,7 +1036,7 @@ thread_through_normal_block (edge e,
 	  if (dest == NULL
 	      || dest == e->dest
 	      || bitmap_bit_p (visited, dest->index))
-	    return false;
+	    return 0;
 
 	  /* Only push the EDGE_START_JUMP_THREAD marker if this is
 	     first edge on the path.  */
@@ -1057,10 +1070,10 @@ thread_through_normal_block (edge e,
 				      visited,
 				      path,
 				      backedge_seen_p);
-	  return true;
+	  return 1;
 	}
     }
-  return false;
+  return 0;
 }
 
 /* We are exiting E->src, see if E->dest ends with a conditional
@@ -1112,9 +1125,12 @@ thread_across_edge (gimple dummy_cond,
   if (backedge_seen)
     simplify = dummy_simplify;
 
-  if (thread_through_normal_block (e, dummy_cond, handle_dominating_asserts,
-				   stack, simplify, path, visited,
-				   &backedge_seen, src_map, dst_map))
+  int threaded = thread_through_normal_block (e, dummy_cond,
+					      handle_dominating_asserts,
+					      stack, simplify, path,
+					      visited, &backedge_seen,
+					      src_map, dst_map);
+  if (threaded > 0)
     {
       propagate_threaded_block_debug_into (path->last ()->e->dest,
 					   e->dest);
@@ -1127,10 +1143,27 @@ thread_across_edge (gimple dummy_cond,
     }
   else
     {
-      /* There should be no edges on the path, so no need to walk through
-	 the vector entries.  */
+      /* Negative and zero return values indicate no threading was possible,
+	 thus there should be no edges on the thread path and no need to walk
+	 through the vector entries.  */
       gcc_assert (path->length () == 0);
       path->release ();
+
+      /* A negative status indicates the target block was deemed too big to
+	 duplicate.  Just quit now rather than trying to use the block as
+	 a joiner in a jump threading path.
+
+	 This prevents unnecessary code growth, but more importantly if we
+	 do not look at all the statements in the block, then we may have
+	 missed some invalidations if we had traversed a backedge!  */
+      if (threaded < 0)
+	{
+	  BITMAP_FREE (visited);
+	  BITMAP_FREE (src_map);
+	  BITMAP_FREE (dst_map);
+	  remove_temporary_equivalences (stack);
+	  return;
+	}
     }
 
  /* We were unable to determine what out edge from E->dest is taken.  However,
@@ -1212,7 +1245,7 @@ thread_across_edge (gimple dummy_cond,
 					       handle_dominating_asserts,
 					       stack, simplify, path, visited,
 					       &backedge_seen,
-					       src_map, dst_map);
+					       src_map, dst_map) > 0;
 
 	/* If we were able to thread through a successor of E->dest, then
 	   record the jump threading opportunity.  */

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2014-05-09  4:54 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-09  4:54 [PR tree-optimization/61009] Do not use a block as a joiner if it is too big for threading Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).