* [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
@ 2015-10-07 2:26 Jeff Law
2015-10-07 8:26 ` Richard Biener
0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2015-10-07 2:26 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1723 bytes --]
As touched on in the BZ, we record jump threads as a list of edges to
traverse. A jump thread may be recorded through a block which hasn't
been optimized by DOM yet.
If DOM is able to optimize a control flow statement in such a block,
then it will remove one or more outgoing edges from the block.
Removal of an edge triggers releasing the edge back to the GC system, so
naturally bad things happen if the threader then looks at the content of
those edges.
After some instrumentation I found this was happening for both jump
threads with joiner blocks as well as FSM jump threads. The former are
actually more common than the latter.
Given the sequencing of the recording of the jump thread, DOM optimizing
the COND_EXPR, subsequent releasing the edge structure and finally
examination of the jump thread to modify the CFG I saw only one good
solution and one bad solution.
The bad solution involved removing jump thread paths at the time in
which DOM optimizes the COND_EXPR. The problem is we have to walk the
entire path of every recorded jump thread each time DOM performs this
optimization. Obviously not good.
So instead we record the affected edge pointers into a hash table and
use those later to prune the jump thread paths. It's a single walk over
each recorded jump threading path. Since we're not looking at the
contents of the edge, this works reasonably well.
One more reason to push harder for jump threading to occur independently
of DOM using Sebastian's backward walking FSM bits.
Anyway, bootstrapped and regression tested on x86_64-linux-gnu. Also
bootstrapped and testcase tested on x86_64-linux-gnu with just release
checking enabled.
Installed on the trunk.
Jeff
[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 6430 bytes --]
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 732b3d1..db6f1b6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2015-10-06 Jeff Law <law@redhat.com>
+
+ PR tree-optimization/67816
+ * tree-ssa-threadupdate.h (remove_jump_threads_including): Renamed
+ from remove_jump_threads_starting_at. Accept an edge rather than
+ a basic block.
+ * tree-ssa-threadupdate.c (removed_edges): New hash table.
+ (remove_jump_threads_including): Note edges that get removed from
+ the CFG for later pruning of jump threading paths including them.
+ (thread_through_all_blocks): Remove paths which include edges that
+ have been removed.
+ * tree-ssa-dom.c (optimize_stmt): Call remove_jump_threads_including
+ on each outgoing edges when optimizing away a control statement.
+
2015-10-06 Trevor Saunders <tbsaunde+gcc@tbsaunde.org>
* reorg.c (emit_delay_sequence): Store list of delay slot insns
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 4ec4743..1882fbd 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2015-10-06 Jeff Law <law@redhat.com>
+
+ * gcc.c-torture/compile/pr67816.c: New test.
+
2015-10-07 Kugan Vivekanandarajah <kuganv@linaro.org>
* gcc.target/aarch64/get_lane_f16_1.c: New test.
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr67816.c b/gcc/testsuite/gcc.c-torture/compile/pr67816.c
new file mode 100644
index 0000000..c3db3a3
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr67816.c
@@ -0,0 +1,19 @@
+int a, c, d, e;
+int b[10];
+void fn1() {
+ int i, f = 0;
+ for (;;) {
+ i = 0;
+ for (; i < a; i++)
+ if (c) {
+ if (b[i])
+ f = 1;
+ } else if (b[i])
+ f = 0;
+ if (f)
+ d++;
+ while (e)
+ ;
+ }
+}
+
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index c226da5..941087d 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1840,8 +1840,13 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator si,
edge taken_edge = find_taken_edge (bb, val);
if (taken_edge)
{
- /* Delete threads that start at BB. */
- remove_jump_threads_starting_at (bb);
+
+ /* We need to remove any queued jump threads that
+ reference outgoing edges from this block. */
+ edge_iterator ei;
+ edge e;
+ FOR_EACH_EDGE (e, ei, bb->succs)
+ remove_jump_threads_including (e);
/* If BB is in a loop, then removing an outgoing edge from BB
may cause BB to move outside the loop, changes in the
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 4a147bb..26b199b 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -215,6 +215,18 @@ redirection_data::equal (const redirection_data *p1, const redirection_data *p2)
return true;
}
+/* Rather than search all the edges in jump thread paths each time
+ DOM is able to simply if control statement, we build a hash table
+ with the deleted edges. We only care about the address of the edge,
+ not its contents. */
+struct removed_edges : nofree_ptr_hash<edge_def>
+{
+ static hashval_t hash (edge e) { return htab_hash_pointer (e); }
+ static bool equal (edge e1, edge e2) { return e1 == e2; }
+};
+
+static hash_table<removed_edges> *removed_edges;
+
/* Data structure of information to pass to hash table traversal routines. */
struct ssa_local_info_t
{
@@ -2539,35 +2551,23 @@ valid_jump_thread_path (vec<jump_thread_edge *> *path)
return true;
}
-/* Remove any queued jump threads that start at BB. */
+/* Remove any queued jump threads that include edge E.
+
+ We don't actually remove them here, just record the edges into ax
+ hash table. That way we can do the search once per iteration of
+ DOM/VRP rather than for every case where DOM optimizes away a COND_EXPR. */
void
-remove_jump_threads_starting_at (basic_block bb)
+remove_jump_threads_including (edge_def *e)
{
if (!paths.exists ())
return;
- for (unsigned i = 0; i < paths.length ();)
- {
- vec<jump_thread_edge *> *path = paths[i];
+ if (!removed_edges)
+ removed_edges = new hash_table<struct removed_edges> (17);
- /* Sadly, FSM jump threads have a slightly different
- representation than the rest of the jump threads. */
- if ((*path)[0]->type == EDGE_FSM_THREAD
- && (*path)[0]->e->src == bb)
- {
- delete_jump_thread_path (path);
- paths.unordered_remove (i);
- }
- else if ((*path)[0]->type != EDGE_FSM_THREAD
- && (*path)[0]->e->dest == bb)
- {
- delete_jump_thread_path (path);
- paths.unordered_remove (i);
- }
- else
- i++;
- }
+ edge *slot = removed_edges->find_slot (e, INSERT);
+ *slot = e;
}
/* Walk through all blocks and thread incoming edges to the appropriate
@@ -2591,11 +2591,37 @@ thread_through_all_blocks (bool may_peel_loop_headers)
struct loop *loop;
if (!paths.exists ())
- return false;
+ {
+ retval = false;
+ goto out;
+ }
threaded_blocks = BITMAP_ALLOC (NULL);
memset (&thread_stats, 0, sizeof (thread_stats));
+ /* Remove any paths that referenced removed edges. */
+ if (removed_edges)
+ for (i = 0; i < paths.length (); )
+ {
+ unsigned int j;
+ vec<jump_thread_edge *> *path = paths[i];
+
+ for (j = 0; j < path->length (); j++)
+ {
+ edge e = (*path)[j]->e;
+ if (removed_edges->find_slot (e, NO_INSERT))
+ break;
+ }
+
+ if (j != path->length ())
+ {
+ delete_jump_thread_path (path);
+ paths.unordered_remove (i);
+ continue;
+ }
+ i++;
+ }
+
/* Jump-thread all FSM threads before other jump-threads. */
for (i = 0; i < paths.length ();)
{
@@ -2778,6 +2804,9 @@ thread_through_all_blocks (bool may_peel_loop_headers)
if (retval)
loops_state_set (LOOPS_NEED_FIXUP);
+ out:
+ delete removed_edges;
+ removed_edges = NULL;
return retval;
}
diff --git a/gcc/tree-ssa-threadupdate.h b/gcc/tree-ssa-threadupdate.h
index 30428e8..984b6c4 100644
--- a/gcc/tree-ssa-threadupdate.h
+++ b/gcc/tree-ssa-threadupdate.h
@@ -43,7 +43,7 @@ public:
};
extern void register_jump_thread (vec <class jump_thread_edge *> *);
-extern void remove_jump_threads_starting_at (basic_block);
+extern void remove_jump_threads_including (edge);
extern void delete_jump_thread_path (vec <class jump_thread_edge *> *);
extern void remove_ctrl_stmt_and_useless_edges (basic_block, basic_block);
#endif
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-10-07 2:26 [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path Jeff Law
@ 2015-10-07 8:26 ` Richard Biener
2015-10-07 22:00 ` Jeff Law
0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2015-10-07 8:26 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC Patches
On Wed, Oct 7, 2015 at 4:26 AM, Jeff Law <law@redhat.com> wrote:
>
> As touched on in the BZ, we record jump threads as a list of edges to
> traverse. A jump thread may be recorded through a block which hasn't been
> optimized by DOM yet.
>
> If DOM is able to optimize a control flow statement in such a block, then it
> will remove one or more outgoing edges from the block.
>
> Removal of an edge triggers releasing the edge back to the GC system, so
> naturally bad things happen if the threader then looks at the content of
> those edges.
>
> After some instrumentation I found this was happening for both jump threads
> with joiner blocks as well as FSM jump threads. The former are actually
> more common than the latter.
>
> Given the sequencing of the recording of the jump thread, DOM optimizing the
> COND_EXPR, subsequent releasing the edge structure and finally examination
> of the jump thread to modify the CFG I saw only one good solution and one
> bad solution.
>
> The bad solution involved removing jump thread paths at the time in which
> DOM optimizes the COND_EXPR. The problem is we have to walk the entire path
> of every recorded jump thread each time DOM performs this optimization.
> Obviously not good.
>
> So instead we record the affected edge pointers into a hash table and use
> those later to prune the jump thread paths. It's a single walk over each
> recorded jump threading path. Since we're not looking at the contents of
> the edge, this works reasonably well.
>
> One more reason to push harder for jump threading to occur independently of
> DOM using Sebastian's backward walking FSM bits.
>
> Anyway, bootstrapped and regression tested on x86_64-linux-gnu. Also
> bootstrapped and testcase tested on x86_64-linux-gnu with just release
> checking enabled.
>
> Installed on the trunk.
Hmm, other passes avoid all this by not removing edges or blocks themselves
but leaving that to cfgcleanup. They simply replace the condition in
GIMPLE_CONDs
or the switch value in GIMPLE_SWITCHes and let cleanup_control_expr_graph
do the hard part of the job.
Now - I don't know how that would interact with jump threads covering
those paths,
that is, what kind of "mess" jump threading would produce with the
conditions/switches
mangled that way.
The other nice thing is that CFG cleanup properly deals with loop
structure changes.
Richard.
> Jeff
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 732b3d1..db6f1b6 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,17 @@
> +2015-10-06 Jeff Law <law@redhat.com>
> +
> + PR tree-optimization/67816
> + * tree-ssa-threadupdate.h (remove_jump_threads_including): Renamed
> + from remove_jump_threads_starting_at. Accept an edge rather than
> + a basic block.
> + * tree-ssa-threadupdate.c (removed_edges): New hash table.
> + (remove_jump_threads_including): Note edges that get removed from
> + the CFG for later pruning of jump threading paths including them.
> + (thread_through_all_blocks): Remove paths which include edges that
> + have been removed.
> + * tree-ssa-dom.c (optimize_stmt): Call remove_jump_threads_including
> + on each outgoing edges when optimizing away a control statement.
> +
> 2015-10-06 Trevor Saunders <tbsaunde+gcc@tbsaunde.org>
>
> * reorg.c (emit_delay_sequence): Store list of delay slot insns
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 4ec4743..1882fbd 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,7 @@
> +2015-10-06 Jeff Law <law@redhat.com>
> +
> + * gcc.c-torture/compile/pr67816.c: New test.
> +
> 2015-10-07 Kugan Vivekanandarajah <kuganv@linaro.org>
>
> * gcc.target/aarch64/get_lane_f16_1.c: New test.
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr67816.c
> b/gcc/testsuite/gcc.c-torture/compile/pr67816.c
> new file mode 100644
> index 0000000..c3db3a3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr67816.c
> @@ -0,0 +1,19 @@
> +int a, c, d, e;
> +int b[10];
> +void fn1() {
> + int i, f = 0;
> + for (;;) {
> + i = 0;
> + for (; i < a; i++)
> + if (c) {
> + if (b[i])
> + f = 1;
> + } else if (b[i])
> + f = 0;
> + if (f)
> + d++;
> + while (e)
> + ;
> + }
> +}
> +
> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
> index c226da5..941087d 100644
> --- a/gcc/tree-ssa-dom.c
> +++ b/gcc/tree-ssa-dom.c
> @@ -1840,8 +1840,13 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator
> si,
> edge taken_edge = find_taken_edge (bb, val);
> if (taken_edge)
> {
> - /* Delete threads that start at BB. */
> - remove_jump_threads_starting_at (bb);
> +
> + /* We need to remove any queued jump threads that
> + reference outgoing edges from this block. */
> + edge_iterator ei;
> + edge e;
> + FOR_EACH_EDGE (e, ei, bb->succs)
> + remove_jump_threads_including (e);
>
> /* If BB is in a loop, then removing an outgoing edge from BB
> may cause BB to move outside the loop, changes in the
> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
> index 4a147bb..26b199b 100644
> --- a/gcc/tree-ssa-threadupdate.c
> +++ b/gcc/tree-ssa-threadupdate.c
> @@ -215,6 +215,18 @@ redirection_data::equal (const redirection_data *p1,
> const redirection_data *p2)
> return true;
> }
>
> +/* Rather than search all the edges in jump thread paths each time
> + DOM is able to simply if control statement, we build a hash table
> + with the deleted edges. We only care about the address of the edge,
> + not its contents. */
> +struct removed_edges : nofree_ptr_hash<edge_def>
> +{
> + static hashval_t hash (edge e) { return htab_hash_pointer (e); }
> + static bool equal (edge e1, edge e2) { return e1 == e2; }
> +};
> +
> +static hash_table<removed_edges> *removed_edges;
> +
> /* Data structure of information to pass to hash table traversal routines.
> */
> struct ssa_local_info_t
> {
> @@ -2539,35 +2551,23 @@ valid_jump_thread_path (vec<jump_thread_edge *>
> *path)
> return true;
> }
>
> -/* Remove any queued jump threads that start at BB. */
> +/* Remove any queued jump threads that include edge E.
> +
> + We don't actually remove them here, just record the edges into ax
> + hash table. That way we can do the search once per iteration of
> + DOM/VRP rather than for every case where DOM optimizes away a COND_EXPR.
> */
>
> void
> -remove_jump_threads_starting_at (basic_block bb)
> +remove_jump_threads_including (edge_def *e)
> {
> if (!paths.exists ())
> return;
>
> - for (unsigned i = 0; i < paths.length ();)
> - {
> - vec<jump_thread_edge *> *path = paths[i];
> + if (!removed_edges)
> + removed_edges = new hash_table<struct removed_edges> (17);
>
> - /* Sadly, FSM jump threads have a slightly different
> - representation than the rest of the jump threads. */
> - if ((*path)[0]->type == EDGE_FSM_THREAD
> - && (*path)[0]->e->src == bb)
> - {
> - delete_jump_thread_path (path);
> - paths.unordered_remove (i);
> - }
> - else if ((*path)[0]->type != EDGE_FSM_THREAD
> - && (*path)[0]->e->dest == bb)
> - {
> - delete_jump_thread_path (path);
> - paths.unordered_remove (i);
> - }
> - else
> - i++;
> - }
> + edge *slot = removed_edges->find_slot (e, INSERT);
> + *slot = e;
> }
>
> /* Walk through all blocks and thread incoming edges to the appropriate
> @@ -2591,11 +2591,37 @@ thread_through_all_blocks (bool
> may_peel_loop_headers)
> struct loop *loop;
>
> if (!paths.exists ())
> - return false;
> + {
> + retval = false;
> + goto out;
> + }
>
> threaded_blocks = BITMAP_ALLOC (NULL);
> memset (&thread_stats, 0, sizeof (thread_stats));
>
> + /* Remove any paths that referenced removed edges. */
> + if (removed_edges)
> + for (i = 0; i < paths.length (); )
> + {
> + unsigned int j;
> + vec<jump_thread_edge *> *path = paths[i];
> +
> + for (j = 0; j < path->length (); j++)
> + {
> + edge e = (*path)[j]->e;
> + if (removed_edges->find_slot (e, NO_INSERT))
> + break;
> + }
> +
> + if (j != path->length ())
> + {
> + delete_jump_thread_path (path);
> + paths.unordered_remove (i);
> + continue;
> + }
> + i++;
> + }
> +
> /* Jump-thread all FSM threads before other jump-threads. */
> for (i = 0; i < paths.length ();)
> {
> @@ -2778,6 +2804,9 @@ thread_through_all_blocks (bool may_peel_loop_headers)
> if (retval)
> loops_state_set (LOOPS_NEED_FIXUP);
>
> + out:
> + delete removed_edges;
> + removed_edges = NULL;
> return retval;
> }
>
> diff --git a/gcc/tree-ssa-threadupdate.h b/gcc/tree-ssa-threadupdate.h
> index 30428e8..984b6c4 100644
> --- a/gcc/tree-ssa-threadupdate.h
> +++ b/gcc/tree-ssa-threadupdate.h
> @@ -43,7 +43,7 @@ public:
> };
>
> extern void register_jump_thread (vec <class jump_thread_edge *> *);
> -extern void remove_jump_threads_starting_at (basic_block);
> +extern void remove_jump_threads_including (edge);
> extern void delete_jump_thread_path (vec <class jump_thread_edge *> *);
> extern void remove_ctrl_stmt_and_useless_edges (basic_block, basic_block);
> #endif
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-10-07 8:26 ` Richard Biener
@ 2015-10-07 22:00 ` Jeff Law
2015-10-08 9:56 ` Richard Biener
0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2015-10-07 22:00 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On 10/07/2015 02:26 AM, Richard Biener wrote:
>
> Hmm, other passes avoid all this by not removing edges or blocks
> themselves but leaving that to cfgcleanup. They simply replace the
> condition in GIMPLE_CONDs or the switch value in GIMPLE_SWITCHes and
> let cleanup_control_expr_graph do the hard part of the job.
>
> Now - I don't know how that would interact with jump threads
> covering those paths, that is, what kind of "mess" jump threading
> would produce with the conditions/switches mangled that way.
>
> The other nice thing is that CFG cleanup properly deals with loop
> structure changes.
Threading will handle it correctly as that's essentially how we handled
it until last week ;-)
The "problem" is the threader thread through those blocks. And while
that works correctly, it is quite wasteful in terms of compile time and
memory consumption due to the block duplication required to preserve
side effects. It also churns the blocks & ssa-names used in the
isolated regions which makes comparing debugging dumps before/after
threading much harder than it ought to be.
Jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-10-07 22:00 ` Jeff Law
@ 2015-10-08 9:56 ` Richard Biener
2015-10-09 15:45 ` Jeff Law
0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2015-10-08 9:56 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC Patches
On Thu, Oct 8, 2015 at 12:00 AM, Jeff Law <law@redhat.com> wrote:
> On 10/07/2015 02:26 AM, Richard Biener wrote:
>
>>
>> Hmm, other passes avoid all this by not removing edges or blocks
>> themselves but leaving that to cfgcleanup. They simply replace the
>> condition in GIMPLE_CONDs or the switch value in GIMPLE_SWITCHes and
>> let cleanup_control_expr_graph do the hard part of the job.
>>
>> Now - I don't know how that would interact with jump threads
>> covering those paths, that is, what kind of "mess" jump threading
>> would produce with the conditions/switches mangled that way.
>>
>> The other nice thing is that CFG cleanup properly deals with loop
>> structure changes.
>
> Threading will handle it correctly as that's essentially how we handled it
> until last week ;-)
>
> The "problem" is the threader thread through those blocks. And while that
> works correctly, it is quite wasteful in terms of compile time and memory
> consumption due to the block duplication required to preserve side effects.
> It also churns the blocks & ssa-names used in the isolated regions which
> makes comparing debugging dumps before/after threading much harder than it
> ought to be.
Yes, but as you remove jump threading paths you could leave the CFG change to
cfg-cleanup anyway? To get better behavior wrt loop fixup at least?
> Jeff
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-10-08 9:56 ` Richard Biener
@ 2015-10-09 15:45 ` Jeff Law
2015-12-01 21:32 ` Jeff Law
0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2015-10-09 15:45 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On 10/08/2015 03:56 AM, Richard Biener wrote:
> On Thu, Oct 8, 2015 at 12:00 AM, Jeff Law <law@redhat.com> wrote:
>> On 10/07/2015 02:26 AM, Richard Biener wrote:
>>
>>>
>>> Hmm, other passes avoid all this by not removing edges or blocks
>>> themselves but leaving that to cfgcleanup. They simply replace the
>>> condition in GIMPLE_CONDs or the switch value in GIMPLE_SWITCHes and
>>> let cleanup_control_expr_graph do the hard part of the job.
>>>
>>> Now - I don't know how that would interact with jump threads
>>> covering those paths, that is, what kind of "mess" jump threading
>>> would produce with the conditions/switches mangled that way.
>>>
>>> The other nice thing is that CFG cleanup properly deals with loop
>>> structure changes.
>>
>> Threading will handle it correctly as that's essentially how we handled it
>> until last week ;-)
>>
>> The "problem" is the threader thread through those blocks. And while that
>> works correctly, it is quite wasteful in terms of compile time and memory
>> consumption due to the block duplication required to preserve side effects.
>> It also churns the blocks & ssa-names used in the isolated regions which
>> makes comparing debugging dumps before/after threading much harder than it
>> ought to be.
>
> Yes, but as you remove jump threading paths you could leave the CFG change to
> cfg-cleanup anyway? To get better behavior wrt loop fixup at least?
So go ahead and detect, remove the threading paths, but leave final
fixup to cfg-cleanup. I can certainly try that.
It'd actually be a good thing to experiement with regardless -- I've
speculated that removing the edges in DOM allows DOM to do a better job,
but never did the instrumentation to find out for sure. Deferring the
final cleanup like you've suggested ought to give me most of what I'd
want to see if there's really any good secondary effects of cleaning up
the edges in DOM.
jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-10-09 15:45 ` Jeff Law
@ 2015-12-01 21:32 ` Jeff Law
2015-12-02 9:54 ` Richard Biener
0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2015-12-01 21:32 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On 10/09/2015 09:45 AM, Jeff Law wrote:
>> Yes, but as you remove jump threading paths you could leave the CFG
>> change to
>> cfg-cleanup anyway? To get better behavior wrt loop fixup at least?
> So go ahead and detect, remove the threading paths, but leave final
> fixup to cfg-cleanup. I can certainly try that.
>
> It'd actually be a good thing to experiement with regardless -- I've
> speculated that removing the edges in DOM allows DOM to do a better job,
> but never did the instrumentation to find out for sure. Deferring the
> final cleanup like you've suggested ought to give me most of what I'd
> want to see if there's really any good secondary effects of cleaning up
> the edges in DOM.
So I started looking at this in response to 68619, where this approach
does indeed solve the problem.
Essentially DOM's optimization of those edges results in two irredicuble
loops becoming reducible. The loop analysis code then complains because
we don't have proper loop structures for the new natural loops.
Deferring to cfg_cleanup works because if cfg_cleanup does anything, it
sets LOOPS_NEED_FIXUP (which we were trying to avoid in DOM). So it
seems that the gyrations we often do to avoid LOOPS_NEED_FIXUP are
probably not all that valuable in the end. Anyway...
There's some fallout which I'm still exploring. For example, we have
cases where removal of the edge by DOM results in removal of a PHI
argument in the target, which in turn results in the PHI becoming a
degenerate which we can then propagate away. I have a possible solution
for this that I'm playing with.
I suspect the right path is to continue down this path.
Jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-12-01 21:32 ` Jeff Law
@ 2015-12-02 9:54 ` Richard Biener
2015-12-02 15:31 ` Jeff Law
0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2015-12-02 9:54 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC Patches
On Tue, Dec 1, 2015 at 10:32 PM, Jeff Law <law@redhat.com> wrote:
> On 10/09/2015 09:45 AM, Jeff Law wrote:
>>>
>>> Yes, but as you remove jump threading paths you could leave the CFG
>>> change to
>>> cfg-cleanup anyway? To get better behavior wrt loop fixup at least?
>>
>> So go ahead and detect, remove the threading paths, but leave final
>> fixup to cfg-cleanup. I can certainly try that.
>>
>> It'd actually be a good thing to experiement with regardless -- I've
>> speculated that removing the edges in DOM allows DOM to do a better job,
>> but never did the instrumentation to find out for sure. Deferring the
>> final cleanup like you've suggested ought to give me most of what I'd
>> want to see if there's really any good secondary effects of cleaning up
>> the edges in DOM.
>
> So I started looking at this in response to 68619, where this approach does
> indeed solve the problem.
>
> Essentially DOM's optimization of those edges results in two irredicuble
> loops becoming reducible. The loop analysis code then complains because we
> don't have proper loop structures for the new natural loops.
>
> Deferring to cfg_cleanup works because if cfg_cleanup does anything, it sets
> LOOPS_NEED_FIXUP (which we were trying to avoid in DOM). So it seems that
> the gyrations we often do to avoid LOOPS_NEED_FIXUP are probably not all
> that valuable in the end. Anyway...
Yeah, I have partially working patches lying around to "fix" CFG cleanup to
avoid this. Of course in the case of new loops appearing that's not easily
possible.
> There's some fallout which I'm still exploring. For example, we have cases
> where removal of the edge by DOM results in removal of a PHI argument in the
> target, which in turn results in the PHI becoming a degenerate which we can
> then propagate away. I have a possible solution for this that I'm playing
> with.
>
> I suspect the right path is to continue down this path.
Yeah, the issue here is that DOM isn't tracking which edges are executable
to handle merge PHIs (or to aovid doing work in unreachable regions). It should
be possible to make it do that much like I extended SCCVN to do this
(when doing the DOM walk see if any incoming edge is marked executable
and if not, mark all outgoing edges as not executable, if the block is
executable
at the time we process the last stmt determine if we can compute the edge
that ends up always executed and mark all others as not executable)
Richard.
> Jeff
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-12-02 9:54 ` Richard Biener
@ 2015-12-02 15:31 ` Jeff Law
2015-12-02 15:35 ` Richard Biener
0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2015-12-02 15:31 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On 12/02/2015 02:54 AM, Richard Biener wrote:
>> Deferring to cfg_cleanup works because if cfg_cleanup does anything, it sets
>> LOOPS_NEED_FIXUP (which we were trying to avoid in DOM). So it seems that
>> the gyrations we often do to avoid LOOPS_NEED_FIXUP are probably not all
>> that valuable in the end. Anyway...
>
> Yeah, I have partially working patches lying around to "fix" CFG cleanup to
> avoid this. Of course in the case of new loops appearing that's not easily
> possible.
And that may argue that it's largely inevitable if we collapse a
conditional (and thus delete an edge).
>
>> There's some fallout which I'm still exploring. For example, we have cases
>> where removal of the edge by DOM results in removal of a PHI argument in the
>> target, which in turn results in the PHI becoming a degenerate which we can
>> then propagate away. I have a possible solution for this that I'm playing
>> with.
>>
>> I suspect the right path is to continue down this path.
>
> Yeah, the issue here is that DOM isn't tracking which edges are executable
> to handle merge PHIs (or to aovid doing work in unreachable regions).
Right.
It should
> be possible to make it do that much like I extended SCCVN to do this
> (when doing the DOM walk see if any incoming edge is marked executable
> and if not, mark all outgoing edges as not executable, if the block is
> executable
> at the time we process the last stmt determine if we can compute the edge
> that ends up always executed and mark all others as not executable)
Essentially yes. I'm using the not-executable flag and bypassing things
when it's discovered.
The most interesting side effect, and one I haven't fully analyzed yet
is an unexpected jump thread -- which I've traced back to differences in
what the alias oracle is able to find when we walk unaliased vuses.
Which makes totally no sense that it's unable to find the unaliased vuse
in the simplified CFG, but finds it when we don't remove the
unexecutable edge. As I said, it makes no sense to me yet and I'm still
digging.
jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-12-02 15:31 ` Jeff Law
@ 2015-12-02 15:35 ` Richard Biener
2015-12-02 22:57 ` Jeff Law
2015-12-03 20:29 ` Jeff Law
0 siblings, 2 replies; 14+ messages in thread
From: Richard Biener @ 2015-12-02 15:35 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC Patches
On Wed, Dec 2, 2015 at 4:31 PM, Jeff Law <law@redhat.com> wrote:
> On 12/02/2015 02:54 AM, Richard Biener wrote:
>>>
>>> Deferring to cfg_cleanup works because if cfg_cleanup does anything, it
>>> sets
>>> LOOPS_NEED_FIXUP (which we were trying to avoid in DOM). So it seems
>>> that
>>> the gyrations we often do to avoid LOOPS_NEED_FIXUP are probably not all
>>> that valuable in the end. Anyway...
>>
>>
>> Yeah, I have partially working patches lying around to "fix" CFG cleanup
>> to
>> avoid this. Of course in the case of new loops appearing that's not
>> easily
>> possible.
>
> And that may argue that it's largely inevitable if we collapse a conditional
> (and thus delete an edge).
>
>
>>
>>> There's some fallout which I'm still exploring. For example, we have
>>> cases
>>> where removal of the edge by DOM results in removal of a PHI argument in
>>> the
>>> target, which in turn results in the PHI becoming a degenerate which we
>>> can
>>> then propagate away. I have a possible solution for this that I'm
>>> playing
>>> with.
>>>
>>> I suspect the right path is to continue down this path.
>>
>>
>> Yeah, the issue here is that DOM isn't tracking which edges are executable
>> to handle merge PHIs (or to aovid doing work in unreachable regions).
>
> Right.
>
>
> It should
>>
>> be possible to make it do that much like I extended SCCVN to do this
>> (when doing the DOM walk see if any incoming edge is marked executable
>> and if not, mark all outgoing edges as not executable, if the block is
>> executable
>> at the time we process the last stmt determine if we can compute the edge
>> that ends up always executed and mark all others as not executable)
>
> Essentially yes. I'm using the not-executable flag and bypassing things when
> it's discovered.
>
> The most interesting side effect, and one I haven't fully analyzed yet is an
> unexpected jump thread -- which I've traced back to differences in what the
> alias oracle is able to find when we walk unaliased vuses. Which makes
> totally no sense that it's unable to find the unaliased vuse in the
> simplified CFG, but finds it when we don't remove the unexecutable edge. As
> I said, it makes no sense to me yet and I'm still digging.
The walking of PHI nodes is quite simplistic to avoid doing too much work so
an extra (not executable) edge may confuse it enough. So this might be
"expected". Adding a flag on whether EDGE_EXECUTABLE is to be
trusted would be an option (also helping SCCVN).
Richard.
> jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-12-02 15:35 ` Richard Biener
@ 2015-12-02 22:57 ` Jeff Law
2015-12-03 9:57 ` Richard Biener
2015-12-03 20:29 ` Jeff Law
1 sibling, 1 reply; 14+ messages in thread
From: Jeff Law @ 2015-12-02 22:57 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On 12/02/2015 08:35 AM, Richard Biener wrote:
>>> be possible to make it do that much like I extended SCCVN to do this
>>> (when doing the DOM walk see if any incoming edge is marked executable
>>> and if not, mark all outgoing edges as not executable, if the block is
>>> executable
>>> at the time we process the last stmt determine if we can compute the edge
>>> that ends up always executed and mark all others as not executable)
>>
>> Essentially yes. I'm using the not-executable flag and bypassing things when
>> it's discovered.
I took at look at what you did with SCCVN and I like it better than what
I was doing -- your handling is more complete.
There's some code that ought to be factored out. In particular at the
start of before_dom_children you've got code that clears
EDGE_EXECUTABLE. There's no reason we should duplicate in SCCVN, DOM
and perhaps other DOM walkers in the future. If you've got a place in
mind where you think it ought to live (cfg-something) speak up. Else
I'll find a spot on my own.
>>
>> The most interesting side effect, and one I haven't fully analyzed yet is an
>> unexpected jump thread -- which I've traced back to differences in what the
>> alias oracle is able to find when we walk unaliased vuses. Which makes
>> totally no sense that it's unable to find the unaliased vuse in the
>> simplified CFG, but finds it when we don't remove the unexecutable edge. As
>> I said, it makes no sense to me yet and I'm still digging.
>
> The walking of PHI nodes is quite simplistic to avoid doing too much work so
> an extra (not executable) edge may confuse it enough. So this might be
> "expected". Adding a flag on whether EDGE_EXECUTABLE is to be
> trusted would be an option (also helping SCCVN).
It was actually the opposite effect. ie, with the simplified CFG, we
missed the jump thread, but with the more complex CFG we found the jump
thread. And the edge we were removing didn't (at first glance) appear
to be of any real significance.
I'm not seeing the oddity anymore now that I've converted DOM to mimick
what you were doing in SCCVN. So I'm going to assume I botched
something somewhere.
Back to analysis to see if there's other fallout.
jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-12-02 22:57 ` Jeff Law
@ 2015-12-03 9:57 ` Richard Biener
0 siblings, 0 replies; 14+ messages in thread
From: Richard Biener @ 2015-12-03 9:57 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC Patches
On Wed, Dec 2, 2015 at 11:56 PM, Jeff Law <law@redhat.com> wrote:
> On 12/02/2015 08:35 AM, Richard Biener wrote:
>
>>>> be possible to make it do that much like I extended SCCVN to do this
>>>> (when doing the DOM walk see if any incoming edge is marked executable
>>>> and if not, mark all outgoing edges as not executable, if the block is
>>>> executable
>>>> at the time we process the last stmt determine if we can compute the
>>>> edge
>>>> that ends up always executed and mark all others as not executable)
>>>
>>>
>>> Essentially yes. I'm using the not-executable flag and bypassing things
>>> when
>>> it's discovered.
>
> I took at look at what you did with SCCVN and I like it better than what I
> was doing -- your handling is more complete.
>
> There's some code that ought to be factored out. In particular at the start
> of before_dom_children you've got code that clears EDGE_EXECUTABLE.
> There's no reason we should duplicate in SCCVN, DOM and perhaps other DOM
> walkers in the future. If you've got a place in mind where you think it
> ought to live (cfg-something) speak up. Else I'll find a spot on my own.
I think domwalk.[ch] itself might be a good enough spot, a static method
inside the domwalk class
bool before_dom_children_track_and_query_bb_executable (basic_block);
using a derived class that automagically does this would still require
to explicitely call the parent function before_dom_children, so better
make it explicit.
And document it is non-optimistic thus all edges have to be marked
EDGE_EXECUTABLE before the domwalk and before_dom_children
is supposed to mark non-executable outgoing edges.
>>>
>>> The most interesting side effect, and one I haven't fully analyzed yet is
>>> an
>>> unexpected jump thread -- which I've traced back to differences in what
>>> the
>>> alias oracle is able to find when we walk unaliased vuses. Which makes
>>> totally no sense that it's unable to find the unaliased vuse in the
>>> simplified CFG, but finds it when we don't remove the unexecutable edge.
>>> As
>>> I said, it makes no sense to me yet and I'm still digging.
>>
>>
>> The walking of PHI nodes is quite simplistic to avoid doing too much work
>> so
>> an extra (not executable) edge may confuse it enough. So this might be
>> "expected". Adding a flag on whether EDGE_EXECUTABLE is to be
>> trusted would be an option (also helping SCCVN).
>
> It was actually the opposite effect. ie, with the simplified CFG, we missed
> the jump thread, but with the more complex CFG we found the jump thread.
> And the edge we were removing didn't (at first glance) appear to be of any
> real significance.
Oh, I see...
> I'm not seeing the oddity anymore now that I've converted DOM to mimick what
> you were doing in SCCVN. So I'm going to assume I botched something
> somewhere.
>
> Back to analysis to see if there's other fallout.
>
> jeff
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-12-02 15:35 ` Richard Biener
2015-12-02 22:57 ` Jeff Law
@ 2015-12-03 20:29 ` Jeff Law
2015-12-04 10:12 ` Richard Biener
1 sibling, 1 reply; 14+ messages in thread
From: Jeff Law @ 2015-12-03 20:29 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On 12/02/2015 08:35 AM, Richard Biener wrote:
>>
>> The most interesting side effect, and one I haven't fully analyzed yet is an
>> unexpected jump thread -- which I've traced back to differences in what the
>> alias oracle is able to find when we walk unaliased vuses. Which makes
>> totally no sense that it's unable to find the unaliased vuse in the
>> simplified CFG, but finds it when we don't remove the unexecutable edge. As
>> I said, it makes no sense to me yet and I'm still digging.
>
> The walking of PHI nodes is quite simplistic to avoid doing too much work so
> an extra (not executable) edge may confuse it enough. So this might be
> "expected". Adding a flag on whether EDGE_EXECUTABLE is to be
> trusted would be an option (also helping SCCVN).
Found it. In the CFG with the unexectuable edges _not_ removed there is
a PHI associated with that edge which provides a dominating unaliased
vuse. Once that edge is removed, the PHI arg gets removed and thus we
can't easily see the unaliased vuse.
So all is working as expected. It wasn't ever a big issue, I just
wanted to make sure I thoroughly understood the somewhat
counter-intuitive result.
Jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-12-03 20:29 ` Jeff Law
@ 2015-12-04 10:12 ` Richard Biener
2015-12-04 17:18 ` Jeff Law
0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2015-12-04 10:12 UTC (permalink / raw)
To: Jeff Law; +Cc: GCC Patches
[-- Attachment #1: Type: text/plain, Size: 2700 bytes --]
On Thu, Dec 3, 2015 at 9:29 PM, Jeff Law <law@redhat.com> wrote:
> On 12/02/2015 08:35 AM, Richard Biener wrote:
>
>>>
>>> The most interesting side effect, and one I haven't fully analyzed yet is
>>> an
>>> unexpected jump thread -- which I've traced back to differences in what
>>> the
>>> alias oracle is able to find when we walk unaliased vuses. Which makes
>>> totally no sense that it's unable to find the unaliased vuse in the
>>> simplified CFG, but finds it when we don't remove the unexecutable edge.
>>> As
>>> I said, it makes no sense to me yet and I'm still digging.
>>
>>
>> The walking of PHI nodes is quite simplistic to avoid doing too much work
>> so
>> an extra (not executable) edge may confuse it enough. So this might be
>> "expected". Adding a flag on whether EDGE_EXECUTABLE is to be
>> trusted would be an option (also helping SCCVN).
>
> Found it. In the CFG with the unexectuable edges _not_ removed there is a
> PHI associated with that edge which provides a dominating unaliased vuse.
> Once that edge is removed, the PHI arg gets removed and thus we can't easily
> see the unaliased vuse.
>
> So all is working as expected. It wasn't ever a big issue, I just wanted to
> make sure I thoroughly understood the somewhat counter-intuitive result.
Good. Btw, I remembered that with the unreachable tracking in SCCVN I
ran into cases where it didn't correctly track all unreachable blocks due
to the way the dominator walk works (which doesn't ensure we've visited
all predecessors). Like for
<unreachable>
| |
| <bb3>
| / \
<bb2> / \ /
\ <bb4> <reachable>
\ /
<bb5>
DOM order visits <unreachable>, <bb2>, <bb3>, <bb5> and then
the DOM children like <bb4>. So we fail to detect bb5 as unreachable
(didn't visit bb4 to mark outgoing edges unreachable yet).
The fix is easy (in testing right now). Simply track if the current block
was unreachable when visiting DOM children.
Didn't manage to produce a missed-optimization testcase (but only
tried for a few minutes), the cases I've seen it were involving
unreachable loops, but I don't have them anymore.
Sorry if that makes extracting the machinery harder ;)
Richard.
2015-12-04 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (sccvn_dom_walker): Add unreachable_dom
member and initialize it.
(sccvn_dom_walker::after_dom_children): Reset unreachable_dom
if necessary.
(sccvn_dom_walker::before_dom_children): If unreachable_dom
is set BB is not reachable either. Set unreachable_dom
if not set and BB is unreachable.
> Jeff
[-- Attachment #2: p --]
[-- Type: application/octet-stream, Size: 2490 bytes --]
2015-12-04 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (sccvn_dom_walker): Add unreachable_dom
member and initialize it.
(sccvn_dom_walker::after_dom_children): Reset unreachable_dom
if necessary.
(sccvn_dom_walker::before_dom_children): If unreachable_dom
is set BB is not reachable either. Set unreachable_dom
if not set and BB is unreachable.
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c (revision 231244)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -4207,7 +4207,8 @@ class sccvn_dom_walker : public dom_walk
{
public:
sccvn_dom_walker ()
- : dom_walker (CDI_DOMINATORS), fail (false), cond_stack (vNULL) {}
+ : dom_walker (CDI_DOMINATORS), fail (false), unreachable_dom (NULL),
+ cond_stack (vNULL) {}
~sccvn_dom_walker ();
virtual void before_dom_children (basic_block);
@@ -4219,6 +4220,7 @@ public:
enum tree_code code, tree lhs, tree rhs, bool value);
bool fail;
+ basic_block unreachable_dom;
vec<std::pair <basic_block, std::pair <vn_nary_op_t, vn_nary_op_t> > >
cond_stack;
};
@@ -4299,6 +4301,9 @@ sccvn_dom_walker::record_conds (basic_bl
void
sccvn_dom_walker::after_dom_children (basic_block bb)
{
+ if (unreachable_dom == bb)
+ unreachable_dom = NULL;
+
while (!cond_stack.is_empty ()
&& cond_stack.last ().first == bb)
{
@@ -4325,10 +4330,14 @@ sccvn_dom_walker::before_dom_children (b
/* If any of the predecessor edges that do not come from blocks dominated
by us are still marked as possibly executable consider this block
reachable. */
- bool reachable = bb == ENTRY_BLOCK_PTR_FOR_FN (cfun);
- FOR_EACH_EDGE (e, ei, bb->preds)
- if (!dominated_by_p (CDI_DOMINATORS, e->src, bb))
- reachable |= (e->flags & EDGE_EXECUTABLE);
+ bool reachable = false;
+ if (!unreachable_dom)
+ {
+ reachable = bb == ENTRY_BLOCK_PTR_FOR_FN (cfun);
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ if (!dominated_by_p (CDI_DOMINATORS, e->src, bb))
+ reachable |= (e->flags & EDGE_EXECUTABLE);
+ }
/* If the block is not reachable all outgoing edges are not
executable. Neither are incoming edges with src dominated by us. */
@@ -4352,6 +4361,11 @@ sccvn_dom_walker::before_dom_children (b
e->flags &= ~EDGE_EXECUTABLE;
}
}
+
+ /* Record the most dominating unreachable block. */
+ if (!unreachable_dom)
+ unreachable_dom = bb;
+
return;
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path
2015-12-04 10:12 ` Richard Biener
@ 2015-12-04 17:18 ` Jeff Law
0 siblings, 0 replies; 14+ messages in thread
From: Jeff Law @ 2015-12-04 17:18 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On 12/04/2015 03:12 AM, Richard Biener wrote:
> On Thu, Dec 3, 2015 at 9:29 PM, Jeff Law <law@redhat.com> wrote:
>> On 12/02/2015 08:35 AM, Richard Biener wrote:
>>
>>>>
>>>> The most interesting side effect, and one I haven't fully analyzed yet is
>>>> an
>>>> unexpected jump thread -- which I've traced back to differences in what
>>>> the
>>>> alias oracle is able to find when we walk unaliased vuses. Which makes
>>>> totally no sense that it's unable to find the unaliased vuse in the
>>>> simplified CFG, but finds it when we don't remove the unexecutable edge.
>>>> As
>>>> I said, it makes no sense to me yet and I'm still digging.
>>>
>>>
>>> The walking of PHI nodes is quite simplistic to avoid doing too much work
>>> so
>>> an extra (not executable) edge may confuse it enough. So this might be
>>> "expected". Adding a flag on whether EDGE_EXECUTABLE is to be
>>> trusted would be an option (also helping SCCVN).
>>
>> Found it. In the CFG with the unexectuable edges _not_ removed there is a
>> PHI associated with that edge which provides a dominating unaliased vuse.
>> Once that edge is removed, the PHI arg gets removed and thus we can't easily
>> see the unaliased vuse.
>>
>> So all is working as expected. It wasn't ever a big issue, I just wanted to
>> make sure I thoroughly understood the somewhat counter-intuitive result.
>
> Good. Btw, I remembered that with the unreachable tracking in SCCVN I
> ran into cases where it didn't correctly track all unreachable blocks due
> to the way the dominator walk works (which doesn't ensure we've visited
> all predecessors). Like for
>
> <unreachable>
> | |
> | <bb3>
> | / \
> <bb2> / \ /
> \ <bb4> <reachable>
> \ /
> <bb5>
>
> DOM order visits <unreachable>, <bb2>, <bb3>, <bb5> and then
> the DOM children like <bb4>. So we fail to detect bb5 as unreachable
> (didn't visit bb4 to mark outgoing edges unreachable yet).
>
> The fix is easy (in testing right now). Simply track if the current block
> was unreachable when visiting DOM children.
>
> Didn't manage to produce a missed-optimization testcase (but only
> tried for a few minutes), the cases I've seen it were involving
> unreachable loops, but I don't have them anymore.
>
> Sorry if that makes extracting the machinery harder ;)
I'll get it sorted out.
I found that I actually wanted those bits extracted and split into two
functions. One for the test and one for the propagation. The hardest
part is actually chosing meaningful names.
jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2015-12-04 17:18 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-07 2:26 [PATCH][PR tree-optimization/67816] Fix jump threading when DOM removes conditionals in jump threading path Jeff Law
2015-10-07 8:26 ` Richard Biener
2015-10-07 22:00 ` Jeff Law
2015-10-08 9:56 ` Richard Biener
2015-10-09 15:45 ` Jeff Law
2015-12-01 21:32 ` Jeff Law
2015-12-02 9:54 ` Richard Biener
2015-12-02 15:31 ` Jeff Law
2015-12-02 15:35 ` Richard Biener
2015-12-02 22:57 ` Jeff Law
2015-12-03 9:57 ` Richard Biener
2015-12-03 20:29 ` Jeff Law
2015-12-04 10:12 ` Richard Biener
2015-12-04 17:18 ` Jeff Law
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).