* [committed. gomp4] pass_dominator_oacc_kernels patch series
@ 2015-10-12 14:50 Tom de Vries
2015-10-12 15:00 ` [committed, gomp4, 1/6] Add pass_dominator::jump_threading_p () Tom de Vries
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Tom de Vries @ 2015-10-12 14:50 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
Hi,
I've committed the following patch series to the gomp-4_0-branch.
1 Add pass_dominator::jump_threading_p ()
2 Add dom_walker::walk_until
3 Add pass_dominator::sese_mode_p ()
4 Add skip_stmt parm to pass_dominator::get_sese ()
5 Add oacc kernels related infra functions
6 Add pass_dominator_oacc_kernels
The patch series adds a pass, pass_dominator_oacc_kernels, which performs
the pass_dominator optimizations (with the exception of jump threading) on
each oacc kernels region rather than on the whole function.
Bootstrapped and reg-tested on x86_64.
I'll post the patches individually, in reply to this email.
Thanks,
- Tom
* [committed, gomp4, 1/6] Add pass_dominator::jump_threading_p ()
2015-10-12 14:50 [committed. gomp4] pass_dominator_oacc_kernels patch series Tom de Vries
@ 2015-10-12 15:00 ` Tom de Vries
2015-10-12 15:02 ` [committed, gomp4, 2/6] Add dom_walker::walk_until Tom de Vries
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Tom de Vries @ 2015-10-12 15:00 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 1016 bytes --]
This patch adds the ability for pass_dominator to switch off the
jump threading optimization.
Note that we do not disable threadedge_initialize_values /
threadedge_finalize_values, since the values stored there are used for
other optimizations as well.
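To illustrate the shape of the change outside of GCC, here is a minimal
standalone sketch of a virtual predicate that gates one optimization while
the shared setup and teardown still run. The names toy_pass and
toy_pass_no_threading and the step strings are hypothetical and are not
part of the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a pass; this is not GCC's pass_dominator.
class toy_pass
{
public:
  virtual ~toy_pass () {}

  // Run the pass and return the names of the steps that were executed.
  std::vector<std::string> execute ()
  {
    std::vector<std::string> steps;
    // Shared state is set up unconditionally, because optimizations other
    // than jump threading read it as well.
    steps.push_back ("initialize_values");
    if (jump_threading_p ())
      steps.push_back ("record_edge_info");
    steps.push_back ("dominator_walk");
    if (jump_threading_p ())
      steps.push_back ("thread_jumps");
    steps.push_back ("finalize_values");
    return steps;
  }

protected:
  // Return true if the pass should perform jump threading.
  virtual bool jump_threading_p () { return true; }
};

// A variant that opts out of jump threading by overriding the hook.
class toy_pass_no_threading : public toy_pass
{
protected:
  bool jump_threading_p () override { return false; }
};
```

With this layout a derived pass only has to override one function; the
driver in execute () stays in a single place.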
Thanks,
- Tom
[-- Attachment #2: 0001-Add-pass_dominator-jump_threading_p.patch --]
[-- Type: text/x-patch, Size: 7116 bytes --]
Add pass_dominator::jump_threading_p ()
2015-10-12 Tom de Vries <tom@codesourcery.com>
* tree-ssa-dom.c (dom_opt_dom_walker::dom_opt_dom_walker): Add
jump_threading_p parameter.
(dom_opt_dom_walker::m_jump_threading_p): New private var.
(pass_dominator::jump_threading_p): New protected virtual function.
(pass_dominator::execute): Handle jump_threading_p.
(dom_opt_dom_walker::before_dom_children)
(dom_opt_dom_walker::after_dom_children): Handle m_jump_threading_p.
---
gcc/tree-ssa-dom.c | 109 +++++++++++++++++++++++++++++++----------------------
1 file changed, 64 insertions(+), 45 deletions(-)
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index a8b7038..162d9ed 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -492,11 +492,14 @@ class dom_opt_dom_walker : public dom_walker
public:
dom_opt_dom_walker (cdi_direction direction,
class const_and_copies *const_and_copies,
- class avail_exprs_stack *avail_exprs_stack)
+ class avail_exprs_stack *avail_exprs_stack,
+ bool jump_threading_p)
: dom_walker (direction),
m_const_and_copies (const_and_copies),
m_avail_exprs_stack (avail_exprs_stack),
- m_dummy_cond (NULL) {}
+ m_dummy_cond (NULL),
+ m_jump_threading_p (jump_threading_p)
+ {}
virtual void before_dom_children (basic_block);
virtual void after_dom_children (basic_block);
@@ -509,6 +512,7 @@ private:
class avail_exprs_stack *m_avail_exprs_stack;
gcond *m_dummy_cond;
+ bool m_jump_threading_p;
};
/* Jump threading, redundancy elimination and const/copy propagation.
@@ -544,6 +548,10 @@ public:
virtual bool gate (function *) { return flag_tree_dom != 0; }
virtual unsigned int execute (function *);
+ protected:
+ /* Return true if pass should perform jump threading. */
+ virtual bool jump_threading_p (void) { return true; }
+
}; // class pass_dominator
unsigned int
@@ -578,25 +586,29 @@ pass_dominator::execute (function *fun)
/* Initialize the value-handle array. */
threadedge_initialize_values ();
- /* We need accurate information regarding back edges in the CFG
- for jump threading; this may include back edges that are not part of
- a single loop. */
- mark_dfs_back_edges ();
-
- /* We want to create the edge info structures before the dominator walk
- so that they'll be in place for the jump threader, particularly when
- threading through a join block.
-
- The conditions will be lazily updated with global equivalences as
- we reach them during the dominator walk. */
- basic_block bb;
- FOR_EACH_BB_FN (bb, fun)
- record_edge_info (bb);
+ if (jump_threading_p ())
+ {
+ /* We need accurate information regarding back edges in the CFG
+ for jump threading; this may include back edges that are not part of
+ a single loop. */
+ mark_dfs_back_edges ();
+
+ /* We want to create the edge info structures before the dominator walk
+ so that they'll be in place for the jump threader, particularly when
+ threading through a join block.
+
+ The conditions will be lazily updated with global equivalences as
+ we reach them during the dominator walk. */
+ basic_block bb;
+ FOR_EACH_BB_FN (bb, fun)
+ record_edge_info (bb);
+ }
/* Recursively walk the dominator tree optimizing statements. */
dom_opt_dom_walker walker (CDI_DOMINATORS,
const_and_copies,
- avail_exprs_stack);
+ avail_exprs_stack,
+ jump_threading_p ());
walker.walk (fun->cfg->x_entry_block_ptr);
{
@@ -616,10 +628,13 @@ pass_dominator::execute (function *fun)
duplication and CFG manipulation. */
update_ssa (TODO_update_ssa);
- free_all_edge_infos ();
+ if (jump_threading_p ())
+ {
+ free_all_edge_infos ();
- /* Thread jumps, creating duplicate blocks as needed. */
- cfg_altered |= thread_through_all_blocks (first_pass_instance);
+ /* Thread jumps, creating duplicate blocks as needed. */
+ cfg_altered |= thread_through_all_blocks (first_pass_instance);
+ }
if (cfg_altered)
free_dominance_info (CDI_DOMINATORS);
@@ -1314,7 +1329,8 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
optimize_stmt (bb, gsi, m_const_and_copies, m_avail_exprs_stack);
/* Now prepare to process dominated blocks. */
- record_edge_info (bb);
+ if (m_jump_threading_p)
+ record_edge_info (bb);
cprop_into_successor_phis (bb, m_const_and_copies);
}
@@ -1327,35 +1343,38 @@ dom_opt_dom_walker::after_dom_children (basic_block bb)
{
gimple *last;
- /* If we have an outgoing edge to a block with multiple incoming and
- outgoing edges, then we may be able to thread the edge, i.e., we
- may be able to statically determine which of the outgoing edges
- will be traversed when the incoming edge from BB is traversed. */
- if (single_succ_p (bb)
- && (single_succ_edge (bb)->flags & EDGE_ABNORMAL) == 0
- && potentially_threadable_block (single_succ (bb)))
- {
- thread_across_edge (single_succ_edge (bb));
- }
- else if ((last = last_stmt (bb))
- && gimple_code (last) == GIMPLE_COND
- && EDGE_COUNT (bb->succs) == 2
- && (EDGE_SUCC (bb, 0)->flags & EDGE_ABNORMAL) == 0
- && (EDGE_SUCC (bb, 1)->flags & EDGE_ABNORMAL) == 0)
+ if (m_jump_threading_p)
{
- edge true_edge, false_edge;
+ /* If we have an outgoing edge to a block with multiple incoming and
+ outgoing edges, then we may be able to thread the edge, i.e., we
+ may be able to statically determine which of the outgoing edges
+ will be traversed when the incoming edge from BB is traversed. */
+ if (single_succ_p (bb)
+ && (single_succ_edge (bb)->flags & EDGE_ABNORMAL) == 0
+ && potentially_threadable_block (single_succ (bb)))
+ {
+ thread_across_edge (single_succ_edge (bb));
+ }
+ else if ((last = last_stmt (bb))
+ && gimple_code (last) == GIMPLE_COND
+ && EDGE_COUNT (bb->succs) == 2
+ && (EDGE_SUCC (bb, 0)->flags & EDGE_ABNORMAL) == 0
+ && (EDGE_SUCC (bb, 1)->flags & EDGE_ABNORMAL) == 0)
+ {
+ edge true_edge, false_edge;
- extract_true_false_edges_from_block (bb, &true_edge, &false_edge);
+ extract_true_false_edges_from_block (bb, &true_edge, &false_edge);
- /* Only try to thread the edge if it reaches a target block with
- more than one predecessor and more than one successor. */
- if (potentially_threadable_block (true_edge->dest))
- thread_across_edge (true_edge);
+ /* Only try to thread the edge if it reaches a target block with
+ more than one predecessor and more than one successor. */
+ if (potentially_threadable_block (true_edge->dest))
+ thread_across_edge (true_edge);
- /* Similarly for the ELSE arm. */
- if (potentially_threadable_block (false_edge->dest))
- thread_across_edge (false_edge);
+ /* Similarly for the ELSE arm. */
+ if (potentially_threadable_block (false_edge->dest))
+ thread_across_edge (false_edge);
+ }
}
/* These remove expressions local to BB from the tables. */
--
1.9.1
* [committed, gomp4, 2/6] Add dom_walker::walk_until
2015-10-12 14:50 [committed. gomp4] pass_dominator_oacc_kernels patch series Tom de Vries
2015-10-12 15:00 ` [committed, gomp4, 1/6] Add pass_dominator::jump_threading_p () Tom de Vries
@ 2015-10-12 15:02 ` Tom de Vries
2015-10-12 15:06 ` [committed, gomp4, 3/6] Add pass_dominator::sese_mode_p () Tom de Vries
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Tom de Vries @ 2015-10-12 15:02 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 848 bytes --]
This patch adds the ability to walk a part of a dominator tree, rather
than the whole tree.
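As a rough standalone illustration of the intended semantics (not the GCC
code itself), the sketch below walks a toy dominator tree in preorder and
cuts the walk off at a given node. The dom_tree type, the integer node
numbering and the use of -1 for "no cut-off" are assumptions made for the
example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy dominator tree: children[n] lists the nodes immediately
// dominated by node n.  This is not GCC's data structure.
typedef std::vector<std::vector<int> > dom_tree;

// Visit the subtree rooted at BB in preorder, appending to ORDER.  UNTIL is
// the root of a subtree to cut off (-1 for none).  If UNTIL_INCLUSIVE, UNTIL
// itself is visited but its children are not; otherwise neither UNTIL nor
// its children are visited.
void
walk_until (const dom_tree &children, int bb, int until, bool until_inclusive,
            std::vector<int> &order)
{
  assert (bb >= 0 && static_cast<std::size_t> (bb) < children.size ());

  if (bb == until && !until_inclusive)
    return;

  std::vector<int> worklist (1, bb);
  while (!worklist.empty ())
    {
      int cur = worklist.back ();
      worklist.pop_back ();
      order.push_back (cur);

      // UNTIL is only ever on the worklist in the inclusive case, so
      // reaching it here means: visited, but do not descend.
      if (cur == until)
        continue;

      // Push children in reverse so they are popped in their listed order.
      const std::vector<int> &kids = children[cur];
      for (auto it = kids.rbegin (); it != kids.rend (); ++it)
        if (!(*it == until && !until_inclusive))
          worklist.push_back (*it);
    }
}
```

Passing -1 as UNTIL visits the whole subtree, which mirrors how the patch
reimplements walk () as walk_until (bb, NULL, true).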
Thanks,
- Tom
[-- Attachment #2: 0002-Add-dom_walker-walk_until.patch --]
[-- Type: text/x-patch, Size: 2688 bytes --]
Add dom_walker::walk_until
2015-10-12 Tom de Vries <tom@codesourcery.com>
* domwalk.c (dom_walker::walk): Rename to ...
(dom_walker::walk_until): ... this. Add and handle until and
until_inclusive parameters.
(dom_walker::walk): Reimplement using dom_walker::walk_until.
* domwalk.h (dom_walker::walk_until): Declare.
---
gcc/domwalk.c | 32 +++++++++++++++++++++++++++-----
gcc/domwalk.h | 2 ++
2 files changed, 29 insertions(+), 5 deletions(-)
diff --git a/gcc/domwalk.c b/gcc/domwalk.c
index bbf9ff8..5fe666e 100644
--- a/gcc/domwalk.c
+++ b/gcc/domwalk.c
@@ -144,11 +144,18 @@ cmp_bb_postorder (const void *a, const void *b)
}
/* Recursively walk the dominator tree.
- BB is the basic block we are currently visiting. */
+ BB is the basic block we are currently visiting. UNTIL is a basic_block that
+ is the root of a subtree that we won't visit. If UNTIL_INCLUSIVE, we visit
+ UNTIL, but not its children. Otherwise don't visit UNTIL and its
+ children. */
void
-dom_walker::walk (basic_block bb)
+dom_walker::walk_until (basic_block bb, basic_block until, bool until_inclusive)
{
+ bool skip_self = (bb == until && !until_inclusive);
+ if (skip_self)
+ return;
+
basic_block dest;
basic_block *worklist = XNEWVEC (basic_block,
n_basic_blocks_for_fn (cfun) * 2);
@@ -182,9 +189,15 @@ dom_walker::walk (basic_block bb)
worklist[sp++] = NULL;
int saved_sp = sp;
- for (dest = first_dom_son (m_dom_direction, bb);
- dest; dest = next_dom_son (m_dom_direction, dest))
- worklist[sp++] = dest;
+ bool skip_children = bb == until && until_inclusive;
+ if (!skip_children)
+ for (dest = first_dom_son (m_dom_direction, bb);
+ dest; dest = next_dom_son (m_dom_direction, dest))
+ {
+ bool skip_child = (dest == until && !until_inclusive);
+ if (!skip_child)
+ worklist[sp++] = dest;
+ }
if (m_dom_direction == CDI_DOMINATORS)
switch (sp - saved_sp)
{
@@ -218,3 +231,12 @@ dom_walker::walk (basic_block bb)
}
free (worklist);
}
+
+/* Recursively walk the dominator tree.
+ BB is the basic block we are currently visiting. */
+
+void
+dom_walker::walk (basic_block bb)
+{
+ walk_until (bb, NULL, true);
+}
diff --git a/gcc/domwalk.h b/gcc/domwalk.h
index 71a7c47..71e6075 100644
--- a/gcc/domwalk.h
+++ b/gcc/domwalk.h
@@ -34,6 +34,8 @@ public:
/* Walk the dominator tree. */
void walk (basic_block);
+ /* Walk a part of the dominator tree. */
+ void walk_until (basic_block, basic_block, bool);
/* Function to call before the recursive walk of the dominator children. */
virtual void before_dom_children (basic_block) {}
--
1.9.1
* [committed, gomp4, 3/6] Add pass_dominator::sese_mode_p ()
2015-10-12 14:50 [committed. gomp4] pass_dominator_oacc_kernels patch series Tom de Vries
2015-10-12 15:00 ` [committed, gomp4, 1/6] Add pass_dominator::jump_threading_p () Tom de Vries
2015-10-12 15:02 ` [committed, gomp4, 2/6] Add dom_walker::walk_until Tom de Vries
@ 2015-10-12 15:06 ` Tom de Vries
2015-10-12 15:07 ` [committed, gomp4, 4/6] Add skip_stmt parm to pass_dominator::get_sese () Tom de Vries
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Tom de Vries @ 2015-10-12 15:06 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 874 bytes --]
This patch adds to pass_dominator the ability to work on a series of
SESE (single-entry, single-exit) regions rather than on the entire function.
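In sese mode the per-region state is bracketed by push_marker () and
pop_to_marker () calls, so that equivalences found in one region are not
visible in the next. A minimal standalone sketch of that marker idea
follows; scoped_table and its record/lookup members are hypothetical and
only loosely modelled on the calls in the patch, not on GCC's actual
const_and_copies or avail_exprs_stack classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical scoped table of known values.  Entries recorded after a
// marker are discarded again when that marker is popped.
class scoped_table
{
private:
  struct entry
  {
    bool is_marker = true;
    std::string name;
    int value = 0;
  };

  std::vector<entry> m_stack;

public:
  // Start a new scope.
  void push_marker () { m_stack.push_back (entry ()); }

  // Record that NAME is known to have VALUE in the current scope.
  void record (const std::string &name, int value)
  {
    entry e;
    e.is_marker = false;
    e.name = name;
    e.value = value;
    m_stack.push_back (e);
  }

  // Return true and set *VALUE if NAME is currently known.
  bool lookup (const std::string &name, int *value) const
  {
    for (auto it = m_stack.rbegin (); it != m_stack.rend (); ++it)
      if (!it->is_marker && it->name == name)
        {
          *value = it->value;
          return true;
        }
    return false;
  }

  // Discard everything recorded since the most recent marker, and the
  // marker itself.
  void pop_to_marker ()
  {
    while (true)
      {
        assert (!m_stack.empty ()); // A matching push_marker is required.
        bool was_marker = m_stack.back ().is_marker;
        m_stack.pop_back ();
        if (was_marker)
          break;
      }
  }
};
```

Wrapping each region in its own push/pop pair is what lets a single table
object be reused across regions without leaking facts between them.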
Thanks,
- Tom
[-- Attachment #2: 0003-Add-pass_dominator-sese_mode_p.patch --]
[-- Type: text/x-patch, Size: 3413 bytes --]
Add pass_dominator::sese_mode_p ()
2015-10-12 Tom de Vries <tom@codesourcery.com>
* tree-ssa-dom.c (pass_dominator::jump_threading_p): Handle sese_mode_p.
(pass_dominator::sese_mode_p, pass_dominator::get_sese): New protected
virtual functions.
(pass_dominator::execute): Handle sese_mode_p.
---
gcc/tree-ssa-dom.c | 49 +++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 43 insertions(+), 6 deletions(-)
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 162d9ed..7a1250e 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see
#include "tree-ssa-dom.h"
#include "gimplify.h"
#include "tree-cfgcleanup.h"
+#include "cfgcleanup.h"
/* This file implements optimizations on the dominator tree. */
@@ -550,7 +551,17 @@ public:
protected:
/* Return true if pass should perform jump threading. */
- virtual bool jump_threading_p (void) { return true; }
+ virtual bool jump_threading_p (void) { return !sese_mode_p (); }
+
+ /* Return true if pass should visit a series of seses rather than the whole
+ dominator tree. */
+ virtual bool sese_mode_p (void) { return false; }
+
+ /* In sese mode, return true if there's another sese to visit. Return the
+ sese to visit in SESE_ENTRY and SESE_EXIT. */
+ virtual bool get_sese (basic_block *sese_entry ATTRIBUTE_UNUSED,
+ basic_block *sese_exit ATTRIBUTE_UNUSED)
+ { gcc_unreachable (); }
}; // class pass_dominator
@@ -583,11 +594,14 @@ pass_dominator::execute (function *fun)
LOOPS_HAVE_PREHEADERS won't be needed here. */
loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
- /* Initialize the value-handle array. */
- threadedge_initialize_values ();
+ if (!sese_mode_p ())
+ /* Initialize the value-handle array. */
+ threadedge_initialize_values ();
if (jump_threading_p ())
{
+ gcc_assert (!sese_mode_p ());
+
/* We need accurate information regarding back edges in the CFG
for jump threading; this may include back edges that are not part of
a single loop. */
@@ -609,7 +623,29 @@ pass_dominator::execute (function *fun)
const_and_copies,
avail_exprs_stack,
jump_threading_p ());
- walker.walk (fun->cfg->x_entry_block_ptr);
+ if (!sese_mode_p ())
+ walker.walk (fun->cfg->x_entry_block_ptr);
+ else
+ {
+ basic_block sese_entry, sese_exit;
+ while (get_sese (&sese_entry, &sese_exit))
+ {
+ threadedge_initialize_values ();
+ avail_exprs_stack->push_marker ();
+ const_and_copies->push_marker ();
+
+ walker.walk_until (sese_entry, sese_exit, true);
+
+ avail_exprs_stack->pop_to_marker ();
+ const_and_copies->pop_to_marker ();
+ threadedge_finalize_values ();
+
+ /* KLUDGE: The dom_walker does not allow unreachable blocks when
+ starting the walk, and during the dom_opt_dom_walker walk we may
+ produce unreachable blocks, so we need to clean them up here. */
+ delete_unreachable_blocks ();
+ }
+ }
{
gimple_stmt_iterator gsi;
@@ -709,8 +745,9 @@ pass_dominator::execute (function *fun)
delete avail_exprs_stack;
delete const_and_copies;
- /* Free the value-handle array. */
- threadedge_finalize_values ();
+ if (!sese_mode_p ())
+ /* Free the value-handle array. */
+ threadedge_finalize_values ();
return 0;
}
--
1.9.1
* [committed, gomp4, 4/6] Add skip_stmt parm to pass_dominator::get_sese ()
2015-10-12 14:50 [committed. gomp4] pass_dominator_oacc_kernels patch series Tom de Vries
` (2 preceding siblings ...)
2015-10-12 15:06 ` [committed, gomp4, 3/6] Add pass_dominator::sese_mode_p () Tom de Vries
@ 2015-10-12 15:07 ` Tom de Vries
2015-10-12 15:10 ` [committed. gomp4, 5/6] Add oacc kernels related infra functions Tom de Vries
2015-10-12 15:17 ` [committed. gomp4, 6/6] Add pass_dominator_oacc_kernels Tom de Vries
5 siblings, 0 replies; 7+ messages in thread
From: Tom de Vries @ 2015-10-12 15:07 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 849 bytes --]
This patch adds to pass_dominator the ability to skip a statement while
optimizing a sese region.
Thanks,
- Tom
[-- Attachment #2: 0004-Add-skip_stmt-parm-to-pass_dominator-get_sese.patch --]
[-- Type: text/x-patch, Size: 2615 bytes --]
Add skip_stmt parm to pass_dominator::get_sese ()
2015-10-12 Tom de Vries <tom@codesourcery.com>
* tree-ssa-dom.c (dom_opt_dom_walker::set_skip_stmt): New function.
(dom_opt_dom_walker::m_skip_stmt): New private var.
(pass_dominator::get_sese): Add skip_stmt parameter.
(pass_dominator::execute): Call set_skip_stmt with statement to skip for
sese.
(dom_opt_dom_walker::before_dom_children): Handle m_skip_stmt.
---
gcc/tree-ssa-dom.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 7a1250e..573e6fc 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -504,6 +504,7 @@ public:
virtual void before_dom_children (basic_block);
virtual void after_dom_children (basic_block);
+ void set_skip_stmt (gimple *skip_stmt) { m_skip_stmt = skip_stmt; }
private:
void thread_across_edge (edge);
@@ -514,6 +515,7 @@ private:
gcond *m_dummy_cond;
bool m_jump_threading_p;
+ gimple *m_skip_stmt;
};
/* Jump threading, redundancy elimination and const/copy propagation.
@@ -558,9 +560,11 @@ public:
virtual bool sese_mode_p (void) { return false; }
/* In sese mode, return true if there's another sese to visit. Return the
- sese to visit in SESE_ENTRY and SESE_EXIT. */
+ sese to visit in SESE_ENTRY and SESE_EXIT. If a stmt in the sese should
+ not be optimized, return it in SKIP_STMT. */
virtual bool get_sese (basic_block *sese_entry ATTRIBUTE_UNUSED,
- basic_block *sese_exit ATTRIBUTE_UNUSED)
+ basic_block *sese_exit ATTRIBUTE_UNUSED,
+ gimple **skip_stmt ATTRIBUTE_UNUSED)
{ gcc_unreachable (); }
}; // class pass_dominator
@@ -628,8 +632,11 @@ pass_dominator::execute (function *fun)
else
{
basic_block sese_entry, sese_exit;
- while (get_sese (&sese_entry, &sese_exit))
+ gimple *skip_stmt = NULL;
+ while (get_sese (&sese_entry, &sese_exit, &skip_stmt))
{
+ walker.set_skip_stmt (skip_stmt);
+
threadedge_initialize_values ();
avail_exprs_stack->push_marker ();
const_and_copies->push_marker ();
@@ -1363,7 +1370,12 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
m_avail_exprs_stack->pop_to_marker ();
for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
- optimize_stmt (bb, gsi, m_const_and_copies, m_avail_exprs_stack);
+ {
+ if (gsi_stmt (gsi) == m_skip_stmt)
+ continue;
+
+ optimize_stmt (bb, gsi, m_const_and_copies, m_avail_exprs_stack);
+ }
/* Now prepare to process dominated blocks. */
if (m_jump_threading_p)
--
1.9.1
* [committed. gomp4, 5/6] Add oacc kernels related infra functions
2015-10-12 14:50 [committed. gomp4] pass_dominator_oacc_kernels patch series Tom de Vries
` (3 preceding siblings ...)
2015-10-12 15:07 ` [committed, gomp4, 4/6] Add skip_stmt parm to pass_dominator::get_sese () Tom de Vries
@ 2015-10-12 15:10 ` Tom de Vries
2015-10-12 15:17 ` [committed. gomp4, 6/6] Add pass_dominator_oacc_kernels Tom de Vries
5 siblings, 0 replies; 7+ messages in thread
From: Tom de Vries @ 2015-10-12 15:10 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 1013 bytes --]
This patch adds three new oacc kernels region related infrastructure
functions:
extern tree get_omp_data_i (basic_block);
extern bool oacc_kernels_region_entry_p (basic_block, gomp_target **);
extern basic_block get_oacc_kernels_region_exit (basic_block);
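For readers without the GCC sources at hand, the region exit lookup can be
pictured as a worklist search over the CFG, starting at the region entry
and stopping at the first block that closes the region. The sketch below is
a standalone approximation under that assumption; toy_block, its
ends_region flag and find_region_exit are hypothetical names, and std::set
stands in for GCC's bitmaps.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy CFG block: successor indices plus a flag standing in for
// "the last statement of this block is the region's return marker".
struct toy_block
{
  std::vector<int> succs;
  bool ends_region = false;
};

// Starting at REGION_ENTRY, search the blocks reachable through successor
// edges and return the index of the first one found that ends the region,
// or -1 if there is none.
int
find_region_exit (const std::vector<toy_block> &cfg, int region_entry)
{
  assert (region_entry >= 0
          && static_cast<std::size_t> (region_entry) < cfg.size ());

  std::set<int> to_visit;
  std::set<int> visited;
  to_visit.insert (region_entry);

  while (!to_visit.empty ())
    {
      // Take the lowest pending index, similar to picking the first set
      // bit of a bitmap.
      int index = *to_visit.begin ();
      to_visit.erase (to_visit.begin ());
      visited.insert (index);

      if (cfg[index].ends_region)
        return index;

      // Queue successors that have not been processed yet.
      for (int dest : cfg[index].succs)
        if (visited.count (dest) == 0)
          to_visit.insert (dest);
    }

  return -1;
}
```

The visited set is what keeps the search from looping forever when the
region contains a loop back edge.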
Thanks,
- Tom
[-- Attachment #2: 0005-Add-oacc-kernels-related-infra-functions.patch --]
[-- Type: text/x-patch, Size: 5028 bytes --]
Add oacc kernels related infra functions
2015-10-12 Tom de Vries <tom@codesourcery.com>
* omp-low.c (get_oacc_kernels_region_exit, get_omp_data_i): New
functions.
(oacc_kernels_region_entry_p): New function. Factor out of ...
(gimple_stmt_omp_data_i_init_p): ... here.
* omp-low.h (get_oacc_kernels_region_exit, oacc_kernels_region_entry_p)
(get_omp_data_i): Declare.
---
gcc/omp-low.c | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++------
gcc/omp-low.h | 3 ++
2 files changed, 96 insertions(+), 9 deletions(-)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 2b2c3a7..2289486 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -9981,6 +9981,53 @@ loop_get_oacc_kernels_region_entry (struct loop *loop)
}
}
+/* Return the oacc kernels region exit corresponding to REGION_ENTRY. */
+
+basic_block
+get_oacc_kernels_region_exit (basic_block region_entry)
+{
+ gcc_checking_assert (oacc_kernels_region_entry_p (region_entry, NULL));
+
+ bitmap to_visit = BITMAP_ALLOC (NULL);
+ bitmap visited = BITMAP_ALLOC (NULL);
+ bitmap_clear (to_visit);
+ bitmap_clear (visited);
+
+ bitmap_set_bit (to_visit, region_entry->index);
+
+ basic_block bb;
+ while (true)
+ {
+ if (bitmap_empty_p (to_visit))
+ {
+ bb = NULL;
+ break;
+ }
+
+ unsigned int index = bitmap_first_set_bit (to_visit);
+ bitmap_clear_bit (to_visit, index);
+ bitmap_set_bit (visited, index);
+ bb = BASIC_BLOCK_FOR_FN (cfun, index);
+
+ gimple *last = last_stmt (bb);
+ if (last != NULL
+ && gimple_code (last) == GIMPLE_OMP_RETURN)
+ break;
+
+ edge_iterator ei;
+ for (ei = ei_start (bb->succs); !ei_end_p (ei); ei_next (&ei))
+ {
+ edge e = ei_edge (ei);
+ unsigned int dest_index = e->dest->index;
+ if (!bitmap_bit_p (visited, dest_index))
+ bitmap_set_bit (to_visit, dest_index);
+ }
+ }
+
+ BITMAP_FREE (to_visit);
+ return bb;
+}
+
/* Encode an oacc launch argument. This matches the GOMP_LAUNCH_PACK
macro on gomp-constants.h. We do not check for overflow. */
@@ -15154,6 +15201,31 @@ omp_finish_file (void)
}
}
+/* Return true if BB is an oacc kernels region entry. If DIRECTIVE is non-null,
+ return the corresponding kernels directive in *DIRECTIVE. */
+
+bool
+oacc_kernels_region_entry_p (basic_block bb, gomp_target **directive)
+{
+ /* Check that the last statement in the preceding bb is an oacc kernels
+ stmt. */
+ if (!single_pred_p (bb))
+ return false;
+ gimple *last = last_stmt (single_pred (bb));
+ if (last == NULL
+ || gimple_code (last) != GIMPLE_OMP_TARGET)
+ return false;
+ gomp_target *kernels = as_a <gomp_target *> (last);
+
+ bool res = (gimple_omp_target_kind (kernels)
+ == GF_OMP_TARGET_KIND_OACC_KERNELS);
+
+ if (res && directive)
+ *directive = kernels;
+
+ return res;
+}
+
/* Return true if STMT is copy assignment .omp_data_i = &.omp_data_arr. */
bool
@@ -15171,15 +15243,8 @@ gimple_stmt_omp_data_i_init_p (gimple *stmt)
/* Check that the last statement in the preceding bb is an oacc kernels
stmt. */
basic_block bb = gimple_bb (stmt);
- if (!single_pred_p (bb))
- return false;
- gimple *last = last_stmt (single_pred (bb));
- if (last == NULL
- || gimple_code (last) != GIMPLE_OMP_TARGET)
- return false;
- gomp_target *kernels = as_a <gomp_target *> (last);
- if (gimple_omp_target_kind (kernels)
- != GF_OMP_TARGET_KIND_OACC_KERNELS)
+ gomp_target *kernels;
+ if (!oacc_kernels_region_entry_p (bb, &kernels))
return false;
/* Get omp_data_arr from the oacc kernels stmt. */
@@ -15190,6 +15255,25 @@ gimple_stmt_omp_data_i_init_p (gimple *stmt)
return operand_equal_p (obj, omp_data_arr, 0);
}
+
+/* Return omp_data_i corresponding to the assignment
+ .omp_data_i = &.omp_data_arr in oacc kernels region entry REGION_ENTRY. */
+
+tree
+get_omp_data_i (basic_block region_entry)
+{
+ if (!single_succ_p (region_entry))
+ return NULL_TREE;
+ basic_block bb = single_succ (region_entry);
+ gimple_stmt_iterator gsi = gsi_start_bb (bb);
+ if (gsi_end_p (gsi))
+ return NULL_TREE;
+ gimple *stmt = gsi_stmt (gsi);
+ if (!gimple_stmt_omp_data_i_init_p (stmt))
+ return NULL_TREE;
+ return gimple_assign_lhs (stmt);
+}
+
namespace {
const pass_data pass_data_late_lower_omp =
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index febcbd7..62a7d4a 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -30,6 +30,9 @@ extern tree omp_reduction_init (tree, tree);
extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
extern void omp_finish_file (void);
extern bool gimple_stmt_omp_data_i_init_p (gimple *);
+extern tree get_omp_data_i (basic_block);
+extern bool oacc_kernels_region_entry_p (basic_block, gomp_target **);
+extern basic_block get_oacc_kernels_region_exit (basic_block);
extern basic_block loop_get_oacc_kernels_region_entry (struct loop *);
extern void replace_oacc_fn_attrib (tree, tree);
extern tree build_oacc_routine_dims (tree);
--
1.9.1
* [committed. gomp4, 6/6] Add pass_dominator_oacc_kernels
2015-10-12 14:50 [committed. gomp4] pass_dominator_oacc_kernels patch series Tom de Vries
` (4 preceding siblings ...)
2015-10-12 15:10 ` [committed. gomp4, 5/6] Add oacc kernels related infra functions Tom de Vries
@ 2015-10-12 15:17 ` Tom de Vries
5 siblings, 0 replies; 7+ messages in thread
From: Tom de Vries @ 2015-10-12 15:17 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 1028 bytes --]
This patch:
- factors a class dominator_base out of class pass_dominator,
- declares a new class pass_dominator_oacc_kernels, which operates on
oacc kernels regions, and
- adds the new pass before pass_parallelize_loops_oacc_kernels in the
oacc kernels pass group.
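The factoring can be pictured roughly as follows: a shared base class owns
the driver, and the region-based variant overrides the hooks that select
region mode and enumerate the regions. This is a standalone sketch under
that assumption, not the code in the attachment; all toy_ names are
hypothetical, and regions are reduced to pairs of integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Hypothetical shared base: holds the driver loop and exposes the hooks.
class toy_dominator_base
{
public:
  virtual ~toy_dominator_base () {}

  // Return the number of walks performed: one for the whole function, or
  // one per region in region mode.
  unsigned int execute ()
  {
    if (!sese_mode_p ())
      return 1; // Stand-in for walking the whole dominator tree.

    unsigned int walks = 0;
    int entry = 0, exit = 0;
    while (get_sese (&entry, &exit))
      ++walks; // Stand-in for walking one region from ENTRY to EXIT.
    return walks;
  }

protected:
  // Return true if the pass visits a series of regions.
  virtual bool sese_mode_p () { return false; }

  // Must only be called in region mode; the default is never valid.
  virtual bool get_sese (int *, int *) { std::abort (); }
};

// Whole-function variant: the defaults already do the right thing.
class toy_whole_function_pass : public toy_dominator_base
{
};

// Region variant: hands out one (entry, exit) pair per call.
class toy_region_pass : public toy_dominator_base
{
public:
  explicit toy_region_pass (const std::vector<std::pair<int, int> > &regions)
    : m_regions (regions), m_next (0)
  {}

protected:
  bool sese_mode_p () override { return true; }

  bool get_sese (int *entry, int *exit) override
  {
    assert (entry != nullptr && exit != nullptr);
    if (m_next == m_regions.size ())
      return false;
    *entry = m_regions[m_next].first;
    *exit = m_regions[m_next].second;
    ++m_next;
    return true;
  }

private:
  std::vector<std::pair<int, int> > m_regions;
  std::size_t m_next;
};
```

Keeping the driver in the base class means the two passes differ only in
the hooks they override.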
Thanks,
- Tom
[-- Attachment #2: 0006-Add-pass_dominator_oacc_kernels.patch --]
[-- Type: text/x-patch, Size: 8343 bytes --]
Add pass_dominator_oacc_kernels
2015-10-12 Tom de Vries <tom@codesourcery.com>
* passes.def: Add pass_dominator_oacc_kernels to pass group pass_oacc_kernels.
Add pass_tree_loop_done before, and pass_tree_loop_init after.
* tree-pass.h (make_pass_dominator_oacc_kernels): Declare.
* tree-ssa-dom.c (class dominator_base): New class. Factor out of ...
(class pass_dominator): ... here.
(pass_dominator_oacc_kernels): New pass.
(make_pass_dominator_oacc_kernels): New function.
* c-c++-common/goacc/kernels-counter-var-redundant-load.c: New test.
---
gcc/passes.def | 3 +
.../goacc/kernels-counter-var-redundant-load.c | 34 ++++++
gcc/tree-pass.h | 1 +
gcc/tree-ssa-dom.c | 117 +++++++++++++++++----
4 files changed, 134 insertions(+), 21 deletions(-)
create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
diff --git a/gcc/passes.def b/gcc/passes.def
index 0498a8b..bc454c0 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -98,6 +98,9 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_lim);
NEXT_PASS (pass_copy_prop);
NEXT_PASS (pass_scev_cprop);
+ NEXT_PASS (pass_tree_loop_done);
+ NEXT_PASS (pass_dominator_oacc_kernels);
+ NEXT_PASS (pass_tree_loop_init);
NEXT_PASS (pass_parallelize_loops_oacc_kernels);
NEXT_PASS (pass_expand_omp_ssa);
NEXT_PASS (pass_tree_loop_done);
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c b/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
new file mode 100644
index 0000000..84dee69
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
@@ -0,0 +1,34 @@
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fdump-tree-dom_oacc_kernels" } */
+
+#include <stdlib.h>
+
+#define N (1024 * 512)
+#define COUNTERTYPE unsigned int
+
+COUNTERTYPE
+foo (unsigned int *c)
+{
+ COUNTERTYPE ii;
+
+#pragma acc kernels copyout (c[0:N])
+ {
+ for (ii = 0; ii < N; ii++)
+ c[ii] = 1;
+ }
+
+ return ii;
+}
+
+/* We're expecting:
+
+ .omp_data_i_10 = &.omp_data_arr.3;
+ _11 = .omp_data_i_10->ii;
+ *_11 = 0;
+ _15 = .omp_data_i_10->c;
+ c.1_16 = *_15;
+
+ Check that there's only one load from anonymous ssa-name (which we assume to
+ be the one to read c), and that there's no such load for ii. */
+
+/* { dg-final { scan-tree-dump-times "(?n)\\*_\[0-9\]\[0-9\]*;$" 1 "dom_oacc_kernels" } } */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 52ba3e5..15c8bf6 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -392,6 +392,7 @@ extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_build_alias (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_build_ealias (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_dominator (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_dominator_oacc_kernels (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_dce (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_cd_dce (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_call_cdce (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 573e6fc..c7dc7b0 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3. If not see
#include "gimplify.h"
#include "tree-cfgcleanup.h"
#include "cfgcleanup.h"
+#include "omp-low.h"
/* This file implements optimizations on the dominator tree. */
@@ -526,6 +527,31 @@ private:
namespace {
+class dominator_base : public gimple_opt_pass
+{
+ protected:
+ dominator_base (pass_data data, gcc::context *ctxt)
+ : gimple_opt_pass (data, ctxt)
+ {}
+
+ unsigned int execute (function *);
+
+ /* Return true if pass should perform jump threading. */
+ virtual bool jump_threading_p (void) { return !sese_mode_p (); }
+
+ /* Return true if pass should visit a series of seses rather than the whole
+ dominator tree. */
+ virtual bool sese_mode_p (void) { return false; }
+
+ /* In sese mode, return true if there's another sese to visit. Return the
+ sese to visit in SESE_ENTRY and SESE_EXIT. If a stmt in the sese should
+ not be optimized, return it in SKIP_STMT. */
+ virtual bool get_sese (basic_block *sese_entry ATTRIBUTE_UNUSED,
+ basic_block *sese_exit ATTRIBUTE_UNUSED,
+ gimple **skip_stmt ATTRIBUTE_UNUSED)
+ { gcc_unreachable (); }
+}; // class dominator_base
+
const pass_data pass_data_dominator =
{
GIMPLE_PASS, /* type */
@@ -539,38 +565,20 @@ const pass_data pass_data_dominator =
( TODO_cleanup_cfg | TODO_update_ssa ), /* todo_flags_finish */
};
-class pass_dominator : public gimple_opt_pass
+class pass_dominator : public dominator_base
{
public:
pass_dominator (gcc::context *ctxt)
- : gimple_opt_pass (pass_data_dominator, ctxt)
+ : dominator_base (pass_data_dominator, ctxt)
{}
/* opt_pass methods: */
opt_pass * clone () { return new pass_dominator (m_ctxt); }
virtual bool gate (function *) { return flag_tree_dom != 0; }
- virtual unsigned int execute (function *);
-
- protected:
- /* Return true if pass should perform jump threading. */
- virtual bool jump_threading_p (void) { return !sese_mode_p (); }
-
- /* Return true if pass should visit a series of seses rather than the whole
- dominator tree. */
- virtual bool sese_mode_p (void) { return false; }
-
- /* In sese mode, return true if there's another sese to visit. Return the
- sese to visit in SESE_ENTRY and SESE_EXIT. If a stmt in the sese should
- not be optimized, return it in SKIP_STMT. */
- virtual bool get_sese (basic_block *sese_entry ATTRIBUTE_UNUSED,
- basic_block *sese_exit ATTRIBUTE_UNUSED,
- gimple **skip_stmt ATTRIBUTE_UNUSED)
- { gcc_unreachable (); }
-
}; // class pass_dominator
unsigned int
-pass_dominator::execute (function *fun)
+dominator_base::execute (function *fun)
{
memset (&opt_stats, 0, sizeof (opt_stats));
@@ -759,6 +767,68 @@ pass_dominator::execute (function *fun)
return 0;
}
+const pass_data pass_data_dominator_oacc_kernels =
+{
+ GIMPLE_PASS, /* type */
+ "dom_oacc_kernels", /* name */
+ OPTGROUP_NONE, /* optinfo_flags */
+ TV_TREE_SSA_DOMINATOR_OPTS, /* tv_id */
+ ( PROP_cfg | PROP_ssa ), /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ ( TODO_cleanup_cfg | TODO_update_ssa ), /* todo_flags_finish */
+};
+
+class pass_dominator_oacc_kernels : public dominator_base
+{
+public:
+ pass_dominator_oacc_kernels (gcc::context *ctxt)
+ : dominator_base (pass_data_dominator_oacc_kernels, ctxt), m_regions (NULL)
+ {}
+
+ /* opt_pass methods: */
+ virtual bool gate (function *) { return true; }
+
+ private:
+ bitmap m_regions;
+
+protected:
+ /* dominator_base methods: */
+ virtual bool sese_mode_p (void) { return true; }
+ virtual bool get_sese (basic_block *sese_entry, basic_block *sese_exit,
+ gimple **skip_stmt)
+ {
+ if (m_regions == NULL)
+ {
+ m_regions = BITMAP_ALLOC (NULL);
+ basic_block bb;
+ FOR_EACH_BB_FN (bb, cfun)
+ if (oacc_kernels_region_entry_p (bb, NULL))
+ bitmap_set_bit (m_regions, bb->index);
+ }
+
+ if (bitmap_empty_p (m_regions))
+ {
+ BITMAP_FREE (m_regions);
+ return false;
+ }
+
+ unsigned int index = bitmap_first_set_bit (m_regions);
+ bitmap_clear_bit (m_regions, index);
+
+ *sese_entry = BASIC_BLOCK_FOR_FN (cfun, index);
+ *sese_exit = get_oacc_kernels_region_exit (*sese_entry);
+
+ tree omp_data_i = get_omp_data_i (single_pred (*sese_entry));
+ if (omp_data_i != NULL_TREE)
+ *skip_stmt = SSA_NAME_DEF_STMT (omp_data_i);
+
+ return true;
+ }
+
+}; // class pass_dominator_oacc_kernels
+
} // anon namespace
gimple_opt_pass *
@@ -767,6 +837,11 @@ make_pass_dominator (gcc::context *ctxt)
return new pass_dominator (ctxt);
}
+gimple_opt_pass *
+make_pass_dominator_oacc_kernels (gcc::context *ctxt)
+{
+ return new pass_dominator_oacc_kernels (ctxt);
+}
/* Given a conditional statement CONDSTMT, convert the
condition to a canonical form. */
--
1.9.1