* [PATCH][RFC] Fix PR72851 by "randomizing" SSA propagator worklist extraction
@ 2016-08-10 14:20 Richard Biener
2016-08-11 8:56 ` Richard Biener
0 siblings, 1 reply; 3+ messages in thread
From: Richard Biener @ 2016-08-10 14:20 UTC (permalink / raw)
To: gcc-patches
The following randomizes SSA propagator worklist processing to make the
processing order less likely to hit the worst case as in the PR. Ideally
we'd process it in stmt order, but that adds overhead to the idea of an
SSA propagator that makes it very much not appealing. Using a queue
rather than a stack would also be a possibility (but that would not really
fix the underlying issue either). So this patch uses a very simple approach
to fix the specific testcase.
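For illustration, a minimal sketch (hypothetical helper names, a plain C
array in place of GCC's vec template) of what reading element 0 and then
unordered_remove'ing it does: the last element is swapped into the hole,
so extraction order is neither strict LIFO nor FIFO.

```c
#include <assert.h>

/* Toy worklist mimicking vec<gimple *> (names are hypothetical).  */
typedef struct { int items[16]; int len; } worklist;

static void wl_push (worklist *w, int x) { w->items[w->len++] = x; }

/* Like (*worklist)[0] followed by worklist->unordered_remove (0):
   return the first element and move the last element into its slot.  */
static int
wl_take_first_unordered (worklist *w)
{
  int x = w->items[0];
  w->items[0] = w->items[w->len - 1];  /* swap last into the hole */
  w->len--;
  return x;
}
```

Starting from [1, 2, 3], repeated extraction yields 1, then 3, then 2,
rather than the stack order 3, 2, 1.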
Opinion? Any great ideas on how to avoid O (n log n) sorting or
O (log n) insertion? [mark blocks to be visited and simply iterate
over all stmts in to-be-visited BBs]
Bootstrap/regtest running on x86_64-unknown-linux-gnu.
Richard.
2016-08-10 Richard Biener <rguenther@suse.de>
PR tree-optimization/72851
* tree-ssa-propagate.c (process_ssa_edge_worklist): Use a
randomized worklist item order.
* gcc.dg/torture/pr72851.c: New testcase.
Index: gcc/tree-ssa-propagate.c
===================================================================
*** gcc/tree-ssa-propagate.c (revision 238469)
--- gcc/tree-ssa-propagate.c (working copy)
*************** process_ssa_edge_worklist (vec<gimple *>
*** 422,428 ****
basic_block bb;
/* Pull the statement to simulate off the worklist. */
! gimple *stmt = worklist->pop ();
/* If this statement was already visited by simulate_block, then
we don't need to visit it again here. */
--- 422,429 ----
basic_block bb;
/* Pull the statement to simulate off the worklist. */
! gimple *stmt = (*worklist)[0];
! worklist->unordered_remove (0);
/* If this statement was already visited by simulate_block, then
we don't need to visit it again here. */
Index: gcc/testsuite/gcc.dg/torture/pr72851.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr72851.c (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr72851.c (working copy)
***************
*** 0 ****
--- 1,30 ----
+ /* { dg-do compile } */
+
+ typedef unsigned char uint8_t;
+ typedef unsigned long int uint64_t;
+ union unaligned_64 {
+ uint64_t l;
+ }
+ __attribute__((packed)) __attribute__((may_alias));
+ typedef struct AVDES {
+ uint64_t round_keys[3][16];
+ } AVDES;
+ static const uint8_t PC1_shuffle[] = {
+ 64-57,64-49,64-41,64-33,64-25,64-17,64-9, 64-1,64-58,64-50,64-42,64-34,64-26,64-18, 64-10,64-2,64-59,64-51,64-43,64-35,64-27, 64-19,64-11,64-3,64-60,64-52,64-44,64-36, 64-63,64-55,64-47,64-39,64-31,64-23,64-15, 64-7,64-62,64-54,64-46,64-38,64-30,64-22, 64-14,64-6,64-61,64-53,64-45,64-37,64-29, 64-21,64-13,64-5,64-28,64-20,64-12,64-4 };
+ static const uint8_t PC2_shuffle[] = {
+ 56-14,56-17,56-11,56-24,56-1,56-5, 56-3,56-28,56-15,56-6,56-21,56-10, 56-23,56-19,56-12,56-4,56-26,56-8, 56-16,56-7,56-27,56-20,56-13,56-2, 56-41,56-52,56-31,56-37,56-47,56-55, 56-30,56-40,56-51,56-45,56-33,56-48, 56-44,56-49,56-39,56-56,56-34,56-53, 56-46,56-42,56-50,56-36,56-29,56-32 };
+ static uint64_t shuffle(uint64_t in, const uint8_t *shuffle, int shuffle_len)
+ {
+ int i;
+ uint64_t res = 0;
+ for (i = 0; i < shuffle_len; i++)
+ res += res + ((in >> *shuffle++) & 1);
+ return res;
+ }
+ void gen_roundkeys(uint64_t K[16], uint64_t key)
+ {
+ int i;
+ uint64_t CDn = shuffle(key, PC1_shuffle, sizeof(PC1_shuffle));
+ for (i = 0; i < 16; i++)
+ K[i] = shuffle(CDn, PC2_shuffle, sizeof(PC2_shuffle));
+ }
* Re: [PATCH][RFC] Fix PR72851 by "randomizing" SSA propagator worklist extraction
2016-08-10 14:20 [PATCH][RFC] Fix PR72851 by "randomizing" SSA propagator worklist extraction Richard Biener
@ 2016-08-11 8:56 ` Richard Biener
2016-08-12 7:33 ` Richard Biener
0 siblings, 1 reply; 3+ messages in thread
From: Richard Biener @ 2016-08-11 8:56 UTC (permalink / raw)
To: gcc-patches
On Wed, 10 Aug 2016, Richard Biener wrote:
>
> The following randomizes SSA propagator worklist processing to make the
> processing order less likely to hit the worst case as in the PR. Ideally
> we'd process it in stmt order, but that adds overhead to the idea of an
> SSA propagator that makes it very much not appealing. Using a queue
> rather than a stack would also be a possibility (but that would not really
> fix the underlying issue either). So this patch uses a very simple approach
> to fix the specific testcase.
>
> Opinion? Any great ideas on how to avoid O (n log n) sorting or
> O (log n) insertion? [mark blocks to be visited and simply iterate
> over all stmts in to-be-visited BBs]
Ok, I've looked at the propagator again and discovered "interesting"
code trying to do BB evaluation in a good order. So I rewrote that
to use PRE order, which conveniently allows us to compute stmt UIDs
in that order. Throwing memory at the sorting issue (a uid-to-stmt
array) and using a bitmap for the worklist should solve the issue fully.
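A rough sketch of the bitmap-worklist idea, using a single 64-bit word in
place of GCC's bitmap type and assumed helper names: with stmt UIDs
assigned in PRE (execution) order, extracting the lowest set bit always
yields the earliest pending stmt in execution order, with no sorting step
and no priority queue.

```c
#include <assert.h>

/* Toy 64-entry "bitmap" worklist keyed by stmt UID (a sketch; the real
   code uses GCC's bitmap type and bitmap_first_set_bit).  */
static unsigned long long worklist_bits;

static void
add_uid (int uid)
{
  worklist_bits |= 1ULL << uid;  /* idempotent: re-adding is a no-op */
}

static int
take_first_uid (void)
{
  /* __builtin_ctzll is a GCC builtin: count trailing zeros, i.e. the
     index of the lowest set bit.  */
  int uid = __builtin_ctzll (worklist_bits);
  worklist_bits &= worklist_bits - 1;  /* clear the lowest set bit */
  return uid;
}
```

Adding UIDs 37, 5 and 12 in that order still extracts 5, 12, 37, i.e.
execution order, regardless of insertion order.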
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
Richard.
2016-08-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/72851
* tree-ssa-propagate.c: Include cfganal.h. Rewrite block and stmt
worklists to use bitmaps indexed in execution order.
(executable_blocks, cfg_blocks_num, cfg_blocks_tail, cfg_blocks_head,
bb_in_list, interesting_ssa_edges, varying_ssa_edges): Remove.
(cfg_blocks): Make a bitmap.
(bb_to_cfg_order, cfg_order_to_bb, ssa_edge_worklist, uid_to_stmt):
New globals.
(cfg_blocks_empty_p): Adjust.
(cfg_blocks_add): Likewise.
(cfg_blocks_get): Likewise.
(add_ssa_edge): Likewise.
(add_control_edge): Likewise.
(simulate_stmt): Likewise.
(process_ssa_edge_worklist): Likewise.
(simulate_block): Likewise.
(ssa_prop_init): Compute PRE order and stmt UIDs.
(ssa_prop_fini): Adjust.
(ssa_propagate): Adjust.
* gcc.dg/torture/pr72851.c: New testcase.
Index: gcc/tree-ssa-propagate.c
===================================================================
*** gcc/tree-ssa-propagate.c (revision 239317)
--- gcc/tree-ssa-propagate.c (working copy)
***************
*** 37,42 ****
--- 37,43 ----
#include "domwalk.h"
#include "cfgloop.h"
#include "tree-cfgcleanup.h"
+ #include "cfganal.h"
/* This file implements a generic value propagation engine based on
the same propagation used by the SSA-CCP algorithm [1].
***************
*** 83,95 ****
Blocks are added to this list if their incoming edges are
found executable.
! VARYING_SSA_EDGES contains the list of statements that feed
! from statements that produce an SSA_PROP_VARYING result.
! These are simulated first to speed up processing.
!
! INTERESTING_SSA_EDGES contains the list of statements that
! feed from statements that produce an SSA_PROP_INTERESTING
! result.
5- Simulation terminates when all three work lists are drained.
--- 84,91 ----
Blocks are added to this list if their incoming edges are
found executable.
! SSA_EDGE_WORKLIST contains the list of statements that we
! need to revisit.
5- Simulation terminates when all three work lists are drained.
***************
*** 116,224 ****
static ssa_prop_visit_stmt_fn ssa_prop_visit_stmt;
static ssa_prop_visit_phi_fn ssa_prop_visit_phi;
! /* Keep track of statements that have been added to one of the SSA
! edges worklists. This flag is used to avoid visiting statements
! unnecessarily when draining an SSA edge worklist. If while
! simulating a basic block, we find a statement with
! STMT_IN_SSA_EDGE_WORKLIST set, we clear it to prevent SSA edge
! processing from visiting it again.
!
! NOTE: users of the propagation engine are not allowed to use
! the GF_PLF_1 flag. */
! #define STMT_IN_SSA_EDGE_WORKLIST GF_PLF_1
!
! /* A bitmap to keep track of executable blocks in the CFG. */
! static sbitmap executable_blocks;
!
! /* Array of control flow edges on the worklist. */
! static vec<basic_block> cfg_blocks;
!
! static unsigned int cfg_blocks_num = 0;
! static int cfg_blocks_tail;
! static int cfg_blocks_head;
!
! static sbitmap bb_in_list;
/* Worklist of SSA edges which will need reexamination as their
definition has changed. SSA edges are def-use edges in the SSA
web. For each D-U edge, we store the target statement or PHI node
! U. */
! static vec<gimple *> interesting_ssa_edges;
!
! /* Identical to INTERESTING_SSA_EDGES. For performance reasons, the
! list of SSA edges is split into two. One contains all SSA edges
! who need to be reexamined because their lattice value changed to
! varying (this worklist), and the other contains all other SSA edges
! to be reexamined (INTERESTING_SSA_EDGES).
!
! Since most values in the program are VARYING, the ideal situation
! is to move them to that lattice value as quickly as possible.
! Thus, it doesn't make sense to process any other type of lattice
! value until all VARYING values are propagated fully, which is one
! thing using the VARYING worklist achieves. In addition, if we
! don't use a separate worklist for VARYING edges, we end up with
! situations where lattice values move from
! UNDEFINED->INTERESTING->VARYING instead of UNDEFINED->VARYING. */
! static vec<gimple *> varying_ssa_edges;
!
/* Return true if the block worklist empty. */
static inline bool
cfg_blocks_empty_p (void)
{
! return (cfg_blocks_num == 0);
}
! /* Add a basic block to the worklist. The block must not be already
! in the worklist, and it must not be the ENTRY or EXIT block. */
static void
cfg_blocks_add (basic_block bb)
{
- bool head = false;
-
gcc_assert (bb != ENTRY_BLOCK_PTR_FOR_FN (cfun)
&& bb != EXIT_BLOCK_PTR_FOR_FN (cfun));
! gcc_assert (!bitmap_bit_p (bb_in_list, bb->index));
!
! if (cfg_blocks_empty_p ())
! {
! cfg_blocks_tail = cfg_blocks_head = 0;
! cfg_blocks_num = 1;
! }
! else
! {
! cfg_blocks_num++;
! if (cfg_blocks_num > cfg_blocks.length ())
! {
! /* We have to grow the array now. Adjust to queue to occupy
! the full space of the original array. We do not need to
! initialize the newly allocated portion of the array
! because we keep track of CFG_BLOCKS_HEAD and
! CFG_BLOCKS_HEAD. */
! cfg_blocks_tail = cfg_blocks.length ();
! cfg_blocks_head = 0;
! cfg_blocks.safe_grow (2 * cfg_blocks_tail);
! }
! /* Minor optimization: we prefer to see blocks with more
! predecessors later, because there is more of a chance that
! the incoming edges will be executable. */
! else if (EDGE_COUNT (bb->preds)
! >= EDGE_COUNT (cfg_blocks[cfg_blocks_head]->preds))
! cfg_blocks_tail = ((cfg_blocks_tail + 1) % cfg_blocks.length ());
! else
! {
! if (cfg_blocks_head == 0)
! cfg_blocks_head = cfg_blocks.length ();
! --cfg_blocks_head;
! head = true;
! }
! }
!
! cfg_blocks[head ? cfg_blocks_head : cfg_blocks_tail] = bb;
! bitmap_set_bit (bb_in_list, bb->index);
}
--- 112,149 ----
static ssa_prop_visit_stmt_fn ssa_prop_visit_stmt;
static ssa_prop_visit_phi_fn ssa_prop_visit_phi;
! /* Worklist of control flow edge destinations. This contains
! the CFG order number of the blocks so we can iterate in CFG
! order by visiting in bit-order. */
! static bitmap cfg_blocks;
! static int *bb_to_cfg_order;
! static int *cfg_order_to_bb;
/* Worklist of SSA edges which will need reexamination as their
definition has changed. SSA edges are def-use edges in the SSA
web. For each D-U edge, we store the target statement or PHI node
! UID in a bitmap. UIDs order stmts in execution order. */
! static bitmap ssa_edge_worklist;
! static vec<gimple *> uid_to_stmt;
/* Return true if the block worklist empty. */
static inline bool
cfg_blocks_empty_p (void)
{
! return bitmap_empty_p (cfg_blocks);
}
! /* Add a basic block to the worklist. The block must not be the ENTRY
! or EXIT block. */
static void
cfg_blocks_add (basic_block bb)
{
gcc_assert (bb != ENTRY_BLOCK_PTR_FOR_FN (cfun)
&& bb != EXIT_BLOCK_PTR_FOR_FN (cfun));
! bitmap_set_bit (cfg_blocks, bb_to_cfg_order[bb->index]);
}
*************** cfg_blocks_add (basic_block bb)
*** 227,244 ****
static basic_block
cfg_blocks_get (void)
{
- basic_block bb;
-
- bb = cfg_blocks[cfg_blocks_head];
-
gcc_assert (!cfg_blocks_empty_p ());
! gcc_assert (bb);
!
! cfg_blocks_head = ((cfg_blocks_head + 1) % cfg_blocks.length ());
! --cfg_blocks_num;
! bitmap_clear_bit (bb_in_list, bb->index);
!
! return bb;
}
--- 152,161 ----
static basic_block
cfg_blocks_get (void)
{
gcc_assert (!cfg_blocks_empty_p ());
! int order_index = bitmap_first_set_bit (cfg_blocks);
! bitmap_clear_bit (cfg_blocks, order_index);
! return BASIC_BLOCK_FOR_FN (cfun, cfg_order_to_bb [order_index]);
}
*************** cfg_blocks_get (void)
*** 247,253 ****
them to INTERESTING_SSA_EDGES. */
static void
! add_ssa_edge (tree var, bool is_varying)
{
imm_use_iterator iter;
use_operand_p use_p;
--- 164,170 ----
them to INTERESTING_SSA_EDGES. */
static void
! add_ssa_edge (tree var)
{
imm_use_iterator iter;
use_operand_p use_p;
*************** add_ssa_edge (tree var, bool is_varying)
*** 256,282 ****
{
gimple *use_stmt = USE_STMT (use_p);
if (prop_simulate_again_p (use_stmt)
! && !gimple_plf (use_stmt, STMT_IN_SSA_EDGE_WORKLIST))
{
! gimple_set_plf (use_stmt, STMT_IN_SSA_EDGE_WORKLIST, true);
! if (is_varying)
! {
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "varying_ssa_edges: adding SSA use in ");
! print_gimple_stmt (dump_file, use_stmt, 0, TDF_SLIM);
! }
! varying_ssa_edges.safe_push (use_stmt);
! }
! else
{
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "interesting_ssa_edges: adding SSA use in ");
! print_gimple_stmt (dump_file, use_stmt, 0, TDF_SLIM);
! }
! interesting_ssa_edges.safe_push (use_stmt);
}
}
}
--- 173,191 ----
{
gimple *use_stmt = USE_STMT (use_p);
+ /* If we did not yet simulate the block wait for this to happen
+ and do not add the stmt to the SSA edge worklist. */
+ if (! (gimple_bb (use_stmt)->flags & BB_VISITED))
+ continue;
+
if (prop_simulate_again_p (use_stmt)
! && bitmap_set_bit (ssa_edge_worklist, gimple_uid (use_stmt)))
{
! uid_to_stmt[gimple_uid (use_stmt)] = use_stmt;
! if (dump_file && (dump_flags & TDF_DETAILS))
{
! fprintf (dump_file, "ssa_edge_worklist: adding SSA use in ");
! print_gimple_stmt (dump_file, use_stmt, 0, TDF_SLIM);
}
}
}
*************** add_control_edge (edge e)
*** 298,307 ****
e->flags |= EDGE_EXECUTABLE;
- /* If the block is already in the list, we're done. */
- if (bitmap_bit_p (bb_in_list, bb->index))
- return;
-
cfg_blocks_add (bb);
if (dump_file && (dump_flags & TDF_DETAILS))
--- 207,212 ----
*************** simulate_stmt (gimple *stmt)
*** 319,324 ****
--- 224,232 ----
edge taken_edge = NULL;
tree output_name = NULL_TREE;
+ /* Pull the stmt off the SSA edge worklist. */
+ bitmap_clear_bit (ssa_edge_worklist, gimple_uid (stmt));
+
/* Don't bother visiting statements that are already
considered varying by the propagator. */
if (!prop_simulate_again_p (stmt))
*************** simulate_stmt (gimple *stmt)
*** 339,345 ****
/* If the statement produced a new varying value, add the SSA
edges coming out of OUTPUT_NAME. */
if (output_name)
! add_ssa_edge (output_name, true);
/* If STMT transfers control out of its basic block, add
all outgoing edges to the work list. */
--- 247,253 ----
/* If the statement produced a new varying value, add the SSA
edges coming out of OUTPUT_NAME. */
if (output_name)
! add_ssa_edge (output_name);
/* If STMT transfers control out of its basic block, add
all outgoing edges to the work list. */
*************** simulate_stmt (gimple *stmt)
*** 358,364 ****
/* If the statement produced new value, add the SSA edges coming
out of OUTPUT_NAME. */
if (output_name)
! add_ssa_edge (output_name, false);
/* If we know which edge is going to be taken out of this block,
add it to the CFG work list. */
--- 266,272 ----
/* If the statement produced new value, add the SSA edges coming
out of OUTPUT_NAME. */
if (output_name)
! add_ssa_edge (output_name);
/* If we know which edge is going to be taken out of this block,
add it to the CFG work list. */
*************** simulate_stmt (gimple *stmt)
*** 413,466 ****
when an SSA edge is added to it in simulate_stmt. Return true if a stmt
was simulated. */
! static bool
! process_ssa_edge_worklist (vec<gimple *> *worklist, const char *edge_list_name)
{
/* Process the next entry from the worklist. */
! while (worklist->length () > 0)
! {
! basic_block bb;
!
! /* Pull the statement to simulate off the worklist. */
! gimple *stmt = worklist->pop ();
!
! /* If this statement was already visited by simulate_block, then
! we don't need to visit it again here. */
! if (!gimple_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST))
! continue;
!
! /* STMT is no longer in a worklist. */
! gimple_set_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST, false);
! bb = gimple_bb (stmt);
! /* Visit the statement only if its block is marked executable.
! If it is not executable then it will be visited when we simulate
! all statements in the block as soon as an incoming edge gets
! marked executable. */
! if (!bitmap_bit_p (executable_blocks, bb->index))
! {
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "\nDropping statement from SSA worklist: ");
! print_gimple_stmt (dump_file, stmt, 0, dump_flags);
! }
! continue;
! }
!
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "\nSimulating statement (from %s): ",
! edge_list_name);
! print_gimple_stmt (dump_file, stmt, 0, dump_flags);
! }
!
! simulate_stmt (stmt);
!
! return true;
}
! return false;
}
--- 321,344 ----
when an SSA edge is added to it in simulate_stmt. Return true if a stmt
was simulated. */
! static void
! process_ssa_edge_worklist ()
{
/* Process the next entry from the worklist. */
! unsigned stmt_uid = bitmap_first_set_bit (ssa_edge_worklist);
! bitmap_clear_bit (ssa_edge_worklist, stmt_uid);
! gimple *stmt = uid_to_stmt[stmt_uid];
! /* We should not have stmts in not yet simulated BBs on the worklist. */
! gcc_assert (gimple_bb (stmt)->flags & BB_VISITED);
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "\nSimulating statement: ");
! print_gimple_stmt (dump_file, stmt, 0, dump_flags);
}
! simulate_stmt (stmt);
}
*************** simulate_block (basic_block block)
*** 486,515 ****
/* If this is the first time we've simulated this block, then we
must simulate each of its statements. */
! if (!bitmap_bit_p (executable_blocks, block->index))
{
gimple_stmt_iterator j;
unsigned int normal_edge_count;
edge e, normal_edge;
edge_iterator ei;
- /* Note that we have simulated this block. */
- bitmap_set_bit (executable_blocks, block->index);
-
for (j = gsi_start_bb (block); !gsi_end_p (j); gsi_next (&j))
! {
! gimple *stmt = gsi_stmt (j);
! /* If this statement is already in the worklist then
! "cancel" it. The reevaluation implied by the worklist
! entry will produce the same value we generate here and
! thus reevaluating it again from the worklist is
! pointless. */
! if (gimple_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST))
! gimple_set_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST, false);
!
! simulate_stmt (stmt);
! }
/* We can not predict when abnormal and EH edges will be executed, so
once a block is considered executable, we consider any
--- 364,381 ----
/* If this is the first time we've simulated this block, then we
must simulate each of its statements. */
! if (! (block->flags & BB_VISITED))
{
gimple_stmt_iterator j;
unsigned int normal_edge_count;
edge e, normal_edge;
edge_iterator ei;
for (j = gsi_start_bb (block); !gsi_end_p (j); gsi_next (&j))
! simulate_stmt (gsi_stmt (j));
! /* Note that we have simulated this block. */
! block->flags |= BB_VISITED;
/* We can not predict when abnormal and EH edges will be executed, so
once a block is considered executable, we consider any
*************** ssa_prop_init (void)
*** 551,591 ****
basic_block bb;
/* Worklists of SSA edges. */
! interesting_ssa_edges.create (20);
! varying_ssa_edges.create (20);
! executable_blocks = sbitmap_alloc (last_basic_block_for_fn (cfun));
! bitmap_clear (executable_blocks);
!
! bb_in_list = sbitmap_alloc (last_basic_block_for_fn (cfun));
! bitmap_clear (bb_in_list);
if (dump_file && (dump_flags & TDF_DETAILS))
dump_immediate_uses (dump_file);
- cfg_blocks.create (20);
- cfg_blocks.safe_grow_cleared (20);
-
/* Initially assume that every edge in the CFG is not executable.
! (including the edges coming out of the entry block). */
! FOR_ALL_BB_FN (bb, cfun)
{
gimple_stmt_iterator si;
!
! for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
! gimple_set_plf (gsi_stmt (si), STMT_IN_SSA_EDGE_WORKLIST, false);
for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
! gimple_set_plf (gsi_stmt (si), STMT_IN_SSA_EDGE_WORKLIST, false);
FOR_EACH_EDGE (e, ei, bb->succs)
e->flags &= ~EDGE_EXECUTABLE;
}
/* Seed the algorithm by adding the successors of the entry block to the
edge worklist. */
FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
! add_control_edge (e);
}
--- 417,471 ----
basic_block bb;
/* Worklists of SSA edges. */
! ssa_edge_worklist = BITMAP_ALLOC (NULL);
! /* Worklist of basic-blocks. */
! bb_to_cfg_order = XNEWVEC (int, last_basic_block_for_fn (cfun) + 1);
! cfg_order_to_bb = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
! int n = pre_and_rev_post_order_compute_fn (cfun, cfg_order_to_bb,
! NULL, false);
! for (int i = 0; i < n; ++i)
! bb_to_cfg_order[cfg_order_to_bb[i]] = i;
! cfg_blocks = BITMAP_ALLOC (NULL);
if (dump_file && (dump_flags & TDF_DETAILS))
dump_immediate_uses (dump_file);
/* Initially assume that every edge in the CFG is not executable.
! (including the edges coming out of the entry block). Mark blocks
! as not visited, blocks not yet visited will have all their statements
! simulated once an incoming edge gets executable. */
! set_gimple_stmt_max_uid (cfun, 0);
! for (int i = 0; i < n; ++i)
{
gimple_stmt_iterator si;
! bb = BASIC_BLOCK_FOR_FN (cfun, cfg_order_to_bb[i]);
for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
! {
! gimple *stmt = gsi_stmt (si);
! gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
! }
+ for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
+ {
+ gimple *stmt = gsi_stmt (si);
+ gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
+ }
+
+ gcc_assert (! (bb->flags & BB_VISITED));
FOR_EACH_EDGE (e, ei, bb->succs)
e->flags &= ~EDGE_EXECUTABLE;
}
+ uid_to_stmt.safe_grow (gimple_stmt_max_uid (cfun));
/* Seed the algorithm by adding the successors of the entry block to the
edge worklist. */
FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
! {
! e->flags &= ~EDGE_EXECUTABLE;
! add_control_edge (e);
! }
}
*************** ssa_prop_init (void)
*** 594,604 ****
static void
ssa_prop_fini (void)
{
! interesting_ssa_edges.release ();
! varying_ssa_edges.release ();
! cfg_blocks.release ();
! sbitmap_free (bb_in_list);
! sbitmap_free (executable_blocks);
}
--- 474,485 ----
static void
ssa_prop_fini (void)
{
! BITMAP_FREE (cfg_blocks);
! free (bb_to_cfg_order);
! free (cfg_order_to_bb);
! BITMAP_FREE (ssa_edge_worklist);
! uid_to_stmt.release ();
! clear_bb_flags ();
}
*************** ssa_propagate (ssa_prop_visit_stmt_fn vi
*** 917,927 ****
ssa_prop_init ();
/* Iterate until the worklists are empty. */
! while (!cfg_blocks_empty_p ()
! || interesting_ssa_edges.length () > 0
! || varying_ssa_edges.length () > 0)
{
! if (!cfg_blocks_empty_p ())
{
/* Pull the next block to simulate off the worklist. */
basic_block dest_block = cfg_blocks_get ();
--- 798,808 ----
ssa_prop_init ();
/* Iterate until the worklists are empty. */
! while (! cfg_blocks_empty_p ()
! || ! bitmap_empty_p (ssa_edge_worklist))
{
! /* First simulate whole blocks. */
! if (! cfg_blocks_empty_p ())
{
/* Pull the next block to simulate off the worklist. */
basic_block dest_block = cfg_blocks_get ();
*************** ssa_propagate (ssa_prop_visit_stmt_fn vi
*** 929,942 ****
continue;
}
! /* In order to move things to varying as quickly as
! possible,process the VARYING_SSA_EDGES worklist first. */
! if (process_ssa_edge_worklist (&varying_ssa_edges, "varying_ssa_edges"))
! continue;
!
! /* Now process the INTERESTING_SSA_EDGES worklist. */
! process_ssa_edge_worklist (&interesting_ssa_edges,
! "interesting_ssa_edges");
}
ssa_prop_fini ();
--- 810,817 ----
continue;
}
! /* Then simulate from the SSA edge worklist. */
! process_ssa_edge_worklist ();
}
ssa_prop_fini ();
Index: gcc/testsuite/gcc.dg/torture/pr72851.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr72851.c (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr72851.c (working copy)
***************
*** 0 ****
--- 1,30 ----
+ /* { dg-do compile } */
+
+ typedef unsigned char uint8_t;
+ typedef unsigned long int uint64_t;
+ union unaligned_64 {
+ uint64_t l;
+ }
+ __attribute__((packed)) __attribute__((may_alias));
+ typedef struct AVDES {
+ uint64_t round_keys[3][16];
+ } AVDES;
+ static const uint8_t PC1_shuffle[] = {
+ 64-57,64-49,64-41,64-33,64-25,64-17,64-9, 64-1,64-58,64-50,64-42,64-34,64-26,64-18, 64-10,64-2,64-59,64-51,64-43,64-35,64-27, 64-19,64-11,64-3,64-60,64-52,64-44,64-36, 64-63,64-55,64-47,64-39,64-31,64-23,64-15, 64-7,64-62,64-54,64-46,64-38,64-30,64-22, 64-14,64-6,64-61,64-53,64-45,64-37,64-29, 64-21,64-13,64-5,64-28,64-20,64-12,64-4 };
+ static const uint8_t PC2_shuffle[] = {
+ 56-14,56-17,56-11,56-24,56-1,56-5, 56-3,56-28,56-15,56-6,56-21,56-10, 56-23,56-19,56-12,56-4,56-26,56-8, 56-16,56-7,56-27,56-20,56-13,56-2, 56-41,56-52,56-31,56-37,56-47,56-55, 56-30,56-40,56-51,56-45,56-33,56-48, 56-44,56-49,56-39,56-56,56-34,56-53, 56-46,56-42,56-50,56-36,56-29,56-32 };
+ static uint64_t shuffle(uint64_t in, const uint8_t *shuffle, int shuffle_len)
+ {
+ int i;
+ uint64_t res = 0;
+ for (i = 0; i < shuffle_len; i++)
+ res += res + ((in >> *shuffle++) & 1);
+ return res;
+ }
+ void gen_roundkeys(uint64_t K[16], uint64_t key)
+ {
+ int i;
+ uint64_t CDn = shuffle(key, PC1_shuffle, sizeof(PC1_shuffle));
+ for (i = 0; i < 16; i++)
+ K[i] = shuffle(CDn, PC2_shuffle, sizeof(PC2_shuffle));
+ }
* Re: [PATCH][RFC] Fix PR72851 by "randomizing" SSA propagator worklist extraction
2016-08-11 8:56 ` Richard Biener
@ 2016-08-12 7:33 ` Richard Biener
0 siblings, 0 replies; 3+ messages in thread
From: Richard Biener @ 2016-08-12 7:33 UTC (permalink / raw)
To: gcc-patches
On Thu, 11 Aug 2016, Richard Biener wrote:
> On Wed, 10 Aug 2016, Richard Biener wrote:
>
> >
> > The following randomizes SSA propagator worklist processing to make the
> > processing order less likely to hit the worst case as in the PR. Ideally
> > we'd process it in stmt order, but that adds overhead to the idea of an
> > SSA propagator that makes it very much not appealing. Using a queue
> > rather than a stack would also be a possibility (but that would not really
> > fix the underlying issue either). So this patch uses a very simple approach
> > to fix the specific testcase.
> >
> > Opinion? Any great ideas on how to avoid O (n log n) sorting or
> > O (log n) insertion? [mark blocks to be visited and simply iterate
> > over all stmts in to-be-visited BBs]
>
> Ok, I've looked at the propagator again and discovered "interesting"
> code trying to do BB evaluation in a good order. So I rewrote that
> to use PRE order, which conveniently allows us to compute stmt UIDs
> in that order. Throwing memory at the sorting issue (a uid-to-stmt
> array) and using a bitmap for the worklist should solve the issue fully.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
Turns out using clear_bb_flags isn't a good idea, so I replaced that
with just clearing BB_VISITED manually. (Not clearing it will
cause IRA to ICE later on a stale BB_VISITED - it seems we rely on
passes clearing this flag, but we don't verify this.)
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
Doesn't look appropriate for the branch though (maybe the first hack
qualifies), at least not so shortly before 6.2.
Richard.
2016-08-12 Richard Biener <rguenther@suse.de>
PR tree-optimization/72851
* tree-ssa-propagate.c: Include cfganal.h. Rewrite block and stmt
worklists to use bitmaps indexed in execution order.
(executable_blocks, cfg_blocks_num, cfg_blocks_tail, cfg_blocks_head,
bb_in_list, interesting_ssa_edges, varying_ssa_edges): Remove.
(cfg_blocks): Make a bitmap.
(bb_to_cfg_order, cfg_order_to_bb, ssa_edge_worklist, uid_to_stmt):
New globals.
(cfg_blocks_empty_p): Adjust.
(cfg_blocks_add): Likewise.
(cfg_blocks_get): Likewise.
(add_ssa_edge): Likewise.
(add_control_edge): Likewise.
(simulate_stmt): Likewise.
(process_ssa_edge_worklist): Likewise.
(simulate_block): Likewise.
(ssa_prop_init): Compute PRE order and stmt UIDs.
(ssa_prop_fini): Adjust.
(ssa_propagate): Adjust.
* gcc.dg/torture/pr72851.c: New testcase.
Index: gcc/testsuite/gcc.dg/torture/pr72851.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr72851.c (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr72851.c (working copy)
***************
*** 0 ****
--- 1,30 ----
+ /* { dg-do compile } */
+
+ typedef unsigned char uint8_t;
+ typedef unsigned long int uint64_t;
+ union unaligned_64 {
+ uint64_t l;
+ }
+ __attribute__((packed)) __attribute__((may_alias));
+ typedef struct AVDES {
+ uint64_t round_keys[3][16];
+ } AVDES;
+ static const uint8_t PC1_shuffle[] = {
+ 64-57,64-49,64-41,64-33,64-25,64-17,64-9, 64-1,64-58,64-50,64-42,64-34,64-26,64-18, 64-10,64-2,64-59,64-51,64-43,64-35,64-27, 64-19,64-11,64-3,64-60,64-52,64-44,64-36, 64-63,64-55,64-47,64-39,64-31,64-23,64-15, 64-7,64-62,64-54,64-46,64-38,64-30,64-22, 64-14,64-6,64-61,64-53,64-45,64-37,64-29, 64-21,64-13,64-5,64-28,64-20,64-12,64-4 };
+ static const uint8_t PC2_shuffle[] = {
+ 56-14,56-17,56-11,56-24,56-1,56-5, 56-3,56-28,56-15,56-6,56-21,56-10, 56-23,56-19,56-12,56-4,56-26,56-8, 56-16,56-7,56-27,56-20,56-13,56-2, 56-41,56-52,56-31,56-37,56-47,56-55, 56-30,56-40,56-51,56-45,56-33,56-48, 56-44,56-49,56-39,56-56,56-34,56-53, 56-46,56-42,56-50,56-36,56-29,56-32 };
+ static uint64_t shuffle(uint64_t in, const uint8_t *shuffle, int shuffle_len)
+ {
+ int i;
+ uint64_t res = 0;
+ for (i = 0; i < shuffle_len; i++)
+ res += res + ((in >> *shuffle++) & 1);
+ return res;
+ }
+ void gen_roundkeys(uint64_t K[16], uint64_t key)
+ {
+ int i;
+ uint64_t CDn = shuffle(key, PC1_shuffle, sizeof(PC1_shuffle));
+ for (i = 0; i < 16; i++)
+ K[i] = shuffle(CDn, PC2_shuffle, sizeof(PC2_shuffle));
+ }
Index: gcc/tree-ssa-propagate.c
===================================================================
*** gcc/tree-ssa-propagate.c (revision 239361)
--- gcc/tree-ssa-propagate.c (working copy)
***************
*** 37,42 ****
--- 37,43 ----
#include "domwalk.h"
#include "cfgloop.h"
#include "tree-cfgcleanup.h"
+ #include "cfganal.h"
/* This file implements a generic value propagation engine based on
the same propagation used by the SSA-CCP algorithm [1].
***************
*** 83,95 ****
Blocks are added to this list if their incoming edges are
found executable.
! VARYING_SSA_EDGES contains the list of statements that feed
! from statements that produce an SSA_PROP_VARYING result.
! These are simulated first to speed up processing.
!
! INTERESTING_SSA_EDGES contains the list of statements that
! feed from statements that produce an SSA_PROP_INTERESTING
! result.
5- Simulation terminates when all three work lists are drained.
--- 84,91 ----
Blocks are added to this list if their incoming edges are
found executable.
! SSA_EDGE_WORKLIST contains the list of statements that we
! need to revisit.
5- Simulation terminates when all three work lists are drained.
***************
*** 116,224 ****
static ssa_prop_visit_stmt_fn ssa_prop_visit_stmt;
static ssa_prop_visit_phi_fn ssa_prop_visit_phi;
! /* Keep track of statements that have been added to one of the SSA
! edges worklists. This flag is used to avoid visiting statements
! unnecessarily when draining an SSA edge worklist. If while
! simulating a basic block, we find a statement with
! STMT_IN_SSA_EDGE_WORKLIST set, we clear it to prevent SSA edge
! processing from visiting it again.
!
! NOTE: users of the propagation engine are not allowed to use
! the GF_PLF_1 flag. */
! #define STMT_IN_SSA_EDGE_WORKLIST GF_PLF_1
!
! /* A bitmap to keep track of executable blocks in the CFG. */
! static sbitmap executable_blocks;
!
! /* Array of control flow edges on the worklist. */
! static vec<basic_block> cfg_blocks;
!
! static unsigned int cfg_blocks_num = 0;
! static int cfg_blocks_tail;
! static int cfg_blocks_head;
!
! static sbitmap bb_in_list;
/* Worklist of SSA edges which will need reexamination as their
definition has changed. SSA edges are def-use edges in the SSA
web. For each D-U edge, we store the target statement or PHI node
! U. */
! static vec<gimple *> interesting_ssa_edges;
!
! /* Identical to INTERESTING_SSA_EDGES. For performance reasons, the
! list of SSA edges is split into two. One contains all SSA edges
! who need to be reexamined because their lattice value changed to
! varying (this worklist), and the other contains all other SSA edges
! to be reexamined (INTERESTING_SSA_EDGES).
!
! Since most values in the program are VARYING, the ideal situation
! is to move them to that lattice value as quickly as possible.
! Thus, it doesn't make sense to process any other type of lattice
! value until all VARYING values are propagated fully, which is one
! thing using the VARYING worklist achieves. In addition, if we
! don't use a separate worklist for VARYING edges, we end up with
! situations where lattice values move from
! UNDEFINED->INTERESTING->VARYING instead of UNDEFINED->VARYING. */
! static vec<gimple *> varying_ssa_edges;
!
/* Return true if the block worklist empty. */
static inline bool
cfg_blocks_empty_p (void)
{
! return (cfg_blocks_num == 0);
}
! /* Add a basic block to the worklist. The block must not be already
! in the worklist, and it must not be the ENTRY or EXIT block. */
static void
cfg_blocks_add (basic_block bb)
{
- bool head = false;
-
gcc_assert (bb != ENTRY_BLOCK_PTR_FOR_FN (cfun)
&& bb != EXIT_BLOCK_PTR_FOR_FN (cfun));
! gcc_assert (!bitmap_bit_p (bb_in_list, bb->index));
!
! if (cfg_blocks_empty_p ())
! {
! cfg_blocks_tail = cfg_blocks_head = 0;
! cfg_blocks_num = 1;
! }
! else
! {
! cfg_blocks_num++;
! if (cfg_blocks_num > cfg_blocks.length ())
! {
! /* We have to grow the array now. Adjust to queue to occupy
! the full space of the original array. We do not need to
! initialize the newly allocated portion of the array
! because we keep track of CFG_BLOCKS_HEAD and
! CFG_BLOCKS_HEAD. */
! cfg_blocks_tail = cfg_blocks.length ();
! cfg_blocks_head = 0;
! cfg_blocks.safe_grow (2 * cfg_blocks_tail);
! }
! /* Minor optimization: we prefer to see blocks with more
! predecessors later, because there is more of a chance that
! the incoming edges will be executable. */
! else if (EDGE_COUNT (bb->preds)
! >= EDGE_COUNT (cfg_blocks[cfg_blocks_head]->preds))
! cfg_blocks_tail = ((cfg_blocks_tail + 1) % cfg_blocks.length ());
! else
! {
! if (cfg_blocks_head == 0)
! cfg_blocks_head = cfg_blocks.length ();
! --cfg_blocks_head;
! head = true;
! }
! }
!
! cfg_blocks[head ? cfg_blocks_head : cfg_blocks_tail] = bb;
! bitmap_set_bit (bb_in_list, bb->index);
}
--- 112,149 ----
static ssa_prop_visit_stmt_fn ssa_prop_visit_stmt;
static ssa_prop_visit_phi_fn ssa_prop_visit_phi;
! /* Worklist of control flow edge destinations. This contains
! the CFG order number of the blocks so we can iterate in CFG
! order by visiting in bit-order. */
! static bitmap cfg_blocks;
! static int *bb_to_cfg_order;
! static int *cfg_order_to_bb;
/* Worklist of SSA edges which will need reexamination as their
definition has changed. SSA edges are def-use edges in the SSA
web. For each D-U edge, we store the target statement or PHI node
! UID in a bitmap. UIDs order stmts in execution order. */
! static bitmap ssa_edge_worklist;
! static vec<gimple *> uid_to_stmt;
/* Return true if the block worklist empty. */
static inline bool
cfg_blocks_empty_p (void)
{
! return bitmap_empty_p (cfg_blocks);
}
! /* Add a basic block to the worklist. The block must not be the ENTRY
! or EXIT block. */
static void
cfg_blocks_add (basic_block bb)
{
gcc_assert (bb != ENTRY_BLOCK_PTR_FOR_FN (cfun)
&& bb != EXIT_BLOCK_PTR_FOR_FN (cfun));
! bitmap_set_bit (cfg_blocks, bb_to_cfg_order[bb->index]);
}
*************** cfg_blocks_add (basic_block bb)
*** 227,244 ****
static basic_block
cfg_blocks_get (void)
{
- basic_block bb;
-
- bb = cfg_blocks[cfg_blocks_head];
-
gcc_assert (!cfg_blocks_empty_p ());
! gcc_assert (bb);
!
! cfg_blocks_head = ((cfg_blocks_head + 1) % cfg_blocks.length ());
! --cfg_blocks_num;
! bitmap_clear_bit (bb_in_list, bb->index);
!
! return bb;
}
--- 152,161 ----
static basic_block
cfg_blocks_get (void)
{
gcc_assert (!cfg_blocks_empty_p ());
! int order_index = bitmap_first_set_bit (cfg_blocks);
! bitmap_clear_bit (cfg_blocks, order_index);
! return BASIC_BLOCK_FOR_FN (cfun, cfg_order_to_bb [order_index]);
}
*************** cfg_blocks_get (void)
*** 247,253 ****
them to INTERESTING_SSA_EDGES. */
static void
! add_ssa_edge (tree var, bool is_varying)
{
imm_use_iterator iter;
use_operand_p use_p;
--- 164,170 ----
them to INTERESTING_SSA_EDGES. */
static void
! add_ssa_edge (tree var)
{
imm_use_iterator iter;
use_operand_p use_p;
*************** add_ssa_edge (tree var, bool is_varying)
*** 256,282 ****
{
gimple *use_stmt = USE_STMT (use_p);
if (prop_simulate_again_p (use_stmt)
! && !gimple_plf (use_stmt, STMT_IN_SSA_EDGE_WORKLIST))
{
! gimple_set_plf (use_stmt, STMT_IN_SSA_EDGE_WORKLIST, true);
! if (is_varying)
! {
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "varying_ssa_edges: adding SSA use in ");
! print_gimple_stmt (dump_file, use_stmt, 0, TDF_SLIM);
! }
! varying_ssa_edges.safe_push (use_stmt);
! }
! else
{
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "interesting_ssa_edges: adding SSA use in ");
! print_gimple_stmt (dump_file, use_stmt, 0, TDF_SLIM);
! }
! interesting_ssa_edges.safe_push (use_stmt);
}
}
}
--- 173,191 ----
{
gimple *use_stmt = USE_STMT (use_p);
+ /* If we did not yet simulate the block wait for this to happen
+ and do not add the stmt to the SSA edge worklist. */
+ if (! (gimple_bb (use_stmt)->flags & BB_VISITED))
+ continue;
+
if (prop_simulate_again_p (use_stmt)
! && bitmap_set_bit (ssa_edge_worklist, gimple_uid (use_stmt)))
{
! uid_to_stmt[gimple_uid (use_stmt)] = use_stmt;
! if (dump_file && (dump_flags & TDF_DETAILS))
{
! fprintf (dump_file, "ssa_edge_worklist: adding SSA use in ");
! print_gimple_stmt (dump_file, use_stmt, 0, TDF_SLIM);
}
}
}
*************** add_control_edge (edge e)
*** 298,307 ****
e->flags |= EDGE_EXECUTABLE;
- /* If the block is already in the list, we're done. */
- if (bitmap_bit_p (bb_in_list, bb->index))
- return;
-
cfg_blocks_add (bb);
if (dump_file && (dump_flags & TDF_DETAILS))
--- 207,212 ----
*************** simulate_stmt (gimple *stmt)
*** 319,324 ****
--- 224,232 ----
edge taken_edge = NULL;
tree output_name = NULL_TREE;
+ /* Pull the stmt off the SSA edge worklist. */
+ bitmap_clear_bit (ssa_edge_worklist, gimple_uid (stmt));
+
/* Don't bother visiting statements that are already
considered varying by the propagator. */
if (!prop_simulate_again_p (stmt))
*************** simulate_stmt (gimple *stmt)
*** 339,345 ****
/* If the statement produced a new varying value, add the SSA
edges coming out of OUTPUT_NAME. */
if (output_name)
! add_ssa_edge (output_name, true);
/* If STMT transfers control out of its basic block, add
all outgoing edges to the work list. */
--- 247,253 ----
/* If the statement produced a new varying value, add the SSA
edges coming out of OUTPUT_NAME. */
if (output_name)
! add_ssa_edge (output_name);
/* If STMT transfers control out of its basic block, add
all outgoing edges to the work list. */
*************** simulate_stmt (gimple *stmt)
*** 358,364 ****
/* If the statement produced new value, add the SSA edges coming
out of OUTPUT_NAME. */
if (output_name)
! add_ssa_edge (output_name, false);
/* If we know which edge is going to be taken out of this block,
add it to the CFG work list. */
--- 266,272 ----
/* If the statement produced new value, add the SSA edges coming
out of OUTPUT_NAME. */
if (output_name)
! add_ssa_edge (output_name);
/* If we know which edge is going to be taken out of this block,
add it to the CFG work list. */
*************** simulate_stmt (gimple *stmt)
*** 413,466 ****
when an SSA edge is added to it in simulate_stmt. Return true if a stmt
was simulated. */
! static bool
! process_ssa_edge_worklist (vec<gimple *> *worklist, const char *edge_list_name)
{
/* Process the next entry from the worklist. */
! while (worklist->length () > 0)
! {
! basic_block bb;
!
! /* Pull the statement to simulate off the worklist. */
! gimple *stmt = worklist->pop ();
!
! /* If this statement was already visited by simulate_block, then
! we don't need to visit it again here. */
! if (!gimple_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST))
! continue;
!
! /* STMT is no longer in a worklist. */
! gimple_set_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST, false);
! bb = gimple_bb (stmt);
! /* Visit the statement only if its block is marked executable.
! If it is not executable then it will be visited when we simulate
! all statements in the block as soon as an incoming edge gets
! marked executable. */
! if (!bitmap_bit_p (executable_blocks, bb->index))
! {
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "\nDropping statement from SSA worklist: ");
! print_gimple_stmt (dump_file, stmt, 0, dump_flags);
! }
! continue;
! }
!
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "\nSimulating statement (from %s): ",
! edge_list_name);
! print_gimple_stmt (dump_file, stmt, 0, dump_flags);
! }
!
! simulate_stmt (stmt);
!
! return true;
}
! return false;
}
--- 321,344 ----
when an SSA edge is added to it in simulate_stmt. Return true if a stmt
was simulated. */
! static void
! process_ssa_edge_worklist ()
{
/* Process the next entry from the worklist. */
! unsigned stmt_uid = bitmap_first_set_bit (ssa_edge_worklist);
! bitmap_clear_bit (ssa_edge_worklist, stmt_uid);
! gimple *stmt = uid_to_stmt[stmt_uid];
! /* We should not have stmts in not yet simulated BBs on the worklist. */
! gcc_assert (gimple_bb (stmt)->flags & BB_VISITED);
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "\nSimulating statement: ");
! print_gimple_stmt (dump_file, stmt, 0, dump_flags);
}
! simulate_stmt (stmt);
}
*************** simulate_block (basic_block block)
*** 486,515 ****
/* If this is the first time we've simulated this block, then we
must simulate each of its statements. */
! if (!bitmap_bit_p (executable_blocks, block->index))
{
gimple_stmt_iterator j;
unsigned int normal_edge_count;
edge e, normal_edge;
edge_iterator ei;
- /* Note that we have simulated this block. */
- bitmap_set_bit (executable_blocks, block->index);
-
for (j = gsi_start_bb (block); !gsi_end_p (j); gsi_next (&j))
! {
! gimple *stmt = gsi_stmt (j);
! /* If this statement is already in the worklist then
! "cancel" it. The reevaluation implied by the worklist
! entry will produce the same value we generate here and
! thus reevaluating it again from the worklist is
! pointless. */
! if (gimple_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST))
! gimple_set_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST, false);
!
! simulate_stmt (stmt);
! }
/* We can not predict when abnormal and EH edges will be executed, so
once a block is considered executable, we consider any
--- 364,381 ----
/* If this is the first time we've simulated this block, then we
must simulate each of its statements. */
! if (! (block->flags & BB_VISITED))
{
gimple_stmt_iterator j;
unsigned int normal_edge_count;
edge e, normal_edge;
edge_iterator ei;
for (j = gsi_start_bb (block); !gsi_end_p (j); gsi_next (&j))
! simulate_stmt (gsi_stmt (j));
! /* Note that we have simulated this block. */
! block->flags |= BB_VISITED;
/* We can not predict when abnormal and EH edges will be executed, so
once a block is considered executable, we consider any
*************** ssa_prop_init (void)
*** 551,591 ****
basic_block bb;
/* Worklists of SSA edges. */
! interesting_ssa_edges.create (20);
! varying_ssa_edges.create (20);
! executable_blocks = sbitmap_alloc (last_basic_block_for_fn (cfun));
! bitmap_clear (executable_blocks);
!
! bb_in_list = sbitmap_alloc (last_basic_block_for_fn (cfun));
! bitmap_clear (bb_in_list);
if (dump_file && (dump_flags & TDF_DETAILS))
dump_immediate_uses (dump_file);
- cfg_blocks.create (20);
- cfg_blocks.safe_grow_cleared (20);
-
/* Initially assume that every edge in the CFG is not executable.
! (including the edges coming out of the entry block). */
! FOR_ALL_BB_FN (bb, cfun)
{
gimple_stmt_iterator si;
!
! for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
! gimple_set_plf (gsi_stmt (si), STMT_IN_SSA_EDGE_WORKLIST, false);
for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
! gimple_set_plf (gsi_stmt (si), STMT_IN_SSA_EDGE_WORKLIST, false);
FOR_EACH_EDGE (e, ei, bb->succs)
e->flags &= ~EDGE_EXECUTABLE;
}
/* Seed the algorithm by adding the successors of the entry block to the
edge worklist. */
FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
! add_control_edge (e);
}
--- 417,471 ----
basic_block bb;
/* Worklists of SSA edges. */
! ssa_edge_worklist = BITMAP_ALLOC (NULL);
! /* Worklist of basic-blocks. */
! bb_to_cfg_order = XNEWVEC (int, last_basic_block_for_fn (cfun) + 1);
! cfg_order_to_bb = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
! int n = pre_and_rev_post_order_compute_fn (cfun, cfg_order_to_bb,
! NULL, false);
! for (int i = 0; i < n; ++i)
! bb_to_cfg_order[cfg_order_to_bb[i]] = i;
! cfg_blocks = BITMAP_ALLOC (NULL);
if (dump_file && (dump_flags & TDF_DETAILS))
dump_immediate_uses (dump_file);
/* Initially assume that every edge in the CFG is not executable.
! (including the edges coming out of the entry block). Mark blocks
! as not visited, blocks not yet visited will have all their statements
! simulated once an incoming edge gets executable. */
! set_gimple_stmt_max_uid (cfun, 0);
! for (int i = 0; i < n; ++i)
{
gimple_stmt_iterator si;
! bb = BASIC_BLOCK_FOR_FN (cfun, cfg_order_to_bb[i]);
for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
! {
! gimple *stmt = gsi_stmt (si);
! gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
! }
+ for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
+ {
+ gimple *stmt = gsi_stmt (si);
+ gimple_set_uid (stmt, inc_gimple_stmt_max_uid (cfun));
+ }
+
+ gcc_assert (! (bb->flags & BB_VISITED));
FOR_EACH_EDGE (e, ei, bb->succs)
e->flags &= ~EDGE_EXECUTABLE;
}
+ uid_to_stmt.safe_grow (gimple_stmt_max_uid (cfun));
/* Seed the algorithm by adding the successors of the entry block to the
edge worklist. */
FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
! {
! e->flags &= ~EDGE_EXECUTABLE;
! add_control_edge (e);
! }
}
*************** ssa_prop_init (void)
*** 594,604 ****
static void
ssa_prop_fini (void)
{
! interesting_ssa_edges.release ();
! varying_ssa_edges.release ();
! cfg_blocks.release ();
! sbitmap_free (bb_in_list);
! sbitmap_free (executable_blocks);
}
--- 474,487 ----
static void
ssa_prop_fini (void)
{
! BITMAP_FREE (cfg_blocks);
! free (bb_to_cfg_order);
! free (cfg_order_to_bb);
! BITMAP_FREE (ssa_edge_worklist);
! uid_to_stmt.release ();
! basic_block bb;
! FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
! bb->flags &= ~BB_VISITED;
}
*************** ssa_propagate (ssa_prop_visit_stmt_fn vi
*** 917,927 ****
ssa_prop_init ();
/* Iterate until the worklists are empty. */
! while (!cfg_blocks_empty_p ()
! || interesting_ssa_edges.length () > 0
! || varying_ssa_edges.length () > 0)
{
! if (!cfg_blocks_empty_p ())
{
/* Pull the next block to simulate off the worklist. */
basic_block dest_block = cfg_blocks_get ();
--- 800,810 ----
ssa_prop_init ();
/* Iterate until the worklists are empty. */
! while (! cfg_blocks_empty_p ()
! || ! bitmap_empty_p (ssa_edge_worklist))
{
! /* First simulate whole blocks. */
! if (! cfg_blocks_empty_p ())
{
/* Pull the next block to simulate off the worklist. */
basic_block dest_block = cfg_blocks_get ();
*************** ssa_propagate (ssa_prop_visit_stmt_fn vi
*** 929,942 ****
continue;
}
! /* In order to move things to varying as quickly as
! possible,process the VARYING_SSA_EDGES worklist first. */
! if (process_ssa_edge_worklist (&varying_ssa_edges, "varying_ssa_edges"))
! continue;
!
! /* Now process the INTERESTING_SSA_EDGES worklist. */
! process_ssa_edge_worklist (&interesting_ssa_edges,
! "interesting_ssa_edges");
}
ssa_prop_fini ();
--- 812,819 ----
continue;
}
! /* Then simulate from the SSA edge worklist. */
! process_ssa_edge_worklist ();
}
ssa_prop_fini ();
2016-08-10 14:20 [PATCH][RFC] Fix PR72851 by "randomizing" SSA propagator worklist extraction Richard Biener
2016-08-11 8:56 ` Richard Biener
2016-08-12 7:33 ` Richard Biener