* [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
@ 2024-07-17 14:15 Filip Kastl
2024-07-18 10:07 ` Richard Biener
0 siblings, 1 reply; 10+ messages in thread
From: Filip Kastl @ 2024-07-17 14:15 UTC (permalink / raw)
To: gcc-patches; +Cc: rguenther, mjambor
Hi,
This is the second version of my patch introducing "exponential index
transformation" to the switch conversion pass. See the version 1 mail here:
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653120.html
Changes made
------------
In this version I addressed the following comments:
- Linaro CI: switch-3.c failing regtest
Exp transform interfered with this test so I added -fdisable-tree-switchconv.
- richi: gsi_start_bb -> gsi_after_labels
- richi: Use int_const_binop
- richi: Merge two cycles into one
- apinki, richi: Use wide_int instead of HOST_WIDE_INT
- richi: You can use the gimple_build API for nicer code
- richi: Use -mbmi -mpopcount instead -march=znver4
Made these modifications.
- richi: Split out code generating GIMPLE for log2 and pow2p operations
Made this modification. The final implementation differs from the one I sent
in a reply to the version 1 patch. I chose to return gimple_seq as Richard
suggested because it seems more elegant to me.
- richi: "It is good to not leave IL in a broken state"
Removed my FIXME remark suggesting to not fix dominators because the GIMPLE
code gets removed later.
Changes not made
----------------
These are the comments that I think were meant as suggestions, not as "this
must be changed" and I'm leaving them for possible future patches:
- richi: Also handle *minus* powers of 2 and powers of 2 +- a constant
- richi: Also allow using CTZ to compute log2
- apinski, richi: Smarter handling of type of index variable (current
implementation cannot handle shorts and chars for example)
These are the comments that I'd like to reply to here but they didn't prompt
any change to the patch:
- richi: Signed index variable with 0x80000000 as its value may be a problem.
Tested the patch version 2 for this situation. In this situation, switch
conversion evaluates exponential transform as not viable so its fine.
- richi:
> > + redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
> > +
> > + /* FIXME: Since exponential transform is run only if we know that the
> > + switch will be converted, we know these blocks will be removed and we
> > + maybe don't have to bother updating their dominators. */
>
> It's good to not leave the IL in an intermediate broken state.
>
> > + edge e;
> > + edge_iterator ei;
> > + FOR_EACH_EDGE (e, ei, swtch_bb->succs)
> > + {
> > + basic_block bb = e->dest;
> > + if (bb == m_final_bb || bb == default_bb)
> > + continue;
> > + set_immediate_dominator (CDI_DOMINATORS, bb, swtch_bb);
>
> If there's an alternate edge into the cases, thus the original
> dominator wasn't the swtch_bb the dominator shouldn't change.
> I wonder why it would change at all - you are only creating
> a new edge into the default case so only that block dominance
> relationship changes?
If I read the switch conversion sources right, there cannot be an alternate
edge into the non-default cases. switch_convesion::collect would reject that
kind of switch. So we know that case basic blocks will always have the switch
basic block as their immediate dominator. This code here actually just
partially reverts (only for the case basic blocks) the
redirect_immediate_dominators call that happens a few lines earlier. That call
is needed because all basic blocks *outside* the switch that had the switch
basic block as their immediate dominator should now have cond_bb as their
immediate dominator.
- richi:
> > + }
> > +
> > + vec<basic_block> v;
> > + v.create (1);
> > + v.quick_push (m_final_bb);
> > + iterate_fix_dominators (CDI_DOMINATORS, v, true);
>
> The final BB should have the condition block as immediate dominator
> if it's old immediate dominator was the switch block, otherwise
> it's immediate dominator shouldn't change.
I think that this is not true. Consider this CFG where no path through default
BB intersects final BB:
switch BB ---+
/ | \ \
case BBs default BB
\ | / /
final BB /
| /
Here idom(final BB) == switch BB.
After the index exponential transform the CFG looks like this
cond BB ---------+
| |
switch BB ---+ |
/ | \ \ |
case BBs default BB
\ | / /
final BB /
| /
It still holds that idom(final BB) == switch BB.
I bootstrapped and regtested the patch on x86_64 linux.
Can I commit the patch like this? Or are there some things that still need
addressing?
Cheers
Filip Kastl
--- 8< ---
gimple ssa: Teach switch conversion to optimize powers of 2 switches
Sometimes a switch has case numbers that are powers of 2. Switch
conversion usually isn't able to optimize switches. This patch adds
"exponential index transformation" to switch conversion. After switch
conversion applies this transformation on the switch the index variable
of the switch becomes the exponent instead of the whole value. For
example:
switch (i)
{
case (1 << 0): return 0;
case (1 << 1): return 1;
case (1 << 2): return 2;
...
case (1 << 30): return 30;
default: return 31;
}
gets transformed roughly into
switch (log2(i))
{
case 0: return 0;
case 1: return 1;
case 2: return 2;
...
case 30: return 30;
default: return 31;
}
This enables switch conversion to further optimize the switch.
This patch only enables this transformation if there are optabs for
POPCOUNT and FFS so that the base 2 logarithm can be computed
efficiently at runtime.
gcc/ChangeLog:
* tree-switch-conversion.cc (can_log2):
(gen_log2):
(gen_pow2p):
(switch_conversion::switch_conversion):
(switch_conversion::is_exp_index_transform_viable):
(switch_conversion::exp_index_transform):
(switch_conversion::gen_inbound_check):
(switch_conversion::expand):
* tree-switch-conversion.h:
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/switch-3.c:
* gcc.target/i386/switch-exp-transform-1.c: New test.
* gcc.target/i386/switch-exp-transform-2.c: New test.
* gcc.target/i386/switch-exp-transform-3.c: New test.
Signed-off-by: Filip Kastl <fkastl@suse.cz>
---
gcc/testsuite/gcc.dg/tree-ssa/switch-3.c | 2 +-
.../gcc.target/i386/switch-exp-transform-1.c | 32 ++
.../gcc.target/i386/switch-exp-transform-2.c | 35 ++
.../gcc.target/i386/switch-exp-transform-3.c | 148 ++++++++
gcc/tree-switch-conversion.cc | 326 +++++++++++++++++-
gcc/tree-switch-conversion.h | 18 +
6 files changed, 555 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c b/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c
index 44981e1d186..83aae3843e9 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c
@@ -1,4 +1,4 @@
-/* { dg-options "-O2 -fdump-tree-switchlower1" } */
+/* { dg-options "-O2 -fdump-tree-switchlower1 -fdisable-tree-switchconv" } */
int cipher_to_alg(int cipher)
{
diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
new file mode 100644
index 00000000000..53d31460ba3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-switchconv -mpopcnt -mbmi" } */
+
+/* Checks that exponential index transform enables switch conversion to convert
+ this switch into an array lookup. Also checks that the "index variable is a
+ power of two" check has been generated. */
+
+int foo(unsigned bar)
+{
+ switch (bar)
+ {
+ case (1 << 0):
+ return 1;
+ case (1 << 1):
+ return 2;
+ case (1 << 2):
+ return 3;
+ case (1 << 3):
+ return 4;
+ case (1 << 4):
+ return 8;
+ case (1 << 5):
+ return 13;
+ case (1 << 6):
+ return 21;
+ default:
+ return 0;
+ }
+}
+
+/* { dg-final { scan-tree-dump "CSWTCH" "switchconv" } } */
+/* { dg-final { scan-tree-dump "POPCOUNT" "switchconv" } } */
diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
new file mode 100644
index 00000000000..f1a9a2d1ee9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-switchconv -mbmi" } */
+
+/* Checks that when exponential index transform is viable but switch conversion
+ decides that the switch cannot be converted, the exponential index transform
+ is not done. */
+
+int a;
+
+int foo(unsigned bar)
+{
+ switch (bar)
+ {
+ case (1 << 0):
+ return 0;
+ case (1 << 1):
+ return 1;
+ case (1 << 2):
+ return 2;
+ case (1 << 3):
+ return 3;
+ case (1 << 4):
+ return 4;
+ case (1 << 5):
+ a = 3;
+ return 5;
+ case (1 << 6):
+ return 6;
+ default:
+ return 0;
+ }
+}
+
+/* { dg-final { scan-tree-dump "Exponential index transform viable" "switchconv" } } */
+/* { dg-final { scan-tree-dump-not "Applying exponential index transform" "switchconv" } } */
diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
new file mode 100644
index 00000000000..c8fae70692e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
@@ -0,0 +1,148 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-switchconv -mbmi" } */
+
+/* Checks that the exponential index transformation is done for all these types
+ of the index variable:
+ - (unsigned) int
+ - (unsigned) long
+ - (unsigned) long long */
+
+int unopt_int(int bit_position)
+{
+ switch (bit_position)
+ {
+ case (1 << 0):
+ return 0;
+ case (1 << 1):
+ return 1;
+ case (1 << 2):
+ return 2;
+ case (1 << 3):
+ return 3;
+ case (1 << 4):
+ return 4;
+ case (1 << 5):
+ return 5;
+ case (1 << 6):
+ return 6;
+ default:
+ return 0;
+ }
+}
+
+int unopt_unsigned_int(unsigned int bit_position)
+{
+ switch (bit_position)
+ {
+ case (1 << 0):
+ return 0;
+ case (1 << 1):
+ return 1;
+ case (1 << 2):
+ return 2;
+ case (1 << 3):
+ return 3;
+ case (1 << 4):
+ return 4;
+ case (1 << 5):
+ return 5;
+ case (1 << 6):
+ return 6;
+ default:
+ return 0;
+ }
+}
+
+int unopt_long(long bit_position)
+{
+ switch (bit_position)
+ {
+ case (1 << 0):
+ return 0;
+ case (1 << 1):
+ return 1;
+ case (1 << 2):
+ return 2;
+ case (1 << 3):
+ return 3;
+ case (1 << 4):
+ return 4;
+ case (1 << 5):
+ return 5;
+ case (1 << 6):
+ return 6;
+ default:
+ return 0;
+ }
+}
+
+int unopt_unsigned_long(unsigned long bit_position)
+{
+ switch (bit_position)
+ {
+ case (1 << 0):
+ return 0;
+ case (1 << 1):
+ return 1;
+ case (1 << 2):
+ return 2;
+ case (1 << 3):
+ return 3;
+ case (1 << 4):
+ return 4;
+ case (1 << 5):
+ return 5;
+ case (1 << 6):
+ return 6;
+ default:
+ return 0;
+ }
+}
+
+int unopt_long_long(long long bit_position)
+{
+ switch (bit_position)
+ {
+ case (1 << 0):
+ return 0;
+ case (1 << 1):
+ return 1;
+ case (1 << 2):
+ return 2;
+ case (1 << 3):
+ return 3;
+ case (1 << 4):
+ return 4;
+ case (1 << 5):
+ return 5;
+ case (1 << 6):
+ return 6;
+ default:
+ return 0;
+ }
+}
+
+int unopt_unsigned_long_long(unsigned long long bit_position)
+{
+ switch (bit_position)
+ {
+ case (1 << 0):
+ return 0;
+ case (1 << 1):
+ return 1;
+ case (1 << 2):
+ return 2;
+ case (1 << 3):
+ return 3;
+ case (1 << 4):
+ return 4;
+ case (1 << 5):
+ return 5;
+ case (1 << 6):
+ return 6;
+ default:
+ return 0;
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "Applying exponential index transform" 6 "switchconv" } } */
diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc
index 64629122ec6..4b11c8d25f4 100644
--- a/gcc/tree-switch-conversion.cc
+++ b/gcc/tree-switch-conversion.cc
@@ -52,6 +52,8 @@ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
#include "omp-general.h"
#include "gimple-range.h"
#include "tree-cfgcleanup.h"
+#include "hwint.h"
+#include "internal-fn.h"
/* ??? For lang_hooks.types.type_for_mode, but is there a word_mode
type in the GIMPLE type system that is language-independent? */
@@ -61,12 +63,73 @@ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
\f
using namespace tree_switch_conversion;
+/* Does the target have optabs needed to efficiently compute exact base two
+ logarithm of a value with type TYPE?
+
+ See gen_log2. */
+
+static bool
+can_log2 (tree type, optimization_type opt_type)
+{
+ /* Check if target supports FFS. */
+ return direct_internal_fn_supported_p (IFN_FFS, type, opt_type);
+}
+
+/* Assume that OP is a power of two. Build a sequence of gimple statements
+ efficiently computing the base two logarithm of OP using special optabs.
+ Return the ssa name represeting the result of the logarithm through RESULT.
+
+ Should only be used if target supports the needed optabs. See can_log2. */
+
+static gimple_seq
+gen_log2 (tree op, location_t loc, tree *result)
+{
+ tree type = TREE_TYPE (op);
+ gimple_seq stmts = NULL;
+ gimple_stmt_iterator gsi = gsi_last (stmts);
+ tree tmp1 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, IFN_FFS, type, op);
+ tree tmp2 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, MINUS_EXPR, type,
+ tmp1, build_one_cst (type));
+ *result = tmp2;
+ return stmts;
+}
+
+/* Build a sequence of gimple statements checking that OP is a power of 2. Use
+ special optabs if target supports them. Return the result as a
+ boolen_type_node ssa name through RESULT. */
+
+static gimple_seq
+gen_pow2p (tree op, location_t loc, optimization_type opt_type, tree *result)
+{
+ tree type = TREE_TYPE (op);
+ gimple_seq stmts = NULL;
+ gimple_stmt_iterator gsi = gsi_last (stmts);
+ if (direct_internal_fn_supported_p (IFN_POPCOUNT, type, opt_type))
+ {
+ tree tmp = gimple_build (&gsi, false, GSI_NEW_STMT, loc, IFN_POPCOUNT,
+ type, op);
+ *result = gimple_build (&gsi, false, GSI_NEW_STMT, loc, EQ_EXPR,
+ boolean_type_node, tmp, build_one_cst (type));
+ }
+ else
+ {
+ tree tmp1 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, NEGATE_EXPR,
+ type, op);
+ tree tmp2 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, BIT_AND_EXPR,
+ type, op, tmp1);
+ *result = gimple_build (&gsi, false, GSI_NEW_STMT, loc, EQ_EXPR,
+ boolean_type_node, tmp2, op);
+ }
+ return stmts;
+}
+
/* Constructor. */
switch_conversion::switch_conversion (): m_final_bb (NULL),
m_constructors (NULL), m_default_values (NULL),
m_arr_ref_first (NULL), m_arr_ref_last (NULL),
- m_reason (NULL), m_default_case_nonstandard (false), m_cfg_altered (false)
+ m_reason (NULL), m_default_case_nonstandard (false), m_cfg_altered (false),
+ m_exp_index_transform_applied (false)
{
}
@@ -202,6 +265,244 @@ switch_conversion::collect (gswitch *swtch)
}
}
+/* Check that the "exponential index transform" can be applied to this switch.
+
+ See comment of the exp_index_transform function for details about this
+ transformation.
+
+ We want:
+ - This form of the switch is more efficient
+ - Cases are powers of 2
+
+ Expects that SWTCH has at least one case. */
+
+bool
+switch_conversion::is_exp_index_transform_viable (gswitch *swtch)
+{
+ tree index = gimple_switch_index (swtch);
+ tree index_type = TREE_TYPE (index);
+ basic_block swtch_bb = gimple_bb (swtch);
+ unsigned num_labels = gimple_switch_num_labels (swtch);
+
+ optimization_type opt_type = bb_optimization_type (swtch_bb);
+ if (!can_log2 (index_type, opt_type))
+ return false;
+
+ /* Check that each case label corresponds only to one value
+ (no case 1..3). */
+ unsigned i;
+ for (i = 1; i < num_labels; i++)
+ {
+ tree label = gimple_switch_label (swtch, i);
+ if (CASE_HIGH (label))
+ return false;
+ }
+
+ /* Check that each label is nonnegative and a power of 2. */
+ for (i = 1; i < num_labels; i++)
+ {
+ tree label = gimple_switch_label (swtch, i);
+ wide_int label_wi = wi::to_wide (CASE_LOW (label));
+ if (!wi::ge_p (label_wi, 0, TYPE_SIGN (index_type)))
+ return false;
+ if (wi::exact_log2 (label_wi) == -1)
+ return false;
+ }
+
+ if (dump_file)
+ fprintf (dump_file, "Exponential index transform viable\n");
+
+ return true;
+}
+
+/* Perform the "exponential index transform".
+
+ Assume that cases of SWTCH are powers of 2. The transformation replaces the
+ cases by their exponents (2^k -> k). It also inserts a statement that
+ computes the exponent of the original index variable (basically taking the
+ logarithm) and then sets the result as the new index variable.
+
+ The transformation also inserts a conditional statement checking that the
+ incoming original index variable is a power of 2 with the false edge leading
+ to the default case.
+
+ The exponential index transform shrinks the range of case numbers which
+ helps switch conversion convert switches it otherwise could not.
+
+ Consider for example:
+
+ switch (i)
+ {
+ case (1 << 0): return 0;
+ case (1 << 1): return 1;
+ case (1 << 2): return 2;
+ ...
+ case (1 << 30): return 30;
+ default: return 31;
+ }
+
+ First, exponential index transform gets applied. Since each case becomes
+ case x: return x;, the rest of switch conversion is then able to get rid of
+ the switch statement.
+
+ if (i is power of 2)
+ return log2 (i);
+ else
+ return 31;
+
+ */
+
+void
+switch_conversion::exp_index_transform (gswitch *swtch)
+{
+ if (dump_file)
+ fprintf (dump_file, "Applying exponential index transform\n");
+
+ tree index = gimple_switch_index (swtch);
+ tree index_type = TREE_TYPE (index);
+ basic_block swtch_bb = gimple_bb (swtch);
+ unsigned num_labels = gimple_switch_num_labels (swtch);
+
+ /* Insert a cond stmt that checks if the index variable is a power of 2. */
+ gimple_stmt_iterator gsi = gsi_for_stmt (swtch);
+ gsi_prev (&gsi);
+ gimple *foo = gsi_stmt (gsi);
+ edge new_edge1 = split_block (swtch_bb, foo);
+
+ swtch_bb = new_edge1->dest;
+ basic_block cond_bb = new_edge1->src;
+ new_edge1->flags |= EDGE_TRUE_VALUE;
+ new_edge1->flags &= ~EDGE_FALLTHRU;
+ new_edge1->probability = profile_probability::even ();
+
+ basic_block default_bb = gimple_switch_default_bb (cfun, swtch);
+ edge new_edge2 = make_edge (cond_bb, default_bb, EDGE_FALSE_VALUE);
+ new_edge2->probability = profile_probability::even ();
+
+ tree tmp;
+ optimization_type opt_type = bb_optimization_type (cond_bb);
+ gimple_seq stmts = gen_pow2p (index, UNKNOWN_LOCATION, opt_type, &tmp);
+ gsi = gsi_last_bb (cond_bb);
+ gsi_insert_seq_after (&gsi, stmts, GSI_LAST_NEW_STMT);
+ gcond *stmt_cond = gimple_build_cond (NE_EXPR, tmp, boolean_false_node,
+ NULL, NULL);
+ gsi_insert_after (&gsi, stmt_cond, GSI_NEW_STMT);
+
+ /* We just added an edge going to default bb so fix PHI nodes in that bb:
+ For each PHI add new PHI arg. It will be the same arg as when comming to
+ the default bb from the switch bb. */
+ edge default_edge = find_edge (swtch_bb, default_bb);
+ for (gphi_iterator gsi = gsi_start_phis (default_bb);
+ !gsi_end_p (gsi); gsi_next (&gsi))
+ {
+ gphi *phi = gsi.phi ();
+ tree arg = PHI_ARG_DEF_FROM_EDGE (phi, default_edge);
+ location_t loc = gimple_phi_arg_location_from_edge (phi, default_edge);
+ add_phi_arg (phi, arg, new_edge2, loc);
+ }
+
+ /* Insert a sequence of stmts that takes the log of the index variable. */
+ stmts = gen_log2 (index, UNKNOWN_LOCATION, &tmp);
+ gsi = gsi_after_labels (swtch_bb);
+ gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
+
+ /* Use the result of the logarithm as the new index variable. */
+ gimple_switch_set_index (swtch, tmp);
+ update_stmt (swtch);
+
+ /* Replace each case number with its logarithm. */
+ unsigned i;
+ for (i = 1; i < num_labels; i++)
+ {
+ tree label = gimple_switch_label (swtch, i);
+ CASE_LOW (label) = build_int_cst (index_type,
+ tree_log2 (CASE_LOW (label)));
+ }
+
+ /* Fix the dominator tree, if it is available. */
+ if (dom_info_available_p (CDI_DOMINATORS))
+ {
+ /* Analysis of how dominators should look after we add the edge E going
+ from the cond block to the default block.
+
+ 1 For the blocks between the switch block and the final block
+ (excluding the final block itself): They had the switch block as
+ their immediate dominator. That shouldn't change.
+
+ 2 The final block may now have the switch block or the cond block as
+ its immediate dominator. There's no easy way of knowing (consider
+ two cases where in both m_default_case_nonstandard = true, in one a
+ path through default intersects the final block and in one all paths
+ through default avoid the final block but intersect a successor of the
+ final block).
+
+ 3 Other blocks that had the switch block as their immediate dominator
+ should now have the cond block as their immediate dominator.
+
+ 4 Immediate dominators of the rest of the blocks shouldn't change.
+
+ Reasoning for 3 and 4:
+
+ We'll only consider blocks that do not fall into 1 or 2.
+
+ Consider a block X whose original imm dom was the switch block. All
+ paths to X must also intersect the cond block since it's the only
+ pred of the switch block. The final block doesn't dominate X so at
+ least one path P must lead through the default block. Let P' be P but
+ instead of going through the switch block, take E. The switch block
+ doesn't dominate X so its imm dom must now be the cond block.
+
+ Consider a block X whose original imm dom was Y != the switch block.
+ We only added an edge so all original paths to X are still present.
+ So X gained no new dominators. Observe that Y still dominates X.
+ There would have to be a path that avoids Y otherwise. But any block
+ we can avoid now except for the switch block we were able to avoid
+ before adding E. */
+
+ redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
+
+ edge e;
+ edge_iterator ei;
+ FOR_EACH_EDGE (e, ei, swtch_bb->succs)
+ {
+ basic_block bb = e->dest;
+ if (bb == m_final_bb || bb == default_bb)
+ continue;
+ set_immediate_dominator (CDI_DOMINATORS, bb, swtch_bb);
+ }
+
+ vec<basic_block> v;
+ v.create (1);
+ v.quick_push (m_final_bb);
+ iterate_fix_dominators (CDI_DOMINATORS, v, true);
+ }
+
+ /* Update information about the switch statement. */
+ tree first_label = gimple_switch_label (swtch, 1);
+ tree last_label = gimple_switch_label (swtch, num_labels - 1);
+
+ m_range_min = CASE_LOW (first_label);
+ m_range_max = CASE_LOW (last_label);
+ m_index_expr = gimple_switch_index (swtch);
+ m_switch_bb = swtch_bb;
+
+ m_range_size = int_const_binop (MINUS_EXPR, m_range_max, m_range_min);
+
+ m_cfg_altered = true;
+
+ m_contiguous_range = true;
+ wide_int last_wi = wi::to_wide (CASE_LOW (first_label));
+ for (i = 2; i < num_labels; i++)
+ {
+ tree label = gimple_switch_label (swtch, i);
+ wide_int label_wi = wi::to_wide (CASE_LOW (label));
+ m_contiguous_range &= wi::eq_p (wi::add (last_wi, 1), label_wi);
+ last_wi = label_wi;
+ }
+
+ m_exp_index_transform_applied = true;
+}
+
/* Checks whether the range given by individual case statements of the switch
switch statement isn't too big and whether the number of branches actually
satisfies the size of the new array. */
@@ -973,8 +1274,9 @@ switch_conversion::gen_inbound_check ()
bbf->count = e1f->count () + e2f->count ();
/* Tidy blocks that have become unreachable. */
- prune_bbs (bbd, m_final_bb,
- m_default_case_nonstandard ? m_default_bb : NULL);
+ bool prune_default_bb = !m_default_case_nonstandard
+ && !m_exp_index_transform_applied;
+ prune_bbs (bbd, m_final_bb, prune_default_bb ? NULL : m_default_bb);
/* Fixup the PHI nodes in bbF. */
fix_phi_nodes (e1f, e2f, bbf);
@@ -1053,8 +1355,19 @@ switch_conversion::expand (gswitch *swtch)
return;
}
- /* Check the case label values are within reasonable range: */
- if (!check_range ())
+ /* Sometimes it is possible to use the "exponential index transform" to help
+ switch conversion convert switches which it otherwise could not convert.
+ However, we want to do this transform only when we know that switch
+ conversion will then really be able to convert the switch. So we first
+ check if the transformation is applicable and then maybe later do the
+ transformation. */
+ bool exp_transform_viable = is_exp_index_transform_viable (swtch);
+
+ /* Check the case label values are within reasonable range.
+
+ If we will be doing exponential index transform, the range will be always
+ reasonable. */
+ if (!exp_transform_viable && !check_range ())
{
gcc_assert (m_reason);
return;
@@ -1076,6 +1389,9 @@ switch_conversion::expand (gswitch *swtch)
/* At this point all checks have passed and we can proceed with the
transformation. */
+ if (exp_transform_viable)
+ exp_index_transform (swtch);
+
create_temp_arrays ();
gather_default_values (m_default_case_nonstandard
? gimple_switch_label (swtch, 1)
diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
index 6939eec6018..1a865f85f3a 100644
--- a/gcc/tree-switch-conversion.h
+++ b/gcc/tree-switch-conversion.h
@@ -743,6 +743,19 @@ public:
/* Collection information about SWTCH statement. */
void collect (gswitch *swtch);
+ /* Check that the 'exponential index transform' can be applied.
+
+ See the comment at the function definition for more details. */
+ bool is_exp_index_transform_viable (gswitch *swtch);
+
+ /* Perform the 'exponential index transform'.
+
+ The exponential index transform shrinks the range of case numbers which
+ helps switch conversion convert switches it otherwise could not.
+
+ See the comment at the function definition for more details. */
+ void exp_index_transform (gswitch *swtch);
+
/* Checks whether the range given by individual case statements of the switch
switch statement isn't too big and whether the number of branches actually
satisfies the size of the new array. */
@@ -900,6 +913,11 @@ public:
/* True if CFG has been changed. */
bool m_cfg_altered;
+
+ /* True if exponential index transform has been applied. See the comment at
+ the definition of exp_index_transform for details about the
+ transformation. */
+ bool m_exp_index_transform_applied;
};
void
--
2.45.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-17 14:15 [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches Filip Kastl
@ 2024-07-18 10:07 ` Richard Biener
2024-07-18 14:17 ` Filip Kastl
0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2024-07-18 10:07 UTC (permalink / raw)
To: Filip Kastl; +Cc: gcc-patches, mjambor
On Wed, 17 Jul 2024, Filip Kastl wrote:
> Hi,
>
> This is the second version of my patch introducing "exponential index
> transformation" to the switch conversion pass. See the version 1 mail here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653120.html
>
>
> Changes made
> ------------
>
> In this version I addressed the following comments:
>
> - Linaro CI: switch-3.c failing regtest
> Exp transform interfered with this test so I added -fdisable-tree-switchconv.
>
> - richi: gsi_start_bb -> gsi_after_labels
> - richi: Use int_const_binop
> - richi: Merge two cycles into one
> - apinki, richi: Use wide_int instead of HOST_WIDE_INT
> - richi: You can use the gimple_build API for nicer code
> - richi: Use -mbmi -mpopcount instead -march=znver4
> Made these modifications.
>
> - richi: Split out code generating GIMPLE for log2 and pow2p operations
> Made this modification. The final implementation differs from the one I sent
> in a reply to the version 1 patch. I chose to return gimple_seq as Richard
> suggested because it seems more elegant to me.
>
> - richi: "It is good to not leave IL in a broken state"
> Removed my FIXME remark suggesting to not fix dominators because the GIMPLE
> code gets removed later.
>
>
> Changes not made
> ----------------
>
> These are the comments that I think were meant as suggestions, not as "this
> must be changed" and I'm leaving them for possible future patches:
>
> - richi: Also handle *minus* powers of 2 and powers of 2 +- a constant
> - richi: Also allow using CTZ to compute log2
> - apinski, richi: Smarter handling of type of index variable (current
> implementation cannot handle shorts and chars for example)
>
> These are the comments that I'd like to reply to here but they didn't prompt
> any change to the patch:
>
> - richi: Signed index variable with 0x80000000 as its value may be a problem.
> Tested the patch version 2 for this situation. In this situation, switch
> conversion evaluates exponential transform as not viable so its fine.
>
> - richi:
> > > + redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
> > > +
> > > + /* FIXME: Since exponential transform is run only if we know that the
> > > + switch will be converted, we know these blocks will be removed and we
> > > + maybe don't have to bother updating their dominators. */
> >
> > It's good to not leave the IL in an intermediate broken state.
> >
> > > + edge e;
> > > + edge_iterator ei;
> > > + FOR_EACH_EDGE (e, ei, swtch_bb->succs)
> > > + {
> > > + basic_block bb = e->dest;
> > > + if (bb == m_final_bb || bb == default_bb)
> > > + continue;
> > > + set_immediate_dominator (CDI_DOMINATORS, bb, swtch_bb);
> >
> > If there's an alternate edge into the cases, thus the original
> > dominator wasn't the swtch_bb the dominator shouldn't change.
> > I wonder why it would change at all - you are only creating
> > a new edge into the default case so only that block dominance
> > relationship changes?
> If I read the switch conversion sources right, there cannot be an alternate
> edge into the non-default cases. switch_convesion::collect would reject that
> kind of switch. So we know that case basic blocks will always have the switch
> basic block as their immediate dominator. This code here actually just
> partially reverts (only for the case basic blocks) the
> redirect_immediate_dominators call that happens a few lines earlier. That call
> is needed because all basic blocks *outside* the switch that had the switch
> basic block as their immediate dominator should now have cond_bb as their
> immediate dominator.
>
> - richi:
> > > + }
> > > +
> > > + vec<basic_block> v;
> > > + v.create (1);
> > > + v.quick_push (m_final_bb);
> > > + iterate_fix_dominators (CDI_DOMINATORS, v, true);
> >
> > The final BB should have the condition block as immediate dominator
> > if it's old immediate dominator was the switch block, otherwise
> > it's immediate dominator shouldn't change.
> I think that this is not true. Consider this CFG where no path through default
> BB intersects final BB:
>
> switch BB ---+
> / | \ \
> case BBs default BB
> \ | / /
> final BB /
> | /
>
> Here idom(final BB) == switch BB.
> After the index exponential transform the CFG looks like this
>
> cond BB ---------+
> | |
> switch BB ---+ |
> / | \ \ |
> case BBs default BB
> \ | / /
> final BB /
> | /
>
> It still holds that idom(final BB) == switch BB.
Sure but you still know the dominator of the default BB as the cond BB
then? That said, I think that using iterate_fix_dominators is overkill
and you should know the new immediate dominator and can set it directly?
That said, even above if there's a merge of the default BB and final BB
downstream in the CFG, inserting cond BB requires adjustment of the
immediate dominator of that merge block and you are missing that?
>
> I bootstrapped and regtested the patch on x86_64 linux.
>
> Can I commit the patch like this? Or are there some things that still need
> addressing?
Apart from the dominator update question above the patch looks OK to me.
Thanks,
Richard.
> Cheers
> Filip Kastl
>
>
> --- 8< ---
>
>
> gimple ssa: Teach switch conversion to optimize powers of 2 switches
>
> Sometimes a switch has case numbers that are powers of 2. Switch
> conversion usually isn't able to optimize switches. This patch adds
> "exponential index transformation" to switch conversion. After switch
> conversion applies this transformation on the switch the index variable
> of the switch becomes the exponent instead of the whole value. For
> example:
>
> switch (i)
> {
> case (1 << 0): return 0;
> case (1 << 1): return 1;
> case (1 << 2): return 2;
> ...
> case (1 << 30): return 30;
> default: return 31;
> }
>
> gets transformed roughly into
>
> switch (log2(i))
> {
> case 0: return 0;
> case 1: return 1;
> case 2: return 2;
> ...
> case 30: return 30;
> default: return 31;
> }
>
> This enables switch conversion to further optimize the switch.
>
> This patch only enables this transformation if there are optabs for
> POPCOUNT and FFS so that the base 2 logarithm can be computed
> efficiently at runtime.
>
> gcc/ChangeLog:
>
> * tree-switch-conversion.cc (can_log2):
> (gen_log2):
> (gen_pow2p):
> (switch_conversion::switch_conversion):
> (switch_conversion::is_exp_index_transform_viable):
> (switch_conversion::exp_index_transform):
> (switch_conversion::gen_inbound_check):
> (switch_conversion::expand):
> * tree-switch-conversion.h:
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/switch-3.c:
> * gcc.target/i386/switch-exp-transform-1.c: New test.
> * gcc.target/i386/switch-exp-transform-2.c: New test.
> * gcc.target/i386/switch-exp-transform-3.c: New test.
>
> Signed-off-by: Filip Kastl <fkastl@suse.cz>
> ---
> gcc/testsuite/gcc.dg/tree-ssa/switch-3.c | 2 +-
> .../gcc.target/i386/switch-exp-transform-1.c | 32 ++
> .../gcc.target/i386/switch-exp-transform-2.c | 35 ++
> .../gcc.target/i386/switch-exp-transform-3.c | 148 ++++++++
> gcc/tree-switch-conversion.cc | 326 +++++++++++++++++-
> gcc/tree-switch-conversion.h | 18 +
> 6 files changed, 555 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
> create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
> create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c b/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c
> index 44981e1d186..83aae3843e9 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c
> @@ -1,4 +1,4 @@
> -/* { dg-options "-O2 -fdump-tree-switchlower1" } */
> +/* { dg-options "-O2 -fdump-tree-switchlower1 -fdisable-tree-switchconv" } */
>
> int cipher_to_alg(int cipher)
> {
> diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
> new file mode 100644
> index 00000000000..53d31460ba3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-switchconv -mpopcnt -mbmi" } */
> +
> +/* Checks that exponential index transform enables switch conversion to convert
> + this switch into an array lookup. Also checks that the "index variable is a
> + power of two" check has been generated. */
> +
> +int foo(unsigned bar)
> +{
> + switch (bar)
> + {
> + case (1 << 0):
> + return 1;
> + case (1 << 1):
> + return 2;
> + case (1 << 2):
> + return 3;
> + case (1 << 3):
> + return 4;
> + case (1 << 4):
> + return 8;
> + case (1 << 5):
> + return 13;
> + case (1 << 6):
> + return 21;
> + default:
> + return 0;
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump "CSWTCH" "switchconv" } } */
> +/* { dg-final { scan-tree-dump "POPCOUNT" "switchconv" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
> new file mode 100644
> index 00000000000..f1a9a2d1ee9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-switchconv -mbmi" } */
> +
> +/* Checks that when exponential index transform is viable but switch conversion
> + decides that the switch cannot be converted, the exponential index transform
> + is not done. */
> +
> +int a;
> +
> +int foo(unsigned bar)
> +{
> + switch (bar)
> + {
> + case (1 << 0):
> + return 0;
> + case (1 << 1):
> + return 1;
> + case (1 << 2):
> + return 2;
> + case (1 << 3):
> + return 3;
> + case (1 << 4):
> + return 4;
> + case (1 << 5):
> + a = 3;
> + return 5;
> + case (1 << 6):
> + return 6;
> + default:
> + return 0;
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump "Exponential index transform viable" "switchconv" } } */
> +/* { dg-final { scan-tree-dump-not "Applying exponential index transform" "switchconv" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
> new file mode 100644
> index 00000000000..c8fae70692e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
> @@ -0,0 +1,148 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-switchconv -mbmi" } */
> +
> +/* Checks that the exponential index transformation is done for all these types
> + of the index variable:
> + - (unsigned) int
> + - (unsigned) long
> + - (unsigned) long long */
> +
> +int unopt_int(int bit_position)
> +{
> + switch (bit_position)
> + {
> + case (1 << 0):
> + return 0;
> + case (1 << 1):
> + return 1;
> + case (1 << 2):
> + return 2;
> + case (1 << 3):
> + return 3;
> + case (1 << 4):
> + return 4;
> + case (1 << 5):
> + return 5;
> + case (1 << 6):
> + return 6;
> + default:
> + return 0;
> + }
> +}
> +
> +int unopt_unsigned_int(unsigned int bit_position)
> +{
> + switch (bit_position)
> + {
> + case (1 << 0):
> + return 0;
> + case (1 << 1):
> + return 1;
> + case (1 << 2):
> + return 2;
> + case (1 << 3):
> + return 3;
> + case (1 << 4):
> + return 4;
> + case (1 << 5):
> + return 5;
> + case (1 << 6):
> + return 6;
> + default:
> + return 0;
> + }
> +}
> +
> +int unopt_long(long bit_position)
> +{
> + switch (bit_position)
> + {
> + case (1 << 0):
> + return 0;
> + case (1 << 1):
> + return 1;
> + case (1 << 2):
> + return 2;
> + case (1 << 3):
> + return 3;
> + case (1 << 4):
> + return 4;
> + case (1 << 5):
> + return 5;
> + case (1 << 6):
> + return 6;
> + default:
> + return 0;
> + }
> +}
> +
> +int unopt_unsigned_long(unsigned long bit_position)
> +{
> + switch (bit_position)
> + {
> + case (1 << 0):
> + return 0;
> + case (1 << 1):
> + return 1;
> + case (1 << 2):
> + return 2;
> + case (1 << 3):
> + return 3;
> + case (1 << 4):
> + return 4;
> + case (1 << 5):
> + return 5;
> + case (1 << 6):
> + return 6;
> + default:
> + return 0;
> + }
> +}
> +
> +int unopt_long_long(long long bit_position)
> +{
> + switch (bit_position)
> + {
> + case (1 << 0):
> + return 0;
> + case (1 << 1):
> + return 1;
> + case (1 << 2):
> + return 2;
> + case (1 << 3):
> + return 3;
> + case (1 << 4):
> + return 4;
> + case (1 << 5):
> + return 5;
> + case (1 << 6):
> + return 6;
> + default:
> + return 0;
> + }
> +}
> +
> +int unopt_unsigned_long_long(unsigned long long bit_position)
> +{
> + switch (bit_position)
> + {
> + case (1 << 0):
> + return 0;
> + case (1 << 1):
> + return 1;
> + case (1 << 2):
> + return 2;
> + case (1 << 3):
> + return 3;
> + case (1 << 4):
> + return 4;
> + case (1 << 5):
> + return 5;
> + case (1 << 6):
> + return 6;
> + default:
> + return 0;
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Applying exponential index transform" 6 "switchconv" } } */
> diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc
> index 64629122ec6..4b11c8d25f4 100644
> --- a/gcc/tree-switch-conversion.cc
> +++ b/gcc/tree-switch-conversion.cc
> @@ -52,6 +52,8 @@ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
> #include "omp-general.h"
> #include "gimple-range.h"
> #include "tree-cfgcleanup.h"
> +#include "hwint.h"
> +#include "internal-fn.h"
>
> /* ??? For lang_hooks.types.type_for_mode, but is there a word_mode
> type in the GIMPLE type system that is language-independent? */
> @@ -61,12 +63,73 @@ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
> \f
> using namespace tree_switch_conversion;
>
> +/* Does the target have optabs needed to efficiently compute exact base two
> + logarithm of a value with type TYPE?
> +
> + See gen_log2. */
> +
> +static bool
> +can_log2 (tree type, optimization_type opt_type)
> +{
> + /* Check if target supports FFS. */
> + return direct_internal_fn_supported_p (IFN_FFS, type, opt_type);
> +}
> +
> +/* Assume that OP is a power of two. Build a sequence of gimple statements
> + efficiently computing the base two logarithm of OP using special optabs.
> + Return the ssa name represeting the result of the logarithm through RESULT.
> +
> + Should only be used if target supports the needed optabs. See can_log2. */
> +
> +static gimple_seq
> +gen_log2 (tree op, location_t loc, tree *result)
> +{
> + tree type = TREE_TYPE (op);
> + gimple_seq stmts = NULL;
> + gimple_stmt_iterator gsi = gsi_last (stmts);
> + tree tmp1 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, IFN_FFS, type, op);
> + tree tmp2 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, MINUS_EXPR, type,
> + tmp1, build_one_cst (type));
> + *result = tmp2;
> + return stmts;
> +}
> +
> +/* Build a sequence of gimple statements checking that OP is a power of 2. Use
> + special optabs if target supports them. Return the result as a
> + boolen_type_node ssa name through RESULT. */
> +
> +static gimple_seq
> +gen_pow2p (tree op, location_t loc, optimization_type opt_type, tree *result)
> +{
> + tree type = TREE_TYPE (op);
> + gimple_seq stmts = NULL;
> + gimple_stmt_iterator gsi = gsi_last (stmts);
> + if (direct_internal_fn_supported_p (IFN_POPCOUNT, type, opt_type))
> + {
> + tree tmp = gimple_build (&gsi, false, GSI_NEW_STMT, loc, IFN_POPCOUNT,
> + type, op);
> + *result = gimple_build (&gsi, false, GSI_NEW_STMT, loc, EQ_EXPR,
> + boolean_type_node, tmp, build_one_cst (type));
> + }
> + else
> + {
> + tree tmp1 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, NEGATE_EXPR,
> + type, op);
> + tree tmp2 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, BIT_AND_EXPR,
> + type, op, tmp1);
> + *result = gimple_build (&gsi, false, GSI_NEW_STMT, loc, EQ_EXPR,
> + boolean_type_node, tmp2, op);
> + }
> + return stmts;
> +}
> +
> /* Constructor. */
>
> switch_conversion::switch_conversion (): m_final_bb (NULL),
> m_constructors (NULL), m_default_values (NULL),
> m_arr_ref_first (NULL), m_arr_ref_last (NULL),
> - m_reason (NULL), m_default_case_nonstandard (false), m_cfg_altered (false)
> + m_reason (NULL), m_default_case_nonstandard (false), m_cfg_altered (false),
> + m_exp_index_transform_applied (false)
> {
> }
>
> @@ -202,6 +265,244 @@ switch_conversion::collect (gswitch *swtch)
> }
> }
>
> +/* Check that the "exponential index transform" can be applied to this switch.
> +
> + See comment of the exp_index_transform function for details about this
> + transformation.
> +
> + We want:
> + - This form of the switch is more efficient
> + - Cases are powers of 2
> +
> + Expects that SWTCH has at least one case. */
> +
> +bool
> +switch_conversion::is_exp_index_transform_viable (gswitch *swtch)
> +{
> + tree index = gimple_switch_index (swtch);
> + tree index_type = TREE_TYPE (index);
> + basic_block swtch_bb = gimple_bb (swtch);
> + unsigned num_labels = gimple_switch_num_labels (swtch);
> +
> + optimization_type opt_type = bb_optimization_type (swtch_bb);
> + if (!can_log2 (index_type, opt_type))
> + return false;
> +
> + /* Check that each case label corresponds only to one value
> + (no case 1..3). */
> + unsigned i;
> + for (i = 1; i < num_labels; i++)
> + {
> + tree label = gimple_switch_label (swtch, i);
> + if (CASE_HIGH (label))
> + return false;
> + }
> +
> + /* Check that each label is nonnegative and a power of 2. */
> + for (i = 1; i < num_labels; i++)
> + {
> + tree label = gimple_switch_label (swtch, i);
> + wide_int label_wi = wi::to_wide (CASE_LOW (label));
> + if (!wi::ge_p (label_wi, 0, TYPE_SIGN (index_type)))
> + return false;
> + if (wi::exact_log2 (label_wi) == -1)
> + return false;
> + }
> +
> + if (dump_file)
> + fprintf (dump_file, "Exponential index transform viable\n");
> +
> + return true;
> +}
> +
> +/* Perform the "exponential index transform".
> +
> + Assume that cases of SWTCH are powers of 2. The transformation replaces the
> + cases by their exponents (2^k -> k). It also inserts a statement that
> + computes the exponent of the original index variable (basically taking the
> + logarithm) and then sets the result as the new index variable.
> +
> + The transformation also inserts a conditional statement checking that the
> + incoming original index variable is a power of 2 with the false edge leading
> + to the default case.
> +
> + The exponential index transform shrinks the range of case numbers which
> + helps switch conversion convert switches it otherwise could not.
> +
> + Consider for example:
> +
> + switch (i)
> + {
> + case (1 << 0): return 0;
> + case (1 << 1): return 1;
> + case (1 << 2): return 2;
> + ...
> + case (1 << 30): return 30;
> + default: return 31;
> + }
> +
> + First, exponential index transform gets applied. Since each case becomes
> + case x: return x;, the rest of switch conversion is then able to get rid of
> + the switch statement.
> +
> + if (i is power of 2)
> + return log2 (i);
> + else
> + return 31;
> +
> + */
> +
> +void
> +switch_conversion::exp_index_transform (gswitch *swtch)
> +{
> + if (dump_file)
> + fprintf (dump_file, "Applying exponential index transform\n");
> +
> + tree index = gimple_switch_index (swtch);
> + tree index_type = TREE_TYPE (index);
> + basic_block swtch_bb = gimple_bb (swtch);
> + unsigned num_labels = gimple_switch_num_labels (swtch);
> +
> + /* Insert a cond stmt that checks if the index variable is a power of 2. */
> + gimple_stmt_iterator gsi = gsi_for_stmt (swtch);
> + gsi_prev (&gsi);
> + gimple *foo = gsi_stmt (gsi);
> + edge new_edge1 = split_block (swtch_bb, foo);
> +
> + swtch_bb = new_edge1->dest;
> + basic_block cond_bb = new_edge1->src;
> + new_edge1->flags |= EDGE_TRUE_VALUE;
> + new_edge1->flags &= ~EDGE_FALLTHRU;
> + new_edge1->probability = profile_probability::even ();
> +
> + basic_block default_bb = gimple_switch_default_bb (cfun, swtch);
> + edge new_edge2 = make_edge (cond_bb, default_bb, EDGE_FALSE_VALUE);
> + new_edge2->probability = profile_probability::even ();
> +
> + tree tmp;
> + optimization_type opt_type = bb_optimization_type (cond_bb);
> + gimple_seq stmts = gen_pow2p (index, UNKNOWN_LOCATION, opt_type, &tmp);
> + gsi = gsi_last_bb (cond_bb);
> + gsi_insert_seq_after (&gsi, stmts, GSI_LAST_NEW_STMT);
> + gcond *stmt_cond = gimple_build_cond (NE_EXPR, tmp, boolean_false_node,
> + NULL, NULL);
> + gsi_insert_after (&gsi, stmt_cond, GSI_NEW_STMT);
> +
> + /* We just added an edge going to default bb so fix PHI nodes in that bb:
> + For each PHI add new PHI arg. It will be the same arg as when comming to
> + the default bb from the switch bb. */
> + edge default_edge = find_edge (swtch_bb, default_bb);
> + for (gphi_iterator gsi = gsi_start_phis (default_bb);
> + !gsi_end_p (gsi); gsi_next (&gsi))
> + {
> + gphi *phi = gsi.phi ();
> + tree arg = PHI_ARG_DEF_FROM_EDGE (phi, default_edge);
> + location_t loc = gimple_phi_arg_location_from_edge (phi, default_edge);
> + add_phi_arg (phi, arg, new_edge2, loc);
> + }
> +
> + /* Insert a sequence of stmts that takes the log of the index variable. */
> + stmts = gen_log2 (index, UNKNOWN_LOCATION, &tmp);
> + gsi = gsi_after_labels (swtch_bb);
> + gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
> +
> + /* Use the result of the logarithm as the new index variable. */
> + gimple_switch_set_index (swtch, tmp);
> + update_stmt (swtch);
> +
> + /* Replace each case number with its logarithm. */
> + unsigned i;
> + for (i = 1; i < num_labels; i++)
> + {
> + tree label = gimple_switch_label (swtch, i);
> + CASE_LOW (label) = build_int_cst (index_type,
> + tree_log2 (CASE_LOW (label)));
> + }
> +
> + /* Fix the dominator tree, if it is available. */
> + if (dom_info_available_p (CDI_DOMINATORS))
> + {
> + /* Analysis of how dominators should look after we add the edge E going
> + from the cond block to the default block.
> +
> + 1 For the blocks between the switch block and the final block
> + (excluding the final block itself): They had the switch block as
> + their immediate dominator. That shouldn't change.
> +
> + 2 The final block may now have the switch block or the cond block as
> + its immediate dominator. There's no easy way of knowing (consider
> + two cases where in both m_default_case_nonstandard = true, in one a
> + path through default intersects the final block and in one all paths
> + through default avoid the final block but intersect a successor of the
> + final block).
> +
> + 3 Other blocks that had the switch block as their immediate dominator
> + should now have the cond block as their immediate dominator.
> +
> + 4 Immediate dominators of the rest of the blocks shouldn't change.
> +
> + Reasoning for 3 and 4:
> +
> + We'll only consider blocks that do not fall into 1 or 2.
> +
> + Consider a block X whose original imm dom was the switch block. All
> + paths to X must also intersect the cond block since it's the only
> + pred of the switch block. The final block doesn't dominate X so at
> + least one path P must lead through the default block. Let P' be P but
> + instead of going through the switch block, take E. The switch block
> + doesn't dominate X so its imm dom must now be the cond block.
> +
> + Consider a block X whose original imm dom was Y != the switch block.
> + We only added an edge so all original paths to X are still present.
> + So X gained no new dominators. Observe that Y still dominates X.
> + There would have to be a path that avoids Y otherwise. But any block
> + we can avoid now except for the switch block we were able to avoid
> + before adding E. */
> +
> + redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
> +
> + edge e;
> + edge_iterator ei;
> + FOR_EACH_EDGE (e, ei, swtch_bb->succs)
> + {
> + basic_block bb = e->dest;
> + if (bb == m_final_bb || bb == default_bb)
> + continue;
> + set_immediate_dominator (CDI_DOMINATORS, bb, swtch_bb);
> + }
> +
> + vec<basic_block> v;
> + v.create (1);
> + v.quick_push (m_final_bb);
> + iterate_fix_dominators (CDI_DOMINATORS, v, true);
> + }
> +
> + /* Update information about the switch statement. */
> + tree first_label = gimple_switch_label (swtch, 1);
> + tree last_label = gimple_switch_label (swtch, num_labels - 1);
> +
> + m_range_min = CASE_LOW (first_label);
> + m_range_max = CASE_LOW (last_label);
> + m_index_expr = gimple_switch_index (swtch);
> + m_switch_bb = swtch_bb;
> +
> + m_range_size = int_const_binop (MINUS_EXPR, m_range_max, m_range_min);
> +
> + m_cfg_altered = true;
> +
> + m_contiguous_range = true;
> + wide_int last_wi = wi::to_wide (CASE_LOW (first_label));
> + for (i = 2; i < num_labels; i++)
> + {
> + tree label = gimple_switch_label (swtch, i);
> + wide_int label_wi = wi::to_wide (CASE_LOW (label));
> + m_contiguous_range &= wi::eq_p (wi::add (last_wi, 1), label_wi);
> + last_wi = label_wi;
> + }
> +
> + m_exp_index_transform_applied = true;
> +}
> +
> /* Checks whether the range given by individual case statements of the switch
> switch statement isn't too big and whether the number of branches actually
> satisfies the size of the new array. */
> @@ -973,8 +1274,9 @@ switch_conversion::gen_inbound_check ()
> bbf->count = e1f->count () + e2f->count ();
>
> /* Tidy blocks that have become unreachable. */
> - prune_bbs (bbd, m_final_bb,
> - m_default_case_nonstandard ? m_default_bb : NULL);
> + bool prune_default_bb = !m_default_case_nonstandard
> + && !m_exp_index_transform_applied;
> + prune_bbs (bbd, m_final_bb, prune_default_bb ? NULL : m_default_bb);
>
> /* Fixup the PHI nodes in bbF. */
> fix_phi_nodes (e1f, e2f, bbf);
> @@ -1053,8 +1355,19 @@ switch_conversion::expand (gswitch *swtch)
> return;
> }
>
> - /* Check the case label values are within reasonable range: */
> - if (!check_range ())
> + /* Sometimes it is possible to use the "exponential index transform" to help
> + switch conversion convert switches which it otherwise could not convert.
> + However, we want to do this transform only when we know that switch
> + conversion will then really be able to convert the switch. So we first
> + check if the transformation is applicable and then maybe later do the
> + transformation. */
> + bool exp_transform_viable = is_exp_index_transform_viable (swtch);
> +
> + /* Check the case label values are within reasonable range.
> +
> + If we will be doing exponential index transform, the range will be always
> + reasonable. */
> + if (!exp_transform_viable && !check_range ())
> {
> gcc_assert (m_reason);
> return;
> @@ -1076,6 +1389,9 @@ switch_conversion::expand (gswitch *swtch)
> /* At this point all checks have passed and we can proceed with the
> transformation. */
>
> + if (exp_transform_viable)
> + exp_index_transform (swtch);
> +
> create_temp_arrays ();
> gather_default_values (m_default_case_nonstandard
> ? gimple_switch_label (swtch, 1)
> diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
> index 6939eec6018..1a865f85f3a 100644
> --- a/gcc/tree-switch-conversion.h
> +++ b/gcc/tree-switch-conversion.h
> @@ -743,6 +743,19 @@ public:
> /* Collection information about SWTCH statement. */
> void collect (gswitch *swtch);
>
> + /* Check that the 'exponential index transform' can be applied.
> +
> + See the comment at the function definition for more details. */
> + bool is_exp_index_transform_viable (gswitch *swtch);
> +
> + /* Perform the 'exponential index transform'.
> +
> + The exponential index transform shrinks the range of case numbers which
> + helps switch conversion convert switches it otherwise could not.
> +
> + See the comment at the function definition for more details. */
> + void exp_index_transform (gswitch *swtch);
> +
> /* Checks whether the range given by individual case statements of the switch
> switch statement isn't too big and whether the number of branches actually
> satisfies the size of the new array. */
> @@ -900,6 +913,11 @@ public:
>
> /* True if CFG has been changed. */
> bool m_cfg_altered;
> +
> + /* True if exponential index transform has been applied. See the comment at
> + the definition of exp_index_transform for details about the
> + transformation. */
> + bool m_exp_index_transform_applied;
> };
>
> void
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-18 10:07 ` Richard Biener
@ 2024-07-18 14:17 ` Filip Kastl
2024-07-26 12:34 ` Richard Biener
0 siblings, 1 reply; 10+ messages in thread
From: Filip Kastl @ 2024-07-18 14:17 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, mjambor
On Thu 2024-07-18 12:07:42, Richard Biener wrote:
> On Wed, 17 Jul 2024, Filip Kastl wrote:
> > > > + }
> > > > +
> > > > + vec<basic_block> v;
> > > > + v.create (1);
> > > > + v.quick_push (m_final_bb);
> > > > + iterate_fix_dominators (CDI_DOMINATORS, v, true);
> > >
> > > The final BB should have the condition block as immediate dominator
> > > if it's old immediate dominator was the switch block, otherwise
> > > it's immediate dominator shouldn't change.
> > I think that this is not true. Consider this CFG where no path through default
> > BB intersects final BB:
> >
> > switch BB ---+
> > / | \ \
> > case BBs default BB
> > \ | / /
> > final BB /
> > | /
> >
> > Here idom(final BB) == switch BB.
> > After the index exponential transform the CFG looks like this
> >
> > cond BB ---------+
> > | |
> > switch BB ---+ |
> > / | \ \ |
> > case BBs default BB
> > \ | / /
> > final BB /
> > | /
> >
> > It still holds that idom(final BB) == switch BB.
>
> Sure but you still know the dominator of the default BB as the cond BB
> then? That said, I think that using iterate_fix_dominators is overkill
> and you should know the new immediate dominator and can set it directly?
Sorry, I'm not sure if I understand. Are you suggesting something like this?
if (idom(default bb) == cond bb)
{
if (exists a path from default bb to final bb)
{
idom(final bb) = cond bb;
}
else
{
idom(final bb) = switch bb;
}
}
else
{
// idom(final bb) doesn't change
}
If so, how do I implement testing existence of a path from default bb to final
bb? I guess I could do BFS but that seems like a pretty complicated solution.
>
> That said, even above if there's a merge of the default BB and final BB
> downstream in the CFG, inserting cond BB requires adjustment of the
> immediate dominator of that merge block and you are missing that?
I think this isn't a problem because I do
redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
If the old immediate dominator of the merge bb was the switch bb then I set its
immediate dominator to cond bb. Otherwise its immediate dominator stays the
same.
I try to prove that this is correct here:
> > + 3 Other blocks that had the switch block as their immediate dominator
> > + should now have the cond block as their immediate dominator.
> > +
> > + 4 Immediate dominators of the rest of the blocks shouldn't change.
> > +
> > + Reasoning for 3 and 4:
> > +
> > + We'll only consider blocks that do not fall into 1 or 2.
> > +
> > + Consider a block X whose original imm dom was the switch block. All
> > + paths to X must also intersect the cond block since it's the only
> > + pred of the switch block. The final block doesn't dominate X so at
> > + least one path P must lead through the default block. Let P' be P but
> > + instead of going through the switch block, take E. The switch block
> > + doesn't dominate X so its imm dom must now be the cond block.
> > +
> > + Consider a block X whose original imm dom was Y != the switch block.
> > + We only added an edge so all original paths to X are still present.
> > + So X gained no new dominators. Observe that Y still dominates X.
> > + There would have to be a path that avoids Y otherwise. But any block
> > + we can avoid now except for the switch block we were able to avoid
> > + before adding E. */
Cheers,
Filip Kastl
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-18 14:17 ` Filip Kastl
@ 2024-07-26 12:34 ` Richard Biener
2024-07-29 15:28 ` Filip Kastl
0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2024-07-26 12:34 UTC (permalink / raw)
To: Filip Kastl; +Cc: gcc-patches, mjambor
On Thu, 18 Jul 2024, Filip Kastl wrote:
> On Thu 2024-07-18 12:07:42, Richard Biener wrote:
> > On Wed, 17 Jul 2024, Filip Kastl wrote:
> > > > > + }
> > > > > +
> > > > > + vec<basic_block> v;
> > > > > + v.create (1);
> > > > > + v.quick_push (m_final_bb);
> > > > > + iterate_fix_dominators (CDI_DOMINATORS, v, true);
> > > >
> > > > The final BB should have the condition block as immediate dominator
> > > > if it's old immediate dominator was the switch block, otherwise
> > > > it's immediate dominator shouldn't change.
> > > I think that this is not true. Consider this CFG where no path through default
> > > BB intersects final BB:
> > >
> > > switch BB ---+
> > > / | \ \
> > > case BBs default BB
> > > \ | / /
> > > final BB /
> > > | /
> > >
> > > Here idom(final BB) == switch BB.
> > > After the index exponential transform the CFG looks like this
> > >
> > > cond BB ---------+
> > > | |
> > > switch BB ---+ |
> > > / | \ \ |
> > > case BBs default BB
> > > \ | / /
> > > final BB /
> > > | /
> > >
> > > It still holds that idom(final BB) == switch BB.
> >
> > Sure but you still know the dominator of the default BB as the cond BB
> > then? That said, I think that using iterate_fix_dominators is overkill
> > and you should know the new immediate dominator and can set it directly?
>
> Sorry, I'm not sure if I understand. Are you suggesting something like this?
>
> if (idom(default bb) == cond bb)
> {
> if (exists a path from default bb to final bb)
> {
> idom(final bb) = cond bb;
> }
> else
> {
> idom(final bb) = switch bb;
> }
> }
> else
> {
> // idom(final bb) doesn't change
> }
>
> If so, how do I implement testing existence of a path from default bb to final
> bb? I guess I could do BFS but that seems like a pretty complicated solution.
> >
> > That said, even above if there's a merge of the default BB and final BB
> > downstream in the CFG, inserting cond BB requires adjustment of the
> > immediate dominator of that merge block and you are missing that?
>
> I think this isn't a problem because I do
>
> redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
Hmm, I'm probably just confused. So the problem is that
redirect_immediate_dominators makes the dominator of final_bb incorrect
(but also all case_bb immediate dominator?!)?
Ah, I see you fix those up. Then 2.) is left - the final block. Iff
the final block needs adjustment you know there was a path from
the default case to it which means one of its predecessors is dominated
by the default case? In that case, adjust the dominator to cond_bb,
otherwise leave it at switch_bb?
Richard.
> If the old immediate dominator of the merge bb was the switch bb then I set its
> immediate dominator to cond bb. Otherwise its immediate dominator stays the
> same.
>
> I try to prove that this is correct here:
>
> > > + 3 Other blocks that had the switch block as their immediate dominator
> > > + should now have the cond block as their immediate dominator.
> > > +
> > > + 4 Immediate dominators of the rest of the blocks shouldn't change.
> > > +
> > > + Reasoning for 3 and 4:
> > > +
> > > + We'll only consider blocks that do not fall into 1 or 2.
> > > +
> > > + Consider a block X whose original imm dom was the switch block. All
> > > + paths to X must also intersect the cond block since it's the only
> > > + pred of the switch block. The final block doesn't dominate X so at
> > > + least one path P must lead through the default block. Let P' be P but
> > > + instead of going through the switch block, take E. The switch block
> > > + doesn't dominate X so its imm dom must now be the cond block.
> > > +
> > > + Consider a block X whose original imm dom was Y != the switch block.
> > > + We only added an edge so all original paths to X are still present.
> > > + So X gained no new dominators. Observe that Y still dominates X.
> > > + There would have to be a path that avoids Y otherwise. But any block
> > > + we can avoid now except for the switch block we were able to avoid
> > > + before adding E. */
>
> Cheers,
> Filip Kastl
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstra
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-26 12:34 ` Richard Biener
@ 2024-07-29 15:28 ` Filip Kastl
2024-07-30 7:20 ` Richard Biener
0 siblings, 1 reply; 10+ messages in thread
From: Filip Kastl @ 2024-07-29 15:28 UTC (permalink / raw)
To: Richard Biener; +Cc: Filip Kastl, gcc-patches, mjambor
Hi Richard,
> > Sorry, I'm not sure if I understand. Are you suggesting something like this?
> >
> > if (idom(default bb) == cond bb)
> > {
> > if (exists a path from default bb to final bb)
> > {
> > idom(final bb) = cond bb;
> > }
> > else
> > {
> > idom(final bb) = switch bb;
> > }
> > }
> > else
> > {
> > // idom(final bb) doesn't change
> > }
Sidenote: I've just noticed that this code wouldn't work since the original
idom of final_bb may be some block outside of the switch. In that case, idom
of final_bb should remain the same after the transformation regardless of if
idom(default bb) == cond bb. So the code would look like this
if (original idom(final bb) == switch bb and idom(default bb) == cond bb)
{
if (exists a path from default bb to final bb)
{
idom(final bb) = cond bb;
}
else
{
idom(final bb) = switch bb;
}
}
else
{
// idom(final bb) doesn't change
}
> >
> > If so, how do I implement testing existence of a path from default bb to final
> > bb? I guess I could do BFS but that seems like a pretty complicated solution.
> > >
> > > That said, even above if there's a merge of the default BB and final BB
> > > downstream in the CFG, inserting cond BB requires adjustment of the
> > > immediate dominator of that merge block and you are missing that?
> >
> > I think this isn't a problem because I do
> >
> > redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
>
> Hmm, I'm probably just confused. So the problem is that
> redirect_immediate_dominators makes the dominator of final_bb incorrect
> (but also all case_bb immediate dominator?!)?
Yes, the problem is what the idom of final_bb should be after the
transformation. However, redirect_immediate_dominators doesn't *make* the idom
of final_bb incorrect. It may have been already incorrect before the call (the
call may also possibly make the idom correct btw).
This has probably already been clear to you. I'm just making sure we're on the
same page.
>
> Ah, I see you fix those up. Then 2.) is left - the final block. Iff
> the final block needs adjustment you know there was a path from
> the default case to it which means one of its predecessors is dominated
> by the default case? In that case, adjust the dominator to cond_bb,
> otherwise leave it at switch_bb?
Yes, what I'm saying is that if I want to know idom of final_bb after the
transformation, I have to know if there is a path between default_bb and
final_bb. It is because of these two cases:
1.
cond BB ---------+
| |
switch BB ---+ |
/ | \ \ |
case BBs default BB
\ | / /
final BB <---+ <- this may be an edge or a path
|
2.
cond BB ---------+
| |
switch BB ---+ |
/ | \ \ |
case BBs default BB
\ | / /
final BB / <- this may be an edge or a path
| /
In the first case, there is a path between default_bb and final_bb and in the
second there isn't. Otherwise the cases are the same. In the first case idom
of final_bb should be cond_bb. In the second case idom of final_bb should be
switch_bb. Algorithm deciding what should be idom of final_bb therefore has to
know if there is a path between default_bb and final_bb.
You said that if there is a path between default_bb and final_bb, one of the
predecessors of final_bb is dominated by default_bb. That would indeed give a
nice way to check existence of a path between default_bb and final_bb. But
does it hold? Consider this situation:
| |
cond BB --------------+
| | |
switch BB --------+ |
/ | \ | \ |
case BBs | default BB
\ | / | /
final BB <- pred BB -+
|
Here no predecessors of final_bb are dominated by default_bb but at the same
time there does exist a path from default_bb to final_bb. Or is this CFG
impossible for some reason?
Btw to further check that we're on the same page: Right now we're only trying
to figure out if there is a way to update idom of final_bb after the
transformation without using iterate_fix_dominators, right? The rest of my
dominator fixing code makes sense / is ok?
Cheers,
Filip Kastl
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-29 15:28 ` Filip Kastl
@ 2024-07-30 7:20 ` Richard Biener
2024-07-30 12:04 ` Filip Kastl
0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2024-07-30 7:20 UTC (permalink / raw)
To: Filip Kastl; +Cc: gcc-patches, mjambor
On Mon, 29 Jul 2024, Filip Kastl wrote:
> Hi Richard,
>
> > > Sorry, I'm not sure if I understand. Are you suggesting something like this?
> > >
> > > if (idom(default bb) == cond bb)
> > > {
> > > if (exists a path from default bb to final bb)
> > > {
> > > idom(final bb) = cond bb;
> > > }
> > > else
> > > {
> > > idom(final bb) = switch bb;
> > > }
> > > }
> > > else
> > > {
> > > // idom(final bb) doesn't change
> > > }
>
> Sidenote: I've just noticed that this code wouldn't work since the original
> idom of final_bb may be some block outside of the switch. In that case, idom
> of final_bb should remain the same after the transformation regardless of if
> idom(default bb) == cond bb. So the code would look like this
>
> if (original idom(final bb) == switch bb and idom(default bb) == cond bb)
> {
> if (exists a path from default bb to final bb)
> {
> idom(final bb) = cond bb;
> }
> else
> {
> idom(final bb) = switch bb;
> }
> }
> else
> {
> // idom(final bb) doesn't change
> }
>
> > >
> > > If so, how do I implement testing existence of a path from default bb to final
> > > bb? I guess I could do BFS but that seems like a pretty complicated solution.
> > > >
> > > > That said, even above if there's a merge of the default BB and final BB
> > > > downstream in the CFG, inserting cond BB requires adjustment of the
> > > > immediate dominator of that merge block and you are missing that?
> > >
> > > I think this isn't a problem because I do
> > >
> > > redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
> >
> > Hmm, I'm probably just confused. So the problem is that
> > redirect_immediate_dominators makes the dominator of final_bb incorrect
> > (but also all case_bb immediate dominator?!)?
>
> Yes, the problem is what the idom of final_bb should be after the
> transformation. However, redirect_immediate_dominators doesn't *make* the idom
> of final_bb incorrect. It may have been already incorrect before the call (the
> call may also possibly make the idom correct btw).
>
> This has probably already been clear to you. I'm just making sure we're on the
> same page.
>
> >
> > Ah, I see you fix those up. Then 2.) is left - the final block. Iff
> > the final block needs adjustment you know there was a path from
> > the default case to it which means one of its predecessors is dominated
> > by the default case? In that case, adjust the dominator to cond_bb,
> > otherwise leave it at switch_bb?
>
> Yes, what I'm saying is that if I want to know idom of final_bb after the
> transformation, I have to know if there is a path between default_bb and
> final_bb. It is because of these two cases:
>
> 1.
>
> cond BB ---------+
> | |
> switch BB ---+ |
> / | \ \ |
> case BBs default BB
> \ | / /
> final BB <---+ <- this may be an edge or a path
> |
>
> 2.
>
> cond BB ---------+
> | |
> switch BB ---+ |
> / | \ \ |
> case BBs default BB
> \ | / /
> final BB / <- this may be an edge or a path
> | /
>
> In the first case, there is a path between default_bb and final_bb and in the
> second there isn't. Otherwise the cases are the same. In the first case idom
> of final_bb should be cond_bb. In the second case idom of final_bb should be
> switch_bb. Algorithm deciding what should be idom of final_bb therefore has to
> know if there is a path between default_bb and final_bb.
>
> You said that if there is a path between default_bb and final_bb, one of the
> predecessors of final_bb is dominated by default_bb. That would indeed give a
> nice way to check existence of a path between default_bb and final_bb. But
> does it hold? Consider this situation:
>
> | |
> cond BB --------------+
> | | |
> switch BB --------+ |
> / | \ | \ |
> case BBs | default BB
> \ | / | /
> final BB <- pred BB -+
> |
>
> Here no predecessors of final_bb are dominated by default_bb but at the same
> time there does exist a path from default_bb to final_bb. Or is this CFG
> impossible for some reason?
I think in this case the dominator simply need not change - the only case
you need to adjust it is when the immediate dominator of final BB was
switch BB before the transform, and then we know we have to update it
too cond BB, no?
> Btw to further check that we're on the same page: Right now we're only trying
> to figure out if there is a way to update idom of final_bb after the
> transformation without using iterate_fix_dominators, right? The rest of my
> dominator fixing code makes sense / is ok?
Yes. These "fix me up" utilities are used too often lazily - I have
the gut feeling that the situation here is very simple.
Richard.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-30 7:20 ` Richard Biener
@ 2024-07-30 12:04 ` Filip Kastl
2024-07-30 12:23 ` Filip Kastl
2024-07-30 12:34 ` Richard Biener
0 siblings, 2 replies; 10+ messages in thread
From: Filip Kastl @ 2024-07-30 12:04 UTC (permalink / raw)
To: Richard Biener; +Cc: Filip Kastl, gcc-patches, mjambor
> > > Ah, I see you fix those up. Then 2.) is left - the final block. Iff
> > > the final block needs adjustment you know there was a path from
> > > the default case to it which means one of its predecessors is dominated
> > > by the default case? In that case, adjust the dominator to cond_bb,
> > > otherwise leave it at switch_bb?
> >
> > Yes, what I'm saying is that if I want to know idom of final_bb after the
> > transformation, I have to know if there is a path between default_bb and
> > final_bb. It is because of these two cases:
> >
> > 1.
> >
> > cond BB ---------+
> > | |
> > switch BB ---+ |
> > / | \ \ |
> > case BBs default BB
> > \ | / /
> > final BB <---+ <- this may be an edge or a path
> > |
> >
> > 2.
> >
> > cond BB ---------+
> > | |
> > switch BB ---+ |
> > / | \ \ |
> > case BBs default BB
> > \ | / /
> > final BB / <- this may be an edge or a path
> > | /
> >
> > In the first case, there is a path between default_bb and final_bb and in the
> > second there isn't. Otherwise the cases are the same. In the first case idom
> > of final_bb should be cond_bb. In the second case idom of final_bb should be
> > switch_bb. Algorithm deciding what should be idom of final_bb therefore has to
> > know if there is a path between default_bb and final_bb.
> >
> > You said that if there is a path between default_bb and final_bb, one of the
> > predecessors of final_bb is dominated by default_bb. That would indeed give a
> > nice way to check existence of a path between default_bb and final_bb. But
> > does it hold? Consider this situation:
> >
> > | |
> > cond BB --------------+
> > | | |
> > switch BB --------+ |
> > / | \ | \ |
> > case BBs | default BB
> > \ | / | /
> > final BB <- pred BB -+
> > |
> >
> > Here no predecessors of final_bb are dominated by default_bb but at the same
> > time there does exist a path from default_bb to final_bb. Or is this CFG
> > impossible for some reason?
>
> I think in this case the dominator simply need not change - the only case
> you need to adjust it is when the immediate dominator of final BB was
> switch BB before the transform, and then we know we have to update it
> too cond BB, no?
Ah, my bad. Yes, my counterexample actually isn't a problem. I was glad when
I realized that and started thinking that this...
if (original idom(final bb) == switch bb)
{
if (exists a pred of final bb dominated by default bb)
{
idom(final bb) = cond bb;
}
else
{
idom(final bb) = switch bb;
}
}
else
{
// idom(final bb) doesn't change
}
...might be the final solution. But after thinking about it for a while I
(saddly) came up with another counterexample.
|
cond BB --------------+
| |
switch BB --------+ |
/ | \ \ |
case BBs default BB
\ | / /
final BB <- pred BB -+
| ^
| |
+-----------+ <- this may be a path or an edge I guess
Here there *is* a path between default_bb and final_bb but since no predecessor
of final_bb is dominated by default_bb we incorrectly decide that there is no
such path. Therefore we incorrectly assign idom(final_bb) = switch_bb instead
of idom(final_bb) = cond_bb.
So unless I'm missing something, "final has a pred dominated by default" isn't
equivalent with "there is a path between default and final" even when we assume
that the original idom of final_bb was switch_bb. Therefore I think we're back
to searching for a nice way to test "there is a path between default and
final".
Maybe you can spot a flaw in my logic or maybe you see a solution I don't.
Meanwhile I'll look into source code of the rest of the switch conversion pass.
Switch conversion pass inserts conditions similar to what I'm doing so someone
before me may have already solved how to properly fix dominators in this
situation.
Cheers,
Filip Kastl
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-30 12:04 ` Filip Kastl
@ 2024-07-30 12:23 ` Filip Kastl
2024-07-30 12:34 ` Richard Biener
1 sibling, 0 replies; 10+ messages in thread
From: Filip Kastl @ 2024-07-30 12:23 UTC (permalink / raw)
To: Filip Kastl; +Cc: Richard Biener, gcc-patches, mjambor
> Meanwhile I'll look into source code of the rest of the switch conversion pass.
> Switch conversion pass inserts conditions similar to what I'm doing so someone
> before me may have already solved how to properly fix dominators in this
> situation.
Oh nevermind. Switch conversion (gen_inbound_check ()) actually uses
iterate_fix_dominators.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-30 12:04 ` Filip Kastl
2024-07-30 12:23 ` Filip Kastl
@ 2024-07-30 12:34 ` Richard Biener
2024-07-30 16:48 ` Filip Kastl
1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2024-07-30 12:34 UTC (permalink / raw)
To: Filip Kastl; +Cc: gcc-patches, mjambor
On Tue, 30 Jul 2024, Filip Kastl wrote:
> > > > Ah, I see you fix those up. Then 2.) is left - the final block. Iff
> > > > the final block needs adjustment you know there was a path from
> > > > the default case to it which means one of its predecessors is dominated
> > > > by the default case? In that case, adjust the dominator to cond_bb,
> > > > otherwise leave it at switch_bb?
> > >
> > > Yes, what I'm saying is that if I want to know idom of final_bb after the
> > > transformation, I have to know if there is a path between default_bb and
> > > final_bb. It is because of these two cases:
> > >
> > > 1.
> > >
> > > cond BB ---------+
> > > | |
> > > switch BB ---+ |
> > > / | \ \ |
> > > case BBs default BB
> > > \ | / /
> > > final BB <---+ <- this may be an edge or a path
> > > |
> > >
> > > 2.
> > >
> > > cond BB ---------+
> > > | |
> > > switch BB ---+ |
> > > / | \ \ |
> > > case BBs default BB
> > > \ | / /
> > > final BB / <- this may be an edge or a path
> > > | /
> > >
> > > In the first case, there is a path between default_bb and final_bb and in the
> > > second there isn't. Otherwise the cases are the same. In the first case idom
> > > of final_bb should be cond_bb. In the second case idom of final_bb should be
> > > switch_bb. Algorithm deciding what should be idom of final_bb therefore has to
> > > know if there is a path between default_bb and final_bb.
> > >
> > > You said that if there is a path between default_bb and final_bb, one of the
> > > predecessors of final_bb is dominated by default_bb. That would indeed give a
> > > nice way to check existence of a path between default_bb and final_bb. But
> > > does it hold? Consider this situation:
> > >
> > > | |
> > > cond BB --------------+
> > > | | |
> > > switch BB --------+ |
> > > / | \ | \ |
> > > case BBs | default BB
> > > \ | / | /
> > > final BB <- pred BB -+
> > > |
> > >
> > > Here no predecessors of final_bb are dominated by default_bb but at the same
> > > time there does exist a path from default_bb to final_bb. Or is this CFG
> > > impossible for some reason?
> >
> > I think in this case the dominator simply need not change - the only case
> > you need to adjust it is when the immediate dominator of final BB was
> > switch BB before the transform, and then we know we have to update it
> > too cond BB, no?
>
> Ah, my bad. Yes, my counterexample actually isn't a problem. I was glad when
> I realized that and started thinking that this...
>
> if (original idom(final bb) == switch bb)
> {
> if (exists a pred of final bb dominated by default bb)
> {
> idom(final bb) = cond bb;
> }
> else
> {
> idom(final bb) = switch bb;
> }
> }
> else
> {
> // idom(final bb) doesn't change
> }
>
> ...might be the final solution. But after thinking about it for a while I
> (saddly) came up with another counterexample.
>
> |
> cond BB --------------+
> | |
> switch BB --------+ |
> / | \ \ |
> case BBs default BB
> \ | / /
> final BB <- pred BB -+
> | ^
> | |
> +-----------+ <- this may be a path or an edge I guess
>
> Here there *is* a path between default_bb and final_bb but since no predecessor
> of final_bb is dominated by default_bb we incorrectly decide that there is no
> such path. Therefore we incorrectly assign idom(final_bb) = switch_bb instead
> of idom(final_bb) = cond_bb.
>
> So unless I'm missing something, "final has a pred dominated by default" isn't
> equivalent with "there is a path between default and final" even when we assume
> that the original idom of final_bb was switch_bb. Therefore I think we're back
> to searching for a nice way to test "there is a path between default and
> final".
Hmm.
> Maybe you can spot a flaw in my logic or maybe you see a solution I don't.
> Meanwhile I'll look into source code of the rest of the switch conversion pass.
> Switch conversion pass inserts conditions similar to what I'm doing so someone
> before me may have already solved how to properly fix dominators in this
> situation.
OK, as I see in your next followup that uses iterate_fix_dominators as
well.
So your patch is OK as-is.
It might be nice to factor out a common helper from gen_inbound_check
and your "copy" of it though. As followup, if you like.
Thanks and sorry for the confusion,
Richard.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches
2024-07-30 12:34 ` Richard Biener
@ 2024-07-30 16:48 ` Filip Kastl
0 siblings, 0 replies; 10+ messages in thread
From: Filip Kastl @ 2024-07-30 16:48 UTC (permalink / raw)
To: Richard Biener; +Cc: Filip Kastl, gcc-patches, mjambor
On Tue 2024-07-30 14:34:54, Richard Biener wrote:
> On Tue, 30 Jul 2024, Filip Kastl wrote:
>
> > > > > Ah, I see you fix those up. Then 2.) is left - the final block. Iff
> > > > > the final block needs adjustment you know there was a path from
> > > > > the default case to it which means one of its predecessors is dominated
> > > > > by the default case? In that case, adjust the dominator to cond_bb,
> > > > > otherwise leave it at switch_bb?
> > > >
> > > > Yes, what I'm saying is that if I want to know idom of final_bb after the
> > > > transformation, I have to know if there is a path between default_bb and
> > > > final_bb. It is because of these two cases:
> > > >
> > > > 1.
> > > >
> > > > cond BB ---------+
> > > > | |
> > > > switch BB ---+ |
> > > > / | \ \ |
> > > > case BBs default BB
> > > > \ | / /
> > > > final BB <---+ <- this may be an edge or a path
> > > > |
> > > >
> > > > 2.
> > > >
> > > > cond BB ---------+
> > > > | |
> > > > switch BB ---+ |
> > > > / | \ \ |
> > > > case BBs default BB
> > > > \ | / /
> > > > final BB / <- this may be an edge or a path
> > > > | /
> > > >
> > > > In the first case, there is a path between default_bb and final_bb and in the
> > > > second there isn't. Otherwise the cases are the same. In the first case idom
> > > > of final_bb should be cond_bb. In the second case idom of final_bb should be
> > > > switch_bb. Algorithm deciding what should be idom of final_bb therefore has to
> > > > know if there is a path between default_bb and final_bb.
> > > >
> > > > You said that if there is a path between default_bb and final_bb, one of the
> > > > predecessors of final_bb is dominated by default_bb. That would indeed give a
> > > > nice way to check existence of a path between default_bb and final_bb. But
> > > > does it hold? Consider this situation:
> > > >
> > > > | |
> > > > cond BB --------------+
> > > > | | |
> > > > switch BB --------+ |
> > > > / | \ | \ |
> > > > case BBs | default BB
> > > > \ | / | /
> > > > final BB <- pred BB -+
> > > > |
> > > >
> > > > Here no predecessors of final_bb are dominated by default_bb but at the same
> > > > time there does exist a path from default_bb to final_bb. Or is this CFG
> > > > impossible for some reason?
> > >
> > > I think in this case the dominator simply need not change - the only case
> > > you need to adjust it is when the immediate dominator of final BB was
> > > switch BB before the transform, and then we know we have to update it
> > > too cond BB, no?
> >
> > Ah, my bad. Yes, my counterexample actually isn't a problem. I was glad when
> > I realized that and started thinking that this...
> >
> > if (original idom(final bb) == switch bb)
> > {
> > if (exists a pred of final bb dominated by default bb)
> > {
> > idom(final bb) = cond bb;
> > }
> > else
> > {
> > idom(final bb) = switch bb;
> > }
> > }
> > else
> > {
> > // idom(final bb) doesn't change
> > }
> >
> > ...might be the final solution. But after thinking about it for a while I
> > (saddly) came up with another counterexample.
> >
> > |
> > cond BB --------------+
> > | |
> > switch BB --------+ |
> > / | \ \ |
> > case BBs default BB
> > \ | / /
> > final BB <- pred BB -+
> > | ^
> > | |
> > +-----------+ <- this may be a path or an edge I guess
> >
> > Here there *is* a path between default_bb and final_bb but since no predecessor
> > of final_bb is dominated by default_bb we incorrectly decide that there is no
> > such path. Therefore we incorrectly assign idom(final_bb) = switch_bb instead
> > of idom(final_bb) = cond_bb.
> >
> > So unless I'm missing something, "final has a pred dominated by default" isn't
> > equivalent with "there is a path between default and final" even when we assume
> > that the original idom of final_bb was switch_bb. Therefore I think we're back
> > to searching for a nice way to test "there is a path between default and
> > final".
>
> Hmm.
>
> > Maybe you can spot a flaw in my logic or maybe you see a solution I don't.
> > Meanwhile I'll look into source code of the rest of the switch conversion pass.
> > Switch conversion pass inserts conditions similar to what I'm doing so someone
> > before me may have already solved how to properly fix dominators in this
> > situation.
>
> OK, as I see in your next followup that uses iterate_fix_dominators as
> well.
>
> So your patch is OK as-is.
>
> It might be nice to factor out a common helper from gen_inbound_check
> and your "copy" of it though. As followup, if you like.
>
> Thanks and sorry for the confusion,
> Richard.
Alright, I pushed the patch. I bootstrapped and regtested it once again to be
sure nothing breaks on the new master. I also noticed that I didn't fill out
the changelog in the commit message of the second version of the patch so I
fixed that and I fixed two other mistakes I found in the commit message.
I'm adding the task of factoring out the common code to my list of possible
future projects.
It's a shame that we didn't manage to figure out the final bb idom. For a
while, it looked like the predecessor dominated by default idea would really
work.
Thank you for all the guidance with this patch.
Cheers,
Filip Kastl
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-07-30 16:48 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-17 14:15 [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches Filip Kastl
2024-07-18 10:07 ` Richard Biener
2024-07-18 14:17 ` Filip Kastl
2024-07-26 12:34 ` Richard Biener
2024-07-29 15:28 ` Filip Kastl
2024-07-30 7:20 ` Richard Biener
2024-07-30 12:04 ` Filip Kastl
2024-07-30 12:23 ` Filip Kastl
2024-07-30 12:34 ` Richard Biener
2024-07-30 16:48 ` Filip Kastl
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).