* [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted.
@ 2022-06-16 11:08 Tamar Christina
2022-06-16 11:09 ` [PATCH 2/2]middle-end: Support recognition of three-way max/min Tamar Christina
2022-06-20 8:03 ` [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted Richard Biener
0 siblings, 2 replies; 26+ messages in thread
From: Tamar Christina @ 2022-06-16 11:08 UTC (permalink / raw)
To: gcc-patches; +Cc: nd, rguenther
[-- Attachment #1: Type: text/plain, Size: 1532 bytes --]
Hi All,
This adds a match.pd rule that drops the bitwise nots when both arguments to a
subtract are inverted, i.e. for:
float g(float a, float b)
{
  return ~(int)a - ~(int)b;
}
we instead generate
float g(float a, float b)
{
  return (int)a - (int)b;
}
We already do a limited version of this in the fold_binary fold functions, but
this adds a more general version in match.pd that applies more often.
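A rule like this can be sanity-checked by comparing both sides over a small
range of inputs. The sketch below is not part of the patch: the helper names
are made up for illustration, and unsigned arithmetic is used so that
wraparound is well defined. In two's complement ~x equals -x - 1, so ~x - ~y
works out to y - x, which suggests the operand order of the replacement is
worth double-checking.

```c
#include <assert.h>

/* Left-hand side of the pattern: subtract two bitwise-inverted values.
   Unsigned operands keep the wraparound well defined in C.  */
static unsigned
sub_of_nots (unsigned x, unsigned y)
{
  return ~x - ~y;
}

/* ~x == -x - 1, hence ~x - ~y == (-x - 1) - (-y - 1) == y - x.
   Hypothetical checker: asserts that identity for all pairs in [0, 512).  */
static void
check_sub_of_nots (void)
{
  for (unsigned x = 0; x < 512; x++)
    for (unsigned y = 0; y < 512; y++)
      assert (sub_of_nots (x, y) == y - x);
}
```

Calling check_sub_of_nots () from a small driver runs the comparison; it
aborts on the first pair where the two forms disagree.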
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* match.pd: New bit_not rule.
gcc/testsuite/ChangeLog:
* gcc.dg/subnot.c: New test.
--- inline copy of patch --
diff --git a/gcc/match.pd b/gcc/match.pd
index a59b6778f661cf9121dd3503f43472871e4da445..51b0a1b562409af535e53828a10c30b8a3e1ae2e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1258,6 +1258,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(simplify
(bit_not (plus:c (bit_not @0) @1))
(minus @0 @1))
+/* (~X - ~Y) -> X - Y. */
+(simplify
+ (minus (bit_not @0) (bit_not @1))
+ (minus @0 @1))
/* ~(X - Y) -> ~X + Y. */
(simplify
diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
new file mode 100644
index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/subnot.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+float g(float a, float b)
+{
+ return ~(int)a - ~(int)b;
+}
+
+/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
--
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-16 11:08 [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted Tamar Christina
@ 2022-06-16 11:09 ` Tamar Christina
2022-06-20 8:36 ` Richard Biener
2022-06-20 23:16 ` Andrew Pinski
2022-06-20 8:03 ` [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted Richard Biener
1 sibling, 2 replies; 26+ messages in thread
From: Tamar Christina @ 2022-06-16 11:09 UTC (permalink / raw)
To: gcc-patches; +Cc: nd, rguenther, jakub
[-- Attachment #1: Type: text/plain, Size: 18478 bytes --]
Hi All,
This patch adds support for three-way min/max recognition in phi-opts.
Concretely, for
#include <stdint.h>
uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
  uint8_t xk;
  if (xc < xm) {
    xk = (uint8_t) (xc < xy ? xc : xy);
  } else {
    xk = (uint8_t) (xm < xy ? xm : xy);
  }
  return xk;
}
we generate:
<bb 2> [local count: 1073741824]:
_5 = MIN_EXPR <xc_1(D), xy_3(D)>;
_7 = MIN_EXPR <xm_2(D), _5>;
return _7;
instead of
<bb 2>:
if (xc_2(D) < xm_3(D))
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
goto <bb 5>;
<bb 4>:
xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
<bb 5>:
# xk_1 = PHI <xk_5(3), xk_6(4)>
return xk_1;
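The equivalence between the branchy form and the two chained MIN_EXPRs can be
checked exhaustively for uint8_t. The following is a minimal sketch, not taken
from the patch; min_u8, three_min_flat and check_three_min are hypothetical
helper names.

```c
#include <assert.h>
#include <stdint.h>

/* The branchy source form from the example above.  */
static uint8_t
three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc < xm)
    xk = (uint8_t) (xc < xy ? xc : xy);
  else
    xk = (uint8_t) (xm < xy ? xm : xy);
  return xk;
}

static uint8_t
min_u8 (uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

/* The straight-line form: MIN (xm, MIN (xc, xy)).  */
static uint8_t
three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8 (xm, min_u8 (xc, xy));
}

/* Asserts that both forms agree for every uint8_t triple.  */
static void
check_three_min (void)
{
  for (int c = 0; c < 256; c++)
    for (int m = 0; m < 256; m++)
      for (int y = 0; y < 256; y++)
        assert (three_min_branchy ((uint8_t) c, (uint8_t) m, (uint8_t) y)
                == three_min_flat ((uint8_t) c, (uint8_t) m, (uint8_t) y));
}
```

Whichever way the outer comparison goes, the selected branch already holds the
smaller of xc and xm, so taking its minimum with xy gives the minimum of all
three values.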
The same function also handles turning a minimization problem into a
maximization one (and vice versa) when all the operands are bitwise inverted:
it applies the opposite operation to the uninverted values and inverts the
result. We do this here since doing it in match.pd would end up changing the
shape of the BBs and adding additional instructions, which would prevent
various optimizations from working.
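The min/max swap rests on bitwise NOT reversing the order of unsigned values,
so a MIN over inverted operands equals the inverse of a MAX over the plain
ones. A small sketch of that identity for uint8_t follows; it is an
illustration only, and none of the helper names come from the patch.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t umin8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t umax8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* ~ promotes its operand to int, so truncate back to 8 bits.  */
static uint8_t not8 (uint8_t a) { return (uint8_t) ~a; }

/* MIN applied to two inverted operands.  */
static uint8_t
min_of_nots (uint8_t a, uint8_t b)
{
  return umin8 (not8 (a), not8 (b));
}

/* The same value via MAX on the plain operands, inverted once at the end.  */
static uint8_t
not_of_max (uint8_t a, uint8_t b)
{
  return not8 (umax8 (a, b));
}

/* Asserts that the two forms agree for every pair of uint8_t values.  */
static void
check_inverted_minmax (void)
{
  for (int a = 0; a < 256; a++)
    for (int b = 0; b < 256; b++)
      assert (min_of_nots ((uint8_t) a, (uint8_t) b)
              == not_of_max ((uint8_t) a, (uint8_t) b));
}
```

For an 8-bit unsigned value, not8 (a) is 255 - a, which is strictly
decreasing in a; that is why the smallest inverted operand corresponds to the
largest original one.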
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
sequence of a three-way conditional.
(replace_phi_edge_with_variable): Support deferring of BB removal.
(tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
min/max.
(strip_bit_not, invert_minmax_code): New.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
code away.
* gcc.dg/tree-ssa/minmax-3.c: New test.
* gcc.dg/tree-ssa/minmax-4.c: New test.
* gcc.dg/tree-ssa/minmax-5.c: New test.
* gcc.dg/tree-ssa/minmax-6.c: New test.
* gcc.dg/tree-ssa/minmax-7.c: New test.
* gcc.dg/tree-ssa/minmax-8.c: New test.
--- inline copy of patch --
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
new file mode 100644
index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xy < xc ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
new file mode 100644
index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
new file mode 100644
index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
#include <stdio.h>
#include <stdlib.h>
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2e367d2d21421b1 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
gimple *);
static int value_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
-static bool minmax_replacement (basic_block, basic_block,
- edge, edge, gphi *, tree, tree);
+static bool minmax_replacement (basic_block, basic_block, basic_block,
+ edge, edge, gphi *, tree, tree, bool);
static bool spaceship_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
@@ -73,7 +73,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
hash_set<tree> *);
static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
static hash_set<tree> * get_non_trapping ();
-static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
+static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
static void hoist_adjacent_loads (basic_block, basic_block,
basic_block, basic_block);
static bool gate_hoist_loads (void);
@@ -199,6 +199,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
basic_block bb1, bb2;
edge e1, e2;
tree arg0, arg1;
+ bool diamond_minmax_p = false;
bb = bb_order[i];
@@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
hoist_adjacent_loads (bb, bb1, bb2, bb3);
continue;
}
+ else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
+ && single_succ_p (bb1)
+ && single_succ_p (bb2)
+ && single_pred_p (bb1)
+ && single_pred_p (bb2)
+ && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
+ {
+ gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
+ gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
+ if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
+ {
+ gimple *stmt1 = gsi_stmt (it1);
+ gimple *stmt2 = gsi_stmt (it2);
+ if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
+ {
+ enum tree_code code1 = gimple_assign_rhs_code (stmt1);
+ enum tree_code code2 = gimple_assign_rhs_code (stmt2);
+ diamond_minmax_p
+ = (code1 == MIN_EXPR || code1 == MAX_EXPR)
+ && (code2 == MIN_EXPR || code2 == MAX_EXPR);
+ }
+ }
+ }
else
continue;
@@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
if (!candorest)
continue;
+ /* Check that we're looking for nested phis. */
+ if (phis == NULL && diamond_minmax_p)
+ {
+ phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
+ e2 = EDGE_SUCC (bb2, 0);
+ }
+
phi = single_non_singleton_phi_for_edges (phis, e1, e2);
if (!phi)
continue;
@@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
gphi *newphi;
if (single_pred_p (bb1)
+ && !diamond_minmax_p
&& (newphi = factor_out_conditional_conversion (e1, e2, phi,
arg0, arg1,
cond_stmt)))
@@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
}
/* Do the replacement of conditional if it can be done. */
- if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
+ if (!early_p
+ && !diamond_minmax_p
+ && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
cfgchanged = true;
- else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
- arg0, arg1,
- early_p))
+ else if (!diamond_minmax_p
+ && match_simplify_replacement (bb, bb1, e1, e2, phi,
+ arg0, arg1, early_p))
cfgchanged = true;
else if (!early_p
+ && !diamond_minmax_p
&& single_pred_p (bb1)
&& cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
phi, arg0, arg1))
cfgchanged = true;
- else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
+ else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
+ diamond_minmax_p))
cfgchanged = true;
else if (single_pred_p (bb1)
+ && !diamond_minmax_p
&& spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
cfgchanged = true;
}
@@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
static void
replace_phi_edge_with_variable (basic_block cond_block,
- edge e, gphi *phi, tree new_tree)
+ edge e, gphi *phi, tree new_tree, bool delete_bb = true)
{
basic_block bb = gimple_bb (phi);
gimple_stmt_iterator gsi;
@@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
edge_to_remove = EDGE_SUCC (cond_block, 1);
else
edge_to_remove = EDGE_SUCC (cond_block, 0);
- if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
+ if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
{
e->flags |= EDGE_FALLTHRU;
e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
@@ -1564,15 +1601,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
return 0;
}
+/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
+ the value being inverted. */
+
+static tree
+strip_bit_not (tree var)
+{
+ if (TREE_CODE (var) != SSA_NAME)
+ return NULL_TREE;
+
+ gimple *assign = SSA_NAME_DEF_STMT (var);
+ if (gimple_code (assign) != GIMPLE_ASSIGN)
+ return NULL_TREE;
+
+ if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
+ return NULL_TREE;
+
+ return gimple_assign_rhs1 (assign);
+}
+
+/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
+
+enum tree_code
+invert_minmax_code (enum tree_code code)
+{
+ switch (code) {
+ case MIN_EXPR:
+ return MAX_EXPR;
+ case MAX_EXPR:
+ return MIN_EXPR;
+ default:
+ gcc_unreachable ();
+ }
+}
+
/* The function minmax_replacement does the main work of doing the minmax
replacement. Return true if the replacement is done. Otherwise return
false.
BB is the basic block where the replacement is going to be done on. ARG0
- is argument 0 from the PHI. Likewise for ARG1. */
+ is argument 0 from the PHI. Likewise for ARG1.
+
+ If THREEWAY_P then expect the BB to be laid out in diamond shape with each
+ BB containing only a MIN or MAX expression. */
static bool
-minmax_replacement (basic_block cond_bb, basic_block middle_bb,
- edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
+minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
+ edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
{
tree result;
edge true_edge, false_edge;
@@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
if (false_edge->dest == middle_bb)
false_edge = EDGE_SUCC (false_edge->dest, 0);
+ /* When THREEWAY_P then e1 will point to the edge of the final transition
+ from middle-bb to end. */
if (true_edge == e0)
{
- gcc_assert (false_edge == e1);
+ if (threeway_p)
+ gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
+ else
+ gcc_assert (false_edge == e1);
arg_true = arg0;
arg_false = arg1;
}
@@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
else
return false;
}
+ else if (middle_bb != alt_middle_bb && threeway_p)
+ {
+ /* Recognize the following case:
+
+ if (smaller < larger)
+ a = MIN (smaller, c);
+ else
+ b = MIN (larger, c);
+ x = PHI <a, b>
+
+ This is equivalent to
+
+ a = MIN (smaller, c);
+ x = MIN (larger, a); */
+
+ gimple *assign = last_and_only_stmt (middle_bb);
+ tree lhs, op0, op1, bound;
+ tree alt_lhs, alt_op0, alt_op1;
+ bool invert = false;
+
+ if (!single_pred_p (middle_bb)
+ || !single_pred_p (alt_middle_bb))
+ return false;
+
+ if (!assign
+ || gimple_code (assign) != GIMPLE_ASSIGN)
+ return false;
+
+ lhs = gimple_assign_lhs (assign);
+ ass_code = gimple_assign_rhs_code (assign);
+ if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
+ return false;
+
+ op0 = gimple_assign_rhs1 (assign);
+ op1 = gimple_assign_rhs2 (assign);
+
+ assign = last_and_only_stmt (alt_middle_bb);
+ if (!assign
+ || gimple_code (assign) != GIMPLE_ASSIGN)
+ return false;
+
+ alt_lhs = gimple_assign_lhs (assign);
+ if (ass_code != gimple_assign_rhs_code (assign))
+ return false;
+
+ alt_op0 = gimple_assign_rhs1 (assign);
+ alt_op1 = gimple_assign_rhs2 (assign);
+
+ if (!operand_equal_for_phi_arg_p (lhs, arg_true)
+ || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
+ return false;
+
+ if ((operand_equal_for_phi_arg_p (op0, smaller)
+ || (alt_smaller
+ && operand_equal_for_phi_arg_p (op0, alt_smaller)))
+ && (operand_equal_for_phi_arg_p (alt_op0, larger)
+ || (alt_larger
+ && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
+ {
+ /* We got here if the condition is true, i.e., SMALLER < LARGER. */
+ if (!operand_equal_for_phi_arg_p (op1, alt_op1))
+ return false;
+
+ if ((arg0 = strip_bit_not (op0)) != NULL
+ && (arg1 = strip_bit_not (alt_op0)) != NULL
+ && (bound = strip_bit_not (op1)) != NULL)
+ {
+ minmax = MAX_EXPR;
+ ass_code = invert_minmax_code (ass_code);
+ invert = true;
+ }
+ else
+ {
+ bound = op1;
+ minmax = MIN_EXPR;
+ arg0 = op0;
+ arg1 = alt_op0;
+ }
+ }
+ else if ((operand_equal_for_phi_arg_p (op0, larger)
+ || (alt_larger
+ && operand_equal_for_phi_arg_p (op0, alt_larger)))
+ && (operand_equal_for_phi_arg_p (alt_op0, smaller)
+ || (alt_smaller
+ && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
+ {
+ /* We got here if the condition is true, i.e., SMALLER > LARGER. */
+ if (!operand_equal_for_phi_arg_p (op1, alt_op1))
+ return false;
+
+ if ((arg0 = strip_bit_not (op0)) != NULL
+ && (arg1 = strip_bit_not (alt_op0)) != NULL
+ && (bound = strip_bit_not (op1)) != NULL)
+ {
+ minmax = MIN_EXPR;
+ ass_code = invert_minmax_code (ass_code);
+ invert = true;
+ }
+ else
+ {
+ bound = op1;
+ minmax = MAX_EXPR;
+ arg0 = op0;
+ arg1 = alt_op0;
+ }
+ }
+ else
+ return false;
+
+ /* Reset any range information from the basic block. */
+ reset_flow_sensitive_info_in_bb (cond_bb);
+
+ /* Emit the statement to compute min/max. */
+ gimple_seq stmts = NULL;
+ tree phi_result = PHI_RESULT (phi);
+ result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
+ result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
+ if (invert)
+ result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE (phi_result), result);
+
+ gsi = gsi_last_bb (cond_bb);
+ gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
+
+ replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
+
+ return true;
+ }
else
{
/* Recognize the following case, assuming d <= u:
--
[-- Attachment #2: rb15841.patch --]
[-- Type: text/plain, Size: 16630 bytes --]
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
new file mode 100644
index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xy < xc ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
new file mode 100644
index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
new file mode 100644
index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
#include <stdio.h>
#include <stdlib.h>
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2e367d2d21421b1 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
gimple *);
static int value_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
-static bool minmax_replacement (basic_block, basic_block,
- edge, edge, gphi *, tree, tree);
+static bool minmax_replacement (basic_block, basic_block, basic_block,
+ edge, edge, gphi *, tree, tree, bool);
static bool spaceship_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
@@ -73,7 +73,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
hash_set<tree> *);
static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
static hash_set<tree> * get_non_trapping ();
-static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
+static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
static void hoist_adjacent_loads (basic_block, basic_block,
basic_block, basic_block);
static bool gate_hoist_loads (void);
@@ -199,6 +199,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
basic_block bb1, bb2;
edge e1, e2;
tree arg0, arg1;
+ bool diamond_minmax_p = false;
bb = bb_order[i];
@@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
hoist_adjacent_loads (bb, bb1, bb2, bb3);
continue;
}
+ else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
+ && single_succ_p (bb1)
+ && single_succ_p (bb2)
+ && single_pred_p (bb1)
+ && single_pred_p (bb2)
+ && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
+ {
+ gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
+ gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
+ if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
+ {
+ gimple *stmt1 = gsi_stmt (it1);
+ gimple *stmt2 = gsi_stmt (it2);
+ if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
+ {
+ enum tree_code code1 = gimple_assign_rhs_code (stmt1);
+ enum tree_code code2 = gimple_assign_rhs_code (stmt2);
+ diamond_minmax_p
+ = (code1 == MIN_EXPR || code1 == MAX_EXPR)
+ && (code2 == MIN_EXPR || code2 == MAX_EXPR);
+ }
+ }
+ }
else
continue;
@@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
if (!candorest)
continue;
+ /* Check that we're looking for nested phis. */
+ if (phis == NULL && diamond_minmax_p)
+ {
+ phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
+ e2 = EDGE_SUCC (bb2, 0);
+ }
+
phi = single_non_singleton_phi_for_edges (phis, e1, e2);
if (!phi)
continue;
@@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
gphi *newphi;
if (single_pred_p (bb1)
+ && !diamond_minmax_p
&& (newphi = factor_out_conditional_conversion (e1, e2, phi,
arg0, arg1,
cond_stmt)))
@@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
}
/* Do the replacement of conditional if it can be done. */
- if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
+ if (!early_p
+ && !diamond_minmax_p
+ && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
cfgchanged = true;
- else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
- arg0, arg1,
- early_p))
+ else if (!diamond_minmax_p
+ && match_simplify_replacement (bb, bb1, e1, e2, phi,
+ arg0, arg1, early_p))
cfgchanged = true;
else if (!early_p
+ && !diamond_minmax_p
&& single_pred_p (bb1)
&& cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
phi, arg0, arg1))
cfgchanged = true;
- else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
+ else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
+ diamond_minmax_p))
cfgchanged = true;
else if (single_pred_p (bb1)
+ && !diamond_minmax_p
&& spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
cfgchanged = true;
}
@@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
static void
replace_phi_edge_with_variable (basic_block cond_block,
- edge e, gphi *phi, tree new_tree)
+ edge e, gphi *phi, tree new_tree, bool delete_bb = true)
{
basic_block bb = gimple_bb (phi);
gimple_stmt_iterator gsi;
@@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
edge_to_remove = EDGE_SUCC (cond_block, 1);
else
edge_to_remove = EDGE_SUCC (cond_block, 0);
- if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
+ if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
{
e->flags |= EDGE_FALLTHRU;
e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
@@ -1564,15 +1601,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
return 0;
}
+/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
+ the value being inverted. */
+
+static tree
+strip_bit_not (tree var)
+{
+ if (TREE_CODE (var) != SSA_NAME)
+ return NULL_TREE;
+
+ gimple *assign = SSA_NAME_DEF_STMT (var);
+ if (gimple_code (assign) != GIMPLE_ASSIGN)
+ return NULL_TREE;
+
+ if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
+ return NULL_TREE;
+
+ return gimple_assign_rhs1 (assign);
+}
+
+/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
+
+enum tree_code
+invert_minmax_code (enum tree_code code)
+{
+ switch (code) {
+ case MIN_EXPR:
+ return MAX_EXPR;
+ case MAX_EXPR:
+ return MIN_EXPR;
+ default:
+ gcc_unreachable ();
+ }
+}
+
/* The function minmax_replacement does the main work of doing the minmax
replacement. Return true if the replacement is done. Otherwise return
false.
BB is the basic block where the replacement is going to be done on. ARG0
- is argument 0 from the PHI. Likewise for ARG1. */
+ is argument 0 from the PHI. Likewise for ARG1.
+
+ If THREEWAY_P then expect the BB to be laid out in diamond shape with each
+ BB containing only a MIN or MAX expression. */
static bool
-minmax_replacement (basic_block cond_bb, basic_block middle_bb,
- edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
+minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
+ edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
{
tree result;
edge true_edge, false_edge;
@@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
if (false_edge->dest == middle_bb)
false_edge = EDGE_SUCC (false_edge->dest, 0);
+ /* When THREEWAY_P then e1 will point to the edge of the final transition
+ from middle-bb to end. */
if (true_edge == e0)
{
- gcc_assert (false_edge == e1);
+ if (threeway_p)
+ gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
+ else
+ gcc_assert (false_edge == e1);
arg_true = arg0;
arg_false = arg1;
}
@@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
else
return false;
}
+ else if (middle_bb != alt_middle_bb && threeway_p)
+ {
+ /* Recognize the following case:
+
+ if (smaller < larger)
+ a = MIN (smaller, c);
+ else
+ b = MIN (larger, c);
+ x = PHI <a, b>
+
+ This is equivalent to
+
+ a = MIN (smaller, c);
+ x = MIN (larger, a); */
+
+ gimple *assign = last_and_only_stmt (middle_bb);
+ tree lhs, op0, op1, bound;
+ tree alt_lhs, alt_op0, alt_op1;
+ bool invert = false;
+
+ if (!single_pred_p (middle_bb)
+ || !single_pred_p (alt_middle_bb))
+ return false;
+
+ if (!assign
+ || gimple_code (assign) != GIMPLE_ASSIGN)
+ return false;
+
+ lhs = gimple_assign_lhs (assign);
+ ass_code = gimple_assign_rhs_code (assign);
+ if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
+ return false;
+
+ op0 = gimple_assign_rhs1 (assign);
+ op1 = gimple_assign_rhs2 (assign);
+
+ assign = last_and_only_stmt (alt_middle_bb);
+ if (!assign
+ || gimple_code (assign) != GIMPLE_ASSIGN)
+ return false;
+
+ alt_lhs = gimple_assign_lhs (assign);
+ if (ass_code != gimple_assign_rhs_code (assign))
+ return false;
+
+ alt_op0 = gimple_assign_rhs1 (assign);
+ alt_op1 = gimple_assign_rhs2 (assign);
+
+ if (!operand_equal_for_phi_arg_p (lhs, arg_true)
+ || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
+ return false;
+
+ if ((operand_equal_for_phi_arg_p (op0, smaller)
+ || (alt_smaller
+ && operand_equal_for_phi_arg_p (op0, alt_smaller)))
+ && (operand_equal_for_phi_arg_p (alt_op0, larger)
+ || (alt_larger
+ && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
+ {
+ /* We got here if the condition is true, i.e., SMALLER < LARGER. */
+ if (!operand_equal_for_phi_arg_p (op1, alt_op1))
+ return false;
+
+ if ((arg0 = strip_bit_not (op0)) != NULL
+ && (arg1 = strip_bit_not (alt_op0)) != NULL
+ && (bound = strip_bit_not (op1)) != NULL)
+ {
+ minmax = MAX_EXPR;
+ ass_code = invert_minmax_code (ass_code);
+ invert = true;
+ }
+ else
+ {
+ bound = op1;
+ minmax = MIN_EXPR;
+ arg0 = op0;
+ arg1 = alt_op0;
+ }
+ }
+ else if ((operand_equal_for_phi_arg_p (op0, larger)
+ || (alt_larger
+ && operand_equal_for_phi_arg_p (op0, alt_larger)))
+ && (operand_equal_for_phi_arg_p (alt_op0, smaller)
+ || (alt_smaller
+ && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
+ {
+ /* We got here if the condition is true, i.e., SMALLER > LARGER. */
+ if (!operand_equal_for_phi_arg_p (op1, alt_op1))
+ return false;
+
+ if ((arg0 = strip_bit_not (op0)) != NULL
+ && (arg1 = strip_bit_not (alt_op0)) != NULL
+ && (bound = strip_bit_not (op1)) != NULL)
+ {
+ minmax = MIN_EXPR;
+ ass_code = invert_minmax_code (ass_code);
+ invert = true;
+ }
+ else
+ {
+ bound = op1;
+ minmax = MAX_EXPR;
+ arg0 = op0;
+ arg1 = alt_op0;
+ }
+ }
+ else
+ return false;
+
+ /* Reset any range information from the basic block. */
+ reset_flow_sensitive_info_in_bb (cond_bb);
+
+ /* Emit the statement to compute min/max. */
+ gimple_seq stmts = NULL;
+ tree phi_result = PHI_RESULT (phi);
+ result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
+ result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
+ if (invert)
+ result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE (phi_result), result);
+
+ gsi = gsi_last_bb (cond_bb);
+ gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
+
+ replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
+
+ return true;
+ }
else
{
/* Recognize the following case, assuming d <= u:
* Re: [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted.
2022-06-16 11:08 [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted Tamar Christina
2022-06-16 11:09 ` [PATCH 2/2]middle-end: Support recognition of three-way max/min Tamar Christina
@ 2022-06-20 8:03 ` Richard Biener
2022-06-20 8:18 ` Richard Sandiford
1 sibling, 1 reply; 26+ messages in thread
From: Richard Biener @ 2022-06-20 8:03 UTC (permalink / raw)
To: Tamar Christina; +Cc: GCC Patches, nd, Richard Guenther
On Thu, Jun 16, 2022 at 1:10 PM Tamar Christina via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi All,
>
> This adds a match.pd rule that drops the bitwise nots when both arguments to a
> subtract are inverted, i.e. for:
>
> float g(float a, float b)
> {
> return ~(int)a - ~(int)b;
> }
>
> we instead generate
>
> float g(float a, float b)
> {
> return (int)a - (int)b;
> }
>
> We already do a limited version of this from the fold_binary fold functions but
> this makes a more general version in match.pd that applies more often.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * match.pd: New bit_not rule.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/subnot.c: New test.
>
> --- inline copy of patch --
> diff --git a/gcc/match.pd b/gcc/match.pd
> index a59b6778f661cf9121dd3503f43472871e4da445..51b0a1b562409af535e53828a10c30b8a3e1ae2e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1258,6 +1258,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (simplify
> (bit_not (plus:c (bit_not @0) @1))
> (minus @0 @1))
> +/* (~X - ~Y) -> X - Y. */
> +(simplify
> + (minus (bit_not @0) (bit_not @1))
> + (minus @0 @1))
It doesn't seem correct.
(gdb) p/x ~-1 - ~0x80000000
$3 = 0x80000001
(gdb) p/x -1 - 0x80000000
$4 = 0x7fffffff
where I was looking for a case exposing undefined integer overflow.
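The gdb session above can be reproduced with a small standalone C sketch. It is only an illustration and not part of the patch: the function names are made up here, and it uses uint32_t so that the wrap-around is well defined instead of relying on signed overflow.

```c
#include <stdint.h>

/* The expression as written in the source: ~x - ~y.  */
uint32_t sub_of_nots (uint32_t x, uint32_t y)
{
  return ~x - ~y;
}

/* What the rule as posted would fold it to: x - y.  */
uint32_t fold_as_posted (uint32_t x, uint32_t y)
{
  return x - y;
}
```

With x = 0xffffffff (that is, -1) and y = 0x80000000 the two functions return 0x80000001 and 0x7fffffff respectively, matching $3 and $4.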
Richard.
>
> /* ~(X - Y) -> ~X + Y. */
> (simplify
> diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/subnot.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +float g(float a, float b)
> +{
> + return ~(int)a - ~(int)b;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
>
>
>
>
> --
* Re: [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted.
2022-06-20 8:03 ` [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted Richard Biener
@ 2022-06-20 8:18 ` Richard Sandiford
2022-06-20 8:49 ` Tamar Christina
0 siblings, 1 reply; 26+ messages in thread
From: Richard Sandiford @ 2022-06-20 8:18 UTC (permalink / raw)
To: Richard Biener via Gcc-patches
Cc: Tamar Christina, Richard Biener, Richard Guenther, nd
Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Thu, Jun 16, 2022 at 1:10 PM Tamar Christina via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> Hi All,
>>
>> This adds a match.pd rule that drops the bitwise nots when both arguments to a
>> subtract are inverted, i.e. for:
>>
>> float g(float a, float b)
>> {
>> return ~(int)a - ~(int)b;
>> }
>>
>> we instead generate
>>
>> float g(float a, float b)
>> {
>> return (int)a - (int)b;
>> }
>>
>> We already do a limited version of this from the fold_binary fold functions but
>> this makes a more general version in match.pd that applies more often.
>>
>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>>
>> Ok for master?
>>
>> Thanks,
>> Tamar
>>
>> gcc/ChangeLog:
>>
>> * match.pd: New bit_not rule.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.dg/subnot.c: New test.
>>
>> --- inline copy of patch --
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index a59b6778f661cf9121dd3503f43472871e4da445..51b0a1b562409af535e53828a10c30b8a3e1ae2e 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1258,6 +1258,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> (simplify
>> (bit_not (plus:c (bit_not @0) @1))
>> (minus @0 @1))
>> +/* (~X - ~Y) -> X - Y. */
>> +(simplify
>> + (minus (bit_not @0) (bit_not @1))
>> + (minus @0 @1))
>
> It doesn't seem correct.
>
> (gdb) p/x ~-1 - ~0x80000000
> $3 = 0x80000001
> (gdb) p/x -1 - 0x80000000
> $4 = 0x7fffffff
>
> where I was looking for a case exposing undefined integer overflow.
Yeah, shouldn't it be folding to (minus @1 @0) instead?
~X = (-X - 1)
~Y = (-Y - 1)
so:
~X - ~Y = (-X - 1) - (-Y - 1)
= -X - 1 + Y + 1
= Y - X
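
The derivation can be sanity-checked in wrapping unsigned arithmetic, where ~X equals -X - 1 modulo 2^32. The following is a hedged sketch and not part of the patch; the helper names and the sample values are chosen here purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Left-hand side of the identity: ~x - ~y.  */
uint32_t lhs_sub_of_nots (uint32_t x, uint32_t y)
{
  return ~x - ~y;
}

/* Proposed right-hand side with the operands swapped: y - x.  */
uint32_t rhs_swapped (uint32_t x, uint32_t y)
{
  return y - x;
}

/* Count the sampled (x, y) pairs for which the two sides disagree.  */
unsigned count_swapped_mismatches (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 2u, 0x7fffffffu, 0x80000000u, 0x80000001u,
    0xfffffffeu, 0xffffffffu
  };
  const size_t n = sizeof samples / sizeof samples[0];
  unsigned bad = 0;

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      if (lhs_sub_of_nots (samples[i], samples[j])
          != rhs_swapped (samples[i], samples[j]))
        bad++;

  return bad;
}
```

This only covers the wrapping (unsigned) case. Whether the fold is also valid for signed types, where overflow is undefined, is a separate question that the sketch does not address.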
Richard
> Richard.
>
>>
>> /* ~(X - Y) -> ~X + Y. */
>> (simplify
>> diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
>> new file mode 100644
>> index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/subnot.c
>> @@ -0,0 +1,9 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O -fdump-tree-optimized" } */
>> +
>> +float g(float a, float b)
>> +{
>> + return ~(int)a - ~(int)b;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
>>
>>
>>
>>
>> --
* Re: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-16 11:09 ` [PATCH 2/2]middle-end: Support recognition of three-way max/min Tamar Christina
@ 2022-06-20 8:36 ` Richard Biener
2022-06-20 9:01 ` Tamar Christina
2022-07-05 15:25 ` Tamar Christina
2022-06-20 23:16 ` Andrew Pinski
1 sibling, 2 replies; 26+ messages in thread
From: Richard Biener @ 2022-06-20 8:36 UTC (permalink / raw)
To: Tamar Christina; +Cc: gcc-patches, nd, jakub
On Thu, 16 Jun 2022, Tamar Christina wrote:
> Hi All,
>
> This patch adds support for three-way min/max recognition in phi-opts.
>
> Concretely for e.g.
>
> #include <stdint.h>
>
> uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> uint8_t xk;
> if (xc < xm) {
> xk = (uint8_t) (xc < xy ? xc : xy);
> } else {
> xk = (uint8_t) (xm < xy ? xm : xy);
> }
> return xk;
> }
>
> we generate:
>
> <bb 2> [local count: 1073741824]:
> _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
> _7 = MIN_EXPR <xm_2(D), _5>;
> return _7;
>
> instead of
>
> <bb 2>:
> if (xc_2(D) < xm_3(D))
> goto <bb 3>;
> else
> goto <bb 4>;
>
> <bb 3>:
> xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
> goto <bb 5>;
>
> <bb 4>:
> xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
>
> <bb 5>:
> # xk_1 = PHI <xk_5(3), xk_6(4)>
> return xk_1;
>
> The same function also immediately deals with turning a minimization problem
> into a maximization one if the results are inverted. We do this here since
> doing it in match.pd would end up changing the shape of the BBs and adding
> additional instructions which would prevent various optimizations from working.
Can you explain a bit more?
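
For reference, the two equivalences the description relies on can be checked exhaustively for uint8_t. This is a standalone sketch and not code from the patch; all function names are hypothetical. It compares the branchy three_min shape with two chained MINs, and checks that MIN (~a, ~b) equals ~MAX (a, b), which is the property behind turning a minimization into a maximization.

```c
#include <stdint.h>

uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Branchy form, mirroring the three_min example.  */
uint8_t three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  if (xc < xm)
    return min_u8 (xc, xy);
  return min_u8 (xm, xy);
}

/* Straight-line form: two chained MINs, no control flow.  */
uint8_t three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8 (xm, min_u8 (xc, xy));
}

/* Count inputs where the two three-way forms disagree, or where
   MIN (~a, ~b) differs from ~MAX (a, b).  */
unsigned count_minmax_mismatches (void)
{
  unsigned bad = 0;

  for (unsigned c = 0; c < 256; c++)
    for (unsigned m = 0; m < 256; m++)
      {
        uint8_t nc = (uint8_t) ~c;
        uint8_t nm = (uint8_t) ~m;
        if (min_u8 (nc, nm) != (uint8_t) ~max_u8 ((uint8_t) c, (uint8_t) m))
          bad++;
        for (unsigned y = 0; y < 256; y++)
          if (three_min_branchy ((uint8_t) c, (uint8_t) m, (uint8_t) y)
              != three_min_flat ((uint8_t) c, (uint8_t) m, (uint8_t) y))
            bad++;
      }

  return bad;
}
```

The sketch says nothing about why the inversion is done in phi-opt instead of match.pd; it only shows that the rewritten forms compute the same values.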
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> sequence of a three-way conditional.
> (replace_phi_edge_with_variable): Support deferring of BB removal.
> (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> min/max.
> (strip_bit_not, invert_minmax_code): New.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> code away.
> * gcc.dg/tree-ssa/minmax-3.c: New test.
> * gcc.dg/tree-ssa/minmax-4.c: New test.
> * gcc.dg/tree-ssa/minmax-5.c: New test.
> * gcc.dg/tree-ssa/minmax-6.c: New test.
> * gcc.dg/tree-ssa/minmax-7.c: New test.
> * gcc.dg/tree-ssa/minmax-8.c: New test.
>
> --- inline copy of patch --
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xy < xc ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> @@ -1,5 +1,5 @@
> /* { dg-do run } */
> -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
>
> #include <stdio.h>
> #include <stdlib.h>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2e367d2d21421b1 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> gimple *);
> static int value_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree);
> -static bool minmax_replacement (basic_block, basic_block,
> - edge, edge, gphi *, tree, tree);
> +static bool minmax_replacement (basic_block, basic_block, basic_block,
> + edge, edge, gphi *, tree, tree, bool);
> static bool spaceship_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree);
> static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> @@ -73,7 +73,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> hash_set<tree> *);
> static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> static hash_set<tree> * get_non_trapping ();
> -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
> static void hoist_adjacent_loads (basic_block, basic_block,
> basic_block, basic_block);
> static bool gate_hoist_loads (void);
> @@ -199,6 +199,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> basic_block bb1, bb2;
> edge e1, e2;
> tree arg0, arg1;
> + bool diamond_minmax_p = false;
>
> bb = bb_order[i];
>
> @@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> hoist_adjacent_loads (bb, bb1, bb2, bb3);
> continue;
> }
> + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> + && single_succ_p (bb1)
> + && single_succ_p (bb2)
> + && single_pred_p (bb1)
> + && single_pred_p (bb2)
> + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
Please do the single_succ/pred checks below where appropriate. Also,
what is the last check about? Why does the merge block need a single
successor?
> + {
> + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
> + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
> + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> + {
> + gimple *stmt1 = gsi_stmt (it1);
> + gimple *stmt2 = gsi_stmt (it2);
> + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> + {
> + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> + diamond_minmax_p
> + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> + }
> + }
> + }
I'd generalize this to general diamond detection, simply cutting off
*_replacement workers that do not handle diamonds and doing the
appropriate checks in minmax_replacement only.
> else
> continue;
>
> @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> if (!candorest)
> continue;
>
> + /* Check that we're looking for nested phis. */
> + if (phis == NULL && diamond_minmax_p)
> + {
> + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> + e2 = EDGE_SUCC (bb2, 0);
> + }
> +
instead
basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
gimple_seq phis = phi_nodes (merge);
> phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> if (!phi)
> continue;
> @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>
> gphi *newphi;
> if (single_pred_p (bb1)
> + && !diamond_minmax_p
> && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> arg0, arg1,
> cond_stmt)))
> @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> }
>
> /* Do the replacement of conditional if it can be done. */
> - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> + if (!early_p
> + && !diamond_minmax_p
> + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> cfgchanged = true;
> - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> - arg0, arg1,
> - early_p))
> + else if (!diamond_minmax_p
> + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> + arg0, arg1, early_p))
> cfgchanged = true;
> else if (!early_p
> + && !diamond_minmax_p
> && single_pred_p (bb1)
> && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> phi, arg0, arg1))
> cfgchanged = true;
> - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> + diamond_minmax_p))
> cfgchanged = true;
> else if (single_pred_p (bb1)
> + && !diamond_minmax_p
> && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> cfgchanged = true;
> }
> @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>
> static void
> replace_phi_edge_with_variable (basic_block cond_block,
> - edge e, gphi *phi, tree new_tree)
> + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> {
> basic_block bb = gimple_bb (phi);
> gimple_stmt_iterator gsi;
> @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> else
> edge_to_remove = EDGE_SUCC (cond_block, 0);
> - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
why do you need this change?
Did you check whether the new case works when the merge block has
more than two incoming edges?
> {
> e->flags |= EDGE_FALLTHRU;
> e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> @@ -1564,15 +1601,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> return 0;
> }
>
> +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
> + the value being inverted. */
> +
> +static tree
> +strip_bit_not (tree var)
> +{
> + if (TREE_CODE (var) != SSA_NAME)
> + return NULL_TREE;
> +
> + gimple *assign = SSA_NAME_DEF_STMT (var);
> + if (gimple_code (assign) != GIMPLE_ASSIGN)
> + return NULL_TREE;
> +
> + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> + return NULL_TREE;
> +
> + return gimple_assign_rhs1 (assign);
> +}
> +
> +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> +
> +enum tree_code
> +invert_minmax_code (enum tree_code code)
> +{
> + switch (code) {
> + case MIN_EXPR:
> + return MAX_EXPR;
> + case MAX_EXPR:
> + return MIN_EXPR;
> + default:
> + gcc_unreachable ();
> + }
> +}
> +
> /* The function minmax_replacement does the main work of doing the minmax
> replacement. Return true if the replacement is done. Otherwise return
> false.
> BB is the basic block where the replacement is going to be done on. ARG0
> - is argument 0 from the PHI. Likewise for ARG1. */
> + is argument 0 from the PHI. Likewise for ARG1.
> +
> + If THREEWAY_P then expect the BB to be laid out in diamond shape with each
> + BB containing only a MIN or MAX expression. */
>
> static bool
> -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> +minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
> + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
> {
> tree result;
> edge true_edge, false_edge;
> @@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> if (false_edge->dest == middle_bb)
> false_edge = EDGE_SUCC (false_edge->dest, 0);
>
> + /* When THREEWAY_P then e1 will point to the edge of the final transition
> + from middle-bb to end. */
> if (true_edge == e0)
> {
> - gcc_assert (false_edge == e1);
> + if (threeway_p)
> + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> + else
> + gcc_assert (false_edge == e1);
> arg_true = arg0;
> arg_false = arg1;
> }
> @@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> else
> return false;
> }
> + else if (middle_bb != alt_middle_bb && threeway_p)
> + {
> + /* Recognize the following case:
> +
> + if (smaller < larger)
> + a = MIN (smaller, c);
> + else
> + b = MIN (larger, c);
> + x = PHI <a, b>
> +
> + This is equivalent to
> +
> + a = MIN (smaller, c);
> + x = MIN (larger, a); */
> +
> + gimple *assign = last_and_only_stmt (middle_bb);
> + tree lhs, op0, op1, bound;
> + tree alt_lhs, alt_op0, alt_op1;
> + bool invert = false;
> +
> + if (!single_pred_p (middle_bb)
> + || !single_pred_p (alt_middle_bb))
> + return false;
> +
> + if (!assign
> + || gimple_code (assign) != GIMPLE_ASSIGN)
> + return false;
> +
> + lhs = gimple_assign_lhs (assign);
> + ass_code = gimple_assign_rhs_code (assign);
> + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> + return false;
> +
> + op0 = gimple_assign_rhs1 (assign);
> + op1 = gimple_assign_rhs2 (assign);
> +
> + assign = last_and_only_stmt (alt_middle_bb);
> + if (!assign
> + || gimple_code (assign) != GIMPLE_ASSIGN)
> + return false;
> +
> + alt_lhs = gimple_assign_lhs (assign);
> + if (ass_code != gimple_assign_rhs_code (assign))
> + return false;
> +
> + alt_op0 = gimple_assign_rhs1 (assign);
> + alt_op1 = gimple_assign_rhs2 (assign);
> +
> + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> + return false;
> +
> + if ((operand_equal_for_phi_arg_p (op0, smaller)
> + || (alt_smaller
> + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> + || (alt_larger
> + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> + {
> + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> + return false;
> +
> + if ((arg0 = strip_bit_not (op0)) != NULL
> + && (arg1 = strip_bit_not (alt_op0)) != NULL
> + && (bound = strip_bit_not (op1)) != NULL)
> + {
> + minmax = MAX_EXPR;
> + ass_code = invert_minmax_code (ass_code);
> + invert = true;
> + }
> + else
> + {
> + bound = op1;
> + minmax = MIN_EXPR;
> + arg0 = op0;
> + arg1 = alt_op0;
> + }
> + }
> + else if ((operand_equal_for_phi_arg_p (op0, larger)
> + || (alt_larger
> + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> + || (alt_smaller
> + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> + {
> + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> + return false;
> +
> + if ((arg0 = strip_bit_not (op0)) != NULL
> + && (arg1 = strip_bit_not (alt_op0)) != NULL
> + && (bound = strip_bit_not (op1)) != NULL)
> + {
> + minmax = MIN_EXPR;
> + ass_code = invert_minmax_code (ass_code);
> + invert = true;
> + }
> + else
> + {
> + bound = op1;
> + minmax = MAX_EXPR;
> + arg0 = op0;
> + arg1 = alt_op0;
> + }
> + }
> + else
> + return false;
Did you check you have coverage for all cases above in your testcases?
> + /* Reset any range information from the basic block. */
> + reset_flow_sensitive_info_in_bb (cond_bb);
Huh. You need to reset flow-sensitive info of the middle-bb stmt
that prevails only...
> + /* Emit the statement to compute min/max. */
> + gimple_seq stmts = NULL;
> + tree phi_result = PHI_RESULT (phi);
> + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
> + result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
... but you are re-building both here. And also you drop locations, the
preserved min/max should keep the old, the new should get the location
of ... hmm, the condition possibly?
> + if (invert)
> + result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE (phi_result), result);
> +
> + gsi = gsi_last_bb (cond_bb);
> + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> +
> + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> + return true;
> + }
> else
> {
> /* Recognize the following case, assuming d <= u:
>
>
>
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
* RE: [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted.
2022-06-20 8:18 ` Richard Sandiford
@ 2022-06-20 8:49 ` Tamar Christina
2022-06-21 7:43 ` Richard Biener
0 siblings, 1 reply; 26+ messages in thread
From: Tamar Christina @ 2022-06-20 8:49 UTC (permalink / raw)
To: Richard Sandiford, Richard Biener via Gcc-patches
Cc: Richard Biener, Richard Guenther, nd
> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Monday, June 20, 2022 9:19 AM
> To: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
> Cc: Tamar Christina <Tamar.Christina@arm.com>; Richard Biener
> <richard.guenther@gmail.com>; Richard Guenther <rguenther@suse.de>;
> nd <nd@arm.com>
> Subject: Re: [PATCH 1/2]middle-end: Simplify subtract where both
> arguments are being bitwise inverted.
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Thu, Jun 16, 2022 at 1:10 PM Tamar Christina via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> Hi All,
> >>
> >> This adds a match.pd rule that drops the bitwise nots when both
> >> arguments to a subtract are inverted, i.e. for:
> >>
> >> float g(float a, float b)
> >> {
> >> return ~(int)a - ~(int)b;
> >> }
> >>
> >> we instead generate
> >>
> >> float g(float a, float b)
> >> {
> >> return (int)a - (int)b;
> >> }
> >>
> >> We already do a limited version of this from the fold_binary fold
> >> functions but this makes a more general version in match.pd that applies
> >> more often.
> >>
> >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >>
> >> Ok for master?
> >>
> >> Thanks,
> >> Tamar
> >>
> >> gcc/ChangeLog:
> >>
> >> * match.pd: New bit_not rule.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> * gcc.dg/subnot.c: New test.
> >>
> >> --- inline copy of patch --
> >> diff --git a/gcc/match.pd b/gcc/match.pd
> >> index a59b6778f661cf9121dd3503f43472871e4da445..51b0a1b562409af535e53828a10c30b8a3e1ae2e 100644
> >> --- a/gcc/match.pd
> >> +++ b/gcc/match.pd
> >> @@ -1258,6 +1258,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >> (simplify
> >> (bit_not (plus:c (bit_not @0) @1))
> >> (minus @0 @1))
> >> +/* (~X - ~Y) -> X - Y. */
> >> +(simplify
> >> + (minus (bit_not @0) (bit_not @1))
> >> + (minus @0 @1))
> >
> > It doesn't seem correct.
> >
> > (gdb) p/x ~-1 - ~0x80000000
> > $3 = 0x80000001
> > (gdb) p/x -1 - 0x80000000
> > $4 = 0x7fffffff
> >
> > where I was looking for a case exposing undefined integer overflow.
>
> Yeah, shouldn't it be folding to (minus @1 @0) instead?
>
> ~X = (-X - 1)
> ~Y = (-Y - 1)
>
> so:
>
> ~X - ~Y = (-X - 1) - (-Y - 1)
> = -X - 1 + Y + 1
> = Y - X
>
You're right, sorry, I should have paid more attention when I wrote the patch.
Tamar
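
The corrected identity can be sanity-checked outside the compiler. The
following is a minimal sketch and not part of the patch: all function names
are hypothetical, and it uses unsigned 32-bit arithmetic so that wraparound is
well defined (the signed case additionally has to consider overflow). It
replays the counterexample quoted above and compares ~x - ~y against y - x on
a few boundary values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names come from GCC.  */

/* The original expression: ~x - ~y.  */
uint32_t
sub_of_nots (uint32_t x, uint32_t y)
{
  return ~x - ~y;
}

/* What the rule as posted folds to: x - y.  */
uint32_t
fold_as_posted (uint32_t x, uint32_t y)
{
  return x - y;
}

/* What the derivation above says it should fold to: y - x.  */
uint32_t
fold_swapped (uint32_t x, uint32_t y)
{
  return y - x;
}

/* The counterexample from the thread: x = -1, y = 0x80000000.  */
void
check_counterexample (void)
{
  assert (sub_of_nots (0xffffffffu, 0x80000000u) == 0x80000001u);
  assert (fold_as_posted (0xffffffffu, 0x80000000u) == 0x7fffffffu);
  assert (fold_swapped (0xffffffffu, 0x80000000u) == 0x80000001u);
}

/* Count sample pairs where ~x - ~y differs from y - x.  */
int
count_mismatches (void)
{
  static const uint32_t samples[]
    = { 0u, 1u, 2u, 0x7fffffffu, 0x80000000u, 0xfffffffeu, 0xffffffffu };
  const size_t n = sizeof samples / sizeof samples[0];
  int mismatches = 0;

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      if (sub_of_nots (samples[i], samples[j])
          != fold_swapped (samples[i], samples[j]))
        mismatches++;

  return mismatches;
}
```

Calling check_counterexample () and checking that count_mismatches () returns
zero from a small main is enough to exercise it.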
> Richard
>
>
> > Richard.
> >
> >>
> >> /* ~(X - Y) -> ~X + Y. */
> >> (simplify
> >> diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
> >> new file mode 100644
> >> index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/subnot.c
> >> @@ -0,0 +1,9 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O -fdump-tree-optimized" } */
> >> +
> >> +float g(float a, float b)
> >> +{
> >> + return ~(int)a - ~(int)b;
> >> +}
> >> +
> >> +/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
> >>
> >>
> >>
> >>
> >> --
^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-20 8:36 ` Richard Biener
@ 2022-06-20 9:01 ` Tamar Christina
2022-06-21 13:15 ` Richard Biener
2022-07-05 15:25 ` Tamar Christina
1 sibling, 1 reply; 26+ messages in thread
From: Tamar Christina @ 2022-06-20 9:01 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, nd, jakub
> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, June 20, 2022 9:36 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> max/min.
>
> On Thu, 16 Jun 2022, Tamar Christina wrote:
>
> > Hi All,
> >
> > This patch adds support for three-way min/max recognition in phi-opts.
> >
> > Concretely for e.g.
> >
> > #include <stdint.h>
> >
> > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > uint8_t xk;
> > if (xc < xm) {
> > xk = (uint8_t) (xc < xy ? xc : xy);
> > } else {
> > xk = (uint8_t) (xm < xy ? xm : xy);
> > }
> > return xk;
> > }
> >
> > we generate:
> >
> > <bb 2> [local count: 1073741824]:
> > _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
> > _7 = MIN_EXPR <xm_2(D), _5>;
> > return _7;
> >
> > instead of
> >
> > <bb 2>:
> > if (xc_2(D) < xm_3(D))
> > goto <bb 3>;
> > else
> > goto <bb 4>;
> >
> > <bb 3>:
> > xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
> > goto <bb 5>;
> >
> > <bb 4>:
> > xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
> >
> > <bb 5>:
> > # xk_1 = PHI <xk_5(3), xk_6(4)>
> > return xk_1;
> >
> > The same function also immediately deals with turning a minimization
> > problem into a maximization one if the results are inverted. We do
> > this here since doing it in match.pd would end up changing the shape
> > of the BBs and adding additional instructions which would prevent various
> > optimizations from working.
>
> Can you explain a bit more?
I'll respond to this one first in case it changes how you want me to proceed.
I had initially used a match.pd rule to do the min to max conversion, but a
number of testcases started to fail. The reason was that a lot of the foldings
checked that the BB contains only a single SSA and that that SSA is a phi node.
By changing the min into max, the negation of the result ends up in the same BB
and so the optimizations are skipped, leading to less optimal code.
I did look into relaxing those phi opts, but it felt like I'd be making a rather
arbitrary exception for minus, and it seemed better to handle it in the minmax folding.
Thanks,
Tamar
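
For reference, the min to max conversion mentioned above rests on the identity
MIN (~a, ~b) == ~MAX (a, b). The following is a minimal sketch, assuming that
is the identity in question; the helper names are hypothetical and it works on
uint8_t as in the testcases. It checks the three-operand form exhaustively.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names come from GCC.  */
static uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }
static uint8_t not_u8 (uint8_t a) { return (uint8_t) ~a; }

/* MIN (MIN (~a, ~c), ~b): a three-way minimum over inverted inputs.  */
uint8_t
min_of_nots (uint8_t a, uint8_t b, uint8_t c)
{
  return min_u8 (min_u8 (not_u8 (a), not_u8 (c)), not_u8 (b));
}

/* ~MAX (MAX (a, c), b): the same value with a single inversion at the end.  */
uint8_t
not_of_max (uint8_t a, uint8_t b, uint8_t c)
{
  return not_u8 (max_u8 (max_u8 (a, c), b));
}

/* Count the (a, b, c) triples for which the two forms disagree.  */
int
count_inversion_mismatches (void)
{
  int mismatches = 0;

  for (int a = 0; a < 256; a++)
    for (int b = 0; b < 256; b++)
      for (int c = 0; c < 256; c++)
        if (min_of_nots ((uint8_t) a, (uint8_t) b, (uint8_t) c)
            != not_of_max ((uint8_t) a, (uint8_t) b, (uint8_t) c))
          mismatches++;

  return mismatches;
}
```

The second form needs one BIT_NOT instead of three, which is the saving the
conversion is after.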
>
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> > sequence of a three-way conditional.
> > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> > min/max.
> > (strip_bit_not, invert_minmax_code): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> > code away.
> > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > * gcc.dg/tree-ssa/minmax-8.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm > xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xy < xc ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > @@ -0,0 +1,16 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm > xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > @@ -1,5 +1,5 @@
> > /* { dg-do run } */
> > -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> > +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > index 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2e367d2d21421b1 100644
> > --- a/gcc/tree-ssa-phiopt.cc
> > +++ b/gcc/tree-ssa-phiopt.cc
> > @@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> > gimple *);
> > static int value_replacement (basic_block, basic_block,
> > edge, edge, gphi *, tree, tree);
> > -static bool minmax_replacement (basic_block, basic_block,
> > - edge, edge, gphi *, tree, tree);
> > +static bool minmax_replacement (basic_block, basic_block, basic_block,
> > + edge, edge, gphi *, tree, tree, bool);
> > static bool spaceship_replacement (basic_block, basic_block,
> > edge, edge, gphi *, tree, tree);
> > static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> > @@ -73,7 +73,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> > hash_set<tree> *);
> > static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> > static hash_set<tree> * get_non_trapping ();
> > -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> > +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
> > static void hoist_adjacent_loads (basic_block, basic_block,
> > basic_block, basic_block);
> > static bool gate_hoist_loads (void);
> > @@ -199,6 +199,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > basic_block bb1, bb2;
> > edge e1, e2;
> > tree arg0, arg1;
> > + bool diamond_minmax_p = false;
> >
> > bb = bb_order[i];
> >
> > @@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > hoist_adjacent_loads (bb, bb1, bb2, bb3);
> > continue;
> > }
> > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > + && single_succ_p (bb1)
> > + && single_succ_p (bb2)
> > + && single_pred_p (bb1)
> > + && single_pred_p (bb2)
> > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
>
> please do the single_succ/pred checks below where appropriate, also what's
> the last check about? why does the merge block need a single successor?
>
> > + {
> > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
> > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
> > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > + {
> > + gimple *stmt1 = gsi_stmt (it1);
> > + gimple *stmt2 = gsi_stmt (it2);
> > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > + {
> > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > + diamond_minmax_p
> > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > + }
> > + }
> > + }
>
> I'd generalize this to general diamond detection, simply cutting off
> *_replacement workers that do not handle diamonds and do appropriate
> checks in minmax_replacement only.
>
> > else
> > continue;
> >
> > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > if (!candorest)
> > continue;
> >
> > + /* Check that we're looking for nested phis. */
> > + if (phis == NULL && diamond_minmax_p)
> > + {
> > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > + e2 = EDGE_SUCC (bb2, 0);
> > + }
> > +
>
> instead
>
> basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> gimple_seq phis = phi_nodes (merge);
>
>
> > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > if (!phi)
> > continue;
> > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> >
> > gphi *newphi;
> > if (single_pred_p (bb1)
> > + && !diamond_minmax_p
> > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > arg0, arg1,
> > cond_stmt)))
> > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > }
> >
> > /* Do the replacement of conditional if it can be done. */
> > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > + if (!early_p
> > + && !diamond_minmax_p
> > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > cfgchanged = true;
> > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > - arg0, arg1,
> > - early_p))
> > + else if (!diamond_minmax_p
> > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > + arg0, arg1, early_p))
> > cfgchanged = true;
> > else if (!early_p
> > + && !diamond_minmax_p
> > && single_pred_p (bb1)
> > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> > phi, arg0, arg1))
> > cfgchanged = true;
> > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > + diamond_minmax_p))
> > cfgchanged = true;
> > else if (single_pred_p (bb1)
> > + && !diamond_minmax_p
> > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > cfgchanged = true;
> > }
> > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> >
> > static void
> > replace_phi_edge_with_variable (basic_block cond_block,
> > - edge e, gphi *phi, tree new_tree)
> > + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> > {
> > basic_block bb = gimple_bb (phi);
> > gimple_stmt_iterator gsi;
> > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
> > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > else
> > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
>
> why do you need this change?
>
> Did you check whether the new case works when the merge block has more
> than two incoming edges?
>
> > {
> > e->flags |= EDGE_FALLTHRU;
> > e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> > @@ -1564,15 +1601,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> > return 0;
> > }
> >
> > +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
> > + the value being inverted. */
> > +
> > +static tree
> > +strip_bit_not (tree var)
> > +{
> > + if (TREE_CODE (var) != SSA_NAME)
> > + return NULL_TREE;
> > +
> > + gimple *assign = SSA_NAME_DEF_STMT (var);
> > + if (gimple_code (assign) != GIMPLE_ASSIGN)
> > + return NULL_TREE;
> > +
> > + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> > + return NULL_TREE;
> > +
> > + return gimple_assign_rhs1 (assign);
> > +}
> > +
> > +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> > +
> > +enum tree_code
> > +invert_minmax_code (enum tree_code code) {
> > + switch (code) {
> > + case MIN_EXPR:
> > + return MAX_EXPR;
> > + case MAX_EXPR:
> > + return MIN_EXPR;
> > + default:
> > + gcc_unreachable ();
> > + }
> > +}
> > +
> > /* The function minmax_replacement does the main work of doing the minmax
> > replacement. Return true if the replacement is done. Otherwise return
> > false.
> > BB is the basic block where the replacement is going to be done on. ARG0
> > - is argument 0 from the PHI. Likewise for ARG1. */
> > + is argument 0 from the PHI. Likewise for ARG1.
> > +
> > + If THREEWAY_P then expect the BB to be laid out in diamond shape with each
> > + BB containing only a MIN or MAX expression. */
> >
> > static bool
> > -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> > +minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
> > + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
> > {
> > tree result;
> > edge true_edge, false_edge;
> > @@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > if (false_edge->dest == middle_bb)
> > false_edge = EDGE_SUCC (false_edge->dest, 0);
> >
> > + /* When THREEWAY_P then e1 will point to the edge of the final transition
> > + from middle-bb to end. */
> > if (true_edge == e0)
> > {
> > - gcc_assert (false_edge == e1);
> > + if (threeway_p)
> > + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> > + else
> > + gcc_assert (false_edge == e1);
> > arg_true = arg0;
> > arg_false = arg1;
> > }
> > @@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > else
> > return false;
> > }
> > + else if (middle_bb != alt_middle_bb && threeway_p)
> > + {
> > + /* Recognize the following case:
> > +
> > + if (smaller < larger)
> > + a = MIN (smaller, c);
> > + else
> > + b = MIN (larger, c);
> > + x = PHI <a, b>
> > +
> > + This is equivalent to
> > +
> > + a = MIN (smaller, c);
> > + x = MIN (larger, a); */
> > +
> > + gimple *assign = last_and_only_stmt (middle_bb);
> > + tree lhs, op0, op1, bound;
> > + tree alt_lhs, alt_op0, alt_op1;
> > + bool invert = false;
> > +
> > + if (!single_pred_p (middle_bb)
> > + || !single_pred_p (alt_middle_bb))
> > + return false;
> > +
> > + if (!assign
> > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > + return false;
> > +
> > + lhs = gimple_assign_lhs (assign);
> > + ass_code = gimple_assign_rhs_code (assign);
> > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > + return false;
> > +
> > + op0 = gimple_assign_rhs1 (assign);
> > + op1 = gimple_assign_rhs2 (assign);
> > +
> > + assign = last_and_only_stmt (alt_middle_bb);
> > + if (!assign
> > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > + return false;
> > +
> > + alt_lhs = gimple_assign_lhs (assign);
> > + if (ass_code != gimple_assign_rhs_code (assign))
> > + return false;
> > +
> > + alt_op0 = gimple_assign_rhs1 (assign);
> > + alt_op1 = gimple_assign_rhs2 (assign);
> > +
> > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > + return false;
> > +
> > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > + || (alt_smaller
> > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > + || (alt_larger
> > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > + {
> > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > + return false;
> > +
> > + if ((arg0 = strip_bit_not (op0)) != NULL
> > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > + && (bound = strip_bit_not (op1)) != NULL)
> > + {
> > + minmax = MAX_EXPR;
> > + ass_code = invert_minmax_code (ass_code);
> > + invert = true;
> > + }
> > + else
> > + {
> > + bound = op1;
> > + minmax = MIN_EXPR;
> > + arg0 = op0;
> > + arg1 = alt_op0;
> > + }
> > + }
> > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > + || (alt_larger
> > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > + || (alt_smaller
> > + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> > + {
> > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > + return false;
> > +
> > + if ((arg0 = strip_bit_not (op0)) != NULL
> > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > + && (bound = strip_bit_not (op1)) != NULL)
> > + {
> > + minmax = MIN_EXPR;
> > + ass_code = invert_minmax_code (ass_code);
> > + invert = true;
> > + }
> > + else
> > + {
> > + bound = op1;
> > + minmax = MAX_EXPR;
> > + arg0 = op0;
> > + arg1 = alt_op0;
> > + }
> > + }
> > + else
> > + return false;
>
> Did you check you have coverage for all cases above in your testcases?
>
> > + /* Reset any range information from the basic block. */
> > + reset_flow_sensitive_info_in_bb (cond_bb);
>
> Huh. You need to reset flow-sensitive info of the middle-bb stmt that
> prevails only...
>
> > + /* Emit the statement to compute min/max. */
> > + gimple_seq stmts = NULL;
> > + tree phi_result = PHI_RESULT (phi);
> > + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
> > + result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
>
> ... but you are re-building both here. And also you drop locations, the
> preserved min/max should keep the old, the new should get the location of
> ... hmm, the condition possibly?
>
> > + if (invert)
> > + result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE (phi_result),
> > + result);
> > +
> > + gsi = gsi_last_bb (cond_bb);
> > + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> > +
> > + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> > + return true;
> > + }
> > else
> > {
> > /* Recognize the following case, assuming d <= u:
> >
> >
> >
> >
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461
> Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> Boudien Moerman; HRB 36809 (AG Nuernberg)
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-16 11:09 ` [PATCH 2/2]middle-end: Support recognition of three-way max/min Tamar Christina
2022-06-20 8:36 ` Richard Biener
@ 2022-06-20 23:16 ` Andrew Pinski
2022-06-21 6:54 ` Richard Biener
2022-06-21 7:12 ` Tamar Christina
1 sibling, 2 replies; 26+ messages in thread
From: Andrew Pinski @ 2022-06-20 23:16 UTC (permalink / raw)
To: Tamar Christina; +Cc: GCC Patches, Jakub Jelinek, nd, Richard Guenther
On Thu, Jun 16, 2022 at 4:11 AM Tamar Christina via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi All,
>
> This patch adds support for three-way min/max recognition in phi-opts.
>
> Concretely for e.g.
>
> #include <stdint.h>
>
> uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> uint8_t xk;
> if (xc < xm) {
> xk = (uint8_t) (xc < xy ? xc : xy);
> } else {
> xk = (uint8_t) (xm < xy ? xm : xy);
> }
> return xk;
> }
>
> we generate:
>
> <bb 2> [local count: 1073741824]:
> _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
> _7 = MIN_EXPR <xm_2(D), _5>;
> return _7;
>
> instead of
>
> <bb 2>:
> if (xc_2(D) < xm_3(D))
> goto <bb 3>;
> else
> goto <bb 4>;
>
> <bb 3>:
> xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
> goto <bb 5>;
>
> <bb 4>:
> xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
>
> <bb 5>:
> # xk_1 = PHI <xk_5(3), xk_6(4)>
> return xk_1;
>
> The same function also immediately deals with turning a minimization problem
> into a maximization one if the results are inverted. We do this here since
> doing it in match.pd would end up changing the shape of the BBs and adding
> additional instructions which would prevent various optimizations from working.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> sequence of a three-way conditional.
> (replace_phi_edge_with_variable): Support deferring of BB removal.
> (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> min/max.
> (strip_bit_not, invert_minmax_code): New.
I have been working on getting rid of minmax_replacement and a few
others and only having match_simplify_replacement and having the
simplification logic all in match.pd instead.
Is there a reason why you can't expand match_simplify_replacement and match.pd?
> The reason was that a lot of the foldings checked that the BB contains only
> a single SSA and that that SSA is a phi node.
Could you expand on that?
Thanks,
Andrew
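
Whichever place the logic ends up in, the source-level equivalence that the
three-way recognition relies on can be checked exhaustively for uint8_t. The
following is a minimal sketch: three_min is copied from minmax-3.c, while
three_min_flat and count_three_min_mismatches are hypothetical names that
mirror the two MIN_EXPRs in the expected dump.

```c
#include <assert.h>
#include <stdint.h>

/* The source form from gcc.dg/tree-ssa/minmax-3.c.  */
uint8_t
three_min (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc < xm) {
    xk = (uint8_t) (xc < xy ? xc : xy);
  } else {
    xk = (uint8_t) (xm < xy ? xm : xy);
  }
  return xk;
}

/* Hypothetical branch-free form with the shape of the expected output.  */
uint8_t
three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t t = xc < xy ? xc : xy;  /* _5 = MIN_EXPR <xc, xy>  */
  return xm < t ? xm : t;         /* _7 = MIN_EXPR <xm, _5>  */
}

/* Count the (xc, xm, xy) triples for which the two forms disagree.  */
int
count_three_min_mismatches (void)
{
  int mismatches = 0;

  for (int xc = 0; xc < 256; xc++)
    for (int xm = 0; xm < 256; xm++)
      for (int xy = 0; xy < 256; xy++)
        if (three_min ((uint8_t) xc, (uint8_t) xm, (uint8_t) xy)
            != three_min_flat ((uint8_t) xc, (uint8_t) xm, (uint8_t) xy))
          mismatches++;

  return mismatches;
}
```

Both forms compute the minimum of the three inputs, so the comparison between
xc and xm in the source form is redundant once the inner selects are MIN_EXPRs.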
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> code away.
> * gcc.dg/tree-ssa/minmax-3.c: New test.
> * gcc.dg/tree-ssa/minmax-4.c: New test.
> * gcc.dg/tree-ssa/minmax-5.c: New test.
> * gcc.dg/tree-ssa/minmax-6.c: New test.
> * gcc.dg/tree-ssa/minmax-7.c: New test.
> * gcc.dg/tree-ssa/minmax-8.c: New test.
>
> --- inline copy of patch --
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xy < xc ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> @@ -1,5 +1,5 @@
> /* { dg-do run } */
> -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
>
> #include <stdio.h>
> #include <stdlib.h>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2e367d2d21421b1 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> gimple *);
> static int value_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree);
> -static bool minmax_replacement (basic_block, basic_block,
> - edge, edge, gphi *, tree, tree);
> +static bool minmax_replacement (basic_block, basic_block, basic_block,
> + edge, edge, gphi *, tree, tree, bool);
> static bool spaceship_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree);
> static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> @@ -73,7 +73,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> hash_set<tree> *);
> static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> static hash_set<tree> * get_non_trapping ();
> -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
> static void hoist_adjacent_loads (basic_block, basic_block,
> basic_block, basic_block);
> static bool gate_hoist_loads (void);
> @@ -199,6 +199,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> basic_block bb1, bb2;
> edge e1, e2;
> tree arg0, arg1;
> + bool diamond_minmax_p = false;
>
> bb = bb_order[i];
>
> @@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> hoist_adjacent_loads (bb, bb1, bb2, bb3);
> continue;
> }
> + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> + && single_succ_p (bb1)
> + && single_succ_p (bb2)
> + && single_pred_p (bb1)
> + && single_pred_p (bb2)
> + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> + {
> + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
> + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
> + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> + {
> + gimple *stmt1 = gsi_stmt (it1);
> + gimple *stmt2 = gsi_stmt (it2);
> + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> + {
> + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> + diamond_minmax_p
> + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> + }
> + }
> + }
> else
> continue;
>
> @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> if (!candorest)
> continue;
>
> + /* Check that we're looking for nested phis. */
> + if (phis == NULL && diamond_minmax_p)
> + {
> + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> + e2 = EDGE_SUCC (bb2, 0);
> + }
> +
> phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> if (!phi)
> continue;
> @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>
> gphi *newphi;
> if (single_pred_p (bb1)
> + && !diamond_minmax_p
> && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> arg0, arg1,
> cond_stmt)))
> @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> }
>
> /* Do the replacement of conditional if it can be done. */
> - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> + if (!early_p
> + && !diamond_minmax_p
> + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> cfgchanged = true;
> - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> - arg0, arg1,
> - early_p))
> + else if (!diamond_minmax_p
> + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> + arg0, arg1, early_p))
> cfgchanged = true;
> else if (!early_p
> + && !diamond_minmax_p
> && single_pred_p (bb1)
> && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> phi, arg0, arg1))
> cfgchanged = true;
> - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> + diamond_minmax_p))
> cfgchanged = true;
> else if (single_pred_p (bb1)
> + && !diamond_minmax_p
> && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> cfgchanged = true;
> }
> @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>
> static void
> replace_phi_edge_with_variable (basic_block cond_block,
> - edge e, gphi *phi, tree new_tree)
> + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> {
> basic_block bb = gimple_bb (phi);
> gimple_stmt_iterator gsi;
> @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> else
> edge_to_remove = EDGE_SUCC (cond_block, 0);
> - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> {
> e->flags |= EDGE_FALLTHRU;
> e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> @@ -1564,15 +1601,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> return 0;
> }
>
> +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
> + the value being inverted. */
> +
> +static tree
> +strip_bit_not (tree var)
> +{
> + if (TREE_CODE (var) != SSA_NAME)
> + return NULL_TREE;
> +
> + gimple *assign = SSA_NAME_DEF_STMT (var);
> + if (gimple_code (assign) != GIMPLE_ASSIGN)
> + return NULL_TREE;
> +
> + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> + return NULL_TREE;
> +
> + return gimple_assign_rhs1 (assign);
> +}
> +
> +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> +
> +enum tree_code
> +invert_minmax_code (enum tree_code code)
> +{
> + switch (code) {
> + case MIN_EXPR:
> + return MAX_EXPR;
> + case MAX_EXPR:
> + return MIN_EXPR;
> + default:
> + gcc_unreachable ();
> + }
> +}
> +
> /* The function minmax_replacement does the main work of doing the minmax
> replacement. Return true if the replacement is done. Otherwise return
> false.
> BB is the basic block where the replacement is going to be done on. ARG0
> - is argument 0 from the PHI. Likewise for ARG1. */
> + is argument 0 from the PHI. Likewise for ARG1.
> +
> + If THREEWAY_P then expect the BB to be laid out in diamond shape with each
> + BB containing only a MIN or MAX expression. */
>
> static bool
> -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> +minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
> + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
> {
> tree result;
> edge true_edge, false_edge;
> @@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> if (false_edge->dest == middle_bb)
> false_edge = EDGE_SUCC (false_edge->dest, 0);
>
> + /* When THREEWAY_P then e1 will point to the edge of the final transition
> + from middle-bb to end. */
> if (true_edge == e0)
> {
> - gcc_assert (false_edge == e1);
> + if (threeway_p)
> + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> + else
> + gcc_assert (false_edge == e1);
> arg_true = arg0;
> arg_false = arg1;
> }
> @@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> else
> return false;
> }
> + else if (middle_bb != alt_middle_bb && threeway_p)
> + {
> + /* Recognize the following case:
> +
> + if (smaller < larger)
> + a = MIN (smaller, c);
> + else
> + b = MIN (larger, c);
> + x = PHI <a, b>
> +
> + This is equivalent to
> +
> + a = MIN (smaller, c);
> + x = MIN (larger, a); */
> +
> + gimple *assign = last_and_only_stmt (middle_bb);
> + tree lhs, op0, op1, bound;
> + tree alt_lhs, alt_op0, alt_op1;
> + bool invert = false;
> +
> + if (!single_pred_p (middle_bb)
> + || !single_pred_p (alt_middle_bb))
> + return false;
> +
> + if (!assign
> + || gimple_code (assign) != GIMPLE_ASSIGN)
> + return false;
> +
> + lhs = gimple_assign_lhs (assign);
> + ass_code = gimple_assign_rhs_code (assign);
> + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> + return false;
> +
> + op0 = gimple_assign_rhs1 (assign);
> + op1 = gimple_assign_rhs2 (assign);
> +
> + assign = last_and_only_stmt (alt_middle_bb);
> + if (!assign
> + || gimple_code (assign) != GIMPLE_ASSIGN)
> + return false;
> +
> + alt_lhs = gimple_assign_lhs (assign);
> + if (ass_code != gimple_assign_rhs_code (assign))
> + return false;
> +
> + alt_op0 = gimple_assign_rhs1 (assign);
> + alt_op1 = gimple_assign_rhs2 (assign);
> +
> + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> + return false;
> +
> + if ((operand_equal_for_phi_arg_p (op0, smaller)
> + || (alt_smaller
> + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> + || (alt_larger
> + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> + {
> + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> + return false;
> +
> + if ((arg0 = strip_bit_not (op0)) != NULL
> + && (arg1 = strip_bit_not (alt_op0)) != NULL
> + && (bound = strip_bit_not (op1)) != NULL)
> + {
> + minmax = MAX_EXPR;
> + ass_code = invert_minmax_code (ass_code);
> + invert = true;
> + }
> + else
> + {
> + bound = op1;
> + minmax = MIN_EXPR;
> + arg0 = op0;
> + arg1 = alt_op0;
> + }
> + }
> + else if ((operand_equal_for_phi_arg_p (op0, larger)
> + || (alt_larger
> + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> + || (alt_smaller
> + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> + {
> + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> + return false;
> +
> + if ((arg0 = strip_bit_not (op0)) != NULL
> + && (arg1 = strip_bit_not (alt_op0)) != NULL
> + && (bound = strip_bit_not (op1)) != NULL)
> + {
> + minmax = MIN_EXPR;
> + ass_code = invert_minmax_code (ass_code);
> + invert = true;
> + }
> + else
> + {
> + bound = op1;
> + minmax = MAX_EXPR;
> + arg0 = op0;
> + arg1 = alt_op0;
> + }
> + }
> + else
> + return false;
> +
> + /* Reset any range information from the basic block. */
> + reset_flow_sensitive_info_in_bb (cond_bb);
> +
> + /* Emit the statement to compute min/max. */
> + gimple_seq stmts = NULL;
> + tree phi_result = PHI_RESULT (phi);
> + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
> + result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
> + if (invert)
> + result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE (phi_result), result);
> +
> + gsi = gsi_last_bb (cond_bb);
> + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> +
> + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> +
> + return true;
> + }
> else
> {
> /* Recognize the following case, assuming d <= u:
>
>
>
>
> --
* Re: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-20 23:16 ` Andrew Pinski
@ 2022-06-21 6:54 ` Richard Biener
2022-06-21 7:12 ` Tamar Christina
1 sibling, 0 replies; 26+ messages in thread
From: Richard Biener @ 2022-06-21 6:54 UTC (permalink / raw)
To: Andrew Pinski; +Cc: Tamar Christina, GCC Patches, Jakub Jelinek, nd
On Mon, 20 Jun 2022, Andrew Pinski wrote:
> On Thu, Jun 16, 2022 at 4:11 AM Tamar Christina via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi All,
> >
> > This patch adds support for three-way min/max recognition in phi-opts.
> >
> > Concretely for e.g.
> >
> > #include <stdint.h>
> >
> > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > uint8_t xk;
> > if (xc < xm) {
> > xk = (uint8_t) (xc < xy ? xc : xy);
> > } else {
> > xk = (uint8_t) (xm < xy ? xm : xy);
> > }
> > return xk;
> > }
> >
> > we generate:
> >
> > <bb 2> [local count: 1073741824]:
> > _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
> > _7 = MIN_EXPR <xm_2(D), _5>;
> > return _7;
> >
> > instead of
> >
> > <bb 2>:
> > if (xc_2(D) < xm_3(D))
> > goto <bb 3>;
> > else
> > goto <bb 4>;
> >
> > <bb 3>:
> > xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
> > goto <bb 5>;
> >
> > <bb 4>:
> > xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
> >
> > <bb 5>:
> > # xk_1 = PHI <xk_5(3), xk_6(4)>
> > return xk_1;
> >
> > The same function also immediately deals with turning a minimization problem
> > into a maximization one if the results are inverted. We do this here since
> > doing it in match.pd would end up changing the shape of the BBs and adding
> > additional instructions which would prevent various optimizations from working.
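For illustration only (this is not part of the patch): the min-to-max conversion described in the paragraph above rests on the identity MIN (~a, ~b) == ~MAX (a, b) for fixed-width unsigned values, since bitwise NOT reverses their ordering. Below is a minimal standalone C sketch that checks the identity exhaustively for uint8_t. The helper names (min_u8, max_u8, not_u8, count_inverted_minmax_mismatches) are made up for this sketch and do not exist in GCC.

```c
#include <stdint.h>

/* Hypothetical helpers for this sketch; none of these are GCC functions.  */
static uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* ~a promotes to int in C, so truncate back to 8 bits explicitly.  */
static uint8_t not_u8 (uint8_t a) { return (uint8_t) ~a; }

/* Return the number of (a, b) pairs for which MIN (~a, ~b) differs
   from ~MAX (a, b).  All 256 * 256 combinations are tried.  */
unsigned
count_inverted_minmax_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      {
        uint8_t lhs = min_u8 (not_u8 ((uint8_t) a), not_u8 ((uint8_t) b));
        uint8_t rhs = not_u8 (max_u8 ((uint8_t) a, (uint8_t) b));
        if (lhs != rhs)
          bad++;
      }
  return bad;
}
```

If the identity holds, the function returns 0. This only demonstrates the arithmetic fact; it makes no claim about the exact statements the pass emits.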
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> > sequence of a three-way conditional.
> > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> > min/max.
> > (strip_bit_not, invert_minmax_code): New.
>
> I have been working on getting rid of minmax_replacement and a few
> others and only having match_simplify_replacement and having the
> simplification logic all in match.pd instead.
> Is there a reason why you can't expand match_simplify_replacement and match.pd?
Btw, I have the below change pending to remove GENERIC comparison op
handling from GIMPLE matching. I don't yet like the phi-op part very
much which is why I didn't push that yet.
Maybe you can take that into account when extending
match_simplify_replacement... (and maybe you have a nicer idea)
Richard.
From 5c2428227e2fbfde72244dbc4aabeecf70c763ed Mon Sep 17 00:00:00 2001
From: Richard Biener <rguenther@suse.de>
Date: Tue, 17 May 2022 09:58:59 +0200
Subject: [PATCH] Remove genmatch GENERIC condition in COND_EXPR support
To: gcc-patches@gcc.gnu.org
The following removes support for matching a GENERIC condition
in COND_EXPRs when doing GIMPLE matching. This cuts 5% of the
size of gimple-match.cc.
Unfortunately it makes phiopt a bit more awkward since the
COND_EXPRs it tries to simplify no longer fit the single
gimple_match_op but instead the comparison now needs to be
separately built. That also means it is emitted and we have
to avoid leaving it around when it is not actually used by
the simplification to make the cascading transforms work.
Handling insertion only of the used parts of a sequence as
produced by simplification or stmt build and simplification
is something that asks to be separated out in a convenient
way and I have to think on how to best do this still.
2022-05-17 Richard Biener <rguenther@suse.de>
* genmatch.cc ():
* gimple-match-head.cc ():
* tree-ssa-phiopt.cc ():
---
gcc/genmatch.cc | 140 +++------------------------------------
gcc/gimple-match-head.cc | 61 +++--------------
gcc/tree-ssa-phiopt.cc | 45 ++++++++++---
3 files changed, 57 insertions(+), 189 deletions(-)
diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
index 2b84b849330..5fbe5aa28b3 100644
--- a/gcc/genmatch.cc
+++ b/gcc/genmatch.cc
@@ -697,12 +697,12 @@ public:
expr (id_base *operation_, location_t loc, bool is_commutative_ = false)
: operand (OP_EXPR, loc), operation (operation_),
ops (vNULL), expr_type (NULL), is_commutative (is_commutative_),
- is_generic (false), force_single_use (false), force_leaf (false),
+ force_single_use (false), force_leaf (false),
opt_grp (0) {}
expr (expr *e)
: operand (OP_EXPR, e->location), operation (e->operation),
ops (vNULL), expr_type (e->expr_type), is_commutative (e->is_commutative),
- is_generic (e->is_generic), force_single_use (e->force_single_use),
+ force_single_use (e->force_single_use),
force_leaf (e->force_leaf), opt_grp (e->opt_grp) {}
void append_op (operand *op) { ops.safe_push (op); }
/* The operator and its operands. */
@@ -713,8 +713,6 @@ public:
/* Whether the operation is to be applied commutatively. This is
later lowered to two separate patterns. */
bool is_commutative;
- /* Whether the expression is expected to be in GENERIC form. */
- bool is_generic;
/* Whether pushing any stmt to the sequence should be conditional
on this expression having a single-use. */
bool force_single_use;
@@ -1210,107 +1208,6 @@ lower_opt (simplify *s, vec<simplify *>& simplifiers)
}
}
-/* Lower the compare operand of COND_EXPRs to a
- GENERIC and a GIMPLE variant. */
-
-static vec<operand *>
-lower_cond (operand *o)
-{
- vec<operand *> ro = vNULL;
-
- if (capture *c = dyn_cast<capture *> (o))
- {
- if (c->what)
- {
- vec<operand *> lop = vNULL;
- lop = lower_cond (c->what);
-
- for (unsigned i = 0; i < lop.length (); ++i)
- ro.safe_push (new capture (c->location, c->where, lop[i],
- c->value_match));
- return ro;
- }
- }
-
- expr *e = dyn_cast<expr *> (o);
- if (!e || e->ops.length () == 0)
- {
- ro.safe_push (o);
- return ro;
- }
-
- vec< vec<operand *> > ops_vector = vNULL;
- for (unsigned i = 0; i < e->ops.length (); ++i)
- ops_vector.safe_push (lower_cond (e->ops[i]));
-
- auto_vec< vec<operand *> > result;
- auto_vec<operand *> v (e->ops.length ());
- v.quick_grow_cleared (e->ops.length ());
- cartesian_product (ops_vector, result, v, 0);
-
- for (unsigned i = 0; i < result.length (); ++i)
- {
- expr *ne = new expr (e);
- for (unsigned j = 0; j < result[i].length (); ++j)
- ne->append_op (result[i][j]);
- ro.safe_push (ne);
- /* If this is a COND with a captured expression or an
- expression with two operands then also match a GENERIC
- form on the compare. */
- if (*e->operation == COND_EXPR
- && ((is_a <capture *> (e->ops[0])
- && as_a <capture *> (e->ops[0])->what
- && is_a <expr *> (as_a <capture *> (e->ops[0])->what)
- && as_a <expr *>
- (as_a <capture *> (e->ops[0])->what)->ops.length () == 2)
- || (is_a <expr *> (e->ops[0])
- && as_a <expr *> (e->ops[0])->ops.length () == 2)))
- {
- ne = new expr (e);
- for (unsigned j = 0; j < result[i].length (); ++j)
- ne->append_op (result[i][j]);
- if (capture *c = dyn_cast <capture *> (ne->ops[0]))
- {
- expr *ocmp = as_a <expr *> (c->what);
- expr *cmp = new expr (ocmp);
- for (unsigned j = 0; j < ocmp->ops.length (); ++j)
- cmp->append_op (ocmp->ops[j]);
- cmp->is_generic = true;
- ne->ops[0] = new capture (c->location, c->where, cmp,
- c->value_match);
- }
- else
- {
- expr *ocmp = as_a <expr *> (ne->ops[0]);
- expr *cmp = new expr (ocmp);
- for (unsigned j = 0; j < ocmp->ops.length (); ++j)
- cmp->append_op (ocmp->ops[j]);
- cmp->is_generic = true;
- ne->ops[0] = cmp;
- }
- ro.safe_push (ne);
- }
- }
-
- return ro;
-}
-
-/* Lower the compare operand of COND_EXPRs to a
- GENERIC and a GIMPLE variant. */
-
-static void
-lower_cond (simplify *s, vec<simplify *>& simplifiers)
-{
- vec<operand *> matchers = lower_cond (s->match);
- for (unsigned i = 0; i < matchers.length (); ++i)
- {
- simplify *ns = new simplify (s->kind, s->id, matchers[i], s->result,
- s->for_vec, s->capture_ids);
- ns->for_subst_vec.safe_splice (s->for_subst_vec);
- simplifiers.safe_push (ns);
- }
-}
-
/* Return true if O refers to ID. */
bool
@@ -1541,7 +1438,7 @@ lower_for (simplify *sin, vec<simplify *>& simplifiers)
/* Lower the AST for everything in SIMPLIFIERS. */
static void
-lower (vec<simplify *>& simplifiers, bool gimple)
+lower (vec<simplify *>& simplifiers, bool)
{
auto_vec<simplify *> out_simplifiers;
for (auto s: simplifiers)
@@ -1560,11 +1457,7 @@ lower (vec<simplify *>& simplifiers, bool gimple)
lower_for (s, out_simplifiers);
simplifiers.truncate (0);
- if (gimple)
- for (auto s: out_simplifiers)
- lower_cond (s, simplifiers);
- else
- simplifiers.safe_splice (out_simplifiers);
+ simplifiers.safe_splice (out_simplifiers);
}
@@ -1742,8 +1635,7 @@ cmp_operand (operand *o1, operand *o2)
{
expr *e1 = static_cast<expr *>(o1);
expr *e2 = static_cast<expr *>(o2);
- return (e1->operation == e2->operation
- && e1->is_generic == e2->is_generic);
+ return e1->operation == e2->operation;
}
else
return false;
@@ -2815,26 +2707,16 @@ dt_operand::gen_gimple_expr (FILE *f, int indent, int depth)
if (id->kind == id_base::CODE)
{
- if (e->is_generic
- || *id == REALPART_EXPR || *id == IMAGPART_EXPR
+ if (*id == REALPART_EXPR || *id == IMAGPART_EXPR
|| *id == BIT_FIELD_REF || *id == VIEW_CONVERT_EXPR)
{
/* ??? If this is a memory operation we can't (and should not)
match this. The only sensible operand types are
SSA names and invariants. */
- if (e->is_generic)
- {
- char opname[20];
- get_name (opname);
- fprintf_indent (f, indent,
- "tree %s = TREE_OPERAND (%s, %i);\n",
- child_opname, opname, i);
- }
- else
- fprintf_indent (f, indent,
- "tree %s = TREE_OPERAND "
- "(gimple_assign_rhs1 (_a%d), %i);\n",
- child_opname, depth, i);
+ fprintf_indent (f, indent,
+ "tree %s = TREE_OPERAND "
+ "(gimple_assign_rhs1 (_a%d), %i);\n",
+ child_opname, depth, i);
fprintf_indent (f, indent,
"if ((TREE_CODE (%s) == SSA_NAME\n",
child_opname);
@@ -2940,7 +2822,7 @@ dt_node::gen_kids (FILE *f, int indent, bool gimple, int depth)
preds.safe_push (op);
else
{
- if (gimple && !e->is_generic)
+ if (gimple)
gimple_exprs.safe_push (op);
else
generic_exprs.safe_push (op);
diff --git a/gcc/gimple-match-head.cc b/gcc/gimple-match-head.cc
index 4c80d77f8ba..41a11a3cf64 100644
--- a/gcc/gimple-match-head.cc
+++ b/gcc/gimple-match-head.cc
@@ -150,17 +150,11 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op,
tree_code op_code = (tree_code) res_op->code;
bool op_could_trap;
- /* COND_EXPR will trap if, and only if, the condition
- traps and hence we have to check this. For all other operations, we
- don't need to consider the operands. */
- if (op_code == COND_EXPR)
- op_could_trap = generic_expr_could_trap_p (res_op->ops[0]);
- else
- op_could_trap = operation_could_trap_p ((tree_code) res_op->code,
- FLOAT_TYPE_P (res_op->type),
- honor_trapv,
- res_op->op_or_null (1));
-
+ /* For no operations we have to consider the operands. */
+ op_could_trap = operation_could_trap_p ((tree_code) res_op->code,
+ FLOAT_TYPE_P (res_op->type),
+ honor_trapv,
+ res_op->op_or_null (1));
if (!op_could_trap)
{
res_op->cond.cond = NULL_TREE;
@@ -922,11 +916,10 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op,
Both routines take a tree argument and returns a tree. */
-template<typename ValueizeOp, typename ValueizeCondition>
+template<typename ValueizeOp>
inline bool
gimple_extract (gimple *stmt, gimple_match_op *res_op,
- ValueizeOp valueize_op,
- ValueizeCondition valueize_condition)
+ ValueizeOp valueize_op)
{
switch (gimple_code (stmt))
{
@@ -977,11 +970,7 @@ gimple_extract (gimple *stmt, gimple_match_op *res_op,
}
case GIMPLE_TERNARY_RHS:
{
- tree rhs1 = gimple_assign_rhs1 (stmt);
- if (code == COND_EXPR && COMPARISON_CLASS_P (rhs1))
- rhs1 = valueize_condition (rhs1);
- else
- rhs1 = valueize_op (rhs1);
+ tree rhs1 = valueize_op (gimple_assign_rhs1 (stmt));
tree rhs2 = valueize_op (gimple_assign_rhs2 (stmt));
tree rhs3 = valueize_op (gimple_assign_rhs3 (stmt));
res_op->set_op (code, type, rhs1, rhs2, rhs3);
@@ -1053,7 +1042,7 @@ bool
gimple_extract_op (gimple *stmt, gimple_match_op *res_op)
{
auto nop = [](tree op) { return op; };
- return gimple_extract (stmt, res_op, nop, nop);
+ return gimple_extract (stmt, res_op, nop);
}
/* The main STMT based simplification entry. It is used by the fold_stmt
@@ -1068,38 +1057,8 @@ gimple_simplify (gimple *stmt, gimple_match_op *res_op, gimple_seq *seq,
{
return do_valueize (op, top_valueize, valueized);
};
- auto valueize_condition = [&](tree op) -> tree
- {
- bool cond_valueized = false;
- tree lhs = do_valueize (TREE_OPERAND (op, 0), top_valueize,
- cond_valueized);
- tree rhs = do_valueize (TREE_OPERAND (op, 1), top_valueize,
- cond_valueized);
- gimple_match_op res_op2 (res_op->cond, TREE_CODE (op),
- TREE_TYPE (op), lhs, rhs);
- if ((gimple_resimplify2 (seq, &res_op2, valueize)
- || cond_valueized)
- && res_op2.code.is_tree_code ())
- {
- auto code = tree_code (res_op2.code);
- if (TREE_CODE_CLASS (code) == tcc_comparison)
- {
- valueized = true;
- return build2 (code, TREE_TYPE (op),
- res_op2.ops[0], res_op2.ops[1]);
- }
- else if (code == SSA_NAME
- || code == INTEGER_CST
- || code == VECTOR_CST)
- {
- valueized = true;
- return res_op2.ops[0];
- }
- }
- return valueize_op (op);
- };
- if (!gimple_extract (stmt, res_op, valueize_op, valueize_condition))
+ if (!gimple_extract (stmt, res_op, valueize_op))
return false;
if (res_op->code.is_internal_fn ())
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index e61d9736937..8130a60e5bb 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -885,15 +885,17 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple *comp_stmt,
less efficient.
Don't use fold_build2 here as that might create (bool)a instead of just
"a != 0". */
- tree cond = build2_loc (loc, comp_code, boolean_type_node,
- cmp0, cmp1);
+ tree cond_def = gimple_build (&seq1, comp_code,
+ boolean_type_node, cmp0, cmp1);
gimple_match_op op (gimple_match_cond::UNCOND,
- COND_EXPR, type, cond, arg0, arg1);
+ COND_EXPR, type, cond_def, arg0, arg1);
if (op.resimplify (&seq1, follow_all_ssa_edges))
{
/* Early we want only to allow some generated tree codes. */
if (!early_p
+ /* ??? The following likely needs adjustments for the extra
+ comparison stmt. */
|| phiopt_early_allow (seq1, op))
{
result = maybe_push_res_to_seq (&op, &seq1);
@@ -915,11 +917,10 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple *comp_stmt,
if (comp_code == ERROR_MARK)
return NULL;
- cond = build2_loc (loc,
- comp_code, boolean_type_node,
- cmp0, cmp1);
+ cond_def = gimple_build (&seq1, comp_code,
+ boolean_type_node, cmp0, cmp1);
gimple_match_op op1 (gimple_match_cond::UNCOND,
- COND_EXPR, type, cond, arg1, arg0);
+ COND_EXPR, type, cond_def, arg1, arg0);
if (op1.resimplify (&seq1, follow_all_ssa_edges))
{
@@ -1031,9 +1032,35 @@ match_simplify_replacement (basic_block cond_bb, basic_block middle_bb,
return false;
gsi = gsi_last_bb (cond_bb);
- /* Insert the sequence generated from gimple_simplify_phiopt. */
+ /* Insert the sequence generated from gimple_simplify_phiopt. Remove
+ stmts no longer necessary and produced during intermediate
+ simplification. */
if (seq)
- gsi_insert_seq_before (&gsi, seq, GSI_CONTINUE_LINKING);
+ {
+ gsi_insert_seq_before (&gsi, seq, GSI_CONTINUE_LINKING);
+ do
+ {
+ gimple *s = gsi_stmt (gsi);
+ /* ??? Cleaning the sequence before inserting would be
+ nice, but then immediate uses and stmt operands are
+ so nice to have...
+ We could add a gsi_insert_seq_before wrapper doing
+ the insertion backwards and one stmt at a time. */
+ def_operand_p def;
+ if ((def = single_ssa_def_operand (s, SSA_OP_DEF))
+ && DEF_FROM_PTR (def) != result
+ && has_zero_uses (DEF_FROM_PTR (def))
+ && !gimple_has_side_effects (s))
+ {
+ gsi_remove (&gsi, true);
+ release_defs (s);
+ ggc_free (s);
+ }
+ else
+ gsi_next (&gsi);
+ }
+ while (gsi_stmt (gsi) != stmt);
+ }
/* If there was a statement to move and the result of the statement
is going to be used, move it to right before the original
--
2.35.3
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-20 23:16 ` Andrew Pinski
2022-06-21 6:54 ` Richard Biener
@ 2022-06-21 7:12 ` Tamar Christina
1 sibling, 0 replies; 26+ messages in thread
From: Tamar Christina @ 2022-06-21 7:12 UTC (permalink / raw)
To: Andrew Pinski; +Cc: GCC Patches, Jakub Jelinek, nd, Richard Guenther
> -----Original Message-----
> From: Andrew Pinski <pinskia@gmail.com>
> Sent: Tuesday, June 21, 2022 12:16 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Jakub Jelinek
> <jakub@redhat.com>; nd <nd@arm.com>; Richard Guenther
> <rguenther@suse.de>
> Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> max/min.
>
> On Thu, Jun 16, 2022 at 4:11 AM Tamar Christina via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi All,
> >
> > This patch adds support for three-way min/max recognition in phi-opts.
> >
> > Concretely for e.g.
> >
> > #include <stdint.h>
> >
> > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > uint8_t xk;
> > if (xc < xm) {
> > xk = (uint8_t) (xc < xy ? xc : xy);
> > } else {
> > xk = (uint8_t) (xm < xy ? xm : xy);
> > }
> > return xk;
> > }
> >
> > we generate:
> >
> > <bb 2> [local count: 1073741824]:
> > _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
> > _7 = MIN_EXPR <xm_2(D), _5>;
> > return _7;
> >
> > instead of
> >
> > <bb 2>:
> > if (xc_2(D) < xm_3(D))
> > goto <bb 3>;
> > else
> > goto <bb 4>;
> >
> > <bb 3>:
> > xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
> > goto <bb 5>;
> >
> > <bb 4>:
> > xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
> >
> > <bb 5>:
> > # xk_1 = PHI <xk_5(3), xk_6(4)>
> > return xk_1;
> >
> > The same function also immediately deals with turning a minimization
> > problem into a maximization one if the results are inverted. We do
> > this here since doing it in match.pd would end up changing the shape
> > of the BBs and adding additional instructions which would prevent various
> > optimizations from working.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> > sequence of a three-way conditional.
> > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> > min/max.
> > (strip_bit_not, invert_minmax_code): New.
>
> I have been working on getting rid of minmax_replacement and a few others
> and only having match_simplify_replacement and having the simplification
> logic all in match.pd instead.
> Is there a reason why you can't expand match_simplify_replacement and
> match.pd?
Because this is just a simple extension of minmax_replacement which just adds
a third case but re-uses all the validation and normalization code already present
in the pass.
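To make the three-way case concrete, here is a small standalone C sketch (not part of the patch) that compares the branchy source form from the cover letter with the flattened two-MIN form shown in the "we generate" dump, and does the same for the MAX variant from minmax-4.c. It checks every uint8_t input triple. The function names other than three_min and three_max are invented for this sketch.

```c
#include <stdint.h>

/* Branchy forms, copied from the minmax-3.c and minmax-4.c tests.  */
static uint8_t
three_min (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc < xm)
    xk = (uint8_t) (xc < xy ? xc : xy);
  else
    xk = (uint8_t) (xm < xy ? xm : xy);
  return xk;
}

static uint8_t
three_max (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc > xm)
    xk = (uint8_t) (xc > xy ? xc : xy);
  else
    xk = (uint8_t) (xm > xy ? xm : xy);
  return xk;
}

/* Flattened forms (hypothetical names), mirroring
     _5 = MIN_EXPR <xc, xy>;  _7 = MIN_EXPR <xm, _5>;
   and the analogous MAX_EXPR pair.  */
static uint8_t
three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t t = xc < xy ? xc : xy;
  return xm < t ? xm : t;
}

static uint8_t
three_max_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t t = xc > xy ? xc : xy;
  return xm > t ? xm : t;
}

/* Return how many input triples make a branchy form disagree with
   its flattened counterpart.  */
unsigned
count_three_way_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned c = 0; c < 256; c++)
    for (unsigned m = 0; m < 256; m++)
      for (unsigned y = 0; y < 256; y++)
        {
          uint8_t xc = (uint8_t) c, xm = (uint8_t) m, xy = (uint8_t) y;
          if (three_min (xc, xm, xy) != three_min_flat (xc, xm, xy))
            bad++;
          if (three_max (xc, xm, xy) != three_max_flat (xc, xm, xy))
            bad++;
        }
  return bad;
}
```

A return value of 0 means the branch-free forms agree with the originals for all inputs. The mixed MIN/MAX tests (minmax-5.c onwards) are deliberately left out of this sketch.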
>
> >The reason was that a lot of the foldings checked that the BB contains
> >only a single SSA and that that SSA is a phi node.
>
> Could you expand on that?
Passes that call last_and_only_stmt break because you push an extra statement
into the BB. The phi node is then no longer the last and only statement.
Off the top of my head, ssa-split-path is one that started giving some failures
in the testsuite because of this.
Tamar
>
> Thanks,
> Andrew
>
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> > code away.
> > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > * gcc.dg/tree-ssa/minmax-8.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm > xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xy < xc ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > @@ -0,0 +1,16 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm > xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > @@ -1,5 +1,5 @@
> > /* { dg-do run } */
> > -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> > +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > index 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2e367d2d21421b1 100644
> > --- a/gcc/tree-ssa-phiopt.cc
> > +++ b/gcc/tree-ssa-phiopt.cc
> > @@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> > gimple *);
> > static int value_replacement (basic_block, basic_block,
> > edge, edge, gphi *, tree, tree);
> > -static bool minmax_replacement (basic_block, basic_block,
> > - edge, edge, gphi *, tree, tree);
> > +static bool minmax_replacement (basic_block, basic_block, basic_block,
> > + edge, edge, gphi *, tree, tree, bool);
> > static bool spaceship_replacement (basic_block, basic_block,
> > edge, edge, gphi *, tree, tree);
> > static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> > @@ -73,7 +73,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> > hash_set<tree> *);
> > static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> > static hash_set<tree> * get_non_trapping ();
> > -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> > +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
> > static void hoist_adjacent_loads (basic_block, basic_block,
> > basic_block, basic_block);
> > static bool gate_hoist_loads (void);
> > @@ -199,6 +199,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > basic_block bb1, bb2;
> > edge e1, e2;
> > tree arg0, arg1;
> > + bool diamond_minmax_p = false;
> >
> > bb = bb_order[i];
> >
> > @@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > hoist_adjacent_loads (bb, bb1, bb2, bb3);
> > continue;
> > }
> > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > + && single_succ_p (bb1)
> > + && single_succ_p (bb2)
> > + && single_pred_p (bb1)
> > + && single_pred_p (bb2)
> > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> > + {
> > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
> > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
> > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > + {
> > + gimple *stmt1 = gsi_stmt (it1);
> > + gimple *stmt2 = gsi_stmt (it2);
> > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > + {
> > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > + diamond_minmax_p
> > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > + }
> > + }
> > + }
> > else
> > continue;
> >
> > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > if (!candorest)
> > continue;
> >
> > + /* Check that we're looking for nested phis. */
> > + if (phis == NULL && diamond_minmax_p)
> > + {
> > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > + e2 = EDGE_SUCC (bb2, 0);
> > + }
> > +
> > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > if (!phi)
> > continue;
> > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> >
> > gphi *newphi;
> > if (single_pred_p (bb1)
> > + && !diamond_minmax_p
> > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > arg0, arg1,
> > cond_stmt)))
> > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > }
> >
> > /* Do the replacement of conditional if it can be done. */
> > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > + if (!early_p
> > + && !diamond_minmax_p
> > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > cfgchanged = true;
> > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > - arg0, arg1,
> > - early_p))
> > + else if (!diamond_minmax_p
> > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > + arg0, arg1,
> > + early_p))
> > cfgchanged = true;
> > else if (!early_p
> > + && !diamond_minmax_p
> > && single_pred_p (bb1)
> > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> > phi, arg0, arg1))
> > cfgchanged = true;
> > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > + diamond_minmax_p))
> > cfgchanged = true;
> > else if (single_pred_p (bb1)
> > + && !diamond_minmax_p
> > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > cfgchanged = true;
> > }
> > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> >
> > static void
> > replace_phi_edge_with_variable (basic_block cond_block,
> > - edge e, gphi *phi, tree new_tree)
> > + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> > {
> > basic_block bb = gimple_bb (phi);
> > gimple_stmt_iterator gsi;
> > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
> > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > else
> > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> > {
> > e->flags |= EDGE_FALLTHRU;
> > e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> > @@ -1564,15 +1601,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> > return 0;
> > }
> >
> > +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
> > + the value being inverted. */
> > +
> > +static tree
> > +strip_bit_not (tree var)
> > +{
> > + if (TREE_CODE (var) != SSA_NAME)
> > + return NULL_TREE;
> > +
> > + gimple *assign = SSA_NAME_DEF_STMT (var);
> > + if (gimple_code (assign) != GIMPLE_ASSIGN)
> > + return NULL_TREE;
> > +
> > + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> > + return NULL_TREE;
> > +
> > + return gimple_assign_rhs1 (assign);
> > +}
> > +
> > +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> > +
> > +enum tree_code
> > +invert_minmax_code (enum tree_code code) {
> > + switch (code) {
> > + case MIN_EXPR:
> > + return MAX_EXPR;
> > + case MAX_EXPR:
> > + return MIN_EXPR;
> > + default:
> > + gcc_unreachable ();
> > + }
> > +}
> > +
> > /* The function minmax_replacement does the main work of doing the minmax
> > replacement. Return true if the replacement is done. Otherwise return
> > false.
> > BB is the basic block where the replacement is going to be done on. ARG0
> > - is argument 0 from the PHI. Likewise for ARG1. */
> > + is argument 0 from the PHI. Likewise for ARG1.
> > +
> > + If THREEWAY_P then expect the BB to be laid out in diamond shape with each
> > + BB containing only a MIN or MAX expression. */
> >
> > static bool
> > -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> > +minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
> > + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
> > {
> > tree result;
> > edge true_edge, false_edge;
> > @@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > if (false_edge->dest == middle_bb)
> > false_edge = EDGE_SUCC (false_edge->dest, 0);
> >
> > + /* When THREEWAY_P then e1 will point to the edge of the final transition
> > + from middle-bb to end. */
> > if (true_edge == e0)
> > {
> > - gcc_assert (false_edge == e1);
> > + if (threeway_p)
> > + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> > + else
> > + gcc_assert (false_edge == e1);
> > arg_true = arg0;
> > arg_false = arg1;
> > }
> > @@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > else
> > return false;
> > }
> > + else if (middle_bb != alt_middle_bb && threeway_p)
> > + {
> > + /* Recognize the following case:
> > +
> > + if (smaller < larger)
> > + a = MIN (smaller, c);
> > + else
> > + b = MIN (larger, c);
> > + x = PHI <a, b>
> > +
> > + This is equivalent to
> > +
> > + a = MIN (smaller, c);
> > + x = MIN (larger, a); */
> > +
> > + gimple *assign = last_and_only_stmt (middle_bb);
> > + tree lhs, op0, op1, bound;
> > + tree alt_lhs, alt_op0, alt_op1;
> > + bool invert = false;
> > +
> > + if (!single_pred_p (middle_bb)
> > + || !single_pred_p (alt_middle_bb))
> > + return false;
> > +
> > + if (!assign
> > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > + return false;
> > +
> > + lhs = gimple_assign_lhs (assign);
> > + ass_code = gimple_assign_rhs_code (assign);
> > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > + return false;
> > +
> > + op0 = gimple_assign_rhs1 (assign);
> > + op1 = gimple_assign_rhs2 (assign);
> > +
> > + assign = last_and_only_stmt (alt_middle_bb);
> > + if (!assign
> > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > + return false;
> > +
> > + alt_lhs = gimple_assign_lhs (assign);
> > + if (ass_code != gimple_assign_rhs_code (assign))
> > + return false;
> > +
> > + alt_op0 = gimple_assign_rhs1 (assign);
> > + alt_op1 = gimple_assign_rhs2 (assign);
> > +
> > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > + return false;
> > +
> > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > + || (alt_smaller
> > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > + || (alt_larger
> > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > + {
> > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > + return false;
> > +
> > + if ((arg0 = strip_bit_not (op0)) != NULL
> > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > + && (bound = strip_bit_not (op1)) != NULL)
> > + {
> > + minmax = MAX_EXPR;
> > + ass_code = invert_minmax_code (ass_code);
> > + invert = true;
> > + }
> > + else
> > + {
> > + bound = op1;
> > + minmax = MIN_EXPR;
> > + arg0 = op0;
> > + arg1 = alt_op0;
> > + }
> > + }
> > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > + || (alt_larger
> > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > + || (alt_smaller
> > + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> > + {
> > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > + return false;
> > +
> > + if ((arg0 = strip_bit_not (op0)) != NULL
> > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > + && (bound = strip_bit_not (op1)) != NULL)
> > + {
> > + minmax = MIN_EXPR;
> > + ass_code = invert_minmax_code (ass_code);
> > + invert = true;
> > + }
> > + else
> > + {
> > + bound = op1;
> > + minmax = MAX_EXPR;
> > + arg0 = op0;
> > + arg1 = alt_op0;
> > + }
> > + }
> > + else
> > + return false;
> > +
> > + /* Reset any range information from the basic block. */
> > + reset_flow_sensitive_info_in_bb (cond_bb);
> > +
> > + /* Emit the statement to compute min/max. */
> > + gimple_seq stmts = NULL;
> > + tree phi_result = PHI_RESULT (phi);
> > + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
> > + result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
> > + if (invert)
> > + result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE (phi_result), result);
> > +
> > + gsi = gsi_last_bb (cond_bb);
> > + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> > +
> > + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> > +
> > + return true;
> > + }
> > else
> > {
> > /* Recognize the following case, assuming d <= u:
> >
> >
> >
> >
> > --
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted.
2022-06-20 8:49 ` Tamar Christina
@ 2022-06-21 7:43 ` Richard Biener
2022-08-03 15:13 ` Tamar Christina
0 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2022-06-21 7:43 UTC (permalink / raw)
To: Tamar Christina
Cc: Richard Sandiford, Richard Biener via Gcc-patches, Richard Guenther, nd
On Mon, Jun 20, 2022 at 10:49 AM Tamar Christina
<Tamar.Christina@arm.com> wrote:
>
> > -----Original Message-----
> > From: Richard Sandiford <richard.sandiford@arm.com>
> > Sent: Monday, June 20, 2022 9:19 AM
> > To: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
> > Cc: Tamar Christina <Tamar.Christina@arm.com>; Richard Biener
> > <richard.guenther@gmail.com>; Richard Guenther <rguenther@suse.de>;
> > nd <nd@arm.com>
> > Subject: Re: [PATCH 1/2]middle-end: Simplify subtract where both
> > arguments are being bitwise inverted.
> >
> > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > On Thu, Jun 16, 2022 at 1:10 PM Tamar Christina via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > >>
> > >> Hi All,
> > >>
> > >> This adds a match.pd rule that drops the bitwise nots when both
> > >> arguments to a subtract are inverted. i.e. for:
> > >>
> > >> float g(float a, float b)
> > >> {
> > >> return ~(int)a - ~(int)b;
> > >> }
> > >>
> > >> we instead generate
> > >>
> > >> float g(float a, float b)
> > >> {
> > >> return (int)a - (int)b;
> > >> }
> > >>
> > >> We already do a limited version of this from the fold_binary fold
> > >> functions but this makes a more general version in match.pd that applies
> > more often.
> > >>
> > >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >>
> > >> Ok for master?
> > >>
> > >> Thanks,
> > >> Tamar
> > >>
> > >> gcc/ChangeLog:
> > >>
> > >> * match.pd: New bit_not rule.
> > >>
> > >> gcc/testsuite/ChangeLog:
> > >>
> > >> * gcc.dg/subnot.c: New test.
> > >>
> > >> --- inline copy of patch --
> > >> diff --git a/gcc/match.pd b/gcc/match.pd
> > >> index a59b6778f661cf9121dd3503f43472871e4da445..51b0a1b562409af535e53828a10c30b8a3e1ae2e 100644
> > >> --- a/gcc/match.pd
> > >> +++ b/gcc/match.pd
> > >> @@ -1258,6 +1258,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >> (simplify
> > >> (bit_not (plus:c (bit_not @0) @1))
> > >> (minus @0 @1))
> > >> +/* (~X - ~Y) -> X - Y. */
> > >> +(simplify
> > >> + (minus (bit_not @0) (bit_not @1))
> > >> + (minus @0 @1))
> > >
> > > It doesn't seem correct.
> > >
> > > (gdb) p/x ~-1 - ~0x80000000
> > > $3 = 0x80000001
> > > (gdb) p/x -1 - 0x80000000
> > > $4 = 0x7fffffff
> > >
> > > where I was looking for a case exposing undefined integer overflow.
> >
> > Yeah, shouldn't it be folding to (minus @1 @0) instead?
> >
> > ~X = (-X - 1)
> > ~Y = (-Y - 1)
> >
> > so:
> >
> > ~X - ~Y = (-X - 1) - (-Y - 1)
> > = -X - 1 + Y + 1
> > = Y - X
> >
>
> You're right, sorry, I should have paid more attention when I wrote the patch.
I think you still need to watch out for cases where the result has
undefined overflow that was well-defined in the original expression.
Richard.
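
For reference, the counterexample and the corrected operand order discussed above can be reproduced with a small C sketch. It deliberately uses uint32_t so that all arithmetic wraps and is well-defined; it therefore says nothing about the signed overflow question. The function names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Original expression from the patch: ~x - ~y, in wrapping arithmetic.  */
static uint32_t sub_of_nots (uint32_t x, uint32_t y)
{
  return ~x - ~y;
}

/* The fold as first posted: x - y.  */
static uint32_t fold_as_posted (uint32_t x, uint32_t y)
{
  return x - y;
}

/* The fold with the operands swapped: y - x.  */
static uint32_t fold_swapped (uint32_t x, uint32_t y)
{
  return y - x;
}

/* Replays the gdb session quoted above with x = -1 and y = 0x80000000.  */
static int check_reported_case (void)
{
  uint32_t x = 0xffffffffu, y = 0x80000000u;
  assert (sub_of_nots (x, y) == 0x80000001u);
  assert (fold_as_posted (x, y) == 0x7fffffffu);  /* differs from the original */
  assert (fold_swapped (x, y) == sub_of_nots (x, y));
  return 1;
}
```

In wrapping arithmetic ~x is -x - 1, so the two "- 1" terms cancel and only the swapped form matches for every input.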
> Tamar
> > Richard
> >
> >
> > > Richard.
> > >
> > >>
> > >> /* ~(X - Y) -> ~X + Y. */
> > >> (simplify
> > >> diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
> > >> new file mode 100644
> > >> index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
> > >> --- /dev/null
> > >> +++ b/gcc/testsuite/gcc.dg/subnot.c
> > >> @@ -0,0 +1,9 @@
> > >> +/* { dg-do compile } */
> > >> +/* { dg-options "-O -fdump-tree-optimized" } */
> > >> +
> > >> +float g(float a, float b)
> > >> +{
> > >> + return ~(int)a - ~(int)b;
> > >> +}
> > >> +
> > >> +/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
> > >>
> > >>
> > >>
> > >>
> > >> --
^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-20 9:01 ` Tamar Christina
@ 2022-06-21 13:15 ` Richard Biener
2022-06-21 13:42 ` Tamar Christina
0 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2022-06-21 13:15 UTC (permalink / raw)
To: Tamar Christina; +Cc: gcc-patches, nd, jakub
On Mon, 20 Jun 2022, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: Monday, June 20, 2022 9:36 AM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> > Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> > max/min.
> >
> > On Thu, 16 Jun 2022, Tamar Christina wrote:
> >
> > > Hi All,
> > >
> > > This patch adds support for three-way min/max recognition in phi-opts.
> > >
> > > Concretely for e.g.
> > >
> > > #include <stdint.h>
> > >
> > > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > uint8_t xk;
> > > if (xc < xm) {
> > > xk = (uint8_t) (xc < xy ? xc : xy);
> > > } else {
> > > xk = (uint8_t) (xm < xy ? xm : xy);
> > > }
> > > return xk;
> > > }
> > >
> > > we generate:
> > >
> > > <bb 2> [local count: 1073741824]:
> > > _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
> > > _7 = MIN_EXPR <xm_2(D), _5>;
> > > return _7;
> > >
> > > instead of
> > >
> > > <bb 2>:
> > > if (xc_2(D) < xm_3(D))
> > > goto <bb 3>;
> > > else
> > > goto <bb 4>;
> > >
> > > <bb 3>:
> > > xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
> > > goto <bb 5>;
> > >
> > > <bb 4>:
> > > xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
> > >
> > > <bb 5>:
> > > # xk_1 = PHI <xk_5(3), xk_6(4)>
> > > return xk_1;
> > >
> > > The same function also immediately deals with turning a minimization
> > > problem into a maximization one if the results are inverted. We do
> > > this here since doing it in match.pd would end up changing the shape
> > > of the BBs and adding additional instructions which would prevent various
> > optimizations from working.
> >
> > Can you explain a bit more?
>
> I'll respond to this one first in case it changes how you want me to proceed.
>
> I initially had used a match.pd rule to do the min to max conversion, but a
> number of testcases started to fail. The reason was that a lot of the foldings
> checked that the BB contains only a single SSA and that that SSA is a phi node.
>
> By changing the min into max, the negation of the result ends up in the same BB
> and so the optimizations are skipped, leading to less optimal code.
>
> I did look into relaxing those phi opts but it felt like I'd make a rather arbitrary
> exception for minus and seemed better to handle it in the minmax folding.
That's a possibility, but we try to maintain a single place for each
transform. That place might be match.pd, which would then also handle the
case where a RHS COND_EXPR connects the stmts rather than a PHI node.
Richard.
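
As a sanity check on the transform itself, independent of where it ends up living, the following C sketch exhaustively compares the branchy three_min shape from the testcase with the two chained MINs the patch emits, and also checks the identity MIN (~a, ~b) == ~MAX (a, b) that motivates flipping MIN and MAX when the operands are inverted. The helper names are hypothetical and the check only covers uint8_t.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Branchy form, same shape as three_min in minmax-3.c.  */
static uint8_t three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  if (xc < xm)
    return min_u8 (xc, xy);
  else
    return min_u8 (xm, xy);
}

/* Straight-line form: MIN (xm, MIN (xc, xy)).  */
static uint8_t three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8 (xm, min_u8 (xc, xy));
}

/* Checks every uint8_t input combination; returns 1 if no assert fires.  */
static int check_all (void)
{
  for (unsigned c = 0; c < 256; c++)
    for (unsigned m = 0; m < 256; m++)
      {
        /* Bitwise not reverses the ordering of unsigned values.  */
        assert (min_u8 ((uint8_t) ~c, (uint8_t) ~m)
                == (uint8_t) ~max_u8 (c, m));
        for (unsigned y = 0; y < 256; y++)
          assert (three_min_branchy (c, m, y) == three_min_flat (c, m, y));
      }
  return 1;
}
```

The same argument works with MAX in place of MIN; the mixed MIN/MAX testcases are not covered by this sketch.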
> Thanks,
> Tamar
>
> >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> > > sequence of a three-way conditional.
> > > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> > > min/max.
> > > (strip_bit_not, invert_minmax_code): New.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> > > code away.
> > > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > > * gcc.dg/tree-ssa/minmax-8.c: New test.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc < xm) {
> > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xy < xc ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > @@ -0,0 +1,16 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc < xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> > > --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > @@ -1,5 +1,5 @@
> > > /* { dg-do run } */
> > > -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> > > +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
> > >
> > > #include <stdio.h>
> > > #include <stdlib.h>
> > > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > > index 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2e367d2d21421b1 100644
> > > --- a/gcc/tree-ssa-phiopt.cc
> > > +++ b/gcc/tree-ssa-phiopt.cc
> > > @@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> > > gimple *);
> > > static int value_replacement (basic_block, basic_block,
> > > edge, edge, gphi *, tree, tree);
> > > -static bool minmax_replacement (basic_block, basic_block,
> > > - edge, edge, gphi *, tree, tree);
> > > +static bool minmax_replacement (basic_block, basic_block, basic_block,
> > > + edge, edge, gphi *, tree, tree, bool);
> > > static bool spaceship_replacement (basic_block, basic_block,
> > > edge, edge, gphi *, tree, tree);
> > > static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> > > @@ -73,7 +73,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> > > hash_set<tree> *);
> > > static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> > > static hash_set<tree> * get_non_trapping ();
> > > -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> > > +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
> > > static void hoist_adjacent_loads (basic_block, basic_block,
> > > basic_block, basic_block);
> > > static bool gate_hoist_loads (void);
> > > @@ -199,6 +199,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > > basic_block bb1, bb2;
> > > edge e1, e2;
> > > tree arg0, arg1;
> > > + bool diamond_minmax_p = false;
> > >
> > > bb = bb_order[i];
> > >
> > > @@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > > hoist_adjacent_loads (bb, bb1, bb2, bb3);
> > > continue;
> > > }
> > > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > > + && single_succ_p (bb1)
> > > + && single_succ_p (bb2)
> > > + && single_pred_p (bb1)
> > > + && single_pred_p (bb2)
> > > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> >
> > Please do the single_succ/pred checks below where appropriate. Also, what's
> > the last check about? Why does the merge block need a single successor?
> >
> > > + {
> > > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
> > > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
> > > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > > + {
> > > + gimple *stmt1 = gsi_stmt (it1);
> > > + gimple *stmt2 = gsi_stmt (it2);
> > > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > > + {
> > > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > > + diamond_minmax_p
> > > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > > + }
> > > + }
> > > + }
> >
> > I'd generalize this to general diamond detection, simply cutting off
> > *_replacement workers that do not handle diamonds and do appropriate
> > checks in minmax_replacement only.
> >
> > > else
> > > continue;
> > >
> > > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > bool do_hoist_loads, bool early_p)
> > > if (!candorest)
> > > continue;
> > >
> > > + /* Check that we're looking for nested phis. */
> > > + if (phis == NULL && diamond_minmax_p)
> > > + {
> > > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > > + e2 = EDGE_SUCC (bb2, 0);
> > > + }
> > > +
> >
> > instead
> >
> > basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > gimple_seq phis = phi_nodes (merge);
> >
> >
> > > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > > if (!phi)
> > > continue;
> > > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > > do_hoist_loads, bool early_p)
> > >
> > > gphi *newphi;
> > > if (single_pred_p (bb1)
> > > + && !diamond_minmax_p
> > > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > arg0, arg1,
> > > cond_stmt)))
> > > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > bool do_hoist_loads, bool early_p)
> > > }
> > >
> > > /* Do the replacement of conditional if it can be done. */
> > > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0,
> > arg1))
> > > + if (!early_p
> > > + && !diamond_minmax_p
> > > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > cfgchanged = true;
> > > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > - arg0, arg1,
> > > - early_p))
> > > + else if (!diamond_minmax_p
> > > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > + arg0, arg1, early_p))
> > > cfgchanged = true;
> > > else if (!early_p
> > > + && !diamond_minmax_p
> > > && single_pred_p (bb1)
> > > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1,
> > e2,
> > > phi, arg0, arg1))
> > > cfgchanged = true;
> > > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > > + diamond_minmax_p))
> > > cfgchanged = true;
> > > else if (single_pred_p (bb1)
> > > + && !diamond_minmax_p
> > > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0,
> > arg1))
> > > cfgchanged = true;
> > > }
> > > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > > do_hoist_loads, bool early_p)
> > >
> > > static void
> > > replace_phi_edge_with_variable (basic_block cond_block,
> > > - edge e, gphi *phi, tree new_tree)
> > > + edge e, gphi *phi, tree new_tree, bool
> > delete_bb = true)
> > > {
> > > basic_block bb = gimple_bb (phi);
> > > gimple_stmt_iterator gsi;
> > > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block
> > cond_block,
> > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > else
> > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> >
> > why do you need this change?
> >
> > Did you check whether the new case works when the merge block has more
> > than two incoming edges?
> >
> > > {
> > > e->flags |= EDGE_FALLTHRU;
> > > e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE); @@ -
> > 1564,15
> > > +1601,52 @@ value_replacement (basic_block cond_bb, basic_block
> > middle_bb,
> > > return 0;
> > > }
> > >
> > > +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the
> > TREE for
> > > + the value being inverted. */
> > > +
> > > +static tree
> > > +strip_bit_not (tree var)
> > > +{
> > > + if (TREE_CODE (var) != SSA_NAME)
> > > + return NULL_TREE;
> > > +
> > > + gimple *assign = SSA_NAME_DEF_STMT (var); if (gimple_code (assign)
> > > + != GIMPLE_ASSIGN)
> > > + return NULL_TREE;
> > > +
> > > + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> > > + return NULL_TREE;
> > > +
> > > + return gimple_assign_rhs1 (assign); }
> > > +
> > > +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> > > +
> > > +enum tree_code
> > > +invert_minmax_code (enum tree_code code) {
> > > + switch (code) {
> > > + case MIN_EXPR:
> > > + return MAX_EXPR;
> > > + case MAX_EXPR:
> > > + return MIN_EXPR;
> > > + default:
> > > + gcc_unreachable ();
> > > + }
> > > +}
> > > +
> > > /* The function minmax_replacement does the main work of doing the
> > minmax
> > > replacement. Return true if the replacement is done. Otherwise return
> > > false.
> > > BB is the basic block where the replacement is going to be done on.
> > ARG0
> > > - is argument 0 from the PHI. Likewise for ARG1. */
> > > + is argument 0 from the PHI. Likewise for ARG1.
> > > +
> > > + If THREEWAY_P then expect the BB to be laid out in diamond shape with
> > each
> > > + BB containing only a MIN or MAX expression. */
> > >
> > > static bool
> > > -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > > - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> > > +minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > basic_block alt_middle_bb,
> > > + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool
> > > +threeway_p)
> > > {
> > > tree result;
> > > edge true_edge, false_edge;
> > > @@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb,
> > basic_block middle_bb,
> > > if (false_edge->dest == middle_bb)
> > > false_edge = EDGE_SUCC (false_edge->dest, 0);
> > >
> > > + /* When THREEWAY_P then e1 will point to the edge of the final
> > transition
> > > + from middle-bb to end. */
> > > if (true_edge == e0)
> > > {
> > > - gcc_assert (false_edge == e1);
> > > + if (threeway_p)
> > > + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> > > + else
> > > + gcc_assert (false_edge == e1);
> > > arg_true = arg0;
> > > arg_false = arg1;
> > > }
> > > @@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb,
> > basic_block middle_bb,
> > > else
> > > return false;
> > > }
> > > + else if (middle_bb != alt_middle_bb && threeway_p)
> > > + {
> > > + /* Recognize the following case:
> > > +
> > > + if (smaller < larger)
> > > + a = MIN (smaller, c);
> > > + else
> > > + b = MIN (larger, c);
> > > + x = PHI <a, b>
> > > +
> > > + This is equivalent to
> > > +
> > > + a = MIN (smaller, c);
> > > + x = MIN (larger, a); */
> > > +
> > > + gimple *assign = last_and_only_stmt (middle_bb);
> > > + tree lhs, op0, op1, bound;
> > > + tree alt_lhs, alt_op0, alt_op1;
> > > + bool invert = false;
> > > +
> > > + if (!single_pred_p (middle_bb)
> > > + || !single_pred_p (alt_middle_bb))
> > > + return false;
> > > +
> > > + if (!assign
> > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > + return false;
> > > +
> > > + lhs = gimple_assign_lhs (assign);
> > > + ass_code = gimple_assign_rhs_code (assign);
> > > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > > + return false;
> > > +
> > > + op0 = gimple_assign_rhs1 (assign);
> > > + op1 = gimple_assign_rhs2 (assign);
> > > +
> > > + assign = last_and_only_stmt (alt_middle_bb);
> > > + if (!assign
> > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > + return false;
> > > +
> > > + alt_lhs = gimple_assign_lhs (assign);
> > > + if (ass_code != gimple_assign_rhs_code (assign))
> > > + return false;
> > > +
> > > + alt_op0 = gimple_assign_rhs1 (assign);
> > > + alt_op1 = gimple_assign_rhs2 (assign);
> > > +
> > > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > > + return false;
> > > +
> > > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > > + || (alt_smaller
> > > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > > + || (alt_larger
> > > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > > + {
> > > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > + return false;
> > > +
> > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > + && (bound = strip_bit_not (op1)) != NULL)
> > > + {
> > > + minmax = MAX_EXPR;
> > > + ass_code = invert_minmax_code (ass_code);
> > > + invert = true;
> > > + }
> > > + else
> > > + {
> > > + bound = op1;
> > > + minmax = MIN_EXPR;
> > > + arg0 = op0;
> > > + arg1 = alt_op0;
> > > + }
> > > + }
> > > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > > + || (alt_larger
> > > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > > + || (alt_smaller
> > > + && operand_equal_for_phi_arg_p (alt_op0,
> > alt_smaller))))
> > > + {
> > > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > + return false;
> > > +
> > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > + && (bound = strip_bit_not (op1)) != NULL)
> > > + {
> > > + minmax = MIN_EXPR;
> > > + ass_code = invert_minmax_code (ass_code);
> > > + invert = true;
> > > + }
> > > + else
> > > + {
> > > + bound = op1;
> > > + minmax = MAX_EXPR;
> > > + arg0 = op0;
> > > + arg1 = alt_op0;
> > > + }
> > > + }
> > > + else
> > > + return false;
> >
> > Did you check you have coverage for all cases above in your testcases?
> >
> > > + /* Reset any range information from the basic block. */
> > > + reset_flow_sensitive_info_in_bb (cond_bb);
> >
> > Huh. You need to reset flow-sensitive info of the middle-bb stmt that
> > prevails only...
> >
> > > + /* Emit the statement to compute min/max. */
> > > + gimple_seq stmts = NULL;
> > > + tree phi_result = PHI_RESULT (phi);
> > > + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0,
> > bound);
> > > + result = gimple_build (&stmts, ass_code, TREE_TYPE
> > > + (phi_result), result, arg1);
> >
> > ... but you are re-building both here. And also you drop locations, the
> > preserved min/max should keep the old, the new should get the location of
> > ... hmm, the condition possibly?
> >
> > > + if (invert)
> > > + result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE
> > (phi_result),
> > > +result);
> > > +
> > > + gsi = gsi_last_bb (cond_bb);
> > > + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> > > +
> > > + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> > > + return true;
> > > + }
> > > else
> > > {
> > > /* Recognize the following case, assuming d <= u:
> > >
> > >
> > >
> > >
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461
> > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > Boudien Moerman; HRB 36809 (AG Nuernberg)
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-21 13:15 ` Richard Biener
@ 2022-06-21 13:42 ` Tamar Christina
2022-06-27 7:52 ` Richard Biener
0 siblings, 1 reply; 26+ messages in thread
From: Tamar Christina @ 2022-06-21 13:42 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, nd, jakub
> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Tuesday, June 21, 2022 2:15 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> Subject: RE: [PATCH 2/2]middle-end: Support recognition of three-way
> max/min.
>
> On Mon, 20 Jun 2022, Tamar Christina wrote:
>
> > > -----Original Message-----
> > > From: Richard Biener <rguenther@suse.de>
> > > Sent: Monday, June 20, 2022 9:36 AM
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> > > Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> > > max/min.
> > >
> > > On Thu, 16 Jun 2022, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This patch adds support for three-way min/max recognition in phi-opts.
> > > >
> > > > Concretely for e.g.
> > > >
> > > > #include <stdint.h>
> > > >
> > > > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > uint8_t xk;
> > > > if (xc < xm) {
> > > > xk = (uint8_t) (xc < xy ? xc : xy);
> > > > } else {
> > > > xk = (uint8_t) (xm < xy ? xm : xy);
> > > > }
> > > > return xk;
> > > > }
> > > >
> > > > we generate:
> > > >
> > > > <bb 2> [local count: 1073741824]:
> > > > _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
> > > > _7 = MIN_EXPR <xm_2(D), _5>;
> > > > return _7;
> > > >
> > > > instead of
> > > >
> > > > <bb 2>:
> > > > if (xc_2(D) < xm_3(D))
> > > > goto <bb 3>;
> > > > else
> > > > goto <bb 4>;
> > > >
> > > > <bb 3>:
> > > > xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
> > > > goto <bb 5>;
> > > >
> > > > <bb 4>:
> > > > xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
> > > >
> > > > <bb 5>:
> > > > # xk_1 = PHI <xk_5(3), xk_6(4)>
> > > > return xk_1;
> > > >
> > > > The same function also immediately deals with turning a
> > > > minimization problem into a maximization one if the results are
> > > > inverted. We do this here since doing it in match.pd would end up
> > > > changing the shape of the BBs and adding additional instructions
> > > > which would prevent various optimizations from working.
> > >
> > > Can you explain a bit more?
> >
> > I'll respond to this one first in case it changes how you want me to proceed.
> >
> > I initially had used a match.pd rule to do the min to max conversion,
> > but a number of testcases started to fail. The reason was that a lot
> > of the foldings checked that the BB contains only a single SSA and that
> > that SSA is a phi node.
> >
> > By changing the min into max, the negation of the result ends up in
> > the same BB and so the optimizations are skipped, leading to less optimal
> > code.
> >
> > I did look into relaxing those phi opts but it felt like I'd make a
> > rather arbitrary exception for minus and seemed better to handle it in the
> > minmax folding.
>
> That's a possibility but we try to maintain a single place for a transform which
> might be in match.pd which would then also handle this when there's a RHS
> COND_EXPR connecting the stmts rather than a PHI node.
Sorry, I am probably missing something here. Just to be clear, at the moment
I do it all in minmax_replacement, so everything is already in one place. It
is a simple extension of the code already there.
Are you suggesting I have to move it all to match.pd? That would be
non-trivial.
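For reference, the two identities the extension relies on can be checked
outside of GCC. The following is only a minimal standalone sketch, not code
from the patch: the helper names (min_u8, max_u8, count_mismatches and so on)
are made up for illustration, and it assumes the uint8_t operands used in the
minmax-*.c testcases. It compares the branchy three-way min against the folded
MIN (xm, MIN (xc, xy)) form, and checks that a min over bitwise-inverted
operands equals the inverted max, which is what allows a MIN to be turned into
a MAX when all operands are BIT_NOT_EXPRs.

```c
#include <stdint.h>

/* Hypothetical helpers standing in for MIN_EXPR / MAX_EXPR on uint8_t.  */
static uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Branchy form, shaped like the three_min testcase.  */
static uint8_t
three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  if (xc < xm)
    return min_u8 (xc, xy);
  else
    return min_u8 (xm, xy);
}

/* Folded form: _5 = MIN (xc, xy); _7 = MIN (xm, _5).  */
static uint8_t
three_min_folded (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8 (xm, min_u8 (xc, xy));
}

/* Three-way min over bitwise-inverted operands.  */
static uint8_t
min_of_inverted (uint8_t a, uint8_t b, uint8_t c)
{
  return min_u8 (min_u8 ((uint8_t) ~a, (uint8_t) ~c), (uint8_t) ~b);
}

/* The same value computed as a single inversion of a three-way max.  */
static uint8_t
inverted_max (uint8_t a, uint8_t b, uint8_t c)
{
  return (uint8_t) ~max_u8 (max_u8 (a, c), b);
}

/* Count the inputs, over all 2^24 triples, on which either identity
   fails.  */
unsigned long
count_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      for (unsigned c = 0; c < 256; c++)
        {
          if (three_min_branchy (a, b, c) != three_min_folded (a, b, c))
            bad++;
          if (min_of_inverted (a, b, c) != inverted_max (a, b, c))
            bad++;
        }
  return bad;
}
```

Calling count_mismatches () from a small driver should report no mismatches.
This only demonstrates the arithmetic and says nothing about where in the
pipeline the transform belongs.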
Thanks,
Tamar
>
> Richard.
>
> > Thanks,
> > Tamar
> >
> > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for
> > > the phi
> > > > sequence of a three-way conditional.
> > > > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > > > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-
> > > way
> > > > min/max.
> > > > (strip_bit_not, invert_minmax_code): New.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't
> > > optimize
> > > > code away.
> > > > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-8.c: New test.
> > > >
> > > > --- inline copy of patch --
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > new file mode 100644
> > > > index
> > > >
> > >
> 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e
> > > 6a8
> > > > 43695a05786e
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > @@ -0,0 +1,17 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > +
> > > > +#include <stdint.h>
> > > > +
> > > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > + uint8_t xk;
> > > > + if (xc < xm) {
> > > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > > + } else {
> > > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > > + }
> > > > + return xk;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } }
> > > > +*/
> > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } }
> > > > +*/
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > > new file mode 100644
> > > > index
> > > >
> > >
> 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17c
> > > b52
> > > > 2da44bafa0e2
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > > @@ -0,0 +1,17 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > +
> > > > +#include <stdint.h>
> > > > +
> > > > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > + uint8_t xk;
> > > > + if (xc > xm) {
> > > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > > + } else {
> > > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > > + }
> > > > + return xk;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } }
> > > > +*/
> > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } }
> > > > +*/
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > > new file mode 100644
> > > > index
> > > >
> > >
> 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f
> > > 5b
> > > > 9993074f8510
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > > @@ -0,0 +1,17 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > +
> > > > +#include <stdint.h>
> > > > +
> > > > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > + uint8_t xk;
> > > > + if (xc > xm) {
> > > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > > + } else {
> > > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > > + }
> > > > + return xk;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } }
> > > > +*/
> > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } }
> > > > +*/
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > > new file mode 100644
> > > > index
> > > >
> > >
> 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e
> > > 85
> > > > 639e3a49dd4b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > > @@ -0,0 +1,17 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > +
> > > > +#include <stdint.h>
> > > > +
> > > > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > + uint8_t xk;
> > > > + if (xc > xm) {
> > > > + xk = (uint8_t) (xy < xc ? xc : xy);
> > > > + } else {
> > > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > > + }
> > > > + return xk;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } }
> > > > +*/
> > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } }
> > > > +*/
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > > new file mode 100644
> > > > index
> > > >
> > >
> 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8b
> > > ef
> > > > 9fa7b1c5e104
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > > @@ -0,0 +1,16 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > +
> > > > +#include <stdint.h>
> > > > +
> > > > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > + uint8_t xk;
> > > > + if (xc > xm) {
> > > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > > + } else {
> > > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > > + }
> > > > + return xk;
> > > > +}
> > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } }
> > > > +*/
> > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } }
> > > > +*/
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > > new file mode 100644
> > > > index
> > > >
> > >
> 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60c
> > > b65
> > > > 4fcab0bfdd1c
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > > @@ -0,0 +1,17 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > +
> > > > +#include <stdint.h>
> > > > +
> > > > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > + uint8_t xk;
> > > > + if (xc < xm) {
> > > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > > + } else {
> > > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > > + }
> > > > + return xk;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } }
> > > > +*/
> > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } }
> > > > +*/
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > > b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > > index
> > > >
> > >
> 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba724
> > > 7d7
> > > > 5a32d2c860ed 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > > @@ -1,5 +1,5 @@
> > > > /* { dg-do run } */
> > > > -/* { dg-options "-O2 -fsplit-paths
> > > > -fdump-tree-split-paths-details --param
> > > > max-jump-thread-duplication-stmts=20" } */
> > > > +/* { dg-options "-O2 -fsplit-paths
> > > > +-fdump-tree-split-paths-details --param
> > > > +max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
> > > >
> > > > #include <stdio.h>
> > > > #include <stdlib.h>
> > > > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc index
> > > >
> > >
> 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2
> > > e
> > > 3
> > > > 67d2d21421b1 100644
> > > > --- a/gcc/tree-ssa-phiopt.cc
> > > > +++ b/gcc/tree-ssa-phiopt.cc
> > > > @@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion
> > > > (edge,
> > > edge, gphi *, tree, tree,
> > > > gimple *);
> > > > static int value_replacement (basic_block, basic_block,
> > > > edge, edge, gphi *, tree, tree); -static bool
> > > > minmax_replacement (basic_block, basic_block,
> > > > - edge, edge, gphi *, tree, tree);
> > > > +static bool minmax_replacement (basic_block, basic_block,
> basic_block,
> > > > + edge, edge, gphi *, tree, tree, bool);
> > > > static bool spaceship_replacement (basic_block, basic_block,
> > > > edge, edge, gphi *, tree, tree); static bool
> > > > cond_removal_in_builtin_zero_pattern (basic_block, basic_block, @@
> > > > -73,7 +73,7 @@ static bool cond_store_replacement (basic_block,
> > > basic_block, edge, edge,
> > > > hash_set<tree> *);
> > > > static bool cond_if_else_store_replacement (basic_block,
> > > > basic_block, basic_block); static hash_set<tree> *
> > > > get_non_trapping (); -static void replace_phi_edge_with_variable
> > > > (basic_block, edge, gphi *, tree);
> > > > +static void replace_phi_edge_with_variable (basic_block, edge,
> > > > +gphi *, tree, bool);
> > > > static void hoist_adjacent_loads (basic_block, basic_block,
> > > > basic_block, basic_block);
> > > > static bool gate_hoist_loads (void); @@ -199,6 +199,7 @@
> > > > tree_ssa_phiopt_worker (bool do_store_elim, bool
> > > do_hoist_loads, bool early_p)
> > > > basic_block bb1, bb2;
> > > > edge e1, e2;
> > > > tree arg0, arg1;
> > > > + bool diamond_minmax_p = false;
> > > >
> > > > bb = bb_order[i];
> > > >
> > > > @@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > bool do_hoist_loads, bool early_p)
> > > > hoist_adjacent_loads (bb, bb1, bb2, bb3);
> > > > continue;
> > > > }
> > > > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > > > + && single_succ_p (bb1)
> > > > + && single_succ_p (bb2)
> > > > + && single_pred_p (bb1)
> > > > + && single_pred_p (bb2)
> > > > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> > >
> > > please do the single_succ/pred checks below where appropriate, also
> > > what's the last check about? why does the merge block need a single
> > > successor?
> > >
> > > > + {
> > > > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb
> > > (bb1);
> > > > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb
> > > (bb2);
> > > > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > > > + {
> > > > + gimple *stmt1 = gsi_stmt (it1);
> > > > + gimple *stmt2 = gsi_stmt (it2);
> > > > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > > > + {
> > > > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > > > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > > > + diamond_minmax_p
> > > > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > > > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > > > + }
> > > > + }
> > > > + }
> > >
> > > I'd generalize this to general diamond detection, simply cutting off
> > > *_replacement workers that do not handle diamonds and do appropriate
> > > checks in minmax_replacement only.
> > >
> > > > else
> > > > continue;
> > > >
> > > > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > bool do_hoist_loads, bool early_p)
> > > > if (!candorest)
> > > > continue;
> > > >
> > > > + /* Check that we're looking for nested phis. */
> > > > + if (phis == NULL && diamond_minmax_p)
> > > > + {
> > > > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > > > + e2 = EDGE_SUCC (bb2, 0);
> > > > + }
> > > > +
> > >
> > > instead
> > >
> > > basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > > gimple_seq phis = phi_nodes (merge);
> > >
> > >
> > > > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > > > if (!phi)
> > > > continue;
> > > > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > > bool do_hoist_loads, bool early_p)
> > > >
> > > > gphi *newphi;
> > > > if (single_pred_p (bb1)
> > > > + && !diamond_minmax_p
> > > > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > > arg0, arg1,
> > > > cond_stmt)))
> > > > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool
> do_store_elim,
> > > bool do_hoist_loads, bool early_p)
> > > > }
> > > >
> > > > /* Do the replacement of conditional if it can be done. */
> > > > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0,
> > > arg1))
> > > > + if (!early_p
> > > > + && !diamond_minmax_p
> > > > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > > cfgchanged = true;
> > > > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > > - arg0, arg1,
> > > > - early_p))
> > > > + else if (!diamond_minmax_p
> > > > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > > + arg0, arg1, early_p))
> > > > cfgchanged = true;
> > > > else if (!early_p
> > > > + && !diamond_minmax_p
> > > > && single_pred_p (bb1)
> > > > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1,
> > > e2,
> > > > phi, arg0, arg1))
> > > > cfgchanged = true;
> > > > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > > > + diamond_minmax_p))
> > > > cfgchanged = true;
> > > > else if (single_pred_p (bb1)
> > > > + && !diamond_minmax_p
> > > > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0,
> > > arg1))
> > > > cfgchanged = true;
> > > > }
> > > > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > > bool do_hoist_loads, bool early_p)
> > > >
> > > > static void
> > > > replace_phi_edge_with_variable (basic_block cond_block,
> > > > - edge e, gphi *phi, tree new_tree)
> > > > + edge e, gphi *phi, tree new_tree, bool
> > > delete_bb = true)
> > > > {
> > > > basic_block bb = gimple_bb (phi);
> > > > gimple_stmt_iterator gsi;
> > > > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block
> > > cond_block,
> > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > else
> > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 &&
> delete_bb)
> > >
> > > why do you need this change?
> > >
> > > Did you check whether the new case works when the merge block has
> > > more than two incoming edges?
> > >
> > > > {
> > > > e->flags |= EDGE_FALLTHRU;
> > > > e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE); @@ -
> > > 1564,15
> > > > +1601,52 @@ value_replacement (basic_block cond_bb, basic_block
> > > middle_bb,
> > > > return 0;
> > > > }
> > > >
> > > > +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then
> > > > +return the
> > > TREE for
> > > > + the value being inverted. */
> > > > +
> > > > +static tree
> > > > +strip_bit_not (tree var)
> > > > +{
> > > > + if (TREE_CODE (var) != SSA_NAME)
> > > > + return NULL_TREE;
> > > > +
> > > > + gimple *assign = SSA_NAME_DEF_STMT (var); if (gimple_code
> > > > + (assign) != GIMPLE_ASSIGN)
> > > > + return NULL_TREE;
> > > > +
> > > > + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> > > > + return NULL_TREE;
> > > > +
> > > > + return gimple_assign_rhs1 (assign); }
> > > > +
> > > > +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> > > > +
> > > > +enum tree_code
> > > > +invert_minmax_code (enum tree_code code) {
> > > > + switch (code) {
> > > > + case MIN_EXPR:
> > > > + return MAX_EXPR;
> > > > + case MAX_EXPR:
> > > > + return MIN_EXPR;
> > > > + default:
> > > > + gcc_unreachable ();
> > > > + }
> > > > +}
> > > > +
> > > > /* The function minmax_replacement does the main work of doing
> > > > the
> > > minmax
> > > > replacement. Return true if the replacement is done. Otherwise
> return
> > > > false.
> > > > BB is the basic block where the replacement is going to be done on.
> > > ARG0
> > > > - is argument 0 from the PHI. Likewise for ARG1. */
> > > > + is argument 0 from the PHI. Likewise for ARG1.
> > > > +
> > > > + If THREEWAY_P then expect the BB to be laid out in diamond
> > > > + shape with
> > > each
> > > > + BB containing only a MIN or MAX expression. */
> > > >
> > > > static bool
> > > > -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > > > - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> > > > +minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > > basic_block alt_middle_bb,
> > > > + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool
> > > > +threeway_p)
> > > > {
> > > > tree result;
> > > > edge true_edge, false_edge;
> > > > @@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb,
> > > basic_block middle_bb,
> > > > if (false_edge->dest == middle_bb)
> > > > false_edge = EDGE_SUCC (false_edge->dest, 0);
> > > >
> > > > + /* When THREEWAY_P then e1 will point to the edge of the final transition
> > > > + from middle-bb to end. */
> > > > if (true_edge == e0)
> > > > {
> > > > - gcc_assert (false_edge == e1);
> > > > + if (threeway_p)
> > > > + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> > > > + else
> > > > + gcc_assert (false_edge == e1);
> > > > arg_true = arg0;
> > > > arg_false = arg1;
> > > > }
> > > > @@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > > > else
> > > > return false;
> > > > }
> > > > + else if (middle_bb != alt_middle_bb && threeway_p)
> > > > + {
> > > > + /* Recognize the following case:
> > > > +
> > > > + if (smaller < larger)
> > > > + a = MIN (smaller, c);
> > > > + else
> > > > + b = MIN (larger, c);
> > > > + x = PHI <a, b>
> > > > +
> > > > + This is equivalent to
> > > > +
> > > > + a = MIN (smaller, c);
> > > > + x = MIN (larger, a); */
> > > > +
> > > > + gimple *assign = last_and_only_stmt (middle_bb);
> > > > + tree lhs, op0, op1, bound;
> > > > + tree alt_lhs, alt_op0, alt_op1;
> > > > + bool invert = false;
> > > > +
> > > > + if (!single_pred_p (middle_bb)
> > > > + || !single_pred_p (alt_middle_bb))
> > > > + return false;
> > > > +
> > > > + if (!assign
> > > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > > + return false;
> > > > +
> > > > + lhs = gimple_assign_lhs (assign);
> > > > + ass_code = gimple_assign_rhs_code (assign);
> > > > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > > > + return false;
> > > > +
> > > > + op0 = gimple_assign_rhs1 (assign);
> > > > + op1 = gimple_assign_rhs2 (assign);
> > > > +
> > > > + assign = last_and_only_stmt (alt_middle_bb);
> > > > + if (!assign
> > > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > > + return false;
> > > > +
> > > > + alt_lhs = gimple_assign_lhs (assign);
> > > > + if (ass_code != gimple_assign_rhs_code (assign))
> > > > + return false;
> > > > +
> > > > + alt_op0 = gimple_assign_rhs1 (assign);
> > > > + alt_op1 = gimple_assign_rhs2 (assign);
> > > > +
> > > > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > > > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > > > + return false;
> > > > +
> > > > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > > > + || (alt_smaller
> > > > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > > > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > > > + || (alt_larger
> > > > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > > > + {
> > > > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > > + return false;
> > > > +
> > > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > > + && (bound = strip_bit_not (op1)) != NULL)
> > > > + {
> > > > + minmax = MAX_EXPR;
> > > > + ass_code = invert_minmax_code (ass_code);
> > > > + invert = true;
> > > > + }
> > > > + else
> > > > + {
> > > > + bound = op1;
> > > > + minmax = MIN_EXPR;
> > > > + arg0 = op0;
> > > > + arg1 = alt_op0;
> > > > + }
> > > > + }
> > > > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > > > + || (alt_larger
> > > > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > > > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > > > + || (alt_smaller
> > > > + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> > > > + {
> > > > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > > + return false;
> > > > +
> > > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > > + && (bound = strip_bit_not (op1)) != NULL)
> > > > + {
> > > > + minmax = MIN_EXPR;
> > > > + ass_code = invert_minmax_code (ass_code);
> > > > + invert = true;
> > > > + }
> > > > + else
> > > > + {
> > > > + bound = op1;
> > > > + minmax = MAX_EXPR;
> > > > + arg0 = op0;
> > > > + arg1 = alt_op0;
> > > > + }
> > > > + }
> > > > + else
> > > > + return false;
> > >
> > > Did you check you have coverage for all cases above in your testcases?
> > >
> > > > + /* Reset any range information from the basic block. */
> > > > + reset_flow_sensitive_info_in_bb (cond_bb);
> > >
> > > Huh. You need to reset flow-sensitive info of the middle-bb stmt
> > > that prevails only...
> > >
> > > > + /* Emit the statement to compute min/max. */
> > > > + gimple_seq stmts = NULL;
> > > > + tree phi_result = PHI_RESULT (phi);
> > > > + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
> > > > + result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
> > >
> > > ... but you are re-building both here. And also you drop locations,
> > > the preserved min/max should keep the old, the new should get the
> > > location of ... hmm, the condition possibly?
> > >
> > > > + if (invert)
> > > > + result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE (phi_result),
> > > > + result);
> > > > +
> > > > + gsi = gsi_last_bb (cond_bb);
> > > > + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> > > > +
> > > > + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> > > > + return true;
> > > > + }
> > > > else
> > > > {
> > > > /* Recognize the following case, assuming d <= u:
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461
> > > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > > Boudien Moerman; HRB 36809 (AG Nuernberg)
> >
>
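As a side note on the inverted case in the quoted patch (strip_bit_not together with invert_minmax_code): it appears to rely on the identity MIN (~a, ~b) == ~MAX (a, b), which allows the inversions on the inputs to be traded for a single inversion on the result. The following is only a standalone sketch of that identity on uint8_t values, not code from the patch. The helper names min_u8, max_u8, not_u8, min_of_inverted, inverted_max, count_inversion_mismatches and self_check are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers standing in for MIN_EXPR and MAX_EXPR.  */
static uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Bitwise NOT truncated back to 8 bits (the operand is promoted to int).  */
static uint8_t not_u8 (uint8_t a) { return (uint8_t) ~a; }

/* MIN chain over already-inverted inputs.  */
static uint8_t
min_of_inverted (uint8_t a, uint8_t b, uint8_t c)
{
  return min_u8 (min_u8 (not_u8 (a), not_u8 (c)), not_u8 (b));
}

/* Same value with the MINs flipped to MAXes and one trailing inversion.  */
static uint8_t
inverted_max (uint8_t a, uint8_t b, uint8_t c)
{
  return not_u8 (max_u8 (max_u8 (a, c), b));
}

/* Compare both forms over every uint8_t triple; return the mismatch count.  */
static unsigned
count_inversion_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      for (unsigned c = 0; c < 256; c++)
        if (min_of_inverted ((uint8_t) a, (uint8_t) b, (uint8_t) c)
            != inverted_max ((uint8_t) a, (uint8_t) b, (uint8_t) c))
          bad++;
  return bad;
}

/* Convenience wrapper that aborts if the two forms ever disagree.  */
static void
self_check (void)
{
  assert (count_inversion_mismatches () == 0);
}
```

The identity holds because bitwise NOT on a fixed-width unsigned value is strictly decreasing, so it swaps the roles of MIN and MAX. The sketch only covers 8-bit unsigned operands; it makes no claim about which types the patch accepts.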
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-21 13:42 ` Tamar Christina
@ 2022-06-27 7:52 ` Richard Biener
0 siblings, 0 replies; 26+ messages in thread
From: Richard Biener @ 2022-06-27 7:52 UTC (permalink / raw)
To: Tamar Christina; +Cc: gcc-patches, nd, jakub
On Tue, 21 Jun 2022, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: Tuesday, June 21, 2022 2:15 PM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> > Subject: RE: [PATCH 2/2]middle-end: Support recognition of three-way
> > max/min.
> >
> > On Mon, 20 Jun 2022, Tamar Christina wrote:
> >
> > > > -----Original Message-----
> > > > From: Richard Biener <rguenther@suse.de>
> > > > Sent: Monday, June 20, 2022 9:36 AM
> > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> > > > Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> > > > max/min.
> > > >
> > > > On Thu, 16 Jun 2022, Tamar Christina wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > This patch adds support for three-way min/max recognition in phi-opts.
> > > > >
> > > > > Concretely for e.g.
> > > > >
> > > > > #include <stdint.h>
> > > > >
> > > > > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > > uint8_t xk;
> > > > > if (xc < xm) {
> > > > > xk = (uint8_t) (xc < xy ? xc : xy);
> > > > > } else {
> > > > > xk = (uint8_t) (xm < xy ? xm : xy);
> > > > > }
> > > > > return xk;
> > > > > }
> > > > >
> > > > > we generate:
> > > > >
> > > > > <bb 2> [local count: 1073741824]:
> > > > > _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
> > > > > _7 = MIN_EXPR <xm_2(D), _5>;
> > > > > return _7;
> > > > >
> > > > > instead of
> > > > >
> > > > > <bb 2>:
> > > > > if (xc_2(D) < xm_3(D))
> > > > > goto <bb 3>;
> > > > > else
> > > > > goto <bb 4>;
> > > > >
> > > > > <bb 3>:
> > > > > xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
> > > > > goto <bb 5>;
> > > > >
> > > > > <bb 4>:
> > > > > xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;
> > > > >
> > > > > <bb 5>:
> > > > > # xk_1 = PHI <xk_5(3), xk_6(4)>
> > > > > return xk_1;
> > > > >
> > > > > The same function also immediately deals with turning a
> > > > > minimization problem into a maximization one if the results are
> > > > > inverted. We do this here since doing it in match.pd would end up
> > > > > changing the shape of the BBs and adding additional instructions
> > > > > which would prevent various optimizations from working.
> > > >
> > > > Can you explain a bit more?
> > >
> > > I'll respond to this one first in case it changes how you want me to proceed.
> > >
> > > I had initially used a match.pd rule to do the min-to-max conversion,
> > > but a number of testcases started to fail. The reason was that a lot
> > > of the foldings check that the BB contains only a single SSA and that
> > > that SSA is a phi node.
> > >
> > > By changing the min into a max, the negation of the result ends up in
> > > the same BB, so the optimizations are skipped, leading to less optimal
> > > code.
> > >
> > > I did look into relaxing those phi-opts, but it felt like I'd be making a
> > > rather arbitrary exception for minus, and it seemed better to handle it in
> > > the minmax folding.
> >
> > That's a possibility but we try to maintain a single place for a transform which
> > might be in match.pd which would then also handle this when there's a RHS
> > COND_EXPR connecting the stmts rather than a PHI node.
>
> Sorry, I am probably missing something here. Just to be clear, at the moment I do it all in
> minmax_replacement, so everything is already in one place. It's a simple extension of the code
> already there.
>
> Are you suggesting I have to move it all to match.pd? That's non-trivial.
I was hoping Andrew was going to respond with an estimate of how far
along he is with moving the minmax replacement to match.pd.
I'm just after less overall work. But improving minmax_replacement
is OK (see my comments on the actual patch); it will just make the
match.pd attempt more complex because of the need to handle diamonds.
Richard.
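For reference, the diamond shape under discussion can be written in plain C as below. This is a sketch of the equivalence from the patch description only: a MIN in both arms against a common bound collapses to two chained MINs, and the MAX variant is assumed to collapse the same way by symmetry. The helper names (min_u8, max_u8, the *_diamond and *_flat functions, count_mismatches, self_check) are assumptions for illustration and are not part of tree-ssa-phiopt.cc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers standing in for MIN_EXPR and MAX_EXPR.  */
static uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Diamond form: each arm holds a single MIN against the shared bound XY.  */
static uint8_t
three_min_diamond (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc < xm)
    xk = min_u8 (xc, xy);
  else
    xk = min_u8 (xm, xy);
  return xk;
}

/* Straight-line form matching the dump in the patch description:
   _5 = MIN (xc, xy); _7 = MIN (xm, _5).  */
static uint8_t
three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8 (xm, min_u8 (xc, xy));
}

/* The MAX variant of the same diamond.  */
static uint8_t
three_max_diamond (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc > xm)
    xk = max_u8 (xc, xy);
  else
    xk = max_u8 (xm, xy);
  return xk;
}

static uint8_t
three_max_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return max_u8 (xm, max_u8 (xc, xy));
}

/* Compare both forms over every uint8_t triple; return the mismatch count.  */
static unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned c = 0; c < 256; c++)
    for (unsigned m = 0; m < 256; m++)
      for (unsigned y = 0; y < 256; y++)
        {
          uint8_t xc = (uint8_t) c, xm = (uint8_t) m, xy = (uint8_t) y;
          if (three_min_diamond (xc, xm, xy) != three_min_flat (xc, xm, xy))
            bad++;
          if (three_max_diamond (xc, xm, xy) != three_max_flat (xc, xm, xy))
            bad++;
        }
  return bad;
}

/* Convenience wrapper that aborts if the two forms ever disagree.  */
static void
self_check (void)
{
  assert (count_mismatches () == 0);
}
```

Because uint8_t has only 256 values, count_mismatches can afford to compare every triple (about 16.7 million cases) rather than sampling.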
> Thanks,
> Tamar
>
> >
> > Richard.
> >
> > > Thanks,
> > > Tamar
> > >
> > > >
> > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > > >
> > > > > Ok for master?
> > > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> > > > > sequence of a three-way conditional.
> > > > > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > > > > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> > > > > min/max.
> > > > > (strip_bit_not, invert_minmax_code): New.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> > > > > code away.
> > > > > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > > > > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > > > > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > > > > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > > > > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > > > > * gcc.dg/tree-ssa/minmax-8.c: New test.
> > > > >
> > > > > --- inline copy of patch --
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > > @@ -0,0 +1,17 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > > +
> > > > > +#include <stdint.h>
> > > > > +
> > > > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > > + uint8_t xk;
> > > > > + if (xc < xm) {
> > > > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > > > + } else {
> > > > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > > > + }
> > > > > + return xk;
> > > > > +}
> > > > > +
> > > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > > > @@ -0,0 +1,17 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > > +
> > > > > +#include <stdint.h>
> > > > > +
> > > > > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > > + uint8_t xk;
> > > > > + if (xc > xm) {
> > > > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > > > + } else {
> > > > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > > > + }
> > > > > + return xk;
> > > > > +}
> > > > > +
> > > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> > > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > > > @@ -0,0 +1,17 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > > +
> > > > > +#include <stdint.h>
> > > > > +
> > > > > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > > + uint8_t xk;
> > > > > + if (xc > xm) {
> > > > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > > > + } else {
> > > > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > > > + }
> > > > > + return xk;
> > > > > +}
> > > > > +
> > > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> > > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > > > @@ -0,0 +1,17 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > > +
> > > > > +#include <stdint.h>
> > > > > +
> > > > > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > > + uint8_t xk;
> > > > > + if (xc > xm) {
> > > > > + xk = (uint8_t) (xy < xc ? xc : xy);
> > > > > + } else {
> > > > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > > > + }
> > > > > + return xk;
> > > > > +}
> > > > > +
> > > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > > > @@ -0,0 +1,16 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > > +
> > > > > +#include <stdint.h>
> > > > > +
> > > > > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > > + uint8_t xk;
> > > > > + if (xc > xm) {
> > > > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > > > + } else {
> > > > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > > > + }
> > > > > + return xk;
> > > > > +}
> > > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > > > @@ -0,0 +1,17 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > > +
> > > > > +#include <stdint.h>
> > > > > +
> > > > > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > > + uint8_t xk;
> > > > > + if (xc < xm) {
> > > > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > > > + } else {
> > > > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > > > + }
> > > > > + return xk;
> > > > > +}
> > > > > +
> > > > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > > > index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> > > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > > > @@ -1,5 +1,5 @@
> > > > > /* { dg-do run } */
> > > > > -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> > > > > +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
> > > > >
> > > > > #include <stdio.h>
> > > > > #include <stdlib.h>
> > > > > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > > > > index 562468b7f02a9ffe2713318add551902c14f89c3..6246f054006ff16e73602e7ce2e367d2d21421b1 100644
> > > > > --- a/gcc/tree-ssa-phiopt.cc
> > > > > +++ b/gcc/tree-ssa-phiopt.cc
> > > > > @@ -62,8 +62,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> > > > > gimple *);
> > > > > static int value_replacement (basic_block, basic_block,
> > > > > edge, edge, gphi *, tree, tree);
> > > > > -static bool minmax_replacement (basic_block, basic_block,
> > > > > - edge, edge, gphi *, tree, tree);
> > > > > +static bool minmax_replacement (basic_block, basic_block, basic_block,
> > > > > + edge, edge, gphi *, tree, tree, bool);
> > > > > static bool spaceship_replacement (basic_block, basic_block,
> > > > > edge, edge, gphi *, tree, tree);
> > > > > static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> > > > > @@ -73,7 +73,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> > > > > hash_set<tree> *);
> > > > > static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> > > > > static hash_set<tree> * get_non_trapping ();
> > > > > -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> > > > > +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
> > > > > static void hoist_adjacent_loads (basic_block, basic_block,
> > > > > basic_block, basic_block);
> > > > > static bool gate_hoist_loads (void);
> > > > > @@ -199,6 +199,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > > > > basic_block bb1, bb2;
> > > > > edge e1, e2;
> > > > > tree arg0, arg1;
> > > > > + bool diamond_minmax_p = false;
> > > > >
> > > > > bb = bb_order[i];
> > > > >
> > > > > @@ -265,6 +266,29 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > > > > hoist_adjacent_loads (bb, bb1, bb2, bb3);
> > > > > continue;
> > > > > }
> > > > > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > > > > + && single_succ_p (bb1)
> > > > > + && single_succ_p (bb2)
> > > > > + && single_pred_p (bb1)
> > > > > + && single_pred_p (bb2)
> > > > > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> > > >
> > > > Please do the single_succ/pred checks below where appropriate. Also,
> > > > what's the last check about? Why does the merge block need a single
> > > > successor?
> > > >
> > > > > + {
> > > > > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
> > > > > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
> > > > > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > > > > + {
> > > > > + gimple *stmt1 = gsi_stmt (it1);
> > > > > + gimple *stmt2 = gsi_stmt (it2);
> > > > > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > > > > + {
> > > > > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > > > > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > > > > + diamond_minmax_p
> > > > > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > > > > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > > > > + }
> > > > > + }
> > > > > + }
> > > >
> > > > I'd generalize this to general diamond detection, simply cutting off
> > > > *_replacement workers that do not handle diamonds and do appropriate
> > > > checks in minmax_replacement only.
> > > >
> > > > > else
> > > > > continue;
> > > > >
> > > > > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > > > > if (!candorest)
> > > > > continue;
> > > > >
> > > > > + /* Check that we're looking for nested phis. */
> > > > > + if (phis == NULL && diamond_minmax_p)
> > > > > + {
> > > > > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > > > > + e2 = EDGE_SUCC (bb2, 0);
> > > > > + }
> > > > > +
> > > >
> > > > instead
> > > >
> > > > basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > > > gimple_seq phis = phi_nodes (merge);
> > > >
> > > >
> > > > > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > > > > if (!phi)
> > > > > continue;
> > > > > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > > > >
> > > > > gphi *newphi;
> > > > > if (single_pred_p (bb1)
> > > > > + && !diamond_minmax_p
> > > > > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > > > arg0, arg1,
> > > > > cond_stmt)))
> > > > > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > > > > }
> > > > >
> > > > > /* Do the replacement of conditional if it can be done. */
> > > > > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > > > + if (!early_p
> > > > > + && !diamond_minmax_p
> > > > > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > > > cfgchanged = true;
> > > > > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > > > - arg0, arg1,
> > > > > - early_p))
> > > > > + else if (!diamond_minmax_p
> > > > > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > > > + arg0, arg1, early_p))
> > > > > cfgchanged = true;
> > > > > else if (!early_p
> > > > > + && !diamond_minmax_p
> > > > > && single_pred_p (bb1)
> > > > > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> > > > > phi, arg0, arg1))
> > > > > cfgchanged = true;
> > > > > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > > > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > > > > + diamond_minmax_p))
> > > > > cfgchanged = true;
> > > > > else if (single_pred_p (bb1)
> > > > > + && !diamond_minmax_p
> > > > > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > > > cfgchanged = true;
> > > > > }
> > > > > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > > > >
> > > > > static void
> > > > > replace_phi_edge_with_variable (basic_block cond_block,
> > > > > - edge e, gphi *phi, tree new_tree)
> > > > > + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> > > > > {
> > > > > basic_block bb = gimple_bb (phi);
> > > > > gimple_stmt_iterator gsi;
> > > > > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
> > > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > > else
> > > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > > > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> > > >
> > > > why do you need this change?
> > > >
> > > > Did you check whether the new case works when the merge block has
> > > > more than two incoming edges?
> > > >
> > > > > {
> > > > > e->flags |= EDGE_FALLTHRU;
> > > > > e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> > > > > @@ -1564,15 +1601,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> > > > > return 0;
> > > > > }
> > > > >
> > > > > +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
> > > > > + the value being inverted. */
> > > > > +
> > > > > +static tree
> > > > > +strip_bit_not (tree var)
> > > > > +{
> > > > > + if (TREE_CODE (var) != SSA_NAME)
> > > > > + return NULL_TREE;
> > > > > +
> > > > > + gimple *assign = SSA_NAME_DEF_STMT (var);
> > > > > + if (gimple_code (assign) != GIMPLE_ASSIGN)
> > > > > + return NULL_TREE;
> > > > > +
> > > > > + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> > > > > + return NULL_TREE;
> > > > > +
> > > > > + return gimple_assign_rhs1 (assign);
> > > > > +}
> > > > > +
> > > > > +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> > > > > +
> > > > > +enum tree_code
> > > > > +invert_minmax_code (enum tree_code code) {
> > > > > + switch (code) {
> > > > > + case MIN_EXPR:
> > > > > + return MAX_EXPR;
> > > > > + case MAX_EXPR:
> > > > > + return MIN_EXPR;
> > > > > + default:
> > > > > + gcc_unreachable ();
> > > > > + }
> > > > > +}
> > > > > +
> > > > > /* The function minmax_replacement does the main work of doing the minmax
> > > > > replacement. Return true if the replacement is done. Otherwise return
> > > > > false.
> > > > > BB is the basic block where the replacement is going to be done on. ARG0
> > > > > - is argument 0 from the PHI. Likewise for ARG1. */
> > > > > + is argument 0 from the PHI. Likewise for ARG1.
> > > > > +
> > > > > + If THREEWAY_P then expect the BB to be laid out in diamond shape with each
> > > > > + BB containing only a MIN or MAX expression. */
> > > > >
> > > > > static bool
> > > > > -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > > > > - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> > > > > +minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
> > > > > + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
> > > > > {
> > > > > tree result;
> > > > > edge true_edge, false_edge;
> > > > > @@ -1727,9 +1801,14 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > > > > if (false_edge->dest == middle_bb)
> > > > > false_edge = EDGE_SUCC (false_edge->dest, 0);
> > > > >
> > > > > + /* When THREEWAY_P then e1 will point to the edge of the final transition
> > > > > + from middle-bb to end. */
> > > > > if (true_edge == e0)
> > > > > {
> > > > > - gcc_assert (false_edge == e1);
> > > > > + if (threeway_p)
> > > > > + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> > > > > + else
> > > > > + gcc_assert (false_edge == e1);
> > > > > arg_true = arg0;
> > > > > arg_false = arg1;
> > > > > }
> > > > > @@ -1768,6 +1847,133 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > > > > else
> > > > > return false;
> > > > > }
> > > > > + else if (middle_bb != alt_middle_bb && threeway_p)
> > > > > + {
> > > > > + /* Recognize the following case:
> > > > > +
> > > > > + if (smaller < larger)
> > > > > + a = MIN (smaller, c);
> > > > > + else
> > > > > + b = MIN (larger, c);
> > > > > + x = PHI <a, b>
> > > > > +
> > > > > + This is equivalent to
> > > > > +
> > > > > + a = MIN (smaller, c);
> > > > > + x = MIN (larger, a); */
> > > > > +
> > > > > + gimple *assign = last_and_only_stmt (middle_bb);
> > > > > + tree lhs, op0, op1, bound;
> > > > > + tree alt_lhs, alt_op0, alt_op1;
> > > > > + bool invert = false;
> > > > > +
> > > > > + if (!single_pred_p (middle_bb)
> > > > > + || !single_pred_p (alt_middle_bb))
> > > > > + return false;
> > > > > +
> > > > > + if (!assign
> > > > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > > > + return false;
> > > > > +
> > > > > + lhs = gimple_assign_lhs (assign);
> > > > > + ass_code = gimple_assign_rhs_code (assign);
> > > > > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > > > > + return false;
> > > > > +
> > > > > + op0 = gimple_assign_rhs1 (assign);
> > > > > + op1 = gimple_assign_rhs2 (assign);
> > > > > +
> > > > > + assign = last_and_only_stmt (alt_middle_bb);
> > > > > + if (!assign
> > > > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > > > + return false;
> > > > > +
> > > > > + alt_lhs = gimple_assign_lhs (assign);
> > > > > + if (ass_code != gimple_assign_rhs_code (assign))
> > > > > + return false;
> > > > > +
> > > > > + alt_op0 = gimple_assign_rhs1 (assign);
> > > > > + alt_op1 = gimple_assign_rhs2 (assign);
> > > > > +
> > > > > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > > > > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > > > > + return false;
> > > > > +
> > > > > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > > > > + || (alt_smaller
> > > > > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > > > > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > > > > + || (alt_larger
> > > > > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > > > > + {
> > > > > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > > > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > > > + return false;
> > > > > +
> > > > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > > > + && (bound = strip_bit_not (op1)) != NULL)
> > > > > + {
> > > > > + minmax = MAX_EXPR;
> > > > > + ass_code = invert_minmax_code (ass_code);
> > > > > + invert = true;
> > > > > + }
> > > > > + else
> > > > > + {
> > > > > + bound = op1;
> > > > > + minmax = MIN_EXPR;
> > > > > + arg0 = op0;
> > > > > + arg1 = alt_op0;
> > > > > + }
> > > > > + }
> > > > > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > > > > + || (alt_larger
> > > > > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > > > > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > > > > + || (alt_smaller
> > > > > + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> > > > > + {
> > > > > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > > > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > > > + return false;
> > > > > +
> > > > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > > > + && (bound = strip_bit_not (op1)) != NULL)
> > > > > + {
> > > > > + minmax = MIN_EXPR;
> > > > > + ass_code = invert_minmax_code (ass_code);
> > > > > + invert = true;
> > > > > + }
> > > > > + else
> > > > > + {
> > > > > + bound = op1;
> > > > > + minmax = MAX_EXPR;
> > > > > + arg0 = op0;
> > > > > + arg1 = alt_op0;
> > > > > + }
> > > > > + }
> > > > > + else
> > > > > + return false;
> > > >
> > > > Did you check you have coverage for all cases above in your testcases?
> > > >
> > > > > + /* Reset any range information from the basic block. */
> > > > > + reset_flow_sensitive_info_in_bb (cond_bb);
> > > >
> > > > Huh. You need to reset flow-sensitive info of the middle-bb stmt
> > > > that prevails only...
> > > >
> > > > > + /* Emit the statement to compute min/max. */
> > > > > + gimple_seq stmts = NULL;
> > > > > + tree phi_result = PHI_RESULT (phi);
> > > > > + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
> > > > > + result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
> > > >
> > > > ... but you are re-building both here. And also you drop locations,
> > > > the preserved min/max should keep the old, the new should get the
> > > > location of ... hmm, the condition possibly?
> > > >
> > > > > + if (invert)
> > > > > + result = gimple_build (&stmts, BIT_NOT_EXPR, TREE_TYPE (phi_result), result);
> > > > > +
> > > > > + gsi = gsi_last_bb (cond_bb);
> > > > > + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> > > > > +
> > > > > + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> > > > > + return true;
> > > > > + }
> > > > > else
> > > > > {
> > > > > /* Recognize the following case, assuming d <= u:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461
> > > > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > > > Boudien Moerman; HRB 36809 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461
> > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > Boudien Moerman; HRB 36809 (AG Nuernberg)
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-06-20 8:36 ` Richard Biener
2022-06-20 9:01 ` Tamar Christina
@ 2022-07-05 15:25 ` Tamar Christina
2022-07-12 9:39 ` Tamar Christina
2022-07-12 13:19 ` Richard Biener
1 sibling, 2 replies; 26+ messages in thread
From: Tamar Christina @ 2022-07-05 15:25 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, nd, jakub
[-- Attachment #1: Type: text/plain, Size: 35542 bytes --]
> > }
> > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > + && single_succ_p (bb1)
> > + && single_succ_p (bb2)
> > + && single_pred_p (bb1)
> > + && single_pred_p (bb2)
> > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
>
> please do the single_succ/pred checks below where appropriate, also what's
> the last check about?
Done.
> why does the merge block need a single successor?
I was using it to fix an ICE, but I realize that's not the right fix. I'm now checking
whether the BB is empty instead, in which case it's just a fall-through edge, so we don't
treat it as a diamond.
>
> > + {
> > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb (bb1);
> > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb (bb2);
> > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > + {
> > + gimple *stmt1 = gsi_stmt (it1);
> > + gimple *stmt2 = gsi_stmt (it2);
> > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > + {
> > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > + diamond_minmax_p
> > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > + }
> > + }
> > + }
>
> I'd generalize this to general diamond detection, simply cutting off
> *_replacement workers that do not handle diamonds and do appropriate
> checks in minmax_replacement only.
>
Done.
> > else
> > continue;
> >
> > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > if (!candorest)
> > continue;
> >
> > + /* Check that we're looking for nested phis. */
> > + if (phis == NULL && diamond_minmax_p)
> > + {
> > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > + e2 = EDGE_SUCC (bb2, 0);
> > + }
> > +
>
> instead
>
> basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> gimple_seq phis = phi_nodes (merge);
>
Done.
>
> > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > if (!phi)
> > continue;
> > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> >
> > gphi *newphi;
> > if (single_pred_p (bb1)
> > + && !diamond_minmax_p
> > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > arg0, arg1,
> > cond_stmt)))
> > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > }
> >
> > /* Do the replacement of conditional if it can be done. */
> > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > + if (!early_p
> > + && !diamond_minmax_p
> > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > cfgchanged = true;
> > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > - arg0, arg1,
> > - early_p))
> > + else if (!diamond_minmax_p
> > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > + arg0, arg1, early_p))
> > cfgchanged = true;
> > else if (!early_p
> > + && !diamond_minmax_p
> > && single_pred_p (bb1)
> > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> > phi, arg0, arg1))
> > cfgchanged = true;
> > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > + diamond_minmax_p))
> > cfgchanged = true;
> > else if (single_pred_p (bb1)
> > + && !diamond_minmax_p
> > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > cfgchanged = true;
> > }
> > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> >
> > static void
> > replace_phi_edge_with_variable (basic_block cond_block,
> > - edge e, gphi *phi, tree new_tree)
> > + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> > {
> > basic_block bb = gimple_bb (phi);
> > gimple_stmt_iterator gsi;
> > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
> > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > else
> > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
>
> why do you need this change?
When this function replaces the edge it doesn't seem to update the dominators.
Since it's replacing the middle BB, we then end up with the error
gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error: dominator of 5 should be 4, not 2
during early verification. So instead I replace the BB but defer its deletion until
cleanup, which removes it and updates the dominators.
>
> Did you check whether the new case works when the merge block has more
> than two incoming edges?
>
Yes, added a new testcase for it.
> > + else if (middle_bb != alt_middle_bb && threeway_p)
> > + {
> > + /* Recognize the following case:
> > +
> > + if (smaller < larger)
> > + a = MIN (smaller, c);
> > + else
> > + b = MIN (larger, c);
> > + x = PHI <a, b>
> > +
> > + This is equivalent to
> > +
> > + a = MIN (smaller, c);
> > + x = MIN (larger, a); */
> > +
> > + gimple *assign = last_and_only_stmt (middle_bb);
> > + tree lhs, op0, op1, bound;
> > + tree alt_lhs, alt_op0, alt_op1;
> > + bool invert = false;
> > +
> > + if (!single_pred_p (middle_bb)
> > + || !single_pred_p (alt_middle_bb))
> > + return false;
> > +
> > + if (!assign
> > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > + return false;
> > +
> > + lhs = gimple_assign_lhs (assign);
> > + ass_code = gimple_assign_rhs_code (assign);
> > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > + return false;
> > +
> > + op0 = gimple_assign_rhs1 (assign);
> > + op1 = gimple_assign_rhs2 (assign);
> > +
> > + assign = last_and_only_stmt (alt_middle_bb);
> > + if (!assign
> > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > + return false;
> > +
> > + alt_lhs = gimple_assign_lhs (assign);
> > + if (ass_code != gimple_assign_rhs_code (assign))
> > + return false;
> > +
> > + alt_op0 = gimple_assign_rhs1 (assign);
> > + alt_op1 = gimple_assign_rhs2 (assign);
> > +
> > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > + return false;
> > +
> > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > + || (alt_smaller
> > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > + || (alt_larger
> > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > + {
> > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > + return false;
> > +
> > + if ((arg0 = strip_bit_not (op0)) != NULL
> > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > + && (bound = strip_bit_not (op1)) != NULL)
> > + {
> > + minmax = MAX_EXPR;
> > + ass_code = invert_minmax_code (ass_code);
> > + invert = true;
> > + }
> > + else
> > + {
> > + bound = op1;
> > + minmax = MIN_EXPR;
> > + arg0 = op0;
> > + arg1 = alt_op0;
> > + }
> > + }
> > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > + || (alt_larger
> > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > + || (alt_smaller
> > + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> > + {
> > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > + return false;
> > +
> > + if ((arg0 = strip_bit_not (op0)) != NULL
> > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > + && (bound = strip_bit_not (op1)) != NULL)
> > + {
> > + minmax = MIN_EXPR;
> > + ass_code = invert_minmax_code (ass_code);
> > + invert = true;
> > + }
> > + else
> > + {
> > + bound = op1;
> > + minmax = MAX_EXPR;
> > + arg0 = op0;
> > + arg1 = alt_op0;
> > + }
> > + }
> > + else
> > + return false;
>
> Did you check you have coverage for all cases above in your testcases?
I've added some more; the testcases should now have full coverage.
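For reference, the plain (non-inverted) three-way case from the comment in the patch can be sanity-checked outside the compiler. The following is only a minimal sketch, assuming uint8_t operands as in minmax-3.c; the helper names (min_u8, three_min_flat, three_min_mismatches) are made up for illustration and are not part of the patch or of GCC:

```c
#include <assert.h>
#include <stdint.h>

/* Branchy source form, shaped like the minmax-3.c testcase.  */
uint8_t three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc < xm)
    xk = (uint8_t) (xc < xy ? xc : xy);
  else
    xk = (uint8_t) (xm < xy ? xm : xy);
  return xk;
}

/* Hypothetical helper standing in for MIN_EXPR on uint8_t.  */
uint8_t min_u8 (uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

/* Straight-line form: a = MIN (smaller, c); x = MIN (larger, a).  */
uint8_t three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t a = min_u8 (xc, xy);
  return min_u8 (xm, a);
}

/* Count the inputs on which the two forms disagree, trying every
   combination of three uint8_t values.  */
unsigned long three_min_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned c = 0; c < 256; c++)
    for (unsigned m = 0; m < 256; m++)
      for (unsigned y = 0; y < 256; y++)
        if (three_min_branchy ((uint8_t) c, (uint8_t) m, (uint8_t) y)
            != three_min_flat ((uint8_t) c, (uint8_t) m, (uint8_t) y))
          bad++;
  return bad;
}
```

Calling three_min_mismatches () from a small driver and asserting that it returns 0 is enough to exercise the equivalence for this operand type. It says nothing about the CFG checks in phiopt itself.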
>
> > + /* Reset any range information from the basic block. */
> > + reset_flow_sensitive_info_in_bb (cond_bb);
>
> Huh. You need to reset flow-sensitive info of the middle-bb stmt that
> prevails only...
>
> > + /* Emit the statement to compute min/max. */
> > + gimple_seq stmts = NULL;
> > + tree phi_result = PHI_RESULT (phi);
> > + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
> > + result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
>
> ... but you are re-building both here. And also you drop locations, the
> preserved min/max should keep the old, the new should get the location of
> ... hmm, the condition possibly?
Done. I also added a testcase that checks it still works with -g.
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
sequence of a three-way conditional.
(replace_phi_edge_with_variable): Support deferring of BB removal.
(tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
min/max.
(strip_bit_not, invert_minmax_code): New.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
code away.
* gcc.dg/tree-ssa/minmax-10.c: New test.
* gcc.dg/tree-ssa/minmax-11.c: New test.
* gcc.dg/tree-ssa/minmax-12.c: New test.
* gcc.dg/tree-ssa/minmax-13.c: New test.
* gcc.dg/tree-ssa/minmax-14.c: New test.
* gcc.dg/tree-ssa/minmax-15.c: New test.
* gcc.dg/tree-ssa/minmax-16.c: New test.
* gcc.dg/tree-ssa/minmax-3.c: New test.
* gcc.dg/tree-ssa/minmax-4.c: New test.
* gcc.dg/tree-ssa/minmax-5.c: New test.
* gcc.dg/tree-ssa/minmax-6.c: New test.
* gcc.dg/tree-ssa/minmax-7.c: New test.
* gcc.dg/tree-ssa/minmax-8.c: New test.
* gcc.dg/tree-ssa/minmax-9.c: New test.
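As an illustration of what strip_bit_not and invert_minmax_code are used for: when all three operands are results of a bitwise not, the MIN/MAX codes can be swapped and a single BIT_NOT_EXPR applied to the result, since MIN (~a, ~b) == ~MAX (a, b). The sketch below only checks that identity for the uint8_t operands used in minmax-9.c; the helper names (umin8, umax8, not_u8, and so on) are hypothetical and do not exist in GCC:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for MIN_EXPR, MAX_EXPR and BIT_NOT_EXPR
   on uint8_t.  */
uint8_t umin8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
uint8_t umax8 (uint8_t a, uint8_t b) { return a > b ? a : b; }
uint8_t not_u8 (uint8_t x) { return (uint8_t) ~x; }

/* Original shape: three inversions feeding two MINs.  */
uint8_t min3_of_inverted (uint8_t a, uint8_t b, uint8_t c)
{
  return umin8 (umin8 (not_u8 (a), not_u8 (c)), not_u8 (b));
}

/* Rewritten shape: two MAXes on the stripped operands, then one
   inversion of the result.  */
uint8_t inverted_max3 (uint8_t a, uint8_t b, uint8_t c)
{
  return not_u8 (umax8 (umax8 (a, c), b));
}

/* Count the inputs on which the two shapes disagree, trying every
   combination of three uint8_t values.  */
unsigned long inverted_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      for (unsigned c = 0; c < 256; c++)
        if (min3_of_inverted ((uint8_t) a, (uint8_t) b, (uint8_t) c)
            != inverted_max3 ((uint8_t) a, (uint8_t) b, (uint8_t) c))
          bad++;
  return bad;
}
```

This matches what minmax-9.c scans for in the optimized dump (two MAX_EXPR and one "= ~"), but it is only a model of the arithmetic, not of the pass.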
--- inline copy of patch ---
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
new file mode 100644
index 0000000000000000000000000000000000000000..589953684416a9d263084deb58f6cde7094dd517
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
new file mode 100644
index 0000000000000000000000000000000000000000..1c2ef01b5d1e639fbf95bb5ca473b63cc98e9df1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc > xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
new file mode 100644
index 0000000000000000000000000000000000000000..3d0c07d9b57dd689bcb89653937727ab441e7f2b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc > xm) {
+ xk = (uint8_t) (xy < xc ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
new file mode 100644
index 0000000000000000000000000000000000000000..c0d0f27c8027ae87654532d1b919cfeccf4413e0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
new file mode 100644
index 0000000000000000000000000000000000000000..9c0cadbf7e3119527cb2007d01fe4c7dd772c069
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc < xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
new file mode 100644
index 0000000000000000000000000000000000000000..1d97a16564f069b4348ff325c4fd713a224f838a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy, bool m) {
+ uint8_t xk;
+ if (xc)
+ {
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ }
+
+ return xk;
+}
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
new file mode 100644
index 0000000000000000000000000000000000000000..89377a2cb341bdafa6ba145c61c1f966af536839
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt -g" } */
+
+#include <stdint.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
new file mode 100644
index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xy < xc ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
new file mode 100644
index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
new file mode 100644
index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
new file mode 100644
index 0000000000000000000000000000000000000000..24f580271c3ac3945860b506d4dc7d178a826093
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
#include <stdio.h>
#include <stdlib.h>
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index e61d9736937573d773acdf3e43a7c76074bfb2c7..df543b22cd720538c14bcea72fc78a8dec9bf12b 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -63,8 +63,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
gimple *);
static int value_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
-static bool minmax_replacement (basic_block, basic_block,
- edge, edge, gphi *, tree, tree);
+static bool minmax_replacement (basic_block, basic_block, basic_block,
+ edge, edge, gphi *, tree, tree, bool);
static bool spaceship_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
@@ -74,7 +74,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
hash_set<tree> *);
static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
static hash_set<tree> * get_non_trapping ();
-static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
+static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
static void hoist_adjacent_loads (basic_block, basic_block,
basic_block, basic_block);
static bool gate_hoist_loads (void);
@@ -200,6 +200,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
basic_block bb1, bb2;
edge e1, e2;
tree arg0, arg1;
+ bool diamond_p = false;
bb = bb_order[i];
@@ -266,6 +267,9 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
hoist_adjacent_loads (bb, bb1, bb2, bb3);
continue;
}
+ else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
+ && !empty_block_p (bb1))
+ diamond_p = true;
else
continue;
@@ -294,10 +298,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
}
else
{
- gimple_seq phis = phi_nodes (bb2);
gimple_stmt_iterator gsi;
bool candorest = true;
+ /* Check that we're looking for nested phis. */
+ basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
+ gimple_seq phis = phi_nodes (merge);
+
/* Value replacement can work with more than one PHI
so try that first. */
if (!early_p)
@@ -317,6 +324,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
if (!candorest)
continue;
+ e2 = diamond_p ? EDGE_SUCC (bb2, 0) : e2;
phi = single_non_singleton_phi_for_edges (phis, e1, e2);
if (!phi)
continue;
@@ -330,6 +338,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
gphi *newphi;
if (single_pred_p (bb1)
+ && !diamond_p
&& (newphi = factor_out_conditional_conversion (e1, e2, phi,
arg0, arg1,
cond_stmt)))
@@ -344,20 +353,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
}
/* Do the replacement of conditional if it can be done. */
- if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
+ if (!early_p
+ && !diamond_p
+ && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
cfgchanged = true;
- else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
- arg0, arg1,
- early_p))
+ else if (!diamond_p
+ && match_simplify_replacement (bb, bb1, e1, e2, phi,
+ arg0, arg1, early_p))
cfgchanged = true;
else if (!early_p
+ && !diamond_p
&& single_pred_p (bb1)
&& cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
phi, arg0, arg1))
cfgchanged = true;
- else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
+ else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
+ diamond_p))
cfgchanged = true;
else if (single_pred_p (bb1)
+ && !diamond_p
&& spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
cfgchanged = true;
}
@@ -386,7 +400,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
static void
replace_phi_edge_with_variable (basic_block cond_block,
- edge e, gphi *phi, tree new_tree)
+ edge e, gphi *phi, tree new_tree, bool delete_bb = true)
{
basic_block bb = gimple_bb (phi);
gimple_stmt_iterator gsi;
@@ -427,7 +441,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
edge_to_remove = EDGE_SUCC (cond_block, 1);
else
edge_to_remove = EDGE_SUCC (cond_block, 0);
- if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
+ if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
{
e->flags |= EDGE_FALLTHRU;
e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
@@ -1733,15 +1747,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
return 0;
}
+/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
+ the value being inverted. */
+
+static tree
+strip_bit_not (tree var)
+{
+ if (TREE_CODE (var) != SSA_NAME)
+ return NULL_TREE;
+
+ gimple *assign = SSA_NAME_DEF_STMT (var);
+ if (gimple_code (assign) != GIMPLE_ASSIGN)
+ return NULL_TREE;
+
+ if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
+ return NULL_TREE;
+
+ return gimple_assign_rhs1 (assign);
+}
+
+/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
+
+enum tree_code
+invert_minmax_code (enum tree_code code)
+{
+ switch (code) {
+ case MIN_EXPR:
+ return MAX_EXPR;
+ case MAX_EXPR:
+ return MIN_EXPR;
+ default:
+ gcc_unreachable ();
+ }
+}
+
/* The function minmax_replacement does the main work of doing the minmax
replacement. Return true if the replacement is done. Otherwise return
false.
BB is the basic block where the replacement is going to be done on. ARG0
- is argument 0 from the PHI. Likewise for ARG1. */
+ is argument 0 from the PHI. Likewise for ARG1.
+
+ If THREEWAY_P then expect the BB to be laid out in diamond shape with each
+ BB containing only a MIN or MAX expression. */
static bool
-minmax_replacement (basic_block cond_bb, basic_block middle_bb,
- edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
+minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
+ edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
{
tree result;
edge true_edge, false_edge;
@@ -1896,16 +1947,20 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
if (false_edge->dest == middle_bb)
false_edge = EDGE_SUCC (false_edge->dest, 0);
+ /* When THREEWAY_P then e1 will point to the edge of the final transition
+ from middle-bb to end. */
if (true_edge == e0)
{
- gcc_assert (false_edge == e1);
+ if (!threeway_p)
+ gcc_assert (false_edge == e1);
arg_true = arg0;
arg_false = arg1;
}
else
{
gcc_assert (false_edge == e0);
- gcc_assert (true_edge == e1);
+ if (!threeway_p)
+ gcc_assert (true_edge == e1);
arg_true = arg1;
arg_false = arg0;
}
@@ -1937,6 +1992,165 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
else
return false;
}
+ else if (middle_bb != alt_middle_bb && threeway_p)
+ {
+ /* Recognize the following case:
+
+ if (smaller < larger)
+ a = MIN (smaller, c);
+ else
+ b = MIN (larger, c);
+ x = PHI <a, b>
+
+ This is equivalent to
+
+ a = MIN (smaller, c);
+ x = MIN (larger, a); */
+
+ gimple *assign = last_and_only_stmt (middle_bb);
+ tree lhs, op0, op1, bound;
+ tree alt_lhs, alt_op0, alt_op1;
+ bool invert = false;
+
+ if (!single_pred_p (middle_bb)
+ || !single_pred_p (alt_middle_bb)
+ || !single_succ_p (middle_bb)
+ || !single_succ_p (alt_middle_bb))
+ return false;
+
+ /* When THREEWAY_P then e1 will point to the edge of the final transition
+ from middle-bb to end. */
+ if (true_edge == e0)
+ gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
+ else
+ gcc_assert (true_edge == EDGE_PRED (e1->src, 0));
+
+ bool valid_minmax_p = false;
+ gimple_stmt_iterator it1
+ = gsi_start_nondebug_after_labels_bb (middle_bb);
+ gimple_stmt_iterator it2
+ = gsi_start_nondebug_after_labels_bb (alt_middle_bb);
+ if (gsi_one_nondebug_before_end_p (it1)
+ && gsi_one_nondebug_before_end_p (it2))
+ {
+ gimple *stmt1 = gsi_stmt (it1);
+ gimple *stmt2 = gsi_stmt (it2);
+ if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
+ {
+ enum tree_code code1 = gimple_assign_rhs_code (stmt1);
+ enum tree_code code2 = gimple_assign_rhs_code (stmt2);
+ valid_minmax_p = (code1 == MIN_EXPR || code1 == MAX_EXPR)
+ && (code2 == MIN_EXPR || code2 == MAX_EXPR);
+ }
+ }
+
+ if (!valid_minmax_p)
+ return false;
+
+ if (!assign
+ || gimple_code (assign) != GIMPLE_ASSIGN)
+ return false;
+
+ lhs = gimple_assign_lhs (assign);
+ ass_code = gimple_assign_rhs_code (assign);
+ if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
+ return false;
+
+ op0 = gimple_assign_rhs1 (assign);
+ op1 = gimple_assign_rhs2 (assign);
+
+ assign = last_and_only_stmt (alt_middle_bb);
+ if (!assign
+ || gimple_code (assign) != GIMPLE_ASSIGN)
+ return false;
+
+ alt_lhs = gimple_assign_lhs (assign);
+ if (ass_code != gimple_assign_rhs_code (assign))
+ return false;
+
+ if (!operand_equal_for_phi_arg_p (lhs, arg_true)
+ || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
+ return false;
+
+ alt_op0 = gimple_assign_rhs1 (assign);
+ alt_op1 = gimple_assign_rhs2 (assign);
+
+ if ((operand_equal_for_phi_arg_p (op0, smaller)
+ || (alt_smaller
+ && operand_equal_for_phi_arg_p (op0, alt_smaller)))
+ && (operand_equal_for_phi_arg_p (alt_op0, larger)
+ || (alt_larger
+ && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
+ {
+ /* We got here if the condition is true, i.e., SMALLER < LARGER. */
+ if (!operand_equal_for_phi_arg_p (op1, alt_op1))
+ return false;
+
+ if ((arg0 = strip_bit_not (op0)) != NULL
+ && (arg1 = strip_bit_not (alt_op0)) != NULL
+ && (bound = strip_bit_not (op1)) != NULL)
+ {
+ minmax = MAX_EXPR;
+ ass_code = invert_minmax_code (ass_code);
+ invert = true;
+ }
+ else
+ {
+ bound = op1;
+ minmax = MIN_EXPR;
+ arg0 = op0;
+ arg1 = alt_op0;
+ }
+ }
+ else if ((operand_equal_for_phi_arg_p (op0, larger)
+ || (alt_larger
+ && operand_equal_for_phi_arg_p (op0, alt_larger)))
+ && (operand_equal_for_phi_arg_p (alt_op0, smaller)
+ || (alt_smaller
+ && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
+ {
+ /* We got here if the condition is true, i.e., SMALLER > LARGER. */
+ if (!operand_equal_for_phi_arg_p (op1, alt_op1))
+ return false;
+
+ if ((arg0 = strip_bit_not (op0)) != NULL
+ && (arg1 = strip_bit_not (alt_op0)) != NULL
+ && (bound = strip_bit_not (op1)) != NULL)
+ {
+ minmax = MIN_EXPR;
+ ass_code = invert_minmax_code (ass_code);
+ invert = true;
+ }
+ else
+ {
+ bound = op1;
+ minmax = MAX_EXPR;
+ arg0 = op0;
+ arg1 = alt_op0;
+ }
+ }
+ else
+ return false;
+
+ /* Emit the statement to compute min/max. */
+ location_t locus = gimple_location (last_stmt (cond_bb));
+ gimple_seq stmts = NULL;
+ tree phi_result = PHI_RESULT (phi);
+ result = gimple_build (&stmts, locus, minmax, TREE_TYPE (phi_result),
+ arg0, bound);
+ result = gimple_build (&stmts, locus, ass_code, TREE_TYPE (phi_result),
+ result, arg1);
+ if (invert)
+ result = gimple_build (&stmts, locus, BIT_NOT_EXPR, TREE_TYPE (phi_result),
+ result);
+
+ gsi = gsi_last_bb (cond_bb);
+ gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
+
+ replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
+
+ return true;
+ }
else
{
/* Recognize the following case, assuming d <= u:
[-- Attachment #2: rb15841.patch --]
[-- Type: application/octet-stream, Size: 23823 bytes --]
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
new file mode 100644
index 0000000000000000000000000000000000000000..589953684416a9d263084deb58f6cde7094dd517
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
new file mode 100644
index 0000000000000000000000000000000000000000..1c2ef01b5d1e639fbf95bb5ca473b63cc98e9df1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc > xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
new file mode 100644
index 0000000000000000000000000000000000000000..3d0c07d9b57dd689bcb89653937727ab441e7f2b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc > xm) {
+ xk = (uint8_t) (xy < xc ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
new file mode 100644
index 0000000000000000000000000000000000000000..c0d0f27c8027ae87654532d1b919cfeccf4413e0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
new file mode 100644
index 0000000000000000000000000000000000000000..9c0cadbf7e3119527cb2007d01fe4c7dd772c069
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc < xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
new file mode 100644
index 0000000000000000000000000000000000000000..1d97a16564f069b4348ff325c4fd713a224f838a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy, bool m) {
+ uint8_t xk;
+ if (xc)
+ {
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ }
+
+ return xk;
+}
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
new file mode 100644
index 0000000000000000000000000000000000000000..89377a2cb341bdafa6ba145c61c1f966af536839
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt -g" } */
+
+#include <stdint.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
new file mode 100644
index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xy < xc ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
new file mode 100644
index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc > xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
new file mode 100644
index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt" } */
+
+#include <stdint.h>
+
+uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ if (xc < xm) {
+ xk = (uint8_t) (xc > xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm > xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
new file mode 100644
index 0000000000000000000000000000000000000000..24f580271c3ac3945860b506d4dc7d178a826093
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
+ uint8_t xk;
+ xc=~xc;
+ xm=~xm;
+ xy=~xy;
+ if (xc < xm) {
+ xk = (uint8_t) (xc < xy ? xc : xy);
+ } else {
+ xk = (uint8_t) (xm < xy ? xm : xy);
+ }
+ return xk;
+}
+
+/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
#include <stdio.h>
#include <stdlib.h>
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index e61d9736937573d773acdf3e43a7c76074bfb2c7..df543b22cd720538c14bcea72fc78a8dec9bf12b 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -63,8 +63,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
gimple *);
static int value_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
-static bool minmax_replacement (basic_block, basic_block,
- edge, edge, gphi *, tree, tree);
+static bool minmax_replacement (basic_block, basic_block, basic_block,
+ edge, edge, gphi *, tree, tree, bool);
static bool spaceship_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
@@ -74,7 +74,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
hash_set<tree> *);
static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
static hash_set<tree> * get_non_trapping ();
-static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
+static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
static void hoist_adjacent_loads (basic_block, basic_block,
basic_block, basic_block);
static bool gate_hoist_loads (void);
@@ -200,6 +200,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
basic_block bb1, bb2;
edge e1, e2;
tree arg0, arg1;
+ bool diamond_p = false;
bb = bb_order[i];
@@ -266,6 +267,9 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
hoist_adjacent_loads (bb, bb1, bb2, bb3);
continue;
}
+ else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
+ && !empty_block_p (bb1))
+ diamond_p = true;
else
continue;
@@ -294,10 +298,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
}
else
{
- gimple_seq phis = phi_nodes (bb2);
gimple_stmt_iterator gsi;
bool candorest = true;
+ /* Check that we're looking for nested phis. */
+ basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
+ gimple_seq phis = phi_nodes (merge);
+
/* Value replacement can work with more than one PHI
so try that first. */
if (!early_p)
@@ -317,6 +324,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
if (!candorest)
continue;
+ e2 = diamond_p ? EDGE_SUCC (bb2, 0) : e2;
phi = single_non_singleton_phi_for_edges (phis, e1, e2);
if (!phi)
continue;
@@ -330,6 +338,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
gphi *newphi;
if (single_pred_p (bb1)
+ && !diamond_p
&& (newphi = factor_out_conditional_conversion (e1, e2, phi,
arg0, arg1,
cond_stmt)))
@@ -344,20 +353,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
}
/* Do the replacement of conditional if it can be done. */
- if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
+ if (!early_p
+ && !diamond_p
+ && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
cfgchanged = true;
- else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
- arg0, arg1,
- early_p))
+ else if (!diamond_p
+ && match_simplify_replacement (bb, bb1, e1, e2, phi,
+ arg0, arg1, early_p))
cfgchanged = true;
else if (!early_p
+ && !diamond_p
&& single_pred_p (bb1)
&& cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
phi, arg0, arg1))
cfgchanged = true;
- else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
+ else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
+ diamond_p))
cfgchanged = true;
else if (single_pred_p (bb1)
+ && !diamond_p
&& spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
cfgchanged = true;
}
@@ -386,7 +400,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
static void
replace_phi_edge_with_variable (basic_block cond_block,
- edge e, gphi *phi, tree new_tree)
+ edge e, gphi *phi, tree new_tree, bool delete_bb = true)
{
basic_block bb = gimple_bb (phi);
gimple_stmt_iterator gsi;
@@ -427,7 +441,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
edge_to_remove = EDGE_SUCC (cond_block, 1);
else
edge_to_remove = EDGE_SUCC (cond_block, 0);
- if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
+ if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
{
e->flags |= EDGE_FALLTHRU;
e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
@@ -1733,15 +1747,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
return 0;
}
+/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
+ the value being inverted. */
+
+static tree
+strip_bit_not (tree var)
+{
+ if (TREE_CODE (var) != SSA_NAME)
+ return NULL_TREE;
+
+ gimple *assign = SSA_NAME_DEF_STMT (var);
+ if (gimple_code (assign) != GIMPLE_ASSIGN)
+ return NULL_TREE;
+
+ if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
+ return NULL_TREE;
+
+ return gimple_assign_rhs1 (assign);
+}
+
+/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
+
+enum tree_code
+invert_minmax_code (enum tree_code code)
+{
+ switch (code) {
+ case MIN_EXPR:
+ return MAX_EXPR;
+ case MAX_EXPR:
+ return MIN_EXPR;
+ default:
+ gcc_unreachable ();
+ }
+}
+
/* The function minmax_replacement does the main work of doing the minmax
replacement. Return true if the replacement is done. Otherwise return
false.
BB is the basic block where the replacement is going to be done on. ARG0
- is argument 0 from the PHI. Likewise for ARG1. */
+ is argument 0 from the PHI. Likewise for ARG1.
+
+ If THREEWAY_P then expect the BB to be laid out in diamond shape with each
+ BB containing only a MIN or MAX expression. */
static bool
-minmax_replacement (basic_block cond_bb, basic_block middle_bb,
- edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
+minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
+ edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
{
tree result;
edge true_edge, false_edge;
@@ -1896,16 +1947,20 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
if (false_edge->dest == middle_bb)
false_edge = EDGE_SUCC (false_edge->dest, 0);
+ /* When THREEWAY_P then e1 will point to the edge of the final transition
+ from middle-bb to end. */
if (true_edge == e0)
{
- gcc_assert (false_edge == e1);
+ if (!threeway_p)
+ gcc_assert (false_edge == e1);
arg_true = arg0;
arg_false = arg1;
}
else
{
gcc_assert (false_edge == e0);
- gcc_assert (true_edge == e1);
+ if (!threeway_p)
+ gcc_assert (true_edge == e1);
arg_true = arg1;
arg_false = arg0;
}
@@ -1937,6 +1992,165 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
else
return false;
}
+ else if (middle_bb != alt_middle_bb && threeway_p)
+ {
+ /* Recognize the following case:
+
+ if (smaller < larger)
+ a = MIN (smaller, c);
+ else
+ b = MIN (larger, c);
+ x = PHI <a, b>
+
+ This is equivalent to
+
+ a = MIN (smaller, c);
+ x = MIN (larger, a); */
+
+ gimple *assign = last_and_only_stmt (middle_bb);
+ tree lhs, op0, op1, bound;
+ tree alt_lhs, alt_op0, alt_op1;
+ bool invert = false;
+
+ if (!single_pred_p (middle_bb)
+ || !single_pred_p (alt_middle_bb)
+ || !single_succ_p (middle_bb)
+ || !single_succ_p (alt_middle_bb))
+ return false;
+
+ /* When THREEWAY_P then e1 will point to the edge of the final transition
+ from middle-bb to end. */
+ if (true_edge == e0)
+ gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
+ else
+ gcc_assert (true_edge == EDGE_PRED (e1->src, 0));
+
+ bool valid_minmax_p = false;
+ gimple_stmt_iterator it1
+ = gsi_start_nondebug_after_labels_bb (middle_bb);
+ gimple_stmt_iterator it2
+ = gsi_start_nondebug_after_labels_bb (alt_middle_bb);
+ if (gsi_one_nondebug_before_end_p (it1)
+ && gsi_one_nondebug_before_end_p (it2))
+ {
+ gimple *stmt1 = gsi_stmt (it1);
+ gimple *stmt2 = gsi_stmt (it2);
+ if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
+ {
+ enum tree_code code1 = gimple_assign_rhs_code (stmt1);
+ enum tree_code code2 = gimple_assign_rhs_code (stmt2);
+ valid_minmax_p = (code1 == MIN_EXPR || code1 == MAX_EXPR)
+ && (code2 == MIN_EXPR || code2 == MAX_EXPR);
+ }
+ }
+
+ if (!valid_minmax_p)
+ return false;
+
+ if (!assign
+ || gimple_code (assign) != GIMPLE_ASSIGN)
+ return false;
+
+ lhs = gimple_assign_lhs (assign);
+ ass_code = gimple_assign_rhs_code (assign);
+ if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
+ return false;
+
+ op0 = gimple_assign_rhs1 (assign);
+ op1 = gimple_assign_rhs2 (assign);
+
+ assign = last_and_only_stmt (alt_middle_bb);
+ if (!assign
+ || gimple_code (assign) != GIMPLE_ASSIGN)
+ return false;
+
+ alt_lhs = gimple_assign_lhs (assign);
+ if (ass_code != gimple_assign_rhs_code (assign))
+ return false;
+
+ if (!operand_equal_for_phi_arg_p (lhs, arg_true)
+ || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
+ return false;
+
+ alt_op0 = gimple_assign_rhs1 (assign);
+ alt_op1 = gimple_assign_rhs2 (assign);
+
+ if ((operand_equal_for_phi_arg_p (op0, smaller)
+ || (alt_smaller
+ && operand_equal_for_phi_arg_p (op0, alt_smaller)))
+ && (operand_equal_for_phi_arg_p (alt_op0, larger)
+ || (alt_larger
+ && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
+ {
+ /* We got here if the condition is true, i.e., SMALLER < LARGER. */
+ if (!operand_equal_for_phi_arg_p (op1, alt_op1))
+ return false;
+
+ if ((arg0 = strip_bit_not (op0)) != NULL
+ && (arg1 = strip_bit_not (alt_op0)) != NULL
+ && (bound = strip_bit_not (op1)) != NULL)
+ {
+ minmax = MAX_EXPR;
+ ass_code = invert_minmax_code (ass_code);
+ invert = true;
+ }
+ else
+ {
+ bound = op1;
+ minmax = MIN_EXPR;
+ arg0 = op0;
+ arg1 = alt_op0;
+ }
+ }
+ else if ((operand_equal_for_phi_arg_p (op0, larger)
+ || (alt_larger
+ && operand_equal_for_phi_arg_p (op0, alt_larger)))
+ && (operand_equal_for_phi_arg_p (alt_op0, smaller)
+ || (alt_smaller
+ && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
+ {
+ /* We got here if the condition is true, i.e., SMALLER > LARGER. */
+ if (!operand_equal_for_phi_arg_p (op1, alt_op1))
+ return false;
+
+ if ((arg0 = strip_bit_not (op0)) != NULL
+ && (arg1 = strip_bit_not (alt_op0)) != NULL
+ && (bound = strip_bit_not (op1)) != NULL)
+ {
+ minmax = MIN_EXPR;
+ ass_code = invert_minmax_code (ass_code);
+ invert = true;
+ }
+ else
+ {
+ bound = op1;
+ minmax = MAX_EXPR;
+ arg0 = op0;
+ arg1 = alt_op0;
+ }
+ }
+ else
+ return false;
+
+ /* Emit the statement to compute min/max. */
+ location_t locus = gimple_location (last_stmt (cond_bb));
+ gimple_seq stmts = NULL;
+ tree phi_result = PHI_RESULT (phi);
+ result = gimple_build (&stmts, locus, minmax, TREE_TYPE (phi_result),
+ arg0, bound);
+ result = gimple_build (&stmts, locus, ass_code, TREE_TYPE (phi_result),
+ result, arg1);
+ if (invert)
+ result = gimple_build (&stmts, locus, BIT_NOT_EXPR, TREE_TYPE (phi_result),
+ result);
+
+ gsi = gsi_last_bb (cond_bb);
+ gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
+
+ replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
+
+ return true;
+ }
else
{
/* Recognize the following case, assuming d <= u:
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-07-05 15:25 ` Tamar Christina
@ 2022-07-12 9:39 ` Tamar Christina
2022-07-12 13:19 ` Richard Biener
1 sibling, 0 replies; 26+ messages in thread
From: Tamar Christina @ 2022-07-12 9:39 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, nd, jakub
ping
> -----Original Message-----
> From: Tamar Christina
> Sent: Tuesday, July 5, 2022 4:26 PM
> To: Richard Biener <rguenther@suse.de>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> Subject: RE: [PATCH 2/2]middle-end: Support recognition of three-way
> max/min.
>
> > > }
> > > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > > + && single_succ_p (bb1)
> > > + && single_succ_p (bb2)
> > > + && single_pred_p (bb1)
> > > + && single_pred_p (bb2)
> > > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> >
> > please do the single_succ/pred checks below where appropriate, also
> > what's the last check about?
>
> Done.
>
> > why does the merge block need a single successor?
>
> I was using it to fix an ICE, but I realize that's not the right fix. I'm now
> checking whether the BB is empty instead; in that case it's just a
> fall-through edge, so it is not treated as a diamond.
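
For readers following along, the diamond test described above can be sketched
on a toy CFG. This is only an illustration under stated assumptions:
struct toy_block and toy_diamond_p are hypothetical names invented here and are
not GCC's basic_block or any GCC API. The sketch mirrors the shape of the
condition in the patch (both arms flow to the same block, and the first arm
is not empty).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy block, NOT GCC's basic_block: it only records the first
   successor and how many real statements the block holds.  */
struct toy_block
{
  struct toy_block *succ;	/* First successor, or NULL.  */
  int n_stmts;			/* Number of non-debug statements.  */
};

/* Return true if BB1 and BB2 look like the two arms of a diamond: they
   share the same successor and BB1 is not empty.  An empty BB1 is just a
   fall-through edge, so it is not treated as a diamond.  */
static bool
toy_diamond_p (const struct toy_block *bb1, const struct toy_block *bb2)
{
  return bb1->succ != NULL
	 && bb1->succ == bb2->succ
	 && bb1->n_stmts > 0;
}

/* Small self-check of the two interesting shapes.  */
static void
toy_diamond_self_test (void)
{
  struct toy_block merge = { NULL, 0 };
  struct toy_block then_bb = { &merge, 1 };
  struct toy_block else_bb = { &merge, 1 };
  struct toy_block empty_bb = { &merge, 0 };

  assert (toy_diamond_p (&then_bb, &else_bb));
  assert (!toy_diamond_p (&empty_bb, &else_bb));
}
```

Calling toy_diamond_self_test () from a main function exercises both the
diamond shape and the empty fall-through shape.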
>
> >
> > > + {
> > > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb
> > (bb1);
> > > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb
> > (bb2);
> > > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > > + {
> > > + gimple *stmt1 = gsi_stmt (it1);
> > > + gimple *stmt2 = gsi_stmt (it2);
> > > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > > + {
> > > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > > + diamond_minmax_p
> > > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > > + }
> > > + }
> > > + }
> >
> > I'd generalize this to general diamond detection, simply cutting off
> > *_replacement workers that do not handle diamonds and do appropriate
> > checks in minmax_replacement only.
> >
>
> Done.
>
> > > else
> > > continue;
> > >
> > > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > bool do_hoist_loads, bool early_p)
> > > if (!candorest)
> > > continue;
> > >
> > > + /* Check that we're looking for nested phis. */
> > > + if (phis == NULL && diamond_minmax_p)
> > > + {
> > > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > > + e2 = EDGE_SUCC (bb2, 0);
> > > + }
> > > +
> >
> > instead
> >
> > basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > gimple_seq phis = phi_nodes (merge);
> >
>
> Done.
>
> >
> > > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > > if (!phi)
> > > continue;
> > > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> bool
> > > do_hoist_loads, bool early_p)
> > >
> > > gphi *newphi;
> > > if (single_pred_p (bb1)
> > > + && !diamond_minmax_p
> > > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > arg0, arg1,
> > > cond_stmt)))
> > > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > bool do_hoist_loads, bool early_p)
> > > }
> > >
> > > /* Do the replacement of conditional if it can be done. */
> > > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0,
> > arg1))
> > > + if (!early_p
> > > + && !diamond_minmax_p
> > > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > cfgchanged = true;
> > > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > - arg0, arg1,
> > > - early_p))
> > > + else if (!diamond_minmax_p
> > > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > + arg0, arg1, early_p))
> > > cfgchanged = true;
> > > else if (!early_p
> > > + && !diamond_minmax_p
> > > && single_pred_p (bb1)
> > > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1,
> > e2,
> > > phi, arg0, arg1))
> > > cfgchanged = true;
> > > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > > + diamond_minmax_p))
> > > cfgchanged = true;
> > > else if (single_pred_p (bb1)
> > > + && !diamond_minmax_p
> > > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0,
> > arg1))
> > > cfgchanged = true;
> > > }
> > > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> bool
> > > do_hoist_loads, bool early_p)
> > >
> > > static void
> > > replace_phi_edge_with_variable (basic_block cond_block,
> > > - edge e, gphi *phi, tree new_tree)
> > > + edge e, gphi *phi, tree new_tree, bool
> > delete_bb = true)
> > > {
> > > basic_block bb = gimple_bb (phi);
> > > gimple_stmt_iterator gsi;
> > > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block
> > cond_block,
> > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > else
> > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 &&
> delete_bb)
> >
> > why do you need this change?
>
> When this function replaces the edge it doesn't seem to update the
> dominators.  Since it's replacing the middle BB, we then end up with an
> error
>
> gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error: dominator of 5
> should be 4, not 2
>
> during early verify. So instead, I replace the BB but defer its deletion
> until cleanup, which removes it and updates the dominators.
>
> >
> > Did you check whether the new case works when the merge block has
> more
> > than two incoming edges?
> >
>
> Yes, added a new testcase for it.
>
> > > + else if (middle_bb != alt_middle_bb && threeway_p)
> > > + {
> > > + /* Recognize the following case:
> > > +
> > > + if (smaller < larger)
> > > + a = MIN (smaller, c);
> > > + else
> > > + b = MIN (larger, c);
> > > + x = PHI <a, b>
> > > +
> > > + This is equivalent to
> > > +
> > > + a = MIN (smaller, c);
> > > + x = MIN (larger, a); */
> > > +
> > > + gimple *assign = last_and_only_stmt (middle_bb);
> > > + tree lhs, op0, op1, bound;
> > > + tree alt_lhs, alt_op0, alt_op1;
> > > + bool invert = false;
> > > +
> > > + if (!single_pred_p (middle_bb)
> > > + || !single_pred_p (alt_middle_bb))
> > > + return false;
> > > +
> > > + if (!assign
> > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > + return false;
> > > +
> > > + lhs = gimple_assign_lhs (assign);
> > > + ass_code = gimple_assign_rhs_code (assign);
> > > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > > + return false;
> > > +
> > > + op0 = gimple_assign_rhs1 (assign);
> > > + op1 = gimple_assign_rhs2 (assign);
> > > +
> > > + assign = last_and_only_stmt (alt_middle_bb);
> > > + if (!assign
> > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > + return false;
> > > +
> > > + alt_lhs = gimple_assign_lhs (assign);
> > > + if (ass_code != gimple_assign_rhs_code (assign))
> > > + return false;
> > > +
> > > + alt_op0 = gimple_assign_rhs1 (assign);
> > > + alt_op1 = gimple_assign_rhs2 (assign);
> > > +
> > > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > > + return false;
> > > +
> > > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > > + || (alt_smaller
> > > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > > + || (alt_larger
> > > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > > + {
> > > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > + return false;
> > > +
> > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > + && (bound = strip_bit_not (op1)) != NULL)
> > > + {
> > > + minmax = MAX_EXPR;
> > > + ass_code = invert_minmax_code (ass_code);
> > > + invert = true;
> > > + }
> > > + else
> > > + {
> > > + bound = op1;
> > > + minmax = MIN_EXPR;
> > > + arg0 = op0;
> > > + arg1 = alt_op0;
> > > + }
> > > + }
> > > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > > + || (alt_larger
> > > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > > + || (alt_smaller
> > > + && operand_equal_for_phi_arg_p (alt_op0,
> > alt_smaller))))
> > > + {
> > > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > + return false;
> > > +
> > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > + && (bound = strip_bit_not (op1)) != NULL)
> > > + {
> > > + minmax = MIN_EXPR;
> > > + ass_code = invert_minmax_code (ass_code);
> > > + invert = true;
> > > + }
> > > + else
> > > + {
> > > + bound = op1;
> > > + minmax = MAX_EXPR;
> > > + arg0 = op0;
> > > + arg1 = alt_op0;
> > > + }
> > > + }
> > > + else
> > > + return false;
> >
> > Did you check you have coverage for all cases above in your testcases?
>
> I've added some more, should now have full coverage.
>
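The identity behind the three-way case quoted above can be checked exhaustively for uint8_t. What follows is a minimal standalone sketch, not part of the patch: the helper names (min_u8, max_u8, three_min_branchy, three_min_flat, count_mismatches, check_all) are made up for illustration. It also checks that a bitwise not swaps MIN and MAX on unsigned values, which is the property the strip_bit_not path appears to rely on. It does not cover the mixed MIN/MAX variants.

```c
#include <assert.h>
#include <stdint.h>

uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Branchy form, shaped like the three_min function in minmax-3.c.  */
uint8_t
three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  if (xc < xm)
    return min_u8 (xc, xy);
  else
    return min_u8 (xm, xy);
}

/* Straight-line form: a = MIN (smaller, c); x = MIN (larger, a).  */
uint8_t
three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t a = min_u8 (xc, xy);
  return min_u8 (xm, a);
}

/* Count the inputs on which either identity fails.  A result of 0 means
   both hold for every uint8_t input.  */
unsigned long
count_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned c = 0; c < 256; c++)
    for (unsigned m = 0; m < 256; m++)
      {
	/* ~MIN (~c, ~m) == MAX (c, m) for unsigned 8-bit values.  */
	uint8_t inv = (uint8_t) ~min_u8 ((uint8_t) ~c, (uint8_t) ~m);
	if (inv != max_u8 ((uint8_t) c, (uint8_t) m))
	  bad++;

	for (unsigned y = 0; y < 256; y++)
	  if (three_min_branchy ((uint8_t) c, (uint8_t) m, (uint8_t) y)
	      != three_min_flat ((uint8_t) c, (uint8_t) m, (uint8_t) y))
	    bad++;
      }
  return bad;
}

/* Abort if any input violates the identities.  */
void
check_all (void)
{
  assert (count_mismatches () == 0);
}
```

Calling check_all () from a small driver walks all 2^24 input triples, which is cheap enough to run by hand.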
> >
> > > + /* Reset any range information from the basic block. */
> > > + reset_flow_sensitive_info_in_bb (cond_bb);
> >
> > Huh. You need to reset flow-sensitive info of the middle-bb stmt that
> > prevails only...
> >
> > > + /* Emit the statement to compute min/max. */
> > > + gimple_seq stmts = NULL;
> > > + tree phi_result = PHI_RESULT (phi);
> > > + result = gimple_build (&stmts, minmax, TREE_TYPE
> > > + (phi_result), arg0,
> > bound);
> > > + result = gimple_build (&stmts, ass_code, TREE_TYPE
> > > + (phi_result), result, arg1);
> >
> > ... but you are re-building both here. And also you drop locations,
> > the preserved min/max should keep the old, the new should get the
> > location of ... hmm, the condition possibly?
>
> Done, also added a testcase which checks that it still works when -g.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the
> phi sequence of a three-way conditional.
> (replace_phi_edge_with_variable): Support deferring of BB removal.
> (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> min/max.
> (strip_bit_not, invert_minmax_code): New.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> code away.
> * gcc.dg/tree-ssa/minmax-10.c: New test.
> * gcc.dg/tree-ssa/minmax-11.c: New test.
> * gcc.dg/tree-ssa/minmax-12.c: New test.
> * gcc.dg/tree-ssa/minmax-13.c: New test.
> * gcc.dg/tree-ssa/minmax-14.c: New test.
> * gcc.dg/tree-ssa/minmax-15.c: New test.
> * gcc.dg/tree-ssa/minmax-16.c: New test.
> * gcc.dg/tree-ssa/minmax-3.c: New test.
> * gcc.dg/tree-ssa/minmax-4.c: New test.
> * gcc.dg/tree-ssa/minmax-5.c: New test.
> * gcc.dg/tree-ssa/minmax-6.c: New test.
> * gcc.dg/tree-ssa/minmax-7.c: New test.
> * gcc.dg/tree-ssa/minmax-8.c: New test.
> * gcc.dg/tree-ssa/minmax-9.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..589953684416a9d263084deb5
> 8f6cde7094dd517
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..1c2ef01b5d1e639fbf95bb5ca4
> 73b63cc98e9df1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc > xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..3d0c07d9b57dd689bcb896539
> 37727ab441e7f2b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc > xm) {
> + xk = (uint8_t) (xy < xc ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..c0d0f27c8027ae87654532d1b9
> 19cfeccf4413e0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..9c0cadbf7e3119527cb2007d01
> fe4c7dd772c069
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc < xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..1d97a16564f069b4348ff325c4f
> d713a224f838a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy, bool m) {
> + uint8_t xk;
> + if (xc)
> + {
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + }
> +
> + return xk;
> +}
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..89377a2cb341bdafa6ba145c61
> c1f966af536839
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt -g" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e
> 6a843695a05786e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17c
> b522da44bafa0e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f
> 5b9993074f8510
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e
> 85639e3a49dd4b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xy < xc ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8b
> ef9fa7b1c5e104
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60c
> b654fcab0bfdd1c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..24f580271c3ac3945860b506d4
> dc7d178a826093
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> index
> 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba724
> 7d75a32d2c860ed 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> @@ -1,5 +1,5 @@
> /* { dg-do run } */
> -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param
> max-jump-thread-duplication-stmts=20" } */
> +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details
> +--param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
>
> #include <stdio.h>
> #include <stdlib.h>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc index
> e61d9736937573d773acdf3e43a7c76074bfb2c7..df543b22cd720538c14bcea72f
> c78a8dec9bf12b 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -63,8 +63,8 @@ static gphi *factor_out_conditional_conversion (edge,
> edge, gphi *, tree, tree,
> gimple *);
> static int value_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree); -static bool
> minmax_replacement (basic_block, basic_block,
> - edge, edge, gphi *, tree, tree);
> +static bool minmax_replacement (basic_block, basic_block, basic_block,
> + edge, edge, gphi *, tree, tree, bool);
> static bool spaceship_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree); static bool
> cond_removal_in_builtin_zero_pattern (basic_block, basic_block, @@ -74,7
> +74,7 @@ static bool cond_store_replacement (basic_block, basic_block,
> edge, edge,
> hash_set<tree> *);
> static bool cond_if_else_store_replacement (basic_block, basic_block,
> basic_block); static hash_set<tree> * get_non_trapping (); -static void
> replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> +static void replace_phi_edge_with_variable (basic_block, edge, gphi *,
> +tree, bool);
> static void hoist_adjacent_loads (basic_block, basic_block,
> basic_block, basic_block);
> static bool gate_hoist_loads (void);
> @@ -200,6 +200,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> do_hoist_loads, bool early_p)
> basic_block bb1, bb2;
> edge e1, e2;
> tree arg0, arg1;
> + bool diamond_p = false;
>
> bb = bb_order[i];
>
> @@ -266,6 +267,9 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> do_hoist_loads, bool early_p)
> hoist_adjacent_loads (bb, bb1, bb2, bb3);
> continue;
> }
> + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> + && !empty_block_p (bb1))
> + diamond_p = true;
> else
> continue;
>
> @@ -294,10 +298,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> do_hoist_loads, bool early_p)
> }
> else
> {
> - gimple_seq phis = phi_nodes (bb2);
> gimple_stmt_iterator gsi;
> bool candorest = true;
>
> + /* Check that we're looking for nested phis. */
> + basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> + gimple_seq phis = phi_nodes (merge);
> +
> /* Value replacement can work with more than one PHI
> so try that first. */
> if (!early_p)
> @@ -317,6 +324,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> do_hoist_loads, bool early_p)
> if (!candorest)
> continue;
>
> + e2 = diamond_p ? EDGE_SUCC (bb2, 0) : e2;
> phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> if (!phi)
> continue;
> @@ -330,6 +338,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> do_hoist_loads, bool early_p)
>
> gphi *newphi;
> if (single_pred_p (bb1)
> + && !diamond_p
> && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> arg0, arg1,
> cond_stmt)))
> @@ -344,20 +353,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> do_hoist_loads, bool early_p)
> }
>
> /* Do the replacement of conditional if it can be done. */
> - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0,
> arg1))
> + if (!early_p
> + && !diamond_p
> + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> cfgchanged = true;
> - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> - arg0, arg1,
> - early_p))
> + else if (!diamond_p
> + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> + arg0, arg1, early_p))
> cfgchanged = true;
> else if (!early_p
> + && !diamond_p
> && single_pred_p (bb1)
> && cond_removal_in_builtin_zero_pattern (bb, bb1, e1,
> e2,
> phi, arg0, arg1))
> cfgchanged = true;
> - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> + diamond_p))
> cfgchanged = true;
> else if (single_pred_p (bb1)
> + && !diamond_p
> && spaceship_replacement (bb, bb1, e1, e2, phi, arg0,
> arg1))
> cfgchanged = true;
> }
> @@ -386,7 +400,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> do_hoist_loads, bool early_p)
>
> static void
> replace_phi_edge_with_variable (basic_block cond_block,
> - edge e, gphi *phi, tree new_tree)
> + edge e, gphi *phi, tree new_tree, bool
> delete_bb = true)
> {
> basic_block bb = gimple_bb (phi);
> gimple_stmt_iterator gsi;
> @@ -427,7 +441,7 @@ replace_phi_edge_with_variable (basic_block
> cond_block,
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> else
> edge_to_remove = EDGE_SUCC (cond_block, 0);
> - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> {
> e->flags |= EDGE_FALLTHRU;
> e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE); @@ -1733,15
> +1747,52 @@ value_replacement (basic_block cond_bb, basic_block
> middle_bb,
> return 0;
> }
>
> +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the
> TREE for
> + the value being inverted. */
> +
> +static tree
> +strip_bit_not (tree var)
> +{
> + if (TREE_CODE (var) != SSA_NAME)
> + return NULL_TREE;
> +
> + gimple *assign = SSA_NAME_DEF_STMT (var);
> + if (gimple_code (assign) != GIMPLE_ASSIGN)
> + return NULL_TREE;
> +
> + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> + return NULL_TREE;
> +
> + return gimple_assign_rhs1 (assign);
> +}
> +
> +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> +
> +enum tree_code
> +invert_minmax_code (enum tree_code code) {
> + switch (code) {
> + case MIN_EXPR:
> + return MAX_EXPR;
> + case MAX_EXPR:
> + return MIN_EXPR;
> + default:
> + gcc_unreachable ();
> + }
> +}
> +
> /* The function minmax_replacement does the main work of doing the
> minmax
> replacement. Return true if the replacement is done. Otherwise return
> false.
> BB is the basic block where the replacement is going to be done on. ARG0
> - is argument 0 from the PHI. Likewise for ARG1. */
> + is argument 0 from the PHI. Likewise for ARG1.
> +
> + If THREEWAY_P then expect the BB to be laid out in diamond shape with
> each
> + BB containing only a MIN or MAX expression. */
>
> static bool
> -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> +minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> basic_block alt_middle_bb,
> + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool
> +threeway_p)
> {
> tree result;
> edge true_edge, false_edge;
> @@ -1896,16 +1947,20 @@ minmax_replacement (basic_block cond_bb,
> basic_block middle_bb,
> if (false_edge->dest == middle_bb)
> false_edge = EDGE_SUCC (false_edge->dest, 0);
>
> + /* When THREEWAY_P then e1 will point to the edge of the final transition
> + from middle-bb to end. */
> if (true_edge == e0)
> {
> - gcc_assert (false_edge == e1);
> + if (!threeway_p)
> + gcc_assert (false_edge == e1);
> arg_true = arg0;
> arg_false = arg1;
> }
> else
> {
> gcc_assert (false_edge == e0);
> - gcc_assert (true_edge == e1);
> + if (!threeway_p)
> + gcc_assert (true_edge == e1);
> arg_true = arg1;
> arg_false = arg0;
> }
> @@ -1937,6 +1992,165 @@ minmax_replacement (basic_block cond_bb,
> basic_block middle_bb,
> else
> return false;
> }
> + else if (middle_bb != alt_middle_bb && threeway_p)
> + {
> + /* Recognize the following case:
> +
> + if (smaller < larger)
> + a = MIN (smaller, c);
> + else
> + b = MIN (larger, c);
> + x = PHI <a, b>
> +
> + This is equivalent to
> +
> + a = MIN (smaller, c);
> + x = MIN (larger, a); */
> +
> + gimple *assign = last_and_only_stmt (middle_bb);
> + tree lhs, op0, op1, bound;
> + tree alt_lhs, alt_op0, alt_op1;
> + bool invert = false;
> +
> + if (!single_pred_p (middle_bb)
> + || !single_pred_p (alt_middle_bb)
> + || !single_succ_p (middle_bb)
> + || !single_succ_p (alt_middle_bb))
> + return false;
> +
> + /* When THREEWAY_P then e1 will point to the edge of the final
> transition
> + from middle-bb to end. */
> + if (true_edge == e0)
> + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> + else
> + gcc_assert (true_edge == EDGE_PRED (e1->src, 0));
> +
> + bool valid_minmax_p = false;
> + gimple_stmt_iterator it1
> + = gsi_start_nondebug_after_labels_bb (middle_bb);
> + gimple_stmt_iterator it2
> + = gsi_start_nondebug_after_labels_bb (alt_middle_bb);
> + if (gsi_one_nondebug_before_end_p (it1)
> + && gsi_one_nondebug_before_end_p (it2))
> + {
> + gimple *stmt1 = gsi_stmt (it1);
> + gimple *stmt2 = gsi_stmt (it2);
> + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> + {
> + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> + valid_minmax_p = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> + && (code2 == MIN_EXPR || code2 ==
> MAX_EXPR);
> + }
> + }
> +
> + if (!valid_minmax_p)
> + return false;
> +
> + if (!assign
> + || gimple_code (assign) != GIMPLE_ASSIGN)
> + return false;
> +
> + lhs = gimple_assign_lhs (assign);
> + ass_code = gimple_assign_rhs_code (assign);
> + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> + return false;
> +
> + op0 = gimple_assign_rhs1 (assign);
> + op1 = gimple_assign_rhs2 (assign);
> +
> + assign = last_and_only_stmt (alt_middle_bb);
> + if (!assign
> + || gimple_code (assign) != GIMPLE_ASSIGN)
> + return false;
> +
> + alt_lhs = gimple_assign_lhs (assign);
> + if (ass_code != gimple_assign_rhs_code (assign))
> + return false;
> +
> + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> + return false;
> +
> + alt_op0 = gimple_assign_rhs1 (assign);
> + alt_op1 = gimple_assign_rhs2 (assign);
> +
> + if ((operand_equal_for_phi_arg_p (op0, smaller)
> + || (alt_smaller
> + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> + || (alt_larger
> + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> + {
> + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> + return false;
> +
> + if ((arg0 = strip_bit_not (op0)) != NULL
> + && (arg1 = strip_bit_not (alt_op0)) != NULL
> + && (bound = strip_bit_not (op1)) != NULL)
> + {
> + minmax = MAX_EXPR;
> + ass_code = invert_minmax_code (ass_code);
> + invert = true;
> + }
> + else
> + {
> + bound = op1;
> + minmax = MIN_EXPR;
> + arg0 = op0;
> + arg1 = alt_op0;
> + }
> + }
> + else if ((operand_equal_for_phi_arg_p (op0, larger)
> + || (alt_larger
> + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> + || (alt_smaller
> + && operand_equal_for_phi_arg_p (alt_op0,
> alt_smaller))))
> + {
> + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> + return false;
> +
> + if ((arg0 = strip_bit_not (op0)) != NULL
> + && (arg1 = strip_bit_not (alt_op0)) != NULL
> + && (bound = strip_bit_not (op1)) != NULL)
> + {
> + minmax = MIN_EXPR;
> + ass_code = invert_minmax_code (ass_code);
> + invert = true;
> + }
> + else
> + {
> + bound = op1;
> + minmax = MAX_EXPR;
> + arg0 = op0;
> + arg1 = alt_op0;
> + }
> + }
> + else
> + return false;
> +
> + /* Emit the statement to compute min/max. */
> + location_t locus = gimple_location (last_stmt (cond_bb));
> + gimple_seq stmts = NULL;
> + tree phi_result = PHI_RESULT (phi);
> + result = gimple_build (&stmts, locus, minmax, TREE_TYPE (phi_result),
> + arg0, bound);
> + result = gimple_build (&stmts, locus, ass_code, TREE_TYPE (phi_result),
> + result, arg1);
> + if (invert)
> + result = gimple_build (&stmts, locus, BIT_NOT_EXPR, TREE_TYPE
> (phi_result),
> + result);
> +
> + gsi = gsi_last_bb (cond_bb);
> + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> +
> + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> +
> + return true;
> + }
> else
> {
> /* Recognize the following case, assuming d <= u:
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-07-05 15:25 ` Tamar Christina
2022-07-12 9:39 ` Tamar Christina
@ 2022-07-12 13:19 ` Richard Biener
2022-07-27 10:40 ` Tamar Christina
1 sibling, 1 reply; 26+ messages in thread
From: Richard Biener @ 2022-07-12 13:19 UTC (permalink / raw)
To: Tamar Christina; +Cc: gcc-patches, nd, jakub
On Tue, 5 Jul 2022, Tamar Christina wrote:
> > > }
> > > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > > + && single_succ_p (bb1)
> > > + && single_succ_p (bb2)
> > > + && single_pred_p (bb1)
> > > + && single_pred_p (bb2)
> > > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> >
> > please do the single_succ/pred checks below where appropriate, also what's
> > the last check about?
>
> Done.
>
> > why does the merge block need a single successor?
>
> I was using it to fix an ICE, but I realize that's not the right fix. I'm now checking
> if the BB is empty instead, in which case it's just a fall-through edge so don't
> treat it as a diamond.
>
> >
> > > + {
> > > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb
> > (bb1);
> > > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb
> > (bb2);
> > > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > > + {
> > > + gimple *stmt1 = gsi_stmt (it1);
> > > + gimple *stmt2 = gsi_stmt (it2);
> > > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > > + {
> > > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > > + diamond_minmax_p
> > > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > > + }
> > > + }
> > > + }
> >
> > I'd generalize this to general diamond detection, simply cutting off
> > *_replacement workers that do not handle diamonds and do appropriate
> > checks in minmax_replacement only.
> >
>
> Done.
>
> > > else
> > > continue;
> > >
> > > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > bool do_hoist_loads, bool early_p)
> > > if (!candorest)
> > > continue;
> > >
> > > + /* Check that we're looking for nested phis. */
> > > + if (phis == NULL && diamond_minmax_p)
> > > + {
> > > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > > + e2 = EDGE_SUCC (bb2, 0);
> > > + }
> > > +
> >
> > instead
> >
> > basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > gimple_seq phis = phi_nodes (merge);
> >
>
> Done.
>
> >
> > > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > > if (!phi)
> > > continue;
> > > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > > do_hoist_loads, bool early_p)
> > >
> > > gphi *newphi;
> > > if (single_pred_p (bb1)
> > > + && !diamond_minmax_p
> > > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > arg0, arg1,
> > > cond_stmt)))
> > > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > bool do_hoist_loads, bool early_p)
> > > }
> > >
> > > /* Do the replacement of conditional if it can be done. */
> > > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0,
> > arg1))
> > > + if (!early_p
> > > + && !diamond_minmax_p
> > > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > cfgchanged = true;
> > > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > - arg0, arg1,
> > > - early_p))
> > > + else if (!diamond_minmax_p
> > > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > + arg0, arg1, early_p))
> > > cfgchanged = true;
> > > else if (!early_p
> > > + && !diamond_minmax_p
> > > && single_pred_p (bb1)
> > > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1,
> > e2,
> > > phi, arg0, arg1))
> > > cfgchanged = true;
> > > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > > + diamond_minmax_p))
> > > cfgchanged = true;
> > > else if (single_pred_p (bb1)
> > > + && !diamond_minmax_p
> > > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0,
> > arg1))
> > > cfgchanged = true;
> > > }
> > > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > > do_hoist_loads, bool early_p)
> > >
> > > static void
> > > replace_phi_edge_with_variable (basic_block cond_block,
> > > - edge e, gphi *phi, tree new_tree)
> > > + edge e, gphi *phi, tree new_tree, bool
> > delete_bb = true)
> > > {
> > > basic_block bb = gimple_bb (phi);
> > > gimple_stmt_iterator gsi;
> > > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block
> > cond_block,
> > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > else
> > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> >
> > why do you need this change?
>
> When this function replaces the edge it doesn't seem to update the dominators.
> Since it's replacing the middle BB we then end up with an error
>
> gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error: dominator of 5 should be 4, not 2
>
> during early verify. So instead, I replace the BB but defer its deletion until cleanup which
> removes it and updates the dominators.

Hmm, for a diamond shouldn't you replace

  if (EDGE_SUCC (cond_block, 0)->dest == bb)
    edge_to_remove = EDGE_SUCC (cond_block, 1);
  else
    edge_to_remove = EDGE_SUCC (cond_block, 0);

with

  if (EDGE_SUCC (cond_block, 0)->dest == bb)
    edge_to_remove = EDGE_SUCC (cond_block, 1);
  else if (EDGE_SUCC (cond_block, 1)->dest == bb)
    edge_to_remove = EDGE_SUCC (cond_block, 0);

That is, the code expects to be left with a fallthru to the PHI block,
whose immediate dominator is expected to be cond_block, but with a
diamond there's a (possibly empty) block in between and the dominators
are wrong.

So I think you simply need to handle this properly (and then
fall through to the else).
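
To make the triangle/diamond distinction concrete, here is a toy model of
the edge selection. It is only a sketch: it does not use GCC's CFG API,
blocks are plain integer indices, and the function name is made up for
illustration. In a triangle one successor of cond_block is the PHI block
itself; in a diamond neither is, so the function reports that case
instead of silently picking successor 0.

```c
/* Toy model, not GCC code: SUCC0_DEST and SUCC1_DEST are the block
   indices reached by the two outgoing edges of the conditional block,
   PHI_BB is the index of the block holding the PHI.

   Return the index (0 or 1) of the successor edge to remove, or -1
   when neither successor leads directly to PHI_BB, i.e. the diamond
   case that needs separate handling.  */
int
toy_edge_to_remove (int succ0_dest, int succ1_dest, int phi_bb)
{
  if (succ0_dest == phi_bb)
    return 1;
  else if (succ1_dest == phi_bb)
    return 0;

  /* Diamond: a middle block sits between each successor and PHI_BB.  */
  return -1;
}
```

With successors leading to blocks 3 and 4, a PHI in block 4 is the
triangle case, while a PHI in some other block (say 5) is the diamond.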
> >
> > Did you check whether the new case works when the merge block has more
> > than two incoming edges?
> >
>
> Yes, added a new testcase for it.
>
> > > + else if (middle_bb != alt_middle_bb && threeway_p)
> > > + {
> > > + /* Recognize the following case:
> > > +
> > > + if (smaller < larger)
> > > + a = MIN (smaller, c);
> > > + else
> > > + b = MIN (larger, c);
> > > + x = PHI <a, b>
> > > +
> > > + This is equivalent to
> > > +
> > > + a = MIN (smaller, c);
> > > + x = MIN (larger, a); */
> > > +
> > > + gimple *assign = last_and_only_stmt (middle_bb);
> > > + tree lhs, op0, op1, bound;
> > > + tree alt_lhs, alt_op0, alt_op1;
> > > + bool invert = false;
> > > +
> > > + if (!single_pred_p (middle_bb)
> > > + || !single_pred_p (alt_middle_bb))
> > > + return false;
> > > +
> > > + if (!assign
> > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > + return false;
> > > +
> > > + lhs = gimple_assign_lhs (assign);
> > > + ass_code = gimple_assign_rhs_code (assign);
> > > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > > + return false;
> > > +
> > > + op0 = gimple_assign_rhs1 (assign);
> > > + op1 = gimple_assign_rhs2 (assign);
> > > +
> > > + assign = last_and_only_stmt (alt_middle_bb);
> > > + if (!assign
> > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > + return false;
> > > +
> > > + alt_lhs = gimple_assign_lhs (assign);
> > > + if (ass_code != gimple_assign_rhs_code (assign))
> > > + return false;
> > > +
> > > + alt_op0 = gimple_assign_rhs1 (assign);
> > > + alt_op1 = gimple_assign_rhs2 (assign);
> > > +
> > > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > > + return false;
> > > +
> > > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > > + || (alt_smaller
> > > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > > + || (alt_larger
> > > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > > + {
> > > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > + return false;
> > > +
> > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > + && (bound = strip_bit_not (op1)) != NULL)
> > > + {
> > > + minmax = MAX_EXPR;
> > > + ass_code = invert_minmax_code (ass_code);
> > > + invert = true;
> > > + }
> > > + else
> > > + {
> > > + bound = op1;
> > > + minmax = MIN_EXPR;
> > > + arg0 = op0;
> > > + arg1 = alt_op0;
> > > + }
> > > + }
> > > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > > + || (alt_larger
> > > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > > + || (alt_smaller
> > > + && operand_equal_for_phi_arg_p (alt_op0,
> > alt_smaller))))
> > > + {
> > > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > + return false;
> > > +
> > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > + && (bound = strip_bit_not (op1)) != NULL)
> > > + {
> > > + minmax = MIN_EXPR;
> > > + ass_code = invert_minmax_code (ass_code);
> > > + invert = true;
> > > + }
> > > + else
> > > + {
> > > + bound = op1;
> > > + minmax = MAX_EXPR;
> > > + arg0 = op0;
> > > + arg1 = alt_op0;
> > > + }
> > > + }
> > > + else
> > > + return false;
> >
> > Did you check you have coverage for all cases above in your testcases?
>
> I've added some more, should now have full coverage.
Great.
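
For reference, the identity in the pattern comment above, and the MIN/MAX
flip that stripping the bit_nots relies on, can both be checked
exhaustively for uint8_t. This is only an illustrative sketch in plain C;
the helper names are made up and nothing here is taken from the patch
itself.

```c
#include <stdint.h>

static uint8_t
min_u8 (uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

static uint8_t
max_u8 (uint8_t a, uint8_t b)
{
  return a > b ? a : b;
}

/* Branchy form from the pattern comment:
     if (smaller < larger) a = MIN (smaller, c); else b = MIN (larger, c);
     x = PHI <a, b>  */
uint8_t
branchy_min (uint8_t smaller, uint8_t larger, uint8_t c)
{
  if (smaller < larger)
    return min_u8 (smaller, c);
  else
    return min_u8 (larger, c);
}

/* Count inputs where the branchy form differs from
   a = MIN (smaller, c); x = MIN (larger, a).  */
unsigned long
count_threeway_min_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned s = 0; s < 256; s++)
    for (unsigned l = 0; l < 256; l++)
      for (unsigned c = 0; c < 256; c++)
        {
          uint8_t a = min_u8 ((uint8_t) s, (uint8_t) c);
          uint8_t x = min_u8 ((uint8_t) l, a);
          if (branchy_min ((uint8_t) s, (uint8_t) l, (uint8_t) c) != x)
            bad++;
        }
  return bad;
}

/* Count pairs where MIN (~a, ~b) differs from ~MAX (a, b) in uint8_t.  */
unsigned long
count_bitnot_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      {
        uint8_t na = (uint8_t) ~a;
        uint8_t nb = (uint8_t) ~b;
        uint8_t flipped = (uint8_t) ~max_u8 ((uint8_t) a, (uint8_t) b);
        if (min_u8 (na, nb) != flipped)
          bad++;
      }
  return bad;
}
```

Both counters should come back as zero. Note that this only covers the
pure three-way MIN shape from the comment and the two-operand bit_not
flip, not the mixed MIN/MAX testcases.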
> >
> > > + /* Reset any range information from the basic block. */
> > > + reset_flow_sensitive_info_in_bb (cond_bb);
> >
> > Huh. You need to reset flow-sensitive info of the middle-bb stmt that
> > prevails only...
> >
> > > + /* Emit the statement to compute min/max. */
> > > + gimple_seq stmts = NULL;
> > > + tree phi_result = PHI_RESULT (phi);
> > > + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0,
> > bound);
> > > + result = gimple_build (&stmts, ass_code, TREE_TYPE
> > > + (phi_result), result, arg1);
> >
> > ... but you are re-building both here. And also you drop locations, the
> > preserved min/max should keep the old, the new should get the location of
> > ... hmm, the condition possibly?
>
> Done. I also added a testcase which checks that it still works when compiling with -g.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>
> Ok for master?
Besides the above issue it looks good to me.
Thanks and sorry for the delay.
Richard.
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> sequence of a three-way conditional.
> (replace_phi_edge_with_variable): Support deferring of BB removal.
> (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> min/max.
> (strip_bit_not, invert_minmax_code): New.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> code away.
> * gcc.dg/tree-ssa/minmax-10.c: New test.
> * gcc.dg/tree-ssa/minmax-11.c: New test.
> * gcc.dg/tree-ssa/minmax-12.c: New test.
> * gcc.dg/tree-ssa/minmax-13.c: New test.
> * gcc.dg/tree-ssa/minmax-14.c: New test.
> * gcc.dg/tree-ssa/minmax-15.c: New test.
> * gcc.dg/tree-ssa/minmax-16.c: New test.
> * gcc.dg/tree-ssa/minmax-3.c: New test.
> * gcc.dg/tree-ssa/minmax-4.c: New test.
> * gcc.dg/tree-ssa/minmax-5.c: New test.
> * gcc.dg/tree-ssa/minmax-6.c: New test.
> * gcc.dg/tree-ssa/minmax-7.c: New test.
> * gcc.dg/tree-ssa/minmax-8.c: New test.
> * gcc.dg/tree-ssa/minmax-9.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..589953684416a9d263084deb58f6cde7094dd517
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..1c2ef01b5d1e639fbf95bb5ca473b63cc98e9df1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc > xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..3d0c07d9b57dd689bcb89653937727ab441e7f2b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc > xm) {
> + xk = (uint8_t) (xy < xc ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..c0d0f27c8027ae87654532d1b919cfeccf4413e0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9c0cadbf7e3119527cb2007d01fe4c7dd772c069
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc < xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..1d97a16564f069b4348ff325c4fd713a224f838a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy, bool m) {
> + uint8_t xk;
> + if (xc)
> + {
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + }
> +
> + return xk;
> +}
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..89377a2cb341bdafa6ba145c61c1f966af536839
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt -g" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xy < xc ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc > xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + if (xc < xm) {
> + xk = (uint8_t) (xc > xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm > xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..24f580271c3ac3945860b506d4dc7d178a826093
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +#include <stdint.h>
> +
> +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> + uint8_t xk;
> + xc=~xc;
> + xm=~xm;
> + xy=~xy;
> + if (xc < xm) {
> + xk = (uint8_t) (xc < xy ? xc : xy);
> + } else {
> + xk = (uint8_t) (xm < xy ? xm : xy);
> + }
> + return xk;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> @@ -1,5 +1,5 @@
> /* { dg-do run } */
> -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
>
> #include <stdio.h>
> #include <stdlib.h>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index e61d9736937573d773acdf3e43a7c76074bfb2c7..df543b22cd720538c14bcea72fc78a8dec9bf12b 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -63,8 +63,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> gimple *);
> static int value_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree);
> -static bool minmax_replacement (basic_block, basic_block,
> - edge, edge, gphi *, tree, tree);
> +static bool minmax_replacement (basic_block, basic_block, basic_block,
> + edge, edge, gphi *, tree, tree, bool);
> static bool spaceship_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree);
> static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> @@ -74,7 +74,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> hash_set<tree> *);
> static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> static hash_set<tree> * get_non_trapping ();
> -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
> static void hoist_adjacent_loads (basic_block, basic_block,
> basic_block, basic_block);
> static bool gate_hoist_loads (void);
> @@ -200,6 +200,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> basic_block bb1, bb2;
> edge e1, e2;
> tree arg0, arg1;
> + bool diamond_p = false;
>
> bb = bb_order[i];
>
> @@ -266,6 +267,9 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> hoist_adjacent_loads (bb, bb1, bb2, bb3);
> continue;
> }
> + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> + && !empty_block_p (bb1))
> + diamond_p = true;
> else
> continue;
>
> @@ -294,10 +298,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> }
> else
> {
> - gimple_seq phis = phi_nodes (bb2);
> gimple_stmt_iterator gsi;
> bool candorest = true;
>
> + /* Check that we're looking for nested phis. */
> + basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> + gimple_seq phis = phi_nodes (merge);
> +
> /* Value replacement can work with more than one PHI
> so try that first. */
> if (!early_p)
> @@ -317,6 +324,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> if (!candorest)
> continue;
>
> + e2 = diamond_p ? EDGE_SUCC (bb2, 0) : e2;
> phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> if (!phi)
> continue;
> @@ -330,6 +338,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>
> gphi *newphi;
> if (single_pred_p (bb1)
> + && !diamond_p
> && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> arg0, arg1,
> cond_stmt)))
> @@ -344,20 +353,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> }
>
> /* Do the replacement of conditional if it can be done. */
> - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> + if (!early_p
> + && !diamond_p
> + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> cfgchanged = true;
> - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> - arg0, arg1,
> - early_p))
> + else if (!diamond_p
> + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> + arg0, arg1, early_p))
> cfgchanged = true;
> else if (!early_p
> + && !diamond_p
> && single_pred_p (bb1)
> && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> phi, arg0, arg1))
> cfgchanged = true;
> - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> + diamond_p))
> cfgchanged = true;
> else if (single_pred_p (bb1)
> + && !diamond_p
> && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> cfgchanged = true;
> }
> @@ -386,7 +400,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>
> static void
> replace_phi_edge_with_variable (basic_block cond_block,
> - edge e, gphi *phi, tree new_tree)
> + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> {
> basic_block bb = gimple_bb (phi);
> gimple_stmt_iterator gsi;
> @@ -427,7 +441,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> else
> edge_to_remove = EDGE_SUCC (cond_block, 0);
> - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> {
> e->flags |= EDGE_FALLTHRU;
> e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> @@ -1733,15 +1747,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> return 0;
> }
>
> +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
> + the value being inverted. */
> +
> +static tree
> +strip_bit_not (tree var)
> +{
> + if (TREE_CODE (var) != SSA_NAME)
> + return NULL_TREE;
> +
> + gimple *assign = SSA_NAME_DEF_STMT (var);
> + if (gimple_code (assign) != GIMPLE_ASSIGN)
> + return NULL_TREE;
> +
> + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> + return NULL_TREE;
> +
> + return gimple_assign_rhs1 (assign);
> +}
> +
> +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> +
> +enum tree_code
> +invert_minmax_code (enum tree_code code)
> +{
> + switch (code) {
> + case MIN_EXPR:
> + return MAX_EXPR;
> + case MAX_EXPR:
> + return MIN_EXPR;
> + default:
> + gcc_unreachable ();
> + }
> +}
> +
> /* The function minmax_replacement does the main work of doing the minmax
> replacement. Return true if the replacement is done. Otherwise return
> false.
> BB is the basic block where the replacement is going to be done on. ARG0
> - is argument 0 from the PHI. Likewise for ARG1. */
> + is argument 0 from the PHI. Likewise for ARG1.
> +
> + If THREEWAY_P then expect the BB to be laid out in diamond shape with each
> + BB containing only a MIN or MAX expression. */
>
> static bool
> -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> +minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
> + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
> {
> tree result;
> edge true_edge, false_edge;
> @@ -1896,16 +1947,20 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> if (false_edge->dest == middle_bb)
> false_edge = EDGE_SUCC (false_edge->dest, 0);
>
> + /* When THREEWAY_P then e1 will point to the edge of the final transition
> + from middle-bb to end. */
> if (true_edge == e0)
> {
> - gcc_assert (false_edge == e1);
> + if (!threeway_p)
> + gcc_assert (false_edge == e1);
> arg_true = arg0;
> arg_false = arg1;
> }
> else
> {
> gcc_assert (false_edge == e0);
> - gcc_assert (true_edge == e1);
> + if (!threeway_p)
> + gcc_assert (true_edge == e1);
> arg_true = arg1;
> arg_false = arg0;
> }
> @@ -1937,6 +1992,165 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> else
> return false;
> }
> + else if (middle_bb != alt_middle_bb && threeway_p)
> + {
> + /* Recognize the following case:
> +
> + if (smaller < larger)
> + a = MIN (smaller, c);
> + else
> + b = MIN (larger, c);
> + x = PHI <a, b>
> +
> + This is equivalent to
> +
> + a = MIN (smaller, c);
> + x = MIN (larger, a); */
> +
> + gimple *assign = last_and_only_stmt (middle_bb);
> + tree lhs, op0, op1, bound;
> + tree alt_lhs, alt_op0, alt_op1;
> + bool invert = false;
> +
> + if (!single_pred_p (middle_bb)
> + || !single_pred_p (alt_middle_bb)
> + || !single_succ_p (middle_bb)
> + || !single_succ_p (alt_middle_bb))
> + return false;
> +
> + /* When THREEWAY_P then e1 will point to the edge of the final transition
> + from middle-bb to end. */
> + if (true_edge == e0)
> + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> + else
> + gcc_assert (true_edge == EDGE_PRED (e1->src, 0));
> +
> + bool valid_minmax_p = false;
> + gimple_stmt_iterator it1
> + = gsi_start_nondebug_after_labels_bb (middle_bb);
> + gimple_stmt_iterator it2
> + = gsi_start_nondebug_after_labels_bb (alt_middle_bb);
> + if (gsi_one_nondebug_before_end_p (it1)
> + && gsi_one_nondebug_before_end_p (it2))
> + {
> + gimple *stmt1 = gsi_stmt (it1);
> + gimple *stmt2 = gsi_stmt (it2);
> + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> + {
> + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> + valid_minmax_p = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> + }
> + }
> +
> + if (!valid_minmax_p)
> + return false;
> +
> + if (!assign
> + || gimple_code (assign) != GIMPLE_ASSIGN)
> + return false;
> +
> + lhs = gimple_assign_lhs (assign);
> + ass_code = gimple_assign_rhs_code (assign);
> + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> + return false;
> +
> + op0 = gimple_assign_rhs1 (assign);
> + op1 = gimple_assign_rhs2 (assign);
> +
> + assign = last_and_only_stmt (alt_middle_bb);
> + if (!assign
> + || gimple_code (assign) != GIMPLE_ASSIGN)
> + return false;
> +
> + alt_lhs = gimple_assign_lhs (assign);
> + if (ass_code != gimple_assign_rhs_code (assign))
> + return false;
> +
> + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> + return false;
> +
> + alt_op0 = gimple_assign_rhs1 (assign);
> + alt_op1 = gimple_assign_rhs2 (assign);
> +
> + if ((operand_equal_for_phi_arg_p (op0, smaller)
> + || (alt_smaller
> + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> + || (alt_larger
> + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> + {
> + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> + return false;
> +
> + if ((arg0 = strip_bit_not (op0)) != NULL
> + && (arg1 = strip_bit_not (alt_op0)) != NULL
> + && (bound = strip_bit_not (op1)) != NULL)
> + {
> + minmax = MAX_EXPR;
> + ass_code = invert_minmax_code (ass_code);
> + invert = true;
> + }
> + else
> + {
> + bound = op1;
> + minmax = MIN_EXPR;
> + arg0 = op0;
> + arg1 = alt_op0;
> + }
> + }
> + else if ((operand_equal_for_phi_arg_p (op0, larger)
> + || (alt_larger
> + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> + || (alt_smaller
> + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> + {
> + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> + return false;
> +
> + if ((arg0 = strip_bit_not (op0)) != NULL
> + && (arg1 = strip_bit_not (alt_op0)) != NULL
> + && (bound = strip_bit_not (op1)) != NULL)
> + {
> + minmax = MIN_EXPR;
> + ass_code = invert_minmax_code (ass_code);
> + invert = true;
> + }
> + else
> + {
> + bound = op1;
> + minmax = MAX_EXPR;
> + arg0 = op0;
> + arg1 = alt_op0;
> + }
> + }
> + else
> + return false;
> +
> + /* Emit the statement to compute min/max. */
> + location_t locus = gimple_location (last_stmt (cond_bb));
> + gimple_seq stmts = NULL;
> + tree phi_result = PHI_RESULT (phi);
> + result = gimple_build (&stmts, locus, minmax, TREE_TYPE (phi_result),
> + arg0, bound);
> + result = gimple_build (&stmts, locus, ass_code, TREE_TYPE (phi_result),
> + result, arg1);
> + if (invert)
> + result = gimple_build (&stmts, locus, BIT_NOT_EXPR, TREE_TYPE (phi_result),
> + result);
> +
> + gsi = gsi_last_bb (cond_bb);
> + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> +
> + replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> +
> + return true;
> + }
> else
> {
> /* Recognize the following case, assuming d <= u:
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-07-12 13:19 ` Richard Biener
@ 2022-07-27 10:40 ` Tamar Christina
2022-07-27 11:18 ` Richard Biener
0 siblings, 1 reply; 26+ messages in thread
From: Tamar Christina @ 2022-07-27 10:40 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, nd, jakub
> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Tuesday, July 12, 2022 2:19 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> Subject: RE: [PATCH 2/2]middle-end: Support recognition of three-way
> max/min.
>
> On Tue, 5 Jul 2022, Tamar Christina wrote:
>
> > > > }
> > > > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > > > + && single_succ_p (bb1)
> > > > + && single_succ_p (bb2)
> > > > + && single_pred_p (bb1)
> > > > + && single_pred_p (bb2)
> > > > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> > >
> > > please do the single_succ/pred checks below where appropriate, also
> > > what's the last check about?
> >
> > Done.
> >
> > > why does the merge block need a single successor?
> >
> > I was using it to fix an ICE, but I realize that's not the right fix.
> > I'm now checking if the BB is empty instead, in which case it's just a
> > fall-through edge, so I don't treat it as a diamond.
> >
> > >
> > > > + {
> > > > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb
> > > (bb1);
> > > > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb
> > > (bb2);
> > > > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > > > + {
> > > > + gimple *stmt1 = gsi_stmt (it1);
> > > > + gimple *stmt2 = gsi_stmt (it2);
> > > > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > > > + {
> > > > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > > > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > > > + diamond_minmax_p
> > > > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > > > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > > > + }
> > > > + }
> > > > + }
> > >
> > > I'd generalize this to general diamond detection, simply cutting off
> > > *_replacement workers that do not handle diamonds and do appropriate
> > > checks in minmax_replacement only.
> > >
> >
> > Done.
> >
> > > > else
> > > > continue;
> > > >
> > > > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > bool do_hoist_loads, bool early_p)
> > > > if (!candorest)
> > > > continue;
> > > >
> > > > + /* Check that we're looking for nested phis. */
> > > > + if (phis == NULL && diamond_minmax_p)
> > > > + {
> > > > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > > > + e2 = EDGE_SUCC (bb2, 0);
> > > > + }
> > > > +
> > >
> > > instead
> > >
> > > basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > > gimple_seq phis = phi_nodes (merge);
> > >
> >
> > Done.
> >
> > >
> > > > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > > > if (!phi)
> > > > continue;
> > > > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > > bool do_hoist_loads, bool early_p)
> > > >
> > > > gphi *newphi;
> > > > if (single_pred_p (bb1)
> > > > + && !diamond_minmax_p
> > > > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > > arg0, arg1,
> > > > cond_stmt)))
> > > > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool
> do_store_elim,
> > > bool do_hoist_loads, bool early_p)
> > > > }
> > > >
> > > > /* Do the replacement of conditional if it can be done. */
> > > > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0,
> > > arg1))
> > > > + if (!early_p
> > > > + && !diamond_minmax_p
> > > > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > > cfgchanged = true;
> > > > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > > - arg0, arg1,
> > > > - early_p))
> > > > + else if (!diamond_minmax_p
> > > > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > > + arg0, arg1, early_p))
> > > > cfgchanged = true;
> > > > else if (!early_p
> > > > + && !diamond_minmax_p
> > > > && single_pred_p (bb1)
> > > > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1,
> > > e2,
> > > > phi, arg0, arg1))
> > > > cfgchanged = true;
> > > > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > > > + diamond_minmax_p))
> > > > cfgchanged = true;
> > > > else if (single_pred_p (bb1)
> > > > + && !diamond_minmax_p
> > > > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0,
> > > arg1))
> > > > cfgchanged = true;
> > > > }
> > > > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > > bool do_hoist_loads, bool early_p)
> > > >
> > > > static void
> > > > replace_phi_edge_with_variable (basic_block cond_block,
> > > > - edge e, gphi *phi, tree new_tree)
> > > > + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> > > > {
> > > > basic_block bb = gimple_bb (phi);
> > > > gimple_stmt_iterator gsi;
> > > > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block
> > > cond_block,
> > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > else
> > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> > >
> > > why do you need this change?
> >
> > When this function replaces the edge it doesn't seem to update the dominators.
> > Since it's replacing the middle BB we then end up with an error
> > gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error: dominator of 5
> > should be 4, not 2
> >
> > during early verify. So instead, I replace the BB but defer its
> > deletion until cleanup which removes it and updates the dominators.
>
> Hmm, for a diamond shouldn't you replace
>
> if (EDGE_SUCC (cond_block, 0)->dest == bb)
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> else
> edge_to_remove = EDGE_SUCC (cond_block, 0);
>
> with
>
> if (EDGE_SUCC (cond_block, 0)->dest == bb)
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> edge_to_remove = EDGE_SUCC (cond_block, 0);
>
> thus, the code expects to be left with a fallthru to the PHI block which is
> expected to have the immediate dominator being cond_block but with a
> diamond there's a (possibly empty) block in between and dominators are
> wrong.
Agreed, but (EDGE_SUCC (cond_block, 1)->dest == bb) doesn't seem like the
right check, since for a diamond there will be a block in between the two. Did you perhaps
mean EDGE_SUCC (EDGE_SUCC (cond_block, 1)->dest, 0)->dest == bb? i.e. that the destination
across the diamond is bb, and then you remove the middle block?
For the minmax diamond we want both edges removed, since all the code in the middle BBs is now
dead. But this is probably not true in the general case.
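To make the difference between the two checks concrete, here is a minimal sketch on a toy CFG with the 2 -> {3, 4} -> 5 shape. The toy_bb type and the toy_* helpers are made up for illustration and are not GCC's basic_block or edge API; they only mirror the pointer chasing in the two expressions above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy basic block with up to two successor pointers.  Hypothetical type,
   not GCC's basic_block / edge.  */
struct toy_bb
{
  int index;
  struct toy_bb *succ[2];
};

/* Direct check, in the spirit of
   EDGE_SUCC (cond_block, I)->dest == bb.  */
bool
toy_succ_is (const struct toy_bb *cond, int i, const struct toy_bb *bb)
{
  return cond->succ[i] == bb;
}

/* Look-through check, in the spirit of
   EDGE_SUCC (EDGE_SUCC (cond_block, I)->dest, 0)->dest == bb.  */
bool
toy_succ_of_succ_is (const struct toy_bb *cond, int i,
                     const struct toy_bb *bb)
{
  const struct toy_bb *middle = cond->succ[i];
  return middle != NULL && middle->succ[0] == bb;
}

/* Build the diamond 2 -> {3, 4} -> 5 and assert that only the
   look-through check finds the PHI block.  */
void
toy_diamond_demo (void)
{
  struct toy_bb bb5 = { 5, { NULL, NULL } };
  struct toy_bb bb3 = { 3, { &bb5, NULL } };
  struct toy_bb bb4 = { 4, { &bb5, NULL } };
  struct toy_bb bb2 = { 2, { &bb3, &bb4 } };

  /* Neither successor of the condition block is the PHI block itself.  */
  assert (!toy_succ_is (&bb2, 0, &bb5));
  assert (!toy_succ_is (&bb2, 1, &bb5));

  /* Both arms reach it through exactly one middle block.  */
  assert (toy_succ_of_succ_is (&bb2, 0, &bb5));
  assert (toy_succ_of_succ_is (&bb2, 1, &bb5));
}
```

In this toy model the direct comparison never matches for a diamond, which is why an unconditional else would pick an edge without having established where it leads.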
>>> p debug (cond_block)
<bb 2> :
xc_3 = ~xc_2(D);
xm_5 = ~xm_4(D);
xy_7 = ~xy_6(D);
_10 = MAX_EXPR <xc_2(D), xy_6(D)>;
_12 = MIN_EXPR <_10, xm_4(D)>;
_13 = ~_12;
if (xc_3 < xm_5)
goto <bb 3>; [INV]
else
goto <bb 4>; [INV]
>>> p debug (EDGE_SUCC (cond_block, 0)->dest)
<bb 3> :
xk_9 = MAX_EXPR <xc_3, xy_7>;
goto <bb 5>; [INV]
>>> p debug (EDGE_SUCC (cond_block, 1)->dest)
<bb 4> :
xk_8 = MAX_EXPR <xm_5, xy_7>;
>>> p debug (bb)
<bb 5> :
# xk_1 = PHI <xk_9(3), xk_8(4)>
return xk_1;
$6 = void
So something like this?
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 72d7b40a501..c107eeea1aa 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -400,7 +400,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
static void
replace_phi_edge_with_variable (basic_block cond_block,
- edge e, gphi *phi, tree new_tree, bool delete_bb = true)
+ edge e, gphi *phi, tree new_tree, bool diamond_p = false)
{
basic_block bb = gimple_bb (phi);
gimple_stmt_iterator gsi;
@@ -439,9 +439,9 @@ replace_phi_edge_with_variable (basic_block cond_block,
edge edge_to_remove;
if (EDGE_SUCC (cond_block, 0)->dest == bb)
edge_to_remove = EDGE_SUCC (cond_block, 1);
- else
+ else if (!diamond_p || (diamond_p && EDGE_SUCC (EDGE_SUCC (cond_block, 1)->dest, 0)->dest == bb))
edge_to_remove = EDGE_SUCC (cond_block, 0);
- if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
+ if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && !diamond_p)
{
e->flags |= EDGE_FALLTHRU;
e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
@@ -2147,7 +2147,7 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_
gsi = gsi_last_bb (cond_bb);
gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
- replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
+ replace_phi_edge_with_variable (cond_bb, e1, phi, result, threeway_p);
return true;
}
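As a sanity aid for the value side of this, here is a small exhaustive check, assuming 8-bit unsigned operands as in the minmax-*.c tests. It only covers two facts: that bitwise not swaps MIN and MAX (which is what invert_minmax_code relies on after strip_bit_not), and that the branchy three_min form from minmax-3.c matches the straight-line form given in the comment, a = MIN (smaller, c); x = MIN (larger, a). The helper names are hypothetical and nothing here is taken from the GCC sources.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Branchy form, shaped like three_min in minmax-3.c.  */
static uint8_t
three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  if (xc < xm)
    return min_u8 (xc, xy);
  else
    return min_u8 (xm, xy);
}

/* Straight-line form: a = MIN (xc, xy); x = MIN (xm, a).  */
static uint8_t
three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8 (xm, min_u8 (xc, xy));
}

/* Count mismatches over all 8-bit inputs; 0 means the identities hold.  */
unsigned long
count_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      {
        /* ~MAX (a, b) == MIN (~a, ~b) and ~MIN (a, b) == MAX (~a, ~b).  */
        if ((uint8_t) ~max_u8 (a, b) != min_u8 ((uint8_t) ~a, (uint8_t) ~b))
          bad++;
        if ((uint8_t) ~min_u8 (a, b) != max_u8 ((uint8_t) ~a, (uint8_t) ~b))
          bad++;
        for (unsigned c = 0; c < 256; c++)
          if (three_min_branchy (a, b, c) != three_min_flat (a, b, c))
            bad++;
      }
  return bad;
}

/* Convenience wrapper: aborts if any identity fails.  */
void
self_check (void)
{
  assert (count_mismatches () == 0);
}
```

The mixed MIN/MAX variants (minmax-5.c, minmax-8.c and friends) are deliberately not covered here; they would need their own branchy/flat pair before being trusted the same way.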
Cheers,
Tamar
>
> So I think you simply need to handle this properly (and then fall through to
> the else).
>
>
> > >
> > > Did you check whether the new case works when the merge block has
> > > more than two incoming edges?
> > >
> >
> > Yes, added a new testcase for it.
> >
> > > > + else if (middle_bb != alt_middle_bb && threeway_p)
> > > > + {
> > > > + /* Recognize the following case:
> > > > +
> > > > + if (smaller < larger)
> > > > + a = MIN (smaller, c);
> > > > + else
> > > > + b = MIN (larger, c);
> > > > + x = PHI <a, b>
> > > > +
> > > > + This is equivalent to
> > > > +
> > > > + a = MIN (smaller, c);
> > > > + x = MIN (larger, a); */
> > > > +
> > > > + gimple *assign = last_and_only_stmt (middle_bb);
> > > > + tree lhs, op0, op1, bound;
> > > > + tree alt_lhs, alt_op0, alt_op1;
> > > > + bool invert = false;
> > > > +
> > > > + if (!single_pred_p (middle_bb)
> > > > + || !single_pred_p (alt_middle_bb))
> > > > + return false;
> > > > +
> > > > + if (!assign
> > > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > > + return false;
> > > > +
> > > > + lhs = gimple_assign_lhs (assign);
> > > > + ass_code = gimple_assign_rhs_code (assign);
> > > > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > > > + return false;
> > > > +
> > > > + op0 = gimple_assign_rhs1 (assign);
> > > > + op1 = gimple_assign_rhs2 (assign);
> > > > +
> > > > + assign = last_and_only_stmt (alt_middle_bb);
> > > > + if (!assign
> > > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > > + return false;
> > > > +
> > > > + alt_lhs = gimple_assign_lhs (assign);
> > > > + if (ass_code != gimple_assign_rhs_code (assign))
> > > > + return false;
> > > > +
> > > > + alt_op0 = gimple_assign_rhs1 (assign);
> > > > + alt_op1 = gimple_assign_rhs2 (assign);
> > > > +
> > > > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > > > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > > > + return false;
> > > > +
> > > > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > > > + || (alt_smaller
> > > > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > > > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > > > + || (alt_larger
> > > > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > > > + {
> > > > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > > + return false;
> > > > +
> > > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > > + && (bound = strip_bit_not (op1)) != NULL)
> > > > + {
> > > > + minmax = MAX_EXPR;
> > > > + ass_code = invert_minmax_code (ass_code);
> > > > + invert = true;
> > > > + }
> > > > + else
> > > > + {
> > > > + bound = op1;
> > > > + minmax = MIN_EXPR;
> > > > + arg0 = op0;
> > > > + arg1 = alt_op0;
> > > > + }
> > > > + }
> > > > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > > > + || (alt_larger
> > > > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > > > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > > > + || (alt_smaller
> > > > + && operand_equal_for_phi_arg_p (alt_op0,
> > > alt_smaller))))
> > > > + {
> > > > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > > + return false;
> > > > +
> > > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > > + && (bound = strip_bit_not (op1)) != NULL)
> > > > + {
> > > > + minmax = MIN_EXPR;
> > > > + ass_code = invert_minmax_code (ass_code);
> > > > + invert = true;
> > > > + }
> > > > + else
> > > > + {
> > > > + bound = op1;
> > > > + minmax = MAX_EXPR;
> > > > + arg0 = op0;
> > > > + arg1 = alt_op0;
> > > > + }
> > > > + }
> > > > + else
> > > > + return false;
> > >
> > > Did you check you have coverage for all cases above in your testcases?
> >
> > I've added some more, should now have full coverage.
>
> Great.
>
> > >
> > > > + /* Reset any range information from the basic block. */
> > > > + reset_flow_sensitive_info_in_bb (cond_bb);
> > >
> > > Huh. You need to reset flow-sensitive info of the middle-bb stmt
> > > that prevails only...
> > >
> > > > + /* Emit the statement to compute min/max. */
> > > > + gimple_seq stmts = NULL;
> > > > + tree phi_result = PHI_RESULT (phi);
> > > > + result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, bound);
> > > > + result = gimple_build (&stmts, ass_code, TREE_TYPE (phi_result), result, arg1);
> > >
> > > ... but you are re-building both here. And also you drop locations,
> > > the preserved min/max should keep the old, the new should get the
> > > location of ... hmm, the condition possibly?
> >
> > Done, also added a testcase which checks that it still works with -g.
> >
> > Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> >
> > Ok for master?
>
> Besides the above issue it looks good to me.
>
> Thanks and sorry for the delay.
> Richard.
>
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> > sequence of a three-way conditional.
> > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> > min/max.
> > (strip_bit_not, invert_minmax_code): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> > code away.
> > * gcc.dg/tree-ssa/minmax-10.c: New test.
> > * gcc.dg/tree-ssa/minmax-11.c: New test.
> > * gcc.dg/tree-ssa/minmax-12.c: New test.
> > * gcc.dg/tree-ssa/minmax-13.c: New test.
> > * gcc.dg/tree-ssa/minmax-14.c: New test.
> > * gcc.dg/tree-ssa/minmax-15.c: New test.
> > * gcc.dg/tree-ssa/minmax-16.c: New test.
> > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > * gcc.dg/tree-ssa/minmax-8.c: New test.
> > * gcc.dg/tree-ssa/minmax-9.c: New test.
> >
> > --- inline copy of patch ---
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..589953684416a9d263084deb58f6cde7094dd517
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> > @@ -0,0 +1,20 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-optimized" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + xc=~xc;
> > + xm=~xm;
> > + xy=~xy;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm > xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..1c2ef01b5d1e639fbf95bb5ca473b63cc98e9df1
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-optimized" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + xc=~xc;
> > + xm=~xm;
> > + xy=~xy;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..3d0c07d9b57dd689bcb89653937727ab441e7f2b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> > @@ -0,0 +1,20 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + xc=~xc;
> > + xm=~xm;
> > + xy=~xy;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xy < xc ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..c0d0f27c8027ae87654532d1b919cfeccf4413e0
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> > @@ -0,0 +1,19 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + xc=~xc;
> > + xm=~xm;
> > + xy=~xy;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..9c0cadbf7e3119527cb2007d01fe4c7dd772c069
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-optimized" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + xc=~xc;
> > + xm=~xm;
> > + xy=~xy;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm > xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..1d97a16564f069b4348ff325c4fd713a224f838a
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +#include <stdbool.h>
> > +
> > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy, bool m) {
> > + uint8_t xk;
> > + if (xc)
> > + {
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + }
> > +
> > + return xk;
> > +}
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..89377a2cb341bdafa6ba145c61c1f966af536839
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt -g" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm > xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xy < xc ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > @@ -0,0 +1,16 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc > xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc > xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm > xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..24f580271c3ac3945860b506d4dc7d178a826093
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> > @@ -0,0 +1,20 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-optimized" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > + uint8_t xk;
> > + xc=~xc;
> > + xm=~xm;
> > + xy=~xy;
> > + if (xc < xm) {
> > + xk = (uint8_t) (xc < xy ? xc : xy);
> > + } else {
> > + xk = (uint8_t) (xm < xy ? xm : xy);
> > + }
> > + return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "optimized" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > @@ -1,5 +1,5 @@
> > /* { dg-do run } */
> > -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> > +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > index e61d9736937573d773acdf3e43a7c76074bfb2c7..df543b22cd720538c14bcea72fc78a8dec9bf12b 100644
> > --- a/gcc/tree-ssa-phiopt.cc
> > +++ b/gcc/tree-ssa-phiopt.cc
> > @@ -63,8 +63,8 @@ static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> > gimple *);
> > static int value_replacement (basic_block, basic_block,
> > edge, edge, gphi *, tree, tree);
> > -static bool minmax_replacement (basic_block, basic_block,
> > - edge, edge, gphi *, tree, tree);
> > +static bool minmax_replacement (basic_block, basic_block, basic_block,
> > + edge, edge, gphi *, tree, tree, bool);
> > static bool spaceship_replacement (basic_block, basic_block,
> > edge, edge, gphi *, tree, tree);
> > static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> > @@ -74,7 +74,7 @@ static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> > hash_set<tree> *);
> > static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> > static hash_set<tree> * get_non_trapping ();
> > -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> > +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree, bool);
> > static void hoist_adjacent_loads (basic_block, basic_block,
> > basic_block, basic_block);
> > static bool gate_hoist_loads (void);
> > @@ -200,6 +200,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > basic_block bb1, bb2;
> > edge e1, e2;
> > tree arg0, arg1;
> > + bool diamond_p = false;
> >
> > bb = bb_order[i];
> >
> > @@ -266,6 +267,9 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > hoist_adjacent_loads (bb, bb1, bb2, bb3);
> > continue;
> > }
> > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > + && !empty_block_p (bb1))
> > + diamond_p = true;
> > else
> > continue;
> >
> > @@ -294,10 +298,13 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > }
> > else
> > {
> > - gimple_seq phis = phi_nodes (bb2);
> > gimple_stmt_iterator gsi;
> > bool candorest = true;
> >
> > + /* Check that we're looking for nested phis. */
> > + basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > + gimple_seq phis = phi_nodes (merge);
> > +
> > /* Value replacement can work with more than one PHI
> > so try that first. */
> > if (!early_p)
> > @@ -317,6 +324,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > if (!candorest)
> > continue;
> >
> > + e2 = diamond_p ? EDGE_SUCC (bb2, 0) : e2;
> > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > if (!phi)
> > continue;
> > @@ -330,6 +338,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> >
> > gphi *newphi;
> > if (single_pred_p (bb1)
> > + && !diamond_p
> > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > arg0, arg1,
> > cond_stmt)))
> > @@ -344,20 +353,25 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> > }
> >
> > /* Do the replacement of conditional if it can be done. */
> > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > + if (!early_p
> > + && !diamond_p
> > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > cfgchanged = true;
> > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > - arg0, arg1,
> > - early_p))
> > + else if (!diamond_p
> > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > + arg0, arg1, early_p))
> > cfgchanged = true;
> > else if (!early_p
> > + && !diamond_p
> > && single_pred_p (bb1)
> > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> > phi, arg0, arg1))
> > cfgchanged = true;
> > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > + diamond_p))
> > cfgchanged = true;
> > else if (single_pred_p (bb1)
> > + && !diamond_p
> > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > cfgchanged = true;
> > }
> > @@ -386,7 +400,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> >
> > static void
> > replace_phi_edge_with_variable (basic_block cond_block,
> > - edge e, gphi *phi, tree new_tree)
> > + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> > {
> > basic_block bb = gimple_bb (phi);
> > gimple_stmt_iterator gsi;
> > @@ -427,7 +441,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
> > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > else
> > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> > {
> > e->flags |= EDGE_FALLTHRU;
> > e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> > @@ -1733,15 +1747,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> > return 0;
> > }
> >
> > +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
> > + the value being inverted. */
> > +
> > +static tree
> > +strip_bit_not (tree var)
> > +{
> > + if (TREE_CODE (var) != SSA_NAME)
> > + return NULL_TREE;
> > +
> > + gimple *assign = SSA_NAME_DEF_STMT (var);
> > + if (gimple_code (assign) != GIMPLE_ASSIGN)
> > + return NULL_TREE;
> > +
> > + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> > + return NULL_TREE;
> > +
> > + return gimple_assign_rhs1 (assign);
> > +}
> > +
> > +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> > +
> > +enum tree_code
> > +invert_minmax_code (enum tree_code code)
> > +{
> > + switch (code) {
> > + case MIN_EXPR:
> > + return MAX_EXPR;
> > + case MAX_EXPR:
> > + return MIN_EXPR;
> > + default:
> > + gcc_unreachable ();
> > + }
> > +}
> > +
> > /* The function minmax_replacement does the main work of doing the
> minmax
> > replacement. Return true if the replacement is done. Otherwise return
> > false.
> > BB is the basic block where the replacement is going to be done on.
> ARG0
> > - is argument 0 from the PHI. Likewise for ARG1. */
> > + is argument 0 from the PHI. Likewise for ARG1.
> > +
> > + If THREEWAY_P then expect the BB to be laid out in diamond shape with
> each
> > + BB containing only a MIN or MAX expression. */
> >
> > static bool
> > -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> > +minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
> > + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
> > {
> > tree result;
> > edge true_edge, false_edge;
> > @@ -1896,16 +1947,20 @@ minmax_replacement (basic_block cond_bb,
> basic_block middle_bb,
> > if (false_edge->dest == middle_bb)
> > false_edge = EDGE_SUCC (false_edge->dest, 0);
> >
> > + /* When THREEWAY_P then e1 will point to the edge of the final transition
> > + from middle-bb to end. */
> > if (true_edge == e0)
> > {
> > - gcc_assert (false_edge == e1);
> > + if (!threeway_p)
> > + gcc_assert (false_edge == e1);
> > arg_true = arg0;
> > arg_false = arg1;
> > }
> > else
> > {
> > gcc_assert (false_edge == e0);
> > - gcc_assert (true_edge == e1);
> > + if (!threeway_p)
> > + gcc_assert (true_edge == e1);
> > arg_true = arg1;
> > arg_false = arg0;
> > }
> > @@ -1937,6 +1992,165 @@ minmax_replacement (basic_block cond_bb,
> basic_block middle_bb,
> > else
> > return false;
> > }
> > + else if (middle_bb != alt_middle_bb && threeway_p)
> > + {
> > + /* Recognize the following case:
> > +
> > + if (smaller < larger)
> > + a = MIN (smaller, c);
> > + else
> > + b = MIN (larger, c);
> > + x = PHI <a, b>
> > +
> > + This is equivalent to
> > +
> > + a = MIN (smaller, c);
> > + x = MIN (larger, a); */
> > +
> > + gimple *assign = last_and_only_stmt (middle_bb);
> > + tree lhs, op0, op1, bound;
> > + tree alt_lhs, alt_op0, alt_op1;
> > + bool invert = false;
> > +
> > + if (!single_pred_p (middle_bb)
> > + || !single_pred_p (alt_middle_bb)
> > + || !single_succ_p (middle_bb)
> > + || !single_succ_p (alt_middle_bb))
> > + return false;
> > +
> > + /* When THREEWAY_P then e1 will point to the edge of the final
> transition
> > + from middle-bb to end. */
> > + if (true_edge == e0)
> > + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> > + else
> > + gcc_assert (true_edge == EDGE_PRED (e1->src, 0));
> > +
> > + bool valid_minmax_p = false;
> > + gimple_stmt_iterator it1
> > + = gsi_start_nondebug_after_labels_bb (middle_bb);
> > + gimple_stmt_iterator it2
> > + = gsi_start_nondebug_after_labels_bb (alt_middle_bb);
> > + if (gsi_one_nondebug_before_end_p (it1)
> > + && gsi_one_nondebug_before_end_p (it2))
> > + {
> > + gimple *stmt1 = gsi_stmt (it1);
> > + gimple *stmt2 = gsi_stmt (it2);
> > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > + {
> > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > + valid_minmax_p = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > + && (code2 == MIN_EXPR || code2 ==
> MAX_EXPR);
> > + }
> > + }
> > +
> > + if (!valid_minmax_p)
> > + return false;
> > +
> > + if (!assign
> > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > + return false;
> > +
> > + lhs = gimple_assign_lhs (assign);
> > + ass_code = gimple_assign_rhs_code (assign);
> > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > + return false;
> > +
> > + op0 = gimple_assign_rhs1 (assign);
> > + op1 = gimple_assign_rhs2 (assign);
> > +
> > + assign = last_and_only_stmt (alt_middle_bb);
> > + if (!assign
> > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > + return false;
> > +
> > + alt_lhs = gimple_assign_lhs (assign);
> > + if (ass_code != gimple_assign_rhs_code (assign))
> > + return false;
> > +
> > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > + return false;
> > +
> > + alt_op0 = gimple_assign_rhs1 (assign);
> > + alt_op1 = gimple_assign_rhs2 (assign);
> > +
> > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > + || (alt_smaller
> > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > + || (alt_larger
> > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > + {
> > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > + return false;
> > +
> > + if ((arg0 = strip_bit_not (op0)) != NULL
> > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > + && (bound = strip_bit_not (op1)) != NULL)
> > + {
> > + minmax = MAX_EXPR;
> > + ass_code = invert_minmax_code (ass_code);
> > + invert = true;
> > + }
> > + else
> > + {
> > + bound = op1;
> > + minmax = MIN_EXPR;
> > + arg0 = op0;
> > + arg1 = alt_op0;
> > + }
> > + }
> > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > + || (alt_larger
> > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > + || (alt_smaller
> > + && operand_equal_for_phi_arg_p (alt_op0,
> alt_smaller))))
> > + {
> > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > + return false;
> > +
> > + if ((arg0 = strip_bit_not (op0)) != NULL
> > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > + && (bound = strip_bit_not (op1)) != NULL)
> > + {
> > + minmax = MIN_EXPR;
> > + ass_code = invert_minmax_code (ass_code);
> > + invert = true;
> > + }
> > + else
> > + {
> > + bound = op1;
> > + minmax = MAX_EXPR;
> > + arg0 = op0;
> > + arg1 = alt_op0;
> > + }
> > + }
> > + else
> > + return false;
> > +
> > + /* Emit the statement to compute min/max. */
> > + location_t locus = gimple_location (last_stmt (cond_bb));
> > + gimple_seq stmts = NULL;
> > + tree phi_result = PHI_RESULT (phi);
> > + result = gimple_build (&stmts, locus, minmax, TREE_TYPE (phi_result),
> > + arg0, bound);
> > + result = gimple_build (&stmts, locus, ass_code, TREE_TYPE
> (phi_result),
> > + result, arg1);
> > + if (invert)
> > + result = gimple_build (&stmts, locus, BIT_NOT_EXPR, TREE_TYPE
> (phi_result),
> > + result);
> > +
> > + gsi = gsi_last_bb (cond_bb);
> > + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> > +
> > + replace_phi_edge_with_variable (cond_bb, e1, phi, result,
> > + false);
> > +
> > + return true;
> > + }
> > else
> > {
> > /* Recognize the following case, assuming d <= u:
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> Boudien Moerman; HRB 36809 (AG Nuernberg)
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-07-27 10:40 ` Tamar Christina
@ 2022-07-27 11:18 ` Richard Biener
2022-08-02 8:32 ` Tamar Christina
0 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2022-07-27 11:18 UTC (permalink / raw)
To: Tamar Christina; +Cc: gcc-patches, nd, jakub
On Wed, 27 Jul 2022, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: Tuesday, July 12, 2022 2:19 PM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jakub@redhat.com
> > Subject: RE: [PATCH 2/2]middle-end: Support recognition of three-way
> > max/min.
> >
> > On Tue, 5 Jul 2022, Tamar Christina wrote:
> >
> > > > > }
> > > > > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > > > > + && single_succ_p (bb1)
> > > > > + && single_succ_p (bb2)
> > > > > + && single_pred_p (bb1)
> > > > > + && single_pred_p (bb2)
> > > > > + && single_succ_p (EDGE_SUCC (bb1, 0)->dest))
> > > >
> > > > please do the single_succ/pred checks below where appropriate, also
> > > > what's the last check about?
> > >
> > > Done.
> > >
> > > > why does the merge block need a single successor?
> > >
> > > I was using it to fix an ICE, but I realize that's not the right fix.
> > > > I'm now checking if the BB is empty instead, in which case it's just a
> > > fall through edge so don't treat it as a diamond.
> > >
> > > >
> > > > > + {
> > > > > + gimple_stmt_iterator it1 = gsi_start_nondebug_after_labels_bb
> > > > (bb1);
> > > > > + gimple_stmt_iterator it2 = gsi_start_nondebug_after_labels_bb
> > > > (bb2);
> > > > > + if (gsi_one_before_end_p (it1) && gsi_one_before_end_p (it2))
> > > > > + {
> > > > > + gimple *stmt1 = gsi_stmt (it1);
> > > > > + gimple *stmt2 = gsi_stmt (it2);
> > > > > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > > > > + {
> > > > > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > > > > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > > > > + diamond_minmax_p
> > > > > + = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > > > > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > > > > + }
> > > > > + }
> > > > > + }
> > > >
> > > > I'd generalize this to general diamond detection, simply cutting off
> > > > *_replacement workers that do not handle diamonds and do appropriate
> > > > checks in minmax_replacement only.
> > > >
> > >
> > > Done.
> > >
> > > > > else
> > > > > continue;
> > > > >
> > > > > @@ -316,6 +340,13 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > > bool do_hoist_loads, bool early_p)
> > > > > if (!candorest)
> > > > > continue;
> > > > >
> > > > > + /* Check that we're looking for nested phis. */
> > > > > + if (phis == NULL && diamond_minmax_p)
> > > > > + {
> > > > > + phis = phi_nodes (EDGE_SUCC (bb2, 0)->dest);
> > > > > + e2 = EDGE_SUCC (bb2, 0);
> > > > > + }
> > > > > +
> > > >
> > > > instead
> > > >
> > > > basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > > > gimple_seq phis = phi_nodes (merge);
> > > >
> > >
> > > Done.
> > >
> > > >
> > > > > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > > > > if (!phi)
> > > > > continue;
> > > > > @@ -329,6 +360,7 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > > > bool do_hoist_loads, bool early_p)
> > > > >
> > > > > gphi *newphi;
> > > > > if (single_pred_p (bb1)
> > > > > + && !diamond_minmax_p
> > > > > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > > > arg0, arg1,
> > > > > cond_stmt)))
> > > > > @@ -343,20 +375,25 @@ tree_ssa_phiopt_worker (bool
> > do_store_elim,
> > > > bool do_hoist_loads, bool early_p)
> > > > > }
> > > > >
> > > > > /* Do the replacement of conditional if it can be done. */
> > > > > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0,
> > > > arg1))
> > > > > + if (!early_p
> > > > > + && !diamond_minmax_p
> > > > > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > > > cfgchanged = true;
> > > > > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > > > - arg0, arg1,
> > > > > - early_p))
> > > > > + else if (!diamond_minmax_p
> > > > > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > > > + arg0, arg1, early_p))
> > > > > cfgchanged = true;
> > > > > else if (!early_p
> > > > > + && !diamond_minmax_p
> > > > > && single_pred_p (bb1)
> > > > > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1,
> > > > e2,
> > > > > phi, arg0, arg1))
> > > > > cfgchanged = true;
> > > > > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > > > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > > > > + diamond_minmax_p))
> > > > > cfgchanged = true;
> > > > > else if (single_pred_p (bb1)
> > > > > + && !diamond_minmax_p
> > > > > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0,
> > > > arg1))
> > > > > cfgchanged = true;
> > > > > }
> > > > > @@ -385,7 +422,7 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > > > > bool do_hoist_loads, bool early_p)
> > > > >
> > > > > static void
> > > > > replace_phi_edge_with_variable (basic_block cond_block,
> > > > > - edge e, gphi *phi, tree new_tree)
> > > > > + edge e, gphi *phi, tree new_tree, bool
> > > > delete_bb = true)
> > > > > {
> > > > > basic_block bb = gimple_bb (phi);
> > > > > gimple_stmt_iterator gsi;
> > > > > @@ -428,7 +465,7 @@ replace_phi_edge_with_variable (basic_block
> > > > cond_block,
> > > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > > else
> > > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > > > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 &&
> > delete_bb)
> > > >
> > > > why do you need this change?
> > >
> > > > When this function replaces the edge it doesn't seem to update the dominators.
> > > > Since it's replacing the middle BB we then end up with an error
> > >
> > > gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error: dominator of 5
> > > should be 4, not 2
> > >
> > > during early verify. So instead, I replace the BB but defer its
> > > deletion until cleanup which removes it and updates the dominators.
> >
> > Hmm, for a diamond shouldn't you replace
> >
> > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > else
> > edge_to_remove = EDGE_SUCC (cond_block, 0);
> >
> > with
> >
> > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > edge_to_remove = EDGE_SUCC (cond_block, 0);
> >
> > thus, the code expects to be left with a fallthru to the PHI block which is
> > expected to have the immediate dominator being cond_block but with a
> > > diamond there's a (possibly empty) block in between and dominators are
> > wrong.
>
> Agreed, but the (EDGE_SUCC (cond_block, 1)->dest == bb) check doesn't seem like the
> right one, since for a diamond there will be a block in between the two. Did you perhaps
> mean EDGE_SUCC (EDGE_SUCC (cond_block, 1)->dest, 0)->dest == bb? I.e. that the destination
> across the diamond is bb, and then you remove the middle block?
Hmm, I think my condition was correct - the code tries to remove the
edge to the middle-block and checks that the remaining edge falls through
to the merge block. With a true diamond there is no fallthru to
the merge block to keep, so we had better not remove any edge?
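
A minimal sketch of that edge-selection logic, using a toy block type
rather than GCC's basic_block (toy_block and toy_edge_to_remove are
hypothetical names, not part of tree-ssa-phiopt.cc). It only models the
"which successor edge leads to the middle block" decision: for a
triangle one successor of the condition block is the merge block, so the
other edge is the candidate for removal; for a true diamond neither
successor is the merge block and no edge is selected.

```c
#include <stddef.h>

/* Toy stand-in for a basic block with at most two successors.
   Hypothetical type for illustration only; not GCC's basic_block.  */
struct toy_block
{
  struct toy_block *succ[2];
};

/* Return the index of the successor edge of COND that leads to the
   middle block (the removal candidate), or -1 when neither successor
   of COND is MERGE, i.e. COND heads a true diamond.  */
static int
toy_edge_to_remove (const struct toy_block *cond,
                    const struct toy_block *merge)
{
  if (cond->succ[0] == merge)
    return 1;
  else if (cond->succ[1] == merge)
    return 0;
  return -1;
}
```

A caller would treat -1 as "leave both edges alone and let later CFG
cleanup deal with the dead middle blocks"; that policy is an assumption
of this sketch, not something settled in the thread at this point.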
> For the minmax diamond we want both edges removed, since all the code in the middle BBs is now
> dead. But this is probably not true in the general sense.
>
> >>> p debug (cond_block)
> <bb 2> :
> xc_3 = ~xc_2(D);
> xm_5 = ~xm_4(D);
> xy_7 = ~xy_6(D);
> _10 = MAX_EXPR <xc_2(D), xy_6(D)>;
> _12 = MIN_EXPR <_10, xm_4(D)>;
> _13 = ~_12;
> if (xc_3 < xm_5)
> goto <bb 3>; [INV]
> else
> goto <bb 4>; [INV]
>
> >>> p debug (EDGE_SUCC (cond_block, 0)->dest)
> <bb 3> :
> xk_9 = MAX_EXPR <xc_3, xy_7>;
> goto <bb 5>; [INV]
>
> >>> p debug (EDGE_SUCC (cond_block, 1)->dest)
> <bb 4> :
> xk_8 = MAX_EXPR <xm_5, xy_7>;
>
> >>> p debug (bb)
> <bb 5> :
> # xk_1 = PHI <xk_9(3), xk_8(4)>
> return xk_1;
>
> $6 = void
>
> So something like this?
>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 72d7b40a501..c107eeea1aa 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -400,7 +400,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>
> static void
> replace_phi_edge_with_variable (basic_block cond_block,
> - edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> + edge e, gphi *phi, tree new_tree, bool diamond_p = false)
> {
> basic_block bb = gimple_bb (phi);
> gimple_stmt_iterator gsi;
> @@ -439,9 +439,9 @@ replace_phi_edge_with_variable (basic_block cond_block,
> edge edge_to_remove;
> if (EDGE_SUCC (cond_block, 0)->dest == bb)
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> - else
> + else if (!diamond_p || (diamond_p && EDGE_SUCC (EDGE_SUCC (cond_block, 1)->dest, 0)->dest == bb))
> edge_to_remove = EDGE_SUCC (cond_block, 0);
> - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && !diamond_p)
> {
> e->flags |= EDGE_FALLTHRU;
> e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> @@ -2147,7 +2147,7 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_
> gsi = gsi_last_bb (cond_bb);
> gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
>
> - replace_phi_edge_with_variable (cond_bb, e1, phi, result, false);
> + replace_phi_edge_with_variable (cond_bb, e1, phi, result, threeway_p);
>
> return true;
> }
>
>
> Cheers,
> Tamar
>
> >
> > So I think you simply need to handle this properly (and then fall through to
> > the else).
> >
> >
> > > >
> > > > Did you check whether the new case works when the merge block has
> > > > more than two incoming edges?
> > > >
> > >
> > > Yes, added a new testcase for it.
> > >
> > > > > + else if (middle_bb != alt_middle_bb && threeway_p)
> > > > > + {
> > > > > + /* Recognize the following case:
> > > > > +
> > > > > + if (smaller < larger)
> > > > > + a = MIN (smaller, c);
> > > > > + else
> > > > > + b = MIN (larger, c);
> > > > > + x = PHI <a, b>
> > > > > +
> > > > > + This is equivalent to
> > > > > +
> > > > > + a = MIN (smaller, c);
> > > > > + x = MIN (larger, a); */
> > > > > +
> > > > > + gimple *assign = last_and_only_stmt (middle_bb);
> > > > > + tree lhs, op0, op1, bound;
> > > > > + tree alt_lhs, alt_op0, alt_op1;
> > > > > + bool invert = false;
> > > > > +
> > > > > + if (!single_pred_p (middle_bb)
> > > > > + || !single_pred_p (alt_middle_bb))
> > > > > + return false;
> > > > > +
> > > > > + if (!assign
> > > > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > > > + return false;
> > > > > +
> > > > > + lhs = gimple_assign_lhs (assign);
> > > > > + ass_code = gimple_assign_rhs_code (assign);
> > > > > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > > > > + return false;
> > > > > +
> > > > > + op0 = gimple_assign_rhs1 (assign);
> > > > > + op1 = gimple_assign_rhs2 (assign);
> > > > > +
> > > > > + assign = last_and_only_stmt (alt_middle_bb);
> > > > > + if (!assign
> > > > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > > > + return false;
> > > > > +
> > > > > + alt_lhs = gimple_assign_lhs (assign);
> > > > > + if (ass_code != gimple_assign_rhs_code (assign))
> > > > > + return false;
> > > > > +
> > > > > + alt_op0 = gimple_assign_rhs1 (assign);
> > > > > + alt_op1 = gimple_assign_rhs2 (assign);
> > > > > +
> > > > > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > > > > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > > > > + return false;
> > > > > +
> > > > > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > > > > + || (alt_smaller
> > > > > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > > > > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > > > > + || (alt_larger
> > > > > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > > > > + {
> > > > > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > > > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > > > + return false;
> > > > > +
> > > > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > > > + && (bound = strip_bit_not (op1)) != NULL)
> > > > > + {
> > > > > + minmax = MAX_EXPR;
> > > > > + ass_code = invert_minmax_code (ass_code);
> > > > > + invert = true;
> > > > > + }
> > > > > + else
> > > > > + {
> > > > > + bound = op1;
> > > > > + minmax = MIN_EXPR;
> > > > > + arg0 = op0;
> > > > > + arg1 = alt_op0;
> > > > > + }
> > > > > + }
> > > > > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > > > > + || (alt_larger
> > > > > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > > > > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > > > > + || (alt_smaller
> > > > > + && operand_equal_for_phi_arg_p (alt_op0,
> > > > alt_smaller))))
> > > > > + {
> > > > > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > > > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > > > + return false;
> > > > > +
> > > > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > > > + && (bound = strip_bit_not (op1)) != NULL)
> > > > > + {
> > > > > + minmax = MIN_EXPR;
> > > > > + ass_code = invert_minmax_code (ass_code);
> > > > > + invert = true;
> > > > > + }
> > > > > + else
> > > > > + {
> > > > > + bound = op1;
> > > > > + minmax = MAX_EXPR;
> > > > > + arg0 = op0;
> > > > > + arg1 = alt_op0;
> > > > > + }
> > > > > + }
> > > > > + else
> > > > > + return false;
> > > >
> > > > Did you check you have coverage for all cases above in your testcases?
> > >
> > > I've added some more, should now have full coverage.
> >
> > Great.
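
As a standalone sanity check of one of the cases discussed above, the
sketch below compares the branchy three-way min (the shape used by the
three_min tests) against the flattened MIN (MIN (smaller, c), larger)
form from the patch comment, exhaustively over all 8-bit inputs. The
helper names are hypothetical, and the check covers only the plain
MIN/MIN case; it says nothing about the mixed MIN/MAX or bit-not
variants.

```c
#include <stdint.h>

static uint8_t
min_u8 (uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

/* Branchy form: a diamond with one MIN in each arm.  */
static uint8_t
three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  if (xc < xm)
    return min_u8 (xc, xy);
  else
    return min_u8 (xm, xy);
}

/* Flattened form: a = MIN (smaller, c); x = MIN (larger, a).  */
static uint8_t
three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8 (min_u8 (xc, xy), xm);
}

/* Count the 8-bit input triples on which the two forms disagree.  */
static unsigned long
count_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned xc = 0; xc < 256; xc++)
    for (unsigned xm = 0; xm < 256; xm++)
      for (unsigned xy = 0; xy < 256; xy++)
        if (three_min_branchy ((uint8_t) xc, (uint8_t) xm, (uint8_t) xy)
            != three_min_flat ((uint8_t) xc, (uint8_t) xm, (uint8_t) xy))
          bad++;
  return bad;
}
```

The same brute-force approach could be pointed at each of the other
shapes in the testcases to confirm, independently of the dump scans,
that the rewritten form computes the same value.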
> >
> > > >
> > > > > + /* Reset any range information from the basic block. */
> > > > > + reset_flow_sensitive_info_in_bb (cond_bb);
> > > >
> > > > Huh. You need to reset flow-sensitive info of the middle-bb stmt
> > > > that prevails only...
> > > >
> > > > > + /* Emit the statement to compute min/max. */
> > > > > + gimple_seq stmts = NULL;
> > > > > + tree phi_result = PHI_RESULT (phi);
> > > > > + result = gimple_build (&stmts, minmax, TREE_TYPE
> > > > > + (phi_result), arg0,
> > > > bound);
> > > > > + result = gimple_build (&stmts, ass_code, TREE_TYPE
> > > > > + (phi_result), result, arg1);
> > > >
> > > > ... but you are re-building both here. And also you drop locations,
> > > > the preserved min/max should keep the old, the new should get the
> > > > location of ... hmm, the condition possibly?
> > >
> > > Done, also added a testcase which checks that it still works when -g.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> >
> > Besides the above issue it looks good to me.
> >
> > Thanks and sorry for the delay.
> > Richard.
> >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
> > > > sequence of a three-way conditional.
> > > > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > > > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
> > > > min/max.
> > > > (strip_bit_not, invert_minmax_code): New.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
> > > > code away.
> > > * gcc.dg/tree-ssa/minmax-10.c: New test.
> > > * gcc.dg/tree-ssa/minmax-11.c: New test.
> > > * gcc.dg/tree-ssa/minmax-12.c: New test.
> > > * gcc.dg/tree-ssa/minmax-13.c: New test.
> > > * gcc.dg/tree-ssa/minmax-14.c: New test.
> > > * gcc.dg/tree-ssa/minmax-15.c: New test.
> > > * gcc.dg/tree-ssa/minmax-16.c: New test.
> > > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > > * gcc.dg/tree-ssa/minmax-8.c: New test.
> > > * gcc.dg/tree-ssa/minmax-9.c: New test.
> > >
> > > --- inline copy of patch ---
> > >
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..589953684416a9d263084deb58f6cde7094dd517
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-10.c
> > > @@ -0,0 +1,20 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-optimized" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + xc=~xc;
> > > + xm=~xm;
> > > + xy=~xy;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..1c2ef01b5d1e639fbf95bb5ca473b63cc98e9df1
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-11.c
> > > @@ -0,0 +1,21 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-optimized" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + xc=~xc;
> > > + xm=~xm;
> > > + xy=~xy;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..3d0c07d9b57dd689bcb89653937727ab441e7f2b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-12.c
> > > @@ -0,0 +1,20 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + xc=~xc;
> > > + xm=~xm;
> > > + xy=~xy;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xy < xc ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..c0d0f27c8027ae87654532d1b919cfeccf4413e0
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-13.c
> > > @@ -0,0 +1,19 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + xc=~xc;
> > > + xm=~xm;
> > > + xy=~xy;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..9c0cadbf7e3119527cb2007d01fe4c7dd772c069
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c
> > > @@ -0,0 +1,21 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-optimized" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + xc=~xc;
> > > + xm=~xm;
> > > + xy=~xy;
> > > + if (xc < xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..1d97a16564f069b4348ff325c4fd713a224f838a
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-15.c
> > > @@ -0,0 +1,21 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +#include <stdbool.h>
> > > +
> > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy, bool m) {
> > > + uint8_t xk;
> > > + if (xc)
> > > + {
> > > + if (xc < xm) {
> > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + }
> > > +
> > > + return xk;
> > > +}
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..89377a2cb341bdafa6ba145c61c1f966af536839
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt -g" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc < xm) {
> > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..de3b2e946e81701e3b75f580e6a843695a05786e
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc < xm) {
> > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..0b6d667be868c2405eaefd17cb522da44bafa0e2
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..650601a3cc75d09a9e6e54a35f5b9993074f8510
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-5.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax1 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..a628f6d99222958cfd8c410f0e85639e3a49dd4b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-6.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax3 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xy < xc ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..cb42412c4ada433b2f59df0a8bef9fa7b1c5e104
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-7.c
> > > @@ -0,0 +1,16 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax2 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc > xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..9cd050e932376bc50bd6ae60cb654fcab0bfdd1c
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-8.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_minmax11 (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + if (xc < xm) {
> > > + xk = (uint8_t) (xc > xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm > xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..24f580271c3ac3945860b506d4dc7d178a826093
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-9.c
> > > @@ -0,0 +1,20 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-optimized" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t xk;
> > > + xc=~xc;
> > > + xm=~xm;
> > > + xy=~xy;
> > > + if (xc < xm) {
> > > + xk = (uint8_t) (xc < xy ? xc : xy);
> > > + } else {
> > > + xk = (uint8_t) (xm < xy ? xm : xy);
> > > + }
> > > + return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "= ~" 1 "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "optimized" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > > index 8b23ef4c7a3484cdc1647ee6d1b150f15685beff..902dde44a50e171b4f34ba7247d75a32d2c860ed 100644
> > > --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> > > @@ -1,5 +1,5 @@
> > > /* { dg-do run } */
> > > > -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20" } */
> > > > +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
> > >
> > > #include <stdio.h>
> > > #include <stdlib.h>
> > > > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > > > index e61d9736937573d773acdf3e43a7c76074bfb2c7..df543b22cd720538c14bcea72fc78a8dec9bf12b 100644
> > > --- a/gcc/tree-ssa-phiopt.cc
> > > +++ b/gcc/tree-ssa-phiopt.cc
> > > @@ -63,8 +63,8 @@ static gphi *factor_out_conditional_conversion (edge,
> > edge, gphi *, tree, tree,
> > > gimple *);
> > > static int value_replacement (basic_block, basic_block,
> > > edge, edge, gphi *, tree, tree); -static bool
> > > minmax_replacement (basic_block, basic_block,
> > > - edge, edge, gphi *, tree, tree);
> > > +static bool minmax_replacement (basic_block, basic_block, basic_block,
> > > + edge, edge, gphi *, tree, tree, bool);
> > > static bool spaceship_replacement (basic_block, basic_block,
> > > edge, edge, gphi *, tree, tree); static bool
> > > cond_removal_in_builtin_zero_pattern (basic_block, basic_block, @@
> > > -74,7 +74,7 @@ static bool cond_store_replacement (basic_block,
> > basic_block, edge, edge,
> > > hash_set<tree> *);
> > > static bool cond_if_else_store_replacement (basic_block, basic_block,
> > > basic_block); static hash_set<tree> * get_non_trapping (); -static
> > > void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> > > +static void replace_phi_edge_with_variable (basic_block, edge, gphi
> > > +*, tree, bool);
> > > static void hoist_adjacent_loads (basic_block, basic_block,
> > > basic_block, basic_block);
> > > static bool gate_hoist_loads (void);
> > > @@ -200,6 +200,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > do_hoist_loads, bool early_p)
> > > basic_block bb1, bb2;
> > > edge e1, e2;
> > > tree arg0, arg1;
> > > + bool diamond_p = false;
> > >
> > > bb = bb_order[i];
> > >
> > > @@ -266,6 +267,9 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > do_hoist_loads, bool early_p)
> > > hoist_adjacent_loads (bb, bb1, bb2, bb3);
> > > continue;
> > > }
> > > + else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> > > + && !empty_block_p (bb1))
> > > + diamond_p = true;
> > > else
> > > continue;
> > >
> > > @@ -294,10 +298,13 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > bool do_hoist_loads, bool early_p)
> > > }
> > > else
> > > {
> > > - gimple_seq phis = phi_nodes (bb2);
> > > gimple_stmt_iterator gsi;
> > > bool candorest = true;
> > >
> > > + /* Check that we're looking for nested phis. */
> > > + basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> > > + gimple_seq phis = phi_nodes (merge);
> > > +
> > > /* Value replacement can work with more than one PHI
> > > so try that first. */
> > > if (!early_p)
> > > @@ -317,6 +324,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > do_hoist_loads, bool early_p)
> > > if (!candorest)
> > > continue;
> > >
> > > + e2 = diamond_p ? EDGE_SUCC (bb2, 0) : e2;
> > > phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> > > if (!phi)
> > > continue;
> > > @@ -330,6 +338,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > > do_hoist_loads, bool early_p)
> > >
> > > gphi *newphi;
> > > if (single_pred_p (bb1)
> > > + && !diamond_p
> > > && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > arg0, arg1,
> > > cond_stmt)))
> > > @@ -344,20 +353,25 @@ tree_ssa_phiopt_worker (bool do_store_elim,
> > bool do_hoist_loads, bool early_p)
> > > }
> > >
> > > /* Do the replacement of conditional if it can be done. */
> > > - if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0,
> > arg1))
> > > + if (!early_p
> > > + && !diamond_p
> > > + && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> > > cfgchanged = true;
> > > - else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > - arg0, arg1,
> > > - early_p))
> > > + else if (!diamond_p
> > > + && match_simplify_replacement (bb, bb1, e1, e2, phi,
> > > + arg0, arg1, early_p))
> > > cfgchanged = true;
> > > else if (!early_p
> > > + && !diamond_p
> > > && single_pred_p (bb1)
> > > > && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> > > > phi, arg0, arg1))
> > > cfgchanged = true;
> > > - else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > + else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> > > + diamond_p))
> > > cfgchanged = true;
> > > else if (single_pred_p (bb1)
> > > + && !diamond_p
> > > > && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> > > cfgchanged = true;
> > > }
> > > @@ -386,7 +400,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool
> > > do_hoist_loads, bool early_p)
> > >
> > > static void
> > > replace_phi_edge_with_variable (basic_block cond_block,
> > > - edge e, gphi *phi, tree new_tree)
> > > > + edge e, gphi *phi, tree new_tree, bool delete_bb = true)
> > > {
> > > basic_block bb = gimple_bb (phi);
> > > gimple_stmt_iterator gsi;
> > > @@ -427,7 +441,7 @@ replace_phi_edge_with_variable (basic_block
> > cond_block,
> > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > else
> > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > + if (EDGE_COUNT (edge_to_remove->dest->preds) == 1 && delete_bb)
> > > {
> > > e->flags |= EDGE_FALLTHRU;
> > > > e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> > > > @@ -1733,15 +1747,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> > > > return 0;
> > > > }
> > > >
> > > > +/* If VAR is an SSA_NAME that points to a BIT_NOT_EXPR then return the TREE for
> > > > + the value being inverted. */
> > > +
> > > +static tree
> > > +strip_bit_not (tree var)
> > > +{
> > > + if (TREE_CODE (var) != SSA_NAME)
> > > + return NULL_TREE;
> > > +
> > > > + gimple *assign = SSA_NAME_DEF_STMT (var);
> > > > + if (gimple_code (assign) != GIMPLE_ASSIGN)
> > > > + return NULL_TREE;
> > > > +
> > > > + if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> > > > + return NULL_TREE;
> > > > +
> > > > + return gimple_assign_rhs1 (assign);
> > > > +}
> > > +
> > > +/* Invert a MIN to a MAX or a MAX to a MIN expression CODE. */
> > > +
> > > +enum tree_code
> > > +invert_minmax_code (enum tree_code code) {
> > > + switch (code) {
> > > + case MIN_EXPR:
> > > + return MAX_EXPR;
> > > + case MAX_EXPR:
> > > + return MIN_EXPR;
> > > + default:
> > > + gcc_unreachable ();
> > > + }
> > > +}
> > > +
> > > > /* The function minmax_replacement does the main work of doing the minmax
> > > > replacement. Return true if the replacement is done. Otherwise return
> > > > false.
> > > > BB is the basic block where the replacement is going to be done on. ARG0
> > > > - is argument 0 from the PHI. Likewise for ARG1. */
> > > > + is argument 0 from the PHI. Likewise for ARG1.
> > > > +
> > > > + If THREEWAY_P then expect the BB to be laid out in diamond shape with each
> > > > + BB containing only a MIN or MAX expression. */
> > > >
> > > > static bool
> > > > -minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> > > > - edge e0, edge e1, gphi *phi, tree arg0, tree arg1)
> > > > +minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_middle_bb,
> > > > + edge e0, edge e1, gphi *phi, tree arg0, tree arg1, bool threeway_p)
> > > {
> > > tree result;
> > > edge true_edge, false_edge;
> > > @@ -1896,16 +1947,20 @@ minmax_replacement (basic_block cond_bb,
> > basic_block middle_bb,
> > > if (false_edge->dest == middle_bb)
> > > false_edge = EDGE_SUCC (false_edge->dest, 0);
> > >
> > > + /* When THREEWAY_P then e1 will point to the edge of the final
> > transition
> > > + from middle-bb to end. */
> > > if (true_edge == e0)
> > > {
> > > - gcc_assert (false_edge == e1);
> > > + if (!threeway_p)
> > > + gcc_assert (false_edge == e1);
> > > arg_true = arg0;
> > > arg_false = arg1;
> > > }
> > > else
> > > {
> > > gcc_assert (false_edge == e0);
> > > - gcc_assert (true_edge == e1);
> > > + if (!threeway_p)
> > > + gcc_assert (true_edge == e1);
> > > arg_true = arg1;
> > > arg_false = arg0;
> > > }
> > > @@ -1937,6 +1992,165 @@ minmax_replacement (basic_block cond_bb,
> > basic_block middle_bb,
> > > else
> > > return false;
> > > }
> > > + else if (middle_bb != alt_middle_bb && threeway_p)
> > > + {
> > > + /* Recognize the following case:
> > > +
> > > + if (smaller < larger)
> > > + a = MIN (smaller, c);
> > > + else
> > > + b = MIN (larger, c);
> > > + x = PHI <a, b>
> > > +
> > > + This is equivalent to
> > > +
> > > + a = MIN (smaller, c);
> > > + x = MIN (larger, a); */
> > > +
> > > + gimple *assign = last_and_only_stmt (middle_bb);
> > > + tree lhs, op0, op1, bound;
> > > + tree alt_lhs, alt_op0, alt_op1;
> > > + bool invert = false;
> > > +
> > > + if (!single_pred_p (middle_bb)
> > > + || !single_pred_p (alt_middle_bb)
> > > + || !single_succ_p (middle_bb)
> > > + || !single_succ_p (alt_middle_bb))
> > > + return false;
> > > +
> > > > + /* When THREEWAY_P then e1 will point to the edge of the final transition
> > > > + from middle-bb to end. */
> > > + if (true_edge == e0)
> > > + gcc_assert (false_edge == EDGE_PRED (e1->src, 0));
> > > + else
> > > + gcc_assert (true_edge == EDGE_PRED (e1->src, 0));
> > > +
> > > + bool valid_minmax_p = false;
> > > + gimple_stmt_iterator it1
> > > + = gsi_start_nondebug_after_labels_bb (middle_bb);
> > > + gimple_stmt_iterator it2
> > > + = gsi_start_nondebug_after_labels_bb (alt_middle_bb);
> > > + if (gsi_one_nondebug_before_end_p (it1)
> > > + && gsi_one_nondebug_before_end_p (it2))
> > > + {
> > > + gimple *stmt1 = gsi_stmt (it1);
> > > + gimple *stmt2 = gsi_stmt (it2);
> > > + if (is_gimple_assign (stmt1) && is_gimple_assign (stmt2))
> > > + {
> > > + enum tree_code code1 = gimple_assign_rhs_code (stmt1);
> > > + enum tree_code code2 = gimple_assign_rhs_code (stmt2);
> > > + valid_minmax_p = (code1 == MIN_EXPR || code1 == MAX_EXPR)
> > > > + && (code2 == MIN_EXPR || code2 == MAX_EXPR);
> > > + }
> > > + }
> > > +
> > > + if (!valid_minmax_p)
> > > + return false;
> > > +
> > > + if (!assign
> > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > + return false;
> > > +
> > > + lhs = gimple_assign_lhs (assign);
> > > + ass_code = gimple_assign_rhs_code (assign);
> > > + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
> > > + return false;
> > > +
> > > + op0 = gimple_assign_rhs1 (assign);
> > > + op1 = gimple_assign_rhs2 (assign);
> > > +
> > > + assign = last_and_only_stmt (alt_middle_bb);
> > > + if (!assign
> > > + || gimple_code (assign) != GIMPLE_ASSIGN)
> > > + return false;
> > > +
> > > + alt_lhs = gimple_assign_lhs (assign);
> > > + if (ass_code != gimple_assign_rhs_code (assign))
> > > + return false;
> > > +
> > > + if (!operand_equal_for_phi_arg_p (lhs, arg_true)
> > > + || !operand_equal_for_phi_arg_p (alt_lhs, arg_false))
> > > + return false;
> > > +
> > > + alt_op0 = gimple_assign_rhs1 (assign);
> > > + alt_op1 = gimple_assign_rhs2 (assign);
> > > +
> > > + if ((operand_equal_for_phi_arg_p (op0, smaller)
> > > + || (alt_smaller
> > > + && operand_equal_for_phi_arg_p (op0, alt_smaller)))
> > > + && (operand_equal_for_phi_arg_p (alt_op0, larger)
> > > + || (alt_larger
> > > + && operand_equal_for_phi_arg_p (alt_op0, alt_larger))))
> > > + {
> > > + /* We got here if the condition is true, i.e., SMALLER < LARGER. */
> > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > + return false;
> > > +
> > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > + && (bound = strip_bit_not (op1)) != NULL)
> > > + {
> > > + minmax = MAX_EXPR;
> > > + ass_code = invert_minmax_code (ass_code);
> > > + invert = true;
> > > + }
> > > + else
> > > + {
> > > + bound = op1;
> > > + minmax = MIN_EXPR;
> > > + arg0 = op0;
> > > + arg1 = alt_op0;
> > > + }
> > > + }
> > > + else if ((operand_equal_for_phi_arg_p (op0, larger)
> > > + || (alt_larger
> > > + && operand_equal_for_phi_arg_p (op0, alt_larger)))
> > > + && (operand_equal_for_phi_arg_p (alt_op0, smaller)
> > > + || (alt_smaller
> > > > + && operand_equal_for_phi_arg_p (alt_op0, alt_smaller))))
> > > + {
> > > + /* We got here if the condition is true, i.e., SMALLER > LARGER. */
> > > + if (!operand_equal_for_phi_arg_p (op1, alt_op1))
> > > + return false;
> > > +
> > > + if ((arg0 = strip_bit_not (op0)) != NULL
> > > + && (arg1 = strip_bit_not (alt_op0)) != NULL
> > > + && (bound = strip_bit_not (op1)) != NULL)
> > > + {
> > > + minmax = MIN_EXPR;
> > > + ass_code = invert_minmax_code (ass_code);
> > > + invert = true;
> > > + }
> > > + else
> > > + {
> > > + bound = op1;
> > > + minmax = MAX_EXPR;
> > > + arg0 = op0;
> > > + arg1 = alt_op0;
> > > + }
> > > + }
> > > + else
> > > + return false;
> > > +
> > > + /* Emit the statement to compute min/max. */
> > > + location_t locus = gimple_location (last_stmt (cond_bb));
> > > + gimple_seq stmts = NULL;
> > > + tree phi_result = PHI_RESULT (phi);
> > > + result = gimple_build (&stmts, locus, minmax, TREE_TYPE (phi_result),
> > > + arg0, bound);
> > > > + result = gimple_build (&stmts, locus, ass_code, TREE_TYPE (phi_result),
> > > > + result, arg1);
> > > > + if (invert)
> > > > + result = gimple_build (&stmts, locus, BIT_NOT_EXPR, TREE_TYPE (phi_result),
> > > > + result);
> > > +
> > > + gsi = gsi_last_bb (cond_bb);
> > > + gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> > > +
> > > + replace_phi_edge_with_variable (cond_bb, e1, phi, result,
> > > + false);
> > > +
> > > + return true;
> > > + }
> > > else
> > > {
> > > /* Recognize the following case, assuming d <= u:
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > Boudien Moerman; HRB 36809 (AG Nuernberg)
>
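For reference, the arithmetic behind the three-way case can be checked outside the compiler. The sketch below is a standalone illustration and not part of the patch: it compares the branchy source form used in the minmax tests with the straight-line MIN (MIN (smaller, c), larger) form, and checks that a three-way MIN over bitwise-inverted uint8_t inputs equals the inversion of a three-way MAX over the original inputs. The helper names (min_u8, max_u8, count_mismatches and the three_min_* variants) are made up for this example.

```c
#include <stdint.h>

/* Branchy form, as written in the quoted minmax-*.c tests.  */
static uint8_t
three_min_branchy (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc < xm)
    xk = (uint8_t) (xc < xy ? xc : xy);
  else
    xk = (uint8_t) (xm < xy ? xm : xy);
  return xk;
}

/* Illustrative helpers standing in for MIN_EXPR / MAX_EXPR.  */
static uint8_t min_u8 (uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8 (uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Straight-line form: a = MIN (smaller, c); x = MIN (larger, a).  */
static uint8_t
three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8 (xm, min_u8 (xc, xy));
}

/* Inverted form: MIN over ~x is ~MAX over x, so a single BIT_NOT
   remains after two MAX operations.  */
static uint8_t
three_min_of_inverted (uint8_t xc, uint8_t xm, uint8_t xy)
{
  return (uint8_t) ~max_u8 (xm, max_u8 (xc, xy));
}

/* Count the inputs on which the forms disagree, over all 2^24 triples.  */
unsigned long
count_mismatches (void)
{
  unsigned long bad = 0;
  for (unsigned c = 0; c < 256; c++)
    for (unsigned m = 0; m < 256; m++)
      for (unsigned y = 0; y < 256; y++)
        {
          uint8_t xc = (uint8_t) c, xm = (uint8_t) m, xy = (uint8_t) y;
          if (three_min_branchy (xc, xm, xy) != three_min_flat (xc, xm, xy))
            bad++;
          if (three_min_branchy ((uint8_t) ~xc, (uint8_t) ~xm, (uint8_t) ~xy)
              != three_min_of_inverted (xc, xm, xy))
            bad++;
        }
  return bad;
}
```

count_mismatches () returning 0 only confirms the identities on uint8_t values; it says nothing about whether the GIMPLE pattern matching itself is correct.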
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-07-27 11:18 ` Richard Biener
@ 2022-08-02 8:32 ` Tamar Christina
2022-08-02 9:11 ` Richard Biener
0 siblings, 1 reply; 26+ messages in thread
From: Tamar Christina @ 2022-08-02 8:32 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, nd, jakub
> > > > > When this function replaces the edge it doesn't seem to update the
> > > > > dominators. Since it's replacing the middle BB we then end up with an error
> > > >
> > > > gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error: dominator
> > > > of 5 should be 4, not 2
> > > >
> > > > during early verify. So instead, I replace the BB but defer its
> > > > deletion until cleanup which removes it and updates the dominators.
> > >
> > > Hmm, for a diamond shouldn't you replace
> > >
> > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > else
> > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > >
> > > with
> > >
> > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > >
> > > thus, the code expects to be left with a fallthru to the PHI block
> > > which is expected to have the immediate dominator being cond_block
> > > but with a diamond there's a (possibly empty) block inbetween and
> > > dominators are wrong.
> >
> > > Agreed, but the (EDGE_SUCC (cond_block, 1)->dest == bb) doesn't seem
> > > like the right one since for a diamond there will be a block in
> > > between the two. Did you perhaps mean EDGE_SUCC (EDGE_SUCC
> > > (cond_block, 1)->dest, 0)->dest == bb? i.e. that the destination
> > > across the diamond be bb, and then you remove the middle block?
>
> Hmm, I think my condition was correct - the code tries to remove the edge to
> the middle-block and checks the remaining edge falls through to the merge
> block. With a true diamond there is no fallthru to the merge block to keep so
> we better don't remove any edge?
>
> > For the minmax diamond we want both edges removed, since all the code
> > in the middle BBs is now dead. But this is probably not true in the
> > general sense.
Ah! Sorry, I was firing a few cylinders short. I get what you mean now:
@@ -425,8 +439,19 @@ replace_phi_edge_with_variable (basic_block cond_block,
edge edge_to_remove;
if (EDGE_SUCC (cond_block, 0)->dest == bb)
edge_to_remove = EDGE_SUCC (cond_block, 1);
- else
+ else if (EDGE_SUCC (cond_block, 1)->dest == bb)
edge_to_remove = EDGE_SUCC (cond_block, 0);
+ else
+ {
+ /* If neither edge from the conditional is the final bb
+ then we must have a diamond block, in which case
+ the true edge was changed by SET_USE above and we must
+ mark the other edge as the false edge. */
+ gcond *cond = as_a <gcond *> (last_stmt (cond_block));
+ gimple_cond_make_false (cond);
+ return;
+ }
+
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
OK with this change?
Thanks,
Tamar
* Re: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-08-02 8:32 ` Tamar Christina
@ 2022-08-02 9:11 ` Richard Biener
2022-08-03 8:17 ` Tamar Christina
0 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2022-08-02 9:11 UTC (permalink / raw)
To: Tamar Christina; +Cc: Richard Biener, jakub, nd, gcc-patches
On Tue, Aug 2, 2022 at 10:33 AM Tamar Christina via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> > > > > > When this function replaces the edge it doesn't seem to update the
> > > > > > dominators. Since it's replacing the middle BB we then end up with an error
> > > > >
> > > > > gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error: dominator
> > > > > of 5 should be 4, not 2
> > > > >
> > > > > during early verify. So instead, I replace the BB but defer its
> > > > > deletion until cleanup which removes it and updates the dominators.
> > > >
> > > > Hmm, for a diamond shouldn't you replace
> > > >
> > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > else
> > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > >
> > > > with
> > > >
> > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > >
> > > > thus, the code expects to be left with a fallthru to the PHI block
> > > > which is expected to have the immediate dominator being cond_block
> > > > but with a diamond there's a (possibly empty) block inbetween and
> > > > dominators are wrong.
> > >
> > > > Agreed, but the (EDGE_SUCC (cond_block, 1)->dest == bb) doesn't seem
> > > > like the right one since for a diamond there will be a block in
> > > > between the two. Did you perhaps mean EDGE_SUCC (EDGE_SUCC
> > > > (cond_block, 1)->dest, 0)->dest == bb? i.e. that the destination
> > > > across the diamond be bb, and then you remove the middle block?
> >
> > Hmm, I think my condition was correct - the code tries to remove the edge to
> > the middle-block and checks the remaining edge falls through to the merge
> > block. With a true diamond there is no fallthru to the merge block to keep so
> > we better don't remove any edge?
> >
> > > > For the minmax diamond we want both edges removed, since all the code
> > > > in the middle BBs is now dead. But this is probably not true in the
> > > > general sense.
>
> Ah! Sorry I was firing a few cylinders short, I get what you mean now:
>
> @@ -425,8 +439,19 @@ replace_phi_edge_with_variable (basic_block cond_block,
> edge edge_to_remove;
> if (EDGE_SUCC (cond_block, 0)->dest == bb)
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> - else
> + else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> edge_to_remove = EDGE_SUCC (cond_block, 0);
> + else
> + {
> + /* If neither edge from the conditional is the final bb
> + then we must have a diamond block, in which case
> + the true edge was changed by SET_USE above and we must
> + mark the other edge as the false edge. */
> + gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> + gimple_cond_make_false (cond);
> + return;
> + }
> +
Note there is already
if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
{
...
}
else
{
/* If there are other edges into the middle block make
CFG cleanup deal with the edge removal to avoid
updating dominators here in a non-trivial way. */
gcond *cond = as_a <gcond *> (last_stmt (cond_block));
if (edge_to_remove->flags & EDGE_TRUE_VALUE)
gimple_cond_make_false (cond);
else
gimple_cond_make_true (cond);
}
I'm not sure how you can say 'e' is always the true edge. May I suggest
amending the first condition with edge_to_remove && (and initializing
edge_to_remove to NULL), and using e->flags instead of
edge_to_remove->flags in the else branch, of course also inverting the
logic since we're keeping 'e'?
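To make the "inverting the logic" point concrete, here is a minimal standalone sketch (not GCC code). It assumes a conditional block with exactly one true and one false outgoing edge; the TOY_EDGE_* flags and both function names are invented for illustration and only mimic EDGE_TRUE_VALUE / EDGE_FALSE_VALUE. Folding the condition based on the edge being removed and folding it based on the edge being kept should produce the same constant, provided the test is flipped:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's EDGE_TRUE_VALUE / EDGE_FALSE_VALUE.  */
enum { TOY_EDGE_TRUE = 1, TOY_EDGE_FALSE = 2 };

/* Existing formulation: the edge that goes away decides.  Removing the
   true edge means the condition has to fold to false, and vice versa.
   The return value is the constant the condition is folded to.  */
bool
fold_from_removed_edge (int removed_flags)
{
  return !(removed_flags & TOY_EDGE_TRUE);
}

/* Formulation based on the edge that stays: keeping the true edge means
   the condition folds to true, so the test is inverted relative to the
   function above.  */
bool
fold_from_kept_edge (int kept_flags)
{
  return (kept_flags & TOY_EDGE_TRUE) != 0;
}
```

With one true and one false successor, removing the true edge is the same as keeping the false edge, so both functions agree on the folded constant.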
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok with this Change?
>
> Thanks,
> Tamar
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-08-02 9:11 ` Richard Biener
@ 2022-08-03 8:17 ` Tamar Christina
2022-08-03 8:25 ` Richard Biener
0 siblings, 1 reply; 26+ messages in thread
From: Tamar Christina @ 2022-08-03 8:17 UTC (permalink / raw)
To: Richard Biener; +Cc: Richard Biener, jakub, nd, gcc-patches
> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Tuesday, August 2, 2022 10:11 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; jakub@redhat.com; nd
> <nd@arm.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> max/min.
>
> On Tue, Aug 2, 2022 at 10:33 AM Tamar Christina via Gcc-patches <gcc-
> patches@gcc.gnu.org> wrote:
> >
> > > > > > > When this function replaces the edge it doesn't seem to update
> > > > > > > the dominators. Since it's replacing the middle BB we then end
> > > > > > > up with an error
> > > > > > >
> > > > > > > gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error:
> > > > > > > dominator of 5 should be 4, not 2
> > > > > > >
> > > > > > > during early verify. So instead, I replace the BB but defer
> > > > > > > its deletion until cleanup which removes it and updates the
> > > > > > > dominators.
> > > > >
> > > > > Hmm, for a diamond shouldn't you replace
> > > > >
> > > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > > else
> > > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > >
> > > > > with
> > > > >
> > > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > > else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > >
> > > > > thus, the code expects to be left with a fallthru to the PHI
> > > > > block which is expected to have the immediate dominator being
> > > > > cond_block but with a diamond there's a (possibly empty) block
> > > > > inbetween and dominators are wrong.
> > > >
> > > > > Agreed, but the (EDGE_SUCC (cond_block, 1)->dest == bb) doesn't
> > > > > seem like the right one since for a diamond there will be a block
> > > > > in between the two. Did you perhaps mean EDGE_SUCC (EDGE_SUCC
> > > > > (cond_block, 1)->dest, 0)->dest == bb? i.e. that the destination
> > > > > across the diamond be bb, and then you remove the middle block?
> > >
> > > Hmm, I think my condition was correct - the code tries to remove the
> > > edge to the middle-block and checks the remaining edge falls through
> > > to the merge block. With a true diamond there is no fallthru to the
> > > merge block to keep so we better don't remove any edge?
> > >
> > > > > For the minmax diamond we want both edges removed, since all the
> > > > > code in the middle BBs is now dead. But this is probably not
> > > > > true in the general sense.
> >
> > Ah! Sorry I was firing a few cylinders short, I get what you mean now:
> >
> > @@ -425,8 +439,19 @@ replace_phi_edge_with_variable (basic_block
> cond_block,
> > edge edge_to_remove;
> > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > - else
> > + else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > + else
> > + {
> > + /* If neither edge from the conditional is the final bb
> > + then we must have a diamond block, in which case
> > + the true edge was changed by SET_USE above and we must
> > + mark the other edge as the false edge. */
> > + gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> > + gimple_cond_make_false (cond);
> > + return;
> > + }
> > +
>
> Note there is already
>
> if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> {
> ...
> }
> else
> {
> /* If there are other edges into the middle block make
> CFG cleanup deal with the edge removal to avoid
> updating dominators here in a non-trivial way. */
> gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> if (edge_to_remove->flags & EDGE_TRUE_VALUE)
> gimple_cond_make_false (cond);
> else
> gimple_cond_make_true (cond);
> }
>
> I'm not sure how you can say 'e' is always the true edge? May I suggest to
> amend the first condition with edge_to_remove && (and initialize that to
> NULL) and use e->flags instead of edge_to_remove in the else, of course
> also inverting the logic since we're keeping 'e'?
As discussed on IRC, here's the version using keep_edge:
@@ -422,12 +436,17 @@ replace_phi_edge_with_variable (basic_block cond_block,
SET_USE (PHI_ARG_DEF_PTR (phi, e->dest_idx), new_tree);
/* Remove the empty basic block. */
- edge edge_to_remove;
+ edge edge_to_remove = NULL, keep_edge = NULL;
if (EDGE_SUCC (cond_block, 0)->dest == bb)
edge_to_remove = EDGE_SUCC (cond_block, 1);
- else
+ else if (EDGE_SUCC (cond_block, 1)->dest == bb)
edge_to_remove = EDGE_SUCC (cond_block, 0);
- if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
+ else if ((keep_edge = find_edge (cond_block, e->src)))
+ ;
+ else
+ gcc_unreachable ();
+
+ if (edge_to_remove && EDGE_COUNT (edge_to_remove->dest->preds) == 1)
{
e->flags |= EDGE_FALLTHRU;
e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
@@ -438,6 +457,18 @@ replace_phi_edge_with_variable (basic_block cond_block,
gsi = gsi_last_bb (cond_block);
gsi_remove (&gsi, true);
}
+ else if (keep_edge)
+ {
+ /* If we're in a diamond then we have identified the edge
+ that we want to keep. Since the dominators will require
+ updating in a non-trivial way we leave it to CFG cleanup
+ but mark the condition as appropriately true/false. */
+ gcond *cond = as_a <gcond *> (last_stmt (cond_block));
+ if (keep_edge->flags & EDGE_FALSE_VALUE)
+ gimple_cond_make_false (cond);
+ else if (keep_edge->flags & EDGE_TRUE_VALUE)
+ gimple_cond_make_true (cond);
+ }
else
{
/* If there are other edges into the middle block make
@@ -1733,15 +1764,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
return 0;
}
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
OK with this change?
Thanks,
Tamar
>
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok with this Change?
> >
> > Thanks,
> > Tamar
* RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-08-03 8:17 ` Tamar Christina
@ 2022-08-03 8:25 ` Richard Biener
2022-08-03 20:41 ` H.J. Lu
0 siblings, 1 reply; 26+ messages in thread
From: Richard Biener @ 2022-08-03 8:25 UTC (permalink / raw)
To: Tamar Christina; +Cc: Richard Biener, jakub, nd, gcc-patches
On Wed, 3 Aug 2022, Tamar Christina wrote:
>
> > -----Original Message-----
> > From: Richard Biener <richard.guenther@gmail.com>
> > Sent: Tuesday, August 2, 2022 10:11 AM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: Richard Biener <rguenther@suse.de>; jakub@redhat.com; nd
> > <nd@arm.com>; gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> > max/min.
> >
> > On Tue, Aug 2, 2022 at 10:33 AM Tamar Christina via Gcc-patches <gcc-
> > patches@gcc.gnu.org> wrote:
> > >
> > > > > > > When this function replaces the edge it doesn't seem to update
> > > > > > > the
> > > > > > dominators.
> > > > > > > Since It's replacing the middle BB we then end up with an
> > > > > > > error
> > > > > > >
> > > > > > > gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error:
> > > > > > > dominator of 5 should be 4, not 2
> > > > > > >
> > > > > > > during early verify. So instead, I replace the BB but defer
> > > > > > > its deletion until cleanup which removes it and updates the
> > dominators.
> > > > > >
> > > > > > Hmm, for a diamond shouldn't you replace
> > > > > >
> > > > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > > > else
> > > > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > > >
> > > > > > with
> > > > > >
> > > > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > > > else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > > > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > > >
> > > > > > thus, the code expects to be left with a fallthru to the PHI
> > > > > > block which is expected to have the immediate dominator being
> > > > > > cond_block but with a diamond there's a (possibly empty) block
> > > > > > inbetween and dominators are wrong.
> > > > >
> > > > > Agreed, but the (EDGE_SUCC (cond_block, 1)->dest == bb) doesn't
> > > > > seem like the Right one since for a diamond there will be a block
> > > > > in between the two. Did you perhaps mean EDGE_SUCC (EDGE_SUCC
> > > > > (cond_block, 1)->dest, 0)->dest == bb? i.e. that that destination
> > > > > across the
> > > > diamond be bb, and then you remove the middle block?
> > > >
> > > > Hmm, I think my condition was correct - the code tries to remove the
> > > > edge to the middle-block and checks the remaining edge falls through
> > > > to the merge block. With a true diamond there is no fallthru to the
> > > > merge block to keep so we better don't remove any edge?
> > > >
> > > > > For the minmax diamond we want both edges removed, since all the
> > > > > code in the middle BBs are now dead. But this is probably not
> > > > > true in the general
> > > > sense.
> > >
> > > Ah! Sorry I was firing a few cylinders short, I get what you mean now:
> > >
> > > @@ -425,8 +439,19 @@ replace_phi_edge_with_variable (basic_block
> > cond_block,
> > > edge edge_to_remove;
> > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > - else
> > > + else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > + else
> > > + {
> > > + /* If neither edge from the conditional is the final bb
> > > + then we must have a diamond block, in which case
> > > + the true edge was changed by SET_USE above and we must
> > > + mark the other edge as the false edge. */
> > > + gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> > > + gimple_cond_make_false (cond);
> > > + return;
> > > + }
> > > +
> >
> > Note there is already
> >
> > if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > {
> > ...
> > }
> > else
> > {
> > /* If there are other edges into the middle block make
> > CFG cleanup deal with the edge removal to avoid
> > updating dominators here in a non-trivial way. */
> > gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> > if (edge_to_remove->flags & EDGE_TRUE_VALUE)
> > gimple_cond_make_false (cond);
> > else
> > gimple_cond_make_true (cond);
> > }
> >
> > I'm not sure how you can say 'e' is always the true edge? May I suggest to
> > amend the first condition with edge_to_remove && (and initialize that to
> > NULL) and use e->flags instead of edge_to_remove in the else, of course
> > also inverting the logic since we're keeping 'e'?
>
> As discussed on IRC, here's the version using keep_edge:
>
> @@ -422,12 +436,17 @@ replace_phi_edge_with_variable (basic_block cond_block,
> SET_USE (PHI_ARG_DEF_PTR (phi, e->dest_idx), new_tree);
>
> /* Remove the empty basic block. */
> - edge edge_to_remove;
> + edge edge_to_remove = NULL, keep_edge = NULL;
> if (EDGE_SUCC (cond_block, 0)->dest == bb)
> edge_to_remove = EDGE_SUCC (cond_block, 1);
> - else
> + else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> edge_to_remove = EDGE_SUCC (cond_block, 0);
> - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> + else if ((keep_edge = find_edge (cond_block, e->src)))
> + ;
> + else
> + gcc_unreachable ();
> +
> + if (edge_to_remove && EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> {
> e->flags |= EDGE_FALLTHRU;
> e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> @@ -438,6 +457,18 @@ replace_phi_edge_with_variable (basic_block cond_block,
> gsi = gsi_last_bb (cond_block);
> gsi_remove (&gsi, true);
> }
> + else if (keep_edge)
> + {
> + /* If we're in a diamond then we have identified the edge
> + that we want to keep. Since the dominators will require
> + updating in a non-trivial way we leave it to CFG cleanup
> + but mark the condition as appropriately true/false. */
> + gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> + if (keep_edge->flags & EDGE_FALSE_VALUE)
> + gimple_cond_make_false (cond);
> + else if (keep_edge->flags & EDGE_TRUE_VALUE)
> + gimple_cond_make_true (cond);
> + }
> else
> {
> /* If there are other edges into the middle block make
I meant to merge the keep_edge and the existing else case by setting
keep_edge the obvious way in the other two if cases. Sorry for not
being clear ...
OK with that change.
> @@ -1733,15 +1764,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> return 0;
> }
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok with change?
>
> Thanks,
> Tamar
>
> >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok with this Change?
> > >
> > > Thanks,
> > > Tamar
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
* RE: [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted.
2022-06-21 7:43 ` Richard Biener
@ 2022-08-03 15:13 ` Tamar Christina
2022-08-04 6:58 ` Richard Biener
0 siblings, 1 reply; 26+ messages in thread
From: Tamar Christina @ 2022-08-03 15:13 UTC (permalink / raw)
To: Richard Biener
Cc: Richard Sandiford, Richard Biener via Gcc-patches, Richard Guenther, nd
[-- Attachment #1: Type: text/plain, Size: 5252 bytes --]
> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Tuesday, June 21, 2022 8:43 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Richard Biener via Gcc-
> patches <gcc-patches@gcc.gnu.org>; Richard Guenther
> <rguenther@suse.de>; nd <nd@arm.com>
> Subject: Re: [PATCH 1/2]middle-end: Simplify subtract where both
> arguments are being bitwise inverted.
>
> On Mon, Jun 20, 2022 at 10:49 AM Tamar Christina
> <Tamar.Christina@arm.com> wrote:
> >
> > > -----Original Message-----
> > > From: Richard Sandiford <richard.sandiford@arm.com>
> > > Sent: Monday, June 20, 2022 9:19 AM
> > > To: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
> > > Cc: Tamar Christina <Tamar.Christina@arm.com>; Richard Biener
> > > <richard.guenther@gmail.com>; Richard Guenther
> <rguenther@suse.de>;
> > > nd <nd@arm.com>
> > > Subject: Re: [PATCH 1/2]middle-end: Simplify subtract where both
> > > arguments are being bitwise inverted.
> > >
> > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > On Thu, Jun 16, 2022 at 1:10 PM Tamar Christina via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > >>
> > > >> Hi All,
> > > >>
> > > >> This adds a match.pd rule that drops the bitwise nots when both
> > > >> arguments to a subtract are inverted, i.e. for:
> > > >>
> > > >> float g(float a, float b)
> > > >> {
> > > >> return ~(int)a - ~(int)b;
> > > >> }
> > > >>
> > > >> we instead generate
> > > >>
> > > >> float g(float a, float b)
> > > >> {
> > > >> return (int)a - (int)b;
> > > >> }
> > > >>
> > > >> We already do a limited version of this from the fold_binary fold
> > > >> functions but this makes a more general version in match.pd that
> > > >> applies more often.
> > > >>
> > > >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >>
> > > >> Ok for master?
> > > >>
> > > >> Thanks,
> > > >> Tamar
> > > >>
> > > >> gcc/ChangeLog:
> > > >>
> > > >> * match.pd: New bit_not rule.
> > > >>
> > > >> gcc/testsuite/ChangeLog:
> > > >>
> > > >> * gcc.dg/subnot.c: New test.
> > > >>
> > > >> --- inline copy of patch --
> > > >> diff --git a/gcc/match.pd b/gcc/match.pd
> > > >> index a59b6778f661cf9121dd3503f43472871e4da445..51b0a1b562409af535e53828a10c30b8a3e1ae2e 100644
> > > >> --- a/gcc/match.pd
> > > >> +++ b/gcc/match.pd
> > > >> @@ -1258,6 +1258,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > >> (simplify
> > > >> (bit_not (plus:c (bit_not @0) @1))
> > > >> (minus @0 @1))
> > > >> +/* (~X - ~Y) -> X - Y. */
> > > >> +(simplify
> > > >> + (minus (bit_not @0) (bit_not @1))
> > > >> + (minus @0 @1))
> > > >
> > > > It doesn't seem correct.
> > > >
> > > > (gdb) p/x ~-1 - ~0x80000000
> > > > $3 = 0x80000001
> > > > (gdb) p/x -1 - 0x80000000
> > > > $4 = 0x7fffffff
> > > >
> > > > where I was looking for a case exposing undefined integer overflow.
> > >
> > > Yeah, shouldn't it be folding to (minus @1 @0) instead?
> > >
> > > ~X = (-X - 1)
> > > ~Y = (-Y - 1)
> > >
> > > so:
> > >
> > > ~X - ~Y = (-X - 1) - (-Y - 1)
> > > = -X - 1 + Y + 1
> > > = Y - X
> > >
> >
> > You're right, sorry, I should have paid more attention when I wrote the patch.
>
> You still need to watch out for undefined overflow cases in the result that
> were well-defined in the original expression I think.
The only special thing we do for signed numbers is to do the subtract as unsigned. As I mentioned
before, GCC already does this transformation as part of the fold machinery, but that only happens
when a very simple tree is matched and only when it has a single use, e.g. https://godbolt.org/z/EWsdhfrKj
I'm only attempting to make it apply more generally, as the result is always beneficial.
I've respun the patch to do the same as we already do.
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* match.pd: New bit_not rule.
gcc/testsuite/ChangeLog:
* gcc.dg/subnot.c: New test.
--- inline copy of patch ---
diff --git a/gcc/match.pd b/gcc/match.pd
index 330c1db0c8e12b0fb010b1958729444672403866..00b3e07b2a5216b19ed58500923680d83c67d8cf 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1308,6 +1308,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(simplify
(bit_not (plus:c (bit_not @0) @1))
(minus @0 @1))
+/* (~X - ~Y) -> Y - X. */
+(simplify
+ (minus (bit_not @0) (bit_not @1))
+ (with { tree utype = unsigned_type_for (type); }
+ (convert (minus (convert:utype @1) (convert:utype @0)))))
/* ~(X - Y) -> ~X + Y. */
(simplify
diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
new file mode 100644
index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/subnot.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+float g(float a, float b)
+{
+ return ~(int)a - ~(int)b;
+}
+
+/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
[-- Attachment #2: rb15840.patch --]
[-- Type: application/octet-stream, Size: 980 bytes --]
diff --git a/gcc/match.pd b/gcc/match.pd
index 330c1db0c8e12b0fb010b1958729444672403866..00b3e07b2a5216b19ed58500923680d83c67d8cf 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1308,6 +1308,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(simplify
(bit_not (plus:c (bit_not @0) @1))
(minus @0 @1))
+/* (~X - ~Y) -> Y - X. */
+(simplify
+ (minus (bit_not @0) (bit_not @1))
+ (with { tree utype = unsigned_type_for (type); }
+ (convert (minus (convert:utype @1) (convert:utype @0)))))
/* ~(X - Y) -> ~X + Y. */
(simplify
diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
new file mode 100644
index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/subnot.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+float g(float a, float b)
+{
+ return ~(int)a - ~(int)b;
+}
+
+/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
* Re: [PATCH 2/2]middle-end: Support recognition of three-way max/min.
2022-08-03 8:25 ` Richard Biener
@ 2022-08-03 20:41 ` H.J. Lu
0 siblings, 0 replies; 26+ messages in thread
From: H.J. Lu @ 2022-08-03 20:41 UTC (permalink / raw)
To: Richard Biener; +Cc: Tamar Christina, jakub, nd, gcc-patches
On Wed, Aug 3, 2022 at 1:26 AM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Wed, 3 Aug 2022, Tamar Christina wrote:
>
> >
> > > -----Original Message-----
> > > From: Richard Biener <richard.guenther@gmail.com>
> > > Sent: Tuesday, August 2, 2022 10:11 AM
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: Richard Biener <rguenther@suse.de>; jakub@redhat.com; nd
> > > <nd@arm.com>; gcc-patches@gcc.gnu.org
> > > Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> > > max/min.
> > >
> > > On Tue, Aug 2, 2022 at 10:33 AM Tamar Christina via Gcc-patches <gcc-
> > > patches@gcc.gnu.org> wrote:
> > > >
> > > > > > > > When this function replaces the edge it doesn't seem to update
> > > > > > > > the
> > > > > > > dominators.
> > > > > > > > Since It's replacing the middle BB we then end up with an
> > > > > > > > error
> > > > > > > >
> > > > > > > > gcc/testsuite/gcc.dg/tree-ssa/minmax-14.c:17:1: error:
> > > > > > > > dominator of 5 should be 4, not 2
> > > > > > > >
> > > > > > > > during early verify. So instead, I replace the BB but defer
> > > > > > > > its deletion until cleanup which removes it and updates the
> > > dominators.
> > > > > > >
> > > > > > > Hmm, for a diamond shouldn't you replace
> > > > > > >
> > > > > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > > > > else
> > > > > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > > > >
> > > > > > > with
> > > > > > >
> > > > > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > > > > else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > > > > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > > > >
> > > > > > > thus, the code expects to be left with a fallthru to the PHI
> > > > > > > block which is expected to have the immediate dominator being
> > > > > > > cond_block but with a diamond there's a (possibly empty) block
> > > > > > > inbetween and dominators are wrong.
> > > > > >
> > > > > > Agreed, but the (EDGE_SUCC (cond_block, 1)->dest == bb) doesn't
> > > > > > seem like the Right one since for a diamond there will be a block
> > > > > > in between the two. Did you perhaps mean EDGE_SUCC (EDGE_SUCC
> > > > > > (cond_block, 1)->dest, 0)->dest == bb? i.e. that that destination
> > > > > > across the
> > > > > diamond be bb, and then you remove the middle block?
> > > > >
> > > > > Hmm, I think my condition was correct - the code tries to remove the
> > > > > edge to the middle-block and checks the remaining edge falls through
> > > > > to the merge block. With a true diamond there is no fallthru to the
> > > > > merge block to keep so we better don't remove any edge?
> > > > >
> > > > > > For the minmax diamond we want both edges removed, since all the
> > > > > > code in the middle BBs are now dead. But this is probably not
> > > > > > true in the general
> > > > > sense.
> > > >
> > > > Ah! Sorry I was firing a few cylinders short, I get what you mean now:
> > > >
> > > > @@ -425,8 +439,19 @@ replace_phi_edge_with_variable (basic_block
> > > cond_block,
> > > > edge edge_to_remove;
> > > > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > > > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > > > - else
> > > > + else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > > > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > > > + else
> > > > + {
> > > > + /* If neither edge from the conditional is the final bb
> > > > + then we must have a diamond block, in which case
> > > > + the true edge was changed by SET_USE above and we must
> > > > + mark the other edge as the false edge. */
> > > > + gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> > > > + gimple_cond_make_false (cond);
> > > > + return;
> > > > + }
> > > > +
> > >
> > > Note there is already
> > >
> > > if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > > {
> > > ...
> > > }
> > > else
> > > {
> > > /* If there are other edges into the middle block make
> > > CFG cleanup deal with the edge removal to avoid
> > > updating dominators here in a non-trivial way. */
> > > gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> > > if (edge_to_remove->flags & EDGE_TRUE_VALUE)
> > > gimple_cond_make_false (cond);
> > > else
> > > gimple_cond_make_true (cond);
> > > }
> > >
> > > I'm not sure how you can say 'e' is always the true edge? May I suggest to
> > > amend the first condition with edge_to_remove && (and initialize that to
> > > NULL) and use e->flags instead of edge_to_remove in the else, of course
> > > also inverting the logic since we're keeping 'e'?
> >
> > As discussed on IRC, here's the version using keep_edge:
> >
> > @@ -422,12 +436,17 @@ replace_phi_edge_with_variable (basic_block cond_block,
> > SET_USE (PHI_ARG_DEF_PTR (phi, e->dest_idx), new_tree);
> >
> > /* Remove the empty basic block. */
> > - edge edge_to_remove;
> > + edge edge_to_remove = NULL, keep_edge = NULL;
> > if (EDGE_SUCC (cond_block, 0)->dest == bb)
> > edge_to_remove = EDGE_SUCC (cond_block, 1);
> > - else
> > + else if (EDGE_SUCC (cond_block, 1)->dest == bb)
> > edge_to_remove = EDGE_SUCC (cond_block, 0);
> > - if (EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > + else if ((keep_edge = find_edge (cond_block, e->src)))
> > + ;
> > + else
> > + gcc_unreachable ();
> > +
> > + if (edge_to_remove && EDGE_COUNT (edge_to_remove->dest->preds) == 1)
> > {
> > e->flags |= EDGE_FALLTHRU;
> > e->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> > @@ -438,6 +457,18 @@ replace_phi_edge_with_variable (basic_block cond_block,
> > gsi = gsi_last_bb (cond_block);
> > gsi_remove (&gsi, true);
> > }
> > + else if (keep_edge)
> > + {
> > + /* If we're in a diamond then we have identified the edge
> > + that we want to keep. Since the dominators will require
> > + updating in a non-trivial way we leave it to CFG cleanup
> > + but mark the condition as appropriately true/false. */
> > + gcond *cond = as_a <gcond *> (last_stmt (cond_block));
> > + if (keep_edge->flags & EDGE_FALSE_VALUE)
> > + gimple_cond_make_false (cond);
> > + else if (keep_edge->flags & EDGE_TRUE_VALUE)
> > + gimple_cond_make_true (cond);
> > + }
> > else
> > {
> > /* If there are other edges into the middle block make
>
> I meant to merge the keep_edge and the existing else case by setting
> keep_edge the obvious way in the other two if cases. Sorry for not
> being clear ...
>
> OK with that change.
>
> > @@ -1733,15 +1764,52 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> > return 0;
> > }
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok with change?
This caused:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106519
> >
> > Thanks,
> > Tamar
> >
> > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok with this Change?
> > > >
> > > > Thanks,
> > > > Tamar
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)
--
H.J.
* RE: [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted.
2022-08-03 15:13 ` Tamar Christina
@ 2022-08-04 6:58 ` Richard Biener
0 siblings, 0 replies; 26+ messages in thread
From: Richard Biener @ 2022-08-04 6:58 UTC (permalink / raw)
To: Tamar Christina
Cc: Richard Biener, Richard Sandiford, Richard Biener via Gcc-patches, nd
On Wed, 3 Aug 2022, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener <richard.guenther@gmail.com>
> > Sent: Tuesday, June 21, 2022 8:43 AM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Richard Biener via Gcc-
> > patches <gcc-patches@gcc.gnu.org>; Richard Guenther
> > <rguenther@suse.de>; nd <nd@arm.com>
> > Subject: Re: [PATCH 1/2]middle-end: Simplify subtract where both
> > arguments are being bitwise inverted.
> >
> > On Mon, Jun 20, 2022 at 10:49 AM Tamar Christina
> > <Tamar.Christina@arm.com> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Richard Sandiford <richard.sandiford@arm.com>
> > > > Sent: Monday, June 20, 2022 9:19 AM
> > > > To: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
> > > > Cc: Tamar Christina <Tamar.Christina@arm.com>; Richard Biener
> > > > <richard.guenther@gmail.com>; Richard Guenther
> > <rguenther@suse.de>;
> > > > nd <nd@arm.com>
> > > > Subject: Re: [PATCH 1/2]middle-end: Simplify subtract where both
> > > > arguments are being bitwise inverted.
> > > >
> > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > On Thu, Jun 16, 2022 at 1:10 PM Tamar Christina via Gcc-patches
> > > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >>
> > > > >> Hi All,
> > > > >>
> > > > >> This adds a match.pd rule that drops the bitwise nots when both
> > > > >> arguments to a subtract are inverted, i.e. for:
> > > > >>
> > > > >> float g(float a, float b)
> > > > >> {
> > > > >> return ~(int)a - ~(int)b;
> > > > >> }
> > > > >>
> > > > >> we instead generate
> > > > >>
> > > > >> float g(float a, float b)
> > > > >> {
> > > > >> return (int)a - (int)b;
> > > > >> }
> > > > >>
> > > > >> We already do a limited version of this from the fold_binary fold
> > > > >> functions but this makes a more general version in match.pd that
> > > > >> applies more often.
> > > > >>
> > > > >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > > >>
> > > > >> Ok for master?
> > > > >>
> > > > >> Thanks,
> > > > >> Tamar
> > > > >>
> > > > >> gcc/ChangeLog:
> > > > >>
> > > > >> * match.pd: New bit_not rule.
> > > > >>
> > > > >> gcc/testsuite/ChangeLog:
> > > > >>
> > > > >> * gcc.dg/subnot.c: New test.
> > > > >>
> > > > >> --- inline copy of patch --
> > > > >> diff --git a/gcc/match.pd b/gcc/match.pd
> > > > >> index a59b6778f661cf9121dd3503f43472871e4da445..51b0a1b562409af535e53828a10c30b8a3e1ae2e 100644
> > > > >> --- a/gcc/match.pd
> > > > >> +++ b/gcc/match.pd
> > > > >> @@ -1258,6 +1258,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > > >> (simplify
> > > > >> (bit_not (plus:c (bit_not @0) @1))
> > > > >> (minus @0 @1))
> > > > >> +/* (~X - ~Y) -> X - Y. */
> > > > >> +(simplify
> > > > >> + (minus (bit_not @0) (bit_not @1))
> > > > >> + (minus @0 @1))
> > > > >
> > > > > It doesn't seem correct.
> > > > >
> > > > > (gdb) p/x ~-1 - ~0x80000000
> > > > > $3 = 0x80000001
> > > > > (gdb) p/x -1 - 0x80000000
> > > > > $4 = 0x7fffffff
> > > > >
> > > > > where I was looking for a case exposing undefined integer overflow.
> > > >
> > > > Yeah, shouldn't it be folding to (minus @1 @0) instead?
> > > >
> > > > ~X = (-X - 1)
> > > > ~Y = (-Y - 1)
> > > >
> > > > so:
> > > >
> > > > ~X - ~Y = (-X - 1) - (-Y - 1)
> > > > = -X - 1 + Y + 1
> > > > = Y - X
> > > >
> > >
> > > You're right, sorry, I should have paid more attention when I wrote the patch.
> >
> > You still need to watch out for undefined overflow cases in the result that
> > were well-defined in the original expression I think.
>
> The only special thing we do for signed numbers is to do the subtract as unsigned. As I mentioned
> before, GCC already does this transformation as part of the fold machinery, but that only happens
> when a very simple tree is matched and only when it has a single use, e.g. https://godbolt.org/z/EWsdhfrKj
>
> I'm only attempting to make it apply more generally, as the result is always beneficial.
>
> I've respun the patch to do the same as we already do.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
OK.
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * match.pd: New bit_not rule.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/subnot.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 330c1db0c8e12b0fb010b1958729444672403866..00b3e07b2a5216b19ed58500923680d83c67d8cf 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1308,6 +1308,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (simplify
> (bit_not (plus:c (bit_not @0) @1))
> (minus @0 @1))
> +/* (~X - ~Y) -> Y - X. */
> +(simplify
> + (minus (bit_not @0) (bit_not @1))
> + (with { tree utype = unsigned_type_for (type); }
> + (convert (minus (convert:utype @1) (convert:utype @0)))))
>
> /* ~(X - Y) -> ~X + Y. */
> (simplify
> diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/subnot.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +float g(float a, float b)
> +{
> + return ~(int)a - ~(int)b;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
Thread overview: 26+ messages
2022-06-16 11:08 [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted Tamar Christina
2022-06-16 11:09 ` [PATCH 2/2]middle-end: Support recognition of three-way max/min Tamar Christina
2022-06-20 8:36 ` Richard Biener
2022-06-20 9:01 ` Tamar Christina
2022-06-21 13:15 ` Richard Biener
2022-06-21 13:42 ` Tamar Christina
2022-06-27 7:52 ` Richard Biener
2022-07-05 15:25 ` Tamar Christina
2022-07-12 9:39 ` Tamar Christina
2022-07-12 13:19 ` Richard Biener
2022-07-27 10:40 ` Tamar Christina
2022-07-27 11:18 ` Richard Biener
2022-08-02 8:32 ` Tamar Christina
2022-08-02 9:11 ` Richard Biener
2022-08-03 8:17 ` Tamar Christina
2022-08-03 8:25 ` Richard Biener
2022-08-03 20:41 ` H.J. Lu
2022-06-20 23:16 ` Andrew Pinski
2022-06-21 6:54 ` Richard Biener
2022-06-21 7:12 ` Tamar Christina
2022-06-20 8:03 ` [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted Richard Biener
2022-06-20 8:18 ` Richard Sandiford
2022-06-20 8:49 ` Tamar Christina
2022-06-21 7:43 ` Richard Biener
2022-08-03 15:13 ` Tamar Christina
2022-08-04 6:58 ` Richard Biener