public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v3 0/7] ifcvt: Convert multiple
@ 2021-12-06 18:43 Robin Dapp
  2021-12-06 18:43 ` [PATCH v3 1/7] ifcvt: Check if cmovs are needed Robin Dapp
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Robin Dapp @ 2021-12-06 18:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

Hi Richard,

as requested, this is the whole series (without the s390-specific parts)
in total.  7/7 is "new" as it was posted separately as "8/7" before
but actually does not change too much compared to the rest.

Supposing that the adjusted 4/7 is OK after recent discussions, if I'm
not mistaken, 5/7, 6/7 (very minor) and 7/7 still need review.

Robin Dapp (7):
  ifcvt: Check if cmovs are needed.
  ifcvt: Allow constants for noce_convert_multiple.
  ifcvt: Improve costs handling for noce_convert_multiple.
  ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.
  ifcvt: Try re-using CC for conditional moves.
  testsuite/s390: Add tests for noce_convert_multiple.
  ifcvt: Run second pass if it is possible to omit a temporary.

 gcc/expmed.c                                  |   8 +-
 gcc/expr.c                                    |  10 +-
 gcc/ifcvt.c                                   | 366 +++++++++++++++---
 gcc/optabs.c                                  | 135 +++++--
 gcc/optabs.h                                  |   4 +-
 gcc/rtl.h                                     |  11 +-
 gcc/testsuite/gcc.dg/ifcvt-4.c                |   2 +-
 .../gcc.target/s390/ifcvt-two-insns-bool.c    |  39 ++
 .../gcc.target/s390/ifcvt-two-insns-int.c     |  39 ++
 .../gcc.target/s390/ifcvt-two-insns-long.c    |  39 ++
 10 files changed, 551 insertions(+), 102 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
 create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
 create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c

-- 
2.31.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 1/7] ifcvt: Check if cmovs are needed.
  2021-12-06 18:43 [PATCH v3 0/7] ifcvt: Convert multiple Robin Dapp
@ 2021-12-06 18:43 ` Robin Dapp
  2021-12-09  1:26   ` Jeff Law
  2021-12-06 18:43 ` [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple Robin Dapp
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-06 18:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

When if-converting multiple SETs and we encounter a swap-style idiom

  if (a > b)
    {
      tmp = c;   // [1]
      c = d;
      d = tmp;
    }

ifcvt should not generate a conditional move for the instruction at
[1].

In order to achieve that, this patch goes through all relevant SETs
and marks the relevant instructions.  This helps to evaluate costs.

On top, only generate temporaries if the current cmov is going to
overwrite one of the comparands of the initial compare.
---
 gcc/ifcvt.c | 174 ++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 150 insertions(+), 24 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 017944f4f79..c98668d5646 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -98,6 +98,8 @@ static int dead_or_predicable (basic_block, basic_block, basic_block,
 			       edge, int);
 static void noce_emit_move_insn (rtx, rtx);
 static rtx_insn *block_has_only_trap (basic_block);
+static void need_cmov_or_rewire (basic_block, hash_set<rtx_insn *> *,
+				 hash_map<rtx_insn *, int> *);
 \f
 /* Count the number of non-jump active insns in BB.  */
 
@@ -3203,6 +3205,11 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   auto_vec<rtx_insn *> unmodified_insns;
   int count = 0;
 
+  hash_set<rtx_insn *> need_no_cmov;
+  hash_map<rtx_insn *, int> rewired_src;
+
+  need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src);
+
   FOR_BB_INSNS (then_bb, insn)
     {
       /* Skip over non-insns.  */
@@ -3213,26 +3220,49 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
       gcc_checking_assert (set);
 
       rtx target = SET_DEST (set);
-      rtx temp = gen_reg_rtx (GET_MODE (target));
+      rtx temp;
+
       rtx new_val = SET_SRC (set);
+      if (int *ii = rewired_src.get (insn))
+	new_val = simplify_replace_rtx (new_val, targets[*ii],
+					temporaries[*ii]);
       rtx old_val = target;
 
-      /* If we were supposed to read from an earlier write in this block,
-	 we've changed the register allocation.  Rewire the read.  While
-	 we are looking, also try to catch a swap idiom.  */
-      for (int i = count - 1; i >= 0; --i)
-	if (reg_overlap_mentioned_p (new_val, targets[i]))
-	  {
-	    /* Catch a "swap" style idiom.  */
-	    if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
-	      /* The write to targets[i] is only live until the read
-		 here.  As the condition codes match, we can propagate
-		 the set to here.  */
-	      new_val = SET_SRC (single_set (unmodified_insns[i]));
-	    else
-	      new_val = temporaries[i];
-	    break;
-	  }
+      /* As we are transforming
+	 if (x > y)
+	   {
+	     a = b;
+	     c = d;
+	   }
+	 into
+	   a = (x > y) ...
+	   c = (x > y) ...
+
+	 we potentially check x > y before every set.
+	 Even though the check might be removed by subsequent passes, this means
+	 that we cannot transform
+	   if (x > y)
+	     {
+	       x = y;
+	       ...
+	     }
+	 into
+	   x = (x > y) ...
+	   ...
+	 since this would invalidate x and the following to-be-removed checks.
+	 Therefore we introduce a temporary every time we are about to
+	 overwrite a variable used in the check.  Costing of a sequence with
+	 these is going to be inaccurate so only use temporaries when
+	 needed.  */
+      if (reg_overlap_mentioned_p (target, cond))
+	temp = gen_reg_rtx (GET_MODE (target));
+      else
+	temp = target;
+
+      /* We have identified swap-style idioms before.  A normal
+	 set will need to be a cmov while the first instruction of a swap-style
+	 idiom can be a regular move.  This helps with costing.  */
+      bool need_cmov = !need_no_cmov.contains (insn);
 
       /* If we had a non-canonical conditional jump (i.e. one where
 	 the fallthrough is to the "else" case) we need to reverse
@@ -3275,16 +3305,29 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	  old_val = lowpart_subreg (dst_mode, old_val, src_mode);
 	}
 
-      /* Actually emit the conditional move.  */
-      rtx temp_dest = noce_emit_cmove (if_info, temp, cond_code,
+      rtx temp_dest = NULL_RTX;
+
+      if (need_cmov)
+	{
+	  /* Actually emit the conditional move.  */
+	  temp_dest = noce_emit_cmove (if_info, temp, cond_code,
 				       x, y, new_val, old_val);
 
-      /* If we failed to expand the conditional move, drop out and don't
-	 try to continue.  */
-      if (temp_dest == NULL_RTX)
+	  /* If we failed to expand the conditional move, drop out and don't
+	     try to continue.  */
+	  if (temp_dest == NULL_RTX)
+	    {
+	      end_sequence ();
+	      return FALSE;
+	    }
+	}
+      else
 	{
-	  end_sequence ();
-	  return FALSE;
+	  if (if_info->then_else_reversed)
+	    noce_emit_move_insn (temp, old_val);
+	  else
+	    noce_emit_move_insn (temp, new_val);
+	  temp_dest = temp;
 	}
 
       /* Bookkeeping.  */
@@ -3808,6 +3851,89 @@ check_cond_move_block (basic_block bb,
   return TRUE;
 }
 
+/* Find local swap-style idioms in BB and mark the first insn (1)
+   that is only a temporary as not needing a conditional move as
+   it is going to be dead afterwards anyway.
+
+     (1) int tmp = a;
+	 a = b;
+	 b = tmp;
+
+	 ifcvt
+	 -->
+
+	 tmp = a;
+	 a = cond ? b : a_old;
+	 b = cond ? tmp : b_old;
+
+    Additionally, store the index of insns like (2) when a subsequent
+    SET reads from their destination.
+
+    (2) int c = a;
+	int d = c;
+
+	ifcvt
+	-->
+
+	c = cond ? a : c_old;
+	d = cond ? d : c;     // Need to use c rather than c_old here.
+*/
+
+static void
+need_cmov_or_rewire (basic_block bb,
+		     hash_set<rtx_insn *> *need_no_cmov,
+		     hash_map<rtx_insn *, int> *rewired_src)
+{
+  rtx_insn *insn;
+  int count = 0;
+  auto_vec<rtx_insn *> insns;
+  auto_vec<rtx> dests;
+
+  /* Iterate over all SETs, storing the destinations
+     in DEST.
+     - If we hit a SET that reads from a destination
+       that we have seen before and the corresponding register
+       is dead afterwards, the register does not need to be
+       moved conditionally.
+     - If we encounter a previously changed register,
+       rewire the read to the original source.  */
+  FOR_BB_INSNS (bb, insn)
+    {
+      rtx set, src, dest;
+
+      if (!active_insn_p (insn))
+	continue;
+
+      set = single_set (insn);
+      if (set == NULL_RTX)
+	continue;
+
+      src = SET_SRC (set);
+      if (SUBREG_P (src))
+	src = SUBREG_REG (src);
+      dest = SET_DEST (set);
+
+      /* Check if the current SET's source is the same
+	 as any previously seen destination.
+	 This is quadratic but the number of insns in BB
+	 is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS.  */
+      if (REG_P (src))
+	for (int i = count - 1; i >= 0; --i)
+	  if (reg_overlap_mentioned_p (src, dests[i]))
+	    {
+	      if (find_reg_note (insn, REG_DEAD, src) != NULL_RTX)
+		need_no_cmov->add (insns[i]);
+	      else
+		rewired_src->put (insn, i);
+	    }
+
+      insns.safe_push (insn);
+      dests.safe_push (dest);
+
+      count++;
+    }
+}
+
 /* Given a basic block BB suitable for conditional move conversion,
    a condition COND, and pointer maps THEN_VALS and ELSE_VALS containing
    the register values depending on COND, emit the insns in the block as
-- 
2.31.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple.
  2021-12-06 18:43 [PATCH v3 0/7] ifcvt: Convert multiple Robin Dapp
  2021-12-06 18:43 ` [PATCH v3 1/7] ifcvt: Check if cmovs are needed Robin Dapp
@ 2021-12-06 18:43 ` Robin Dapp
  2021-12-08 23:51   ` Jeff Law
  2021-12-06 18:43 ` [PATCH v3 3/7] ifcvt: Improve costs handling " Robin Dapp
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-06 18:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

This lifts the restriction of not allowing constants for
noce_convert_multiple.  The code later checks if a valid sequence
is produced anyway.
---
 gcc/ifcvt.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index c98668d5646..4642176957e 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3282,7 +3282,9 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	 we'll end up trying to emit r4:HI = cond ? (r1:SI) : (r3:HI).
 	 Wrap the two cmove operands into subregs if appropriate to prevent
 	 that.  */
-      if (GET_MODE (new_val) != GET_MODE (temp))
+
+      if (!CONSTANT_P (new_val)
+	  && GET_MODE (new_val) != GET_MODE (temp))
 	{
 	  machine_mode src_mode = GET_MODE (new_val);
 	  machine_mode dst_mode = GET_MODE (temp);
@@ -3293,7 +3295,8 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	    }
 	  new_val = lowpart_subreg (dst_mode, new_val, src_mode);
 	}
-      if (GET_MODE (old_val) != GET_MODE (temp))
+      if (!CONSTANT_P (old_val)
+	  && GET_MODE (old_val) != GET_MODE (temp))
 	{
 	  machine_mode src_mode = GET_MODE (old_val);
 	  machine_mode dst_mode = GET_MODE (temp);
@@ -3435,9 +3438,9 @@ bb_ok_for_noce_convert_multiple_sets (basic_block test_bb)
       if (!REG_P (dest))
 	return false;
 
-      if (!(REG_P (src)
-	   || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
-	       && subreg_lowpart_p (src))))
+      if (!((REG_P (src) || CONSTANT_P (src))
+	    || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
+	      && subreg_lowpart_p (src))))
 	return false;
 
       /* Destination must be appropriate for a conditional write.  */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 3/7] ifcvt: Improve costs handling for noce_convert_multiple.
  2021-12-06 18:43 [PATCH v3 0/7] ifcvt: Convert multiple Robin Dapp
  2021-12-06 18:43 ` [PATCH v3 1/7] ifcvt: Check if cmovs are needed Robin Dapp
  2021-12-06 18:43 ` [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple Robin Dapp
@ 2021-12-06 18:43 ` Robin Dapp
  2021-12-08 23:54   ` Jeff Law
  2021-12-06 18:43 ` [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move Robin Dapp
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-06 18:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

When noce_convert_multiple is called the original costs are not yet
initialized.  Therefore, up to now, costs were only ever unfairly
compared against COSTS_N_INSNS (2).  This would lead to
default_noce_conversion_profitable_p () rejecting all but the most
contrived of sequences.

This patch temporarily initializes the original costs by counting
adding costs for all sets inside the then_bb.
---
 gcc/ifcvt.c | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 4642176957e..7e1ae2564a3 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3408,14 +3408,17 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
    (SET (REG) (REG)) insns suitable for conversion to a series
    of conditional moves.  Also check that we have more than one set
    (other routines can handle a single set better than we would), and
-   fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  */
+   fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
+   through the insns store the sum of their potential costs in COST.  */
 
 static bool
-bb_ok_for_noce_convert_multiple_sets (basic_block test_bb)
+bb_ok_for_noce_convert_multiple_sets (basic_block test_bb, unsigned *cost)
 {
   rtx_insn *insn;
   unsigned count = 0;
   unsigned param = param_max_rtl_if_conversion_insns;
+  bool speed_p = optimize_bb_for_speed_p (test_bb);
+  unsigned potential_cost = 0;
 
   FOR_BB_INSNS (test_bb, insn)
     {
@@ -3451,9 +3454,13 @@ bb_ok_for_noce_convert_multiple_sets (basic_block test_bb)
       if (!can_conditionally_move_p (GET_MODE (dest)))
 	return false;
 
+      potential_cost += insn_cost (insn, speed_p);
+
       count++;
     }
 
+  *cost += potential_cost;
+
   /* If we would only put out one conditional move, the other strategies
      this pass tries are better optimized and will be more appropriate.
      Some targets want to strictly limit the number of conditional moves
@@ -3501,11 +3508,24 @@ noce_process_if_block (struct noce_if_info *if_info)
      to calculate a value for x.
      ??? For future expansion, further expand the "multiple X" rules.  */
 
-  /* First look for multiple SETS.  */
+  /* First look for multiple SETS.  The original costs already include
+     a base cost of COSTS_N_INSNS (2): one instruction for the compare
+     (which we will be needing either way) and one instruction for the
+     branch.  When comparing costs we want to use the branch instruction
+     cost and the sets vs. the cmovs generated here.  Therefore subtract
+     the costs of the compare before checking.
+     ??? Actually, instead of the branch instruction costs we might want
+     to use COSTS_N_INSNS (BRANCH_COST ()) as in other places.  */
+
+  unsigned potential_cost = if_info->original_cost - COSTS_N_INSNS (1);
+  unsigned old_cost = if_info->original_cost;
   if (!else_bb
       && HAVE_conditional_move
-      && bb_ok_for_noce_convert_multiple_sets (then_bb))
+      && bb_ok_for_noce_convert_multiple_sets (then_bb, &potential_cost))
     {
+      /* Temporarily set the original costs to what we estimated so
+	 we can determine if the transformation is worth it.  */
+      if_info->original_cost = potential_cost;
       if (noce_convert_multiple_sets (if_info))
 	{
 	  if (dump_file && if_info->transform_name)
@@ -3513,6 +3533,9 @@ noce_process_if_block (struct noce_if_info *if_info)
 		     if_info->transform_name);
 	  return TRUE;
 	}
+
+      /* Restore the original costs.  */
+      if_info->original_cost = old_cost;
     }
 
   bool speed_p = optimize_bb_for_speed_p (test_bb);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.
  2021-12-06 18:43 [PATCH v3 0/7] ifcvt: Convert multiple Robin Dapp
                   ` (2 preceding siblings ...)
  2021-12-06 18:43 ` [PATCH v3 3/7] ifcvt: Improve costs handling " Robin Dapp
@ 2021-12-06 18:43 ` Robin Dapp
  2021-12-09  0:11   ` Jeff Law
  2021-12-06 18:43 ` [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves Robin Dapp
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-06 18:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

Currently we only ever call emit_conditional_move with the comparison
(as well as its comparands) we got from the jump.  Thus, backends are
going to emit a CC comparison for every conditional move that is being
generated instead of re-using the existing CC.
This, combined with emitting temporaries for each conditional move,
causes sky-high costs for conditional moves.

This patch allows to re-use a CC so the costing situation is improved a
bit.
---
 gcc/expmed.c |   8 +--
 gcc/expr.c   |  10 ++--
 gcc/ifcvt.c  |  45 ++++++++++-------
 gcc/optabs.c | 135 ++++++++++++++++++++++++++++++++++++++-------------
 gcc/optabs.h |   4 +-
 gcc/rtl.h    |  11 ++++-
 6 files changed, 150 insertions(+), 63 deletions(-)

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 59734d4841c..af51bad7cfc 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -4122,8 +4122,8 @@ expand_sdiv_pow2 (scalar_int_mode mode, rtx op0, HOST_WIDE_INT d)
       temp = force_reg (mode, temp);
 
       /* Construct "temp2 = (temp2 < 0) ? temp : temp2".  */
-      temp2 = emit_conditional_move (temp2, LT, temp2, const0_rtx,
-				     mode, temp, temp2, mode, 0);
+      temp2 = emit_conditional_move (temp2, { LT, temp2, const0_rtx, mode },
+				     temp, temp2, mode, 0);
       if (temp2)
 	{
 	  rtx_insn *seq = get_insns ();
@@ -6125,10 +6125,10 @@ emit_store_flag (rtx target, enum rtx_code code, rtx op0, rtx op1,
 	return 0;
 
       if (and_them)
-	tem = emit_conditional_move (target, code, op0, op1, mode,
+	tem = emit_conditional_move (target, { code, op0, op1, mode },
 				     tem, const0_rtx, GET_MODE (tem), 0);
       else
-	tem = emit_conditional_move (target, code, op0, op1, mode,
+	tem = emit_conditional_move (target, { code, op0, op1, mode },
 				     trueval, tem, GET_MODE (tem), 0);
 
       if (tem == 0)
diff --git a/gcc/expr.c b/gcc/expr.c
index e0bcbccd905..c5631a9dd2e 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8840,8 +8840,9 @@ expand_cond_expr_using_cmove (tree treeop0 ATTRIBUTE_UNUSED,
     op2 = gen_lowpart (mode, op2);
 
   /* Try to emit the conditional move.  */
-  insn = emit_conditional_move (temp, comparison_code,
-				op00, op01, comparison_mode,
+  insn = emit_conditional_move (temp,
+				{ comparison_code, op00, op01,
+				  comparison_mode },
 				op1, op2, mode,
 				unsignedp);
 
@@ -9732,8 +9733,9 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	    start_sequence ();
 
 	    /* Try to emit the conditional move.  */
-	    insn = emit_conditional_move (target, comparison_code,
-					  op0, cmpop1, mode,
+	    insn = emit_conditional_move (target,
+					  { comparison_code,
+					    op0, cmpop1, mode },
 					  op0, op1, mode,
 					  unsignedp);
 
diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 7e1ae2564a3..3e78e1bb03d 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -772,7 +772,7 @@ static int noce_try_addcc (struct noce_if_info *);
 static int noce_try_store_flag_constants (struct noce_if_info *);
 static int noce_try_store_flag_mask (struct noce_if_info *);
 static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx,
-			    rtx, rtx, rtx);
+			    rtx, rtx, rtx, rtx = NULL, rtx = NULL);
 static int noce_try_cmove (struct noce_if_info *);
 static int noce_try_cmove_arith (struct noce_if_info *);
 static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
@@ -1711,7 +1711,8 @@ noce_try_store_flag_mask (struct noce_if_info *if_info)
 
 static rtx
 noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
-		 rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue)
+		 rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue, rtx cc_cmp,
+		 rtx rev_cc_cmp)
 {
   rtx target ATTRIBUTE_UNUSED;
   int unsignedp ATTRIBUTE_UNUSED;
@@ -1743,23 +1744,30 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
       end_sequence ();
     }
 
-  /* Don't even try if the comparison operands are weird
-     except that the target supports cbranchcc4.  */
-  if (! general_operand (cmp_a, GET_MODE (cmp_a))
-      || ! general_operand (cmp_b, GET_MODE (cmp_b)))
-    {
-      if (!have_cbranchcc4
-	  || GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
-	  || cmp_b != const0_rtx)
-	return NULL_RTX;
-    }
-
   unsignedp = (code == LTU || code == GEU
 	       || code == LEU || code == GTU);
 
-  target = emit_conditional_move (x, code, cmp_a, cmp_b, VOIDmode,
-				  vtrue, vfalse, GET_MODE (x),
-				  unsignedp);
+  if (cc_cmp != NULL_RTX && rev_cc_cmp != NULL_RTX)
+    target = emit_conditional_move (x, cc_cmp, rev_cc_cmp,
+				    vtrue, vfalse, GET_MODE (x));
+  else
+    {
+      /* Don't even try if the comparison operands are weird
+	 except that the target supports cbranchcc4.  */
+      if (! general_operand (cmp_a, GET_MODE (cmp_a))
+	  || ! general_operand (cmp_b, GET_MODE (cmp_b)))
+	{
+	  if (!have_cbranchcc4
+	      || GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
+	      || cmp_b != const0_rtx)
+	    return NULL_RTX;
+	}
+
+      target = emit_conditional_move (x, { code, cmp_a, cmp_b, VOIDmode },
+				      vtrue, vfalse, GET_MODE (x),
+				      unsignedp);
+    }
+
   if (target)
     return target;
 
@@ -1795,8 +1803,9 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
 
       promoted_target = gen_reg_rtx (GET_MODE (reg_vtrue));
 
-      target = emit_conditional_move (promoted_target, code, cmp_a, cmp_b,
-				      VOIDmode, reg_vtrue, reg_vfalse,
+      target = emit_conditional_move (promoted_target,
+				      { code, cmp_a, cmp_b, VOIDmode },
+				      reg_vtrue, reg_vfalse,
 				      GET_MODE (reg_vtrue), unsignedp);
       /* Nope, couldn't do it in that mode either.  */
       if (!target)
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 019bbb62882..9513b666e5a 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -52,6 +52,8 @@ static void prepare_float_lib_cmp (rtx, rtx, enum rtx_code, rtx *,
 static rtx expand_unop_direct (machine_mode, optab, rtx, rtx, int);
 static void emit_libcall_block_1 (rtx_insn *, rtx, rtx, rtx, bool);
 
+static rtx emit_conditional_move_1 (rtx, rtx, rtx, rtx, machine_mode);
+
 /* Debug facility for use in GDB.  */
 void debug_optab_libfuncs (void);
 \f
@@ -624,12 +626,13 @@ expand_doubleword_shift_condmove (scalar_int_mode op1_mode, optab binoptab,
 
   /* Select between them.  Do the INTO half first because INTO_SUPERWORD
      might be the current value of OUTOF_TARGET.  */
-  if (!emit_conditional_move (into_target, cmp_code, cmp1, cmp2, op1_mode,
+  if (!emit_conditional_move (into_target, { cmp_code, cmp1, cmp2, op1_mode },
 			      into_target, into_superword, word_mode, false))
     return false;
 
   if (outof_target != 0)
-    if (!emit_conditional_move (outof_target, cmp_code, cmp1, cmp2, op1_mode,
+    if (!emit_conditional_move (outof_target,
+				{ cmp_code, cmp1, cmp2, op1_mode },
 				outof_target, outof_superword,
 				word_mode, false))
       return false;
@@ -4843,8 +4846,8 @@ emit_indirect_jump (rtx loc)
    is not supported.  */
 
 rtx
-emit_conditional_move (rtx target, enum rtx_code code, rtx op0, rtx op1,
-		       machine_mode cmode, rtx op2, rtx op3,
+emit_conditional_move (rtx target, struct rtx_comparison comp,
+		       rtx op2, rtx op3,
 		       machine_mode mode, int unsignedp)
 {
   rtx comparison;
@@ -4866,31 +4869,33 @@ emit_conditional_move (rtx target, enum rtx_code code, rtx op0, rtx op1,
   /* If one operand is constant, make it the second one.  Only do this
      if the other operand is not constant as well.  */
 
-  if (swap_commutative_operands_p (op0, op1))
+  if (swap_commutative_operands_p (comp.op0, comp.op1))
     {
-      std::swap (op0, op1);
-      code = swap_condition (code);
+      std::swap (comp.op0, comp.op1);
+      comp.code = swap_condition (comp.code);
     }
 
   /* get_condition will prefer to generate LT and GT even if the old
      comparison was against zero, so undo that canonicalization here since
      comparisons against zero are cheaper.  */
-  if (code == LT && op1 == const1_rtx)
-    code = LE, op1 = const0_rtx;
-  else if (code == GT && op1 == constm1_rtx)
-    code = GE, op1 = const0_rtx;
 
-  if (cmode == VOIDmode)
-    cmode = GET_MODE (op0);
+  if (comp.code == LT && comp.op1 == const1_rtx)
+    comp.code = LE, comp.op1 = const0_rtx;
+  else if (comp.code == GT && comp.op1 == constm1_rtx)
+    comp.code = GE, comp.op1 = const0_rtx;
+
+  if (comp.mode == VOIDmode)
+    comp.mode = GET_MODE (comp.op0);
 
-  enum rtx_code orig_code = code;
+  enum rtx_code orig_code = comp.code;
   bool swapped = false;
   if (swap_commutative_operands_p (op2, op3)
-      && ((reversed = reversed_comparison_code_parts (code, op0, op1, NULL))
+      && ((reversed =
+	   reversed_comparison_code_parts (comp.code, comp.op0, comp.op1, NULL))
           != UNKNOWN))
     {
       std::swap (op2, op3);
-      code = reversed;
+      comp.code = reversed;
       swapped = true;
     }
 
@@ -4907,8 +4912,10 @@ emit_conditional_move (rtx target, enum rtx_code code, rtx op0, rtx op1,
 
   for (int pass = 0; ; pass++)
     {
-      code = unsignedp ? unsigned_condition (code) : code;
-      comparison = simplify_gen_relational (code, VOIDmode, cmode, op0, op1);
+      comp.code = unsignedp ? unsigned_condition (comp.code) : comp.code;
+      comparison = 
+	simplify_gen_relational (comp.code, VOIDmode,
+				 comp.mode, comp.op0, comp.op1);
 
       /* We can get const0_rtx or const_true_rtx in some circumstances.  Just
 	 punt and let the caller figure out how best to deal with this
@@ -4919,24 +4926,16 @@ emit_conditional_move (rtx target, enum rtx_code code, rtx op0, rtx op1,
 	  save_pending_stack_adjust (&save);
 	  last = get_last_insn ();
 	  do_pending_stack_adjust ();
-	  machine_mode cmpmode = cmode;
+	  machine_mode cmpmode = comp.mode;
 	  prepare_cmp_insn (XEXP (comparison, 0), XEXP (comparison, 1),
 			    GET_CODE (comparison), NULL_RTX, unsignedp,
 			    OPTAB_WIDEN, &comparison, &cmpmode);
 	  if (comparison)
 	    {
-	      class expand_operand ops[4];
-
-	      create_output_operand (&ops[0], target, mode);
-	      create_fixed_operand (&ops[1], comparison);
-	      create_input_operand (&ops[2], op2, mode);
-	      create_input_operand (&ops[3], op3, mode);
-	      if (maybe_expand_insn (icode, 4, ops))
-		{
-		  if (ops[0].value != target)
-		    convert_move (target, ops[0].value, false);
-		  return target;
-		}
+	       rtx res = emit_conditional_move_1 (target, comparison,
+						  op2, op3, mode);
+	       if (res != NULL_RTX)
+		 return res;
 	    }
 	  delete_insns_since (last);
 	  restore_pending_stack_adjust (&save);
@@ -4948,17 +4947,85 @@ emit_conditional_move (rtx target, enum rtx_code code, rtx op0, rtx op1,
       /* If the preferred op2/op3 order is not usable, retry with other
 	 operand order, perhaps it will expand successfully.  */
       if (swapped)
-	code = orig_code;
-      else if ((reversed = reversed_comparison_code_parts (orig_code, op0, op1,
+	comp.code = orig_code;
+      else if ((reversed =
+		reversed_comparison_code_parts (orig_code, comp.op0, comp.op1,
 							   NULL))
 	       != UNKNOWN)
-	code = reversed;
+	comp.code = reversed;
       else
 	return NULL_RTX;
       std::swap (op2, op3);
     }
 }
 
+/* Helper function that, in addition to COMPARISON, also tries
+   the reversed REV_COMPARISON with swapped OP2 and OP3.  As opposed
+   to when we pass the specific constituents of a comparison, no
+   additional insns are emitted for it.  It might still be necessary
+   to emit more than one insn for the final conditional move, though.  */
+
+rtx
+emit_conditional_move (rtx target, rtx comparison, rtx rev_comparison,
+		       rtx op2, rtx op3, machine_mode mode)
+{
+  rtx res = emit_conditional_move_1 (target, comparison, op2, op3, mode);
+
+  if (res != NULL_RTX)
+    return res;
+
+  return emit_conditional_move_1 (target, rev_comparison, op3, op2, mode);
+}
+
+/* Helper for emitting a conditional move.  */
+
+static rtx
+emit_conditional_move_1 (rtx target, rtx comparison,
+			 rtx op2, rtx op3, machine_mode mode)
+{
+  enum insn_code icode;
+
+  if (comparison == NULL_RTX || !COMPARISON_P (comparison))
+    return NULL_RTX;
+
+  /* If the two source operands are identical, that's just a move.  */
+  if (rtx_equal_p (op2, op3))
+    {
+      if (!target)
+	target = gen_reg_rtx (mode);
+
+      emit_move_insn (target, op3);
+      return target;
+    }
+
+  if (mode == VOIDmode)
+    mode = GET_MODE (op2);
+
+  icode = direct_optab_handler (movcc_optab, mode);
+
+  if (icode == CODE_FOR_nothing)
+    return NULL_RTX;
+
+  if (!target)
+    target = gen_reg_rtx (mode);
+
+  class expand_operand ops[4];
+
+  create_output_operand (&ops[0], target, mode);
+  create_fixed_operand (&ops[1], comparison);
+  create_input_operand (&ops[2], op2, mode);
+  create_input_operand (&ops[3], op3, mode);
+
+  if (maybe_expand_insn (icode, 4, ops))
+    {
+      if (ops[0].value != target)
+	convert_move (target, ops[0].value, false);
+      return target;
+    }
+
+  return NULL_RTX;
+}
+
 
 /* Emit a conditional negate or bitwise complement using the
    negcc or notcc optabs if available.  Return NULL_RTX if such operations
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 3bbceff92d9..700dfa24139 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -279,8 +279,8 @@ extern void emit_indirect_jump (rtx);
 #endif
 
 /* Emit a conditional move operation.  */
-rtx emit_conditional_move (rtx, enum rtx_code, rtx, rtx, machine_mode,
-			   rtx, rtx, machine_mode, int);
+rtx emit_conditional_move (rtx, rtx_comparison, rtx, rtx, machine_mode, int);
+rtx emit_conditional_move (rtx, rtx, rtx, rtx, rtx, machine_mode);
 
 /* Emit a conditional negate or bitwise complement operation.  */
 rtx emit_conditional_neg_or_complement (rtx, rtx_code, machine_mode, rtx,
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 5473cc9f2dd..85d0bdf48b1 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -4589,7 +4589,16 @@ word_register_operation_p (const_rtx x)
       return true;
     }
 }
-    
+
+/* Holds an rtx comparison to simplify passing many parameters pertaining to a
+   single comparison.  */
+
+struct rtx_comparison {
+  rtx_code code;
+  rtx op0, op1;
+  machine_mode mode;
+};
+
 /* gtype-desc.c.  */
 extern void gt_ggc_mx (rtx &);
 extern void gt_pch_nx (rtx &);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves.
  2021-12-06 18:43 [PATCH v3 0/7] ifcvt: Convert multiple Robin Dapp
                   ` (3 preceding siblings ...)
  2021-12-06 18:43 ` [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move Robin Dapp
@ 2021-12-06 18:43 ` Robin Dapp
  2021-12-09  1:18   ` Jeff Law
  2021-12-06 18:43 ` [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple Robin Dapp
  2021-12-06 18:43 ` [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary Robin Dapp
  6 siblings, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-06 18:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

Following up on the previous patch, this patch makes
noce_convert_multiple emit two cmov sequences:  The same one as before
and a second one that tries to re-use the existing CC.  Then their costs
are compared and the cheaper one is selected.
---
 gcc/ifcvt.c | 112 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 88 insertions(+), 24 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 3e78e1bb03d..d1e7db1ee27 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -83,7 +83,7 @@ static rtx_insn *last_active_insn (basic_block, int);
 static rtx_insn *find_active_insn_before (basic_block, rtx_insn *);
 static rtx_insn *find_active_insn_after (basic_block, rtx_insn *);
 static basic_block block_fallthru (basic_block);
-static rtx cond_exec_get_condition (rtx_insn *);
+static rtx cond_exec_get_condition (rtx_insn *, bool);
 static rtx noce_get_condition (rtx_insn *, rtx_insn **, bool);
 static int noce_operand_ok (const_rtx);
 static void merge_if_block (ce_if_block *);
@@ -427,7 +427,7 @@ cond_exec_process_insns (ce_if_block *ce_info ATTRIBUTE_UNUSED,
 /* Return the condition for a jump.  Do not do any special processing.  */
 
 static rtx
-cond_exec_get_condition (rtx_insn *jump)
+cond_exec_get_condition (rtx_insn *jump, bool get_reversed = false)
 {
   rtx test_if, cond;
 
@@ -439,8 +439,9 @@ cond_exec_get_condition (rtx_insn *jump)
 
   /* If this branches to JUMP_LABEL when the condition is false,
      reverse the condition.  */
-  if (GET_CODE (XEXP (test_if, 2)) == LABEL_REF
-      && label_ref_label (XEXP (test_if, 2)) == JUMP_LABEL (jump))
+  if (get_reversed || (GET_CODE (XEXP (test_if, 2)) == LABEL_REF
+		       && label_ref_label (XEXP (test_if, 2))
+		       == JUMP_LABEL (jump)))
     {
       enum rtx_code rev = reversed_comparison_code (cond, jump);
       if (rev == UNKNOWN)
@@ -3146,6 +3147,46 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond,
   return false;
 }
 
+/* Helper function to emit a cmov sequence.  */
+
+static rtx_insn*
+try_emit_cmove_seq (struct noce_if_info *if_info, rtx temp,
+		    rtx cond, rtx new_val, rtx old_val, bool need_cmov,
+		    unsigned *cost, rtx *temp_dest,
+		    rtx cc_cmp = NULL, rtx rev_cc_cmp = NULL)
+{
+  rtx_insn *seq = NULL;
+  *cost = 0;
+
+  rtx x = XEXP (cond, 0);
+  rtx y = XEXP (cond, 1);
+  rtx_code cond_code = GET_CODE (cond);
+
+  start_sequence ();
+
+  if (need_cmov)
+    *temp_dest = noce_emit_cmove (if_info, temp, cond_code,
+				  x, y, new_val, old_val, cc_cmp, rev_cc_cmp);
+  else
+    {
+      *temp_dest = temp;
+      if (if_info->then_else_reversed)
+	noce_emit_move_insn (temp, old_val);
+      else
+	noce_emit_move_insn (temp, new_val);
+    }
+
+  if (*temp_dest != NULL_RTX)
+    {
+      seq = get_insns ();
+      *cost = seq_cost (seq, if_info->speed_p);
+    }
+
+  end_sequence ();
+
+  return seq;
+}
+
 /* We have something like:
 
      if (x > y)
@@ -3203,7 +3244,9 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   rtx cond = noce_get_condition (jump, &cond_earliest, false);
   rtx x = XEXP (cond, 0);
   rtx y = XEXP (cond, 1);
-  rtx_code cond_code = GET_CODE (cond);
+
+  rtx cc_cmp = cond_exec_get_condition (jump);
+  rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true);
 
   /* The true targets for a conditional move.  */
   auto_vec<rtx> targets;
@@ -3317,31 +3360,51 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	  old_val = lowpart_subreg (dst_mode, old_val, src_mode);
 	}
 
-      rtx temp_dest = NULL_RTX;
+      /* Try emitting a conditional move passing the backend the
+	 canonicalized comparison.  The backend is then able to
+	 recognize expressions like
 
-      if (need_cmov)
+	   if (x > y)
+	     y = x;
+
+	 as min/max and emit an insn, accordingly.  */
+      unsigned cost1 = 0, cost2 = 0;
+      rtx_insn *seq, *seq1, *seq2;
+      rtx temp_dest = NULL_RTX, temp_dest1 = NULL_RTX, temp_dest2 = NULL_RTX;
+
+      seq1 = try_emit_cmove_seq (if_info, temp, cond,
+				 new_val, old_val, need_cmov,
+				 &cost1, &temp_dest1);
+
+      /* Here, we try to pass the backend a non-canonicalized cc comparison
+	 as well.  This allows the backend to emit a cmov directly without
+	 creating an additional compare for each.  If successful, costing
+	 is easier and this sequence is usually preferred.  */
+      seq2 = try_emit_cmove_seq (if_info, target, cond,
+				 new_val, old_val, need_cmov,
+				 &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
+
+      /* Check which version is less expensive.  */
+      if (seq1 != NULL_RTX && (cost1 <= cost2 || seq2 == NULL_RTX))
 	{
-	  /* Actually emit the conditional move.  */
-	  temp_dest = noce_emit_cmove (if_info, temp, cond_code,
-				       x, y, new_val, old_val);
-
-	  /* If we failed to expand the conditional move, drop out and don't
-	     try to continue.  */
-	  if (temp_dest == NULL_RTX)
-	    {
-	      end_sequence ();
-	      return FALSE;
-	    }
+	  seq = seq1;
+	  temp_dest = temp_dest1;
+	}
+      else if (seq2 != NULL_RTX)
+	{
+	  seq = seq2;
+	  temp_dest = temp_dest2;
 	}
       else
 	{
-	  if (if_info->then_else_reversed)
-	    noce_emit_move_insn (temp, old_val);
-	  else
-	    noce_emit_move_insn (temp, new_val);
-	  temp_dest = temp;
+	  /* Nothing worked, bail out.  */
+	  end_sequence ();
+	  return FALSE;
 	}
 
+      /* End the sub sequence and emit to the main sequence.  */
+      emit_insn (seq);
+
       /* Bookkeeping.  */
       count++;
       targets.safe_push (target);
@@ -3355,7 +3418,8 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 
   /* Now fixup the assignments.  */
   for (int i = 0; i < count; i++)
-    noce_emit_move_insn (targets[i], temporaries[i]);
+    if (targets[i] != temporaries[i])
+      noce_emit_move_insn (targets[i], temporaries[i]);
 
   /* Actually emit the sequence if it isn't too expensive.  */
   rtx_insn *seq = get_insns ();
-- 
2.31.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple.
  2021-12-06 18:43 [PATCH v3 0/7] ifcvt: Convert multiple Robin Dapp
                   ` (4 preceding siblings ...)
  2021-12-06 18:43 ` [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves Robin Dapp
@ 2021-12-06 18:43 ` Robin Dapp
  2021-12-08 23:48   ` Jeff Law
  2021-12-06 18:43 ` [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary Robin Dapp
  6 siblings, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-06 18:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

Add new s390-specific tests that check if we convert two SETs into two
loads on condition.  Remove the s390-specific target-check in
gcc.dg/ifcvt-4.c.
---
 gcc/testsuite/gcc.dg/ifcvt-4.c                |  2 +-
 .../gcc.target/s390/ifcvt-two-insns-bool.c    | 39 +++++++++++++++++++
 .../gcc.target/s390/ifcvt-two-insns-int.c     | 39 +++++++++++++++++++
 .../gcc.target/s390/ifcvt-two-insns-long.c    | 39 +++++++++++++++++++
 4 files changed, 118 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
 create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
 create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c

diff --git a/gcc/testsuite/gcc.dg/ifcvt-4.c b/gcc/testsuite/gcc.dg/ifcvt-4.c
index ec142cfd943..871ab950756 100644
--- a/gcc/testsuite/gcc.dg/ifcvt-4.c
+++ b/gcc/testsuite/gcc.dg/ifcvt-4.c
@@ -2,7 +2,7 @@
 /* { dg-additional-options "-misel" { target { powerpc*-*-* } } } */
 /* { dg-additional-options "-march=z196" { target { s390x-*-* } } } */
 /* { dg-additional-options "-mtune-ctrl=^one_if_conv_insn" { target { i?86-*-* x86_64-*-* } } } */
-/* { dg-skip-if "Multiple set if-conversion not guaranteed on all subtargets" { "arm*-*-* avr-*-* hppa*64*-*-* s390-*-* visium-*-*" riscv*-*-* msp430-*-* } }  */
+/* { dg-skip-if "Multiple set if-conversion not guaranteed on all subtargets" { "arm*-*-* avr-*-* hppa*64*-*-* visium-*-*" riscv*-*-* msp430-*-* } }  */
 /* { dg-skip-if "" { "s390x-*-*" } { "-m31" } }  */
 
 typedef int word __attribute__((mode(word)));
diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
new file mode 100644
index 00000000000..4415cb49f11
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
@@ -0,0 +1,39 @@
+/* Check if conversion for two instructions.  */
+
+/* { dg-do run } */
+/* { dg-options "-O2 -march=z13 --save-temps" } */
+
+/* { dg-final { scan-assembler "lochinhe\t%r.?,1" } } */
+/* { dg-final { scan-assembler "locrnhe\t.*" } } */
+#include <stdbool.h>
+#include <limits.h>
+#include <stdio.h>
+#include <assert.h>
+
+__attribute__ ((noinline))
+int foo (int *a, unsigned int n)
+{
+  int min = 999999;
+  bool bla = false;
+  for (int i = 0; i < n; i++)
+    {
+      if (a[i] < min)
+	{
+	  min = a[i];
+	  bla = true;
+	}
+    }
+
+  if (bla)
+    min += 1;
+  return min;
+}
+
+int main()
+{
+  int a[] = {2, 1, -13, INT_MAX, INT_MIN, 0};
+
+  int res = foo (a, sizeof (a));
+
+  assert (res == (INT_MIN + 1));
+}
diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
new file mode 100644
index 00000000000..db8df84cced
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
@@ -0,0 +1,39 @@
+/* Check if conversion for two instructions.  */
+
+/* { dg-do run } */
+/* { dg-options "-O2 -march=z13 --save-temps" } */
+
+/* { dg-final { scan-assembler "lochinhe\t%r.?,1" } } */
+/* { dg-final { scan-assembler "locrnhe\t.*" } } */
+#include <stdbool.h>
+#include <limits.h>
+#include <stdio.h>
+#include <assert.h>
+
+__attribute__ ((noinline))
+int foo (int *a, unsigned int n)
+{
+  int min = 999999;
+  int bla = 0;
+  for (int i = 0; i < n; i++)
+    {
+      if (a[i] < min)
+	{
+	  min = a[i];
+	  bla = 1;
+	}
+    }
+
+  if (bla)
+    min += 1;
+  return min;
+}
+
+int main()
+{
+  int a[] = {2, 1, -13, INT_MAX, INT_MIN, 0};
+
+  int res = foo (a, sizeof (a));
+
+  assert (res == (INT_MIN + 1));
+}
diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
new file mode 100644
index 00000000000..bee63328729
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
@@ -0,0 +1,39 @@
+/* Check if conversion for two instructions.  */
+
+/* { dg-do run } */
+/* { dg-options "-O2 -march=z13 --save-temps" } */
+
+/* { dg-final { scan-assembler "locghinhe\t%r.?,1" } } */
+/* { dg-final { scan-assembler "locgrnhe\t.*" } } */
+#include <stdbool.h>
+#include <limits.h>
+#include <stdio.h>
+#include <assert.h>
+
+__attribute__ ((noinline))
+long foo (long *a, unsigned long n)
+{
+  long min = 999999;
+  long bla = 0;
+  for (int i = 0; i < n; i++)
+    {
+      if (a[i] < min)
+	{
+	  min = a[i];
+	  bla = 1;
+	}
+    }
+
+  if (bla)
+    min += 1;
+  return min;
+}
+
+int main()
+{
+  long a[] = {2, 1, -13, LONG_MAX, LONG_MIN, 0};
+
+  long res = foo (a, sizeof (a));
+
+  assert (res == (LONG_MIN + 1));
+}
-- 
2.31.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary.
  2021-12-06 18:43 [PATCH v3 0/7] ifcvt: Convert multiple Robin Dapp
                   ` (5 preceding siblings ...)
  2021-12-06 18:43 ` [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple Robin Dapp
@ 2021-12-06 18:43 ` Robin Dapp
  2021-12-09  1:24   ` Jeff Law
  6 siblings, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-06 18:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

If one of the to-be-converted SETs requires the original comparison
(i.e. in order to generate a min/max insn) but no other insn after it
does, we can omit creating temporaries, thus facilitating costing.
---
 gcc/ifcvt.c | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index d1e7db1ee27..3be5bb131df 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3262,6 +3262,11 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 
   need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src);
 
+  int last_needs_comparison = -1;
+  bool second_try = false;
+
+restart:
+
   FOR_BB_INSNS (then_bb, insn)
     {
       /* Skip over non-insns.  */
@@ -3305,8 +3310,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	 Therefore we introduce a temporary every time we are about to
 	 overwrite a variable used in the check.  Costing of a sequence with
 	 these is going to be inaccurate so only use temporaries when
-	 needed.  */
-      if (reg_overlap_mentioned_p (target, cond))
+	 needed.
+
+	 If performing a second try, we know how many insns require a
+	 temporary.  For the last of these, we can omit creating one.  */
+      if (reg_overlap_mentioned_p (target, cond)
+	  && (!second_try || count < last_needs_comparison))
 	temp = gen_reg_rtx (GET_MODE (target));
       else
 	temp = target;
@@ -3389,6 +3398,8 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	{
 	  seq = seq1;
 	  temp_dest = temp_dest1;
+	  if (!second_try)
+	    last_needs_comparison = count;
 	}
       else if (seq2 != NULL_RTX)
 	{
@@ -3412,6 +3423,24 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
       unmodified_insns.safe_push (insn);
     }
 
+    /* If there are insns that overwrite part of the initial
+       comparison, we can still omit creating temporaries for
+       the last of them.
+       As the second try will always create a less expensive,
+       valid sequence, we do not need to compare and can discard
+       the first one.  */
+    if (!second_try && last_needs_comparison >= 0)
+      {
+	end_sequence ();
+	start_sequence ();
+	count = 0;
+	targets.truncate (0);
+	temporaries.truncate (0);
+	unmodified_insns.truncate (0);
+	second_try = true;
+	goto restart;
+      }
+
   /* We must have seen some sort of insn to insert, otherwise we were
      given an empty BB to convert, and we can't handle that.  */
   gcc_assert (!unmodified_insns.is_empty ());
-- 
2.31.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple.
  2021-12-06 18:43 ` [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple Robin Dapp
@ 2021-12-08 23:48   ` Jeff Law
  2022-01-10 11:18     ` Robin Dapp
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2021-12-08 23:48 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/6/2021 11:43 AM, Robin Dapp via Gcc-patches wrote:
> Add new s390-specific tests that check if we convert two SETs into two
> loads on condition.  Remove the s390-specific target-check in
> gcc.dg/ifcvt-4.c.
> ---
>   gcc/testsuite/gcc.dg/ifcvt-4.c                |  2 +-
>   .../gcc.target/s390/ifcvt-two-insns-bool.c    | 39 +++++++++++++++++++
>   .../gcc.target/s390/ifcvt-two-insns-int.c     | 39 +++++++++++++++++++
>   .../gcc.target/s390/ifcvt-two-insns-long.c    | 39 +++++++++++++++++++
>   4 files changed, 118 insertions(+), 1 deletion(-)
>   create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
>   create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
>   create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
This is fine with a ChangeLog entry once the prereqs are approved.

jeff


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple.
  2021-12-06 18:43 ` [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple Robin Dapp
@ 2021-12-08 23:51   ` Jeff Law
  2022-01-10 11:17     ` Robin Dapp
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2021-12-08 23:51 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/6/2021 11:43 AM, Robin Dapp via Gcc-patches wrote:
> This lifts the restriction of not allowing constants for
> noce_convert_multiple.  The code later checks if a valid sequence
> is produced anyway.
> ---
>   gcc/ifcvt.c | 13 ++++++++-----
>   1 file changed, 8 insertions(+), 5 deletions(-)
Fine with a ChangeLog once the prereqs are approved.

jeff


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 3/7] ifcvt: Improve costs handling for noce_convert_multiple.
  2021-12-06 18:43 ` [PATCH v3 3/7] ifcvt: Improve costs handling " Robin Dapp
@ 2021-12-08 23:54   ` Jeff Law
  2022-01-10 11:17     ` Robin Dapp
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2021-12-08 23:54 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/6/2021 11:43 AM, Robin Dapp via Gcc-patches wrote:
> When noce_convert_multiple is called the original costs are not yet
> initialized.  Therefore, up to now, costs were only ever unfairly
> compared against COSTS_N_INSNS (2).  This would lead to
> default_noce_conversion_profitable_p () rejecting all but the most
> contrived of sequences.
>
> This patch temporarily initializes the original costs by counting
> adding costs for all sets inside the then_bb.
> ---
>   gcc/ifcvt.c | 31 +++++++++++++++++++++++++++----
>   1 file changed, 27 insertions(+), 4 deletions(-)
OK with a ChangeLog once the prereqs are approved.

jeff


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.
  2021-12-06 18:43 ` [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move Robin Dapp
@ 2021-12-09  0:11   ` Jeff Law
  2021-12-09 17:20     ` Robin Dapp
  2022-01-10 11:18     ` Robin Dapp
  0 siblings, 2 replies; 26+ messages in thread
From: Jeff Law @ 2021-12-09  0:11 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/6/2021 11:43 AM, Robin Dapp via Gcc-patches wrote:
> Currently we only ever call emit_conditional_move with the comparison
> (as well as its comparands) we got from the jump.  Thus, backends are
> going to emit a CC comparison for every conditional move that is being
> generated instead of re-using the existing CC.
> This, combined with emitting temporaries for each conditional move,
> causes sky-high costs for conditional moves.
>
> This patch allows to re-use a CC so the costing situation is improved a
> bit.
> ---
>   gcc/expmed.c |   8 +--
>   gcc/expr.c   |  10 ++--
>   gcc/ifcvt.c  |  45 ++++++++++-------
>   gcc/optabs.c | 135 ++++++++++++++++++++++++++++++++++++++-------------
>   gcc/optabs.h |   4 +-
>   gcc/rtl.h    |  11 ++++-
>   6 files changed, 150 insertions(+), 63 deletions(-)
>
>

> @@ -4948,17 +4947,85 @@ emit_conditional_move (rtx target, enum rtx_code code, rtx op0, rtx op1,
>
> +/* Helper for emitting a conditional move.  */
> +
> +static rtx
> +emit_conditional_move_1 (rtx target, rtx comparison,
> +			 rtx op2, rtx op3, machine_mode mode)
> +{
> +  enum insn_code icode;
> +
> +  if (comparison == NULL_RTX || !COMPARISON_P (comparison))
> +    return NULL_RTX;
> +
> +  /* If the two source operands are identical, that's just a move.  */
> +  if (rtx_equal_p (op2, op3))
> +    {
> +      if (!target)
> +	target = gen_reg_rtx (mode);
> +
> +      emit_move_insn (target, op3);
> +      return target;
> +    }
What if the condition has a side effect?  Doesn't this drop the side 
effect by converting the conditional move into a simple move?

That's my only technical concern.  Obviously the patch needs a ChangeLog.

Jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves.
  2021-12-06 18:43 ` [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves Robin Dapp
@ 2021-12-09  1:18   ` Jeff Law
  2022-01-10 11:18     ` Robin Dapp
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2021-12-09  1:18 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/6/2021 11:43 AM, Robin Dapp via Gcc-patches wrote:
> Following up on the previous patch, this patch makes
> noce_convert_multiple emit two cmov sequences:  The same one as before
> and a second one that tries to re-use the existing CC.  Then their costs
> are compared and the cheaper one is selected.
> ---
>   gcc/ifcvt.c | 112 +++++++++++++++++++++++++++++++++++++++++-----------
>   1 file changed, 88 insertions(+), 24 deletions(-)
>
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 3e78e1bb03d..d1e7db1ee27 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -83,7 +83,7 @@ static rtx_insn *last_active_insn (basic_block, int);
>   static rtx_insn *find_active_insn_before (basic_block, rtx_insn *);
>   static rtx_insn *find_active_insn_after (basic_block, rtx_insn *);
>   static basic_block block_fallthru (basic_block);
> -static rtx cond_exec_get_condition (rtx_insn *);
> +static rtx cond_exec_get_condition (rtx_insn *, bool);
>   static rtx noce_get_condition (rtx_insn *, rtx_insn **, bool);
>   static int noce_operand_ok (const_rtx);
>   static void merge_if_block (ce_if_block *);
> @@ -427,7 +427,7 @@ cond_exec_process_insns (ce_if_block *ce_info ATTRIBUTE_UNUSED,
>   /* Return the condition for a jump.  Do not do any special processing.  */
>   
>   static rtx
> -cond_exec_get_condition (rtx_insn *jump)
> +cond_exec_get_condition (rtx_insn *jump, bool get_reversed = false)
>   {
>     rtx test_if, cond;
>   
> @@ -439,8 +439,9 @@ cond_exec_get_condition (rtx_insn *jump)
>   
>     /* If this branches to JUMP_LABEL when the condition is false,
>        reverse the condition.  */
> -  if (GET_CODE (XEXP (test_if, 2)) == LABEL_REF
> -      && label_ref_label (XEXP (test_if, 2)) == JUMP_LABEL (jump))
> +  if (get_reversed || (GET_CODE (XEXP (test_if, 2)) == LABEL_REF
> +		       && label_ref_label (XEXP (test_if, 2))
> +		       == JUMP_LABEL (jump)))
Just a formatting nit.   Bring the condition after get_reversed down to 
a new line and reindent appropriately.  ie

   if (get_reversed
        || (blah blah blah
              && more blah == frobit))




>       {
>         enum rtx_code rev = reversed_comparison_code (cond, jump);
>         if (rev == UNKNOWN)
> @@ -3146,6 +3147,46 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond,
>     return false;
>   }
>   
> +/* Helper function to emit a cmov sequence.  */
> +
> +static rtx_insn*
> +try_emit_cmove_seq (struct noce_if_info *if_info, rtx temp,
> +		    rtx cond, rtx new_val, rtx old_val, bool need_cmov,
> +		    unsigned *cost, rtx *temp_dest,
> +		    rtx cc_cmp = NULL, rtx rev_cc_cmp = NULL)
Could you expand the function comment a bit here.  Saying something is a 
helper doesn't really help the reader understand what the purpose of the 
function really is.

With the formatting nit fixed, a better function comment and a suitable 
ChangeLog, this is fine.

jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary.
  2021-12-06 18:43 ` [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary Robin Dapp
@ 2021-12-09  1:24   ` Jeff Law
  2021-12-10 15:06     ` Robin Dapp
  2022-01-10 11:18     ` Robin Dapp
  0 siblings, 2 replies; 26+ messages in thread
From: Jeff Law @ 2021-12-09  1:24 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/6/2021 11:43 AM, Robin Dapp via Gcc-patches wrote:
> If one of the to-be-converted SETs requires the original comparison
> (i.e. in order to generate a min/max insn) but no other insn after it
> does, we can omit creating temporaries, thus facilitating costing.
> ---
>   gcc/ifcvt.c | 33 +++++++++++++++++++++++++++++++--
>   1 file changed, 31 insertions(+), 2 deletions(-)
I'd generally prefer to refactor the bits between the restart label and 
the goto restart into a function and call it twice, passing in the 
additional bits to allow for better costing.  Can you look into that?  
If it's going to be major surgery, then the goto approach will be OK.

Conceptually I don't have any concerns with the patch.  It'll obviously 
need a ChangeLog entry.

Jeff


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 1/7] ifcvt: Check if cmovs are needed.
  2021-12-06 18:43 ` [PATCH v3 1/7] ifcvt: Check if cmovs are needed Robin Dapp
@ 2021-12-09  1:26   ` Jeff Law
  2022-01-10 11:17     ` Robin Dapp
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2021-12-09  1:26 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/6/2021 11:43 AM, Robin Dapp via Gcc-patches wrote:
> When if-converting multiple SETs and we encounter a swap-style idiom
>
>    if (a > b)
>      {
>        tmp = c;   // [1]
>        c = d;
>        d = tmp;
>      }
>
> ifcvt should not generate a conditional move for the instruction at
> [1].
>
> In order to achieve that, this patch goes through all relevant SETs
> and marks the relevant instructions.  This helps to evaluate costs.
>
> On top, only generate temporaries if the current cmov is going to
> overwrite one of the comparands of the initial compare.
> ---
>   gcc/ifcvt.c | 174 ++++++++++++++++++++++++++++++++++++++++++++--------
>   1 file changed, 150 insertions(+), 24 deletions(-)
OK.  Needs a ChangeLog entry though.

Jeff


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.
  2021-12-09  0:11   ` Jeff Law
@ 2021-12-09 17:20     ` Robin Dapp
  2021-12-09 17:34       ` Jeff Law
  2022-01-10 11:18     ` Robin Dapp
  1 sibling, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-09 17:20 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

Hi Jeff,

thanks for looking into this.

> What if the condition has a side effect?  Doesn't this drop the side 
> effect by converting the conditional move into a simple move?
Hmm, good point, if the condition does more than a CC compare, it might
get tricky as we are not canonicalizing here (on purpose). Is there an
easy way out like checking something like side_effects_p ()?

Maybe we should drop this altogether and let the backend deal with it?
It would probably not know what to do and FAIL.

Regards
 Robin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.
  2021-12-09 17:20     ` Robin Dapp
@ 2021-12-09 17:34       ` Jeff Law
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff Law @ 2021-12-09 17:34 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/9/2021 10:20 AM, Robin Dapp wrote:
> Hi Jeff,
>
> thanks for looking into this.
NP.  I'd been watching this set evolve and I think it'll help our target 
as well, so it seemed natural to handle the review :-)

>
>> What if the condition has a side effect?  Doesn't this drop the side
>> effect by converting the conditional move into a simple move?
> Hmm, good point, if the condition does more than a CC compare, it might
> get tricky as we are not canonicalizing here (on purpose). Is there an
> easy way out like checking something like side_effects_p ()?
I think if side_effects_p is true, we can just emit the conditional move 
as-is without trying to collapse it to a simple move.  It should be 
exceedingly rare to have a side effect in the destination.    Checking 
side_effects_p will also reject if the destination is volatile MEM, but 
that should be OK and also exceedingly rare.
>
> Maybe we should drop this altogether and let the backend deal with it?
> It would probably not know what to do and FAIL.
I like the idea of collapsing to a simple move if the true/false arms 
are the same.    Did you see that happen in practice?  If so, I'd like 
to keep it, but just guard with the !side_effects_p check.

jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary.
  2021-12-09  1:24   ` Jeff Law
@ 2021-12-10 15:06     ` Robin Dapp
  2021-12-15 20:24       ` Jeff Law
  2022-01-10 11:18     ` Robin Dapp
  1 sibling, 1 reply; 26+ messages in thread
From: Robin Dapp @ 2021-12-10 15:06 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

Hi Jeff,

> I'd generally prefer to refactor the bits between the restart label and 
> the goto restart into a function and call it twice, passing in the 
> additional bits to allow for better costing.  Can you look into that?  
> If it's going to be major surgery, then the goto approach will be OK.

I transplanted the loop into a separate function
"noce_convert_multiple_sets_1" (for the lack of a better name right
now).  I guess an argument could be made about also moving

+  rtx cc_cmp = cond_exec_get_condition (jump);
+  rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true);

into the function and not care about traversing all instructions
twice/four times (will not be more than a few anyway) but I did not do
that for now.

Does this look better? Not fully tested yet everywhere but a test suite
run on s390 looked good.

Regards
 Robin

[-- Attachment #2: fun.patch --]
[-- Type: text/x-patch, Size: 10075 bytes --]

commit 63926561fcdeace9e07b0240fc929b47b7b99404
Author: Robin Dapp <rdapp@linux.ibm.com>
Date:   Fri Sep 17 20:22:10 2021 +0200

    ifcvt: Run second pass if it is possible to omit a temporary.
    
    If one of the to-be-converted SETs requires the original comparison
    (i.e. in order to generate a min/max insn) but no other insn after it
    does, we can omit creating temporaries, thus facilitating costing.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index a13b28c9ee0..301181b2d1a 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3250,9 +3250,6 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   rtx x = XEXP (cond, 0);
   rtx y = XEXP (cond, 1);
 
-  rtx cc_cmp = cond_exec_get_condition (jump);
-  rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true);
-
   /* The true targets for a conditional move.  */
   auto_vec<rtx> targets;
   /* The temporaries introduced to allow us to not consider register
@@ -3260,13 +3257,139 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   auto_vec<rtx> temporaries;
   /* The insns we've emitted.  */
   auto_vec<rtx_insn *> unmodified_insns;
-  int count = 0;
 
   hash_set<rtx_insn *> need_no_cmov;
   hash_map<rtx_insn *, int> rewired_src;
 
   need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src);
 
+  int last_needs_comparison = -1;
+
+  bool ok = noce_convert_multiple_sets_1
+    (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries,
+     &unmodified_insns, &last_needs_comparison);
+  if (!ok)
+      return false;
+
+  /* If there are insns that overwrite part of the initial
+     comparison, we can still omit creating temporaries for
+     the last of them.
+     As the second try will always create a less expensive,
+     valid sequence, we do not need to compare and can discard
+     the first one.  */
+  if (last_needs_comparison != -1)
+    {
+      end_sequence ();
+      start_sequence ();
+      ok = noce_convert_multiple_sets_1
+	(if_info, &need_no_cmov, &rewired_src, &targets, &temporaries,
+	 &unmodified_insns, &last_needs_comparison);
+      /* Actually we should not fail anymore if we reached here,
+	 but better still check.  */
+      if (!ok)
+	  return false;
+    }
+
+  /* We must have seen some sort of insn to insert, otherwise we were
+     given an empty BB to convert, and we can't handle that.  */
+  gcc_assert (!unmodified_insns.is_empty ());
+
+  /* Now fixup the assignments.  */
+  for (unsigned i = 0; i < targets.length (); i++)
+    if (targets[i] != temporaries[i])
+      noce_emit_move_insn (targets[i], temporaries[i]);
+
+  /* Actually emit the sequence if it isn't too expensive.  */
+  rtx_insn *seq = get_insns ();
+
+  if (!targetm.noce_conversion_profitable_p (seq, if_info))
+    {
+      end_sequence ();
+      return FALSE;
+    }
+
+  for (insn = seq; insn; insn = NEXT_INSN (insn))
+    set_used_flags (insn);
+
+  /* Mark all our temporaries and targets as used.  */
+  for (unsigned i = 0; i < targets.length (); i++)
+    {
+      set_used_flags (temporaries[i]);
+      set_used_flags (targets[i]);
+    }
+
+  set_used_flags (cond);
+  set_used_flags (x);
+  set_used_flags (y);
+
+  unshare_all_rtl_in_chain (seq);
+  end_sequence ();
+
+  if (!seq)
+    return FALSE;
+
+  for (insn = seq; insn; insn = NEXT_INSN (insn))
+    if (JUMP_P (insn)
+	|| recog_memoized (insn) == -1)
+      return FALSE;
+
+  emit_insn_before_setloc (seq, if_info->jump,
+			   INSN_LOCATION (unmodified_insns.last ()));
+
+  /* Clean up THEN_BB and the edges in and out of it.  */
+  remove_edge (find_edge (test_bb, join_bb));
+  remove_edge (find_edge (then_bb, join_bb));
+  redirect_edge_and_branch_force (single_succ_edge (test_bb), join_bb);
+  delete_basic_block (then_bb);
+  num_true_changes++;
+
+  /* Maybe merge blocks now the jump is simple enough.  */
+  if (can_merge_blocks_p (test_bb, join_bb))
+    {
+      merge_blocks (test_bb, join_bb);
+      num_true_changes++;
+    }
+
+  num_updated_if_blocks++;
+  if_info->transform_name = "noce_convert_multiple_sets";
+  return TRUE;
+}
+
+/* This goes through all relevant insns of IF_INFO->then_bb and tries to
+   create conditional moves.  In case a simple move sufficis the insn
+   should be listed in NEED_NO_CMOV.  The rewired-src cases should be
+   specified via REWIRED_SRC.  TARGETS, TEMPORARIES and UNMODIFIED_INSNS
+   are specified and used in noce_convert_multiple_sets and should be passed
+   to this function..  */
+
+static bool
+noce_convert_multiple_sets_1 (struct noce_if_info *if_info,
+			      hash_set<rtx_insn *> *need_no_cmov,
+			      hash_map<rtx_insn *, int> *rewired_src,
+			      auto_vec<rtx> *targets,
+			      auto_vec<rtx> *temporaries,
+			      auto_vec<rtx_insn *> *unmodified_insns,
+			      int *last_needs_comparison)
+{
+  basic_block then_bb = if_info->then_bb;
+  rtx_insn *jump = if_info->jump;
+  rtx_insn *cond_earliest;
+
+  /* Decompose the condition attached to the jump.  */
+  rtx cond = noce_get_condition (jump, &cond_earliest, false);
+
+  rtx cc_cmp = cond_exec_get_condition (jump);
+  rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true);
+
+  rtx_insn *insn;
+  int count = 0;
+
+  targets->truncate (0);
+  temporaries->truncate (0);
+  unmodified_insns->truncate (0);
+
+  bool second_try = *last_needs_comparison != -1;
+
   FOR_BB_INSNS (then_bb, insn)
     {
       /* Skip over non-insns.  */
@@ -3280,9 +3403,9 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
       rtx temp;
 
       rtx new_val = SET_SRC (set);
-      if (int *ii = rewired_src.get (insn))
-	new_val = simplify_replace_rtx (new_val, targets[*ii],
-					temporaries[*ii]);
+      if (int *ii = rewired_src->get (insn))
+	new_val = simplify_replace_rtx (new_val, (*targets)[*ii],
+					(*temporaries)[*ii]);
       rtx old_val = target;
 
       /* As we are transforming
@@ -3310,8 +3433,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	 Therefore we introduce a temporary every time we are about to
 	 overwrite a variable used in the check.  Costing of a sequence with
 	 these is going to be inaccurate so only use temporaries when
-	 needed.  */
-      if (reg_overlap_mentioned_p (target, cond))
+	 needed.
+
+	 If performing a second try, we know how many insns require a
+	 temporary.  For the last of these, we can omit creating one.  */
+      if (reg_overlap_mentioned_p (target, cond)
+	  && (!second_try || count < *last_needs_comparison))
 	temp = gen_reg_rtx (GET_MODE (target));
       else
 	temp = target;
@@ -3319,7 +3446,7 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
       /* We have identified swap-style idioms before.  A normal
 	 set will need to be a cmov while the first instruction of a swap-style
 	 idiom can be a regular move.  This helps with costing.  */
-      bool need_cmov = !need_no_cmov.contains (insn);
+      bool need_cmov = !need_no_cmov->contains (insn);
 
       /* If we had a non-canonical conditional jump (i.e. one where
 	 the fallthrough is to the "else" case) we need to reverse
@@ -3394,6 +3521,8 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	{
 	  seq = seq1;
 	  temp_dest = temp_dest1;
+	  if (!second_try)
+	    *last_needs_comparison = count;
 	}
       else if (seq2 != NULL_RTX)
 	{
@@ -3412,75 +3541,15 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 
       /* Bookkeeping.  */
       count++;
-      targets.safe_push (target);
-      temporaries.safe_push (temp_dest);
-      unmodified_insns.safe_push (insn);
-    }
-
-  /* We must have seen some sort of insn to insert, otherwise we were
-     given an empty BB to convert, and we can't handle that.  */
-  gcc_assert (!unmodified_insns.is_empty ());
-
-  /* Now fixup the assignments.  */
-  for (int i = 0; i < count; i++)
-    if (targets[i] != temporaries[i])
-      noce_emit_move_insn (targets[i], temporaries[i]);
-
-  /* Actually emit the sequence if it isn't too expensive.  */
-  rtx_insn *seq = get_insns ();
-
-  if (!targetm.noce_conversion_profitable_p (seq, if_info))
-    {
-      end_sequence ();
-      return FALSE;
-    }
-
-  for (insn = seq; insn; insn = NEXT_INSN (insn))
-    set_used_flags (insn);
-
-  /* Mark all our temporaries and targets as used.  */
-  for (int i = 0; i < count; i++)
-    {
-      set_used_flags (temporaries[i]);
-      set_used_flags (targets[i]);
+      targets->safe_push (target);
+      temporaries->safe_push (temp_dest);
+      unmodified_insns->safe_push (insn);
     }
 
-  set_used_flags (cond);
-  set_used_flags (x);
-  set_used_flags (y);
-
-  unshare_all_rtl_in_chain (seq);
-  end_sequence ();
-
-  if (!seq)
-    return FALSE;
-
-  for (insn = seq; insn; insn = NEXT_INSN (insn))
-    if (JUMP_P (insn)
-	|| recog_memoized (insn) == -1)
-      return FALSE;
-
-  emit_insn_before_setloc (seq, if_info->jump,
-			   INSN_LOCATION (unmodified_insns.last ()));
+  return true;
+}
 
-  /* Clean up THEN_BB and the edges in and out of it.  */
-  remove_edge (find_edge (test_bb, join_bb));
-  remove_edge (find_edge (then_bb, join_bb));
-  redirect_edge_and_branch_force (single_succ_edge (test_bb), join_bb);
-  delete_basic_block (then_bb);
-  num_true_changes++;
 
-  /* Maybe merge blocks now the jump is simple enough.  */
-  if (can_merge_blocks_p (test_bb, join_bb))
-    {
-      merge_blocks (test_bb, join_bb);
-      num_true_changes++;
-    }
-
-  num_updated_if_blocks++;
-  if_info->transform_name = "noce_convert_multiple_sets";
-  return TRUE;
-}
 
 /* Return true iff basic block TEST_BB is comprised of only
    (SET (REG) (REG)) insns suitable for conversion to a series
diff --git a/gcc/ifcvt.h b/gcc/ifcvt.h
index 62632ed2fec..f1986baf395 100644
--- a/gcc/ifcvt.h
+++ b/gcc/ifcvt.h
@@ -110,4 +110,11 @@ struct noce_if_info
   const char *transform_name;
 };
 
+static bool
+noce_convert_multiple_sets_1 (struct noce_if_info *,
+			      hash_set<rtx_insn *> *,
+			      hash_map<rtx_insn *, int> *,
+			      auto_vec<rtx> *, auto_vec<rtx> *,
+			      auto_vec<rtx_insn *> *, int *);
+
 #endif /* GCC_IFCVT_H */

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary.
  2021-12-10 15:06     ` Robin Dapp
@ 2021-12-15 20:24       ` Jeff Law
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff Law @ 2021-12-15 20:24 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, richard.sandiford



On 12/10/2021 8:06 AM, Robin Dapp wrote:
> Hi Jeff,
>
>> I'd generally prefer to refactor the bits between the restart label and
>> the goto restart into a function and call it twice, passing in the
>> additional bits to allow for better costing.  Can you look into that?
>> If it's going to be major surgery, then the goto approach will be OK.
> I transplanted the loop into a separate function
> "noce_convert_multiple_sets_1" (for the lack of a better name right
> now).  I guess an argument could be made about also moving
>
> +  rtx cc_cmp = cond_exec_get_condition (jump);
> +  rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true);
>
> into the function and not care about traversing all instructions
> twice/four times (will not be more than a few anyway) but I did not do
> that for now.
>
> Does this look better? Not fully tested yet everywhere but a test suite
> run on s390 looked good.
I think it's looks better.  One might argue that a structure rather than 
a half-dozen named arguments or a class would be even better, but I 
think that can wait for a full class-ification of that file.

You probably should move the prototype for noce_convert_multiple_set_1 
into the .c file.  It's static, no no need to expose it in the .h file 
AFAICT.  OK with that change.

jeff


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 1/7] ifcvt: Check if cmovs are needed.
  2021-12-09  1:26   ` Jeff Law
@ 2022-01-10 11:17     ` Robin Dapp
  0 siblings, 0 replies; 26+ messages in thread
From: Robin Dapp @ 2022-01-10 11:17 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

Hi,

I included the outstanding minor remarks and believe everything is OK'ed
now.  Still posting the ChangeLogs that I omitted before continuing.
I'd expect some fallout on other targets (hopefully nothing major) since
rtx costs are handled differently now for this code path.

Regards
 Robin

--

gcc/ChangeLog:

	* ifcvt.c (need_cmov_or_rewire): New function.
	(noce_convert_multiple_sets): Call it.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple.
  2021-12-08 23:51   ` Jeff Law
@ 2022-01-10 11:17     ` Robin Dapp
  0 siblings, 0 replies; 26+ messages in thread
From: Robin Dapp @ 2022-01-10 11:17 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

Posting the ChangeLog before pushing.

--

gcc/ChangeLog:

	* ifcvt.c (noce_convert_multiple_sets): Allow constants.
	(bb_ok_for_noce_convert_multiple_sets): Likewise.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 3/7] ifcvt: Improve costs handling for noce_convert_multiple.
  2021-12-08 23:54   ` Jeff Law
@ 2022-01-10 11:17     ` Robin Dapp
  0 siblings, 0 replies; 26+ messages in thread
From: Robin Dapp @ 2022-01-10 11:17 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

Posting the ChangeLog before pushing.

--

gcc/ChangeLog:

	* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Estimate insns costs.
	(noce_process_if_block): Use potential costs.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.
  2021-12-09  0:11   ` Jeff Law
  2021-12-09 17:20     ` Robin Dapp
@ 2022-01-10 11:18     ` Robin Dapp
  1 sibling, 0 replies; 26+ messages in thread
From: Robin Dapp @ 2022-01-10 11:18 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

Posting the ChangeLog before pushing.

--

gcc/ChangeLog:

	* rtl.h (struct rtx_comparison): New struct that holds an rtx
	comparison.
	* config/rs6000/rs6000.c (rs6000_emit_minmax): Use struct instead of
        single parameters.
	(rs6000_emit_swsqrt): Likewise.
	* expmed.c (expand_sdiv_pow2): Likewise.
	(emit_store_flag): Likewise.
	* expr.c (expand_cond_expr_using_cmove): Likewise.
	(expand_expr_real_2): Likewise.
	* ifcvt.c (noce_emit_cmove): Add compare and reversed compare
	parameters and allow to call directly without going through
	preparation steps.
	* optabs.c (emit_conditional_move_1): New function.
	(expand_doubleword_shift_condmove): Use struct.
	(emit_conditional_move): Use struct.
	* optabs.h (emit_conditional_move): Use struct.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves.
  2021-12-09  1:18   ` Jeff Law
@ 2022-01-10 11:18     ` Robin Dapp
  0 siblings, 0 replies; 26+ messages in thread
From: Robin Dapp @ 2022-01-10 11:18 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

Posting the ChangeLog before pushing.

--

gcc/ChangeLog:

	* ifcvt.c (cond_exec_get_condition): New parameter to allow getting the
	reversed comparison.
	(try_emit_cmove_seq): New function to facilitate creating a cmov
	sequence.
	(noce_convert_multiple_sets): Create two sequences and use the less
	expensive one.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple.
  2021-12-08 23:48   ` Jeff Law
@ 2022-01-10 11:18     ` Robin Dapp
  0 siblings, 0 replies; 26+ messages in thread
From: Robin Dapp @ 2022-01-10 11:18 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

Posting the ChangeLog before pushing.

--

gcc/testsuite/ChangeLog:

	* gcc.dg/ifcvt-4.c: Remove s390-specific check.
	* gcc.target/s390/ifcvt-two-insns-bool.c: New test.
	* gcc.target/s390/ifcvt-two-insns-int.c: New test.
	* gcc.target/s390/ifcvt-two-insns-long.c: New test.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary.
  2021-12-09  1:24   ` Jeff Law
  2021-12-10 15:06     ` Robin Dapp
@ 2022-01-10 11:18     ` Robin Dapp
  1 sibling, 0 replies; 26+ messages in thread
From: Robin Dapp @ 2022-01-10 11:18 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

Posting the ChangeLog before pushing.

--

gcc/ChangeLog:

	* ifcvt.c (noce_convert_multiple_sets_1): New function.
	(noce_convert_multiple_sets): Call function a second time if we can
	improve the first try.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2022-01-10 11:18 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-06 18:43 [PATCH v3 0/7] ifcvt: Convert multiple Robin Dapp
2021-12-06 18:43 ` [PATCH v3 1/7] ifcvt: Check if cmovs are needed Robin Dapp
2021-12-09  1:26   ` Jeff Law
2022-01-10 11:17     ` Robin Dapp
2021-12-06 18:43 ` [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple Robin Dapp
2021-12-08 23:51   ` Jeff Law
2022-01-10 11:17     ` Robin Dapp
2021-12-06 18:43 ` [PATCH v3 3/7] ifcvt: Improve costs handling " Robin Dapp
2021-12-08 23:54   ` Jeff Law
2022-01-10 11:17     ` Robin Dapp
2021-12-06 18:43 ` [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move Robin Dapp
2021-12-09  0:11   ` Jeff Law
2021-12-09 17:20     ` Robin Dapp
2021-12-09 17:34       ` Jeff Law
2022-01-10 11:18     ` Robin Dapp
2021-12-06 18:43 ` [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves Robin Dapp
2021-12-09  1:18   ` Jeff Law
2022-01-10 11:18     ` Robin Dapp
2021-12-06 18:43 ` [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple Robin Dapp
2021-12-08 23:48   ` Jeff Law
2022-01-10 11:18     ` Robin Dapp
2021-12-06 18:43 ` [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary Robin Dapp
2021-12-09  1:24   ` Jeff Law
2021-12-10 15:06     ` Robin Dapp
2021-12-15 20:24       ` Jeff Law
2022-01-10 11:18     ` Robin Dapp

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).