public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC][PATCH][mid-end] Optimize immediate choice in comparisons.
@ 2018-08-07 13:34 Vlad Lazar
  2018-08-07 20:11 ` Richard Sandiford
  0 siblings, 1 reply; 12+ messages in thread
From: Vlad Lazar @ 2018-08-07 13:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd, law, ian, rguenther

Hi.

This patch optimises the choice of immediates in integer comparisons. Integer
comparisons allow for different choices (e.g. a > b is equivalent to a >= b+1)
and there are cases where an incremented/decremented immediate can be loaded into a
register in fewer instructions. The cases are as follows:
   
i)   a >  b or a >= b + 1
ii)  a <= b or a <  b + 1
iii) a >= b or a >  b - 1
iv)  a <  b or a <= b - 1

For each comparison we check whether the equivalent can be performed in less instructions.
This is done on a statement by statement basis, right before the GIMPLE statement is expanded
to RTL. Therefore, it requires a correct implementation of the TARGET_INSN_COST hook.
The change is general and it applies to any integer comparison, regardless of types or location.

For example, on AArch64 for the following code:

int foo (int x)
{
   return x > 0xfefffffe;
}

it generates:

mov	w1, -16777217
cmp	w0, w1
cset	w0, cs

instead of:

mov	w1, 65534
movk	w1, 0xfeff, lsl 16
cmp	w0, w1
cset	w0, hi

Bootstrapped and regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and there are no regressions.

Thanks,
Vlad

gcc/testsuite/
Changelog for gcc/testsuite/Changelog
2018-07-30  Vlad Lazar  <vlad.lazar@arm.com>

	* gcc.target/aarch64/imm_choice_comparison.c: New.

gcc/
Changelog for gcc/Changelog
2018-07-30  Vlad Lazar  <vlad.lazar@arm.com>

	* cfgexpand.c (optimize_immediate_choice): New.
	(can_optimize_immediate_p): Likewise.
	(validate_immediate_optimization): Likewise.
	(update_immediate): Likewise.
	(immediate_optimized_code): Likewise.

---

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 9b91279282e1c6956c8b3699f13036c401ea1dcd..5b0a2e0cdb23f649336844506c8241428b3e6e7d 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5423,6 +5423,157 @@ reorder_operands (basic_block bb)
    XDELETE (lattice);
  }
  
+/* Helper function for update_immediate.  Returns the adjusted conditional
+   code for the immediate choice optimization.  See optimize_immediate_choice
+   for cases.  */
+
+static enum tree_code
+immediate_optimized_code (enum tree_code code)
+{
+  switch (code)
+    {
+    case GT_EXPR:
+      return GE_EXPR;
+    case GE_EXPR:
+      return GT_EXPR;
+    case LT_EXPR:
+      return LE_EXPR;
+    case LE_EXPR:
+      return LT_EXPR;
+    default:
+      return code;
+    }
+}
+
+/* Helper function for optimize_immediate_choice.  It updates the immediate
+   and the subcode of the gimple statement.  At the point of calling
+   the function we know that the optimization leads to fewer instructions.
+   stmt points to the gimple statement containing the comparison we wish to
+   optimize and to_add is the amount by which the immediate needs to be
+   adjusted (1 or -1).  */
+
+static void
+update_immediate (gimple *stmt, int to_add)
+{
+  tree inc_dec = to_add == 1 ? build_one_cst (integer_type_node) :
+			       build_minus_one_cst (integer_type_node);
+
+  enum gimple_code code = gimple_code (stmt);
+  if (code == GIMPLE_COND)
+    {
+      gcond *gc = GIMPLE_CHECK2<gcond *> (stmt);
+      tree rhs = gimple_cond_rhs (gc);
+
+      /* Update the statement.  */
+      tree new_rhs = fold_build2 (PLUS_EXPR, TREE_TYPE (rhs), rhs, inc_dec);
+      gimple_cond_set_rhs (gc, new_rhs);
+      gc->subcode = immediate_optimized_code ((enum tree_code) gc->subcode);
+    }
+  if (code == GIMPLE_ASSIGN)
+    {
+      gassign *ga = GIMPLE_CHECK2<gassign *> (stmt);
+      tree rhs2 = gimple_assign_rhs2 (ga);
+
+      tree new_rhs2 = fold_build2 (PLUS_EXPR, TREE_TYPE (rhs2), rhs2, inc_dec);
+      gimple_assign_set_rhs2 (ga, new_rhs2);
+      ga->subcode = immediate_optimized_code ((enum tree_code) ga->subcode);
+    }
+}
+
+/* Helper function for optimize_immediate_choice.  It checks if the other
+   possible immediate leads to a lower rtx cost.  to_add is the amount by
+   which the immediate needs to be adjusted (1 or -1).  The function
+   returns 0 if there is no improvement and the amount by which the immediate
+   needs to be adjusted (1 or -1) otherwise.  */
+
+static int
+validate_immediate_optimization (gimple *stmt, int to_add)
+{
+  tree tree_imm = gimple_code (stmt) == GIMPLE_COND ? gimple_cond_rhs (stmt)
+						: gimple_assign_rhs2 (stmt);
+  const_tree type = TREE_TYPE (tree_imm);
+  widest_int imm = wi::to_widest (tree_imm);
+  enum signop sgn = TYPE_UNSIGNED (type) ? UNSIGNED : SIGNED;
+
+  /* Check for overflow/underflow in the case of signed values and
+     wrapping around in the case of unsigned values.  If any occur
+     cancel the optimization.  */
+
+  widest_int max_type = wi::to_widest (TYPE_MAX_VALUE (type));
+  widest_int min_type = wi::to_widest (TYPE_MIN_VALUE (type));
+  if ((wi::cmp (imm, max_type, sgn) == 0 && to_add == 1)
+      || (wi::cmp (imm, min_type, sgn) == 0 && to_add == -1))
+    return 0;
+
+  widest_int imm_modif = wi::add (imm, to_add);
+
+  rtx reg = gen_reg_rtx (DImode);
+  rtx old_imm = GEN_INT (imm.to_shwi ());
+  rtx new_imm = GEN_INT (imm_modif.to_shwi ());
+
+  rtx_insn *old_rtx = gen_move_insn (reg, old_imm);
+  rtx_insn *new_rtx = gen_move_insn (reg, new_imm);
+
+  /* If the change is beneficial we can just propagate to_add further,
+     otherwise return 0 to cancel the optimization.  */
+  return insn_cost (old_rtx, true) > insn_cost (new_rtx, true) ? to_add : 0;
+}
+
+/* Helper function for optimize_immediate_choice.  Checks if the gimple
+   statement has the right shape for the optimization.  */
+
+static bool
+can_optimize_immediate_p (const gimple *stmt)
+{
+  enum gimple_code code = gimple_code (stmt);
+  if (code != GIMPLE_ASSIGN && code != GIMPLE_COND)
+    return false;
+
+  if (code == GIMPLE_ASSIGN
+      && (gimple_num_ops (stmt) != 3
+	  || TREE_CODE (gimple_assign_rhs2 (stmt)) != INTEGER_CST))
+    return false;
+  if (code == GIMPLE_COND && TREE_CODE (gimple_cond_rhs (stmt)) != INTEGER_CST)
+    return false;
+
+  return true;
+}
+
+/* Entry point for immediate choice optimization.  It aims to choose
+   the simpler immediate in integer comparisons.  The purpose of this
+   is to end up with an immediate which can be loaded into a register
+   in fewer moves, if possible.
+
+   For each integer comparison there exists an equivalent choice:
+     i)   a >  b or a >= b + 1
+     ii)  a <= b or a <  b + 1
+     iii) a >= b or a >  b - 1
+     iv)  a <  b or a <= b - 1
+
+   Comparisons can only appear on the rhs of a GIMPLE_ASSIGN
+   or in a GIMPLE_COND.  */
+
+static void
+optimize_immediate_choice (gimple *stmt)
+{
+  if (!can_optimize_immediate_p (stmt))
+    return;
+
+  /* Check if the other immediate available is preferable.  */
+  int to_add = 0;
+  if (stmt->subcode == GT_EXPR || stmt->subcode == LE_EXPR)
+    to_add = validate_immediate_optimization (stmt, 1);
+
+  if (stmt->subcode == GE_EXPR || stmt->subcode == LT_EXPR)
+    to_add = validate_immediate_optimization (stmt, -1);
+
+  if (!to_add)
+    return;
+
+  /* Update the current statement.  */
+  update_immediate (stmt, to_add);
+}
+
  /* Expand basic block BB from GIMPLE trees to RTL.  */
  
  static basic_block
@@ -5515,6 +5666,7 @@ expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
        basic_block new_bb;
  
        stmt = gsi_stmt (gsi);
+      optimize_immediate_choice (stmt);
  
        /* If this statement is a non-debug one, and we generate debug
  	 insns, then this one might be the last real use of a TERed
diff --git a/gcc/testsuite/gcc.target/aarch64/imm_choice_comparison.c b/gcc/testsuite/gcc.target/aarch64/imm_choice_comparison.c
new file mode 100644
index 0000000000000000000000000000000000000000..b30dcca88f44ca73fcb8202ea488888b365400c8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/imm_choice_comparison.c
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int x)
+{
+  return x >= 0xfff01;
+}
+
+int
+GT (unsigned int x)
+{
+  return x > 0xfefffffe;
+}
+
+/* Go from four moves to two.  */
+
+int
+baz (long long x)
+{
+  return x <= 0x1999999999999998;
+}
+
+int
+LE (unsigned int x)
+{
+  return x <= 0xfefffffe;
+}
+
+int
+GE (long long x)
+{
+  return x >= 0xff000000;
+}
+
+int
+LT (int x)
+{
+  return x < 0xff000000;
+}
+
+/* Optimize the immediate in conditionals.  */
+
+int check (int x, int y)
+{
+  if (x > y && GT (x))
+    return 100;
+
+  return x;
+}
+
+/* baz produces one movk instruction.  */
+/* { dg-final { scan-assembler-times "movk" 1 } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [RFC][PATCH][mid-end] Optimize immediate choice in comparisons.
@ 2018-08-18  8:44 Uros Bizjak
  2018-08-18 10:12 ` Richard Sandiford
  0 siblings, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2018-08-18  8:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jeff Law, vlad.lazar, Richard Sandiford

Hello!

>> gcc/testsuite/
>> Changelog for gcc/testsuite/Changelog
>> 2018-08-14  Vlad Lazar  <vlad.lazar@arm.com>
>>
>>     * gcc.target/aarch64/imm_choice_comparison.c: New.
>>
>> gcc/
>> Changelog for gcc/Changelog
>> 2018-08-14  Vlad Lazar  <vlad.lazar@arm.com>
>>     * expmed.h (canonicalize_comparison): New declaration.
>>     * expmed.c (canonicalize_comparison, equivalent_cmp_code): New.
>>     * expmed.c (emit_store_flag_1): Add call to canonicalize_comparison.
>>     * optabs.c (prepare_cmp_insn): Likewise.
>>     * rtl.h (unsigned_condition_p): New function which checks if a
>> comparison operator is unsigned.
> Thanks.  I fixed up the ChangeLog entry and installed this on the trunk.
>
> Richard S -- thanks for working with Vlad to push this forward.

This patch caused

+FAIL: gcc.target/i386/20040112-1.c scan-assembler testb

on x86 targets, reported in PR86994 [1].

The compiler now expands with:

(insn 12 8 13 4 (set (reg:CCGC 17 flags)
        (compare:CCGC (reg/v:QI 87 [ status ])
            (const_int -1 [0xffffffffffffffff]))) "20040112-1.c":13 9 {*cmpqi_1}
     (nil))

instead of:

(insn 10 8 11 4 (set (reg:CCGOC 17 flags)
        (compare:CCGOC (reg/v:QI 87 [ status ])
            (const_int 0 [0]))) "20040112-1.c":13 5 {*cmpqi_ccno_1}
     (nil))

The canonicalization, introduced in r263591 does not universally
benefit all targets. I wonder why TARGET_CANONICALIZE_COMPARISON
target hook was not used instead.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86994

Uros.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-08-18 10:12 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-07 13:34 [RFC][PATCH][mid-end] Optimize immediate choice in comparisons Vlad Lazar
2018-08-07 20:11 ` Richard Sandiford
2018-08-09  5:48   ` Jeff Law
2018-08-10 15:54     ` Vlad Lazar
2018-08-13 14:00       ` Richard Sandiford
2018-08-14 17:01         ` Vlad Lazar
2018-08-16 10:28           ` Richard Sandiford
2018-08-16 16:35           ` Jeff Law
2018-08-16 16:46             ` Vlad Lazar
2018-08-16 16:48               ` Jeff Law
2018-08-18  8:44 Uros Bizjak
2018-08-18 10:12 ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).