public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][compare-elim] Merge zero-comparisons with normal ops
@ 2017-08-10 22:03 Michael Collison
  2017-08-28 18:43 ` Jeff Law
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Collison @ 2017-08-10 22:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 2592 bytes --]

Hi all,

One issue that we keep encountering on aarch64 is GCC not making good use of the flag-setting arithmetic instructions
like ADDS, SUBS, ANDS etc. that perform an arithmetic operation and compare the result against zero.
They are represented in a fairly standard way in the backend as PARALLEL patterns:
(parallel [(set (reg x1) (op (reg x2) (reg x3)))
           (set (reg cc) (compare (op (reg x2) (reg x3)) (const_int 0)))])

GCC isn't forming these from separate arithmetic and comparison instructions as aggressively as it could.
A particular pain point is when the result of the arithmetic insn is used before the comparison instruction.
The testcase in this patch is one such example where we have:
(insn 7 35 33 2 (set (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
        (plus:SI (reg:SI 0 x0 [ x ])
            (reg:SI 1 x1 [ y ]))) "comb.c":3 95 {*addsi3_aarch64}
     (nil))
(insn 33 7 34 2 (set (reg:SI 1 x1 [77])
        (plus:SI (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
            (const_int 2 [0x2]))) "comb.c":4 95 {*addsi3_aarch64}
     (nil))
(insn 34 33 17 2 (set (reg:CC 66 cc)
        (compare:CC (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
            (const_int 0 [0]))) "comb.c":4 391 {cmpsi}
     (nil))

This scares combine away as x0 is used in insn 33 as well as the comparison in insn 34.
I think the compare-elim pass can help us here.

This patch extends it by handling comparisons against zero, finding the defining instruction of the compare
and merging the comparison with the defining instruction into a PARALLEL that will hopefully match the form
described above. If between the comparison and the defining insn we find an instruction that uses the condition
registers or any control flow we bail out, and we don't cross basic blocks.
This simple technique allows us to catch cases such as this example.

Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu and x86_64.

Ok for trunk?

2017-08-05  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
	    Michael Collison <michael.collison@arm.com>

	* compare-elim.c: Include emit-rtl.h.
	(can_merge_compare_into_arith): New function.
	(try_validate_parallel): Likewise.
	(try_merge_compare): Likewise.
	(try_eliminate_compare): Call the above when no previous clobber
	is available.
	(execute_compare_elim_after_reload): Add DF_UD_CHAIN and DF_DU_CHAIN
	dataflow problems.

2017-08-05  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
	    Michael Collison <michael.collison@arm.com>
	    
	* gcc.target/aarch64/cmpelim_mult_uses_1.c: New test.

[-- Attachment #2: pr5198v1.patch --]
[-- Type: application/octet-stream, Size: 6340 bytes --]

diff --git a/gcc/compare-elim.c b/gcc/compare-elim.c
index 7e557a2..c65d155 100644
--- a/gcc/compare-elim.c
+++ b/gcc/compare-elim.c
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tm_p.h"
 #include "insn-config.h"
 #include "recog.h"
+#include "emit-rtl.h"
 #include "cfgrtl.h"
 #include "tree-pass.h"
 #include "domwalk.h"
@@ -579,6 +580,145 @@ equivalent_reg_at_start (rtx reg, rtx_insn *end, rtx_insn *start)
   return reg;
 }
 
+/* Return true if it is okay to merge the comparison CMP_INSN with
+   the instruction ARITH_INSN.  Both instructions are assumed to be in the
+   same basic block with ARITH_INSN appearing before CMP_INSN.  This checks
+   that there are no uses or defs of the condition flags or control flow
+   changes between the two instructions.  */
+
+static bool
+can_merge_compare_into_arith (rtx_insn *cmp_insn, rtx_insn *arith_insn)
+{
+  for (rtx_insn *insn = PREV_INSN (cmp_insn);
+       insn && insn != arith_insn;
+       insn = PREV_INSN (insn))
+    {
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+      /* Bail if there are jumps or calls in between.  */
+      if (!NONJUMP_INSN_P (insn))
+	return false;
+
+      df_ref ref;
+      /* Find a USE of the flags register.  */
+      FOR_EACH_INSN_USE (ref, insn)
+	if (DF_REF_REGNO (ref) == targetm.flags_regnum)
+	  return false;
+
+      /* Find a DEF of the flags register.  */
+      FOR_EACH_INSN_DEF (ref, insn)
+	if (DF_REF_REGNO (ref) == targetm.flags_regnum)
+	  return false;
+    }
+  return true;
+}
+
+/* Given two SET expressions, SET_A and SET_B determine whether they form
+   a recognizable pattern when emitted in parallel.  Return that parallel
+   if so.  Otherwise return NULL.  This tries both permutations of SET_A
+   and SET_B within the PARALLEL.  */
+
+static rtx
+try_validate_parallel (rtx set_a, rtx set_b)
+{
+  rtx par
+    = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set_a, set_b));
+  rtx_insn *seq;
+  start_sequence ();
+  seq = emit_insn (par);
+  end_sequence ();
+  if (recog_memoized (seq) > 0)
+    return par;
+
+  par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set_b, set_a));
+  start_sequence ();
+  seq = emit_insn (par);
+  end_sequence ();
+  return recog_memoized (seq) > 0 ? par : NULL_RTX;
+}
+
+/* For a comparison instruction described by CMP check if it compares a
+   register with zero i.e. it is of the form CC := CMP R1, 0.
+   If it is, find the instruction defining R1 (say I1) and try to create a
+   PARALLEL consisting of I1 and the comparison, representing a flag-setting
+   arithmetic instruction.  Example:
+   I1: R1 := R2 + R3
+   <instructions that don't read the condition register>
+   I2: CC := CMP R1 0
+   I2 can be merged with I1 into:
+   I1: { R1 := R2 + R3 ; CC := CMP (R2 + R3) 0 }
+   This catches cases where R1 is used between I1 and I2 and therefore
+   combine and other RTL optimisations will not try to propagate it into
+   I2.  Return true if we succeeded in merging CMP.  */
+
+static bool
+try_merge_compare (struct comparison *cmp)
+{
+  rtx_insn *cmp_insn = cmp->insn;
+
+  if (!REG_P (cmp->in_a) || cmp->in_b != const0_rtx)
+    return false;
+  rtx in_a = cmp->in_a;
+  if (!in_a)
+    return false;
+  df_ref use;
+
+  FOR_EACH_INSN_USE (use, cmp_insn)
+    if (DF_REF_REGNO (use) == REGNO (in_a))
+      break;
+  if (!use)
+    return false;
+
+  struct df_link *ref_chain;
+  ref_chain = DF_REF_CHAIN (use);
+  if (!ref_chain || !ref_chain->ref
+      || !DF_REF_INSN_INFO (ref_chain->ref) || ref_chain->next != NULL)
+    return false;
+
+  rtx_insn *def_insn = DF_REF_INSN (ref_chain->ref);
+  /* We found the insn that defines in_a.  Only consider the cases where
+     it is in the same block as the comparison.  */
+  if (BLOCK_FOR_INSN (cmp_insn) != BLOCK_FOR_INSN (def_insn))
+    return false;
+
+  rtx set = single_set (def_insn);
+  if (!set)
+    return false;
+
+  if (!can_merge_compare_into_arith (cmp_insn, def_insn))
+    return false;
+
+  rtx src = SET_SRC (set);
+  rtx flags = maybe_select_cc_mode (cmp, src, CONST0_RTX (GET_MODE (src)));
+  if (!flags)
+    {
+    /* We may already have a change group going through maybe_select_cc_mode.
+       Discard it properly.  */
+      cancel_changes (0);
+      return false;
+    }
+
+  rtx flag_set
+    = gen_rtx_SET (flags, gen_rtx_COMPARE (GET_MODE (flags),
+					   copy_rtx (src),
+					   CONST0_RTX (GET_MODE (src))));
+  rtx arith_set = copy_rtx (PATTERN (def_insn));
+  rtx par = try_validate_parallel (flag_set, arith_set);
+  if (!par)
+    {
+      /* We may already have a change group going through maybe_select_cc_mode.
+	 Discard it properly.  */
+      cancel_changes (0);
+      return false;
+    }
+  if (!apply_change_group ())
+    return false;
+  emit_insn_after (par, def_insn);
+  delete_insn (def_insn);
+  delete_insn (cmp->insn);
+  return true;
+}
+
 /* Attempt to replace a comparison with a prior arithmetic insn that can
    compute the same flags value as the comparison itself.  Return true if
    successful, having made all rtl modifications necessary.  */
@@ -588,7 +728,9 @@ try_eliminate_compare (struct comparison *cmp)
 {
   rtx flags, in_a, in_b, cmp_src;
 
-  /* We must have found an interesting "clobber" preceding the compare.  */
+  if (try_merge_compare (cmp))
+    return true;
+
   if (cmp->prev_clobber == NULL)
     return false;
 
@@ -714,6 +856,7 @@ try_eliminate_compare (struct comparison *cmp)
 static unsigned int
 execute_compare_elim_after_reload (void)
 {
+  df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN);
   df_analyze ();
 
   gcc_checking_assert (!all_compares.exists ());
diff --git a/gcc/testsuite/gcc.target/aarch64/cmpelim_mult_uses_1.c b/gcc/testsuite/gcc.target/aarch64/cmpelim_mult_uses_1.c
new file mode 100644
index 0000000..953c388
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cmpelim_mult_uses_1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* X is both compared against zero and used.  Make sure we can still
+   generate an ADDS and avoid an explicit comparison against zero.  */
+
+int
+foo (int x, int y)
+{
+  x += y;
+  if (x != 0)
+    x = x + 2;
+  return x;
+}
+
+/* { dg-final { scan-assembler-times "adds\\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-not "cmp\\tw\[0-9\]+, 0" } } */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-10-18 14:03 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-10 22:03 [PATCH][compare-elim] Merge zero-comparisons with normal ops Michael Collison
2017-08-28 18:43 ` Jeff Law
2017-08-29  9:25   ` Kyrill Tkachov
2017-09-01 23:07     ` Segher Boessenkool
2017-09-06 15:56       ` Michael Collison
2017-10-13 18:04         ` Jeff Law
2017-10-14  8:45           ` Eric Botcazou
2017-10-17 11:57             ` Richard Biener
2017-10-17 19:29               ` Michael Collison
2017-10-17 20:05                 ` Eric Botcazou
2017-10-17 20:12                 ` Richard Biener
2017-10-17 20:29                   ` Michael Collison
2017-10-18 14:14                     ` Eric Botcazou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).