[PATCH][AArch64][1/3] Expand signed mod by power of 2 using CSNEG

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Kyrill Tkachov <kyrylo.tkachov@arm.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Marcus Shawcroft <marcus.shawcroft@arm.com>,
	 Richard Earnshaw <Richard.Earnshaw@arm.com>,
	James Greenhalgh <james.greenhalgh@arm.com>
Subject: [PATCH][AArch64][1/3] Expand signed mod by power of 2 using CSNEG
Date: Fri, 24 Jul 2015 10:55:00 -0000	[thread overview]
Message-ID: <55B219A5.8060307@arm.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 2520 bytes --]

Hi all,

This patch implements an aarch64-specific expansion of the signed modulo by a power of 2.
The proposed sequence makes use of the conditional negate instruction CSNEG.
For a power of N, x % N can be calculated with:
negs   x1, x0
and    x0, x0, #(N - 1)
and    x1, x1, #(N - 1)
csneg  x0, x0, x1, mi

So, for N == 256 this would be:
negs   x1, x0
and    x0, x0, #255
and    x1, x1, #255
csneg  x0, x0, x1, mi

For comparison, the existing sequence emitted by expand_smod_pow2 in expmed.c is:
asr     x1, x0, 63
lsr     x1, x1, 56
add     x0, x0, x1
and     x0, x0, 255
sub     x0, x0, x1

Note that the CSNEG sequence is one instruction shorter and that the two and operations
are independent, compared to the existing sequence where all instructions are dependent
on the preceeding instructions.

For the special case of N == 2 we can do even better:
cmp     x0, xzr
and     x0, x0, 1
csneg   x0, x0, x0, ge

I first tried implementing this in the generic code in expmed.c but that didn't work
out for a few reasons:

* This relies on having a conditional-negate instruction. We could gate it on
HAVE_conditional_move and the combiner is capable of merging the final negate into
the conditional move if a conditional negate is available (like on aarch64) but on
targets without a conditional negate this would end up emitting a separate negate.

* The first negs has to be a negs for the sequence to be a win i.e. having a separate
negate and compare makes the sequence slower than the existing one (at least in my
microbenchmarking) and I couldn't get subsequent passes to combine the negate and combine
into the negs (presumably due to the use of the negated result in one of the ands).
Doing it in the aarch64 backend where I could just call the exact gen_* functions that
I need worked much more cleanly.

The costing logic is updated to reflect this sequence during the intialisation of
expmed.c where it calculates the smod_pow2_cheap metric.

The tests will come in patch 3 of the series which are partly shared with the equivalent
arm implementation.

Bootstrapped and tested on aarch64.
Ok for trunk?

Thanks,
Kyrill

2015-07-24  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/aarch64.md (mod<mode>3): New define_expand.
     (*neg<mode>2_compare0): Rename to...
     (neg<mode>2_compare0): ... This.
     * config/aarch64/aarch64.c (aarch64_rtx_costs, MOD case): Reflect
     CSNEG sequence in MOD by power of 2 case.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: aarch64-mod-2.patch --]
[-- Type: text/x-patch; name=aarch64-mod-2.patch, Size: 3961 bytes --]

commit 428cbef107b559f67006fe84b4522210902b101f
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed Jul 15 17:01:13 2015 +0100

    [AArch64][1/3] Expand signed mod by power of 2 using CSNEG

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9d88a60..7bb4a55 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6639,8 +6639,26 @@ cost_plus:
 	  if (VECTOR_MODE_P (mode))
 	    *cost += extra_cost->vect.alu;
 	  else if (GET_MODE_CLASS (mode) == MODE_INT)
-	    *cost += (extra_cost->mult[mode == DImode].add
-		      + extra_cost->mult[mode == DImode].idiv);
+	    {
+	      /* We can expand signed mod by power of 2 using a
+		 NEGS, two parallel ANDs and a CSNEG.  Assume here
+		 that CSNEG is COSTS_N_INSNS (1).  This case should
+		 only ever be reached through the set_smod_pow2_cheap check
+		 in expmed.c.  */
+	      if (code == MOD
+		  && CONST_INT_P (XEXP (x, 1))
+		  && exact_log2 (INTVAL (XEXP (x, 1))) > 0
+		  && (mode == SImode || mode == DImode))
+		{
+		  *cost += COSTS_N_INSNS (3)
+			   + 2 * extra_cost->alu.logical
+			   + extra_cost->alu.arith;
+		  return true;
+		}
+
+	      *cost += (extra_cost->mult[mode == DImode].add
+			+ extra_cost->mult[mode == DImode].idiv);
+	    }
 	  else if (mode == DFmode)
 	    *cost += (extra_cost->fp[1].mult
 		      + extra_cost->fp[1].div);
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f264534..3aaf407 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -302,6 +302,62 @@ (define_expand "cmp<mode>"
   }
 )
 
+;; AArch64-specific expansion of signed mod by power of 2 using CSNEG.
+;; For x0 % n where n is a power of 2 produce:
+;; negs   x1, x0
+;; and    x0, x0, #(n - 1)
+;; and    x1, x1, #(n - 1)
+;; csneg  x0, x0, x1, mi
+
+(define_expand "mod<mode>3"
+  [(match_operand:GPI 0 "register_operand" "")
+   (match_operand:GPI 1 "register_operand" "")
+   (match_operand:GPI 2 "const_int_operand" "")]
+  ""
+  {
+    HOST_WIDE_INT val = INTVAL (operands[2]);
+
+    if (val <= 0
+       || exact_log2 (INTVAL (operands[2])) <= 0
+       || !aarch64_bitmask_imm (INTVAL (operands[2]) - 1, <MODE>mode))
+      FAIL;
+
+    rtx mask = GEN_INT (val - 1);
+
+    /* In the special case of x0 % 2 we can do the even shorter:
+	cmp     x0, xzr
+	and     x0, x0, 1
+	csneg   x0, x0, x0, ge.  */
+    if (val == 2)
+      {
+	rtx masked = gen_reg_rtx (<MODE>mode);
+	rtx ccreg = aarch64_gen_compare_reg (LT, operands[1], const0_rtx);
+	emit_insn (gen_and<mode>3 (masked, operands[1], mask));
+	rtx x = gen_rtx_LT (VOIDmode, ccreg, const0_rtx);
+	emit_insn (gen_csneg3<mode>_insn (operands[0], x, masked, masked));
+	DONE;
+      }
+
+    rtx neg_op = gen_reg_rtx (<MODE>mode);
+    rtx_insn *insn = emit_insn (gen_neg<mode>2_compare0 (neg_op, operands[1]));
+
+    /* Extract the condition register and mode.  */
+    rtx cmp = XVECEXP (PATTERN (insn), 0, 0);
+    rtx cc_reg = SET_DEST (cmp);
+    rtx cond = gen_rtx_GE (VOIDmode, cc_reg, const0_rtx);
+
+    rtx masked_pos = gen_reg_rtx (<MODE>mode);
+    emit_insn (gen_and<mode>3 (masked_pos, operands[1], mask));
+
+    rtx masked_neg = gen_reg_rtx (<MODE>mode);
+    emit_insn (gen_and<mode>3 (masked_neg, neg_op, mask));
+
+    emit_insn (gen_csneg3<mode>_insn (operands[0], cond,
+				       masked_neg, masked_pos));
+    DONE;
+  }
+)
+
 (define_insn "*condjump"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
 			    [(match_operand 1 "cc_register" "") (const_int 0)])
@@ -2372,7 +2428,7 @@ (define_insn "*ngcsi_uxtw"
   [(set_attr "type" "adc_reg")]
 )
 
-(define_insn "*neg<mode>2_compare0"
+(define_insn "neg<mode>2_compare0"
   [(set (reg:CC_NZ CC_REGNUM)
 	(compare:CC_NZ (neg:GPI (match_operand:GPI 1 "register_operand" "r"))
 		       (const_int 0)))

next             reply	other threads:[~2015-07-24 10:55 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-24 10:55 Kyrill Tkachov [this message]
2015-07-31  8:43 ` Kyrill Tkachov
2015-07-31  9:26 ` Andrew Pinski
2015-08-03 13:01 ` James Greenhalgh
2015-08-13 12:41   ` Kyrill Tkachov
2015-08-20  8:24     ` Kyrill Tkachov
2015-09-01  8:37       ` Kyrill Tkachov
2015-09-01  9:25     ` James Greenhalgh
2015-09-01 10:40       ` Kyrill Tkachov
2015-09-02 13:00         ` Kyrill Tkachov
2015-09-08  8:42           ` James Greenhalgh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55B219A5.8060307@arm.com \
    --to=kyrylo.tkachov@arm.com \
    --cc=Richard.Earnshaw@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=james.greenhalgh@arm.com \
    --cc=marcus.shawcroft@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).