public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jiong Wang <jiong.wang@arm.com>
To: Jeff Law <law@redhat.com>
Cc: "gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding
Date: Fri, 14 Aug 2015 17:55:00 -0000	[thread overview]
Message-ID: <n99oai9epez.fsf@arm.com> (raw)
In-Reply-To: <55415C2E.9010001@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3532 bytes --]


Jeff Law writes:

> On 04/29/2015 03:36 PM, Jiong Wang wrote:
>>
>> Jeff Law writes:
>>
>>> On 04/27/2015 02:21 PM, Jiong Wang wrote:
>>>
>>>> Jeff,
>>>>
>>>>     Sorry, I can't understand the meaning of "overlap between t_low and low",
>>>>     assume "right" in "right value" means the opposite of "left" not
>>>>     "correct".
>>>>
>>>>     So what you mean is t_low and low share the same pseudo regiser?
>>> My concern is sharing the same pseudo or memory location.  But thinking
>>> more about it, the shifted value has to have range information, so it
>>> must have been an SSA_NAME, right?  If so, then it can't overlap with
>>> the destination, so this is a non-issue.  Sorry for the confusion.
>>
>> Thanks for the light. By looking at related code, looks like even it's
>> SSA_NAME, it's still possible to share the same pseudo given the
>> destination is in the same SSA map parition after ssa name coleascing?
> If they're the same size and have non-overlapping lifetimes, then yes, 
> they could be the same pseudo.  That ought to be easy to check. 
> Thankfully we don't have to worry about MEMs, which is a harder check.
>
>> OK. I will rework the patch, and I found there is a function named
>> "expand_doubleword_shift" which looks like a more natural place to do
>> this optimization, although it's hard to get range info there. I will do
>> further explore on this.
> Sounds good.
>
> jeff

Jeff,

  Sorry for the slow response.

  I found we can't integrate this transformation into
  "expand_doubleword_shift" as it's at quite deep layer in the call
  stack, when we reach there, we have lost some high level info.

  So I am keeping this transformaion still in expr.c, and attachment is
  the updated version of this patch. The main changes are:

  * Figuring out whether the shift source is coming from sign extension
    by checking SSA_NAME_DEF_STMT instead of deducing from tree range
    info. I fell checking the gimple statement is more reliable and
    straigtforward.

  * For the pseudo register overlaping issue, given:

      RTX target = TREE source << TREE amount

    I was trying to make sure those two SSA_NAME won't get the same
    pseudo register by comparing their assigned partition using
    var_to_partition, but I can't get the associated tree representation
    for "target" which is RTX.

    Then I just expand "source" and compare the register number with
    "target".

    But I found the simplest way maybe just reorder

      low_part (target) = low_part (source) << amount
      high_part (target) = low_part (source) >> amount1

    to:

      high_part (target) = low_part (source) >> amount1
      low_part (target) = low_part (source) << amount

    then, even target and source share the same pseudo register, we can
    still get what we want, as we are operating on their subreg.

  * Added checking on the result value of expand_variable_shift, if it's
    not the same as "target" then call emit_move_insn to do that.

  How is the re-worked patch looks to you?

  x86 bootstrap OK, regression OK. aarch64 bootstrap OK.

2015-08-14  Jiong.Wang  <jiong.wang@arm.com>

gcc/
  * expr.c (expand_expr_real_2): Check tree format to optimize
  LSHIFT_EXPR expand.

gcc/testsuite
  * gcc.dg/wide_shift_64_1.c: New testcase.
  * gcc.dg/wide_shift_128_1.c: Likewise.
  * gcc.target/aarch64/ashlti3_1.c: Likewise.
  * gcc.target/arm/ashldisi_1.c: Likewise.
--
Regards,
Jiong


[-- Attachment #2: k.patch --]
[-- Type: text/x-diff, Size: 7508 bytes --]

diff --git a/gcc/expr.c b/gcc/expr.c
index 31b4573..8a25e98 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8836,23 +8836,90 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 
     case LSHIFT_EXPR:
     case RSHIFT_EXPR:
-      /* If this is a fixed-point operation, then we cannot use the code
-	 below because "expand_shift" doesn't support sat/no-sat fixed-point
-         shifts.   */
-      if (ALL_FIXED_POINT_MODE_P (mode))
-	goto binop;
+      {
+	/* If this is a fixed-point operation, then we cannot use the code
+	   below because "expand_shift" doesn't support sat/no-sat fixed-point
+	   shifts.  */
+	if (ALL_FIXED_POINT_MODE_P (mode))
+	  goto binop;
+
+	if (! safe_from_p (subtarget, treeop1, 1))
+	  subtarget = 0;
+	if (modifier == EXPAND_STACK_PARM)
+	  target = 0;
+	op0 = expand_expr (treeop0, subtarget,
+			   VOIDmode, EXPAND_NORMAL);
+	/* Left shfit optimization:
+
+	   If mode == GET_MODE_WIDER_MODE (word_mode), then normally there isn't
+	   native instruction to support this wide mode left shift.  Given below
+	   example:
+
+	    Type A = (Type) B  << C
+
+            |<           T	    >|
+            |   high     |   low     |
+
+                         |<- size  ->|
+
+	   By checking tree shape, if operand B is coming from signed extension,
+	   then the left shift operand can be simplified into:
+
+	      1. high = low >> (size - C);
+	      2. low = low << C;
+	  */
+
+	temp = NULL_RTX;
+	if (code == LSHIFT_EXPR
+	    && target
+	    && REG_P (target)
+	    && ! unsignedp
+	    && mode == GET_MODE_WIDER_MODE (word_mode)
+	    && ! have_insn_for (ASHIFT, mode)
+	    && TREE_CONSTANT (treeop1)
+	    && TREE_CODE (treeop0) == SSA_NAME)
+	  {
+	    gimple def = SSA_NAME_DEF_STMT (treeop0);
+	    if (is_gimple_assign (def)
+		&& gimple_assign_rhs_code (def) == NOP_EXPR)
+	      {
+		machine_mode rmode = TYPE_MODE
+		  (TREE_TYPE (gimple_assign_rhs1 (def)));
 
-      if (! safe_from_p (subtarget, treeop1, 1))
-	subtarget = 0;
-      if (modifier == EXPAND_STACK_PARM)
-	target = 0;
-      op0 = expand_expr (treeop0, subtarget,
-			 VOIDmode, EXPAND_NORMAL);
-      temp = expand_variable_shift (code, mode, op0, treeop1, target,
-				    unsignedp);
-      if (code == LSHIFT_EXPR)
-	temp = REDUCE_BIT_FIELD (temp);
-      return temp;
+		if (GET_MODE_SIZE (rmode) < GET_MODE_SIZE (mode)
+		    && ((TREE_INT_CST_LOW (treeop1) + GET_MODE_BITSIZE (rmode))
+			>= GET_MODE_BITSIZE (word_mode)))
+		  {
+		    rtx low = simplify_gen_subreg (word_mode, op0, mode, 0);
+		    rtx tlow = simplify_gen_subreg (word_mode, target, mode, 0);
+		    rtx thigh = simplify_gen_subreg (word_mode, target, mode,
+						     UNITS_PER_WORD);
+		    HOST_WIDE_INT ramount = (BITS_PER_WORD
+					     - TREE_INT_CST_LOW (treeop1));
+		    tree rshift = build_int_cst (TREE_TYPE (treeop1), ramount);
+
+		    temp = expand_variable_shift (code, word_mode, low, treeop1,
+						  tlow, unsignedp);
+		    if (temp != tlow)
+		      emit_move_insn (tlow, temp);
+
+		    temp = expand_variable_shift (RSHIFT_EXPR, word_mode, low,
+						  rshift, thigh, unsignedp);
+		    if (temp != thigh)
+		      emit_move_insn (thigh, temp);
+
+		    temp = target ;
+		  }
+	      }
+	  }
+
+	if (temp == NULL_RTX)
+	  temp = expand_variable_shift (code, mode, op0, treeop1, target,
+					unsignedp);
+	if (code == LSHIFT_EXPR)
+	  temp = REDUCE_BIT_FIELD (temp);
+	return temp;
+      }
 
       /* Could determine the answer when only additive constants differ.  Also,
 	 the addition of one can be handled by changing the condition.  */
diff --git a/gcc/testsuite/gcc.dg/wide-shift-128.c b/gcc/testsuite/gcc.dg/wide-shift-128.c
new file mode 100644
index 0000000..9b62715
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/wide-shift-128.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target aarch64*-*-* mips64*-*-* sparc64*-*-* } } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O2 -fdump-rtl-combine" } */
+
+__int128_t
+load2 (int data)
+{
+    return (__int128_t) data << 50;
+}
+
+/* { dg-final { scan-rtl-dump-not "ior" "combine" } } */
+/* { dg-final { cleanup-rtl-dump "combine" } } */
diff --git a/gcc/testsuite/gcc.dg/wide-shift-64.c b/gcc/testsuite/gcc.dg/wide-shift-64.c
new file mode 100644
index 0000000..5bc278f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/wide-shift-64.c
@@ -0,0 +1,10 @@
+/* { dg-do compile { target arm*-*-* mips*-*-* sparc*-*-* } } */
+/* { dg-options "-O2 -fdump-rtl-combine" } */
+
+long long
+load1 (int data)
+{
+    return (long long) data << 12;
+}
+
+/* { dg-final { scan-rtl-dump-not "ior" "combine" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/ashltidisi.c b/gcc/testsuite/gcc.target/aarch64/ashltidisi.c
new file mode 100644
index 0000000..aeb2a24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ashltidisi.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+
+extern void abort (void);
+
+#define GEN_TEST_CASE(x, y, z)\
+__uint128_t __attribute__ ((noinline))\
+ushift_##x##_##z (unsigned y data)\
+{\
+  return (__uint128_t) data << x;\
+}\
+__int128_t __attribute__ ((noinline)) \
+shift_##x##_##z (y data) \
+{\
+  return (__int128_t) data << x;\
+}
+
+GEN_TEST_CASE (53, int, i)
+GEN_TEST_CASE (3, long long, ll)
+GEN_TEST_CASE (13, long long, ll)
+GEN_TEST_CASE (53, long long, ll)
+
+int
+main (int argc, char **argv)
+{
+
+#define SHIFT_CHECK(x, y, z, p) \
+	if (ushift_##y##_##p (x)\
+	    != ((__uint128_t) (unsigned z) x << y)) \
+	  abort ();\
+	if (shift_##y##_##p (x)\
+	    != ((__uint128_t) (signed z) x << y)) \
+	  abort ();
+
+  SHIFT_CHECK (0x12345678, 53, int, i)
+  SHIFT_CHECK (0xcafecafe, 53, int, i)
+
+  SHIFT_CHECK (0x1234567890abcdefLL, 3, long long, ll)
+  SHIFT_CHECK (0x1234567890abcdefLL, 13, long long, ll)
+  SHIFT_CHECK (0x1234567890abcdefLL, 53, long long, ll)
+  SHIFT_CHECK (0xcafecafedeaddeadLL, 3, long long, ll)
+  SHIFT_CHECK (0xcafecafedeaddeadLL, 13, long long, ll)
+  SHIFT_CHECK (0xcafecafedeaddeadLL, 53, long long, ll)
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "asr" 4 } } */
diff --git a/gcc/testsuite/gcc.target/arm/ashldisi.c b/gcc/testsuite/gcc.target/arm/ashldisi.c
new file mode 100644
index 0000000..00dc06e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/ashldisi.c
@@ -0,0 +1,44 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+
+extern void abort (void);
+
+#define GEN_TEST_CASE(x)\
+unsigned long long __attribute__ ((noinline))\
+ushift_ ## x (unsigned int data)\
+{\
+  return (unsigned long long) data << x;\
+}\
+long long __attribute__ ((noinline)) \
+shift_ ## x (int data) \
+{\
+  return (long long) data << x;\
+}
+
+GEN_TEST_CASE (3)
+GEN_TEST_CASE (23)
+GEN_TEST_CASE (30)
+int
+main (int argc, char **argv)
+{
+
+#define SHIFT_CHECK(x, y) \
+	if (ushift_ ## y (x)\
+	    != ((unsigned long long) (unsigned) x << y)) \
+	  abort (); \
+	if (shift_ ## y (x)\
+	    != ((long long) (signed) x << y)) \
+	  abort ();
+
+  SHIFT_CHECK (0x12345678, 3)
+  SHIFT_CHECK (0xcafecafe, 3)
+  SHIFT_CHECK (0x12345678, 23)
+  SHIFT_CHECK (0xcafecafe, 23)
+  SHIFT_CHECK (0x12345678, 30)
+  SHIFT_CHECK (0xcafecafe, 30)
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "asr" 3 } } */
+/* { dg-final { cleanup-saved-temps } } */

  reply	other threads:[~2015-08-14 17:43 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-16 11:04 Jiong Wang
2015-04-24  2:23 ` Jeff Law
2015-04-27 20:58   ` Jiong Wang
2015-04-29  3:53     ` Jeff Law
2015-04-29 22:14       ` Jiong Wang
2015-04-29 22:55         ` Jeff Law
2015-08-14 17:55           ` Jiong Wang [this message]
2015-08-14 20:30             ` Jeff Law
2015-08-14 22:24               ` Jiong Wang
2015-08-18 13:22                 ` Jiong Wang
2015-08-18 17:47                   ` Jeff Law
2015-08-19 23:05                     ` Jiong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=n99oai9epez.fsf@arm.com \
    --to=jiong.wang@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).