Re: [Patch] PR rtl-optimization/87763 - generate more bfi instructions on aarch64

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Steve Ellcey <sellcey@marvell.com>
To: "Marcus.Shawcroft@arm.com" <Marcus.Shawcroft@arm.com>,
	       "james.greenhalgh@arm.com" <james.greenhalgh@arm.com>,
	       "jakub@redhat.com"	<jakub@redhat.com>,
	       "Richard.Earnshaw@arm.com" <Richard.Earnshaw@arm.com>,
	       "richard.sandiford@arm.com" <richard.sandiford@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [Patch]  PR rtl-optimization/87763 - generate more bfi instructions on aarch64
Date: Mon, 04 Feb 2019 16:55:00 -0000	[thread overview]
Message-ID: <a7ae649798a07e5beb0f4a35ea860a0b4bccaa52.camel@marvell.com> (raw)
In-Reply-To: <77a1882e51dc4d4c81cb2a9b8d9994eb14b050dd.camel@marvell.com>

[-- Attachment #1: Type: text/plain, Size: 2341 bytes --]

Ping.  And adding Aarch64 Maintainers.


On Mon, 2019-01-28 at 16:11 -0800, Steve Ellcey wrote:
> On Sat, 2019-01-26 at 00:00 +0100, Jakub Jelinek wrote:
> > 
> > > +  /* Verify that there is no overlap in what bits are set in the
> > > two masks.  */
> > > +  if ((m1 + m2 + 1) != 0)
> > > +    return false;
> > 
> > Wouldn't that be clearer to test
> >   if (m1 + m2 != HOST_WIDE_INT_1U)
> >     return false;
> > ?
> > You IMHO also should test
> >   if ((m1 & m2) != 0)
> >     return false;
> 
> I think you mean HOST_WIDE_INT_M1U, not HOST_WIDE_INT_1U.  That does
> seem clearer and I made that change as well as adding the second
> test.
> 
> > > 
> > > +  t = popcount_hwi (m2);
> > > +  t = (1 << t) - 1;
> > 
> > This should be (HOST_WIDE_INT_1U << t), otherwise if popcount of m2
> > is
> > > = 32, there will be UB.
> 
> Fixed.
> 
> > As mentioned in rs6000.md, I believe you also need a similar
> > pattern where
> > the two ANDs are swapped, because they have the same priority.
> 
> I fixed the long lines in aarch64.md and I added a second pattern for
> the *aarch64_bfi<GPI:mode>4_noshift pattern, swapping the order of
> the
> IOR operands.  I did not swap the AND operands, I assume the compiler
> would ensure that the register was always before the constant or that
> it would check both orderings.
> 
> I tried adding a second version of *aarch64_bfi<GPI:mode>5_shift as
> well but when I tested it, it never got used during a bootstrap build
> or a GCC test run so I decided it was not needed.
> 
> Tested on aarch64 with a bootstrap build and a GCC testsuite run.
> OK to checkin?
> 
> 
> Steve Ellcey
> sellcey@marvell.com
> 
> 
> 2018-01-28  Steve Ellcey  <sellcey@marvell.com>
> 
> 	PR rtl-optimization/87763
> 	* config/aarch64/aarch64-protos.h
> (aarch64_masks_and_shift_for_bfi_p):
> 	New prototype.
> 	* config/aarch64/aarch64.c (aarch64_masks_and_shift_for_bfi_p):
> 	New function.
> 	* config/aarch64/aarch64.md (*aarch64_bfi<GPI:mode>5_shift):
> 	New instruction.
> 	(*aarch64_bfi<GPI:mode>4_noshift): Ditto.
> 	(*aarch64_bfi<GPI:mode>4_noshift_alt): Ditto.
> 
> 
> 2018-01-28  Steve Ellcey  <sellcey@marvell.com>
> 
> 	PR rtl-optimization/87763
> 	* gcc.target/aarch64/combine_bfxil.c: Change some bfxil checks
> 	to bfi.
> 
> 

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: combine.patch --]
[-- Type: text/x-patch; name="combine.patch", Size: 5357 bytes --]

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index b035e35f33b..b6c0d0a8eb6 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -429,6 +429,9 @@ bool aarch64_label_mentioned_p (rtx);
 void aarch64_declare_function_name (FILE *, const char*, tree);
 bool aarch64_legitimate_pic_operand_p (rtx);
 bool aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode, rtx, rtx);
+bool aarch64_masks_and_shift_for_bfi_p (scalar_int_mode, unsigned HOST_WIDE_INT,
+					unsigned HOST_WIDE_INT,
+					unsigned HOST_WIDE_INT);
 bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
 bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
 opt_machine_mode aarch64_sve_pred_mode (unsigned int);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d7c453cdad0..9a3080b5db8 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9330,6 +9330,41 @@ aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode mode, rtx mask,
 	     & ((HOST_WIDE_INT_1U << INTVAL (shft_amnt)) - 1)) == 0;
 }
 
+/* Return true if the masks and a shift amount from an RTX of the form
+   ((x & MASK1) | ((y << SHIFT_AMNT) & MASK2)) are valid to combine into
+   a BFI instruction of mode MODE.  See *arch64_bfi patterns.  */
+
+bool
+aarch64_masks_and_shift_for_bfi_p (scalar_int_mode mode,
+				   unsigned HOST_WIDE_INT mask1,
+				   unsigned HOST_WIDE_INT shft_amnt,
+				   unsigned HOST_WIDE_INT mask2)
+{
+  unsigned HOST_WIDE_INT t;
+
+  /* Verify that there is no overlap in what bits are set in the two masks.  */
+  if ((mask1 + mask2) != HOST_WIDE_INT_M1U)
+    return false;
+  if ((mask1 & mask2) != 0)
+    return false;
+
+  /* Verify that the shift amount is less than the mode size.  */
+  if (shft_amnt >= GET_MODE_BITSIZE (mode))
+    return false;
+
+  /* Verify that the mask being shifted is contiguous and would be in the
+     least significant bits after shifting by creating a mask 't' based on
+     the number of bits set in mask2 and the shift amount for mask2 and
+     comparing that to the actual mask2.  */
+  t = popcount_hwi (mask2);
+  t = (HOST_WIDE_INT_1U << t) - 1;
+  t = t << shft_amnt;
+  if (t != mask2)
+    return false;
+  
+  return true;
+}
+
 /* Calculate the cost of calculating X, storing it in *COST.  Result
    is true if the total cost of the operation has now been calculated.  */
 static bool
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b7f6fe0f135..6b5339092ba 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5476,6 +5476,56 @@
   [(set_attr "type" "bfm")]
 )
 
+;;  Match a bfi instruction where the shift of OP3 means that we are
+;;  actually copying the least significant bits of OP3 into OP0 by way
+;;  of the AND masks and the IOR instruction.  A similar instruction
+;;  with the two parts of the IOR swapped around was never triggered
+;;  in a bootstrap build and test of GCC so it was not included.
+
+(define_insn "*aarch64_bfi<GPI:mode>5_shift"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0")
+                          (match_operand:GPI 2 "const_int_operand" "n"))
+                 (and:GPI (ashift:GPI
+                           (match_operand:GPI 3 "register_operand" "r")
+                           (match_operand:GPI 4 "aarch64_simd_shift_imm_<mode>" "n"))
+                          (match_operand:GPI 5 "const_int_operand" "n"))))]
+  "aarch64_masks_and_shift_for_bfi_p (<MODE>mode, UINTVAL (operands[2]),
+				      UINTVAL (operands[4]),
+				      UINTVAL(operands[5]))"
+  "bfi\t%<GPI:w>0, %<GPI:w>3, %4, %P5"
+  [(set_attr "type" "bfm")]
+)
+
+;; Like the above instruction but with no shifting, we are just copying
+;; the least significant bits of OP3 to OP0.  In this case we do need
+;; two versions of the instruction to handle different checks on the
+;; constant values.
+
+(define_insn "*aarch64_bfi<GPI:mode>4_noshift"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0")
+                          (match_operand:GPI 2 "const_int_operand" "n"))
+                 (and:GPI (match_operand:GPI 3 "register_operand" "r")
+                          (match_operand:GPI 4 "const_int_operand" "n"))))]
+  "aarch64_masks_and_shift_for_bfi_p (<MODE>mode, UINTVAL (operands[2]), 0,
+				      UINTVAL (operands[4]))"
+  "bfi\t%<GPI:w>0, %<GPI:w>3, 0, %P4"
+  [(set_attr "type" "bfm")]
+)
+
+(define_insn "*aarch64_bfi<GPI:mode>4_noshift_alt"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "r")
+                          (match_operand:GPI 2 "const_int_operand" "n"))
+                 (and:GPI (match_operand:GPI 3 "register_operand" "0")
+                          (match_operand:GPI 4 "const_int_operand" "n"))))]
+  "aarch64_masks_and_shift_for_bfi_p (<MODE>mode, UINTVAL (operands[4]), 0,
+				      UINTVAL (operands[2]))"
+  "bfi\t%<GPI:w>0, %<GPI:w>1, 0, %P2"
+  [(set_attr "type" "bfm")]
+)
+
 (define_insn "*extr_insv_lower_reg<mode>"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
 			  (match_operand 1 "const_int_operand" "n")

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: combine-test.patch --]
[-- Type: text/x-patch; name="combine-test.patch", Size: 485 bytes --]

diff --git a/gcc/testsuite/gcc.target/aarch64/combine_bfxil.c b/gcc/testsuite/gcc.target/aarch64/combine_bfxil.c
index 109f989a2f0..c3b5fc58800 100644
--- a/gcc/testsuite/gcc.target/aarch64/combine_bfxil.c
+++ b/gcc/testsuite/gcc.target/aarch64/combine_bfxil.c
@@ -114,4 +114,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "bfxil\\t" 18 } } */
+/* { dg-final { scan-assembler-times "bfxil\\t" 3 } } */
+/* { dg-final { scan-assembler-times "bfi\\t" 15 } } */

next prev parent reply	other threads:[~2019-02-04 16:55 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-25  0:11 Steve Ellcey
2019-01-25 10:40 ` Richard Earnshaw (lists)
2019-01-25 22:16   ` Steve Ellcey
2019-01-25 23:22 ` Jakub Jelinek
2019-01-29  0:22   ` Steve Ellcey
2019-01-29  0:47     ` Jakub Jelinek
2019-02-04 16:55     ` Steve Ellcey [this message]
2019-02-05 21:12 Wilco Dijkstra
2019-02-06 23:42 ` Steve Ellcey
2019-02-07 18:13   ` Wilco Dijkstra
2019-02-11 18:46     ` Steve Ellcey
2019-02-26 16:46       ` Steve Ellcey
2019-03-14 22:54         ` Steve Ellcey
2019-04-01 17:23           ` Steve Ellcey
2019-04-10 10:19             ` Richard Earnshaw (lists)
2019-04-10 20:32               ` Steve Ellcey
2019-04-11  9:04                 ` Richard Earnshaw (lists)
2019-04-11 15:14                   ` [EXT] " Steve Ellcey
2019-04-11 15:55                     ` Steve Ellcey
2019-04-11 16:01                       ` Richard Earnshaw (lists)
2019-04-11 18:50                         ` Steve Ellcey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a7ae649798a07e5beb0f4a35ea860a0b4bccaa52.camel@marvell.com \
    --to=sellcey@marvell.com \
    --cc=Marcus.Shawcroft@arm.com \
    --cc=Richard.Earnshaw@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=james.greenhalgh@arm.com \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).