public inbox for gcc-patches@gcc.gnu.org
From: Sam Tebbs <sam.tebbs@arm.com>
To: Sudakshina Das <sudi.das@arm.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Marcus Shawcroft <marcus.shawcroft@arm.com>, nd <nd@arm.com>,
	Richard Earnshaw <richard.earnshaw@arm.com>,
	James Greenhalgh <james.greenhalgh@arm.com>
Subject: Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
Date: Thu, 26 Jul 2018 16:52:00 -0000
Message-ID: <9e7ccf5b-55b9-543d-1f9e-f9ab36e93376@arm.com>
In-Reply-To: <74da7cb6-485b-3ce4-7901-d10cb6f1ed95@arm.com>

[-- Attachment #1: Type: text/plain, Size: 4978 bytes --]



On 07/25/2018 07:08 PM, Sudakshina Das wrote:
> Hi Sam
>
> On 25/07/18 14:08, Sam Tebbs wrote:
>> On 07/23/2018 05:01 PM, Sudakshina Das wrote:
>>> Hi Sam
>>>
>>>
>>> On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
>>>> Hi all,
>>>>
>>>> This patch extends the aarch64_get_lane_zero_extendsi instruction 
>>>> definition to
>>>> also cover DI mode. This prevents a redundant AND instruction from 
>>>> being
>>>> generated due to the pattern failing to be matched.
>>>>
>>>> Example:
>>>>
>>>> typedef char v16qi __attribute__ ((vector_size (16)));
>>>>
>>>> unsigned long long
>>>> foo (v16qi a)
>>>> {
>>>>   return a[0];
>>>> }
>>>>
>>>> Previously generated:
>>>>
>>>> foo:
>>>>         umov    w0, v0.b[0]
>>>>         and     x0, x0, 255
>>>>         ret
>>>>
>>>> And now generates:
>>>>
>>>> foo:
>>>>         umov    w0, v0.b[0]
>>>>         ret
>>>>
>>>> Bootstrapped on aarch64-none-linux-gnu and tested on 
>>>> aarch64-none-elf with no
>>>> regressions.
>>>>
>>>> gcc/
>>>> 2018-07-23  Sam Tebbs <sam.tebbs@arm.com>
>>>>
>>>>         * config/aarch64/aarch64-simd.md
>>>>     (*aarch64_get_lane_zero_extendsi<mode>):
>>>>         Rename to...
>>>> (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
>>>>         Use GPI iterator instead of SI mode.
>>>>
>>>> gcc/testsuite
>>>> 2018-07-23  Sam Tebbs <sam.tebbs@arm.com>
>>>>
>>>>         * gcc.target/aarch64/extract_zero_extend.c: New file
>>>>
>>> You will need an approval from a maintainer, but I would only add 
>>> one request to this:
>>>
>>> diff --git a/gcc/config/aarch64/aarch64-simd.md 
>>> b/gcc/config/aarch64/aarch64-simd.md
>>> index 89e38e6..15fb661 100644
>>> --- a/gcc/config/aarch64/aarch64-simd.md
>>> +++ b/gcc/config/aarch64/aarch64-simd.md
>>> @@ -3032,15 +3032,16 @@
>>>    [(set_attr "type" "neon_to_gp<q>")]
>>>  )
>>>
>>> -(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
>>> -  [(set (match_operand:SI 0 "register_operand" "=r")
>>> -    (zero_extend:SI
>>> +(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
>>> +  [(set (match_operand:GPI 0 "register_operand" "=r")
>>> +    (zero_extend:GPI
>>>
>>> Since you are adding 4 new patterns with this change, could you add
>>> more cases in your test as well to make sure you have coverage for 
>>> each of them.
>>>
>>> Thanks
>>> Sudi
>>
>> Hi Sudi,
>>
>> Thanks for the feedback. Here is an updated patch that adds more 
>> testcases to cover the patterns generated by the different mode 
>> combinations. The changelog and description from my original email 
>> still apply.
>>
>
> Thanks for making the changes and adding more test cases. However, I
> see that you are only covering 2 of the 4 new
> *aarch64_get_lane_zero_extenddi<> patterns. The
> *aarch64_get_lane_zero_extendsi<> patterns already existed; I don't mind
> those tests. I would just ask you to add the other two new patterns
> as well. Also, since different variants of the pattern generate the
> same instruction (foo_16qi and foo_8qi, for example, both produce the
> same output), I would suggest adding -fdump-rtl-final (or any relevant
> rtl dump) to the dg-options and using scan-rtl-dump to check for the
> pattern name. Something like:
> /* { dg-do compile } */
> /* { dg-options "-O3 -fdump-rtl-final" } */
> ...
> ...
> /* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" 
> "final" } } */
>
> Thanks
> Sudi

Hi Sudi,

Thanks again. Here is an updated patch that adds four more tests, so all
eight generated patterns are now covered.
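
To make the count of eight concrete, here is a rough sketch (not part of
the patch) that spells out the names the renamed pattern expands to. It
assumes GPI iterates over SI and DI and VDQQH over V8QI, V16QI, V4HI and
V8HI, which matches the names scanned for in the testcase; pattern_name
is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed iterator contents, written in the lowercase form that appears
   in generated pattern names.  */
static const char *const gpi_modes[] = { "si", "di" };
static const char *const vdqqh_modes[] = { "v8qi", "v16qi", "v4hi", "v8hi" };

#define N_GPI (sizeof gpi_modes / sizeof gpi_modes[0])
#define N_VDQQH (sizeof vdqqh_modes / sizeof vdqqh_modes[0])

/* Hypothetical helper: write the pattern name for GPI index G and VDQQH
   index V into BUF and return snprintf's result.  */
static int
pattern_name (char *buf, size_t len, size_t g, size_t v)
{
  assert (g < N_GPI && v < N_VDQQH);
  return snprintf (buf, len, "aarch64_get_lane_zero_extend%s%s",
		   gpi_modes[g], vdqqh_modes[v]);
}
```

Calling pattern_name for every (g, v) pair yields 2 x 4 = 8 names; the
four with the "di" infix are the ones that are new with this patch.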

Below is the updated changelog:

gcc/
2018-07-26  Sam Tebbs  <sam.tebbs@arm.com>

         * config/aarch64/aarch64-simd.md
         (*aarch64_get_lane_zero_extendsi<mode>): Rename to...
         (*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>): ... This.
         Use GPI iterator instead of SI mode.

gcc/testsuite
2018-07-26  Sam Tebbs  <sam.tebbs@arm.com>

         * gcc.target/aarch64/extract_zero_extend.c: New file
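
As a reminder of why the AND in the original example is redundant, here
is a small C model (an illustration only, not part of the patch). It
assumes the usual AArch64 behaviour that a write to a W register clears
the upper 32 bits of the corresponding X register; the v16qu typedef and
both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned char v16qu __attribute__ ((vector_size (16)));

/* Models "umov w0, v0.b[lane]": the byte lane is zero-extended into the
   32-bit W0, and writing W0 clears bits 63:32 of X0.  */
static uint64_t
model_umov_b (v16qu v, unsigned lane)
{
  assert (lane < 16);
  uint32_t w0 = v[lane];	/* Zero-extend the byte to 32 bits.  */
  uint64_t x0 = w0;		/* Upper half of X0 is already zero.  */
  return x0;
}

/* Models the extra "and x0, x0, 255" emitted before the patch.  */
static uint64_t
model_and_255 (uint64_t x0)
{
  return x0 & 255;
}
```

For any input, model_and_255 (model_umov_b (v, lane)) equals
model_umov_b (v, lane), so the mask changes nothing once the DI-mode
pattern matches.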

>
>>>
>>>        (vec_select:<VEL>
>>>          (match_operand:VDQQH 1 "register_operand" "w")
>>>          (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
>>>    "TARGET_SIMD"
>>>    {
>>> -    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL 
>>> (operands[2]));
>>> +    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
>>> +                       INTVAL (operands[2]));
>>>      return "umov\\t%w0, %1.<Vetype>[%2]";
>>>    }
>>>    [(set_attr "type" "neon_to_gp<q>")]
>>
>


[-- Attachment #2: latest.patch --]
[-- Type: text/x-patch, Size: 3430 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index f1784d72e55c412d076de43f2f7aad4632d55ecb..e92a3b49c65e84d2a16a2a480c359a0b4d8fa3e3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3033,15 +3033,16 @@
   [(set_attr "type" "neon_to_gp<q>")]
 )
 
-(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(zero_extend:SI
+(define_insn "*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(zero_extend:GPI
 	  (vec_select:<VEL>
 	    (match_operand:VDQQH 1 "register_operand" "w")
 	    (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_SIMD"
   {
-    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+					   INTVAL (operands[2]));
     return "umov\\t%w0, %1.<Vetype>[%2]";
   }
   [(set_attr "type" "neon_to_gp<q>")]
diff --git a/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
new file mode 100644
index 0000000000000000000000000000000000000000..a294b261909a1d67ab339c929f2609dcda01c067
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
@@ -0,0 +1,81 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-final" } */
+
+/* Tests div16qi.  */
+typedef unsigned char div16qi __attribute__ ((vector_size (16)));
+/* Tests div8qi.  */
+typedef unsigned char div8qi __attribute__ ((vector_size (8)));
+/* Tests div8hi.  */
+typedef unsigned short div8hi __attribute__ ((vector_size (16)));
+/* Tests div4hi.  */
+typedef unsigned short div4hi __attribute__ ((vector_size (8)));
+
+/* Tests siv16qi.  */
+typedef unsigned char siv16qi __attribute__ ((vector_size (16)));
+/* Tests siv8qi.  */
+typedef unsigned char siv8qi __attribute__ ((vector_size (8)));
+/* Tests siv8hi.  */
+typedef unsigned short siv8hi __attribute__ ((vector_size (16)));
+/* Tests siv4hi.  */
+typedef unsigned short siv4hi __attribute__ ((vector_size (8)));
+
+
+unsigned long long
+foo_div16qi (div16qi a)
+{
+  return a[0];
+}
+
+unsigned long long
+foo_div8qi (div8qi a)
+{
+  return a[0];
+}
+
+unsigned long long
+foo_div8hi (div8hi a)
+{
+  return a[0];
+}
+
+unsigned long long
+foo_div4hi (div4hi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_siv16qi (siv16qi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_siv8qi (siv8qi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_siv8hi (siv8hi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_siv4hi (siv4hi a)
+{
+  return a[0];
+}
+
+/* { dg-final { scan-assembler-times "umov\\t" 8 } } */
+/* { dg-final { scan-assembler-not "and\\t" } } */
+
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv8qi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv8hi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv4hi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extendsiv16qi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extendsiv8qi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extendsiv8hi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extendsiv4hi" "final" } } */
