From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-482368-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 87008 invoked by alias); 25 Jul 2018 18:08:31 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 86998 invoked by uid 89); 25 Jul 2018 18:08:31 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-25.8 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.2 spammy=
X-HELO: EUR04-DB3-obe.outbound.protection.outlook.com
Received: from mail-eopbgr60059.outbound.protection.outlook.com (HELO EUR04-DB3-obe.outbound.protection.outlook.com) (40.107.6.59) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 25 Jul 2018 18:08:28 +0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector1-arm-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=MrrmaTgTEvYmtBz4vNfecxShavvofSbzGkOUG9RxrS0=; b=hmWfHNjKEDm+48VjO9uO7WyI9q352MQsVr1Bf3sj4IElULMhYQdFwSZkfM+E6hT46OxQe5EbbbGgNgMCd8kK6mQLO7vBgOcxm3uEj2fv/pcUaLFTKsAUeSTFxr/mCm/tWMq3OfxjiR5DwcITh8ZT37g9vhROXtklBsN7I4cXWRI=
Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=Sudi.Das@arm.com;
Received: from [10.2.206.246] (217.140.96.140) by DB6PR0801MB1717.eurprd08.prod.outlook.com (2603:10a6:4:3a::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.973.16; Wed, 25 Jul 2018 18:08:25 +0000
Subject: Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
To: Sam Tebbs <sam.tebbs@arm.com>, "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Marcus Shawcroft <marcus.shawcroft@arm.com>, nd <nd@arm.com>, Richard Earnshaw <richard.earnshaw@arm.com>, James Greenhalgh <james.greenhalgh@arm.com>
References: <953dbdd2-e20c-4587-3e0d-ad1a65fc93c6@arm.com> <85b58ddb-3da2-67c6-1514-e308201191d3@arm.com> <b9e387fe-471d-1508-70cd-03fd1c3fcacd@arm.com>
From: Sudakshina Das <sudi.das@arm.com>
Message-ID: <74da7cb6-485b-3ce4-7901-d10cb6f1ed95@arm.com>
Date: Wed, 25 Jul 2018 18:08:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0
MIME-Version: 1.0
In-Reply-To: <b9e387fe-471d-1508-70cd-03fd1c3fcacd@arm.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Return-Path: sudi.das@arm.com
Received-SPF: None (protection.outlook.com: arm.com does not designate permitted sender hosts)
X-IsSubscribed: yes
X-SW-Source: 2018-07/txt/msg01565.txt.bz2

Hi Sam

On 25/07/18 14:08, Sam Tebbs wrote:
> On 07/23/2018 05:01 PM, Sudakshina Das wrote:
>> Hi Sam
>>
>>
>> On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
>>> Hi all,
>>>
>>> This patch extends the aarch64_get_lane_zero_extendsi instruction 
>>> definition to
>>> also cover DI mode. This prevents a redundant AND instruction from being
>>> generated due to the pattern failing to be matched.
>>>
>>> Example:
>>>
>>> typedef char v16qi __attribute__ ((vector_size (16)));
>>>
>>> unsigned long long
>>> foo (v16qi a)
>>> {
>>> Â  return a[0];
>>> }
>>>
>>> Previously generated:
>>>
>>> foo:
>>> Â Â Â Â Â Â Â  umovÂ Â Â  w0, v0.b[0]
>>> Â Â Â Â Â Â Â  andÂ Â Â Â  x0, x0, 255
>>> Â Â Â Â Â Â Â  ret
>>>
>>> And now generates:
>>>
>>> foo:
>>> Â Â Â Â Â Â Â  umovÂ Â Â  w0, v0.b[0]
>>> Â Â Â Â Â Â Â  ret
>>>
>>> Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf 
>>> with no
>>> regressions.
>>>
>>> gcc/
>>> 2018-07-23Â  Sam Tebbs <sam.tebbs@arm.com>
>>>
>>> Â Â Â Â Â Â Â  * config/aarch64/aarch64-simd.md
>>> Â Â Â  (*aarch64_get_lane_zero_extendsi<mode>):
>>> Â Â Â Â Â Â Â  Rename to...
>>> (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
>>> Â Â Â Â Â Â Â  Use GPI iterator instead of SI mode.
>>>
>>> gcc/testsuite
>>> 2018-07-23Â  Sam Tebbs <sam.tebbs@arm.com>
>>>
>>> Â Â Â Â Â Â Â  * gcc.target/aarch64/extract_zero_extend.c: New file
>>>
>> You will need an approval from a maintainer, but I would only add one 
>> request to this:
>>
>> diff --git a/gcc/config/aarch64/aarch64-simd.md 
>> b/gcc/config/aarch64/aarch64-simd.md
>> index 89e38e6..15fb661 100644
>> --- a/gcc/config/aarch64/aarch64-simd.md
>> +++ b/gcc/config/aarch64/aarch64-simd.md
>> @@ -3032,15 +3032,16 @@
>> Â Â  [(set_attr "type" "neon_to_gp<q>")]
>> Â )
>>
>> -(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
>> -Â  [(set (match_operand:SI 0 "register_operand" "=r")
>> -Â Â Â  (zero_extend:SI
>> +(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
>> +Â  [(set (match_operand:GPI 0 "register_operand" "=r")
>> +Â Â Â  (zero_extend:GPI
>>
>> Since you are adding 4 new patterns with this change, could you add
>> more cases in your test as well to make sure you have coverage for 
>> each of them.
>>
>> Thanks
>> Sudi
> 
> Hi Sudi,
> 
> Thanks for the feedback. Here is an updated patch that adds more 
> testcases to cover the patterns generated by the different mode 
> combinations. The changelog and description from my original email still 
> apply.
> 

Thanks for making the changes and adding more test cases. I do however
see that you are only covering 2 out of 4 new
*aarch64_get_lane_zero_extenddi<> patterns. The
*aarch64_get_lane_zero_extendsi<> were already existing. I don't mind
those tests. I would just ask you to add the other two new patterns
as well. Also since the different versions of the instruction generate
same instructions (like foo_16qi and foo_8qi both give out the same
instruction), I would suggest using a -fdump-rtl-final (or any relevant
rtl dump) with the dg-options and using a scan-rtl-dump to scan the
pattern name. Something like:
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-rtl-final" } */
...
...
/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" 
"final" } } */

Thanks
Sudi

>>
>> Â Â Â Â  Â  (vec_select:<VEL>
>> Â Â Â Â  Â Â Â  (match_operand:VDQQH 1 "register_operand" "w")
>> Â Â Â Â  Â Â Â  (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
>> Â Â  "TARGET_SIMD"
>> Â Â  {
>> -Â Â Â  operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL 
>> (operands[2]));
>> +Â Â Â  operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
>> +Â Â Â  Â Â Â  Â Â Â  Â Â Â  Â Â Â  Â Â  INTVAL (operands[2]));
>> Â Â Â Â  return "umov\\t%w0, %1.<Vetype>[%2]";
>> Â Â  }
>> Â Â  [(set_attr "type" "neon_to_gp<q>")]
>