From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-423074-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 42191 invoked by alias); 10 Mar 2016 16:32:30 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 42170 invoked by uid 89); 10 Mar 2016 16:32:29 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.5 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=no version=3.3.2 spammy=sk:e.menez, emenezessamsungcom, D*samsung.com, U*e.menezes
X-HELO: usmailout4.samsung.com
Received: from mailout4.w2.samsung.com (HELO usmailout4.samsung.com) (211.189.100.14) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 10 Mar 2016 16:32:19 +0000
Received: from uscpsbgm2.samsung.com (u115.gpu85.samsung.co.kr [203.254.195.115]) by usmailout4.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May  5 2014)) with ESMTP id <0O3U00HZR0LTB560@usmailout4.samsung.com> for gcc-patches@gcc.gnu.org; Thu, 10 Mar 2016 11:32:17 -0500 (EST)
Received: from ussync2.samsung.com ( [203.254.195.82])	by uscpsbgm2.samsung.com (USCPMTA) with SMTP id 61.A8.07641.191A1E65; Thu, 10 Mar 2016 11:32:17 -0500 (EST)
Received: from [172.31.207.194] ([105.140.31.10]) by ussync2.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May  5 2014)) with ESMTPA id <0O3U00J890LSGY40@ussync2.samsung.com>; Thu, 10 Mar 2016 11:32:17 -0500 (EST)
Subject: Re: [PATCH][AArch64] Replace insn to zero up DF register
To: James Greenhalgh <james.greenhalgh@arm.com>
References: <AM3PR08MB0088B55381337130367380A383C40@AM3PR08MB0088.eurprd08.prod.outlook.com> <56A94F35.8000000@samsung.com> <AM3PR08MB00887106972C8DCA02A635A283A70@AM3PR08MB0088.eurprd08.prod.outlook.com> <56D0D50E.8030802@samsung.com> <AM3PR08MB008834B01CC97B88D92C78F983BA0@AM3PR08MB0088.eurprd08.prod.outlook.com> <56D4D02C.6030309@samsung.com> <AM3PR08MB00883516D1FA6B9AB80AEA9E83BB0@AM3PR08MB0088.eurprd08.prod.outlook.com> <56D5E8B6.1090900@samsung.com> <56E0972F.3060207@samsung.com> <20160310132327.GA38844@arm.com> <56E1A061.5080901@samsung.com>
Cc: Wilco Dijkstra <Wilco.Dijkstra@arm.com>, "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>, Marcus Shawcroft <Marcus.Shawcroft@arm.com>, Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>, richard.earnshaw@arm.com
From: Evandro Menezes <e.menezes@samsung.com>
Message-id: <56E1A18F.5000006@samsung.com>
Date: Thu, 10 Mar 2016 16:32:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
MIME-version: 1.0
In-reply-to: <56E1A061.5080901@samsung.com>
Content-type: multipart/mixed; boundary=------------040907090200000706060303
X-IsSubscribed: yes
X-SW-Source: 2016-03/txt/msg00646.txt.bz2

This is a multi-part message in MIME format.
--------------040907090200000706060303
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-length: 7019

On 03/10/16 10:27, Evandro Menezes wrote:
> On 03/10/16 07:23, James Greenhalgh wrote:
>> On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:
>>> On 03/01/16 13:08, Evandro Menezes wrote:
>>>> On 03/01/16 13:02, Wilco Dijkstra wrote:
>>>>> Evandro Menezes wrote:
>>>>>> The meaning of these attributes are not clear to me.  Is there a
>>>>>> reference somewhere about which insns are FP or SIMD or neither?
>>>>> The meaning should be clear, "fp" is a floating point
>>>>> instruction, "simd" a SIMD one
>>>>> as defined in ARM-ARM.
>>>>>
>>>>>> Indeed, I had to add the Y for the f_mcr insn to match it with 
>>>>>> nosimd.
>>>>>> However, I didn't feel that it should be moved to the right, 
>>>>>> since it's
>>>>>> already disparaged.  Am I missing something detail?
>>>>> It might not matter for this specific case, but I have seen
>>>>> reload forcing the very
>>>>> first alternative without looking at any costs or preferences -
>>>>> as long as it is legal.
>>>>> This suggests we need to order alternatives from most preferred
>>>>> alternative to least
>>>>> preferred one.
>>>>>
>>>>> I think it is good enough for commit, James?
>>>> Methinks that my issue with those attributes is that I'm not as
>>>> fluent in AArch64 as I'd like to be.
>>>>
>>>> Please, feel free to edit the patch changing the order then.
>>>     Replace insn to zero up SIMD registers
>>>
>>>     gcc/
>>>              * config/aarch64/aarch64.md
>>>              (*movhf_aarch64): Add "movi %0, #0" to zero up register.
>>>              (*movsf_aarch64): Likewise and add "simd" and "fp" 
>>> attributes.
>>>              (*movdf_aarch64): Likewise.
>>>
>>> Swapped the order of the constraints to favor MOVI.
>>>
>>> Just say the word...
>> I'm wondering whether this is appropriate for GCC6 now that we are so 
>> late
>> in the development cycle. Additionally, I have some comments on your 
>> patch:
>>
>>> diff --git a/gcc/config/aarch64/aarch64.md 
>>> b/gcc/config/aarch64/aarch64.md
>>> index 68676c9..4502a58 100644
>>> --- a/gcc/config/aarch64/aarch64.md
>>> +++ b/gcc/config/aarch64/aarch64.md
>>> @@ -1163,11 +1163,12 @@
>>>   )
>>>     (define_insn "*movhf_aarch64"
>>> -  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, 
>>> ?r,w,w,m,r,m ,r")
>>> -    (match_operand:HF 1 "general_operand"      "?rY, 
>>> w,w,m,w,m,rY,r"))]
>>> +  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w 
>>> ,?r,w,w,m,r,m ,r")
>>> +    (match_operand:HF 1 "general_operand"      "Y ,?rY, 
>>> w,w,m,w,m,rY,r"))]
>>>     "TARGET_FLOAT && (register_operand (operands[0], HFmode)
>>>       || aarch64_reg_or_fp_zero (operands[1], HFmode))"
>>>     "@
>>> +   movi\\t%0.4h, #0
>>>      mov\\t%0.h[0], %w1
>>>      umov\\t%w0, %1.h[0]
>>>      mov\\t%0.h[0], %1.h[0]
>>> @@ -1176,18 +1177,19 @@
>>>      ldrh\\t%w0, %1
>>>      strh\\t%w1, %0
>>>      mov\\t%w0, %w1"
>>> -  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
>>> +  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
>>>                        f_loads,f_stores,load1,store1,mov_reg")
>>> -   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
>>> -   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
>>> +   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
>>> +   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
>>>   )
>>>     (define_insn "*movsf_aarch64"
>>> -  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  
>>> ,w,m,r,m ,r")
>>> -    (match_operand:SF 1 "general_operand"      "?rY, 
>>> w,w,Ufc,m,w,m,rY,r"))]
>>> +  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w  
>>> ,w,m,r,m ,r")
>>> +    (match_operand:SF 1 "general_operand"      "Y ,?rY, 
>>> w,w,Ufc,m,w,m,rY,r"))]
>>>     "TARGET_FLOAT && (register_operand (operands[0], SFmode)
>>>       || aarch64_reg_or_fp_zero (operands[1], SFmode))"
>>>     "@
>>> +   movi\\t%0.2s, #0
>>>      fmov\\t%s0, %w1
>>>      fmov\\t%w0, %s1
>>>      fmov\\t%s0, %s1
>>> @@ -1197,16 +1199,19 @@
>>>      ldr\\t%w0, %1
>>>      str\\t%w1, %0
>>>      mov\\t%w0, %w1"
>>> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
>>> -                     f_loads,f_stores,load1,store1,mov_reg")]
>>> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
>>> +                     f_loads,f_stores,load1,store1,mov_reg")
>>> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
>>> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
>>>   )
>> This fp attribute looks wrong to me. The two fmov instructions that move
>> between core and FP registers should be tagged "yes". However, this is
>> irrelevant as the whole pattern is guarded by TARGET_FLOAT.
>>
>> It would be clearer to drop the FP attribute entirely, so as not to give
>> the erroneous impression that some alternatives in this insn are enabled
>> for !TARGET_FLOAT.
>
> You mean to remove the FP attribute from all, HF, SF, DF, TF?
>
>>>   (define_insn "*movdf_aarch64"
>>> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  
>>> ,w,m,r,m ,r")
>>> -    (match_operand:DF 1 "general_operand"      "?rY, 
>>> w,w,Ufc,m,w,m,rY,r"))]
>>> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w  
>>> ,w,m,r,m ,r")
>>> +    (match_operand:DF 1 "general_operand"      "Y ,?rY, 
>>> w,w,Ufc,m,w,m,rY,r"))]
>>>     "TARGET_FLOAT && (register_operand (operands[0], DFmode)
>>>       || aarch64_reg_or_fp_zero (operands[1], DFmode))"
>>>     "@
>>> +   movi\\t%d0, #0
>>>      fmov\\t%d0, %x1
>>>      fmov\\t%x0, %d1
>>>      fmov\\t%d0, %d1
>>> @@ -1216,8 +1221,10 @@
>>>      ldr\\t%x0, %1
>>>      str\\t%x1, %0
>>>      mov\\t%x0, %x1"
>>> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
>>> -                     f_loadd,f_stored,load1,store1,mov_reg")]
>>> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
>>> +                     f_loadd,f_stored,load1,store1,mov_reg")
>>> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
>>> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
>> Likewise here.
>>
>> As to whether this is OK for GCC6, this doesn't look like a 
>> regression to me,
>> we've generated FMOV since the dawn of the port. The patch has 
>> performance
>> implications (arguably trivial ones, but still performance implications)
>> for generic code generation, and release is coming soon - so I'd rather
>> keep this back until GCC 7 opens.
>>
>> Remove the FP attributes, and this patch is OK for GCC 7.
>>
>> I'm happy to be overruled on this by Marcus or Richard if they feel we
>> should take this patch now, but my opinion is that waiting is more in 
>> the
>> spirit of using Stage 4 to stabilise the compiler.
>
> I agree to postpone until GCC 7.
>
>         [AArch64] Replace insn to zero up SIMD registers
>
>         gcc/
>             * config/aarch64/aarch64.md
>             (*movhf_aarch64): Add "movi %0, #0" to zero up register.
>             (*movsf_aarch64): Likewise and add "simd" attributes.
>             (*movdf_aarch64): Likewise.
>
> This patch removes the FP attributes from the HF, SF, DF, TF moves.

And now, with the patch. :-/

-- 
Evandro Menezes


--------------040907090200000706060303
Content-Type: text/x-patch;
 name="0001-AArch64-Replace-insn-to-zero-up-SIMD-registers.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename*0="0001-AArch64-Replace-insn-to-zero-up-SIMD-registers.patch"
Content-length: 3974

>From 190f26b71927516d28b851e0a62b78c2f67c2dc6 Mon Sep 17 00:00:00 2001
From: Evandro Menezes <e.menezes@samsung.com>
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] [AArch64] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register and
	remove the "fp" attributes.
	(*movsf_aarch64): Add "movi %0, #0" to zero up register and
	add the "simd" attributes.
	(*movdf_aarch64): Likewise.
	(*movtf_aarch64): Remove the "fp" attributes.
---
 gcc/config/aarch64/aarch64.md | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..4fb12a6 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,11 +1163,12 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"      "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"      "Y ,?rY, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
     || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
+   movi\\t%0.4h, #0
    mov\\t%0.h[0], %w1
    umov\\t%w0, %1.h[0]
    mov\\t%0.h[0], %1.h[0]
@@ -1176,18 +1177,18 @@
    ldrh\\t%w0, %1
    strh\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
                      f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
     || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
+   movi\\t%0.2s, #0
    fmov\\t%s0, %w1
    fmov\\t%w0, %s1
    fmov\\t%s0, %s1
@@ -1197,16 +1198,18 @@
    ldr\\t%w0, %1
    str\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
-                     f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
+                     f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
     || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
+   movi\\t%d0, #0
    fmov\\t%d0, %x1
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
@@ -1216,8 +1219,9 @@
    ldr\\t%x0, %1
    str\\t%x1, %0
    mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
-                     f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
+                     f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")]
 )
 
 (define_insn "*movtf_aarch64"
@@ -1242,7 +1246,6 @@
   [(set_attr "type" "logic_reg,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\
                      f_loadd,f_stored,load2,store2,store2")
    (set_attr "length" "4,8,8,8,4,4,4,4,4,4,4")
-   (set_attr "fp" "*,*,yes,yes,*,yes,yes,yes,*,*,*")
    (set_attr "simd" "yes,*,*,*,yes,*,*,*,*,*,*")]
 )
 
-- 
1.9.1


--------------040907090200000706060303--