[PATCH][AArch64] Replace insn to zero up DF register

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH][AArch64] Replace insn to zero up DF register
@ 2015-10-19 23:41 Evandro Menezes
  2015-10-19 23:52 ` Andrew Pinski
  2015-10-30 10:26 ` Marcus Shawcroft
  0 siblings, 2 replies; 24+ messages in thread
From: Evandro Menezes @ 2015-10-19 23:41 UTC (permalink / raw)
  To: gcc-patches; +Cc: Marcus.Shawcroft, kyrylo.tkachov

[-- Attachment #1: Type: text/plain, Size: 244 bytes --]

In the existing targets, it seems that it's always faster to zero up a 
DF register with "movi %d0, #0" instead of "fmov %d0, xzr".

This patch modifies the respective pattern.

Please, commit if it's alright.

Thank you,

-- 
Evandro Menezes


[-- Attachment #2: 0001-AArch64-Replace-insn-to-zero-up-DF-register.patch --]
[-- Type: text/x-patch, Size: 1468 bytes --]

From 429b1d70a7eca76c96250fec6ec5269a7a661a4c Mon Sep 17 00:00:00 2001
From: Evandro Menezes <e.menezes@samsung.com>
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] [AArch64] Replace insn to zero up DF register

gcc/
	* config/aarch64/aarch64.md
	(*movdf_aarch64): Add "movi %d0, #0" to zero up DF register.
---
 gcc/config/aarch64/aarch64.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5b7f2fd..5f00686 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1156,21 +1156,22 @@
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,?r,w,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"      "?r, w,w,Y,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
     || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
    fmov\\t%d0, %x1
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
+   movi\\t%d0, #0
    fmov\\t%d0, %1
    ldr\\t%d0, %1
    str\\t%d1, %0
    ldr\\t%x0, %1
    str\\t%x1, %0
    mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
+  [(set_attr "type" "f_mcr,f_mrc,fmov,neon_move,fconstd,\
                      f_loadd,f_stored,load1,store1,mov_reg")]
 )
 
-- 
2.1.0.243.g30d45f7


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-10-19 23:41 [PATCH][AArch64] Replace insn to zero up DF register Evandro Menezes
@ 2015-10-19 23:52 ` Andrew Pinski
  2015-10-20  0:33   ` Andrew Pinski
  2015-10-30 10:26 ` Marcus Shawcroft
  1 sibling, 1 reply; 24+ messages in thread
From: Andrew Pinski @ 2015-10-19 23:52 UTC (permalink / raw)
  To: Evandro Menezes; +Cc: GCC Patches, Marcus Shawcroft, Kyrill Tkachov

On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes <e.menezes@samsung.com> wrote:
> In the existing targets, it seems that it's always faster to zero up a DF
> register with "movi %d0, #0" instead of "fmov %d0, xzr".

I think for ThunderX 1, this change will not make a difference.  So I
am neutral on this change.

Thanks,
Andrew

>
> This patch modifies the respective pattern.
>
> Please, commit if it's alright.
>
> Thank you,
>
> --
> Evandro Menezes
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-10-19 23:52 ` Andrew Pinski
@ 2015-10-20  0:33   ` Andrew Pinski
  2015-10-20 14:46     ` Andrew Pinski
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Pinski @ 2015-10-20  0:33 UTC (permalink / raw)
  To: Evandro Menezes; +Cc: GCC Patches, Marcus Shawcroft, Kyrill Tkachov

On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes <e.menezes@samsung.com> wrote:
>> In the existing targets, it seems that it's always faster to zero up a DF
>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>
> I think for ThunderX 1, this change will not make a difference.  So I
> am neutral on this change.

Actually depending on fmov is decoded in our pipeline, this change
might actually be worse.  Currently fmov with an immediate is 1 cycle
while movi is two cycles.  Let me double check how internally on how
it is decoded and if it is 1 cycle or two.

Thanks,
Andrew

>
> Thanks,
> Andrew
>
>>
>> This patch modifies the respective pattern.
>>
>> Please, commit if it's alright.
>>
>> Thank you,
>>
>> --
>> Evandro Menezes
>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-10-20  0:33   ` Andrew Pinski
@ 2015-10-20 14:46     ` Andrew Pinski
  2015-10-28 18:49       ` Evandro Menezes
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Pinski @ 2015-10-20 14:46 UTC (permalink / raw)
  To: Evandro Menezes; +Cc: GCC Patches, Marcus Shawcroft, Kyrill Tkachov

On Tue, Oct 20, 2015 at 7:59 AM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski <pinskia@gmail.com> wrote:
>> On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes <e.menezes@samsung.com> wrote:
>>> In the existing targets, it seems that it's always faster to zero up a DF
>>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>>
>> I think for ThunderX 1, this change will not make a difference.  So I
>> am neutral on this change.
>
> Actually depending on fmov is decoded in our pipeline, this change
> might actually be worse.  Currently fmov with an immediate is 1 cycle
> while movi is two cycles.  Let me double check how internally on how
> it is decoded and if it is 1 cycle or two.

Ok, my objections are removed as I talked with the architectures here
at Cavium and using movi is better in this case.

Thanks,
Andrew

>
> Thanks,
> Andrew
>
>>
>> Thanks,
>> Andrew
>>
>>>
>>> This patch modifies the respective pattern.
>>>
>>> Please, commit if it's alright.
>>>
>>> Thank you,
>>>
>>> --
>>> Evandro Menezes
>>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-10-20 14:46     ` Andrew Pinski
@ 2015-10-28 18:49       ` Evandro Menezes
  0 siblings, 0 replies; 24+ messages in thread
From: Evandro Menezes @ 2015-10-28 18:49 UTC (permalink / raw)
  To: GCC Patches; +Cc: Andrew Pinski, Marcus Shawcroft, Kyrill Tkachov

Ping.

-- 
Evandro Menezes

On 10/20/2015 09:27 AM, Andrew Pinski wrote:
> On Tue, Oct 20, 2015 at 7:59 AM, Andrew Pinski <pinskia@gmail.com> wrote:
>> On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski <pinskia@gmail.com> wrote:
>>> On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes <e.menezes@samsung.com> wrote:
>>>> In the existing targets, it seems that it's always faster to zero up a DF
>>>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>>> I think for ThunderX 1, this change will not make a difference.  So I
>>> am neutral on this change.
>> Actually depending on fmov is decoded in our pipeline, this change
>> might actually be worse.  Currently fmov with an immediate is 1 cycle
>> while movi is two cycles.  Let me double check how internally on how
>> it is decoded and if it is 1 cycle or two.
> Ok, my objections are removed as I talked with the architectures here
> at Cavium and using movi is better in this case.
>
> Thanks,
> Andrew
>
>> Thanks,
>> Andrew
>>
>>> Thanks,
>>> Andrew
>>>
>>>> This patch modifies the respective pattern.
>>>>
>>>> Please, commit if it's alright.
>>>>
>>>> Thank you,
>>>>
>>>> --
>>>> Evandro Menezes
>>>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-10-19 23:41 [PATCH][AArch64] Replace insn to zero up DF register Evandro Menezes
  2015-10-19 23:52 ` Andrew Pinski
@ 2015-10-30 10:26 ` Marcus Shawcroft
  2015-11-09 22:59   ` Evandro Menezes
                     ` (2 more replies)
  1 sibling, 3 replies; 24+ messages in thread
From: Marcus Shawcroft @ 2015-10-30 10:26 UTC (permalink / raw)
  To: Evandro Menezes; +Cc: gcc-patches, Marcus Shawcroft, Kyrill Tkachov

On 20 October 2015 at 00:40, Evandro Menezes <e.menezes@samsung.com> wrote:
> In the existing targets, it seems that it's always faster to zero up a DF
> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>
> This patch modifies the respective pattern.

Hi Evandro,

This patch changes the generic, u architecture independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.

Cheers
/Marcus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-10-30 10:26 ` Marcus Shawcroft
@ 2015-11-09 22:59   ` Evandro Menezes
  2015-12-03 21:01     ` Evandro Menezes
  2015-11-19 22:01   ` Evandro Menezes
  2015-12-16 21:30   ` Evandro Menezes
  2 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2015-11-09 22:59 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: gcc-patches, Marcus Shawcroft, Kyrill Tkachov

Hi, Marcus.

Have you an update from the architecture folks about this?

Thank you,

-- 
Evandro Menezes

On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:
> On 20 October 2015 at 00:40, Evandro Menezes <e.menezes@samsung.com> wrote:
>> In the existing targets, it seems that it's always faster to zero up a DF
>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>>
>> This patch modifies the respective pattern.
>
> Hi Evandro,
>
> This patch changes the generic, u architecture independent instruction
> selection. The ARM ARM (C3.5.3) makes a specific recommendation about
> the choice of instruction in this situation and the current
> implementation in GCC follows that recommendation.  Wilco has also
> picked up on this issue he has the same patch internal to ARM along
> with an ongoing discussion with ARM architecture folk regarding this
> recommendation.  I'm reluctant to take this patch right now on the
> basis that it runs contrary to ARM ARM recommendation pending the
> conclusion of Wilco's discussion with ARM architecture folk.
>
> Cheers
> /Marcus
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-10-30 10:26 ` Marcus Shawcroft
  2015-11-09 22:59   ` Evandro Menezes
@ 2015-11-19 22:01   ` Evandro Menezes
  2015-12-16 21:30   ` Evandro Menezes
  2 siblings, 0 replies; 24+ messages in thread
From: Evandro Menezes @ 2015-11-19 22:01 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: gcc-patches, Marcus Shawcroft, Kyrill Tkachov

On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:
> On 20 October 2015 at 00:40, Evandro Menezes <e.menezes@samsung.com> wrote:
>> In the existing targets, it seems that it's always faster to zero up a DF
>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>>
>> This patch modifies the respective pattern.
>
> Hi Evandro,
>
> This patch changes the generic, u architecture independent instruction
> selection. The ARM ARM (C3.5.3) makes a specific recommendation about
> the choice of instruction in this situation and the current
> implementation in GCC follows that recommendation.  Wilco has also
> picked up on this issue he has the same patch internal to ARM along
> with an ongoing discussion with ARM architecture folk regarding this
> recommendation.  I'm reluctant to take this patch right now on the
> basis that it runs contrary to ARM ARM recommendation pending the
> conclusion of Wilco's discussion with ARM architecture folk.

Ping.

-- 
Evandro Menezes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-11-09 22:59   ` Evandro Menezes
@ 2015-12-03 21:01     ` Evandro Menezes
  0 siblings, 0 replies; 24+ messages in thread
From: Evandro Menezes @ 2015-12-03 21:01 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: gcc-patches, Marcus Shawcroft, Kyrill Tkachov

On 11/09/2015 04:59 PM, Evandro Menezes wrote:
> Hi, Marcus.
>
> Have you an update from the architecture folks about this?
>
> Thank you,

Marcus?

-- 
Evandro Menezes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-10-30 10:26 ` Marcus Shawcroft
  2015-11-09 22:59   ` Evandro Menezes
  2015-11-19 22:01   ` Evandro Menezes
@ 2015-12-16 21:30   ` Evandro Menezes
  2016-01-13  0:06     ` Evandro Menezes
  2 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2015-12-16 21:30 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: gcc-patches, Marcus Shawcroft, Kyrill Tkachov

On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:
> On 20 October 2015 at 00:40, Evandro Menezes <e.menezes@samsung.com> wrote:
>> In the existing targets, it seems that it's always faster to zero up a DF
>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>>
>> This patch modifies the respective pattern.
>
> Hi Evandro,
>
> This patch changes the generic, u architecture independent instruction
> selection. The ARM ARM (C3.5.3) makes a specific recommendation about
> the choice of instruction in this situation and the current
> implementation in GCC follows that recommendation.  Wilco has also
> picked up on this issue he has the same patch internal to ARM along
> with an ongoing discussion with ARM architecture folk regarding this
> recommendation.  I'm reluctant to take this patch right now on the
> basis that it runs contrary to ARM ARM recommendation pending the
> conclusion of Wilco's discussion with ARM architecture folk.
>

Marcus,

Have you had a chance to discuss this internally further?

Thank you,

-- 
Evandro Menezes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2015-12-16 21:30   ` Evandro Menezes
@ 2016-01-13  0:06     ` Evandro Menezes
  0 siblings, 0 replies; 24+ messages in thread
From: Evandro Menezes @ 2016-01-13  0:06 UTC (permalink / raw)
  To: Marcus Shawcroft
  Cc: gcc-patches, Marcus Shawcroft, Kyrill Tkachov, James Greenhalgh

On 12/16/2015 03:30 PM, Evandro Menezes wrote:
> On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:
>> On 20 October 2015 at 00:40, Evandro Menezes <e.menezes@samsung.com> 
>> wrote:
>>> In the existing targets, it seems that it's always faster to zero up 
>>> a DF
>>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>>>
>>> This patch modifies the respective pattern.
>>
>> Hi Evandro,
>>
>> This patch changes the generic, u architecture independent instruction
>> selection. The ARM ARM (C3.5.3) makes a specific recommendation about
>> the choice of instruction in this situation and the current
>> implementation in GCC follows that recommendation.  Wilco has also
>> picked up on this issue he has the same patch internal to ARM along
>> with an ongoing discussion with ARM architecture folk regarding this
>> recommendation.  I'm reluctant to take this patch right now on the
>> basis that it runs contrary to ARM ARM recommendation pending the
>> conclusion of Wilco's discussion with ARM architecture folk.
>
> Have you had a chance to discuss this internally further?
>

Ping.

-- 
Evandro Menezes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-03-10 16:32                     ` Evandro Menezes
@ 2016-03-10 16:37                       ` James Greenhalgh
  0 siblings, 0 replies; 24+ messages in thread
From: James Greenhalgh @ 2016-03-10 16:37 UTC (permalink / raw)
  To: Evandro Menezes
  Cc: Wilco Dijkstra, gcc-patches, nd, Marcus Shawcroft,
	Kyrylo Tkachov, richard.earnshaw

On Thu, Mar 10, 2016 at 10:32:15AM -0600, Evandro Menezes wrote:
> >I agree to postpone until GCC 7.
> >
> >        [AArch64] Replace insn to zero up SIMD registers
> >
> >        gcc/
> >            * config/aarch64/aarch64.md
> >            (*movhf_aarch64): Add "movi %0, #0" to zero up register.
> >            (*movsf_aarch64): Likewise and add "simd" attributes.
> >            (*movdf_aarch64): Likewise.
> >
> >This patch removes the FP attributes from the HF, SF, DF, TF moves.
> 
> And now, with the patch. :-/
> 

Thanks for sticking with it. This is OK for GCC 7 when development
opens.

Remember to mention the most recent changes in your Changelog entry
(Remove "fp" attribute from *movhf_aarch64 and *movtf_aarch64).

Thanks,
James

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-03-10 16:27                   ` Evandro Menezes
@ 2016-03-10 16:32                     ` Evandro Menezes
  2016-03-10 16:37                       ` James Greenhalgh
  0 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2016-03-10 16:32 UTC (permalink / raw)
  To: James Greenhalgh
  Cc: Wilco Dijkstra, gcc-patches, nd, Marcus Shawcroft,
	Kyrylo Tkachov, richard.earnshaw

[-- Attachment #1: Type: text/plain, Size: 7019 bytes --]

On 03/10/16 10:27, Evandro Menezes wrote:
> On 03/10/16 07:23, James Greenhalgh wrote:
>> On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:
>>> On 03/01/16 13:08, Evandro Menezes wrote:
>>>> On 03/01/16 13:02, Wilco Dijkstra wrote:
>>>>> Evandro Menezes wrote:
>>>>>> The meaning of these attributes are not clear to me.  Is there a
>>>>>> reference somewhere about which insns are FP or SIMD or neither?
>>>>> The meaning should be clear, "fp" is a floating point
>>>>> instruction, "simd" a SIMD one
>>>>> as defined in ARM-ARM.
>>>>>
>>>>>> Indeed, I had to add the Y for the f_mcr insn to match it with 
>>>>>> nosimd.
>>>>>> However, I didn't feel that it should be moved to the right, 
>>>>>> since it's
>>>>>> already disparaged.  Am I missing something detail?
>>>>> It might not matter for this specific case, but I have seen
>>>>> reload forcing the very
>>>>> first alternative without looking at any costs or preferences -
>>>>> as long as it is legal.
>>>>> This suggests we need to order alternatives from most preferred
>>>>> alternative to least
>>>>> preferred one.
>>>>>
>>>>> I think it is good enough for commit, James?
>>>> Methinks that my issue with those attributes is that I'm not as
>>>> fluent in AArch64 as I'd like to be.
>>>>
>>>> Please, feel free to edit the patch changing the order then.
>>>     Replace insn to zero up SIMD registers
>>>
>>>     gcc/
>>>              * config/aarch64/aarch64.md
>>>              (*movhf_aarch64): Add "movi %0, #0" to zero up register.
>>>              (*movsf_aarch64): Likewise and add "simd" and "fp" 
>>> attributes.
>>>              (*movdf_aarch64): Likewise.
>>>
>>> Swapped the order of the constraints to favor MOVI.
>>>
>>> Just say the word...
>> I'm wondering whether this is appropriate for GCC6 now that we are so 
>> late
>> in the development cycle. Additionally, I have some comments on your 
>> patch:
>>
>>> diff --git a/gcc/config/aarch64/aarch64.md 
>>> b/gcc/config/aarch64/aarch64.md
>>> index 68676c9..4502a58 100644
>>> --- a/gcc/config/aarch64/aarch64.md
>>> +++ b/gcc/config/aarch64/aarch64.md
>>> @@ -1163,11 +1163,12 @@
>>>   )
>>>     (define_insn "*movhf_aarch64"
>>> -  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, 
>>> ?r,w,w,m,r,m ,r")
>>> -    (match_operand:HF 1 "general_operand"      "?rY, 
>>> w,w,m,w,m,rY,r"))]
>>> +  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w 
>>> ,?r,w,w,m,r,m ,r")
>>> +    (match_operand:HF 1 "general_operand"      "Y ,?rY, 
>>> w,w,m,w,m,rY,r"))]
>>>     "TARGET_FLOAT && (register_operand (operands[0], HFmode)
>>>       || aarch64_reg_or_fp_zero (operands[1], HFmode))"
>>>     "@
>>> +   movi\\t%0.4h, #0
>>>      mov\\t%0.h[0], %w1
>>>      umov\\t%w0, %1.h[0]
>>>      mov\\t%0.h[0], %1.h[0]
>>> @@ -1176,18 +1177,19 @@
>>>      ldrh\\t%w0, %1
>>>      strh\\t%w1, %0
>>>      mov\\t%w0, %w1"
>>> -  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
>>> +  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
>>>                        f_loads,f_stores,load1,store1,mov_reg")
>>> -   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
>>> -   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
>>> +   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
>>> +   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
>>>   )
>>>     (define_insn "*movsf_aarch64"
>>> -  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  
>>> ,w,m,r,m ,r")
>>> -    (match_operand:SF 1 "general_operand"      "?rY, 
>>> w,w,Ufc,m,w,m,rY,r"))]
>>> +  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w  
>>> ,w,m,r,m ,r")
>>> +    (match_operand:SF 1 "general_operand"      "Y ,?rY, 
>>> w,w,Ufc,m,w,m,rY,r"))]
>>>     "TARGET_FLOAT && (register_operand (operands[0], SFmode)
>>>       || aarch64_reg_or_fp_zero (operands[1], SFmode))"
>>>     "@
>>> +   movi\\t%0.2s, #0
>>>      fmov\\t%s0, %w1
>>>      fmov\\t%w0, %s1
>>>      fmov\\t%s0, %s1
>>> @@ -1197,16 +1199,19 @@
>>>      ldr\\t%w0, %1
>>>      str\\t%w1, %0
>>>      mov\\t%w0, %w1"
>>> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
>>> -                     f_loads,f_stores,load1,store1,mov_reg")]
>>> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
>>> +                     f_loads,f_stores,load1,store1,mov_reg")
>>> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
>>> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
>>>   )
>> This fp attribute looks wrong to me. The two fmov instructions that move
>> between core and FP registers should be tagged "yes". However, this is
>> irrelevant as the whole pattern is guarded by TARGET_FLOAT.
>>
>> It would be clearer to drop the FP attribute entirely, so as not to give
>> the erroneous impression that some alternatives in this insn are enabled
>> for !TARGET_FLOAT.
>
> You mean to remove the FP attribute from all, HF, SF, DF, TF?
>
>>>   (define_insn "*movdf_aarch64"
>>> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  
>>> ,w,m,r,m ,r")
>>> -    (match_operand:DF 1 "general_operand"      "?rY, 
>>> w,w,Ufc,m,w,m,rY,r"))]
>>> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w  
>>> ,w,m,r,m ,r")
>>> +    (match_operand:DF 1 "general_operand"      "Y ,?rY, 
>>> w,w,Ufc,m,w,m,rY,r"))]
>>>     "TARGET_FLOAT && (register_operand (operands[0], DFmode)
>>>       || aarch64_reg_or_fp_zero (operands[1], DFmode))"
>>>     "@
>>> +   movi\\t%d0, #0
>>>      fmov\\t%d0, %x1
>>>      fmov\\t%x0, %d1
>>>      fmov\\t%d0, %d1
>>> @@ -1216,8 +1221,10 @@
>>>      ldr\\t%x0, %1
>>>      str\\t%x1, %0
>>>      mov\\t%x0, %x1"
>>> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
>>> -                     f_loadd,f_stored,load1,store1,mov_reg")]
>>> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
>>> +                     f_loadd,f_stored,load1,store1,mov_reg")
>>> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
>>> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
>> Likewise here.
>>
>> As to whether this is OK for GCC6, this doesn't look like a 
>> regression to me,
>> we've generated FMOV since the dawn of the port. The patch has 
>> performance
>> implications (arguably trivial ones, but still performance implications)
>> for generic code generation, and release is coming soon - so I'd rather
>> keep this back until GCC 7 opens.
>>
>> Remove the FP attributes, and this patch is OK for GCC 7.
>>
>> I'm happy to be overruled on this by Marcus or Richard if they feel we
>> should take this patch now, but my opinion is that waiting is more in 
>> the
>> spirit of using Stage 4 to stabilise the compiler.
>
> I agree to postpone until GCC 7.
>
>         [AArch64] Replace insn to zero up SIMD registers
>
>         gcc/
>             * config/aarch64/aarch64.md
>             (*movhf_aarch64): Add "movi %0, #0" to zero up register.
>             (*movsf_aarch64): Likewise and add "simd" attributes.
>             (*movdf_aarch64): Likewise.
>
> This patch removes the FP attributes from the HF, SF, DF, TF moves.

And now, with the patch. :-/

-- 
Evandro Menezes


[-- Attachment #2: 0001-AArch64-Replace-insn-to-zero-up-SIMD-registers.patch --]
[-- Type: text/x-patch, Size: 3973 bytes --]

From 190f26b71927516d28b851e0a62b78c2f67c2dc6 Mon Sep 17 00:00:00 2001
From: Evandro Menezes <e.menezes@samsung.com>
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] [AArch64] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register and
	remove the "fp" attributes.
	(*movsf_aarch64): Add "movi %0, #0" to zero up register and
	add the "simd" attributes.
	(*movdf_aarch64): Likewise.
	(*movtf_aarch64): Remove the "fp" attributes.
---
 gcc/config/aarch64/aarch64.md | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..4fb12a6 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,11 +1163,12 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"      "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"      "Y ,?rY, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
     || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
+   movi\\t%0.4h, #0
    mov\\t%0.h[0], %w1
    umov\\t%w0, %1.h[0]
    mov\\t%0.h[0], %1.h[0]
@@ -1176,18 +1177,18 @@
    ldrh\\t%w0, %1
    strh\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
                      f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
     || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
+   movi\\t%0.2s, #0
    fmov\\t%s0, %w1
    fmov\\t%w0, %s1
    fmov\\t%s0, %s1
@@ -1197,16 +1198,18 @@
    ldr\\t%w0, %1
    str\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
-                     f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
+                     f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
     || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
+   movi\\t%d0, #0
    fmov\\t%d0, %x1
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
@@ -1216,8 +1219,9 @@
    ldr\\t%x0, %1
    str\\t%x1, %0
    mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
-                     f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
+                     f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")]
 )
 
 (define_insn "*movtf_aarch64"
@@ -1242,7 +1246,6 @@
   [(set_attr "type" "logic_reg,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\
                      f_loadd,f_stored,load2,store2,store2")
    (set_attr "length" "4,8,8,8,4,4,4,4,4,4,4")
-   (set_attr "fp" "*,*,yes,yes,*,yes,yes,yes,*,*,*")
    (set_attr "simd" "yes,*,*,*,yes,*,*,*,*,*,*")]
 )
 
-- 
1.9.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-03-10 13:23                 ` James Greenhalgh
@ 2016-03-10 16:27                   ` Evandro Menezes
  2016-03-10 16:32                     ` Evandro Menezes
  0 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2016-03-10 16:27 UTC (permalink / raw)
  To: James Greenhalgh
  Cc: Wilco Dijkstra, gcc-patches, nd, Marcus Shawcroft,
	Kyrylo Tkachov, richard.earnshaw

On 03/10/16 07:23, James Greenhalgh wrote:
> On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:
>> On 03/01/16 13:08, Evandro Menezes wrote:
>>> On 03/01/16 13:02, Wilco Dijkstra wrote:
>>>> Evandro Menezes wrote:
>>>>> The meaning of these attributes are not clear to me.  Is there a
>>>>> reference somewhere about which insns are FP or SIMD or neither?
>>>> The meaning should be clear, "fp" is a floating point
>>>> instruction, "simd" a SIMD one
>>>> as defined in ARM-ARM.
>>>>
>>>>> Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
>>>>> However, I didn't feel that it should be moved to the right, since it's
>>>>> already disparaged.  Am I missing something detail?
>>>> It might not matter for this specific case, but I have seen
>>>> reload forcing the very
>>>> first alternative without looking at any costs or preferences -
>>>> as long as it is legal.
>>>> This suggests we need to order alternatives from most preferred
>>>> alternative to least
>>>> preferred one.
>>>>
>>>> I think it is good enough for commit, James?
>>> Methinks that my issue with those attributes is that I'm not as
>>> fluent in AArch64 as I'd like to be.
>>>
>>> Please, feel free to edit the patch changing the order then.
>>     Replace insn to zero up SIMD registers
>>
>>     gcc/
>>              * config/aarch64/aarch64.md
>>              (*movhf_aarch64): Add "movi %0, #0" to zero up register.
>>              (*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
>>              (*movdf_aarch64): Likewise.
>>
>> Swapped the order of the constraints to favor MOVI.
>>
>> Just say the word...
> I'm wondering whether this is appropriate for GCC6 now that we are so late
> in the development cycle. Additionally, I have some comments on your patch:
>
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index 68676c9..4502a58 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -1163,11 +1163,12 @@
>>   )
>>   
>>   (define_insn "*movhf_aarch64"
>> -  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
>> -	(match_operand:HF 1 "general_operand"      "?rY, w,w,m,w,m,rY,r"))]
>> +  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
>> +	(match_operand:HF 1 "general_operand"      "Y ,?rY, w,w,m,w,m,rY,r"))]
>>     "TARGET_FLOAT && (register_operand (operands[0], HFmode)
>>       || aarch64_reg_or_fp_zero (operands[1], HFmode))"
>>     "@
>> +   movi\\t%0.4h, #0
>>      mov\\t%0.h[0], %w1
>>      umov\\t%w0, %1.h[0]
>>      mov\\t%0.h[0], %1.h[0]
>> @@ -1176,18 +1177,19 @@
>>      ldrh\\t%w0, %1
>>      strh\\t%w1, %0
>>      mov\\t%w0, %w1"
>> -  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
>> +  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
>>                        f_loads,f_stores,load1,store1,mov_reg")
>> -   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
>> -   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
>> +   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
>> +   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
>>   )
>>   
>>   (define_insn "*movsf_aarch64"
>> -  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
>> -	(match_operand:SF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
>> +  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
>> +	(match_operand:SF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
>>     "TARGET_FLOAT && (register_operand (operands[0], SFmode)
>>       || aarch64_reg_or_fp_zero (operands[1], SFmode))"
>>     "@
>> +   movi\\t%0.2s, #0
>>      fmov\\t%s0, %w1
>>      fmov\\t%w0, %s1
>>      fmov\\t%s0, %s1
>> @@ -1197,16 +1199,19 @@
>>      ldr\\t%w0, %1
>>      str\\t%w1, %0
>>      mov\\t%w0, %w1"
>> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
>> -                     f_loads,f_stores,load1,store1,mov_reg")]
>> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
>> +                     f_loads,f_stores,load1,store1,mov_reg")
>> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
>> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
>>   )
> This fp attribute looks wrong to me. The two fmov instructions that move
> between core and FP registers should be tagged "yes". However, this is
> irrelevant as the whole pattern is guarded by TARGET_FLOAT.
>
> It would be clearer to drop the FP attribute entirely, so as not to give
> the erroneous impression that some alternatives in this insn are enabled
> for !TARGET_FLOAT.

You mean to remove the FP attribute from all, HF, SF, DF, TF?

>>   (define_insn "*movdf_aarch64"
>> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
>> -	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
>> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
>> +	(match_operand:DF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
>>     "TARGET_FLOAT && (register_operand (operands[0], DFmode)
>>       || aarch64_reg_or_fp_zero (operands[1], DFmode))"
>>     "@
>> +   movi\\t%d0, #0
>>      fmov\\t%d0, %x1
>>      fmov\\t%x0, %d1
>>      fmov\\t%d0, %d1
>> @@ -1216,8 +1221,10 @@
>>      ldr\\t%x0, %1
>>      str\\t%x1, %0
>>      mov\\t%x0, %x1"
>> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
>> -                     f_loadd,f_stored,load1,store1,mov_reg")]
>> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
>> +                     f_loadd,f_stored,load1,store1,mov_reg")
>> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
>> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
> Likewise here.
>
> As to whether this is OK for GCC6, this doesn't look like a regression to me,
> we've generated FMOV since the dawn of the port. The patch has performance
> implications (arguably trivial ones, but still performance implications)
> for generic code generation, and release is coming soon - so I'd rather
> keep this back until GCC 7 opens.
>
> Remove the FP attributes, and this patch is OK for GCC 7.
>
> I'm happy to be overruled on this by Marcus or Richard if they feel we
> should take this patch now, but my opinion is that waiting is more in the
> spirit of using Stage 4 to stabilise the compiler.

I agree to postpone until GCC 7.

         [AArch64] Replace insn to zero up SIMD registers

         gcc/
             * config/aarch64/aarch64.md
             (*movhf_aarch64): Add "movi %0, #0" to zero up register.
             (*movsf_aarch64): Likewise and add "simd" attributes.
             (*movdf_aarch64): Likewise.

This patch removes the FP attributes from the HF, SF, DF, TF moves.

Thank you,

-- 
Evandro Menezes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-03-09 21:36               ` Evandro Menezes
@ 2016-03-10 13:23                 ` James Greenhalgh
  2016-03-10 16:27                   ` Evandro Menezes
  0 siblings, 1 reply; 24+ messages in thread
From: James Greenhalgh @ 2016-03-10 13:23 UTC (permalink / raw)
  To: Evandro Menezes
  Cc: Wilco Dijkstra, gcc-patches, nd, Marcus Shawcroft,
	Kyrylo Tkachov, richard.earnshaw

On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:
> On 03/01/16 13:08, Evandro Menezes wrote:
> >On 03/01/16 13:02, Wilco Dijkstra wrote:
> >>Evandro Menezes wrote:
> >>>The meaning of these attributes are not clear to me.  Is there a
> >>>reference somewhere about which insns are FP or SIMD or neither?
> >>The meaning should be clear, "fp" is a floating point
> >>instruction, "simd" a SIMD one
> >>as defined in ARM-ARM.
> >>
> >>>Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
> >>>However, I didn't feel that it should be moved to the right, since it's
> >>>already disparaged.  Am I missing something detail?
> >>It might not matter for this specific case, but I have seen
> >>reload forcing the very
> >>first alternative without looking at any costs or preferences -
> >>as long as it is legal.
> >>This suggests we need to order alternatives from most preferred
> >>alternative to least
> >>preferred one.
> >>
> >>I think it is good enough for commit, James?
> >
> >Methinks that my issue with those attributes is that I'm not as
> >fluent in AArch64 as I'd like to be.
> >
> >Please, feel free to edit the patch changing the order then.
> 
>    Replace insn to zero up SIMD registers
> 
>    gcc/
>             * config/aarch64/aarch64.md
>             (*movhf_aarch64): Add "movi %0, #0" to zero up register.
>             (*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
>             (*movdf_aarch64): Likewise.
> 
> Swapped the order of the constraints to favor MOVI.
> 
> Just say the word...

I'm wondering whether this is appropriate for GCC6 now that we are so late
in the development cycle. Additionally, I have some comments on your patch:

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 68676c9..4502a58 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1163,11 +1163,12 @@
>  )
>  
>  (define_insn "*movhf_aarch64"
> -  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
> -	(match_operand:HF 1 "general_operand"      "?rY, w,w,m,w,m,rY,r"))]
> +  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
> +	(match_operand:HF 1 "general_operand"      "Y ,?rY, w,w,m,w,m,rY,r"))]
>    "TARGET_FLOAT && (register_operand (operands[0], HFmode)
>      || aarch64_reg_or_fp_zero (operands[1], HFmode))"
>    "@
> +   movi\\t%0.4h, #0
>     mov\\t%0.h[0], %w1
>     umov\\t%w0, %1.h[0]
>     mov\\t%0.h[0], %1.h[0]
> @@ -1176,18 +1177,19 @@
>     ldrh\\t%w0, %1
>     strh\\t%w1, %0
>     mov\\t%w0, %w1"
> -  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
> +  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
>                       f_loads,f_stores,load1,store1,mov_reg")
> -   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
> -   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
> +   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
> +   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
>  )
>  
>  (define_insn "*movsf_aarch64"
> -  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
> -	(match_operand:SF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
> +  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
> +	(match_operand:SF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
>    "TARGET_FLOAT && (register_operand (operands[0], SFmode)
>      || aarch64_reg_or_fp_zero (operands[1], SFmode))"
>    "@
> +   movi\\t%0.2s, #0
>     fmov\\t%s0, %w1
>     fmov\\t%w0, %s1
>     fmov\\t%s0, %s1
> @@ -1197,16 +1199,19 @@
>     ldr\\t%w0, %1
>     str\\t%w1, %0
>     mov\\t%w0, %w1"
> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
> -                     f_loads,f_stores,load1,store1,mov_reg")]
> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
> +                     f_loads,f_stores,load1,store1,mov_reg")
> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
>  )

This fp attribute looks wrong to me. The two fmov instructions that move
between core and FP registers should be tagged "yes". However, this is
irrelevant as the whole pattern is guarded by TARGET_FLOAT.

It would be clearer to drop the FP attribute entirely, so as not to give
the erroneous impression that some alternatives in this insn are enabled
for !TARGET_FLOAT.

>  (define_insn "*movdf_aarch64"
> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
> -	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
> +	(match_operand:DF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
>    "TARGET_FLOAT && (register_operand (operands[0], DFmode)
>      || aarch64_reg_or_fp_zero (operands[1], DFmode))"
>    "@
> +   movi\\t%d0, #0
>     fmov\\t%d0, %x1
>     fmov\\t%x0, %d1
>     fmov\\t%d0, %d1
> @@ -1216,8 +1221,10 @@
>     ldr\\t%x0, %1
>     str\\t%x1, %0
>     mov\\t%x0, %x1"
> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
> -                     f_loadd,f_stored,load1,store1,mov_reg")]
> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
> +                     f_loadd,f_stored,load1,store1,mov_reg")
> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]

Likewise here.

As to whether this is OK for GCC6, this doesn't look like a regression to me,
we've generated FMOV since the dawn of the port. The patch has performance
implications (arguably trivial ones, but still performance implications)
for generic code generation, and release is coming soon - so I'd rather
keep this back until GCC 7 opens.

Remove the FP attributes, and this patch is OK for GCC 7.

I'm happy to be overruled on this by Marcus or Richard if they feel we
should take this patch now, but my opinion is that waiting is more in the
spirit of using Stage 4 to stabilise the compiler.

Thanks,
James

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-03-01 19:08             ` Evandro Menezes
@ 2016-03-09 21:36               ` Evandro Menezes
  2016-03-10 13:23                 ` James Greenhalgh
  0 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2016-03-09 21:36 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 1488 bytes --]

On 03/01/16 13:08, Evandro Menezes wrote:
> On 03/01/16 13:02, Wilco Dijkstra wrote:
>> Evandro Menezes wrote:
>>> The meaning of these attributes are not clear to me.  Is there a
>>> reference somewhere about which insns are FP or SIMD or neither?
>> The meaning should be clear, "fp" is a floating point instruction, 
>> "simd" a SIMD one
>> as defined in ARM-ARM.
>>
>>> Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
>>> However, I didn't feel that it should be moved to the right, since it's
>>> already disparaged.  Am I missing something detail?
>> It might not matter for this specific case, but I have seen reload 
>> forcing the very
>> first alternative without looking at any costs or preferences - as 
>> long as it is legal.
>> This suggests we need to order alternatives from most preferred 
>> alternative to least
>> preferred one.
>>
>> I think it is good enough for commit, James?
>
> Methinks that my issue with those attributes is that I'm not as fluent 
> in AArch64 as I'd like to be.
>
> Please, feel free to edit the patch changing the order then.

    Replace insn to zero up SIMD registers

    gcc/
             * config/aarch64/aarch64.md
             (*movhf_aarch64): Add "movi %0, #0" to zero up register.
             (*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
             (*movdf_aarch64): Likewise.

Swapped the order of the constraints to favor MOVI.

Just say the word...

Thank you,

-- 
Evandro Menezes


[-- Attachment #2: 0001-Replace-insn-to-zero-up-SIMD-registers.patch --]
[-- Type: text/x-patch, Size: 3698 bytes --]

From bcb76a4c864436930e1236e7ce35d9e689adf075 Mon Sep 17 00:00:00 2001
From: Evandro Menezes <e.menezes@samsung.com>
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..4502a58 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,11 +1163,12 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"      "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"      "Y ,?rY, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
     || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
+   movi\\t%0.4h, #0
    mov\\t%0.h[0], %w1
    umov\\t%w0, %1.h[0]
    mov\\t%0.h[0], %1.h[0]
@@ -1176,18 +1177,19 @@
    ldrh\\t%w0, %1
    strh\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
                      f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
     || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
+   movi\\t%0.2s, #0
    fmov\\t%s0, %w1
    fmov\\t%w0, %s1
    fmov\\t%s0, %s1
@@ -1197,16 +1199,19 @@
    ldr\\t%w0, %1
    str\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
-                     f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
+                     f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"      "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
     || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
+   movi\\t%d0, #0
    fmov\\t%d0, %x1
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
@@ -1216,8 +1221,10 @@
    ldr\\t%x0, %1
    str\\t%x1, %0
    mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
-                     f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
+                     f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
 )
 
 (define_insn "*movtf_aarch64"
-- 
1.9.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-03-01 19:02           ` Wilco Dijkstra
@ 2016-03-01 19:08             ` Evandro Menezes
  2016-03-09 21:36               ` Evandro Menezes
  0 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2016-03-01 19:08 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

On 03/01/16 13:02, Wilco Dijkstra wrote:
> Evandro Menezes wrote:
>> The meaning of these attributes are not clear to me.  Is there a
>> reference somewhere about which insns are FP or SIMD or neither?
> The meaning should be clear, "fp" is a floating point instruction, "simd" a SIMD one
> as defined in ARM-ARM.
>
>> Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
>> However, I didn't feel that it should be moved to the right, since it's
>> already disparaged.  Am I missing something detail?
> It might not matter for this specific case, but I have seen reload forcing the very
> first alternative without looking at any costs or preferences - as long as it is legal.
> This suggests we need to order alternatives from most preferred alternative to least
> preferred one.
>
> I think it is good enough for commit, James?

Methinks that my issue with those attributes is that I'm not as fluent 
in AArch64 as I'd like to be.

Please, feel free to edit the patch changing the order then.

Thank you,

-- 
Evandro Menezes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-02-29 23:11         ` Evandro Menezes
@ 2016-03-01 19:02           ` Wilco Dijkstra
  2016-03-01 19:08             ` Evandro Menezes
  0 siblings, 1 reply; 24+ messages in thread
From: Wilco Dijkstra @ 2016-03-01 19:02 UTC (permalink / raw)
  To: Evandro Menezes
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

Evandro Menezes wrote:
>
> The meaning of these attributes are not clear to me.  Is there a
> reference somewhere about which insns are FP or SIMD or neither?

The meaning should be clear, "fp" is a floating point instruction, "simd" a SIMD one
as defined in ARM-ARM.

> Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
> However, I didn't feel that it should be moved to the right, since it's
> already disparaged.  Am I missing something detail?

It might not matter for this specific case, but I have seen reload forcing the very
first alternative without looking at any costs or preferences - as long as it is legal.
This suggests we need to order alternatives from most preferred alternative to least
preferred one.

I think it is good enough for commit, James?

Wilco

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-02-29 18:07       ` Wilco Dijkstra
@ 2016-02-29 23:11         ` Evandro Menezes
  2016-03-01 19:02           ` Wilco Dijkstra
  0 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2016-02-29 23:11 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 811 bytes --]

On 02/29/16 12:07, Wilco Dijkstra wrote:
> Evandro Menezes <e.menezes@samsung.com> wrote:
>> Please, verify the new "simd" and "fp" attributes for SF and DF.
> Both movsf and movdf should be:
>
> (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
> (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")
>
> Did you check that with -mcpu=generic+nosimd you get fmov s0, wzr?
> In my version I kept the Y on the fmov and placed the neon_mov first.

The meaning of these attributes are not clear to me.  Is there a 
reference somewhere about which insns are FP or SIMD or neither?

Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.  
However, I didn't feel that it should be moved to the right, since it's 
already disparaged.  Am I missing something detail?

Thank you for the review,

-- 
Evandro Menezes


[-- Attachment #2: 0001-Replace-insn-to-zero-up-SIMD-registers.patch --]
[-- Type: text/x-patch, Size: 3748 bytes --]

From 952c0f74da98efd7fcb37b2cfe3c17518a619088 Mon Sep 17 00:00:00 2001
From: Evandro Menezes <e.menezes@samsung.com>
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..416e065 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,12 +1163,13 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"      "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, w,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"      "?rY,Y, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
     || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
    mov\\t%0.h[0], %w1
+   movi\\t%0.4h, #0
    umov\\t%w0, %1.h[0]
    mov\\t%0.h[0], %1.h[0]
    ldr\\t%h0, %1
@@ -1176,19 +1177,20 @@
    ldrh\\t%w0, %1
    strh\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\
                      f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"      "?rY,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
     || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
    fmov\\t%s0, %w1
+   movi\\t%0.2s, #0
    fmov\\t%w0, %s1
    fmov\\t%s0, %s1
    fmov\\t%s0, %1
@@ -1197,17 +1199,20 @@
    ldr\\t%w0, %1
    str\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
-                     f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\
+                     f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"      "?rY,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
     || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
    fmov\\t%d0, %x1
+   movi\\t%d0, #0
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
    fmov\\t%d0, %1
@@ -1216,8 +1221,10 @@
    ldr\\t%x0, %1
    str\\t%x1, %0
    mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
-                     f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconstd,\
+                     f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
 )
 
 (define_insn "*movtf_aarch64"
-- 
2.6.3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-02-26 22:43     ` Evandro Menezes
@ 2016-02-29 18:07       ` Wilco Dijkstra
  2016-02-29 23:11         ` Evandro Menezes
  0 siblings, 1 reply; 24+ messages in thread
From: Wilco Dijkstra @ 2016-02-29 18:07 UTC (permalink / raw)
  To: Evandro Menezes
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

Evandro Menezes <e.menezes@samsung.com> wrote:
> 
> Please, verify the new "simd" and "fp" attributes for SF and DF.

Both movsf and movdf should be:

(set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
(set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")

Did you check that with -mcpu=generic+nosimd you get fmov s0, wzr?
In my version I kept the Y on the fmov and placed the neon_mov first.

Wilco

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-02-26 12:37   ` Wilco Dijkstra
@ 2016-02-26 22:43     ` Evandro Menezes
  2016-02-29 18:07       ` Wilco Dijkstra
  0 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2016-02-26 22:43 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 1267 bytes --]

On 02/26/16 06:37, Wilco Dijkstra wrote:
> Evandro Menezes <e.menezes@samsung.com> wrote:
>> I have a question though: is it necessary to add the "fp" and "simd"
>> attributes to both movsf_aarch64 and movdf_aarch64 as well?
> You need at least the "simd" attribute, but providing "fp" as well is clearer
> (in principle the TARGET_FLOAT check in the pattern condition is
> redundant as a result, but the movhf and movtf patterns already do both).
>
> Also you want to use the smallest possible SIMD size as these are
> scalar operations and some microarchitectures execute 64-bit operations
> more efficiently than 128-bit ones, so:
>
>      mov\\t%0.h[0], %w1
> +   movi\\t%0.4h, #0
>      umov\\t%w0, %1.h[0]
>
>      fmov\\t%s0, %w1
> +   movi\\t%0.2s, #0
>      fmov\\t%w0, %s1
>
> With those changes it should be ready for commit once you get the OK from James/Marcus.

         Replace insn to zero up SIMD registers

         gcc/
             * config/aarch64/aarch64.md
             (*movhf_aarch64): Add "movi %0, #0" to zero up register.
             (*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
             (*movdf_aarch64): Likewise.

Please, verify the new "simd" and "fp" attributes for SF and DF.

Thank you,

-- 
Evandro Menezes


[-- Attachment #2: 0001-Replace-insn-to-zero-up-SIMD-registers.patch --]
[-- Type: text/x-patch, Size: 3770 bytes --]

From d447411ea85f065faa77d56cb143effffb07a467 Mon Sep 17 00:00:00 2001
From: Evandro Menezes <e.menezes@samsung.com>
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..a5661ed 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,12 +1163,13 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"      "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"      "?r,Y, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
     || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
    mov\\t%0.h[0], %w1
+   movi\\t%0.4h, #0
    umov\\t%w0, %1.h[0]
    mov\\t%0.h[0], %1.h[0]
    ldr\\t%h0, %1
@@ -1176,19 +1177,20 @@
    ldrh\\t%w0, %1
    strh\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\
                      f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"      "?r,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
     || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
    fmov\\t%s0, %w1
+   movi\\t%0.2s, #0
    fmov\\t%w0, %s1
    fmov\\t%s0, %s1
    fmov\\t%s0, %1
@@ -1197,17 +1199,20 @@
    ldr\\t%w0, %1
    str\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
-                     f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\
+                     f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "yes,*,yes,yes,yes,yes,yes,*,*,yes")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"      "?r,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
     || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
    fmov\\t%d0, %x1
+   movi\\t%d0, #0
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
    fmov\\t%d0, %1
@@ -1216,8 +1221,10 @@
    ldr\\t%x0, %1
    str\\t%x1, %0
    mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
-                     f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconstd,\
+                     f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" "yes,yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "yes,*,yes,yes,yes,yes,yes,*,*,yes")]
 )
 
 (define_insn "*movtf_aarch64"
-- 
2.6.3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-01-27 23:14 ` Evandro Menezes
@ 2016-02-26 12:37   ` Wilco Dijkstra
  2016-02-26 22:43     ` Evandro Menezes
  0 siblings, 1 reply; 24+ messages in thread
From: Wilco Dijkstra @ 2016-02-26 12:37 UTC (permalink / raw)
  To: Evandro Menezes
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

Evandro Menezes <e.menezes@samsung.com> wrote:
>
> I have a question though: is it necessary to add the "fp" and "simd"
> attributes to both movsf_aarch64 and movdf_aarch64 as well?

You need at least the "simd" attribute, but providing "fp" as well is clearer
(in principle the TARGET_FLOAT check in the pattern condition is 
redundant as a result, but the movhf and movtf patterns already do both).

Also you want to use the smallest possible SIMD size as these are
scalar operations and some microarchitectures execute 64-bit operations
more efficiently than 128-bit ones, so:

    mov\\t%0.h[0], %w1
+   movi\\t%0.4h, #0
    umov\\t%w0, %1.h[0]

    fmov\\t%s0, %w1
+   movi\\t%0.2s, #0
    fmov\\t%w0, %s1

With those changes it should be ready for commit once you get the OK from James/Marcus.

Wilco

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
  2016-01-22 13:52 Wilco Dijkstra
@ 2016-01-27 23:14 ` Evandro Menezes
  2016-02-26 12:37   ` Wilco Dijkstra
  0 siblings, 1 reply; 24+ messages in thread
From: Evandro Menezes @ 2016-01-27 23:14 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 2242 bytes --]

On 01/22/16 07:52, Wilco Dijkstra wrote:
> On 12/16/2015 03:30 PM, Evandro Menezes wrote:
>>      On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:
>>
>>          On 20 October 2015 at 00:40, Evandro Menezes <e.menezes@samsung.com> wrote:
>>
>>              In the existing targets, it seems that it's always faster to zero up a DF
>>
>>              register with "movi %d0, #0" instead of "fmov %d0, xzr".
>>
>>              This patch modifies the respective pattern.
>>
>>
>>          Hi Evandro,
>>
>>          This patch changes the generic, u architecture independent instruction
>>          selection. The ARM ARM (C3.5.3) makes a specific recommendation about
>>          the choice of instruction in this situation and the current
>>          implementation in GCC follows that recommendation.  Wilco has also
>>          picked up on this issue he has the same patch internal to ARM along
>>          with an ongoing discussion with ARM architecture folk regarding this
>>          recommendation.  I'm reluctant to take this patch right now on the
>>          basis that it runs contrary to ARM ARM recommendation pending the
>>          conclusion of Wilco's discussion with ARM architecture folk.
>>
>>
>>      Have you had a chance to discuss this internally further?
> Yes, it was decided to remove the recommendation from future ARM ARM's.
>
> Several review comments on your patch (https://patchwork.ozlabs.org/patch/532736):
>
> * This should be added to movhf, movsf and movdf - movtf already does this.
> * It is important to set the "fp" and "simd" attributes so that the movi variant can
>     only be selected if it is available.
>

Hi, Wilco.

    2016-01-27 Evandro Menezes <e.menezes@samsung.com>

         Replace insn to zero up SIMD registers

         gcc/
             * config/aarch64/aarch64.md
             (*movhf_aarch64): Add "movi %0, #0" to zero up register.
             (*movsf_aarch64): Likewise.
             (*movdf_aarch64): Likewise.

When this decision is final, methinks that this patch would be close to 
addressing this change.

I have a question though: is it necessary to add the "fp" and "simd" 
attributes to both movsf_aarch64 and movdf_aarch64 as well?

Thank you,

-- 
Evandro Menezes


[-- Attachment #2: 0001-Replace-insn-to-zero-up-SIMD-registers.patch --]
[-- Type: text/x-patch, Size: 3488 bytes --]

From b390d411b56cfcdedf4601d0487baf1ef84e79ab Mon Sep 17 00:00:00 2001
From: Evandro Menezes <e.menezes@samsung.com>
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f6c8eb1..43a3854 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1164,64 +1164,67 @@
       operands[1] = force_reg (<MODE>mode, operands[1]);
   }
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"      "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"      "?r,Y, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
     || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
    mov\\t%0.h[0], %w1
+   movi\\t%0.8h, #0
    umov\\t%w0, %1.h[0]
    mov\\t%0.h[0], %1.h[0]
    ldr\\t%h0, %1
    str\\t%h1, %0
    ldrh\\t%w0, %1
    strh\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\
                      f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"      "?r,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
     || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
    fmov\\t%s0, %w1
+   movi\\t%0.4s, #0
    fmov\\t%w0, %s1
    fmov\\t%s0, %s1
    fmov\\t%s0, %1
    ldr\\t%s0, %1
    str\\t%s1, %0
    ldr\\t%w0, %1
    str\\t%w1, %0
    mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\
                      f_loads,f_stores,load1,store1,mov_reg")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"      "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"      "?r,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
     || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
    fmov\\t%d0, %x1
+   movi\\t%d0, #0
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
    fmov\\t%d0, %1
    ldr\\t%d0, %1
    str\\t%d1, %0
    ldr\\t%x0, %1
    str\\t%x1, %0
    mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconstd,\
                      f_loadd,f_stored,load1,store1,mov_reg")]
 )
 
 (define_insn "*movtf_aarch64"
   [(set (match_operand:TF 0
-- 
2.6.3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH][AArch64] Replace insn to zero up DF register
@ 2016-01-22 13:52 Wilco Dijkstra
  2016-01-27 23:14 ` Evandro Menezes
  0 siblings, 1 reply; 24+ messages in thread
From: Wilco Dijkstra @ 2016-01-22 13:52 UTC (permalink / raw)
  To: Evandro Menezes
  Cc: gcc-patches, nd, Marcus Shawcroft, Kyrylo Tkachov, James Greenhalgh

On 12/16/2015 03:30 PM, Evandro Menezes wrote:
>
>    On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:
>
>        On 20 October 2015 at 00:40, Evandro Menezes <e.menezes@samsung.com> wrote:
>
>            In the existing targets, it seems that it's always faster to zero up a DF
>
>            register with "movi %d0, #0" instead of "fmov %d0, xzr".
>
>            This patch modifies the respective pattern.
>
>
>        Hi Evandro,
>
>        This patch changes the generic, u architecture independent instruction
>        selection. The ARM ARM (C3.5.3) makes a specific recommendation about
>        the choice of instruction in this situation and the current
>        implementation in GCC follows that recommendation.  Wilco has also
>        picked up on this issue he has the same patch internal to ARM along
>        with an ongoing discussion with ARM architecture folk regarding this
>        recommendation.  I'm reluctant to take this patch right now on the
>        basis that it runs contrary to ARM ARM recommendation pending the
>        conclusion of Wilco's discussion with ARM architecture folk.
>
>
>    Have you had a chance to discuss this internally further?

Yes, it was decided to remove the recommendation from future ARM ARM's.

Several review comments on your patch (https://patchwork.ozlabs.org/patch/532736):

* This should be added to movhf, movsf and movdf - movtf already does this.
* It is important to set the "fp" and "simd" attributes so that the movi variant can
   only be selected if it is available.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2016-03-10 16:37 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-19 23:41 [PATCH][AArch64] Replace insn to zero up DF register Evandro Menezes
2015-10-19 23:52 ` Andrew Pinski
2015-10-20  0:33   ` Andrew Pinski
2015-10-20 14:46     ` Andrew Pinski
2015-10-28 18:49       ` Evandro Menezes
2015-10-30 10:26 ` Marcus Shawcroft
2015-11-09 22:59   ` Evandro Menezes
2015-12-03 21:01     ` Evandro Menezes
2015-11-19 22:01   ` Evandro Menezes
2015-12-16 21:30   ` Evandro Menezes
2016-01-13  0:06     ` Evandro Menezes
2016-01-22 13:52 Wilco Dijkstra
2016-01-27 23:14 ` Evandro Menezes
2016-02-26 12:37   ` Wilco Dijkstra
2016-02-26 22:43     ` Evandro Menezes
2016-02-29 18:07       ` Wilco Dijkstra
2016-02-29 23:11         ` Evandro Menezes
2016-03-01 19:02           ` Wilco Dijkstra
2016-03-01 19:08             ` Evandro Menezes
2016-03-09 21:36               ` Evandro Menezes
2016-03-10 13:23                 ` James Greenhalgh
2016-03-10 16:27                   ` Evandro Menezes
2016-03-10 16:32                     ` Evandro Menezes
2016-03-10 16:37                       ` James Greenhalgh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).