From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 42191 invoked by alias); 10 Mar 2016 16:32:30 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 42170 invoked by uid 89); 10 Mar 2016 16:32:29 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.5 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=no version=3.3.2 spammy=sk:e.menez, emenezessamsungcom, D*samsung.com, U*e.menezes X-HELO: usmailout4.samsung.com Received: from mailout4.w2.samsung.com (HELO usmailout4.samsung.com) (211.189.100.14) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 10 Mar 2016 16:32:19 +0000 Received: from uscpsbgm2.samsung.com (u115.gpu85.samsung.co.kr [203.254.195.115]) by usmailout4.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May 5 2014)) with ESMTP id <0O3U00HZR0LTB560@usmailout4.samsung.com> for gcc-patches@gcc.gnu.org; Thu, 10 Mar 2016 11:32:17 -0500 (EST) Received: from ussync2.samsung.com ( [203.254.195.82]) by uscpsbgm2.samsung.com (USCPMTA) with SMTP id 61.A8.07641.191A1E65; Thu, 10 Mar 2016 11:32:17 -0500 (EST) Received: from [172.31.207.194] ([105.140.31.10]) by ussync2.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May 5 2014)) with ESMTPA id <0O3U00J890LSGY40@ussync2.samsung.com>; Thu, 10 Mar 2016 11:32:17 -0500 (EST) Subject: Re: [PATCH][AArch64] Replace insn to zero up DF register To: James Greenhalgh References: <56A94F35.8000000@samsung.com> <56D0D50E.8030802@samsung.com> <56D4D02C.6030309@samsung.com> <56D5E8B6.1090900@samsung.com> <56E0972F.3060207@samsung.com> <20160310132327.GA38844@arm.com> <56E1A061.5080901@samsung.com> Cc: Wilco Dijkstra , "gcc-patches@gcc.gnu.org" , nd , Marcus Shawcroft , Kyrylo Tkachov , richard.earnshaw@arm.com From: Evandro Menezes Message-id: <56E1A18F.5000006@samsung.com> Date: Thu, 10 Mar 2016 16:32:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-version: 1.0 In-reply-to: <56E1A061.5080901@samsung.com> Content-type: multipart/mixed; boundary=------------040907090200000706060303 X-IsSubscribed: yes X-SW-Source: 2016-03/txt/msg00646.txt.bz2 This is a multi-part message in MIME format. --------------040907090200000706060303 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-length: 7019 On 03/10/16 10:27, Evandro Menezes wrote: > On 03/10/16 07:23, James Greenhalgh wrote: >> On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote: >>> On 03/01/16 13:08, Evandro Menezes wrote: >>>> On 03/01/16 13:02, Wilco Dijkstra wrote: >>>>> Evandro Menezes wrote: >>>>>> The meaning of these attributes are not clear to me. Is there a >>>>>> reference somewhere about which insns are FP or SIMD or neither? >>>>> The meaning should be clear, "fp" is a floating point >>>>> instruction, "simd" a SIMD one >>>>> as defined in ARM-ARM. >>>>> >>>>>> Indeed, I had to add the Y for the f_mcr insn to match it with >>>>>> nosimd. >>>>>> However, I didn't feel that it should be moved to the right, >>>>>> since it's >>>>>> already disparaged. Am I missing something detail? >>>>> It might not matter for this specific case, but I have seen >>>>> reload forcing the very >>>>> first alternative without looking at any costs or preferences - >>>>> as long as it is legal. >>>>> This suggests we need to order alternatives from most preferred >>>>> alternative to least >>>>> preferred one. >>>>> >>>>> I think it is good enough for commit, James? >>>> Methinks that my issue with those attributes is that I'm not as >>>> fluent in AArch64 as I'd like to be. >>>> >>>> Please, feel free to edit the patch changing the order then. >>> Replace insn to zero up SIMD registers >>> >>> gcc/ >>> * config/aarch64/aarch64.md >>> (*movhf_aarch64): Add "movi %0, #0" to zero up register. >>> (*movsf_aarch64): Likewise and add "simd" and "fp" >>> attributes. >>> (*movdf_aarch64): Likewise. >>> >>> Swapped the order of the constraints to favor MOVI. >>> >>> Just say the word... >> I'm wondering whether this is appropriate for GCC6 now that we are so >> late >> in the development cycle. Additionally, I have some comments on your >> patch: >> >>> diff --git a/gcc/config/aarch64/aarch64.md >>> b/gcc/config/aarch64/aarch64.md >>> index 68676c9..4502a58 100644 >>> --- a/gcc/config/aarch64/aarch64.md >>> +++ b/gcc/config/aarch64/aarch64.md >>> @@ -1163,11 +1163,12 @@ >>> ) >>> (define_insn "*movhf_aarch64" >>> - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, >>> ?r,w,w,m,r,m ,r") >>> - (match_operand:HF 1 "general_operand" "?rY, >>> w,w,m,w,m,rY,r"))] >>> + [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w >>> ,?r,w,w,m,r,m ,r") >>> + (match_operand:HF 1 "general_operand" "Y ,?rY, >>> w,w,m,w,m,rY,r"))] >>> "TARGET_FLOAT && (register_operand (operands[0], HFmode) >>> || aarch64_reg_or_fp_zero (operands[1], HFmode))" >>> "@ >>> + movi\\t%0.4h, #0 >>> mov\\t%0.h[0], %w1 >>> umov\\t%w0, %1.h[0] >>> mov\\t%0.h[0], %1.h[0] >>> @@ -1176,18 +1177,19 @@ >>> ldrh\\t%w0, %1 >>> strh\\t%w1, %0 >>> mov\\t%w0, %w1" >>> - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ >>> + [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\ >>> f_loads,f_stores,load1,store1,mov_reg") >>> - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") >>> - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] >>> + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*") >>> + (set_attr "fp" "*,*,*,*,yes,yes,*,*,*")] >>> ) >>> (define_insn "*movsf_aarch64" >>> - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w >>> ,w,m,r,m ,r") >>> - (match_operand:SF 1 "general_operand" "?rY, >>> w,w,Ufc,m,w,m,rY,r"))] >>> + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w >>> ,w,m,r,m ,r") >>> + (match_operand:SF 1 "general_operand" "Y ,?rY, >>> w,w,Ufc,m,w,m,rY,r"))] >>> "TARGET_FLOAT && (register_operand (operands[0], SFmode) >>> || aarch64_reg_or_fp_zero (operands[1], SFmode))" >>> "@ >>> + movi\\t%0.2s, #0 >>> fmov\\t%s0, %w1 >>> fmov\\t%w0, %s1 >>> fmov\\t%s0, %s1 >>> @@ -1197,16 +1199,19 @@ >>> ldr\\t%w0, %1 >>> str\\t%w1, %0 >>> mov\\t%w0, %w1" >>> - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ >>> - f_loads,f_stores,load1,store1,mov_reg")] >>> + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\ >>> + f_loads,f_stores,load1,store1,mov_reg") >>> + (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*") >>> + (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*")] >>> ) >> This fp attribute looks wrong to me. The two fmov instructions that move >> between core and FP registers should be tagged "yes". However, this is >> irrelevant as the whole pattern is guarded by TARGET_FLOAT. >> >> It would be clearer to drop the FP attribute entirely, so as not to give >> the erroneous impression that some alternatives in this insn are enabled >> for !TARGET_FLOAT. > > You mean to remove the FP attribute from all, HF, SF, DF, TF? > >>> (define_insn "*movdf_aarch64" >>> - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w >>> ,w,m,r,m ,r") >>> - (match_operand:DF 1 "general_operand" "?rY, >>> w,w,Ufc,m,w,m,rY,r"))] >>> + [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w >>> ,w,m,r,m ,r") >>> + (match_operand:DF 1 "general_operand" "Y ,?rY, >>> w,w,Ufc,m,w,m,rY,r"))] >>> "TARGET_FLOAT && (register_operand (operands[0], DFmode) >>> || aarch64_reg_or_fp_zero (operands[1], DFmode))" >>> "@ >>> + movi\\t%d0, #0 >>> fmov\\t%d0, %x1 >>> fmov\\t%x0, %d1 >>> fmov\\t%d0, %d1 >>> @@ -1216,8 +1221,10 @@ >>> ldr\\t%x0, %1 >>> str\\t%x1, %0 >>> mov\\t%x0, %x1" >>> - [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\ >>> - f_loadd,f_stored,load1,store1,mov_reg")] >>> + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\ >>> + f_loadd,f_stored,load1,store1,mov_reg") >>> + (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*") >>> + (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*")] >> Likewise here. >> >> As to whether this is OK for GCC6, this doesn't look like a >> regression to me, >> we've generated FMOV since the dawn of the port. The patch has >> performance >> implications (arguably trivial ones, but still performance implications) >> for generic code generation, and release is coming soon - so I'd rather >> keep this back until GCC 7 opens. >> >> Remove the FP attributes, and this patch is OK for GCC 7. >> >> I'm happy to be overruled on this by Marcus or Richard if they feel we >> should take this patch now, but my opinion is that waiting is more in >> the >> spirit of using Stage 4 to stabilise the compiler. > > I agree to postpone until GCC 7. > > [AArch64] Replace insn to zero up SIMD registers > > gcc/ > * config/aarch64/aarch64.md > (*movhf_aarch64): Add "movi %0, #0" to zero up register. > (*movsf_aarch64): Likewise and add "simd" attributes. > (*movdf_aarch64): Likewise. > > This patch removes the FP attributes from the HF, SF, DF, TF moves. And now, with the patch. :-/ -- Evandro Menezes --------------040907090200000706060303 Content-Type: text/x-patch; name="0001-AArch64-Replace-insn-to-zero-up-SIMD-registers.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename*0="0001-AArch64-Replace-insn-to-zero-up-SIMD-registers.patch" Content-length: 3974 >From 190f26b71927516d28b851e0a62b78c2f67c2dc6 Mon Sep 17 00:00:00 2001 From: Evandro Menezes Date: Mon, 19 Oct 2015 18:31:48 -0500 Subject: [PATCH] [AArch64] Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register and remove the "fp" attributes. (*movsf_aarch64): Add "movi %0, #0" to zero up register and add the "simd" attributes. (*movdf_aarch64): Likewise. (*movtf_aarch64): Remove the "fp" attributes. --- gcc/config/aarch64/aarch64.md | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 68676c9..4fb12a6 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1163,11 +1163,12 @@ ) (define_insn "*movhf_aarch64" - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r") - (match_operand:HF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))] + [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w ,?r,w,w,m,r,m ,r") + (match_operand:HF 1 "general_operand" "Y ,?rY, w,w,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], HFmode) || aarch64_reg_or_fp_zero (operands[1], HFmode))" "@ + movi\\t%0.4h, #0 mov\\t%0.h[0], %w1 umov\\t%w0, %1.h[0] mov\\t%0.h[0], %1.h[0] @@ -1176,18 +1177,18 @@ ldrh\\t%w0, %1 strh\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ + [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\ f_loads,f_stores,load1,store1,mov_reg") - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")] ) (define_insn "*movsf_aarch64" - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:SF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m ,r") + (match_operand:SF 1 "general_operand" "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], SFmode) || aarch64_reg_or_fp_zero (operands[1], SFmode))" "@ + movi\\t%0.2s, #0 fmov\\t%s0, %w1 fmov\\t%w0, %s1 fmov\\t%s0, %s1 @@ -1197,16 +1198,18 @@ ldr\\t%w0, %1 str\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ - f_loads,f_stores,load1,store1,mov_reg")] + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\ + f_loads,f_stores,load1,store1,mov_reg") + (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")] ) (define_insn "*movdf_aarch64" - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:DF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m ,r") + (match_operand:DF 1 "general_operand" "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], DFmode) || aarch64_reg_or_fp_zero (operands[1], DFmode))" "@ + movi\\t%d0, #0 fmov\\t%d0, %x1 fmov\\t%x0, %d1 fmov\\t%d0, %d1 @@ -1216,8 +1219,9 @@ ldr\\t%x0, %1 str\\t%x1, %0 mov\\t%x0, %x1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\ - f_loadd,f_stored,load1,store1,mov_reg")] + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\ + f_loadd,f_stored,load1,store1,mov_reg") + (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")] ) (define_insn "*movtf_aarch64" @@ -1242,7 +1246,6 @@ [(set_attr "type" "logic_reg,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\ f_loadd,f_stored,load2,store2,store2") (set_attr "length" "4,8,8,8,4,4,4,4,4,4,4") - (set_attr "fp" "*,*,yes,yes,*,yes,yes,yes,*,*,*") (set_attr "simd" "yes,*,*,*,yes,*,*,*,*,*,*")] ) -- 1.9.1 --------------040907090200000706060303--