From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 21195 invoked by alias); 7 Aug 2014 11:22:15 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 21183 invoked by uid 89); 7 Aug 2014 11:22:14 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.2
X-HELO: service87.mimecast.com
Received: from service87.mimecast.com (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 07 Aug 2014 11:22:12 +0000
Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Thu, 07 Aug 2014 12:22:10 +0100
Received: from [10.1.208.24] ([10.1.255.212]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Thu, 7 Aug 2014 12:22:10 +0100
Message-ID: <53E36161.1080800@arm.com>
Date: Thu, 07 Aug 2014 11:22:00 -0000
From: Kyrill Tkachov
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
MIME-Version: 1.0
To: GCC Patches
CC: Marcus Shawcroft, Richard Earnshaw
Subject: [PATCH][AArch64] Restrict usage of FP/SIMD registers for TImode reload and absdi2 patterns for non-float/simd targets
X-MC-Unique: 114080712221004901
Content-Type: multipart/mixed; boundary="------------060103020509090409090609"
X-IsSubscribed: yes
X-SW-Source: 2014-08/txt/msg00790.txt.bz2

This is a multi-part message in MIME format.
--------------060103020509090409090609
Content-Type: text/plain; charset=WINDOWS-1252; format=flowed
Content-Transfer-Encoding: quoted-printable
Content-length: 1487

Hi all,

This patch arises from PR 62014 where apparently gcc generates usage of
FP registers with -mgeneral-regs-only.
The PR turned out to be bogus in the end, but an inspection of aarch64.md
shows that there are some patterns that don't have their usage of FP/SIMD
registers properly guarded by the simd attribute or by the TARGET_FLOAT
predicate. This patch addresses that, although I could not come up with a
testcase that demonstrated wrong behaviour.

I built the Linux kernel with this patch and looked for fmov
instructions in the disassembly. They appeared only in the crypto code
that uses the new AES instructions and therefore allows usage of vector
registers.
But even without this patch the kernel compiled to an identical binary
as with this patch (phew!)

I've added a comment that hopefully clarifies the usage of the fp and
simd attributes.

Bootstrapped on aarch64-linux and tested on aarch64-none-elf as well.

Ok for trunk?

Thanks,
Kyrill

2014-08-07  Kyrylo Tkachov

    * config/aarch64/aarch64.md (absdi2): Set simd attribute.
    (aarch64_reload_mov): Predicate on TARGET_FLOAT.
    (aarch64_movdi_high): Likewise.
    (aarch64_movhigh_di): Likewise.
    (aarch64_movdi_low): Likewise.
    (aarch64_movlow_di): Likewise.
    (aarch64_movtilow_tilow): Likewise.
    Add comment explaining usage of fp,simd attributes and of
    TARGET_FLOAT and TARGET_SIMD.

--------------060103020509090409090609
Content-Type: text/x-patch; name=aarch64-restrict-simd.patch
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="aarch64-restrict-simd.patch"
Content-length: 3927

commit 4bc3f5d3da9450bd0748ab4a61a48c739586fd3c
Author: Kyrylo Tkachov
Date:   Tue Aug 5 12:20:23 2014 +0100

    [AArch64] Restrict FP/SIMD reg usage on non-fp/simd targets

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 59c4ba4..e9758a8 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -141,12 +141,22 @@
 ; to share pipeline descriptions.
 (include "../arm/types.md")
 
+;; It is important to set the fp or simd attributes to yes when a pattern
+;; alternative uses the FP or SIMD register files, usually signified by use of
+;; the 'w' constraint. This will ensure that the alternative will be
+;; disabled when compiling with -mgeneral-regs-only or with the +nofp/+nosimd
+;; architecture extensions. If all the alternatives in a pattern use the
+;; FP or SIMD registers then the pattern predicate should include TARGET_FLOAT
+;; or TARGET_SIMD.
+
 ;; Attribute that specifies whether or not the instruction touches fp
-;; registers.
+;; registers. When this is set to yes for an alternative, that alternative
+;; will be disabled when !TARGET_FLOAT.
 (define_attr "fp" "no,yes" (const_string "no"))
 
 ;; Attribute that specifies whether or not the instruction touches simd
-;; registers.
+;; registers. When this is set to yes for an alternative, that alternative
+;; will be disabled when !TARGET_SIMD.
 (define_attr "simd" "no,yes" (const_string "no"))
 
 (define_attr "length" ""
@@ -1954,7 +1964,8 @@
                                 GEN_INT (63)))));
     DONE;
   }
-  [(set_attr "type" "alu_sreg")]
+  [(set_attr "type" "alu_sreg")
+   (set_attr "simd" "no,yes")]
 )
 
 (define_insn "neg2"
@@ -3728,7 +3739,7 @@
                (match_operand:TX 1 "register_operand" "w"))
               (clobber (match_operand:DI 2 "register_operand" "=&r"))
   ]
-  ""
+  "TARGET_FLOAT"
   {
     rtx op0 = simplify_gen_subreg (TImode, operands[0], mode, 0);
     rtx op1 = simplify_gen_subreg (TImode, operands[1], mode, 0);
@@ -3746,7 +3757,7 @@
 (define_insn "aarch64_movdi_low"
   [(set (match_operand:DI 0 "register_operand" "=r")
         (truncate:DI (match_operand:TX 1 "register_operand" "w")))]
-  "reload_completed || reload_in_progress"
+  "TARGET_FLOAT && (reload_completed || reload_in_progress)"
   "fmov\\t%x0, %d1"
   [(set_attr "type" "f_mrc")
    (set_attr "length" "4")
@@ -3757,7 +3768,7 @@
         (truncate:DI
           (lshiftrt:TX (match_operand:TX 1 "register_operand" "w")
                        (const_int 64))))]
-  "reload_completed || reload_in_progress"
+  "TARGET_FLOAT && (reload_completed || reload_in_progress)"
   "fmov\\t%x0, %1.d[1]"
   [(set_attr "type" "f_mrc")
    (set_attr "length" "4")
@@ -3767,7 +3778,7 @@
   [(set (zero_extract:TX (match_operand:TX 0 "register_operand" "+w")
                          (const_int 64) (const_int 64))
         (zero_extend:TX (match_operand:DI 1 "register_operand" "r")))]
-  "reload_completed || reload_in_progress"
+  "TARGET_FLOAT && (reload_completed || reload_in_progress)"
   "fmov\\t%0.d[1], %x1"
   [(set_attr "type" "f_mcr")
    (set_attr "length" "4")
@@ -3776,7 +3787,7 @@
 (define_insn "aarch64_movlow_di"
   [(set (match_operand:TX 0 "register_operand" "=w")
         (zero_extend:TX (match_operand:DI 1 "register_operand" "r")))]
-  "reload_completed || reload_in_progress"
+  "TARGET_FLOAT && (reload_completed || reload_in_progress)"
   "fmov\\t%d0, %x1"
   [(set_attr "type" "f_mcr")
    (set_attr "length" "4")
@@ -3784,9 +3795,9 @@
 
 (define_insn "aarch64_movtilow_tilow"
   [(set (match_operand:TI 0 "register_operand" "=w")
-        (zero_extend:TI 
+        (zero_extend:TI
           (truncate:DI (match_operand:TI 1 "register_operand" "w"))))]
-  "reload_completed || reload_in_progress"
+  "TARGET_FLOAT && (reload_completed || reload_in_progress)"
   "fmov\\t%d0, %d1"
   [(set_attr "type" "fmov")
    (set_attr "length" "4")

--------------060103020509090409090609--