From: Kyrill Tkachov
Date: Mon, 27 Apr 2015 10:01:00 -0000
To: GCC Patches
CC: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh
Subject: [PATCH][AArch64] Add alternative 'extr' pattern, calculate rtx cost properly
Message-ID: <553E08E9.7060107@arm.com>

Hi all,

We currently have a pattern that will recognise a particular combination of shifts and bitwise-or as an extr instruction.
However, the order of the shifts inside the IOR doesn't have canonicalisation rules (see the rev16 pattern for a similar situation). This means that for code like:

unsigned long
foo (unsigned long a, unsigned long b)
{
  return (a << 16) | (b >> 48);
}

we will recognise the extr, but for the equivalent:

unsigned long
foo (unsigned long a, unsigned long b)
{
  return (b >> 48) | (a << 16);
}

we won't, and we'll emit three instructions.

This patch adds the pattern for the alternative order of shifts and allows us to generate, for the example above:

foo:
        extr    x0, x0, x1, 48
        ret

The zero-extended version is added as well, and the rtx costs function is updated to handle all of these cases.

I've seen this pattern trigger in the gcc code itself, in expmed.c, where it eliminated a sequence of orrs and shifts into an extr instruction!

Bootstrapped and tested on aarch64-linux.

Ok for trunk?

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov

    * config/aarch64/aarch64.md (*extr<mode>5_insn_alt): New pattern.
    (*extrsi5_insn_uxtw_alt): Likewise.
    * config/aarch64/aarch64.c (aarch64_extr_rtx_p): New function.
    (aarch64_rtx_costs, IOR case): Use above to properly cost
    extr operations.

[Attachment: aarch64-extr.patch]

commit d45e92b3b8c5837328b7b10682565cacfb566e5b
Author: Kyrylo Tkachov
Date:   Mon Mar 2 17:26:38 2015 +0000

    [AArch64] Add alternative 'extr' pattern, calculate rtx cost properly

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 860a1dd..ef5a1e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5438,6 +5438,51 @@ aarch64_frint_unspec_p (unsigned int u)
     }
 }
 
+/* Return true iff X is an rtx that will match an extr instruction,
+   i.e. as described in the *extr<mode>5_insn family of patterns.
+   OP0 and OP1 will be set to the operands of the shifts involved
+   on success and will be NULL_RTX otherwise.  */
+
+static bool
+aarch64_extr_rtx_p (rtx x, rtx *res_op0, rtx *res_op1)
+{
+  rtx op0, op1;
+  machine_mode mode = GET_MODE (x);
+
+  *res_op0 = NULL_RTX;
+  *res_op1 = NULL_RTX;
+
+  if (GET_CODE (x) != IOR)
+    return false;
+
+  op0 = XEXP (x, 0);
+  op1 = XEXP (x, 1);
+
+  if ((GET_CODE (op0) == ASHIFT && GET_CODE (op1) == LSHIFTRT)
+      || (GET_CODE (op1) == ASHIFT && GET_CODE (op0) == LSHIFTRT))
+    {
+      /* Canonicalise locally to ashift in op0, lshiftrt in op1.  */
+      if (GET_CODE (op1) == ASHIFT)
+        std::swap (op0, op1);
+
+      if (!CONST_INT_P (XEXP (op0, 1)) || !CONST_INT_P (XEXP (op1, 1)))
+        return false;
+
+      unsigned HOST_WIDE_INT shft_amnt_0 = UINTVAL (XEXP (op0, 1));
+      unsigned HOST_WIDE_INT shft_amnt_1 = UINTVAL (XEXP (op1, 1));
+
+      if (shft_amnt_0 < GET_MODE_BITSIZE (mode)
+          && shft_amnt_0 + shft_amnt_1 == GET_MODE_BITSIZE (mode))
+        {
+          *res_op0 = XEXP (op0, 0);
+          *res_op1 = XEXP (op1, 0);
+          return true;
+        }
+    }
+
+  return false;
+}
+
 /* Calculate the cost of calculating (if_then_else (OP0) (OP1) (OP2)),
    storing it in *COST.  Result is true if the total cost of the
    operation has now been calculated.  */
@@ -5968,6 +6013,16 @@ cost_plus:
 
           return true;
         }
+
+      if (aarch64_extr_rtx_p (x, &op0, &op1))
+        {
+          *cost += rtx_cost (op0, IOR, 0, speed)
+                   + rtx_cost (op1, IOR, 1, speed);
+          if (speed)
+            *cost += extra_cost->alu.shift;
+
+          return true;
+        }
 
       /* Fall through.  */
     case XOR:
     case AND:
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1a7f888..17a8755 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3597,6 +3597,21 @@ (define_insn "*extr<mode>5_insn"
   [(set_attr "type" "shift_imm")]
 )
 
+;; There are no canonicalisation rules for ashift and lshiftrt inside an ior
+;; so we have to match both orderings.
+(define_insn "*extr<mode>5_insn_alt"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (lshiftrt:GPI (match_operand:GPI 2 "register_operand" "r")
+                               (match_operand 4 "const_int_operand" "n"))
+                 (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
+                             (match_operand 3 "const_int_operand" "n"))))]
+  "UINTVAL (operands[3]) < GET_MODE_BITSIZE (<MODE>mode)
+   && (UINTVAL (operands[3]) + UINTVAL (operands[4])
+       == GET_MODE_BITSIZE (<MODE>mode))"
+  "extr\\t%<w>0, %<w>1, %<w>2, %4"
+  [(set_attr "type" "shift_imm")]
+)
+
 ;; zero_extend version of the above
 (define_insn "*extrsi5_insn_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
@@ -3611,6 +3626,19 @@ (define_insn "*extrsi5_insn_uxtw"
   [(set_attr "type" "shift_imm")]
 )
 
+(define_insn "*extrsi5_insn_uxtw_alt"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (zero_extend:DI
+         (ior:SI (lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
+                              (match_operand 4 "const_int_operand" "n"))
+                 (ashift:SI (match_operand:SI 1 "register_operand" "r")
+                            (match_operand 3 "const_int_operand" "n")))))]
+  "UINTVAL (operands[3]) < 32
+   && (UINTVAL (operands[3]) + UINTVAL (operands[4]) == 32)"
+  "extr\\t%w0, %w1, %w2, %4"
+  [(set_attr "type" "shift_imm")]
+)
+
 (define_insn "*ror<mode>3_insn"
   [(set (match_operand:GPI 0 "register_operand" "=r")
         (rotate:GPI (match_operand:GPI 1 "register_operand" "r")