From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <575938A4.6030709@foss.arm.com>
Date: Thu, 09 Jun 2016 09:36:00 -0000
From: Kyrill Tkachov
To: GCC Patches
CC: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh
Subject: [PATCH][AArch64] Handle AND+ASHIFT form of UBFIZ correctly in costs

Hi all,

The aarch64 rtx costs currently don't handle the pattern
*andim_ashift<mode>_bfiz, which performs an ASHIFT followed by an AND.
So we end up recursing inside the AND and assigning a high cost to the
pattern: not high enough to reject it during combine, but still wrong.

This patch fixes that. It refactors the non-trivial matching condition
from the pattern into a new function in aarch64.c that is also reused in
the costs calculation to handle this pattern properly. With this patch I
see the pattern being assigned a cost of COSTS_N_INSNS (2) for
cortex-a53 rather than the COSTS_N_INSNS (3) which we got due to the
recursion into the operands of the AND.

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2016-06-09  Kyrylo Tkachov

    * config/aarch64/aarch64.c (aarch64_mask_and_shift_for_ubfiz_p):
    New function.
    (aarch64_rtx_costs): Use it.  Rewrite CONST_INT_P (op1) case to
    handle mask+shift version.
    * config/aarch64/aarch64-protos.h (aarch64_mask_and_shift_for_ubfiz_p):
    New prototype.
    * config/aarch64/aarch64.md (*andim_ashift<mode>_bfiz): Replace
    matching condition with aarch64_mask_and_shift_for_ubfiz_p.

[attachment: aarch64-ubfiz.patch]

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index e0a050ce5bc24b0269a5c6664d8a7dc4901bfe0e..9daab3a00171fca7247a9802240006116678c9e0 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -284,6 +284,7 @@ bool aarch64_is_noplt_call_p (rtx);
 bool aarch64_label_mentioned_p (rtx);
 void aarch64_declare_function_name (FILE *, const char*, tree);
 bool aarch64_legitimate_pic_operand_p (rtx);
+bool aarch64_mask_and_shift_for_ubfiz_p (machine_mode, rtx, rtx);
 bool aarch64_modes_tieable_p (machine_mode mode1, machine_mode mode2);
 bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index
d180f6f2d37a280ad77f34caad8496ddaa6e01b2..2be36645979b0e4b3cd4232f533c1f833b0d8cdd 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5877,6 +5877,19 @@ aarch64_extend_bitfield_pattern_p (rtx x)
   return op;
 }

+/* Return true if the mask and a shift amount from an RTX of the form
+   (x << SHFT_AMNT) & MASK are valid to combine into a UBFIZ instruction of
+   mode MODE.  See the *andim_ashift<mode>_bfiz pattern.  */
+
+bool
+aarch64_mask_and_shift_for_ubfiz_p (machine_mode mode, rtx mask, rtx shft_amnt)
+{
+  return CONST_INT_P (mask) && CONST_INT_P (shft_amnt)
+	 && INTVAL (shft_amnt) < GET_MODE_BITSIZE (mode)
+	 && exact_log2 ((INTVAL (mask) >> INTVAL (shft_amnt)) + 1) >= 0
+	 && (INTVAL (mask) & ((1 << INTVAL (shft_amnt)) - 1)) == 0;
+}
+
 /* Calculate the cost of calculating X, storing it in *COST.  Result
    is true if the total cost of the operation has now been calculated.  */
 static bool
@@ -6437,17 +6450,31 @@ cost_plus:

       if (GET_MODE_CLASS (mode) == MODE_INT)
 	{
-	  /* We possibly get the immediate for free, this is not
-	     modelled.  */
-	  if (CONST_INT_P (op1)
-	      && aarch64_bitmask_imm (INTVAL (op1), mode))
+	  if (CONST_INT_P (op1))
 	    {
-	      *cost += rtx_cost (op0, mode, (enum rtx_code) code, 0, speed);
+	      /* We have a mask + shift version of a UBFIZ
+		 i.e. the *andim_ashift<mode>_bfiz pattern.  */
+	      if (GET_CODE (op0) == ASHIFT
+		  && aarch64_mask_and_shift_for_ubfiz_p (mode, op1,
+							 XEXP (op0, 1)))
+		{
+		  *cost += rtx_cost (XEXP (op0, 0), mode,
+				     (enum rtx_code) code, 0, speed);
+		  if (speed)
+		    *cost += extra_cost->alu.bfx;

-	      if (speed)
-		*cost += extra_cost->alu.logical;
+		  return true;
+		}
+	      else if (aarch64_bitmask_imm (INTVAL (op1), mode))
+		{
+		  /* We possibly get the immediate for free, this is not
+		     modelled.
+		     */
+		  *cost += rtx_cost (op0, mode, (enum rtx_code) code, 0, speed);
+		  if (speed)
+		    *cost += extra_cost->alu.logical;

-	      return true;
+		  return true;
+		}
+	    }
 	  else
 	    {
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 4cbd3bfba06792f80dadfd342dde73f3df7b6352..6320eb93e1bccbf56f51a92b051cfe2bb9549258 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4223,9 +4223,7 @@ (define_insn "*andim_ashift<mode>_bfiz"
 	(and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
			     (match_operand 2 "const_int_operand" "n"))
		 (match_operand 3 "const_int_operand" "n")))]
-  "(INTVAL (operands[2]) < (<GPI:sizen>))
-   && exact_log2 ((INTVAL (operands[3]) >> INTVAL (operands[2])) + 1) >= 0
-   && (INTVAL (operands[3]) & ((1 << INTVAL (operands[2])) - 1)) == 0"
+  "aarch64_mask_and_shift_for_ubfiz_p (<MODE>mode, operands[3], operands[2])"
  "ubfiz\\t%<w>0, %<w>1, %2, %P3"
  [(set_attr "type" "bfm")]
 )