From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 41206 invoked by alias); 23 Nov 2016 18:43:45 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 41187 invoked by uid 89); 23 Nov 2016 18:43:44 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy=uso X-HELO: mail-qt0-f170.google.com Received: from mail-qt0-f170.google.com (HELO mail-qt0-f170.google.com) (209.85.216.170) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 23 Nov 2016 18:43:34 +0000 Received: by mail-qt0-f170.google.com with SMTP id p16so19986678qta.0 for ; Wed, 23 Nov 2016 10:43:34 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=JXTI95ItVUQsYbfTdO+55qE82Dk7JoEFGqtDED0Ro1c=; b=YJ0nGSV5e1IwCH0hAFQeCydbLvPi/9ZpwSaeZQ5SfqaumwJRRllOTaVsil6q2GHOJ4 0ZxGYaKWTz4nHPKucC0nDjv9ZzR0hI9R9uaxji4Cm+xidKQazu4pp7x4riXQmz2h458C OPUb+535kkiWPbLPRBzqsu82sKDOY8cmlfyPkmZfGYcFf6DPC+fyBkYwFgTFlfswQ7B+ S49qXcXm/MPrfVb5U9FVr2Gi2BO4K97pzI601gt7fIplHPCFZQTeyRmGWCP7KKeOVmUV y53zM0J5xFMTR8e9IjaTUDplclPRjTJjf39w2ngP3d64D2W1e10G5sG4XOZ/j6aY9lkL /WTg== X-Gm-Message-State: AKaTC01rsEoyk2re54nQ0RiRzoNTckXlQ9Fub3F0Evq1XinYyPr8JyFNqoCu+61yFTibVQCaHAzA9CODX7a47Wvb X-Received: by 10.200.51.87 with SMTP id u23mr3782853qta.139.1479926612697; Wed, 23 Nov 2016 10:43:32 -0800 (PST) MIME-Version: 1.0 Received: by 10.140.28.226 with HTTP; Wed, 23 Nov 2016 10:43:32 -0800 (PST) In-Reply-To: References: <20161117162603.GB14894@arm.com> <20161121095226.GB30941@arm.com> From: Christophe Lyon Date: Wed, 23 Nov 2016 18:43:00 -0000 Message-ID: Subject: Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions To: Michael Collison Cc: James Greenhalgh , "gcc-patches@gcc.gnu.org" , nd Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2016-11/txt/msg02414.txt.bz2 On 23 November 2016 at 17:30, Michael Collison wrote: > Hi Christophe, > > This is not a regression per se; the patch causes the test case to generate one less instruction overall, but one additional 'and'. Trunk before the patch (-O2): > > foo: > and w0, w0, 255 > lsl w1, w0, 20 > orr w0, w1, w0, lsl 8 > mov w1, 65024 > movk w1, 0xfe0, lsl 16 > and w0, w0, w1 > ret > > The new trunk aarch64 compiler with the patch generates fewer instructions but one additional 'and' > > foo: > and w0, w0, 255 > lsl w1, w0, 20 > orr w0, w1, w0, lsl 8 > and x0, x0, 268434944 > and x0, x0, -2031617 > ret > OK, by "regression" I meant that a test used to pass and now fails. It looks like you have to update it to take into account the new, better code. Christophe > -----Original Message----- > From: Christophe Lyon [mailto:christophe.lyon@linaro.org] > Sent: Wednesday, November 23, 2016 6:42 AM > To: James Greenhalgh > Cc: Michael Collison; gcc-patches@gcc.gnu.org; nd > Subject: Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions > > Hi Michael, > > > On 21 November 2016 at 10:52, James Greenhalgh wrote: >> On Fri, Nov 18, 2016 at 07:41:58AM +0000, Michael Collison wrote: >>> James, >>> >>> I incorporated all your suggestions, and successfully bootstrapped >>> and re-ran the testsuite. >>> >>> Okay for trunk? >>> >>> 2016-11-18 Michael Collison >>> >>> * config/aarch64/aarch64-protos.h >>> (aarch64_and_split_imm1, aarch64_and_split_imm2) >>> (aarch64_and_bitmask_imm): New prototypes >>> * config/aarch64/aarch64.c (aarch64_and_split_imm1): >>> New overloaded function to create bit mask covering the >>> lowest to highest bits set. >>> (aarch64_and_split_imm2): New overloaded functions to create bit >>> mask of zeros between first and last bit set. >>> (aarch64_and_bitmask_imm): New function to determine if a integer >>> is a valid two instruction "and" operation. >>> * config/aarch64/aarch64.md:(and3): New define_insn and _split >>> allowing wider range of constants with "and" operations. >>> * (ior3, xor3): Use new LOGICAL2 iterator to prevent >>> "and" operator from matching restricted constant range used for >>> ior and xor operators. >>> * config/aarch64/constraints.md (UsO constraint): New SImode constraint >>> for constants in "and" operantions. >>> (UsP constraint): New DImode constraint for constants in "and" operations. >>> * config/aarch64/iterators.md (lconst2): New mode iterator. >>> (LOGICAL2): New code iterator. >>> * config/aarch64/predicates.md (aarch64_logical_and_immediate): New >>> predicate >>> (aarch64_logical_and_operand): New predicate allowing extended constants >>> for "and" operations. >>> * testsuite/gcc.target/aarch64/and_const.c: New test to verify >>> additional constants are recognized and fewer instructions generated. >>> * testsuite/gcc.target/aarch64/and_const2.c: New test to verify >>> additional constants are recognized and fewer instructions generated. >>> >> >>> diff --git a/gcc/config/aarch64/aarch64-protos.h >>> b/gcc/config/aarch64/aarch64-protos.h >>> index 3cdd69b..7ef8cdf 100644 >>> --- a/gcc/config/aarch64/aarch64-protos.h >>> +++ b/gcc/config/aarch64/aarch64-protos.h >>> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params; >>> HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, >>> unsigned); int aarch64_get_condition_code (rtx); bool >>> aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode); >>> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT >>> +val_in); unsigned HOST_WIDE_INT aarch64_and_split_imm2 >>> +(HOST_WIDE_INT val_in); bool aarch64_and_bitmask_imm (unsigned >>> +HOST_WIDE_INT val_in, machine_mode mode); >>> int aarch64_branch_cost (bool, bool); enum aarch64_symbol_type >>> aarch64_classify_symbolic_expression (rtx); bool >>> aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT); diff --git >>> a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index >>> 3e663eb..8e33c40 100644 >>> --- a/gcc/config/aarch64/aarch64.c >>> +++ b/gcc/config/aarch64/aarch64.c >>> @@ -3600,6 +3600,43 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode) >>> return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26]; >>> } >>> >>> +/* Create mask of ones, covering the lowest to highest bits set in >>> +VAL_IN. */ >>> + >>> +unsigned HOST_WIDE_INT >>> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in) { >>> + int lowest_bit_set = ctz_hwi (val_in); >>> + int highest_bit_set = floor_log2 (val_in); >>> + gcc_assert (val_in != 0); >> >> The comment above the function should make this precondition clear. Or >> you could pick a well defined behaviour for when val_in == 0 (return >> 0?), if that fits the rest of the algorithm. >> >> Otherwise, this patch looks OK to me. >> >> Thanks, >> James >> > > This patch (r242739) causes a regression on aarch64: > > gcc.dg/zero_bits_compound-2.c scan-assembler-times \\(and: 2 now fails. > > Christophe.