From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ua1-x92e.google.com (mail-ua1-x92e.google.com [IPv6:2607:f8b0:4864:20::92e])
 by sourceware.org (Postfix) with ESMTPS id 1F3ED3858D35
 for ; Mon, 11 Oct 2021 19:56:32 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1F3ED3858D35
Received: by mail-ua1-x92e.google.com with SMTP id i8so14942979uae.7
 for ; Mon, 11 Oct 2021 12:56:32 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=TGoWwFi77+xh9IEeCdd0H9+jycuiMjuqFry6un48id8=;
 b=h9CdEWnyJ2fGauRkVi+XtzRWAaBp2xF7aphUUt/NvGvVcDNGDzihZQeWaqnSMAraZU
 Eg3G1xfyPuq/Ets/MTM3gM5zGapsq2UVxAlnh0zt9LoVysaZPW3ntgnE0BpIJQ37nV+d
 6yfeOghVhFP+gaxWWO5ieVjQzKdFupiBrVDH2O7w4VUMG/qPTMp8PbQuDp3dp4d8Lic6
 tD8lfKMxpnonpBLawpP4VYsJcSWtei7yYzx//o3LC11pt7tWKWMII6KOjRzpx2fWC+oS
 LMM26ArbzGPAyLIPibXDV3zIv8bBRZWEcG8pUimtWRnIb2FMZQ8OD7OFtevYtao8IiTM
 9hsA==
X-Gm-Message-State: AOAM532dba9AeX8G/RQZAa8WvXVxwWdWpREQAYb3zMuTCO7nRwB4VdVF
 CQSyU1gDNRvEtLQPcWuKUMDuT1tkxr212V8whqM=
X-Google-Smtp-Source: ABdhPJy0E4b5fJDreFBLBCrYT7V14XCOMKIoZ15bHcs2K8DupboTHM77R0xnSpS+ForkQ7azDceTFIB8KCyZJvLeX+I=
X-Received: by 2002:a05:6102:b13:: with SMTP id b19mr25534247vst.37.1633982191634;
 Mon, 11 Oct 2021 12:56:31 -0700 (PDT)
MIME-Version: 1.0
References: <20210929162001.GA31867@arm.com>
In-Reply-To:
From: Andrew Pinski
Date: Mon, 11 Oct 2021 12:56:18 -0700
Message-ID:
Subject: Re: [PATCH 3/7]AArch64 Add pattern for sshr to cmlt
To: Kyrylo Tkachov
Cc: Tamar Christina , "gcc-patches@gcc.gnu.org" , "apinski@marvell.com" ,
 Richard Earnshaw , nd , Marcus Shawcroft , Richard Sandiford
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-9.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,
 KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham
 autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 11 Oct 2021 19:56:33 -0000

On Thu, Sep 30, 2021 at 2:28 AM Kyrylo Tkachov via Gcc-patches wrote:
> > -----Original Message-----
> > From: Tamar Christina
> > Sent: Wednesday, September 29, 2021 5:20 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: nd ; Richard Earnshaw ;
> > Marcus Shawcroft ; Kyrylo Tkachov
> > ; Richard Sandiford
> >
> > Subject: [PATCH 3/7]AArch64 Add pattern for sshr to cmlt
> >
> > Hi All,
> >
> > This optimizes signed right shift by BITSIZE-1 into a cmlt operation which is
> > more optimal because generally compares have a higher throughput than
> > shifts.
> >
> > On AArch64 the result of the shift would have been either -1 or 0 which is the
> > results of the compare.
> >
> > i.e.
> >
> > void e (int * restrict a, int *b, int n)
> > {
> >     for (int i = 0; i < n; i++)
> >       b[i] = a[i] >> 31;
> > }
> >
> > now generates:
> >
> > .L4:
> >         ldr     q0, [x0, x3]
> >         cmlt    v0.4s, v0.4s, #0
> >         str     q0, [x1, x3]
> >         add     x3, x3, 16
> >         cmp     x4, x3
> >         bne     .L4
> >
> > instead of:
> >
> > .L4:
> >         ldr     q0, [x0, x3]
> >         sshr    v0.4s, v0.4s, 31
> >         str     q0, [x1, x3]
> >         add     x3, x3, 16
> >         cmp     x4, x3
> >         bne     .L4
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> This should be okay (either a win or neutral) for Arm Cortex and Neoverse cores so I'm tempted to not ask for a CPU-specific tunable to guard it to keep the code clean.
> Andrew, would this change be okay from a Thunder X line perspective?

I don't know about ThunderX2 but here are the details for ThunderX1
(and OcteonX1) and OcteonX2:
The sshr and cmlt are handled the same in the pipeline as far as I can tell.

Thanks,
Andrew

> Thanks,
> Kyrill
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >       * config/aarch64/aarch64-simd.md (aarch64_simd_ashr):
> > Add case cmp
> >       case.
> >       * config/aarch64/constraints.md (D1): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/aarch64/shl-combine-2.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> > b/gcc/config/aarch64/aarch64-simd.md
> > index
> > 300bf001b59ca7fa197c580b10adb7f70f20d1e0..19b2d0ad4dab4d574269829
> > 7ded861228ee22007 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -1127,12 +1127,14 @@ (define_insn "aarch64_simd_lshr"
> >  )
> >
> >  (define_insn "aarch64_simd_ashr"
> > - [(set (match_operand:VDQ_I 0 "register_operand" "=w")
> > -       (ashiftrt:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")
> > -                    (match_operand:VDQ_I 2 "aarch64_simd_rshift_imm"
> > "Dr")))]
> > + [(set (match_operand:VDQ_I 0 "register_operand" "=w,w")
> > +       (ashiftrt:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w,w")
> > +                    (match_operand:VDQ_I 2 "aarch64_simd_rshift_imm"
> > "D1,Dr")))]
> >   "TARGET_SIMD"
> > - "sshr\t%0., %1., %2"
> > -  [(set_attr "type" "neon_shift_imm")]
> > + "@
> > +  cmlt\t%0., %1., #0
> > +  sshr\t%0., %1., %2"
> > +  [(set_attr "type" "neon_compare,neon_shift_imm")]
> >  )
> >
> >  (define_insn "*aarch64_simd_sra"
> > diff --git a/gcc/config/aarch64/constraints.md
> > b/gcc/config/aarch64/constraints.md
> > index
> > 3b49b452119c49320020fa9183314d9a25b92491..18630815ffc13f2168300a89
> > 9db69fd428dfb0d6 100644
> > --- a/gcc/config/aarch64/constraints.md
> > +++ b/gcc/config/aarch64/constraints.md
> > @@ -437,6 +437,14 @@ (define_constraint "Dl"
> >        (match_test "aarch64_simd_shift_imm_p (op, GET_MODE (op),
> > true)")))
> >
> > +(define_constraint "D1"
> > +  "@internal
> > + A constraint that matches vector of immediates that is bits(mode)-1."
> > +  (and (match_code "const,const_vector")
> > +       (match_test "aarch64_const_vec_all_same_in_range_p (op,
> > +                     GET_MODE_UNIT_BITSIZE (mode) - 1,
> > +                     GET_MODE_UNIT_BITSIZE (mode) - 1)")))
> > +
> >  (define_constraint "Dr"
> >    "@internal
> >   A constraint that matches vector of immediates for right shifts."
> > diff --git a/gcc/testsuite/gcc.target/aarch64/shl-combine-2.c
> > b/gcc/testsuite/gcc.target/aarch64/shl-combine-2.c
> > new file mode 100644
> > index
> > 0000000000000000000000000000000000000000..bdfe35d09ffccc7928947c9e
> > 57f1034f7ca2c798
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/shl-combine-2.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do assemble } */
> > +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */
> > +
> > +void e (int * restrict a, int *b, int n)
> > +{
> > +    for (int i = 0; i < n; i++)
> > +      b[i] = a[i] >> 31;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {\tcmlt\t} 1 } } */
> > +/* { dg-final { scan-assembler-not {\tsshr\t} } } */
> > +
> >
> >
> > --
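
For anyone checking the equivalence by hand, here is a minimal scalar
sketch of the identity the quoted patch relies on for 32-bit lanes: an
arithmetic right shift by BITSIZE-1 and a "less than zero" compare both
give -1 for negative inputs and 0 otherwise. The names shift_mask,
cmlt_zero_mask and masks_agree are illustrative only and do not appear in
the patch. The sketch also assumes that >> on a negative signed int is an
arithmetic shift, which is how GCC implements it; ISO C leaves this
implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one lane of "sshr v.4s, v.4s, 31".
   Assumption: >> on a negative int32_t is an arithmetic shift (true for GCC). */
static int32_t shift_mask(int32_t x)
{
    return x >> 31;
}

/* Scalar model of one lane of "cmlt v.4s, v.4s, #0":
   all bits set when the lane is negative, all bits clear otherwise. */
static int32_t cmlt_zero_mask(int32_t x)
{
    return x < 0 ? -1 : 0;
}

/* Returns 1 when both models agree on a few boundary values, 0 otherwise. */
static int masks_agree(void)
{
    const int32_t samples[] = { INT32_MIN, -2, -1, 0, 1, 2, INT32_MAX };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (shift_mask(samples[i]) != cmlt_zero_mask(samples[i]))
            return 0;
    return 1;
}
```

Calling masks_agree() from a small main() is enough to spot-check the
boundary cases; it says nothing about throughput, which is the
per-core question discussed above.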