From: Richard Sandiford
To: Tamar Christina
Cc: gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com
Subject: Re: [PATCH 2/4]AArch64: correct usdot vectorizer and intrinsics optabs
Date: Thu, 15 Jul 2021 20:34:36 +0100
References: <20210715163953.GA2861@arm.com>
In-Reply-To: <20210715163953.GA2861@arm.com> (Tamar Christina's message of "Thu, 15 Jul 2021 17:39:58 +0100")

Tamar Christina writes:
> Hi All,
>
> There's a slight mismatch between the vectorizer optabs and the intrinsics
> patterns for NEON. The vectorizer expects operands[3] and operands[0] to be
> the same but the aarch64 intrinsics expanders expect operands[0] and
> operands[1] to be the same.
>
> This means we need different patterns here. This adds a separate usdot
> vectorizer pattern which just shuffles around the RTL params.
>
> There's also an inconsistency between the usdot and (u|s)dot intrinsics RTL
> patterns which is not corrected here.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?

Couldn't we just change:

> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> index 00d76ea937ace5763746478cbdfadf6479e0b15a..17e059efb80fa86a8a32127ace4fc7f43e2040a8 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -34039,14 +34039,14 @@ __extension__ extern __inline int32x2_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vusdot_s32 (int32x2_t __r, uint8x8_t __a, int8x8_t __b)
>  {
> -  return __builtin_aarch64_usdot_prodv8qi_ssus (__r, __a, __b);
> +  return __builtin_aarch64_usdotv8qi_ssus (__r, __a, __b);

…this to __builtin_aarch64_usdot_prodv8qi_ssus (__a, __b, __r) etc.?
I think that's an OK thing to do when the function is named after
an optab rather than an arm_neon.h intrinsic.

Thanks,
Richard

>  }
>  
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vusdotq_s32 (int32x4_t __r, uint8x16_t __a, int8x16_t __b)
>  {
> -  return __builtin_aarch64_usdot_prodv16qi_ssus (__r, __a, __b);
> +  return __builtin_aarch64_usdotv16qi_ssus (__r, __a, __b);
>  }
>  
>  __extension__ extern __inline int32x2_t