From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id C79E7385BF84 for ; Thu, 15 Jul 2021 19:44:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C79E7385BF84 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 71BEC31B; Thu, 15 Jul 2021 12:44:39 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 816423F7D8; Thu, 15 Jul 2021 12:44:38 -0700 (PDT) From: Richard Sandiford To: Tamar Christina Mail-Followup-To: Tamar Christina , gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com Cc: gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com Subject: Re: [PATCH 3/4]AArch64: correct dot-product RTL patterns for aarch64. References: <20210715164019.GA5409@arm.com> Date: Thu, 15 Jul 2021 20:44:37 +0100 In-Reply-To: <20210715164019.GA5409@arm.com> (Tamar Christina's message of "Thu, 15 Jul 2021 17:40:23 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Jul 2021 19:44:41 -0000 Tamar Christina writes: > Hi All, > > The previous fix for this problem was wrong due to a subtle difference be= tween > where NEON expects the RMW values and where intrinsics expects them. > > The insn pattern is modeled after the intrinsics and so needs an expand f= or > the vectorizer optab to switch the RTL. > > However operand[3] is not expected to be written to so the current patter= n is > bogus. > > Instead we use the expand to shuffle around the RTL. > > The vectorizer expects operands[3] and operands[0] to be > the same but the aarch64 intrinsics expanders expect operands[0] and > operands[1] to be the same. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? and active branches after some stew? > > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md (dot_prod): Correct > RTL. > > --- inline copy of patch --=20 > diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarc= h64-simd.md > index 7397f1ec5ca0cb9e3cdd5c46772f604e640666e4..51789f954affd9fa88e2bc1bc= c3dacf64ccb5bde 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -635,18 +635,12 @@ (define_insn "aarch64_usdot" > ;; and so the vectorizer provides r, in which the result has to be accum= ulated. > (define_expand "dot_prod" > [(set (match_operand:VS 0 "register_operand") > - (plus:VS (unspec:VS [(match_operand: 1 "register_operand") > + (plus:VS (match_operand:VS 3 "register_operand") > + (unspec:VS [(match_operand: 1 "register_operand") > (match_operand: 2 "register_operand")] > - DOTPROD) > - (match_operand:VS 3 "register_operand")))] > + DOTPROD)))] > "TARGET_DOTPROD" The canonical plus: operand order was the original one, so I think it would be better to keep this rtl as-is and instead change aarch64_dot to: (plus:VS (unspec:VS [(match_operand: 2 "register_operand" "w") (match_operand: 3 "register_operand" "w")] DOTPROD) (match_operand:VS 1 "register_operand" "0")) Same idea for aarch64_dot_lane and aarch64_dot_laneq. Sorry to be awkward=E2=80=A6 Thanks, Richard > -{ > - emit_insn ( > - gen_aarch64_dot (operands[3], operands[3], operands[1], > - operands[2])); > - emit_insn (gen_rtx_SET (operands[0], operands[3])); > - DONE; > -}) > +) >=20=20 > ;; Auto-vectorizer pattern for usdot. The operand[3] and operand[0] are= the > ;; RMW parameters that when it comes to the vectorizer.