From: Richard Sandiford
To: Tamar Christina
Cc: gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com
Subject: Re: [PATCH 2/4]AArch64: Add support for sign differing dot-product usdot for NEON and SVE.
Date: Mon, 10 May 2021 17:49:11 +0100
In-Reply-To: <20210505173854.GA17884@arm.com> (Tamar Christina's message of "Wed, 5 May 2021 18:38:56 +0100")

Tamar Christina writes:
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 4edee99051c4e2112b546becca47da32aae21df2..c9fb8e702732dd311fb10de17126432e2a63a32b 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -648,6 +648,22 @@ (define_expand "dot_prod"
>    DONE;
>  })
>  
> +;; Auto-vectorizer pattern for usdot
> +(define_expand "usdot_prod"
> +  [(set (match_operand:VS 0 "register_operand")
> +	(plus:VS (unspec:VS [(match_operand: 1 "register_operand")
> +			    (match_operand: 2 "register_operand")]
> +		 UNSPEC_USDOT)
> +		 (match_operand:VS 3 "register_operand")))]
> +  "TARGET_I8MM"
> +{
> +  emit_insn (
> +    gen_aarch64_usdot (operands[3], operands[3], operands[1],
> +		       operands[2]));
> +  emit_move_insn (operands[0], operands[3]);
> +  DONE;
> +})

We can't modify operands[3] here; it's an input rather than an output.

It looks like this would work with just the {…} removed though.
The pattern will match aarch64_usdot of its own accord.

Even better would be to rename __builtin_aarch64_usdot… to
__builtin_usdot_prod…, change its arguments so that they line up with
the optabs, and change arm_neon.h to match.

> diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..b99a945903c043c7410becaf6f09496dd038410d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.2-a+i8mm" } */
> +
> +#define N 480
> +#define SIGNEDNESS_1 unsigned
> +#define SIGNEDNESS_2 signed
> +#define SIGNEDNESS_3 signed
> +#define SIGNEDNESS_4 unsigned
> +
> +SIGNEDNESS_1 int __attribute__ ((noipa))
> +f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
> +   SIGNEDNESS_4 char *restrict b)
> +{
> +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> +    {
> +      int av = a[i];
> +      int bv = b[i];
> +      SIGNEDNESS_2 short mult = av * bv;
> +      res += mult;
> +    }
> +  return res;
> +}
> +
> +SIGNEDNESS_1 int __attribute__ ((noipa))
> +g (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict b,
> +   SIGNEDNESS_4 char *restrict a)
> +{
> +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> +    {
> +      int av = a[i];
> +      int bv = b[i];
> +      SIGNEDNESS_2 short mult = av * bv;
> +      res += mult;
> +    }
> +  return res;
> +}
> +
> +/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..094dd51cea62e0ba05ec3505657bf05320e5fdbb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.2-a+i8mm+sve" } */
> +
> +#define N 480
> +#define SIGNEDNESS_1 unsigned
> +#define SIGNEDNESS_2 signed
> +#define SIGNEDNESS_3 signed
> +#define SIGNEDNESS_4 unsigned
> +
> +SIGNEDNESS_1 int __attribute__ ((noipa))
> +f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
> +   SIGNEDNESS_4 char *restrict b)
> +{
> +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> +    {
> +      int av = a[i];
> +      int bv = b[i];
> +      SIGNEDNESS_2 short mult = av * bv;
> +      res += mult;
> +    }
> +  return res;
> +}
> +
> +SIGNEDNESS_1 int __attribute__ ((noipa))
> +g (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict b,
> +   SIGNEDNESS_4 char *restrict a)
> +{
> +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> +    {
> +      int av = a[i];
> +      int bv = b[i];
> +      SIGNEDNESS_2 short mult = av * bv;
> +      res += mult;
> +    }
> +  return res;
> +}
> +
> +/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */

Guess this is personal preference, but I don't think the SIGNEDNESS_*
macros add anything when used like this.  I remember doing something
similar in the past when including .c files from other .c files(!)
in order to avoid cut-&-paste, but there doesn't seem much benefit
for standalone files like these.

Thanks,
Richard
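
For illustration only, here is roughly what the NEON test might look like with the SIGNEDNESS_* macros expanded by hand. This is a sketch of the suggestion above, not the version that was committed; the substitutions simply follow the #define lines in the quoted patch (SIGNEDNESS_1 and SIGNEDNESS_4 become unsigned, SIGNEDNESS_2 and SIGNEDNESS_3 become signed), and the dg- directives are copied unchanged.

```c
/* { dg-do compile } */
/* { dg-options "-O3 -march=armv8.2-a+i8mm" } */

#define N 480

/* Signed elements in A, unsigned elements in B.  */
unsigned int __attribute__ ((noipa))
f (unsigned int res, signed char *restrict a, unsigned char *restrict b)
{
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];
      int bv = b[i];
      signed short mult = av * bv;
      res += mult;
    }
  return res;
}

/* Same loop with the signedness of the two arrays swapped.  */
unsigned int __attribute__ ((noipa))
g (unsigned int res, signed char *restrict b, unsigned char *restrict a)
{
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];
      int bv = b[i];
      signed short mult = av * bv;
      res += mult;
    }
  return res;
}

/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
```

Written this way, the operand signedness that the usdot match depends on is visible directly in each prototype rather than through four levels of #define.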