From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Sandiford <Richard.Sandiford@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
nd <nd@arm.com>, Richard Earnshaw <Richard.Earnshaw@arm.com>,
Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
Subject: RE: [PATCH 2/4]AArch64: Add support for sign differing dot-product usdot for NEON and SVE.
Date: Tue, 25 May 2021 14:57:22 +0000 [thread overview]
Message-ID: <VI1PR08MB5325B5CD04376BBB84F1F919FF259@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <mpttuna356g.fsf@arm.com>
[-- Attachment #1: Type: text/plain, Size: 6337 bytes --]
Hi Richard,
> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Monday, May 10, 2021 5:49 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; Marcus Shawcroft
> <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: Re: [PATCH 2/4]AArch64: Add support for sign differing dot-product
> usdot for NEON and SVE.
>
> Tamar Christina <tamar.christina@arm.com> writes:
> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> > b/gcc/config/aarch64/aarch64-simd.md
> > index
> >
> 4edee99051c4e2112b546becca47da32aae21df2..c9fb8e702732dd311fb10de1
> 7126
> > 432e2a63a32b 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -648,6 +648,22 @@ (define_expand "<sur>dot_prod<vsi2qi>"
> > DONE;
> > })
> >
> > +;; Auto-vectorizer pattern for usdot
> > +(define_expand "usdot_prod<vsi2qi>"
> > + [(set (match_operand:VS 0 "register_operand")
> > + (plus:VS (unspec:VS [(match_operand:<VSI2QI> 1
> "register_operand")
> > + (match_operand:<VSI2QI> 2 "register_operand")]
> > + UNSPEC_USDOT)
> > + (match_operand:VS 3 "register_operand")))]
> > + "TARGET_I8MM"
> > +{
> > + emit_insn (
> > + gen_aarch64_usdot<vsi2qi> (operands[3], operands[3], operands[1],
> > + operands[2]));
> > + emit_move_insn (operands[0], operands[3]);
> > + DONE;
> > +})
>
> We can't modify operands[3] here; it's an input rather than an output.
Sorry, I should have noticed this.. I had blindly copied the existing pattern for dot-product and that looks like it's wrong.
I'll send a different patch to fix that one.
>
> It looks like this would work with just the {…} removed though.
> The pattern will match aarch64_usdot<vsi2qi> on its own accord.
>
> Even better would be to rename __builtin_aarch64_usdot… to
> __builtin_usdot_prod…, change its arguments so that they line up with the
> optabs, and change arm_neon.h to match.
>
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
> > b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
> > new file mode 100644
> > index
> >
> 0000000000000000000000000000000000000000..b99a945903c043c7410becaf6f
> 09
> > 496dd038410d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
> > @@ -0,0 +1,38 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -march=armv8.2-a+i8mm" } */
> > +
> > +#define N 480
> > +#define SIGNEDNESS_1 unsigned
> > +#define SIGNEDNESS_2 signed
> > +#define SIGNEDNESS_3 signed
> > +#define SIGNEDNESS_4 unsigned
> > +
> > +SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > +SIGNEDNESS_3 char *restrict a,
> > + SIGNEDNESS_4 char *restrict b)
> > +{
> > + for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > + {
> > + int av = a[i];
> > + int bv = b[i];
> > + SIGNEDNESS_2 short mult = av * bv;
> > + res += mult;
> > + }
> > + return res;
> > +}
> > +
> > +SIGNEDNESS_1 int __attribute__ ((noipa)) g (SIGNEDNESS_1 int res,
> > +SIGNEDNESS_3 char *restrict b,
> > + SIGNEDNESS_4 char *restrict a)
> > +{
> > + for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > + {
> > + int av = a[i];
> > + int bv = b[i];
> > + SIGNEDNESS_2 short mult = av * bv;
> > + res += mult;
> > + }
> > + return res;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
> > b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
> > new file mode 100644
> > index
> >
> 0000000000000000000000000000000000000000..094dd51cea62e0ba05ec35056
> 57b
> > f05320e5fdbb
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
> > @@ -0,0 +1,38 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -march=armv8.2-a+i8mm+sve" } */
> > +
> > +#define N 480
> > +#define SIGNEDNESS_1 unsigned
> > +#define SIGNEDNESS_2 signed
> > +#define SIGNEDNESS_3 signed
> > +#define SIGNEDNESS_4 unsigned
> > +
> > +SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > +SIGNEDNESS_3 char *restrict a,
> > + SIGNEDNESS_4 char *restrict b)
> > +{
> > + for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > + {
> > + int av = a[i];
> > + int bv = b[i];
> > + SIGNEDNESS_2 short mult = av * bv;
> > + res += mult;
> > + }
> > + return res;
> > +}
> > +
> > +SIGNEDNESS_1 int __attribute__ ((noipa)) g (SIGNEDNESS_1 int res,
> > +SIGNEDNESS_3 char *restrict b,
> > + SIGNEDNESS_4 char *restrict a)
> > +{
> > + for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > + {
> > + int av = a[i];
> > + int bv = b[i];
> > + SIGNEDNESS_2 short mult = av * bv;
> > + res += mult;
> > + }
> > + return res;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
>
> Guess this is personal preference, but I don't think the SIGNEDNESS_*
> macros add anything when used like this. I remember doing something
> similar in the past when including .c files from other .c files(!) in order to
> avoid cut-&-paste, but there doesn't seem much benefit for standalone files
> like these.
If it's the same to you, I do prefer this version, since it's identical to the mid-end tests,
It does allow when familiar with the tests to just quickly see what it's testing.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_usdot<vsi2qi>): Rename to...
(usdot_prod<vsi2qi>): ... This.
* config/aarch64/aarch64-simd-builtins.def (usdot): Rename to...
(usdot_prod): ...This.
* config/aarch64/arm_neon.h (vusdot_s32, vusdotq_s32): Likewise.
* config/aarch64/aarch64-sve.md (@aarch64_<sur>dot_prod<vsi2qi>):
Rename to...
(@<sur>dot_prod<vsi2qi>): ...This.
* config/aarch64/aarch64-sve-builtins-base.cc
(svusdot_impl::expand): Use it.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/simd/vusdot-autovec.c: New test.
* gcc.target/aarch64/sve/vusdot-autovec.c: New test.
>
> Thanks,
> Richard
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: rb14434.patch --]
[-- Type: text/x-diff; name="rb14434.patch", Size: 6384 bytes --]
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index b885bd5b38bf7ad83eb9d801284bf9b34db17210..c869ed9a6ab7d63f0e3d5fe393a93c1cc9142e78 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -361,10 +361,11 @@
BUILTIN_VSDQ_I_DI (BINOP, srshl, 0, NONE)
BUILTIN_VSDQ_I_DI (BINOP_UUS, urshl, 0, NONE)
- /* Implemented by aarch64_<sur><dotprod>{_lane}{q}<dot_mode>. */
+ /* Implemented by <sur><dotprod>_prod<dot_mode>. */
BUILTIN_VB (TERNOP, sdot, 0, NONE)
BUILTIN_VB (TERNOPU, udot, 0, NONE)
- BUILTIN_VB (TERNOP_SSUS, usdot, 0, NONE)
+ BUILTIN_VB (TERNOP_SSUS, usdot_prod, 10, NONE)
+ /* Implemented by aarch64_<sur><dotprod>_lane{q}<dot_mode>. */
BUILTIN_VB (QUADOP_LANE, sdot_lane, 0, NONE)
BUILTIN_VB (QUADOPU_LANE, udot_lane, 0, NONE)
BUILTIN_VB (QUADOP_LANE, sdot_laneq, 0, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4edee99051c4e2112b546becca47da32aae21df2..253ddbe25d3a86af4b40b056132e6a86a0392ea6 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -601,7 +601,7 @@ (define_insn "aarch64_<sur>dot<vsi2qi>"
;; These instructions map to the __builtins for the armv8.6a I8MM usdot
;; (vector) Dot Product operation.
-(define_insn "aarch64_usdot<vsi2qi>"
+(define_insn "usdot_prod<vsi2qi>"
[(set (match_operand:VS 0 "register_operand" "=w")
(plus:VS
(unspec:VS [(match_operand:<VSI2QI> 2 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index dfdf0e2fd186389cbddcff51ef52f8778d7fdb24..50adcd5404e97e610485140fdbfe4c8ebbf2f602 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -2366,7 +2366,7 @@ public:
Hence we do the same rotation on arguments as svdot_impl does. */
e.rotate_inputs_left (0, 3);
machine_mode mode = e.vector_mode (0);
- insn_code icode = code_for_aarch64_dot_prod (UNSPEC_USDOT, mode);
+ insn_code icode = code_for_dot_prod (UNSPEC_USDOT, mode);
return e.use_exact_insn (icode);
}
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index 7db2938bb84e04d066a7b07574e5cf344a3a8fb6..1278f6f12fadf8eec693cd47fd545ff3277f08f1 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -6870,7 +6870,7 @@ (define_insn "@aarch64_<sur>dot_prod_lane<vsi2qi>"
[(set_attr "movprfx" "*,yes")]
)
-(define_insn "@aarch64_<sur>dot_prod<vsi2qi>"
+(define_insn "@<sur>dot_prod<vsi2qi>"
[(set (match_operand:VNx4SI_ONLY 0 "register_operand" "=w, ?&w")
(plus:VNx4SI_ONLY
(unspec:VNx4SI_ONLY
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index baa30bd5a9d96c1bf04a37fb105091ea56a6444a..373f06a24ea6ce686d7e0cdf53dd364041c61092 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -34384,14 +34384,14 @@ __extension__ extern __inline int32x2_t
__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
vusdot_s32 (int32x2_t __r, uint8x8_t __a, int8x8_t __b)
{
- return __builtin_aarch64_usdotv8qi_ssus (__r, __a, __b);
+ return __builtin_aarch64_usdot_prodv8qi_ssus (__r, __a, __b);
}
__extension__ extern __inline int32x4_t
__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
vusdotq_s32 (int32x4_t __r, uint8x16_t __a, int8x16_t __b)
{
- return __builtin_aarch64_usdotv16qi_ssus (__r, __a, __b);
+ return __builtin_aarch64_usdot_prodv16qi_ssus (__r, __a, __b);
}
__extension__ extern __inline int32x2_t
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
new file mode 100644
index 0000000000000000000000000000000000000000..b99a945903c043c7410becaf6f09496dd038410d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+i8mm" } */
+
+#define N 480
+#define SIGNEDNESS_1 unsigned
+#define SIGNEDNESS_2 signed
+#define SIGNEDNESS_3 signed
+#define SIGNEDNESS_4 unsigned
+
+SIGNEDNESS_1 int __attribute__ ((noipa))
+f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
+ SIGNEDNESS_4 char *restrict b)
+{
+ for (__INTPTR_TYPE__ i = 0; i < N; ++i)
+ {
+ int av = a[i];
+ int bv = b[i];
+ SIGNEDNESS_2 short mult = av * bv;
+ res += mult;
+ }
+ return res;
+}
+
+SIGNEDNESS_1 int __attribute__ ((noipa))
+g (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict b,
+ SIGNEDNESS_4 char *restrict a)
+{
+ for (__INTPTR_TYPE__ i = 0; i < N; ++i)
+ {
+ int av = a[i];
+ int bv = b[i];
+ SIGNEDNESS_2 short mult = av * bv;
+ res += mult;
+ }
+ return res;
+}
+
+/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
new file mode 100644
index 0000000000000000000000000000000000000000..094dd51cea62e0ba05ec3505657bf05320e5fdbb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+i8mm+sve" } */
+
+#define N 480
+#define SIGNEDNESS_1 unsigned
+#define SIGNEDNESS_2 signed
+#define SIGNEDNESS_3 signed
+#define SIGNEDNESS_4 unsigned
+
+SIGNEDNESS_1 int __attribute__ ((noipa))
+f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
+ SIGNEDNESS_4 char *restrict b)
+{
+ for (__INTPTR_TYPE__ i = 0; i < N; ++i)
+ {
+ int av = a[i];
+ int bv = b[i];
+ SIGNEDNESS_2 short mult = av * bv;
+ res += mult;
+ }
+ return res;
+}
+
+SIGNEDNESS_1 int __attribute__ ((noipa))
+g (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict b,
+ SIGNEDNESS_4 char *restrict a)
+{
+ for (__INTPTR_TYPE__ i = 0; i < N; ++i)
+ {
+ int av = a[i];
+ int bv = b[i];
+ SIGNEDNESS_2 short mult = av * bv;
+ res += mult;
+ }
+ return res;
+}
+
+/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
next prev parent reply other threads:[~2021-05-25 14:57 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-05 17:38 [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes Tamar Christina
2021-05-05 17:38 ` [PATCH 2/4]AArch64: Add support for sign differing dot-product usdot for NEON and SVE Tamar Christina
2021-05-10 16:49 ` Richard Sandiford
2021-05-25 14:57 ` Tamar Christina [this message]
2021-05-26 8:50 ` Richard Sandiford
2021-05-05 17:39 ` [PATCH 3/4][AArch32]: Add support for sign differing dot-product usdot for NEON Tamar Christina
2021-05-05 17:42 ` FW: " Tamar Christina
[not found] ` <VI1PR08MB5325B832EE3BB6139886C0E9FF259@VI1PR08MB5325.eurprd08.prod.outlook.com>
2021-05-25 15:02 ` Tamar Christina
2021-05-26 10:45 ` Kyrylo Tkachov
2021-05-06 9:23 ` Christophe Lyon
2021-05-06 9:27 ` Tamar Christina
2021-05-05 17:39 ` [PATCH 4/4]middle-end: Add tests middle end generic tests for sign differing dotproduct Tamar Christina
[not found] ` <VI1PR08MB532511701573C18A33AC6291FF259@VI1PR08MB5325.eurprd08.prod.outlook.com>
2021-05-25 15:01 ` FW: " Tamar Christina
[not found] ` <11s2181-8856-30rq-26or-84q8o7qrr2o@fhfr.qr>
2021-05-26 8:48 ` Tamar Christina
2021-06-14 12:08 ` Tamar Christina
2021-05-07 11:45 ` [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes Richard Biener
2021-05-07 12:42 ` Tamar Christina
2021-05-10 11:39 ` Richard Biener
2021-05-10 12:58 ` Tamar Christina
2021-05-10 13:29 ` Richard Biener
2021-05-25 14:57 ` Tamar Christina
2021-05-26 8:56 ` Richard Biener
2021-06-02 9:28 ` Tamar Christina
2021-06-04 10:12 ` Tamar Christina
2021-06-07 10:10 ` Richard Sandiford
2021-06-14 12:06 ` Tamar Christina
2021-06-21 8:11 ` Tamar Christina
2021-06-22 10:56 ` Richard Sandiford
2021-06-22 11:16 ` Richard Sandiford
2021-07-12 9:18 ` Tamar Christina
2021-07-12 9:39 ` Richard Sandiford
2021-07-12 9:56 ` Tamar Christina
2021-07-12 10:25 ` Richard Sandiford
2021-07-12 12:29 ` Tamar Christina
2021-07-12 14:55 ` Richard Sandiford
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=VI1PR08MB5325B5CD04376BBB84F1F919FF259@VI1PR08MB5325.eurprd08.prod.outlook.com \
--to=tamar.christina@arm.com \
--cc=Kyrylo.Tkachov@arm.com \
--cc=Marcus.Shawcroft@arm.com \
--cc=Richard.Earnshaw@arm.com \
--cc=Richard.Sandiford@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=nd@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).