From: Tamar Christina <tamar.christina@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, jeffreyalaw@gmail.com
Subject: [PATCH 2/8]middle-end: Recognize scalar widening reductions
Date: Mon, 31 Oct 2022 11:57:12 +0000
Message-ID: <Y1+4GFnUyuwSK1hy@arm.com>
In-Reply-To: <patch-16240-tamar@arm.com>
Hi All,
This adds two new optabs and an IFN for REDUC_PLUS_WIDEN, where the resulting
scalar reduction has twice the precision of the input elements.
In a later patch I will also teach the vectorizer to recognize this IFN,
once I figure out how the various bits of reductions work.
For now it is generated only by the match.pd pattern.
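As a rough illustration only, the expression that the match.pd pattern
matches, (convert (IFN_REDUC_PLUS @0)), can be modelled in scalar C as below.
The sketch assumes unsigned 8-bit elements and a 16-bit result; the function
names are hypothetical and are not part of the patch. It models only the
matched source expression (sum in the element type, then convert), and says
nothing about what a particular target instruction computes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of IFN_REDUC_PLUS on unsigned 8-bit elements:
   the sum is computed in the element type, so it wraps modulo 256.  */
static uint8_t
reduc_plus_u8 (const uint8_t *v, size_t n)
{
  uint8_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = (uint8_t) (sum + v[i]);
  return sum;
}

/* Hypothetical model of (convert (IFN_REDUC_PLUS v)) where the result type
   has twice the element precision and the same signedness.  */
static uint16_t
convert_of_reduc_plus_u8 (const uint8_t *v, size_t n)
{
  return (uint16_t) reduc_plus_u8 (v, n);
}
```

With the inputs {1, 2, 3, 4} this model yields 10. With {200, 100} the narrow
sum wraps to 44 before the conversion, which is the case where summing in the
element type and summing in the wider type would differ.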
Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* internal-fn.def (REDUC_PLUS_WIDEN): New.
* doc/md.texi: Document it.
* match.pd: Recognize widening plus.
* optabs.def (reduc_splus_widen_scal_optab,
reduc_uplus_widen_scal_optab): New.
--- inline copy of patch --
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 34825549ed4e315b07d36dc3d63bae0cc0a3932d..c08691ab4c9a4bfe55ae81e5e228a414d6242d78 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5284,6 +5284,20 @@ Compute the sum of the elements of a vector. The vector is operand 1, and
operand 0 is the scalar result, with mode equal to the mode of the elements of
the input vector.
+@cindex @code{reduc_uplus_widen_scal_@var{m}} instruction pattern
+@item @samp{reduc_uplus_widen_scal_@var{m}}
+Compute the sum of the elements of a vector and zero-extend @var{m} to a mode
+that has twice the precision of @var{m}. The vector is operand 1, and
+operand 0 is the scalar result, with mode equal to twice the precision of the
+mode of the elements of the input vector.
+
+@cindex @code{reduc_splus_widen_scal_@var{m}} instruction pattern
+@item @samp{reduc_splus_widen_scal_@var{m}}
+Compute the sum of the elements of a vector and sign-extend @var{m} to a mode
+that has twice the precision of @var{m}. The vector is operand 1, and
+operand 0 is the scalar result, with mode equal to twice the precision of the
+mode of the elements of the input vector.
+
@cindex @code{reduc_and_scal_@var{m}} instruction pattern
@item @samp{reduc_and_scal_@var{m}}
@cindex @code{reduc_ior_scal_@var{m}} instruction pattern
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 5e672183f4def9d0cdc29cf12fe17e8cff928f9f..f64a8421b1087b6c0f3602dc556876b0fd15c7ad 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -215,6 +215,9 @@ DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary)
DEF_INTERNAL_OPTAB_FN (REDUC_PLUS, ECF_CONST | ECF_NOTHROW,
reduc_plus_scal, unary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_PLUS_WIDEN, ECF_CONST | ECF_NOTHROW,
+ first, reduc_splus_widen_scal,
+ reduc_uplus_widen_scal, unary)
DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MAX, ECF_CONST | ECF_NOTHROW, first,
reduc_smax_scal, reduc_umax_scal, unary)
DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MIN, ECF_CONST | ECF_NOTHROW, first,
diff --git a/gcc/match.pd b/gcc/match.pd
index aecaa3520b36e770d11ea9a10eb18db23c0cd9f7..1d407414bee278c64c00d425d9f025c1c58d853d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7237,6 +7237,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(ifnf (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))
(ifni (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))))))))
+/* Widening reduction conversions. */
+(simplify
+ (convert (IFN_REDUC_PLUS @0))
+ (if (element_precision (TREE_TYPE (@0)) * 2 == element_precision (type)
+ && TYPE_UNSIGNED (type) == TYPE_UNSIGNED (TREE_TYPE (@0))
+ && ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+ (IFN_REDUC_PLUS_WIDEN @0)))
+
(simplify
(BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
(BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
diff --git a/gcc/optabs.def b/gcc/optabs.def
index a6db2342bed6baf13ecbd84112c8432c6972e6fe..9947aed67fb8a3b675cb0aab9aeb059f89644106 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -346,6 +346,8 @@ OPTAB_D (reduc_fmin_scal_optab, "reduc_fmin_scal_$a")
OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a")
OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a")
OPTAB_D (reduc_plus_scal_optab, "reduc_plus_scal_$a")
+OPTAB_D (reduc_splus_widen_scal_optab, "reduc_splus_widen_scal_$a")
+OPTAB_D (reduc_uplus_widen_scal_optab, "reduc_uplus_widen_scal_$a")
OPTAB_D (reduc_umax_scal_optab, "reduc_umax_scal_$a")
OPTAB_D (reduc_umin_scal_optab, "reduc_umin_scal_$a")
OPTAB_D (reduc_and_scal_optab, "reduc_and_scal_$a")
--