From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
nd <nd@arm.com>
Subject: RE: [PATCH]middle-end convert negate + right shift into compare greater.
Date: Fri, 15 Oct 2021 07:48:11 +0000
Message-ID: <VI1PR08MB532555ED0B6F6B80740F46C0FFB99@VI1PR08MB5325.eurprd08.prod.outlook.com>
In-Reply-To: <3rqo9r7o-9584-4n72-n252-o47ro31qn31@fhfr.qr>
[-- Attachment #1: Type: text/plain, Size: 6908 bytes --]
> >
> > +/* Fold (-x >> C) into x > 0 where C = precision(type) - 1. */
> > +(for cst (INTEGER_CST VECTOR_CST)
> > + (simplify
> > + (rshift (negate:s @0) cst@1)
> > + (if (!flag_wrapv)
>
> Don't test flag_wrapv directly, instead use the appropriate
> TYPE_OVERFLOW_{UNDEFINED,WRAPS} predicates. But I'm not sure what
> we are protecting against? Right-shift of signed integers is implementation-
> defined and GCC treats it as you'd expect, sign-extending the result.
>
It protects against overflow of the negate on INT_MIN. With wrapping overflow
(-fwrapv), -INT_MIN is INT_MIN again, so the shift yields -1 while x > 0 is
false, and the folded result would be wrong.
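To illustrate the INT_MIN case, here is a small C sketch. It is not part of the
patch: wrapping_neg, shifted, folded and check_int_min_case are hypothetical
helper names. The negate is emulated with unsigned arithmetic so the example
itself has no undefined behaviour, and it assumes GCC's implementation-defined
choices (modulo conversion to signed types, arithmetic right shift of negative
values).

```c
#include <assert.h>
#include <stdint.h>

/* Negate the way -fwrapv would: compute in unsigned arithmetic, then
   convert back.  The conversion is implementation-defined; GCC reduces
   the value modulo 2^32.  */
static int32_t wrapping_neg (int32_t x)
{
  return (int32_t) (0u - (uint32_t) x);
}

/* The original expression, (-x) >> 31, with a wrapping negate.  */
static int32_t shifted (int32_t x)
{
  return wrapping_neg (x) >> 31;
}

/* The folded form: all ones when x > 0, zero otherwise.  */
static int32_t folded (int32_t x)
{
  return x > 0 ? -1 : 0;
}

/* Hypothetical self-check, not a testsuite entry.  */
static void check_int_min_case (void)
{
  /* The two forms agree away from INT32_MIN.  */
  assert (shifted (5) == folded (5));
  assert (shifted (0) == folded (0));
  assert (shifted (-7) == folded (-7));
  /* -INT32_MIN wraps back to INT32_MIN, so the shift gives -1 ...  */
  assert (shifted (INT32_MIN) == -1);
  /* ... while the comparison form gives 0.  */
  assert (folded (INT32_MIN) == 0);
}
```

So the two forms only differ at INT_MIN, which is why the fold has to be
disabled when overflow wraps.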
> > + (with { tree ctype = TREE_TYPE (@0);
> > + tree stype = TREE_TYPE (@1);
> > + tree bt = truth_type_for (ctype); }
> > + (switch
> > + /* Handle scalar case. */
> > + (if (INTEGRAL_TYPE_P (ctype)
> > + && !VECTOR_TYPE_P (ctype)
> > + && !TYPE_UNSIGNED (ctype)
> > + && canonicalize_math_after_vectorization_p ()
> > + && wi::eq_p (wi::to_wide (@1), TYPE_PRECISION (stype) - 1))
> > + (convert:bt (gt:bt @0 { build_zero_cst (stype); })))
>
> I'm not sure why the result is of type 'bt' rather than the original type of the
> expression?
That was to satisfy some RTL check that expects the result of a comparison to
always be a Boolean. For scalars that is logically always the case, so I just
added it for consistency.
>
> In that regard for non-vectors we'd have to add the sign extension from
> unsigned bool, in the vector case we'd hope the type of the comparison is
> correct. I think in both cases it might be convenient to use
>
> (cond (gt:bt @0 { build_zero_cst (ctype); }) { build_all_ones_cst (ctype); }
> { build_zero_cst (ctype); })
>
> to compute the correct result and rely on (cond ..) simplifications to simplify
> that if possible.
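A rough C analogue of the scalar point above, as a sketch only: via_convert,
via_cond, via_shift and check_scalar_forms are hypothetical names, and the
shift relies on GCC's arithmetic right shift of negative values. Converting
the boolean comparison result gives 0 or 1, whereas the original expression
gives 0 or -1, which is what selecting between all-ones and zero reproduces.

```c
#include <assert.h>
#include <stdint.h>

/* Converting the comparison result directly: true becomes 1.  */
static int32_t via_convert (int32_t x)
{
  return (int32_t) (x > 0);
}

/* Selecting between all-ones and zero, like the suggested (cond ...).  */
static int32_t via_cond (int32_t x)
{
  return x > 0 ? -1 : 0;
}

/* Reference: the original expression.  Callers must not pass INT32_MIN,
   for which the negate overflows.  */
static int32_t via_shift (int32_t x)
{
  return (-x) >> 31;
}

/* Hypothetical self-check, not a testsuite entry.  */
static void check_scalar_forms (void)
{
  assert (via_shift (3) == -1);
  assert (via_cond (3) == -1);
  assert (via_convert (3) == 1);   /* Differs from the original value.  */
  assert (via_shift (-3) == 0);
  assert (via_cond (-3) == 0);
  assert (via_convert (-3) == 0);
}
```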
>
> Btw, 'stype' should be irrelevant - you need to look at the precision of 'ctype',
> no?
I was working under the assumption that both input types must have the same
precision, but it turns out that assumption need not hold.
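A C sketch of why the precision has to come from the shifted operand rather
than from the shift count. All function names are hypothetical, int is assumed
to be 32 bits wide, and the shifts rely on GCC's arithmetic right shift of
negative values. Here the operand is 64 bits wide while the count is a plain
int constant.

```c
#include <assert.h>
#include <stdint.h>

/* Count derived from the operand's precision: 64 - 1.  */
static int64_t shift_by_operand_precision (int64_t x)
{
  return (-x) >> 63;
}

/* Count derived from the count's own type (int, assumed 32 bits): 32 - 1.
   This is not a sign mask for a 64-bit operand.  */
static int64_t shift_by_count_precision (int64_t x)
{
  return (-x) >> 31;
}

/* The folded form for a 64-bit operand.  */
static int64_t folded64 (int64_t x)
{
  return x > 0 ? -1 : 0;
}

/* Hypothetical self-check, not a testsuite entry.  */
static void check_precision_choice (void)
{
  int64_t big = INT64_C (1) << 40;

  assert (shift_by_operand_precision (big) == folded64 (big));
  assert (shift_by_operand_precision (-big) == folded64 (-big));
  /* With count 31 the result is neither 0 nor -1, so no fold applies.  */
  assert (shift_by_count_precision (-big) == 512);
  assert (shift_by_count_precision (big) == -512);
}
```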
New version attached.
Bootstrapped and regtested on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu with no regressions.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* match.pd: New negate+shift pattern.
gcc/testsuite/ChangeLog:
* gcc.dg/signbit-2.c: New test.
* gcc.dg/signbit-3.c: New test.
* gcc.target/aarch64/signbit-1.c: New test.
--- inline copy of patch ---
diff --git a/gcc/match.pd b/gcc/match.pd
index 7d2a24dbc5e9644a09968f877e12a824d8ba1caa..9532cae582e152cae6e22fcce95a9744a844e3c2 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -38,7 +38,8 @@ along with GCC; see the file COPYING3. If not see
uniform_integer_cst_p
HONOR_NANS
uniform_vector_p
- bitmask_inv_cst_vector_p)
+ bitmask_inv_cst_vector_p
+ expand_vec_cmp_expr_p)
/* Operator lists. */
(define_operator_list tcc_comparison
@@ -826,6 +827,42 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
{ tree utype = unsigned_type_for (type); }
(convert (rshift (lshift (convert:utype @0) @2) @3))))))
+/* Fold (-x >> C) into x > 0 where C = precision(type) - 1. */
+(for cst (INTEGER_CST VECTOR_CST)
+ (simplify
+ (rshift (negate:s @0) cst@1)
+ (if (!TYPE_OVERFLOW_WRAPS (type))
+ (with { tree ctype = TREE_TYPE (@0);
+ tree stype = TREE_TYPE (@1);
+ tree bt = truth_type_for (ctype);
+ tree zeros = build_zero_cst (ctype); }
+ (switch
+ /* Handle scalar case. */
+ (if (INTEGRAL_TYPE_P (ctype)
+ && !VECTOR_TYPE_P (ctype)
+ && !TYPE_UNSIGNED (ctype)
+ && canonicalize_math_after_vectorization_p ()
+ && wi::eq_p (wi::to_wide (@1), TYPE_PRECISION (ctype) - 1))
+ (cond (gt:bt @0 { zeros; }) { build_all_ones_cst (ctype); } { zeros; }))
+ /* Handle vector case with a scalar immediate. */
+ (if (VECTOR_INTEGER_TYPE_P (ctype)
+ && !VECTOR_TYPE_P (stype)
+ && !TYPE_UNSIGNED (ctype)
+ && expand_vec_cmp_expr_p (ctype, ctype, { GT_EXPR }))
+ (with { HOST_WIDE_INT bits = GET_MODE_UNIT_BITSIZE (TYPE_MODE (ctype)); }
+ (if (wi::eq_p (wi::to_wide (@1), bits - 1))
+ (convert:bt (gt:bt @0 { zeros; })))))
+ /* Handle vector case with a vector immediate. */
+ (if (VECTOR_INTEGER_TYPE_P (ctype)
+ && VECTOR_TYPE_P (stype)
+ && !TYPE_UNSIGNED (ctype)
+ && uniform_vector_p (@1)
+ && expand_vec_cmp_expr_p (ctype, ctype, { GT_EXPR }))
+ (with { tree cst = vector_cst_elt (@1, 0);
+ HOST_WIDE_INT bits = GET_MODE_UNIT_BITSIZE (TYPE_MODE (ctype)); }
+ (if (wi::eq_p (wi::to_wide (cst), bits - 1))
+ (convert:bt (gt:bt @0 { zeros; }))))))))))
+
/* Fold (C1/X)*C2 into (C1*C2)/X. */
(simplify
(mult (rdiv@3 REAL_CST@0 @1) REAL_CST@2)
diff --git a/gcc/testsuite/gcc.dg/signbit-2.c b/gcc/testsuite/gcc.dg/signbit-2.c
new file mode 100644
index 0000000000000000000000000000000000000000..fc0157cbc5c7996b481f2998bc30176c96a669bb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/signbit-2.c
@@ -0,0 +1,19 @@
+/* { dg-do assemble } */
+/* { dg-options "-O3 --save-temps -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+void fun1(int32_t *x, int n)
+{
+ for (int i = 0; i < (n & -16); i++)
+ x[i] = (-x[i]) >> 31;
+}
+
+void fun2(int32_t *x, int n)
+{
+ for (int i = 0; i < (n & -16); i++)
+ x[i] = (-x[i]) >> 30;
+}
+
+/* { dg-final { scan-tree-dump-times {\s+>\s+\{ 0, 0, 0, 0 \}} 1 optimized } } */
+/* { dg-final { scan-tree-dump-not {\s+>>\s+31} optimized } } */
diff --git a/gcc/testsuite/gcc.dg/signbit-3.c b/gcc/testsuite/gcc.dg/signbit-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..19e9c06c349b3287610f817628f00938ece60bf7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/signbit-3.c
@@ -0,0 +1,13 @@
+/* { dg-do assemble } */
+/* { dg-options "-O1 --save-temps -fdump-tree-optimized" } */
+
+#include <stdint.h>
+
+void fun1(int32_t *x, int n)
+{
+ for (int i = 0; i < (n & -16); i++)
+ x[i] = (-x[i]) >> 31;
+}
+
+/* { dg-final { scan-tree-dump-times {\s+>\s+0;} 1 optimized } } */
+/* { dg-final { scan-tree-dump-not {\s+>>\s+31} optimized } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/signbit-1.c b/gcc/testsuite/gcc.target/aarch64/signbit-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..3ebfb0586f37de29cf58635b27fe48503714447e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/signbit-1.c
@@ -0,0 +1,18 @@
+/* { dg-do assemble } */
+/* { dg-options "-O3 --save-temps" } */
+
+#include <stdint.h>
+
+void fun1(int32_t *x, int n)
+{
+ for (int i = 0; i < (n & -16); i++)
+ x[i] = (-x[i]) >> 31;
+}
+
+void fun2(int32_t *x, int n)
+{
+ for (int i = 0; i < (n & -16); i++)
+ x[i] = (-x[i]) >> 30;
+}
+
+/* { dg-final { scan-assembler-times {\tcmgt\t} 1 } } */
[-- Attachment #2: rb14918.patch --]
[-- Type: application/octet-stream, Size: 4296 bytes --]