* [PATCH 4/4 v3][PR 67328] Optimize some masked comparisons to efficient bittest
@ 2017-05-29 7:14 Yuri Gribov
2017-05-31 11:24 ` Richard Biener
From: Yuri Gribov @ 2017-05-29 7:14 UTC
To: GCC Patches; +Cc: Alan Modra, rguenth
[-- Attachment #1: Type: text/plain, Size: 116 bytes --]
This patch no longer fixes the PR, but it still helps in some cases, as
the test demonstrates, so I decided to keep it.
-I
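For reference, the rewrite exercised by the new test can be sanity-checked
in plain C. The sketch below is illustrative only and is not part of the
patch: the helper names (masked_le_7, masked_lt_8, bittest,
count_mismatches) are made up for this example, and the folded form
(x & 0xf8) == 0 is derived from the pattern comment (0xff - 7 == 0xf8).

```c
#include <stdbool.h>

/* Original form from the test: low 8 bits of x compared against 7.  */
static bool masked_le_7 (int x) { return (x & 0xff) <= 7; }

/* The "< 8" spelling, which the pattern comment says is canonicalized
   to the "<= 7" form.  */
static bool masked_lt_8 (int x) { return (x & 0xff) < 8; }

/* Expected folded form: bits 3..7 (0xff - 7 == 0xf8) must all be clear.  */
static bool bittest (int x) { return (x & 0xf8) == 0; }

/* Compare the three forms over a range that covers every value of the
   low 8 bits, including negative inputs.  Returns the number of inputs
   on which the forms disagree.  */
static int count_mismatches (void)
{
  int bad = 0;
  for (int x = -1024; x <= 1024; x++)
    if (masked_le_7 (x) != bittest (x) || masked_lt_8 (x) != bittest (x))
      bad++;
  return bad;
}
```

If the three forms are equivalent, count_mismatches () returns 0.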
[-- Attachment #2: 0004-Optimize-some-masked-comparisons-to-efficient-bittes.patch --]
[-- Type: application/octet-stream, Size: 2704 bytes --]
From 582d0ffb224d9130075f448812e0b1de4921e267 Mon Sep 17 00:00:00 2001
From: Yury Gribov <tetra2005@gmail.com>
Date: Fri, 26 May 2017 07:58:48 +0100
Subject: [PATCH 4/4] Optimize some masked comparisons to efficient bittest.
gcc/
2017-05-26 Yury Gribov <tetra2005@gmail.com>
* match.pd: New pattern.
gcc/testsuite/
2017-05-26 Yury Gribov <tetra2005@gmail.com>
* c-c++-common/fold-masked-cmp-3.c: New test.
---
gcc/match.pd | 33 ++++++++++++++++++++++++++
gcc/testsuite/c-c++-common/fold-masked-cmp-3.c | 16 +++++++++++++
2 files changed, 49 insertions(+)
create mode 100644 gcc/testsuite/c-c++-common/fold-masked-cmp-3.c
diff --git a/gcc/match.pd b/gcc/match.pd
index b5e5a98..253104b 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2765,6 +2765,39 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
|| VECTOR_INTEGER_TYPE_P (TREE_TYPE (@0)))
{ constant_boolean_node (false, type); })))
+/* A & (2**N - 1) <= 2**K - 1 -> A & (2**N - 2**K) == 0
+   A & (2**N - 1) > 2**K - 1 -> A & (2**N - 2**K) != 0
+
+   Note that comparisons
+     A & (2**N - 1) < 2**K -> A & (2**N - 2**K) == 0
+     A & (2**N - 1) >= 2**K -> A & (2**N - 2**K) != 0
+   will be canonicalized to above so there's no need to
+   consider them here.
+ */
+
+(for cmp (le gt)
+ (simplify
+  (cmp (bit_and@0 @1 INTEGER_CST@2) INTEGER_CST@3)
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+   (with
+    {
+     widest_int mask = wi::to_widest (@2);
+     bool mask_all_ones_p = !(mask & (mask + 1));
+     widest_int rhs = wi::to_widest (@3);
+     bool rhs_all_ones_p = !(rhs & (rhs + 1));
+    }
+    (if (mask_all_ones_p && rhs > 0 && rhs_all_ones_p && mask >= rhs)
+     (with
+      {
+       tree ty = TREE_TYPE (@0);
+       widest_int hi_bits = mask - rhs;
+      }
+      (switch
+       (if (cmp == LE_EXPR)
+        (eq:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) { build_zero_cst (ty); }))
+       (if (cmp == GT_EXPR)
+        (ne:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) { build_zero_cst (ty); })))))))))
+
/* -A CMP -B -> B CMP A. */
(for cmp (tcc_comparison)
scmp (swapped_tcc_comparison)
diff --git a/gcc/testsuite/c-c++-common/fold-masked-cmp-3.c b/gcc/testsuite/c-c++-common/fold-masked-cmp-3.c
new file mode 100644
index 0000000..98900ec
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fold-masked-cmp-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-fdump-tree-original" } */
+
+void foo (int *p, int x)
+{
+  if ((x & 0xff) <= 7)
+    *p = 0;
+}
+
+void bar (int *p, int x)
+{
+  if ((x & 0xff) < 8)
+    *p = 0;
+}
+
+/* { dg-final { scan-tree-dump-times "(x & .*) == 0" 2 "original" } } */
--
2.7.4
* Re: [PATCH 4/4 v3][PR 67328] Optimize some masked comparisons to efficient bittest
2017-05-29 7:14 [PATCH 4/4 v3][PR 67328] Optimize some masked comparisons to efficient bittest Yuri Gribov
@ 2017-05-31 11:24 ` Richard Biener
2017-06-08 12:17 ` Yuri Gribov
From: Richard Biener @ 2017-05-31 11:24 UTC
To: Yuri Gribov; +Cc: GCC Patches, Alan Modra, rguenth
On Mon, 29 May 2017, Yuri Gribov wrote:
> This no longer fixes the PR but still works in some cases as
> demonstrated by the test. So I decided to keep it.
As Richard noticed, you don't need widest_ints; wide_ints are enough.
Please also use == 0 instead of ! on wide_ints.
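To illustrate the test being discussed, here is a plain-C analogue of the
"all low bits set" check written with an explicit == 0. It is only a
sketch on uint32_t, not the wide_int API; the function name low_ones_p is
made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when m has the form 2**n - 1 (0, 1, 3, 7, ...).  Adding 1 to a
   block of low ones carries into a single higher bit (or wraps to 0 for
   an all-ones value), so m and m + 1 then share no set bits.  */
static bool low_ones_p (uint32_t m)
{
  return (m & (m + 1)) == 0;
}
```

Note that 0 also satisfies this predicate, which is presumably why the
patch additionally requires rhs > 0.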
+(for cmp (le gt)
+ (simplify
..
+      (switch
+       (if (cmp == LE_EXPR)
+        (eq:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) { build_zero_cst (ty); }))
+       (if (cmp == GT_EXPR)
+        (ne:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) { build_zero_cst (ty); })))))))))
The lines are too long, and you can simplify this by using
(for cmp (le gt)
     eqcmp (eq ne)
...
 (eqcmp (bit_and @1 { wide_int_to_tree (ty, hi_bits); })
        { build_zero_cst (ty); }))))
There is also no need to spell out :type on the result.
Richard.
* Re: [PATCH 4/4 v3][PR 67328] Optimize some masked comparisons to efficient bittest
2017-05-31 11:24 ` Richard Biener
@ 2017-06-08 12:17 ` Yuri Gribov
2017-06-09 12:37 ` Richard Biener
From: Yuri Gribov @ 2017-06-08 12:17 UTC
To: Richard Biener; +Cc: GCC Patches, Alan Modra, rguenth
[-- Attachment #1: Type: text/plain, Size: 1033 bytes --]
On Wed, May 31, 2017 at 12:19 PM, Richard Biener <rguenther@suse.de> wrote:
> On Mon, 29 May 2017, Yuri Gribov wrote:
>
>> This no longer fixes the PR but still works in some cases as
>> demonstrated by the test. So I decided to keep it.
>
> As Richard noticed you don't need widest_ints but can use wide_ints.
> Please use == 0 instead of ! on wide-ints as well.
>
> +(for cmp (le gt)
> + (simplify
> ..
> + (switch
> + (if (cmp == LE_EXPR)
> + (eq:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) {
> build_zero_cst (ty); }))
> + (if (cmp == GT_EXPR)
> + (ne:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) {
> build_zero_cst (ty); })))))))))
>
> long lines plus you can simplify this with using
>
> (for cmp (le gt)
> eqcmp (eq ne)
> ...
>
> (eqcmp (bit_and @1 { wide_int_to_tree (ty, hi_bits); })
> {build_zero_cst (ty); }))))
>
> no need to spell out :type on the result as well.
Hi Richard,
I fixed the issues (patch attached), rebased, and retested on x86_64. OK to commit?
-Yury
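As a rough illustration of the guard and of the new mask (mask - rhs) in
the revised pattern, the following plain-C sketch mirrors the conditions
for unsigned 32-bit values only. It does not model the signed comparisons
done via wi::gt_p / wi::ge_p, and the names is_low_mask, fold_masked_le
and fold_mismatches are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when m has the form 2**n - 1.  */
static bool is_low_mask (uint32_t m) { return (m & (m + 1)) == 0; }

/* If (a & mask) <= rhs can be rewritten as (a & *hi_bits) == 0, store
   the new mask and return true; otherwise return false and leave
   *hi_bits untouched.  */
static bool fold_masked_le (uint32_t mask, uint32_t rhs, uint32_t *hi_bits)
{
  if (!is_low_mask (mask) || rhs == 0 || !is_low_mask (rhs) || mask < rhs)
    return false;
  *hi_bits = mask - rhs;
  return true;
}

/* Brute-force check over small masks and inputs.  Returns the number of
   (mask, rhs, a) triples where the rewritten form disagrees with the
   original comparison.  */
static unsigned fold_mismatches (void)
{
  unsigned bad = 0;
  for (uint32_t mask = 0; mask < 256; mask++)
    for (uint32_t rhs = 0; rhs < 256; rhs++)
      {
        uint32_t hi;
        if (!fold_masked_le (mask, rhs, &hi))
          continue;
        for (uint32_t a = 0; a < 1024; a++)
          if (((a & mask) <= rhs) != ((a & hi) == 0))
            bad++;
      }
  return bad;
}
```

For the test case in the patch, fold_masked_le (0xff, 7, &hi) yields
hi == 0xf8, while a right-hand side such as 8 is rejected because it is
not of the form 2**K - 1.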
[-- Attachment #2: 0003-Optimize-some-masked-comparisons-to-efficient-bittes.patch --]
[-- Type: application/octet-stream, Size: 2483 bytes --]
From 3181290dcda7def1b9b5cb8f1ff6638b425c3958 Mon Sep 17 00:00:00 2001
From: Yury Gribov <tetra2005@gmail.com>
Date: Fri, 26 May 2017 07:58:48 +0100
Subject: [PATCH 3/3] Optimize some masked comparisons to efficient bittest.
2017-06-07 Yury Gribov <tetra2005@gmail.com>
gcc/
* match.pd: New pattern.
gcc/testsuite/
* c-c++-common/fold-masked-cmp-3.c: New test.
---
gcc/match.pd | 28 ++++++++++++++++++++++++++
gcc/testsuite/c-c++-common/fold-masked-cmp-3.c | 16 +++++++++++++++
2 files changed, 44 insertions(+)
create mode 100644 gcc/testsuite/c-c++-common/fold-masked-cmp-3.c
diff --git a/gcc/match.pd b/gcc/match.pd
index 54a8e04..244e9eb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2741,6 +2741,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
|| VECTOR_INTEGER_TYPE_P (TREE_TYPE (@0)))
{ constant_boolean_node (false, type); })))
+/* A & (2**N - 1) <= 2**K - 1 -> A & (2**N - 2**K) == 0
+   A & (2**N - 1) > 2**K - 1 -> A & (2**N - 2**K) != 0
+
+   Note that comparisons
+     A & (2**N - 1) < 2**K -> A & (2**N - 2**K) == 0
+     A & (2**N - 1) >= 2**K -> A & (2**N - 2**K) != 0
+   will be canonicalized to above so there's no need to
+   consider them here.
+ */
+
+(for cmp (le gt)
+     eqcmp (eq ne)
+ (simplify
+  (cmp (bit_and@0 @1 INTEGER_CST@2) INTEGER_CST@3)
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+   (with
+    {
+     tree ty = TREE_TYPE (@0);
+     unsigned prec = TYPE_PRECISION (ty);
+     wide_int mask = wi::to_wide (@2, prec);
+     wide_int rhs = wi::to_wide (@3, prec);
+     signop sgn = TYPE_SIGN (ty);
+    }
+    (if ((mask & (mask + 1)) == 0 && wi::gt_p (rhs, 0, sgn)
+         && (rhs & (rhs + 1)) == 0 && wi::ge_p (mask, rhs, sgn))
+     (eqcmp (bit_and @1 { wide_int_to_tree (ty, mask - rhs); })
+            { build_zero_cst (ty); }))))))
+
/* -A CMP -B -> B CMP A. */
(for cmp (tcc_comparison)
scmp (swapped_tcc_comparison)
diff --git a/gcc/testsuite/c-c++-common/fold-masked-cmp-3.c b/gcc/testsuite/c-c++-common/fold-masked-cmp-3.c
new file mode 100644
index 0000000..98900ec
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fold-masked-cmp-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-fdump-tree-original" } */
+
+void foo (int *p, int x)
+{
+  if ((x & 0xff) <= 7)
+    *p = 0;
+}
+
+void bar (int *p, int x)
+{
+  if ((x & 0xff) < 8)
+    *p = 0;
+}
+
+/* { dg-final { scan-tree-dump-times "(x & .*) == 0" 2 "original" } } */
--
2.7.4
* Re: [PATCH 4/4 v3][PR 67328] Optimize some masked comparisons to efficient bittest
2017-06-08 12:17 ` Yuri Gribov
@ 2017-06-09 12:37 ` Richard Biener
From: Richard Biener @ 2017-06-09 12:37 UTC
To: Yuri Gribov; +Cc: GCC Patches, Alan Modra, rguenth
On Thu, 8 Jun 2017, Yuri Gribov wrote:
> On Wed, May 31, 2017 at 12:19 PM, Richard Biener <rguenther@suse.de> wrote:
> > On Mon, 29 May 2017, Yuri Gribov wrote:
> >
> >> This no longer fixes the PR but still works in some cases as
> >> demonstrated by the test. So I decided to keep it.
> >
> > As Richard noticed you don't need widest_ints but can use wide_ints.
> > Please use == 0 instead of ! on wide-ints as well.
> >
> > +(for cmp (le gt)
> > + (simplify
> > ..
> > + (switch
> > + (if (cmp == LE_EXPR)
> > + (eq:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) {
> > build_zero_cst (ty); }))
> > + (if (cmp == GT_EXPR)
> > + (ne:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) {
> > build_zero_cst (ty); })))))))))
> >
> > long lines plus you can simplify this with using
> >
> > (for cmp (le gt)
> > eqcmp (eq ne)
> > ...
> >
> > (eqcmp (bit_and @1 { wide_int_to_tree (ty, hi_bits); })
> > {build_zero_cst (ty); }))))
> >
> > no need to spell out :type on the result as well.
>
> Hi Richard,
>
> I fixed the issues (attached), rebased and retested on x64. Ok to commit?
Ok.
Thanks,
Richard.