* [Patch] PR67351 Implement << N & >> N optimizers
@ 2015-09-01 8:58 Hurugalawadi, Naveen
2015-09-01 9:15 ` Richard Biener
2015-09-02 11:18 ` Marc Glisse
0 siblings, 2 replies; 6+ messages in thread
From: Hurugalawadi, Naveen @ 2015-09-01 8:58 UTC (permalink / raw)
To: gcc-patches; +Cc: Richard Biener, Pinski, Andrew, ubizjak
[-- Attachment #1: Type: text/plain, Size: 736 bytes --]
Hi,
Please find attached the patch "pr67351.patch" that implements the
pattern << N & >> N optimizers.
Please review and let me know if it's okay.
Regression tested on AArch64 and x86_64.
Thanks,
Naveen
2015-09-01 Naveen H.S <Naveen.Hurugalawadi@caviumnetworks.com>
gcc/ChangeLog:
PR middle-end/67351
* fold-const.c (fold_binary_loc): Move the transforms of
(x >> c) << c into x & (-1<<c) and of (x << c) >> c into
x & ((unsigned)-1 >> c) for unsigned types to match.pd.
* match.pd (lshift (rshift @0 INTEGER_CST@1) @1): New simplifier.
(rshift (lshift @0 INTEGER_CST@1) @1): New simplifier.
gcc/testsuite/ChangeLog:
PR middle-end/67351
* g++.dg/pr67351.C: New test.
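For readers who want to convince themselves of the two identities named in the ChangeLog, they can be checked with a small standalone program. This is only a sketch under stated assumptions: the helper names below are hypothetical and are not GCC code, the values are 32-bit unsigned, and the shift count is always below the precision.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not GCC code. Each pair below is expected to
// compute the same value for any shift count c in [0, 31].

// (x >> c) << c clears the low c bits.
static std::uint32_t rshift_then_lshift(std::uint32_t x, unsigned c) {
    return (x >> c) << c;
}

// x & (-1 << c), written with an unsigned all-ones value to avoid
// shifting a negative signed integer.
static std::uint32_t mask_low_bits(std::uint32_t x, unsigned c) {
    return x & (~std::uint32_t{0} << c);
}

// (x << c) >> c clears the high c bits (unsigned, so the shift is logical).
static std::uint32_t lshift_then_rshift(std::uint32_t x, unsigned c) {
    return (x << c) >> c;
}

// x & ((unsigned)-1 >> c).
static std::uint32_t mask_high_bits(std::uint32_t x, unsigned c) {
    return x & (~std::uint32_t{0} >> c);
}

// Returns true when both identities hold for x at every valid shift count.
static bool identities_hold(std::uint32_t x) {
    for (unsigned c = 0; c < 32; ++c) {
        if (rshift_then_lshift(x, c) != mask_low_bits(x, c)) return false;
        if (lshift_then_rshift(x, c) != mask_high_bits(x, c)) return false;
    }
    return true;
}
```

The second identity is restricted to unsigned types because a right shift of a signed value may replicate the sign bit instead of shifting in zeros.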
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: pr67351.patch --]
[-- Type: text/x-patch; name="pr67351.patch", Size: 4018 bytes --]
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c826e67..4746836 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -10502,32 +10502,6 @@ fold_binary_loc (location_t loc,
prec = element_precision (type);
- /* Transform (x >> c) << c into x & (-1<<c), or transform (x << c) >> c
- into x & ((unsigned)-1 >> c) for unsigned types. */
- if (((code == LSHIFT_EXPR && TREE_CODE (arg0) == RSHIFT_EXPR)
- || (TYPE_UNSIGNED (type)
- && code == RSHIFT_EXPR && TREE_CODE (arg0) == LSHIFT_EXPR))
- && tree_fits_uhwi_p (arg1)
- && tree_to_uhwi (arg1) < prec
- && tree_fits_uhwi_p (TREE_OPERAND (arg0, 1))
- && tree_to_uhwi (TREE_OPERAND (arg0, 1)) < prec)
- {
- HOST_WIDE_INT low0 = tree_to_uhwi (TREE_OPERAND (arg0, 1));
- HOST_WIDE_INT low1 = tree_to_uhwi (arg1);
- tree lshift;
- tree arg00;
-
- if (low0 == low1)
- {
- arg00 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
-
- lshift = build_minus_one_cst (type);
- lshift = const_binop (code, lshift, arg1);
-
- return fold_build2_loc (loc, BIT_AND_EXPR, type, arg00, lshift);
- }
- }
-
/* If we have a rotate of a bit operation with the rotate count and
the second operand of the bit operation both constant,
permute the two operations. */
diff --git a/gcc/match.pd b/gcc/match.pd
index 289bc5c..9b9f09d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -929,6 +929,22 @@ along with GCC; see the file COPYING3. If not see
&& tree_expr_nonnegative_p (@1))
@0))
+/* Optimize (x >> c) << c into x & (-1<<c). */
+(simplify
+ (lshift (rshift @0 INTEGER_CST@1) @1)
+ (if (tree_fits_uhwi_p (@1)
+ && tree_to_uhwi (@1) < TYPE_PRECISION (type))
+ (bit_and @0 (lshift { build_minus_one_cst (type); } @1))))
+
+/* Optimize (x << c) >> c into x & ((unsigned)-1 >> c) for unsigned
+ types. */
+(simplify
+ (rshift (lshift @0 INTEGER_CST@1) @1)
+ (if (TYPE_UNSIGNED (type)
+ && tree_fits_uhwi_p (@1)
+ && tree_to_uhwi (@1) < TYPE_PRECISION (type))
+ (bit_and @0 (rshift { build_minus_one_cst (type); } @1))))
+
(for shiftrotate (lrotate rrotate lshift rshift)
(simplify
(shiftrotate @0 integer_zerop)
diff --git a/gcc/testsuite/g++.dg/pr67351.C b/gcc/testsuite/g++.dg/pr67351.C
new file mode 100644
index 0000000..c86c920
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr67351.C
@@ -0,0 +1,106 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef unsigned char uchar;
+typedef unsigned short ushort;
+typedef unsigned int uint;
+typedef unsigned long long uint64;
+
+class MyRgba
+{
+ uint rgba;
+
+public:
+ explicit MyRgba (uint c):rgba (c)
+ {
+ };
+
+ static MyRgba fromRgba (uchar r, uchar g, uchar b, uchar a)
+ {
+ return MyRgba (uint (r) << 24
+ | uint (g) << 16 | uint (b) << 8 | uint (a));
+ }
+
+ uchar r ()
+ {
+ return rgba >> 24;
+ }
+ uchar g ()
+ {
+ return rgba >> 16;
+ }
+ uchar b ()
+ {
+ return rgba >> 8;
+ }
+ uchar a ()
+ {
+ return rgba;
+ }
+
+ void setG (uchar _g)
+ {
+ *this = fromRgba (r (), _g, b (), a ());
+ }
+};
+
+extern MyRgba giveMe ();
+
+MyRgba
+test ()
+{
+ MyRgba a = giveMe ();
+ a.setG (0xf0);
+ return a;
+}
+
+class MyRgba64
+{
+ uint64 rgba;
+
+public:
+ explicit MyRgba64 (uint64 c):rgba (c)
+ {
+ };
+
+ static MyRgba64 fromRgba64 (ushort r, ushort g, ushort b, ushort a)
+ {
+ return MyRgba64 (uint64 (r) << 48
+ | uint64 (g) << 32 | uint64 (b) << 16 | uint64 (a));
+ }
+
+ ushort r ()
+ {
+ return rgba >> 48;
+ }
+ ushort g ()
+ {
+ return rgba >> 32;
+ }
+ ushort b ()
+ {
+ return rgba >> 16;
+ }
+ ushort a ()
+ {
+ return rgba;
+ }
+
+ void setG (ushort _g)
+ {
+ *this = fromRgba64 (r (), _g, b (), a ());
+ }
+};
+
+extern MyRgba64 giveMe64 ();
+
+MyRgba64
+test64 ()
+{
+ MyRgba64 a = giveMe64 ();
+ a.setG (0xf0f0);
+ return a;
+}
+
+/* { dg-final { scan-assembler-not "<<" } } */
+/* { dg-final { scan-assembler-not ">>" } } */
* Re: [Patch] PR67351 Implement << N & >> N optimizers
2015-09-01 8:58 [Patch] PR67351 Implement << N & >> N optimizers Hurugalawadi, Naveen
@ 2015-09-01 9:15 ` Richard Biener
2015-09-02 11:18 ` Marc Glisse
1 sibling, 0 replies; 6+ messages in thread
From: Richard Biener @ 2015-09-01 9:15 UTC (permalink / raw)
To: Hurugalawadi, Naveen; +Cc: gcc-patches, Pinski, Andrew, ubizjak
On Tue, Sep 1, 2015 at 10:57 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi,
>
> Please find attached the patch "pr67351.patch" that implements the
> pattern << N & >> N optimizers.
+ (bit_and @0 (lshift { build_minus_one_cst (type); } @1))))
please use
(bit_and @0 { wide_int_to_tree (type, wi::lshift (-1, @1)); })
and wi::arshift for the other pattern. It should then be possible
to drop the tree_fits_uhwi_p tests and replace the precision test
with wi::ltu_p (@1, TYPE_PRECISION (type)).
Ok with these changes.
Thanks,
Richard.
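The idea of folding the range check into a single unsigned comparison can be illustrated outside GCC. The sketch below is plain C++ and does not use the wide-int API: `ltu_p` here is a hypothetical stand-in for an unsigned less-than test on the raw bits of the shift count, and `low_mask` is an illustrative helper for a 32-bit type. Under that assumption, a negative count such as -1 looks like a very large unsigned value and is rejected by the same test that rejects counts at or above the precision.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an unsigned "less than" test on the raw bits
// of a shift count; this is not the GCC wide-int implementation.
static bool ltu_p(std::int64_t count, unsigned precision) {
    return static_cast<std::uint64_t>(count) < precision;
}

// Illustrative helper: build the mask (-1 << count) for a 32-bit type,
// or report failure when the count is out of range.
static bool low_mask(std::int64_t count, std::uint32_t *mask) {
    const unsigned precision = 32;
    if (!ltu_p(count, precision))
        return false;  // negative or too-large count: do not fold
    *mask = ~std::uint32_t{0} << static_cast<unsigned>(count);
    return true;
}
```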
>
> Please review and let me know if it's okay.
>
> Regression tested on AArch64 and x86_64.
>
> Thanks,
> Naveen
>
> 2015-09-01 Naveen H.S <Naveen.Hurugalawadi@caviumnetworks.com>
>
> gcc/ChangeLog:
>
> PR middle-end/67351
> * fold-const.c (fold_binary_loc): Move the transforms of
> (x >> c) << c into x & (-1<<c) and of (x << c) >> c into
> x & ((unsigned)-1 >> c) for unsigned types to match.pd.
> * match.pd (lshift (rshift @0 INTEGER_CST@1) @1): New simplifier.
> (rshift (lshift @0 INTEGER_CST@1) @1): New simplifier.
>
> gcc/testsuite/ChangeLog:
>
> PR middle-end/67351
> * g++.dg/pr67351.C: New test.
* Re: [Patch] PR67351 Implement << N & >> N optimizers
2015-09-01 8:58 [Patch] PR67351 Implement << N & >> N optimizers Hurugalawadi, Naveen
2015-09-01 9:15 ` Richard Biener
@ 2015-09-02 11:18 ` Marc Glisse
2015-09-03 8:05 ` Hurugalawadi, Naveen
1 sibling, 1 reply; 6+ messages in thread
From: Marc Glisse @ 2015-09-02 11:18 UTC (permalink / raw)
To: Hurugalawadi, Naveen; +Cc: gcc-patches, Richard Biener, Pinski, Andrew, ubizjak
+/* Optimize (x >> c) << c into x & (-1<<c). */
+(simplify
+ (lshift (rshift @0 INTEGER_CST@1) @1)
+ (if (tree_fits_uhwi_p (@1)
+ && tree_to_uhwi (@1) < TYPE_PRECISION (type))
+ (bit_and @0 (lshift { build_minus_one_cst (type); } @1))))
It looks like vectors might match, so please use element_precision instead
of TYPE_PRECISION, as in the fold-const.c code you are converting from.
--
Marc Glisse
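To illustrate why the per-element width is the bound that matters when the operands are vectors, here is a small model in plain C++. It is only a sketch: `Vec4u16`, `kElementBits` and the helper functions are hypothetical names, and a `std::array` stands in for a real vector type. The fold to a mask is only valid while the shift count is below the width of one lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative model of a 4-lane vector of 16-bit unsigned elements.
using Vec4u16 = std::array<std::uint16_t, 4>;

constexpr unsigned kElementBits = 16;  // width of one lane

// Lane-wise (v >> c) << c; callers must pass c < kElementBits.
static Vec4u16 shift_pair(const Vec4u16 &v, unsigned c) {
    Vec4u16 out{};
    for (std::size_t i = 0; i < v.size(); ++i)
        out[i] = static_cast<std::uint16_t>((v[i] >> c) << c);
    return out;
}

// Lane-wise v & (-1 << c); callers must pass c < kElementBits.
static Vec4u16 mask_lanes(const Vec4u16 &v, unsigned c) {
    const std::uint16_t mask = static_cast<std::uint16_t>(0xFFFFu << c);
    Vec4u16 out{};
    for (std::size_t i = 0; i < v.size(); ++i)
        out[i] = static_cast<std::uint16_t>(v[i] & mask);
    return out;
}

// The guard corresponding to the precision test: compare against the
// lane width, not against any property of the vector as a whole.
static bool fold_is_valid(unsigned c) {
    return c < kElementBits;
}
```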
* Re: [Patch] PR67351 Implement << N & >> N optimizers
2015-09-02 11:18 ` Marc Glisse
@ 2015-09-03 8:05 ` Hurugalawadi, Naveen
2015-09-03 8:39 ` Uros Bizjak
2015-09-03 9:58 ` Richard Biener
0 siblings, 2 replies; 6+ messages in thread
From: Hurugalawadi, Naveen @ 2015-09-03 8:05 UTC (permalink / raw)
To: gcc-patches; +Cc: Richard Biener, marc.glisse, Pinski, Andrew, ubizjak
[-- Attachment #1: Type: text/plain, Size: 369 bytes --]
Hi,
Thanks for all the review and comments.
>> replace the precision test with wi::ltu_p (@1, TYPE_PRECISION (type))
>> use element_precision instead of TYPE_PRECISION
Please find attached the modified patch as per review comments.
Please review it and let me know if the patch is okay.
Regression tested on AArch64 and x86_64.
Thanks,
Naveen
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: pr67351-1.patch --]
[-- Type: text/x-patch; name="pr67351-1.patch", Size: 4018 bytes --]
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index d478c4d..a79bfa7 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -10412,32 +10412,6 @@ fold_binary_loc (location_t loc,
prec = element_precision (type);
- /* Transform (x >> c) << c into x & (-1<<c), or transform (x << c) >> c
- into x & ((unsigned)-1 >> c) for unsigned types. */
- if (((code == LSHIFT_EXPR && TREE_CODE (arg0) == RSHIFT_EXPR)
- || (TYPE_UNSIGNED (type)
- && code == RSHIFT_EXPR && TREE_CODE (arg0) == LSHIFT_EXPR))
- && tree_fits_uhwi_p (arg1)
- && tree_to_uhwi (arg1) < prec
- && tree_fits_uhwi_p (TREE_OPERAND (arg0, 1))
- && tree_to_uhwi (TREE_OPERAND (arg0, 1)) < prec)
- {
- HOST_WIDE_INT low0 = tree_to_uhwi (TREE_OPERAND (arg0, 1));
- HOST_WIDE_INT low1 = tree_to_uhwi (arg1);
- tree lshift;
- tree arg00;
-
- if (low0 == low1)
- {
- arg00 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
-
- lshift = build_minus_one_cst (type);
- lshift = const_binop (code, lshift, arg1);
-
- return fold_build2_loc (loc, BIT_AND_EXPR, type, arg00, lshift);
- }
- }
-
/* If we have a rotate of a bit operation with the rotate count and
the second operand of the bit operation both constant,
permute the two operations. */
diff --git a/gcc/match.pd b/gcc/match.pd
index fb4b342..181a389 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -931,6 +931,22 @@ along with GCC; see the file COPYING3. If not see
&& tree_expr_nonnegative_p (@1))
@0))
+/* Optimize (x >> c) << c into x & (-1<<c). */
+(simplify
+ (lshift (rshift @0 INTEGER_CST@1) @1)
+ (if (tree_fits_uhwi_p (@1)
+ && wi::ltu_p (@1, element_precision (type)))
+ (bit_and @0 (lshift { build_minus_one_cst (type); } @1))))
+
+/* Optimize (x << c) >> c into x & ((unsigned)-1 >> c) for unsigned
+ types. */
+(simplify
+ (rshift (lshift @0 INTEGER_CST@1) @1)
+ (if (TYPE_UNSIGNED (type)
+ && tree_fits_uhwi_p (@1)
+ && (wi::ltu_p (@1, element_precision (type))))
+ (bit_and @0 (rshift { build_minus_one_cst (type); } @1))))
+
(for shiftrotate (lrotate rrotate lshift rshift)
(simplify
(shiftrotate @0 integer_zerop)
diff --git a/gcc/testsuite/g++.dg/pr67351.C b/gcc/testsuite/g++.dg/pr67351.C
new file mode 100644
index 0000000..c86c920
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr67351.C
@@ -0,0 +1,106 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef unsigned char uchar;
+typedef unsigned short ushort;
+typedef unsigned int uint;
+typedef unsigned long long uint64;
+
+class MyRgba
+{
+ uint rgba;
+
+public:
+ explicit MyRgba (uint c):rgba (c)
+ {
+ };
+
+ static MyRgba fromRgba (uchar r, uchar g, uchar b, uchar a)
+ {
+ return MyRgba (uint (r) << 24
+ | uint (g) << 16 | uint (b) << 8 | uint (a));
+ }
+
+ uchar r ()
+ {
+ return rgba >> 24;
+ }
+ uchar g ()
+ {
+ return rgba >> 16;
+ }
+ uchar b ()
+ {
+ return rgba >> 8;
+ }
+ uchar a ()
+ {
+ return rgba;
+ }
+
+ void setG (uchar _g)
+ {
+ *this = fromRgba (r (), _g, b (), a ());
+ }
+};
+
+extern MyRgba giveMe ();
+
+MyRgba
+test ()
+{
+ MyRgba a = giveMe ();
+ a.setG (0xf0);
+ return a;
+}
+
+class MyRgba64
+{
+ uint64 rgba;
+
+public:
+ explicit MyRgba64 (uint64 c):rgba (c)
+ {
+ };
+
+ static MyRgba64 fromRgba64 (ushort r, ushort g, ushort b, ushort a)
+ {
+ return MyRgba64 (uint64 (r) << 48
+ | uint64 (g) << 32 | uint64 (b) << 16 | uint64 (a));
+ }
+
+ ushort r ()
+ {
+ return rgba >> 48;
+ }
+ ushort g ()
+ {
+ return rgba >> 32;
+ }
+ ushort b ()
+ {
+ return rgba >> 16;
+ }
+ ushort a ()
+ {
+ return rgba;
+ }
+
+ void setG (ushort _g)
+ {
+ *this = fromRgba64 (r (), _g, b (), a ());
+ }
+};
+
+extern MyRgba64 giveMe64 ();
+
+MyRgba64
+test64 ()
+{
+ MyRgba64 a = giveMe64 ();
+ a.setG (0xf0f0);
+ return a;
+}
+
+/* { dg-final { scan-assembler-not "<<" } } */
+/* { dg-final { scan-assembler-not ">>" } } */
* Re: [Patch] PR67351 Implement << N & >> N optimizers
2015-09-03 8:05 ` Hurugalawadi, Naveen
@ 2015-09-03 8:39 ` Uros Bizjak
2015-09-03 9:58 ` Richard Biener
1 sibling, 0 replies; 6+ messages in thread
From: Uros Bizjak @ 2015-09-03 8:39 UTC (permalink / raw)
To: Hurugalawadi, Naveen
Cc: gcc-patches, Richard Biener, marc.glisse, Pinski, Andrew
On Thu, Sep 3, 2015 at 9:29 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi,
>
> Thanks for all the review and comments.
>
>>> replace the precision test with wi::ltu_p (@1, TYPE_PRECISION (type))
>>> use element_precision instead of TYPE_PRECISION
>
> Please find attached the modified patch as per review comments.
>
> Please review the same and let me know if the patch is okay?
>
> Regression Tested on AArch64 and X86_64.
+/* { dg-final { scan-assembler-not "<<" } } */
+/* { dg-final { scan-assembler-not ">>" } } */
You probably want to use scan-tree-dump-not here.
Uros.
* Re: [Patch] PR67351 Implement << N & >> N optimizers
2015-09-03 8:05 ` Hurugalawadi, Naveen
2015-09-03 8:39 ` Uros Bizjak
@ 2015-09-03 9:58 ` Richard Biener
1 sibling, 0 replies; 6+ messages in thread
From: Richard Biener @ 2015-09-03 9:58 UTC (permalink / raw)
To: Hurugalawadi, Naveen; +Cc: gcc-patches, marc.glisse, Pinski, Andrew, ubizjak
On Thu, Sep 3, 2015 at 9:29 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi,
>
> Thanks for all the review and comments.
>
>>> replace the precision test with wi::ltu_p (@1, TYPE_PRECISION (type))
>>> use element_precision instead of TYPE_PRECISION
>
> Please find attached the modified patch as per review comments.
>
> Please review the same and let me know if the patch is okay?
Ok with the tree_fits_uhwi_p checks removed (they are redundant)
and with
+/* { dg-final { scan-assembler-not "<<" } } */
+/* { dg-final { scan-assembler-not ">>" } } */
replaced with
/* { dg-final { scan-tree-dump-not "<<" "optimized" } } */
Thanks,
Richard.
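A reduced testcase using the tree-dump form of the check might look like the sketch below. It is a hypothetical example, not the testcase from the patch: the function name `clear_low_byte` is made up for illustration, and with the simplifier in place the "optimized" dump is expected (not verified here) to contain a bitwise AND rather than a shift.

```cpp
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

// Hypothetical reduced test, not the testcase from the patch.
// (x >> 8) << 8 should be foldable to x & 0xffffff00.
unsigned int
clear_low_byte (unsigned int x)
{
  return (x >> 8) << 8;
}

/* { dg-final { scan-tree-dump-not "<<" "optimized" } } */
```

Scanning the tree dump checks the GIMPLE produced by the middle end, whereas scanning the assembler output depends on target mnemonics, which do not spell shifts as "<<" or ">>".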
> Regression Tested on AArch64 and X86_64.
>
> Thanks,
> Naveen