* [PATCH 1/3]Improve induction variable elimination
From: Bin Cheng @ 2014-07-17 9:08 UTC
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1283 bytes --]
Hi,
This is a series of three patches improving induction variable elimination.
Currently GCC only eliminates an iv in the very specific case where the loop’s
latch could run zero times, i.e., when the may_be_zero field of the loop niter
information evaluates to true. In fact, the case is so specific that
iv_elimination_compare_lt rarely succeeds during either GCC bootstrap or
spec2000/spec2006 compilation. Though intrusive data shows these patches
don’t help iv elimination much for GCC bootstrap, they do capture
5%~15% more eliminations when compiling spec2000/2006. Detailed numbers:
          2k/int   2k/fp   2k6/int   2k6/fp
improve   ~9.6%    ~4.8%   ~5.5%     ~14.4%
All patches pass bootstrap and regression tests on x86_64/x86. I will
bootstrap and test them on aarch64/arm platforms too.

The first patch switches to operand_equal_p on trees to check the number of
iterations in iv_elimination_compare_lt. Though I think this change isn’t
necessary for the current code, it is needed if we further relax iv
elimination for cases in which signed/unsigned conversion is involved.
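
To make the target case concrete, here is a minimal C sketch of the kind of
loop that iv_elimination_compare_lt is, as far as I understand it, meant to
handle. The function names and the summing loop body are hypothetical and
only serve the illustration; the real transformation is done on GIMPLE inside
ivopts, not at the source level.

```c
#include <assert.h>

/* Before elimination: the counter i and the pointer p are both
   induction variables, and the exit test uses i.  */
long sum_with_counter(const int *base, int a, int b)
{
    long sum = 0;
    int i = a;
    const int *p = base + a;
    do {
        sum += *p;
        p++;
        i++;
    } while (i < b);
    return sum;
}

/* After elimination: the exit test is expressed with p alone, so i is
   no longer needed.  The bound corresponds to p_0 - a + b.  */
long sum_with_pointer_bound(const int *base, int a, int b)
{
    long sum = 0;
    const int *p = base + a;
    const int *end = base + b;
    do {
        sum += *p;
        p++;
    } while (p < end);
    return sum;
}

/* Sanity check that both forms agree for a given range.  */
void check_equivalent(const int *base, int a, int b)
{
    assert(sum_with_counter(base, a, b) == sum_with_pointer_bound(base, a, b));
}
```

Note that the rewritten exit test uses "<" rather than "!=": when b <= a + 1
the latch runs zero times, and only the "<" form still exits after the first
iteration in that case.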
Thanks,
bin
2014-07-17 Bin Cheng <bin.cheng@arm.com>
* tree-ssa-loop-ivopts.c (iv_elimination_compare_lt): Check number
of iterations using tree comparison.
[-- Attachment #2: iv_elimination-improve-a-20140716.txt --]
[-- Type: text/plain, Size: 1757 bytes --]
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c (revision 212387)
+++ gcc/tree-ssa-loop-ivopts.c (working copy)
@@ -4605,7 +4605,7 @@ iv_elimination_compare_lt (struct ivopts_data *dat
                            struct tree_niter_desc *niter)
 {
   tree cand_type, a, b, mbz, nit_type = TREE_TYPE (niter->niter), offset;
-  struct aff_tree nit, tmpa, tmpb;
+  struct aff_tree nit, tmp1, tmpa, tmpb;
   enum tree_code comp;
   HOST_WIDE_INT step;
 
@@ -4661,15 +4661,19 @@ iv_elimination_compare_lt (struct ivopts_data *dat
     return false;
 
   /* Expected number of iterations is B - A - 1. Check that it matches
-     the actual number, i.e., that B - A - NITER = 1. */
+     the actual number, i.e., that B - A = NITER + 1. */
   tree_to_aff_combination (niter->niter, nit_type, &nit);
-  tree_to_aff_combination (fold_convert (nit_type, a), nit_type, &tmpa);
-  tree_to_aff_combination (fold_convert (nit_type, b), nit_type, &tmpb);
-  aff_combination_scale (&nit, -1);
-  aff_combination_scale (&tmpa, -1);
-  aff_combination_add (&tmpb, &tmpa);
-  aff_combination_add (&tmpb, &nit);
-  if (tmpb.n != 0 || tmpb.offset != 1)
+  aff_combination_const (&tmp1, nit_type, 1);
+  tree_to_aff_combination (b, TREE_TYPE (b), &tmpb);
+  aff_combination_add (&nit, &tmp1);
+  if (a != integer_zero_node)
+    {
+      tree_to_aff_combination (a, TREE_TYPE (b), &tmpa);
+      aff_combination_scale (&tmpa, -1);
+      aff_combination_add (&tmpb, &tmpa);
+    }
+  if (!operand_equal_p (aff_combination_to_tree (&nit),
+                        aff_combination_to_tree (&tmpb), 0))
     return false;
 
   /* Finally, check that CAND->IV->BASE - CAND->IV->STEP * A does not
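
The identity stated in the patched comment, B - A = NITER + 1, can be checked
on plain integers. The sketch below is only a model under that assumption: it
counts latch executions of a do-while loop by brute force and compares them
against the identity in unsigned arithmetic. The helper names latch_count and
niter_matches are hypothetical, and nothing here uses GCC's aff_tree API.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Count how often the latch (back edge) runs in
     i = a; do { i++; } while (i < b);
   The body always runs once; the latch runs only while i < b holds.  */
unsigned latch_count(int a, int b)
{
    assert(a < INT_MAX);        /* keep i++ free of signed overflow */
    unsigned n = 0;
    int i = a;
    for (;;) {
        i++;
        if (!(i < b))
            break;
        n++;
    }
    return n;
}

/* Model of the check: B - A == NITER + 1, evaluated in an unsigned
   type so that wraparound is well defined.  */
bool niter_matches(int a, int b, unsigned niter)
{
    return (unsigned) b - (unsigned) a == niter + 1u;
}
```

For a = 2, b = 6 the latch runs 3 times and the identity holds (4 == 4). For
a = 3, b = 3 the latch runs zero times and the identity fails (0 != 1), which
is the zero-iteration situation that the may_be_zero condition has to cover
separately.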
* Fwd: [PATCH 1/3]Improve induction variable elimination
From: Bin.Cheng @ 2014-07-21 9:47 UTC
To: Zdenek Dvorak; +Cc: gcc-patches List
[-- Attachment #1: Type: text/plain, Size: 1525 bytes --]
Hi, forwarding to Zdenek for review.
Thanks,
bin
* Re: [PATCH 1/3]Improve induction variable elimination
From: Richard Biener @ 2014-07-25 12:27 UTC
To: Bin Cheng; +Cc: GCC Patches
On Thu, Jul 17, 2014 at 11:07 AM, Bin Cheng <bin.cheng@arm.com> wrote:
> Hi,
> This is a series of three patches improving induction variable elimination.
> Currently GCC only eliminates iv for very specific case when the loop’s
> latch could run zero times, i.e., when may_be_zero field of loop niter
> information evaluates to true. In fact, it’s so specific that
> iv_elimination_compare_lt rarely succeeds during either GCC bootstrap or
> spec2000/spec2006 compilation. Though intrusive data shows these patches
> don’t help iv elimination that much for GCC bootstrap, they do capture
> 5%~15% more eliminations for compiling spec2000/2006. Detailed numbers are
> like:
> 2k/int 2k/fp 2k6/int 2k6/fp
> improve ~9.6% ~4.8% ~5.5% ~14.4%
>
> All patches pass bootstrap and regression test on x86_64/x86. I will
> bootstrap and test them on aarch64/arm platforms too.
>
> The first patch turns to tree operand_equal_p to check the number of
> iterations in iv_elimination_lt. Though I think this change isn’t necessary
> for current code, it’s needed if we further relax iv elimination for cases
> in which sign/unsigned conversion is involved.
As said elsewhere this bug should be fixed in tree-affine.c. Do you have
a testcase?
Thanks,
Richard.
> Thanks,
> bin
>
> 2014-07-17 Bin Cheng <bin.cheng@arm.com>
>
> * tree-ssa-loop-ivopts.c (iv_elimination_compare_lt): Check number
> of iteration using tree comparison.
* Re: [PATCH 1/3]Improve induction variable elimination
From: Bin.Cheng @ 2014-07-25 14:04 UTC
To: Richard Biener; +Cc: Bin Cheng, GCC Patches
On Fri, Jul 25, 2014 at 1:27 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Thu, Jul 17, 2014 at 11:07 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>> Hi,
>> This is a series of three patches improving induction variable elimination.
>> Currently GCC only eliminates iv for very specific case when the loop's
>> latch could run zero times, i.e., when may_be_zero field of loop niter
>> information evaluates to true. In fact, it's so specific that
>> iv_elimination_compare_lt rarely succeeds during either GCC bootstrap or
>> spec2000/spec2006 compilation. Though intrusive data shows these patches
>> don't help iv elimination that much for GCC bootstrap, they do capture
>> 5%~15% more eliminations for compiling spec2000/2006. Detailed numbers are
>> like:
>> 2k/int 2k/fp 2k6/int 2k6/fp
>> improve ~9.6% ~4.8% ~5.5% ~14.4%
>>
>> All patches pass bootstrap and regression test on x86_64/x86. I will
>> bootstrap and test them on aarch64/arm platforms too.
>>
>> The first patch turns to tree operand_equal_p to check the number of
>> iterations in iv_elimination_lt. Though I think this change isn't necessary
>> for current code, it's needed if we further relax iv elimination for cases
>> in which sign/unsigned conversion is involved.
>
> As said elsewhere this bug should be fixed in tree-affine.c. Do you have
> a testcase?
>
Sorry, I don't have a test case that doesn't require patching GCC. I will
revisit the problem and try to understand whether the change is necessary
and, if so, in which part it should be fixed.
Thanks,
bin