From: Richard Biener <richard.guenther@gmail.com>
To: "Bin.Cheng" <amker.cheng@gmail.com>
Cc: Bin Cheng <bin.cheng@arm.com>, GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH PR66388]Add sizetype cand for BIV of smaller type if it's used as index of memory ref
Date: Wed, 16 Sep 2015 11:55:00 -0000 [thread overview]
Message-ID: <CAFiYyc3vT6e_bC2Bz-iYYMcRra61-tRveorq406LGm0zhLoQ=g@mail.gmail.com> (raw)
In-Reply-To: <CAHFci2-C=VDu0Cow5o35sjsQ_0uLZB3di0h-JUbkgUYXA5UYgw@mail.gmail.com>
Ok.
Thanks,
Richard.
On Tue, Sep 15, 2015 at 7:56 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> Just realized that I missed the updated patch before. Here it is...
>
> Thanks,
> bin
>
> On Tue, Sep 8, 2015 at 6:07 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Sep 8, 2015 at 6:06 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Wed, Sep 2, 2015 at 10:12 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Wed, Sep 2, 2015 at 5:26 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>>>>> Hi,
>>>>> This patch is a new approach to fix PR66388. IVO today computes iv_use with
>>>>> iv_cand which has at least same type precision as the use. On 64bit
>>>>> platforms like AArch64, this results in different iv_cand created for each
>>>>> address type iv_use, and register pressure increased. As a matter of fact,
>>>>> the BIV should be used for all iv_uses in some of these cases. It is a
>>>>> latent bug but recently getting worse because of overflow changes.
>>>>>
>>>>> The original approach at
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01484.html can fix the issue
>>>>> except it conflict with IV elimination. Seems to me it is impossible to
>>>>> mitigate the contradiction.
>>>>>
>>>>> This new approach fixes the issue by adding sizetype iv_cand for BIVs
>>>>> directly. In cases if the original BIV is preferred, the sizetype iv_cand
>>>>> will be chosen. As for code generation, the sizetype iv_cand has the same
>>>>> effect as the original BIV. Actually, it's better because BIV needs to be
>>>>> explicitly extended to sizetype to be used in address expression on most
>>>>> targets.
>>>>>
>>>>> One shortage of this approach is it may introduce more iv candidates. To
>>>>> minimize the impact, this patch does sophisticated code analysis and adds
>>>>> sizetype candidate for BIV only if it is used as index. Moreover, it avoids
>>>>> to add candidate of the original type if the BIV is only used as index.
>>>>> Statistics for compiling spec2k6 shows increase of candidate number is
>>>>> modest and can be ignored.
>>>>>
>>>>> There are two more patches following to fix corner cases revealed by this
>>>>> one. In together they bring obvious perf improvement for spec26k/int on
>>>>> aarch64.
>>>>> Spec2k6/int
>>>>> 400.perlbench 3.44%
>>>>> 445.gobmk -0.86%
>>>>> 456.hmmer 14.83%
>>>>> 458.sjeng 2.49%
>>>>> 462.libquantum -0.79%
>>>>> GEOMEAN 1.68%
>>>>>
>>>>> There is also about 0.36% improvement for spec2k6/fp, mostly because of case
>>>>> 436.cactusADM. I believe it can be further improved, but that should be
>>>>> another patch.
>>>>>
>>>>> I also collected benchmark data for x86_64. Spec2k6/fp is not affected. As
>>>>> for spec2k6/int, though the geomean is improved slightly, 400.perlbench is
>>>>> regressed by ~3%. I can see BIVs are chosen for some loops instead of
>>>>> address candidates. Generally, the loop header will be simplified because
>>>>> iv elimination with BIV is simpler; the number of instructions in loop body
>>>>> isn't changed. I suspect the regression comes from different addressing
>>>>> modes. With BIV, complex addressing mode like [base + index << scale +
>>>>> disp] is used, rather than [base + disp]. I guess the former has more
>>>>> micro-ops, thus more expensive. This guess can be confirmed by manually
>>>>> suppressing the complex addressing mode with higher address cost.
>>>>> Now the problem becomes why overall cost of BIV is computed lower while the
>>>>> actual cost is higher. I noticed for most affected loops, loop header is
>>>>> bloated because of iv elimination using the old address candidate. The
>>>>> bloated loop header results in much higher cost than BIV. As a result, BIV
>>>>> is preferred. I also noticed the bloated loop header generally can be
>>>>> simplified (I have a following patch for this). After applying the local
>>>>> patch, the old address candidate is chosen, and most of regression is
>>>>> recovered.
>>>>> Conclusion is I think loop header bloated issue should be blamed for the
>>>>> regression, and it can be resolved.
>>>>>
>>>>> Bootstrap and test on x64_64 and aarch64. It fixes failure of
>>>>> gcc.target/i386/pr49781-1.c, without new breakage.
>>>>>
>>>>> So what do you think?
>>>>
>>>> The data above looks ok to me.
>>>>
>>>> +static struct iv *
>>>> +find_deriving_biv_for_iv (struct ivopts_data *data, struct iv *iv)
>>>> +{
>>>> + aff_tree aff;
>>>> + struct expand_data exp_data;
>>>> +
>>>> + if (!iv->ssa_name || TREE_CODE (iv->ssa_name) != SSA_NAME)
>>>> + return iv;
>>>> +
>>>> + /* Expand IV's ssa_name till the deriving biv is found. */
>>>> + exp_data.data = data;
>>>> + exp_data.biv = NULL;
>>>> + tree_to_aff_combination_expand (iv->ssa_name, TREE_TYPE (iv->ssa_name),
>>>> + &aff, &data->name_expansion_cache,
>>>> + stop_expand, &exp_data);
>>>> + return exp_data.biv;
>>>>
>>>> that's actually "abusing" tree_to_aff_combination_expand for simply walking
>>>> SSA uses and their defs uses recursively until you hit "stop". ISTR past
>>>> discussion to add a generic walk_ssa_use interface for that. Not sure if it
>>>> materialized with a name I can't remember or whether it didn't.
>>> Thanks for reviewing. I didn't found existing interface to walk up
>>> definition chains of ssa vars. In this updated patch, I implemented a
>>> simple function which meets the minimal requirement of walking up
>>> definition chains of BIV variables. I also counted number of
>>> no_overflow BIVs that are not used in address type use. Since
>>> generally there are only two BIVs in a loop, this can prevent us from
>>> visiting definition chains most of time. Statistics shows that number
>>> of call to find_deriving_biv_for_expr plummet.
>>>
>>>>
>>>> - add_candidate (data, iv->base, iv->step, true, NULL);
>>>> + /* Check if this biv is used in address type use. */
>>>> + if (iv->no_overflow && iv->have_address_use
>>>> + && INTEGRAL_TYPE_P (TREE_TYPE (iv->base))
>>>> + && TYPE_PRECISION (TREE_TYPE (iv->base)) < TYPE_PRECISION (sizetype))
>>>> + {
>>>> + tree type = unsigned_type_for (sizetype);
>>>>
>>>> sizetype is unsigned.
>>> Fixed.
>>>
>>> Bootstrap and test on x86_64, is this OK?
>>>
>>> Thanks,
>>> bin
>>
>> And here is the updated Changelog entry.
>>
>>
>> 2015-09-08 Bin Cheng <bin.cheng@arm.com>
>>
>> PR tree-optimization/66388
>> * tree-ssa-loop-ivopts.c (struct iv, iv_cand, ivopts_data): New
>> fields.
>> (dump_iv): Dump no_overflow information.
>> (alloc_iv): Initialize new field for struct iv.
>> (mark_bivs): Count number of no_overflow bivs.
>> (find_deriving_biv_for_expr, record_biv_for_address_use): New
>> functions.
>> (idx_find_step): Call new functions above.
>> (add_candidate_1, add_candidate): New paramter.
>> (add_iv_candidate_for_biv): Add sizetype cand for BIV.
>> (get_computation_aff): Simplify convertion of cand for BIV.
>> (get_computation_cost_at): Step cand's base if necessary.
prev parent reply other threads:[~2015-09-16 11:46 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-02 3:26 Bin Cheng
2015-09-02 14:12 ` Richard Biener
2015-09-08 10:07 ` Bin.Cheng
2015-09-08 10:47 ` Bin.Cheng
2015-09-15 5:57 ` Bin.Cheng
2015-09-16 11:55 ` Richard Biener [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAFiYyc3vT6e_bC2Bz-iYYMcRra61-tRveorq406LGm0zhLoQ=g@mail.gmail.com' \
--to=richard.guenther@gmail.com \
--cc=amker.cheng@gmail.com \
--cc=bin.cheng@arm.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).