From: Richard Biener <richard.guenther@gmail.com>
To: Yuri Rumyantsev <ysrumyan@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
Igor Zamyatin <izamyatin@gmail.com>
Subject: Re: [PATCH] Yet another simple fix to enhance outer-loop vectorization.
Date: Tue, 14 Jul 2015 11:00:00 -0000 [thread overview]
Message-ID: <CAFiYyc2mQMCNhUQtfigE12kw+3q58WFo6Xg7xkvmDZBcFXwosA@mail.gmail.com> (raw)
In-Reply-To: <CAEoMCqRi9GJhnqkB4Yea3ENZce8pbrH6Qc53hVUBgQHCYko7Pw@mail.gmail.com>
On Wed, Jun 17, 2015 at 6:35 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> I attached updated patch.
> You asked me to explain why I did changes for renaming.
> If we do not change PHI arguments for inner loop header we can get the
> following IR:
> source outer loop:
> <bb 5>: outer-loop header
> # i_45 = PHI <0(4), i_18(9)>
> # .MEM_17 = PHI <.MEM_4(D)(4), .MEM_44(9)>
>
> <bb 6>:inner-loop header
> # .MEM_46 = PHI <.MEM_44(7), .MEM_17(5)>
>
> after duplication we have (without argument correction):
> <bb 12>:
> # i_74 = PHI <i_64(13), 0(17)>
> # .MEM_73 = PHI <.MEM_97(13), .MEM_4(D)(17)>
>
> <bb 15>:
> # .MEM_63 = PHI <.MEM_17(12), .MEM_97(16)>
>
> and later we get verifier error:
> test1.c:20:6: error: definition in block 6 does not dominate use in block 10
> void foo (int n)
> ^
> for SSA_NAME: .MEM_17 in statement:
> .MEM_63 = PHI <.MEM_17(10), .MEM_97(14)>
>
> and you can see that we need to rename MEM_17 argument for out-coming
> edge to MEM_73 since
> MEM_17 was converted to MEM_73 in outer-loop header.
>
> This explains my simple fix in rename_variables_in_bb.
> Note also that loop distribution is not performed for outer loops.
Yes, I never committed the patch implementing this... but I also now
see that loop-distribution always re-writes the virtual SSA web.
Richard.
> I also did a change in slpeel_can_duplicate_loop_p to simplify check.
>
> Any comments will be appreciated.
>
> Yuri.
>
> 2015-06-17 15:24 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Tue, Jun 16, 2015 at 4:12 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Thanks a lot Richard for your review.
>>>
>>> I presented updated patch which is not gated by force_vectorize.
>>> I added test on outer-loop in vect_enhance_data_refs_alignment
>>> and it returns false for it because we can not improve dr alighment
>>> through outer-loop peeling in general. So I assume that only
>>> versioning for alignment can be applied for targets do not support
>>> unaligned memory access.
>>
>> @@ -998,7 +998,12 @@
>> gimple stmt = DR_STMT (dr);
>> stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>> tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>> + loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>>
>> + /* Peeling of outer loops can't improve alignment. */
>> + if (loop_vinfo && LOOP_VINFO_LOOP (loop_vinfo)->inner)
>> + return false;
>> +
>>
>> but this looks wrong. It depends on the context (DRs in the outer loop
>> can improve alignment by peeling the outer loop and we can still
>> peel the inner loop for alignment).
>>
>> So IMHO the correct place to amend is vect_enhance_data_refs_alignment
>> (which it seems doesn't consider peeling the inner loop).
>>
>> I'd say you should simply add
>>
>> || loop->inner)
>>
>> to the
>>
>> /* Check if we can possibly peel the loop. */
>> if (!vect_can_advance_ivs_p (loop_vinfo)
>> || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
>> do_peeling = false;
>>
>> check.
>>
>>> I did not change tests for outer loops in slpeel_can_duplicate_loop_p
>>> as you proposed since it is not called outside vectorization.
>>
>> There is still no reason for this complex condition, so please remove it.
>>
>> _Please_ also generate diffs with -p, it is very tedious to see patch hunks
>> without a function name context.
>>
>> You didn't explain why you needed the renaming changes as I don't
>> remember needing it when using the code from loop distribution.
>>
>>> I also noticed one not-resolved issue with outer-loop peeling - we don't
>>> consider remainder for possible vectorization of inner-loop as we can see
>>> on the following example:
>>>
>>> for (i = 0; i < n; i++) {
>>> diff = 0;
>>> for (j = 0; j < M; j++) {
>>> diff += in[j+i]*coeff[j];
>>> }
>>> out[i] = diff;
>>> }
>>>
>>> Is it worth to fix it?
>>
>> You mean vectorizing the inner loop in the niter peel epilogue loop
>> of the outer loop? Possibly yes.
>>
>> Thanks,
>> Richard.
>>
>>> 2015-06-16 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>
>>> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
>>> to allow renaming of PHI arguments on edges incoming from outer
>>> loop header, add corresponding check before start PHI iterator.
>>> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
>>> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
>>> with true force_vectorize. Set-up dominator for outer loop too.
>>> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
>>> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
>>> was marked with force_vectorize and has restricted cfg.
>>> * tree-vect-data-refs.c (vector_alignment_reachable_p): Alignment can
>>> not be reachable for outer loops.
>>> (vect_enhance_data_refs_alignment): Add test on true value of
>>> do_peeling.
>>>
>>> gcc/testsuite/ChangeLog:
>>> * gcc.dg/vect/vect-outer-simd-2.c: New test.
>>>
>>> 2015-06-09 16:26 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Mon, Jun 8, 2015 at 12:27 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Hi All,
>>>>>
>>>>> Here is a simple fix which allows duplication of outer loops to
>>>>> perform peeling for number of iterations if outer loop is marked with
>>>>> pragma omp simd.
>>>>>
>>>>> Bootstrap and regression testing did not show any new failures.
>>>>> Is it OK for trunk?
>>>>
>>>> Hmm, I don't remember needing to adjust rename_variables_in_bb
>>>> when teaching loop distibution to call slpeel_tree_duplicate_to_edge_cfg
>>>> on non-innermost loops... (I just copied, I never called
>>>> slpeel_can_duplicate_loop_p though).
>>>>
>>>> So - you should just remove the loop->inner condition from
>>>> slpeel_can_duplicate_loop_p as it is used by non-vectorizer
>>>> code as well (yeah, I never merged the nested loop support
>>>> for loop distribution...).
>>>>
>>>> Index: tree-vect-loop.c
>>>> ===================================================================
>>>> --- tree-vect-loop.c (revision 224100)
>>>> +++ tree-vect-loop.c (working copy)
>>>> @@ -1879,6 +1879,10 @@
>>>> return false;
>>>> }
>>>>
>>>> + /* Peeling for alignment is not supported for outer-loop vectorization. */
>>>> + if (LOOP_VINFO_LOOP (loop_vinfo)->inner)
>>>> + LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) = 0;
>>>>
>>>> I think you can't simply do this - if vect_enhance_data_refs_alignment
>>>> decided to peel for alignment then it has adjusted the DRs alignment
>>>> info already. So instead of the above simply disallow peeling for
>>>> alignment in vect_enhance_data_refs_alignment? Thus add
>>>> || ->inner to
>>>>
>>>> /* Check if we can possibly peel the loop. */
>>>> if (!vect_can_advance_ivs_p (loop_vinfo)
>>>> || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
>>>> do_peeling = false;
>>>>
>>>> ?
>>>>
>>>> I also can't see why the improvement has to be gated on force_vect,
>>>> it surely looks profitable to enable more outer loop vectorization in
>>>> general, no?
>>>>
>>>> How do the cost model calculations end up with peeling the outer loop
>>>> for niter?
>>>>
>>>> On targets which don't support unaligned accesses we're left with
>>>> versioning for alignment. Isn't peeling for alignment better there?
>>>> Thus only disallow peeling for alignment if there is no unhandled
>>>> alignment?
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>> ChangeLog:
>>>>>
>>>>> 2015-06-08 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>
>>>>> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
>>>>> to allow renaming of PHI arguments on edges incoming from outer
>>>>> loop header, add corresponding check before start PHI iterator.
>>>>> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
>>>>> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
>>>>> with true force_vectorize. Set-up dominator for outer loop too.
>>>>> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
>>>>> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
>>>>> was marked with force_vectorize and has restricted cfg.
>>>>> * tre-vect-loop.c (vect_analyze_loop_2): Prohibit alignment peeling
>>>>> for outer loops.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>> * gcc.dg/vect/vect-outer-simd-2.c: New test.
prev parent reply other threads:[~2015-07-14 11:00 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-08 10:43 Yuri Rumyantsev
2015-06-09 13:28 ` Richard Biener
2015-06-16 14:39 ` Yuri Rumyantsev
2015-06-17 12:28 ` Richard Biener
2015-06-17 17:13 ` Yuri Rumyantsev
2015-06-29 16:53 ` Yuri Rumyantsev
2015-07-14 11:00 ` Richard Biener
2015-07-14 11:00 ` Richard Biener [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFiYyc2mQMCNhUQtfigE12kw+3q58WFo6Xg7xkvmDZBcFXwosA@mail.gmail.com \
--to=richard.guenther@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=izamyatin@gmail.com \
--cc=ysrumyan@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).