public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Yuri Rumyantsev <ysrumyan@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	Igor Zamyatin <izamyatin@gmail.com>
Subject: Re: [PATCH] Yet another simple fix to enhance outer-loop vectorization.
Date: Tue, 14 Jul 2015 11:00:00 -0000	[thread overview]
Message-ID: <CAFiYyc1EuKKTcM82sq9Ot99Q4WCRrGRAb9ewD2rqmeqKr_8xUA@mail.gmail.com> (raw)
In-Reply-To: <CAEoMCqS7omfUEC0i_+dMhge0FdazNYz7AAZYXuseEXTEJW9HuQ@mail.gmail.com>

On Mon, Jun 29, 2015 at 6:15 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Hi All,
>
> Here is updated patch containing missed change in
> slpeel_tree_peel_loop_to_edge which prevents renaming of exit PHI uses
> in inner loop.

Ok.

Thanks,
Richard.

> ChangeLog:
> 2015-06-29  Yuri Rumyantsev  <ysrumyan@gmail.com>
>
> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
> to allow renaming of PHI arguments on edges incoming from outer
> loop header, add corresponding check before start PHI iterator.
> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
> with true force_vectorize.  Set-up dominator for outer loop too.
> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
> was marked with force_vectorize and has restricted cfg.
> (slpeel_tree_peel_loop_to_edge): Do not rename exit PHI uses in
> inner loop.
> * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Do not
> do peeling for outer loops.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/vect/vect-outer-simd-2.c: New test.
>
> 2015-06-17 19:35 GMT+03:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
>> Richard,
>>
>> I attached updated patch.
>> You asked me to explain why I did changes for renaming.
>> If we do not change PHI arguments for inner loop header we can get the
>> following IR:
>> source outer loop:
>>   <bb 5>: outer-loop header
>>   # i_45 = PHI <0(4), i_18(9)>
>>   # .MEM_17 = PHI <.MEM_4(D)(4), .MEM_44(9)>
>>
>>   <bb 6>:inner-loop header
>>   # .MEM_46 = PHI <.MEM_44(7), .MEM_17(5)>
>>
>> after duplication we have (without argument correction):
>>   <bb 12>:
>>   # i_74 = PHI <i_64(13), 0(17)>
>>   # .MEM_73 = PHI <.MEM_97(13), .MEM_4(D)(17)>
>>
>>   <bb 15>:
>>   # .MEM_63 = PHI <.MEM_17(12), .MEM_97(16)>
>>
>> and later we get verifier error:
>> test1.c:20:6: error: definition in block 6 does not dominate use in block 10
>>  void foo (int n)
>>       ^
>> for SSA_NAME: .MEM_17 in statement:
>> .MEM_63 = PHI <.MEM_17(10), .MEM_97(14)>
>>
>> and you can see that we need to rename MEM_17 argument for out-coming
>> edge to MEM_73 since
>> MEM_17 was converted to MEM_73 in outer-loop header.
>>
>> This explains my simple fix in rename_variables_in_bb.
>> Note also that loop distribution is not performed for outer loops.
>>
>> I also did a change in slpeel_can_duplicate_loop_p to simplify check.
>>
>> Any comments will be appreciated.
>>
>> Yuri.
>>
>> 2015-06-17 15:24 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>> On Tue, Jun 16, 2015 at 4:12 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Thanks a lot Richard for your review.
>>>>
>>>> I presented updated patch which is not gated by force_vectorize.
>>>> I added test on outer-loop in vect_enhance_data_refs_alignment
>>>> and it returns false for it because we can not improve dr alighment
>>>> through outer-loop peeling in general. So I assume that only
>>>> versioning for alignment can be applied for targets do not support
>>>> unaligned memory access.
>>>
>>> @@ -998,7 +998,12 @@
>>>    gimple stmt = DR_STMT (dr);
>>>    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>>>    tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>>> +  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>>>
>>> +  /* Peeling of outer loops can't improve alignment.  */
>>> +  if (loop_vinfo && LOOP_VINFO_LOOP (loop_vinfo)->inner)
>>> +    return false;
>>> +
>>>
>>> but this looks wrong.  It depends on the context (DRs in the outer loop
>>> can improve alignment by peeling the outer loop and we can still
>>> peel the inner loop for alignment).
>>>
>>> So IMHO the correct place to amend is vect_enhance_data_refs_alignment
>>> (which it seems doesn't consider peeling the inner loop).
>>>
>>> I'd say you should simply add
>>>
>>>      || loop->inner)
>>>
>>> to the
>>>
>>>   /* Check if we can possibly peel the loop.  */
>>>   if (!vect_can_advance_ivs_p (loop_vinfo)
>>>       || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
>>>     do_peeling = false;
>>>
>>> check.
>>>
>>>> I did not change tests for outer loops in slpeel_can_duplicate_loop_p
>>>> as you proposed since it is not called outside vectorization.
>>>
>>> There is still no reason for this complex condition, so please remove it.
>>>
>>> _Please_ also generate diffs with -p, it is very tedious to see patch hunks
>>> without a function name context.
>>>
>>> You didn't explain why you needed the renaming changes as I don't
>>> remember needing it when using the code from loop distribution.
>>>
>>>> I also noticed one not-resolved issue with outer-loop peeling - we don't
>>>> consider remainder for possible vectorization of inner-loop as we can see
>>>> on the following example:
>>>>
>>>>   for (i = 0; i < n; i++) {
>>>>     diff = 0;
>>>>     for (j = 0; j < M; j++) {
>>>>       diff += in[j+i]*coeff[j];
>>>>     }
>>>>     out[i] = diff;
>>>>   }
>>>>
>>>> Is it worth to fix it?
>>>
>>> You mean vectorizing the inner loop in the niter peel epilogue loop
>>> of the outer loop?  Possibly yes.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> 2015-06-16  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>
>>>> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
>>>> to allow renaming of PHI arguments on edges incoming from outer
>>>> loop header, add corresponding check before start PHI iterator.
>>>> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
>>>> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
>>>> with true force_vectorize.  Set-up dominator for outer loop too.
>>>> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
>>>> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
>>>> was marked with force_vectorize and has restricted cfg.
>>>> * tree-vect-data-refs.c (vector_alignment_reachable_p): Alignment can
>>>> not be reachable for outer loops.
>>>> (vect_enhance_data_refs_alignment): Add test on true value of
>>>> do_peeling.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>> * gcc.dg/vect/vect-outer-simd-2.c: New test.
>>>>
>>>> 2015-06-09 16:26 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Mon, Jun 8, 2015 at 12:27 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Here is a simple fix which allows duplication of outer loops to
>>>>>> perform peeling for number of iterations if outer loop is marked with
>>>>>> pragma omp simd.
>>>>>>
>>>>>> Bootstrap and regression testing did not show any new failures.
>>>>>> Is it OK for trunk?
>>>>>
>>>>> Hmm, I don't remember needing to adjust rename_variables_in_bb
>>>>> when teaching loop distibution to call slpeel_tree_duplicate_to_edge_cfg
>>>>> on non-innermost loops...  (I just copied, I never called
>>>>> slpeel_can_duplicate_loop_p though).
>>>>>
>>>>> So - you should just remove the loop->inner condition from
>>>>> slpeel_can_duplicate_loop_p as it is used by non-vectorizer
>>>>> code as well (yeah, I never merged the nested loop support
>>>>> for loop distribution...).
>>>>>
>>>>> Index: tree-vect-loop.c
>>>>> ===================================================================
>>>>> --- tree-vect-loop.c    (revision 224100)
>>>>> +++ tree-vect-loop.c    (working copy)
>>>>> @@ -1879,6 +1879,10 @@
>>>>>        return false;
>>>>>      }
>>>>>
>>>>> +  /* Peeling for alignment is not supported for outer-loop vectorization.  */
>>>>> +  if (LOOP_VINFO_LOOP (loop_vinfo)->inner)
>>>>> +    LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) = 0;
>>>>>
>>>>> I think you can't simply do this - if vect_enhance_data_refs_alignment
>>>>> decided to peel for alignment then it has adjusted the DRs alignment
>>>>> info already.  So instead of the above simply disallow peeling for
>>>>> alignment in vect_enhance_data_refs_alignment?  Thus add
>>>>> || ->inner to
>>>>>
>>>>>   /* Check if we can possibly peel the loop.  */
>>>>>   if (!vect_can_advance_ivs_p (loop_vinfo)
>>>>>       || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
>>>>>     do_peeling = false;
>>>>>
>>>>> ?
>>>>>
>>>>> I also can't see why the improvement has to be gated on force_vect,
>>>>> it surely looks profitable to enable more outer loop vectorization in
>>>>> general, no?
>>>>>
>>>>> How do the cost model calculations end up with peeling the outer loop
>>>>> for niter?
>>>>>
>>>>> On targets which don't support unaligned accesses we're left with
>>>>> versioning for alignment.  Isn't peeling for alignment better there?
>>>>> Thus only disallow peeling for alignment if there is no unhandled
>>>>> alignment?
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>> ChangeLog:
>>>>>>
>>>>>> 2015-06-08  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>
>>>>>> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
>>>>>> to allow renaming of PHI arguments on edges incoming from outer
>>>>>> loop header, add corresponding check before start PHI iterator.
>>>>>> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
>>>>>> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
>>>>>> with true force_vectorize.  Set-up dominator for outer loop too.
>>>>>> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
>>>>>> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
>>>>>> was marked with force_vectorize and has restricted cfg.
>>>>>> * tre-vect-loop.c (vect_analyze_loop_2): Prohibit alignment peeling
>>>>>> for outer loops.
>>>>>>
>>>>>> gcc/testsuite/ChangeLog:
>>>>>> * gcc.dg/vect/vect-outer-simd-2.c: New test.

  reply	other threads:[~2015-07-14 11:00 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-08 10:43 Yuri Rumyantsev
2015-06-09 13:28 ` Richard Biener
2015-06-16 14:39   ` Yuri Rumyantsev
2015-06-17 12:28     ` Richard Biener
2015-06-17 17:13       ` Yuri Rumyantsev
2015-06-29 16:53         ` Yuri Rumyantsev
2015-07-14 11:00           ` Richard Biener [this message]
2015-07-14 11:00         ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc1EuKKTcM82sq9Ot99Q4WCRrGRAb9ewD2rqmeqKr_8xUA@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=izamyatin@gmail.com \
    --cc=ysrumyan@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).