public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	Martin Sebor <msebor@gmail.com>,
	Bill Schmidt <wschmidt@linux.ibm.com>
Subject: Re: [PATCH v3] Make loops_list support an optional loop_p root
Date: Thu, 5 Aug 2021 16:50:22 +0800	[thread overview]
Message-ID: <4aa7d9dc-d284-16ed-36a0-6999c9e9a3ee@linux.ibm.com> (raw)
In-Reply-To: <CAFiYyc1rEZPbZPyAYQfF8uyiHN64brr4e4cdJhb=qXc7vaS+NQ@mail.gmail.com>

on 2021/8/4 下午8:04, Richard Biener wrote:
> On Wed, Aug 4, 2021 at 12:47 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> on 2021/8/4 下午6:01, Richard Biener wrote:
>>> On Wed, Aug 4, 2021 at 4:36 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>
>>>> on 2021/8/3 下午8:08, Richard Biener wrote:
>>>>> On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>
>>>>>> on 2021/7/29 下午4:01, Richard Biener wrote:
>>>>>>> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>>>
>>>>>>>> on 2021/7/22 下午8:56, Richard Biener wrote:
>>>>>>>>> On Tue, Jul 20, 2021 at 4:37
>>>>>>>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This v2 has addressed some review comments/suggestions:
>>>>>>>>>>
>>>>>>>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>>>>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>>>>>>>     to support loop hierarchy tree rather than just a function,
>>>>>>>>>>     and adjust to use loops* accordingly.
>>>>>>>>>
>>>>>>>>> I actually meant struct loop *, not struct loops * ;)  At the point
>>>>>>>>> we pondered to make loop invariant motion work on single
>>>>>>>>> loop nests we gave up not only but also because it iterates
>>>>>>>>> over the loop nest but all the iterators only ever can process
>>>>>>>>> all loops, not say, all loops inside a specific 'loop' (and
>>>>>>>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>>>>>>>>> CTOR would take the 'root' of the loop tree as argument.
>>>>>>>>>
>>>>>>>>> I see that doesn't trivially fit how loops_list works, at least
>>>>>>>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>>>>>>>>> could be adjusted to do ONLY_INNERMOST as well?
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for the clarification!  I just realized that the previous
>>>>>>>> version with struct loops* is problematic, all traversal is
>>>>>>>> still bounded with outer_loop == NULL.  I think what you expect
>>>>>>>> is to respect the given loop_p root boundary.  Since we just
>>>>>>>> record the loops' nums, I think we still need the function* fn?
>>>>>>>
>>>>>>> Would it simplify things if we recorded the actual loop *?
>>>>>>>
>>>>>>
>>>>>> I'm afraid it's unsafe to record the loop*.  I had the same
>>>>>> question why the loop iterator uses index rather than loop* when
>>>>>> I read this at the first time.  I guess the design of processing
>>>>>> loops allows its user to update or even delete the folllowing
>>>>>> loops to be visited.  For example, when the user does some tricks
>>>>>> on one loop, then it duplicates the loop and its children to
>>>>>> somewhere and then removes the loop and its children, when
>>>>>> iterating onto its children later, the "index" way will check its
>>>>>> validity by get_loop at that point, but the "loop *" way will
>>>>>> have some recorded pointers to become dangling, can't do the
>>>>>> validity check on itself, seems to need a side linear search to
>>>>>> ensure the validity.
>>>>>>
>>>>>>> There's still the to_visit reserve which needs a bound on
>>>>>>> the number of loops for efficiency reasons.
>>>>>>>
>>>>>>
>>>>>> Yes, I still keep the fn in the updated version.
>>>>>>
>>>>>>>> So I add one optional argument loop_p root and update the
>>>>>>>> visiting codes accordingly.  Before this change, the previous
>>>>>>>> visiting uses the outer_loop == NULL as the termination condition,
>>>>>>>> it perfectly includes the root itself, but with this given root,
>>>>>>>> we have to use it as the termination condition to avoid to iterate
>>>>>>>> onto its possible existing next.
>>>>>>>>
>>>>>>>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>>>>>>>> code like:
>>>>>>>>
>>>>>>>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
>>>>>>>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>>>>>>>         if (aloop != NULL
>>>>>>>>             && aloop->inner == NULL
>>>>>>>>             && flow_loop_nested_p (tree_root, aloop))
>>>>>>>>              this->to_visit.quick_push (aloop->num);
>>>>>>>>
>>>>>>>> it has the stable bound, but if the given root only has several
>>>>>>>> child loops, it can be much worse if there are many loops in fn.
>>>>>>>> It seems impossible to predict the given root loop hierarchy size,
>>>>>>>> maybe we can still use the original linear searching for the case
>>>>>>>> loops_for_fn (fn) == root?  But since this visiting seems not so
>>>>>>>> performance critical, I chose to share the code originally used
>>>>>>>> for FROM_INNERMOST, hope it can have better readability and
>>>>>>>> maintainability.
>>>>>>>
>>>>>>> I was indeed looking for something that has execution/storage
>>>>>>> bound on the subtree we're interested in.  If we pull the CTOR
>>>>>>> out-of-line we can probably keep the linear search for
>>>>>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>>>>>
>>>>>>
>>>>>> OK, I've moved the suggested single loop tree walker out-of-line
>>>>>> to cfgloop.c, and brought the linear search back for
>>>>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>>>>
>>>>>>> It just seemed to me that we can eventually re-use a
>>>>>>> single loop tree walker for all orders, just adjusting the
>>>>>>> places we push.
>>>>>>>
>>>>>>
>>>>>> Wow, good point!  Indeed, I have further unified all orders
>>>>>> handlings into a single function walk_loop_tree.
>>>>>>
>>>>>>>>
>>>>>>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>>>>>>>> x86_64-redhat-linux and aarch64-linux-gnu, also
>>>>>>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>>>>>>>
>>>>>>>> Does the attached patch meet what you expect?
>>>>>>>
>>>>>>> So yeah, it's probably close to what is sensible.  Not sure
>>>>>>> whether optimizing the loops for the !only_push_innermost_p
>>>>>>> case is important - if we manage to produce a single
>>>>>>> walker with conditionals based on 'flags' then IPA-CP should
>>>>>>> produce optimal clones as well I guess.
>>>>>>>
>>>>>>
>>>>>> Thanks for the comments, the updated v2 is attached.
>>>>>> Comparing with v1, it does:
>>>>>>
>>>>>>   - Unify one single loop tree walker for all orders.
>>>>>>   - Move walk_loop_tree out-of-line to cfgloop.c.
>>>>>>   - Keep the linear search for LI_ONLY_INNERMOST with
>>>>>>     tree_root of fn loops.
>>>>>>   - Use class loop * instead of loop_p.
>>>>>>
>>>>>> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
>>>>>> (with/without the hunk for LI_ONLY_INNERMOST linear search,
>>>>>> it can have the coverage to exercise LI_ONLY_INNERMOST
>>>>>> in walk_loop_tree when "without").
>>>>>>
>>>>>> Is it ok for trunk?
>>>>>
>>>>> Looks good to me.  I think that the 'mn' was an optimization
>>>>> for the linear walk and it's cheaper to pointer test against
>>>>> the actual 'root' loop (no need to dereference).  Thus
>>>>>
>>>>> +  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
>>>>>      {
>>>>> -      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
>>>>> +      class loop *aloop;
>>>>> +      unsigned int i;
>>>>> +      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
>>>>>         if (aloop != NULL
>>>>>             && aloop->inner == NULL
>>>>> -           && aloop->num >= mn)
>>>>> +           && aloop->num != mn)
>>>>>           this->to_visit.quick_push (aloop->num);
>>>>>
>>>>> could elide the aloop->num != mn check and start iterating from 1,
>>>>> since loops->tree_root->num == 0
>>>>>
>>>>> and the walk_loop_tree could simply do
>>>>>
>>>>>   class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
>>>>>
>>>>> and pointer test aloop against exclude.  That avoids the idea that
>>>>> 'mn' is a vehicle to exclude one random loop from the iteration.
>>>>>
>>>>
>>>> Good idea!  Thanks for the comments!  The attached v3 has addressed
>>>> the review comments on "mn".
>>>>
>>>> Bootstrapped & regtested again on powerpc64le-linux-gnu Power9
>>>> (with/without the hunk for LI_ONLY_INNERMOST linear search).
>>>>
>>>> Is it ok for trunk?
>>>
>>> +  /* Early handle root without any inner loops, make later
>>> +     processing simpler, that is all loops processed in the
>>> +     following while loop are impossible to be root.  */
>>> +  if (!root->inner)
>>> +    {
>>> +      if (root != exclude)
>>> +       this->to_visit.quick_push (root->num);
>>> +      return;
>>> +    }
>>>
>>> could be
>>>
>>>    if (!root->inner)
>>>      {
>>>         if (flags & LI_INCLUDE_ROOT)
>>>           this->to_visit.quick_push (root->num);
>>>      }
>>>
>>
>> OK, I thought wrongly that all places with "exclude" might be
>> more consistent, so gave up to use flags directly.  :)
>>
>>> +  class loop *aloop;
>>> +  for (aloop = root;
>>> +       aloop->inner != NULL;
>>> +       aloop = aloop->inner)
>>> +    {
>>> +      if (preorder_p && aloop != exclude)
>>> +       this->to_visit.quick_push (aloop->num);
>>> +      continue;
>>> +    }
>>>
>>> could be
>>>
>>> +  class loop *aloop;
>>> +  for (aloop = root->inner;
>>> +       aloop->inner != NULL;
>>> +       aloop = aloop->inner)
>>> +    {
>>> +      if (preorder_p)
>>> +       this->to_visit.quick_push (aloop->num);
>>> +      continue;
>>> +    }
>>>
>>
>> This seems wrong?  For preorder_p, we might miss to push root
>> when root->inner isn't NULL.  The below "else if" makes it safe.
> 
> oops, yes.
> 
>> @@ -2125,17 +2125,19 @@ loops_list::walk_loop_tree (class loop *root, unsigned flags)
>>       following while loop are impossible to be root.  */
>>    if (!root->inner)
>>      {
>> -      if (root != exclude)
>> +      if (flags & LI_INCLUDE_ROOT)
>>         this->to_visit.quick_push (root->num);
>>        return;
>>      }
>> +  else if (preorder_p && flags & LI_INCLUDE_ROOT)
>> +    this->to_visit.quick_push (root->num);
>>
>>> +  /* When visiting from innermost, we need to consider root here
>>> +     since the previous while loop doesn't handle it.  */
>>> +  if (from_innermost_p && root != exclude)
>>> +    this->to_visit.quick_push (root->num);
>>>
>>> could be like the first.  I think that's more clear even.  Sorry for
>>> finding a better solution again.
>>>
>>
>> It's totally fine, thanks for all the nice suggestions!  :)
>>
>>> OK with that change
>>>
>>
>> Thanks, the attached diff is the delta against v3, excepting for
>> the "else if", the other changes follow the suggestion above.
>>
>> Could you have another look to confirm?
> 
> I'm missing the line that removes 'exclude', other than that it looks
> OK.
> 

Thanks!  Bootstrapped & regress-tested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu.  Committed in r12-2756.

BR,
Kewen

  reply	other threads:[~2021-08-05  8:50 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-19  6:20 [RFC/PATCH] Use range-based for loops for traversing loops Kewen.Lin
2021-07-19  6:26 ` Andrew Pinski
2021-07-20  8:56   ` Kewen.Lin
2021-07-19 14:08 ` Jonathan Wakely
2021-07-20  8:56   ` Kewen.Lin
2021-07-19 14:34 ` Richard Biener
2021-07-20  8:57   ` Kewen.Lin
2021-07-19 15:59 ` Martin Sebor
2021-07-20  8:58   ` Kewen.Lin
2021-07-20  9:49     ` Jonathan Wakely
2021-07-20  9:50       ` Jonathan Wakely
2021-07-20 14:42       ` Kewen.Lin
2021-07-20 14:36 ` [PATCH v2] " Kewen.Lin
2021-07-22 12:56   ` Richard Biener
2021-07-22 12:56     ` Richard Biener
2021-07-23  8:41     ` [PATCH] Make loops_list support an optional loop_p root Kewen.Lin
2021-07-23 16:26       ` Martin Sebor
2021-07-27  2:25         ` Kewen.Lin
2021-07-29  8:01       ` Richard Biener
2021-07-30  5:20         ` [PATCH v2] " Kewen.Lin
2021-08-03 12:08           ` Richard Biener
2021-08-04  2:36             ` [PATCH v3] " Kewen.Lin
2021-08-04 10:01               ` Richard Biener
2021-08-04 10:47                 ` Kewen.Lin
2021-08-04 12:04                   ` Richard Biener
2021-08-05  8:50                     ` Kewen.Lin [this message]
2021-07-23  8:35   ` [PATCH v3] Use range-based for loops for traversing loops Kewen.Lin
2021-07-23 16:10     ` Martin Sebor
2021-07-27  2:10       ` [PATCH v4] " Kewen.Lin
2021-07-29  7:48         ` Richard Biener
2021-07-30  7:18         ` Thomas Schwinge
2021-07-30  7:58           ` Kewen.Lin
2021-11-24 14:24             ` Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops) Thomas Schwinge
2021-11-24 16:58               ` Martin Jambor
2021-11-24 19:44               ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4aa7d9dc-d284-16ed-36a0-6999c9e9a3ee@linux.ibm.com \
    --to=linkw@linux.ibm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=msebor@gmail.com \
    --cc=richard.guenther@gmail.com \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).