From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Segher Boessenkool <segher@kernel.crashing.org>,
Martin Sebor <msebor@gmail.com>,
Bill Schmidt <wschmidt@linux.ibm.com>
Subject: [PATCH v3] Make loops_list support an optional loop_p root
Date: Wed, 4 Aug 2021 10:36:16 +0800 [thread overview]
Message-ID: <963b4fca-8ce6-c9d7-0b08-8431fa433322@linux.ibm.com> (raw)
In-Reply-To: <CAFiYyc0fVjwe14ToVrUWNqDB7OEvTJ48T6FvPXQKPqQCku0WHw@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 7400 bytes --]
on 2021/8/3 下午8:08, Richard Biener wrote:
> On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> on 2021/7/29 下午4:01, Richard Biener wrote:
>>> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>
>>>> on 2021/7/22 下午8:56, Richard Biener wrote:
>>>>> On Tue, Jul 20, 2021 at 4:37
>>>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This v2 has addressed some review comments/suggestions:
>>>>>>
>>>>>> - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>>> - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>>> to support loop hierarchy tree rather than just a function,
>>>>>> and adjust to use loops* accordingly.
>>>>>
>>>>> I actually meant struct loop *, not struct loops * ;) At the point
>>>>> we pondered to make loop invariant motion work on single
>>>>> loop nests we gave up not only but also because it iterates
>>>>> over the loop nest but all the iterators only ever can process
>>>>> all loops, not say, all loops inside a specific 'loop' (and
>>>>> including that 'loop' if LI_INCLUDE_ROOT). So the
>>>>> CTOR would take the 'root' of the loop tree as argument.
>>>>>
>>>>> I see that doesn't trivially fit how loops_list works, at least
>>>>> not for LI_ONLY_INNERMOST. But I guess FROM_INNERMOST
>>>>> could be adjusted to do ONLY_INNERMOST as well?
>>>>>
>>>>
>>>>
>>>> Thanks for the clarification! I just realized that the previous
>>>> version with struct loops* is problematic, all traversal is
>>>> still bounded with outer_loop == NULL. I think what you expect
>>>> is to respect the given loop_p root boundary. Since we just
>>>> record the loops' nums, I think we still need the function* fn?
>>>
>>> Would it simplify things if we recorded the actual loop *?
>>>
>>
>> I'm afraid it's unsafe to record the loop*. I had the same
>> question why the loop iterator uses index rather than loop* when
>> I read this at the first time. I guess the design of processing
>> loops allows its user to update or even delete the folllowing
>> loops to be visited. For example, when the user does some tricks
>> on one loop, then it duplicates the loop and its children to
>> somewhere and then removes the loop and its children, when
>> iterating onto its children later, the "index" way will check its
>> validity by get_loop at that point, but the "loop *" way will
>> have some recorded pointers to become dangling, can't do the
>> validity check on itself, seems to need a side linear search to
>> ensure the validity.
>>
>>> There's still the to_visit reserve which needs a bound on
>>> the number of loops for efficiency reasons.
>>>
>>
>> Yes, I still keep the fn in the updated version.
>>
>>>> So I add one optional argument loop_p root and update the
>>>> visiting codes accordingly. Before this change, the previous
>>>> visiting uses the outer_loop == NULL as the termination condition,
>>>> it perfectly includes the root itself, but with this given root,
>>>> we have to use it as the termination condition to avoid to iterate
>>>> onto its possible existing next.
>>>>
>>>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>>>> code like:
>>>>
>>>> struct loops *fn_loops = loops_for_fn (fn)->larray;
>>>> for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>>> if (aloop != NULL
>>>> && aloop->inner == NULL
>>>> && flow_loop_nested_p (tree_root, aloop))
>>>> this->to_visit.quick_push (aloop->num);
>>>>
>>>> it has the stable bound, but if the given root only has several
>>>> child loops, it can be much worse if there are many loops in fn.
>>>> It seems impossible to predict the given root loop hierarchy size,
>>>> maybe we can still use the original linear searching for the case
>>>> loops_for_fn (fn) == root? But since this visiting seems not so
>>>> performance critical, I chose to share the code originally used
>>>> for FROM_INNERMOST, hope it can have better readability and
>>>> maintainability.
>>>
>>> I was indeed looking for something that has execution/storage
>>> bound on the subtree we're interested in. If we pull the CTOR
>>> out-of-line we can probably keep the linear search for
>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>
>>
>> OK, I've moved the suggested single loop tree walker out-of-line
>> to cfgloop.c, and brought the linear search back for
>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>
>>> It just seemed to me that we can eventually re-use a
>>> single loop tree walker for all orders, just adjusting the
>>> places we push.
>>>
>>
>> Wow, good point! Indeed, I have further unified all orders
>> handlings into a single function walk_loop_tree.
>>
>>>>
>>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>>>> x86_64-redhat-linux and aarch64-linux-gnu, also
>>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>>>
>>>> Does the attached patch meet what you expect?
>>>
>>> So yeah, it's probably close to what is sensible. Not sure
>>> whether optimizing the loops for the !only_push_innermost_p
>>> case is important - if we manage to produce a single
>>> walker with conditionals based on 'flags' then IPA-CP should
>>> produce optimal clones as well I guess.
>>>
>>
>> Thanks for the comments, the updated v2 is attached.
>> Comparing with v1, it does:
>>
>> - Unify one single loop tree walker for all orders.
>> - Move walk_loop_tree out-of-line to cfgloop.c.
>> - Keep the linear search for LI_ONLY_INNERMOST with
>> tree_root of fn loops.
>> - Use class loop * instead of loop_p.
>>
>> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
>> (with/without the hunk for LI_ONLY_INNERMOST linear search,
>> it can have the coverage to exercise LI_ONLY_INNERMOST
>> in walk_loop_tree when "without").
>>
>> Is it ok for trunk?
>
> Looks good to me. I think that the 'mn' was an optimization
> for the linear walk and it's cheaper to pointer test against
> the actual 'root' loop (no need to dereference). Thus
>
> + if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
> {
> - for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
> + class loop *aloop;
> + unsigned int i;
> + for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
> if (aloop != NULL
> && aloop->inner == NULL
> - && aloop->num >= mn)
> + && aloop->num != mn)
> this->to_visit.quick_push (aloop->num);
>
> could elide the aloop->num != mn check and start iterating from 1,
> since loops->tree_root->num == 0
>
> and the walk_loop_tree could simply do
>
> class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
>
> and pointer test aloop against exclude. That avoids the idea that
> 'mn' is a vehicle to exclude one random loop from the iteration.
>
Good idea! Thanks for the comments! The attached v3 has addressed
the review comments on "mn".
Bootstrapped & regtested again on powerpc64le-linux-gnu Power9
(with/without the hunk for LI_ONLY_INNERMOST linear search).
Is it ok for trunk?
BR,
Kewen
-----
gcc/ChangeLog:
* cfgloop.h (loops_list::loops_list): Add one optional argument root
and adjust accordingly, update loop tree walking and factor out
to ...
* cfgloop.c (loops_list::walk_loop_tree): ...this. New function.
[-- Attachment #2: loop_root-v3.diff --]
[-- Type: text/plain, Size: 6358 bytes --]
---
gcc/cfgloop.c | 65 ++++++++++++++++++++++++++++++++
gcc/cfgloop.h | 100 ++++++++++++++++++++------------------------------
2 files changed, 105 insertions(+), 60 deletions(-)
diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 6284ae292b6..afbaa216ce5 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -2104,3 +2104,68 @@ mark_loop_for_removal (loop_p loop)
loop->latch = NULL;
loops_state_set (LOOPS_NEED_FIXUP);
}
+
+/* Starting from loop tree ROOT, walk loop tree as the visiting
+ order specified by FLAGS. The supported visiting orders
+ are:
+ - LI_ONLY_INNERMOST
+ - LI_FROM_INNERMOST
+ - Preorder (if neither of above is specified) */
+
+void
+loops_list::walk_loop_tree (class loop *root, unsigned flags)
+{
+ bool only_innermost_p = flags & LI_ONLY_INNERMOST;
+ bool from_innermost_p = flags & LI_FROM_INNERMOST;
+ bool preorder_p = !(only_innermost_p || from_innermost_p);
+ class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
+
+ /* Early handle root without any inner loops, make later
+ processing simpler, that is all loops processed in the
+ following while loop are impossible to be root. */
+ if (!root->inner)
+ {
+ if (root != exclude)
+ this->to_visit.quick_push (root->num);
+ return;
+ }
+
+ class loop *aloop;
+ for (aloop = root;
+ aloop->inner != NULL;
+ aloop = aloop->inner)
+ {
+ if (preorder_p && aloop != exclude)
+ this->to_visit.quick_push (aloop->num);
+ continue;
+ }
+
+ while (1)
+ {
+ gcc_assert (aloop != root);
+ if (from_innermost_p || aloop->inner == NULL)
+ this->to_visit.quick_push (aloop->num);
+
+ if (aloop->next)
+ {
+ for (aloop = aloop->next;
+ aloop->inner != NULL;
+ aloop = aloop->inner)
+ {
+ if (preorder_p)
+ this->to_visit.quick_push (aloop->num);
+ continue;
+ }
+ }
+ else if (loop_outer (aloop) == root)
+ break;
+ else
+ aloop = loop_outer (aloop);
+ }
+
+ /* When visiting from innermost, we need to consider root here
+ since the previous while loop doesn't handle it. */
+ if (from_innermost_p && root != exclude)
+ this->to_visit.quick_push (root->num);
+}
+
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index d5eee6b4840..fed2b0daf4b 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -669,13 +669,15 @@ as_const (T &t)
}
/* A list for visiting loops, which contains the loop numbers instead of
- the loop pointers. The scope is restricted in function FN and the
- visiting order is specified by FLAGS. */
+ the loop pointers. If the loop ROOT is offered (non-null), the visiting
+ will start from it, otherwise it would start from the tree_root of
+ loops_for_fn (FN) instead. The scope is restricted in function FN and
+ the visiting order is specified by FLAGS. */
class loops_list
{
public:
- loops_list (function *fn, unsigned flags);
+ loops_list (function *fn, unsigned flags, class loop *root = nullptr);
template <typename T> class Iter
{
@@ -750,6 +752,10 @@ public:
}
private:
+ /* Walk loop tree starting from ROOT as the visiting order specified
+ by FLAGS. */
+ void walk_loop_tree (class loop *root, unsigned flags);
+
/* The function we are visiting. */
function *fn;
@@ -782,76 +788,50 @@ loops_list::Iter<T>::fill_curr_loop ()
}
/* Set up the loops list to visit according to the specified
- function scope FN and iterating order FLAGS. */
+ function scope FN and iterating order FLAGS. If ROOT is
+ not null, the visiting would start from it, otherwise it
+ will start from tree_root of loops_for_fn (FN). */
-inline loops_list::loops_list (function *fn, unsigned flags)
+inline loops_list::loops_list (function *fn, unsigned flags, class loop *root)
{
- class loop *aloop;
- unsigned i;
- int mn;
+ struct loops *loops = loops_for_fn (fn);
+ gcc_assert (!root || loops);
+
+ /* Check mutually exclusive flags should not co-exist. */
+ unsigned checked_flags = LI_ONLY_INNERMOST | LI_FROM_INNERMOST;
+ gcc_assert ((flags & checked_flags) != checked_flags);
this->fn = fn;
- if (!loops_for_fn (fn))
+ if (!loops)
return;
+ class loop *tree_root = root ? root : loops->tree_root;
+
this->to_visit.reserve_exact (number_of_loops (fn));
- mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
- if (flags & LI_ONLY_INNERMOST)
+ /* When root is tree_root of loops_for_fn (fn) and the visiting
+ order is LI_ONLY_INNERMOST, we would like to use linear
+ search here since it has a more stable bound than the
+ walk_loop_tree. */
+ if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
{
- for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
- if (aloop != NULL
- && aloop->inner == NULL
- && aloop->num >= mn)
- this->to_visit.quick_push (aloop->num);
- }
- else if (flags & LI_FROM_INNERMOST)
- {
- /* Push the loops to LI->TO_VISIT in postorder. */
- for (aloop = loops_for_fn (fn)->tree_root;
- aloop->inner != NULL;
- aloop = aloop->inner)
- continue;
-
- while (1)
+ gcc_assert (tree_root->num == 0);
+ if (tree_root->inner == NULL)
{
- if (aloop->num >= mn)
- this->to_visit.quick_push (aloop->num);
-
- if (aloop->next)
- {
- for (aloop = aloop->next;
- aloop->inner != NULL;
- aloop = aloop->inner)
- continue;
- }
- else if (!loop_outer (aloop))
- break;
- else
- aloop = loop_outer (aloop);
+ if (flags & LI_INCLUDE_ROOT)
+ this->to_visit.quick_push (0);
+
+ return;
}
+
+ class loop *aloop;
+ unsigned int i;
+ for (i = 1; vec_safe_iterate (loops->larray, i, &aloop); i++)
+ if (aloop != NULL && aloop->inner == NULL)
+ this->to_visit.quick_push (aloop->num);
}
else
- {
- /* Push the loops to LI->TO_VISIT in preorder. */
- aloop = loops_for_fn (fn)->tree_root;
- while (1)
- {
- if (aloop->num >= mn)
- this->to_visit.quick_push (aloop->num);
-
- if (aloop->inner != NULL)
- aloop = aloop->inner;
- else
- {
- while (aloop != NULL && aloop->next == NULL)
- aloop = loop_outer (aloop);
- if (aloop == NULL)
- break;
- aloop = aloop->next;
- }
- }
- }
+ walk_loop_tree (tree_root, flags);
}
/* The properties of the target. */
--
2.27.0
next prev parent reply other threads:[~2021-08-04 2:36 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-19 6:20 [RFC/PATCH] Use range-based for loops for traversing loops Kewen.Lin
2021-07-19 6:26 ` Andrew Pinski
2021-07-20 8:56 ` Kewen.Lin
2021-07-19 14:08 ` Jonathan Wakely
2021-07-20 8:56 ` Kewen.Lin
2021-07-19 14:34 ` Richard Biener
2021-07-20 8:57 ` Kewen.Lin
2021-07-19 15:59 ` Martin Sebor
2021-07-20 8:58 ` Kewen.Lin
2021-07-20 9:49 ` Jonathan Wakely
2021-07-20 9:50 ` Jonathan Wakely
2021-07-20 14:42 ` Kewen.Lin
2021-07-20 14:36 ` [PATCH v2] " Kewen.Lin
2021-07-22 12:56 ` Richard Biener
2021-07-22 12:56 ` Richard Biener
2021-07-23 8:41 ` [PATCH] Make loops_list support an optional loop_p root Kewen.Lin
2021-07-23 16:26 ` Martin Sebor
2021-07-27 2:25 ` Kewen.Lin
2021-07-29 8:01 ` Richard Biener
2021-07-30 5:20 ` [PATCH v2] " Kewen.Lin
2021-08-03 12:08 ` Richard Biener
2021-08-04 2:36 ` Kewen.Lin [this message]
2021-08-04 10:01 ` [PATCH v3] " Richard Biener
2021-08-04 10:47 ` Kewen.Lin
2021-08-04 12:04 ` Richard Biener
2021-08-05 8:50 ` Kewen.Lin
2021-07-23 8:35 ` [PATCH v3] Use range-based for loops for traversing loops Kewen.Lin
2021-07-23 16:10 ` Martin Sebor
2021-07-27 2:10 ` [PATCH v4] " Kewen.Lin
2021-07-29 7:48 ` Richard Biener
2021-07-30 7:18 ` Thomas Schwinge
2021-07-30 7:58 ` Kewen.Lin
2021-11-24 14:24 ` Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops) Thomas Schwinge
2021-11-24 16:58 ` Martin Jambor
2021-11-24 19:44 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=963b4fca-8ce6-c9d7-0b08-8431fa433322@linux.ibm.com \
--to=linkw@linux.ibm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=msebor@gmail.com \
--cc=richard.guenther@gmail.com \
--cc=segher@kernel.crashing.org \
--cc=wschmidt@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).