From: Richard Sandiford <richard.sandiford@arm.com>
To: Aldy Hernandez <aldyh@redhat.com>
Cc: GCC patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Replace evrp use in loop versioning with ranger.
Date: Fri, 30 Jul 2021 09:39:15 +0100
Message-ID: <mptwnp8gpqk.fsf@arm.com>
In-Reply-To: <CAGm3qMUOKZpE0sRCzBJBfLvuDRjP8MB0Okh6c+pzUL8peknNCg@mail.gmail.com> (Aldy Hernandez's message of "Tue, 27 Jul 2021 11:52:11 +0200")
Aldy Hernandez <aldyh@redhat.com> writes:
> On Mon, Jul 26, 2021 at 7:28 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Aldy Hernandez <aldyh@redhat.com> writes:
>> > On Mon, Jul 26, 2021 at 4:18 PM Richard Sandiford
>> > <richard.sandiford@arm.com> wrote:
>> >>
>> >> Aldy Hernandez <aldyh@redhat.com> writes:
>> >> > This patch replaces the evrp_range_analyzer in the loop versioning code
>> >> > with an on-demand ranger.
>> >> >
>> >> > Everything was pretty straightforward, except that range_of_expr requires
>> >> > a gimple statement as context to provide context aware ranges. I didn't see
>> >> > a convenient place where the statement was saved, so I made a vector indexed
>> >> > by SSA names. As an alternative, I tried to use the loop's first statement,
>> >> > but that proved to be insufficient.
>> >>
>> >> The mapping is one-to-many though: there can be multiple statements
>> >> for each SSA name. Maybe that doesn't matter in this context and
>> >> any of the statements can act as a representative.
>> >>
>> >> I'm surprised that the loop's first statement didn't work though,
>> >> since the SSA name is supposedly known to be loop-invariant. What went
>> >> wrong when you tried that?
>> >
>> > I was looking at the first statement of loop_info->block_list and one
>> > of the dg.exp=loop-versioning* tests failed. Perhaps I should have
>> > used the loop itself, as in the attached patch. With this patch all
>> > of the loop-versioning tests pass.
>> >
>> >>
>> >> > I am not familiar with loop versioning, but if the DOM walk was only
>> >> > necessary for the calls to record_ranges_from_stmt, this too could be
>> >> > removed as the ranger will work without it.
>> >>
>> >> Yeah, that was the only reason. If the information is available at
>> >> version_for_unity (I guess it is) then we should just avoid recording
>> >> the versioning there if so.
>> >>
>> >> How expensive is the check? If the result is worth caching, perhaps
>> >> we should have two bitmaps: the existing one, and one that records
>> >> whether we've checked a particular SSA name.
>> >>
>> >> If the check is relatively cheap then that won't be worth it though.
>> >
>> > If you're asking about the range_of_expr check, that's all cached, so
>> > it should be pretty cheap. Besides, we're no longer calculating
>> > ranges for each statement in the IL, as we were doing in lv_dom_walker
>> > with evrp's record_ranges_from_stmt. Only statements of interest are
>> > queried.
>>
>> Sounds good. If the results are already cached then another level
>> of caching (via the second bitmap I mentioned above) would obviously
>> be a waste of time.
>
> My callgrind harness for performance testing wasn't able to pick up
> enough samples to measure the time spent in
> pass_loop_versioning::execute. I've seen this happen before with
> passes that run too fast. I'm afraid I don't have enough cycles to
> continue working on this.

Yeah, any testing of this was above and beyond IMO. Hearing that the
range query does its own caching was enough for me. :-)

>> > How about this patch, pending tests?
>>
>> OK, thanks, as a strict improvement over the status quo. But it'd be
>> even better without the dom walk :-)
>
> I've removed the DOM walk, and re-tested.
>
> OK to push?

Sorry for asking for another iteration, but…

> Aldy
>
> From 9b1cba95377e7b26b4f0495b1b5998d2f7f33a14 Mon Sep 17 00:00:00 2001
> From: Aldy Hernandez <aldyh@redhat.com>
> Date: Sat, 24 Jul 2021 12:29:28 +0200
> Subject: [PATCH] Replace evrp use in loop versioning with ranger.
>
> This patch replaces the evrp_range_analyzer in the loop versioning code
> with a ranger.
>
> Tested on x86-64 Linux.
>
> gcc/ChangeLog:
>
> * gimple-loop-versioning.cc (lv_dom_walker::lv_dom_walker): Remove.
> (loop_versioning::lv_dom_walker::before_dom_children): Remove.
> (loop_versioning::lv_dom_walker::after_dom_children): Remove.
> (loop_versioning::prune_loop_conditions): Replace vr_values use
> with range_query interface.
> (loop_versioning::prune_conditions): Replace dom walk with
> straight iteration.
> (pass_loop_versioning::execute): Use ranger.
> ---
> gcc/gimple-loop-versioning.cc | 78 ++++++++---------------------------
> 1 file changed, 18 insertions(+), 60 deletions(-)
>
> diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
> index 4b70c5a4aab..52eb6429171 100644
> --- a/gcc/gimple-loop-versioning.cc
> +++ b/gcc/gimple-loop-versioning.cc
> @@ -30,19 +30,17 @@ along with GCC; see the file COPYING3. If not see
> #include "tree-ssa-loop.h"
> #include "ssa.h"
> #include "tree-scalar-evolution.h"
> -#include "tree-chrec.h"
> #include "tree-ssa-loop-ivopts.h"
> #include "fold-const.h"
> #include "tree-ssa-propagate.h"
> #include "tree-inline.h"
> #include "domwalk.h"
> -#include "alloc-pool.h"
> -#include "vr-values.h"
> -#include "gimple-ssa-evrp-analyze.h"
> #include "tree-vectorizer.h"
> #include "omp-general.h"
> #include "predict.h"
> #include "tree-into-ssa.h"
> +#include "gimple-range.h"
> +#include "tree-cfg.h"
>
> namespace {
>
> @@ -253,24 +251,6 @@ public:
> unsigned int run ();
>
> private:
> - /* Used to walk the dominator tree to find loop versioning conditions
> - that are always false. */
> - class lv_dom_walker : public dom_walker
> - {
> - public:
> - lv_dom_walker (loop_versioning &);
> -
> - edge before_dom_children (basic_block) FINAL OVERRIDE;
> - void after_dom_children (basic_block) FINAL OVERRIDE;
> -
> - private:
> - /* The parent pass. */
> - loop_versioning &m_lv;
> -
> - /* Used to build context-dependent range information. */
> - evrp_range_analyzer m_range_analyzer;
> - };
> -
> /* Used to simplify statements based on conditions that are established
> by the version checks. */
> class name_prop : public substitute_and_fold_engine
> @@ -308,7 +288,7 @@ private:
> bool analyze_block (basic_block);
> bool analyze_blocks ();
>
> - void prune_loop_conditions (class loop *, vr_values *);
> + void prune_loop_conditions (class loop *);
> bool prune_conditions ();
>
> void merge_loop_info (class loop *, class loop *);
> @@ -499,36 +479,6 @@ loop_info::worth_versioning_p () const
> && (!bitmap_empty_p (&unity_names) || subloops_benefit_p));
> }
>
> -loop_versioning::lv_dom_walker::lv_dom_walker (loop_versioning &lv)
> - : dom_walker (CDI_DOMINATORS), m_lv (lv), m_range_analyzer (false)
> -{
> -}
> -
> -/* Process BB before processing the blocks it dominates. */
> -
> -edge
> -loop_versioning::lv_dom_walker::before_dom_children (basic_block bb)
> -{
> - m_range_analyzer.enter (bb);
> -
> - if (bb == bb->loop_father->header)
> - m_lv.prune_loop_conditions (bb->loop_father, &m_range_analyzer);
> -
> - for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
> - gsi_next (&si))
> - m_range_analyzer.record_ranges_from_stmt (gsi_stmt (si), false);
> -
> - return NULL;
> -}
> -
> -/* Process BB after processing the blocks it dominates. */
> -
> -void
> -loop_versioning::lv_dom_walker::after_dom_children (basic_block bb)
> -{
> - m_range_analyzer.leave (bb);
> -}
> -
> /* Decide whether to replace VAL with a new value in a versioned loop.
> Return the new value if so, otherwise return null. */
>
> @@ -1483,18 +1433,21 @@ loop_versioning::analyze_blocks ()
> LOOP. */
>
> void
> -loop_versioning::prune_loop_conditions (class loop *loop, vr_values *vrs)
> +loop_versioning::prune_loop_conditions (class loop *loop)
> {
> loop_info &li = get_loop_info (loop);
>
> int to_remove = -1;
> bitmap_iterator bi;
> unsigned int i;
> + int_range_max r;
> EXECUTE_IF_SET_IN_BITMAP (&li.unity_names, 0, i, bi)
> {
> tree name = ssa_name (i);
> - const value_range_equiv *vr = vrs->get_value_range (name);
> - if (vr && !vr->may_contain_p (build_one_cst (TREE_TYPE (name))))
> + gimple *stmt = first_stmt (loop->header);
> +
> + if (get_range_query (cfun)->range_of_expr (r, name, stmt)
> + && !r.contains_p (build_one_cst (TREE_TYPE (name))))
> {
> if (dump_enabled_p ())
> dump_printf_loc (MSG_NOTE, find_loop_location (loop),
> @@ -1519,9 +1472,11 @@ loop_versioning::prune_conditions ()
> AUTO_DUMP_SCOPE ("prune_loop_conditions",
> dump_user_location_t::from_function_decl (m_fn->decl));
>
> - calculate_dominance_info (CDI_DOMINATORS);
> - lv_dom_walker dom_walker (*this);
> - dom_walker.walk (ENTRY_BLOCK_PTR_FOR_FN (m_fn));
> + basic_block bb;
> + FOR_EACH_BB_FN (bb, m_fn)
> + if (bb == bb->loop_father->header)
> + prune_loop_conditions (bb->loop_father);

If we were going to keep pruning as a separate step, I think we should
iterate over loops rather than blocks.

However, what I meant by:

>> >> If the information is available at
>> >> version_for_unity (I guess it is) then we should just avoid recording
>> >> the versioning there if so.

is that we should instead put the get_range_query (cfun)->range_of_expr
and !r.contains_p test…

------------------------------------------------------------------------
void
loop_versioning::version_for_unity (gimple *stmt, tree name)
{
class loop *loop = loop_containing_stmt (stmt);
loop_info &li = get_loop_info (loop);
…here
if (bitmap_set_bit (&li.unity_names, SSA_NAME_VERSION (name)))
------------------------------------------------------------------------

and report that the value can't be 1 at that point. There would then
be no need for a separate pruning step. Having this range information
on tap makes the pass much simpler than it used to be. :-)

For the avoidance of doubt, I think it would be good to keep using
first_stmt (loop->header) (as in your patch) rather than use the stmt
argument to version_for_unity.
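
To make the shape of that suggestion concrete, here is a minimal,
self-contained sketch of "check the range before recording the name".
It is only a toy model and not GCC code: ToyRange, ToyRangeQuery,
ToyLoopInfo and toy_version_for_unity are hypothetical stand-ins for
int_range_max, the range_query interface, loop_info and
version_for_unity, and a std::set plays the role of the unity_names
bitmap.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>

// Toy stand-in for a value range: a closed interval [lo, hi].
struct ToyRange {
  std::int64_t lo;
  std::int64_t hi;
  bool contains(std::int64_t v) const { return lo <= v && v <= hi; }
};

// Hypothetical range query: fills in the range for NAME and returns
// true, or returns false if nothing is known about NAME.
using ToyRangeQuery = std::function<bool(ToyRange &, unsigned)>;

// Toy stand-in for per-loop data; the set models the unity_names bitmap.
struct ToyLoopInfo {
  std::set<unsigned> unity_names;
};

// Record NAME as a candidate for "version the loop for NAME == 1",
// unless the range query already proves that NAME can never be 1.
// Returns true only if NAME was newly recorded.
bool toy_version_for_unity(ToyLoopInfo &li, unsigned name,
                           const ToyRangeQuery &query) {
  ToyRange r;
  if (query(r, name)) {
    assert(r.lo <= r.hi);  // sanity check on the toy query's result
    if (!r.contains(1))
      return false;  // NAME cannot be 1, so versioning would be pointless
  }
  // Unknown range, or a range that includes 1: record the name once.
  return li.unity_names.insert(name).second;
}
```

In this model a name whose range is known to exclude 1 never enters
the set in the first place, so nothing is left for a later pruning
walk to remove. In the real pass the query would presumably still use
first_stmt (loop->header) as its context, as above.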

Thanks,
Richard

> +
> return m_num_conditions != 0;
> }
>
> @@ -1810,7 +1765,10 @@ pass_loop_versioning::execute (function *fn)
> if (number_of_loops (fn) <= 1)
> return 0;
>
> - return loop_versioning (fn).run ();
> + enable_ranger (fn);
> + unsigned int ret = loop_versioning (fn).run ();
> + disable_ranger (fn);
> + return ret;
> }
>
> } // anon namespace
Thread overview: 8+ messages
2021-07-24 14:19 Aldy Hernandez
2021-07-26 14:18 ` Richard Sandiford
2021-07-26 15:16 ` Aldy Hernandez
2021-07-26 16:08 ` Aldy Hernandez
2021-07-26 17:28 ` Richard Sandiford
2021-07-27 9:52 ` Aldy Hernandez
2021-07-30 8:39 ` Richard Sandiford [this message]
2021-07-30 9:34 ` Aldy Hernandez