[RFC/PATCH] Use range-based for loops for traversing loops

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [RFC/PATCH] Use range-based for loops for traversing loops
@ 2021-07-19  6:20 Kewen.Lin
  2021-07-19  6:26 ` Andrew Pinski
                   ` (4 more replies)
  0 siblings, 5 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-19  6:20 UTC (permalink / raw)
  To: GCC Patches
  Cc: Martin Sebor, Richard Biener, Richard Sandiford, Jakub Jelinek,
	tbsaunde, Segher Boessenkool, Jonathan Wakely

[-- Attachment #1: Type: text/plain, Size: 4250 bytes --]

Hi,

This patch follows Martin's suggestion here[1], to support
range-based for loops for traversing loops, analogously to
the patch for vec[2].

Bootstrapped and regtested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu, also
bootstrapped on ppc64le P9 with bootstrap-O3 config.

Any comments are appreciated.

BR,
Kewen

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573424.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572315.html
-----
gcc/ChangeLog:

	* cfgloop.h (class loop_iterator): Rename to ...
	(class loops_list): ... this.
	(loop_iterator::next): Rename to ...
	(loops_list::iterator::fill_curr_loop): ... this and adjust.
	(loop_iterator::loop_iterator): Rename to ...
	(loops_list::loops_list): ... this and adjust.
	(FOR_EACH_LOOP): Rename to ...
	(ALL_LOOPS): ... this.
	(FOR_EACH_LOOP_FN): Rename to ...
	(ALL_LOOPS_FN): this.
	(loops_list::iterator): New class.
	(loops_list::begin): New function.
	(loops_list::end): Likewise.
	* cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with ALL_LOOPS*.
	(sort_sibling_loops): Likewise.
	(disambiguate_loops_with_multiple_latches): Likewise.
	(verify_loop_structure): Likewise.
	* cfgloopmanip.c (create_preheaders): Likewise.
	(force_single_succ_latches): Likewise.
	* config/aarch64/falkor-tag-collision-avoidance.c
	(execute_tag_collision_avoidance): Likewise.
	* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
	* config/s390/s390.c (s390_adjust_loops): Likewise.
	* doc/loop.texi: Likewise.
	* gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
	* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
	* gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
	(loop_versioning::make_versioning_decisions): Likewise.
	* gimple-ssa-split-paths.c (split_paths): Likewise.
	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
	* graphite.c (canonicalize_loop_form): Likewise.
	(graphite_transform_loops): Likewise.
	* ipa-fnsummary.c (analyze_function_body): Likewise.
	* ipa-pure-const.c (analyze_function): Likewise.
	* loop-doloop.c (doloop_optimize_loops): Likewise.
	* loop-init.c (loop_optimizer_finalize): Likewise.
	(fix_loop_structure): Likewise.
	* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
	(move_loop_invariants): Likewise.
	* loop-unroll.c (decide_unrolling): Likewise.
	(unroll_loops): Likewise.
	* modulo-sched.c (sms_schedule): Likewise.
	* predict.c (predict_loops): Likewise.
	(pass_profile::execute): Likewise.
	* profile.c (branch_prob): Likewise.
	* sel-sched-ir.c (sel_finish_pipelining): Likewise.
	(sel_find_rgns): Likewise.
	* tree-cfg.c (replace_loop_annotate): Likewise.
	(replace_uses_by): Likewise.
	(move_sese_region_to_fn): Likewise.
	* tree-if-conv.c (pass_if_conversion::execute): Likewise.
	* tree-loop-distribution.c (loop_distribution::execute): Likewise.
	* tree-parloops.c (parallelize_loops): Likewise.
	* tree-predcom.c (tree_predictive_commoning): Likewise.
	* tree-scalar-evolution.c (scev_initialize): Likewise.
	(scev_reset): Likewise.
	* tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
	* tree-ssa-live.c (remove_unused_locals): Likewise.
	* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
	* tree-ssa-loop-im.c (analyze_memory_references): Likewise.
	(tree_ssa_lim_initialize): Likewise.
	* tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
	* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
	* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
	* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
	(free_numbers_of_iterations_estimates): Likewise.
	* tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
	* tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
	* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
	* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
	(pass_scev_cprop::execute): Likewise.
	* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
	* tree-ssa-sccvn.c (do_rpo_vn): Likewise.
	* tree-ssa-threadupdate.c
	(jump_thread_path_registry::thread_through_all_blocks): Likewise.
	* tree-vectorizer.c (vectorize_loops): Likewise.
	* tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

[-- Attachment #2: 0002-Use-range-based-for-loops-for-traversing-loops.patch --]
[-- Type: text/plain, Size: 42539 bytes --]

---
 gcc/cfgloop.c                                 |  19 ++-
 gcc/cfgloop.h                                 | 115 +++++++++++++-----
 gcc/cfgloopmanip.c                            |   7 +-
 .../aarch64/falkor-tag-collision-avoidance.c  |   4 +-
 gcc/config/mn10300/mn10300.c                  |   4 +-
 gcc/config/s390/s390.c                        |   4 +-
 gcc/doc/loop.texi                             |   2 +-
 gcc/gimple-loop-interchange.cc                |   3 +-
 gcc/gimple-loop-jam.c                         |   3 +-
 gcc/gimple-loop-versioning.cc                 |   6 +-
 gcc/gimple-ssa-split-paths.c                  |   3 +-
 gcc/graphite-isl-ast-to-gimple.c              |   5 +-
 gcc/graphite.c                                |   6 +-
 gcc/ipa-fnsummary.c                           |   2 +-
 gcc/ipa-pure-const.c                          |   3 +-
 gcc/loop-doloop.c                             |   8 +-
 gcc/loop-init.c                               |   5 +-
 gcc/loop-invariant.c                          |  14 +--
 gcc/loop-unroll.c                             |   7 +-
 gcc/modulo-sched.c                            |   5 +-
 gcc/predict.c                                 |   5 +-
 gcc/profile.c                                 |   3 +-
 gcc/sel-sched-ir.c                            |  12 +-
 gcc/tree-cfg.c                                |  13 +-
 gcc/tree-if-conv.c                            |   3 +-
 gcc/tree-loop-distribution.c                  |   2 +-
 gcc/tree-parloops.c                           |   3 +-
 gcc/tree-predcom.c                            |   3 +-
 gcc/tree-scalar-evolution.c                   |  16 +--
 gcc/tree-ssa-dce.c                            |   3 +-
 gcc/tree-ssa-live.c                           |   3 +-
 gcc/tree-ssa-loop-ch.c                        |   3 +-
 gcc/tree-ssa-loop-im.c                        |   7 +-
 gcc/tree-ssa-loop-ivcanon.c                   |   3 +-
 gcc/tree-ssa-loop-ivopts.c                    |   3 +-
 gcc/tree-ssa-loop-manip.c                     |   3 +-
 gcc/tree-ssa-loop-niter.c                     |   8 +-
 gcc/tree-ssa-loop-prefetch.c                  |   3 +-
 gcc/tree-ssa-loop-split.c                     |   7 +-
 gcc/tree-ssa-loop-unswitch.c                  |   3 +-
 gcc/tree-ssa-loop.c                           |   6 +-
 gcc/tree-ssa-propagate.c                      |   3 +-
 gcc/tree-ssa-sccvn.c                          |   3 +-
 gcc/tree-ssa-threadupdate.c                   |   3 +-
 gcc/tree-vectorizer.c                         |   4 +-
 gcc/tree-vrp.c                                |   3 +-
 46 files changed, 164 insertions(+), 189 deletions(-)

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index f094538b9ff..5fce39042c4 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -162,14 +162,12 @@ flow_loop_dump (const class loop *loop, FILE *file,
 void
 flow_loops_dump (FILE *file, void (*loop_dump_aux) (const class loop *, FILE *, int), int verbose)
 {
-  class loop *loop;
-
   if (!current_loops || ! file)
     return;
 
   fprintf (file, ";; %d loops found\n", number_of_loops (cfun));
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (loop_p loop : ALL_LOOPS (LI_INCLUDE_ROOT))
     {
       flow_loop_dump (loop, file, loop_dump_aux, verbose);
     }
@@ -559,8 +557,7 @@ sort_sibling_loops (function *fn)
   free (rc_order);
 
   auto_vec<loop_p, 3> siblings;
-  loop_p loop;
-  FOR_EACH_LOOP_FN (fn, loop, LI_INCLUDE_ROOT)
+  for (loop_p loop : ALL_LOOPS_FN (fn, LI_INCLUDE_ROOT))
     if (loop->inner && loop->inner->next)
       {
 	loop_p sibling = loop->inner;
@@ -836,9 +833,7 @@ disambiguate_multiple_latches (class loop *loop)
 void
 disambiguate_loops_with_multiple_latches (void)
 {
-  class loop *loop;
-
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       if (!loop->latch)
 	disambiguate_multiple_latches (loop);
@@ -1457,7 +1452,7 @@ verify_loop_structure (void)
   auto_sbitmap visited (last_basic_block_for_fn (cfun));
   bitmap_clear (visited);
   bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       unsigned n;
 
@@ -1503,7 +1498,7 @@ verify_loop_structure (void)
   free (bbs);
 
   /* Check headers and latches.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       i = loop->num;
       if (loop->header == NULL)
@@ -1629,7 +1624,7 @@ verify_loop_structure (void)
     }
 
   /* Check the recorded loop exits.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       if (!loop->exits || loop->exits->e != NULL)
 	{
@@ -1723,7 +1718,7 @@ verify_loop_structure (void)
 	  err = 1;
 	}
 
-      FOR_EACH_LOOP (loop, 0)
+      for (loop_p loop : ALL_LOOPS (0))
 	{
 	  eloops = 0;
 	  for (exit = loop->exits->next; exit->e; exit = exit->next)
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 5e699276c88..2cb1a1f0d5d 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -658,55 +658,115 @@ enum li_flags
   LI_ONLY_INNERMOST = 4		/* Iterate only over innermost loops.  */
 };
 
-/* The iterator for loops.  */
+/* A list for visiting loops, which contains the loop numbers instead of
+   the loop pointers.  The scope is restricted in function FN and the
+   visiting order is specified by FLAGS.  */
 
-class loop_iterator
+class loops_list
 {
 public:
-  loop_iterator (function *fn, loop_p *loop, unsigned flags);
+  loops_list (function *fn, unsigned flags);
 
-  inline loop_p next ();
+  class iterator
+  {
+  public:
+    iterator (const loops_list &l, unsigned idx) : list (l), curr_idx (idx)
+    {
+      fill_curr_loop ();
+    }
+
+    loop_p operator* () const { return curr_loop; }
+
+    iterator &
+    operator++ ()
+    {
+      if (curr_idx < list.to_visit.length ())
+	{
+	  /* Bump the index and fill a new one.  */
+	  curr_idx++;
+	  fill_curr_loop ();
+	}
+      else
+	gcc_assert (!curr_loop);
+
+      return *this;
+    }
+
+    bool
+    operator!= (const iterator &rhs) const
+    {
+      return this->curr_idx < rhs.curr_idx;
+    }
+
+  private:
+    /* Fill the current loop starting from the current index.  */
+    void fill_curr_loop ();
+
+    /* Reference to the loop list to visit.  */
+    const loops_list &list;
+
+    /* The current index in the list to visit.  */
+    unsigned curr_idx;
 
+    /* The loop implied by the current index.  */
+    loop_p curr_loop;
+  };
+
+  iterator
+  begin () const
+  {
+    return iterator (*this, 0);
+  }
+
+  iterator
+  end () const
+  {
+    return iterator (*this, to_visit.length ());
+  }
+
+private:
   /* The function we are visiting.  */
   function *fn;
 
   /* The list of loops to visit.  */
   auto_vec<int, 16> to_visit;
-
-  /* The index of the actual loop.  */
-  unsigned idx;
 };
 
-inline loop_p
-loop_iterator::next ()
+/* Starting from current index CURR_IDX (inclusive), find one index
+   which stands for one valid loop and fill the found loop as CURR_LOOP,
+   if we can't find one, set CURR_LOOP as null.  */
+
+inline void
+loops_list::iterator::fill_curr_loop ()
 {
   int anum;
 
-  while (this->to_visit.iterate (this->idx, &anum))
+  while (this->list.to_visit.iterate (this->curr_idx, &anum))
     {
-      this->idx++;
-      loop_p loop = get_loop (fn, anum);
+      loop_p loop = get_loop (this->list.fn, anum);
       if (loop)
-	return loop;
+	{
+	  curr_loop = loop;
+	  return;
+	}
+      this->curr_idx++;
     }
 
-  return NULL;
+  curr_loop = nullptr;
 }
 
-inline
-loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
+/* Set up the loops list to visit according to the specified
+   function scope FN and iterating order FLAGS.  */
+
+inline loops_list::loops_list (function *fn, unsigned flags)
 {
   class loop *aloop;
   unsigned i;
   int mn;
 
-  this->idx = 0;
   this->fn = fn;
   if (!loops_for_fn (fn))
-    {
-      *loop = NULL;
-      return;
-    }
+    return;
 
   this->to_visit.reserve_exact (number_of_loops (fn));
   mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
@@ -766,19 +826,10 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
 	    }
 	}
     }
-
-  *loop = this->next ();
 }
 
-#define FOR_EACH_LOOP(LOOP, FLAGS) \
-  for (loop_iterator li(cfun, &(LOOP), FLAGS); \
-       (LOOP); \
-       (LOOP) = li.next ())
-
-#define FOR_EACH_LOOP_FN(FN, LOOP, FLAGS) \
-  for (loop_iterator li(FN, &(LOOP), FLAGS); \
-       (LOOP); \
-       (LOOP) = li.next ())
+#define ALL_LOOPS(FLAGS) loops_list (cfun, FLAGS)
+#define ALL_LOOPS_FN(FN, FLAGS) loops_list (FN, FLAGS)
 
 /* The properties of the target.  */
 struct target_cfgloop {
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 2af59fedc92..bf4f666a1f2 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1572,12 +1572,10 @@ create_preheader (class loop *loop, int flags)
 void
 create_preheaders (int flags)
 {
-  class loop *loop;
-
   if (!current_loops)
     return;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     create_preheader (loop, flags);
   loops_state_set (LOOPS_HAVE_PREHEADERS);
 }
@@ -1587,10 +1585,9 @@ create_preheaders (int flags)
 void
 force_single_succ_latches (void)
 {
-  class loop *loop;
   edge e;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       if (loop->latch != loop->header && single_succ_p (loop->latch))
 	continue;
diff --git a/gcc/config/aarch64/falkor-tag-collision-avoidance.c b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
index de214e4a0f7..141afa23f2b 100644
--- a/gcc/config/aarch64/falkor-tag-collision-avoidance.c
+++ b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
@@ -808,8 +808,6 @@ record_loads (tag_map_t &tag_map, struct loop *loop)
 void
 execute_tag_collision_avoidance ()
 {
-  struct loop *loop;
-
   df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_chain_add_problem (DF_UD_CHAIN);
   df_compute_regs_ever_live (true);
@@ -824,7 +822,7 @@ execute_tag_collision_avoidance ()
   calculate_dominance_info (CDI_DOMINATORS);
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       tag_map_t tag_map (512);
 
diff --git a/gcc/config/mn10300/mn10300.c b/gcc/config/mn10300/mn10300.c
index 6f842a3ad32..b977104bd07 100644
--- a/gcc/config/mn10300/mn10300.c
+++ b/gcc/config/mn10300/mn10300.c
@@ -3234,8 +3234,6 @@ mn10300_loop_contains_call_insn (loop_p loop)
 static void
 mn10300_scan_for_setlb_lcc (void)
 {
-  loop_p loop;
-
   DUMP ("Looking for loops that can use the SETLB insn", NULL_RTX);
 
   df_analyze ();
@@ -3248,7 +3246,7 @@ mn10300_scan_for_setlb_lcc (void)
      if an inner loop is not suitable for use with the SETLB/Lcc insns, it may
      be the case that its parent loop is suitable.  Thus we should check all
      loops, but work from the innermost outwards.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_ONLY_INNERMOST))
     {
       const char * reason = NULL;
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b1d3b99784d..df3c4361a7e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -14479,15 +14479,13 @@ s390_adjust_loop_scan_osc (struct loop* loop)
 static void
 s390_adjust_loops ()
 {
-  struct loop *loop = NULL;
-
   df_analyze ();
   compute_bb_for_insn ();
 
   /* Find the loops.  */
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_ONLY_INNERMOST))
     {
       if (dump_file)
 	{
diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
index a135656ed01..310889702ec 100644
--- a/gcc/doc/loop.texi
+++ b/gcc/doc/loop.texi
@@ -80,7 +80,7 @@ and its subloops in the numbering.  The index of a loop never changes.
 The entries of the @code{larray} field should not be accessed directly.
 The function @code{get_loop} returns the loop description for a loop with
 the given index.  @code{number_of_loops} function returns number of
-loops in the function.  To traverse all loops, use @code{FOR_EACH_LOOP}
+loops in the function.  To traverse all loops, use @code{ALL_LOOPS}
 macro.  The @code{flags} argument of the macro is used to determine
 the direction of traversal and the set of loops visited.  Each loop is
 guaranteed to be visited exactly once, regardless of the changes to the
diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index 7a88faa2c07..960885371e1 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -2089,8 +2089,7 @@ pass_linterchange::execute (function *fun)
     return 0;
 
   bool changed_p = false;
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_ONLY_INNERMOST))
     {
       vec<loop_p> loop_nest = vNULL;
       vec<data_reference_p> datarefs = vNULL;
diff --git a/gcc/gimple-loop-jam.c b/gcc/gimple-loop-jam.c
index 4842f0dff80..26ac06f1b8f 100644
--- a/gcc/gimple-loop-jam.c
+++ b/gcc/gimple-loop-jam.c
@@ -486,13 +486,12 @@ adjust_unroll_factor (class loop *inner, struct data_dependence_relation *ddr,
 static unsigned int
 tree_loop_unroll_and_jam (void)
 {
-  class loop *loop;
   bool changed = false;
 
   gcc_assert (scev_initialized_p ());
 
   /* Go through all innermost loops.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_ONLY_INNERMOST))
     {
       class loop *outer = loop_outer (loop);
 
diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
index 4b70c5a4aab..d021d31b4f9 100644
--- a/gcc/gimple-loop-versioning.cc
+++ b/gcc/gimple-loop-versioning.cc
@@ -1428,8 +1428,7 @@ loop_versioning::analyze_blocks ()
      versioning at that level could be useful in some cases.  */
   get_loop_info (get_loop (m_fn, 0)).rejected_p = true;
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       loop_info &linfo = get_loop_info (loop);
 
@@ -1650,8 +1649,7 @@ loop_versioning::make_versioning_decisions ()
   AUTO_DUMP_SCOPE ("make_versioning_decisions",
 		   dump_user_location_t::from_function_decl (m_fn->decl));
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       loop_info &linfo = get_loop_info (loop);
       if (decide_whether_loop_is_versionable (loop))
diff --git a/gcc/gimple-ssa-split-paths.c b/gcc/gimple-ssa-split-paths.c
index 2dd953d5ef9..53d7924e393 100644
--- a/gcc/gimple-ssa-split-paths.c
+++ b/gcc/gimple-ssa-split-paths.c
@@ -473,13 +473,12 @@ static bool
 split_paths ()
 {
   bool changed = false;
-  loop_p loop;
 
   loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
   initialize_original_copy_tables ();
   calculate_dominance_info (CDI_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       /* Only split paths if we are optimizing this loop for speed.  */
       if (!optimize_loop_for_speed_p (loop))
diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index c202213f39b..41042a190a2 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -1535,9 +1535,8 @@ graphite_regenerate_ast_isl (scop_p scop)
       if_region->false_region->region.entry->flags |= EDGE_FALLTHRU;
       /* remove_edge_and_dominated_blocks marks loops for removal but
 	 doesn't actually remove them (fix that...).  */
-      loop_p loop;
-      FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
-	if (! loop->header)
+      for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
+	if (!loop->header)
 	  delete_loop (loop);
     }
 
diff --git a/gcc/graphite.c b/gcc/graphite.c
index 6c4fb42282b..13f3d74ae15 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -377,8 +377,7 @@ canonicalize_loop_closed_ssa (loop_p loop, edge e)
 static void
 canonicalize_loop_form (void)
 {
-  loop_p loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       edge e = single_exit (loop);
       if (!e || (e->flags & (EDGE_COMPLEX|EDGE_FAKE)))
@@ -494,10 +493,9 @@ graphite_transform_loops (void)
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
-      loop_p loop;
       int num_no_dependency = 0;
 
-      FOR_EACH_LOOP (loop, 0)
+      for (loop_p loop : ALL_LOOPS (0))
 	if (loop->can_be_parallel)
 	  num_no_dependency++;
 
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 95d28757f95..c8e9eb4a004 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -2923,7 +2923,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
       if (dump_file && (dump_flags & TDF_DETAILS))
 	flow_loops_dump (dump_file, NULL, 0);
       scev_initialize ();
-      FOR_EACH_LOOP (loop, 0)
+      for (loop_p loop : ALL_LOOPS (0))
 	{
 	  predicate loop_iterations = true;
 	  sreal header_freq;
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index f045108af21..7bfb8ce2216 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -1087,9 +1087,8 @@ end:
 	    }
 	  else
 	    {
-	      class loop *loop;
 	      scev_initialize ();
-	      FOR_EACH_LOOP (loop, 0)
+	      for (loop_p loop : ALL_LOOPS (0))
 		if (!finite_loop_p (loop))
 		  {
 		    if (dump_file)
diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
index dda7b9e268f..9cdf6c0c942 100644
--- a/gcc/loop-doloop.c
+++ b/gcc/loop-doloop.c
@@ -789,18 +789,14 @@ doloop_optimize (class loop *loop)
 void
 doloop_optimize_loops (void)
 {
-  class loop *loop;
-
   if (optimize == 1)
     {
       df_live_add_problem ();
       df_live_set_all_dirty ();
     }
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      doloop_optimize (loop);
-    }
+  for (loop_p loop : ALL_LOOPS (0))
+    doloop_optimize (loop);
 
   if (optimize == 1)
     df_remove_problem (df_live);
diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 1fde0ede441..54ea1b6bb55 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -137,7 +137,6 @@ loop_optimizer_init (unsigned flags)
 void
 loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
 {
-  class loop *loop;
   basic_block bb;
 
   timevar_push (TV_LOOP_FINI);
@@ -167,7 +166,7 @@ loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
       goto loop_fini_done;
     }
 
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (loop_p loop : ALL_LOOPS_FN (fn, 0))
     free_simple_loop_desc (loop);
 
   /* Clean up.  */
@@ -229,7 +228,7 @@ fix_loop_structure (bitmap changed_bbs)
      loops, so that when we remove the loops, we know that the loops inside
      are preserved, and do not waste time relinking loops that will be
      removed later.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       /* Detect the case that the loop is no longer present even though
          it wasn't marked for removal.
diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index bdc7b59dd5f..1da9d855dc9 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -2136,7 +2136,7 @@ calculate_loop_reg_pressure (void)
   rtx link;
   class loop *loop, *parent;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     if (loop->aux == NULL)
       {
 	loop->aux = xcalloc (1, sizeof (class loop_data));
@@ -2203,7 +2203,7 @@ calculate_loop_reg_pressure (void)
   bitmap_release (&curr_regs_live);
   if (flag_ira_region == IRA_REGION_MIXED
       || flag_ira_region == IRA_REGION_ALL)
-    FOR_EACH_LOOP (loop, 0)
+    for (loop_p loop : ALL_LOOPS (0))
       {
 	EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi)
 	  if (! bitmap_bit_p (&LOOP_DATA (loop)->regs_ref, j))
@@ -2217,7 +2217,7 @@ calculate_loop_reg_pressure (void)
       }
   if (dump_file == NULL)
     return;
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       parent = loop_outer (loop);
       fprintf (dump_file, "\n  Loop %d (parent %d, header bb%d, depth %d)\n",
@@ -2251,8 +2251,6 @@ calculate_loop_reg_pressure (void)
 void
 move_loop_invariants (void)
 {
-  class loop *loop;
-
   if (optimize == 1)
     df_live_add_problem ();
   /* ??? This is a hack.  We should only need to call df_live_set_all_dirty
@@ -2271,7 +2269,7 @@ move_loop_invariants (void)
     }
   df_set_flags (DF_EQ_NOTES + DF_DEFER_INSN_RESCAN);
   /* Process the loops, innermost first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       curr_loop = loop;
       /* move_single_loop_invariants for very large loops is time consuming
@@ -2284,10 +2282,8 @@ move_loop_invariants (void)
 	move_single_loop_invariants (loop);
     }
 
-  FOR_EACH_LOOP (loop, 0)
-    {
+  for (loop_p loop : ALL_LOOPS (0))
       free_loop_data (loop);
-    }
 
   if (flag_ira_loop_pressure)
     /* There is no sense to keep this info because it was most
diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c
index 66d93487e29..a833f340d4f 100644
--- a/gcc/loop-unroll.c
+++ b/gcc/loop-unroll.c
@@ -214,10 +214,8 @@ report_unroll (class loop *loop, dump_location_t locus)
 static void
 decide_unrolling (int flags)
 {
-  class loop *loop;
-
   /* Scan the loops, inner ones first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       loop->lpt_decision.decision = LPT_NONE;
       dump_user_location_t locus = get_loop_location (loop);
@@ -278,14 +276,13 @@ decide_unrolling (int flags)
 void
 unroll_loops (int flags)
 {
-  class loop *loop;
   bool changed = false;
 
   /* Now decide rest of unrolling.  */
   decide_unrolling (flags);
 
   /* Scan the loops, inner ones first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       /* And perform the appropriate transformations.  */
       switch (loop->lpt_decision.decision)
diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
index e72e46db387..8a5d5eace35 100644
--- a/gcc/modulo-sched.c
+++ b/gcc/modulo-sched.c
@@ -1353,7 +1353,6 @@ sms_schedule (void)
   int maxii, max_asap;
   partial_schedule_ptr ps;
   basic_block bb = NULL;
-  class loop *loop;
   basic_block condition_bb = NULL;
   edge latch_edge;
   HOST_WIDE_INT trip_count, max_trip_count;
@@ -1397,7 +1396,7 @@ sms_schedule (void)
 
   /* Build DDGs for all the relevant loops and hold them in G_ARR
      indexed by the loop index.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       rtx_insn *head, *tail;
       rtx count_reg;
@@ -1543,7 +1542,7 @@ sms_schedule (void)
   }
 
   /* We don't want to perform SMS on new loops - created by versioning.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       rtx_insn *head, *tail;
       rtx count_reg;
diff --git a/gcc/predict.c b/gcc/predict.c
index d751e6cecce..acd7be0011e 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -1949,7 +1949,7 @@ predict_loops (void)
 
   /* Try to predict out blocks in a loop that are not part of a
      natural loop.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       basic_block bb, *bbs;
       unsigned j, n_exits = 0;
@@ -4111,8 +4111,7 @@ pass_profile::execute (function *fun)
     profile_status_for_fn (fun) = PROFILE_GUESSED;
  if (dump_file && (dump_flags & TDF_DETAILS))
    {
-     class loop *loop;
-     FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+     for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
        if (loop->header->count.initialized_p ())
          fprintf (dump_file, "Loop got predicted %d to iterate %i times.\n",
        	   loop->num,
diff --git a/gcc/profile.c b/gcc/profile.c
index 1fa4196fa16..a00925079e9 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -1466,13 +1466,12 @@ branch_prob (bool thunk)
   if (flag_branch_probabilities
       && (profile_status_for_fn (cfun) == PROFILE_READ))
     {
-      class loop *loop;
       if (dump_file && (dump_flags & TDF_DETAILS))
 	report_predictor_hitrates ();
 
       /* At this moment we have precise loop iteration count estimates.
 	 Record them to loop structure before the profile gets out of date. */
-      FOR_EACH_LOOP (loop, 0)
+      for (loop_p loop : ALL_LOOPS (0))
 	if (loop->header->count > 0 && loop->header->count.reliable_p ())
 	  {
 	    gcov_type nit = expected_loop_iterations_unbounded (loop);
diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index eef9d6969f4..0479756e5dc 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -6247,10 +6247,8 @@ make_regions_from_the_rest (void)
 /* Free data structures used in pipelining of loops.  */
 void sel_finish_pipelining (void)
 {
-  class loop *loop;
-
   /* Release aux fields so we don't free them later by mistake.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     loop->aux = NULL;
 
   loop_optimizer_finalize ();
@@ -6271,11 +6269,11 @@ sel_find_rgns (void)
 
   if (current_loops)
     {
-      loop_p loop;
+      unsigned flags = flag_sel_sched_pipelining_outer_loops
+			 ? LI_FROM_INNERMOST
+			 : LI_ONLY_INNERMOST;
 
-      FOR_EACH_LOOP (loop, (flag_sel_sched_pipelining_outer_loops
-			    ? LI_FROM_INNERMOST
-			    : LI_ONLY_INNERMOST))
+      for (loop_p loop : ALL_LOOPS (flags))
 	make_regions_from_loop_nest (loop);
     }
 
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index c8b0f7b33e1..5226b265938 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -312,12 +312,11 @@ replace_loop_annotate_in_block (basic_block bb, class loop *loop)
 static void
 replace_loop_annotate (void)
 {
-  class loop *loop;
   basic_block bb;
   gimple_stmt_iterator gsi;
   gimple *stmt;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       /* First look into the header.  */
       replace_loop_annotate_in_block (loop->header, loop);
@@ -2027,12 +2026,8 @@ replace_uses_by (tree name, tree val)
   /* Also update the trees stored in loop structures.  */
   if (current_loops)
     {
-      class loop *loop;
-
-      FOR_EACH_LOOP (loop, 0)
-	{
+      for (loop_p loop : ALL_LOOPS (0))
 	  substitute_in_loop_info (loop, name, val);
-	}
     }
 }
 
@@ -7752,9 +7747,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
 
   /* Fix up orig_loop_num.  If the block referenced in it has been moved
      to dest_cfun, update orig_loop_num field, otherwise clear it.  */
-  class loop *dloop;
+  class loop *dloop = NULL;
   signed char *moved_orig_loop_num = NULL;
-  FOR_EACH_LOOP_FN (dest_cfun, dloop, 0)
+  for (loop_p dloop : ALL_LOOPS_FN (dest_cfun, 0))
     if (dloop->orig_loop_num)
       {
 	if (moved_orig_loop_num == NULL)
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 345488e2a19..4a359e038d0 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -3300,14 +3300,13 @@ pass_if_conversion::gate (function *fun)
 unsigned int
 pass_if_conversion::execute (function *fun)
 {
-  class loop *loop;
   unsigned todo = 0;
 
   if (number_of_loops (fun) <= 1)
     return 0;
 
   auto_vec<gimple *> preds;
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     if (flag_tree_loop_if_convert == 1
 	|| ((flag_tree_loop_vectorize || loop->force_vectorize)
 	    && !loop->dont_vectorize))
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 65aa1df4aba..cd79a836540 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -3312,7 +3312,7 @@ loop_distribution::execute (function *fun)
 
   /* We can at the moment only distribute non-nested loops, thus restrict
      walking to innermost loops.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_ONLY_INNERMOST))
     {
       /* Don't distribute multiple exit edges loop, or cold loop when
          not doing pattern detection.  */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index fe1baef32a7..4a880287229 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -3989,7 +3989,6 @@ parallelize_loops (bool oacc_kernels_p)
 {
   unsigned n_threads;
   bool changed = false;
-  class loop *loop;
   class loop *skip_loop = NULL;
   class tree_niter_desc niter_desc;
   struct obstack parloop_obstack;
@@ -4020,7 +4019,7 @@ parallelize_loops (bool oacc_kernels_p)
 
   calculate_dominance_info (CDI_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       if (loop == skip_loop)
 	{
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index cf85517e1c7..ac7c3bed5c4 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -3419,11 +3419,10 @@ pcom_worker::tree_predictive_commoning_loop (bool allow_unroll_p)
 unsigned
 tree_predictive_commoning (bool allow_unroll_p)
 {
-  class loop *loop;
   unsigned ret = 0, changed = 0;
 
   initialize_original_copy_tables ();
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_ONLY_INNERMOST))
     if (optimize_loop_for_speed_p (loop))
       {
 	pcom_worker w(loop);
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index b22d49a0ab6..4e514f55f7f 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -2977,16 +2977,12 @@ gather_stats_on_scev_database (void)
 void
 scev_initialize (void)
 {
-  class loop *loop;
-
   gcc_assert (! scev_initialized_p ());
 
   scalar_evolution_info = hash_table<scev_info_hasher>::create_ggc (100);
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      loop->nb_iterations = NULL_TREE;
-    }
+  for (loop_p loop : ALL_LOOPS (0))
+    loop->nb_iterations = NULL_TREE;
 }
 
 /* Return true if SCEV is initialized.  */
@@ -3015,14 +3011,10 @@ scev_reset_htab (void)
 void
 scev_reset (void)
 {
-  class loop *loop;
-
   scev_reset_htab ();
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      loop->nb_iterations = NULL_TREE;
-    }
+  for (loop_p loop : ALL_LOOPS (0))
+    loop->nb_iterations = NULL_TREE;
 }
 
 /* Return true if the IV calculation in TYPE can overflow based on the knowledge
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index e2d3b63a30c..8a42242e06e 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -417,7 +417,6 @@ find_obviously_necessary_stmts (bool aggressive)
   /* Prevent the empty possibly infinite loops from being removed.  */
   if (aggressive)
     {
-      class loop *loop;
       if (mark_irreducible_loops ())
 	FOR_EACH_BB_FN (bb, cfun)
 	  {
@@ -433,7 +432,7 @@ find_obviously_necessary_stmts (bool aggressive)
 		}
 	  }
 
-      FOR_EACH_LOOP (loop, 0)
+      for (loop_p loop : ALL_LOOPS (0))
 	if (!finite_loop_p (loop))
 	  {
 	    if (dump_file)
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index a2aab25e862..dccf395c72f 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -908,8 +908,7 @@ remove_unused_locals (void)
 
   if (cfun->has_simduid_loops)
     {
-      class loop *loop;
-      FOR_EACH_LOOP (loop, 0)
+      for (loop_p loop : ALL_LOOPS (0))
 	if (loop->simduid && !is_used_p (loop->simduid))
 	  loop->simduid = NULL_TREE;
     }
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index dfa5dc87c34..f39ff204b35 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -348,7 +348,6 @@ protected:
 unsigned int
 ch_base::copy_headers (function *fun)
 {
-  class loop *loop;
   basic_block header;
   edge exit, entry;
   basic_block *bbs, *copied_bbs;
@@ -365,7 +364,7 @@ ch_base::copy_headers (function *fun)
 
   auto_vec<std::pair<edge, loop_p> > copied;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       int initial_limit = param_max_loop_header_insns;
       int remaining_limit = initial_limit;
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 81b4ec21d6e..b18a2fabbd8 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -1662,7 +1662,7 @@ analyze_memory_references (bool store_motion)
 {
   gimple_stmt_iterator bsi;
   basic_block bb, *bbs;
-  class loop *loop, *outer;
+  class loop *outer;
   unsigned i, n;
 
   /* Collect all basic-blocks in loops and sort them after their
@@ -1706,7 +1706,7 @@ analyze_memory_references (bool store_motion)
 
   /* Propagate the information about accessed memory references up
      the loop hierarchy.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       /* Finalize the overall touched references (including subloops).  */
       bitmap_ior_into (&memory_accesses.all_refs_stored_in_loop[loop->num],
@@ -3133,7 +3133,6 @@ fill_always_executed_in (void)
 static void
 tree_ssa_lim_initialize (bool store_motion)
 {
-  class loop *loop;
   unsigned i;
 
   bitmap_obstack_initialize (&lim_bitmap_obstack);
@@ -3177,7 +3176,7 @@ tree_ssa_lim_initialize (bool store_motion)
      its postorder index.  */
   i = 0;
   bb_loop_postorder = XNEWVEC (unsigned, number_of_loops (cfun));
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     bb_loop_postorder[loop->num] = i++;
 }
 
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index b1971f83544..289f9fb9dd1 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -1285,14 +1285,13 @@ canonicalize_loop_induction_variables (class loop *loop,
 unsigned int
 canonicalize_induction_variables (void)
 {
-  class loop *loop;
   bool changed = false;
   bool irred_invalidated = false;
   bitmap loop_closed_ssa_invalidated = BITMAP_ALLOC (NULL);
 
   estimate_numbers_of_iterations (cfun);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       changed |= canonicalize_loop_induction_variables (loop,
 							true, UL_SINGLE_ITER,
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 12a8a49a307..7d15e316d6c 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -8066,14 +8066,13 @@ finish:
 void
 tree_ssa_iv_optimize (void)
 {
-  class loop *loop;
   struct ivopts_data data;
   auto_bitmap toremove;
 
   tree_ssa_iv_optimize_init (&data);
 
   /* Optimize the loops starting with the innermost ones.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       if (!dbg_cnt (ivopts_loop))
 	continue;
diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index 28ae1316fa0..78aecc355c8 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -362,11 +362,10 @@ add_exit_phis (bitmap names_to_rename, bitmap *use_blocks, bitmap *loop_exits)
 static void
 get_loops_exits (bitmap *loop_exits)
 {
-  class loop *loop;
   unsigned j;
   edge e;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       auto_vec<edge> exit_edges = get_loop_exit_edges (loop);
       loop_exits[loop->num] = BITMAP_ALLOC (&loop_renamer_obstack);
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 6fabf10a215..4bc5e392c1f 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -4559,13 +4559,11 @@ estimated_stmt_executions (class loop *loop, widest_int *nit)
 void
 estimate_numbers_of_iterations (function *fn)
 {
-  class loop *loop;
-
   /* We don't want to issue signed overflow warnings while getting
      loop iteration estimates.  */
   fold_defer_overflow_warnings ();
 
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (loop_p loop : ALL_LOOPS_FN (fn, 0))
     estimate_numbers_of_iterations (loop);
 
   fold_undefer_and_ignore_overflow_warnings ();
@@ -5031,9 +5029,7 @@ free_numbers_of_iterations_estimates (class loop *loop)
 void
 free_numbers_of_iterations_estimates (function *fn)
 {
-  class loop *loop;
-
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (loop_p loop : ALL_LOOPS_FN (fn, 0))
     free_numbers_of_iterations_estimates (loop);
 }
 
diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index 98062eb4616..fb142093f48 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -1980,7 +1980,6 @@ fail:
 unsigned int
 tree_ssa_prefetch_arrays (void)
 {
-  class loop *loop;
   bool unrolled = false;
   int todo_flags = 0;
 
@@ -2025,7 +2024,7 @@ tree_ssa_prefetch_arrays (void)
       set_builtin_decl (BUILT_IN_PREFETCH, decl, false);
     }
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file, "Processing loop %d:\n", loop->num);
diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index 3a09bbc39e5..b4043d72cbf 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -1598,18 +1598,17 @@ split_loop_on_cond (struct loop *loop)
 static unsigned int
 tree_ssa_split_loops (void)
 {
-  class loop *loop;
   bool changed = false;
 
   gcc_assert (scev_initialized_p ());
 
   calculate_dominance_info (CDI_POST_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (loop_p loop : ALL_LOOPS (LI_INCLUDE_ROOT))
     loop->aux = NULL;
 
   /* Go through all loops starting from innermost.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       if (loop->aux)
 	{
@@ -1630,7 +1629,7 @@ tree_ssa_split_loops (void)
 	}
     }
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (loop_p loop : ALL_LOOPS (LI_INCLUDE_ROOT))
     loop->aux = NULL;
 
   clear_aux_for_blocks ();
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 04d4553f13e..58472b29b7b 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -90,11 +90,10 @@ static tree get_vop_from_header (class loop *);
 unsigned int
 tree_ssa_unswitch_loops (void)
 {
-  class loop *loop;
   bool changed = false;
 
   /* Go through all loops starting from innermost.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       if (!loop->inner)
 	/* Unswitch innermost loop.  */
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 957ac0f3baa..5eee25ea17f 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -157,8 +157,7 @@ gate_oacc_kernels (function *fn)
   if (!lookup_attribute ("oacc kernels", DECL_ATTRIBUTES (fn->decl)))
     return false;
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     if (loop->in_oacc_kernels_region)
       return true;
 
@@ -455,12 +454,11 @@ public:
 unsigned
 pass_scev_cprop::execute (function *)
 {
-  class loop *loop;
   bool any = false;
 
   /* Perform final value replacement in loops, in case the replacement
      expressions are cheap.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     any |= final_value_replacement_loop (loop);
 
   return any ? TODO_cleanup_cfg | TODO_update_ssa_only_virtuals : 0;
diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index d93ec90b002..579be874888 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -1262,7 +1262,6 @@ clean_up_loop_closed_phi (function *fun)
   tree rhs;
   tree lhs;
   gphi_iterator gsi;
-  struct loop *loop;
 
   /* Avoid possibly quadratic work when scanning for loop exits across
    all loops of a nest.  */
@@ -1274,7 +1273,7 @@ clean_up_loop_closed_phi (function *fun)
   calculate_dominance_info  (CDI_DOMINATORS);
 
   /* Walk over loop in function.  */
-  FOR_EACH_LOOP_FN (fun, loop, 0)
+  for (loop_p loop : ALL_LOOPS_FN (fun, 0))
     {
       /* Check each exit edege of loop.  */
       auto_vec<edge> exits = get_loop_exit_edges (loop);
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 7900df946f4..480be4f3935 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -7637,9 +7637,8 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
      loops and the outermost one optimistically.  */
   if (iterate)
     {
-      loop_p loop;
       unsigned max_depth = param_rpo_vn_max_loop_depth;
-      FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+      for (loop_p loop : ALL_LOOPS (LI_ONLY_INNERMOST))
 	if (loop_depth (loop) > max_depth)
 	  for (unsigned i = 2;
 	       i < loop_depth (loop) - max_depth; ++i)
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index f496dd3eb8c..3381aadc325 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2561,7 +2561,6 @@ jump_thread_path_registry::thread_through_all_blocks
 {
   bool retval = false;
   unsigned int i;
-  class loop *loop;
   auto_bitmap threaded_blocks;
   hash_set<edge> visited_starting_edges;
 
@@ -2702,7 +2701,7 @@ jump_thread_path_registry::thread_through_all_blocks
   /* Then perform the threading through loop headers.  We start with the
      innermost loop, so that the changes in cfg we perform won't affect
      further threading.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (loop_p loop : ALL_LOOPS (LI_FROM_INNERMOST))
     {
       if (!loop->header
 	  || !bitmap_bit_p (threaded_blocks, loop->header->index))
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index f1035a83826..b0bf98fc667 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -1194,7 +1194,7 @@ vectorize_loops (void)
   /* If some loop was duplicated, it gets bigger number
      than all previously defined loops.  This fact allows us to run
      only over initial loops skipping newly generated ones.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     if (loop->dont_vectorize)
       {
 	any_ifcvt_loops = true;
@@ -1213,7 +1213,7 @@ vectorize_loops (void)
 		  loop4 (copy of loop2)
 		else
 		  loop5 (copy of loop4)
-	   If FOR_EACH_LOOP gives us loop3 first (which has
+	   If ALL_LOOPS gives us loop3 first (which has
 	   dont_vectorize set), make sure to process loop1 before loop4;
 	   so that we can prevent vectorization of loop4 if loop1
 	   is successfully vectorized.  */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 0565c9b5073..2e268aedd94 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3337,8 +3337,7 @@ vrp_asserts::find_assert_locations (void)
   /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
      the order we compute liveness and insert asserts we otherwise
      fail to insert asserts into the loop latch.  */
-  loop_p loop;
-  FOR_EACH_LOOP (loop, 0)
+  for (loop_p loop : ALL_LOOPS (0))
     {
       i = loop->latch->index;
       unsigned int j = single_succ_edge (loop->latch)->dest_idx;


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-19  6:20 [RFC/PATCH] Use range-based for loops for traversing loops Kewen.Lin
@ 2021-07-19  6:26 ` Andrew Pinski
  2021-07-20  8:56   ` Kewen.Lin
  2021-07-19 14:08 ` Jonathan Wakely
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Andrew Pinski @ 2021-07-19  6:26 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: GCC Patches, Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Trevor Saunders

On Sun, Jul 18, 2021 at 11:21 PM Kewen.Lin via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> This patch follows Martin's suggestion here[1], to support
> range-based for loops for traversing loops, analogously to
> the patch for vec[2].
>
> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>
> Any comments are appreciated.

+1 from me (note I did not review the patch but I like the idea).

Thanks,
Andrew

>
> BR,
> Kewen
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573424.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572315.html
> -----
> gcc/ChangeLog:
>
>         * cfgloop.h (class loop_iterator): Rename to ...
>         (class loops_list): ... this.
>         (loop_iterator::next): Rename to ...
>         (loops_list::iterator::fill_curr_loop): ... this and adjust.
>         (loop_iterator::loop_iterator): Rename to ...
>         (loops_list::loops_list): ... this and adjust.
>         (FOR_EACH_LOOP): Rename to ...
>         (ALL_LOOPS): ... this.
>         (FOR_EACH_LOOP_FN): Rename to ...
>         (ALL_LOOPS_FN): this.
>         (loops_list::iterator): New class.
>         (loops_list::begin): New function.
>         (loops_list::end): Likewise.
>         * cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with ALL_LOOPS*.
>         (sort_sibling_loops): Likewise.
>         (disambiguate_loops_with_multiple_latches): Likewise.
>         (verify_loop_structure): Likewise.
>         * cfgloopmanip.c (create_preheaders): Likewise.
>         (force_single_succ_latches): Likewise.
>         * config/aarch64/falkor-tag-collision-avoidance.c
>         (execute_tag_collision_avoidance): Likewise.
>         * config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
>         * config/s390/s390.c (s390_adjust_loops): Likewise.
>         * doc/loop.texi: Likewise.
>         * gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
>         * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
>         * gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
>         (loop_versioning::make_versioning_decisions): Likewise.
>         * gimple-ssa-split-paths.c (split_paths): Likewise.
>         * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
>         * graphite.c (canonicalize_loop_form): Likewise.
>         (graphite_transform_loops): Likewise.
>         * ipa-fnsummary.c (analyze_function_body): Likewise.
>         * ipa-pure-const.c (analyze_function): Likewise.
>         * loop-doloop.c (doloop_optimize_loops): Likewise.
>         * loop-init.c (loop_optimizer_finalize): Likewise.
>         (fix_loop_structure): Likewise.
>         * loop-invariant.c (calculate_loop_reg_pressure): Likewise.
>         (move_loop_invariants): Likewise.
>         * loop-unroll.c (decide_unrolling): Likewise.
>         (unroll_loops): Likewise.
>         * modulo-sched.c (sms_schedule): Likewise.
>         * predict.c (predict_loops): Likewise.
>         (pass_profile::execute): Likewise.
>         * profile.c (branch_prob): Likewise.
>         * sel-sched-ir.c (sel_finish_pipelining): Likewise.
>         (sel_find_rgns): Likewise.
>         * tree-cfg.c (replace_loop_annotate): Likewise.
>         (replace_uses_by): Likewise.
>         (move_sese_region_to_fn): Likewise.
>         * tree-if-conv.c (pass_if_conversion::execute): Likewise.
>         * tree-loop-distribution.c (loop_distribution::execute): Likewise.
>         * tree-parloops.c (parallelize_loops): Likewise.
>         * tree-predcom.c (tree_predictive_commoning): Likewise.
>         * tree-scalar-evolution.c (scev_initialize): Likewise.
>         (scev_reset): Likewise.
>         * tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
>         * tree-ssa-live.c (remove_unused_locals): Likewise.
>         * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
>         * tree-ssa-loop-im.c (analyze_memory_references): Likewise.
>         (tree_ssa_lim_initialize): Likewise.
>         * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
>         * tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
>         * tree-ssa-loop-manip.c (get_loops_exits): Likewise.
>         * tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
>         (free_numbers_of_iterations_estimates): Likewise.
>         * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
>         * tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
>         * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
>         * tree-ssa-loop.c (gate_oacc_kernels): Likewise.
>         (pass_scev_cprop::execute): Likewise.
>         * tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
>         * tree-ssa-sccvn.c (do_rpo_vn): Likewise.
>         * tree-ssa-threadupdate.c
>         (jump_thread_path_registry::thread_through_all_blocks): Likewise.
>         * tree-vectorizer.c (vectorize_loops): Likewise.
>         * tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-19  6:20 [RFC/PATCH] Use range-based for loops for traversing loops Kewen.Lin
  2021-07-19  6:26 ` Andrew Pinski
@ 2021-07-19 14:08 ` Jonathan Wakely
  2021-07-20  8:56   ` Kewen.Lin
  2021-07-19 14:34 ` Richard Biener
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Jonathan Wakely @ 2021-07-19 14:08 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: GCC Patches, Martin Sebor, Richard Biener, Richard Sandiford,
	Jakub Jelinek, Trevor Saunders, Segher Boessenkool

On Mon, 19 Jul 2021 at 07:20, Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> Hi,
>
> This patch follows Martin's suggestion here[1], to support
> range-based for loops for traversing loops, analogously to
> the patch for vec[2].
>
> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>
> Any comments are appreciated.

In the loops_list::iterator type, this looks a little strange:

+    bool
+    operator!= (const iterator &rhs) const
+    {
+      return this->curr_idx < rhs.curr_idx;
+    }
+

This works fine when the iterator type is used implicitly in a
range-based for loop, but it wouldn't work for explicit uses of the
iterator type where somebody does the != comparison with the
past-the-end iterator on on the LHS:

auto&& list ALL_LOOPS(foo);
auto end = list.end();
auto begin = list.begin();
while (--end != begin)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-19  6:20 [RFC/PATCH] Use range-based for loops for traversing loops Kewen.Lin
  2021-07-19  6:26 ` Andrew Pinski
  2021-07-19 14:08 ` Jonathan Wakely
@ 2021-07-19 14:34 ` Richard Biener
  2021-07-20  8:57   ` Kewen.Lin
  2021-07-19 15:59 ` Martin Sebor
  2021-07-20 14:36 ` [PATCH v2] " Kewen.Lin
  4 siblings, 1 reply; 35+ messages in thread
From: Richard Biener @ 2021-07-19 14:34 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: GCC Patches, Martin Sebor, Richard Sandiford, Jakub Jelinek,
	Trevor Saunders, Segher Boessenkool, Jonathan Wakely

On Mon, Jul 19, 2021 at 8:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> Hi,
>
> This patch follows Martin's suggestion here[1], to support
> range-based for loops for traversing loops, analogously to
> the patch for vec[2].
>
> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>
> Any comments are appreciated.

Since you are touching all FOR_EACH_LOOP please
make implicit 'cfun' uses explicit.  I'm not sure ALL_LOOPS
should scream, I think all_loops (function *, flags) would be
nicer.

Note I'm anticipating iteration over a subset of the loop tree
which would ask for specifying the 'root' of the loop tree to
iterate over so it could be

  loops_list (class loop *root, unsigned flags)

and the "all" cases use loops_list (loops_for_fn (cfun), flags) then.
Providing an overload with struct function is of course OK.

Richard.

> BR,
> Kewen
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573424.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572315.html
> -----
> gcc/ChangeLog:
>
>         * cfgloop.h (class loop_iterator): Rename to ...
>         (class loops_list): ... this.
>         (loop_iterator::next): Rename to ...
>         (loops_list::iterator::fill_curr_loop): ... this and adjust.
>         (loop_iterator::loop_iterator): Rename to ...
>         (loops_list::loops_list): ... this and adjust.
>         (FOR_EACH_LOOP): Rename to ...
>         (ALL_LOOPS): ... this.
>         (FOR_EACH_LOOP_FN): Rename to ...
>         (ALL_LOOPS_FN): this.
>         (loops_list::iterator): New class.
>         (loops_list::begin): New function.
>         (loops_list::end): Likewise.
>         * cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with ALL_LOOPS*.
>         (sort_sibling_loops): Likewise.
>         (disambiguate_loops_with_multiple_latches): Likewise.
>         (verify_loop_structure): Likewise.
>         * cfgloopmanip.c (create_preheaders): Likewise.
>         (force_single_succ_latches): Likewise.
>         * config/aarch64/falkor-tag-collision-avoidance.c
>         (execute_tag_collision_avoidance): Likewise.
>         * config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
>         * config/s390/s390.c (s390_adjust_loops): Likewise.
>         * doc/loop.texi: Likewise.
>         * gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
>         * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
>         * gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
>         (loop_versioning::make_versioning_decisions): Likewise.
>         * gimple-ssa-split-paths.c (split_paths): Likewise.
>         * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
>         * graphite.c (canonicalize_loop_form): Likewise.
>         (graphite_transform_loops): Likewise.
>         * ipa-fnsummary.c (analyze_function_body): Likewise.
>         * ipa-pure-const.c (analyze_function): Likewise.
>         * loop-doloop.c (doloop_optimize_loops): Likewise.
>         * loop-init.c (loop_optimizer_finalize): Likewise.
>         (fix_loop_structure): Likewise.
>         * loop-invariant.c (calculate_loop_reg_pressure): Likewise.
>         (move_loop_invariants): Likewise.
>         * loop-unroll.c (decide_unrolling): Likewise.
>         (unroll_loops): Likewise.
>         * modulo-sched.c (sms_schedule): Likewise.
>         * predict.c (predict_loops): Likewise.
>         (pass_profile::execute): Likewise.
>         * profile.c (branch_prob): Likewise.
>         * sel-sched-ir.c (sel_finish_pipelining): Likewise.
>         (sel_find_rgns): Likewise.
>         * tree-cfg.c (replace_loop_annotate): Likewise.
>         (replace_uses_by): Likewise.
>         (move_sese_region_to_fn): Likewise.
>         * tree-if-conv.c (pass_if_conversion::execute): Likewise.
>         * tree-loop-distribution.c (loop_distribution::execute): Likewise.
>         * tree-parloops.c (parallelize_loops): Likewise.
>         * tree-predcom.c (tree_predictive_commoning): Likewise.
>         * tree-scalar-evolution.c (scev_initialize): Likewise.
>         (scev_reset): Likewise.
>         * tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
>         * tree-ssa-live.c (remove_unused_locals): Likewise.
>         * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
>         * tree-ssa-loop-im.c (analyze_memory_references): Likewise.
>         (tree_ssa_lim_initialize): Likewise.
>         * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
>         * tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
>         * tree-ssa-loop-manip.c (get_loops_exits): Likewise.
>         * tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
>         (free_numbers_of_iterations_estimates): Likewise.
>         * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
>         * tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
>         * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
>         * tree-ssa-loop.c (gate_oacc_kernels): Likewise.
>         (pass_scev_cprop::execute): Likewise.
>         * tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
>         * tree-ssa-sccvn.c (do_rpo_vn): Likewise.
>         * tree-ssa-threadupdate.c
>         (jump_thread_path_registry::thread_through_all_blocks): Likewise.
>         * tree-vectorizer.c (vectorize_loops): Likewise.
>         * tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-19  6:20 [RFC/PATCH] Use range-based for loops for traversing loops Kewen.Lin
                   ` (2 preceding siblings ...)
  2021-07-19 14:34 ` Richard Biener
@ 2021-07-19 15:59 ` Martin Sebor
  2021-07-20  8:58   ` Kewen.Lin
  2021-07-20 14:36 ` [PATCH v2] " Kewen.Lin
  4 siblings, 1 reply; 35+ messages in thread
From: Martin Sebor @ 2021-07-19 15:59 UTC (permalink / raw)
  To: Kewen.Lin, GCC Patches
  Cc: Richard Biener, Richard Sandiford, Jakub Jelinek, tbsaunde,
	Segher Boessenkool, Jonathan Wakely

On 7/19/21 12:20 AM, Kewen.Lin wrote:
> Hi,
> 
> This patch follows Martin's suggestion here[1], to support
> range-based for loops for traversing loops, analogously to
> the patch for vec[2].
> 
> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped on ppc64le P9 with bootstrap-O3 config.
> 
> Any comments are appreciated.

Thanks for this nice cleanup!  Just a few suggestions:

I would recommend against introducing new macros unless they
offer a significant advantage over alternatives (for the two
macros the patch adds I don't think they do).

If improving const-correctness is one of our a goals
the loops_list iterator type would need to a corresponding
const_iterator type, and const overloads of the begin()
and end() member functions.

Rather than introducing more instances of the loop_p typedef
I'd suggest to use loop *.  It has at least two advantages:
it's clearer (it's obvious it refers to a pointer), and lends
itself more readily to making code const-correct by declaring
the control variable const: for (const class loop *loop: ...)
while avoiding the mistake of using const loop_p loop to
declare a pointer to a const loop.

Martin

> 
> BR,
> Kewen
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573424.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572315.html
> -----
> gcc/ChangeLog:
> 
> 	* cfgloop.h (class loop_iterator): Rename to ...
> 	(class loops_list): ... this.
> 	(loop_iterator::next): Rename to ...
> 	(loops_list::iterator::fill_curr_loop): ... this and adjust.
> 	(loop_iterator::loop_iterator): Rename to ...
> 	(loops_list::loops_list): ... this and adjust.
> 	(FOR_EACH_LOOP): Rename to ...
> 	(ALL_LOOPS): ... this.
> 	(FOR_EACH_LOOP_FN): Rename to ...
> 	(ALL_LOOPS_FN): this.
> 	(loops_list::iterator): New class.
> 	(loops_list::begin): New function.
> 	(loops_list::end): Likewise.
> 	* cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with ALL_LOOPS*.
> 	(sort_sibling_loops): Likewise.
> 	(disambiguate_loops_with_multiple_latches): Likewise.
> 	(verify_loop_structure): Likewise.
> 	* cfgloopmanip.c (create_preheaders): Likewise.
> 	(force_single_succ_latches): Likewise.
> 	* config/aarch64/falkor-tag-collision-avoidance.c
> 	(execute_tag_collision_avoidance): Likewise.
> 	* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
> 	* config/s390/s390.c (s390_adjust_loops): Likewise.
> 	* doc/loop.texi: Likewise.
> 	* gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
> 	* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
> 	* gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
> 	(loop_versioning::make_versioning_decisions): Likewise.
> 	* gimple-ssa-split-paths.c (split_paths): Likewise.
> 	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
> 	* graphite.c (canonicalize_loop_form): Likewise.
> 	(graphite_transform_loops): Likewise.
> 	* ipa-fnsummary.c (analyze_function_body): Likewise.
> 	* ipa-pure-const.c (analyze_function): Likewise.
> 	* loop-doloop.c (doloop_optimize_loops): Likewise.
> 	* loop-init.c (loop_optimizer_finalize): Likewise.
> 	(fix_loop_structure): Likewise.
> 	* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
> 	(move_loop_invariants): Likewise.
> 	* loop-unroll.c (decide_unrolling): Likewise.
> 	(unroll_loops): Likewise.
> 	* modulo-sched.c (sms_schedule): Likewise.
> 	* predict.c (predict_loops): Likewise.
> 	(pass_profile::execute): Likewise.
> 	* profile.c (branch_prob): Likewise.
> 	* sel-sched-ir.c (sel_finish_pipelining): Likewise.
> 	(sel_find_rgns): Likewise.
> 	* tree-cfg.c (replace_loop_annotate): Likewise.
> 	(replace_uses_by): Likewise.
> 	(move_sese_region_to_fn): Likewise.
> 	* tree-if-conv.c (pass_if_conversion::execute): Likewise.
> 	* tree-loop-distribution.c (loop_distribution::execute): Likewise.
> 	* tree-parloops.c (parallelize_loops): Likewise.
> 	* tree-predcom.c (tree_predictive_commoning): Likewise.
> 	* tree-scalar-evolution.c (scev_initialize): Likewise.
> 	(scev_reset): Likewise.
> 	* tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
> 	* tree-ssa-live.c (remove_unused_locals): Likewise.
> 	* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
> 	* tree-ssa-loop-im.c (analyze_memory_references): Likewise.
> 	(tree_ssa_lim_initialize): Likewise.
> 	* tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
> 	* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
> 	* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
> 	* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
> 	(free_numbers_of_iterations_estimates): Likewise.
> 	* tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
> 	* tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
> 	* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
> 	* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
> 	(pass_scev_cprop::execute): Likewise.
> 	* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
> 	* tree-ssa-sccvn.c (do_rpo_vn): Likewise.
> 	* tree-ssa-threadupdate.c
> 	(jump_thread_path_registry::thread_through_all_blocks): Likewise.
> 	* tree-vectorizer.c (vectorize_loops): Likewise.
> 	* tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-19  6:26 ` Andrew Pinski
@ 2021-07-20  8:56   ` Kewen.Lin
  0 siblings, 0 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-20  8:56 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: GCC Patches, Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Trevor Saunders

on 2021/7/19 下午2:26, Andrew Pinski wrote:
> On Sun, Jul 18, 2021 at 11:21 PM Kewen.Lin via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> Hi,
>>
>> This patch follows Martin's suggestion here[1], to support
>> range-based for loops for traversing loops, analogously to
>> the patch for vec[2].
>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>> x86_64-redhat-linux and aarch64-linux-gnu, also
>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>
>> Any comments are appreciated.
> 
> +1 from me (note I did not review the patch but I like the idea).
> 

Thanks Andrew!  It's actually Martin's idea.  :)

BR,
Kewen

> Thanks,
> Andrew
> 
>>
>> BR,
>> Kewen
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573424.html
>> [2] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572315.html
>> -----
>> gcc/ChangeLog:
>>
>>         * cfgloop.h (class loop_iterator): Rename to ...
>>         (class loops_list): ... this.
>>         (loop_iterator::next): Rename to ...
>>         (loops_list::iterator::fill_curr_loop): ... this and adjust.
>>         (loop_iterator::loop_iterator): Rename to ...
>>         (loops_list::loops_list): ... this and adjust.
>>         (FOR_EACH_LOOP): Rename to ...
>>         (ALL_LOOPS): ... this.
>>         (FOR_EACH_LOOP_FN): Rename to ...
>>         (ALL_LOOPS_FN): this.
>>         (loops_list::iterator): New class.
>>         (loops_list::begin): New function.
>>         (loops_list::end): Likewise.
>>         * cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with ALL_LOOPS*.
>>         (sort_sibling_loops): Likewise.
>>         (disambiguate_loops_with_multiple_latches): Likewise.
>>         (verify_loop_structure): Likewise.
>>         * cfgloopmanip.c (create_preheaders): Likewise.
>>         (force_single_succ_latches): Likewise.
>>         * config/aarch64/falkor-tag-collision-avoidance.c
>>         (execute_tag_collision_avoidance): Likewise.
>>         * config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
>>         * config/s390/s390.c (s390_adjust_loops): Likewise.
>>         * doc/loop.texi: Likewise.
>>         * gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
>>         * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
>>         * gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
>>         (loop_versioning::make_versioning_decisions): Likewise.
>>         * gimple-ssa-split-paths.c (split_paths): Likewise.
>>         * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
>>         * graphite.c (canonicalize_loop_form): Likewise.
>>         (graphite_transform_loops): Likewise.
>>         * ipa-fnsummary.c (analyze_function_body): Likewise.
>>         * ipa-pure-const.c (analyze_function): Likewise.
>>         * loop-doloop.c (doloop_optimize_loops): Likewise.
>>         * loop-init.c (loop_optimizer_finalize): Likewise.
>>         (fix_loop_structure): Likewise.
>>         * loop-invariant.c (calculate_loop_reg_pressure): Likewise.
>>         (move_loop_invariants): Likewise.
>>         * loop-unroll.c (decide_unrolling): Likewise.
>>         (unroll_loops): Likewise.
>>         * modulo-sched.c (sms_schedule): Likewise.
>>         * predict.c (predict_loops): Likewise.
>>         (pass_profile::execute): Likewise.
>>         * profile.c (branch_prob): Likewise.
>>         * sel-sched-ir.c (sel_finish_pipelining): Likewise.
>>         (sel_find_rgns): Likewise.
>>         * tree-cfg.c (replace_loop_annotate): Likewise.
>>         (replace_uses_by): Likewise.
>>         (move_sese_region_to_fn): Likewise.
>>         * tree-if-conv.c (pass_if_conversion::execute): Likewise.
>>         * tree-loop-distribution.c (loop_distribution::execute): Likewise.
>>         * tree-parloops.c (parallelize_loops): Likewise.
>>         * tree-predcom.c (tree_predictive_commoning): Likewise.
>>         * tree-scalar-evolution.c (scev_initialize): Likewise.
>>         (scev_reset): Likewise.
>>         * tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
>>         * tree-ssa-live.c (remove_unused_locals): Likewise.
>>         * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
>>         * tree-ssa-loop-im.c (analyze_memory_references): Likewise.
>>         (tree_ssa_lim_initialize): Likewise.
>>         * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
>>         * tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
>>         * tree-ssa-loop-manip.c (get_loops_exits): Likewise.
>>         * tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
>>         (free_numbers_of_iterations_estimates): Likewise.
>>         * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
>>         * tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
>>         * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
>>         * tree-ssa-loop.c (gate_oacc_kernels): Likewise.
>>         (pass_scev_cprop::execute): Likewise.
>>         * tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
>>         * tree-ssa-sccvn.c (do_rpo_vn): Likewise.
>>         * tree-ssa-threadupdate.c
>>         (jump_thread_path_registry::thread_through_all_blocks): Likewise.
>>         * tree-vectorizer.c (vectorize_loops): Likewise.
>>         * tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-19 14:08 ` Jonathan Wakely
@ 2021-07-20  8:56   ` Kewen.Lin
  0 siblings, 0 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-20  8:56 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: GCC Patches, Martin Sebor, Richard Biener, Richard Sandiford,
	Jakub Jelinek, Trevor Saunders, Segher Boessenkool

on 2021/7/19 下午10:08, Jonathan Wakely wrote:
> On Mon, 19 Jul 2021 at 07:20, Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> Hi,
>>
>> This patch follows Martin's suggestion here[1], to support
>> range-based for loops for traversing loops, analogously to
>> the patch for vec[2].
>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>> x86_64-redhat-linux and aarch64-linux-gnu, also
>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>
>> Any comments are appreciated.
> 
> In the loops_list::iterator type, this looks a little strange:
> 
> +    bool
> +    operator!= (const iterator &rhs) const
> +    {
> +      return this->curr_idx < rhs.curr_idx;
> +    }
> +
> 
> This works fine when the iterator type is used implicitly in a
> range-based for loop, but it wouldn't work for explicit uses of the
> iterator type where somebody does the != comparison with the
> past-the-end iterator on on the LHS:
> 
> auto&& list ALL_LOOPS(foo);
> auto end = list.end();
> auto begin = list.begin();
> while (--end != begin)
> 

Thanks for the comments, Jonathan.  Yeah, to use "!=" is better
for clear meaning and later extension.  It was under the assumption
that the index can only increase (only supports operator++()), so
I simply used "<".  Will fix it in V2.

BR,
Kewen

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-19 14:34 ` Richard Biener
@ 2021-07-20  8:57   ` Kewen.Lin
  0 siblings, 0 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-20  8:57 UTC (permalink / raw)
  To: Richard Biener
  Cc: GCC Patches, Martin Sebor, Richard Sandiford, Jakub Jelinek,
	Trevor Saunders, Segher Boessenkool, Jonathan Wakely

on 2021/7/19 下午10:34, Richard Biener wrote:
> On Mon, Jul 19, 2021 at 8:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> Hi,
>>
>> This patch follows Martin's suggestion here[1], to support
>> range-based for loops for traversing loops, analogously to
>> the patch for vec[2].
>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>> x86_64-redhat-linux and aarch64-linux-gnu, also
>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>
>> Any comments are appreciated.
> 
> Since you are touching all FOR_EACH_LOOP please
> make implicit 'cfun' uses explicit.  I'm not sure ALL_LOOPS
> should scream, I think all_loops (function *, flags) would be
> nicer.
> 
> Note I'm anticipating iteration over a subset of the loop tree
> which would ask for specifying the 'root' of the loop tree to
> iterate over so it could be
> 
>   loops_list (class loop *root, unsigned flags)
> 
> and the "all" cases use loops_list (loops_for_fn (cfun), flags) then.
> Providing an overload with struct function is of course OK.
> 

Thanks for the comments, Richi.  Will update them in V2. 
I noticed the current loop_iterator requires a struct loops*
for LI_ONLY_INNERMOST, if you don't mind, I will use

  loops_list (class loops *loops, unsigned flags)

instead to make LI_ONLY_INNERMOST happy.  Your mentioned root can
be just the tree_root of the input loops.

BR,
Kewen



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-19 15:59 ` Martin Sebor
@ 2021-07-20  8:58   ` Kewen.Lin
  2021-07-20  9:49     ` Jonathan Wakely
  0 siblings, 1 reply; 35+ messages in thread
From: Kewen.Lin @ 2021-07-20  8:58 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Richard Biener, Richard Sandiford, Jakub Jelinek, tbsaunde,
	Segher Boessenkool, Jonathan Wakely, GCC Patches

on 2021/7/19 下午11:59, Martin Sebor wrote:
> On 7/19/21 12:20 AM, Kewen.Lin wrote:
>> Hi,
>>
>> This patch follows Martin's suggestion here[1], to support
>> range-based for loops for traversing loops, analogously to
>> the patch for vec[2].
>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>> x86_64-redhat-linux and aarch64-linux-gnu, also
>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>
>> Any comments are appreciated.
> 
> Thanks for this nice cleanup!  Just a few suggestions:
> 
> I would recommend against introducing new macros unless they
> offer a significant advantage over alternatives (for the two
> macros the patch adds I don't think they do).
> 
> If improving const-correctness is one of our a goals
> the loops_list iterator type would need to a corresponding
> const_iterator type, and const overloads of the begin()
> and end() member functions.
> 
> Rather than introducing more instances of the loop_p typedef
> I'd suggest to use loop *.  It has at least two advantages:
> it's clearer (it's obvious it refers to a pointer), and lends
> itself more readily to making code const-correct by declaring
> the control variable const: for (const class loop *loop: ...)
> while avoiding the mistake of using const loop_p loop to
> declare a pointer to a const loop.
> 

Thanks for the suggestions, Martin!  Will update them in V2.

With some experiments, I noticed that even provided const_iterator
like:

   iterator
   begin ()
   {
     return iterator (*this, 0);
   }

+  const_iterator
+  begin () const
+  {
+    return const_iterator (*this, 0);
+  }

for (const class loop *loop: ...) will still use iterator instead
of const_iterator pair.  We have to make the code look like:

  const auto& const_loops = loops_list (...);
  for (const class loop *loop: const_loops)

or
  template<typename T> constexpr const T &as_const(T &t) noexcept { return t; }
  for (const class loop *loop: as_const(loops_list...)) 

Does it look good to add below as_const along with loops_list in cfgloop.h?

+/* Provide the functionality of std::as_const to support range-based for
+   to use const iterator.  (We can't use std::as_const itself because it's
+   a C++17 feature.)  */
+template <typename T>
+constexpr const T &
+as_const (T &t) noexcept
+{
+  return t;
+}
+

BR,
Kewen

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-20  8:58   ` Kewen.Lin
@ 2021-07-20  9:49     ` Jonathan Wakely
  2021-07-20  9:50       ` Jonathan Wakely
  2021-07-20 14:42       ` Kewen.Lin
  0 siblings, 2 replies; 35+ messages in thread
From: Jonathan Wakely @ 2021-07-20  9:49 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: Martin Sebor, Richard Biener, Richard Sandiford, Jakub Jelinek,
	Trevor Saunders, Segher Boessenkool, GCC Patches

On Tue, 20 Jul 2021 at 09:58, Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2021/7/19 下午11:59, Martin Sebor wrote:
> > On 7/19/21 12:20 AM, Kewen.Lin wrote:
> >> Hi,
> >>
> >> This patch follows Martin's suggestion here[1], to support
> >> range-based for loops for traversing loops, analogously to
> >> the patch for vec[2].
> >>
> >> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> >> x86_64-redhat-linux and aarch64-linux-gnu, also
> >> bootstrapped on ppc64le P9 with bootstrap-O3 config.
> >>
> >> Any comments are appreciated.
> >
> > Thanks for this nice cleanup!  Just a few suggestions:
> >
> > I would recommend against introducing new macros unless they
> > offer a significant advantage over alternatives (for the two
> > macros the patch adds I don't think they do).
> >
> > If improving const-correctness is one of our a goals
> > the loops_list iterator type would need to a corresponding
> > const_iterator type, and const overloads of the begin()
> > and end() member functions.
> >
> > Rather than introducing more instances of the loop_p typedef
> > I'd suggest to use loop *.  It has at least two advantages:
> > it's clearer (it's obvious it refers to a pointer), and lends
> > itself more readily to making code const-correct by declaring
> > the control variable const: for (const class loop *loop: ...)
> > while avoiding the mistake of using const loop_p loop to
> > declare a pointer to a const loop.
> >
>
> Thanks for the suggestions, Martin!  Will update them in V2.
>
> With some experiments, I noticed that even provided const_iterator
> like:
>
>    iterator
>    begin ()
>    {
>      return iterator (*this, 0);
>    }
>
> +  const_iterator
> +  begin () const
> +  {
> +    return const_iterator (*this, 0);
> +  }
>
> for (const class loop *loop: ...) will still use iterator instead
> of const_iterator pair.  We have to make the code look like:
>
>   const auto& const_loops = loops_list (...);
>   for (const class loop *loop: const_loops)
>
> or
>   template<typename T> constexpr const T &as_const(T &t) noexcept { return t; }
>   for (const class loop *loop: as_const(loops_list...))
>
> Does it look good to add below as_const along with loops_list in cfgloop.h?
>
> +/* Provide the functionality of std::as_const to support range-based for
> +   to use const iterator.  (We can't use std::as_const itself because it's
> +   a C++17 feature.)  */
> +template <typename T>
> +constexpr const T &
> +as_const (T &t) noexcept

The noexcept is not needed because GCC is built -fno-exceptions. For
consistency with all the other code that doesn't use noexcept, it
should probably not be there.

> +{
> +  return t;
> +}
> +

That's one option. Another option (which could coexist with as_const)
is to add cbegin() and cend() members, which are not overloaded for
const and non-const, and so always return a const_iterator:

const_iterator cbegin () const { return const_iterator (*this, 0); }
iterator begin () const { return cbegin(); }

And similarly for `end () const` and `cend () const`.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-20  9:49     ` Jonathan Wakely
@ 2021-07-20  9:50       ` Jonathan Wakely
  2021-07-20 14:42       ` Kewen.Lin
  1 sibling, 0 replies; 35+ messages in thread
From: Jonathan Wakely @ 2021-07-20  9:50 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: Martin Sebor, Richard Biener, Richard Sandiford, Jakub Jelinek,
	Trevor Saunders, Segher Boessenkool, GCC Patches

On Tue, 20 Jul 2021 at 10:49, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
> On Tue, 20 Jul 2021 at 09:58, Kewen.Lin <linkw@linux.ibm.com> wrote:
> >
> > on 2021/7/19 下午11:59, Martin Sebor wrote:
> > > On 7/19/21 12:20 AM, Kewen.Lin wrote:
> > >> Hi,
> > >>
> > >> This patch follows Martin's suggestion here[1], to support
> > >> range-based for loops for traversing loops, analogously to
> > >> the patch for vec[2].
> > >>
> > >> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> > >> x86_64-redhat-linux and aarch64-linux-gnu, also
> > >> bootstrapped on ppc64le P9 with bootstrap-O3 config.
> > >>
> > >> Any comments are appreciated.
> > >
> > > Thanks for this nice cleanup!  Just a few suggestions:
> > >
> > > I would recommend against introducing new macros unless they
> > > offer a significant advantage over alternatives (for the two
> > > macros the patch adds I don't think they do).
> > >
> > > If improving const-correctness is one of our a goals
> > > the loops_list iterator type would need to a corresponding
> > > const_iterator type, and const overloads of the begin()
> > > and end() member functions.
> > >
> > > Rather than introducing more instances of the loop_p typedef
> > > I'd suggest to use loop *.  It has at least two advantages:
> > > it's clearer (it's obvious it refers to a pointer), and lends
> > > itself more readily to making code const-correct by declaring
> > > the control variable const: for (const class loop *loop: ...)
> > > while avoiding the mistake of using const loop_p loop to
> > > declare a pointer to a const loop.
> > >
> >
> > Thanks for the suggestions, Martin!  Will update them in V2.
> >
> > With some experiments, I noticed that even provided const_iterator
> > like:
> >
> >    iterator
> >    begin ()
> >    {
> >      return iterator (*this, 0);
> >    }
> >
> > +  const_iterator
> > +  begin () const
> > +  {
> > +    return const_iterator (*this, 0);
> > +  }
> >
> > for (const class loop *loop: ...) will still use iterator instead
> > of const_iterator pair.  We have to make the code look like:
> >
> >   const auto& const_loops = loops_list (...);
> >   for (const class loop *loop: const_loops)
> >
> > or
> >   template<typename T> constexpr const T &as_const(T &t) noexcept { return t; }
> >   for (const class loop *loop: as_const(loops_list...))
> >
> > Does it look good to add below as_const along with loops_list in cfgloop.h?
> >
> > +/* Provide the functionality of std::as_const to support range-based for
> > +   to use const iterator.  (We can't use std::as_const itself because it's
> > +   a C++17 feature.)  */
> > +template <typename T>
> > +constexpr const T &
> > +as_const (T &t) noexcept
>
> The noexcept is not needed because GCC is built -fno-exceptions. For
> consistency with all the other code that doesn't use noexcept, it
> should probably not be there.
>
> > +{
> > +  return t;
> > +}
> > +
>
> That's one option. Another option (which could coexist with as_const)
> is to add cbegin() and cend() members, which are not overloaded for
> const and non-const, and so always return a const_iterator:
>
> const_iterator cbegin () const { return const_iterator (*this, 0); }
> iterator begin () const { return cbegin(); }
>
> And similarly for `end () const` and `cend () const`.

The range-based for loop would not use cbegin and cend, so you'd still
want to use as_const for that purpose.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2] Use range-based for loops for traversing loops
  2021-07-19  6:20 [RFC/PATCH] Use range-based for loops for traversing loops Kewen.Lin
                   ` (3 preceding siblings ...)
  2021-07-19 15:59 ` Martin Sebor
@ 2021-07-20 14:36 ` Kewen.Lin
  2021-07-22 12:56   ` Richard Biener
  2021-07-23  8:35   ` [PATCH v3] Use range-based for loops for traversing loops Kewen.Lin
  4 siblings, 2 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-20 14:36 UTC (permalink / raw)
  To: GCC Patches
  Cc: Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, tbsaunde, Richard Biener, Martin Sebor,
	Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 4643 bytes --]

Hi,

This v2 has addressed some review comments/suggestions:

  - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
  - Add new CTOR loops_list (struct loops *loops, unsigned flags)
    to support loop hierarchy tree rather than just a function,
    and adjust to use loops* accordingly.
  - Make implicit 'cfun' become explicit.
  - Get rid of macros ALL_LOOPS*, use loops_list instance.
  - Add const_iterator type begin()/end().
  - Use class loop* instead of loop_p in range-based for.

Bootstrapped and regtested again on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu, also
bootstrapped again on ppc64le P9 with bootstrap-O3 config.

Does it look better?  Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* cfgloop.h (as_const): New function.
	(class loop_iterator): Rename to ...
	(class loops_list): ... this.
	(loop_iterator::next): Rename to ...
	(loops_list::Iter::fill_curr_loop): ... this and adjust.
	(loop_iterator::loop_iterator): Rename to ...
	(loops_list::loops_list): ... this and adjust.
	(loops_list::Iter): New class.
	(loops_list::iterator): New type.
	(loops_list::const_iterator): New type.
	(loops_list::begin): New function.
	(loops_list::end): Likewise.
	(loops_list::begin const): Likewise.
	(loops_list::end const): Likewise.
	(FOR_EACH_LOOP): Remove.
	(FOR_EACH_LOOP_FN): Remove.
	* cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with range-based
	for loop with loops_list instance.
	(sort_sibling_loops): Likewise.
	(disambiguate_loops_with_multiple_latches): Likewise.
	(verify_loop_structure): Likewise.
	* cfgloopmanip.c (create_preheaders): Likewise.
	(force_single_succ_latches): Likewise.
	* config/aarch64/falkor-tag-collision-avoidance.c
	(execute_tag_collision_avoidance): Likewise.
	* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
	* config/s390/s390.c (s390_adjust_loops): Likewise.
	* doc/loop.texi: Likewise.
	* gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
	* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
	* gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
	(loop_versioning::make_versioning_decisions): Likewise.
	* gimple-ssa-split-paths.c (split_paths): Likewise.
	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
	* graphite.c (canonicalize_loop_form): Likewise.
	(graphite_transform_loops): Likewise.
	* ipa-fnsummary.c (analyze_function_body): Likewise.
	* ipa-pure-const.c (analyze_function): Likewise.
	* loop-doloop.c (doloop_optimize_loops): Likewise.
	* loop-init.c (loop_optimizer_finalize): Likewise.
	(fix_loop_structure): Likewise.
	* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
	(move_loop_invariants): Likewise.
	* loop-unroll.c (decide_unrolling): Likewise.
	(unroll_loops): Likewise.
	* modulo-sched.c (sms_schedule): Likewise.
	* predict.c (predict_loops): Likewise.
	(pass_profile::execute): Likewise.
	* profile.c (branch_prob): Likewise.
	* sel-sched-ir.c (sel_finish_pipelining): Likewise.
	(sel_find_rgns): Likewise.
	* tree-cfg.c (replace_loop_annotate): Likewise.
	(replace_uses_by): Likewise.
	(move_sese_region_to_fn): Likewise.
	* tree-if-conv.c (pass_if_conversion::execute): Likewise.
	* tree-loop-distribution.c (loop_distribution::execute): Likewise.
	* tree-parloops.c (parallelize_loops): Likewise.
	* tree-predcom.c (tree_predictive_commoning): Likewise.
	* tree-scalar-evolution.c (scev_initialize): Likewise.
	(scev_reset): Likewise.
	* tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
	* tree-ssa-live.c (remove_unused_locals): Likewise.
	* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
	* tree-ssa-loop-im.c (analyze_memory_references): Likewise.
	(tree_ssa_lim_initialize): Likewise.
	* tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
	* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
	* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
	* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
	(free_numbers_of_iterations_estimates): Likewise.
	* tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
	* tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
	* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
	* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
	(pass_scev_cprop::execute): Likewise.
	* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
	* tree-ssa-sccvn.c (do_rpo_vn): Likewise.
	* tree-ssa-threadupdate.c
	(jump_thread_path_registry::thread_through_all_blocks): Likewise.
	* tree-vectorizer.c (vectorize_loops): Likewise.
	* tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

[-- Attachment #2: range-based-v2.patch --]
[-- Type: text/plain, Size: 45904 bytes --]

---
 gcc/cfgloop.c                                 |  19 +--
 gcc/cfgloop.h                                 | 161 +++++++++++++-----
 gcc/cfgloopmanip.c                            |   7 +-
 .../aarch64/falkor-tag-collision-avoidance.c  |   4 +-
 gcc/config/mn10300/mn10300.c                  |   4 +-
 gcc/config/s390/s390.c                        |   4 +-
 gcc/doc/loop.texi                             |  16 +-
 gcc/gimple-loop-interchange.cc                |   3 +-
 gcc/gimple-loop-jam.c                         |   3 +-
 gcc/gimple-loop-versioning.cc                 |   6 +-
 gcc/gimple-ssa-split-paths.c                  |   3 +-
 gcc/graphite-isl-ast-to-gimple.c              |   5 +-
 gcc/graphite.c                                |   6 +-
 gcc/ipa-fnsummary.c                           |   2 +-
 gcc/ipa-pure-const.c                          |   3 +-
 gcc/loop-doloop.c                             |   8 +-
 gcc/loop-init.c                               |   5 +-
 gcc/loop-invariant.c                          |  14 +-
 gcc/loop-unroll.c                             |   7 +-
 gcc/modulo-sched.c                            |   5 +-
 gcc/predict.c                                 |   5 +-
 gcc/profile.c                                 |   3 +-
 gcc/sel-sched-ir.c                            |  12 +-
 gcc/tree-cfg.c                                |  13 +-
 gcc/tree-if-conv.c                            |   3 +-
 gcc/tree-loop-distribution.c                  |   2 +-
 gcc/tree-parloops.c                           |   3 +-
 gcc/tree-predcom.c                            |   3 +-
 gcc/tree-scalar-evolution.c                   |  16 +-
 gcc/tree-ssa-dce.c                            |   3 +-
 gcc/tree-ssa-live.c                           |   3 +-
 gcc/tree-ssa-loop-ch.c                        |   3 +-
 gcc/tree-ssa-loop-im.c                        |   7 +-
 gcc/tree-ssa-loop-ivcanon.c                   |   3 +-
 gcc/tree-ssa-loop-ivopts.c                    |   3 +-
 gcc/tree-ssa-loop-manip.c                     |   3 +-
 gcc/tree-ssa-loop-niter.c                     |   8 +-
 gcc/tree-ssa-loop-prefetch.c                  |   3 +-
 gcc/tree-ssa-loop-split.c                     |   7 +-
 gcc/tree-ssa-loop-unswitch.c                  |   3 +-
 gcc/tree-ssa-loop.c                           |   6 +-
 gcc/tree-ssa-propagate.c                      |   3 +-
 gcc/tree-ssa-sccvn.c                          |   3 +-
 gcc/tree-ssa-threadupdate.c                   |   3 +-
 gcc/tree-vectorizer.c                         |   4 +-
 gcc/tree-vrp.c                                |   3 +-
 46 files changed, 208 insertions(+), 205 deletions(-)

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index f094538b9ff..d16591b2e1b 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -162,14 +162,12 @@ flow_loop_dump (const class loop *loop, FILE *file,
 void
 flow_loops_dump (FILE *file, void (*loop_dump_aux) (const class loop *, FILE *, int), int verbose)
 {
-  class loop *loop;
-
   if (!current_loops || ! file)
     return;
 
   fprintf (file, ";; %d loops found\n", number_of_loops (cfun));
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (class loop *loop : loops_list (cfun, LI_INCLUDE_ROOT))
     {
       flow_loop_dump (loop, file, loop_dump_aux, verbose);
     }
@@ -559,8 +557,7 @@ sort_sibling_loops (function *fn)
   free (rc_order);
 
   auto_vec<loop_p, 3> siblings;
-  loop_p loop;
-  FOR_EACH_LOOP_FN (fn, loop, LI_INCLUDE_ROOT)
+  for (class loop *loop : loops_list (fn, LI_INCLUDE_ROOT))
     if (loop->inner && loop->inner->next)
       {
 	loop_p sibling = loop->inner;
@@ -836,9 +833,7 @@ disambiguate_multiple_latches (class loop *loop)
 void
 disambiguate_loops_with_multiple_latches (void)
 {
-  class loop *loop;
-
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       if (!loop->latch)
 	disambiguate_multiple_latches (loop);
@@ -1457,7 +1452,7 @@ verify_loop_structure (void)
   auto_sbitmap visited (last_basic_block_for_fn (cfun));
   bitmap_clear (visited);
   bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       unsigned n;
 
@@ -1503,7 +1498,7 @@ verify_loop_structure (void)
   free (bbs);
 
   /* Check headers and latches.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       i = loop->num;
       if (loop->header == NULL)
@@ -1629,7 +1624,7 @@ verify_loop_structure (void)
     }
 
   /* Check the recorded loop exits.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       if (!loop->exits || loop->exits->e != NULL)
 	{
@@ -1723,7 +1718,7 @@ verify_loop_structure (void)
 	  err = 1;
 	}
 
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	{
 	  eloops = 0;
 	  for (exit = loop->exits->next; exit->e; exit = exit->next)
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 5e699276c88..ed6b0765e1b 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -658,62 +658,153 @@ enum li_flags
   LI_ONLY_INNERMOST = 4		/* Iterate only over innermost loops.  */
 };
 
-/* The iterator for loops.  */
+/* Provide the functionality of std::as_const to support range-based for
+   to use const iterator.  (We can't use std::as_const itself because it's
+   a C++17 feature.)  */
+template <typename T>
+constexpr const T &
+as_const (T &t)
+{
+  return t;
+}
+
+/* A list for visiting loops, which contains the loop numbers instead of
+   the loop pointers.  The scope is restricted in loop hierarchy tree
+   LOOPS or function FN and the visiting order is specified by FLAGS.  */
 
-class loop_iterator
+class loops_list
 {
 public:
-  loop_iterator (function *fn, loop_p *loop, unsigned flags);
+  loops_list (struct loops *loops, unsigned flags);
+
+  loops_list (function *fn, unsigned flags)
+    : loops_list (loops_for_fn (fn), flags)
+  {
+  }
+
+  template <typename T> class Iter
+  {
+  public:
+    Iter (const loops_list &l, unsigned idx) : list (l), curr_idx (idx)
+    {
+      fill_curr_loop ();
+    }
+
+    T operator* () const { return curr_loop; }
+
+    Iter &
+    operator++ ()
+    {
+      if (curr_idx < list.to_visit.length ())
+	{
+	  /* Bump the index and fill a new one.  */
+	  curr_idx++;
+	  fill_curr_loop ();
+	}
+      else
+	gcc_assert (!curr_loop);
+
+      return *this;
+    }
+
+    bool
+    operator!= (const Iter &rhs) const
+    {
+      return this->curr_idx != rhs.curr_idx;
+    }
+
+  private:
+    /* Fill the current loop starting from the current index.  */
+    void fill_curr_loop ();
+
+    /* Reference to the loop list to visit.  */
+    const loops_list &list;
+
+    /* The current index in the list to visit.  */
+    unsigned curr_idx;
 
-  inline loop_p next ();
+    /* The loop implied by the current index.  */
+    loop_p curr_loop;
+  };
 
-  /* The function we are visiting.  */
-  function *fn;
+  using iterator = Iter<loop_p>;
+  using const_iterator = Iter<const loop_p>;
+
+  iterator
+  begin ()
+  {
+    return iterator (*this, 0);
+  }
+
+  iterator
+  end ()
+  {
+    return iterator (*this, to_visit.length ());
+  }
+
+  const_iterator
+  begin () const
+  {
+    return const_iterator (*this, 0);
+  }
+
+  const_iterator
+  end () const
+  {
+    return const_iterator (*this, to_visit.length ());
+  }
+
+private:
+  /* The loop hierarchy tree we are visiting.  */
+  struct loops *loops;
 
   /* The list of loops to visit.  */
   auto_vec<int, 16> to_visit;
-
-  /* The index of the actual loop.  */
-  unsigned idx;
 };
 
-inline loop_p
-loop_iterator::next ()
+/* Starting from current index CURR_IDX (inclusive), find one index
+   which stands for one valid loop and fill the found loop as CURR_LOOP,
+   if we can't find one, set CURR_LOOP as null.  */
+
+template <typename T>
+inline void
+loops_list::Iter<T>::fill_curr_loop ()
 {
   int anum;
 
-  while (this->to_visit.iterate (this->idx, &anum))
+  while (this->list.to_visit.iterate (this->curr_idx, &anum))
     {
-      this->idx++;
-      loop_p loop = get_loop (fn, anum);
+      loop_p loop = (*this->list.loops->larray)[anum];
       if (loop)
-	return loop;
+	{
+	  curr_loop = loop;
+	  return;
+	}
+      this->curr_idx++;
     }
 
-  return NULL;
+  curr_loop = nullptr;
 }
 
-inline
-loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
+/* Set up the loops list to visit according to the specified
+   loop hierarchy tree LOOPS and iterating order FLAGS.  */
+
+inline loops_list::loops_list (struct loops *loops, unsigned flags)
 {
   class loop *aloop;
   unsigned i;
   int mn;
 
-  this->idx = 0;
-  this->fn = fn;
-  if (!loops_for_fn (fn))
-    {
-      *loop = NULL;
-      return;
-    }
+  this->loops = loops;
+  if (!loops)
+    return;
 
-  this->to_visit.reserve_exact (number_of_loops (fn));
+  this->to_visit.reserve_exact (vec_safe_length (loops->larray));
   mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
 
   if (flags & LI_ONLY_INNERMOST)
     {
-      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
+      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
 	if (aloop != NULL
 	    && aloop->inner == NULL
 	    && aloop->num >= mn)
@@ -722,7 +813,7 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
   else if (flags & LI_FROM_INNERMOST)
     {
       /* Push the loops to LI->TO_VISIT in postorder.  */
-      for (aloop = loops_for_fn (fn)->tree_root;
+      for (aloop = loops->tree_root;
 	   aloop->inner != NULL;
 	   aloop = aloop->inner)
 	continue;
@@ -748,7 +839,7 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
   else
     {
       /* Push the loops to LI->TO_VISIT in preorder.  */
-      aloop = loops_for_fn (fn)->tree_root;
+      aloop = loops->tree_root;
       while (1)
 	{
 	  if (aloop->num >= mn)
@@ -766,20 +857,8 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
 	    }
 	}
     }
-
-  *loop = this->next ();
 }
 
-#define FOR_EACH_LOOP(LOOP, FLAGS) \
-  for (loop_iterator li(cfun, &(LOOP), FLAGS); \
-       (LOOP); \
-       (LOOP) = li.next ())
-
-#define FOR_EACH_LOOP_FN(FN, LOOP, FLAGS) \
-  for (loop_iterator li(FN, &(LOOP), FLAGS); \
-       (LOOP); \
-       (LOOP) = li.next ())
-
 /* The properties of the target.  */
 struct target_cfgloop {
   /* Number of available registers.  */
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 2af59fedc92..28087a14e0f 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1572,12 +1572,10 @@ create_preheader (class loop *loop, int flags)
 void
 create_preheaders (int flags)
 {
-  class loop *loop;
-
   if (!current_loops)
     return;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     create_preheader (loop, flags);
   loops_state_set (LOOPS_HAVE_PREHEADERS);
 }
@@ -1587,10 +1585,9 @@ create_preheaders (int flags)
 void
 force_single_succ_latches (void)
 {
-  class loop *loop;
   edge e;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       if (loop->latch != loop->header && single_succ_p (loop->latch))
 	continue;
diff --git a/gcc/config/aarch64/falkor-tag-collision-avoidance.c b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
index de214e4a0f7..d9b1b3fe835 100644
--- a/gcc/config/aarch64/falkor-tag-collision-avoidance.c
+++ b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
@@ -808,8 +808,6 @@ record_loads (tag_map_t &tag_map, struct loop *loop)
 void
 execute_tag_collision_avoidance ()
 {
-  struct loop *loop;
-
   df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_chain_add_problem (DF_UD_CHAIN);
   df_compute_regs_ever_live (true);
@@ -824,7 +822,7 @@ execute_tag_collision_avoidance ()
   calculate_dominance_info (CDI_DOMINATORS);
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       tag_map_t tag_map (512);
 
diff --git a/gcc/config/mn10300/mn10300.c b/gcc/config/mn10300/mn10300.c
index 6f842a3ad32..d9229ff5cc6 100644
--- a/gcc/config/mn10300/mn10300.c
+++ b/gcc/config/mn10300/mn10300.c
@@ -3234,8 +3234,6 @@ mn10300_loop_contains_call_insn (loop_p loop)
 static void
 mn10300_scan_for_setlb_lcc (void)
 {
-  loop_p loop;
-
   DUMP ("Looking for loops that can use the SETLB insn", NULL_RTX);
 
   df_analyze ();
@@ -3248,7 +3246,7 @@ mn10300_scan_for_setlb_lcc (void)
      if an inner loop is not suitable for use with the SETLB/Lcc insns, it may
      be the case that its parent loop is suitable.  Thus we should check all
      loops, but work from the innermost outwards.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       const char * reason = NULL;
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b1d3b99784d..a98e1beebf1 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -14479,15 +14479,13 @@ s390_adjust_loop_scan_osc (struct loop* loop)
 static void
 s390_adjust_loops ()
 {
-  struct loop *loop = NULL;
-
   df_analyze ();
   compute_bb_for_insn ();
 
   /* Find the loops.  */
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       if (dump_file)
 	{
diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
index a135656ed01..27697b08728 100644
--- a/gcc/doc/loop.texi
+++ b/gcc/doc/loop.texi
@@ -79,14 +79,14 @@ and its subloops in the numbering.  The index of a loop never changes.
 
 The entries of the @code{larray} field should not be accessed directly.
 The function @code{get_loop} returns the loop description for a loop with
-the given index.  @code{number_of_loops} function returns number of
-loops in the function.  To traverse all loops, use @code{FOR_EACH_LOOP}
-macro.  The @code{flags} argument of the macro is used to determine
-the direction of traversal and the set of loops visited.  Each loop is
-guaranteed to be visited exactly once, regardless of the changes to the
-loop tree, and the loops may be removed during the traversal.  The newly
-created loops are never traversed, if they need to be visited, this
-must be done separately after their creation.
+the given index.  @code{number_of_loops} function returns number of loops
+in the function.  To traverse all loops, use range-based for loop with
+class @code{loop_list} instance. The @code{flags} argument of the macro
+is used to determine the direction of traversal and the set of loops
+visited.  Each loop is guaranteed to be visited exactly once, regardless
+of the changes to the loop tree, and the loops may be removed during the
+traversal.  The newly created loops are never traversed, if they need to
+be visited, this must be done separately after their creation.
 
 Each basic block contains the reference to the innermost loop it belongs
 to (@code{loop_father}).  For this reason, it is only possible to have
diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index 7a88faa2c07..a9044182a59 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -2089,8 +2089,7 @@ pass_linterchange::execute (function *fun)
     return 0;
 
   bool changed_p = false;
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       vec<loop_p> loop_nest = vNULL;
       vec<data_reference_p> datarefs = vNULL;
diff --git a/gcc/gimple-loop-jam.c b/gcc/gimple-loop-jam.c
index 4842f0dff80..271139a1d87 100644
--- a/gcc/gimple-loop-jam.c
+++ b/gcc/gimple-loop-jam.c
@@ -486,13 +486,12 @@ adjust_unroll_factor (class loop *inner, struct data_dependence_relation *ddr,
 static unsigned int
 tree_loop_unroll_and_jam (void)
 {
-  class loop *loop;
   bool changed = false;
 
   gcc_assert (scev_initialized_p ());
 
   /* Go through all innermost loops.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       class loop *outer = loop_outer (loop);
 
diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
index 4b70c5a4aab..3fd086c0c65 100644
--- a/gcc/gimple-loop-versioning.cc
+++ b/gcc/gimple-loop-versioning.cc
@@ -1428,8 +1428,7 @@ loop_versioning::analyze_blocks ()
      versioning at that level could be useful in some cases.  */
   get_loop_info (get_loop (m_fn, 0)).rejected_p = true;
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop_info &linfo = get_loop_info (loop);
 
@@ -1650,8 +1649,7 @@ loop_versioning::make_versioning_decisions ()
   AUTO_DUMP_SCOPE ("make_versioning_decisions",
 		   dump_user_location_t::from_function_decl (m_fn->decl));
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop_info &linfo = get_loop_info (loop);
       if (decide_whether_loop_is_versionable (loop))
diff --git a/gcc/gimple-ssa-split-paths.c b/gcc/gimple-ssa-split-paths.c
index 2dd953d5ef9..1470de279d6 100644
--- a/gcc/gimple-ssa-split-paths.c
+++ b/gcc/gimple-ssa-split-paths.c
@@ -473,13 +473,12 @@ static bool
 split_paths ()
 {
   bool changed = false;
-  loop_p loop;
 
   loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
   initialize_original_copy_tables ();
   calculate_dominance_info (CDI_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Only split paths if we are optimizing this loop for speed.  */
       if (!optimize_loop_for_speed_p (loop))
diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index c202213f39b..30370859460 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -1535,9 +1535,8 @@ graphite_regenerate_ast_isl (scop_p scop)
       if_region->false_region->region.entry->flags |= EDGE_FALLTHRU;
       /* remove_edge_and_dominated_blocks marks loops for removal but
 	 doesn't actually remove them (fix that...).  */
-      loop_p loop;
-      FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
-	if (! loop->header)
+      for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
+	if (!loop->header)
 	  delete_loop (loop);
     }
 
diff --git a/gcc/graphite.c b/gcc/graphite.c
index 6c4fb42282b..a26d5a3a818 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -377,8 +377,7 @@ canonicalize_loop_closed_ssa (loop_p loop, edge e)
 static void
 canonicalize_loop_form (void)
 {
-  loop_p loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       edge e = single_exit (loop);
       if (!e || (e->flags & (EDGE_COMPLEX|EDGE_FAKE)))
@@ -494,10 +493,9 @@ graphite_transform_loops (void)
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
-      loop_p loop;
       int num_no_dependency = 0;
 
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	if (loop->can_be_parallel)
 	  num_no_dependency++;
 
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 95d28757f95..883773abe12 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -2923,7 +2923,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
       if (dump_file && (dump_flags & TDF_DETAILS))
 	flow_loops_dump (dump_file, NULL, 0);
       scev_initialize ();
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	{
 	  predicate loop_iterations = true;
 	  sreal header_freq;
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index f045108af21..f1724e4f46d 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -1087,9 +1087,8 @@ end:
 	    }
 	  else
 	    {
-	      class loop *loop;
 	      scev_initialize ();
-	      FOR_EACH_LOOP (loop, 0)
+	      for (class loop *loop : loops_list (cfun, 0))
 		if (!finite_loop_p (loop))
 		  {
 		    if (dump_file)
diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
index dda7b9e268f..12288bd64d8 100644
--- a/gcc/loop-doloop.c
+++ b/gcc/loop-doloop.c
@@ -789,18 +789,14 @@ doloop_optimize (class loop *loop)
 void
 doloop_optimize_loops (void)
 {
-  class loop *loop;
-
   if (optimize == 1)
     {
       df_live_add_problem ();
       df_live_set_all_dirty ();
     }
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      doloop_optimize (loop);
-    }
+  for (class loop *loop : loops_list (cfun, 0))
+    doloop_optimize (loop);
 
   if (optimize == 1)
     df_remove_problem (df_live);
diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 1fde0ede441..1f75f509f53 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -137,7 +137,6 @@ loop_optimizer_init (unsigned flags)
 void
 loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
 {
-  class loop *loop;
   basic_block bb;
 
   timevar_push (TV_LOOP_FINI);
@@ -167,7 +166,7 @@ loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
       goto loop_fini_done;
     }
 
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (class loop *loop : loops_list (fn, 0))
     free_simple_loop_desc (loop);
 
   /* Clean up.  */
@@ -229,7 +228,7 @@ fix_loop_structure (bitmap changed_bbs)
      loops, so that when we remove the loops, we know that the loops inside
      are preserved, and do not waste time relinking loops that will be
      removed later.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Detect the case that the loop is no longer present even though
          it wasn't marked for removal.
diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index bdc7b59dd5f..059ef8fe7f4 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -2136,7 +2136,7 @@ calculate_loop_reg_pressure (void)
   rtx link;
   class loop *loop, *parent;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     if (loop->aux == NULL)
       {
 	loop->aux = xcalloc (1, sizeof (class loop_data));
@@ -2203,7 +2203,7 @@ calculate_loop_reg_pressure (void)
   bitmap_release (&curr_regs_live);
   if (flag_ira_region == IRA_REGION_MIXED
       || flag_ira_region == IRA_REGION_ALL)
-    FOR_EACH_LOOP (loop, 0)
+    for (class loop *loop : loops_list (cfun, 0))
       {
 	EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi)
 	  if (! bitmap_bit_p (&LOOP_DATA (loop)->regs_ref, j))
@@ -2217,7 +2217,7 @@ calculate_loop_reg_pressure (void)
       }
   if (dump_file == NULL)
     return;
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       parent = loop_outer (loop);
       fprintf (dump_file, "\n  Loop %d (parent %d, header bb%d, depth %d)\n",
@@ -2251,8 +2251,6 @@ calculate_loop_reg_pressure (void)
 void
 move_loop_invariants (void)
 {
-  class loop *loop;
-
   if (optimize == 1)
     df_live_add_problem ();
   /* ??? This is a hack.  We should only need to call df_live_set_all_dirty
@@ -2271,7 +2269,7 @@ move_loop_invariants (void)
     }
   df_set_flags (DF_EQ_NOTES + DF_DEFER_INSN_RESCAN);
   /* Process the loops, innermost first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       curr_loop = loop;
       /* move_single_loop_invariants for very large loops is time consuming
@@ -2284,10 +2282,8 @@ move_loop_invariants (void)
 	move_single_loop_invariants (loop);
     }
 
-  FOR_EACH_LOOP (loop, 0)
-    {
+  for (class loop *loop : loops_list (cfun, 0))
       free_loop_data (loop);
-    }
 
   if (flag_ira_loop_pressure)
     /* There is no sense to keep this info because it was most
diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c
index 66d93487e29..f11c9c396ae 100644
--- a/gcc/loop-unroll.c
+++ b/gcc/loop-unroll.c
@@ -214,10 +214,8 @@ report_unroll (class loop *loop, dump_location_t locus)
 static void
 decide_unrolling (int flags)
 {
-  class loop *loop;
-
   /* Scan the loops, inner ones first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop->lpt_decision.decision = LPT_NONE;
       dump_user_location_t locus = get_loop_location (loop);
@@ -278,14 +276,13 @@ decide_unrolling (int flags)
 void
 unroll_loops (int flags)
 {
-  class loop *loop;
   bool changed = false;
 
   /* Now decide rest of unrolling.  */
   decide_unrolling (flags);
 
   /* Scan the loops, inner ones first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* And perform the appropriate transformations.  */
       switch (loop->lpt_decision.decision)
diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
index e72e46db387..fc761cf60c5 100644
--- a/gcc/modulo-sched.c
+++ b/gcc/modulo-sched.c
@@ -1353,7 +1353,6 @@ sms_schedule (void)
   int maxii, max_asap;
   partial_schedule_ptr ps;
   basic_block bb = NULL;
-  class loop *loop;
   basic_block condition_bb = NULL;
   edge latch_edge;
   HOST_WIDE_INT trip_count, max_trip_count;
@@ -1397,7 +1396,7 @@ sms_schedule (void)
 
   /* Build DDGs for all the relevant loops and hold them in G_ARR
      indexed by the loop index.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       rtx_insn *head, *tail;
       rtx count_reg;
@@ -1543,7 +1542,7 @@ sms_schedule (void)
   }
 
   /* We don't want to perform SMS on new loops - created by versioning.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       rtx_insn *head, *tail;
       rtx count_reg;
diff --git a/gcc/predict.c b/gcc/predict.c
index d751e6cecce..5876b6e44a8 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -1949,7 +1949,7 @@ predict_loops (void)
 
   /* Try to predict out blocks in a loop that are not part of a
      natural loop.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       basic_block bb, *bbs;
       unsigned j, n_exits = 0;
@@ -4111,8 +4111,7 @@ pass_profile::execute (function *fun)
     profile_status_for_fn (fun) = PROFILE_GUESSED;
  if (dump_file && (dump_flags & TDF_DETAILS))
    {
-     class loop *loop;
-     FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+     for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
        if (loop->header->count.initialized_p ())
          fprintf (dump_file, "Loop got predicted %d to iterate %i times.\n",
        	   loop->num,
diff --git a/gcc/profile.c b/gcc/profile.c
index 1fa4196fa16..6357fb37cfd 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -1466,13 +1466,12 @@ branch_prob (bool thunk)
   if (flag_branch_probabilities
       && (profile_status_for_fn (cfun) == PROFILE_READ))
     {
-      class loop *loop;
       if (dump_file && (dump_flags & TDF_DETAILS))
 	report_predictor_hitrates ();
 
       /* At this moment we have precise loop iteration count estimates.
 	 Record them to loop structure before the profile gets out of date. */
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	if (loop->header->count > 0 && loop->header->count.reliable_p ())
 	  {
 	    gcov_type nit = expected_loop_iterations_unbounded (loop);
diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index eef9d6969f4..3dff69f21ce 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -6247,10 +6247,8 @@ make_regions_from_the_rest (void)
 /* Free data structures used in pipelining of loops.  */
 void sel_finish_pipelining (void)
 {
-  class loop *loop;
-
   /* Release aux fields so we don't free them later by mistake.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     loop->aux = NULL;
 
   loop_optimizer_finalize ();
@@ -6271,11 +6269,11 @@ sel_find_rgns (void)
 
   if (current_loops)
     {
-      loop_p loop;
+      unsigned flags = flag_sel_sched_pipelining_outer_loops
+			 ? LI_FROM_INNERMOST
+			 : LI_ONLY_INNERMOST;
 
-      FOR_EACH_LOOP (loop, (flag_sel_sched_pipelining_outer_loops
-			    ? LI_FROM_INNERMOST
-			    : LI_ONLY_INNERMOST))
+      for (class loop *loop : loops_list (cfun, flags))
 	make_regions_from_loop_nest (loop);
     }
 
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index c8b0f7b33e1..0d082851f2f 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -312,12 +312,11 @@ replace_loop_annotate_in_block (basic_block bb, class loop *loop)
 static void
 replace_loop_annotate (void)
 {
-  class loop *loop;
   basic_block bb;
   gimple_stmt_iterator gsi;
   gimple *stmt;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       /* First look into the header.  */
       replace_loop_annotate_in_block (loop->header, loop);
@@ -2027,12 +2026,8 @@ replace_uses_by (tree name, tree val)
   /* Also update the trees stored in loop structures.  */
   if (current_loops)
     {
-      class loop *loop;
-
-      FOR_EACH_LOOP (loop, 0)
-	{
+      for (class loop *loop : loops_list (cfun, 0))
 	  substitute_in_loop_info (loop, name, val);
-	}
     }
 }
 
@@ -7752,9 +7747,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
 
   /* Fix up orig_loop_num.  If the block referenced in it has been moved
      to dest_cfun, update orig_loop_num field, otherwise clear it.  */
-  class loop *dloop;
+  class loop *dloop = NULL;
   signed char *moved_orig_loop_num = NULL;
-  FOR_EACH_LOOP_FN (dest_cfun, dloop, 0)
+  for (class loop *dloop : loops_list (dest_cfun, 0))
     if (dloop->orig_loop_num)
       {
 	if (moved_orig_loop_num == NULL)
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 345488e2a19..1bf68455f72 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -3300,14 +3300,13 @@ pass_if_conversion::gate (function *fun)
 unsigned int
 pass_if_conversion::execute (function *fun)
 {
-  class loop *loop;
   unsigned todo = 0;
 
   if (number_of_loops (fun) <= 1)
     return 0;
 
   auto_vec<gimple *> preds;
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     if (flag_tree_loop_if_convert == 1
 	|| ((flag_tree_loop_vectorize || loop->force_vectorize)
 	    && !loop->dont_vectorize))
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 65aa1df4aba..1f0b6eb3a95 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -3312,7 +3312,7 @@ loop_distribution::execute (function *fun)
 
   /* We can at the moment only distribute non-nested loops, thus restrict
      walking to innermost loops.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       /* Don't distribute multiple exit edges loop, or cold loop when
          not doing pattern detection.  */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index fe1baef32a7..9d73f241204 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -3989,7 +3989,6 @@ parallelize_loops (bool oacc_kernels_p)
 {
   unsigned n_threads;
   bool changed = false;
-  class loop *loop;
   class loop *skip_loop = NULL;
   class tree_niter_desc niter_desc;
   struct obstack parloop_obstack;
@@ -4020,7 +4019,7 @@ parallelize_loops (bool oacc_kernels_p)
 
   calculate_dominance_info (CDI_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       if (loop == skip_loop)
 	{
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index cf85517e1c7..b8ae6b66df9 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -3419,11 +3419,10 @@ pcom_worker::tree_predictive_commoning_loop (bool allow_unroll_p)
 unsigned
 tree_predictive_commoning (bool allow_unroll_p)
 {
-  class loop *loop;
   unsigned ret = 0, changed = 0;
 
   initialize_original_copy_tables ();
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     if (optimize_loop_for_speed_p (loop))
       {
 	pcom_worker w(loop);
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index b22d49a0ab6..c6a3b7f9bd7 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -2977,16 +2977,12 @@ gather_stats_on_scev_database (void)
 void
 scev_initialize (void)
 {
-  class loop *loop;
-
   gcc_assert (! scev_initialized_p ());
 
   scalar_evolution_info = hash_table<scev_info_hasher>::create_ggc (100);
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      loop->nb_iterations = NULL_TREE;
-    }
+  for (class loop *loop : loops_list (cfun, 0))
+    loop->nb_iterations = NULL_TREE;
 }
 
 /* Return true if SCEV is initialized.  */
@@ -3015,14 +3011,10 @@ scev_reset_htab (void)
 void
 scev_reset (void)
 {
-  class loop *loop;
-
   scev_reset_htab ();
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      loop->nb_iterations = NULL_TREE;
-    }
+  for (class loop *loop : loops_list (cfun, 0))
+    loop->nb_iterations = NULL_TREE;
 }
 
 /* Return true if the IV calculation in TYPE can overflow based on the knowledge
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index e2d3b63a30c..226cbc18a2f 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -417,7 +417,6 @@ find_obviously_necessary_stmts (bool aggressive)
   /* Prevent the empty possibly infinite loops from being removed.  */
   if (aggressive)
     {
-      class loop *loop;
       if (mark_irreducible_loops ())
 	FOR_EACH_BB_FN (bb, cfun)
 	  {
@@ -433,7 +432,7 @@ find_obviously_necessary_stmts (bool aggressive)
 		}
 	  }
 
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	if (!finite_loop_p (loop))
 	  {
 	    if (dump_file)
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index a2aab25e862..16a3f7bb0bb 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -908,8 +908,7 @@ remove_unused_locals (void)
 
   if (cfun->has_simduid_loops)
     {
-      class loop *loop;
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	if (loop->simduid && !is_used_p (loop->simduid))
 	  loop->simduid = NULL_TREE;
     }
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index dfa5dc87c34..57fa3ee3608 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -348,7 +348,6 @@ protected:
 unsigned int
 ch_base::copy_headers (function *fun)
 {
-  class loop *loop;
   basic_block header;
   edge exit, entry;
   basic_block *bbs, *copied_bbs;
@@ -365,7 +364,7 @@ ch_base::copy_headers (function *fun)
 
   auto_vec<std::pair<edge, loop_p> > copied;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       int initial_limit = param_max_loop_header_insns;
       int remaining_limit = initial_limit;
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 81b4ec21d6e..a8cc9bc8151 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -1662,7 +1662,7 @@ analyze_memory_references (bool store_motion)
 {
   gimple_stmt_iterator bsi;
   basic_block bb, *bbs;
-  class loop *loop, *outer;
+  class loop *outer;
   unsigned i, n;
 
   /* Collect all basic-blocks in loops and sort them after their
@@ -1706,7 +1706,7 @@ analyze_memory_references (bool store_motion)
 
   /* Propagate the information about accessed memory references up
      the loop hierarchy.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Finalize the overall touched references (including subloops).  */
       bitmap_ior_into (&memory_accesses.all_refs_stored_in_loop[loop->num],
@@ -3133,7 +3133,6 @@ fill_always_executed_in (void)
 static void
 tree_ssa_lim_initialize (bool store_motion)
 {
-  class loop *loop;
   unsigned i;
 
   bitmap_obstack_initialize (&lim_bitmap_obstack);
@@ -3177,7 +3176,7 @@ tree_ssa_lim_initialize (bool store_motion)
      its postorder index.  */
   i = 0;
   bb_loop_postorder = XNEWVEC (unsigned, number_of_loops (cfun));
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     bb_loop_postorder[loop->num] = i++;
 }
 
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index b1971f83544..81e9a22be4c 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -1285,14 +1285,13 @@ canonicalize_loop_induction_variables (class loop *loop,
 unsigned int
 canonicalize_induction_variables (void)
 {
-  class loop *loop;
   bool changed = false;
   bool irred_invalidated = false;
   bitmap loop_closed_ssa_invalidated = BITMAP_ALLOC (NULL);
 
   estimate_numbers_of_iterations (cfun);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       changed |= canonicalize_loop_induction_variables (loop,
 							true, UL_SINGLE_ITER,
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 12a8a49a307..43668463b21 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -8066,14 +8066,13 @@ finish:
 void
 tree_ssa_iv_optimize (void)
 {
-  class loop *loop;
   struct ivopts_data data;
   auto_bitmap toremove;
 
   tree_ssa_iv_optimize_init (&data);
 
   /* Optimize the loops starting with the innermost ones.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!dbg_cnt (ivopts_loop))
 	continue;
diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index 28ae1316fa0..15dab0c1451 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -362,11 +362,10 @@ add_exit_phis (bitmap names_to_rename, bitmap *use_blocks, bitmap *loop_exits)
 static void
 get_loops_exits (bitmap *loop_exits)
 {
-  class loop *loop;
   unsigned j;
   edge e;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       auto_vec<edge> exit_edges = get_loop_exit_edges (loop);
       loop_exits[loop->num] = BITMAP_ALLOC (&loop_renamer_obstack);
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 6fabf10a215..84e5ce064fe 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -4559,13 +4559,11 @@ estimated_stmt_executions (class loop *loop, widest_int *nit)
 void
 estimate_numbers_of_iterations (function *fn)
 {
-  class loop *loop;
-
   /* We don't want to issue signed overflow warnings while getting
      loop iteration estimates.  */
   fold_defer_overflow_warnings ();
 
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (class loop *loop : loops_list (fn, 0))
     estimate_numbers_of_iterations (loop);
 
   fold_undefer_and_ignore_overflow_warnings ();
@@ -5031,9 +5029,7 @@ free_numbers_of_iterations_estimates (class loop *loop)
 void
 free_numbers_of_iterations_estimates (function *fn)
 {
-  class loop *loop;
-
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (class loop *loop : loops_list (fn, 0))
     free_numbers_of_iterations_estimates (loop);
 }
 
diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index 98062eb4616..b1af922d403 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -1980,7 +1980,6 @@ fail:
 unsigned int
 tree_ssa_prefetch_arrays (void)
 {
-  class loop *loop;
   bool unrolled = false;
   int todo_flags = 0;
 
@@ -2025,7 +2024,7 @@ tree_ssa_prefetch_arrays (void)
       set_builtin_decl (BUILT_IN_PREFETCH, decl, false);
     }
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file, "Processing loop %d:\n", loop->num);
diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index 3a09bbc39e5..40629227863 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -1598,18 +1598,17 @@ split_loop_on_cond (struct loop *loop)
 static unsigned int
 tree_ssa_split_loops (void)
 {
-  class loop *loop;
   bool changed = false;
 
   gcc_assert (scev_initialized_p ());
 
   calculate_dominance_info (CDI_POST_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (class loop *loop : loops_list (cfun, LI_INCLUDE_ROOT))
     loop->aux = NULL;
 
   /* Go through all loops starting from innermost.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (loop->aux)
 	{
@@ -1630,7 +1629,7 @@ tree_ssa_split_loops (void)
 	}
     }
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (class loop *loop : loops_list (cfun, LI_INCLUDE_ROOT))
     loop->aux = NULL;
 
   clear_aux_for_blocks ();
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 04d4553f13e..c4adb52a67c 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -90,11 +90,10 @@ static tree get_vop_from_header (class loop *);
 unsigned int
 tree_ssa_unswitch_loops (void)
 {
-  class loop *loop;
   bool changed = false;
 
   /* Go through all loops starting from innermost.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!loop->inner)
 	/* Unswitch innermost loop.  */
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 957ac0f3baa..2255c228780 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -157,8 +157,7 @@ gate_oacc_kernels (function *fn)
   if (!lookup_attribute ("oacc kernels", DECL_ATTRIBUTES (fn->decl)))
     return false;
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     if (loop->in_oacc_kernels_region)
       return true;
 
@@ -455,12 +454,11 @@ public:
 unsigned
 pass_scev_cprop::execute (function *)
 {
-  class loop *loop;
   bool any = false;
 
   /* Perform final value replacement in loops, in case the replacement
      expressions are cheap.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     any |= final_value_replacement_loop (loop);
 
   return any ? TODO_cleanup_cfg | TODO_update_ssa_only_virtuals : 0;
diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index d93ec90b002..1ac4d050f99 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -1262,7 +1262,6 @@ clean_up_loop_closed_phi (function *fun)
   tree rhs;
   tree lhs;
   gphi_iterator gsi;
-  struct loop *loop;
 
   /* Avoid possibly quadratic work when scanning for loop exits across
    all loops of a nest.  */
@@ -1274,7 +1273,7 @@ clean_up_loop_closed_phi (function *fun)
   calculate_dominance_info  (CDI_DOMINATORS);
 
   /* Walk over loop in function.  */
-  FOR_EACH_LOOP_FN (fun, loop, 0)
+  for (class loop *loop : loops_list (fun, 0))
     {
       /* Check each exit edege of loop.  */
       auto_vec<edge> exits = get_loop_exit_edges (loop);
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 7900df946f4..1bc22950790 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -7637,9 +7637,8 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
      loops and the outermost one optimistically.  */
   if (iterate)
     {
-      loop_p loop;
       unsigned max_depth = param_rpo_vn_max_loop_depth;
-      FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+      for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
 	if (loop_depth (loop) > max_depth)
 	  for (unsigned i = 2;
 	       i < loop_depth (loop) - max_depth; ++i)
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index f496dd3eb8c..ad1fd115161 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2561,7 +2561,6 @@ jump_thread_path_registry::thread_through_all_blocks
 {
   bool retval = false;
   unsigned int i;
-  class loop *loop;
   auto_bitmap threaded_blocks;
   hash_set<edge> visited_starting_edges;
 
@@ -2702,7 +2701,7 @@ jump_thread_path_registry::thread_through_all_blocks
   /* Then perform the threading through loop headers.  We start with the
      innermost loop, so that the changes in cfg we perform won't affect
      further threading.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!loop->header
 	  || !bitmap_bit_p (threaded_blocks, loop->header->index))
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index f1035a83826..ebf237d054a 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -1194,7 +1194,7 @@ vectorize_loops (void)
   /* If some loop was duplicated, it gets bigger number
      than all previously defined loops.  This fact allows us to run
      only over initial loops skipping newly generated ones.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     if (loop->dont_vectorize)
       {
 	any_ifcvt_loops = true;
@@ -1213,7 +1213,7 @@ vectorize_loops (void)
 		  loop4 (copy of loop2)
 		else
 		  loop5 (copy of loop4)
-	   If FOR_EACH_LOOP gives us loop3 first (which has
+	   If loops' iteration gives us loop3 first (which has
 	   dont_vectorize set), make sure to process loop1 before loop4;
 	   so that we can prevent vectorization of loop4 if loop1
 	   is successfully vectorized.  */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 0565c9b5073..abcf33c4f26 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3337,8 +3337,7 @@ vrp_asserts::find_assert_locations (void)
   /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
      the order we compute liveness and insert asserts we otherwise
      fail to insert asserts into the loop latch.  */
-  loop_p loop;
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       i = loop->latch->index;
       unsigned int j = single_succ_edge (loop->latch)->dest_idx;

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC/PATCH] Use range-based for loops for traversing loops
  2021-07-20  9:49     ` Jonathan Wakely
  2021-07-20  9:50       ` Jonathan Wakely
@ 2021-07-20 14:42       ` Kewen.Lin
  1 sibling, 0 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-20 14:42 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: Martin Sebor, Richard Biener, Richard Sandiford, Jakub Jelinek,
	Trevor Saunders, Segher Boessenkool, GCC Patches

on 2021/7/20 下午5:49, Jonathan Wakely wrote:
> On Tue, 20 Jul 2021 at 09:58, Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> on 2021/7/19 下午11:59, Martin Sebor wrote:
>>> On 7/19/21 12:20 AM, Kewen.Lin wrote:
>>>> Hi,
>>>>
>>>> This patch follows Martin's suggestion here[1], to support
>>>> range-based for loops for traversing loops, analogously to
>>>> the patch for vec[2].
>>>>
>>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>>>> x86_64-redhat-linux and aarch64-linux-gnu, also
>>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>>>
>>>> Any comments are appreciated.
>>>
>>> Thanks for this nice cleanup!  Just a few suggestions:
>>>
>>> I would recommend against introducing new macros unless they
>>> offer a significant advantage over alternatives (for the two
>>> macros the patch adds I don't think they do).
>>>
>>> If improving const-correctness is one of our a goals
>>> the loops_list iterator type would need to a corresponding
>>> const_iterator type, and const overloads of the begin()
>>> and end() member functions.
>>>
>>> Rather than introducing more instances of the loop_p typedef
>>> I'd suggest to use loop *.  It has at least two advantages:
>>> it's clearer (it's obvious it refers to a pointer), and lends
>>> itself more readily to making code const-correct by declaring
>>> the control variable const: for (const class loop *loop: ...)
>>> while avoiding the mistake of using const loop_p loop to
>>> declare a pointer to a const loop.
>>>
>>
>> Thanks for the suggestions, Martin!  Will update them in V2.
>>
>> With some experiments, I noticed that even provided const_iterator
>> like:
>>
>>    iterator
>>    begin ()
>>    {
>>      return iterator (*this, 0);
>>    }
>>
>> +  const_iterator
>> +  begin () const
>> +  {
>> +    return const_iterator (*this, 0);
>> +  }
>>
>> for (const class loop *loop: ...) will still use iterator instead
>> of const_iterator pair.  We have to make the code look like:
>>
>>   const auto& const_loops = loops_list (...);
>>   for (const class loop *loop: const_loops)
>>
>> or
>>   template<typename T> constexpr const T &as_const(T &t) noexcept { return t; }
>>   for (const class loop *loop: as_const(loops_list...))
>>
>> Does it look good to add below as_const along with loops_list in cfgloop.h?
>>
>> +/* Provide the functionality of std::as_const to support range-based for
>> +   to use const iterator.  (We can't use std::as_const itself because it's
>> +   a C++17 feature.)  */
>> +template <typename T>
>> +constexpr const T &
>> +as_const (T &t) noexcept
> 
> The noexcept is not needed because GCC is built -fno-exceptions. For
> consistency with all the other code that doesn't use noexcept, it
> should probably not be there.
> 

Thanks for pointing out!   Fixed it in v2.

>> +{
>> +  return t;
>> +}
>> +
> 
> That's one option. Another option (which could coexist with as_const)
> is to add cbegin() and cend() members, which are not overloaded for
> const and non-const, and so always return a const_iterator:
> 
> const_iterator cbegin () const { return const_iterator (*this, 0); }
> iterator begin () const { return cbegin(); }
> 
> And similarly for `end () const` and `cend () const`.
> 

Thanks for the suggestion.  As you pointed out in the later reply, the
range-based for loop doesn't use cbegin and cend, so I didn't add them
in v2.

BR,
Kewen

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] Use range-based for loops for traversing loops
  2021-07-20 14:36 ` [PATCH v2] " Kewen.Lin
@ 2021-07-22 12:56   ` Richard Biener
  2021-07-22 12:56     ` Richard Biener
  2021-07-23  8:41     ` [PATCH] Make loops_list support an optional loop_p root Kewen.Lin
  2021-07-23  8:35   ` [PATCH v3] Use range-based for loops for traversing loops Kewen.Lin
  1 sibling, 2 replies; 35+ messages in thread
From: Richard Biener @ 2021-07-22 12:56 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: GCC Patches, Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Trevor Saunders, Martin Sebor, Bill Schmidt

On Tue, Jul 20, 2021 at 4:37
PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> Hi,
>
> This v2 has addressed some review comments/suggestions:
>
>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>     to support loop hierarchy tree rather than just a function,
>     and adjust to use loops* accordingly.

I actually meant struct loop *, not struct loops * ;)  At the point
we pondered to make loop invariant motion work on single
loop nests we gave up not only but also because it iterates
over the loop nest but all the iterators only ever can process
all loops, not say, all loops inside a specific 'loop' (and
including that 'loop' if LI_INCLUDE_ROOT).  So the
CTOR would take the 'root' of the loop tree as argument.

I see that doesn't trivially fit how loops_list works, at least
not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
could be adjusted to do ONLY_INNERMOST as well?

>   - Make implicit 'cfun' become explicit.
>   - Get rid of macros ALL_LOOPS*, use loops_list instance.
>   - Add const_iterator type begin()/end().
>   - Use class loop* instead of loop_p in range-based for.
>
> Bootstrapped and regtested again on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped again on ppc64le P9 with bootstrap-O3 config.
>
> Does it look better?  Is it ok for trunk?
>
> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
>         * cfgloop.h (as_const): New function.
>         (class loop_iterator): Rename to ...
>         (class loops_list): ... this.
>         (loop_iterator::next): Rename to ...
>         (loops_list::Iter::fill_curr_loop): ... this and adjust.
>         (loop_iterator::loop_iterator): Rename to ...
>         (loops_list::loops_list): ... this and adjust.
>         (loops_list::Iter): New class.
>         (loops_list::iterator): New type.
>         (loops_list::const_iterator): New type.
>         (loops_list::begin): New function.
>         (loops_list::end): Likewise.
>         (loops_list::begin const): Likewise.
>         (loops_list::end const): Likewise.
>         (FOR_EACH_LOOP): Remove.
>         (FOR_EACH_LOOP_FN): Remove.
>         * cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with range-based
>         for loop with loops_list instance.
>         (sort_sibling_loops): Likewise.
>         (disambiguate_loops_with_multiple_latches): Likewise.
>         (verify_loop_structure): Likewise.
>         * cfgloopmanip.c (create_preheaders): Likewise.
>         (force_single_succ_latches): Likewise.
>         * config/aarch64/falkor-tag-collision-avoidance.c
>         (execute_tag_collision_avoidance): Likewise.
>         * config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
>         * config/s390/s390.c (s390_adjust_loops): Likewise.
>         * doc/loop.texi: Likewise.
>         * gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
>         * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
>         * gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
>         (loop_versioning::make_versioning_decisions): Likewise.
>         * gimple-ssa-split-paths.c (split_paths): Likewise.
>         * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
>         * graphite.c (canonicalize_loop_form): Likewise.
>         (graphite_transform_loops): Likewise.
>         * ipa-fnsummary.c (analyze_function_body): Likewise.
>         * ipa-pure-const.c (analyze_function): Likewise.
>         * loop-doloop.c (doloop_optimize_loops): Likewise.
>         * loop-init.c (loop_optimizer_finalize): Likewise.
>         (fix_loop_structure): Likewise.
>         * loop-invariant.c (calculate_loop_reg_pressure): Likewise.
>         (move_loop_invariants): Likewise.
>         * loop-unroll.c (decide_unrolling): Likewise.
>         (unroll_loops): Likewise.
>         * modulo-sched.c (sms_schedule): Likewise.
>         * predict.c (predict_loops): Likewise.
>         (pass_profile::execute): Likewise.
>         * profile.c (branch_prob): Likewise.
>         * sel-sched-ir.c (sel_finish_pipelining): Likewise.
>         (sel_find_rgns): Likewise.
>         * tree-cfg.c (replace_loop_annotate): Likewise.
>         (replace_uses_by): Likewise.
>         (move_sese_region_to_fn): Likewise.
>         * tree-if-conv.c (pass_if_conversion::execute): Likewise.
>         * tree-loop-distribution.c (loop_distribution::execute): Likewise.
>         * tree-parloops.c (parallelize_loops): Likewise.
>         * tree-predcom.c (tree_predictive_commoning): Likewise.
>         * tree-scalar-evolution.c (scev_initialize): Likewise.
>         (scev_reset): Likewise.
>         * tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
>         * tree-ssa-live.c (remove_unused_locals): Likewise.
>         * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
>         * tree-ssa-loop-im.c (analyze_memory_references): Likewise.
>         (tree_ssa_lim_initialize): Likewise.
>         * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
>         * tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
>         * tree-ssa-loop-manip.c (get_loops_exits): Likewise.
>         * tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
>         (free_numbers_of_iterations_estimates): Likewise.
>         * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
>         * tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
>         * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
>         * tree-ssa-loop.c (gate_oacc_kernels): Likewise.
>         (pass_scev_cprop::execute): Likewise.
>         * tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
>         * tree-ssa-sccvn.c (do_rpo_vn): Likewise.
>         * tree-ssa-threadupdate.c
>         (jump_thread_path_registry::thread_through_all_blocks): Likewise.
>         * tree-vectorizer.c (vectorize_loops): Likewise.
>         * tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] Use range-based for loops for traversing loops
  2021-07-22 12:56   ` Richard Biener
@ 2021-07-22 12:56     ` Richard Biener
  2021-07-23  8:41     ` [PATCH] Make loops_list support an optional loop_p root Kewen.Lin
  1 sibling, 0 replies; 35+ messages in thread
From: Richard Biener @ 2021-07-22 12:56 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: GCC Patches, Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Trevor Saunders, Martin Sebor, Bill Schmidt

On Thu, Jul 22, 2021 at 2:56 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Tue, Jul 20, 2021 at 4:37
> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >
> > Hi,
> >
> > This v2 has addressed some review comments/suggestions:
> >
> >   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
> >   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
> >     to support loop hierarchy tree rather than just a function,
> >     and adjust to use loops* accordingly.
>
> I actually meant struct loop *, not struct loops * ;)  At the point
> we pondered to make loop invariant motion work on single
> loop nests we gave up not only but also because it iterates
> over the loop nest but all the iterators only ever can process
> all loops, not say, all loops inside a specific 'loop' (and
> including that 'loop' if LI_INCLUDE_ROOT).  So the
> CTOR would take the 'root' of the loop tree as argument.
>
> I see that doesn't trivially fit how loops_list works, at least
> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
> could be adjusted to do ONLY_INNERMOST as well?

Oh, just to say - simply leave out the extra CTOR for now.

> >   - Make implicit 'cfun' become explicit.
> >   - Get rid of macros ALL_LOOPS*, use loops_list instance.
> >   - Add const_iterator type begin()/end().
> >   - Use class loop* instead of loop_p in range-based for.
> >
> > Bootstrapped and regtested again on powerpc64le-linux-gnu P9,
> > x86_64-redhat-linux and aarch64-linux-gnu, also
> > bootstrapped again on ppc64le P9 with bootstrap-O3 config.
> >
> > Does it look better?  Is it ok for trunk?
> >
> > BR,
> > Kewen
> > -----
> > gcc/ChangeLog:
> >
> >         * cfgloop.h (as_const): New function.
> >         (class loop_iterator): Rename to ...
> >         (class loops_list): ... this.
> >         (loop_iterator::next): Rename to ...
> >         (loops_list::Iter::fill_curr_loop): ... this and adjust.
> >         (loop_iterator::loop_iterator): Rename to ...
> >         (loops_list::loops_list): ... this and adjust.
> >         (loops_list::Iter): New class.
> >         (loops_list::iterator): New type.
> >         (loops_list::const_iterator): New type.
> >         (loops_list::begin): New function.
> >         (loops_list::end): Likewise.
> >         (loops_list::begin const): Likewise.
> >         (loops_list::end const): Likewise.
> >         (FOR_EACH_LOOP): Remove.
> >         (FOR_EACH_LOOP_FN): Remove.
> >         * cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with range-based
> >         for loop with loops_list instance.
> >         (sort_sibling_loops): Likewise.
> >         (disambiguate_loops_with_multiple_latches): Likewise.
> >         (verify_loop_structure): Likewise.
> >         * cfgloopmanip.c (create_preheaders): Likewise.
> >         (force_single_succ_latches): Likewise.
> >         * config/aarch64/falkor-tag-collision-avoidance.c
> >         (execute_tag_collision_avoidance): Likewise.
> >         * config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
> >         * config/s390/s390.c (s390_adjust_loops): Likewise.
> >         * doc/loop.texi: Likewise.
> >         * gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
> >         * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
> >         * gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
> >         (loop_versioning::make_versioning_decisions): Likewise.
> >         * gimple-ssa-split-paths.c (split_paths): Likewise.
> >         * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
> >         * graphite.c (canonicalize_loop_form): Likewise.
> >         (graphite_transform_loops): Likewise.
> >         * ipa-fnsummary.c (analyze_function_body): Likewise.
> >         * ipa-pure-const.c (analyze_function): Likewise.
> >         * loop-doloop.c (doloop_optimize_loops): Likewise.
> >         * loop-init.c (loop_optimizer_finalize): Likewise.
> >         (fix_loop_structure): Likewise.
> >         * loop-invariant.c (calculate_loop_reg_pressure): Likewise.
> >         (move_loop_invariants): Likewise.
> >         * loop-unroll.c (decide_unrolling): Likewise.
> >         (unroll_loops): Likewise.
> >         * modulo-sched.c (sms_schedule): Likewise.
> >         * predict.c (predict_loops): Likewise.
> >         (pass_profile::execute): Likewise.
> >         * profile.c (branch_prob): Likewise.
> >         * sel-sched-ir.c (sel_finish_pipelining): Likewise.
> >         (sel_find_rgns): Likewise.
> >         * tree-cfg.c (replace_loop_annotate): Likewise.
> >         (replace_uses_by): Likewise.
> >         (move_sese_region_to_fn): Likewise.
> >         * tree-if-conv.c (pass_if_conversion::execute): Likewise.
> >         * tree-loop-distribution.c (loop_distribution::execute): Likewise.
> >         * tree-parloops.c (parallelize_loops): Likewise.
> >         * tree-predcom.c (tree_predictive_commoning): Likewise.
> >         * tree-scalar-evolution.c (scev_initialize): Likewise.
> >         (scev_reset): Likewise.
> >         * tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
> >         * tree-ssa-live.c (remove_unused_locals): Likewise.
> >         * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
> >         * tree-ssa-loop-im.c (analyze_memory_references): Likewise.
> >         (tree_ssa_lim_initialize): Likewise.
> >         * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
> >         * tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
> >         * tree-ssa-loop-manip.c (get_loops_exits): Likewise.
> >         * tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
> >         (free_numbers_of_iterations_estimates): Likewise.
> >         * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
> >         * tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
> >         * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
> >         * tree-ssa-loop.c (gate_oacc_kernels): Likewise.
> >         (pass_scev_cprop::execute): Likewise.
> >         * tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
> >         * tree-ssa-sccvn.c (do_rpo_vn): Likewise.
> >         * tree-ssa-threadupdate.c
> >         (jump_thread_path_registry::thread_through_all_blocks): Likewise.
> >         * tree-vectorizer.c (vectorize_loops): Likewise.
> >         * tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3] Use range-based for loops for traversing loops
  2021-07-20 14:36 ` [PATCH v2] " Kewen.Lin
  2021-07-22 12:56   ` Richard Biener
@ 2021-07-23  8:35   ` Kewen.Lin
  2021-07-23 16:10     ` Martin Sebor
  1 sibling, 1 reply; 35+ messages in thread
From: Kewen.Lin @ 2021-07-23  8:35 UTC (permalink / raw)
  To: GCC Patches
  Cc: Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Bill Schmidt, tbsaunde, Martin Sebor,
	Richard Biener

[-- Attachment #1: Type: text/plain, Size: 4337 bytes --]

Hi,

Comparing to v2, this v3 removed the new CTOR with struct loops *loops
as Richi clarified.  I'd like to support it in a separated follow up
patch by extending the existing CTOR with an optional argument loop_p
root.

Bootstrapped and regtested again on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu, also
bootstrapped again on ppc64le P9 with bootstrap-O3 config.

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* cfgloop.h (as_const): New function.
	(class loop_iterator): Rename to ...
	(class loops_list): ... this.
	(loop_iterator::next): Rename to ...
	(loops_list::Iter::fill_curr_loop): ... this and adjust.
	(loop_iterator::loop_iterator): Rename to ...
	(loops_list::loops_list): ... this and adjust.
	(loops_list::Iter): New class.
	(loops_list::iterator): New type.
	(loops_list::const_iterator): New type.
	(loops_list::begin): New function.
	(loops_list::end): Likewise.
	(loops_list::begin const): Likewise.
	(loops_list::end const): Likewise.
	(FOR_EACH_LOOP): Remove.
	(FOR_EACH_LOOP_FN): Remove.
	* cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with range-based
	for loop with loops_list instance.
	(sort_sibling_loops): Likewise.
	(disambiguate_loops_with_multiple_latches): Likewise.
	(verify_loop_structure): Likewise.
	* cfgloopmanip.c (create_preheaders): Likewise.
	(force_single_succ_latches): Likewise.
	* config/aarch64/falkor-tag-collision-avoidance.c
	(execute_tag_collision_avoidance): Likewise.
	* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
	* config/s390/s390.c (s390_adjust_loops): Likewise.
	* doc/loop.texi: Likewise.
	* gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
	* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
	* gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
	(loop_versioning::make_versioning_decisions): Likewise.
	* gimple-ssa-split-paths.c (split_paths): Likewise.
	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
	* graphite.c (canonicalize_loop_form): Likewise.
	(graphite_transform_loops): Likewise.
	* ipa-fnsummary.c (analyze_function_body): Likewise.
	* ipa-pure-const.c (analyze_function): Likewise.
	* loop-doloop.c (doloop_optimize_loops): Likewise.
	* loop-init.c (loop_optimizer_finalize): Likewise.
	(fix_loop_structure): Likewise.
	* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
	(move_loop_invariants): Likewise.
	* loop-unroll.c (decide_unrolling): Likewise.
	(unroll_loops): Likewise.
	* modulo-sched.c (sms_schedule): Likewise.
	* predict.c (predict_loops): Likewise.
	(pass_profile::execute): Likewise.
	* profile.c (branch_prob): Likewise.
	* sel-sched-ir.c (sel_finish_pipelining): Likewise.
	(sel_find_rgns): Likewise.
	* tree-cfg.c (replace_loop_annotate): Likewise.
	(replace_uses_by): Likewise.
	(move_sese_region_to_fn): Likewise.
	* tree-if-conv.c (pass_if_conversion::execute): Likewise.
	* tree-loop-distribution.c (loop_distribution::execute): Likewise.
	* tree-parloops.c (parallelize_loops): Likewise.
	* tree-predcom.c (tree_predictive_commoning): Likewise.
	* tree-scalar-evolution.c (scev_initialize): Likewise.
	(scev_reset): Likewise.
	* tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
	* tree-ssa-live.c (remove_unused_locals): Likewise.
	* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
	* tree-ssa-loop-im.c (analyze_memory_references): Likewise.
	(tree_ssa_lim_initialize): Likewise.
	* tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
	* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
	* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
	* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
	(free_numbers_of_iterations_estimates): Likewise.
	* tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
	* tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
	* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
	* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
	(pass_scev_cprop::execute): Likewise.
	* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
	* tree-ssa-sccvn.c (do_rpo_vn): Likewise.
	* tree-ssa-threadupdate.c
	(jump_thread_path_registry::thread_through_all_blocks): Likewise.
	* tree-vectorizer.c (vectorize_loops): Likewise.
	* tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

[-- Attachment #2: range-based-v3.patch --]
[-- Type: text/plain, Size: 44652 bytes --]

---
 gcc/cfgloop.c                                 |  19 +--
 gcc/cfgloop.h                                 | 140 +++++++++++++-----
 gcc/cfgloopmanip.c                            |   7 +-
 .../aarch64/falkor-tag-collision-avoidance.c  |   4 +-
 gcc/config/mn10300/mn10300.c                  |   4 +-
 gcc/config/s390/s390.c                        |   4 +-
 gcc/doc/loop.texi                             |  16 +-
 gcc/gimple-loop-interchange.cc                |   3 +-
 gcc/gimple-loop-jam.c                         |   3 +-
 gcc/gimple-loop-versioning.cc                 |   6 +-
 gcc/gimple-ssa-split-paths.c                  |   3 +-
 gcc/graphite-isl-ast-to-gimple.c              |   5 +-
 gcc/graphite.c                                |   6 +-
 gcc/ipa-fnsummary.c                           |   2 +-
 gcc/ipa-pure-const.c                          |   3 +-
 gcc/loop-doloop.c                             |   8 +-
 gcc/loop-init.c                               |   5 +-
 gcc/loop-invariant.c                          |  14 +-
 gcc/loop-unroll.c                             |   7 +-
 gcc/modulo-sched.c                            |   5 +-
 gcc/predict.c                                 |   5 +-
 gcc/profile.c                                 |   3 +-
 gcc/sel-sched-ir.c                            |  12 +-
 gcc/tree-cfg.c                                |  13 +-
 gcc/tree-if-conv.c                            |   3 +-
 gcc/tree-loop-distribution.c                  |   2 +-
 gcc/tree-parloops.c                           |   3 +-
 gcc/tree-predcom.c                            |   3 +-
 gcc/tree-scalar-evolution.c                   |  16 +-
 gcc/tree-ssa-dce.c                            |   3 +-
 gcc/tree-ssa-live.c                           |   3 +-
 gcc/tree-ssa-loop-ch.c                        |   3 +-
 gcc/tree-ssa-loop-im.c                        |   7 +-
 gcc/tree-ssa-loop-ivcanon.c                   |   3 +-
 gcc/tree-ssa-loop-ivopts.c                    |   3 +-
 gcc/tree-ssa-loop-manip.c                     |   3 +-
 gcc/tree-ssa-loop-niter.c                     |   8 +-
 gcc/tree-ssa-loop-prefetch.c                  |   3 +-
 gcc/tree-ssa-loop-split.c                     |   7 +-
 gcc/tree-ssa-loop-unswitch.c                  |   3 +-
 gcc/tree-ssa-loop.c                           |   6 +-
 gcc/tree-ssa-propagate.c                      |   3 +-
 gcc/tree-ssa-sccvn.c                          |   3 +-
 gcc/tree-ssa-threadupdate.c                   |   3 +-
 gcc/tree-vectorizer.c                         |   4 +-
 gcc/tree-vrp.c                                |   3 +-
 46 files changed, 195 insertions(+), 197 deletions(-)

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index f094538b9ff..d16591b2e1b 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -162,14 +162,12 @@ flow_loop_dump (const class loop *loop, FILE *file,
 void
 flow_loops_dump (FILE *file, void (*loop_dump_aux) (const class loop *, FILE *, int), int verbose)
 {
-  class loop *loop;
-
   if (!current_loops || ! file)
     return;
 
   fprintf (file, ";; %d loops found\n", number_of_loops (cfun));
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (class loop *loop : loops_list (cfun, LI_INCLUDE_ROOT))
     {
       flow_loop_dump (loop, file, loop_dump_aux, verbose);
     }
@@ -559,8 +557,7 @@ sort_sibling_loops (function *fn)
   free (rc_order);
 
   auto_vec<loop_p, 3> siblings;
-  loop_p loop;
-  FOR_EACH_LOOP_FN (fn, loop, LI_INCLUDE_ROOT)
+  for (class loop *loop : loops_list (fn, LI_INCLUDE_ROOT))
     if (loop->inner && loop->inner->next)
       {
 	loop_p sibling = loop->inner;
@@ -836,9 +833,7 @@ disambiguate_multiple_latches (class loop *loop)
 void
 disambiguate_loops_with_multiple_latches (void)
 {
-  class loop *loop;
-
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       if (!loop->latch)
 	disambiguate_multiple_latches (loop);
@@ -1457,7 +1452,7 @@ verify_loop_structure (void)
   auto_sbitmap visited (last_basic_block_for_fn (cfun));
   bitmap_clear (visited);
   bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       unsigned n;
 
@@ -1503,7 +1498,7 @@ verify_loop_structure (void)
   free (bbs);
 
   /* Check headers and latches.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       i = loop->num;
       if (loop->header == NULL)
@@ -1629,7 +1624,7 @@ verify_loop_structure (void)
     }
 
   /* Check the recorded loop exits.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       if (!loop->exits || loop->exits->e != NULL)
 	{
@@ -1723,7 +1718,7 @@ verify_loop_structure (void)
 	  err = 1;
 	}
 
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	{
 	  eloops = 0;
 	  for (exit = loop->exits->next; exit->e; exit = exit->next)
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 5e699276c88..741df44ea51 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -658,55 +658,141 @@ enum li_flags
   LI_ONLY_INNERMOST = 4		/* Iterate only over innermost loops.  */
 };
 
-/* The iterator for loops.  */
+/* Provide the functionality of std::as_const to support range-based for
+   to use const iterator.  (We can't use std::as_const itself because it's
+   a C++17 feature.)  */
+template <typename T>
+constexpr const T &
+as_const (T &t)
+{
+  return t;
+}
+
+/* A list for visiting loops, which contains the loop numbers instead of
+   the loop pointers.  The scope is restricted in function FN and the
+   visiting order is specified by FLAGS.  */
 
-class loop_iterator
+class loops_list
 {
 public:
-  loop_iterator (function *fn, loop_p *loop, unsigned flags);
+  loops_list (function *fn, unsigned flags);
+
+  template <typename T> class Iter
+  {
+  public:
+    Iter (const loops_list &l, unsigned idx) : list (l), curr_idx (idx)
+    {
+      fill_curr_loop ();
+    }
+
+    T operator* () const { return curr_loop; }
+
+    Iter &
+    operator++ ()
+    {
+      if (curr_idx < list.to_visit.length ())
+	{
+	  /* Bump the index and fill a new one.  */
+	  curr_idx++;
+	  fill_curr_loop ();
+	}
+      else
+	gcc_assert (!curr_loop);
+
+      return *this;
+    }
+
+    bool
+    operator!= (const Iter &rhs) const
+    {
+      return this->curr_idx != rhs.curr_idx;
+    }
+
+  private:
+    /* Fill the current loop starting from the current index.  */
+    void fill_curr_loop ();
+
+    /* Reference to the loop list to visit.  */
+    const loops_list &list;
+
+    /* The current index in the list to visit.  */
+    unsigned curr_idx;
 
-  inline loop_p next ();
+    /* The loop implied by the current index.  */
+    loop_p curr_loop;
+  };
 
+  using iterator = Iter<loop_p>;
+  using const_iterator = Iter<const loop_p>;
+
+  iterator
+  begin ()
+  {
+    return iterator (*this, 0);
+  }
+
+  iterator
+  end ()
+  {
+    return iterator (*this, to_visit.length ());
+  }
+
+  const_iterator
+  begin () const
+  {
+    return const_iterator (*this, 0);
+  }
+
+  const_iterator
+  end () const
+  {
+    return const_iterator (*this, to_visit.length ());
+  }
+
+private:
   /* The function we are visiting.  */
   function *fn;
 
   /* The list of loops to visit.  */
   auto_vec<int, 16> to_visit;
-
-  /* The index of the actual loop.  */
-  unsigned idx;
 };
 
-inline loop_p
-loop_iterator::next ()
+/* Starting from current index CURR_IDX (inclusive), find one index
+   which stands for one valid loop and fill the found loop as CURR_LOOP,
+   if we can't find one, set CURR_LOOP as null.  */
+
+template <typename T>
+inline void
+loops_list::Iter<T>::fill_curr_loop ()
 {
   int anum;
 
-  while (this->to_visit.iterate (this->idx, &anum))
+  while (this->list.to_visit.iterate (this->curr_idx, &anum))
     {
-      this->idx++;
-      loop_p loop = get_loop (fn, anum);
+      loop_p loop = get_loop (this->list.fn, anum);
       if (loop)
-	return loop;
+	{
+	  curr_loop = loop;
+	  return;
+	}
+      this->curr_idx++;
     }
 
-  return NULL;
+  curr_loop = nullptr;
 }
 
-inline
-loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
+/* Set up the loops list to visit according to the specified
+   function scope FN and iterating order FLAGS.  */
+
+inline loops_list::loops_list (function *fn, unsigned flags)
 {
   class loop *aloop;
   unsigned i;
   int mn;
 
-  this->idx = 0;
   this->fn = fn;
   if (!loops_for_fn (fn))
-    {
-      *loop = NULL;
-      return;
-    }
+    return;
 
   this->to_visit.reserve_exact (number_of_loops (fn));
   mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
@@ -766,20 +852,8 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
 	    }
 	}
     }
-
-  *loop = this->next ();
 }
 
-#define FOR_EACH_LOOP(LOOP, FLAGS) \
-  for (loop_iterator li(cfun, &(LOOP), FLAGS); \
-       (LOOP); \
-       (LOOP) = li.next ())
-
-#define FOR_EACH_LOOP_FN(FN, LOOP, FLAGS) \
-  for (loop_iterator li(FN, &(LOOP), FLAGS); \
-       (LOOP); \
-       (LOOP) = li.next ())
-
 /* The properties of the target.  */
 struct target_cfgloop {
   /* Number of available registers.  */
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 2af59fedc92..28087a14e0f 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1572,12 +1572,10 @@ create_preheader (class loop *loop, int flags)
 void
 create_preheaders (int flags)
 {
-  class loop *loop;
-
   if (!current_loops)
     return;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     create_preheader (loop, flags);
   loops_state_set (LOOPS_HAVE_PREHEADERS);
 }
@@ -1587,10 +1585,9 @@ create_preheaders (int flags)
 void
 force_single_succ_latches (void)
 {
-  class loop *loop;
   edge e;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       if (loop->latch != loop->header && single_succ_p (loop->latch))
 	continue;
diff --git a/gcc/config/aarch64/falkor-tag-collision-avoidance.c b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
index de214e4a0f7..d9b1b3fe835 100644
--- a/gcc/config/aarch64/falkor-tag-collision-avoidance.c
+++ b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
@@ -808,8 +808,6 @@ record_loads (tag_map_t &tag_map, struct loop *loop)
 void
 execute_tag_collision_avoidance ()
 {
-  struct loop *loop;
-
   df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_chain_add_problem (DF_UD_CHAIN);
   df_compute_regs_ever_live (true);
@@ -824,7 +822,7 @@ execute_tag_collision_avoidance ()
   calculate_dominance_info (CDI_DOMINATORS);
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       tag_map_t tag_map (512);
 
diff --git a/gcc/config/mn10300/mn10300.c b/gcc/config/mn10300/mn10300.c
index 6f842a3ad32..d9229ff5cc6 100644
--- a/gcc/config/mn10300/mn10300.c
+++ b/gcc/config/mn10300/mn10300.c
@@ -3234,8 +3234,6 @@ mn10300_loop_contains_call_insn (loop_p loop)
 static void
 mn10300_scan_for_setlb_lcc (void)
 {
-  loop_p loop;
-
   DUMP ("Looking for loops that can use the SETLB insn", NULL_RTX);
 
   df_analyze ();
@@ -3248,7 +3246,7 @@ mn10300_scan_for_setlb_lcc (void)
      if an inner loop is not suitable for use with the SETLB/Lcc insns, it may
      be the case that its parent loop is suitable.  Thus we should check all
      loops, but work from the innermost outwards.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       const char * reason = NULL;
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b1d3b99784d..a98e1beebf1 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -14479,15 +14479,13 @@ s390_adjust_loop_scan_osc (struct loop* loop)
 static void
 s390_adjust_loops ()
 {
-  struct loop *loop = NULL;
-
   df_analyze ();
   compute_bb_for_insn ();
 
   /* Find the loops.  */
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       if (dump_file)
 	{
diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
index a135656ed01..27697b08728 100644
--- a/gcc/doc/loop.texi
+++ b/gcc/doc/loop.texi
@@ -79,14 +79,14 @@ and its subloops in the numbering.  The index of a loop never changes.
 
 The entries of the @code{larray} field should not be accessed directly.
 The function @code{get_loop} returns the loop description for a loop with
-the given index.  @code{number_of_loops} function returns number of
-loops in the function.  To traverse all loops, use @code{FOR_EACH_LOOP}
-macro.  The @code{flags} argument of the macro is used to determine
-the direction of traversal and the set of loops visited.  Each loop is
-guaranteed to be visited exactly once, regardless of the changes to the
-loop tree, and the loops may be removed during the traversal.  The newly
-created loops are never traversed, if they need to be visited, this
-must be done separately after their creation.
+the given index.  @code{number_of_loops} function returns number of loops
+in the function.  To traverse all loops, use range-based for loop with
+class @code{loop_list} instance. The @code{flags} argument of the macro
+is used to determine the direction of traversal and the set of loops
+visited.  Each loop is guaranteed to be visited exactly once, regardless
+of the changes to the loop tree, and the loops may be removed during the
+traversal.  The newly created loops are never traversed, if they need to
+be visited, this must be done separately after their creation.
 
 Each basic block contains the reference to the innermost loop it belongs
 to (@code{loop_father}).  For this reason, it is only possible to have
diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index 7a88faa2c07..a9044182a59 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -2089,8 +2089,7 @@ pass_linterchange::execute (function *fun)
     return 0;
 
   bool changed_p = false;
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       vec<loop_p> loop_nest = vNULL;
       vec<data_reference_p> datarefs = vNULL;
diff --git a/gcc/gimple-loop-jam.c b/gcc/gimple-loop-jam.c
index 4842f0dff80..271139a1d87 100644
--- a/gcc/gimple-loop-jam.c
+++ b/gcc/gimple-loop-jam.c
@@ -486,13 +486,12 @@ adjust_unroll_factor (class loop *inner, struct data_dependence_relation *ddr,
 static unsigned int
 tree_loop_unroll_and_jam (void)
 {
-  class loop *loop;
   bool changed = false;
 
   gcc_assert (scev_initialized_p ());
 
   /* Go through all innermost loops.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       class loop *outer = loop_outer (loop);
 
diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
index 4b70c5a4aab..3fd086c0c65 100644
--- a/gcc/gimple-loop-versioning.cc
+++ b/gcc/gimple-loop-versioning.cc
@@ -1428,8 +1428,7 @@ loop_versioning::analyze_blocks ()
      versioning at that level could be useful in some cases.  */
   get_loop_info (get_loop (m_fn, 0)).rejected_p = true;
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop_info &linfo = get_loop_info (loop);
 
@@ -1650,8 +1649,7 @@ loop_versioning::make_versioning_decisions ()
   AUTO_DUMP_SCOPE ("make_versioning_decisions",
 		   dump_user_location_t::from_function_decl (m_fn->decl));
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop_info &linfo = get_loop_info (loop);
       if (decide_whether_loop_is_versionable (loop))
diff --git a/gcc/gimple-ssa-split-paths.c b/gcc/gimple-ssa-split-paths.c
index 2dd953d5ef9..1470de279d6 100644
--- a/gcc/gimple-ssa-split-paths.c
+++ b/gcc/gimple-ssa-split-paths.c
@@ -473,13 +473,12 @@ static bool
 split_paths ()
 {
   bool changed = false;
-  loop_p loop;
 
   loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
   initialize_original_copy_tables ();
   calculate_dominance_info (CDI_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Only split paths if we are optimizing this loop for speed.  */
       if (!optimize_loop_for_speed_p (loop))
diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index c202213f39b..30370859460 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -1535,9 +1535,8 @@ graphite_regenerate_ast_isl (scop_p scop)
       if_region->false_region->region.entry->flags |= EDGE_FALLTHRU;
       /* remove_edge_and_dominated_blocks marks loops for removal but
 	 doesn't actually remove them (fix that...).  */
-      loop_p loop;
-      FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
-	if (! loop->header)
+      for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
+	if (!loop->header)
 	  delete_loop (loop);
     }
 
diff --git a/gcc/graphite.c b/gcc/graphite.c
index 6c4fb42282b..a26d5a3a818 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -377,8 +377,7 @@ canonicalize_loop_closed_ssa (loop_p loop, edge e)
 static void
 canonicalize_loop_form (void)
 {
-  loop_p loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       edge e = single_exit (loop);
       if (!e || (e->flags & (EDGE_COMPLEX|EDGE_FAKE)))
@@ -494,10 +493,9 @@ graphite_transform_loops (void)
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
-      loop_p loop;
       int num_no_dependency = 0;
 
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	if (loop->can_be_parallel)
 	  num_no_dependency++;
 
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 95d28757f95..883773abe12 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -2923,7 +2923,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
       if (dump_file && (dump_flags & TDF_DETAILS))
 	flow_loops_dump (dump_file, NULL, 0);
       scev_initialize ();
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	{
 	  predicate loop_iterations = true;
 	  sreal header_freq;
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index f045108af21..f1724e4f46d 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -1087,9 +1087,8 @@ end:
 	    }
 	  else
 	    {
-	      class loop *loop;
 	      scev_initialize ();
-	      FOR_EACH_LOOP (loop, 0)
+	      for (class loop *loop : loops_list (cfun, 0))
 		if (!finite_loop_p (loop))
 		  {
 		    if (dump_file)
diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
index dda7b9e268f..12288bd64d8 100644
--- a/gcc/loop-doloop.c
+++ b/gcc/loop-doloop.c
@@ -789,18 +789,14 @@ doloop_optimize (class loop *loop)
 void
 doloop_optimize_loops (void)
 {
-  class loop *loop;
-
   if (optimize == 1)
     {
       df_live_add_problem ();
       df_live_set_all_dirty ();
     }
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      doloop_optimize (loop);
-    }
+  for (class loop *loop : loops_list (cfun, 0))
+    doloop_optimize (loop);
 
   if (optimize == 1)
     df_remove_problem (df_live);
diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 1fde0ede441..1f75f509f53 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -137,7 +137,6 @@ loop_optimizer_init (unsigned flags)
 void
 loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
 {
-  class loop *loop;
   basic_block bb;
 
   timevar_push (TV_LOOP_FINI);
@@ -167,7 +166,7 @@ loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
       goto loop_fini_done;
     }
 
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (class loop *loop : loops_list (fn, 0))
     free_simple_loop_desc (loop);
 
   /* Clean up.  */
@@ -229,7 +228,7 @@ fix_loop_structure (bitmap changed_bbs)
      loops, so that when we remove the loops, we know that the loops inside
      are preserved, and do not waste time relinking loops that will be
      removed later.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Detect the case that the loop is no longer present even though
          it wasn't marked for removal.
diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index bdc7b59dd5f..059ef8fe7f4 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -2136,7 +2136,7 @@ calculate_loop_reg_pressure (void)
   rtx link;
   class loop *loop, *parent;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     if (loop->aux == NULL)
       {
 	loop->aux = xcalloc (1, sizeof (class loop_data));
@@ -2203,7 +2203,7 @@ calculate_loop_reg_pressure (void)
   bitmap_release (&curr_regs_live);
   if (flag_ira_region == IRA_REGION_MIXED
       || flag_ira_region == IRA_REGION_ALL)
-    FOR_EACH_LOOP (loop, 0)
+    for (class loop *loop : loops_list (cfun, 0))
       {
 	EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi)
 	  if (! bitmap_bit_p (&LOOP_DATA (loop)->regs_ref, j))
@@ -2217,7 +2217,7 @@ calculate_loop_reg_pressure (void)
       }
   if (dump_file == NULL)
     return;
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       parent = loop_outer (loop);
       fprintf (dump_file, "\n  Loop %d (parent %d, header bb%d, depth %d)\n",
@@ -2251,8 +2251,6 @@ calculate_loop_reg_pressure (void)
 void
 move_loop_invariants (void)
 {
-  class loop *loop;
-
   if (optimize == 1)
     df_live_add_problem ();
   /* ??? This is a hack.  We should only need to call df_live_set_all_dirty
@@ -2271,7 +2269,7 @@ move_loop_invariants (void)
     }
   df_set_flags (DF_EQ_NOTES + DF_DEFER_INSN_RESCAN);
   /* Process the loops, innermost first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       curr_loop = loop;
       /* move_single_loop_invariants for very large loops is time consuming
@@ -2284,10 +2282,8 @@ move_loop_invariants (void)
 	move_single_loop_invariants (loop);
     }
 
-  FOR_EACH_LOOP (loop, 0)
-    {
+  for (class loop *loop : loops_list (cfun, 0))
       free_loop_data (loop);
-    }
 
   if (flag_ira_loop_pressure)
     /* There is no sense to keep this info because it was most
diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c
index 66d93487e29..f11c9c396ae 100644
--- a/gcc/loop-unroll.c
+++ b/gcc/loop-unroll.c
@@ -214,10 +214,8 @@ report_unroll (class loop *loop, dump_location_t locus)
 static void
 decide_unrolling (int flags)
 {
-  class loop *loop;
-
   /* Scan the loops, inner ones first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop->lpt_decision.decision = LPT_NONE;
       dump_user_location_t locus = get_loop_location (loop);
@@ -278,14 +276,13 @@ decide_unrolling (int flags)
 void
 unroll_loops (int flags)
 {
-  class loop *loop;
   bool changed = false;
 
   /* Now decide rest of unrolling.  */
   decide_unrolling (flags);
 
   /* Scan the loops, inner ones first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* And perform the appropriate transformations.  */
       switch (loop->lpt_decision.decision)
diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
index e72e46db387..fc761cf60c5 100644
--- a/gcc/modulo-sched.c
+++ b/gcc/modulo-sched.c
@@ -1353,7 +1353,6 @@ sms_schedule (void)
   int maxii, max_asap;
   partial_schedule_ptr ps;
   basic_block bb = NULL;
-  class loop *loop;
   basic_block condition_bb = NULL;
   edge latch_edge;
   HOST_WIDE_INT trip_count, max_trip_count;
@@ -1397,7 +1396,7 @@ sms_schedule (void)
 
   /* Build DDGs for all the relevant loops and hold them in G_ARR
      indexed by the loop index.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       rtx_insn *head, *tail;
       rtx count_reg;
@@ -1543,7 +1542,7 @@ sms_schedule (void)
   }
 
   /* We don't want to perform SMS on new loops - created by versioning.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       rtx_insn *head, *tail;
       rtx count_reg;
diff --git a/gcc/predict.c b/gcc/predict.c
index d751e6cecce..5876b6e44a8 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -1949,7 +1949,7 @@ predict_loops (void)
 
   /* Try to predict out blocks in a loop that are not part of a
      natural loop.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       basic_block bb, *bbs;
       unsigned j, n_exits = 0;
@@ -4111,8 +4111,7 @@ pass_profile::execute (function *fun)
     profile_status_for_fn (fun) = PROFILE_GUESSED;
  if (dump_file && (dump_flags & TDF_DETAILS))
    {
-     class loop *loop;
-     FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+     for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
        if (loop->header->count.initialized_p ())
          fprintf (dump_file, "Loop got predicted %d to iterate %i times.\n",
        	   loop->num,
diff --git a/gcc/profile.c b/gcc/profile.c
index 1fa4196fa16..6357fb37cfd 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -1466,13 +1466,12 @@ branch_prob (bool thunk)
   if (flag_branch_probabilities
       && (profile_status_for_fn (cfun) == PROFILE_READ))
     {
-      class loop *loop;
       if (dump_file && (dump_flags & TDF_DETAILS))
 	report_predictor_hitrates ();
 
       /* At this moment we have precise loop iteration count estimates.
 	 Record them to loop structure before the profile gets out of date. */
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	if (loop->header->count > 0 && loop->header->count.reliable_p ())
 	  {
 	    gcov_type nit = expected_loop_iterations_unbounded (loop);
diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index eef9d6969f4..3dff69f21ce 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -6247,10 +6247,8 @@ make_regions_from_the_rest (void)
 /* Free data structures used in pipelining of loops.  */
 void sel_finish_pipelining (void)
 {
-  class loop *loop;
-
   /* Release aux fields so we don't free them later by mistake.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     loop->aux = NULL;
 
   loop_optimizer_finalize ();
@@ -6271,11 +6269,11 @@ sel_find_rgns (void)
 
   if (current_loops)
     {
-      loop_p loop;
+      unsigned flags = flag_sel_sched_pipelining_outer_loops
+			 ? LI_FROM_INNERMOST
+			 : LI_ONLY_INNERMOST;
 
-      FOR_EACH_LOOP (loop, (flag_sel_sched_pipelining_outer_loops
-			    ? LI_FROM_INNERMOST
-			    : LI_ONLY_INNERMOST))
+      for (class loop *loop : loops_list (cfun, flags))
 	make_regions_from_loop_nest (loop);
     }
 
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index c8b0f7b33e1..0d082851f2f 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -312,12 +312,11 @@ replace_loop_annotate_in_block (basic_block bb, class loop *loop)
 static void
 replace_loop_annotate (void)
 {
-  class loop *loop;
   basic_block bb;
   gimple_stmt_iterator gsi;
   gimple *stmt;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       /* First look into the header.  */
       replace_loop_annotate_in_block (loop->header, loop);
@@ -2027,12 +2026,8 @@ replace_uses_by (tree name, tree val)
   /* Also update the trees stored in loop structures.  */
   if (current_loops)
     {
-      class loop *loop;
-
-      FOR_EACH_LOOP (loop, 0)
-	{
+      for (class loop *loop : loops_list (cfun, 0))
 	  substitute_in_loop_info (loop, name, val);
-	}
     }
 }
 
@@ -7752,9 +7747,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
 
   /* Fix up orig_loop_num.  If the block referenced in it has been moved
      to dest_cfun, update orig_loop_num field, otherwise clear it.  */
-  class loop *dloop;
+  class loop *dloop = NULL;
   signed char *moved_orig_loop_num = NULL;
-  FOR_EACH_LOOP_FN (dest_cfun, dloop, 0)
+  for (class loop *dloop : loops_list (dest_cfun, 0))
     if (dloop->orig_loop_num)
       {
 	if (moved_orig_loop_num == NULL)
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 345488e2a19..1bf68455f72 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -3300,14 +3300,13 @@ pass_if_conversion::gate (function *fun)
 unsigned int
 pass_if_conversion::execute (function *fun)
 {
-  class loop *loop;
   unsigned todo = 0;
 
   if (number_of_loops (fun) <= 1)
     return 0;
 
   auto_vec<gimple *> preds;
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     if (flag_tree_loop_if_convert == 1
 	|| ((flag_tree_loop_vectorize || loop->force_vectorize)
 	    && !loop->dont_vectorize))
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 65aa1df4aba..1f0b6eb3a95 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -3312,7 +3312,7 @@ loop_distribution::execute (function *fun)
 
   /* We can at the moment only distribute non-nested loops, thus restrict
      walking to innermost loops.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       /* Don't distribute multiple exit edges loop, or cold loop when
          not doing pattern detection.  */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index fe1baef32a7..9d73f241204 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -3989,7 +3989,6 @@ parallelize_loops (bool oacc_kernels_p)
 {
   unsigned n_threads;
   bool changed = false;
-  class loop *loop;
   class loop *skip_loop = NULL;
   class tree_niter_desc niter_desc;
   struct obstack parloop_obstack;
@@ -4020,7 +4019,7 @@ parallelize_loops (bool oacc_kernels_p)
 
   calculate_dominance_info (CDI_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       if (loop == skip_loop)
 	{
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index cf85517e1c7..b8ae6b66df9 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -3419,11 +3419,10 @@ pcom_worker::tree_predictive_commoning_loop (bool allow_unroll_p)
 unsigned
 tree_predictive_commoning (bool allow_unroll_p)
 {
-  class loop *loop;
   unsigned ret = 0, changed = 0;
 
   initialize_original_copy_tables ();
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
     if (optimize_loop_for_speed_p (loop))
       {
 	pcom_worker w(loop);
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index b22d49a0ab6..c6a3b7f9bd7 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -2977,16 +2977,12 @@ gather_stats_on_scev_database (void)
 void
 scev_initialize (void)
 {
-  class loop *loop;
-
   gcc_assert (! scev_initialized_p ());
 
   scalar_evolution_info = hash_table<scev_info_hasher>::create_ggc (100);
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      loop->nb_iterations = NULL_TREE;
-    }
+  for (class loop *loop : loops_list (cfun, 0))
+    loop->nb_iterations = NULL_TREE;
 }
 
 /* Return true if SCEV is initialized.  */
@@ -3015,14 +3011,10 @@ scev_reset_htab (void)
 void
 scev_reset (void)
 {
-  class loop *loop;
-
   scev_reset_htab ();
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      loop->nb_iterations = NULL_TREE;
-    }
+  for (class loop *loop : loops_list (cfun, 0))
+    loop->nb_iterations = NULL_TREE;
 }
 
 /* Return true if the IV calculation in TYPE can overflow based on the knowledge
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index e2d3b63a30c..226cbc18a2f 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -417,7 +417,6 @@ find_obviously_necessary_stmts (bool aggressive)
   /* Prevent the empty possibly infinite loops from being removed.  */
   if (aggressive)
     {
-      class loop *loop;
       if (mark_irreducible_loops ())
 	FOR_EACH_BB_FN (bb, cfun)
 	  {
@@ -433,7 +432,7 @@ find_obviously_necessary_stmts (bool aggressive)
 		}
 	  }
 
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	if (!finite_loop_p (loop))
 	  {
 	    if (dump_file)
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index a2aab25e862..16a3f7bb0bb 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -908,8 +908,7 @@ remove_unused_locals (void)
 
   if (cfun->has_simduid_loops)
     {
-      class loop *loop;
-      FOR_EACH_LOOP (loop, 0)
+      for (class loop *loop : loops_list (cfun, 0))
 	if (loop->simduid && !is_used_p (loop->simduid))
 	  loop->simduid = NULL_TREE;
     }
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index dfa5dc87c34..57fa3ee3608 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -348,7 +348,6 @@ protected:
 unsigned int
 ch_base::copy_headers (function *fun)
 {
-  class loop *loop;
   basic_block header;
   edge exit, entry;
   basic_block *bbs, *copied_bbs;
@@ -365,7 +364,7 @@ ch_base::copy_headers (function *fun)
 
   auto_vec<std::pair<edge, loop_p> > copied;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       int initial_limit = param_max_loop_header_insns;
       int remaining_limit = initial_limit;
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 81b4ec21d6e..a8cc9bc8151 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -1662,7 +1662,7 @@ analyze_memory_references (bool store_motion)
 {
   gimple_stmt_iterator bsi;
   basic_block bb, *bbs;
-  class loop *loop, *outer;
+  class loop *outer;
   unsigned i, n;
 
   /* Collect all basic-blocks in loops and sort them after their
@@ -1706,7 +1706,7 @@ analyze_memory_references (bool store_motion)
 
   /* Propagate the information about accessed memory references up
      the loop hierarchy.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Finalize the overall touched references (including subloops).  */
       bitmap_ior_into (&memory_accesses.all_refs_stored_in_loop[loop->num],
@@ -3133,7 +3133,6 @@ fill_always_executed_in (void)
 static void
 tree_ssa_lim_initialize (bool store_motion)
 {
-  class loop *loop;
   unsigned i;
 
   bitmap_obstack_initialize (&lim_bitmap_obstack);
@@ -3177,7 +3176,7 @@ tree_ssa_lim_initialize (bool store_motion)
      its postorder index.  */
   i = 0;
   bb_loop_postorder = XNEWVEC (unsigned, number_of_loops (cfun));
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     bb_loop_postorder[loop->num] = i++;
 }
 
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index b1971f83544..81e9a22be4c 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -1285,14 +1285,13 @@ canonicalize_loop_induction_variables (class loop *loop,
 unsigned int
 canonicalize_induction_variables (void)
 {
-  class loop *loop;
   bool changed = false;
   bool irred_invalidated = false;
   bitmap loop_closed_ssa_invalidated = BITMAP_ALLOC (NULL);
 
   estimate_numbers_of_iterations (cfun);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       changed |= canonicalize_loop_induction_variables (loop,
 							true, UL_SINGLE_ITER,
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 12a8a49a307..43668463b21 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -8066,14 +8066,13 @@ finish:
 void
 tree_ssa_iv_optimize (void)
 {
-  class loop *loop;
   struct ivopts_data data;
   auto_bitmap toremove;
 
   tree_ssa_iv_optimize_init (&data);
 
   /* Optimize the loops starting with the innermost ones.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!dbg_cnt (ivopts_loop))
 	continue;
diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index 28ae1316fa0..15dab0c1451 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -362,11 +362,10 @@ add_exit_phis (bitmap names_to_rename, bitmap *use_blocks, bitmap *loop_exits)
 static void
 get_loops_exits (bitmap *loop_exits)
 {
-  class loop *loop;
   unsigned j;
   edge e;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       auto_vec<edge> exit_edges = get_loop_exit_edges (loop);
       loop_exits[loop->num] = BITMAP_ALLOC (&loop_renamer_obstack);
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 6fabf10a215..84e5ce064fe 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -4559,13 +4559,11 @@ estimated_stmt_executions (class loop *loop, widest_int *nit)
 void
 estimate_numbers_of_iterations (function *fn)
 {
-  class loop *loop;
-
   /* We don't want to issue signed overflow warnings while getting
      loop iteration estimates.  */
   fold_defer_overflow_warnings ();
 
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (class loop *loop : loops_list (fn, 0))
     estimate_numbers_of_iterations (loop);
 
   fold_undefer_and_ignore_overflow_warnings ();
@@ -5031,9 +5029,7 @@ free_numbers_of_iterations_estimates (class loop *loop)
 void
 free_numbers_of_iterations_estimates (function *fn)
 {
-  class loop *loop;
-
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (class loop *loop : loops_list (fn, 0))
     free_numbers_of_iterations_estimates (loop);
 }
 
diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index 98062eb4616..b1af922d403 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -1980,7 +1980,6 @@ fail:
 unsigned int
 tree_ssa_prefetch_arrays (void)
 {
-  class loop *loop;
   bool unrolled = false;
   int todo_flags = 0;
 
@@ -2025,7 +2024,7 @@ tree_ssa_prefetch_arrays (void)
       set_builtin_decl (BUILT_IN_PREFETCH, decl, false);
     }
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file, "Processing loop %d:\n", loop->num);
diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index 3a09bbc39e5..40629227863 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -1598,18 +1598,17 @@ split_loop_on_cond (struct loop *loop)
 static unsigned int
 tree_ssa_split_loops (void)
 {
-  class loop *loop;
   bool changed = false;
 
   gcc_assert (scev_initialized_p ());
 
   calculate_dominance_info (CDI_POST_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (class loop *loop : loops_list (cfun, LI_INCLUDE_ROOT))
     loop->aux = NULL;
 
   /* Go through all loops starting from innermost.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (loop->aux)
 	{
@@ -1630,7 +1629,7 @@ tree_ssa_split_loops (void)
 	}
     }
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (class loop *loop : loops_list (cfun, LI_INCLUDE_ROOT))
     loop->aux = NULL;
 
   clear_aux_for_blocks ();
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 04d4553f13e..c4adb52a67c 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -90,11 +90,10 @@ static tree get_vop_from_header (class loop *);
 unsigned int
 tree_ssa_unswitch_loops (void)
 {
-  class loop *loop;
   bool changed = false;
 
   /* Go through all loops starting from innermost.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!loop->inner)
 	/* Unswitch innermost loop.  */
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 957ac0f3baa..2255c228780 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -157,8 +157,7 @@ gate_oacc_kernels (function *fn)
   if (!lookup_attribute ("oacc kernels", DECL_ATTRIBUTES (fn->decl)))
     return false;
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     if (loop->in_oacc_kernels_region)
       return true;
 
@@ -455,12 +454,11 @@ public:
 unsigned
 pass_scev_cprop::execute (function *)
 {
-  class loop *loop;
   bool any = false;
 
   /* Perform final value replacement in loops, in case the replacement
      expressions are cheap.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     any |= final_value_replacement_loop (loop);
 
   return any ? TODO_cleanup_cfg | TODO_update_ssa_only_virtuals : 0;
diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index d93ec90b002..1ac4d050f99 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -1262,7 +1262,6 @@ clean_up_loop_closed_phi (function *fun)
   tree rhs;
   tree lhs;
   gphi_iterator gsi;
-  struct loop *loop;
 
   /* Avoid possibly quadratic work when scanning for loop exits across
    all loops of a nest.  */
@@ -1274,7 +1273,7 @@ clean_up_loop_closed_phi (function *fun)
   calculate_dominance_info  (CDI_DOMINATORS);
 
   /* Walk over loop in function.  */
-  FOR_EACH_LOOP_FN (fun, loop, 0)
+  for (class loop *loop : loops_list (fun, 0))
     {
       /* Check each exit edege of loop.  */
       auto_vec<edge> exits = get_loop_exit_edges (loop);
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 7900df946f4..1bc22950790 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -7637,9 +7637,8 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
      loops and the outermost one optimistically.  */
   if (iterate)
     {
-      loop_p loop;
       unsigned max_depth = param_rpo_vn_max_loop_depth;
-      FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+      for (class loop *loop : loops_list (cfun, LI_ONLY_INNERMOST))
 	if (loop_depth (loop) > max_depth)
 	  for (unsigned i = 2;
 	       i < loop_depth (loop) - max_depth; ++i)
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index f496dd3eb8c..ad1fd115161 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2561,7 +2561,6 @@ jump_thread_path_registry::thread_through_all_blocks
 {
   bool retval = false;
   unsigned int i;
-  class loop *loop;
   auto_bitmap threaded_blocks;
   hash_set<edge> visited_starting_edges;
 
@@ -2702,7 +2701,7 @@ jump_thread_path_registry::thread_through_all_blocks
   /* Then perform the threading through loop headers.  We start with the
      innermost loop, so that the changes in cfg we perform won't affect
      further threading.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (class loop *loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!loop->header
 	  || !bitmap_bit_p (threaded_blocks, loop->header->index))
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index f1035a83826..ebf237d054a 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -1194,7 +1194,7 @@ vectorize_loops (void)
   /* If some loop was duplicated, it gets bigger number
      than all previously defined loops.  This fact allows us to run
      only over initial loops skipping newly generated ones.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     if (loop->dont_vectorize)
       {
 	any_ifcvt_loops = true;
@@ -1213,7 +1213,7 @@ vectorize_loops (void)
 		  loop4 (copy of loop2)
 		else
 		  loop5 (copy of loop4)
-	   If FOR_EACH_LOOP gives us loop3 first (which has
+	   If loops' iteration gives us loop3 first (which has
 	   dont_vectorize set), make sure to process loop1 before loop4;
 	   so that we can prevent vectorization of loop4 if loop1
 	   is successfully vectorized.  */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 0565c9b5073..abcf33c4f26 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3337,8 +3337,7 @@ vrp_asserts::find_assert_locations (void)
   /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
      the order we compute liveness and insert asserts we otherwise
      fail to insert asserts into the loop latch.  */
-  loop_p loop;
-  FOR_EACH_LOOP (loop, 0)
+  for (class loop *loop : loops_list (cfun, 0))
     {
       i = loop->latch->index;
       unsigned int j = single_succ_edge (loop->latch)->dest_idx;

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH] Make loops_list support an optional loop_p root
  2021-07-22 12:56   ` Richard Biener
  2021-07-22 12:56     ` Richard Biener
@ 2021-07-23  8:41     ` Kewen.Lin
  2021-07-23 16:26       ` Martin Sebor
  2021-07-29  8:01       ` Richard Biener
  1 sibling, 2 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-23  8:41 UTC (permalink / raw)
  To: Richard Biener
  Cc: GCC Patches, Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Trevor Saunders, Martin Sebor, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 2892 bytes --]

on 2021/7/22 下午8:56, Richard Biener wrote:
> On Tue, Jul 20, 2021 at 4:37
> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> Hi,
>>
>> This v2 has addressed some review comments/suggestions:
>>
>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>     to support loop hierarchy tree rather than just a function,
>>     and adjust to use loops* accordingly.
> 
> I actually meant struct loop *, not struct loops * ;)  At the point
> we pondered to make loop invariant motion work on single
> loop nests we gave up not only but also because it iterates
> over the loop nest but all the iterators only ever can process
> all loops, not say, all loops inside a specific 'loop' (and
> including that 'loop' if LI_INCLUDE_ROOT).  So the
> CTOR would take the 'root' of the loop tree as argument.
> 
> I see that doesn't trivially fit how loops_list works, at least
> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
> could be adjusted to do ONLY_INNERMOST as well?
> 


Thanks for the clarification!  I just realized that the previous
version with struct loops* is problematic, all traversal is
still bounded with outer_loop == NULL.  I think what you expect
is to respect the given loop_p root boundary.  Since we just
record the loops' nums, I think we still need the function* fn?
So I add one optional argument loop_p root and update the
visiting codes accordingly.  Before this change, the previous
visiting uses the outer_loop == NULL as the termination condition,
it perfectly includes the root itself, but with this given root,
we have to use it as the termination condition to avoid to iterate
onto its possible existing next.

For LI_ONLY_INNERMOST, I was thinking whether we can use the
code like:

    struct loops *fn_loops = loops_for_fn (fn)->larray;
    for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
        if (aloop != NULL
            && aloop->inner == NULL
            && flow_loop_nested_p (tree_root, aloop))
             this->to_visit.quick_push (aloop->num);

it has the stable bound, but if the given root only has several
child loops, it can be much worse if there are many loops in fn.
It seems impossible to predict the given root loop hierarchy size,
maybe we can still use the original linear searching for the case
loops_for_fn (fn) == root?  But since this visiting seems not so
performance critical, I chose to share the code originally used
for FROM_INNERMOST, hope it can have better readability and
maintainability.

Bootstrapped and regtested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu, also
bootstrapped on ppc64le P9 with bootstrap-O3 config.

Does the attached patch meet what you expect?

BR,
Kewen
-----
gcc/ChangeLog:

	* cfgloop.h (loops_list::loops_list): Add one optional argument root
	and adjust accordingly.

[-- Attachment #2: loop_root.diff --]
[-- Type: text/plain, Size: 4609 bytes --]

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 741df44ea51..f7148df1758 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -669,13 +669,15 @@ as_const (T &t)
 }
 
 /* A list for visiting loops, which contains the loop numbers instead of
-   the loop pointers.  The scope is restricted in function FN and the
-   visiting order is specified by FLAGS.  */
+   the loop pointers.  If the loop ROOT is offered (non-null), the visiting
+   will start from it, otherwise it would start from loops_for_fn (FN)
+   instead.  The scope is restricted in function FN and the visiting order
+   is specified by FLAGS.  */
 
 class loops_list
 {
 public:
-  loops_list (function *fn, unsigned flags);
+  loops_list (function *fn, unsigned flags, loop_p root = nullptr);
 
   template <typename T> class Iter
   {
@@ -782,71 +784,94 @@ loops_list::Iter<T>::fill_curr_loop ()
 }
 
 /* Set up the loops list to visit according to the specified
-   function scope FN and iterating order FLAGS.  */
+   function scope FN and iterating order FLAGS.  If ROOT is
+   not null, the visiting would start from it, otherwise it
+   will start from tree_root of loops_for_fn (FN).  */
 
-inline loops_list::loops_list (function *fn, unsigned flags)
+inline loops_list::loops_list (function *fn, unsigned flags, loop_p root)
 {
   class loop *aloop;
-  unsigned i;
   int mn;
 
+  struct loops *loops = loops_for_fn (fn);
+  gcc_assert (!root || loops);
+
   this->fn = fn;
-  if (!loops_for_fn (fn))
+  if (!loops)
     return;
 
+  loop_p tree_root = root ? root : loops->tree_root;
+
   this->to_visit.reserve_exact (number_of_loops (fn));
-  mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
+  mn = (flags & LI_INCLUDE_ROOT) ? -1 : tree_root->num;
 
-  if (flags & LI_ONLY_INNERMOST)
-    {
-      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
-	if (aloop != NULL
-	    && aloop->inner == NULL
-	    && aloop->num >= mn)
+  /* The helper function for LI_FROM_INNERMOST and LI_ONLY_INNERMOST
+     visiting, ONLY_PUSH_INNERMOST_P indicates whether only push
+     the innermost loop, it's true for LI_ONLY_INNERMOST visiting
+     while false for LI_FROM_INNERMOST visiting.  */
+  auto visit_from_innermost = [&] (bool only_push_innermost_p)
+  {
+    /* Push the loops to LI->TO_VISIT in postorder.  */
+
+    /* Early handle tree_root without any inner loops, make later
+       processing simpler, that is the while loop can only care
+       about loops which aren't possible to be tree_root.  */
+    if (!tree_root->inner)
+      {
+	if (tree_root->num != mn)
+	  this->to_visit.quick_push (tree_root->num);
+	return;
+      }
+
+    for (aloop = tree_root;
+	aloop->inner != NULL;
+	aloop = aloop->inner)
+      continue;
+
+    while (1)
+      {
+	gcc_assert (aloop != tree_root);
+	if (!only_push_innermost_p || aloop->inner == NULL)
 	  this->to_visit.quick_push (aloop->num);
-    }
-  else if (flags & LI_FROM_INNERMOST)
-    {
-      /* Push the loops to LI->TO_VISIT in postorder.  */
-      for (aloop = loops_for_fn (fn)->tree_root;
-	   aloop->inner != NULL;
-	   aloop = aloop->inner)
-	continue;
 
-      while (1)
-	{
-	  if (aloop->num >= mn)
-	    this->to_visit.quick_push (aloop->num);
+	if (aloop->next)
+	  {
+	    for (aloop = aloop->next;
+		 aloop->inner != NULL;
+		 aloop = aloop->inner)
+	      continue;
+	  }
+	else if (loop_outer (aloop) == tree_root)
+	  break;
+	else
+	  aloop = loop_outer (aloop);
+      }
+
+    /* Reconsider tree_root since the previous loop doesn't handle it.  */
+    if (!only_push_innermost_p && tree_root->num != mn)
+      this->to_visit.quick_push (tree_root->num);
+  };
 
-	  if (aloop->next)
-	    {
-	      for (aloop = aloop->next;
-		   aloop->inner != NULL;
-		   aloop = aloop->inner)
-		continue;
-	    }
-	  else if (!loop_outer (aloop))
-	    break;
-	  else
-	    aloop = loop_outer (aloop);
-	}
-    }
+  if (flags & LI_ONLY_INNERMOST)
+    visit_from_innermost (true);
+  else if (flags & LI_FROM_INNERMOST)
+    visit_from_innermost (false);
   else
     {
       /* Push the loops to LI->TO_VISIT in preorder.  */
-      aloop = loops_for_fn (fn)->tree_root;
+      aloop = tree_root;
       while (1)
 	{
-	  if (aloop->num >= mn)
+	  if (aloop->num != mn)
 	    this->to_visit.quick_push (aloop->num);
 
 	  if (aloop->inner != NULL)
 	    aloop = aloop->inner;
 	  else
 	    {
-	      while (aloop != NULL && aloop->next == NULL)
+	      while (aloop != tree_root && aloop->next == NULL)
 		aloop = loop_outer (aloop);
-	      if (aloop == NULL)
+	      if (aloop == tree_root)
 		break;
 	      aloop = aloop->next;
 	    }

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] Use range-based for loops for traversing loops
  2021-07-23  8:35   ` [PATCH v3] Use range-based for loops for traversing loops Kewen.Lin
@ 2021-07-23 16:10     ` Martin Sebor
  2021-07-27  2:10       ` [PATCH v4] " Kewen.Lin
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Sebor @ 2021-07-23 16:10 UTC (permalink / raw)
  To: Kewen.Lin, GCC Patches
  Cc: Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Bill Schmidt, tbsaunde, Richard Biener

On 7/23/21 2:35 AM, Kewen.Lin wrote:
> Hi,
> 
> Comparing to v2, this v3 removed the new CTOR with struct loops *loops
> as Richi clarified.  I'd like to support it in a separated follow up
> patch by extending the existing CTOR with an optional argument loop_p
> root.

Looks very nice (and quite a bit work)!  Thanks again!

Not to make even more work for you, but it occurred to me that
the declaration of the loop control variable could be simplified
by the use of auto like so:

   for (auto loop: loops_list (cfun, ...))

I spotted what looks to me like a few minor typos in the docs
diff:

diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
index a135656ed01..27697b08728 100644
--- a/gcc/doc/loop.texi
+++ b/gcc/doc/loop.texi
@@ -79,14 +79,14 @@ and its subloops in the numbering.  The index of a 
loop never changes.

  The entries of the @code{larray} field should not be accessed directly.
  The function @code{get_loop} returns the loop description for a loop with
-the given index.  @code{number_of_loops} function returns number of
-loops in the function.  To traverse all loops, use @code{FOR_EACH_LOOP}
-macro.  The @code{flags} argument of the macro is used to determine
-the direction of traversal and the set of loops visited.  Each loop is
-guaranteed to be visited exactly once, regardless of the changes to the
-loop tree, and the loops may be removed during the traversal.  The newly
-created loops are never traversed, if they need to be visited, this
-must be done separately after their creation.
+the given index.  @code{number_of_loops} function returns number of loops
+in the function.  To traverse all loops, use range-based for loop with

Missing article:

    use <ins>a </a>range-based for loop

+class @code{loop_list} instance. The @code{flags} argument of the macro

Is that loop_list or loops_list?

IIUC, it's also not a macro anymore, right?  The flags argument
is passed to the loop_list ctor, no?

Martin

> 
> Bootstrapped and regtested again on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped again on ppc64le P9 with bootstrap-O3 config.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -----
> gcc/ChangeLog:
> 
> 	* cfgloop.h (as_const): New function.
> 	(class loop_iterator): Rename to ...
> 	(class loops_list): ... this.
> 	(loop_iterator::next): Rename to ...
> 	(loops_list::Iter::fill_curr_loop): ... this and adjust.
> 	(loop_iterator::loop_iterator): Rename to ...
> 	(loops_list::loops_list): ... this and adjust.
> 	(loops_list::Iter): New class.
> 	(loops_list::iterator): New type.
> 	(loops_list::const_iterator): New type.
> 	(loops_list::begin): New function.
> 	(loops_list::end): Likewise.
> 	(loops_list::begin const): Likewise.
> 	(loops_list::end const): Likewise.
> 	(FOR_EACH_LOOP): Remove.
> 	(FOR_EACH_LOOP_FN): Remove.
> 	* cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with range-based
> 	for loop with loops_list instance.
> 	(sort_sibling_loops): Likewise.
> 	(disambiguate_loops_with_multiple_latches): Likewise.
> 	(verify_loop_structure): Likewise.
> 	* cfgloopmanip.c (create_preheaders): Likewise.
> 	(force_single_succ_latches): Likewise.
> 	* config/aarch64/falkor-tag-collision-avoidance.c
> 	(execute_tag_collision_avoidance): Likewise.
> 	* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
> 	* config/s390/s390.c (s390_adjust_loops): Likewise.
> 	* doc/loop.texi: Likewise.
> 	* gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
> 	* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
> 	* gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
> 	(loop_versioning::make_versioning_decisions): Likewise.
> 	* gimple-ssa-split-paths.c (split_paths): Likewise.
> 	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
> 	* graphite.c (canonicalize_loop_form): Likewise.
> 	(graphite_transform_loops): Likewise.
> 	* ipa-fnsummary.c (analyze_function_body): Likewise.
> 	* ipa-pure-const.c (analyze_function): Likewise.
> 	* loop-doloop.c (doloop_optimize_loops): Likewise.
> 	* loop-init.c (loop_optimizer_finalize): Likewise.
> 	(fix_loop_structure): Likewise.
> 	* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
> 	(move_loop_invariants): Likewise.
> 	* loop-unroll.c (decide_unrolling): Likewise.
> 	(unroll_loops): Likewise.
> 	* modulo-sched.c (sms_schedule): Likewise.
> 	* predict.c (predict_loops): Likewise.
> 	(pass_profile::execute): Likewise.
> 	* profile.c (branch_prob): Likewise.
> 	* sel-sched-ir.c (sel_finish_pipelining): Likewise.
> 	(sel_find_rgns): Likewise.
> 	* tree-cfg.c (replace_loop_annotate): Likewise.
> 	(replace_uses_by): Likewise.
> 	(move_sese_region_to_fn): Likewise.
> 	* tree-if-conv.c (pass_if_conversion::execute): Likewise.
> 	* tree-loop-distribution.c (loop_distribution::execute): Likewise.
> 	* tree-parloops.c (parallelize_loops): Likewise.
> 	* tree-predcom.c (tree_predictive_commoning): Likewise.
> 	* tree-scalar-evolution.c (scev_initialize): Likewise.
> 	(scev_reset): Likewise.
> 	* tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
> 	* tree-ssa-live.c (remove_unused_locals): Likewise.
> 	* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
> 	* tree-ssa-loop-im.c (analyze_memory_references): Likewise.
> 	(tree_ssa_lim_initialize): Likewise.
> 	* tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
> 	* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
> 	* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
> 	* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
> 	(free_numbers_of_iterations_estimates): Likewise.
> 	* tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
> 	* tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
> 	* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
> 	* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
> 	(pass_scev_cprop::execute): Likewise.
> 	* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
> 	* tree-ssa-sccvn.c (do_rpo_vn): Likewise.
> 	* tree-ssa-threadupdate.c
> 	(jump_thread_path_registry::thread_through_all_blocks): Likewise.
> 	* tree-vectorizer.c (vectorize_loops): Likewise.
> 	* tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] Make loops_list support an optional loop_p root
  2021-07-23  8:41     ` [PATCH] Make loops_list support an optional loop_p root Kewen.Lin
@ 2021-07-23 16:26       ` Martin Sebor
  2021-07-27  2:25         ` Kewen.Lin
  2021-07-29  8:01       ` Richard Biener
  1 sibling, 1 reply; 35+ messages in thread
From: Martin Sebor @ 2021-07-23 16:26 UTC (permalink / raw)
  To: Kewen.Lin, Richard Biener
  Cc: GCC Patches, Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Trevor Saunders, Bill Schmidt

On 7/23/21 2:41 AM, Kewen.Lin wrote:
> on 2021/7/22 下午8:56, Richard Biener wrote:
>> On Tue, Jul 20, 2021 at 4:37
>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>
>>> Hi,
>>>
>>> This v2 has addressed some review comments/suggestions:
>>>
>>>    - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>    - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>      to support loop hierarchy tree rather than just a function,
>>>      and adjust to use loops* accordingly.
>>
>> I actually meant struct loop *, not struct loops * ;)  At the point
>> we pondered to make loop invariant motion work on single
>> loop nests we gave up not only but also because it iterates
>> over the loop nest but all the iterators only ever can process
>> all loops, not say, all loops inside a specific 'loop' (and
>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>> CTOR would take the 'root' of the loop tree as argument.
>>
>> I see that doesn't trivially fit how loops_list works, at least
>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>> could be adjusted to do ONLY_INNERMOST as well?
>>
> 
> 
> Thanks for the clarification!  I just realized that the previous
> version with struct loops* is problematic, all traversal is
> still bounded with outer_loop == NULL.  I think what you expect
> is to respect the given loop_p root boundary.  Since we just
> record the loops' nums, I think we still need the function* fn?
> So I add one optional argument loop_p root and update the
> visiting codes accordingly.  Before this change, the previous
> visiting uses the outer_loop == NULL as the termination condition,
> it perfectly includes the root itself, but with this given root,
> we have to use it as the termination condition to avoid to iterate
> onto its possible existing next.
> 
> For LI_ONLY_INNERMOST, I was thinking whether we can use the
> code like:
> 
>      struct loops *fn_loops = loops_for_fn (fn)->larray;
>      for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>          if (aloop != NULL
>              && aloop->inner == NULL
>              && flow_loop_nested_p (tree_root, aloop))
>               this->to_visit.quick_push (aloop->num);
> 
> it has the stable bound, but if the given root only has several
> child loops, it can be much worse if there are many loops in fn.
> It seems impossible to predict the given root loop hierarchy size,
> maybe we can still use the original linear searching for the case
> loops_for_fn (fn) == root?  But since this visiting seems not so
> performance critical, I chose to share the code originally used
> for FROM_INNERMOST, hope it can have better readability and
> maintainability.

I might be mixing up the two patches (they both seem to touch
the same functions), but in this one the loops_list ctor looks
like a sizeable function with at least one loop.  Since the ctor
is used in the initialization of each of the many range-for loops,
that could result in inlining of a lot of these calls and so quite
a bit code bloat.  Unless this is necessary for efficiency  (not
my area) I would recommend to consider defining the loops_list
ctor out-of-line in some .c or .cc file.

(Also, if you agree with the rationale, I'd replace loop_p with
loop * in the new code.)

Thanks
Martin

> 
> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped on ppc64le P9 with bootstrap-O3 config.
> 
> Does the attached patch meet what you expect?
> 
> BR,
> Kewen
> -----
> gcc/ChangeLog:
> 
> 	* cfgloop.h (loops_list::loops_list): Add one optional argument root
> 	and adjust accordingly.
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v4] Use range-based for loops for traversing loops
  2021-07-23 16:10     ` Martin Sebor
@ 2021-07-27  2:10       ` Kewen.Lin
  2021-07-29  7:48         ` Richard Biener
  2021-07-30  7:18         ` Thomas Schwinge
  0 siblings, 2 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-27  2:10 UTC (permalink / raw)
  To: Martin Sebor, GCC Patches
  Cc: Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Bill Schmidt, tbsaunde, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 6522 bytes --]

on 2021/7/24 上午12:10, Martin Sebor wrote:
> On 7/23/21 2:35 AM, Kewen.Lin wrote:
>> Hi,
>>
>> Comparing to v2, this v3 removed the new CTOR with struct loops *loops
>> as Richi clarified.  I'd like to support it in a separated follow up
>> patch by extending the existing CTOR with an optional argument loop_p
>> root.
> 
> Looks very nice (and quite a bit work)!  Thanks again!
> 
> Not to make even more work for you, but it occurred to me that
> the declaration of the loop control variable could be simplified
> by the use of auto like so:
> 
>  for (auto loop: loops_list (cfun, ...))
> 

Thanks for the suggestion!  Updated in v4 accordingly.

I was under the impression to use C++11 auto is arguable since it sometimes
may make things less clear.  But I think you are right, using auto here won't
make it harder to read but more concise.  Thanks again.

> I spotted what looks to me like a few minor typos in the docs
> diff:
> 
> diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
> index a135656ed01..27697b08728 100644
> --- a/gcc/doc/loop.texi
> +++ b/gcc/doc/loop.texi
> @@ -79,14 +79,14 @@ and its subloops in the numbering.  The index of a loop never changes.
> 
> The entries of the @code{larray} field should not be accessed directly.
> The function @code{get_loop} returns the loop description for a loop with
> -the given index.  @code{number_of_loops} function returns number of
> -loops in the function.  To traverse all loops, use @code{FOR_EACH_LOOP}
> -macro.  The @code{flags} argument of the macro is used to determine
> -the direction of traversal and the set of loops visited.  Each loop is
> -guaranteed to be visited exactly once, regardless of the changes to the
> -loop tree, and the loops may be removed during the traversal.  The newly
> -created loops are never traversed, if they need to be visited, this
> -must be done separately after their creation.
> +the given index.  @code{number_of_loops} function returns number of loops
> +in the function.  To traverse all loops, use range-based for loop with
> 
> Missing article:
> 
>   use <ins>a </a>range-based for loop
> 
> +class @code{loop_list} instance. The @code{flags} argument of the macro
> 
> Is that loop_list or loops_list?
> 
> IIUC, it's also not a macro anymore, right?  The flags argument
> is passed to the loop_list ctor, no?
> 

Oops, thanks for catching all above ones!  Fixed in v4.

Bootstrapped and regtested again on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu, also
bootstrapped again on ppc64le P9 with bootstrap-O3 config.

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* cfgloop.h (as_const): New function.
	(class loop_iterator): Rename to ...
	(class loops_list): ... this.
	(loop_iterator::next): Rename to ...
	(loops_list::Iter::fill_curr_loop): ... this and adjust.
	(loop_iterator::loop_iterator): Rename to ...
	(loops_list::loops_list): ... this and adjust.
	(loops_list::Iter): New class.
	(loops_list::iterator): New type.
	(loops_list::const_iterator): New type.
	(loops_list::begin): New function.
	(loops_list::end): Likewise.
	(loops_list::begin const): Likewise.
	(loops_list::end const): Likewise.
	(FOR_EACH_LOOP): Remove.
	(FOR_EACH_LOOP_FN): Remove.
	* cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with range-based
	for loop with loops_list instance.
	(sort_sibling_loops): Likewise.
	(disambiguate_loops_with_multiple_latches): Likewise.
	(verify_loop_structure): Likewise.
	* cfgloopmanip.c (create_preheaders): Likewise.
	(force_single_succ_latches): Likewise.
	* config/aarch64/falkor-tag-collision-avoidance.c
	(execute_tag_collision_avoidance): Likewise.
	* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
	* config/s390/s390.c (s390_adjust_loops): Likewise.
	* doc/loop.texi: Likewise.
	* gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
	* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
	* gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
	(loop_versioning::make_versioning_decisions): Likewise.
	* gimple-ssa-split-paths.c (split_paths): Likewise.
	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
	* graphite.c (canonicalize_loop_form): Likewise.
	(graphite_transform_loops): Likewise.
	* ipa-fnsummary.c (analyze_function_body): Likewise.
	* ipa-pure-const.c (analyze_function): Likewise.
	* loop-doloop.c (doloop_optimize_loops): Likewise.
	* loop-init.c (loop_optimizer_finalize): Likewise.
	(fix_loop_structure): Likewise.
	* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
	(move_loop_invariants): Likewise.
	* loop-unroll.c (decide_unrolling): Likewise.
	(unroll_loops): Likewise.
	* modulo-sched.c (sms_schedule): Likewise.
	* predict.c (predict_loops): Likewise.
	(pass_profile::execute): Likewise.
	* profile.c (branch_prob): Likewise.
	* sel-sched-ir.c (sel_finish_pipelining): Likewise.
	(sel_find_rgns): Likewise.
	* tree-cfg.c (replace_loop_annotate): Likewise.
	(replace_uses_by): Likewise.
	(move_sese_region_to_fn): Likewise.
	* tree-if-conv.c (pass_if_conversion::execute): Likewise.
	* tree-loop-distribution.c (loop_distribution::execute): Likewise.
	* tree-parloops.c (parallelize_loops): Likewise.
	* tree-predcom.c (tree_predictive_commoning): Likewise.
	* tree-scalar-evolution.c (scev_initialize): Likewise.
	(scev_reset): Likewise.
	* tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
	* tree-ssa-live.c (remove_unused_locals): Likewise.
	* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
	* tree-ssa-loop-im.c (analyze_memory_references): Likewise.
	(tree_ssa_lim_initialize): Likewise.
	* tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
	* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
	* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
	* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
	(free_numbers_of_iterations_estimates): Likewise.
	* tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
	* tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
	* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
	* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
	(pass_scev_cprop::execute): Likewise.
	* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
	* tree-ssa-sccvn.c (do_rpo_vn): Likewise.
	* tree-ssa-threadupdate.c
	(jump_thread_path_registry::thread_through_all_blocks): Likewise.
	* tree-vectorizer.c (vectorize_loops): Likewise.
	* tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

[-- Attachment #2: range-based-v4.patch --]
[-- Type: text/plain, Size: 44097 bytes --]

---
 gcc/cfgloop.c                                 |  19 +--
 gcc/cfgloop.h                                 | 140 +++++++++++++-----
 gcc/cfgloopmanip.c                            |   7 +-
 .../aarch64/falkor-tag-collision-avoidance.c  |   4 +-
 gcc/config/mn10300/mn10300.c                  |   4 +-
 gcc/config/s390/s390.c                        |   4 +-
 gcc/doc/loop.texi                             |  13 +-
 gcc/gimple-loop-interchange.cc                |   3 +-
 gcc/gimple-loop-jam.c                         |   3 +-
 gcc/gimple-loop-versioning.cc                 |   6 +-
 gcc/gimple-ssa-split-paths.c                  |   3 +-
 gcc/graphite-isl-ast-to-gimple.c              |   5 +-
 gcc/graphite.c                                |   6 +-
 gcc/ipa-fnsummary.c                           |   2 +-
 gcc/ipa-pure-const.c                          |   3 +-
 gcc/loop-doloop.c                             |   8 +-
 gcc/loop-init.c                               |   5 +-
 gcc/loop-invariant.c                          |  14 +-
 gcc/loop-unroll.c                             |   7 +-
 gcc/modulo-sched.c                            |   5 +-
 gcc/predict.c                                 |   5 +-
 gcc/profile.c                                 |   3 +-
 gcc/sel-sched-ir.c                            |  12 +-
 gcc/tree-cfg.c                                |  13 +-
 gcc/tree-if-conv.c                            |   3 +-
 gcc/tree-loop-distribution.c                  |   2 +-
 gcc/tree-parloops.c                           |   3 +-
 gcc/tree-predcom.c                            |   3 +-
 gcc/tree-scalar-evolution.c                   |  16 +-
 gcc/tree-ssa-dce.c                            |   3 +-
 gcc/tree-ssa-live.c                           |   3 +-
 gcc/tree-ssa-loop-ch.c                        |   3 +-
 gcc/tree-ssa-loop-im.c                        |   7 +-
 gcc/tree-ssa-loop-ivcanon.c                   |   3 +-
 gcc/tree-ssa-loop-ivopts.c                    |   3 +-
 gcc/tree-ssa-loop-manip.c                     |   3 +-
 gcc/tree-ssa-loop-niter.c                     |   8 +-
 gcc/tree-ssa-loop-prefetch.c                  |   3 +-
 gcc/tree-ssa-loop-split.c                     |   7 +-
 gcc/tree-ssa-loop-unswitch.c                  |   3 +-
 gcc/tree-ssa-loop.c                           |   6 +-
 gcc/tree-ssa-propagate.c                      |   3 +-
 gcc/tree-ssa-sccvn.c                          |   3 +-
 gcc/tree-ssa-threadupdate.c                   |   3 +-
 gcc/tree-vectorizer.c                         |   4 +-
 gcc/tree-vrp.c                                |   3 +-
 46 files changed, 194 insertions(+), 195 deletions(-)

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index f094538b9ff..6284ae292b6 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -162,14 +162,12 @@ flow_loop_dump (const class loop *loop, FILE *file,
 void
 flow_loops_dump (FILE *file, void (*loop_dump_aux) (const class loop *, FILE *, int), int verbose)
 {
-  class loop *loop;
-
   if (!current_loops || ! file)
     return;
 
   fprintf (file, ";; %d loops found\n", number_of_loops (cfun));
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (auto loop : loops_list (cfun, LI_INCLUDE_ROOT))
     {
       flow_loop_dump (loop, file, loop_dump_aux, verbose);
     }
@@ -559,8 +557,7 @@ sort_sibling_loops (function *fn)
   free (rc_order);
 
   auto_vec<loop_p, 3> siblings;
-  loop_p loop;
-  FOR_EACH_LOOP_FN (fn, loop, LI_INCLUDE_ROOT)
+  for (auto loop : loops_list (fn, LI_INCLUDE_ROOT))
     if (loop->inner && loop->inner->next)
       {
 	loop_p sibling = loop->inner;
@@ -836,9 +833,7 @@ disambiguate_multiple_latches (class loop *loop)
 void
 disambiguate_loops_with_multiple_latches (void)
 {
-  class loop *loop;
-
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       if (!loop->latch)
 	disambiguate_multiple_latches (loop);
@@ -1457,7 +1452,7 @@ verify_loop_structure (void)
   auto_sbitmap visited (last_basic_block_for_fn (cfun));
   bitmap_clear (visited);
   bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       unsigned n;
 
@@ -1503,7 +1498,7 @@ verify_loop_structure (void)
   free (bbs);
 
   /* Check headers and latches.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       i = loop->num;
       if (loop->header == NULL)
@@ -1629,7 +1624,7 @@ verify_loop_structure (void)
     }
 
   /* Check the recorded loop exits.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       if (!loop->exits || loop->exits->e != NULL)
 	{
@@ -1723,7 +1718,7 @@ verify_loop_structure (void)
 	  err = 1;
 	}
 
-      FOR_EACH_LOOP (loop, 0)
+      for (auto loop : loops_list (cfun, 0))
 	{
 	  eloops = 0;
 	  for (exit = loop->exits->next; exit->e; exit = exit->next)
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 5e699276c88..d5eee6b4840 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -658,55 +658,141 @@ enum li_flags
   LI_ONLY_INNERMOST = 4		/* Iterate only over innermost loops.  */
 };
 
-/* The iterator for loops.  */
+/* Provide the functionality of std::as_const to support range-based for
+   to use const iterator.  (We can't use std::as_const itself because it's
+   a C++17 feature.)  */
+template <typename T>
+constexpr const T &
+as_const (T &t)
+{
+  return t;
+}
+
+/* A list for visiting loops, which contains the loop numbers instead of
+   the loop pointers.  The scope is restricted in function FN and the
+   visiting order is specified by FLAGS.  */
 
-class loop_iterator
+class loops_list
 {
 public:
-  loop_iterator (function *fn, loop_p *loop, unsigned flags);
+  loops_list (function *fn, unsigned flags);
+
+  template <typename T> class Iter
+  {
+  public:
+    Iter (const loops_list &l, unsigned idx) : list (l), curr_idx (idx)
+    {
+      fill_curr_loop ();
+    }
+
+    T operator* () const { return curr_loop; }
+
+    Iter &
+    operator++ ()
+    {
+      if (curr_idx < list.to_visit.length ())
+	{
+	  /* Bump the index and fill a new one.  */
+	  curr_idx++;
+	  fill_curr_loop ();
+	}
+      else
+	gcc_assert (!curr_loop);
+
+      return *this;
+    }
+
+    bool
+    operator!= (const Iter &rhs) const
+    {
+      return this->curr_idx != rhs.curr_idx;
+    }
+
+  private:
+    /* Fill the current loop starting from the current index.  */
+    void fill_curr_loop ();
+
+    /* Reference to the loop list to visit.  */
+    const loops_list &list;
+
+    /* The current index in the list to visit.  */
+    unsigned curr_idx;
 
-  inline loop_p next ();
+    /* The loop implied by the current index.  */
+    class loop *curr_loop;
+  };
 
+  using iterator = Iter<class loop *>;
+  using const_iterator = Iter<const class loop *>;
+
+  iterator
+  begin ()
+  {
+    return iterator (*this, 0);
+  }
+
+  iterator
+  end ()
+  {
+    return iterator (*this, to_visit.length ());
+  }
+
+  const_iterator
+  begin () const
+  {
+    return const_iterator (*this, 0);
+  }
+
+  const_iterator
+  end () const
+  {
+    return const_iterator (*this, to_visit.length ());
+  }
+
+private:
   /* The function we are visiting.  */
   function *fn;
 
   /* The list of loops to visit.  */
   auto_vec<int, 16> to_visit;
-
-  /* The index of the actual loop.  */
-  unsigned idx;
 };
 
-inline loop_p
-loop_iterator::next ()
+/* Starting from current index CURR_IDX (inclusive), find one index
+   which stands for one valid loop and fill the found loop as CURR_LOOP,
+   if we can't find one, set CURR_LOOP as null.  */
+
+template <typename T>
+inline void
+loops_list::Iter<T>::fill_curr_loop ()
 {
   int anum;
 
-  while (this->to_visit.iterate (this->idx, &anum))
+  while (this->list.to_visit.iterate (this->curr_idx, &anum))
     {
-      this->idx++;
-      loop_p loop = get_loop (fn, anum);
+      class loop *loop = get_loop (this->list.fn, anum);
       if (loop)
-	return loop;
+	{
+	  curr_loop = loop;
+	  return;
+	}
+      this->curr_idx++;
     }
 
-  return NULL;
+  curr_loop = nullptr;
 }
 
-inline
-loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
+/* Set up the loops list to visit according to the specified
+   function scope FN and iterating order FLAGS.  */
+
+inline loops_list::loops_list (function *fn, unsigned flags)
 {
   class loop *aloop;
   unsigned i;
   int mn;
 
-  this->idx = 0;
   this->fn = fn;
   if (!loops_for_fn (fn))
-    {
-      *loop = NULL;
-      return;
-    }
+    return;
 
   this->to_visit.reserve_exact (number_of_loops (fn));
   mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
@@ -766,20 +852,8 @@ loop_iterator::loop_iterator (function *fn, loop_p *loop, unsigned flags)
 	    }
 	}
     }
-
-  *loop = this->next ();
 }
 
-#define FOR_EACH_LOOP(LOOP, FLAGS) \
-  for (loop_iterator li(cfun, &(LOOP), FLAGS); \
-       (LOOP); \
-       (LOOP) = li.next ())
-
-#define FOR_EACH_LOOP_FN(FN, LOOP, FLAGS) \
-  for (loop_iterator li(FN, &(LOOP), FLAGS); \
-       (LOOP); \
-       (LOOP) = li.next ())
-
 /* The properties of the target.  */
 struct target_cfgloop {
   /* Number of available registers.  */
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 2af59fedc92..82c242dd720 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1572,12 +1572,10 @@ create_preheader (class loop *loop, int flags)
 void
 create_preheaders (int flags)
 {
-  class loop *loop;
-
   if (!current_loops)
     return;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     create_preheader (loop, flags);
   loops_state_set (LOOPS_HAVE_PREHEADERS);
 }
@@ -1587,10 +1585,9 @@ create_preheaders (int flags)
 void
 force_single_succ_latches (void)
 {
-  class loop *loop;
   edge e;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       if (loop->latch != loop->header && single_succ_p (loop->latch))
 	continue;
diff --git a/gcc/config/aarch64/falkor-tag-collision-avoidance.c b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
index de214e4a0f7..6c8e02a56ab 100644
--- a/gcc/config/aarch64/falkor-tag-collision-avoidance.c
+++ b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
@@ -808,8 +808,6 @@ record_loads (tag_map_t &tag_map, struct loop *loop)
 void
 execute_tag_collision_avoidance ()
 {
-  struct loop *loop;
-
   df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_chain_add_problem (DF_UD_CHAIN);
   df_compute_regs_ever_live (true);
@@ -824,7 +822,7 @@ execute_tag_collision_avoidance ()
   calculate_dominance_info (CDI_DOMINATORS);
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       tag_map_t tag_map (512);
 
diff --git a/gcc/config/mn10300/mn10300.c b/gcc/config/mn10300/mn10300.c
index 6f842a3ad32..aeb5d04b3e1 100644
--- a/gcc/config/mn10300/mn10300.c
+++ b/gcc/config/mn10300/mn10300.c
@@ -3234,8 +3234,6 @@ mn10300_loop_contains_call_insn (loop_p loop)
 static void
 mn10300_scan_for_setlb_lcc (void)
 {
-  loop_p loop;
-
   DUMP ("Looking for loops that can use the SETLB insn", NULL_RTX);
 
   df_analyze ();
@@ -3248,7 +3246,7 @@ mn10300_scan_for_setlb_lcc (void)
      if an inner loop is not suitable for use with the SETLB/Lcc insns, it may
      be the case that its parent loop is suitable.  Thus we should check all
      loops, but work from the innermost outwards.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       const char * reason = NULL;
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b1d3b99784d..8c7d36675f5 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -14479,15 +14479,13 @@ s390_adjust_loop_scan_osc (struct loop* loop)
 static void
 s390_adjust_loops ()
 {
-  struct loop *loop = NULL;
-
   df_analyze ();
   compute_bb_for_insn ();
 
   /* Find the loops.  */
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       if (dump_file)
 	{
diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
index a135656ed01..94eed6720b1 100644
--- a/gcc/doc/loop.texi
+++ b/gcc/doc/loop.texi
@@ -79,14 +79,15 @@ and its subloops in the numbering.  The index of a loop never changes.
 
 The entries of the @code{larray} field should not be accessed directly.
 The function @code{get_loop} returns the loop description for a loop with
-the given index.  @code{number_of_loops} function returns number of
-loops in the function.  To traverse all loops, use @code{FOR_EACH_LOOP}
-macro.  The @code{flags} argument of the macro is used to determine
-the direction of traversal and the set of loops visited.  Each loop is
+the given index.  @code{number_of_loops} function returns number of loops
+in the function.  To traverse all loops, use a range-based for loop with
+class @code{loops_list} instance. The @code{flags} argument passed to the
+constructor function of class @code{loops_list} is used to determine the
+direction of traversal and the set of loops visited.  Each loop is
 guaranteed to be visited exactly once, regardless of the changes to the
 loop tree, and the loops may be removed during the traversal.  The newly
-created loops are never traversed, if they need to be visited, this
-must be done separately after their creation.
+created loops are never traversed, if they need to be visited, this must
+be done separately after their creation.
 
 Each basic block contains the reference to the innermost loop it belongs
 to (@code{loop_father}).  For this reason, it is only possible to have
diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index 7a88faa2c07..ccd5083145f 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -2089,8 +2089,7 @@ pass_linterchange::execute (function *fun)
     return 0;
 
   bool changed_p = false;
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       vec<loop_p> loop_nest = vNULL;
       vec<data_reference_p> datarefs = vNULL;
diff --git a/gcc/gimple-loop-jam.c b/gcc/gimple-loop-jam.c
index 4842f0dff80..aac2d1a3fd4 100644
--- a/gcc/gimple-loop-jam.c
+++ b/gcc/gimple-loop-jam.c
@@ -486,13 +486,12 @@ adjust_unroll_factor (class loop *inner, struct data_dependence_relation *ddr,
 static unsigned int
 tree_loop_unroll_and_jam (void)
 {
-  class loop *loop;
   bool changed = false;
 
   gcc_assert (scev_initialized_p ());
 
   /* Go through all innermost loops.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       class loop *outer = loop_outer (loop);
 
diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
index 4b70c5a4aab..114a22f6e5f 100644
--- a/gcc/gimple-loop-versioning.cc
+++ b/gcc/gimple-loop-versioning.cc
@@ -1428,8 +1428,7 @@ loop_versioning::analyze_blocks ()
      versioning at that level could be useful in some cases.  */
   get_loop_info (get_loop (m_fn, 0)).rejected_p = true;
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop_info &linfo = get_loop_info (loop);
 
@@ -1650,8 +1649,7 @@ loop_versioning::make_versioning_decisions ()
   AUTO_DUMP_SCOPE ("make_versioning_decisions",
 		   dump_user_location_t::from_function_decl (m_fn->decl));
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop_info &linfo = get_loop_info (loop);
       if (decide_whether_loop_is_versionable (loop))
diff --git a/gcc/gimple-ssa-split-paths.c b/gcc/gimple-ssa-split-paths.c
index 2dd953d5ef9..04ad9c02477 100644
--- a/gcc/gimple-ssa-split-paths.c
+++ b/gcc/gimple-ssa-split-paths.c
@@ -473,13 +473,12 @@ static bool
 split_paths ()
 {
   bool changed = false;
-  loop_p loop;
 
   loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
   initialize_original_copy_tables ();
   calculate_dominance_info (CDI_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Only split paths if we are optimizing this loop for speed.  */
       if (!optimize_loop_for_speed_p (loop))
diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index c202213f39b..1ad68a1d473 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -1535,9 +1535,8 @@ graphite_regenerate_ast_isl (scop_p scop)
       if_region->false_region->region.entry->flags |= EDGE_FALLTHRU;
       /* remove_edge_and_dominated_blocks marks loops for removal but
 	 doesn't actually remove them (fix that...).  */
-      loop_p loop;
-      FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
-	if (! loop->header)
+      for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
+	if (!loop->header)
 	  delete_loop (loop);
     }
 
diff --git a/gcc/graphite.c b/gcc/graphite.c
index 6c4fb42282b..0060caea22e 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -377,8 +377,7 @@ canonicalize_loop_closed_ssa (loop_p loop, edge e)
 static void
 canonicalize_loop_form (void)
 {
-  loop_p loop;
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       edge e = single_exit (loop);
       if (!e || (e->flags & (EDGE_COMPLEX|EDGE_FAKE)))
@@ -494,10 +493,9 @@ graphite_transform_loops (void)
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
-      loop_p loop;
       int num_no_dependency = 0;
 
-      FOR_EACH_LOOP (loop, 0)
+      for (auto loop : loops_list (cfun, 0))
 	if (loop->can_be_parallel)
 	  num_no_dependency++;
 
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 95d28757f95..20cfd5766fd 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -2923,7 +2923,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
       if (dump_file && (dump_flags & TDF_DETAILS))
 	flow_loops_dump (dump_file, NULL, 0);
       scev_initialize ();
-      FOR_EACH_LOOP (loop, 0)
+      for (auto loop : loops_list (cfun, 0))
 	{
 	  predicate loop_iterations = true;
 	  sreal header_freq;
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index f045108af21..a84a4eb7ac0 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -1087,9 +1087,8 @@ end:
 	    }
 	  else
 	    {
-	      class loop *loop;
 	      scev_initialize ();
-	      FOR_EACH_LOOP (loop, 0)
+	      for (auto loop : loops_list (cfun, 0))
 		if (!finite_loop_p (loop))
 		  {
 		    if (dump_file)
diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
index dda7b9e268f..c3a4523ad18 100644
--- a/gcc/loop-doloop.c
+++ b/gcc/loop-doloop.c
@@ -789,18 +789,14 @@ doloop_optimize (class loop *loop)
 void
 doloop_optimize_loops (void)
 {
-  class loop *loop;
-
   if (optimize == 1)
     {
       df_live_add_problem ();
       df_live_set_all_dirty ();
     }
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      doloop_optimize (loop);
-    }
+  for (auto loop : loops_list (cfun, 0))
+    doloop_optimize (loop);
 
   if (optimize == 1)
     df_remove_problem (df_live);
diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 1fde0ede441..04054ef6222 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -137,7 +137,6 @@ loop_optimizer_init (unsigned flags)
 void
 loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
 {
-  class loop *loop;
   basic_block bb;
 
   timevar_push (TV_LOOP_FINI);
@@ -167,7 +166,7 @@ loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
       goto loop_fini_done;
     }
 
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (auto loop : loops_list (fn, 0))
     free_simple_loop_desc (loop);
 
   /* Clean up.  */
@@ -229,7 +228,7 @@ fix_loop_structure (bitmap changed_bbs)
      loops, so that when we remove the loops, we know that the loops inside
      are preserved, and do not waste time relinking loops that will be
      removed later.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Detect the case that the loop is no longer present even though
          it wasn't marked for removal.
diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index bdc7b59dd5f..fca0c2b24be 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -2136,7 +2136,7 @@ calculate_loop_reg_pressure (void)
   rtx link;
   class loop *loop, *parent;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     if (loop->aux == NULL)
       {
 	loop->aux = xcalloc (1, sizeof (class loop_data));
@@ -2203,7 +2203,7 @@ calculate_loop_reg_pressure (void)
   bitmap_release (&curr_regs_live);
   if (flag_ira_region == IRA_REGION_MIXED
       || flag_ira_region == IRA_REGION_ALL)
-    FOR_EACH_LOOP (loop, 0)
+    for (auto loop : loops_list (cfun, 0))
       {
 	EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi)
 	  if (! bitmap_bit_p (&LOOP_DATA (loop)->regs_ref, j))
@@ -2217,7 +2217,7 @@ calculate_loop_reg_pressure (void)
       }
   if (dump_file == NULL)
     return;
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       parent = loop_outer (loop);
       fprintf (dump_file, "\n  Loop %d (parent %d, header bb%d, depth %d)\n",
@@ -2251,8 +2251,6 @@ calculate_loop_reg_pressure (void)
 void
 move_loop_invariants (void)
 {
-  class loop *loop;
-
   if (optimize == 1)
     df_live_add_problem ();
   /* ??? This is a hack.  We should only need to call df_live_set_all_dirty
@@ -2271,7 +2269,7 @@ move_loop_invariants (void)
     }
   df_set_flags (DF_EQ_NOTES + DF_DEFER_INSN_RESCAN);
   /* Process the loops, innermost first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       curr_loop = loop;
       /* move_single_loop_invariants for very large loops is time consuming
@@ -2284,10 +2282,8 @@ move_loop_invariants (void)
 	move_single_loop_invariants (loop);
     }
 
-  FOR_EACH_LOOP (loop, 0)
-    {
+  for (auto loop : loops_list (cfun, 0))
       free_loop_data (loop);
-    }
 
   if (flag_ira_loop_pressure)
     /* There is no sense to keep this info because it was most
diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c
index 66d93487e29..2b31fafa3a3 100644
--- a/gcc/loop-unroll.c
+++ b/gcc/loop-unroll.c
@@ -214,10 +214,8 @@ report_unroll (class loop *loop, dump_location_t locus)
 static void
 decide_unrolling (int flags)
 {
-  class loop *loop;
-
   /* Scan the loops, inner ones first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       loop->lpt_decision.decision = LPT_NONE;
       dump_user_location_t locus = get_loop_location (loop);
@@ -278,14 +276,13 @@ decide_unrolling (int flags)
 void
 unroll_loops (int flags)
 {
-  class loop *loop;
   bool changed = false;
 
   /* Now decide rest of unrolling.  */
   decide_unrolling (flags);
 
   /* Scan the loops, inner ones first.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* And perform the appropriate transformations.  */
       switch (loop->lpt_decision.decision)
diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
index e72e46db387..1c1b459d34f 100644
--- a/gcc/modulo-sched.c
+++ b/gcc/modulo-sched.c
@@ -1353,7 +1353,6 @@ sms_schedule (void)
   int maxii, max_asap;
   partial_schedule_ptr ps;
   basic_block bb = NULL;
-  class loop *loop;
   basic_block condition_bb = NULL;
   edge latch_edge;
   HOST_WIDE_INT trip_count, max_trip_count;
@@ -1397,7 +1396,7 @@ sms_schedule (void)
 
   /* Build DDGs for all the relevant loops and hold them in G_ARR
      indexed by the loop index.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       rtx_insn *head, *tail;
       rtx count_reg;
@@ -1543,7 +1542,7 @@ sms_schedule (void)
   }
 
   /* We don't want to perform SMS on new loops - created by versioning.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       rtx_insn *head, *tail;
       rtx count_reg;
diff --git a/gcc/predict.c b/gcc/predict.c
index d751e6cecce..d9c7249831e 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -1949,7 +1949,7 @@ predict_loops (void)
 
   /* Try to predict out blocks in a loop that are not part of a
      natural loop.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       basic_block bb, *bbs;
       unsigned j, n_exits = 0;
@@ -4111,8 +4111,7 @@ pass_profile::execute (function *fun)
     profile_status_for_fn (fun) = PROFILE_GUESSED;
  if (dump_file && (dump_flags & TDF_DETAILS))
    {
-     class loop *loop;
-     FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+     for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
        if (loop->header->count.initialized_p ())
          fprintf (dump_file, "Loop got predicted %d to iterate %i times.\n",
        	   loop->num,
diff --git a/gcc/profile.c b/gcc/profile.c
index 1fa4196fa16..c33c833167f 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -1466,13 +1466,12 @@ branch_prob (bool thunk)
   if (flag_branch_probabilities
       && (profile_status_for_fn (cfun) == PROFILE_READ))
     {
-      class loop *loop;
       if (dump_file && (dump_flags & TDF_DETAILS))
 	report_predictor_hitrates ();
 
       /* At this moment we have precise loop iteration count estimates.
 	 Record them to loop structure before the profile gets out of date. */
-      FOR_EACH_LOOP (loop, 0)
+      for (auto loop : loops_list (cfun, 0))
 	if (loop->header->count > 0 && loop->header->count.reliable_p ())
 	  {
 	    gcov_type nit = expected_loop_iterations_unbounded (loop);
diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index eef9d6969f4..48965bfb0ad 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -6247,10 +6247,8 @@ make_regions_from_the_rest (void)
 /* Free data structures used in pipelining of loops.  */
 void sel_finish_pipelining (void)
 {
-  class loop *loop;
-
   /* Release aux fields so we don't free them later by mistake.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     loop->aux = NULL;
 
   loop_optimizer_finalize ();
@@ -6271,11 +6269,11 @@ sel_find_rgns (void)
 
   if (current_loops)
     {
-      loop_p loop;
+      unsigned flags = flag_sel_sched_pipelining_outer_loops
+			 ? LI_FROM_INNERMOST
+			 : LI_ONLY_INNERMOST;
 
-      FOR_EACH_LOOP (loop, (flag_sel_sched_pipelining_outer_loops
-			    ? LI_FROM_INNERMOST
-			    : LI_ONLY_INNERMOST))
+      for (auto loop : loops_list (cfun, flags))
 	make_regions_from_loop_nest (loop);
     }
 
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index c8b0f7b33e1..48ee8c011ab 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -312,12 +312,11 @@ replace_loop_annotate_in_block (basic_block bb, class loop *loop)
 static void
 replace_loop_annotate (void)
 {
-  class loop *loop;
   basic_block bb;
   gimple_stmt_iterator gsi;
   gimple *stmt;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       /* First look into the header.  */
       replace_loop_annotate_in_block (loop->header, loop);
@@ -2027,12 +2026,8 @@ replace_uses_by (tree name, tree val)
   /* Also update the trees stored in loop structures.  */
   if (current_loops)
     {
-      class loop *loop;
-
-      FOR_EACH_LOOP (loop, 0)
-	{
+      for (auto loop : loops_list (cfun, 0))
 	  substitute_in_loop_info (loop, name, val);
-	}
     }
 }
 
@@ -7752,9 +7747,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
 
   /* Fix up orig_loop_num.  If the block referenced in it has been moved
      to dest_cfun, update orig_loop_num field, otherwise clear it.  */
-  class loop *dloop;
+  class loop *dloop = NULL;
   signed char *moved_orig_loop_num = NULL;
-  FOR_EACH_LOOP_FN (dest_cfun, dloop, 0)
+  for (class loop *dloop : loops_list (dest_cfun, 0))
     if (dloop->orig_loop_num)
       {
 	if (moved_orig_loop_num == NULL)
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 345488e2a19..ff4bc62838d 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -3300,14 +3300,13 @@ pass_if_conversion::gate (function *fun)
 unsigned int
 pass_if_conversion::execute (function *fun)
 {
-  class loop *loop;
   unsigned todo = 0;
 
   if (number_of_loops (fun) <= 1)
     return 0;
 
   auto_vec<gimple *> preds;
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     if (flag_tree_loop_if_convert == 1
 	|| ((flag_tree_loop_vectorize || loop->force_vectorize)
 	    && !loop->dont_vectorize))
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 65aa1df4aba..e445e610f15 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -3312,7 +3312,7 @@ loop_distribution::execute (function *fun)
 
   /* We can at the moment only distribute non-nested loops, thus restrict
      walking to innermost loops.  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
     {
       /* Don't distribute multiple exit edges loop, or cold loop when
          not doing pattern detection.  */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index fe1baef32a7..6b1423be441 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -3989,7 +3989,6 @@ parallelize_loops (bool oacc_kernels_p)
 {
   unsigned n_threads;
   bool changed = false;
-  class loop *loop;
   class loop *skip_loop = NULL;
   class tree_niter_desc niter_desc;
   struct obstack parloop_obstack;
@@ -4020,7 +4019,7 @@ parallelize_loops (bool oacc_kernels_p)
 
   calculate_dominance_info (CDI_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       if (loop == skip_loop)
 	{
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index cf85517e1c7..bed30d2ec7a 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -3419,11 +3419,10 @@ pcom_worker::tree_predictive_commoning_loop (bool allow_unroll_p)
 unsigned
 tree_predictive_commoning (bool allow_unroll_p)
 {
-  class loop *loop;
   unsigned ret = 0, changed = 0;
 
   initialize_original_copy_tables ();
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
     if (optimize_loop_for_speed_p (loop))
       {
 	pcom_worker w(loop);
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index b22d49a0ab6..dbdfe8ffa72 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -2977,16 +2977,12 @@ gather_stats_on_scev_database (void)
 void
 scev_initialize (void)
 {
-  class loop *loop;
-
   gcc_assert (! scev_initialized_p ());
 
   scalar_evolution_info = hash_table<scev_info_hasher>::create_ggc (100);
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      loop->nb_iterations = NULL_TREE;
-    }
+  for (auto loop : loops_list (cfun, 0))
+    loop->nb_iterations = NULL_TREE;
 }
 
 /* Return true if SCEV is initialized.  */
@@ -3015,14 +3011,10 @@ scev_reset_htab (void)
 void
 scev_reset (void)
 {
-  class loop *loop;
-
   scev_reset_htab ();
 
-  FOR_EACH_LOOP (loop, 0)
-    {
-      loop->nb_iterations = NULL_TREE;
-    }
+  for (auto loop : loops_list (cfun, 0))
+    loop->nb_iterations = NULL_TREE;
 }
 
 /* Return true if the IV calculation in TYPE can overflow based on the knowledge
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index e2d3b63a30c..0778eb9704a 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -417,7 +417,6 @@ find_obviously_necessary_stmts (bool aggressive)
   /* Prevent the empty possibly infinite loops from being removed.  */
   if (aggressive)
     {
-      class loop *loop;
       if (mark_irreducible_loops ())
 	FOR_EACH_BB_FN (bb, cfun)
 	  {
@@ -433,7 +432,7 @@ find_obviously_necessary_stmts (bool aggressive)
 		}
 	  }
 
-      FOR_EACH_LOOP (loop, 0)
+      for (auto loop : loops_list (cfun, 0))
 	if (!finite_loop_p (loop))
 	  {
 	    if (dump_file)
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index a2aab25e862..3d5fa8dc0f8 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -908,8 +908,7 @@ remove_unused_locals (void)
 
   if (cfun->has_simduid_loops)
     {
-      class loop *loop;
-      FOR_EACH_LOOP (loop, 0)
+      for (auto loop : loops_list (cfun, 0))
 	if (loop->simduid && !is_used_p (loop->simduid))
 	  loop->simduid = NULL_TREE;
     }
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index dfa5dc87c34..b4e09f97b28 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -348,7 +348,6 @@ protected:
 unsigned int
 ch_base::copy_headers (function *fun)
 {
-  class loop *loop;
   basic_block header;
   edge exit, entry;
   basic_block *bbs, *copied_bbs;
@@ -365,7 +364,7 @@ ch_base::copy_headers (function *fun)
 
   auto_vec<std::pair<edge, loop_p> > copied;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       int initial_limit = param_max_loop_header_insns;
       int remaining_limit = initial_limit;
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 81b4ec21d6e..2d7b17d7824 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -1662,7 +1662,7 @@ analyze_memory_references (bool store_motion)
 {
   gimple_stmt_iterator bsi;
   basic_block bb, *bbs;
-  class loop *loop, *outer;
+  class loop *outer;
   unsigned i, n;
 
   /* Collect all basic-blocks in loops and sort them after their
@@ -1706,7 +1706,7 @@ analyze_memory_references (bool store_motion)
 
   /* Propagate the information about accessed memory references up
      the loop hierarchy.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       /* Finalize the overall touched references (including subloops).  */
       bitmap_ior_into (&memory_accesses.all_refs_stored_in_loop[loop->num],
@@ -3133,7 +3133,6 @@ fill_always_executed_in (void)
 static void
 tree_ssa_lim_initialize (bool store_motion)
 {
-  class loop *loop;
   unsigned i;
 
   bitmap_obstack_initialize (&lim_bitmap_obstack);
@@ -3177,7 +3176,7 @@ tree_ssa_lim_initialize (bool store_motion)
      its postorder index.  */
   i = 0;
   bb_loop_postorder = XNEWVEC (unsigned, number_of_loops (cfun));
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     bb_loop_postorder[loop->num] = i++;
 }
 
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index b1971f83544..8d8791f837e 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -1285,14 +1285,13 @@ canonicalize_loop_induction_variables (class loop *loop,
 unsigned int
 canonicalize_induction_variables (void)
 {
-  class loop *loop;
   bool changed = false;
   bool irred_invalidated = false;
   bitmap loop_closed_ssa_invalidated = BITMAP_ALLOC (NULL);
 
   estimate_numbers_of_iterations (cfun);
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       changed |= canonicalize_loop_induction_variables (loop,
 							true, UL_SINGLE_ITER,
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 12a8a49a307..5259fb05a90 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -8066,14 +8066,13 @@ finish:
 void
 tree_ssa_iv_optimize (void)
 {
-  class loop *loop;
   struct ivopts_data data;
   auto_bitmap toremove;
 
   tree_ssa_iv_optimize_init (&data);
 
   /* Optimize the loops starting with the innermost ones.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!dbg_cnt (ivopts_loop))
 	continue;
diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index 28ae1316fa0..3aff3735bab 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -362,11 +362,10 @@ add_exit_phis (bitmap names_to_rename, bitmap *use_blocks, bitmap *loop_exits)
 static void
 get_loops_exits (bitmap *loop_exits)
 {
-  class loop *loop;
   unsigned j;
   edge e;
 
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       auto_vec<edge> exit_edges = get_loop_exit_edges (loop);
       loop_exits[loop->num] = BITMAP_ALLOC (&loop_renamer_obstack);
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 6fabf10a215..650ec720e43 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -4559,13 +4559,11 @@ estimated_stmt_executions (class loop *loop, widest_int *nit)
 void
 estimate_numbers_of_iterations (function *fn)
 {
-  class loop *loop;
-
   /* We don't want to issue signed overflow warnings while getting
      loop iteration estimates.  */
   fold_defer_overflow_warnings ();
 
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (auto loop : loops_list (fn, 0))
     estimate_numbers_of_iterations (loop);
 
   fold_undefer_and_ignore_overflow_warnings ();
@@ -5031,9 +5029,7 @@ free_numbers_of_iterations_estimates (class loop *loop)
 void
 free_numbers_of_iterations_estimates (function *fn)
 {
-  class loop *loop;
-
-  FOR_EACH_LOOP_FN (fn, loop, 0)
+  for (auto loop : loops_list (fn, 0))
     free_numbers_of_iterations_estimates (loop);
 }
 
diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index 98062eb4616..85977e23245 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -1980,7 +1980,6 @@ fail:
 unsigned int
 tree_ssa_prefetch_arrays (void)
 {
-  class loop *loop;
   bool unrolled = false;
   int todo_flags = 0;
 
@@ -2025,7 +2024,7 @@ tree_ssa_prefetch_arrays (void)
       set_builtin_decl (BUILT_IN_PREFETCH, decl, false);
     }
 
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file, "Processing loop %d:\n", loop->num);
diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index 3a09bbc39e5..3f6ad046623 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -1598,18 +1598,17 @@ split_loop_on_cond (struct loop *loop)
 static unsigned int
 tree_ssa_split_loops (void)
 {
-  class loop *loop;
   bool changed = false;
 
   gcc_assert (scev_initialized_p ());
 
   calculate_dominance_info (CDI_POST_DOMINATORS);
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (auto loop : loops_list (cfun, LI_INCLUDE_ROOT))
     loop->aux = NULL;
 
   /* Go through all loops starting from innermost.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (loop->aux)
 	{
@@ -1630,7 +1629,7 @@ tree_ssa_split_loops (void)
 	}
     }
 
-  FOR_EACH_LOOP (loop, LI_INCLUDE_ROOT)
+  for (auto loop : loops_list (cfun, LI_INCLUDE_ROOT))
     loop->aux = NULL;
 
   clear_aux_for_blocks ();
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 04d4553f13e..fe4dacc0833 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -90,11 +90,10 @@ static tree get_vop_from_header (class loop *);
 unsigned int
 tree_ssa_unswitch_loops (void)
 {
-  class loop *loop;
   bool changed = false;
 
   /* Go through all loops starting from innermost.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!loop->inner)
 	/* Unswitch innermost loop.  */
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 957ac0f3baa..0cc4b3bbccf 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -157,8 +157,7 @@ gate_oacc_kernels (function *fn)
   if (!lookup_attribute ("oacc kernels", DECL_ATTRIBUTES (fn->decl)))
     return false;
 
-  class loop *loop;
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     if (loop->in_oacc_kernels_region)
       return true;
 
@@ -455,12 +454,11 @@ public:
 unsigned
 pass_scev_cprop::execute (function *)
 {
-  class loop *loop;
   bool any = false;
 
   /* Perform final value replacement in loops, in case the replacement
      expressions are cheap.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     any |= final_value_replacement_loop (loop);
 
   return any ? TODO_cleanup_cfg | TODO_update_ssa_only_virtuals : 0;
diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index d93ec90b002..6d19410caa8 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -1262,7 +1262,6 @@ clean_up_loop_closed_phi (function *fun)
   tree rhs;
   tree lhs;
   gphi_iterator gsi;
-  struct loop *loop;
 
   /* Avoid possibly quadratic work when scanning for loop exits across
    all loops of a nest.  */
@@ -1274,7 +1273,7 @@ clean_up_loop_closed_phi (function *fun)
   calculate_dominance_info  (CDI_DOMINATORS);
 
   /* Walk over loop in function.  */
-  FOR_EACH_LOOP_FN (fun, loop, 0)
+  for (auto loop : loops_list (fun, 0))
     {
       /* Check each exit edege of loop.  */
       auto_vec<edge> exits = get_loop_exit_edges (loop);
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 7900df946f4..7cb5c4de1a6 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -7637,9 +7637,8 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
      loops and the outermost one optimistically.  */
   if (iterate)
     {
-      loop_p loop;
       unsigned max_depth = param_rpo_vn_max_loop_depth;
-      FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+      for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
 	if (loop_depth (loop) > max_depth)
 	  for (unsigned i = 2;
 	       i < loop_depth (loop) - max_depth; ++i)
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index f496dd3eb8c..cb0dd90ec94 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2561,7 +2561,6 @@ jump_thread_path_registry::thread_through_all_blocks
 {
   bool retval = false;
   unsigned int i;
-  class loop *loop;
   auto_bitmap threaded_blocks;
   hash_set<edge> visited_starting_edges;
 
@@ -2702,7 +2701,7 @@ jump_thread_path_registry::thread_through_all_blocks
   /* Then perform the threading through loop headers.  We start with the
      innermost loop, so that the changes in cfg we perform won't affect
      further threading.  */
-  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
+  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
       if (!loop->header
 	  || !bitmap_bit_p (threaded_blocks, loop->header->index))
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index f1035a83826..b9709a613d5 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -1194,7 +1194,7 @@ vectorize_loops (void)
   /* If some loop was duplicated, it gets bigger number
      than all previously defined loops.  This fact allows us to run
      only over initial loops skipping newly generated ones.  */
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     if (loop->dont_vectorize)
       {
 	any_ifcvt_loops = true;
@@ -1213,7 +1213,7 @@ vectorize_loops (void)
 		  loop4 (copy of loop2)
 		else
 		  loop5 (copy of loop4)
-	   If FOR_EACH_LOOP gives us loop3 first (which has
+	   If loops' iteration gives us loop3 first (which has
 	   dont_vectorize set), make sure to process loop1 before loop4;
 	   so that we can prevent vectorization of loop4 if loop1
 	   is successfully vectorized.  */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 0565c9b5073..3d214760c25 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3337,8 +3337,7 @@ vrp_asserts::find_assert_locations (void)
   /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
      the order we compute liveness and insert asserts we otherwise
      fail to insert asserts into the loop latch.  */
-  loop_p loop;
-  FOR_EACH_LOOP (loop, 0)
+  for (auto loop : loops_list (cfun, 0))
     {
       i = loop->latch->index;
       unsigned int j = single_succ_edge (loop->latch)->dest_idx;

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] Make loops_list support an optional loop_p root
  2021-07-23 16:26       ` Martin Sebor
@ 2021-07-27  2:25         ` Kewen.Lin
  0 siblings, 0 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-07-27  2:25 UTC (permalink / raw)
  To: Martin Sebor, Richard Biener
  Cc: GCC Patches, Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Trevor Saunders, Bill Schmidt

on 2021/7/24 上午12:26, Martin Sebor wrote:
> On 7/23/21 2:41 AM, Kewen.Lin wrote:
>> on 2021/7/22 下午8:56, Richard Biener wrote:
>>> On Tue, Jul 20, 2021 at 4:37
>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> This v2 has addressed some review comments/suggestions:
>>>>
>>>>    - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>    - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>      to support loop hierarchy tree rather than just a function,
>>>>      and adjust to use loops* accordingly.
>>>
>>> I actually meant struct loop *, not struct loops * ;)  At the point
>>> we pondered to make loop invariant motion work on single
>>> loop nests we gave up not only but also because it iterates
>>> over the loop nest but all the iterators only ever can process
>>> all loops, not say, all loops inside a specific 'loop' (and
>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>>> CTOR would take the 'root' of the loop tree as argument.
>>>
>>> I see that doesn't trivially fit how loops_list works, at least
>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>>> could be adjusted to do ONLY_INNERMOST as well?
>>>
>>
>>
>> Thanks for the clarification!  I just realized that the previous
>> version with struct loops* is problematic, all traversal is
>> still bounded with outer_loop == NULL.  I think what you expect
>> is to respect the given loop_p root boundary.  Since we just
>> record the loops' nums, I think we still need the function* fn?
>> So I add one optional argument loop_p root and update the
>> visiting codes accordingly.  Before this change, the previous
>> visiting uses the outer_loop == NULL as the termination condition,
>> it perfectly includes the root itself, but with this given root,
>> we have to use it as the termination condition to avoid to iterate
>> onto its possible existing next.
>>
>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>> code like:
>>
>>      struct loops *fn_loops = loops_for_fn (fn)->larray;
>>      for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>          if (aloop != NULL
>>              && aloop->inner == NULL
>>              && flow_loop_nested_p (tree_root, aloop))
>>               this->to_visit.quick_push (aloop->num);
>>
>> it has the stable bound, but if the given root only has several
>> child loops, it can be much worse if there are many loops in fn.
>> It seems impossible to predict the given root loop hierarchy size,
>> maybe we can still use the original linear searching for the case
>> loops_for_fn (fn) == root?  But since this visiting seems not so
>> performance critical, I chose to share the code originally used
>> for FROM_INNERMOST, hope it can have better readability and
>> maintainability.
> 
> I might be mixing up the two patches (they both seem to touch
> the same functions), but in this one the loops_list ctor looks
> like a sizeable function with at least one loop.  Since the ctor
> is used in the initialization of each of the many range-for loops,
> that could result in inlining of a lot of these calls and so quite
> a bit code bloat.  Unless this is necessary for efficiency  (not
> my area) I would recommend to consider defining the loops_list
> ctor out-of-line in some .c or .cc file.
> 

Yeah, they touch the same functions.  Good point on the code bloat,
I'm not sure the historical reason here, it needs Richi's input.  :)

> (Also, if you agree with the rationale, I'd replace loop_p with
> loop * in the new code.)
> 

Oh, thanks for the reminder, will update it.  

BR,
Kewen

> Thanks
> Martin
> 
>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>> x86_64-redhat-linux and aarch64-linux-gnu, also
>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>
>> Does the attached patch meet what you expect?
>>
>> BR,
>> Kewen
>> -----
>> gcc/ChangeLog:
>>
>>     * cfgloop.h (loops_list::loops_list): Add one optional argument root
>>     and adjust accordingly.
>>
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4] Use range-based for loops for traversing loops
  2021-07-27  2:10       ` [PATCH v4] " Kewen.Lin
@ 2021-07-29  7:48         ` Richard Biener
  2021-07-30  7:18         ` Thomas Schwinge
  1 sibling, 0 replies; 35+ messages in thread
From: Richard Biener @ 2021-07-29  7:48 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: Martin Sebor, GCC Patches, Jakub Jelinek, Jonathan Wakely,
	Segher Boessenkool, Richard Sandiford, Bill Schmidt,
	Trevor Saunders

On Tue, Jul 27, 2021 at 4:11 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2021/7/24 上午12:10, Martin Sebor wrote:
> > On 7/23/21 2:35 AM, Kewen.Lin wrote:
> >> Hi,
> >>
> >> Comparing to v2, this v3 removed the new CTOR with struct loops *loops
> >> as Richi clarified.  I'd like to support it in a separated follow up
> >> patch by extending the existing CTOR with an optional argument loop_p
> >> root.
> >
> > Looks very nice (and quite a bit work)!  Thanks again!
> >
> > Not to make even more work for you, but it occurred to me that
> > the declaration of the loop control variable could be simplified
> > by the use of auto like so:
> >
> >  for (auto loop: loops_list (cfun, ...))
> >
>
> Thanks for the suggestion!  Updated in v4 accordingly.
>
> I was under the impression to use C++11 auto is arguable since it sometimes
> may make things less clear.  But I think you are right, using auto here won't
> make it harder to read but more concise.  Thanks again.
>
> > I spotted what looks to me like a few minor typos in the docs
> > diff:
> >
> > diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
> > index a135656ed01..27697b08728 100644
> > --- a/gcc/doc/loop.texi
> > +++ b/gcc/doc/loop.texi
> > @@ -79,14 +79,14 @@ and its subloops in the numbering.  The index of a loop never changes.
> >
> > The entries of the @code{larray} field should not be accessed directly.
> > The function @code{get_loop} returns the loop description for a loop with
> > -the given index.  @code{number_of_loops} function returns number of
> > -loops in the function.  To traverse all loops, use @code{FOR_EACH_LOOP}
> > -macro.  The @code{flags} argument of the macro is used to determine
> > -the direction of traversal and the set of loops visited.  Each loop is
> > -guaranteed to be visited exactly once, regardless of the changes to the
> > -loop tree, and the loops may be removed during the traversal.  The newly
> > -created loops are never traversed, if they need to be visited, this
> > -must be done separately after their creation.
> > +the given index.  @code{number_of_loops} function returns number of loops
> > +in the function.  To traverse all loops, use range-based for loop with
> >
> > Missing article:
> >
> >   use <ins>a </a>range-based for loop
> >
> > +class @code{loop_list} instance. The @code{flags} argument of the macro
> >
> > Is that loop_list or loops_list?
> >
> > IIUC, it's also not a macro anymore, right?  The flags argument
> > is passed to the loop_list ctor, no?
> >
>
> Oops, thanks for catching all above ones!  Fixed in v4.
>
> Bootstrapped and regtested again on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped again on ppc64le P9 with bootstrap-O3 config.
>
> Is it ok for trunk?

OK.

Thanks,
Richard.

> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
>         * cfgloop.h (as_const): New function.
>         (class loop_iterator): Rename to ...
>         (class loops_list): ... this.
>         (loop_iterator::next): Rename to ...
>         (loops_list::Iter::fill_curr_loop): ... this and adjust.
>         (loop_iterator::loop_iterator): Rename to ...
>         (loops_list::loops_list): ... this and adjust.
>         (loops_list::Iter): New class.
>         (loops_list::iterator): New type.
>         (loops_list::const_iterator): New type.
>         (loops_list::begin): New function.
>         (loops_list::end): Likewise.
>         (loops_list::begin const): Likewise.
>         (loops_list::end const): Likewise.
>         (FOR_EACH_LOOP): Remove.
>         (FOR_EACH_LOOP_FN): Remove.
>         * cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with range-based
>         for loop with loops_list instance.
>         (sort_sibling_loops): Likewise.
>         (disambiguate_loops_with_multiple_latches): Likewise.
>         (verify_loop_structure): Likewise.
>         * cfgloopmanip.c (create_preheaders): Likewise.
>         (force_single_succ_latches): Likewise.
>         * config/aarch64/falkor-tag-collision-avoidance.c
>         (execute_tag_collision_avoidance): Likewise.
>         * config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
>         * config/s390/s390.c (s390_adjust_loops): Likewise.
>         * doc/loop.texi: Likewise.
>         * gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
>         * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
>         * gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
>         (loop_versioning::make_versioning_decisions): Likewise.
>         * gimple-ssa-split-paths.c (split_paths): Likewise.
>         * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
>         * graphite.c (canonicalize_loop_form): Likewise.
>         (graphite_transform_loops): Likewise.
>         * ipa-fnsummary.c (analyze_function_body): Likewise.
>         * ipa-pure-const.c (analyze_function): Likewise.
>         * loop-doloop.c (doloop_optimize_loops): Likewise.
>         * loop-init.c (loop_optimizer_finalize): Likewise.
>         (fix_loop_structure): Likewise.
>         * loop-invariant.c (calculate_loop_reg_pressure): Likewise.
>         (move_loop_invariants): Likewise.
>         * loop-unroll.c (decide_unrolling): Likewise.
>         (unroll_loops): Likewise.
>         * modulo-sched.c (sms_schedule): Likewise.
>         * predict.c (predict_loops): Likewise.
>         (pass_profile::execute): Likewise.
>         * profile.c (branch_prob): Likewise.
>         * sel-sched-ir.c (sel_finish_pipelining): Likewise.
>         (sel_find_rgns): Likewise.
>         * tree-cfg.c (replace_loop_annotate): Likewise.
>         (replace_uses_by): Likewise.
>         (move_sese_region_to_fn): Likewise.
>         * tree-if-conv.c (pass_if_conversion::execute): Likewise.
>         * tree-loop-distribution.c (loop_distribution::execute): Likewise.
>         * tree-parloops.c (parallelize_loops): Likewise.
>         * tree-predcom.c (tree_predictive_commoning): Likewise.
>         * tree-scalar-evolution.c (scev_initialize): Likewise.
>         (scev_reset): Likewise.
>         * tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
>         * tree-ssa-live.c (remove_unused_locals): Likewise.
>         * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
>         * tree-ssa-loop-im.c (analyze_memory_references): Likewise.
>         (tree_ssa_lim_initialize): Likewise.
>         * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
>         * tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
>         * tree-ssa-loop-manip.c (get_loops_exits): Likewise.
>         * tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
>         (free_numbers_of_iterations_estimates): Likewise.
>         * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
>         * tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
>         * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
>         * tree-ssa-loop.c (gate_oacc_kernels): Likewise.
>         (pass_scev_cprop::execute): Likewise.
>         * tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
>         * tree-ssa-sccvn.c (do_rpo_vn): Likewise.
>         * tree-ssa-threadupdate.c
>         (jump_thread_path_registry::thread_through_all_blocks): Likewise.
>         * tree-vectorizer.c (vectorize_loops): Likewise.
>         * tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] Make loops_list support an optional loop_p root
  2021-07-23  8:41     ` [PATCH] Make loops_list support an optional loop_p root Kewen.Lin
  2021-07-23 16:26       ` Martin Sebor
@ 2021-07-29  8:01       ` Richard Biener
  2021-07-30  5:20         ` [PATCH v2] " Kewen.Lin
  1 sibling, 1 reply; 35+ messages in thread
From: Richard Biener @ 2021-07-29  8:01 UTC (permalink / raw)
  To: Kewen.Lin
  Cc: GCC Patches, Jakub Jelinek, Jonathan Wakely, Segher Boessenkool,
	Richard Sandiford, Trevor Saunders, Martin Sebor, Bill Schmidt

On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2021/7/22 下午8:56, Richard Biener wrote:
> > On Tue, Jul 20, 2021 at 4:37
> > PM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>
> >> Hi,
> >>
> >> This v2 has addressed some review comments/suggestions:
> >>
> >>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
> >>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
> >>     to support loop hierarchy tree rather than just a function,
> >>     and adjust to use loops* accordingly.
> >
> > I actually meant struct loop *, not struct loops * ;)  At the point
> > we pondered to make loop invariant motion work on single
> > loop nests we gave up not only but also because it iterates
> > over the loop nest but all the iterators only ever can process
> > all loops, not say, all loops inside a specific 'loop' (and
> > including that 'loop' if LI_INCLUDE_ROOT).  So the
> > CTOR would take the 'root' of the loop tree as argument.
> >
> > I see that doesn't trivially fit how loops_list works, at least
> > not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
> > could be adjusted to do ONLY_INNERMOST as well?
> >
>
>
> Thanks for the clarification!  I just realized that the previous
> version with struct loops* is problematic, all traversal is
> still bounded with outer_loop == NULL.  I think what you expect
> is to respect the given loop_p root boundary.  Since we just
> record the loops' nums, I think we still need the function* fn?

Would it simplify things if we recorded the actual loop *?

There's still the to_visit reserve which needs a bound on
the number of loops for efficiency reasons.

> So I add one optional argument loop_p root and update the
> visiting codes accordingly.  Before this change, the previous
> visiting uses the outer_loop == NULL as the termination condition,
> it perfectly includes the root itself, but with this given root,
> we have to use it as the termination condition to avoid to iterate
> onto its possible existing next.
>
> For LI_ONLY_INNERMOST, I was thinking whether we can use the
> code like:
>
>     struct loops *fn_loops = loops_for_fn (fn)->larray;
>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>         if (aloop != NULL
>             && aloop->inner == NULL
>             && flow_loop_nested_p (tree_root, aloop))
>              this->to_visit.quick_push (aloop->num);
>
> it has the stable bound, but if the given root only has several
> child loops, it can be much worse if there are many loops in fn.
> It seems impossible to predict the given root loop hierarchy size,
> maybe we can still use the original linear searching for the case
> loops_for_fn (fn) == root?  But since this visiting seems not so
> performance critical, I chose to share the code originally used
> for FROM_INNERMOST, hope it can have better readability and
> maintainability.

I was indeed looking for something that has execution/storage
bound on the subtree we're interested in.  If we pull the CTOR
out-of-line we can probably keep the linear search for
LI_ONLY_INNERMOST when looking at the whole loop tree.

It just seemed to me that we can eventually re-use a
single loop tree walker for all orders, just adjusting the
places we push.

>
> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> x86_64-redhat-linux and aarch64-linux-gnu, also
> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>
> Does the attached patch meet what you expect?

So yeah, it's probably close to what is sensible.  Not sure
whether optimizing the loops for the !only_push_innermost_p
case is important - if we manage to produce a single
walker with conditionals based on 'flags' then IPA-CP should
produce optimal clones as well I guess.

Richard.

>
> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
>         * cfgloop.h (loops_list::loops_list): Add one optional argument root
>         and adjust accordingly.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2] Make loops_list support an optional loop_p root
  2021-07-29  8:01       ` Richard Biener
@ 2021-07-30  5:20         ` Kewen.Lin
  2021-08-03 12:08           ` Richard Biener
  0 siblings, 1 reply; 35+ messages in thread
From: Kewen.Lin @ 2021-07-30  5:20 UTC (permalink / raw)
  To: Richard Biener
  Cc: GCC Patches, Segher Boessenkool, Martin Sebor, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 5726 bytes --]

on 2021/7/29 下午4:01, Richard Biener wrote:
> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> on 2021/7/22 下午8:56, Richard Biener wrote:
>>> On Tue, Jul 20, 2021 at 4:37
>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> This v2 has addressed some review comments/suggestions:
>>>>
>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>     to support loop hierarchy tree rather than just a function,
>>>>     and adjust to use loops* accordingly.
>>>
>>> I actually meant struct loop *, not struct loops * ;)  At the point
>>> we pondered to make loop invariant motion work on single
>>> loop nests we gave up not only but also because it iterates
>>> over the loop nest but all the iterators only ever can process
>>> all loops, not say, all loops inside a specific 'loop' (and
>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>>> CTOR would take the 'root' of the loop tree as argument.
>>>
>>> I see that doesn't trivially fit how loops_list works, at least
>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>>> could be adjusted to do ONLY_INNERMOST as well?
>>>
>>
>>
>> Thanks for the clarification!  I just realized that the previous
>> version with struct loops* is problematic, all traversal is
>> still bounded with outer_loop == NULL.  I think what you expect
>> is to respect the given loop_p root boundary.  Since we just
>> record the loops' nums, I think we still need the function* fn?
> 
> Would it simplify things if we recorded the actual loop *?
> 

I'm afraid it's unsafe to record the loop*.  I had the same
question why the loop iterator uses index rather than loop* when
I read this at the first time.  I guess the design of processing
loops allows its user to update or even delete the folllowing
loops to be visited.  For example, when the user does some tricks
on one loop, then it duplicates the loop and its children to
somewhere and then removes the loop and its children, when
iterating onto its children later, the "index" way will check its
validity by get_loop at that point, but the "loop *" way will
have some recorded pointers to become dangling, can't do the
validity check on itself, seems to need a side linear search to
ensure the validity.

> There's still the to_visit reserve which needs a bound on
> the number of loops for efficiency reasons.
> 

Yes, I still keep the fn in the updated version.

>> So I add one optional argument loop_p root and update the
>> visiting codes accordingly.  Before this change, the previous
>> visiting uses the outer_loop == NULL as the termination condition,
>> it perfectly includes the root itself, but with this given root,
>> we have to use it as the termination condition to avoid to iterate
>> onto its possible existing next.
>>
>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>> code like:
>>
>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>         if (aloop != NULL
>>             && aloop->inner == NULL
>>             && flow_loop_nested_p (tree_root, aloop))
>>              this->to_visit.quick_push (aloop->num);
>>
>> it has the stable bound, but if the given root only has several
>> child loops, it can be much worse if there are many loops in fn.
>> It seems impossible to predict the given root loop hierarchy size,
>> maybe we can still use the original linear searching for the case
>> loops_for_fn (fn) == root?  But since this visiting seems not so
>> performance critical, I chose to share the code originally used
>> for FROM_INNERMOST, hope it can have better readability and
>> maintainability.
> 
> I was indeed looking for something that has execution/storage
> bound on the subtree we're interested in.  If we pull the CTOR
> out-of-line we can probably keep the linear search for
> LI_ONLY_INNERMOST when looking at the whole loop tree.
> 

OK, I've moved the suggested single loop tree walker out-of-line
to cfgloop.c, and brought the linear search back for
LI_ONLY_INNERMOST when looking at the whole loop tree.

> It just seemed to me that we can eventually re-use a
> single loop tree walker for all orders, just adjusting the
> places we push.
> 

Wow, good point!  Indeed, I have further unified all orders
handlings into a single function walk_loop_tree.

>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>> x86_64-redhat-linux and aarch64-linux-gnu, also
>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>
>> Does the attached patch meet what you expect?
> 
> So yeah, it's probably close to what is sensible.  Not sure
> whether optimizing the loops for the !only_push_innermost_p
> case is important - if we manage to produce a single
> walker with conditionals based on 'flags' then IPA-CP should
> produce optimal clones as well I guess.
> 

Thanks for the comments, the updated v2 is attached.
Comparing with v1, it does:

  - Unify one single loop tree walker for all orders.
  - Move walk_loop_tree out-of-line to cfgloop.c.
  - Keep the linear search for LI_ONLY_INNERMOST with
    tree_root of fn loops.
  - Use class loop * instead of loop_p.

Bootstrapped & regtested on powerpc64le-linux-gnu Power9
(with/without the hunk for LI_ONLY_INNERMOST linear search,
it can have the coverage to exercise LI_ONLY_INNERMOST
in walk_loop_tree when "without").

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* cfgloop.h (loops_list::loops_list): Add one optional argument root
	and adjust accordingly, update loop tree walking and factor out
	to ...
	* cfgloop.c (loops_list::walk_loop_tree): ...this.  New function.

[-- Attachment #2: loop_root-v2.diff --]
[-- Type: text/plain, Size: 6199 bytes --]

---
 gcc/cfgloop.c | 64 +++++++++++++++++++++++++++++++++++
 gcc/cfgloop.h | 92 ++++++++++++++++++---------------------------------
 2 files changed, 97 insertions(+), 59 deletions(-)

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 6284ae292b6..acdb4ed14c8 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -2104,3 +2104,67 @@ mark_loop_for_removal (loop_p loop)
   loop->latch = NULL;
   loops_state_set (LOOPS_NEED_FIXUP);
 }
+
+/* Starting from loop tree ROOT, walk loop tree as the visiting
+   order specified by FLAGS, skipping the loop with number MN.
+   The supported visiting orders are:
+     - LI_ONLY_INNERMOST
+     - LI_FROM_INNERMOST
+     - Preorder (if neither of above is specified)  */
+
+void
+loops_list::walk_loop_tree (class loop *root, unsigned flags, int mn)
+{
+  bool only_innermost_p = flags & LI_ONLY_INNERMOST;
+  bool from_innermost_p = flags & LI_FROM_INNERMOST;
+  bool preorder_p = !(only_innermost_p || from_innermost_p);
+
+  /* Early handle root without any inner loops, make later
+     processing simpler, that is all loops processed in the
+     following while loop are impossible to be root.  */
+  if (!root->inner)
+    {
+      if (root->num != mn)
+	this->to_visit.quick_push (root->num);
+      return;
+    }
+
+  class loop *aloop;
+  for (aloop = root;
+       aloop->inner != NULL;
+       aloop = aloop->inner)
+    {
+      if (preorder_p && aloop->num != mn)
+	this->to_visit.quick_push (aloop->num);
+      continue;
+    }
+
+  while (1)
+    {
+      gcc_assert (aloop != root);
+      if (from_innermost_p || aloop->inner == NULL)
+	this->to_visit.quick_push (aloop->num);
+
+      if (aloop->next)
+	{
+	  for (aloop = aloop->next;
+	       aloop->inner != NULL;
+	       aloop = aloop->inner)
+	    {
+	      if (preorder_p)
+		this->to_visit.quick_push (aloop->num);
+	      continue;
+	    }
+	}
+      else if (loop_outer (aloop) == root)
+	break;
+      else
+	aloop = loop_outer (aloop);
+    }
+
+  /* When visiting from innermost, we need to consider root here
+     since the previous loop doesn't handle it.  */
+  if (from_innermost_p && root->num != mn)
+    this->to_visit.quick_push (root->num);
+}
+
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index d5eee6b4840..3046bf713bb 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -669,13 +669,15 @@ as_const (T &t)
 }
 
 /* A list for visiting loops, which contains the loop numbers instead of
-   the loop pointers.  The scope is restricted in function FN and the
-   visiting order is specified by FLAGS.  */
+   the loop pointers.  If the loop ROOT is offered (non-null), the visiting
+   will start from it, otherwise it would start from the tree_root of
+   loops_for_fn (FN) instead.  The scope is restricted in function FN and
+   the visiting order is specified by FLAGS.  */
 
 class loops_list
 {
 public:
-  loops_list (function *fn, unsigned flags);
+  loops_list (function *fn, unsigned flags, class loop *root = nullptr);
 
   template <typename T> class Iter
   {
@@ -750,6 +752,10 @@ public:
   }
 
 private:
+  /* Walk loop tree starting from ROOT as the visiting order specified
+     by FLAGS, skipping the loop with number MN.  */
+  void walk_loop_tree (class loop *root, unsigned flags, int mn);
+
   /* The function we are visiting.  */
   function *fn;
 
@@ -782,76 +788,44 @@ loops_list::Iter<T>::fill_curr_loop ()
 }
 
 /* Set up the loops list to visit according to the specified
-   function scope FN and iterating order FLAGS.  */
+   function scope FN and iterating order FLAGS.  If ROOT is
+   not null, the visiting would start from it, otherwise it
+   will start from tree_root of loops_for_fn (FN).  */
 
-inline loops_list::loops_list (function *fn, unsigned flags)
+inline loops_list::loops_list (function *fn, unsigned flags, class loop *root)
 {
-  class loop *aloop;
-  unsigned i;
-  int mn;
+  struct loops *loops = loops_for_fn (fn);
+  gcc_assert (!root || loops);
+
+  /* Check mutually exclusive flags should not co-exist.  */
+  unsigned checked_flags = LI_ONLY_INNERMOST | LI_FROM_INNERMOST;
+  gcc_assert ((flags & checked_flags) != checked_flags);
 
   this->fn = fn;
-  if (!loops_for_fn (fn))
+  if (!loops)
     return;
 
+  class loop *tree_root = root ? root : loops->tree_root;
+
   this->to_visit.reserve_exact (number_of_loops (fn));
-  mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
+  int mn = (flags & LI_INCLUDE_ROOT) ? -1 : tree_root->num;
 
-  if (flags & LI_ONLY_INNERMOST)
+  /* When root is tree_root of loops_for_fn (fn) and the visiting
+     order is LI_ONLY_INNERMOST, we would like to use linear
+     search here since it has a more stable bound than the
+     walk_loop_tree.  */
+  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
     {
-      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
+      class loop *aloop;
+      unsigned int i;
+      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
 	if (aloop != NULL
 	    && aloop->inner == NULL
-	    && aloop->num >= mn)
+	    && aloop->num != mn)
 	  this->to_visit.quick_push (aloop->num);
     }
-  else if (flags & LI_FROM_INNERMOST)
-    {
-      /* Push the loops to LI->TO_VISIT in postorder.  */
-      for (aloop = loops_for_fn (fn)->tree_root;
-	   aloop->inner != NULL;
-	   aloop = aloop->inner)
-	continue;
-
-      while (1)
-	{
-	  if (aloop->num >= mn)
-	    this->to_visit.quick_push (aloop->num);
-
-	  if (aloop->next)
-	    {
-	      for (aloop = aloop->next;
-		   aloop->inner != NULL;
-		   aloop = aloop->inner)
-		continue;
-	    }
-	  else if (!loop_outer (aloop))
-	    break;
-	  else
-	    aloop = loop_outer (aloop);
-	}
-    }
   else
-    {
-      /* Push the loops to LI->TO_VISIT in preorder.  */
-      aloop = loops_for_fn (fn)->tree_root;
-      while (1)
-	{
-	  if (aloop->num >= mn)
-	    this->to_visit.quick_push (aloop->num);
-
-	  if (aloop->inner != NULL)
-	    aloop = aloop->inner;
-	  else
-	    {
-	      while (aloop != NULL && aloop->next == NULL)
-		aloop = loop_outer (aloop);
-	      if (aloop == NULL)
-		break;
-	      aloop = aloop->next;
-	    }
-	}
-    }
+    walk_loop_tree (tree_root, flags, mn);
 }
 
 /* The properties of the target.  */

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4] Use range-based for loops for traversing loops
  2021-07-27  2:10       ` [PATCH v4] " Kewen.Lin
  2021-07-29  7:48         ` Richard Biener
@ 2021-07-30  7:18         ` Thomas Schwinge
  2021-07-30  7:58           ` Kewen.Lin
  1 sibling, 1 reply; 35+ messages in thread
From: Thomas Schwinge @ 2021-07-30  7:18 UTC (permalink / raw)
  To: linkw
  Cc: Martin Sebor, gcc-patches, Jakub Jelinek, Jonathan Wakely,
	Segher Boessenkool, Richard Sandiford, Bill Schmidt, tbsaunde

Hi!

Thanks for this nice clean-up.

Curious why in some instances we're not removing the 'class loop *loop'
declaration, I had a look, and this may suggest some further clean-up?
(See below; so far, all untested!)


But first, is this transformation correct?

> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c

> @@ -7752,9 +7747,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
>
>    /* Fix up orig_loop_num.  If the block referenced in it has been moved
>       to dest_cfun, update orig_loop_num field, otherwise clear it.  */
> -  class loop *dloop;
> +  class loop *dloop = NULL;
>    signed char *moved_orig_loop_num = NULL;
> -  FOR_EACH_LOOP_FN (dest_cfun, dloop, 0)
> +  for (class loop *dloop : loops_list (dest_cfun, 0))
>      if (dloop->orig_loop_num)
>        {
>       if (moved_orig_loop_num == NULL)

We've got the original outer 'dloop' and a new separate inner 'dloop'
inside the 'for'.  The outer one now is only ever 'NULL'-initialized --
but then meant to be used in later code (not shown here)?  (I cannot
claim to understand this later code usage of 'dloop', though.  Maybe even
is there a pre-existing problem here?  Or, it's still too early on Friday
morning.)  If there is an actual problem, would the following restore the
original behavior?

    --- gcc/tree-cfg.c
    +++ gcc/tree-cfg.c
    @@ -7747,9 +7747,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,

       /* Fix up orig_loop_num.  If the block referenced in it has been moved
          to dest_cfun, update orig_loop_num field, otherwise clear it.  */
    -  class loop *dloop = NULL;
    +  class loop *dloop;
       signed char *moved_orig_loop_num = NULL;
    -  for (class loop *dloop : loops_list (dest_cfun, 0))
    +  for (dloop : loops_list (dest_cfun, 0))
         if (dloop->orig_loop_num)
           {
        if (moved_orig_loop_num == NULL)


Second, additional clean-up possible as follows?

> --- a/gcc/cfgloop.c
> +++ b/gcc/cfgloop.c

> @@ -1457,7 +1452,7 @@ verify_loop_structure (void)
>    auto_sbitmap visited (last_basic_block_for_fn (cfun));
>    bitmap_clear (visited);
>    bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
> -  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
> +  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>      {
>        unsigned n;
>
> @@ -1503,7 +1498,7 @@ verify_loop_structure (void)
>    free (bbs);
>
>    /* Check headers and latches.  */
> -  FOR_EACH_LOOP (loop, 0)
> +  for (auto loop : loops_list (cfun, 0))
>      {
>        i = loop->num;
>        if (loop->header == NULL)
> @@ -1629,7 +1624,7 @@ verify_loop_structure (void)
>      }
>
>    /* Check the recorded loop exits.  */
> -  FOR_EACH_LOOP (loop, 0)
> +  for (auto loop : loops_list (cfun, 0))
>      {
>        if (!loop->exits || loop->exits->e != NULL)
>       {
> @@ -1723,7 +1718,7 @@ verify_loop_structure (void)
>         err = 1;
>       }
>
> -      FOR_EACH_LOOP (loop, 0)
> +      for (auto loop : loops_list (cfun, 0))
>       {
>         eloops = 0;
>         for (exit = loop->exits->next; exit->e; exit = exit->next)

    --- gcc/cfgloop.c
    +++ gcc/cfgloop.c
    @@ -1398,7 +1398,6 @@ verify_loop_structure (void)
     {
       unsigned *sizes, i, j;
       basic_block bb, *bbs;
    -  class loop *loop;
       int err = 0;
       edge e;
       unsigned num = number_of_loops (cfun);
    @@ -1690,7 +1689,7 @@ verify_loop_structure (void)
              for (; exit; exit = exit->next_e)
                eloops++;

    -         for (loop = bb->loop_father;
    +         for (class loop *loop = bb->loop_father;
                   loop != e->dest->loop_father
                   /* When a loop exit is also an entry edge which
                      can happen when avoiding CFG manipulations

> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -2923,7 +2923,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
>        if (dump_file && (dump_flags & TDF_DETAILS))
>       flow_loops_dump (dump_file, NULL, 0);
>        scev_initialize ();
> -      FOR_EACH_LOOP (loop, 0)
> +      for (auto loop : loops_list (cfun, 0))
>       {
>         predicate loop_iterations = true;
>         sreal header_freq;

    --- gcc/ipa-fnsummary.c
    +++ gcc/ipa-fnsummary.c
    @@ -2916,7 +2916,6 @@ analyze_function_body (struct cgraph_node *node, bool early)
       if (nonconstant_names.exists () && !early)
         {
           ipa_fn_summary *s = ipa_fn_summaries->get (node);
    -      class loop *loop;
           unsigned max_loop_predicates = opt_for_fn (node->decl,
                                                 param_ipa_max_loop_predicates);

    @@ -2960,7 +2959,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
           /* To avoid quadratic behavior we analyze stride predicates only
              with respect to the containing loop.  Thus we simply iterate
         over all defs in the outermost loop body.  */
    -      for (loop = loops_for_fn (cfun)->tree_root->inner;
    +      for (class loop *loop = loops_for_fn (cfun)->tree_root->inner;
           loop != NULL; loop = loop->next)
        {
          predicate loop_stride = true;

> --- a/gcc/loop-init.c
> +++ b/gcc/loop-init.c

> @@ -229,7 +228,7 @@ fix_loop_structure (bitmap changed_bbs)
>       loops, so that when we remove the loops, we know that the loops inside
>       are preserved, and do not waste time relinking loops that will be
>       removed later.  */
> -  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
> +  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>      {
>        /* Detect the case that the loop is no longer present even though
>           it wasn't marked for removal.

    --- gcc/loop-init.c
    +++ gcc/loop-init.c
    @@ -201,7 +201,6 @@ fix_loop_structure (bitmap changed_bbs)
     {
       basic_block bb;
       int record_exits = 0;
    -  class loop *loop;
       unsigned old_nloops, i;

       timevar_push (TV_LOOP_INIT);
    @@ -279,6 +278,7 @@ fix_loop_structure (bitmap changed_bbs)

       /* Finally free deleted loops.  */
       bool any_deleted = false;
    +  class loop *loop;
       FOR_EACH_VEC_ELT (*get_loops (cfun), i, loop)
         if (loop && loop->header == NULL)
           {

> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -2136,7 +2136,7 @@ calculate_loop_reg_pressure (void)
>    rtx link;
>    class loop *loop, *parent;
>
> -  FOR_EACH_LOOP (loop, 0)
> +  for (auto loop : loops_list (cfun, 0))
>      if (loop->aux == NULL)
>        {
>       loop->aux = xcalloc (1, sizeof (class loop_data));
> @@ -2203,7 +2203,7 @@ calculate_loop_reg_pressure (void)
>    bitmap_release (&curr_regs_live);
>    if (flag_ira_region == IRA_REGION_MIXED
>        || flag_ira_region == IRA_REGION_ALL)
> -    FOR_EACH_LOOP (loop, 0)
> +    for (auto loop : loops_list (cfun, 0))
>        {
>       EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi)
>         if (! bitmap_bit_p (&LOOP_DATA (loop)->regs_ref, j))
> @@ -2217,7 +2217,7 @@ calculate_loop_reg_pressure (void)
>        }
>    if (dump_file == NULL)
>      return;
> -  FOR_EACH_LOOP (loop, 0)
> +  for (auto loop : loops_list (cfun, 0))
>      {
>        parent = loop_outer (loop);
>        fprintf (dump_file, "\n  Loop %d (parent %d, header bb%d, depth %d)\n",

    --- gcc/loop-invariant.c
    +++ gcc/loop-invariant.c
    @@ -2134,7 +2134,7 @@ calculate_loop_reg_pressure (void)
       basic_block bb;
       rtx_insn *insn;
       rtx link;
    -  class loop *loop, *parent;
    +  class loop *parent;

       for (auto loop : loops_list (cfun, 0))
         if (loop->aux == NULL)
    @@ -2151,7 +2151,7 @@ calculate_loop_reg_pressure (void)
           if (curr_loop == current_loops->tree_root)
        continue;

    -      for (loop = curr_loop;
    +      for (class loop *loop = curr_loop;
           loop != current_loops->tree_root;
           loop = loop_outer (loop))
        bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_IN (bb));

> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -1949,7 +1949,7 @@ predict_loops (void)
>
>    /* Try to predict out blocks in a loop that are not part of a
>       natural loop.  */
> -  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
> +  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>      {
>        basic_block bb, *bbs;
>        unsigned j, n_exits = 0;

    --- gcc/predict.c
    +++ gcc/predict.c
    @@ -1927,7 +1927,6 @@ predict_extra_loop_exits (edge exit_edge)
     static void
     predict_loops (void)
     {
    -  class loop *loop;
       basic_block bb;
       hash_set <class loop *> with_recursion(10);

    @@ -1941,7 +1941,7 @@ predict_loops (void)
            && (decl = gimple_call_fndecl (gsi_stmt (gsi))) != NULL
            && recursive_call_p (current_function_decl, decl))
          {
    -       loop = bb->loop_father;
    +       class loop *loop = bb->loop_father;
            while (loop && !with_recursion.add (loop))
              loop = loop_outer (loop);
          }

> --- a/gcc/tree-loop-distribution.c
> +++ b/gcc/tree-loop-distribution.c
> @@ -3312,7 +3312,7 @@ loop_distribution::execute (function *fun)
>
>    /* We can at the moment only distribute non-nested loops, thus restrict
>       walking to innermost loops.  */
> -  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
> +  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
>      {
>        /* Don't distribute multiple exit edges loop, or cold loop when
>           not doing pattern detection.  */

    --- gcc/tree-loop-distribution.c
    +++ gcc/tree-loop-distribution.c
    @@ -3293,7 +3293,6 @@ prepare_perfect_loop_nest (class loop *loop)
     unsigned int
     loop_distribution::execute (function *fun)
     {
    -  class loop *loop;
       bool changed = false;
       basic_block bb;
       control_dependences *cd = NULL;
    @@ -3384,6 +3383,7 @@ loop_distribution::execute (function *fun)
           /* Destroy loop bodies that could not be reused.  Do this late as we
         otherwise can end up refering to stale data in control dependences.  */
           unsigned i;
    +      class loop *loop;
           FOR_EACH_VEC_ELT (loops_to_be_destroyed, i, loop)
        destroy_loop (loop);


> --- a/gcc/tree-vectorizer.c
> +++ b/gcc/tree-vectorizer.c
> @@ -1194,7 +1194,7 @@ vectorize_loops (void)
>    /* If some loop was duplicated, it gets bigger number
>       than all previously defined loops.  This fact allows us to run
>       only over initial loops skipping newly generated ones.  */
> -  FOR_EACH_LOOP (loop, 0)
> +  for (auto loop : loops_list (cfun, 0))
>      if (loop->dont_vectorize)
>        {
>       any_ifcvt_loops = true;
> @@ -1213,7 +1213,7 @@ vectorize_loops (void)
>                 loop4 (copy of loop2)
>               else
>                 loop5 (copy of loop4)
> -        If FOR_EACH_LOOP gives us loop3 first (which has
> +        If loops' iteration gives us loop3 first (which has
>          dont_vectorize set), make sure to process loop1 before loop4;
>          so that we can prevent vectorization of loop4 if loop1
>          is successfully vectorized.  */

    --- gcc/tree-vectorizer.c
    +++ gcc/tree-vectorizer.c
    @@ -1172,7 +1172,6 @@ vectorize_loops (void)
       unsigned int i;
       unsigned int num_vectorized_loops = 0;
       unsigned int vect_loops_num;
    -  class loop *loop;
       hash_table<simduid_to_vf> *simduid_to_vf_htab = NULL;
       hash_table<simd_array_to_simduid> *simd_array_to_simduid_htab = NULL;
       bool any_ifcvt_loops = false;
    @@ -1256,7 +1255,7 @@ vectorize_loops (void)
       if (any_ifcvt_loops)
         for (i = 1; i < number_of_loops (cfun); i++)
           {
    -   loop = get_loop (cfun, i);
    +   class loop *loop = get_loop (cfun, i);
        if (loop && loop->dont_vectorize)
          {
            gimple *g = vect_loop_vectorized_call (loop);
    @@ -1282,7 +1281,7 @@ vectorize_loops (void)
           loop_vec_info loop_vinfo;
           bool has_mask_store;

    -      loop = get_loop (cfun, i);
    +      class loop *loop = get_loop (cfun, i);
           if (!loop || !loop->aux)
        continue;
           loop_vinfo = (loop_vec_info) loop->aux;


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v4] Use range-based for loops for traversing loops
  2021-07-30  7:18         ` Thomas Schwinge
@ 2021-07-30  7:58           ` Kewen.Lin
  2021-11-24 14:24             ` Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops) Thomas Schwinge
  0 siblings, 1 reply; 35+ messages in thread
From: Kewen.Lin @ 2021-07-30  7:58 UTC (permalink / raw)
  To: Thomas Schwinge
  Cc: Martin Sebor, gcc-patches, Jakub Jelinek, Jonathan Wakely,
	Segher Boessenkool, Richard Sandiford, Bill Schmidt, tbsaunde

Hi Thomas,

on 2021/7/30 下午3:18, Thomas Schwinge wrote:
> Hi!
> 
> Thanks for this nice clean-up.
> 
> Curious why in some instances we're not removing the 'class loop *loop'
> declaration, I had a look, and this may suggest some further clean-up?
> (See below; so far, all untested!)
> 
> 

Yeah, since this patch is mainly to replace FOR_EACH_LOOP* for loop
traserval, I didn't scan all the class loop *loop, just those ones
used by FOR_EACH_LOOP*.  I like your nice proposed further clean-up,
thanks for doing that!

> But first, is this transformation correct?
> 
>> --- a/gcc/tree-cfg.c
>> +++ b/gcc/tree-cfg.c
> 
>> @@ -7752,9 +7747,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
>>
>>    /* Fix up orig_loop_num.  If the block referenced in it has been moved
>>       to dest_cfun, update orig_loop_num field, otherwise clear it.  */
>> -  class loop *dloop;
>> +  class loop *dloop = NULL;
>>    signed char *moved_orig_loop_num = NULL;
>> -  FOR_EACH_LOOP_FN (dest_cfun, dloop, 0)
>> +  for (class loop *dloop : loops_list (dest_cfun, 0))
>>      if (dloop->orig_loop_num)
>>        {
>>       if (moved_orig_loop_num == NULL)
> 
> We've got the original outer 'dloop' and a new separate inner 'dloop'
> inside the 'for'.  The outer one now is only ever 'NULL'-initialized --
> but then meant to be used in later code (not shown here)?  (I cannot
> claim to understand this later code usage of 'dloop', though.  Maybe even
> is there a pre-existing problem here?  Or, it's still too early on Friday
> morning.)  If there is an actual problem, would the following restore the
> original behavior?

Good question, I also noticed this before.  I think it's a pre-existing
problem since the loop iterating will terminate till dloop is NULL as
there are none early breaks.  So IMHO the initialization with NULL should
be the same as before.  The following path using dloop very likely doesn't
got executed, I planed to dig into the associated test case with graphite
enabled later.  Anyway, thanks for raising this!

BR,
Kewen

> 
>     --- gcc/tree-cfg.c
>     +++ gcc/tree-cfg.c
>     @@ -7747,9 +7747,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
> 
>        /* Fix up orig_loop_num.  If the block referenced in it has been moved
>           to dest_cfun, update orig_loop_num field, otherwise clear it.  */
>     -  class loop *dloop = NULL;
>     +  class loop *dloop;
>        signed char *moved_orig_loop_num = NULL;
>     -  for (class loop *dloop : loops_list (dest_cfun, 0))
>     +  for (dloop : loops_list (dest_cfun, 0))
>          if (dloop->orig_loop_num)
>            {
>         if (moved_orig_loop_num == NULL)
> 
> 
> Second, additional clean-up possible as follows?
> 
>> --- a/gcc/cfgloop.c
>> +++ b/gcc/cfgloop.c
> 
>> @@ -1457,7 +1452,7 @@ verify_loop_structure (void)
>>    auto_sbitmap visited (last_basic_block_for_fn (cfun));
>>    bitmap_clear (visited);
>>    bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
>> -  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
>> +  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>>      {
>>        unsigned n;
>>
>> @@ -1503,7 +1498,7 @@ verify_loop_structure (void)
>>    free (bbs);
>>
>>    /* Check headers and latches.  */
>> -  FOR_EACH_LOOP (loop, 0)
>> +  for (auto loop : loops_list (cfun, 0))
>>      {
>>        i = loop->num;
>>        if (loop->header == NULL)
>> @@ -1629,7 +1624,7 @@ verify_loop_structure (void)
>>      }
>>
>>    /* Check the recorded loop exits.  */
>> -  FOR_EACH_LOOP (loop, 0)
>> +  for (auto loop : loops_list (cfun, 0))
>>      {
>>        if (!loop->exits || loop->exits->e != NULL)
>>       {
>> @@ -1723,7 +1718,7 @@ verify_loop_structure (void)
>>         err = 1;
>>       }
>>
>> -      FOR_EACH_LOOP (loop, 0)
>> +      for (auto loop : loops_list (cfun, 0))
>>       {
>>         eloops = 0;
>>         for (exit = loop->exits->next; exit->e; exit = exit->next)
> 
>     --- gcc/cfgloop.c
>     +++ gcc/cfgloop.c
>     @@ -1398,7 +1398,6 @@ verify_loop_structure (void)
>      {
>        unsigned *sizes, i, j;
>        basic_block bb, *bbs;
>     -  class loop *loop;
>        int err = 0;
>        edge e;
>        unsigned num = number_of_loops (cfun);
>     @@ -1690,7 +1689,7 @@ verify_loop_structure (void)
>               for (; exit; exit = exit->next_e)
>                 eloops++;
> 
>     -         for (loop = bb->loop_father;
>     +         for (class loop *loop = bb->loop_father;
>                    loop != e->dest->loop_father
>                    /* When a loop exit is also an entry edge which
>                       can happen when avoiding CFG manipulations
> 
>> --- a/gcc/ipa-fnsummary.c
>> +++ b/gcc/ipa-fnsummary.c
>> @@ -2923,7 +2923,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
>>        if (dump_file && (dump_flags & TDF_DETAILS))
>>       flow_loops_dump (dump_file, NULL, 0);
>>        scev_initialize ();
>> -      FOR_EACH_LOOP (loop, 0)
>> +      for (auto loop : loops_list (cfun, 0))
>>       {
>>         predicate loop_iterations = true;
>>         sreal header_freq;
> 
>     --- gcc/ipa-fnsummary.c
>     +++ gcc/ipa-fnsummary.c
>     @@ -2916,7 +2916,6 @@ analyze_function_body (struct cgraph_node *node, bool early)
>        if (nonconstant_names.exists () && !early)
>          {
>            ipa_fn_summary *s = ipa_fn_summaries->get (node);
>     -      class loop *loop;
>            unsigned max_loop_predicates = opt_for_fn (node->decl,
>                                                  param_ipa_max_loop_predicates);
> 
>     @@ -2960,7 +2959,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
>            /* To avoid quadratic behavior we analyze stride predicates only
>               with respect to the containing loop.  Thus we simply iterate
>          over all defs in the outermost loop body.  */
>     -      for (loop = loops_for_fn (cfun)->tree_root->inner;
>     +      for (class loop *loop = loops_for_fn (cfun)->tree_root->inner;
>            loop != NULL; loop = loop->next)
>         {
>           predicate loop_stride = true;
> 
>> --- a/gcc/loop-init.c
>> +++ b/gcc/loop-init.c
> 
>> @@ -229,7 +228,7 @@ fix_loop_structure (bitmap changed_bbs)
>>       loops, so that when we remove the loops, we know that the loops inside
>>       are preserved, and do not waste time relinking loops that will be
>>       removed later.  */
>> -  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
>> +  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>>      {
>>        /* Detect the case that the loop is no longer present even though
>>           it wasn't marked for removal.
> 
>     --- gcc/loop-init.c
>     +++ gcc/loop-init.c
>     @@ -201,7 +201,6 @@ fix_loop_structure (bitmap changed_bbs)
>      {
>        basic_block bb;
>        int record_exits = 0;
>     -  class loop *loop;
>        unsigned old_nloops, i;
> 
>        timevar_push (TV_LOOP_INIT);
>     @@ -279,6 +278,7 @@ fix_loop_structure (bitmap changed_bbs)
> 
>        /* Finally free deleted loops.  */
>        bool any_deleted = false;
>     +  class loop *loop;
>        FOR_EACH_VEC_ELT (*get_loops (cfun), i, loop)
>          if (loop && loop->header == NULL)
>            {
> 
>> --- a/gcc/loop-invariant.c
>> +++ b/gcc/loop-invariant.c
>> @@ -2136,7 +2136,7 @@ calculate_loop_reg_pressure (void)
>>    rtx link;
>>    class loop *loop, *parent;
>>
>> -  FOR_EACH_LOOP (loop, 0)
>> +  for (auto loop : loops_list (cfun, 0))
>>      if (loop->aux == NULL)
>>        {
>>       loop->aux = xcalloc (1, sizeof (class loop_data));
>> @@ -2203,7 +2203,7 @@ calculate_loop_reg_pressure (void)
>>    bitmap_release (&curr_regs_live);
>>    if (flag_ira_region == IRA_REGION_MIXED
>>        || flag_ira_region == IRA_REGION_ALL)
>> -    FOR_EACH_LOOP (loop, 0)
>> +    for (auto loop : loops_list (cfun, 0))
>>        {
>>       EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi)
>>         if (! bitmap_bit_p (&LOOP_DATA (loop)->regs_ref, j))
>> @@ -2217,7 +2217,7 @@ calculate_loop_reg_pressure (void)
>>        }
>>    if (dump_file == NULL)
>>      return;
>> -  FOR_EACH_LOOP (loop, 0)
>> +  for (auto loop : loops_list (cfun, 0))
>>      {
>>        parent = loop_outer (loop);
>>        fprintf (dump_file, "\n  Loop %d (parent %d, header bb%d, depth %d)\n",
> 
>     --- gcc/loop-invariant.c
>     +++ gcc/loop-invariant.c
>     @@ -2134,7 +2134,7 @@ calculate_loop_reg_pressure (void)
>        basic_block bb;
>        rtx_insn *insn;
>        rtx link;
>     -  class loop *loop, *parent;
>     +  class loop *parent;
> 
>        for (auto loop : loops_list (cfun, 0))
>          if (loop->aux == NULL)
>     @@ -2151,7 +2151,7 @@ calculate_loop_reg_pressure (void)
>            if (curr_loop == current_loops->tree_root)
>         continue;
> 
>     -      for (loop = curr_loop;
>     +      for (class loop *loop = curr_loop;
>            loop != current_loops->tree_root;
>            loop = loop_outer (loop))
>         bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_IN (bb));
> 
>> --- a/gcc/predict.c
>> +++ b/gcc/predict.c
>> @@ -1949,7 +1949,7 @@ predict_loops (void)
>>
>>    /* Try to predict out blocks in a loop that are not part of a
>>       natural loop.  */
>> -  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
>> +  for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>>      {
>>        basic_block bb, *bbs;
>>        unsigned j, n_exits = 0;
> 
>     --- gcc/predict.c
>     +++ gcc/predict.c
>     @@ -1927,7 +1927,6 @@ predict_extra_loop_exits (edge exit_edge)
>      static void
>      predict_loops (void)
>      {
>     -  class loop *loop;
>        basic_block bb;
>        hash_set <class loop *> with_recursion(10);
> 
>     @@ -1941,7 +1941,7 @@ predict_loops (void)
>             && (decl = gimple_call_fndecl (gsi_stmt (gsi))) != NULL
>             && recursive_call_p (current_function_decl, decl))
>           {
>     -       loop = bb->loop_father;
>     +       class loop *loop = bb->loop_father;
>             while (loop && !with_recursion.add (loop))
>               loop = loop_outer (loop);
>           }
> 
>> --- a/gcc/tree-loop-distribution.c
>> +++ b/gcc/tree-loop-distribution.c
>> @@ -3312,7 +3312,7 @@ loop_distribution::execute (function *fun)
>>
>>    /* We can at the moment only distribute non-nested loops, thus restrict
>>       walking to innermost loops.  */
>> -  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
>> +  for (auto loop : loops_list (cfun, LI_ONLY_INNERMOST))
>>      {
>>        /* Don't distribute multiple exit edges loop, or cold loop when
>>           not doing pattern detection.  */
> 
>     --- gcc/tree-loop-distribution.c
>     +++ gcc/tree-loop-distribution.c
>     @@ -3293,7 +3293,6 @@ prepare_perfect_loop_nest (class loop *loop)
>      unsigned int
>      loop_distribution::execute (function *fun)
>      {
>     -  class loop *loop;
>        bool changed = false;
>        basic_block bb;
>        control_dependences *cd = NULL;
>     @@ -3384,6 +3383,7 @@ loop_distribution::execute (function *fun)
>            /* Destroy loop bodies that could not be reused.  Do this late as we
>          otherwise can end up refering to stale data in control dependences.  */
>            unsigned i;
>     +      class loop *loop;
>            FOR_EACH_VEC_ELT (loops_to_be_destroyed, i, loop)
>         destroy_loop (loop);
> 
> 
>> --- a/gcc/tree-vectorizer.c
>> +++ b/gcc/tree-vectorizer.c
>> @@ -1194,7 +1194,7 @@ vectorize_loops (void)
>>    /* If some loop was duplicated, it gets bigger number
>>       than all previously defined loops.  This fact allows us to run
>>       only over initial loops skipping newly generated ones.  */
>> -  FOR_EACH_LOOP (loop, 0)
>> +  for (auto loop : loops_list (cfun, 0))
>>      if (loop->dont_vectorize)
>>        {
>>       any_ifcvt_loops = true;
>> @@ -1213,7 +1213,7 @@ vectorize_loops (void)
>>                 loop4 (copy of loop2)
>>               else
>>                 loop5 (copy of loop4)
>> -        If FOR_EACH_LOOP gives us loop3 first (which has
>> +        If loops' iteration gives us loop3 first (which has
>>          dont_vectorize set), make sure to process loop1 before loop4;
>>          so that we can prevent vectorization of loop4 if loop1
>>          is successfully vectorized.  */
> 
>     --- gcc/tree-vectorizer.c
>     +++ gcc/tree-vectorizer.c
>     @@ -1172,7 +1172,6 @@ vectorize_loops (void)
>        unsigned int i;
>        unsigned int num_vectorized_loops = 0;
>        unsigned int vect_loops_num;
>     -  class loop *loop;
>        hash_table<simduid_to_vf> *simduid_to_vf_htab = NULL;
>        hash_table<simd_array_to_simduid> *simd_array_to_simduid_htab = NULL;
>        bool any_ifcvt_loops = false;
>     @@ -1256,7 +1255,7 @@ vectorize_loops (void)
>        if (any_ifcvt_loops)
>          for (i = 1; i < number_of_loops (cfun); i++)
>            {
>     -   loop = get_loop (cfun, i);
>     +   class loop *loop = get_loop (cfun, i);
>         if (loop && loop->dont_vectorize)
>           {
>             gimple *g = vect_loop_vectorized_call (loop);
>     @@ -1282,7 +1281,7 @@ vectorize_loops (void)
>            loop_vec_info loop_vinfo;
>            bool has_mask_store;
> 
>     -      loop = get_loop (cfun, i);
>     +      class loop *loop = get_loop (cfun, i);
>            if (!loop || !loop->aux)
>         continue;
>            loop_vinfo = (loop_vec_info) loop->aux;
> 
> 
> Grüße
>  Thomas
> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] Make loops_list support an optional loop_p root
  2021-07-30  5:20         ` [PATCH v2] " Kewen.Lin
@ 2021-08-03 12:08           ` Richard Biener
  2021-08-04  2:36             ` [PATCH v3] " Kewen.Lin
  0 siblings, 1 reply; 35+ messages in thread
From: Richard Biener @ 2021-08-03 12:08 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, Segher Boessenkool, Martin Sebor, Bill Schmidt

On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2021/7/29 下午4:01, Richard Biener wrote:
> > On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>
> >> on 2021/7/22 下午8:56, Richard Biener wrote:
> >>> On Tue, Jul 20, 2021 at 4:37
> >>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> This v2 has addressed some review comments/suggestions:
> >>>>
> >>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
> >>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
> >>>>     to support loop hierarchy tree rather than just a function,
> >>>>     and adjust to use loops* accordingly.
> >>>
> >>> I actually meant struct loop *, not struct loops * ;)  At the point
> >>> we pondered to make loop invariant motion work on single
> >>> loop nests we gave up not only but also because it iterates
> >>> over the loop nest but all the iterators only ever can process
> >>> all loops, not say, all loops inside a specific 'loop' (and
> >>> including that 'loop' if LI_INCLUDE_ROOT).  So the
> >>> CTOR would take the 'root' of the loop tree as argument.
> >>>
> >>> I see that doesn't trivially fit how loops_list works, at least
> >>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
> >>> could be adjusted to do ONLY_INNERMOST as well?
> >>>
> >>
> >>
> >> Thanks for the clarification!  I just realized that the previous
> >> version with struct loops* is problematic, all traversal is
> >> still bounded with outer_loop == NULL.  I think what you expect
> >> is to respect the given loop_p root boundary.  Since we just
> >> record the loops' nums, I think we still need the function* fn?
> >
> > Would it simplify things if we recorded the actual loop *?
> >
>
> I'm afraid it's unsafe to record the loop*.  I had the same
> question why the loop iterator uses index rather than loop* when
> I read this at the first time.  I guess the design of processing
> loops allows its user to update or even delete the folllowing
> loops to be visited.  For example, when the user does some tricks
> on one loop, then it duplicates the loop and its children to
> somewhere and then removes the loop and its children, when
> iterating onto its children later, the "index" way will check its
> validity by get_loop at that point, but the "loop *" way will
> have some recorded pointers to become dangling, can't do the
> validity check on itself, seems to need a side linear search to
> ensure the validity.
>
> > There's still the to_visit reserve which needs a bound on
> > the number of loops for efficiency reasons.
> >
>
> Yes, I still keep the fn in the updated version.
>
> >> So I add one optional argument loop_p root and update the
> >> visiting codes accordingly.  Before this change, the previous
> >> visiting uses the outer_loop == NULL as the termination condition,
> >> it perfectly includes the root itself, but with this given root,
> >> we have to use it as the termination condition to avoid to iterate
> >> onto its possible existing next.
> >>
> >> For LI_ONLY_INNERMOST, I was thinking whether we can use the
> >> code like:
> >>
> >>     struct loops *fn_loops = loops_for_fn (fn)->larray;
> >>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
> >>         if (aloop != NULL
> >>             && aloop->inner == NULL
> >>             && flow_loop_nested_p (tree_root, aloop))
> >>              this->to_visit.quick_push (aloop->num);
> >>
> >> it has the stable bound, but if the given root only has several
> >> child loops, it can be much worse if there are many loops in fn.
> >> It seems impossible to predict the given root loop hierarchy size,
> >> maybe we can still use the original linear searching for the case
> >> loops_for_fn (fn) == root?  But since this visiting seems not so
> >> performance critical, I chose to share the code originally used
> >> for FROM_INNERMOST, hope it can have better readability and
> >> maintainability.
> >
> > I was indeed looking for something that has execution/storage
> > bound on the subtree we're interested in.  If we pull the CTOR
> > out-of-line we can probably keep the linear search for
> > LI_ONLY_INNERMOST when looking at the whole loop tree.
> >
>
> OK, I've moved the suggested single loop tree walker out-of-line
> to cfgloop.c, and brought the linear search back for
> LI_ONLY_INNERMOST when looking at the whole loop tree.
>
> > It just seemed to me that we can eventually re-use a
> > single loop tree walker for all orders, just adjusting the
> > places we push.
> >
>
> Wow, good point!  Indeed, I have further unified all orders
> handlings into a single function walk_loop_tree.
>
> >>
> >> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> >> x86_64-redhat-linux and aarch64-linux-gnu, also
> >> bootstrapped on ppc64le P9 with bootstrap-O3 config.
> >>
> >> Does the attached patch meet what you expect?
> >
> > So yeah, it's probably close to what is sensible.  Not sure
> > whether optimizing the loops for the !only_push_innermost_p
> > case is important - if we manage to produce a single
> > walker with conditionals based on 'flags' then IPA-CP should
> > produce optimal clones as well I guess.
> >
>
> Thanks for the comments, the updated v2 is attached.
> Comparing with v1, it does:
>
>   - Unify one single loop tree walker for all orders.
>   - Move walk_loop_tree out-of-line to cfgloop.c.
>   - Keep the linear search for LI_ONLY_INNERMOST with
>     tree_root of fn loops.
>   - Use class loop * instead of loop_p.
>
> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
> (with/without the hunk for LI_ONLY_INNERMOST linear search,
> it can have the coverage to exercise LI_ONLY_INNERMOST
> in walk_loop_tree when "without").
>
> Is it ok for trunk?

Looks good to me.  I think that the 'mn' was an optimization
for the linear walk and it's cheaper to pointer test against
the actual 'root' loop (no need to dereference).  Thus

+  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
     {
-      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
+      class loop *aloop;
+      unsigned int i;
+      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
        if (aloop != NULL
            && aloop->inner == NULL
-           && aloop->num >= mn)
+           && aloop->num != mn)
          this->to_visit.quick_push (aloop->num);

could elide the aloop->num != mn check and start iterating from 1,
since loops->tree_root->num == 0

and the walk_loop_tree could simply do

  class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;

and pointer test aloop against exclude.  That avoids the idea that
'mn' is a vehicle to exclude one random loop from the iteration.

Richard.

> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
>         * cfgloop.h (loops_list::loops_list): Add one optional argument root
>         and adjust accordingly, update loop tree walking and factor out
>         to ...
>         * cfgloop.c (loops_list::walk_loop_tree): ...this.  New function.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3] Make loops_list support an optional loop_p root
  2021-08-03 12:08           ` Richard Biener
@ 2021-08-04  2:36             ` Kewen.Lin
  2021-08-04 10:01               ` Richard Biener
  0 siblings, 1 reply; 35+ messages in thread
From: Kewen.Lin @ 2021-08-04  2:36 UTC (permalink / raw)
  To: Richard Biener
  Cc: GCC Patches, Segher Boessenkool, Martin Sebor, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 7400 bytes --]

on 2021/8/3 下午8:08, Richard Biener wrote:
> On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> on 2021/7/29 下午4:01, Richard Biener wrote:
>>> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>
>>>> on 2021/7/22 下午8:56, Richard Biener wrote:
>>>>> On Tue, Jul 20, 2021 at 4:37
>>>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This v2 has addressed some review comments/suggestions:
>>>>>>
>>>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>>>     to support loop hierarchy tree rather than just a function,
>>>>>>     and adjust to use loops* accordingly.
>>>>>
>>>>> I actually meant struct loop *, not struct loops * ;)  At the point
>>>>> we pondered to make loop invariant motion work on single
>>>>> loop nests we gave up not only but also because it iterates
>>>>> over the loop nest but all the iterators only ever can process
>>>>> all loops, not say, all loops inside a specific 'loop' (and
>>>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>>>>> CTOR would take the 'root' of the loop tree as argument.
>>>>>
>>>>> I see that doesn't trivially fit how loops_list works, at least
>>>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>>>>> could be adjusted to do ONLY_INNERMOST as well?
>>>>>
>>>>
>>>>
>>>> Thanks for the clarification!  I just realized that the previous
>>>> version with struct loops* is problematic, all traversal is
>>>> still bounded with outer_loop == NULL.  I think what you expect
>>>> is to respect the given loop_p root boundary.  Since we just
>>>> record the loops' nums, I think we still need the function* fn?
>>>
>>> Would it simplify things if we recorded the actual loop *?
>>>
>>
>> I'm afraid it's unsafe to record the loop*.  I had the same
>> question why the loop iterator uses index rather than loop* when
>> I read this at the first time.  I guess the design of processing
>> loops allows its user to update or even delete the folllowing
>> loops to be visited.  For example, when the user does some tricks
>> on one loop, then it duplicates the loop and its children to
>> somewhere and then removes the loop and its children, when
>> iterating onto its children later, the "index" way will check its
>> validity by get_loop at that point, but the "loop *" way will
>> have some recorded pointers to become dangling, can't do the
>> validity check on itself, seems to need a side linear search to
>> ensure the validity.
>>
>>> There's still the to_visit reserve which needs a bound on
>>> the number of loops for efficiency reasons.
>>>
>>
>> Yes, I still keep the fn in the updated version.
>>
>>>> So I add one optional argument loop_p root and update the
>>>> visiting codes accordingly.  Before this change, the previous
>>>> visiting uses the outer_loop == NULL as the termination condition,
>>>> it perfectly includes the root itself, but with this given root,
>>>> we have to use it as the termination condition to avoid to iterate
>>>> onto its possible existing next.
>>>>
>>>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>>>> code like:
>>>>
>>>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
>>>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>>>         if (aloop != NULL
>>>>             && aloop->inner == NULL
>>>>             && flow_loop_nested_p (tree_root, aloop))
>>>>              this->to_visit.quick_push (aloop->num);
>>>>
>>>> it has the stable bound, but if the given root only has several
>>>> child loops, it can be much worse if there are many loops in fn.
>>>> It seems impossible to predict the given root loop hierarchy size,
>>>> maybe we can still use the original linear searching for the case
>>>> loops_for_fn (fn) == root?  But since this visiting seems not so
>>>> performance critical, I chose to share the code originally used
>>>> for FROM_INNERMOST, hope it can have better readability and
>>>> maintainability.
>>>
>>> I was indeed looking for something that has execution/storage
>>> bound on the subtree we're interested in.  If we pull the CTOR
>>> out-of-line we can probably keep the linear search for
>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>
>>
>> OK, I've moved the suggested single loop tree walker out-of-line
>> to cfgloop.c, and brought the linear search back for
>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>
>>> It just seemed to me that we can eventually re-use a
>>> single loop tree walker for all orders, just adjusting the
>>> places we push.
>>>
>>
>> Wow, good point!  Indeed, I have further unified all orders
>> handlings into a single function walk_loop_tree.
>>
>>>>
>>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>>>> x86_64-redhat-linux and aarch64-linux-gnu, also
>>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>>>
>>>> Does the attached patch meet what you expect?
>>>
>>> So yeah, it's probably close to what is sensible.  Not sure
>>> whether optimizing the loops for the !only_push_innermost_p
>>> case is important - if we manage to produce a single
>>> walker with conditionals based on 'flags' then IPA-CP should
>>> produce optimal clones as well I guess.
>>>
>>
>> Thanks for the comments, the updated v2 is attached.
>> Comparing with v1, it does:
>>
>>   - Unify one single loop tree walker for all orders.
>>   - Move walk_loop_tree out-of-line to cfgloop.c.
>>   - Keep the linear search for LI_ONLY_INNERMOST with
>>     tree_root of fn loops.
>>   - Use class loop * instead of loop_p.
>>
>> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
>> (with/without the hunk for LI_ONLY_INNERMOST linear search,
>> it can have the coverage to exercise LI_ONLY_INNERMOST
>> in walk_loop_tree when "without").
>>
>> Is it ok for trunk?
> 
> Looks good to me.  I think that the 'mn' was an optimization
> for the linear walk and it's cheaper to pointer test against
> the actual 'root' loop (no need to dereference).  Thus
> 
> +  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
>      {
> -      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
> +      class loop *aloop;
> +      unsigned int i;
> +      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
>         if (aloop != NULL
>             && aloop->inner == NULL
> -           && aloop->num >= mn)
> +           && aloop->num != mn)
>           this->to_visit.quick_push (aloop->num);
> 
> could elide the aloop->num != mn check and start iterating from 1,
> since loops->tree_root->num == 0
> 
> and the walk_loop_tree could simply do
> 
>   class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
> 
> and pointer test aloop against exclude.  That avoids the idea that
> 'mn' is a vehicle to exclude one random loop from the iteration.
> 

Good idea!  Thanks for the comments!  The attached v3 has addressed
the review comments on "mn".

Bootstrapped & regtested again on powerpc64le-linux-gnu Power9
(with/without the hunk for LI_ONLY_INNERMOST linear search).

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* cfgloop.h (loops_list::loops_list): Add one optional argument root
	and adjust accordingly, update loop tree walking and factor out
	to ...
	* cfgloop.c (loops_list::walk_loop_tree): ...this.  New function.

[-- Attachment #2: loop_root-v3.diff --]
[-- Type: text/plain, Size: 6358 bytes --]

---
 gcc/cfgloop.c |  65 ++++++++++++++++++++++++++++++++
 gcc/cfgloop.h | 100 ++++++++++++++++++++------------------------------
 2 files changed, 105 insertions(+), 60 deletions(-)

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 6284ae292b6..afbaa216ce5 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -2104,3 +2104,68 @@ mark_loop_for_removal (loop_p loop)
   loop->latch = NULL;
   loops_state_set (LOOPS_NEED_FIXUP);
 }
+
+/* Starting from loop tree ROOT, walk loop tree as the visiting
+   order specified by FLAGS.  The supported visiting orders
+   are:
+     - LI_ONLY_INNERMOST
+     - LI_FROM_INNERMOST
+     - Preorder (if neither of above is specified)  */
+
+void
+loops_list::walk_loop_tree (class loop *root, unsigned flags)
+{
+  bool only_innermost_p = flags & LI_ONLY_INNERMOST;
+  bool from_innermost_p = flags & LI_FROM_INNERMOST;
+  bool preorder_p = !(only_innermost_p || from_innermost_p);
+  class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
+
+  /* Early handle root without any inner loops, make later
+     processing simpler, that is all loops processed in the
+     following while loop are impossible to be root.  */
+  if (!root->inner)
+    {
+      if (root != exclude)
+	this->to_visit.quick_push (root->num);
+      return;
+    }
+
+  class loop *aloop;
+  for (aloop = root;
+       aloop->inner != NULL;
+       aloop = aloop->inner)
+    {
+      if (preorder_p && aloop != exclude)
+	this->to_visit.quick_push (aloop->num);
+      continue;
+    }
+
+  while (1)
+    {
+      gcc_assert (aloop != root);
+      if (from_innermost_p || aloop->inner == NULL)
+	this->to_visit.quick_push (aloop->num);
+
+      if (aloop->next)
+	{
+	  for (aloop = aloop->next;
+	       aloop->inner != NULL;
+	       aloop = aloop->inner)
+	    {
+	      if (preorder_p)
+		this->to_visit.quick_push (aloop->num);
+	      continue;
+	    }
+	}
+      else if (loop_outer (aloop) == root)
+	break;
+      else
+	aloop = loop_outer (aloop);
+    }
+
+  /* When visiting from innermost, we need to consider root here
+     since the previous while loop doesn't handle it.  */
+  if (from_innermost_p && root != exclude)
+    this->to_visit.quick_push (root->num);
+}
+
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index d5eee6b4840..fed2b0daf4b 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -669,13 +669,15 @@ as_const (T &t)
 }
 
 /* A list for visiting loops, which contains the loop numbers instead of
-   the loop pointers.  The scope is restricted in function FN and the
-   visiting order is specified by FLAGS.  */
+   the loop pointers.  If the loop ROOT is offered (non-null), the visiting
+   will start from it, otherwise it would start from the tree_root of
+   loops_for_fn (FN) instead.  The scope is restricted in function FN and
+   the visiting order is specified by FLAGS.  */
 
 class loops_list
 {
 public:
-  loops_list (function *fn, unsigned flags);
+  loops_list (function *fn, unsigned flags, class loop *root = nullptr);
 
   template <typename T> class Iter
   {
@@ -750,6 +752,10 @@ public:
   }
 
 private:
+  /* Walk loop tree starting from ROOT as the visiting order specified
+     by FLAGS.  */
+  void walk_loop_tree (class loop *root, unsigned flags);
+
   /* The function we are visiting.  */
   function *fn;
 
@@ -782,76 +788,50 @@ loops_list::Iter<T>::fill_curr_loop ()
 }
 
 /* Set up the loops list to visit according to the specified
-   function scope FN and iterating order FLAGS.  */
+   function scope FN and iterating order FLAGS.  If ROOT is
+   not null, the visiting would start from it, otherwise it
+   will start from tree_root of loops_for_fn (FN).  */
 
-inline loops_list::loops_list (function *fn, unsigned flags)
+inline loops_list::loops_list (function *fn, unsigned flags, class loop *root)
 {
-  class loop *aloop;
-  unsigned i;
-  int mn;
+  struct loops *loops = loops_for_fn (fn);
+  gcc_assert (!root || loops);
+
+  /* Check mutually exclusive flags should not co-exist.  */
+  unsigned checked_flags = LI_ONLY_INNERMOST | LI_FROM_INNERMOST;
+  gcc_assert ((flags & checked_flags) != checked_flags);
 
   this->fn = fn;
-  if (!loops_for_fn (fn))
+  if (!loops)
     return;
 
+  class loop *tree_root = root ? root : loops->tree_root;
+
   this->to_visit.reserve_exact (number_of_loops (fn));
-  mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
 
-  if (flags & LI_ONLY_INNERMOST)
+  /* When root is tree_root of loops_for_fn (fn) and the visiting
+     order is LI_ONLY_INNERMOST, we would like to use linear
+     search here since it has a more stable bound than the
+     walk_loop_tree.  */
+  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
     {
-      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
-	if (aloop != NULL
-	    && aloop->inner == NULL
-	    && aloop->num >= mn)
-	  this->to_visit.quick_push (aloop->num);
-    }
-  else if (flags & LI_FROM_INNERMOST)
-    {
-      /* Push the loops to LI->TO_VISIT in postorder.  */
-      for (aloop = loops_for_fn (fn)->tree_root;
-	   aloop->inner != NULL;
-	   aloop = aloop->inner)
-	continue;
-
-      while (1)
+      gcc_assert (tree_root->num == 0);
+      if (tree_root->inner == NULL)
 	{
-	  if (aloop->num >= mn)
-	    this->to_visit.quick_push (aloop->num);
-
-	  if (aloop->next)
-	    {
-	      for (aloop = aloop->next;
-		   aloop->inner != NULL;
-		   aloop = aloop->inner)
-		continue;
-	    }
-	  else if (!loop_outer (aloop))
-	    break;
-	  else
-	    aloop = loop_outer (aloop);
+	  if (flags & LI_INCLUDE_ROOT)
+	    this->to_visit.quick_push (0);
+
+	  return;
 	}
+
+      class loop *aloop;
+      unsigned int i;
+      for (i = 1; vec_safe_iterate (loops->larray, i, &aloop); i++)
+	if (aloop != NULL && aloop->inner == NULL)
+	  this->to_visit.quick_push (aloop->num);
     }
   else
-    {
-      /* Push the loops to LI->TO_VISIT in preorder.  */
-      aloop = loops_for_fn (fn)->tree_root;
-      while (1)
-	{
-	  if (aloop->num >= mn)
-	    this->to_visit.quick_push (aloop->num);
-
-	  if (aloop->inner != NULL)
-	    aloop = aloop->inner;
-	  else
-	    {
-	      while (aloop != NULL && aloop->next == NULL)
-		aloop = loop_outer (aloop);
-	      if (aloop == NULL)
-		break;
-	      aloop = aloop->next;
-	    }
-	}
-    }
+    walk_loop_tree (tree_root, flags);
 }
 
 /* The properties of the target.  */
-- 
2.27.0


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] Make loops_list support an optional loop_p root
  2021-08-04  2:36             ` [PATCH v3] " Kewen.Lin
@ 2021-08-04 10:01               ` Richard Biener
  2021-08-04 10:47                 ` Kewen.Lin
  0 siblings, 1 reply; 35+ messages in thread
From: Richard Biener @ 2021-08-04 10:01 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, Segher Boessenkool, Martin Sebor, Bill Schmidt

On Wed, Aug 4, 2021 at 4:36 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2021/8/3 下午8:08, Richard Biener wrote:
> > On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>
> >> on 2021/7/29 下午4:01, Richard Biener wrote:
> >>> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>>
> >>>> on 2021/7/22 下午8:56, Richard Biener wrote:
> >>>>> On Tue, Jul 20, 2021 at 4:37
> >>>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> This v2 has addressed some review comments/suggestions:
> >>>>>>
> >>>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
> >>>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
> >>>>>>     to support loop hierarchy tree rather than just a function,
> >>>>>>     and adjust to use loops* accordingly.
> >>>>>
> >>>>> I actually meant struct loop *, not struct loops * ;)  At the point
> >>>>> we pondered to make loop invariant motion work on single
> >>>>> loop nests we gave up not only but also because it iterates
> >>>>> over the loop nest but all the iterators only ever can process
> >>>>> all loops, not say, all loops inside a specific 'loop' (and
> >>>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
> >>>>> CTOR would take the 'root' of the loop tree as argument.
> >>>>>
> >>>>> I see that doesn't trivially fit how loops_list works, at least
> >>>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
> >>>>> could be adjusted to do ONLY_INNERMOST as well?
> >>>>>
> >>>>
> >>>>
> >>>> Thanks for the clarification!  I just realized that the previous
> >>>> version with struct loops* is problematic, all traversal is
> >>>> still bounded with outer_loop == NULL.  I think what you expect
> >>>> is to respect the given loop_p root boundary.  Since we just
> >>>> record the loops' nums, I think we still need the function* fn?
> >>>
> >>> Would it simplify things if we recorded the actual loop *?
> >>>
> >>
> >> I'm afraid it's unsafe to record the loop*.  I had the same
> >> question why the loop iterator uses index rather than loop* when
> >> I read this at the first time.  I guess the design of processing
> >> loops allows its user to update or even delete the folllowing
> >> loops to be visited.  For example, when the user does some tricks
> >> on one loop, then it duplicates the loop and its children to
> >> somewhere and then removes the loop and its children, when
> >> iterating onto its children later, the "index" way will check its
> >> validity by get_loop at that point, but the "loop *" way will
> >> have some recorded pointers to become dangling, can't do the
> >> validity check on itself, seems to need a side linear search to
> >> ensure the validity.
> >>
> >>> There's still the to_visit reserve which needs a bound on
> >>> the number of loops for efficiency reasons.
> >>>
> >>
> >> Yes, I still keep the fn in the updated version.
> >>
> >>>> So I add one optional argument loop_p root and update the
> >>>> visiting codes accordingly.  Before this change, the previous
> >>>> visiting uses the outer_loop == NULL as the termination condition,
> >>>> it perfectly includes the root itself, but with this given root,
> >>>> we have to use it as the termination condition to avoid to iterate
> >>>> onto its possible existing next.
> >>>>
> >>>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
> >>>> code like:
> >>>>
> >>>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
> >>>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
> >>>>         if (aloop != NULL
> >>>>             && aloop->inner == NULL
> >>>>             && flow_loop_nested_p (tree_root, aloop))
> >>>>              this->to_visit.quick_push (aloop->num);
> >>>>
> >>>> it has the stable bound, but if the given root only has several
> >>>> child loops, it can be much worse if there are many loops in fn.
> >>>> It seems impossible to predict the given root loop hierarchy size,
> >>>> maybe we can still use the original linear searching for the case
> >>>> loops_for_fn (fn) == root?  But since this visiting seems not so
> >>>> performance critical, I chose to share the code originally used
> >>>> for FROM_INNERMOST, hope it can have better readability and
> >>>> maintainability.
> >>>
> >>> I was indeed looking for something that has execution/storage
> >>> bound on the subtree we're interested in.  If we pull the CTOR
> >>> out-of-line we can probably keep the linear search for
> >>> LI_ONLY_INNERMOST when looking at the whole loop tree.
> >>>
> >>
> >> OK, I've moved the suggested single loop tree walker out-of-line
> >> to cfgloop.c, and brought the linear search back for
> >> LI_ONLY_INNERMOST when looking at the whole loop tree.
> >>
> >>> It just seemed to me that we can eventually re-use a
> >>> single loop tree walker for all orders, just adjusting the
> >>> places we push.
> >>>
> >>
> >> Wow, good point!  Indeed, I have further unified all orders
> >> handlings into a single function walk_loop_tree.
> >>
> >>>>
> >>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> >>>> x86_64-redhat-linux and aarch64-linux-gnu, also
> >>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
> >>>>
> >>>> Does the attached patch meet what you expect?
> >>>
> >>> So yeah, it's probably close to what is sensible.  Not sure
> >>> whether optimizing the loops for the !only_push_innermost_p
> >>> case is important - if we manage to produce a single
> >>> walker with conditionals based on 'flags' then IPA-CP should
> >>> produce optimal clones as well I guess.
> >>>
> >>
> >> Thanks for the comments, the updated v2 is attached.
> >> Comparing with v1, it does:
> >>
> >>   - Unify one single loop tree walker for all orders.
> >>   - Move walk_loop_tree out-of-line to cfgloop.c.
> >>   - Keep the linear search for LI_ONLY_INNERMOST with
> >>     tree_root of fn loops.
> >>   - Use class loop * instead of loop_p.
> >>
> >> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
> >> (with/without the hunk for LI_ONLY_INNERMOST linear search,
> >> it can have the coverage to exercise LI_ONLY_INNERMOST
> >> in walk_loop_tree when "without").
> >>
> >> Is it ok for trunk?
> >
> > Looks good to me.  I think that the 'mn' was an optimization
> > for the linear walk and it's cheaper to pointer test against
> > the actual 'root' loop (no need to dereference).  Thus
> >
> > +  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
> >      {
> > -      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
> > +      class loop *aloop;
> > +      unsigned int i;
> > +      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
> >         if (aloop != NULL
> >             && aloop->inner == NULL
> > -           && aloop->num >= mn)
> > +           && aloop->num != mn)
> >           this->to_visit.quick_push (aloop->num);
> >
> > could elide the aloop->num != mn check and start iterating from 1,
> > since loops->tree_root->num == 0
> >
> > and the walk_loop_tree could simply do
> >
> >   class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
> >
> > and pointer test aloop against exclude.  That avoids the idea that
> > 'mn' is a vehicle to exclude one random loop from the iteration.
> >
>
> Good idea!  Thanks for the comments!  The attached v3 has addressed
> the review comments on "mn".
>
> Bootstrapped & regtested again on powerpc64le-linux-gnu Power9
> (with/without the hunk for LI_ONLY_INNERMOST linear search).
>
> Is it ok for trunk?

+  /* Early handle root without any inner loops, make later
+     processing simpler, that is all loops processed in the
+     following while loop are impossible to be root.  */
+  if (!root->inner)
+    {
+      if (root != exclude)
+       this->to_visit.quick_push (root->num);
+      return;
+    }

could be

   if (!root->inner)
     {
        if (flags & LI_INCLUDE_ROOT)
          this->to_visit.quick_push (root->num);
     }

+  class loop *aloop;
+  for (aloop = root;
+       aloop->inner != NULL;
+       aloop = aloop->inner)
+    {
+      if (preorder_p && aloop != exclude)
+       this->to_visit.quick_push (aloop->num);
+      continue;
+    }

could be

+  class loop *aloop;
+  for (aloop = root->inner;
+       aloop->inner != NULL;
+       aloop = aloop->inner)
+    {
+      if (preorder_p)
+       this->to_visit.quick_push (aloop->num);
+      continue;
+    }

+  /* When visiting from innermost, we need to consider root here
+     since the previous while loop doesn't handle it.  */
+  if (from_innermost_p && root != exclude)
+    this->to_visit.quick_push (root->num);

could be like the first.  I think that's more clear even.  Sorry for
finding a better solution again.

OK with that change

Richard.


> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
>         * cfgloop.h (loops_list::loops_list): Add one optional argument root
>         and adjust accordingly, update loop tree walking and factor out
>         to ...
>         * cfgloop.c (loops_list::walk_loop_tree): ...this.  New function.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] Make loops_list support an optional loop_p root
  2021-08-04 10:01               ` Richard Biener
@ 2021-08-04 10:47                 ` Kewen.Lin
  2021-08-04 12:04                   ` Richard Biener
  0 siblings, 1 reply; 35+ messages in thread
From: Kewen.Lin @ 2021-08-04 10:47 UTC (permalink / raw)
  To: Richard Biener
  Cc: GCC Patches, Segher Boessenkool, Martin Sebor, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 9886 bytes --]

on 2021/8/4 下午6:01, Richard Biener wrote:
> On Wed, Aug 4, 2021 at 4:36 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> on 2021/8/3 下午8:08, Richard Biener wrote:
>>> On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>
>>>> on 2021/7/29 下午4:01, Richard Biener wrote:
>>>>> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>
>>>>>> on 2021/7/22 下午8:56, Richard Biener wrote:
>>>>>>> On Tue, Jul 20, 2021 at 4:37
>>>>>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This v2 has addressed some review comments/suggestions:
>>>>>>>>
>>>>>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>>>>>     to support loop hierarchy tree rather than just a function,
>>>>>>>>     and adjust to use loops* accordingly.
>>>>>>>
>>>>>>> I actually meant struct loop *, not struct loops * ;)  At the point
>>>>>>> we pondered to make loop invariant motion work on single
>>>>>>> loop nests we gave up not only but also because it iterates
>>>>>>> over the loop nest but all the iterators only ever can process
>>>>>>> all loops, not say, all loops inside a specific 'loop' (and
>>>>>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>>>>>>> CTOR would take the 'root' of the loop tree as argument.
>>>>>>>
>>>>>>> I see that doesn't trivially fit how loops_list works, at least
>>>>>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>>>>>>> could be adjusted to do ONLY_INNERMOST as well?
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks for the clarification!  I just realized that the previous
>>>>>> version with struct loops* is problematic, all traversal is
>>>>>> still bounded with outer_loop == NULL.  I think what you expect
>>>>>> is to respect the given loop_p root boundary.  Since we just
>>>>>> record the loops' nums, I think we still need the function* fn?
>>>>>
>>>>> Would it simplify things if we recorded the actual loop *?
>>>>>
>>>>
>>>> I'm afraid it's unsafe to record the loop*.  I had the same
>>>> question why the loop iterator uses index rather than loop* when
>>>> I read this at the first time.  I guess the design of processing
>>>> loops allows its user to update or even delete the folllowing
>>>> loops to be visited.  For example, when the user does some tricks
>>>> on one loop, then it duplicates the loop and its children to
>>>> somewhere and then removes the loop and its children, when
>>>> iterating onto its children later, the "index" way will check its
>>>> validity by get_loop at that point, but the "loop *" way will
>>>> have some recorded pointers to become dangling, can't do the
>>>> validity check on itself, seems to need a side linear search to
>>>> ensure the validity.
>>>>
>>>>> There's still the to_visit reserve which needs a bound on
>>>>> the number of loops for efficiency reasons.
>>>>>
>>>>
>>>> Yes, I still keep the fn in the updated version.
>>>>
>>>>>> So I add one optional argument loop_p root and update the
>>>>>> visiting codes accordingly.  Before this change, the previous
>>>>>> visiting uses the outer_loop == NULL as the termination condition,
>>>>>> it perfectly includes the root itself, but with this given root,
>>>>>> we have to use it as the termination condition to avoid to iterate
>>>>>> onto its possible existing next.
>>>>>>
>>>>>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>>>>>> code like:
>>>>>>
>>>>>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
>>>>>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>>>>>         if (aloop != NULL
>>>>>>             && aloop->inner == NULL
>>>>>>             && flow_loop_nested_p (tree_root, aloop))
>>>>>>              this->to_visit.quick_push (aloop->num);
>>>>>>
>>>>>> it has the stable bound, but if the given root only has several
>>>>>> child loops, it can be much worse if there are many loops in fn.
>>>>>> It seems impossible to predict the given root loop hierarchy size,
>>>>>> maybe we can still use the original linear searching for the case
>>>>>> loops_for_fn (fn) == root?  But since this visiting seems not so
>>>>>> performance critical, I chose to share the code originally used
>>>>>> for FROM_INNERMOST, hope it can have better readability and
>>>>>> maintainability.
>>>>>
>>>>> I was indeed looking for something that has execution/storage
>>>>> bound on the subtree we're interested in.  If we pull the CTOR
>>>>> out-of-line we can probably keep the linear search for
>>>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>>>
>>>>
>>>> OK, I've moved the suggested single loop tree walker out-of-line
>>>> to cfgloop.c, and brought the linear search back for
>>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>>
>>>>> It just seemed to me that we can eventually re-use a
>>>>> single loop tree walker for all orders, just adjusting the
>>>>> places we push.
>>>>>
>>>>
>>>> Wow, good point!  Indeed, I have further unified all orders
>>>> handlings into a single function walk_loop_tree.
>>>>
>>>>>>
>>>>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>>>>>> x86_64-redhat-linux and aarch64-linux-gnu, also
>>>>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>>>>>
>>>>>> Does the attached patch meet what you expect?
>>>>>
>>>>> So yeah, it's probably close to what is sensible.  Not sure
>>>>> whether optimizing the loops for the !only_push_innermost_p
>>>>> case is important - if we manage to produce a single
>>>>> walker with conditionals based on 'flags' then IPA-CP should
>>>>> produce optimal clones as well I guess.
>>>>>
>>>>
>>>> Thanks for the comments, the updated v2 is attached.
>>>> Comparing with v1, it does:
>>>>
>>>>   - Unify one single loop tree walker for all orders.
>>>>   - Move walk_loop_tree out-of-line to cfgloop.c.
>>>>   - Keep the linear search for LI_ONLY_INNERMOST with
>>>>     tree_root of fn loops.
>>>>   - Use class loop * instead of loop_p.
>>>>
>>>> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
>>>> (with/without the hunk for LI_ONLY_INNERMOST linear search,
>>>> it can have the coverage to exercise LI_ONLY_INNERMOST
>>>> in walk_loop_tree when "without").
>>>>
>>>> Is it ok for trunk?
>>>
>>> Looks good to me.  I think that the 'mn' was an optimization
>>> for the linear walk and it's cheaper to pointer test against
>>> the actual 'root' loop (no need to dereference).  Thus
>>>
>>> +  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
>>>      {
>>> -      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
>>> +      class loop *aloop;
>>> +      unsigned int i;
>>> +      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
>>>         if (aloop != NULL
>>>             && aloop->inner == NULL
>>> -           && aloop->num >= mn)
>>> +           && aloop->num != mn)
>>>           this->to_visit.quick_push (aloop->num);
>>>
>>> could elide the aloop->num != mn check and start iterating from 1,
>>> since loops->tree_root->num == 0
>>>
>>> and the walk_loop_tree could simply do
>>>
>>>   class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
>>>
>>> and pointer test aloop against exclude.  That avoids the idea that
>>> 'mn' is a vehicle to exclude one random loop from the iteration.
>>>
>>
>> Good idea!  Thanks for the comments!  The attached v3 has addressed
>> the review comments on "mn".
>>
>> Bootstrapped & regtested again on powerpc64le-linux-gnu Power9
>> (with/without the hunk for LI_ONLY_INNERMOST linear search).
>>
>> Is it ok for trunk?
> 
> +  /* Early handle root without any inner loops, make later
> +     processing simpler, that is all loops processed in the
> +     following while loop are impossible to be root.  */
> +  if (!root->inner)
> +    {
> +      if (root != exclude)
> +       this->to_visit.quick_push (root->num);
> +      return;
> +    }
> 
> could be
> 
>    if (!root->inner)
>      {
>         if (flags & LI_INCLUDE_ROOT)
>           this->to_visit.quick_push (root->num);
>      }
> 

OK, I thought wrongly that all places with "exclude" might be
more consistent, so gave up to use flags directly.  :)

> +  class loop *aloop;
> +  for (aloop = root;
> +       aloop->inner != NULL;
> +       aloop = aloop->inner)
> +    {
> +      if (preorder_p && aloop != exclude)
> +       this->to_visit.quick_push (aloop->num);
> +      continue;
> +    }
> 
> could be
> 
> +  class loop *aloop;
> +  for (aloop = root->inner;
> +       aloop->inner != NULL;
> +       aloop = aloop->inner)
> +    {
> +      if (preorder_p)
> +       this->to_visit.quick_push (aloop->num);
> +      continue;
> +    }
> 

This seems wrong?  For preorder_p, we might miss to push root
when root->inner isn't NULL.  The below "else if" makes it safe.

@@ -2125,17 +2125,19 @@ loops_list::walk_loop_tree (class loop *root, unsigned flags)
      following while loop are impossible to be root.  */
   if (!root->inner)
     {
-      if (root != exclude)
+      if (flags & LI_INCLUDE_ROOT)
        this->to_visit.quick_push (root->num);
       return;
     }
+  else if (preorder_p && flags & LI_INCLUDE_ROOT)
+    this->to_visit.quick_push (root->num);

> +  /* When visiting from innermost, we need to consider root here
> +     since the previous while loop doesn't handle it.  */
> +  if (from_innermost_p && root != exclude)
> +    this->to_visit.quick_push (root->num);
> 
> could be like the first.  I think that's more clear even.  Sorry for
> finding a better solution again.
> 

It's totally fine, thanks for all the nice suggestions!  :)

> OK with that change
> 

Thanks, the attached diff is the delta against v3, excepting for
the "else if", the other changes follow the suggestion above.

Could you have another look to confirm?

I'll do the full testing again before committing.

BR,
Kewen

[-- Attachment #2: delta.diff --]
[-- Type: text/plain, Size: 1145 bytes --]

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index afbaa216ce5..4c6d8ed90d2 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -2125,17 +2125,19 @@ loops_list::walk_loop_tree (class loop *root, unsigned flags)
      following while loop are impossible to be root.  */
   if (!root->inner)
     {
-      if (root != exclude)
+      if (flags & LI_INCLUDE_ROOT)
 	this->to_visit.quick_push (root->num);
       return;
     }
+  else if (preorder_p && flags & LI_INCLUDE_ROOT)
+    this->to_visit.quick_push (root->num);
 
   class loop *aloop;
-  for (aloop = root;
+  for (aloop = root->inner;
        aloop->inner != NULL;
        aloop = aloop->inner)
     {
-      if (preorder_p && aloop != exclude)
+      if (preorder_p)
 	this->to_visit.quick_push (aloop->num);
       continue;
     }
@@ -2165,7 +2167,7 @@ loops_list::walk_loop_tree (class loop *root, unsigned flags)
 
   /* When visiting from innermost, we need to consider root here
      since the previous while loop doesn't handle it.  */
-  if (from_innermost_p && root != exclude)
+  if (from_innermost_p && flags & LI_INCLUDE_ROOT)
     this->to_visit.quick_push (root->num);
 }
 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] Make loops_list support an optional loop_p root
  2021-08-04 10:47                 ` Kewen.Lin
@ 2021-08-04 12:04                   ` Richard Biener
  2021-08-05  8:50                     ` Kewen.Lin
  0 siblings, 1 reply; 35+ messages in thread
From: Richard Biener @ 2021-08-04 12:04 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, Segher Boessenkool, Martin Sebor, Bill Schmidt

On Wed, Aug 4, 2021 at 12:47 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2021/8/4 下午6:01, Richard Biener wrote:
> > On Wed, Aug 4, 2021 at 4:36 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>
> >> on 2021/8/3 下午8:08, Richard Biener wrote:
> >>> On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>>
> >>>> on 2021/7/29 下午4:01, Richard Biener wrote:
> >>>>> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>>>>
> >>>>>> on 2021/7/22 下午8:56, Richard Biener wrote:
> >>>>>>> On Tue, Jul 20, 2021 at 4:37
> >>>>>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> This v2 has addressed some review comments/suggestions:
> >>>>>>>>
> >>>>>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
> >>>>>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
> >>>>>>>>     to support loop hierarchy tree rather than just a function,
> >>>>>>>>     and adjust to use loops* accordingly.
> >>>>>>>
> >>>>>>> I actually meant struct loop *, not struct loops * ;)  At the point
> >>>>>>> we pondered to make loop invariant motion work on single
> >>>>>>> loop nests we gave up not only but also because it iterates
> >>>>>>> over the loop nest but all the iterators only ever can process
> >>>>>>> all loops, not say, all loops inside a specific 'loop' (and
> >>>>>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
> >>>>>>> CTOR would take the 'root' of the loop tree as argument.
> >>>>>>>
> >>>>>>> I see that doesn't trivially fit how loops_list works, at least
> >>>>>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
> >>>>>>> could be adjusted to do ONLY_INNERMOST as well?
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Thanks for the clarification!  I just realized that the previous
> >>>>>> version with struct loops* is problematic, all traversal is
> >>>>>> still bounded with outer_loop == NULL.  I think what you expect
> >>>>>> is to respect the given loop_p root boundary.  Since we just
> >>>>>> record the loops' nums, I think we still need the function* fn?
> >>>>>
> >>>>> Would it simplify things if we recorded the actual loop *?
> >>>>>
> >>>>
> >>>> I'm afraid it's unsafe to record the loop*.  I had the same
> >>>> question why the loop iterator uses index rather than loop* when
> >>>> I read this at the first time.  I guess the design of processing
> >>>> loops allows its user to update or even delete the folllowing
> >>>> loops to be visited.  For example, when the user does some tricks
> >>>> on one loop, then it duplicates the loop and its children to
> >>>> somewhere and then removes the loop and its children, when
> >>>> iterating onto its children later, the "index" way will check its
> >>>> validity by get_loop at that point, but the "loop *" way will
> >>>> have some recorded pointers to become dangling, can't do the
> >>>> validity check on itself, seems to need a side linear search to
> >>>> ensure the validity.
> >>>>
> >>>>> There's still the to_visit reserve which needs a bound on
> >>>>> the number of loops for efficiency reasons.
> >>>>>
> >>>>
> >>>> Yes, I still keep the fn in the updated version.
> >>>>
> >>>>>> So I add one optional argument loop_p root and update the
> >>>>>> visiting codes accordingly.  Before this change, the previous
> >>>>>> visiting uses the outer_loop == NULL as the termination condition,
> >>>>>> it perfectly includes the root itself, but with this given root,
> >>>>>> we have to use it as the termination condition to avoid to iterate
> >>>>>> onto its possible existing next.
> >>>>>>
> >>>>>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
> >>>>>> code like:
> >>>>>>
> >>>>>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
> >>>>>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
> >>>>>>         if (aloop != NULL
> >>>>>>             && aloop->inner == NULL
> >>>>>>             && flow_loop_nested_p (tree_root, aloop))
> >>>>>>              this->to_visit.quick_push (aloop->num);
> >>>>>>
> >>>>>> it has the stable bound, but if the given root only has several
> >>>>>> child loops, it can be much worse if there are many loops in fn.
> >>>>>> It seems impossible to predict the given root loop hierarchy size,
> >>>>>> maybe we can still use the original linear searching for the case
> >>>>>> loops_for_fn (fn) == root?  But since this visiting seems not so
> >>>>>> performance critical, I chose to share the code originally used
> >>>>>> for FROM_INNERMOST, hope it can have better readability and
> >>>>>> maintainability.
> >>>>>
> >>>>> I was indeed looking for something that has execution/storage
> >>>>> bound on the subtree we're interested in.  If we pull the CTOR
> >>>>> out-of-line we can probably keep the linear search for
> >>>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
> >>>>>
> >>>>
> >>>> OK, I've moved the suggested single loop tree walker out-of-line
> >>>> to cfgloop.c, and brought the linear search back for
> >>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
> >>>>
> >>>>> It just seemed to me that we can eventually re-use a
> >>>>> single loop tree walker for all orders, just adjusting the
> >>>>> places we push.
> >>>>>
> >>>>
> >>>> Wow, good point!  Indeed, I have further unified all orders
> >>>> handlings into a single function walk_loop_tree.
> >>>>
> >>>>>>
> >>>>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
> >>>>>> x86_64-redhat-linux and aarch64-linux-gnu, also
> >>>>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
> >>>>>>
> >>>>>> Does the attached patch meet what you expect?
> >>>>>
> >>>>> So yeah, it's probably close to what is sensible.  Not sure
> >>>>> whether optimizing the loops for the !only_push_innermost_p
> >>>>> case is important - if we manage to produce a single
> >>>>> walker with conditionals based on 'flags' then IPA-CP should
> >>>>> produce optimal clones as well I guess.
> >>>>>
> >>>>
> >>>> Thanks for the comments, the updated v2 is attached.
> >>>> Comparing with v1, it does:
> >>>>
> >>>>   - Unify one single loop tree walker for all orders.
> >>>>   - Move walk_loop_tree out-of-line to cfgloop.c.
> >>>>   - Keep the linear search for LI_ONLY_INNERMOST with
> >>>>     tree_root of fn loops.
> >>>>   - Use class loop * instead of loop_p.
> >>>>
> >>>> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
> >>>> (with/without the hunk for LI_ONLY_INNERMOST linear search,
> >>>> it can have the coverage to exercise LI_ONLY_INNERMOST
> >>>> in walk_loop_tree when "without").
> >>>>
> >>>> Is it ok for trunk?
> >>>
> >>> Looks good to me.  I think that the 'mn' was an optimization
> >>> for the linear walk and it's cheaper to pointer test against
> >>> the actual 'root' loop (no need to dereference).  Thus
> >>>
> >>> +  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
> >>>      {
> >>> -      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
> >>> +      class loop *aloop;
> >>> +      unsigned int i;
> >>> +      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
> >>>         if (aloop != NULL
> >>>             && aloop->inner == NULL
> >>> -           && aloop->num >= mn)
> >>> +           && aloop->num != mn)
> >>>           this->to_visit.quick_push (aloop->num);
> >>>
> >>> could elide the aloop->num != mn check and start iterating from 1,
> >>> since loops->tree_root->num == 0
> >>>
> >>> and the walk_loop_tree could simply do
> >>>
> >>>   class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
> >>>
> >>> and pointer test aloop against exclude.  That avoids the idea that
> >>> 'mn' is a vehicle to exclude one random loop from the iteration.
> >>>
> >>
> >> Good idea!  Thanks for the comments!  The attached v3 has addressed
> >> the review comments on "mn".
> >>
> >> Bootstrapped & regtested again on powerpc64le-linux-gnu Power9
> >> (with/without the hunk for LI_ONLY_INNERMOST linear search).
> >>
> >> Is it ok for trunk?
> >
> > +  /* Early handle root without any inner loops, make later
> > +     processing simpler, that is all loops processed in the
> > +     following while loop are impossible to be root.  */
> > +  if (!root->inner)
> > +    {
> > +      if (root != exclude)
> > +       this->to_visit.quick_push (root->num);
> > +      return;
> > +    }
> >
> > could be
> >
> >    if (!root->inner)
> >      {
> >         if (flags & LI_INCLUDE_ROOT)
> >           this->to_visit.quick_push (root->num);
> >      }
> >
>
> OK, I thought wrongly that all places with "exclude" might be
> more consistent, so gave up to use flags directly.  :)
>
> > +  class loop *aloop;
> > +  for (aloop = root;
> > +       aloop->inner != NULL;
> > +       aloop = aloop->inner)
> > +    {
> > +      if (preorder_p && aloop != exclude)
> > +       this->to_visit.quick_push (aloop->num);
> > +      continue;
> > +    }
> >
> > could be
> >
> > +  class loop *aloop;
> > +  for (aloop = root->inner;
> > +       aloop->inner != NULL;
> > +       aloop = aloop->inner)
> > +    {
> > +      if (preorder_p)
> > +       this->to_visit.quick_push (aloop->num);
> > +      continue;
> > +    }
> >
>
> This seems wrong?  For preorder_p, we might miss to push root
> when root->inner isn't NULL.  The below "else if" makes it safe.

oops, yes.

> @@ -2125,17 +2125,19 @@ loops_list::walk_loop_tree (class loop *root, unsigned flags)
>       following while loop are impossible to be root.  */
>    if (!root->inner)
>      {
> -      if (root != exclude)
> +      if (flags & LI_INCLUDE_ROOT)
>         this->to_visit.quick_push (root->num);
>        return;
>      }
> +  else if (preorder_p && flags & LI_INCLUDE_ROOT)
> +    this->to_visit.quick_push (root->num);
>
> > +  /* When visiting from innermost, we need to consider root here
> > +     since the previous while loop doesn't handle it.  */
> > +  if (from_innermost_p && root != exclude)
> > +    this->to_visit.quick_push (root->num);
> >
> > could be like the first.  I think that's more clear even.  Sorry for
> > finding a better solution again.
> >
>
> It's totally fine, thanks for all the nice suggestions!  :)
>
> > OK with that change
> >
>
> Thanks, the attached diff is the delta against v3, excepting for
> the "else if", the other changes follow the suggestion above.
>
> Could you have another look to confirm?

I'm missing the line that removes 'exclude', other than that it looks
OK.

Thanks,
Richard.

> I'll do the full testing again before committing.
>
> BR,
> Kewen

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] Make loops_list support an optional loop_p root
  2021-08-04 12:04                   ` Richard Biener
@ 2021-08-05  8:50                     ` Kewen.Lin
  0 siblings, 0 replies; 35+ messages in thread
From: Kewen.Lin @ 2021-08-05  8:50 UTC (permalink / raw)
  To: Richard Biener
  Cc: GCC Patches, Segher Boessenkool, Martin Sebor, Bill Schmidt

on 2021/8/4 下午8:04, Richard Biener wrote:
> On Wed, Aug 4, 2021 at 12:47 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> on 2021/8/4 下午6:01, Richard Biener wrote:
>>> On Wed, Aug 4, 2021 at 4:36 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>
>>>> on 2021/8/3 下午8:08, Richard Biener wrote:
>>>>> On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>
>>>>>> on 2021/7/29 下午4:01, Richard Biener wrote:
>>>>>>> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>>>
>>>>>>>> on 2021/7/22 下午8:56, Richard Biener wrote:
>>>>>>>>> On Tue, Jul 20, 2021 at 4:37
>>>>>>>>> PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This v2 has addressed some review comments/suggestions:
>>>>>>>>>>
>>>>>>>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>>>>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>>>>>>>     to support loop hierarchy tree rather than just a function,
>>>>>>>>>>     and adjust to use loops* accordingly.
>>>>>>>>>
>>>>>>>>> I actually meant struct loop *, not struct loops * ;)  At the point
>>>>>>>>> we pondered to make loop invariant motion work on single
>>>>>>>>> loop nests we gave up not only but also because it iterates
>>>>>>>>> over the loop nest but all the iterators only ever can process
>>>>>>>>> all loops, not say, all loops inside a specific 'loop' (and
>>>>>>>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>>>>>>>>> CTOR would take the 'root' of the loop tree as argument.
>>>>>>>>>
>>>>>>>>> I see that doesn't trivially fit how loops_list works, at least
>>>>>>>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>>>>>>>>> could be adjusted to do ONLY_INNERMOST as well?
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for the clarification!  I just realized that the previous
>>>>>>>> version with struct loops* is problematic, all traversal is
>>>>>>>> still bounded with outer_loop == NULL.  I think what you expect
>>>>>>>> is to respect the given loop_p root boundary.  Since we just
>>>>>>>> record the loops' nums, I think we still need the function* fn?
>>>>>>>
>>>>>>> Would it simplify things if we recorded the actual loop *?
>>>>>>>
>>>>>>
>>>>>> I'm afraid it's unsafe to record the loop*.  I had the same
>>>>>> question why the loop iterator uses index rather than loop* when
>>>>>> I read this at the first time.  I guess the design of processing
>>>>>> loops allows its user to update or even delete the folllowing
>>>>>> loops to be visited.  For example, when the user does some tricks
>>>>>> on one loop, then it duplicates the loop and its children to
>>>>>> somewhere and then removes the loop and its children, when
>>>>>> iterating onto its children later, the "index" way will check its
>>>>>> validity by get_loop at that point, but the "loop *" way will
>>>>>> have some recorded pointers to become dangling, can't do the
>>>>>> validity check on itself, seems to need a side linear search to
>>>>>> ensure the validity.
>>>>>>
>>>>>>> There's still the to_visit reserve which needs a bound on
>>>>>>> the number of loops for efficiency reasons.
>>>>>>>
>>>>>>
>>>>>> Yes, I still keep the fn in the updated version.
>>>>>>
>>>>>>>> So I add one optional argument loop_p root and update the
>>>>>>>> visiting codes accordingly.  Before this change, the previous
>>>>>>>> visiting uses the outer_loop == NULL as the termination condition,
>>>>>>>> it perfectly includes the root itself, but with this given root,
>>>>>>>> we have to use it as the termination condition to avoid to iterate
>>>>>>>> onto its possible existing next.
>>>>>>>>
>>>>>>>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>>>>>>>> code like:
>>>>>>>>
>>>>>>>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
>>>>>>>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>>>>>>>         if (aloop != NULL
>>>>>>>>             && aloop->inner == NULL
>>>>>>>>             && flow_loop_nested_p (tree_root, aloop))
>>>>>>>>              this->to_visit.quick_push (aloop->num);
>>>>>>>>
>>>>>>>> it has the stable bound, but if the given root only has several
>>>>>>>> child loops, it can be much worse if there are many loops in fn.
>>>>>>>> It seems impossible to predict the given root loop hierarchy size,
>>>>>>>> maybe we can still use the original linear searching for the case
>>>>>>>> loops_for_fn (fn) == root?  But since this visiting seems not so
>>>>>>>> performance critical, I chose to share the code originally used
>>>>>>>> for FROM_INNERMOST, hope it can have better readability and
>>>>>>>> maintainability.
>>>>>>>
>>>>>>> I was indeed looking for something that has execution/storage
>>>>>>> bound on the subtree we're interested in.  If we pull the CTOR
>>>>>>> out-of-line we can probably keep the linear search for
>>>>>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>>>>>
>>>>>>
>>>>>> OK, I've moved the suggested single loop tree walker out-of-line
>>>>>> to cfgloop.c, and brought the linear search back for
>>>>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>>>>
>>>>>>> It just seemed to me that we can eventually re-use a
>>>>>>> single loop tree walker for all orders, just adjusting the
>>>>>>> places we push.
>>>>>>>
>>>>>>
>>>>>> Wow, good point!  Indeed, I have further unified all orders
>>>>>> handlings into a single function walk_loop_tree.
>>>>>>
>>>>>>>>
>>>>>>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>>>>>>>> x86_64-redhat-linux and aarch64-linux-gnu, also
>>>>>>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>>>>>>>
>>>>>>>> Does the attached patch meet what you expect?
>>>>>>>
>>>>>>> So yeah, it's probably close to what is sensible.  Not sure
>>>>>>> whether optimizing the loops for the !only_push_innermost_p
>>>>>>> case is important - if we manage to produce a single
>>>>>>> walker with conditionals based on 'flags' then IPA-CP should
>>>>>>> produce optimal clones as well I guess.
>>>>>>>
>>>>>>
>>>>>> Thanks for the comments, the updated v2 is attached.
>>>>>> Comparing with v1, it does:
>>>>>>
>>>>>>   - Unify one single loop tree walker for all orders.
>>>>>>   - Move walk_loop_tree out-of-line to cfgloop.c.
>>>>>>   - Keep the linear search for LI_ONLY_INNERMOST with
>>>>>>     tree_root of fn loops.
>>>>>>   - Use class loop * instead of loop_p.
>>>>>>
>>>>>> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
>>>>>> (with/without the hunk for LI_ONLY_INNERMOST linear search,
>>>>>> it can have the coverage to exercise LI_ONLY_INNERMOST
>>>>>> in walk_loop_tree when "without").
>>>>>>
>>>>>> Is it ok for trunk?
>>>>>
>>>>> Looks good to me.  I think that the 'mn' was an optimization
>>>>> for the linear walk and it's cheaper to pointer test against
>>>>> the actual 'root' loop (no need to dereference).  Thus
>>>>>
>>>>> +  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
>>>>>      {
>>>>> -      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
>>>>> +      class loop *aloop;
>>>>> +      unsigned int i;
>>>>> +      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
>>>>>         if (aloop != NULL
>>>>>             && aloop->inner == NULL
>>>>> -           && aloop->num >= mn)
>>>>> +           && aloop->num != mn)
>>>>>           this->to_visit.quick_push (aloop->num);
>>>>>
>>>>> could elide the aloop->num != mn check and start iterating from 1,
>>>>> since loops->tree_root->num == 0
>>>>>
>>>>> and the walk_loop_tree could simply do
>>>>>
>>>>>   class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
>>>>>
>>>>> and pointer test aloop against exclude.  That avoids the idea that
>>>>> 'mn' is a vehicle to exclude one random loop from the iteration.
>>>>>
>>>>
>>>> Good idea!  Thanks for the comments!  The attached v3 has addressed
>>>> the review comments on "mn".
>>>>
>>>> Bootstrapped & regtested again on powerpc64le-linux-gnu Power9
>>>> (with/without the hunk for LI_ONLY_INNERMOST linear search).
>>>>
>>>> Is it ok for trunk?
>>>
>>> +  /* Early handle root without any inner loops, make later
>>> +     processing simpler, that is all loops processed in the
>>> +     following while loop are impossible to be root.  */
>>> +  if (!root->inner)
>>> +    {
>>> +      if (root != exclude)
>>> +       this->to_visit.quick_push (root->num);
>>> +      return;
>>> +    }
>>>
>>> could be
>>>
>>>    if (!root->inner)
>>>      {
>>>         if (flags & LI_INCLUDE_ROOT)
>>>           this->to_visit.quick_push (root->num);
>>>      }
>>>
>>
>> OK, I thought wrongly that all places with "exclude" might be
>> more consistent, so gave up to use flags directly.  :)
>>
>>> +  class loop *aloop;
>>> +  for (aloop = root;
>>> +       aloop->inner != NULL;
>>> +       aloop = aloop->inner)
>>> +    {
>>> +      if (preorder_p && aloop != exclude)
>>> +       this->to_visit.quick_push (aloop->num);
>>> +      continue;
>>> +    }
>>>
>>> could be
>>>
>>> +  class loop *aloop;
>>> +  for (aloop = root->inner;
>>> +       aloop->inner != NULL;
>>> +       aloop = aloop->inner)
>>> +    {
>>> +      if (preorder_p)
>>> +       this->to_visit.quick_push (aloop->num);
>>> +      continue;
>>> +    }
>>>
>>
>> This seems wrong?  For preorder_p, we might miss to push root
>> when root->inner isn't NULL.  The below "else if" makes it safe.
> 
> oops, yes.
> 
>> @@ -2125,17 +2125,19 @@ loops_list::walk_loop_tree (class loop *root, unsigned flags)
>>       following while loop are impossible to be root.  */
>>    if (!root->inner)
>>      {
>> -      if (root != exclude)
>> +      if (flags & LI_INCLUDE_ROOT)
>>         this->to_visit.quick_push (root->num);
>>        return;
>>      }
>> +  else if (preorder_p && flags & LI_INCLUDE_ROOT)
>> +    this->to_visit.quick_push (root->num);
>>
>>> +  /* When visiting from innermost, we need to consider root here
>>> +     since the previous while loop doesn't handle it.  */
>>> +  if (from_innermost_p && root != exclude)
>>> +    this->to_visit.quick_push (root->num);
>>>
>>> could be like the first.  I think that's more clear even.  Sorry for
>>> finding a better solution again.
>>>
>>
>> It's totally fine, thanks for all the nice suggestions!  :)
>>
>>> OK with that change
>>>
>>
>> Thanks, the attached diff is the delta against v3, excepting for
>> the "else if", the other changes follow the suggestion above.
>>
>> Could you have another look to confirm?
> 
> I'm missing the line that removes 'exclude', other than that it looks
> OK.
> 

Thanks!  Bootstrapped & regress-tested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu.  Committed in r12-2756.

BR,
Kewen

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops)
  2021-07-30  7:58           ` Kewen.Lin
@ 2021-11-24 14:24             ` Thomas Schwinge
  2021-11-24 16:58               ` Martin Jambor
  2021-11-24 19:44               ` Jeff Law
  0 siblings, 2 replies; 35+ messages in thread
From: Thomas Schwinge @ 2021-11-24 14:24 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 751 bytes --]

Hi!

On 2021-07-30T15:58:36+0800, "Kewen.Lin" <linkw@linux.ibm.com> wrote:
> on 2021/7/30 下午3:18, Thomas Schwinge wrote:
>> Curious why in some instances we're not removing the 'class loop *loop'
>> declaration, I had a look, and this may suggest some further clean-up?
>
> [...] I like your nice proposed further clean-up,
> thanks for doing that!

Ping for my patch to "Reduce scope of a few 'class loop *loop' variables",
see attached.


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Reduce-scope-of-a-few-class-loop-loop-variables.patch --]
[-- Type: text/x-diff, Size: 6484 bytes --]

From 6051ff3a4ba0b8f44ecb262e4553f8a471c66237 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 30 Jul 2021 09:23:20 +0200
Subject: [PATCH] Reduce scope of a few 'class loop *loop' variables

Further clean-up after commit e41ba804ba5f5ca433e09238d561b1b4c8b10985
"Use range-based for loops for traversing loops".  No functional change.

	gcc/
	* cfgloop.c (verify_loop_structure): Reduce scope of
	'class loop *loop' variable.
	* ipa-fnsummary.c (analyze_function_body): Likewise.
	* loop-init.c (fix_loop_structure): Likewise.
	* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
	* predict.c (predict_loops): Likewise.
	* tree-loop-distribution.c (loop_distribution::execute): Likewise.
	* tree-vectorizer.c (pass_vectorize::execute): Likewise.
---
 gcc/cfgloop.c                | 3 +--
 gcc/ipa-fnsummary.c          | 3 +--
 gcc/loop-init.c              | 2 +-
 gcc/loop-invariant.c         | 4 ++--
 gcc/predict.c                | 3 +--
 gcc/tree-loop-distribution.c | 2 +-
 gcc/tree-vectorizer.c        | 5 ++---
 7 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 20c24c13c36..3190d12b2ce 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -1398,7 +1398,6 @@ verify_loop_structure (void)
 {
   unsigned *sizes, i, j;
   basic_block bb, *bbs;
-  class loop *loop;
   int err = 0;
   edge e;
   unsigned num = number_of_loops (cfun);
@@ -1689,7 +1688,7 @@ verify_loop_structure (void)
 	      for (; exit; exit = exit->next_e)
 		eloops++;
 
-	      for (loop = bb->loop_father;
+	      for (class loop *loop = bb->loop_father;
 		   loop != e->dest->loop_father
 		   /* When a loop exit is also an entry edge which
 		      can happen when avoiding CFG manipulations
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 7e9201a554a..cb13d2e4b3c 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -2934,7 +2934,6 @@ analyze_function_body (struct cgraph_node *node, bool early)
   if (nonconstant_names.exists () && !early)
     {
       ipa_fn_summary *s = ipa_fn_summaries->get (node);
-      class loop *loop;
       unsigned max_loop_predicates = opt_for_fn (node->decl,
 						 param_ipa_max_loop_predicates);
 
@@ -2978,7 +2977,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
       /* To avoid quadratic behavior we analyze stride predicates only
          with respect to the containing loop.  Thus we simply iterate
 	 over all defs in the outermost loop body.  */
-      for (loop = loops_for_fn (cfun)->tree_root->inner;
+      for (class loop *loop = loops_for_fn (cfun)->tree_root->inner;
 	   loop != NULL; loop = loop->next)
 	{
 	  ipa_predicate loop_stride = true;
diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 04054ef6222..f0931a99661 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -201,7 +201,6 @@ fix_loop_structure (bitmap changed_bbs)
 {
   basic_block bb;
   int record_exits = 0;
-  class loop *loop;
   unsigned old_nloops, i;
 
   timevar_push (TV_LOOP_INIT);
@@ -279,6 +278,7 @@ fix_loop_structure (bitmap changed_bbs)
 
   /* Finally free deleted loops.  */
   bool any_deleted = false;
+  class loop *loop;
   FOR_EACH_VEC_ELT (*get_loops (cfun), i, loop)
     if (loop && loop->header == NULL)
       {
diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index fca0c2b24be..5eee2e5c9f8 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -2134,7 +2134,7 @@ calculate_loop_reg_pressure (void)
   basic_block bb;
   rtx_insn *insn;
   rtx link;
-  class loop *loop, *parent;
+  class loop *parent;
 
   for (auto loop : loops_list (cfun, 0))
     if (loop->aux == NULL)
@@ -2151,7 +2151,7 @@ calculate_loop_reg_pressure (void)
       if (curr_loop == current_loops->tree_root)
 	continue;
 
-      for (loop = curr_loop;
+      for (class loop *loop = curr_loop;
 	   loop != current_loops->tree_root;
 	   loop = loop_outer (loop))
 	bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_IN (bb));
diff --git a/gcc/predict.c b/gcc/predict.c
index 68b11135680..3cb4e3c0eb5 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -1927,7 +1927,6 @@ predict_extra_loop_exits (edge exit_edge)
 static void
 predict_loops (void)
 {
-  class loop *loop;
   basic_block bb;
   hash_set <class loop *> with_recursion(10);
 
@@ -1941,7 +1940,7 @@ predict_loops (void)
 	    && (decl = gimple_call_fndecl (gsi_stmt (gsi))) != NULL
 	    && recursive_call_p (current_function_decl, decl))
 	  {
-	    loop = bb->loop_father;
+	    class loop *loop = bb->loop_father;
 	    while (loop && !with_recursion.add (loop))
 	      loop = loop_outer (loop);
 	  }
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 583c01a42d8..c9e18739165 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -3737,7 +3737,6 @@ prepare_perfect_loop_nest (class loop *loop)
 unsigned int
 loop_distribution::execute (function *fun)
 {
-  class loop *loop;
   bool changed = false;
   basic_block bb;
   control_dependences *cd = NULL;
@@ -3845,6 +3844,7 @@ loop_distribution::execute (function *fun)
       /* Destroy loop bodies that could not be reused.  Do this late as we
 	 otherwise can end up refering to stale data in control dependences.  */
       unsigned i;
+      class loop *loop;
       FOR_EACH_VEC_ELT (loops_to_be_destroyed, i, loop)
 	destroy_loop (loop);
 
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 0e1cee99bae..f4a2873a91e 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -1209,7 +1209,6 @@ pass_vectorize::execute (function *fun)
   unsigned int i;
   unsigned int num_vectorized_loops = 0;
   unsigned int vect_loops_num;
-  class loop *loop;
   hash_table<simduid_to_vf> *simduid_to_vf_htab = NULL;
   hash_table<simd_array_to_simduid> *simd_array_to_simduid_htab = NULL;
   bool any_ifcvt_loops = false;
@@ -1293,7 +1292,7 @@ pass_vectorize::execute (function *fun)
   if (any_ifcvt_loops)
     for (i = 1; i < number_of_loops (fun); i++)
       {
-	loop = get_loop (fun, i);
+	class loop *loop = get_loop (fun, i);
 	if (loop && loop->dont_vectorize)
 	  {
 	    gimple *g = vect_loop_vectorized_call (loop);
@@ -1342,7 +1341,7 @@ pass_vectorize::execute (function *fun)
       loop_vec_info loop_vinfo;
       bool has_mask_store;
 
-      loop = get_loop (fun, i);
+      class loop *loop = get_loop (fun, i);
       if (!loop || !loop->aux)
 	continue;
       loop_vinfo = (loop_vec_info) loop->aux;
-- 
2.33.0


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops)
  2021-11-24 14:24             ` Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops) Thomas Schwinge
@ 2021-11-24 16:58               ` Martin Jambor
  2021-11-24 19:44               ` Jeff Law
  1 sibling, 0 replies; 35+ messages in thread
From: Martin Jambor @ 2021-11-24 16:58 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: gcc-patches

Hi,

On Wed, Nov 24 2021, Thomas Schwinge wrote:
> Hi!
>
> On 2021-07-30T15:58:36+0800, "Kewen.Lin" <linkw@linux.ibm.com> wrote:
>> on 2021/7/30 下午3:18, Thomas Schwinge wrote:
>>> Curious why in some instances we're not removing the 'class loop *loop'
>>> declaration, I had a look, and this may suggest some further clean-up?
>>
>> [...] I like your nice proposed further clean-up,
>> thanks for doing that!
>
> Ping for my patch to "Reduce scope of a few 'class loop *loop' variables",
> see attached.
>

[...]
>

> Further clean-up after commit e41ba804ba5f5ca433e09238d561b1b4c8b10985
> "Use range-based for loops for traversing loops".  No functional change.
>
> 	gcc/
> 	* cfgloop.c (verify_loop_structure): Reduce scope of
> 	'class loop *loop' variable.
> 	* ipa-fnsummary.c (analyze_function_body): Likewise.

FWIW, the ipa-fnsummary.c hunk is OK (and better-that-expected clean-up
too, because it avoids the loop variable being hidden by another with
the same name in an earlier loop).

Thanks,

Martin


> 	* loop-init.c (fix_loop_structure): Likewise.
> 	* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
> 	* predict.c (predict_loops): Likewise.
> 	* tree-loop-distribution.c (loop_distribution::execute): Likewise.
> 	* tree-vectorizer.c (pass_vectorize::execute): Likewise.

[...]

> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index 7e9201a554a..cb13d2e4b3c 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -2934,7 +2934,6 @@ analyze_function_body (struct cgraph_node *node, bool early)
>    if (nonconstant_names.exists () && !early)
>      {
>        ipa_fn_summary *s = ipa_fn_summaries->get (node);
> -      class loop *loop;
>        unsigned max_loop_predicates = opt_for_fn (node->decl,
>  						 param_ipa_max_loop_predicates);
>  
> @@ -2978,7 +2977,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
>        /* To avoid quadratic behavior we analyze stride predicates only
>           with respect to the containing loop.  Thus we simply iterate
>  	 over all defs in the outermost loop body.  */
> -      for (loop = loops_for_fn (cfun)->tree_root->inner;
> +      for (class loop *loop = loops_for_fn (cfun)->tree_root->inner;
>  	   loop != NULL; loop = loop->next)
>  	{
>  	  ipa_predicate loop_stride = true;

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops)
  2021-11-24 14:24             ` Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops) Thomas Schwinge
  2021-11-24 16:58               ` Martin Jambor
@ 2021-11-24 19:44               ` Jeff Law
  1 sibling, 0 replies; 35+ messages in thread
From: Jeff Law @ 2021-11-24 19:44 UTC (permalink / raw)
  To: Thomas Schwinge, gcc-patches



On 11/24/2021 7:24 AM, Thomas Schwinge wrote:
> Hi!
>
> On 2021-07-30T15:58:36+0800, "Kewen.Lin" <linkw@linux.ibm.com> wrote:
>> on 2021/7/30 下午3:18, Thomas Schwinge wrote:
>>> Curious why in some instances we're not removing the 'class loop *loop'
>>> declaration, I had a look, and this may suggest some further clean-up?
>> [...] I like your nice proposed further clean-up,
>> thanks for doing that!
> Ping for my patch to "Reduce scope of a few 'class loop *loop' variables",
> see attached.
OK for the trunk.  Sorry about the wait.

jeff


^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2021-11-24 19:44 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-19  6:20 [RFC/PATCH] Use range-based for loops for traversing loops Kewen.Lin
2021-07-19  6:26 ` Andrew Pinski
2021-07-20  8:56   ` Kewen.Lin
2021-07-19 14:08 ` Jonathan Wakely
2021-07-20  8:56   ` Kewen.Lin
2021-07-19 14:34 ` Richard Biener
2021-07-20  8:57   ` Kewen.Lin
2021-07-19 15:59 ` Martin Sebor
2021-07-20  8:58   ` Kewen.Lin
2021-07-20  9:49     ` Jonathan Wakely
2021-07-20  9:50       ` Jonathan Wakely
2021-07-20 14:42       ` Kewen.Lin
2021-07-20 14:36 ` [PATCH v2] " Kewen.Lin
2021-07-22 12:56   ` Richard Biener
2021-07-22 12:56     ` Richard Biener
2021-07-23  8:41     ` [PATCH] Make loops_list support an optional loop_p root Kewen.Lin
2021-07-23 16:26       ` Martin Sebor
2021-07-27  2:25         ` Kewen.Lin
2021-07-29  8:01       ` Richard Biener
2021-07-30  5:20         ` [PATCH v2] " Kewen.Lin
2021-08-03 12:08           ` Richard Biener
2021-08-04  2:36             ` [PATCH v3] " Kewen.Lin
2021-08-04 10:01               ` Richard Biener
2021-08-04 10:47                 ` Kewen.Lin
2021-08-04 12:04                   ` Richard Biener
2021-08-05  8:50                     ` Kewen.Lin
2021-07-23  8:35   ` [PATCH v3] Use range-based for loops for traversing loops Kewen.Lin
2021-07-23 16:10     ` Martin Sebor
2021-07-27  2:10       ` [PATCH v4] " Kewen.Lin
2021-07-29  7:48         ` Richard Biener
2021-07-30  7:18         ` Thomas Schwinge
2021-07-30  7:58           ` Kewen.Lin
2021-11-24 14:24             ` Reduce scope of a few 'class loop *loop' variables (was: [PATCH v4] Use range-based for loops for traversing loops) Thomas Schwinge
2021-11-24 16:58               ` Martin Jambor
2021-11-24 19:44               ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).