[PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
@ 2016-08-03 13:32 Martin Liška
  2016-08-03 13:54 ` Richard Biener
  2016-08-03 14:22 ` Nathan Sidwell
  0 siblings, 2 replies; 21+ messages in thread
From: Martin Liška @ 2016-08-03 13:32 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jan Hubicka, Pidgeot18, Nathan Sidwell

[-- Attachment #1: Type: text/plain, Size: 822 bytes --]

Hello.

As I've going through all PRs related to gcov-profile, I've noticed this PR.
Current implementation of cycle detection in gcov is very poor, leading to extreme run time
for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
the patch is quite subtle and fast (of course).

The patch survives gcov.exp regression tests and I also verified than *.gcov is identical before
and after the patch for Inkscape:

$ find . -name '*.gcov' | wc -l
10752

I'm also thinking about adding [1] to test-suite, however it would require implementation 'timeout'
argument in gcov.exp. Does it worth doing?

Ready to install?
Thanks,
Martin

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67992#c5

[-- Attachment #2: 0001-gcov-tool-Implement-Hawick-s-algorithm-for-cycle-det.patch --]
[-- Type: text/x-patch, Size: 19575 bytes --]

From faf7fb72d439974de68eb672edc6d76424f6022d Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Wed, 3 Aug 2016 09:56:45 +0200
Subject: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection
 (PR gcov-profile/67992)

gcc/ChangeLog:

2016-08-03  Martin Liska  <mliska@suse.cz>
	    Joshua Cranmer  <Pidgeot18@gmail.com>

	PR gcov-profile/67992
	* gcov.c (line_t::has_block): New function.
	(handle_cycle): Likewise.
	(unblock): Likewise.
	(circuit): Likewise.
	(find_cycles): Likewise.
	(get_cycles_count): Likewise.
	(main): Fix GNU coding style.
	(output_intermediate_file): Likewise.
	(process_file): Likewise.
	(generate_results): Likewise.
	(release_structures): Likewise.
	(create_file_names): Likewise.
	(find_source): Likewise.
	(read_graph_file): Likewise.
	(find_exception_blocks): Likewise.
	(canonicalize_name): Likewise.
	(make_gcov_file_name): Likewise.
	(mangle_name): Likewise.
	(accumulate_line_counts): Use the new Hawick's algorithm.
	(output_branch_count): Fix GNU coding style.
	(read_line): Likewise.
---
 gcc/gcov.c | 378 +++++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 229 insertions(+), 149 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 417b4f4..8855980 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -41,6 +41,11 @@ along with Gcov; see the file COPYING3.  If not see
 
 #include <getopt.h>
 
+#include <vector>
+#include <algorithm>
+
+using namespace std;
+
 #define IN_GCOV 1
 #include "gcov-io.h"
 #include "gcov-io.c"
@@ -222,6 +227,9 @@ typedef struct coverage_info
 
 typedef struct line_info
 {
+  /* Return true when NEEDLE is one of basic blocks the line belongs to.  */
+  bool has_block (block_t *needle);
+
   gcov_type count;	   /* execution count */
   union
   {
@@ -235,6 +243,16 @@ typedef struct line_info
   unsigned unexceptional : 1;
 } line_t;
 
+bool
+line_t::has_block (block_t *needle)
+{
+  for (block_t *n = u.blocks; n; n = n->chain)
+    if (n == needle)
+      return true;
+
+  return false;
+}
+
 /* Describes a file mentioned in the block graph.  Contains an array
    of line info.  */
 
@@ -407,6 +425,164 @@ static void release_structures (void);
 static void release_function (function_t *);
 extern int main (int, char **);
 
+/* Cycle detection!
+   There are a bajillion algorithms that do this.  Boost's function is named
+   hawick_cycles, so I used the algorithm by K. A. Hawick and H. A. James in
+   "Enumerating Circuits and Loops in Graphs with Self-Arcs and Multiple-Arcs"
+   (url at <http://complexity.massey.ac.nz/cstn/013/cstn-013.pdf>).
+
+   The basic algorithm is simple: effectively, we're finding all simple paths
+   in a subgraph (that shrinks every iteration).  Duplicates are filtered by
+   "blocking" a path when a node is added to the path (this also prevents non-
+   simple paths)--the node is unblocked only when it participates in a cycle.
+   */
+
+/* Flag that drives cycle detection after a negative cycle is seen.  */
+static bool did_negate = false;
+
+/* Handle cycle identified by EDGES, where the function finds minimum cs_count
+   and subtract the value from all counts.  The subtracted value is added
+   to COUNT.  */
+
+static void
+handle_cycle (const vector<arc_t *> &edges, int64_t &count)
+{
+  /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
+     that amount.  */
+  int64_t cycle_count = INT64_MAX;
+  for (unsigned i = 0; i < edges.size (); i++)
+    {
+      int64_t ecount = edges[i]->cs_count;
+      if (cycle_count > ecount)
+	cycle_count = ecount;
+    }
+  count += cycle_count;
+  for (unsigned i = 0; i < edges.size (); i++)
+    edges[i]->cs_count -= cycle_count;
+
+  if (cycle_count < 0)
+    did_negate = true;
+}
+
+/* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
+   blocked by U in BLOCK_LISTS.  */
+
+static void
+unblock (block_t *u, vector<block_t *> &blocked,
+	 vector<vector<block_t *> > &block_lists)
+{
+  vector<block_t *>::iterator it = find (blocked.begin (), blocked.end (), u);
+  if (it == blocked.end ())
+    return;
+
+  unsigned index = it - blocked.begin ();
+  blocked.erase (it);
+
+  for (vector<block_t *>::iterator it2 = block_lists[index].begin ();
+       it2 != block_lists[index].end (); it2++)
+    unblock (*it2, blocked, block_lists);
+  for (unsigned j = 0; j < block_lists[index].size (); j++)
+    unblock (u, blocked, block_lists);
+
+  block_lists.erase (block_lists.begin () + index);
+}
+
+/* Find circuit going to block V, PATH is provisional seen cycle.
+   BLOCKED is vector of blocked vertices, BLOCK_LISTS contains vertices
+   blocked by a block.  COUNT is accumulated count of the current LINE.  */
+
+static bool
+circuit (block_t *v, vector<arc_t *> &path, block_t *start,
+	 vector<block_t *> &blocked, vector<vector<block_t *>> &block_lists,
+	 line_t &linfo, int64_t &count)
+{
+  bool found = false;
+
+  /* Add v to the block list.  */
+  gcc_assert (find (blocked.begin (), blocked.end (), v) == blocked.end ());
+  blocked.push_back (v);
+  block_lists.push_back (vector<block_t *> ());
+
+  for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+    {
+      block_t *w = arc->dst;
+      if (w < start || !linfo.has_block (w))
+	continue;
+
+      path.push_back (arc);
+      if (w == start)
+	{
+	  /* Cycle has been found.  */
+	  handle_cycle (path, count);
+	  found = true;
+	}
+      else if (find (blocked.begin (), blocked.end (), w) == blocked.end ())
+	found |= circuit (w, path, start, blocked, block_lists, linfo, count);
+
+      path.pop_back ();
+    }
+
+  if (found)
+    unblock (v, blocked, block_lists);
+  else
+    for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+      {
+	block_t *w = arc->dst;
+	if (w < start || !linfo.has_block (w))
+	  continue;
+
+	size_t index
+	  = find (blocked.begin (), blocked.end (), w) - blocked.begin ();
+	gcc_assert (index < blocked.size ());
+	vector<block_t *> &list = block_lists[index];
+	if (find (list.begin (), list.end (), v) == list.end ())
+	  list.push_back (v);
+      }
+
+  return found;
+}
+
+/* Find cycles for a LINFO.  */
+
+static gcov_type
+find_cycles (line_t &linfo)
+{
+  /* Note that this algorithm works even if blocks aren't in sorted order.
+     Each iteration of the circuit detection is completely independent
+     (except for reducing counts, but that shouldn't matter anyways).
+     Therefore, operating on a permuted order (i.e., non-sorted) only
+     has the effect of permuting the output cycles.  */
+
+  gcov_type count = 0;
+  for (block_t *block = linfo.u.blocks; block; block = block->chain)
+    {
+      vector<arc_t *> path;
+      vector<block_t *> blocked;
+      vector<vector<block_t *> > block_lists;
+      circuit (block, path, block, blocked, block_lists, linfo, count);
+    }
+
+  return count;
+}
+
+/* Get cycle count for a LINFO.  */
+
+static gcov_type
+get_cycles_count (line_t &linfo)
+{
+  gcov_type count = 0;
+
+  /* Bug compatibility with gcc's broken cycle-finder: if a negative cycle
+     exists, run the algorithm again.  */
+
+  did_negate = false;
+  count = find_cycles (linfo);
+  if (did_negate)
+    count += find_cycles (linfo);
+
+  return count;
+}
+
 int
 main (int argc, char **argv)
 {
@@ -435,7 +611,7 @@ main (int argc, char **argv)
   names = XNEWVEC (name_map_t, a_names);
   a_sources = 10;
   sources = XNEWVEC (source_t, a_sources);
-  
+
   argno = process_args (argc, argv);
   if (optind == argc)
     print_usage (true);
@@ -444,12 +620,12 @@ main (int argc, char **argv)
     multiple_files = 1;
 
   first_arg = argno;
-  
+
   for (; argno != argc; argno++)
     {
       if (flag_display_progress)
-        printf ("Processing file %d out of %d\n",
-		argno - first_arg + 1, argc - first_arg);
+	printf ("Processing file %d out of %d\n", argno - first_arg + 1,
+		argc - first_arg);
       process_file (argv[argno]);
     }
 
@@ -459,7 +635,7 @@ main (int argc, char **argv)
 
   return 0;
 }
-\f
+
 /* Print a usage message and exit.  If ERROR_P is nonzero, this is an error,
    otherwise the output of --help.  */
 
@@ -671,8 +847,8 @@ output_intermediate_file (FILE *gcov_file, source_t *src)
     {
       /* function:<name>,<line_number>,<execution_count> */
       fprintf (gcov_file, "function:%d,%s,%s\n", fn->line,
-               format_gcov (fn->blocks[0].count, 0, -1),
-               flag_demangled_names ? fn->demangled_name : fn->name);
+	       format_gcov (fn->blocks[0].count, 0, -1),
+	       flag_demangled_names ? fn->demangled_name : fn->name);
     }
 
   for (line_num = 1, line = &src->lines[line_num];
@@ -681,8 +857,8 @@ output_intermediate_file (FILE *gcov_file, source_t *src)
     {
       arc_t *arc;
       if (line->exists)
-        fprintf (gcov_file, "lcount:%u,%s\n", line_num,
-                 format_gcov (line->count, 0, -1));
+	fprintf (gcov_file, "lcount:%u,%s\n", line_num,
+		 format_gcov (line->count, 0, -1));
       if (flag_branches)
         for (arc = line->u.branches; arc; arc = arc->line_next)
           {
@@ -705,7 +881,6 @@ output_intermediate_file (FILE *gcov_file, source_t *src)
     }
 }
 
-
 /* Process a single input file.  */
 
 static void
@@ -717,7 +892,7 @@ process_file (const char *file_name)
   fns = read_graph_file ();
   if (!fns)
     return;
-  
+
   read_count_file (fns);
   while (fns)
     {
@@ -767,7 +942,7 @@ process_file (const char *file_name)
 	    }
 	  if (line >= sources[src].num_lines)
 	    sources[src].num_lines = line + 1;
-	  
+
 	  solve_flow_graph (fn);
 	  if (fn->has_catch)
 	    find_exception_blocks (fn);
@@ -848,15 +1023,14 @@ generate_results (const char *file_name)
   if (flag_gcov_file && flag_intermediate_format)
     {
       /* Open the intermediate file.  */
-      gcov_intermediate_filename =
-        get_gcov_intermediate_filename (file_name);
+      gcov_intermediate_filename = get_gcov_intermediate_filename (file_name);
       gcov_intermediate_file = fopen (gcov_intermediate_filename, "w");
       if (!gcov_intermediate_file)
-        {
-          fnotice (stderr, "Cannot open intermediate output file %s\n",
-                   gcov_intermediate_filename);
-          return;
-        }
+	{
+	  fnotice (stderr, "Cannot open intermediate output file %s\n",
+		   gcov_intermediate_filename);
+	  return;
+	}
     }
 
   for (ix = n_sources, src = sources; ix--; src++)
@@ -866,7 +1040,7 @@ generate_results (const char *file_name)
 	  /* Ignore this source, if it is an absolute path (after
 	     source prefix removal).  */
 	  char first = src->coverage.name[0];
-      
+
 #if HAVE_DOS_BASED_FILE_SYSTEM
 	  if (first && src->coverage.name[1] == ':')
 	    first = src->coverage.name[2];
@@ -874,7 +1048,7 @@ generate_results (const char *file_name)
 	  if (IS_DIR_SEPARATOR (first))
 	    continue;
 	}
-      
+
       accumulate_line_counts (src);
       function_summary (&src->coverage, "File");
       total_lines += src->coverage.lines;
@@ -938,7 +1112,7 @@ release_structures (void)
   for (ix = n_sources; ix--;)
     free (sources[ix].lines);
   free (sources);
-  
+
   for (ix = n_names; ix--;)
     free (names[ix].name);
   free (names);
@@ -982,7 +1156,7 @@ create_file_names (const char *file_name)
 
       base = !stat (object_directory, &status) && S_ISDIR (status.st_mode);
       strcat (name, object_directory);
-      if (base && (! IS_DIR_SEPARATOR (name[strlen (name) - 1])))
+      if (base && (!IS_DIR_SEPARATOR (name[strlen (name) - 1])))
 	strcat (name, "/");
     }
   else
@@ -1075,16 +1249,16 @@ find_source (const char *file_name)
       free (names);
       names = name_map;
     }
-  
+
   /* Not found, try the canonical name. */
   canon = canonicalize_name (file_name);
-  name_map = (name_map_t *)bsearch
-    (canon, names, n_names, sizeof (*names), name_search);
+  name_map = (name_map_t *) bsearch (canon, names, n_names, sizeof (*names),
+				     name_search);
   if (!name_map)
     {
       /* Not found with canonical name, create a new source.  */
       source_t *src;
-      
+
       if (n_sources == a_sources)
 	{
 	  a_sources *= 2;
@@ -1210,12 +1384,12 @@ read_graph_file (void)
 
 	  fn = XCNEW (function_t);
 	  fn->name = function_name;
-          if (flag_demangled_names)
-            {
-              fn->demangled_name = cplus_demangle (fn->name, DMGL_PARAMS);
-              if (!fn->demangled_name)
-                fn->demangled_name = fn->name;
-            }
+	  if (flag_demangled_names)
+	    {
+	      fn->demangled_name = cplus_demangle (fn->name, DMGL_PARAMS);
+	      if (!fn->demangled_name)
+		fn->demangled_name = fn->name;
+	    }
 	  fn->ident = ident;
 	  fn->lineno_checksum = lineno_checksum;
 	  fn->cfg_checksum = cfg_checksum;
@@ -1293,7 +1467,7 @@ read_graph_file (void)
 		  else
 		    {
 		      /* Non-local return from a callee of this
-		         function. The destination block is a setjmp.  */
+			 function. The destination block is a setjmp.  */
 		      arc->is_nonlocal_return = 1;
 		      fn->blocks[dest].is_nonlocal_return = 1;
 		    }
@@ -1302,7 +1476,7 @@ read_graph_file (void)
 	      if (!arc->on_tree)
 		fn->num_counts++;
 	    }
-	  
+
 	  if (mark_catches)
 	    {
 	      /* We have a fake exit from this block.  The other
@@ -1774,7 +1948,7 @@ find_exception_blocks (function_t *fn)
     {
       block_t *block = queue[--ix];
       const arc_t *arc;
-      
+
       for (arc = block->succ; arc; arc = arc->succ_next)
 	if (!arc->fake && !arc->is_throw && arc->dst->exceptional)
 	  {
@@ -1783,7 +1957,6 @@ find_exception_blocks (function_t *fn)
 	  }
     }
 }
-\f
 
 /* Increment totals in COVERAGE according to arc ARC.  */
 
@@ -1862,7 +2035,7 @@ executed_summary (unsigned lines, unsigned executed)
   else
     fnotice (stdout, "No executable lines\n");
 }
-  
+
 /* Output summary info for a function or file.  */
 
 static void
@@ -1921,7 +2094,7 @@ canonicalize_name (const char *name)
   for (dd_base = ptr; *base; base = probe)
     {
       size_t len;
-      
+
       for (probe = base; *probe; probe++)
 	if (IS_DIR_SEPARATOR (*probe))
 	  break;
@@ -1942,7 +2115,7 @@ canonicalize_name (const char *name)
 	      /* S_ISLNK is not POSIX.1-1996.  */
 	      || stat (result, &buf) || S_ISLNK (buf.st_mode)
 #endif
-	      )
+		)
 	    {
 	      /* Cannot elide, or unreadable or a symlink.  */
 	      dd_base = ptr + 2 + slash;
@@ -1993,7 +2166,7 @@ make_gcov_file_name (const char *input_name, const char *src_name)
     {
       /* Generate the input filename part.  */
       result = XNEWVEC (char, strlen (input_name) + strlen (src_name) + 10);
-  
+
       ptr = result;
       ptr = mangle_name (input_name, ptr);
       ptr[0] = ptr[1] = '#';
@@ -2007,7 +2180,7 @@ make_gcov_file_name (const char *input_name, const char *src_name)
 
   ptr = mangle_name (src_name, ptr);
   strcpy (ptr, ".gcov");
-  
+
   return result;
 }
 
@@ -2015,7 +2188,7 @@ static char *
 mangle_name (char const *base, char *ptr)
 {
   size_t len;
-  
+
   /* Generate the source filename part.  */
   if (!flag_preserve_paths)
     {
@@ -2061,7 +2234,7 @@ mangle_name (char const *base, char *ptr)
 	    }
 	}
     }
-  
+
   return ptr;
 }
 
@@ -2152,8 +2325,7 @@ accumulate_line_counts (source_t *src)
   unsigned ix;
 
   /* Reverse the function order.  */
-  for (fn = src->functions, fn_p = NULL; fn;
-       fn_p = fn, fn = fn_n)
+  for (fn = src->functions, fn_p = NULL; fn; fn_p = fn, fn = fn_n)
     {
       fn_n = fn->line_next;
       fn->line_next = fn_p;
@@ -2204,113 +2376,22 @@ accumulate_line_counts (source_t *src)
 	      arc_t *arc;
 
 	      for (arc = block->pred; arc; arc = arc->pred_next)
-		{
-		  if (arc->src->u.cycle.ident != ix)
-		    count += arc->count;
-		  if (flag_branches)
-		    add_branch_counts (&src->coverage, arc);
-		}
-
-	      /* Initialize the cs_count.  */
-	      for (arc = block->succ; arc; arc = arc->succ_next)
-		arc->cs_count = arc->count;
+		if (flag_branches)
+		  add_branch_counts (&src->coverage, arc);
 	    }
 
-	  /* Find the loops. This uses the algorithm described in
-	     Tiernan 'An Efficient Search Algorithm to Find the
-	     Elementary Circuits of a Graph', CACM Dec 1970. We hold
-	     the P array by having each block point to the arc that
-	     connects to the previous block. The H array is implicitly
-	     held because of the arc ordering, and the block's
-	     previous arc pointer.
-
-	     Although the algorithm is O(N^3) for highly connected
-	     graphs, at worst we'll have O(N^2), as most blocks have
-	     only one or two exits. Most graphs will be small.
-
-	     For each loop we find, locate the arc with the smallest
-	     transition count, and add that to the cumulative
-	     count.  Decrease flow over the cycle and remove the arc
-	     from consideration.  */
+	  /* Cycle detection.  */
 	  for (block = line->u.blocks; block; block = block->chain)
 	    {
-	      block_t *head = block;
-	      arc_t *arc;
-
-	    next_vertex:;
-	      arc = head->succ;
-	    current_vertex:;
-	      while (arc)
-		{
-		  block_t *dst = arc->dst;
-		  if (/* Already used that arc.  */
-		      arc->cycle
-		      /* Not to same graph, or before first vertex.  */
-		      || dst->u.cycle.ident != ix
-		      /* Already in path.  */
-		      || dst->u.cycle.arc)
-		    {
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  if (dst == block)
-		    {
-		      /* Found a closing arc.  */
-		      gcov_type cycle_count = arc->cs_count;
-		      arc_t *cycle_arc = arc;
-		      arc_t *probe_arc;
-
-		      /* Locate the smallest arc count of the loop.  */
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			if (cycle_count > probe_arc->cs_count)
-			  {
-			    cycle_count = probe_arc->cs_count;
-			    cycle_arc = probe_arc;
-			  }
-
-		      count += cycle_count;
-		      cycle_arc->cycle = 1;
-
-		      /* Remove the flow from the cycle.  */
-		      arc->cs_count -= cycle_count;
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			probe_arc->cs_count -= cycle_count;
-
-		      /* Unwind to the cyclic arc.  */
-		      while (head != cycle_arc->src)
-			{
-			  arc = head->u.cycle.arc;
-			  head->u.cycle.arc = NULL;
-			  head = arc->src;
-			}
-		      /* Move on.  */
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  /* Add new block to chain.  */
-		  dst->u.cycle.arc = arc;
-		  head = dst;
-		  goto next_vertex;
-		}
-	      /* We could not add another vertex to the path. Remove
-		 the last vertex from the list.  */
-	      arc = head->u.cycle.arc;
-	      if (arc)
-		{
-		  /* It was not the first vertex. Move onto next arc.  */
-		  head->u.cycle.arc = NULL;
-		  head = arc->src;
-		  arc = arc->succ_next;
-		  goto current_vertex;
-		}
-	      /* Mark this block as unusable.  */
-	      block->u.cycle.ident = ~0U;
+	      for (arc_t *arc = block->pred; arc; arc = arc->pred_next)
+		if (!line->has_block (arc->src))
+		  count += arc->count;
+	      for (arc_t *arc = block->succ; arc; arc = arc->succ_next)
+		arc->cs_count = arc->count;
 	    }
 
+	  //* Now, add the count of loops entirely on this line.  */
+	  count += get_cycles_count (*line);
 	  line->count = count;
 	}
 
@@ -2361,7 +2442,6 @@ output_branch_count (FILE *gcov_file, int ix, const arc_t *arc)
   else
     return 0;
   return 1;
-
 }
 
 static const char *
@@ -2391,7 +2471,7 @@ read_line (FILE *file)
       string = XRESIZEVEC (char, string, string_len * 2);
       string_len *= 2;
     }
-      
+
   return pos ? string : NULL;
 }
 
-- 
2.9.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-03 13:32 [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992) Martin Liška
@ 2016-08-03 13:54 ` Richard Biener
  2016-08-03 14:22 ` Nathan Sidwell
  1 sibling, 0 replies; 21+ messages in thread
From: Richard Biener @ 2016-08-03 13:54 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, Jan Hubicka, Pidgeot18, Nathan Sidwell

On Wed, Aug 3, 2016 at 3:31 PM, Martin Liška <mliska@suse.cz> wrote:
> Hello.
>
> As I've going through all PRs related to gcov-profile, I've noticed this PR.
> Current implementation of cycle detection in gcov is very poor, leading to extreme run time
> for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
> grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
> the patch is quite subtle and fast (of course).
>
> The patch survives gcov.exp regression tests and I also verified than *.gcov is identical before
> and after the patch for Inkscape:
>
> $ find . -name '*.gcov' | wc -l
> 10752
>
> I'm also thinking about adding [1] to test-suite, however it would require implementation 'timeout'
> argument in gcov.exp. Does it worth doing?

We usually add such tests without any "timeout" and expect we'll notice
if compile-time goes through the roof for them.

Richard.

> Ready to install?
> Thanks,
> Martin
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67992#c5

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-03 13:32 [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992) Martin Liška
  2016-08-03 13:54 ` Richard Biener
@ 2016-08-03 14:22 ` Nathan Sidwell
  2016-08-04 10:39   ` [PATCH 1/2] Fix GNU coding style in gcov.c Martin Liška
  2016-08-04 10:41   ` [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992) Martin Liška
  1 sibling, 2 replies; 21+ messages in thread
From: Nathan Sidwell @ 2016-08-03 14:22 UTC (permalink / raw)
  To: Martin Liška, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

Martin,
> As I've going through all PRs related to gcov-profile, I've noticed this PR.
> Current implementation of cycle detection in gcov is very poor, leading to extreme run time
> for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
> grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
> the patch is quite subtle and fast (of course).

sorry to be a pain, but could you split the patch into
a) formatting changes
b) the clever  bits

the formatting changes can then (probably) be applied as obvious.

nathan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/2] Fix GNU coding style in gcov.c
  2016-08-03 14:22 ` Nathan Sidwell
@ 2016-08-04 10:39   ` Martin Liška
  2016-08-04 11:39     ` Nathan Sidwell
  2016-08-04 10:41   ` [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992) Martin Liška
  1 sibling, 1 reply; 21+ messages in thread
From: Martin Liška @ 2016-08-04 10:39 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

[-- Attachment #1: Type: text/plain, Size: 771 bytes --]

On 08/03/2016 04:22 PM, Nathan Sidwell wrote:
> Martin,
>> As I've going through all PRs related to gcov-profile, I've noticed this PR.
>> Current implementation of cycle detection in gcov is very poor, leading to extreme run time
>> for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
>> grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
>> the patch is quite subtle and fast (of course).
> 
> sorry to be a pain, but could you split the patch into
> a) formatting changes
> b) the clever  bits
> 
> the formatting changes can then (probably) be applied as obvious.
> 
> nathan

That's all right, it's my mistake that I messed up coding style issues and
core of that patch.

Martin


[-- Attachment #2: 0001-Fix-GNU-coding-style-in-gcov.c.patch --]
[-- Type: text/x-patch, Size: 9218 bytes --]

From d3ec7fc18df43ffcb39e1c9b9aacc269bd641ab5 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 4 Aug 2016 12:30:37 +0200
Subject: [PATCH 1/2] Fix GNU coding style in gcov.c

gcc/ChangeLog:

2016-08-04  Martin Liska  <mliska@suse.cz>

	* gcov.c (main): Fix GNU coding style.
	(output_intermediate_file): Likewise.
	(process_file): Likewise.
	(generate_results): Likewise.
	(release_structures): Likewise.
	(create_file_names): Likewise.
	(find_source): Likewise.
	(read_graph_file): Likewise.
	(find_exception_blocks): Likewise.
	(canonicalize_name): Likewise.
	(make_gcov_file_name): Likewise.
	(mangle_name): Likewise.
	(accumulate_line_counts): Likewise.
	(output_branch_count): Likewise.
	(read_line): Likewise.
---
 gcc/gcov.c | 88 ++++++++++++++++++++++++++++++--------------------------------
 1 file changed, 42 insertions(+), 46 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 417b4f4..40701a1 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -435,7 +435,7 @@ main (int argc, char **argv)
   names = XNEWVEC (name_map_t, a_names);
   a_sources = 10;
   sources = XNEWVEC (source_t, a_sources);
-  
+
   argno = process_args (argc, argv);
   if (optind == argc)
     print_usage (true);
@@ -444,12 +444,12 @@ main (int argc, char **argv)
     multiple_files = 1;
 
   first_arg = argno;
-  
+
   for (; argno != argc; argno++)
     {
       if (flag_display_progress)
-        printf ("Processing file %d out of %d\n",
-		argno - first_arg + 1, argc - first_arg);
+	printf ("Processing file %d out of %d\n", argno - first_arg + 1,
+		argc - first_arg);
       process_file (argv[argno]);
     }
 
@@ -671,8 +671,8 @@ output_intermediate_file (FILE *gcov_file, source_t *src)
     {
       /* function:<name>,<line_number>,<execution_count> */
       fprintf (gcov_file, "function:%d,%s,%s\n", fn->line,
-               format_gcov (fn->blocks[0].count, 0, -1),
-               flag_demangled_names ? fn->demangled_name : fn->name);
+	       format_gcov (fn->blocks[0].count, 0, -1),
+	       flag_demangled_names ? fn->demangled_name : fn->name);
     }
 
   for (line_num = 1, line = &src->lines[line_num];
@@ -681,8 +681,8 @@ output_intermediate_file (FILE *gcov_file, source_t *src)
     {
       arc_t *arc;
       if (line->exists)
-        fprintf (gcov_file, "lcount:%u,%s\n", line_num,
-                 format_gcov (line->count, 0, -1));
+	fprintf (gcov_file, "lcount:%u,%s\n", line_num,
+		 format_gcov (line->count, 0, -1));
       if (flag_branches)
         for (arc = line->u.branches; arc; arc = arc->line_next)
           {
@@ -705,7 +705,6 @@ output_intermediate_file (FILE *gcov_file, source_t *src)
     }
 }
 
-
 /* Process a single input file.  */
 
 static void
@@ -717,7 +716,7 @@ process_file (const char *file_name)
   fns = read_graph_file ();
   if (!fns)
     return;
-  
+
   read_count_file (fns);
   while (fns)
     {
@@ -767,7 +766,7 @@ process_file (const char *file_name)
 	    }
 	  if (line >= sources[src].num_lines)
 	    sources[src].num_lines = line + 1;
-	  
+
 	  solve_flow_graph (fn);
 	  if (fn->has_catch)
 	    find_exception_blocks (fn);
@@ -848,15 +847,14 @@ generate_results (const char *file_name)
   if (flag_gcov_file && flag_intermediate_format)
     {
       /* Open the intermediate file.  */
-      gcov_intermediate_filename =
-        get_gcov_intermediate_filename (file_name);
+      gcov_intermediate_filename = get_gcov_intermediate_filename (file_name);
       gcov_intermediate_file = fopen (gcov_intermediate_filename, "w");
       if (!gcov_intermediate_file)
-        {
-          fnotice (stderr, "Cannot open intermediate output file %s\n",
-                   gcov_intermediate_filename);
-          return;
-        }
+	{
+	  fnotice (stderr, "Cannot open intermediate output file %s\n",
+		   gcov_intermediate_filename);
+	  return;
+	}
     }
 
   for (ix = n_sources, src = sources; ix--; src++)
@@ -866,7 +864,7 @@ generate_results (const char *file_name)
 	  /* Ignore this source, if it is an absolute path (after
 	     source prefix removal).  */
 	  char first = src->coverage.name[0];
-      
+
 #if HAVE_DOS_BASED_FILE_SYSTEM
 	  if (first && src->coverage.name[1] == ':')
 	    first = src->coverage.name[2];
@@ -874,7 +872,7 @@ generate_results (const char *file_name)
 	  if (IS_DIR_SEPARATOR (first))
 	    continue;
 	}
-      
+
       accumulate_line_counts (src);
       function_summary (&src->coverage, "File");
       total_lines += src->coverage.lines;
@@ -938,7 +936,7 @@ release_structures (void)
   for (ix = n_sources; ix--;)
     free (sources[ix].lines);
   free (sources);
-  
+
   for (ix = n_names; ix--;)
     free (names[ix].name);
   free (names);
@@ -982,7 +980,7 @@ create_file_names (const char *file_name)
 
       base = !stat (object_directory, &status) && S_ISDIR (status.st_mode);
       strcat (name, object_directory);
-      if (base && (! IS_DIR_SEPARATOR (name[strlen (name) - 1])))
+      if (base && (!IS_DIR_SEPARATOR (name[strlen (name) - 1])))
 	strcat (name, "/");
     }
   else
@@ -1075,16 +1073,16 @@ find_source (const char *file_name)
       free (names);
       names = name_map;
     }
-  
+
   /* Not found, try the canonical name. */
   canon = canonicalize_name (file_name);
-  name_map = (name_map_t *)bsearch
-    (canon, names, n_names, sizeof (*names), name_search);
+  name_map = (name_map_t *) bsearch (canon, names, n_names, sizeof (*names),
+				     name_search);
   if (!name_map)
     {
       /* Not found with canonical name, create a new source.  */
       source_t *src;
-      
+
       if (n_sources == a_sources)
 	{
 	  a_sources *= 2;
@@ -1210,12 +1208,12 @@ read_graph_file (void)
 
 	  fn = XCNEW (function_t);
 	  fn->name = function_name;
-          if (flag_demangled_names)
-            {
-              fn->demangled_name = cplus_demangle (fn->name, DMGL_PARAMS);
-              if (!fn->demangled_name)
-                fn->demangled_name = fn->name;
-            }
+	  if (flag_demangled_names)
+	    {
+	      fn->demangled_name = cplus_demangle (fn->name, DMGL_PARAMS);
+	      if (!fn->demangled_name)
+		fn->demangled_name = fn->name;
+	    }
 	  fn->ident = ident;
 	  fn->lineno_checksum = lineno_checksum;
 	  fn->cfg_checksum = cfg_checksum;
@@ -1293,7 +1291,7 @@ read_graph_file (void)
 		  else
 		    {
 		      /* Non-local return from a callee of this
-		         function. The destination block is a setjmp.  */
+			 function.  The destination block is a setjmp.  */
 		      arc->is_nonlocal_return = 1;
 		      fn->blocks[dest].is_nonlocal_return = 1;
 		    }
@@ -1302,7 +1300,7 @@ read_graph_file (void)
 	      if (!arc->on_tree)
 		fn->num_counts++;
 	    }
-	  
+
 	  if (mark_catches)
 	    {
 	      /* We have a fake exit from this block.  The other
@@ -1774,7 +1772,7 @@ find_exception_blocks (function_t *fn)
     {
       block_t *block = queue[--ix];
       const arc_t *arc;
-      
+
       for (arc = block->succ; arc; arc = arc->succ_next)
 	if (!arc->fake && !arc->is_throw && arc->dst->exceptional)
 	  {
@@ -1862,7 +1860,7 @@ executed_summary (unsigned lines, unsigned executed)
   else
     fnotice (stdout, "No executable lines\n");
 }
-  
+
 /* Output summary info for a function or file.  */
 
 static void
@@ -1921,7 +1919,7 @@ canonicalize_name (const char *name)
   for (dd_base = ptr; *base; base = probe)
     {
       size_t len;
-      
+
       for (probe = base; *probe; probe++)
 	if (IS_DIR_SEPARATOR (*probe))
 	  break;
@@ -1942,7 +1940,7 @@ canonicalize_name (const char *name)
 	      /* S_ISLNK is not POSIX.1-1996.  */
 	      || stat (result, &buf) || S_ISLNK (buf.st_mode)
 #endif
-	      )
+		)
 	    {
 	      /* Cannot elide, or unreadable or a symlink.  */
 	      dd_base = ptr + 2 + slash;
@@ -1993,7 +1991,7 @@ make_gcov_file_name (const char *input_name, const char *src_name)
     {
       /* Generate the input filename part.  */
       result = XNEWVEC (char, strlen (input_name) + strlen (src_name) + 10);
-  
+
       ptr = result;
       ptr = mangle_name (input_name, ptr);
       ptr[0] = ptr[1] = '#';
@@ -2007,7 +2005,7 @@ make_gcov_file_name (const char *input_name, const char *src_name)
 
   ptr = mangle_name (src_name, ptr);
   strcpy (ptr, ".gcov");
-  
+
   return result;
 }
 
@@ -2015,7 +2013,7 @@ static char *
 mangle_name (char const *base, char *ptr)
 {
   size_t len;
-  
+
   /* Generate the source filename part.  */
   if (!flag_preserve_paths)
     {
@@ -2061,7 +2059,7 @@ mangle_name (char const *base, char *ptr)
 	    }
 	}
     }
-  
+
   return ptr;
 }
 
@@ -2152,8 +2150,7 @@ accumulate_line_counts (source_t *src)
   unsigned ix;
 
   /* Reverse the function order.  */
-  for (fn = src->functions, fn_p = NULL; fn;
-       fn_p = fn, fn = fn_n)
+  for (fn = src->functions, fn_p = NULL; fn; fn_p = fn, fn = fn_n)
     {
       fn_n = fn->line_next;
       fn->line_next = fn_p;
@@ -2361,7 +2358,6 @@ output_branch_count (FILE *gcov_file, int ix, const arc_t *arc)
   else
     return 0;
   return 1;
-
 }
 
 static const char *
@@ -2391,7 +2387,7 @@ read_line (FILE *file)
       string = XRESIZEVEC (char, string, string_len * 2);
       string_len *= 2;
     }
-      
+
   return pos ? string : NULL;
 }
 
-- 
2.9.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-03 14:22 ` Nathan Sidwell
  2016-08-04 10:39   ` [PATCH 1/2] Fix GNU coding style in gcov.c Martin Liška
@ 2016-08-04 10:41   ` Martin Liška
  2016-08-04 13:15     ` Nathan Sidwell
  1 sibling, 1 reply; 21+ messages in thread
From: Martin Liška @ 2016-08-04 10:41 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

[-- Attachment #1: Type: text/plain, Size: 744 bytes --]

On 08/03/2016 04:22 PM, Nathan Sidwell wrote:
> Martin,
>> As I've going through all PRs related to gcov-profile, I've noticed this PR.
>> Current implementation of cycle detection in gcov is very poor, leading to extreme run time
>> for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
>> grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
>> the patch is quite subtle and fast (of course).
> 
> sorry to be a pain, but could you split the patch into
> a) formatting changes
> b) the clever  bits
> 
> the formatting changes can then (probably) be applied as obvious.
> 
> nathan

This is second part which is the change of loop detection algorithm.

Martin

[-- Attachment #2: 0002-gcov-tool-Implement-Hawick-s-algorithm-for-cycle-det.patch --]
[-- Type: text/x-patch, Size: 10491 bytes --]

From 353e469aa2ac9260f31dd09aaedfd21eebc47c02 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 4 Aug 2016 12:34:08 +0200
Subject: [PATCH 2/2] gcov tool: Implement Hawick's algorithm for cycle
 detection, (PR gcov-profile/67992)

gcc/ChangeLog:

2016-08-04  Martin Liska  <mliska@suse.cz>

	* gcov.c (line_t::has_block): New function.
	(handle_cycle): Likewise.
	(unblock): Likewise.
	(circuit): Likewise.
	(find_cycles): Likewise.
	(get_cycles_count): Likewise.
	(accumulate_line_counts): Use new loop detection algorithm.
---
 gcc/gcov.c | 287 +++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 186 insertions(+), 101 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 40701a1..f39a731 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -41,6 +41,11 @@ along with Gcov; see the file COPYING3.  If not see
 
 #include <getopt.h>
 
+#include <vector>
+#include <algorithm>
+
+using namespace std;
+
 #define IN_GCOV 1
 #include "gcov-io.h"
 #include "gcov-io.c"
@@ -222,6 +227,9 @@ typedef struct coverage_info
 
 typedef struct line_info
 {
+  /* Return true when NEEDLE is one of basic blocks the line belongs to.  */
+  bool has_block (block_t *needle);
+
   gcov_type count;	   /* execution count */
   union
   {
@@ -235,6 +243,16 @@ typedef struct line_info
   unsigned unexceptional : 1;
 } line_t;
 
+bool
+line_t::has_block (block_t *needle)
+{
+  for (block_t *n = u.blocks; n; n = n->chain)
+    if (n == needle)
+      return true;
+
+  return false;
+}
+
 /* Describes a file mentioned in the block graph.  Contains an array
    of line info.  */
 
@@ -407,6 +425,164 @@ static void release_structures (void);
 static void release_function (function_t *);
 extern int main (int, char **);
 
+/* Cycle detection!
+   There are a bajillion algorithms that do this.  Boost's function is named
+   hawick_cycles, so I used the algorithm by K. A. Hawick and H. A. James in
+   "Enumerating Circuits and Loops in Graphs with Self-Arcs and Multiple-Arcs"
+   (url at <http://complexity.massey.ac.nz/cstn/013/cstn-013.pdf>).
+
+   The basic algorithm is simple: effectively, we're finding all simple paths
+   in a subgraph (that shrinks every iteration).  Duplicates are filtered by
+   "blocking" a path when a node is added to the path (this also prevents non-
+   simple paths)--the node is unblocked only when it participates in a cycle.
+   */
+
+/* Flag that drives cycle detection after a negative cycle is seen.  */
+static bool did_negate = false;
+
+/* Handle cycle identified by EDGES, where the function finds minimum cs_count
+   and subtract the value from all counts.  The subtracted value is added
+   to COUNT.  */
+
+static void
+handle_cycle (const vector<arc_t *> &edges, int64_t &count)
+{
+  /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
+     that amount.  */
+  int64_t cycle_count = INT64_MAX;
+  for (unsigned i = 0; i < edges.size (); i++)
+    {
+      int64_t ecount = edges[i]->cs_count;
+      if (cycle_count > ecount)
+	cycle_count = ecount;
+    }
+  count += cycle_count;
+  for (unsigned i = 0; i < edges.size (); i++)
+    edges[i]->cs_count -= cycle_count;
+
+  if (cycle_count < 0)
+    did_negate = true;
+}
+
+/* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
+   blocked by U in BLOCK_LISTS.  */
+
+static void
+unblock (block_t *u, vector<block_t *> &blocked,
+	 vector<vector<block_t *> > &block_lists)
+{
+  vector<block_t *>::iterator it = find (blocked.begin (), blocked.end (), u);
+  if (it == blocked.end ())
+    return;
+
+  unsigned index = it - blocked.begin ();
+  blocked.erase (it);
+
+  for (vector<block_t *>::iterator it2 = block_lists[index].begin ();
+       it2 != block_lists[index].end (); it2++)
+    unblock (*it2, blocked, block_lists);
+  for (unsigned j = 0; j < block_lists[index].size (); j++)
+    unblock (u, blocked, block_lists);
+
+  block_lists.erase (block_lists.begin () + index);
+}
+
+/* Find circuit going to block V, PATH is provisional seen cycle.
+   BLOCKED is vector of blocked vertices, BLOCK_LISTS contains vertices
+   blocked by a block.  COUNT is accumulated count of the current LINE.  */
+
+static bool
+circuit (block_t *v, vector<arc_t *> &path, block_t *start,
+	 vector<block_t *> &blocked, vector<vector<block_t *>> &block_lists,
+	 line_t &linfo, int64_t &count)
+{
+  bool found = false;
+
+  /* Add v to the block list.  */
+  gcc_assert (find (blocked.begin (), blocked.end (), v) == blocked.end ());
+  blocked.push_back (v);
+  block_lists.push_back (vector<block_t *> ());
+
+  for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+    {
+      block_t *w = arc->dst;
+      if (w < start || !linfo.has_block (w))
+	continue;
+
+      path.push_back (arc);
+      if (w == start)
+	{
+	  /* Cycle has been found.  */
+	  handle_cycle (path, count);
+	  found = true;
+	}
+      else if (find (blocked.begin (), blocked.end (), w) == blocked.end ())
+	found |= circuit (w, path, start, blocked, block_lists, linfo, count);
+
+      path.pop_back ();
+    }
+
+  if (found)
+    unblock (v, blocked, block_lists);
+  else
+    for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+      {
+	block_t *w = arc->dst;
+	if (w < start || !linfo.has_block (w))
+	  continue;
+
+	size_t index
+	  = find (blocked.begin (), blocked.end (), w) - blocked.begin ();
+	gcc_assert (index < blocked.size ());
+	vector<block_t *> &list = block_lists[index];
+	if (find (list.begin (), list.end (), v) == list.end ())
+	  list.push_back (v);
+      }
+
+  return found;
+}
+
+/* Find cycles for a LINFO.  */
+
+static gcov_type
+find_cycles (line_t &linfo)
+{
+  /* Note that this algorithm works even if blocks aren't in sorted order.
+     Each iteration of the circuit detection is completely independent
+     (except for reducing counts, but that shouldn't matter anyways).
+     Therefore, operating on a permuted order (i.e., non-sorted) only
+     has the effect of permuting the output cycles.  */
+
+  gcov_type count = 0;
+  for (block_t *block = linfo.u.blocks; block; block = block->chain)
+    {
+      vector<arc_t *> path;
+      vector<block_t *> blocked;
+      vector<vector<block_t *> > block_lists;
+      circuit (block, path, block, blocked, block_lists, linfo, count);
+    }
+
+  return count;
+}
+
+/* Get cycle count for a LINFO.  */
+
+static gcov_type
+get_cycles_count (line_t &linfo)
+{
+  gcov_type count = 0;
+
+  /* Bug compatibility with gcc's broken cycle-finder: if a negative cycle
+     exists, run the algorithm again.  */
+
+  did_negate = false;
+  count = find_cycles (linfo);
+  if (did_negate)
+    count += find_cycles (linfo);
+
+  return count;
+}
+
 int
 main (int argc, char **argv)
 {
@@ -2201,113 +2377,22 @@ accumulate_line_counts (source_t *src)
 	      arc_t *arc;
 
 	      for (arc = block->pred; arc; arc = arc->pred_next)
-		{
-		  if (arc->src->u.cycle.ident != ix)
-		    count += arc->count;
-		  if (flag_branches)
-		    add_branch_counts (&src->coverage, arc);
-		}
-
-	      /* Initialize the cs_count.  */
-	      for (arc = block->succ; arc; arc = arc->succ_next)
-		arc->cs_count = arc->count;
+		if (flag_branches)
+		  add_branch_counts (&src->coverage, arc);
 	    }
 
-	  /* Find the loops. This uses the algorithm described in
-	     Tiernan 'An Efficient Search Algorithm to Find the
-	     Elementary Circuits of a Graph', CACM Dec 1970. We hold
-	     the P array by having each block point to the arc that
-	     connects to the previous block. The H array is implicitly
-	     held because of the arc ordering, and the block's
-	     previous arc pointer.
-
-	     Although the algorithm is O(N^3) for highly connected
-	     graphs, at worst we'll have O(N^2), as most blocks have
-	     only one or two exits. Most graphs will be small.
-
-	     For each loop we find, locate the arc with the smallest
-	     transition count, and add that to the cumulative
-	     count.  Decrease flow over the cycle and remove the arc
-	     from consideration.  */
+	  /* Cycle detection.  */
 	  for (block = line->u.blocks; block; block = block->chain)
 	    {
-	      block_t *head = block;
-	      arc_t *arc;
-
-	    next_vertex:;
-	      arc = head->succ;
-	    current_vertex:;
-	      while (arc)
-		{
-		  block_t *dst = arc->dst;
-		  if (/* Already used that arc.  */
-		      arc->cycle
-		      /* Not to same graph, or before first vertex.  */
-		      || dst->u.cycle.ident != ix
-		      /* Already in path.  */
-		      || dst->u.cycle.arc)
-		    {
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  if (dst == block)
-		    {
-		      /* Found a closing arc.  */
-		      gcov_type cycle_count = arc->cs_count;
-		      arc_t *cycle_arc = arc;
-		      arc_t *probe_arc;
-
-		      /* Locate the smallest arc count of the loop.  */
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			if (cycle_count > probe_arc->cs_count)
-			  {
-			    cycle_count = probe_arc->cs_count;
-			    cycle_arc = probe_arc;
-			  }
-
-		      count += cycle_count;
-		      cycle_arc->cycle = 1;
-
-		      /* Remove the flow from the cycle.  */
-		      arc->cs_count -= cycle_count;
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			probe_arc->cs_count -= cycle_count;
-
-		      /* Unwind to the cyclic arc.  */
-		      while (head != cycle_arc->src)
-			{
-			  arc = head->u.cycle.arc;
-			  head->u.cycle.arc = NULL;
-			  head = arc->src;
-			}
-		      /* Move on.  */
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  /* Add new block to chain.  */
-		  dst->u.cycle.arc = arc;
-		  head = dst;
-		  goto next_vertex;
-		}
-	      /* We could not add another vertex to the path. Remove
-		 the last vertex from the list.  */
-	      arc = head->u.cycle.arc;
-	      if (arc)
-		{
-		  /* It was not the first vertex. Move onto next arc.  */
-		  head->u.cycle.arc = NULL;
-		  head = arc->src;
-		  arc = arc->succ_next;
-		  goto current_vertex;
-		}
-	      /* Mark this block as unusable.  */
-	      block->u.cycle.ident = ~0U;
+	      for (arc_t *arc = block->pred; arc; arc = arc->pred_next)
+		if (!line->has_block (arc->src))
+		  count += arc->count;
+	      for (arc_t *arc = block->succ; arc; arc = arc->succ_next)
+		arc->cs_count = arc->count;
 	    }
 
+	  /* Now, add the count of loops entirely on this line.  */
+	  count += get_cycles_count (*line);
 	  line->count = count;
 	}
 
-- 
2.9.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/2] Fix GNU coding style in gcov.c
  2016-08-04 10:39   ` [PATCH 1/2] Fix GNU coding style in gcov.c Martin Liška
@ 2016-08-04 11:39     ` Nathan Sidwell
  0 siblings, 0 replies; 21+ messages in thread
From: Nathan Sidwell @ 2016-08-04 11:39 UTC (permalink / raw)
  To: Martin Liška, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

On 08/04/16 06:39, Martin LiÅ¡ka wrote:
> On 08/03/2016 04:22 PM, Nathan Sidwell wrote:
>> Martin,
>>> As I've going through all PRs related to gcov-profile, I've noticed this PR.
>>> Current implementation of cycle detection in gcov is very poor, leading to extreme run time
>>> for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
>>> grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
>>> the patch is quite subtle and fast (of course).
>>
>> sorry to be a pain, but could you split the patch into
>> a) formatting changes
>> b) the clever  bits
>>
>> the formatting changes can then (probably) be applied as obvious.
>>
>> nathan
>
> That's all right, it's my mistake that I messed up coding style issues and
> core of that patch.

Thanks.  the formatting patch is fine.  Reading the other one next ...

nathan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-04 10:41   ` [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992) Martin Liška
@ 2016-08-04 13:15     ` Nathan Sidwell
  2016-08-04 13:31       ` Jan Hubicka
  2016-08-04 14:43       ` Martin Liška
  0 siblings, 2 replies; 21+ messages in thread
From: Nathan Sidwell @ 2016-08-04 13:15 UTC (permalink / raw)
  To: Martin Liška, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

On 08/04/16 06:41, Martin LiÅ¡ka wrote:
> On 08/03/2016 04:22 PM, Nathan Sidwell wrote:
>> Martin,
>>> As I've going through all PRs related to gcov-profile, I've noticed this PR.
>>> Current implementation of cycle detection in gcov is very poor, leading to extreme run time
>>> for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
>>> grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
>>> the patch is quite subtle and fast (of course).
>>
>> sorry to be a pain, but could you split the patch into
>> a) formatting changes
>> b) the clever  bits
>>
>> the formatting changes can then (probably) be applied as obvious.
>>
>> nathan
>
> This is second part which is the change of loop detection algorithm.

typedefs for arc and block pointer vectors would be useful to add.  They're used 
in a lot of  places:

typedef vector<arc_t *> arc_vector_t;
typedef vector<block_t *> block_vector_t;

(question, should those be  'const T *' template parms?)

No need for vector of block vectors typedef, unless you think otherwise.

+/* Flag that drives cycle detection after a negative cycle is seen.  */
+static bool did_negate = false;

That's ugly, and I think unnecessary.  Use +1 for loop, -1 for negated loop, 0 
for no loop  (or a tri-valued enum with the right properties)

1) have handle_cycle return +1 (not negated) or -1 (negated) appropriately.

2) have circuit return an int similarly. Then
   if (w == start)
     found |= handle_cycle (path, count);
   else if (...)
     found |= circuit (...)
will DTRT there

3) finally have find_cycles merge the results from its circuit calls and 
determine whether to repeat itself -- rather than have the caller do it. (or 
have another reference parm to tell the caller?)

nathan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-04 13:15     ` Nathan Sidwell
@ 2016-08-04 13:31       ` Jan Hubicka
  2016-08-04 14:43       ` Martin Liška
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Hubicka @ 2016-08-04 13:31 UTC (permalink / raw)
  To: Nathan Sidwell; +Cc: Martin Liška, GCC Patches, Jan Hubicka, Pidgeot18

> On 08/04/16 06:41, Martin LiÂška wrote:
> >On 08/03/2016 04:22 PM, Nathan Sidwell wrote:
> >>Martin,
> >>>As I've going through all PRs related to gcov-profile, I've noticed this PR.
> >>>Current implementation of cycle detection in gcov is very poor, leading to extreme run time
> >>>for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
> >>>grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
> >>>the patch is quite subtle and fast (of course).
> >>
> >>sorry to be a pain, but could you split the patch into
> >>a) formatting changes
> >>b) the clever  bits
> >>
> >>the formatting changes can then (probably) be applied as obvious.
> >>
> >>nathan
> >
> >This is second part which is the change of loop detection algorithm.
> 
> typedefs for arc and block pointer vectors would be useful to add.
> They're used in a lot of  places:
> 
> typedef vector<arc_t *> arc_vector_t;
> typedef vector<block_t *> block_vector_t;
> 
> (question, should those be  'const T *' template parms?)

What about trying to get naming scheme consistent with rest of GCC which call
those bbs and edges?  I know it is hard wired into -fprofile-arcs name but it
may be nice to get types more consistent.

Honza
> 
> No need for vector of block vectors typedef, unless you think otherwise.
> 
> +/* Flag that drives cycle detection after a negative cycle is seen.  */
> +static bool did_negate = false;
> 
> That's ugly, and I think unnecessary.  Use +1 for loop, -1 for
> negated loop, 0 for no loop  (or a tri-valued enum with the right
> properties)
> 
> 1) have handle_cycle return +1 (not negated) or -1 (negated) appropriately.
> 
> 2) have circuit return an int similarly. Then
>   if (w == start)
>     found |= handle_cycle (path, count);
>   else if (...)
>     found |= circuit (...)
> will DTRT there
> 
> 3) finally have find_cycles merge the results from its circuit calls
> and determine whether to repeat itself -- rather than have the
> caller do it. (or have another reference parm to tell the caller?)
> 
> nathan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-04 13:15     ` Nathan Sidwell
  2016-08-04 13:31       ` Jan Hubicka
@ 2016-08-04 14:43       ` Martin Liška
  2016-08-04 15:13         ` Nathan Sidwell
  1 sibling, 1 reply; 21+ messages in thread
From: Martin Liška @ 2016-08-04 14:43 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

[-- Attachment #1: Type: text/plain, Size: 2179 bytes --]

On 08/04/2016 03:15 PM, Nathan Sidwell wrote:
> On 08/04/16 06:41, Martin LiÅ¡ka wrote:
>> On 08/03/2016 04:22 PM, Nathan Sidwell wrote:
>>> Martin,
>>>> As I've going through all PRs related to gcov-profile, I've noticed this PR.
>>>> Current implementation of cycle detection in gcov is very poor, leading to extreme run time
>>>> for cases like mentioned in the PR (which does not contain a cycle). Thank to Joshua, I've
>>>> grabbed his patch and removed the scaffolding (classes: Arc, Block, ...) he did. After doing that
>>>> the patch is quite subtle and fast (of course).
>>>
>>> sorry to be a pain, but could you split the patch into
>>> a) formatting changes
>>> b) the clever  bits
>>>
>>> the formatting changes can then (probably) be applied as obvious.
>>>
>>> nathan
>>
>> This is second part which is the change of loop detection algorithm.
> 
> typedefs for arc and block pointer vectors would be useful to add.  They're used in a lot of  places:
> 
> typedef vector<arc_t *> arc_vector_t;
> typedef vector<block_t *> block_vector_t;
> 
> (question, should those be  'const T *' template parms?)

const block_t works for me, arc_t doesn't:
../../gcc/gcov.c:470:27: error: assignment of member â€˜arc_info::cs_countâ€™ in read-only object
     edges[i]->cs_count -= cycle_count;


> 
> No need for vector of block vectors typedef, unless you think otherwise.
> 
> +/* Flag that drives cycle detection after a negative cycle is seen.  */
> +static bool did_negate = false;
> 
> That's ugly, and I think unnecessary.  Use +1 for loop, -1 for negated loop, 0 for no loop  (or a tri-valued enum with the right properties)
> 
> 1) have handle_cycle return +1 (not negated) or -1 (negated) appropriately.
> 
> 2) have circuit return an int similarly. Then
>   if (w == start)
>     found |= handle_cycle (path, count);
>   else if (...)
>     found |= circuit (...)
> will DTRT there
> 
> 3) finally have find_cycles merge the results from its circuit calls and determine whether to repeat itself -- rather than have the caller do it. (or have another reference parm to tell the caller?)

I decided to use a new enum, hope it's better?

Martin

> 
> nathan
> 


[-- Attachment #2: 0001-gcov-tool-Implement-Hawick-s-algorithm-for-cycle-det-v3.patch --]
[-- Type: text/x-patch, Size: 10830 bytes --]

From 24cd47f44e9958bd7fd0c40af849cedc567d6341 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 4 Aug 2016 12:34:08 +0200
Subject: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection,
 (PR gcov-profile/67992)

gcc/ChangeLog:

2016-08-04  Martin Liska  <mliska@suse.cz>
	    Joshua Cranmer  <Pidgeot18@gmail.com>

	* gcov.c (line_t::has_block): New function.
	(enum loop_type): New enum.
	(handle_cycle): New function.
	(unblock): Likewise.
	(circuit): Likewise.
	(get_cycles_count): Likewise.
	(accumulate_line_counts): Use new loop detection algorithm.
---
 gcc/gcov.c | 290 ++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 189 insertions(+), 101 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 40701a1..9c9eccf 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -41,6 +41,11 @@ along with Gcov; see the file COPYING3.  If not see
 
 #include <getopt.h>
 
+#include <vector>
+#include <algorithm>
+
+using namespace std;
+
 #define IN_GCOV 1
 #include "gcov-io.h"
 #include "gcov-io.c"
@@ -222,6 +227,9 @@ typedef struct coverage_info
 
 typedef struct line_info
 {
+  /* Return true when NEEDLE is one of basic blocks the line belongs to.  */
+  bool has_block (block_t *needle);
+
   gcov_type count;	   /* execution count */
   union
   {
@@ -235,6 +243,16 @@ typedef struct line_info
   unsigned unexceptional : 1;
 } line_t;
 
+bool
+line_t::has_block (block_t *needle)
+{
+  for (block_t *n = u.blocks; n; n = n->chain)
+    if (n == needle)
+      return true;
+
+  return false;
+}
+
 /* Describes a file mentioned in the block graph.  Contains an array
    of line info.  */
 
@@ -407,6 +425,167 @@ static void release_structures (void);
 static void release_function (function_t *);
 extern int main (int, char **);
 
+/* Cycle detection!
+   There are a bajillion algorithms that do this.  Boost's function is named
+   hawick_cycles, so I used the algorithm by K. A. Hawick and H. A. James in
+   "Enumerating Circuits and Loops in Graphs with Self-Arcs and Multiple-Arcs"
+   (url at <http://complexity.massey.ac.nz/cstn/013/cstn-013.pdf>).
+
+   The basic algorithm is simple: effectively, we're finding all simple paths
+   in a subgraph (that shrinks every iteration).  Duplicates are filtered by
+   "blocking" a path when a node is added to the path (this also prevents non-
+   simple paths)--the node is unblocked only when it participates in a cycle.
+   */
+
+typedef vector<arc_t *> arc_vector_t;
+typedef vector<const block_t *> block_vector_t;
+
+/* Enum with types of loop in CFG.  */
+
+enum loop_type
+{
+  NO_LOOP,
+  LOOP,
+  NEGATIVE_LOOP
+};
+
+/* Handle cycle identified by EDGES, where the function finds minimum cs_count
+   and subtract the value from all counts.  The subtracted value is added
+   to COUNT.  Returns type of loop.  */
+
+static loop_type
+handle_cycle (const arc_vector_t &edges, int64_t &count)
+{
+  /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
+     that amount.  */
+  int64_t cycle_count = INT64_MAX;
+  for (unsigned i = 0; i < edges.size (); i++)
+    {
+      int64_t ecount = edges[i]->cs_count;
+      if (cycle_count > ecount)
+	cycle_count = ecount;
+    }
+  count += cycle_count;
+  for (unsigned i = 0; i < edges.size (); i++)
+    edges[i]->cs_count -= cycle_count;
+
+  return cycle_count < 0 ? NEGATIVE_LOOP : LOOP;
+}
+
+/* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
+   blocked by U in BLOCK_LISTS.  */
+
+static void
+unblock (const block_t *u, block_vector_t &blocked,
+	 vector<block_vector_t > &block_lists)
+{
+  block_vector_t::iterator it = find (blocked.begin (), blocked.end (), u);
+  if (it == blocked.end ())
+    return;
+
+  unsigned index = it - blocked.begin ();
+  blocked.erase (it);
+
+  for (block_vector_t::iterator it2 = block_lists[index].begin ();
+       it2 != block_lists[index].end (); it2++)
+    unblock (*it2, blocked, block_lists);
+  for (unsigned j = 0; j < block_lists[index].size (); j++)
+    unblock (u, blocked, block_lists);
+
+  block_lists.erase (block_lists.begin () + index);
+}
+
+/* Find circuit going to block V, PATH is provisional seen cycle.
+   BLOCKED is vector of blocked vertices, BLOCK_LISTS contains vertices
+   blocked by a block.  COUNT is accumulated count of the current LINE.
+   Returns what type of loop it contains.  */
+
+static loop_type
+circuit (block_t *v, arc_vector_t &path, block_t *start,
+	 block_vector_t &blocked, vector<block_vector_t> &block_lists,
+	 line_t &linfo, int64_t &count)
+{
+  loop_type result = NO_LOOP;
+
+  /* Add v to the block list.  */
+  gcc_assert (find (blocked.begin (), blocked.end (), v) == blocked.end ());
+  blocked.push_back (v);
+  block_lists.push_back (block_vector_t ());
+
+  for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+    {
+      block_t *w = arc->dst;
+      if (w < start || !linfo.has_block (w))
+	continue;
+
+      path.push_back (arc);
+      if (w == start)
+	{
+	  /* Cycle has been found.  */
+	  loop_type cresult = handle_cycle (path, count);
+	  if (cresult > result)
+	    result = cresult;
+	}
+      else if (find (blocked.begin (), blocked.end (), w) == blocked.end ())
+	{
+	  loop_type cresult = circuit (w, path, start, blocked, block_lists,
+				       linfo, count);
+	  if (cresult > result)
+	    result = cresult;
+	}
+
+      path.pop_back ();
+    }
+
+  if (result != NO_LOOP)
+    unblock (v, blocked, block_lists);
+  else
+    for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+      {
+	block_t *w = arc->dst;
+	if (w < start || !linfo.has_block (w))
+	  continue;
+
+	size_t index
+	  = find (blocked.begin (), blocked.end (), w) - blocked.begin ();
+	gcc_assert (index < blocked.size ());
+	block_vector_t &list = block_lists[index];
+	if (find (list.begin (), list.end (), v) == list.end ())
+	  list.push_back (v);
+      }
+
+  return result;
+}
+
+/* Find cycles for a LINFO.  If HANDLE_NEGATIVE_CYCLES is set and the line
+   contains a negative loop, then perform the same function once again.  */
+
+static gcov_type
+get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
+{
+  /* Note that this algorithm works even if blocks aren't in sorted order.
+     Each iteration of the circuit detection is completely independent
+     (except for reducing counts, but that shouldn't matter anyways).
+     Therefore, operating on a permuted order (i.e., non-sorted) only
+     has the effect of permuting the output cycles.  */
+
+  loop_type result = NO_LOOP;
+  gcov_type count = 0;
+  for (block_t *block = linfo.u.blocks; block; block = block->chain)
+    {
+      arc_vector_t path;
+      block_vector_t blocked;
+      vector<block_vector_t > block_lists;
+      result = circuit (block, path, block, blocked, block_lists, linfo, count);
+    }
+
+  /* If we have a negative cycle, repeat the find_cycles routine.  */
+  if (result == NEGATIVE_LOOP && handle_negative_cycles)
+    count += get_cycles_count (linfo, false);
+
+  return count;
+}
+
 int
 main (int argc, char **argv)
 {
@@ -2201,113 +2380,22 @@ accumulate_line_counts (source_t *src)
 	      arc_t *arc;
 
 	      for (arc = block->pred; arc; arc = arc->pred_next)
-		{
-		  if (arc->src->u.cycle.ident != ix)
-		    count += arc->count;
-		  if (flag_branches)
-		    add_branch_counts (&src->coverage, arc);
-		}
-
-	      /* Initialize the cs_count.  */
-	      for (arc = block->succ; arc; arc = arc->succ_next)
-		arc->cs_count = arc->count;
+		if (flag_branches)
+		  add_branch_counts (&src->coverage, arc);
 	    }
 
-	  /* Find the loops. This uses the algorithm described in
-	     Tiernan 'An Efficient Search Algorithm to Find the
-	     Elementary Circuits of a Graph', CACM Dec 1970. We hold
-	     the P array by having each block point to the arc that
-	     connects to the previous block. The H array is implicitly
-	     held because of the arc ordering, and the block's
-	     previous arc pointer.
-
-	     Although the algorithm is O(N^3) for highly connected
-	     graphs, at worst we'll have O(N^2), as most blocks have
-	     only one or two exits. Most graphs will be small.
-
-	     For each loop we find, locate the arc with the smallest
-	     transition count, and add that to the cumulative
-	     count.  Decrease flow over the cycle and remove the arc
-	     from consideration.  */
+	  /* Cycle detection.  */
 	  for (block = line->u.blocks; block; block = block->chain)
 	    {
-	      block_t *head = block;
-	      arc_t *arc;
-
-	    next_vertex:;
-	      arc = head->succ;
-	    current_vertex:;
-	      while (arc)
-		{
-		  block_t *dst = arc->dst;
-		  if (/* Already used that arc.  */
-		      arc->cycle
-		      /* Not to same graph, or before first vertex.  */
-		      || dst->u.cycle.ident != ix
-		      /* Already in path.  */
-		      || dst->u.cycle.arc)
-		    {
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  if (dst == block)
-		    {
-		      /* Found a closing arc.  */
-		      gcov_type cycle_count = arc->cs_count;
-		      arc_t *cycle_arc = arc;
-		      arc_t *probe_arc;
-
-		      /* Locate the smallest arc count of the loop.  */
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			if (cycle_count > probe_arc->cs_count)
-			  {
-			    cycle_count = probe_arc->cs_count;
-			    cycle_arc = probe_arc;
-			  }
-
-		      count += cycle_count;
-		      cycle_arc->cycle = 1;
-
-		      /* Remove the flow from the cycle.  */
-		      arc->cs_count -= cycle_count;
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			probe_arc->cs_count -= cycle_count;
-
-		      /* Unwind to the cyclic arc.  */
-		      while (head != cycle_arc->src)
-			{
-			  arc = head->u.cycle.arc;
-			  head->u.cycle.arc = NULL;
-			  head = arc->src;
-			}
-		      /* Move on.  */
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  /* Add new block to chain.  */
-		  dst->u.cycle.arc = arc;
-		  head = dst;
-		  goto next_vertex;
-		}
-	      /* We could not add another vertex to the path. Remove
-		 the last vertex from the list.  */
-	      arc = head->u.cycle.arc;
-	      if (arc)
-		{
-		  /* It was not the first vertex. Move onto next arc.  */
-		  head->u.cycle.arc = NULL;
-		  head = arc->src;
-		  arc = arc->succ_next;
-		  goto current_vertex;
-		}
-	      /* Mark this block as unusable.  */
-	      block->u.cycle.ident = ~0U;
+	      for (arc_t *arc = block->pred; arc; arc = arc->pred_next)
+		if (!line->has_block (arc->src))
+		  count += arc->count;
+	      for (arc_t *arc = block->succ; arc; arc = arc->succ_next)
+		arc->cs_count = arc->count;
 	    }
 
+	  /* Now, add the count of loops entirely on this line.  */
+	  count += get_cycles_count (*line);
 	  line->count = count;
 	}
 
-- 
2.9.2


[-- Attachment #3: diff.patch --]
[-- Type: text/x-patch, Size: 6192 bytes --]

diff --git a/gcc/gcov.c b/gcc/gcov.c
index f39a731..9c9eccf 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -437,15 +437,24 @@ extern int main (int, char **);
    simple paths)--the node is unblocked only when it participates in a cycle.
    */
 
-/* Flag that drives cycle detection after a negative cycle is seen.  */
-static bool did_negate = false;
+typedef vector<arc_t *> arc_vector_t;
+typedef vector<const block_t *> block_vector_t;
+
+/* Enum with types of loop in CFG.  */
+
+enum loop_type
+{
+  NO_LOOP,
+  LOOP,
+  NEGATIVE_LOOP
+};
 
 /* Handle cycle identified by EDGES, where the function finds minimum cs_count
    and subtract the value from all counts.  The subtracted value is added
-   to COUNT.  */
+   to COUNT.  Returns type of loop.  */
 
-static void
-handle_cycle (const vector<arc_t *> &edges, int64_t &count)
+static loop_type
+handle_cycle (const arc_vector_t &edges, int64_t &count)
 {
   /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
      that amount.  */
@@ -460,25 +469,24 @@ handle_cycle (const vector<arc_t *> &edges, int64_t &count)
   for (unsigned i = 0; i < edges.size (); i++)
     edges[i]->cs_count -= cycle_count;
 
-  if (cycle_count < 0)
-    did_negate = true;
+  return cycle_count < 0 ? NEGATIVE_LOOP : LOOP;
 }
 
 /* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
    blocked by U in BLOCK_LISTS.  */
 
 static void
-unblock (block_t *u, vector<block_t *> &blocked,
-	 vector<vector<block_t *> > &block_lists)
+unblock (const block_t *u, block_vector_t &blocked,
+	 vector<block_vector_t > &block_lists)
 {
-  vector<block_t *>::iterator it = find (blocked.begin (), blocked.end (), u);
+  block_vector_t::iterator it = find (blocked.begin (), blocked.end (), u);
   if (it == blocked.end ())
     return;
 
   unsigned index = it - blocked.begin ();
   blocked.erase (it);
 
-  for (vector<block_t *>::iterator it2 = block_lists[index].begin ();
+  for (block_vector_t::iterator it2 = block_lists[index].begin ();
        it2 != block_lists[index].end (); it2++)
     unblock (*it2, blocked, block_lists);
   for (unsigned j = 0; j < block_lists[index].size (); j++)
@@ -489,19 +497,20 @@ unblock (block_t *u, vector<block_t *> &blocked,
 
 /* Find circuit going to block V, PATH is provisional seen cycle.
    BLOCKED is vector of blocked vertices, BLOCK_LISTS contains vertices
-   blocked by a block.  COUNT is accumulated count of the current LINE.  */
+   blocked by a block.  COUNT is accumulated count of the current LINE.
+   Returns what type of loop it contains.  */
 
-static bool
-circuit (block_t *v, vector<arc_t *> &path, block_t *start,
-	 vector<block_t *> &blocked, vector<vector<block_t *>> &block_lists,
+static loop_type
+circuit (block_t *v, arc_vector_t &path, block_t *start,
+	 block_vector_t &blocked, vector<block_vector_t> &block_lists,
 	 line_t &linfo, int64_t &count)
 {
-  bool found = false;
+  loop_type result = NO_LOOP;
 
   /* Add v to the block list.  */
   gcc_assert (find (blocked.begin (), blocked.end (), v) == blocked.end ());
   blocked.push_back (v);
-  block_lists.push_back (vector<block_t *> ());
+  block_lists.push_back (block_vector_t ());
 
   for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
     {
@@ -513,16 +522,22 @@ circuit (block_t *v, vector<arc_t *> &path, block_t *start,
       if (w == start)
 	{
 	  /* Cycle has been found.  */
-	  handle_cycle (path, count);
-	  found = true;
+	  loop_type cresult = handle_cycle (path, count);
+	  if (cresult > result)
+	    result = cresult;
 	}
       else if (find (blocked.begin (), blocked.end (), w) == blocked.end ())
-	found |= circuit (w, path, start, blocked, block_lists, linfo, count);
+	{
+	  loop_type cresult = circuit (w, path, start, blocked, block_lists,
+				       linfo, count);
+	  if (cresult > result)
+	    result = cresult;
+	}
 
       path.pop_back ();
     }
 
-  if (found)
+  if (result != NO_LOOP)
     unblock (v, blocked, block_lists);
   else
     for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
@@ -534,18 +549,19 @@ circuit (block_t *v, vector<arc_t *> &path, block_t *start,
 	size_t index
 	  = find (blocked.begin (), blocked.end (), w) - blocked.begin ();
 	gcc_assert (index < blocked.size ());
-	vector<block_t *> &list = block_lists[index];
+	block_vector_t &list = block_lists[index];
 	if (find (list.begin (), list.end (), v) == list.end ())
 	  list.push_back (v);
       }
 
-  return found;
+  return result;
 }
 
-/* Find cycles for a LINFO.  */
+/* Find cycles for a LINFO.  If HANDLE_NEGATIVE_CYCLES is set and the line
+   contains a negative loop, then perform the same function once again.  */
 
 static gcov_type
-find_cycles (line_t &linfo)
+get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
 {
   /* Note that this algorithm works even if blocks aren't in sorted order.
      Each iteration of the circuit detection is completely independent
@@ -553,32 +569,19 @@ find_cycles (line_t &linfo)
      Therefore, operating on a permuted order (i.e., non-sorted) only
      has the effect of permuting the output cycles.  */
 
+  loop_type result = NO_LOOP;
   gcov_type count = 0;
   for (block_t *block = linfo.u.blocks; block; block = block->chain)
     {
-      vector<arc_t *> path;
-      vector<block_t *> blocked;
-      vector<vector<block_t *> > block_lists;
-      circuit (block, path, block, blocked, block_lists, linfo, count);
+      arc_vector_t path;
+      block_vector_t blocked;
+      vector<block_vector_t > block_lists;
+      result = circuit (block, path, block, blocked, block_lists, linfo, count);
     }
 
-  return count;
-}
-
-/* Get cycle count for a LINFO.  */
-
-static gcov_type
-get_cycles_count (line_t &linfo)
-{
-  gcov_type count = 0;
-
-  /* Bug compatibility with gcc's broken cycle-finder: if a negative cycle
-     exists, run the algorithm again.  */
-
-  did_negate = false;
-  count = find_cycles (linfo);
-  if (did_negate)
-    count += find_cycles (linfo);
+  /* If we have a negative cycle, repeat the find_cycles routine.  */
+  if (result == NEGATIVE_LOOP && handle_negative_cycles)
+    count += get_cycles_count (linfo, false);
 
   return count;
 }

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-04 14:43       ` Martin Liška
@ 2016-08-04 15:13         ` Nathan Sidwell
  2016-08-04 16:10           ` Martin Liška
  0 siblings, 1 reply; 21+ messages in thread
From: Nathan Sidwell @ 2016-08-04 15:13 UTC (permalink / raw)
  To: Martin Liška, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

On 08/04/16 10:42, Martin LiÅ¡ka wrote:

> I decided to use a new enum, hope it's better?

that's fine.  But you know, if you set the enum values appropriately you could 
use the | trick rather than the compare you've done (c++ enum type safety would 
require an overloaded | operator though).  I don't mind either way,


+get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
...
+  for (block_t *block = linfo.u.blocks; block; block = block->chain)
+    {
+      arc_vector_t path;
+      block_vector_t blocked;
+      vector<block_vector_t > block_lists;
+      result = circuit (block, path, block, blocked, block_lists, linfo, count);
+    }
+
+  /* If we have a negative cycle, repeat the find_cycles routine.  */
+  if (result == NEGATIVE_LOOP && handle_negative_cycles)
+    count += get_cycles_count (linfo, false);

The retry will depend on the result of the final call of circuit.  Before  it 
happened if any loop was negated.  Is this change intentional?

nathan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-04 15:13         ` Nathan Sidwell
@ 2016-08-04 16:10           ` Martin Liška
  2016-08-04 16:52             ` Nathan Sidwell
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Liška @ 2016-08-04 16:10 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

[-- Attachment #1: Type: text/plain, Size: 1213 bytes --]

On 08/04/2016 05:13 PM, Nathan Sidwell wrote:
> On 08/04/16 10:42, Martin LiÅ¡ka wrote:
> 
>> I decided to use a new enum, hope it's better?
> 
> that's fine.  But you know, if you set the enum values appropriately you could use the | trick rather than the compare you've done (c++ enum type safety would require an overloaded | operator though).  I don't mind either way,

Yeah, I decided to use enum + operator|.

> 
> 
> +get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
> ...
> +  for (block_t *block = linfo.u.blocks; block; block = block->chain)
> +    {
> +      arc_vector_t path;
> +      block_vector_t blocked;
> +      vector<block_vector_t > block_lists;
> +      result = circuit (block, path, block, blocked, block_lists, linfo, count);
> +    }
> +
> +  /* If we have a negative cycle, repeat the find_cycles routine.  */
> +  if (result == NEGATIVE_LOOP && handle_negative_cycles)
> +    count += get_cycles_count (linfo, false);
> 
> The retry will depend on the result of the final call of circuit.  Before  it happened if any loop was negated.  Is this change intentional?

That's not intentional, fixed in the new version.
May I install the patch?

Martin

> 
> nathan


[-- Attachment #2: 0001-gcov-tool-Implement-Hawick-s-algorithm-for-cycle-det-v4.patch --]
[-- Type: text/x-patch, Size: 10871 bytes --]

From 4517ea775ff041c8f37faff76b637fd671f269e3 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 4 Aug 2016 12:34:08 +0200
Subject: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection,
 (PR gcov-profile/67992)

gcc/ChangeLog:

2016-08-04  Martin Liska  <mliska@suse.cz>
	    Joshua Cranmer  <Pidgeot18@gmail.com>

	* gcov.c (line_t::has_block): New function.
	(enum loop_type): New enum.
	(handle_cycle): New function.
	(unblock): Likewise.
	(circuit): Likewise.
	(get_cycles_count): Likewise.
	(accumulate_line_counts): Use new loop detection algorithm.
---
 gcc/gcov.c | 291 ++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 190 insertions(+), 101 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 40701a1..b1ab6e5 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -41,6 +41,11 @@ along with Gcov; see the file COPYING3.  If not see
 
 #include <getopt.h>
 
+#include <vector>
+#include <algorithm>
+
+using namespace std;
+
 #define IN_GCOV 1
 #include "gcov-io.h"
 #include "gcov-io.c"
@@ -222,6 +227,9 @@ typedef struct coverage_info
 
 typedef struct line_info
 {
+  /* Return true when NEEDLE is one of basic blocks the line belongs to.  */
+  bool has_block (block_t *needle);
+
   gcov_type count;	   /* execution count */
   union
   {
@@ -235,6 +243,16 @@ typedef struct line_info
   unsigned unexceptional : 1;
 } line_t;
 
+bool
+line_t::has_block (block_t *needle)
+{
+  for (block_t *n = u.blocks; n; n = n->chain)
+    if (n == needle)
+      return true;
+
+  return false;
+}
+
 /* Describes a file mentioned in the block graph.  Contains an array
    of line info.  */
 
@@ -407,6 +425,168 @@ static void release_structures (void);
 static void release_function (function_t *);
 extern int main (int, char **);
 
+/* Cycle detection!
+   There are a bajillion algorithms that do this.  Boost's function is named
+   hawick_cycles, so I used the algorithm by K. A. Hawick and H. A. James in
+   "Enumerating Circuits and Loops in Graphs with Self-Arcs and Multiple-Arcs"
+   (url at <http://complexity.massey.ac.nz/cstn/013/cstn-013.pdf>).
+
+   The basic algorithm is simple: effectively, we're finding all simple paths
+   in a subgraph (that shrinks every iteration).  Duplicates are filtered by
+   "blocking" a path when a node is added to the path (this also prevents non-
+   simple paths)--the node is unblocked only when it participates in a cycle.
+   */
+
+typedef vector<arc_t *> arc_vector_t;
+typedef vector<const block_t *> block_vector_t;
+
+/* Enum with types of loop in CFG.  */
+
+enum loop_type
+{
+  NO_LOOP,
+  LOOP,
+  NEGATIVE_LOOP
+};
+
+/* Loop_type operator that merges two values: A and B.  */
+
+inline loop_type& operator |= (loop_type& a, loop_type b)
+{
+    return a = static_cast<loop_type> (a | b);
+}
+
+/* Handle cycle identified by EDGES, where the function finds minimum cs_count
+   and subtract the value from all counts.  The subtracted value is added
+   to COUNT.  Returns type of loop.  */
+
+static loop_type
+handle_cycle (const arc_vector_t &edges, int64_t &count)
+{
+  /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
+     that amount.  */
+  int64_t cycle_count = INT64_MAX;
+  for (unsigned i = 0; i < edges.size (); i++)
+    {
+      int64_t ecount = edges[i]->cs_count;
+      if (cycle_count > ecount)
+	cycle_count = ecount;
+    }
+  count += cycle_count;
+  for (unsigned i = 0; i < edges.size (); i++)
+    edges[i]->cs_count -= cycle_count;
+
+  return cycle_count < 0 ? NEGATIVE_LOOP : LOOP;
+}
+
+/* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
+   blocked by U in BLOCK_LISTS.  */
+
+static void
+unblock (const block_t *u, block_vector_t &blocked,
+	 vector<block_vector_t > &block_lists)
+{
+  block_vector_t::iterator it = find (blocked.begin (), blocked.end (), u);
+  if (it == blocked.end ())
+    return;
+
+  unsigned index = it - blocked.begin ();
+  blocked.erase (it);
+
+  for (block_vector_t::iterator it2 = block_lists[index].begin ();
+       it2 != block_lists[index].end (); it2++)
+    unblock (*it2, blocked, block_lists);
+  for (unsigned j = 0; j < block_lists[index].size (); j++)
+    unblock (u, blocked, block_lists);
+
+  block_lists.erase (block_lists.begin () + index);
+}
+
+/* Find circuit going to block V, PATH is provisional seen cycle.
+   BLOCKED is vector of blocked vertices, BLOCK_LISTS contains vertices
+   blocked by a block.  COUNT is accumulated count of the current LINE.
+   Returns what type of loop it contains.  */
+
+static loop_type
+circuit (block_t *v, arc_vector_t &path, block_t *start,
+	 block_vector_t &blocked, vector<block_vector_t> &block_lists,
+	 line_t &linfo, int64_t &count)
+{
+  loop_type result = NO_LOOP;
+
+  /* Add v to the block list.  */
+  gcc_assert (find (blocked.begin (), blocked.end (), v) == blocked.end ());
+  blocked.push_back (v);
+  block_lists.push_back (block_vector_t ());
+
+  for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+    {
+      block_t *w = arc->dst;
+      if (w < start || !linfo.has_block (w))
+	continue;
+
+      path.push_back (arc);
+      if (w == start)
+	{
+	  /* Cycle has been found.  */
+	  result |= handle_cycle (path, count);
+	}
+      else if (find (blocked.begin (), blocked.end (), w) == blocked.end ())
+	result |= circuit (w, path, start, blocked, block_lists, linfo, count);
+
+      path.pop_back ();
+    }
+
+  if (result != NO_LOOP)
+    unblock (v, blocked, block_lists);
+  else
+    for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+      {
+	block_t *w = arc->dst;
+	if (w < start || !linfo.has_block (w))
+	  continue;
+
+	size_t index
+	  = find (blocked.begin (), blocked.end (), w) - blocked.begin ();
+	gcc_assert (index < blocked.size ());
+	block_vector_t &list = block_lists[index];
+	if (find (list.begin (), list.end (), v) == list.end ())
+	  list.push_back (v);
+      }
+
+  return result;
+}
+
+/* Find cycles for a LINFO.  If HANDLE_NEGATIVE_CYCLES is set and the line
+   contains a negative loop, then perform the same function once again.  */
+
+static gcov_type
+get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
+{
+  /* Note that this algorithm works even if blocks aren't in sorted order.
+     Each iteration of the circuit detection is completely independent
+     (except for reducing counts, but that shouldn't matter anyways).
+     Therefore, operating on a permuted order (i.e., non-sorted) only
+     has the effect of permuting the output cycles.  */
+
+  loop_type result = NO_LOOP;
+  gcov_type count = 0;
+  for (block_t *block = linfo.u.blocks; block; block = block->chain)
+    {
+      arc_vector_t path;
+      block_vector_t blocked;
+      vector<block_vector_t > block_lists;
+      result |= circuit (block, path, block, blocked, block_lists, linfo,
+			 count);
+    }
+
+  /* If we have a negative cycle, repeat the find_cycles routine.  */
+  if (result == NEGATIVE_LOOP && handle_negative_cycles)
+    count += get_cycles_count (linfo, false);
+
+  return count;
+}
+
 int
 main (int argc, char **argv)
 {
@@ -2201,113 +2381,22 @@ accumulate_line_counts (source_t *src)
 	      arc_t *arc;
 
 	      for (arc = block->pred; arc; arc = arc->pred_next)
-		{
-		  if (arc->src->u.cycle.ident != ix)
-		    count += arc->count;
-		  if (flag_branches)
-		    add_branch_counts (&src->coverage, arc);
-		}
-
-	      /* Initialize the cs_count.  */
-	      for (arc = block->succ; arc; arc = arc->succ_next)
-		arc->cs_count = arc->count;
+		if (flag_branches)
+		  add_branch_counts (&src->coverage, arc);
 	    }
 
-	  /* Find the loops. This uses the algorithm described in
-	     Tiernan 'An Efficient Search Algorithm to Find the
-	     Elementary Circuits of a Graph', CACM Dec 1970. We hold
-	     the P array by having each block point to the arc that
-	     connects to the previous block. The H array is implicitly
-	     held because of the arc ordering, and the block's
-	     previous arc pointer.
-
-	     Although the algorithm is O(N^3) for highly connected
-	     graphs, at worst we'll have O(N^2), as most blocks have
-	     only one or two exits. Most graphs will be small.
-
-	     For each loop we find, locate the arc with the smallest
-	     transition count, and add that to the cumulative
-	     count.  Decrease flow over the cycle and remove the arc
-	     from consideration.  */
+	  /* Cycle detection.  */
 	  for (block = line->u.blocks; block; block = block->chain)
 	    {
-	      block_t *head = block;
-	      arc_t *arc;
-
-	    next_vertex:;
-	      arc = head->succ;
-	    current_vertex:;
-	      while (arc)
-		{
-		  block_t *dst = arc->dst;
-		  if (/* Already used that arc.  */
-		      arc->cycle
-		      /* Not to same graph, or before first vertex.  */
-		      || dst->u.cycle.ident != ix
-		      /* Already in path.  */
-		      || dst->u.cycle.arc)
-		    {
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  if (dst == block)
-		    {
-		      /* Found a closing arc.  */
-		      gcov_type cycle_count = arc->cs_count;
-		      arc_t *cycle_arc = arc;
-		      arc_t *probe_arc;
-
-		      /* Locate the smallest arc count of the loop.  */
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			if (cycle_count > probe_arc->cs_count)
-			  {
-			    cycle_count = probe_arc->cs_count;
-			    cycle_arc = probe_arc;
-			  }
-
-		      count += cycle_count;
-		      cycle_arc->cycle = 1;
-
-		      /* Remove the flow from the cycle.  */
-		      arc->cs_count -= cycle_count;
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			probe_arc->cs_count -= cycle_count;
-
-		      /* Unwind to the cyclic arc.  */
-		      while (head != cycle_arc->src)
-			{
-			  arc = head->u.cycle.arc;
-			  head->u.cycle.arc = NULL;
-			  head = arc->src;
-			}
-		      /* Move on.  */
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  /* Add new block to chain.  */
-		  dst->u.cycle.arc = arc;
-		  head = dst;
-		  goto next_vertex;
-		}
-	      /* We could not add another vertex to the path. Remove
-		 the last vertex from the list.  */
-	      arc = head->u.cycle.arc;
-	      if (arc)
-		{
-		  /* It was not the first vertex. Move onto next arc.  */
-		  head->u.cycle.arc = NULL;
-		  head = arc->src;
-		  arc = arc->succ_next;
-		  goto current_vertex;
-		}
-	      /* Mark this block as unusable.  */
-	      block->u.cycle.ident = ~0U;
+	      for (arc_t *arc = block->pred; arc; arc = arc->pred_next)
+		if (!line->has_block (arc->src))
+		  count += arc->count;
+	      for (arc_t *arc = block->succ; arc; arc = arc->succ_next)
+		arc->cs_count = arc->count;
 	    }
 
+	  /* Now, add the count of loops entirely on this line.  */
+	  count += get_cycles_count (*line);
 	  line->count = count;
 	}
 
-- 
2.9.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-04 16:10           ` Martin Liška
@ 2016-08-04 16:52             ` Nathan Sidwell
  2016-08-05  7:30               ` Martin Liška
  0 siblings, 1 reply; 21+ messages in thread
From: Nathan Sidwell @ 2016-08-04 16:52 UTC (permalink / raw)
  To: Martin Liška, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

On 08/04/16 12:10, Martin LiÅ¡ka wrote:
> On 08/04/2016 05:13 PM, Nathan Sidwell wrote:
>> On 08/04/16 10:42, Martin LiÅ¡ka wrote:
>>
>>> I decided to use a new enum, hope it's better?
>>
>> that's fine.  But you know, if you set the enum values appropriately you could use the | trick rather than the compare you've done (c++ enum type safety would require an overloaded | operator though).  I don't mind either way,
>
> Yeah, I decided to use enum + operator|.

You have a bug. The enum values are {0,1,2},  So the result of meeting both 
regular and reversed loops will be the value '3'.  so the check for == 
NEGATIVE_LOOP could erroneously fail.  Fixable by making NEGATIVE_LOOP's value 2 
+ LOOP (or many other variants on that theme).



+      if (w == start)
+	{
+	  /* Cycle has been found.  */
+	  result |= handle_cycle (path, count);
+	}

{...} not necessary here (even with the comment).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-04 16:52             ` Nathan Sidwell
@ 2016-08-05  7:30               ` Martin Liška
  2016-08-05  7:32                 ` Martin Liška
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Liška @ 2016-08-05  7:30 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

[-- Attachment #1: Type: text/plain, Size: 1119 bytes --]

On 08/04/2016 06:52 PM, Nathan Sidwell wrote:
> On 08/04/16 12:10, Martin LiÅ¡ka wrote:
>> On 08/04/2016 05:13 PM, Nathan Sidwell wrote:
>>> On 08/04/16 10:42, Martin LiÅ¡ka wrote:
>>>
>>>> I decided to use a new enum, hope it's better?
>>>
>>> that's fine.  But you know, if you set the enum values appropriately you could use the | trick rather than the compare you've done (c++ enum type safety would require an overloaded | operator though).  I don't mind either way,
>>
>> Yeah, I decided to use enum + operator|.
> 
> You have a bug. The enum values are {0,1,2},  So the result of meeting both regular and reversed loops will be the value '3'.  so the check for == NEGATIVE_LOOP could erroneously fail.  Fixable by making NEGATIVE_LOOP's value 2 + LOOP (or many other variants on that theme).
> 
> 
> 
> +      if (w == start)
> +    {
> +      /* Cycle has been found.  */
> +      result |= handle_cycle (path, count);
> +    }
> 
> {...} not necessary here (even with the comment).

Hi.

Sorry for the mistake with the enum, that was silly ;)
New patch version also handles the unnecessary braces.

Martin

[-- Attachment #2: 0001-gcov-tool-Implement-Hawick-s-algorithm-for-cycle-det-v5.patch --]
[-- Type: text/x-patch, Size: 10883 bytes --]

From 8aec25a71d303c4411f0c2ef307b1a20e71483a1 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 4 Aug 2016 12:34:08 +0200
Subject: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection,
 (PR gcov-profile/67992)

gcc/ChangeLog:

2016-08-04  Martin Liska  <mliska@suse.cz>
	    Joshua Cranmer  <Pidgeot18@gmail.com>

	* gcov.c (line_t::has_block): New function.
	(enum loop_type): New enum.
	(handle_cycle): New function.
	(unblock): Likewise.
	(circuit): Likewise.
	(get_cycles_count): Likewise.
	(accumulate_line_counts): Use new loop detection algorithm.
---
 gcc/gcov.c | 291 ++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 190 insertions(+), 101 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 40701a1..1d12c59 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -41,6 +41,11 @@ along with Gcov; see the file COPYING3.  If not see
 
 #include <getopt.h>
 
+#include <vector>
+#include <algorithm>
+
+using namespace std;
+
 #define IN_GCOV 1
 #include "gcov-io.h"
 #include "gcov-io.c"
@@ -222,6 +227,9 @@ typedef struct coverage_info
 
 typedef struct line_info
 {
+  /* Return true when NEEDLE is one of basic blocks the line belongs to.  */
+  bool has_block (block_t *needle);
+
   gcov_type count;	   /* execution count */
   union
   {
@@ -235,6 +243,16 @@ typedef struct line_info
   unsigned unexceptional : 1;
 } line_t;
 
+bool
+line_t::has_block (block_t *needle)
+{
+  for (block_t *n = u.blocks; n; n = n->chain)
+    if (n == needle)
+      return true;
+
+  return false;
+}
+
 /* Describes a file mentioned in the block graph.  Contains an array
    of line info.  */
 
@@ -407,6 +425,168 @@ static void release_structures (void);
 static void release_function (function_t *);
 extern int main (int, char **);
 
+/* Cycle detection!
+   There are a bajillion algorithms that do this.  Boost's function is named
+   hawick_cycles, so I used the algorithm by K. A. Hawick and H. A. James in
+   "Enumerating Circuits and Loops in Graphs with Self-Arcs and Multiple-Arcs"
+   (url at <http://complexity.massey.ac.nz/cstn/013/cstn-013.pdf>).
+
+   The basic algorithm is simple: effectively, we're finding all simple paths
+   in a subgraph (that shrinks every iteration).  Duplicates are filtered by
+   "blocking" a path when a node is added to the path (this also prevents non-
+   simple paths)--the node is unblocked only when it participates in a cycle.
+   */
+
+typedef vector<arc_t *> arc_vector_t;
+typedef vector<const block_t *> block_vector_t;
+
+/* Enum with types of loop in CFG.  */
+
+enum loop_type
+{
+  NO_LOOP = 0,
+  LOOP = 1,
+  NEGATIVE_LOOP = 3
+};
+
+/* Loop_type operator that merges two values: A and B.  */
+
+inline loop_type& operator |= (loop_type& a, loop_type b)
+{
+    return a = static_cast<loop_type> (a | b);
+}
+
+/* Handle cycle identified by EDGES, where the function finds minimum cs_count
+   and subtract the value from all counts.  The subtracted value is added
+   to COUNT.  Returns type of loop.  */
+
+static loop_type
+handle_cycle (const arc_vector_t &edges, int64_t &count)
+{
+  /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
+     that amount.  */
+  int64_t cycle_count = INT64_MAX;
+  for (unsigned i = 0; i < edges.size (); i++)
+    {
+      int64_t ecount = edges[i]->cs_count;
+      if (cycle_count > ecount)
+	cycle_count = ecount;
+    }
+  count += cycle_count;
+  for (unsigned i = 0; i < edges.size (); i++)
+    edges[i]->cs_count -= cycle_count;
+
+  return cycle_count < 0 ? NEGATIVE_LOOP : LOOP;
+}
+
+/* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
+   blocked by U in BLOCK_LISTS.  */
+
+static void
+unblock (const block_t *u, block_vector_t &blocked,
+	 vector<block_vector_t > &block_lists)
+{
+  block_vector_t::iterator it = find (blocked.begin (), blocked.end (), u);
+  if (it == blocked.end ())
+    return;
+
+  unsigned index = it - blocked.begin ();
+  blocked.erase (it);
+
+  for (block_vector_t::iterator it2 = block_lists[index].begin ();
+       it2 != block_lists[index].end (); it2++)
+    unblock (*it2, blocked, block_lists);
+  for (unsigned j = 0; j < block_lists[index].size (); j++)
+    unblock (u, blocked, block_lists);
+
+  block_lists.erase (block_lists.begin () + index);
+}
+
+/* Find circuit going to block V, PATH is provisional seen cycle.
+   BLOCKED is vector of blocked vertices, BLOCK_LISTS contains vertices
+   blocked by a block.  COUNT is accumulated count of the current LINE.
+   Returns what type of loop it contains.  */
+
+static loop_type
+circuit (block_t *v, arc_vector_t &path, block_t *start,
+	 block_vector_t &blocked, vector<block_vector_t> &block_lists,
+	 line_t &linfo, int64_t &count)
+{
+  loop_type result = NO_LOOP;
+
+  /* Add v to the block list.  */
+  gcc_assert (find (blocked.begin (), blocked.end (), v) == blocked.end ());
+  blocked.push_back (v);
+  block_lists.push_back (block_vector_t ());
+
+  for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+    {
+      block_t *w = arc->dst;
+      if (w < start || !linfo.has_block (w))
+	continue;
+
+      path.push_back (arc);
+      if (w == start)
+	{
+	  /* Cycle has been found.  */
+	  result |= handle_cycle (path, count);
+	}
+      else if (find (blocked.begin (), blocked.end (), w) == blocked.end ())
+	result |= circuit (w, path, start, blocked, block_lists, linfo, count);
+
+      path.pop_back ();
+    }
+
+  if (result != NO_LOOP)
+    unblock (v, blocked, block_lists);
+  else
+    for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+      {
+	block_t *w = arc->dst;
+	if (w < start || !linfo.has_block (w))
+	  continue;
+
+	size_t index
+	  = find (blocked.begin (), blocked.end (), w) - blocked.begin ();
+	gcc_assert (index < blocked.size ());
+	block_vector_t &list = block_lists[index];
+	if (find (list.begin (), list.end (), v) == list.end ())
+	  list.push_back (v);
+      }
+
+  return result;
+}
+
+/* Find cycles for a LINFO.  If HANDLE_NEGATIVE_CYCLES is set and the line
+   contains a negative loop, then perform the same function once again.  */
+
+static gcov_type
+get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
+{
+  /* Note that this algorithm works even if blocks aren't in sorted order.
+     Each iteration of the circuit detection is completely independent
+     (except for reducing counts, but that shouldn't matter anyways).
+     Therefore, operating on a permuted order (i.e., non-sorted) only
+     has the effect of permuting the output cycles.  */
+
+  loop_type result = NO_LOOP;
+  gcov_type count = 0;
+  for (block_t *block = linfo.u.blocks; block; block = block->chain)
+    {
+      arc_vector_t path;
+      block_vector_t blocked;
+      vector<block_vector_t > block_lists;
+      result |= circuit (block, path, block, blocked, block_lists, linfo,
+			 count);
+    }
+
+  /* If we have a negative cycle, repeat the find_cycles routine.  */
+  if (result == NEGATIVE_LOOP && handle_negative_cycles)
+    count += get_cycles_count (linfo, false);
+
+  return count;
+}
+
 int
 main (int argc, char **argv)
 {
@@ -2201,113 +2381,22 @@ accumulate_line_counts (source_t *src)
 	      arc_t *arc;
 
 	      for (arc = block->pred; arc; arc = arc->pred_next)
-		{
-		  if (arc->src->u.cycle.ident != ix)
-		    count += arc->count;
-		  if (flag_branches)
-		    add_branch_counts (&src->coverage, arc);
-		}
-
-	      /* Initialize the cs_count.  */
-	      for (arc = block->succ; arc; arc = arc->succ_next)
-		arc->cs_count = arc->count;
+		if (flag_branches)
+		  add_branch_counts (&src->coverage, arc);
 	    }
 
-	  /* Find the loops. This uses the algorithm described in
-	     Tiernan 'An Efficient Search Algorithm to Find the
-	     Elementary Circuits of a Graph', CACM Dec 1970. We hold
-	     the P array by having each block point to the arc that
-	     connects to the previous block. The H array is implicitly
-	     held because of the arc ordering, and the block's
-	     previous arc pointer.
-
-	     Although the algorithm is O(N^3) for highly connected
-	     graphs, at worst we'll have O(N^2), as most blocks have
-	     only one or two exits. Most graphs will be small.
-
-	     For each loop we find, locate the arc with the smallest
-	     transition count, and add that to the cumulative
-	     count.  Decrease flow over the cycle and remove the arc
-	     from consideration.  */
+	  /* Cycle detection.  */
 	  for (block = line->u.blocks; block; block = block->chain)
 	    {
-	      block_t *head = block;
-	      arc_t *arc;
-
-	    next_vertex:;
-	      arc = head->succ;
-	    current_vertex:;
-	      while (arc)
-		{
-		  block_t *dst = arc->dst;
-		  if (/* Already used that arc.  */
-		      arc->cycle
-		      /* Not to same graph, or before first vertex.  */
-		      || dst->u.cycle.ident != ix
-		      /* Already in path.  */
-		      || dst->u.cycle.arc)
-		    {
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  if (dst == block)
-		    {
-		      /* Found a closing arc.  */
-		      gcov_type cycle_count = arc->cs_count;
-		      arc_t *cycle_arc = arc;
-		      arc_t *probe_arc;
-
-		      /* Locate the smallest arc count of the loop.  */
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			if (cycle_count > probe_arc->cs_count)
-			  {
-			    cycle_count = probe_arc->cs_count;
-			    cycle_arc = probe_arc;
-			  }
-
-		      count += cycle_count;
-		      cycle_arc->cycle = 1;
-
-		      /* Remove the flow from the cycle.  */
-		      arc->cs_count -= cycle_count;
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			probe_arc->cs_count -= cycle_count;
-
-		      /* Unwind to the cyclic arc.  */
-		      while (head != cycle_arc->src)
-			{
-			  arc = head->u.cycle.arc;
-			  head->u.cycle.arc = NULL;
-			  head = arc->src;
-			}
-		      /* Move on.  */
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  /* Add new block to chain.  */
-		  dst->u.cycle.arc = arc;
-		  head = dst;
-		  goto next_vertex;
-		}
-	      /* We could not add another vertex to the path. Remove
-		 the last vertex from the list.  */
-	      arc = head->u.cycle.arc;
-	      if (arc)
-		{
-		  /* It was not the first vertex. Move onto next arc.  */
-		  head->u.cycle.arc = NULL;
-		  head = arc->src;
-		  arc = arc->succ_next;
-		  goto current_vertex;
-		}
-	      /* Mark this block as unusable.  */
-	      block->u.cycle.ident = ~0U;
+	      for (arc_t *arc = block->pred; arc; arc = arc->pred_next)
+		if (!line->has_block (arc->src))
+		  count += arc->count;
+	      for (arc_t *arc = block->succ; arc; arc = arc->succ_next)
+		arc->cs_count = arc->count;
 	    }
 
+	  /* Now, add the count of loops entirely on this line.  */
+	  count += get_cycles_count (*line);
 	  line->count = count;
 	}
 
-- 
2.9.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-05  7:30               ` Martin Liška
@ 2016-08-05  7:32                 ` Martin Liška
  2016-08-05 12:25                   ` Nathan Sidwell
  2016-08-05 15:27                   ` Andrew Pinski
  0 siblings, 2 replies; 21+ messages in thread
From: Martin Liška @ 2016-08-05  7:32 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

[-- Attachment #1: Type: text/plain, Size: 238 bytes --]

On 08/05/2016 09:30 AM, Martin LiÅ¡ka wrote:
> Hi.
> 
> Sorry for the mistake with the enum, that was silly ;)
> New patch version also handles the unnecessary braces.
> 
> Martin

I attached a wrong patch, sending the new one.

Martin

[-- Attachment #2: 0001-gcov-tool-Implement-Hawick-s-algorithm-for-cycle-det-v6.patch --]
[-- Type: text/x-patch, Size: 10871 bytes --]

From 6a811bf138aa29e6b821de21de61f77afec5a0d6 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 4 Aug 2016 12:34:08 +0200
Subject: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection,
 (PR gcov-profile/67992)

gcc/ChangeLog:

2016-08-04  Martin Liska  <mliska@suse.cz>
	    Joshua Cranmer  <Pidgeot18@gmail.com>

	* gcov.c (line_t::has_block): New function.
	(enum loop_type): New enum.
	(handle_cycle): New function.
	(unblock): Likewise.
	(circuit): Likewise.
	(get_cycles_count): Likewise.
	(accumulate_line_counts): Use new loop detection algorithm.
---
 gcc/gcov.c | 289 ++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 188 insertions(+), 101 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 40701a1..8f47fd1 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -41,6 +41,11 @@ along with Gcov; see the file COPYING3.  If not see
 
 #include <getopt.h>
 
+#include <vector>
+#include <algorithm>
+
+using namespace std;
+
 #define IN_GCOV 1
 #include "gcov-io.h"
 #include "gcov-io.c"
@@ -222,6 +227,9 @@ typedef struct coverage_info
 
 typedef struct line_info
 {
+  /* Return true when NEEDLE is one of basic blocks the line belongs to.  */
+  bool has_block (block_t *needle);
+
   gcov_type count;	   /* execution count */
   union
   {
@@ -235,6 +243,16 @@ typedef struct line_info
   unsigned unexceptional : 1;
 } line_t;
 
+bool
+line_t::has_block (block_t *needle)
+{
+  for (block_t *n = u.blocks; n; n = n->chain)
+    if (n == needle)
+      return true;
+
+  return false;
+}
+
 /* Describes a file mentioned in the block graph.  Contains an array
    of line info.  */
 
@@ -407,6 +425,166 @@ static void release_structures (void);
 static void release_function (function_t *);
 extern int main (int, char **);
 
+/* Cycle detection!
+   There are a bajillion algorithms that do this.  Boost's function is named
+   hawick_cycles, so I used the algorithm by K. A. Hawick and H. A. James in
+   "Enumerating Circuits and Loops in Graphs with Self-Arcs and Multiple-Arcs"
+   (url at <http://complexity.massey.ac.nz/cstn/013/cstn-013.pdf>).
+
+   The basic algorithm is simple: effectively, we're finding all simple paths
+   in a subgraph (that shrinks every iteration).  Duplicates are filtered by
+   "blocking" a path when a node is added to the path (this also prevents non-
+   simple paths)--the node is unblocked only when it participates in a cycle.
+   */
+
+typedef vector<arc_t *> arc_vector_t;
+typedef vector<const block_t *> block_vector_t;
+
+/* Enum with types of loop in CFG.  */
+
+enum loop_type
+{
+  NO_LOOP = 0,
+  LOOP = 1,
+  NEGATIVE_LOOP = 3
+};
+
+/* Loop_type operator that merges two values: A and B.  */
+
+inline loop_type& operator |= (loop_type& a, loop_type b)
+{
+    return a = static_cast<loop_type> (a | b);
+}
+
+/* Handle cycle identified by EDGES, where the function finds minimum cs_count
+   and subtract the value from all counts.  The subtracted value is added
+   to COUNT.  Returns type of loop.  */
+
+static loop_type
+handle_cycle (const arc_vector_t &edges, int64_t &count)
+{
+  /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
+     that amount.  */
+  int64_t cycle_count = INT64_MAX;
+  for (unsigned i = 0; i < edges.size (); i++)
+    {
+      int64_t ecount = edges[i]->cs_count;
+      if (cycle_count > ecount)
+	cycle_count = ecount;
+    }
+  count += cycle_count;
+  for (unsigned i = 0; i < edges.size (); i++)
+    edges[i]->cs_count -= cycle_count;
+
+  return cycle_count < 0 ? NEGATIVE_LOOP : LOOP;
+}
+
+/* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
+   blocked by U in BLOCK_LISTS.  */
+
+static void
+unblock (const block_t *u, block_vector_t &blocked,
+	 vector<block_vector_t > &block_lists)
+{
+  block_vector_t::iterator it = find (blocked.begin (), blocked.end (), u);
+  if (it == blocked.end ())
+    return;
+
+  unsigned index = it - blocked.begin ();
+  blocked.erase (it);
+
+  for (block_vector_t::iterator it2 = block_lists[index].begin ();
+       it2 != block_lists[index].end (); it2++)
+    unblock (*it2, blocked, block_lists);
+  for (unsigned j = 0; j < block_lists[index].size (); j++)
+    unblock (u, blocked, block_lists);
+
+  block_lists.erase (block_lists.begin () + index);
+}
+
+/* Find circuit going to block V, PATH is provisional seen cycle.
+   BLOCKED is vector of blocked vertices, BLOCK_LISTS contains vertices
+   blocked by a block.  COUNT is accumulated count of the current LINE.
+   Returns what type of loop it contains.  */
+
+static loop_type
+circuit (block_t *v, arc_vector_t &path, block_t *start,
+	 block_vector_t &blocked, vector<block_vector_t> &block_lists,
+	 line_t &linfo, int64_t &count)
+{
+  loop_type result = NO_LOOP;
+
+  /* Add v to the block list.  */
+  gcc_assert (find (blocked.begin (), blocked.end (), v) == blocked.end ());
+  blocked.push_back (v);
+  block_lists.push_back (block_vector_t ());
+
+  for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+    {
+      block_t *w = arc->dst;
+      if (w < start || !linfo.has_block (w))
+	continue;
+
+      path.push_back (arc);
+      if (w == start)
+	/* Cycle has been found.  */
+	result |= handle_cycle (path, count);
+      else if (find (blocked.begin (), blocked.end (), w) == blocked.end ())
+	result |= circuit (w, path, start, blocked, block_lists, linfo, count);
+
+      path.pop_back ();
+    }
+
+  if (result != NO_LOOP)
+    unblock (v, blocked, block_lists);
+  else
+    for (arc_t *arc = v->succ; arc; arc = arc->succ_next)
+      {
+	block_t *w = arc->dst;
+	if (w < start || !linfo.has_block (w))
+	  continue;
+
+	size_t index
+	  = find (blocked.begin (), blocked.end (), w) - blocked.begin ();
+	gcc_assert (index < blocked.size ());
+	block_vector_t &list = block_lists[index];
+	if (find (list.begin (), list.end (), v) == list.end ())
+	  list.push_back (v);
+      }
+
+  return result;
+}
+
+/* Find cycles for a LINFO.  If HANDLE_NEGATIVE_CYCLES is set and the line
+   contains a negative loop, then perform the same function once again.  */
+
+static gcov_type
+get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
+{
+  /* Note that this algorithm works even if blocks aren't in sorted order.
+     Each iteration of the circuit detection is completely independent
+     (except for reducing counts, but that shouldn't matter anyways).
+     Therefore, operating on a permuted order (i.e., non-sorted) only
+     has the effect of permuting the output cycles.  */
+
+  loop_type result = NO_LOOP;
+  gcov_type count = 0;
+  for (block_t *block = linfo.u.blocks; block; block = block->chain)
+    {
+      arc_vector_t path;
+      block_vector_t blocked;
+      vector<block_vector_t > block_lists;
+      result |= circuit (block, path, block, blocked, block_lists, linfo,
+			 count);
+    }
+
+  /* If we have a negative cycle, repeat the find_cycles routine.  */
+  if (result == NEGATIVE_LOOP && handle_negative_cycles)
+    count += get_cycles_count (linfo, false);
+
+  return count;
+}
+
 int
 main (int argc, char **argv)
 {
@@ -2201,113 +2379,22 @@ accumulate_line_counts (source_t *src)
 	      arc_t *arc;
 
 	      for (arc = block->pred; arc; arc = arc->pred_next)
-		{
-		  if (arc->src->u.cycle.ident != ix)
-		    count += arc->count;
-		  if (flag_branches)
-		    add_branch_counts (&src->coverage, arc);
-		}
-
-	      /* Initialize the cs_count.  */
-	      for (arc = block->succ; arc; arc = arc->succ_next)
-		arc->cs_count = arc->count;
+		if (flag_branches)
+		  add_branch_counts (&src->coverage, arc);
 	    }
 
-	  /* Find the loops. This uses the algorithm described in
-	     Tiernan 'An Efficient Search Algorithm to Find the
-	     Elementary Circuits of a Graph', CACM Dec 1970. We hold
-	     the P array by having each block point to the arc that
-	     connects to the previous block. The H array is implicitly
-	     held because of the arc ordering, and the block's
-	     previous arc pointer.
-
-	     Although the algorithm is O(N^3) for highly connected
-	     graphs, at worst we'll have O(N^2), as most blocks have
-	     only one or two exits. Most graphs will be small.
-
-	     For each loop we find, locate the arc with the smallest
-	     transition count, and add that to the cumulative
-	     count.  Decrease flow over the cycle and remove the arc
-	     from consideration.  */
+	  /* Cycle detection.  */
 	  for (block = line->u.blocks; block; block = block->chain)
 	    {
-	      block_t *head = block;
-	      arc_t *arc;
-
-	    next_vertex:;
-	      arc = head->succ;
-	    current_vertex:;
-	      while (arc)
-		{
-		  block_t *dst = arc->dst;
-		  if (/* Already used that arc.  */
-		      arc->cycle
-		      /* Not to same graph, or before first vertex.  */
-		      || dst->u.cycle.ident != ix
-		      /* Already in path.  */
-		      || dst->u.cycle.arc)
-		    {
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  if (dst == block)
-		    {
-		      /* Found a closing arc.  */
-		      gcov_type cycle_count = arc->cs_count;
-		      arc_t *cycle_arc = arc;
-		      arc_t *probe_arc;
-
-		      /* Locate the smallest arc count of the loop.  */
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			if (cycle_count > probe_arc->cs_count)
-			  {
-			    cycle_count = probe_arc->cs_count;
-			    cycle_arc = probe_arc;
-			  }
-
-		      count += cycle_count;
-		      cycle_arc->cycle = 1;
-
-		      /* Remove the flow from the cycle.  */
-		      arc->cs_count -= cycle_count;
-		      for (dst = head; (probe_arc = dst->u.cycle.arc);
-			   dst = probe_arc->src)
-			probe_arc->cs_count -= cycle_count;
-
-		      /* Unwind to the cyclic arc.  */
-		      while (head != cycle_arc->src)
-			{
-			  arc = head->u.cycle.arc;
-			  head->u.cycle.arc = NULL;
-			  head = arc->src;
-			}
-		      /* Move on.  */
-		      arc = arc->succ_next;
-		      continue;
-		    }
-
-		  /* Add new block to chain.  */
-		  dst->u.cycle.arc = arc;
-		  head = dst;
-		  goto next_vertex;
-		}
-	      /* We could not add another vertex to the path. Remove
-		 the last vertex from the list.  */
-	      arc = head->u.cycle.arc;
-	      if (arc)
-		{
-		  /* It was not the first vertex. Move onto next arc.  */
-		  head->u.cycle.arc = NULL;
-		  head = arc->src;
-		  arc = arc->succ_next;
-		  goto current_vertex;
-		}
-	      /* Mark this block as unusable.  */
-	      block->u.cycle.ident = ~0U;
+	      for (arc_t *arc = block->pred; arc; arc = arc->pred_next)
+		if (!line->has_block (arc->src))
+		  count += arc->count;
+	      for (arc_t *arc = block->succ; arc; arc = arc->succ_next)
+		arc->cs_count = arc->count;
 	    }
 
+	  /* Now, add the count of loops entirely on this line.  */
+	  count += get_cycles_count (*line);
 	  line->count = count;
 	}
 
-- 
2.9.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-05  7:32                 ` Martin Liška
@ 2016-08-05 12:25                   ` Nathan Sidwell
  2016-08-05 15:27                   ` Andrew Pinski
  1 sibling, 0 replies; 21+ messages in thread
From: Nathan Sidwell @ 2016-08-05 12:25 UTC (permalink / raw)
  To: Martin Liška, GCC Patches; +Cc: Jan Hubicka, Pidgeot18

On 08/05/16 03:32, Martin LiÅ¡ka wrote:
> On 08/05/2016 09:30 AM, Martin LiÅ¡ka wrote:
>> Hi.
>>
>> Sorry for the mistake with the enum, that was silly ;)
>> New patch version also handles the unnecessary braces.
>>
>> Martin
>
> I attached a wrong patch, sending the new one.

This one looks good, thanks!


nathan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-05  7:32                 ` Martin Liška
  2016-08-05 12:25                   ` Nathan Sidwell
@ 2016-08-05 15:27                   ` Andrew Pinski
  2016-08-05 17:12                     ` Gerald Pfeifer
                                       ` (2 more replies)
  1 sibling, 3 replies; 21+ messages in thread
From: Andrew Pinski @ 2016-08-05 15:27 UTC (permalink / raw)
  To: Martin Liška; +Cc: Nathan Sidwell, GCC Patches, Jan Hubicka, Pidgeot18

On Fri, Aug 5, 2016 at 12:32 AM, Martin Liška <mliska@suse.cz> wrote:
> On 08/05/2016 09:30 AM, Martin Liška wrote:
>> Hi.
>>
>> Sorry for the mistake with the enum, that was silly ;)
>> New patch version also handles the unnecessary braces.
>>
>> Martin
>
> I attached a wrong patch, sending the new one.


This broke cross aarch64-elf-gnu building.

/home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/gcov.c:
In function ‘loop_type handle_cycle(const arc_vector_t&, int64_t&)’:
/home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/gcov.c:468:25:
error: ‘INT64_MAX’ was not declared in this scope


See http://stackoverflow.com/questions/3233054/error-int32-max-was-not-declared-in-this-scope
.


Thanks,
Andrew




>
> Martin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-05 15:27                   ` Andrew Pinski
@ 2016-08-05 17:12                     ` Gerald Pfeifer
  2016-08-05 20:39                     ` Michael Meissner
  2016-08-06 10:31                     ` Jakub Jelinek
  2 siblings, 0 replies; 21+ messages in thread
From: Gerald Pfeifer @ 2016-08-05 17:12 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Martin Liška, Nathan Sidwell, GCC Patches, Jan Hubicka, Pidgeot18

On Fri, 5 Aug 2016, Andrew Pinski wrote:
>> I attached a wrong patch, sending the new one.
> This broke cross aarch64-elf-gnu building.

It also broke native i586-unknown-freebsd10.3 (which features 
clang as the system compiler).

/scratch/tmp/gerald/gcc-HEAD/gcc/gcov.c:468:25: error: use of undeclared 
identifier 'INT64_MAX'
  int64_t cycle_count = INT64_MAX;
                        ^

Gerald

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-05 15:27                   ` Andrew Pinski
  2016-08-05 17:12                     ` Gerald Pfeifer
@ 2016-08-05 20:39                     ` Michael Meissner
  2016-08-06 10:31                     ` Jakub Jelinek
  2 siblings, 0 replies; 21+ messages in thread
From: Michael Meissner @ 2016-08-05 20:39 UTC (permalink / raw)
  To: Andrew Pinski, Martin LiE!ka
  Cc: Nathan Sidwell, GCC Patches, Jan Hubicka, Pidgeot18, Kelvin Nilsen

On Fri, Aug 05, 2016 at 08:27:39AM -0700, Andrew Pinski wrote:
> On Fri, Aug 5, 2016 at 12:32 AM, Martin LiÅ¡ka <mliska@suse.cz> wrote:
> > On 08/05/2016 09:30 AM, Martin LiÅ¡ka wrote:
> >> Hi.
> >>
> >> Sorry for the mistake with the enum, that was silly ;)
> >> New patch version also handles the unnecessary braces.
> >>
> >> Martin
> >
> > I attached a wrong patch, sending the new one.
> 
> 
> This broke cross aarch64-elf-gnu building.
> 
> /home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/gcov.c:
> In function â€˜loop_type handle_cycle(const arc_vector_t&, int64_t&)â€™:
> /home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/gcov.c:468:25:
> error: â€˜INT64_MAXâ€™ was not declared in this scope
> 
> 
> See http://stackoverflow.com/questions/3233054/error-int32-max-was-not-declared-in-this-scope
> .

We are seeing the same thing on RHEL 7.2 PowerPC systems and RHEL 6.7 x86_64
systems when building a cross compiler.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-05 15:27                   ` Andrew Pinski
  2016-08-05 17:12                     ` Gerald Pfeifer
  2016-08-05 20:39                     ` Michael Meissner
@ 2016-08-06 10:31                     ` Jakub Jelinek
  2016-08-07 21:33                       ` Martin Liška
  2016-08-08 12:47                       ` Nathan Sidwell
  2 siblings, 2 replies; 21+ messages in thread
From: Jakub Jelinek @ 2016-08-06 10:31 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Martin Liška, Nathan Sidwell, GCC Patches, Jan Hubicka, Pidgeot18

On Fri, Aug 05, 2016 at 08:27:39AM -0700, Andrew Pinski wrote:
> On Fri, Aug 5, 2016 at 12:32 AM, Martin LiÅ¡ka <mliska@suse.cz> wrote:
> > On 08/05/2016 09:30 AM, Martin LiÅ¡ka wrote:
> >> Hi.
> >>
> >> Sorry for the mistake with the enum, that was silly ;)
> >> New patch version also handles the unnecessary braces.
> >>
> >> Martin
> >
> > I attached a wrong patch, sending the new one.
> 
> 
> This broke cross aarch64-elf-gnu building.
> 
> /home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/gcov.c:
> In function â€˜loop_type handle_cycle(const arc_vector_t&, int64_t&)â€™:
> /home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/gcov.c:468:25:
> error: â€˜INT64_MAXâ€™ was not declared in this scope
> 
> 
> See http://stackoverflow.com/questions/3233054/error-int32-max-was-not-declared-in-this-scope
> .

Fixed thusly, committed as obvious to trunk:

2016-08-06  Jakub Jelinek  <jakub@redhat.com>

	* gcov.c (handle_cycle): Use INTTYPE_MAXIMUM (int64_t) instead of
	INT64_MAX.

--- gcc/gcov.c.jj	2016-08-06 12:11:51.000000000 +0200
+++ gcc/gcov.c	2016-08-06 12:28:32.373898684 +0200
@@ -465,7 +465,7 @@ handle_cycle (const arc_vector_t &edges,
 {
   /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
      that amount.  */
-  int64_t cycle_count = INT64_MAX;
+  int64_t cycle_count = INTTYPE_MAXIMUM (int64_t);
   for (unsigned i = 0; i < edges.size (); i++)
     {
       int64_t ecount = edges[i]->cs_count;


	Jakub

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-06 10:31                     ` Jakub Jelinek
@ 2016-08-07 21:33                       ` Martin Liška
  2016-08-08 12:47                       ` Nathan Sidwell
  1 sibling, 0 replies; 21+ messages in thread
From: Martin Liška @ 2016-08-07 21:33 UTC (permalink / raw)
  To: Jakub Jelinek, Andrew Pinski
  Cc: Nathan Sidwell, GCC Patches, Jan Hubicka, Pidgeot18

On 08/06/2016 12:31 PM, Jakub Jelinek wrote:
> On Fri, Aug 05, 2016 at 08:27:39AM -0700, Andrew Pinski wrote:
>> On Fri, Aug 5, 2016 at 12:32 AM, Martin LiÅ¡ka <mliska@suse.cz> wrote:
>>> On 08/05/2016 09:30 AM, Martin LiÅ¡ka wrote:
>>>> Hi.
>>>>
>>>> Sorry for the mistake with the enum, that was silly ;)
>>>> New patch version also handles the unnecessary braces.
>>>>
>>>> Martin
>>>
>>> I attached a wrong patch, sending the new one.
>>
>>
>> This broke cross aarch64-elf-gnu building.
>>
>> /home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/gcov.c:
>> In function â€˜loop_type handle_cycle(const arc_vector_t&, int64_t&)â€™:
>> /home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/gcov.c:468:25:
>> error: â€˜INT64_MAXâ€™ was not declared in this scope
>>
>>
>> See http://stackoverflow.com/questions/3233054/error-int32-max-was-not-declared-in-this-scope
>> .
>
> Fixed thusly, committed as obvious to trunk:
>
> 2016-08-06  Jakub Jelinek  <jakub@redhat.com>
>
> 	* gcov.c (handle_cycle): Use INTTYPE_MAXIMUM (int64_t) instead of
> 	INT64_MAX.
>
> --- gcc/gcov.c.jj	2016-08-06 12:11:51.000000000 +0200
> +++ gcc/gcov.c	2016-08-06 12:28:32.373898684 +0200
> @@ -465,7 +465,7 @@ handle_cycle (const arc_vector_t &edges,
>  {
>    /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
>       that amount.  */
> -  int64_t cycle_count = INT64_MAX;
> +  int64_t cycle_count = INTTYPE_MAXIMUM (int64_t);
>    for (unsigned i = 0; i < edges.size (); i++)
>      {
>        int64_t ecount = edges[i]->cs_count;
>
>
> 	Jakub
>

Thank you Jakub for the fixed and sorry to all affected by the breakage.

Martin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)
  2016-08-06 10:31                     ` Jakub Jelinek
  2016-08-07 21:33                       ` Martin Liška
@ 2016-08-08 12:47                       ` Nathan Sidwell
  1 sibling, 0 replies; 21+ messages in thread
From: Nathan Sidwell @ 2016-08-08 12:47 UTC (permalink / raw)
  To: Jakub Jelinek, Andrew Pinski
  Cc: Martin Liška, GCC Patches, Jan Hubicka, Pidgeot18

On 08/06/16 06:31, Jakub Jelinek wrote:

> Fixed thusly, committed as obvious to trunk:
>
> 2016-08-06  Jakub Jelinek  <jakub@redhat.com>
>
> 	* gcov.c (handle_cycle): Use INTTYPE_MAXIMUM (int64_t) instead of
> 	INT64_MAX.
>

thanks Jakub!


nathan

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2016-08-08 12:47 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-03 13:32 [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992) Martin Liška
2016-08-03 13:54 ` Richard Biener
2016-08-03 14:22 ` Nathan Sidwell
2016-08-04 10:39   ` [PATCH 1/2] Fix GNU coding style in gcov.c Martin Liška
2016-08-04 11:39     ` Nathan Sidwell
2016-08-04 10:41   ` [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992) Martin Liška
2016-08-04 13:15     ` Nathan Sidwell
2016-08-04 13:31       ` Jan Hubicka
2016-08-04 14:43       ` Martin Liška
2016-08-04 15:13         ` Nathan Sidwell
2016-08-04 16:10           ` Martin Liška
2016-08-04 16:52             ` Nathan Sidwell
2016-08-05  7:30               ` Martin Liška
2016-08-05  7:32                 ` Martin Liška
2016-08-05 12:25                   ` Nathan Sidwell
2016-08-05 15:27                   ` Andrew Pinski
2016-08-05 17:12                     ` Gerald Pfeifer
2016-08-05 20:39                     ` Michael Meissner
2016-08-06 10:31                     ` Jakub Jelinek
2016-08-07 21:33                       ` Martin Liška
2016-08-08 12:47                       ` Nathan Sidwell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).