public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Clustering switch cases
@ 2010-08-27 13:49 Paulo J. Matos
  2010-08-27 15:03 ` Ian Lance Taylor
       [not found] ` <D84AB18A31AC674EBE9B5300634A800601BF0B0932@EXCHANGEVS.IceraSemi.local>
  0 siblings, 2 replies; 12+ messages in thread
From: Paulo J. Matos @ 2010-08-27 13:49 UTC (permalink / raw)
  To: gcc

Hi,

I have been analysing the gcc4.4 code due to the way it's handling:
1  extern void f(const char *);
2  extern void g(int);
3
4  #define C(n) case n: f(#n); break
5
6  void g(int n)
7  {
8      switch(n)
9      {
10         C(0); C(1); C(2); C(3); C(4); C(5); C(6); C(7); C(8); C(9);
11         C(10); C(11); C(12); C(13); C(14); C(15); C(16); C(17);
C(18); C(19);
12         C(20); C(21); C(22); C(23); C(24); C(25); C(26); C(27);
C(28); C(29);
13
14         C(1000); C(1001); C(1002); C(1003); C(1004); C(1005);
C(1006); C(1007); C(1008); C(1009);
15     }
16 }

The interesting thing about this is that GCC generates much better code if I do:
1  extern void f(const char *);
2  extern void g(int);
3
4  #define C(n) case n: f(#n); break
5
6  void g(int n)
7  {
8      switch(n)
9      {
10         C(0); C(1); C(2); C(3); C(4); C(5); C(6); C(7); C(8); C(9);
11         C(10); C(11); C(12); C(13); C(14); C(15); C(16); C(17);
C(18); C(19);
12         C(20); C(21); C(22); C(23); C(24); C(25); C(26); C(27);
C(28); C(29);
13     }
14     switch(n)
15     {
16         C(1000); C(1001); C(1002); C(1003); C(1004); C(1005);
C(1006); C(1007); C(1008); C(1009);
17     }
18 }

In the first case, it generates a binary tree, and in the second two
jump tables. The jump tables solution is much more elegant (at least
in our situation), generating less code and being faster.
Now, what I am wondering is the reason why GCC doesn't try to cluster
the cases trying to find for clusters of contiguous values in the
switch.

If there is no specific reason then I would implement such pass, which
would before expansion split switches according to value clustering,
since I find it would be a good code improvement.

Currently GCC seems to only use jump table is the range of the switch
is not much bigger than its count, which works well in most cases
except when you have big switches with clusters of contiguous values
(like the first example I sent).

Any comments on this would be appreciated.

-- 
PMatos

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
  2010-08-27 13:49 Clustering switch cases Paulo J. Matos
@ 2010-08-27 15:03 ` Ian Lance Taylor
  2010-08-27 15:09   ` Paulo J. Matos
  2010-08-27 15:27   ` Richard Guenther
       [not found] ` <D84AB18A31AC674EBE9B5300634A800601BF0B0932@EXCHANGEVS.IceraSemi.local>
  1 sibling, 2 replies; 12+ messages in thread
From: Ian Lance Taylor @ 2010-08-27 15:03 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc

"Paulo J. Matos" <pocmatos@gmail.com> writes:

> In the first case, it generates a binary tree, and in the second two
> jump tables. The jump tables solution is much more elegant (at least
> in our situation), generating less code and being faster.
> Now, what I am wondering is the reason why GCC doesn't try to cluster
> the cases trying to find for clusters of contiguous values in the
> switch.
>
> If there is no specific reason then I would implement such pass, which
> would before expansion split switches according to value clustering,
> since I find it would be a good code improvement.
>
> Currently GCC seems to only use jump table is the range of the switch
> is not much bigger than its count, which works well in most cases
> except when you have big switches with clusters of contiguous values
> (like the first example I sent).

I don't know of any specific reason not to look for clusters of switch
cases.  The main issue would be the affect on compilation time.  If you
can do it with an algorithm which is linear in the number of cases, then
I think it would be an acceptable optimization.

Ian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
  2010-08-27 15:03 ` Ian Lance Taylor
@ 2010-08-27 15:09   ` Paulo J. Matos
  2010-08-27 15:27   ` Richard Guenther
  1 sibling, 0 replies; 12+ messages in thread
From: Paulo J. Matos @ 2010-08-27 15:09 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

On Fri, Aug 27, 2010 at 3:47 PM, Ian Lance Taylor <iant@google.com> wrote:
>
> I don't know of any specific reason not to look for clusters of switch
> cases.  The main issue would be the affect on compilation time.  If you
> can do it with an algorithm which is linear in the number of cases, then
> I think it would be an acceptable optimization.
>

Thanks. I will be working on it. I will let you know how it goes.

Cheers,
-- 
PMatos

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
  2010-08-27 15:03 ` Ian Lance Taylor
  2010-08-27 15:09   ` Paulo J. Matos
@ 2010-08-27 15:27   ` Richard Guenther
  2010-08-28  0:21     ` Xinliang David Li
  2010-08-28 11:08     ` Paulo J. Matos
  1 sibling, 2 replies; 12+ messages in thread
From: Richard Guenther @ 2010-08-27 15:27 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Paulo J. Matos, gcc

On Fri, Aug 27, 2010 at 4:47 PM, Ian Lance Taylor <iant@google.com> wrote:
> "Paulo J. Matos" <pocmatos@gmail.com> writes:
>
>> In the first case, it generates a binary tree, and in the second two
>> jump tables. The jump tables solution is much more elegant (at least
>> in our situation), generating less code and being faster.
>> Now, what I am wondering is the reason why GCC doesn't try to cluster
>> the cases trying to find for clusters of contiguous values in the
>> switch.
>>
>> If there is no specific reason then I would implement such pass, which
>> would before expansion split switches according to value clustering,
>> since I find it would be a good code improvement.
>>
>> Currently GCC seems to only use jump table is the range of the switch
>> is not much bigger than its count, which works well in most cases
>> except when you have big switches with clusters of contiguous values
>> (like the first example I sent).
>
> I don't know of any specific reason not to look for clusters of switch
> cases.  The main issue would be the affect on compilation time.  If you
> can do it with an algorithm which is linear in the number of cases, then
> I think it would be an acceptable optimization.

In fact we might want to move switch optimization up to the tree level
(just because it's way easier to deal with there).  Thus, lower switch
to a mixture of binary tree & jump-tables (possibly using perfect
hashing).

Richard.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
  2010-08-27 15:27   ` Richard Guenther
@ 2010-08-28  0:21     ` Xinliang David Li
  2010-08-28 11:08     ` Paulo J. Matos
  1 sibling, 0 replies; 12+ messages in thread
From: Xinliang David Li @ 2010-08-28  0:21 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Ian Lance Taylor, Paulo J. Matos, gcc

Another main thing missing is to consider profile information (if
available) so that most frequent cases can be peeled out.

David

On Fri, Aug 27, 2010 at 8:03 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Fri, Aug 27, 2010 at 4:47 PM, Ian Lance Taylor <iant@google.com> wrote:
>> "Paulo J. Matos" <pocmatos@gmail.com> writes:
>>
>>> In the first case, it generates a binary tree, and in the second two
>>> jump tables. The jump tables solution is much more elegant (at least
>>> in our situation), generating less code and being faster.
>>> Now, what I am wondering is the reason why GCC doesn't try to cluster
>>> the cases trying to find for clusters of contiguous values in the
>>> switch.
>>>
>>> If there is no specific reason then I would implement such pass, which
>>> would before expansion split switches according to value clustering,
>>> since I find it would be a good code improvement.
>>>
>>> Currently GCC seems to only use jump table is the range of the switch
>>> is not much bigger than its count, which works well in most cases
>>> except when you have big switches with clusters of contiguous values
>>> (like the first example I sent).
>>
>> I don't know of any specific reason not to look for clusters of switch
>> cases.  The main issue would be the affect on compilation time.  If you
>> can do it with an algorithm which is linear in the number of cases, then
>> I think it would be an acceptable optimization.
>
> In fact we might want to move switch optimization up to the tree level
> (just because it's way easier to deal with there).  Thus, lower switch
> to a mixture of binary tree & jump-tables (possibly using perfect
> hashing).
>
> Richard.
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
  2010-08-27 15:27   ` Richard Guenther
  2010-08-28  0:21     ` Xinliang David Li
@ 2010-08-28 11:08     ` Paulo J. Matos
  2010-08-31 15:59       ` Rahul Kharche
       [not found]       ` <201009010034.18924.paul@codesourcery.com>
  1 sibling, 2 replies; 12+ messages in thread
From: Paulo J. Matos @ 2010-08-28 11:08 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Ian Lance Taylor, gcc

On Fri, Aug 27, 2010 at 4:03 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
>
> In fact we might want to move switch optimization up to the tree level
> (just because it's way easier to deal with there).  Thus, lower switch
> to a mixture of binary tree & jump-tables (possibly using perfect
> hashing).
>

Doing the optimisation at the tree-level was exactly my initial idea.
By splitting the switches at the tree-level, before expand_case, would
then allow for expand_case to transform it either to a jump table or
binary tree depending on the situation.

I will be looking at the patch Rahul posted and will try to see if I
can improve on it.

-- 
PMatos

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: Clustering switch cases
  2010-08-28 11:08     ` Paulo J. Matos
@ 2010-08-31 15:59       ` Rahul Kharche
       [not found]       ` <201009010034.18924.paul@codesourcery.com>
  1 sibling, 0 replies; 12+ messages in thread
From: Rahul Kharche @ 2010-08-31 15:59 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc, sdkteam-gnu

[-- Attachment #1: Type: text/plain, Size: 320 bytes --]

> I will be looking at the patch Rahul posted and will try to see if I
> can improve on it.

See attached patch (again) that Paulo is referring to. Sending to GCC
failed due to email client issues.

I have another patch for http://gcc.gnu.org/ml/gcc/2010-08/msg00413.html
Which I will send out shortly.

Rahul

[-- Attachment #2: switch_case_clusters.patch --]
[-- Type: application/octet-stream, Size: 29885 bytes --]

	* stmt.c (struct case_node): Added new member table_set.
	(max_initialized): New flag.
	(emit_case_nodes): Add new tree argument, handle case cclusters
	separately.
	(emit_case_table): New function, factored out of expand_case.
	(scan_for_case_clusters): New.
	(dump_case_nodes): New.
	(check_tree): New.
	(check_tree_1): New, called by check_tree.
	(expand_case): Call scan_for_case_clusters, call emit_case_table.
	(balance_case_nodes): Re-balance tree based on clusters identified
	by scan_for_case_clusters.


--- stmt.c	2009-04-26 19:53:41.000000000 +0100
+++ stmt.c	2010-08-28 01:36:54.000000000 +0100
@@ -85,6 +85,7 @@ struct case_node
   struct case_node	*left;	/* Left son in binary tree */
   struct case_node	*right;	/* Right son in binary tree; also node chain */
   struct case_node	*parent; /* Parent of node in binary tree */
+  int			table_set; /* 0 means binary test, else grouped by val */
   tree			low;	/* Lowest index value for this label */
   tree			high;	/* Highest index value for this label */
   tree			code_label; /* Label to jump to when node matches */
@@ -99,6 +100,7 @@ typedef struct case_node *case_node_ptr;
 static short cost_table_[129];
 static int use_cost_table;
 static int cost_table_initialized;
+static int max_initialized;
 
 /* Special care is needed because we allow -1, but TREE_INT_CST_LOW
    is unsigned.  */
@@ -120,9 +122,15 @@ static void balance_case_nodes (case_nod
 static int node_has_low_bound (case_node_ptr, tree);
 static int node_has_high_bound (case_node_ptr, tree);
 static int node_is_bounded (case_node_ptr, tree);
-static void emit_case_nodes (rtx, case_node_ptr, rtx, tree);
+static int emit_case_nodes (rtx, tree, case_node_ptr, rtx, tree);
 static struct case_node *add_case_node (struct case_node *, tree,
                                         tree, tree, tree, alloc_pool);
+static void emit_case_table (tree, case_node_ptr, rtx, rtx, tree);
+static void scan_for_case_clusters (case_node *, tree);
+static void dump_case_nodes (struct case_node *, int);
+static int check_tree (case_node_ptr, int);
+static int check_tree_1 (case_node_ptr, unsigned long long int*,
+			 long long int*, int unsignedp);
 
 \f
 /* Return the rtx-label that corresponds to a LABEL_DECL,
@@ -2150,6 +2158,23 @@ emit_case_bit_tests (tree index_type, tr
 #define HAVE_tablejump 0
 #endif
 
+static void
+dump_case_nodes(struct case_node *root, int indent)
+{
+  int h, l;
+  if (root == 0)
+    return;
+  dump_case_nodes (root->left, indent+3);
+  h = (int)TREE_INT_CST_LOW(root->high);
+  l = (int)TREE_INT_CST_LOW(root->low);
+  if (h == l)
+    fprintf (dump_file, "%*s%d", indent, "", l);
+  else
+    fprintf (dump_file, "%*s%d .. %d", indent, "", l, h);
+  fprintf (dump_file, " (%d)\n", root->table_set);
+  dump_case_nodes (root->right, indent+3);
+}
+
 /* Terminate a case (Pascal/Ada) or switch (C) statement
    in which ORIG_INDEX is the expression to be tested.
    If ORIG_TYPE is not NULL, it is the original ORIG_INDEX
@@ -2164,9 +2189,6 @@ expand_case (tree exp)
   struct case_node *n;
   unsigned int count, uniq;
   rtx index;
-  rtx table_label;
-  int ncases;
-  rtx *labelvec;
   int i;
   rtx before_case, end, lab;
 
@@ -2291,6 +2313,14 @@ expand_case (tree exp)
       /* Compute span of values.  */
       range = fold_build2 (MINUS_EXPR, index_type, maxval, minval);
 
+     if (dump_file)
+	 fprintf (dump_file, "range low is %ld unique cases is %d count is  %d pc rel %d\n",
+		  TREE_INT_CST_LOW (range),
+		  uniq,
+		  count,
+		  CASE_VECTOR_PC_RELATIVE);
+     
+
       /* Try implementing this switch statement by a short sequence of
 	 bit-wise comparisons.  However, we let the binary-tree case
 	 below handle constant index expressions.  */
@@ -2338,6 +2368,26 @@ expand_case (tree exp)
 	{
 	  index = expand_normal (index_expr);
 
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "Emitting Binary Decision Tree because ");
+	      if (count < case_values_threshold ())
+		fprintf (dump_file, " Count [%d] < case_values_threshold [%d] \n",
+			 count, case_values_threshold () );
+	      else
+		if (compare_tree_int (range,
+				      (optimize_insn_for_size_p () ? 3 : 10) * count) > 0 )
+		  {
+		    const char *msg = (optimize_insn_for_size_p () ? "3" : "10");
+		    fprintf (dump_file, "Range %ld > %s [%d] * count [%d] \n",
+			     TREE_INT_CST_LOW (range),
+			     msg,
+			     ((optimize_insn_for_size_p ()) ? 3 : 10),
+			     count);
+
+		  }
+	    }
+	
 	  /* If the index is a short or char that we do not have
 	     an insn to handle comparisons directly, convert it to
 	     a full integer now, rather than letting each comparison
@@ -2377,90 +2427,60 @@ expand_case (tree exp)
 	  use_cost_table
 	    = (TREE_CODE (orig_type) != ENUMERAL_TYPE
 	       && estimate_case_costs (case_list));
+
+	  if (dump_file)
+	  {
+	      fprintf (dump_file, "\n-----  tree before scanning clusters\n"); 
+	      dump_case_nodes (case_list, 0); 
+	  }
+
+	  scan_for_case_clusters (case_list, index_type);
+
+	  if (dump_file)
+	  {
+	      fprintf (dump_file, "\n-----  tree after scanning clusters and before balancing.\n"); 
+	      dump_case_nodes (case_list, 0); 
+	  }
+
 	  balance_case_nodes (&case_list, NULL);
-	  emit_case_nodes (index, case_list, default_label, index_type);
-	  if (default_label)
+	  
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "\n----- final tree\n"); 
+	      dump_case_nodes (case_list, 0); 
+	    }
+
+	  /* Check the Binary Decision Tree generated. */
+	  if (check_tree (case_list, unsignedp))
+	  {
+	      if (dump_file)
+		  fprintf (dump_file, "Bad Binary Search Tree created in %s\n",
+			   current_function_name ());
+	      error ("Bad Binary Search Tree. Check the dump by using -dq on the command line \n");
+	      
+	  }
+	  else
+	  {
+	      if (dump_file)
+		  fprintf (dump_file, "Generated Binary Search Tree verified to be correct. \n");
+	      max_initialized = 0;
+	  }
+	  
+	  if (emit_case_nodes (index, index_expr, case_list, default_label, index_type))
 	    emit_jump (default_label);
+
 	}
       else
 	{
 	  rtx fallback_label = label_rtx (case_list->code_label);
-	  table_label = gen_label_rtx ();
-	  if (! try_casesi (index_type, index_expr, minval, range,
-			    table_label, default_label, fallback_label))
-	    {
-	      bool ok;
-
-	      /* Index jumptables from zero for suitable values of
-                 minval to avoid a subtraction.  */
-	      if (optimize_insn_for_speed_p ()
-		  && compare_tree_int (minval, 0) > 0
-		  && compare_tree_int (minval, 3) < 0)
-		{
-		  minval = build_int_cst (index_type, 0);
-		  range = maxval;
-		}
-
-	      ok = try_tablejump (index_type, index_expr, minval, range,
-				  table_label, default_label);
-	      gcc_assert (ok);
-	    }
-
-	  /* Get table of labels to jump to, in order of case index.  */
-
-	  ncases = tree_low_cst (range, 0) + 1;
-	  labelvec = XALLOCAVEC (rtx, ncases);
-	  memset (labelvec, 0, ncases * sizeof (rtx));
-
-	  for (n = case_list; n; n = n->right)
-	    {
-	      /* Compute the low and high bounds relative to the minimum
-		 value since that should fit in a HOST_WIDE_INT while the
-		 actual values may not.  */
-	      HOST_WIDE_INT i_low
-		= tree_low_cst (fold_build2 (MINUS_EXPR, index_type,
-					     n->low, minval), 1);
-	      HOST_WIDE_INT i_high
-		= tree_low_cst (fold_build2 (MINUS_EXPR, index_type,
-					     n->high, minval), 1);
-	      HOST_WIDE_INT i;
-
-	      for (i = i_low; i <= i_high; i ++)
-		labelvec[i]
-		  = gen_rtx_LABEL_REF (Pmode, label_rtx (n->code_label));
-	    }
-
-	  /* Fill in the gaps with the default.  We may have gaps at
-	     the beginning if we tried to avoid the minval subtraction,
-	     so substitute some label even if the default label was
-	     deemed unreachable.  */
-	  if (!default_label)
-	    default_label = fallback_label;
-	  for (i = 0; i < ncases; i++)
-	    if (labelvec[i] == 0)
-	      labelvec[i] = gen_rtx_LABEL_REF (Pmode, default_label);
-
-	  /* Output the table.  */
-	  emit_label (table_label);
-
-	  if (CASE_VECTOR_PC_RELATIVE || flag_pic)
-	    emit_jump_insn (gen_rtx_ADDR_DIFF_VEC (CASE_VECTOR_MODE,
-						   gen_rtx_LABEL_REF (Pmode, table_label),
-						   gen_rtvec_v (ncases, labelvec),
-						   const0_rtx, const0_rtx));
-	  else
-	    emit_jump_insn (gen_rtx_ADDR_VEC (CASE_VECTOR_MODE,
-					      gen_rtvec_v (ncases, labelvec)));
-
-	  /* Record no drop-through after the table.  */
-	  emit_barrier ();
+	  emit_case_table (index_expr, case_list, default_label,
+			   fallback_label, index_type);
 	}
 
       before_case = NEXT_INSN (before_case);
       end = get_last_insn ();
       reorder_insns (before_case, end, start);
     }
-
   free_temp_slots ();
   free_alloc_pool (case_node_pool);
 }
@@ -2555,6 +2575,56 @@ estimate_case_costs (case_node_ptr node)
   return 1;
 }
 
+/* Scan a list of case values, group them into clusters which should
+   be emitted as tablejumps even when the overall case list indicates
+   a desicion tree.  */
+
+static void
+scan_for_case_clusters (case_node *list, tree index_type)
+{
+  case_node *n, *l, *last_in_cluster;
+  unsigned int count;
+  tree min_case;
+  int table_set = 0;
+
+  list->parent = 0;
+  for (n=list; n; n=n->right)
+    {
+      n->table_set = 0;
+      if (n->right)
+	n->right->parent = n;
+    }
+
+  for (n=list; n; n=n->right)
+    {
+      min_case = n->low;
+      count = 0;
+      last_in_cluster = 0;
+      for (l=n; l; l=l->right)
+	{
+	  count ++;
+	  if (count > case_values_threshold ()
+	      && compare_tree_int (fold (build2_stat (MINUS_EXPR, index_type, 
+						      l->high, min_case)),
+				   (int)(optimize_size ? 3 : 10) * count) < 0)
+	    last_in_cluster = l;
+	  else if (last_in_cluster)
+	    break;
+	}
+      if (last_in_cluster)
+	{
+	  table_set ++;
+	  for (l=n; l;)
+	    {
+	      l->table_set = table_set;
+	      if (l == last_in_cluster)
+		break;
+	      l = l->right;
+	    }
+	  n = last_in_cluster;
+	}
+    }
+}
 /* Take an ordered list of case nodes
    and transform them into a near optimal binary tree,
    on the assumption that any target code selection value is as
@@ -2651,13 +2721,164 @@ balance_case_nodes (case_node_ptr *head,
 		  npp = &(*npp)->right;
 		}
 	    }
+	  if ((*npp)->table_set)
+	    {
+	      /* We've selected a member of a dense cluster.  Pick one
+ 		 of the ends of the cluster so that we can later
+ 		 isolate the cluster and emit a table for it.  */
+	      int set = (*npp)->table_set;
+	      int near_start=0, near_end=0;
+	      case_node *start_node, *end_node;
+	      if (dump_file)	      
+	      fprintf (dump_file, "Searching with set %d\n",
+		      set);
+
+	      /* Find the first and last member of the cluster.  */
+	      for (start_node = *npp;
+		   (start_node->parent != parent)
+		     && start_node->parent
+		     && (start_node->parent->table_set == set);
+		   start_node = start_node->parent)
+		{
+		    if (dump_file)
+		    {
+			fprintf (dump_file, "start_node is %ld\n",
+				TREE_INT_CST_LOW (start_node->low));
+			fprintf (dump_file, "parent is %ld\n",
+				(parent ? TREE_INT_CST_LOW (parent->low) : -1));
+		    }
+		  near_start ++;
+		}
+
+	      for (end_node = *npp;
+		   end_node->right
+		     && (end_node->right->table_set == set);
+		   end_node = end_node->right)
+		near_end ++;
+
+	      if (dump_file)
+		  fprintf (dump_file, "start %ld dist %d end %ld dist %d\n",
+			   TREE_INT_CST_LOW (start_node->low), near_start,
+			   TREE_INT_CST_LOW (end_node->low), near_end);
+	      
+	      if (start_node->parent
+		  && (start_node->parent != parent)
+		  && (start_node->parent->table_set == 0))
+		{
+		    if (dump_file)
+			fprintf (dump_file, "Start node changes to a node with table set 0 \n");
+		    
+		  start_node = start_node->parent;
+		}
+	      if (dump_file)
+		  fprintf (dump_file, "start_node->parent %p (%ld) parent %p (%ld)\n",
+			  start_node->parent,
+			  (start_node->parent
+			   ? TREE_INT_CST_LOW (start_node->parent->low) : -1),
+			  parent,
+			  (parent ? TREE_INT_CST_LOW (parent->low) : -1));
+
+	      /* Watch out for end-of-list conditions.  */
+	      if (!start_node->parent || (start_node->parent == parent))
+		{
+		    if (dump_file)
+			fprintf (dump_file, "start node is made to be empty.\n");
+
+		  start_node = 0;
+		}
+	      if (!end_node->right)
+		{
+		    if (dump_file)
+			fprintf (dump_file, "end_node->right = empty\n");
+
+		  end_node = 0;
+		}
+	      
+
+	      /* This means that there's nothing left in the list but
+ 		 the cluster; we keep it as-is and later emit a
+ 		 tablejump for it.  */
+	      if ((!start_node && !end_node))
+		{
+		    if (dump_file)
+			fprintf (dump_file, "dj: nothing left but the table\n");
+
+		  return;
+		}
+
+	      /* Move the selection to either before or after this
+ 		 cluster, whichever is closer.  */
+	      if (!start_node || (near_end < near_start  &&
+				  (end_node && end_node->right)))
+		{
+		    if (dump_file)
+			fprintf (dump_file, "Setting new parent to end node -> right\n");
+		    npp = &(end_node->right);
+		}
+	      else if (!start_node)
+		{
+		    if (dump_file)
+			fprintf (dump_file, "setting new parent  to head %p %ld\n",*head,
+				TREE_INT_CST_LOW ((*head)->low));
+		    
+		  npp = head;
+		}
+	      else
+		{
+		    if (dump_file)
+			fprintf (dump_file, "setting new parent to start_node->parent->right\n");
+
+		  npp = &(start_node->parent->right);
+		}
+
+	      if (dump_file)
+		  fprintf (dump_file, "dj: split moved to %ld\n", TREE_INT_CST_LOW ((*npp)->low));
+	      /* FIXME:: Added this to correct case. */
+
+	      if (left)
+		left->parent  = NULL;
+
+	    }
 	  *head = np = *npp;
 	  *npp = 0;
 	  np->parent = parent;
 	  np->left = left;
+	  left->parent = np;
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "Parent of left subtree is %p 0x%lx \n",
+		       left->parent,
+		       (left->parent) ? TREE_INT_CST_LOW (left->parent->low) : -1);
+	      
+	      if (left->parent && left->parent->table_set)
+		  fprintf (dump_file, "Caution: A cluster has been broken at 0x%lx\n",
+			   (left->parent) ? TREE_INT_CST_LOW (left->parent->low) : -1);
+		  
+
+	      fprintf (dump_file, "Setting left subtree to %p(0x%lx)\n",
+		      left, (left ? TREE_INT_CST_LOW (left->low) : -1));
+	      
+	      fprintf (dump_file, "Right is set to %p(%lx)\n",np->right,
+		      (np->right ? TREE_INT_CST_LOW (np->right->low) : -1));
+	      
+	       
+	    }
+	  if (dump_file)
+	  {
+	      fprintf (dump_file, "details about np before balancing left\n");
+	      fprintf (dump_file, "np = 0x%lx, np->parent = 0x%lx, ",
+		       TREE_INT_CST_LOW (np->low),
+		       np->parent ? TREE_INT_CST_LOW (np->parent->low) : -1);
+	      fprintf (dump_file, "np->left->parent = 0x%lx\n",
+		       np->left->parent ? TREE_INT_CST_LOW (np->left->parent->low) : -1);
+	      fprintf (dump_file, "balancing left\n");
+	       
+	  }
 
 	  /* Optimize each of the two split parts.  */
 	  balance_case_nodes (&np->left, np);
+	  if (dump_file)
+	      fprintf (dump_file, "balancing right\n");
 	  balance_case_nodes (&np->right, np);
 	}
       else
@@ -2769,6 +2990,92 @@ node_has_high_bound (case_node_ptr node,
   return 0;
 }
 
+static void
+emit_case_table (tree index_expr, case_node_ptr node, rtx default_label,
+		 rtx fallback_label, tree index_type)
+{
+  rtx table_label;
+  case_node *n;
+  int ncases;
+  rtx *labelvec;
+  int i;
+  tree minval, maxval, range;
+
+  for (n=node; n->right; n=n->right);
+
+  minval = node->low;
+  maxval = n->high;
+  range = fold_build2 (MINUS_EXPR, index_type, maxval, minval);
+
+
+  table_label = gen_label_rtx ();
+  if (! try_casesi (index_type, index_expr, minval, range,
+		    table_label, default_label, fallback_label))
+    {
+      bool ok;
+      
+      /* Index jumptables from zero for suitable values of
+	 minval to avoid a subtraction.  */
+      if (! optimize_size
+	  && compare_tree_int (minval, 0) > 0
+	  && compare_tree_int (minval, 3) < 0)
+	{
+	  minval = build_int_cst (index_type, 0);
+	  range = maxval;
+	}
+      
+      ok = try_tablejump (index_type, index_expr, minval, range,
+			  table_label, default_label);
+      gcc_assert (ok);
+    }
+  
+  /* Get table of labels to jump to, in order of case index.  */
+  
+  ncases = tree_low_cst (range, 0) + 1;
+  labelvec = alloca (ncases * sizeof (rtx));
+  memset (labelvec, 0, ncases * sizeof (rtx));
+  
+  for (n = node; n; n = n->right)
+    {
+      /* Compute the low and high bounds relative to the minimum
+	 value since that should fit in a HOST_WIDE_INT while the
+	 actual values may not.  */
+      HOST_WIDE_INT i_low
+	= tree_low_cst (fold_build2 (MINUS_EXPR, index_type,
+				     n->low, minval), 1);
+      HOST_WIDE_INT i_high
+	= tree_low_cst (fold_build2 (MINUS_EXPR, index_type,
+				     n->high, minval), 1);
+      HOST_WIDE_INT i;
+      
+      for (i = i_low; i <= i_high; i ++)
+	labelvec[i]
+	  = gen_rtx_LABEL_REF (Pmode, label_rtx (n->code_label));
+    }
+  
+  /* Fill in the gaps with the default.  */
+  for (i = 0; i < ncases; i++)
+    if (labelvec[i] == 0)
+	      labelvec[i] = gen_rtx_LABEL_REF (Pmode, default_label);
+  
+  
+  /* Output the table.  */
+  emit_label (table_label);
+  
+  if (CASE_VECTOR_PC_RELATIVE || flag_pic)
+    emit_jump_insn (gen_rtx_ADDR_DIFF_VEC (CASE_VECTOR_MODE,
+					   gen_rtx_LABEL_REF (Pmode, table_label),
+					   gen_rtvec_v (ncases, labelvec),
+					   const0_rtx, const0_rtx));
+  else
+    emit_jump_insn (gen_rtx_ADDR_VEC (CASE_VECTOR_MODE,
+				      gen_rtvec_v (ncases, labelvec)));
+  
+  /* Record no drop-through after the table.  */
+  emit_barrier ();
+  
+}
+
 /* Search the parent sections of the
    case node tree to see if both tests for the upper and lower
    bounds of NODE would be redundant.  */
@@ -2806,18 +3113,47 @@ node_is_bounded (case_node_ptr node, tre
    the possibility of a value greater than 51.  If another parent
    tests for the value 50, then this node need not test anything.  */
 
-static void
-emit_case_nodes (rtx index, case_node_ptr node, rtx default_label,
+static int
+emit_case_nodes (rtx index, tree index_expr, case_node_ptr node, rtx default_label,
 		 tree index_type)
 {
   /* If INDEX has an unsigned type, we must make unsigned branches.  */
   int unsignedp = TYPE_UNSIGNED (index_type);
+  int ret_val = 1;
   enum machine_mode mode = GET_MODE (index);
   enum machine_mode imode = TYPE_MODE (index_type);
 
   /* Handle indices detected as constant during RTL expansion.  */
   if (mode == VOIDmode)
     mode = imode;
+  /* If node->table_set is nonzero, then that node and all the
+     node->right in the chain represent a cluster of case nodes which
+     are appropriate for a tablejump.  Note that node->left may also
+     have a cluster, so test for that too.  */
+  if (node->table_set != 0 && 
+      node->right && 
+      node->table_set == node->right->table_set)
+    {
+      if (node->left)
+	{
+	  tree test_label = build_decl (LABEL_DECL, NULL_TREE, NULL_TREE);
+
+	  /* See if the value is on the right.  */
+	  emit_cmp_and_jump_insns (index,
+				   convert_modes
+				   (mode, imode,
+				    expand_expr (node->low, NULL_RTX,
+						 VOIDmode, 0),
+				    unsignedp),
+				   GE, NULL_RTX, mode, unsignedp,
+				   label_rtx (test_label));
+	  ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+				     index_type);
+	  expand_label (test_label);
+	}
+      emit_case_table (index_expr, node, default_label, default_label, index_type);
+      return 0;
+    }
 
   /* See if our parents have already tested everything for us.
      If they have, emit an unconditional jump for this node.  */
@@ -2852,7 +3188,8 @@ emit_case_nodes (rtx index, case_node_pt
 					unsignedp),
 				       GT, NULL_RTX, mode, unsignedp,
 				       label_rtx (node->right->code_label));
-	      emit_case_nodes (index, node->left, default_label, index_type);
+          ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+				     index_type);
 	    }
 
 	  else if (node_is_bounded (node->left, index_type))
@@ -2864,7 +3201,8 @@ emit_case_nodes (rtx index, case_node_pt
 					unsignedp),
 				       LT, NULL_RTX, mode, unsignedp,
 				       label_rtx (node->left->code_label));
-	      emit_case_nodes (index, node->right, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->right, default_label,
+					 index_type);
 	    }
 
 	  /* If both children are single-valued cases with no
@@ -2915,17 +3253,19 @@ emit_case_nodes (rtx index, case_node_pt
 				       GT, NULL_RTX, mode, unsignedp,
 				       label_rtx (test_label));
 
-	      /* Value must be on the left.
+           /* Value must be on the left.
 		 Handle the left-hand subtree.  */
-	      emit_case_nodes (index, node->left, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+	                                 index_type);
 	      /* If left-hand subtree does nothing,
 		 go to default.  */
-	      if (default_label)
-	        emit_jump (default_label);
+	      if (ret_val && default_label)
+		emit_jump (default_label);
 
 	      /* Code branches here for the right-hand subtree.  */
 	      expand_label (test_label);
-	      emit_case_nodes (index, node->right, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->right, default_label,
+	                                 index_type);
 	    }
 	}
 
@@ -2952,7 +3292,8 @@ emit_case_nodes (rtx index, case_node_pt
 					   default_label);
 		}
 
-	      emit_case_nodes (index, node->right, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->right, default_label,
+					 index_type);
 	    }
 	  else
 	    /* We cannot process node->right normally
@@ -2983,7 +3324,8 @@ emit_case_nodes (rtx index, case_node_pt
 					   default_label);
 		}
 
-	      emit_case_nodes (index, node->left, default_label, index_type);
+	     ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+	                                index_type);
 	    }
 	  else
 	    /* We cannot process node->left normally
@@ -3048,7 +3390,7 @@ emit_case_nodes (rtx index, case_node_pt
 				   label_rtx (node->code_label));
 
 	  /* Handle the left-hand subtree.  */
-	  emit_case_nodes (index, node->left, default_label, index_type);
+	  ret_val = emit_case_nodes (index, index_expr, node->left, default_label, index_type);
 
 	  /* If right node had to be handled later, do that now.  */
 
@@ -3056,11 +3398,11 @@ emit_case_nodes (rtx index, case_node_pt
 	    {
 	      /* If the left-hand subtree fell through,
 		 don't let it fall into the right-hand subtree.  */
-	      if (default_label)
-		emit_jump (default_label);
+	      if (ret_val && default_label)
+	      emit_jump (default_label);
 
 	      expand_label (test_label);
-	      emit_case_nodes (index, node->right, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->right, default_label, index_type);
 	    }
 	}
 
@@ -3089,7 +3431,7 @@ emit_case_nodes (rtx index, case_node_pt
 				   LE, NULL_RTX, mode, unsignedp,
 				   label_rtx (node->code_label));
 
-	  emit_case_nodes (index, node->right, default_label, index_type);
+      ret_val =  emit_case_nodes (index, index_expr, node->right, default_label, index_type);
 	}
 
       else if (node->right == 0 && node->left != 0)
@@ -3117,7 +3459,8 @@ emit_case_nodes (rtx index, case_node_pt
 				   GE, NULL_RTX, mode, unsignedp,
 				   label_rtx (node->code_label));
 
-	  emit_case_nodes (index, node->left, default_label, index_type);
+	  ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+	                             index_type);
 	}
 
       else
@@ -3174,4 +3517,147 @@ emit_case_nodes (rtx index, case_node_pt
 	  emit_jump (label_rtx (node->code_label));
 	}
     }
+
+  return ret_val;
 }
+
+/* Subroutine of check_tree.
+   Performs inorder traversal of the tree rooted at NODE. MAX is the last
+   value that would have been printed, if we were to print nodes ( inorder
+   traversal). Check if MAX is greater than node->high, if Yes, Return 1
+   indicating a bad Binary Search Tree. Check Whether the parent
+   pointer of each child points to node. If not, Return 1, indicating a
+   bad Tree. Return 0, If the tree is correct. */
+static
+int check_tree_1 (case_node_ptr node, unsigned long long int *maxu, long long int* maxi, int unsignedp)
+{
+    unsigned long long int hu;
+    long long int hi;
+
+    if (! node)
+	return 0;
+
+    /* check the left subtree */
+    if (check_tree_1 (node->left, maxu, maxi, unsignedp))
+	return 1;
+
+    if (unsignedp)
+      {
+        unsigned int hi_part = TREE_INT_CST_HIGH (node->high);
+        unsigned int lo_part = TREE_INT_CST_LOW (node->high);
+
+        hu = ((unsigned long long int)hi_part << 32) | lo_part;   
+      }
+    else
+      {
+        int hi_part = TREE_INT_CST_HIGH (node->high);
+        int lo_part = TREE_INT_CST_LOW (node->high);
+
+        hi = ((long long int)hi_part << 32) | lo_part;          
+      }
+
+    /* Check if the tree is correct in terms of the values of the nodes. */
+    if (!max_initialized)
+    {
+	/* We have reached the leftmost node of the tree. */
+	if (unsignedp)
+	    *maxu = hu;
+	else
+	    *maxi = hi;
+	max_initialized = 1;
+    }
+    else  if ((unsignedp && (*maxu < hu)) || (!unsignedp && (*maxi < hi)))
+    {
+	if (unsignedp)
+	    *maxu = hu;
+	else
+	    *maxi = hi;
+    }
+    else
+    {
+	if (dump_file)
+	{
+	    if (unsignedp)
+	    {
+		fprintf (dump_file, "hu = 0x%llx, hu = %llu (unsigned), *maxu = 0x%llx, *maxu (decimal) = 0x%llu (unsigned)\n",
+			 hu, hu, *maxu, *maxu );
+		fprintf (dump_file, "PROBLEM: maxu is greater than hu, when it should not be so.\n" );
+	    }
+	    else
+	    {
+		fprintf (dump_file, "hi = 0x%llx, hi = %lld (signed), *maxi = 0x%llx, *maxi (decimal) = 0x%llx (signed)\n",
+			 hi, hi, *maxi, *maxi );
+		fprintf (dump_file, "PROBLEM: maxi is greater than hi, when it should not be so.\n" );
+	    }
+	}
+	return 1;
+    }
+    /* Check to see if the pointers in each node are sane */
+    if ((node->left && (!(node->left->parent == node))) || (node->right && (!(node->right->parent == node))))
+    {
+	if (dump_file)
+	{
+	    if (unsignedp)
+	    {
+		fprintf (dump_file, "PROBLEM: The parent pointers of children do not point to the correct" 
+			 "pointer for the following\n");
+		fprintf (dump_file, "Parent (Hex) = 0x%lx, Parent (Decimal) = %lu (unsigned)\n", TREE_INT_CST_LOW (node->low),
+			 TREE_INT_CST_LOW (node->low));
+		fprintf (dump_file, "Left Child (Hex) = 0x%lx, Left Child (Decimal) = %lu (unsigned)\n",
+			 node->left?TREE_INT_CST_LOW (node->left->low):-1,node->left?TREE_INT_CST_LOW (node->left->low):-1);
+		fprintf (dump_file, "Left Child's Parent (Hex) = 0x%lx, Left Child's Parent (Decimal) = %lu (unsigned)\n",
+			 node->left?node->left->parent?TREE_INT_CST_LOW (node->left->parent->low):-1:-1,
+			 node->left?node->left->parent?TREE_INT_CST_LOW (node->left->parent->low):-1:-1);
+		fprintf (dump_file, "Right Child (Hex) = 0x%lx, Right Child (Decimal) = %lu (unsigned)\n",
+			 node->right?TREE_INT_CST_LOW (node->right->low):-1,node->right?TREE_INT_CST_LOW (node->right->low):-1);
+		fprintf (dump_file, "Right Child's Parent (Hex) = 0x%lx, Right Child's Parent (Decimal) = %lu (unsigned)\n",
+			 node->right?node->right->parent?TREE_INT_CST_LOW (node->right->parent->low):-1:-1,
+			 node->right?node->right->parent?TREE_INT_CST_LOW (node->right->parent->low):-1:-1);
+		fprintf (dump_file, "If the value of a node is shown to be -1 then it means that the case value is actually"
+			 " -1 or that the pointer points to nothing i.e is NULL\n");
+	    }
+	    else
+	    {
+
+		fprintf (dump_file, "PROBLEM: The parent pointers of children do not point to the correct" 
+			 "pointer for the following\n");
+		fprintf (dump_file, "Parent (Hex) = 0x%lx, Parent (Decimal) = %ld (signed)\n", TREE_INT_CST_LOW (node->low),
+			 TREE_INT_CST_LOW (node->low));
+		fprintf (dump_file, "Left Child (Hex) = 0x%lx, Left Child (Decimal) = %ld (signed)\n",
+			 node->left?TREE_INT_CST_LOW (node->left->low):-1,node->left?TREE_INT_CST_LOW (node->left->low):-1);
+		fprintf (dump_file, "Left Child's Parent (Hex) = 0x%lx, Left Child's Parent (Decimal) = %ld (signed)\n",
+			 node->left?node->left->parent?TREE_INT_CST_LOW (node->left->parent->low):-1:-1,
+			 node->left?node->left->parent?TREE_INT_CST_LOW (node->left->parent->low):-1:-1);
+		fprintf (dump_file, "Right Child (Hex) = 0x%lx, Right Child (Decimal) = %ld (signed)\n",
+			 node->right?TREE_INT_CST_LOW (node->right->low):-1,node->right?TREE_INT_CST_LOW (node->right->low):-1);
+		fprintf (dump_file, "Right Child's Parent (Hex) = 0x%lx, Right Child's Parent (Decimal) = %ld (signed)\n",
+			 node->right?node->right->parent?TREE_INT_CST_LOW (node->right->parent->low):-1:-1,
+			 node->right?node->right->parent?TREE_INT_CST_LOW (node->right->parent->low):-1:-1);
+		fprintf (dump_file, "If the value of a node is shown to be -1 then it means that the case value is actually"
+			 " -1 or that the pointer points to nothing i.e is NULL\n");
+	    }
+	}
+		
+	return 1;
+    }
+    
+/* Check the right subtree. */
+    if (check_tree_1 (node->right, maxu, maxi, unsignedp))
+	return 1;
+
+    return 0;
+}
+
+
+static
+int check_tree (case_node_ptr head, int unsignedp)
+{
+    unsigned long long int max_u_inorder=0;
+    long long int max_i_inorder = 0;
+/* max_initialized is set to 1 when we reach the leftmost node of the tree. */
+    max_initialized = 0;
+    if (dump_file)
+	fprintf (dump_file, "Checking the generated BST\n");
+    return check_tree_1 (head, &max_u_inorder, &max_i_inorder, unsignedp);
+}
+

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
       [not found]       ` <201009010034.18924.paul@codesourcery.com>
@ 2010-09-01  8:39         ` Richard Guenther
  2010-09-01  9:15           ` Paulo J. Matos
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Guenther @ 2010-09-01  8:39 UTC (permalink / raw)
  To: Paul Brook; +Cc: gcc, Paulo J. Matos, Ian Lance Taylor

On Wed, Sep 1, 2010 at 1:34 AM, Paul Brook <paul@codesourcery.com> wrote:
>> > In fact we might want to move switch optimization up to the tree level
>> > (just because it's way easier to deal with there).  Thus, lower switch
>> > to a mixture of binary tree & jump-tables (possibly using perfect
>> > hashing).
>>
>> Doing the optimisation at the tree-level was exactly my initial idea.
>> By splitting the switches at the tree-level, before expand_case, would
>> then allow for expand_case to transform it either to a jump table or
>> binary tree depending on the situation.
>
> I'd kinda hope that doing the optimization at the tree level means expand_case
> doesn't have to handle both types.  The tree code converts sparse case ranges
> to explicit conditionals (or a switch on a compact perfect hash), so anything
> left to RTL expansion must be a jump table.

Yes, that was my idea.

Richard.

> Paul
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
  2010-09-01  8:39         ` Richard Guenther
@ 2010-09-01  9:15           ` Paulo J. Matos
  2010-09-01  9:33             ` Richard Guenther
  0 siblings, 1 reply; 12+ messages in thread
From: Paulo J. Matos @ 2010-09-01  9:15 UTC (permalink / raw)
  To: gcc

Richard Guenther <richard.guenther@gmail.com> writes:

>>
>> I'd kinda hope that doing the optimization at the tree level means expand_case
>> doesn't have to handle both types.  The tree code converts sparse case ranges
>> to explicit conditionals (or a switch on a compact perfect hash), so anything
>> left to RTL expansion must be a jump table.
>

You are right, I haven't thought about it but it does make sense. If I
want to provide a patch for this, would it be ok if its a patch against
gcc 4.4.4, or would it be better if its against the latest 4.5.1? Is
there any policy regarding this?

> Yes, that was my idea.
>

Good! I will get to work!

Cheers,
-- 
PMatos

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
  2010-09-01  9:15           ` Paulo J. Matos
@ 2010-09-01  9:33             ` Richard Guenther
  2010-09-01 12:37               ` Paulo J. Matos
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Guenther @ 2010-09-01  9:33 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc

On Wed, Sep 1, 2010 at 11:11 AM, Paulo J. Matos <pocmatos@gmail.com> wrote:
> Richard Guenther <richard.guenther@gmail.com> writes:
>
>>>
>>> I'd kinda hope that doing the optimization at the tree level means expand_case
>>> doesn't have to handle both types.  The tree code converts sparse case ranges
>>> to explicit conditionals (or a switch on a compact perfect hash), so anything
>>> left to RTL expansion must be a jump table.
>>
>
> You are right, I haven't thought about it but it does make sense. If I
> want to provide a patch for this, would it be ok if its a patch against
> gcc 4.4.4, or would it be better if its against the latest 4.5.1? Is
> there any policy regarding this?

Patches should be submitted against SVN trunk head as this is a new
feature and thus will not go on the 4.4 or 4.5 branches.

Richard.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Clustering switch cases
  2010-09-01  9:33             ` Richard Guenther
@ 2010-09-01 12:37               ` Paulo J. Matos
  0 siblings, 0 replies; 12+ messages in thread
From: Paulo J. Matos @ 2010-09-01 12:37 UTC (permalink / raw)
  To: gcc

Richard Guenther <richard.guenther@gmail.com> writes:

>
> Patches should be submitted against SVN trunk head as this is a new
> feature and thus will not go on the 4.4 or 4.5 branches.

Sure. Thanks.
-- 
PMatos

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: Clustering switch cases
       [not found]       ` <AANLkTinkjc4Y9baHXXK7v3ibQNwMtwGhA_oGmaFbykkA@mail.gmail.com>
@ 2010-09-03 10:29         ` Rahul Kharche
  0 siblings, 0 replies; 12+ messages in thread
From: Rahul Kharche @ 2010-09-03 10:29 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: sdkteam-gnu, gcc

[-- Attachment #1: Type: text/plain, Size: 807 bytes --]

> I have been working on your patch but I didn't manage to get it
> working yet. Unfortunately our current stable GCC version if 4.3.4 and
> our 4.4.x still has some issues.
> I tried backporting your patch to GCC 4.3.4 but it introduced several
> regressions.  It's my guess this is due to my incorrect manual
> patching of 4.3.
> Before I go through the regressions on our GCC 4.3 and try to fix the
> patching, do you have by any change a patch against 4.3.x?

Can you try the attached patch again with GCC 4.4.1 and let me know if it
fixes your regressions. There are related changes to how basic blocks are
split that may needs changing when using switch clustering. The changes
to stmt.c are the same as before. The additional changes are to cfgrtl.c
and cfgbuild.c.

Cheers,
Rahul


[-- Attachment #2: switch_case_clusters.patch --]
[-- Type: application/octet-stream, Size: 34243 bytes --]

2007-07-04  DJ Delorie  <dj@redhat.com>
	    Pranav Bhandarkar  <pranav.bhandarkar@celunite.com>
	    Ramana Radhakrishnan  <ramana@icerasemi.com>
	    Rahul Kharche  <rahul@icerasemi.com>

	* stmt.c (struct case_node): Added new member table_set.
	(max_initialized): New flag.
	(emit_case_nodes): Add new tree argument, handle case cclusters
	separately.
	(emit_case_table): New function, factored out of expand_case.
	(scan_for_case_clusters): New.
	(dump_case_nodes): New.
	(check_tree): New.
	(check_tree_1): New, called by check_tree.
	(expand_case): Call scan_for_case_clusters, call emit_case_table.
	(balance_case_nodes): Re-balance tree based on clusters identified
	by scan_for_case_clusters.
	* cfgrtl.c (jump_table_in_between): New.
	(find_next_label_after_jump_table): New.
	(jump_table_ok_to_skip): New.
	(rtl_split_block): Call above to split after jumptable.
	* cfgbuild.c (find_bb_boundaries): Grab new head of basic block
	after splitting.

--- stmt.c	2009-04-26 19:53:41.000000000 +0100
+++ stmt.c	2010-08-28 01:36:54.000000000 +0100
@@ -85,6 +85,7 @@ struct case_node
   struct case_node	*left;	/* Left son in binary tree */
   struct case_node	*right;	/* Right son in binary tree; also node chain */
   struct case_node	*parent; /* Parent of node in binary tree */
+  int			table_set; /* 0 means binary test, else grouped by val */
   tree			low;	/* Lowest index value for this label */
   tree			high;	/* Highest index value for this label */
   tree			code_label; /* Label to jump to when node matches */
@@ -99,6 +100,7 @@ typedef struct case_node *case_node_ptr;
 static short cost_table_[129];
 static int use_cost_table;
 static int cost_table_initialized;
+static int max_initialized;
 
 /* Special care is needed because we allow -1, but TREE_INT_CST_LOW
    is unsigned.  */
@@ -120,9 +122,15 @@ static void balance_case_nodes (case_nod
 static int node_has_low_bound (case_node_ptr, tree);
 static int node_has_high_bound (case_node_ptr, tree);
 static int node_is_bounded (case_node_ptr, tree);
-static void emit_case_nodes (rtx, case_node_ptr, rtx, tree);
+static int emit_case_nodes (rtx, tree, case_node_ptr, rtx, tree);
 static struct case_node *add_case_node (struct case_node *, tree,
                                         tree, tree, tree, alloc_pool);
+static void emit_case_table (tree, case_node_ptr, rtx, rtx, tree);
+static void scan_for_case_clusters (case_node *, tree);
+static void dump_case_nodes (struct case_node *, int);
+static int check_tree (case_node_ptr, int);
+static int check_tree_1 (case_node_ptr, unsigned long long int*,
+			 long long int*, int unsignedp);
 
 \f
 /* Return the rtx-label that corresponds to a LABEL_DECL,
@@ -2150,6 +2158,23 @@ emit_case_bit_tests (tree index_type, tr
 #define HAVE_tablejump 0
 #endif
 
+static void
+dump_case_nodes(struct case_node *root, int indent)
+{
+  int h, l;
+  if (root == 0)
+    return;
+  dump_case_nodes (root->left, indent+3);
+  h = (int)TREE_INT_CST_LOW(root->high);
+  l = (int)TREE_INT_CST_LOW(root->low);
+  if (h == l)
+    fprintf (dump_file, "%*s%d", indent, "", l);
+  else
+    fprintf (dump_file, "%*s%d .. %d", indent, "", l, h);
+  fprintf (dump_file, " (%d)\n", root->table_set);
+  dump_case_nodes (root->right, indent+3);
+}
+
 /* Terminate a case (Pascal/Ada) or switch (C) statement
    in which ORIG_INDEX is the expression to be tested.
    If ORIG_TYPE is not NULL, it is the original ORIG_INDEX
@@ -2164,9 +2189,6 @@ expand_case (tree exp)
   struct case_node *n;
   unsigned int count, uniq;
   rtx index;
-  rtx table_label;
-  int ncases;
-  rtx *labelvec;
   int i;
   rtx before_case, end, lab;
 
@@ -2291,6 +2313,14 @@ expand_case (tree exp)
       /* Compute span of values.  */
       range = fold_build2 (MINUS_EXPR, index_type, maxval, minval);
 
+     if (dump_file)
+	 fprintf (dump_file, "range low is %ld unique cases is %d count is  %d pc rel %d\n",
+		  TREE_INT_CST_LOW (range),
+		  uniq,
+		  count,
+		  CASE_VECTOR_PC_RELATIVE);
+     
+
       /* Try implementing this switch statement by a short sequence of
 	 bit-wise comparisons.  However, we let the binary-tree case
 	 below handle constant index expressions.  */
@@ -2338,6 +2368,26 @@ expand_case (tree exp)
 	{
 	  index = expand_normal (index_expr);
 
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "Emitting Binary Decision Tree because ");
+	      if (count < case_values_threshold ())
+		fprintf (dump_file, " Count [%d] < case_values_threshold [%d] \n",
+			 count, case_values_threshold () );
+	      else
+		if (compare_tree_int (range,
+				      (optimize_insn_for_size_p () ? 3 : 10) * count) > 0 )
+		  {
+		    const char *msg = (optimize_insn_for_size_p () ? "3" : "10");
+		    fprintf (dump_file, "Range %ld > %s [%d] * count [%d] \n",
+			     TREE_INT_CST_LOW (range),
+			     msg,
+			     ((optimize_insn_for_size_p ()) ? 3 : 10),
+			     count);
+
+		  }
+	    }
+	
 	  /* If the index is a short or char that we do not have
 	     an insn to handle comparisons directly, convert it to
 	     a full integer now, rather than letting each comparison
@@ -2377,90 +2427,60 @@ expand_case (tree exp)
 	  use_cost_table
 	    = (TREE_CODE (orig_type) != ENUMERAL_TYPE
 	       && estimate_case_costs (case_list));
+
+	  if (dump_file)
+	  {
+	      fprintf (dump_file, "\n-----  tree before scanning clusters\n"); 
+	      dump_case_nodes (case_list, 0); 
+	  }
+
+	  scan_for_case_clusters (case_list, index_type);
+
+	  if (dump_file)
+	  {
+	      fprintf (dump_file, "\n-----  tree after scanning clusters and before balancing.\n"); 
+	      dump_case_nodes (case_list, 0); 
+	  }
+
 	  balance_case_nodes (&case_list, NULL);
-	  emit_case_nodes (index, case_list, default_label, index_type);
-	  if (default_label)
+	  
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "\n----- final tree\n"); 
+	      dump_case_nodes (case_list, 0); 
+	    }
+
+	  /* Check the Binary Decision Tree generated. */
+	  if (check_tree (case_list, unsignedp))
+	  {
+	      if (dump_file)
+		  fprintf (dump_file, "Bad Binary Search Tree created in %s\n",
+			   current_function_name ());
+	      error ("Bad Binary Search Tree. Check the dump by using -dq on the command line \n");
+	      
+	  }
+	  else
+	  {
+	      if (dump_file)
+		  fprintf (dump_file, "Generated Binary Search Tree verified to be correct. \n");
+	      max_initialized = 0;
+	  }
+	  
+	  if (emit_case_nodes (index, index_expr, case_list, default_label, index_type))
 	    emit_jump (default_label);
+
 	}
       else
 	{
 	  rtx fallback_label = label_rtx (case_list->code_label);
-	  table_label = gen_label_rtx ();
-	  if (! try_casesi (index_type, index_expr, minval, range,
-			    table_label, default_label, fallback_label))
-	    {
-	      bool ok;
-
-	      /* Index jumptables from zero for suitable values of
-                 minval to avoid a subtraction.  */
-	      if (optimize_insn_for_speed_p ()
-		  && compare_tree_int (minval, 0) > 0
-		  && compare_tree_int (minval, 3) < 0)
-		{
-		  minval = build_int_cst (index_type, 0);
-		  range = maxval;
-		}
-
-	      ok = try_tablejump (index_type, index_expr, minval, range,
-				  table_label, default_label);
-	      gcc_assert (ok);
-	    }
-
-	  /* Get table of labels to jump to, in order of case index.  */
-
-	  ncases = tree_low_cst (range, 0) + 1;
-	  labelvec = XALLOCAVEC (rtx, ncases);
-	  memset (labelvec, 0, ncases * sizeof (rtx));
-
-	  for (n = case_list; n; n = n->right)
-	    {
-	      /* Compute the low and high bounds relative to the minimum
-		 value since that should fit in a HOST_WIDE_INT while the
-		 actual values may not.  */
-	      HOST_WIDE_INT i_low
-		= tree_low_cst (fold_build2 (MINUS_EXPR, index_type,
-					     n->low, minval), 1);
-	      HOST_WIDE_INT i_high
-		= tree_low_cst (fold_build2 (MINUS_EXPR, index_type,
-					     n->high, minval), 1);
-	      HOST_WIDE_INT i;
-
-	      for (i = i_low; i <= i_high; i ++)
-		labelvec[i]
-		  = gen_rtx_LABEL_REF (Pmode, label_rtx (n->code_label));
-	    }
-
-	  /* Fill in the gaps with the default.  We may have gaps at
-	     the beginning if we tried to avoid the minval subtraction,
-	     so substitute some label even if the default label was
-	     deemed unreachable.  */
-	  if (!default_label)
-	    default_label = fallback_label;
-	  for (i = 0; i < ncases; i++)
-	    if (labelvec[i] == 0)
-	      labelvec[i] = gen_rtx_LABEL_REF (Pmode, default_label);
-
-	  /* Output the table.  */
-	  emit_label (table_label);
-
-	  if (CASE_VECTOR_PC_RELATIVE || flag_pic)
-	    emit_jump_insn (gen_rtx_ADDR_DIFF_VEC (CASE_VECTOR_MODE,
-						   gen_rtx_LABEL_REF (Pmode, table_label),
-						   gen_rtvec_v (ncases, labelvec),
-						   const0_rtx, const0_rtx));
-	  else
-	    emit_jump_insn (gen_rtx_ADDR_VEC (CASE_VECTOR_MODE,
-					      gen_rtvec_v (ncases, labelvec)));
-
-	  /* Record no drop-through after the table.  */
-	  emit_barrier ();
+	  emit_case_table (index_expr, case_list, default_label,
+			   fallback_label, index_type);
 	}
 
       before_case = NEXT_INSN (before_case);
       end = get_last_insn ();
       reorder_insns (before_case, end, start);
     }
-
   free_temp_slots ();
   free_alloc_pool (case_node_pool);
 }
@@ -2555,6 +2575,56 @@ estimate_case_costs (case_node_ptr node)
   return 1;
 }
 
+/* Scan a list of case values, group them into clusters which should
+   be emitted as tablejumps even when the overall case list indicates
+   a desicion tree.  */
+
+static void
+scan_for_case_clusters (case_node *list, tree index_type)
+{
+  case_node *n, *l, *last_in_cluster;
+  unsigned int count;
+  tree min_case;
+  int table_set = 0;
+
+  list->parent = 0;
+  for (n=list; n; n=n->right)
+    {
+      n->table_set = 0;
+      if (n->right)
+	n->right->parent = n;
+    }
+
+  for (n=list; n; n=n->right)
+    {
+      min_case = n->low;
+      count = 0;
+      last_in_cluster = 0;
+      for (l=n; l; l=l->right)
+	{
+	  count ++;
+	  if (count > case_values_threshold ()
+	      && compare_tree_int (fold (build2_stat (MINUS_EXPR, index_type, 
+						      l->high, min_case)),
+				   (int)(optimize_size ? 3 : 10) * count) < 0)
+	    last_in_cluster = l;
+	  else if (last_in_cluster)
+	    break;
+	}
+      if (last_in_cluster)
+	{
+	  table_set ++;
+	  for (l=n; l;)
+	    {
+	      l->table_set = table_set;
+	      if (l == last_in_cluster)
+		break;
+	      l = l->right;
+	    }
+	  n = last_in_cluster;
+	}
+    }
+}
 /* Take an ordered list of case nodes
    and transform them into a near optimal binary tree,
    on the assumption that any target code selection value is as
@@ -2651,13 +2721,164 @@ balance_case_nodes (case_node_ptr *head,
 		  npp = &(*npp)->right;
 		}
 	    }
+	  if ((*npp)->table_set)
+	    {
+	      /* We've selected a member of a dense cluster.  Pick one
+ 		 of the ends of the cluster so that we can later
+ 		 isolate the cluster and emit a table for it.  */
+	      int set = (*npp)->table_set;
+	      int near_start=0, near_end=0;
+	      case_node *start_node, *end_node;
+	      if (dump_file)	      
+	      fprintf (dump_file, "Searching with set %d\n",
+		      set);
+
+	      /* Find the first and last member of the cluster.  */
+	      for (start_node = *npp;
+		   (start_node->parent != parent)
+		     && start_node->parent
+		     && (start_node->parent->table_set == set);
+		   start_node = start_node->parent)
+		{
+		    if (dump_file)
+		    {
+			fprintf (dump_file, "start_node is %ld\n",
+				TREE_INT_CST_LOW (start_node->low));
+			fprintf (dump_file, "parent is %ld\n",
+				(parent ? TREE_INT_CST_LOW (parent->low) : -1));
+		    }
+		  near_start ++;
+		}
+
+	      for (end_node = *npp;
+		   end_node->right
+		     && (end_node->right->table_set == set);
+		   end_node = end_node->right)
+		near_end ++;
+
+	      if (dump_file)
+		  fprintf (dump_file, "start %ld dist %d end %ld dist %d\n",
+			   TREE_INT_CST_LOW (start_node->low), near_start,
+			   TREE_INT_CST_LOW (end_node->low), near_end);
+	      
+	      if (start_node->parent
+		  && (start_node->parent != parent)
+		  && (start_node->parent->table_set == 0))
+		{
+		    if (dump_file)
+			fprintf (dump_file, "Start node changes to a node with table set 0 \n");
+		    
+		  start_node = start_node->parent;
+		}
+	      if (dump_file)
+		  fprintf (dump_file, "start_node->parent %p (%ld) parent %p (%ld)\n",
+			  start_node->parent,
+			  (start_node->parent
+			   ? TREE_INT_CST_LOW (start_node->parent->low) : -1),
+			  parent,
+			  (parent ? TREE_INT_CST_LOW (parent->low) : -1));
+
+	      /* Watch out for end-of-list conditions.  */
+	      if (!start_node->parent || (start_node->parent == parent))
+		{
+		    if (dump_file)
+			fprintf (dump_file, "start node is made to be empty.\n");
+
+		  start_node = 0;
+		}
+	      if (!end_node->right)
+		{
+		    if (dump_file)
+			fprintf (dump_file, "end_node->right = empty\n");
+
+		  end_node = 0;
+		}
+	      
+
+	      /* This means that there's nothing left in the list but
+ 		 the cluster; we keep it as-is and later emit a
+ 		 tablejump for it.  */
+	      if ((!start_node && !end_node))
+		{
+		    if (dump_file)
+			fprintf (dump_file, "dj: nothing left but the table\n");
+
+		  return;
+		}
+
+	      /* Move the selection to either before or after this
+ 		 cluster, whichever is closer.  */
+	      if (!start_node || (near_end < near_start  &&
+				  (end_node && end_node->right)))
+		{
+		    if (dump_file)
+			fprintf (dump_file, "Setting new parent to end node -> right\n");
+		    npp = &(end_node->right);
+		}
+	      else if (!start_node)
+		{
+		    if (dump_file)
+			fprintf (dump_file, "setting new parent  to head %p %ld\n",*head,
+				TREE_INT_CST_LOW ((*head)->low));
+		    
+		  npp = head;
+		}
+	      else
+		{
+		    if (dump_file)
+			fprintf (dump_file, "setting new parent to start_node->parent->right\n");
+
+		  npp = &(start_node->parent->right);
+		}
+
+	      if (dump_file)
+		  fprintf (dump_file, "dj: split moved to %ld\n", TREE_INT_CST_LOW ((*npp)->low));
+	      /* FIXME:: Added this to correct case. */
+
+	      if (left)
+		left->parent  = NULL;
+
+	    }
 	  *head = np = *npp;
 	  *npp = 0;
 	  np->parent = parent;
 	  np->left = left;
+	  left->parent = np;
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "Parent of left subtree is %p 0x%lx \n",
+		       left->parent,
+		       (left->parent) ? TREE_INT_CST_LOW (left->parent->low) : -1);
+	      
+	      if (left->parent && left->parent->table_set)
+		  fprintf (dump_file, "Caution: A cluster has been broken at 0x%lx\n",
+			   (left->parent) ? TREE_INT_CST_LOW (left->parent->low) : -1);
+		  
+
+	      fprintf (dump_file, "Setting left subtree to %p(0x%lx)\n",
+		      left, (left ? TREE_INT_CST_LOW (left->low) : -1));
+	      
+	      fprintf (dump_file, "Right is set to %p(%lx)\n",np->right,
+		      (np->right ? TREE_INT_CST_LOW (np->right->low) : -1));
+	      
+	       
+	    }
+	  if (dump_file)
+	  {
+	      fprintf (dump_file, "details about np before balancing left\n");
+	      fprintf (dump_file, "np = 0x%lx, np->parent = 0x%lx, ",
+		       TREE_INT_CST_LOW (np->low),
+		       np->parent ? TREE_INT_CST_LOW (np->parent->low) : -1);
+	      fprintf (dump_file, "np->left->parent = 0x%lx\n",
+		       np->left->parent ? TREE_INT_CST_LOW (np->left->parent->low) : -1);
+	      fprintf (dump_file, "balancing left\n");
+	       
+	  }
 
 	  /* Optimize each of the two split parts.  */
 	  balance_case_nodes (&np->left, np);
+	  if (dump_file)
+	      fprintf (dump_file, "balancing right\n");
 	  balance_case_nodes (&np->right, np);
 	}
       else
@@ -2769,6 +2990,92 @@ node_has_high_bound (case_node_ptr node,
   return 0;
 }
 
+static void
+emit_case_table (tree index_expr, case_node_ptr node, rtx default_label,
+		 rtx fallback_label, tree index_type)
+{
+  rtx table_label;
+  case_node *n;
+  int ncases;
+  rtx *labelvec;
+  int i;
+  tree minval, maxval, range;
+
+  for (n=node; n->right; n=n->right);
+
+  minval = node->low;
+  maxval = n->high;
+  range = fold_build2 (MINUS_EXPR, index_type, maxval, minval);
+
+
+  table_label = gen_label_rtx ();
+  if (! try_casesi (index_type, index_expr, minval, range,
+		    table_label, default_label, fallback_label))
+    {
+      bool ok;
+      
+      /* Index jumptables from zero for suitable values of
+	 minval to avoid a subtraction.  */
+      if (! optimize_size
+	  && compare_tree_int (minval, 0) > 0
+	  && compare_tree_int (minval, 3) < 0)
+	{
+	  minval = build_int_cst (index_type, 0);
+	  range = maxval;
+	}
+      
+      ok = try_tablejump (index_type, index_expr, minval, range,
+			  table_label, default_label);
+      gcc_assert (ok);
+    }
+  
+  /* Get table of labels to jump to, in order of case index.  */
+  
+  ncases = tree_low_cst (range, 0) + 1;
+  labelvec = alloca (ncases * sizeof (rtx));
+  memset (labelvec, 0, ncases * sizeof (rtx));
+  
+  for (n = node; n; n = n->right)
+    {
+      /* Compute the low and high bounds relative to the minimum
+	 value since that should fit in a HOST_WIDE_INT while the
+	 actual values may not.  */
+      HOST_WIDE_INT i_low
+	= tree_low_cst (fold_build2 (MINUS_EXPR, index_type,
+				     n->low, minval), 1);
+      HOST_WIDE_INT i_high
+	= tree_low_cst (fold_build2 (MINUS_EXPR, index_type,
+				     n->high, minval), 1);
+      HOST_WIDE_INT i;
+      
+      for (i = i_low; i <= i_high; i ++)
+	labelvec[i]
+	  = gen_rtx_LABEL_REF (Pmode, label_rtx (n->code_label));
+    }
+  
+  /* Fill in the gaps with the default.  */
+  for (i = 0; i < ncases; i++)
+    if (labelvec[i] == 0)
+	      labelvec[i] = gen_rtx_LABEL_REF (Pmode, default_label);
+  
+  
+  /* Output the table.  */
+  emit_label (table_label);
+  
+  if (CASE_VECTOR_PC_RELATIVE || flag_pic)
+    emit_jump_insn (gen_rtx_ADDR_DIFF_VEC (CASE_VECTOR_MODE,
+					   gen_rtx_LABEL_REF (Pmode, table_label),
+					   gen_rtvec_v (ncases, labelvec),
+					   const0_rtx, const0_rtx));
+  else
+    emit_jump_insn (gen_rtx_ADDR_VEC (CASE_VECTOR_MODE,
+				      gen_rtvec_v (ncases, labelvec)));
+  
+  /* Record no drop-through after the table.  */
+  emit_barrier ();
+  
+}
+
 /* Search the parent sections of the
    case node tree to see if both tests for the upper and lower
    bounds of NODE would be redundant.  */
@@ -2806,18 +3113,47 @@ node_is_bounded (case_node_ptr node, tre
    the possibility of a value greater than 51.  If another parent
    tests for the value 50, then this node need not test anything.  */
 
-static void
-emit_case_nodes (rtx index, case_node_ptr node, rtx default_label,
+static int
+emit_case_nodes (rtx index, tree index_expr, case_node_ptr node, rtx default_label,
 		 tree index_type)
 {
   /* If INDEX has an unsigned type, we must make unsigned branches.  */
   int unsignedp = TYPE_UNSIGNED (index_type);
+  int ret_val = 1;
   enum machine_mode mode = GET_MODE (index);
   enum machine_mode imode = TYPE_MODE (index_type);
 
   /* Handle indices detected as constant during RTL expansion.  */
   if (mode == VOIDmode)
     mode = imode;
+  /* If node->table_set is nonzero, then that node and all the
+     node->right in the chain represent a cluster of case nodes which
+     are appropriate for a tablejump.  Note that node->left may also
+     have a cluster, so test for that too.  */
+  if (node->table_set != 0 && 
+      node->right && 
+      node->table_set == node->right->table_set)
+    {
+      if (node->left)
+	{
+	  tree test_label = build_decl (LABEL_DECL, NULL_TREE, NULL_TREE);
+
+	  /* See if the value is on the right.  */
+	  emit_cmp_and_jump_insns (index,
+				   convert_modes
+				   (mode, imode,
+				    expand_expr (node->low, NULL_RTX,
+						 VOIDmode, 0),
+				    unsignedp),
+				   GE, NULL_RTX, mode, unsignedp,
+				   label_rtx (test_label));
+	  ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+				     index_type);
+	  expand_label (test_label);
+	}
+      emit_case_table (index_expr, node, default_label, default_label, index_type);
+      return 0;
+    }
 
   /* See if our parents have already tested everything for us.
      If they have, emit an unconditional jump for this node.  */
@@ -2852,7 +3188,8 @@ emit_case_nodes (rtx index, case_node_pt
 					unsignedp),
 				       GT, NULL_RTX, mode, unsignedp,
 				       label_rtx (node->right->code_label));
-	      emit_case_nodes (index, node->left, default_label, index_type);
+          ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+				     index_type);
 	    }
 
 	  else if (node_is_bounded (node->left, index_type))
@@ -2864,7 +3201,8 @@ emit_case_nodes (rtx index, case_node_pt
 					unsignedp),
 				       LT, NULL_RTX, mode, unsignedp,
 				       label_rtx (node->left->code_label));
-	      emit_case_nodes (index, node->right, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->right, default_label,
+					 index_type);
 	    }
 
 	  /* If both children are single-valued cases with no
@@ -2915,17 +3253,19 @@ emit_case_nodes (rtx index, case_node_pt
 				       GT, NULL_RTX, mode, unsignedp,
 				       label_rtx (test_label));
 
-	      /* Value must be on the left.
+           /* Value must be on the left.
 		 Handle the left-hand subtree.  */
-	      emit_case_nodes (index, node->left, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+	                                 index_type);
 	      /* If left-hand subtree does nothing,
 		 go to default.  */
-	      if (default_label)
-	        emit_jump (default_label);
+	      if (ret_val && default_label)
+		emit_jump (default_label);
 
 	      /* Code branches here for the right-hand subtree.  */
 	      expand_label (test_label);
-	      emit_case_nodes (index, node->right, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->right, default_label,
+	                                 index_type);
 	    }
 	}
 
@@ -2952,7 +3292,8 @@ emit_case_nodes (rtx index, case_node_pt
 					   default_label);
 		}
 
-	      emit_case_nodes (index, node->right, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->right, default_label,
+					 index_type);
 	    }
 	  else
 	    /* We cannot process node->right normally
@@ -2983,7 +3324,8 @@ emit_case_nodes (rtx index, case_node_pt
 					   default_label);
 		}
 
-	      emit_case_nodes (index, node->left, default_label, index_type);
+	     ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+	                                index_type);
 	    }
 	  else
 	    /* We cannot process node->left normally
@@ -3048,7 +3390,7 @@ emit_case_nodes (rtx index, case_node_pt
 				   label_rtx (node->code_label));
 
 	  /* Handle the left-hand subtree.  */
-	  emit_case_nodes (index, node->left, default_label, index_type);
+	  ret_val = emit_case_nodes (index, index_expr, node->left, default_label, index_type);
 
 	  /* If right node had to be handled later, do that now.  */
 
@@ -3056,11 +3398,11 @@ emit_case_nodes (rtx index, case_node_pt
 	    {
 	      /* If the left-hand subtree fell through,
 		 don't let it fall into the right-hand subtree.  */
-	      if (default_label)
-		emit_jump (default_label);
+	      if (ret_val && default_label)
+	      emit_jump (default_label);
 
 	      expand_label (test_label);
-	      emit_case_nodes (index, node->right, default_label, index_type);
+	      ret_val = emit_case_nodes (index, index_expr, node->right, default_label, index_type);
 	    }
 	}
 
@@ -3089,7 +3431,7 @@ emit_case_nodes (rtx index, case_node_pt
 				   LE, NULL_RTX, mode, unsignedp,
 				   label_rtx (node->code_label));
 
-	  emit_case_nodes (index, node->right, default_label, index_type);
+      ret_val =  emit_case_nodes (index, index_expr, node->right, default_label, index_type);
 	}
 
       else if (node->right == 0 && node->left != 0)
@@ -3117,7 +3459,8 @@ emit_case_nodes (rtx index, case_node_pt
 				   GE, NULL_RTX, mode, unsignedp,
 				   label_rtx (node->code_label));
 
-	  emit_case_nodes (index, node->left, default_label, index_type);
+	  ret_val = emit_case_nodes (index, index_expr, node->left, default_label,
+	                             index_type);
 	}
 
       else
@@ -3174,4 +3517,147 @@ emit_case_nodes (rtx index, case_node_pt
 	  emit_jump (label_rtx (node->code_label));
 	}
     }
+
+  return ret_val;
 }
+
+/* Subroutine of check_tree.
+   Performs inorder traversal of the tree rooted at NODE. MAX is the last
+   value that would have been printed, if we were to print nodes ( inorder
+   traversal). Check if MAX is greater than node->high, if Yes, Return 1
+   indicating a bad Binary Search Tree. Check Whether the parent
+   pointer of each child points to node. If not, Return 1, indicating a
+   bad Tree. Return 0, If the tree is correct. */
+static
+int check_tree_1 (case_node_ptr node, unsigned long long int *maxu, long long int* maxi, int unsignedp)
+{
+    unsigned long long int hu;
+    long long int hi;
+
+    if (! node)
+	return 0;
+
+    /* check the left subtree */
+    if (check_tree_1 (node->left, maxu, maxi, unsignedp))
+	return 1;
+
+    if (unsignedp)
+      {
+        unsigned int hi_part = TREE_INT_CST_HIGH (node->high);
+        unsigned int lo_part = TREE_INT_CST_LOW (node->high);
+
+        hu = ((unsigned long long int)hi_part << 32) | lo_part;   
+      }
+    else
+      {
+        int hi_part = TREE_INT_CST_HIGH (node->high);
+        int lo_part = TREE_INT_CST_LOW (node->high);
+
+        hi = ((long long int)hi_part << 32) | lo_part;          
+      }
+
+    /* Check if the tree is correct in terms of the values of the nodes. */
+    if (!max_initialized)
+    {
+	/* We have reached the leftmost node of the tree. */
+	if (unsignedp)
+	    *maxu = hu;
+	else
+	    *maxi = hi;
+	max_initialized = 1;
+    }
+    else  if ((unsignedp && (*maxu < hu)) || (!unsignedp && (*maxi < hi)))
+    {
+	if (unsignedp)
+	    *maxu = hu;
+	else
+	    *maxi = hi;
+    }
+    else
+    {
+	if (dump_file)
+	{
+	    if (unsignedp)
+	    {
+		fprintf (dump_file, "hu = 0x%llx, hu = %llu (unsigned), *maxu = 0x%llx, *maxu (decimal) = 0x%llu (unsigned)\n",
+			 hu, hu, *maxu, *maxu );
+		fprintf (dump_file, "PROBLEM: maxu is greater than hu, when it should not be so.\n" );
+	    }
+	    else
+	    {
+		fprintf (dump_file, "hi = 0x%llx, hi = %lld (signed), *maxi = 0x%llx, *maxi (decimal) = 0x%llx (signed)\n",
+			 hi, hi, *maxi, *maxi );
+		fprintf (dump_file, "PROBLEM: maxi is greater than hi, when it should not be so.\n" );
+	    }
+	}
+	return 1;
+    }
+    /* Check to see if the pointers in each node are sane */
+    if ((node->left && (!(node->left->parent == node))) || (node->right && (!(node->right->parent == node))))
+    {
+	if (dump_file)
+	{
+	    if (unsignedp)
+	    {
+		fprintf (dump_file, "PROBLEM: The parent pointers of children do not point to the correct" 
+			 "pointer for the following\n");
+		fprintf (dump_file, "Parent (Hex) = 0x%lx, Parent (Decimal) = %lu (unsigned)\n", TREE_INT_CST_LOW (node->low),
+			 TREE_INT_CST_LOW (node->low));
+		fprintf (dump_file, "Left Child (Hex) = 0x%lx, Left Child (Decimal) = %lu (unsigned)\n",
+			 node->left?TREE_INT_CST_LOW (node->left->low):-1,node->left?TREE_INT_CST_LOW (node->left->low):-1);
+		fprintf (dump_file, "Left Child's Parent (Hex) = 0x%lx, Left Child's Parent (Decimal) = %lu (unsigned)\n",
+			 node->left?node->left->parent?TREE_INT_CST_LOW (node->left->parent->low):-1:-1,
+			 node->left?node->left->parent?TREE_INT_CST_LOW (node->left->parent->low):-1:-1);
+		fprintf (dump_file, "Right Child (Hex) = 0x%lx, Right Child (Decimal) = %lu (unsigned)\n",
+			 node->right?TREE_INT_CST_LOW (node->right->low):-1,node->right?TREE_INT_CST_LOW (node->right->low):-1);
+		fprintf (dump_file, "Right Child's Parent (Hex) = 0x%lx, Right Child's Parent (Decimal) = %lu (unsigned)\n",
+			 node->right?node->right->parent?TREE_INT_CST_LOW (node->right->parent->low):-1:-1,
+			 node->right?node->right->parent?TREE_INT_CST_LOW (node->right->parent->low):-1:-1);
+		fprintf (dump_file, "If the value of a node is shown to be -1 then it means that the case value is actually"
+			 " -1 or that the pointer points to nothing i.e is NULL\n");
+	    }
+	    else
+	    {
+
+		fprintf (dump_file, "PROBLEM: The parent pointers of children do not point to the correct" 
+			 "pointer for the following\n");
+		fprintf (dump_file, "Parent (Hex) = 0x%lx, Parent (Decimal) = %ld (signed)\n", TREE_INT_CST_LOW (node->low),
+			 TREE_INT_CST_LOW (node->low));
+		fprintf (dump_file, "Left Child (Hex) = 0x%lx, Left Child (Decimal) = %ld (signed)\n",
+			 node->left?TREE_INT_CST_LOW (node->left->low):-1,node->left?TREE_INT_CST_LOW (node->left->low):-1);
+		fprintf (dump_file, "Left Child's Parent (Hex) = 0x%lx, Left Child's Parent (Decimal) = %ld (signed)\n",
+			 node->left?node->left->parent?TREE_INT_CST_LOW (node->left->parent->low):-1:-1,
+			 node->left?node->left->parent?TREE_INT_CST_LOW (node->left->parent->low):-1:-1);
+		fprintf (dump_file, "Right Child (Hex) = 0x%lx, Right Child (Decimal) = %ld (signed)\n",
+			 node->right?TREE_INT_CST_LOW (node->right->low):-1,node->right?TREE_INT_CST_LOW (node->right->low):-1);
+		fprintf (dump_file, "Right Child's Parent (Hex) = 0x%lx, Right Child's Parent (Decimal) = %ld (signed)\n",
+			 node->right?node->right->parent?TREE_INT_CST_LOW (node->right->parent->low):-1:-1,
+			 node->right?node->right->parent?TREE_INT_CST_LOW (node->right->parent->low):-1:-1);
+		fprintf (dump_file, "If the value of a node is shown to be -1 then it means that the case value is actually"
+			 " -1 or that the pointer points to nothing i.e is NULL\n");
+	    }
+	}
+		
+	return 1;
+    }
+    
+/* Check the right subtree. */
+    if (check_tree_1 (node->right, maxu, maxi, unsignedp))
+	return 1;
+
+    return 0;
+}
+
+
+static
+int check_tree (case_node_ptr head, int unsignedp)
+{
+    unsigned long long int max_u_inorder=0;
+    long long int max_i_inorder = 0;
+/* max_initialized is set to 1 when we reach the leftmost node of the tree. */
+    max_initialized = 0;
+    if (dump_file)
+	fprintf (dump_file, "Checking the generated BST\n");
+    return check_tree_1 (head, &max_u_inorder, &max_i_inorder, unsignedp);
+}
+
--- cfgrtl.c	2009-05-22 00:17:37.000000000 +0100
+++ cfgrtl.c	2010-09-03 10:54:55.000000000 +0100
@@ -528,6 +528,75 @@ first_insn_after_basic_block_note (basic
   return NEXT_INSN (insn);
 }
 
+static bool
+jump_table_in_between (rtx start, rtx end)
+{
+  bool between = false;
+  rtx insn;
+
+  for (insn = start; insn != end; insn = NEXT_INSN (insn))
+    if (JUMP_TABLE_DATA_P (insn))
+      between = true;
+  
+  if (insn == end)
+    if (JUMP_TABLE_DATA_P (insn))
+      between = true;
+   
+  return between;
+
+}
+/* Subroutine of rtl_split_block. Returns the first label found after
+   a jumptable, if a jumptable is found between start and end. */
+
+static rtx
+find_next_label_after_jump_table (rtx start, rtx end)
+{
+  rtx next_label, table,insn;
+  table = NULL_RTX;
+  next_label = NULL_RTX;
+  for (insn = start; insn != end; insn = NEXT_INSN (insn))
+    {
+      if (JUMP_TABLE_DATA_P (insn))
+	table = insn;
+      if (table && LABEL_P (insn))
+	return insn;
+      /* By this time we have decided that the instructions that form
+	 part of the jumptable have to be skipped, so make their basic
+	 block 0*/
+      if (!BARRIER_P (insn))
+	BLOCK_FOR_INSN (insn) = 0;
+    }
+  return NULL_RTX;
+}
+
+/* Subroutine of rtl_split_block. Check if the sequence of
+   instructions from start contains a jump table. If yes is it ok to
+   skip the block while spliting the block.  */
+
+static bool
+jump_table_ok_to_skip (rtx start)
+{
+  bool ok = false;
+  rtx test_insn;
+  /* start should always be a label. */
+  gcc_assert ( LABEL_P (start));
+
+  test_insn = NEXT_INSN (start);
+  if (JUMP_TABLE_DATA_P (test_insn))
+    {
+      test_insn = NEXT_INSN (test_insn);
+      /* check if the jump_table is followd by a barrier */
+      if (BARRIER_P (test_insn))
+      {
+	test_insn = NEXT_INSN (test_insn);
+	/* check if the barrier is followed by a label */
+	if (LABEL_P ( test_insn))
+	  ok = true;
+      }
+    }
+  return ok;
+}
+
 /* Creates a new basic block just after basic block B by splitting
    everything after specified instruction I.  */
 
@@ -539,6 +608,8 @@ rtl_split_block (basic_block bb, void *i
   edge e;
   edge_iterator ei;
 
+  rtx start_insn;
+
   if (!insn)
     {
       insn = first_insn_after_basic_block_note (bb);
@@ -555,8 +626,40 @@ rtl_split_block (basic_block bb, void *i
   if (insn == BB_END (bb))
     emit_note_after (NOTE_INSN_DELETED, insn);
 
+  /* If 'insn' is a barrier we need to look for the following sequence of
+   instructions:
+   
+   label_1
+   jumptable
+   barrier_2
+   
+   label_2
+
+   If we find such a sequence following 'insn' then we do not
+   split the block at the label following insn (label_1 in this case)
+   but we split the block at the label following the barrier that follows
+   the jump table (label_2 in this case). */
+
+  if (BARRIER_P (insn))
+    {
+      rtx temp_label;
+      if (LABEL_P (NEXT_INSN (insn)) 
+	  && jump_table_in_between (NEXT_INSN (insn), BB_END (bb)) 
+	  && jump_table_ok_to_skip (NEXT_INSN (insn)))
+	{
+	  temp_label = find_next_label_after_jump_table (NEXT_INSN (insn), BB_END (bb));
+	  gcc_assert (temp_label);
+	  start_insn = temp_label;
+	}
+      else
+	start_insn = NEXT_INSN (insn);
+      
+    }
+  else
+    start_insn = NEXT_INSN (insn);
+
   /* Create the new basic block.  */
-  new_bb = create_basic_block (NEXT_INSN (insn), BB_END (bb), bb);
+  new_bb = create_basic_block (start_insn, BB_END (bb), bb);
   BB_COPY_PARTITION (new_bb, bb);
   BB_END (bb) = insn;
 
--- cfgbuild.c	2009-02-20 15:20:38.000000000 +0000
+++ cfgbuild.c	2010-09-03 10:54:50.000000000 +0100
@@ -663,6 +663,11 @@ find_bb_boundaries (basic_block bb)
 	    }
 
 	  bb = fallthru->dest;
+	  
+	  /* The head of the insn may not be the same after
+	     splitting. */
+	  insn = BB_HEAD (bb);
+	  
 	  remove_edge (fallthru);
 	  flow_transfer_insn = NULL_RTX;
 	  if (LABEL_ALT_ENTRY_P (insn))

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-09-03 10:29 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-27 13:49 Clustering switch cases Paulo J. Matos
2010-08-27 15:03 ` Ian Lance Taylor
2010-08-27 15:09   ` Paulo J. Matos
2010-08-27 15:27   ` Richard Guenther
2010-08-28  0:21     ` Xinliang David Li
2010-08-28 11:08     ` Paulo J. Matos
2010-08-31 15:59       ` Rahul Kharche
     [not found]       ` <201009010034.18924.paul@codesourcery.com>
2010-09-01  8:39         ` Richard Guenther
2010-09-01  9:15           ` Paulo J. Matos
2010-09-01  9:33             ` Richard Guenther
2010-09-01 12:37               ` Paulo J. Matos
     [not found] ` <D84AB18A31AC674EBE9B5300634A800601BF0B0932@EXCHANGEVS.IceraSemi.local>
     [not found]   ` <AANLkTinFhrgJw3Wz8aLhBU6PMYpJzBFTRxLUH3eoz4M0@mail.gmail.com>
     [not found]     ` <4D60B0700D1DB54A8C0C6E9BE691637008965A28@EXCHANGEVS.IceraSemi.local>
     [not found]       ` <AANLkTinkjc4Y9baHXXK7v3ibQNwMtwGhA_oGmaFbykkA@mail.gmail.com>
2010-09-03 10:29         ` Rahul Kharche

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).