public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch[ Add explanations to sbitmap, bitmap, and sparseset
@ 2012-07-26  9:16 Steven Bosscher
  2012-07-26  9:23 ` Richard Guenther
  2012-07-26 10:14 ` Alexander Monakov
  0 siblings, 2 replies; 12+ messages in thread
From: Steven Bosscher @ 2012-07-26  9:16 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

Hello,

Add some explanations because I see a lot of places where the choice
of data structure is less than optimal.

Also some functional changes: sbitmap with popcounts are only used in
ebitmap, and they penalize the "normal" case without popcount a lot.

Bootstrapped&tested on powepc64-unknown-linux-gnu. OK?

Ciao!
Steven

[-- Attachment #2: sets.diff --]
[-- Type: application/octet-stream, Size: 22367 bytes --]

	* bitmap.h: Add explanation of sparse set as linked-list bitmap.
	* sbitmap.h: Add explanation about non-sparse sets as simple bitmap.
	(TEST_BIT): Make a static inline function for stronger type checking.
	(SET_BIT): Don't handle sbitmaps with popcount.
	(RESET_BIT): Likewise.
	(SET_BIT_WITH_POPCOUNT): New, like SET_BIT but with popcount.
	(RESET_BIT_WITH_POPCOUNT): New, like RESET_BIT but with popcount.
	* ebitmap.c (ebitmap_clear_bit): Use SET_BIT_WITH_POPCOUNT and
	RESET_BIT_WITH_POPCOUNT on wordmask bitmaps.
	(ebitmap_set_bit, ebitmap_and_into, ebitmap_and, ebitmap_ior_into,
	ebitmap_and_compl_into, ebitmap_and_compl): Likewise.
	* sparseset.h: Add explanation of sparse set representation.

Index: bitmap.h
===================================================================
--- bitmap.h	(revision 189861)
+++ bitmap.h	(working copy)
@@ -1,6 +1,5 @@
 /* Functions to support general ended bitmaps.
-   Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
-   2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 1997-2012  Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -20,6 +19,114 @@ along with GCC; see the file COPYING3.  If not see
 
 #ifndef GCC_BITMAP_H
 #define GCC_BITMAP_H
+
+/* Implementation of sparse integer sets as a linked list.
+
+   This sparse set representation is suitable for sparse sets with an
+   unknown (a priori) universe.  The set is represented as a double-linked
+   list of container nodes (struct bitmap_element_def).  Each node consists
+   of an index for the first member that could be held in the container,
+   a small array of integers that represent the members in the container,
+   and pointers to the next and previous element in the linked list.  The
+   elements in the list are sorted in ascending order, i.e. the head of
+   the list holds the element with the smallest member of the set.
+
+   For a given member I in the set:
+     - the element for I will have index is I / (bits per element)
+     - the position for I within element is I % (bits per element)
+
+   This representation is very space-efficient for large sparse sets, and
+   the size of the set can be changed dynamically without much overhead.
+   An important parameter is the number of bits per element.  In this
+   implementation, there are 128 bits per element.  This results in a
+   high storage overhead *per element*, but a small overall overhead if
+   the set is very sparse.
+
+   The downside is that many operations are relatively slow because the
+   linked list has to be traversed to test membership (i.e. member_p/
+   add_member/remove_member).  To improve the performance of this set
+   representation, the last accessed element and its index are cached.
+   For membership tests on members close to recently accessed members,
+   the cached last element improves membership test to a constant-time
+   operation.
+
+   The following operations can always be performed in O(1) time:
+
+     * clear			: bitmap_clear
+     * choose_one		: (not implemented, but could be
+				   implemented in constant time)
+
+   The following operations can be performed in O(E) time worst-case (with
+   E the number of elements in the linked list), but in O(1) time with a
+   suitable access patterns:
+
+     * member_p			: bitmap_bit_p
+     * add_member		: bitmap_set_bit
+     * remove_member		: bitmap_clear_bit
+
+   The following operations can be performed in O(E) time:
+
+     * cardinality		: bitmap_count_bits
+     * set_size			: bitmap_last_set_bit (but this could
+				  in constant time with a pointer to
+				  the last element in the chain)
+
+   Additionally, the linked-list sparse set representation supports
+   enumeration of the members in O(E) time:
+
+     * forall			: EXECUTE_IF_SET_IN_BITMAP
+     * set_copy			: bitmap_copy
+     * set_intersection		: bitmap_intersect_p /
+				  bitmap_and / bitmap_and_into /
+				  EXECUTE_IF_AND_IN_BITMAP
+     * set_union		: bitmap_ior / bitmap_ior_into
+     * set_difference		: bitmap_intersect_compl_p /
+				  bitmap_and_comp / bitmap_and_comp_into /
+				  EXECUTE_IF_AND_COMPL_IN_BITMAP
+     * set_disjuction		: bitmap_xor_comp / bitmap_xor_comp_into
+     * set_compare		: bitmap_equal_p
+
+   Some operations on 3 sets that occur frequently in in data flow problems
+   are also implemented:
+
+     * A | (B & C)		: bitmap_ior_and_into
+     * A | (B & ~C)		: bitmap_ior_and_compl /
+				  bitmap_ior_and_compl_into
+
+   The storage requirements for linked-list sparse sets are O(E), with E->N
+   in the worst case (a sparse set with large distances between the values
+   of the set members).
+
+   The linked-list set representation works well for problems involving very
+   sparse sets.  The canonical example in GCC is, of course, the "set of
+   sets" for some CFG-based data flow problems (liveness analysis, dominance
+   frontiers, etc.).
+   
+   This representation also works well for data flow problems where the size
+   of the set may grow dynamically, but care must be taken that the member_p,
+   add_member, and remove_member operations occur with a suitable access
+   pattern.
+   
+   For random-access sets with a known, relatively small universe size, the
+   SparseSet or simple bitmap representations may be more efficient than a
+   linked-list set.  For random-access sets of unknown universe, a hash table
+   or a balanced binary tree representation is likely to be a more suitable
+   choice.
+
+   Traversing linked lists is usually cache-unfriendly, even with the last
+   accessed element cached.
+   
+   Cache performance can be improved by keeping the elements in the set
+   grouped together in memory, using a dedicated obstack for a set (or group
+   of related sets).  Elements allocated on obstacks are released to a
+   free-list and taken off the free list.  If multiple sets are allocated on
+   the same obstack, elements freed from one set may be re-used for one of
+   the other sets.  This usually helps avoid cache misses.
+
+   A single free-list is used for all sets allocated in GGC space.  This is
+   bad for persistent sets, so persistent sets should be allocated on an
+   obstack whenever possible.  */
+
 #include "hashtab.h"
 #include "statistics.h"
 #include "obstack.h"
@@ -130,9 +237,11 @@ extern void bitmap_xor_into (bitmap, const_bitmap)
 /* DST = A | (B & C).  Return true if DST changes.  */
 extern bool bitmap_ior_and_into (bitmap DST, const_bitmap B, const_bitmap C);
 /* DST = A | (B & ~C).  Return true if DST changes.  */
-extern bool bitmap_ior_and_compl (bitmap DST, const_bitmap A, const_bitmap B, const_bitmap C);
+extern bool bitmap_ior_and_compl (bitmap DST, const_bitmap A,
+				  const_bitmap B, const_bitmap C);
 /* A |= (B & ~C).  Return true if A changes.  */
-extern bool bitmap_ior_and_compl_into (bitmap DST, const_bitmap B, const_bitmap C);
+extern bool bitmap_ior_and_compl_into (bitmap A,
+				       const_bitmap B, const_bitmap C);
 
 /* Clear a single bit in a bitmap.  Return true if the bit changed.  */
 extern bool bitmap_clear_bit (bitmap, int);
@@ -328,7 +437,8 @@ bmp_iter_and_init (bitmap_iterator *bi, const_bitm
    */
 
 static inline void
-bmp_iter_and_compl_init (bitmap_iterator *bi, const_bitmap map1, const_bitmap map2,
+bmp_iter_and_compl_init (bitmap_iterator *bi,
+			 const_bitmap map1, const_bitmap map2,
 			 unsigned start_bit, unsigned *bit_no)
 {
   bi->elt1 = map1->first;
Index: sbitmap.h
===================================================================
--- sbitmap.h	(revision 189861)
+++ sbitmap.h	(working copy)
@@ -1,6 +1,5 @@
 /* Simple bitmaps.
-   Copyright (C) 1999, 2000, 2002, 2003, 2004, 2006, 2007, 2008, 2010
-   Free Software Foundation, Inc.
+   Copyright (C) 1999-2012  Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -21,10 +20,66 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_SBITMAP_H
 #define GCC_SBITMAP_H
 
-/* It's not clear yet whether using bitmap.[ch] will be a win.
-   It should be straightforward to convert so for now we keep things simple
-   while more important issues are dealt with.  */
+/* Implementation of sets using simple bitmap vectors.
 
+   This set representation is suitable for non-sparse sets with a known
+   (a priori) universe.  The set is represented as a simple array of the
+   host's fastest unsigned integer.  For a given member I in the set:
+     - the element for I will be at sbitmap[I / (bits per element)]
+     - the position for I within element is I % (bits per element)
+
+   This representation is very space-efficient for large non-sparse sets
+   with random access patterns.
+
+   The following operations can be performed in O(1) time:
+
+     * set_size			: SBITMAP_SIZE
+     * member_p			: TEST_BIT
+     * add_member		: SET_BIT
+     * remove_member		: RESET_BIT
+
+   Most other operations on this set representation are O(U) where U is
+   the size of the set universe:
+
+     * clear			: sbitmap_zero
+     * cardinality		: sbitmap_popcount
+     * choose_one		: sbitmap_first_set_bit /
+				  sbitmap_last_set_bit
+     * forall			: EXECUTE_IF_SET_IN_SBITMAP
+     * set_copy			: sbitmap_copy / sbitmap_copy_n
+     * set_intersection		: sbitmap_a_and_b
+     * set_union		: sbitmap_a_or_b
+     * set_difference		: sbitmap_difference
+     * set_disjuction		: (not implemented)
+     * set_compare		: sbitmap_equal
+
+   Some operations on 3 sets that occur frequently in in data flow problems
+   are also implemented:
+
+      * A | (B & C)		: sbitmap_a_or_b_and_c
+      * A | (B & ~C)		: sbitmap_union_of_diff
+      * A & (B | C)		: sbitmap_a_and_b_or_c
+
+   Most of the set functions have two variants: One that returns non-zero
+   if members were added or removed from the target set, and one that just
+   performs the operation without feedback.  The former operations are a
+   bit more expensive but the result can often be used to avoid iterations
+   on other sets.
+
+   Allocating a bitmap is done with sbitmap_alloc, and resizing is
+   performed with sbitmap_resize.
+
+   The storage requirements for simple bitmap sets is O(U) where U is the
+   size of the set universe (colloquially the number of bits in the bitmap).
+
+   This set representation works well for relatively small data flow problems
+   (there are special routines for that, see sbitmap_vector_*).  The set
+   operations can be vectorized and there is almost no computating overhead,
+   so that even sparse simple bitmap sets outperform dedicated sparse set
+   representations like linked-list bitmaps.  For larger problems, the size
+   overhead of simple bitmap sets gets too high and other set representations
+   have to be used.  */
+
 #define SBITMAP_ELT_BITS (HOST_BITS_PER_WIDEST_FAST_INT * 1u)
 #define SBITMAP_ELT_TYPE unsigned HOST_WIDEST_FAST_INT
 
@@ -40,46 +95,66 @@ struct simple_bitmap_def
 #define SBITMAP_SET_SIZE(N) (((N) + SBITMAP_ELT_BITS - 1) / SBITMAP_ELT_BITS)
 #define SBITMAP_SIZE_BYTES(BITMAP) ((BITMAP)->size * sizeof (SBITMAP_ELT_TYPE))
 
+/* Return the number of bits in BITMAP.  */
+#define SBITMAP_SIZE(BITMAP) ((BITMAP)->n_bits)
+
 /* Test if bit number bitno in the bitmap is set.  */
-#define TEST_BIT(BITMAP, BITNO) \
-((BITMAP)->elms [(BITNO) / SBITMAP_ELT_BITS] >> (BITNO) % SBITMAP_ELT_BITS & 1)
+static inline SBITMAP_ELT_TYPE
+TEST_BIT (const_sbitmap map, unsigned int bitno)
+{
+  size_t i = bitno / SBITMAP_ELT_BITS;
+  unsigned int s = bitno % SBITMAP_ELT_BITS;
+  return (map->elms[i] >> s) & (SBITMAP_ELT_TYPE) 1;
+}
 
-/* Set bit number BITNO in the sbitmap MAP.  Updates population count
-   if this bitmap has one.  */
+/* Set bit number BITNO in the sbitmap MAP.  */
 
 static inline void
 SET_BIT (sbitmap map, unsigned int bitno)
 {
-  if (map->popcount)
-    {
-      bool oldbit;
-      oldbit = TEST_BIT (map, bitno);
-      if (!oldbit)
-	map->popcount[bitno / SBITMAP_ELT_BITS]++;
-    }
+  gcc_checking_assert (! map->popcount);
   map->elms[bitno / SBITMAP_ELT_BITS]
     |= (SBITMAP_ELT_TYPE) 1 << (bitno) % SBITMAP_ELT_BITS;
 }
 
+/* Like SET_BIT, but updates population count.  */
 
+static inline void
+SET_BIT_WITH_POPCOUNT (sbitmap map, unsigned int bitno)
+{
+  bool oldbit;
+  gcc_checking_assert (map->popcount);
+  oldbit = TEST_BIT (map, bitno);
+  if (!oldbit)
+    map->popcount[bitno / SBITMAP_ELT_BITS]++;
+  map->elms[bitno / SBITMAP_ELT_BITS]
+    |= (SBITMAP_ELT_TYPE) 1 << (bitno) % SBITMAP_ELT_BITS;
+}
 
-/* Reset bit number BITNO in the sbitmap MAP.  Updates population
-   count if this bitmap has one.  */
+/* Reset bit number BITNO in the sbitmap MAP.  */
 
 static inline void
 RESET_BIT (sbitmap map,  unsigned int bitno)
 {
-  if (map->popcount)
-    {
-      bool oldbit;
-      oldbit = TEST_BIT (map, bitno);
-      if (oldbit)
-	map->popcount[bitno / SBITMAP_ELT_BITS]--;
-    }
+  gcc_checking_assert (! map->popcount);
   map->elms[bitno / SBITMAP_ELT_BITS]
     &= ~((SBITMAP_ELT_TYPE) 1 << (bitno) % SBITMAP_ELT_BITS);
 }
 
+/* Like RESET_BIT, but updates population count.  */
+
+static inline void
+RESET_BIT_WITH_POPCOUNT (sbitmap map,  unsigned int bitno)
+{
+  bool oldbit;
+  gcc_checking_assert (map->popcount);
+  oldbit = TEST_BIT (map, bitno);
+  if (oldbit)
+    map->popcount[bitno / SBITMAP_ELT_BITS]--;
+  map->elms[bitno / SBITMAP_ELT_BITS]
+    &= ~((SBITMAP_ELT_TYPE) 1 << (bitno) % SBITMAP_ELT_BITS);
+}
+
 /* The iterator for sbitmap.  */
 typedef struct {
   /* The pointer to the first word of the bitmap.  */
@@ -211,14 +286,20 @@ extern void sbitmap_ones (sbitmap);
 extern void sbitmap_vector_zero (sbitmap *, unsigned int);
 extern void sbitmap_vector_ones (sbitmap *, unsigned int);
 
-extern void sbitmap_union_of_diff (sbitmap, const_sbitmap, const_sbitmap, const_sbitmap);
-extern bool sbitmap_union_of_diff_cg (sbitmap, const_sbitmap, const_sbitmap, const_sbitmap);
+extern void sbitmap_union_of_diff (sbitmap, const_sbitmap,
+				   const_sbitmap, const_sbitmap);
+extern bool sbitmap_union_of_diff_cg (sbitmap, const_sbitmap,
+				      const_sbitmap, const_sbitmap);
 extern void sbitmap_difference (sbitmap, const_sbitmap, const_sbitmap);
 extern void sbitmap_not (sbitmap, const_sbitmap);
-extern void sbitmap_a_or_b_and_c (sbitmap, const_sbitmap, const_sbitmap, const_sbitmap);
-extern bool sbitmap_a_or_b_and_c_cg (sbitmap, const_sbitmap, const_sbitmap, const_sbitmap);
-extern void sbitmap_a_and_b_or_c (sbitmap, const_sbitmap, const_sbitmap, const_sbitmap);
-extern bool sbitmap_a_and_b_or_c_cg (sbitmap, const_sbitmap, const_sbitmap, const_sbitmap);
+extern void sbitmap_a_or_b_and_c (sbitmap, const_sbitmap,
+				  const_sbitmap, const_sbitmap);
+extern bool sbitmap_a_or_b_and_c_cg (sbitmap, const_sbitmap,
+				     const_sbitmap, const_sbitmap);
+extern void sbitmap_a_and_b_or_c (sbitmap, const_sbitmap,
+				  const_sbitmap, const_sbitmap);
+extern bool sbitmap_a_and_b_or_c_cg (sbitmap, const_sbitmap,
+				     const_sbitmap, const_sbitmap);
 extern bool sbitmap_any_common_bits (const_sbitmap, const_sbitmap);
 extern void sbitmap_a_and_b (sbitmap, const_sbitmap, const_sbitmap);
 extern bool sbitmap_a_and_b_cg (sbitmap, const_sbitmap, const_sbitmap);
Index: ebitmap.c
===================================================================
--- ebitmap.c	(revision 189861)
+++ ebitmap.c	(working copy)
@@ -1,5 +1,5 @@
 /* Sparse array-based bitmaps.
-   Copyright (C) 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 2007-2012  Free Software Foundation, Inc.
    Contributed by Daniel Berlin <dberlin@dberlin.org>
 
 This file is part of GCC.
@@ -258,7 +258,7 @@ ebitmap_clear_bit (ebitmap map, unsigned int bit)
             map->cache = map->cache - 1;
         }
 
-      RESET_BIT (map->wordmask, wordindex);
+      RESET_BIT_WITH_POPCOUNT (map->wordmask, wordindex);
 
       memmove(&map->elts[eltwordindex], &map->elts[eltwordindex + 1],
 	      sizeof (EBITMAP_ELT_TYPE) * (map->numwords - eltwordindex));
@@ -293,7 +293,7 @@ ebitmap_set_bit (ebitmap map, unsigned int bit)
       unsigned long count;
       unsigned int i;
 
-      SET_BIT (map->wordmask, wordindex);
+      SET_BIT_WITH_POPCOUNT (map->wordmask, wordindex);
       count = sbitmap_popcount (map->wordmask, wordindex);
       gcc_assert (count <= map->numwords);
 
@@ -449,7 +449,7 @@ ebitmap_and_into (ebitmap dst, ebitmap src)
 	  *dstplace = tmpword;
 	}
       else
-	RESET_BIT (dst->wordmask, i);
+	RESET_BIT_WITH_POPCOUNT (dst->wordmask, i);
     }
 #ifdef EBITMAP_DEBUGGING
   {
@@ -508,7 +508,7 @@ ebitmap_and (ebitmap dst, ebitmap src1, ebitmap sr
 	      *dstplace = tmpword;
 	    }
 	  else
-	    RESET_BIT (dst->wordmask, i);
+	    RESET_BIT_WITH_POPCOUNT (dst->wordmask, i);
 	}
       else if (src1hasword)
 	src1eltindex++;
@@ -624,7 +624,7 @@ ebitmap_ior_into (ebitmap dst, ebitmap src)
 	{
 	  newarray [neweltindex++] = ebitmap_array_get (src, srceltindex++);
 	  gcc_assert (i < dst->wordmask->n_bits);
-	  SET_BIT (dst->wordmask, i);
+	  SET_BIT_WITH_POPCOUNT (dst->wordmask, i);
 	  changed |= true;
 	}
     }
@@ -825,7 +825,7 @@ ebitmap_and_compl_into (ebitmap dst, ebitmap src)
 	      *dstplace = tmpword;
 	    }
 	  else
-	    RESET_BIT (dst->wordmask, i);
+	    RESET_BIT_WITH_POPCOUNT (dst->wordmask, i);
 	}
       else
 	{
@@ -904,7 +904,7 @@ ebitmap_and_compl (ebitmap dst, ebitmap src1, ebit
 	      newarray[neweltindex++] = tmpword;
 	    }
 	  else
-	    RESET_BIT (tempmask, i);
+	    RESET_BIT_WITH_POPCOUNT (tempmask, i);
 
 	}
       else
Index: sparseset.h
===================================================================
--- sparseset.h	(revision 189861)
+++ sparseset.h	(working copy)
@@ -1,5 +1,5 @@
 /* SparseSet implementation.
-   Copyright (C) 2007, 2010 Free Software Foundation, Inc.
+   Copyright (C) 2007-2012 Free Software Foundation, Inc.
    Contributed by Peter Bergner <bergner@vnet.ibm.com>
 
 This file is part of GCC.
@@ -21,11 +21,71 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_SPARSESET_H
 #define GCC_SPARSESET_H
 
-#define SPARSESET_ELT_BITS ((unsigned) HOST_BITS_PER_WIDEST_FAST_INT)
-#define SPARSESET_ELT_TYPE unsigned int
+/* Implementation of the Briggs and Torczon sparse set representation.
+   The sparse set representation was first published in:
 
+   "An Efficient Representation for Sparse Sets",
+   ACM LOPLAS, Vol. 2, Nos. 1-4, March-December 1993, Pages 59-69.
+
+   The sparse set representation is suitable for integer sets with a
+   fixed-size universe.  Two vectors are used to store the members of
+   the set.  If an element I is in the set, then sparse[I] is the
+   index of I in the dense vector, and dense[I] == I.  The dense
+   vector works like a stack.  The size of the stack is the cardinality
+   of the set.
+
+   The following operations can be performed in O(1) time:
+
+     * clear			: sparseset_clear
+     * cardinality		: sparseset_cardinality
+     * set_size			: sparseset_size
+     * member_p			: sparseset_bit_p
+     * add_member		: sparseset_set_bit
+     * remove_member		: sparseset_clear_bit
+     * choose_one		: sparseset_pop
+
+   Additionally, the sparse set representation supports enumeration of
+   the members in O(N) time, where n is the number of members in the set.
+   The members of the set are stored cache-friendly in the dense vector.
+   This makes it a competitive choice for iterating over relatively sparse
+   sets requiring operations:
+
+     * forall			: EXECUTE_IF_SET_IN_SPARSESET
+     * set_copy			: sparseset_copy
+     * set_intersection		: sparseset_and
+     * set_union		: sparseset_ior
+     * set_difference		: sparseset_and_compl
+     * set_disjuction		: (not implemented)
+     * set_compare		: sparseset_equal_p
+
+   NB: It is OK to use remove_member during EXECUTE_IF_SET_IN_SPARSESET.
+   The iterator is updated for it.
+
+   Based on the efficiency of these operations, this representation of
+   sparse sets will often be superior to alternatives such as simple
+   bitmaps, linked-list bitmaps, array bitmaps, balanced binary trees,
+   hash tables, linked lists, etc., if the set is sufficiently sparse.
+   In the LOPLAS paper the cut-off point where sparse sets became faster
+   than simple bitmaps (see sbitmap.h) when N / U < 64 (where U is the
+   size of the universe of the set).
+
+   Because the set universe is fixed, the set cannot be resized.  For
+   sparse sets with initially unknown size, linked-list bitmaps are a
+   better choice, see bitmap.h.
+
+   Sparse sets storage requirements are relatively large: O(U) with a
+   larger constant than sbitmaps (if the storage requirement for an
+   sbitmap with universe U is S, then the storage required for a sparse
+   set for the same universe are 2*HOST_BITS_PER_WIDEST_FAST_INT * S).
+   Accessing the sparse vector is not very cache-friendly, but iterating
+   over the members in the set is cache-friendly because only the dense
+   vector is used.  */
+
 /* Data Structure used for the SparseSet representation.  */
 
+#define SPARSESET_ELT_BITS ((unsigned) HOST_BITS_PER_WIDEST_FAST_INT)
+#define SPARSESET_ELT_TYPE unsigned HOST_WIDEST_FAST_INT
+
 typedef struct sparseset_def
 {
   SPARSESET_ELT_TYPE *dense;	/* Dense array.  */
@@ -107,7 +167,7 @@ sparseset_set_bit (sparseset s, SPARSESET_ELT_TYPE
     sparseset_insert_bit (s, e, s->members++);
 }
 
-/* Return and remove an arbitrary element from the set S.  */
+/* Return and remove the last member added to the set S.  */
 
 static inline SPARSESET_ELT_TYPE
 sparseset_pop (sparseset s)
Index: bitmap.c
===================================================================
--- bitmap.c	(revision 189861)
+++ bitmap.c	(working copy)
@@ -1,6 +1,5 @@
 /* Functions to support general ended bitmaps.
-   Copyright (C) 1997, 1998, 1999, 2000, 2001, 2003, 2004, 2005,
-   2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 1997-2012  Free Software Foundation, Inc.
 
 This file is part of GCC.
 
Index: sbitmap.c
===================================================================
--- sbitmap.c	(revision 189861)
+++ sbitmap.c	(working copy)
@@ -1,6 +1,5 @@
 /* Simple bitmaps.
-   Copyright (C) 1999, 2000, 2002, 2003, 2004, 2006, 2007, 2008, 2010
-   Free Software Foundation, Inc.
+   Copyright (C) 1999-2012  Free Software Foundation, Inc.
 
 This file is part of GCC.
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-26  9:16 [patch[ Add explanations to sbitmap, bitmap, and sparseset Steven Bosscher
@ 2012-07-26  9:23 ` Richard Guenther
  2012-07-26  9:58   ` Steven Bosscher
  2012-07-26 10:14 ` Alexander Monakov
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Guenther @ 2012-07-26  9:23 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: GCC Patches

On Thu, Jul 26, 2012 at 11:16 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> Hello,
>
> Add some explanations because I see a lot of places where the choice
> of data structure is less than optimal.
>
> Also some functional changes: sbitmap with popcounts are only used in
> ebitmap, and they penalize the "normal" case without popcount a lot.
>
> Bootstrapped&tested on powepc64-unknown-linux-gnu. OK?

Ok!  Thanks for adding this exhaustive documentation.

Btw, ebitmap is unused since it was added - maybe we should simply remove
it ...?

Thanks,
Richard.

> Ciao!
> Steven

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-26  9:23 ` Richard Guenther
@ 2012-07-26  9:58   ` Steven Bosscher
  2012-07-27 13:56     ` Richard Guenther
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Bosscher @ 2012-07-26  9:58 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches

On Thu, Jul 26, 2012 at 11:23 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> Ok!  Thanks for adding this exhaustive documentation.

There's more to come! I want to add some explanations to ebitmap,
pointer-set, fibheap, and splay-tree as sets, and add a chapter in the
gccint manual too.

Now if only you'd document those loop changes... ;-)


> Btw, ebitmap is unused since it was added - maybe we should simply remove
> it ...?

I wouldn't remove it just yet. I'm going to make sure that bitmap.[ch]
and ebitmap.[ch] provide the same interface and see if there are
places where ebitmap is a better choice than bitmap or sbitmap (cprop
and gcse.c come to mind).

Ciao!
Steven

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-26  9:16 [patch[ Add explanations to sbitmap, bitmap, and sparseset Steven Bosscher
  2012-07-26  9:23 ` Richard Guenther
@ 2012-07-26 10:14 ` Alexander Monakov
  2012-07-26 12:01   ` Steven Bosscher
  1 sibling, 1 reply; 12+ messages in thread
From: Alexander Monakov @ 2012-07-26 10:14 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: GCC Patches

Hello,

+   the set.  If an element I is in the set, then sparse[I] is the
+   index of I in the dense vector, and dense[I] == I.  The dense
                                        ^^^^^^^^^^^^^
This should read "dense[sparse[I]] == I"

Thanks

Alexander

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-26 10:14 ` Alexander Monakov
@ 2012-07-26 12:01   ` Steven Bosscher
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Bosscher @ 2012-07-26 12:01 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: GCC Patches

On Thu, Jul 26, 2012 at 12:14 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
> Hello,
>
> +   the set.  If an element I is in the set, then sparse[I] is the
> +   index of I in the dense vector, and dense[I] == I.  The dense
>                                         ^^^^^^^^^^^^^
> This should read "dense[sparse[I]] == I"

Of course. Thanks for spotting this mistake!

Ciao!
Steven

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-26  9:58   ` Steven Bosscher
@ 2012-07-27 13:56     ` Richard Guenther
  2012-07-27 17:16       ` William J. Schmidt
  2012-07-30 14:48       ` Peter Bergner
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Guenther @ 2012-07-27 13:56 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: GCC Patches

On Thu, Jul 26, 2012 at 11:57 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Thu, Jul 26, 2012 at 11:23 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> Ok!  Thanks for adding this exhaustive documentation.
>
> There's more to come! I want to add some explanations to ebitmap,
> pointer-set, fibheap, and splay-tree as sets, and add a chapter in the
> gccint manual too.
>
> Now if only you'd document those loop changes... ;-)

Eh ...

>
>> Btw, ebitmap is unused since it was added - maybe we should simply remove
>> it ...?
>
> I wouldn't remove it just yet. I'm going to make sure that bitmap.[ch]
> and ebitmap.[ch] provide the same interface and see if there are
> places where ebitmap is a better choice than bitmap or sbitmap (cprop
> and gcse.c come to mind).

Btw, just looking over sparseset.h what needs to be documented is that
iterating over the set is faster than for an sbitmap but element ordering
is random!  Also it looks less efficient than sbitmap in the case when
your main operation is adding to the set and querying the set randomly.
It's space overhead is really huge - for smaller universes a smaller
SPARSESET_ELT_TYPE would be nice, templates to the rescue!  I
wonder in which cases a unsigned HOST_WIDEST_FAST_INT sized
universe is even useful (but a short instead of an int is probably too
small ...)

Richard.

> Ciao!
> Steven

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-27 13:56     ` Richard Guenther
@ 2012-07-27 17:16       ` William J. Schmidt
  2012-07-30 14:48       ` Peter Bergner
  1 sibling, 0 replies; 12+ messages in thread
From: William J. Schmidt @ 2012-07-27 17:16 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Steven Bosscher, GCC Patches

On Fri, 2012-07-27 at 15:40 +0200, Richard Guenther wrote:
> On Thu, Jul 26, 2012 at 11:57 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> > On Thu, Jul 26, 2012 at 11:23 AM, Richard Guenther
> > <richard.guenther@gmail.com> wrote:
> >> Ok!  Thanks for adding this exhaustive documentation.
> >
> > There's more to come! I want to add some explanations to ebitmap,
> > pointer-set, fibheap, and splay-tree as sets, and add a chapter in the
> > gccint manual too.
> >
> > Now if only you'd document those loop changes... ;-)
> 
> Eh ...
> 
> >
> >> Btw, ebitmap is unused since it was added - maybe we should simply remove
> >> it ...?
> >
> > I wouldn't remove it just yet. I'm going to make sure that bitmap.[ch]
> > and ebitmap.[ch] provide the same interface and see if there are
> > places where ebitmap is a better choice than bitmap or sbitmap (cprop
> > and gcse.c come to mind).
> 
> Btw, just looking over sparseset.h what needs to be documented is that
> iterating over the set is faster than for an sbitmap but element ordering
> is random!  Also it looks less efficient than sbitmap in the case when
> your main operation is adding to the set and querying the set randomly.
> It's space overhead is really huge - for smaller universes a smaller
> SPARSESET_ELT_TYPE would be nice, templates to the rescue!  I
> wonder in which cases a unsigned HOST_WIDEST_FAST_INT sized
> universe is even useful (but a short instead of an int is probably too
> small ...)

Another option for sparse sets would be a templatized version of Pugh's
skip lists.  Iteration is the same as a linked list and random access is
logarithmic in the size of the set (not the universe).  Space overhead
is also logarithmic.  The potential downside is that it involves
pointers.

Bill

> 
> Richard.
> 
> > Ciao!
> > Steven

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-27 13:56     ` Richard Guenther
  2012-07-27 17:16       ` William J. Schmidt
@ 2012-07-30 14:48       ` Peter Bergner
  2012-07-30 15:05         ` Richard Guenther
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Bergner @ 2012-07-30 14:48 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Steven Bosscher, GCC Patches

On Fri, 27 Jul 2012 15:40:35 +0200 Richard Guenther <richard.guenther@gmail.com> wrote:
> Also it looks less efficient than sbitmap in the case when
> your main operation is adding to the set and querying the set randomly.

How so?  Adding/deleting a member to a sparseset is an O(1) operation,
as is querying whether something is/isn't a member of a sparseset.
Or are you talking about slower by some small constant factor?


> It's space overhead is really huge - for smaller universes a smaller
> SPARSESET_ELT_TYPE would be nice, templates to the rescue!  I
> wonder in which cases a unsigned HOST_WIDEST_FAST_INT sized
> universe is even useful (but a short instead of an int is probably too
> small ...)

Yes, space overhead it large, but the extra space overhead allows sparseset
to have O(1) operations for most set functions and O(N) for iterating over
the members of the set.  Obviously, you don't want to use this as in general
set usage, but where speed is critical, it has its uses.

Peter

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-30 14:48       ` Peter Bergner
@ 2012-07-30 15:05         ` Richard Guenther
  2012-07-30 15:06           ` Steven Bosscher
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Guenther @ 2012-07-30 15:05 UTC (permalink / raw)
  To: Peter Bergner; +Cc: Steven Bosscher, GCC Patches

On Mon, Jul 30, 2012 at 4:43 PM, Peter Bergner <bergner@vnet.ibm.com> wrote:
> On Fri, 27 Jul 2012 15:40:35 +0200 Richard Guenther <richard.guenther@gmail.com> wrote:
>> Also it looks less efficient than sbitmap in the case when
>> your main operation is adding to the set and querying the set randomly.
>
> How so?  Adding/deleting a member to a sparseset is an O(1) operation,
> as is querying whether something is/isn't a member of a sparseset.
> Or are you talking about slower by some small constant factor?

No, but less space efficient and of comparable speed as sbitmap which
is also O(1).

>> It's space overhead is really huge - for smaller universes a smaller
>> SPARSESET_ELT_TYPE would be nice, templates to the rescue!  I
>> wonder in which cases a unsigned HOST_WIDEST_FAST_INT sized
>> universe is even useful (but a short instead of an int is probably too
>> small ...)
>
> Yes, space overhead it large, but the extra space overhead allows sparseset
> to have O(1) operations for most set functions and O(N) for iterating over
> the members of the set.  Obviously, you don't want to use this as in general
> set usage, but where speed is critical, it has its uses.

True.

Richard.

> Peter
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-30 15:05         ` Richard Guenther
@ 2012-07-30 15:06           ` Steven Bosscher
  2012-07-30 15:17             ` Richard Guenther
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Bosscher @ 2012-07-30 15:06 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Peter Bergner, GCC Patches

On Mon, Jul 30, 2012 at 4:53 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> No, but less space efficient and of comparable speed as sbitmap which
> is also O(1).

But iterating an sbitmap has worse complexity than sparseset.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-30 15:06           ` Steven Bosscher
@ 2012-07-30 15:17             ` Richard Guenther
  2012-07-30 15:18               ` Richard Guenther
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Guenther @ 2012-07-30 15:17 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Peter Bergner, GCC Patches

On Mon, Jul 30, 2012 at 5:05 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Mon, Jul 30, 2012 at 4:53 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> No, but less space efficient and of comparable speed as sbitmap which
>> is also O(1).
>
> But iterating an sbitmap has worse complexity than sparseset.

Which is why I mentioned the common idiom of only random set and query
operations.  The docs seem to suggest sparseset is appropriate there.

Richard.

> Ciao!
> Steven

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset
  2012-07-30 15:17             ` Richard Guenther
@ 2012-07-30 15:18               ` Richard Guenther
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Guenther @ 2012-07-30 15:18 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Peter Bergner, GCC Patches

On Mon, Jul 30, 2012 at 5:14 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Mon, Jul 30, 2012 at 5:05 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> On Mon, Jul 30, 2012 at 4:53 PM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> No, but less space efficient and of comparable speed as sbitmap which
>>> is also O(1).
>>
>> But iterating an sbitmap has worse complexity than sparseset.
>
> Which is why I mentioned the common idiom of only random set and query
> operations.  The docs seem to suggest sparseset is appropriate there.

And even if we add iterating a combination of an sbitmap plus a VEC of elements
is cheaper if you don't remove elements from the set.

Richard.

> Richard.
>
>> Ciao!
>> Steven

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-07-30 15:17 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-26  9:16 [patch[ Add explanations to sbitmap, bitmap, and sparseset Steven Bosscher
2012-07-26  9:23 ` Richard Guenther
2012-07-26  9:58   ` Steven Bosscher
2012-07-27 13:56     ` Richard Guenther
2012-07-27 17:16       ` William J. Schmidt
2012-07-30 14:48       ` Peter Bergner
2012-07-30 15:05         ` Richard Guenther
2012-07-30 15:06           ` Steven Bosscher
2012-07-30 15:17             ` Richard Guenther
2012-07-30 15:18               ` Richard Guenther
2012-07-26 10:14 ` Alexander Monakov
2012-07-26 12:01   ` Steven Bosscher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).