public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Some questions on hash_set/hash_map traits
@ 2021-09-08 20:48 Erick Ochoa
  2021-09-09  7:05 ` Richard Biener
  0 siblings, 1 reply; 2+ messages in thread
From: Erick Ochoa @ 2021-09-08 20:48 UTC (permalink / raw)
  To: gcc

[-- Attachment #1: Type: text/plain, Size: 6228 bytes --]

Hello,

I am refactoring some old code that runs as an IPA_PASS. This code is
intended to run at link-time using the LTO framework and so I am
always testing it that way. At the moment, I am trying to use
hash-maps but having some difficulties. I am adding some patches that
will help illustrate along the way.

The first patch simply adds a link-time optimization pass that will be
used to demo the problem that I am having. One can run with -flto
-fipa-hash. During the first patch, the pass does not do any
meaningful work but it just helps to set up the stage.

For the second patch I create a wrapper around a symtab_node. This
wrapper is intended to add some other functionality to elements that I
might want to insert in hash-sets or hash-maps.

class symtab_wrapper
{
private:
  symtab_node *_symtab_node;
public:
  friend symtab_wrapper_traits;
  symtab_wrapper ()
    : _symtab_node (NULL)
  {}
  symtab_wrapper (symtab_node *s)
    : _symtab_node (s)
  { gcc_assert (s); }

  void print (void)
  {
    if (!dump_file) return;
    gcc_assert (_symtab_node);
    fprintf (dump_file, "%s\n", _symtab_node->name ());
  }
};

I also have a wrapper around a hash-set to add some things that might
be of interest to me in the future:

template <typename KeyId, bool Lazy = false, typename Traits =
default_hash_traits <KeyId> >
class my_set_t
{
private:
  hash_set <KeyId, Lazy, Traits> _impl;
public:
  my_set_t () {}

  void insert (const KeyId &k)
  {
    _impl.add (k);
  }

  bool contains (const KeyId &k)
  {
    return _impl.contains (k);
  }

  void print (void)
  {
    if (!dump_file) return;
    for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
    {
      (*i).print ();
    }
  }
};

I also created a symtab_wrapper_traits because I want to store the
object in the hash-map itself, and so I need to specify how to hash
it.

class symtab_wrapper_traits
{
public:
  typedef symtab_wrapper value_type;
  typedef symtab_wrapper compare_type;
  static const bool empty_zero_p = true;

  static inline hashval_t hash (const value_type &v)
  {
    return pointer_hash <void>::hash (v._symtab_node);
  }

  static bool equal (const value_type &l, const value_type &r)
  {
    return l._symtab_node == r._symtab_node;
  }

  static void mark_deleted (value_type &v)
  {
    v._symtab_node = NULL;
  }

  static void mark_empty (value_type &v)
  {
    v._symtab_node = NULL;
  }

  static void remove (value_type &v)
  {
    v._symtab_node = NULL;
  }

  static bool is_deleted (const value_type &v)
  {
     return !v._symtab_node;
  }

  static bool is_empty (const value_type &v)
  {
     return !v._symtab_node;
  }
};

I then create a global set with my traits and populate it with some
information and later print. It seems to be working:

my_set_t <symtab_wrapper, false, symtab_wrapper_traits> my_set;

 static void
 ipa_hash_generate_summary (void)
 {
  cgraph_node *cnode = NULL;
  FOR_EACH_FUNCTION (cnode)
  {
    symtab_wrapper w (cnode);
    my_set.insert (w);
  }

  my_set.print ();

  return;
 }

Here, I already have some questions, but this is not the biggest issue
that I am having: I believe that the hash_set implementation manages
its own memory, but I am unclear about the semantics of "removed".

* Is "removed" supposed to, for example, call the destructor? Since,
in this particular case, it is only a wrapper and cgraph_node is not
intended to be destroyed, I just assign the pointer to NULL. Not sure
if that's the intention.
* I don't understand the semantics of empty_zero_p, I think it is used
to zero out deleted entries? If I mark something as empty.
* I am assuming that its memory will be re-used, is this the case?
* I see that there is a typed_delete_remove function that is sometimes
used in traits, but I think that it doesn't apply to this case because
I am storing the whole object. Is that the case?

I later did some refactoring (patch 3), which did not seem to impact
the code now, but will impact hash_map later. The refactoring is as
follows, I create an abstract "printable" class

class printable
{
public:
  virtual char* str () = 0;
  void print (void);
};

void
printable::print (void)
{
  if (!dump_file) return;
  char* _str = str ();
  fprintf (dump_file, "%s\n", _str);
  free (_str);
}

Make symtab_wrapper a publically derived class

-class symtab_wrapper
+class symtab_wrapper : public printable

and implemented the str function in symtab_wrapper:

  virtual char* str (void)
   {
    if (!dump_file) return;
    gcc_assert (_symtab_node);
    char *retval = NULL;
    asprintf (&retval, "%s", _symtab_node->name ());
    return retval;
  }

Again, this seems to be working correctly.

What is more interesting is moving from a hash-set to a hash-map. On
the fourth patch, I implemented the hash_map equivalent to the
hash_set that I am describing above:

template <typename KeyId, typename Value, typename Traits>
class my_map_t
{
private:
  hash_map <KeyId, Value, Traits> _impl;
public:
  my_map_t () {}

  void insert (const KeyId &k, const Value &v)
  {
    _impl.put (k, v);
  }

  Value *select (const KeyId &k)
  {
    return _impl.get (k);
  }

  void print (void)
  {
    if (!dump_file) return;
    for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
    {
      (*i).first.print ();
    }
  }
};

my_map_t <symtab_wrapper, int, simple_hashmap_traits
<symtab_wrapper_traits, int>> my_map;

I fill the keys with the same data as the hash_set and for the
integers, just some incrementing values. I try to print the hash_map
after printing the hash_set, but it crashes at the call-site to "str
()"

void
printable::print (void)
{
  if (!dump_file) return;
  char* _str = str (); // <-- it crashes here
  fprintf (dump_file, "%s\n", _str);
  free (_str);
}

The interesting thing is that if I remove printable from
symtab_wrapper's class hierarchy and inline print (and devirtualize
str), then everything works correctly... (or at least to my
understanding). This is illustrated in the fifth patch, which
succeeds.

Now, I am not sure if I am doing something wrong and I am a bit lost
here. If anyone could help me, I'd appreciate it.

I have built these patches on top of trunk and also bootstrapped them.

Many thanks in advance!

[-- Attachment #2: 0004-Runtime-error-when-printing.patch --]
[-- Type: application/octet-stream, Size: 2333 bytes --]

From 47c6006401cea02a55020b0df6e85e96fe5667b8 Mon Sep 17 00:00:00 2001
From: Erick Ochoa <eochoa@gcc.gnu.org>
Date: Wed, 8 Sep 2021 21:58:08 +0200
Subject: [PATCH 4/4] Runtime error when printing

---
 gcc/ipa-hash.c | 44 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/gcc/ipa-hash.c b/gcc/ipa-hash.c
index 9cee5ba5f14..2d7e2819f32 100644
--- a/gcc/ipa-hash.c
+++ b/gcc/ipa-hash.c
@@ -29,12 +29,12 @@
 class printable
 {
 public:
-  virtual char* str () = 0;
-  void print (void);
+  virtual char* str () const = 0;
+  void print (void) const;
 };
 
 void
-printable::print (void)
+printable::print (void) const
 {
   if (!dump_file) return;
   char* _str = str ();
@@ -56,11 +56,11 @@ public:
     : _symtab_node (s)
   { gcc_assert (s); }
 
-  virtual char* str (void)
+  virtual char* str (void) const
   {
     gcc_assert (_symtab_node);
     char *retval = NULL;
-    asprintf (&retval, "%sn", _symtab_node->name ());
+    asprintf (&retval, "%s", _symtab_node->name ());
     return retval;
   }
 };
@@ -83,7 +83,7 @@ public:
     return _impl.contains (k);
   }
 
-  void print (void)
+  void print (void) const
   {
     if (!dump_file) return;
     for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
@@ -93,6 +93,34 @@ public:
   }
 };
 
+template <typename KeyId, typename Value, typename Traits>
+class my_map_t
+{
+private:
+  hash_map <KeyId, Value, Traits> _impl;
+public:
+  my_map_t () {}
+
+  void insert (const KeyId &k, const Value &v)
+  {
+    _impl.put (k, v);
+  }
+
+  Value *select (const KeyId &k)
+  {
+    return _impl.get (k);
+  }
+
+  void print (void)
+  {
+    if (!dump_file) return;
+    for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
+    {
+      (*i).first.print ();
+    }
+  }
+};
+
 class symtab_wrapper_traits 
 {
 public:
@@ -138,18 +166,22 @@ public:
 };
 
 my_set_t <symtab_wrapper, false, symtab_wrapper_traits> my_set;
+my_map_t <symtab_wrapper, int, simple_hashmap_traits <symtab_wrapper_traits, int>> my_map;
 
 static void
 ipa_hash_generate_summary (void)
 {
   cgraph_node *cnode = NULL;
+  int i = 0;
   FOR_EACH_FUNCTION (cnode)
   {
     symtab_wrapper w (cnode);
     my_set.insert (w);
+    my_map.insert (w, i++);
   }
 
   my_set.print ();
+  my_map.print ();
 
   return;
 }
-- 
2.25.1


[-- Attachment #3: main.c --]
[-- Type: application/octet-stream, Size: 463 bytes --]

#include <stdlib.h>
#include <stdio.h>

int
gt (const void *a, const void *b)
{
  return *(int*)a > *(int*)b;
}

int
main (int argc, char** argv)
{
  int *array = calloc(argc, sizeof(int));
  for (int i = 0; i < argc; i++)
  {
    array[i] = rand() % 10;
    printf ("%d ", array[i]);
  }
  printf ("\n");
  qsort (array, argc, sizeof (int), gt);
  for (int i = 0; i < argc; i++)
  {
    printf ("%d ", array[i]);
  }
  printf ("\n");
}

[-- Attachment #4: 0001-Initial-commit.patch --]
[-- Type: application/octet-stream, Size: 5245 bytes --]

From 85da7f5c90a9755578d54755b10df29b9305f9b5 Mon Sep 17 00:00:00 2001
From: Erick Ochoa <eochoa@gcc.gnu.org>
Date: Wed, 8 Sep 2021 20:34:11 +0200
Subject: [PATCH 1/4] Initial commit

---
 gcc/Makefile.in      |  1 +
 gcc/common.opt       |  5 +++
 gcc/ipa-hash.c       | 83 ++++++++++++++++++++++++++++++++++++++++++++
 gcc/ipa-hash.h       |  1 +
 gcc/lto-section-in.c |  1 +
 gcc/lto-streamer.h   |  1 +
 gcc/passes.def       |  1 +
 gcc/tree-pass.h      |  1 +
 8 files changed, 94 insertions(+)
 create mode 100644 gcc/ipa-hash.c
 create mode 100644 gcc/ipa-hash.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f0c560fe45b..efcee73c00f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1449,6 +1449,7 @@ OBJS = \
 	init-regs.o \
 	internal-fn.o \
 	ipa-cp.o \
+	ipa-hash.o \
 	ipa-sra.o \
 	ipa-devirt.o \
 	ipa-fnsummary.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index f103a7de004..c1163a9a501 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1866,6 +1866,10 @@ fipa-pta
 Common Var(flag_ipa_pta) Init(0) Optimization
 Perform interprocedural points-to analysis.
 
+fipa-hash
+Common Var(flag_ipa_hash) Init (0) Optimization
+Please help me understand the problem.
+
 fipa-pure-const
 Common Var(flag_ipa_pure_const) Init(0) Optimization
 Discover pure and const functions.
@@ -3528,4 +3532,5 @@ fipa-ra
 Common Var(flag_ipa_ra) Optimization
 Use caller save register across calls if possible.
 
+
 ; This comment is to ensure we retain the blank line above.
diff --git a/gcc/ipa-hash.c b/gcc/ipa-hash.c
new file mode 100644
index 00000000000..84abe1da0d9
--- /dev/null
+++ b/gcc/ipa-hash.c
@@ -0,0 +1,83 @@
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple-expr.h"
+#include "predict.h"
+#include "alloc-pool.h"
+#include "tree-pass.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+#include "fold-const.h"
+#include "gimple-fold.h"
+#include "symbol-summary.h"
+#include "tree-vrp.h"
+#include "ipa-prop.h"
+#include "tree-pretty-print.h"
+#include "tree-inline.h"
+#include "ipa-fnsummary.h"
+#include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "gimple.h"
+#include "data-streamer.h"
+#include "lto-streamer.h"
+#include "ipa-hash.h"
+
+static void
+ipa_hash_generate_summary (void)
+{
+  return;
+}
+
+static unsigned
+ipa_pta_execute (void)
+{
+  if (dump_file) fprintf (dump_file, "hello from wpa\n");
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_ipa_hash =
+{
+  IPA_PASS,
+  "hash",
+  OPTGROUP_NONE,
+  TV_NONE,
+  (PROP_cfg | PROP_ssa),
+  0,
+  0,
+  0,
+  0,
+};
+
+class pass_ipa_hash : public ipa_opt_pass_d
+{
+public:
+  pass_ipa_hash (gcc::context *ctxt)
+    : ipa_opt_pass_d (pass_data_ipa_hash, ctxt,
+		    ipa_hash_generate_summary,
+		    NULL,
+		    NULL,
+		    NULL,
+		    NULL,
+		    NULL,
+		    0,
+		    NULL,
+		    NULL)
+  {}
+
+  virtual bool gate (function*) { return flag_ipa_hash; }
+  virtual unsigned execute (function*) { return ipa_pta_execute (); }
+
+};
+
+} // anonymous-namespace
+
+ipa_opt_pass_d*
+make_pass_ipa_hash (gcc::context *ctxt)
+{
+  return new pass_ipa_hash (ctxt);
+}
diff --git a/gcc/ipa-hash.h b/gcc/ipa-hash.h
new file mode 100644
index 00000000000..6f70f09beec
--- /dev/null
+++ b/gcc/ipa-hash.h
@@ -0,0 +1 @@
+#pragma once
diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index d16b3353221..bd02c24a00e 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -57,6 +57,7 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "ipa_sra",
   "odr_types",
   "ipa_modref",
+  "ipa_hash",
 };
 
 /* Hooks so that the ipa passes can call into the lto front end to get
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index 5f0335eb76c..493cb4d6fe7 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -229,6 +229,7 @@ enum lto_section_type
   LTO_section_ipa_sra,
   LTO_section_odr_types,
   LTO_section_ipa_modref,
+  LTO_section_ipa_hash,
   LTO_N_SECTION_TYPES		/* Must be last.  */
 };
 
diff --git a/gcc/passes.def b/gcc/passes.def
index d7a1f8c97a6..7c4752db710 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -168,6 +168,7 @@ along with GCC; see the file COPYING3.  If not see
      symbols are not allowed outside of the comdat group.  Privatizing early
      would result in missed optimizations due to this restriction.  */
   NEXT_PASS (pass_ipa_comdats);
+  NEXT_PASS (pass_ipa_hash);
   TERMINATE_PASS_LIST (all_regular_ipa_passes)
 
   /* Simple IPA passes executed after the regular passes.  In WHOPR mode the
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 83941bc0cee..825fc4e98b7 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -531,6 +531,7 @@ extern ipa_opt_pass_d *make_pass_ipa_profile (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_cdtor_merge (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_single_use (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_comdats (gcc::context *ctxt);
+extern ipa_opt_pass_d *make_pass_ipa_hash (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_modref (gcc::context *ctxt);
 
 extern gimple_opt_pass *make_pass_cleanup_cfg_post_optimizing (gcc::context
-- 
2.25.1


[-- Attachment #5: 0003-Printing-better.patch --]
[-- Type: application/octet-stream, Size: 1222 bytes --]

From a714c7847e7ad0c5677340926a28ddac8cf0c183 Mon Sep 17 00:00:00 2001
From: Erick Ochoa <eochoa@gcc.gnu.org>
Date: Wed, 8 Sep 2021 21:31:00 +0200
Subject: [PATCH 3/4] Printing better

---
 gcc/ipa-hash.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/gcc/ipa-hash.c b/gcc/ipa-hash.c
index 58d9157a124..9cee5ba5f14 100644
--- a/gcc/ipa-hash.c
+++ b/gcc/ipa-hash.c
@@ -26,8 +26,24 @@
 #include "lto-streamer.h"
 #include "ipa-hash.h"
 
+class printable
+{
+public:
+  virtual char* str () = 0;
+  void print (void);
+};
+
+void
+printable::print (void)
+{
+  if (!dump_file) return;
+  char* _str = str ();
+  fprintf (dump_file, "%s\n", _str);
+  free (_str);
+}
+
 class symtab_wrapper_traits;
-class symtab_wrapper
+class symtab_wrapper : public printable
 {
 private:
   symtab_node *_symtab_node;
@@ -40,11 +56,12 @@ public:
     : _symtab_node (s)
   { gcc_assert (s); }
 
-  void print (void)
+  virtual char* str (void)
   {
-    if (!dump_file) return;
     gcc_assert (_symtab_node);
-    fprintf (dump_file, "%s\n", _symtab_node->name ());
+    char *retval = NULL;
+    asprintf (&retval, "%sn", _symtab_node->name ());
+    return retval;
   }
 };
 
-- 
2.25.1


[-- Attachment #6: 0002-Printing.patch --]
[-- Type: application/octet-stream, Size: 2686 bytes --]

From 35b5598e5d21f979fcc038c7b5bb457c25d5d889 Mon Sep 17 00:00:00 2001
From: Erick Ochoa <eochoa@gcc.gnu.org>
Date: Wed, 8 Sep 2021 21:26:31 +0200
Subject: [PATCH 2/4] Printing

---
 gcc/ipa-hash.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 105 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-hash.c b/gcc/ipa-hash.c
index 84abe1da0d9..58d9157a124 100644
--- a/gcc/ipa-hash.c
+++ b/gcc/ipa-hash.c
@@ -26,16 +26,120 @@
 #include "lto-streamer.h"
 #include "ipa-hash.h"
 
+class symtab_wrapper_traits;
+class symtab_wrapper
+{
+private:
+  symtab_node *_symtab_node;
+public:
+  friend symtab_wrapper_traits;
+  symtab_wrapper ()
+    : _symtab_node (NULL)
+  {}
+  symtab_wrapper (symtab_node *s)
+    : _symtab_node (s)
+  { gcc_assert (s); }
+
+  void print (void)
+  {
+    if (!dump_file) return;
+    gcc_assert (_symtab_node);
+    fprintf (dump_file, "%s\n", _symtab_node->name ());
+  }
+};
+
+template <typename KeyId, bool Lazy = false, typename Traits = default_hash_traits <KeyId> >
+class my_set_t
+{
+private:
+  hash_set <KeyId, Lazy, Traits> _impl;
+public:
+  my_set_t () {}
+
+  void insert (const KeyId &k)
+  {
+    _impl.add (k);
+  }
+
+  bool contains (const KeyId &k)
+  {
+    return _impl.contains (k);
+  }
+
+  void print (void)
+  {
+    if (!dump_file) return;
+    for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
+    {
+      (*i).print ();
+    }
+  }
+};
+
+class symtab_wrapper_traits 
+{
+public:
+  typedef symtab_wrapper value_type;
+  typedef symtab_wrapper compare_type;
+  static const bool empty_zero_p = true;
+
+  static inline hashval_t hash (const value_type &v)
+  {
+    return pointer_hash <void>::hash (v._symtab_node);
+  }
+
+  static bool equal (const value_type &l, const value_type &r)
+  {
+    return l._symtab_node == r._symtab_node;
+  }
+
+  static void mark_deleted (value_type &v)
+  {
+    v._symtab_node = NULL;
+  }
+
+  static void mark_empty (value_type &v)
+  {
+    v._symtab_node = NULL;
+  }
+
+  static void remove (value_type &v)
+  {
+    v._symtab_node = NULL;
+  }
+
+  static bool is_deleted (const value_type &v)
+  {
+     return !v._symtab_node;
+  }
+
+  static bool is_empty (const value_type &v)
+  {
+     return !v._symtab_node;
+  }
+
+};
+
+my_set_t <symtab_wrapper, false, symtab_wrapper_traits> my_set;
+
 static void
 ipa_hash_generate_summary (void)
 {
+  cgraph_node *cnode = NULL;
+  FOR_EACH_FUNCTION (cnode)
+  {
+    symtab_wrapper w (cnode);
+    my_set.insert (w);
+  }
+
+  my_set.print ();
+
   return;
 }
 
 static unsigned
 ipa_pta_execute (void)
 {
-  if (dump_file) fprintf (dump_file, "hello from wpa\n");
   return 0;
 }
 
-- 
2.25.1


[-- Attachment #7: 0005-Fixes-the-issue.patch --]
[-- Type: application/octet-stream, Size: 1135 bytes --]

From b93f13468c48c77e3f0b445b1217ad73a05d4864 Mon Sep 17 00:00:00 2001
From: Erick Ochoa <eochoa@gcc.gnu.org>
Date: Wed, 8 Sep 2021 22:42:57 +0200
Subject: [PATCH 5/5] Fixes the issue

---
 gcc/ipa-hash.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-hash.c b/gcc/ipa-hash.c
index 2d7e2819f32..5f37bc7c69e 100644
--- a/gcc/ipa-hash.c
+++ b/gcc/ipa-hash.c
@@ -43,7 +43,7 @@ printable::print (void) const
 }
 
 class symtab_wrapper_traits;
-class symtab_wrapper : public printable
+class symtab_wrapper 
 {
 private:
   symtab_node *_symtab_node;
@@ -56,13 +56,22 @@ public:
     : _symtab_node (s)
   { gcc_assert (s); }
 
-  virtual char* str (void) const
+
+  char* str (void) const
   {
     gcc_assert (_symtab_node);
     char *retval = NULL;
     asprintf (&retval, "%s", _symtab_node->name ());
     return retval;
   }
+
+  void print (void) const
+  {
+    if (!dump_file) return;
+    char* _str = str ();
+    fprintf (dump_file, "%s\n", _str);
+    free (_str);
+  }
 };
 
 template <typename KeyId, bool Lazy = false, typename Traits = default_hash_traits <KeyId> >
-- 
2.25.1


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Some questions on hash_set/hash_map traits
  2021-09-08 20:48 Some questions on hash_set/hash_map traits Erick Ochoa
@ 2021-09-09  7:05 ` Richard Biener
  0 siblings, 0 replies; 2+ messages in thread
From: Richard Biener @ 2021-09-09  7:05 UTC (permalink / raw)
  To: Erick Ochoa; +Cc: GCC Development

On Wed, Sep 8, 2021 at 10:50 PM Erick Ochoa via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hello,
>
> I am refactoring some old code that runs as an IPA_PASS. This code is
> intended to run at link-time using the LTO framework and so I am
> always testing it that way. At the moment, I am trying to use
> hash-maps but having some difficulties. I am adding some patches that
> will help illustrate along the way.
>
> The first patch simply adds a link-time optimization pass that will be
> used to demo the problem that I am having. One can run with -flto
> -fipa-hash. During the first patch, the pass does not do any
> meaningful work but it just helps to set up the stage.
>
> For the second patch I create a wrapper around a symtab_node. This
> wrapper is intended to add some other functionality to elements that I
> might want to insert in hash-sets or hash-maps.
>
> class symtab_wrapper
> {
> private:
>   symtab_node *_symtab_node;
> public:
>   friend symtab_wrapper_traits;
>   symtab_wrapper ()
>     : _symtab_node (NULL)
>   {}
>   symtab_wrapper (symtab_node *s)
>     : _symtab_node (s)
>   { gcc_assert (s); }
>
>   void print (void)
>   {
>     if (!dump_file) return;
>     gcc_assert (_symtab_node);
>     fprintf (dump_file, "%s\n", _symtab_node->name ());
>   }
> };
>
> I also have a wrapper around a hash-set to add some things that might
> be of interest to me in the future:
>
> template <typename KeyId, bool Lazy = false, typename Traits =
> default_hash_traits <KeyId> >
> class my_set_t
> {
> private:
>   hash_set <KeyId, Lazy, Traits> _impl;
> public:
>   my_set_t () {}
>
>   void insert (const KeyId &k)
>   {
>     _impl.add (k);
>   }
>
>   bool contains (const KeyId &k)
>   {
>     return _impl.contains (k);
>   }
>
>   void print (void)
>   {
>     if (!dump_file) return;
>     for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
>     {
>       (*i).print ();
>     }
>   }
> };
>
> I also created a symtab_wrapper_traits because I want to store the
> object in the hash-map itself, and so I need to specify how to hash
> it.
>
> class symtab_wrapper_traits
> {
> public:
>   typedef symtab_wrapper value_type;
>   typedef symtab_wrapper compare_type;
>   static const bool empty_zero_p = true;
>
>   static inline hashval_t hash (const value_type &v)
>   {
>     return pointer_hash <void>::hash (v._symtab_node);
>   }
>
>   static bool equal (const value_type &l, const value_type &r)
>   {
>     return l._symtab_node == r._symtab_node;
>   }
>
>   static void mark_deleted (value_type &v)
>   {
>     v._symtab_node = NULL;
>   }
>
>   static void mark_empty (value_type &v)
>   {
>     v._symtab_node = NULL;
>   }
>
>   static void remove (value_type &v)
>   {
>     v._symtab_node = NULL;
>   }
>
>   static bool is_deleted (const value_type &v)
>   {
>      return !v._symtab_node;
>   }
>
>   static bool is_empty (const value_type &v)
>   {
>      return !v._symtab_node;
>   }
> };
>
> I then create a global set with my traits and populate it with some
> information and later print. It seems to be working:
>
> my_set_t <symtab_wrapper, false, symtab_wrapper_traits> my_set;
>
>  static void
>  ipa_hash_generate_summary (void)
>  {
>   cgraph_node *cnode = NULL;
>   FOR_EACH_FUNCTION (cnode)
>   {
>     symtab_wrapper w (cnode);
>     my_set.insert (w);
>   }
>
>   my_set.print ();
>
>   return;
>  }
>
> Here, I already have some questions, but this is not the biggest issue
> that I am having: I believe that the hash_set implementation manages
> its own memory, but I am unclear about the semantics of "removed".
>
> * Is "removed" supposed to, for example, call the destructor? Since,
> in this particular case, it is only a wrapper and cgraph_node is not
> intended to be destroyed, I just assign the pointer to NULL. Not sure
> if that's the intention.
> * I don't understand the semantics of empty_zero_p, I think it is used
> to zero out deleted entries? If I mark something as empty.
> * I am assuming that its memory will be re-used, is this the case?
> * I see that there is a typed_delete_remove function that is sometimes
> used in traits, but I think that it doesn't apply to this case because
> I am storing the whole object. Is that the case?
>
> I later did some refactoring (patch 3), which did not seem to impact
> the code now, but will impact hash_map later. The refactoring is as
> follows, I create an abstract "printable" class
>
> class printable
> {
> public:
>   virtual char* str () = 0;
>   void print (void);
> };
>
> void
> printable::print (void)
> {
>   if (!dump_file) return;
>   char* _str = str ();
>   fprintf (dump_file, "%s\n", _str);
>   free (_str);
> }
>
> Make symtab_wrapper a publically derived class
>
> -class symtab_wrapper
> +class symtab_wrapper : public printable
>
> and implemented the str function in symtab_wrapper:
>
>   virtual char* str (void)
>    {
>     if (!dump_file) return;
>     gcc_assert (_symtab_node);
>     char *retval = NULL;
>     asprintf (&retval, "%s", _symtab_node->name ());
>     return retval;
>   }
>
> Again, this seems to be working correctly.
>
> What is more interesting is moving from a hash-set to a hash-map. On
> the fourth patch, I implemented the hash_map equivalent to the
> hash_set that I am describing above:
>
> template <typename KeyId, typename Value, typename Traits>
> class my_map_t
> {
> private:
>   hash_map <KeyId, Value, Traits> _impl;
> public:
>   my_map_t () {}
>
>   void insert (const KeyId &k, const Value &v)
>   {
>     _impl.put (k, v);
>   }
>
>   Value *select (const KeyId &k)
>   {
>     return _impl.get (k);
>   }
>
>   void print (void)
>   {
>     if (!dump_file) return;
>     for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
>     {
>       (*i).first.print ();
>     }
>   }
> };
>
> my_map_t <symtab_wrapper, int, simple_hashmap_traits
> <symtab_wrapper_traits, int>> my_map;
>
> I fill the keys with the same data as the hash_set and for the
> integers, just some incrementing values. I try to print the hash_map
> after printing the hash_set, but it crashes at the call-site to "str
> ()"
>
> void
> printable::print (void)
> {
>   if (!dump_file) return;
>   char* _str = str (); // <-- it crashes here
>   fprintf (dump_file, "%s\n", _str);
>   free (_str);
> }
>
> The interesting thing is that if I remove printable from
> symtab_wrapper's class hierarchy and inline print (and devirtualize
> str), then everything works correctly... (or at least to my
> understanding). This is illustrated in the fifth patch, which
> succeeds.
>
> Now, I am not sure if I am doing something wrong and I am a bit lost
> here. If anyone could help me, I'd appreciate it.
>
> I have built these patches on top of trunk and also bootstrapped them.
>
> Many thanks in advance!

There are likely still holes in how hash-set and hash-map handle
non-POD key/value types.  There have been patches floating around
and also some changes to ensure that constructors and destructors
are appropriately invoked.  As with your printable issue it's likely
the object is not properly constructed and thus the vtable pointer
initialization is missing.

The empty/zero/deleted stuff is to aid in representing hash table
entries that are unoccupied, either empty or deleted.  For every
object you put into a hash you have to define representations for
those states (for hash-map that applies to the 'key' type).

Richard.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-09-09  7:05 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-08 20:48 Some questions on hash_set/hash_map traits Erick Ochoa
2021-09-09  7:05 ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).