public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Martin Liška" <mliska@suse.cz>
To: Richard Biener <richard.guenther@gmail.com>, Jeff Law <law@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
	Alexander Monakov <amonakov@ispras.ru>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	Nathan Sidwell <nathan@acm.org>, Jason Merrill <jason@redhat.com>,
	Paul Richard Thomas <paul.richard.thomas@gmail.com>,
	Martin Jambor <mjambor@suse.cz>
Subject: Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.
Date: Tue, 21 May 2019 11:02:00 -0000	[thread overview]
Message-ID: <da052a95-6d5d-2b6c-ffce-d3f92a4ae1ac@suse.cz> (raw)
In-Reply-To: <CAFiYyc1oRW5_qrQ3=DDK5SqD7XhKo7X8-AApXyKgQ2xs3mWgcg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5414 bytes --]

On 5/21/19 11:38 AM, Richard Biener wrote:
> On Tue, May 21, 2019 at 12:07 AM Jeff Law <law@redhat.com> wrote:
>>
>> On 5/13/19 1:41 AM, Martin Liška wrote:
>>> On 11/8/18 9:56 AM, Martin Liška wrote:
>>>> On 11/7/18 11:23 PM, Jeff Law wrote:
>>>>> On 10/30/18 6:28 AM, Martin Liška wrote:
>>>>>> On 10/30/18 11:03 AM, Jakub Jelinek wrote:
>>>>>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
>>>>>>>> +hashtab_chk_error ()
>>>>>>>> +{
>>>>>>>> +  fprintf (stderr, "hash table checking failed: "
>>>>>>>> +           "equal operator returns true for a pair "
>>>>>>>> +           "of values with a different hash value");
>>>>>>> BTW, either use internal_error here, or at least if using fprintf
>>>>>>> terminate with \n, in your recent mail I saw:
>>>>>>> ...different hash valueduring RTL pass: vartrack
>>>>>>>                     ^^^^^^
>>>>>> Sure, fixed in attached patch.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>>>> +  gcc_unreachable ();
>>>>>>>> +}
>>>>>>>   Jakub
>>>>>>>
>>>>>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
>>>>>>
>>>>>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
>>>>>> From: marxin <mliska@suse.cz>
>>>>>> Date: Mon, 29 Oct 2018 09:38:21 +0100
>>>>>> Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
>>>>>>
>>>>>> ---
>>>>>>  gcc/hash-table.h | 40 +++++++++++++++++++++++++++++++++++++++-
>>>>>>  1 file changed, 39 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>>>>>> index bd83345c7b8..694eedfc4be 100644
>>>>>> --- a/gcc/hash-table.h
>>>>>> +++ b/gcc/hash-table.h
>>>>>> @@ -503,6 +503,7 @@ private:
>>>>>>
>>>>>>    value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
>>>>>>    value_type *find_empty_slot_for_expand (hashval_t);
>>>>>> +  void verify (const compare_type &comparable, hashval_t hash);
>>>>>>    bool too_empty_p (unsigned int);
>>>>>>    void expand ();
>>>>>>    static bool is_deleted (value_type &v)
>>>>>> @@ -882,8 +883,12 @@ hash_table<Descriptor, Allocator>
>>>>>>    if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>>>>>>      expand ();
>>>>>>
>>>>>> -  m_searches++;
>>>>>> +#if ENABLE_EXTRA_CHECKING
>>>>>> +    if (insert == INSERT)
>>>>>> +      verify (comparable, hash);
>>>>>> +#endif
>>>>>>
>>>>>> +  m_searches++;
>>>>>>    value_type *first_deleted_slot = NULL;
>>>>>>    hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>>>>>>    hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
>>>>>> @@ -930,6 +935,39 @@ hash_table<Descriptor, Allocator>
>>>>>>    return &m_entries[index];
>>>>>>  }
>>>>>>
>>>>>> +#if ENABLE_EXTRA_CHECKING
>>>>>> +
>>>>>> +/* Report a hash table checking error.  */
>>>>>> +
>>>>>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
>>>>>> +static void
>>>>>> +hashtab_chk_error ()
>>>>>> +{
>>>>>> +  fprintf (stderr, "hash table checking failed: "
>>>>>> +     "equal operator returns true for a pair "
>>>>>> +     "of values with a different hash value\n");
>>>>>> +  gcc_unreachable ();
>>>>>> +}
>>>>> I think an internal_error here is probably still better than a simple
>>>>> fprintf, even if the fprintf is terminated with a \n :-)
>>>> Fully agree with that, but I see a lot of build errors when using internal_error.
>>>>
>>>>> The question then becomes can we bootstrap with this stuff enabled and
>>>>> if not, are we likely to soon?  It'd be a shame to put it into
>>>>> EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
>>>>> because we've got too many bugs to fix.
>>>> Unfortunately it's blocked with these 2 PRs:
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847
>>> Hi.
>>>
>>> I've just added one more PR:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450
>>>
>>> I'm sending updated version of the patch that provides a disablement for the 3 PRs
>>> with a new function disable_sanitize_eq_and_hash.
>>>
>>> With that I can bootstrap and finish tests. However, I've done that with a patch
>>> limits maximal number of checks:
>> So rather than call the disable_sanitize_eq_and_hash, can you have its
>> state set up when you instantiate the object?  It's not a huge deal,
>> just thinking about loud.
>>
>>
>>
>> So how do we want to go forward, particularly the EXTRA_EXTRA checking
>> issue :-)
> 
> There is at least one PR where we have a table where elements _in_ the
> table are never compared against each other but always against another
> object (I guess that's usual even), but the setup is in a way that the
> comparison function only works with those.  With the patch we verify
> hashing/comparison for something that is never used.
> 
> So - wouldn't it be more "correct" to only verify comparison/hashing
> at lookup time, using the object from the lookup and verify that against
> all other elements?

I don't a have problem with that. Apparently this changes fixes
PR90450 and PR87847.

Changes from previous version:
- verification happens only when an element is searched (not inserted)
- new argument 'sanitize_eq_and_hash' added for hash_table::hash_table
- new param has been introduced hash-table-verification-limit in order
  to limit number of elements that are compared within a table
- verification happens only with flag_checking >= 2

I've been bootstrapping and testing the patch right now.

Martin

> 
> Richard.
> 
>>
>> Jeff


[-- Attachment #2: 0001-Enable-sanitization-for-hash-tables.patch --]
[-- Type: text/x-patch, Size: 8202 bytes --]

From 47df69feb035672056d2ad86c53eda9c286ed92a Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Mon, 13 May 2019 07:16:22 +0200
Subject: [PATCH] Enable sanitization for hash tables.

---
 gcc/cselib.c     |  9 ++++++--
 gcc/hash-set.h   |  2 +-
 gcc/hash-table.c |  3 +++
 gcc/hash-table.h | 60 ++++++++++++++++++++++++++++++++++++++++++++----
 gcc/params.def   |  6 +++++
 gcc/toplev.c     |  4 ++++
 6 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/gcc/cselib.c b/gcc/cselib.c
index 84c17c23f6d..a1cbdec9718 100644
--- a/gcc/cselib.c
+++ b/gcc/cselib.c
@@ -2858,9 +2858,14 @@ cselib_init (int record_what)
     }
   used_regs = XNEWVEC (unsigned int, cselib_nregs);
   n_used_regs = 0;
-  cselib_hash_table = new hash_table<cselib_hasher> (31);
+  /* FIXME: enable sanitization (PR87845) */
+  cselib_hash_table
+    = new hash_table<cselib_hasher> (31, /* ggc */ false,
+				     /* sanitize_eq_and_hash */ false);
   if (cselib_preserve_constants)
-    cselib_preserved_hash_table = new hash_table<cselib_hasher> (31);
+    cselib_preserved_hash_table
+      = new hash_table<cselib_hasher> (31, /* ggc */ false,
+				       /* sanitize_eq_and_hash */ false);
   next_uid = 1;
 }
 
diff --git a/gcc/hash-set.h b/gcc/hash-set.h
index de3532f5f68..d891ed78297 100644
--- a/gcc/hash-set.h
+++ b/gcc/hash-set.h
@@ -28,7 +28,7 @@ class hash_set
 public:
   typedef typename Traits::value_type Key;
   explicit hash_set (size_t n = 13, bool ggc = false CXX_MEM_STAT_INFO)
-    : m_table (n, ggc, GATHER_STATISTICS, HASH_SET_ORIGIN PASS_MEM_STAT) {}
+    : m_table (n, ggc, true, GATHER_STATISTICS, HASH_SET_ORIGIN PASS_MEM_STAT) {}
 
   /* Create a hash_set in gc memory with space for at least n elements.  */
 
diff --git a/gcc/hash-table.c b/gcc/hash-table.c
index 646a7a1c497..8e86fffa36f 100644
--- a/gcc/hash-table.c
+++ b/gcc/hash-table.c
@@ -74,6 +74,9 @@ struct prime_ent const prime_tab[] = {
   { 0xfffffffb, 0x00000006, 0x00000008, 31 }
 };
 
+/* Limit number of comparisons when calling hash_table<>::verify.  */
+unsigned int hash_table_sanitize_eq_limit;
+
 /* The following function returns an index into the above table of the
    nearest prime number which is greater than N, and near a power of two. */
 
diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 4178616478e..b3e579b8822 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -295,6 +295,8 @@ struct prime_ent
 
 extern struct prime_ent const prime_tab[];
 
+/* Limit number of comparisons when calling hash_table<>::verify.  */
+extern unsigned int hash_table_sanitize_eq_limit;
 
 /* Functions for computing hash table indexes.  */
 
@@ -371,10 +373,12 @@ class hash_table
 
 public:
   explicit hash_table (size_t, bool ggc = false,
+		       bool sanitize_eq_and_hash = true,
 		       bool gather_mem_stats = GATHER_STATISTICS,
 		       mem_alloc_origin origin = HASH_TABLE_ORIGIN
 		       CXX_MEM_STAT_INFO);
   explicit hash_table (const hash_table &, bool ggc = false,
+		       bool sanitize_eq_and_hash = true,
 		       bool gather_mem_stats = GATHER_STATISTICS,
 		       mem_alloc_origin origin = HASH_TABLE_ORIGIN
 		       CXX_MEM_STAT_INFO);
@@ -516,6 +520,7 @@ private:
 
   value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
   value_type *find_empty_slot_for_expand (hashval_t);
+  void verify (const compare_type &comparable, hashval_t hash);
   bool too_empty_p (unsigned int);
   void expand ();
   static bool is_deleted (value_type &v)
@@ -564,6 +569,9 @@ private:
   /* if m_entries is stored in ggc memory.  */
   bool m_ggc;
 
+  /* True if the table should be sanitized for equal and hash functions.  */
+  bool m_sanitize_eq_and_hash;
+
   /* If we should gather memory statistics for the table.  */
 #if GATHER_STATISTICS
   bool m_gather_mem_stats;
@@ -586,12 +594,13 @@ extern void dump_hash_table_loc_statistics (void);
 template<typename Descriptor, bool Lazy,
 	 template<typename Type> class Allocator>
 hash_table<Descriptor, Lazy, Allocator>::hash_table (size_t size, bool ggc,
+						     bool sanitize_eq_and_hash,
 						     bool gather_mem_stats
 						     ATTRIBUTE_UNUSED,
 						     mem_alloc_origin origin
 						     MEM_STAT_DECL) :
   m_n_elements (0), m_n_deleted (0), m_searches (0), m_collisions (0),
-  m_ggc (ggc)
+  m_ggc (ggc), m_sanitize_eq_and_hash (sanitize_eq_and_hash)
 #if GATHER_STATISTICS
   , m_gather_mem_stats (gather_mem_stats)
 #endif
@@ -617,12 +626,14 @@ template<typename Descriptor, bool Lazy,
 	 template<typename Type> class Allocator>
 hash_table<Descriptor, Lazy, Allocator>::hash_table (const hash_table &h,
 						     bool ggc,
+						     bool sanitize_eq_and_hash,
 						     bool gather_mem_stats
 						     ATTRIBUTE_UNUSED,
 						     mem_alloc_origin origin
 						     MEM_STAT_DECL) :
   m_n_elements (h.m_n_elements), m_n_deleted (h.m_n_deleted),
-  m_searches (0), m_collisions (0), m_ggc (ggc)
+  m_searches (0), m_collisions (0), m_ggc (ggc),
+  m_sanitize_eq_and_hash (sanitize_eq_and_hash)
 #if GATHER_STATISTICS
   , m_gather_mem_stats (gather_mem_stats)
 #endif
@@ -912,7 +923,11 @@ hash_table<Descriptor, Lazy, Allocator>
       entry = &m_entries[index];
       if (is_empty (*entry)
           || (!is_deleted (*entry) && Descriptor::equal (*entry, comparable)))
-        return *entry;
+	{
+	  if (m_sanitize_eq_and_hash)
+	    verify (comparable, hash);
+	  return *entry;
+	}
     }
 }
 
@@ -941,8 +956,10 @@ hash_table<Descriptor, Lazy, Allocator>
   if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
     expand ();
 
-  m_searches++;
+  if (m_sanitize_eq_and_hash && insert == NO_INSERT)
+    verify (comparable, hash);
 
+  m_searches++;
   value_type *first_deleted_slot = NULL;
   hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
   hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
@@ -989,6 +1006,41 @@ hash_table<Descriptor, Lazy, Allocator>
   return &m_entries[index];
 }
 
+/* Report a hash table checking error.  */
+
+ATTRIBUTE_NORETURN ATTRIBUTE_COLD
+static void
+hashtab_chk_error ()
+{
+  fprintf (stderr, "hash table checking failed: "
+	   "equal operator returns true for a pair "
+	   "of values with a different hash value\n");
+#ifndef GENERATOR_FILE
+  gcc_unreachable ();
+#else
+  exit (1);
+#endif
+}
+
+/* Verify that all existing elements in th hash table which are
+   equal to COMPARABLE have an equal HASH value provided as argument.  */
+
+template<typename Descriptor, bool Lazy,
+	 template<typename Type> class Allocator>
+void
+hash_table<Descriptor, Lazy, Allocator>
+::verify (const compare_type &comparable, hashval_t hash)
+{
+  for (size_t i = 0; i < MIN (hash_table_sanitize_eq_limit, m_size); i++)
+    {
+      value_type *entry = &m_entries[i];
+      if (!is_empty (*entry) && !is_deleted (*entry)
+	  && hash != Descriptor::hash (*entry)
+	  && Descriptor::equal (*entry, comparable))
+	hashtab_chk_error ();
+    }
+}
+
 /* This function deletes an element with the given COMPARABLE value
    from hash table starting with the given HASH.  If there is no
    matching element in the hash table, this function does nothing. */
diff --git a/gcc/params.def b/gcc/params.def
index 6b7f7eb5bae..e5ca6ff45e4 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1439,6 +1439,12 @@ DEFPARAM(PARAM_GIMPLE_FE_COMPUTED_HOT_BB_THRESHOLD,
 	 " The parameter is used only in GIMPLE FE.",
 	 0, 0, 0)
 
+DEFPARAM(PARAM_HASH_TABLE_VERIFICATION_LIMIT,
+	 "hash-table-verification-limit",
+	 "The number of elements for which hash table verification is done for "
+	 "each searched element.",
+	 100, 0, 0)
+
 /*
 
 Local variables:
diff --git a/gcc/toplev.c b/gcc/toplev.c
index d300ac2ec89..116be7be395 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1799,6 +1799,10 @@ process_options (void)
   optimization_default_node = build_optimization_node (&global_options);
   optimization_current_node = optimization_default_node;
 
+  if (flag_checking >= 2)
+    hash_table_sanitize_eq_limit
+      = PARAM_VALUE (PARAM_HASH_TABLE_VERIFICATION_LIMIT);
+
   /* Please don't change global_options after this point, those changes won't
      be reflected in optimization_{default,current}_node.  */
 }
-- 
2.21.0


  reply	other threads:[~2019-05-21 11:02 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-29 12:02 Martin Liška
2018-10-29 14:28 ` Alexander Monakov
2018-10-29 15:56   ` Martin Liška
2018-10-30 10:32     ` Jakub Jelinek
2018-10-30 14:17       ` Martin Liška
2018-11-07 22:24         ` Jeff Law
2018-11-07 22:44           ` Jakub Jelinek
2018-11-08  8:56           ` Martin Liška
2019-05-13  7:42             ` Martin Liška
2019-05-20 17:26               ` Jason Merrill
2019-05-20 22:07               ` Jeff Law
2019-05-21  9:38                 ` Richard Biener
2019-05-21 11:02                   ` Martin Liška [this message]
2019-05-21 11:52                     ` Richard Biener
2019-05-22  9:13                       ` Martin Liška
2019-05-31 13:23                         ` Richard Biener
2019-05-31 13:35                           ` Martin Liška
2019-05-31 22:10                         ` Jeff Law
2019-06-03 13:35                           ` Martin Liška
2019-06-07  8:57                             ` Richard Biener
2019-06-07 12:04                               ` Martin Liška
2019-06-07 12:09                                 ` Richard Biener
2019-06-07 12:13                                   ` Martin Liška
2019-06-07 14:48                                     ` Martin Sebor
2019-06-07 21:43                                     ` Jason Merrill
2019-06-10  7:08                                       ` Martin Liška
2019-06-10 18:22                                         ` Jason Merrill
2019-06-11  7:41                                           ` Martin Liška
2019-06-11 12:28                                             ` Jason Merrill
2019-06-11 13:16                                               ` Martin Liška
2019-06-11 19:02                                                 ` Jason Merrill
2019-06-12  7:59                                                   ` Richard Biener
2019-06-12  8:02                                                     ` Martin Liška
2019-06-12  9:15                                                       ` Martin Liška
2019-06-12  9:41                                                         ` Richard Biener
2019-06-12 11:45                                                           ` Martin Liška
2019-06-12 12:50                                                             ` Richard Biener
2019-06-12 13:05                                                               ` Martin Liška
2019-06-23 23:08                                 ` Ian Lance Taylor
2019-06-24 12:29                                   ` Richard Biener
2019-06-24 13:51                                     ` Martin Liška
2019-06-24 14:10                                       ` Richard Biener
2019-06-25 10:25                                         ` Martin Liška
2019-06-25 11:59                                           ` Martin Liška
2019-06-25 14:23                                           ` Richard Biener
2018-10-30 10:25 ` hash-table violation in cselib.c Martin Liška
2018-11-01 11:57   ` Martin Liška
2018-10-30 10:46 ` hash-table violation in gcc/fortran/trans-decl.c Martin Liška
2018-10-31 10:00   ` Trevor Saunders
2018-10-31 10:18     ` Martin Liška
2018-10-30 11:07 ` hash-table violation in gcc/cp/pt.c Martin Liška
2018-10-30 11:21   ` Martin Liška
2018-11-01 12:06     ` Martin Liška

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=da052a95-6d5d-2b6c-ffce-d3f92a4ae1ac@suse.cz \
    --to=mliska@suse.cz \
    --cc=amonakov@ispras.ru \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=jason@redhat.com \
    --cc=law@redhat.com \
    --cc=mjambor@suse.cz \
    --cc=nathan@acm.org \
    --cc=paul.richard.thomas@gmail.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).