public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Martin Liška" <mliska@suse.cz>
To: Alexander Monakov <amonakov@ispras.ru>
Cc: gcc-patches@gcc.gnu.org, Nathan Sidwell <nathan@acm.org>,
	Jason Merrill <jason@redhat.com>,
	Jakub Jelinek <jakub@redhat.com>,
	Paul Richard Thomas <paul.richard.thomas@gmail.com>,
	Martin Jambor <mjambor@suse.cz>
Subject: Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.
Date: Mon, 29 Oct 2018 15:56:00 -0000	[thread overview]
Message-ID: <b3f846d3-b387-a986-22b0-071ef92791d1@suse.cz> (raw)
In-Reply-To: <alpine.LNX.2.20.13.1810291537420.16308@monopod.intra.ispras.ru>

[-- Attachment #1: Type: text/plain, Size: 2443 bytes --]

On 10/29/18 2:53 PM, Alexander Monakov wrote:
> On Mon, 29 Oct 2018, Martin Liška wrote:
>> My question is whether we want to have in GCC 9 time frame or should I wait with that?
>> Does it worth implementing?
> 
> This is cool, thanks! A few questions/comments on the patch.

Hi.

Thanks for support.

> 
> I think there are places that use libiberty C-style hashtab (htab_t), would it
> make sense to have this kind of checking for them as well?  I think it's going
> to be more complicated though, no need to do both in one step.

Sure, that can be also added in the future!

> 
> I would recommend to factor out the error reporting path into a separate
> non-template function, e.g. hashtab_chk_error.  See how qsort_chk_error
> has "cold" and "noreturn" attributes and invokes problematic comparators:
> the idea was that a developer can 'break qsort_chk_error' in GDB and then
> easily step into broken comparators.

I did so.

> 
> Furthermore, it might be nice to investigate if the entire checking loop can
> be factored out somehow into a non-template function to avoid having multiple
> instantiations of it for different hashtable template parameters.

Note that Jakub is preferring to enable the checking only in non-release builds.
Thus I've guarded the checking with #if ENABLE_EXTRA_CHECKING. That said I won't
care much about saving a binary size.

The checking function is dependent on template arguments, so the avoidance is probably
not possible.

> 
> On my first attempt to submit qsort_chk, Richi asked how much it slows down
> stage2, do you have some data on the cost of this hashtable checking?

It does not survive bootstrap right now (too many cselib_find_slot failures).

> 
> I think it is possible to optimize this a bit: instead of checking on
> insertions, check on deletions and when destroying the table (with care to do
> n^2/2 rather than n^2 tests). Although, such approach would miss errors on
> hashtables that are never destroyed (leaked or deliberately not deleted).

Interesting idea. Benefit of the current approach is that the ICE is triggered
as soon as first bad element is inserted into a hast table. Which means it's easier
to debug. One problem with the destruction is that one equals function is defined on:
Descriptor::equal (Descriptor::value_type, Descriptor::compare_type). One can have
a table where these are not equal and so that equals can't be called.

Martin

> 
> Alexander
> 


[-- Attachment #2: 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch --]
[-- Type: text/x-patch, Size: 2427 bytes --]

From 9afc42d324820cc0fc8a8a8b2d9cccd218734d10 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Mon, 29 Oct 2018 09:38:21 +0100
Subject: [PATCH] Sanitize equals and hash functions in hash-tables.

---
 gcc/hash-table.h | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index bd83345c7b8..43adfac2dc0 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -503,6 +503,7 @@ private:
 
   value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
   value_type *find_empty_slot_for_expand (hashval_t);
+  void verify (const compare_type &comparable, hashval_t hash);
   bool too_empty_p (unsigned int);
   void expand ();
   static bool is_deleted (value_type &v)
@@ -882,8 +883,12 @@ hash_table<Descriptor, Allocator>
   if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
     expand ();
 
-  m_searches++;
+#if ENABLE_EXTRA_CHECKING
+    if (insert == INSERT)
+      verify (comparable, hash);
+#endif
 
+  m_searches++;
   value_type *first_deleted_slot = NULL;
   hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
   hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
@@ -930,6 +935,39 @@ hash_table<Descriptor, Allocator>
   return &m_entries[index];
 }
 
+#if ENABLE_EXTRA_CHECKING
+
+/* Report a hash table checking error.  */
+
+ATTRIBUTE_NORETURN ATTRIBUTE_COLD
+static void
+hashtab_chk_error ()
+{
+  fprintf (stderr, "hash table checking failed: "
+	   "equal operator returns true for a pair "
+	   "of values with a different hash value");
+  gcc_unreachable ();
+}
+
+/* Verify that all existing elements in th hash table which are
+   equal to COMPARABLE have an equal HASH value provided as argument.  */
+
+template<typename Descriptor, template<typename Type> class Allocator>
+void
+hash_table<Descriptor, Allocator>
+::verify (const compare_type &comparable, hashval_t hash)
+{
+  for (size_t i = 0; i < m_size; i++)
+    {
+      value_type *entry = &m_entries[i];
+      if (!is_empty (*entry) && !is_deleted (*entry)
+	  && Descriptor::equal (*entry, comparable)
+	  && hash != Descriptor::hash (*entry))
+	hashtab_chk_error ();
+    }
+}
+#endif
+
 /* This function deletes an element with the given COMPARABLE value
    from hash table starting with the given HASH.  If there is no
    matching element in the hash table, this function does nothing. */
-- 
2.19.0


  reply	other threads:[~2018-10-29 15:14 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-29 12:02 Martin Liška
2018-10-29 14:28 ` Alexander Monakov
2018-10-29 15:56   ` Martin Liška [this message]
2018-10-30 10:32     ` Jakub Jelinek
2018-10-30 14:17       ` Martin Liška
2018-11-07 22:24         ` Jeff Law
2018-11-07 22:44           ` Jakub Jelinek
2018-11-08  8:56           ` Martin Liška
2019-05-13  7:42             ` Martin Liška
2019-05-20 17:26               ` Jason Merrill
2019-05-20 22:07               ` Jeff Law
2019-05-21  9:38                 ` Richard Biener
2019-05-21 11:02                   ` Martin Liška
2019-05-21 11:52                     ` Richard Biener
2019-05-22  9:13                       ` Martin Liška
2019-05-31 13:23                         ` Richard Biener
2019-05-31 13:35                           ` Martin Liška
2019-05-31 22:10                         ` Jeff Law
2019-06-03 13:35                           ` Martin Liška
2019-06-07  8:57                             ` Richard Biener
2019-06-07 12:04                               ` Martin Liška
2019-06-07 12:09                                 ` Richard Biener
2019-06-07 12:13                                   ` Martin Liška
2019-06-07 14:48                                     ` Martin Sebor
2019-06-07 21:43                                     ` Jason Merrill
2019-06-10  7:08                                       ` Martin Liška
2019-06-10 18:22                                         ` Jason Merrill
2019-06-11  7:41                                           ` Martin Liška
2019-06-11 12:28                                             ` Jason Merrill
2019-06-11 13:16                                               ` Martin Liška
2019-06-11 19:02                                                 ` Jason Merrill
2019-06-12  7:59                                                   ` Richard Biener
2019-06-12  8:02                                                     ` Martin Liška
2019-06-12  9:15                                                       ` Martin Liška
2019-06-12  9:41                                                         ` Richard Biener
2019-06-12 11:45                                                           ` Martin Liška
2019-06-12 12:50                                                             ` Richard Biener
2019-06-12 13:05                                                               ` Martin Liška
2019-06-23 23:08                                 ` Ian Lance Taylor
2019-06-24 12:29                                   ` Richard Biener
2019-06-24 13:51                                     ` Martin Liška
2019-06-24 14:10                                       ` Richard Biener
2019-06-25 10:25                                         ` Martin Liška
2019-06-25 11:59                                           ` Martin Liška
2019-06-25 14:23                                           ` Richard Biener
2018-10-30 10:25 ` hash-table violation in cselib.c Martin Liška
2018-11-01 11:57   ` Martin Liška
2018-10-30 10:46 ` hash-table violation in gcc/fortran/trans-decl.c Martin Liška
2018-10-31 10:00   ` Trevor Saunders
2018-10-31 10:18     ` Martin Liška
2018-10-30 11:07 ` hash-table violation in gcc/cp/pt.c Martin Liška
2018-10-30 11:21   ` Martin Liška
2018-11-01 12:06     ` Martin Liška

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3f846d3-b387-a986-22b0-071ef92791d1@suse.cz \
    --to=mliska@suse.cz \
    --cc=amonakov@ispras.ru \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=jason@redhat.com \
    --cc=mjambor@suse.cz \
    --cc=nathan@acm.org \
    --cc=paul.richard.thomas@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).