public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Alexandre Oliva <oliva@adacore.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [17/17] check hash table insertions
Date: Wed, 28 Dec 2022 15:20:25 +0100	[thread overview]
Message-ID: <AD1DB22F-1C8F-4D80-A028-4C29B62276E9@gmail.com> (raw)
In-Reply-To: <orwn6bd759.fsf@lxoliva.fsfla.org>



> Am 28.12.2022 um 13:51 schrieb Alexandre Oliva via Gcc-patches <gcc-patches@gcc.gnu.org>:
> 
> On Dec 27, 2022, Alexandre Oliva <oliva@adacore.com> wrote:
> 
>> The number of bugs it revealed tells me that leaving callers in charge
>> of completing insertions is too error prone.  I found this
>> verification code a bit too expensive to enable in general.
> 
> Here's a relatively cheap, checking-only test to catch dangling inserts.
> 
> 
> I've noticed a number of potential problems in hash tables, of three
> kinds: insertion of entries that seem empty, dangling insertions, and
> lookups during insertions.
> 
> These problems may all have the effect of replacing a deleted entry
> with one that seems empty, which may disconnect double-hashing chains
> involving that entry, and thus cause entries to go missing.
> 
> This patch detects such problems by recording a pending insertion and
> checking that it's completed before other potentially-conflicting
> operations.  The additional field is only introduced when checking is
> enabled.

I wonder if on INSERT, pushing a DELETED marker would fix the dangling insert and search during delete problem be whether that would be better from a design point of view? (It of course requires a DELETED representation)

> Regstrapped on x86_64-linux-gnu.  Ok to install?
> 
> 
> for  gcc/ChnageLog
> 
>    * hash-table.h (check_complete_insertion, check_insert_slot):
>    New hash_table methods.
>    (m_inserting_slot): New hash_table field.
>    (begin, hash_table ctors, ~hash_table): Check previous insert.
>    (expand, empty_slow, clear_slot, find_with_hash): Likewise.
>    (remote_elt_with_hash, traverse_noresize): Likewise.
>    (gt_pch_nx): Likewise.
>    (find_slot_with_hash): Likewise.  Record requested insert.
> ---
> gcc/hash-table.h |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 60 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
> index f4bda6102323e..33753d04b7bdb 100644
> --- a/gcc/hash-table.h
> +++ b/gcc/hash-table.h
> @@ -495,6 +495,7 @@ public:
>     {
>       if (Lazy && m_entries == NULL)
>    return iterator ();
> +      check_complete_insertion ();
>       iterator iter (m_entries, m_entries + m_size);
>       iter.slide ();
>       return iter;
> @@ -551,8 +552,39 @@ private:
>     Descriptor::mark_empty (v);
>   }
> 
> +public:
> +  void check_complete_insertion () const
> +  {
> +#if CHECKING_P
> +    if (!m_inserting_slot)
> +      return;
> +
> +    gcc_checking_assert (m_inserting_slot >= &m_entries[0]
> +             && m_inserting_slot < &m_entries[m_size]);
> +
> +    if (!is_empty (*m_inserting_slot))
> +      m_inserting_slot = NULL;
> +    else
> +      gcc_unreachable ();
> +#endif
> +  }
> +
> +private:
> +  value_type *check_insert_slot (value_type *ret)
> +  {
> +#if CHECKING_P
> +    gcc_checking_assert (is_empty (*ret));
> +    m_inserting_slot = ret;
> +#endif
> +    return ret;
> +  }
> +
> +#if CHECKING_P
> +  mutable value_type *m_inserting_slot;
> +#endif
> +
>   /* Table itself.  */
> -  typename Descriptor::value_type *m_entries;
> +  value_type *m_entries;
> 
>   size_t m_size;
> 
> @@ -607,6 +639,9 @@ hash_table<Descriptor, Lazy, Allocator>::hash_table (size_t size, bool ggc,
>                             ATTRIBUTE_UNUSED,
>                             mem_alloc_origin origin
>                             MEM_STAT_DECL) :
> +#if CHECKING_P
> +  m_inserting_slot (0),
> +#endif
>   m_n_elements (0), m_n_deleted (0), m_searches (0), m_collisions (0),
>   m_ggc (ggc), m_sanitize_eq_and_hash (sanitize_eq_and_hash)
> #if GATHER_STATISTICS
> @@ -639,6 +674,9 @@ hash_table<Descriptor, Lazy, Allocator>::hash_table (const hash_table &h,
>                             ATTRIBUTE_UNUSED,
>                             mem_alloc_origin origin
>                             MEM_STAT_DECL) :
> +#if CHECKING_P
> +  m_inserting_slot (0),
> +#endif
>   m_n_elements (h.m_n_elements), m_n_deleted (h.m_n_deleted),
>   m_searches (0), m_collisions (0), m_ggc (ggc),
>   m_sanitize_eq_and_hash (sanitize_eq_and_hash)
> @@ -646,6 +684,8 @@ hash_table<Descriptor, Lazy, Allocator>::hash_table (const hash_table &h,
>   , m_gather_mem_stats (gather_mem_stats)
> #endif
> {
> +  h.check_complete_insertion ();
> +
>   size_t size = h.m_size;
> 
>   if (m_gather_mem_stats)
> @@ -675,6 +715,8 @@ template<typename Descriptor, bool Lazy,
>     template<typename Type> class Allocator>
> hash_table<Descriptor, Lazy, Allocator>::~hash_table ()
> {
> +  check_complete_insertion ();
> +
>   if (!Lazy || m_entries)
>     {
>       for (size_t i = m_size - 1; i < m_size; i--)
> @@ -778,6 +820,8 @@ template<typename Descriptor, bool Lazy,
> void
> hash_table<Descriptor, Lazy, Allocator>::expand ()
> {
> +  check_complete_insertion ();
> +
>   value_type *oentries = m_entries;
>   unsigned int oindex = m_size_prime_index;
>   size_t osize = size ();
> @@ -853,6 +897,8 @@ template<typename Descriptor, bool Lazy,
> void
> hash_table<Descriptor, Lazy, Allocator>::empty_slow ()
> {
> +  check_complete_insertion ();
> +
>   size_t size = m_size;
>   size_t nsize = size;
>   value_type *entries = m_entries;
> @@ -901,6 +947,8 @@ template<typename Descriptor, bool Lazy,
> void
> hash_table<Descriptor, Lazy, Allocator>::clear_slot (value_type *slot)
> {
> +  check_complete_insertion ();
> +
>   gcc_checking_assert (!(slot < m_entries || slot >= m_entries + size ()
>                 || is_empty (*slot) || is_deleted (*slot)));
> 
> @@ -927,6 +975,8 @@ hash_table<Descriptor, Lazy, Allocator>
>   if (Lazy && m_entries == NULL)
>     m_entries = alloc_entries (size);
> 
> +  check_complete_insertion ();
> +
> #if CHECKING_P
>   if (m_sanitize_eq_and_hash)
>     verify (comparable, hash);
> @@ -976,6 +1026,8 @@ hash_table<Descriptor, Lazy, Allocator>
>     }
>   if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>     expand ();
> +  else
> +    check_complete_insertion ();
> 
> #if CHECKING_P
>   if (m_sanitize_eq_and_hash)
> @@ -1022,11 +1074,11 @@ hash_table<Descriptor, Lazy, Allocator>
>     {
>       m_n_deleted--;
>       mark_empty (*first_deleted_slot);
> -      return first_deleted_slot;
> +      return check_insert_slot (first_deleted_slot);
>     }
> 
>   m_n_elements++;
> -  return &m_entries[index];
> +  return check_insert_slot (&m_entries[index]);
> }
> 
> /* Verify that all existing elements in the hash table which are
> @@ -1068,6 +1120,8 @@ void
> hash_table<Descriptor, Lazy, Allocator>
> ::remove_elt_with_hash (const compare_type &comparable, hashval_t hash)
> {
> +  check_complete_insertion ();
> +
>   value_type *slot = find_slot_with_hash (comparable, hash, NO_INSERT);
>   if (slot == NULL)
>     return;
> @@ -1094,6 +1148,8 @@ hash_table<Descriptor, Lazy, Allocator>::traverse_noresize (Argument argument)
>   if (Lazy && m_entries == NULL)
>     return;
> 
> +  check_complete_insertion ();
> +
>   value_type *slot = m_entries;
>   value_type *limit = slot + size ();
> 
> @@ -1210,6 +1266,7 @@ template<typename D>
> static void
> gt_pch_nx (hash_table<D> *h)
> {
> +  h->check_complete_insertion ();
>   bool success
>     = gt_pch_note_object (h->m_entries, h, hashtab_entry_note_pointers<D>);
>   gcc_checking_assert (success);
> 
> 
> -- 
> Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
>   Free Software Activist                       GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about <https://stallmansupport.org>

  reply	other threads:[~2022-12-28 14:20 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-27  4:07 [00/13] check hash table counts Alexandre Oliva
2022-12-27  4:17 ` [01/13] scoped tables: insert before further lookups Alexandre Oliva
2022-12-27 15:11   ` Jeff Law
2022-12-27  4:18 ` [02/13] varpool: do not add NULL vnodes to referenced Alexandre Oliva
2022-12-27 15:14   ` Jeff Law
2022-12-27  4:19 ` [03/13] tree-inline decl_map: skip mapping NULL to itself Alexandre Oliva
2022-12-27 15:15   ` Jeff Law
2022-12-27  4:21 ` [04/13] [C++] constraint: insert norm entry once Alexandre Oliva
2022-12-27 15:37   ` Jeff Law
2022-12-27  4:22 ` [05/13] ssa-loop-niter: skip caching of null operands Alexandre Oliva
2022-12-27 15:19   ` Jeff Law
2022-12-28  4:03     ` Alexandre Oliva
2022-12-27  4:23 ` [06/13] tree-inline decl_map: skip mapping result's NULL default def Alexandre Oliva
2022-12-27 15:23   ` Jeff Law
2022-12-27  4:24 ` [07/13] postreload-gcse: no insert on mere lookup Alexandre Oliva
2022-12-27 15:11   ` Jeff Law
2022-12-27  4:28 ` [08/13] tm: complete tm_restart insertion Alexandre Oliva
2022-12-27 15:27   ` Jeff Law
2022-12-27  4:30 ` [09/13] [C++] constexpr: request insert iff depth is ok Alexandre Oliva
2022-12-27 15:38   ` Jeff Law
2022-12-27  4:35 ` [10/13] lto: drop dummy partition mapping Alexandre Oliva
2022-12-27 15:34   ` Jeff Law
2022-12-27  4:38 ` [11/13] ada: don't map NULL decl to locus Alexandre Oliva
2022-12-27 15:33   ` Jeff Law
2022-12-27 16:54     ` Arnaud Charlet
2022-12-27  4:38 ` [12/13] hash set: reject attempts to add empty values Alexandre Oliva
2022-12-27 15:30   ` Jeff Law
2022-12-27  4:39 ` [13/13] hash-map: reject empty-looking insertions Alexandre Oliva
2022-12-27 15:31   ` Jeff Law
2022-12-27 17:53   ` David Malcolm
2022-12-28 12:32     ` [15/17] prevent hash set/map insertion of deleted entries Alexandre Oliva
2022-12-29  4:25       ` Jeff Law
2022-12-28  8:50 ` [00/13] check hash table counts Martin Liška
2022-12-28 12:46   ` [16/17] check hash table counts at expand Alexandre Oliva
2023-01-09  7:46     ` Richard Biener
2022-12-28 12:30 ` [14/17] parloops: don't request insert that won't be completed Alexandre Oliva
2022-12-29  2:44   ` Jeff Law
2022-12-28 12:50 ` [17/17] check hash table insertions Alexandre Oliva
2022-12-28 14:20   ` Richard Biener [this message]
2022-12-28 23:06     ` Alexandre Oliva
2022-12-29  7:29       ` Richard Biener
2022-12-30  8:53         ` Alexandre Oliva
2022-12-30 11:30           ` Richard Biener
2022-12-30 16:41             ` Alexandre Oliva

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AD1DB22F-1C8F-4D80-A028-4C29B62276E9@gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=oliva@adacore.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).