public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Martin Liška" <mliska@suse.cz>
To: gcc-patches@gcc.gnu.org
Cc: Nathan Sidwell <nathan@acm.org>, Jason Merrill <jason@redhat.com>,
	Jakub Jelinek <jakub@redhat.com>,
	Paul Richard Thomas <paul.richard.thomas@gmail.com>,
	Martin Jambor <mjambor@suse.cz>
Subject: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.
Date: Mon, 29 Oct 2018 12:02:00 -0000	[thread overview]
Message-ID: <23ffca95-6492-e609-aebb-bbdd83b5185d@suse.cz> (raw)

[-- Attachment #1: Type: text/plain, Size: 10053 bytes --]

Hi.

As slightly discussed here: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01674.html I fixed
a situation where an equal operator of a hash table returns true, while corresponding hash
value of a pair of elements is different. That's inconsistent and can probably cause issues
in different areas of compiler.

I wrote a simple patch that verifies for a newly added element into a hash table that there's
no other equal element with a different hash. That's O(n) for each insert, so that it's only
enabled with -fchecking=3. Apart from that the way how it's enable is a bit cumbersome, but
that's caused by usage of hash-table also in generated files.

Anyway, there are first places where I see violation of the sanity check:

1) cselib_lookup_1:

$ cat ice.c
a() { b(); }

$ /dev/shm/objdir/gcc/xgcc -B/dev/shm/objdir/gcc/ ice.c -g -c -fchecking=3 -O
hash table checking failed: equal operator returns true for a pair of values with a different hash valueduring RTL pass: vartrack
ice.c:1:1: internal compiler error: in find_slot_with_hash, at hash-table.h:905
    1 | a() { b(); }
      | ^
0x9680b5 hash_table<cselib_hasher, xcallocator>::find_slot_with_hash(cselib_hasher::key* const&, unsigned int, insert_option)
	/home/marxin/Programming/gcc/gcc/hash-table.h:905
0x962518 cselib_find_slot
	/home/marxin/Programming/gcc/gcc/cselib.c:584
0x9625d4 cselib_lookup_1
	/home/marxin/Programming/gcc/gcc/cselib.c:2097
0x9625d4 cselib_lookup(rtx_def*, machine_mode, int, machine_mode)
	/home/marxin/Programming/gcc/gcc/cselib.c:2141
0x965ee7 cselib_record_sets
	/home/marxin/Programming/gcc/gcc/cselib.c:2593
0x9670a9 cselib_process_insn(rtx_insn*)
	/home/marxin/Programming/gcc/gcc/cselib.c:2790
0x1036b73 vt_initialize
	/home/marxin/Programming/gcc/gcc/var-tracking.c:10231
0x103b98a variable_tracking_main_1
	/home/marxin/Programming/gcc/gcc/var-tracking.c:10460
0x103b98a variable_tracking_main()
	/home/marxin/Programming/gcc/gcc/var-tracking.c:10513

2) gfc_find_module

$ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gfortran.dg/coarray/alloc_comp_2.f90 -fcoarray=single -fchecking=3
hash table checking failed: equal operator returns true for a pair of values with a different hash valuef951: internal compiler error: in find_slot_with_hash, at hash-table.h:905
0x8e5e86 hash_table<module_hasher, xcallocator>::find_slot_with_hash(char const* const&, unsigned int, insert_option)
	/home/marxin/Programming/gcc/gcc/hash-table.h:905
0x8e2c2c gfc_find_module(char const*)
	/home/marxin/Programming/gcc/gcc/fortran/trans-decl.c:4865
0x8e4f42 gfc_generate_module_vars(gfc_namespace*)
	/home/marxin/Programming/gcc/gcc/fortran/trans-decl.c:5475
0x8b8d7e gfc_generate_module_code(gfc_namespace*)
	/home/marxin/Programming/gcc/gcc/fortran/trans.c:2190
0x868427 translate_all_program_units
	/home/marxin/Programming/gcc/gcc/fortran/parse.c:6112
0x868427 gfc_parse_file()
	/home/marxin/Programming/gcc/gcc/fortran/parse.c:6328
0x8b19cb gfc_be_parse_file
	/home/marxin/Programming/gcc/gcc/fortran/f95-lang.c:204

3) lookup_template_class_1

$ ./xg++ -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C -c -fchecking=3
hash table checking failed: equal operator returns true for a pair of values with a different hash value/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C: In instantiation of ‘struct B<A>’:
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C:15:8:   required from here
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C:8:17: internal compiler error: in find_slot_with_hash, at hash-table.h:905
    8 |     friend bool foo (const B<Q>& a);
      |                 ^~~
0xa265a4 hash_table<spec_hasher, xcallocator>::find_slot_with_hash(spec_entry* const&, unsigned int, insert_option)
	/home/marxin/Programming/gcc/gcc/hash-table.h:905
0xa042ce lookup_template_class_1
	/home/marxin/Programming/gcc/gcc/cp/pt.c:9629
0xa042ce lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*, int, int)
	/home/marxin/Programming/gcc/gcc/cp/pt.c:9674
0xa03670 tsubst_aggr_type
	/home/marxin/Programming/gcc/gcc/cp/pt.c:12679
0x9fefcd tsubst(tree_node*, tree_node*, int, tree_node*)
	/home/marxin/Programming/gcc/gcc/cp/pt.c:14294
0x9fe1a9 tsubst(tree_node*, tree_node*, int, tree_node*)
	/home/marxin/Programming/gcc/gcc/cp/pt.c:14285
0xa0d8bd tsubst_arg_types
	/home/marxin/Programming/gcc/gcc/cp/pt.c:13891
0xa0dc24 tsubst_function_type
	/home/marxin/Programming/gcc/gcc/cp/pt.c:14032
0x9fe790 tsubst(tree_node*, tree_node*, int, tree_node*)
	/home/marxin/Programming/gcc/gcc/cp/pt.c:14769
0x9f2c7c tsubst_function_decl
	/home/marxin/Programming/gcc/gcc/cp/pt.c:12921
0xa02d27 tsubst_template_decl
	/home/marxin/Programming/gcc/gcc/cp/pt.c:13214
0x9f4416 tsubst_decl
	/home/marxin/Programming/gcc/gcc/cp/pt.c:13316
0x9ff0ca tsubst(tree_node*, tree_node*, int, tree_node*)
	/home/marxin/Programming/gcc/gcc/cp/pt.c:14212
0xa1dfd0 tsubst_friend_function
	/home/marxin/Programming/gcc/gcc/cp/pt.c:10310
0xa1dfd0 instantiate_class_template_1
	/home/marxin/Programming/gcc/gcc/cp/pt.c:11359
0xa1dfd0 instantiate_class_template(tree_node*)
	/home/marxin/Programming/gcc/gcc/cp/pt.c:11424
0xa66b22 complete_type(tree_node*)
	/home/marxin/Programming/gcc/gcc/cp/typeck.c:138
0x9023c7 start_decl_1(tree_node*, bool)
	/home/marxin/Programming/gcc/gcc/cp/decl.c:5278
0x92a15f start_decl(cp_declarator const*, cp_decl_specifier_seq*, int, tree_node*, tree_node*, tree_node**)
	/home/marxin/Programming/gcc/gcc/cp/decl.c:5241
0x9c1944 cp_parser_init_declarator
	/home/marxin/Programming/gcc/gcc/cp/parser.c:19750

4) register_specialization

$ ./xg++ -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-template.C -I/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/include -I/home/marxin/Programming/gcc/libstdc++-v3/libsupc++ -I/home/marxin/Programming/gcc/libstdc++-v3/include/ba -c -fchecking=3
hash table checking failed: equal operator returns true for a pair of values with a different hash value/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-template.C:13:21: internal compiler error: in find_slot_with_hash, at hash-table.h:905
   13 |   operator"" _abc<>()
      |                     ^
0xa265a4 hash_table<spec_hasher, xcallocator>::find_slot_with_hash(spec_entry* const&, unsigned int, insert_option)
	/home/marxin/Programming/gcc/gcc/hash-table.h:905
0x9e35e6 register_specialization
	/home/marxin/Programming/gcc/gcc/cp/pt.c:1534
0xa22ac3 check_explicit_specialization(tree_node*, tree_node*, int, int, tree_node*)
	/home/marxin/Programming/gcc/gcc/cp/pt.c:3243
0x91552d grokfndecl
	/home/marxin/Programming/gcc/gcc/cp/decl.c:9106
0x9274bd grokdeclarator(cp_declarator const*, cp_decl_specifier_seq*, decl_context, int, tree_node**)
	/home/marxin/Programming/gcc/gcc/cp/decl.c:12607
0x92a9c6 start_function(cp_decl_specifier_seq*, cp_declarator const*, tree_node*)
	/home/marxin/Programming/gcc/gcc/cp/decl.c:15470
0x9c14a9 cp_parser_function_definition_from_specifiers_and_declarator
	/home/marxin/Programming/gcc/gcc/cp/parser.c:26853
0x9c14a9 cp_parser_init_declarator
	/home/marxin/Programming/gcc/gcc/cp/parser.c:19654
0x9c8ebd cp_parser_single_declaration
	/home/marxin/Programming/gcc/gcc/cp/parser.c:27437
0x9c9b11 cp_parser_explicit_specialization
	/home/marxin/Programming/gcc/gcc/cp/parser.c:16802
0x9cf30e cp_parser_declaration
	/home/marxin/Programming/gcc/gcc/cp/parser.c:12822
0x9cf94d cp_parser_translation_unit
	/home/marxin/Programming/gcc/gcc/cp/parser.c:4631
0x9cf94d c_parse_file()
	/home/marxin/Programming/gcc/gcc/cp/parser.c:39108
0xadde4a c_common_parse_file()
	/home/marxin/Programming/gcc/gcc/c-family/c-opts.c:1150

5) the case fixed in PR86158:

$ /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/ipa/iinline-2.c:37:1: internal compiler error: in find_slot_with_hash, at hash-table.h:905
0xba10f6 hash_table<ipa_vr_ggc_hash_traits, xcallocator>::find_slot_with_hash(value_range* const&, unsigned int, insert_option)
        /home/marxin/Programming/gcc/gcc/hash-table.h:905
0xb9e02c hash_table<ipa_vr_ggc_hash_traits, xcallocator>::find_slot(value_range* const&, insert_option)
        /home/marxin/Programming/gcc/gcc/hash-table.h:418
0xb8ea21 ipa_get_value_range
        /home/marxin/Programming/gcc/gcc/ipa-prop.c:1776
0xb8eab6 ipa_get_value_range
        /home/marxin/Programming/gcc/gcc/ipa-prop.c:1795
0xb8eae5 ipa_set_jfunc_vr
        /home/marxin/Programming/gcc/gcc/ipa-prop.c:1806
0xb8ef49 ipa_compute_jump_functions_for_edge
        /home/marxin/Programming/gcc/gcc/ipa-prop.c:1875
0xb8fb53 ipa_compute_jump_functions_for_bb
        /home/marxin/Programming/gcc/gcc/ipa-prop.c:2017
0xb912ab analysis_dom_walker::before_dom_children(basic_block_def*)
        /home/marxin/Programming/gcc/gcc/ipa-prop.c:2535
0x158f997 dom_walker::walk(basic_block_def*)
        /home/marxin/Programming/gcc/gcc/domwalk.c:353
0xb915b0 ipa_analyze_node(cgraph_node*)
        /home/marxin/Programming/gcc/gcc/ipa-prop.c:2605
0xb776bf inline_indirect_intraprocedural_analysis
        /home/marxin/Programming/gcc/gcc/ipa-fnsummary.c:3125
0xb776bf inline_analyze_function(cgraph_node*)
        /home/marxin/Programming/gcc/gcc/ipa-fnsummary.c:3145
0xb77866 ipa_fn_summary_generate
        /home/marxin/Programming/gcc/gcc/ipa-fnsummary.c:3189
0xca1751 execute_ipa_summary_passes(ipa_opt_pass_d*)
        /home/marxin/Programming/gcc/gcc/passes.c:2131
0x96907a ipa_passes
        /home/marxin/Programming/gcc/gcc/cgraphunit.c:2505
0x96907a symbol_table::compile()
        /home/marxin/Programming/gcc/gcc/cgraphunit.c:2616
0x96b055 symbol_table::finalize_compilation_unit()
        /home/marxin/Programming/gcc/gcc/cgraphunit.c:2861

My question is whether we want to have in GCC 9 time frame or should I wait with that?
Does it worth implementing?

Thanks,
Martin

---
 gcc/hash-table.c |  2 ++
 gcc/hash-table.h | 25 ++++++++++++++++++++++++-
 gcc/toplev.c     |  3 +++
 3 files changed, 29 insertions(+), 1 deletion(-)



[-- Attachment #2: 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch --]
[-- Type: text/x-patch, Size: 2084 bytes --]

diff --git a/gcc/hash-table.c b/gcc/hash-table.c
index bff9644ae81..d396a368171 100644
--- a/gcc/hash-table.c
+++ b/gcc/hash-table.c
@@ -121,3 +121,5 @@ void dump_hash_table_loc_statistics (void)
       hash_table_usage ().dump (origin);
     }
 }
+
+bool hash_table_verify_p = false;
diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index bd83345c7b8..2d740b42535 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -240,6 +240,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "hash-traits.h"
 #include "hash-map-traits.h"
 
+#ifndef GENERATOR_FILE
+extern bool hash_table_verify_p;
+#endif
+
 template<typename, typename, typename> class hash_map;
 template<typename, typename> class hash_set;
 
@@ -883,12 +887,31 @@ hash_table<Descriptor, Allocator>
     expand ();
 
   m_searches++;
+  size_t size = m_size;
+
+#ifndef GENERATOR_FILE
+  if (hash_table_verify_p)
+    if (insert == INSERT)
+      for (size_t i = 0; i < size; i++)
+	{
+	  value_type *entry = &m_entries[i];
+	  if (!is_empty (*entry) && !is_deleted (*entry)
+	      && Descriptor::equal (*entry, comparable)
+	      && hash != Descriptor::hash (*entry))
+	    {
+	      fprintf (stderr, "hash table checking failed: "
+		       "equal operator returns true for a pair "
+		       "of values with a different hash value");
+	      gcc_unreachable ();
+	    }
+	}
+#endif
+// TODO: enable it also for generated files: there are failures!
 
   value_type *first_deleted_slot = NULL;
   hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
   hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
   value_type *entry = &m_entries[index];
-  size_t size = m_size;
   if (is_empty (*entry))
     goto empty_entry;
   else if (is_deleted (*entry))
diff --git a/gcc/toplev.c b/gcc/toplev.c
index d7ea11abf53..4cd3aafa2e8 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -2289,6 +2289,9 @@ toplev::main (int argc, char **argv)
 
   handle_common_deferred_options ();
 
+  if (flag_checking > 2)
+    hash_table_verify_p = true;
+
   init_local_tick ();
 
   initialize_plugins ();


             reply	other threads:[~2018-10-29 11:04 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-29 12:02 Martin Liška [this message]
2018-10-29 14:28 ` Alexander Monakov
2018-10-29 15:56   ` Martin Liška
2018-10-30 10:32     ` Jakub Jelinek
2018-10-30 14:17       ` Martin Liška
2018-11-07 22:24         ` Jeff Law
2018-11-07 22:44           ` Jakub Jelinek
2018-11-08  8:56           ` Martin Liška
2019-05-13  7:42             ` Martin Liška
2019-05-20 17:26               ` Jason Merrill
2019-05-20 22:07               ` Jeff Law
2019-05-21  9:38                 ` Richard Biener
2019-05-21 11:02                   ` Martin Liška
2019-05-21 11:52                     ` Richard Biener
2019-05-22  9:13                       ` Martin Liška
2019-05-31 13:23                         ` Richard Biener
2019-05-31 13:35                           ` Martin Liška
2019-05-31 22:10                         ` Jeff Law
2019-06-03 13:35                           ` Martin Liška
2019-06-07  8:57                             ` Richard Biener
2019-06-07 12:04                               ` Martin Liška
2019-06-07 12:09                                 ` Richard Biener
2019-06-07 12:13                                   ` Martin Liška
2019-06-07 14:48                                     ` Martin Sebor
2019-06-07 21:43                                     ` Jason Merrill
2019-06-10  7:08                                       ` Martin Liška
2019-06-10 18:22                                         ` Jason Merrill
2019-06-11  7:41                                           ` Martin Liška
2019-06-11 12:28                                             ` Jason Merrill
2019-06-11 13:16                                               ` Martin Liška
2019-06-11 19:02                                                 ` Jason Merrill
2019-06-12  7:59                                                   ` Richard Biener
2019-06-12  8:02                                                     ` Martin Liška
2019-06-12  9:15                                                       ` Martin Liška
2019-06-12  9:41                                                         ` Richard Biener
2019-06-12 11:45                                                           ` Martin Liška
2019-06-12 12:50                                                             ` Richard Biener
2019-06-12 13:05                                                               ` Martin Liška
2019-06-23 23:08                                 ` Ian Lance Taylor
2019-06-24 12:29                                   ` Richard Biener
2019-06-24 13:51                                     ` Martin Liška
2019-06-24 14:10                                       ` Richard Biener
2019-06-25 10:25                                         ` Martin Liška
2019-06-25 11:59                                           ` Martin Liška
2019-06-25 14:23                                           ` Richard Biener
2018-10-30 10:25 ` hash-table violation in cselib.c Martin Liška
2018-11-01 11:57   ` Martin Liška
2018-10-30 10:46 ` hash-table violation in gcc/fortran/trans-decl.c Martin Liška
2018-10-31 10:00   ` Trevor Saunders
2018-10-31 10:18     ` Martin Liška
2018-10-30 11:07 ` hash-table violation in gcc/cp/pt.c Martin Liška
2018-10-30 11:21   ` Martin Liška
2018-11-01 12:06     ` Martin Liška

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=23ffca95-6492-e609-aebb-bbdd83b5185d@suse.cz \
    --to=mliska@suse.cz \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=jason@redhat.com \
    --cc=mjambor@suse.cz \
    --cc=nathan@acm.org \
    --cc=paul.richard.thomas@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).