From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 1546 invoked by alias); 3 Jun 2019 13:35:34 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 1535 invoked by uid 89); 3 Jun 2019 13:35:34 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-16.1 required=5.0
 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_SHORT,SPF_PASS
 autolearn=ham version=3.3.1 spammy=blocked, Fully, is_empty, Programming
X-HELO: mx1.suse.de
Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
 Mon, 03 Jun 2019 13:35:31 +0000
Received: from relay2.suse.de (unknown [195.135.220.254])
 by mx1.suse.de (Postfix) with ESMTP id D2B99AED0;
 Mon,  3 Jun 2019 13:35:28 +0000 (UTC)
Subject: Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.
To: Jeff Law, Richard Biener
Cc: Jakub Jelinek, Alexander Monakov, GCC Patches, Nathan Sidwell,
 Jason Merrill, Paul Richard Thomas, Martin Jambor
References: <23ffca95-6492-e609-aebb-bbdd83b5185d@suse.cz>
 <20181030100342.GN11625@tucnak>
 <32744d50-09fd-496c-e97e-9ec478d64ec4@suse.cz>
 <492d87a7-0210-0df3-f484-f126baa6866c@suse.cz>
 <47fcf0aa-4b89-5354-1b59-4e6c623f5c3a@suse.cz>
 <999abc46-57c7-ccf9-b0c9-baf4c0686b16@suse.cz>
From: Martin Liška
Message-ID: <4faef430-49cf-13bc-4bb2-858a72668ae6@suse.cz>
Date: Mon, 03 Jun 2019 13:35:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: multipart/mixed; boundary="------------C45AA118724585BB8903CCFF"
X-IsSubscribed: yes
X-SW-Source: 2019-06/txt/msg00076.txt.bz2

This is a multi-part message in MIME format.
--------------C45AA118724585BB8903CCFF
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Content-length: 8297

On 6/1/19 12:06 AM, Jeff Law wrote:
> On 5/22/19 3:13 AM, Martin Liška wrote:
>> On 5/21/19 1:51 PM, Richard Biener wrote:
>>> On Tue, May 21, 2019 at 1:02 PM Martin Liška wrote:
>>>>
>>>> On 5/21/19 11:38 AM, Richard Biener wrote:
>>>>> On Tue, May 21, 2019 at 12:07 AM Jeff Law wrote:
>>>>>>
>>>>>> On 5/13/19 1:41 AM, Martin Liška wrote:
>>>>>>> On 11/8/18 9:56 AM, Martin Liška wrote:
>>>>>>>> On 11/7/18 11:23 PM, Jeff Law wrote:
>>>>>>>>> On 10/30/18 6:28 AM, Martin Liška wrote:
>>>>>>>>>> On 10/30/18 11:03 AM, Jakub Jelinek wrote:
>>>>>>>>>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
>>>>>>>>>>>> +hashtab_chk_error ()
>>>>>>>>>>>> +{
>>>>>>>>>>>> +  fprintf (stderr, "hash table checking failed: "
>>>>>>>>>>>> +           "equal operator returns true for a pair "
>>>>>>>>>>>> +           "of values with a different hash value");
>>>>>>>>>>> BTW, either use internal_error here, or at least if using fprintf
>>>>>>>>>>> terminate with \n, in your recent mail I saw:
>>>>>>>>>>> ...different hash valueduring RTL pass: vartrack
>>>>>>>>>>>                        ^^^^^^
>>>>>>>>>> Sure, fixed in attached patch.
>>>>>>>>>>
>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>>>> +  gcc_unreachable ();
>>>>>>>>>>>> +}
>>>>>>>>>>> Jakub
>>>>>>>>>>>
>>>>>>>>>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
>>>>>>>>>>
>>>>>>>>>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
>>>>>>>>>> From: marxin
>>>>>>>>>> Date: Mon, 29 Oct 2018 09:38:21 +0100
>>>>>>>>>> Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>  gcc/hash-table.h | 40 +++++++++++++++++++++++++++++++++++++++-
>>>>>>>>>>  1 file changed, 39 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>>>>>>>>>> index bd83345c7b8..694eedfc4be 100644
>>>>>>>>>> --- a/gcc/hash-table.h
>>>>>>>>>> +++ b/gcc/hash-table.h
>>>>>>>>>> @@ -503,6 +503,7 @@ private:
>>>>>>>>>>
>>>>>>>>>>    value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
>>>>>>>>>>    value_type *find_empty_slot_for_expand (hashval_t);
>>>>>>>>>> +  void verify (const compare_type &comparable, hashval_t hash);
>>>>>>>>>>    bool too_empty_p (unsigned int);
>>>>>>>>>>    void expand ();
>>>>>>>>>>    static bool is_deleted (value_type &v)
>>>>>>>>>> @@ -882,8 +883,12 @@ hash_table
>>>>>>>>>>    if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>>>>>>>>>>      expand ();
>>>>>>>>>>
>>>>>>>>>> -  m_searches++;
>>>>>>>>>> +#if ENABLE_EXTRA_CHECKING
>>>>>>>>>> +    if (insert == INSERT)
>>>>>>>>>> +      verify (comparable, hash);
>>>>>>>>>> +#endif
>>>>>>>>>>
>>>>>>>>>> +  m_searches++;
>>>>>>>>>>    value_type *first_deleted_slot = NULL;
>>>>>>>>>>    hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>>>>>>>>>>    hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
>>>>>>>>>> @@ -930,6 +935,39 @@ hash_table
>>>>>>>>>>    return &m_entries[index];
>>>>>>>>>>  }
>>>>>>>>>>
>>>>>>>>>> +#if ENABLE_EXTRA_CHECKING
>>>>>>>>>> +
>>>>>>>>>> +/* Report a hash table checking error.  */
>>>>>>>>>> +
>>>>>>>>>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
>>>>>>>>>> +static void
>>>>>>>>>> +hashtab_chk_error ()
>>>>>>>>>> +{
>>>>>>>>>> +  fprintf (stderr, "hash table checking failed: "
>>>>>>>>>> +           "equal operator returns true for a pair "
>>>>>>>>>> +           "of values with a different hash value\n");
>>>>>>>>>> +  gcc_unreachable ();
>>>>>>>>>> +}
>>>>>>>>> I think an internal_error here is probably still better than a simple
>>>>>>>>> fprintf, even if the fprintf is terminated with a \n :-)
>>>>>>>> Fully agree with that, but I see a lot of build errors when using internal_error.
>>>>>>>>
>>>>>>>>> The question then becomes can we bootstrap with this stuff enabled and
>>>>>>>>> if not, are we likely to soon?  It'd be a shame to put it into
>>>>>>>>> EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
>>>>>>>>> because we've got too many bugs to fix.
>>>>>>>> Unfortunately it's blocked with these 2 PRs:
>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845
>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847
>>>>>>> Hi.
>>>>>>>
>>>>>>> I've just added one more PR:
>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450
>>>>>>>
>>>>>>> I'm sending an updated version of the patch that provides a disablement for the 3 PRs
>>>>>>> with a new function disable_sanitize_eq_and_hash.
>>>>>>>
>>>>>>> With that I can bootstrap and finish tests. However, I've done that with a patch
>>>>>>> that limits the maximal number of checks:
>>>>>> So rather than call the disable_sanitize_eq_and_hash, can you have its
>>>>>> state set up when you instantiate the object?  It's not a huge deal,
>>>>>> just thinking out loud.
>>>>>>
>>>>>>
>>>>>>
>>>>>> So how do we want to go forward, particularly the EXTRA_EXTRA checking
>>>>>> issue :-)
>>>>>
>>>>> There is at least one PR where we have a table where elements _in_ the
>>>>> table are never compared against each other but always against another
>>>>> object (I guess that's usual even), but the setup is in a way that the
>>>>> comparison function only works with those.  With the patch we verify
>>>>> hashing/comparison for something that is never used.
>>>>>
>>>>> So - wouldn't it be more "correct" to only verify comparison/hashing
>>>>> at lookup time, using the object from the lookup and verify that against
>>>>> all other elements?
>>>>
>>>> I don't have a problem with that. Apparently this change fixes
>>>> PR90450 and PR87847.
>>>>
>>>> Changes from previous version:
>>>> - verification happens only when an element is searched (not inserted)
>>>> - new argument 'sanitize_eq_and_hash' added for hash_table::hash_table
>>>> - new param hash-table-verification-limit has been introduced in order
>>>>   to limit the number of elements that are compared within a table
>>>> - verification happens only with flag_checking >= 2
>>>>
>>>> I've been bootstrapping and testing the patch right now.
>>>
>>> Looks like I misremembered the original patch.  The issue isn't
>>> comparing random two elements in the table.
>>>
>>> That it fixes PR90450 is because LIM never calls find_slot_with_hash
>>> without INSERTing.
>>>
>>
>> There's an updated version of the patch where I check all find operations
>> (both w/ and w/o insertion).
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests
>> except for:
>>
>> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c -O2 -c
>> hash table checking failed: equal operator returns true for a pair of values with a different hash value
>> during GIMPLE pass: lim
>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c: In function ‘fn1’:
>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c:6:1: internal compiler error: in hashtab_chk_error, at hash-table.h:1019
>>     6 | fn1 ()
>>       | ^~~
>> 0x6c5725 hashtab_chk_error
>>         /home/marxin/Programming/gcc/gcc/hash-table.h:1019
>> 0xe504ea hash_table::verify(ao_ref* const&, unsigned int)
>>         /home/marxin/Programming/gcc/gcc/hash-table.h:1040
>> 0xe504ea hash_table::find_slot_with_hash(ao_ref* const&, unsigned int, insert_option)
>>         /home/marxin/Programming/gcc/gcc/hash-table.h:960
>> 0xe504ea gather_mem_refs_stmt
>>         /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:1501
>> 0xe504ea analyze_memory_references
>>         /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:1625
>> 0xe504ea tree_ssa_lim
>>         /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:2646
>> 0xe504ea execute
>>         /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:2708
>>
>> Richi: it's after your recent patch.
>>
>> For some reason I don't see PR87847 issue any longer.
>>
>>
>> May I install the patch with disabled sanitization in tree-ssa-loop-im.c ?
> Don't we still need to deal with the naked fprintf when there's a
> failure.  ie, shouldn't we be raising it with a gcc_assert or somesuch?

Good point, I've just adjusted that.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

>
> jeff
>

--------------C45AA118724585BB8903CCFF
Content-Type: text/x-patch;
 name="0001-Enable-sanitization-for-hash-tables.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="0001-Enable-sanitization-for-hash-tables.patch"
Content-length: 8648

>From a51fc8dc210942b87d84130bb105467112fa9967 Mon Sep 17 00:00:00 2001
From: marxin
Date: Mon, 13 May 2019 07:16:22 +0200
Subject: [PATCH] Enable sanitization for hash tables.

gcc/ChangeLog:

2019-06-03  Martin Liska

        * cselib.c (cselib_init): Disable hash table
        sanitization.
        * hash-set.h: Pass new default argument to m_table.
        * hash-table.c: Add global variable with hash table
        sanitization limit.
        * hash-table.h (Allocator>::hash_table): Add new argument
        to ctor.
        (hashtab_chk_error): New.
        * params.def (PARAM_HASH_TABLE_VERIFICATION_LIMIT): New.
        * toplev.c (process_options): Set hash_table_sanitize_eq_limit
        from the PARAM_HASH_TABLE_VERIFICATION_LIMIT value.
---
 gcc/cselib.c     |  9 ++++++--
 gcc/hash-set.h   |  2 +-
 gcc/hash-table.c |  3 +++
 gcc/hash-table.h | 56 ++++++++++++++++++++++++++++++++++++++++++++----
 gcc/params.def   |  6 ++++++
 gcc/toplev.c     |  4 ++++
 6 files changed, 73 insertions(+), 7 deletions(-)

diff --git a/gcc/cselib.c b/gcc/cselib.c
index 84c17c23f6d..a1cbdec9718 100644
--- a/gcc/cselib.c
+++ b/gcc/cselib.c
@@ -2858,9 +2858,14 @@ cselib_init (int record_what)
     }
   used_regs = XNEWVEC (unsigned int, cselib_nregs);
   n_used_regs = 0;
-  cselib_hash_table = new hash_table (31);
+  /* FIXME: enable sanitization (PR87845) */
+  cselib_hash_table
+    = new hash_table (31, /* ggc */ false,
+                      /* sanitize_eq_and_hash */ false);
   if (cselib_preserve_constants)
-    cselib_preserved_hash_table = new hash_table (31);
+    cselib_preserved_hash_table
+      = new hash_table (31, /* ggc */ false,
+                        /* sanitize_eq_and_hash */ false);
   next_uid = 1;
 }

diff --git a/gcc/hash-set.h b/gcc/hash-set.h
index de3532f5f68..d891ed78297 100644
--- a/gcc/hash-set.h
+++ b/gcc/hash-set.h
@@ -28,7 +28,7 @@ class hash_set
 public:
   typedef typename Traits::value_type Key;
   explicit hash_set (size_t n = 13, bool ggc = false CXX_MEM_STAT_INFO)
-    : m_table (n, ggc, GATHER_STATISTICS, HASH_SET_ORIGIN PASS_MEM_STAT) {}
+    : m_table (n, ggc, true, GATHER_STATISTICS, HASH_SET_ORIGIN PASS_MEM_STAT) {}

   /* Create a hash_set in gc memory with space for at least n elements.  */

diff --git a/gcc/hash-table.c b/gcc/hash-table.c
index 646a7a1c497..8e86fffa36f 100644
--- a/gcc/hash-table.c
+++ b/gcc/hash-table.c
@@ -74,6 +74,9 @@ struct prime_ent const prime_tab[] = {
   { 0xfffffffb, 0x00000006, 0x00000008, 31 }
 };

+/* Limit number of comparisons when calling hash_table<>::verify.  */
+unsigned int hash_table_sanitize_eq_limit;
+
 /* The following function returns an index into the above table of the
    nearest prime number which is greater than N, and near a power of two. */

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 4178616478e..785bed14679 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -295,6 +295,8 @@ struct prime_ent

 extern struct prime_ent const prime_tab[];

+/* Limit number of comparisons when calling hash_table<>::verify.  */
+extern unsigned int hash_table_sanitize_eq_limit;

 /* Functions for computing hash table indexes.  */

@@ -371,10 +373,12 @@ class hash_table

 public:
   explicit hash_table (size_t, bool ggc = false,
+                       bool sanitize_eq_and_hash = true,
                        bool gather_mem_stats = GATHER_STATISTICS,
                        mem_alloc_origin origin = HASH_TABLE_ORIGIN
                        CXX_MEM_STAT_INFO);
   explicit hash_table (const hash_table &, bool ggc = false,
+                       bool sanitize_eq_and_hash = true,
                        bool gather_mem_stats = GATHER_STATISTICS,
                        mem_alloc_origin origin = HASH_TABLE_ORIGIN
                        CXX_MEM_STAT_INFO);
@@ -516,6 +520,7 @@ private:

   value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
   value_type *find_empty_slot_for_expand (hashval_t);
+  void verify (const compare_type &comparable, hashval_t hash);
   bool too_empty_p (unsigned int);
   void expand ();
   static bool is_deleted (value_type &v)
@@ -564,6 +569,9 @@ private:
   /* if m_entries is stored in ggc memory.  */
   bool m_ggc;

+  /* True if the table should be sanitized for equal and hash functions.  */
+  bool m_sanitize_eq_and_hash;
+
   /* If we should gather memory statistics for the table.  */
 #if GATHER_STATISTICS
   bool m_gather_mem_stats;
@@ -586,12 +594,13 @@ extern void dump_hash_table_loc_statistics (void);

 template<typename Descriptor, template<typename Type> class Allocator>
 hash_table<Descriptor, Allocator>::hash_table (size_t size, bool ggc,
+                                               bool sanitize_eq_and_hash,
                                                bool gather_mem_stats
                                                ATTRIBUTE_UNUSED,
                                                mem_alloc_origin origin
                                                MEM_STAT_DECL) :
   m_n_elements (0), m_n_deleted (0), m_searches (0), m_collisions (0),
-  m_ggc (ggc)
+  m_ggc (ggc), m_sanitize_eq_and_hash (sanitize_eq_and_hash)
 #if GATHER_STATISTICS
   , m_gather_mem_stats (gather_mem_stats)
 #endif
@@ -617,12 +626,14 @@

 template<typename Descriptor, template<typename Type> class Allocator>
 hash_table<Descriptor, Allocator>::hash_table (const hash_table &h, bool ggc,
+                                               bool sanitize_eq_and_hash,
                                                bool gather_mem_stats
                                                ATTRIBUTE_UNUSED,
                                                mem_alloc_origin origin
                                                MEM_STAT_DECL) :
   m_n_elements (h.m_n_elements), m_n_deleted (h.m_n_deleted),
-  m_searches (0), m_collisions (0), m_ggc (ggc)
+  m_searches (0), m_collisions (0), m_ggc (ggc),
+  m_sanitize_eq_and_hash (sanitize_eq_and_hash)
 #if GATHER_STATISTICS
   , m_gather_mem_stats (gather_mem_stats)
 #endif
@@ -912,7 +923,11 @@ hash_table
       entry = &m_entries[index];
       if (is_empty (*entry)
           || (!is_deleted (*entry) && Descriptor::equal (*entry, comparable)))
-        return *entry;
+        {
+          if (m_sanitize_eq_and_hash)
+            verify (comparable, hash);
+          return *entry;
+        }
     }
 }

@@ -941,8 +956,10 @@ hash_table
   if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
     expand ();

-  m_searches++;
+  if (m_sanitize_eq_and_hash)
+    verify (comparable, hash);

+  m_searches++;
   value_type *first_deleted_slot = NULL;
   hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
   hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
@@ -989,6 +1006,37 @@ hash_table
   return &m_entries[index];
 }

+/* Report a hash table checking error.  */
+
+ATTRIBUTE_NORETURN ATTRIBUTE_COLD
+static void
+hashtab_chk_error ()
+{
+  fprintf (stderr, "hash table checking failed: "
+           "equal operator returns true for a pair "
+           "of values with a different hash value\n");
+  gcc_unreachable ();
+}
+
+/* Verify that all existing elements in the hash table which are
+   equal to COMPARABLE have an equal HASH value provided as argument.  */
+
+template<typename Descriptor, template<typename Type> class Allocator>
+void
+hash_table<Descriptor, Allocator>
+::verify (const compare_type &comparable, hashval_t hash)
+{
+  for (size_t i = 0; i < MIN (hash_table_sanitize_eq_limit, m_size); i++)
+    {
+      value_type *entry = &m_entries[i];
+      if (!is_empty (*entry) && !is_deleted (*entry)
+          && hash != Descriptor::hash (*entry)
+          && Descriptor::equal (*entry, comparable))
+        hashtab_chk_error ();
+    }
+}
+
 /* This function deletes an element with the given COMPARABLE value
    from hash table starting with the given HASH.  If there is no
    matching element in the hash table, this function does nothing.  */
diff --git a/gcc/params.def b/gcc/params.def
index 6b7f7eb5bae..e5ca6ff45e4 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1439,6 +1439,12 @@ DEFPARAM(PARAM_GIMPLE_FE_COMPUTED_HOT_BB_THRESHOLD,
          " The parameter is used only in GIMPLE FE.",
          0, 0, 0)

+DEFPARAM(PARAM_HASH_TABLE_VERIFICATION_LIMIT,
+         "hash-table-verification-limit",
+         "The number of elements for which hash table verification is done for "
+         "each searched element.",
+         100, 0, 0)
+
 /*

 Local variables:
diff --git a/gcc/toplev.c b/gcc/toplev.c
index d300ac2ec89..116be7be395 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1799,6 +1799,10 @@ process_options (void)
     optimization_default_node = build_optimization_node (&global_options);
   optimization_current_node = optimization_default_node;

+  if (flag_checking >= 2)
+    hash_table_sanitize_eq_limit
+      = PARAM_VALUE (PARAM_HASH_TABLE_VERIFICATION_LIMIT);
+
   /* Please don't change global_options after this point, those changes won't
      be reflected in optimization_{default,current}_node.  */
 }
-- 
2.21.0

--------------C45AA118724585BB8903CCFF--
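
For readers who want to see the invariant in isolation: the check in hash_table<>::verify boils down to "if equal (entry, comparable) holds, then hash (entry) must match the hash passed in", applied to at most hash_table_sanitize_eq_limit slots. Below is a minimal standalone sketch of that idea, not the GCC code. It assumes a plain std::vector of strings in place of the table's slot array, and buggy_descriptor, consistent_descriptor, hash_exact, to_lower and verify_eq_and_hash are all hypothetical names made up for this illustration. It also returns a bool where the patch calls hashtab_chk_error.

```cpp
#include <algorithm>
#include <cassert>   // only needed for the assert-based usage shown below
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Deterministic, case-sensitive polynomial string hash (illustration only).
inline std::size_t hash_exact(const std::string &s) {
  std::size_t h = 0;
  for (unsigned char c : s)
    h = h * 31 + c;
  return h;
}

// Lower-case copy of S.
inline std::string to_lower(std::string s) {
  for (char &c : s)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

// Hypothetical descriptor with the kind of bug the sanitizer looks for:
// equality ignores case, but the hash does not.
struct buggy_descriptor {
  static std::size_t hash(const std::string &s) { return hash_exact(s); }
  static bool equal(const std::string &a, const std::string &b) {
    return to_lower(a) == to_lower(b);
  }
};

// Hypothetical descriptor where hash and equality agree.
struct consistent_descriptor {
  static std::size_t hash(const std::string &s) {
    return hash_exact(to_lower(s));
  }
  static bool equal(const std::string &a, const std::string &b) {
    return to_lower(a) == to_lower(b);
  }
};

// Returns false if one of the first LIMIT entries compares equal to
// COMPARABLE while hashing differently from HASH; true otherwise.
// A LIMIT of 0 checks nothing, so the result is trivially true.
template <typename Descriptor>
bool verify_eq_and_hash(const std::vector<std::string> &entries,
                        const std::string &comparable,
                        std::size_t hash, std::size_t limit) {
  const std::size_t n = std::min(limit, entries.size());
  for (std::size_t i = 0; i < n; i++)
    if (hash != Descriptor::hash(entries[i])
        && Descriptor::equal(entries[i], comparable))
      return false;
  return true;
}
```

With entries {"Foo", "bar"} and the lookup key "foo", one would expect assert (!verify_eq_and_hash<buggy_descriptor> (entries, "foo", buggy_descriptor::hash ("foo"), 100)) to hold, while the same call with consistent_descriptor reports no violation. Passing a limit of 0 skips every comparison, which mirrors how the patch leaves hash_table_sanitize_eq_limit at zero unless flag_checking >= 2.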