public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Pedro Alves <palves@redhat.com>
To: Simon Marchi <simark@simark.ca>
Cc: Yao Qi <qiyaoltc@gmail.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 4/7] Class-fy partial_die_info
Date: Wed, 31 Jan 2018 15:33:00 -0000	[thread overview]
Message-ID: <976e801a-4a31-092f-dd9d-13139f005499@redhat.com> (raw)
In-Reply-To: <8f245668d980ad825fd6499c24730f67@simark.ca>

On 01/31/2018 03:46 AM, Simon Marchi wrote:
> On 2018-01-30 06:39, Pedro Alves wrote:
>> On 01/29/2018 01:15 AM, Simon Marchi wrote:
>>
>>> From what I understand, the only reason to have that private constructor is
>>> to construct a temporary partial_die_info object used to search in the htab,
>>> is that right?  If so, converting that htab_t to an std::unordered_map would
>>> remove the need for all this, since you don't need to construct an object
>>> to search it.  See the diff below that applies on top of this patch.
>>>
>>> It's not thoroughly tested and I am not sure of the validity of the
>>> per_cu->cu->partial_dies.empty () call in find_partial_die, but I think it
>>> should work.  Plus, it adds some type-safety, which I am a big fan of.
>>>
>>> But otherwise, the patch is fine with me.
>>
>> Careful here.  This could do with some benchmarking.  The DWARF reading code
>> is performance (both timing and memory) sensitive.  This is trading an open
>> addressing hash table (htab_t), for a node-based closed addressing hash table.
>> The keys/elements in the map are small so I'd expect this to make
>> a difference.  Also, this is trading a in-principle cache-friendly
>> obstack allocation scheme for the standard new allocator.
> 
> Ah, indeed.  I thought that unordered_map would be implemented the same way as htab_t, but I see it's not the case.  Doing some quick tests on a big binary, it increases the time reading the symbols from an average of 37 seconds to an average of 42 seconds.
> 
> I understand the different hash table implementation having an impact, but I don't really understand how the allocation scheme can have a meaningful impact.  The partial_die_info objects are still allocated on the obstack, aren't they?  So it's just the space for the table itself that isn't on the objstack, but I don't see why that would make a difference.

Well, that's the thing.  unordered_map is going to reach to the heap to
allocate the buckets and nodes.  And the heap is slow.  And there's a
memory size cost for each element too, along with that reducing
cache locality.

unordered_map is really inefficient for this use case.  We have a hash map
of small objects that is filled in once, and after initial fill, we don't
do random insert/remove of elements from this map.  I think.

> 
> If we want to have a data structure with the same kind of performance as htab_t but with type-safety in the future, is your vision that we'll have to implement it ourselves?  Should we make a wrapper around htab_t?

Well, my vision is to investigate alternatives.  :-)  There are many options
out there.  If you google around for hash table benchmarks you'll find several.
E.g.: <https://tessil.github.io/2016/08/29/benchmark-hopscotch-map.html>.

So there are good alternatives out there that we could import and use.
AFAIK, libiberty's htab_t is actually quite good, and GCC has a C++-ified
version of that.  See gcc/hash-table.h, gcc/hash-map.h, gcc/hash-set.h:

 /* This file implements a typed hash table.
    The implementation borrows from libiberty's htab_t in hashtab.h.

So sharing that code with GCC may make sense, as it's already in
"the family".

One thing I dislike about GCC's hash containers above is that
their API isn't modeled on the standard's, which can lead to
confusion if you're used to the standard containers, or if you're
experimenting/benchmarking different containers for some
use case.  You have things like:

  /* This function clears all entries in this hash table.  */
   void empty () { if (elements ()) empty_slow (); }

while in the std containers, empty() is a predicate...

It may be that GCC's hash/map/set could do with C++11-fication too.
Really I haven't investigated much.  I'm just aware of their
existence.

(A while ago I found reading this discussion quite educational:
  <https://bugs.ruby-lang.org/issues/12142>)

Thanks,
Pedro Alves

  parent reply	other threads:[~2018-01-31 15:33 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-25  9:38 [PATCH 0/7] " Yao Qi
2018-01-25  9:38 ` [PATCH 1/7] Re-write partial_die_info allocation in load_partial_dies Yao Qi
2018-01-25  9:38 ` [PATCH 6/7] Move fixup_partial_die to partial_die_info::fixup Yao Qi
2018-01-25 12:59   ` Pedro Alves
2018-01-25 14:45     ` Yao Qi
2018-01-25  9:38 ` [PATCH 5/7] Remove one argument abbrev_len in read_partial_die Yao Qi
2018-01-29  1:30   ` Simon Marchi
2018-01-25  9:38 ` [PATCH 7/7] Move read_partial_die to partial_die_info::read Yao Qi
2018-01-29  1:58   ` Simon Marchi
2018-01-25  9:38 ` [PATCH 2/7] Don't check abbrev is NULL in read_partial_die Yao Qi
2018-01-25  9:38 ` [PATCH 4/7] Class-fy partial_die_info Yao Qi
2018-01-25 16:19   ` Tom Tromey
2018-01-26 17:25     ` Yao Qi
2018-01-26 20:55       ` Tom Tromey
2018-01-29  1:15   ` Simon Marchi
2018-01-30 10:49     ` Yao Qi
2018-01-30 15:11       ` Pedro Alves
2018-01-30 11:39     ` Pedro Alves
2018-01-31  3:46       ` Simon Marchi
2018-01-31 11:55         ` Yao Qi
2018-01-31 15:33         ` Pedro Alves [this message]
2018-01-25  9:38 ` [PATCH 3/7] Change find_partial_die_in_comp_unit to dwarf2_cu::find_partial_die Yao Qi
2018-01-25 12:05 ` [PATCH 0/7] Class-fy partial_die_info Joel Brobecker
2018-01-25 14:03   ` Yao Qi
2018-02-22 15:36 [PATCH 0/7 v2] " Yao Qi
2018-02-22 15:36 ` [PATCH 4/7] " Yao Qi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=976e801a-4a31-092f-dd9d-13139f005499@redhat.com \
    --to=palves@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=qiyaoltc@gmail.com \
    --cc=simark@simark.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).