From: "Christian Biesinger via gdb-patches" <gdb-patches@sourceware.org>
To: Tom Tromey <tom@tromey.com>
Cc: Christian Biesinger via gdb-patches <gdb-patches@sourceware.org>
Subject: Re: [PATCH] Don't use the mutex for each symbol_set_names call
Date: Wed, 02 Oct 2019 22:02:00 -0000
Message-ID: <CAPTJ0XEVveoW5LboP0-LmAZv+kkDQqmE3dTQw2uavTqdxbBphQ@mail.gmail.com>
In-Reply-To: <CAPTJ0XFBKsp_zgZ4k5mTSmnGqKznYuk0Pu_1kgR4vPx3CjFP=Q@mail.gmail.com>
On Wed, Oct 2, 2019 at 1:20 PM Christian Biesinger
<cbiesinger@google.com> wrote:
>
> On Wed, Oct 2, 2019 at 12:18 PM Tom Tromey <tom@tromey.com> wrote:
> >
> > >>>>> "Christian" == Christian Biesinger via gdb-patches <gdb-patches@sourceware.org> writes:
> >
> > Christian> It speeds up "attach to Chrome's content_shell binary" from 44 sec -> 30
> > Christian> sec (32%) (compared to GDB trunk), or from 37 sec compared to this
> > Christian> patchset before this patch.
> >
> > Nice.
>
> I do need to redo these measurements with the latest version of the
> branch and patch...
Here are some new measurements on trunk (this is also with a
recompiled Chrome); either trunk gdb is slower or trunk Chrome is
bigger. I'll call tromey/t/parallel-minsyms-mutex "tromey" below.
GDB Trunk: 49.8s
tromey: 53-54s (!)
tromey + my patch here: ~30.3s
tromey + my patch here +
  https://sourceware.org/ml/gdb-patches/2019-09/msg00633.html:
  24.8s (-18% compared to previous)
tromey + my patch here +
  https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/cbiesinger/minsym-hash-one-thread:
  28.8s (only -5%)
I repeatedly measured "tromey" because it is so slow and got
consistent results. It must be due to the lock contention.
> > Christian> +    [&] (minimal_symbol *first, minimal_symbol* last) {
> > Christian> +      std::lock_guard<std::mutex> guard (demangled_mutex);
> > Christian> +      for (; first < last; ++first) {
> > Christian> +        symbol_set_names (first, first->name,
> > Christian> +                          strlen (first->name), 0,
> > Christian> +                          m_objfile->per_bfd);
> > Christian> +      }
> > Christian> +    });
> >
> > IIUC the idea is to separate the demangling from updating the
> > demangled_names_hash.
>
> That's correct.
>
> > A couple of thoughts on that...
> >
> > Christian> +      *slot
> > Christian> +        = ((struct demangled_name_entry *)
> > Christian> +           obstack_alloc (&per_bfd->storage_obstack,
> > Christian> +                          offsetof (struct demangled_name_entry, demangled)
> > Christian> +                          + len + demangled_len + 2));
> > Christian> +      mangled_ptr = &((*slot)->demangled[demangled_len + 1]);
> > Christian> +      strcpy (mangled_ptr, linkage_name_copy);
> > Christian> +      (*slot)->mangled = mangled_ptr;
> >
> > There's no deep reason that these things have to be stored on the
> > per-BFD obstack -- and requiring this means an extra copy. Instead we
> > could change the hash table to use ordinary heap allocation, and I think
> > this would be more efficient when demangling in worker threads, because
> > we could just reuse the existing allocation.
>
> Yes indeed. I was actually thinking of that last night -- we could
> change to a hash_set<demangled_name_entry> + reuse the alloc from
> gdb_demangle and avoid any allocations here.
https://sourceware.org/ml/gdb-patches/2019-10/msg00085.html
Although now that I re-read this, I'm not sure I understood you
correctly: did you want to allocate more things with regular
malloc/new?
> > Also, it seems fine to serialize the calls to symbol_set_names. There's
> > no need for a lock at all, then.
>
> True, though this way, if some threads finish faster than others it's
> possible to parallelize the work a bit more.
Trying this out (serializing the calls to symbol_set_names), it seems
to be about 1-1.5s slower (3-5%). However, this did make me realize
that there's no real reason why the mutex should be global, so I'm
going to move it inside this function.
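
For anyone following along, here is a minimal sketch of the locking
scheme under discussion: the slow demangling step runs without any
lock, and each worker then takes a function-local mutex once per batch
to update the shared table. This is not the GDB code. fake_demangle,
demangle_all and the std::unordered_set table are hypothetical
stand-ins for gdb_demangle, the minsym reader and
demangled_names_hash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for the expensive step (gdb_demangle in the
// real code).  It only tags the name so that the sketch is runnable.
static std::string
fake_demangle (const std::string &mangled)
{
  return "demangled:" + mangled;
}

// Demangle NAMES on up to N_THREADS workers and collect the unique
// results.  The mutex is local to this function, and each worker
// takes it once per batch rather than once per symbol.
static std::unordered_set<std::string>
demangle_all (const std::vector<std::string> &names, std::size_t n_threads)
{
  std::unordered_set<std::string> table;   // shared, guarded by table_mutex
  std::mutex table_mutex;                  // function-local, not global

  if (n_threads == 0)
    n_threads = 1;
  const std::size_t chunk = (names.size () + n_threads - 1) / n_threads;

  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < n_threads; ++t)
    {
      const std::size_t begin = std::min (names.size (), t * chunk);
      const std::size_t end = std::min (names.size (), begin + chunk);
      if (begin == end)
        continue;
      workers.emplace_back ([&names, &table, &table_mutex, begin, end] ()
        {
          // Phase 1: the slow part, done without holding any lock.
          std::vector<std::string> local;
          local.reserve (end - begin);
          for (std::size_t i = begin; i < end; ++i)
            local.push_back (fake_demangle (names[i]));

          // Phase 2: one lock acquisition for the whole batch.
          std::lock_guard<std::mutex> guard (table_mutex);
          for (std::string &s : local)
            table.insert (std::move (s));
        });
    }

  for (std::thread &w : workers)
    w.join ();

  // Duplicates collapse, so the table can never outgrow the input.
  assert (table.size () <= names.size ());
  return table;
}
```

Because the mutex lives on the stack of demangle_all, two unrelated
callers never contend with each other, which is the point of not
making it global.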
> > One idea I had was to parallelize build_minimal_symbol_hash_tables as
> > well: when possible, have it run two threads, and create the hash tables
> > in parallel.
>
> Hmm, yeah, that's a good idea. I hadn't thought of doing it quite that way.
See the measurements above for users/cbiesinger/minsym-hash-one-thread;
it's not nearly as fast as my "Compute msymbol hash codes in parallel"
patch. However, I couldn't do it quite as you suggested (as mentioned
on IRC, but writing it down here as well for anyone watching):
- Building the demangled minsym hash table requires having the
  demangled names available, so it needs to happen at the very least
  after find_demangled_name
- But it can't happen in parallel with symbol_set_names either,
  because that may change the pointer (to one from the hashtable)
- So in practice, I can only build the msymbol_hash table on a thread,
  and that's the faster of the two to build (search_name_hash is
  slowish)
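
The ordering constraints above can be sketched as follows. This is a
simplified illustration, not the code on the branch: msym, tables,
fake_search_name and build_tables are hypothetical names, and
std::unordered_multimap stands in for GDB's own hash tables. The
linkage-name table is built on a background thread because nothing
else writes the linkage names, while the search-name table has to wait
until the name-setting step has finished.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified minimal symbol.
struct msym
{
  std::string linkage_name;   // never modified below
  std::string search_name;    // written by the name-setting step
};

using name_index = std::unordered_multimap<std::string, std::size_t>;

struct tables
{
  name_index by_linkage_name;
  name_index by_search_name;
};

// Hypothetical stand-in for demangling plus symbol_set_names.
static std::string
fake_search_name (const std::string &linkage_name)
{
  return "demangled:" + linkage_name;
}

static tables
build_tables (std::vector<msym> &syms)
{
  // Background thread: reads only linkage_name, which nothing else
  // writes, so it can safely overlap with the loop below.
  std::future<name_index> linkage_future
    = std::async (std::launch::async, [&syms] ()
        {
          name_index index;
          for (std::size_t i = 0; i < syms.size (); ++i)
            index.emplace (syms[i].linkage_name, i);
          return index;
        });

  // Calling thread, meanwhile: the serialized name-setting step.
  for (msym &s : syms)
    s.search_name = fake_search_name (s.linkage_name);

  // Only now are the search names final, so this table is built last.
  tables result;
  for (std::size_t i = 0; i < syms.size (); ++i)
    result.by_search_name.emplace (syms[i].search_name, i);

  result.by_linkage_name = linkage_future.get ();
  assert (result.by_linkage_name.size () == syms.size ());
  return result;
}
```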
> > Adding a third thread here to update the
> > demangled_names_hash might help too? Maybe this approach would
> > eliminate the need for your "Compute msymbol hash codes in parallel"
> > patch ... the issue there being that it makes the minsyms larger.
> > (Another way to handle that would be to keep the hashes in a local array
> > of some kind that is discarded once the hash tables are updated.)
>
> The local array is a bit tricky... it needs an entry for each msymbol,
> which is only known at runtime. So it can't be stack-allocated with a
> fixed size, and I'm hesitant to use alloca for this unbounded size. So
> it would require a heap allocation for that vector. Maybe it's still
> worth it...
Putting this in a vector works out! It might even be a couple of
tenths of a second faster. Pushed to:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=85f7818c32fdf5b9fbd24f08320c54e9f9d50b4c
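
Roughly, the idea looks like the sketch below; see the commit above
for the real change. The hash codes go into a heap-allocated vector
with one slot per symbol, workers fill disjoint ranges in parallel,
and the vector is discarded once the table has been built, so the
minsyms themselves don't grow. build_hash_table and bucket_table are
hypothetical names, and std::hash stands in for msymbol_hash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Bucket array indexed by hash % n_buckets; each bucket holds the
// indices of the symbols that landed in it.
using bucket_table = std::vector<std::vector<std::size_t>>;

static bucket_table
build_hash_table (const std::vector<std::string> &names,
                  std::size_t n_buckets, std::size_t n_threads)
{
  assert (n_buckets > 0 && n_threads > 0);

  // One slot per symbol.  The count is only known at run time, so
  // this lives on the heap rather than on the stack.
  std::vector<std::size_t> hashes (names.size ());

  const std::size_t chunk = (names.size () + n_threads - 1) / n_threads;
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < n_threads; ++t)
    {
      const std::size_t begin = std::min (names.size (), t * chunk);
      const std::size_t end = std::min (names.size (), begin + chunk);
      if (begin == end)
        continue;
      workers.emplace_back ([&names, &hashes, begin, end] ()
        {
          // Each worker writes a disjoint range, so no lock is needed.
          for (std::size_t i = begin; i < end; ++i)
            hashes[i] = std::hash<std::string> () (names[i]);
        });
    }
  for (std::thread &w : workers)
    w.join ();

  // Serial insertion that reuses the precomputed hash codes.
  bucket_table table (n_buckets);
  for (std::size_t i = 0; i < names.size (); ++i)
    table[hashes[i] % n_buckets].push_back (i);

  return table;   // HASHES is destroyed here; nothing is stored per symbol.
}
```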
Actually makes me wonder if I could precompute the hash code for
demangled_names_hash in a similar way...
Will send an updated version of this patch in a bit.
Christian