Re: [PATCH 2/5] gdbsupport: Introduce interval_tree

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Pedro Alves <pedro@palves.net>, Tom Tromey <tromey@adacore.com>,
	Andrew Burgess <aburgess@redhat.com>
Cc: Andreas Arnez <arnez@linux.ibm.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 2/5] gdbsupport: Introduce interval_tree
Date: Thu, 02 Jun 2022 17:09:45 +0200	[thread overview]
Message-ID: <4ca5a328b7e2d1a9b9287b8c3a3b2fe525b42c3c.camel@linux.ibm.com> (raw)
In-Reply-To: <52e0e168-3ad4-df0e-5b46-2856e82b5d47@palves.net>

On Thu, 2022-06-02 at 15:37 +0100, Pedro Alves wrote:
> On 2022-06-02 15:12, Pedro Alves wrote:
> > On 2022-06-02 14:35, Ilya Leoshkevich wrote:
> > > 
> > > Adding N JITed sections has the complexity O((N**2)*log(N)),
> > > because
> > > adding each section involves breakpoint handling, which needs to
> > > resolve PCs and thus calls update_section_map().  When N is
> > > around 10k,
> > > this renders GDB unusable.
> > 
> > Does this adding of N JITed sections happen in batch?  Like, is
> > this from
> > jit_inferior_init, where we loop over JIT objects?  Or is it so
> > that we
> > get notified about JIT objects, one at a time?
> > 
> > In places where we add symbols in batch, we defer breakpoint
> > re_setting exactly
> > to avoid problems like this, via SYMFILE_DEFER_BP_RESET or
> > something similar.
> > Looks like jit.c doesn't try to do that.  Or is it not possible in
> > the scenario
> > in question?  Like, doesn't the JIT API let you register more than
> > one object
> > file at once?
> 
> It has taken me this long to remember that I once wrote a patch
> series to
> tackle this problem...  :-)  It's been on a branch on sourceware
> since 2016...
> 
> See the palves/jit-speedup branch.
> 
> I completely forgot that.
> 
> There, I added a number of patches deferring breakpoint_re_set in
> several
> different situations, and other simple optimizations that avoid O(N).
> It may be those patches are obsolete by now, but maybe they aren't.
> 
> For the section sorting issue itself, the branch has this:
> 
> ~~~~~~~~~~~~~~~~~~
> commit 5cd42e9fb13d25febe3da26595d044a57150cee5
> Author:     Pedro Alves <palves@redhat.com>
> AuthorDate: Fri Apr 1 01:14:30 2016 +0100
> Commit:     Pedro Alves <palves@redhat.com>
> CommitDate: Mon Sep 19 15:44:41 2016 +0100
> 
>     Get rid of sections sorting with qsort and use an incrementally
> updated addrmap instead
>     
>     This gives a massive speed up.  The problem with the qsort is
> that we
>     qsort for any one of the thousands of jit loads/unloads, and when
> you
>     have thousands of objfiles, that gets very slow.  In this
> scenario,
>     we're constantly adding/removing a handfull of obj_sections to a
> set
>     of thousands of already-sorted obj_sections.  It's much cheaper
> to do
>     an incremental update.
>     
>     I'm using a mutable addrmap for this, but I needed to add a new
>     primitive that allowed updating a region's object, to handle the
> case
>     of overlapping sections.  The only primitive available, only
> allows
>     setting a value to a currently-NULL region.
> 
> ~~~~~~~~~~~~~~~~~~
> 
> it talks about qsort because back when we hadn't yet moved to C++
> std::sort.
> 
> As you see, I was using an addrmap for this (gdb/addrmap.h), which is
> a data structure
> we already have.  I wonder whether the new data structure is really
> needed.  That's
> a question I was going to raise anyhow, until I remembered I had once
> already attempted it
> (and seen that it works).

Ah, interesting - I wasn't aware of addrmap.

You even handle overlapping sections conservatively by associating more
than one with each address range:

struct obj_section_map_addrmap_value
{
  VEC (obj_section_p) *sections;
};

This may use excessive memory if we have sections like [0,0], [0,1],
[0,2], ..., but hopefully that's not something that happens in the
real life often.

It's also interesting that you drop section_map_dirty and just apply
the changes right away.  I wasn't sure whether delaying the processing
was needed for something other than performance, so I kept it.  Same
for inhibit_updates.

I will try to play with addrmap and see if it works for me.

next prev parent reply	other threads:[~2022-06-02 15:09 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-02 13:35 [PATCH 0/5] gdb: Store section map in an interval tree Ilya Leoshkevich
2022-06-02 13:35 ` [PATCH 1/5] gdbsupport: Introduce obstack_newvec Ilya Leoshkevich
2022-06-02 14:31   ` Tom Tromey
2022-06-02 14:33     ` Ilya Leoshkevich
2022-06-02 13:35 ` [PATCH 2/5] gdbsupport: Introduce interval_tree Ilya Leoshkevich
2022-06-02 14:12   ` Pedro Alves
2022-06-02 14:17     ` Ilya Leoshkevich
2022-06-02 14:12   ` Pedro Alves
2022-06-02 14:37     ` Pedro Alves
2022-06-02 15:09       ` Ilya Leoshkevich [this message]
2022-06-02 18:04       ` Tom Tromey
2022-06-02 13:35 ` [PATCH 3/5] gdbsupport: Add interval_tree unit tests Ilya Leoshkevich
2022-06-02 13:35 ` [PATCH 4/5] gdbsupport: Add interval_tree fuzzing harness Ilya Leoshkevich
2022-06-02 13:35 ` [PATCH 5/5] gdb: Optimize section map Ilya Leoshkevich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ca5a328b7e2d1a9b9287b8c3a3b2fe525b42c3c.camel@linux.ibm.com \
    --to=iii@linux.ibm.com \
    --cc=aburgess@redhat.com \
    --cc=arnez@linux.ibm.com \
    --cc=gdb-patches@sourceware.org \
    --cc=pedro@palves.net \
    --cc=tromey@adacore.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).