From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <archer-return-2501-listarch-archer=sourceware.org@sourceware.org>
Received: (qmail 29532 invoked by alias); 5 Mar 2012 00:25:58 -0000
Mailing-List: contact archer-help@sourceware.org; run by ezmlm
Sender: <archer@sourceware.org>
Precedence: bulk
List-Post: <mailto:archer@sourceware.org>
List-Help: <mailto:archer-help@sourceware.org>
List-Subscribe: <mailto:archer-subscribe@sourceware.org>
List-Id: <archer.sourceware.org>
Received: (qmail 29510 invoked by uid 22791); 5 Mar 2012 00:25:56 -0000
MIME-Version: 1.0
In-Reply-To: <871upa9yyf.fsf@fleche.redhat.com>
References: <20120201132307.GA32578@host2.jankratochvil.net>
	<87hayio7ld.fsf@fleche.redhat.com>
	<CAN9gPaEvcr5jnA5PTNHgBpNa120JDp3JTUEYqN3kUkbbS0b=sg@mail.gmail.com>
	<871upa9yyf.fsf@fleche.redhat.com>
Date: Mon, 05 Mar 2012 00:25:00 -0000
Message-ID: <CAN9gPaGSZcyxu5n_xnD3cgt-LEuAnLyQBR8hpagfWGPNrmV0eQ@mail.gmail.com>
Subject: Re: Inter-CU DWARF size optimizations and gcc -flto
From: Daniel Jacobowitz <drow@false.org>
To: Tom Tromey <tromey@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>, archer@sourceware.org, 
	Jakub Jelinek <jakub@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-SW-Source: 2012-q1/txt/msg00018.txt.bz2

On Fri, Mar 2, 2012 at 9:54 PM, Tom Tromey <tromey@redhat.com> wrote:
>>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:
>
> Daniel> You are correct, it does crush GDB :-)  I routinely try - emphasis on
> Daniel> try - to use GDB on programs with between 2500 and 5500 shared
> Daniel> libraries.  It's agonizing.  I have another project I want to work on
> Daniel> first, and not much time for GDB lately, but this is absolutely on my
> Daniel> list to improve.
>
> I am curious how you plan to improve it.

I have no idea.  One thing I'd like to revisit is your work on
threaded symbol load; I have plenty of cores available, and the
machine is pretty much useless to me until my test starts.  There's
also a lot of room for profiling to identify bad algorithms; I think
we spend a lot of time reading the solib list from the inferior
(something I thought I and others had fixed thoroughly already...) and
I routinely hit inefficient algorithms e.g. during "next".
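
To make the threaded-load idea concrete, here is a rough Python sketch.
It is not taken from Tom's patches or from GDB itself: load_symbols() is
a hypothetical stand-in for parsing one library's symbol table, and the
library names are made up.  The only point is the shape: parse each
solib independently on a worker pool, then merge the results serially so
the shared table is touched by one thread only.

```python
from concurrent.futures import ThreadPoolExecutor


def load_symbols(solib):
    # Hypothetical stand-in for parsing one library's symbol table.
    # It just fabricates two symbol names from the library name.
    return {f"{solib}:init": solib, f"{solib}:fini": solib}


def load_all(solibs, workers=8):
    """Load every library's symbols on a thread pool, then merge serially."""
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the merge is deterministic.
        for table in pool.map(load_symbols, solibs):
            merged.update(table)
    return merged


# Made-up library list at roughly the scale discussed in this thread.
solibs = [f"lib{i}.so" for i in range(5000)]
symbols = load_all(solibs)
```

In CPython the interpreter lock would serialize CPU-bound parsing, so
this shows the structure only; a real reader inside GDB would presumably
use native threads.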

>
>
> The plan I mentioned upthread is probably pretty good for scaling to
> distro-sized programs, say 200 shared libraries or less (this is
> LibreOffice or Mozilla).  Maybe we could get a bit more by putting
> minsyms into the index.
>
> I am not so confident it would let gdb scale to 5000 shared libraries
> though.
>
> For that size I've had two ideas.
>
> First, and simplest, punt.  Make the user disable automatic reading of
> shared library debuginfo (or even minsyms) and make the user explicitly
> mention which ones should be used -- either by 'sharedlibrary' or by a
> linespec extension.
>
> I guess this one would sort of work today.  (I haven't tried.)

I am hugely unexcited by this.  Even if we did basic usability work on
top of that - e.g. automatically load all solibs that appear in the
backtrace - the inability to find sources by file:line is a huge
problem for me.

>
>
> Second, and harder, is the "big data" approach.  This would be something
> like -- load all the debuginfo into a server, tagged by build-id,
> ideally with global type- and symbol-interning; then change gdb to send
> queries to the server and get back the minimal DWARF (or DWARF-esque
> bits) needed; crucially, this would be a global operation instead of
> per-objfile, so that gdb could exploit parallelism on the server side.
>
> Parallelism seems key to me.  Parallelism on the machine running gdb
> probably wouldn't work out, though, on the theory that there'd be too
> much disk contention.  Dunno, maybe worth trying.

This is an idea I'm excited by.  It works well along with Cary's
http://gcc.gnu.org/wiki/DebugFission, too; a separate process could
handle the changes as individual shared libraries are rebuilt.

Something I've been thinking about is that incrementalism is hard in
GDB because the symbol tables are so entwined... adding any sort of
client/server interface would force us to detangle them, and then
individual objects could have a longer life.
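
As a very rough illustration of the server idea, and only under stated
assumptions, the Python sketch below keeps one symbol table per
build-id, interns symbol names globally, answers a lookup across many
objects at once, and swaps out a single object when its library is
rebuilt.  DebugInfoServer, its methods, the build-ids and the symbol
data are all hypothetical; nothing here is an existing GDB or DWARF
interface.

```python
from concurrent.futures import ThreadPoolExecutor


class DebugInfoServer:
    """Hypothetical in-process stand-in for the debuginfo server."""

    def __init__(self):
        self._interned = {}     # symbol name -> one shared string object
        self._by_build_id = {}  # build-id -> {symbol name: (file, line)}

    def _intern(self, name):
        # Global interning: every table shares one copy of each name.
        return self._interned.setdefault(name, name)

    def add_object(self, build_id, symbols):
        table = {self._intern(n): loc for n, loc in symbols.items()}
        self._by_build_id[build_id] = table

    def replace_object(self, old_id, new_id, symbols):
        # A rebuilt library only invalidates its own table.
        self._by_build_id.pop(old_id, None)
        self.add_object(new_id, symbols)

    def lookup(self, name, build_ids):
        """One global query, fanned out over all requested objects."""
        def probe(bid):
            loc = self._by_build_id.get(bid, {}).get(name)
            return (bid, loc) if loc is not None else None

        with ThreadPoolExecutor(max_workers=4) as pool:
            return [hit for hit in pool.map(probe, build_ids) if hit]


# Made-up build-ids and symbols, for illustration only.
server = DebugInfoServer()
server.add_object("aaaa", {"main": ("main.c", 10), "helper": ("util.c", 3)})
server.add_object("bbbb", {"helper": ("util2.c", 7)})
hits = server.lookup("helper", ["aaaa", "bbbb", "cccc"])

# Simulate rebuilding one library: only its entry changes.
server.replace_object("bbbb", "dddd", {"helper": ("util2.c", 9)})
hits_after = server.lookup("helper", ["aaaa", "bbbb", "dddd"])
```

The client passes the full set of build-ids in one call, which is what
lets the server side decide how to parallelize instead of being driven
one objfile at a time.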

-- 
Thanks,
Daniel