From: Daniel Jacobowitz
To: Tom Tromey
Cc: Jan Kratochvil, archer@sourceware.org, Jakub Jelinek
Date: Mon, 05 Mar 2012 00:25:00 -0000
Subject: Re: Inter-CU DWARF size optimizations and gcc -flto
In-Reply-To: <871upa9yyf.fsf@fleche.redhat.com>
References: <20120201132307.GA32578@host2.jankratochvil.net> <87hayio7ld.fsf@fleche.redhat.com> <871upa9yyf.fsf@fleche.redhat.com>

On Fri, Mar 2, 2012 at 9:54 PM, Tom Tromey wrote:
>>>>>> "Daniel" == Daniel Jacobowitz writes:
>
> Daniel> You are correct, it does crush GDB :-)  I routinely try - emphasis on
> Daniel> try - to use GDB on programs with between 2500 and 5500 shared
> Daniel> libraries.  It's agonizing.  I have another project I want to work on
> Daniel> first, and not much time for GDB lately, but this is absolutely on my
> Daniel> list to improve.
>
> I am curious how you plan to improve it.

I have no idea.
One thing I'd like to revisit is your work on threaded symbol load; I
have plenty of cores available, and the machine is pretty much useless
to me until my test starts.  There's also a lot of room for profiling
to identify bad algorithms; I think we spend a lot of time reading the
solib list from the inferior (something I thought I and others had
fixed thoroughly already...) and I routinely hit inefficient
algorithms, e.g. during "next".

> The plan I mentioned upthread is probably pretty good for scaling to
> distro-sized programs, say 200 shared libraries or less (this is
> LibreOffice or Mozilla).  Maybe we could get a bit more by putting
> minsyms into the index.
>
> I am not so confident it would let gdb scale to 5000 shared libraries
> though.
>
> For that size I've had two ideas.
>
> First, and simplest, punt.  Make the user disable automatic reading of
> shared library debuginfo (or even minsyms) and make the user explicitly
> mention which ones should be used -- either by 'sharedlibrary' or by a
> linespec extension.
>
> I guess this one would sort of work today.  (I haven't tried.)

I am hugely unexcited by this.  Even if we did basic usability work on
top of that - e.g. automatically load all solibs that appear in the
backtrace - the inability to find sources by file:line is a huge
problem for me.

> Second, and harder, is the "big data" approach.  This would be something
> like -- load all the debuginfo into a server, tagged by build-id,
> ideally with global type- and symbol-interning; then change gdb to send
> queries to the server and get back the minimal DWARF (or DWARF-esque
> bits) needed; crucially, this would be a global operation instead of
> per-objfile, so that gdb could exploit parallelism on the server side.
>
> Parallelism seems key to me.  Parallelism on the machine running gdb
> probably wouldn't work out, though, on the theory that there'd be too
> much disk contention.  Dunno, maybe worth trying.

This is an idea I'm excited by.  It works well along with Cary's
http://gcc.gnu.org/wiki/DebugFission, too; a separate process could
handle the changes as individual shared libraries are rebuilt.
Something I've been thinking about is that incrementalism is hard in
GDB because the symbol tables are so entwined... adding any sort of
client/server interface would force us to detangle them, and then
individual objects could have a longer life.

-- 
Thanks,
Daniel
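
For illustration only, here is a rough sketch of the shape the "big data"
idea in this thread might take: debuginfo loaded into a server keyed by
build-id, symbol names interned globally, and one query fanned out across
all objects instead of going per-objfile.  Nothing here is GDB code or a
proposed protocol; SymbolServer, load, and query are hypothetical names,
the "debuginfo" is a plain dict of name -> (file, line), and the thread
pool only shows where server-side parallelism would sit.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class SymbolServer:
    """Toy stand-in for a debuginfo server; every name here is hypothetical."""

    # Interned symbol names: each distinct string is stored once, globally.
    _interned: dict = field(default_factory=dict)
    # build-id -> {interned symbol name -> (source file, line)}
    _by_build_id: dict = field(default_factory=dict)

    def _intern(self, name):
        # Return the single shared copy of this name, adding it if new.
        return self._interned.setdefault(name, name)

    def load(self, build_id, symbols):
        """Register one object's symbols under its build-id."""
        table = self._by_build_id.setdefault(build_id, {})
        for name, location in symbols.items():
            table[self._intern(name)] = location

    def _lookup_one(self, build_id, name):
        location = self._by_build_id.get(build_id, {}).get(name)
        return (build_id, location) if location is not None else None

    def query(self, build_ids, name, workers=4):
        """Look up one symbol across many objects in a single request."""
        with ThreadPoolExecutor(max_workers=workers) as pool:
            hits = pool.map(lambda b: self._lookup_one(b, name), build_ids)
            return [hit for hit in hits if hit is not None]


server = SymbolServer()
server.load("aaaa", {"main": ("app.c", 10), "helper": ("util.c", 3)})
server.load("bbbb", {"helper": ("util.c", 3)})
print(server.query(["aaaa", "bbbb"], "helper"))
```

Because entries are keyed by build-id, a rebuilt shared library would simply
show up under a new key while the others stay untouched, which is the
longer object lifetime mentioned above.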