From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <archer-return-1547-listarch-archer=sourceware.org@sourceware.org>
Received: (qmail 29944 invoked by alias); 11 Aug 2009 07:55:04 -0000
Mailing-List: contact archer-help@sourceware.org; run by ezmlm
Sender: <archer@sourceware.org>
Precedence: bulk
List-Post: <mailto:archer@sourceware.org>
List-Help: <mailto:archer-help@sourceware.org>
List-Subscribe: <mailto:archer-subscribe@sourceware.org>
List-Id: <archer.sourceware.org>
Received: (qmail 29929 invoked by uid 22791); 11 Aug 2009 07:55:02 -0000
X-SWARE-Spam-Status: No, hits=-2.0 required=5.0
	tests=AWL,BAYES_00,J_CHICKENPOX_44,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Message-ID: <4A8123C9.3030209@redhat.com>
Date: Tue, 11 Aug 2009 07:55:00 -0000
From: Dodji Seketeli <dodji@redhat.com>
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1b3pre) Gecko/20090513 Fedora/3.0-2.3.beta2.fc11 Thunderbird/3.0b2
MIME-Version: 1.0
To: Jan Kratochvil <jan.kratochvil@redhat.com>
CC: Tom Tromey <tromey@redhat.com>, GDB/Archer list <archer@sourceware.org>
Subject: Re: [RFC] Proposal for a new DWARF name index section
References: <4A7FE28D.4050901@redhat.com> <20090810143804.GA8671@host0.dyn.jankratochvil.net> <m3y6pr8tbl.fsf@fleche.redhat.com> <20090810182136.GA25301@host0.dyn.jankratochvil.net>
In-Reply-To: <20090810182136.GA25301@host0.dyn.jankratochvil.net>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-SW-Source: 2009-q3/txt/msg00113.txt.bz2

On 10/08/2009 20:21, Jan Kratochvil wrote:

> OK, thanks for the clarification, forgot etc.
> 
> Still when thinking about it:
> * I do not find the symbols reading much slow myself (working _on_ small GDB).

I agree this is hard to assess precisely. In my experience, debugging large
C++ applications made of lots of dynamic libraries (like Mozilla or any
WebKit-based app) triggers lots of disk access. How much of that is due to
reading debug info? I don't know. How heavy is the time penalty induced by
that disk access? I don't know. I think getting accurate data to answer
those questions is costly. Any takers? :)

My hope is that the cost of trying to come up with precise data is not
_much_ less than actually trying to do the lazy reading stuff and see what
we gain. After all, compiler optimization junkies use that strategy all the
time :)  If you, experienced GDB folks, unanimously think trying this is
not worth it, then OK :)
FWIW, I think implementing this new section is not really complex on the
GCC side. I guess the GDB side of things might be trickier?

> * People complaining it is slow usually use IDEs which use rather file:line
>   based breakpoints, don't they?  (As it was discussed on RH IRC today.)
>   = Assuming the C++ people do not put breakpoints on static out-of-scope
>     functions by name.

I'd say it really depends on the user. If I am used to the code base I am
debugging, I will tend to set quite some breakpoints by name, because
opening $file, then clicking on the right line takes more time than doing
ctrl-b (assuming that's the shortcut to set a breakpoint) and typing the
name of the known function I want to break in. The debugger opens the file
and scrolls down to where the breakpoint is set. Much faster. Even better
if the debugger can provide me with _fast_ name completion when typing the
function name.
How cool would it be if GDB didn't stand between me and the joy of
snappiness when I take that road of speed? :-)

Oh, and let's not forget the command line user base :)

> For the latter case I agree a fix is needed but an index of static names will
> not help with it.

True.

> 
>> Anyway, that is my logic.  Which part of this do you disagree with?
>> Or, am I missing something else?
> 
> We have concluded the currently missing information is for:
> * static functions (are they really needed for the file:line IDE usecases?)

I think they aren't needed for that exact use case. But as I said earlier,
I think there are other use cases that are useful for regular debugger
users and that are unfortunately not as fast as they ought to be today.
And we can address those, can't we?

> * inlined functions which have no concrete out-of-line instance
>   (the same file:line IDE usecase question)

[...]

> IMO not for:
> * static non-function symbols are deprecated (backward GDB compatibility only)

Sorry, I am not sure I fully understand this. Do global variables and
enumerator constants fall into this "deprecated" category?

> 
>> There does not seem to be a big downside to introducing a new section
>> that does exactly what we want.  It is automatically backward
>> compatible.  It is (I believe) not difficult to implement.  And,
>> finally, we can make it reliable by fiat.
> 
> While it is an improvement with existing .debug_pubnames, .debug_pubtypes and
> .debug_aranges one can:
> 
> * Lookup everything in current CU which can is fully read-in from .debug_info.
> * Always lookup global symbols from other CUs through the DWARF indexes.
> * Fallback to the full read-in only for:
>   * static functions in out of the language (compiler) scope
>   * inlined functions which have no concrete out-of-line instance
>   * reference to a non-existing symbol
> 
> archer-tromey-delayed-symfile could be probably more improved by properly
> following the indexes.  While I did fix a regression I broke a performance by
> my patch before, it could be probably patched better:
> 	[delayed-symfile] [commit] Fix a regression on forgotten delayed read of a type info.
> 	http://sourceware.org/ml/archer/2009-q1/msg00232.html
> 
> 
> As a summary GDB could already give (with proper non-existing patches) in the
> common usecases acceptable performance even based just on the existing DWARF
> indexes, couldn't it?  I did not think so before this mail thread.

Judging from what I have seen, I'd say that things can of course be
improved with the existing sections. I am not arguing against that.
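
To check that I read the lookup order quoted above correctly, here is a
rough Python sketch of it. Every name in it (lookup_symbol, current_cu,
global_index, full_read) is made up for illustration; none of this is GDB
code, and the dictionaries only stand in for the real DWARF data.

```python
def lookup_symbol(name, current_cu, global_index, full_read):
    """Tiered lookup, cheapest source first.

    current_cu   -- dict of symbols already read in from .debug_info
                    for the current CU (hypothetical representation)
    global_index -- dict built from .debug_pubnames/.debug_pubtypes,
                    mapping a global name to its owning CU
    full_read    -- callable doing the expensive full read-in and
                    returning a dict of every symbol
    """
    # 1. Everything in the current CU is already in memory.
    if name in current_cu:
        return ("current-cu", current_cu[name])
    # 2. Global symbols from other CUs come from the DWARF indexes.
    if name in global_index:
        return ("index", global_index[name])
    # 3. Fallback: static functions out of scope, inlined functions
    #    without a concrete instance, and names that do not exist at all.
    all_symbols = full_read()
    if name in all_symbols:
        return ("full-read", all_symbols[name])
    return ("missing", None)
```

Note that in this sketch a name that does not exist anywhere still pays for
the full read-in before the lookup can fail.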

What I see is that:

1/ There are "basic" use cases that you won't be able to speed up, e.g.
imagine there is a global variable named 'foobar'. The user wants to break
in a function at some point and types "break foobar". I think the debugger
ought to know whether there is a visible function named foobar in which it
could set the breakpoint. If not, it should gracefully display an error to
the user (possibly proposing the name of another function, close to foobar,
in which to break?) without having to hit the disk to scan possibly
zillions of objects.
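
As a rough illustration of the "break foobar" case, here is a small Python
sketch. It assumes an index that records the kind of each name (function,
variable, ...), which is my assumption about what such a section could
hold. IndexEntry, NAME_INDEX and resolve_breakpoint are hypothetical names,
not an existing GDB or DWARF interface.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    kind: str       # "function", "variable", "type", ... (assumed field)
    cu_offset: int  # offset of the owning CU in .debug_info


# Hypothetical in-memory image of the index; the contents are made up.
NAME_INDEX = {
    "foobar": [IndexEntry("variable", 0x0)],
    "foo_bar": [IndexEntry("function", 0x4C)],
    "main": [IndexEntry("function", 0x9A)],
}


def resolve_breakpoint(name, index=NAME_INDEX):
    """Answer "break NAME" from the index alone, without reading any CU."""
    functions = [e for e in index.get(name, []) if e.kind == "function"]
    if functions:
        return ("ok", functions)
    # No function by that name: propose close function names instead.
    function_names = [
        n for n, entries in index.items()
        if any(e.kind == "function" for e in entries)
    ]
    return ("error", difflib.get_close_matches(name, function_names, n=3))
```

With the made-up index above, resolve_breakpoint("foobar") reports an error
and proposes "foo_bar", and no CU has to be read to find that out.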

2/ To reach a point where we can comfortably implement those use cases, I
am not sure that building on top of the current infrastructure (e.g.
extending the current .debug_pubnames and .debug_pubtypes) in a
backward-compatible way is possible.

3/ We are lucky that no one seems to be using .debug_pubnames and
.debug_pubtypes today.

So based on 2/ and 3/, maybe it is worth just throwing out
.debug_pubnames and .debug_pubtypes and thinking about something more
"solid" that we can build on?

Thanks for reading so far.

-- 
Dodji Seketeli
Red Hat