From: Daniel Jacobowitz
To: gdb@sources.redhat.com
Date: Fri, 10 Oct 2003 15:30:00 -0000
Subject: Unambiguously specifying source locations
Message-ID: <20031010153050.GA7318@nevyn.them.org>

This isn't a proposal - I haven't clearly thought out the details - just
some ramblings on a problem.

I would like to have a way to clearly identify a location, not by
address.  Primarily this would be for breakpoints; when we re-read an
objfile's symbols, we need to re-place breakpoints somehow, and the more
of the user's intent we can preserve the better.  Obviously this is not a
perfectly solvable problem, but we can do pretty well.

It's an important problem, too.  Not only is recompile/reload a pretty
common thing to do, but IDEs which save breakpoints across sessions would
use this also.  Et cetera.

I think it's only source locations that I need to identify in this way,
in the short term, but we should use an extensible syntax.

For a way-beyond-current-state example, consider an inlined function with
multiple inlined instances.
One of the simpler things we could do is:
  - record the number of instances in this objfile before reloading
  - check the number of instances in this objfile after reloading
  - if same, then we can carry things like the breakpoint enabled state
    over to the new set
  - if not, punt - enable all?  warn?

The goal there being to preserve the status of particular breakpoints
across reload, as best as we can, when unrelated changes are made to the
source.  Someone fixes emit_foo () and recompiles and we try not to
disturb their eight disabled and one enabled breakpoint on inlined copies
of obstack_alloc where they were tracking the allocation of the bad
object.

We could be much more thorough.  Try this on for size:
  - Record the source location of the emitted function which contains
    each inlined instance, as unambiguously as we can.
  - Record the inlining path.

The result would look something like (I don't like this syntax, just
illustrating):
  [libfoo.so.2][foo-1.cc:foo_func:75][bar-1.h:inline_one:33]\
    [bar-1.h:inline_two:36]

Then after reload if the path still leads to an inlined copy of
inline_two we can re-establish the breakpoint.  Or there are various
fuzzy matching things we could do.  Et cetera.

Generating such paths would be useful for output anyway; and if we can do
them in such a way as to accept them for input too, that would be useful
to our users.  We could also accept as input ambiguous paths and do
basically wildcarding:
  break [libfoo.so.2][bar-1.h]inline_two

Wouldn't that be nice?

The use of brackets is not entirely coincidental.  decode_line_1
currently does not accept anything that starts with a '[' as far as I can
see; ObjC selectors always have +[ or -[.  Using brackets simplifies
quoting and parsing quite a bit.  And it could be extended as necessary
without too much trouble.
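
To make the ideas above a little more concrete, here is a rough sketch in
Python rather than GDB's C.  None of this is GDB code: parse_location,
component_matches, path_matches and carry_over_enabled are all
hypothetical names, the bracket syntax is only the illustrative one from
this message, and the matching rule (each pattern element must match some
path element, in order) is just one guess at what "wildcarding" could
mean.

```python
import re
from typing import List, Optional

# One "[...]" group of the illustrative syntax; no nesting, no quoting.
_GROUP = re.compile(r"\[([^\[\]]*)\]")


def parse_location(spec: str) -> List[str]:
    """Split a bracketed location into components, outermost first.

    A bare trailing name (as in "[libfoo.so.2][bar-1.h]inline_two") is
    accepted as a final component.
    """
    spec = spec.replace("\\\n", "").strip()  # join backslash-continued lines
    parts: List[str] = []
    pos = 0
    while pos < len(spec) and spec[pos] == "[":
        m = _GROUP.match(spec, pos)
        if m is None:
            raise ValueError(f"malformed group at offset {pos}: {spec!r}")
        parts.append(m.group(1))
        pos = m.end()
        while pos < len(spec) and spec[pos].isspace():
            pos += 1
    tail = spec[pos:]
    if "[" in tail or "]" in tail:
        raise ValueError(f"stray bracket in {tail!r}")
    if tail:
        parts.append(tail)
    if not parts:
        raise ValueError("empty location")
    return parts


def component_matches(pattern: str, component: str) -> bool:
    """True if every ':'-separated field of pattern occurs in component.

    Assumption: ':' never appears inside a file or function name.  A real
    specification would need explicit quoting rules here.
    """
    fields = component.split(":")
    return all(f in fields for f in pattern.split(":"))


def path_matches(pattern: List[str], full: List[str]) -> bool:
    """Wildcard match: pattern elements must match path elements in order.

    Two consecutive pattern elements may match the same path element, so
    "[bar-1.h]inline_two" can match a single "bar-1.h:inline_two:36".
    """
    i = 0
    for want in pattern:
        while i < len(full) and not component_matches(want, full[i]):
            i += 1
        if i == len(full):
            return False
    return True


def carry_over_enabled(old_states: List[bool],
                       new_count: int) -> Optional[List[bool]]:
    """The simple heuristic: keep per-instance enabled state only if the
    number of inlined instances is unchanged; otherwise return None and
    let the caller punt (enable all, warn, ...)."""
    if len(old_states) == new_count:
        return list(old_states)
    return None
```

With this, parse_location applied to the full path shown earlier yields
four components, and the ambiguous form [libfoo.so.2][bar-1.h]inline_two
matches it under path_matches.  Whether matching should be this loose is
exactly the kind of thing a real specification would have to settle.
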

There are other problems: for instance, we might want to use linker names
for non-inlined functions where possible, for GDB-generated location
descriptions; that would handle keeping track of which constructor we
were stopped on.  Otherwise we'd need some other method for that.

Another nice thing to have might be an element in that list describing
C++ template instantiation.  Not sure if that's necessary or should just
be added to the function name as needed.

Include paths may also be needed.  This happens in C, with static
functions in headers.  Here's an example that bites me all the time:
  [libbfd.so][elf32-i386.c][elflink.h:2300]
  [libbfd.so][elf32-sparc.c][elflink.h:2300]
or maybe it should be:
  [libbfd.so][elf32-i386.c:elflink.h:2300]
  [libbfd.so][elf32-sparc.c:elflink.h:2300]

The latter, I think.  Right now the equivalent to this is handled in
build_canonical_line_spec.  It's a little simple-minded; there are plenty
of cases it doesn't handle.  It would be nice to do better.

Any specification of this should be explicit about quoting rules, comma,
dammit.

What other thoughts do you all have?  Am I on the right track, should we
draw up a formal specification for these?

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer