public inbox for archer@sourceware.org
 help / color / mirror / Atom feed
* Stop the Insanity! Linespec Rewrite
@ 2012-03-02  1:14 Keith Seitz
  2012-03-02 17:00 ` Tom Tromey
  2012-03-02 19:07 ` [Archer] " Joel Brobecker
  0 siblings, 2 replies; 20+ messages in thread
From: Keith Seitz @ 2012-03-02  1:14 UTC (permalink / raw)
  To: archer

Hi,

I think that most of us know the pain that is linespec.c. Well, Tom gave 
me the green light on this a while back, and I played with an initial 
design/implementation in September.

Earlier this month, I dug this out of mothballs, and it is now time to 
submit the work-in-progress for comment. This is *not* a finished design 
and/or implementation! It is simply a place to stop along the road and 
ask for advice, for a fresh set of eyes. Consider this akin to an RFC.

Let me recap what some of the more important requirements for this 
rewrite were:

- More robust parser. As several of us can attest to, adding just about 
anything to linespec.c is PAINFUL. Really painful. We need a more robust 
linespec-to-sal infrastructure.

- Consolidate similar functionality. There are several functions which 
pretty much do the same thing (e.g., decode_variable vs decode_compound 
vs the remaining bits of decode_internal).

- Rain on the quoting parade. This is actually pretty tightly coupled to 
the first requirement, but it is such a nightmare as it is now, that 
it's just worth calling out separately. I doubt I need say more.

- Enable a way for "explicit" linespecs (as I call them, for lack of a 
better term). I would like to add the ability to do, e.g., "break 
-sourcefile foo.c -function baz::doit(char*) -offset 3" and the like. 
Probably most useful to MI (and python), but not altogether useless to 
CLI users.

- Increase the maintainability and reduce the fragility of this code. If 
you haven't had the pleasure of hacking on linespecs, consider yourself 
lucky. You've never lost sleep over the prospect of digging into this code.

I'm sure there are some other requirements/wish-list items which I've 
forgotten. These are just basically mine.

If you'll pardon my rambling, here are some notes about what the 
design/implementation does/does not do:

o We now have a (trivial) parser and a lexer. The lexer "word" breaks 
the input on ':', but it does know about "::" as a scope operator for 
C++ (but nothing on ObjC). The whole thing attempts to integrate all 
languages into one design/implementation. You will not see any 
references to current_language anywhere, except one tiny place where 
canonicalization is done, and that's cut-n-paste from current sources.

o Speaking of canonicalization: that is now done in only one place in 
the parser.

o Quoting is greatly minimized. Got spaces in a filename? Doesn't matter 
anymore. The only thing that matters is ':'. Everything else is simply 
text. When a quote character (either " or ') is seen, everything is 
skipped to the next quote character. [No, you cannot mix quote 
characters in the same lexeme, e.g., "foo'bar'" will return the lexeme 
foo'bar'.] If your filename is "main::foo.cc", then you must quote that: 
"break 'main::foo.cc':3".

o As previously, Objective C is short-circuited. I have not changed that 
at all. Selector names would need quoting to avoid being lexed 
incorrectly. I haven't pursued this at all (and probably won't).

o All current functionality is maintained AFAICT. I did not add any new 
functionality that didn't just drop out from doing it this way. [One 
example: you can now do "break myclass::mymethod::a_label". It "just 
works."]

o decode_* are almost all gone. All symbol lookups are essentially done 
by one function (although there is one helper function for it). [Notable 
exceptions: decode_dollar and decode_objc.]

o Canonicalization (of the linespec) is now pretty trivial and isolated 
to one function. It is no longer scattered all over the place.

o Error reporting could be greatly expanded. The version I have 
committed simply maintains the status quo.

o Some linespecs don't work anymore. IMO, these existed because of the 
nightmare of the existing code. The only group of linespecs that won't 
work involve "goofy" quoting. For example, you cannot do: "break 
klass::'operator +'" anymore. Nor can you do "break 
'foo.c:static_function'". IMO a small price to pay for sanity.

o I have not written tests for this yet.

o I have not checked memory allocation or other such common problems. My 
goal was to get something that works.

o There is probably some dead code that can/could now be removed. 
[cp_validate_operator comes to mind] I have not even begun to look for this.

o Probably lots of cleanup to do. Probably goofed/thinko'd a few things, 
too, especially in the area of ambiguous linespecs. This is part of the 
reason for exposing this less-than-beta quality code to other developers 
and maintainers. I still have comments to myself all over the place 
(comments with "!!" and #if WHATS_THIS_FOR). Who needs a stinkin' 
notebook!? I take notes in the code.

Ok, so enough of that. You probably are all anxious to see what I've 
actually done...

If you will permit me a few more paragraphs of your time, please allow 
me very briefly describe the underlying idea. Basically, I've split 
linespec decoding into several main areas: symtabs, symbols, labels, 
line offsets, SALs, and canonicalization The first four are pretty 
obvious. We have to find those things. The last two are a rather large 
departure from the current code. [I won't go so far as to call it a design.]

Previously we constructed SALs as we found symbols. At the same time, we 
also constructed canonical linespec representations. We now take in a 
list of parameters and convert them to SALs in one swoop. Then we 
compute any necessary canonicalizations.

The main routine dealing with SALs is convert_linespec_to_sals. This 
converts a structure into a list of SALs, which is then returned to the 
caller. Canonicalization is done by canonicalize_linespec.

The structure which must be "filled-in" for convert_linespec_to_sals is 
"typedef struct linespec *linespec_t". Start with that. Right now, the 
only way to fill-in this structure is via the parser, but another 
function for "explicit" linespecs could fill it in, too, by using some 
of the new functions that I've introduced.

Ok, so enough chatter. You want to see the actual code.

Here you go: Archer branch "archer-keiths-linespec-rewrite" [duh!]

Please share your comments, concerns, and suggestions. Please help me 
stop the [linespec] insanity!

Keith

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2012-03-08  3:02 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-02  1:14 Stop the Insanity! Linespec Rewrite Keith Seitz
2012-03-02 17:00 ` Tom Tromey
2012-03-02 22:41   ` Keith Seitz
2012-03-03  2:36     ` Tom Tromey
2012-03-02 19:07 ` [Archer] " Joel Brobecker
2012-03-02 22:49   ` Keith Seitz
2012-03-06 17:40     ` [Archer] " Joel Brobecker
2012-03-06 19:08       ` Keith Seitz
2012-03-06 19:36         ` Keith Seitz
2012-03-06 21:50           ` [Archer] " Joel Brobecker
2012-03-07  0:11             ` [Archer] " Joel Brobecker
2012-03-07  1:08               ` Keith Seitz
2012-03-07  1:26                 ` [Archer] " Joel Brobecker
2012-03-07 14:39                   ` Tom Tromey
2012-03-07 15:01                     ` Tom Tromey
2012-03-07 15:39                       ` [Archer] " Joel Brobecker
2012-03-07 16:06                         ` Tom Tromey
2012-03-07 21:19                           ` Joel Brobecker
2012-03-07 21:11                         ` [Archer] " Keith Seitz
2012-03-08  3:02                           ` Joel Brobecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).