public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Doug Evans <dje@google.com>
To: Keith Seitz <keiths@redhat.com>
Cc: donb@codesourcery.com,
		"gdb-patches@sourceware.org ml" <gdb-patches@sourceware.org>
Subject: Re: [PATCH] Fix problem handling colon in linespec, PR breakpoints/18303
Date: Mon, 11 Jan 2016 22:34:00 -0000	[thread overview]
Message-ID: <047d7b6d967a74c69a0529168b2e@google.com> (raw)

Keith Seitz writes:
  > Hi,
  >
  > Thank you for pointing this out and supplying a patch (and *tests*!).
  >
  > On 01/08/2016 10:44 AM, Don Breazeal wrote:
  >
  > > During GDB's parsing of the linespec, when the filename was not found  
in
  > > the program's symbols, GDB tried to determine if the filename string
  > > could be some other valid identifier.  Eventually it would get to a  
point
  > > where it concluded that the Windows logical volume, or whatever was to  
the
  > > left of the colon, was a C++ namespace.  When it found only one colon,  
it
  > > would assert.
  >
  > I have to admit, when I first read this, my reaction was that this is a
  > symbol lookup error. After investigating it a bit, I think it is. [More
  > rationale below.]
  >
  > > GDB's linespec grammar allows for several different linespecs that  
contain
  > > ':'.  The only one that has a number after the colon is  
filename:linenum.
  >
  > True, but for how long? I don't know. The parser actually does/could (or
  > for some brief time *did*) support offsets just about anywhere, e.g.,
  > foo.c:function:label:3. That support was removed and legacy (buggy)
  > behavior of ignoring the offset was maintained in the parser/sal
  > converter. There is no reason we couldn't support it, though.
  >
  > > There is another possible solution.  After no symbols are found for the
  > > file and we determine that it is a filename:linenum style linespec, we
  > > could just consume the filename token and parse the rest of the
  > > linespec.  That works as well, but there is no advantage to it.
  >
  > And, of course, I came up with an entirely different solution. :-)
  >
  > Essentially what is happening is that the linespec parser is calling
  > lookup_symbol on the user input (e.g., "spaces: and :colons.cc"). That
  > is causing several problems, all which assume the input is well-formed.
  > As you've discovered, that is a pretty poor assumption. Malformed (user)
  > input should not cause an assertion failure IMO.
  >
  > > I created two new tests for this.  One is just gdb.linespec/ls-errs.exp
  > > copied and converted to use C++ instead of C, and to add the Windows
  > > logical drive case.  The other is an MI test to provide a regression
  > > test for the specific case reported in PR 18303.
  >
  > EXCELLENT! Thank you so much for doing this. These tests were
  > outrageously useful while attempting to understand the problem.
  >
  > With that said, I offer *for discussion* this alternative solution,
  > which removes the assumption that input to lookup_symbol is/must be
  > well-formed.
  >
  > [There is one additional related/necessary fixlet snuck in here... The
  > C++ name parser always assumed that ':' was followed by another ':'. Bad
  > assumption. So it would return an incorrect result in the malformed  
case.]
  >
  > WDYT?
  >
  > Keith
  >
  > [apologies on mailer wrapping]
  >
  > Author: Keith Seitz <keiths@redhat.com>
  > Date:   Fri Jan 8 14:00:22 2016 -0800
  >
  > Tolerate malformed input for lookup_symbol-called functions.
  >
  > lookup_symbol is often called with user input. Consequently, any
  > function called from lookup_symbol{,_in_language} should attempt to deal
  > with malformed input gracefully. After all, malformed user input is
  > not a programming/API error.
  >
  > This patch does not attempt to find/correct all instances of this.
  > It only fixes locations in the code that trigger test suite failures.
  >
  >     gdb/ChangeLog
  >
  >     	* cp-namespace.c (cp_lookup_bare_symbol): Do not assert on
  >     	user input.
  >     	(cp_search_static_and_baseclasses): Return null_block_symbol for
  >     	malformed input.
  >     	Remove assertions.
  >     	* cp-support.c (cp_find_first_component_aux): Do not return
  >     	a prefix length for ':' unless the next character is also ':'.
  >
  > diff --git a/gdb/cp-namespace.c b/gdb/cp-namespace.c
  > index 72002d6..aa225fe 100644
  > --- a/gdb/cp-namespace.c
  > +++ b/gdb/cp-namespace.c
  > @@ -166,12 +166,6 @@ cp_lookup_bare_symbol (const struct language_defn
  > *langdef,
  >  {
  >    struct block_symbol sym;
  >
  > -  /* Note: We can't do a simple assert for ':' not being in NAME because
  > -     ':' may be in the args of a template spec.  This isn't intended to  
be
  > -     a complete test, just cheap and documentary.  */
  > -  if (strchr (name, '<') == NULL && strchr (name, '(') == NULL)
  > -    gdb_assert (strchr (name, ':') == NULL);
  > -

Heya.

The assert is intended to catch (some) violations of this
(from the function comment):

    NAME is guaranteed to not have any scope (no "::") in its name, though
    if for example NAME is a template spec then "::" may appear in the
    argument list.

This is an invariant on what name can (and cannot) be.
IOW, it is wrong to call this function with name == "foo::bar",
handling that is the caller's responsibility.
Thus I think having an assert here is ok and good
(as long as it tests for the correct thing!).

Whether it is ok to call this with name == "c:mumble" is the issue.
[or even "c:::mumble" or whatever else a user could type that
this function's caller isn't expected to handle]
On that I'm kinda ambivalent, but I like having the assert
watch for the stated invariant.

Thoughts?

  >    sym = lookup_symbol_in_static_block (name, block, domain);
  >    if (sym.symbol != NULL)
  >      return sym;
  > @@ -246,10 +240,9 @@ cp_search_static_and_baseclasses (const char *name,
  >    struct block_symbol klass_sym;
  >    struct type *klass_type;
  >
  > -  /* The test here uses <= instead of < because Fortran also uses this,
  > -     and the module.exp testcase will pass "modmany::" for NAME here.   
*/
  > -  gdb_assert (prefix_len + 2 <= strlen (name));
  > -  gdb_assert (name[prefix_len + 1] == ':');
  > +  /* Check for malformed input.  */
  > +  if (prefix_len + 2 > strlen (name) || name[prefix_len + 1] != ':')
  > +    return null_block_symbol;
  >
  >    /* Find the name of the class and the name of the method, variable,
  > etc.  */
  >
  > diff --git a/gdb/cp-support.c b/gdb/cp-support.c
  > index df127c4..a71c6ad 100644
  > --- a/gdb/cp-support.c
  > +++ b/gdb/cp-support.c
  > @@ -1037,8 +1037,13 @@ cp_find_first_component_aux (const char *name,
  > int permissive)
  >  	      return strlen (name);
  >  	    }
  >  	case '\0':
  > -	case ':':
  >  	  return index;
  > +	case ':':
  > +	  /* ':' marks a component iff the next character is also a ':'.
  > +	     Otherwise it is probably malformed input.  */
  > +	  if (name[index + 1] == ':')
  > +	    return index;
  > +	  break;

What if name[index+2] is also ':'? :-)

[btw, I think "a::::b" is illegal (note four colons),
but I don't know that for sure]

  >  	case 'o':
  >  	  /* Operator names can screw up the recursion.  */
  >  	  if (operator_possible
  >

-- 
/dje

             reply	other threads:[~2016-01-11 22:34 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-11 22:34 Doug Evans [this message]
2016-01-12  0:17 ` Don Breazeal
2016-01-12 18:26 ` Keith Seitz
2016-01-25 17:26   ` [PING] " Don Breazeal
2016-01-26 16:56     ` Doug Evans
2016-01-28  1:21       ` [PATCH v2] PR 18303, Tolerate malformed input for lookup_symbol-called functions Don Breazeal
2016-01-28 12:06         ` Pedro Alves
2016-01-28 22:43           ` [PATCH v3] " Don Breazeal
2016-01-28 22:52             ` [PATCH v4] " Don Breazeal
2016-02-04 18:37               ` [PING] " Don Breazeal
2016-02-18 18:22                 ` [PING] Re: [PATCH v4] PR 18303, Tolerate malformed input for lookup_symbol-called functions (was: [PATCH] Fix problem handling colon in linespec, PR breakpoints/18303) Don Breazeal
2016-02-25 17:28                   ` [PING] Re: [PATCH v4] PR 18303, Tolerate malformed input for lookup_symbol-called functions Don Breazeal
2016-03-03 18:19                     ` Don Breazeal
2016-03-14 21:23                       ` [PING] Re: [PATCH v4] PR 18303, Tolerate malformed input for lookup_symbol-called functions (was: Re: [PATCH] Fix problem handling colon in linespec, PR breakpoints/18303) Don Breazeal
2016-03-15 15:55                         ` [PING] Re: [PATCH v4] PR 18303, Tolerate malformed input for lookup_symbol-called functions Pedro Alves
  -- strict thread matches above, loose matches on Subject: below --
2016-01-08 18:44 [PATCH] Fix problem handling colon in linespec, PR breakpoints/18303 Don Breazeal
2016-01-08 23:03 ` Keith Seitz
2016-01-11 19:21   ` Don Breazeal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=047d7b6d967a74c69a0529168b2e@google.com \
    --to=dje@google.com \
    --cc=donb@codesourcery.com \
    --cc=gdb-patches@sourceware.org \
    --cc=keiths@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).