From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 71006 invoked by alias); 11 Jan 2016 22:34:53 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 70994 invoked by uid 89); 11 Jan 2016 22:34:52 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2 spammy=kinda, iff, documentary, offer X-HELO: mail-pf0-f202.google.com Received: from mail-pf0-f202.google.com (HELO mail-pf0-f202.google.com) (209.85.192.202) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Mon, 11 Jan 2016 22:34:51 +0000 Received: by mail-pf0-f202.google.com with SMTP id 65so9960728pff.1 for ; Mon, 11 Jan 2016 14:34:50 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:message-id:date:subject:from:to:cc :content-type; bh=DtBkqdyRWBkhlu6IJRu5prBHPRyyZH1MYJB0zxxL1O0=; b=f+hRjPojtUe2masM/OAqBbt+c463Ke9oOFcOz2ErCRXzHvBQZUmuOmp7acsvWqAfmF geWdBG55yLCvd3+nFfpu+hs3ff8hsBNq8tg6s4XcviqA6APA76kEGgTv4gONzFxIvm2F v9bSnWPwy0suWnG4LyNukeUkmIbV95M8ZxqcePUR+tudb77oI+hJG4G0Fx+enudHsFgW 5cl4aARnJdFKvUvTGqael0yygHaGW7Uaby77D6W4zkMqzcwfRkpZ8pmF5XdcA4mOkz// hfAU18mRABsLCN2dPDSER5g3RS84LmUdnplPrW0gHfPQzlZBIlHPkYS0Qz3tfTxnwg/A Kb2Q== X-Gm-Message-State: ALoCoQn/+dnLsgNVTmu9vqQTilRXHKVuUZS3JBkaFbtQc2CKnH5IC3C9ktUPdyScj6yk0PQYFwZtFRlw1wdZ/wI/BiT0gBdkSQ== MIME-Version: 1.0 X-Received: by 10.66.160.3 with SMTP id xg3mr113269664pab.35.1452551689230; Mon, 11 Jan 2016 14:34:49 -0800 (PST) Message-ID: <047d7b6d967a74c69a0529168b2e@google.com> Date: Mon, 11 Jan 2016 22:34:00 -0000 Subject: Re: [PATCH] Fix problem handling colon in linespec, PR breakpoints/18303 From: Doug Evans To: Keith Seitz Cc: donb@codesourcery.com, "gdb-patches@sourceware.org ml" Content-Type: text/plain; charset=UTF-8; format=flowed; delsp=yes X-IsSubscribed: yes X-SW-Source: 2016-01/txt/msg00200.txt.bz2 Keith Seitz writes: > Hi, > > Thank you for pointing this out and supplying a patch (and *tests*!). > > On 01/08/2016 10:44 AM, Don Breazeal wrote: > > > During GDB's parsing of the linespec, when the filename was not found in > > the program's symbols, GDB tried to determine if the filename string > > could be some other valid identifier. Eventually it would get to a point > > where it concluded that the Windows logical volume, or whatever was to the > > left of the colon, was a C++ namespace. When it found only one colon, it > > would assert. > > I have to admit, when I first read this, my reaction was that this is a > symbol lookup error. After investigating it a bit, I think it is. [More > rationale below.] > > > GDB's linespec grammar allows for several different linespecs that contain > > ':'. The only one that has a number after the colon is filename:linenum. > > True, but for how long? I don't know. The parser actually does/could (or > for some brief time *did*) support offsets just about anywhere, e.g., > foo.c:function:label:3. That support was removed and legacy (buggy) > behavior of ignoring the offset was maintained in the parser/sal > converter. There is no reason we couldn't support it, though. > > > There is another possible solution. After no symbols are found for the > > file and we determine that it is a filename:linenum style linespec, we > > could just consume the filename token and parse the rest of the > > linespec. That works as well, but there is no advantage to it. > > And, of course, I came up with an entirely different solution. :-) > > Essentially what is happening is that the linespec parser is calling > lookup_symbol on the user input (e.g., "spaces: and :colons.cc"). That > is causing several problems, all which assume the input is well-formed. > As you've discovered, that is a pretty poor assumption. Malformed (user) > input should not cause an assertion failure IMO. > > > I created two new tests for this. One is just gdb.linespec/ls-errs.exp > > copied and converted to use C++ instead of C, and to add the Windows > > logical drive case. The other is an MI test to provide a regression > > test for the specific case reported in PR 18303. > > EXCELLENT! Thank you so much for doing this. These tests were > outrageously useful while attempting to understand the problem. > > With that said, I offer *for discussion* this alternative solution, > which removes the assumption that input to lookup_symbol is/must be > well-formed. > > [There is one additional related/necessary fixlet snuck in here... The > C++ name parser always assumed that ':' was followed by another ':'. Bad > assumption. So it would return an incorrect result in the malformed case.] > > WDYT? > > Keith > > [apologies on mailer wrapping] > > Author: Keith Seitz > Date: Fri Jan 8 14:00:22 2016 -0800 > > Tolerate malformed input for lookup_symbol-called functions. > > lookup_symbol is often called with user input. Consequently, any > function called from lookup_symbol{,_in_language} should attempt to deal > with malformed input gracefully. After all, malformed user input is > not a programming/API error. > > This patch does not attempt to find/correct all instances of this. > It only fixes locations in the code that trigger test suite failures. > > gdb/ChangeLog > > * cp-namespace.c (cp_lookup_bare_symbol): Do not assert on > user input. > (cp_search_static_and_baseclasses): Return null_block_symbol for > malformed input. > Remove assertions. > * cp-support.c (cp_find_first_component_aux): Do not return > a prefix length for ':' unless the next character is also ':'. > > diff --git a/gdb/cp-namespace.c b/gdb/cp-namespace.c > index 72002d6..aa225fe 100644 > --- a/gdb/cp-namespace.c > +++ b/gdb/cp-namespace.c > @@ -166,12 +166,6 @@ cp_lookup_bare_symbol (const struct language_defn > *langdef, > { > struct block_symbol sym; > > - /* Note: We can't do a simple assert for ':' not being in NAME because > - ':' may be in the args of a template spec. This isn't intended to be > - a complete test, just cheap and documentary. */ > - if (strchr (name, '<') == NULL && strchr (name, '(') == NULL) > - gdb_assert (strchr (name, ':') == NULL); > - Heya. The assert is intended to catch (some) violations of this (from the function comment): NAME is guaranteed to not have any scope (no "::") in its name, though if for example NAME is a template spec then "::" may appear in the argument list. This is an invariant on what name can (and cannot) be. IOW, it is wrong to call this function with name == "foo::bar", handling that is the caller's responsibility. Thus I think having an assert here is ok and good (as long as it tests for the correct thing!). Whether it is ok to call this with name == "c:mumble" is the issue. [or even "c:::mumble" or whatever else a user could type that this function's caller isn't expected to handle] On that I'm kinda ambivalent, but I like having the assert watch for the stated invariant. Thoughts? > sym = lookup_symbol_in_static_block (name, block, domain); > if (sym.symbol != NULL) > return sym; > @@ -246,10 +240,9 @@ cp_search_static_and_baseclasses (const char *name, > struct block_symbol klass_sym; > struct type *klass_type; > > - /* The test here uses <= instead of < because Fortran also uses this, > - and the module.exp testcase will pass "modmany::" for NAME here. */ > - gdb_assert (prefix_len + 2 <= strlen (name)); > - gdb_assert (name[prefix_len + 1] == ':'); > + /* Check for malformed input. */ > + if (prefix_len + 2 > strlen (name) || name[prefix_len + 1] != ':') > + return null_block_symbol; > > /* Find the name of the class and the name of the method, variable, > etc. */ > > diff --git a/gdb/cp-support.c b/gdb/cp-support.c > index df127c4..a71c6ad 100644 > --- a/gdb/cp-support.c > +++ b/gdb/cp-support.c > @@ -1037,8 +1037,13 @@ cp_find_first_component_aux (const char *name, > int permissive) > return strlen (name); > } > case '\0': > - case ':': > return index; > + case ':': > + /* ':' marks a component iff the next character is also a ':'. > + Otherwise it is probably malformed input. */ > + if (name[index + 1] == ':') > + return index; > + break; What if name[index+2] is also ':'? :-) [btw, I think "a::::b" is illegal (note four colons), but I don't know that for sure] > case 'o': > /* Operator names can screw up the recursion. */ > if (operator_possible > -- /dje