Re: Updated patch adding line number enumeration support to statement probe

public inbox for systemtap@sourceware.org

From: BR Chrisman <brchrisman@gmail.com>
Cc: systemtap@sourceware.org
Subject: Re: Updated patch adding line number enumeration support to statement probe
Date: Tue, 03 Jun 2014 17:46:00 -0000
Message-ID: <CAN4=B2=7Nofy4cGu+Vo2+yG4PBOFfrHRw_6570DjvTVGWAby7Q@mail.gmail.com>
In-Reply-To: <1844543191.19591707.1401808203426.JavaMail.zimbra@redhat.com>

On Tue, Jun 3, 2014 at 8:10 AM, Jonathan Lebon <jlebon@redhat.com> wrote:
> Hi Brian,
>
> Thanks for the patch. I had some difficulties applying it. Can you make
> sure that your mail client does not modify whitespace? If you use the
> git send-email command you should have no issues.

I'll verify my mail setup before sending the next patch.  I've had
issues with this before; my apologies.

>
> Your patch looks OK overall. There are some things however that need
> attention:
>
>> @@ -38,7 +38,7 @@ struct symbol_table;
>>  struct base_query;
>>  struct external_function_query;
>>
>> -enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD };
>> +enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD, ENUMERATED };
>>  enum info_status { info_unknown, info_present, info_absent };
>>
>>  // module -> cu die[]
>
> You've eliminated any way to have a RANGE lineno type, but it is still
> in the lineno_t enum and there are still references to it in many places
> (like iterate_over_labels()). These need to be cleaned up as well to
> instead handle ENUMERATED.

I've cleaned up all the remaining RANGE references and will include
that in the next patch.

>
>> @@ -1086,9 +1086,9 @@ void
>>  dwarf_query::parse_function_spec(const string & spec)
>>  {
>>    lineno_type = ABSOLUTE;
>> -  linenos[0] = linenos[1] = 0;
>> -
>> -  size_t src_pos, line_pos, dash_pos, scope_pos;
>> +  linenos.push_back(0);
>> +  linenos.push_back(0);
>> +  size_t src_pos, line_pos, scope_pos;
>>
>>    // look for named scopes
>>    scope_pos = spec.rfind("::");
>
> You seed the linenos vector with two '0' elements, but then never add
> the parsed linenos for ABSOLUTE or RELATIVE linenos. So stap thinks that
> the user entered lineno 0. I suspect that's what's causing all the failures
> in statement.exp.

The seed values were intended to mimic the removed line:
linenos[0] = linenos[1] = 0;

I also tested with those seeds removed, and that failed as well.  I'm
still unclear on how and where the ABSOLUTE/RELATIVE specifications
are parsed.
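
For reference, here is a rough standalone sketch of what I understand the single-lineno case needs to do. It is not the tapsets.cxx code: `parse_single_lineno` is a hypothetical helper, std::stoi stands in for lex_cast, and it assumes the ABSOLUTE path should store the parsed number in both slots, as the old array assignment form did.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the ABSOLUTE branch: parse "N" and store N in
// both slots, mirroring the old `linenos[0] = linenos[1] = N` form.
static std::vector<int> parse_single_lineno(const std::string& text)
{
  std::vector<int> linenos(2, 0);   // the seed: {0, 0}
  int n = std::stoi(text);          // throws std::invalid_argument on non-numeric input
  linenos[0] = linenos[1] = n;      // without this assignment the lineno stays 0
  return linenos;
}
```

With the assignment in place, `parse_single_lineno("42")` returns a vector holding 42 in both slots. Dropping the assignment would leave both slots at 0, which would be consistent with the "lineno 0" symptom described above.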

The two statement.exp test cases which are failing are:
FAIL: statement (ENUMERATED and RANGE - single func - expected 3 probes, got 0)
FAIL: statement (ENUMERATED and RANGE - wild func - expected 3 probes, got 0)
...
                ===  Summary ===
# of expected passes            32
# of unexpected failures        2


>
>> @@ -1135,18 +1135,26 @@ dwarf_query::parse_function_spec(const string & spec)
>> lex_cast<int>(spec.substr(line_pos + 1));
>> +                // try to parse N, N-M, or N,M,O,P, or combination thereof...
>> +                if (spec.find_first_of(",-", line_pos + 1) != string::npos)
>> +                  {
>> +                    lineno_type = ENUMERATED;
>> +                    linenos.clear();
>> +                    string dash_delimited,num;
>> +                    std::stringstream comma_delimited(spec.substr(line_pos + 1));
>> +                    vector<int> tmp;
>> +                    while (std::getline(comma_delimited, dash_delimited, ','))
>> +                      {
>> +                        tmp.clear();
>> +                        stringstream dash_delimited_stream(dash_delimited);
>> +                        while (std::getline(dash_delimited_stream, num, '-'))
>> +                          tmp.push_back(lex_cast<int>(num));
>> +                        if (tmp.size() > 1)
>> +                            for (int i = tmp.front(); i <= tmp.back(); i++)
>> +                                linenos.push_back(i);
>
> For parsing the spec, I think it would be easier to tokenize on commas,
> and then check the resulting tokens one by one for either a single lineno
> or a range (maybe even factor that part out into its own function). There's
> tokenize() in util.cxx which is useful for this.

I realize now that I misinterpreted how the existing code handles
ranges.  This patch simply expands each range into an enumeration.
Even with binary_search over the enumerated values, handling a large
range this way would be inefficient.
I'll have to rework this.  I'm not sure it needs to go as far as an
interval tree, but tracking ranges properly, rather than flattening
them into enumerations, makes sense to me.
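
Something along these lines, perhaps. This is only a standalone sketch of the tokenize-on-commas idea, not a patch: the names `lineno_range`, `parse_range_token`, `parse_lineno_spec` and `lineno_matches` are all made up for illustration, and it uses std::getline and std::stoi in place of tokenize() and lex_cast so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Inclusive [first, second] line range; a single lineno N is stored as {N, N}.
typedef std::pair<int, int> lineno_range;

// Hypothetical helper: parse one token, either "N" or "N-M".
static lineno_range parse_range_token(const std::string& token)
{
  std::string::size_type dash = token.find('-');
  if (dash == std::string::npos)
    {
      int n = std::stoi(token);     // throws on empty or non-numeric input
      return lineno_range(n, n);
    }
  int lo = std::stoi(token.substr(0, dash));
  int hi = std::stoi(token.substr(dash + 1));
  if (lo > hi)
    throw std::invalid_argument("inverted line range: " + token);
  return lineno_range(lo, hi);
}

// Hypothetical helper: split the spec on commas, then parse each token.
static std::vector<lineno_range> parse_lineno_spec(const std::string& spec)
{
  std::vector<lineno_range> ranges;
  std::istringstream in(spec);
  std::string token;
  while (std::getline(in, token, ','))
    ranges.push_back(parse_range_token(token));
  return ranges;
}

// True if lineno falls inside any stored range.  The cost is linear in the
// number of ranges, not in the number of lines they cover.
static bool lineno_matches(const std::vector<lineno_range>& ranges, int lineno)
{
  for (std::size_t i = 0; i < ranges.size(); ++i)
    if (lineno >= ranges[i].first && lineno <= ranges[i].second)
      return true;
  return false;
}
```

With this shape, a spec like "5-9,12,20-22" is stored as three items regardless of how wide the ranges are, and the match test stays cheap.  Note that std::stoi accepts trailing garbage such as "7abc", so a real version would want stricter validation.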

>
>> @@ -1192,6 +1200,10 @@ dwarf_query::parse_function_spec(const string & spec)
>>                clog << linenos[0] << " - " << linenos[1];
>>                break;
>>
>> +            case ENUMERATED:
>> +              clog << linenos[0] << ", ..., " << *(linenos.end());
>> +              break;
>> +
>>              case WILDCARD:
>>                clog << "*";
>>                break;
>
> For testing and debugging, it'd be nice if we actually properly print the
> contents of the linenos vector instead of '...'. It could just print all the
> linenos in the vector, although it'd be nice if it recognized ranges as well.
> (Also, you're dereferencing linenos.end()... did you mean linenos.back()?).
>
> Finally, is there any protection against repeating linenos? E.g. ":5-9,7-10"?
> Maybe linenos ought to be a set instead. Or maybe we should explicitly check
> for this and error out if it happens.

Yes, it seems we need to check against a list of items, each of which
is either an individual line number or a range.  That also solves the
output case, since we can simply print the items back as entered.
I can see roughly how this needs to change.  Enumerating the range
explicitly was my first instinct, but it could cause a surprising
performance hit for a large range in a large source file.
I'll try to get this all cleaned up in the next day or two.
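
As a rough idea of the overlap check and the debug output, assuming the linenos are kept as inclusive (start, end) pairs.  The `lineno_range` typedef and the `has_overlap` / `format_ranges` helpers are hypothetical names for this sketch, not existing code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Inclusive [first, second] line range; a single lineno N is stored as {N, N}.
typedef std::pair<int, int> lineno_range;

// Sort a copy by start line, then report whether any two ranges share a
// line.  After sorting, comparing neighbours is sufficient as long as each
// range has first <= second.
static bool has_overlap(std::vector<lineno_range> ranges)
{
  std::sort(ranges.begin(), ranges.end());
  for (std::size_t i = 1; i < ranges.size(); ++i)
    if (ranges[i].first <= ranges[i - 1].second)
      return true;
  return false;
}

// Print the items back in the order given, e.g. "5-9,12".
static std::string format_ranges(const std::vector<lineno_range>& ranges)
{
  std::ostringstream out;
  for (std::size_t i = 0; i < ranges.size(); ++i)
    {
      if (i > 0)
        out << ",";
      out << ranges[i].first;
      if (ranges[i].second != ranges[i].first)
        out << "-" << ranges[i].second;
    }
  return out.str();
}
```

For the ":5-9,7-10" case, `has_overlap` reports true, so the parser could error out there.  Printing from the stored items also sidesteps the linenos.end() dereference, since nothing needs to look at the last element specially.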

thanks for your feedback,
Brian

>
>
> Cheers,
>
> Jonathan

