public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* Updated patch adding line number enumeration support to statement probe
@ 2014-06-03  7:43 BR Chrisman
  2014-06-03 15:10 ` Jonathan Lebon
  0 siblings, 1 reply; 6+ messages in thread
From: BR Chrisman @ 2014-06-03  7:43 UTC (permalink / raw)
  To: systemtap

fche on IRC suggested sweeping range and enumerated specifiers together.
This patch includes that.  What's confusing me right now is why the
test modification to 'statement.exp' fails.  I've tested this patch
with the binaries I was testing on before, which accepts intermixed
ranges/enumerations such as *@*:3,4-5,8,10-20 without a problem.  I
was expecting the same with this expect test, but it's failing (the
other tests I added succeed).  Would appreciate any ideas where to go
in isolating/validating the test case's behavior.

From f444aebbf92dae46bcb7c27412c782a4a3fe3a3f Mon Sep 17 00:00:00 2001
From: Brian Chrisman <brchrisman@gmail.com>
Date: Mon, 2 Jun 2014 14:19:38 -0700
Subject: [PATCH 1/1] Adding enumeration support to statement pp

linenos converted to vector<int> from int[2]
expanded statement.exp to cover ENUMERATED testing identical to RANGE

Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
---
 dwflpp.cxx                             |  8 +++++--
 dwflpp.h                               | 10 ++++----
 tapsets.cxx                            | 42 ++++++++++++++++++++++------------
 testsuite/systemtap.base/statement.exp | 31 ++++++++++++++++++++++---
 4 files changed, 66 insertions(+), 25 deletions(-)

diff --git a/dwflpp.cxx b/dwflpp.cxx
index 67a63c4..bfc7e11 100644
--- a/dwflpp.cxx
+++ b/dwflpp.cxx
@@ -1777,7 +1777,7 @@ get_funcs_in_srcfile(base_func_info_map_t& funcs,

 template<> void
 dwflpp::iterate_over_srcfile_lines<void>(char const * srcfile,
-                                         int linenos[2],
+                                         std::vector<int> linenos,
                                          enum lineno_t lineno_type,
                                          base_func_info_map_t& funcs,
                                          void (* callback) (Dwarf_Addr,
@@ -1821,6 +1821,10 @@ dwflpp::iterate_over_srcfile_lines<void>(char
const * srcfile,
                                     current_funcs, matching_lines);
   else if (lineno_type == WILDCARD)
     collect_all_lines(srcfile, current_funcs, matching_lines);
+  else if (lineno_type == ENUMERATED)
+    for (std::vector<int>::iterator it = linenos.begin(); it !=
linenos.end(); it++)
+      collect_lines_for_single_lineno(srcfile, *it, false, /* is_relative */
+                                      current_funcs, matching_lines);
   else // RANGE
     for (int lineno = linenos[0]; lineno <= linenos[1]; lineno++)
       collect_lines_for_single_lineno(srcfile, lineno, false, /* is_relative */
@@ -1858,7 +1862,7 @@ template<> void
 dwflpp::iterate_over_labels<void>(Dwarf_Die *begin_die,
                                   const string& sym,
                                   const base_func_info& function,
-                                  int linenos[2],
+                                  std::vector<int> linenos,
                                   enum lineno_t lineno_type,
                                   void *data,
                                   void (* callback)(const base_func_info&,
diff --git a/dwflpp.h b/dwflpp.h
index 2d4a656..338b65e 100644
--- a/dwflpp.h
+++ b/dwflpp.h
@@ -38,7 +38,7 @@ struct symbol_table;
 struct base_query;
 struct external_function_query;

-enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD };
+enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD, ENUMERATED };
 enum info_status { info_unknown, info_present, info_absent };

 // module -> cu die[]
@@ -318,7 +318,7 @@ struct dwflpp

   template<typename T>
   void iterate_over_srcfile_lines (char const * srcfile,
-                                   int linenos[2],
+                                   std::vector<int> linenos,
                                    enum lineno_t lineno_type,
                                    base_func_info_map_t& funcs,
                                    void (*callback) (Dwarf_Addr,
@@ -339,7 +339,7 @@ struct dwflpp
   void iterate_over_labels (Dwarf_Die *begin_die,
                             const std::string& sym,
                             const base_func_info& function,
-                            int linenos[2],
+                            std::vector<int> linenos,
                             enum lineno_t lineno_type,
                             T *data,
                             void (* callback)(const base_func_info&,
@@ -701,7 +701,7 @@ dwflpp::iterate_over_plt<void>(void *object, void
(*callback)(void*,
                                                               size_t));
 template<> void
 dwflpp::iterate_over_srcfile_lines<void>(char const * srcfile,
-                                         int linenos[2],
+                                         std::vector<int> linenos,
                                          enum lineno_t lineno_type,
                                          base_func_info_map_t& funcs,
                                          void (* callback) (Dwarf_Addr,
@@ -711,7 +711,7 @@ template<> void
 dwflpp::iterate_over_labels<void>(Dwarf_Die *begin_die,
                                   const std::string& sym,
                                   const base_func_info& function,
-                                  int linenos[2],
+                                  std::vector<int> linenos,
                                   enum lineno_t lineno_type,
                                   void *data,
                                   void (* callback)(const base_func_info&,
diff --git a/tapsets.cxx b/tapsets.cxx
index 04810dc..3d6f10a 100644
--- a/tapsets.cxx
+++ b/tapsets.cxx
@@ -812,7 +812,7 @@ struct dwarf_query : public base_query
   string function;
   string file;
   lineno_t lineno_type;
-  int linenos[2];
+  vector<int> linenos;
   bool query_done; // Found exact match

   // Holds the prologue end of the current function
@@ -1086,9 +1086,9 @@ void
 dwarf_query::parse_function_spec(const string & spec)
 {
   lineno_type = ABSOLUTE;
-  linenos[0] = linenos[1] = 0;
-
-  size_t src_pos, line_pos, dash_pos, scope_pos;
+  linenos.push_back(0);
+  linenos.push_back(0);
+  size_t src_pos, line_pos, scope_pos;

   // look for named scopes
   scope_pos = spec.rfind("::");
@@ -1135,18 +1135,26 @@ dwarf_query::parse_function_spec(const string & spec)
           if (lineno_type != WILDCARD)
             try
               {
-                // try to parse either N or N-M
-                dash_pos = spec.find('-', line_pos + 1);
-                if (dash_pos == string::npos)
-                  linenos[0] = linenos[1] =
lex_cast<int>(spec.substr(line_pos + 1));
+                // try to parse N, N-M, or N,M,O,P, or combination thereof...
+                if (spec.find_first_of(",-", line_pos + 1) != string::npos)
+                  {
+                    lineno_type = ENUMERATED;
+                    linenos.clear();
+                    string dash_delimited,num;
+                    std::stringstream
comma_delimited(spec.substr(line_pos + 1));
+                    vector<int> tmp;
+                    while (std::getline(comma_delimited, dash_delimited, ','))
+                      {
+                        tmp.clear();
+                        stringstream dash_delimited_stream(dash_delimited);
+                        while (std::getline(dash_delimited_stream, num, '-'))
+                          tmp.push_back(lex_cast<int>(num));
+                        if (tmp.size() > 1)
+                            for (int i = tmp.front(); i <= tmp.back(); i++)
+                                linenos.push_back(i);
                         else
-                  {
-                    lineno_type = RANGE;
-                    linenos[0] = lex_cast<int>(spec.substr(line_pos + 1,
-                                                        dash_pos -
line_pos - 1));
-                    linenos[1] = lex_cast<int>(spec.substr(dash_pos + 1));
-                    if (linenos[0] > linenos[1])
-                      throw runtime_error("bad range");
+                            linenos.push_back(tmp.at(0));
+                      }
                   }
               }
             catch (runtime_error & exn)
@@ -1192,6 +1200,10 @@ dwarf_query::parse_function_spec(const string & spec)
               clog << linenos[0] << " - " << linenos[1];
               break;

+            case ENUMERATED:
+              clog << linenos[0] << ", ..., " << *(linenos.end());
+              break;
+
             case WILDCARD:
               clog << "*";
               break;
diff --git a/testsuite/systemtap.base/statement.exp
b/testsuite/systemtap.base/statement.exp
index a8a1f87..546450a 100644
--- a/testsuite/systemtap.base/statement.exp
+++ b/testsuite/systemtap.base/statement.exp
@@ -90,7 +90,7 @@ expect_probes * * 9

 # Now we do some consistent probe checking

-# ABSOLUTE and RANGE should give the same results for wild and single func
+# ABSOLUTE, RANGE, and ENUMERATED should give the same results for
wild and single func
 set subtest "ABSOLUTE - single func"
 expect_probes foo 5 1
 set subtest "ABSOLUTE - wild func"
@@ -99,8 +99,16 @@ set subtest "RANGE - single func"
 expect_probes foo 5-7 3
 set subtest "RANGE - wild func"
 expect_probes * 5-7 3
-
-# But ABSOLUTE and RANGE shouldn't give any results if the linenos fall outside
+set subtest "ENUMERATED - single func"
+expect_probes foo 5,6,7 3
+set subtest "ENUMERATED and RANGE - single func"
+expect_probes foo 5,6-7 3
+set subtest "ENUMERATED - wild func"
+expect_probes * 5,6,7 3
+set subtest "ENUMERATED and RANGE - wild func"
+expect_probes * 5-6,7 3
+
+# But ABSOLUTE, RANGE, and ENUMERATED shouldn't give any results if
the linenos fall outside
 # the given func(s)
 set subtest "ABSOLUTE - outside single func"
 expect_probes bar 5 0
@@ -110,6 +118,10 @@ set subtest "RANGE - outside single func"
 expect_probes bar 5-7 0
 set subtest "RANGE - outside wild func"
 expect_probes {[bm]*} 5-7 0
+set subtest "ENUMERATED - outside single func"
+expect_probes bar 5-6,7 0
+set subtest "ENUMERATED - outside wild func"
+expect_probes {[bm]*} 5,6-7 0

 # RELATIVE and WILDCARD must be applied *per* function
 set subtest "RELATIVE - single func"
@@ -138,6 +150,12 @@ expect_probes foo 0-5 2
 set subtest "RANGE - out-of-bounds upper"
 expect_probes foo 5-999 3

+# ENUMERATED on a function is bound by function linenos
+set subtest "ENUMERATED - out-of-bounds lower"
+expect_probes foo 0,1,2,3,4,5 2
+set subtest "ENUMERATED - out-of-bounds upper"
+expect_probes foo 5,6,7,998,999 3
+
 # RANGE overlapping two functions yields intersection of range and filtered
 # functions
 set subtest "RANGE - single func overlapping"
@@ -145,4 +163,11 @@ expect_probes foo 5-13 3
 set subtest "RANGE - wildcard func overlapping"
 expect_probes * 5-13 6

+# ENUMERATED overlapping two functions yields intersection of range
and filtered
+# functions
+set subtest "ENUMERATED - single func overlapping"
+expect_probes foo 5,6,7,8,9,10,11,12,13 3
+set subtest "ENUMERATED - wildcard func overlapping"
+expect_probes * 5,6,7,8,9,10,11,12,13 6
+
 if {[file exists "$test"]}   { file delete "$test" }
-- 
1.8.5.2 (Apple Git-48)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Updated patch adding line number enumeration support to statement probe
  2014-06-03  7:43 Updated patch adding line number enumeration support to statement probe BR Chrisman
@ 2014-06-03 15:10 ` Jonathan Lebon
  2014-06-03 17:46   ` BR Chrisman
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Lebon @ 2014-06-03 15:10 UTC (permalink / raw)
  To: BR Chrisman; +Cc: systemtap

Hi Brian,

Thanks for the patch. I had some difficulties applying it. Can you make
sure that your mail client does not modify whitespaces? If you use the
git send-mail command you should have no issues.

Your patch looks OK overall. There are some things however that need
attention:

> @@ -38,7 +38,7 @@ struct symbol_table;
>  struct base_query;
>  struct external_function_query;
> 
> -enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD };
> +enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD, ENUMERATED };
>  enum info_status { info_unknown, info_present, info_absent };
> 
>  // module -> cu die[]

You've eliminated any way to have a RANGE lineno type, but it is still
in the lineno_t enum and there are still references to it in many places
(like iterate_over_labels()). These need to be cleaned up as well to
instead handle ENUMERATED.

> @@ -1086,9 +1086,9 @@ void
>  dwarf_query::parse_function_spec(const string & spec)
>  {
>    lineno_type = ABSOLUTE;
> -  linenos[0] = linenos[1] = 0;
> -
> -  size_t src_pos, line_pos, dash_pos, scope_pos;
> +  linenos.push_back(0);
> +  linenos.push_back(0);
> +  size_t src_pos, line_pos, scope_pos;
> 
>    // look for named scopes
>    scope_pos = spec.rfind("::");

You seed the linenos vector with two '0' elements, but then never add
the parsed linenos for ABSOLUTE or RELATIVE linenos. So stap thinks that
the user entered lineno 0. I suspect that's what's causing all the failures
in statement.exp.

> @@ -1135,18 +1135,26 @@ dwarf_query::parse_function_spec(const string & spec)
> lex_cast<int>(spec.substr(line_pos + 1));
> +                // try to parse N, N-M, or N,M,O,P, or combination
> thereof...
> +                if (spec.find_first_of(",-", line_pos + 1) != string::npos)
> +                  {
> +                    lineno_type = ENUMERATED;
> +                    linenos.clear();
> +                    string dash_delimited,num;
> +                    std::stringstream
> comma_delimited(spec.substr(line_pos + 1));
> +                    vector<int> tmp;
> +                    while (std::getline(comma_delimited, dash_delimited,
> ','))
> +                      {
> +                        tmp.clear();
> +                        stringstream dash_delimited_stream(dash_delimited);
> +                        while (std::getline(dash_delimited_stream, num,
> '-'))
> +                          tmp.push_back(lex_cast<int>(num));
> +                        if (tmp.size() > 1)
> +                            for (int i = tmp.front(); i <= tmp.back(); i++)
> +                                linenos.push_back(i);

For parsing the spec, I think it would be easier to tokenize on commas,
and then check the resulting tokens one by one for either a single lineno
or a range (maybe even factor that part out into its own function). There's
tokenize() in util.cxx which is useful for this.

> @@ -1192,6 +1200,10 @@ dwarf_query::parse_function_spec(const string & spec)
>                clog << linenos[0] << " - " << linenos[1];
>                break;
> 
> +            case ENUMERATED:
> +              clog << linenos[0] << ", ..., " << *(linenos.end());
> +              break;
> +
>              case WILDCARD:
>                clog << "*";
>                break;

For testing and debugging, it'd be nice if we actually properly print the
contents of the linenos vector instead of '...'. It could just print all the
linenos in the vector, although it'd be nice if it recognized ranges as well.
(Also, you're dereferencing linenos.end()... did you mean linenos.back()?).

Finally, is there any protection against repeating linenos? E.g. ":5-9,7-10"?
Maybe linenos ought to be a set instead. Or maybe we should explicitly check
for this and error out if it happens.


Cheers,

Jonathan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Updated patch adding line number enumeration support to statement probe
  2014-06-03 15:10 ` Jonathan Lebon
@ 2014-06-03 17:46   ` BR Chrisman
  2014-06-03 18:44     ` Jonathan Lebon
  0 siblings, 1 reply; 6+ messages in thread
From: BR Chrisman @ 2014-06-03 17:46 UTC (permalink / raw)
  Cc: systemtap

On Tue, Jun 3, 2014 at 8:10 AM, Jonathan Lebon <jlebon@redhat.com> wrote:
> Hi Brian,
>
> Thanks for the patch. I had some difficulties applying it. Can you make
> sure that your mail client does not modify whitespaces? If you use the
> git send-mail command you should have no issues.

I'll verify my mailing before sending up the next patch.  I've had
issues with this before, my apologies.

>
> Your patch looks OK overall. There are some things however that need
> attention:
>
>> @@ -38,7 +38,7 @@ struct symbol_table;
>>  struct base_query;
>>  struct external_function_query;
>>
>> -enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD };
>> +enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD, ENUMERATED };
>>  enum info_status { info_unknown, info_present, info_absent };
>>
>>  // module -> cu die[]
>
> You've eliminated any way to have a RANGE lineno type, but it is still
> in the lineno_t enum and there are still references to it in many places
> (like iterate_over_labels()). These need to be cleaned up as well to
> instead handle ENUMERATED.

I cleaned up all the RANGE stuff and will include in another patch.

>
>> @@ -1086,9 +1086,9 @@ void
>>  dwarf_query::parse_function_spec(const string & spec)
>>  {
>>    lineno_type = ABSOLUTE;
>> -  linenos[0] = linenos[1] = 0;
>> -
>> -  size_t src_pos, line_pos, dash_pos, scope_pos;
>> +  linenos.push_back(0);
>> +  linenos.push_back(0);
>> +  size_t src_pos, line_pos, scope_pos;
>>
>>    // look for named scopes
>>    scope_pos = spec.rfind("::");
>
> You seed the linenos vector with two '0' elements, but then never add
> the parsed linenos for ABSOLUTE or RELATIVE linenos. So stap thinks that
> the user entered lineno 0. I suspect that's what's causing all the failures
> in statement.exp.

The seed values were intended to mimic the removed line:
linenos[0] = linenos[1] = 0;

I tried testing with those seeds removed, which failed as well, but
I'm confused about how/where ABSOLUTE/RELATIVE specifications are
parsed.

The two statement.exp test cases which are failing are:
FAIL: statement (ENUMERATED and RANGE - single func - expected 3 probes, got 0)
FAIL: statement (ENUMERATED and RANGE - wild func - expected 3 probes, got 0)
...
                ===  Summary ===
# of expected passes            32
# of unexpected failures        2


>
>> @@ -1135,18 +1135,26 @@ dwarf_query::parse_function_spec(const string & spec)
>> lex_cast<int>(spec.substr(line_pos + 1));
>> +                // try to parse N, N-M, or N,M,O,P, or combination
>> thereof...
>> +                if (spec.find_first_of(",-", line_pos + 1) != string::npos)
>> +                  {
>> +                    lineno_type = ENUMERATED;
>> +                    linenos.clear();
>> +                    string dash_delimited,num;
>> +                    std::stringstream
>> comma_delimited(spec.substr(line_pos + 1));
>> +                    vector<int> tmp;
>> +                    while (std::getline(comma_delimited, dash_delimited,
>> ','))
>> +                      {
>> +                        tmp.clear();
>> +                        stringstream dash_delimited_stream(dash_delimited);
>> +                        while (std::getline(dash_delimited_stream, num,
>> '-'))
>> +                          tmp.push_back(lex_cast<int>(num));
>> +                        if (tmp.size() > 1)
>> +                            for (int i = tmp.front(); i <= tmp.back(); i++)
>> +                                linenos.push_back(i);
>
> For parsing the spec, I think it would be easier to tokenize on commas,
> and then check the resulting tokens one by one for either a single lineno
> or a range (maybe even factor that part out into its own function). There's
> tokenize() in util.cxx which is useful for this.

I noticed that I misinterpreted what the code was doing with the range before.
This patch simply translates it into an enumeration.
Even using binary_search on an enumeration, a search of a large range
would be inefficient.
I'll have to reformulate this.  I'm not sure it helps to go as far as
interval tree stuff, but tracking ranges properly, rather than
smashing them into enumerations, makes sense to me.

>
>> @@ -1192,6 +1200,10 @@ dwarf_query::parse_function_spec(const string & spec)
>>                clog << linenos[0] << " - " << linenos[1];
>>                break;
>>
>> +            case ENUMERATED:
>> +              clog << linenos[0] << ", ..., " << *(linenos.end());
>> +              break;
>> +
>>              case WILDCARD:
>>                clog << "*";
>>                break;
>
> For testing and debugging, it'd be nice if we actually properly print the
> contents of the linenos vector instead of '...'. It could just print all the
> linenos in the vector, although it'd be nice if it recognized ranges as well.
> (Also, you're dereferencing linenos.end()... did you mean linenos.back()?).
>
> Finally, is there any protection against repeating linenos? E.g. ":5-9,7-10"?
> Maybe linenos ought to be a set instead. Or maybe we should explicitly check
> for this and error out if it happens.

Yeah, it seems like we need to check against a series of items which
could be either individual line numbers or ranges.  This will also
solve the output case as we can just parrot out the items.
I kind-of see where this can change.  Enumerating the range explicitly
was my gut response, but it could be a surprise performance reduction
for a large range in a large file of source.
I'll try to get this all cleaned up in the next day or two.

thanks for your feedback,
Brian

>
>
> Cheers,
>
> Jonathan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Updated patch adding line number enumeration support to statement probe
  2014-06-03 17:46   ` BR Chrisman
@ 2014-06-03 18:44     ` Jonathan Lebon
  2014-06-03 20:17       ` BR Chrisman
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Lebon @ 2014-06-03 18:44 UTC (permalink / raw)
  To: BR Chrisman; +Cc: systemtap

Hi Brian,


> I'll verify my mailing before sending up the next patch.  I've had
> issues with this before, my apologies.

No worries.

> I cleaned up all the RANGE stuff and will include in another patch.

Great!

> > You seed the linenos vector with two '0' elements, but then never add
> > the parsed linenos for ABSOLUTE or RELATIVE linenos. So stap thinks that
> > the user entered lineno 0. I suspect that's what's causing all the failures
> > in statement.exp.
> 
> The seed values were intended to mimic the removed line:
> linenos[0] = linenos[1] = 0;
> 
> I tried testing with those seeds removed, which failed as well, but
> I'm confused about how/where ABSOLUTE/RELATIVE specifications are
> parsed.

The lines that were removed which took care of this were these:

-   dash_pos = spec.find('-', line_pos + 1);
-   if (dash_pos == string::npos)
-     linenos[0] = linenos[1] = lex_cast<int>(spec.substr(line_pos + 1));

> The two statement.exp test cases which are failing are:
> FAIL: statement (ENUMERATED and RANGE - single func - expected 3 probes, got
> 0)
> FAIL: statement (ENUMERATED and RANGE - wild func - expected 3 probes, got 0)
> ...
>                 ===  Summary ===
> # of expected passes            32
> # of unexpected failures        2

Strange, I had the following failures when applying your patch:

Running /home/yyz/jlebon/codebase/systemtap/systemtap/testsuite/systemtap.base/statement.exp ...
FAIL: statement (ABSOLUTE - single func - expected 1 probes, got 0)
FAIL: statement (ABSOLUTE - wild func - expected 1 probes, got 0)
FAIL: statement (RELATIVE - single func - expected 1 probes, got 0)
FAIL: statement (RELATIVE - wild func - expected 2 probes, got 1)
FAIL: statement (ABSOLUTE - error for no lines - single func - no semantic error)

> > For parsing the spec, I think it would be easier to tokenize on commas,
> > and then check the resulting tokens one by one for either a single lineno
> > or a range (maybe even factor that part out into its own function). There's
> > tokenize() in util.cxx which is useful for this.
> 
> I noticed that I misinterpreted what the code was doing with the range
> before.
> This patch simply translates it into an enumeration.
> Even using binary_search on an enumeration, a search of a large range
> would be inefficient.

So what I meant here was that it might be easier to parse the a,b-c,d,e,...
by doing something like this (pseudo-code):

vector<string> tokens;
tokenize("a,b-c,d,e", tokens, ","); // available from util.h
for (token in tokens) {
  if token is single number
    add single number to linenos vector
  if token is a range
    for each number in range
       add number to linenos vector
}

Regarding binary searching, I think that's OK. It's still pretty
fast. :)

> I'll have to reformulate this.  I'm not sure it helps to go as far as
> interval tree stuff, but tracking ranges properly, rather than
> smashing them into enumerations, makes sense to me.

> Yeah, it seems like we need to check against a series of items which
> could be either individual line numbers or ranges.  This will also
> solve the output case as we can just parrot out the items.
> I kind-of see where this can change.  Enumerating the range explicitly
> was my gut response, but it could be a surprise performance reduction
> for a large range in a large file of source.
> I'll try to get this all cleaned up in the next day or two.

I think it would be easier to keep it simple and let the vector just hold
the individual linenos (like you originally designed). Since we'll need to
sort the final vector after all the parsing is done anyway (for later
binary searching if necessary), we can easily print it out in the
"a,b-c,d,e,..." form by going through the vector and detecting jumps >1.

> thanks for your feedback,
> Brian

Looking forward to your next patches!


Cheers,

Jonathan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Updated patch adding line number enumeration support to statement probe
  2014-06-03 18:44     ` Jonathan Lebon
@ 2014-06-03 20:17       ` BR Chrisman
  2014-06-03 20:32         ` Jonathan Lebon
  0 siblings, 1 reply; 6+ messages in thread
From: BR Chrisman @ 2014-06-03 20:17 UTC (permalink / raw)
  To: Jonathan Lebon; +Cc: systemtap

On Tue, Jun 3, 2014 at 11:44 AM, Jonathan Lebon <jlebon@redhat.com> wrote:
> Hi Brian,
>
>
>> I'll verify my mailing before sending up the next patch.  I've had
>> issues with this before, my apologies.
>
> No worries.
>
>> I cleaned up all the RANGE stuff and will include in another patch.
>
> Great!
>
>> > You seed the linenos vector with two '0' elements, but then never add
>> > the parsed linenos for ABSOLUTE or RELATIVE linenos. So stap thinks that
>> > the user entered lineno 0. I suspect that's what's causing all the failures
>> > in statement.exp.
>>
>> The seed values were intended to mimic the removed line:
>> linenos[0] = linenos[1] = 0;
>>
>> I tried testing with those seeds removed, which failed as well, but
>> I'm confused about how/where ABSOLUTE/RELATIVE specifications are
>> parsed.
>
> The lines that were removed which took care of this were these:
>
> -   dash_pos = spec.find('-', line_pos + 1);
> -   if (dash_pos == string::npos)
> -     linenos[0] = linenos[1] = lex_cast<int>(spec.substr(line_pos + 1));

Ahh yes.. *that* one..


>
>> The two statement.exp test cases which are failing are:
>> FAIL: statement (ENUMERATED and RANGE - single func - expected 3 probes, got
>> 0)
>> FAIL: statement (ENUMERATED and RANGE - wild func - expected 3 probes, got 0)
>> ...
>>                 ===  Summary ===
>> # of expected passes            32
>> # of unexpected failures        2
>
> Strange, I had the following failures when applying your patch:
>
> Running /home/yyz/jlebon/codebase/systemtap/systemtap/testsuite/systemtap.base/statement.exp ...
> FAIL: statement (ABSOLUTE - single func - expected 1 probes, got 0)
> FAIL: statement (ABSOLUTE - wild func - expected 1 probes, got 0)
> FAIL: statement (RELATIVE - single func - expected 1 probes, got 0)
> FAIL: statement (RELATIVE - wild func - expected 2 probes, got 1)
> FAIL: statement (ABSOLUTE - error for no lines - single func - no semantic error)

huh.. well.. if the line I goofed above cleans this up, then it should
be all good I suppose.  My internal code tests work for the
multi-range.

>
>> > For parsing the spec, I think it would be easier to tokenize on commas,
>> > and then check the resulting tokens one by one for either a single lineno
>> > or a range (maybe even factor that part out into its own function). There's
>> > tokenize() in util.cxx which is useful for this.
>>
>> I noticed that I misinterpreted what the code was doing with the range
>> before.
>> This patch simply translates it into an enumeration.
>> Even using binary_search on an enumeration, a search of a large range
>> would be inefficient.
>
> So what I meant here was that it might be easier to parse the a,b-c,d,e,...
> by doing something like this (pseudo-code):
>
> vector<string> tokens;
> tokenize("a,b-c,d,e", tokens, ","); // available from util.h
> for (token in tokens) {
>   if token is single number
>     add single number to linenos vector
>   if token is a range
>     for each number in range
>        add number to linenos vector
> }

Yes, this makes much more sense.

>
> Regarding binary searching, I think that's OK. It's still pretty
> fast. :)
>
>> I'll have to reformulate this.  I'm not sure it helps to go as far as
>> interval tree stuff, but tracking ranges properly, rather than
>> smashing them into enumerations, makes sense to me.
>
>> Yeah, it seems like we need to check against a series of items which
>> could be either individual line numbers or ranges.  This will also
>> solve the output case as we can just parrot out the items.
>> I kind-of see where this can change.  Enumerating the range explicitly
>> was my gut response, but it could be a surprise performance reduction
>> for a large range in a large file of source.
>> I'll try to get this all cleaned up in the next day or two.
>
> I think it would be easier to keep it simple and let the vector just hold
> the individual linenos (like you originally designed). Since we'll need to
> sort the final vector after all the parsing is done anyway (for later
> binary searching if necessary), we can easily print it out in the
> "a,b-c,d,e,..." form by going through the vector and detecting jumps >1.

... makes sense... is there a 'no end of line' clog stream?
I'd like to output piecemeal here, so I'd need a stream that doesn't
automatically endl or accumulate the whole thing into a string.

>
>> thanks for your feedback,
>> Brian
>
> Looking forward to your next patches!
>
>
> Cheers,
>
> Jonathan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Updated patch adding line number enumeration support to statement probe
  2014-06-03 20:17       ` BR Chrisman
@ 2014-06-03 20:32         ` Jonathan Lebon
  0 siblings, 0 replies; 6+ messages in thread
From: Jonathan Lebon @ 2014-06-03 20:32 UTC (permalink / raw)
  To: BR Chrisman; +Cc: systemtap

> ... makes sense... is there a 'no end of line' clog stream?
> I'd like to output piecemeal here, so I'd need a stream that doesn't
> automatically endl or accumulate the whole thing into a string.

Writing to clog shouldn't create newlines unless you explicitly use
endl:

   // prints "same line"
   clog << "same ";
   clog << "line" << endl;

To clarify, this will be output in parse_function_spec(), in the part
that is gated by

   if (sess.verbose > 2)
     {
       ...
     }

so you'll only see it if you do stap -vvv. This is not what will be
printed during a normal stap -l, which instead prints a probe point
for each individual lineno.


Hope this helps!

Jonathan

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-06-03 20:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-03  7:43 Updated patch adding line number enumeration support to statement probe BR Chrisman
2014-06-03 15:10 ` Jonathan Lebon
2014-06-03 17:46   ` BR Chrisman
2014-06-03 18:44     ` Jonathan Lebon
2014-06-03 20:17       ` BR Chrisman
2014-06-03 20:32         ` Jonathan Lebon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).