public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* Re: BZ#2421 - removing duplicate probe handlers
       [not found] ` <1153494928.11557.24.camel@dhcp-2.hsv.redhat.com>
@ 2006-07-21 16:01   ` Frank Ch. Eigler
  2006-07-21 19:15     ` David Smith
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Ch. Eigler @ 2006-07-21 16:01 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

Hi -

> > I've been looking into BZ#2421 this week while waiting for people to
> > look at the kernel patch.  [...]

OK.

> - Variable lifetimes.  The probe_flavours list down in
> dwarf_query::add_probe_point() is now global, but the dps list in
> semantic_pass_symbols() is now per file.

Global as in "global scope"?  We should avoid that kind of stuff in
C++, and instead add members to something like systemtap_session.

> Do you remember exactly what is a "file" in the context of
> semantic_pass_symbols()?  Is that a tapset file or something else?

It's a "struct stapfile*" - one parse tree for an entire file.  One of
these would be the input script, and many would be the ton of tapset
scripts.

> - Current flavour code needs to be improved.  [...]

Yeah.

> [...] So, should function duplication removal work globally or only
> per probe specification line?

I imagine it working globally.  I believe what we need is a variant of
the compiler optimization known as "value numbering".  It could live
in a new optimization pass within semantic_pass_optimize.  It could
compute a hash value for every function & probe-handler from the
bottom up (requiring a new staptree-related visitor, and include
referent-names and parse tree types).  Multiple functions &
probe-handlers with the same hash value could be duplicate-eliminated.
(In the case of functions, all the callers of one of the duplicates
would be redirected to a single one.  In the case of probe handlers,
their probe_location would be merged.)  (With the addition of a
parse-tree-comparison visitor, one could eliminate the slight risk of
a hash collision.)

This should supplant the target-variable flavouring magic for dwarf
probes.

Does this make sense?  It's a little more ambitious, but want to give
it a try?

- FChE

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-07-21 16:01   ` BZ#2421 - removing duplicate probe handlers Frank Ch. Eigler
@ 2006-07-21 19:15     ` David Smith
  2006-08-01 18:39       ` David Smith
  0 siblings, 1 reply; 13+ messages in thread
From: David Smith @ 2006-07-21 19:15 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Systemtap List

On Fri, 2006-07-21 at 12:01 -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> > > I've been looking into BZ#2421 this week while waiting for people to
> > > look at the kernel patch.  [...]
> 
> OK.

...

> > [...] So, should function duplication removal work globally or only
> > per probe specification line?
> 
> I imagine it working globally.  I believe what we need is a variant of
> the compiler optimization known as "value numbering".  It could live
> in a new optimization pass within semantic_pass_optimize.  It could
> compute a hash value for every function & probe-handler from the
> bottom up (requiring a new staptree-related visitor, and include
> referent-names and parse tree types).  Multiple functions &
> probe-handlers with the same hash value could be duplicate-eliminated.
> (In the case of functions, all the callers of one of the duplicates
> would be redirected to a single one.  In the case of probe handlers,
> their probe_location would be merged.)  (With the addition of a
> parse-tree-comparison visitor, one could eliminate the slight risk of
> a hash collision.)

(Wow.  I've got to learn to stop asking Frank what seem to me to be
simple questions - I always seem to end up with lots more work...)

> This should supplant the target-variable flavouring magic for dwarf
> probes.

Hmm.  Would it make sense to keep the current flavouring magic as a sort
of "pass 1" attack at removing duplicate probe handlers?

> Does this make sense?  It's a little more ambitious, but want to give
> it a try?
> 
> - FChE

It makes since in a vague sort of way.  I'll give it a try and look into
it.

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-07-21 19:15     ` David Smith
@ 2006-08-01 18:39       ` David Smith
  2006-08-01 22:49         ` Frank Ch. Eigler
                           ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: David Smith @ 2006-08-01 18:39 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Systemtap List

[-- Attachment #1: Type: text/plain, Size: 3443 bytes --]

On Fri, 2006-07-21 at 14:13 -0500, David Smith wrote:
> On Fri, 2006-07-21 at 12:01 -0400, Frank Ch. Eigler wrote:
> > Hi -
> > 
> > > > I've been looking into BZ#2421 this week while waiting for people to
> > > > look at the kernel patch.  [...]
> > 
> > OK.
> 
> ...
> 
> > > [...] So, should function duplication removal work globally or only
> > > per probe specification line?
> > 
> > I imagine it working globally.  I believe what we need is a variant of
> > the compiler optimization known as "value numbering".  It could live
> > in a new optimization pass within semantic_pass_optimize.  It could
> > compute a hash value for every function & probe-handler from the
> > bottom up (requiring a new staptree-related visitor, and include
> > referent-names and parse tree types).  Multiple functions &
> > probe-handlers with the same hash value could be duplicate-eliminated.
> > (In the case of functions, all the callers of one of the duplicates
> > would be redirected to a single one.  In the case of probe handlers,
> > their probe_location would be merged.)  (With the addition of a
> > parse-tree-comparison visitor, one could eliminate the slight risk of
> > a hash collision.)

...

> > Does this make sense?  It's a little more ambitious, but want to give
> > it a try?
> > 
> > - FChE
> 
> It makes since in a vague sort of way.  I'll give it a try and look into
> it.

I've been working on this and here's an update.

I made a stab at removing duplicate functions.  I cheated and just
hashed the args/body of functions to do comparisons.  All duplicate
functions are redirected to the first copy found.  A patch is attached.

I'd appreciate any code comments.


As far as probes go, here I ran into conceptual problems.  There are 8
different kinds of probes derived from derived_probe:
alias_derived_probe, be_derived_probe, never_derived_probe,
dwarf_derived_probe, timer_derived_probe, profile_derived_probe,
mark_derived_probe, and hrtimer_derived_probe.  The way the code
generation currently works, it isn't going to be easy to merge a timer
probe with a dwarf_derived_probe for example - all the needed
information won't be there.  (If we merged the dwarf probe into the
timer probe, the timer probe code generation logic isn't going to know
what to do with the dwarf information.  If we merge the timer probe into
the dwarf probe, the dwarf probe code generation logic isn't going to
like the timer probe location not having any dwarf info.)

OK, I thought, I'll just merge probes of the same kind together -
dwarf_derived_probes with dwarf_derived_probes, timer_derived_probes
with timer_derived_probes, etc.  But, the problem here is certain data
is stored at the probe level, not the location level.  For instance,
timer probes store the timer interval at the probe level, not the
location level.  So, if we merged two timer probes together that had
different intervals, we can merge their location information, but they
are going to end up at the same timer interval.  A similar thing happens
with begin/end probes (be_derived_probes).  The fact that this is a
begin probe vs. an end probe is stored at the probe level, not the
location level.  So, if we merge a begin probe and an end probe
together, we're going to end up with either 2 begin probes or 2 end
probes.

Got any ideas here?

Thanks.

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


[-- Attachment #2: duplicate_function.patch --]
[-- Type: text/x-patch, Size: 3806 bytes --]

Index: elaborate.cxx
===================================================================
RCS file: /cvs/systemtap/src/elaborate.cxx,v
retrieving revision 1.62
diff -u -p -r1.62 elaborate.cxx
--- elaborate.cxx	5 Jun 2006 19:45:56 -0000	1.62
+++ elaborate.cxx	1 Aug 2006 17:40:00 -0000
@@ -24,7 +24,7 @@ extern "C" {
 #include <vector>
 #include <algorithm>
 #include <iterator>
-
+#include <ext/hash_map>
 
 using namespace std;
 
@@ -1550,6 +1550,95 @@ void semantic_pass_opt4 (systemtap_sessi
     }
 }
 
+struct duplicate_function_remover: public functioncall_traversing_visitor
+{
+  systemtap_session& s;
+  bool& relaxed_p;
+  map<functiondecl*, functiondecl*>& duplicate_function_map;
+
+  duplicate_function_remover(systemtap_session& sess, bool& r,
+			     map<functiondecl*, functiondecl*>&dfm):
+    s(sess), relaxed_p(r), duplicate_function_map(dfm) {};
+
+  void visit_functioncall (functioncall* e);
+};
+
+void
+duplicate_function_remover::visit_functioncall (functioncall *e)
+{
+  functioncall_traversing_visitor::visit_functioncall (e);
+
+  // If the current function call reference points to a function that
+  // is a duplicate, replace it.
+  if (duplicate_function_map.count(e->referent) != 0)
+    {
+      if (s.verbose>2)
+	  clog << "Changing " << e->referent->name
+	       << " reference to "
+	       << duplicate_function_map[e->referent]->name
+	       << " reference\n";
+      e->tok = duplicate_function_map[e->referent]->tok;
+      e->function = duplicate_function_map[e->referent]->name;
+      e->referent = duplicate_function_map[e->referent];
+
+      relaxed_p = false;
+    }
+}
+
+static size_t
+hash_functiondecl (functiondecl* f)
+{
+  ostringstream s;
+  __gnu_cxx::hash<const char*> hash_func;
+
+  // Get the "name:args body" of the function in s.  We have to
+  // include the args since the function 'x1(a, b)' is different than
+  // the function 'x2(b, a)' even if the bodies of the two functions
+  // are exactly the same.
+  f->printsig(s);
+  f->body->print(s);
+
+  // printsig puts f->name + ':' on the front.  Remove this
+  // (otherwise, functions would never compare equal).
+  string str = s.str().erase(0, f->name.size() + 1);
+
+  // Actually generate the hash.
+  return hash_func(str.c_str());
+}
+
+void semantic_pass_opt5 (systemtap_session& s, bool& relaxed_p)
+{
+  // Walk through all the functions, looking for duplicates.
+  map<size_t, functiondecl*> function_hash_map;
+  map<functiondecl*, functiondecl*> duplicate_function_map;
+  for (unsigned i=0; i < s.functions.size(); i++)
+    {
+      size_t function_hash = hash_functiondecl(s.functions[i]);
+
+      if (function_hash_map.count(function_hash) == 0)
+	// This function is unique.  Remember it.
+	function_hash_map[function_hash] = s.functions[i];
+      else
+	// This function is a duplicate.
+	duplicate_function_map[s.functions[i]]
+	    = function_hash_map[function_hash];
+    }
+
+  // If we have duplicate functions, traverse down the tree, replacing
+  // the appropriate function calls.
+  // duplicate_function_remover::visit_functioncall() handles the
+  // details of replacing the function calls.  Note that we don't
+  // delete the duplicate functiondecl itself, we'll let pass 1 do
+  // that.
+  if (duplicate_function_map.size() != 0)
+    {
+      duplicate_function_remover dfr (s, relaxed_p, duplicate_function_map);
+
+      for (unsigned i=0; i < s.probes.size(); i++)
+	s.probes[i]->body->visit(&dfr);
+    }
+}
+
 
 static int
 semantic_pass_optimize (systemtap_session& s)
@@ -1571,6 +1660,7 @@ semantic_pass_optimize (systemtap_sessio
       semantic_pass_opt2 (s, relaxed_p);
       semantic_pass_opt3 (s, relaxed_p);
       semantic_pass_opt4 (s, relaxed_p);
+      semantic_pass_opt5 (s, relaxed_p);
     }
 
   if (s.probes.size() == 0)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-08-01 18:39       ` David Smith
@ 2006-08-01 22:49         ` Frank Ch. Eigler
  2006-08-02 21:16           ` David Smith
  2006-08-01 22:53         ` Frank Ch. Eigler
  2006-08-02  8:03         ` Li Guanglei
  2 siblings, 1 reply; 13+ messages in thread
From: Frank Ch. Eigler @ 2006-08-01 22:49 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

Hi -

On Tue, Aug 01, 2006 at 01:36:38PM -0500, David Smith wrote:

> [...]  As far as probes go, here I ran into conceptual problems.
> [...] The way the code generation currently works, it isn't going to
> be easy to merge a timer probe with a dwarf_derived_probe for
> example - all the needed information won't be there.  [...]  So, if
> we merge a begin probe and an end probe together, we're going to end
> up with either 2 begin probes or 2 end probes.

Maybe what we need is a slightly different representation of merging
here.  What we want is not to eliminate all those derived_probes, but
rather to reuse the code generated for identical ->body trees.  It is
those trees that are translated in c_unparser::emit_probe.  That may
be the perfect place to detect/handle probe-level duplication.
(Instead of emitting new function body that duplicates one already
seen, it could emit a call to the first copy instead.)

- FChE

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-08-01 18:39       ` David Smith
  2006-08-01 22:49         ` Frank Ch. Eigler
@ 2006-08-01 22:53         ` Frank Ch. Eigler
  2006-08-02  8:03         ` Li Guanglei
  2 siblings, 0 replies; 13+ messages in thread
From: Frank Ch. Eigler @ 2006-08-01 22:53 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

Hi -

> I made a stab at removing duplicate functions.  I cheated and just
> hashed the args/body of functions to do comparisons.  All duplicate
> functions are redirected to the first copy found.  A patch is
> attached.  [...]

It may be straightforward to have a general hashing_visitor that
computes an overall hash value into member variable.  It should
certainly mix in a hash of the function body, especially in the case
of an embedded-C function such as those generated for $target
variables.

- FChE

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-08-01 18:39       ` David Smith
  2006-08-01 22:49         ` Frank Ch. Eigler
  2006-08-01 22:53         ` Frank Ch. Eigler
@ 2006-08-02  8:03         ` Li Guanglei
  2006-08-02 13:19           ` David Smith
  2 siblings, 1 reply; 13+ messages in thread
From: Li Guanglei @ 2006-08-02  8:03 UTC (permalink / raw)
  To: David Smith; +Cc: Frank Ch. Eigler, Systemtap List

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=GB2312, Size: 1186 bytes --]

David Smith дµÀ:
> On Fri, 2006-07-21 at 14:13 -0500, David Smith wrote:
> [...]
> I've been working on this and here's an update.
> 
> I made a stab at removing duplicate functions.  I cheated and just
> hashed the args/body of functions to do comparisons.  All duplicate
> functions are redirected to the first copy found.  A patch is attached.
> 
> I'd appreciate any code comments.
> 

Hi David,

I tried this stap script:

probe kernel.function("sys_read"), kernel.function("sys_write")
{
        log("here")
}

but stap will still generate duplicated handlers after applying your
patch.

  sys_read-->enter_probe_896-->probe_896
  sys_write-->enter_probe_897-->probe_897

  enter_probe_896 and probe_896 are exactly a copy of enter_probe_897
and probe_897. It seems to me that this patch will determine if a
function is duplicated with another function by hashing its function
names and arguments. So semantic_pass_opt5 will think that
enter_probe_896 and enter_probe_897 are different and didn't optimize
for it. If so, maybe we have to do this optimization a little earlier,
 say, in semantic_pass_symbols() before calling
semantic_pass_optimize(). Am I right?

- Guanglei

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-08-02  8:03         ` Li Guanglei
@ 2006-08-02 13:19           ` David Smith
  0 siblings, 0 replies; 13+ messages in thread
From: David Smith @ 2006-08-02 13:19 UTC (permalink / raw)
  To: Li Guanglei; +Cc: Frank Ch. Eigler, Systemtap List

On Wed, 2006-08-02 at 16:03 +0800, Li Guanglei wrote:
> David Smith 写道:
> > On Fri, 2006-07-21 at 14:13 -0500, David Smith wrote:
> > [...]
> > I've been working on this and here's an update.
> > 
> > I made a stab at removing duplicate functions.  I cheated and just
> > hashed the args/body of functions to do comparisons.  All duplicate
> > functions are redirected to the first copy found.  A patch is attached.
> > 
> > I'd appreciate any code comments.
> > 
> 
> Hi David,
> 
> I tried this stap script:
> 
> probe kernel.function("sys_read"), kernel.function("sys_write")
> {
>         log("here")
> }
> 
> but stap will still generate duplicated handlers after applying your
> patch.
> 
>   sys_read-->enter_probe_896-->probe_896
>   sys_write-->enter_probe_897-->probe_897
> 
>   enter_probe_896 and probe_896 are exactly a copy of enter_probe_897
> and probe_897. It seems to me that this patch will determine if a
> function is duplicated with another function by hashing its function
> names and arguments. So semantic_pass_opt5 will think that
> enter_probe_896 and enter_probe_897 are different and didn't optimize
> for it. If so, maybe we have to do this optimization a little earlier,
>  say, in semantic_pass_symbols() before calling
> semantic_pass_optimize(). Am I right?
> 
> - Guanglei

You are correct, the patch currently doesn't try to do probe merging at
all, because of the conceptual problems I listed.  What it does do is
remove duplicate functions.  So this script:

probe kernel.function("sys_read").return,
   kernel.function("sys_write").return
{
       printf("return is %d", $return)
}

will only end up with 1 function to get the value of $return instead of
2.

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-08-01 22:49         ` Frank Ch. Eigler
@ 2006-08-02 21:16           ` David Smith
  2006-08-03  1:54             ` Frank Ch. Eigler
  0 siblings, 1 reply; 13+ messages in thread
From: David Smith @ 2006-08-02 21:16 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Systemtap List

[-- Attachment #1: Type: text/plain, Size: 2394 bytes --]

On Tue, 2006-08-01 at 18:49 -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> On Tue, Aug 01, 2006 at 01:36:38PM -0500, David Smith wrote:
> 
> > [...]  As far as probes go, here I ran into conceptual problems.
> > [...] The way the code generation currently works, it isn't going to
> > be easy to merge a timer probe with a dwarf_derived_probe for
> > example - all the needed information won't be there.  [...]  So, if
> > we merge a begin probe and an end probe together, we're going to end
> > up with either 2 begin probes or 2 end probes.
> 
> Maybe what we need is a slightly different representation of merging
> here.  What we want is not to eliminate all those derived_probes, but
> rather to reuse the code generated for identical ->body trees.  It is
> those trees that are translated in c_unparser::emit_probe.  That may
> be the perfect place to detect/handle probe-level duplication.
> (Instead of emitting new function body that duplicates one already
> seen, it could emit a call to the first copy instead.)
> 
> - FChE

I tried to think through this a bit, and I believe I was thinking too
hard about this.  I decided that we need to optimize the general case
first to get the biggest bang for the buck.

The bugzilla references the case we should be optimizing for - something
like this:

probe kernel.function("sys_read"), kernel.function("sys_write")
{
        log("here")
}

In addition, I realized that there are cases where we don't want to
merge probes even if they are identical (but we could use your idea
above of reusing the code bodies).  Such as this:

probe begin { log("here") }
probe begin { log("here") }

If we merge those 2 probes into 1, we're only going to get 1 output line
instead of 2.


So, I've coded up a patch that merges functions (as the last patch did)
and merges dwarf probes and dwarf return probes (keeping the 2 variants
separate).  This was relatively easy and should give us a good return.

I've also attached a (somewhat contrived) stp script that shows the
benefit of this patch.

original stap:
  generated C file: 1612k
  generated module: 1944k

removing duplicate functions only:
  generated C file: 1420k
  generated module: 1880k

removing duplicate functions and probes:
  generated C file:   48k
  generated module:  920k

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


[-- Attachment #2: duplicate.patch --]
[-- Type: text/x-patch, Size: 10513 bytes --]

? output.txt
? patch
? test1.stp
? test2.stp
? test3.stp
? test4.stp
? test5.stp
? runtime/lket/b2a/.deps
? runtime/lket/b2a/Makefile
? runtime/lket/b2a/lket-b2a
Index: elaborate.cxx
===================================================================
RCS file: /cvs/systemtap/src/elaborate.cxx,v
retrieving revision 1.62
diff -u -p -r1.62 elaborate.cxx
--- elaborate.cxx	5 Jun 2006 19:45:56 -0000	1.62
+++ elaborate.cxx	2 Aug 2006 21:03:43 -0000
@@ -24,7 +24,7 @@ extern "C" {
 #include <vector>
 #include <algorithm>
 #include <iterator>
-
+#include <ext/hash_map>
 
 using namespace std;
 
@@ -43,8 +43,8 @@ lex_cast(IN const & in)
 // ------------------------------------------------------------------------
 
 
-derived_probe::derived_probe (probe *p):
-  base (p)
+derived_probe::derived_probe (probe *p, bool merge):
+  base (p), can_merge (merge)
 {
   if (p)
     {
@@ -55,8 +55,8 @@ derived_probe::derived_probe (probe *p):
 }
 
 
-derived_probe::derived_probe (probe *p, probe_point *l):
-  base (p)
+derived_probe::derived_probe (probe *p, probe_point *l, bool merge):
+  base (p), can_merge (merge)
 {
   if (l)
     this->locations.push_back (l);
@@ -68,6 +68,15 @@ derived_probe::derived_probe (probe *p, 
     }
 }
 
+void
+derived_probe::merge (derived_probe* p)
+{
+  // This probe is a duplicate.  Merge it by adding all
+  // the duplicate probe's locations to this probe.
+  vector<probe_point*>::const_iterator pp;
+  for (pp = p->locations.begin(); pp != p->locations.end(); pp++)
+    locations.push_back(*pp);
+}
 
 // ------------------------------------------------------------------------
 // Members of derived_probe_builder
@@ -1550,6 +1559,151 @@ void semantic_pass_opt4 (systemtap_sessi
     }
 }
 
+struct duplicate_function_remover: public functioncall_traversing_visitor
+{
+  systemtap_session& s;
+  bool& relaxed_p;
+  map<functiondecl*, functiondecl*>& duplicate_function_map;
+
+  duplicate_function_remover(systemtap_session& sess, bool& r,
+			     map<functiondecl*, functiondecl*>&dfm):
+    s(sess), relaxed_p(r), duplicate_function_map(dfm) {};
+
+  void visit_functioncall (functioncall* e);
+};
+
+void
+duplicate_function_remover::visit_functioncall (functioncall *e)
+{
+  functioncall_traversing_visitor::visit_functioncall (e);
+
+  // If the current function call reference points to a function that
+  // is a duplicate, replace it.
+  if (duplicate_function_map.count(e->referent) != 0)
+    {
+      if (s.verbose>2)
+	  clog << "Changing " << e->referent->name
+	       << " reference to "
+	       << duplicate_function_map[e->referent]->name
+	       << " reference\n";
+      e->tok = duplicate_function_map[e->referent]->tok;
+      e->function = duplicate_function_map[e->referent]->name;
+      e->referent = duplicate_function_map[e->referent];
+
+      relaxed_p = false;
+    }
+}
+
+static size_t
+hash_functiondecl (functiondecl* f)
+{
+  ostringstream s;
+  __gnu_cxx::hash<const char*> hash_func;
+
+  // Get the "name:args body" of the function in s.  We have to
+  // include the args since the function 'x1(a, b)' is different than
+  // the function 'x2(b, a)' even if the bodies of the two functions
+  // are exactly the same.
+  f->printsig(s);
+  f->body->print(s);
+
+  // printsig puts f->name + ':' on the front.  Remove this
+  // (otherwise, functions would never compare equal).
+  string str = s.str().erase(0, f->name.size() + 1);
+
+  // Actually generate the hash.
+  return hash_func(str.c_str());
+}
+
+static size_t
+hash_probe (derived_probe* p)
+{
+  ostringstream s;
+  __gnu_cxx::hash<const char*> hash_func;
+
+  // For probes, we need the type and body of the probe.
+  s << p->get_type() << " ";
+  p->body->print(s);
+
+  // Actually generate the hash.
+  return hash_func(s.str().c_str());
+}
+
+void semantic_pass_opt5 (systemtap_session& s, bool& relaxed_p)
+{
+  // Walk through all the functions, looking for duplicates.
+  map<size_t, functiondecl*> function_hash_map;
+  map<functiondecl*, functiondecl*> duplicate_function_map;
+  for (unsigned i=0; i < s.functions.size(); i++)
+    {
+      size_t function_hash = hash_functiondecl(s.functions[i]);
+
+      if (function_hash_map.count(function_hash) == 0)
+	// This function is unique.  Remember it.
+	function_hash_map[function_hash] = s.functions[i];
+      else
+	// This function is a duplicate.
+	duplicate_function_map[s.functions[i]]
+	    = function_hash_map[function_hash];
+    }
+
+  // If we have duplicate functions, traverse down the tree, replacing
+  // the appropriate function calls.
+  // duplicate_function_remover::visit_functioncall() handles the
+  // details of replacing the function calls.  Note that we don't
+  // delete the duplicate functiondecl itself, we'll let pass 1 do
+  // that.
+  if (duplicate_function_map.size() != 0)
+    {
+      duplicate_function_remover dfr (s, relaxed_p, duplicate_function_map);
+
+      for (unsigned i=0; i < s.probes.size(); i++)
+	s.probes[i]->body->visit(&dfr);
+    }
+
+  // Walk through all the probes, looking for duplicates
+  map<size_t, derived_probe*> probe_hash_map;
+  for (unsigned i=0; i < s.probes.size(); /* see below */)
+    {
+      // If this probe type cannot be merged at all, even with a
+      // similar probe type, just go on to the next probe.
+      if (! s.probes[i]->can_merge)
+	i++;
+      // This probe type can be merged, if a duplicate is found.
+      else
+	{
+	  size_t probe_hash = hash_probe(s.probes[i]);
+
+	  if (probe_hash_map.count(probe_hash) == 0)
+	    {
+	      // This probe is uniqe.  Remember it.
+	      probe_hash_map[probe_hash] = s.probes[i];
+	      i++;
+	    }
+	  else
+	    {
+	      // This probe is a duplicate.  Remove it by first adding
+	      // all the duplicate probe's locations to the original
+	      // probe, then remove the probe itself.
+	      if (s.verbose>2)
+		{
+		  clog << "Merging probe ";
+		  s.probes[i]->printsig(clog);
+		  clog << " into probe ";
+		  probe_hash_map[probe_hash]->printsig(clog);
+		  clog << endl;
+		}
+
+	      probe_hash_map[probe_hash]->merge(s.probes[i]);
+
+	      s.probes.erase(s.probes.begin() + i);
+	      relaxed_p = false;
+	      // NB: don't increment i
+	    }
+	}
+    }
+}
+
 
 static int
 semantic_pass_optimize (systemtap_session& s)
@@ -1571,6 +1725,7 @@ semantic_pass_optimize (systemtap_sessio
       semantic_pass_opt2 (s, relaxed_p);
       semantic_pass_opt3 (s, relaxed_p);
       semantic_pass_opt4 (s, relaxed_p);
+      semantic_pass_opt5 (s, relaxed_p);
     }
 
   if (s.probes.size() == 0)
Index: elaborate.h
===================================================================
RCS file: /cvs/systemtap/src/elaborate.h,v
retrieving revision 1.31
diff -u -p -r1.31 elaborate.h
--- elaborate.h	2 Jun 2006 23:13:38 -0000	1.31
+++ elaborate.h	2 Aug 2006 21:03:43 -0000
@@ -109,13 +109,16 @@ class translator_output;
 
 struct derived_probe: public probe
 {
-  derived_probe (probe* b);
-  derived_probe (probe* b, probe_point* l);
+  derived_probe (probe* b, bool merge = false);
+  derived_probe (probe* b, probe_point* l, bool merge = false);
   probe* base; // the original parsed probe
   virtual probe* basest () { return base->basest(); }
 
   virtual ~derived_probe () {}
 
+  bool can_merge;
+  virtual const char* get_type () { return("unknown"); }
+
   virtual void emit_registrations (translator_output* o) = 0;
   // (from within module_init):
   // rc = ..... register_or_whatever (ENTRYFN);
@@ -137,6 +140,8 @@ struct derived_probe: public probe
   // From within unparser::emit_common_header, add any extra variables
   // to this probe's context locals.
 
+  virtual void merge (derived_probe* p);
+
 protected:
   void emit_probe_prologue (translator_output* o, const std::string&);
   void emit_probe_epilogue (translator_output* o);
Index: tapsets.cxx
===================================================================
RCS file: /cvs/systemtap/src/tapsets.cxx,v
retrieving revision 1.138
diff -u -p -r1.138 tapsets.cxx
--- tapsets.cxx	1 Aug 2006 02:20:47 -0000	1.138
+++ tapsets.cxx	2 Aug 2006 21:03:43 -0000
@@ -1714,6 +1714,8 @@ struct dwarf_derived_probe : public deri
 		       Dwarf_Addr addr,
 		       dwarf_query & q);
 
+  const char* get_type () { return (has_return ? "dwarf_return" : "dwarf"); }
+
   vector<Dwarf_Addr> probe_points;
   bool has_return;
 
@@ -1739,6 +1741,8 @@ struct dwarf_derived_probe : public deri
   virtual void emit_registrations (translator_output * o);
   virtual void emit_deregistrations (translator_output * o);
   virtual void emit_probe_entries (translator_output * o);
+
+  virtual void merge (derived_probe* p);
 };
 
 // Helper struct to thread through the dwfl callbacks.
@@ -3006,7 +3010,7 @@ dwarf_derived_probe::add_probe_point(str
 dwarf_derived_probe::dwarf_derived_probe (Dwarf_Die *scope_die,
 					  Dwarf_Addr addr,
 					  dwarf_query & q)
-  : derived_probe (q.base_probe, 0 /* location-less */),
+  : derived_probe (q.base_probe, 0 /* location-less */, true /* merge */),
     has_return (q.has_return)
 {
   string module_name(q.dw.module_name);
@@ -3374,6 +3378,28 @@ dwarf_derived_probe::emit_probe_entries 
 
 
 void
+dwarf_derived_probe::merge(derived_probe* p)
+{
+  // Sanity check.  Make sure types are the same.
+  if (strcmp(get_type(), p->get_type()) != 0)
+    throw runtime_error("bad probe merging");
+
+  // This is cheating, but we need a pointer to the
+  // dwarf_derived_probe, not the derived_probe.  Because of the check
+  // above, we can be certain this is a dwarf_derived_probe.
+  const dwarf_derived_probe* p2 = (const dwarf_derived_probe *)p;
+
+  // First call the base class.
+  derived_probe::merge(p);
+
+  // Now handle the dwarf_derived_probe specific stuff.  Merge it by
+  // adding all the duplicate probe's addressed to this probe.
+  vector<Dwarf_Addr>::const_iterator pp;
+  for (pp = p2->probe_points.begin(); pp != p2->probe_points.end(); pp++)
+    probe_points.push_back(*pp);
+}
+
+void
 dwarf_builder::build(systemtap_session & sess,
 		     probe * base,
 		     probe_point * location,
@@ -3765,7 +3791,7 @@ mark_derived_probe::mark_derived_probe (
                                         uintptr_t a,
                                         const string& m,
                                         probe* base):
-  derived_probe (base, 0), sess (s), probe_name (p_n), probe_sig (p_s),
+  derived_probe (base), sess (s), probe_name (p_n), probe_sig (p_s),
   address (a), module (m)
 {
   // create synthetic probe point

[-- Attachment #3: dup_function.stp --]
[-- Type: text/plain, Size: 13584 bytes --]

probe kernel.function("sys_accept").return,
  kernel.function("sys_access").return,
  kernel.function("sys_acct").return,
  kernel.function("sys_add_key").return,
  kernel.function("sys_adjtimex").return,
  kernel.function("sys_alarm").return,
  kernel.function("sys_bdflush").return,
  kernel.function("sys_bind").return,
  kernel.function("sys_brk").return,
  kernel.function("sys_capget").return,
  kernel.function("sys_capset").return,
  kernel.function("sys_chdir").return,
  kernel.function("sys_chmod").return,
  kernel.function("sys_chown16").return,
  kernel.function("sys_chown").return,
  kernel.function("sys_chroot").return,
  kernel.function("sys_clock_getres").return,
  kernel.function("sys_clock_gettime").return,
  kernel.function("sys_clock_nanosleep").return,
  kernel.function("sys_clock_settime").return,
  kernel.function("sys_clone").return,
  kernel.function("sys_close").return,
  kernel.function("sys_connect").return,
  kernel.function("sys_creat").return,
  kernel.function("sys_delete_module").return,
  kernel.function("sys_dup2").return,
  kernel.function("sys_dup").return,
  kernel.function("sys_epoll_create").return,
  kernel.function("sys_epoll_ctl").return,
  kernel.function("sys_epoll_wait").return,
  kernel.function("sys_execve").return,
  kernel.function("sys_faccessat").return,
  kernel.function("sys_fadvise64").return,
  kernel.function("sys_fadvise64_64").return,
  kernel.function("sys_fchdir").return,
  kernel.function("sys_fchmod").return,
  kernel.function("sys_fchmodat").return,
  kernel.function("sys_fchown16").return,
  kernel.function("sys_fchown").return,
  kernel.function("sys_fchownat").return,
  kernel.function("sys_fcntl64").return,
  kernel.function("sys_fcntl").return,
  kernel.function("sys_fdatasync").return,
  kernel.function("sys_fgetxattr").return,
  kernel.function("sys_flistxattr").return,
  kernel.function("sys_flock").return,
  kernel.function("sys_fork").return,
  kernel.function("sys_fremovexattr").return,
  kernel.function("sys_fsetxattr").return,
  kernel.function("sys_fstat64").return,
  kernel.function("sys_fstat").return,
  kernel.function("sys_fstatat64").return,
  kernel.function("sys_fstatfs64").return,
  kernel.function("sys_fstatfs").return,
  kernel.function("sys_fsync").return,
  kernel.function("sys_ftruncate64").return,
  kernel.function("sys_ftruncate").return,
  kernel.function("sys_futex").return,
  kernel.function("sys_futimesat").return,
  kernel.function("sys_get_robust_list").return,
  kernel.function("sys_get_thread_area").return,
  kernel.function("sys_getcwd").return,
  kernel.function("sys_getdents64").return,
  kernel.function("sys_getdents").return,
  kernel.function("sys_getegid16").return,
  kernel.function("sys_getegid").return,
  kernel.function("sys_geteuid16").return,
  kernel.function("sys_geteuid").return,
  kernel.function("sys_getgid16").return,
  kernel.function("sys_getgid").return,
  kernel.function("sys_getgroups16").return,
  kernel.function("sys_getgroups").return,
  kernel.function("sys_gethostname").return,
  kernel.function("sys_getitimer").return,
  kernel.function("sys_getpeername").return,
  kernel.function("sys_getpgid").return,
  kernel.function("sys_getpgrp").return,
  kernel.function("sys_getpid").return,
  kernel.function("sys_getppid").return,
  kernel.function("sys_getpriority").return,
  kernel.function("sys_getresgid16").return,
  kernel.function("sys_getresgid").return,
  kernel.function("sys_getresuid16").return,
  kernel.function("sys_getresuid").return,
  kernel.function("sys_getrlimit").return,
  kernel.function("sys_getrusage").return,
  kernel.function("sys_getsid").return,
  kernel.function("sys_getsockname").return,
  kernel.function("sys_getsockopt").return,
  kernel.function("sys_gettid").return,
  kernel.function("sys_gettimeofday").return,
  kernel.function("sys_getuid16").return,
  kernel.function("sys_getuid").return,
  kernel.function("sys_getxattr").return,
  kernel.function("sys_init_module").return,
  kernel.function("sys_inotify_add_watch").return,
  kernel.function("sys_inotify_init").return,
  kernel.function("sys_inotify_rm_watch").return,
  kernel.function("sys_io_cancel").return,
  kernel.function("sys_io_destroy").return,
  kernel.function("sys_io_getevents").return,
  kernel.function("sys_io_setup").return,
  kernel.function("sys_io_submit").return,
  kernel.function("sys_ioctl").return,
  kernel.function("sys_ioperm").return,
  kernel.function("sys_iopl").return,
  kernel.function("sys_ioprio_get").return,
  kernel.function("sys_ioprio_set").return,
  kernel.function("sys_ipc").return,
  kernel.function("sys_kexec_load").return,
  kernel.function("sys_keyctl").return,
  kernel.function("sys_kill").return,
  kernel.function("sys_lchown16").return,
  kernel.function("sys_lchown").return,
  kernel.function("sys_lgetxattr").return,
  kernel.function("sys_link").return,
  kernel.function("sys_linkat").return,
  kernel.function("sys_listen").return,
  kernel.function("sys_listxattr").return,
  kernel.function("sys_llistxattr").return,
  kernel.function("sys_llseek").return,
  kernel.function("sys_lookup_dcookie").return,
  kernel.function("sys_lremovexattr").return,
  kernel.function("sys_lseek").return,
  kernel.function("sys_lsetxattr").return,
  kernel.function("sys_lstat64").return,
  kernel.function("sys_lstat").return,
  kernel.function("sys_madvise").return,
  kernel.function("sys_mincore").return,
  kernel.function("sys_mkdir").return,
  kernel.function("sys_mkdirat").return,
  kernel.function("sys_mknod").return,
  kernel.function("sys_mknodat").return,
  kernel.function("sys_mlock").return,
  kernel.function("sys_mlockall").return,
  kernel.function("sys_mmap2").return,
  kernel.function("sys_modify_ldt").return,
  kernel.function("sys_mount").return,
  kernel.function("sys_mprotect").return,
  kernel.function("sys_mq_getsetattr").return,
  kernel.function("sys_mq_notify").return,
  kernel.function("sys_mq_open").return,
  kernel.function("sys_mq_timedreceive").return,
  kernel.function("sys_mq_timedsend").return,
  kernel.function("sys_mq_unlink").return,
  kernel.function("sys_mremap").return,
  kernel.function("sys_msgctl").return,
  kernel.function("sys_msgget").return,
  kernel.function("sys_msgrcv").return,
  kernel.function("sys_msgsnd").return,
  kernel.function("sys_msync").return,
  kernel.function("sys_munlock").return,
  kernel.function("sys_munlockall").return,
  kernel.function("sys_munmap").return,
  kernel.function("sys_nanosleep").return,
  kernel.function("sys_newfstat").return,
  kernel.function("sys_newlstat").return,
  kernel.function("sys_newstat").return,
  kernel.function("sys_newuname").return,
  kernel.function("sys_nfsservctl").return,
  kernel.function("sys_ni_syscall").return,
  kernel.function("sys_nice").return,
  kernel.function("sys_old_getrlimit").return,
  kernel.function("sys_oldumount").return,
  kernel.function("sys_olduname").return,
  kernel.function("sys_open").return,
  kernel.function("sys_openat").return,
  kernel.function("sys_pause").return,
  kernel.function("sys_personality").return,
  kernel.function("sys_pipe").return,
  kernel.function("sys_pivot_root").return,
  kernel.function("sys_poll").return,
  kernel.function("sys_ppoll").return,
  kernel.function("sys_prctl").return,
  kernel.function("sys_pread64").return,
  kernel.function("sys_pselect6").return,
  kernel.function("sys_pselect7").return,
  kernel.function("sys_ptrace").return,
  kernel.function("sys_pwrite64").return,
  kernel.function("sys_quotactl").return,
  kernel.function("sys_read").return,
  kernel.function("sys_readahead").return,
  kernel.function("sys_readlink").return,
  kernel.function("sys_readlinkat").return,
  kernel.function("sys_readv").return,
  kernel.function("sys_reboot").return,
  kernel.function("sys_recv").return,
  kernel.function("sys_recvfrom").return,
  kernel.function("sys_recvmsg").return,
  kernel.function("sys_remap_file_pages").return,
  kernel.function("sys_removexattr").return,
  kernel.function("sys_rename").return,
  kernel.function("sys_renameat").return,
  kernel.function("sys_request_key").return,
  kernel.function("sys_restart_syscall").return,
  kernel.function("sys_rmdir").return,
  kernel.function("sys_rt_sigaction").return,
  kernel.function("sys_rt_sigpending").return,
  kernel.function("sys_rt_sigprocmask").return,
  kernel.function("sys_rt_sigqueueinfo").return,
  kernel.function("sys_rt_sigreturn").return,
  kernel.function("sys_rt_sigsuspend").return,
  kernel.function("sys_rt_sigtimedwait").return,
  kernel.function("sys_sched_get_priority_max").return,
  kernel.function("sys_sched_get_priority_min").return,
  kernel.function("sys_sched_getaffinity").return,
  kernel.function("sys_sched_getparam").return,
  kernel.function("sys_sched_getscheduler").return,
  kernel.function("sys_sched_rr_get_interval").return,
  kernel.function("sys_sched_setaffinity").return,
  kernel.function("sys_sched_setparam").return,
  kernel.function("sys_sched_setscheduler").return,
  kernel.function("sys_sched_yield").return,
  kernel.function("sys_select").return,
  kernel.function("sys_semctl").return,
  kernel.function("sys_semget").return,
  kernel.function("sys_semop").return,
  kernel.function("sys_semtimedop").return,
  kernel.function("sys_send").return,
  kernel.function("sys_sendfile64").return,
  kernel.function("sys_sendfile").return,
  kernel.function("sys_sendmsg").return,
  kernel.function("sys_sendto").return,
  kernel.function("sys_set_robust_list").return,
  kernel.function("sys_set_thread_area").return,
  kernel.function("sys_set_tid_address").return,
  kernel.function("sys_setdomainname").return,
  kernel.function("sys_setfsgid16").return,
  kernel.function("sys_setfsgid").return,
  kernel.function("sys_setfsuid16").return,
  kernel.function("sys_setfsuid").return,
  kernel.function("sys_setgid16").return,
  kernel.function("sys_setgid").return,
  kernel.function("sys_setgroups16").return,
  kernel.function("sys_setgroups").return,
  kernel.function("sys_sethostname").return,
  kernel.function("sys_setitimer").return,
  kernel.function("sys_setpgid").return,
  kernel.function("sys_setpriority").return,
  kernel.function("sys_setregid16").return,
  kernel.function("sys_setregid").return,
  kernel.function("sys_setresgid16").return,
  kernel.function("sys_setresgid").return,
  kernel.function("sys_setresuid16").return,
  kernel.function("sys_setresuid").return,
  kernel.function("sys_setreuid16").return,
  kernel.function("sys_setreuid").return,
  kernel.function("sys_setrlimit").return,
  kernel.function("sys_setsid").return,
  kernel.function("sys_setsockopt").return,
  kernel.function("sys_settimeofday").return,
  kernel.function("sys_setuid16").return,
  kernel.function("sys_setuid").return,
  kernel.function("sys_setxattr").return,
  kernel.function("sys_sgetmask").return,
  kernel.function("sys_shmat").return,
  kernel.function("sys_shmctl").return,
  kernel.function("sys_shmdt").return,
  kernel.function("sys_shmget").return,
  kernel.function("sys_shutdown").return,
  kernel.function("sys_sigaction").return,
  kernel.function("sys_sigaltstack").return,
  kernel.function("sys_signal").return,
  kernel.function("sys_sigpending").return,
  kernel.function("sys_sigprocmask").return,
  kernel.function("sys_sigreturn").return,
  kernel.function("sys_sigsuspend").return,
  kernel.function("sys_socket").return,
  kernel.function("sys_socketcall").return,
  kernel.function("sys_socketpair").return,
  kernel.function("sys_splice").return,
  kernel.function("sys_ssetmask").return,
  kernel.function("sys_stat64").return,
  kernel.function("sys_stat").return,
  kernel.function("sys_statfs64").return,
  kernel.function("sys_statfs").return,
  kernel.function("sys_stime").return,
  kernel.function("sys_swapoff").return,
  kernel.function("sys_swapon").return,
  kernel.function("sys_symlink").return,
  kernel.function("sys_symlinkat").return,
  kernel.function("sys_sync").return,
  kernel.function("sys_sync_file_range").return,
  kernel.function("sys_sysctl").return,
  kernel.function("sys_sysfs").return,
  kernel.function("sys_sysinfo").return,
  kernel.function("sys_syslog").return,
  kernel.function("sys_tee").return,
  kernel.function("sys_tgkill").return,
  kernel.function("sys_time").return,
  kernel.function("sys_timer_create").return,
  kernel.function("sys_timer_delete").return,
  kernel.function("sys_timer_getoverrun").return,
  kernel.function("sys_timer_gettime").return,
  kernel.function("sys_timer_settime").return,
  kernel.function("sys_times").return,
  kernel.function("sys_tkill").return,
  kernel.function("sys_truncate64").return,
  kernel.function("sys_truncate").return,
  kernel.function("sys_tux").return,
  kernel.function("sys_umask").return,
  kernel.function("sys_umount").return,
  kernel.function("sys_uname").return,
  kernel.function("sys_unlink").return,
  kernel.function("sys_unlinkat").return,
  kernel.function("sys_unshare").return,
  kernel.function("sys_uselib").return,
  kernel.function("sys_ustat").return,
  kernel.function("sys_utime").return,
  kernel.function("sys_utimes").return,
  kernel.function("sys_vfork").return,
  kernel.function("sys_vhangup").return,
  kernel.function("sys_vm86").return,
  kernel.function("sys_vm86old").return,
  kernel.function("sys_vmsplice").return,
  kernel.function("sys_wait4").return,
  kernel.function("sys_waitid").return,
  kernel.function("sys_waitpid").return,
  kernel.function("sys_write").return,
  kernel.function("sys_writev").return
{
	printf("%d\n", $return)
}


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-08-02 21:16           ` David Smith
@ 2006-08-03  1:54             ` Frank Ch. Eigler
  2006-08-03 14:58               ` David Smith
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Ch. Eigler @ 2006-08-03  1:54 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

Hi -

> I tried to think through this a bit, and I believe I was thinking
> too hard about this.  [...]  In addition, I realized that there are
> cases where we don't want to merge probes even if they are identical
> (but we could use your idea above of reusing the code bodies).
> [...]

As Josh said, merging should be a script-invisible phenomenon, and so
must preserve semantics.  Merging at the probe / derived_probe level
*could* involve moving probe_location values from place to place.  But
now that I think about it more, it seems like a good place for both
function and probe merging is in translate.cxx, strictly within the
emit_{probe,function} routines.

By the way, I was probably misunderstood with respect to hashing.
What we need for duplicate detection is a way of equality-comparing
two parse trees.  If the staptree::print function is deemed reliable
(and it may be a big if), then simple string comparison is good enough
- i.e., use strings as keys for the duplicate detection table.  There
is no need to manually hash the strings!  I mentioned hashing because
I didn't consider the pretty-printing functions, so an identity hash
number of a parse tree could instead be recursively built up from the
bottom.

- FChE

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-08-03  1:54             ` Frank Ch. Eigler
@ 2006-08-03 14:58               ` David Smith
  2006-08-08 19:28                 ` David Smith
  0 siblings, 1 reply; 13+ messages in thread
From: David Smith @ 2006-08-03 14:58 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Systemtap List

On Wed, 2006-08-02 at 21:54 -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> > I tried to think through this a bit, and I believe I was thinking
> > too hard about this.  [...]  In addition, I realized that there are
> > cases where we don't want to merge probes even if they are identical
> > (but we could use your idea above of reusing the code bodies).
> > [...]
> 
> As Josh said, merging should be a script-invisible phenomenon, and so
> must preserve semantics.

See my reply to Josh for the long answer to this, but the short version
is semantics should be preserved - I was wrong in my begin probe
example.

> Merging at the probe / derived_probe level
> *could* involve moving probe_location values from place to place.  But
> now that I think about it more, it seems like a good place for both
> function and probe merging is in translate.cxx, strictly within the
> emit_{probe,function} routines.

I'll look into it.

> By the way, I was probably misunderstood with respect to hashing.
> What we need for duplicate detection is a way of equality-comparing
> two parse trees.  If the staptree::print function is deemed reliable
> (and it may be a big if), then simple string comparison is good enough
> - i.e., use strings as keys for the duplicate detection table.  There
> is no need to manually hash the strings!  I mentioned hashing because
> I didn't consider the pretty-printing functions, so an identity hash
> number of a parse tree could instead be recursively built up from the
> bottom.
> 
> - FChE

Oh, the current hash stuff was just a place holder.  I realize I'm
basically doing a strcmp on function and probe bodies.

The current method will work (and could be replaced with a strcmp).  The
additional benefit of an identity hash would be that it would work in
the case where 2 functions or probes are exactly the same, except for
the name of a variable.

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: BZ#2421 - removing duplicate probe handlers
  2006-08-03 14:58               ` David Smith
@ 2006-08-08 19:28                 ` David Smith
  0 siblings, 0 replies; 13+ messages in thread
From: David Smith @ 2006-08-08 19:28 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Systemtap List

[-- Attachment #1: Type: text/plain, Size: 2166 bytes --]

On Thu, 2006-08-03 at 09:56 -0500, David Smith wrote:
> On Wed, 2006-08-02 at 21:54 -0400, Frank Ch. Eigler wrote:
> > Hi -
> > 
> > > I tried to think through this a bit, and I believe I was thinking
> > > too hard about this.  [...]  In addition, I realized that there are
> > > cases where we don't want to merge probes even if they are identical
> > > (but we could use your idea above of reusing the code bodies).
> > > [...]
> > 
> > As Josh said, merging should be a script-invisible phenomenon, and so
> > must preserve semantics.
> 
> See my reply to Josh for the long answer to this, but the short version
> is semantics should be preserved - I was wrong in my begin probe
> example.
> 
> > Merging at the probe / derived_probe level
> > *could* involve moving probe_location values from place to place.  But
> > now that I think about it more, it seems like a good place for both
> > function and probe merging is in translate.cxx, strictly within the
> > emit_{probe,function} routines.
> 
> I'll look into it.

I looked into it, and if function merging is done that late, then probes
can't be merged, since they are calling different functions.

Probe merging can be done in translate.cxx.  I've attached a patch that
demonstrates it.

Note I'm just doing string compares to compare probe bodies.  That can
be improved later if you like the basic approach.

Note that using the same test case I used before (which I've included
again) shows less space savings than doing it all in elaborate.cxx,
since the C probe auxiliary functions don't get merged using this
approach.

original stap:
  generated C file: 1612k
  generated module: 1944k

removing duplicate functions (in elaborate.cxx) only:
  generated C file: 1420k
  generated module: 1880k

removing duplicate functions and probes (both in elaborate.cxx) - the
last approach:
  generated C file:   48k
  generated module:  920k

removing duplicate functions (in elaborate.cxx) and probes (in
translate.cxx) - the current approach:
  generated C file:  976k
  generated module: 1644k

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


[-- Attachment #2: duplicate2.patch --]
[-- Type: text/x-patch, Size: 7958 bytes --]

Index: elaborate.cxx
===================================================================
RCS file: /cvs/systemtap/src/elaborate.cxx,v
retrieving revision 1.62
diff -u -p -r1.62 elaborate.cxx
--- elaborate.cxx	5 Jun 2006 19:45:56 -0000	1.62
+++ elaborate.cxx	8 Aug 2006 19:15:17 -0000
@@ -24,7 +24,7 @@ extern "C" {
 #include <vector>
 #include <algorithm>
 #include <iterator>
-
+#include <ext/hash_map>
 
 using namespace std;
 
@@ -1550,6 +1550,95 @@ void semantic_pass_opt4 (systemtap_sessi
     }
 }
 
+struct duplicate_function_remover: public functioncall_traversing_visitor
+{
+  systemtap_session& s;
+  bool& relaxed_p;
+  map<functiondecl*, functiondecl*>& duplicate_function_map;
+
+  duplicate_function_remover(systemtap_session& sess, bool& r,
+			     map<functiondecl*, functiondecl*>&dfm):
+    s(sess), relaxed_p(r), duplicate_function_map(dfm) {};
+
+  void visit_functioncall (functioncall* e);
+};
+
+void
+duplicate_function_remover::visit_functioncall (functioncall *e)
+{
+  functioncall_traversing_visitor::visit_functioncall (e);
+
+  // If the current function call reference points to a function that
+  // is a duplicate, replace it.
+  if (duplicate_function_map.count(e->referent) != 0)
+    {
+      if (s.verbose>2)
+	  clog << "Changing " << e->referent->name
+	       << " reference to "
+	       << duplicate_function_map[e->referent]->name
+	       << " reference\n";
+      e->tok = duplicate_function_map[e->referent]->tok;
+      e->function = duplicate_function_map[e->referent]->name;
+      e->referent = duplicate_function_map[e->referent];
+
+      relaxed_p = false;
+    }
+}
+
+static size_t
+hash_functiondecl (functiondecl* f)
+{
+  ostringstream s;
+  __gnu_cxx::hash<const char*> hash_func;
+
+  // Get the "name:args body" of the function in s.  We have to
+  // include the args since the function 'x1(a, b)' is different than
+  // the function 'x2(b, a)' even if the bodies of the two functions
+  // are exactly the same.
+  f->printsig(s);
+  f->body->print(s);
+
+  // printsig puts f->name + ':' on the front.  Remove this
+  // (otherwise, functions would never compare equal).
+  string str = s.str().erase(0, f->name.size() + 1);
+
+  // Actually generate the hash.
+  return hash_func(str.c_str());
+}
+
+void semantic_pass_opt5 (systemtap_session& s, bool& relaxed_p)
+{
+  // Walk through all the functions, looking for duplicates.
+  map<size_t, functiondecl*> function_hash_map;
+  map<functiondecl*, functiondecl*> duplicate_function_map;
+  for (unsigned i=0; i < s.functions.size(); i++)
+    {
+      size_t function_hash = hash_functiondecl(s.functions[i]);
+
+      if (function_hash_map.count(function_hash) == 0)
+	// This function is unique.  Remember it.
+	function_hash_map[function_hash] = s.functions[i];
+      else
+	// This function is a duplicate.
+	duplicate_function_map[s.functions[i]]
+	    = function_hash_map[function_hash];
+    }
+
+  // If we have duplicate functions, traverse down the tree, replacing
+  // the appropriate function calls.
+  // duplicate_function_remover::visit_functioncall() handles the
+  // details of replacing the function calls.  Note that we don't
+  // delete the duplicate functiondecl itself, we'll let pass 1 do
+  // that.
+  if (duplicate_function_map.size() != 0)
+    {
+      duplicate_function_remover dfr (s, relaxed_p, duplicate_function_map);
+
+      for (unsigned i=0; i < s.probes.size(); i++)
+	s.probes[i]->body->visit(&dfr);
+    }
+}
+
 
 static int
 semantic_pass_optimize (systemtap_session& s)
@@ -1571,6 +1660,7 @@ semantic_pass_optimize (systemtap_sessio
       semantic_pass_opt2 (s, relaxed_p);
       semantic_pass_opt3 (s, relaxed_p);
       semantic_pass_opt4 (s, relaxed_p);
+      semantic_pass_opt5 (s, relaxed_p);
     }
 
   if (s.probes.size() == 0)
Index: translate.cxx
===================================================================
RCS file: /cvs/systemtap/src/translate.cxx,v
retrieving revision 1.127
diff -u -p -r1.127 translate.cxx
--- translate.cxx	9 Jun 2006 09:20:03 -0000	1.127
+++ translate.cxx	8 Aug 2006 19:15:17 -0000
@@ -68,6 +68,8 @@ struct c_unparser: public unparser, publ
   unsigned tmpvar_counter;
   unsigned label_counter;
 
+  map<string, string> probe_contents;
+
   c_unparser (systemtap_session* ss):
     session (ss), o (ss->op), current_probe(0), current_function (0),
   tmpvar_counter (0), label_counter (0) {}
@@ -1270,40 +1272,67 @@ c_unparser::emit_probe (derived_probe* v
   o->newline() << "static void " << v->name << " (struct context * __restrict__ c) {";
   o->indent(1);
 
-  // initialize frame pointer
-  o->newline() << "struct " << v->name << "_locals * __restrict__ l =";
-  o->newline(1) << "& c->locals[0]." << v->name << ";";
-  o->newline(-1) << "(void) l;"; // make sure "l" is marked used
-
-  // emit all read/write locks for global variables
-  varuse_collecting_visitor vut;
-  v->body->visit (& vut);
-  emit_locks (vut);
+  // If we about to emit a probe that is exactly the same as another
+  // probe previously emitted, make the second probe just call the
+  // first one.
+  //
+  // Notice we're using the probe body itself instead of the emitted C
+  // probe body to compare probes.  We need to do this because the
+  // emitted C probe body has stuff in it like:
+  //
+  //   c->last_stmt = "identifier 'printf' at foo.stp:<line>:<column>";
+  //
+  // which would make comparisons impossible.
+  ostringstream oss;
+  v->body->print(oss);
 
-  // initialize locals
-  for (unsigned j=0; j<v->locals.size(); j++)
+  // If an identical probe has already been emitted, just call that
+  // one.
+  if (probe_contents.count(oss.str()) != 0)
     {
-      if (v->locals[j]->index_types.size() > 0) // array?
-	throw semantic_error ("array locals not supported", v->tok);
-      else if (v->locals[j]->type == pe_long)
-	o->newline() << "l->" << c_varname (v->locals[j]->name)
-		     << " = 0;";
-      else if (v->locals[j]->type == pe_string)
-	o->newline() << "l->" << c_varname (v->locals[j]->name)
-		     << "[0] = '\\0';";
-      else
-	throw semantic_error ("unsupported local variable type",
-			      v->locals[j]->tok);
+	o->newline() << probe_contents[oss.str()] << " (c);";
     }
+  // This probe is unique.  Remember it and output it.
+  else
+    {
+      probe_contents[oss.str()] = v->name;
+
+      // initialize frame pointer
+      o->newline() << "struct " << v->name << "_locals * __restrict__ l =";
+      o->newline(1) << "& c->locals[0]." << v->name << ";";
+      o->newline(-1) << "(void) l;"; // make sure "l" is marked used
+
+      // emit all read/write locks for global variables
+      varuse_collecting_visitor vut;
+      v->body->visit (& vut);
+      emit_locks (vut);
+
+      // initialize locals
+      for (unsigned j=0; j<v->locals.size(); j++)
+        {
+	  if (v->locals[j]->index_types.size() > 0) // array?
+	    throw semantic_error ("array locals not supported", v->tok);
+	  else if (v->locals[j]->type == pe_long)
+	    o->newline() << "l->" << c_varname (v->locals[j]->name)
+			 << " = 0;";
+	  else if (v->locals[j]->type == pe_string)
+	    o->newline() << "l->" << c_varname (v->locals[j]->name)
+			 << "[0] = '\\0';";
+	  else
+	    throw semantic_error ("unsupported local variable type",
+				  v->locals[j]->tok);
+        }
   
-  v->body->visit (this);
+      v->body->visit (this);
 
-  o->newline(-1) << "out:";
-  // NB: no need to uninitialize locals, except if arrays can somedays be local
-  // XXX: do this flush only if the body included a print/printf/etc. routine!
-  o->newline(1) << "_stp_print_flush();";
+      o->newline(-1) << "out:";
+      // NB: no need to uninitialize locals, except if arrays can
+      // somedays be local XXX: do this flush only if the body
+      // included a print/printf/etc. routine!
+      o->newline(1) << "_stp_print_flush();";
 
-  emit_unlocks (vut);
+      emit_unlocks (vut);
+    }
 
   o->newline(-1) << "}\n";
   

[-- Attachment #3: dup_function.stp --]
[-- Type: text/plain, Size: 13584 bytes --]

probe kernel.function("sys_accept").return,
  kernel.function("sys_access").return,
  kernel.function("sys_acct").return,
  kernel.function("sys_add_key").return,
  kernel.function("sys_adjtimex").return,
  kernel.function("sys_alarm").return,
  kernel.function("sys_bdflush").return,
  kernel.function("sys_bind").return,
  kernel.function("sys_brk").return,
  kernel.function("sys_capget").return,
  kernel.function("sys_capset").return,
  kernel.function("sys_chdir").return,
  kernel.function("sys_chmod").return,
  kernel.function("sys_chown16").return,
  kernel.function("sys_chown").return,
  kernel.function("sys_chroot").return,
  kernel.function("sys_clock_getres").return,
  kernel.function("sys_clock_gettime").return,
  kernel.function("sys_clock_nanosleep").return,
  kernel.function("sys_clock_settime").return,
  kernel.function("sys_clone").return,
  kernel.function("sys_close").return,
  kernel.function("sys_connect").return,
  kernel.function("sys_creat").return,
  kernel.function("sys_delete_module").return,
  kernel.function("sys_dup2").return,
  kernel.function("sys_dup").return,
  kernel.function("sys_epoll_create").return,
  kernel.function("sys_epoll_ctl").return,
  kernel.function("sys_epoll_wait").return,
  kernel.function("sys_execve").return,
  kernel.function("sys_faccessat").return,
  kernel.function("sys_fadvise64").return,
  kernel.function("sys_fadvise64_64").return,
  kernel.function("sys_fchdir").return,
  kernel.function("sys_fchmod").return,
  kernel.function("sys_fchmodat").return,
  kernel.function("sys_fchown16").return,
  kernel.function("sys_fchown").return,
  kernel.function("sys_fchownat").return,
  kernel.function("sys_fcntl64").return,
  kernel.function("sys_fcntl").return,
  kernel.function("sys_fdatasync").return,
  kernel.function("sys_fgetxattr").return,
  kernel.function("sys_flistxattr").return,
  kernel.function("sys_flock").return,
  kernel.function("sys_fork").return,
  kernel.function("sys_fremovexattr").return,
  kernel.function("sys_fsetxattr").return,
  kernel.function("sys_fstat64").return,
  kernel.function("sys_fstat").return,
  kernel.function("sys_fstatat64").return,
  kernel.function("sys_fstatfs64").return,
  kernel.function("sys_fstatfs").return,
  kernel.function("sys_fsync").return,
  kernel.function("sys_ftruncate64").return,
  kernel.function("sys_ftruncate").return,
  kernel.function("sys_futex").return,
  kernel.function("sys_futimesat").return,
  kernel.function("sys_get_robust_list").return,
  kernel.function("sys_get_thread_area").return,
  kernel.function("sys_getcwd").return,
  kernel.function("sys_getdents64").return,
  kernel.function("sys_getdents").return,
  kernel.function("sys_getegid16").return,
  kernel.function("sys_getegid").return,
  kernel.function("sys_geteuid16").return,
  kernel.function("sys_geteuid").return,
  kernel.function("sys_getgid16").return,
  kernel.function("sys_getgid").return,
  kernel.function("sys_getgroups16").return,
  kernel.function("sys_getgroups").return,
  kernel.function("sys_gethostname").return,
  kernel.function("sys_getitimer").return,
  kernel.function("sys_getpeername").return,
  kernel.function("sys_getpgid").return,
  kernel.function("sys_getpgrp").return,
  kernel.function("sys_getpid").return,
  kernel.function("sys_getppid").return,
  kernel.function("sys_getpriority").return,
  kernel.function("sys_getresgid16").return,
  kernel.function("sys_getresgid").return,
  kernel.function("sys_getresuid16").return,
  kernel.function("sys_getresuid").return,
  kernel.function("sys_getrlimit").return,
  kernel.function("sys_getrusage").return,
  kernel.function("sys_getsid").return,
  kernel.function("sys_getsockname").return,
  kernel.function("sys_getsockopt").return,
  kernel.function("sys_gettid").return,
  kernel.function("sys_gettimeofday").return,
  kernel.function("sys_getuid16").return,
  kernel.function("sys_getuid").return,
  kernel.function("sys_getxattr").return,
  kernel.function("sys_init_module").return,
  kernel.function("sys_inotify_add_watch").return,
  kernel.function("sys_inotify_init").return,
  kernel.function("sys_inotify_rm_watch").return,
  kernel.function("sys_io_cancel").return,
  kernel.function("sys_io_destroy").return,
  kernel.function("sys_io_getevents").return,
  kernel.function("sys_io_setup").return,
  kernel.function("sys_io_submit").return,
  kernel.function("sys_ioctl").return,
  kernel.function("sys_ioperm").return,
  kernel.function("sys_iopl").return,
  kernel.function("sys_ioprio_get").return,
  kernel.function("sys_ioprio_set").return,
  kernel.function("sys_ipc").return,
  kernel.function("sys_kexec_load").return,
  kernel.function("sys_keyctl").return,
  kernel.function("sys_kill").return,
  kernel.function("sys_lchown16").return,
  kernel.function("sys_lchown").return,
  kernel.function("sys_lgetxattr").return,
  kernel.function("sys_link").return,
  kernel.function("sys_linkat").return,
  kernel.function("sys_listen").return,
  kernel.function("sys_listxattr").return,
  kernel.function("sys_llistxattr").return,
  kernel.function("sys_llseek").return,
  kernel.function("sys_lookup_dcookie").return,
  kernel.function("sys_lremovexattr").return,
  kernel.function("sys_lseek").return,
  kernel.function("sys_lsetxattr").return,
  kernel.function("sys_lstat64").return,
  kernel.function("sys_lstat").return,
  kernel.function("sys_madvise").return,
  kernel.function("sys_mincore").return,
  kernel.function("sys_mkdir").return,
  kernel.function("sys_mkdirat").return,
  kernel.function("sys_mknod").return,
  kernel.function("sys_mknodat").return,
  kernel.function("sys_mlock").return,
  kernel.function("sys_mlockall").return,
  kernel.function("sys_mmap2").return,
  kernel.function("sys_modify_ldt").return,
  kernel.function("sys_mount").return,
  kernel.function("sys_mprotect").return,
  kernel.function("sys_mq_getsetattr").return,
  kernel.function("sys_mq_notify").return,
  kernel.function("sys_mq_open").return,
  kernel.function("sys_mq_timedreceive").return,
  kernel.function("sys_mq_timedsend").return,
  kernel.function("sys_mq_unlink").return,
  kernel.function("sys_mremap").return,
  kernel.function("sys_msgctl").return,
  kernel.function("sys_msgget").return,
  kernel.function("sys_msgrcv").return,
  kernel.function("sys_msgsnd").return,
  kernel.function("sys_msync").return,
  kernel.function("sys_munlock").return,
  kernel.function("sys_munlockall").return,
  kernel.function("sys_munmap").return,
  kernel.function("sys_nanosleep").return,
  kernel.function("sys_newfstat").return,
  kernel.function("sys_newlstat").return,
  kernel.function("sys_newstat").return,
  kernel.function("sys_newuname").return,
  kernel.function("sys_nfsservctl").return,
  kernel.function("sys_ni_syscall").return,
  kernel.function("sys_nice").return,
  kernel.function("sys_old_getrlimit").return,
  kernel.function("sys_oldumount").return,
  kernel.function("sys_olduname").return,
  kernel.function("sys_open").return,
  kernel.function("sys_openat").return,
  kernel.function("sys_pause").return,
  kernel.function("sys_personality").return,
  kernel.function("sys_pipe").return,
  kernel.function("sys_pivot_root").return,
  kernel.function("sys_poll").return,
  kernel.function("sys_ppoll").return,
  kernel.function("sys_prctl").return,
  kernel.function("sys_pread64").return,
  kernel.function("sys_pselect6").return,
  kernel.function("sys_pselect7").return,
  kernel.function("sys_ptrace").return,
  kernel.function("sys_pwrite64").return,
  kernel.function("sys_quotactl").return,
  kernel.function("sys_read").return,
  kernel.function("sys_readahead").return,
  kernel.function("sys_readlink").return,
  kernel.function("sys_readlinkat").return,
  kernel.function("sys_readv").return,
  kernel.function("sys_reboot").return,
  kernel.function("sys_recv").return,
  kernel.function("sys_recvfrom").return,
  kernel.function("sys_recvmsg").return,
  kernel.function("sys_remap_file_pages").return,
  kernel.function("sys_removexattr").return,
  kernel.function("sys_rename").return,
  kernel.function("sys_renameat").return,
  kernel.function("sys_request_key").return,
  kernel.function("sys_restart_syscall").return,
  kernel.function("sys_rmdir").return,
  kernel.function("sys_rt_sigaction").return,
  kernel.function("sys_rt_sigpending").return,
  kernel.function("sys_rt_sigprocmask").return,
  kernel.function("sys_rt_sigqueueinfo").return,
  kernel.function("sys_rt_sigreturn").return,
  kernel.function("sys_rt_sigsuspend").return,
  kernel.function("sys_rt_sigtimedwait").return,
  kernel.function("sys_sched_get_priority_max").return,
  kernel.function("sys_sched_get_priority_min").return,
  kernel.function("sys_sched_getaffinity").return,
  kernel.function("sys_sched_getparam").return,
  kernel.function("sys_sched_getscheduler").return,
  kernel.function("sys_sched_rr_get_interval").return,
  kernel.function("sys_sched_setaffinity").return,
  kernel.function("sys_sched_setparam").return,
  kernel.function("sys_sched_setscheduler").return,
  kernel.function("sys_sched_yield").return,
  kernel.function("sys_select").return,
  kernel.function("sys_semctl").return,
  kernel.function("sys_semget").return,
  kernel.function("sys_semop").return,
  kernel.function("sys_semtimedop").return,
  kernel.function("sys_send").return,
  kernel.function("sys_sendfile64").return,
  kernel.function("sys_sendfile").return,
  kernel.function("sys_sendmsg").return,
  kernel.function("sys_sendto").return,
  kernel.function("sys_set_robust_list").return,
  kernel.function("sys_set_thread_area").return,
  kernel.function("sys_set_tid_address").return,
  kernel.function("sys_setdomainname").return,
  kernel.function("sys_setfsgid16").return,
  kernel.function("sys_setfsgid").return,
  kernel.function("sys_setfsuid16").return,
  kernel.function("sys_setfsuid").return,
  kernel.function("sys_setgid16").return,
  kernel.function("sys_setgid").return,
  kernel.function("sys_setgroups16").return,
  kernel.function("sys_setgroups").return,
  kernel.function("sys_sethostname").return,
  kernel.function("sys_setitimer").return,
  kernel.function("sys_setpgid").return,
  kernel.function("sys_setpriority").return,
  kernel.function("sys_setregid16").return,
  kernel.function("sys_setregid").return,
  kernel.function("sys_setresgid16").return,
  kernel.function("sys_setresgid").return,
  kernel.function("sys_setresuid16").return,
  kernel.function("sys_setresuid").return,
  kernel.function("sys_setreuid16").return,
  kernel.function("sys_setreuid").return,
  kernel.function("sys_setrlimit").return,
  kernel.function("sys_setsid").return,
  kernel.function("sys_setsockopt").return,
  kernel.function("sys_settimeofday").return,
  kernel.function("sys_setuid16").return,
  kernel.function("sys_setuid").return,
  kernel.function("sys_setxattr").return,
  kernel.function("sys_sgetmask").return,
  kernel.function("sys_shmat").return,
  kernel.function("sys_shmctl").return,
  kernel.function("sys_shmdt").return,
  kernel.function("sys_shmget").return,
  kernel.function("sys_shutdown").return,
  kernel.function("sys_sigaction").return,
  kernel.function("sys_sigaltstack").return,
  kernel.function("sys_signal").return,
  kernel.function("sys_sigpending").return,
  kernel.function("sys_sigprocmask").return,
  kernel.function("sys_sigreturn").return,
  kernel.function("sys_sigsuspend").return,
  kernel.function("sys_socket").return,
  kernel.function("sys_socketcall").return,
  kernel.function("sys_socketpair").return,
  kernel.function("sys_splice").return,
  kernel.function("sys_ssetmask").return,
  kernel.function("sys_stat64").return,
  kernel.function("sys_stat").return,
  kernel.function("sys_statfs64").return,
  kernel.function("sys_statfs").return,
  kernel.function("sys_stime").return,
  kernel.function("sys_swapoff").return,
  kernel.function("sys_swapon").return,
  kernel.function("sys_symlink").return,
  kernel.function("sys_symlinkat").return,
  kernel.function("sys_sync").return,
  kernel.function("sys_sync_file_range").return,
  kernel.function("sys_sysctl").return,
  kernel.function("sys_sysfs").return,
  kernel.function("sys_sysinfo").return,
  kernel.function("sys_syslog").return,
  kernel.function("sys_tee").return,
  kernel.function("sys_tgkill").return,
  kernel.function("sys_time").return,
  kernel.function("sys_timer_create").return,
  kernel.function("sys_timer_delete").return,
  kernel.function("sys_timer_getoverrun").return,
  kernel.function("sys_timer_gettime").return,
  kernel.function("sys_timer_settime").return,
  kernel.function("sys_times").return,
  kernel.function("sys_tkill").return,
  kernel.function("sys_truncate64").return,
  kernel.function("sys_truncate").return,
  kernel.function("sys_tux").return,
  kernel.function("sys_umask").return,
  kernel.function("sys_umount").return,
  kernel.function("sys_uname").return,
  kernel.function("sys_unlink").return,
  kernel.function("sys_unlinkat").return,
  kernel.function("sys_unshare").return,
  kernel.function("sys_uselib").return,
  kernel.function("sys_ustat").return,
  kernel.function("sys_utime").return,
  kernel.function("sys_utimes").return,
  kernel.function("sys_vfork").return,
  kernel.function("sys_vhangup").return,
  kernel.function("sys_vm86").return,
  kernel.function("sys_vm86old").return,
  kernel.function("sys_vmsplice").return,
  kernel.function("sys_wait4").return,
  kernel.function("sys_waitid").return,
  kernel.function("sys_waitpid").return,
  kernel.function("sys_write").return,
  kernel.function("sys_writev").return
{
	printf("%d\n", $return)
}


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: BZ#2421 - removing duplicate probe handlers
  2006-08-02 23:07 Stone, Joshua I
@ 2006-08-03 14:24 ` David Smith
  0 siblings, 0 replies; 13+ messages in thread
From: David Smith @ 2006-08-03 14:24 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: Frank Ch. Eigler, Systemtap List

Josh,

See stuff below.

On Wed, 2006-08-02 at 16:07 -0700, Stone, Joshua I wrote:
> On Wednesday, August 02, 2006 2:14 PM, David Smith wrote:
> > On Tue, 2006-08-01 at 18:49 -0400, Frank Ch. Eigler wrote:
> > The bugzilla references the case we should be optimizing for -
> > something like this:
> > 
> > probe kernel.function("sys_read"), kernel.function("sys_write")
> > {
> >         log("here")
> > }

> > In addition, I realized that there are cases where we don't want to
> > merge probes even if they are identical (but we could use your idea
> > above of reusing the code bodies).  Such as this:
> > 
> > probe begin { log("here") }
> > probe begin { log("here") }
> > 
> > If we merge those 2 probes into 1, we're only going to get 1 output
> > line instead of 2.

Here's where I confused you.  That above "begin" example is wrong.  I
believe it actually would work (assuming the code I sent merged
begin/end probes).

> I'm a little worried about how you're merging -- this 'begin' example
> should generate a single probe body which is "triggered" twice at the
> beginning of execution.  If that creates only one output, then you're
> breaking semantics.
> 
> Consider a similar kprobe case:
> 
> probe kernel.function("sys_read"), kernel.function("sys_read")
> {
> 	log("here")
> }
> 
> How many times will this log for each sys_read?  The correct answer
> should be two...
> 
> Josh

With the code I sent to the list, the above example works correctly (I
changed sys_read to sys_execve to lower the message frequency).

When I speak of merging, here's how it works.  With cvs systemtap and
your example, you'd end up with 2 probes at 1 location each (that
happened to be the same location) for a total of 2 locations.  With my
changes, the new optimization pass will notice that the 2 probe bodies
are exactly the same, so it will combine the location lists, then delete
the extra probe body.  So, we'll end up with 1 probe at 2 locations
(that happen to be the same).

Semantics shouldn't be broken.

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: BZ#2421 - removing duplicate probe handlers
@ 2006-08-02 23:07 Stone, Joshua I
  2006-08-03 14:24 ` David Smith
  0 siblings, 1 reply; 13+ messages in thread
From: Stone, Joshua I @ 2006-08-02 23:07 UTC (permalink / raw)
  To: David Smith, Frank Ch. Eigler; +Cc: Systemtap List

On Wednesday, August 02, 2006 2:14 PM, David Smith wrote:
> On Tue, 2006-08-01 at 18:49 -0400, Frank Ch. Eigler wrote:
> The bugzilla references the case we should be optimizing for -
> something like this:
> 
> probe kernel.function("sys_read"), kernel.function("sys_write")
> {
>         log("here")
> }
> 
> In addition, I realized that there are cases where we don't want to
> merge probes even if they are identical (but we could use your idea
> above of reusing the code bodies).  Such as this:
> 
> probe begin { log("here") }
> probe begin { log("here") }
> 
> If we merge those 2 probes into 1, we're only going to get 1 output
> line instead of 2.

I'm a little worried about how you're merging -- this 'begin' example
should generate a single probe body which is "triggered" twice at the
beginning of execution.  If that creates only one output, then you're
breaking semantics.

Consider a similar kprobe case:

probe kernel.function("sys_read"), kernel.function("sys_read")
{
	log("here")
}

How many times will this log for each sys_read?  The correct answer
should be two...

Josh

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2006-08-08 19:28 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1153430594.9668.57.camel@dhcp-2.hsv.redhat.com>
     [not found] ` <1153494928.11557.24.camel@dhcp-2.hsv.redhat.com>
2006-07-21 16:01   ` BZ#2421 - removing duplicate probe handlers Frank Ch. Eigler
2006-07-21 19:15     ` David Smith
2006-08-01 18:39       ` David Smith
2006-08-01 22:49         ` Frank Ch. Eigler
2006-08-02 21:16           ` David Smith
2006-08-03  1:54             ` Frank Ch. Eigler
2006-08-03 14:58               ` David Smith
2006-08-08 19:28                 ` David Smith
2006-08-01 22:53         ` Frank Ch. Eigler
2006-08-02  8:03         ` Li Guanglei
2006-08-02 13:19           ` David Smith
2006-08-02 23:07 Stone, Joshua I
2006-08-03 14:24 ` David Smith

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).