precompiled probing scenarios

public inbox for systemtap@sourceware.org

* precompiled probing scenarios
@ 2006-10-06 19:08 Frank Ch. Eigler
  2006-10-06 20:33 ` David Smith
  0 siblings, 1 reply; 19+ messages in thread
From: Frank Ch. Eigler @ 2006-10-06 19:08 UTC
  To: systemtap

Hi -

Here are some ideas about pre-compiled probes.  It all tries to bring
together some concepts we've kicked around about caching,
pre-compilation, remote deployment, run-time probe parametrization,
and the stap/staprun split.

One part of the plan would be a cache of compiled probe modules.
Unless disabled, the translator would compute a hash of the probe
script and environmental parameters (e.g.  sess.kernel_release,
architecture, invoking userid).  This hash value would be used to
identify the module in a persistent cache
($HOME/.systemtap/cache/HASH.ko just like ccache), so that the same
script for the same machine would map to the same module name.  (A
person deliberately running the same script twice concurrently would
have to "salt" the cache/hash for the duplicates.)  A "stap -p4" run
would print the cached module's file name.
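
A minimal sketch of the keying idea above, not a definitive
implementation.  The names probe_env, cache_key and cache_module_path
are hypothetical, and std::hash merely stands in for a real digest: it
is not guaranteed to be stable across library implementations, so a
real cache would need a fixed algorithm.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical bundle of the inputs that should feed the hash.
struct probe_env {
  std::string kernel_release;  // e.g. sess.kernel_release
  std::string architecture;
  std::string userid;          // invoking user
  std::string salt;            // empty unless a distinct duplicate is wanted
};

// Stand-in digest (assumption): std::hash is not cryptographic and not
// portable across implementations.
std::string cache_key(const std::string& script, const probe_env& env)
{
  // Join the fields with NUL separators so that ("ab","c") and
  // ("a","bc") do not produce the same input blob.
  std::string blob = script + '\0' + env.kernel_release + '\0' +
                     env.architecture + '\0' + env.userid + '\0' + env.salt;
  std::ostringstream o;
  o << std::hex << std::hash<std::string>{}(blob);
  return o.str();
}

// Map a script plus its environment to a module file in the cache.
std::string cache_module_path(const std::string& home,
                              const std::string& script,
                              const probe_env& env)
{
  assert(!home.empty());
  return home + "/.systemtap/cache/" + cache_key(script, env) + ".ko";
}
```

Calling cache_module_path twice with the same script and environment
yields the same file name; a non-empty salt feeds different input into
the digest, which is how a deliberate duplicate would get its own name.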

To run a compiled probe, one would normally use "staprun HASH.ko".
Since we already support parametrization (extra module parameters can
initialize string/number global script variables), staprun should be
extended to pass those on to insmod.  (By the way, as we discussed,
staprun will be extended to send a variant of /proc/modules to the
module, so it can relocate its module probes.)

Cross-compilation could come in by letting users specify a target name
for probing.  This name would be mapped to a kernel version,
architecture, cpu type, and maybe a build-root, all via a
configuration file $HOME/.systemtap/known_hosts.  (A host may have
multiple target names, for example for different kernel versions.)
The translator might update its own host's details to that file
automatically, so a network-wide file could be grown by running the
translator once on each client machine and concatenating the files.
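
A rough sketch of reading one line of such a file, assuming the
"NAME key=value ..." layout used in the examples later in this
message.  The names target_entry and parse_target_line are
hypothetical, and nothing here validates which keys are allowed.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical parsed form of one known_hosts line.
struct target_entry {
  std::string name;                             // first token, e.g. "TARGET"
  std::map<std::string, std::string> settings;  // key=value pairs that follow
};

// Parse "NAME key=value key=value ...".  Returns false for a blank line
// or for any token that lacks a non-empty key before '='.  On failure,
// `out` is left untouched.
bool parse_target_line(const std::string& line, target_entry& out)
{
  std::istringstream in(line);
  target_entry e;
  if (!(in >> e.name))
    return false;

  std::string tok;
  while (in >> tok)
    {
      std::string::size_type eq = tok.find('=');
      if (eq == std::string::npos || eq == 0)
        return false;
      e.settings[tok.substr(0, eq)] = tok.substr(eq + 1);
    }

  out = e;
  return true;
}
```

Because the name is just the first token, a host with several kernels
would simply appear on several lines under different target names.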

Once such a cross-compiled module is built, it could be saved in the
local cache (-p4), and for -p5 even shipped to a named remote machine
via scp and executed there via ssh/staprun.  (Probing several remote
machines concurrently could be easily accommodated.)

We've mentioned somehow securely identifying compiled modules to
represent a special permission to execute.  This would be a way of
having a security expert dude formally designate a module for use on a
locked-down deployment machine.  Given that the modsign code in
FC/RHEL is not widespread or general enough, a proper kernel-enforced
crypto signature may be out of reach.  Maybe we can list (say) md5sums
of approved module .ko's in a /etc/systemtap/authorized_probes file,
and have a new staprun.auth variant that checks it before submitting a
module to insmod(8) (or actually better, to sys_init_module(2)
directly).

So on to some examples:

% stap -p4 -e "probe foo { ... }"
/home/fche/.systemtap/cache/0xdeadbeef.ko
% stap -p4 -e "probe foo { ... }"
/home/fche/.systemtap/cache/0xdeadbeef.ko  # instant: already cached
% staprun /home/fche/.systemtap/cache/0xdeadbeef.ko global1=value global2=value
(sudo password if needed)
(probe output)
^C

# grep TARGET $HOME/.systemtap/known_hosts
TARGET execute=ssh:user@host.name:sudo kernel=2.6.18-78234.327 arch=i686 cpu=p4
% stap -T TARGET -p4 -e "probe foo { ... }"
/home/fche/.systemtap/cache/0xfeedface.ko  # precompile
% stap -T TARGET -e "probe foo { ... }"
(scp 0xfeedface.ko to host.name:/tmp)
(ssh host.name staprun /tmp/0xfeedface.ko)
(sudo password if needed)
(probe output)

# grep TARGET2 $HOME/.systemtap/known_hosts
TARGET2 execute=ssh:user@host.name:auth kernel=2.6.18-78234.327 arch=i686 cpu=p4
% md5sum /home/fche/.systemtap/cache/0xfeedface.ko
982734982739487239487234
% ssh root@host.name 'echo md5:982734982739487239487234 >> /etc/systemtap/authorized_probes'  # bless this module
% stap -T TARGET2 -e "probe foo { ... }" -x CMD
(scp 0xfeedface.ko to user@host.name:/tmp)
(ssh user@host.name staprun.auth /tmp/0xfeedface.ko -x CMD)
(no sudo password needed!)
(CMD forked under real-uid privileges; module loaded under setuid)
(probe output)

Is this plausible enough to build?

- FChE


* Re: precompiled probing scenarios
  2006-10-06 19:08 precompiled probing scenarios Frank Ch. Eigler
@ 2006-10-06 20:33 ` David Smith
  2006-10-06 20:40   ` Frank Ch. Eigler
  0 siblings, 1 reply; 19+ messages in thread
From: David Smith @ 2006-10-06 20:33 UTC
  To: Frank Ch. Eigler; +Cc: systemtap

Frank Ch. Eigler wrote:
> Hi -
> 
> Here are some ideas about pre-compiled probes.  It all tries to bring
> together some concepts we've kicked around about caching,
> pre-compilation, remote deployment, run-time probe parametrization,
> and the stap/staprun split.

This is certainly ambitious.

> One part of the plan would be a cache of compiled probe modules.
> Unless disabled, the translator would compute a hash of the probe
> script and environmental parameters (e.g.  sess.kernel_release,
> architecture, invoking userid).  This hash value would be used to
> identify the module in a persistent cache
> ($HOME/.systemtap/cache/HASH.ko just like ccache), so that the same
> script for the same machine would map to the same module name.  (A
> person deliberately running the same script twice concurrently would
> have to "salt" the cache/hash for the duplicates.)  A "stap -p4" run
> would print the cached module's file name.

Hmm.  Are we hashing the input script?  If so, how does this work with
probe wildcards?  For example, let's say I probe 'kernel.function("*")'.
We compile and cache this module.  I then plug in a bunch of
additional hardware, which causes several extra modules to be loaded.  I
then run stap again with the exact same input script.  If the cached
module gets run, the functions in the new modules won't be probed.

It seems to me that the cached version of a script should probe the 
exact same functions as a newly compiled version would.

> To run a compiled probe, one would normally use "staprun HASH.ko".
> Since we already support parametrization (extra module parameters can
> initialize string/number global script variables), staprun should be
> extended to pass those on to insmod.  (By the way, as we discussed,
> staprun will be extended to send a variant of /proc/modules to the
> module, so it can relocate its module probes.)

A minor point here.  Currently, staprun expects to be run as root, so if 
you aren't root, you've got to run "sudo staprun HASH.ko".

> Cross-compilation could come in by letting users specify a target name
> for probing.  This name would be mapped to a kernel version,
> architecture, cpu type, and maybe a build-root, all via a
> configuration file $HOME/.systemtap/known_hosts.  (A host may have
> multiple target names, for example for different kernel versions.)
> The translator might update its own host's details to that file
> automatically, so a network-wide file could be grown by running the
> translator once on each client machine and concatenating the files.

Wow.  Different kernel versions on the same arch/cpu are currently
supported.  Doing different arch/cpu types is going to be
difficult.  There has been a big discussion on 
fedora-devel-list@redhat.com fairly recently about cross-compilation (I 
can try to dig up links if needed), but I don't recollect if there was a 
concrete plan developed.  We've got it a bit easy in that the kernel is 
self-contained and doesn't need a C library, etc.

> Once such a cross-compiled module is built, it could be saved in the
> local cache (-p4), and for -p5 even shipped to a named remote machine
> via scp and executed there via ssh/staprun.  (Probing several remote
> machines concurrently could be easily accommodated.)
> 
> We've mentioned somehow securely identifying compiled modules to
> represent a special permission to execute.  This would be a way of
> having a security expert dude formally designate a module for use on a
> locked-down deployment machine.  Given that the modsign code in
> FC/RHEL is not widespread or general enough, a proper kernel-enforced
> crypto signature may be out of reach.  Maybe we can list (say) md5sums
> of approved module .ko's in a /etc/systemtap/authorized_probes file,
> and have a new staprun.auth variant that checks it before submitting a
> module to insmod(8) (or actually better, to sys_init_module(2)
> directly).

Hmm.  Is this new staprun.auth variant a setuid program or does the user 
still need sudo privileges?

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


* Re: precompiled probing scenarios
  2006-10-06 20:33 ` David Smith
@ 2006-10-06 20:40   ` Frank Ch. Eigler
  2006-10-19 19:49     ` David Smith
  0 siblings, 1 reply; 19+ messages in thread
From: Frank Ch. Eigler @ 2006-10-06 20:40 UTC
  To: David Smith; +Cc: systemtap

Hi -

On Fri, Oct 06, 2006 at 03:33:19PM -0500, David Smith wrote:
> [...]
> Hmm.  Are we hashing the input script?  If so, how does this work with 
> probe wildcards?  For example, let's say I probe "kernel.function("*")". 
> We compile and cache this module.  I then plug in a bunch of 
> additional hardware, which causes several extra modules to be loaded.  I 
> then run stap again with the exact same input script.  [...]

kernel.function("*") should match exactly what was there before.
Probes on module("*").FOO would be redefined to mean something like
"all modules that we know at translation time that *might* exist, that
also happen to be *loaded* at run time".  This aspect of wildcard
expansion would thus take place at run time rather than translate
time.  It just so happens that the same module might probe a greater
or lesser number of modules on an actual system.  With that proviso,
a script-source-level hash still seems to work.
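
The redefined meaning of module("*") amounts to a set intersection,
which can be sketched as below.  The function name modules_to_probe is
hypothetical, and how the two module lists would actually be obtained
is not addressed.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Modules actually probed = known at translation time AND loaded at
// run time.  Modules loaded at run time but unknown to the translator
// are skipped, as are known modules that are not currently loaded.
std::set<std::string>
modules_to_probe(const std::set<std::string>& known_at_translation,
                 const std::set<std::string>& loaded_at_runtime)
{
  std::set<std::string> result;
  std::set_intersection(known_at_translation.begin(),
                        known_at_translation.end(),
                        loaded_at_runtime.begin(),
                        loaded_at_runtime.end(),
                        std::inserter(result, result.begin()));
  return result;
}
```

The compiled module stays the same in every case; only the result of
this run-time step differs from system to system, which is why the
source-level hash remains usable.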

> Wow.  Different kernel versions on the same arch/cpu are currently
> supported.  Doing different arch/cpu types is going to be
> difficult.  [...]

Yeah, I figure proper cross-architecture compilation would come later.

> [...]  Hmm.  Is this new staprun.auth variant a setuid program or
> does the user still need sudo privileges?

It could be setuid.

- FChE


* Re: precompiled probing scenarios
  2006-10-06 20:40   ` Frank Ch. Eigler
@ 2006-10-19 19:49     ` David Smith
  2006-10-19 21:53       ` Frank Ch. Eigler
  0 siblings, 1 reply; 19+ messages in thread
From: David Smith @ 2006-10-19 19:49 UTC
  To: Frank Ch. Eigler; +Cc: systemtap

[-- Attachment #1: Type: text/plain, Size: 2972 bytes --]

Frank Ch. Eigler wrote:
> Hi -
> 
> On Fri, Oct 06, 2006 at 03:33:19PM -0500, David Smith wrote:
>> [...]
>> Hmm.  Are we hashing the input script?  If so, how does this work with 
>> probe wildcards?  For example, let's say I probe "kernel.function("*")". 
>> We compile and cache this module.  I then plug in a bunch of 
>> additional hardware, which causes several extra modules to be loaded.  I 
>> then run stap again with the exact same input script.  [...]
> 
> kernel.function("*") should match exactly what was there before.
> Probes on module("*").FOO would be redefined to mean something like
> "all modules that we know at translation time that *might* exist, that
> also happen to be *loaded* at run time".  This aspect of wildcard
> expansion would thus take place at run time rather than translate
> time.  It just so happens that the same module might probe a greater
> or lesser number of modules on an actual system.  With that proviso,
> a script-source-level hash still seems to work.

Here's a patch that implements a cache for systemtap.  It works somewhat 
like ccache (which is unsurprising since I based it on ccache), but 
simpler.  (Note that if you want to apply the patch and compile it, 
you'll need to regenerate Makefile.in.)

Here's an outline of what happens.

By default, the cache is stored under ~/.stap_cache.  This can be 
overridden using the new 'SYSTEMTAP_CACHE' environment variable.

At the end of pass 2, a hash is generated (in find_hash(), which is in 
hash.cxx:70).  The hash is computed using the following data:
- kernel version and arch
- any user-specified macros
- runtime path
- gcc's path, size, and mtime
- stap's version and compile date
- pass 2 script output
The hash is then used to compute a hash pathname.
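
The pathname layout can be sketched as follows.  This is a simplified
illustration modelled on find_hash() in the attached patch; the
function name hash_pathname is hypothetical, and directory creation
and module-name truncation are left out.

```cpp
#include <cassert>
#include <string>

// Build "<cachedir>/<d0>/<d1>/.../stap_<digest>.ko", using the first
// `nlevels` characters of the digest as nested directory names.  The
// nesting keeps any single directory from growing very large.
std::string hash_pathname(const std::string& cachedir,
                          const std::string& digest,
                          unsigned nlevels)
{
  assert(nlevels <= digest.size());
  std::string path = cachedir;
  for (unsigned i = 0; i < nlevels; i++)
    {
      path += '/';
      path += digest[i];
    }
  return path + "/stap_" + digest + ".ko";
}
```

With two levels, a digest beginning "ab" lands under
<cachedir>/a/b/, and the file keeps the full "stap_<digest>.ko" name
so the module can be used straight from the cache.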

Still at the end of pass 2, we now check to see if we can find a cache 
hit (get_from_cache(), cache.cxx:42).  If the hash pathname exists, we 
copy it from the cache to the temporary directory.  We then skip to pass 
5 to run the module.

If a cache hit wasn't found, we proceed normally.  At the end of pass 4, 
we save the compiled module and its C file in the cache (add_to_cache(), 
cache.cxx:17).
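
The control flow described above, reduced to a sketch.  passes_run is
a hypothetical helper that only reports which passes would execute;
error handling is ignored, and last_pass is assumed to be in 1..5.

```cpp
#include <cassert>
#include <vector>

// Returns the pass numbers that would run, given whether the cache is
// enabled, whether the lookup hits, and the last pass requested (-pN).
std::vector<int> passes_run(bool use_cache, bool cache_hit, int last_pass)
{
  assert(last_pass >= 1 && last_pass <= 5);

  std::vector<int> ran;
  ran.push_back(1);
  if (last_pass == 1) return ran;
  ran.push_back(2);
  if (last_pass == 2) return ran;   // no hash is computed in this case

  if (use_cache && cache_hit)
    {
      // Passes 3 and 4 would only regenerate what the cache holds.
      if (last_pass == 5) ran.push_back(5);
      return ran;
    }

  ran.push_back(3);
  if (last_pass == 3) return ran;
  ran.push_back(4);                 // on success the result is cached
  if (last_pass == 4) return ran;
  ran.push_back(5);
  return ran;
}
```

A hit with -p5 therefore runs passes 1, 2 and 5 only, which is the
jump that the testsuite does not yet expect.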

By default, cache support is on.  To turn it off, you can either set 
SYSTEMTAP_CACHE to a non-directory ("SYSTEMTAP_CACHE=/dev/null stap ..") 
or use the '-m' option to change the module name (we must turn off cache 
support because we have to name the module 'stap_{hash}').  We could add 
a command line option to disable the cache if needed.

Note that currently several tests in the testsuite fail after a first 
run to seed the cache because they don't expect to see the skip from 
pass 2 to pass 5.

Stuff left to do:

- Testing, testing, testing
- Correct module handling (which Frank outlined above)
- Set a maximum cache size and expire old modules.  Is this needed?
- Documentation.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

[-- Attachment #2: cache.patch --]
[-- Type: text/x-patch, Size: 29719 bytes --]

Index: Makefile.am
===================================================================
RCS file: /cvs/systemtap/src/Makefile.am,v
retrieving revision 1.57
diff -u -p -u -p -r1.57 Makefile.am
--- Makefile.am	18 Oct 2006 21:23:53 -0000	1.57
+++ Makefile.am	19 Oct 2006 19:44:57 -0000
@@ -15,7 +15,8 @@ dist_man_MANS = stap.1 stapprobes.5 stap
 bin_PROGRAMS = stap staprun
 stap_SOURCES = main.cxx \
 	parse.cxx staptree.cxx elaborate.cxx translate.cxx \
-	tapsets.cxx buildrun.cxx loc2c.c
+	tapsets.cxx buildrun.cxx loc2c.c hash.cxx mdfour.c \
+	cache.cxx util.cxx
 stap_LDADD = @stap_LIBS@
 
 stap_CXXFLAGS = -Werror $(AM_CXXFLAGS)
Index: cache.cxx
===================================================================
RCS file: cache.cxx
diff -N cache.cxx
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ cache.cxx	19 Oct 2006 19:44:57 -0000
@@ -0,0 +1,101 @@
+#include "session.h"
+#include "cache.h"
+#include "util.h"
+#include <cerrno>
+#include <string>
+
+extern "C" {
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+}
+
+using namespace std;
+
+
+void
+add_to_cache(systemtap_session& s)
+{
+  string module_src_path = s.tmpdir + "/" + s.module_name + ".ko";
+  if (s.verbose)
+    clog << "Copying " << module_src_path << " to " << s.hash_path << endl;
+  if (copy_file(module_src_path.c_str(), s.hash_path.c_str()) != 0)
+    {
+      cerr << "Copy failed (\"" << module_src_path << "\" to \""
+	   << s.hash_path << "\"): " << strerror(errno) << endl;
+      return;
+    }
+
+  string c_dest_path = s.hash_path;
+  if (c_dest_path.rfind(".ko") == (c_dest_path.size() - 3))
+    c_dest_path.resize(c_dest_path.size() - 3);
+  c_dest_path += ".c";
+
+  if (s.verbose)
+    clog << "Copying " << s.translated_source << " to " << c_dest_path
+	 << endl;
+  if (copy_file(s.translated_source.c_str(), c_dest_path.c_str()) != 0)
+    {
+      cerr << "Copy failed (\"" << s.translated_source << "\" to \""
+	   << c_dest_path << "\"): " << strerror(errno) << endl;
+    }
+}
+
+
+bool
+get_from_cache(systemtap_session& s)
+{
+  string module_dest_path = s.tmpdir + "/" + s.module_name + ".ko";
+  string c_src_path = s.hash_path;
+  int fd_module, fd_c;
+
+  if (c_src_path.rfind(".ko") == (c_src_path.size() - 3))
+    c_src_path.resize(c_src_path.size() - 3);
+  c_src_path += ".c";
+
+  // See if module exists
+  fd_module = open(s.hash_path.c_str(), O_RDONLY);
+  if (fd_module == -1)
+    {
+      // It isn't in cache.
+      return false;
+    }
+
+  // See if C file exists.
+  fd_c = open(c_src_path.c_str(), O_RDONLY);
+  if (fd_c == -1)
+    {
+      // The module is there, but the C file isn't.  Cleanup and
+      // return.
+      close(fd_module);
+      unlink(s.hash_path.c_str());
+      return false;
+    }
+
+  // Copy the cached module to the destination
+  if (copy_file(s.hash_path.c_str(), module_dest_path.c_str()) != 0)
+  {
+      cerr << "Copy failed (\"" << s.hash_path << "\" to \""
+	   << module_dest_path << "\"): " << strerror(errno) << endl;
+      close(fd_module);
+      close(fd_c);
+      return false;
+  }
+  
+  // Copy the cached C file to the destination
+  if (copy_file(c_src_path.c_str(), s.translated_source.c_str()) != 0)
+  {
+      cerr << "Copy failed (\"" << c_src_path << "\" to \""
+	   << s.translated_source << "\"): " << strerror(errno) << endl;
+      unlink(module_dest_path.c_str());
+      close(fd_module);
+      close(fd_c);
+      return false;
+  }
+
+  close(fd_module);
+  close(fd_c);
+  clog << "Using cached result \"" << s.hash_path << "\" as \""
+       << module_dest_path << "\"" << endl;
+  return true;
+}
Index: cache.h
===================================================================
RCS file: cache.h
diff -N cache.h
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ cache.h	19 Oct 2006 19:44:57 -0000
@@ -0,0 +1,2 @@
+void add_to_cache(systemtap_session& s);
+bool get_from_cache(systemtap_session& s);
Index: hash.cxx
===================================================================
RCS file: hash.cxx
diff -N hash.cxx
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ hash.cxx	19 Oct 2006 19:44:57 -0000
@@ -0,0 +1,154 @@
+// Copyright (C) Andrew Tridgell 2002 (original file)
+// Copyright (C) 2006 Red Hat Inc. (systemtap changes)
+// 
+// This program is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License as published by
+// the Free Software Foundation; either version 2 of the License, or
+// (at your option) any later version.
+// 
+// This program is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+// 
+// You should have received a copy of the GNU General Public License
+// along with this program; if not, write to the Free Software
+// Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+#include "config.h"
+#include "session.h"
+#include "hash.h"
+#include "util.h"
+#include <sstream>
+#include <iomanip>
+#include <cerrno>
+
+extern "C" {
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+}
+
+using namespace std;
+
+void
+hash::start()
+{
+  mdfour_begin(&md4);
+}
+
+
+void
+hash::add(const unsigned char *buffer, size_t size)
+{
+  mdfour_update(&md4, buffer, size);
+}
+
+
+void
+hash::result(string& r)
+{
+  ostringstream rstream;
+  unsigned char sum[16];
+
+  mdfour_update(&md4, NULL, 0);
+  mdfour_result(&md4, sum);
+
+  for (int i=0; i<16; i++)
+    {
+      rstream << hex << setfill('0') << setw(2) << (unsigned)sum[i];
+    }
+  rstream << "_" << setw(0) << dec << (unsigned)md4.totalN;
+  r = rstream.str();
+}
+
+
+// Grabbed from linux/module.h kernel include.
+#define MODULE_NAME_LEN (64 - sizeof(unsigned long))
+
+void
+find_hash (systemtap_session& s, const string& script)
+{
+  hash h;
+  int nlevels = 2;
+  struct stat st;
+
+  // We use a N level subdir for the cache path.  Let N be adjustable.
+  const char *s_n;
+  if ((s_n = getenv("SYSTEMTAP_NLEVELS")))
+    {
+      nlevels = atoi(s_n);
+      if (nlevels < 1) nlevels = 1;
+      if (nlevels > 8) nlevels = 8;
+    }
+
+  // Hash kernel release and arch.
+  h.add(s.kernel_release);
+  h.add(s.architecture);
+
+  // Hash user-specified arguments (that change the generated module).
+  for (unsigned i = 0; i < s.macros.size(); i++)
+    h.add(s.macros[i]);
+
+  // Hash runtime path (that gets added in as "-I path").
+  h.add(s.runtime_path);
+
+  // Hash compiler path, size, and mtime.  We're just going to assume
+  // we'll be using gcc, which should be correct most of the time.
+  string gcc_path;
+  if (find_executable("gcc", gcc_path))
+    {
+      if (stat(gcc_path.c_str(), &st) == 0)
+        {
+	  h.add(gcc_path);
+	  h.add(st.st_size);
+	  h.add(st.st_mtime);
+	}
+    }
+
+  // Hash the systemtap version and compile date.
+  h.add(VERSION);
+  h.add(DATE);
+
+  // Add in pass 2 script output.
+  h.add(script);
+
+  // Use a N level subdir for the cache path to reduce the impact on
+  // filesystems which are slow for large directories.
+  string hashdir = s.cachedir;
+  string result;
+  h.result(result);
+
+  for (int i = 0; i < nlevels; i++)
+    {
+      hashdir += string("/") + result[i];
+      if (create_dir(hashdir.c_str()) != 0)
+        {
+	  cerr << "Warning: failed to create cache directory (\""
+	       << hashdir + "\"): " << strerror(errno) << endl;
+	  cerr << "Disabling cache support." << endl;
+	  s.use_cache = false;
+	  return;
+	}
+    }
+
+  // Update module name to be 'stap_{hash start}'.  '{hash start}'
+  // must not be too long.  This shouldn't happen, since the maximum
+  // size of a hash is 32 fixed chars + 1 (for the '_') + a max of 11.
+  s.module_name = "stap_" + result;
+  if (s.module_name.size() >= (MODULE_NAME_LEN - 1))
+    s.module_name.resize(MODULE_NAME_LEN - 1);
+
+  // 'ccache' would use a hash path of something like:
+  //    s.hash_path = hashdir + "/" + result.substr(nlevels);
+  // which would look like:
+  //    ~/.stap_cache/A/B/CDEFGHIJKLMNOPQRSTUVWXYZABCDEF_XXX
+  //
+  // We're using the following so that the module can be used straight
+  // from the cache if desired.  This ends up looking like this:
+  //    ~/.stap_cache/A/B/stap_ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEF_XXX.ko
+  s.hash_path = hashdir + "/" + s.module_name + ".ko";
+
+  // Update C source name with new module_name.
+  s.translated_source = string(s.tmpdir) + "/" + s.module_name + ".c";
+}
Index: hash.h
===================================================================
RCS file: hash.h
diff -N hash.h
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ hash.h	19 Oct 2006 19:44:57 -0000
@@ -0,0 +1,34 @@
+#include <string>
+
+extern "C" {
+#include <string.h>
+#include <mdfour.h>
+}
+
+class hash
+{
+private:
+  struct mdfour md4;
+
+public:
+  hash() { start(); }
+
+  void start();
+
+  void add(const unsigned char *buffer, size_t size);
+  void add(const int x) { add((const unsigned char *)&x, sizeof(x)); }
+  void add(const long x) { add((const unsigned char *)&x, sizeof(x)); }
+  void add(const long long x) { add((const unsigned char *)&x, sizeof(x)); }
+  void add(const unsigned int x) { add((const unsigned char *)&x, sizeof(x)); }
+  void add(const unsigned long x) { add((const unsigned char *)&x,
+					sizeof(x)); }
+  void add(const unsigned long long x) { add((const unsigned char *)&x,
+					     sizeof(x)); }
+  void add(const char *s) { add((const unsigned char *)s, strlen(s)); }
+  void add(const std::string& s) { add((const unsigned char *)s.c_str(),
+				       s.length()); }
+
+  void result(std::string& r);
+};
+
+void find_hash (systemtap_session& s, const std::string& script);
Index: main.cxx
===================================================================
RCS file: /cvs/systemtap/src/main.cxx,v
retrieving revision 1.53
diff -u -p -u -p -r1.53 main.cxx
--- main.cxx	28 Sep 2006 01:48:59 -0000	1.53
+++ main.cxx	19 Oct 2006 19:44:57 -0000
@@ -15,6 +15,9 @@
 #include "translate.h"
 #include "buildrun.h"
 #include "session.h"
+#include "hash.h"
+#include "cache.h"
+#include "util.h"
 
 #include <iostream>
 #include <fstream>
@@ -113,6 +116,65 @@ stringify(T t)
 }
 
 
+static void
+printscript(systemtap_session& s, ostream& o)
+{
+  if (s.globals.size() > 0)
+    o << "# globals" << endl;
+  for (unsigned i=0; i<s.globals.size(); i++)
+    {
+      vardecl* v = s.globals[i];
+      v->printsig (o);
+      o << endl;
+    }
+
+  if (s.functions.size() > 0)
+    o << "# functions" << endl;
+  for (unsigned i=0; i<s.functions.size(); i++)
+    {
+      functiondecl* f = s.functions[i];
+      f->printsig (o);
+      o << endl;
+      if (f->locals.size() > 0)
+	o << "  # locals" << endl;
+      for (unsigned j=0; j<f->locals.size(); j++)
+        {
+	  vardecl* v = f->locals[j];
+	  o << "  ";
+	  v->printsig (o);
+	  o << endl;
+	}
+      if (s.verbose)
+        {
+	  f->body->print (o);
+	  o << endl;
+	}
+    }
+
+  if (s.probes.size() > 0)
+    o << "# probes" << endl;
+  for (unsigned i=0; i<s.probes.size(); i++)
+    {
+      derived_probe* p = s.probes[i];
+      p->printsig (o);
+      o << endl;
+      if (p->locals.size() > 0)
+        o << "  # locals" << endl;
+      for (unsigned j=0; j<p->locals.size(); j++)
+        {
+	  vardecl* v = p->locals[j];
+	  o << "  ";
+	  v->printsig (o);
+	  o << endl;
+	}
+      if (s.verbose)
+        {
+	  p->body->print (o);
+	  o << endl;
+	}
+    }
+}
+
 int
 main (int argc, char * const argv [])
 {
@@ -140,6 +202,7 @@ main (int argc, char * const argv [])
   s.target_pid = 0;
   s.merge=true;
   s.perfmon=0;
+  s.use_cache = true;
 
   const char* s_p = getenv ("SYSTEMTAP_TAPSET");
   if (s_p != NULL)  
@@ -159,6 +222,20 @@ main (int argc, char * const argv [])
   else
     s.runtime_path = string(PKGDATADIR) + "/runtime";
 
+  const char* s_c = getenv ("SYSTEMTAP_CACHE");
+  if (s_c != NULL)
+    s.cachedir = s_c;
+  else
+    s.cachedir = get_home_directory() + string("/.stap_cache");
+  if (create_dir(s.cachedir.c_str()) == 1)
+  {
+      const char* e = strerror (errno);
+      cerr << "Warning: failed to create cache directory (\"" << s.cachedir
+	   << "\"): " << e << endl;
+      cerr << "Disabling cache support." << endl;
+      s.use_cache = false;
+  }
+
   while (true)
     {
       int grc = getopt (argc, argv, "hVMvtp:I:e:o:R:r:m:kgc:x:D:bs:u");
@@ -216,6 +293,8 @@ main (int argc, char * const argv [])
 
         case 'm':
           s.module_name = string (optarg);
+	  cerr << "Warning: using '-m' disables cache support." << endl;
+	  s.use_cache = false;
           break;
 
         case 'r':
@@ -339,6 +418,10 @@ main (int argc, char * const argv [])
       clog << "Created temporary directory \"" << s.tmpdir << "\"" << endl;
   }
 
+  // Create the name of the C source file within the temporary
+  // directory.
+  s.translated_source = string(s.tmpdir) + "/" + s.module_name + ".c";
+
   struct tms tms_before;
   times (& tms_before);
   struct timeval tv_before;
@@ -469,62 +552,7 @@ main (int argc, char * const argv [])
   rc = semantic_pass (s);
 
   if (rc == 0 && s.last_pass == 2)
-    {
-      if (s.globals.size() > 0)
-        cout << "# globals" << endl;
-      for (unsigned i=0; i<s.globals.size(); i++)
-	{
-	  vardecl* v = s.globals[i];
-	  v->printsig (cout);
-          cout << endl;
-	}
-
-      if (s.functions.size() > 0)
-        cout << "# functions" << endl;
-      for (unsigned i=0; i<s.functions.size(); i++)
-	{
-	  functiondecl* f = s.functions[i];
-	  f->printsig (cout);
-          cout << endl;
-          if (f->locals.size() > 0)
-            cout << "  # locals" << endl;
-          for (unsigned j=0; j<f->locals.size(); j++)
-            {
-              vardecl* v = f->locals[j];
-              cout << "  ";
-              v->printsig (cout);
-              cout << endl;
-            }
-          if (s.verbose)
-            {
-              f->body->print (cout);
-              cout << endl;
-            }
-	}
-
-      if (s.probes.size() > 0)
-        cout << "# probes" << endl;
-      for (unsigned i=0; i<s.probes.size(); i++)
-	{
-	  derived_probe* p = s.probes[i];
-	  p->printsig (cout);
-          cout << endl;
-          if (p->locals.size() > 0)
-            cout << "  # locals" << endl;
-          for (unsigned j=0; j<p->locals.size(); j++)
-            {
-              vardecl* v = p->locals[j];
-              cout << "  ";
-              v->printsig (cout);
-              cout << endl;
-            }
-          if (s.verbose)
-            {
-              p->body->print (cout);
-              cout << endl;
-            }
-	}
-    }
+    printscript(s, cout);
 
   times (& tms_after);
   gettimeofday (&tv_after, NULL);
@@ -540,6 +568,38 @@ main (int argc, char * const argv [])
     cerr << "Pass 2: analysis failed.  "
          << "Try again with more '-v' (verbose) options."
          << endl;
+  // Generate hash.  There isn't any point in generating the hash
+  // if last_pass is 2, since we'll quit before using it.
+  else if (s.last_pass != 2 && s.use_cache)
+    {
+      ostringstream o;
+      unsigned saved_verbose;
+	  
+      // Make sure we're in verbose mode, so that printscript()
+      // will output function/probe bodies.
+      saved_verbose = s.verbose;
+      s.verbose = 3;
+
+      // Print script to 'o'
+      printscript(s, o);
+
+      // Restore original verbose mode setting.
+      s.verbose = saved_verbose;
+
+      // Generate hash
+      find_hash (s, o.str());
+
+      // See if we can use a cached module.
+      if (get_from_cache(s))
+        {
+	  // If our last pass isn't 5, we're done (since passes 3 and
+	  // 4 just generate what we just pulled out of the cache).
+	  if (s.last_pass < 5) goto cleanup;
+
+	  // Short-circuit to pass 5.
+	  goto pass_5;
+	}
+    }
 
   if (rc || s.last_pass == 2) goto cleanup;
 
@@ -548,7 +608,6 @@ main (int argc, char * const argv [])
   times (& tms_before);
   gettimeofday (&tv_before, NULL);
 
-  s.translated_source = string(s.tmpdir) + "/" + s.module_name + ".c";
   rc = translate_pass (s);
 
   if (rc == 0 && s.last_pass == 3)
@@ -590,11 +649,17 @@ main (int argc, char * const argv [])
     cerr << "Pass 4: compilation failed.  "
          << "Try again with more '-v' (verbose) options."
          << endl;
+  else if (s.use_cache)
+    {
+      // Update cache.
+      add_to_cache(s);
+    }
 
   // XXX: what to do if rc==0 && last_pass == 4?  dump .ko file to stdout?
   if (rc || s.last_pass == 4) goto cleanup;
 
   // PASS 5: RUN
+pass_5:
   times (& tms_before);
   gettimeofday (&tv_before, NULL);
   // NB: this message is a judgement call.  The other passes don't emit
Index: mdfour.c
===================================================================
RCS file: mdfour.c
diff -N mdfour.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ mdfour.c	19 Oct 2006 19:44:57 -0000
@@ -0,0 +1,218 @@
+/* 
+   a implementation of MD4 designed for use in the SMB authentication protocol
+   Copyright (C) Andrew Tridgell 1997-1998.
+   
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+   
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+   
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+*/
+
+#include <stdio.h>
+#include <string.h>
+#include "mdfour.h"
+
+/* NOTE: This code makes no attempt to be fast! 
+
+   It assumes that a int is at least 32 bits long
+*/
+
+static struct mdfour *m;
+
+#define MASK32 (0xffffffff)
+
+#define F(X,Y,Z) ((((X)&(Y)) | ((~(X))&(Z))))
+#define G(X,Y,Z) ((((X)&(Y)) | ((X)&(Z)) | ((Y)&(Z))))
+#define H(X,Y,Z) (((X)^(Y)^(Z)))
+#define lshift(x,s) (((((x)<<(s))&MASK32) | (((x)>>(32-(s)))&MASK32)))
+
+#define ROUND1(a,b,c,d,k,s) a = lshift((a + F(b,c,d) + M[k])&MASK32, s)
+#define ROUND2(a,b,c,d,k,s) a = lshift((a + G(b,c,d) + M[k] + 0x5A827999)&MASK32,s)
+#define ROUND3(a,b,c,d,k,s) a = lshift((a + H(b,c,d) + M[k] + 0x6ED9EBA1)&MASK32,s)
+
+/* this applies md4 to 64 byte chunks */
+static void
+mdfour64(uint32_t *M)
+{
+  uint32_t AA, BB, CC, DD;
+  uint32_t A,B,C,D;
+
+  A = m->A; B = m->B; C = m->C; D = m->D; 
+  AA = A; BB = B; CC = C; DD = D;
+
+  ROUND1(A,B,C,D,  0,  3);  ROUND1(D,A,B,C,  1,  7);  
+  ROUND1(C,D,A,B,  2, 11);  ROUND1(B,C,D,A,  3, 19);
+  ROUND1(A,B,C,D,  4,  3);  ROUND1(D,A,B,C,  5,  7);  
+  ROUND1(C,D,A,B,  6, 11);  ROUND1(B,C,D,A,  7, 19);
+  ROUND1(A,B,C,D,  8,  3);  ROUND1(D,A,B,C,  9,  7);  
+  ROUND1(C,D,A,B, 10, 11);  ROUND1(B,C,D,A, 11, 19);
+  ROUND1(A,B,C,D, 12,  3);  ROUND1(D,A,B,C, 13,  7);  
+  ROUND1(C,D,A,B, 14, 11);  ROUND1(B,C,D,A, 15, 19);	
+
+
+  ROUND2(A,B,C,D,  0,  3);  ROUND2(D,A,B,C,  4,  5);  
+  ROUND2(C,D,A,B,  8,  9);  ROUND2(B,C,D,A, 12, 13);
+  ROUND2(A,B,C,D,  1,  3);  ROUND2(D,A,B,C,  5,  5);  
+  ROUND2(C,D,A,B,  9,  9);  ROUND2(B,C,D,A, 13, 13);
+  ROUND2(A,B,C,D,  2,  3);  ROUND2(D,A,B,C,  6,  5);  
+  ROUND2(C,D,A,B, 10,  9);  ROUND2(B,C,D,A, 14, 13);
+  ROUND2(A,B,C,D,  3,  3);  ROUND2(D,A,B,C,  7,  5);  
+  ROUND2(C,D,A,B, 11,  9);  ROUND2(B,C,D,A, 15, 13);
+
+  ROUND3(A,B,C,D,  0,  3);  ROUND3(D,A,B,C,  8,  9);  
+  ROUND3(C,D,A,B,  4, 11);  ROUND3(B,C,D,A, 12, 15);
+  ROUND3(A,B,C,D,  2,  3);  ROUND3(D,A,B,C, 10,  9);  
+  ROUND3(C,D,A,B,  6, 11);  ROUND3(B,C,D,A, 14, 15);
+  ROUND3(A,B,C,D,  1,  3);  ROUND3(D,A,B,C,  9,  9);  
+  ROUND3(C,D,A,B,  5, 11);  ROUND3(B,C,D,A, 13, 15);
+  ROUND3(A,B,C,D,  3,  3);  ROUND3(D,A,B,C, 11,  9);  
+  ROUND3(C,D,A,B,  7, 11);  ROUND3(B,C,D,A, 15, 15);
+
+  A += AA; B += BB; 
+  C += CC; D += DD;
+	
+  A &= MASK32; B &= MASK32; 
+  C &= MASK32; D &= MASK32;
+
+  m->A = A; m->B = B; m->C = C; m->D = D;
+}
+
+static void
+copy64(uint32_t *M, const unsigned char *in)
+{
+  int i;
+
+  for (i=0;i<16;i++)
+    M[i] = (in[i*4+3]<<24) | (in[i*4+2]<<16) |
+      (in[i*4+1]<<8) | (in[i*4+0]<<0);
+}
+
+static void
+copy4(unsigned char *out,uint32_t x)
+{
+  out[0] = x&0xFF;
+  out[1] = (x>>8)&0xFF;
+  out[2] = (x>>16)&0xFF;
+  out[3] = (x>>24)&0xFF;
+}
+
+void
+mdfour_begin(struct mdfour *md)
+{
+  md->A = 0x67452301;
+  md->B = 0xefcdab89;
+  md->C = 0x98badcfe;
+  md->D = 0x10325476;
+  md->totalN = 0;
+  md->tail_len = 0;
+}
+
+
+static void
+mdfour_tail(const unsigned char *in, int n)
+{
+  unsigned char buf[128];
+  uint32_t M[16];
+  uint32_t b;
+
+  m->totalN += n;
+
+  b = m->totalN * 8;
+
+  memset(buf, 0, 128);
+  if (n) memcpy(buf, in, n);
+  buf[n] = 0x80;
+
+  if (n <= 55)
+    {
+      copy4(buf+56, b);
+      copy64(M, buf);
+      mdfour64(M);
+    }
+  else
+    {
+      copy4(buf+120, b); 
+      copy64(M, buf);
+      mdfour64(M);
+      copy64(M, buf+64);
+      mdfour64(M);
+    }
+}
+
+void
+mdfour_update(struct mdfour *md, const unsigned char *in, int n)
+{
+  uint32_t M[16];
+
+  m = md;
+
+  if (in == NULL)
+    {
+      mdfour_tail(md->tail, md->tail_len);
+      return;
+    }
+
+  if (md->tail_len)
+    {
+      int len = 64 - md->tail_len;
+      if (len > n) len = n;
+      memcpy(md->tail+md->tail_len, in, len);
+      md->tail_len += len;
+      n -= len;
+      in += len;
+      if (md->tail_len == 64)
+        {
+	  copy64(M, md->tail);
+	  mdfour64(M);
+	  m->totalN += 64;
+	  md->tail_len = 0;
+	}
+    }
+
+    while (n >= 64)
+      {
+	copy64(M, in);
+	mdfour64(M);
+	in += 64;
+	n -= 64;
+	m->totalN += 64;
+      }
+
+    if (n)
+      {
+	memcpy(md->tail, in, n);
+	md->tail_len = n;
+      }
+}
+
+
+void
+mdfour_result(struct mdfour *md, unsigned char *out)
+{
+  m = md;
+
+  copy4(out, m->A);
+  copy4(out+4, m->B);
+  copy4(out+8, m->C);
+  copy4(out+12, m->D);
+}
+
+
+void
+mdfour(unsigned char *out, const unsigned char *in, int n)
+{
+  struct mdfour md;
+  mdfour_begin(&md);
+  mdfour_update(&md, in, n);
+  mdfour_update(&md, NULL, 0);
+  mdfour_result(&md, out);
+}
Index: mdfour.h
===================================================================
RCS file: mdfour.h
diff -N mdfour.h
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ mdfour.h	19 Oct 2006 19:44:57 -0000
@@ -0,0 +1,39 @@
+/* 
+   Unix SMB/Netbios implementation.
+   Version 1.9.
+   an implementation of MD4 designed for use in the SMB authentication protocol
+   Copyright (C) Andrew Tridgell 1997-1998.
+   
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+   
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+   
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+*/
+
+#include <stdint.h>
+
+struct mdfour
+{
+  uint32_t A, B, C, D;
+  uint32_t totalN;
+  unsigned char tail[64];
+  unsigned tail_len;	
+};
+
+void mdfour_begin(struct mdfour *md);
+void mdfour_update(struct mdfour *md, const unsigned char *in, int n);
+void mdfour_result(struct mdfour *md, unsigned char *out);
+void mdfour(unsigned char *out, const unsigned char *in, int n);
+
+
+
+
Index: session.h
===================================================================
RCS file: /cvs/systemtap/src/session.h,v
retrieving revision 1.11
diff -u -p -u -p -r1.11 session.h
--- session.h	11 Oct 2006 14:56:09 -0000	1.11
+++ session.h	19 Oct 2006 19:44:57 -0000
@@ -85,6 +85,11 @@ struct systemtap_session
   int buffer_size;
   unsigned perfmon;
 
+  // Cache data
+  bool use_cache;
+  std::string cachedir;
+  std::string hash_path;
+
   // temporary directory for module builds etc.
   // hazardous - it is "rm -rf"'d at exit
   std::string tmpdir;
Index: util.cxx
===================================================================
RCS file: util.cxx
diff -N util.cxx
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ util.cxx	19 Oct 2006 19:44:57 -0000
@@ -0,0 +1,199 @@
+// Copyright (C) Andrew Tridgell 2002 (original file)
+// Copyright (C) 2006 Red Hat Inc. (systemtap changes)
+//
+// This program is free software; you can redistribute it and/or
+// modify it under the terms of the GNU General Public License as
+// published by the Free Software Foundation; either version 2 of the
+// License, or (at your option) any later version.
+//
+// This program is distributed in the hope that it will be useful, but
+// WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+// General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with this program; if not, write to the Free Software
+// Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+#include "util.h"
+#include <stdexcept>
+#include <cerrno>
+
+extern "C" {
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <pwd.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <fcntl.h>
+}
+
+using namespace std;
+
+
+// Return the current user's home directory or die.
+const char *
+get_home_directory(void)
+{
+  const char *p = getenv("HOME");
+  if (p)
+    return p;
+
+  struct passwd *pwd = getpwuid(getuid());
+  if (pwd)
+    return pwd->pw_dir;
+
+  throw runtime_error("Unable to determine home directory");
+  return NULL;
+}
+
+
+// Copy a file.  The copy is done via a temporary file and atomic
+// rename.
+int
+copy_file(const char *src, const char *dest)
+{
+  int fd1, fd2;
+  char buf[10240];
+  int n;
+  string tmp;
+  char *tmp_name;
+  mode_t mask;
+
+  // Open the src file.
+  fd1 = open(src, O_RDONLY);
+  if (fd1 == -1)
+    return -1;
+
+  // Open the temporary output file.
+  tmp = dest + string(".XXXXXX");
+  tmp_name = (char *)tmp.c_str();
+  fd2 = mkstemp(tmp_name);
+  if (fd2 == -1)
+    {
+      close(fd1);
+      return -1;
+    }
+
+  // Copy the src file to the temporary output file.
+  while ((n = read(fd1, buf, sizeof(buf))) > 0)
+    {
+      if (write(fd2, buf, n) != n)
+        {
+	  close(fd2);
+	  close(fd1);
+	  unlink(tmp_name);
+	  return -1;
+	}
+    }
+  close(fd1);
+
+  // Set the permissions on the temporary output file.
+  mask = umask(0);
+  fchmod(fd2, 0666 & ~mask);
+  umask(mask);
+
+  // Close the temporary output file.  The close can fail on NFS if
+  // out of space.
+  if (close(fd2) == -1)
+    {
+      unlink(tmp_name);
+      return -1;
+    }
+
+  // Rename the temporary output file to the destination file.
+  unlink(dest);
+  if (rename(tmp_name, dest) == -1)
+    {
+      unlink(tmp_name);
+      return -1;
+    }
+
+  return 0;
+}
+
+
+// Make sure a directory exists.
+int
+create_dir(const char *dir)
+{
+  struct stat st;
+  if (stat(dir, &st) == 0)
+    {
+      if (S_ISDIR(st.st_mode))
+	return 0;
+      errno = ENOTDIR;
+      return 1;
+    }
+
+  if (mkdir(dir, 0777) != 0 && errno != EEXIST)
+    return 1;
+
+  return 0;
+}
+
+
+void
+tokenize(const string& str, vector<string>& tokens,
+	 const string& delimiters = " ")
+{
+  // Skip delimiters at beginning.
+  string::size_type lastPos = str.find_first_not_of(delimiters, 0);
+  // Find the first delimiter after that position (the end of the token).
+  string::size_type pos     = str.find_first_of(delimiters, lastPos);
+
+  while (pos != string::npos || lastPos != string::npos)
+    {
+      // Found a token, add it to the vector.
+      tokens.push_back(str.substr(lastPos, pos - lastPos));
+      // Skip delimiters.  Note the "not_of"
+      lastPos = str.find_first_not_of(delimiters, pos);
+      // Find the next delimiter (the end of the next token).
+      pos = str.find_first_of(delimiters, lastPos);
+    }
+}
+
+
+//  Find an executable by name in $PATH.
+bool
+find_executable(const char *name, string& retpath)
+{
+  const char *p;
+  string path;
+  vector<string> dirs;
+  struct stat st1, st2;
+
+  if (*name == '/')
+    {
+      retpath = name;
+      return true;
+    }
+
+  p = getenv("PATH");
+  if (!p)
+    return false;
+  path = p;
+	
+  // Split PATH up.
+  tokenize(path, dirs, string(":"));
+
+  // Search the path looking for the first executable of the right name.
+  for (vector<string>::iterator i = dirs.begin(); i != dirs.end(); i++)
+    {
+      string fname = *i + "/" + name;
+      const char *f = fname.c_str();
+
+      // Look for a normal executable file.
+      if (access(f, X_OK) == 0
+	  && lstat(f, &st1) == 0
+	  && stat(f, &st2) == 0
+	  && S_ISREG(st2.st_mode))
+        {
+	  // Found it!
+	  retpath = fname;
+	  return true;
+	}
+    }
+
+  return false;
+}
Index: util.h
===================================================================
RCS file: util.h
diff -N util.h
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ util.h	19 Oct 2006 19:44:57 -0000
@@ -0,0 +1,13 @@
+#include <string>
+#include <vector>
+
+const char *get_home_directory(void);
+
+int copy_file(const char *src, const char *dest);
+
+int create_dir(const char *dir);
+
+void tokenize(const std::string& str, std::vector<std::string>& tokens,
+	      const std::string& delimiters);
+
+bool find_executable(const char *name, std::string& retpath);


* Re: precompiled probing scenarios
  2006-10-19 19:49     ` David Smith
@ 2006-10-19 21:53       ` Frank Ch. Eigler
  2006-10-20 13:50         ` David Smith
  0 siblings, 1 reply; 19+ messages in thread
From: Frank Ch. Eigler @ 2006-10-19 21:53 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

Hi -

dsmith wrote:

> [...]

Nice work, thank you!  You might want to taunt people with some speed
improvement numbers too.

> [...] The hash is computed using the following data:
> - gcc's path, size, and mtime
> - stap's version and compile date

In addition to or instead of this, you could include a hash of
/proc/self/exe content and/or stat info (like gcc's), for us developers.
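
As a rough illustration of the "stat info" idea, here is a minimal sketch of turning a file's path, size, and mtime into a string that could be fed into the cache hash.  The helper name stat_fingerprint is hypothetical and not part of the patch; the same call could be pointed at gcc's path or at /proc/self/exe.

```cpp
#include <sstream>
#include <string>

extern "C" {
#include <sys/types.h>
#include <sys/stat.h>
}

// Hypothetical helper (not in the patch): describe a file by its path,
// size, and mtime, so the description can be mixed into the cache hash.
// Returns false if the file cannot be stat'ed.
bool
stat_fingerprint(const std::string& path, std::string& out)
{
  struct stat st;
  if (stat(path.c_str(), &st) != 0)
    return false;

  std::ostringstream s;
  s << path << ' '
    << (long long) st.st_size << ' '
    << (long long) st.st_mtime;
  out = s.str();
  return true;
}
```

Hashing the stat info is cheaper than hashing the file content, at the cost of missing a rebuild that happens to preserve both size and mtime.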

> [...]
> Note that currently several tests in the testsuite fail after a first 
> run to seed the cache because they don't expect to see the skip from 
> pass 2 to pass 5.

How do you mean they fail?  -p3 or -p4 should still work.

> [...]
> - Set a maximum cache size and expire old modules.  Is this needed?

We can include a shell script ditty for that.  I wouldn't bother putting
the logic into stap proper.

Regarding the choice of cache directory name (".stap_cache"), that's
OK if we don't anticipate anything other than cache files to have to
live under $HOME.  But if we want to undertake cross-instrumentation
along the lines I proposed, we'd need at least a few more non-cache
files (for host descriptions for example).  If so, then .stap_cache
should be nested as .systemtap/cache instead, so that other data files
may live under .systemtap/.

- FChE


* Re: precompiled probing scenarios
  2006-10-19 21:53       ` Frank Ch. Eigler
@ 2006-10-20 13:50         ` David Smith
  0 siblings, 0 replies; 19+ messages in thread
From: David Smith @ 2006-10-20 13:50 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Frank Ch. Eigler wrote:
> Hi -
> 
> dsmith wrote:
> 
>> [...]
> 
> Nice work, thank you!  You might want to taunt people with some speed
> improvement numbers too.

Oops, I forgot to include those.  For "make check", the time goes from 
29 minutes (when seeding the cache) down to 8 minutes (with cache).  For 
"make installcheck", the time goes from 56 minutes down to 24 minutes.

>> [...] The hash is computed using the following data:
>> - gcc's path, size, and mtime
>> - stap's version and compile date
> 
> In addition to or instead of this, you could include a hash of
> /proc/self/exe content and/or stat info (like gcc's), for us developers.

OK.

>> [...]
>> Note that currently several tests in the testsuite fail after a first 
>> run to seed the cache because they don't expect to see the skip from 
>> pass 2 to pass 5.
> 
> How do you mean they fail?  -p3 or -p4 should still work.

Here's what goes on.  The '-p3' and '-p4' options still work.  But, 
several run ('-p5') tests use testsuite/lib/stap_run.exp or 
testsuite/lib/stap_run2.exp.  Those two tcl files expect to see "Pass 
[12345]" in the output.  They get confused when only seeing "Pass [125]" 
and then think the test has timed out.

I'm hoping to fix this today.

>> [...]
>> - Set a maximum cache size and expire old modules.  Is this needed?
> 
> We can include a shell script ditty for that.  I wouldn't bother putting
> the logic into stap proper.

OK.

> Regarding the choice of cache directory name (".stap_cache"), that's
> OK if we don't anticipate anything other than cache files to have to
> live under $HOME.  But if we want to undertake cross-instrumentation
> along the lines I proposed, we'd need at least a few more non-cache
> files (for host descriptions for example).  If so, then .stap_cache
> should be nested as .systemtap/cache instead, so that other data files
> may live under .systemtap/.

That sounds reasonable.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


* RE: precompiled probing scenarios
@ 2006-10-19 20:33 Stone, Joshua I
  2006-10-19 20:41 ` David Smith
  0 siblings, 1 reply; 19+ messages in thread
From: Stone, Joshua I @ 2006-10-19 20:33 UTC (permalink / raw)
  To: David Smith, Frank Ch. Eigler; +Cc: systemtap

On Thursday, October 19, 2006 12:50 PM, David Smith wrote:
> Frank Ch. Eigler wrote:
>> kernel.function("*") should match exactly what was there before.
>> Probes on module("*").FOO would be redefined to mean something like
>> "all modules that we know at translation time that *might* exist,
>> that also happen to be *loaded* at run time".  This aspect of wildcard
>> expansion would thus take place at run time rather than translate
>> time.  It just so happens that the same module might probe a greater
>> or lesser number of modules on an actual system.  With that proviso,
>> a script-source-level hash still seems to work.
>[...]
> The hash is computed using the following data:
>[...]
> - pass 2 script output
>[...]
> Stuff left to do:
> 
> - Testing, testing, testing
> - Correct module handling (which Frank outlined above)

Actually, correct module handling should already be covered if you're
hashing the pass-2 output.  Pass-2 is elaboration, which uses debuginfo
to locate the actual probe points.  Suppose I run a script like this:

	probe module("*").function("*interrupt*") { log(probefunc()) }

Pass-2 output will include all of the probe points matching my wildcard:

  ...
  module("ahci").function("ahci_interrupt@drivers/scsi/ahci.c:889"),

  module("libata").function("ata_interrupt@drivers/scsi/libata-core.c:4198"),
  ...

If I then load a new module that also has a "*interrupt*" function, the
pass-2 output will include the new probe point, and will thus get a
different hash.

In my estimation, there are two lengthy tasks in script compilation:
pass-2 elaboration (digging through lots of debuginfo) and pass-4
C-compilation.  This caching mechanism removes the pain of pass-4, which
is probably the worse of the two.  But if you find a caching scheme to
also avoid pass-2, then the module handling will need to be considered.
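
To make the point concrete, here is a minimal sketch of a cache key derived from the resolved probe points.  It is only an illustration: the function names are made up, and FNV-1a stands in for the MD4 code (mdfour.c) that the patch actually uses.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stand-in hash (64-bit FNV-1a).  The patch itself uses MD4; this is
// only here to keep the example self-contained.
static uint64_t
fnv1a(const std::string& data, uint64_t h)
{
  for (unsigned char c : data)
    {
      h ^= c;
      h *= 1099511628211ULL;
    }
  return h;
}

// Hypothetical: fold every probe point that pass 2 resolved into one key.
uint64_t
cache_key(const std::vector<std::string>& probe_points)
{
  uint64_t h = 14695981039346656037ULL;   // FNV offset basis
  for (const std::string& p : probe_points)
    h = fnv1a(p + "\n", h);
  return h;
}
```

With a key like this, the same wildcard script gives the same key as long as the set of matching probe points is unchanged, and a different key once a newly loaded module contributes another match.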

Josh


* Re: precompiled probing scenarios
  2006-10-19 20:33 Stone, Joshua I
@ 2006-10-19 20:41 ` David Smith
  0 siblings, 0 replies; 19+ messages in thread
From: David Smith @ 2006-10-19 20:41 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: Frank Ch. Eigler, systemtap

Stone, Joshua I wrote:
> On Thursday, October 19, 2006 12:50 PM, David Smith wrote:
>> Frank Ch. Eigler wrote:
>>> kernel.function("*") should match exactly what was there before.
>>> Probes on module("*").FOO would be redefined to mean something like
>>> "all modules that we know at translation time that *might* exist,
>>> that also happen to be *loaded* at run time".  This aspect of wildcard
>>> expansion would thus take place at run time rather than translate
>>> time.  It just so happens that the same module might probe a greater
>>> or lesser number of modules on an actual system.  With that proviso,
>>> a script-source-level hash still seems to work.
>> [...]
>> The hash is computed using the following data:
>> [...]
>> - pass 2 script output
>> [...]
>> Stuff left to do:
>>
>> - Testing, testing, testing
>> - Correct module handling (which Frank outlined above)
> 
> Actually, correct module handling should already be covered if you're
> hashing the pass-2 output.  Pass-2 is elaboration, which uses debuginfo
> to locate the actual probe points.  Suppose I run a script like this:
> 
> 	probe module("*").function("*interrupt*") { log(probefunc()) }
> 
> Pass-2 output will include all of the probe points matching my wildcard:
> 
>   ...
>   module("ahci").function("ahci_interrupt@drivers/scsi/ahci.c:889"),
>  
>   module("libata").function("ata_interrupt@drivers/scsi/libata-core.c:4198"),
>   ...
> 
> If I then load a new module that also has a "*interrupt*" function, the
> pass-2 output will include the new probe point, and will thus get a
> different hash.

Good point.  But, if Frank's suggestion was implemented, you could still 
use the same cached module, since wildcard expansion would happen at 
runtime.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


* RE: precompiled probing scenarios
@ 2006-10-20 18:44 Stone, Joshua I
  2006-10-20 19:26 ` David Smith
  0 siblings, 1 reply; 19+ messages in thread
From: Stone, Joshua I @ 2006-10-20 18:44 UTC (permalink / raw)
  To: David Smith, Frank Ch. Eigler; +Cc: systemtap

On Friday, October 20, 2006 6:50 AM, David Smith wrote:
> Frank Ch. Eigler wrote:
>>> Note that currently several tests in the testsuite fail after a
>>> first run to seed the cache because they don't expect to see the
>>> skip from pass 2 to pass 5.
>> 
>> How do you mean they fail?  -p3 or -p4 should still work.
> 
> Here's what goes on.  The '-p3' and '-p4' options still work.  But,
> several run ('-p5') tests use testsuite/lib/stap_run.exp or
> testsuite/lib/stap_run2.exp.  Those two tcl files expect to see "Pass
> [12345]" in the output.  They get confused when only seeing "Pass
> [125]" and then think the test has timed out.

Would it make sense to print "dummy" pass 3/4 messages when a cached
version is used?  Something like:

Pass 1: parsed user script and 53 library script(s) in 310usr/0sys/326real ms.
Pass 2: analyzed script: 1 probe(s), 0 function(s), 0 global(s) in 10usr/0sys/5real ms.
Pass 3: (cached) in 0usr/0sys/0real ms.
Pass 4: (cached) in 0usr/0sys/0real ms.
Pass 5: starting run.

The timing info doesn't need to be hardcoded zero, I just expect it
would be very small.

Side question - do you still use caching when someone calls '-p3' or
'-p4'?  And with verbosity increased, what would this output, given that
you're not actually doing the work?  (e.g., you wouldn't have a compiler
output on a cached pass-4.)

Josh


* Re: precompiled probing scenarios
  2006-10-20 18:44 Stone, Joshua I
@ 2006-10-20 19:26 ` David Smith
  2006-10-20 19:32   ` Frank Ch. Eigler
  0 siblings, 1 reply; 19+ messages in thread
From: David Smith @ 2006-10-20 19:26 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: Frank Ch. Eigler, systemtap

Stone, Joshua I wrote:
> On Friday, October 20, 2006 6:50 AM, David Smith wrote:
>> Frank Ch. Eigler wrote:
>>>> Note that currently several tests in the testsuite fail after a
>>>> first run to seed the cache because they don't expect to see the
>>>> skip from pass 2 to pass 5.
>>> How do you mean they fail?  -p3 or -p4 should still work.
>> Here's what goes on.  The '-p3' and '-p4' options still work.  But,
>> several run ('-p5') tests use testsuite/lib/stap_run.exp or
>> testsuite/lib/stap_run2.exp.  Those two tcl files expect to see "Pass
>> [12345]" in the output.  They get confused when only seeing "Pass
>> [125]" and then think the test has timed out.
> 
> Would it make sense to print "dummy" pass 3/4 messages when a cached
> version is used?  Something like:
> 
> Pass 1: parsed user script and 53 library script(s) in
> 310usr/0sys/326real ms.
> Pass 2: analyzed script: 1 probe(s), 0 function(s), 0 global(s) in
> 10usr/0sys/5real ms.
> Pass 3: (cached) in 0usr/0sys/0real ms.
> Pass 4: (cached) in 0usr/0sys/0real ms.
> Pass 5: starting run.

Actually, I've found out that the lack of "Pass [34]" wasn't the problem;
it was the extra cache output messages that were confusing the testsuite.

Here's what the output now looks like (non-cached and cached):

=====
# stap -v -e 'probe begin { log("hi") }'
Pass 1: parsed user script and 53 library script(s) in 680usr/10sys/692real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 global(s) in 10usr/0sys/11real ms.
Pass 3: translated to C into "/tmp/stapYrsXWC/stap_d833fd040735ddde57a23bebb4456542_201.c" in 140usr/90sys/245real ms.
Pass 4: compiled C into "stap_d833fd040735ddde57a23bebb4456542_201.ko" in 5030usr/550sys/5587real ms.
Copying /tmp/stapYrsXWC/stap_d833fd040735ddde57a23bebb4456542_201.ko to /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko
Copying /tmp/stapYrsXWC/stap_d833fd040735ddde57a23bebb4456542_201.c to /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.c
Pass 5: starting run.
hi
Pass 5: run completed in 10usr/40sys/4634real ms.
# stap -v -e 'probe begin { log("hi") }'
Pass 1: parsed user script and 53 library script(s) in 670usr/20sys/692real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 global(s) in 10usr/0sys/11real ms.
Using cached result "/home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko" as "/tmp/stapJFisJO/stap_d833fd040735ddde57a23bebb4456542_201.ko"
Pass 5: starting run.
hi
Pass 5: run completed in 0usr/40sys/1656real ms.
=====

> The timing info doesn't need to be hardcoded zero, I just expect it
> would be very small.
> 
> Side question - do you still use caching when someone calls '-p3' or
> '-p4'?  And with verbosity increased, what would this output, given that
> you're not actually doing the work?  (e.g., you wouldn't have a compiler
> output on a cached pass-4.)

A module only gets added to the cache if pass 4 completes successfully. 
  A module (currently) gets pulled from the cache after pass 2, even if 
'-p3' or '-p4' was specified on the command line.
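
The flow described above can be sketched roughly as follows.  This is a toy model under stated assumptions, not the code in the patch: the cache is reduced to a set of hashes, the names toy_cache and run_passes are made up, and the '-p3'/'-p4' stop points are left out.

```cpp
#include <set>
#include <string>

// Toy stand-in for the on-disk cache: just the set of hashes that have
// a stored module.  (Hypothetical; the real cache stores .ko and .c files.)
struct toy_cache
{
  std::set<std::string> hashes;
};

// Return the passes that actually run, as a string of digits.
std::string
run_passes(toy_cache& cache, const std::string& hash, bool pass4_ok)
{
  std::string ran = "12";        // parsing and elaboration always run

  // The lookup happens after pass 2, once the hash input is known.
  if (cache.hashes.count(hash))
    return ran + "5";            // skip translation and compilation

  ran += "34";                   // translate to C, then compile
  if (!pass4_ok)
    return ran;                  // failed build: nothing is cached

  cache.hashes.insert(hash);     // only a successful pass 4 is cached
  return ran + "5";
}
```

In this model the first successful run of a script goes through passes 1 to 5 and seeds the cache, a repeat run does passes 1, 2, and 5, and a run whose compilation fails leaves the cache untouched.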

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


* Re: precompiled probing scenarios
  2006-10-20 19:26 ` David Smith
@ 2006-10-20 19:32   ` Frank Ch. Eigler
  2006-10-20 19:50     ` David Smith
  0 siblings, 1 reply; 19+ messages in thread
From: Frank Ch. Eigler @ 2006-10-20 19:32 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

Hi -

On Fri, Oct 20, 2006 at 02:26:32PM -0500, David Smith wrote:
> [...]
> Pass 4: compiled C into "stap_d833fd040735ddde57a23bebb4456542_201.ko" 

Is that "_201" at the end the getuid()?  I wonder if maybe that could
be hashed into the content instead.

> Copying /tmp/stapYrsXWC/stap_d833fd040735ddde57a23bebb4456542_201.ko to 
> /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko
> Copying /tmp/stapYrsXWC/stap_d833fd040735ddde57a23bebb4456542_201.c to 
> /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.c

These messages should probably show up only under -vv (verbosity 2)
or higher.

> [...]
> # stap -v -e 'probe begin { log("hi") }' [...]
> Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 global(s) in 
> 10usr/0sys/11real ms.
> Using cached result 
> "/home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko" 
> as "/tmp/stapJFisJO/stap_d833fd040735ddde57a23bebb4456542_201.ko"

This "Using cached result" part could be rephrased as a pair:

  Pass 3: skipped translation
  Pass 4: reused cached module stap_deadbeef

> [...]  A module (currently) gets pulled from the cache after pass 2,
> even if '-p3' or '-p4' was specified on the command line.

The translator would still need to spit out (regenerate) the C source
code for a -p3 run.

- FChE


* Re: precompiled probing scenarios
  2006-10-20 19:32   ` Frank Ch. Eigler
@ 2006-10-20 19:50     ` David Smith
  2006-10-20 20:13       ` Frank Ch. Eigler
  0 siblings, 1 reply; 19+ messages in thread
From: David Smith @ 2006-10-20 19:50 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Frank Ch. Eigler wrote:
> Hi -
> 
> On Fri, Oct 20, 2006 at 02:26:32PM -0500, David Smith wrote:
>> [...]
>> Pass 4: compiled C into "stap_d833fd040735ddde57a23bebb4456542_201.ko" 
> 
> Is that "_201" at the end the getuid()?  I wonder if maybe that could
> be hashed into the content instead.

Nope, that is part of the hash.  It might be the number of input bits, 
I'm unsure.

I don't hash the getuid(), since:
- by default your cache is stored in your home directory
- who you are doesn't change the pass 2-4 output

>> Copying /tmp/stapYrsXWC/stap_d833fd040735ddde57a23bebb4456542_201.ko to 
>> /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko
>> Copying /tmp/stapYrsXWC/stap_d833fd040735ddde57a23bebb4456542_201.c to 
>> /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.c
> 
> These messages should probably show up only under -vv (verbosity 2)
> or higher.

Easy enough to fix.  Currently they only show up under '-v'.

>> [...]
>> # stap -v -e 'probe begin { log("hi") }' [...]
>> Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 global(s) in 
>> 10usr/0sys/11real ms.
>> Using cached result 
>> "/home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko" 
>> as "/tmp/stapJFisJO/stap_d833fd040735ddde57a23bebb4456542_201.ko"
> 
> This "Using cached result" part could be rephrased as a pair:
> 
>   Pass 3: skipped translation
>   Pass 4: reused cached module stap_deadbeef
> 
>> [...]  A module (currently) gets pulled from the cache after pass 2,
>> even if '-p3' or '-p4' was specified on the command line.
> 
> The translator would still need to spit out (regenerate) the C source
> code for a -p3 run.

Nope.  I actually cache the module and the C file so I can skip pass 3 
but still preserve semantics.

Hmm.  How about something like:

   Pass 3: Using cached /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.c
   Pass 4: Using cached /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


* Re: precompiled probing scenarios
  2006-10-20 19:50     ` David Smith
@ 2006-10-20 20:13       ` Frank Ch. Eigler
  2006-10-23 20:36         ` David Smith
  0 siblings, 1 reply; 19+ messages in thread
From: Frank Ch. Eigler @ 2006-10-20 20:13 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

Hi -

dsmith wrote:
> [...]
> I don't hash the getuid(), since:
> - by default your cache is stored in your home directory
> - who you are doesn't change the pass 2-4 output

That's all true.  One possible reason for including getuid() anyway is
so that two different sudo-empowered people can run the same script at
the same time without one having to disable his cache.

> Nope.  I actually cache the module and the C file so I can skip pass 3 
> but still preserve semantics.

OK.

> Hmm.  How about something like:
>   Pass 3: Using cached 
> /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.c
>   Pass 4: Using cached 
> /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko

Works for me.  (It may be a worthwhile minor simplification to use
one-byte /d8/ as a partitioning subdirectory rather than two-nibble
/d/8/ scheme.  256 files in a single directory are well handled in our
filesystems.)

- FChE


* Re: precompiled probing scenarios
  2006-10-20 20:13       ` Frank Ch. Eigler
@ 2006-10-23 20:36         ` David Smith
  0 siblings, 0 replies; 19+ messages in thread
From: David Smith @ 2006-10-23 20:36 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Frank Ch. Eigler wrote:
> Hi -
> 
> dsmith wrote:
>> [...]
>> I don't hash the getuid(), since:
>> - by default your cache is stored in your home directory
>> - who you are doesn't change the pass 2-4 output
> 
> That's all true.  One possible reason for including getuid() anyway is
> so that two different sudo-empowered people can run the same script at
> the same time without one having to disable his cache.

Ah - makes sense, I've added this.

>> Nope.  I actually cache the module and the C file so I can skip pass 3 
>> but still preserve semantics.
> 
> OK.
> 
>> Hmm.  How about something like:
>>   Pass 3: Using cached 
>> /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.c
>>   Pass 4: Using cached 
>> /home/dsmith/.stap_cache/d/8/stap_d833fd040735ddde57a23bebb4456542_201.ko
> 
> Works for me.

OK, I've implemented this.

> (It may be a worthwhile minor simplification to use
> one-byte /d8/ as a partitioning subdirectory rather than two-nibble
> /d/8/ scheme.  256 files in a single directory are well handled in our
> filesystems.)

I did it this way because ccache did it this way.  My sense is that it 
was done this way to make sure the directories with cached files didn't 
end up too large (not the partition directories themselves).  Of course 
ccache's problem is a bit worse than ours - they cache 3 files for every 
.o file (the .o file itself, gcc stdout, and gcc stderr) while we only 
cache 2 files per .ko (the .ko itself and the associated .c file).

(Also note that the number of levels is configurable.  The default is 2, 
but setting the SYSTEMTAP_NLEVELS environment variable lets you set it 
from 1 to 8.)
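
For illustration, here is a minimal Python sketch of the ccache-style
layout described above: each of the first N hex digits of the hash
becomes one directory level.  The function names are hypothetical, and
falling back to the default when SYSTEMTAP_NLEVELS is out of range is an
assumption of the sketch, not something stated in this thread.

```python
import os
import posixpath

DEFAULT_NLEVELS = 2  # default stated in the message above


def nlevels_from_env(environ=os.environ):
    """Read SYSTEMTAP_NLEVELS, accepting only values from 1 to 8.

    Falling back to the default on bad input is an assumption of this sketch.
    """
    raw = environ.get("SYSTEMTAP_NLEVELS", "")
    try:
        n = int(raw)
    except ValueError:
        return DEFAULT_NLEVELS
    return n if 1 <= n <= 8 else DEFAULT_NLEVELS


def cache_path(cache_dir, hash_hex, filename, nlevels=DEFAULT_NLEVELS):
    """Place filename under one directory per leading hex digit of the hash."""
    levels = list(hash_hex[:nlevels])
    return posixpath.join(cache_dir, *levels, filename)
```

With nlevels=2 and a hash starting with "d8", a file lands under
<cache_dir>/d/8/, which matches the paths shown in the "Using cached"
messages quoted above.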

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

* RE: precompiled probing scenarios
@ 2006-10-20 20:51 Stone, Joshua I
  0 siblings, 0 replies; 19+ messages in thread
From: Stone, Joshua I @ 2006-10-20 20:51 UTC (permalink / raw)
  To: Frank Ch. Eigler, David Smith; +Cc: systemtap

On Friday, October 20, 2006 1:13 PM, Frank Ch. Eigler wrote:
> Hi -
> 
> dsmith wrote:
>> [...]
>> I don't hash the getuid(), since:
>> - by default your cache is stored in your home directory
>> - who you are doesn't change the pass 2-4 output
> 
> That's all true.  One possible reason for including getuid() anyway is
> so that two different sudo-empowered people can run the same script at
> the same time without one having to disable his cache.

Also, uid isn't always in sync with $HOME.  Besides people doing weird
things tweaking it, a simple example is that sudo may or may not change
$HOME to ~root.
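
A rough Python sketch of why mixing the uid into the hash helps: two
users sharing one $HOME (say, via sudo) still get distinct cache
entries.  The function name, the choice of SHA-256, and the exact set of
inputs are assumptions for illustration; the thread does not say which
digest or field order the real code uses.

```python
import hashlib


def cache_key(pass2_output, kernel_release, arch, uid):
    """Hash the translator output together with environmental parameters."""
    h = hashlib.sha256()
    for part in (pass2_output, kernel_release, arch, str(uid)):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so adjacent fields cannot run together
    return h.hexdigest()
```

Calling cache_key() with the same script, kernel and architecture but
two different uids yields two different keys, so neither user has to
disable his cache.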

Josh

* RE: precompiled probing scenarios
@ 2006-10-24  0:29 Stone, Joshua I
  2006-10-24 15:16 ` David Smith
  0 siblings, 1 reply; 19+ messages in thread
From: Stone, Joshua I @ 2006-10-24  0:29 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

I saw that you checked in the caching code, so I finally got around to
trying it.  :)

For the most part, it seems to work really nicely.  The caching is
essentially transparent, which makes for a positive experience when your
scripts start up faster.

There's a few trials I did though where caching opportunities were
missed.  I'll admit freely that these are perhaps too nitpicky, so we
can treat it as a low-priority enhancement.

1. probe begin { exit() }
2. probe begin { exit(); }
3. probe begin { exit() a=1 }

2 and 3 actually hash the same, since elision turns 'a=1' into an empty
statement (';').  We should be able to tell that all three are the
same, but since the pass-2 output leaves in all semicolons, the hash of
1 differs from that of the other two.  It ought to be pretty easy to
normalize empty statements away, so minor differences like this don't
matter.
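
As a toy illustration of that normalization (Python, not translator
code): treat a probe body as a flat list of statement strings and drop
the empty ones before hashing.  The list-of-strings representation and
both function names are made up for the example.

```python
import hashlib


def normalize(statements):
    """Drop empty statements (';' or blank) from a list of statement strings."""
    return [s.strip() for s in statements if s.strip() not in ("", ";")]


def body_hash(statements):
    """Hash the normalized body so empty statements do not affect the key."""
    text = "\n".join(normalize(statements))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Here body_hash(["exit()"]) and body_hash(["exit()", ";"]) come out
equal, which is the behaviour wanted for scripts 1 and 2.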

A harder scenario to address is this:

4. probe begin, end { exit() }
5. probe end, begin { exit() }

Again, with some fancy normalization, we should be able to identify
these as equal.  And actually, the seemingly-unrelated work in
probe-grouping would probably help here, if the pass-2 output were
ordered in a deterministic manner.  Stap already does some of this,
e.g., by ordering functions before probes.
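
A similarly rough Python sketch of the ordering idea: sort the probe
points of a probe into a canonical order before hashing, so that
"begin, end" and "end, begin" produce the same key.  The
(points, body) representation is hypothetical, and the sketch assumes
that the order of probe points carries no meaning.

```python
def canonical_probe(points, body):
    """Return a probe with its probe points in sorted, deterministic order."""
    return (tuple(sorted(points)), body)
```

With this, canonical_probe(["begin", "end"], "exit()") and
canonical_probe(["end", "begin"], "exit()") are the same tuple.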

It's likely rare that the differences between scripts will be so small,
so these optimizations may not matter.  But if anyone's bored, or has an
intern with nothing to do, this may be a simple enhancement.

Josh

* Re: precompiled probing scenarios
  2006-10-24  0:29 Stone, Joshua I
@ 2006-10-24 15:16 ` David Smith
  0 siblings, 0 replies; 19+ messages in thread
From: David Smith @ 2006-10-24 15:16 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: systemtap

Stone, Joshua I wrote:
> I saw that you checked in the caching code, so I finally got around to
> trying it.  :)
>
> For the most part, it seems to work really nicely.  The caching is
> essentially transparent, which makes for a positive experience when your
> scripts start up faster.

Thanks for trying it.  Yep, it's in.  Hopefully no one else has noticed
because it is fairly transparent (except perhaps for your growing 
~/.systemtap/cache directory).

The faster script startup is really nice, especially on slower hardware 
(like my test box - a 1GHz P3).

Here's an extreme example.  stap -p4 testsuite/systemtap.stress/sys.stp 
takes 0:06:18 uncached and only 0:00:07 cached.

A full "make installcheck" run takes 0:41:05 uncached and 0:15:39 cached.
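
To put those timings in perspective, a small Python helper (purely
illustrative, names made up) that converts the H:MM:SS figures above
into speedup ratios:

```python
def seconds(hms):
    """Convert an 'H:MM:SS' string to a number of seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s


def speedup(uncached, cached):
    """Ratio of uncached to cached wall-clock time."""
    return seconds(uncached) / seconds(cached)
```

speedup("0:06:18", "0:00:07") is 378 s / 7 s = 54, and
speedup("0:41:05", "0:15:39") is 2465 s / 939 s, or about 2.6.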

BTW, a "make installcheck" should work correctly, cached or uncached. 
Next on my todo list is adding cache tests.

> There's a few trials I did though where caching opportunities were
> missed.  I'll admit freely that these are perhaps too nitpicky, so we
> can treat it as a low-priority enhancement.
> 
> 1. probe begin { exit() }
> 2. probe begin { exit(); }
> 3. probe begin { exit() a=1 }
> 
> 2 and 3 actually hash the same, since elision turns 'a=1' into an empty
> statement (';').  We should be able to tell that these are all the
> same, but since the pass-2 output leaves in all semi-colons, the hash is
> different.  It ought to be pretty easy to normalize empty statements
> away, so minor differences like this don't matter.
> 
> A harder scenario to address is this:
> 
> 4. probe begin, end { exit() }
> 5. probe end, begin { exit() }
> 
> Again, with some fancy normalization, we should be able to identify
> these as equal.  And actually, the seemingly-unrelated work in
> probe-grouping would probably help here, if the pass-2 output were
> ordered in a deterministic manner.  Stap already does some of this,
> e.g., by ordering functions before probes.
> 
> It's likely rare that the differences between scripts will be so small,
> so these optimizations may not matter.  But if anyone's bored, or has an
> intern with nothing to do, this may be a simple enhancement.

Hmm.  Just for fun, I decided to see if the pass 3 output of [1. 2.] or 
[4. 5.] would compare equally.  They don't.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

* RE: precompiled probing scenarios
@ 2006-10-25 18:54 Stone, Joshua I
  2006-10-26  1:07 ` Frank Ch. Eigler
  0 siblings, 1 reply; 19+ messages in thread
From: Stone, Joshua I @ 2006-10-25 18:54 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

On Tuesday, October 24, 2006 8:17 AM, David Smith wrote:
> Stone, Joshua I wrote:
>> 1. probe begin { exit() }
>> 2. probe begin { exit(); }
>> 
>> 4. probe begin, end { exit() }
>> 5. probe end, begin { exit() }
> 
> Hmm.  Just for fun, I decided to see if the pass 3 output of [1. 2.]
> or [4. 5.] would compare equally.  They don't.

That's partly my point.  Those pairings are functionally equivalent,
right?  So why should the code we generate show any differences?

The difference between 1 & 2 is basically just the line "/* null */;" in
the probe's generated C -- a useless statement.  Between 4 & 5 the only
difference is whether the begin or end is generated first.  Because
these differences are insignificant, we should be able to treat them the
same for caching purposes.

Another normalization example is with braces:

6. probe begin { if(foo) exit() }
7. probe begin { if(foo) { exit() } }
8. probe begin { if(foo) { { exit() } } }
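
A toy Python sketch of brace normalization, using a made-up tree
representation (a block is a Python list, an if-statement is the tuple
("if", condition, statement), anything else is a plain string).  It
unwraps any block that holds exactly one statement; none of this is
taken from the translator's real data structures.

```python
def flatten(stmt):
    """Unwrap blocks that contain exactly one statement, recursively."""
    if isinstance(stmt, list):
        inner = [flatten(s) for s in stmt]
        return inner[0] if len(inner) == 1 else inner
    if isinstance(stmt, tuple) and stmt[0] == "if":
        _, cond, then = stmt
        return ("if", cond, flatten(then))
    return stmt
```

Under this representation, scripts 6, 7 and 8 all flatten to
("if", "foo", "exit()").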

My hope is that someday the translator will also treat less obvious
cases like these as identical:

9.  probe begin { log("foo") }
10. probe begin { if(1) log("foo") }
11. probe begin { i=1; if(i) log("foo") }
12. probe begin { while(!i++) log("foo") }

This has more to do with optimization of the generated code, but it
could help caching if the optimization is done before pass-2 output.

Josh

* Re: precompiled probing scenarios
  2006-10-25 18:54 Stone, Joshua I
@ 2006-10-26  1:07 ` Frank Ch. Eigler
  0 siblings, 0 replies; 19+ messages in thread
From: Frank Ch. Eigler @ 2006-10-26  1:07 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: David Smith, systemtap

"Stone, Joshua I" <joshua.i.stone@intel.com> writes:

> [...]
> That's partly my point.  Those pairings are functionally equivalent,
> right?  So why should the code we generate show any differences?
> [...]
> My hope is that someday the translator will also treat less obvious
> cases like these as identical: [...]

While none of these is a bad idea, I see little practical necessity
for normalization.  How many nearly-identical scripts do we ship?  How
many normalizably-identical scripts do people actually use?  If the
answer is "not many" or even "none", I wouldn't worry about it.

- FChE

