public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
From: William Cohen <wcohen@redhat.com>
To: "Frank Ch. Eigler" <fche@redhat.com>
Cc: systemtap@sources.redhat.com
Subject: Re: Where stap spends its time producing instrumentation for testsuite/systemtap.syscall/sys.stp
Date: Thu, 04 Oct 2007 14:17:00 -0000	[thread overview]
Message-ID: <4704F602.1030704@redhat.com> (raw)
In-Reply-To: <y0mlkajsght.fsf@ton.toronto.redhat.com>

Hi Frank,

Thanks for the improvements. I reran the same experiment from
Wednesday on sys.stp with a fresh cvs checkout of systemtap. There are
improvements in the pass 2 and pass 4 times. Pass 2 dropped form 39.7
seconds to 29.7 seconds user time. Pass 4 times dropped from 34.1
seconds to 26.4 seconds. Definitely, a trend in ht ecorrect direction.
The __libdw_find_attr and __libdw_form_val_len functions still
dominate the runtimes. The elfutils-dwfl caching is going to improve
that situation?

-Will



$  sudo opcontrol --start; /home/wcohen/systemtap_write/install/bin/stap -p4 -v 
/home/wcohen/systemtap_write/src/testsuite/systemtap.syscall/sys.stp -m xyz -k; 
sudo opcontrol --shutdown
Using 2.6+ OProfile kernel interface.
Reading module info.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
Profiler running.
Warning: using '-m' disables cache support.
Pass 1: parsed user script and 37 library script(s) in 360usr/40sys/423real ms.
Pass 2: analyzed script: 768 probe(s), 108 function(s), 14 embed(s), 4 global(s) 
in 29740usr/1590sys/38374real ms.
Pass 3: translated to C into "/tmp/stapJKUmBF/xyz.c" in 150usr/110sys/256real ms.

Pass 4: compiled C into "xyz.ko" in 26350usr/1710sys/30719real ms.
Keeping temporary directory "/tmp/stapJKUmBF"
Stopping profiling.
Killing daemon.

The size of generated code also dropped significantly from 120K lines
to 91K lines of code. The resulting module is roughly half the size
for the text region.

$ wc /tmp/stapJKUmBF/xyz.c
   91259  227560 2586615 /tmp/stapJKUmBF/xyz.c
$ size /tmp/stapJKUmBF/xyz.ko
    text    data     bss     dec     hex filename
  529327  169408    4816  703551   abc3f /tmp/stapJKUmBF/xyz.ko


The number of samples for stap dropped quite a bit. Most of these
appear to be areduction in the stap code itself. For the component
breakdown of stap, the stap code was about 32% of the total samples
for stap. It is now around 11% fo the samples. The compiler cc1 does
less work also.


$ opreport -t1
CPU: AMD64 processors, speed 800 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask 
of 0x00 (No unit mask) count 1000000
Counted RETIRED_INSTRUCTIONS events (Retired instructions (includes exceptions, 
interrupts, re-syncs)) with a unit mask of 0x00 (No unit mask) count 1000000
CPU_CLK_UNHALT...|RETIRED_INSTRU...|
   samples|      %|  samples|      %|
------------------------------------
     56465 48.0868     64383 53.5976 stap
         CPU_CLK_UNHALT...|RETIRED_INSTRU...|
           samples|      %|  samples|      %|
         ------------------------------------
             32131 56.9043     37883 58.8401 libdw.so.1
             11497 20.3613     12154 18.8777 libc-2.6.90.so
              6431 11.3894      5939  9.2245 stap
              5233  9.2677      6436  9.9964 libstdc++.so.6.0.8
              1129  1.9995      1961  3.0458 libelf.so.1
                32  0.0567         6  0.0093 ld-2.6.90.so
                 9  0.0159         4  0.0062 libpthread-2.6.90.so
                 2  0.0035         0       0 libebl_x86_64.so
                 1  0.0018         0       0 libdl-2.6.90.so
     48520 41.3207     47267 39.3488 cc1
         CPU_CLK_UNHALT...|RETIRED_INSTRU...|
           samples|      %|  samples|      %|
         ------------------------------------
             45011 92.7679     43950 92.9824 cc1
              3502  7.2176      3313  7.0091 libc-2.6.90.so
                 7  0.0144         4  0.0085 ld-2.6.90.so
      9456  8.0529      6286  5.2330 vmlinux
      1869  1.5917      1635  1.3611 as
         CPU_CLK_UNHALT...|RETIRED_INSTRU...|
           samples|      %|  samples|      %|
         ------------------------------------
              1325 70.8935      1062 64.9541 as
               501 26.8058       539 32.9664 libc-2.6.90.so
                34  1.8192        31  1.8960 libbfd-2.17.50.0.18-1.so
                 8  0.4280         3  0.1835 ld-2.6.90.so
                 1  0.0535         0       0 libopcodes-2.17.50.0.18-1.so


The two hotspot functions still exist. Due to the reduction in other
parts of of stap now have a larger portion of the samples.



$ opreport  -l -t1 /home/wcohen/systemtap_write/install/bin/stap
CPU: AMD64 processors, speed 800 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask 
of 0x00 (No unit mask) count 1000000
Counted RETIRED_INSTRUCTIONS events (Retired instructions (includes exceptions, 
interrupts, re-syncs)) with a unit mask of 0x00 (No unit mask) count 1000000
samples  %        samples  %        image name               symbol name
13316    23.5828  13444    20.8813  libdw.so.1               __libdw_find_attr
12501    22.1394  17166    26.6623  libdw.so.1               __libdw_form_val_len
4770      8.4477  3806      5.9115  libc-2.6.90.so           strlen
1773      3.1400  2930      4.5509  libdw.so.1               dwfl_module_getsym
1581      2.8000  1722      2.6746  stap 
query_dwarf_func(Dwarf_Die*, void*)
1413      2.5024  1350      2.0968  libdw.so.1               dwarf_attr
1361      2.4103  2306      3.5817  libstdc++.so.6.0.8 
std::string::assign(std::string const&)
1154      2.0437  247       0.3836  stap 
std::_Rb_tree<std::string, std::pair<std::string const, std::vector<Dwarf_Die, 
std::allocator<Dwarf_Die> >*>, std::_Select1st<std::pair<std::string const, 
std::vector<Dwarf_Die, std::allocator<Dwarf_Die> >*> >, std::less<std::string>, 
std::allocator<std::pair<std::string const, std::vector<Dwarf_Die, 
std::allocator<Dwarf_Die> >*> > >::lower_bound(std::string const&)
1143      2.0243  1640      2.5473  libc-2.6.90.so           memcpy
1106      1.9587  1957      3.0396  libelf.so.1              gelf_getsymshndx
1039      1.8401  977       1.5175  libc-2.6.90.so           _int_free
1038      1.8383  1201      1.8654  stap 
query_cu(Dwarf_Die*, void*)
1027      1.8188  1160      1.8017  libc-2.6.90.so           _int_malloc
974       1.7250  1117      1.7349  libc-2.6.90.so           internal_fnmatch
733       1.2981  1118      1.7365  libc-2.6.90.so           strcmp
718       1.2716  166       0.2578  stap 
symresolution_info::find_function(std::string const&, unsigned int)
683       1.2096  1195      1.8561  stap 
lookup_symbol_address(Dwfl_Module*, char const*)
644       1.1405  532       0.8263  libc-2.6.90.so           free
644       1.1405  581       0.9024  libdw.so.1               dwarf_func_inline
612       1.0839  1145      1.7784  libc-2.6.90.so           malloc
598       1.0591  370       0.5747  libdw.so.1               lookup

      reply	other threads:[~2007-10-04 14:17 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-03 19:02 William Cohen
2007-10-04  3:31 ` Frank Ch. Eigler
2007-10-04 14:17   ` William Cohen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4704F602.1030704@redhat.com \
    --to=wcohen@redhat.com \
    --cc=fche@redhat.com \
    --cc=systemtap@sources.redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).