public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* get the mapping of <key, syscallname>
@ 2006-05-11  2:05 Li Guanglei
  2006-05-11 18:14 ` Martin Hunt
  2006-05-12 15:08 ` Frank Ch. Eigler
  0 siblings, 2 replies; 10+ messages in thread
From: Li Guanglei @ 2006-05-11  2:05 UTC (permalink / raw)
  To: systemtap

Hi,
  One of my tapsets will log the syscall name got from
CONTEXT->probe_point. But string printing is costly so I want to log
the key of syscall name, and later the post-processing app in user
space will figure out the syscall name represented by corresponding keys.
  But this requires to generate a mapping of pairs of <key,
syscallname> at probe begin. The CONTEXT->probe_point is stored in
dwarf_kprobe_probe_??_location_names, so I can use the address of each
syscall name as the key and in the probe handler I can simply log the
pointer instead of a string. But since there are many probe points
defined beside syscall in my tapsets, I can't figure out which
dwarf_kprobe_probe_.._location_names variable the syscall name is stored.
  Do you have any ideas about this? Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: get the mapping of <key, syscallname>
  2006-05-11  2:05 get the mapping of <key, syscallname> Li Guanglei
@ 2006-05-11 18:14 ` Martin Hunt
  2006-05-12 15:08 ` Frank Ch. Eigler
  1 sibling, 0 replies; 10+ messages in thread
From: Martin Hunt @ 2006-05-11 18:14 UTC (permalink / raw)
  To: Li Guanglei; +Cc: systemtap

On Thu, 2006-05-11 at 10:04 +0800, Li Guanglei wrote:
> Hi,
>   One of my tapsets will log the syscall name got from
> CONTEXT->probe_point. But string printing is costly so I want to log
> the key of syscall name, and later the post-processing app in user
> space will figure out the syscall name represented by corresponding keys.

string printing is more expensive than it should be because the
translator emits unoptimized code that copies them too often. That will
probably be fixed in the future. Still, I wouldn't call it "costly".
Writing a binary key instead will save you little for what may be a
temporary advantage. I advise waiting on this.

Martin


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: get the mapping of <key, syscallname>
  2006-05-11  2:05 get the mapping of <key, syscallname> Li Guanglei
  2006-05-11 18:14 ` Martin Hunt
@ 2006-05-12 15:08 ` Frank Ch. Eigler
  2006-05-12 16:06   ` Jose R. Santos
  1 sibling, 1 reply; 10+ messages in thread
From: Frank Ch. Eigler @ 2006-05-12 15:08 UTC (permalink / raw)
  To: Li Guanglei; +Cc: systemtap


Li Guanglei <guanglei@cn.ibm.com> writes:

> One of my tapsets will log the syscall name got from
> CONTEXT->probe_point. [...]

Why not use the "name" variable set by the syscalls tapset?

> [...] I want to log the key of syscall name, and later the
> post-processing app in user space will figure out the syscall name
> represented by corresponding keys. [...]

As does Martin, I wonder if this is a good place to spend your
optimization energies.  Anyway, one possibility is to have the
syscalls tapset also define a system call number variable.

- FChE

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: get the mapping of <key, syscallname>
  2006-05-12 15:08 ` Frank Ch. Eigler
@ 2006-05-12 16:06   ` Jose R. Santos
  2006-05-12 16:12     ` Frank Ch. Eigler
  0 siblings, 1 reply; 10+ messages in thread
From: Jose R. Santos @ 2006-05-12 16:06 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Li Guanglei, systemtap

Frank Ch. Eigler wrote:

>Li Guanglei <guanglei@cn.ibm.com> writes:
>
>> One of my tapsets will log the syscall name got from
>> CONTEXT->probe_point. [...]
>
>Why not use the "name" variable set by the syscalls tapset?
>  
>
LKET does not currently use the syscall tapset because it doesnt use the 
extra information that this tapset collects from each systemcall.  In 
order to improve performance, we keep the system call trace as simple as 
possible.

>> [...] I want to log the key of syscall name, and later the
>> post-processing app in user space will figure out the syscall name
>> represented by corresponding keys. [...]
>
>As does Martin, I wonder if this is a good place to spend your
>optimization energies.  Anyway, one possibility is to have the
>syscalls tapset also define a system call number variable.
>  
>

Its not just the string printing that's costly, its also searching the 
system call name with in CONTEXT->probe_point.  Avoiding searching and 
printing of the syscall string can provide between 3-6% performance 
improvements on SPECjbb results.

>- FChE
>  
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: get the mapping of <key, syscallname>
  2006-05-12 16:06   ` Jose R. Santos
@ 2006-05-12 16:12     ` Frank Ch. Eigler
  2006-05-12 16:40       ` Jose R. Santos
  2006-05-13  7:18       ` Li Guanglei
  0 siblings, 2 replies; 10+ messages in thread
From: Frank Ch. Eigler @ 2006-05-12 16:12 UTC (permalink / raw)
  To: Jose R. Santos; +Cc: systemtap

Hi -

> > Why not use the "name" variable set by the syscalls tapset?
> 
> LKET does not currently use the syscall tapset because it doesnt use the 
> extra information that this tapset collects from each systemcall.  [...]

But, due to the optimization pass in the translator, only that portion
of the information actually *used* by your script is actually
collected.  Please try it out (check out stap -p2 or -p3 output to
confirm).

> [...]  Its not just the string printing that's costly, its also
> searching the system call name with in CONTEXT->probe_point.  [...]

Rolling your own code to manually scan through "probe_point" is not
really necessary or appropriate.  There is a "probefunc()" function
you can use for now.

- FChE

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: get the mapping of <key, syscallname>
  2006-05-12 16:12     ` Frank Ch. Eigler
@ 2006-05-12 16:40       ` Jose R. Santos
  2006-05-13  7:18       ` Li Guanglei
  1 sibling, 0 replies; 10+ messages in thread
From: Jose R. Santos @ 2006-05-12 16:40 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Frank Ch. Eigler wrote:

>Hi -
>
>> > Why not use the "name" variable set by the syscalls tapset?
>> 
>> LKET does not currently use the syscall tapset because it doesnt use the 
>> extra information that this tapset collects from each systemcall.  [...]
>
>But, due to the optimization pass in the translator, only that portion
>of the information actually *used* by your script is actually
>collected.  Please try it out (check out stap -p2 or -p3 output to
>confirm).
>  
>
I seem to remember that there was another reason why Li Guanglei did not 
use the systemcall tapset but I cant remember why.  If the performance 
is good, then adding the systemcall id number variable will solve this 
problem.  We'll, try it out.

>> [...]  Its not just the string printing that's costly, its also
>> searching the system call name with in CONTEXT->probe_point.  [...]
>
>Rolling your own code to manually scan through "probe_point" is not
>really necessary or appropriate.  There is a "probefunc()" function
>you can use for now.
>  
>
I agree.  Thats why I would like to see LKET as part of the SystemTap 
tapset in order to have more people reviewing some of the stupid design 
decisions we make. :)

>- FChE
>  
>

-JRS

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: get the mapping of <key, syscallname>
  2006-05-12 16:12     ` Frank Ch. Eigler
  2006-05-12 16:40       ` Jose R. Santos
@ 2006-05-13  7:18       ` Li Guanglei
  2006-05-14 20:26         ` Frank Ch. Eigler
  2006-05-15 16:39         ` Hien Nguyen
  1 sibling, 2 replies; 10+ messages in thread
From: Li Guanglei @ 2006-05-13  7:18 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Jose R. Santos, systemtap

Frank Ch. Eigler ??:
> Hi -
> 
>>> Why not use the "name" variable set by the syscalls tapset?
>> LKET does not currently use the syscall tapset because it doesnt use the 
>> extra information that this tapset collects from each systemcall.  [...]
> 
> But, due to the optimization pass in the translator, only that portion
> of the information actually *used* by your script is actually
> collected.  Please try it out (check out stap -p2 or -p3 output to
> confirm).
> 
I tried and found it works really well. It compiles away all the 
unnecessary codes.

But it still has one issue. e.g, I tried to use:
probe syscall.* { ... } to trace all system calls on ppc64. But it 
failed since sys64_time(defined in SystemTap's ppc64/syscall.stp) is 
removed from  2.6.16.5/ppc64. I think this issues has been addressed 
in bug 2645.

If this bug #2645 is resolved , then I think I can use the syscall.stp 
tapsets shipped with SystemTap. Another advantage of using it is that 
I can simply use the $name instead of getting it using lines of codes.

But one disadvantage is that if using "syscall.*", then every syscall 
will use a different function to be its probe handler. It causes the 
generated much larger and compiling time much longer compared with 
current LKET's syscall tapsets:

/usr/local/share/systemtap/tapset> ll -h /tmp/stap6HKAzF/*
   127 May 13 07:36 /tmp/stap6HKAzF/Makefile
  2.0M May 13 07:36 /tmp/stap6HKAzF/stap_5910.c
  4.1M May 13 07:37 /tmp/stap6HKAzF/stap_5910.ko
  2.8K May 13 07:37 /tmp/stap6HKAzF/stap_5910.mod.c
   60K May 13 07:37 /tmp/stap6HKAzF/stap_5910.mod.o
  4.0M May 13 07:37 /tmp/stap6HKAzF/stap_5910.o
  712K May 13 07:36 /tmp/stap6HKAzF/stap-symbols.h
  518K May 13 07:36 /tmp/stap6HKAzF/symbols.sorted

>> [...]  Its not just the string printing that's costly, its also
>> searching the system call name with in CONTEXT->probe_point.  [...]
> 
> Rolling your own code to manually scan through "probe_point" is not
> really necessary or appropriate.  There is a "probefunc()" function
> you can use for now.
> 
I once used the code of probefunc() inside log_syscall_extra() 
function. But during the performance analysis of LKET, I found it's a 
little expensive considered that log_syscall_extra() will be called 
very frequently. So I change to use these ugly manual written codes 
for a better performance.

I think use SystemTap as a trace tool cares much about the performance 
considered that it will be used in a heavy loaded system and a slight 
performance differece will add up to a big one. But probefunc() is not 
a problem if SystemTap is used as a diagnosis purpose.

Below is the email I sent out internally on this. I used specjbb as 
the probed application.

 >>> FYI:

   As discovered in the former analysis of LKET, strpbrk() is one of
the most costly operations. It is a kernel function and is called by
strsep() to get the current syscall name.

   I use another way to get the syscall name:

        ptr = CONTEXT->probe_point;

         while(*ptr++ != '"');

         do  {
                 buffer[i++] = *ptr++;
         }  while(*ptr != '@');

         buffer[i] = '\0';

   And I did some testings to compare the new method and the old one. I
also use get_cycles() to get the cycles consumed by these two method.

================= old method using strsep() ================

OProfile:

samples  %        app name                 symbol name
4239923  67.2099  java                     (no symbols)
207747    3.2931  lglzy                    ._stp_vsnprintf
109242    1.7317  vmlinux-2.6.16.5.debug   .strpbrk
58739     0.9311  libc-2.3.4.so            memset
58344     0.9249  lglzy                    .dwarf_kprobe_1_enter
56300     0.8924  libjvm.so                fastLocalMark
55565     0.8808  libjvm.so               cacheAllocWithInitialization
54983     0.8716  vmlinux-2.6.16.5.debug   restore
54123     0.8579  vmlinux-2.6.16.5.debug   .kernel_unknown_bucket
50853     0.8061  libjitc.so               sd9

get_cycles(): ~165

================= new method ======================


OProfile:

samples  %        app name                 symbol name
4373512  69.2051  java                     (no symbols)
211307    3.3437  lglzy                    ._stp_vsnprintf
59605     0.9432  lglzy                .function_log_syscall_tracedata
58020     0.9181  libc-2.3.4.so            memset
57804     0.9147  vmlinux-2.6.16.5.debug   restore
56514     0.8943  libjvm.so                fastLocalMark
55085     0.8716  libjvm.so              cacheAllocWithInitialization
53948     0.8537  lglzy                    .dwarf_kprobe_1_enter
50175     0.7940  libjitc.so               sd9
~

get_cycles(): ~40

================= SUMMARY ====================

oprofile shows that the total samples in lglzy & vmlinux decrease
about 2% using the new method.

And get_cycles() shows that the total cycles using the new method is
~40, much less than ~165 with the old method.

I've committed this change into TLA
> - FChE


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: get the mapping of <key, syscallname>
  2006-05-13  7:18       ` Li Guanglei
@ 2006-05-14 20:26         ` Frank Ch. Eigler
  2006-05-15 16:39         ` Hien Nguyen
  1 sibling, 0 replies; 10+ messages in thread
From: Frank Ch. Eigler @ 2006-05-14 20:26 UTC (permalink / raw)
  To: Li Guanglei; +Cc: systemtap

Hi -

guanglei wrote:

> [...]
> But one disadvantage is that if using "syscall.*", then every syscall 
> will use a different function to be its probe handler. 

There are two separate opportunities for improvement here, and both of
them are going to get some attention.  

First, a lot of essentially duplicated registration, unregistration,
and entry routines are being generated.  These can be collapsed into
one triplet of functions per type (one for all kprobes, one for all
kretprobes, etc.).  This will remove hundreds of functions of the
"syscall.*"-originated C code, and should make a drastic difference as
to performance.

Second, probe handlers constructed from distinct probe/alias bodies
may be collapsed, if they turn out to be equivalent after
optimization.  One might think of it as a generalization of graydon's
"flavour" calculations in tapsets.cxx.

Together, these two tapset-level optimizations should bring the
"abstraction penalty" way down.

> [Regarding parsing CONTEXT->probe_point: ...]  As discovered in the
> former analysis of LKET, strpbrk() is one of the most costly
> operations. It is a kernel function and is called by strsep() to get
> the current syscall name.  I use another way to get the syscall
> name: [...]

It may be worth addressing this a totally different way too.  The
translator could provide another variable, a sibling of
CONTEXT->probe_point.  It would contains the equivalent of
probefunc(), at least for those probes that have a "synchronous"
binding to a PC address, which kprobes-based ones do.  It could be
statically initialized the same way as probe_point, and the
probefunc() tapset function could be extended to look at that field
first.  No strpbrk or iteration at all then.

- FChE

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: get the mapping of <key, syscallname>
  2006-05-13  7:18       ` Li Guanglei
  2006-05-14 20:26         ` Frank Ch. Eigler
@ 2006-05-15 16:39         ` Hien Nguyen
  1 sibling, 0 replies; 10+ messages in thread
From: Hien Nguyen @ 2006-05-15 16:39 UTC (permalink / raw)
  To: Li Guanglei; +Cc: Frank Ch. Eigler, Jose R. Santos, systemtap

On Sat, 2006-05-13 at 15:18 +0800, Li Guanglei wrote:

> 
> But it still has one issue. e.g, I tried to use:
> probe syscall.* { ... } to trace all system calls on ppc64. But it 
> failed since sys64_time(defined in SystemTap's ppc64/syscall.stp) is 
> removed from  2.6.16.5/ppc64. I think this issues has been addressed 
> in bug 2645.

For the time being,  I would use the kernel_v preprocessor. I fact there
are a few of them in the ppc64/syscall.stp. 
> 
> If this bug #2645 is resolved , then I think I can use the syscall.stp 
> tapsets shipped with SystemTap. Another advantage of using it is that 
> I can simply use the $name instead of getting it using lines of codes.

Another advantage is when more people start using syscalls tapset we
will get more feedback (bug reports).


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: get the mapping of <key, syscallname>
@ 2006-05-15 17:41 Stone, Joshua I
  0 siblings, 0 replies; 10+ messages in thread
From: Stone, Joshua I @ 2006-05-15 17:41 UTC (permalink / raw)
  To: Frank Ch. Eigler, Li Guanglei; +Cc: systemtap

On Sunday, May 14, 2006 1:25 PM, Frank Ch. Eigler wrote:
> Second, probe handlers constructed from distinct probe/alias bodies
> may be collapsed, if they turn out to be equivalent after
> optimization.  One might think of it as a generalization of graydon's
> "flavour" calculations in tapsets.cxx.

Of course, in the syscall.* case, simply using 'name' will make every
handler appear unique...


Josh

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2006-05-15 17:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-05-11  2:05 get the mapping of <key, syscallname> Li Guanglei
2006-05-11 18:14 ` Martin Hunt
2006-05-12 15:08 ` Frank Ch. Eigler
2006-05-12 16:06   ` Jose R. Santos
2006-05-12 16:12     ` Frank Ch. Eigler
2006-05-12 16:40       ` Jose R. Santos
2006-05-13  7:18       ` Li Guanglei
2006-05-14 20:26         ` Frank Ch. Eigler
2006-05-15 16:39         ` Hien Nguyen
2006-05-15 17:41 Stone, Joshua I

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).