* Another Newbie question about measuring time in a sys call @ 2006-02-08 1:29 David A Sperry 2006-02-08 1:52 ` Li Guanglei ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: David A Sperry @ 2006-02-08 1:29 UTC (permalink / raw) To: systemtap Hi, I'm trying to run the top example from a few months ago to play with measuring time. I run top in one window and stap in another. The problem is I am getting some strange answers. Here's the script I copied out of an old message: [sperryd@ibm systap]$ cat top.stp global called, ttime probe kernel.syscall.* { if (execname() == "top") called[name] = gettimeofday_us() } probe kernel.syscall.*.return { if (execname() == "top") ttime[name] += gettimeofday_us() - called[name] } probe end { foreach ([n] in ttime-) printf("%s took %d usecs\n", n, ttime[n]) } Here are the results I get: [root@ibm systap]# stap top.stp -v Pass 1: parsed user script and 11 library script(s) in 140usr/0sys/140real ms. Pass 2: analyzed script: 293 probe(s), 18 function(s), 2 global(s) in 1160usr/40sys/1210real ms. Pass 3: translated to C into "/tmp/stapOj2TEx/stap_905.c" in 300usr/110sys/406real ms. Pass 4: compiled C into "stap_905.ko" in 45460usr/1690sys/46011real ms. stat64.return took 4918620585892709376 usecs fcntl64.return took 1886781490013993830 usecs lseek.return took 123050966740266106 usecs llseek.return took 82033977826804214 usecs ioctl.return took 82033977825997218 usecs getdents.return took 82033977825866159 usecs fstat64.return took 41016988912499374 usecs gettimeofday.return took 41016988912493711 usecs creat.return took -8486451935184063013 usecs Pass 5: run completed in 10usr/100sys/117204real ms. [root@ibm systap]# I'm guessing the syscall tapsets may have changed and the associative array may be getting confused between a call and it's return (lseek vs lseek.return) Any help would be appreciated Thanks Dave Sperry ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 1:29 Another Newbie question about measuring time in a sys call David A Sperry @ 2006-02-08 1:52 ` Li Guanglei 2006-02-08 5:56 ` Li Guanglei 2006-02-08 12:32 ` Frank Ch. Eigler 2 siblings, 0 replies; 10+ messages in thread From: Li Guanglei @ 2006-02-08 1:52 UTC (permalink / raw) To: David A Sperry; +Cc: systemtap David A Sperry ??: > > > I'm guessing the syscall tapsets may have changed and the associative array > may be getting confused between a call and it's return (lseek vs > lseek.return) > I greped in /usr/local/share/systemtap/tapsets/syscall.stp, and found for the return of syscalls, in some places, it set name to syscallname, while in other places it set name to syscall.return, e.g.: probe kernel.syscall.chdir.return = kernel.function("sys_chdir").return { name = "chdir.return" } probe syscall.accept.return = kernel.function("sys_accept").return { name = "accept" returnp = 1 /*seem to use returnp=1 to mark this as a return probe*/ } so we need to make the name consistent. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 1:29 Another Newbie question about measuring time in a sys call David A Sperry 2006-02-08 1:52 ` Li Guanglei @ 2006-02-08 5:56 ` Li Guanglei 2006-02-08 12:32 ` Frank Ch. Eigler 2 siblings, 0 replies; 10+ messages in thread From: Li Guanglei @ 2006-02-08 5:56 UTC (permalink / raw) To: David A Sperry; +Cc: systemtap David A Sperry Wrote: > Hi, > I'm trying to run the top example from a few months ago to play with > measuring time. I run top in one window and stap in another. The problem is > I am getting some strange answers. > > probe kernel.syscall.* { > if (execname() == "top") > called[name] = gettimeofday_us() > } sorry I missed something: http://sources.redhat.com/ml/systemtap/2006-q1/msg00319.html Martin has made some changes to the syscall tapsets. The new tapsets will use "probe syscall.*" instead of "probe kernel.syscall.*" ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 1:29 Another Newbie question about measuring time in a sys call David A Sperry 2006-02-08 1:52 ` Li Guanglei 2006-02-08 5:56 ` Li Guanglei @ 2006-02-08 12:32 ` Frank Ch. Eigler 2006-02-08 13:46 ` David A Sperry 2006-02-08 21:14 ` Martin Hunt 2 siblings, 2 replies; 10+ messages in thread From: Frank Ch. Eigler @ 2006-02-08 12:32 UTC (permalink / raw) To: David A Sperry; +Cc: systemtap David_A_Sperry wrote: > probe kernel.syscall.* { > if (execname() == "top") > called[name] = gettimeofday_us() > } > probe kernel.syscall.*.return { > if (execname() == "top") > ttime[name] += gettimeofday_us() - called[name] > } Indeed, the "name" value exported from the syscalls tapset has been changed, for better or for worse. Plus there are still some "kernel.syscall.*" stragglers in the syscalls*stp files, so there is plenty of reason for confusion. Here's how the script might work today. probe kernel.syscall.* { if (execname() != "top") next callee[tid()] = name # saved because .return probe changes name calltime[tid()] = gettimeofday_us() } probe kernel.syscall.*.return { if (execname() != "top") next c=calltime[tid()] if (!c) next delete calltime[tid()] ttime[callee[tid()]] <<< gettimeofday_us() - c } # Note: code like the above probe pair could go into a higher layer # standard tapset. probe end { foreach (x in ttime) printf("syscall %s count %d avg %d\n", x, @count(ttime[x]), @avg(ttime[x])) } global callee, calltime, ttime - FChE ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 12:32 ` Frank Ch. Eigler @ 2006-02-08 13:46 ` David A Sperry 2006-02-08 15:18 ` Frank Ch. Eigler 2006-02-08 21:14 ` Martin Hunt 1 sibling, 1 reply; 10+ messages in thread From: David A Sperry @ 2006-02-08 13:46 UTC (permalink / raw) To: Frank Ch. Eigler, Li Guanglei; +Cc: systemtap Thanks Frank and Li, Both of your examples work and are a big help! Frank, I like your example using tid() instead of pid() and the @ave() function. I modified your script to add @min() and @max() just to see what would happen and it worked! I got syscall ioctl count 38 avg 19 min 16 max 23 syscall gettimeofday count 19 avg 11 min 11 max 12 syscall llseek count 38 avg 10 min 9 max 13 syscall fstat64 count 19 avg 11 min 11 max 13 syscall fcntl64 count 722 avg 11 min 7 max 26 syscall creat count 3211 avg 38 min 33 max 1760 syscall stat64 count 1577 avg 32 min 29 max 80 syscall lseek count 57 avg 9 min 8 max 12 syscall getdents count 38 avg 56 min 29 max 80 From this quick test I can see that syscall creat has at least one call that took a long time. Really cool! Dave systemtap-owner@sourceware.org wrote on 02/08/2006 07:32:17 AM: > > David_A_Sperry wrote: > > > probe kernel.syscall.* { > > if (execname() == "top") > > called[name] = gettimeofday_us() > > } > > probe kernel.syscall.*.return { > > if (execname() == "top") > > ttime[name] += gettimeofday_us() - called[name] > > } > > Indeed, the "name" value exported from the syscalls tapset has been > changed, for better or for worse. Plus there are still some > "kernel.syscall.*" stragglers in the syscalls*stp files, so there is > plenty of reason for confusion. > > Here's how the script might work today. > > > probe kernel.syscall.* { > if (execname() != "top") next > callee[tid()] = name # saved because .return probe changes name > calltime[tid()] = gettimeofday_us() > } > > probe kernel.syscall.*.return { > if (execname() != "top") next > c=calltime[tid()] > if (!c) next > delete calltime[tid()] > ttime[callee[tid()]] <<< gettimeofday_us() - c > } > > # Note: code like the above probe pair could go into a higher layer > # standard tapset. > > probe end { > foreach (x in ttime) > printf("syscall %s count %d avg %d\n", x, > @count(ttime[x]), @avg(ttime[x])) > } > > global callee, calltime, ttime > > > - FChE ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 13:46 ` David A Sperry @ 2006-02-08 15:18 ` Frank Ch. Eigler 2006-02-08 15:54 ` David A Sperry 0 siblings, 1 reply; 10+ messages in thread From: Frank Ch. Eigler @ 2006-02-08 15:18 UTC (permalink / raw) To: David A Sperry; +Cc: Li Guanglei, systemtap David_A_Sperry wrote: > [...] > syscall creat count 3211 avg 38 min 33 max 1760 > [...] > From this quick test I can see that syscall creat has at least one call > that took a long time. Really cool! AFAIK, top does not perform any file creation. Did you change the script to monitor other processes too? (You can store execname() etc. in auxiliary arrays.) - FChE ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 15:18 ` Frank Ch. Eigler @ 2006-02-08 15:54 ` David A Sperry 2006-02-08 16:24 ` Frank Ch. Eigler 0 siblings, 1 reply; 10+ messages in thread From: David A Sperry @ 2006-02-08 15:54 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: Li Guanglei, systemtap fche@redhat.com wrote on 02/08/2006 10:18:06 AM: > > David_A_Sperry wrote: > > > [...] > > syscall creat count 3211 avg 38 min 33 max 1760 > > [...] > > From this quick test I can see that syscall creat has at least one call > > that took a long time. Really cool! > > AFAIK, top does not perform any file creation. Did you change the > script to monitor other processes too? (You can store execname() etc. > in auxiliary arrays.) > I was lazy and just modified the prinf line. I re-ran the script without top running and got no results as expected. I started top, then ran the script and got the results below: *** without top running *** [root@ibm systap]# stap top2.stp -v Pass 1: parsed user script and 11 library script(s) in 130usr/10sys/140real ms. Pass 2: analyzed script: 293 probe(s), 19 function(s), 3 global(s) in 1160usr/6 sys/1221real ms. Pass 3: translated to C into "/tmp/stapQealzt/stap_5083.c" in 380usr/90sys/488r al ms. Pass 4: compiled C into "stap_5083.ko" in 59980usr/1480sys/60048real ms. Pass 5: run completed in 10usr/350sys/72255real ms. [root@ibm systap]# *** with top running before script invocation *** [root@ibm systap]# stap top2.stp -v Pass 1: parsed user script and 11 library script(s) in 140usr/0sys/139real ms. Pass 2: analyzed script: 293 probe(s), 19 function(s), 3 global(s) in 1190usr/3 sys/1233real ms. Pass 3: translated to C into "/tmp/stapMP4oDM/stap_5220.c" in 390usr/90sys/484r al ms. Pass 4: compiled C into "stap_5220.ko" in 59920usr/1660sys/60245real ms. syscall ioctl count 80 avg 18 min 14 max 24 syscall gettimeofday count 40 avg 11 min 11 max 13 syscall llseek count 80 avg 10 min 8 max 12 syscall fstat64 count 40 avg 11 min 10 max 12 syscall fcntl64 count 1920 avg 11 min 6 max 150 syscall creat count 6772 avg 37 min 32 max 103 syscall stat64 count 3326 avg 31 min 29 max 98 syscall lseek count 120 avg 8 min 7 max 11 syscall getdents count 80 avg 55 min 31 max 80 Pass 5: run completed in 20usr/340sys/127410real ms. [root@ibm systap]# *** The contents of the script: *** [sperryd@ibm systap]$ cat top2.stp probe kernel.syscall.* { if (execname() != "top") next callee[tid()] = name # saved because .return probe changes name calltime[tid()] = gettimeofday_us() } probe kernel.syscall.*.return { if (execname() != "top") next c=calltime[tid()] if (!c) next delete calltime[tid()] ttime[callee[tid()]] <<< gettimeofday_us() - c } # Note: code like the above probe pair could go into a higher layer # standard tapset. probe end { foreach (x in ttime) printf("syscall %s count %d avg %d min %d max %d\n", x, @count(ttime[x]), @avg(ttime[x]), @min(ttime[x]), @max(ttime[x])) } global callee, calltime, ttime [sperryd@ibm systap]$ *** here's a list of open files: *** [sperryd@ibm systap]$ lsof |grep top top 5218 sperryd cwd DIR 22,66 4096 21731427 /home/sperryd/systap top 5218 sperryd rtd DIR 22,65 4096 2 / top 5218 sperryd txt REG 22,65 58260 1270863 /usr/bin/top top 5218 sperryd mem REG 22,65 46660 981161 /lib/libnss_files-2.3.90.so top 5218 sperryd mem REG 0,0 0 [vdso] (stat: No such file or directory) top 5218 sperryd mem REG 22,65 121480 982781 /lib/ld-2.3.90.so top 5218 sperryd mem REG 22,65 1487244 982782 /lib/libc-2.3.90.so top 5218 sperryd mem REG 22,65 54500 982805 /lib/libproc-3.2.6.so top 5218 sperryd mem REG 22,65 16304 982784 /lib/libdl-2.3.90.so top 5218 sperryd mem REG 22,65 291960 1266716 /usr/lib/libncurses.so.5.5 top 5218 sperryd 0u CHR 136,3 5 /dev/pts/3 top 5218 sperryd 1u CHR 136,3 5 /dev/pts/3 top 5218 sperryd 2u CHR 136,3 5 /dev/pts/3 top 5218 sperryd 3r REG 0,3 0 4026531841 /proc/uptime top 5218 sperryd 4r REG 0,3 0 4026531840 /proc/loadavg top 5218 sperryd 5r REG 0,3 0 4026531853 /proc/stat top 5218 sperryd 6r REG 0,3 0 4026531842 /proc/meminfo [sperryd@ibm systap]$ -Dave ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 15:54 ` David A Sperry @ 2006-02-08 16:24 ` Frank Ch. Eigler 2006-02-08 17:38 ` Martin Hunt 0 siblings, 1 reply; 10+ messages in thread From: Frank Ch. Eigler @ 2006-02-08 16:24 UTC (permalink / raw) To: David A Sperry; +Cc: systemtap David_A_Sperry wrote: > [...] > > AFAIK, top does not perform any file creation. [...] > > I was lazy and just modified the prinf line. [...] > syscall creat count 6772 avg 37 min 32 max 103 > [...] A glance at src/tapset/syscalls.stp explains the problem. There, the "sys_open" function is probed and mislabeled "creat". - FChE ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 16:24 ` Frank Ch. Eigler @ 2006-02-08 17:38 ` Martin Hunt 0 siblings, 0 replies; 10+ messages in thread From: Martin Hunt @ 2006-02-08 17:38 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: David A Sperry, systemtap On Wed, 2006-02-08 at 11:24 -0500, Frank Ch. Eigler wrote: > David_A_Sperry wrote: > > > [...] > > > AFAIK, top does not perform any file creation. [...] > > > > I was lazy and just modified the prinf line. [...] > > syscall creat count 6772 avg 37 min 32 max 103 > > [...] > > A glance at src/tapset/syscalls.stp explains the problem. There, the > "sys_open" function is probed and mislabeled "creat". One of the many problems with the old syscall tapset. Lots of functions are probed under multiple names. The tapset is currently being rewritten and is not to be trusted. I'll send out an email when the rewrite is complete, along with some sample code to show how to do profiling and tracing using it. Martin ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Another Newbie question about measuring time in a sys call 2006-02-08 12:32 ` Frank Ch. Eigler 2006-02-08 13:46 ` David A Sperry @ 2006-02-08 21:14 ` Martin Hunt 1 sibling, 0 replies; 10+ messages in thread From: Martin Hunt @ 2006-02-08 21:14 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: David A Sperry, systemtap Another way that works now is to not use the syscall tapset. probe kernel.function("sys_*") { if (target() != tid()) next calltime[tid()] = gettimeofday_us() } probe kernel.function("sys_*").return { if (target() != tid()) next c=calltime[tid()] if (!c) next delete calltime[tid()] ttime[probefunc()] <<< gettimeofday_us() - c } probe end { printf("\n") foreach (x in ttime) printf("%s count %d avg %d\n", x, @count(ttime[x]), @avg(ttime[x])) } global calltime, ttime ---------- Note I change the filter to compare target() to tid(). This allows you to profile any command with the above code just by using the "-c" option to stap. > stap -c ps system_calls.stp PID TTY TIME CMD 19129 pts/6 00:00:00 stpd 19135 pts/6 00:00:00 ps sys_munmap count 7 avg 14 sys_mprotect count 6 avg 9 sys_mmap2 count 25 avg 8 sys_readlink count 2 avg 20 sys_rt_sigaction count 34 avg 5 sys_read count 231 avg 20 sys_rt_sigprocmask count 7 avg 6 sys_write count 3 avg 17 sys_brk count 6 avg 5 sys_open count 234 avg 12 sys_getdents64 count 2 avg 37 sys_geteuid count 2 avg 5 sys_time count 2 avg 7 sys_lseek count 2 avg 5 sys_ioctl count 2 avg 7 sys_access count 6 avg 8 sys_newuname count 2 avg 8 sys_fcntl64 count 1 avg 5 sys_fstat64 count 13 avg 5 sys_stat64 count 123 avg 9 sys_set_thread_area count 2 avg 6 sys_sigreturn count 1 avg 9 sys_close count 234 avg 6 sys_getpgrp count 1 avg 6 sys_getppid count 1 avg 5 sys_getpid count 1 avg 6 sys_getegid count 1 avg 5 sys_getgid count 1 avg 5 sys_getuid count 1 avg 5 Unfortunately we don't have a way to tell the translator which field to use to sort the output. Of course you could just pipe the output through sort. Martin ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2006-02-08 21:14 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2006-02-08 1:29 Another Newbie question about measuring time in a sys call David A Sperry 2006-02-08 1:52 ` Li Guanglei 2006-02-08 5:56 ` Li Guanglei 2006-02-08 12:32 ` Frank Ch. Eigler 2006-02-08 13:46 ` David A Sperry 2006-02-08 15:18 ` Frank Ch. Eigler 2006-02-08 15:54 ` David A Sperry 2006-02-08 16:24 ` Frank Ch. Eigler 2006-02-08 17:38 ` Martin Hunt 2006-02-08 21:14 ` Martin Hunt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).