public inbox for systemtap@sourceware.org
* Userspace probing
@ 2011-05-08 10:06 Mandar Gurav
  2011-05-09 14:19 ` Lukas Berk
  0 siblings, 1 reply; 3+ messages in thread
From: Mandar Gurav @ 2011-05-08 10:06 UTC (permalink / raw)
  To: systemtap

Hi all!

I want to trace disk I/O for my program using userspace probing, with a
probe such as

probe process("PATH").syscall

I have read that the system call number is available as $syscall. Can
anyone tell me how to check whether it is an "open" system call?

-- 
Mandar Gurav


* Re: Userspace probing
  2011-05-08 10:06 Userspace probing Mandar Gurav
@ 2011-05-09 14:19 ` Lukas Berk
  2011-05-10 19:57   ` David Smith
  0 siblings, 1 reply; 3+ messages in thread
From: Lukas Berk @ 2011-05-09 14:19 UTC (permalink / raw)
  To: Mandar Gurav; +Cc: systemtap

Hey Mandar,

As you already noted, a 'process("path").syscall' style probe provides
a $syscall variable. To check whether it is an 'open()' system call,
we have to compare $syscall to the corresponding syscall number (this
varies by architecture).  On my system, running
'grep __NR_open /usr/include/*/*' shows 2 and 5 for SYS_open (which is
what we want here). From there we just need conditionals that match
$syscall against those values.

Drawing from that, running a script such as:

$ stap -e 'probe process("ping").syscall {
  if ($syscall == 2)
    printf("open 2: %s (%d)\n", execname(), pid())
  if ($syscall == 5)
    printf("open 5: %s (%d)\n", execname(), pid())
}' -c 'ping -c 3 google.com'

would report only the open() syscalls. Feel free to change the
statements following the if()s however you want.

Another method would be to probe via syscall.open and filter by
execname() or target().

Using a similar example to above you could write a script such as:

stap -e 'probe syscall.open {
if(execname() == "ping")
printf("pid: %d\n", pid())
}' -c 'ping -c 3 google.com'

Hope this helps,

Lukas Berk



* Re: Userspace probing
  2011-05-09 14:19 ` Lukas Berk
@ 2011-05-10 19:57   ` David Smith
  0 siblings, 0 replies; 3+ messages in thread
From: David Smith @ 2011-05-10 19:57 UTC (permalink / raw)
  To: systemtap; +Cc: Lukas Berk, Mandar Gurav

On 05/09/2011 09:19 AM, Lukas Berk wrote:
> Hey Mandar,
> 
> As you already noted, a 'process("path").syscall' style probe provides
> a $syscall variable. To check whether it is an 'open()' system call,
> we have to compare $syscall to the corresponding syscall number (this
> varies by architecture).  On my system, running
> 'grep __NR_open /usr/include/*/*' shows 2 and 5 for SYS_open (which is
> what we want here). From there we just need conditionals that match
> $syscall against those values.
> 
> Drawing from that, running a script such as:
> 
> $ stap -e 'probe process("ping").syscall {
>   if ($syscall == 2)
>     printf("open 2: %s (%d)\n", execname(), pid())
>   if ($syscall == 5)
>     printf("open 5: %s (%d)\n", execname(), pid())
> }' -c 'ping -c 3 google.com'

Hmm, that isn't going to work correctly.  The value of __NR_open varies
between architectures.

# fgrep -w __NR_open linux-2.6-linus/arch/*/include/asm*/*.h
arch/alpha/include/asm/unistd.h:#define __NR_open		 45
arch/arm/include/asm/unistd.h:#define __NR_open			(__NR_SYSCALL_BASE+  5)
arch/avr32/include/asm/unistd.h:#define __NR_open		  5
arch/blackfin/include/asm/unistd.h:#define __NR_open		  5
arch/cris/include/asm/unistd.h:#define __NR_open		  5
arch/frv/include/asm/unistd.h:#define __NR_open		  5
arch/h8300/include/asm/unistd.h:#define __NR_open		  5
arch/ia64/include/asm/unistd.h:#define __NR_open			1028
arch/m32r/include/asm/unistd.h:#define __NR_open		  5
arch/m68k/include/asm/unistd.h:#define __NR_open		  5
arch/microblaze/include/asm/unistd.h:#define __NR_open		5 /* openat */
arch/mips/include/asm/unistd.h:#define __NR_open			(__NR_Linux +   5)
arch/mips/include/asm/unistd.h:#define __NR_open			(__NR_Linux +   2)
arch/mips/include/asm/unistd.h:#define __NR_open			(__NR_Linux +   2)
arch/mn10300/include/asm/unistd.h:#define __NR_open		  5
arch/parisc/include/asm/unistd.h:#define __NR_open		(__NR_Linux + 5)
arch/powerpc/include/asm/unistd.h:#define __NR_open		  5
arch/s390/include/asm/unistd.h:#define __NR_open                 5
arch/sh/include/asm/unistd_32.h:#define __NR_open		  5
arch/sh/include/asm/unistd_64.h:#define __NR_open		  5
arch/sparc/include/asm/unistd.h:#define __NR_open                 5 /* Common */
arch/x86/include/asm/unistd_32.h:#define __NR_open		  5
arch/x86/include/asm/unistd_64.h:#define __NR_open				2
arch/x86/include/asm/unistd_64.h:__SYSCALL(__NR_open, sys_open)
arch/xtensa/include/asm/unistd.h:#define __NR_open 				  8

So, on most platforms, but not all, 5 is __NR_open.  (For instance, on
ia64, __NR_open is 1028.)  However, 2 is __NR_fork on most platforms.
So you are going to get lots of false positives with the above code.
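
To make the false positives concrete, here is a minimal C sketch. It is
not part of any script in this thread: the function names are made up
for illustration, and it assumes Linux with glibc's <sys/syscall.h>. It
reproduces the "2 or 5" check and reports whether a given number is
flagged without being the open() number of the ABI it is compiled for:

```c
#include <stdbool.h>
#include <sys/syscall.h>  /* SYS_open etc. for the ABI being compiled */

/* The check from the quoted script: treats both 2 and 5 as open(). */
bool naive_is_open(long nr)
{
    return nr == 2 || nr == 5;
}

/* True if the naive check flags a syscall number that is not this
 * ABI's open(), i.e. a false positive. */
bool naive_check_misfires(long nr)
{
#ifdef SYS_open
    return naive_is_open(nr) && nr != SYS_open;
#else
    /* This ABI has no open() syscall, so every hit is a false positive. */
    return naive_is_open(nr);
#endif
}
```

On x86_64, for example, syscall 5 is fstat, so
naive_check_misfires(SYS_fstat) is true there; on 32-bit x86 the same
happens for SYS_fork, which is 2.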

Here's how to fix this.  To catch the normal case, you can do this:

====
%{
#include <linux/unistd.h>
%}

probe process("ping").syscall {
  if ($syscall == %{ __NR_open %}) {
    printf("open: %s (%d)\n", execname(), pid())
  }
}
====

The above code takes __NR_open from the kernel headers for the platform
the script is built on, so no syscall number is hardcoded.  (Because it
contains embedded C, it has to be run in guru mode: 'stap -g'.)  Problem
solved!  Except...

arch/x86/include/asm/unistd_32.h:#define __NR_open  5
arch/x86/include/asm/unistd_64.h:#define __NR_open  2

On 64-bit x86, __NR_open is 2.  But __NR_open is 5 on a 32-bit
executable running on that same 64-bit kernel.
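
One way to see the two values side by side, assuming an x86_64 machine
with gcc and the 32-bit (multilib) development headers installed, is to
compile the same small C file twice. The helper names below are
illustrative only:

```c
#include <sys/syscall.h>  /* SYS_open, where the target ABI defines it */

/* Return the open() syscall number for the ABI this file is compiled
 * for, or -1 if that ABI has no open() syscall (only openat()). */
long open_syscall_nr(void)
{
#ifdef SYS_open
    return SYS_open;
#else
    return -1;
#endif
}

/* Width in bits of a pointer in this build (32 with -m32 on x86). */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * 8);
}
```

Printing open_syscall_nr() and pointer_bits() from a main() built with
'gcc -m64' and again with 'gcc -m32' should give 2 with 64 and 5 with
32, matching the two header lines above.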

To solve this problem, we've got to know if we're running a 32-bit exe
on the 64-bit kernel.  Here's the code I've used in the past for this,
which adds a function called 'ia32' that lets us know if we're running
an x86 32-bit exe on 64-bit kernel.

====
%{
#include <linux/unistd.h>
%}

%(arch == "x86_64" %?
function ia32:long()
%{ /* pure */
	if (test_tsk_thread_flag(current, TIF_IA32))
		THIS->__retvalue = 1;
	else
		THIS->__retvalue = 0;
%}
%)

probe process("ping").syscall {
  if (
%(arch == "x86_64" %?
      ia32() ? ($syscall == 5) :
%)
      ($syscall == %{ __NR_open %})) {
    printf("open: %s (%d)\n", execname(), pid())
  }
}
====

Unfortunately we've got to hardcode the 5 here, since __NR_open will be
2 on the 64-bit x86_64 kernel.
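
The selection logic can be modeled in plain C as a sketch.
is_open_syscall and its parameters are hypothetical names:
native_nr_open stands for the kernel's own __NR_open and task_is_ia32
for the result of an ia32()-style check. This version picks the expected
number by task type, so that a 32-bit task's syscall 2, which is fork in
the 32-bit table, is not reported as open:

```c
#include <stdbool.h>

/* Hardcoded, because a 64-bit build only sees the 64-bit __NR_open. */
#define IA32_NR_OPEN 5

/* nr:             syscall number reported by the probe
 * native_nr_open: the kernel's own __NR_open (2 on x86_64)
 * task_is_ia32:   true for a 32-bit x86 task on a 64-bit kernel */
bool is_open_syscall(long nr, long native_nr_open, bool task_is_ia32)
{
    /* Compare against the table the task actually uses. */
    long expected = task_is_ia32 ? IA32_NR_OPEN : native_nr_open;
    return nr == expected;
}
```

With native_nr_open == 2, a 64-bit task matches only on 2 and a 32-bit
task matches only on 5.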

> would report only the open() syscalls. Feel free to change the
> statements following the if()s however you want.
> 
> Another method would be to probe via syscall.open and filter by
> execname() or target().
> 
> Using a similar example to above you could write a script such as:
> 
> stap -e 'probe syscall.open {
> if(execname() == "ping")
> printf("pid: %d\n", pid())
> }' -c 'ping -c 3 google.com'

That code looks fine.

So, which code to pick? It depends on what your application is and what
else is running on your system.  The 'process.syscall' probe is going to
hit for every syscall in your application, but won't slow down any other
process in the system.  The 'syscall.open' probe will only hit for open
syscalls, but will hit on every open syscall on every running process.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)
