public inbox for systemtap@sourceware.org
* user mode backtrace
@ 2006-10-19 20:47 David Boreham
  2006-10-19 22:33 ` Vara Prasad
  2006-10-19 23:50 ` Frank Ch. Eigler
  0 siblings, 2 replies; 15+ messages in thread
From: David Boreham @ 2006-10-19 20:47 UTC (permalink / raw)
  To: SystemTap

I'd like to get a stack trace for the process that made the
system call I'm probing (I'm looking at filesystem access
typically, so reads/writes/syncs etc). The systemtap backtrace
function appears to only get the kernel mode stack which
is not much use to me. I was wondering if anyone had
discovered a good solution to this problem already?
I was thinking perhaps I could invoke pstack (gdb)
on the current pid/tid. But I'm worried that doing so
might deadlock since the process is inside a system
call.

I'm looking at a very large application that beats up on
the filesystem, in case you're wondering why I want to do
this. It's so large that nobody is quite sure what code
accesses which files, when and why.

Thanks.


* RE: user mode backtrace
@ 2006-10-19 22:56 Stone, Joshua I
  2006-10-19 23:07 ` David Boreham
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Stone, Joshua I @ 2006-10-19 22:56 UTC (permalink / raw)
  To: david_list; +Cc: SystemTap

On Thursday, October 19, 2006 1:47 PM, David Boreham wrote:
> I'd like to get a stack trace for the process that made the
> system call I'm probing (I'm looking at filesystem access
> typically, so reads/writes/syncs etc). The systemtap backtrace
> function appears to only get the kernel mode stack which
> is not much use to me. I was wondering if anyone had
> discovered a good solution to this problem already?
> I was thinking perhaps I could invoke pstack (gdb)
> on the current pid/tid. But I'm worried that doing so
> might deadlock since the process is inside a system
> call.
> 
> I'm looking at a very large application that beats up on
> the filesystem, in case you're wondering why I want to do
> this. It's so large that nobody is quite sure what code
> accesses which files, when and why.
> 
> Thanks.

Deadlock issues aside, there's not really a way for you to invoke a
process (like pstack) from within a SystemTap script.  You could run a
separate user program or script to do this for you though, and then you
just need to coordinate with SystemTap.  Such a method might look like
this:

-----------------------------------------------
/* Main test driver */
-----------------------------------------------
pid = fork_ptraceme_exec("myapp"); // start the app paused
stappid = fork_exec("stap myscript.stp -x " + pid); // start systemtap
ptrace(DETACH, pid, ...); // let the app run
while(pid == waitpid(pid, &stat, WUNTRACED)) { // WUNTRACED: also report stops
  if (WIFSTOPPED(stat)) { // app is stopped
    system("pstack " + pid); // dump the stack
    kill(pid, SIGCONT); // continue the app
  }
  else if (WIFEXITED(stat) || WIFSIGNALED(stat)) { // app is gone
    break;
  }
}
kill(stappid, SIGINT); // tell systemtap to stop
waitpid(stappid, ...);
-----------------------------------------------

-----------------------------------------------
/* SystemTap script: myscript.stp */
-----------------------------------------------
probe syscall.read {
  if (target() != tid()) next;
  /* log some stuff: filename, etc. */
}
probe syscall.read.return {
  if (target() != tid()) next;
  send_stop() // do this on return to avoid EINTR
}
function send_stop() %{ /* embedded C: run stap with -g (guru mode) */
  send_sig(SIGSTOP, current, 1);
%}
-----------------------------------------------

This is all pretty rough, and I haven't actually tried it, so who knows
if it will actually work.
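
A minimal C sketch of the driver's wait loop, for illustration only.  It
assumes the driver is the parent of the app (waitpid only reports on
children) and that pstack is installed.  The names dump_with_pstack and
resume_after_each_stop are made up for this sketch; they are not part of
SystemTap or any library.  The detail that matters is the WUNTRACED flag:
without it, waitpid does not report a child that has stopped but is not
being ptraced.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type: invoked each time the child is seen stopped. */
typedef void (*stop_handler)(pid_t pid);

/* Example handler: dump the stack with pstack (assumes pstack is installed). */
void dump_with_pstack(pid_t pid)
{
    char cmd[64];

    snprintf(cmd, sizeof cmd, "pstack %d", (int)pid);
    if (system(cmd) != 0)
        fprintf(stderr, "pstack failed for pid %d\n", (int)pid);
}

/* Wait on `pid`; every time it stops, call on_stop (if non-NULL) and
 * resume it with SIGCONT.  Returns the number of stops seen once the
 * child has exited or been killed, or -1 if waitpid fails. */
int resume_after_each_stop(pid_t pid, stop_handler on_stop)
{
    int status, stops = 0;

    assert(pid > 0);
    for (;;) {
        /* WUNTRACED: also report children that stopped, not only exits. */
        if (waitpid(pid, &status, WUNTRACED) != pid)
            return -1;
        if (!WIFSTOPPED(status))
            return stops;      /* exited or killed by a signal */
        stops++;
        if (on_stop)
            on_stop(pid);
        kill(pid, SIGCONT);    /* let the app run until the next stop */
    }
}
```

The driver would call resume_after_each_stop(pid, dump_with_pstack) after
starting the app and stap; passing NULL as the handler just counts stops.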

Of course at the end of the day, this is just a convoluted strace with a
stack printout.  You could probably do the same thing by hacking gdb's
backtrace function into strace.  But this SystemTap method would also
let you probe other things besides just system calls...

If anyone gets this working I would LOVE to hear about it... :)


Josh

* RE: user mode backtrace
@ 2006-10-20  2:02 Stone, Joshua I
  0 siblings, 0 replies; 15+ messages in thread
From: Stone, Joshua I @ 2006-10-20  2:02 UTC (permalink / raw)
  To: david_list; +Cc: SystemTap

On Thursday, October 19, 2006 4:25 PM, David Boreham wrote:
>> pid = fork_ptraceme_exec("myapp"); // start the app paused
>> stappid = fork_exec("stap myscript.stp -x " + pid); // start systemtap
>> ptrace(DETACH, pid, ...); // let the app run
>> 
>> 
> Actually I don't think this will help me because it looks like
> it assumes a specific target process. That's the specific problem that
> I have: I don't know which processes are going to be interesting
> in advance.

I just filter on a single tid because it's convenient.  The thing you
have to avoid is probing any of the processes you kick off, like the
pstack.  Otherwise you get yourself in a recursive loop, and
congratulations, you've just fork-bombed the system.  So it's hard to be
smart about which processes NOT to probe.  You could try filtering by
execname, if that's known.

If you can arrange for your application to be spawned from a central
process, you could try to follow forks from that process:

-----------------------------------------------
global filter
probe begin {
  filter[target()] = 1
}
probe process.create {
  if (filter[tid()])
    filter[new_pid] = 1
}
probe process.exit {
  delete filter[tid()]
}
-----------------------------------------------

Then instead of "if (target() != tid()) next;" you have "if
(!filter[tid()]) next;".


Josh

* RE: user mode backtrace
@ 2006-10-20  2:13 Stone, Joshua I
  0 siblings, 0 replies; 15+ messages in thread
From: Stone, Joshua I @ 2006-10-20  2:13 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: david_list, SystemTap

On Thursday, October 19, 2006 5:10 PM, Frank Ch. Eigler wrote:
> "Stone, Joshua I" <joshua.i.stone@intel.com> writes:
> 
>> [...]  Deadlock issues aside, there's not really a way for you to
>> invoke a process (like pstack) from within a SystemTap script.  [...]
> 
> It turns out that quite some time ago, Martin implemented a clone of
> the dtrace system() function for systemtap, which enqueues a string
> for execution by the userspace staprun daemon.  (There is no
> synchronization or data exchange though.)

I forgot we had that.  Take two... :)

-----------------------------------------------
global args
probe syscall.open {
    t = tid()
    if (target() != t) next;
    args[t] = argstr
}
probe syscall.open.return {
    t = tid()
    if (target() != t) next;
    send_stop() // do this on return to avoid EINTR
    printf("\n%d %s open(%s) = %s\n", t, execname(), args[t], retstr)
    system(sprintf("pstack %d && kill -CONT %d", t, t))
    delete args[t]
}
function send_stop() %{
    send_sig(SIGSTOP, current, 1);
%}
-----------------------------------------------

... and this one actually works!  Mostly... when I SIGSTOP an app
started from an interactive shell, the shell seems to take back control,
and I can't get SIGCONT to work nicely.  But, I tried targeting a gvim
process, which is detached, and it worked just fine!

Of course, it's VERY slow -- probably orders of magnitude slower than
other options like recording the stack frame for post-processing.

It's still a fun exercise though.  :)


Josh

* RE: user mode backtrace
@ 2006-10-20 18:34 Stone, Joshua I
  0 siblings, 0 replies; 15+ messages in thread
From: Stone, Joshua I @ 2006-10-20 18:34 UTC (permalink / raw)
  To: SystemTap; +Cc: david_list, Frank Ch. Eigler

On Thursday, October 19, 2006 7:13 PM, Stone, Joshua I wrote:
> -----------------------------------------------
> global args
> probe syscall.open {
>     t = tid()
>     if (target() != t) next;
>     args[t] = argstr
> }
> probe syscall.open.return {
>     t = tid()
>     if (target() != t) next;
>     send_stop() // do this on return to avoid EINTR
>     printf("\n%d %s open(%s) = %s\n", t, execname(), args[t], retstr)
>     system(sprintf("pstack %d && kill -CONT %d", t, t))
>     delete args[t]
> }
> function send_stop() %{
>     send_sig(SIGSTOP, current, 1);
> %}
> -----------------------------------------------
> 
> ... and this one actually works!  Mostly... when I SIGSTOP an app
> started from an interactive shell, the shell seems to take back
> control, and I can't get SIGCONT to work nicely.  But, I tried
> targeting a gvim process, which is detached, and it worked just fine!
> 
> Of course, it's VERY slow -- probably orders of magnitude slower than
> other options like recording the stack frame for post-processing.
> 
> It's still a fun exercise though.  :)

Bonus points -

While this method is really too slow for tracing usage, it might be good
for debugging purposes.  I'm thinking of the usage model where you
detect that something interesting happens in your app, and you want to
pause it so you can attach a debugger and inspect it interactively.

For example, consider the simple case when you're debugging an app that
forks, and you want to have gdb attached to *both* ends of the fork.
The gdb docs guide you to "Put a call to sleep in the code which the
child process executes after the fork."  With SystemTap probes on
process.create and/or process.start, you could watch for your app's
fork, SIGSTOP the new process before it goes anywhere, and then attach a
gdb to the new process.  You still need multiple gdb sessions, but this
way you don't need to modify your app with a new sleep call.

If you want to be extra clever, you could even use system() to
automatically launch an xterm with a new gdb session attached, something
like:

  probe process.start {
    if (my_filter()) {
      send_stop()
      system(sprintf("xterm -e gdb %d &", tid()))
    }
  }
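
For comparison, here is roughly what the in-app alternative looks like in
C: the child stops itself right after fork() with raise(SIGSTOP), rather
than sleeping, and stays parked until something sends it SIGCONT.  This is
the kind of source change the probe approach avoids.  fork_stopped and
child_work are hypothetical names for this sketch.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for whatever the child would normally do after the fork. */
void child_work(void)
{
    _exit(42);
}

/* Fork a child that stops itself before doing any work.
 * Returns the child's pid in the parent, or -1 if fork fails. */
pid_t fork_stopped(void (*work)(void))
{
    pid_t pid;

    assert(work != NULL);
    pid = fork();
    if (pid == 0) {
        raise(SIGSTOP);   /* parked here until SIGCONT arrives */
        work();
        _exit(0);
    }
    return pid;
}
```

While the child is parked, a second debugger session can attach to it with
gdb -p and the returned pid; sending SIGCONT resumes it without a debugger.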


Josh


end of thread, other threads:[~2006-10-20 18:34 UTC | newest]

Thread overview: 15+ messages
2006-10-19 20:47 user mode backtrace David Boreham
2006-10-19 22:33 ` Vara Prasad
2006-10-19 22:51   ` David Boreham
2006-10-20 14:28     ` William Cohen
2006-10-20 14:50       ` David Boreham
2006-10-19 23:50 ` Frank Ch. Eigler
2006-10-20  0:15   ` David Boreham
2006-10-20  0:27     ` Frank Ch. Eigler
2006-10-19 22:56 Stone, Joshua I
2006-10-19 23:07 ` David Boreham
2006-10-19 23:24 ` David Boreham
2006-10-20  0:09 ` Frank Ch. Eigler
2006-10-20  2:02 Stone, Joshua I
2006-10-20  2:13 Stone, Joshua I
2006-10-20 18:34 Stone, Joshua I
