public inbox for systemtap@sourceware.org
* Instrumenting context-switching
From: Perry Cheng @ 2007-10-31 13:42 UTC
  To: systemtap

Instrumenting the context switch code has long been delicate.   I can't 
seem to find any help on this particular topic on the wiki (specifically, 
the examples).

In the past, following some hints from old docs and mailing list, it has 
been enough to instrument the __switch_to method to get the prev and next 
tasks.  The method one level higher is switch_to which, being a macro, is 
not instrumentable.  Lately, I've switched from an older i386 kernel to a 
new x86_64 kernel and now __switch_to no longer is instrumentable with 
kprobes.   The higher-level context_switch itself does not seem probe-able 
because it is a static inline method.  Even higher up, we have the call to 
context_switch from schedule, but instrumenting that would require using a 
specific line number, which seems rather fragile because of its greater 
reliance on debugging code and susceptibility to kernel code changes.

So, on an x86_64 kernel, how do I instrument this method?

Perry


* Re: Instrumenting context-switching
From: Mike Mason @ 2007-10-31 16:34 UTC
  To: Perry Cheng; +Cc: systemtap

Perry Cheng wrote:
> Instrumenting the context switch code has long been delicate.   I can't 
> seem to find any help on this particular topic on the wiki (specifically, 
> the examples).
> 
> In the past, following some hints from old docs and mailing list, it has 
> been enough to instrument the __switch_to method to get the prev and next 
> tasks.  The method one level higher is switch_to which, being a macro, is 
> not instrumentable.  Lately, I've switched from an older i386 kernel to a 
> new x86_64 kernel and now __switch_to no longer is instrumentable with 
> kprobes.   The higher-level context_switch itself does not seem probe-able 
> because it is a static inline method.  Even higher up, we have the call to 
> context_switch from schedule, but instrumenting that would require using a 
> specific line number, which seems rather fragile because of its greater 
> reliance on debugging code and susceptibility to kernel code changes.

The scheduler tapset probes context_switch() on x86_64, but that doesn't help much.  context_switch() is an inline and, thus, the entry parameters prev and next aren't accessible via SystemTap.

What kernel version are you using?  If you're using 2.6.24-rc1, kernel markers are available.  You can place a static marker (as shown in the patch below), rebuild and reboot the kernel, then access the probe point via a SystemTap script (as shown in the script below the patch).  You'll need the latest SystemTap snapshot to get marker probe support: ftp://sources.redhat.com/pub/systemtap/snapshots/systemtap-20071027.tar.bz2.

Of course, this won't help if you're using earlier versions of the kernel or SystemTap.

- Mike

diff --git a/kernel/sched.c b/kernel/sched.c
index 3f6bd11..3281098 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1937,6 +1937,8 @@ context_switch(struct rq *rq, struct task_struct *prev,
        spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
 #endif
 
+       trace_mark(sched_switch_to, "%p %p %p", rq, prev, next);
+
        /* Here we just switch the register state and the stack. */
        switch_to(prev, next, prev);

SCRIPT:
probe kernel.mark("sched_switch_to")
{
        rq = $arg1
        p = $arg2
        n = $arg3

        next_pid = task_pid(n)
        prev_pid = task_pid(p)
        next_name = task_execname(n)
        prev_name = task_execname(p)

        printf("%s (%d) switching to %s (%d)\n",
               prev_name, prev_pid, next_name, next_pid)
}

> 
> So, on an x86_64 kernel, how do I instrument this method?
> 
> Perry


* Re: Instrumenting context-switching
From: Masami Hiramatsu @ 2007-10-31 16:48 UTC
  To: Mike Mason; +Cc: Perry Cheng, systemtap

Mike Mason wrote:
> What kernel version are you using?  If you're using 2.6.24-rc1 then kernel markers are available.
>  You can place a static marker (as shown in the patch below), rebuild and reboot the kernel, 
> then access the probe point via a SystemTap script (as shown in the script below the patch).

FYI, from 2.6.24-rc1 onward you can put a kprobe on __switch_to even on x86-64,
because the kretprobe blacklist has been merged.
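
As a rough illustration of what such a probe could look like, here is a
minimal sketch.  It assumes a 2.6.24-rc1 or later kernel with debuginfo
available, and that the arguments of __switch_to are named prev_p and
next_p; that naming is an assumption here, so check it against your own
tree.  task_pid() and task_execname() are the same tapset functions used
in Mike's script.

```systemtap
probe kernel.function("__switch_to")
{
        # $prev_p and $next_p are assumed parameter names; adjust them
        # if your kernel's __switch_to declares its arguments differently.
        printf("%s (%d) switching to %s (%d)\n",
               task_execname($prev_p), task_pid($prev_p),
               task_execname($next_p), task_pid($next_p))
}
```

If the probe fails to resolve, __switch_to is probably still excluded from
probing on that kernel, and one of the other approaches in this thread is
the fallback.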

Thanks,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com, masami.hiramatsu.pt@hitachi.com


* Re: Instrumenting context-switching
From: Stone, Joshua I @ 2007-10-31 19:04 UTC
  To: Mike Mason; +Cc: Perry Cheng, systemtap

Mike Mason wrote:
> The scheduler tapset probes context_switch() on x86_64, but that doesn't 
> help much.  context_switch() is an inline and, thus, the entry parameters 
> prev and next aren't accessible via SystemTap.

You can still effectively get prev and next if you use a pair of probes.
A probe at the beginning of context_switch is still on the old thread,
so current == prev.  A probe on finish_task_switch is in the context of
the new thread, so current == next.  You can access various fields in
current by using the context.stp tapset, or use task_current() to get
the actual task_struct pointer.

These functions are used for scheduler.cpu_off and scheduler.cpu_off,
since they are architecture-independent locations in the scheduler.
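
A minimal sketch of the two-probe approach might look like the following.
It assumes the scheduler tapset aliases scheduler.cpu_off (at
context_switch) and scheduler.cpu_on (at finish_task_switch) resolve on
your kernel, and that both probes for a given switch fire on the same CPU,
so the outgoing task can be remembered per CPU.  The global array names
are made up for illustration.

```systemtap
global prev_pid, prev_name   # hypothetical names, indexed by CPU number

# Fires in context_switch(), still on the outgoing thread (current == prev).
probe scheduler.cpu_off
{
        prev_pid[cpu()] = pid()
        prev_name[cpu()] = execname()
}

# Fires in finish_task_switch(), already on the incoming thread
# (current == next).
probe scheduler.cpu_on
{
        c = cpu()
        if (c in prev_pid)
                printf("%s (%d) switching to %s (%d)\n",
                       prev_name[c], prev_pid[c], execname(), pid())
}
```

Because only pid() and execname() of the current task are read, this does
not depend on the inlined prev and next parameters being visible.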


Josh


* Re: Instrumenting context-switching
From: Stone, Joshua I @ 2007-10-31 19:19 UTC
  To: Mike Mason; +Cc: Perry Cheng, systemtap

Stone, Joshua I wrote:
> These functions are used for scheduler.cpu_off and scheduler.cpu_off,

Or rather, scheduler.cpu_off and scheduler.cpu_on...

Josh
