From: liang xie <xieliang007@gmail.com>
To: Josh Stone <jistone@redhat.com>
Cc: systemtap@sourceware.org
Subject: Re: How to get the detailed kernel stack trace if specified fuction takes too long to finish?
Date: Wed, 02 Apr 2014 10:25:00 -0000 [thread overview]
Message-ID: <CADu=CFrnaoiWcD8Jt8uFd1Q087Kyp28gKukKH_JD7craJeF3xw@mail.gmail.com> (raw)
In-Reply-To: <5335A3A6.8010300@redhat.com>
I tried the last scheduler script, the output just like this:
# stap --all-modules y.stp
WARNING: missing unwind/symbol data for module 'uprobes'
long sys_write in tid 10162
Systemtap probe: kernel.trace("sched_switch")
long sys_write in tid 11510
Systemtap probe: kernel.trace("sched_switch")
long sys_write in tid 11288
Systemtap probe: kernel.trace("sched_switch")
I could not figure out why "print_backtrace()" didn't output any log:)
On Sat, Mar 29, 2014 at 12:30 AM, Josh Stone <jistone@redhat.com> wrote:
> On 03/27/2014 09:22 PM, liang xie wrote:
>> Hi all,
>>
>> Probably it's a newbie question, to be honest, i am new to SystemTap,
>
> No problem, welcome!
>
>> currently i want to capture the kernel stack trace if "sys_write" runs
>> too long, if that i could know where it sucks.
>>
>> My OS:2.6.32-220.el6.x86_64
>> # stap -V
>> Systemtap translator/driver (version 1.6/0.152 non-git sources)
>>
>> My tried script:
>> global i=0
>> global tb=0
>> global te=0
>> probe kernel.function("sys_write") {
>> tb=gettimeofday_us();
>> }
>> probe kernel.function("sys_write").return {
>> te=gettimeofday_us();
>> if(te-tb > 100000) {
>> i++;
>> printf("time_us: %d\n", te-tb);
>> print_backtrace();
>> if(i > 15) {
>> exit();
>> }
>> }
>> }
>>
>> # stap -x <pid> xxx.stp
>> time_us: 226795
>> Returning from: 0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
>> Returning to : 0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
>> time_us: 101400
>> Returning from: 0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
>> Returning to : 0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
>> time_us: 527853
>> Returning from: 0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
>> Returning to : 0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
>> time_us: 133948
>> Returning from: 0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
>> Returning to : 0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
>>
>> It doesn't show my "expected" kernel stack trace,e.g. file system or
>> block layer related stack trace. How to archive that? Any pointers
>> will be highly appreciated!
>
> Right, since you waited until return to print_backtrace, it will only be
> from the end of sys_write. (Though it appears to be the beginning,
> sys_write+0x0, because of the way a kretprobe captures returns.)
>
> Since you want info from deeper kernel internals, you'll need additional
> probes. One option is to use the para-callgraph example script, where
> sys_write is your first argument to start a trace, and the second
> argument gives a broad subsystem to dig through. Like:
>
> $ stap para-callgraph.stp 'kernel.function("sys_write")' \
> 'kernel.function("*@fs/*")'
>
> This script will show you timestamps, but you'll get all output whether
> it's slow or not. Maybe you can modify it to filter some of the
> information.
>
> Another option is to use timer.profile, and then you can watch for your
> long waits. Something like:
>
> global start_time
> probe syscall.write { start_time[tid()] = gettimeofday_us() }
> probe syscall.write.return { delete start_time[tid()] }
> probe timer.profile {
> t = tid()
> if (t in start_time && gettimeofday_us() - start_time[t] > 100000)
> {
> printf("long sys_write in tid %d\n", t)
> print_backtrace()
> delete start_time[t]
> }
> }
>
> Even this assumes that the long wait is busy on the cpu, but I think
> most likely it's an IO wait, which means you'll be scheduled off while
> it completes. So then instead of timer.profile, or in addition, you
> could use a scheduler tracepoint. Like:
>
> global start_time
> probe syscall.write { start_time[tid()] = gettimeofday_us() }
> probe syscall.write.return { delete start_time[tid()] }
> probe kernel.trace("sched_switch") {
> t = task_tid($next)
> if (t in start_time && gettimeofday_us() - start_time[t] > 100000)
> {
> printf("long sys_write in tid %d\n", t)
> print_backtrace()
> delete start_time[t]
> }
> }
>
> So as your thread is scheduled back onto the cpu, if it has been a long
> time, the backtrace will tell you what path it took to get scheduled off
> in the first place.
>
>
> Hope that's enough to give you some direction...
>
> Josh
next prev parent reply other threads:[~2014-04-02 10:25 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-28 4:22 liang xie
2014-03-28 16:56 ` Frank Ch. Eigler
2014-03-28 17:06 ` Josh Stone
2014-04-02 10:25 ` liang xie [this message]
2014-04-02 16:32 ` Josh Stone
2014-04-03 2:47 ` liang xie
2014-04-03 16:24 ` Josh Stone
2014-04-08 2:15 ` liang xie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CADu=CFrnaoiWcD8Jt8uFd1Q087Kyp28gKukKH_JD7craJeF3xw@mail.gmail.com' \
--to=xieliang007@gmail.com \
--cc=jistone@redhat.com \
--cc=systemtap@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).