* Latencytap thoughts
@ 2010-09-23 21:59 William Cohen
From: William Cohen @ 2010-09-23 21:59 UTC (permalink / raw)
To: SystemTAP
Hi everyone,
I am working to close out PR6960, latencytap. I thought that I would
discuss the issues with the current latencytap on the mailing list and
see what suggestions other people have to improve it.
The current output of latencytap is rather nebulous. The output should
point out specific things that can be improved or corrected on the
system. Below is an example of the current output that latencytap.stp
prints for every 30 second interval:
Reason                            Count  Average(us)  Maximum(us)  Percent%
Application requested delay         340       442067      2000977       20%
Waiting for event (poll)            348       375896     29999669       17%
Waiting for event (select)           47      1970250      4999867       12%
                                   3034        25937     18434087       10%
Waiting for event (select)           31      1935439     30000899        8%
Waking ksoftirqd                      2     15268765     29925887        4%
Userspace lock contention            30      1000943      1000952        4%
Waiting for event (select)            6      4999926      4999932        4%
                                     22      1363599      3423967        4%
pdflush() kernel thread               6      4999761      4999962        4%
Waiting for event (poll)            150       199950       201709        4%
Waiting for event (epoll)             2     14544543     26414890        3%
kjournald() kernel thread             2     11711964     18423438        3%
Waiting for event (epoll)             2       999716       999886        0%
EXT3: Waiting for journal access      3        41252       107177        0%
opening cdrom device                 15         2529         2685        0%
opening cdrom device                 15         2116         2173        0%
block device IOCTL                   15         2109         2161        0%
opening cdrom device                 15         2081         2197        0%
opening cdrom device                 15         1964         2112        0%
The "Reason" column is based on the functions found in the stack
backtrace. If no reason is found for any of the functions in the
backtrace, then the reason is left blank. One can generate a kernel
module and run it using staprun with debug=1 to get the original
backtraces for the entries without reasons. Another side effect of this
method is that there can be multiple entries with the same reason
because they have different backtraces.
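To make the lookup concrete, here is a minimal Python sketch of the idea, not the actual latencytap.stp code. The table entries and the function names in it are illustrative assumptions, and so is the "first match wins" rule; the real script may order or prioritize matches differently.

```python
# Hypothetical function-to-reason table. These entries are illustrative
# only and are NOT copied from latencytap.stp.
REASONS = {
    "sys_nanosleep": "Application requested delay",
    "do_select": "Waiting for event (select)",
    "sys_poll": "Waiting for event (poll)",
}


def reason_for(backtrace):
    """Return the reason of the first function in the backtrace that has
    one (assumed rule), or an empty string if none of them do."""
    for func in backtrace:
        if func in REASONS:
            return REASONS[func]
    return ""


# Two different backtraces that map to the same reason. Because rows are
# keyed by backtrace, they would show up as two separate rows.
trace_a = ["schedule", "schedule_timeout", "do_select", "core_sys_select"]
trace_b = ["schedule", "do_select", "compat_sys_select"]
# A backtrace with no known function gets a blank reason.
trace_c = ["schedule", "some_unknown_function"]
```

Here reason_for(trace_a) and reason_for(trace_b) give the same string even though the backtraces differ, which is how duplicate reasons end up in the table.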
The rows are sorted by the total amount of time spent deactivated
for each backtrace. This can be seen in the "Percent%" column on the
right. Note that multiple backtraces that have the same reason are not
condensed into a single entry right now.
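One possible way to condense rows that share a reason is sketched below in Python, as post-processing of the printed table rather than a change to latencytap.stp. It assumes the total time for a row can be approximated as Count times Average(us), which is only approximate because the printed averages are already rounded. The function and field names are made up for this example.

```python
from collections import defaultdict

# (reason, count, average_us, maximum_us) rows copied from the sample
# output above; only the select and poll rows are used here.
rows = [
    ("Waiting for event (poll)", 348, 375896, 29999669),
    ("Waiting for event (select)", 47, 1970250, 4999867),
    ("Waiting for event (select)", 31, 1935439, 30000899),
    ("Waiting for event (select)", 6, 4999926, 4999932),
    ("Waiting for event (poll)", 150, 199950, 201709),
]


def condense(rows):
    """Merge rows with the same reason and sort by total time, descending.

    Returns tuples of (reason, count, average_us, maximum_us, total_us).
    """
    merged = defaultdict(lambda: {"count": 0, "total_us": 0, "max_us": 0})
    for reason, count, avg_us, max_us in rows:
        entry = merged[reason]
        entry["count"] += count
        # Approximate the row's total time from its count and average.
        entry["total_us"] += count * avg_us
        entry["max_us"] = max(entry["max_us"], max_us)
    result = [
        (reason, e["count"], e["total_us"] // e["count"], e["max_us"], e["total_us"])
        for reason, e in merged.items()
    ]
    result.sort(key=lambda r: r[4], reverse=True)
    return result


condensed = condense(rows)
```

With these five rows, the three select entries collapse into one row with a count of 84 and the two poll entries into one row with a count of 498, and the merged select row sorts ahead of poll.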
The question is what kind of data analysis would help people figure
out "What is the holdup on the machine?"
Maybe divide things into interruptible and noninterruptible reasons.
Have a sub-table showing which user processes have the greatest amount of latency.
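As a rough Python sketch of these two ideas, the same grouping helper could produce both breakdowns. Everything in it is hypothetical: the sample records, the process names, and the assumption that each sample carries the process name and whether the sleep was interruptible. It is not what latencytap.stp records today.

```python
from collections import defaultdict

# Hypothetical samples: (process name, interruptible?, latency in us).
samples = [
    ("firefox", True, 2000),
    ("firefox", False, 500),
    ("kjournald", False, 11000),
    ("sshd", True, 300),
]


def totals_by(samples, key_index):
    """Sum the latency per key and return (key, total_us) pairs,
    largest total first."""
    totals = defaultdict(int)
    for sample in samples:
        totals[sample[key_index]] += sample[2]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Sub-table of which processes accumulate the most latency.
by_process = totals_by(samples, 0)
# Split between interruptible (True) and noninterruptible (False) sleeps.
by_state = totals_by(samples, 1)
```

The by_process list would back the per-process sub-table, and by_state would give the interruptible versus noninterruptible totals.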
Any other suggestions would be appreciated.
-Will