public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* Example looking at the periodic timers
@ 2011-12-13 22:29 William Cohen
  2011-12-14 14:52 ` Mark Wielaard
  0 siblings, 1 reply; 6+ messages in thread
From: William Cohen @ 2011-12-13 22:29 UTC (permalink / raw)
  To: systemtap

[-- Attachment #1: Type: text/plain, Size: 3848 bytes --]

One case that people would like to use SystemTap for is finding out why some latency sensitive application is slowed down every x seconds.  They know that something keeps running on the machine and slowing things down, but they don't know what it is.

I have been playing around with the timer tracepoints and came up with the attached script, periodic.stp.  When the script exits it prints out a list of the periodic timers, sorted from most frequently to least frequently fired timer. It also include the average period for the timer.  The functions that have [dg] are things put into the delayed_work_queues.

Here is the example output:


$ ./periodic.stp -c "sleep 10"
WARNING: missing unwind/symbol data for module 'uprobes'
#timer            function                              period(us)      count
0xffffe8ffffc118a8 0xffffffffa0e38d30                          1000      10000
0xffffe8ffffc918a8 0xffffffffa0e38d30                          1000      10000
0xffffe8ffffd118a8 0xffffffffa0e38d30                           999      10000
0xffffe8ffffd918a8 0xffffffffa0e38d30                           999      10000
0xffff880028216fa8 do_dbs_timer[dq]                           10000        999
0xffffffffa115e960 0xffffffffa0e38c10                         10000        999
0xffff880028296fa8 do_dbs_timer[dq]                            9999        999
0xffff880028316fa8 do_dbs_timer[dq]                           10000        999
0xffff880028396fa8 do_dbs_timer[dq]                            9999        999
0xffffffffa115e9c0 0xffffffffa0e38b50                         19999        499
0xffff880130c9fe50 monitor_timeout                           200002         49
0xffffffff81ea3b20 clocksource_watchdog                      499993         19
0xffff880028212d40 vmstat_update[dq]                         999963          9
0xffff880132cf9318 br_fdb_cleanup                            999962          9
0xffff880028292d40 vmstat_update[dq]                         999999          9
0xffff880028312d40 vmstat_update[dq]                        1000000          9
0xffff880028392d40 vmstat_update[dq]                        1000002          9
0xffff8801303c96c0 i915_gem_retire_work_handler[dq]         1000001          9
0xffff88013156e128 flush_to_ldisc[dq]                        113108          8
0xffff880130315928 flush_to_ldisc[dq]                        180399          5
0xffff8801303c97e8 intel_gpu_idle_timer                     1607001          5
0xffff880028212de0 cache_reap[dq]                           1999911          4
0xffff880132cf9228 br_hello_timer_expired                   2000003          4
0xffff880028292de0 cache_reap[dq]                           2000002          4
0xffff8801336f06e0 e1000_watchdog                           2000000          4
0xffff880028312de0 cache_reap[dq]                           1999994          4
0xffff880028392de0 cache_reap[dq]                           2000000          4
0xffff880131767e18 cupsd(10171)                             1500570          4
0xffff880130303540 intel_crtc_idle_timer                    3658003          2
0xffff8801303ca540 intel_crtc_idle_timer                    3658001          2
0xffff880134d2fdf0 bdi-default(29)                          5000032          1
0xffff8801336f0358 dev_watchdog                             5000050          1
0xffffffff81fab180 sync_supers_timer_fn                     5999999          1
0xffff88012e1b0b58 cfq_idle_slice_timer                     4152994          1
0xffff880130161dc0 ips-adjust(874)                          4999998          1
0xffff880130c21dc0 flush-253:0(1347)                        5000016          1
0xffff880130c4fdc0 flush-253:4(1348)                        5000015          1

What do people think of the script?  Any suggestion on improvements before checking it into the examples?

-Will



[-- Attachment #2: periodic.stp --]
[-- Type: text/plain, Size: 1225 bytes --]

#!/usr/bin/stap --all-modules
# Copyright (C) 2011 Red Hat, Inc.
# Written by William Cohen <wcohen@redhat.com>
#
# A little script to give an idea what periodic things are running on
# the system

global last_expire, period, funct, data

probe kernel.trace("timer_expire_entry")
{
  old_expire = last_expire[$timer]
  new_expire = gettimeofday_us()
  if (old_expire) {
    elapsed = new_expire - old_expire
    period[$timer] <<< elapsed
    funct[$timer] = $timer->function
    data[$timer] = $timer->data
  }
  last_expire[$timer] = new_expire
}

probe end
{
  printf("#%-16s %-35s %12s %10s\n", "timer", "function", "period(us)", "count")
  # print out the various timers firing
  foreach([timer] in period-) {
    fname = symname(funct[timer])
    if (fname == "process_timeout") {
      fname = sprintf("%s(%d)",
        kernel_string_n(@cast(data[timer], "struct task_struct")->comm, 16),
        @cast(data[timer], "struct task_struct")->pid)
    } else if (fname == "delayed_work_timer_fn") {
      fname = sprintf("%s[dq]", 
        symname(@cast(data[timer], "struct delayed_work")->work->func))
    }
    printf("%16p %-35s %12d %10d\n", timer, fname,
           @avg(period[timer]), @count(period[timer]))
  }
}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example looking at the periodic timers
  2011-12-13 22:29 Example looking at the periodic timers William Cohen
@ 2011-12-14 14:52 ` Mark Wielaard
  2011-12-14 15:32   ` William Cohen
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Wielaard @ 2011-12-14 14:52 UTC (permalink / raw)
  To: William Cohen; +Cc: systemtap

On Tue, 2011-12-13 at 16:57 -0500, William Cohen wrote:
> One case that people would like to use SystemTap for is finding out
> why some latency sensitive application is slowed down every x seconds.
> They know that something keeps running on the machine and slowing
> things down, but they don't know what it is.
> 
> I have been playing around with the timer tracepoints and came up with
> the attached script, periodic.stp.  When the script exits it prints
> out a list of the periodic timers, sorted from most frequently to
> least frequently fired timer. It also include the average period for
> the timer.  The functions that have [dg] are things put into the
> delayed_work_queues.

Looks nice. I would add a begin probe to announce the script started and
that the user must press control-C to get results and/or make the end
probe a end,timer.s(10) probe so that you get output every 10 seconds
(then you can also clean up the tables).

Are the timer addresses really useful? I just don't know how to
interpret them, does it matter on which timer some event fired?

Printing the comms on process_timeout is a nice touch. Maybe explicitly
prefix it with process: or some other string making that more clear?

You could also add modname() to the output, then people can more easily
see whether it is a kernel or module function being triggered.

I also got some functions that don't immediately make sense, they don't
seem to fall inside either the kernel nor a module. Did you figure out
where they point at? From seeing the high address I assume they are
actually dynamically allocated structures, not direct functions.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example looking at the periodic timers
  2011-12-14 14:52 ` Mark Wielaard
@ 2011-12-14 15:32   ` William Cohen
  2011-12-14 18:19     ` Frank Ch. Eigler
  2011-12-14 21:58     ` Mark Wielaard
  0 siblings, 2 replies; 6+ messages in thread
From: William Cohen @ 2011-12-14 15:32 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: systemtap

[-- Attachment #1: Type: text/plain, Size: 2960 bytes --]

On 12/14/2011 06:23 AM, Mark Wielaard wrote:
> On Tue, 2011-12-13 at 16:57 -0500, William Cohen wrote:
>> One case that people would like to use SystemTap for is finding out
>> why some latency sensitive application is slowed down every x seconds.
>> They know that something keeps running on the machine and slowing
>> things down, but they don't know what it is.
>>
>> I have been playing around with the timer tracepoints and came up with
>> the attached script, periodic.stp.  When the script exits it prints
>> out a list of the periodic timers, sorted from most frequently to
>> least frequently fired timer. It also include the average period for
>> the timer.  The functions that have [dg] are things put into the
>> delayed_work_queues.

Hi Mark,

Thanks for the feedback. I made some revisions and attached the new peroidic.stp.

> 
> Looks nice. I would add a begin probe to announce the script started and
> that the user must press control-C to get results and/or make the end
> probe a end,timer.s(10) probe so that you get output every 10 seconds
> (then you can also clean up the tables).

The begin probe would be easy enough to add. Loading in thing with the "--all-modules" can slow things down, so that would give people some useful feedback.  The timer.s(10) might not be so useful when people are looking for things that have a period longer than 10 seconds.  There are things in the kernel that run every 600 seconds or so, and timer.s(10) would miss those. Maybe make an optional argument for periodic output/cleanup.

> 
> Are the timer addresses really useful? I just don't know how to
> interpret them, does it matter on which timer some event fired?

The timer addresses are probably not that useful in the output and I was debating whether to remove them before posting the script on the mailing list. It was more for when I was creating things and I could see that the same functions were attached to different timers. Now that the script appears to be working I removed that from the output.

> Printing the comms on process_timeout is a nice touch. Maybe explicitly
> prefix it with process: or some other string making that more clear?
> 
> You could also add modname() to the output, then people can more easily
> see whether it is a kernel or module function being triggered.

The output for the function could use some tweaks. The modname() sounds like a good addition.  Maybe replace the #timer columnn with type: "module", "process", "delayed_work" as in the revised script.

> I also got some functions that don't immediately make sense, they don't
> seem to fall inside either the kernel nor a module. Did you figure out
> where they point at? From seeing the high address I assume they are
> actually dynamically allocated structures, not direct functions.

Which functions are you referring to? I think that the raw addresses in the output are might the timers related to the systemtap instrumentation module.

-Will


[-- Attachment #2: periodic.stp --]
[-- Type: text/plain, Size: 1591 bytes --]

#!/usr/bin/stap --all-modules
# Copyright (C) 2011 Red Hat, Inc.
# Written by William Cohen <wcohen@redhat.com>
#
# A little script to give an idea what periodic things are running on
# the system

global last_expire, period, funct, data

probe kernel.trace("timer_expire_entry")
{
  old_expire = last_expire[$timer]
  new_expire = gettimeofday_us()
  if (old_expire) {
    elapsed = new_expire - old_expire
    period[$timer] <<< elapsed
    funct[$timer] = $timer->function
    data[$timer] = $timer->data
  }
  last_expire[$timer] = new_expire
}

function output()
{
  printf("#%-16s %-35s %12s %10s\n", "type", "function", "period(us)", "count")
  # print out the various timers firing
  foreach([timer] in period-) {
    fname = symname(funct[timer])
    if (fname == "process_timeout") {
      fname = sprintf("%s(%d)",
        kernel_string_n(@cast(data[timer], "struct task_struct")->comm, 16),
        @cast(data[timer], "struct task_struct")->pid)
	type="process"
    } else if (fname == "delayed_work_timer_fn") {
      fname = sprintf("%s", 
        symname(@cast(data[timer], "struct delayed_work")->work->func))
	type="work_queue"
    } else {
      fname = sprintf("%s:%s", modname(funct[timer]), fname)
	type="kernel"
    }
    printf("%-16s %-35s %12d %10d\n", type, fname,
           @avg(period[timer]), @count(period[timer]))
  }
}

probe begin { printf("#monitoring timer periods\n") }
probe end { output() }

# allow optional period output from script
%( $# > 0 %?
probe timer.s($1)
{
  output();
  delete last_expire
  delete period
  delete funct
  delete data
}
%)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example looking at the periodic timers
  2011-12-14 15:32   ` William Cohen
@ 2011-12-14 18:19     ` Frank Ch. Eigler
  2011-12-14 21:58     ` Mark Wielaard
  1 sibling, 0 replies; 6+ messages in thread
From: Frank Ch. Eigler @ 2011-12-14 18:19 UTC (permalink / raw)
  To: William Cohen; +Cc: Mark Wielaard, systemtap


wcohen wrote:

> [...]
>> Are the timer addresses really useful? I just don't know how to
>> interpret them, does it matter on which timer some event fired?
>
> The timer addresses are probably not that useful in the output [...]

If symdata() can get meaningful names from the timer addresses, then
they could be informative.


- FChE

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example looking at the periodic timers
  2011-12-14 15:32   ` William Cohen
  2011-12-14 18:19     ` Frank Ch. Eigler
@ 2011-12-14 21:58     ` Mark Wielaard
  2011-12-16 20:45       ` Mark Wielaard
  1 sibling, 1 reply; 6+ messages in thread
From: Mark Wielaard @ 2011-12-14 21:58 UTC (permalink / raw)
  To: William Cohen; +Cc: systemtap

[-- Attachment #1: Type: text/plain, Size: 2153 bytes --]

On Wed, Dec 14, 2011 at 09:52:18AM -0500, William Cohen wrote:
> > Looks nice. I would add a begin probe to announce the script started and
> > that the user must press control-C to get results and/or make the end
> > probe a end,timer.s(10) probe so that you get output every 10 seconds
> > (then you can also clean up the tables).
> 
> The begin probe would be easy enough to add. Loading in thing with the
> "--all-modules" can slow things down, so that would give people some
> useful feedback.  The timer.s(10) might not be so useful when people
> are looking for things that have a period longer than 10 seconds.
> There are things in the kernel that run every 600 seconds or so,
> and timer.s(10) would miss those. Maybe make an optional argument
> for periodic output/cleanup.

Nice elegant solution. To nitpick, I would make the output more explicit:
-probe begin { printf("#monitoring timer periods\n") }
+probe begin { printf("# Monitoring timer periods (press ctrl-C for output)\n") }

> > I also got some functions that don't immediately make sense, they don't
> > seem to fall inside either the kernel nor a module. Did you figure out
> > where they point at? From seeing the high address I assume they are
> > actually dynamically allocated structures, not direct functions.
> 
> Which functions are you referring to? I think that the raw addresses in
> the output are might the timers related to the systemtap instrumentation
> module.

Aha. You are right.
I even have a patch for modname() to get "unknown" module names.
I haven't checked it in because I found the preempt_disable() so
incredible ugly. We cannot use the module_mutex because we could
theoretically be probing some code that holds that mutex. What do
people think, is it useful enough to warrant the ugliness?

Since stap module names are so incredibly long you do need the following
patch to truncate the module names in periodic.stp:

-      fname = sprintf("%s:%s", modname(funct[timer]), fname)
+      fname = sprintf("%.16s:%s", modname(funct[timer]), fname)


This does expose that we (systemtap) are responsible for most timer
interrupts... hmmmm.

Cheers,

Mark

[-- Attachment #2: modname.patch --]
[-- Type: text/plain, Size: 1006 bytes --]

diff --git a/tapset/context-symbols.stp b/tapset/context-symbols.stp
index 1ab3a61..79d45f5 100644
--- a/tapset/context-symbols.stp
+++ b/tapset/context-symbols.stp
@@ -154,11 +154,28 @@ function probemod:string () %{ /* pure */
  */
 function modname:string (addr: long) %{ /* pure */
 	struct _stp_module *m;
+#ifdef STAPCONF_MODULE_TEXT_ADDRESS
+	struct module *ko;
+#endif
 	m = _stp_kmod_sec_lookup (THIS->addr, NULL);
 	if (m && m->name)
-          strlcpy (THIS->__retvalue, m->name, MAXSTRINGLEN);
-	else
-	  strlcpy (THIS->__retvalue, "<unknown>", MAXSTRINGLEN);
+	  {
+	    strlcpy (THIS->__retvalue, m->name, MAXSTRINGLEN);
+	    return;
+	  }
+
+#ifdef STAPCONF_MODULE_TEXT_ADDRESS
+	preempt_disable();
+	ko = __module_text_address (THIS->addr);
+	if (ko && ko->name)
+	  {
+	    strlcpy (THIS->__retvalue, ko->name, MAXSTRINGLEN);
+	    preempt_enable_no_resched();
+	    return;
+	  }
+	preempt_enable_no_resched();
+#endif
+	strlcpy (THIS->__retvalue, "<unknown>", MAXSTRINGLEN);
 %}
 
 /**

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example looking at the periodic timers
  2011-12-14 21:58     ` Mark Wielaard
@ 2011-12-16 20:45       ` Mark Wielaard
  0 siblings, 0 replies; 6+ messages in thread
From: Mark Wielaard @ 2011-12-16 20:45 UTC (permalink / raw)
  To: William Cohen; +Cc: systemtap

On Wed, Dec 14, 2011 at 07:19:12PM +0100, Mark Wielaard wrote:
> On Wed, Dec 14, 2011 at 09:52:18AM -0500, William Cohen wrote:
> > Which functions are you referring to? I think that the raw addresses in
> > the output are might the timers related to the systemtap instrumentation
> > module.
> 
> Aha. You are right.
> I even have a patch for modname() to get "unknown" module names.
> I haven't checked it in because I found the preempt_disable() so
> incredible ugly. We cannot use the module_mutex because we could
> theoretically be probing some code that holds that mutex. What do
> people think, is it useful enough to warrant the ugliness?

Since nobody yelled and screamed about it being too ugly I cleaned
it up and made sure it also works for symdata().

commit 17373b9e59df05fb8101ec173b580d593ba65d09
Author: Mark Wielaard <mjw@redhat.com>
Date:   Fri Dec 16 16:59:00 2011 +0100

    modname/symdata: Try harder to retrieve module name when requested.
    
    When the user really wants a kernel module name with an address and
    we are unable to map it to any existing one then fall back on using
    __module_text_address() with appropriate locking.

Now the periodic.stp example will show who is really responsible
for all these timers... us...

#monitoring timer periods (press control-c for output)
#type   function                                            period(us)     count
kernel  0xffffffffa109c780 [stap_6c9caccd440e93fb9f95605bb         999      4999
kernel  0xffffffffa109c780 [stap_6c9caccd440e93fb9f95605bb        1000      4996
kernel  0xffffffffa109c780 [stap_6c9caccd440e93fb9f95605bb        1001      4992
kernel  0xffffffffa109c780 [stap_6c9caccd440e93fb9f95605bb        1001      4992
work_q  do_dbs_timer                                              9999       499
kernel  0xffffffffa109c330 [stap_6c9caccd440e93fb9f95605bb       10000       499
work_q  do_dbs_timer                                             10000       499
work_q  do_dbs_timer                                             10000       499
work_q  do_dbs_timer                                             10000       499
kernel  0xffffffffa109c270 [stap_6c9caccd440e93fb9f95605bb       20003       249
kernel  flush_unmaps_timeout+0x0/0x40 [kernel]                   80573        61
kernel  clocksource_watchdog+0x0/0x210 [kernel]                 500001         9
work_q  i915_gem_retire_work_handler                           1001999         4
work_q  sync_cmos_clock                                         999999         4
work_q  vmstat_update                                          1000002         4
work_q  vmstat_update                                          1000002         4
work_q  vmstat_update                                           999999         4
work_q  vmstat_update                                           999997         4
kernel  intel_gpu_idle_timer+0x0/0x80 [i915]                    601000         3
kernel  br_hello_timer_expired+0x0/0x80 [bridge]               2000015         1
kernel  iwl_bg_watchdog+0x0/0x110 [iwlagn]                     2504008         1
kernel  e1000_watchdog+0x0/0x20 [e1000e]                       2000016         1

Cheers,

Mark

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-12-16 19:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-13 22:29 Example looking at the periodic timers William Cohen
2011-12-14 14:52 ` Mark Wielaard
2011-12-14 15:32   ` William Cohen
2011-12-14 18:19     ` Frank Ch. Eigler
2011-12-14 21:58     ` Mark Wielaard
2011-12-16 20:45       ` Mark Wielaard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).