public inbox for systemtap@sourceware.org
* Re: Fw: systemtap application to find applications doing polling
From: Vaidyanathan Srinivasan @ 2009-02-02  9:10 UTC
  To: Maneesh Soni; +Cc: dipankar, ananth, William Cohen, SystemTAP, Ulrich Drepper

* Maneesh Soni <maneesh@in.ibm.com> [2009-01-29 22:16:43]:

> 
> Is this something useful for energy management?

Hi Maneesh,

This would be useful for energy management, as Ulrich has noted in his
blog.  PowerTop reports the wakeup rate using /proc/timer_stats, which
also identifies device driver timers and other in-kernel offenders.

Once the userspace application is identified, further details on
the type of polling loops and on the syscalls and library APIs involved
will definitely help optimise the user applications.

Will's script will help to identify the types of polling loops and the top
offenders at run time in a userspace application.
 
> ----- Forwarded message from William Cohen <wcohen@redhat.com> -----
> 
> Date: Wed, 28 Jan 2009 11:52:10 -0500
> From: William Cohen <wcohen@redhat.com>
> To: SystemTAP <systemtap@sources.redhat.com>
> CC: Ulrich Drepper <drepper@redhat.com>
> Subject: systemtap application to find applications doing polling
> 
> Hi All,
> 
> Uli Drepper mentions in a blog entry the need to "avoid unnecessary wakeups" and
> that a systemtap script to monitor this would be useful:
> 
> http://udrepper.livejournal.com/19041.html
> 
> I talked with Uli about developing a script that identifies the processes that
> are doing a lot of polling.  The attached script, timeout.stp, monitors
> poll, epoll_wait, select, futex, nanosleep, and the interval timer (it_real_fn).
> The poll and epoll calls are only recorded if the timeout value is greater than
> zero. The output is displayed in a top-like format for the top twenty processes,
> ordered from most problem calls to fewest. The columns give the
> count of each type of call. The output looks like the following:
> 
>   uid |   poll  select   epoll  itimer   futex nanosle  signal| process
>  2628 |      0     364       0       0       0       0       0| Xorg
>  3586 |     21       0       0       0     179       0       0| thunderbird-bin
>  3575 |     41       0       0       0       0      20       0| xchat
>  3454 |      0      60       0       0       0       0       0| emacs
>  3325 |     43       0       0       0       0       0       0| gnome-terminal
>  3082 |     11       0       0       0       0       0       0| gnome-panel
>  3068 |      7       0       0       0       0       0       0| metacity
>  3181 |      6       0       0       0       0       0       0| wnck-applet
>  3119 |      0       5       0       0       0       0       0| httpd
>  2135 |      4       0       0       0       0       0       0| hald
>  2307 |      4       0       0       0       0       0       0| NetworkManager
>  2362 |      4       0       0       0       0       0       0| setroubleshootd
>  2530 |      0       0       0       0       0       4       0| cups-polld
>  3084 |      3       0       0       0       0       0       0| nautilus
>  3616 |      0       0       0       0       3       0       0| firefox
>  3060 |      2       0       0       0       0       0       0| gnome-settings-
>  2304 |      2       0       0       0       0       0       0| hald-addon-stor
>     0 |      0       0       0       1       0       0       0| swapper
> 
> I plan to check this into the systemtap.examples directory in the next day or so.
> I am just looking to see if people have additional suggestions.
> 
> -Will
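
The ranking and layout described above (top twenty entries, ordered from most
problem calls to fewest) can be sketched in Python. This is only an
illustration of the report format, not the timeout.stp source; the function
name format_top, the dictionary layout, and the sample data are assumptions
made for the example.

```python
# Hypothetical sketch of the top-like report; not taken from timeout.stp.
COLUMNS = ["poll", "select", "epoll", "itimer", "futex", "nanosle", "signal"]


def format_top(counts, names, limit=20):
    """Return report rows, busiest first.

    counts: {id: {column: number of calls}}; missing columns count as 0.
    names:  {id: process name}
    """
    def total(ident):
        return sum(counts[ident].get(col, 0) for col in COLUMNS)

    ranked = sorted(counts, key=total, reverse=True)[:limit]
    rows = []
    for ident in ranked:
        cells = " ".join(f"{counts[ident].get(col, 0):7d}" for col in COLUMNS)
        rows.append(f"{ident:5d} |{cells}| {names[ident]}")
    return rows


# Sample data copied from three rows of the output shown in the message.
sample_counts = {
    3454: {"select": 60},
    2628: {"select": 364},
    3575: {"poll": 41, "nanosle": 20},
}
sample_names = {3454: "emacs", 2628: "Xorg", 3575: "xchat"}

for row in format_top(sample_counts, sample_names):
    print(row)
```

Sorting on the row total keeps the heaviest poller at the top regardless of
which call it uses.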

The output information and format are good. I have the following
comments and suggestions:

* Display the observation interval in the output and provide options
  for, say, 1s or 10s sampling.
* At a low wakeup rate, does the systemtap script itself add to the
  wakeups?
* Do these values match closely with PowerTop?
* Can we aggregate these values for a group of PIDs (possibly by
  parent pid or tgid) so that we can collect results for a complete
  application stack easily?  I have tried doing this by manually
  adding up wakeups for a group of PIDs.
* Another wishlist item would be the ability to add probes at various
  locations in libraries and so move closer to userspace code.
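
The manual adding-up of wakeups for a group of PIDs mentioned in the list
above can be sketched in Python. This is a minimal illustration only: the
function name aggregate_by_group and the PID-to-group mapping are assumptions
for the example, and how that mapping is obtained (tgid, parent pid) is left
open.

```python
from collections import Counter


def aggregate_by_group(per_pid_counts, group_of):
    """Sum per-PID wakeup counts into per-group totals.

    per_pid_counts: {pid: {call type: count}}
    group_of:       {pid: group id}; a PID without an entry is its own group.
    """
    totals = {}
    for pid, counts in per_pid_counts.items():
        group = group_of.get(pid, pid)
        # Counter.update adds the counts instead of replacing them.
        totals.setdefault(group, Counter()).update(counts)
    return totals


# Hypothetical example: PIDs 101 and 102 belong to group 100.
per_pid = {
    101: {"poll": 21, "futex": 179},
    102: {"poll": 4},
    300: {"select": 60},
}
groups = {101: 100, 102: 100}
totals = aggregate_by_group(per_pid, groups)
print(dict(totals[100]))
```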

I am a kernel developer and a powertop user.  This systemtap script
seems to open up possibilities for a flexible and extensible method to
collect the wakeup rate for applications.

Thanks
Vaidy

PS: Please explicitly CC me since I am not subscribed to
systemtap@sources.redhat.com


* Re: Fw: systemtap application to find applications doing polling
From: William Cohen @ 2009-02-04 18:31 UTC
  To: svaidy; +Cc: Maneesh Soni, dipankar, ananth, SystemTAP, Ulrich Drepper, svaidy

Vaidyanathan Srinivasan wrote:
> [...]
> * Display the observation interval in the output and provide options
>   for say 1s or 10s sampling

A systemtap script can take an optional argument, as para-callgraph.stp
does:

http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=blob_plain;f=testsuite/systemtap.examples/general/para-callgraph.stp;hb=HEAD

> * At a low wakeup rate, does the systemtap script itself add to the
>   wakeups?

No effort is made to filter the impact of the systemtap code out of the
output. I don't see the effect in the output of timeout.stp, but powertop
shows some effect:

  41.8% (1000.0)           staprun : __mod_timer (__stp_time_timer_callback)
  41.8% (1000.0)             udevd : __mod_timer (__stp_time_timer_callback)
   4.2% (100.0)           staprun : __mod_timer (__utt_wakeup_timer)
   4.2% ( 99.8)           staprun : queue_delayed_work (delayed_work_timer_fn)

This makes me wonder if there is some way to reduce staprun's effect.

> * Do these values match closely with PowerTop?

Powertop shows a rate, while the current timeout script shows the total
accumulation. If the timeout script is adjusted to print every 10 seconds and
then clear the data, a more direct comparison can be made. I made that change
and looked at the output. There appear to be some differences in what each is
measuring. Powertop reads /proc/timer_stats; I need to check how that
differs from what timeout.stp is probing.
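
The print-and-clear adjustment described above, which turns an accumulated
total into a per-interval rate, can be sketched in Python. This is an
illustration of the idea only, not the change made to timeout.stp; the class
name IntervalCounter and its methods are hypothetical.

```python
class IntervalCounter:
    """Counts events per key and reports a rate once per interval."""

    def __init__(self):
        self.counts = {}

    def record(self, key):
        """Note one event (for example, one timed poll call) for key."""
        self.counts[key] = self.counts.get(key, 0) + 1

    def flush(self, interval_seconds):
        """Return events per second for each key, then clear the counts."""
        rates = {key: n / interval_seconds for key, n in self.counts.items()}
        self.counts = {}
        return rates


counter = IntervalCounter()
for _ in range(50):
    counter.record("Xorg")
rates = counter.flush(10)  # 50 events observed in a 10-second window
print(rates)
```

Clearing the counts at each flush is what makes the figure comparable to a
rate rather than a running total.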

> * Can we aggregate these values for a group of PIDs (possibly
>   parent pid or tgid) so that we can collect results for a complete
>   application stack easily.  I have tried doing this by manually
>   adding up wake-ups for a group of PIDs

There have been examples with PID filters that limit the scope to some
subset of PIDs and their children. Put the PID and any children into an
associative array, and then check the associative array before doing the
probing operation.
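
The filtering logic just described can be sketched in Python, with a set
standing in for the associative array. This is a model of the bookkeeping
only; the class PidFilter and its method names are hypothetical, and in a
real script the fork and exit notifications would come from probes.

```python
class PidFilter:
    """Tracks a root PID and its descendants."""

    def __init__(self, root_pid):
        self.tracked = {root_pid}

    def on_fork(self, parent_pid, child_pid):
        # A child is tracked only if its parent already is.
        if parent_pid in self.tracked:
            self.tracked.add(child_pid)

    def on_exit(self, pid):
        # Forget exited processes so a reused PID is not tracked by mistake.
        self.tracked.discard(pid)

    def wants(self, pid):
        """Check this before recording an event for pid."""
        return pid in self.tracked


pid_filter = PidFilter(100)
pid_filter.on_fork(100, 101)   # child of the root: tracked
pid_filter.on_fork(101, 102)   # grandchild: tracked
pid_filter.on_fork(200, 201)   # unrelated process: ignored
print(sorted(pid_filter.tracked))
```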

> * Another wishlist item would be to be able to add a probe at various
>   locations in library and move closer to userspace code. 

There has been some work on userspace probing for systemtap. It isn't
packaged in a distro yet, but a Fedora package should be coming out soon.
However, this needs utrace in the kernel.

> 
> I am a kernel developer and a powertop user.  This systemtap script
> seems to open-up possibilities for a flexible and extensible method to
> collect wakeup rate for applications.
> 
> Thanks
> Vaidy
> 
> PS: Please explicitly CC me since I am not subscribed to
> systemtap@sources.redhat.com

Thanks for the comments and feedback on timeout.stp.

-Will


* Re: Fw: systemtap application to find applications doing polling
From: Vaidyanathan Srinivasan @ 2009-02-05 18:02 UTC
  To: William Cohen; +Cc: Maneesh Soni, dipankar, ananth, SystemTAP, Ulrich Drepper

* William Cohen <wcohen@redhat.com> [2009-02-04 12:08:07]:

> Vaidyanathan Srinivasan wrote:
> > [...]
> > * Display the observation interval in the output and provide options
> >   for say 1s or 10s sampling
> 
> A systemtap script can take an optional argument, as para-callgraph.stp
> does:
> 
> http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=blob_plain;f=testsuite/systemtap.examples/general/para-callgraph.stp;hb=HEAD
> 
> > * At a low wakeup rate, does the systemtap script itself add to the
> >   wakeups?
> 
> No effort is made to filter the impact of the systemtap code out of the
> output. I don't see the effect in the output of timeout.stp, but powertop
> shows some effect:
> 
>   41.8% (1000.0)           staprun : __mod_timer (__stp_time_timer_callback)
>   41.8% (1000.0)             udevd : __mod_timer (__stp_time_timer_callback)
>    4.2% (100.0)           staprun : __mod_timer (__utt_wakeup_timer)
>    4.2% ( 99.8)           staprun : queue_delayed_work (delayed_work_timer_fn)
> 
> This makes me wonder if there is some way to reduce staprun's effect.

This wakeup rate is very high, which implies that we should use the
stap script only for per-application wakeup tracing and should
not try to profile the overall system.

Definitely some opportunity here for stap to reduce wakeups :)
But what is causing udevd to wake up so often?
 
> > * Do these values match closely with PowerTop?
> 
> Powertop shows a rate, while the current timeout script shows the total
> accumulation. If the timeout script is adjusted to print every 10 seconds and
> then clear the data, a more direct comparison can be made. I made that change
> and looked at the output. There appear to be some differences in what each is
> measuring. Powertop reads /proc/timer_stats; I need to check how that
> differs from what timeout.stp is probing.

The overall wakeup rate shown by powertop is averaged over nr_cpus.  The
per-application/thread wakeup count is accurate as far as I have
determined from experiments, and I have also compared it against
/proc/interrupts (LOC is the local timer interrupts).
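
A cross-check against the LOC line of /proc/interrupts, averaged over the
number of CPUs, could look like the Python sketch below. It is a minimal
illustration: the helper names loc_counts and loc_rate_per_cpu are
hypothetical, the sample text is made up, and the assumed layout (a "LOC:"
label followed by one counter per CPU and a description) should be checked
against the kernel in use.

```python
def loc_counts(interrupts_text):
    """Return the per-CPU counters from the LOC line, or [] if absent."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            # Keep the numeric per-CPU fields, drop the trailing description.
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []


def loc_rate_per_cpu(before, after, interval_seconds):
    """Local timer interrupts per second, averaged over the CPUs."""
    delta = sum(after) - sum(before)
    return delta / interval_seconds / len(after)


# Made-up snapshots taken 10 seconds apart on a two-CPU machine.
SAMPLE_T0 = """           CPU0       CPU1
LOC:       1000       2000   Local timer interrupts
"""
SAMPLE_T1 = """           CPU0       CPU1
LOC:       1600       2400   Local timer interrupts
"""

rate = loc_rate_per_cpu(loc_counts(SAMPLE_T0), loc_counts(SAMPLE_T1), 10)
print(rate)
```

On a live system the two snapshots would be read from /proc/interrupts at the
start and end of the observation interval.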

> > * Can we aggregate these values for a group of PIDs (possibly
> >   parent pid or tgid) so that we can collect results for a complete
> >   application stack easily.  I have tried doing this by manually
> >   adding up wake-ups for a group of PIDs
> 
> There have been examples with PID filters that limit the scope to some
> subset of PIDs and their children. Put the PID and any children into an
> associative array, and then check the associative array before doing the
> probing operation.

Yeah, this should be easy with stap.
 
> > * Another wishlist item would be to be able to add a probe at various
> >   locations in library and move closer to userspace code. 
> 
> There has been some work on userspace probing for systemtap. It isn't
> packaged in a distro yet, but a Fedora package should be coming out soon.
> However, this needs utrace in the kernel.

Looking forward to this feature.  This will bring statistics and
tracing closer to libraries where there may be better scope for
optimisations.

Thanks,
Vaidy


* Re: Fw: systemtap application to find applications doing polling
From: Frank Ch. Eigler @ 2009-02-05 21:02 UTC
  To: svaidy
  Cc: William Cohen, Maneesh Soni, dipankar, ananth, SystemTAP, Ulrich Drepper

Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> writes:

> [...]
> This wakeup rate is very high and this implies that we should use the
> stap script for a per-application level wakeup tracing only and should
> not try to profile the overall system.
> Definitely some opportunity here for stap to reduce wakeups :)

Yes, will do.

> [...]
>> There has been some work on userspace probing for systemtap. It isn't
>> packaged in a distro yet, but a Fedora package should be coming out soon.
>> However, this needs utrace in the kernel.
>
> Looking forward to this feature.  [...]

Release 0.8 is already in Fedora, so you can experiment with it now.
Systemtap from git is better for shared library probing, though.


- FChE
