public inbox for systemtap@sourceware.org
* proc memory statistics tapset
@ 2009-10-05 15:07 Mark Wielaard
  2009-10-05 18:09 ` Josh Stone
  2009-10-09 15:18 ` task time tapset Mark Wielaard
  0 siblings, 2 replies; 9+ messages in thread
From: Mark Wielaard @ 2009-10-05 15:07 UTC (permalink / raw)
  To: systemtap

[-- Attachment #1: Type: text/plain, Size: 2920 bytes --]

Hi,

I recently added printing of memory usage to each stap pass in verbose
mode to see how much (extra) memory each pass contributed, which helped
me keep track of memory usage while I refactored some of the dwfl
construction code.

It occurred to me this is actually what we need to use systemtap for. We
already have markers in stap itself that can be probed for the start and
end of each pass. So instead of adding new source code I should have
added a tapset that provides that information. That way it could be
reused by anybody that wants to trace a program and get memory
measurements/statistics. So that is what I just did.

With this new tapset you get the same results as stap -v (actually a bit
more and nicer human readable memory strings) with this oneliner: 

$ stap -e 'probe process("stap").mark("pass[0-3]*") { log($$name . "\t" . proc_mem_string()) }' -c 'stap -k -p4 testsuite/buildok/context_test.stp'
pass0__start	size: 58m, rss: 2528k, shr: 2068k, txt: 1352k, data: 428k
pass0__end	size: 58m, rss: 2548k, shr: 2088k, txt: 1352k, data: 428k
pass1a__start	size: 58m, rss: 2548k, shr: 2088k, txt: 1352k, data: 428k
pass1b__start	size: 58m, rss: 2748k, shr: 2240k, txt: 1352k, data: 428k
pass1__end	size: 74m, rss: 18m, shr: 2400k, txt: 1352k, data: 16m
pass2__start	size: 74m, rss: 18m, shr: 2400k, txt: 1352k, data: 16m
pass2__end	size: 74m, rss: 19m, shr: 2584k, txt: 1352k, data: 17m
pass3__start	size: 74m, rss: 19m, shr: 2584k, txt: 1352k, data: 17m
pass3__end	size: 74m, rss: 19m, shr: 2784k, txt: 1352k, data: 17m

But it is usable for any probe in any program (or the kernel, as long as
there is an associated current task for the probe point), and you don't
have to print anything, you can also query the different memory stats
individually and put them in an aggregate to show max/min/avg over time,
etc.

I would like to add it as tapset/proc_mem.stp and put the documentation
together with the Memory Tapset in the Tapset Reference manual since I
think they are generally useful. But if people feel they should go into
the examples only, that is fine too.

They are currently all marked /* unprivileged */ since the same
information is world readable through ps, top or /proc/pid/statm. But
the functions could also check is_myproc() and return zero when the
current process does not belong to the user running the script.

The code is slightly paranoid when it comes to fetching the mm struct,
but that was the only way I could see that made it safe. It only works
for the current task because that is what we can access lock free. For
any other task we would need to somehow keep some shadow bookkeeping to
prevent having to take locks on task and/or mm structs and I don't think
that is really worth it. The interesting memory stats are probably those
of the current process anyway.

Tested against 2.6.18 and 2.6.31.1 on x86_64 and 2.6.30 on i686, tests on
other kernels and architectures or any other feedback very welcome.

Cheers,

Mark

[-- Attachment #2: proc_mem.stp --]
[-- Type: text/x-csrc, Size: 5810 bytes --]

// Process memory query and utility functions.
// Copyright (C) 2009 Red Hat Inc.
//
// This file is part of systemtap, and is free software.  You can
// redistribute it and/or modify it under the terms of the GNU General
// Public License (GPL); either version 2, or (at your option) any
// later version.

// <tapsetdescription>
// Process memory query and utility functions provide information about
// the memory usage of the current application. These functions provide
// information about the full size, resident, shared, code and data used
// by the current process. And provide utility functions to query the
// page size of the current architecture and create human readable string
// representations of bytes and pages used.
// </tapsetdescription>

%{
/* PF_BORROWED_MM got renamed to PF_KTHREAD with same semantics somewhere. */
#ifdef PF_BORROWED_MM
#define _STP_PF_KTHREAD PF_BORROWED_MM
#else
#define _STP_PF_KTHREAD PF_KTHREAD
#endif
  /* Returns the mm for the current proc. Slightly paranoid. Only returns
     from safe contexts (current must exist), and only when the task doesn't
     happen to be (coopted by) a kernel thread. Callers also check
     CONTEXT->regs. */
  static struct mm_struct *_stp_proc_mm(void)
  {
    if (! current)
      return NULL;
    if (current->flags & _STP_PF_KTHREAD)
      return NULL;
    return current->mm;
  }
%}

/**
 * sfunction proc_mem_size - Total program virtual memory size in pages.
 *
 * Description: Returns the total virtual memory size in pages of the
 * current process, or zero when there is no current process or the
 * number of pages couldn't be retrieved.
 */
function proc_mem_size:long ()
%{ /* pure */ /* unprivileged */
   struct mm_struct *mm = _stp_proc_mm ();
   if (CONTEXT->regs && mm)
     THIS->__retvalue = mm->total_vm;
   else
     THIS->__retvalue = 0;
%}

/**
 * sfunction proc_mem_rss - Program resident set size in pages.
 *
 * Description: Returns the resident set size in pages of the current
 * process, or zero when there is no current process or the number of
 * pages couldn't be retrieved.
 */
function proc_mem_rss:long ()
%{ /* pure */ /* unprivileged */
   struct mm_struct *mm = _stp_proc_mm ();
   if (CONTEXT->regs && mm)
     THIS->__retvalue = (get_mm_counter(mm, file_rss)
                         + get_mm_counter(mm, anon_rss));
   else
     THIS->__retvalue = 0;  
%}

/**
 * sfunction proc_mem_shr - Program shared pages (from shared mappings).
 *
 * Description: Returns the shared pages (from shared mappings) of the
 * current process, or zero when there is no current process or the
 * number of pages couldn't be retrieved.
 */
function proc_mem_shr:long ()
%{ /* pure */ /* unprivileged */
   struct mm_struct *mm = _stp_proc_mm ();
   if (CONTEXT->regs && mm)
     THIS->__retvalue = get_mm_counter(mm, file_rss);
   else
     THIS->__retvalue = 0;  
%}

/**
 * sfunction proc_mem_txt - Program text (code) size in pages.
 *
 * Description: Returns the current process text (code) size in pages,
 * or zero when there is no current process or the number of pages
 * couldn't be retrieved.
 */
function proc_mem_txt:long ()
%{ /* pure */ /* unprivileged */
   struct mm_struct *mm = _stp_proc_mm ();
   if (CONTEXT->regs && mm)
     THIS->__retvalue = (PAGE_ALIGN(mm->end_code)
                         - (mm->start_code & PAGE_MASK)) >> PAGE_SHIFT;
   else
     THIS->__retvalue = 0;  
%}

/**
 * sfunction proc_mem_data - Program data size (data + stack) in pages.
 *
 * Description: Returns the current process data size (data + stack)
 * in pages, or zero when there is no current process or the number of
 * pages couldn't be retrieved.
 */
function proc_mem_data:long ()
%{ /* pure */ /* unprivileged */
   struct mm_struct *mm = _stp_proc_mm ();
   if (CONTEXT->regs && mm)
     THIS->__retvalue = mm->total_vm - mm->shared_vm;
   else
     THIS->__retvalue = 0;  
%}

/**
 * sfunction mem_page_size - Number of bytes in a page for this architecture.
 */
function mem_page_size:long ()
%{ /* pure */ /* unprivileged */
   THIS->__retvalue = PAGE_SIZE;
%}

/**
 * sfunction bytes_to_string - Human readable string for given bytes.
 *
 * Description: Returns a string representing the number of bytes
 * (when less than 5120b) postfixed by 'b', the number of kilobytes
 * (when less than 5120k) postfixed by 'k', the number of megabytes
 * (when less than 5120m) postfixed by 'm' or the number of gigabytes
 * postfixed by 'g'.
 */
function bytes_to_string:string (bytes:long)
{
  if (bytes < 5120)
    return sprintf("%db", bytes);
  bytes = bytes / 1024;
  if (bytes < 5120)
    return sprintf("%dk", bytes);
  bytes = bytes / 1024;
  if (bytes < 5120)
    return sprintf("%dm", bytes);
  bytes = bytes / 1024;
  return sprintf("%dg", bytes);
}

/**
 * sfunction pages_to_string - Turns pages into a human readable string.
 *
 * Description: Multiplies pages by mem_page_size() to get the number of
 * bytes and returns the result of bytes_to_string().
 */
function pages_to_string:string (pages:long)
{
  bytes = pages * mem_page_size();
  return bytes_to_string (bytes);
}

/**
 * sfunction proc_mem_string - Human readable string of current proc memory usage.
 *
 * Description: Returns a human readable string showing the size, rss,
 * shr, txt and data of the memory used by the current process.
 * For example "size: 301m, rss: 11m, shr: 8m, txt: 52k, data: 2248k".
 */
function proc_mem_string:string ()
{
  return sprintf ("size: %s, rss: %s, shr: %s, txt: %s, data: %s",
                  pages_to_string(proc_mem_size()),
                  pages_to_string(proc_mem_rss()),
                  pages_to_string(proc_mem_shr()),
                  pages_to_string(proc_mem_txt()),
                  pages_to_string(proc_mem_data()));
}
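
For readers who want to check the page arithmetic used by proc_mem_txt()
outside the kernel, here is a minimal userspace sketch. It assumes 4096-byte
pages (as on x86), and the SKETCH_* macros and text_pages() are hypothetical
names that only mirror the usual kernel definitions of PAGE_MASK and
PAGE_ALIGN; they are not part of the tapset.

```c
#include <stdint.h>

/* Assumed page geometry: 4096-byte pages (x86). Illustrative names only. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))
/* Round an address up to the next page boundary. */
#define SKETCH_PAGE_ALIGN(a) (((a) + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK)

/* Number of pages spanned by [start_code, end_code): round the end up,
   round the start down, and convert the byte span to pages. */
static uint64_t text_pages(uint64_t start_code, uint64_t end_code)
{
    return (SKETCH_PAGE_ALIGN(end_code) - (start_code & SKETCH_PAGE_MASK))
           >> SKETCH_PAGE_SHIFT;
}
```

With these assumptions, a text segment that starts in the middle of one page
and ends one byte into the next page counts as two pages, which is why the
start is rounded down and the end rounded up.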


* Re: proc memory statistics tapset
  2009-10-05 15:07 proc memory statistics tapset Mark Wielaard
@ 2009-10-05 18:09 ` Josh Stone
  2009-10-06 17:26   ` Mark Wielaard
  2009-10-09 15:18 ` task time tapset Mark Wielaard
  1 sibling, 1 reply; 9+ messages in thread
From: Josh Stone @ 2009-10-05 18:09 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: systemtap

On 10/05/2009 08:06 AM, Mark Wielaard wrote:
> Hi,
> 
> I recently added printing of memory usage to each stap pass in verbose
> mode to see how much (extra) memory each pass contributed. Which helped
> me to keep track of memory usage while I refactored some of the dwfl
> construction code.
> 
> It occurred to me this is actually what we need to use systemtap for. We
> already have markers in stap itself that can be probed for the start and
> end of each pass. So instead of adding new source code I should have
> added a tapset that provides that information. That way it could be
> reused by anybody that wants to trace a program and get memory
> measurements/statistics. So that is what I just did.
> 
> [...]
> 
> I would like to add it as tapset/proc_mem.stp and put the documentation
> together with the Memory Tapset in the Tapset Reference manual since I
> think they are generally useful. But if people feel they should go into
> the examples only, that is fine too.

I think this is a fine thing to have as a tapset.  Perhaps your short
script using it on stap is also good in the examples to demonstrate its
simple use.

>   /* Returns the mm for the current proc. Slightly paranoid. Only returns
>      from safe contexts (current must exist), and the task doesn't happens
>      to be (coopted by) a kernel thread. Callers also check CONTEXT->regs. */
>   static struct mm_struct *_stp_proc_mm(void)
>   {
>     struct task_struct *pid_task;
>     struct mm_struct *mm;
>     if (! current)
>       return NULL;
>     if (current->flags & _STP_PF_KTHREAD)
>       return NULL;
>     return current->mm;
>   }

What unsafe contexts don't have current?  That's what I tried to find
experimentally with testsuite/systemtap.stress/current.exp, and so far
it seems that current is always available and safe.  "Task isn't coopted
by a kernel thread" doesn't make sense to me either -- kernel threads
have their own current task_struct, represented by that KTHREAD flag.

I'm not sure why the CONTEXT->regs check matters.  You don't need it to
get any of the stats, so I'd say leave it to the user to decide whether
it makes sense to use it with their probe.  For example, tracepoints
have no regs, but the user might still want to sample memory usage on
sched_switch.

> /**
>  * sfunction bytes_to_string - Human readable string for given bytes.
>  *
>  * Description: Returns a string representing the number of bytes
>  * (when less than 5120b) postfixed by 'b', the number of kilobytes
>  * (when less than 5120k) postfixed by 'k', the number of megabytes
>  * (when less than 5120m) postfixed by 'm' or the number of gigabytes
>  * postfixed by 'g'.
>  */

Personally, I would like this to look more like "ls -sh", "du -h", etc. -- no
'b', use upper-case K/M/G, and instead of your 5120 limit, just write it
as X.Y if X < 10.  Or if you really like the extra detail, maybe write
it padded to 4 characters -- X.YY, XX.Y, and then XXX and XXXX up to 1023.

Josh


* Re: proc memory statistics tapset
  2009-10-05 18:09 ` Josh Stone
@ 2009-10-06 17:26   ` Mark Wielaard
  2009-10-06 23:39     ` Josh Stone
  0 siblings, 1 reply; 9+ messages in thread
From: Mark Wielaard @ 2009-10-06 17:26 UTC (permalink / raw)
  To: Josh Stone; +Cc: systemtap

Hi Josh,

On Mon, 2009-10-05 at 11:09 -0700, Josh Stone wrote:
> On 10/05/2009 08:06 AM, Mark Wielaard wrote:
> > I would like to add it as tapset/proc_mem.stp and put the documentation
> > together with the Memory Tapset in the Tapset Reference manual since I
> > think they are generally useful. But if people feel they should go into
> > the examples only, that is fine too.
> 
> I think this is a fine thing to have as a tapset.  Perhaps your short
> script using it on stap is also good in the examples to demonstrate its
> simple use.

Thanks for going over it. Let's add it and see if people can poke holes
in it when they use it. I'll look into adding it also as an example.

> What unsafe contexts don't have current?  That's what I tried to find
> experimentally with testsuite/systemtap.stress/current.exp, and so far
> it seems that current is always available and safe.

Yeah, you are right, current is always there. Some of these checks come
from when I wanted to provide the same functionality for task !=
current. But that is just asking for trouble, it seems. So now we
restrict to current, which seems more sensible.

>   "Task isn't coopted
> by a kernel thread" doesn't make sense to me either -- kernel threads
> have their own current task_struct, represented by that KTHREAD flag.

I am not 100% sure that is correct. The task flags are set to PF_KTHREAD in
INIT_TASK() before the task is fully created. I even extended that check to:
current->flags & (_STP_PF_KTHREAD | PF_EXITING | PF_STARTING)
just to be fully paranoid that we never query some task-mm struct that isn't
set up right. Feel free to prove me wrong in being that paranoid :)

> I'm not sure why the CONTEXT->regs check matters.  You don't need it to
> get any of the stats, so I'd say leave it to the user to decide whether
> it makes sense to use it with their probe.  For example, tracepoints
> have no regs, but the user might still want to sample memory usage on
> sched_switch.

Agreed, check removed. Doesn't really make sense. What I wanted to catch
was "real processes" (as opposed to kernel threads), but I think the
above check takes care of that.

> Personally, I would like this to look more like "ls -sh", "du -h", etc. -- no
> 'b', use upper-case K/M/G, and instead of your 5120 limit, just write it
> as X.Y if X < 10.  Or if you really like the extra detail, maybe write
> it padded to 4 characters -- X.YY, XX.Y, and then XXX and XXXX up to 1023.

That is nicer. Changed to your last suggestion.
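
For anyone following along, the padded format from that last suggestion can
be sketched in userspace C roughly as below. This is only an illustration
and not the code that was checked in: the name format_bytes(), its signature,
the choice to truncate rather than round, and the five-character field width
are all assumptions. It sticks to integer arithmetic since floating point is
not something you want to rely on in kernel context.

```c
#include <stdio.h>

/* Hypothetical helper: formats bytes as X.YY, XX.Y, XXX or XXXX followed by
   K/M/G/T, or as a plain right-aligned number below 1024. Fractions are
   truncated, not rounded. */
static void format_bytes(unsigned long long bytes, char *buf, size_t len)
{
    static const char suffix[] = { 'K', 'M', 'G', 'T' };
    unsigned long long whole = bytes, rem = 0;
    int unit = -1;                      /* -1 means plain bytes */

    if (bytes < 1024) {
        snprintf(buf, len, "%5llu", bytes);
        return;
    }
    /* Divide down until the value fits below 1024 (or we run out of units),
       remembering the remainder of the last division for the fraction. */
    while (whole >= 1024 && unit < 3) {
        rem = whole % 1024;
        whole /= 1024;
        unit++;
    }
    if (whole < 10)
        snprintf(buf, len, "%llu.%02llu%c", whole, rem * 100 / 1024, suffix[unit]);
    else if (whole < 100)
        snprintf(buf, len, "%llu.%llu%c", whole, rem * 10 / 1024, suffix[unit]);
    else
        snprintf(buf, len, "%4llu%c", whole, suffix[unit]);
}
```

Under these assumptions every result is five characters wide, so columns of
values line up without further padding.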

I checked this in, with a testcase as:

commit 47f025139d1c2e75781cdab40dc9195396133754
Author: Mark Wielaard <mjw@redhat.com>
Date:   Tue Oct 6 19:24:22 2009 +0200

    Add proc_mem tapset, functions to query memory usage of the current
    process.
    
    * tapset/proc_mem.stp: New tapset.
    * testsuite/buildok/proc_mem.stp
    * doc/SystemTap_Tapset_Reference/tapsets.tmpl (memory_stp): Include
      tapset/proc_mem.stp.

Thanks,

Mark


* Re: proc memory statistics tapset
  2009-10-06 17:26   ` Mark Wielaard
@ 2009-10-06 23:39     ` Josh Stone
  2009-10-09 12:54       ` Mark Wielaard
  0 siblings, 1 reply; 9+ messages in thread
From: Josh Stone @ 2009-10-06 23:39 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: systemtap

On 10/06/2009 10:25 AM, Mark Wielaard wrote:
>>   "Task isn't coopted
>> by a kernel thread" doesn't make sense to me either -- kernel threads
>> have their own current task_struct, represented by that KTHREAD flag.
> 
> I am not 100% sure that is correct. The task flags are set to PF_KTHREAD in
> INIT_TASK() before the task is fully created. I even extended that check to:
> current->flags & (_STP_PF_KTHREAD | PF_EXITING | PF_STARTING)
> just to be fully paranoid that we never query some task-mm struct that isn't
> set up right. Feel free to prove me wrong in being that paranoid :)

INIT_TASK() is not a generic initializer -- it's only used to create the
specific "init_task".  Everybody else is copied from their parent in
copy_process().

I think your paranoia is ok to make sure there's a meaningful mm.  I was
more interested in what you meant by "coopted by" -- in which cases
could a kernel thread pop in without changing current to itself?

An interrupt handler could be considered such a case, but I don't think
those should be filtered out.  A timer.profile fires in softIRQ context,
but it's probably reasonable to profile your memory usage this way.
Even our own trap handlers could be seen as "coopting" the process.

Josh


* Re: proc memory statistics tapset
  2009-10-06 23:39     ` Josh Stone
@ 2009-10-09 12:54       ` Mark Wielaard
  2009-10-09 17:26         ` Josh Stone
  0 siblings, 1 reply; 9+ messages in thread
From: Mark Wielaard @ 2009-10-09 12:54 UTC (permalink / raw)
  To: Josh Stone; +Cc: systemtap

Hi Josh,

On Tue, 2009-10-06 at 16:39 -0700, Josh Stone wrote:
> On 10/06/2009 10:25 AM, Mark Wielaard wrote:
> >>   "Task isn't coopted
> >> by a kernel thread" doesn't make sense to me either -- kernel threads
> >> have their own current task_struct, represented by that KTHREAD flag.
> > 
> > > I am not 100% sure that is correct. The task flags are set to PF_KTHREAD in
> > > INIT_TASK() before the task is fully created. I even extended that check to:
> > > current->flags & (_STP_PF_KTHREAD | PF_EXITING | PF_STARTING)
> > > just to be fully paranoid that we never query some task-mm struct that isn't
> > > set up right. Feel free to prove me wrong in being that paranoid :)
> 
> INIT_TASK() is not a generic initializer -- it's only used to create the
> specific "init_task".  Everybody else is copied from their parent in
> copy_process().

Aha, thanks, I misread. But I think the check is still the right one.

> I think your paranoia is ok to make sure there's a meaningful mm.  I was
> more interested in what you meant by "coopted by" -- in which cases
> could a kernel thread pop in without changing current to itself?

Normally kernel threads don't have an associated mm struct. But they can
"coopt" one from a user thread (for example when doing aio on behalf of
some process); then current->mm != NULL, but current->flags still has
PF_KTHREAD. IMHO we should not report such mm sizes.
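
To make the filtering rule concrete, here is a small userspace model of the
check. Everything in it is a stand-in: the sk_* structs are hypothetical
simplifications of task_struct and mm_struct, and the SK_PF_* values are
placeholders for illustration (the real flags come from <linux/sched.h>).
It only shows the decision, not how the tapset accesses current.

```c
#include <stddef.h>

/* Placeholder flag values, for illustration only. */
#define SK_PF_STARTING 0x00000002u
#define SK_PF_EXITING  0x00000004u
#define SK_PF_KTHREAD  0x00200000u

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct sk_mm   { unsigned long total_vm; };
struct sk_task { unsigned int flags; struct sk_mm *mm; };

/* Return the task's mm only for an ordinary, fully set-up user task.
   A kernel thread that borrowed a user mm has a non-NULL mm but still
   carries the kthread flag, so it is filtered out here. */
static struct sk_mm *sk_proc_mm(const struct sk_task *task)
{
    if (task->flags & (SK_PF_KTHREAD | SK_PF_EXITING | SK_PF_STARTING))
        return NULL;
    return task->mm;
}
```

In this model a kernel thread with a borrowed mm yields NULL, so the stats
functions would report zero for it, while an interrupted user task still
reports its own mm.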

> An interrupt handler could be considered such a case, but I don't think
> those should be filtered out.  A timer.profile fires in softIRQ context,
> but it's probably reasonable to profile your memory usage this way.
> Even our own trap handlers could be seen as "coopting" the process.

Yes, we should catch normal "interrupted" user tasks. With timer.profile
you get output like: stap -e 'probe timer.profile { if (pid() != 0)
printf("%s (%d)\n\t%s\n", execname(), tid(), proc_mem_string()) }'

firefox (31452)
	size:  787M, rss:  144M, shr: 25.8M, txt: 84.0K, data:  283M
gnome-terminal (4017)
	size:  287M, rss: 21.9M, shr: 9.45M, txt:  272K, data: 22.5M
Xorg (2745)
	size:  479M, rss: 71.0M, shr: 12.1M, txt: 1.70M, data: 71.5M
Xorg (2745)
	size:  479M, rss: 71.0M, shr: 12.1M, txt: 1.70M, data: 71.5M
Xorg (2745)
	size:  479M, rss: 71.0M, shr: 12.1M, txt: 1.70M, data: 71.5M
Xorg (2745)
	size:  479M, rss: 71.0M, shr: 12.1M, txt: 1.70M, data: 71.5M
Xorg (2745)
	size:  479M, rss: 71.0M, shr: 12.1M, txt: 1.70M, data: 71.5M
systemtap/5 (1699)
	size:     0, rss:     0, shr:     0, txt:     0, data:     0
ksoftirqd/5 (19)
	size:     0, rss:     0, shr:     0, txt:     0, data:     0
ksoftirqd/5 (19)
	size:     0, rss:     0, shr:     0, txt:     0, data:     0
gnome-terminal (4017)
	size:  287M, rss: 21.9M, shr: 9.45M, txt:  272K, data: 22.5M
Xorg (2745)
	size:  479M, rss: 71.0M, shr: 12.1M, txt: 1.70M, data: 71.5M
Xorg (2745)
	size:  479M, rss: 71.0M, shr: 12.1M, txt: 1.70M, data: 71.5M
gnome-terminal (4017)
	size:  287M, rss: 21.9M, shr: 9.45M, txt:  272K, data: 22.5M

Cheers,

Mark


* task time tapset
  2009-10-05 15:07 proc memory statistics tapset Mark Wielaard
  2009-10-05 18:09 ` Josh Stone
@ 2009-10-09 15:18 ` Mark Wielaard
  2009-10-09 22:53   ` Josh Stone
  1 sibling, 1 reply; 9+ messages in thread
From: Mark Wielaard @ 2009-10-09 15:18 UTC (permalink / raw)
  To: systemtap

[-- Attachment #1: Type: text/plain, Size: 2033 bytes --]

Hi,

As an addition to the proc_mem tapset I created a task_time tapset. Like
the process memory tapset it only works on the current task. But that
makes it really trivial to implement. I think it would be a nice
addition since it allows you to do these "pass based statistics" fully
dynamically (and they would of course also work with function probes, or
any other probe that targets user processes).

To extend the previous example, you can now almost completely mimic the
stap -v output:
$ stap -e 'probe process("stap").mark("pass[0-3]*end")
  { log($$name . "\t" . proc_mem_string() . "\n\t\t" . task_time_string()) }'
  -c 'stap -w -k -p4 testsuite/buildok/syscall.stp'

pass0__end  size: 55.9M, rss: 2.62M, shr: 1.93M, txt: 1.34M, data: 1.01M
            usr: 0m0.002s, sys: 0m0.003s
pass1__end  size: 72.5M, rss: 19.1M, shr: 2.23M, txt: 1.34M, data: 17.6M
            usr: 0m0.103s, sys: 0m0.016s
pass2__end  size:  167M, rss: 96.5M, shr: 42.9M, txt: 1.34M, data: 54.4M
            usr: 0m2.102s, sys: 0m0.035s
pass3__end  size:  167M, rss: 96.6M, shr: 43.0M, txt: 1.34M, data: 54.4M
            usr: 0m2.354s, sys: 0m0.048s

These are total user and system times. It doesn't include real time atm.
In theory this can be gotten. The task struct keeps the start time. But
this is in monotonic or boot time and we currently only have daytime. It
shouldn't be too hard to add that, but I didn't want to do that atm.
Recent kernels export a per cpu_clock() (based on sched_clock) that
might be helpful.

It also doesn't do anything fancy like the task.stp tapset that also
works for tasks that aren't current. Again it shouldn't be too hard to
extend it to also make it do that, task.stp shows how the locking should
work. But I didn't really see the complexity being worth it.

Again these functions are marked unprivileged since the same info can be
gotten from /proc/pid/task/* already.

Tested against 2.6.18 and 2.6.31.1 on x86_64, tests on other kernels and
architectures or any other feedback very welcome.

Cheers,

Mark

[-- Attachment #2: task_time.stp --]
[-- Type: text/x-csrc, Size: 3340 bytes --]

// Task time query and utility functions.
// Copyright (C) 2009 Red Hat Inc.
//
// This file is part of systemtap, and is free software.  You can
// redistribute it and/or modify it under the terms of the GNU General
// Public License (GPL); either version 2, or (at your option) any
// later version.

// <tapsetdescription>
// Task time query and utility functions provide information about
// the time resource usage of the current task. These functions provide
// information about the user time and system time of the current
// task. And provide utility functions to turn the reported times
// into milliseconds and create human readable string representations
// of task time used. The reported times are approximate and should
// be used for "coarse grained" measurements only. The reported user
// and system time are only for the current task, not for the process
// as a whole nor of any time spent by children of the current task.
// </tapsetdescription>

%{
#include <asm/cputime.h>
#include <linux/time.h>
%}

/**
 * sfunction task_utime - User time of the current task.
 *
 * Description: Returns the user time of the current task in cputime.
 * Does not include any time used by other tasks in this process, nor
 * does it include any time of the children of this task.
 */
function task_utime:long ()
%{ /* pure */ /* unprivileged */
  THIS->__retvalue = current->utime;
%}

/**
 * sfunction task_stime - System time of the current task.
 *
 * Description: Returns the system time of the current task in cputime.
 * Does not include any time used by other tasks in this process, nor
 * does it include any time of the children of this task.
 */
function task_stime:long ()
%{ /* pure */ /* unprivileged */
  THIS->__retvalue = current->stime;
%}

/**
 * sfunction cputime_to_msecs - Translates the given cputime into milliseconds.
 * @cputime: Time to convert to milliseconds.
 */
function cputime_to_msecs:long (cputime:long)
%{ /* pure */ /* unprivileged */
  THIS->__retvalue = cputime_to_msecs (THIS->cputime);
%}

/**
 * sfunction msecs_to_string - Human readable string for given milliseconds.
 * @msecs: Number of milliseconds to translate.
 *
 * Description: Returns a string representing the number of
 * milliseconds as a human readable string consisting of "XmY.ZZZs",
 * where X is the number of minutes, Y is the number of seconds and
 * ZZZ is the number of milliseconds.
 */
function msecs_to_string:string (msecs:long)
{
  ms = msecs % 1000;
  secs = msecs / 1000;
  mins = secs / 60;
  secs = secs % 60;
  return sprintf("%dm%d.%.3ds", mins, secs, ms);
}

/**
 * sfunction cputime_to_string - Human readable string for given cputime.
 * @cputime: Time to translate.
 *
 * Description: Equivalent to calling:
 * msecs_to_string (cputime_to_msecs (cputime)).
 */
function cputime_to_string:string (cputime:long)
{
  return msecs_to_string (cputime_to_msecs (cputime));
}

/**
 * sfunction task_time_string - Human readable string of task time usage.
 *
 * Description: Returns a human readable string showing the user and
 * system time the current task has used up to now.  For example
 * "usr: 0m12.908s, sys: 1m6.851s".
 */
function task_time_string:string ()
{
  return sprintf ("usr: %s, sys: %s",
                  cputime_to_string (task_utime()),
                  cputime_to_string (task_stime()));
}
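
The minutes/seconds/milliseconds split in msecs_to_string() is easy to check
in userspace. The sketch below mirrors that arithmetic in plain C; the name
format_msecs() and its buffer-based signature are hypothetical, and "%03lld"
is used as the C equivalent of the zero-padded three-digit field for
non-negative values.

```c
#include <stdio.h>

/* Hypothetical userspace mirror of the tapset arithmetic: split a
   millisecond count into minutes, seconds and a three-digit remainder. */
static void format_msecs(long long msecs, char *buf, size_t len)
{
    long long ms = msecs % 1000;     /* leftover milliseconds */
    long long secs = msecs / 1000;   /* whole seconds */
    long long mins = secs / 60;      /* whole minutes */

    secs %= 60;                      /* seconds within the minute */
    snprintf(buf, len, "%lldm%lld.%03llds", mins, secs, ms);
}
```

Feeding it 66851 gives the "1m6.851s" form used in the task_time_string()
description, and 2 gives "0m0.002s", so small values keep their leading
zeros.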


* Re: proc memory statistics tapset
  2009-10-09 12:54       ` Mark Wielaard
@ 2009-10-09 17:26         ` Josh Stone
  0 siblings, 0 replies; 9+ messages in thread
From: Josh Stone @ 2009-10-09 17:26 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: systemtap

On 10/09/2009 05:54 AM, Mark Wielaard wrote:
>> I think your paranoia is ok to make sure there's a meaningful mm.  I was
>> more interested in what you meant by "coopted by" -- in which cases
>> could a kernel thread pop in without changing current to itself?
>
> Normally kernel threads don't have an associated mm struct. But they can
> "coopt" one from a user thread (for example when doing aio on behalf of
> some process); then current->mm != NULL, but current->flags still has
> PF_KTHREAD. IMHO we should not report such mm sizes.

Ah, so it's about a borrowed mm.  That makes sense, thanks.

Josh


* Re: task time tapset
  2009-10-09 15:18 ` task time tapset Mark Wielaard
@ 2009-10-09 22:53   ` Josh Stone
  2009-10-10  9:12     ` Mark Wielaard
  0 siblings, 1 reply; 9+ messages in thread
From: Josh Stone @ 2009-10-09 22:53 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: systemtap

On 10/09/2009 08:17 AM, Mark Wielaard wrote:
> Hi,
>
> As an addition to the proc_mem tapset I created a task_time tapset. Like
> the process memory tapset it only works on the current task. But that
> makes it really trivial to implement. I think it would be a nice
> addition since it allows you to do these "pass based statistics" fully
> dynamically (and they would of course also work with function probes, or
> any other probe that targets user processes).

Looks good to me.

Since this doesn't operate on arbitrary tasks, I wonder if it would make 
sense to drop the "task_" prefix, like the functions in context.stp. 
Maybe time_string() sounds too generic though... just thinking aloud...

> These are total user and system times. It doesn't include real time atm.
> In theory this can be gotten. The task struct keeps the start time. But
> this is in monotonic or boot time and we currently only have daytime. It
> shouldn't be too hard to add that, but I didn't want to do that atm.
> Recent kernels export a per cpu_clock() (based on sched_clock) that
> might be helpful.

Yes please!  I don't think real time is that critical here, but I think 
that this clock source would be very useful in general.  Most uses now 
of gettimeofday are just to measure time elapsed, and a monotonic clock 
is much more appropriate for that.  It can fall back on gettimeofday for 
older kernels.

Josh


* Re: task time tapset
  2009-10-09 22:53   ` Josh Stone
@ 2009-10-10  9:12     ` Mark Wielaard
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Wielaard @ 2009-10-10  9:12 UTC (permalink / raw)
  To: Josh Stone; +Cc: systemtap

Hi Josh,

On Fri, 2009-10-09 at 15:53 -0700, Josh Stone wrote:
> On 10/09/2009 08:17 AM, Mark Wielaard wrote:
> Since this doesn't operate on arbitrary tasks, I wonder if it would make 
> sense to drop the "task_" prefix, like the functions in context.stp. 
> Maybe time_string() sounds too generic though... just thinking aloud...

My thinking was the opposite. Since these don't operate on "processes",
but just on individual "tasks" (unlike the proc_mem functions, which
show usage per process) adding task as prefix makes that more clear. I
think stripping the prefix would be too general. We might provide
proc_utime and proc_stime later (if we can figure out the locking for
going over the whole proc task hierarchy that is). So for now let's keep
it. But maybe we have to go over all the tapsets before the next release
and see if they are too specific or too generally named.

Cheers,

Mark
