Tracking vm activity

public inbox for systemtap@sourceware.org

* Tracking vm activity
@ 2007-04-23 21:05 William Cohen
  2007-04-23 23:19 ` Frank Ch. Eigler
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: William Cohen @ 2007-04-23 21:05 UTC (permalink / raw)
  To: SystemTAP


Hi,

I am playing around with getting information about page faults and I noticed 
that there is a probe alias for the entry for the pagefault code. However, there 
is no matching probe point for the return. It is useful to look at the return 
value to determine what kind of page fault occurred (major or minor). The 
attached patch provides a similar probe point for the return point. Any comments 
on the patch?

This would be useful for the following scenario. A probe on vm.pagefault could 
get the address and a probe on vm.pagefault.return could get information on how it 
was handled. The analysis could then determine which addresses cause major 
page faults (real disk accesses). One could write out a log of the major page 
faults (and the mmap operations) and track which files caused the page faults. 
One probably doesn't want to walk through the mm structures when taking a page 
fault to determine which file it came from.
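
A minimal sketch of that scenario, for illustration only. It assumes the
entry alias exposes the faulting address as "address" and the return alias
exposes the handler's result as "fault_type", and that a major fault is
reported as the value 3 (true for kernels of this period; later kernels
changed the encoding). The global name fault_addr is made up for the example.

```systemtap
# Faulting address per thread, saved at entry and consumed at return.
global fault_addr

probe vm.pagefault {
  fault_addr[tid()] = address
}

probe vm.pagefault.return {
  t = tid()
  if (!(t in fault_addr)) next      # entry was not seen for this thread

  # Assumption: 3 means a major fault on this kernel.
  if (fault_type == 3)
    printf("%d %s(%d) major fault at 0x%x\n",
           gettimeofday_us(), execname(), pid(), fault_addr[t])

  delete fault_addr[t]              # keep the array small
}
```

The log lines can then be matched against mmap records offline, which keeps
the mm structure walk out of the fault path.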

-Will

[-- Attachment #2: pagefault_return.patch --]

Index: tapset/memory.stp
===================================================================
RCS file: /cvs/systemtap/src/tapset/memory.stp,v
retrieving revision 1.4
diff -U2 -u -r1.4 memory.stp
--- tapset/memory.stp	7 Nov 2006 09:26:24 -0000	1.4
+++ tapset/memory.stp	21 Mar 2007 14:40:27 -0000
@@ -27,4 +27,10 @@
 }

+probe vm.pagefault.return = kernel.function(
+        %( kernel_v >= "2.6.13" %? "__handle_mm_fault" %: "handle_mm_fault" %)
+        ).return
+{
+}
+
 /* Return which node the given address belongs to in a NUMA system */
 function addr_to_node:long(addr:long)  /* pure */ 


* Re: Tracking vm activity
  2007-04-23 21:05 Tracking vm activity William Cohen
@ 2007-04-23 23:19 ` Frank Ch. Eigler
  2007-04-24 17:13 ` Stone, Joshua I
  2007-05-08  3:08 ` Jun Koi
  2 siblings, 0 replies; 11+ messages in thread
From: Frank Ch. Eigler @ 2007-04-23 23:19 UTC (permalink / raw)
  To: William Cohen; +Cc: SystemTAP


William Cohen <wcohen@redhat.com> writes:

> I am playing around with getting information about page faults and I
> noticed that there is a probe alias for the entry for the pagefault
> code. However, there is no matching probe point for the
> return. [...]

Sure.  Please also make sure that some test case attempts to
instantiate this new probe, and that it is at least as well documented
as its neighbours.

- FChE


* Re: Tracking vm activity
  2007-04-23 21:05 Tracking vm activity William Cohen
  2007-04-23 23:19 ` Frank Ch. Eigler
@ 2007-04-24 17:13 ` Stone, Joshua I
  2007-04-24 18:17   ` William Cohen
  2007-05-08  3:08 ` Jun Koi
  2 siblings, 1 reply; 11+ messages in thread
From: Stone, Joshua I @ 2007-04-24 17:13 UTC (permalink / raw)
  To: William Cohen; +Cc: SystemTAP

William Cohen wrote:
> I am playing around with getting information about page faults and I 
> noticed that there is a probe alias for the entry for the pagefault 
> code. However, there is no matching probe point for the return. It is 
> useful to look at the return value to determine what kind of page fault 
> occurred (major or minor). The attached patch provides a similar probe 
> point for the return point. any comments on the patch?

I see that you committed it, and added a variable for the return value 
(fault_type).  Could you elaborate in the documentation what the 
possible types are (major, minor) and their values?

Thanks,

Josh


* Re: Tracking vm activity
  2007-04-24 17:13 ` Stone, Joshua I
@ 2007-04-24 18:17   ` William Cohen
  2007-04-24 20:36     ` Stone, Joshua I
  0 siblings, 1 reply; 11+ messages in thread
From: William Cohen @ 2007-04-24 18:17 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: SystemTAP

Stone, Joshua I wrote:
> William Cohen wrote:
>> I am playing around with getting information about page faults and I 
>> noticed that there is a probe alias for the entry for the pagefault 
>> code. However, there is no matching probe point for the return. It is 
>> useful to look at the return value to determine what kind of page 
>> fault occurred (major or minor). The attached patch provides a similar 
>> probe point for the return point. any comments on the patch?
> 
> I see that you committed it, and added a variable for the return value 
> (fault_type).  Could you elaborate in the documentation what the 
> possible types are (major, minor) and their values?
> 
> Thanks,
> 
> Josh

Hi Josh,

Is this a suitable comment for vm.pagefault.return? If so, I will go 
ahead and check it in.

/* probe vm.pagefault.return
  *
  *  Records type of fault that occurred.
  *
  * Context:
  *  The process which triggered the fault.
  *
  * Arguments:
  *  fault_type - type of fault
  *	VM_FAULT_OOM	0	out of memory
  *	VM_FAULT_SIGBUS	1	returned if the fault is not OOM, minor, or major
  *	VM_FAULT_MINOR	2	no blocking operation to handle fault
  *	VM_FAULT_MAJOR	3	required blocking operation to handle fault
  */
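
As a rough illustration of how these values might be used, the sketch below
tallies faults by type. The numeric values are the ones listed in the comment
above and are specific to kernels of this period; the helper fault_name and
the global fault_count are hypothetical names, not part of the tapset.

```systemtap
global fault_count

# Map the numeric fault_type to a label (values as documented above).
function fault_name:string(ft:long) {
  if (ft == 0) return "oom"
  if (ft == 1) return "sigbus"
  if (ft == 2) return "minor"
  if (ft == 3) return "major"
  return "unknown"
}

probe vm.pagefault.return {
  fault_count[fault_type]++
}

# Runs when the script is stopped, e.g. with Ctrl-C.
probe end {
  foreach (ft in fault_count)
    printf("%s (%d): %d\n", fault_name(ft), ft, fault_count[ft])
}
```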

-Will


* Re: Tracking vm activity
  2007-04-24 18:17   ` William Cohen
@ 2007-04-24 20:36     ` Stone, Joshua I
  0 siblings, 0 replies; 11+ messages in thread
From: Stone, Joshua I @ 2007-04-24 20:36 UTC (permalink / raw)
  To: William Cohen; +Cc: SystemTAP

William Cohen wrote:
> Is this a suitable comment for vm.pagefault.return? If so, I 
> will go ahead and check it in.
> [...]
>  *  fault_type - type of fault
>  *    VM_FAULT_OOM    0    out of memory
>  *    VM_FAULT_SIGBUS    1    if not oom, minor, or major fault, this val
>  *    VM_FAULT_MINOR    2    no blocking operation to handle fault
>  *    VM_FAULT_MAJOR    3    required blocking operation to handle fault

Yes, that's exactly what I was looking for, thanks.

Josh


* Re: Tracking vm activity
  2007-04-23 21:05 Tracking vm activity William Cohen
  2007-04-23 23:19 ` Frank Ch. Eigler
  2007-04-24 17:13 ` Stone, Joshua I
@ 2007-05-08  3:08 ` Jun Koi
  2007-05-08 14:15   ` Frank Ch. Eigler
  2007-05-08 21:12   ` William Cohen
  2 siblings, 2 replies; 11+ messages in thread
From: Jun Koi @ 2007-05-08  3:08 UTC (permalink / raw)
  To: William Cohen; +Cc: SystemTAP

On 4/24/07, William Cohen <wcohen@redhat.com> wrote:
> Hi,
>
> I am playing around with getting information about page faults and I noticed
> that there is a probe alias for the entry for the pagefault code. However, there
> is no matching probe point for the return. It is useful to look at the return
> value to determine what kind of page fault occurred (major or minor). The
> attached patch provides a similar probe point for the return point. Any comments
> on the patch?
>
> This would be useful for the following scenario. A probe on vm.pagefault could
> get the address and a probe on vm.pagefault.return could get information on how it
> was handled. The analysis could then determine which addresses cause major
> page faults (real disk accesses). One could write out a log of the major page
> faults (and the mmap operations) and track which files caused the page faults.
> One probably doesn't want to walk through the mm structures when taking a page
> fault to determine which file it came from.
>

Hi Will,

I found this idea interesting. However, what happens if the
machine is SMP, or the kernel is preemptible?

Just imagine a scenario like this with 2 kernel threads A and B:
- A causes page fault, then enters page fault handler (1)
- B preempts A, causes fault, then enters page fault handler  (2)
- A exits page fault handler (3)
- B exits page fault handler (4)

In this case, you get the exit value at (3), but it is not relevant to
the parameters collected in the above step (2).

I cannot figure out how to correlate "enter" and "exit" function
events in the SMP or preemptive case.

Any ideas?

Thanks,
Jun


* Re: Tracking vm activity
  2007-05-08  3:08 ` Jun Koi
@ 2007-05-08 14:15   ` Frank Ch. Eigler
  2007-05-10 10:38     ` Jun Koi
  2007-05-08 21:12   ` William Cohen
  1 sibling, 1 reply; 11+ messages in thread
From: Frank Ch. Eigler @ 2007-05-08 14:15 UTC (permalink / raw)
  To: Jun Koi; +Cc: systemtap

"Jun Koi" <junkoi2004@gmail.com> writes:

> [...]
> I found this idea interesting. However, what do you think if the
> machine is SMP, or kernel is preemptive capable?
> [...]
> I cannot figure out how to correlate "enter" and "exit" function
> events in SMP or preemptive case.

In systemtap, a .return probe can refer to parameters of the
*corresponding* .call (function entry) event.  The parameters are
saved in an auxiliary lookup table, indexed by thread-id and nesting
level.  SMP or preemption should not cause any problem.
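
A small sketch of what this looks like in a script, offered as an
illustration rather than a tested recipe. It reuses the probe point from the
patch earlier in this thread and assumes the probed function has a parameter
named "address", as it did in kernels of this period. In the return probe,
$address is the value captured at the matching entry and $return is the
function's return value.

```systemtap
probe kernel.function(
        %( kernel_v >= "2.6.13" %? "__handle_mm_fault" %: "handle_mm_fault" %)
        ).return
{
  # $address: snapshot taken at the corresponding function entry.
  # $return:  the handler's return value (the fault type).
  printf("%s tid=%d addr=0x%x ret=%d\n",
         execname(), tid(), $address, $return)
}
```

No script-level bookkeeping is needed here, since the entry values are
matched to the return by the runtime.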

- FChE


* Re: Tracking vm activity
  2007-05-08  3:08 ` Jun Koi
  2007-05-08 14:15   ` Frank Ch. Eigler
@ 2007-05-08 21:12   ` William Cohen
  1 sibling, 0 replies; 11+ messages in thread
From: William Cohen @ 2007-05-08 21:12 UTC (permalink / raw)
  To: Jun Koi; +Cc: SystemTAP

Jun Koi wrote:
> On 4/24/07, William Cohen <wcohen@redhat.com> wrote:
>> Hi,
>>
>> I am playing around with getting information about page faults and I noticed
>> that there is a probe alias for the entry for the pagefault code. However, there
>> is no matching probe point for the return. It is useful to look at the return
>> value to determine what kind of page fault occurred (major or minor). The
>> attached patch provides a similar probe point for the return point. Any comments
>> on the patch?
>>
>> This would be useful for the following scenario. A probe on vm.pagefault could
>> get the address and a probe on vm.pagefault.return could get information on how it
>> was handled. The analysis could then determine which addresses cause major
>> page faults (real disk accesses). One could write out a log of the major page
>> faults (and the mmap operations) and track which files caused the page faults.
>> One probably doesn't want to walk through the mm structures when taking a page
>> fault to determine which file it came from.
>>
> 
> Hi Will,
> 
> I found this idea interesting. However, what do you think if the
> machine is SMP, or kernel is preemptive capable?
> 
> Just imagine a scenario like this with 2 kernel threads A and B:
> - A causes page fault, then enters page fault handler (1)
> - B preempts A, causes fault, then enters page fault handler  (2)
> - A exits page fault handler (3)
> - B exits page fault handler (4)
> 
> In this case, you get the exit value at (3), but it is not relevant to
> the parameters collected in the above step (2).
> 
> I cannot figure out how to correlate "enter" and "exit" function
> events in SMP or preemptive case.
> 
> Any idea?
> 
> Thanks,
> Jun

It is quite possible that more than one application has outstanding page faults. 
By definition a context switch happens when a major page fault occurs. However, 
associative arrays can be used to keep that kind of matching information 
straight. Thus, this will not be a problem.
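
One way this could look, as a sketch only: each thread's outstanding fault is
tracked in an array keyed by thread ID, so faults from different threads that
overlap in time do not get mixed up. Here the saved value is the entry
timestamp, which gives the time spent handling major faults per program. The
names start_us and major_us are invented for the example, and the value 3 for
a major fault is an assumption that holds for kernels of this period.

```systemtap
global start_us     # entry time of the outstanding fault, per thread
global major_us     # handling time of major faults, per program name

probe vm.pagefault {
  start_us[tid()] = gettimeofday_us()
}

probe vm.pagefault.return {
  t = tid()
  if (!(t in start_us)) next        # entry was not seen for this thread

  if (fault_type == 3) {            # assumption: 3 means a major fault
    elapsed = gettimeofday_us() - start_us[t]
    major_us[execname()] <<< elapsed
  }
  delete start_us[t]
}

probe end {
  foreach (name in major_us)
    printf("%s: %d major faults, avg %d us\n",
           name, @count(major_us[name]), @avg(major_us[name]))
}
```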

-Will


* Re: Tracking vm activity
  2007-05-08 14:15   ` Frank Ch. Eigler
@ 2007-05-10 10:38     ` Jun Koi
  2007-05-10 11:32       ` Frank Ch. Eigler
  0 siblings, 1 reply; 11+ messages in thread
From: Jun Koi @ 2007-05-10 10:38 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

On 08 May 2007 10:15:43 -0400, Frank Ch. Eigler <fche@redhat.com> wrote:
> "Jun Koi" <junkoi2004@gmail.com> writes:
>
> > [...]
> > I found this idea interesting. However, what do you think if the
> > machine is SMP, or kernel is preemptive capable?
> > [...]
> > I cannot figure out how to correlate "enter" and "exit" function
> > events in SMP or preemptive case.
>
> In systemtap, a .return probe can refer to parameters of the
> *corresponding* .call (function entry) event.  The parameters are
> saved in an auxiliary lookup table, indexed by thread-id and nesting
> level.  SMP or preemption should not cause any problem.

Ah, the thread-id idea is clear to me. But could you explain a little
bit about the "nesting level"?

Thank you,
Jun


* Re: Tracking vm activity
  2007-05-10 10:38     ` Jun Koi
@ 2007-05-10 11:32       ` Frank Ch. Eigler
  2007-05-11  6:15         ` Jun Koi
  0 siblings, 1 reply; 11+ messages in thread
From: Frank Ch. Eigler @ 2007-05-10 11:32 UTC (permalink / raw)
  To: Jun Koi; +Cc: systemtap

Hi -

On Thu, May 10, 2007 at 07:38:04PM +0900, Jun Koi wrote:
> Ah, the thread-id idea is clear to me. But may you explain a little
> bit about the "nesting level"??

Certainly.  The nesting level is a per-thread counter.  It is
incremented on a probed function entry and decremented upon its
return.  It is used as an additional index into the saved function
parameters.  This way, if a recursive function is encountered, we
can find the innermost parameters without losing the outer ones.
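
To make the idea concrete, here is a hand-written approximation in script
form. It is not the runtime's actual implementation, only a sketch of the
same bookkeeping. The arrays depth and saved_addr are hypothetical names, and
handle_mm_fault merely stands in for any probed function (it is not itself
recursive, and its name and parameters vary by kernel version).

```systemtap
global depth        # nesting level per thread
global saved_addr   # saved parameter, indexed by [thread id, nesting level]

probe kernel.function("handle_mm_fault") {
  t = tid()
  depth[t]++                        # one level deeper on entry
  saved_addr[t, depth[t]] = $address
}

probe kernel.function("handle_mm_fault").return {
  t = tid()
  d = depth[t]                      # reads as 0 if no entry was recorded
  if (d == 0) next

  # The innermost saved value belongs to the call that is returning now.
  printf("tid=%d level=%d addr=0x%x ret=%d\n",
         t, d, saved_addr[t, d], $return)

  delete saved_addr[t, d]
  if (d == 1) {
    delete depth[t]                 # back at the outermost level
  } else {
    depth[t]--
  }
}
```

With a recursive function, each nested call gets its own [thread id, level]
slot, so the outer call's parameter is still there when the inner call
returns.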

- FChE


* Re: Tracking vm activity
  2007-05-10 11:32       ` Frank Ch. Eigler
@ 2007-05-11  6:15         ` Jun Koi
  0 siblings, 0 replies; 11+ messages in thread
From: Jun Koi @ 2007-05-11  6:15 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Hi Frank,

> > Ah, the thread-id idea is clear to me. But may you explain a little
> > bit about the "nesting level"??
>
> Certainly.  The nesting level is a per-thread counter.  It is
> incremented on a probed function entry and decremented upon its
> return.  It is used as an additional index into the saved function
> parameters.  This way, if a recursive function is encountered, we
> can find the innermost parameters without losing the outer ones.
>

It is all clear now. Thanks, Frank :-)

Best,
Jun

Thread overview: 11+ messages
2007-04-23 21:05 Tracking vm activity William Cohen
2007-04-23 23:19 ` Frank Ch. Eigler
2007-04-24 17:13 ` Stone, Joshua I
2007-04-24 18:17   ` William Cohen
2007-04-24 20:36     ` Stone, Joshua I
2007-05-08  3:08 ` Jun Koi
2007-05-08 14:15   ` Frank Ch. Eigler
2007-05-10 10:38     ` Jun Koi
2007-05-10 11:32       ` Frank Ch. Eigler
2007-05-11  6:15         ` Jun Koi
2007-05-08 21:12   ` William Cohen
