public inbox for systemtap@sourceware.org
* oprofile's mechanism to get file path information
@ 2007-08-02 19:31 William Cohen
  2007-08-02 21:07 ` David J. Wilder
  2007-08-06 10:15 ` Mike Mason
  0 siblings, 2 replies; 3+ messages in thread
From: William Cohen @ 2007-08-02 19:31 UTC
  To: SystemTAP

There was some discussion at today's SystemTap meeting on getting the file path
information for the VFS tapset. The problem is that the functions for getting
the path require some locks, and in general we want to avoid taking locks when
in a probe. It was mentioned that OProfile has a mechanism to get file path
information (dcookies). OProfile's mechanism may not be entirely appropriate,
but it faces some similar issues.

OProfile has an interrupt mechanism that does the actual sampling. On x86_64 and
i386 machines this is done with a non-maskable interrupt (NMI). As a result,
what can be done in the interrupt context is very limited. The interrupt handler
just records the context that the interrupt occurred in, the linear address of
the program counter, and the performance counter that caused the sample.
OProfile records this information in per-processor circular queues, which
eliminates the need for any locks. The linear address is of limited use on its
own because it is very ephemeral: different programs may map the same shared
library to different locations. OProfile therefore converts the linear address
into a file and an offset into that file. This conversion happens when the data
from the per-processor buffers is collected into a system-wide buffer.
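
As a rough illustration of the per-processor queue idea described above, here is
a minimal user-space sketch in C11. It is not OProfile's code: the names
cpu_ring, ring_push and ring_pop, the sample layout, and the ring size are all
invented for the example, and the kernel uses its own primitives rather than
<stdatomic.h>. The point it shows is that with exactly one producer (the
sampling interrupt on that CPU) and one consumer (the code that drains the
buffer), no lock is needed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u /* illustrative; any power of two works */

/* What the sampling interrupt records (hypothetical layout). */
struct sample {
    uintptr_t pc;       /* linear address of the program counter */
    uint32_t counter;   /* which performance counter fired */
    uint32_t in_kernel; /* context the interrupt occurred in */
};

/* One ring per CPU: a single producer and a single consumer. */
struct cpu_ring {
    struct sample slots[RING_SIZE];
    atomic_size_t head; /* advanced only by the producer */
    atomic_size_t tail; /* advanced only by the consumer */
};

/* Producer side: never blocks; drops the sample when the ring is full. */
static bool ring_push(struct cpu_ring *r, struct sample s)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;
    r->slots[head % RING_SIZE] = s;
    /* Publish the slot only after its contents are written. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: runs later, outside the time-critical path. */
static bool ring_pop(struct cpu_ring *r, struct sample *out)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail == head)
        return false;
    *out = r->slots[tail % RING_SIZE];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Dropping a sample when the ring is full, rather than waiting, is the design
choice that keeps the producer safe to call from a context where blocking is
not allowed.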

Having arbitrary-length strings in the buffer sent to user space is awkward.
OProfile uses the dcookie mechanism to represent each file path as a fixed-size
integer. The daemon in user space can make a system call to convert the number
back into a string. This makes the data format much more compact: large strings
do not need to be passed around, and the user-space daemon only needs to do the
dcookie lookup if it hasn't seen the dcookie value before. The user-space code
in oprofile/daemon/opd_cookie.c does this lookup. The kernel side of the code is
in sys_lookup_dcookie, in linux/fs/dcookies.c. There is some code in
linux/drivers/oprofile/buffer_sync.c that converts the linear address into a
filename and offset.
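
The "only look up a cookie you have not seen before" behaviour can be sketched
as a small cache in C. This is an illustration only, not the code in
opd_cookie.c: cookie_cache, cookie_path and the table size are invented names,
and fake_resolve is a stand-in for whatever actually turns a cookie into a
string (in OProfile's case, the lookup system call mentioned above).

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_SLOTS 64   /* illustrative size */
#define PATH_BUF_LEN 256 /* illustrative size */

/* Resolver signature: fill buf with the path, return <0 on failure. */
typedef int (*resolve_fn)(uint64_t cookie, char *buf, size_t len);

struct cookie_entry {
    uint64_t cookie;
    int used;
    char path[PATH_BUF_LEN];
};

struct cookie_cache {
    struct cookie_entry slots[CACHE_SLOTS];
    resolve_fn resolve;   /* the expensive lookup */
    unsigned long misses; /* how often resolve() was actually called */
};

/* Stand-in resolver for the example; a real daemon would ask the kernel. */
static int fake_resolve(uint64_t cookie, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/fake/%" PRIu64, cookie);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Return the cached path for cookie, resolving it on first sight only. */
static const char *cookie_path(struct cookie_cache *c, uint64_t cookie)
{
    size_t start = (size_t)(cookie % CACHE_SLOTS);

    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct cookie_entry *e = &c->slots[(start + i) % CACHE_SLOTS];

        if (e->used && e->cookie == cookie)
            return e->path; /* seen before: no lookup needed */
        if (!e->used) {
            c->misses++;
            if (c->resolve(cookie, e->path, sizeof e->path) < 0)
                return NULL;
            e->cookie = cookie;
            e->used = 1;
            return e->path;
        }
    }
    return NULL; /* table full; this sketch does not evict */
}
```

Only the fixed-size cookie has to travel in the sample stream; the string is
fetched once per distinct file and reused afterwards.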

There is a dcookie_mutex for the dcookie code, so there is still some locking.
However, for OProfile this locking happens when the buffers are being read out
(less time critical) or when user space is trying to get a string name, rather
than when the sample is actually being collected.

Maybe some of the approach used in dcookies would be useful for VFS path names.

-Will


* Re: oprofile's mechanism to get file path information
  2007-08-02 19:31 oprofile's mechanism to get file path information William Cohen
@ 2007-08-02 21:07 ` David J. Wilder
  2007-08-06 10:15 ` Mike Mason
  1 sibling, 0 replies; 3+ messages in thread
From: David J. Wilder @ 2007-08-02 21:07 UTC
  To: William Cohen; +Cc: SystemTAP

Following up on the lookup-later idea, could we just save the pointer in
the probe handler and do the lookup when the END probe runs? This has to
assume that the file is not deleted between the collection of the pointer
and the END probe running. Another idea is to create some sort of kernel
process, running in a known context, to which we could pass a lookup
request and a place to store the result.
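
The first idea (record a pointer now, resolve names at the end) might look
roughly like the following C sketch. Everything here is hypothetical:
deferred_log, record_ref and resolve_all are names made up for the example,
and demo_name_of simply treats the saved pointer as a string so the sketch can
be run in user space. It does not address the hazard noted above, namely that
the object behind the pointer may be gone by the time it is resolved.

```c
#include <stddef.h>

#define MAX_PENDING 128 /* illustrative limit */

/* One saved reference plus how many times the probe saw it. */
struct pending {
    const void *file_ref;
    unsigned long hits;
};

struct deferred_log {
    struct pending items[MAX_PENDING];
    size_t count;
    unsigned long dropped; /* references lost because the log was full */
};

struct resolved {
    const char *name;
    unsigned long hits;
};

/* Turns a saved reference into a name; may return NULL if it cannot. */
typedef const char *(*name_fn)(const void *file_ref);

/* Probe-time work: no locks and no string handling, just remember it. */
static void record_ref(struct deferred_log *log, const void *file_ref)
{
    for (size_t i = 0; i < log->count; i++) {
        if (log->items[i].file_ref == file_ref) {
            log->items[i].hits++;
            return;
        }
    }
    if (log->count == MAX_PENDING) {
        log->dropped++;
        return;
    }
    log->items[log->count].file_ref = file_ref;
    log->items[log->count].hits = 1;
    log->count++;
}

/* End-of-run work: the expensive lookups happen here, once per reference. */
static size_t resolve_all(const struct deferred_log *log, name_fn name_of,
                          struct resolved *out, size_t out_len)
{
    size_t n = 0;

    for (size_t i = 0; i < log->count && n < out_len; i++) {
        const char *name = name_of(log->items[i].file_ref);

        if (name == NULL)
            continue; /* could not be resolved any more */
        out[n].name = name;
        out[n].hits = log->items[i].hits;
        n++;
    }
    return n;
}

/* Stand-in for the example: the "reference" is already a string. */
static const char *demo_name_of(const void *file_ref)
{
    return (const char *)file_ref;
}
```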


* Re: oprofile's mechanism to get file path information
  2007-08-02 19:31 oprofile's mechanism to get file path information William Cohen
  2007-08-02 21:07 ` David J. Wilder
@ 2007-08-06 10:15 ` Mike Mason
  1 sibling, 0 replies; 3+ messages in thread
From: Mike Mason @ 2007-08-06 10:15 UTC
  To: William Cohen; +Cc: SystemTAP

Some elements of this might be useful for our purposes, but I can think of a few drawbacks:

- Delaying retrieval of the pathname means we can't filter on the pathname in a probe.
- This approach retrieves the pathname from user space via a system call.  We don't have that option in the current design of SystemTap.
- This approach calls d_path() in the lookup_dcookie() system call, but from the context of the task doing the lookup, not from the context of the task that was running when the data was gathered.  The task's root directory is part of the pathname and, at least theoretically, it can be different, thus making the pathname different.  I don't know how much root directories change in real life, but it's possible.
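
To make the last point concrete, here is a small C sketch of how the same file
can have two different names depending on the root directory of the task that
asks. It only manipulates strings and is not how d_path() works internally;
path_seen_from and the example paths are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write into out the name of `full` as seen by a task whose root
 * directory is `root`. Returns 0 on success, -1 if `full` is not
 * under `root` or the result does not fit in `len` bytes.
 */
static int path_seen_from(const char *full, const char *root,
                          char *out, size_t len)
{
    size_t rlen = strlen(root);
    const char *rest;
    int n;

    /* Ignore trailing slashes so "/" and "/srv/jail/" both work. */
    while (rlen > 0 && root[rlen - 1] == '/')
        rlen--;
    if (strncmp(full, root, rlen) != 0)
        return -1;
    /* Require a component boundary: "/srv/jailbreak" is not under "/srv/jail". */
    if (full[rlen] != '/' && full[rlen] != '\0')
        return -1;

    rest = full[rlen] ? full + rlen : "/";
    n = snprintf(out, len, "%s", rest);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With root "/" the file "/srv/jail/lib/libc.so" keeps its full name, while a
task whose root is "/srv/jail" would see it as "/lib/libc.so". If the lookup
runs in a different task than the one that was sampled, the two may disagree.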

Still, this is an interesting idea and one we should explore for this and other purposes.

Thanks,
Mike

