Re: Per-process tracing user-space probes approach

public inbox for systemtap@sourceware.org

From: Vara Prasad <prasadav@us.ibm.com>
To: prasanna@in.ibm.com
Cc: systemtap@sources.redhat.com, roland@redhat.com
Subject: Re: Per-process tracing user-space probes approach
Date: Tue, 13 Jun 2006 00:24:00 -0000
Message-ID: <448E0595.9010503@us.ibm.com>
In-Reply-To: <20060609055325.GA27877@in.ibm.com>

S. P. Prasanna wrote:

>Hi,
>
>I have listed a brief description of the user-space probes approach,
>which I am planning to implement.
>Please review and provide your comments.
>
>Thanks
>Prasanna
>
>Requirements:
>
>- per process tracing using "COW"
>- able to trace yet to be started applications
>  
>
I would think this is lower in priority.

>- smallest kernel patch (no aggregate support for this release)
>- correlation of kernel and user probes output
>- least performance overhead compared to ptrace
>- provide clean user interface like syscall with pre-defined
>  set of handlers that can log data, registers, stack trace etc and
>  also support adding new handlers at runtime.
>- no hooks to readpage(s)
>- handler runs in kernel and handler can sleep to collect data from
>  non-memory resident pages.
>- single step out-of-line
>
>  
>
If I summarize your proposal, there are two steps: first define the
handler, then associate the handler with a break point. User-space
probes will have two types of handlers: predefined handlers (similar to
ptrace) that are built into the kernel, and additional handlers that
users can add to the kernel. New handlers are registered with the kernel
much like kprobes handlers, but they are not associated with a break
point at registration time. The association with a break point is made
through a new system call.
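
The two steps summarized above can be modeled in plain user-space C. This is only a sketch under assumptions: register_uprobe_handler(), attach_breakpoint(), breakpoint_hit(), count_hits(), the handler signature, and the fixed-size tables are hypothetical names invented for illustration, since the thread does not define the registration interface. It shows handlers being registered by name first and bound to a (pid, vaddr) pair later, with unregistered names rejected.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical handler signature; the proposal does not define one. */
typedef int (*uprobe_handler_fn)(pid_t pid, unsigned long vaddr);

#define MAX_HANDLERS 16
#define MAX_PROBES   32

struct handler_entry { const char *name; uprobe_handler_fn fn; };
struct probe_entry   { pid_t pid; unsigned long vaddr; uprobe_handler_fn fn; };

static struct handler_entry handlers[MAX_HANDLERS];
static size_t nr_handlers;
static struct probe_entry probes[MAX_PROBES];
static size_t nr_probes;

int hit_count;	/* incremented by the sample handler below */

/* Sample handler, standing in for something like uprobe_khandler(). */
int count_hits(pid_t pid, unsigned long vaddr)
{
	(void)pid;
	(void)vaddr;
	hit_count++;
	return 0;
}

/* Step 1: make a handler known by name; no break point is involved yet.
 * The name pointer is stored as-is, so it must outlive the registry. */
int register_uprobe_handler(const char *name, uprobe_handler_fn fn)
{
	if (!name || !fn)
		return -EINVAL;
	for (size_t i = 0; i < nr_handlers; i++)
		if (strcmp(handlers[i].name, name) == 0)
			return -EEXIST;
	if (nr_handlers == MAX_HANDLERS)
		return -ENOMEM;
	handlers[nr_handlers].name = name;
	handlers[nr_handlers].fn = fn;
	nr_handlers++;
	return 0;
}

/* Step 2: bind an already registered handler to (pid, vaddr).
 * A name that was never registered is rejected with -ENOENT. */
int attach_breakpoint(pid_t pid, unsigned long vaddr, const char *name)
{
	if (!name)
		return -EINVAL;
	if (nr_probes == MAX_PROBES)
		return -ENOMEM;
	for (size_t i = 0; i < nr_handlers; i++) {
		if (strcmp(handlers[i].name, name) == 0) {
			probes[nr_probes].pid = pid;
			probes[nr_probes].vaddr = vaddr;
			probes[nr_probes].fn = handlers[i].fn;
			nr_probes++;
			return 0;
		}
	}
	return -ENOENT;
}

/* What a break point hit would trigger: run the bound handler, if any. */
int breakpoint_hit(pid_t pid, unsigned long vaddr)
{
	for (size_t i = 0; i < nr_probes; i++)
		if (probes[i].pid == pid && probes[i].vaddr == vaddr)
			return probes[i].fn(pid, vaddr);
	return -ENOENT;
}
```

Looking handlers up by name, rather than accepting a function address from user space, keeps the set of callable routines limited to what a module has explicitly registered.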

>Usage:
>
>1. Specifying a new kernel handler for a probe point.
>	- usage example.
>	   1. Trace a library routine malloc() in an application already
>	      started pid is 123. A kernel module with uprobe_khandler() is
>	      inserted into the kernel.
>
>	    To get the address of malloc() use
>	    #objdump -D appln |grep malloc
>	    0x08048320
>	    int main(void)
>	    {
>		pid_t child = 123;
>
>		printf("insert probes on child pid %d\n", (int)child);
>		/* handler passed by name, matching the char *name parameter */
>		utrace(child, 0x8048320, UTRACE_KHANDLER, "uprobe_khandler");
>		return 0;
>	    }
>
>	    2. Trace a routine foo() in an application yet to be started and
>	       specify a kernel handler utrace_foo_khandler().
>	       First of all a kernel module with utrace_foo_khandler() is
>	       inserted into the kernel.
>  
>
You mean utrace_foo_khandler() needs to be pre-installed through a
module load before running the following program, right?

>	       To get the address of foo() use
>	       #objdump -D appln |grep foo
>	       0x08048fa0
>
>	       int main(void)
>	       {
>			char *argv[] = { "appln", NULL };
>			char *envp[] = { NULL };
>			pid_t child;
>
>			if ((child = fork()) == 0) {
>				utrace(0, 0x08048fa0, UTRACE_KHANDLER, "utrace_foo_khandler");
>				execve("/home/prasanna/appln", argv, envp);
>			}
>			return 0;
>		}
>
>2. Specifying already existing kernel handler for a probe point.
>	- usage example.
>	   1. Trace a library routine malloc() in an application already
>	      started pid is 123 and specify to dump registers.
>
>	    To get the address of malloc() use
>	    #objdump -D appln |grep malloc
>	    0x08048320
>	    int main(void)
>	    {
>		pid_t child = 123;
>
>		printf("insert probes on child pid %d\n", (int)child);
>		utrace(child, 0x8048320, UTRACE_GETREGS, NULL);
>		return 0;
>	    }
>
>	    2. Trace a routine foo() in an application yet to be started and
>	       specify to dump registers.
>
>	       To get the address of foo() use
>	       #objdump -D appln |grep foo
>	       0x08048fa0
>
>	       int main(void)
>	       {
>			char *argv[] = { "appln", NULL };
>			char *envp[] = { NULL };
>			pid_t child;
>
>			if ((child = fork()) == 0) {
>				utrace(0, 0x08048fa0, UTRACE_GETREGS, NULL);
>				execve("/home/prasanna/appln", argv, envp);
>			}
>			return 0;
>		}
>
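
The "yet to be started" cases quoted above follow the usual fork-then-exec pattern: the child requests the probe on itself, then replaces its image. Below is a rough, runnable sketch of that pattern. The proposed utrace() call does not exist, so it is replaced by a do-nothing stand-in, utrace_stub(); that stub and the wrapper run_probed() are hypothetical names used only for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Stand-in for the proposed utrace() call, which is not a real syscall. */
static int utrace_stub(pid_t pid, unsigned long vaddr)
{
	(void)pid;
	(void)vaddr;
	return 0;
}

/* Fork, request the probe in the child before exec, then run the program.
 * Returns the child's exit status, or -1 on failure or abnormal exit. */
int run_probed(const char *path, char *const argv[], unsigned long vaddr)
{
	int status;
	pid_t child = fork();

	if (child < 0)
		return -1;

	if (child == 0) {
		/* The quoted examples pass pid 0 from inside the child. */
		if (utrace_stub(0, vaddr) != 0)
			_exit(126);
		execve(path, argv, environ);
		_exit(127);	/* reached only if execve() failed */
	}

	if (waitpid(child, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that execve() takes the program path, an argument vector, and an environment; a call such as run_probed("/home/prasanna/appln", argv, 0x08048fa0UL) would correspond to the foo() example.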
>Issues:
>
>1. Is it acceptable to allow the user to specify a kernel
>   routine through a syscall that will be executed, when the probe
>   point gets hit ?
>  
>
I don't think we are going to allow any arbitrary existing kernel
function here, right? We are only going to allow associating a
registered uprobes handler with the break point. Granted, a uprobes
handler can be complex and can call an existing kernel function.
The second point is that we already allow this for the kernel anyway,
so what additional exposure do we create by doing it in response to a
break point in user space?

>2. Is it acceptable to run the instrumentation code as part of kernel
>   address space ?
>  
>
We are already running instrumentation code in the kernel for kprobes.
We are exploring the possibility of running it in user land, but the
performance penalty seems to be too high. Roland, who is familiar with
this area, may have some additional comments.

>3. Are there any security concerns ?
>  
>
I am not a security expert, but Roland, who is on the cc, may have some
insights.

>Interfaces:
>
>int sys_utrace(pid_t pid, unsigned long vaddr,
>			unsigned long request, char *name);
>  
>
Maybe a different name, something more like sys_utrace_addbp.

> pid		- process id that needs to be probed.
> vaddr		- virtual address where probe is to be inserted.
> request	- UTRACE_GETDATA, UTRACE_GETREGS, UTRACE_STACKTRACE.
>  
>
What does GETDATA do?
The stack trace here gives the stack of the user process at the point
where it hit the break point, right?
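
Since the quoted struct uprobe describes request as a "bitmap of the request", one way to read the request values is as combinable flag bits. The sketch below assumes that reading. The numeric bit values and the helpers utrace_request_valid() and utrace_request_str() are hypothetical; the thread names the requests but assigns no numbers and does not say whether the syscall argument itself may combine them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit assignments; the proposal gives only the names. */
#define UTRACE_GETDATA    (1UL << 0)
#define UTRACE_GETREGS    (1UL << 1)
#define UTRACE_STACKTRACE (1UL << 2)
#define UTRACE_KHANDLER   (1UL << 3)
#define UTRACE_ALL (UTRACE_GETDATA | UTRACE_GETREGS | \
		    UTRACE_STACKTRACE | UTRACE_KHANDLER)

/* A request is valid if it sets at least one known bit and no unknown ones. */
int utrace_request_valid(unsigned long request)
{
	return request != 0 && (request & ~UTRACE_ALL) == 0;
}

/* Render the set bits as "A|B|C" into buf; output is always NUL-terminated
 * when len > 0, and names that do not fit are dropped. Returns buf. */
const char *utrace_request_str(unsigned long request, char *buf, size_t len)
{
	static const struct { unsigned long bit; const char *name; } names[] = {
		{ UTRACE_GETDATA,    "GETDATA" },
		{ UTRACE_GETREGS,    "GETREGS" },
		{ UTRACE_STACKTRACE, "STACKTRACE" },
		{ UTRACE_KHANDLER,   "KHANDLER" },
	};
	size_t used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		size_t n;

		if (!(request & names[i].bit))
			continue;
		n = strlen(names[i].name);
		/* room needed: optional '|' separator, the name, and the NUL */
		if (used + (used ? 1 : 0) + n >= len)
			break;
		if (used)
			buf[used++] = '|';
		memcpy(buf + used, names[i].name, n);
		used += n;
		buf[used] = '\0';
	}
	return buf;
}
```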

> name		- name of the kernel handler.
>
>maybe _add_ length field as well, so that user can specify length of data
>to be logged.
>  
>
I am not sure I see the value of a length field when the user is not
specifying the buffer where the data is logged.

>void sys_utrace_rm(pid_t pid, unsigned long vaddr);
>  
>
Similarly, sys_utrace_rmbp.

>Data structures:
>
>Allocated for each probe.
>struct uprobe {
>	/*per process and per probe hlist_node */
>	struct hlist_node plist;
>	unsigned long request;		/* bitmap of the request */
>	unsigned long status;		/* status as active/inactive */
>	struct kprobe kp;		/* kprobe structure */
>};
>
>Allocated for each process.
>struct uprobe_module {
>	struct hlist_head phead;	/* list of all probed processes */
>	/* list of all probes for individual process */
>	struct hlist_node mlist;
>	pid_t pid;			/* pid of each probed process */
>};
>
>uprobe_table[];		/* individual probes hashed on vaddr * pid */
>struct hlist_head uprobe_module_head[];
>			/* list of all uprobe_module hashed on pid*/
>uprobe_mutex		/* protect uprobe_table and uprobe_module_table*/
>  
>
I think you also need to describe the interface for registering new
handlers for user-space probes.
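
For the quoted uprobe_table, hashed on vaddr and pid, the add and remove paths behind calls like sys_utrace_addbp and sys_utrace_rmbp might look roughly like the user-space model below. Everything here is an assumption made for illustration: struct uprobe_sketch, uprobe_hash(), uprobe_find(), uprobe_add(), and uprobe_remove() are hypothetical, a plain singly linked chain replaces the kernel hlist, and the locking that uprobe_mutex would provide is left out.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

#define UPROBE_HASH_BITS  6
#define UPROBE_TABLE_SIZE (1U << UPROBE_HASH_BITS)

/* Simplified stand-in for the quoted struct uprobe. */
struct uprobe_sketch {
	struct uprobe_sketch *next;	/* chain within one hash bucket */
	pid_t pid;
	unsigned long vaddr;
	unsigned long request;
};

static struct uprobe_sketch *uprobe_table[UPROBE_TABLE_SIZE];

/* Mix pid and vaddr into a bucket index. A plain product, as the
 * "vaddr * pid" comment might suggest, would map pid 0 to one bucket. */
static unsigned int uprobe_hash(pid_t pid, unsigned long vaddr)
{
	unsigned long key = vaddr ^ ((unsigned long)pid * 0x9e3779b1UL);

	return (unsigned int)((key ^ (key >> 16)) & (UPROBE_TABLE_SIZE - 1));
}

struct uprobe_sketch *uprobe_find(pid_t pid, unsigned long vaddr)
{
	struct uprobe_sketch *p = uprobe_table[uprobe_hash(pid, vaddr)];

	for (; p; p = p->next)
		if (p->pid == pid && p->vaddr == vaddr)
			return p;
	return NULL;
}

/* Add a probe; at most one probe per (pid, vaddr) in this sketch. */
int uprobe_add(pid_t pid, unsigned long vaddr, unsigned long request)
{
	unsigned int h = uprobe_hash(pid, vaddr);
	struct uprobe_sketch *p;

	if (uprobe_find(pid, vaddr))
		return -EEXIST;
	p = malloc(sizeof(*p));
	if (!p)
		return -ENOMEM;
	p->pid = pid;
	p->vaddr = vaddr;
	p->request = request;
	p->next = uprobe_table[h];
	uprobe_table[h] = p;
	return 0;
}

/* Remove the probe at (pid, vaddr), or return -ENOENT if there is none. */
int uprobe_remove(pid_t pid, unsigned long vaddr)
{
	struct uprobe_sketch **pp = &uprobe_table[uprobe_hash(pid, vaddr)];

	for (; *pp; pp = &(*pp)->next) {
		struct uprobe_sketch *p = *pp;

		if (p->pid == pid && p->vaddr == vaddr) {
			*pp = p->next;
			free(p);
			return 0;
		}
	}
	return -ENOENT;
}
```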

Thread overview: 3+ messages
2006-06-09  5:53 S. P. Prasanna
2006-06-13  0:24 ` Vara Prasad [this message]
2006-06-13 10:12 ` Richard J Moore
