From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 09 Jun 2006 05:53:00 -0000
From: "S. P. Prasanna"
To: systemtap@sources.redhat.com
Subject: Per-process tracing user-space probes approach
Message-ID: <20060609055325.GA27877@in.ibm.com>
Reply-To: prasanna@in.ibm.com

Hi,

I have listed a brief description of the user-space probes approach that
I am planning to implement. Please review and provide your comments.

Thanks
Prasanna

Requirements:

- per-process tracing using copy-on-write ("COW")
- able to trace applications that have not yet been started
- smallest possible kernel patch (no aggregate support for this release)
- correlation of kernel and user probe output
- lower performance overhead than ptrace
- a clean user interface, such as a syscall, with a pre-defined set of
  handlers that can log data, registers, stack traces etc., and support
  for adding new handlers at runtime
- no hooks into readpage(s)
- the handler runs in the kernel and can sleep to collect data from
  non-memory-resident pages
- single-step out of line

Usage:

1. Specifying a new kernel handler for a probe point.
   - usage example.
   1. Trace the library routine malloc() in an already running
      application whose pid is 123. A kernel module with
      uprobe_khandler() is inserted into the kernel.
      To get the address of malloc(), use

          # objdump -D appln | grep malloc
          0x08048320

          int main(void)
          {
              pid_t child = 123;

              printf("insert probes on child pid %ld\n", (long)child);
              /* the kernel handler is passed by name */
              utrace(child, 0x08048320, UTRACE_KHANDLER, "uprobe_khandler");
              return 0;
          }

   2. Trace a routine foo() in an application that has not yet been
      started and specify a kernel handler utrace_foo_khandler().
      First, a kernel module with utrace_foo_khandler() is inserted
      into the kernel.
      To get the address of foo(), use

          # objdump -D appln | grep foo
          0x08048fa0

          int main(int argc, char *argv[], char *envp[])
          {
              char *args[] = { "/home/prasanna/appln", NULL };
              pid_t child;

              if ((child = fork()) == 0) {
                  /* in the child: register the probe, then exec */
                  utrace(0, 0x08048fa0, UTRACE_KHANDLER,
                         "utrace_foo_khandler");
                  execve(args[0], args, envp);
              }
              return 0;
          }

2. Specifying an already existing kernel handler for a probe point.
   - usage example.
   1. Trace the library routine malloc() in an already running
      application whose pid is 123 and request a register dump.
      To get the address of malloc(), use

          # objdump -D appln | grep malloc
          0x08048320

          int main(void)
          {
              pid_t child = 123;

              printf("insert probes on child pid %ld\n", (long)child);
              utrace(child, 0x08048320, UTRACE_GETREGS, NULL);
              return 0;
          }

   2. Trace a routine foo() in an application that has not yet been
      started and request a register dump.
      To get the address of foo(), use

          # objdump -D appln | grep foo
          0x08048fa0

          int main(int argc, char *argv[], char *envp[])
          {
              char *args[] = { "/home/prasanna/appln", NULL };
              pid_t child;

              if ((child = fork()) == 0) {
                  /* in the child: register the probe, then exec */
                  utrace(0, 0x08048fa0, UTRACE_GETREGS, NULL);
                  execve(args[0], args, envp);
              }
              return 0;
          }

Issues:

1. Is it acceptable to allow the user to specify, through a syscall, a
   kernel routine that will be executed when the probe point is hit?
2. Is it acceptable to run the instrumentation code as part of the
   kernel address space?
3. Are there any security concerns?

Interfaces:

    int sys_utrace(pid_t pid, unsigned long vaddr,
                   unsigned long request, char *name);

    pid     - id of the process to be probed.
    vaddr   - virtual address where the probe is to be inserted.
    request - UTRACE_GETDATA, UTRACE_GETREGS, UTRACE_STACKTRACE,
              or UTRACE_KHANDLER (as used in the examples above).
    name    - name of the kernel handler.

    Maybe _add_ a length field as well, so that the user can specify
    the length of the data to be logged.

    void sys_utrace_rm(pid_t pid, unsigned long vaddr);

Data structures:

Allocated for each probe.

    struct uprobe {
        /* per-process and per-probe hlist_node */
        struct hlist_node plist;
        unsigned long request;   /* bitmap of the request */
        unsigned long status;    /* status: active/inactive */
        struct kprobe kp;        /* kprobe structure */
    };

Allocated for each process.

    struct uprobe_module {
        struct hlist_head phead;  /* list of all probed processes */
        /* list of all probes for individual process */
        struct hlist_node mlist;
        pid_t pid;                /* pid of each probed process */
    };

    /* individual probes hashed on vaddr * pid */
    uprobe_table[];
    /* list of all uprobe_module hashed on pid */
    struct hlist_head uprobe_module_head[];
    /* protects uprobe_table and uprobe_module_head */
    uprobe_mutex

-- 
S.P. Prasanna
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-41776329
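
For illustration only, the uprobe_table lookup described under "Data
structures" (individual probes hashed on vaddr * pid) can be modelled in
user space roughly as below. This is a minimal sketch under stated
assumptions, not the proposed kernel code: uprobe_entry, uprobe_hash(),
uprobe_find(), uprobe_add(), uprobe_remove() and the table size are
hypothetical names chosen here, a plain singly linked chain stands in
for the kernel's hlist, and no locking (uprobe_mutex) is modelled.

```c
/* User-space model of a probe table hashed on (vaddr, pid).
 * Hypothetical names throughout; not taken from any kernel API. */
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

#define UPROBE_HASH_BITS  6
#define UPROBE_TABLE_SIZE (1u << UPROBE_HASH_BITS)

struct uprobe_entry {
    struct uprobe_entry *next;   /* chain within one hash bucket */
    pid_t pid;                   /* probed process */
    unsigned long vaddr;         /* probed virtual address */
    unsigned long request;       /* request bitmap */
};

struct uprobe_entry *uprobe_table[UPROBE_TABLE_SIZE];

/* Combine vaddr and pid into a bucket index. */
unsigned int uprobe_hash(pid_t pid, unsigned long vaddr)
{
    unsigned long h = vaddr * (unsigned long)pid;

    h ^= h >> 16;
    return (unsigned int)(h & (UPROBE_TABLE_SIZE - 1));
}

/* Return the entry for (pid, vaddr), or NULL if no probe is registered. */
struct uprobe_entry *uprobe_find(pid_t pid, unsigned long vaddr)
{
    struct uprobe_entry *e = uprobe_table[uprobe_hash(pid, vaddr)];

    for (; e != NULL; e = e->next)
        if (e->pid == pid && e->vaddr == vaddr)
            return e;
    return NULL;
}

/* Register a probe. Returns 0 on success, -1 if it already exists
 * or memory allocation fails. */
int uprobe_add(pid_t pid, unsigned long vaddr, unsigned long request)
{
    unsigned int bucket = uprobe_hash(pid, vaddr);
    struct uprobe_entry *e;

    if (uprobe_find(pid, vaddr) != NULL)
        return -1;
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->pid = pid;
    e->vaddr = vaddr;
    e->request = request;
    e->next = uprobe_table[bucket];
    uprobe_table[bucket] = e;
    return 0;
}

/* Remove a probe. Returns 0 if removed, -1 if it was not registered. */
int uprobe_remove(pid_t pid, unsigned long vaddr)
{
    struct uprobe_entry **pp = &uprobe_table[uprobe_hash(pid, vaddr)];

    for (; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->pid == pid && (*pp)->vaddr == vaddr) {
            struct uprobe_entry *dead = *pp;

            *pp = dead->next;
            free(dead);
            return 0;
        }
    }
    return -1;
}
```

In this model, uprobe_add(123, 0x08048320, request) would correspond to
the sys_utrace() registration step and uprobe_remove(123, 0x08048320) to
sys_utrace_rm(); the probe-hit path would call uprobe_find() with the
current pid and the faulting address.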