From: Prasanna S Panchamukhi
Reply-To: prasanna@in.ibm.com
To: systemtap@sources.redhat.com
Date: Wed, 15 Mar 2006 06:04:00 -0000
Subject: [PATH 1/3] User space probes-take4
Message-ID: <20060315060456.GA6376@in.ibm.com>

Hi All,

Thanks to Bibo for testing out these patches. This patch set fixes most
of the bugs reported by Bibo.
Also, the user-space probes code has now been moved into a separate file.
I will check these patches into CVS soon.

Thanks
Prasanna


This kprobes patch adds support for userspace probes. It adds a new
struct, uprobe, to the kprobes API, along with register_uprobe and
unregister_uprobe functions. The implementation uses another new struct,
uprobe_module.

Objects
--------
struct uprobe         - Allocated per probe by the user.
struct uprobe_module  - Allocated per application by the userspace probe
                        mechanism.

struct uprobe {
        /* pointer to the pathname of the application */
        char *pathname;
        /* kprobe structure with user specified handlers */
        struct kprobe kp;
        /* hlist of all the userspace probes per application */
        struct hlist_node ulist;
        /* inode of the probed application */
        struct inode *inode;
        /* probe offset within the file */
        unsigned long offset;
};

struct uprobe_module {
        /* hlist head of all userspace probes per application */
        struct hlist_head ulist_head;
        /* list of all uprobe_module for probed application */
        struct list_head mlist;
        /* to hold path/dentry etc. */
        struct nameidata nd;
        /* original readpage operations */
        struct address_space_operations *ori_a_ops;
        /* readpage hooks added operations */
        struct address_space_operations user_a_ops;
};

Explanation of struct members:

Before calling register_uprobe, the user sets the following members of
struct uprobe:

pathname - the pathname of the probed application's executable/library
           file.
kp       - the kprobe object that specifies the handler(s) to run when
           the probe is hit, and the virtual address (kp.addr) at which
           to place the probe.
offset   - the absolute offset of the probe point from the beginning of
           the probed executable/library file.

The remaining members are for internal use.

uprobe members inode and offset uniquely identify each probe, where:

inode    - is the inode of the probed application.
offset   - is the offset of the probe point from the beginning of the
           file.
When the probe is hit, get_uprobe() walks the uprobe hash table
(uprobe_table) to find the uprobe structure with the matching inode and
offset. This is more efficient than searching for the application's
uprobe_module and then walking that uprobe_module's list of uprobes.

ulist_head and ulist     - hold all uprobes for an executable/library
                           file. During readpage() operations, the
                           per-executable/library file probe list is
                           walked and the probes are inserted.
mlist                    - list of all the probed executable/library
                           files. During readpage() operations, the
                           module list is used to find the matching
                           probed file based on the inode. This list is
                           protected by uprobe_mutex.
nd                       - holds the path of the probed
                           executable/library file until all the
                           inserted probes are removed for that file.
ori_a_ops and user_a_ops - hold the original readpage pointers and the
                           readpage() hooks.

Interfaces:

1. register_uprobe(struct uprobe *uprobe): accepts a pointer to a uprobe.
   The user has to allocate the uprobe structure and initialize the
   following elements:

   pathname         - points to the application's pathname.
   offset           - offset of the probe from the beginning of the file.
                      [It's still the case that the user has to specify
                      the offset as well as the address (see TODO list).]
                      In case of library calls, the offset is the
                      relative offset from the beginning of the mapped
                      library.
   kp.addr          - virtual address within the executable.
   kp.pre_handler   - handler to be executed when the probe is fired.
   kp.post_handler  - handler to be executed after single-stepping the
                      original instruction.
   kp.fault_handler - handler to be executed if a fault occurs while
                      executing the original instruction or the handlers.

   As with a kprobe, the user should not modify the uprobe while it is
   registered. This routine returns zero on successful registration.

2. unregister_uprobe(struct uprobe *uprobe): accepts a pointer to a uprobe.

Usage:

Usage is similar to kprobe.
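A note on the offset and kp.addr members listed under register_uprobe
above: get_uprobe() in this patch recovers the file offset from the
probed address as addr - vma->vm_start + (vma->vm_pgoff << PAGE_SHIFT),
so the two values handed to register_uprobe() have to agree under that
formula. The user-space sketch below (not part of the patch) repeats the
arithmetic. struct mapping_sketch and addr_to_file_offset() are
hypothetical names used only for illustration, and the 4 KiB page size is
an assumption.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages, as on i386 */

/* Hypothetical stand-in for the vm_area_struct fields get_uprobe() reads. */
struct mapping_sketch {
	unsigned long vm_start;	/* first virtual address of the mapping */
	unsigned long vm_end;	/* one past the last virtual address */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};

/*
 * Same arithmetic as get_uprobe(): translate a virtual address inside
 * the mapping into a byte offset within the mapped file.
 */
static unsigned long addr_to_file_offset(const struct mapping_sketch *m,
					 unsigned long addr)
{
	/* The address must lie inside the mapping (the patch uses BUG_ON). */
	assert(addr >= m->vm_start && addr < m->vm_end);

	return addr - m->vm_start + (m->vm_pgoff << PAGE_SHIFT);
}
```

Assuming the text of a non-relocatable i386 executable is mapped at
0x08048000 from file offset 0 (vm_pgoff == 0), the address 0x080484d4
translates to offset 0x4d4, which is the address/offset pair used in the
registration example that follows.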
/* Allocate a uprobe structure */
struct uprobe p;
int ret;

/* Define pre handler */
int handler_pre(struct kprobe *p, struct pt_regs *regs)
{
	<.............collect useful data..............>
	return 0;
}

void handler_post(struct kprobe *p, struct pt_regs *regs, unsigned long flags)
{
	<.............collect useful data..............>
}

int handler_fault(struct kprobe *p, struct pt_regs *regs, int trapnr)
{
	<.............collect useful data..............>
	/* As with kprobes, return 0 when the fault was not handled here */
	return 0;
}

Before inserting the probe, specify the pathname of the application
on which the probe is to be inserted.

/* pointer to the pathname of the application */
p.pathname = "/home/prasanna/bin/myapp";
p.kp.pre_handler = handler_pre;
p.kp.post_handler = handler_post;
p.kp.fault_handler = handler_fault;

/* Specify the probe address */
/* $nm appln |grep func1 */
p.kp.addr = (kprobe_opcode_t *)0x080484d4;
/* Specify the offset within the application/executable */
p.offset = (unsigned long)0x4d4;
/* Now register the userspace probe */
ret = register_uprobe(&p);
if (ret)
	printk("register_uprobe: unsuccessful ret= %d\n", ret);

/* To unregister the registered probe, just call: */
unregister_uprobe(&p);

Signed-off-by: Prasanna S Panchamukhi


 arch/i386/kernel/uprobes.c |   70 +++++
 fs/namei.c                 |   11 
 include/linux/kprobes.h    |   49 +++
 include/linux/namei.h      |    1 
 kernel/Makefile            |    2 
 kernel/kprobes.c           |    2 
 kernel/uprobes.c           |  591 +++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 722 insertions(+), 4 deletions(-)

diff -puN fs/namei.c~kprobes_userspace_probes-base-interface fs/namei.c
--- linux-2.6.16-rc6-mm1/fs/namei.c~kprobes_userspace_probes-base-interface	2006-03-15 10:06:15.000000000 +0530
+++ linux-2.6.16-rc6-mm1-prasanna/fs/namei.c	2006-03-15 10:06:15.000000000 +0530
@@ -322,10 +322,8 @@ int get_write_access(struct inode * inod
 	return 0;
 }
 
-int deny_write_access(struct file * file)
+int deny_write_access_to_inode(struct inode *inode)
 {
-	struct inode *inode = file->f_dentry->d_inode;
-
 	spin_lock(&inode->i_lock);
 	if (atomic_read(&inode->i_writecount) > 0) {
 		spin_unlock(&inode->i_lock);
@@ -337,6 +335,13 @@ int deny_write_access(struct file * file
 	return 0;
 }
 
+int deny_write_access(struct file * file)
+{
+	struct inode *inode = file->f_dentry->d_inode;
+
+	return deny_write_access_to_inode(inode);
+}
+
 void path_release(struct nameidata *nd)
 {
 	dput(nd->dentry);
diff -puN include/linux/kprobes.h~kprobes_userspace_probes-base-interface include/linux/kprobes.h
--- linux-2.6.16-rc6-mm1/include/linux/kprobes.h~kprobes_userspace_probes-base-interface	2006-03-15 10:06:15.000000000 +0530
+++ linux-2.6.16-rc6-mm1-prasanna/include/linux/kprobes.h	2006-03-15 11:15:16.000000000 +0530
@@ -37,6 +37,10 @@
 #include
 #include
 #include
+#include
+#include
+#include
+#include
 
 #ifdef CONFIG_KPROBES
 #include
@@ -54,6 +58,7 @@ struct kprobe;
 struct pt_regs;
 struct kretprobe;
 struct kretprobe_instance;
+extern struct uprobe *current_uprobe;
 typedef int (*kprobe_pre_handler_t) (struct kprobe *, struct pt_regs *);
 typedef int (*kprobe_break_handler_t) (struct kprobe *, struct pt_regs *);
 typedef void (*kprobe_post_handler_t) (struct kprobe *, struct pt_regs *,
@@ -117,6 +122,32 @@ struct jprobe {
 DECLARE_PER_CPU(struct kprobe *, current_kprobe);
 DECLARE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
 
+struct uprobe {
+	/* pointer to the pathname of the application */
+	char *pathname;
+	/* kprobe structure with user specified handlers */
+	struct kprobe kp;
+	/* hlist of all the userspace probes per application */
+	struct hlist_node ulist;
+	/* inode of the probed application */
+	struct inode *inode;
+	/* probe offset within the file */
+	unsigned long offset;
+};
+
+struct uprobe_module {
+	/* hlist head of all userspace probes per application */
+	struct hlist_head ulist_head;
+	/* list of all uprobe_module for probed application */
+	struct list_head mlist;
+	/* to hold path/dentry etc. */
+	struct nameidata nd;
+	/* original readpage operations */
+	struct address_space_operations *ori_a_ops;
+	/* readpage hooks added operations */
+	struct address_space_operations user_a_ops;
+};
+
 #ifdef ARCH_SUPPORTS_KRETPROBES
 extern void arch_prepare_kretprobe(struct kretprobe *rp, struct pt_regs *regs);
 #else /* ARCH_SUPPORTS_KRETPROBES */
@@ -162,9 +193,14 @@ extern void show_registers(struct pt_reg
 extern kprobe_opcode_t *get_insn_slot(void);
 extern void free_insn_slot(kprobe_opcode_t *slot);
 extern void kprobes_inc_nmissed_count(struct kprobe *p);
+extern int arch_copy_uprobe(struct kprobe *p, kprobe_opcode_t *address);
+extern void arch_arm_uprobe(kprobe_opcode_t *address);
+extern void arch_disarm_uprobe(struct kprobe *p, kprobe_opcode_t *address);
 
 /* Get the kprobe at this addr (if any) - called with preemption disabled */
 struct kprobe *get_kprobe(void *addr);
+struct kprobe *get_uprobe(void *addr);
+extern int arch_alloc_insn(struct kprobe *p);
 struct hlist_head * kretprobe_inst_table_head(struct task_struct *tsk);
 
 /* kprobe_running() will just return the current_kprobe on this CPU */
@@ -183,6 +219,16 @@ static inline struct kprobe_ctlblk *get_
 	return (&__get_cpu_var(kprobe_ctlblk));
 }
 
+static inline void set_uprobe_instance(struct kprobe *p)
+{
+	current_uprobe = container_of(p, struct uprobe, kp);
+}
+
+static inline void reset_uprobe_instance(void)
+{
+	current_uprobe = NULL;
+}
+
 int register_kprobe(struct kprobe *p);
 void unregister_kprobe(struct kprobe *p);
 int setjmp_pre_handler(struct kprobe *, struct pt_regs *);
@@ -194,6 +240,9 @@ void jprobe_return(void);
 int register_kretprobe(struct kretprobe *rp);
 void unregister_kretprobe(struct kretprobe *rp);
 
+int register_uprobe(struct uprobe *uprobe);
+void unregister_uprobe(struct uprobe *uprobe);
+
 struct kretprobe_instance *get_free_rp_inst(struct kretprobe *rp);
 void add_rp_inst(struct kretprobe_instance *ri);
 void kprobe_flush_task(struct task_struct *tk);
diff -puN include/linux/namei.h~kprobes_userspace_probes-base-interface include/linux/namei.h
--- linux-2.6.16-rc6-mm1/include/linux/namei.h~kprobes_userspace_probes-base-interface	2006-03-15 10:06:15.000000000 +0530
+++ linux-2.6.16-rc6-mm1-prasanna/include/linux/namei.h	2006-03-15 10:06:15.000000000 +0530
@@ -81,6 +81,7 @@ extern int follow_up(struct vfsmount **,
 
 extern struct dentry *lock_rename(struct dentry *, struct dentry *);
 extern void unlock_rename(struct dentry *, struct dentry *);
 
+extern int deny_write_access_to_inode(struct inode *inode);
 static inline void nd_set_link(struct nameidata *nd, char *path)
 {
diff -puN kernel/kprobes.c~kprobes_userspace_probes-base-interface kernel/kprobes.c
--- linux-2.6.16-rc6-mm1/kernel/kprobes.c~kprobes_userspace_probes-base-interface	2006-03-15 10:06:15.000000000 +0530
+++ linux-2.6.16-rc6-mm1-prasanna/kernel/kprobes.c	2006-03-15 10:06:15.000000000 +0530
@@ -51,6 +51,7 @@ static struct hlist_head kretprobe_inst_
 DEFINE_MUTEX(kprobe_mutex);		/* Protects kprobe_table */
 DEFINE_SPINLOCK(kretprobe_lock);	/* Protects kretprobe_inst_table */
 static DEFINE_PER_CPU(struct kprobe *, kprobe_instance) = NULL;
+extern void init_uprobes(void);
 
 #ifdef __ARCH_WANT_KPROBES_INSN_SLOT
 /*
@@ -650,6 +651,7 @@ static int __init init_kprobes(void)
 		INIT_HLIST_HEAD(&kretprobe_inst_table[i]);
 	}
 
+	init_uprobes();
 	err = arch_init_kprobes();
 	if (!err)
 		err = register_die_notifier(&kprobe_exceptions_nb);
diff -puN /dev/null kernel/uprobes.c
--- /dev/null	2004-06-24 23:34:38.000000000 +0530
+++ linux-2.6.16-rc6-mm1-prasanna/kernel/uprobes.c	2006-03-15 11:15:18.000000000 +0530
@@ -0,0 +1,591 @@
+/*
+ * User-space Probes (UProbes)
+ * kernel/uprobes.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * 2006-Mar	Created by Prasanna S Panchamukhi
+ *		User-space probes initial implementation.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define UPROBE_HASH_BITS 6
+#define UPROBE_TABLE_SIZE (1 << UPROBE_HASH_BITS)
+
+/* user space probes lists */
+static struct list_head uprobe_module_list;
+static struct hlist_head uprobe_table[UPROBE_TABLE_SIZE];
+DEFINE_SPINLOCK(uprobe_lock);	/* Protects uprobe_table */
+DEFINE_MUTEX(uprobe_mutex);	/* Protects uprobe_module_list */
+
+/*
+ * Aggregate handlers for multiple uprobes support - these handlers
+ * take care of invoking the individual uprobe handlers on p->list
+ */
+static int __kprobes aggr_user_pre_handler(struct kprobe *p,
+						struct pt_regs *regs)
+{
+	struct kprobe *kp;
+
+	list_for_each_entry(kp, &p->list, list) {
+		if (kp->pre_handler) {
+			set_uprobe_instance(kp);
+			if (kp->pre_handler(kp, regs))
+				return 1;
+		}
+	}
+	return 0;
+}
+
+static void __kprobes aggr_user_post_handler(struct kprobe *p,
+				struct pt_regs *regs, unsigned long flags)
+{
+	struct kprobe *kp;
+
+	list_for_each_entry(kp, &p->list, list) {
+		if (kp->post_handler) {
+			set_uprobe_instance(kp);
+			kp->post_handler(kp, regs, flags);
+		}
+	}
+}
+
+static int __kprobes aggr_user_fault_handler(struct kprobe *p,
+					struct pt_regs *regs, int trapnr)
+{
+	struct kprobe *cur;
+
+	/*
+	 * if we faulted "during" the execution of a user specified
+	 * probe handler, invoke just that probe's fault handler
+	 */
+	cur = &current_uprobe->kp;
+	if (cur && cur->fault_handler)
+		if (cur->fault_handler(cur, regs, trapnr))
+			return 1;
+	return 0;
+}
+
+/**
+ * This routine looks for an existing uprobe at the given offset and inode.
+ * If it's found, returns the corresponding kprobe pointer.
+ * This should be called with uprobe_lock held.
+ */
+static struct kprobe __kprobes *get_kprobe_user(struct inode *inode,
+							unsigned long offset)
+{
+	struct hlist_head *head;
+	struct hlist_node *node;
+	struct kprobe *p, *kpr;
+	struct uprobe *uprobe;
+
+	head = &uprobe_table[hash_ptr((kprobe_opcode_t *)
+			(((unsigned long)inode) * offset), UPROBE_HASH_BITS)];
+
+	hlist_for_each_entry(p, node, head, hlist) {
+		if (p->pre_handler == aggr_user_pre_handler) {
+			kpr = list_entry(p->list.next, typeof(*kpr), list);
+			uprobe = container_of(kpr, struct uprobe, kp);
+		} else
+			uprobe = container_of(p, struct uprobe, kp);
+
+		if ((uprobe->inode == inode) && (uprobe->offset == offset))
+			return p;
+	}
+
+	return NULL;
+}
+
+/**
+ * Finds a uprobe at the specified user-space address in the current task.
+ * Points current_uprobe at that uprobe and returns the corresponding kprobe.
+ */
+struct kprobe __kprobes *get_uprobe(void *addr)
+{
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
+	struct inode *inode;
+	unsigned long offset;
+	struct kprobe *p, *kpr;
+	struct uprobe *uprobe;
+
+	vma = find_vma(mm, (unsigned long)addr);
+
+	BUG_ON(!vma);	/* this should not happen, not in our memory map */
+
+	offset = (unsigned long)addr - vma->vm_start +
+					(vma->vm_pgoff << PAGE_SHIFT);
+	if (!vma->vm_file)
+		return NULL;
+
+	inode = vma->vm_file->f_dentry->d_inode;
+
+	p = get_kprobe_user(inode, offset);
+	if (!p)
+		return NULL;
+
+	if (p->pre_handler == aggr_user_pre_handler) {
+		/*
+		 * Walk the uprobe aggregate list and return the first
+		 * element on the aggregate list.
+		 */
+		kpr = list_entry((p)->list.next, typeof(*kpr), list);
+		uprobe = container_of(kpr, struct uprobe, kp);
+	} else
+		uprobe = container_of(p, struct uprobe, kp);
+
+	if (uprobe)
+		current_uprobe = uprobe;
+
+	return p;
+}
+
+/*
+ * Keep all fields in the kprobe consistent
+ */
+static inline void copy_uprobe(struct kprobe *old_p, struct kprobe *p)
+{
+	memcpy(&p->opcode, &old_p->opcode, sizeof(kprobe_opcode_t));
+	memcpy(&p->ainsn, &old_p->ainsn, sizeof(struct arch_specific_insn));
+}
+
+/*
+ * Fill in the required fields of the "manager uprobe". Replace the
+ * earlier kprobe in the hlist with the manager uprobe
+ */
+static inline void add_aggr_uprobe(struct kprobe *ap, struct kprobe *p)
+{
+	copy_uprobe(p, ap);
+	ap->addr = p->addr;
+	ap->pre_handler = aggr_user_pre_handler;
+	ap->post_handler = aggr_user_post_handler;
+	ap->fault_handler = aggr_user_fault_handler;
+
+	INIT_LIST_HEAD(&ap->list);
+	list_add(&p->list, &ap->list);
+
+	hlist_replace_rcu(&p->hlist, &ap->hlist);
+}
+
+/*
+ * This is the second or subsequent uprobe at the address - handle
+ * the intricacies
+ */
+static int __kprobes register_aggr_uprobe(struct kprobe *old_p,
+					  struct kprobe *p)
+{
+	int ret = 0;
+	struct kprobe *ap;
+
+	if (old_p->pre_handler == aggr_user_pre_handler) {
+		copy_uprobe(old_p, p);
+		list_add(&p->list, &old_p->list);
+	} else {
+		ap = kzalloc(sizeof(struct kprobe), GFP_ATOMIC);
+		if (!ap)
+			return -ENOMEM;
+		add_aggr_uprobe(ap, old_p);
+		copy_uprobe(ap, p);
+		list_add(&p->list, &old_p->list);
+	}
+	return ret;
+}
+
+typedef int (*process_uprobe_func_t)(struct uprobe *uprobe,
+					kprobe_opcode_t *address);
+
+/**
+ * Saves the original instruction in the uprobe structure and
+ * inserts the breakpoint at the given address.
+ */
+int __kprobes insert_kprobe_user(struct uprobe *uprobe,
+				kprobe_opcode_t *address)
+{
+	int ret = 0;
+
+	ret = arch_copy_uprobe(&uprobe->kp, address);
+	if (ret) {
+		printk("Breakpoint already present\n");
+		return ret;
+	}
+	arch_arm_uprobe(address);
+
+	return 0;
+}
+
+/**
+ * Wait for the page to be unlocked if someone else had locked it,
+ * then map the page and insert or remove the breakpoint.
+ */
+static int __kprobes map_uprobe_page(struct page *page, struct uprobe *uprobe,
+				process_uprobe_func_t process_kprobe_user)
+{
+	int ret = 0;
+	kprobe_opcode_t *uprobe_address;
+
+	if (!page)
+		return -EINVAL; /* TODO: more suitable errno */
+
+	wait_on_page_locked(page);
+	/* could probably retry readpage here. */
+	if (!PageUptodate(page))
+		return -EINVAL; /* TODO: more suitable errno */
+
+	lock_page(page);
+
+	uprobe_address = (kprobe_opcode_t *)kmap(page);
+	uprobe_address = (kprobe_opcode_t *)((unsigned long)uprobe_address +
+				(uprobe->offset & ~PAGE_MASK));
+	ret = (*process_kprobe_user)(uprobe, uprobe_address);
+	kunmap(page);
+
+	unlock_page(page);
+
+	return ret;
+}
+
+/**
+ * flush_vma walks through the list of process private mappings,
+ * gets the vma containing the offset and flush all the vma's
+ * containing the probed page.
+ */
+static void __kprobes flush_vma(struct address_space *mapping,
+				struct page *page, struct uprobe *uprobe)
+{
+	struct vm_area_struct *vma = NULL;
+	struct prio_tree_iter iter;
+	struct prio_tree_root *head = &mapping->i_mmap;
+	struct mm_struct *mm;
+	unsigned long start, end, offset = uprobe->offset;
+
+	spin_lock(&mapping->i_mmap_lock);
+	vma_prio_tree_foreach(vma, &iter, head, offset, offset) {
+		mm = vma->vm_mm;
+		start = vma->vm_start - (vma->vm_pgoff << PAGE_SHIFT);
+		end = vma->vm_end - (vma->vm_pgoff << PAGE_SHIFT);
+
+		if ((start + offset) < end)
+			flush_icache_user_range(vma, page,
+					(unsigned long)uprobe->kp.addr,
+						sizeof(kprobe_opcode_t));
+	}
+	spin_unlock(&mapping->i_mmap_lock);
+}
+
+/**
+ * Walk the uprobe_module_list and return the uprobe module with matching
+ * inode.
+ */
+static struct uprobe_module __kprobes *get_module_by_inode(struct inode *inode)
+{
+	struct uprobe_module *umodule;
+
+	list_for_each_entry(umodule, &uprobe_module_list, mlist) {
+		if (umodule->nd.dentry->d_inode == inode)
+			return umodule;
+	}
+
+	return NULL;
+}
+
+/**
+ * Gets exclusive write access to the given inode to ensure that the file
+ * on which probes are currently applied does not change. Use the function,
+ * deny_write_access_to_inode() we added in fs/namei.c.
+ */
+static inline int ex_write_lock(struct inode *inode)
+{
+	return deny_write_access_to_inode(inode);
+}
+
+/**
+ * Called when removing user space probes to release the write lock on the
+ * inode.
+ */
+static inline int ex_write_unlock(struct inode *inode)
+{
+	atomic_inc(&inode->i_writecount);
+	return 0;
+}
+
+/**
+ * Add uprobe and uprobe_module to the appropriate hash list.
+ */
+static void __kprobes get_inode_ops(struct uprobe *uprobe,
+				   struct uprobe_module *umodule)
+{
+	INIT_HLIST_HEAD(&umodule->ulist_head);
+	hlist_add_head(&uprobe->ulist, &umodule->ulist_head);
+	list_add(&umodule->mlist, &uprobe_module_list);
+}
+
+/*
+ * Removes the specified uprobe from either aggregate uprobe list
+ * or individual uprobe hash table.
+ */
+
+static int __kprobes remove_uprobe(struct uprobe *uprobe)
+{
+	struct kprobe *old_p, *list_p, *p;
+	int ret = 0;
+
+	p = &uprobe->kp;
+	old_p = get_kprobe_user(uprobe->inode, uprobe->offset);
+	if (unlikely(!old_p))
+		return 0;
+
+	if (p != old_p) {
+		list_for_each_entry(list_p, &old_p->list, list)
+			if (list_p == p)
+			/* kprobe p is a valid probe */
+				goto valid_p;
+		return 0;
+	}
+
+valid_p:
+	if ((old_p == p) ||
+	    ((old_p->pre_handler == aggr_user_pre_handler) &&
+	     (p->list.next == &old_p->list) &&
+	     (p->list.prev == &old_p->list))) {
+		/* Only probe on the hash list */
+		ret = 1;
+		hlist_del(&old_p->hlist);
+		if (p != old_p) {
+			list_del(&p->list);
+			kfree(old_p);
+		}
+	} else
+		list_del(&p->list);
+
+	return ret;
+}
+
+/*
+ * Disarms the probe and frees the corresponding instruction slot.
+ */
+static int __kprobes remove_kprobe_user(struct uprobe *uprobe,
+				kprobe_opcode_t *address)
+{
+	struct kprobe *p = &uprobe->kp;
+
+	arch_disarm_uprobe(p, address);
+	arch_remove_kprobe(p);
+
+	return 0;
+}
+
+/*
+ * Adds the given uprobe to the uprobe_hash table if it is
+ * the first probe to be inserted at the given address else
+ * adds to the aggregate uprobe's list.
+ */
+static int __kprobes insert_uprobe(struct uprobe *uprobe)
+{
+	struct kprobe *old_p;
+	int ret = 0;
+	unsigned long offset = uprobe->offset;
+	unsigned long inode = (unsigned long) uprobe->inode;
+	struct hlist_head *head;
+	unsigned long flags;
+
+	spin_lock_irqsave(&uprobe_lock, flags);
+	uprobe->kp.nmissed = 0;
+
+	old_p = get_kprobe_user(uprobe->inode, uprobe->offset);
+
+	if (old_p)
+		register_aggr_uprobe(old_p, &uprobe->kp);
+	else {
+		head = &uprobe_table[hash_ptr((kprobe_opcode_t *)
+					(offset * inode), UPROBE_HASH_BITS)];
+		INIT_HLIST_NODE(&uprobe->kp.hlist);
+		hlist_add_head(&uprobe->kp.hlist, head);
+		ret = 1;
+	}
+
+	spin_unlock_irqrestore(&uprobe_lock, flags);
+
+	return ret;
+}
+
+/**
+ * unregister_uprobe: Disarms the probe, removes the uprobe
+ * pointers from the hash list and unhooks readpage routines.
+ */
+void __kprobes unregister_uprobe(struct uprobe *uprobe)
+{
+	struct address_space *mapping;
+	struct uprobe_module *umodule;
+	struct page *page;
+	unsigned long flags;
+	int ret = 0;
+
+	if (!uprobe->inode)
+		return;
+
+	mapping = uprobe->inode->i_mapping;
+
+	page = find_get_page(mapping, uprobe->offset >> PAGE_CACHE_SHIFT);
+
+	spin_lock_irqsave(&uprobe_lock, flags);
+	ret = remove_uprobe(uprobe);
+	spin_unlock_irqrestore(&uprobe_lock, flags);
+
+	mutex_lock(&uprobe_mutex);
+	if (!(umodule = get_module_by_inode(uprobe->inode)))
+		goto out;
+
+	hlist_del(&uprobe->ulist);
+	if (hlist_empty(&umodule->ulist_head)) {
+		list_del(&umodule->mlist);
+		ex_write_unlock(uprobe->inode);
+		path_release(&umodule->nd);
+		kfree(umodule);
+	}
+
+out:
+	mutex_unlock(&uprobe_mutex);
+	if (ret)
+		ret = map_uprobe_page(page, uprobe, remove_kprobe_user);
+
+	if (ret == -EINVAL)
+		return;
+	/*
+	 * TODO: unregister_uprobe should not fail, need to handle
+	 * if it fails.
+	 */
+	flush_vma(mapping, page, uprobe);
+
+	if (page)
+		page_cache_release(page);
+}
+
+/**
+ * register_uprobe(): combination of inode and offset is used to
+ * identify each probe uniquely. Each uprobe can be found from the
+ * uprobes_hash table by using inode and offset. register_uprobe(),
+ * inserts the breakpoint at the given address by locating and mapping
+ * the page. return 0 on success and error on failure.
+ */
+int __kprobes register_uprobe(struct uprobe *uprobe)
+{
+	struct address_space *mapping;
+	struct uprobe_module *umodule = NULL;
+	struct inode *inode;
+	struct nameidata nd;
+	struct page *page;
+	int error = 0;
+
+	INIT_HLIST_NODE(&uprobe->ulist);
+
+	/*
+	 * TODO: Need to calculate the absolute file offset for dynamic
+	 * shared libraries.
+	 */
+	if ((error = path_lookup(uprobe->pathname, LOOKUP_FOLLOW, &nd)))
+		return error;
+
+	mutex_lock(&uprobe_mutex);
+
+	inode = nd.dentry->d_inode;
+	error = ex_write_lock(inode);
+	if (error)
+		goto out;
+
+	/*
+	 * Check if there are probes already on this application and
+	 * add the corresponding uprobe to per application probe's list.
+	 */
+	umodule = get_module_by_inode(inode);
+	if (!umodule) {
+
+		error = arch_alloc_insn(&uprobe->kp);
+		if (error)
+			goto out;
+
+		/*
+		 * Allocate a uprobe_module structure for this
+		 * application if not allocated before.
+		 */
+		umodule = kzalloc(sizeof(struct uprobe_module), GFP_KERNEL);
+		if (!umodule) {
+			error = -ENOMEM;
+			ex_write_unlock(inode);
+			arch_remove_kprobe(&uprobe->kp);
+			goto out;
+		}
+		memcpy(&umodule->nd, &nd, sizeof(struct nameidata));
+		get_inode_ops(uprobe, umodule);
+	} else {
+		path_release(&nd);
+		ex_write_unlock(inode);
+		hlist_add_head(&uprobe->ulist, &umodule->ulist_head);
+	}
+	mutex_unlock(&uprobe_mutex);
+
+	uprobe->inode = inode;
+	mapping = inode->i_mapping;
+	page = find_get_page(mapping, (uprobe->offset >> PAGE_CACHE_SHIFT));
+
+	if (insert_uprobe(uprobe))
+		error = map_uprobe_page(page, uprobe, insert_kprobe_user);
+
+	/*
+	 * If error == -EINVAL, return success, probes will be inserted by
+	 * readpage hooks.
+	 * TODO: Use a more suitable errno?
+	 */
+	if (error == -EINVAL)
+		error = 0;
+	flush_vma(mapping, page, uprobe);
+
+	if (page)
+		page_cache_release(page);
+
+	return error;
+out:
+	path_release(&nd);
+	mutex_unlock(&uprobe_mutex);
+
+	return error;
+}
+
+void init_uprobes(void)
+{
+	int i;
+
+	/* FIXME allocate the probe table, currently defined statically */
+	/* initialize all list heads */
+	for (i = 0; i < UPROBE_TABLE_SIZE; i++)
+		INIT_HLIST_HEAD(&uprobe_table[i]);
+
+	INIT_LIST_HEAD(&uprobe_module_list);
+}
+
+EXPORT_SYMBOL_GPL(register_uprobe);
+EXPORT_SYMBOL_GPL(unregister_uprobe);
+
+
diff -puN /dev/null arch/i386/kernel/uprobes.c
--- /dev/null	2004-06-24 23:34:38.000000000 +0530
+++ linux-2.6.16-rc6-mm1-prasanna/arch/i386/kernel/uprobes.c	2006-03-15 11:16:45.000000000 +0530
@@ -0,0 +1,70 @@
+/*
+ * User-space Probes (UProbes)
+ * arch/i386/kernel/uprobes.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006.
+ *
+ * 2006-Mar	Created by Prasanna S Panchamukhi
+ *		User-space probes initial implementation.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+int __kprobes arch_alloc_insn(struct kprobe *p)
+{
+	mutex_lock(&kprobe_mutex);
+	p->ainsn.insn = get_insn_slot();
+	mutex_unlock(&kprobe_mutex);
+
+	if (!p->ainsn.insn)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void __kprobes arch_disarm_uprobe(struct kprobe *p, kprobe_opcode_t *address)
+{
+	if (p->opcode != BREAKPOINT_INSTRUCTION)
+		*address = p->opcode;
+}
+
+void __kprobes arch_arm_uprobe(kprobe_opcode_t *address)
+{
+	*address = BREAKPOINT_INSTRUCTION;
+}
+
+int __kprobes arch_copy_uprobe(struct kprobe *p, kprobe_opcode_t *address)
+{
+	int ret = 1;
+
+	/*
+	 * TODO: Check if the given address is valid for accessing user memory.
+	 */
+	if (*address != BREAKPOINT_INSTRUCTION) {
+		memcpy(p->ainsn.insn, address, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
+		ret = 0;
+	}
+	p->opcode = *(kprobe_opcode_t *)address;
+
+	return ret;
+}
diff -puN kernel/Makefile~kprobes_userspace_probes-base-interface kernel/Makefile
--- linux-2.6.16-rc6-mm1/kernel/Makefile~kprobes_userspace_probes-base-interface	2006-03-15 10:06:15.000000000 +0530
+++ linux-2.6.16-rc6-mm1-prasanna/kernel/Makefile	2006-03-15 10:06:15.000000000 +0530
@@ -32,7 +32,7 @@ obj-$(CONFIG_IKCONFIG) += configs.o
 obj-$(CONFIG_STOP_MACHINE) += stop_machine.o
 obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
 obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
-obj-$(CONFIG_KPROBES) += kprobes.o
+obj-$(CONFIG_KPROBES) += kprobes.o uprobes.o
 obj-$(CONFIG_SYSFS) += ksysfs.o
 obj-$(CONFIG_DETECT_SOFTLOCKUP) += softlockup.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
_
-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329