From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 10573 invoked by alias); 22 Feb 2006 07:11:35 -0000 Received: (qmail 10566 invoked by uid 22791); 22 Feb 2006 07:11:34 -0000 X-Spam-Status: No, hits=-0.7 required=5.0 tests=AWL,BAYES_00,DNS_FROM_RFC_ABUSE,SPF_SOFTFAIL X-Spam-Check-By: sourceware.org Received: from e33.co.us.ibm.com (HELO e33.co.us.ibm.com) (32.97.110.151) by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 22 Feb 2006 07:11:32 +0000 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e33.co.us.ibm.com (8.12.11/8.12.11) with ESMTP id k1M7BUQW027274 for ; Wed, 22 Feb 2006 02:11:30 -0500 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VERS6.8) with ESMTP id k1M7E9Dv175182 for ; Wed, 22 Feb 2006 00:14:09 -0700 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id k1M7BUDH019716 for ; Wed, 22 Feb 2006 00:11:30 -0700 Received: from newton.in.ibm.com ([9.124.35.47]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id k1M7BSKc019683 for ; Wed, 22 Feb 2006 00:11:29 -0700 Received: by newton.in.ibm.com (Postfix, from userid 500) id 13E29CE3; Wed, 22 Feb 2006 12:43:07 +0530 (IST) Date: Wed, 22 Feb 2006 07:11:00 -0000 From: Prasanna S Panchamukhi To: systemtap@sources.redhat.com Subject: [PATCH] Kprobes- robust fault handling for i386 Message-ID: <20060222071307.GA31873@in.ibm.com> Reply-To: prasanna@in.ibm.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-IsSubscribed: yes Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm Precedence: bulk List-Subscribe: List-Post: List-Help: , Sender: systemtap-owner@sourceware.org X-SW-Source: 2006-q1/txt/msg00584.txt.bz2 Hi, Below is the prototype for robust fault handling, as of now this patch is for i386 architecture and should be easily ported to other architectures. Your comments and suggestions are welcome. This patch has been tested for page faults that occur while accessing user address space data. Support needs to be added for cases such as divide by zero, NULL pointer dereference, etc. Also as of now we increment the nmissed count, instead we can track such instances by having independent counters such as nprefault, npostfault. Thanks Prasanna This patch provides proper kprobe fault handling for the following cases: - If the user specified pre/post handlers generate a fault, say, due to access to user address space, through copy_from_user(), get_user() etc. In this case we invoke the user specified fault handler (if any) and allow it to handle it. In case the user specified fault handler is unable to handle the fault, we skip calling subsequent processing (ie., calling the user specified pre/post handlers) and transparently singlestep on the original instruction. - If a fault happens while singlestepping the original instruction, the user fault handler isn't called. We instead reset the faulted probe, change the instruction pointer to the probed address and enable the interrupts so that the system fault handler can rectify it. Signed-off-by: Prasanna S Panchamukhi Signed-off-by: Ananth N Mavinakayanahalli arch/i386/kernel/kprobes.c | 122 +++++++++++++++++++++++++++++++++++++++------ include/asm-i386/kprobes.h | 2 include/linux/kprobes.h | 1 3 files changed, 111 insertions(+), 14 deletions(-) diff -puN arch/i386/kernel/kprobes.c~kprobes-fault-handling-fix arch/i386/kernel/kprobes.c --- linux-2.6.16-rc3-mm1/arch/i386/kernel/kprobes.c~kprobes-fault-handling-fix 2006-02-21 17:03:29.000000000 +0530 +++ linux-2.6.16-rc3-mm1-prasanna/arch/i386/kernel/kprobes.c 2006-02-21 17:10:51.000000000 +0530 @@ -35,6 +35,7 @@ #include #include #include +#include void jprobe_return_end(void); @@ -184,6 +185,20 @@ void __kprobes arch_prepare_kretprobe(st } /* + * Kprobe pre handler trampoline saves the function return address and + * calls the registered user pre handler. In case if the user + * specified pre handler causes any page faults, the + * kprobe_fault_handler() gets notified and it just returns directly + * to kprobe_handler(), where trampoline was suppose to return. + */ +static int __kprobes kprobe_pre_handler_trampoline(struct kprobe *p, + struct pt_regs *regs, struct kprobe_ctlblk *kcb) +{ + kcb->handler_retaddr = (unsigned long)__builtin_return_address(0); + return (p->pre_handler(p, regs)); +} + +/* * Interrupts are disabled on entry as trap3 is an interrupt gate and they * remain disabled thorough out this function. */ @@ -286,11 +301,26 @@ static int __kprobes kprobe_handler(stru set_current_kprobe(p, regs, kcb); kcb->kprobe_status = KPROBE_HIT_ACTIVE; - - if (p->pre_handler && p->pre_handler(p, regs)) - /* handler has already set things up, so skip ss setup */ - return 1; - + if (p->pre_handler) { + kcb->handler_regs = regs; + if (kprobe_pre_handler_trampoline(p, regs, kcb)) { + /* + * Check if the user defined pre-handler caused + * any faults, in such case set up for single + * stepping of original instruction. Also, set + * appropriate flags for skipping the post + * handler, since executing the user defined + * post handler is not safe * after single stepping + * the original instruction. + */ + kcb = get_kprobe_ctlblk(); + if (kcb->kprobe_status == KPROBE_HIT_FAULT) { + regs = kcb->handler_regs; + prepare_singlestep(p, regs); + } + return 1; + } + } if (p->ainsn.boostable == 1 && #ifdef CONFIG_PREEMPT !(pre_preempt_count) && /* @@ -498,6 +528,21 @@ no_change: } /* + * Kprobe post handler trampoline saves the function return address + * and calls the registered user post handler. In case if the user + * specified post handler causes any page faults, the + * kprobe_fault_handler() gets notified and it just returns directly + * to the kprobes_post_handler() where trampoline was suppose to + * return. + */ +static void __kprobes kprobe_post_handler_trampoline(struct kprobe *p, + struct pt_regs *regs, struct kprobe_ctlblk *kcb) +{ + kcb->handler_retaddr = (unsigned long)__builtin_return_address(0); + p->post_handler(p, regs, 0); +} + +/* * Interrupts are disabled on entry as trap1 is an interrupt gate and they * remain disabled thoroughout this function. */ @@ -509,9 +554,21 @@ static inline int post_kprobe_handler(st if (!cur) return 0; - if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) { + if ((kcb->kprobe_status != KPROBE_REENTER) + && (kcb->kprobe_status != KPROBE_HIT_FAULT) + && cur->post_handler) { + kcb->handler_regs = regs; kcb->kprobe_status = KPROBE_HIT_SSDONE; - cur->post_handler(cur, regs, 0); + kprobe_post_handler_trampoline(cur, regs, kcb); + kcb = get_kprobe_ctlblk(); + /* + * Check if user defined handler caused the page fault, in + * such a case restore the register pointers, just resets + * the current kprobe and resumes the execution, since we + * have already single stepped on original instruction. + */ + if (kcb->kprobe_status == KPROBE_HIT_FAULT) + regs = kcb->handler_regs; } resume_execution(cur, regs, kcb); @@ -541,18 +598,55 @@ static inline int kprobe_fault_handler(s { struct kprobe *cur = kprobe_running(); struct kprobe_ctlblk *kcb = get_kprobe_ctlblk(); + unsigned long *sara = (unsigned long *)®s->esp; - if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr)) - return 1; - - if (kcb->kprobe_status & KPROBE_HIT_SS) { - resume_execution(cur, regs, kcb); + switch(kcb->kprobe_status) { + case KPROBE_HIT_SS: + case KPROBE_REENTER: + case KPROBE_HIT_FAULT: + /* + * We are here because the instruction being single + * stepped caused a page fault. We reset the current + * kprobe and the eip points back to the probe address + * and allow the page fault handler to continue as a + * normal page fault. + */ + regs->eip = (unsigned long)cur->addr; regs->eflags |= kcb->kprobe_old_eflags; - reset_current_kprobe(); + if (kcb->kprobe_status == KPROBE_REENTER) + restore_previous_kprobe(kcb); + else + reset_current_kprobe(); preempt_enable_no_resched(); + break; + case KPROBE_HIT_ACTIVE: + case KPROBE_HIT_SSDONE: + /* + * We come here because instructions in the pre/post + * handler caused the page_fault, this could happen + * if handler tries to access user space by + * copy_from_user(), get_user() etc. Let the + * user-specified handler try to fix it first. + */ + if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr)) + return 1; + + /* + * Since user handler returned failure, we handle it + * by skipping the user specified pre/post handler, + * increment the nmissed count and return to the + * pre/post_handler_trampoline(). + */ + kprobes_inc_nmissed_count(cur); + *sara = kcb->handler_retaddr; + kcb->kprobe_status = KPROBE_HIT_FAULT; + break; + default: + break; } - return 0; + + return 0; /* let the page fault handler, fix this exception */ } /* diff -puN include/asm-i386/kprobes.h~kprobes-fault-handling-fix include/asm-i386/kprobes.h --- linux-2.6.16-rc3-mm1/include/asm-i386/kprobes.h~kprobes-fault-handling-fix 2006-02-21 17:03:29.000000000 +0530 +++ linux-2.6.16-rc3-mm1-prasanna/include/asm-i386/kprobes.h 2006-02-21 17:03:29.000000000 +0530 @@ -69,6 +69,8 @@ struct kprobe_ctlblk { unsigned long kprobe_old_eflags; unsigned long kprobe_saved_eflags; long *jprobe_saved_esp; + unsigned long handler_retaddr; + struct pt_regs *handler_regs; struct pt_regs jprobe_saved_regs; kprobe_opcode_t jprobes_stack[MAX_STACK_SIZE]; struct prev_kprobe prev_kprobe; diff -puN include/linux/kprobes.h~kprobes-fault-handling-fix include/linux/kprobes.h --- linux-2.6.16-rc3-mm1/include/linux/kprobes.h~kprobes-fault-handling-fix 2006-02-21 17:03:29.000000000 +0530 +++ linux-2.6.16-rc3-mm1-prasanna/include/linux/kprobes.h 2006-02-21 17:03:29.000000000 +0530 @@ -46,6 +46,7 @@ #define KPROBE_HIT_SS 0x00000002 #define KPROBE_REENTER 0x00000004 #define KPROBE_HIT_SSDONE 0x00000008 +#define KPROBE_HIT_FAULT 0x00000010 /* Attach to insert probes on any functions which should be ignored*/ #define __kprobes __attribute__((__section__(".kprobes.text"))) _ -- Prasanna S Panchamukhi Linux Technology Center India Software Labs, IBM Bangalore Email: prasanna@in.ibm.com Ph: 91-80-51776329