RE: [PATCH] Kprobes- robust fault handling for i386

public inbox for systemtap@sourceware.org

* RE: [PATCH] Kprobes- robust fault handling for i386
@ 2006-02-24 19:17 Keshavamurthy, Anil S
  2006-02-27  9:24 ` Prasanna S Panchamukhi
  0 siblings, 1 reply; 14+ messages in thread
From: Keshavamurthy, Anil S @ 2006-02-24 19:17 UTC (permalink / raw)
  To: prasanna; +Cc: systemtap

Prasanna,
	To make review easier, can you please split your patch into the
following? I am currently finding it hard to follow the kprobes states.
1) [PATCH]Fault handling due to calling pre_handler
2) [PATCH]Fault handling due to calling post_handler
3) [PATCH]Fault handling due to single_stepping
4) [PATCH]Fault handling due to single_stepping reentrant probe

Patches 3 and 4 above are required only to support user probes,
so for now I think you can skip them.

Another suggestion: rename the kprobes states to something more
meaningful, say:
KPROBE_IN_PRE_HANDLER - This state indicates we are calling pre_handler
KPROBE_IN_POST_HANDLER - This state indicates we are calling
post_handler
KPROBES_IN_SS - We are trying to single step
KPROBES_IN_REENTER_SS - We are trying to single step reentrant probes
[...]

Thanks,
-Anil Keshavamurthy
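
A minimal sketch of the naming proposed above, written as a C enumeration.
The four state names are taken from the mail; the numeric values, the
KPROBE_IDLE entry and the helper fault_in_user_handler() are assumptions
made only for illustration and are not part of any posted patch.

```c
#include <stddef.h>

/*
 * State names as proposed in the mail. The values and the KPROBE_IDLE
 * entry are hypothetical and exist only so the sketch compiles.
 */
enum kprobe_state {
	KPROBE_IDLE = 0,         /* hypothetical: no probe is being handled */
	KPROBE_IN_PRE_HANDLER,   /* calling the user-specified pre_handler */
	KPROBE_IN_POST_HANDLER,  /* calling the user-specified post_handler */
	KPROBES_IN_SS,           /* single stepping the original instruction */
	KPROBES_IN_REENTER_SS    /* single stepping a reentrant probe */
};

/*
 * Hypothetical helper: returns 1 when a fault taken in state s came from
 * a user-specified handler rather than from single stepping.
 */
static int fault_in_user_handler(enum kprobe_state s)
{
	return s == KPROBE_IN_PRE_HANDLER || s == KPROBE_IN_POST_HANDLER;
}
```

With names like these, the fault path can tell from the state alone whether
the fault came from a handler or from single stepping.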


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-24 19:17 [PATCH] Kprobes- robust fault handling for i386 Keshavamurthy, Anil S
@ 2006-02-27  9:24 ` Prasanna S Panchamukhi
  2006-02-27  9:25   ` [PATCH] Kprobes- robust fault handling for i386 post_handler changes Prasanna S Panchamukhi
  2006-02-28  1:02   ` [PATCH] Kprobes- robust fault handling for i386 Keshavamurthy Anil S
  0 siblings, 2 replies; 14+ messages in thread
From: Prasanna S Panchamukhi @ 2006-02-27  9:24 UTC (permalink / raw)
  To: Keshavamurthy, Anil S; +Cc: systemtap

Anil,

On Fri, Feb 24, 2006 at 11:17:01AM -0800, Keshavamurthy, Anil S wrote:
> Prasanna,
> 	To make review easier, can you please split your patch into the
> following? I am currently finding it hard to follow the kprobes states.
> 1) [PATCH]Fault handling due to calling pre_handler
> 2) [PATCH]Fault handling due to calling post_handler
> 3) [PATCH]Fault handling due to single_stepping
> 4) [PATCH]Fault handling due to single_stepping reentrant probe
> 
> Patches 3 and 4 above are required only to support user probes,
> so for now I think you can skip them.

For your convenience I have split up the patches; please find
them below.
In general splitting patches is a good idea, but here I think
splitting does not make much difference, since the post_handler changes
are only a few lines.  Correct me if I am wrong.
> 
> Another suggestion: rename the kprobes states to something more
> meaningful, say:
> KPROBE_IN_PRE_HANDLER - This state indicates we are calling pre_handler
> KPROBE_IN_POST_HANDLER - This state indicates we are calling
> post_handler
> KPROBES_IN_SS - We are trying to single step
> KPROBES_IN_REENTER_SS - We are trying to single step reentrant probes
> [...]
> 

Renaming states is a good idea, but we should do it independent of fault
handling. So how about doing it once we have robust fault handling in
place.

Thanks
Prasanna


This patch provides proper kprobes fault handling if a user-specified
pre_handler tries to access user address space through
copy_from_user(), get_user() etc. The user-specified fault handler
gets called only if the fault occurs while executing a user-specified
handler. In such a case the user-specified fault handler is allowed to
fix it first; if it does not fix the fault, we try to fix it by calling
fixup_exception(). We also set the "FAULTED" status if the
user-specified pre_handler faults, so that the corresponding
user-specified post_handler can be skipped. The user-specified fault
handler will not be called if the fault happens when single stepping the
original instruction; instead we reset the current probe and allow the
system page fault handler to fix it up.
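
The fallback order described above (user-specified fault handler first, then
fixup_exception(), then the normal page fault path) can be sketched in
user space as follows. This is only an illustration under stated
assumptions: struct regs_sim, struct kprobe_sim, enum fault_outcome and the
fixup callbacks are stand-ins invented for the sketch, not kernel types.

```c
#include <stddef.h>

/* Stand-ins for struct pt_regs and struct kprobe; only what the sketch needs. */
struct regs_sim {
	unsigned long eip;
};

struct kprobe_sim {
	/* Returns nonzero when the user-specified fault handler fixed the fault. */
	int (*fault_handler)(struct kprobe_sim *p, struct regs_sim *regs, int trapnr);
};

enum fault_outcome {
	FIXED_BY_USER_HANDLER,     /* the real handler would return 1 */
	FIXED_BY_FIXUP_EXCEPTION,  /* the real handler would return 1 */
	LEFT_TO_DO_PAGE_FAULT      /* the real handler would return 0 */
};

/* Stand-in type for fixup_exception(): nonzero means a fixup was found. */
typedef int (*fixup_fn)(struct regs_sim *regs);

/* Handler-fault path: try the user handler, then the fixup, then give up. */
static enum fault_outcome handler_fault(struct kprobe_sim *cur,
					struct regs_sim *regs, int trapnr,
					fixup_fn fixup)
{
	if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
		return FIXED_BY_USER_HANDLER;
	if (fixup(regs))
		return FIXED_BY_FIXUP_EXCEPTION;
	return LEFT_TO_DO_PAGE_FAULT;
}

/* Hypothetical user fault handlers and fixup routines for the demonstration. */
static int user_fixes(struct kprobe_sim *p, struct regs_sim *regs, int trapnr)
{
	(void)p; (void)regs; (void)trapnr;
	return 1;
}

static int user_declines(struct kprobe_sim *p, struct regs_sim *regs, int trapnr)
{
	(void)p; (void)regs; (void)trapnr;
	return 0;
}

static int fixup_found(struct regs_sim *regs)
{
	regs->eip = 0x2000;  /* pretend a fixup entry redirected execution */
	return 1;
}

static int fixup_missing(struct regs_sim *regs)
{
	(void)regs;
	return 0;
}
```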

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>



 arch/i386/kernel/kprobes.c |   73 ++++++++++++++++++++++++++++++++++++++-------
 include/linux/kprobes.h    |   12 +++++++
 2 files changed, 75 insertions(+), 10 deletions(-)

diff -puN arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling arch/i386/kernel/kprobes.c
--- linux-2.6.16-rc4-mm2/arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling	2006-02-27 13:58:41.000000000 +0530
+++ linux-2.6.16-rc4-mm2-prasanna/arch/i386/kernel/kprobes.c	2006-02-27 14:08:45.000000000 +0530
@@ -35,6 +35,7 @@
 #include <asm/cacheflush.h>
 #include <asm/kdebug.h>
 #include <asm/desc.h>
+#include <asm/uaccess.h>
 
 void jprobe_return_end(void);
 
@@ -232,8 +233,9 @@ static int __kprobes kprobe_handler(stru
 	if (kprobe_running()) {
 		p = get_kprobe(addr);
 		if (p) {
-			if (kcb->kprobe_status == KPROBE_HIT_SS &&
-				*p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
+			if (((kcb->kprobe_status == KPROBE_HIT_SS) ||
+				(kcb->kprobe_status == KPROBE_HIT_FAULT_SS)) &&
+				(*p->ainsn.insn == BREAKPOINT_INSTRUCTION)) {
 				regs->eflags &= ~TF_MASK;
 				regs->eflags |= kcb->kprobe_saved_eflags;
 				goto no_kprobe;
@@ -320,7 +322,10 @@ static int __kprobes kprobe_handler(stru
 
 ss_probe:
 	prepare_singlestep(p, regs);
-	kcb->kprobe_status = KPROBE_HIT_SS;
+	if (kcb->kprobe_status != KPROBE_HIT_FAULT)
+		kcb->kprobe_status = KPROBE_HIT_SS;
+	else
+		kcb->kprobe_status = KPROBE_HIT_FAULT_SS;
 	return 1;
 
 no_kprobe:
@@ -554,15 +559,63 @@ static inline int kprobe_fault_handler(s
 	struct kprobe *cur = kprobe_running();
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 
-	if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
-		return 1;
-
-	if (kcb->kprobe_status & KPROBE_HIT_SS) {
-		resume_execution(cur, regs, kcb);
+	switch(kcb->kprobe_status) {
+	case KPROBE_HIT_SS:
+	case KPROBE_REENTER:
+	case KPROBE_HIT_FAULT_SS:
+		/*
+		 * We are here because the instruction being single
+		 * stepped caused a page fault. We reset the current
+		 * kprobe and the eip points back to the probe address
+		 * and allow the page fault handler to continue as a
+		 * normal page fault.
+		 */
+		regs->eip = (unsigned long)cur->addr;
 		regs->eflags |= kcb->kprobe_old_eflags;
-
-		reset_current_kprobe();
+		if (kcb->kprobe_status == KPROBE_REENTER)
+			restore_previous_kprobe(kcb);
+		else
+			reset_current_kprobe();
 		preempt_enable_no_resched();
+		break;
+	case KPROBE_HIT_ACTIVE:
+		/*
+		 *  We set the status as "FAULTED", so that subsequent
+		 *  user specified post handler can be avoided.
+		 */
+		kcb->kprobe_status = KPROBE_HIT_FAULT;
+		/*fixup the exception*/
+		/*
+		 * We increment the nmissed count for accounting,
+		 * we can also use npre/npostfault count for accouting
+		 * these specific fault cases.
+		 */
+		kprobes_inc_nmissed_count(cur);
+
+		/*
+		 * We come here because instructions in the pre/post
+		 * handler caused the page_fault, this could happen
+		 * if handler tries to access user space by
+		 * copy_from_user(), get_user() etc. Let the
+		 * user-specified handler try to fix it first.
+		 */
+		if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
+			return 1;
+
+		/*
+		 * In case the user-specified fault handler returned
+		 * zero, try to fix up.
+		 */
+		if (fixup_exception(regs))
+			return 1;
+
+		/*
+		 * fixup_exception() could not handle it,
+		 * Let do_page_fault() fix it.
+		 */
+		break;
+	default:
+		break;
 	}
 	return 0;
 }
diff -puN include/linux/kprobes.h~kprobes-i386-pagefault-handling include/linux/kprobes.h
--- linux-2.6.16-rc4-mm2/include/linux/kprobes.h~kprobes-i386-pagefault-handling	2006-02-27 13:58:41.000000000 +0530
+++ linux-2.6.16-rc4-mm2-prasanna/include/linux/kprobes.h	2006-02-27 13:58:42.000000000 +0530
@@ -46,6 +46,18 @@
 #define KPROBE_HIT_SS		0x00000002
 #define KPROBE_REENTER		0x00000004
 #define KPROBE_HIT_SSDONE	0x00000008
+/*
+ * When set, signifies that the fault happened
+ * while in the user-specified pre_handler.
+ */
+#define KPROBE_HIT_FAULT	0x00000010
+/*
+ * When set, signifies that the faulted user-specified
+ * pre_handler has been executed, now allow single
+ * stepping of original instruction and dont execute
+ * the post_handler after single stepping.
+ */
+#define KPROBE_HIT_FAULT_SS	0x00000020
 
 /* Attach to insert probes on any functions which should be ignored*/
 #define __kprobes	__attribute__((__section__(".kprobes.text")))

_
-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329


* Re: [PATCH] Kprobes- robust fault handling for i386 post_handler changes
  2006-02-27  9:24 ` Prasanna S Panchamukhi
@ 2006-02-27  9:25   ` Prasanna S Panchamukhi
  2006-02-28  1:02   ` [PATCH] Kprobes- robust fault handling for i386 Keshavamurthy Anil S
  1 sibling, 0 replies; 14+ messages in thread
From: Prasanna S Panchamukhi @ 2006-02-27  9:25 UTC (permalink / raw)
  To: Keshavamurthy, Anil S; +Cc: systemtap


This patch provides proper kprobes fault handling if a user-specified
post_handler tries to access user address space through
copy_from_user(), get_user() etc. The user-specified fault handler
gets called only if the fault occurs while executing a user-specified
handler. In such a case the user-specified fault handler is allowed to
fix it first; if it does not fix the fault, we try to fix it by calling
fixup_exception().
The user-specified fault handler will not be called if the fault happens
when single stepping the original instruction; instead we reset the
current probe and allow the system page fault handler to fix it up.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>



 arch/i386/kernel/kprobes.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletion(-)

diff -puN arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling-post_handler arch/i386/kernel/kprobes.c
--- linux-2.6.16-rc4-mm2/arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling-post_handler	2006-02-27 13:59:13.000000000 +0530
+++ linux-2.6.16-rc4-mm2-prasanna/arch/i386/kernel/kprobes.c	2006-02-27 14:01:50.000000000 +0530
@@ -526,7 +526,9 @@ static inline int post_kprobe_handler(st
 	if (!cur)
 		return 0;
 
-	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
+	if ((kcb->kprobe_status != KPROBE_REENTER)
+			&& (kcb->kprobe_status != KPROBE_HIT_FAULT_SS)
+			&& cur->post_handler) {
 		kcb->kprobe_status = KPROBE_HIT_SSDONE;
 		cur->post_handler(cur, regs, 0);
 	}
@@ -585,6 +587,7 @@ static inline int kprobe_fault_handler(s
 		 */
 		kcb->kprobe_status = KPROBE_HIT_FAULT;
 		/*fixup the exception*/
+	case KPROBE_HIT_SSDONE:
 		/*
 		 * We increment the nmissed count for accounting,
 		 * we can also use npre/npostfault count for accouting

_
-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-27  9:24 ` Prasanna S Panchamukhi
  2006-02-27  9:25   ` [PATCH] Kprobes- robust fault handling for i386 post_handler changes Prasanna S Panchamukhi
@ 2006-02-28  1:02   ` Keshavamurthy Anil S
  2006-02-28 14:37     ` Prasanna S Panchamukhi
  1 sibling, 1 reply; 14+ messages in thread
From: Keshavamurthy Anil S @ 2006-02-28  1:02 UTC (permalink / raw)
  To: Prasanna S Panchamukhi; +Cc: Keshavamurthy, Anil S, systemtap

On Mon, Feb 27, 2006 at 02:55:35PM +0530, Prasanna S Panchamukhi wrote:
> Anil,
> 
> 
> For your convenience I have split up the patches; please find
> them below.
Thanks for the splitting.
> In general splitting of patches is a good idea, but here I think
> splitting does not make much difference, since post_handler changes
> are only few lines.  Correct me if I am wrong.
Since you are introducing lots of kprobes states, it is good
to split the patches according to the pre/post/ss handling, so that
the reviewer can understand why each kprobes state is needed.
Remember: the fewer the states, the easier they are to understand.

> 
> Renaming states is a good idea, but we should do it independent of fault
> handling. So how about doing it once we have robust fault handling in
> place.
Sure, can be done later too.
> 
> 
Overall the logic seems good, except that I did not
see where you are handling multiple sequential faults
that can happen in a pre/post handler, i.e. once a fault happens
in, say, the pre_handler, the status goes to KPROBE_HIT_FAULT.
Say this fault is recovered and the pre_handler continues;
before returning from the pre_handler there can be another fault,
and this fault is not being handled currently.
Also, I did not see why you are not changing the status back to the
original status if the fault is recovered properly, i.e.
KPROBE_HIT_ACTIVE -> KPROBE_HIT_FAULT. In the KPROBE_HIT_FAULT state,
if the fault is recovered, why not change the status back to
KPROBE_HIT_ACTIVE? Any reason for not doing this?


-Anil
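
The concern above can be reproduced with a small user-space model of the
switch in the first version of the patch. This is a sketch under stated
assumptions: the SIM_ constants mirror the KPROBE_ status values, struct
ctlblk_sim stands in for struct kprobe_ctlblk, and the fixable argument
replaces the real recovery attempt. The second function shows one possible
way to act on the suggestion; it is not code from the thread.

```c
/* Status values mirroring the KPROBE_* defines used by the first patch. */
enum {
	SIM_HIT_ACTIVE = 0x01,
	SIM_HIT_FAULT  = 0x10
};

/* Stand-in for struct kprobe_ctlblk; only the status field is modelled. */
struct ctlblk_sim {
	unsigned long status;
};

/*
 * Model of the first patch: a fault in the handler moves the status to
 * HIT_FAULT and never moves it back. Returns 1 when the fault is handled.
 */
static int fault_v1(struct ctlblk_sim *kcb, int fixable)
{
	switch (kcb->status) {
	case SIM_HIT_ACTIVE:
		kcb->status = SIM_HIT_FAULT;
		return fixable ? 1 : 0;
	default:
		/* HIT_FAULT lands here, so a second fault is not handled. */
		return 0;
	}
}

/*
 * Hypothetical variant: the status only changes when recovery fails, so a
 * later fault in the same handler is treated like the first one.
 */
static int fault_restoring(struct ctlblk_sim *kcb, int fixable)
{
	switch (kcb->status) {
	case SIM_HIT_ACTIVE:
		if (fixable)
			return 1;  /* status stays HIT_ACTIVE */
		kcb->status = SIM_HIT_FAULT;
		return 0;
	default:
		return 0;
	}
}
```

In the first model the second recoverable fault falls through to the default
case; in the variant both faults are recovered.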


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-28  1:02   ` [PATCH] Kprobes- robust fault handling for i386 Keshavamurthy Anil S
@ 2006-02-28 14:37     ` Prasanna S Panchamukhi
  2006-02-28 20:25       ` Keshavamurthy Anil S
  0 siblings, 1 reply; 14+ messages in thread
From: Prasanna S Panchamukhi @ 2006-02-28 14:37 UTC (permalink / raw)
  To: Keshavamurthy Anil S; +Cc: systemtap

Anil,

Thanks for your review comments. Please see the updated patch
below, this patch is only for i386 architecture and once
we are ok with it, we will port it to other architectures.

> > 
> Overall the logic seems good, except that I did not
> see where you are handling multiple sequential faults
> that can happen in a pre/post handler, i.e. once a fault happens
> in, say, the pre_handler, the status goes to KPROBE_HIT_FAULT.
> Say this fault is recovered and the pre_handler continues;
> before returning from the pre_handler there can be another fault,
> and this fault is not being handled currently.

The patch below takes care of multiple faults 
within the same pre/post_handler.


> Also, I did not see why you are not changing the status back to the
> original status if the fault is recovered properly, i.e.
> KPROBE_HIT_ACTIVE -> KPROBE_HIT_FAULT. In the KPROBE_HIT_FAULT state,
> if the fault is recovered, why not change the status back to
> KPROBE_HIT_ACTIVE? Any reason for not doing this?
> 

The only reason was to avoid the post_handler being executed in case
the user-defined pre_handler faulted. The patch below now skips the
corresponding user-defined post_handler without introducing any
new state. The main reason to avoid post_handler execution in this
case is to avoid any inconsistent data references between pre and post
handlers.

Thanks
Prasanna
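
The idea of skipping the post_handler without a new state can be sketched
as a pointer comparison. The field name kprobe_faulted comes from the patch
below; struct kprobe_sim, struct ctlblk_sim, the post_calls counter and both
functions are stand-ins assumed for this illustration.

```c
#include <stddef.h>

/* Stand-in for struct kprobe; post_calls counts post_handler invocations. */
struct kprobe_sim {
	int post_calls;
};

/* Stand-in for struct kprobe_ctlblk with the new kprobe_faulted field. */
struct ctlblk_sim {
	struct kprobe_sim *kprobe_faulted;
};

/* Record which probe's pre_handler took a fault. */
static void note_pre_handler_fault(struct ctlblk_sim *kcb, struct kprobe_sim *cur)
{
	kcb->kprobe_faulted = cur;
}

/* Run the post_handler unless this probe's pre_handler faulted. */
static void run_post_handler(struct ctlblk_sim *kcb, struct kprobe_sim *cur)
{
	if (kcb->kprobe_faulted != cur)
		cur->post_calls++;
}
```

Only the probe whose pre_handler faulted is skipped; other probes still get
their post_handler called.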


This patch provides proper kprobes fault handling if a user-specified
pre/post handler tries to access user address space through
copy_from_user(), get_user() etc. The user-specified fault handler
gets called only if the fault occurs while executing a user-specified
handler. In such a case the user-specified fault handler is allowed to
fix it first; if it does not fix the fault, we try to fix it by calling
fixup_exception(). We also set the "kprobe_faulted" instance if the
user-specified pre_handler faults, so that the corresponding
user-specified post_handler can be skipped. The user-specified fault
handler will not be called if the fault happens when single stepping the
original instruction; instead we reset the current probe and allow the
system page fault handler to fix it up.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>



 arch/i386/kernel/kprobes.c |   66 +++++++++++++++++++++++++++++++++++++++------
 include/asm-i386/kprobes.h |    1 
 kernel/kprobes.c           |   14 ++++++++-
 3 files changed, 72 insertions(+), 9 deletions(-)

diff -puN include/asm-i386/kprobes.h~kprobes-i386-pagefault-handling include/asm-i386/kprobes.h
--- linux-2.6.16-rc4-mm2/include/asm-i386/kprobes.h~kprobes-i386-pagefault-handling	2006-02-28 18:00:20.000000000 +0530
+++ linux-2.6.16-rc4-mm2-prasanna/include/asm-i386/kprobes.h	2006-02-28 18:01:16.000000000 +0530
@@ -74,6 +74,7 @@ struct kprobe_ctlblk {
 	long *jprobe_saved_esp;
 	struct pt_regs jprobe_saved_regs;
 	kprobe_opcode_t jprobes_stack[MAX_STACK_SIZE];
+	struct kprobe *kprobe_faulted;
 	struct prev_kprobe prev_kprobe;
 };
 
diff -puN arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling arch/i386/kernel/kprobes.c
--- linux-2.6.16-rc4-mm2/arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling	2006-02-28 09:47:48.000000000 +0530
+++ linux-2.6.16-rc4-mm2-prasanna/arch/i386/kernel/kprobes.c	2006-02-28 19:34:20.000000000 +0530
@@ -35,6 +35,7 @@
 #include <asm/cacheflush.h>
 #include <asm/kdebug.h>
 #include <asm/desc.h>
+#include <asm/uaccess.h>
 
 void jprobe_return_end(void);
 
@@ -523,7 +524,8 @@ static inline int post_kprobe_handler(st
 
 	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
 		kcb->kprobe_status = KPROBE_HIT_SSDONE;
-		cur->post_handler(cur, regs, 0);
+		if (kcb->kprobe_faulted != cur)
+			cur->post_handler(cur, regs, 0);
 	}
 
 	resume_execution(cur, regs, kcb);
@@ -554,15 +556,63 @@ static inline int kprobe_fault_handler(s
 	struct kprobe *cur = kprobe_running();
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 
-	if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
-		return 1;
-
-	if (kcb->kprobe_status & KPROBE_HIT_SS) {
-		resume_execution(cur, regs, kcb);
+	switch(kcb->kprobe_status) {
+	case KPROBE_HIT_SS:
+	case KPROBE_REENTER:
+		/*
+		 * We are here because the instruction being single
+		 * stepped caused a page fault. We reset the current
+		 * kprobe and the eip points back to the probe address
+		 * and allow the page fault handler to continue as a
+		 * normal page fault.
+		 */
+		regs->eip = (unsigned long)cur->addr;
 		regs->eflags |= kcb->kprobe_old_eflags;
-
-		reset_current_kprobe();
+		if (kcb->kprobe_status == KPROBE_REENTER)
+			restore_previous_kprobe(kcb);
+		else
+			reset_current_kprobe();
 		preempt_enable_no_resched();
+		break;
+	case KPROBE_HIT_ACTIVE:
+		/*
+		 * Set appropriate kprobe instance, so that corresponding
+		 * post_handler can be skipped in order to avoid any
+		 * inconsistant data.
+		 */
+		kcb->kprobe_faulted = cur;
+	case KPROBE_HIT_SSDONE:
+		/*
+		 * We increment the nmissed count for accounting,
+		 * we can also use npre/npostfault count for accouting
+		 * these specific fault cases.
+		 */
+		kprobes_inc_nmissed_count(cur);
+
+		/*
+		 * We come here because instructions in the pre/post
+		 * handler caused the page_fault, this could happen
+		 * if handler tries to access user space by
+		 * copy_from_user(), get_user() etc. Let the
+		 * user-specified handler try to fix it first.
+		 */
+		if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
+			return 1;
+
+		/*
+		 * In case the user-specified fault handler returned
+		 * zero, try to fix up.
+		 */
+		if (fixup_exception(regs))
+			return 1;
+
+		/*
+		 * fixup_exception() could not handle it,
+		 * Let do_page_fault() fix it.
+		 */
+		break;
+	default:
+		break;
 	}
 	return 0;
 }
diff -puN kernel/kprobes.c~kprobes-i386-pagefault-handling kernel/kprobes.c
--- linux-2.6.16-rc4-mm2/kernel/kprobes.c~kprobes-i386-pagefault-handling	2006-02-28 18:04:09.000000000 +0530
+++ linux-2.6.16-rc4-mm2-prasanna/kernel/kprobes.c	2006-02-28 19:27:33.000000000 +0530
@@ -208,9 +208,14 @@ static void __kprobes aggr_post_handler(
 					unsigned long flags)
 {
 	struct kprobe *kp;
+	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 
 	list_for_each_entry_rcu(kp, &p->list, list) {
-		if (kp->post_handler) {
+		/*
+		 * Check if the corresponding pre_handler had faulted, avoid
+		 * the post_handler in such a case.
+		 */
+		if (kp->post_handler && (kcb->kprobe_faulted != kp)) {
 			set_kprobe_instance(kp);
 			kp->post_handler(kp, regs, flags);
 			reset_kprobe_instance();
@@ -223,12 +228,19 @@ static int __kprobes aggr_fault_handler(
 					int trapnr)
 {
 	struct kprobe *cur = __get_cpu_var(kprobe_instance);
+	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 
 	/*
 	 * if we faulted "during" the execution of a user specified
 	 * probe handler, invoke just that probe's fault handler
 	 */
 	if (cur && cur->fault_handler) {
+		/*
+		 * Set kprobe_faulted to appropriate kprobe instance, so that
+		 * corresponding post handler can be skipped if the fault
+		 * happened due to pre_handler.
+		 */
+		kcb->kprobe_faulted = cur;
 		if (cur->fault_handler(cur, regs, trapnr))
 			return 1;
 	}

_
-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-28 14:37     ` Prasanna S Panchamukhi
@ 2006-02-28 20:25       ` Keshavamurthy Anil S
  2006-03-01 14:49         ` Prasanna S Panchamukhi
  0 siblings, 1 reply; 14+ messages in thread
From: Keshavamurthy Anil S @ 2006-02-28 20:25 UTC (permalink / raw)
  To: Prasanna S Panchamukhi; +Cc: Keshavamurthy, Anil S, systemtap

On Tue, Feb 28, 2006 at 06:38:36AM -0800, Prasanna S Panchamukhi wrote:
> 
>    Anil,
> 
>    Thanks for your review comments. Please see the updated patch
>    below, this patch is only for i386 architecture and once
>    we are ok with it, we will port it to other architectures.
This version looks good with no new Kprobes states.
Makes life easy to understand :-)

>    [..]The main reason to avoid post_handler execution in this
>    case is to avoid any inconsistent data references between pre and post
>    handlers.
Okay, I got that point, but if the fault recovery in pre_handler
is *successful*, then in this case you *should* permit calling
post_handler. See my inline comments to address this issue.

Thanks,
Anil
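
A sketch of the behaviour requested here: the faulted marker is cleared
again when recovery succeeds, so the post_handler is skipped only after an
unrecovered fault. The kprobe_faulted field name comes from the quoted
patch; everything else (struct names, the recovered flag, both functions)
is assumed for illustration.

```c
#include <stddef.h>

/* Stand-ins for struct kprobe and struct kprobe_ctlblk. */
struct kprobe_sim {
	int id;
};

struct ctlblk_sim {
	struct kprobe_sim *kprobe_faulted;
};

/*
 * Model of the handler-fault path with the requested reset. recovered
 * stands in for "the fault handler or fixup_exception() returned nonzero".
 */
static int handler_fault(struct ctlblk_sim *kcb, struct kprobe_sim *cur,
			 int recovered)
{
	kcb->kprobe_faulted = cur;
	if (recovered) {
		/* Recovery worked: clear the marker before returning 1. */
		kcb->kprobe_faulted = NULL;
		return 1;
	}
	return 0;
}

/* The post_handler may run unless this probe is still marked as faulted. */
static int post_handler_allowed(const struct ctlblk_sim *kcb,
				const struct kprobe_sim *cur)
{
	return kcb->kprobe_faulted != cur;
}
```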

> 
>    Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
> 
>     arch/i386/kernel/kprobes.c |   66 +++++++++++++++++++++++++++++++++++++++------
>     include/asm-i386/kprobes.h |    1
>     kernel/kprobes.c           |   14 ++++++++-
>     3 files changed, 72 insertions(+), 9 deletions(-)
> 
>    diff -puN include/asm-i386/kprobes.h~kprobes-i386-pagefault-handling include/asm-i386/kprobes.h
>    --- linux-2.6.16-rc4-mm2/include/asm-i386/kprobes.h~kprobes-i386-pagefault-handling     2006-02-28 18:00:20.000000000 +0530
>    +++ linux-2.6.16-rc4-mm2-prasanna/include/asm-i386/kprobes.h     2006-02-28 18:01:16.000000000 +0530
>    @@ -74,6 +74,7 @@ struct kprobe_ctlblk {
>            long *jprobe_saved_esp;
>            struct pt_regs jprobe_saved_regs;
>            kprobe_opcode_t jprobes_stack[MAX_STACK_SIZE];
>    +       struct kprobe *kprobe_faulted;
>            struct prev_kprobe prev_kprobe;
>     };
Good approach.

> 
>    diff -puN arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling arch/i386/kernel/kprobes.c
>    --- linux-2.6.16-rc4-mm2/arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling     2006-02-28 09:47:48.000000000 +0530
>    +++ linux-2.6.16-rc4-mm2-prasanna/arch/i386/kernel/kprobes.c     2006-02-28 19:34:20.000000000 +0530
>    @@ -35,6 +35,7 @@
>     #include <asm/cacheflush.h>
>     #include <asm/kdebug.h>
>     #include <asm/desc.h>
>    +#include <asm/uaccess.h>
> 
>     void jprobe_return_end(void);
> 
>    @@ -523,7 +524,8 @@ static inline int post_kprobe_handler(st
> 
>            if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
>                    kcb->kprobe_status = KPROBE_HIT_SSDONE;
>    -               cur->post_handler(cur, regs, 0);
>    +               if (kcb->kprobe_faulted != cur)
>    +                       cur->post_handler(cur, regs, 0);
>            }
> 
>            resume_execution(cur, regs, kcb);
>    @@ -554,15 +556,63 @@ static inline int kprobe_fault_handler(s
>            struct kprobe *cur = kprobe_running();
>            struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> 
>    -       if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
>    -               return 1;
>    -
>    -       if (kcb->kprobe_status & KPROBE_HIT_SS) {
>    -               resume_execution(cur, regs, kcb);
>    +       switch(kcb->kprobe_status) {
>    +       case KPROBE_HIT_SS:
>    +       case KPROBE_REENTER:
>    +               /*
>    +                * We are here because the instruction being single
>    +                * stepped caused a page fault. We reset the current
>    +                * kprobe and the eip points back to the probe address
>    +                * and allow the page fault handler to continue as a
>    +                * normal page fault.
>    +                */
>    +               regs->eip = (unsigned long)cur->addr;
>                    regs->eflags |= kcb->kprobe_old_eflags;
>    -
>    -               reset_current_kprobe();
>    +               if (kcb->kprobe_status == KPROBE_REENTER)
>    +                       restore_previous_kprobe(kcb);
>    +               else
>    +                       reset_current_kprobe();
>                    preempt_enable_no_resched();
>    +               break;
>    +       case KPROBE_HIT_ACTIVE:
>    +               /*
>    +                * Set appropriate kprobe instance, so that corresponding
>    +                * post_handler can be skipped in order to avoid any
>    +                * inconsistant data.
>    +                */
>    +               kcb->kprobe_faulted = cur;
>    +       case KPROBE_HIT_SSDONE:
>    +               /*
>    +                * We increment the nmissed count for accounting,
>    +                * we can also use npre/npostfault count for accouting
>    +                * these specific fault cases.
>    +                */
>    +               kprobes_inc_nmissed_count(cur);
>    +
>    +               /*
>    +                * We come here because instructions in the pre/post
>    +                * handler caused the page_fault, this could happen
>    +                * if handler tries to access user space by
>    +                * copy_from_user(), get_user() etc. Let the
>    +                * user-specified handler try to fix it first.
>    +                */
>    +               if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
>    +                       return 1;
If the fault recovery is successful, you need to reset
kcb->kprobe_faulted to NULL before returning 1.
>    +
>    +               /*
>    +                * In case the user-specified fault handler returned
>    +                * zero, try to fix up.
>    +                */
>    +               if (fixup_exception(regs))
>    +                       return 1;
Same here: before returning 1, you need to reset kcb->kprobe_faulted to NULL.
>    +
>    +               /*
>    +                * fixup_exception() could not handle it,
>    +                * Let do_page_fault() fix it.
>    +                */
>    +               break;
>    +       default:
>    +               break;
>            }
>            return 0;
>     }
>    diff -puN kernel/kprobes.c~kprobes-i386-pagefault-handling kernel/kprobes.c
>    --- linux-2.6.16-rc4-mm2/kernel/kprobes.c~kprobes-i386-pagefault-handling     2006-02-28 18:04:09.000000000 +0530
>    +++ linux-2.6.16-rc4-mm2-prasanna/kernel/kprobes.c     2006-02-28 19:27:33.000000000 +0530
>    @@ -208,9 +208,14 @@ static void __kprobes aggr_post_handler(
>                                            unsigned long flags)
>     {
>            struct kprobe *kp;
>    +       struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> 
>            list_for_each_entry_rcu(kp, &p->list, list) {
>    -               if (kp->post_handler) {
>    +               /*
>    +                * Check if the corresponding pre_handler had faulted, avoid
>    +                * the post_handler in such a case.
>    +                */
>    +               if (kp->post_handler && (kcb->kprobe_faulted != kp)) {
>                            set_kprobe_instance(kp);
>                            kp->post_handler(kp, regs, flags);
>                            reset_kprobe_instance();
>    @@ -223,12 +228,19 @@ static int __kprobes aggr_fault_handler(
>                                            int trapnr)
>     {
>            struct kprobe *cur = __get_cpu_var(kprobe_instance);
>    +       struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> 
>            /*
>             * if we faulted "during" the execution of a user specified
>             * probe handler, invoke just that probe's fault handler
>             */
>            if (cur && cur->fault_handler) {
>    +               /*
>    +                * Set kprobe_faulted to appropriate kprobe instance, so that
>    +                * corresponding post handler can be skipped if the fault
>    +                * happened due to pre_handler.
>    +                */
>    +               kcb->kprobe_faulted = cur;
Move this kcb->kprobe_faulted = cur; before the
if (cur && cur->fault_handler) {} check.
The reason is that, irrespective of cur->fault_handler, we need to save
kcb->kprobe_faulted so that the post handler can be skipped properly.

>                    if (cur->fault_handler(cur, regs, trapnr))
>                            return 1;
>            }
> 
>    _
>    --
>    Prasanna S Panchamukhi
>    Linux Technology Center
>    India Software Labs, IBM Bangalore
>    Email: prasanna@in.ibm.com
>    Ph: 91-80-51776329


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-28 20:25       ` Keshavamurthy Anil S
@ 2006-03-01 14:49         ` Prasanna S Panchamukhi
  0 siblings, 0 replies; 14+ messages in thread
From: Prasanna S Panchamukhi @ 2006-03-01 14:49 UTC (permalink / raw)
  To: Keshavamurthy Anil S; +Cc: systemtap

On Tue, Feb 28, 2006 at 12:25:26PM -0800, Keshavamurthy Anil S wrote:
> On Tue, Feb 28, 2006 at 06:38:36AM -0800, Prasanna S Panchamukhi wrote:
> > 
> >    Anil,
> > 
> >    Thanks for your review comments. Please see the updated patch
> >    below, this patch is only for i386 architecture and once
> >    we are ok with it, we will port it to other architectures.
> This version looks good with no new Kprobes states.
> Makes life easy to understand :-)
> 
> >    [..]The main reason to avoid post_handler execution in this
> >    case is to avoid any inconsistent data references between pre and post
> >    handlers.
> Okay, I got that point, but if the fault recovery in pre_handler
> is *successful*, then in this case you *should* permit calling
> post_handler. See my inline comments to address this issue.

Anil,

To skip post_handler execution after an unsuccessful fault recovery in the
pre_handler, we need to take several things into account, such as aggregate
kprobe handlers and the same kprobe structure being used for the same
probe hit on different cpus at the same time. This prevents us from
skipping the post_handler after an unsuccessful fault recovery for now.
Please find the patch below, which allows post_handler execution in all
cases as of now.

Thanks
Prasanna

This patch provides proper kprobes fault handling, if a user-specified
pre/post handler tries to access user address space through
copy_from_user(), get_user() etc. The user-specified fault handler
gets called only if the fault occurs while executing a user-specified
handler. In such a case the user-specified fault handler is allowed to
fix it first. If it is unsuccessful, we try to fix it by calling
fixup_exception(). The user-specified fault handler will not be called if
the fault happened when single stepping the original instruction,
instead we reset the current probe and allow the system page fault
handler to handle it.
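
The dispatch described above can be sketched as a small user-space
model. This is only an illustration, not the kernel code: struct
probe_model, the ST_* names and the two example handlers are
hypothetical stand-ins, and fixup_exception() is modelled by a plain
flag.

```c
#include <assert.h>
#include <stddef.h>

/* Names mirror the KPROBE_* states in the patch; values are illustrative. */
enum probe_status {
	ST_HIT_ACTIVE,	/* pre_handler is running */
	ST_HIT_SS,	/* single stepping the original instruction */
	ST_REENTER,	/* single stepping a reentrant probe */
	ST_HIT_SSDONE	/* post_handler is running */
};

/* Hypothetical stand-in for the kprobe and its control block. */
struct probe_model {
	enum probe_status status;
	int (*fault_handler)(void);	/* user-specified, may be NULL */
	int fixup_available;		/* models fixup_exception() succeeding */
	unsigned long nmissed;
	int probe_reset;		/* set when the current probe is dropped */
};

/* Example user fault handlers (assumptions for the sketch). */
int handler_fixes(void)    { return 1; }
int handler_declines(void) { return 0; }

/* Returns 1 if the fault was handled here, 0 if the normal page fault
 * path should take over. */
int model_fault_handler(struct probe_model *p)
{
	assert(p != NULL);

	switch (p->status) {
	case ST_HIT_SS:
	case ST_REENTER:
		/* Fault in the single-stepped instruction: drop the probe
		 * and leave the fault to the page fault handler. */
		p->probe_reset = 1;
		return 0;
	case ST_HIT_ACTIVE:
	case ST_HIT_SSDONE:
		/* Fault inside a pre/post handler. */
		p->nmissed++;
		if (p->fault_handler && p->fault_handler())
			return 1;
		if (p->fixup_available)
			return 1;
		return 0;
	}
	return 0;
}
```

Note that in this model the user fault handler is never consulted in
the single-step states, which matches the description above.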

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>



 arch/i386/kernel/kprobes.c |   57 +++++++++++++++++++++++++++++++++++++++------
 1 files changed, 50 insertions(+), 7 deletions(-)

diff -puN arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling arch/i386/kernel/kprobes.c
--- linux-2.6.16-rc4-mm2/arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling	2006-03-01 19:05:01.000000000 +0530
+++ linux-2.6.16-rc4-mm2-prasanna/arch/i386/kernel/kprobes.c	2006-03-01 19:07:17.000000000 +0530
@@ -35,6 +35,7 @@
 #include <asm/cacheflush.h>
 #include <asm/kdebug.h>
 #include <asm/desc.h>
+#include <asm/uaccess.h>
 
 void jprobe_return_end(void);
 
@@ -554,15 +555,57 @@ static inline int kprobe_fault_handler(s
 	struct kprobe *cur = kprobe_running();
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 
-	if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
-		return 1;
-
-	if (kcb->kprobe_status & KPROBE_HIT_SS) {
-		resume_execution(cur, regs, kcb);
+	switch(kcb->kprobe_status) {
+	case KPROBE_HIT_SS:
+	case KPROBE_REENTER:
+		/*
+		 * We are here because the instruction being single
+		 * stepped caused a page fault. We reset the current
+		 * kprobe and the eip points back to the probe address
+		 * and allow the page fault handler to continue as a
+		 * normal page fault.
+		 */
+		regs->eip = (unsigned long)cur->addr;
 		regs->eflags |= kcb->kprobe_old_eflags;
-
-		reset_current_kprobe();
+		if (kcb->kprobe_status == KPROBE_REENTER)
+			restore_previous_kprobe(kcb);
+		else
+			reset_current_kprobe();
 		preempt_enable_no_resched();
+		break;
+	case KPROBE_HIT_ACTIVE:
+	case KPROBE_HIT_SSDONE:
+		/*
+		 * We increment the nmissed count for accounting,
+		 * we can also use npre/npostfault count for accounting
+		 * these specific fault cases.
+		 */
+		kprobes_inc_nmissed_count(cur);
+
+		/*
+		 * We come here because instructions in the pre/post
+		 * handler caused the page_fault, this could happen
+		 * if handler tries to access user space by
+		 * copy_from_user(), get_user() etc. Let the
+		 * user-specified handler try to fix it first.
+		 */
+		if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
+			return 1;
+
+		/*
+		 * In case the user-specified fault handler returned
+		 * zero, try to fix up.
+		 */
+		if (fixup_exception(regs))
+			return 1;
+
+		/*
+		 * fixup_exception() could not handle it,
+		 * Let do_page_fault() fix it.
+		 */
+		break;
+	default:
+		break;
 	}
 	return 0;
 }

_
-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-22  7:11 Prasanna S Panchamukhi
@ 2006-02-24  1:33 ` Jim Keniston
  0 siblings, 0 replies; 14+ messages in thread
From: Jim Keniston @ 2006-02-24  1:33 UTC (permalink / raw)
  To: prasanna; +Cc: SystemTAP

On Tue, 2006-02-21 at 23:13, Prasanna S Panchamukhi wrote:
> Hi,
> 
> Below is the prototype for robust fault handling, as of now 
> this patch is for i386 architecture and should be easily 
> ported to other architectures. Your comments and suggestions 
> are welcome. This patch has been tested for page faults that
> occur while accessing user address space data. Support needs 
> to be added for cases such as divide by zero, NULL pointer 
> dereference, etc. Also as of now we increment the nmissed
> count, instead we can track such instances by having
> independent counters such as nprefault, npostfault.
> 
> Thanks
> Prasanna
...
>  /*
> + * Kprobe pre handler trampoline saves the function return address and
> + * calls the registered user pre handler. If the user
> + * specified pre handler causes any page faults,
> + * kprobe_fault_handler() gets notified and it just returns directly
> + * to kprobe_handler(), where the trampoline was supposed to return.
> + */
> +static int __kprobes kprobe_pre_handler_trampoline(struct kprobe *p,
> +			struct pt_regs *regs, struct kprobe_ctlblk *kcb)
> +{
> +	kcb->handler_retaddr = (unsigned long)__builtin_return_address(0);
> +	return (p->pre_handler(p, regs));
> +}

If/when you pick this back up, you need to consider saving and restoring
non-scratch registers.  In particular, the handler may save and
subsequently modify ebp, ebx, esi, and edi, and then fault.  The caller
of kprobe_pre_handler_trampoline() will expect that these registers have
been restored to their original values when control returns from
kprobe_pre_handler_trampoline() (or the fault-handling code).
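
A user-space analogy may help here. It is only a sketch, not kernel
code: setjmp()/longjmp() save and restore the calling environment,
which on typical implementations includes the non-scratch registers,
and this is the property the fault path would need when it abandons a
handler midway. The names run_handler_guarded(), faulting_handler()
and well_behaved_handler() are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Saved calling environment of the trampoline. */
static jmp_buf handler_ctx;

typedef int (*handler_fn)(int arg);

/* Hypothetical handler that "faults" partway through; longjmp() stands
 * in for the fault path unwinding back to the trampoline. */
int faulting_handler(int arg)
{
	(void)arg;
	longjmp(handler_ctx, 1);
	return 0;	/* not reached */
}

/* Hypothetical handler that completes normally. */
int well_behaved_handler(int arg)
{
	return arg + 1;
}

/* Returns the handler's result, or -1 if the handler "faulted". */
int run_handler_guarded(handler_fn fn, int arg)
{
	assert(fn != NULL);

	if (setjmp(handler_ctx) != 0)
		return -1;	/* resumed here with the saved environment */
	return fn(arg);
}
```

The caller of run_handler_guarded() sees a consistent environment on
both paths, which is what the caller of the trampoline would expect.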

Jim


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-23 12:40   ` Frank Ch. Eigler
@ 2006-02-23 13:17     ` Prasanna S Panchamukhi
  0 siblings, 0 replies; 14+ messages in thread
From: Prasanna S Panchamukhi @ 2006-02-23 13:17 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

On Thu, Feb 23, 2006 at 07:39:53AM -0500, Frank Ch. Eigler wrote:
> Prasanna S Panchamukhi <prasanna@in.ibm.com> writes:
> 
> > [...] As of now to fix the broken kprobes fault handling, here is the
> > patch. This is only for i386, once we freeze on this prototype, this
> > can be ported to other architectures.
> > 
> > This patch provides proper kprobes fault handling, if a user-specified
> > pre/post handlers tries to access user address space, through
> > copy_from_user(), get_user() etc. [...]
> 
> Is it correct that this patch enables only the first bullet in
> <http://sources.redhat.com/ml/systemtap/2006-q1/msg00536.html>?

Yes.


-- 
Thanks & Regards
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-23  8:58 ` Prasanna S Panchamukhi
@ 2006-02-23 12:40   ` Frank Ch. Eigler
  2006-02-23 13:17     ` Prasanna S Panchamukhi
  0 siblings, 1 reply; 14+ messages in thread
From: Frank Ch. Eigler @ 2006-02-23 12:40 UTC (permalink / raw)
  To: systemtap

Prasanna S Panchamukhi <prasanna@in.ibm.com> writes:

> [...] As of now to fix the broken kprobes fault handling, here is the
> patch. This is only for i386, once we freeze on this prototype, this
> can be ported to other architectures.
> 
> This patch provides proper kprobes fault handling, if a user-specified
> pre/post handlers tries to access user address space, through
> copy_from_user(), get_user() etc. [...]

Is it correct that this patch enables only the first bullet in
<http://sources.redhat.com/ml/systemtap/2006-q1/msg00536.html>?

By the way, just for laughs, I tried systemtap on a hugemem x86
kernel.  As probably expected, none of the user-space copy operations
worked: in fact, many "succeeded" and returned null data.

- FChE


* Re: [PATCH] Kprobes- robust fault handling for i386
  2006-02-22 10:41 Mao, Bibo
@ 2006-02-23  8:58 ` Prasanna S Panchamukhi
  2006-02-23 12:40   ` Frank Ch. Eigler
  0 siblings, 1 reply; 14+ messages in thread
From: Prasanna S Panchamukhi @ 2006-02-23  8:58 UTC (permalink / raw)
  To: Mao, Bibo; +Cc: systemtap

Bibo,

You are right, &regs->esp might point to a local variable; we are
working on it. For now, here is a patch to fix the broken kprobes fault
handling. This is only for i386; once we freeze on this prototype, it
can be ported to other architectures.

Thanks
Prasanna


This patch provides proper kprobes fault handling if a user-specified
pre/post handler tries to access user address space through
copy_from_user(), get_user() etc. The user-specified fault handler
gets called only if the fault occurs while executing a user-specified
handler. In such a case the user-specified fault handler is allowed to
fix it first; if it does not fix it, we try to fix it by calling
fixup_exception(). Also, we set the "FAULTED" status if the
user-specified pre handler faults, so that the corresponding
user-specified post_handler can be skipped. The user-specified fault
handler will not be called if the fault happens while single stepping
the original instruction; instead we reset the current probe and allow
the system page fault handler to fix it up.
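
The status transitions described above can be sketched as plain
functions. This is a simplified model of the logic in the diff below,
not a replacement for it; the function names are hypothetical, and the
value of KPROBE_HIT_ACTIVE is assumed from the existing header rather
than taken from this patch.

```c
#include <assert.h>

/* KPROBE_HIT_ACTIVE is assumed; the other values appear in the patch. */
#define KPROBE_HIT_ACTIVE	0x00000001
#define KPROBE_HIT_SS		0x00000002
#define KPROBE_REENTER		0x00000004
#define KPROBE_HIT_SSDONE	0x00000008
#define KPROBE_HIT_FAULT	0x00000010
#define KPROBE_HIT_FAULT_SS	0x00000020

/* A fault while the pre_handler runs marks the probe as "FAULTED". */
unsigned int status_after_pre_handler_fault(unsigned int status)
{
	assert(status == KPROBE_HIT_ACTIVE);
	return KPROBE_HIT_FAULT;
}

/* Status chosen at ss_probe: carry the "FAULTED" mark into single step. */
unsigned int status_for_singlestep(unsigned int status)
{
	return status == KPROBE_HIT_FAULT ? KPROBE_HIT_FAULT_SS
					  : KPROBE_HIT_SS;
}

/* Condition guarding the post_handler call in post_kprobe_handler(). */
int should_run_post_handler(unsigned int status, int has_post_handler)
{
	return status != KPROBE_REENTER &&
	       status != KPROBE_HIT_FAULT_SS &&
	       has_post_handler;
}
```

With these rules, a pre_handler fault leads to KPROBE_HIT_FAULT_SS
during single stepping, and the post_handler is then skipped.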

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>


 arch/i386/kernel/kprobes.c |   78 ++++++++++++++++++++++++++++++++++++++-------
 include/linux/kprobes.h    |    2 +
 2 files changed, 69 insertions(+), 11 deletions(-)

diff -puN arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling arch/i386/kernel/kprobes.c
--- linux-2.6.16-rc3-mm1/arch/i386/kernel/kprobes.c~kprobes-i386-pagefault-handling	2006-02-23 11:41:29.000000000 +0530
+++ linux-2.6.16-rc3-mm1-prasanna/arch/i386/kernel/kprobes.c	2006-02-23 14:09:15.000000000 +0530
@@ -35,6 +35,7 @@
 #include <asm/cacheflush.h>
 #include <asm/kdebug.h>
 #include <asm/desc.h>
+#include <asm/uaccess.h>
 
 void jprobe_return_end(void);
 
@@ -220,8 +221,9 @@ static int __kprobes kprobe_handler(stru
 	if (kprobe_running()) {
 		p = get_kprobe(addr);
 		if (p) {
-			if (kcb->kprobe_status == KPROBE_HIT_SS &&
-				*p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
+			if (((kcb->kprobe_status == KPROBE_HIT_SS) ||
+				(kcb->kprobe_status == KPROBE_HIT_FAULT_SS)) &&
+				(*p->ainsn.insn == BREAKPOINT_INSTRUCTION)) {
 				regs->eflags &= ~TF_MASK;
 				regs->eflags |= kcb->kprobe_saved_eflags;
 				goto no_kprobe;
@@ -308,7 +310,10 @@ static int __kprobes kprobe_handler(stru
 
 ss_probe:
 	prepare_singlestep(p, regs);
-	kcb->kprobe_status = KPROBE_HIT_SS;
+	if (kcb->kprobe_status != KPROBE_HIT_FAULT)
+		kcb->kprobe_status = KPROBE_HIT_SS;
+	else
+		kcb->kprobe_status = KPROBE_HIT_FAULT_SS;
 	return 1;
 
 no_kprobe:
@@ -509,7 +514,9 @@ static inline int post_kprobe_handler(st
 	if (!cur)
 		return 0;
 
-	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
+	if ((kcb->kprobe_status != KPROBE_REENTER)
+			&& (kcb->kprobe_status != KPROBE_HIT_FAULT_SS)
+			&& cur->post_handler) {
 		kcb->kprobe_status = KPROBE_HIT_SSDONE;
 		cur->post_handler(cur, regs, 0);
 	}
@@ -542,15 +549,64 @@ static inline int kprobe_fault_handler(s
 	struct kprobe *cur = kprobe_running();
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 
-	if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
-		return 1;
-
-	if (kcb->kprobe_status & KPROBE_HIT_SS) {
-		resume_execution(cur, regs, kcb);
+	switch(kcb->kprobe_status) {
+	case KPROBE_HIT_SS:
+	case KPROBE_REENTER:
+	case KPROBE_HIT_FAULT_SS:
+		/*
+		 * We are here because the instruction being single
+		 * stepped caused a page fault. We reset the current
+		 * kprobe and the eip points back to the probe address
+		 * and allow the page fault handler to continue as a
+		 * normal page fault.
+		 */
+		regs->eip = (unsigned long)cur->addr;
 		regs->eflags |= kcb->kprobe_old_eflags;
-
-		reset_current_kprobe();
+		if (kcb->kprobe_status == KPROBE_REENTER)
+			restore_previous_kprobe(kcb);
+		else
+			reset_current_kprobe();
 		preempt_enable_no_resched();
+		break;
+	case KPROBE_HIT_ACTIVE:
+		/*
+		 *  We set the status as "FAULTED", so that subsequent
+		 *  user specified post handler can be avoided.
+		 */
+		kcb->kprobe_status = KPROBE_HIT_FAULT;
+		/* fall through and fix up the exception */
+	case KPROBE_HIT_SSDONE:
+		/*
+		 * We increment the nmissed count for accounting,
+		 * we can also use npre/npostfault count for accounting
+		 * these specific fault cases.
+		 */
+		kprobes_inc_nmissed_count(cur);
+
+		/*
+		 * We come here because instructions in the pre/post
+		 * handler caused the page_fault, this could happen
+		 * if handler tries to access user space by
+		 * copy_from_user(), get_user() etc. Let the
+		 * user-specified handler try to fix it first.
+		 */
+		if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
+			return 1;
+
+		/*
+		 * In case the user-specified fault handler returned
+		 * zero, try to fix up.
+		 */
+		if (fixup_exception(regs))
+			return 1;
+
+		/*
+		 * fixup_exception() could not handle it,
+		 * Let do_page_fault() fix it.
+		 */
+		break;
+	default:
+		break;
 	}
 	return 0;
 }
diff -puN include/linux/kprobes.h~kprobes-i386-pagefault-handling include/linux/kprobes.h
--- linux-2.6.16-rc3-mm1/include/linux/kprobes.h~kprobes-i386-pagefault-handling	2006-02-23 12:08:07.000000000 +0530
+++ linux-2.6.16-rc3-mm1-prasanna/include/linux/kprobes.h	2006-02-23 12:31:21.000000000 +0530
@@ -46,6 +46,8 @@
 #define KPROBE_HIT_SS		0x00000002
 #define KPROBE_REENTER		0x00000004
 #define KPROBE_HIT_SSDONE	0x00000008
+#define KPROBE_HIT_FAULT	0x00000010
+#define KPROBE_HIT_FAULT_SS	0x00000020
 
 /* Attach to insert probes on any functions which should be ignored*/
 #define __kprobes	__attribute__((__section__(".kprobes.text")))

_

-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329


* RE: [PATCH] Kprobes- robust fault handling for i386
@ 2006-02-23  0:44 Keshavamurthy, Anil S
  0 siblings, 0 replies; 14+ messages in thread
From: Keshavamurthy, Anil S @ 2006-02-23  0:44 UTC (permalink / raw)
  To: prasanna, systemtap

>Hi,
>
>Below is the prototype for robust fault handling, as of now 
>this patch is for i386 architecture and should be easily 
>ported to other architectures. Your comments and suggestions 
>are welcome. 
Since you are modifying the stack address
(*sara = kcb->handler_retaddr) to return to {pre/post}_handlers,
I am not sure how easily this can be ported to architectures like
IA64 and PPC64, which are register-based architectures.


Cheers,
-Anil


* RE: [PATCH] Kprobes- robust fault handling for i386
@ 2006-02-22 10:41 Mao, Bibo
  2006-02-23  8:58 ` Prasanna S Panchamukhi
  0 siblings, 1 reply; 14+ messages in thread
From: Mao, Bibo @ 2006-02-22 10:41 UTC (permalink / raw)
  To: prasanna; +Cc: systemtap

I have one question; my replies are inline below.

>-----Original Message-----
>From: systemtap-owner@sourceware.org [mailto:systemtap-owner@sourceware.org]
>On Behalf Of Prasanna S Panchamukhi
>Sent: February 22, 2006 15:13
>To: systemtap@sources.redhat.com
>Subject: [PATCH] Kprobes- robust fault handling for i386
>
>Hi,
>
>Below is the prototype for robust fault handling, as of now
>this patch is for i386 architecture and should be easily
>ported to other architectures. Your comments and suggestions
>are welcome. This patch has been tested for page faults that
>occur while accessing user address space data. Support needs
>to be added for cases such as divide by zero, NULL pointer
>dereference, etc. Also as of now we increment the nmissed
>count, instead we can track such instances by having
>independent counters such as nprefault, npostfault.
>
>Thanks
>Prasanna

>@@ -509,9 +554,21 @@ static inline int post_kprobe_handler(st
> 	if (!cur)
> 		return 0;
>
>-	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
>+	if ((kcb->kprobe_status != KPROBE_REENTER)
>+			&& (kcb->kprobe_status != KPROBE_HIT_FAULT)
>+			&& cur->post_handler) {
>+		kcb->handler_regs = regs;
> 		kcb->kprobe_status = KPROBE_HIT_SSDONE;
>-		cur->post_handler(cur, regs, 0);
>+		kprobe_post_handler_trampoline(cur, regs, kcb);
>+		kcb = get_kprobe_ctlblk();
>+		/*
>+		 * Check if user defined handler caused the page fault, in
>+		 * such a case restore the register pointers, just resets
>+		 * the current kprobe and resumes the execution, since we
>+		 * have already single stepped on original instruction.
>+		 */
>+		if (kcb->kprobe_status == KPROBE_HIT_FAULT)
>+			regs = kcb->handler_regs;
> 	}
>
> 	resume_execution(cur, regs, kcb);
>@@ -541,18 +598,55 @@ static inline int kprobe_fault_handler(s
> {
> 	struct kprobe *cur = kprobe_running();
> 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
>+	unsigned long *sara = (unsigned long *)&regs->esp;
What does &regs->esp mean here? If the instruction that causes the page fault is not the first instruction of the called function, then &regs->esp will be the address of a local variable in the called function, not the address of the caller's return address.
>........
>+		*sara = kcb->handler_retaddr;
So this line may sometimes only change the value of a local variable in the callee, and not the return address.
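
To make the concern concrete, here is a toy model of a downward-growing
stack. It is an illustration under simplifying assumptions, not i386
code: struct toy_stack and its helpers are hypothetical, and
toy_overwrite_top() stands in for "*sara = ..." with sara pointing at
the current top of the stack.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 16

/* Toy downward-growing stack; sp indexes the current top-of-stack slot. */
struct toy_stack {
	unsigned long slot[STACK_WORDS];
	size_t sp;
};

void toy_init(struct toy_stack *s)
{
	s->sp = STACK_WORDS;	/* empty stack */
}

void toy_push(struct toy_stack *s, unsigned long v)
{
	assert(s->sp > 0);
	s->slot[--s->sp] = v;
}

/* Models a store through a pointer to the current top of the stack. */
void toy_overwrite_top(struct toy_stack *s, unsigned long v)
{
	assert(s->sp < STACK_WORDS);
	s->slot[s->sp] = v;
}
```

If the call pushes a return address and the callee then pushes a local
before faulting, toy_overwrite_top() rewrites the local; the slot that
holds the return address is left untouched.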

Regards
Bibo,mao


* [PATCH] Kprobes- robust fault handling for i386
@ 2006-02-22  7:11 Prasanna S Panchamukhi
  2006-02-24  1:33 ` Jim Keniston
  0 siblings, 1 reply; 14+ messages in thread
From: Prasanna S Panchamukhi @ 2006-02-22  7:11 UTC (permalink / raw)
  To: systemtap

Hi,

Below is the prototype for robust fault handling. As of now this
patch is for the i386 architecture only, but it should be easy to
port to other architectures. Your comments and suggestions are
welcome. This patch has been tested for page faults that occur
while accessing user address space data. Support still needs to be
added for cases such as divide by zero, NULL pointer dereference,
etc. Also, as of now we increment the nmissed count; instead we
could track such instances with independent counters such as
nprefault and npostfault.
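
A minimal sketch of what such independent counters could look like is
shown here. Only nmissed exists in the patch; struct probe_fault_stats,
enum handler_phase and account_handler_fault() are hypothetical names
used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Which user handler was running when the fault happened. */
enum handler_phase { PHASE_PRE_HANDLER, PHASE_POST_HANDLER };

/* Hypothetical per-probe counters; only nmissed exists in the patch. */
struct probe_fault_stats {
	unsigned long nmissed;		/* probes that could not be run */
	unsigned long nprefault;	/* faults inside the pre_handler */
	unsigned long npostfault;	/* faults inside the post_handler */
};

/* Account a handler fault without touching nmissed. */
void account_handler_fault(struct probe_fault_stats *st,
			   enum handler_phase phase)
{
	assert(st != NULL);

	if (phase == PHASE_PRE_HANDLER)
		st->nprefault++;
	else
		st->npostfault++;
}
```

Keeping the counters separate would let nmissed keep its existing
meaning while still exposing how often handlers fault.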

Thanks
Prasanna

This patch provides proper kprobe fault handling for the following cases:
- If the user specified pre/post handlers generate a fault, say, due to
access to user address space, through copy_from_user(), get_user() etc.
In this case we invoke the user specified fault handler (if any) and allow
it to handle it. In case the user specified fault handler is unable to
handle the fault, we skip the subsequent processing (i.e., calling
the user specified pre/post handlers) and transparently singlestep on the
original instruction.
- If a fault happens while singlestepping the original instruction, the
user fault handler isn't called. We instead reset the faulted probe,
change the instruction pointer to the probed address and enable the
interrupts so that the system fault handler can rectify it.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>


 arch/i386/kernel/kprobes.c |  122 +++++++++++++++++++++++++++++++++++++++------
 include/asm-i386/kprobes.h |    2 
 include/linux/kprobes.h    |    1 
 3 files changed, 111 insertions(+), 14 deletions(-)

diff -puN arch/i386/kernel/kprobes.c~kprobes-fault-handling-fix arch/i386/kernel/kprobes.c
--- linux-2.6.16-rc3-mm1/arch/i386/kernel/kprobes.c~kprobes-fault-handling-fix	2006-02-21 17:03:29.000000000 +0530
+++ linux-2.6.16-rc3-mm1-prasanna/arch/i386/kernel/kprobes.c	2006-02-21 17:10:51.000000000 +0530
@@ -35,6 +35,7 @@
 #include <asm/cacheflush.h>
 #include <asm/kdebug.h>
 #include <asm/desc.h>
+#include <asm/uaccess.h>
 
 void jprobe_return_end(void);
 
@@ -184,6 +185,20 @@ void __kprobes arch_prepare_kretprobe(st
 }
 
 /*
+ * Kprobe pre handler trampoline saves the function return address and
+ * calls the registered user pre handler. If the user
+ * specified pre handler causes any page faults,
+ * kprobe_fault_handler() gets notified and it just returns directly
+ * to kprobe_handler(), where the trampoline was supposed to return.
+ */
+static int __kprobes kprobe_pre_handler_trampoline(struct kprobe *p,
+			struct pt_regs *regs, struct kprobe_ctlblk *kcb)
+{
+	kcb->handler_retaddr = (unsigned long)__builtin_return_address(0);
+	return (p->pre_handler(p, regs));
+}
+
+/*
  * Interrupts are disabled on entry as trap3 is an interrupt gate and they
  * remain disabled thorough out this function.
  */
@@ -286,11 +301,26 @@ static int __kprobes kprobe_handler(stru
 
 	set_current_kprobe(p, regs, kcb);
 	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
-
-	if (p->pre_handler && p->pre_handler(p, regs))
-		/* handler has already set things up, so skip ss setup */
-		return 1;
-
+	if (p->pre_handler) {
+		kcb->handler_regs = regs;
+		if (kprobe_pre_handler_trampoline(p, regs, kcb)) {
+			/*
+			 * Check if the user defined pre-handler caused
+			 * any faults, in such case set up for single
+			 * stepping of original instruction. Also, set
+			 * appropriate flags for skipping the post
+			 * handler, since executing the user defined
+			 * post handler is not safe after single stepping
+			 * the original instruction.
+			 */
+			kcb = get_kprobe_ctlblk();
+			if (kcb->kprobe_status == KPROBE_HIT_FAULT) {
+				regs = kcb->handler_regs;
+				prepare_singlestep(p, regs);
+			}
+			return 1;
+		}
+	}
 	if (p->ainsn.boostable == 1 &&
 #ifdef CONFIG_PREEMPT
 	    !(pre_preempt_count) && /*
@@ -498,6 +528,21 @@ no_change:
 }
 
 /*
+ * Kprobe post handler trampoline saves the function return address
+ * and calls the registered user post handler. If the user
+ * specified post handler causes any page faults,
+ * kprobe_fault_handler() gets notified and it just returns directly
+ * to post_kprobe_handler(), where the trampoline was supposed to
+ * return.
+ */
+static void __kprobes kprobe_post_handler_trampoline(struct kprobe *p,
+			struct pt_regs *regs, struct kprobe_ctlblk *kcb)
+{
+	kcb->handler_retaddr = (unsigned long)__builtin_return_address(0);
+	p->post_handler(p, regs, 0);
+}
+
+/*
  * Interrupts are disabled on entry as trap1 is an interrupt gate and they
  * remain disabled thoroughout this function.
  */
@@ -509,9 +554,21 @@ static inline int post_kprobe_handler(st
 	if (!cur)
 		return 0;
 
-	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
+	if ((kcb->kprobe_status != KPROBE_REENTER)
+			&& (kcb->kprobe_status != KPROBE_HIT_FAULT)
+			&& cur->post_handler) {
+		kcb->handler_regs = regs;
 		kcb->kprobe_status = KPROBE_HIT_SSDONE;
-		cur->post_handler(cur, regs, 0);
+		kprobe_post_handler_trampoline(cur, regs, kcb);
+		kcb = get_kprobe_ctlblk();
+		/*
+		 * Check if user defined handler caused the page fault, in
+		 * such a case restore the register pointers, just resets
+		 * the current kprobe and resumes the execution, since we
+		 * have already single stepped on original instruction.
+		 */
+		if (kcb->kprobe_status == KPROBE_HIT_FAULT)
+			regs = kcb->handler_regs;
 	}
 
 	resume_execution(cur, regs, kcb);
@@ -541,18 +598,55 @@ static inline int kprobe_fault_handler(s
 {
 	struct kprobe *cur = kprobe_running();
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+	unsigned long *sara = (unsigned long *)&regs->esp;
 
-	if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
-		return 1;
-
-	if (kcb->kprobe_status & KPROBE_HIT_SS) {
-		resume_execution(cur, regs, kcb);
+	switch(kcb->kprobe_status) {
+	case KPROBE_HIT_SS:
+	case KPROBE_REENTER:
+	case KPROBE_HIT_FAULT:
+		/*
+		 * We are here because the instruction being single
+		 * stepped caused a page fault. We reset the current
+		 * kprobe and the eip points back to the probe address
+		 * and allow the page fault handler to continue as a
+		 * normal page fault.
+		 */
+		regs->eip = (unsigned long)cur->addr;
 		regs->eflags |= kcb->kprobe_old_eflags;
 
-		reset_current_kprobe();
+		if (kcb->kprobe_status == KPROBE_REENTER)
+			restore_previous_kprobe(kcb);
+		else
+			reset_current_kprobe();
 		preempt_enable_no_resched();
+		break;
+	case KPROBE_HIT_ACTIVE:
+	case KPROBE_HIT_SSDONE:
+		/*
+		 * We come here because instructions in the pre/post
+		 * handler caused the page_fault, this could happen
+		 * if handler tries to access user space by
+		 * copy_from_user(), get_user() etc. Let the
+		 * user-specified handler try to fix it first.
+		 */
+		if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
+			return 1;
+
+		/*
+		 * Since user handler returned failure, we handle it
+		 * by skipping the user specified pre/post handler,
+		 * increment the nmissed count and return to the
+		 * pre/post_handler_trampoline().
+		 */
+		kprobes_inc_nmissed_count(cur);
+		*sara = kcb->handler_retaddr;
+		kcb->kprobe_status = KPROBE_HIT_FAULT;
+		break;
+	default:
+		break;
 	}
-	return 0;
+
+	return 0; /* let the page fault handler fix this exception */
 }
 
 /*
diff -puN include/asm-i386/kprobes.h~kprobes-fault-handling-fix include/asm-i386/kprobes.h
--- linux-2.6.16-rc3-mm1/include/asm-i386/kprobes.h~kprobes-fault-handling-fix	2006-02-21 17:03:29.000000000 +0530
+++ linux-2.6.16-rc3-mm1-prasanna/include/asm-i386/kprobes.h	2006-02-21 17:03:29.000000000 +0530
@@ -69,6 +69,8 @@ struct kprobe_ctlblk {
 	unsigned long kprobe_old_eflags;
 	unsigned long kprobe_saved_eflags;
 	long *jprobe_saved_esp;
+	unsigned long handler_retaddr;
+	struct pt_regs *handler_regs;
 	struct pt_regs jprobe_saved_regs;
 	kprobe_opcode_t jprobes_stack[MAX_STACK_SIZE];
 	struct prev_kprobe prev_kprobe;
diff -puN include/linux/kprobes.h~kprobes-fault-handling-fix include/linux/kprobes.h
--- linux-2.6.16-rc3-mm1/include/linux/kprobes.h~kprobes-fault-handling-fix	2006-02-21 17:03:29.000000000 +0530
+++ linux-2.6.16-rc3-mm1-prasanna/include/linux/kprobes.h	2006-02-21 17:03:29.000000000 +0530
@@ -46,6 +46,7 @@
 #define KPROBE_HIT_SS		0x00000002
 #define KPROBE_REENTER		0x00000004
 #define KPROBE_HIT_SSDONE	0x00000008
+#define KPROBE_HIT_FAULT	0x00000010
 
 /* Attach to insert probes on any functions which should be ignored*/
 #define __kprobes	__attribute__((__section__(".kprobes.text")))

_
-- 
Prasanna S Panchamukhi
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-51776329


Thread overview: 14+ messages
2006-02-24 19:17 [PATCH] Kprobes- robust fault handling for i386 Keshavamurthy, Anil S
2006-02-27  9:24 ` Prasanna S Panchamukhi
2006-02-27  9:25   ` [PATCH] Kprobes- robust fault handling for i386 post_handler changes Prasanna S Panchamukhi
2006-02-28  1:02   ` [PATCH] Kprobes- robust fault handling for i386 Keshavamurthy Anil S
2006-02-28 14:37     ` Prasanna S Panchamukhi
2006-02-28 20:25       ` Keshavamurthy Anil S
2006-03-01 14:49         ` Prasanna S Panchamukhi
  -- strict thread matches above, loose matches on Subject: below --
2006-02-23  0:44 Keshavamurthy, Anil S
2006-02-22 10:41 Mao, Bibo
2006-02-23  8:58 ` Prasanna S Panchamukhi
2006-02-23 12:40   ` Frank Ch. Eigler
2006-02-23 13:17     ` Prasanna S Panchamukhi
2006-02-22  7:11 Prasanna S Panchamukhi
2006-02-24  1:33 ` Jim Keniston
