public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
From: Jim Keniston <jkenisto@us.ibm.com>
To: Martin Hunt <hunt@redhat.com>
Cc: SystemTAP <systemtap@sources.redhat.com>
Subject: Re: kprobe fault handling
Date: Tue, 07 Feb 2006 00:51:00 -0000	[thread overview]
Message-ID: <1139273452.2866.76.camel@dyn9047018079.beaverton.ibm.com> (raw)
In-Reply-To: <1139255423.4689.10.camel@monkey2>

Hi, Martin.  This is the subject of bugzilla #1303, which is assigned to
Prasanna.  I believe that Prasanna is also the author of the original
kprobe_fault_handler code in kprobes, so he may be able to provide
insight that I don't have.

However, I've thought about this some, so here goes...

On Mon, 2006-02-06 at 11:50, Martin Hunt wrote:
> I've been trying to understand how kprobes fault handling is supposed to
> work and why it isn't doing what I thought it did.
> 
> When page faults happen, do_page_fault() almost immediately calls
> notify_die(DIE_PAGE_FAULT,...) This calls the notifier chain which calls
> kprobe_exceptions_notify(). This calls kprobe_fault_handler().
> 
> kprobe_fault_handler() checkes to see if there is a specific fault
> fandler for that kprobe, and if there is, it calls it.

I believe that your analysis is correct.

> Question: What
> do we imagine a probe-specific page fault handler would do?  Why is it
> useful?

If an attempted memory access failed, the fault handler might be able to
clean up what the pre_handler or post_handler was trying to do.

For example, consider user-space return probes (which is currently hung
up on a couple of issues, including user-space access).  When we enter
the probed function, the uretprobe version of arch_prepare_kretprobe()
has to do the following:
1. Reserve a kretprobe_instance.
2. Save a copy of the return address (a read from user space).
3. Replace the return address with the uretprobe trampoline address (a
write to user space).
4. Hash the kretprobe_instance.
If the page containing the return address is not in memory (a very
unlikely scenario, but one which I believe we must handle) then we'll
get a page fault on #2 or #3.  The corresponding fault handler could
free up the kretprobe_instance reserved in step #1.

Of course, the obvious question is, what does the fault handler do
then?  We don't want to return as if the memory access were successful,
because it wasn't.  We don't want to let the page be demand-paged back
in, because that might cause us to sleep -- a definite no-no.  But those
seem to be the only two possible outcomes, given the way kprobes
currently works.

The "right" thing to do, perhaps, is for kprobe_handler to notice that
the kprobe in question has a fault handler is associated with it, and do
a sort of setjmp before calling the pre_handler.  Then the fault handler
could end in a corresponding longjmp.  (The longjmp would bypass several
stack frames, including those of the fault handler,
kprobe_fault_handler, kprobe_exceptions_notify, notify_die and friends,
do_page_fault, and the pre_handler.)

I understand what fixup_exceptions() does, but it's not clear to me how
we can use it here.  I guess if all user-space access by kprobes
handlers used the same user-read and user-write functions, then we'd
have a fixup for the user-read function and one for the user-write, and
each of these could vector into the aforementioned longjmp.

Do you envision that all user-space accesses would be via functions like
*_from_user() in the SystemTap runtime?

> 
> Then there is this code, which I don't understand
> 	if (kcb->kprobe_status & KPROBE_HIT_SS) {
> 		resume_execution(cur, regs, kcb);
> 		regs->eflags |= kcb->kprobe_old_eflags;
> 
> 		reset_current_kprobe();
> 		preempt_enable_no_resched();
> 	}
> 

I think KPROBE_HIT_SS means that the fault happened while the probed
instruction was being single-stepped.  Beyond that, I'm not sure what's
going on.

> And that's it. kprobe_fault_handler returns 0.  No call to
> fixup_exceptions()!  So do_page_fault() will have to do the fixups, but
> first it will print nasty might_sleep warnings and maybe actually sleep!
> 
> I could have sworn this was not the case previously but it has been a
> very long time since I have looked at the code at this level.  Anyway,
> this MUST be fixed. 
> 
> Martin

I agree that we need to resolve this.  Thanks for getting this
discussion rolling again.

Jim

  reply	other threads:[~2006-02-07  0:51 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-02-06 19:49 Martin Hunt
2006-02-07  0:51 ` Jim Keniston [this message]
2006-02-07 17:31   ` Jim Keniston
2006-02-07 17:50     ` Martin Hunt
2006-02-07 19:49       ` Jim Keniston
2006-02-08  4:38         ` Suparna Bhattacharya
2006-02-08 11:32           ` Richard J Moore
2006-02-09  7:23             ` Prasanna S Panchamukhi
2006-02-09 16:33               ` Keshavamurthy Anil S
2006-02-09 21:35               ` Jim Keniston
2006-02-09 22:06                 ` Martin Hunt
2006-02-10  5:39                   ` Martin Hunt
2006-02-10 21:46                     ` Frank Ch. Eigler
2006-02-10 21:55                       ` Martin Hunt
2006-02-10 22:12                         ` Frank Ch. Eigler
2006-02-10 22:17                           ` Martin Hunt
2006-02-10 22:20                             ` Frank Ch. Eigler
2006-02-10 22:41                               ` Martin Hunt
2006-02-10 22:47                                 ` Frank Ch. Eigler
2006-02-10 23:36                                   ` Martin Hunt
2006-02-11  0:49                                     ` Frank Ch. Eigler
2006-02-12  1:26                                       ` Martin Hunt
2006-02-13 13:39                                         ` Frank Ch. Eigler
  -- strict thread matches above, loose matches on Subject: below --
2006-02-09  8:55 Mao, Bibo
2006-02-09 10:22 ` Richard J Moore
2006-02-07 22:19 Keshavamurthy, Anil S
2006-02-07 20:36 Keshavamurthy, Anil S
2006-02-07 20:48 ` Martin Hunt
2005-09-06 12:56 Prasanna S Panchamukhi
2005-09-06 15:09 ` Frank Ch. Eigler
2005-09-08 11:52   ` Prasanna S Panchamukhi
2005-09-08 17:42     ` Frank Ch. Eigler
2005-09-06  1:06 Frank Ch. Eigler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1139273452.2866.76.camel@dyn9047018079.beaverton.ibm.com \
    --to=jkenisto@us.ibm.com \
    --cc=hunt@redhat.com \
    --cc=systemtap@sources.redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).