From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <systemtap-return-16956-listarch-systemtap=sources.redhat.com@sourceware.org>
Received: (qmail 10787 invoked by alias); 2 Nov 2010 19:25:44 -0000
Received: (qmail 10686 invoked by uid 22791); 2 Nov 2010 19:25:42 -0000
X-SWARE-Spam-Status: No, hits=-1.9 required=5.0	tests=AWL,BAYES_00,TW_XC,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from e3.ny.us.ibm.com (HELO e3.ny.us.ibm.com) (32.97.182.143)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 02 Nov 2010 19:25:35 +0000
Received: from d01relay03.pok.ibm.com (d01relay03.pok.ibm.com [9.56.227.235])	by e3.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id oA2J8Q3B029292	for <systemtap@sources.redhat.com>; Tue, 2 Nov 2010 15:08:26 -0400
Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64])	by d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id oA2JPWhY350132	for <systemtap@sources.redhat.com>; Tue, 2 Nov 2010 15:25:32 -0400
Received: from d01av04.pok.ibm.com (loopback [127.0.0.1])	by d01av04.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id oA2JPWxd025440	for <systemtap@sources.redhat.com>; Tue, 2 Nov 2010 15:25:32 -0400
Received: from [9.65.229.212] (sig-9-65-229-212.mts.ibm.com [9.65.229.212])	by d01av04.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id oA2JPUwk024763;	Tue, 2 Nov 2010 15:25:31 -0400
Subject: Re: documentation for user-space usage?
From: Jim Keniston <jkenisto@linux.vnet.ibm.com>
To: Grant Edwards <grant.b.edwards@gmail.com>
Cc: systemtap@sources.redhat.com
In-Reply-To: <ian9ci$fuc$1@dough.gmane.org>
References: <iaf9eu$jvj$1@dough.gmane.org> <y0m1v76ib0i.fsf@fche.csb>	 <iamjl4$qrh$1@dough.gmane.org> <y0md3qpgdbj.fsf@fche.csb>	 <ian9ci$fuc$1@dough.gmane.org>
Content-Type: multipart/mixed; boundary="=-ek42n1nDxWnLbcrJpVXc"
Date: Tue, 02 Nov 2010 19:25:00 -0000
Message-ID: <1288725929.3545.21.camel@localhost>
Mime-Version: 1.0
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <systemtap.sourceware.org>
List-Subscribe: <mailto:systemtap-subscribe@sourceware.org>
List-Post: <mailto:systemtap@sourceware.org>
List-Help: <mailto:systemtap-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2010-q4/txt/msg00155.txt.bz2


--=-ek42n1nDxWnLbcrJpVXc
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Content-length: 1495

On Mon, 2010-11-01 at 20:53 +0000, Grant Edwards wrote:
> On 2010-11-01, Frank Ch. Eigler <fche@redhat.com> wrote:
...
> >
> > Another option is to go ahead and try to port uprobes, leave
> > ARM/utrace to us / fedora people.  When/if the newer lkml-track
> > uprobes gets merged, the hypothetical ARM port could go into the main
> > kernel that way, bypassing the utrace kerfuffle.  IOW, doing an ARM
> > port of the current systemtap-resident uprobes would not be a wasted
> > effort, if LKML gets its act together and merges the other one.
> 
> OK, thanks.  Can anybody provide a guess as to how much porting needs
> to done (assuming a competent kernel-mode programmer who knows nothing
> about the tracing stuff)?
> 

Attached is the uprobes porting guide, updated to reflect SystemTap's
version of uprobes.  Most of the functions and macros you'd need to
provide are very simple, assuming you know how ARM's "breakpoint"
instruction works.  You can get most of this info from the kprobes code.

It looks like the ARM version of kprobes emulates instructions rather
than single-stepping them.  So the bulk of your work will be:
a) deciding which instructions you'll allow users to probe; and
b) making the emulation code (which was presumably written for kernel
instructions) work for user-space instructions.

Aside from (a) and (b) above, the uprobes port should be pretty much
paint-by-numbers coding, and then (once the utrace port gets done)
testing and debugging.

Jim Keniston

--=-ek42n1nDxWnLbcrJpVXc
Content-Disposition: attachment; filename="port_stap_uprobes.txt"
Content-Type: text/plain; name="port_stap_uprobes.txt"; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Content-length: 11104

0. TERMINOLOGY
==============

myarch: The architecture you're porting to.

regs->ip: On myarch, the member of the pt_regs structure that contains the
instruction pointer.

1. BASIC PORT
=============
The basic port gives you uprobes, using single-stepping inline (SSIL).
To add support for return probes (uretprobes), see "PORTING URETPROBES".
To add single-stepping out of line (SSOL), see "PORTING SSOL".  SSOL
ensures that no probepoints are missed even in multithreaded apps.
It also typically yields better performance.

On some architectures, emulating some or all instructions may be
preferable to using SSIL or SSOL.  For example, emulation avoids
the cost of the second (single-step) trap for each probepoint hit.
See "EMULATING INSTRUCTIONS".

runtime/uprobes2/uprobes_myarch.h
---------------------------------
Create this file, defining the following:

typedef ___ uprobe_opcode_t;
An integer type that is the appropriate size for holding myarch's
breakpoint instruction -- e.g., u8 for i386's 1-byte int3 instruction.

#define BREAKPOINT_INSTRUCTION 0x___
myarch's breakpoint instruction -- e.g., 0xcc for i386's int3
instruction.

#define BP_INSN_SIZE ___
sizeof(uprobe_opcode_t) -- e.g., 1 for i386's int3

#define MAX_UINSN_BYTES ___
The number of bytes in the longest possible (probeable) instruction
in myarch's instruction set.  Round up, if necessary, to an integral
multiple of BP_INSN_SIZE.  E.g., 16 for i386 and x86_64, where the
longest possible instruction is actually 15 bytes.

Note: You do NOT need to define the SLOT_IP macro, which is defined
in some other architectures' headers.

#define BREAKPOINT_SIGNAL ___
The signal generated when myarch's breakpoint instruction is
executed.  Typically SIGTRAP.

#define SSTEP_SIGNAL ___
The signal generated when an instruction is single-stepped.  Typically
SIGTRAP.

#define ARCH_BP_INST_PTR(inst_ptr) ___
A breakpoint has just been hit.  inst_ptr is the (unsigned long)
current value of the instruction pointer.  Return the (unsigned
long) address of the probepoint that was hit.  On i386 and x86_64,
the instruction pointer is AFTER the breakpoint instruction at this
point, so ARCH_BP_INST_PTR returns inst_ptr-BP_INSN_SIZE.  On other
architectures, the instruction pointer is AT the breakpoint, so
ARCH_BP_INST_PTR returns inst_ptr.

[typically static inline]
unsigned long arch_get_probept(struct pt_regs *regs);
A breakpoint has just been hit.  regs contains the saved registers.
Return the address of the probepoint.  This function typically returns
the equivalent of ARCH_BP_INST_PTR(regs->ip).
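
As a rough illustration, the definitions described so far might look
like the following for an x86-like architecture.  The opcode type,
breakpoint value, and sizes are the i386 examples given above.  The
cut-down struct pt_regs is only a user-space stand-in (an assumption,
not the kernel's structure) so that the sketch compiles outside the
kernel.

```c
#include <assert.h>
#include <signal.h>
#include <stdint.h>

/* Stand-in for the kernel's struct pt_regs (assumption: only ip matters here). */
struct pt_regs {
	unsigned long ip;
};

typedef uint8_t uprobe_opcode_t;	/* kernel code would spell this u8 */
#define BREAKPOINT_INSTRUCTION	0xcc	/* i386 int3 */
#define BP_INSN_SIZE		1
#define MAX_UINSN_BYTES		16	/* longest insn is 15 bytes, rounded up */
#define BREAKPOINT_SIGNAL	SIGTRAP
#define SSTEP_SIGNAL		SIGTRAP

/* BP_INSN_SIZE is documented as sizeof(uprobe_opcode_t); check it at build time. */
static_assert(sizeof(uprobe_opcode_t) == BP_INSN_SIZE,
	      "BP_INSN_SIZE must match uprobe_opcode_t");

/* int3 leaves the instruction pointer just AFTER the breakpoint. */
#define ARCH_BP_INST_PTR(inst_ptr)	((inst_ptr) - BP_INSN_SIZE)

static inline unsigned long arch_get_probept(struct pt_regs *regs)
{
	return ARCH_BP_INST_PTR(regs->ip);
}
```

On an architecture that leaves the instruction pointer AT the
breakpoint, ARCH_BP_INST_PTR(inst_ptr) would simply expand to
(inst_ptr).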

[typically static inline]
int uprobe_emulate_insn(struct pt_regs *regs, struct uprobe_probept *ppt);
If you don't emulate any instructions, this function just returns 0.
Otherwise, see "EMULATING INSTRUCTIONS."

[typically static inline]
void arch_reset_ip_for_sstep(struct pt_regs *regs);
A breakpoint has just been hit.  regs contains the saved registers.
Make sure regs->ip points to the probepoint.  This function typically
does the equivalent of
	regs->ip = ARCH_BP_INST_PTR(regs->ip);
which is a nop on architectures where the breakpoint instruction leaves
the instruction pointer at the breakpoint.

struct uprobe_probept_arch_info { ... };
myarch-specific data for a probepoint.  Can be used, e.g., to remember
info about how the probed instruction should be single-stepped (see
"PORTING SSOL").  Typically empty.

struct uprobe_task_arch_info { ... };
myarch-specific data for a probed task.  Can be used, e.g., to pass
data from uprobe_pre_ssout() to uprobe_post_ssout() during probepoint
processing (see "PORTING SSOL").  Typically empty.

[extern or static]
int arch_validate_probed_insn(struct uprobe_probept *ppt,
					struct task_struct *tsk);
ppt->insn[] contains a copy of the instruction to be probed.
ppt->vaddr contains its address.  arch_validate_probed_insn()
returns 0 if you support probing that instruction at that address,
or a negative errno (e.g., -EINVAL or -EPERM) otherwise.  The caller
will reject requests to probe BREAKPOINT_INSTRUCTION, or to probe
addresses outside executable VM areas (see uprobe_validate_vaddr());
for these cases, arch_validate_probed_insn() won't be called.

arch_validate_probed_insn() is your opportunity to collect and
remember (e.g., in ppt->arch_info) information about the probed
instruction (see "PORTING SSOL" and "EMULATING INSTRUCTIONS").
Note that in at least one case (x86_64 rip-relative instructions),
the instruction actually single-stepped is a modified version of the
probed instruction.  For such cases, you actually modify ppt->insn[]
in arch_validate_probed_insn().  (If you need to remember the original
instruction, save a copy in ppt->arch_info.)
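
Here is a minimal sketch of what arch_validate_probed_insn() could
look like.  Everything about the instruction set is made up for
illustration: a fixed-width 32-bit encoding with the opcode in the
top 8 bits, and the opcode values OPC_PRIV and OPC_BRANCH.  The
structures are user-space stand-ins containing only the members
mentioned above, and the is_branch member of the arch_info is likewise
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint32_t uprobe_opcode_t;
#define BP_INSN_SIZE	4
#define MAX_UINSN_BYTES	4
static_assert(MAX_UINSN_BYTES % BP_INSN_SIZE == 0,
	      "MAX_UINSN_BYTES must be a multiple of BP_INSN_SIZE");

/* Hypothetical opcodes, invented for this sketch. */
#define OPC_BRANCH	0x10u
#define OPC_PRIV	0xffu

/* Stand-ins for the uprobes structures (assumed layout). */
struct uprobe_probept_arch_info {
	int is_branch;		/* hypothetical: remembered for later steps */
};
struct uprobe_probept {
	unsigned long vaddr;
	uprobe_opcode_t insn[MAX_UINSN_BYTES / BP_INSN_SIZE];
	struct uprobe_probept_arch_info arch_info;
};
struct task_struct;		/* opaque in this sketch */

static int arch_validate_probed_insn(struct uprobe_probept *ppt,
					struct task_struct *tsk)
{
	uint32_t opcode = ppt->insn[0] >> 24;

	(void)tsk;		/* unused in this sketch */

	/* Fixed-width instructions must be naturally aligned. */
	if (ppt->vaddr % BP_INSN_SIZE != 0)
		return -EINVAL;

	/* Refuse to probe the (hypothetical) privileged opcode. */
	if (opcode == OPC_PRIV)
		return -EPERM;

	/* Remember what later processing will need to know. */
	ppt->arch_info.is_branch = (opcode == OPC_BRANCH);
	return 0;
}
```

A real port would classify every instruction it is willing to probe,
most likely borrowing the decoding logic from myarch's kprobes code.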

runtime/uprobes2/uprobes_myarch.c
---------------------------------
Create this file, implementing any functions that you declared extern
in runtime/uprobes2/uprobes_myarch.h.  Be sure to do
#define UPROBES_IMPLEMENTATION 1
before
#include <linux/uprobes.h>
On some architectures, everything mentioned so far can be implemented
as static inlines in runtime/uprobes2/uprobes_myarch.h, so this .c file
may be essentially empty (which is OK).  But the build scheme requires
this file, and you'll definitely need it when you port SSOL.

runtime/uprobes2/uprobes_arch.c
runtime/uprobes2/uprobes_arch.h
-------------------------------
Modify these files to #include your uprobes_myarch.h and uprobes_myarch.c
files.

2. PORTING URETPROBES
=====================
This step adds myarch support for register_uretprobe() and
unregister_uretprobe().  This should be pretty easy.

runtime/uprobes2/uprobes_myarch.h
---------------------------------
Add the following:

#define CONFIG_URETPROBES 1

[extern or static]
unsigned long arch_hijack_uret_addr(unsigned long trampoline_addr,
		struct pt_regs *regs, struct uprobe_task *utask);
We have just hit the probepoint at the entry to a uretprobed
function.  Remember the real* return address, replace it with
trampoline_addr, and return the real* return address.

If for some reason you can't replace the return address, return 0.
If you somehow leave the return address in a corrupted state,
also set utask->doomed = 1.

*This is called for every uretprobe registered on the probed
function.  If there's more than one, only the first call to
arch_hijack_uret_addr() will return the real return address; the rest
will return trampoline_addr.  That's the desired behavior.
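
The following sketch shows the simplest case, an architecture that
passes the return address in a register.  The lr member of the
stand-in struct pt_regs is a hypothetical link register, and both
structures are cut down to what the sketch needs.  An architecture
that keeps the return address on the user stack would instead have to
read and write user memory, which is where the "return 0" and
utask->doomed cases come from.

```c
#include <assert.h>

/* Stand-ins for the kernel structures (assumed layout). */
struct pt_regs {
	unsigned long ip;
	unsigned long lr;	/* hypothetical link register */
};
struct uprobe_task {
	int doomed;
};

static unsigned long arch_hijack_uret_addr(unsigned long trampoline_addr,
		struct pt_regs *regs, struct uprobe_task *utask)
{
	unsigned long orig_ret_addr = regs->lr;

	/* 0 is the failure return value, so the trampoline must not be 0. */
	assert(trampoline_addr != 0);

	/* A register write cannot fail, so utask->doomed is never set here. */
	(void)utask;

	regs->lr = trampoline_addr;
	return orig_ret_addr;
}
```

Calling this twice for the same function entry returns the real
return address the first time and trampoline_addr the second time,
which matches the behavior described in the footnote above.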

[typically static inline]
void arch_restore_uret_addr(unsigned long ret_addr, struct pt_regs *regs);
Called after the uretprobed function executes its return and all
associated uretprobe handlers have been run.  Set regs->ip = ret_addr.

[typically static inline]
unsigned long arch_get_cur_sp(struct pt_regs *regs);
Returns the current value of the user-mode stack pointer, as obtained
from regs.

[typically static inline]
unsigned long arch_predict_sp_at_ret(struct pt_regs *regs,
						struct task_struct *tsk);
Called right after arch_hijack_uret_addr() (see above) -- i.e., right
after the probed function has been called.  Returns the expected value
of the user-mode stack pointer after this function's return instruction
has been executed.  If your architecture passes the return address
in a register, then this function typically returns the current value
of the stack pointer.  On the other hand, if your architecture passes
the return address on the stack, then this function typically returns
the current value of the stack pointer plus the size of the return
address (to reflect the fact that the return address is on the stack
when the function is called, but is popped off as it returns).
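
The two cases can be sketched as follows.  A real port defines a
single arch_predict_sp_at_ret(); the two variant names below are made
up so that both cases fit in one example.  The sp member of the
stand-in struct pt_regs is an assumption, and the stack-based variant
assumes the return address is one native word, which would not hold
for, say, a 32-bit task on a 64-bit kernel (presumably one reason tsk
is passed in).

```c
#include <assert.h>

/* Stand-in for the kernel's struct pt_regs (assumed layout). */
struct pt_regs {
	unsigned long sp;
};
struct task_struct;		/* opaque in this sketch */

/* This sketch assumes a return address is one pointer-sized word. */
static_assert(sizeof(unsigned long) == sizeof(void *),
	      "return address assumed to be pointer-sized");

static inline unsigned long arch_get_cur_sp(struct pt_regs *regs)
{
	return regs->sp;
}

/* Variant A (hypothetical name): return address passed in a register. */
static inline unsigned long predict_sp_at_ret_linkreg(struct pt_regs *regs,
						struct task_struct *tsk)
{
	(void)tsk;
	return arch_get_cur_sp(regs);
}

/* Variant B (hypothetical name): return address passed on the stack. */
static inline unsigned long predict_sp_at_ret_stack(struct pt_regs *regs,
						struct task_struct *tsk)
{
	(void)tsk;
	/* The return instruction pops the return address off the stack. */
	return arch_get_cur_sp(regs) + sizeof(unsigned long);
}
```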

runtime/uprobes2/uprobes_myarch.c
---------------------------------
Add any of the above-described functions that aren't implemented in
runtime/uprobes2/uprobes_myarch.h.

3. PORTING SSOL
===============
This step adds support for single-stepping out of line (SSOL).
This step requires pretty extensive knowledge of myarch's
instruction set.  Fortunately, if there's a kprobes port for
myarch, most of the thinking has already been done for you.

As previously mentioned, SSOL ensures that no probepoints are missed
even in multithreaded apps, and typically yields better performance.
The basic idea is that we need to leave the breakpoint instruction
in place at all times (to avoid probepoint misses), and so must
single-step a copy of the probed instruction.  Uprobes puts the
instruction-copy in one of a set of "instruction slots" allocated
from a special VM area.

runtime/uprobes2/uprobes_myarch.h
---------------------------------
Add the following line:
#define CONFIG_UPROBES_SSOL 1

runtime/uprobes2/uprobes_myarch.c
---------------------------------
Add the following functions:

void uprobe_pre_ssout(struct uprobe_task *utask,
		struct uprobe_probept *ppt, struct pt_regs *regs);
Called when the indicated task is about to single-step the instruction
at the indicated probepoint.  Call uprobe_get_insn_slot() to ensure
that there's an instruction slot in the SSOL vma reserved for this
probepoint, and that the slot contains the instruction-copy to be
single-stepped.  If uprobe_get_insn_slot() returns NULL, it means
uprobes couldn't populate the instruction slot; just set utask->doomed
= 1.  Otherwise set regs->ip and utask->singlestep_addr to the address
of the instruction slot.  Perform any myarch-specific pre-single-step
work (typically none; x86_64 is an exception).

void uprobe_post_ssout(struct uprobe_task *utask,
		struct uprobe_probept *ppt, struct pt_regs *regs);
Called after the instruction copy has been single-stepped.
Call up_read(&ppt->slot->rwsem) to release the instruction slot.
Perform any myarch-specific fixups required due to the fact that
we single-stepped the instruction copy at utask->singlestep_addr
rather than the original instruction at ppt->vaddr.

For most instructions, this just means adjusting regs->ip so that it
points back to the next instruction in the probed instruction stream.
Typical exceptions are return instructions and absolute or indirect
call/jump instructions, for which no regs->ip adjustment is necessary.
Also for call instructions, you typically need to adjust the return
address.

If myarch has a kprobes port, you can model uprobe_post_ssout()
after resume_execution() in arch/myarch/kernel/kprobes.c.
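
The common-case regs->ip adjustment can be sketched like this.  The
helper name fixup_ip_after_ssout() is made up for illustration, and
the structures are stand-ins containing only the members named above.
A real uprobe_post_ssout() would also release the instruction slot,
skip this adjustment for return and absolute or indirect call/jump
instructions, and fix up the return address for calls.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel and uprobes structures (assumed layout). */
struct pt_regs {
	unsigned long ip;
};
struct uprobe_task {
	unsigned long singlestep_addr;	/* address of the instruction slot */
};
struct uprobe_probept {
	unsigned long vaddr;		/* address of the probed instruction */
};

/* Hypothetical helper: the ip fixup for "most instructions". */
static void fixup_ip_after_ssout(struct uprobe_task *utask,
		struct uprobe_probept *ppt, struct pt_regs *regs)
{
	unsigned long advance;

	assert(utask != NULL && ppt != NULL && regs != NULL);

	/*
	 * How far ip moved while the copy executed in the slot.  Unsigned
	 * arithmetic wraps, so a backward relative jump works as well.
	 */
	advance = regs->ip - utask->singlestep_addr;

	/* Apply the same movement relative to the original instruction. */
	regs->ip = ppt->vaddr + advance;
}
```

For example, if a 3-byte instruction probed at 0x401000 is
single-stepped in a slot at 0x7f0000, regs->ip is 0x7f0003 afterward
and the fixup sets it to 0x401003.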

4. EMULATING INSTRUCTIONS
=========================
If you emulate one or more instructions, then your uprobe_emulate_insn()
function must do more than just return 0.

int uprobe_emulate_insn(struct pt_regs *regs, struct uprobe_probept *ppt);
Called after a probepoint has been hit and the associated handlers have
been run.  If you're not emulating the instruction specified by ppt,
simply return 0.  Otherwise, this function must:

- Perform the action of the probed instruction.  Keep in mind that the
instruction ordinarily runs in user space, so you must appropriately
handle stray memory references, illegal and privileged instructions,
malevolent code, etc.

- Point regs->ip at the next instruction to be executed.

- Return 1.

Instructions that can be neither emulated nor single-stepped should
be rejected by arch_validate_probed_insn() (see above).
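
As a sketch of the steps listed above, here is an
uprobe_emulate_insn() that emulates a single instruction.  The
instruction set is invented for illustration: fixed-width 32-bit
instructions, with a hypothetical OPC_BRANCH opcode in the top 8 bits
and a signed 24-bit word offset, relative to the next instruction, in
the low 24 bits.  The structures are user-space stand-ins.  A branch
touches no user memory, so this example sidesteps the hard part
(stray memory references and the like) that a real port must handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INSN_BYTES	4
#define OPC_BRANCH	0x10u	/* hypothetical opcode */

/* Stand-ins for the kernel and uprobes structures (assumed layout). */
struct pt_regs {
	unsigned long ip;
};
struct uprobe_probept {
	unsigned long vaddr;
	uint32_t insn[1];
};

static int uprobe_emulate_insn(struct pt_regs *regs,
					struct uprobe_probept *ppt)
{
	uint32_t insn, imm;
	long offset;

	assert(regs != NULL && ppt != NULL);

	insn = ppt->insn[0];
	if ((insn >> 24) != OPC_BRANCH)
		return 0;	/* not emulated: caller single-steps it */

	/* Sign-extend the 24-bit word offset without shifting signed values. */
	imm = insn & 0x00ffffffu;
	offset = (imm & 0x00800000u) ? (long)imm - 0x01000000L : (long)imm;

	/* Perform the branch: target is relative to the next instruction. */
	regs->ip = ppt->vaddr + INSN_BYTES
			+ (unsigned long)(offset * INSN_BYTES);
	return 1;
}
```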

--=-ek42n1nDxWnLbcrJpVXc--