From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: documentation for user-space usage?
From: Jim Keniston
To: Grant Edwards
Cc: systemtap@sources.redhat.com
Content-Type: multipart/mixed; boundary="=-ek42n1nDxWnLbcrJpVXc"
Date: Tue, 02 Nov 2010 19:25:00 -0000
Message-ID: <1288725929.3545.21.camel@localhost>
Mime-Version: 1.0
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2010-q4/txt/msg00155.txt.bz2

--=-ek42n1nDxWnLbcrJpVXc
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Content-length: 1495

On Mon, 2010-11-01 at 20:53 +0000, Grant Edwards wrote:
> On 2010-11-01, Frank Ch. Eigler wrote:
...
> >
> > Another option is to go ahead and try to port uprobes, leave
> > ARM/utrace to us / fedora people. When/if the newer lkml-track
> > uprobes gets merged, the hypothetical ARM port could go into the main
> > kernel that way, bypassing the utrace kerfuffle. IOW, doing an ARM
> > port of the current systemtap-resident uprobes would not be a wasted
> > effort, if LKML gets its act together and merges the other one.
>
> OK, thanks. Can anybody provide a guess as to how much porting needs
> to done (assuming a competent kernel-mode programmer who knows nothing
> about the tracing stuff)?
>

Attached is the uprobes porting guide, updated to reflect SystemTap's
version of uprobes. Most of the functions and macros you'd need to
provide are very simple, assuming you know about how ARM's "breakpoint"
instruction works. You can get most of this info from the kprobes code.

It looks like the ARM version of kprobes emulates instructions rather
than single-stepping them. So the bulk of your work will be:
a) deciding which instructions you'll allow users to probe; and
b) making the emulation code (which was presumably written for kernel
instructions) work for user-space instructions.

Aside from (a) and (b) above, the uprobes port should be pretty much
paint-by-numbers coding, and then (once the utrace port gets done)
testing and debugging.

Jim Keniston

--=-ek42n1nDxWnLbcrJpVXc
Content-Disposition: attachment; filename="port_stap_uprobes.txt"
Content-Type: text/plain; name="port_stap_uprobes.txt"; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Content-length: 11104

0. TERMINOLOGY
==============
myarch: The architecture you're porting to.
regs->ip: On myarch, the member of the pt_regs structure that contains
the instruction pointer.

1. BASIC PORT
=============
The basic port gives you uprobes, using single-stepping inline (SSIL).
To add support for return probes (uretprobes), see "PORTING URETPROBES".
To add single-stepping out of line (SSOL), see "PORTING SSOL".
SSOL ensures that no probepoints are missed even in multithreaded apps.
It also typically yields better performance.

On some architectures, emulating some or all instructions may be
preferable to using SSIL or SSOL. For example, emulation avoids the
cost of the second (single-step) trap for each probepoint hit. See
"EMULATING INSTRUCTIONS".

runtime/uprobes2/uprobes_myarch.h
---------------------------------
Create this file, defining the following:

typedef ___ uprobe_opcode_t;
    An integer type that is the appropriate size for holding myarch's
    breakpoint instruction -- e.g., u8 for i386's 1-byte int3
    instruction.

#define BREAKPOINT_INSTRUCTION 0x___
    myarch's breakpoint instruction -- e.g., 0xcc for i386's int3
    instruction.

#define BP_INSN_SIZE ___
    sizeof(uprobe_opcode_t) -- e.g., 1 for i386's int3.

#define MAX_UINSN_BYTES ___
    The number of bytes in the longest possible (probeable) instruction
    in myarch's instruction set. Round up, if necessary, to an integral
    multiple of BP_INSN_SIZE. E.g., 16 for i386 and x86_64, where the
    longest possible instruction is actually 15 bytes.

    Note: You do NOT need to define the SLOT_IP macro, which is defined
    in some other architectures' headers.

#define BREAKPOINT_SIGNAL ___
    The signal generated when myarch's breakpoint instruction is
    executed. Typically SIGTRAP.

#define SSTEP_SIGNAL ___
    The signal generated when an instruction is single-stepped.
    Typically SIGTRAP.

#define ARCH_BP_INST_PTR(inst_ptr) ___
    A breakpoint has just been hit. inst_ptr is the (unsigned long)
    current value of the instruction pointer. Return the (unsigned long)
    address of the probepoint that was hit. On i386 and x86_64, the
    instruction pointer is AFTER the breakpoint instruction at this
    point, so ARCH_BP_INST_PTR returns inst_ptr-BP_INSN_SIZE. On other
    architectures, the instruction pointer is AT the breakpoint, so
    ARCH_BP_INST_PTR returns inst_ptr.

[typically static inline]
unsigned long arch_get_probept(struct pt_regs *regs);
    A breakpoint has just been hit.
    regs contains the saved registers. Return the address of the
    probepoint. This function typically returns the equivalent of
    ARCH_BP_INST_PTR(regs->ip).

[typically static inline]
int uprobe_emulate_insn(struct pt_regs *regs, struct uprobe_probept *ppt);
    If you don't emulate any instructions, this function just returns 0.
    Otherwise, see "EMULATING INSTRUCTIONS."

[typically static inline]
void arch_reset_ip_for_sstep(struct pt_regs *regs);
    A breakpoint has just been hit. regs contains the saved registers.
    Make sure regs->ip points to the probepoint. This function typically
    does the equivalent of
        regs->ip = ARCH_BP_INST_PTR(regs->ip);
    which is a nop on architectures where the breakpoint instruction
    leaves the instruction pointer at the breakpoint.

struct uprobe_probept_arch_info { ... };
    myarch-specific data for a probepoint. Can be used, e.g., to
    remember info about how the probed instruction should be
    single-stepped (see "PORTING SSOL"). Typically empty.

struct uprobe_task_arch_info { ... };
    myarch-specific data for a probed task. Can be used, e.g., to pass
    data from uprobe_pre_ssout() to uprobe_post_ssout() during
    probepoint processing (see "PORTING SSOL"). Typically empty.

[extern or static]
int arch_validate_probed_insn(struct uprobe_probept *ppt,
        struct task_struct *tsk);
    ppt->insn[] contains a copy of the instruction to be probed.
    ppt->vaddr contains its address. arch_validate_probed_insn()
    returns 0 if you support probing that instruction at that address,
    or a negative errno (e.g., -EINVAL or -EPERM) otherwise.

    The caller will reject requests to probe BREAKPOINT_INSTRUCTION, or
    to probe addresses outside executable VM areas (see
    uprobe_validate_vaddr()); for these cases,
    arch_validate_probed_insn() won't be called.

    arch_validate_probed_insn() is your opportunity to collect and
    remember (e.g., in ppt->arch_info) information about the probed
    instruction (see "PORTING SSOL" and "EMULATING INSTRUCTIONS").
    Note that in at least one case (x86_64 rip-relative instructions),
    the instruction actually single-stepped is a modified version of
    the probed instruction. For such cases, you actually modify
    ppt->insn[] in arch_validate_probed_insn(). (If you need to
    remember the original instruction, save a copy in ppt->arch_info.)

runtime/uprobes2/uprobes_myarch.c
---------------------------------
Create this file, implementing any functions that you declared extern
in runtime/uprobes2/uprobes_myarch.h. Be sure to do
    #define UPROBES_IMPLEMENTATION 1
before
    #include

On some architectures, everything mentioned so far can be implemented
as static inlines in runtime/uprobes2/uprobes_myarch.h, so this .c file
may be essentially empty (which is OK). But the build scheme requires
this file, and you'll definitely need it when you port SSOL.

runtime/uprobes2/uprobes_arch.c
runtime/uprobes2/uprobes_arch.h
-------------------------------
Modify these files to #include your uprobes_myarch.h and
uprobes_myarch.c files.

2. PORTING URETPROBES
=====================
This step adds myarch support for register_uretprobe() and
unregister_uretprobe(). This should be pretty easy.

runtime/uprobes2/uprobes_myarch.h
---------------------------------
Add the following:

#define CONFIG_URETPROBES 1

[extern or static]
unsigned long arch_hijack_uret_addr(unsigned long trampoline_addr,
        struct pt_regs *regs, struct uprobe_task *utask);
    We have just hit the probepoint at the entry to a uretprobed
    function. Remember the real* return address, replace it with
    trampoline_addr, and return the real* return address. If for some
    reason you can't replace the return address, return 0. If you
    somehow leave the return address in a corrupted state, also set
    utask->doomed = 1.

    *This is called for every uretprobe registered on the probed
    function. If there's more than one, only the first call to
    arch_hijack_uret_addr() will return the real return address; the
    rest will return trampoline_addr. That's the desired behavior.
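
As a rough illustration of the hijack step, here is a minimal user-space
sketch for an architecture that, like i386, passes the return address on
the stack. The cut-down struct pt_regs and struct uprobe_task, the
simulated stack, and the read_user_word(), write_user_word() and
sim_call() helpers are all hypothetical stand-ins so the sketch compiles
outside the kernel; a real port would use the kernel's own types and
user-memory access routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the kernel types named above. */
struct pt_regs { unsigned long ip; unsigned long sp; };
struct uprobe_task { int doomed; };

/* Simulated user stack (an assumption, so the sketch runs in user space). */
#define SIM_STACK_BASE  0x1000UL
#define SIM_STACK_WORDS 16
static unsigned long sim_stack[SIM_STACK_WORDS];

/* Map a simulated user address to its stack word, or NULL if invalid. */
static unsigned long *sim_slot(unsigned long uaddr)
{
    unsigned long off;

    if (uaddr < SIM_STACK_BASE)
        return NULL;
    off = uaddr - SIM_STACK_BASE;
    if (off % sizeof(unsigned long) != 0 ||
        off / sizeof(unsigned long) >= SIM_STACK_WORDS)
        return NULL;
    return &sim_stack[off / sizeof(unsigned long)];
}

/* Hypothetical user-memory accessors: 0 on success, -1 on a bad address. */
static int read_user_word(unsigned long uaddr, unsigned long *val)
{
    unsigned long *slot = sim_slot(uaddr);

    if (!slot)
        return -1;
    *val = *slot;
    return 0;
}

static int write_user_word(unsigned long uaddr, unsigned long val)
{
    unsigned long *slot = sim_slot(uaddr);

    if (!slot)
        return -1;
    *slot = val;
    return 0;
}

/* Simulate a call instruction: push ret_addr, then jump to func_addr. */
static void sim_call(struct pt_regs *regs, unsigned long func_addr,
                     unsigned long ret_addr)
{
    int rc;

    regs->sp -= sizeof(unsigned long);
    rc = write_user_word(regs->sp, ret_addr);
    assert(rc == 0);
    (void)rc;
    regs->ip = func_addr;
}

unsigned long arch_hijack_uret_addr(unsigned long trampoline_addr,
        struct pt_regs *regs, struct uprobe_task *utask)
{
    unsigned long orig_ret_addr;

    /*
     * At function entry the return address is the word at the stack top.
     * If this is a repeat call for the same hit, the word read here is
     * already trampoline_addr, and that is what gets returned.
     */
    if (read_user_word(regs->sp, &orig_ret_addr) != 0)
        return 0;
    /*
     * Replace it with the trampoline. This simulated write is
     * all-or-nothing, so a failure leaves the stack intact; a port whose
     * write could fail partway would also set utask->doomed = 1.
     */
    if (write_user_word(regs->sp, trampoline_addr) != 0)
        return 0;
    (void)utask;
    return orig_ret_addr;
}
```

On an architecture that passes the return address in a register, the
same shape would presumably apply, with the read and write going to that
register in regs rather than to user memory.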

[typically static inline]
void arch_restore_uret_addr(unsigned long ret_addr, struct pt_regs *regs);
    Called after the uretprobed function executes its return and all
    associated uretprobe handlers have been run. Set regs->ip = ret_addr.

[typically static inline]
unsigned long arch_get_cur_sp(struct pt_regs *regs);
    Returns the current value of the user-mode stack pointer, as
    obtained from regs.

[typically static inline]
unsigned long arch_predict_sp_at_ret(struct pt_regs *regs,
        struct task_struct *tsk);
    Called right after arch_hijack_uret_addr() (see above) -- i.e.,
    right after the probed function has been called. Returns the
    expected value of the user-mode stack pointer after this function's
    return instruction has been executed.

    If your architecture passes the return address in a register, then
    this function typically returns the current value of the stack
    pointer. On the other hand, if your architecture passes the return
    address on the stack, then this function typically returns the
    current value of the stack pointer plus the size of the return
    address (to reflect the fact that the return address is on the
    stack when the function is called, but is popped off as it returns).

runtime/uprobes2/uprobes_myarch.c
---------------------------------
Add any of the above-described functions that aren't implemented in
runtime/uprobes2/uprobes_myarch.h.

3. PORTING SSOL
===============
This step adds support for single-stepping out of line (SSOL). This
step requires pretty extensive knowledge of myarch's instruction set.
Fortunately, if there's a kprobes port for myarch, most of the thinking
has already been done for you.

As previously mentioned, SSOL ensures that no probepoints are missed
even in multithreaded apps, and typically yields better performance.
The basic idea is that we need to leave the breakpoint instruction in
place at all times (to avoid probepoint misses), and so must
single-step a copy of the probed instruction.
Uprobes puts the instruction-copy in one of a set of "instruction
slots" allocated from a special VM area.

runtime/uprobes2/uprobes_myarch.h
---------------------------------
Add the following line:

#define CONFIG_UPROBES_SSOL 1

runtime/uprobes2/uprobes_myarch.c
---------------------------------
Add the following functions:

void uprobe_pre_ssout(struct uprobe_task *utask,
        struct uprobe_probept *ppt, struct pt_regs *regs);
    Called when the indicated task is about to single-step the
    instruction at the indicated probepoint. Call
    uprobe_get_insn_slot() to ensure that there's an instruction slot
    in the SSOL vma reserved for this probepoint, and that the slot
    contains the instruction-copy to be single-stepped. If
    uprobe_get_insn_slot() returns NULL, it means uprobes couldn't
    populate the instruction slot; just set utask->doomed = 1.
    Otherwise set regs->ip and utask->singlestep_addr to the address of
    the instruction slot. Perform any myarch-specific pre-single-step
    work (typically none; x86_64 is an exception).

void uprobe_post_ssout(struct uprobe_task *utask,
        struct uprobe_probept *ppt, struct pt_regs *regs);
    Called after the instruction copy has been single-stepped. Call
    up_read(&ppt->slot->rwsem) to release the instruction slot. Perform
    any myarch-specific fixups required due to the fact that we
    single-stepped the instruction copy at utask->singlestep_addr
    rather than the original instruction at ppt->vaddr.

    For most instructions, this just means adjusting regs->ip so that
    it points back to the next instruction in the probed instruction
    stream. Typical exceptions are return instructions and absolute or
    indirect call/jump instructions, for which no regs->ip adjustment
    is necessary. Also for call instructions, you typically need to
    adjust the return address.

    If myarch has a kprobes port, you can model uprobe_post_ssout()
    after resume_execution() in arch/myarch/kernel/kprobes.c.

4. EMULATING INSTRUCTIONS
=========================
If you emulate one or more instructions, then your
uprobe_emulate_insn() function must do more than just return 0.

int uprobe_emulate_insn(struct pt_regs *regs, struct uprobe_probept *ppt);
    Called after a probepoint has been hit and the associated handlers
    have been run. If you're not emulating the instruction specified
    by ppt, simply return 0. Otherwise, this function must:
    - Perform the action of the probed instruction. Keep in mind that
      the instruction ordinarily runs in user space, so you must
      appropriately handle stray memory references, illegal and
      privileged instructions, malevolent code, etc.
    - Point regs->ip at the next instruction to be executed.
    - Return 1.

Instructions that can be neither emulated nor single-stepped should be
rejected by arch_validate_probed_insn() (see above).

--=-ek42n1nDxWnLbcrJpVXc--