From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 9043 invoked by alias); 20 May 2011 03:24:03 -0000
Received: (qmail 9019 invoked by uid 22791); 20 May 2011 03:24:01 -0000
X-SWARE-Spam-Status: No, hits=-6.1 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_HI,SPF_HELO_PASS,TW_DW,TW_FL,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 20 May 2011 03:23:40 +0000
Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p4K3NNJv030521 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 19 May 2011 23:23:23 -0400
Received: from [10.3.113.28] (ovpn-113-28.phx2.redhat.com [10.3.113.28]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id p4K3NMPX011692; Thu, 19 May 2011 23:23:22 -0400
Message-ID: <4DD5DEAA.3050908@redhat.com>
Date: Fri, 20 May 2011 03:24:00 -0000
From: Josh Stone
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc14 Lightning/1.0b3pre Thunderbird/3.1.10
MIME-Version: 1.0
To: systemtap@sourceware.org, Srikar Dronamraju
Subject: Initial stap support for inode-based uprobes
Content-Type: multipart/mixed; boundary="------------050205080504000300020306"
X-IsSubscribed: yes
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2011-q2/txt/msg00206.txt.bz2

This is a multi-part message in MIME format.
--------------050205080504000300020306
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-length: 2550

The attached patch implements initial support for SystemTap to use
Srikar's inode-based uprobes.
It is also published in the branch jistone/inode-uprobes, in gitweb here:

The uprobes branch I worked from is here:

The good news is that the basics appear to be working well.  I've
tested probing stap itself and libdw, and got the expected probe hits.
I'd appreciate any review of my implementation so far.

Beyond these working basics, there are a lot of details to hammer out,
so here's the list of what I know.

* EXPORT_SYMBOL_GPL, or uprobes' lack thereof.  Without kernel exports,
the whole API will be inaccessible to us.

* Return probes.  This hasn't yet been added to the new uprobes.

* Process filtering.  AFAICS, the current uprobes implementation sets
the breakpoint in all processes that map the particular inode.  There is
a filtering mechanism, but that seems only to decide whether to call the
handler each time.  You'll still take the bp/sstep overhead.  Also, on
stap's side, we previously had the ability to limit process probes to
the -x/-c target and children, which I haven't tried here yet.

* Runtime build-id verification.  Right now I'm just mapping the path to
inode*, without checking that the build-id is what we expected.  I'm not
sure we even could at the systemtap-init point.  Even if we did, the
file may still get modified without changing the inode, and I don't
think this uprobes gives us any way to notice or decide whether we like
the new form.

* SDT semaphore.  In the current form, we have no hook on individual
processes, so we can't modify the semaphores in applications that are
actively gating their markers.  We'll probably need something like
PR10994 to achieve this, which isn't really about uprobes per-se, but
rather about living without utrace.

* Argument access.  If you try $args, it will fail with a missing symbol
'task_user_regset_view'.  I haven't looked closely at this yet.

* Probe IP.  For many probe handlers, we try to set the pt_regs IP to
the actual breakpoint IP, but in this case we don't happen to even know
the virtualized address.
Uprobes itself uses uprobes_get_bkpt_addr() in some instances, but
that's not exposed for our use.

I think that's it.  So if you happen to build a kernel with the new
uprobes, please enjoy systemtap support too. :)

Josh

--------------050205080504000300020306
Content-Type: text/x-patch; name="systemtap-inode-uprobes.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="systemtap-inode-uprobes.patch"
Content-length: 14295

commit f8e4aa0bc62d79bfc7a2dcb1508215a675d9f83d
Author: Josh Stone
Date:   Thu May 19 19:44:16 2011 -0700

    Add initial support for inode-based uprobes

    This adds support for placing regular userspace probes using the new
    inode+offset API being developed for the upstream kernel.  This
    includes probing functions, statements, and SDT markers, but return
    probes aren't yet supported in the new API.

    A lot of the finer details of systemtap's userspace runtime still
    needs work too, but this is a functional start.

    * runtime/uprobes-inode.c: New, basic registration code to lookup
      filename inodes and connect uprobes using the new API.
    * tapsets.cxx (kernel_supports_inode_uprobes): New, guess whether this
      is an inode-uprobes kernel based on CONFIG values.
      (dwarf_builder::build): Disallow userspace return probes.
      (uprobe_derived_probe::join_group): Only trigger task_finder and the
      manual uprobes model for the old style of uprobes.
      (uprobe_builder::build): Disallow absolute-address userspace probes.
      (uprobe_derived_probe_group::emit*): Split into inode/utrace variants.

diff --git a/runtime/uprobes-inode.c b/runtime/uprobes-inode.c
new file mode 100644
index 0000000..b04ca6d
--- /dev/null
+++ b/runtime/uprobes-inode.c
@@ -0,0 +1,119 @@
+/* -*- linux-c -*-
+ * Common functions for using inode-based uprobes
+ * Copyright (C) 2011 Red Hat Inc.
+ *
+ * This file is part of systemtap, and is free software.
You can
+ * redistribute it and/or modify it under the terms of the GNU General
+ * Public License (GPL); either version 2, or (at your option) any
+ * later version.
+ */
+
+#ifndef _UPROBES_INODE_C_
+#define _UPROBES_INODE_C_
+
+#include
+#include
+#include
+
+struct stp_inode_uprobe_target {
+	const char * const filename;
+	struct inode *inode;
+};
+
+struct stp_inode_uprobe_consumer {
+	struct uprobe_consumer consumer;
+	struct stp_inode_uprobe_target * const target;
+	loff_t offset;
+	/* XXX sdt_sem_offset support? */
+
+	struct stap_probe * const probe;
+};
+
+
+static void
+stp_inode_uprobes_put(struct stp_inode_uprobe_target *targets,
+		size_t ntargets)
+{
+	size_t i;
+	for (i = 0; i < ntargets; ++i) {
+		struct stp_inode_uprobe_target *ut = &targets[i];
+		iput(ut->inode);
+		ut->inode = NULL;
+	}
+}
+
+static int
+stp_inode_uprobes_get(struct stp_inode_uprobe_target *targets,
+		size_t ntargets)
+{
+	int ret = 0;
+	size_t i;
+	for (i = 0; i < ntargets; ++i) {
+		struct path path;
+		struct stp_inode_uprobe_target *ut = &targets[i];
+		ret = kern_path(ut->filename, LOOKUP_FOLLOW, &path);
+		if (!ret) {
+			ut->inode = igrab(path.dentry->d_inode);
+			if (!ut->inode)
+				ret = -EINVAL;
+		}
+		if (ret)
+			break;
+	}
+	if (ret)
+		stp_inode_uprobes_put(targets, i);
+	return ret;
+}
+
+static void
+stp_inode_uprobes_unreg(struct stp_inode_uprobe_consumer *consumers,
+		size_t nconsumers)
+{
+	size_t i;
+	for (i = 0; i < nconsumers; ++i) {
+		struct stp_inode_uprobe_consumer *uc = &consumers[i];
+		unregister_uprobe(uc->target->inode, uc->offset,
+				&uc->consumer);
+	}
+}
+
+static int
+stp_inode_uprobes_reg(struct stp_inode_uprobe_consumer *consumers,
+		size_t nconsumers)
+{
+	int ret = 0;
+	size_t i;
+	for (i = 0; i < nconsumers; ++i) {
+		struct stp_inode_uprobe_consumer *uc = &consumers[i];
+		ret = register_uprobe(uc->target->inode, uc->offset,
+				&uc->consumer);
+		if (ret)
+			break;
+	}
+	if (ret)
+		stp_inode_uprobes_unreg(consumers, i);
+	return ret;
+}
+
+static int
+stp_inode_uprobes_init(struct stp_inode_uprobe_target *targets, size_t ntargets,
+		struct stp_inode_uprobe_consumer *consumers, size_t nconsumers)
+{
+	int ret = stp_inode_uprobes_get(targets, ntargets);
+	if (!ret) {
+		ret = stp_inode_uprobes_reg(consumers, nconsumers);
+		if (ret)
+			stp_inode_uprobes_put(targets, ntargets);
+	}
+	return ret;
+}
+
+static void
+stp_inode_uprobes_exit(struct stp_inode_uprobe_target *targets, size_t ntargets,
+		struct stp_inode_uprobe_consumer *consumers, size_t nconsumers)
+{
+	stp_inode_uprobes_unreg(consumers, nconsumers);
+	stp_inode_uprobes_put(targets, ntargets);
+}
+
+#endif /* _UPROBES_INODE_C_ */
diff --git a/tapsets.cxx b/tapsets.cxx
index 8afe02e..25170dc 100644
--- a/tapsets.cxx
+++ b/tapsets.cxx
@@ -3795,6 +3795,16 @@ dwarf_derived_probe::join_group (systemtap_session& s)
 }
 
 
+static bool
+kernel_supports_inode_uprobes(systemtap_session& s)
+{
+  // The arch-supports is new to the builtin inode-uprobes, so it makes a
+  // reasonable indicator of the new API.  Else we'll need an autoconf...
+  return (s.kernel_config["CONFIG_ARCH_SUPPORTS_UPROBES"] == "y"
+          && s.kernel_config["CONFIG_UPROBES"] == "y");
+}
+
+
 dwarf_derived_probe::dwarf_derived_probe(const string& funcname,
                                          const string& filename,
                                          int line,
@@ -3835,6 +3845,12 @@ dwarf_derived_probe::dwarf_derived_probe(const string& funcname,
       // ET_DYN ones do (addr += run-time mmap base address).  We tell these apart
       // by the incoming section value (".absolute" vs. ".dynamic").
       // XXX Assert invariants here too?
+
+      // inode-uprobes needs an offset rather than an absolute VM address.
+      if (kernel_supports_inode_uprobes(q.dw.sess) &&
+          section == ".absolute" && addr == dwfl_addr &&
+          addr >= q.dw.module_start && addr < q.dw.module_end)
+        this->addr = addr - q.dw.module_start;
     }
   else
     {
@@ -6182,7 +6198,13 @@ dwarf_builder::build(systemtap_session & sess,
       else
         module_name = user_path; // canonicalize it
 
-      if (sess.kernel_config["CONFIG_UTRACE"] != string("y"))
+      if (kernel_supports_inode_uprobes(sess))
+        {
+          if (has_null_param(parameters, TOK_RETURN))
+            throw semantic_error
+              (_("process return probes not available with inode-based uprobes"));
+        }
+      else if (sess.kernel_config["CONFIG_UTRACE"] != string("y"))
         throw semantic_error (_("process probes not available without kernel CONFIG_UTRACE"));
 
       // user-space target; we use one dwflpp instance per module name
@@ -6635,6 +6657,16 @@ private:
     return p->module + "|" + p->section + "|" + lex_cast(p->pid);
   }
 
+  // Using our own utrace-based uprobes
+  void emit_module_utrace_decls (systemtap_session& s);
+  void emit_module_utrace_init (systemtap_session& s);
+  void emit_module_utrace_exit (systemtap_session& s);
+
+  // Using the upstream inode-based uprobes
+  void emit_module_inode_decls (systemtap_session& s);
+  void emit_module_inode_init (systemtap_session& s);
+  void emit_module_inode_exit (systemtap_session& s);
+
 public:
   void emit_module_decls (systemtap_session& s);
   void emit_module_init (systemtap_session& s);
@@ -6648,11 +6680,15 @@ uprobe_derived_probe::join_group (systemtap_session& s)
   if (! 
s.uprobe_derived_probes)
     s.uprobe_derived_probes = new uprobe_derived_probe_group ();
   s.uprobe_derived_probes->enroll (this);
-  enable_task_finder(s);
 
-  // Ask buildrun.cxx to build extra module if needed, and
-  // signal staprun to load that module
-  s.need_uprobes = true;
+  if (!kernel_supports_inode_uprobes(s))
+    {
+      enable_task_finder(s);
+
+      // Ask buildrun.cxx to build extra module if needed, and
+      // signal staprun to load that module
+      s.need_uprobes = true;
+    }
 }
 
 
@@ -6684,7 +6720,7 @@ uprobe_derived_probe::emit_unprivileged_assertion (translator_output* o)
 struct uprobe_builder: public derived_probe_builder
 {
   uprobe_builder() {}
-  virtual void build(systemtap_session &,
+  virtual void build(systemtap_session & sess,
 		     probe * base,
 		     probe_point * location,
 		     literal_map_t const & parameters,
@@ -6692,6 +6728,9 @@ struct uprobe_builder: public derived_probe_builder
   {
     int64_t process, address;
 
+    if (kernel_supports_inode_uprobes(sess))
+      throw semantic_error (_("absolute process probes not available with inode-based uprobes"));
+
     bool b1 = get_param (parameters, TOK_PROCESS, process);
     (void) b1;
     bool b2 = get_param (parameters, TOK_STATEMENT, address);
@@ -6705,10 +6744,10 @@ struct uprobe_builder: public derived_probe_builder
 
 
 void
-uprobe_derived_probe_group::emit_module_decls (systemtap_session& s)
+uprobe_derived_probe_group::emit_module_utrace_decls (systemtap_session& s)
 {
   if (probes.empty()) return;
-  s.op->newline() << "/* ---- user probes ---- */";
+  s.op->newline() << "/* ---- utrace uprobes ---- */";
 
   // If uprobes isn't in the kernel, pull it in from the runtime.
   s.op->newline() << "#if defined(CONFIG_UPROBES) || defined(CONFIG_UPROBES_MODULE)";
@@ -6892,11 +6931,11 @@ uprobe_derived_probe_group::emit_module_decls (systemtap_session& s)
 
 
 void
-uprobe_derived_probe_group::emit_module_init (systemtap_session& s)
+uprobe_derived_probe_group::emit_module_utrace_init (systemtap_session& s)
 {
   if (probes.empty()) return;
-  s.op->newline() << "/* ---- user probes ---- */";
+  s.op->newline() << "/* ---- utrace uprobes ---- */";
 
   s.op->newline() << "for (j=0; jnewline(1) << "struct stap_uprobe *sup = & stap_uprobes[j];";
@@ -6925,10 +6964,10 @@ uprobe_derived_probe_group::emit_module_init (systemtap_session& s)
 
 
 void
-uprobe_derived_probe_group::emit_module_exit (systemtap_session& s)
+uprobe_derived_probe_group::emit_module_utrace_exit (systemtap_session& s)
 {
   if (probes.empty()) return;
-  s.op->newline() << "/* ---- user probes ---- */";
+  s.op->newline() << "/* ---- utrace uprobes ---- */";
 
   // NB: there is no stap_unregister_task_finder_target call;
   // important stuff like utrace cleanups are done by
@@ -6998,6 +7037,126 @@ uprobe_derived_probe_group::emit_module_exit (systemtap_session& s)
   s.op->newline() << "mutex_destroy (& stap_uprobes_lock);";
 }
 
+
+void
+uprobe_derived_probe_group::emit_module_inode_decls (systemtap_session& s)
+{
+  if (probes.empty()) return;
+  s.op->newline() << "/* ---- inode uprobes ---- */";
+  s.op->newline() << "#include \"uprobes-inode.c\"";
+
+  // Write the probe handler.
+  s.op->newline() << "static int enter_inode_uprobe "
+                  << "(struct uprobe_consumer *inst, struct pt_regs *regs) {";
+  s.op->newline(1) << "struct stp_inode_uprobe_consumer *sup = "
+                   << "container_of(inst, struct stp_inode_uprobe_consumer, consumer);";
+  common_probe_entryfn_prologue (s.op, "STAP_SESSION_RUNNING", "sup->probe");
+  s.op->newline() << "c->regs = regs;";
+  s.op->newline() << "c->regflags |= _STP_REGS_USER_FLAG;";
+  // XXX: Can't set SET_REG_IP; we don't actually know the relocated address.
+  // ... 
In some error cases, uprobes itself calls uprobes_get_bkpt_addr().
+  s.op->newline() << "(*sup->probe->ph) (c);";
+  common_probe_entryfn_epilogue (s.op);
+  s.op->newline() << "return 0;";
+  s.op->newline(-1) << "}";
+  s.op->assert_0_indent();
+
+  // Index of all the modules for which we need inodes.
+  map module_index;
+  unsigned module_index_ctr = 0;
+
+  // Discover and declare targets for each unique path.
+  s.op->newline() << "static struct stp_inode_uprobe_target "
+                  << "stap_inode_uprobe_targets[] = {";
+  s.op->indent(1);
+  for (unsigned i=0; imodule) == module_index.end())
+        {
+          module_index[p->module] = module_index_ctr++;
+          s.op->newline() << "{ .filename=" << lex_cast_qstring(p->module) << " },";
+        }
+    }
+  s.op->newline(-1) << "};";
+  s.op->assert_0_indent();
+
+  // Declare the actual probes.
+  s.op->newline() << "static struct stp_inode_uprobe_consumer "
+                  << "stap_inode_uprobe_consumers[] = {";
+  s.op->indent(1);
+  for (unsigned i=0; imodule];
+      s.op->newline() << "{"
+                      << " .consumer={ .handler=enter_inode_uprobe },"
+                      << " .target=&stap_inode_uprobe_targets[" << index << "],"
+                      << " .offset=(loff_t)0x" << hex << p->addr << dec << "ULL,"
+                      << " .probe=" << common_probe_init (p) << ","
+                      << "},";
+    }
+  s.op->newline(-1) << "};";
+  s.op->assert_0_indent();
+}
+
+
+void
+uprobe_derived_probe_group::emit_module_inode_init (systemtap_session& s)
+{
+  if (probes.empty()) return;
+  s.op->newline() << "/* ---- inode uprobes ---- */";
+  s.op->newline() << "rc = stp_inode_uprobes_init ("
+                  << "stap_inode_uprobe_targets, "
+                  << "ARRAY_SIZE(stap_inode_uprobe_targets), "
+                  << "stap_inode_uprobe_consumers, "
+                  << "ARRAY_SIZE(stap_inode_uprobe_consumers));";
+}
+
+
+void
+uprobe_derived_probe_group::emit_module_inode_exit (systemtap_session& s)
+{
+  if (probes.empty()) return;
+  s.op->newline() << "/* ---- inode uprobes ---- */";
+  s.op->newline() << "stp_inode_uprobes_exit ("
+                  << "stap_inode_uprobe_targets, "
+                  << "ARRAY_SIZE(stap_inode_uprobe_targets), "
+                  << 
"stap_inode_uprobe_consumers, "
+                  << "ARRAY_SIZE(stap_inode_uprobe_consumers));";
+}
+
+
+void
+uprobe_derived_probe_group::emit_module_decls (systemtap_session& s)
+{
+  if (kernel_supports_inode_uprobes (s))
+    emit_module_inode_decls (s);
+  else
+    emit_module_utrace_decls (s);
+}
+
+
+void
+uprobe_derived_probe_group::emit_module_init (systemtap_session& s)
+{
+  if (kernel_supports_inode_uprobes (s))
+    emit_module_inode_init (s);
+  else
+    emit_module_utrace_init (s);
+}
+
+
+void
+uprobe_derived_probe_group::emit_module_exit (systemtap_session& s)
+{
+  if (kernel_supports_inode_uprobes (s))
+    emit_module_inode_exit (s);
+  else
+    emit_module_utrace_exit (s);
+}
+
+
 // ------------------------------------------------------------------------
 // Kprobe derived probes
 // ------------------------------------------------------------------------

--------------050205080504000300020306--