Subject: Re: new static user probe types
From: Mark Wielaard
To: Roland McGrath
Cc: Stan Cox, systemtap@sourceware.org
Date: Thu, 23 Jul 2009 10:28:00 -0000
Message-Id: <1248344916.3494.33.camel@springer.wildebeest.org>
In-Reply-To: <20090723030643.9BC0B2D36@magilla.sf.frob.com>

On Wed, 2009-07-22 at 20:06 -0700, Roland McGrath wrote:
> > You would make sure that there are enough nops in the place of the probe
> > point for the instruction sequence you want to replace and then the
> > uprobes insert instruction
> > mechanism would (after checking it had enough
> > nop space) insert the instruction sequence (preferably the one used by
> > the utrace mechanism).
>
> It can be more precisely-tailored than that, you don't need to think of it
> as being a "uprobes method" at all. It's very simple hard-wired code patching.
> i.e., the macro produces one long nop and you patch that to a relative call.
> You can make it a call to a stock function we provide in some .a you link
> with, or to a stub generated directly in an alternate section by the macros.
> (If you don't need different stubs, it could be in a linkonce section.)
>
> > It would also help with implementing the idea for the ENABLED mechanism
>
> That's just another variant of code-patching for the same purpose.
>
> > So, it might be a bit like what Srikar posted to utrace-devel: [...]
>
> By which you just mean it's another kind of code-patching.

Yes. I am thinking of it as a "uprobes method" since that already
contains the UBP mechanism, interfaces and data structures we would need
to make this more general.

> > Or how about this. We could expand STAP_PROBE(...) to
> >
> >   { extern char stap_probe_NNNN_enabled_p;
> >     if (unlikely(stap_probe_NNNN_enabled_p)) {
> >        /* current inline-asm stuff, but adding
> >           &enabled_p to the descriptor struct. */
> >     }
> >   }
>
> The point of this is to skip any argument-packing work generated by the
> compiler, which would be inside the "if unlikely" block, right?

Partly, but it is sadly not guaranteed by GCC.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207
The main speed win would be not tripping over the "probe trigger". A
further advantage is having a general way to check whether a probe is
enabled or not.

> > Certainly warrants a try and benchmark.
>
> I think this is a lot like some things Mathieu already experimented with
> and measured in the kernel context.
> I think he pursued a code-patching
> flavor that patched an immediate operand because that was measured as
> faster than having the actual extra load of a simple enabled_p variable.

That sounds like what I proposed in
http://sourceware.org/bugzilla/show_bug.cgi?id=10013#c2
The disadvantage is that it needs some code-patching magic. The
advantage of the above approach is that it wouldn't need anything not
already in the kernel mainline, just tricking gcc and the preprocessor
into setting things up the way we want.

> > BTW. For storing changeable variables the .probes section should become
> > alloc, rw now always (it currently is only for relocatable objects).
>
> It doesn't make sense that it should differ in relocatable objects.
> I don't understand that.

The .probes section stores the addresses of the generated labels that
are used to find the probe addresses. This isn't a problem for an
executable that isn't relocatable. But it is for shared libraries, which
have relocatable addresses (if you have SELinux memory protection turned
on). In that case the section has to be writable for the linker. See
http://sourceware.org/bugzilla/show_bug.cgi?id=10381

Cheers,

Mark
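
For illustration, the enabled_p guard discussed above could be sketched
in plain C roughly as follows. This is only a sketch under assumptions:
DEMO_PROBE, demo_probe_enabled_p, probe_trigger and expensive_arg are
hypothetical names, not the real STAP_PROBE expansion, and a plain
function call stands in for the inline-asm probe trigger. It only shows
that an argument expression placed inside the guarded block is not
evaluated while the flag is off; what GCC chooses to schedule outside
the block is a separate matter, as noted above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not SystemTap's sdt.h. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Per-probe flag. In the proposal a tracer would flip it from outside;
   here the demo flips it itself. */
static volatile char demo_probe_enabled_p = 0;

static int expensive_arg_calls = 0;
static int probe_hits = 0;

/* Stands in for argument set-up work we would like to skip. */
static long expensive_arg(void)
{
    expensive_arg_calls++;
    return 42;
}

/* Stands in for the inline-asm "probe trigger". */
static void probe_trigger(long arg)
{
    probe_hits++;
    printf("probe fired, arg=%ld\n", arg);
}

/* The guard: the argument expression sits inside the unlikely branch. */
#define DEMO_PROBE(arg)                          \
    do {                                         \
        if (unlikely(demo_probe_enabled_p))      \
            probe_trigger(arg);                  \
    } while (0)

/* Hits the probe once while disabled and once while enabled,
   then returns how many times the trigger actually ran. */
static int run_demo(void)
{
    DEMO_PROBE(expensive_arg());        /* disabled: nothing happens */
    assert(expensive_arg_calls == 0);   /* the argument call was skipped */

    demo_probe_enabled_p = 1;           /* "enable" the probe */
    DEMO_PROBE(expensive_arg());        /* enabled: trigger runs once */
    return probe_hits;
}
```

Calling run_demo() from main() exercises both paths. The cost when the
probe is disabled is one load and one predicted-not-taken branch, which
is the extra load the immediate-operand patching variant tries to avoid.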