From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: new static user probe types
From: Mark Wielaard <mjw@redhat.com>
To: Roland McGrath <roland@redhat.com>
Cc: Stan Cox <scox@redhat.com>, systemtap@sourceware.org
In-Reply-To: <20090723030643.9BC0B2D36@magilla.sf.frob.com>
References: <4A453D09.60600@redhat.com> <4A5E0195.5080803@redhat.com> 	 <4A64B8AF.6030304@redhat.com> 	 <1248259327.7890.29.camel@springer.wildebeest.org> 	 <20090723030643.9BC0B2D36@magilla.sf.frob.com>
Content-Type: text/plain
Date: Thu, 23 Jul 2009 10:28:00 -0000
Message-Id: <1248344916.3494.33.camel@springer.wildebeest.org>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit

On Wed, 2009-07-22 at 20:06 -0700, Roland McGrath wrote:
> > You would make sure that there are enough nops in the place of the probe
> > point for the instruction sequence you want to replace and then the
> > uprobes insert instruction mechanism would (after checking it had enough
> > nop space) insert the instruction sequence (preferable the one used by
> > the utrace mechanism).
> 
> It can be more precisely-tailored than that, you don't need to think of it
> as being a "uprobes method" at all.  It's very simple hard-wired code patching.
> i.e., the macro produces one long nop and you patch that to a relative call.
> You can make it a call to a stock function we provide in some .a you link
> with, or to a stub generated directly in an alternate section by the macros.
> (If you don't need different stubs, it could be in a linkonce section.)
>
> > It would also help with implementing the idea for the ENABLED mechanism
> 
> That's just another variant of code-patching for the same purpose.
> 
> > So, it might be a bit like what Srikar posted to utrace-devel: [...]
> 
> By which you just mean it's another kind of code-patching.

Yes. I am thinking of it as a "uprobes method" because uprobes already
contains the UBP mechanism, interfaces and data structures we would need
to make this more general.
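
To make the "one long nop patched to a relative call" idea concrete,
here is a minimal sketch of how the patch bytes could be computed on
x86. It only fills in a byte buffer and does not touch live code. The
names (demo_nop5, demo_make_call_patch) are made up for illustration
and are not part of uprobes, utrace or systemtap, and the particular
5-byte nop encoding is an assumption about what the macro would emit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PATCH_LEN 5

/* Assumed placeholder: one common 5-byte x86 nop,
   nopl 0x0(%rax,%rax,1).  */
static const uint8_t demo_nop5[DEMO_PATCH_LEN] =
  { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Rewrite the nop in BUF into "call rel32" for a probe site at address
   SITE calling TARGET.  Returns 0 on success, -1 if BUF does not hold
   the expected nop or TARGET is out of rel32 range.  */
int demo_make_call_patch(uint8_t *buf, uint64_t site, uint64_t target)
{
  /* The displacement is relative to the end of the call instruction. */
  int64_t rel = (int64_t) (target - (site + DEMO_PATCH_LEN));
  uint32_t rel32;

  assert(buf != NULL);

  /* Check that there really is nop space before patching. */
  if (memcmp(buf, demo_nop5, DEMO_PATCH_LEN) != 0)
    return -1;
  if (rel < INT32_MIN || rel > INT32_MAX)
    return -1;

  rel32 = (uint32_t) (int32_t) rel;
  buf[0] = 0xe8;                      /* call rel32 opcode */
  buf[1] = (uint8_t) (rel32 & 0xff);  /* little-endian displacement */
  buf[2] = (uint8_t) ((rel32 >> 8) & 0xff);
  buf[3] = (uint8_t) ((rel32 >> 16) & 0xff);
  buf[4] = (uint8_t) ((rel32 >> 24) & 0xff);
  return 0;
}
```

A real implementation would also have to deal with writing into a
running process safely, which this sketch leaves out entirely.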

> > Or how about this.  We could expand STAP_PROBE(...) to
> > 
> >    { extern char stap_probe_NNNN_enabled_p;
> >      if (unlikely(stap_probe_NNN_enabled_p)) {
> >         /* current inline-asm stuff, but adding
> >            &enabled_p to the descriptor struct. */
> >      }
> >    }
> 
> The point of this is to skip any argument-packing work generated by the
> compiler, which would be inside the "if unlikely" block, right?

Partly, but sadly GCC does not guarantee that.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207
The main speed win would be not tripping over the "probe trigger".
Another advantage is having a general way to check whether a probe
is enabled or not.
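
As a rough sketch of the guarded expansion quoted above, something
like the following shows the source-level shape. All names here
(DEMO_PROBE, demo_probe_enabled_p, demo_probe_fire and so on) are
invented for illustration and are not the real sdt.h macros, and
demo_probe_fire merely stands in for the current inline-asm trigger.
It assumes GCC or a compatible compiler for __builtin_expect.

```c
#include <assert.h>

#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical per-probe flag; a tracer would set it when the probe
   is armed.  volatile so every probe hit re-reads it.  */
volatile char demo_probe_enabled_p = 0;

int demo_pack_calls = 0;  /* how often the argument was computed */
int demo_fire_calls = 0;  /* how often the trigger was reached */

/* Stand-in for the existing inline-asm probe trigger. */
void demo_probe_fire(long arg)
{
  assert(demo_probe_enabled_p);  /* only reachable when armed */
  (void) arg;
  demo_fire_calls++;
}

/* Stand-in for expensive argument setup at the probe site. */
long demo_expensive_arg(void)
{
  demo_pack_calls++;
  return 42;
}

/* The argument expression is only evaluated inside the guard. */
#define DEMO_PROBE(arg) \
  do { \
    if (demo_unlikely(demo_probe_enabled_p)) \
      demo_probe_fire(arg); \
  } while (0)

void demo_hot_path(void)
{
  DEMO_PROBE(demo_expensive_arg());
}
```

With the flag clear, calling demo_hot_path() leaves both counters at
zero; with it set, each call bumps both. How much of the argument
setup the compiler actually keeps inside the cold branch is a code
generation question that this sketch does not settle.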

> > Certainly warrants a try and benchmark.
> 
> I think this is a lot like some things Mathieu already experimented with
> and measured in the kernel context.  I think he pursued a code-patching
> flavor that patched an immediate operand because that was measured as
> faster than having the actual extra load of a simple enabled_p variable.

That sounds like what I proposed in
http://sourceware.org/bugzilla/show_bug.cgi?id=10013#c2
The disadvantage is that it needs some code-patching magic. The
advantage of the above approach is that it wouldn't need anything not
already in the kernel mainline, just tricking gcc and the preprocessor
into setting things up the way we want.

> > BTW. For storing changeable variables the .probes section should become
> > alloc, rw now always (it currently is only for relocatable objects). 
> 
> It doesn't make sense that it should differ in relocatable objects.
> I don't understand that.

The .probes section stores the addresses of the generated labels that
are used to find the probe addresses. This isn't a problem for an
executable that isn't relocatable. But it is for shared libraries, which
have relocatable addresses (if you have selinux memory protection turned
on). In that case the section has to be writable so that the linker can
fill in the relocated addresses. See
http://sourceware.org/bugzilla/show_bug.cgi?id=10381
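
A minimal sketch of what an always-writable descriptor could look
like, assuming GCC on an ELF target. The struct layout and all demo_*
names are hypothetical and do not match the real sdt.h descriptor. The
point is only that a non-const object placed in the section makes it
alloc and writable, so it can hold both an address that needs
relocating and a flag that changes at run time.

```c
#include <assert.h>

/* Hypothetical descriptor layout, for illustration only. */
struct demo_probe_desc {
  void (*site)(void);     /* address that needs a relocation in PIC code */
  volatile char enabled;  /* changeable flag stored next to it */
};

/* Stand-in for the code location a probe label would mark. */
void demo_probe_site(void)
{
}

/* Non-const data in a named section: GCC emits it as alloc + write. */
struct demo_probe_desc demo_desc
  __attribute__((used, section(".probes"))) = { demo_probe_site, 0 };

int demo_probe_is_enabled(void)
{
  return demo_desc.enabled != 0;
}

void demo_probe_set_enabled(int on)
{
  assert(on == 0 || on == 1);
  demo_desc.enabled = (char) on;
}
```

The section flags of the result can be inspected with readelf -S.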

Cheers,

Mark