From: Roland McGrath
To: fche@redhat.com (Frank Ch. Eigler)
Cc: systemtap@sources.redhat.com
Subject: Re: elfutils offline mode for user-space?
In-Reply-To: Frank Ch. Eigler's message of , 23 July 2007 13:50:20 -0400
Message-Id: <20070814040746.6539F4D057D@magilla.localdomain>
Date: Tue, 14 Aug 2007 15:51:00 -0000

> roland wrote:
> > [...] I see this as just another element of the general "figure out
> > which files to look at" problem.
> > There just isn't any clear and
> > satisfying means to glean it statically.
>
> It is likely sufficient to identify all the *candidates* statically.

That's the hard part. I think "external means", i.e. kludges, are where
you should start. You can run ldd. For each packaging system (rpm et al)
you can do some sort of packaging-specific guesswork about what the
thing might dlopen. (It's a little bleak.) So, just start prototyping
with the kludges and see what seems to fly.

> > This gets back to the whole question of what the plan is for the
> > model of specifying user-space probe locations. It's not just about
> > the right list of loaded files.
>
> The basics would be identifying executables or libraries, and naming
> functions/statements defined therein, as before. A reference to an
> unlisted shared library could be made probe-able later.

I don't follow what "reference to ... unlisted" means here.

> > I can spin you a whole rant about DWARF names and symbol versioning.
> > But all this belongs on the public mailing lists. [...]
>
> Fire away.

As you mentioned, there are two basic pieces of specification: module,
and probe location within that module.

The way kernel probes are specified is in essence always from the point
of view of the kernel source. For modules that is fully sufficient.
Aside from s/-/_/g, there is only one way that the modules are referred
to both in the kernel source and by users. For probe locations, it means
that direct probes are purely the province of tapset writers and other
kernel developers. An average script writer refers only to the probe
points designed by hand and provided in a tapset.

That is fine for the kernel. Application programmers have no expectation
of describing their interests in the kernel directly in detailed
language terms they use in their own programming.
Even for system calls, there is some understanding that there is not a
one-to-one correspondence between the syscall-like functions in whatever
language binding they use (libc included) and the actual kernel crossing
names and argument encodings. (The only expectation is to have a
complete syscall tapset that presents terms that are sensible in the
abstract.)

For DSOs, the line between "library" (or "system") and "application" (or
"user") is much more blurry. But the naive support that builds minimally
from what works now for kernel probes yields the same result. That is,
to specify a probe in a DSO one takes the point of view of the source
code that was built into that DSO. For any library not part of the
user's own code, including headers as well as DSOs, one must either wade
blindly into that, or rely solely on a tapset provided to go with that
DSO. Maybe that is OK, or maybe it is not.

In any one user process, there can be as many different source-level
points of view as compilation units making up the modules in the
process, and then there are the ABI points of view.

DWARF data describes the source-level point of view in each compilation
unit. The name of a function in DWARF data is the name used in the
source to define the function's body. With aliases, this may not be the
same name by which it's called even elsewhere in the same compilation
unit and source file. With aliases and symbol versions, it often may not
be the same name used from other modules.

If a tapset writer asks for a probe on foobar in libfoo.so in the
libfoo.so tapset, he expects the "foobar" used as a source name in
building libfoo.so. This is just like the current kernel case.

If an application writer asks for a probe on foobar in libfoo.so, he may
expect the "foobar" used as a source name in his application. He may
expect the signature that name had for use in his application source.
He may expect its parameters to have the names used in the prototype
declaration in the header file included by the application, or those
names s/^_*//, or want to refer to them positionally given that known
signature.

The DWARF data of an application (or any module referring to symbols
defined in another) can include declaration records for external
symbols. These give the details of the type signature and the source
location of the declaration in a header file. That is enough to know
what the script writer means to refer to. With some luck, you can map
that into the ABI perspective. However, I think it's normal for the
compiler to elide all those declarations from the DWARF data when they
are externals.

The ABI perspective is driven only by ELF symbols (in the dynamic symbol
table). For our purposes, a symbol is either undefined (a key with no
value) or defined (a key and a value). The key is (soname, setname,
symbolname), or just (,, symbolname) for unversioned symbols. The
relevant part of the value for us is the address.

In the general case, I don't think there is a reliable way to map the
source-level reference a DWARF declaration record describes to the right
undefined symbol. However, without unusual efforts, one module will not
normally contain two undefined symbols with the same name (but different
version bindings). So a first approximation is to take the DWARF name
(with appropriate mangling) as the ELF symbol name and find the
undefined symbol with that name. You may be SOL for things like the
asm("name") decl magic used for "open" to produce "open64" if you are
using -D_FILE_OFFSET_BITS=64.

Say somehow you got the right symbol key, i.e. the ABI perspective. You
can find the module(s) with the right soname, or any modules that define
the right unversioned symbol name, and now you have the defined symbol.
All that really tells you is an address.
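
To make the key-and-value model concrete, here is a minimal sketch in
Python of the two lookups just described: DWARF name to undefined symbol
key, then key to defining module and address. Everything in it is made
up for illustration: the module names, version set names, addresses, and
the helper names find_undefined() and resolve(). A real implementation
would read the dynamic symbol tables from the ELF files rather than from
dictionaries, and this toy ignores mangling and the asm("name") case.

```python
from typing import NamedTuple, Optional


class SymbolKey(NamedTuple):
    """ABI-level key for a dynamic symbol (hypothetical model)."""
    soname: Optional[str]   # None for an unversioned symbol
    setname: Optional[str]  # symbol version set name, None if unversioned
    name: str               # ELF symbol name


# Hypothetical defined symbols: module soname -> {key: address}.
DEFINED = {
    "libfoo.so.1": {
        SymbolKey("libfoo.so.1", "FOO_1.0", "foobar"): 0x1200,
        SymbolKey("libfoo.so.1", "FOO_2.0", "foobar"): 0x1340,
    },
    "libbar.so.1": {
        SymbolKey(None, None, "baz"): 0x0800,
    },
}

# Hypothetical undefined symbols of the application (keys with no value).
UNDEFINED = [
    SymbolKey("libfoo.so.1", "FOO_2.0", "foobar"),
    SymbolKey(None, None, "baz"),
]


def find_undefined(undefined, dwarf_name):
    """First approximation: treat the DWARF name as the ELF symbol name.

    Fails loudly if the module has zero or several undefined symbols
    with that name, since the approximation is then not usable.
    """
    matches = [key for key in undefined if key.name == dwarf_name]
    if len(matches) != 1:
        raise LookupError(f"{dwarf_name}: {len(matches)} undefined symbols")
    return matches[0]


def resolve(key, defined):
    """Return (module, address) pairs for modules defining this key.

    A versioned key is only looked up in the module with its soname;
    an unversioned key is looked up in every module.
    """
    found = []
    for module, table in defined.items():
        if key.soname is not None and module != key.soname:
            continue
        if key in table:
            found.append((module, table[key]))
    return found


if __name__ == "__main__":
    key = find_undefined(UNDEFINED, "foobar")
    for module, address in resolve(key, DEFINED):
        print(f"{key.name}@{key.setname} -> {module} at {address:#x}")
```

Note that the result is still only a module and an address; nothing in
this table says anything about the source-level view.
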
To correlate this with DWARF in the defining module, you just have to
look up the right CU and look for functions whose entry_pc matches that
address. Now you've gotten back around to the source-level perspective
of the writer of that function, all the way from the application source
perspective. You get to reconcile the two. If you do know what the
application thought the signature was, you want to sanity-check that for
compatibility with the type of the defined function you found in the
DWARF data. If you had application source-perspective parameter names,
you've probably already converted those to positional by now. You can
resolve those from DWARF.

So there are many ways to attack all this. The way I've just described
things is as if doing some very fancy nuanced kind of probe
specification that specifies a resolution context and a target from that
perspective. I'm not suggesting something like that in particular.

Another approach would be to rely on tapset writers to supply explicit
probes for the functions in their DSOs. But, give them some help. For
example, automatic probe aliases for all the ABI keys of a source-level
function you name in a probe definition. This still relies on some idea
of probe resolution context, but in a much simpler way. Basically, the
context of a user script (or another tapset script, for that matter) is
specified simply as the ABI context of a given module (i.e., default
"the executable" for user scripts). What this entails is each other
soname referenced, and for each of those the ABI version (ancestorless
symbol version set name) bound to (try ldd -v). All that context does is
choose among the several sets of probe aliases each tapset defines.

Hmm, maybe you could do the same thing implicitly for user function
probes to find those aliases and it's basically the same as a simple
form of the context-qualified probe specifiers. I'm just thinking out
loud.

I'll leave the inlines part of the rant for next time.
And then there's PLT probes. It's all probably ... tractable. But, you
know, be afraid.


Thanks,
Roland
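
A footnote on the ABI-context idea above (each referenced soname plus
the version set bound to, as in ldd -v): the following is a rough Python
sketch of how such a context might be gleaned and then used to choose
among alias sets. The sample text only imitates the "Version
information:" section that glibc's ldd -v prints, and the exact layout
is an assumption. The names abi_context(), choose_aliases() and
ALIAS_SETS are hypothetical, as are the addresses. The sketch also does
not try to pick out the ancestorless version set when several are bound;
it simply keeps them all.

```python
import re

# Illustrative stand-in for `ldd -v ./app` output (layout assumed).
SAMPLE_LDD_V = """\
\tlibfoo.so.1 => /usr/lib/libfoo.so.1 (0x00007f0000000000)
\tlibc.so.6 => /lib/libc.so.6 (0x00007f0000200000)

\tVersion information:
\t./app:
\t\tlibfoo.so.1 (FOO_2.0) => /usr/lib/libfoo.so.1
\t\tlibc.so.6 (GLIBC_2.2.5) => /lib/libc.so.6
\t/usr/lib/libfoo.so.1:
\t\tlibc.so.6 (GLIBC_2.2.5) => /lib/libc.so.6
"""

# Matches lines such as: libc.so.6 (GLIBC_2.2.5) => /lib/libc.so.6
BINDING = re.compile(r"^(\S+) \((\S+)\) => (\S+)$")


def abi_context(ldd_v_output, module):
    """Map each soname referenced by `module` to its bound version sets."""
    context = {}
    in_versions = False
    current = None
    for raw in ldd_v_output.splitlines():
        line = raw.strip()
        if line == "Version information:":
            in_versions = True
            continue
        if not in_versions or not line:
            continue
        if line.endswith(":"):
            # Start of the block describing one object's references.
            current = line[:-1]
            continue
        match = BINDING.match(line)
        if match and current == module:
            soname, setname, _path = match.groups()
            context.setdefault(soname, set()).add(setname)
    return context


# Hypothetical alias sets a libfoo tapset might define, one per ABI version.
ALIAS_SETS = {
    ("libfoo.so.1", "FOO_1.0"): {"foobar": 0x1200},
    ("libfoo.so.1", "FOO_2.0"): {"foobar": 0x1340},
}


def choose_aliases(context, alias_sets):
    """Keep only the alias sets whose (soname, setname) the context binds."""
    chosen = {}
    for (soname, setname), aliases in alias_sets.items():
        if setname in context.get(soname, ()):
            chosen.update(aliases)
    return chosen


if __name__ == "__main__":
    context = abi_context(SAMPLE_LDD_V, "./app")
    print(context)
    print(choose_aliases(context, ALIAS_SETS))
```

The context does nothing more here than filter a table, which is about
as simple as a resolution context can get.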