From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 6120 invoked by alias); 7 Mar 2006 00:47:15 -0000
Received: (qmail 6101 invoked by uid 22791); 7 Mar 2006 00:47:13 -0000
X-Spam-Status: No, hits=-0.7 required=5.0 tests=AWL,BAYES_50,DNS_FROM_RFC_ABUSE,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from e34.co.us.ibm.com (HELO e34.co.us.ibm.com) (32.97.110.152) by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 07 Mar 2006 00:47:10 +0000
Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e34.co.us.ibm.com (8.12.11/8.12.11) with ESMTP id k270l8e6013188 for ; Mon, 6 Mar 2006 19:47:08 -0500
Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VER6.8) with ESMTP id k270nvSr167014 for ; Mon, 6 Mar 2006 17:49:57 -0700
Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id k270l8ka022194 for ; Mon, 6 Mar 2006 17:47:08 -0700
Received: from dyn9047018079.beaverton.ibm.com (dyn9047018079.beaverton.ibm.com [9.47.18.79]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id k270l74K022177; Mon, 6 Mar 2006 17:47:08 -0700
Subject: Re: tutorial draft checked in
From: Jim Keniston
To: "Frank Ch.
 Eigler"
Cc: SystemTAP
In-Reply-To: <20060303175653.GE6873@redhat.com>
References: <20060303175653.GE6873@redhat.com>
Content-Type: multipart/mixed; boundary="=-OfWYisfB1E6Ehve9xNde"
Organization:
Message-Id: <1141692427.2864.23.camel@dyn9047018079.beaverton.ibm.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2 (1.2.2-4)
Date: Tue, 07 Mar 2006 00:47:00 -0000
X-Virus-Checked: Checked by ClamAV on sourceware.org
X-IsSubscribed: yes
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Post:
List-Help: ,
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2006-q1/txt/msg00721.txt.bz2

--=-OfWYisfB1E6Ehve9xNde
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Content-length: 1404

On Fri, 2006-03-03 at 09:56, Frank Ch. Eigler wrote:
> Hi -
>
> I checked in a draft of the systemtap tutorial (/doc/tutorial). It's
> 16 pages long at the moment, written in latex. Let me know if you
> have trouble formatting it into ps/pdf. I'll put up a snapshot at
> *temporarily*.
>
> I welcome comments on organization, presentation, and content.
>
> - FChE

Nice job. It was fun to read, and should be a huge help to new users.
Post it to the web site pronto.

Attached is a patch with suggested fixes and clarifications.

Other comments not reflected in that patch:

I didn't like the use of caps (rather than the usual italics) for
generic terms like STATEMENT. I was especially confused by the use of
TAPSET in "Naming conventions."

I'd like to see one or more examples of probes set mid-function (i.e.,
specified by line number). (Exercise for reader: Adjust the line number
to match your kernel version.)

Item #4 at the end of section 4.3: "Block" is a synonym for "sleep."
"Spin" is the term you want. And I'm not sure I agree that trylock/fail
is necessary when you fully understand the dangers of deadlock.

I edited out the use of "per-version." An interesting word, but
distracting. :-)

I'm with Richard.
"Probepoint" is the spelling established by dprobes and kprobes
documents. And it's consistent with "breakpoint."

Jim

--=-OfWYisfB1E6Ehve9xNde
Content-Disposition: attachment; filename=tutedit.patch
Content-Type: text/plain; name=tutedit.patch; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-length: 12985

--- oldtu/analysis.tex	2006-03-03 09:46:57.000000000 -0800
+++ tutorial/analysis.tex	2006-03-06 16:15:21.000000000 -0800
@@ -14,9 +14,19 @@
 
 Most systemtap scripts include conditionals, to limit tracing or other
 logic to those processes or users or {\em whatever} of interest. The
-syntax is simple: \verb+if(EXPR) STATEMENT [else STATEMENT]+.
-Similarly, scripts can loop: \verb+while(EXPR) STATEMENT+ or
-\verb+for(A;B;C) STATEMENT+, and \verb+break+/\verb+continue+ as in C.
+syntax is simple:
+\begin{verbatim}
+if(EXPR) STATEMENT [else STATEMENT]
+\end{verbatim}
+Similarly, scripts can loop:
+\begin{verbatim}
+while(EXPR) STATEMENT
+\end{verbatim}
+or
+\begin{verbatim}
+for(A;B;C) STATEMENT
+\end{verbatim}
+and \verb+break+/\verb+continue+ as in C.
 Probe handlers can return early using \verb+next+ as in \verb+awk+.
 Blocks of statements are enclosed in \verb+{+ and \verb+}+. In
 systemtap, the semicolon (\verb+;+) is accepted as a null statement
@@ -27,7 +37,7 @@
 C++-style (\verb+//+) comments are all accepted.
 
 Expressions look like C or \verb+awk+, and support the usual
-operators, precedences, and numeric literals. String are treated as
+operators, precedences, and numeric literals. Strings are treated as
 atomic values rather than arrays of characters. String concatenation
 is done with the dot (\verb+"a" . "b"+). Some examples:
 
@@ -42,7 +52,7 @@
 use it in expressions. They are automatically initialized and
 declared. The type of each identifier -- string vs. number -- is
 automatically inferred by systemtap from the kinds of operators and
-literals used on it. Any inconsistencies will signal an error.
+literals used on it.
Any inconsistencies will be reported as errors.
 Conversion between string and number types is done through explicit
 function calls.
 
@@ -51,7 +61,7 @@
 or number.} \nomenclature{string}{A \verb+\0+-terminated character
 string of up to a fixed limit in length.} \nomenclature{number}{A 64-bit
 signed integer.} \nomenclature{type inference}{The automatic
-computation of the type of each variable, function parameter, array
+determination of the type of each variable, function parameter, array
 value and index, based on their use.}
 
 \begin{tabular}{rl}
@@ -67,9 +77,9 @@
 anywhere in the script. Because of possible concurrency (multiple
 probe handlers running on different CPUs), each global variable used by
 a probe is automatically read- or write-locked while the handler is
-running. \nomenclature{global variable}{A scalar or array that was
+running. \nomenclature{global variable}{A scalar, array, or aggregate that was
 named in a \verb+global+ declaration, sharing that object amongst all
-probe handlers executed during a systemtap session.}
+probe handlers and functions executed during a systemtap session.}
 \nomenclature{locking}{An automated facility used by systemtap to
 protect global variables against concurrent modification and/or
 access.}
@@ -102,7 +112,7 @@
 A class of special ``target variables'' allow access to the probe point
 context. \nomenclature{target variable}{A value that may be
 extracted from the kernel context of the probe point, such as a
-parameter or local variable within an probed function.} In a symbolic
+parameter or local variable within a probed function.} In a symbolic
 debugger, when you're stopped at a breakpoint, you can print values
 from the program's context. In systemtap scripts, for those probe
 points that match with specific executable point (rather than an
@@ -118,7 +128,7 @@
 \verb+vfs_write+. Each takes a \verb+struct file *+ argument, inside
 which there is a \verb+struct dentry *+, a \verb+struct inode *+, and
 so on.
Systemtap allows limited dereferencing of such pointer chains.
-Two functions \verb+user_string+ and \verb+kernel_string+ can copy
+Two functions, \verb+user_string+ and \verb+kernel_string+, can copy
 \verb+char *+ target variables into systemtap strings.
 Figure~\ref{fig:inode-watch} demonstrates one way to monitor a
 particular file (identifed by device number and inode number). This
@@ -212,8 +222,9 @@
 is fixed at startup. Because they are too large to be created
 dynamically for inidividual probes handler runs, they must be declared
 as global. \nomenclature{array}{A global
-$k_1,k_2,\ldots,k_n\rightarrow value$ lookup table, with a string,
-number, or statistics type for each index and value.}
+\verb+[+$k_1,k_2,\ldots,k_n\verb+]+\rightarrow value$ lookup table, with a string,
+number, or statistics type for each index and value.
+Systemtap arrays are associative arrays.}
 
 The basic operations for arrays are setting and looking up elements.
 These are expressed in \verb+awk+ syntax: the array name followed by
@@ -250,12 +261,12 @@
 value by adding an extra \verb|+| or \verb|-| code.
 
 \begin{tabular}{rl}
-\verb|foreach ([a,b] in foo) { foo[a,b] }| & simple loop in arbitrary sequence \\
+\verb|foreach ([a,b] in foo) { fuss_with(foo[a,b]) }| & simple loop in arbitrary sequence \\
 \verb|foreach ([a,b] in foo+) { }| & loop in increasing sequence of value \\
 \verb|foreach ([a-,b] in foo) { }| & loop in decreasing sequence of first key \\
 \end{tabular}
 
-The \verb+break+ and \verb+continue+ statements work inside too.
+The \verb+break+ and \verb+continue+ statements work inside \verb+foreach+ loops, too.
 Since arrays can be large but probe handlers must not run for long, it
 is a good idea to exit iteration early if possible.\footnote{We
 anticipate an iteration-count-limited extension to {\tt foreach}
@@ -266,13 +277,13 @@
 
 When we said above that values can only be strings or numbers, we lied
 a little. There is a third type: statistics aggregates, or aggregates
-for short.
Instaces of this type are used to collect statistics on
+for short. Instances of this type are used to collect statistics on
 numerical values, where it is important to accumulate new data quickly
 ({\em without} exclusive locks) and in large volume (storing only
 aggregated stream statistics). This type only makes sense for global
 variables, and may be stored individually or as elements of an array.
 \nomenclature{aggregate}{A special data type used to efficiently store
-aggregated values of a potentially huge data stream.}
+aggregated values (such as statistics) of a potentially huge data stream.}
 
 To add a value to a statistics aggregate, systemtap uses the special
 operator \verb+<<<+. Think of it like C++'s \verb+<<+ output
@@ -297,7 +308,7 @@
 \verb+@hist_linear+. These evaluate to a special sort of array that
 may at present\footnote{We anticipate support for indexing and looping
 using {\tt foreach} shortly.} only be printed.
-\nomenclature{extractor}{A function-like expression in script that
+\nomenclature{extractor}{A function-like expression in a script that
 computes a single statistic for a given aggregate.}
 
 \begin{tabular}{rl}
@@ -357,6 +368,7 @@
 of writing. Putting probes indiscriminately into unusually sensitive
 parts of the kernel (low level context switching, interrupt
 dispatching) has reportedly caused crashes in the past. We are
+fixing these bugs as they are found, and
 constructing a probe point ``blacklist'', but it is not complete.
 \nomenclature{blacklist}{A list of probe point patterns encoded into
 the translator or the kernel, where probing is prohibited for safety
--- oldtu/fini.tex	2006-03-03 09:46:57.000000000 -0800
+++ tutorial/fini.tex	2006-03-06 16:04:48.000000000 -0800
@@ -18,9 +18,9 @@
 than their documentation, they are the most reliable way to see what's
 inside all the tapsets. Use the \verb+-v+ (verbose) command line
 option, several times if you like, to show inner workings.
-\nomenclature{free softare}{Software licensed under terms such as the
+\nomenclature{free software}{Software licensed under terms such as the
 GNU GPL, which aims to enforce certain specified user freedoms such
-as study, modification, sharing.}
+as study, modification, and sharing.}
 
 Finally, there is the project web site
 (\verb+http://sources.redhat.com/systemtap/+) with several articles,
--- oldtu/intro.tex	2006-03-03 09:46:57.000000000 -0800
+++ tutorial/intro.tex	2006-03-06 16:04:48.000000000 -0800
@@ -2,7 +2,7 @@
 
 Systemtap is a tool that allows developers and administrators to write
 and reuse simple scripts to deeply examine the activities of a live
-linux system. Data may be extracted, filtered, and summarized quickly
+Linux system. Data may be extracted, filtered, and summarized quickly
 and safely, to enable diagnoses of complex performance or functional
 problems.
 
@@ -10,7 +10,7 @@
 
 The essential idea behind a systemtap script is to name {\em events},
 and to give them {\em handlers}. Whenever a specified event occurs,
-the linux kernel runs the handler as if it were a quick subroutine,
+the Linux kernel runs the handler as if it were a quick subroutine,
 then resumes. There are several kind of events, such as entering or
 exiting a function, a timer expiring, or the entire systemtap session
 starting or stopping. A handler is a series of script language
@@ -31,8 +31,8 @@
 loaded, it activates all the probed events by hooking into the kernel.
 Then, as events occur on any processor, the compiled handlers run.
 Eventually, the session stops, the hooks are disconnected, and the
-module removed. This entire process is driven from a single command
-line program \verb+stap+.
+module removed. This entire process is driven from a single
+command-line program, \verb+stap+.
 
 \begin{figure}[h!]
 \begin{boxedminipage}{4.5in}
--- oldtu/tapsets.tex	2006-03-03 10:13:06.000000000 -0800
+++ tutorial/tapsets.tex	2006-03-06 16:04:48.000000000 -0800
@@ -23,8 +23,8 @@
 larger kernel families. Naturally, the search is ordered from specific
 to general, as shown in Figure~\ref{fig:tapset-search}.
 \nomenclature{tapset search path}{A list of subdirectories searched by
-systemtap for tapset scripts, allowing per-version or
-per-architectural specialization.}
+systemtap for tapset scripts, allowing specialization by version
+or architecture.}
 
 \begin{figure}[h!]
 \begin{boxedminipage}{6in}
@@ -181,7 +181,7 @@
 
 Sometimes, a tapset needs provide data values from the kernel that
 cannot be extracted using ordinary target variables (\verb+$var+). %$
-This may be becuase the values are in complicated data structures, may
+This may be because the values are in complicated data structures, may
 require lock awareness, or are defined by layers of macros. Systemtap
 provides an ``escape hatch'' to go beyond what the language can safely
 offer. In certain contexts, you may embed plain raw C in tapsets,
--- oldtu/tracing.tex	2006-03-03 09:46:57.000000000 -0800
+++ tutorial/tracing.tex	2006-03-06 16:04:48.000000000 -0800
@@ -18,10 +18,11 @@
 the \verb+stapprobes+ man page for details. \nomenclature{tapset}{A
 reusable script forming part of the automatically searched tapset
 library.} All these events are named using a unified syntax that
-looks like dot-separated parametrized identifiers:
+looks like dot-separated parameterized identifiers:
 
 \begin{tabular}{rl}
 \verb+begin+ & The startup of the systemtap session. \\
+\verb+end+ & The end of the systemtap session. \\
 \verb+kernel.function("sys_open")+ & The entry to the function named
 \verb+sys_open+ in the kernel. \\
 \verb+syscall.close.return+ & The return from the \verb+close+ system
@@ -60,7 +61,7 @@
 \end{verbatim}
 You can run this script as is, though with empty handlers there will
 be no output. Put the two lines into a new file.
Run
-\verb+stap -v FILE+. Interrupt it any time with \verb+^C+. (The
+\verb+stap -v FILE+. Terminate it any time with \verb+^C+. (The
 \verb+-v+ option tells systemtap to print more verbose messages during
 its processing. Try the \verb+-h+ option to see more options.)
 
@@ -82,12 +83,13 @@
 
 \begin{tabular}{rl}
 \verb+tid()+ & The id of the current thread. \\
+\verb+pid()+ & The process (task group) id of the current thread. \\
 \verb+uid()+ & The id of the current user. \\
 \verb+execname()+ & The name of the current process. \\
 \verb+cpu()+ & The current cpu number. \\
 \verb+gettimeofday_s()+ & Number of seconds since epoch. \\
 \verb+get_cycles()+ & Snapshot of hardware cycle counter. \\
-\verb+pp()+ & The probe point being currently handled. \\
+\verb+pp()+ & A string describing the probe point being currently handled. \\
 \verb+probefunc()+ & If known, the name of the function in which this probe was placed. \\
 \end{tabular}
 
@@ -108,7 +110,7 @@
 process name and the thread id itself. It therefore gives an idea not
 only about what functions were called, but who called them, and how
 long they took. Figure~\ref{fig:socket-trace} shows the finished
-script. It lacks an call to the \verb+exit()+ function, so you need to
+script. It lacks a call to the \verb+exit()+ function, so you need to
 interrupt it with \verb+^C+ when you want the tracing to stop.
 
 \begin{figure}[h!]

--=-OfWYisfB1E6Ehve9xNde--