public inbox for systemtap@sourceware.org
* what does 'probe process(PID_OR_NAME).clone' mean?
@ 2008-05-28 19:38 David Smith
  2008-05-28 20:30 ` David Smith
  2008-05-29  1:37 ` Frank Ch. Eigler
  0 siblings, 2 replies; 15+ messages in thread
From: David Smith @ 2008-05-28 19:38 UTC (permalink / raw)
  To: Systemtap List

I started an internal systemtap conversation that I should have started
here, so I'm going to forward on my original email and Roland's response
to this list.

===

I've been working on solving the bug 6500 fallout and that seems to be
working now (changes just checked in).  However, the changes have
either changed behavior or caused a bug, depending on your point of view.

It all boils down to what does "probe process(PID_OR_NAME).clone" mean?
Or perhaps more precisely, when does "probe process(PID_OR_NAME).clone"
get called?

Here's an example.  If you do 'probe process("/bin/bash").clone', your
probe will get called once whenever /bin/bash calls fork() (or gets
fork()'ed).  In the case of path based probing, the distinction of
calling fork() or getting fork()'ed doesn't really matter - the probe
will get called once either way.

However, in the case of pid based probing, the distinction does matter.
If you do 'probe process(12345).clone', when should that probe get
called?  Should:

(A) the pid 12345 .clone probe get called when pid 12345 calls fork()?

or

(B) the pid 12345 .clone probe get called when a pid 12345 gets created?

With what was originally checked in, (A) is true (so test UTRACE_P5_05
passes).  With the new 6500 fallout changes, (B) is true (so test
UTRACE_P5_05 fails).

At first I thought the new (B) behavior was a bug, but I've talked
myself into believing (B) is the correct behavior.

Note that a behavior very similar to (A) could return when bug #6445
(probe process PID and its descendants) gets implemented.  Behavior (A)
also seems more like a syscall-based probe that only looks at clone calls.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-28 19:38 what does 'probe process(PID_OR_NAME).clone' mean? David Smith
@ 2008-05-28 20:30 ` David Smith
  2008-05-29  2:12   ` David Smith
  2008-05-29 15:04   ` Ananth N Mavinakayanahalli
  2008-05-29  1:37 ` Frank Ch. Eigler
  1 sibling, 2 replies; 15+ messages in thread
From: David Smith @ 2008-05-28 20:30 UTC (permalink / raw)
  To: Systemtap List

Here's Roland's response to my original email.  (Note that I've got
Roland's permission to forward this on.)

====

Your new formulation really doesn't wash with me.
Rather than a coherent response to your message,
I'll just dump a bunch of related thoughts.

First about the terminology.  In Linuxspeak they use "pid" to mean what
the rest of us might call "tid"--an ID for an individual thread
(sometimes called a task in Linuxspeak).  They use "tgid" (thread group
ID) to mean what the rest of us normally call a PID--an ID for a whole
process of one or more threads.  It's easy to be loose with them when
the fine details aren't coming up, because in Linux the PID as users
normally experience it (aka tgid) is the same number as the tid of the
initial thread in that process.  In discussions relating at all to Linux
internals, I find it easiest to stick to "tid" and "tgid" to be clear.
In features and documentation for users and programmers who don't have
Linux on the brain, it usually makes sense to talk mostly about PIDs and
the process as a whole, and then treat the possibility of multiple
threads in a process directly and explicitly rather than tossing around
tid/tgid/pid casually.

At the low level, all the events are per-thread events.  (In fact, the
entirety of the utrace interface we have now is a per-thread interface
without regard to thread groups, i.e. processes.)  At the level users
want to think about, some events are naturally considered unitary
process-wide events, like exec (and maybe "process exit" as distinct
from "thread exit").  Many (most?) others (most of the time?) users do
think of as being tied to an individual thread, but want to treat
uniformly for all threads in the process, i.e., formulate as "the
process experiences the event in some thread" with the thread as a
detail of what happened rather than as the determinant of how to treat
the event.

The clone event is an event that the parent thread experiences before
the child is considered to have been born and experienced any events.
(Consider it "crowning", if you like graphic metaphors.  ;-)
It differs from a syscall event in a few ways, some semantic and
some with only performance impact.

1. At low level, there is one switch per thread to enable any kind of
   syscall tracing.  Flipping this on makes *all* syscall entries and
   *all* syscall exits by this thread go through a slow path in the
   bowels of the kernel, not the optimized fast path.  You use the
   slow exit path even if you only care about entry tracing, and vice
   versa.  If your tracing does a very quick check for just one
   syscall and ignores all others, you are going not only through your
   check but through slow entry and exit paths for each and every
   syscall that is not of interest.

   By contrast, tracing clone events imposes zero overhead on anything
   else.  Even if only a minority of clone events are of interest,
   your callback's quick check to ignore others will in fact be very
   quick in relation to a large and costly operation that just took
   place.  Percentagewise the overhead of catching and discarding
   clone events is probably entirely negligible.

2. In syscall tracing, the event is an attempt to make a syscall.
   You get an exit trace for an error return too, though that's of
   no interest.

3. (For semantics, this is the kicker.)  The clone event callback is
   a unique opportunity where the child exists and can be examined
   and controlled before it ever runs.

   The most arcane special thing is the utrace_attach synchronization.
   (The details of this might change in the future.  But keep the
   issue in mind, whether or not you write code to rely on it now, and
   your input will influence whether the future utrace interface still
   tries to solve this problem for you and the details of how.)

   As you know, utrace_attach can be called on any target thread
   asynchronously from any thread.  As soon as a new thread exists,
   its tid goes into global tables and it becomes possible for anyone
   to find the thread's task_struct and call utrace_attach on it.
   This is so at the time the report_clone callback is made.  SMP or
   preemption may allow another thread to call utrace_attach before or
   simultaneous with your callback code.

   While report_clone callbacks are being made, utrace_attach has a
   special behavior related to this.  Until callbacks have completed,
   if the caller of utrace_attach is not the parent thread (the one
   running callbacks), it can block (or return -ERESTARTNOINTR).  The
   idea is that the engines attached to the parent and tracing its
   clone events get first crack at attaching to the child, before
   other randoms off the street.

   In the face of multiple engines, this could matter for callback
   order, though that was not the motivating concern.  When it's
   important is if you are using UTRACE_ATTACH_EXCLUSIVE with
   UTRACE_ATTACH_MATCH_* as a means of synchronization for your
   engine's own data structures and semantics.  The existing ptrace
   via utrace code does this (with UTRACE_MATCH_OPS) to implement its
   "one ptracer" semantics.  In ptrace, if the tracer of the parent
   uses PTRACE_O_TRACE*, it gets attached to the child and it's
   impossible for a simultaneous PTRACE_ATTACH to get in the way.

   The current code does this only for the very first utrace_attach
   call on the new thread.  So after one engine's report_clone callback
   has used utrace_attach on the new child, asynchronous utrace_attach
   calls by other threads can usurp later engines whose report_clone
   callbacks run second.  This is a stupid rule that can let something
   innocent break ptrace's semantics.  I'm sure I didn't intend it that
   way, but it's sufficient to make ptrace correct when ptrace is the
   only engine (or always the first), so I never noticed.  At least
   this much about the magic will surely change in the coming revamp.

   That's a lot said about a small bit of magic.  But the special
   synchronization among utrace_attach calls is not the key issue.

   The key feature of the report_clone callback is that this is the
   opportunity to do things to the new thread before it starts to run.
   Before this (such as at syscall entry for clone et al), there is no
   child to get hold of.  After this, the child starts running.  At any
   later point (such as at syscall exit from the creating call), the
   new thread will get scheduled and (if nothing else happened) will
   begin running in user mode.  (In the extreme, it could have run,
   died, and then the tid you were told already reused for a completely
   unrelated new thread.)  During report_clone, you can safely call
   utrace_attach on the new thread and then make it stop/quiesce,
   preventing it from doing anything once it gets started.

Another note on report_clone: this callback is not the place to do much
with the child except to attach it.  If you want to do something with
the child, then attach it, quiesce it, and let the rest of clone finish
in the parent--this releases the child to be scheduled and finish its
initial kernel-side setup.  (If you want the parent to wait, then make
the parent quiesce after your callback.)  Then let the report_quiesce
callback from the child instigate whatever you want to do with the
child.  This is how PTRACE_O_TRACE* works: it attaches the child, gives
it a SIGSTOP (poor man's quiesce), and then the parent lets the child
run while it stops for a ptrace report; meanwhile the child gets
scheduled, processes its SIGSTOP, and stops for the ptrace'd signal.
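
To make that attach-then-quiesce pattern concrete, here is a rough sketch
of a report_clone callback that hands the real work off to the child's own
quiesce report.  Treat every prototype and flag name below as an
assumption: the utrace callback signatures and the utrace_attach /
utrace_set_flags details have varied between utrace revisions, so this
only illustrates the shape of the pattern, not a drop-in engine.

    /* Sketch only -- utrace prototypes and flags here are assumptions. */
    static u32 my_report_clone(struct utrace_attached_engine *engine,
                               struct task_struct *parent,
                               unsigned long clone_flags,
                               struct task_struct *child)
    {
            struct utrace_attached_engine *child_engine;

            /* Attach to the child now, while it cannot run yet. */
            child_engine = utrace_attach(child, UTRACE_ATTACH_CREATE,
                                         &my_ops, engine->data);
            if (IS_ERR(child_engine))
                    return UTRACE_ACTION_RESUME;   /* just let clone finish */

            /* Ask for a quiesce report from the child before it reaches
               user mode; the heavy lifting (registers, breakpoints, ...)
               then happens in my_report_quiesce, running in the child. */
            utrace_set_flags(child, child_engine,
                             UTRACE_EVENT(QUIESCE) | UTRACE_ACTION_QUIESCE);

            /* Do nothing more to the child here; let the parent resume. */
            return UTRACE_ACTION_RESUME;
    }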

All of that discussion was about the implementation perspective that is
Linux-centric, low-level, and per-thread, considering one thread (task
in Linuxspeak) doing a clone operation that creates another task.  In
common terms, this encompasses two distinct kinds of things: creation
of additional threads within a process (pthread_create et al), and
process creation (fork/vfork).  At the utrace level, i.e. what's
meaningful at low level in Linux, this is distinguished by the
clone_flags parameter to the report_clone callback.  Important bits:

* CLONE_THREAD set
  This is a new thread in the same process; child->tgid == parent->tgid.
* CLONE_THREAD clear
  This child has its own new thread group; child->tgid == child->pid (tid).
  For modern use, this is the marker of "new process" vs "new thread".
* CLONE_VM|CLONE_VFORK both set
  This is a vfork process creation.  The parent won't return to user
  (or syscall exit tracing) until the child dies or execs.
  (Technically CLONE_VFORK can be set without CLONE_VM and it causes
  the same synchronization.)
* CLONE_VM set
  The child shares the address space of the parent.  When set without
  CLONE_THREAD or CLONE_VFORK, this is (ancient, unsupported)
  linuxthreads, or apps doing their own private clone magic (happens).

For reference, old ptrace calls it "a vfork" if CLONE_VFORK is set,
calls it "a fork" if &CSIGNAL (a mask) == SIGCHLD, and otherwise calls
it "a clone".  With syscalls or normal glibc functions, common values
are:

fork	 	-- just SIGCHLD or SIGCHLD|CLONE_*TID
vfork		-- CLONE_VFORK | CLONE_VM | SIGCHLD
pthread_create	-- CLONE_THREAD | CLONE_SIGHAND | CLONE_VM
		     | CLONE_FS | CLONE_FILES
		     | CLONE_SETTLS | CLONE_PARENT_SETTID
		     | CLONE_CHILD_CLEARTID | CLONE_SYSVSEM

Any different combination is some uncommon funny business.  (There are
more known examples I won't go into here.)  But probably just keying on
CLONE_THREAD is more than half the battle.
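
Put as (simplified) code, a clone-event callback could key off clone_flags
along these lines; the helper itself is made up for illustration, and only
the flag tests come from the classification above.

    #include <linux/sched.h>        /* CLONE_*, CSIGNAL */
    #include <linux/signal.h>       /* SIGCHLD */

    /* Illustrative helper only -- not from utrace or any tapset. */
    static const char *classify_clone(unsigned long clone_flags)
    {
            if (clone_flags & CLONE_THREAD)
                    return "new thread in the same process";
            if (clone_flags & CLONE_VFORK)
                    return "vfork-style process creation";
            if ((clone_flags & CSIGNAL) == SIGCHLD)
                    return "fork-style process creation";
            if (clone_flags & CLONE_VM)
                    return "shared-VM clone (linuxthreads or private clone magic)";
            return "uncommon clone-flag combination";
    }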

For building up to the user's natural perspective on things, I like an
organization of a few building blocks.  First, let me describe the idea
of a "tracing group".  (For now, I'll just talk about it as a semantic
abstraction and not get into how something would implement it per se.)
By this I just mean a set of tasks (i.e. threads, in one or more
processes) that you want to treat uniformly, at least in utrace
terms.  That is, "tracing group" is the coarsest determinant of how you
treat a thread having an event of potential interest.  In utrace terms,
all threads in the group have the same event mask, the same ops vector,
and possibly the same engine->data pointer.  In systemtap terms, this
might mean all the threads for which the same probes are active in a
given systemtap session.  The key notion is that the tracing group is
the granularity at which we attach policy (and means of interaction,
i.e. channels to stapio or whatnot).
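
Purely for illustration (nothing like this exists in utrace), the tracing
group could be pictured as a structure along these lines, with every field
name invented here:

    #include <linux/list.h>

    struct tracing_group {
            unsigned long event_mask;   /* shared UTRACE_EVENT(...) bits */
            const struct utrace_engine_ops *ops;  /* shared callback vector */
            void *data;                 /* shared engine->data, e.g. the
                                           channel back to stapio */
            struct list_head members;   /* tasks currently in the group */
            /* ...plus the group's rules for clone/exec/exit, as below... */
    };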

In that context, I think of task creation having these components:

1. clone event in the parent

   This is the place for up to three kinds of things to do.
   Choices can be driven by the clone_flags and/or by inspecting
   the kernel state of the new thread (which is shared with the parent,
   was copied from the parent, or is freshly initialized).

   a. Decide which groups the new task will belong to.
      i.e., if it qualifies for the group containing the parent,
      utrace_attach it now.  Or, maybe policy says for this clone
      we should spawn a new tracing group with a different policy.

   b. Do some cheap/nonblocking kind of notification and/or data
      structure setup.

   c. Decide if you want to do some heavier-weight tracing on the
      parent, and tell it to quiesce.

2. quiesce event in the parent

   This happens if 1(c) decided it should.  (For the ptrace model, this
   is where it just stays stopped awaiting PTRACE_CONT.)  After the
   revamp, this will not really be different from the syscall-exit
   event, which you might have enabled just now in the clone event
   callback.  If you are interested in the user-level program state of
   the parent that just forked/cloned, the kosher thing is to start
   inspecting it here.  (The child's tid will first be visible in the
   parent's return value register here, for example.)

3. join-group event for the child

   This "event" is an abstract idea, not a separate thing that occurs
   at low level.  The notion is similar to a systemtap "begin" probe.
   The main reason I distinguish this from the clone event and the
   child's start event (below) is to unify this way of organizing
   things with the idea of attaching to an interesting set of processes
   and threads already alive.  i.e., a join-group event happens when
   you start a session that probes a thread, as well as when a thread
   you are already probing creates another thread you choose to start
   probing from birth.

   You can think of this as the place that installs the utrace event
   mask for the thread, though that's intended to be implicit in the
   idea of what a tracing group is.  This is the place where you'd
   install any per-thread kind of tracing setup, which might include hw
   breakpoints/watchpoints.  For the attach case, where the thread was
   not part of an address space already represented in the tracing
   group, this could be the place to insert breakpoints (aka uprobes).

4. "start" event in the child

   This is not a separate low-level event, but just the first event you
   see reported by the child.  If you said you were interested (in the
   clone/join-group event), then this will usually be the quiesce event.
   But note that the child's first event might be death, if it was sent
   a SIGKILL before it had a chance to run.

   This is close to the process.start event in process.stp, but in a
   place just slightly later where it's thoroughly kosher in utrace
   terms.  Here is the first time it's become possible to change the new
   thread's user_regset state.  Everything in the kernel perspective and
   the parent's perspective about the new thread start-up has happened
   (including CLONE_CHILD_SETTID), but the thread has yet to run its
   first user instruction.

Now, let's describe the things that make sense to a user in terms of
these building blocks, in the systemtap context.  I'm still just using
this as an abstraction to describe what we do with utrace.  But it's not
just arbitrary.  I think the "grouping" model is a good middle ground
between the fundamental facilities we have to work with and the natural
programmer's view for the user that we want to get to.

Not that it would necessarily be implemented this way, but for purposes
of discussion imagine that we have the tracing group concept above as a
first-class idea in systemtap, and the primitive utrace events as probe
types.  The probe types users see are done in tapsets.  A systemtap
session starts by creating a series of tracing groups.  I think of each
group having a set of event rules (which implies its utrace event mask).
In systemtap, the rules are the probes active on that group.  I'll
describe the rules that would be implicit (i.e. in tapset code, or
whatever) and apply in addition to (before) any script probes on the
same specific low-level events (clone/exec).

When there are any global probes on utrace events, make a group we'll
call {global}.  (Add all live threads.)  Its rules are:
	clone -> child joins the group
(Down the road there may be special utrace support to optimize the
global tracing case over universal attach.)

For a process(PID) probe, make a group we'll call {process PID}.
(Add all threads of live process PID.)  Its rules are:
	clone with CLONE_THREAD -> child joins the group
	clone without CLONE_THREAD -> child leaves the group

Here I take a PID of 123 to refer to the one live process with tgid 123
at the start of the systemtap session, and not any new process that
might come to exist during the session and happen to be assigned tgid 123.

For a process.execname probe, make a group we'll call {execname "foo"}.
Its rules are:
	clone -> child joins the group
	exec -> if execname !matches "foo", leave the group

When there are any process.execname probes, then there is an implicit
global probe on exec.  In effect, {global} also has the rule:
	exec -> if execname matches "foo", join group {execname "foo"}
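
Spelled out as code over the hypothetical tracing_group structure sketched
earlier, those membership rules might look roughly like this.  Every name
below is invented for illustration (assume the group also records its kind
and, for execname groups, the name it matches); only the rules themselves
are the point.

    /* Hypothetical rule hooks -- not real utrace or systemtap code. */
    static void group_on_clone(struct tracing_group *grp,
                               struct task_struct *child,
                               unsigned long clone_flags)
    {
            switch (grp->kind) {
            case GROUP_GLOBAL:
            case GROUP_EXECNAME:
                    group_add(grp, child);          /* child always joins */
                    break;
            case GROUP_PROCESS_PID:
                    if (clone_flags & CLONE_THREAD)
                            group_add(grp, child);  /* same process: joins */
                    /* otherwise a new process: it stays out of {process PID} */
                    break;
            }
    }

    static void group_on_exec(struct tracing_group *grp,
                              struct task_struct *task, const char *execname)
    {
            /* {execname "foo"}: leave when the name no longer matches. */
            if (grp->kind == GROUP_EXECNAME && strcmp(execname, grp->name) != 0)
                    group_remove(grp, task);
            /* {global}: the implicit rule routing matching execs into the
               right {execname "foo"} group. */
            if (grp->kind == GROUP_GLOBAL)
                    maybe_join_execname_group(task, execname);
    }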

The probes a user wants to think about might be:

probe process.fork.any      = probe utrace.clone if (!(clone_flags & CLONE_THREAD))
probe process.fork.fork     = probe utrace.clone if (!(clone_flags & CLONE_VM))
probe process.vfork         = probe utrace.clone if (clone_flags & CLONE_VFORK)
probe process.create_thread = probe utrace.clone if (clone_flags & CLONE_THREAD)
probe process.thread_start
probe process.child_start

The {thread,child}_start probes would be some sort of magic for running
in the report_quiesce callback of the new task after the report_clone
callback in the creator decided we wanted to attach and set up to see
that probe.

Thanks,
Roland


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-28 19:38 what does 'probe process(PID_OR_NAME).clone' mean? David Smith
  2008-05-28 20:30 ` David Smith
@ 2008-05-29  1:37 ` Frank Ch. Eigler
  2008-05-30  8:03   ` Roland McGrath
  1 sibling, 1 reply; 15+ messages in thread
From: Frank Ch. Eigler @ 2008-05-29  1:37 UTC (permalink / raw)
  To: David Smith; +Cc: Systemtap List

David Smith <dsmith@redhat.com> writes:

> [...]
> It all boils down to what does "probe process(PID_OR_NAME).clone" mean?
> Or perhaps more precisely, when does "probe process(PID_OR_NAME).clone"
> get called?
> [...]
> (A) the pid 12345 .clone probe get called when pid 12345 calls fork()?
> (B) the pid 12345 .clone probe get called when a pid 12345 gets created?
> [...]

How about (C) neither: by conveniently defining the problem away.
Systemtap scripts don't really need to know about clone events per se, but
rather when processes/threads come and go, and what syscalls they
perform.  The ".clone" event is currently implemented was sort of a
"this new thread/process was started due to a clone or fork", and not
the same thing as probing the parent's clone syscall.

So, to simplify the script level of the situation, here's the reduced
list of probe points dsmith and I tossed around yesterday:

process().begin, process().end
process().thread.begin, process().thread.end
process().syscall, process().syscall.return

The main delta from src/NEWS is the replacement of "exec" and "clone"
with "begin", the renaming of "death" to "end", and the addition of the
thread.begin/.end probes.

So, for this scenario, here's the list of probes that could be hit:

    (login)
    bash%
    bash% ls
    foo bar
    bash% ^D

Here are the named-process probes:

process("bash").begin  # pid()=N
process("bash").begin  # forked process; ppid()=N; pid()=M
process("bash").end    # pid()=M  # NB: "bash"->"ls" process transition
process("ls").begin    # pid()=M
process("ls").end      # pid()=M
process("bash").end    # pid()=N

Here are the numbered-process probes:

process(N).begin  # pid()=N
process(M).begin  # forked process; ppid()=N; pid()=M
                  # N.B.: no .end/.begin for the "bash"->"ls" exec transition
                  # since process with pid=M is still just as alive
process(M).end    # pid()=M
process(N).end    # pid()=N

Interspersed in there would of course be .syscall/.syscall.return probes.


For a multithreaded program doing some pthread_create() calls and then an
exec(), which incidentally kills the other threads:

process("parent").begin         # pid()=M
process("parent").thread.begin  # tid()=pid()=M
process("parent").thread.begin  # tid()=P,pid()=M
process("parent").thread.begin  # tid()=Q,pid()=M
                                # some thread starting a successful exec()
process("parent").thread.end    # tid()=pid()=M
process("parent").thread.end    # tid()=P
process("parent").thread.end    # tid()=Q
process("parent").end           # pid()=M; just ran the exec
process("child").begin          # pid()=M
process("child").thread.begin   # pid()=M
process("child").thread.end     # pid()=M
process("child").end            # pid()=M

For process-numbered probes, the same, except that the
thread.end/thread.begin pair around the exec() wouldn't show up
either, just like in the above case, since the original numbered
thread/process would survive the exec.


Does this make sense as the script-level view?


- FChE


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-28 20:30 ` David Smith
@ 2008-05-29  2:12   ` David Smith
  2008-05-29 14:50     ` Ananth N Mavinakayanahalli
  2008-05-30  7:19     ` Roland McGrath
  2008-05-29 15:04   ` Ananth N Mavinakayanahalli
  1 sibling, 2 replies; 15+ messages in thread
From: David Smith @ 2008-05-29  2:12 UTC (permalink / raw)
  To: Systemtap List, Roland McGrath

David Smith wrote:
> Your new formulation really doesn't wash with me.
> Rather than a coherent response to your message,
> I'll just dump a bunch of related thoughts.

Thanks for this dump (although you owe me at least 2 tylenol for the
headache).

I'll comment on various parts below.

...

> The clone event is an event that the parent thread experiences before
> the child is considered to have been born and experienced any events.
...
>    The key feature of the report_clone callback is that this is the
>    opportunity to do things to the new thread before it starts to run.
>    Before this (such as at syscall entry for clone et al), there is no
>    child to get hold of.  After this, the child starts running.  At any
>    later point (such as at syscall exit from the creating call), the
>    new thread will get scheduled and (if nothing else happened) will
>    begin running in user mode.  (In the extreme, it could have run,
>    died, and then the tid you were told already reused for a completely
>    unrelated new thread.)  During report_clone, you can safely call
>    utrace_attach on the new thread and then make it stop/quiesce,
>    preventing it from doing anything once it gets started.
> 
> Another note on report_clone: this callback is not the place to do much
> with the child except to attach it.  If you want to do something with
> the child, then attach it, quiesce it, and let the rest of clone finish
> in the parent--this releases the child to be scheduled and finish its
> initial kernel-side setup.

Ah, interesting.  Now that I didn't know.  The current code is certainly
doing more than it is supposed to in report_clone.

Is this also true for any other events?  Currently we're using
UTRACE_EVENT({CLONE, DEATH, EXEC, SYSCALL_ENTRY, SYSCALL_EXIT}), but
this list could expand in the future.
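
For reference, that shorthand expands to an event mask built from the
per-event UTRACE_EVENT() bits, roughly as below; the macro wrapping it is
hypothetical, and the call that actually installs the mask is omitted
since its name has differed between utrace versions.

    #define STAP_UTRACE_EVENTS  (UTRACE_EVENT(CLONE)         | \
                                 UTRACE_EVENT(DEATH)         | \
                                 UTRACE_EVENT(EXEC)          | \
                                 UTRACE_EVENT(SYSCALL_ENTRY) | \
                                 UTRACE_EVENT(SYSCALL_EXIT))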

> All of that discussion was about the implementation perspective that is
> Linux-centric, low-level, and per-thread, considering one thread (task
> in Linuxspeak) doing a clone operation that creates another task.  In
> common terms, this encompasses two distinct kinds of things: creation
> of additional threads within a process (pthread_create et al), and
> process creation (fork/vfork).  At the utrace level, i.e. what's
> meaningful at low level in Linux, this is distinguished by the
> clone_flags parameter to the report_clone callback.  Important bits:
> 
> * CLONE_THREAD set
>   This is a new thread in the same process; child->tgid == parent->tgid.
> * CLONE_THREAD clear
>   This child has its own new thread group; child->tgid == child->pid (tid).
>   For modern use, this is the marker of "new process" vs "new thread".
> * CLONE_VM|CLONE_VFORK both set
>   This is a vfork process creation.  The parent won't return to user
>   (or syscall exit tracing) until the child dies or execs.
>   (Technically CLONE_VFORK can be set without CLONE_VM and it causes
>   the same synchronization.)
> * CLONE_VM set
>   The child shares the address space of the parent.  When set without
>   CLONE_THREAD or CLONE_VFORK, this is (ancient, unsupported)
>   linuxthreads, or apps doing their own private clone magic (happens).
> 
> For reference, old ptrace calls it "a vfork" if CLONE_VFORK is set,
> calls it "a fork" if &CSIGNAL (a mask) == SIGCHLD, and otherwise calls
> it "a clone".  With syscalls or normal glibc functions, common values
> are:
> 
> fork	 	-- just SIGCHLD or SIGCHLD|CLONE_*TID
> vfork		-- CLONE_VFORK | CLONE_VM | SIGCHLD
> pthread_create	-- CLONE_THREAD | CLONE_SIGHAND | CLONE_VM
> 		     | CLONE_FS | CLONE_FILES
> 		     | CLONE_SETTLS | CLONE_PARENT_SETTID
> 		     | CLONE_CHILD_CLEARTID | CLONE_SYSVSEM
> 
> Any different combination is some uncommon funny business.  (There are
> more known examples I won't go into here.)  But probably just keying on
> CLONE_THREAD is more than half the battle.

Hmm, OK - I'm seeing the difference between fork/vfork/pthread_create
here.  (But I get confused later...)

> For building up to the user's natural perspective on things, I like an
> organization of a few building blocks.  First, let me describe the idea
> of a "tracing group".  (For now, I'll just talk about it as a semantic
> abstraction and not get into how something would implement it per se.)
> By this I just mean a set of tasks (i.e. threads, in one or more
> processes) that you want to treat uniformly, at least in utrace
> terms.  That is, "tracing group" is the coarsest determinant of how you
> treat a thread having an event of potential interest.  In utrace terms,
> all threads in the group have the same event mask, the same ops vector,
> and possibly the same engine->data pointer.  In systemtap terms, this
> might mean all the threads for which the same probes are active in a
> given systemtap session.  The key notion is that the tracing group is
> the granularity at which we attach policy (and means of interaction,
> i.e. channels to stapio or whatnot).
> 
> In that context, I think of task creation having these components:
> 
> 1. clone event in the parent
> 
>    This is the place for up to three kinds of things to do.
>    Choices can be driven by the clone_flags and/or by inspecting
>    the kernel state of the new thread (which is shared with the parent,
>    was copied from the parent, or is freshly initialized).
> 
>    a. Decide which groups the new task will belong to.
>       i.e., if it qualifies for the group containing the parent,
>       utrace_attach it now.  Or, maybe policy says for this clone
>       we should spawn a new tracing group with a different policy.
> 
>    b. Do some cheap/nonblocking kind of notification and/or data
>       structure setup.
> 
>    c. Decide if you want to do some heavier-weight tracing on the
>       parent, and tell it to quiesce.
> 
> 2. quiesce event in the parent
> 
>    This happens if 1(c) decided it should.  (For the ptrace model, this
>    is where it just stays stopped awaiting PTRACE_CONT.)  After the
>    revamp, this will not really be different from the syscall-exit
>    event, which you might have enabled just now in the clone event
>    callback.  If you are interested in the user-level program state of
>    the parent that just forked/cloned, the kosher thing is to start
>    inspecting it here.  (The child's tid will first be visible in the
>    parent's return value register here, for example.)
> 
> 3. join-group event for the child
> 
>    This "event" is an abstract idea, not a separate thing that occurs
>    at low level.  The notion is similar to a systemtap "begin" probe.
>    The main reason I distinguish this from the clone event and the
>    child's start event (below) is to unify this way of organizing
>    things with the idea of attaching to an interesting set of processes
>    and threads already alive.  i.e., a join-group event happens when
>    you start a session that probes a thread, as well as when a thread
>    you are already probing creates another thread you choose to start
>    probing from birth.
> 
>    You can think of this as the place that installs the utrace event
>    mask for the thread, though that's intended to be implicit in the
>    idea of what a tracing group is.  This is the place where you'd
>    install any per-thread kind of tracing setup, which might include hw
>    breakpoints/watchpoints.  For the attach case, where the thread was
>    not part of an address space already represented in the tracing
>    group, this could be the place to insert breakpoints (aka uprobes).
> 
> 4. "start" event in the child
> 
>    This is not a separate low-level event, but just the first event you
>    see reported by the child.  If you said you were interested (in the
>    clone/join-group event), then this will usually be the quiesce event.
>    But note that the child's first event might be death, if it was sent
>    a SIGKILL before it had a chance to run.
> 
>    This is close to the process.start event in process.stp, but in a
>    place just slightly later where it's thoroughly kosher in utrace
>    terms.  Here is the first time it's become possible to change the new
>    thread's user_regset state.  Everything in the kernel perspective and
>    the parent's perspective about the new thread start-up has happened
>    (including CLONE_CHILD_SETTID), but the thread has yet to run its
>    first user instruction.
> 
> Now, let's describe the things that make sense to a user in terms of
> these building blocks, in the systemtap context.  I'm still just using
> this as an abstraction to describe what we do with utrace.  But it's not
> just arbitrary.  I think the "grouping" model is a good middle ground
> between the fundamental facilities we have to work with and the natural
> programmer's view for the user that we want to get to.

At this point, I'm liking your "grouping" model (although I have a few
quibbles later on).  Note that currently the grouping model doesn't
really exist - each probe has its own utrace engine, even probes on the
same pid/exe.

> Not that it would necessarily be implemented this way, but for purposes
> of discussion imagine that we have the tracing group concept above as a
> first-class idea in systemtap, and the primitive utrace events as probe
> types.  The probe types users see are done in tapsets.  A systemtap
> session starts by creating a series of tracing groups.  I think of each
> group having a set of event rules (which implies its utrace event mask).
> In systemtap, the rules are the probes active on that group.  I'll
> describe the rules that would be implicit (i.e. in tapset code, or
> whatever) and apply in addition to (before) any script probes on the
> same specific low-level events (clone/exec).

I'd think we'd want to hide the low-level stuff from users and not
expose them at the script level, but I could be talked out of it.

> When there are any global probes on utrace events, make a group we'll
> call {global}.  (Add all live threads.)  Its rules are:
> 	clone -> child joins the group
> (Down the road there may be special utrace support to optimize the
> global tracing case over universal attach.)

This basically happens now underneath and isn't available at the script
level, but there is a bug that asks for this functionality.

> For a process(PID) probe, make a group we'll call {process PID}.
> (Add all threads of live process PID.)  Its rules are:
> 	clone with CLONE_THREAD -> child joins the group
> 	clone without CLONE_THREAD -> child leaves the group
>
> Here I take a PID of 123 to refer to the one live process with tgid 123
> at the start of the systemtap session, and not any new process that
> might come to exist during the session and happen to be assigned tgid 123.

Yep, that's the way the current code works.  The current code "sort of"
treats process(PID) like a special case of process.execname.

> For a process.execname probe, make a group we'll call {execname "foo"}.
> Its rules are:
> 	clone -> child joins the group
> 	exec -> if execname !matches "foo", leave the group

Here's my quibble.

I like the process(PID) behavior you outline above, but I'm not sure I
like the difference in behavior between it and the process.execname
behavior.

Here's a concrete example to see if I'm reading you correctly.  Assume
pid 123 points to /bin/bash and I'm doing syscall tracing.  If I'm
tracing by pid, I'm not going to get syscall events between the fork and
the exec for the child.  If I'm tracing by execname, I am going to get
the syscall events between the fork and exec.

But, I certainly like the idea of tracing by 'pid' - and by 'pid' we
meant a tgid, not a tid.  So, a multi-threaded 'pid' tracing would work
as a user *meant*, but not exactly as he *said*.

> When there are any process.execname probes, then there is an implicit
> global probe on exec.  In effect, {global} also has the rule:
> 	exec -> if execname matches "foo", join group {execname "foo"}

Yes.

> The probes a user wants to think about might be:
> 
> probe process.fork.any      = probe utrace.clone if (!(clone_flags & CLONE_THREAD))
> probe process.fork.fork     = probe utrace.clone if (!(clone_flags & CLONE_VM))
> probe process.vfork         = probe utrace.clone if (clone_flags & CLONE_VFORK)
> probe process.create_thread = probe utrace.clone if (clone_flags & CLONE_THREAD)
> probe process.thread_start
> probe process.child_start
> 
> The {thread,child}_start probes would be some sort of magic for running
> in the report_quiesce callback of the new task after the report_clone
> callback in the creator decided we wanted to attach and set up to see
> that probe.

I'm lost on the difference between 'process.fork.any' and
'process.fork.fork'.  Does 'process.fork.any' include 'process.vfork'?


One more question.  Frank and I bounced a few ideas on irc the other day,
and we wondered if there was a good way on UTRACE_EVENT(DEATH) to tell
the difference between a "thread" death and "process" death?

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-29  2:12   ` David Smith
@ 2008-05-29 14:50     ` Ananth N Mavinakayanahalli
  2008-05-29 15:33       ` Ananth N Mavinakayanahalli
  2008-05-30  0:30       ` Roland McGrath
  2008-05-30  7:19     ` Roland McGrath
  1 sibling, 2 replies; 15+ messages in thread
From: Ananth N Mavinakayanahalli @ 2008-05-29 14:50 UTC (permalink / raw)
  To: David Smith; +Cc: Systemtap List, Roland McGrath

On Wed, May 28, 2008 at 03:37:01PM -0500, David Smith wrote:
> David Smith wrote:
> 
> One more question.  Frank and I bounced a few ideas on irc the other day,
> and we wondered if there was a good way on UTRACE_EVENT(DEATH) to tell
> the difference between a "thread" death and "process" death?

Thread dying => task_struct->pid != task_struct->tgid

and

Process dying => task_struct->pid == task_struct->tgid

??


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-28 20:30 ` David Smith
  2008-05-29  2:12   ` David Smith
@ 2008-05-29 15:04   ` Ananth N Mavinakayanahalli
  2008-05-30 11:21     ` Roland McGrath
  1 sibling, 1 reply; 15+ messages in thread
From: Ananth N Mavinakayanahalli @ 2008-05-29 15:04 UTC (permalink / raw)
  To: David Smith; +Cc: Systemtap List

On Wed, May 28, 2008 at 12:24:18PM -0500, David Smith wrote:
 
Roland,

> For building up to the user's natural perspective on things, I like an
> organization of a few building blocks.  First, let me describe the idea
> of a "tracing group".  (For now, I'll just talk about it as a semantic
> abstraction and not get into how something would implement it per se.)
> By this I just mean a set of tasks (i.e. threads, in one or more
> processes) that you want to treat uniformly, at least in utrace
> terms.  That is, "tracing group" is the coarsest determinant of how you
> treat a thread having an event of potential interest.  In utrace terms,
> all threads in the group have the same event mask, the same ops vector,
> and possibly the same engine->data pointer.

Isn't the above the same as what you characterize as "sharing" utrace
engines in the utrace TODO?

Something like an aggregate utrace engine will be valuable for uprobes
too.

Ananth


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-29 14:50     ` Ananth N Mavinakayanahalli
@ 2008-05-29 15:33       ` Ananth N Mavinakayanahalli
  2008-05-30  0:30       ` Roland McGrath
  1 sibling, 0 replies; 15+ messages in thread
From: Ananth N Mavinakayanahalli @ 2008-05-29 15:33 UTC (permalink / raw)
  To: David Smith; +Cc: Systemtap List, Roland McGrath

On Thu, May 29, 2008 at 05:41:14PM +0530, Ananth N Mavinakayanahalli wrote:
> On Wed, May 28, 2008 at 03:37:01PM -0500, David Smith wrote:
> > David Smith wrote:
> > 
> > One more question.  Frank and I bounced a few ideas on irc the other day,
> > and we wondered if there was a good way on UTRACE_EVENT(DEATH) to tell
> > the difference between a "thread" death and "process" death?
> 
> Thread dying => task_struct->pid != task_struct->tgid
> 
> and
> 
> Process dying => task_struct->pid == task_struct->tgid

Hmm.. I don't think the above is enough.. you may also have to check if
task_struct->thread_group is an empty list or not.


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-29 14:50     ` Ananth N Mavinakayanahalli
  2008-05-29 15:33       ` Ananth N Mavinakayanahalli
@ 2008-05-30  0:30       ` Roland McGrath
  1 sibling, 0 replies; 15+ messages in thread
From: Roland McGrath @ 2008-05-30  0:30 UTC (permalink / raw)
  To: ananth; +Cc: David Smith, Systemtap List

> Thread dying => task_struct->pid != task_struct->tgid
> 
> and
> 
> Process dying => task_struct->pid == task_struct->tgid
> 
> ??

No.

In Linux, process death is just the death of the last thread in the thread
group (process) to die.  task_struct.pid == task_struct.tgid just means
this was the initial thread.  It may have died while others still live.
This test is not part of the equation at all.

> Hmm.. I don't think the above is enough.. you may also have to check if
> task_struct->thread_group is an empty list or not.

That's called thread_group_empty().  It's not true until all other threads
have been passed to release_task().  This both has races if used here, and
does not do what you want in the face of ptrace keeping zombie threads around.


Thanks,
Roland


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-29  2:12   ` David Smith
  2008-05-29 14:50     ` Ananth N Mavinakayanahalli
@ 2008-05-30  7:19     ` Roland McGrath
  1 sibling, 0 replies; 15+ messages in thread
From: Roland McGrath @ 2008-05-30  7:19 UTC (permalink / raw)
  To: David Smith; +Cc: Systemtap List

> > Another note on report_clone: this callback is not the place to do much
> > with the child except to attach it. [...]
> Is this also true for any other events?  Currently we're using
> UTRACE_EVENT({CLONE, DEATH, EXEC, SYSCALL_ENTRY, SYSCALL_EXIT}), but
> this list could expand in the future.

There aren't hard and fast rules.  There are particular caveats in each
context.  CLONE is of course unique in having the state of the
not-quite-started child to consider.  The main part of that caveat has to
do with when you want to do things to the child, not in whether you're
holding up the parent for its own sake.  For all other events, there is
only the thread itself experiencing the event to consider.

EXEC is at a place where it's keeping some kernel resources live (binprm
and binfmt), is keeping the exec'd file from being written, and a few other
things like that.  Once you return from the callback, those things are
released.  Since you're not ever supposed to block for long in a callback,
all that should mean is to make sure you copy out anything you wanted from
those data structures.
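
In other words, something like this sketch from within the exec callback
(how the binprm reaches you is left vague since the callback arguments are
an assumption; linux_binprm does carry the filename, which is typical of
what you'd want to copy out before returning):

    /* Sketch: copy what we need out of the binprm before it is released. */
    static void note_exec_filename(const struct linux_binprm *bprm,
                                   char *buf, size_t len)
    {
            strlcpy(buf, bprm->filename, len);  /* copy it; don't keep the pointer */
    }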

DEATH is in a strange scheduling context (exit_state already set) where
you're not expected to do much, and the last bit of thread state cleanup
has not been done yet.

REAP is either in the DEATH context or is called from a wholly other
process (the parent/ptracer).

EXIT is special in that staying quiescent/stopped here is not preventing
going back to user mode, it's preventing teardown work and getting to DEATH.

The other events (SYSCALL_*, SIGNAL_*, JCTL) all by definition take place
at the safe boundary places between kernel and user.  These have the same
conditions as a thread that is fully stopped/quiescent.

I expect any future additions all to be in this last category.
It is the norm, being the most thoroughly unrestricted case.

> At this point, I'm liking your "grouping" model (although I have a few
> quibbles later on).  Note that currently the grouping model doesn't
> really exist - each probe has its own utrace engine, even probes on the
> same pid/exe.

The discussion so far is only on the abstraction to describe what we're
doing.  I'm leaving related implementation details for later.  The focus
now is on getting the "middle ground" abstraction to fully make sense and
fit as the basis from which to describe what we mean in concrete low-level
terms by the user features.

> I'd think we'd want to hide the low-level stuff from users and not
> expose them at the script level, but I could be talked out of it.

I'm also not really talking about systemtap feature details here.  
The low-level script constructs are just a strawman for discussing the
mapping to the middle layer.  It's easier to refer to approximate
systemtap syntax and handwaving than just contextless pseudocode and
handwaving.  (If I were talking about systemtap features, I would
indeed talk you out of it.  But that is not the discussion I'm trying
to have right now.)

> Here's my quibble.
> 
> I like the process(PID) behavior you outline above, but I'm not sure I
> like the difference in behavior between it and the process.execname
> behavior.
> 
> Here's a concrete example to see if I'm reading you correctly.  Assume
> pid 123 points to /bin/bash and I'm doing syscall tracing.  If I'm
> tracing by pid, I'm not going to get syscall events between the fork and
> the exec for the child.  If I'm tracing by execname, I am going to get
> the syscall events between the fork and exec.
> 
> But, I certainly like the idea of tracing by 'pid' - and by 'pid' we
> meant a tgid, not a tid.  So, a multi-threaded 'pid' tracing would work
> as a user *meant*, but not exactly as he *said*.

Whether it's a difference really depends on what you meant by sameness,
or put another way, what you meant to say the user said or meant depends
on what you meant the given saying to mean or thought the meaning said.
(Yes, I'm throwing in a bonus Tylenol with that one.)

As per the caveat I mentioned, by PID we really meant "process".  That
is, the PID we said at the time of the start of the session meant the
particular process identity then identified by that PID.  

So, the question is what were we saying by a process(PID).foo probe?

I presumed in my description the intended semantics was in the style of
a predicate.  That is, "When a foo happens, was it in process(PID)?"
This is the same behavior as process.execname, which asks, "When a foo
happens, was it in a process with execname 'bar'?"

When a process exec's "baz", it no longer meets the "execname is 'bar'"
predicate, so those probes don't apply.  When a process forks a child,
the child does not meet the "process identity is PID" predicate, so
those probes don't apply.  It's the same behavior.

The other thing you're describing is a "this process and its children"
predicate, which is not what I ever thought you meant with process(PID).
Still, I'm not quibbling about what the systemtap syntax should mean.  I
go into this detail just so we'll clarify exactly what we're describing
with our casual references.  In fact, I guess what you really think
makes sense is a "this process and its children until they exec"
predicate, which to me is a fairly nonobvious definition of process(PID)
to have been implying without describing it.

All three of those things are fairly easy to represent in the tracing
group model, which was the true point of the example.  The process and
its children style is obviously just a child-joins-group rule in the
clone/fork event.  

For your fancy idea where a process doesn't mean a process or its
children but means those which haven't exec'd, the representation
depends on which of two ways you mean to construe that.  If you mean
that process(PID) ceases to match the original PID process's own events
once it execs, then it's a simple matter of an exec rule for the group.
If you instead mean a more complex behavior where process(PID) always
means the original PID but means its children only until they exec, then
it involves a second group for the children, since their rule for exec
is different from the parent's.

The former of those two seems most sensical to me, since it corresponds
to having the same or copied user memory/state so that user-level
program state (variables, pointers) is of a piece for everything in that
tracing group.  Anyway, I'd still prefer to separate the choices for
systemtap features from the fundamental discussion of this new middle
layer idea.  My main focus at the moment is to iron out the tracing
group model so we're confident it is a sound basis to facilitate
implementing whatever choices of user-visible features we end up with.
I'm less concerned with exactly what you mean the systemtap language to
mean, and more with understanding all the options for what it means
precisely enough to ensure my model is a good platform to express them.

> I'm lost on the difference between 'process.fork.any' and
> 'process.fork.fork'.  Does 'process.fork.any' include 'process.vfork'?

Yes.  The difference is just exactly what it says there: the tests on
clone_flags.  It's not a suggestion for systemtap features, it's an
illustration of the distinctions that might make sense from the
application programmer's perspective, and how those would map to
filters on the underlying event.

> One more question.  Frank and I bounced a few ideas on irc the other day,
> and we wondered if there was a good way on UTRACE_EVENT(DEATH) to tell
> the difference between a "thread" death and "process" death?

At the moment, there are two easy methods that are racy in different
ways.  But, the kernel has the answer on hand right there and it would
be trivial to pass it down.  In the coming version of the interface,
I'll give the report_death callback an argument to tell you.  For the
moment, I'd say test atomic_read(&task->signal->live) == 0 and don't
worry about the race.
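
As a sketch, in a report_death-style callback that check is just the
following (the helper name is invented; signal->live is the real counter
of threads in the group that have not yet gone through exit):

    /* Racy but usually fine, per the above: was this the last thread? */
    static bool stap_is_process_death(struct task_struct *task)
    {
            return atomic_read(&task->signal->live) == 0;
    }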


Thanks,
Roland


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-29  1:37 ` Frank Ch. Eigler
@ 2008-05-30  8:03   ` Roland McGrath
  2008-06-02 20:31     ` Frank Ch. Eigler
  0 siblings, 1 reply; 15+ messages in thread
From: Roland McGrath @ 2008-05-30  8:03 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Systemtap List

> Does this make sense as the script-level view?

I think what you've described maps well to the formulation I presented.
The one area where it's hazy to me what precisely you intend is what in
debugging parlance is the run vs attach issue.  Here I mean start-up of a
process/thread observed by systemtap (run) vs starting a systemtap session
meant to apply to existing processes/threads.

The existing sense of plain begin/end probes in systemtap is the start and
finish of the systemtap session.  So one can construe a process(...).begin
or .thread.begin probe to mean "the process/thread joined the session",
i.e. either creation during the session or the start of the session when
the applicable process/thread already existed.  (And likewise, .end being
both death and session-end.)

It sounded a little more like you meant .begin just to be newborn-child and
.end just to be death, but as I said it's hazy to me.  

The latter sense doesn't really make sense for a process(PID).begin probe,
since by definition the PID process already exists before the session
starts.  Also, I think any sense that excludes session start/end, for a
thread that is live throughout, is sufficiently disparate from the meaning
of the global begin/end probe in a subtle way that it is very dubious to use
the same names.  However, I am also hazy on the desirability of always
conflating the thread lifetime with the session lifetime.

Can you elaborate more precisely on your ideas for the by-PID cases?
What do you think about the whole thread lifetime vs session lifetime issue?


Thanks,
Roland


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-29 15:04   ` Ananth N Mavinakayanahalli
@ 2008-05-30 11:21     ` Roland McGrath
  0 siblings, 0 replies; 15+ messages in thread
From: Roland McGrath @ 2008-05-30 11:21 UTC (permalink / raw)
  To: ananth; +Cc: David Smith, Systemtap List

> Isn't the above the same as what you characterize as "sharing" utrace
> engines in the utrace TODO?
> 
> Something like an aggregate utrace engine will be valuable for uprobes
> too.

All the ideas do tie together, yes.  If you want to follow up on this
angle, please do it on the utrace-devel@redhat.com mailing list (and don't
cross-post).  This is getting into esoteric details that don't much impinge
on the way systemtap looks at things, but certainly deserve some
elaboration on utrace-devel.


Thanks,
Roland


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-05-30  8:03   ` Roland McGrath
@ 2008-06-02 20:31     ` Frank Ch. Eigler
  2008-06-02 21:49       ` Roland McGrath
  0 siblings, 1 reply; 15+ messages in thread
From: Frank Ch. Eigler @ 2008-06-02 20:31 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Systemtap List

Hi -

On Thu, May 29, 2008 at 06:32:09PM -0700, Roland McGrath wrote:

> [...]  It sounded a little more like you meant .begin just to be
> newborn-child and .end just to be death, but as I said it's hazy to
> me.

Yes, that makes sense for the script level.

> The latter sense doesn't really make sense for a process(PID).begin
> probe, since by definition the PID process already exists before the
> session starts.  [...]

Right.  (I wonder if there is some conceivable utility to a notation
that allows probing future processes by forecast PID numbers.)

- FChE


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-06-02 20:31     ` Frank Ch. Eigler
@ 2008-06-02 21:49       ` Roland McGrath
  2008-06-02 23:11         ` Frank Ch. Eigler
  0 siblings, 1 reply; 15+ messages in thread
From: Roland McGrath @ 2008-06-02 21:49 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Systemtap List

> > [...]  It sounded a little more like you meant .begin just to be
> > newborn-child and .end just to be death, but as I said it's hazy to
> > me.
> 
> Yes, that makes sense for the script level.

Why does it make sense for .begin to mean this, in contrast to what global
begin probes mean?  You did not really say anything at all about the whole
issue I raised.


Thanks,
Roland


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-06-02 21:49       ` Roland McGrath
@ 2008-06-02 23:11         ` Frank Ch. Eigler
  2008-06-27 20:52           ` Roland McGrath
  0 siblings, 1 reply; 15+ messages in thread
From: Frank Ch. Eigler @ 2008-06-02 23:11 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Systemtap List

Hi -

On Mon, Jun 02, 2008 at 01:12:13PM -0700, Roland McGrath wrote:
> > > [...]  It sounded a little more like you meant .begin just to be
> > > newborn-child and .end just to be death, but as I said it's hazy to
> > > me.
> > 
> > Yes, that makes sense for the script level.
> 
> Why does it make sense for .begin to mean this, in contrast to what global
> begin probes mean?  

Because these are process-begin and process-end probes.  If we
restrict PID-identified probes to pre-existing processes, then
process(PID).begin probes would not get called and might as well not
be supported by the syntax.

> You did not really say anything at all about the whole issue I raised.

Sorry, I must have missed it.

- FChE


* Re: what does 'probe process(PID_OR_NAME).clone' mean?
  2008-06-02 23:11         ` Frank Ch. Eigler
@ 2008-06-27 20:52           ` Roland McGrath
  0 siblings, 0 replies; 15+ messages in thread
From: Roland McGrath @ 2008-06-27 20:52 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Systemtap List

> On Mon, Jun 02, 2008 at 01:12:13PM -0700, Roland McGrath wrote:
> > > > [...]  It sounded a little more like you meant .begin just to be
> > > > newborn-child and .end just to be death, but as I said it's hazy to
> > > > me.
> > > 
> > > Yes, that makes sense for the script level.
> > 
> > Why does it make sense for .begin to mean this, in contrast to what global
> > begin probes mean?  
> 
> Because these are process-begin and process-end probes.  If we
> restrict PID-identified probes to pre-existing processes, then
> process(PID).begin probes would not get called and might as well not
> be supported by the syntax.
> 
> > You did not really say anything at all about the whole issue I raised.
> 
> Sorry, I must have missed it.

The two different quoted bits from you seem to say contradictory things.
So I think we were confused.  I don't think it matters to figure out what
we thought we were saying.  It doesn't even much matter to me exactly what
any of us thinks the systemtap syntax is or what it actually does today.
I think I've established that my whole "tracing group" model of things
will fit well to supply a vocabulary in which to describe systemtap
features.  I hope some discussions starting on utrace-devel@redhat.com
will lead to formalizing that vocabulary and starting to put together
some code that systemtap will one day reuse.


Thanks,
Roland


Thread overview: 15+ messages
2008-05-28 19:38 what does 'probe process(PID_OR_NAME).clone' mean? David Smith
2008-05-28 20:30 ` David Smith
2008-05-29  2:12   ` David Smith
2008-05-29 14:50     ` Ananth N Mavinakayanahalli
2008-05-29 15:33       ` Ananth N Mavinakayanahalli
2008-05-30  0:30       ` Roland McGrath
2008-05-30  7:19     ` Roland McGrath
2008-05-29 15:04   ` Ananth N Mavinakayanahalli
2008-05-30 11:21     ` Roland McGrath
2008-05-29  1:37 ` Frank Ch. Eigler
2008-05-30  8:03   ` Roland McGrath
2008-06-02 20:31     ` Frank Ch. Eigler
2008-06-02 21:49       ` Roland McGrath
2008-06-02 23:11         ` Frank Ch. Eigler
2008-06-27 20:52           ` Roland McGrath
