public inbox for systemtap@sourceware.org
* [Bug translator/13187] New: Reconsider the semantics of process(number).thread.begin/end
@ 2011-09-13 21:11 brolley at redhat dot com
  2011-09-14 14:31 ` [Bug translator/13187] " dsmith at redhat dot com
  2016-05-26 17:57 ` fche at redhat dot com
  0 siblings, 2 replies; 5+ messages in thread
From: brolley at redhat dot com @ 2011-09-13 21:11 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=13187

             Bug #: 13187
           Summary: Reconsider the semantics of
                    process(number).thread.begin/end
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: translator
        AssignedTo: systemtap@sourceware.org
        ReportedBy: brolley@redhat.com
                CC: dsmith@redhat.com, fche@redhat.com
    Classification: Unclassified


The current semantics of process(number).thread.begin and
process(number).thread.end are to select threads with the given task id and to
fire probes as these threads begin and end respectively.

While this is logical, given the internal semantics of process(number).begin
and process(number).end, which fire probes at the corresponding beginning and
end of the main threads of a process, it is not very useful. This is because
the task ids of individual threads within a process are difficult, if not
impossible, to predict in advance. The result is that these probes serve no
reasonably useful purpose.

The difference that makes process(number).begin and process(number).end
useful is that, for these probes, the task id matches the process id, which is
easily obtained by the user.

Other variants of thread.begin and thread.end are useful because of additional
semantics:

- process("PATH").thread.begin and process("PATH").thread.end fire at the
beginning and end of each child thread of processes corresponding to the given
path. From a user's point of view this represents all child threads beginning
and ending within the processes associated with that path.

- the commands
    stap -e 'process.thread.begin {}' -c PATH
and
    stap -e 'process.thread.end {}' -c PATH
have a similar effect with the firing of the probes at the beginning and end of
each child thread within the process started by running the executable at PATH.
Once again, from the user's point of view, this represents all child threads
beginning and ending within that process.

So, when a user wants to probe the same thing against an already-running
process, it would seem reasonable that process(PID).thread.begin and
process(PID).thread.end might do the same thing. That is, fire at the beginning
and end of each child thread within the process PID. After all, she has simply
substituted the process ID for the path used in the previous examples. In other
words, she has attempted to identify the target process(es) in a different
manner.

However, for a running process, these probes will currently never fire, because
the task ids of the child threads do not match the given process id.

I propose that the process(PID).thread.begin and process(PID).thread.end probes
adopt the same "follow all children" semantics as the other thread.begin and
thread.end variants: they should fire at the beginning and end of each child
thread of the given process.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


* [Bug translator/13187] Reconsider the semantics of process(number).thread.begin/end
  2011-09-13 21:11 [Bug translator/13187] New: Reconsider the semantics of process(number).thread.begin/end brolley at redhat dot com
@ 2011-09-14 14:31 ` dsmith at redhat dot com
  2016-05-26 17:57 ` fche at redhat dot com
  1 sibling, 0 replies; 5+ messages in thread
From: dsmith at redhat dot com @ 2011-09-14 14:31 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=13187

--- Comment #1 from David Smith <dsmith at redhat dot com> 2011-09-14 14:28:23 UTC ---
(In reply to comment #0)
> However, for a running process, these probes will currently never fire, because
> the task ids of the child threads do not match the given process id.

There is a situation where process(PID).thread.begin probes will fire.

- Start a process that creates several threads that will run for an extended
period of time.  Determine the pids of those threads.
- Run systemtap with a script that probes those pids.
- As systemtap attaches to those threads, the process(PID).thread.begin probe
will fire.
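
One way to carry out the "determine the pids of those threads" step is to
read /proc/PID/task, which on Linux holds one entry per thread, named by its
task id. A minimal sketch in Python, not taken from the original comment; the
helper name list_task_ids is made up for illustration, and it returns an
empty list wherever /proc is not available:

```python
import os
import sys


def list_task_ids(pid):
    """Return the task ids of all threads in process `pid`, or [] if unavailable."""
    task_dir = "/proc/%d/task" % pid
    try:
        # Each entry under /proc/PID/task is named after one thread's task id.
        return sorted(int(name) for name in os.listdir(task_dir))
    except OSError:
        # No /proc (non-Linux), no such process, or not permitted to look.
        return []


if __name__ == "__main__":
    # Default to this script's own process when no PID is given.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    for tid in list_task_ids(pid):
        print(tid)
```

Under the current semantics, each id printed for a running process is a
candidate NUMBER for a process(NUMBER).thread.begin probe.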


In my opinion, changing the semantics here is too big/messy of a change. 
Instead, a new probe type, something like 'process(PATH/PID).thread.create',
could be created.


* [Bug translator/13187] Reconsider the semantics of process(number).thread.begin/end
  2011-09-13 21:11 [Bug translator/13187] New: Reconsider the semantics of process(number).thread.begin/end brolley at redhat dot com
  2011-09-14 14:31 ` [Bug translator/13187] " dsmith at redhat dot com
@ 2016-05-26 17:57 ` fche at redhat dot com
  1 sibling, 0 replies; 5+ messages in thread
From: fche at redhat dot com @ 2016-05-26 17:57 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=13187

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #2 from Frank Ch. Eigler <fche at redhat dot com> ---
neat idea, but no recent need


* Re: [Bug translator/13187] Reconsider the semantics of process(number).thread.begin/end
  2011-09-14 18:23 Dave Brolley
@ 2011-09-21 19:46 ` David Smith
  0 siblings, 0 replies; 5+ messages in thread
From: David Smith @ 2011-09-21 19:46 UTC (permalink / raw)
  To: Dave Brolley; +Cc: systemtap

On 09/14/2011 01:23 PM, Dave Brolley wrote:

Sorry for the delay in responding to this.  I'm also sorry for the
incomplete response.  But here goes with what I've got.

... stuff deleted ...

Let's start with a discussion of 'PID'.  What does that exactly mean?
You probably already know this, I just want to get our terminology straight.

From a user's point of view, there are process IDs (returned by
getpid()) and thread IDs (returned by gettid()).  The gettid() man page
states:

    In a single-threaded process, the thread ID is equal to the
    process ID (PID, as returned  by getpid(2)).  In a
    multithreaded process, all threads have the same PID,
    but each one has a unique TID.

Here's where things get confusing.  From the kernel's point of view, the
thread ID is called 'pid' in the task structure and the process ID is
called 'tgid' (thread group id) in the task structure.  (It happened
this way for historical reasons.)

To avoid confusion, I'm going to call an individual thread a 'task' and
the group of tasks with the same 'tgid' value a 'thread group'.
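
The user-space side of this terminology can be observed directly. A minimal
sketch in Python, not part of the original message, assuming Python 3.8 or
later, where threading.get_native_id() is available (on Linux it reports the
kernel task id, i.e. what gettid() returns). The helper name observe_ids is
made up for illustration:

```python
import os
import threading


def observe_ids():
    """Return (process id, process id seen by a child thread, child thread id)."""
    seen = {}

    def worker():
        # Runs in a second thread of the same process (same thread group).
        seen["pid"] = os.getpid()                # user-visible PID, kernel 'tgid'
        seen["tid"] = threading.get_native_id()  # user-visible TID, kernel 'pid'

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return os.getpid(), seen["pid"], seen["tid"]


if __name__ == "__main__":
    pid, child_pid, child_tid = observe_ids()
    print("process id:", pid)
    print("child thread sees pid", child_pid, "and tid", child_tid)
```

The child thread reports the same process id as the rest of the process, but
a task id of its own.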

[Here's something I've rediscovered about 'process(PID).*'
probes.  The task_finder only looks for processes in its initial pass.
So, after stap starts up, it looks for all the PID probes, then never
looks for them again.  This was actually done on purpose, since task ids
are non-predictable.  If you ask us to probe a PID, it finishes and
later a completely different process starts and ends up with the same
PID (because PIDs will wrap after a certain number), we didn't want to
accidentally process that.  The thought was that if you're probing
specific PIDs, you meant those very tasks, not random ones that might
come along later.]

While researching this, I've discovered a bug in the task_finder.  Where
we're talking about 'process(PID).*' probes, there is actually
some confusion in the task_finder code about what PID means (which
actually relates to your question here).  This command:

  stap -x PID -e 'probe process.* {}'

treats PID differently than this one:

  stap -e 'probe process(PID).* {}'

In the first one, we look for a thread group id value of PID. In the
second case, we look for a task id value of PID.
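
The difference between the two lookups can be pictured with a toy model. This
is only a sketch in Python of the behavior described above, not the
task_finder code; the Task class, the two helper functions, and the task list
are all hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    pid: int   # kernel 'pid': the task (thread) id
    tgid: int  # kernel 'tgid': the thread group (process) id


def match_thread_group(tasks, number):
    """Model of 'stap -x PID': select every task whose thread group id is PID."""
    return [t for t in tasks if t.tgid == number]


def match_task(tasks, number):
    """Model of 'process(PID)': select only the task whose own id is PID."""
    return [t for t in tasks if t.pid == number]


# Hypothetical thread group 1234 with two child threads, plus an unrelated task.
TASKS = [Task(1234, 1234), Task(1235, 1234), Task(1236, 1234), Task(2000, 2000)]

if __name__ == "__main__":
    print([t.pid for t in match_thread_group(TASKS, 1234)])  # [1234, 1235, 1236]
    print([t.pid for t in match_task(TASKS, 1234)])          # [1234]
```

In this model the thread group lookup reaches the child threads, while the
task id lookup stops at the main thread.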

> I think that there is a lack of orthogonality in the current
> implementation that is confusing. At least it is for me.
> 
> 1) stap -e 'process("PATH").thread.begin {}' catches *all* child threads
> of *processes* (not tasks) identified by PATH, *as they start*


Your "as they start" distinction is not correct.
'process("PATH").begin' and 'process("PATH").thread.begin' probes will
fire when attaching to existing tasks.

So, if you run "stap -e 'process("PATH").thread.begin {}'", that
.thread.begin probe will fire for all existing threads whose execname is
PATH.  The .thread.begin probe will later fire for all new threads whose
execname is PATH also.

Also note here that the .thread.begin probes fire in the context of the
new thread, not the parent thread.  If your probe body was something like:

    probe process("PATH").thread.begin {
        printf("pid %d tid %d\n", pid(), tid())
    }

You'll see 2 different numbers there.

[Internally, the task_finder first loops through all existing tasks,
looking for PATH and also attaches a probe to every task in the system,
to help in monitoring new threads.]

> 2) stap -e 'process.thread.begin {}' -c CMD catches *all* child threads
> of the specific *process* (not task) created by running CMD, *as they
> start*


To avoid any misunderstandings, let's break this one down:

- 'process.thread.begin' means we're interested in *all* threads in the
system.

- The '-c CMD' (which basically devolves into a '-x PID') means we're
*only* interested in PID.  [See 'NOTE 1' for more '-c CMD' details.]

We resolve this conflict by being interested in all threads started by PID.

[Internally, the task_finder first loops through all existing tasks,
looking for PID and only attaching a probe to PID that monitors for new
threads.]

> whereas
> 
> 3) stap -e 'process(NUMBER).thread.begin {}': catches only the thread
> with task id equal to NUMBER.
> 
> The behavior of variant 3 is not intuitive at all, to me, given the
> behavior of the other two variants, combined with the name of the probe
> itself being process(NUMBER), and not task(NUMBER).
> 
> Furthermore, the number in
> 
>   process(number).statement(stmtnumber).absolute
>   process(number).statement(stmtnumber).absolute.return
>   process(number).syscall
>   process(number).syscall.return
> 
> refers to the process id and these probes all fire in the main thread of
> the process with that id and also in its children. i.e. it simply
> identifies the target process. So why should the number in not also
> refer to the process id?


I haven't tested this, but I'd bet you are incorrect there about
'process(PID).syscall' probes.  I'll bet they only apply to task id PID,
not thread group id PID.

... more stuff deleted ...

I'm going to have to ignore the rest of this email for now, or I'll
never send this response.


NOTE 1: Here's what happens when you use 'stap -c CMD'

- stap passes the '-c CMD' argument down to staprun

- staprun loads the module and passes the '-c CMD' arg down to stapio

- stapio runs the command and eventually sends the pid to the module

There ends up being little difference (except a little timing) between
"stap -c CMD foo.stp" and "CMD; stap -x PID foo.stp".

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


* Re: [Bug translator/13187] Reconsider the semantics of process(number).thread.begin/end
@ 2011-09-14 18:23 Dave Brolley
  2011-09-21 19:46 ` David Smith
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Brolley @ 2011-09-14 18:23 UTC (permalink / raw)
  To: systemtap

No reason why this discussion should not be public. The discussion 
started on IRC within Red Hat when I tried to test the 
process.thread.begin/end variants and ran into what was, for me anyway, 
unexpected semantics. The scenario is an already-running process, foo, 
with process id 1234.

stap -e 'probe process("foo").thread.begin,process("foo").thread.end {}' 
catches each child thread of foo as they begin and end, as expected.

I expected stap -e 'probe 
process(1234).thread.begin,process(1234).thread.end {}' to do the same, 
but it does not. Instead, it never fires, because the current semantics 
are to probe the child thread with task id == 1234, which doesn't exist.

My proposal for new semantics and David Smith's response are in PR
13187. My subsequent reply to David is below.

Feel free to add your thoughts and opinions!

--------------------------------------------------------

On 09/14/2011 10:28 AM, dsmith at redhat dot com wrote:
>
>  There is a situation where process(PID).thread.begin probes will fire.
>
>  - Start a process that creates several threads that will run for an extended
>  period of time.  Determine the pids of those threads.
>  - Run systemtap with a script that probes those pids.
>  - As systemtap attaches to those threads, the process(PID).thread.begin probe
>  will fire.
I think that you're getting caught up in the notion that the PID here
refers to a task because that happens to be the current implementation.
Sure we can jump through some hoops and get this probe type to fire for
a particular type of application (long running threads) in a particular
situation (want to probe one thread in particular, probe fires at some
random point in the already-running thread), but is this really the most
useful and most intuitive interpretation of this probe type from the
user's point of view?

I think that there is a lack of orthogonality in the current
implementation that is confusing. At least it is for me.

1) stap -e 'process("PATH").thread.begin {}' catches *all* child threads
of *processes* (not tasks) identified by PATH, *as they start*
2) stap -e 'process.thread.begin {}' -c CMD catches *all* child threads
of the specific *process* (not task) created by running CMD, *as they start*

whereas

3) stap -e 'process(NUMBER).thread.begin {}': catches only the thread
with task id equal to NUMBER.

The behavior of variant 3 is not intuitive at all, to me, given the
behavior of the other two variants, combined with the name of the probe
itself being process(NUMBER), and not task(NUMBER).

Furthermore, the number in

   process(number).statement(stmtnumber).absolute
   process(number).statement(stmtnumber).absolute.return
   process(number).syscall
   process(number).syscall.return

refers to the process id and these probes all fire in the main thread of
the process with that id and also in its children. i.e. it simply
identifies the target process. So why should the number in the following
probes not also refer to the process id?

   process(number).thread.begin
   process(number).thread.end
   process(number).begin
   process(number).end

The only difference should be that the latter two refer only to the main
thread and the first two refer only to the child threads, as they do for
the other begin/end and thread.begin/end variants.

It seems to me that the thread.begin family of probes was intended to
provide for the probing of threading activity within a process (as
opposed to probing of a particular thread/task) and that providing a
*process* (not task) NUMBER instead of a *process* PATH or a -c option
shouldn't change this. i.e. the purpose of providing process(NUMBER)
probes is to allow the specification of an already-running *process* as
the target and not a specific thread/task id.

The point of my proposal in the BZ is that simply having variant 3 use
the same "follow all child threads" semantics as the other two would
accomplish this. As fche pointed out, this is exactly what variant 2
does for the *process* id generated by -c PATH, so why deny it for the
*process* id provided by NUMBER?

>  In my opinion, changing the semantics here is too big/messy of a change.
>  Instead, a new probe type, something like 'process(PATH/PID).thread.create',
>  could be created.
I see it the other way. I think that the current
process(NUMBER).thread.begin/end should behave like its counterparts and
that if you really want to probe a specific thread/task, you can:

1) filter on the task id or
2) create something like task(TID).begin/end or
process(PATH/PID).thread(TID).begin/end.
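
Option 1 can be pictured as a handler that runs for every child thread but
acts only on the task id of interest. This is a toy sketch in Python of that
idea, not SystemTap code; the function names and the thread ids are
hypothetical:

```python
def fire_thread_begin(thread_ids, handler):
    """Toy model of the proposed semantics: run handler for every child thread."""
    for tid in thread_ids:
        handler(tid)


def only_tid(wanted_tid, action):
    """Wrap action so that it runs only for the one task id of interest."""
    def handler(tid):
        if tid == wanted_tid:
            action(tid)
    return handler


if __name__ == "__main__":
    # Hypothetical child thread ids of one process.
    fire_thread_begin([1235, 1236, 1237], only_tid(1236, print))
```

In a real script the same effect would come from comparing tid() against the
wanted task id inside the probe body.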

Note also, that process(PATH).thread.create would duplicate
functionality already provided by process(PATH).thread.begin.

At the end of the day, don't we want what is most flexible, useful and
intuitive for the end user?

With my proposed change:
- useful: probes which are possible with the current implementation will
still be possible after the change (via filtering)
- useful and flexible: a whole range of probes on already-running
processes, which are now impossible would become possible
- intuitive: the NUMBER in process(NUMBER).thread.begin would refer to a
process id as the name of the probe suggests

As for the concern about changing the semantics: the new semantics are a
superset of the current semantics and, given the current semantics, I
doubt that anyone is currently using process(NUMBER).thread.begin/end at
all.

I welcome your further thoughts on this!

Thanks,
Dave


