public inbox for systemtap@sourceware.org
From: David Smith <dsmith@redhat.com>
To: Dave Brolley <brolley@redhat.com>
Cc: systemtap@sourceware.org
Subject: Re: [Bug translator/13187] Reconsider the semantics of process(number).thread.begin/end
Date: Wed, 21 Sep 2011 19:46:00 -0000
Message-ID: <4E7A3F00.8060207@redhat.com>
In-Reply-To: <4E70F10B.2080201@redhat.com>

On 09/14/2011 01:23 PM, Dave Brolley wrote:

Sorry for the delay in responding to this.  I'm also sorry for the
incomplete response.  But here goes with what I've got.

... stuff deleted ...

Let's start with a discussion of 'PID'.  What exactly does that mean?
You probably already know this; I just want to get our terminology straight.

From a user's point of view, there are process IDs (returned by
getpid()) and thread IDs (returned by gettid()).  The gettid() man page
states:

    In a single-threaded process, the thread ID is equal to the
    process ID (PID, as returned  by getpid(2)).  In a
    multithreaded process, all threads have the same PID,
    but each one has a unique TID.

Here's where things get confusing.  From the kernel's point of view, the
thread ID is called 'pid' in the task structure and the process ID is
called 'tgid' (thread group id) in the task structure.  (It happened
this way for historical reasons.)

To avoid confusion, I'm going to call an individual thread a 'task' and
the group of tasks with the same 'tgid' value a 'thread group'.
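
[To make the terminology concrete, here is a rough sketch in Python
rather than SystemTap script, assuming Linux and Python 3.8 or later.
The helper names ids() and ids_in_new_thread() are invented for this
example.  It reports the process ID and thread ID as seen from the main
thread and from a second thread in the same thread group.]

```python
import os
import threading


def ids():
    """Return (process ID, thread ID) as seen by the calling thread.

    os.getpid() corresponds to the kernel's 'tgid' field.
    threading.get_native_id() corresponds to the kernel's 'pid' field
    (on Linux it is the value gettid() returns).
    """
    return os.getpid(), threading.get_native_id()


def ids_in_new_thread():
    """Start one extra thread and return the (PID, TID) pair it observes."""
    result = []
    worker = threading.Thread(target=lambda: result.append(ids()))
    worker.start()
    worker.join()
    return result[0]


if __name__ == "__main__":
    print("main thread: pid %d tid %d" % ids())
    print("new thread:  pid %d tid %d" % ids_in_new_thread())
```

Both lines should report the same process ID (the thread group id),
while the thread ID printed for the new thread differs from the one
printed for the main thread.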

[Here's something I've rediscovered about 'process(PID).*'
probes.  The task_finder only looks for processes in its initial pass.
So, after stap starts up, it looks for all the PID probes, then never
looks for them again.  This was actually done on purpose, since task ids
are unpredictable.  If you ask us to probe a PID and that process
finishes, and later a completely different process starts and ends up
with the same PID (because PIDs wrap after a certain number), we didn't
want to accidentally probe that one.  The thought was that if you're
probing specific PIDs, you meant those very tasks, not random ones that
might come along later.]

While researching this, I've discovered a bug in the task_finder.  Where
we're talking about 'process(PID).*' probes, there is actually
some confusion in the task_finder code about what PID means (which
actually relates to your question here).  This command:

  stap -x PID -e 'probe process.* {}'

treats PID differently than this one:

  stap -e 'probe process(PID).* {}'

In the first one, we look for a thread group id value of PID. In the
second case, we look for a task id value of PID.
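
[The difference can be pictured with a toy model.  This is a Python
sketch of the two matching rules as described above, not the actual
task_finder code; the Task type and the two match_* functions are
hypothetical names made up for the illustration.]

```python
from typing import NamedTuple


class Task(NamedTuple):
    pid: int    # kernel task id (what gettid() returns)
    tgid: int   # kernel thread group id (what getpid() returns)


def match_x_option(tasks, number):
    """Toy model of 'stap -x PID': compare against the thread group id."""
    return [t for t in tasks if t.tgid == number]


def match_process_probe(tasks, number):
    """Toy model of 'process(PID).*': compare against the task id."""
    return [t for t in tasks if t.pid == number]


# One thread group (tgid 100) with three tasks; the leader has pid == tgid.
tasks = [
    Task(pid=100, tgid=100),
    Task(pid=101, tgid=100),
    Task(pid=102, tgid=100),
]

if __name__ == "__main__":
    print(match_x_option(tasks, 100))       # all three tasks
    print(match_process_probe(tasks, 100))  # only the group leader
    print(match_x_option(tasks, 101))       # empty: 101 is not a tgid
    print(match_process_probe(tasks, 101))  # only the task with pid 101
```

With the same number, the two rules select different sets of tasks
whenever the thread group has more than one task, which is the
inconsistency described above.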

> I think that there is a lack of orthogonality in the current
> implementation that is confusing. At least it is for me.
> 
> 1) stap -e 'process("PATH").thread.begin {}' catches *all* child threads
> of *processes* (not tasks) identified by PATH, *as they start*


Your "as they start" distinction is not correct.
'process("PATH").begin' and 'process("PATH").thread.begin' probes will
fire when attaching to existing tasks.

So, if you run "stap -e 'probe process("PATH").thread.begin {}'", that
.thread.begin probe will fire for all existing threads whose execname is
PATH.  It will also fire later for all new threads whose execname is
PATH.

Also note here that the .thread.begin probes fire in the context of the
new thread, not the parent thread.  If your probe body was something like:

    probe process("PATH").thread.begin {
        printf("pid %d tid %d\n", pid(), tid())
    }

For any thread other than the thread group leader, you'll see 2
different numbers there.

[Internally, the task_finder first loops through all existing tasks,
looking for PATH and also attaches a probe to every task in the system,
to help in monitoring new threads.]

> 2) stap -e 'process.thread.begin {}' -c CMD catches *all* child threads
> of the specific *process* (not task) created by running CMD, *as they
> start*


To avoid any misunderstandings, let's break this one down:

- 'process.thread.begin' means we're interested in *all* threads in the
system.

- The '-c CMD' (which basically devolves into a '-x PID') means we're
*only* interested in PID.  [See 'NOTE 1' for more '-c CMD' details.]

We resolve this conflict by being interested in all threads started by PID.

[Internally, the task_finder first loops through all existing tasks,
looking for PID and only attaching a probe to PID that monitors for new
threads.]

> whereas
> 
> 3) stap -e 'process(NUMBER).thread.begin {}': catches only the thread
> with task id equal to NUMBER.
> 
> The behavior of variant 3 is not intuitive at all, to me, given the
> behavior of the other two variants, combined with the name of the probe
> itself being process(NUMBER), and not task(NUMBER).
> 
> Furthermore, the number in
> 
>   process(number).statement(stmtnumber).absolute
>   process(number).statement(stmenumber).absolute.return
>   process(number).syscall
>   process(number).syscall.return
> 
> refers to the process id and these probes all fire in the main thread of
> the process with that id and also in its children. i.e. it simply
> identifies the target process. So why should the number in not also
> refer to the process id?


I haven't tested this, but I'd bet you are incorrect there about
'process(PID).syscall' probes.  I'll bet they only apply to task id PID,
not thread group id PID.

... more stuff deleted ...

I'm going to have to ignore the rest of this email for now, or I'll
never send this response.


NOTE 1: Here's what happens when you use 'stap -c CMD'

- stap passes the '-c CMD' argument down to staprun

- staprun loads the module and passes the '-c CMD' arg down to stapio

- stapio runs the command and eventually sends the pid to the module

There ends up being little difference (except a little timing) between
"stap -c CMD foo.stp" and "CMD; stap -x PID foo.stp".

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)
