public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* [Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce
@ 2011-04-06  6:05 raysonlogin at gmail dot com
  2012-10-29 12:57 ` [Bug runtime/12642] " fche at redhat dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: raysonlogin at gmail dot com @ 2011-04-06  6:05 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=12642

           Summary: utrace: taskfinder misses events when main thread does
                    not go through at least one quiesce
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
        AssignedTo: systemtap@sourceware.org
        ReportedBy: raysonlogin@gmail.com


If a multi-threaded program with the main thread not going through at least one
quiesce after systemtap/uprobes tracing starts, then Systemtap is not able to
detect the events.

Example:

#include <pthread.h>
#include <stdio.h>

void spool_write_script(int jobid) __attribute__ ((noinline));

void spool_write_script(int jobid)
{
 printf("sleeping... %d sec\n", 1 + jobid % 2);
 sleep(2 + jobid % 2);
}

mythread()
{
 int i;
 for (i=0;i<30;i++)
  spool_write_script(i);
}

int main()
{
 pthread_t tid;

 printf("pid = %d\n", getpid());

#ifdef MAINWAITS
 getchar();
#endif

 pthread_create(&tid, NULL, mythread, NULL);

 pthread_join(tid, NULL);

}


Reproduce with:
stap hangs when the process is traced with:
# stap -e 'probe process("./a.out").function("spool_write_script")
{printf("called")}' -x <PID>

However, when the testcase is compiled with -DMAINWAITS, and when stap is
started before inputing a key (getchar()), then stap is able to get the events.
(Also, starting the program with -c would make it work.)


Analysis:
In the systemtap runtime, __stp_utrace_task_finder_target_quiesce() checks if
it is encountering the main thread:

        /* Call the callbacks.  Assume that if the thread is a
         * thread group leader, it is a process. */
        __stp_call_callbacks(tgt, tsk, 1, (tsk->pid == tsk->tgid));

For slave threads, tsk->pid == tsk->tgid is false, and the callback function in
uprobes only handles the main thread, as the address space is shared by all the
threads. In stap_uprobe_process_found():

  if (! process_p) return 0; /* ignore threads */

And thus stap_uprobe_change_plus() is not called for the threads.

(As a hack) Simply passing 1 instead of (tsk->pid == tsk->tgid) would make it
work for the testcase above.

Severity:
Normal - The testcase is from a real application (daemon) that creates slave
threads, and the main thread waits until clean-up time (ie. daemon restart,
exit, etc...).

A workaround is to force the main thread to quiesce after stap runs, e.g. by
attaching gdb.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug runtime/12642] utrace: taskfinder misses events when main thread does not go through at least one quiesce
  2011-04-06  6:05 [Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce raysonlogin at gmail dot com
@ 2012-10-29 12:57 ` fche at redhat dot com
  2012-10-29 19:43 ` mjw at redhat dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: fche at redhat dot com @ 2012-10-29 12:57 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=12642

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dsmith at redhat dot com,
                   |                            |fche at redhat dot com

--- Comment #1 from Frank Ch. Eigler <fche at redhat dot com> 2012-10-29 12:57:35 UTC ---
David, could this be corrected by a UTRACE_STOP sent to the main thread upon
attaching to it?

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug runtime/12642] utrace: taskfinder misses events when main thread does not go through at least one quiesce
  2011-04-06  6:05 [Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce raysonlogin at gmail dot com
  2012-10-29 12:57 ` [Bug runtime/12642] " fche at redhat dot com
@ 2012-10-29 19:43 ` mjw at redhat dot com
  2012-11-02 16:04 ` dsmith at redhat dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: mjw at redhat dot com @ 2012-10-29 19:43 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=12642

Mark Wielaard <mjw at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mjw at redhat dot com

--- Comment #2 from Mark Wielaard <mjw at redhat dot com> 2012-10-29 19:43:25 UTC ---
Just adding a comment so people who see hotspot java probes not working against
running java processes might find this bug report. The hotspot main thread does
nothing except wait for all other threads to exit. Which can trigger this bug.
This happens for example on a 2.6.32-220.23.1.el6.x86_64 kernel.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug runtime/12642] utrace: taskfinder misses events when main thread does not go through at least one quiesce
  2011-04-06  6:05 [Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce raysonlogin at gmail dot com
  2012-10-29 12:57 ` [Bug runtime/12642] " fche at redhat dot com
  2012-10-29 19:43 ` mjw at redhat dot com
@ 2012-11-02 16:04 ` dsmith at redhat dot com
  2012-11-03 17:03 ` jistone at redhat dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: dsmith at redhat dot com @ 2012-11-02 16:04 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=12642

--- Comment #3 from David Smith <dsmith at redhat dot com> 2012-11-02 16:03:32 UTC ---
Depending on how you count, we've got 3 or 4 sets of utrace-like functionality
here, with different behaviors:

1) the original version of utrace, which is present in RHEL5, handled by
runtime/linux/task_finder.c. In this case we do send the main thread a
UTRACE_STOP, which causes it to stop and we attach correctly.

Here your testcase passes.

2) Version 2 of utrace, which is present in RHEL6 (also handled by
runtime/linux/task_finder.c). In this case we do send the main thread a
UTRACE_STOP - however the stop doesn't/can't interrupt a system call.

Here your testcase fails.

3) On new kernels without "real" utrace, we fake it with tracepoint handlers
and task_work_add().  This code is in runtime/linux/task_finder2.c and
runtime/stp_utrace.c. This code uses task_work_add() to run code when the task
is stopped.  This can't interrupt a system call.

Here your testcase fails.

4) The new dyninst runtime ("--runtime=dyninst") uses the dyninst library to
attach, which ends up using ptrace. Attaching to a running process isn't quite
there yet, but I think it should be possible to interrupt a system call.

So, for cases 2) and 3) above, we've still got some thinking to do.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug runtime/12642] utrace: taskfinder misses events when main thread does not go through at least one quiesce
  2011-04-06  6:05 [Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce raysonlogin at gmail dot com
                   ` (2 preceding siblings ...)
  2012-11-02 16:04 ` dsmith at redhat dot com
@ 2012-11-03 17:03 ` jistone at redhat dot com
  2012-11-27 15:36 ` dsmith at redhat dot com
  2012-12-06 20:06 ` dsmith at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: jistone at redhat dot com @ 2012-11-03 17:03 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=12642

Josh Stone <jistone at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jistone at redhat dot com

--- Comment #4 from Josh Stone <jistone at redhat dot com> 2012-11-03 17:02:56 UTC ---
(In reply to comment #3)
> 4) The new dyninst runtime ("--runtime=dyninst") uses the dyninst library to
> attach, which ends up using ptrace. Attaching to a running process isn't quite
> there yet, but I think it should be possible to interrupt a system call.

I'd be surprised if there was a quiesce-type issue in stapdyn, but I did find
an obvious omission that caused its attach to miss -- commit 02bff02.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug runtime/12642] utrace: taskfinder misses events when main thread does not go through at least one quiesce
  2011-04-06  6:05 [Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce raysonlogin at gmail dot com
                   ` (3 preceding siblings ...)
  2012-11-03 17:03 ` jistone at redhat dot com
@ 2012-11-27 15:36 ` dsmith at redhat dot com
  2012-12-06 20:06 ` dsmith at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: dsmith at redhat dot com @ 2012-11-27 15:36 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=12642

--- Comment #5 from David Smith <dsmith at redhat dot com> 2012-11-27 15:35:59 UTC ---
Commit f346b8b fixes this for everything except dyninst.

Here's what was going on with this one.

In the original utrace (RHEL5), UTRACE_STOP could interrupt the target task (by
sending the task a fake signal). This caused the target task to stop even when
it was sleeping.

In the newer utrace (RHEL6), that functionality was split out into
UTRACE_INTERRUPT. So, for RHEL6, we now make a pass after the systemtap session
is started and send all target tasks an UTRACE_INTERRUPT.

For newer kernels without utrace, I've implemented UTRACE_INTERRUPT support, so
just like for RHEL6, we now make a pass after the systemtap session is started
and send all target tasks an UTRACE_INTERRUPT.

I've added a test case, called 'main_quiesce.exp' that tests this issue.

This new test case passes everywhere, except with dyninst. That will need more
investigation.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug runtime/12642] utrace: taskfinder misses events when main thread does not go through at least one quiesce
  2011-04-06  6:05 [Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce raysonlogin at gmail dot com
                   ` (4 preceding siblings ...)
  2012-11-27 15:36 ` dsmith at redhat dot com
@ 2012-12-06 20:06 ` dsmith at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: dsmith at redhat dot com @ 2012-12-06 20:06 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=12642

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #6 from David Smith <dsmith at redhat dot com> 2012-12-06 20:06:17 UTC ---
The dyninst problem has been moved to its own bug, #14923.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-12-06 20:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-06  6:05 [Bug runtime/12642] New: utrace: taskfinder misses events when main thread does not go through at least one quiesce raysonlogin at gmail dot com
2012-10-29 12:57 ` [Bug runtime/12642] " fche at redhat dot com
2012-10-29 19:43 ` mjw at redhat dot com
2012-11-02 16:04 ` dsmith at redhat dot com
2012-11-03 17:03 ` jistone at redhat dot com
2012-11-27 15:36 ` dsmith at redhat dot com
2012-12-06 20:06 ` dsmith at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).