public inbox for systemtap@sourceware.org
* [RFC] Probes don't hit in an already running process
@ 2014-03-07 22:27 Torsten Polle
  2014-03-13 13:46 ` David Smith
  0 siblings, 1 reply; 5+ messages in thread
From: Torsten Polle @ 2014-03-07 22:27 UTC (permalink / raw)
  To: systemtap

Hi,

I've observed that probes sometimes don't hit when I start staprun
after the target process (where the probes should hit) has already
started. After some tests, I found that only multi-threaded processes
are affected, and only under a certain condition.

The patch below fixes the issue for me, but I have no clue about
possible side effects. In my first attempt to fix the issue, I also put
the calls to __stp_call_callbacks() inside the guarded area. My probes
hit, but calls to usymname(uaddr()) in the probe body only printed the
address instead of the symbol of the probed function.

Any advice on how I can improve the patch is appreciated.

Kind Regards,
Torsten


[-- Attachment #2: 0001-Fix-uprobes-don-t-hit-in-multi-threaded-processes.patch --]
[-- Type: text/x-patch, Size: 3495 bytes --]

From d1752f91227fecd66c2f550cfc4309b1509624e5 Mon Sep 17 00:00:00 2001
Message-Id: <d1752f91227fecd66c2f550cfc4309b1509624e5.1394230365.git.Torsten.Polle@gmx.de>
From: Torsten Polle <Torsten.Polle@gmx.de>
Date: Thu, 6 Mar 2014 21:42:19 +0100
Subject: [PATCH] Fix: uprobes don't hit in multi-threaded processes.

If a process is started before probes are inserted, SystemTap waits
for the main thread of the process to wake up. Then the hooks for
SystemTap are installed. If the main thread does not wake up, the
hooks are not installed and therefore uprobes don't fire.

Instead of waiting only for the main thread, the task finder now waits
for any thread that wakes up. To avoid installing the hooks multiple
times, the mmap callbacks are guarded so that they run only once per
process.

Signed-off-by: Torsten Polle <Torsten.Polle@gmx.de>
---
 runtime/linux/task_finder2.c |   36 +++++++++++++++++++-----------------
 1 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/runtime/linux/task_finder2.c b/runtime/linux/task_finder2.c
index 119d04e..e259960 100644
--- a/runtime/linux/task_finder2.c
+++ b/runtime/linux/task_finder2.c
@@ -83,6 +83,8 @@ struct stap_task_finder_target {
 	unsigned mmap_events:1;
 	unsigned munmap_events:1;
 	unsigned mprotect_events:1;
+#define mmap_called  0          /* bit number: mmap callbacks called */
+	volatile unsigned long events;
 
 /* public: */
 	pid_t pid;
@@ -243,6 +245,7 @@ stap_register_task_finder_target(struct stap_task_finder_target *new_tgt)
 	new_tgt->mmap_events = 0;
 	new_tgt->munmap_events = 0;
 	new_tgt->mprotect_events = 0;
+	new_tgt->events = 0;
 	memset(&new_tgt->ops, 0, sizeof(new_tgt->ops));
 	new_tgt->ops.report_exec = &__stp_utrace_task_finder_target_exec;
 	new_tgt->ops.report_death = &__stp_utrace_task_finder_target_death;
@@ -1281,16 +1284,16 @@ __stp_tf_quiesce_worker(struct task_work *work)
 
 	__stp_tf_handler_start();
 
-	/* Call the callbacks.  Assume that if the thread is a
-	 * thread group leader, it is a process. */
-	__stp_call_callbacks(tgt, current, 1, (current->pid == current->tgid));
- 
-	/* If this is just a thread other than the thread group
-	 * leader, don't bother inform map callback clients about its
-	 * memory map, since they will simply duplicate each other. */
-	if (tgt->mmap_events == 1 && current->tgid == current->pid) {
+	/* Call the callbacks for each thread. */
+	__stp_call_callbacks(tgt, current, 1, 1);
+
+	/* Call the callbacks just once for each process. */
+	if (!test_and_set_bit(mmap_called, &tgt->events))
+	{
+		if (tgt->mmap_events == 1) {
 	    __stp_call_mmap_callbacks_for_task(tgt, current);
 	}
+	}
 
 	__stp_tf_handler_end();
 
@@ -1368,16 +1371,15 @@ __stp_utrace_task_finder_target_quiesce(u32 action,
 		}
 	}
 	else {
-		/* Call the callbacks.  Assume that if the thread is a
-		 * thread group leader, it is a process. */
-		__stp_call_callbacks(tgt, tsk, 1, (tsk->pid == tsk->tgid));
- 
-		/* If this is just a thread other than the thread
-		   group leader, don't bother inform map callback
-		   clients about its memory map, since they will
-		   simply duplicate each other. */
-		if (tgt->mmap_events == 1 && tsk->tgid == tsk->pid) {
+		/* Call the callbacks for each thread. */
+		__stp_call_callbacks(tgt, tsk, 1, 1);
+
+		/* Call the callbacks just once for each process. */
+		if (!test_and_set_bit(mmap_called, &tgt->events))
+		{
+			if (tgt->mmap_events == 1) {
 			__stp_call_mmap_callbacks_for_task(tgt, tsk);
+			}
 		}
 	}
 
-- 
1.7.4.1
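
The run-once guard described in the commit message can be illustrated
with a minimal user-space sketch. This is not SystemTap code: the names
demo_target, demo_target_init and demo_quiesce are hypothetical, and C11
atomic_flag stands in for the kernel bit operation. The point it shows is
that a test-and-set primitive returns the previous value (the kernel's
test_and_set_bit() does too), so the once-per-process work belongs in the
branch where the result is zero.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a task finder target; not the SystemTap struct. */
struct demo_target {
	atomic_flag mmap_called;      /* set once the per-process work has run */
	atomic_int thread_callbacks;  /* counts per-thread callback runs */
	atomic_int mmap_callbacks;    /* counts per-process callback runs */
};

static void demo_target_init(struct demo_target *tgt)
{
	atomic_flag_clear(&tgt->mmap_called);
	atomic_init(&tgt->thread_callbacks, 0);
	atomic_init(&tgt->mmap_callbacks, 0);
}

/* Called by every thread of the process when it quiesces. */
static void demo_quiesce(struct demo_target *tgt)
{
	/* Per-thread work: runs on every call. */
	atomic_fetch_add(&tgt->thread_callbacks, 1);

	/* atomic_flag_test_and_set() returns the PREVIOUS value, so only
	 * the first caller sees false and does the per-process work. */
	if (!atomic_flag_test_and_set(&tgt->mmap_called))
		atomic_fetch_add(&tgt->mmap_callbacks, 1);
}
```

Calling demo_quiesce() from several threads bumps thread_callbacks once
per call, while mmap_callbacks is incremented only by whichever thread
arrives first.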


* Re: [RFC] Probes don't hit in an already running process
  2014-03-07 22:27 [RFC] Probes don't hit in an already running process Torsten Polle
@ 2014-03-13 13:46 ` David Smith
  2014-03-15  6:29   ` Aw: " Torsten Polle
  0 siblings, 1 reply; 5+ messages in thread
From: David Smith @ 2014-03-13 13:46 UTC (permalink / raw)
  To: Torsten Polle, systemtap

On 03/07/2014 04:27 PM, Torsten Polle wrote:
> Hi,
> 
> I've made the observation that probes sometimes don't hit when I start
> staprun after the process (where the probes should hit) started.  After
> some tests, I found out that only multi-thread processes were affected
> under a certain condition.
> 
> The patch below fixes the issue for me. But I've no clue about possible
> side effects. In my first attempt to fix the issue, I also included the
> calls to __stp_call_callbacks() into the guarded area. My probes hit,
> but calls to usymname(uaddr()) in the probe body only printed the
> address instead of the symbol of the probed function.
> 
> Any advice of how I can improve the patch is appreciated.

Hmm, we've had this problem before, and I thought we fixed it. See
PR12642 (utrace: taskfinder misses events when main thread does not go
through at least one quiesce):

<https://sourceware.org/bugzilla/show_bug.cgi?id=12642>

One of the things the commit that fixes that bug does is add a test
case, called 'main_quiesce.exp'. Does that pass or fail for you (run
"make installcheck RUNTESTFLAGS=main_quiesce.exp")? If it passes, we
need to figure out what is different about your multi-thread process
that still causes this to happen.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)
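
For readers who want to picture the scenario discussed here, the
following is a hypothetical sketch of a process whose main thread sleeps
while a worker thread does all the work. It is not the program attached
to PR12642 and not the main_quiesce.exp test case; the names
probed_function, worker and run_reproducer are made up for illustration.

```c
#include <pthread.h>
#include <unistd.h>

struct worker_args {
	int iterations;   /* how many times to call the probed function */
	int calls;        /* filled in by the worker */
};

/* Hypothetical probe target; noinline keeps a distinct symbol to probe. */
__attribute__((noinline)) int probed_function(int i)
{
	static volatile int sink;

	sink = i;         /* side effect so the call is not optimized away */
	return i + 1;
}

/* The worker is the only thread that wakes up regularly. */
static void *worker(void *p)
{
	struct worker_args *args = p;
	int i;

	for (i = 0; i < args->iterations; i++) {
		probed_function(i);
		args->calls++;
		usleep(1000);
	}
	return NULL;
}

/* The calling (main) thread sleeps in pthread_join() the whole time.
 * Returns the number of calls made, or -1 on error. */
int run_reproducer(int iterations)
{
	struct worker_args args = { iterations, 0 };
	pthread_t tid;

	if (pthread_create(&tid, NULL, worker, &args) != 0)
		return -1;
	if (pthread_join(tid, NULL) != 0)
		return -1;
	return args.calls;
}
```

Calling run_reproducer() from main() with a large iteration count, and
only then starting a script that probes probed_function in that binary,
would give the ordering described in the report: the process is already
running and its main thread is asleep when the probes are inserted.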

* Aw: Re: [RFC] Probes don't hit in an already running process
  2014-03-13 13:46 ` David Smith
@ 2014-03-15  6:29   ` Torsten Polle
  2014-08-16 20:51     ` Torsten Polle
  0 siblings, 1 reply; 5+ messages in thread
From: Torsten Polle @ 2014-03-15  6:29 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

David

 > Sent: Thursday, 13 March 2014 at 14:46
 > From: "David Smith" <dsmith@redhat.com>
 > To: "Torsten Polle" <Torsten.Polle@gmx.de>, systemtap@sourceware.org
 > Subject: Re: [RFC] Probes don't hit in an already running process
 > On 03/07/2014 04:27 PM, Torsten Polle wrote:
 > > Hi,
 > >
 > > I've made the observation that probes sometimes don't hit when I
 > > start staprun after the process (where the probes should hit)
 > > started. After some tests, I found out that only multi-thread
 > > processes were affected under a certain condition.
 > >
 > > The patch below fixes the issue for me. But I've no clue about
 > > possible side effects. In my first attempt to fix the issue, I
 > > also included the calls to __stp_call_callbacks() into the
 > > guarded area. My probes hit, but calls to usymname(uaddr()) in
 > > the probe body only printed the address instead of the symbol of
 > > the probed function.
 > >
 > > Any advice of how I can improve the patch is appreciated.
 >
 > Hmm. we've had this problem before, and I thought we fixed it. See
 > PR12642 (utrace: taskfinder misses events when main thread does not
 > go through at least one quiesce):
 >
 > <https://sourceware.org/bugzilla/show_bug.cgi?id=12642>

Thanks for the hint. I checked the program that is attached to the bug
report. The program does not differ from my test program.

I'll check why UTRACE_INTERRUPT does not lead to the task work being run
for sleeping main threads in my setup.

 > One of the things the commit that fixes that bug does is add a test
 > case, called 'main_quiesce.exp'. Does that pass or fail for you
 > (run "make installcheck RUNTESTFLAGS=main_quiesce.exp")? If it
 > passes, we need to figure out what is different about your
 > multi-thread process that still causes this to happen.
 >
 > --
 > David Smith
 > dsmith@redhat.com
 > Red Hat
 > http://www.redhat.com
 > 256.217.0141 (direct)
 > 256.837.0057 (fax)
 
Kind Regards,
Torsten

* Re: [RFC] Probes don't hit in an already running process
  2014-03-15  6:29   ` Aw: " Torsten Polle
@ 2014-08-16 20:51     ` Torsten Polle
  2014-08-18 14:48       ` David Smith
  0 siblings, 1 reply; 5+ messages in thread
From: Torsten Polle @ 2014-08-16 20:51 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

David,

On 15.03.2014 at 07:29, Torsten Polle <Torsten.Polle@gmx.de> wrote:

> David
> 
>  > Sent: Thursday, 13 March 2014 at 14:46
>  > From: "David Smith" <dsmith@redhat.com>
>  > To: "Torsten Polle" <Torsten.Polle@gmx.de>, systemtap@sourceware.org
>  > Subject: Re: [RFC] Probes don't hit in an already running process
>  > On 03/07/2014 04:27 PM, Torsten Polle wrote:
>  > > Hi,
>  > >
>  > > I've made the observation that probes sometimes don't hit when I
>  > > start staprun after the process (where the probes should hit)
>  > > started. After some tests, I found out that only multi-thread
>  > > processes were affected under a certain condition.
>  > >
>  > > The patch below fixes the issue for me. But I've no clue about
>  > > possible side effects. In my first attempt to fix the issue, I
>  > > also included the calls to __stp_call_callbacks() into the
>  > > guarded area. My probes hit, but calls to usymname(uaddr()) in
>  > > the probe body only printed the address instead of the symbol of
>  > > the probed function.
>  > >
>  > > Any advice of how I can improve the patch is appreciated.
>  >
>  > Hmm. we've had this problem before, and I thought we fixed it. See
>  > PR12642 (utrace: taskfinder misses events when main thread does not
>  > go through at least one quiesce):
>  >
>  > <https://sourceware.org/bugzilla/show_bug.cgi?id=12642>
> 
> Thanks for the hint. I checked the program that is attached to the bug
> report. The program does not differ from my test program.
> 
> I'll check why UTRACE_INTERRUPT does not lead to the task work being run
> for sleeping main threads in my setup.

It took me some time to check this issue in July, and some more time to
write this mail.

Finally, I found out that UTRACE_INTERRUPT was not handled properly in
utrace_resume. Before getting back to you, I double-checked with the
newest version, only to find that you had already fixed the problem with
"Fixed PR17181 by making utrace handle interrupting processes better."

>  > One of the things the commit that fixes that bug does is add a test
>  > case, called 'main_quiesce.exp'. Does that pass or fail for you
>  > (run "make installcheck RUNTESTFLAGS=main_quiesce.exp")? If it
>  > passes, we need to figure out what is different about your
>  > multi-thread process that still causes this to happen.
>  >
>  > --
>  > David Smith
>  > dsmith@redhat.com
>  > Red Hat
>  > http://www.redhat.com
>  > 256.217.0141 (direct)
>  > 256.837.0057 (fax)
>  
> Kind Regards,
> Torsten

Kind Regards,
Torsten

* Re: [RFC] Probes don't hit in an already running process
  2014-08-16 20:51     ` Torsten Polle
@ 2014-08-18 14:48       ` David Smith
  0 siblings, 0 replies; 5+ messages in thread
From: David Smith @ 2014-08-18 14:48 UTC (permalink / raw)
  To: Torsten Polle; +Cc: systemtap

On 08/16/2014 03:51 PM, Torsten Polle wrote:
> it took me some time to check this issue in July and again some more time
> to write this mail.
> 
> Finally, I found out that UTRACE_INTERRUPT was not handled properly in
> utrace_resume. Before I wanted to get back to you, I double checked
> with the newest version only to find out that you had already fixed
> the problem with "Fixed PR17181 by making utrace handle interrupting
> processes better."

Thanks for checking. Yeah, PR17181 and PR17127 were quite difficult to
debug (which led to the fix that we reverted for PR17127). Having a good
testcase was *so* helpful in getting a good fix here.

I'm glad that fixed your issue too. Thanks again for retesting.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)
