public inbox for systemtap@sourceware.org
* [Bug tapsets/20075] New: target_set_pid() returns False when execve() syscall is successful
From: mysecondaccountabc at gmail dot com @ 2016-05-11  1:24 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20075

            Bug ID: 20075
           Summary: target_set_pid() returns False when execve() syscall
                    is successful
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: tapsets
          Assignee: systemtap at sourceware dot org
          Reporter: mysecondaccountabc at gmail dot com
  Target Milestone: ---

Created attachment 9250
  --> https://sourceware.org/bugzilla/attachment.cgi?id=9250&action=edit
Script to show the issue

When execve() is called and returns without error, a call to
target_set_pid(pid()) in that context returns "False", wrongly indicating that
the pid does not descend from the target process. As a result, the syscall is
discarded from the syscall trace.

I'm using a slightly modified version of "strace.stp" to show the issue, see
the attachment.

sudo stap my_strace.stp -w -c "sh -c /bin/ls" 

====
...

||OK_CALL 3||: target_set_pid(12701):1 target:12701
||OK_RETURN 3||: target_set_pid(12701):1 target:12701
Wed May 11 00:49:14 2016.822339 rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0

||OK_CALL 3||: target_set_pid(12701):1 target:12701
||OK_RETURN 3||: target_set_pid(12701):1 target:12701
Wed May 11 00:49:14 2016.822364 execve("/usr/local/sbin/sh", ["sh", "-c", "/bin/ls"], [/* 18 vars */]) = -2 (ENOENT)

...

||OK_CALL 3||: target_set_pid(12701):1 target:12701
||FILTERED_RETURN 2||: target_set_pid(12701):0
>>FILTERED_RETURN<<: sh[12701] execve("/bin/sh", ["sh", "-c", "/bin/ls"], [/* 18 vars */]) = 0

... 
====

As the last three lines above show, when the successful execve returns, the
target_set_pid(pid()) call in the nd_syscall.*.return probe returns False
("target_set_pid(12701):0"), causing the syscall to be dropped.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From: jistone at redhat dot com @ 2016-05-11 16:54 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20075

Josh Stone <jistone at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dsmith at redhat dot com,
                   |                            |jistone at redhat dot com

--- Comment #1 from Josh Stone <jistone at redhat dot com> ---
Interesting.  You can see this directly if you probe process.begin and
process.end yourself -- the end is called before the execve return, and the
begin is called after.

I believe for our internal process tracking, we report process.end right away,
but the process.begin waits for a quiescent point where it's sleepable, which
will be very low in the kernel entry code.  So your execve return is happening
in a gray area where we're not really "attached" to that process any more.
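
The ordering described above can be illustrated with a small Python model.
This is only a sketch, not SystemTap code: the event names, the replay()
helper, and the rule that process.begin re-adds the pid are assumptions made
for illustration, based on the description in this comment rather than on the
tapset source.

```python
# Toy model of the event ordering described above; NOT SystemTap code.
# Assumption: process.end drops the pid from the tracked set and
# process.begin puts it back. The real tapset logic may differ.

def replay(events, target_pid):
    """Return (event, pid, in_target_set) for each event, in order."""
    target_set = {target_pid}
    seen = []
    for event, pid in events:
        if event == "process.end":
            target_set.discard(pid)
        elif event == "process.begin":
            target_set.add(pid)
        seen.append((event, pid, pid in target_set))
    return seen

# Hypothetical timeline for a successful execve in pid 12701.
EVENTS = [
    ("execve.call", 12701),
    ("process.end", 12701),    # reported right away
    ("execve.return", 12701),  # falls into the gap
    ("process.begin", 12701),  # reported later, at a quiescent point
]

for event, pid, tracked in replay(EVENTS, target_pid=12701):
    print(f"{event:14} pid={pid} in_target_set={tracked}")
```

In this model the execve return is the only syscall event that sees the pid
missing from the set, which matches the FILTERED_RETURN line in the report.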

I'm not sure if there's any way to make that more seamless.  Perhaps we could
hold off the process.end until it quiesces too?  David, what do you think?


As a workaround, you can simplify your return filtering based just on whether
you saw the same entry -- if (tid() in thread_argstr) { report(...) } -- and in
the call you should also wait to write that entry until after the filter
passes.
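
A minimal Python model of that workaround, assuming each event carries the
thread id and whether the target-set check passed at that moment. The trace()
helper and the sample events are hypothetical; only the thread_argstr name
comes from the script under discussion.

```python
# Toy model of the suggested filtering; NOT SystemTap code.

def trace(events):
    """events: iterable of (kind, tid, in_target_set, text).

    Returns the list of reported "call = result" lines.
    """
    thread_argstr = {}
    reported = []
    for kind, tid, in_target_set, text in events:
        if kind == "call":
            # Filter first, and record the entry only if the filter passes.
            if in_target_set:
                thread_argstr[tid] = text
        elif kind == "return":
            # Do not recheck the target set here: a recorded entry is enough.
            if tid in thread_argstr:
                reported.append(f"{thread_argstr.pop(tid)} = {text}")
    return reported

SAMPLE = [
    ("call", 12701, True, 'execve("/bin/sh")'),
    ("return", 12701, False, "0"),   # target-set check fails at return
    ("call", 999, False, "read(3)"),  # unrelated process, never recorded
    ("return", 999, False, "5"),
]

print(trace(SAMPLE))
```

The successful execve is reported even though the target-set check fails at
return time, while the unrelated thread is still filtered out.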

From: mysecondaccountabc at gmail dot com @ 2016-05-16  7:45 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20075

--- Comment #2 from Gustavo Moreira <mysecondaccountabc at gmail dot com> ---
Cool. So this would be the fix/workaround for your strace.stp example file:

--- strace.stp.old      2016-05-16 17:36:21.463561868 +1000
+++ strace.stp.new      2016-05-16 17:30:12.385743161 +1000
@@ -47,7 +47,7 @@

 probe nd_syscall.*.return
   {
-    if (filter_p()) next;
+    if (filter_p() && !(tid() in thread_argstr)) next;

     report(name,thread_argstr[tid()],retstr)
   }

Thanks,
Gustavo

From: fche at redhat dot com @ 2016-05-16 15:08 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20075

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fche at redhat dot com

--- Comment #3 from Frank Ch. Eigler <fche at redhat dot com> ---
Perhaps we can smarten-up the target_set tapset, so that it puts dying
processes into a separate grace-period state, from which almost-born children
are still recognized as the same target-set.

Something like


global _target_set_zombie%

probe ...end {
  _target_set_zombie[pid()] = gettimeofday_s()
}

probe timer.s(30) {
  now = gettimeofday_s()
  foreach (time_of_death = pid in _target_set_zombie) {
     if (now - time_of_death > 60) delete _target_set[pid]
  }
}
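
The grace-period idea can be modelled in Python as follows. This is a sketch
under stated assumptions, not a proposed tapset change: the TargetSet class
and its method names are hypothetical, and two behaviours go beyond the
outline above (a new process.begin cancels the pending removal, and expired
entries are also dropped from the zombie map).

```python
# Toy model of a target set with a grace period; NOT SystemTap code.

class TargetSet:
    def __init__(self, pids, grace_s=60):
        self.members = set(pids)
        self.zombies = {}        # pid -> time of death, in seconds
        self.grace_s = grace_s

    def on_end(self, pid, now):
        """Record the death but keep the pid as a member for now."""
        if pid in self.members:
            self.zombies[pid] = now

    def on_begin(self, pid):
        """Assumption: a process that begins again is no longer dying."""
        self.zombies.pop(pid, None)

    def reap(self, now):
        """Periodic cleanup, playing the role of the timer probe."""
        expired = [p for p, died in self.zombies.items()
                   if now - died > self.grace_s]
        for pid in expired:
            self.members.discard(pid)
            del self.zombies[pid]

    def __contains__(self, pid):
        return pid in self.members


ts = TargetSet({12701})
ts.on_end(12701, now=100)
ts.reap(now=130)
print(12701 in ts)   # still a member inside the grace period
ts.reap(now=161)
print(12701 in ts)   # removed once the grace period has passed
```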

From: fche at redhat dot com @ 2016-07-18 20:44 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20075

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wcohen at redhat dot com

--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> ---
wcohen drafted a similar function:

function child_of_target:long (t:long)
{
  if (!target()) return 1
  while(t && t != task_parent(t)) {
    if (task_pid(t) == target()) return 1
    t = task_parent(t)
  }
  return 0
}


Maybe this implementation can take the place of the current one, or vice versa?
Consider also disowned processes (that are reparented to ppid=1) somehow.
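
To illustrate the reparenting concern, here is a small Python model of the
same ancestor walk. It is a sketch only: it walks a hypothetical parent_of
mapping of pids rather than kernel task structures, and the pids used are
made up.

```python
# Toy model of the ancestor walk; NOT SystemTap code.
# parent_of maps pid -> parent pid; a root process is its own parent.

def child_of_target(pid, parent_of, target):
    """Return True if pid is the target or one of its descendants."""
    if not target:
        return True                      # no target given: accept everything
    while pid and pid != parent_of.get(pid, pid):
        if pid == target:
            return True
        pid = parent_of.get(pid, pid)
    return False

# Hypothetical process tree: 0 <- 1 <- 100 <- 200 (target) <- 300
parent_of = {0: 0, 1: 0, 100: 1, 200: 100, 300: 200}

print(child_of_target(300, parent_of, target=200))

# If 300 is disowned and reparented to pid 1, the walk no longer
# passes through the target.
parent_of[300] = 1
print(child_of_target(300, parent_of, target=200))
```

The second call shows why disowned processes need separate handling: once
the parent link points at pid 1, an ancestry walk alone cannot tell that the
process originally came from the target.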
