* [Bug tapsets/25290] New: process(EXE).begin may occasionally miss already-running target processes from EXE
@ 2019-12-18 7:32 agentzh at gmail dot com
2019-12-19 21:25 ` [Bug tapsets/25290] " agentzh at gmail dot com
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2019-12-18 7:32 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=25290
Bug ID: 25290
Summary: process(EXE).begin may occasionally miss
already-running target processes from EXE
Product: systemtap
Version: unspecified
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tapsets
Assignee: systemtap at sourceware dot org
Reporter: agentzh at gmail dot com
Target Milestone: ---
I've noted that the process(EXE).begin probes may sometimes miss target
processes which are already running when staprun is started. To reproduce this:
1. Prepare a minimal a.c C program file:
```
#include <unistd.h>
int main (int argc, char** argv) {
while (1) {
usleep(1);
}
return 0;
}
```
Compile it like this:
gcc -g a.c
which results in ./a.out generated. Run this program in a separate terminal
window like this:
./a.out
Keep this process running (it has an infinite loop so it will never exit by
itself).
2. Prepare a minimal a.stp script file:
```
global max_cnt = 5, cnt = 0;
global pids
probe process("/home/agentzh/a.out").begin {
pids[pid()] = 1
}
probe process("/home/agentzh/a.out").end {
delete pids[pid()];
}
probe process("/lib64/libc.so.6").function("usleep") {
if (!pids[pid()])
next;
if (++cnt > max_cnt)
exit();
println("usleep(", @var("useconds"), ")");
}
probe timer.s(5) {
warn("timer expired");
exit();
}
probe begin {
warn("Start tracing...")
}
```
And then compile this file to a kernel module file, usleep.ko:
stap -p4 -m usleep a.stp
3. Run this usleep.ko module in a shell loop:
( while true; do echo ======; sudo staprun usleep.ko; sleep 0.1; done ) |&
tee a.txt
after a few minutes (be patient!), we should see "timer expired" messages from
the output file a.txt:
$ grep -a timer a.txt
WARNING: timer expired
WARNING: timer expired
WARNING: timer expired
WARNING: timer expired
WARNING: timer expired
WARNING: timer expired
WARNING: timer expired
WARNING: timer expired
You may get fewer lines here though.
The timer expired message should never be printed when the process(EXE).begin
probe is fired properly.
I tried the latest master branch of stap on kernel 5.0.16-100.fc28.x86_64 from
Fedora 28, kernel 4.15.0-72-generic x86_64 from Ubuntu 18.04, and kernels
3.10.0-957.27.2.el7.x86_64 and 3.10.0-1062.9.1.el7.x86_64 from CentOS 7, all
show the same problem. I just noted that this is much harder to reproduce on
kernel 5.0.16 than earlier versions of the kernel.
The stap version:
```
$ /opt/stap/bin/stap -V
Systemtap translator/driver (version 4.3/0.174/0.177, commit
release-4.2-10-g1427836ac118)
Copyright (C) 2005-2019 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
tested kernel versions: 2.6.32 ... 5.4-rc6
enabled features: BPF LIBSQLITE3 NLS
```
Any hints on debugging this further? Thanks!
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tapsets/25290] process(EXE).begin may occasionally miss already-running target processes from EXE
2019-12-18 7:32 [Bug tapsets/25290] New: process(EXE).begin may occasionally miss already-running target processes from EXE agentzh at gmail dot com
@ 2019-12-19 21:25 ` agentzh at gmail dot com
2019-12-20 6:47 ` agentzh at gmail dot com
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2019-12-19 21:25 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=25290
--- Comment #1 from agentzh <agentzh at gmail dot com> ---
OK, this can be reproduced by a much simpler stap script (a simple one-liner):
$ stap -m test -p4 -e 'probe process("/home/agentzh/a.out").begin {
println("Hit"); exit() } probe begin { println("Start tracing") }'
$ while true; do sudo staprun test.ko; sleep 0.1; done
The second command would hang forever upon a "Start tracing" output line (when
./a.out is always running in another terminal).
And /home/agentzh/a.out is generated from exactly the same a.c source file
provided in my previous comment.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tapsets/25290] process(EXE).begin may occasionally miss already-running target processes from EXE
2019-12-18 7:32 [Bug tapsets/25290] New: process(EXE).begin may occasionally miss already-running target processes from EXE agentzh at gmail dot com
2019-12-19 21:25 ` [Bug tapsets/25290] " agentzh at gmail dot com
@ 2019-12-20 6:47 ` agentzh at gmail dot com
2019-12-20 23:24 ` agentzh at gmail dot com
2019-12-20 23:25 ` agentzh at gmail dot com
3 siblings, 0 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2019-12-20 6:47 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=25290
--- Comment #2 from agentzh <agentzh at gmail dot com> ---
Okay, with fche's guidance on IRC, I've narrowed it down to a race condition
that the utrace process.begin callback would be invoked when the stap sessions
state is still STAP_SESSION_STARTING instead of STAP_SESSION_RUNNING
occasionally.
As per fche's suggestion on IRC, I'd propose the following patch to fix this:
https://pastebin.com/WumwHdAT
I've tested this patch myself on 3.10 kernels from CentOS, 4.15 kernels from
Ubuntu and 5.0 kernels from Fedora. And it's quite stable now. Also tested the
process(EXE).end and process(EXE).thread.{begin,end} probes. All look fine now.
Feedback welcome! Thanks!
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tapsets/25290] process(EXE).begin may occasionally miss already-running target processes from EXE
2019-12-18 7:32 [Bug tapsets/25290] New: process(EXE).begin may occasionally miss already-running target processes from EXE agentzh at gmail dot com
2019-12-19 21:25 ` [Bug tapsets/25290] " agentzh at gmail dot com
2019-12-20 6:47 ` agentzh at gmail dot com
@ 2019-12-20 23:24 ` agentzh at gmail dot com
2019-12-20 23:25 ` agentzh at gmail dot com
3 siblings, 0 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2019-12-20 23:24 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=25290
--- Comment #3 from agentzh <agentzh at gmail dot com> ---
Committed a modified version of the patch as 04d221c16.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tapsets/25290] process(EXE).begin may occasionally miss already-running target processes from EXE
2019-12-18 7:32 [Bug tapsets/25290] New: process(EXE).begin may occasionally miss already-running target processes from EXE agentzh at gmail dot com
` (2 preceding siblings ...)
2019-12-20 23:24 ` agentzh at gmail dot com
@ 2019-12-20 23:25 ` agentzh at gmail dot com
3 siblings, 0 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2019-12-20 23:25 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=25290
agentzh <agentzh at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |FIXED
--- Comment #4 from agentzh <agentzh at gmail dot com> ---
Consider it resolved.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-12-20 23:25 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-18 7:32 [Bug tapsets/25290] New: process(EXE).begin may occasionally miss already-running target processes from EXE agentzh at gmail dot com
2019-12-19 21:25 ` [Bug tapsets/25290] " agentzh at gmail dot com
2019-12-20 6:47 ` agentzh at gmail dot com
2019-12-20 23:24 ` agentzh at gmail dot com
2019-12-20 23:25 ` agentzh at gmail dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).