public inbox for systemtap@sourceware.org
* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
@ 2014-11-19 13:51 ` peter at peca dot dk
  2014-11-19 14:39 ` fche at redhat dot com
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-19 13:51 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #1 from Peter Allin <peter at peca dot dk> ---
Created attachment 7949
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7949&action=edit
The Systemtap script that causes this behavior

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application
@ 2014-11-19 13:51 peter at peca dot dk
  2014-11-19 13:51 ` [Bug uprobes/17623] " peter at peca dot dk
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-19 13:51 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

            Bug ID: 17623
           Summary: Sometimes probes fail to fire events when running
                    against a multi-threaded application
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: uprobes
          Assignee: systemtap at sourceware dot org
          Reporter: peter at peca dot dk

Created attachment 7948
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7948&action=edit
The example program that triggers the behavior

I have the following test program (attached as testprog.c)

  #include <pthread.h>
  #include <unistd.h>

  #define WAIT_US 1

  void func1()
  {
  }

  void func2()
  {
  }

  void* thread_main(void *foo)
  {
    for (;;)
    {
      func1();
      usleep(WAIT_US);
      func2();
    }
  }

  int main()
  {
    pthread_t thread;
    pthread_create(&thread, NULL, thread_main, NULL);
    pthread_join(thread, NULL);
    return 0;
  }

I run this program, after compiling it with "gcc -O0 -g -Wall testprog.c -o
testprog -lpthread", and then run this systemtap script (attached as
trace_funcs.stp):

  probe process("testprog").function("func1")
  {
    printf("func1\n")
  }

  probe process("testprog").function("func2")
  {
    printf("func2\n")
  }

Expected output: An endless repetition of "func1\nfunc2\n".

Actual output: Sometimes an endless repetition of "func1\n", and at other times
the expected output.


When I compile with WAIT_US set to 1, it almost always fails. If I set it to 50,
about half of the runs fail. If I set it to 50000, almost all runs succeed.

If I run "testprog" via the "-c" option to stap, it always works.

If I call thread_main() directly from main() instead of using pthreads, it
always works.


I am experiencing this on Ubuntu 14.04.1, with the Ubuntu-supplied SystemTap
version 2.3. I have also compiled SystemTap version 2.6 from source and got the
same results.


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
  2014-11-19 13:51 ` [Bug uprobes/17623] " peter at peca dot dk
@ 2014-11-19 14:39 ` fche at redhat dot com
  2014-11-20  7:49 ` peter at peca dot dk
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: fche at redhat dot com @ 2014-11-19 14:39 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fche at redhat dot com

--- Comment #2 from Frank Ch. Eigler <fche at redhat dot com> ---
Peter, have you by any chance tried out "perf probe" against your
original binary, to see if this may be a kernel-side rather
than stap-side problem?


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
  2014-11-19 13:51 ` [Bug uprobes/17623] " peter at peca dot dk
  2014-11-19 14:39 ` fche at redhat dot com
@ 2014-11-20  7:49 ` peter at peca dot dk
  2014-11-20  7:54 ` peter at peca dot dk
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-20  7:49 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #3 from Peter Allin <peter at peca dot dk> ---
(In reply to Frank Ch. Eigler from comment #2)
> Peter, have you by any chance tried out "perf probe" against your
> original binary, to see if this may be a kernel-side rather
> than stap-side problem?

I have just tried it, and am somewhat confused by the results. Here is what I
did:

- perf probe -x testprog func1
- perf probe -x testprog func2
- Start testprog in another terminal
- perf stat -e probe_testprog:func1 -e probe_testprog:func2 -a sleep 1

I ran the last command in a loop 1000 times. It always shows counts for
both probe points, but the counts are not equal. So it seems that SystemTap
will either work perfectly or lose all events from one of the probes, while
perf always misses some events but not all.

Do you think this is caused by two separate problems, or are both pointing
towards a kernel problem?


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (2 preceding siblings ...)
  2014-11-20  7:49 ` peter at peca dot dk
@ 2014-11-20  7:54 ` peter at peca dot dk
  2014-11-20 17:04 ` dsmith at redhat dot com
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-20  7:54 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #4 from Peter Allin <peter at peca dot dk> ---
Aha. When I let perf start the test program, like this:

  perf stat -e probe_testprog:func1 -e probe_testprog:func2 -a timeout 1s ./testprog

The counts are equal (+/- 1 of course).

This makes the SystemTap problem and the perf problem look like the same
problem, pointing towards the kernel, I suppose?


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (3 preceding siblings ...)
  2014-11-20  7:54 ` peter at peca dot dk
@ 2014-11-20 17:04 ` dsmith at redhat dot com
  2014-11-21  7:31 ` peter at peca dot dk
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: dsmith at redhat dot com @ 2014-11-20 17:04 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dsmith at redhat dot com

--- Comment #5 from David Smith <dsmith at redhat dot com> ---
Created attachment 7955
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7955&action=edit
2nd test script

On my F20 vm (3.16.3-200.fc20.x86_64), your test program and script worked fine
(even at WAIT_US 1). What kernel are you running?

When you finally stopped systemtap, did it report any skipped probes?

I've attached a modified test script that I'd like you to try. In case you are
overrunning the print system, this version just has counters. When you
interrupt systemtap, it will print the count of each function hit. For me, for
instance, with HEAD systemtap, it reports the following:

When I ran 'stap test2.stp -c ./foobar', I got:

func1: 207326 hits, func2: 207325 hits

When I ran './foobar & stap test2.stp', I got:

func1: 173137 hits, func2: 173137 hits

Depending on what you get with test2.stp, can I ask you to try with HEAD
systemtap and see what you get?
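
The attached test2.stp is not reproduced in this thread. A minimal sketch of a
counter-only script of the kind described above, assuming the same
process("testprog") probe points as the original trace_funcs.stp and using the
hypothetical global names func1_hits and func2_hits, might look like this:

```systemtap
# Sketch only: the real test2.stp attachment may differ.
# func1_hits and func2_hits are hypothetical names chosen for illustration.
global func1_hits, func2_hits

# Count each entry into func1 instead of printing a line per hit.
probe process("testprog").function("func1")
{
  func1_hits++
}

# Count each entry into func2 the same way.
probe process("testprog").function("func2")
{
  func2_hits++
}

# Runs once when the session ends (ctrl-c, timeout, or stap -T).
probe end
{
  printf("func1: %d hits, func2: %d hits\n", func1_hits, func2_hits)
}
```

Because the handlers only increment counters, nothing is printed until the
session ends, which should take the print path out of the picture when
comparing the two hit counts.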


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (4 preceding siblings ...)
  2014-11-20 17:04 ` dsmith at redhat dot com
@ 2014-11-21  7:31 ` peter at peca dot dk
  2014-11-21 14:45 ` dsmith at redhat dot com
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-21  7:31 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #6 from Peter Allin <peter at peca dot dk> ---
(In reply to David Smith from comment #5)
> On my F20 vm (3.16.3-200.fc20.x86_64), your test program and script worked
> fine (even at WAIT_US 1). What kernel are you running?

That sounds promising :-) I am running a kernel supplied by the Ubuntu project;
they call it "3.13.0-24-generic #47-Ubuntu". I am working remotely today, and
won't risk losing access to the server by messing up a kernel upgrade (got to
get that remote power switch configured...), but I'll definitely try upgrading
it on Monday.

> When you finally stopped systemtap, did it report any skipped probes?

No, I haven't seen any such reports. I stop it either by ctrl-c or by running it
under the "timeout" command. Could these ways of stopping it inhibit messages
about skipped probes?

> Depending on what you get with test2.stp, can I ask you to try with HEAD
> systemtap and see what you get?

The behavior is consistent with my version of the script. If I run the
test program with the -c option of stap, I get equal counts. When I run the
test program in the background, only "func1" is counted. (I don't know if this
gives any hints, but if I switch the order of the probes in the .stp file, it
is "func2" that works.)

Given this result, would it make sense to try out the HEAD systemtap?


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (5 preceding siblings ...)
  2014-11-21  7:31 ` peter at peca dot dk
@ 2014-11-21 14:45 ` dsmith at redhat dot com
  2014-11-23  7:40 ` peter at peca dot dk
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: dsmith at redhat dot com @ 2014-11-21 14:45 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #7 from David Smith <dsmith at redhat dot com> ---
(In reply to Peter Allin from comment #6)
> (In reply to David Smith from comment #5)
> > On my F20 vm (3.16.3-200.fc20.x86_64), your test program and script worked
> > fine (even at WAIT_US 1). What kernel are you running?
> 
> That sounds promising :-) I am running a kernel supplied by the Ubuntu
> project; they call it "3.13.0-24-generic #47-Ubuntu". I am working remotely
> today, and won't risk losing access to the server by messing up a kernel
> upgrade (got to get that remote power switch configured...), but I'll
> definitely try upgrading it on Monday.
> 
> > When you finally stopped systemtap, did it report any skipped probes?
> 
> No I haven't seen any such reports. I stop it either by ctrl-c or by running
> it under the "timeout" command. Could these ways of stopping it inhibit
> messages about skipped probes?

The skipped probes message would have come out at the end after either using
ctrl-c or using the timeout command.

> > Depending on what you get with test2.stp, can I ask you to try with HEAD
> > systemtap and see what you get?
> 
> The behavior is consistent with my version of the script. If I run the
> test program with the -c option of stap, I get equal counts. When I run the
> test program in the background, only "func1" is counted. (I don't know if
> this gives any hints, but if I switch the order of the probes in the .stp
> file, it is "func2" that works.)
> 
> Given this result, would it make sense to try out the HEAD systemtap?

It might if you have the time. It will provide another data point as to whether
this is a kernel issue or a systemtap issue.


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (6 preceding siblings ...)
  2014-11-21 14:45 ` dsmith at redhat dot com
@ 2014-11-23  7:40 ` peter at peca dot dk
  2014-11-23  7:53 ` peter at peca dot dk
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-23  7:40 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #8 from Peter Allin <peter at peca dot dk> ---
> > Given this result, would it make sense to try out the HEAD systemtap?
> 
> It might if you have the time. It will provide another data point to whether
> this is a kernel issue or a systemtap issue.

I just tried out with HEAD systemtap. The results are the same. I'll try with a
newer kernel tomorrow.


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (7 preceding siblings ...)
  2014-11-23  7:40 ` peter at peca dot dk
@ 2014-11-23  7:53 ` peter at peca dot dk
  2014-11-23 16:49 ` peter at peca dot dk
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-23  7:53 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #9 from Peter Allin <peter at peca dot dk> ---
I have tried it in a VirtualBox VM at home, with the same kernel release I am
using on the server (but in a 32-bit build). It works perfectly here.

I will try a 64 bit version later.


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (8 preceding siblings ...)
  2014-11-23  7:53 ` peter at peca dot dk
@ 2014-11-23 16:49 ` peter at peca dot dk
  2014-11-24 15:17 ` dsmith at redhat dot com
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-23 16:49 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #10 from Peter Allin <peter at peca dot dk> ---
It also works well on a 64-bit VirtualBox VM.

Could the different results on the VMs and the real hardware be caused by the
different performance of the machines?

I'll still try with a newer kernel tomorrow.


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (9 preceding siblings ...)
  2014-11-23 16:49 ` peter at peca dot dk
@ 2014-11-24 15:17 ` dsmith at redhat dot com
  2014-11-25 10:24 ` peter at peca dot dk
  2021-05-03 22:34 ` fche at redhat dot com
  12 siblings, 0 replies; 14+ messages in thread
From: dsmith at redhat dot com @ 2014-11-24 15:17 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #11 from David Smith <dsmith at redhat dot com> ---
(In reply to Peter Allin from comment #10)
> It also works well on a 64-bit VirtualBox VM.
> 
> Could the different results on the VMs and the real hardware be caused by
> the different performance of the machines?

In theory it shouldn't, especially if you are using the script I posted that
doesn't print at every function hit but just increments a counter.

> I'll still try with a newer kernel tomorrow.

Sounds good.


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (10 preceding siblings ...)
  2014-11-24 15:17 ` dsmith at redhat dot com
@ 2014-11-25 10:24 ` peter at peca dot dk
  2021-05-03 22:34 ` fche at redhat dot com
  12 siblings, 0 replies; 14+ messages in thread
From: peter at peca dot dk @ 2014-11-25 10:24 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

--- Comment #12 from Peter Allin <peter at peca dot dk> ---
(In reply to David Smith from comment #11)

> In theory it shouldn't, especially if you are using the script I posted that
> doesn't print at every function hit but just increments a counter.

Yes - I've been using your script with the counters.

> > I'll still try with a newer kernel tomorrow.
> Sounds good.

I have tried with a 3.17.4 kernel, and got the same results. Repeated runs
look like this:

func1: 7845 hits, func2: 0 hits
func1: 7891 hits, func2: 7891 hits
func1: 7793 hits, func2: 7793 hits
func1: 7847 hits, func2: 7847 hits
func1: 7811 hits, func2: 7811 hits
func1: 7887 hits, func2: 0 hits
func1: 7851 hits, func2: 0 hits
func1: 7886 hits, func2: 0 hits
func1: 7800 hits, func2: 7800 hits
func1: 7889 hits, func2: 7889 hits
func1: 7877 hits, func2: 0 hits


* [Bug uprobes/17623] Sometimes probes fail to fire events when running against a multi-threaded application
  2014-11-19 13:51 [Bug uprobes/17623] New: Sometimes probes fail to fire events when running against a multi-threaded application peter at peca dot dk
                   ` (11 preceding siblings ...)
  2014-11-25 10:24 ` peter at peca dot dk
@ 2021-05-03 22:34 ` fche at redhat dot com
  12 siblings, 0 replies; 14+ messages in thread
From: fche at redhat dot com @ 2021-05-03 22:34 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17623

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #13 from Frank Ch. Eigler <fche at redhat dot com> ---
getting fine results now under fedora33

% stap -V
Systemtap translator/driver (version 4.5/0.183, rpm 4.5-1.202104131754.fc33)
[...]
% ./a.out &
% stap -T 15 test.stp
func1: 487793 hits, func2: 487793 hits

repeatedly
