public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* [Bug dyninst/20052] New: pthreadprobes.exp stall
@ 2016-05-06  7:50 mcermak at redhat dot com
  2016-05-06  7:53 ` [Bug dyninst/20052] " mcermak at redhat dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: mcermak at redhat dot com @ 2016-05-06  7:50 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20052

            Bug ID: 20052
           Summary: pthreadprobes.exp stall
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: dyninst
          Assignee: systemtap at sourceware dot org
          Reporter: mcermak at redhat dot com
  Target Milestone: ---

On x86_64, the pthreadprobes.exp testcase seems to cause relatively well
reproducible stall in one of its dyninst subtests, both on
f23/release-3.0-67-ge6eea51baebb and rhel7/devtoolset-4 rpm 2.9-3s.el7. 

The testcase generates binary pthreadprobes which as such runs for a few
seconds, then finishes. `stap --dyninst xxx.stp -c ./pthreadprobes`, where
xxx.stp is listed below, seems to almost always stall.

=======
# cat xxx.stp 
  global cond_broadcast, cond_destroy, cond_init, cond_signal,
    cond_timedwait, cond_wait, mutex_acquired, mutex_destroy,
    mutex_entry, mutex_init, mutex_release, mutex_timedlock_acquired,
    mutex_timedlock_entry, pthread_create, pthread_join,
    pthread_join_ret, pthread_start, rdlock_entry, rwlock_destroy,
    wrlock_entry

  probe process("/lib64/libpthread.so.0").mark("cond_broadcast") {
cond_broadcast++ }
  probe process("/lib64/libpthread.so.0").mark("cond_destroy") { cond_destroy++
}
  probe process("/lib64/libpthread.so.0").mark("cond_init") { cond_init++ }
  probe process("/lib64/libpthread.so.0").mark("cond_signal") { cond_signal++ }
  probe process("/lib64/libpthread.so.0").mark("cond_timedwait") {
cond_timedwait++ }
  probe process("/lib64/libpthread.so.0").mark("cond_wait") { cond_wait++ }
  probe process("/lib64/libpthread.so.0").mark("mutex_acquired") {
mutex_acquired++ }
  probe process("/lib64/libpthread.so.0").mark("mutex_destroy") {
mutex_destroy++ }
  probe process("/lib64/libpthread.so.0").mark("mutex_entry") { mutex_entry++ }
  probe process("/lib64/libpthread.so.0").mark("mutex_init") { mutex_init++ }
  probe process("/lib64/libpthread.so.0").mark("mutex_release") {
mutex_release++ }
  probe process("/lib64/libpthread.so.0").mark("mutex_timedlock_acquired") {
    mutex_timedlock_acquired++ }
  probe process("/lib64/libpthread.so.0").mark("mutex_timedlock_entry") {
    mutex_timedlock_entry++ }
  probe process("/lib64/libpthread.so.0").mark("pthread_create") {
pthread_create++ }
  probe process("/lib64/libpthread.so.0").mark("pthread_join") { pthread_join++
}
  probe process("/lib64/libpthread.so.0").mark("pthread_join_ret") {
pthread_join_ret++ }
  probe process("/lib64/libpthread.so.0").mark("pthread_start") {
pthread_start++ }
  probe process("/lib64/libpthread.so.0").mark("rdlock_entry") { rdlock_entry++
}
  probe process("/lib64/libpthread.so.0").mark("rwlock_destroy") {
rwlock_destroy++ }
  probe process("/lib64/libpthread.so.0").mark("wrlock_entry") { wrlock_entry++
}
  probe end {
    printf("cond_broadcast=%d\n", cond_broadcast)
    printf("cond_destroy=%d\n", cond_destroy)
    printf("cond_init=%d\n", cond_init)
    printf("cond_signal=%d\n", cond_signal)
    printf("cond_timedwait=%d\n", cond_timedwait)
    printf("cond_wait=%d\n", cond_wait)
    printf("mutex_acquired=%d\n", mutex_acquired)
    printf("mutex_destroy=%d\n", mutex_destroy)
    printf("mutex_entry=%d\n", mutex_entry)
    printf("mutex_init=%d\n", mutex_init)
    printf("mutex_release=%d\n", mutex_release)
    printf("mutex_timedlock_acquired=%d\n", mutex_timedlock_acquired)
    printf("mutex_timedlock_entry=%d\n", mutex_timedlock_entry)
    printf("pthread_create=%d\n", pthread_create)
    printf("pthread_join=%d\n", pthread_join)
    printf("pthread_join_ret=%d\n", pthread_join_ret)
    printf("pthread_start=%d\n", pthread_start)
    printf("rdlock_entry=%d\n", rdlock_entry)
    printf("rwlock_destroy=%d\n", rwlock_destroy)
    printf("wrlock_entry=%d\n", wrlock_entry)
  }
#
=======

The stall looks thusly:

=======
# stap -vvvvvv --dyninst  xxx.stp -c ./pthreadprobes
... stuff deleted ...
Pass 5: starting run.
Running /opt/rh/devtoolset-4/root/usr/bin/stapdyn -v -v -v -v -v -c
./pthreadprobes /tmp/stap4YZyZj/stap_60b2c3beb0fcbec36141257abd333339_7328.so
stapdyn: target "/usr/lib64/libpthread-2.17.so" has 29 probes
stapdyn:   offset:0xbe10 semaphore:0 flags:0
stapdyn:   offset:0xb4d4 semaphore:0 flags:0
stapdyn:   offset:0xb479 semaphore:0 flags:0
stapdyn:   offset:0xbd30 semaphore:0 flags:0
stapdyn:   offset:0xb95c semaphore:0 flags:0
stapdyn:   offset:0xb615 semaphore:0 flags:0
stapdyn:   offset:0x5d61 semaphore:0 flags:0
stapdyn:   offset:0x9c11 semaphore:0 flags:0
stapdyn:   offset:0x6621 semaphore:0 flags:0
stapdyn:   offset:0x1039c semaphore:0 flags:0
stapdyn:   offset:0x9ba0 semaphore:0 flags:0
stapdyn:   offset:0x9bdb semaphore:0 flags:0
stapdyn:   offset:0x1036b semaphore:0 flags:0
stapdyn:   offset:0x9b67 semaphore:0 flags:0
stapdyn:   offset:0x6123 semaphore:0 flags:0
stapdyn:   offset:0x6132 semaphore:0 flags:0
stapdyn:   offset:0xab54 semaphore:0 flags:0
stapdyn:   offset:0xabde semaphore:0 flags:0
stapdyn:   offset:0xa3c8 semaphore:0 flags:0
stapdyn:   offset:0xa8c7 semaphore:0 flags:0
stapdyn:   offset:0xa9f7 semaphore:0 flags:0
stapdyn:   offset:0xa2df semaphore:0 flags:0
stapdyn:   offset:0x8420 semaphore:0 flags:0
stapdyn:   offset:0x8e8d semaphore:0 flags:0
stapdyn:   offset:0x8f33 semaphore:0 flags:0
stapdyn:   offset:0x7db3 semaphore:0 flags:0
stapdyn:   offset:0xade0 semaphore:0 flags:0
stapdyn:   offset:0xadd0 semaphore:0 flags:0
stapdyn:   offset:0xb000 semaphore:0 flags:0
stapdyn: pid 1518 has 17 registers available: rax rbx rsp rbp r10 r11 r12 r13
r14 r15 r8 r9 rcx rdx rsi rdi eflags
stapdyn: dlopen "stap_60b2c3beb0fcbec36141257abd333339_7328.so", pid = 1518
stapdyn: found object "/root/work/tmp.LPzQswb90L/pthreadprobes" in pid 1518
stapdyn: found object
"/opt/rh/devtoolset-4/root/usr/lib64/dyninst/libdyninstAPI_RT.so.9.1.0" in pid
1518
stapdyn: found object "/usr/lib64/libpthread-2.17.so" in pid 1518
stapdyn: found target "/usr/lib64/libpthread-2.17.so" in pid 1518, inserting 29
probes
stapdyn: found object "/usr/lib64/librt-2.17.so" in pid 1518
stapdyn: found object "/usr/lib64/libc-2.17.so" in pid 1518
stapdyn: found object "/usr/lib64/libdl-2.17.so" in pid 1518
stapdyn: found object
"/tmp/stap4YZyZj/stap_60b2c3beb0fcbec36141257abd333339_7328.so" in pid 1518
stapdyn: calling function 'stp_dyninst_shm_connect' in pid 1518
stapdyn: continuing execution of pid 1518
^C^C^CToo many interrupts received, exiting.
 7.2 S x86_64 [4] # stap -V
Systemtap translator/driver (version 2.9/0.166, devtoolset-4 rpm 2.9-3s.el7)
=======

Once, among couple of test runs, I saw the testcase to finish though, whereas
subsequent run of exactly the same thing was running overnight and didn't
finish.

=======
[root@dhcp-1-147 ~]# stap --dyninst xxx.stp -c ./pthreadprobes 
stapdyn: WARNING: Couldn't instrument the function containing 0x7fd7ce818a08,
/usr/lib64/libpthread-2.22.so+0xba08
stapdyn: WARNING: Couldn't instrument the function containing 0x7fd7ce818fc3,
/usr/lib64/libpthread-2.22.so+0xbfc3
cond_broadcast=0
cond_destroy=0
cond_init=0
cond_signal=0
cond_timedwait=0
cond_wait=0
mutex_acquired=1
mutex_destroy=0
mutex_entry=1
mutex_init=1
mutex_release=1
mutex_timedlock_acquired=0
mutex_timedlock_entry=0
pthread_create=1
pthread_join=0
pthread_join_ret=0
pthread_start=0
rdlock_entry=0
rwlock_destroy=0
wrlock_entry=0
[root@dhcp-1-147 ~]# stap --dyninst xxx.stp -c ./pthreadprobes 
stapdyn: WARNING: Couldn't instrument the function containing 0x7f21271bca08,
/usr/lib64/libpthread-2.22.so+0xba08
stapdyn: WARNING: Couldn't instrument the function containing 0x7f21271bcfc3,
/usr/lib64/libpthread-2.22.so+0xbfc3
^C^C^CToo many interrupts received, exiting.
=======

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug dyninst/20052] pthreadprobes.exp stall
  2016-05-06  7:50 [Bug dyninst/20052] New: pthreadprobes.exp stall mcermak at redhat dot com
@ 2016-05-06  7:53 ` mcermak at redhat dot com
  2016-06-16  0:19 ` jistone at redhat dot com
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: mcermak at redhat dot com @ 2016-05-06  7:53 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20052

Martin Cermak <mcermak at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jistone at redhat dot com

--- Comment #1 from Martin Cermak <mcermak at redhat dot com> ---
Interestingly, using distribution systemtap-2.8-10.el7.x86_64 on the very same
rhel7 test box reliably gives a segfault (which may easily be completely
different issue of course). 

=======
# stap --dyninst  xxx.stp -c ./pthreadprobes
stapdyn: WARNING: Couldn't find an instrumentation point at 0x7f16781863c8,
/usr/lib64/libpthread-2.17.so+0xa3c8
stapdyn: WARNING: Couldn't find an instrumentation point at 0x7f16781868c7,
/usr/lib64/libpthread-2.17.so+0xa8c7
stapdyn: WARNING: Couldn't find an instrumentation point at 0x7f16781869f7,
/usr/lib64/libpthread-2.17.so+0xa9f7
stapdyn: WARNING: Couldn't instrument the function containing 0x7f16781862df,
/usr/lib64/libpthread-2.17.so+0xa2df
WARNING: /usr/bin/stapdyn exited with signal: 11 (Segmentation fault)
Pass 5: run failed.  [man error::pass5]
#
=======

Two corefiles are generated, but gdb isn't very happy about those distort bad
boys. One comes from stapdyn and doesn't seem to be useful. The other comes
from the pthreadprobes binary and shows sort of traceback (probably not too
valuable though):

=======
Core was generated by `./pthreadprobes'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f21de51c8ce in DYNINSTstaticHeap_16M_anyHeap_1 () from
/usr/lib64/dyninst/libdyninstAPI_RT.so.8.2
#2  0x0000000000000000 in ?? ()
(gdb) 
=======

Another SEGV of the same testcase is PR17245.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug dyninst/20052] pthreadprobes.exp stall
  2016-05-06  7:50 [Bug dyninst/20052] New: pthreadprobes.exp stall mcermak at redhat dot com
  2016-05-06  7:53 ` [Bug dyninst/20052] " mcermak at redhat dot com
@ 2016-06-16  0:19 ` jistone at redhat dot com
  2016-06-27 20:02 ` jistone at redhat dot com
  2016-06-27 20:16 ` jistone at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: jistone at redhat dot com @ 2016-06-16  0:19 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20052

--- Comment #2 from Josh Stone <jistone at redhat dot com> ---
The segfault in the mutatee (pthreadprobes) appears to be a recursion problem:
https://github.com/dyninst/dyninst/issues/98

I don't know exactly why the mutator (stapdyn) gets stuck though.  There does
appear to be a notification of the child exit received in libpcontrol.so, but
then it waits for further activity which never arrives.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug dyninst/20052] pthreadprobes.exp stall
  2016-05-06  7:50 [Bug dyninst/20052] New: pthreadprobes.exp stall mcermak at redhat dot com
  2016-05-06  7:53 ` [Bug dyninst/20052] " mcermak at redhat dot com
  2016-06-16  0:19 ` jistone at redhat dot com
@ 2016-06-27 20:02 ` jistone at redhat dot com
  2016-06-27 20:16 ` jistone at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: jistone at redhat dot com @ 2016-06-27 20:02 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20052

Josh Stone <jistone at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from Josh Stone <jistone at redhat dot com> ---
I have backported the TLS changes into updates for Fedora 22-24 -- karma
welcome.

The SIGSEGV hang is still a problem in general, but at least this particular
source of SIGSEGV is fixed.  Follow here for progress on that:
https://github.com/dyninst/dyninst/issues/100

Either way, there's nothing for stapdyn to do here.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug dyninst/20052] pthreadprobes.exp stall
  2016-05-06  7:50 [Bug dyninst/20052] New: pthreadprobes.exp stall mcermak at redhat dot com
                   ` (2 preceding siblings ...)
  2016-06-27 20:02 ` jistone at redhat dot com
@ 2016-06-27 20:16 ` jistone at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: jistone at redhat dot com @ 2016-06-27 20:16 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20052

--- Comment #4 from Josh Stone <jistone at redhat dot com> ---
(In reply to Josh Stone from comment #3)
> I have backported the TLS changes into updates for Fedora 22-24

Sorry, I forgot I didn't do f22 as that dyninst didn't use TLS yet.

F23: https://bodhi.fedoraproject.org/updates/FEDORA-2016-5118ab44aa
F24: https://bodhi.fedoraproject.org/updates/FEDORA-2016-5bd8277036

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-06-27 20:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-06  7:50 [Bug dyninst/20052] New: pthreadprobes.exp stall mcermak at redhat dot com
2016-05-06  7:53 ` [Bug dyninst/20052] " mcermak at redhat dot com
2016-06-16  0:19 ` jistone at redhat dot com
2016-06-27 20:02 ` jistone at redhat dot com
2016-06-27 20:16 ` jistone at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).