public inbox for systemtap@sourceware.org
* [Bug dyninst/15619] New: on rawhide ia32, simple scripts sometimes hang
@ 2013-06-13 19:07 dsmith at redhat dot com
  2013-06-13 19:12 ` [Bug dyninst/15619] " dsmith at redhat dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: dsmith at redhat dot com @ 2013-06-13 19:07 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=15619

            Bug ID: 15619
           Summary: on rawhide ia32, simple scripts sometimes hang
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: dyninst
          Assignee: systemtap at sourceware dot org
          Reporter: dsmith at redhat dot com

As part of bug #14791, I added dyninst support to the following testcases:

 testsuite/systemtap.base/strftime.exp
 testsuite/systemtap.printf/end1.exp
 testsuite/systemtap.printf/mixed_out.exp
 testsuite/systemtap.printf/out1.exp
 testsuite/systemtap.printf/out2.exp
 testsuite/systemtap.printf/out3.exp

These tests pass on rawhide x86_64 (3.10.0-0.rc5.git0.1.fc20.x86_64). However,
on rawhide ia32 (3.10.0-0.rc5.git0.1.fc20.i686.PAE), two of those tests hang:
mixed_out.exp and out3.exp. To get the testsuite to continue, you have to
manually kill 'stap'. Once it is killed the tests pass, which I believe means
the output file was written correctly. My guess is that the 'exit()' function
isn't working correctly (perhaps related to bug #14655 or bug #15029).

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [Bug dyninst/15619] on rawhide ia32, simple scripts sometimes hang
  2013-06-13 19:07 [Bug dyninst/15619] New: on rawhide ia32, simple scripts sometimes hang dsmith at redhat dot com
@ 2013-06-13 19:12 ` dsmith at redhat dot com
  2013-06-13 19:35 ` dsmith at redhat dot com
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: dsmith at redhat dot com @ 2013-06-13 19:12 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=15619

--- Comment #1 from David Smith <dsmith at redhat dot com> ---
Both mixed_out.exp and out3.exp use '-DMAXACTION=100000'. The non-hanging tests
do not use MAXACTION.


* [Bug dyninst/15619] on rawhide ia32, simple scripts sometimes hang
  2013-06-13 19:07 [Bug dyninst/15619] New: on rawhide ia32, simple scripts sometimes hang dsmith at redhat dot com
  2013-06-13 19:12 ` [Bug dyninst/15619] " dsmith at redhat dot com
@ 2013-06-13 19:35 ` dsmith at redhat dot com
  2013-06-13 19:55 ` jistone at redhat dot com
  2013-06-13 22:28 ` jistone at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: dsmith at redhat dot com @ 2013-06-13 19:35 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=15619

--- Comment #2 from David Smith <dsmith at redhat dot com> ---
On x86_64, an strace'd stapdyn reports the following. Note that
rt_sigtimedwait() returned 15 (SIGTERM), so it saw the signal.

====
clone(child_stack=0x7f64e9e23eb0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f64e9e249d0, tls=0x7f64e9e24700, child_tidptr=0x7f64e9e249d0)
= 4259
futex(0x7f64e9e270c4, FUTEX_WAKE_OP, 1, 1, 0x7f64e9e270c0, {FUTEX_OP_SET, 0,
FUTEX_OP_CMP_GT, 1}) = 1
rt_sigtimedwait([HUP INT QUIT TERM], NULL, NULL, 8) = 15
clock_gettime(CLOCK_MONOTONIC_RAW, {1680, 689685020}) = 0
futex(0x7f64e9e270c4, FUTEX_WAKE_OP, 1, 1, 0x7f64e9e270c0, {FUTEX_OP_SET, 0,
FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f64e9e249d0, FUTEX_WAIT, 4259, NULL) = 0
munmap(0x7f64e9e25000, 2117632)         = 0
====

On ia32, an strace'd stapdyn reports:

====
clone(child_stack=0xb616c3a4,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0xb616cba8, {entry_number:6, base_addr:0xb616cb40, limit:1048575,
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0,
useable:1}, child_tidptr=0xb616cba8) = 8637
getcpu([0], NULL, 0)                    = 0
futex(0xb616ec8c, FUTEX_WAKE_OP, 1, 1, 0xb616ec88, {FUTEX_OP_SET, 0,
FUTEX_OP_CMP_GT, 1}) = 1
futex(0xb616ec40, FUTEX_WAKE, 1)        = 1
--- SIGTERM {si_signo=SIGTERM, si_code=SI_TKILL, si_pid=8635, si_uid=5183} ---
sigreturn() (mask [])                   = 1
futex(0xb616ec8c, FUTEX_WAKE_OP, 1, 1, 0xb616ec88, {FUTEX_OP_SET, 0,
FUTEX_OP_CMP_GT, 1}) = 1
futex(0xb616ec40, FUTEX_WAKE, 1)        = 1
rt_sigtimedwait([HUP INT QUIT TERM], NULL, NULL, 8
====

So, we're stuck in rt_sigtimedwait(). Which means that we either missed the
signal or it got masked off somewhere.
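
A minimal sketch of how a signal can be "missed" in this way, written in Python
for brevity rather than in stapdyn's own code. The function name
early_signal_is_lost and the finite timeout are assumptions for illustration;
the trace above shows a NULL timeout, which is why the real process blocks
forever. The signal is delivered to a handler while it is still unblocked, so
nothing is left pending by the time the wait starts. It assumes a
single-threaded process.

```python
import os
import signal


def early_signal_is_lost(timeout=0.2):
    """Deliver SIGTERM while it is unblocked, then try to wait for it.

    Returns (handler_ran, wait_result); wait_result is None on timeout.
    """
    handled = []
    old_handler = signal.signal(
        signal.SIGTERM, lambda signo, frame: handled.append(signo))
    # Make sure SIGTERM is NOT blocked, as in the hanging ia32 trace.
    old_mask = signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGTERM})
    try:
        # The signal arrives before we start waiting; the handler consumes it.
        os.kill(os.getpid(), signal.SIGTERM)
        # Nothing is pending any more, so this wait can only time out.
        info = signal.sigtimedwait({signal.SIGTERM}, timeout)
        return bool(handled), (info.si_signo if info else None)
    finally:
        # Restore the caller's mask and handler.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        if old_handler is not None:
            signal.signal(signal.SIGTERM, old_handler)


if __name__ == "__main__":
    print(early_signal_is_lost())
```

With a NULL timeout instead of the finite one used here, the wait would never
return, which matches the truncated rt_sigtimedwait() line in the ia32 trace.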


* [Bug dyninst/15619] on rawhide ia32, simple scripts sometimes hang
  2013-06-13 19:07 [Bug dyninst/15619] New: on rawhide ia32, simple scripts sometimes hang dsmith at redhat dot com
  2013-06-13 19:12 ` [Bug dyninst/15619] " dsmith at redhat dot com
  2013-06-13 19:35 ` dsmith at redhat dot com
@ 2013-06-13 19:55 ` jistone at redhat dot com
  2013-06-13 22:28 ` jistone at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: jistone at redhat dot com @ 2013-06-13 19:55 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=15619

Josh Stone <jistone at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jistone at redhat dot com

--- Comment #3 from Josh Stone <jistone at redhat dot com> ---
(In reply to David Smith from comment #0)
> (perhaps related to bug #14655 or bug #15029)

I think for the first, you meant bug #14665.

(In reply to David Smith from comment #2)
> On ia32, an strace'd stapdyn reports:
> 
> ====
> clone(child_stack=0xb616c3a4,
> flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|
> CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
> parent_tidptr=0xb616cba8, {entry_number:6, base_addr:0xb616cb40,
> limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1,
> seg_not_present:0, useable:1}, child_tidptr=0xb616cba8) = 8637
> getcpu([0], NULL, 0)                    = 0
> futex(0xb616ec8c, FUTEX_WAKE_OP, 1, 1, 0xb616ec88, {FUTEX_OP_SET, 0,
> FUTEX_OP_CMP_GT, 1}) = 1
> futex(0xb616ec40, FUTEX_WAKE, 1)        = 1
> --- SIGTERM {si_signo=SIGTERM, si_code=SI_TKILL, si_pid=8635, si_uid=5183} ---

It looks like the SIGTERM came here.

> sigreturn() (mask [])                   = 1
> futex(0xb616ec8c, FUTEX_WAKE_OP, 1, 1, 0xb616ec88, {FUTEX_OP_SET, 0,
> FUTEX_OP_CMP_GT, 1}) = 1
> futex(0xb616ec40, FUTEX_WAKE, 1)        = 1
> rt_sigtimedwait([HUP INT QUIT TERM], NULL, NULL, 8
> ====
> 
> So, we're stuck in rt_sigtimedwait(). Which means that we either missed the
> signal or it got masked off somewhere.

It seems SIGTERM actually wasn't masked early on, so it had already been
delivered before we started waiting for it.  I think we could just mask the
signals, check whether one has already arrived, and only then do the sigwait.
I can play with this a bit.
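
The mask, check, then wait ordering described above could look roughly like the
following. This is a sketch in Python, not the change that was actually
committed to stapdyn. The names wait_for_shutdown, SHUTDOWN_SIGNALS and the
`seen` list are hypothetical; the signal set mirrors the [HUP INT QUIT TERM]
set in the strace output. It assumes a single-threaded process whose signal
handler records each delivery in `seen`.

```python
import os
import signal

# Same set that the traced rt_sigtimedwait() call waits on.
SHUTDOWN_SIGNALS = {signal.SIGHUP, signal.SIGINT,
                    signal.SIGQUIT, signal.SIGTERM}


def wait_for_shutdown(seen, timeout=None):
    """Return the shutdown signal number, or None if `timeout` expires.

    `seen` is a list that the caller's signal handler appends to.
    """
    # 1. Mask first, so later signals stay pending instead of being handled.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, SHUTDOWN_SIGNALS)
    try:
        # 2. Did a signal already arrive before the mask took effect?
        if seen:
            return seen[0]
        # 3. Only now wait; anything sent after step 1 is still pending.
        if timeout is None:
            return signal.sigwait(SHUTDOWN_SIGNALS)
        info = signal.sigtimedwait(SHUTDOWN_SIGNALS, timeout)
        return info.si_signo if info else None
    finally:
        # Restore the caller's signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    seen = []
    signal.signal(signal.SIGTERM, lambda signo, frame: seen.append(signo))
    os.kill(os.getpid(), signal.SIGTERM)  # early signal, recorded by handler
    print(wait_for_shutdown(seen, timeout=1.0))
```

The point of the ordering is that there is no window in which a signal can be
both unrecorded and not pending: before step 1 the handler records it, and
after step 1 it stays queued for the wait.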


* [Bug dyninst/15619] on rawhide ia32, simple scripts sometimes hang
  2013-06-13 19:07 [Bug dyninst/15619] New: on rawhide ia32, simple scripts sometimes hang dsmith at redhat dot com
                   ` (2 preceding siblings ...)
  2013-06-13 19:55 ` jistone at redhat dot com
@ 2013-06-13 22:28 ` jistone at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: jistone at redhat dot com @ 2013-06-13 22:28 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=15619

Josh Stone <jistone at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Josh Stone <jistone at redhat dot com> ---
commit 3ee990b14ec461b0581888a038248f603a0d8113

I wasn't able to reproduce the failure myself, but David confirmed that this
fixed it for him.
