public inbox for systemtap@sourceware.org
From: "dsmith at redhat dot com" <sourceware-bugzilla@sourceware.org>
To: systemtap@sourceware.org
Subject: [Bug testsuite/20600] New: parallel testsuite hang in [nd_]syscall.exp
Date: Mon, 12 Sep 2016 14:58:00 -0000
Message-ID: <bug-20600-6586@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

            Bug ID: 20600
           Summary: parallel testsuite hang in [nd_]syscall.exp
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: testsuite
          Assignee: systemtap at sourceware dot org
          Reporter: dsmith at redhat dot com
  Target Milestone: ---

When I run the testsuite in parallel mode with at least 3 concurrent jobs, I'm
getting a testsuite "hang". The testsuite will run to completion, except for
either the syscall.exp or nd_syscall.exp test case. That test case will hang in
one of the tests, typically in the execve or getrlimit subtest. The stapio
process for that test is in the defunct state:

====
# ps ax | fgrep stap
14534 pts/0    S+     0:00 grep -F --color=auto stap
24933 ?        Zl     0:10 [stapio] <defunct>

# tail testsuite/artifacts/systemtap.syscall/nd_syscall/systemtap.log 
Executing on host: gcc /root/src/testsuite/systemtap.syscall/getpriority.c 
-lrt  -lm   -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptestgbSi0f/getpriority
   (timeout = 300)
spawn -ignore SIGHUP gcc /root/src/testsuite/systemtap.syscall/getpriority.c
-lrt -lm -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptestgbSi0f/getpriority
PASS: 64-bit getpriority nd_syscall
Testing 64-bit getrandom nd_syscall
Executing on host: gcc /root/src/testsuite/systemtap.syscall/getrandom.c  -lrt 
-lm   -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptest9QHupy/getrandom
   (timeout = 300)
spawn -ignore SIGHUP gcc /root/src/testsuite/systemtap.syscall/getrandom.c -lrt
-lm -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptest9QHupy/getrandom
PASS: 64-bit getrandom nd_syscall
Testing 64-bit getrlimit nd_syscall
Executing on host: gcc /root/src/testsuite/systemtap.syscall/getrlimit.c  -lrt 
-lm   -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptest4a2xe9/getrlimit
   (timeout = 300)
spawn -ignore SIGHUP gcc /root/src/testsuite/systemtap.syscall/getrlimit.c -lrt
-lm -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptest4a2xe9/getrlimit

# ll testsuite/artifacts/systemtap.syscall/nd_syscall/systemtap.log 
-rwxr-xr-x. 1 root root 21289 Sep 10 01:19
testsuite/artifacts/systemtap.syscall/nd_syscall/systemtap.lo
====

So, for over 9 hours that test has just sat there. If I do a 'kill -9' on that
defunct stapio process, the [nd_]syscall.exp test will finish (and the full
testsuite will also finish).
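
To pick out the PID to kill without reading the ps listing by eye, something
like the following could be used. It is only a sketch: the function name
find_defunct_stapio is made up for illustration, and it assumes a ps that
accepts the POSIX "-o name=" form and reports the command name as "stapio".

```shell
#!/bin/sh
# Print the PIDs of stapio processes that ps reports in a zombie (Z) state.
# Reads "PID STAT COMMAND..." rows on stdin; the function name is
# hypothetical and not part of systemtap or its testsuite.
find_defunct_stapio() {
    awk '$2 ~ /^Z/ && $3 == "stapio" { print $1 }'
}

# Usage on a live system (empty headers suppress the header line):
#   ps -e -o pid= -o stat= -o comm= | find_defunct_stapio
```

The filter takes its input on stdin so it can be tried against a saved ps
listing as well as a live system.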

Note that on the same system the full testsuite (and the [nd_]syscall.exp test
cases) will run to completion when run in non-parallel mode.

This "hang" is fairly repeatable, happening at least 50% of the time.

I'd guess that one of the other tests is interfering with the [nd_]syscall.exp
test case somehow, but I can't quite think of how.
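
One thing that might help narrow this down: per ps(1), the "Zl" state shown
above means "defunct" plus "multi-threaded". A possible reading (an
assumption, not something confirmed here) is that the main stapio thread has
exited while another thread is still around. The sketch below lists the state
of each thread using the Linux /proc/<pid>/task layout. The function name
thread_states and the optional second argument are made up for illustration.

```shell
#!/bin/sh
# Print "TID STATE" for every thread of a process, read from the Linux
# /proc/<pid>/task/<tid>/status files.
#   $1 = pid
#   $2 = proc root (defaults to /proc; a parameter only so that the
#        function can be tried against a copied or fake directory tree)
# thread_states is a hypothetical helper, not part of systemtap.
thread_states() {
    pid=$1
    root=${2:-/proc}
    for t in "$root/$pid"/task/*; do
        # Skip threads that vanished, or an unmatched glob pattern.
        [ -r "$t/status" ] || continue
        state=$(awk '/^State:/ { print $2; exit }' "$t/status")
        echo "${t##*/} $state"
    done
}

# Usage, with the PID from the ps output above:
#   thread_states 24933
```

If every thread shows Z, this reading is wrong and the cause lies elsewhere.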


