public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp
@ 2016-09-12 14:58 dsmith at redhat dot com
  2016-09-15 14:28 ` [Bug testsuite/20600] " dsmith at redhat dot com
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-12 14:58 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

            Bug ID: 20600
           Summary: parallet testsuite hang in [nd_]syscall.exp
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: testsuite
          Assignee: systemtap at sourceware dot org
          Reporter: dsmith at redhat dot com
  Target Milestone: ---

When I run the testsuite in parallel mode with at lest 3 concurrent jobs, I'm
getting a testsuite "hang". The testsuite will run to completion, except for
either the syscall.exp or nd_syscall.exp test case. That test case will hang in
one of the tests, typically in the execve or getrlimit subtest. The stapio
process for that test is in the defunct state:

====
# ps ax | fgrep stap
14534 pts/0    S+     0:00 grep -F --color=auto stap
24933 ?        Zl     0:10 [stapio] <defunct>

# tail testsuite/artifacts/systemtap.syscall/nd_syscall/systemtap.log 
Executing on host: gcc /root/src/testsuite/systemtap.syscall/getpriority.c 
-lrt  -lm   -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptestgbSi0f/getpriority
   (timeout = 300)
spawn -ignore SIGHUP gcc /root/src/testsuite/systemtap.syscall/getpriority.c
-lrt -lm -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptestgbSi0f/getpriority
PASS: 64-bit getpriority nd_syscall
Testing 64-bit getrandom nd_syscall
Executing on host: gcc /root/src/testsuite/systemtap.syscall/getrandom.c  -lrt 
-lm   -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptest9QHupy/getrandom
   (timeout = 300)
spawn -ignore SIGHUP gcc /root/src/testsuite/systemtap.syscall/getrandom.c -lrt
-lm -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptest9QHupy/getrandom
PASS: 64-bit getrandom nd_syscall
Testing 64-bit getrlimit nd_syscall
Executing on host: gcc /root/src/testsuite/systemtap.syscall/getrlimit.c  -lrt 
-lm   -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptest4a2xe9/getrlimit
   (timeout = 300)
spawn -ignore SIGHUP gcc /root/src/testsuite/systemtap.syscall/getrlimit.c -lrt
-lm -o
/root/rhel7-ppc64le/testsuite/artifacts/systemtap.syscall/nd_syscall/staptest4a2xe9/getrlimit

# ll testsuite/artifacts/systemtap.syscall/nd_syscall/systemtap.log 
-rwxr-xr-x. 1 root root 21289 Sep 10 01:19
testsuite/artifacts/systemtap.syscall/nd_syscall/systemtap.lo
====

So, for over 9 hours that test has just sat there. If I do a 'kill -9' on that
defunct stapio process, the [nd_]syscall.exp test will finish (and the full
testsuite will also finish).

Note that on the same system the full testsuite (and the [nd_]syscall.exp test
cases) will run to completion when run in non-parallel mode.

This "hang" is fairly repeatable, happening at least 50% of the time.

I'd guess that one of the other tests is interfering with the [nd_]syscall.exp
test case somehow, but I can't quite think of how.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallet testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
@ 2016-09-15 14:28 ` dsmith at redhat dot com
  2016-09-15 19:40 ` dsmith at redhat dot com
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-15 14:28 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

--- Comment #1 from David Smith <dsmith at redhat dot com> ---
When this happens, it looks like stapio's parent gets set to expect:

[root@ibm-p8-01-lp7 rhel7-ppc64le]# ps -l 20631
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY        TIME CMD
4 Z     0 20631  3434  0  80   0 -     0 exit   ?          0:03 [stapio]
<defunct>
[root@ibm-p8-01-lp7 rhel7-ppc64le]# ps -wl 3434
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY        TIME CMD
0 S     0  3434  3427  0  80   0 -   391 pipe_r ?          0:02 expect --
/usr/share/dejagnu/runtest.exp --tool systemtap --tool_opts  --srcdir
/root/src/testsuite --outdir=artifacts/systemt

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallet testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
  2016-09-15 14:28 ` [Bug testsuite/20600] " dsmith at redhat dot com
@ 2016-09-15 19:40 ` dsmith at redhat dot com
  2016-09-15 19:53 ` [Bug testsuite/20600] parallel " dsmith at redhat dot com
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-15 19:40 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

--- Comment #2 from David Smith <dsmith at redhat dot com> ---
Poking around in the system log, I find this:

====
[  945.698507] stap_5af583ef6fbd2cecdd31d1bbd9fb86b7_20528: systemtap:
3.1/0.163, base: d00000000c450000, memory:
218data/116text/180ctx/2058net/133alloc kb, probes: 52 
[  947.101337] stap_1d57d05694160e9f9f0b1dd5dccc510_20591: systemtap:
3.1/0.163, base: d00000000d910000, memory:
5399data/71text/258ctx/2058net/2293alloc kb, probes: 7 
[  947.406869] BUG: sleeping function called from invalid context at
kernel/mutex.c:576 
[  947.406889] in_atomic(): 1, irqs_disabled(): 0, pid: 20631, name: stapio 
[  947.406894] INFO: lockdep is turned off. 
[  947.406900] CPU: 12 PID: 20631 Comm: stapio Tainted: G           OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  947.406905] Call Trace: 
[  947.406912] [c00000058d677980] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  947.406920] [c00000058d677a40] [c000000000a534b0] dump_stack+0x30/0x44 
[  947.406927] [c00000058d677a60] [c00000000013af14] __might_sleep+0x1c4/0x2d0 
[  947.406933] [c00000058d677ae0] [c000000000a2e0b4]
mutex_lock_nested+0x74/0x5b0 
[  947.406941] [c00000058d677be0] [c0000000001301a0]
srcu_notifier_chain_register+0x40/0x120 
[  947.406948] [c00000058d677c10] [c0000000007da200]
cpufreq_register_notifier+0xa0/0xc0 
[  947.406962] [c00000058d677c30] [d00000000e54da9c]
systemtap_module_init+0x6ac/0x28f0 [nd_s_20631] 
[  947.406975] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631] 
[  947.406983] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  947.406989] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4 
[  947.411401] nd_s_20631: systemtap: 3.1/0.163, base: d00000000e480000,
memory: 754data/850text/583ctx/2058net/13050alloc kb, probes: 1023 
[  947.576979] BUG: scheduling while atomic: stapio/20631/0x10000002 
[  947.576995] INFO: lockdep is turned off. 
[  947.576998] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  947.577057] CPU: 12 PID: 20631 Comm: stapio Tainted: G           OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  947.577063] Call Trace: 
[  947.577070] [c00000058d6777c0] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  947.577079] [c00000058d677880] [c000000000a534b0] dump_stack+0x30/0x44 
[  947.577087] [c00000058d6778a0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  947.577094] [c00000058d677910] [c000000000a33340] __schedule+0xb80/0xcd0 
[  947.577102] [c00000058d6779e0] [c000000000a33588] _cond_resched+0x58/0x70 
[  947.577109] [c00000058d677a10] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0 
[  947.577117] [c00000058d677b10] [c000000000a3f974]
register_kprobe+0x104/0x800 
[  947.577123] [c00000058d677bc0] [c000000000a40210]
register_kretprobe+0x1a0/0x280 
[  947.577142] [c00000058d677c00] [d00000000e494c4c]
stapkp_register_probe+0xcc/0x270 [nd_s_20631] 
[  947.577157] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631] 
[  947.577172] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631] 
[  947.577180] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  947.577187] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4 
[  947.608580] BUG: scheduling while atomic: stapio/20631/0x10000002 
[  947.608595] INFO: lockdep is turned off. 
[  947.608599] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  947.608662] CPU: 0 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  947.608669] Call Trace: 
[  947.608676] [c00000058d6777c0] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  947.608686] [c00000058d677880] [c000000000a534b0] dump_stack+0x30/0x44 
[  947.608693] [c00000058d6778a0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  947.608702] [c00000058d677910] [c000000000a33340] __schedule+0xb80/0xcd0 
[  947.608709] [c00000058d6779e0] [c000000000a33588] _cond_resched+0x58/0x70 
[  947.608717] [c00000058d677a10] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0 
[  947.608724] [c00000058d677b10] [c000000000a3f974]
register_kprobe+0x104/0x800 
[  947.608731] [c00000058d677bc0] [c000000000a40210]
register_kretprobe+0x1a0/0x280 
[  947.608750] [c00000058d677c00] [d00000000e494c4c]
stapkp_register_probe+0xcc/0x270 [nd_s_20631] 
[  947.608766] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631] 
[  947.608782] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631] 
[  947.608790] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  947.608798] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4 
[  948.177884] BUG: scheduling while atomic: stapio/20631/0x10000002 
[  948.177899] INFO: lockdep is turned off. 
[  948.177903] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  948.177966] CPU: 0 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  948.177972] Call Trace: 
[  948.177979] [c00000058d677800] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  948.177989] [c00000058d6778c0] [c000000000a534b0] dump_stack+0x30/0x44 
[  948.177996] [c00000058d6778e0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  948.178004] [c00000058d677950] [c000000000a33340] __schedule+0xb80/0xcd0 
[  948.178012] [c00000058d677a20] [c000000000a33588] _cond_resched+0x58/0x70 
[  948.178019] [c00000058d677a50] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0 
[  948.178026] [c00000058d677b50] [c000000000a3f974]
register_kprobe+0x104/0x800 
[  948.178044] [c00000058d677c00] [d00000000e494cd8]
stapkp_register_probe+0x158/0x270 [nd_s_20631] 
[  948.178060] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631] 
[  948.178075] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631] 
[  948.178083] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  948.178090] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4 
[  948.366981] BUG: scheduling while atomic: stapio/20631/0x10000002 
[  948.366996] INFO: lockdep is turned off. 
[  948.367000] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  948.367066] CPU: 8 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  948.367073] Call Trace: 
[  948.367080] [c00000058d677800] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  948.367090] [c00000058d6778c0] [c000000000a534b0] dump_stack+0x30/0x44 
[  948.367098] [c00000058d6778e0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  948.367106] [c00000058d677950] [c000000000a33340] __schedule+0xb80/0xcd0 
[  948.367113] [c00000058d677a20] [c000000000a33588] _cond_resched+0x58/0x70 
[  948.367121] [c00000058d677a50] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0 
[  948.367128] [c00000058d677b50] [c000000000a3f974]
register_kprobe+0x104/0x800 
[  948.367146] [c00000058d677c00] [d00000000e494cd8]
stapkp_register_probe+0x158/0x270 [nd_s_20631] 
[  948.367162] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631] 
[  948.367178] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631] 
[  948.367186] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  948.367194] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4 
[  948.407004] BUG: sleeping function called from invalid context at
kernel/mutex.c:576 
[  948.407016] in_atomic(): 1, irqs_disabled(): 0, pid: 20631, name: stapio 
[  948.407021] INFO: lockdep is turned off. 
[  948.407027] CPU: 8 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  948.407033] Call Trace: 
[  948.407038] [c00000058d6778f0] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  948.407048] [c00000058d6779b0] [c000000000a534b0] dump_stack+0x30/0x44 
[  948.407055] [c00000058d6779d0] [c00000000013af14] __might_sleep+0x1c4/0x2d0 
[  948.407063] [c00000058d677a50] [c000000000a2e0b4]
mutex_lock_nested+0x74/0x5b0 
[  948.407071] [c00000058d677b50] [c000000000a3f974]
register_kprobe+0x104/0x800 
[  948.407087] [c00000058d677c00] [d00000000e494cd8]
stapkp_register_probe+0x158/0x270 [nd_s_20631] 
[  948.407103] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631] 
[  948.407119] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631] 
[  948.407126] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  948.407134] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4 
[  948.578170] BUG: scheduling while atomic: stapio/20631/0x10000002 
[  948.578184] INFO: lockdep is turned off. 
[  948.578188] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  948.578259] CPU: 8 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  948.578266] Call Trace: 
[  948.578272] [c00000058d677800] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  948.578282] [c00000058d6778c0] [c000000000a534b0] dump_stack+0x30/0x44 
[  948.578290] [c00000058d6778e0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  948.578299] [c00000058d677950] [c000000000a33340] __schedule+0xb80/0xcd0 
[  948.578307] [c00000058d677a20] [c000000000a33588] _cond_resched+0x58/0x70 
[  948.578315] [c00000058d677a50] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0 
[  948.578323] [c00000058d677b50] [c000000000a3f974]
register_kprobe+0x104/0x800 
[  948.578343] [c00000058d677c00] [d00000000e494cd8]
stapkp_register_probe+0x158/0x270 [nd_s_20631] 
[  948.578360] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631] 
[  948.578376] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631] 
[  948.578384] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  948.578393] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4 
[  948.583844] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  948.583851] INFO: lockdep is turned off. 
[  948.583856] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  948.583920] CPU: 8 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  948.583926] Call Trace: 
[  948.583931] [c00000058d6774e0] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  948.583940] [c00000058d6775a0] [c000000000a534b0] dump_stack+0x30/0x44 
[  948.583948] [c00000058d6775c0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  948.583956] [c00000058d677630] [c000000000a33340] __schedule+0xb80/0xcd0 
[  948.583964] [c00000058d677700] [c000000000a316ec]
schedule_hrtimeout_range_clock+0x14c/0x560 
[  948.583972] [c00000058d6777e0] [c00000000037e418]
poll_schedule_timeout+0x58/0x130 
[  948.583981] [c00000058d677810] [c00000000037f07c] do_select+0x7cc/0xa30 
[  948.583988] [c00000058d677bd0] [c00000000037f74c]
core_sys_select+0x46c/0x5b0 
[  948.583996] [c00000058d677d70] [c00000000037fcf0] SyS_pselect6+0x1a0/0x3a0 
[  948.584004] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  948.586598] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  948.586603] INFO: lockdep is turned off. 
[  948.586607] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  948.586671] CPU: 8 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  948.586677] Call Trace: 
[  948.586682] [c00000058d677790] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  948.586717] [c00000058d677850] [c000000000a534b0] dump_stack+0x30/0x44 
[  948.586733] [c00000058d677870] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  948.586750] [c00000058d6778e0] [c000000000a33340] __schedule+0xb80/0xcd0 
[  948.586767] [c00000058d6779b0] [c000000000a2d2a4]
schedule_timeout+0x324/0x480 
[  948.586785] [c00000058d677ae0] [c000000000a33954]
wait_for_completion+0x114/0x1b0 
[  948.586798] [c00000058d677b50] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0 
[  948.586811] [c00000058d677bf0] [c0000000002148c0]
synchronize_sched+0xc0/0x180 
[  948.586834] [c00000058d677c20] [d00000000e54c9cc]
_stp_cleanup_and_exit+0x1ac/0xa20 [nd_s_20631] 
[  948.586853] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  948.586862] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  948.586871] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  949.256861] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  949.256877] INFO: lockdep is turned off. 
[  949.256883] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  949.256955] CPU: 9 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  949.256963] Call Trace: 
[  949.256972] [c00000058d677790] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  949.256983] [c00000058d677850] [c000000000a534b0] dump_stack+0x30/0x44 
[  949.256991] [c00000058d677870] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  949.257000] [c00000058d6778e0] [c000000000a33340] __schedule+0xb80/0xcd0 
[  949.257008] [c00000058d6779b0] [c000000000a2d2a4]
schedule_timeout+0x324/0x480 
[  949.257016] [c00000058d677ae0] [c000000000a33954]
wait_for_completion+0x114/0x1b0 
[  949.257025] [c00000058d677b50] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0 
[  949.257033] [c00000058d677bf0] [c0000000002148c0]
synchronize_sched+0xc0/0x180 
[  949.257053] [c00000058d677c20] [d00000000e54ca2c]
_stp_cleanup_and_exit+0x20c/0xa20 [nd_s_20631] 
[  949.257070] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  949.257079] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  949.257087] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  949.347065] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  949.347079] INFO: lockdep is turned off. 
[  949.347085] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  949.347185] CPU: 9 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  949.347194] Call Trace: 
[  949.347204] [c00000058d677700] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  949.347219] [c00000058d6777c0] [c000000000a534b0] dump_stack+0x30/0x44 
[  949.347231] [c00000058d6777e0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  949.347243] [c00000058d677850] [c000000000a33340] __schedule+0xb80/0xcd0 
[  949.347255] [c00000058d677920] [c000000000a2d2a4]
schedule_timeout+0x324/0x480 
[  949.347267] [c00000058d677a50] [c000000000a33954]
wait_for_completion+0x114/0x1b0 
[  949.347279] [c00000058d677ac0] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0 
[  949.347291] [c00000058d677b60] [c0000000002148c0]
synchronize_sched+0xc0/0x180 
[  949.347303] [c00000058d677b90] [c000000000a4086c]
unregister_kprobes.part.14+0x9c/0x170 
[  949.347330] [c00000058d677be0] [d00000000e496190]
stapkp_batch_unregister_probes.constprop.662+0x90/0x190 [nd_s_20631] 
[  949.347357] [c00000058d677c20] [d00000000e54ca54]
_stp_cleanup_and_exit+0x234/0xa20 [nd_s_20631] 
[  949.347378] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  949.347387] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  949.347396] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  949.516763] BUG: sleeping function called from invalid context at
kernel/mutex.c:576 
[  949.516790] in_atomic(): 1, irqs_disabled(): 0, pid: 20631, name: stapio 
[  949.516803] INFO: lockdep is turned off. 
[  949.516819] CPU: 9 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  949.516835] Call Trace: 
[  949.516850] [c00000058d6778c0] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  949.516872] [c00000058d677980] [c000000000a534b0] dump_stack+0x30/0x44 
[  949.516888] [c00000058d6779a0] [c00000000013af14] __might_sleep+0x1c4/0x2d0 
[  949.516903] [c00000058d677a20] [c000000000a2e0b4]
mutex_lock_nested+0x74/0x5b0 
[  949.516918] [c00000058d677b20] [c000000000a3f358]
__free_insn_slot+0x38/0x1b0 
[  949.516932] [c00000058d677b60] [c000000000a390d8]
arch_remove_kprobe+0x38/0x60 
[  949.516944] [c00000058d677b90] [c000000000a40918]
unregister_kprobes.part.14+0x148/0x170 
[  949.516970] [c00000058d677be0] [d00000000e496190]
stapkp_batch_unregister_probes.constprop.662+0x90/0x190 [nd_s_20631] 
[  949.516991] [c00000058d677c20] [d00000000e54ca54]
_stp_cleanup_and_exit+0x234/0xa20 [nd_s_20631] 
[  949.517009] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  949.517019] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  949.517029] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  949.517386] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  949.517393] INFO: lockdep is turned off. 
[  949.517397] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  949.517471] CPU: 9 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  949.517478] Call Trace: 
[  949.517484] [c00000058d6776e0] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  949.517493] [c00000058d6777a0] [c000000000a534b0] dump_stack+0x30/0x44 
[  949.517502] [c00000058d6777c0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  949.517510] [c00000058d677830] [c000000000a33340] __schedule+0xb80/0xcd0 
[  949.517519] [c00000058d677900] [c000000000a2d2a4]
schedule_timeout+0x324/0x480 
[  949.517528] [c00000058d677a30] [c000000000a33954]
wait_for_completion+0x114/0x1b0 
[  949.517538] [c00000058d677aa0] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0 
[  949.517547] [c00000058d677b40] [c0000000002148c0]
synchronize_sched+0xc0/0x180 
[  949.517554] [c00000058d677b70] [c000000000a40c88]
unregister_kretprobes.part.16+0xb8/0x2b0 
[  949.517570] [c00000058d677be0] [d00000000e4961e8]
stapkp_batch_unregister_probes.constprop.662+0xe8/0x190 [nd_s_20631] 
[  949.517586] [c00000058d677c20] [d00000000e54ca54]
_stp_cleanup_and_exit+0x234/0xa20 [nd_s_20631] 
[  949.517602] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  949.517610] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  949.517618] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  949.696344] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  949.696361] INFO: lockdep is turned off. 
[  949.696365] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  949.696429] CPU: 9 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  949.696436] Call Trace: 
[  949.696443] [c00000058d677790] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  949.696453] [c00000058d677850] [c000000000a534b0] dump_stack+0x30/0x44 
[  949.696461] [c00000058d677870] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  949.696469] [c00000058d6778e0] [c000000000a33340] __schedule+0xb80/0xcd0 
[  949.696476] [c00000058d6779b0] [c000000000a2d2a4]
schedule_timeout+0x324/0x480 
[  949.696483] [c00000058d677ae0] [c000000000a33954]
wait_for_completion+0x114/0x1b0 
[  949.696492] [c00000058d677b50] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0 
[  949.696499] [c00000058d677bf0] [c0000000002148c0]
synchronize_sched+0xc0/0x180 
[  949.696518] [c00000058d677c20] [d00000000e54caa8]
_stp_cleanup_and_exit+0x288/0xa20 [nd_s_20631] 
[  949.696532] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  949.696540] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  949.696548] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  949.766771] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  949.766788] INFO: lockdep is turned off. 
[  949.766795] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  949.766904] CPU: 9 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  949.766914] Call Trace: 
[  949.766923] [c00000058d6778a0] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  949.766936] [c00000058d677960] [c000000000a534b0] dump_stack+0x30/0x44 
[  949.766946] [c00000058d677980] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  949.766956] [c00000058d6779f0] [c000000000a33340] __schedule+0xb80/0xcd0 
[  949.766965] [c00000058d677ac0] [c000000000a2d178]
schedule_timeout+0x1f8/0x480 
[  949.766974] [c00000058d677bf0] [c0000000000f6c6c] msleep+0x3c/0x60 
[  949.766996] [c00000058d677c20] [d00000000e54ccc0]
_stp_cleanup_and_exit+0x4a0/0xa20 [nd_s_20631] 
[  949.767016] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  949.767025] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  949.767035] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  950.026694] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  950.026711] INFO: lockdep is turned off. 
[  950.026716] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  950.026788] irq event stamp: 0 
[  950.026792] hardirqs last  enabled at (0): [<          (null)>]          
(null) 
[  950.026799] hardirqs last disabled at (0): [<c0000000000d19ac>]
copy_process.part.22+0x5ec/0x1050 
[  950.026811] softirqs last  enabled at (0): [<c0000000000d19ac>]
copy_process.part.22+0x5ec/0x1050 
[  950.026820] softirqs last disabled at (0): [<          (null)>]          
(null) 
[  950.026830] CPU: 9 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  950.026837] Call Trace: 
[  950.026842] [c00000058d677a00] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  950.026852] [c00000058d677ac0] [c000000000a534b0] dump_stack+0x30/0x44 
[  950.026860] [c00000058d677ae0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  950.026869] [c00000058d677b50] [c000000000a33340] __schedule+0xb80/0xcd0 
[  950.026888] [c00000058d677c20] [d00000000e54ccc8]
_stp_cleanup_and_exit+0x4a8/0xa20 [nd_s_20631] 
[  950.026904] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  950.026912] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  950.026920] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  950.026939] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  950.026944] INFO: lockdep is turned off. 
[  950.026947] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466] 
[  950.027008] CPU: 9 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  950.027014] Call Trace: 
[  950.027019] [c00000058d677790] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  950.027028] [c00000058d677850] [c000000000a534b0] dump_stack+0x30/0x44 
[  950.027036] [c00000058d677870] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  950.027043] [c00000058d6778e0] [c000000000a33340] __schedule+0xb80/0xcd0 
[  950.027051] [c00000058d6779b0] [c000000000a2d2a4]
schedule_timeout+0x324/0x480 
[  950.027060] [c00000058d677ae0] [c000000000a33954]
wait_for_completion+0x114/0x1b0 
[  950.027068] [c00000058d677b50] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0 
[  950.027077] [c00000058d677bf0] [c0000000002148c0]
synchronize_sched+0xc0/0x180 
[  950.027092] [c00000058d677c20] [d00000000e54cce8]
_stp_cleanup_and_exit+0x4c8/0xa20 [nd_s_20631] 
[  950.027108] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  950.027116] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  950.027124] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  950.147890] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  950.147905] INFO: lockdep is turned off. 
[  950.147910] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
stap_872ad7238edaf2bf03978db3e1da69b_19920] 
[  950.147974] CPU: 10 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  950.147981] Call Trace: 
[  950.147989] [c00000058d677710] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  950.147999] [c00000058d6777d0] [c000000000a534b0] dump_stack+0x30/0x44 
[  950.148007] [c00000058d6777f0] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  950.148015] [c00000058d677860] [c000000000a33340] __schedule+0xb80/0xcd0 
[  950.148023] [c00000058d677930] [c000000000a2d2a4]
schedule_timeout+0x324/0x480 
[  950.148031] [c00000058d677a60] [c000000000a33954]
wait_for_completion+0x114/0x1b0 
[  950.148039] [c00000058d677ad0] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0 
[  950.148048] [c00000058d677b70] [c0000000002148c0]
synchronize_sched+0xc0/0x180 
[  950.148066] [c00000058d677ba0] [d00000000e488d08]
_stp_runtime_contexts_free+0x328/0x4e0 [nd_s_20631] 
[  950.148082] [c00000058d677c20] [d00000000e54d010]
_stp_cleanup_and_exit+0x7f0/0xa20 [nd_s_20631] 
[  950.148098] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  950.148107] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  950.148115] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  950.266974] BUG: scheduling while atomic: stapio/20631/0x00000002 
[  950.266990] INFO: lockdep is turned off. 
[  950.266994] Modules linked in: nd_s_20631(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
stap_1d57d05694160e9f9f0b1dd5dccc510_20591] 
[  950.267060] CPU: 10 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  950.267067] Call Trace: 
[  950.267075] [c00000058d677760] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  950.267088] [c00000058d677820] [c000000000a534b0] dump_stack+0x30/0x44 
[  950.267097] [c00000058d677840] [c000000000a4775c] __schedule_bug+0x80/0xa8 
[  950.267106] [c00000058d6778b0] [c000000000a33340] __schedule+0xb80/0xcd0 
[  950.267113] [c00000058d677980] [c000000000a2d2a4]
schedule_timeout+0x324/0x480 
[  950.267122] [c00000058d677ab0] [c000000000a33954]
wait_for_completion+0x114/0x1b0 
[  950.267131] [c00000058d677b20] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0 
[  950.267140] [c00000058d677bc0] [c0000000002148c0]
synchronize_sched+0xc0/0x180 
[  950.267148] [c00000058d677bf0] [c00000000012fbf4]
atomic_notifier_chain_unregister+0xc4/0xf0 
[  950.267166] [c00000058d677c20] [d00000000e54c97c]
_stp_cleanup_and_exit+0x15c/0xa20 [nd_s_20631] 
[  950.267183] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631] 
[  950.267191] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400 
[  950.267199] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10 
[  950.576757] BUG: sleeping function called from invalid context at
kernel/mutex.c:576 
[  950.576775] in_atomic(): 1, irqs_disabled(): 0, pid: 20631, name: stapio 
[  950.576782] INFO: lockdep is turned off. 
[  950.576790] CPU: 10 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  950.576797] Call Trace: 
[  950.576807] [c00000058d677b70] [c000000000019d28] show_stack+0x88/0x390
(unreliable) 
[  950.576819] [c00000058d677c30] [c000000000a534b0] dump_stack+0x30/0x44 
[  950.576830] [c00000058d677c50] [c00000000013af14] __might_sleep+0x1c4/0x2d0 
[  950.576839] [c00000058d677cd0] [c000000000a2e0b4]
mutex_lock_nested+0x74/0x5b0 
[  950.576848] [c00000058d677dd0] [c000000000357f80] SyS_read+0x210/0x390 
[  950.576858] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4 
[  950.576900] Page fault in user mode with in_atomic() = 1 mm =
c00000004d75af80 
[  950.576908] NIP = 3fff931661d8  MSR = 800000010200d033 
[  950.576914] Oops: Weird page fault, sig: 11 [#1] 
[  950.576920] SMP NR_CPUS=2048 NUMA pSeries 
[  950.576929] Modules linked in: nd_s_20631(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
stap_1d57d05694160e9f9f0b1dd5dccc510_20591] 
[  950.576998] CPU: 10 PID: 20631 Comm: stapio Tainted: G        W  OE 
------------   3.10.0-327.el7.ppc64le.debug #1 
[  950.577006] task: c0000005e92625b0 ti: c00000058d674000 task.ti:
c00000058d674000 
[  950.577013] NIP: 00003fff931661d8 LR: 00003fff93166820 CTR: 00003fff93245020 
[  950.577020] REGS: c00000058d677ea0 TRAP: 0301   Tainted: G        W  OE 
------------    (3.10.0-327.el7.ppc64le.debug) 
[  950.577027] MSR: 800000010200d033 <SF,VEC,EE,PR,ME,IR,DR,RI,LE>  CR:
42004444  XER: 00000000 
[  950.577056] CFAR: 00003fff9316681c DAR: 00003ffff092f270 DSISR: 42000000
SOFTE: 1  
GPR00: 00003fff93166820 00003ffff09313d0 00003fff932e74f0 00003fff932e1160  
GPR04: 000000001000ce30 00003ffff0931ae8 00003ffff0931ae8 0000000000000000  
GPR08: 0000000000000001 0000000000000004 ffffffffffffffff 0000000000000005  
GPR12: 0000000000004400 00003fff933aae80  
[  950.577113] NIP [00003fff931661d8] 0x3fff931661d8 
[  950.577119] LR [00003fff93166820] 0x3fff93166820 
[  950.577124] Call Trace: 
[  950.577135] ---[ end trace d99753e79c16f711 ]--- 
[  950.581503]  
[  950.581509] note: stapio[20631] exited with preempt_count 1
====

If I'm reading that correctly, stapio caused a multitude of "sleeping function
called from invalid context" bugs, then died with a signal 11 (SIGSEGV).

I wonder if tcl/expect handles a child getting a SIGSEGV properly. I also
wonder why stapio had the bug in the first place.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
  2016-09-15 14:28 ` [Bug testsuite/20600] " dsmith at redhat dot com
  2016-09-15 19:40 ` dsmith at redhat dot com
@ 2016-09-15 19:53 ` dsmith at redhat dot com
  2016-09-15 20:13 ` dsmith at redhat dot com
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-15 19:53 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|parallet testsuite hang in  |parallel testsuite hang in
                   |[nd_]syscall.exp            |[nd_]syscall.exp

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
                   ` (2 preceding siblings ...)
  2016-09-15 19:53 ` [Bug testsuite/20600] parallel " dsmith at redhat dot com
@ 2016-09-15 20:13 ` dsmith at redhat dot com
  2016-09-22 14:26 ` dsmith at redhat dot com
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-15 20:13 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

--- Comment #3 from David Smith <dsmith at redhat dot com> ---
> I wonder if tcl/expect handles a child getting a SIGSEGV properly.

If I write a simple C program that looks like:

====
#include <sys/types.h>
#include <signal.h>
#include <unistd.h>

int
main()
{
        kill(getpid(), SIGSEGV);
        return 0;
}
====

and then do:

====
# tclsh
% catch { exec ./sigsegv }
1
====

That executes immediately, no hang.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
                   ` (3 preceding siblings ...)
  2016-09-15 20:13 ` dsmith at redhat dot com
@ 2016-09-22 14:26 ` dsmith at redhat dot com
  2016-09-22 18:09 ` dsmith at redhat dot com
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-22 14:26 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

--- Comment #4 from David Smith <dsmith at redhat dot com> ---
Here's an update.

I believe I've tracked this down to _stp_init_time(). I've added lots of
'might_sleep()' calls to that function in my local copy of systemtap, and I got
the following:

====
Sep 21 22:32:40 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: BUG: sleeping
function called from invalid context at
/usr/local/share/systemtap/runtime/time.c:323
Sep 21 22:32:40 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: in_atomic(): 1,
irqs_disabled(): 0, pid: 5960, name: stapio
Sep 21 22:32:40 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: INFO: lockdep is
turned off.
====

Here's that section of runtime/time.c (with line numbers):

====
   310      might_sleep();
   311      stp_time = _stp_alloc_percpu(sizeof(stp_time_t));
   312      if (unlikely(stp_time == 0))
   313              return -1;
   314  
   315      might_sleep();
   316  #ifdef STAPCONF_ONEACHCPU_RETRY
   317      ret = on_each_cpu(__stp_init_time, NULL, 0, 1);
   318  #else
   319      ret = on_each_cpu(__stp_init_time, NULL, 1);
   320  #endif
   321  
   322  #ifdef STAPCONF_ADD_TIMER_ON
   323      might_sleep();
   324      for_each_online_cpu(cpu) {
   325          stp_time_t *time = per_cpu_ptr(stp_time, cpu);
   326          add_timer_on(&time->timer, cpu);
   327      }
   328  #endif
===

I believe this means that something in the following line is causing us to
become atomic:

    ret = on_each_cpu(__stp_init_time, NULL, 1);

At line 315, might_sleep() didn't complain, but at line 323 we're suddenly
atomic.

On RHEL7, on_each_cpu() looks like the following:

====
static inline int on_each_cpu(smp_call_func_t func, void *info, int wait)
{
        unsigned long flags;

        local_irq_save(flags);
        func(info);
        local_irq_restore(flags);
        return 0;
}
====

That matches up to the kernel message above, since irqs aren't disabled. So, my
guess is that something in __stp_init_time() is causing us to become atomic.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
                   ` (4 preceding siblings ...)
  2016-09-22 14:26 ` dsmith at redhat dot com
@ 2016-09-22 18:09 ` dsmith at redhat dot com
  2017-02-28 16:13 ` dsmith at redhat dot com
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-22 18:09 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

--- Comment #5 from David Smith <dsmith at redhat dot com> ---
(In reply to David Smith from comment #4)
> Here's an update.

... stuff deleted ...

> On RHEL7, on_each_cpu() looks like the following:

Nope, that was on_each_cpu() for a single-processor kernel. Here's
on_each_cpu() for a SMP kernel:

====
/*
 * Call a function on all processors.  May be used during early boot while
 * early_boot_irqs_disabled is set.  Use local_irq_save/restore() instead
 * of local_irq_disable/enable().
 */
int on_each_cpu(void (*func) (void *info), void *info, int wait)
{
        unsigned long flags;
        int ret = 0;

        preempt_disable();
        ret = smp_call_function(func, info, wait);
        local_irq_save(flags);
        func(info);
        local_irq_restore(flags);
        preempt_enable();
        return ret;
}
====

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
                   ` (5 preceding siblings ...)
  2016-09-22 18:09 ` dsmith at redhat dot com
@ 2017-02-28 16:13 ` dsmith at redhat dot com
  2017-03-02  7:29 ` mcermak at redhat dot com
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2017-02-28 16:13 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from David Smith <dsmith at redhat dot com> ---
Fixed in commit ed948ab. The parallel testsuite seems to run to completion on
all arches.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
                   ` (6 preceding siblings ...)
  2017-02-28 16:13 ` dsmith at redhat dot com
@ 2017-03-02  7:29 ` mcermak at redhat dot com
  2017-03-03 19:17 ` dsmith at redhat dot com
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mcermak at redhat dot com @ 2017-03-02  7:29 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

Martin Cermak <mcermak at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mcermak at redhat dot com

--- Comment #7 from Martin Cermak <mcermak at redhat dot com> ---
Tested couple times and saw no stalls in the syscall tests.  Looks great!

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
                   ` (7 preceding siblings ...)
  2017-03-02  7:29 ` mcermak at redhat dot com
@ 2017-03-03 19:17 ` dsmith at redhat dot com
  2023-12-06 20:55 ` wcohen at redhat dot com
  2024-02-21 13:25 ` fche at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2017-03-03 19:17 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #8 from David Smith <dsmith at redhat dot com> ---
After fixing an uninitialized variable error in the fix, the hang is back
(although in a slightly different way). Sigh.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
                   ` (8 preceding siblings ...)
  2017-03-03 19:17 ` dsmith at redhat dot com
@ 2023-12-06 20:55 ` wcohen at redhat dot com
  2024-02-21 13:25 ` fche at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2023-12-06 20:55 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

William Cohen <wcohen at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wcohen at redhat dot com

--- Comment #9 from William Cohen <wcohen at redhat dot com> ---
Not possible to replicate this issue now "make installcheck-parallel" was
removed by commit 12c28db3271:

Author: Frank Ch. Eigler <fche@redhat.com>  2023-10-12 12:45:09
Committer: Frank Ch. Eigler <fche@redhat.com>  2023-10-12 13:11:35
Parent: 3b2b14c5d095e7c82594d79aabca5d407692eab4 (Teach stap-prep to use new
versions of dnf (if available))
Child:  823fcc4143c254bd847a1b4592b1580ee842239d (testsuite: drop busybox test
case)
Branches: master, mcermak-pr3040, remotes/origin/master,
remotes/origin/mcermak-pr3040, remotes/origin/uprobe-reg2, wcohen/dwarf5callee
Follows: release-4.9a
Precedes: release-5.0a

    testsuite: simplify Makefile drivers

    It was reported that "make installcheck" broke with commit
    218c26a523816.  Investigation pointed at quoting mishaps somewhere in
    the Makefile machinery related to parallel / partial testsuite runs.
    While this logic was clever and appeared useful for a time, it seems
    fragile in practice and not in active use after all.  So let's nuke
    all of it.

    Moved the environment_sanity.exp test into the systemtap/ subdirectory
    to apprx. guarantee that it's run first, no Makefile magic needed.  It
    still exits dejagnu entirely if it fails.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
  2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
                   ` (9 preceding siblings ...)
  2023-12-06 20:55 ` wcohen at redhat dot com
@ 2024-02-21 13:25 ` fche at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2024-02-21 13:25 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=20600

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |OBSOLETE
                 CC|                            |fche at redhat dot com
             Status|REOPENED                    |RESOLVED

--- Comment #10 from Frank Ch. Eigler <fche at redhat dot com> ---
parallel testsuite mode gone, and many concurrency fixes to the
runtime/transport have come in since the original report

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-02-21 13:26 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
2016-09-15 14:28 ` [Bug testsuite/20600] " dsmith at redhat dot com
2016-09-15 19:40 ` dsmith at redhat dot com
2016-09-15 19:53 ` [Bug testsuite/20600] parallel " dsmith at redhat dot com
2016-09-15 20:13 ` dsmith at redhat dot com
2016-09-22 14:26 ` dsmith at redhat dot com
2016-09-22 18:09 ` dsmith at redhat dot com
2017-02-28 16:13 ` dsmith at redhat dot com
2017-03-02  7:29 ` mcermak at redhat dot com
2017-03-03 19:17 ` dsmith at redhat dot com
2023-12-06 20:55 ` wcohen at redhat dot com
2024-02-21 13:25 ` fche at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).