* [Bug testsuite/20600] parallet testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
@ 2016-09-15 14:28 ` dsmith at redhat dot com
2016-09-15 19:40 ` dsmith at redhat dot com
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-15 14:28 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
--- Comment #1 from David Smith <dsmith at redhat dot com> ---
When this happens, it looks like stapio's parent gets set to expect:
[root@ibm-p8-01-lp7 rhel7-ppc64le]# ps -l 20631
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
4 Z 0 20631 3434 0 80 0 - 0 exit ? 0:03 [stapio]
<defunct>
[root@ibm-p8-01-lp7 rhel7-ppc64le]# ps -wl 3434
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 S 0 3434 3427 0 80 0 - 391 pipe_r ? 0:02 expect --
/usr/share/dejagnu/runtest.exp --tool systemtap --tool_opts --srcdir
/root/src/testsuite --outdir=artifacts/systemt
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallet testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
2016-09-15 14:28 ` [Bug testsuite/20600] " dsmith at redhat dot com
@ 2016-09-15 19:40 ` dsmith at redhat dot com
2016-09-15 19:53 ` [Bug testsuite/20600] parallel " dsmith at redhat dot com
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-15 19:40 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
--- Comment #2 from David Smith <dsmith at redhat dot com> ---
Poking around in the system log, I find this:
====
[ 945.698507] stap_5af583ef6fbd2cecdd31d1bbd9fb86b7_20528: systemtap:
3.1/0.163, base: d00000000c450000, memory:
218data/116text/180ctx/2058net/133alloc kb, probes: 52
[ 947.101337] stap_1d57d05694160e9f9f0b1dd5dccc510_20591: systemtap:
3.1/0.163, base: d00000000d910000, memory:
5399data/71text/258ctx/2058net/2293alloc kb, probes: 7
[ 947.406869] BUG: sleeping function called from invalid context at
kernel/mutex.c:576
[ 947.406889] in_atomic(): 1, irqs_disabled(): 0, pid: 20631, name: stapio
[ 947.406894] INFO: lockdep is turned off.
[ 947.406900] CPU: 12 PID: 20631 Comm: stapio Tainted: G OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 947.406905] Call Trace:
[ 947.406912] [c00000058d677980] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 947.406920] [c00000058d677a40] [c000000000a534b0] dump_stack+0x30/0x44
[ 947.406927] [c00000058d677a60] [c00000000013af14] __might_sleep+0x1c4/0x2d0
[ 947.406933] [c00000058d677ae0] [c000000000a2e0b4]
mutex_lock_nested+0x74/0x5b0
[ 947.406941] [c00000058d677be0] [c0000000001301a0]
srcu_notifier_chain_register+0x40/0x120
[ 947.406948] [c00000058d677c10] [c0000000007da200]
cpufreq_register_notifier+0xa0/0xc0
[ 947.406962] [c00000058d677c30] [d00000000e54da9c]
systemtap_module_init+0x6ac/0x28f0 [nd_s_20631]
[ 947.406975] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631]
[ 947.406983] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 947.406989] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4
[ 947.411401] nd_s_20631: systemtap: 3.1/0.163, base: d00000000e480000,
memory: 754data/850text/583ctx/2058net/13050alloc kb, probes: 1023
[ 947.576979] BUG: scheduling while atomic: stapio/20631/0x10000002
[ 947.576995] INFO: lockdep is turned off.
[ 947.576998] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 947.577057] CPU: 12 PID: 20631 Comm: stapio Tainted: G OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 947.577063] Call Trace:
[ 947.577070] [c00000058d6777c0] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 947.577079] [c00000058d677880] [c000000000a534b0] dump_stack+0x30/0x44
[ 947.577087] [c00000058d6778a0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 947.577094] [c00000058d677910] [c000000000a33340] __schedule+0xb80/0xcd0
[ 947.577102] [c00000058d6779e0] [c000000000a33588] _cond_resched+0x58/0x70
[ 947.577109] [c00000058d677a10] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0
[ 947.577117] [c00000058d677b10] [c000000000a3f974]
register_kprobe+0x104/0x800
[ 947.577123] [c00000058d677bc0] [c000000000a40210]
register_kretprobe+0x1a0/0x280
[ 947.577142] [c00000058d677c00] [d00000000e494c4c]
stapkp_register_probe+0xcc/0x270 [nd_s_20631]
[ 947.577157] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631]
[ 947.577172] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631]
[ 947.577180] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 947.577187] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4
[ 947.608580] BUG: scheduling while atomic: stapio/20631/0x10000002
[ 947.608595] INFO: lockdep is turned off.
[ 947.608599] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 947.608662] CPU: 0 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 947.608669] Call Trace:
[ 947.608676] [c00000058d6777c0] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 947.608686] [c00000058d677880] [c000000000a534b0] dump_stack+0x30/0x44
[ 947.608693] [c00000058d6778a0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 947.608702] [c00000058d677910] [c000000000a33340] __schedule+0xb80/0xcd0
[ 947.608709] [c00000058d6779e0] [c000000000a33588] _cond_resched+0x58/0x70
[ 947.608717] [c00000058d677a10] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0
[ 947.608724] [c00000058d677b10] [c000000000a3f974]
register_kprobe+0x104/0x800
[ 947.608731] [c00000058d677bc0] [c000000000a40210]
register_kretprobe+0x1a0/0x280
[ 947.608750] [c00000058d677c00] [d00000000e494c4c]
stapkp_register_probe+0xcc/0x270 [nd_s_20631]
[ 947.608766] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631]
[ 947.608782] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631]
[ 947.608790] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 947.608798] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4
[ 948.177884] BUG: scheduling while atomic: stapio/20631/0x10000002
[ 948.177899] INFO: lockdep is turned off.
[ 948.177903] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 948.177966] CPU: 0 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 948.177972] Call Trace:
[ 948.177979] [c00000058d677800] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 948.177989] [c00000058d6778c0] [c000000000a534b0] dump_stack+0x30/0x44
[ 948.177996] [c00000058d6778e0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 948.178004] [c00000058d677950] [c000000000a33340] __schedule+0xb80/0xcd0
[ 948.178012] [c00000058d677a20] [c000000000a33588] _cond_resched+0x58/0x70
[ 948.178019] [c00000058d677a50] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0
[ 948.178026] [c00000058d677b50] [c000000000a3f974]
register_kprobe+0x104/0x800
[ 948.178044] [c00000058d677c00] [d00000000e494cd8]
stapkp_register_probe+0x158/0x270 [nd_s_20631]
[ 948.178060] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631]
[ 948.178075] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631]
[ 948.178083] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 948.178090] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4
[ 948.366981] BUG: scheduling while atomic: stapio/20631/0x10000002
[ 948.366996] INFO: lockdep is turned off.
[ 948.367000] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 948.367066] CPU: 8 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 948.367073] Call Trace:
[ 948.367080] [c00000058d677800] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 948.367090] [c00000058d6778c0] [c000000000a534b0] dump_stack+0x30/0x44
[ 948.367098] [c00000058d6778e0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 948.367106] [c00000058d677950] [c000000000a33340] __schedule+0xb80/0xcd0
[ 948.367113] [c00000058d677a20] [c000000000a33588] _cond_resched+0x58/0x70
[ 948.367121] [c00000058d677a50] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0
[ 948.367128] [c00000058d677b50] [c000000000a3f974]
register_kprobe+0x104/0x800
[ 948.367146] [c00000058d677c00] [d00000000e494cd8]
stapkp_register_probe+0x158/0x270 [nd_s_20631]
[ 948.367162] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631]
[ 948.367178] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631]
[ 948.367186] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 948.367194] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4
[ 948.407004] BUG: sleeping function called from invalid context at
kernel/mutex.c:576
[ 948.407016] in_atomic(): 1, irqs_disabled(): 0, pid: 20631, name: stapio
[ 948.407021] INFO: lockdep is turned off.
[ 948.407027] CPU: 8 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 948.407033] Call Trace:
[ 948.407038] [c00000058d6778f0] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 948.407048] [c00000058d6779b0] [c000000000a534b0] dump_stack+0x30/0x44
[ 948.407055] [c00000058d6779d0] [c00000000013af14] __might_sleep+0x1c4/0x2d0
[ 948.407063] [c00000058d677a50] [c000000000a2e0b4]
mutex_lock_nested+0x74/0x5b0
[ 948.407071] [c00000058d677b50] [c000000000a3f974]
register_kprobe+0x104/0x800
[ 948.407087] [c00000058d677c00] [d00000000e494cd8]
stapkp_register_probe+0x158/0x270 [nd_s_20631]
[ 948.407103] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631]
[ 948.407119] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631]
[ 948.407126] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 948.407134] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4
[ 948.578170] BUG: scheduling while atomic: stapio/20631/0x10000002
[ 948.578184] INFO: lockdep is turned off.
[ 948.578188] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 948.578259] CPU: 8 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 948.578266] Call Trace:
[ 948.578272] [c00000058d677800] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 948.578282] [c00000058d6778c0] [c000000000a534b0] dump_stack+0x30/0x44
[ 948.578290] [c00000058d6778e0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 948.578299] [c00000058d677950] [c000000000a33340] __schedule+0xb80/0xcd0
[ 948.578307] [c00000058d677a20] [c000000000a33588] _cond_resched+0x58/0x70
[ 948.578315] [c00000058d677a50] [c000000000a2e0bc]
mutex_lock_nested+0x7c/0x5b0
[ 948.578323] [c00000058d677b50] [c000000000a3f974]
register_kprobe+0x104/0x800
[ 948.578343] [c00000058d677c00] [d00000000e494cd8]
stapkp_register_probe+0x158/0x270 [nd_s_20631]
[ 948.578360] [c00000058d677c30] [d00000000e54f278]
systemtap_module_init+0x1e88/0x28f0 [nd_s_20631]
[ 948.578376] [c00000058d677d20] [d00000000e55024c]
_stp_ctl_write_cmd+0x56c/0x780 [nd_s_20631]
[ 948.578384] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 948.578393] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4
[ 948.583844] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 948.583851] INFO: lockdep is turned off.
[ 948.583856] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 948.583920] CPU: 8 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 948.583926] Call Trace:
[ 948.583931] [c00000058d6774e0] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 948.583940] [c00000058d6775a0] [c000000000a534b0] dump_stack+0x30/0x44
[ 948.583948] [c00000058d6775c0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 948.583956] [c00000058d677630] [c000000000a33340] __schedule+0xb80/0xcd0
[ 948.583964] [c00000058d677700] [c000000000a316ec]
schedule_hrtimeout_range_clock+0x14c/0x560
[ 948.583972] [c00000058d6777e0] [c00000000037e418]
poll_schedule_timeout+0x58/0x130
[ 948.583981] [c00000058d677810] [c00000000037f07c] do_select+0x7cc/0xa30
[ 948.583988] [c00000058d677bd0] [c00000000037f74c]
core_sys_select+0x46c/0x5b0
[ 948.583996] [c00000058d677d70] [c00000000037fcf0] SyS_pselect6+0x1a0/0x3a0
[ 948.584004] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 948.586598] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 948.586603] INFO: lockdep is turned off.
[ 948.586607] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 948.586671] CPU: 8 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 948.586677] Call Trace:
[ 948.586682] [c00000058d677790] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 948.586717] [c00000058d677850] [c000000000a534b0] dump_stack+0x30/0x44
[ 948.586733] [c00000058d677870] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 948.586750] [c00000058d6778e0] [c000000000a33340] __schedule+0xb80/0xcd0
[ 948.586767] [c00000058d6779b0] [c000000000a2d2a4]
schedule_timeout+0x324/0x480
[ 948.586785] [c00000058d677ae0] [c000000000a33954]
wait_for_completion+0x114/0x1b0
[ 948.586798] [c00000058d677b50] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0
[ 948.586811] [c00000058d677bf0] [c0000000002148c0]
synchronize_sched+0xc0/0x180
[ 948.586834] [c00000058d677c20] [d00000000e54c9cc]
_stp_cleanup_and_exit+0x1ac/0xa20 [nd_s_20631]
[ 948.586853] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 948.586862] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 948.586871] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 949.256861] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 949.256877] INFO: lockdep is turned off.
[ 949.256883] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 949.256955] CPU: 9 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 949.256963] Call Trace:
[ 949.256972] [c00000058d677790] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 949.256983] [c00000058d677850] [c000000000a534b0] dump_stack+0x30/0x44
[ 949.256991] [c00000058d677870] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 949.257000] [c00000058d6778e0] [c000000000a33340] __schedule+0xb80/0xcd0
[ 949.257008] [c00000058d6779b0] [c000000000a2d2a4]
schedule_timeout+0x324/0x480
[ 949.257016] [c00000058d677ae0] [c000000000a33954]
wait_for_completion+0x114/0x1b0
[ 949.257025] [c00000058d677b50] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0
[ 949.257033] [c00000058d677bf0] [c0000000002148c0]
synchronize_sched+0xc0/0x180
[ 949.257053] [c00000058d677c20] [d00000000e54ca2c]
_stp_cleanup_and_exit+0x20c/0xa20 [nd_s_20631]
[ 949.257070] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 949.257079] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 949.257087] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 949.347065] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 949.347079] INFO: lockdep is turned off.
[ 949.347085] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 949.347185] CPU: 9 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 949.347194] Call Trace:
[ 949.347204] [c00000058d677700] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 949.347219] [c00000058d6777c0] [c000000000a534b0] dump_stack+0x30/0x44
[ 949.347231] [c00000058d6777e0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 949.347243] [c00000058d677850] [c000000000a33340] __schedule+0xb80/0xcd0
[ 949.347255] [c00000058d677920] [c000000000a2d2a4]
schedule_timeout+0x324/0x480
[ 949.347267] [c00000058d677a50] [c000000000a33954]
wait_for_completion+0x114/0x1b0
[ 949.347279] [c00000058d677ac0] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0
[ 949.347291] [c00000058d677b60] [c0000000002148c0]
synchronize_sched+0xc0/0x180
[ 949.347303] [c00000058d677b90] [c000000000a4086c]
unregister_kprobes.part.14+0x9c/0x170
[ 949.347330] [c00000058d677be0] [d00000000e496190]
stapkp_batch_unregister_probes.constprop.662+0x90/0x190 [nd_s_20631]
[ 949.347357] [c00000058d677c20] [d00000000e54ca54]
_stp_cleanup_and_exit+0x234/0xa20 [nd_s_20631]
[ 949.347378] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 949.347387] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 949.347396] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 949.516763] BUG: sleeping function called from invalid context at
kernel/mutex.c:576
[ 949.516790] in_atomic(): 1, irqs_disabled(): 0, pid: 20631, name: stapio
[ 949.516803] INFO: lockdep is turned off.
[ 949.516819] CPU: 9 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 949.516835] Call Trace:
[ 949.516850] [c00000058d6778c0] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 949.516872] [c00000058d677980] [c000000000a534b0] dump_stack+0x30/0x44
[ 949.516888] [c00000058d6779a0] [c00000000013af14] __might_sleep+0x1c4/0x2d0
[ 949.516903] [c00000058d677a20] [c000000000a2e0b4]
mutex_lock_nested+0x74/0x5b0
[ 949.516918] [c00000058d677b20] [c000000000a3f358]
__free_insn_slot+0x38/0x1b0
[ 949.516932] [c00000058d677b60] [c000000000a390d8]
arch_remove_kprobe+0x38/0x60
[ 949.516944] [c00000058d677b90] [c000000000a40918]
unregister_kprobes.part.14+0x148/0x170
[ 949.516970] [c00000058d677be0] [d00000000e496190]
stapkp_batch_unregister_probes.constprop.662+0x90/0x190 [nd_s_20631]
[ 949.516991] [c00000058d677c20] [d00000000e54ca54]
_stp_cleanup_and_exit+0x234/0xa20 [nd_s_20631]
[ 949.517009] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 949.517019] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 949.517029] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 949.517386] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 949.517393] INFO: lockdep is turned off.
[ 949.517397] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 949.517471] CPU: 9 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 949.517478] Call Trace:
[ 949.517484] [c00000058d6776e0] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 949.517493] [c00000058d6777a0] [c000000000a534b0] dump_stack+0x30/0x44
[ 949.517502] [c00000058d6777c0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 949.517510] [c00000058d677830] [c000000000a33340] __schedule+0xb80/0xcd0
[ 949.517519] [c00000058d677900] [c000000000a2d2a4]
schedule_timeout+0x324/0x480
[ 949.517528] [c00000058d677a30] [c000000000a33954]
wait_for_completion+0x114/0x1b0
[ 949.517538] [c00000058d677aa0] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0
[ 949.517547] [c00000058d677b40] [c0000000002148c0]
synchronize_sched+0xc0/0x180
[ 949.517554] [c00000058d677b70] [c000000000a40c88]
unregister_kretprobes.part.16+0xb8/0x2b0
[ 949.517570] [c00000058d677be0] [d00000000e4961e8]
stapkp_batch_unregister_probes.constprop.662+0xe8/0x190 [nd_s_20631]
[ 949.517586] [c00000058d677c20] [d00000000e54ca54]
_stp_cleanup_and_exit+0x234/0xa20 [nd_s_20631]
[ 949.517602] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 949.517610] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 949.517618] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 949.696344] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 949.696361] INFO: lockdep is turned off.
[ 949.696365] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 949.696429] CPU: 9 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 949.696436] Call Trace:
[ 949.696443] [c00000058d677790] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 949.696453] [c00000058d677850] [c000000000a534b0] dump_stack+0x30/0x44
[ 949.696461] [c00000058d677870] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 949.696469] [c00000058d6778e0] [c000000000a33340] __schedule+0xb80/0xcd0
[ 949.696476] [c00000058d6779b0] [c000000000a2d2a4]
schedule_timeout+0x324/0x480
[ 949.696483] [c00000058d677ae0] [c000000000a33954]
wait_for_completion+0x114/0x1b0
[ 949.696492] [c00000058d677b50] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0
[ 949.696499] [c00000058d677bf0] [c0000000002148c0]
synchronize_sched+0xc0/0x180
[ 949.696518] [c00000058d677c20] [d00000000e54caa8]
_stp_cleanup_and_exit+0x288/0xa20 [nd_s_20631]
[ 949.696532] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 949.696540] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 949.696548] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 949.766771] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 949.766788] INFO: lockdep is turned off.
[ 949.766795] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 949.766904] CPU: 9 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 949.766914] Call Trace:
[ 949.766923] [c00000058d6778a0] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 949.766936] [c00000058d677960] [c000000000a534b0] dump_stack+0x30/0x44
[ 949.766946] [c00000058d677980] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 949.766956] [c00000058d6779f0] [c000000000a33340] __schedule+0xb80/0xcd0
[ 949.766965] [c00000058d677ac0] [c000000000a2d178]
schedule_timeout+0x1f8/0x480
[ 949.766974] [c00000058d677bf0] [c0000000000f6c6c] msleep+0x3c/0x60
[ 949.766996] [c00000058d677c20] [d00000000e54ccc0]
_stp_cleanup_and_exit+0x4a0/0xa20 [nd_s_20631]
[ 949.767016] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 949.767025] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 949.767035] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 950.026694] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 950.026711] INFO: lockdep is turned off.
[ 950.026716] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 950.026788] irq event stamp: 0
[ 950.026792] hardirqs last enabled at (0): [< (null)>]
(null)
[ 950.026799] hardirqs last disabled at (0): [<c0000000000d19ac>]
copy_process.part.22+0x5ec/0x1050
[ 950.026811] softirqs last enabled at (0): [<c0000000000d19ac>]
copy_process.part.22+0x5ec/0x1050
[ 950.026820] softirqs last disabled at (0): [< (null)>]
(null)
[ 950.026830] CPU: 9 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 950.026837] Call Trace:
[ 950.026842] [c00000058d677a00] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 950.026852] [c00000058d677ac0] [c000000000a534b0] dump_stack+0x30/0x44
[ 950.026860] [c00000058d677ae0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 950.026869] [c00000058d677b50] [c000000000a33340] __schedule+0xb80/0xcd0
[ 950.026888] [c00000058d677c20] [d00000000e54ccc8]
_stp_cleanup_and_exit+0x4a8/0xa20 [nd_s_20631]
[ 950.026904] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 950.026912] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 950.026920] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 950.026939] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 950.026944] INFO: lockdep is turned off.
[ 950.026947] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE)
stap_872ad7238edaf2bf03978db3e1da69b_19920(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
nd_s_20466]
[ 950.027008] CPU: 9 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 950.027014] Call Trace:
[ 950.027019] [c00000058d677790] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 950.027028] [c00000058d677850] [c000000000a534b0] dump_stack+0x30/0x44
[ 950.027036] [c00000058d677870] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 950.027043] [c00000058d6778e0] [c000000000a33340] __schedule+0xb80/0xcd0
[ 950.027051] [c00000058d6779b0] [c000000000a2d2a4]
schedule_timeout+0x324/0x480
[ 950.027060] [c00000058d677ae0] [c000000000a33954]
wait_for_completion+0x114/0x1b0
[ 950.027068] [c00000058d677b50] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0
[ 950.027077] [c00000058d677bf0] [c0000000002148c0]
synchronize_sched+0xc0/0x180
[ 950.027092] [c00000058d677c20] [d00000000e54cce8]
_stp_cleanup_and_exit+0x4c8/0xa20 [nd_s_20631]
[ 950.027108] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 950.027116] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 950.027124] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 950.147890] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 950.147905] INFO: lockdep is turned off.
[ 950.147910] Modules linked in: nd_s_20631(OE)
stap_1d57d05694160e9f9f0b1dd5dccc510_20591(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
stap_872ad7238edaf2bf03978db3e1da69b_19920]
[ 950.147974] CPU: 10 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 950.147981] Call Trace:
[ 950.147989] [c00000058d677710] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 950.147999] [c00000058d6777d0] [c000000000a534b0] dump_stack+0x30/0x44
[ 950.148007] [c00000058d6777f0] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 950.148015] [c00000058d677860] [c000000000a33340] __schedule+0xb80/0xcd0
[ 950.148023] [c00000058d677930] [c000000000a2d2a4]
schedule_timeout+0x324/0x480
[ 950.148031] [c00000058d677a60] [c000000000a33954]
wait_for_completion+0x114/0x1b0
[ 950.148039] [c00000058d677ad0] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0
[ 950.148048] [c00000058d677b70] [c0000000002148c0]
synchronize_sched+0xc0/0x180
[ 950.148066] [c00000058d677ba0] [d00000000e488d08]
_stp_runtime_contexts_free+0x328/0x4e0 [nd_s_20631]
[ 950.148082] [c00000058d677c20] [d00000000e54d010]
_stp_cleanup_and_exit+0x7f0/0xa20 [nd_s_20631]
[ 950.148098] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 950.148107] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 950.148115] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 950.266974] BUG: scheduling while atomic: stapio/20631/0x00000002
[ 950.266990] INFO: lockdep is turned off.
[ 950.266994] Modules linked in: nd_s_20631(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
stap_1d57d05694160e9f9f0b1dd5dccc510_20591]
[ 950.267060] CPU: 10 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 950.267067] Call Trace:
[ 950.267075] [c00000058d677760] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 950.267088] [c00000058d677820] [c000000000a534b0] dump_stack+0x30/0x44
[ 950.267097] [c00000058d677840] [c000000000a4775c] __schedule_bug+0x80/0xa8
[ 950.267106] [c00000058d6778b0] [c000000000a33340] __schedule+0xb80/0xcd0
[ 950.267113] [c00000058d677980] [c000000000a2d2a4]
schedule_timeout+0x324/0x480
[ 950.267122] [c00000058d677ab0] [c000000000a33954]
wait_for_completion+0x114/0x1b0
[ 950.267131] [c00000058d677b20] [c00000000011e4f4] wait_rcu_gp+0x84/0xb0
[ 950.267140] [c00000058d677bc0] [c0000000002148c0]
synchronize_sched+0xc0/0x180
[ 950.267148] [c00000058d677bf0] [c00000000012fbf4]
atomic_notifier_chain_unregister+0xc4/0xf0
[ 950.267166] [c00000058d677c20] [d00000000e54c97c]
_stp_cleanup_and_exit+0x15c/0xa20 [nd_s_20631]
[ 950.267183] [c00000058d677d20] [d00000000e5502a8]
_stp_ctl_write_cmd+0x5c8/0x780 [nd_s_20631]
[ 950.267191] [c00000058d677dd0] [c000000000358250] SyS_write+0x150/0x400
[ 950.267199] [c00000058d677e30] [c000000000050540]
kretprobe_trampoline+0x0/0x10
[ 950.576757] BUG: sleeping function called from invalid context at
kernel/mutex.c:576
[ 950.576775] in_atomic(): 1, irqs_disabled(): 0, pid: 20631, name: stapio
[ 950.576782] INFO: lockdep is turned off.
[ 950.576790] CPU: 10 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 950.576797] Call Trace:
[ 950.576807] [c00000058d677b70] [c000000000019d28] show_stack+0x88/0x390
(unreliable)
[ 950.576819] [c00000058d677c30] [c000000000a534b0] dump_stack+0x30/0x44
[ 950.576830] [c00000058d677c50] [c00000000013af14] __might_sleep+0x1c4/0x2d0
[ 950.576839] [c00000058d677cd0] [c000000000a2e0b4]
mutex_lock_nested+0x74/0x5b0
[ 950.576848] [c00000058d677dd0] [c000000000357f80] SyS_read+0x210/0x390
[ 950.576858] [c00000058d677e30] [c00000000000a188] system_call+0x38/0xb4
[ 950.576900] Page fault in user mode with in_atomic() = 1 mm =
c00000004d75af80
[ 950.576908] NIP = 3fff931661d8 MSR = 800000010200d033
[ 950.576914] Oops: Weird page fault, sig: 11 [#1]
[ 950.576920] SMP NR_CPUS=2048 NUMA pSeries
[ 950.576929] Modules linked in: nd_s_20631(OE) tun sg pseries_rng nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp
ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
stap_1d57d05694160e9f9f0b1dd5dccc510_20591]
[ 950.576998] CPU: 10 PID: 20631 Comm: stapio Tainted: G W OE
------------ 3.10.0-327.el7.ppc64le.debug #1
[ 950.577006] task: c0000005e92625b0 ti: c00000058d674000 task.ti:
c00000058d674000
[ 950.577013] NIP: 00003fff931661d8 LR: 00003fff93166820 CTR: 00003fff93245020
[ 950.577020] REGS: c00000058d677ea0 TRAP: 0301 Tainted: G W OE
------------ (3.10.0-327.el7.ppc64le.debug)
[ 950.577027] MSR: 800000010200d033 <SF,VEC,EE,PR,ME,IR,DR,RI,LE> CR:
42004444 XER: 00000000
[ 950.577056] CFAR: 00003fff9316681c DAR: 00003ffff092f270 DSISR: 42000000
SOFTE: 1
GPR00: 00003fff93166820 00003ffff09313d0 00003fff932e74f0 00003fff932e1160
GPR04: 000000001000ce30 00003ffff0931ae8 00003ffff0931ae8 0000000000000000
GPR08: 0000000000000001 0000000000000004 ffffffffffffffff 0000000000000005
GPR12: 0000000000004400 00003fff933aae80
[ 950.577113] NIP [00003fff931661d8] 0x3fff931661d8
[ 950.577119] LR [00003fff93166820] 0x3fff93166820
[ 950.577124] Call Trace:
[ 950.577135] ---[ end trace d99753e79c16f711 ]---
[ 950.581503]
[ 950.581509] note: stapio[20631] exited with preempt_count 1
====
If I'm reading that correctly, stapio caused a multitude of "sleeping function
called from invalid context" bugs, then died with a signal 11 (SIGSEGV).
I wonder if tcl/expect handles a child getting a SIGSEGV properly. I also
wonder why stapio had the bug in the first place.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
2016-09-15 14:28 ` [Bug testsuite/20600] " dsmith at redhat dot com
2016-09-15 19:40 ` dsmith at redhat dot com
@ 2016-09-15 19:53 ` dsmith at redhat dot com
2016-09-15 20:13 ` dsmith at redhat dot com
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-15 19:53 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
David Smith <dsmith at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|parallet testsuite hang in |parallel testsuite hang in
|[nd_]syscall.exp |[nd_]syscall.exp
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
` (2 preceding siblings ...)
2016-09-15 19:53 ` [Bug testsuite/20600] parallel " dsmith at redhat dot com
@ 2016-09-15 20:13 ` dsmith at redhat dot com
2016-09-22 14:26 ` dsmith at redhat dot com
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-15 20:13 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
--- Comment #3 from David Smith <dsmith at redhat dot com> ---
> I wonder if tcl/expect handles a child getting a SIGSEGV properly.
If I write a simple C program that looks like:
====
#include <sys/types.h>
#include <signal.h>
#include <unistd.h>
int
main()
{
kill(getpid(), SIGSEGV);
return 0;
}
====
and then do:
====
# tclsh
% catch { exec ./sigsegv }
1
====
That executes immediately, no hang.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
` (3 preceding siblings ...)
2016-09-15 20:13 ` dsmith at redhat dot com
@ 2016-09-22 14:26 ` dsmith at redhat dot com
2016-09-22 18:09 ` dsmith at redhat dot com
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-22 14:26 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
--- Comment #4 from David Smith <dsmith at redhat dot com> ---
Here's an update.
I believe I've tracked this down to _stp_init_time(). I've added lots of
'might_sleep()' calls to that function in my local copy of systemtap, and I got
the following:
====
Sep 21 22:32:40 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: BUG: sleeping
function called from invalid context at
/usr/local/share/systemtap/runtime/time.c:323
Sep 21 22:32:40 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: in_atomic(): 1,
irqs_disabled(): 0, pid: 5960, name: stapio
Sep 21 22:32:40 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: INFO: lockdep is
turned off.
====
Here's that section of runtime/time.c (with line numbers):
====
310 might_sleep();
311 stp_time = _stp_alloc_percpu(sizeof(stp_time_t));
312 if (unlikely(stp_time == 0))
313 return -1;
314
315 might_sleep();
316 #ifdef STAPCONF_ONEACHCPU_RETRY
317 ret = on_each_cpu(__stp_init_time, NULL, 0, 1);
318 #else
319 ret = on_each_cpu(__stp_init_time, NULL, 1);
320 #endif
321
322 #ifdef STAPCONF_ADD_TIMER_ON
323 might_sleep();
324 for_each_online_cpu(cpu) {
325 stp_time_t *time = per_cpu_ptr(stp_time, cpu);
326 add_timer_on(&time->timer, cpu);
327 }
328 #endif
===
I believe this means that something in the following line is causing us to
become atomic:
ret = on_each_cpu(__stp_init_time, NULL, 1);
At line 315, might_sleep() didn't complain, but at line 323 we're suddenly
atomic.
On RHEL7, on_each_cpu() looks like the following:
====
static inline int on_each_cpu(smp_call_func_t func, void *info, int wait)
{
unsigned long flags;
local_irq_save(flags);
func(info);
local_irq_restore(flags);
return 0;
}
====
That matches up to the kernel message above, since irqs aren't disabled. So, my
guess is that something in __stp_init_time() is causing us to become atomic.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
` (4 preceding siblings ...)
2016-09-22 14:26 ` dsmith at redhat dot com
@ 2016-09-22 18:09 ` dsmith at redhat dot com
2017-02-28 16:13 ` dsmith at redhat dot com
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2016-09-22 18:09 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
--- Comment #5 from David Smith <dsmith at redhat dot com> ---
(In reply to David Smith from comment #4)
> Here's an update.
... stuff deleted ...
> On RHEL7, on_each_cpu() looks like the following:
Nope, that was on_each_cpu() for a single-processor kernel. Here's
on_each_cpu() for a SMP kernel:
====
/*
* Call a function on all processors. May be used during early boot while
* early_boot_irqs_disabled is set. Use local_irq_save/restore() instead
* of local_irq_disable/enable().
*/
int on_each_cpu(void (*func) (void *info), void *info, int wait)
{
unsigned long flags;
int ret = 0;
preempt_disable();
ret = smp_call_function(func, info, wait);
local_irq_save(flags);
func(info);
local_irq_restore(flags);
preempt_enable();
return ret;
}
====
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
` (5 preceding siblings ...)
2016-09-22 18:09 ` dsmith at redhat dot com
@ 2017-02-28 16:13 ` dsmith at redhat dot com
2017-03-02 7:29 ` mcermak at redhat dot com
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2017-02-28 16:13 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
David Smith <dsmith at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #6 from David Smith <dsmith at redhat dot com> ---
Fixed in commit ed948ab. The parallel testsuite seems to run to completion on
all arches.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
` (6 preceding siblings ...)
2017-02-28 16:13 ` dsmith at redhat dot com
@ 2017-03-02 7:29 ` mcermak at redhat dot com
2017-03-03 19:17 ` dsmith at redhat dot com
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: mcermak at redhat dot com @ 2017-03-02 7:29 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
Martin Cermak <mcermak at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mcermak at redhat dot com
--- Comment #7 from Martin Cermak <mcermak at redhat dot com> ---
Tested couple times and saw no stalls in the syscall tests. Looks great!
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
` (7 preceding siblings ...)
2017-03-02 7:29 ` mcermak at redhat dot com
@ 2017-03-03 19:17 ` dsmith at redhat dot com
2023-12-06 20:55 ` wcohen at redhat dot com
2024-02-21 13:25 ` fche at redhat dot com
10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2017-03-03 19:17 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
David Smith <dsmith at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|FIXED |---
--- Comment #8 from David Smith <dsmith at redhat dot com> ---
After fixing an uninitialized variable error in the fix, the hang is back
(although in a slightly different way). Sigh.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
` (8 preceding siblings ...)
2017-03-03 19:17 ` dsmith at redhat dot com
@ 2023-12-06 20:55 ` wcohen at redhat dot com
2024-02-21 13:25 ` fche at redhat dot com
10 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2023-12-06 20:55 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
William Cohen <wcohen at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |wcohen at redhat dot com
--- Comment #9 from William Cohen <wcohen at redhat dot com> ---
Not possible to replicate this issue now "make installcheck-parallel" was
removed by commit 12c28db3271:
Author: Frank Ch. Eigler <fche@redhat.com> 2023-10-12 12:45:09
Committer: Frank Ch. Eigler <fche@redhat.com> 2023-10-12 13:11:35
Parent: 3b2b14c5d095e7c82594d79aabca5d407692eab4 (Teach stap-prep to use new
versions of dnf (if available))
Child: 823fcc4143c254bd847a1b4592b1580ee842239d (testsuite: drop busybox test
case)
Branches: master, mcermak-pr3040, remotes/origin/master,
remotes/origin/mcermak-pr3040, remotes/origin/uprobe-reg2, wcohen/dwarf5callee
Follows: release-4.9a
Precedes: release-5.0a
testsuite: simplify Makefile drivers
It was reported that "make installcheck" broke with commit
218c26a523816. Investigation pointed at quoting mishaps somewhere in
the Makefile machinery related to parallel / partial testsuite runs.
While this logic was clever and appeared useful for a time, it seems
fragile in practice and not in active use after all. So let's nuke
all of it.
Moved the environment_sanity.exp test into the systemtap/ subdirectory
to apprx. guarantee that it's run first, no Makefile magic needed. It
still exits dejagnu entirely if it fails.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug testsuite/20600] parallel testsuite hang in [nd_]syscall.exp
2016-09-12 14:58 [Bug testsuite/20600] New: parallet testsuite hang in [nd_]syscall.exp dsmith at redhat dot com
` (9 preceding siblings ...)
2023-12-06 20:55 ` wcohen at redhat dot com
@ 2024-02-21 13:25 ` fche at redhat dot com
10 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2024-02-21 13:25 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=20600
Frank Ch. Eigler <fche at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |OBSOLETE
CC| |fche at redhat dot com
Status|REOPENED |RESOLVED
--- Comment #10 from Frank Ch. Eigler <fche at redhat dot com> ---
parallel testsuite mode gone, and many concurrency fixes to the
runtime/transport have come in since the original report
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 12+ messages in thread