* [Bug runtime/31176] New: Avoid spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts.
From: agentzh at gmail dot com @ 2023-12-17 1:30 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=31176
Bug ID: 31176
Summary: Avoid spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts.
Product: systemtap
Version: unspecified
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: runtime
Assignee: systemtap at sourceware dot org
Reporter: agentzh at gmail dot com
Target Milestone: ---
The kernel's lockdep reports possible deadlocks on the spin locks of the stap
memory pool allocator when the same lock is taken in both NMI and non-NMI
contexts. The following lockdep error in dmesg is always reproducible on a
Fedora debug kernel by running TEST 4 in
testsuite/systemtap.base/kernel-hw-breakpoint-addr.exp:
```
[ 426.561994] stap_d42289a7e6d2399d2047d1d6eb75940b_7879
(kernel-hw-breakpoint-addr_1.stp): systemtap: 5.1/0.183, base:
ffffffffc0e91000, memory: 1276data/200text/459ctx/524390net/375alloc kb,
probes: 3
[ 426.591193] ================================
[ 426.591193] WARNING: inconsistent lock state
[ 426.591194] 5.11.22-100.fc32.x86_64+debug #1 Tainted: G OE
[ 426.591194] --------------------------------
[ 426.591194] inconsistent {INITIAL USE} -> {IN-NMI} usage.
[ 426.591195] a.out/7882 [HC1[1]:SC0[0]:HE0:SE1] takes:
[ 426.591195] ffff88813de33e30 (lock#8){....}-{2:2}, at:
_stp_mempool_alloc+0x27/0x1a0 [stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591198] {INITIAL USE} state was registered at:
[ 426.591198] lock_acquire+0x1cc/0x780
[ 426.591199] _raw_spin_lock_irqsave+0x4d/0x90
[ 426.591199] _stp_mempool_alloc+0x27/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591199] _stp_transport_init+0x668/0x1870
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591200] do_one_initcall+0xfb/0x530
[ 426.591200] do_init_module+0x1ce/0x7a0
[ 426.591200] load_module+0x78b5/0x9fb0
[ 426.591201] __do_sys_init_module+0x18f/0x220
[ 426.591202] do_syscall_64+0x33/0x40
[ 426.591203] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 426.591204] irq event stamp: 4434
[ 426.591204] hardirqs last enabled at (4433): [<ffffffff83800ade>]
asm_exc_page_fault+0x1e/0x30
[ 426.591205] hardirqs last disabled at (4434): [<ffffffff836816fc>]
exc_debug+0x6c/0x150
[ 426.591205] softirqs last enabled at (4258): [<ffffffff81096106>]
fpu__clear+0x86/0x160
[ 426.591206] softirqs last disabled at (4254): [<ffffffff81096085>]
fpu__clear+0x5/0x160
[ 426.591207] other info that might help us debug this:
[ 426.591207] Possible unsafe locking scenario:
[ 426.591208] CPU0
[ 426.591209] ----
[ 426.591213] lock(lock#8);
[ 426.591219] <Interrupt>
[ 426.591219] lock(lock#8);
[ 426.591222] *** DEADLOCK ***
[ 426.591222] no locks held by a.out/7882.
[ 426.591223] stack backtrace:
[ 426.591224] CPU: 6 PID: 7882 Comm: a.out Tainted: G OE
5.11.22-100.fc32.x86_64+debug #1
[ 426.591225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.15.0-1.fc35 04/01/2014
[ 426.591226] Call Trace:
[ 426.591226] <#DB>
[ 426.591226] dump_stack+0xae/0xe5
[ 426.591227] lock_acquire.cold+0x3b/0x40
[ 426.591227] ? lockdep_hardirqs_on_prepare+0x3f0/0x3f0
[ 426.591228] ? _stp_mempool_alloc+0x27/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591229] _raw_spin_lock_irqsave+0x4d/0x90
[ 426.591229] ? _stp_mempool_alloc+0x27/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591230] _stp_mempool_alloc+0x27/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591230] _stp_ctl_get_buffer+0x1cf/0x2f0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591231] _stp_ctl_log_werr+0x37/0x290
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591231] ? __clear_user+0x47/0x70
[ 426.591232] _stp_warn+0xa3/0xc0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591232] ? _stp_vlog.constprop.0+0x40/0x40
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591233] unwind+0x2cf/0x3d0 [stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591233] ? _stp_umod_lookup.constprop.0+0x1a0/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591234] _stp_get_uregs+0x46d/0x6f0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591234] probe_6378+0x50c/0xec0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591235] ? _stp_snprint_addr.constprop.0.isra.0+0x940/0x940
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591235] enter_hwbkpt_probe+0x582/0xaf0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591236] ? __stp_utrace_task_finder_target_syscall_entry+0xc90/0xc90
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[ 426.591236] __perf_event_overflow+0x11e/0x320
[ 426.591237] perf_bp_event+0x13b/0x150
[ 426.591237] ? __perf_sw_event+0x130/0x130
[ 426.591237] hw_breakpoint_exceptions_notify+0x204/0x310
[ 426.591238] ? atomic_notifier_call_chain+0x5/0x100
[ 426.591238] notifier_call_chain+0x9e/0x180
[ 426.591239] atomic_notifier_call_chain+0x64/0x100
[ 426.591239] notify_die+0x81/0xd0
[ 426.591239] ? atomic_notifier_call_chain+0x100/0x100
[ 426.591240] ? rcu_nmi_enter+0x7c/0xd0
[ 426.591240] notify_debug+0x25/0x30
[ 426.591240] exc_debug+0xe4/0x150
[ 426.591241] asm_exc_debug+0x19/0x30
[ 426.591241] RIP: 0010:__clear_user+0x47/0x70
[ 426.591242] Code: ff 0f 01 cb 48 89 d8 48 c1 eb 03 48 89 ef 83 e0 07 48 89
d9 48 85 c9 74 19 66 2e 0f 1f 84 00 00 00 00 00 48 c7 07 00 00 00 00 <48> 83 c7
08 ff c9 75 f1 48 89 c1 85 c9 74 0a c6 07 00 48 ff c7 ff
[ 426.591244] RSP: 0018:ffffc90000acfbe0 EFLAGS: 00040206
[ 426.591244] RAX: 0000000000000004 RBX: 00000000000001fc RCX:
00000000000001fc
[ 426.591245] RDX: 1ffff11025cb51ba RSI: 0000000000000000 RDI:
000000000040401c
[ 426.591245] RBP: 000000000040401c R08: 0000000000000000 R09:
0000000000000001
[ 426.591246] R10: fffffbfff1b4506d R11: 0000000000000001 R12:
ffff888142192aa0
[ 426.591246] R13: 0000000000000000 R14: 000000000040401c R15:
ffff88815a6cb800
[ 426.591247] </#DB>
[ 426.591247] load_elf_binary+0x27f8/0x3f10
[ 426.591247] ? elf_core_dump+0x2ce0/0x2ce0
[ 426.591248] ? ima_file_mprotect+0x380/0x380
[ 426.591248] bprm_execve+0x684/0x1580
[ 426.591249] ? copy_strings.isra.0+0x680/0x680
[ 426.591249] do_execveat_common+0x55d/0x730
[ 426.591249] ? bprm_execve+0x1580/0x1580
[ 426.591250] ? getname_flags.part.0+0x8e/0x450
[ 426.591250] __x64_sys_execve+0x8f/0xc0
[ 426.591250] do_syscall_64+0x33/0x40
[ 426.591251] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 426.591251] RIP: 0033:0x7fda303e21ab
[ 426.591252] Code: Unable to access opcode bytes at RIP 0x7fda303e2181.
[ 426.591252] RSP: 002b:00007ffee7238158 EFLAGS: 00000202 ORIG_RAX:
000000000000003b
[ 426.591253] RAX: ffffffffffffffda RBX: 00000000021c23a0 RCX:
00007fda303e21ab
[ 426.591254] RDX: 00007ffee7238680 RSI: 00000000021c2380 RDI:
00000000021c23a0
[ 426.591255] RBP: 00007ffee72381c0 R08: 00007ffee7238290 R09:
0000000000000000
[ 426.591255] R10: 0000000000401059 R11: 0000000000000202 R12:
00000000021c2380
[ 426.591256] R13: 00007ffee7238680 R14: 0000000000000001 R15:
0000000000000000
```
In addition to the pool->lock spin lock above, lockdep reports a similar
error for _stp_ctl_ready_lock, which is taken in _stp_ctl_log_werr().
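To make the scenario in the trace concrete: the lock is first taken in process context (_stp_transport_init -> _stp_mempool_alloc via _raw_spin_lock_irqsave), and later the hardware breakpoint probe, running in the #DB exception context that lockdep treats as NMI, reaches _stp_mempool_alloc again through _stp_warn. Since irqsave does not mask NMIs, the second acquisition would spin forever if it interrupted the first on the same CPU. Below is a minimal userspace sketch of one common way to avoid this class of deadlock: use a trylock in NMI context and fail the allocation instead of spinning. This is only an illustration under stated assumptions, not necessarily what the proposed patch does. All names (demo_pool, demo_pool_alloc, the in_nmi parameter, and so on) are hypothetical and are not part of the stap runtime.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SLOTS 4

/* Hypothetical stand-in for a fixed-size memory pool guarded by a spin lock. */
struct demo_pool {
    atomic_bool locked;              /* true while some context holds the lock */
    void *free_list[POOL_SLOTS];     /* stack of free objects */
    size_t nfree;                    /* number of entries in free_list */
    char storage[POOL_SLOTS][64];    /* backing memory for the objects */
};

static void demo_pool_init(struct demo_pool *p)
{
    atomic_init(&p->locked, false);
    p->nfree = POOL_SLOTS;
    for (size_t i = 0; i < POOL_SLOTS; i++)
        p->free_list[i] = p->storage[i];
}

/* Try once to take the lock; returns true on success, never spins. */
static bool demo_trylock(struct demo_pool *p)
{
    return !atomic_exchange_explicit(&p->locked, true, memory_order_acquire);
}

/* Spin until the lock is taken; only safe outside of NMI context. */
static void demo_lock(struct demo_pool *p)
{
    while (!demo_trylock(p))
        ; /* busy-wait */
}

static void demo_unlock(struct demo_pool *p)
{
    atomic_store_explicit(&p->locked, false, memory_order_release);
}

/*
 * Allocate one object. The caller passes in_nmi explicitly here; this is an
 * assumption of the sketch (kernel code would query its context instead).
 * In NMI context we must not spin: the lock holder may be the very context
 * we interrupted, so we give up and return NULL when the lock is busy.
 */
static void *demo_pool_alloc(struct demo_pool *p, bool in_nmi)
{
    void *obj = NULL;

    if (in_nmi) {
        if (!demo_trylock(p))
            return NULL;
    } else {
        demo_lock(p);
    }
    if (p->nfree > 0)
        obj = p->free_list[--p->nfree];
    demo_unlock(p);
    return obj;
}

/* Return an object to the pool (non-NMI path only in this sketch). */
static void demo_pool_free(struct demo_pool *p, void *obj)
{
    demo_lock(p);
    assert(p->nfree < POOL_SLOTS);
    p->free_list[p->nfree++] = obj;
    demo_unlock(p);
}
```

With this shape, a caller that holds the lock and is "interrupted" by an NMI-context allocation sees that allocation fail with NULL instead of deadlocking. The trade-off is that NMI-context callers must tolerate allocation failure, for example by dropping the warning message they wanted to queue.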
--
You are receiving this mail because:
You are the assignee for the bug.
* [Bug runtime/31176] Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
From: agentzh at gmail dot com @ 2023-12-17 1:31 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=31176
agentzh <agentzh at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Avoid spin lock deadlocks   |Spin lock deadlocks in
                   |in memory pool allocations  |memory pool allocations for
                   |for mixed NMI and non-NMI   |mixed NMI and non-NMI
                   |contexts.                   |contexts
* [Bug runtime/31176] Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
From: agentzh at gmail dot com @ 2023-12-17 4:30 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=31176
--- Comment #1 from agentzh <agentzh at gmail dot com> ---
I'd propose this patch to fix this issue:
https://gist.github.com/agentzh/d54da2db9521f26fd6f700d270f48334
* [Bug runtime/31176] Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
From: agentzh at gmail dot com @ 2023-12-18 20:27 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=31176
agentzh <agentzh at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #2 from agentzh <agentzh at gmail dot com> ---
Already fixed in commit 36343e483e28.
* [Bug runtime/31176] Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
From: agentzh at gmail dot com @ 2023-12-18 20:28 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=31176
--- Comment #3 from agentzh <agentzh at gmail dot com> ---
Sorry, the commit SHA in comment #2 was wrong. It should be commit bd98b2028955 instead.