public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* [Bug runtime/31176] New: Avoid spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts.
@ 2023-12-17  1:30 agentzh at gmail dot com
  2023-12-17  1:31 ` [Bug runtime/31176] Spin " agentzh at gmail dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2023-12-17  1:30 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31176

            Bug ID: 31176
           Summary: Avoid spin lock deadlocks in memory pool allocations
                    for mixed NMI and non-NMI contexts.
           Product: systemtap
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: agentzh at gmail dot com
  Target Milestone: ---

The kernel's lockdep finds dead locks in the stap memory pool allocator's
spinlocks when mixing NMI and non-NMI contexts. The following dmesg error from
lockdep can always be reproduced on a debug kernel of Fedora by running TEST 4
in testsuite/systemtap.base/kernel-hw-breakpoint-addr.exp:

```
[  426.561994] stap_d42289a7e6d2399d2047d1d6eb75940b_7879
(kernel-hw-breakpoint-addr_1.stp): systemtap: 5.1/0.183, base:
ffffffffc0e91000, memory: 1276data/200text/459ctx/524390net/375alloc kb,
probes: 3
[  426.591193] ================================
[  426.591193] WARNING: inconsistent lock state
[  426.591194] 5.11.22-100.fc32.x86_64+debug #1 Tainted: G           OE
[  426.591194] --------------------------------
[  426.591194] inconsistent {INITIAL USE} -> {IN-NMI} usage.
[  426.591195] a.out/7882 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  426.591195] ffff88813de33e30 (lock#8){....}-{2:2}, at:
_stp_mempool_alloc+0x27/0x1a0 [stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591198] {INITIAL USE} state was registered at:
[  426.591198]   lock_acquire+0x1cc/0x780
[  426.591199]   _raw_spin_lock_irqsave+0x4d/0x90
[  426.591199]   _stp_mempool_alloc+0x27/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591199]   _stp_transport_init+0x668/0x1870
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591200]   do_one_initcall+0xfb/0x530
[  426.591200]   do_init_module+0x1ce/0x7a0
[  426.591200]   load_module+0x78b5/0x9fb0
[  426.591201]   __do_sys_init_module+0x18f/0x220
[  426.591202]   do_syscall_64+0x33/0x40
[  426.591203]   entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  426.591204] irq event stamp: 4434
[  426.591204] hardirqs last  enabled at (4433): [<ffffffff83800ade>]
asm_exc_page_fault+0x1e/0x30
[  426.591205] hardirqs last disabled at (4434): [<ffffffff836816fc>]
exc_debug+0x6c/0x150
[  426.591205] softirqs last  enabled at (4258): [<ffffffff81096106>]
fpu__clear+0x86/0x160
[  426.591206] softirqs last disabled at (4254): [<ffffffff81096085>]
fpu__clear+0x5/0x160

[  426.591207] other info that might help us debug this:
[  426.591207]  Possible unsafe locking scenario:

[  426.591208]        CPU0
[  426.591209]        ----
[  426.591213]   lock(lock#8);
[  426.591219]   <Interrupt>
[  426.591219]     lock(lock#8);

[  426.591222]  *** DEADLOCK ***

[  426.591222] no locks held by a.out/7882.

[  426.591223] stack backtrace:
[  426.591224] CPU: 6 PID: 7882 Comm: a.out Tainted: G           OE    
5.11.22-100.fc32.x86_64+debug #1
[  426.591225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.15.0-1.fc35 04/01/2014
[  426.591226] Call Trace:
[  426.591226]  <#DB>
[  426.591226]  dump_stack+0xae/0xe5
[  426.591227]  lock_acquire.cold+0x3b/0x40
[  426.591227]  ? lockdep_hardirqs_on_prepare+0x3f0/0x3f0
[  426.591228]  ? _stp_mempool_alloc+0x27/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591229]  _raw_spin_lock_irqsave+0x4d/0x90
[  426.591229]  ? _stp_mempool_alloc+0x27/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591230]  _stp_mempool_alloc+0x27/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591230]  _stp_ctl_get_buffer+0x1cf/0x2f0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591231]  _stp_ctl_log_werr+0x37/0x290
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591231]  ? __clear_user+0x47/0x70
[  426.591232]  _stp_warn+0xa3/0xc0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591232]  ? _stp_vlog.constprop.0+0x40/0x40
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591233]  unwind+0x2cf/0x3d0 [stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591233]  ? _stp_umod_lookup.constprop.0+0x1a0/0x1a0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591234]  _stp_get_uregs+0x46d/0x6f0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591234]  probe_6378+0x50c/0xec0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591235]  ? _stp_snprint_addr.constprop.0.isra.0+0x940/0x940
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591235]  enter_hwbkpt_probe+0x582/0xaf0
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591236]  ? __stp_utrace_task_finder_target_syscall_entry+0xc90/0xc90
[stap_d42289a7e6d2399d2047d1d6eb75940b_7879]
[  426.591236]  __perf_event_overflow+0x11e/0x320
[  426.591237]  perf_bp_event+0x13b/0x150
[  426.591237]  ? __perf_sw_event+0x130/0x130
[  426.591237]  hw_breakpoint_exceptions_notify+0x204/0x310
[  426.591238]  ? atomic_notifier_call_chain+0x5/0x100
[  426.591238]  notifier_call_chain+0x9e/0x180
[  426.591239]  atomic_notifier_call_chain+0x64/0x100
[  426.591239]  notify_die+0x81/0xd0
[  426.591239]  ? atomic_notifier_call_chain+0x100/0x100
[  426.591240]  ? rcu_nmi_enter+0x7c/0xd0
[  426.591240]  notify_debug+0x25/0x30
[  426.591240]  exc_debug+0xe4/0x150
[  426.591241]  asm_exc_debug+0x19/0x30
[  426.591241] RIP: 0010:__clear_user+0x47/0x70
[  426.591242] Code: ff 0f 01 cb 48 89 d8 48 c1 eb 03 48 89 ef 83 e0 07 48 89
d9 48 85 c9 74 19 66 2e 0f 1f 84 00 00 00 00 00 48 c7 07 00 00 00 00 <48> 83 c7
08 ff c9 75 f1 48 89 c1 85 c9 74 0a c6 07 00 48 ff c7 ff
[  426.591244] RSP: 0018:ffffc90000acfbe0 EFLAGS: 00040206
[  426.591244] RAX: 0000000000000004 RBX: 00000000000001fc RCX:
00000000000001fc
[  426.591245] RDX: 1ffff11025cb51ba RSI: 0000000000000000 RDI:
000000000040401c
[  426.591245] RBP: 000000000040401c R08: 0000000000000000 R09:
0000000000000001
[  426.591246] R10: fffffbfff1b4506d R11: 0000000000000001 R12:
ffff888142192aa0
[  426.591246] R13: 0000000000000000 R14: 000000000040401c R15:
ffff88815a6cb800
[  426.591247]  </#DB>
[  426.591247]  load_elf_binary+0x27f8/0x3f10
[  426.591247]  ? elf_core_dump+0x2ce0/0x2ce0
[  426.591248]  ? ima_file_mprotect+0x380/0x380
[  426.591248]  bprm_execve+0x684/0x1580
[  426.591249]  ? copy_strings.isra.0+0x680/0x680
[  426.591249]  do_execveat_common+0x55d/0x730
[  426.591249]  ? bprm_execve+0x1580/0x1580
[  426.591250]  ? getname_flags.part.0+0x8e/0x450
[  426.591250]  __x64_sys_execve+0x8f/0xc0
[  426.591250]  do_syscall_64+0x33/0x40
[  426.591251]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  426.591251] RIP: 0033:0x7fda303e21ab
[  426.591252] Code: Unable to access opcode bytes at RIP 0x7fda303e2181.
[  426.591252] RSP: 002b:00007ffee7238158 EFLAGS: 00000202 ORIG_RAX:
000000000000003b
[  426.591253] RAX: ffffffffffffffda RBX: 00000000021c23a0 RCX:
00007fda303e21ab
[  426.591254] RDX: 00007ffee7238680 RSI: 00000000021c2380 RDI:
00000000021c23a0
[  426.591255] RBP: 00007ffee72381c0 R08: 00007ffee7238290 R09:
0000000000000000
[  426.591255] R10: 0000000000401059 R11: 0000000000000202 R12:
00000000021c2380
[  426.591256] R13: 00007ffee7238680 R14: 0000000000000001 R15:
0000000000000000
```

In addition to the spin lock in pool->lock above, there's also a similar
lockdep error report in _stp_ctl_ready_lock held by _stp_ctl_log_werr().

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug runtime/31176] Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
  2023-12-17  1:30 [Bug runtime/31176] New: Avoid spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts agentzh at gmail dot com
@ 2023-12-17  1:31 ` agentzh at gmail dot com
  2023-12-17  4:30 ` agentzh at gmail dot com
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2023-12-17  1:31 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31176

agentzh <agentzh at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Avoid spin lock deadlocks   |Spin lock deadlocks in
                   |in memory pool allocations  |memory pool allocations for
                   |for mixed NMI and non-NMI   |mixed NMI and non-NMI
                   |contexts.                   |contexts

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug runtime/31176] Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
  2023-12-17  1:30 [Bug runtime/31176] New: Avoid spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts agentzh at gmail dot com
  2023-12-17  1:31 ` [Bug runtime/31176] Spin " agentzh at gmail dot com
@ 2023-12-17  4:30 ` agentzh at gmail dot com
  2023-12-18 20:27 ` agentzh at gmail dot com
  2023-12-18 20:28 ` agentzh at gmail dot com
  3 siblings, 0 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2023-12-17  4:30 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31176

--- Comment #1 from agentzh <agentzh at gmail dot com> ---
I'd propose this patch to fix this issue:
https://gist.github.com/agentzh/d54da2db9521f26fd6f700d270f48334

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug runtime/31176] Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
  2023-12-17  1:30 [Bug runtime/31176] New: Avoid spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts agentzh at gmail dot com
  2023-12-17  1:31 ` [Bug runtime/31176] Spin " agentzh at gmail dot com
  2023-12-17  4:30 ` agentzh at gmail dot com
@ 2023-12-18 20:27 ` agentzh at gmail dot com
  2023-12-18 20:28 ` agentzh at gmail dot com
  3 siblings, 0 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2023-12-18 20:27 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31176

agentzh <agentzh at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from agentzh <agentzh at gmail dot com> ---
Already fixed in commit 36343e483e28.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug runtime/31176] Spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts
  2023-12-17  1:30 [Bug runtime/31176] New: Avoid spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts agentzh at gmail dot com
                   ` (2 preceding siblings ...)
  2023-12-18 20:27 ` agentzh at gmail dot com
@ 2023-12-18 20:28 ` agentzh at gmail dot com
  3 siblings, 0 replies; 5+ messages in thread
From: agentzh at gmail dot com @ 2023-12-18 20:28 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31176

--- Comment #3 from agentzh <agentzh at gmail dot com> ---
Sorry, the commit SHA was wrong. Should be commit bd98b2028955 instead.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-12-18 20:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-17  1:30 [Bug runtime/31176] New: Avoid spin lock deadlocks in memory pool allocations for mixed NMI and non-NMI contexts agentzh at gmail dot com
2023-12-17  1:31 ` [Bug runtime/31176] Spin " agentzh at gmail dot com
2023-12-17  4:30 ` agentzh at gmail dot com
2023-12-18 20:27 ` agentzh at gmail dot com
2023-12-18 20:28 ` agentzh at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).