From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 8AE003858D38; Sat, 16 Mar 2024 13:03:49 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8AE003858D38 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1710594229; bh=0mjdZ2iH+LAsIdFnIWbBZpmO5+N4NG4Y2XBicwAncT8=; h=From:To:Subject:Date:From; b=YGh4OJ1+1ud4/Hp6gtAAjqVhsLnMLnvMfXIkm5ST0DGiFWvjonw2AcwROMtpd5Nwg nxuxiqq1wAoz3aPoRt526qqpAw7IXu5dqX14S/y4MOKBCn4+jl8Wqp5ctjhw3+6lgp 8q7b8kPcs7ZbDSVx0QKo2c1FCQS4Gxr6g0FRKhXQ= From: "mcermak at redhat dot com" To: systemtap@sourceware.org Subject: [Bug runtime/31500] New: stapio exited with irqs disabled Date: Sat, 16 Mar 2024 13:03:48 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: systemtap X-Bugzilla-Component: runtime X-Bugzilla-Version: unspecified X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: mcermak at redhat dot com X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P2 X-Bugzilla-Assigned-To: systemtap at sourceware dot org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://sourceware.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://sourceware.org/bugzilla/show_bug.cgi?id=3D31500 Bug ID: 31500 Summary: stapio exited with irqs disabled Product: systemtap Version: unspecified Status: NEW Severity: normal Priority: P2 Component: runtime Assignee: systemtap at sourceware dot org Reporter: mcermak at redhat dot com Target Milestone: --- With systemtap commit b67f0fc38bd5139eda809eef12e20568e140cd96 I'm hitting testsuite stalls. Reproducing it takes ~hours, but seems reliable. I've reproduced with both rawhide and rawhide-debug kernel. The results and outp= ut are pretty similar for both kernels. The following is with=20 6.9.0-0.rc0.20240314git480e035fc4c7.5.fc41.x86_64+debug: [root@rawh ~]# pstree -p 62459 make(62459)=E2=94=80=E2=94=80=E2=94=80make(62462)=E2=94=80=E2=94=80=E2=94= =80make(62463)=E2=94=80=E2=94=80=E2=94=80sh(62465)=E2=94=80=E2=94=80=E2=94= =80expect(62486)=E2=94=80=E2=94=AC=E2=94=80bash(122503) =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 =E2=94=9C=E2=94=80stap(62728) =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 =E2=94=9C=E2=94=80stap(115557) =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 =E2=94=9C=E2=94=80stap(180093) =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 =E2=94=9C=E2=94=80stap(187657)=E2=94=80=E2=94=80=E2=94=80stapio(188000)=E2= =94=80=E2=94=AC=E2=94=80{stapio}(188002) =E2=94= =82=20=20=20=20=20=20=20=20=20=20 =E2=94=9C=E2=94=80{stapio}(188003) =E2=94= =82=20=20=20=20=20=20=20=20=20=20 =E2=94=94=E2=94=80{stapio}(188004) =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 =E2=94=94=E2=94=80{expect}(62570) [root@rawh ~]# fgrep -i stat /proc/187657/status=20 State: S (sleeping) [root@rawh ~]# fgrep -i stat /proc/188000/status=20 State: Z (zombie) [root@rawh ~]#=20 [root@rawh ~]# gdb -q -p 187657 [ ... stuff deleted ... ] (gdb) bt #0 0x00007fed601a9d67 in wait4 () from /lib64/libc.so.6 #1 0x00000000005cb8e4 in stap_waitpid (verbose=3D0, pid=3D188000) at ../systemtap/util.cxx:843 #2 0x0000000000633647 in direct::finish (this=3D0x288e3650) at ../systemtap/remote.cxx:108 #3 0x0000000000630e2a in remote::run (remotes=3Dstd::vector of length 1, capacity 1 =3D {...}) at /usr/include/c++/14/bits/stl_vector.h:1144 #4 0x0000000000471513 in pass_5 (s=3D..., targets=3Dstd::vector of length = 1, capacity 1 =3D {...}) at ../systemtap/main.cxx:1388 #5 0x000000000046e1fc in main (argc=3D, argv=3D) at ../systemtap/main.cxx:1632 (gdb)=20 [root@rawh ~]# dmesg [ ... stuff deleted ... ] [16452.731796] stap_e4ade712b5a11cfdd0af14bfb4224df_29278(OE):1 stap_79325b69980582f86b9f5ee4e671bb2_28895(OE):1 stap_573cd5889b91236242ba1206224b1da_28519(OE):1 stap_5f829f7d6b934c4dc34166cf5f4dd80_28143(OE):1 stap_fccbe5e083add72333593aa0f6f816a_27756(OE):1 stap_4eac1dd011efe3e75bda066643b8fc4_27393(OE):1 stap_e24b147733cae10e2828c6dc6c87e3b_27041(OE):1 stap_f997e949c73eb99bae83fb21536da40_26689(OE):1 stap_618c87c4017e4ef8e4ecab83485453d_26301(OE):1 stap_ed0cedb34bc4ae619f54c37359c9f10_25938(OE):1 stap_08b6656b42c733b8993924f59cb76f9_25590(OE):1 stap_ee610f07e1b09c16176db7708c65f16_25242(OE):1 stap_6b57ff065036442445cb7eb5dc87546_24879(OE):1 stap_a07e0fe7a94e704368e383634645c3c_24517(OE):1 stap_e9b8c009da3cb8a321d6a6acc5ddfde_24152(OE):1 stap_f45886cc2f74a0200690ea946ee1e50_23791(OE):1 stap_cc682057afce0d071203c2ad97390340_23365(OE):1 stap_2153c65d83f86cce8412b1b1e442165f_22773(OE):1 [last unloaded: stap_017c8012d60fc7fd31a84d27b5a28d_187649(OE)] [16452.767193] CR2: ffffffffa64a65c0 [16452.769368] ---[ end trace 0000000000000000 ]--- [16452.771499] RIP: 0010:arch_adjust_kprobe_addr+0x41/0xe0 [16452.773611] Code: 48 89 d3 48 ba 00 00 00 00 00 fc ff df 48 83 ec 08 0f = b6 0c 11 48 89 fa 83 e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 85 85 00 00 00 <8b> 5= 5 00 81 fa 66 0f 1f 00 74 4f 81 e2 ff ff ff fe b9 0c f0 e1 05 [16452.777850] RSP: 0018:ffffc90002027960 EFLAGS: 00010246 [16452.779941] RAX: 0000000000000001 RBX: ffffc90002027a58 RCX: 0000000000000000 [16452.782014] RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffffffffa64a65c0 [16452.784058] RBP: ffffffffa64a65c0 R08: fffffbfff4858cff R09: 0000000000000000 [16452.786089] R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff92000404f31 [16452.788180] R13: ffffffffc10ca498 R14: 0000000000000000 R15: ffffffffc20ada98 [16452.790166] FS: 00007f1eaa74a180(0000) GS:ffff888115400000(0000) knlGS:0000000000000000 [16452.792138] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [16452.794077] CR2: ffffffffa64a65c0 CR3: 000000010e99e005 CR4: 0000000000370ef0 [16452.796007] DR0: 0000000000404010 DR1: 0000000000000000 DR2: 0000000000000000 [16452.797910] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 [16452.799774] note: stapio[188000] exited with irqs disabled [17353.035208] workqueue: drain_vmap_area_work hogged CPU for >10000us 1027 times, consider switching to WQ_UNBOUND [22491.638101] workqueue: drain_vmap_area_work hogged CPU for >10000us 2051 times, consider switching to WQ_UNBOUND [32865.163317] workqueue: drain_vmap_area_work hogged CPU for >10000us 4099 times, consider switching to WQ_UNBOUND [41530.296260] clocksource: timekeeping watchdog on CPU1: kvm-clock retried= 1 times before success [53438.316931] workqueue: drain_vmap_area_work hogged CPU for >10000us 8195 times, consider switching to WQ_UNBOUND [root@rawh ~]# --=20 You are receiving this mail because: You are the assignee for the bug.=