From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 122990 invoked by alias); 24 Oct 2016 19:51:13 -0000 Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Post: List-Help: , Sender: systemtap-owner@sourceware.org Received: (qmail 122858 invoked by uid 48); 24 Oct 2016 19:51:00 -0000 From: "dsmith at redhat dot com" To: systemtap@sourceware.org Subject: [Bug runtime/20735] New: "soft lockup" bug on RHEL7 ppc64 Date: Mon, 24 Oct 2016 19:51:00 -0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: systemtap X-Bugzilla-Component: runtime X-Bugzilla-Version: unspecified X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: dsmith at redhat dot com X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P2 X-Bugzilla-Assigned-To: systemtap at sourceware dot org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://sourceware.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-SW-Source: 2016-q4/txt/msg00029.txt.bz2 https://sourceware.org/bugzilla/show_bug.cgi?id=3D20735 Bug ID: 20735 Summary: "soft lockup" bug on RHEL7 ppc64 Product: systemtap Version: unspecified Status: NEW Severity: normal Priority: P2 Component: runtime Assignee: systemtap at sourceware dot org Reporter: dsmith at redhat dot com Target Milestone: --- I'm seeing the following "soft lockup" bug fairly consistently when running= the testsuite in parallel mode on RHEL7 ppc64 (3.10.0-327.el7.ppc64.debug): =3D=3D=3D=3D Oct 22 14:43:55 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: BUG: soft lock= up - CPU#11 stuck for 31s! [migration/11:94] Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Modules linked in: stap_0ea83aa31510554f18d088ec7736640f__5685(OE+) stap_fae50114ebc4a4659eb3c14ebac75168_5108(OE+) stap_16af15cbdec255c8ee6892d56fef69df_8_4819(OE) stap_bcdff2ed83de7602b65808ca8e0bf37d__4756(OE) sg pseries_rng nfsd auth_rp= cgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi= _tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_a35e54245d1be418047481fe89e36ea1__1925] Oct 22 14:43:55 ibm-p8-01-lp7 kernel: irq event stamp: 121576 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: hardirqs last enabled at (121575): [] ._raw_spin_unlock_irqrestore+0x48/0xc0 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: hardirqs last disabled at (121576): [] decrementer_common+0x10c/0x180 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: softirqs last enabled at (120802): [] .__do_softirq+0x220/0x5b0 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: softirqs last disabled at (120797): [] .call_do_softirq+0x14/0x24 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: CPU: 11 PID: 94 Comm: migration/11 Tainted: G OE ------------ 3.10.0-327.el7.ppc64.debug #1 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: task: c0000005f4ec7880 ti: c0000005f4fd8000 task.ti: c0000005f4fd8000 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: NIP: c000000000010c50 LR: c000000000010c50 CTR: c000000000020940 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: REGS: c0000005f4fdb810 TRAP: 0901=20= =20 Tainted: G OE ------------ (3.10.0-327.el7.ppc64.debug) Oct 22 14:43:55 ibm-p8-01-lp7 kernel: MSR: 8000000100009032 CR: 24000024 XER: 00000000 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: CFAR: c000000000020970 SOFTE: 1 GPR00: c000000000a1a154 c0000005f4fdba90 c000000001a1c730 0000000000000900 GPR04: 0000000000000001 0000000000000001 0350280000000000 00000000040b4432 GPR08: 0000000000000000 000000000000119a 000fe0d0d288f434 d00000000d9fa1b0 GPR12: 0000000022000824 00003fff964c8da0 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: NIP [c000000000010c50] .arch_local_irq_restore+0x100/0x150 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: LR [c000000000010c50] .arch_local_irq_restore+0x100/0x150 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Call Trace: Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbb00] [c000000000a1a154] ._raw_spin_unlock_irqrestore+0x54/0xc0 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbb80] [c0000000001d77ac] .cpu_stop_should_run+0x5c/0x80 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbc00] [c0000000001313d4] .smpboot_thread_fn+0x244/0x340 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbcb0] [c00000000011f468] .kthread+0xe8/0xf0 Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbe30] [c00000000000a47c] .ret_from_kernel_thread+0x58/0xdc Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Instruction dump: Oct 22 14:43:55 ibm-p8-01-lp7 kernel: e8690010 786307e2 3063ffff 7c631910 7863b7a6 78635624 38630e80 4bffff6c Oct 22 14:43:55 ibm-p8-01-lp7 kernel: 7c0802a6 f8010010 f821ff91 4bff1769 <60000000> 38210070 e8010010 7c0803a6 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task kworker/11:2:381 blocked f= or more than 120 seconds. Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 22 14:45:09 ibm-p8-01-lp7 kernel: kworker/11:2 D 0000000000000000 8= 976=20 381 2 0x00000800 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Workqueue: events .module_refresher [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace: Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb520] [d00000000dfb0000] 0xd00000000dfb0000 (unreliable) Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb6f0] [c00000000001835c] .__switch_to+0x26c/0x4b0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb7a0] [c000000000a164a4] .__schedule+0x4e4/0xd10 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb8c0] [c000000000a18130] .schedule_preempt_disabled+0x40/0xc0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb940] [c000000000a11b70] .mutex_lock_nested+0x240/0x5b0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeba50] [d00000000d924970] .systemtap_module_refresh+0x160/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebb00] [c00000000011035c] .process_one_work+0x25c/0x860 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebbe0] [c000000000110ad0] .worker_thread+0x170/0x490 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebcb0] [c00000000011f468] .kthread+0xe8/0xf0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebe30] [c00000000000a47c] .ret_from_kernel_thread+0x58/0xdc Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 4 locks held by kworker/11:2/381: Oct 22 14:45:09 ibm-p8-01-lp7 kernel: #0: (events){++++.+}, at: [] .process_one_work+0x1c8/0x860 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: #1: ((&module_refresher_work)){+.+.= ..}, at: [] .process_one_work+0x1c8/0x860 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: #2: (module_refresh_mutex){+.+...},= at: [] .systemtap_module_refresh+0x4c/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: #3: (&ut->inode_lock#2){+.+.+.}, at: [] .systemtap_module_refresh+0x160/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task echo:5102 blocked for more than 120 seconds. Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 22 14:45:09 ibm-p8-01-lp7 kernel: echo D 00003fff93f43cc0 8= 240=20 5102 19520 0x0000a082 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace: Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3440] [d00000000dfb0000] 0xd00000000dfb0000 (unreliable) Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3610] [c00000000001835c] .__switch_to+0x26c/0x4b0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e36c0] [c000000000a164a4] .__schedule+0x4e4/0xd10 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e37e0] [c000000000a18130] .schedule_preempt_disabled+0x40/0xc0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3860] [c000000000a11b70] .mutex_lock_nested+0x240/0x5b0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3970] [d00000000d927fa8] .stapiu_process_found+0x208/0x660 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3a30] [d00000000d9268a4] .__stp_tf_quiesce_worker+0x154/0x230 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3ad0] [c000000000119fc4] .task_work_run+0x104/0x160 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3b70] [c0000000000fc74c] .get_signal_to_deliver+0xc9c/0xdd0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3c70] [c00000000001a234] .do_signal+0x54/0x320 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3db0] [c00000000001a65c] .do_notify_resume+0x8c/0x100 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3e30] [c00000000000a7b0] .ret_from_except_lite+0x5c/0x60 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 1 lock held by echo/5102: Oct 22 14:45:09 ibm-p8-01-lp7 kernel: #0: (&ut->inode_lock#2){+.+.+.}, at: [] .stapiu_process_found+0x208/0x660 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task staprun:5108 blocked for m= ore than 120 seconds. Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 22 14:45:09 ibm-p8-01-lp7 kernel: staprun D 00003fff8d298318 7= 824=20 5108 29619 0x00008080 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace: Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb3c0] [d00000000de70000] 0xd00000000de70000 (unreliable) Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb590] [c00000000001835c] .__switch_to+0x26c/0x4b0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb640] [c000000000a164a4] .__schedule+0x4e4/0xd10 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb760] [c000000000a18130] .schedule_preempt_disabled+0x40/0xc0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb7e0] [c000000000a11b70] .mutex_lock_nested+0x240/0x5b0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb8f0] [d00000000d92485c] .systemtap_module_refresh+0x4c/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb9a0] [d00000000d92521c] ._stp_module_notifier+0x6c/0x170 [stap_16af15cbdec255c8ee6892d56fef69df_8_4= 819] Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fba30] [c000000000a20c18] .notifier_call_chain+0xb8/0x200 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbae0] [c00000000012bffc] .blocking_notifier_call_chain+0x7c/0x100 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbb70] [c0000000001b1f50] .load_module+0x1840/0x20d0 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbd40] [c0000000001b28d0] .SyS_init_module+0xf0/0x170 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbe30] [c00000000000a188] system_call+0x38/0xb4 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 2 locks held by staprun/5108: Oct 22 14:45:09 ibm-p8-01-lp7 kernel: #0:=20 ((module_notify_list).rwsem){++++.+}, at: [] .blocking_notifier_call_chain+0x60/0x100 Oct 22 14:45:09 ibm-p8-01-lp7 kernel: #1: (module_refresh_mutex){+.+...},= at: [] .systemtap_module_refresh+0x4c/0x650 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819] Oct 22 14:45:29 ibm-p8-01-lp7 kernel: INFO: rcu_sched detected stalls on CPUs/tasks: { 11} (detected by 1, t=3D6503 jiffies, g=3D7870, c=3D7869, q= =3D0) Oct 22 14:45:29 ibm-p8-01-lp7 kernel: Task dump for CPU 11: Oct 22 14:45:29 ibm-p8-01-lp7 kernel: swapper/11 R running task 10= 560=20 0 1 0x00000804 Oct 22 14:45:29 ibm-p8-01-lp7 kernel: Call Trace: Oct 22 14:45:29 ibm-p8-01-lp7 kernel: [c0000005f4a539a0] [0000000000000058] 0x58 (unreliable) Oct 22 14:46:55 ibm-p8-01-lp7 kernel: BUG: soft lockup - CPU#11 stuck for 6= 5s! [swapper/11:0] Oct 22 14:46:55 ibm-p8-01-lp7 kernel: Modules linked in: stap_011c8d5b3f3c577d0cebde949ae1e2fe_8_6219(OE+) stap_0ea83aa31510554f18d088ec7736640f__5685(OE+) stap_fae50114ebc4a4659eb3c14ebac75168_5108(OE+) stap_16af15cbdec255c8ee6892d56fef69df_8_4819(OE) stap_bcdff2ed83de7602b65808ca8e0bf37d__4756(OE) sg pseries_rng nfsd auth_rp= cgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi= _tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_a35e54245d1be418047481fe89e36ea1__1925] =3D=3D=3D=3D=3D --=20 You are receiving this mail because: You are the assignee for the bug.