public inbox for systemtap@sourceware.org
* [Bug runtime/15880] New: on ppc64, getting a kernel panic when running systemtap.stress/conversions.exp
From: dsmith at redhat dot com @ 2013-08-22 19:18 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=15880

            Bug ID: 15880
           Summary: on ppc64, getting a kernel panic when running
                    systemtap.stress/conversions.exp
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: dsmith at redhat dot com

We're getting a repeatable kernel panic (on recent kernels) on ppc64 when
running systemtap.stress/conversions.exp.

====
[ 4871.506226] BUG: scheduling while atomic: stapio/2827/0x10010001
[ 4871.506241] Modules linked in: stap_d1d8a331689baeed1bf006b384f9a927_15275(OF) sg ehea xfs libcrc32c sd_mod crc_t10dif ibmvscsi scsi_transport_srp scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_d4a86a3bb86f3f1090832ceb0ea18cf2_11411]
[ 4871.506302] CPU: 1 PID: 2827 Comm: stapio Tainted: GF          O--------------   3.10.0-9.el7.ppc64 #1
[ 4871.506313] Call Trace:
[ 4871.506327] [c000000004ad2900] [c00000000001640c] .show_stack+0x7c/0x1f0 (unreliable)
[ 4871.506344] [c000000004ad29d0] [c000000000740974] .dump_stack+0x28/0x3c
[ 4871.506359] [c000000004ad2a40] [c000000000738d50] .__schedule_bug+0x50/0x74
[ 4871.506372] [c000000004ad2ab0] [c00000000072b9b8] .__schedule+0x978/0x980
[ 4871.506386] [c000000004ad2d30] [c0000000000d1c04] .__cond_resched+0x24/0x50
[ 4871.506399] [c000000004ad2db0] [c00000000072bed0] ._cond_resched+0x40/0x50
[ 4871.506414] [c000000004ad2e20] [c0000000003ff1a8] .strncpy_from_user+0x118/0x2c0
[ 4871.506436] [c000000004ad2ee0] [d000000004841b0c] ._stp_strncpy_from_user+0x6c/0xa0 [stap_d1d8a331689baeed1bf006b384f9a927_15275]
[ 4871.506458] [c000000004ad2f60] [d000000004841d48] .function_user_string_n+0xe8/0x160 [stap_d1d8a331689baeed1bf006b384f9a927_15275]
[ 4871.506480] [c000000004ad3000] [d000000004848274] .probe_2235+0x1a4/0x2c0 [stap_d1d8a331689baeed1bf006b384f9a927_15275]
[ 4871.506503] [c000000004ad30a0] [d00000000484aa8c] .handle_perf_probe+0x18c/0x370 [stap_d1d8a331689baeed1bf006b384f9a927_15275]
[ 4871.506522] [c000000004ad3140] [c00000000018bb34] .__perf_event_overflow+0xc4/0x2b0
[ 4871.506537] [c000000004ad3220] [c00000000018bea4] .perf_swevent_hrtimer+0x184/0x1b0
[ 4871.506553] [c000000004ad3350] [c0000000000c3dfc] .__run_hrtimer+0xac/0x290
[ 4871.506566] [c000000004ad33f0] [c0000000000c4c88] .hrtimer_interrupt+0x138/0x320
[ 4871.506581] [c000000004ad3500] [c00000000001e340] .timer_interrupt+0x120/0x2e0
[ 4871.506596] [c000000004ad35b0] [c0000000000025d4] decrementer_common+0x154/0x180
[ 4871.506612] --- Exception: 901 at .arch_local_irq_restore+0x74/0x90
[ 4871.506612]     LR = .try_to_wake_up+0x394/0x400
[ 4871.506629] [c000000004ad38a0] [0000000000000004] 0x4 (unreliable)
[ 4871.506644] [c000000004ad3910] [c00000000072d744] ._raw_spin_unlock_irqrestore+0x34/0x80
[ 4871.506659] [c000000004ad3980] [c0000000000cdf98] .__wake_up+0x58/0x80
[ 4871.506672] [c000000004ad3a20] [c00000000045d908] .tty_wakeup+0x38/0xe0
[ 4871.506685] [c000000004ad3aa0] [c00000000046d384] .pty_write+0x94/0xa0
[ 4871.506699] [c000000004ad3b30] [c0000000004643a4] .n_tty_write+0x224/0x530
[ 4871.506712] [c000000004ad3c30] [c0000000004602d8] .tty_write+0x128/0x2d0
[ 4871.506725] [c000000004ad3cf0] [c00000000021e950] .vfs_write+0xe0/0x260
[ 4871.506738] [c000000004ad3d90] [c00000000021f558] .SyS_write+0x58/0xd0
[ 4871.506752] [c000000004ad3e30] [c000000000009ed4] syscall_exit+0x0/0x98
[ 4871.506783] ------------[ cut here ]------------
[ 4871.506792] WARNING: at kernel/hrtimer.c:1235
[ 4871.506798] Modules linked in: stap_d1d8a331689baeed1bf006b384f9a927_15275(OF) sg ehea xfs libcrc32c sd_mod crc_t10dif ibmvscsi scsi_transport_srp scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_d4a86a3bb86f3f1090832ceb0ea18cf2_11411]
[ 4871.506848] CPU: 1 PID: 2827 Comm: stapio Tainted: GF       W  O--------------   3.10.0-9.el7.ppc64 #1
[ 4871.506859] task: c000000004a3b0b0 ti: c000000004ad0000 task.ti: c000000004ad0000
[ 4871.506869] NIP: c0000000000c3d94 LR: c0000000000c4c88 CTR: c00000000001dd20
[ 4871.506881] REGS: c000000004ad30d0 TRAP: 0700   Tainted: GF       W  O--------------    (3.10.0-9.el7.ppc64)
[ 4871.506891] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 48004484  XER: 00000002
[ 4871.506916] hrtimer: interrupt took 679124 ns
[ 4871.506923] SOFTE: 1
[ 4871.506929] CFAR: c0000000000c4c84
[ 4871.506935] 
GPR00: c0000000000c4c88 c000000004ad3350 c00000000110e5e0 c0000000acdf3508 
GPR04: c000000004ad3460 009340731cad4f3f ffffffffffffffff 002f866aa4000000 
GPR08: 0000046e2052ae8c 0000000000000001 0000000000000000 009340731cad4f3f 
GPR12: 0000000028004428 c00000000f140400 000000001000c570 000000001000c530 
GPR16: 0000000000000000 0000000000b40000 0000046e2051c2bc 0000000000000000 
GPR20: c0000000016228e0 c000000001622920 c000000001622960 c000000000ae2828 
GPR24: 0000000000000002 7fffffffffffffff 0000046e205a33fc c000000001622828 
GPR28: c000000001622868 c000000004ad3460 c000000001622828 c0000000acdf3508 
[ 4871.507022] NIP [c0000000000c3d94] .__run_hrtimer+0x44/0x290
[ 4871.507028] LR [c0000000000c4c88] .hrtimer_interrupt+0x138/0x320
[ 4871.507033] PACATMSCRATCH [8000000000009032]
[ 4871.507037] Call Trace:
[ 4871.507041] [c000000004ad3350] [c0000000000c3e40] .__run_hrtimer+0xf0/0x290 (unreliable)
[ 4871.507052] [c000000004ad33f0] [c0000000000c4c88] .hrtimer_interrupt+0x138/0x320
[ 4871.507062] [c000000004ad3500] [c00000000001e340] .timer_interrupt+0x120/0x2e0
[ 4871.507069] [c000000004ad35b0] [c0000000000025d4] decrementer_common+0x154/0x180
[ 4871.507077] --- Exception: 901 at .arch_local_irq_restore+0x74/0x90
[ 4871.507077]     LR = .try_to_wake_up+0x394/0x400
[ 4871.507085] [c000000004ad38a0] [0000000000000004] 0x4 (unreliable)
[ 4871.507096] [c000000004ad3910] [c00000000072d744] ._raw_spin_unlock_irqrestore+0x34/0x80
[ 4871.507105] [c000000004ad3980] [c0000000000cdf98] .__wake_up+0x58/0x80
[ 4871.507112] [c000000004ad3a20] [c00000000045d908] .tty_wakeup+0x38/0xe0
[ 4871.507120] [c000000004ad3aa0] [c00000000046d384] .pty_write+0x94/0xa0
[ 4871.507128] [c000000004ad3b30] [c0000000004643a4] .n_tty_write+0x224/0x530
[ 4871.507136] [c000000004ad3c30] [c0000000004602d8] .tty_write+0x128/0x2d0
[ 4871.507142] [c000000004ad3cf0] [c00000000021e950] .vfs_write+0xe0/0x260
[ 4871.507149] [c000000004ad3d90] [c00000000021f558] .SyS_write+0x58/0xd0
[ 4871.507156] [c000000004ad3e30] [c000000000009ed4] syscall_exit+0x0/0x98
[ 4871.507162] Instruction dump:
[ 4871.507166] fb81ffe0 fbc1fff0 7c7f1b78 7c9d2378 f8010010 f821ff61 eb830030 eb7c0000
[ 4871.507180] 892d028a 7d290074 7929d182 69290001 <0b090000> 60000000 7fe3fb78 7f84e378
[ 4871.507194] ---[ end trace 4c3097dba02cba20 ]---
====
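
The hex value after the PID in the "scheduling while atomic" line is the task's preempt count. Below is a minimal sketch of decoding it. It assumes a Linux 3.10-era field layout (8 preemption bits, 8 softirq bits, hardirq nesting starting at bit 16) and assumes PREEMPT_ACTIVE is 0x10000000 on ppc64. Neither assumption comes from this report, and the helper name decode_preempt_count is purely illustrative.

```python
# Assumed preempt_count layout for a 3.10-era kernel (not taken from the report).
PREEMPT_MASK = 0x000000FF    # nested preempt_disable() depth
SOFTIRQ_MASK = 0x0000FF00    # softirq nesting
HARDIRQ_MASK = 0x03FF0000    # hardirq nesting (assumed to be 10 bits wide)
PREEMPT_ACTIVE = 0x10000000  # assumed ppc64 value of the PREEMPT_ACTIVE flag


def decode_preempt_count(value):
    """Split a raw preempt count into its assumed fields."""
    return {
        "preempt": value & PREEMPT_MASK,
        "softirq": (value & SOFTIRQ_MASK) >> 8,
        "hardirq": (value & HARDIRQ_MASK) >> 16,
        "preempt_active": bool(value & PREEMPT_ACTIVE),
    }


# Value copied from the "BUG: scheduling while atomic: stapio/2827/0x10010001" line.
fields = decode_preempt_count(0x10010001)
print(fields)
```

Under these assumptions the hardirq field is nonzero. That would be consistent with the call trace above, where the probe handler runs beneath hrtimer_interrupt and strncpy_from_user reaches _cond_resched from that context.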

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [Bug runtime/15880] on ppc64, getting a kernel panic when running systemtap.stress/conversions.exp
From: dsmith at redhat dot com @ 2013-08-22 20:23 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=15880

--- Comment #1 from David Smith <dsmith at redhat dot com> ---
The conversions.exp testcase tests 3 different addresses. Here are the results
for each:

0: I get the kernel panic and the system crashes
0xffffffff: I get the kernel panic, but the system doesn't crash
0xffffffffffffffff: no panic or crash


* [Bug runtime/15880] on ppc64, getting a kernel panic when running systemtap.stress/conversions.exp
From: dsmith at redhat dot com @ 2013-08-23 15:02 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=15880

--- Comment #2 from David Smith <dsmith at redhat dot com> ---
Here's some additional information. The conversions.exp testcase runs several
scripts: conversions.stp, conversions_trace.stp (testing tracepoints),
conversions_profile.stp (testing timer.profile), and conversions_perf.stp
(testing perf probes).

The panic happens only when the conversions_perf.stp test is run. In addition,
this appears to be related to the user_string_n() function, which was changed
by bug #15044. Note that the panic doesn't happen if I run conversions_perf.stp
with the '--compatible=2.2' flag.

To sum up, the following doesn't cause a panic:

# stap --compatible=2.2 -DMAXSKIPPED=99999 -DMAXERRORS=40 -v ../src/testsuite/systemtap.stress/conversions_perf.stp 0xffffffff

But this does:

# stap -DMAXSKIPPED=99999 -DMAXERRORS=40 -v ../src/testsuite/systemtap.stress/conversions_perf.stp 0xffffffff
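
To compare the two invocations across all three addresses mentioned in comment #1, the command lines can be generated rather than typed by hand. This is only a sketch: the helper build_commands is hypothetical, it merely prints the commands, and only the 0xffffffff pair was actually reported in this comment. The other combinations are untested here.

```python
import shlex

# Script path and flags as quoted in this comment; addresses from comment #1.
SCRIPT = "../src/testsuite/systemtap.stress/conversions_perf.stp"
ADDRESSES = ["0", "0xffffffff", "0xffffffffffffffff"]


def build_commands(script=SCRIPT, addresses=ADDRESSES):
    """Return one argv list per (compatibility flag, address) combination."""
    commands = []
    for compat in (["--compatible=2.2"], []):
        for addr in addresses:
            commands.append(
                ["stap", *compat, "-DMAXSKIPPED=99999", "-DMAXERRORS=40",
                 "-v", script, addr]
            )
    return commands


# Print the commands only; running the ones without --compatible=2.2
# may trigger the panic on an affected kernel.
for argv in build_commands():
    print(shlex.join(argv))
```

Printing instead of executing is deliberate, since some of these runs are expected to take down an affected machine.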


* [Bug runtime/15880] on ppc64, getting a kernel panic when running systemtap.stress/conversions.exp
From: dsmith at redhat dot com @ 2013-08-26 18:20 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=15880

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from David Smith <dsmith at redhat dot com> ---
After lots of debugging and head scratching, I upgraded the kernel, and the
problem went away. I'll close this for now; we can reopen it if the problem
pops up again.
