public inbox for systemtap@sourceware.org
* [Bug kprobes/6707] New: oops crashes with 2.6.25 - onoffprobe
@ 2008-06-30 15:41 mwielaard at redhat dot com
  2008-06-30 15:44 ` [Bug kprobes/6707] " mwielaard at redhat dot com
                   ` (11 more replies)
  0 siblings, 12 replies; 14+ messages in thread
From: mwielaard at redhat dot com @ 2008-06-30 15:41 UTC (permalink / raw)
  To: systemtap

On x86 with kernel 2.6.25 I regularly get hardware lockups caused by oopses
when running make installcheck. This is specifically with Fedora 9,
2.6.25.6-55.fc9.i686. The Fedora 8 2.6.24 kernel was fine.

After setting up netconsole as described on
http://sourceware.org/systemtap/wiki/DeveloperSetupTips I collected various
oopses, which I will attach.

I cannot replicate this in a qemu-kvm environment; there I get some regular
make installcheck failures, but no kernel oopses. It only occurs on real
hardware.

The test that most likely (but not always) triggers these oopses is
testsuite/systemtap.base/onoffprobe.stp, which can be explicitly run with make
installcheck RUNTESTFLAGS=onoffprobe.exp

-- 
           Summary: oops crashes with 2.6.25 - onoffprobe
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: kprobes
        AssignedTo: systemtap at sources dot redhat dot com
        ReportedBy: mwielaard at redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=6707

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-06-30 15:44 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-06-30 10:42 -------
Created an attachment (id=2805)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=2805&action=view)
Various collected oopses on the affected kernel



* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-06-30 16:44 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-06-30 12:49 -------
Some more data points.

There were also oopses/lockups with a build from this branch: systemtap
version 0.7/0.133, git branch pr6429-comp-unwindsyms, commit 3c02e16c.

With systemtap 0.6.2, however (systemtap version: version 0.6.2/0.133 built
2008-06-30), onoffprobe seems to work flawlessly every time.


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-06-30 19:22 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-06-30 16:52 -------
I narrowed it down to the following script:

global switch=-1

#begin probe
probe begin if (switch==-1) {
        log("begin1 probed");
}

probe begin if (switch==0) {
        log("begin2 probed");
}

#dwarf probe (return)
probe kernel.function("sys_write").return if (switch == 1) {
        log("function return probed")
        switch = 0
}

#dwarf probe (entry)
probe kernel.function("sys_write") if (switch == 2) {
        log("function entry probed")
        switch = 0
}

It looks like if I remove any of the probes, the conditions, or manipulate
switch in any other way, things don't hang.
I would expect to see a bit more log output, but all I get when it hangs
(run with /usr/local/systemtap/bin/stap -k -vv -DDEBUG_SYMBOLS=2
onoffprobe.stp -m onoffprobe) is:

begin1 probed
_stp_module_relocate:36: kernel, _stext, 805fd
_stp_module_relocate:36: kernel, _stext, 805fd

None of the other probes seem to log anything in that case.

This needs some more investigation.


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-06-30 20:13 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-06-30 19:49 -------
(In reply to comment #3)
> I narrowed it down to the following script:
> 
> global switch=-1
> 
> #begin probe
> probe begin if (switch==-1) {
>         log("begin1 probed");
> }
> 
> probe begin if (switch==0) {
>         log("begin2 probed");
> }
> 
> #dwarf probe (return)
> probe kernel.function("sys_write").return if (switch == 1) {
>         log("function return probed")
>         switch = 0
> }
> 
> #dwarf probe (entry)
> probe kernel.function("sys_write") if (switch == 2) {
>         log("function entry probed")
>         switch = 0
> }
> 
> It looks like if I remove any of the probes, the conditions, or manipulate
> switch in any other way, things don't hang.

Also all the switch assignment statements and the log statements are necessary.
Remove any of them and things seem fine.


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mhiramat at redhat dot com @ 2008-06-30 20:14 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mhiramat at redhat dot com  2008-06-30 19:58 -------
I tested it on my i686 PC with 2.6.25, but could not reproduce it.
How frequently does it happen, and what kernel configuration are you using?



* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-06-30 20:23 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-06-30 20:14 -------
And the exact same script with systemtap-0.6.2 on the same setup/machine seems fine.


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-06-30 20:25 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-06-30 20:16 -------
(In reply to comment #5)
> I tested it on my i686 PC with 2.6.25, but could not reproduce it.
> How frequently does it happen?

It happens almost always.

> and what kernel configuration are you using?

This is with the stock, fully updated Fedora 9 kernel, 2.6.25.6-55.fc9.i686.
I'll attach the config file.


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-06-30 20:44 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-06-30 20:18 -------
Created an attachment (id=2809)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=2809&action=view)
config-2.6.25.6-55.fc9.i686



* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: ananth at in dot ibm dot com @ 2008-07-01  8:52 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From ananth at in dot ibm dot com  2008-07-01 08:51 -------
My system is running the exact same kernel, same config too, but I don't see the
crash. It just prints 'begin1 probed'. I can terminate the script and the system
is usable. No indication of any oops in dmesg either.

I've even been able to toggle switch in sys_write (to 1) and sys_write return
(to -1) to continue probing and the probes hit fine; the log does get printed
without problems.


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-07-01  8:59 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-07-01 08:58 -------
(In reply to comment #9)
> My system is running the exact same kernel, same config too, but I don't see the
> crash. It just prints 'begin1 probed'. I can terminate the script and the system
> is usable. No indication of any oops in dmesg either.

Yeah :{ I am beginning to think this is just this one system. As I said, if I
set up the same system in a qemu-kvm environment I don't get any oops.

> I've even been able to toggle switch in sys_write (to 1) and sys_write return
> (to -1) to continue probing and the probes hit fine; the log does get printed
> without problems.

If I make a similar change, the script works fine. Indeed, that doesn't make
sense, because the script then does even more than the original one...


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mhiramat at redhat dot com @ 2008-07-03 21:34 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mhiramat at redhat dot com  2008-07-03 21:34 -------
(In reply to comment #9)
> My system is running the exact same kernel, same config too, but I don't see the
> crash. It just prints 'begin1 probed'. I can terminate the script and the system
> is usable. No indication of any oops in dmesg either.

I also could not reproduce this bug on my i686 (Pentium D SMP).
Based on the symptoms, my guess is that the module set up a timer handler
incorrectly and the timer then accessed a wrong address, but I'm not sure how
that could happen.

Could you check that your running kernel, the kernel binary, and the kernel
source are all the same revision? Sometimes running make after installing the
kernel breaks that consistency.
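
On an RPM-based system such as Fedora, one way to run that check is sketched
below. It only compares the running release against the installed kernel and
kernel-debuginfo packages, and it assumes those package names;
check_kernel_match is a hypothetical helper, not part of systemtap.

```shell
#!/bin/sh
# Sketch: does an installed package match the running kernel release?
# check_kernel_match is a hypothetical helper written for this example.

# Succeeds when package string $2 is exactly "<name $3>-<release $1>".
check_kernel_match() {
    release=$1
    package=$2
    name=$3
    [ "$package" = "$name-$release" ]
}

running=$(uname -r)
for name in kernel kernel-debuginfo; do
    found=no
    # rpm -q prints one line per installed version of the package;
    # if rpm is missing or the package is not installed, nothing matches.
    for pkg in $(rpm -q "$name" 2>/dev/null); do
        if check_kernel_match "$running" "$pkg" "$name"; then
            found=yes
        fi
    done
    echo "$name matching $running installed: $found"
done
```

A "no" for kernel-debuginfo would mean the debuginfo used to build the probe
module does not belong to the kernel that is actually running.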

Thank you,



* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mwielaard at redhat dot com @ 2008-07-04 10:30 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mwielaard at redhat dot com  2008-07-04 10:29 -------
(In reply to comment #11)
> Could you check that your running kernel, the kernel binary, and the kernel
> source are all the same revision? Sometimes running make after installing the
> kernel breaks that consistency.

It is the standard fc9 kernel and kernel-debuginfo packages. I have since
upgraded to the latest available:

# uname -a; rpm -q kernel kernel-debuginfo
Linux hermans.wildebeest.org 2.6.25.9-76.fc9.i686 #1 SMP Fri Jun 27 16:14:35 EDT 2008 i686 i686 i386 GNU/Linux
kernel-2.6.25.9-76.fc9.i686
kernel-debuginfo-2.6.25.9-76.fc9.i686

With this kernel the script from comment #3 does indeed run without freezing
the machine. Unfortunately the original script in
testsuite/systemtap.base/onoffprobe.stp still freezes it.


* [Bug kprobes/6707] oops crashes with 2.6.25 - onoffprobe
From: mjw at redhat dot com @ 2009-03-20 19:34 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From mjw at redhat dot com  2009-03-20 19:17 -------
I haven't seen this crash for a long time now on recent Fedora 10 kernels
(e.g. 2.6.27.19-170.2.35.fc10.i686) with systemtap 0.9 or higher.
onoffprobe.exp always passes now.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WORKSFORME




Thread overview: 14+ messages
2008-06-30 15:41 [Bug kprobes/6707] New: oops crashes with 2.6.25 - onoffprobe mwielaard at redhat dot com
2008-06-30 15:44 ` [Bug kprobes/6707] " mwielaard at redhat dot com
2008-06-30 16:44 ` mwielaard at redhat dot com
2008-06-30 19:22 ` mwielaard at redhat dot com
2008-06-30 20:13 ` mwielaard at redhat dot com
2008-06-30 20:14 ` mhiramat at redhat dot com
2008-06-30 20:23 ` mwielaard at redhat dot com
2008-06-30 20:25 ` mwielaard at redhat dot com
2008-06-30 20:44 ` mwielaard at redhat dot com
2008-07-01  8:52 ` ananth at in dot ibm dot com
2008-07-01  8:59 ` mwielaard at redhat dot com
2008-07-03 21:34 ` mhiramat at redhat dot com
2008-07-04 10:30 ` mwielaard at redhat dot com
     [not found] <20080630103842.6707.mjw@redhat.com>
2009-03-20 19:34 ` mjw at redhat dot com
