public inbox for systemtap@sourceware.org
* [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5
@ 2006-06-01 18:45 ldimaggi at redhat dot com
  2006-06-01 19:39 ` [Bug runtime/2725] " wcohen at redhat dot com
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: ldimaggi at redhat dot com @ 2006-06-01 18:45 UTC (permalink / raw)
  To: systemtap

Attempting to probe all kernel.function's hangs FC5

I'm not sure how realistic a test this is - but it is 100% reproducible on my
Fedora Core 5 system.

Invoking stap with this command hangs the system - power cycle required to clear:

stap -v -p5 -e 'probe kernel.function("*")  { printf ("%s  %s  %d  %s  ",
"test-> ", execname(), pid(),  probefunc() ) }'

I'm not yet sure if one probe is causing the hang - or the accumulation of 10K+
probes. 

Systemtap version (built from CVS head 20060601):
  SystemTap translator/driver (version 0.5.7 built 2006-06-01)
  (Using Red Hat elfutils 0.120 libraries.)

Kernel installed:
  kernel-devel-2.6.16-1.2122_FC5
  kernel-debuginfo-2.6.16-1.2122_FC5 
  kernel-2.6.16-1.2122_FC5

-- 
           Summary: Attempting to probe all kernel functions hangs FC5
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
        AssignedTo: systemtap at sources dot redhat dot com
        ReportedBy: ldimaggi at redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=2725

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

* [Bug runtime/2725] Attempting to probe all kernel functions hangs FC5
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
@ 2006-06-01 19:39 ` wcohen at redhat dot com
  2006-06-01 20:46 ` fche at redhat dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2006-06-01 19:39 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From wcohen at redhat dot com  2006-06-01 19:39 -------
Which architecture is this on? There are some differences between i386 and
x86_64 in setting up the probes. x86_64 needs executable regions of memory
allocated. Depending on which architecture is being used, there could be
different failure modes.

Is there any other information output? One trick is to use a plain text
console, without X, so you can see any oopses that come up. Are there any
entries in /var/log/messages related to this problem?

It could be that this experiment is bumping against some limit. You might take a
look at the output of the following to see how many probes there are and the
size of the resulting .ko:

stap -v -p4 -k -e 'probe kernel.function("*")  { printf ("%s  %s  %d  %s  ",
"test-> ", execname(), pid(),  probefunc() ) }'

It is also possible that some function that should be blacklisted is getting
instrumented. You might try to expand
tests/testsuite/systemtap.stress/all_kernel_functions.exp to see if you can
narrow down the boundary between working and crashing.
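
Narrowing down the boundary between working and crashing can be treated as a
bisection over a list of probe points. The following is only a sketch under
stated assumptions: `bisect_probes` is a hypothetical helper, the list file
(one probe point per line) is assumed to exist already, there is assumed to be
exactly one culprit, and the tester command is whatever you use to decide
whether a subset is safe. Since a bad subset may hang the machine, in practice
the tester might be a manual step or a check against a separate test box.

```shell
#!/bin/bash
# bisect_probes LISTFILE TESTER [ARGS...]
#   LISTFILE: one probe point per line (assumed input, produced elsewhere).
#   TESTER:   hypothetical command; it is called with a file of candidate
#             probe points as its last argument and must return 0 when that
#             subset is safe and nonzero when it triggers the failure.
# Prints the single remaining suspect. Assumes exactly one culprit.
bisect_probes() {
    local list=$1; shift
    [ $# -ge 1 ] || return 2
    local work half n
    work=$(mktemp -d) || return 1
    cp "$list" "$work/cur"
    n=$(( $(wc -l < "$work/cur") ))
    while [ "$n" -gt 1 ]; do
        half=$(( n / 2 ))
        head -n "$half" "$work/cur" > "$work/lo"
        tail -n +"$(( half + 1 ))" "$work/cur" > "$work/hi"
        if "$@" "$work/lo"; then
            mv "$work/hi" "$work/cur"   # first half is safe: look in the second
        else
            mv "$work/lo" "$work/cur"   # first half fails: look there
        fi
        n=$(( $(wc -l < "$work/cur") ))
    done
    cat "$work/cur"
    rm -rf "$work"
}
```

Each round halves the candidate list, so roughly 10K probe points would need
about 14 rounds, each of which may cost a reboot if the subset is bad.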


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Attempting to probe all     |Attempting to probe all
                   |kernel functions hangs FC5  |kernel functions hangs FC5



* [Bug runtime/2725] Attempting to probe all kernel functions hangs FC5
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
  2006-06-01 19:39 ` [Bug runtime/2725] " wcohen at redhat dot com
@ 2006-06-01 20:46 ` fche at redhat dot com
  2006-06-01 21:02 ` joshua dot i dot stone at intel dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2006-06-01 20:46 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From fche at redhat dot com  2006-06-01 20:46 -------
Broad wildcard probes are known to still trigger problems.  This is the reason
that the src/tapsets.cxx blacklist exists.  One needs to narrow down failing
wildcards as much as possible, perhaps to individual functions and/or files,
then add them to the blacklist.  When the job is done, function("*") will
automagically exclude blacklisted ones and still give one the thousands of
workable probes one might want (?).


*** This bug has been marked as a duplicate of 1836 ***

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE



* [Bug runtime/2725] Attempting to probe all kernel functions hangs FC5
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
  2006-06-01 19:39 ` [Bug runtime/2725] " wcohen at redhat dot com
  2006-06-01 20:46 ` fche at redhat dot com
@ 2006-06-01 21:02 ` joshua dot i dot stone at intel dot com
  2006-06-01 22:22 ` fche at redhat dot com
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: joshua dot i dot stone at intel dot com @ 2006-06-01 21:02 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From joshua dot i dot stone at intel dot com  2006-06-01 21:01 -------
I think it has to do with printing within the * probe.  I tried the following on
i686 RHEL4:

  probe begin { log("begin") }
  probe kernel.function("*") { }
  probe end { log("end") }

This had no problem.  The system was a bit slow, but still usable and reasonably
responsive.  Then I tried this:

  probe begin { log("begin") }
  probe kernel.function("*") { print(".") }
  probe end { log("end") }

This hung the system.  The simple print(".") will still create a lot of data --
perhaps something in the buffer flush is being overwhelmed, or causing a recursion?
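
The two scripts above differ in both per-hit work and per-hit output. A
possible third data point, sketched here as an assumption rather than
something tried in this report, keeps a small amount of work in the handler
but emits nothing until exit. The shell function below only writes such a
script to a file; `write_counting_script` and the `hits` global are
hypothetical names, and the shared counter adds contention of its own, so the
result would not be conclusive about the buffer flush theory.

```shell
#!/bin/bash
# write_counting_script FILE
#   Writes a SystemTap script to FILE that counts probe hits in a single
#   global integer and prints the total once, from the end probe.
#   Hypothetical helper: it does not run stap itself.
write_counting_script() {
    cat > "$1" <<'EOF'
global hits
probe begin { log("begin") }
probe kernel.function("*") { hits++ }
probe end { printf("hits: %d\n", hits) }
EOF
}
```

The generated file would then be passed to stap by hand, on a machine that can
afford to hang.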


* [Bug runtime/2725] Attempting to probe all kernel functions hangs FC5
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
                   ` (2 preceding siblings ...)
  2006-06-01 21:02 ` joshua dot i dot stone at intel dot com
@ 2006-06-01 22:22 ` fche at redhat dot com
  2006-06-02 14:30 ` fche at redhat dot com
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2006-06-01 22:22 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From fche at redhat dot com  2006-06-01 22:21 -------
Good point, we've had some progress since bug #1836.
Other than reentrancy, we may also trigger stack exhaustion (bug #2685).


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|DUPLICATE                   |



* [Bug runtime/2725] Attempting to probe all kernel functions hangs FC5
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
                   ` (3 preceding siblings ...)
  2006-06-01 22:22 ` fche at redhat dot com
@ 2006-06-02 14:30 ` fche at redhat dot com
  2006-06-02 15:34 ` ldimaggi at redhat dot com
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2006-06-02 14:30 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From fche at redhat dot com  2006-06-02 14:30 -------
Additional data point: a few experiments probing kernel.function("*")
with FC5 kernels indicate improvements but also problems.  In all
my tests, the machine *appears to hang* even with probe handlers that
do nothing.  The machine is actually running, just extremely slowly.


* [Bug runtime/2725] Attempting to probe all kernel functions hangs FC5
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
                   ` (4 preceding siblings ...)
  2006-06-02 14:30 ` fche at redhat dot com
@ 2006-06-02 15:34 ` ldimaggi at redhat dot com
  2006-06-02 18:33   ` Martin Hunt
  2006-06-02 18:33 ` hunt at redhat dot com
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 12+ messages in thread
From: ldimaggi at redhat dot com @ 2006-06-02 15:34 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From ldimaggi at redhat dot com  2006-06-02 15:33 -------
Thanks for all the suggestions and feedback - I need to narrow this down - some
more notes, answers to questions...

Architecture = i386, T41 thinkpad (1GB RAM), 
Linux dhcp83-56.boston.redhat.com 2.6.16-1.2122_FC5 #1 Sun May 21 15:01:01 EDT
2006 i686 i686 i386 GNU/Linux

It's a limit problem - 10K+ probes are being set:
------------
static char const * dwarf_kprobe_probe_613_location_names[10481] = {
  "kernel.function(\"run_init_process@init/main.c:666\")",
  "kernel.function(\"init@init/main.c:690\")",
  "kernel.function(\"rest_init@init/main.c:389\")",
------------

stap_2615.ko is large too:
------------
[root@dhcp83-56 stap0bMFBZ]# ll
total 6552
-rw-r--r-- 1 root root     127 Jun  2 11:05 Makefile
-rw-r--r-- 1 root root 1066809 Jun  2 11:05 stap_2615.c
-rw-r--r-- 1 root root 2188299 Jun  2 11:05 stap_2615.ko
-rw-r--r-- 1 root root    1929 Jun  2 11:05 stap_2615.mod.c
-rw-r--r-- 1 root root   34008 Jun  2 11:05 stap_2615.mod.o
-rw-r--r-- 1 root root 2155316 Jun  2 11:05 stap_2615.o
-rw-r--r-- 1 root root  692476 Jun  2 11:05 stap-symbols.h
-rw-r--r-- 1 root root  492534 Jun  2 11:05 symbols.sorted
------------
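
The two numbers above (probe count and module size) can be pulled out of a
kept build directory with a few lines of shell. This is a rough sketch:
`summarize_stap_build` is a hypothetical helper, and it assumes the generated
C file lists probe locations as lines beginning with `"kernel.function(`, as
in the excerpt quoted above. The generated code may mention such strings
elsewhere, so treat the count as approximate.

```shell
#!/bin/bash
# summarize_stap_build DIR
#   DIR is a temporary directory kept by stap (for example with -k).
#   For each stap_*.c (skipping *.mod.c), print the number of lines that
#   look like probe-location entries and the size of the matching .ko.
summarize_stap_build() {
    local dir=$1 c ko
    for c in "$dir"/stap_*.c; do
        case $c in *.mod.c) continue ;; esac
        [ -f "$c" ] || continue
        ko=${c%.c}.ko
        # grep -c prints 0 and returns nonzero when nothing matches;
        # only its printed count is used here.
        printf '%s: %s probe-location lines\n' "${c##*/}" \
            "$(grep -c '"kernel\.function(' "$c")"
        if [ -f "$ko" ]; then
            printf '%s: %s bytes\n' "${ko##*/}" "$(( $(wc -c < "$ko") ))"
        fi
    done
}
```

For the directory listed above, this would be called as
`summarize_stap_build /path/to/stap0bMFBZ`, where the parent path is whatever
stap reported when it kept the directory.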

This script reliably hangs - or even reboots the system:
------------
#!/usr/bin/env stap 
probe begin { 
  log("begin") 
}
probe end { 
  log("end") 
}
probe kernel.function("*") {
}
------------

Nothing interesting is being written to /var/log/messages


* Re: [Bug runtime/2725] Attempting to probe all kernel functions  hangs FC5
  2006-06-02 15:34 ` ldimaggi at redhat dot com
@ 2006-06-02 18:33   ` Martin Hunt
  0 siblings, 0 replies; 12+ messages in thread
From: Martin Hunt @ 2006-06-02 18:33 UTC (permalink / raw)
  To: sourceware-bugzilla; +Cc: systemtap

I ran your script on 2.6.16-1.2122_FC5smp i686.  It is running now on
this system as I am sending this. Besides a slight slowdown, nothing bad
is happening. Module size is 2.3M, or about half the size of the nvidia
module on my system. OK, this is interesting. I hit ^C to terminate the
script and  I lost all console input in every window for several
minutes. Firefox was running fine. I could read emails, but typing in a
window did nothing. Then the script exited and everything is fast and
responsive again.

I wonder if this is related to the bug where the console starts
repeating characters while scripts exit? Anyone else see that sometimes?





* [Bug runtime/2725] Attempting to probe all kernel functions hangs FC5
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
                   ` (6 preceding siblings ...)
  2006-06-02 18:33 ` hunt at redhat dot com
@ 2006-11-17 18:22 ` wcohen at redhat dot com
  2006-11-20 21:46 ` [Bug translator/2725] function("*") probes sometimes crash & burn fche at redhat dot com
  2006-11-20 23:05 ` fche at redhat dot com
  9 siblings, 0 replies; 12+ messages in thread
From: wcohen at redhat dot com @ 2006-11-17 18:22 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From wcohen at redhat dot com  2006-11-17 15:40 -------
Logged into each machine remotely and ran the 20061117 snapshot with the
following script:

  ./stap_testing_200611170930/install/bin/stap -v -p5 -e 'probe
kernel.function("*")  { printf ("%s  %s  %d  %s  ","test-> ", execname(), pid(),
 probefunc() ) }'



fc6 i686 UP
kernel: 2.6.18-1.2849.fc6 #1 SMP Fri Nov 10 12:36:14 EST 2006 i686 i686 i386
GNU/Linux

Got the following output to the console when running the mentioned script:

BUG: unable to handle kernel paging request atf printing eip:
8c60c02f
*pde = 00000000
Oops: 0000 [#1]
BUG: unable to handle kernel paging request at virtual address 8c60c02f
 printing eip:
c0629c0b
*pde = 00000000
Oops: 0000 [#2]
BUG: unable to handle kernel NULL pointer dereference at virtual address 0000000
printing eip:
c0629c0b
*pde = 00000000
Recursive die() failure, output suppressed
 <0>BUG: spinlock lockup on CPU#0, staprun/2683, c066fbac (Not tainted)

Got the same output even with the printf taken out of the script.



rawhide i686 SMP
 2.6.18-1.2849.fc6PAE #1 SMP Fri Nov 10 13:27:10 EST 2006 i686 i686 i386 GNU/Linux

The machine hangs with the script that has the printf in the body. No
output, oops, or anything else. However, the following script didn't kill
the machine:

./stap_testing_200611170930/install/bin/stap -v -p5 -e 'probe
kernel.function("*")  { }'

It took significant time for the probes to be removed (about a minute).



rawhide x86-64

Linux dhcp59-198.rdu.redhat.com 2.6.18-1.2849.fc6 #1 SMP Fri Nov 10 12:34:46 EST
2006 x86_64 x86_64 x86_64 GNU/Linux

The machine oopses with the printf script, so I simplified the script.
The following script killed the machine by setting off the watchdog timer:

./stap_testing_200611170930/install/bin/stap -v -p5 -e 'probe
kernel.function("*")  { }'

The following was transcribed from the screen:

 [<ffffffff802636c0>] do_nmi+0x45/0x63
 [<ffffffff80262bb7>] nmi+0x7f/0x88
 [<ffffffff80262bb8>] nmi+0x80/0x88
 <<EOE>>   [<ffffffff80262bb8>] nmi+0x80/0x88

Kernel panic- not syncing: Aiee, killing interrupt handler!
 Bug: warning at kernel/panic.c:137/panic() (Not tainted)

Call Trace:
 [<ffffffff802691db>] show_trace+0x34/0x47
 [<ffffffff802691e2>] dump_stack+0x12/0x17
 [<ffffffff8028b933>] panic_0x1e3/0x1f4
 [<ffffffff80214ee0>] do_exit+0x8c/0x8c2
 [<ffffffff80262fcb>] sync_regs+0x0/0x67
 [<ffffffff80678290>]
DWARF2 unwinder stuck at 0xffffffff80678290
Leftover inexact backtrace:
 <NMI>  [<ffffffff802635a>] nmi_watchdog_tick+0x105/0x1a6
 [<ffffffff802632f7>] default_do_nmi+0x86/0x1de
 [<ffffffff802636c0>] do_nmi+0x45/0x63
 [<ffffffff80262bb7>] nmi+0x7f/0x88
 [<ffffffff80262bb8>] nmi+0x80/0x88
 <<EOE>>   [<ffffffff80262bb8>] nmi+0x80/0x88





* [Bug translator/2725] function("*") probes sometimes crash & burn
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
                   ` (7 preceding siblings ...)
  2006-11-17 18:22 ` wcohen at redhat dot com
@ 2006-11-20 21:46 ` fche at redhat dot com
  2006-11-20 23:05 ` fche at redhat dot com
  9 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2006-11-20 21:46 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From fche at redhat dot com  2006-11-20 21:44 -------
This includes a variety of kernels, including rhel5/fc5/fc6.

http://sources.redhat.com/ml/systemtap/2006-q4/msg00474.html


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |critical
             Status|REOPENED                    |ASSIGNED
          Component|runtime                     |translator
   Last reconfirmed|0000-00-00 00:00:00         |2006-11-20 21:44:38
               date|                            |
            Summary|Attempting to probe all     |function("*") probes
                   |kernel functions hangs FC5  |sometimes crash & burn



* [Bug translator/2725] function("*") probes sometimes crash & burn
  2006-06-01 18:45 [Bug runtime/2725] New: Attempting to probe all kernel functions hangs FC5 ldimaggi at redhat dot com
                   ` (8 preceding siblings ...)
  2006-11-20 21:46 ` [Bug translator/2725] function("*") probes sometimes crash & burn fche at redhat dot com
@ 2006-11-20 23:05 ` fche at redhat dot com
  9 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2006-11-20 23:05 UTC (permalink / raw)
  To: systemtap



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|systemtap at sources dot    |fche at redhat dot com
                   |redhat dot com              |


