public inbox for systemtap@sourceware.org
* [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4
@ 2006-02-23  9:40 guanglei at cn dot ibm dot com
  2006-02-23 12:34 ` [Bug kprobes/2387] " fche at redhat dot com
                   ` (12 more replies)
  0 siblings, 13 replies; 15+ messages in thread
From: guanglei at cn dot ibm dot com @ 2006-02-23  9:40 UTC (permalink / raw)
  To: systemtap

When I run dbench while using systemtap to probe all syscalls on ppc64/2.6.15.4,
the system crashes shortly afterwards. The error given by xmon:

Unable to handle kernel paging request for data at address 0x00000010
Faulting instruction address: 0xd000000000270ee4
cpu 0x1: Vector: 300 (Data Access) at [c00000005d95a8c0]
    pc: d000000000270ee4: ._stp_print_flush+0xb8/0x164 [stap_13972]
    lr: d000000000272a94: .probe_1+0x374/0x400 [stap_13972]
    sp: c00000005d95ab40
   msr: 8000000000001032
   dar: 10
 dsisr: 40000000
  current = 0xc000000020739040
  paca    = 0xc000000000538400
    pid   = 25259, comm = hotplug
enter ? for help

1:mon> t
[c00000005d95abf0] d000000000272a94 .probe_1+0x374/0x400 [stap_13972]
[c00000005d95ac90] d000000000272cf4 .dwarf_kprobe_1_enter+0x13c/0x1d8 [stap_13972]
[c00000005d95ad10] c00000000041959c .kprobe_exceptions_notify+0x334/0x5e8
[c00000005d95add0] c00000000041a134 .notifier_call_chain+0x68/0x98
[c00000005d95ae60] c000000000418834 .program_check_exception+0x114/0x5d0
[c00000005d95af00] c000000000004348 program_check_common+0xc8/0x100
--- Exception: 700 (Program Check) at c0000000000b0b94
.__find_get_block_slow+0x0/0x174
[link register   ] c0000000000b1940 .__find_get_block+0x110/0x278
[c00000005d95b1f0] c00000000027c6b0 .put_device+0x1c/0x30 (unreliable)
[c00000005d95b2d0] c0000000000b5184 .__getblk+0x44/0x2cc
[c00000005d95b390] c00000000013d678 .__ext3_get_inode_loc+0x1b0/0x42c
[c00000005d95b450] c00000000013e568 .ext3_reserve_inode_write+0x58/0x11c
[c00000005d95b500] c00000000013e650 .ext3_mark_inode_dirty+0x24/0x5c
[c00000005d95b5b0] c000000000140df0 .ext3_dirty_inode+0x8c/0xbc
[c00000005d95b640] c0000000000ddcb4 .__mark_inode_dirty+0x70/0x1e8
[c00000005d95b6e0] c0000000000d105c .update_atime+0xa4/0xbc
[c00000005d95b770] c0000000000802e8 .do_generic_mapping_read+0x41c/0x474
[c00000005d95b8c0] c000000000082b4c .__generic_file_aio_read+0x1b4/0x21c
[c00000005d95b990] c000000000082d5c .generic_file_aio_read+0x44/0x54
[c00000005d95ba20] c0000000000ae520 .do_sync_read+0xcc/0x124
[c00000005d95bba0] c0000000000ae65c .vfs_read+0xe4/0x1b8
[c00000005d95bc40] c0000000000bd7a4 .kernel_read+0x34/0x58
[c00000005d95bce0] c0000000000e87b4 .compat_do_execve+0x15c/0x2c8
[c00000005d95bd90] c000000000012744 .compat_sys_execve+0x7c/0xf8
[c00000005d95be30] c000000000008600 syscall_exit+0x0/0x18
--- Exception: c01 (System Call) at 000000000fef6004
SP (ffc403c0) is in userspace

-- 
           Summary: system crash on ppc64/2.6.15.4
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: kprobes
        AssignedTo: systemtap at sources dot redhat dot com
        ReportedBy: guanglei at cn dot ibm dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=2387

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
@ 2006-02-23 12:34 ` fche at redhat dot com
  2006-02-23 15:23 ` guanglei at cn dot ibm dot com
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: fche at redhat dot com @ 2006-02-23 12:34 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From fche at redhat dot com  2006-02-23 12:34 -------
If I read this correctly, .__find_get_block_slow suffered some kind of fault. 
Could you disassemble your kernel in its neighbourhood to figure out which part
of that function triggered it?

Also, I don't understand how the kprobe was entered.  The exception notification
stuff should not result in launching into a kprobe.  Systemtap does not set any
"kp_fault_handler" at present.  Does the "stap -p3" source code suggest any
linkage of dwarf_kprobe_1_enter to kprobe_exceptions_notify?  Might there simply
be a structure initialization issue?


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
  2006-02-23 12:34 ` [Bug kprobes/2387] " fche at redhat dot com
@ 2006-02-23 15:23 ` guanglei at cn dot ibm dot com
  2006-03-01 14:41 ` guanglei at cn dot ibm dot com
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: guanglei at cn dot ibm dot com @ 2006-02-23 15:23 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From guanglei at cn dot ibm dot com  2006-02-23 15:23 -------
The following is the disassembly given by objdump:

Disassembly inside __find_get_block:
c0000000000b1934:    mr      r31,r6
c0000000000b1938:    bne-    cr7,c0000000000b1a68 <.__find_get_block+0x238>
c0000000000b193c:    bl      c0000000000b0b94 <.__find_get_block_slow>
c0000000000b1940:    mr.     r31,r3
c0000000000b1944:    beq-    c0000000000b1a68 <.__find_get_block+0x238>
c0000000000b1948:    li      r27,0
c0000000000b194c:    mfmsr   r0


Disassembly around __find_get_block_slow:
c0000000000b0b8c <.sys_fdatasync>:
c0000000000b0b8c:    li      r4,1
c0000000000b0b90:    b       c0000000000b0a10 <.do_fsync>

c0000000000b0b94 <.__find_get_block_slow>:
c0000000000b0b94:    mflr    r0
c0000000000b0b98:    std     r24,-64(r1)
c0000000000b0b9c:    std     r25,-56(r1)
c0000000000b0ba0:    std     r28,-32(r1)
c0000000000b0ba4:    std     r29,-24(r1)
c0000000000b0ba8:    mr      r24,r4

But I wonder whether this info from xmon is useful. I tried several times; it
crashed every time, each time with a different exception and backtrace. I
noticed that all of these errors include:

Unable to handle kernel paging request for data at address ...


--------------- Testing One ---------------------------------

Unable to handle kernel paging request for data at address 0x00000010
Faulting instruction address: 0xd000000000270ee4
cpu 0x1: Vector: 300 (Data Access) at [c000000040dab3f0]
    pc: d000000000270ee4: ._stp_print_flush+0xb8/0x164 [stap_7259]
    lr: d000000000273cb4: .probe_4+0x374/0x400 [stap_7259]
    sp: c000000040dab670
   msr: 8000000000001032
   dar: 10
 dsisr: 40000000
  current = 0xc00000002a351040
  paca    = 0xc000000000538400
    pid   = 9179, comm = dbench
enter ? for help

1:mon> t
[c000000040dab720] d000000000273cb4 .probe_4+0x374/0x400 [stap_7259]
[c000000040dab7c0] d000000000273e6c .dwarf_kprobe_4_enter+0x12c/0x1c8 [stap_7259]
[c000000040dab840] c000000000419164 .trampoline_probe_handler+0xb0/0x150
[c000000040dab8e0] c00000000041959c .kprobe_exceptions_notify+0x334/0x5e8
[c000000040dab9a0] c00000000041a134 .notifier_call_chain+0x68/0x98
[c000000040daba30] c000000000418834 .program_check_exception+0x114/0x5d0
[c000000040dabad0] c000000000004348 program_check_common+0xc8/0x100
--- Exception: 700 (Program Check) at c00000000002a3bc kretprobe_trampoline+0x0/0x8
[c000000040dabe30] c00000000002a3bc kretprobe_trampoline+0x0/0x8
--- Exception: c01 (System Call) at 000000000ff201b8
SP (ff9000b0) is in userspace
1:mon> 

----------- Testing Two -----------------------------------

localhost.localdomain login: Unable to handle kernel paging request for data at address 0x00000010
Faulting instruction address: 0xd000000000270ee4
cpu 0x1: Vector: 300 (Data Access) at [c000000066eeb500]
    pc: d000000000270ee4: ._stp_print_flush+0xb8/0x164 [stap_3949]
    lr: d0000000002736dc: .probe_3+0x374/0x400 [stap_3949]
    sp: c000000066eeb780
   msr: 8000000000001032
   dar: 10
 dsisr: 40000000
  current = 0xc000000002423040
  paca    = 0xc000000000538400
    pid   = 17224, comm = env
enter ? for help
1:mon> t
[c000000066eeb830] d0000000002736dc .probe_3+0x374/0x400 [stap_3949]
[c000000066eeb8d0] d0000000002738a4 .dwarf_kprobe_3_enter+0x13c/0x1d8 [stap_3949]
[c000000066eeb950] c00000000041959c .kprobe_exceptions_notify+0x334/0x5e8
[c000000066eeba10] c00000000041a134 .notifier_call_chain+0x68/0x98
[c000000066eebaa0] c000000000418834 .program_check_exception+0x114/0x5d0
[c000000066eebb40] c000000000004348 program_check_common+0xc8/0x100
--- Exception: 700 (Program Check) at c00000000000ae38 .ppc_newuname+0x14/0x120
[link register   ] c00000000002a3bc kretprobe_trampoline+0x0/0x8
[c000000066eebe30] c000000000004760 .handle_page_fault+0x20/0x54 (unreliable)
--- Exception: c01 (System Call) at 000000000ffe2958
SP (fff6a970) is in userspace
1:mon> 

----------------------------------------------------------


kprobe_exceptions_notify can be triggered by a breakpoint or a single-step trap.
It checks the cause, and if it was triggered by a breakpoint, it invokes
kprobe_handler, which in turn invokes kprobe->pre_handler, i.e. the probe
handlers. And stap -p3 shows:
 dwarf_kprobe_1[i].pre_handler = &dwarf_kprobe_1_enter;

So I think the exception notification stuff *could* result in launching into a
kprobe. Am I missing something?
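
The call chain described above can be sketched as a small user-space model. This
is only an illustration, not kernel code: every name ending in _model, the
event enum and the single "registered" slot are assumptions made for the
sketch, and single-step handling is left out.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only; the real kernel structures are much larger. */
struct pt_regs_model { unsigned long long nip; };   /* faulting instruction address */

struct kprobe_model {
    unsigned long long addr;                         /* probed address */
    int (*pre_handler)(struct kprobe_model *, struct pt_regs_model *);
    int hits;                                        /* times pre_handler ran */
};

enum die_event_model { EVENT_BREAKPOINT, EVENT_SINGLE_STEP };

#define NOTIFY_DONE_MODEL 0   /* event not consumed, other notifiers may run */
#define NOTIFY_STOP_MODEL 1   /* event consumed by the kprobes layer */

struct kprobe_model *registered;   /* one probe only, for brevity */

/* Models kprobe_handler: look up the probe and call its pre_handler. */
int kprobe_handler_model(struct pt_regs_model *regs)
{
    assert(regs != NULL);
    if (registered == NULL || registered->addr != regs->nip)
        return 0;                                    /* not our breakpoint */
    if (registered->pre_handler != NULL)
        registered->pre_handler(registered, regs);
    return 1;
}

/* Models kprobe_exceptions_notify: only breakpoints reach the probe handler. */
int exceptions_notify_model(enum die_event_model ev, struct pt_regs_model *regs)
{
    if (ev == EVENT_BREAKPOINT && kprobe_handler_model(regs))
        return NOTIFY_STOP_MODEL;
    return NOTIFY_DONE_MODEL;                        /* single-step path omitted */
}

/* Stands in for a generated handler such as dwarf_kprobe_1_enter. */
int probe_enter_model(struct kprobe_model *p, struct pt_regs_model *regs)
{
    (void)regs;
    p->hits++;
    return 0;
}
```

In this model, setting registered to a probe whose pre_handler is
probe_enter_model and delivering EVENT_BREAKPOINT at the probed address runs
the handler once, which matches the backtraces where the probe function sits
directly above kprobe_exceptions_notify.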




* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
  2006-02-23 12:34 ` [Bug kprobes/2387] " fche at redhat dot com
  2006-02-23 15:23 ` guanglei at cn dot ibm dot com
@ 2006-03-01 14:41 ` guanglei at cn dot ibm dot com
  2006-03-01 15:38   ` Frank Ch. Eigler
  2006-03-01 16:25 ` jrs at us dot ibm dot com
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 15+ messages in thread
From: guanglei at cn dot ibm dot com @ 2006-03-01 14:41 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From guanglei at cn dot ibm dot com  2006-03-01 14:41 -------
I tried the 2.6.15.1 through 2.6.15.4 and 2.6.16-rc5 kernels, and all of them
gave almost the same error:
Unable to handle kernel paging request for data at address ...

If I don't use systemtap's -b option, it seems to run for a long time without a
kernel panic.

I also noticed that the kernel reports I/O errors even when I am not running
systemtap and only do some simple write operations:
end_request: I/O error, dev sda, sector 17445
end_request: I/O error, dev sda, sector 17447
end_request: I/O error, dev sda, sector 17449
Aborting journal on device sda2.
ext3_abort called.
EXT3-fs error (device sda2): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only

The same version of systemtap runs very well on 2.6.9-30EL, so it is a
bug in the mainline kernel.


* Re: [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-03-01 14:41 ` guanglei at cn dot ibm dot com
@ 2006-03-01 15:38   ` Frank Ch. Eigler
  0 siblings, 0 replies; 15+ messages in thread
From: Frank Ch. Eigler @ 2006-03-01 15:38 UTC (permalink / raw)
  To: systemtap


guanglei@cn.ibm.com wrote:

> [...]
> And I also noticed that the kernel reported the I/O error even when I wasn't
> running systemtap and only did some simple writing operations:
> [...]
> end_request: I/O error, dev sda, sector 17449
> Aborting journal on device sda2.
> ext3_abort called.
> EXT3-fs error (device sda2): ext3_journal_start_sb: Detected aborted journal
> Remounting filesystem read-only

If this appears without having run systemtap, you almost certainly
have a problem with your hardware.  Boot it single-user, run fsck -c.

- FChE


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (2 preceding siblings ...)
  2006-03-01 14:41 ` guanglei at cn dot ibm dot com
@ 2006-03-01 16:25 ` jrs at us dot ibm dot com
  2006-03-01 16:36 ` guanglei at cn dot ibm dot com
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: jrs at us dot ibm dot com @ 2006-03-01 16:25 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From jrs at us dot ibm dot com  2006-03-01 16:25 -------
If you are seeing problems even when not using SystemTap, then this is probably
something outside of SystemTap.  I suggest following this up on the linux-kernel
and linuxppc64-dev mailing lists to see if the problem is located in the kernel.

We should mark this bug as rejected until it's proven that it is a SystemTap problem.


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (3 preceding siblings ...)
  2006-03-01 16:25 ` jrs at us dot ibm dot com
@ 2006-03-01 16:36 ` guanglei at cn dot ibm dot com
  2006-03-01 16:56 ` zanussi at us dot ibm dot com
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: guanglei at cn dot ibm dot com @ 2006-03-01 16:36 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From guanglei at cn dot ibm dot com  2006-03-01 16:36 -------
(In reply to comment #4)
> If you are seen problem even when not using SystemTap the this is probably
> something outside of SystemTap.  I suggest following this up on the linux-kernel
> and linuxppc64-dev mailing list to see if the problems is located in the kernel.
> 
> We should mark this bug as rejected until its proven that it is a SystemTap
problem.

The error: end_request: I/O error, dev sda, sector 17445 ...
happens without running systemtap. It occurs after I copy something
into that partition. But I am not sure whether it is what causes the kernel
panic when running systemtap.

The error:
Unable to handle kernel paging request for data at address
happens when running stap with the -b option.
But I agree with Jose that it may not be a systemtap bug, because systemtap
works quite well on the Red Hat shipped kernels (2.6.9-30.EL, 2.6.9-27.EL).

It should not be a hardware failure, because I tried it on different machines,
and even after reformatting the partition. All of them show the same error.

The 2.6.15 kernel has some changes to the Power architecture code (ppc64 moved
into the powerpc directory), and its relayfs differs a lot from the one in the
RH shipped kernel. I tried leaving relayfs out of the 2.6.15* kernel build so
that systemtap would compile its own copy, but that failed: the relayfs shipped
with systemtap can't be compiled because some function signatures have changed.
If I have time I'll try to replace relayfs.





-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement



* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (4 preceding siblings ...)
  2006-03-01 16:36 ` guanglei at cn dot ibm dot com
@ 2006-03-01 16:56 ` zanussi at us dot ibm dot com
  2006-03-02  5:08 ` zanussi at us dot ibm dot com
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: zanussi at us dot ibm dot com @ 2006-03-01 16:56 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From zanussi at us dot ibm dot com  2006-03-01 16:56 -------
(In reply to comment #5)

To get systemtap to use the relayfs in the 2.6.15 kernel, try putting #define
RELAYFS_VERSION_GE_4 at the top of src/runtime/transport/relayfs.h.

Tom


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (5 preceding siblings ...)
  2006-03-01 16:56 ` zanussi at us dot ibm dot com
@ 2006-03-02  5:08 ` zanussi at us dot ibm dot com
  2006-03-02  5:36 ` guanglei at cn dot ibm dot com
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: zanussi at us dot ibm dot com @ 2006-03-02  5:08 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From zanussi at us dot ibm dot com  2006-03-02 05:08 -------
(In reply to comment #6)

I don't know if this is or isn't the cause of the problem, since I'm not seeing
it on my x86 test machine, but I do see that the wrong relayfs_fs.h header file
(the one in runtime/relayfs/linux/ rather than the one in the installed kernel
sources) is being used to generate the probe module, when running a 2.6.15
kernel without the RELAYFS_VERSION_GE_4 define in relayfs.h.
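
One general way a header mismatch like this can turn into a bad memory access
is a change in structure layout. The sketch below is only an illustration under
stated assumptions: chan_buf_old and chan_buf_new are hypothetical layouts, not
the real relayfs structures, and the thread does not establish that this is the
exact mechanism behind the crash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout seen by a module built against the bundled header. */
struct chan_buf_old { void *start; size_t offset; void *chan; };

/* Hypothetical layout actually used by the running kernel: one field added. */
struct chan_buf_new { void *start; void *data; size_t offset; void *chan; };

/*
 * Read the 'chan' pointer the way code compiled against the old header
 * would: at the old field offset, applied to an object with the new layout.
 */
void *read_chan_at_old_offset(const struct chan_buf_new *buf)
{
    void *chan;

    /* Assumption: pointers and size_t have the same size on this target. */
    assert(sizeof(size_t) == sizeof(void *));
    memcpy(&chan,
           (const unsigned char *)buf + offsetof(struct chan_buf_old, chan),
           sizeof chan);
    return chan;
}
```

With these two layouts the old offset of chan lands on the new offset field, so
the module would treat a byte count as a pointer. Building against the header
from the installed kernel sources removes that kind of disagreement.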

Can you go ahead and try adding that define and see if it helps? i.e. add
#define RELAYFS_VERSION_GE_4 to src/runtime/transport/relayfs.h and then do a
'make install' to get it installed.  Also make sure you have relayfs configured
into your kernel.
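
As a rough picture of what such a define does, here is a minimal sketch of
macro-guarded selection. Only the macro name comes from this thread; how
relayfs.h really tests it is an assumption, and relayfs_header_source is a
hypothetical helper for the illustration.

```c
/* Assumed to be placed at the top of relayfs.h, as suggested above. */
#define RELAYFS_VERSION_GE_4

/*
 * Report which relayfs_fs.h a build guarded this way would pick up.
 * The two strings name the locations mentioned in this thread.
 */
const char *relayfs_header_source(void)
{
#ifdef RELAYFS_VERSION_GE_4
    return "installed kernel sources";
#else
    return "runtime/relayfs/linux/";
#endif
}
```

Without the define, the same function falls through to the bundled copy, which
is the situation described for the 2.6.15 kernel.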

If that's the problem, then this bug could probably be closed and would be fixed
by 2406, which deals with autodetecting the proper relayfs version, including
this one.


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (6 preceding siblings ...)
  2006-03-02  5:08 ` zanussi at us dot ibm dot com
@ 2006-03-02  5:36 ` guanglei at cn dot ibm dot com
  2006-03-02  5:48 ` guanglei at cn dot ibm dot com
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: guanglei at cn dot ibm dot com @ 2006-03-02  5:36 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From guanglei at cn dot ibm dot com  2006-03-02 05:36 -------
> I don't know if this is or isn't the cause of the problem, since I'm not seeing
> it on my x86 test machine, but I do see that the wrong relayfs_fs.h header file
> (the one in runtime/relayfs/linux/ rather than the one in the installed kernel
> sources) is being used to generate the probe module, when running a 2.6.15
> kernel without the RELAYFS_VERSION_GE_4 define in relayfs.h.
> 
> Can you go ahead and try adding that define and see if it helps? i.e. add
> #define RELAYFS_VERSION_GE_4 to src/runtime/transport/relayfs.h and then do a
> 'make install' to get it installed.  Also make sure you have relayfs configured
> into your kernel.
> 
> If that's the problem, then this bug could probably be closed and would be fixed
> by 2406, which deals with autodetecting the proper relayfs version, including
> this one.

I tried it, and it worked. Thanks. It does not seem to crash any more.
But there are some errors (in fact, warnings) when stap compiles the module. I
bypassed them by deleting -Werror in buildrun.cxx:

Running grep " [tT] " /proc/kallsyms | sort -k 1,8 -s -o
/tmp/stap2iLdUc/symbols.sorted
Pass 3: translated to C into "/tmp/stap2iLdUc/stap_6318.c" in
280usr/1000sys/1294real ms.
Running make -C "/lib/modules/2.6.9-30.EL/build" M="/tmp/stap2iLdUc" modules V=1
make: Entering directory `/usr/src/kernels/2.6.9-30.EL-ppc64'
mkdir -p /tmp/stap2iLdUc/.tmp_versions
make -f scripts/Makefile.build obj=/tmp/stap2iLdUc
  gcc -m64 -Wp,-MD,/tmp/stap2iLdUc/.stap_6318.o.d -nostdinc -iwithprefix include
-D__KERNEL__ -Iinclude  -Wall -Wstrict-prototypes -Wno-trigraphs
-fno-strict-aliasing -fno-common -Os -g -Wdeclaration-after-statement
-msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc               
    -mtune=power4 -fno-unit-at-a-time -Wno-unused -Werror -I
"/usr/local/share/systemtap/runtime" -I
"/usr/local/share/systemtap/runtime/relayfs"   -DMODULE
-DKBUILD_BASENAME=stap_6318 -DKBUILD_MODNAME=stap_6318 -c -o
/tmp/stap2iLdUc/.tmp_stap_6318.o /tmp/stap2iLdUc/stap_6318.c
In file included from /usr/local/share/systemtap/runtime/transport/transport.c:20,
                 from /usr/local/share/systemtap/runtime/io.c:14,
                 from /usr/local/share/systemtap/runtime/print.c:16,
                 from /usr/local/share/systemtap/runtime/runtime.h:61,
                 from /tmp/stap2iLdUc/stap_6318.c:30:
/usr/local/share/systemtap/runtime/transport/relayfs.c: In function
`_stp_subbuf_start':
/usr/local/share/systemtap/runtime/transport/relayfs.c:33: warning: implicit
declaration of function `relay_buf_full'
/usr/local/share/systemtap/runtime/transport/relayfs.c:39: warning: implicit
declaration of function `subbuf_start_reserve'
/usr/local/share/systemtap/runtime/transport/relayfs.c: At top level:
/usr/local/share/systemtap/runtime/transport/relayfs.c:77: warning:
initialization from incompatible pointer type
/usr/local/share/systemtap/runtime/transport/relayfs.c: In function
`_stp_relayfs_open':
/usr/local/share/systemtap/runtime/transport/relayfs.c:129: warning: passing arg
5 of `relay_open' makes integer from pointer without a cast
/usr/local/share/systemtap/runtime/transport/relayfs.c:129: error: too few
arguments to function `relay_open'
In file included from /usr/local/share/systemtap/runtime/transport/transport.c:45,
                 from /usr/local/share/systemtap/runtime/io.c:14,
                 from /usr/local/share/systemtap/runtime/print.c:16,
                 from /usr/local/share/systemtap/runtime/runtime.h:61,
                 from /tmp/stap2iLdUc/stap_6318.c:30:
/usr/local/share/systemtap/runtime/transport/procfs.c: In function `_stp_proc_read':
/usr/local/share/systemtap/runtime/transport/procfs.c:35: error: incompatible
types in assignment
/usr/local/share/systemtap/runtime/transport/procfs.c:36: error: incompatible
types in assignment
In file included from /usr/local/share/systemtap/runtime/io.c:14,
                 from /usr/local/share/systemtap/runtime/print.c:16,
                 from /usr/local/share/systemtap/runtime/runtime.h:61,
                 from /tmp/stap2iLdUc/stap_6318.c:30:
/usr/local/share/systemtap/runtime/transport/transport.c: In function
`_stp_handle_buf_info':
/usr/local/share/systemtap/runtime/transport/transport.c:86: error: incompatible
types in assignment
/usr/local/share/systemtap/runtime/transport/transport.c:87: error: incompatible
types in assignment
make[1]: *** [/tmp/stap2iLdUc/stap_6318.o] Error 1
make: *** [_module_/tmp/stap2iLdUc] Error 2
make: Leaving directory `/usr/src/kernels/2.6.9-30.EL-ppc64'
Pass 4: compiled C into "stap_6318.ko" in 2820usr/220sys/2893real ms.
Pass 4: compilation failed.  Try again with more '-v' (verbose) options.
Running rm -rf /tmp/stap2iLdUc


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (7 preceding siblings ...)
  2006-03-02  5:36 ` guanglei at cn dot ibm dot com
@ 2006-03-02  5:48 ` guanglei at cn dot ibm dot com
  2006-03-02  5:53 ` zanussi at us dot ibm dot com
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: guanglei at cn dot ibm dot com @ 2006-03-02  5:48 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From guanglei at cn dot ibm dot com  2006-03-02 05:48 -------
> I tried, and it worked. Thanks. It seems not crash any more.
> But there is some errors(in fact, warnings) when stap is compiling the module, I
> bypassed it by delete the -Werror in buildrun.cxx:
On the 2.6.15.3 kernel (with -Werror in buildrun.cxx), the error is:

Running grep " [tT] " /proc/kallsyms | sort -k 1,8 -s -o
/tmp/stap5mvGWl/symbols.sorted
Pass 3: translated to C into "/tmp/stap5mvGWl/stap_12492.c" in
220usr/90sys/313real ms.
Running make -C "/lib/modules/2.6.15.3/build" M="/tmp/stap5mvGWl" modules V=1
make: Entering directory `/usr/src/linux-2.6.15.3'
mkdir -p /tmp/stap5mvGWl/.tmp_versions
make -f scripts/Makefile.build obj=/tmp/stap5mvGWl
  gcc -m64 -Wp,-MD,/tmp/stap5mvGWl/.stap_12492.o.d  -nostdinc -isystem
/usr/lib/gcc/ppc64-redhat-linux/3.4.5/include -D__KERNEL__ -Iinclude  -include
include/linux/autoconf.h  -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs
-fno-strict-aliasing -fno-common -ffreestanding -Os     -fomit-frame-pointer -g
-msoft-float -pipe -mminimal-toc -mtraceback=none  -mcall-aixdesc -mtune=power4
-mno-altivec -funit-at-a-time -mstring -Wa,-maltivec
-Wdeclaration-after-statement  -Wno-unused -Werror -I
"/usr/local/share/systemtap/runtime" -I
"/usr/local/share/systemtap/runtime/relayfs"   -DMODULE
-DKBUILD_BASENAME=stap_12492 -DKBUILD_MODNAME=stap_12492 -c -o
/tmp/stap5mvGWl/.tmp_stap_12492.o /tmp/stap5mvGWl/stap_12492.c
In file included from /usr/local/share/systemtap/runtime/transport/transport.c:20,
                 from /usr/local/share/systemtap/runtime/io.c:14,
                 from /usr/local/share/systemtap/runtime/print.c:16,
                 from /usr/local/share/systemtap/runtime/runtime.h:61,
                 from /tmp/stap5mvGWl/stap_12492.c:30:
/usr/local/share/systemtap/runtime/transport/relayfs.c:77: warning:
initialization from incompatible pointer type
make[1]: *** [/tmp/stap5mvGWl/stap_12492.o] Error 1
make: *** [_module_/tmp/stap5mvGWl] Error 2
make: Leaving directory `/usr/src/linux-2.6.15.3'
Pass 4: compiled C into "stap_12492.ko" in 2210usr/250sys/2104real ms.
Pass 4: compilation failed.  Try again with more '-v' (verbose) options.
Running rm -rf /tmp/stap5mvGWl

So do we need some explicit type casts to eliminate such warnings?
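
On the cast question, a small sketch may help. "initialization from incompatible
pointer type" on a callback table usually means the handler's signature no
longer matches the one the table declares. The types and signatures below are
hypothetical (the real relayfs callback prototypes are not shown in this
thread); the point is only that a cast hides the mismatch, while an adapter
with the expected signature fixes it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer type standing in for the relayfs one. */
struct buf_model { size_t subbufs_started; };

/* Hypothetical "new" callback table: the callback takes four arguments. */
struct callbacks_model {
    int (*subbuf_start)(struct buf_model *buf, void *subbuf,
                        void *prev_subbuf, size_t prev_padding);
};

/* Handler written for a hypothetical older, three-argument form. */
int old_subbuf_start(struct buf_model *buf, void *subbuf, size_t prev_padding)
{
    (void)subbuf;
    (void)prev_padding;
    assert(buf != NULL);
    buf->subbufs_started++;
    return 1;                       /* non-zero: go ahead and use the sub-buffer */
}

/*
 * Adapter with the signature the table expects.  Casting old_subbuf_start
 * to the four-argument type would silence the warning, but calling a
 * function through an incompatible pointer type is undefined behaviour in C.
 */
int new_subbuf_start(struct buf_model *buf, void *subbuf,
                     void *prev_subbuf, size_t prev_padding)
{
    (void)prev_subbuf;              /* not needed by the old logic */
    return old_subbuf_start(buf, subbuf, prev_padding);
}

/* Initialising with the adapter compiles cleanly, so -Werror can stay on. */
struct callbacks_model callbacks = { .subbuf_start = new_subbuf_start };
```

So a cast is probably the wrong tool here: matching the handler to the
prototype declared by the header in use keeps the warning meaningful.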



* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (8 preceding siblings ...)
  2006-03-02  5:48 ` guanglei at cn dot ibm dot com
@ 2006-03-02  5:53 ` zanussi at us dot ibm dot com
  2006-03-02  5:58 ` guanglei at cn dot ibm dot com
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: zanussi at us dot ibm dot com @ 2006-03-02  5:53 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From zanussi at us dot ibm dot com  2006-03-02 05:53 -------
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > If you are seen problem even when not using SystemTap the this is probably
> > > something outside of SystemTap.  I suggest following this up on the
linux-kernel
> > > and linuxppc64-dev mailing list to see if the problems is located in the
kernel.
> > > 
> > > We should mark this bug as rejected until its proven that it is a SystemTap
> > problem.
> > 
> > The error: end_request: I/O error, dev sda, sector 17445 ...
> > happens even without running systemtap. It occurs after I copy something
> > into that partition. But I am not sure whether it is what causes the kernel
> > panic when running systemtap.
> > 
> > The error:
> > Unable to handle kernel paging request for data at address
> > happens when running stap with the -b option.
> > But I agree with Jose that it may not be a systemtap bug, because systemtap
> > works quite well on the Red Hat shipped kernels (2.6.9-30.EL, 2.6.9-27.EL).
> > 
> > It should not be a hardware failure, because I tried it on different machines,
> > and even after reformatting the partition. All of them show the same error.
> > 
> > The 2.6.15 kernel has some changes to the power arch (ppc64 moved to the
> > powerpc directory), and its relayfs differs a lot from the RH shipped kernel.
> > I tried not compiling relayfs into 2.6.15* so that systemtap would compile it,
> > but that failed: the relayfs shipped with systemtap can't be compiled. Some
> > function signatures have changed, and if I have time I'll try to replace relayfs.
> > 
> > 
> > 
> > 
> 
> To get systemtap to use the relayfs in the 2.6.15 kernel, try putting #define
> RELAYFS_VERSION_GE_4 at the top of src/runtime/transport/relayfs.h.
> 
> Tom

I don't know if this is or isn't the cause of the problem, since I'm not seeing
it on my x86 test machine, but I do see that the wrong relayfs_fs.h header file
(the one in runtime/relayfs/linux/ rather than the one in the installed kernel
sources) is being used to generate the probe module, when running a 2.6.15
kernel without the RELAYFS_VERSION_GE_4 define in relayfs.h.

Can you go ahead and try adding that define and see if it helps? i.e. add
#define RELAYFS_VERSION_GE_4 to src/runtime/transport/relayfs.h and then do a
'make install' to get it installed.  Also make sure you have relayfs configured
into your kernel.

If that's the problem, then this bug could probably be closed and would be fixed
by 2406, which deals with autodetecting the proper relayfs version, including
this one.

(In reply to comment #8)
> > I don't know if this is or isn't the cause of the problem, since I'm not seeing
> > it on my x86 test machine, but I do see that the wrong relayfs_fs.h header file
> > (the one in runtime/relayfs/linux/ rather than the one in the installed kernel
> > sources) is being used to generate the probe module, when running a 2.6.15
> > kernel without the RELAYFS_VERSION_GE_4 define in relayfs.h.
> > 
> > Can you go ahead and try adding that define and see if it helps? i.e. add
> > #define RELAYFS_VERSION_GE_4 to src/runtime/transport/relayfs.h and then do a
> > 'make install' to get it installed.  Also make sure you have relayfs configured
> > into your kernel.
> > 
> > If that's the problem, then this bug could probably be closed and would be fixed
> > by 2406, which deals with autodetecting the proper relayfs version, including
> > this one.
> 
> I tried, and it worked. Thanks. It seems not to crash any more.
> But there are some errors (in fact, warnings) when stap compiles the module; I
> bypassed them by deleting the -Werror in buildrun.cxx:
> 
> Running grep " [tT] " /proc/kallsyms | sort -k 1,8 -s -o
> /tmp/stap2iLdUc/symbols.sorted
> Pass 3: translated to C into "/tmp/stap2iLdUc/stap_6318.c" in
> 280usr/1000sys/1294real ms.
> Running make -C "/lib/modules/2.6.9-30.EL/build" M="/tmp/stap2iLdUc" modules V=1
> make: Entering directory `/usr/src/kernels/2.6.9-30.EL-ppc64'
> mkdir -p /tmp/stap2iLdUc/.tmp_versions
> make -f scripts/Makefile.build obj=/tmp/stap2iLdUc
>   gcc -m64 -Wp,-MD,/tmp/stap2iLdUc/.stap_6318.o.d -nostdinc -iwithprefix include
> -D__KERNEL__ -Iinclude  -Wall -Wstrict-prototypes -Wno-trigraphs
> -fno-strict-aliasing -fno-common -Os -g -Wdeclaration-after-statement
> -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc               
>     -mtune=power4 -fno-unit-at-a-time -Wno-unused -Werror -I
> "/usr/local/share/systemtap/runtime" -I
> "/usr/local/share/systemtap/runtime/relayfs"   -DMODULE
> -DKBUILD_BASENAME=stap_6318 -DKBUILD_MODNAME=stap_6318 -c -o
> /tmp/stap2iLdUc/.tmp_stap_6318.o /tmp/stap2iLdUc/stap_6318.c
> In file included from /usr/local/share/systemtap/runtime/transport/transport.c:20,
>                  from /usr/local/share/systemtap/runtime/io.c:14,
>                  from /usr/local/share/systemtap/runtime/print.c:16,
>                  from /usr/local/share/systemtap/runtime/runtime.h:61,
>                  from /tmp/stap2iLdUc/stap_6318.c:30:
> /usr/local/share/systemtap/runtime/transport/relayfs.c: In function
> `_stp_subbuf_start':
> /usr/local/share/systemtap/runtime/transport/relayfs.c:33: warning: implicit
> declaration of function `relay_buf_full'
> /usr/local/share/systemtap/runtime/transport/relayfs.c:39: warning: implicit
> declaration of function `subbuf_start_reserve'
> /usr/local/share/systemtap/runtime/transport/relayfs.c: At top level:
> /usr/local/share/systemtap/runtime/transport/relayfs.c:77: warning:
> initialization from incompatible pointer type
> /usr/local/share/systemtap/runtime/transport/relayfs.c: In function
> `_stp_relayfs_open':
> /usr/local/share/systemtap/runtime/transport/relayfs.c:129: warning: passing arg
> 5 of `relay_open' makes integer from pointer without a cast
> /usr/local/share/systemtap/runtime/transport/relayfs.c:129: error: too few
> arguments to function `relay_open'
> In file included from /usr/local/share/systemtap/runtime/transport/transport.c:45,
>                  from /usr/local/share/systemtap/runtime/io.c:14,
>                  from /usr/local/share/systemtap/runtime/print.c:16,
>                  from /usr/local/share/systemtap/runtime/runtime.h:61,
>                  from /tmp/stap2iLdUc/stap_6318.c:30:
> /usr/local/share/systemtap/runtime/transport/procfs.c: In function
> `_stp_proc_read':
> /usr/local/share/systemtap/runtime/transport/procfs.c:35: error: incompatible
> types in assignment
> /usr/local/share/systemtap/runtime/transport/procfs.c:36: error: incompatible
> types in assignment
> In file included from /usr/local/share/systemtap/runtime/io.c:14,
>                  from /usr/local/share/systemtap/runtime/print.c:16,
>                  from /usr/local/share/systemtap/runtime/runtime.h:61,
>                  from /tmp/stap2iLdUc/stap_6318.c:30:
> /usr/local/share/systemtap/runtime/transport/transport.c: In function
> `_stp_handle_buf_info':
> /usr/local/share/systemtap/runtime/transport/transport.c:86: error: incompatible
> types in assignment
> /usr/local/share/systemtap/runtime/transport/transport.c:87: error: incompatible
> types in assignment
> make[1]: *** [/tmp/stap2iLdUc/stap_6318.o] Error 1
> make: *** [_module_/tmp/stap2iLdUc] Error 2
> make: Leaving directory `/usr/src/kernels/2.6.9-30.EL-ppc64'
> Pass 4: compiled C into "stap_6318.ko" in 2820usr/220sys/2893real ms.
> Pass 4: compilation failed.  Try again with more '-v' (verbose) options.
> Running rm -rf /tmp/stap2iLdUc

Hmm, where did you put the #define?

I get these warnings if I put it at the bottom of relayfs.h, but putting it at
the top, just above 

#ifdef RELAYFS_VERSION_GE_4
#include <linux/relayfs_fs.h>
...

it works fine for me...
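
A minimal sketch of why the position of the define matters (illustrative only; `DEMO_NEW_API` and both function names are made up and are not part of relayfs.h): the preprocessor evaluates each `#ifdef` at the point where it appears, so a macro defined further down the file has no effect on an earlier conditional.

```c
/* Hypothetical names throughout; nothing here is taken from relayfs.h. */

/* This conditional is evaluated BEFORE the macro is defined,
 * so the preprocessor keeps the #else branch. */
#ifdef DEMO_NEW_API
int new_api_seen_before_define(void) { return 1; }
#else
int new_api_seen_before_define(void) { return 0; }
#endif

#define DEMO_NEW_API

/* The same conditional AFTER the define keeps the #ifdef branch. */
#ifdef DEMO_NEW_API
int new_api_seen_after_define(void) { return 1; }
#else
int new_api_seen_after_define(void) { return 0; }
#endif
```

By the same logic, a define placed at the bottom of a header would come too late to change which relayfs_fs.h an `#ifdef` near the top selects.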


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (9 preceding siblings ...)
  2006-03-02  5:53 ` zanussi at us dot ibm dot com
@ 2006-03-02  5:58 ` guanglei at cn dot ibm dot com
  2006-03-02  6:13 ` zanussi at us dot ibm dot com
  2006-03-02  6:16 ` guanglei at cn dot ibm dot com
  12 siblings, 0 replies; 15+ messages in thread
From: guanglei at cn dot ibm dot com @ 2006-03-02  5:58 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From guanglei at cn dot ibm dot com  2006-03-02 05:58 -------
> Hmm, where did you put the #define?
> 
> I get these warnings if I put it at the bottom of relayfs.h, but putting it at
> the top, just above 
> 
> #ifdef RELAYFS_VERSION_GE_4
> #include <linux/relayfs_fs.h>
> ...
> 
> it works fine for me...

This is the file I used:

#ifndef _TRANSPORT_RELAYFS_H_ /* -*- linux-c -*- */
#define _TRANSPORT_RELAYFS_H_
#define RELAYFS_VERSION_GE_4 

/** @file relayfs.h
 * @brief Header file for relayfs transport
 */

#ifdef RELAYFS_VERSION_GE_4
#include <linux/relayfs_fs.h>
#else
#include "../relayfs/linux/relayfs_fs.h"
#endif /* RELAYFS_VERSION_GE_4 */

struct rchan *_stp_relayfs_open(unsigned n_subbufs,
                                unsigned subbuf_size,
                                int pid,
                                struct dentry **outdir);
void _stp_relayfs_close(struct rchan *chan, struct dentry *dir);

#endif /* _TRANSPORT_RELAYFS_H_ */

So is it due to the gcc version? My gcc is:
gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)
I checked the code, and it is just a warning about the assignment:
int *ptr <--- static int *ptr

But I hit another problem when I used my testcase to stress-test systemtap:

-bash-3.00# ./test.sh -f  lgl.cfg  -I tapsets/tapsets1/           
The tapsets is tapsets/tapsets1/
don't probe app : dbench
TIMES : 1
TIMES : 2
probe app : dbench
TIMES : 1
TIMES : 2
error opening file stpd_cpu0.
ERROR: couldn't unlink percpu file stpd_cpu0: errcode = No such file or directory

Do you have any idea what causes these errors? I have never seen them before.
I raised MAXDSKIPPED when running my testcases.


* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (10 preceding siblings ...)
  2006-03-02  5:58 ` guanglei at cn dot ibm dot com
@ 2006-03-02  6:13 ` zanussi at us dot ibm dot com
  2006-03-02  6:16 ` guanglei at cn dot ibm dot com
  12 siblings, 0 replies; 15+ messages in thread
From: zanussi at us dot ibm dot com @ 2006-03-02  6:13 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From zanussi at us dot ibm dot com  2006-03-02 06:13 -------
(In reply to comment #11)
> > Hmm, where did you put the #define?
> > 
> > I get these warnings if I put it at the bottom of relayfs.h, but putting it at
> > the top, just above 
> > 
> > #ifdef RELAYFS_VERSION_GE_4
> > #include <linux/relayfs_fs.h>
> > ...
> > 
> > it works fine for me...
> 
> the file I used:
> 
> #ifndef _TRANSPORT_RELAYFS_H_ /* -*- linux-c -*- */
> #define _TRANSPORT_RELAYFS_H_
> #define RELAYFS_VERSION_GE_4 
> 
> /** @file relayfs.h
>  * @brief Header file for relayfs transport
>  */
> 
> #ifdef RELAYFS_VERSION_GE_4
> #include <linux/relayfs_fs.h>
> #else
> #include "../relayfs/linux/relayfs_fs.h"
> #endif /* RELAYFS_VERSION_GE_4 */
> 
> struct rchan *_stp_relayfs_open(unsigned n_subbufs,
>                                 unsigned subbuf_size,
>                                 int pid,
>                                 struct dentry **outdir);
> void _stp_relayfs_close(struct rchan *chan, struct dentry *dir);
> 
> #endif /* _TRANSPORT_RELAYFS_H_ */
> 
> So is it due to the gcc version? My gcc is:
> gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)
> I checked the code, and it is just a warning about the assignment:
> int *ptr <--- static int *ptr
> 

I'm using gcc 4.1.0

> But I hit another problem when I used my testcase to stress-test systemtap:
> 
> -bash-3.00# ./test.sh -f  lgl.cfg  -I tapsets/tapsets1/           
> The tapsets is tapsets/tapsets1/
> don't probe app : dbench
> TIMES : 1
> TIMES : 2
> probe app : dbench
> TIMES : 1
> TIMES : 2
> error opening file stpd_cpu0.
> ERROR: couldn't unlink percpu file stpd_cpu0: errcode = No such file or directory
> 
> Do you have any idea what causes these errors? I have never seen them before.
> I raised MAXDSKIPPED when running my testcases.

No, I haven't seen that before either.



* [Bug kprobes/2387] system crash on ppc64/2.6.15.4
  2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
                   ` (11 preceding siblings ...)
  2006-03-02  6:13 ` zanussi at us dot ibm dot com
@ 2006-03-02  6:16 ` guanglei at cn dot ibm dot com
  12 siblings, 0 replies; 15+ messages in thread
From: guanglei at cn dot ibm dot com @ 2006-03-02  6:16 UTC (permalink / raw)
  To: systemtap


------- Additional Comments From guanglei at cn dot ibm dot com  2006-03-02 06:16 -------
> -bash-3.00# ./test.sh -f  lgl.cfg  -I tapsets/tapsets1/           
> The tapsets is tapsets/tapsets1/
> don't probe app : dbench
> TIMES : 1
> TIMES : 2
> probe app : dbench
> TIMES : 1
> TIMES : 2
> error opening file stpd_cpu0.
> ERROR: couldn't unlink percpu file stpd_cpu0: errcode = No such file or directory
> 
> Do you have any idea what causes these errors? I have never seen them before.
> I raised MAXDSKIPPED when running my testcases.
It may be due to my testcase. I run stap in the background, and when the
benchmark tools finish running, I just do:
kill -s SIGINT -- stappid stpdpid
I should terminate stap and stpd in the right order. I think this is the cause.

I think this bug can be closed.

*** This bug has been marked as a duplicate of 2406 ***

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE



end of thread

Thread overview: 15+ messages
2006-02-23  9:40 [Bug kprobes/2387] New: system crash on ppc64/2.6.15.4 guanglei at cn dot ibm dot com
2006-02-23 12:34 ` [Bug kprobes/2387] " fche at redhat dot com
2006-02-23 15:23 ` guanglei at cn dot ibm dot com
2006-03-01 14:41 ` guanglei at cn dot ibm dot com
2006-03-01 15:38   ` Frank Ch. Eigler
2006-03-01 16:25 ` jrs at us dot ibm dot com
2006-03-01 16:36 ` guanglei at cn dot ibm dot com
2006-03-01 16:56 ` zanussi at us dot ibm dot com
2006-03-02  5:08 ` zanussi at us dot ibm dot com
2006-03-02  5:36 ` guanglei at cn dot ibm dot com
2006-03-02  5:48 ` guanglei at cn dot ibm dot com
2006-03-02  5:53 ` zanussi at us dot ibm dot com
2006-03-02  5:58 ` guanglei at cn dot ibm dot com
2006-03-02  6:13 ` zanussi at us dot ibm dot com
2006-03-02  6:16 ` guanglei at cn dot ibm dot com
