From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 31112 invoked by alias); 2 Mar 2006 05:53:05 -0000
Received: (qmail 31094 invoked by uid 48); 2 Mar 2006 05:53:02 -0000
Date: Thu, 02 Mar 2006 05:53:00 -0000
Message-ID: <20060302055302.31093.qmail@sourceware.org>
From: "zanussi at us dot ibm dot com"
To: systemtap@sources.redhat.com
In-Reply-To: <20060223094047.2387.guanglei@cn.ibm.com>
References: <20060223094047.2387.guanglei@cn.ibm.com>
Reply-To: sourceware-bugzilla@sourceware.org
Subject: [Bug kprobes/2387] system crash on ppc64/2.6.15.4
X-Bugzilla-Reason: AssignedTo
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2006-q1/txt/msg00678.txt.bz2

------- Additional Comments From zanussi at us dot ibm dot com  2006-03-02 05:53 -------
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > If you are seeing problems even when not using SystemTap, then this is probably
> > > something outside of SystemTap. I suggest following this up on the linux-kernel
> > > and linuxppc64-dev mailing lists to see if the problem is located in the kernel.
> > >
> > > We should mark this bug as rejected until it's proven that it is a SystemTap
> > > problem.
> >
> > The error: end_request: I/O error, dev sda, sector 17445 ...
> > will happen without running systemtap. It occurs after I copy something
> > into that partition. But I am not sure if it is the cause of the kernel
> > panic when running systemtap.
> >
> > The error:
> > Unable to handle kernel paging request for data at address
> > will happen when running stap with the -b option.
> > But I agree with Jose that it may not be a systemtap bug, because systemtap
> > works quite well on the Red Hat shipped kernels (2.6.9-30.EL, 2.6.9-27.EL).
> >
> > It should not be a hardware failure because I tried it on different machines,
> > and even after reformatting the partition. All of them show the same error.
> >
> > The 2.6.15 kernel has some changes to the power arch (ppc64 moved to the powerpc
> > directory), and its relayfs differs a lot from the RH shipped kernel. I tried not
> > compiling relayfs in 2.6.15* so that systemtap would compile it, but that failed:
> > the relayfs shipped with systemtap can't be compiled. Some function signatures
> > have changed, and if I have time I'll try to replace relayfs.
> >
>
> To get systemtap to use the relayfs in the 2.6.15 kernel, try putting #define
> RELAYFS_VERSION_GE_4 at the top of src/runtime/transport/relayfs.h.
>
> Tom

I don't know if this is or isn't the cause of the problem, since I'm not seeing
it on my x86 test machine, but I do see that the wrong relayfs_fs.h header file
(the one in runtime/relayfs/linux/ rather than the one in the installed kernel
sources) is being used to generate the probe module, when running a 2.6.15
kernel without the RELAYFS_VERSION_GE_4 define in relayfs.h.

Can you go ahead and try adding that define and see if it helps? i.e. add
#define RELAYFS_VERSION_GE_4 to src/runtime/transport/relayfs.h and then do a
'make install' to get it installed. Also make sure you have relayfs configured
into your kernel.

If that's the problem, then this bug could probably be closed and would be fixed
by 2406, which deals with autodetecting the proper relayfs version, including
this one.

(In reply to comment #8)
> > I don't know if this is or isn't the cause of the problem, since I'm not seeing
> > it on my x86 test machine, but I do see that the wrong relayfs_fs.h header file
> > (the one in runtime/relayfs/linux/ rather than the one in the installed kernel
> > sources) is being used to generate the probe module, when running a 2.6.15
> > kernel without the RELAYFS_VERSION_GE_4 define in relayfs.h.
> >
> > Can you go ahead and try adding that define and see if it helps? i.e. add
> > #define RELAYFS_VERSION_GE_4 to src/runtime/transport/relayfs.h and then do a
> > 'make install' to get it installed. Also make sure you have relayfs configured
> > into your kernel.
> >
> > If that's the problem, then this bug could probably be closed and would be fixed
> > by 2406, which deals with autodetecting the proper relayfs version, including
> > this one.
>
> I tried, and it worked. Thanks. It seems not to crash any more.
> But there are some errors (in fact, warnings) when stap is compiling the module; I
> bypassed them by deleting the -Werror in buildrun.cxx:
>
> Running grep " [tT] " /proc/kallsyms | sort -k 1,8 -s -o
> /tmp/stap2iLdUc/symbols.sorted
> Pass 3: translated to C into "/tmp/stap2iLdUc/stap_6318.c" in
> 280usr/1000sys/1294real ms.
> Running make -C "/lib/modules/2.6.9-30.EL/build" M="/tmp/stap2iLdUc" modules V=1
> make: Entering directory `/usr/src/kernels/2.6.9-30.EL-ppc64'
> mkdir -p /tmp/stap2iLdUc/.tmp_versions
> make -f scripts/Makefile.build obj=/tmp/stap2iLdUc
>   gcc -m64 -Wp,-MD,/tmp/stap2iLdUc/.stap_6318.o.d -nostdinc -iwithprefix include
> -D__KERNEL__ -Iinclude -Wall -Wstrict-prototypes -Wno-trigraphs
> -fno-strict-aliasing -fno-common -Os -g -Wdeclaration-after-statement
> -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc
> -mtune=power4 -fno-unit-at-a-time -Wno-unused -Werror -I
> "/usr/local/share/systemtap/runtime" -I
> "/usr/local/share/systemtap/runtime/relayfs" -DMODULE
> -DKBUILD_BASENAME=stap_6318 -DKBUILD_MODNAME=stap_6318 -c -o
> /tmp/stap2iLdUc/.tmp_stap_6318.o /tmp/stap2iLdUc/stap_6318.c
> In file included from /usr/local/share/systemtap/runtime/transport/transport.c:20,
>                  from /usr/local/share/systemtap/runtime/io.c:14,
>                  from /usr/local/share/systemtap/runtime/print.c:16,
>                  from /usr/local/share/systemtap/runtime/runtime.h:61,
>                  from /tmp/stap2iLdUc/stap_6318.c:30:
> /usr/local/share/systemtap/runtime/transport/relayfs.c: In function
> `_stp_subbuf_start':
> /usr/local/share/systemtap/runtime/transport/relayfs.c:33: warning: implicit
> declaration of function `relay_buf_full'
> /usr/local/share/systemtap/runtime/transport/relayfs.c:39: warning: implicit
> declaration of function `subbuf_start_reserve'
> /usr/local/share/systemtap/runtime/transport/relayfs.c: At top level:
> /usr/local/share/systemtap/runtime/transport/relayfs.c:77: warning:
> initialization from incompatible pointer type
> /usr/local/share/systemtap/runtime/transport/relayfs.c: In function
> `_stp_relayfs_open':
> /usr/local/share/systemtap/runtime/transport/relayfs.c:129: warning: passing arg
> 5 of `relay_open' makes integer from pointer without a cast
> /usr/local/share/systemtap/runtime/transport/relayfs.c:129: error: too few
> arguments to function `relay_open'
> In file included from /usr/local/share/systemtap/runtime/transport/transport.c:45,
>                  from /usr/local/share/systemtap/runtime/io.c:14,
>                  from /usr/local/share/systemtap/runtime/print.c:16,
>                  from /usr/local/share/systemtap/runtime/runtime.h:61,
>                  from /tmp/stap2iLdUc/stap_6318.c:30:
> /usr/local/share/systemtap/runtime/transport/procfs.c: In function `_stp_proc_read':
> /usr/local/share/systemtap/runtime/transport/procfs.c:35: error: incompatible
> types in assignment
> /usr/local/share/systemtap/runtime/transport/procfs.c:36: error: incompatible
> types in assignment
> In file included from /usr/local/share/systemtap/runtime/io.c:14,
>                  from /usr/local/share/systemtap/runtime/print.c:16,
>                  from /usr/local/share/systemtap/runtime/runtime.h:61,
>                  from /tmp/stap2iLdUc/stap_6318.c:30:
> /usr/local/share/systemtap/runtime/transport/transport.c: In function
> `_stp_handle_buf_info':
> /usr/local/share/systemtap/runtime/transport/transport.c:86: error: incompatible
> types in assignment
> /usr/local/share/systemtap/runtime/transport/transport.c:87: error: incompatible
> types in assignment
> make[1]: *** [/tmp/stap2iLdUc/stap_6318.o] Error 1
> make: *** [_module_/tmp/stap2iLdUc] Error 2
> make: Leaving directory `/usr/src/kernels/2.6.9-30.EL-ppc64'
> Pass 4: compiled C into "stap_6318.ko" in 2820usr/220sys/2893real ms.
> Pass 4: compilation failed.  Try again with more '-v' (verbose) options.
> Running rm -rf /tmp/stap2iLdUc

Hmm, where did you put the #define?  I get these warnings if I put it at the
bottom of relayfs.h, but putting it at the top, just above

#ifdef RELAYFS_VERSION_GE_4
#include ...

it works fine for me...

-- 
http://sourceware.org/bugzilla/show_bug.cgi?id=2387
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
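
The placement question raised in the last reply can be illustrated with a
small, self-contained C sketch. It is not the real relayfs.h: the macro
SELECTED_RELAYFS_HEADER and the function selected_relayfs_header() are
hypothetical stand-ins for the elided #include lines. It only shows that the
preprocessor reads top to bottom, so a #define has to appear before the #ifdef
that tests it.

```c
/* Sketch only: a stand-in for the top of relayfs.h, not the real file. */

/* The define has to come before the #ifdef that tests it. */
#define RELAYFS_VERSION_GE_4

#ifdef RELAYFS_VERSION_GE_4
/* Hypothetical stand-in for including the kernel's own relayfs header. */
#define SELECTED_RELAYFS_HEADER "kernel"
#else
/* Hypothetical stand-in for including the copy bundled with systemtap. */
#define SELECTED_RELAYFS_HEADER "bundled"
#endif

/* Reports which branch the preprocessor took. */
const char *selected_relayfs_header(void)
{
    return SELECTED_RELAYFS_HEADER;
}
```

If the #define were moved below the #ifdef block, the #else branch would be
taken instead, which would be consistent with the wrong relayfs_fs.h being
picked up when the define sits at the bottom of the file.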