public inbox for systemtap@sourceware.org
* stap is getting a segmentation fault
@ 2007-09-27  0:42 David Wilder
       [not found] ` <y0m641woqj7.fsf@ton.toronto.redhat.com>
  0 siblings, 1 reply; 7+ messages in thread
From: David Wilder @ 2007-09-27  0:42 UTC (permalink / raw)
  To: SystemTAP

Hi, I am seeing a problem with systemtap on RHEL 5.1 beta on s390.
Before I dig into it, any ideas?

The problem is that stap gets a SIGSEGV whenever I attempt to place a
probe in a module.

Here is a stack trace.

(gdb) run -vv dw.stp
Starting program: /usr/src/redhat/BUILD/systemtap-0.5.14/test/stap -vv dw.stp
[Thread debugging using libthread_db enabled]
[New Thread 2199023370768 (LWP 1725)]
SystemTap translator/driver (version 0.5.14/0.128 built 2007-09-25)
Copyright (C) 2005-2007 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
Created temporary directory "/tmp/stapXNCAti"
Searched '/usr/share/systemtap/tapset/s390x/*.stp', found 1
Searched '/usr/share/systemtap/tapset/*.stp', found 35
Searched '/usr/share/systemtap/tapset/LKET/*.stp', found 19
Pass 1: parsed user script and 55 library script(s) in 620usr/10sys/645real ms.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2199023370768 (LWP 1725)]
0x0000004a05120a3a in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string () from /usr/lib64/libstdc++.so.6
(gdb) bt
#0  0x0000004a05120a3a in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string () from /usr/lib64/libstdc++.so.6
#1  0x000000008007d806 in query_cu (cudie=<value optimized out>,
     arg=<value optimized out>) at tapsets.cxx:457
#2  0x000000008007e948 in dwarf_query::handle_query_module (this=0x3ffffa05648)
     at tapsets.cxx:819
#3  0x000000008007048a in query_module (mod=<value optimized out>,
     name=0x80c1a480 "qdio", arg=<value optimized out>) at tapsets.cxx:3003
#4  0x00000000800765d6 in dwarf_builder::build (this=0x80bbd860,
     sess=@0x3ffffa06a38, base=0x800dc1f0, location=0x800d96f0,
     parameters=<value optimized out>, finished_results=<value optimized out>)
     at tapsets.cxx:778
#5  0x0000000080031c40 in match_node::find_and_build (this=0x80bbdf80,
     s=@0x3ffffa06a38, p=0x800dc1f0, loc=0x800d96f0, pos=2159798832,
     results=@0x3ffffa06798) at elaborate.cxx:318
#6  0x0000000080031a7e in match_node::find_and_build (this=0x80bbdf00,
     s=@0x3ffffa06a38, p=0x800dc1f0, loc=0x800d96f0, pos=2,
     results=@0x3ffffa06798) at elaborate.cxx:377
#7  0x0000000080031a7e in match_node::find_and_build (this=0x800d91f0,
     s=@0x3ffffa06a38, p=0x800dc1f0, loc=0x800d96f0, pos=1,
     results=@0x3ffffa06798) at elaborate.cxx:377
#8  0x0000000080035c38 in derive_probes (s=@0x3ffffa06a38, p=0x800dc1f0,
     dps=@0x3ffffa06798, optional=<value optimized out>) at elaborate.cxx:567
#9  0x0000000080036424 in semantic_pass_symbols (s=@0x3ffffa06a38)
     at elaborate.cxx:958
#10 0x000000008003a6f2 in semantic_pass (s=@0x3ffffa06a38) at elaborate.cxx:999
#11 0x000000008000a96e in main (argc=<value optimized out>, argv=0x3ffffa0734e)
     at main.cxx:667
(gdb)


* Re: stap is getting a segmentation fault on RHEL 5.1 beta
       [not found] ` <y0m641woqj7.fsf@ton.toronto.redhat.com>
@ 2007-10-02 17:16   ` David Wilder
  2007-10-02 17:51     ` Frank Ch. Eigler
  0 siblings, 1 reply; 7+ messages in thread
From: David Wilder @ 2007-10-02 17:16 UTC (permalink / raw)
  To: Frank Ch. Eigler, SystemTAP

Frank Ch. Eigler wrote:
> Try building systemtap with CXXFLAGS=-g only and see what gdb says then.
> 
> - FChE

Hi Frank-
Sorry for the delay.  To recap, my problem is that stap gets a
segmentation fault on s390, RHEL 5.1 beta.  This happens any time I
attempt to probe modules.

Adding the -g flag did not add much to the backtrace, so I did a
little brute-force troubleshooting.  What I found is that
dwarf_diename() is returning a bad pointer, which causes the
segmentation fault.

I added a couple of printfs in tapsets.cxx:

  void focus_on_cu(Dwarf_Die * c)
   {
     assert(c);
     assert(module);

     cu = c;
     // Debug output: dump the DIE fields and the name pointer from elfutils
     printf("c = %p *addr=%p *cu = %p *abbrev = %p \n",
            (void *) c, (void *) c->addr, (void *) c->cu, (void *) c->abbrev);
     printf("string = %p\n", (const void *) dwarf_diename(c));
     cu_name = default_name(dwarf_diename(c), "CU");

     // Reset existing pointers and names
     function_name.clear();
     function = NULL;
   }

Test run:
c = 0x3ffff93e2a8 *addr=0x20007897333 *cu = 0x80dd4230 *abbrev = (nil)
string = 0x2eda  << invalid pointer.
Segmentation fault

Any ideas?  I am going to try building new debug info files and see if 
it helps.
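
To illustrate why the bad pointer turns into a crash inside the
std::string constructor (frame #0 of the backtrace), here is a minimal
sketch.  It assumes default_name() is a simple "NULL means use the
default" helper; that is an assumption for illustration, and the real
helper in tapsets.cxx may differ.  check_examples() is a hypothetical
name used only here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the helper called in focus_on_cu();
// the real default_name() in tapsets.cxx may differ.
static std::string default_name(const char *in, const char *dflt)
{
  // A NULL name (for example a DIE without a name attribute) falls
  // back to the default string.
  if (in)
    return in;   // std::string(const char *) reads bytes until '\0'
  return dflt;
}

// Small self-check of the two well-defined cases.
static void check_examples()
{
  assert(default_name(nullptr, "CU") == "CU");
  assert(default_name("qdio.c", "CU") == "qdio.c");
}
```

A value such as 0x2eda is non-NULL, so a check like this lets it
through, and the std::string constructor then reads from an unmapped
address.  Guarding against NULL in stap would therefore not help; the
pointer returned by dwarf_diename() has to be valid in the first place.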


* Re: stap is getting a segmentation fault on RHEL 5.1 beta
  2007-10-02 17:16   ` stap is getting a segmentation fault on RHEL 5.1 beta David Wilder
@ 2007-10-02 17:51     ` Frank Ch. Eigler
  2007-10-03 18:03       ` David Wilder
  0 siblings, 1 reply; 7+ messages in thread
From: Frank Ch. Eigler @ 2007-10-02 17:51 UTC (permalink / raw)
  To: David Wilder; +Cc: SystemTAP


dwilder wrote:

> [...] To recap, my problem is that stap gets a segmentation
> fault on s390, RHEL 5.1 beta.  This happens any time I attempt to
> probe modules.

OK.

> Adding the -g flag did not add much more to the backtrace.  So I did
> a little brute force troubleshooting.  What I found is that
> dwarf_diename() is returning a bad pointer causing the segmentation
> fault.

It could be an elfutils bug.

> [...]  Any ideas?  I am going to try building new debug info files
> and see if it helps.

You could also try elfutils 0.129.

- FChE


* Re: stap is getting a segmentation fault on RHEL 5.1 beta
  2007-10-02 17:51     ` Frank Ch. Eigler
@ 2007-10-03 18:03       ` David Wilder
  2007-10-03 21:06         ` Roland McGrath
  0 siblings, 1 reply; 7+ messages in thread
From: David Wilder @ 2007-10-03 18:03 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: SystemTAP

Frank Ch. Eigler wrote:
> dwilder wrote:
> 
>> [...] To recap, my problem is that stap gets a segmentation
>> fault on s390, RHEL 5.1 beta.  This happens any time I attempt to
>> probe modules.
> 
> OK.
> 
>> Adding the -g flag did not add much more to the backtrace.  So I did
>> a little brute force troubleshooting.  What I found is that
>> dwarf_diename() is returning a bad pointer causing the segmentation
>> fault.
> 
> It could be an elfutils bug.
> 
>> [...]  Any ideas?  I am going to try building new debug info files
>> and see if it helps.
> 
> You could also try elfutils 0.129.
> 
> - FChE

The problem only occurs when using the debuginfo files built from the
kernel src.rpm.  Here is what I did.  I installed and built all the
kernel rpms from kernel-2.6.18-48.el5.src.rpm (rpmbuild -ba), then
installed the debug kernel and the associated debuginfo and
kernel-devel rpms.  (Note: I chose the debug kernel, but the problem
also happens on the non-debug kernel.)

I ran stap probing a function in the qdio module, and stap faulted!
I then replaced the qdio.ko.debug file in /usr/lib/debug...  with the
qdio.ko file left over in the rpm build directory
(/usr/src/redhat/BUILD/kernel...).

Now when I run my stap script it works fine.  It looks like the rpm
build is doing something to the debuginfo files.  The two files differ
somewhat in size, though neither is as small as the running module in
/lib/modules.

ls -l qdio*
-rw-r--r-- 1 root root 1373490 Oct  3 10:27 qdio.ko.debug
-rwxr--r-- 1 root root 1199376 Oct  2 20:56 qdio.ko.debug.orig

file qdio*
qdio.ko.debug:      ELF 64-bit MSB relocatable, IBM S/390, version 1 (SYSV), not stripped
qdio.ko.debug.orig: ELF 64-bit MSB relocatable, IBM S/390, version 1 (SYSV), not stripped

Any idea what rpmbuild could be doing to the debuginfo files?
I looked through the kernel.spec file and found no clues, but I am 
unfamiliar with rpm spec files, so I may have missed something.

I have not tried upgrading elfutils yet as you suggested.




* Re: stap is getting a segmentation fault on RHEL 5.1 beta
  2007-10-03 18:03       ` David Wilder
@ 2007-10-03 21:06         ` Roland McGrath
  2007-10-05 17:37           ` David Wilder
  0 siblings, 1 reply; 7+ messages in thread
From: Roland McGrath @ 2007-10-03 21:06 UTC (permalink / raw)
  To: David Wilder; +Cc: Frank Ch. Eigler, SystemTAP

> Any idea what rpmbuild could be doing to the debuginfo files?
> I looked through the kernel.spec file and found no clues, but I am 
> unfamiliar with rpm spec files, so I may have missed something.

Some implicit magic happens after what's written in the spec file.
Importantly, /usr/lib/rpm/find-debuginfo.sh runs to create the separate
debuginfo files using eu-strip -f.


* Re: stap is getting a segmentation fault on RHEL 5.1 beta
  2007-10-03 21:06         ` Roland McGrath
@ 2007-10-05 17:37           ` David Wilder
  2007-10-05 20:30             ` Roland McGrath
  0 siblings, 1 reply; 7+ messages in thread
From: David Wilder @ 2007-10-05 17:37 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Frank Ch. Eigler, SystemTAP

Roland McGrath wrote:
>> Any idea what rpmbuild could be doing to the debuginfo files?
>> I looked through the kernel.spec file and found no clues, but I am 
>> unfamiliar with rpm spec files, so I may have missed something.
> 
> Some implicit magic happens after what's written in the spec file.
> Importantly, /usr/lib/rpm/find-debuginfo.sh runs to create the separate
> debuginfo files using eu-strip -f.

Thanks for the pointer.  When I run "eu-strip --remove-comment" against
the working file, it creates a file that causes stap to fail.

Now I need to figure out whether the debuginfo file is damaged by
eu-strip, or whether it is a bug in elfutils when reading the stripped
file.


* Re: stap is getting a segmentation fault on RHEL 5.1 beta
  2007-10-05 17:37           ` David Wilder
@ 2007-10-05 20:30             ` Roland McGrath
  0 siblings, 0 replies; 7+ messages in thread
From: Roland McGrath @ 2007-10-05 20:30 UTC (permalink / raw)
  To: David Wilder; +Cc: Frank Ch. Eigler, SystemTAP

> Thanks for the pointer.  When I run "eu-strip --remove-comment" against
> the working file, it creates a file that causes stap to fail.
>
> Now I need to figure out whether the debuginfo file is damaged by
> eu-strip, or whether it is a bug in elfutils when reading the stripped
> file.

I can debug this.  In either of those cases, I'll be the one fixing it.
Can you just supply me with the two before and after binaries to examine?
Does another elfutils-based reader also crash looking at the problematic file?
e.g. eu-readelf --debug-dump={info,line}


Thanks,
Roland
