public inbox for systemtap@sourceware.org
* How to debug what I am doing wrong?
@ 2008-07-17 21:02 Theodore Ts'o
  2008-07-17 21:31 ` Masami Hiramatsu
  2008-07-17 22:01 ` Roland McGrath
  0 siblings, 2 replies; 14+ messages in thread
From: Theodore Ts'o @ 2008-07-17 21:02 UTC
  To: systemtap

I'm trying to run this systemtap script, ext4-check-desk.stp:

probe module("ext4dev").function("ext4_check_descriptors")
{
	printf("ext4_check_descriptors: flags %x\n", $sb->s_flags);
}

<tytso@closure> {/usr/projects/systemtap/examples}  
4% stap ext4-check-desk.stp
WARNING: cannot find module ext4dev debuginfo: No such file or directory
semantic error: no match while resolving probe point module("ext4dev").function("ext4_check_descriptors")
semantic error: no probes found
Pass 2: analysis failed.  Try again with more '-v' (verbose) options.
<tytso@closure> {/usr/projects/systemtap/examples}  
5% uname -a
Linux closure 2.6.26-03033-g97438cf #16 SMP Thu Jul 17 01:21:12 EDT 2008 i686 GNU/Linux
<tytso@closure> {/usr/projects/systemtap/examples}  
6% ls /usr/lib/debug/lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko
3428 /usr/lib/debug/lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko
<tytso@closure> {/usr/projects/systemtap/examples}  
7% stap -V
SystemTap translator/driver (version 0.7.1/0.131 git branch master, commit 82737bef)
Copyright (C) 2005-2008 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
<tytso@closure> {/usr/projects/systemtap/examples}  
8% stap -vvv ext4-check-desk.stp
SystemTap translator/driver (version 0.7.1/0.131 git branch master, commit 82737bef)
Copyright (C) 2005-2008 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
Session arch: i686 release: 2.6.26-03033-g97438cf
Created temporary directory "/tmp/stapfX7KWk"
Searched '/usr/local/share/systemtap/tapset/i686/*.stp', found 2
Searched '/usr/local/share/systemtap/tapset/*.stp', found 41
Pass 1: parsed user script and 43 library script(s) in 360usr/10sys/725real ms.
control symbols: kts: 0xc02f6e60 kte: 0xc02f9f02 stext: 0xc01010e8
parsed 'ext4_check_descriptors' -> func 'ext4_check_descriptors'
blacklist regexps:
blfn: ^(atomic_notifier_call_chain|default_do_nmi|__die|die_nmi|do_debug|do_general_protection|do_int3|do_IRQ|do_page_fault|do_sparc64_fault|do_trap|dummy_nmi_callback|flush_icache_range|ia64_bad_break|ia64_do_page_fault|ia64_fault|io_check_error|mem_parity_error|nmi_watchdog_tick|notifier_call_chain|oops_begin|oops_end|program_check_exception|single_step_exception|sync_regs|unhandled_fault|unknown_nmi_error|.*raw_.*lock.*|.*read_.*lock.*|.*write_.*lock.*|.*spin_.*lock.*|.*rwlock_.*lock.*|.*rwsem_.*lock.*|.*mutex_.*lock.*|raw_.*|.*seq_.*lock.*|atomic_.*|atomic64_.*|get_bh|put_bh|.*apic.*|.*APIC.*|.*softirq.*|.*IRQ.*|.*_intr.*|__delay|.*kernel_text.*|get_current|current_.*|.*exception_tables.*|.*setup_rt_frame.*|.*preempt_count.*|preempt_schedule)$
blfn_ret: ^(do_exit|sys_exit|sys_exit_group|__switch_to)$
blfile: ^(kernel/kprobes.c|arch/.*/kernel/kprobes.c)$
focused on module 'ext4dev = [0x9aa300-0x9d9888, bias 0x0] file /lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko ELF machine i?86 (code 3)
WARNING: cannot find module ext4dev debuginfo: No such file or directory
semantic error: no match while resolving probe point module("ext4dev").function("ext4_check_descriptors")
semantic error: no probes found
Pass 2: analyzed script: 0 probe(s), 0 function(s), 0 embed(s), 0 global(s) in 1970usr/170sys/5974real ms.
Pass 2: analysis failed.  Try again with more '-v' (verbose) options.
Running rm -rf /tmp/stapfX7KWk
<tytso@closure> {/usr/projects/systemtap/examples}  
9% grep DEBUG_KERNEL /boot/config-2.6.26-03033-g97438cf 
CONFIG_DEBUG_KERNEL=y


So, what am I doing wrong?  And how am I supposed to figure out what I
should have done to allow systemtap to work correctly?

						- Ted


* Re: How to debug what I am doing wrong?
  2008-07-17 21:02 How to debug what I am doing wrong? Theodore Ts'o
@ 2008-07-17 21:31 ` Masami Hiramatsu
  2008-07-17 22:07   ` Theodore Tso
  2008-07-17 22:01 ` Roland McGrath
  1 sibling, 1 reply; 14+ messages in thread
From: Masami Hiramatsu @ 2008-07-17 21:31 UTC
  To: Theodore Ts'o; +Cc: systemtap

Theodore Ts'o wrote:
> <tytso@closure> {/usr/projects/systemtap/examples}  
> 8% stap -vvv ext4-check-desk.stp
> SystemTap translator/driver (version 0.7.1/0.131 git branch master, commit 82737bef)
> Copyright (C) 2005-2008 Red Hat, Inc. and others
> This is free software; see the source for copying conditions.
> Session arch: i686 release: 2.6.26-03033-g97438cf
> Created temporary directory "/tmp/stapfX7KWk"
> Searched '/usr/local/share/systemtap/tapset/i686/*.stp', found 2
> Searched '/usr/local/share/systemtap/tapset/*.stp', found 41
> Pass 1: parsed user script and 43 library script(s) in 360usr/10sys/725real ms.
> control symbols: kts: 0xc02f6e60 kte: 0xc02f9f02 stext: 0xc01010e8
> parsed 'ext4_check_descriptors' -> func 'ext4_check_descriptors'
> blacklist regexps:
> blfn: ^(atomic_notifier_call_chain|default_do_nmi|__die|die_nmi|do_debug|do_general_protection|do_int3|do_IRQ|do_page_fault|do_sparc64_fault|do_trap|dummy_nmi_callback|flush_icache_range|ia64_bad_break|ia64_do_page_fault|ia64_fault|io_check_error|mem_parity_error|nmi_watchdog_tick|notifier_call_chain|oops_begin|oops_end|program_check_exception|single_step_exception|sync_regs|unhandled_fault|unknown_nmi_error|.*raw_.*lock.*|.*read_.*lock.*|.*write_.*lock.*|.*spin_.*lock.*|.*rwlock_.*lock.*|.*rwsem_.*lock.*|.*mutex_.*lock.*|raw_.*|.*seq_.*lock.*|atomic_.*|atomic64_.*|get_bh|put_bh|.*apic.*|.*APIC.*|.*softirq.*|.*IRQ.*|.*_intr.*|__delay|.*kernel_text.*|get_current|current_.*|.*exception_tables.*|.*setup_rt_frame.*|.*preempt_count.*|preempt_schedule)$
> blfn_ret: ^(do_exit|sys_exit|sys_exit_group|__switch_to)$
> blfile: ^(kernel/kprobes.c|arch/.*/kernel/kprobes.c)$
> focused on module 'ext4dev = [0x9aa300-0x9d9888, bias 0x0] file /lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko ELF machine i?86 (code 3)
> WARNING: cannot find module ext4dev debuginfo: No such file or directory
> semantic error: no match while resolving probe point module("ext4dev").function("ext4_check_descriptors")
> semantic error: no probes found
> Pass 2: analyzed script: 0 probe(s), 0 function(s), 0 embed(s), 0 global(s) in 1970usr/170sys/5974real ms.
> Pass 2: analysis failed.  Try again with more '-v' (verbose) options.
> Running rm -rf /tmp/stapfX7KWk
> <tytso@closure> {/usr/projects/systemtap/examples}  
> 9% grep DEBUG_KERNEL /boot/config-2.6.26-03033-g97438cf 
> CONFIG_DEBUG_KERNEL=y

Hi,

Could you check CONFIG_DEBUG_INFO=y instead of DEBUG_KERNEL?
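
A minimal sketch of that check as a shell helper.  The function name
has_debug_info is hypothetical, and the config path in the usage comment
assumes the distribution installs the kernel config under /boot, as on
the system shown in this thread.

```shell
# Succeed only if the given kernel config enables CONFIG_DEBUG_INFO.
# CONFIG_DEBUG_KERNEL alone does not make the build emit DWARF data.
# (has_debug_info is a hypothetical helper name.)
has_debug_info() {
    grep -qx 'CONFIG_DEBUG_INFO=y' "$1"
}

# Usage (the path is an assumption about where the config is installed):
#   has_debug_info "/boot/config-$(uname -r)" && echo "DWARF info enabled"
```

The -x option makes grep match the whole line, so a commented-out
"# CONFIG_DEBUG_INFO is not set" entry is not mistaken for an enabled one.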

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: How to debug what I am doing wrong?
  2008-07-17 21:02 How to debug what I am doing wrong? Theodore Ts'o
  2008-07-17 21:31 ` Masami Hiramatsu
@ 2008-07-17 22:01 ` Roland McGrath
  2008-07-17 22:10   ` Theodore Tso
  1 sibling, 1 reply; 14+ messages in thread
From: Roland McGrath @ 2008-07-17 22:01 UTC
  To: Theodore Ts'o; +Cc: systemtap

If you built systemtap against installed elfutils libraries,
you can use some installed eu-* programs to test them too.
e.g. try eu-unstrip -n -K ext4dev
If that behaves differently, then it might not be a library problem.  If it
also fails to find the debug file, then it is probably a library problem.

You can try strace -eopen or other such debugging to see what files it's
trying to open before it gives up.  If it finds the right file and then
rejects it, that tells us something else.
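
One way to sift such a trace, sketched under assumptions: the log is
written with strace's -o option, the interesting paths are the ones under
/usr/lib/debug, a .build-id directory, or ending in .debug, and the helper
name debuginfo_opens is hypothetical.  Newer systems issue openat() rather
than open(), so both are matched.

```shell
# Print the open()/openat() attempts in an strace log that touch likely
# debuginfo locations.  $1 is a log produced by something like:
#   strace -f -e trace=open,openat -o /tmp/unstrip.trace \
#       eu-unstrip -n -K ext4dev
# (debuginfo_opens and the /tmp path are hypothetical names.)
debuginfo_opens() {
    # With -f and -o, each line may start with a PID column.
    grep -E '^([0-9]+ +)?open(at)?\(' "$1" |
        grep -E '/usr/lib/debug|\.build-id|\.debug"'
}
```

If this prints nothing, the tool never looked in the debuginfo directories
at all; if it prints a successful open of the right file, the file was
found and then rejected.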

Another thing to try is eu-unstrip -n -e /lib/modules/.../ext4dev.ko
That will indicate whether the library thought the debug file was OK
for the generic file case where finding it should always be simple.


Thanks,
Roland


* Re: How to debug what I am doing wrong?
  2008-07-17 21:31 ` Masami Hiramatsu
@ 2008-07-17 22:07   ` Theodore Tso
  0 siblings, 0 replies; 14+ messages in thread
From: Theodore Tso @ 2008-07-17 22:07 UTC
  To: Masami Hiramatsu; +Cc: systemtap

On Thu, Jul 17, 2008 at 05:29:33PM -0400, Masami Hiramatsu wrote:
> > <tytso@closure> {/usr/projects/systemtap/examples}  
> > 9% grep DEBUG_KERNEL /boot/config-2.6.26-03033-g97438cf 
> > CONFIG_DEBUG_KERNEL=y
> 
> Could you check CONFIG_DEBUG_INFO=y instead of DEBUG_KERNEL?

% grep DEBUG_INFO /boot/config-2.6.26-03033-g97438cf
CONFIG_DEBUG_INFO=y

					- Ted


* Re: How to debug what I am doing wrong?
  2008-07-17 22:01 ` Roland McGrath
@ 2008-07-17 22:10   ` Theodore Tso
  2008-07-17 22:23     ` Roland McGrath
  0 siblings, 1 reply; 14+ messages in thread
From: Theodore Tso @ 2008-07-17 22:10 UTC
  To: Roland McGrath; +Cc: systemtap

On Thu, Jul 17, 2008 at 03:00:16PM -0700, Roland McGrath wrote:
> If you built systemtap against installed elfutils libraries,
> you can use some installed eu-* programs to test them too.
> e.g. try eu-unstrip -n -K ext4dev

So eu-unstrip doesn't have a man page, so I'm not 100% sure what this
is doing, but:

<tytso@closure> {/usr/projects/linux}  
140%  eu-unstrip -n -K ext4dev
0x9aa300+0x2f588 a8976215a438326936201ee03829fa3230fed123@0x9aa324 /lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko - ext4dev

<tytso@closure> {/usr/projects/linux}  
141% eu-unstrip -n -e /lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko
0+0x2f588 a8976215a438326936201ee03829fa3230fed123@0x24 /lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko - 

<tytso@closure> {/usr/projects/linux}  
142% eu-unstrip -n -e /usr/lib/debug/lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko
0+0x2f588 a8976215a438326936201ee03829fa3230fed123@0x24 /usr/lib/debug/lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko . 

Does this tell you anything useful?

						- Ted


* Re: How to debug what I am doing wrong?
  2008-07-17 22:10   ` Theodore Tso
@ 2008-07-17 22:23     ` Roland McGrath
  2008-07-17 22:38       ` Theodore Tso
  0 siblings, 1 reply; 14+ messages in thread
From: Roland McGrath @ 2008-07-17 22:23 UTC
  To: Theodore Tso; +Cc: systemtap

> So eu-unstrip doesn't have a man page, so I'm not 100% sure what this
> is doing, but:

Yeah, sorry about that.  I'm sometimes decent at consing new frobs, but
always lousy at documenting.  It does have --help, though it is rather
nonobvious til you read it all that -n makes it do something entirely
different. ;-)

 With -n no files are written, but one line to standard output for each module:
	 START+SIZE BUILDID FILE DEBUGFILE MODULENAME
 START and SIZE are hexadecimal giving the address bounds of the module.
 BUILDID is hexadecimal for the build ID bits, or - if no ID is known; the
 hexadecimal may be followed by @0xADDR giving the address where the ID resides
 if that is known.  FILE is the file name found for the module, or - if none was
 found, or . if an ELF image is available but not from any named file.
 DEBUGFILE is the separate debuginfo file name, or - if no debuginfo was found,
 or . if FILE contains the debug information.

> 140%  eu-unstrip -n -K ext4dev
> 0x9aa300+0x2f588 a8976215a438326936201ee03829fa3230fed123@0x9aa324 /lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko - ext4dev

So, this says it found the module (file name) but found no debug file (-),
same as systemtap.

> 141% eu-unstrip -n -e /lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko
> 0+0x2f588 a8976215a438326936201ee03829fa3230fed123@0x24 /lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko - 

This says the same about just looking at the file by name (as a generic
ET_REL file, the tool not caring that it's a .ko) and trying to find its
debug file.

> 142% eu-unstrip -n -e /usr/lib/debug/lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko
> 0+0x2f588 a8976215a438326936201ee03829fa3230fed123@0x24 /usr/lib/debug/lib/modules/2.6.26-03033-g97438cf/kernel/fs/ext4/ext4dev.ko . 

This says that the file you pointed it to explicitly does itself contain
satisfactory DWARF sections (.).
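
The reading of the DEBUGFILE column used in the three cases above can be
sketched as a small shell helper.  This assumes the five-column layout
from the --help text quoted earlier and file names without spaces; the
name debugfile_status and its output strings are hypothetical.

```shell
# Classify the DEBUGFILE column (4th field) of one `eu-unstrip -n` line.
# Prints "none" for "-", "in-file" for ".", "separate:PATH" for a path.
# (debugfile_status is a hypothetical helper; it assumes no spaces in
# file names.)
debugfile_status() {
    line=$1
    set -f          # no globbing while the line is split into words
    set -- $line    # fields: START+SIZE BUILDID FILE DEBUGFILE MODULENAME
    set +f
    case "${4:-}" in
        -)  echo "none" ;;
        .)  echo "in-file" ;;
        '') echo "unparsed" ;;
        *)  echo "separate:$4" ;;
    esac
}

# Usage:
#   eu-unstrip -n -K ext4dev | while IFS= read -r l; do
#       debugfile_status "$l"
#   done
```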

> Does this tell you anything useful?

It does.  It now seems likely the bug is in the libdwfl code for searching
for debuginfo files and/or its code for validating them.  (In this case, it
should be validating that the build ID matches.)

If you send me both of those files (a tar with each by its full dir name
would be handy), I should be able to reproduce this and fix the bug.


Thanks,
Roland


* Re: How to debug what I am doing wrong?
  2008-07-17 22:23     ` Roland McGrath
@ 2008-07-17 22:38       ` Theodore Tso
  2008-07-17 23:01         ` Frank Ch. Eigler
  2008-07-18  0:15         ` Roland McGrath
  0 siblings, 2 replies; 14+ messages in thread
From: Theodore Tso @ 2008-07-17 22:38 UTC
  To: Roland McGrath; +Cc: systemtap

On Thu, Jul 17, 2008 at 03:23:10PM -0700, Roland McGrath wrote:
> It does.  It now seems likely the bug is in the libdwfl code for searching
> for debuginfo files and/or its code for validating them.  (In this case, it
> should be validating that the build ID matches.)

Hmm.  I just tried to do a strace -eopen on eu-unstrip and on stap, and
it looks like it's not trying to search for the file in /usr/lib/debug
at all.  I'm using the version of elfutils that shipped with Ubuntu
Hardy (since I was assured it wasn't necessary to build your own
version of elfutils).  I seem to be using version 0.131-3 of elfutils
from Ubuntu; could that be the problem?

I can work around the problem in stap (but not eu-unstrip) by setting
the SYSTEMTAP_DEBUGINFO_PATH environment variable.

export SYSTEMTAP_DEBUGINFO_PATH=/usr/local/lib/debug/lib/modules/2.6.26-03033-g97438cf

Unfortunately, I then get a different error message:

% stap ext4-check-desk.stp
semantic error: failed to retrieve location attribute for local 'sb' (dieoffset: 0x9cf22): identifier '$sb' at ext4-check-desk.stp:3:47
Pass 2: analysis failed.  Try again with more '-v' (verbose) options.

sb is a parameter passed into ext4_check_descriptors, so I don't know
why it's issuing *this* complaint:

static int ext4_check_descriptors(struct super_block *sb)
{
	struct ext4_sb_info *sbi = EXT4_SB(sb);
	...


						- Ted


* Re: How to debug what I am doing wrong?
  2008-07-17 22:38       ` Theodore Tso
@ 2008-07-17 23:01         ` Frank Ch. Eigler
  2008-07-18 11:50           ` Theodore Tso
  2008-07-18  0:15         ` Roland McGrath
  1 sibling, 1 reply; 14+ messages in thread
From: Frank Ch. Eigler @ 2008-07-17 23:01 UTC
  To: Theodore Tso; +Cc: Roland McGrath, systemtap

Theodore Tso <tytso@mit.edu> writes:

> [...]
> Unfortunately, I then get a different error message:
>
> % stap ext4-check-desk.stp
> semantic error: failed to retrieve location attribute for local 'sb' (dieoffset: 0x9cf22): identifier '$sb' at ext4-check-desk.stp:3:47
> [...]
> sb is a parameter passed into ext4_check_descriptors, so I don't know
> why it's issuing *this* complaint:
>
> static int ext4_check_descriptors(struct super_block *sb)
> {
> 	struct ext4_sb_info *sbi = EXT4_SB(sb);

This is one of the cases where gcc's dwarf debugging data is
incomplete.  See http://gcc.gnu.org/PR23551 and
http://sources.redhat.com/PR1155, and many discussions on gcc-patches
and elsewhere.  While it will get better (RH and others are investing
serious effort in it), it may never be complete enough.

This was one of the motivations for markers.  Since selected
parameters are identified to the compiler, we are assured that the
values will be available, regardless of optimizations or debugging
data.  (This is also an example where James' simple_mark will fail.)

You may be able to work around this by using .statement() probes,
placing one near the call site of this function, hoping to extract the
same pointer.


- FChE


* Re: How to debug what I am doing wrong?
  2008-07-17 22:38       ` Theodore Tso
  2008-07-17 23:01         ` Frank Ch. Eigler
@ 2008-07-18  0:15         ` Roland McGrath
  2008-07-18  1:14           ` Theodore Tso
  1 sibling, 1 reply; 14+ messages in thread
From: Roland McGrath @ 2008-07-18  0:15 UTC
  To: Theodore Tso; +Cc: systemtap

> Hmm.  I just tried to do a strace -eopen on eu-unstrip and on stap, and
> it looks like it's not trying to search for the file in /usr/lib/debug
> at all.  I'm using the version of elfutils that shipped with Ubuntu
> Hardy (since I was assured it wasn't necessary to build your own
> version of elfutils).  I seem to be using version 0.131-3 of elfutils
> from Ubuntu; could that be the problem?

It's a libdwfl bug.  When there is a build ID, it always looks for the file
by build ID first.  The bug is that it's then not falling back to the path
search based on the file name as it should.  I've put the fix below into
elfutils upstream.
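
The intended search order (build ID first, then the name-based path) can
be modelled in shell.  This is a simplified sketch, not libdwfl's code:
it assumes the common DEBUGROOT/.build-id/xx/rest.debug layout, tries
only one name-based candidate, and find_debuginfo is a hypothetical name.

```shell
# Print the first existing debuginfo candidate for a module, or fail.
# $1 = hex build ID (may be empty), $2 = absolute path of the module,
# $3 = debuginfo root (defaults to /usr/lib/debug).
# (find_debuginfo is a hypothetical helper modelling the search order.)
find_debuginfo() {
    id=$1 file=$2 root=${3:-/usr/lib/debug}
    if [ -n "$id" ]; then
        rest=${id#??}                       # ID minus its first two digits
        cand="$root/.build-id/${id%"$rest"}/$rest.debug"
        if [ -e "$cand" ]; then echo "$cand"; return 0; fi
        # Nothing under .build-id: fall through instead of giving up,
        # which is the step the bug was skipping.
    fi
    cand="$root$file"
    if [ -e "$cand" ]; then echo "$cand"; return 0; fi
    return 1
}
```

With the files from this thread, the build-ID lookup finds nothing and the
second candidate, /usr/lib/debug/lib/modules/.../ext4dev.ko, is the one
that should have been tried.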


Thanks,
Roland


libdwfl/
2008-07-17  Roland McGrath  <roland@redhat.com>

	* dwfl_build_id_find_elf.c (__libdwfl_open_by_build_id): Set errno to
	zero if the failure was only ENOENT.

--- libdwfl/dwfl_build_id_find_elf.c	5780a35eb84a17f2cb1e8fcba2d7c95e51673e63
+++ libdwfl/dwfl_build_id_find_elf.c	6afdebe48e2da91dd355b198c03b383c6bd3c088
@@ -119,6 +119,13 @@ __libdwfl_open_by_build_id (Dwfl_Module 
       free (name);
     }
 
+  /* If we simply found nothing, clear errno.  If we had some other error
+     with the file, report that.  Possibly this should treat other errors
+     like ENOENT too.  But ignoring all errors could mask some that should
+     be reported.  */
+  if (fd < 0 && errno == ENOENT)
+    errno = 0;
+
   return fd;
 }
 


* Re: How to debug what I am doing wrong?
  2008-07-18  0:15         ` Roland McGrath
@ 2008-07-18  1:14           ` Theodore Tso
  2008-07-18  2:07             ` Roland McGrath
  0 siblings, 1 reply; 14+ messages in thread
From: Theodore Tso @ 2008-07-18  1:14 UTC
  To: Roland McGrath; +Cc: systemtap

On Thu, Jul 17, 2008 at 05:15:13PM -0700, Roland McGrath wrote:
> It's a libdwfl bug.  When there is a build ID, it always looks for the file
> by build ID first.  The bug is that it's then not falling back to the path
> search based on the file name as it should.  I've put the fix below into
> elfutils upstream.

Um, stupid question.  How does the build ID get set?  I'm using
make-kpkg, which I didn't think knew how to set the build ID.  (I have
some vague memory of a magic flag to the linker, or some
post-processing via some elfutils tool?)

Thanks,

							- Ted


* Re: How to debug what I am doing wrong?
  2008-07-18  1:14           ` Theodore Tso
@ 2008-07-18  2:07             ` Roland McGrath
  0 siblings, 0 replies; 14+ messages in thread
From: Roland McGrath @ 2008-07-18  2:07 UTC
  To: Theodore Tso; +Cc: systemtap

> Um, stupid question.  How does the build ID get set?  I'm using
> make-kpkg, which I didn't think knew how to set the build ID.  (I have
> some vague memory of a magic flag to the linker, or some
> post-processing via some elfutils tool?)

ld generates a build ID when given the --build-id option.
The kernel makefiles pass it by default when it's supported.
Sometimes gcc's default ld run for normal executables/DSOs
passes --build-id too (Fedora's gcc does since F8).

When a final link (or a "quasi-final" -r link for a .ko) is done without
generating a build ID at that time, one is never added in later.

There has not yet been any elfutils tool that generates or changes build IDs.
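
To see whether a given object actually carries a build ID, the ELF notes
can be inspected.  A small sketch, assuming binutils' readelf is installed
and prints the ID on a line of the form "Build ID: <hex>"; the helper name
build_id_from_notes is hypothetical.

```shell
# Extract the hex build ID from `readelf -n` output read on stdin.
# Prints nothing if no build ID note is present.
# (build_id_from_notes is a hypothetical helper name.)
build_id_from_notes() {
    sed -n 's/^[[:space:]]*Build ID:[[:space:]]*\([0-9a-f]\{1,\}\).*$/\1/p'
}

# Usage: compare the module with its separate debuginfo file
# (paths elided here, as elsewhere in this thread):
#   readelf -n /lib/modules/.../ext4dev.ko | build_id_from_notes
#   readelf -n /usr/lib/debug/lib/modules/.../ext4dev.ko | build_id_from_notes
```

Matching output from both commands means the two files came from the same
link; empty output means the link was done without --build-id.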

The rpmbuild magic scripts that do separate debuginfo splitting also
(first) use an rpmbuild tool (/usr/lib/rpm/debugedit) that edits the
DWARF information.  (It rewrites the source directory names from the
package build directory into the installed locations in /usr/src/debug
where the -debuginfo rpm will install copies of all the referenced
source files.)  This tool now also regenerates the build ID based on the
edited binary's contents.  Doing this ensures that two rpmbuild runs
that generate completely identical binaries and .debug files but use
different _builddir settings reproduce the identical build IDs in the
identical binaries.  (In some buildsystems that directory name is
different every time.)

If a packaging system is not editing the binaries after they are linked
(aside from strip-to-debug), or is not concerned with 100% reproducible
builds (identical binaries from identical constituents), then it never
has any reason to meddle with the build ID bits.


Thanks,
Roland


* Re: How to debug what I am doing wrong?
  2008-07-17 23:01         ` Frank Ch. Eigler
@ 2008-07-18 11:50           ` Theodore Tso
  2008-07-18 12:13             ` Frank Ch. Eigler
  0 siblings, 1 reply; 14+ messages in thread
From: Theodore Tso @ 2008-07-18 11:50 UTC
  To: Frank Ch. Eigler; +Cc: Roland McGrath, systemtap

On Thu, Jul 17, 2008 at 07:00:12PM -0400, Frank Ch. Eigler wrote:
> Theodore Tso <tytso@mit.edu> writes:
> 
> > [...]
> > Unfortunately, I then get a different error message:
> >
> > % stap ext4-check-desk.stp
> > semantic error: failed to retrieve location attribute for local 'sb' (dieoffset: 0x9cf22): identifier '$sb' at ext4-check-desk.stp:3:47
> > [...]
> > sb is a parameter passed into ext4_check_descriptors, so I don't know
> > why it's issuing *this* complaint:
> >
> > static int ext4_check_descriptors(struct super_block *sb)
> > {
> > 	struct ext4_sb_info *sbi = EXT4_SB(sb);
> 
> This is one of the cases where gcc's dwarf debugging data is
> incomplete.  See http://gcc.gnu.org/PR23551 and
> http://sources.redhat.com/PR1155, and many discussions on gcc-patches
> and elsewhere.  While it will get better (RH and others are investing
> serious effort in it), it may never be complete enough.

So is this a good summary?  With a sufficiently modern gcc (presumably
all 4.x compilers), if there is a static function which is only used
once, it will almost certainly not be compiled into a separate
function, but incorporated into the calling function --- and then
optimizations involving function parameter folding kick in, and at
least given the DWARF information emitted by a recent gcc, any
attempt at accessing variables in a static inline function has a very
high likelihood of being Doomed To Fail.

> You may be able to work around this by using .statement() probes,
> placing one near the call site of this function, hoping to extract the
> same pointer.

I tried multiple .statement probes inside the function, and that
didn't work.  Statement probes around the call site of the function
didn't work either.  I ultimately managed to grab it by using a
module().function() probe on the calling function, and grabbing it
from the parameter there.
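
A sketch of that workaround, written as a shell helper that prints the
probe script so it can be piped into stap.  The thread does not name the
calling function; ext4_fill_super in the usage comment is an assumption,
as is the idea that its parameter is also called sb.  The helper name
caller_probe is hypothetical.

```shell
# Print a SystemTap script that probes function $2 in module $1 and
# reads s_flags through its pointer parameter $3.
# (caller_probe is a hypothetical helper name.)
caller_probe() {
    cat <<EOF
probe module("$1").function("$2")
{
    printf("$2: flags %x\\n", \$$3->s_flags);
}
EOF
}

# Usage (ext4_fill_super is an assumed caller, not named in this thread):
#   caller_probe ext4dev ext4_fill_super sb | stap -
```

Probing the caller works because the caller is a real, non-inlined
function entry, where the parameter still has a location in the DWARF data.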

This may already be documented somewhere, but basically it sounds like
it is a very Bad Idea for tapsets to try to grab information from
inside static functions, since depending on the compiler used and
optimizations, the necessary DWARF information may not be available.
Also, static functions probably have a higher probability of changing
over time, making the tapset much less reliable.

It sounds like a good regression check would be to make sure all of
the probe points in the tapsets can reference all of the desired variables
given a kernel build tree and debuginfo files.  Is this already being
done, or is there an easy way to do this?

Regards,

						- Ted


* Re: How to debug what I am doing wrong?
  2008-07-18 11:50           ` Theodore Tso
@ 2008-07-18 12:13             ` Frank Ch. Eigler
  0 siblings, 0 replies; 14+ messages in thread
From: Frank Ch. Eigler @ 2008-07-18 12:13 UTC
  To: Theodore Tso; +Cc: Roland McGrath, systemtap

Hi -

On Fri, Jul 18, 2008 at 07:50:15AM -0400, Theodore Tso wrote:

> [...]  So is this a good summary?  [...]

Yes.

> > You may be able to work around this by using .statement() probes,
> > placing one near the call site of this function, hoping to extract the
> > same pointer.
> 
> I tried multiple .statement probes inside the function, and that
> didn't work.  Statement probes around the call site of the function
> didn't work either.  I ultimately managed to grab it by using a
> module().function() probe on the calling function, and grabbing it
> from the parameter there.

OK.  I'll write a wiki tip page about this issue.

> This may already be documented somewhere, but basically it sounds
> like it is a very Bad Idea for tapsets to try to grab information
> from inside static functions [...]

Yeah, it is difficult to make work reliably.

> It sounds like a good regression check would be to make sure all of
> the probe points in the tapsets can reference all of the desired
> variables given a kernel build tree and debuginfo files.  Is this
> already being done, or is there an easy way to do this?

There are two facilities: disabling systemtap-side script
optimizations with "-u", so that any $expressions that are not used by
an end-user (or test-suite) script are nevertheless accessed - sort of
as if they were "volatile".  At this point, our test suite does not
blanket-use this, because of the dramatic number of failures (though
it's not clear to what extent the problems are due to debuginfo or
actual tapset bugs).  Soon also we'll have a way of referring to
pretty-printing of all variables in scope ($$vars).
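
A rough sketch of such a regression check: run each test script through
pass 2 only, with script optimization disabled, and list the ones that
fail.  It assumes stap's -u and -p2 options behave as described above;
the STAP override variable and the name check_unoptimized are
hypothetical, added only so the helper can be tried without systemtap.

```shell
# Run passes 1-2 only (-p2) with optimization disabled (-u) on every
# script given, and report the ones that fail to resolve.
# Returns success only if all scripts passed.
# (check_unoptimized and the STAP variable are hypothetical names.)
check_unoptimized() {
    failed=0
    for script in "$@"; do
        if ! "${STAP:-stap}" -u -p2 "$script" >/dev/null 2>&1; then
            echo "unresolved: $script"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Usage (directory name is an assumption):
#   check_unoptimized testsuite/*.stp
```

As noted above, a failure here does not say whether the cause is missing
debuginfo or a genuine tapset bug; it only narrows down where to look.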

- FChE


