public inbox for systemtap@sourceware.org
* [Bug translator/14820] New: hang when running semok/nodwf01.stp
@ 2012-11-08 18:42 dsmith at redhat dot com
  2012-11-12 17:03 ` [Bug translator/14820] " dsmith at redhat dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: dsmith at redhat dot com @ 2012-11-08 18:42 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=14820

             Bug #: 14820
           Summary: hang when running semok/nodwf01.stp
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: translator
        AssignedTo: systemtap@sourceware.org
        ReportedBy: dsmith@redhat.com
    Classification: Unclassified


When running the semok/nodwf01.stp testcase, systemtap hangs on kernel
2.6.32-279.5.2.el6.x86_64.

That test does:

# stap -p2 --ignore-vmlinux --kmap=/proc/kallsyms -e '{SCRIPT}'

I've debugged this a bit with gdb. It appears that stap goes into an infinite
loop in module_info::update_symtab() (from tapsets.cxx), which looks like this:

====
void
module_info::update_symtab(cu_function_cache_t *funcs)
{
  if (!sym_table)
    return;

  cu_function_cache_t new_funcs;

  for (cu_function_cache_t::iterator func = funcs->begin();
       func != funcs->end(); func++)
    {
      // optimization: inlines will never be in the symbol table
      if (dwarf_func_inline(&func->second) != 0)
        continue;

      // XXX We may want to make additional efforts to match mangled elf names
      // to dwarf too.  MIPS_linkage_name can help, but that's sometimes
      // missing, so we may also need to try matching by address.  See also the
      // notes about _Z in dwflpp::iterate_over_functions().

      func_info *fi = sym_table->lookup_symbol(func->first);
      if (!fi)
        continue;

      // iterate over all functions at the same address
      symbol_table::range_t er = sym_table->map_by_addr.equal_range(fi->addr);
      for (symbol_table::iterator_t it = er.first; it != er.second; ++it)
        {
          // update this function with the dwarf die
          it->second->die = func->second;

          // if this function is a new alias, then
          // save it to merge into the function cache
          if (it->second != fi)
            new_funcs.insert(make_pair(it->second->name, it->second->die));
        }
    }

  // add all discovered aliases back into the function cache
  // NB: this won't replace any names that dwarf may have already found
  funcs->insert(new_funcs.begin(), new_funcs.end());
}
====

stap happily runs through around 660 symbols, then hits the
"shrink_dcache_memory" symbol. When that symbol is processed, stap never breaks
out of that innermost 'for' loop (the one with the "iterate over all functions
at the same address" comment).

In my current run, I've been stuck there for over an hour. I've let this test
run overnight, and it never finishes on this system.

Note that this is being run on actual hardware, not on a VM. Also, this system
has only 1M of memory (although only about half of swap has been consumed at
this point).

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


* [Bug translator/14820] hang when running semok/nodwf01.stp
  2012-11-08 18:42 [Bug translator/14820] New: hang when running semok/nodwf01.stp dsmith at redhat dot com
@ 2012-11-12 17:03 ` dsmith at redhat dot com
  2012-11-12 22:02 ` mjw at redhat dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: dsmith at redhat dot com @ 2012-11-12 17:03 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=14820

--- Comment #1 from David Smith <dsmith at redhat dot com> 2012-11-12 17:03:34 UTC ---
Here's an update on this one. This also happens on a 1M VM (running
2.6.32-279.14.1.el6.x86_64).

With some debug prints added, I now believe that module_info::update_symtab()
isn't really the problem; either elfutils or dwflpp.cxx is. Basically, the
problem is that we're finding *way* too many function aliases when we don't
have dwarf info.

Here are a couple of randomly chosen functions:

piix4_io_quirk: 0 aliases with dwarf info, ~660 aliases without dwarf info
native_read_cr4_safe: 0 aliases with dwarf info, ~1320 aliases without dwarf info

Without dwarf info, I'm seeing the innermost loop of
module_info::update_symtab() find almost 18 million aliases (if I'm reading my
debug output correctly). I believe the hang here is just the time taken trying
to insert that many items into the hash table.

I'd guess the problem is in map_by_addr.equal_range(), but I could certainly be
wrong.


* [Bug translator/14820] hang when running semok/nodwf01.stp
  2012-11-08 18:42 [Bug translator/14820] New: hang when running semok/nodwf01.stp dsmith at redhat dot com
  2012-11-12 17:03 ` [Bug translator/14820] " dsmith at redhat dot com
@ 2012-11-12 22:02 ` mjw at redhat dot com
  2012-11-13 10:00 ` mjw at redhat dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: mjw at redhat dot com @ 2012-11-12 22:02 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=14820

Mark Wielaard <mjw at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mjw at redhat dot com


* [Bug translator/14820] hang when running semok/nodwf01.stp
  2012-11-08 18:42 [Bug translator/14820] New: hang when running semok/nodwf01.stp dsmith at redhat dot com
  2012-11-12 17:03 ` [Bug translator/14820] " dsmith at redhat dot com
  2012-11-12 22:02 ` mjw at redhat dot com
@ 2012-11-13 10:00 ` mjw at redhat dot com
  2012-11-13 10:04 ` mjw at redhat dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: mjw at redhat dot com @ 2012-11-13 10:00 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=14820

--- Comment #2 from Mark Wielaard <mjw at redhat dot com> 2012-11-13 10:00:26 UTC ---
The issue seems to be that with --ignore-vmlinux --kmap=/proc/kallsyms, lots of
functions map to address 0x0 and they all get aliased. This only seems to
happen on my RHEL6 setup, not on my Fedora setup.
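
To illustrate why a shared address is so expensive, here is a minimal sketch
(not systemtap code) of a loop shaped like the inner loop of
module_info::update_symtab(). The names sym, count_alias_visits and make_syms
are hypothetical and exist only for this example. It counts how many
"same address, different symbol" pairs the equal_range() walk visits: zero
when every symbol has its own address, and n*(n-1) when all n symbols sit at
one address.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for systemtap's func_info; only the fields used here.
struct sym {
  std::string name;
  std::uint64_t addr;
};

// For each symbol, walk every entry that shares its address (as the inner
// loop of update_symtab() does) and count the entries that are a different
// symbol, i.e. the ones that would be treated as aliases.
std::size_t count_alias_visits(const std::vector<sym>& syms)
{
  std::multimap<std::uint64_t, const sym*> map_by_addr;
  for (const sym& s : syms)
    map_by_addr.insert({s.addr, &s});

  std::size_t visits = 0;
  for (const sym& s : syms)
    {
      auto er = map_by_addr.equal_range(s.addr);
      for (auto it = er.first; it != er.second; ++it)
        if (it->second != &s)
          ++visits;
    }
  return visits;
}

// Build n synthetic symbols. With masked == true every address is 0,
// otherwise each symbol gets a distinct address.
std::vector<sym> make_syms(std::size_t n, bool masked)
{
  std::vector<sym> v;
  v.reserve(n);
  for (std::size_t i = 0; i < n; ++i)
    {
      std::uint64_t addr =
        masked ? 0 : 0x1000 + 16 * static_cast<std::uint64_t>(i);
      v.push_back({"f" + std::to_string(i), addr});
    }
  return v;
}
```

With 1000 synthetic symbols, count_alias_visits(make_syms(1000, false)) is 0
and count_alias_visits(make_syms(1000, true)) is 999000, so the work grows
quadratically with the number of symbols sharing an address.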


* [Bug translator/14820] hang when running semok/nodwf01.stp
  2012-11-08 18:42 [Bug translator/14820] New: hang when running semok/nodwf01.stp dsmith at redhat dot com
                   ` (2 preceding siblings ...)
  2012-11-13 10:00 ` mjw at redhat dot com
@ 2012-11-13 10:04 ` mjw at redhat dot com
  2012-11-13 12:14 ` fche at redhat dot com
  2012-11-13 22:01 ` dsmith at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: mjw at redhat dot com @ 2012-11-13 10:04 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=14820

--- Comment #3 from Mark Wielaard <mjw at redhat dot com> 2012-11-13 10:04:31 UTC ---
And it only seems to happen when I am a normal user (in group stapdev),
not as root...

Oh... look at that:

$ head /proc/kallsyms
0000000000000000 D per_cpu__irq_stack_union
0000000000000000 D __per_cpu_start
0000000000000000 D per_cpu__gdt_page
0000000000000000 d per_cpu__exception_stacks
0000000000000000 d per_cpu__idt_desc
0000000000000000 d per_cpu__xen_cr0_value
0000000000000000 D per_cpu__xen_vcpu
0000000000000000 D per_cpu__xen_vcpu_info
0000000000000000 d per_cpu__mc_buffer
0000000000000000 D per_cpu__xen_mc_irq_flags

$ sudo head /proc/kallsyms
0000000000000000 D per_cpu__irq_stack_union
0000000000000000 D __per_cpu_start
0000000000004000 D per_cpu__gdt_page
0000000000005000 d per_cpu__exception_stacks
000000000000b000 d per_cpu__idt_desc
000000000000b010 d per_cpu__xen_cr0_value
000000000000b018 D per_cpu__xen_vcpu
000000000000b020 D per_cpu__xen_vcpu_info
000000000000b060 d per_cpu__mc_buffer
000000000000c570 D per_cpu__xen_mc_irq_flags
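
A rough way to spot this condition programmatically is to scan kallsyms-style
text and check whether every address is zero. The sketch below is only an
illustration, assuming the "address type name" layout shown above; the names
kallsyms_stats, scan_kallsyms and looks_masked are hypothetical. The "all
addresses are zero" test is a heuristic: a few symbols legitimately sit at
address 0 even in the root output.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical summary of a kallsyms-style listing.
struct kallsyms_stats {
  std::size_t total = 0;      // well-formed lines seen
  std::size_t zero_addr = 0;  // lines whose address field is 0
};

// Read "address type name" lines (address in hex) and count how many
// symbols report address 0. Lines that do not parse are skipped.
kallsyms_stats scan_kallsyms(std::istream& in)
{
  kallsyms_stats st;
  std::string line;
  while (std::getline(in, line))
    {
      std::istringstream fields(line);
      std::uint64_t addr = 0;
      std::string type, name;
      if (!(fields >> std::hex >> addr >> type >> name))
        continue;
      ++st.total;
      if (addr == 0)
        ++st.zero_addr;
    }
  return st;
}

// Heuristic: the listing looks masked if it is non-empty and every
// address is zero.
bool looks_masked(const kallsyms_stats& st)
{
  return st.total > 0 && st.zero_addr == st.total;
}
```

Fed the ten unprivileged lines above, this reports all ten addresses as zero
and looks_masked() returns true; fed the ten sudo lines, only the first two
are zero and it returns false.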


* [Bug translator/14820] hang when running semok/nodwf01.stp
  2012-11-08 18:42 [Bug translator/14820] New: hang when running semok/nodwf01.stp dsmith at redhat dot com
                   ` (3 preceding siblings ...)
  2012-11-13 10:04 ` mjw at redhat dot com
@ 2012-11-13 12:14 ` fche at redhat dot com
  2012-11-13 22:01 ` dsmith at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: fche at redhat dot com @ 2012-11-13 12:14 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=14820

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fche at redhat dot com

--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> 2012-11-13 12:14:11 UTC ---
Courtesy of kptr_restrict.  IMO, let's drop the old --ignore-vmlinux / --kmap
related options entirely.


* [Bug translator/14820] hang when running semok/nodwf01.stp
  2012-11-08 18:42 [Bug translator/14820] New: hang when running semok/nodwf01.stp dsmith at redhat dot com
                   ` (4 preceding siblings ...)
  2012-11-13 12:14 ` fche at redhat dot com
@ 2012-11-13 22:01 ` dsmith at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: dsmith at redhat dot com @ 2012-11-13 22:01 UTC (permalink / raw)
  To: systemtap

http://sourceware.org/bugzilla/show_bug.cgi?id=14820

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #5 from David Smith <dsmith at redhat dot com> 2012-11-13 22:01:17 UTC ---
Commit ab3ed72 removes the old '--ignore-vmlinux' and '--kmap' options (and the
test that caused this hang).

