public inbox for gdb@sourceware.org
From: Simon Marchi <simark@simark.ca>
To: DeJiang Zhu <doujiang24@gmail.com>, gdb@sourceware.org
Subject: Re: memory increased rapidly when adding a break
Date: Sun, 13 Nov 2022 19:26:10 -0500	[thread overview]
Message-ID: <b3f75aa3-dbd1-b704-c884-4083c9246967@simark.ca> (raw)
In-Reply-To: <CAEZxTmnx+QmDubVbbCETaLnXR5GiVVk3E=VPmJviUaPZHqJFUA@mail.gmail.com>



On 11/13/22 05:01, DeJiang Zhu via Gdb wrote:
> Hi,
> 
> I compiled envoy (a big C++ project) with gcc 12.2.0 and am debugging it
> with gdb 12.1.
> 
> But memory increased rapidly (40+ GB, until OOM) when adding a breakpoint.
> 
> I got this backtrace after attaching to the gdb process while memory was
> increasing.
> 
> ```
> (gdb) bt
> #0  0x00007fa6893dc935 in _int_malloc () from /lib64/libc.so.6
> #1  0x00007fa6893df6fc in malloc () from /lib64/libc.so.6
> #2  0x0000000000468278 in xmalloc (size=4064) at alloc.c:60
> #3  0x00000000008ecd95 in call_chunkfun (size=<optimized out>,
> h=0x17a246ed0) at ./obstack.c:94
> #4  _obstack_begin_worker (h=0x17a246ed0, size=<optimized out>,
> alignment=<optimized out>) at ./obstack.c:141
> #5  0x000000000052d0d3 in demangle_parse_info::demangle_parse_info
> (this=0x17a246ec0) at cp-name-parser.y:1973
> #6  cp_demangled_name_to_comp (demangled_name=demangled_name@entry=0x8d12c8c0
> "std::stack<unsigned int, std::deque<unsigned int, std::allocator<unsigned
> int> > >::size_type", errmsg=errmsg@entry=0x0) at cp-name-parser.y:2040
> #7  0x000000000052ff5e in cp_canonicalize_string
> (string=string@entry=0x8d12c8c0
> "std::stack<unsigned int, std::deque<unsigned int, std::allocator<unsigned
> int> > >::size_type") at cp-support.c:635
> #8  0x0000000000570b98 in dwarf2_canonicalize_name (name=0x8d12c8c0
> "std::stack<unsigned int, std::deque<unsigned int, std::allocator<unsigned
> int> > >::size_type", cu=<optimized out>, objfile=0x2c3af10) at
> dwarf2/read.c:22908
> #9  0x0000000000590265 in dwarf2_compute_name (name=0x7fa55773c524
> "size_type", die=0x172590eb0, cu=0xe2aeefd0, physname=0) at
> dwarf2/read.c:10095
> #10 0x000000000058bf39 in dwarf2_full_name (cu=0xe2aeefd0, die=0x172590eb0,
> name=0x0) at dwarf2/read.c:10123
> #11 read_typedef (cu=0xe2aeefd0, die=0x172590eb0) at dwarf2/read.c:17687
> #12 read_type_die_1 (cu=0xe2aeefd0, die=0x172590eb0) at dwarf2/read.c:22531
> #13 read_type_die (die=0x172590eb0, cu=0xe2aeefd0) at dwarf2/read.c:22473
> #14 0x000000000059acda in dwarf2_add_type_defn (cu=0xe2aeefd0,
> die=0x172590eb0, fip=0x7ffd8a1be3e0) at dwarf2/read.c:14740
> #15 handle_struct_member_die (child_die=0x172590eb0, type=0x17a6becd0,
> fi=0x7ffd8a1be3e0, template_args=<optimized out>, cu=0xe2aeefd0) at
> dwarf2/read.c:15867
> #16 0x0000000000597044 in process_structure_scope (cu=0xe2aeefd0,
> die=0x172590920) at dwarf2/read.c:15908
> #17 process_die (die=0x172590920, cu=0xe2aeefd0) at dwarf2/read.c:9698
> #18 0x000000000059646d in read_namespace (cu=0xe2aeefd0, die=0x16802e140)
> at dwarf2/read.c:17068
> #19 process_die (die=0x16802e140, cu=0xe2aeefd0) at dwarf2/read.c:9737
> #20 0x0000000000598df9 in read_file_scope (die=0x1594e8360, cu=0xe2aeefd0)
> at dwarf2/read.c:10648
> #21 0x0000000000595f32 in process_die (die=0x1594e8360, cu=0xe2aeefd0) at
> dwarf2/read.c:9669
> #22 0x000000000059c0c8 in process_full_comp_unit
> (pretend_language=<optimized out>, cu=0xe2aeefd0) at dwarf2/read.c:9439
> #23 process_queue (per_objfile=0x9d546c0) at dwarf2/read.c:8652
> #24 dw2_do_instantiate_symtab (per_cu=<optimized out>,
> per_objfile=0x9d546c0, skip_partial=<optimized out>) at dwarf2/read.c:2311
> #25 0x000000000059c5f0 in dw2_instantiate_symtab (per_cu=0x9c886f0,
> per_objfile=0x9d546c0, skip_partial=<optimized out>) at dwarf2/read.c:2335
> #26 0x000000000059c78a in dw2_expand_symtabs_matching_one(dwarf2_per_cu_data *,
> dwarf2_per_objfile *, gdb::function_view<bool(char const*, bool)>,
> gdb::function_view<bool(compunit_symtab*)>) (per_cu=<optimized out>,
> per_objfile=<optimized out>, file_matcher=..., expansion_notify=...) at
> dwarf2/read.c:4204
> #27 0x000000000059c94b in
> dwarf2_gdb_index::expand_symtabs_matching(objfile*, gdb::function_view<bool
> (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool
> (char const*)>, gdb::function_view<bool (compunit_symtab*)>,
> enum_flags<block_search_flag_values>, domain_enum_tag, search_domain)
> (this=<optimized out>, objfile=<optimized out>, file_matcher=...,
> lookup_name=<optimized out>, symbol_matcher=
> ..., expansion_notify=..., search_flags=..., domain=UNDEF_DOMAIN,
> kind=<optimized out>) at dwarf2/read.c:4421
> #28 0x0000000000730feb in objfile::map_symtabs_matching_filename(char
> const*, char const*, gdb::function_view<bool (symtab*)>) (this=0x2c3af10,
> name=<optimized out>, name@entry=0x586f26f0 "utility.h",
> real_path=<optimized out>, real_path@entry=0x0, callback=...) at
> symfile-debug.c:207
> #29 0x0000000000741abd in iterate_over_symtabs(char const*,
> gdb::function_view<bool (symtab*)>) (name=name@entry=0x586f26f0
> "utility.h", callback=...) at symtab.c:624
> #30 0x00000000006311d7 in collect_symtabs_from_filename (file=0x586f26f0
> "utility.h", search_pspace=<optimized out>) at linespec.c:3716
> #31 0x0000000000631212 in symtabs_from_filename (filename=0x586f26f0
> "utility.h", search_pspace=<optimized out>) at linespec.c:3736
> #32 0x0000000000635e9f in parse_linespec (parser=0x7ffd8a1bf1b0,
> arg=<optimized out>, match_type=<optimized out>) at linespec.c:2557
> #33 0x0000000000636cac in event_location_to_sals (parser=0x7ffd8a1bf1b0,
> location=0x51ed4da0) at linespec.c:3082
> #34 0x0000000000636f73 in decode_line_full (location=location@entry=0x51ed4da0,
> flags=flags@entry=1, search_pspace=search_pspace@entry=0x0,
> default_symtab=<optimized out>, default_line=<optimized out>,
> canonical=0x7ffd8a1bf4e0, select_mode=0x0, filter=<optimized out>) at
> linespec.c:3161
> #35 0x00000000004b1683 in parse_breakpoint_sals (location=0x51ed4da0,
> canonical=0x7ffd8a1bf4e0) at breakpoint.c:8730
> #36 0x00000000004b5d03 in create_breakpoint (gdbarch=0xeca5dc0,
> location=location@entry=0x51ed4da0, cond_string=cond_string@entry=0x0,
> thread=<optimized out>, thread@entry=-1, extra_string=0x0,
> extra_string@entry=0x7ffd8a1bf650 "",
> force_condition=force_condition@entry=false,
> parse_extra=0, tempflag=0, type_wanted=bp_breakpoint, ignore_count=0,
> pending_break_support=AUTO_BOOLEAN_TRUE, ops=0xc23c00
> <bkpt_breakpoint_ops>, from_tty=0, enabled=1, internal=0, flags=0) at
> breakpoint.c:9009
> #37 0x0000000000674ba8 in mi_cmd_break_insert_1 (dprintf=0, argv=<optimized
> out>, argc=<optimized out>, command=<optimized out>) at
> mi/mi-cmd-break.c:361
> ```
> 
> Also, I found it loops in `dwarf2_gdb_index::expand_symtabs_matching`.
> I added a breakpoint on `dw2_expand_symtabs_matching_one`, and it hit this
> breakpoint repeatedly.
> 
> ```
>   if (lookup_name == nullptr)
>     {
>       for (dwarf2_per_cu_data *per_cu
>         : all_comp_units_range (per_objfile->per_bfd))
>       {
>          QUIT;
>          if (!dw2_expand_symtabs_matching_one (per_cu, per_objfile,
>                                                file_matcher, expansion_notify))
>            return false;
>       }
>       return true;
>     }
> ```
> 
> It seems `per_bfd->all_comp_units.size()` is `28776`.
> I'm not sure whether this is a reasonable value.

I think that's possible, if it's a big project.  For instance, my gdb
binary has about 660 compile units, and gdb is not really big.
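
If you want to sanity-check that number, something along these lines
should give a rough count of the compile units in your binary (the path
is just a placeholder, and it can take a while with that much debug
info):

```
$ readelf --debug-dump=info /path/to/envoy | grep -c 'DW_TAG_compile_unit'
```

Type units and partial units, if any, have their own tags
(DW_TAG_type_unit, DW_TAG_partial_unit) and wouldn't be included in that
count.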

> 
> ```
> (gdb) p per_objfile->per_bfd->all_comp_units
> $423 = {<std::_Vector_base<std::unique_ptr<dwarf2_per_cu_data,
> dwarf2_per_cu_data_deleter>,
> std::allocator<std::unique_ptr<dwarf2_per_cu_data,
> dwarf2_per_cu_data_deleter> > >> = {_M_impl =
> {<std::allocator<std::unique_ptr<dwarf2_per_cu_data,
> dwarf2_per_cu_data_deleter> >> =
> {<__gnu_cxx::new_allocator<std::unique_ptr<dwarf2_per_cu_data,
> dwarf2_per_cu_data_deleter> >> = {<No data fields>}, <No data fields>},
> <std::_Vector_base<std::unique_ptr<dwarf2_per_cu_data,
> dwarf2_per_cu_data_deleter>,
> std::allocator<std::unique_ptr<dwarf2_per_cu_data,
> dwarf2_per_cu_data_deleter> > >::_Vector_impl_data> = {_M_start =
> 0x3b6f980, _M_finish = 0x3b769e8, _M_end_of_storage = 0x3b769e8}, <No data
> fields>}}, <No data fields>}
> (gdb) p 0x3b769e8-0x3b6f980
> $424 = 28776
> ```
> 
> I can see the memory increasing rapidly in the for loop.
> I'm new to gdb's internal implementation, so I'm not sure where the
> problem could be: gcc, gdb, or just incorrect usage on my part.
> 
> Could you help point me in the right direction? I have the files to
> reproduce it reliably.

GDB works in two steps to read compile units.  From your stack trace, it
looks like you are using an index (the .gdb_index kind).  When GDB first
loads your binary, it reads an index present in the binary (or in the
index cache) that lists the entity names present in each compile unit of
the program.  When you set a breakpoint using a name, GDB "expands" all
the compile units containing something that matches what you asked for.
"Expand" means that GDB reads the full debug information from the DWARF
for that compile unit, creating some internal data structures to
represent it.

It sounds like the breakpoint spec string you passed matches a lot of
compile units, and a lot of them get expanded.  That creates a lot of
in-memory objects, eventually reaching some limit.
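
If you want to see how many compile units actually end up expanded, and
how much memory their symtabs use, you can compare the output of these
maintenance commands before and after the break command (output elided
here):

```
(gdb) maint info symtabs
(gdb) maint print statistics
```

The first lists the compunit symtabs that have been expanded so far; the
second prints per-objfile symbol counts and memory usage.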

Out of curiosity, what is the string you used to create your breakpoint?
From your stack trace, it sounds like it's "utility.h:LINE".
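
Note that GDB matches the file name in a linespec against symtab file
names more or less by suffix, so a plain "utility.h" matches a file with
that base name in any directory.  If your project has several different
utility.h headers, spelling out more of the path narrows which of them
match (the path and line number here are just placeholders):

```
(gdb) break source/common/common/utility.h:123
```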

Expanding that many CUs could be legitimate, if there's really something
matching in all these CUs, or it could be a bug where GDB expands
unrelated CUs.  There is an open bug related to a problem like this:

https://sourceware.org/bugzilla/show_bug.cgi?id=29105

Although I'm not sure that's what you're seeing.

Is the project you're building something open source that other people
could build and try?

Simon

Thread overview: 5+ messages
2022-11-13 10:01 DeJiang Zhu
2022-11-14  0:26 ` Simon Marchi [this message]
2022-11-14  3:58   ` DeJiang Zhu
2022-11-14 14:47     ` Simon Marchi
2022-11-15  1:34       ` DeJiang Zhu
