public inbox for gdb-patches@sourceware.org
From: Kevin Buettner <kevinb@redhat.com>
To: Tom Tromey <tom@tromey.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] Throw error when creating an overly large gdb-index file
Date: Tue, 12 Sep 2023 19:43:23 -0700
Message-ID: <20230912194323.7537de96@f37-zws-nv>
In-Reply-To: <87o7i7jrxm.fsf@tromey.com>

On Tue, 12 Sep 2023 10:07:49 -0600
Tom Tromey <tom@tromey.com> wrote:

> >>>>> "Kevin" == Kevin Buettner via Gdb-patches <gdb-patches@sourceware.org> writes:  
> 
> Kevin> The header in a .gdb_index section uses 32-bit unsigned offsets to
> Kevin> refer to other areas of the section.  Thus, there is a size limit of
> Kevin> 2^32-1 which is currently unaccounted for by GDB's code for outputting
> Kevin> these sections.  
> ...
> Kevin> This commit prevents the internal error from occurring by calling error()
> Kevin> when the file size exceeds 2^32-1.  
> 
> I don't have a problem with this but... why is the section too long?
> I wonder if this is covering up some other bug, like maybe not enough
> symbol de-duplication.

The problematic shared object is named libgraph_tool_inference.so and
was created while building the python-graph-tool package on Fedora.

The shared object is 3 GiB in size and the gdb-index section for it is
4.3 GiB in size.

objdump -h shows that the largest section is .debug_str with a size
of 0x81fe181e, which is a little over 2 GiB.  The next largest is
.debug_info with a size of 0x19691516 or about 406.6 MiB.  Here is
some of the objdump -h output:

libgraph_tool_inference.so:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
[...]
 28 .debug_aranges 00004d70  0000000000000000  0000000000000000  083dd180  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 29 .debug_info   19691516  0000000000000000  0000000000000000  083e1ef0  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 30 .debug_abbrev 0012e28b  0000000000000000  0000000000000000  21a73406  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 31 .debug_line   03b842f5  0000000000000000  0000000000000000  21ba1691  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 32 .debug_str    81fe181e  0000000000000000  0000000000000000  25725986  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 33 .debug_line_str 0000656b  0000000000000000  0000000000000000  a77071a4  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 34 .debug_loclists 090b9227  0000000000000000  0000000000000000  a770d70f  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 35 .debug_rnglists 01e6cd27  0000000000000000  0000000000000000  b07c6936  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
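
For reference, the hex sizes in the listing above can be converted to
MiB/GiB with a few lines of Python.  This is only a convenience sketch:
the table is transcribed by hand from the objdump -h output, and
human_size is a made-up helper name, not part of any tool.

```python
# Section sizes transcribed by hand from the objdump -h output above.
SECTION_SIZES = {
    ".debug_aranges": 0x00004d70,
    ".debug_info": 0x19691516,
    ".debug_abbrev": 0x0012e28b,
    ".debug_line": 0x03b842f5,
    ".debug_str": 0x81fe181e,
    ".debug_line_str": 0x0000656b,
    ".debug_loclists": 0x090b9227,
    ".debug_rnglists": 0x01e6cd27,
}

MIB = 1 << 20
GIB = 1 << 30


def human_size(nbytes):
    """Format a byte count in GiB or MiB (hypothetical helper)."""
    if nbytes >= GIB:
        return f"{nbytes / GIB:.2f} GiB"
    return f"{nbytes / MIB:.1f} MiB"


if __name__ == "__main__":
    # Print the sections from largest to smallest.
    for name, size in sorted(SECTION_SIZES.items(), key=lambda kv: -kv[1]):
        print(f"{name:<16} {size:#010x}  {human_size(size)}")
```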

I extracted .debug_str using objcopy, replaced the \0 characters
with newlines, and then ran sort and uniq on that output.  There
were no duplicate lines.  Altogether, there are 1,212,620 strings
in .debug_str.
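
A rough Python equivalent of that duplicate check might look like the
following.  It assumes the section contents have already been dumped
to a file (for example with objcopy's --dump-section option; the exact
command used for the numbers above is not stated), and the function
and file names are hypothetical.

```python
from collections import Counter


def count_strings(data):
    """Return (total, duplicated) for a table of NUL-terminated strings."""
    pieces = data.split(b"\0")
    # The final terminator leaves one empty trailing piece; drop it.
    if pieces and pieces[-1] == b"":
        pieces.pop()
    counts = Counter(pieces)
    duplicated = sum(1 for n in counts.values() if n > 1)
    return len(pieces), duplicated


def count_strings_in_file(path):
    """Same check for a dumped section; reads the whole file into memory."""
    with open(path, "rb") as f:
        return count_strings(f.read())


if __name__ == "__main__":
    # Tiny in-memory sample; for a real dump one would call
    # count_strings_in_file("debug_str.bin") instead (assumed file name).
    total, duplicated = count_strings(b"int\0char\0int\0")
    print(f"{total} strings, {duplicated} appearing more than once")
```

For a 2 GiB section this holds everything in memory at once, so it is
a sketch of the idea rather than something tuned for that size.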

I can't say that I fully understand the layout of the constant pool,
but it appears that all of the strings from .debug_str will end up in
the latter part of it.  So, if I'm right, that alone accounts for
nearly half the size of the index file.
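
The "nearly half" estimate and the 32-bit limit can be checked with
simple arithmetic.  In this sketch the index size is only the rounded
4.3 GiB figure quoted above, so the result is approximate.

```python
GIB = 1 << 30
UINT32_MAX = 2**32 - 1       # largest value a 32-bit unsigned offset can hold

debug_str_size = 0x81fe181e  # .debug_str size from objdump -h
index_size = int(4.3 * GIB)  # approximate: only "4.3 GiB" is given

# Share of the index that the .debug_str strings alone would occupy.
fraction = debug_str_size / index_size

print(f".debug_str is about {fraction:.0%} of the index")
print(f"index exceeds the 32-bit limit: {index_size > UINT32_MAX}")
```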

Kevin

