public inbox for elfutils@sourceware.org
From: Milian Wolff <mail@milianw.de>
To: elfutils-devel@sourceware.org, Mark Wielaard <mark@klomp.org>
Subject: Re: runtime validation of DT_SYMTAB lookups - why is there no DT_SYMSZ?
Date: Wed, 27 Jul 2022 13:38:08 +0200
Message-ID: <2563495.X3S6A8DOgW@milian-workstation>
In-Reply-To: <c750f01cc7dbd2145d7d6bbe29f65bf27d437c6e.camel@klomp.org>


On Tuesday, 26 July 2022 17:28:11 CEST Mark Wielaard wrote:
> Hi Milian,
> 
> On Mon, 2022-07-11 at 18:40 +0200, Milian Wolff wrote:
> > in heaptrack I have code to runtime attach to a program and then rewrite
> > the various rel / rela / jmprel tables to intercept calls to malloc &
> > friends.
> > 
> > This works, but now I have received a crash report for what seems to be an
> > invalid DSO file: The jmprel table contains an invalid entry which points
> > to an out-of-bounds symbol, leading to a crash when we try to look at the
> > symbol's name.
> > 
> > I would like to protect against this crash by detecting the invalid
> > symbols. But to do that, I would need to know the size of the symbol
> > table, which is much harder than I would have hoped:
> > 
> > We have:
> > 
> > ```
> > #define DT_SYMTAB	6		/* Address of symbol table */
> > #define DT_SYMENT	11		/* Size of one symbol table entry */
> > ```
> > 
> > But there is no `DT_SYMSZ` or similar, which we would need to validate
> > symbol indices. Am I overlooking something or is that really missing?
> > Does anyone know why? The other tables have that, e.g.:
> > 
> > ```
> > #define DT_PLTRELSZ	2		/* Size in bytes of PLT relocs */
> > #define DT_RELASZ	8		/* Total size of Rela relocs */
> > #define DT_STRSZ	10		/* Size of string table */
> > #define DT_RELSZ	18		/* Total size of Rel relocs */
> > ```
> > 
> > Why is this missing for the symtab?
> > 
> > The only viable alternative seems to be to mmap the file completely to
> > access the Elf header and then iterate over the Elf sections to query the
> > size of the SHT_DYNSYM section. This is pretty complicated, and costly.
> > Does anyone have a better solution that would allow me to validate symbol
> > indices?
> 
> I don't know why it is missing, but it is indeed a tricky issue. You
> really want to know the number of elements (or the size) of the symbol
> table, but it takes a little gymnastics to get that.

Thanks for confirming that this isn't available currently. Would it be 
possible to add it? What's the process for standardization here? I guess it 
would take a very long time, but it seems to me that it would be beneficial 
in the long term.
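
For reference, one workaround that is commonly used at runtime in the absence of a `DT_SYMSZ` is to derive the symbol count from the hash tables referenced by `DT_HASH` or `DT_GNU_HASH`. This is not something confirmed in this thread, so treat the following as a minimal sketch: the function names are hypothetical, the `DT_HASH` variant assumes 32-bit hash words (a few 64-bit targets use 64-bit words), and both variants trust the hash table itself, which may also be damaged in a corrupt DSO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* DT_HASH layout: nbucket, nchain, bucket[nbucket], chain[nchain].
 * nchain equals the number of symbol table entries. */
static size_t symcount_from_sysv_hash(const uint32_t *hash)
{
    assert(hash != NULL);
    return hash[1];
}

/* DT_GNU_HASH layout: nbuckets, symoffset, bloom_size, bloom_shift,
 * bloom[bloom_size], buckets[nbuckets], chain[].
 * bloom_word_size is 4 for ELF32 and 8 for ELF64. */
static size_t symcount_from_gnu_hash(const uint32_t *gnu_hash,
                                     size_t bloom_word_size)
{
    assert(gnu_hash != NULL);
    uint32_t nbuckets = gnu_hash[0];
    uint32_t symoffset = gnu_hash[1];
    uint32_t bloom_size = gnu_hash[2];
    const unsigned char *after_header = (const unsigned char *)(gnu_hash + 4);
    const uint32_t *buckets =
        (const uint32_t *)(after_header + (size_t)bloom_size * bloom_word_size);
    const uint32_t *chain = buckets + nbuckets;

    /* Symbols are sorted by bucket, so the highest bucket value starts
     * the chain that contains the last symbol. */
    uint32_t last = 0;
    for (uint32_t i = 0; i < nbuckets; ++i)
        if (buckets[i] > last)
            last = buckets[i];
    if (last < symoffset)
        return symoffset; /* no hashed symbols, only the unhashed prefix */

    /* Bit 0 of a chain entry marks the end of its chain. */
    while ((chain[last - symoffset] & 1u) == 0)
        ++last;
    return (size_t)last + 1;
}
```

With a count in hand, a relocation's symbol index can be rejected whenever it is greater than or equal to that count, before the symbol's name is looked up.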

> Di Chen recently
> (or actually not that recently, I just still haven't reviewed, sorry!)
> posted a patch for
> https://sourceware.org/bugzilla/show_bug.cgi?id=28873 to print out the
> symbols from the dynamic segment
> https://sourceware.org/pipermail/elfutils-devel/2022q2/005086.html

Interesting. But from what I can tell, this patch has access to the full ELF 
object and thus can access parts of the file which are not normally loaded at runtime?
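
To make the file-based alternative from my original message concrete (map the whole file, walk the section headers, read the size of the `SHT_DYNSYM` section), here is a rough sketch. It assumes the Linux `<elf.h>` definitions, handles ELF64 only, ignores the extended numbering used when a file has a very large number of sections, and the function name is hypothetical:

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the number of entries in the SHT_DYNSYM section of an ELF64 file
 * image, or 0 if the image is malformed or has no such section. */
static size_t dynsym_count_from_sections(const unsigned char *image, size_t size)
{
    assert(image != NULL);

    Elf64_Ehdr ehdr;
    if (size < sizeof ehdr)
        return 0;
    memcpy(&ehdr, image, sizeof ehdr);

    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0
        || ehdr.e_ident[EI_CLASS] != ELFCLASS64
        || ehdr.e_shentsize != sizeof(Elf64_Shdr)
        || ehdr.e_shoff > size)
        return 0;

    for (size_t i = 0; i < ehdr.e_shnum; ++i) {
        /* Bounds-check every header before copying it out of the image. */
        uint64_t off = ehdr.e_shoff + (uint64_t)i * sizeof(Elf64_Shdr);
        if (off > size || size - off < sizeof(Elf64_Shdr))
            return 0;

        Elf64_Shdr shdr;
        memcpy(&shdr, image + off, sizeof shdr);
        if (shdr.sh_type == SHT_DYNSYM && shdr.sh_entsize != 0)
            return (size_t)(shdr.sh_size / shdr.sh_entsize);
    }
    return 0;
}
```

The section header table is usually not covered by any loaded segment, which is why this needs the file on disk rather than the in-memory image of the DSO.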

> > PS: eu-elflint reports this for the broken DSOs e.g.:
> > ```
> > $ eu-elflint libQt5Qml.so.5.12
> > section [ 3] '.dynsym': symbol 1272: st_value out of bounds
> > section [ 3] '.dynsym': symbol 3684: st_value out of bounds
> > section [29] '.symtab': _GLOBAL_OFFSET_TABLE_ symbol size 0 does not match .got section size 18340
> > section [29] '.symtab': _DYNAMIC symbol size 0 does not match dynamic segment size 336
> > section [29] '.symtab': symbol 25720: st_value out of bounds
> > section [29] '.symtab': symbol 27227: st_value out of bounds
> > ```
> > 
> > Does anyone know how this can happen? Is this a bug in the toolchain?
> 
> Try with eu-elflint --gnu which suppresses some known issues.

Indeed, with `--gnu` the tool reports `No errors`.

> Also could you show those symbol values (1272, 3684, 25720, 27227) they
> might have a special type, so their st_value isn't really an address?

```
$ eu-readelf -s libQt5Qml.so.5.12.0 | grep -E "^\s*(1272|3684|25720|27227):"
 1272: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start__@@Qt_5
 3684: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start@@Qt_5
 1272: 003ccc4c      0 NOTYPE  LOCAL  DEFAULT       17 $d
 3684: 003cbfec      0 NOTYPE  LOCAL  DEFAULT       17 $d
25720: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start
27227: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start__
```

The first two matches come from the `.dynsym`, the last four come from 
`.symtab`.

Can anyone tell me how `eu-readelf` resolves these symbol names?

Thanks

-- 
Milian Wolff
mail@milianw.de
http://milianw.de


Thread overview: 5+ messages
2022-07-11 16:40 Milian Wolff
2022-07-26 15:28 ` Mark Wielaard
2022-07-27 11:38   ` Milian Wolff [this message]
2022-07-28 16:41     ` Mark Wielaard
2022-08-28  6:41       ` Jacob Burkholder
