From mboxrd@z Thu Jan 1 00:00:00 1970
From: Milian Wolff
To: elfutils-devel@sourceware.org, Mark Wielaard
Subject: Re: runtime validation of DT_SYMTAB lookups - why is there no DT_SYMSZ?
Date: Wed, 27 Jul 2022 13:38:08 +0200
Message-ID: <2563495.X3S6A8DOgW@milian-workstation>
References: <2825590.45ddzSUfD6@milian-workstation>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="nextPart4374081.9GJbTIGYf8"; micalg="pgp-sha256"; protocol="application/pgp-signature"
List-Id: Elfutils-devel mailing list

--nextPart4374081.9GJbTIGYf8
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"; protected-headers="v1"

On Tuesday, 26 July 2022 17:28:11 CEST Mark Wielaard wrote:
> Hi Milian,
>
> On Mon, 2022-07-11 at 18:40 +0200, Milian Wolff wrote:
> > in heaptrack I have code to runtime attach to a program and then rewrite the
> > various rel / rela / jmprel tables to intercept calls to malloc & friends.
> >
> > This works, but now I have received a crash report for what seems to be an
> > invalid DSO file: The jmprel table contains an invalid entry which points to
> > an out-of-bounds symbol, leading to a crash when we try to look at the
> > symbol's name.
> >
> > I would like to protect against this crash by detecting the invalid symbols.
> > But to do that, I would need to know the size of the symbol table, which is
> > much harder than I would have hoped:
> >
> > We have:
> >
> > ```
> > #define DT_SYMTAB       6       /* Address of symbol table */
> > #define DT_SYMENT       11      /* Size of one symbol table entry */
> > ```
> >
> > But there is no `DT_SYMSZ` or similar, which we would need to validate symbol
> > indices. Am I overlooking something or is that really missing? Does anyone
> > know why? The other tables have that, e.g.:
> >
> > ```
> > #define DT_PLTRELSZ     2       /* Size in bytes of PLT relocs */
> > #define DT_RELASZ       8       /* Total size of Rela relocs */
> > #define DT_STRSZ        10      /* Size of string table */
> > #define DT_RELSZ        18      /* Total size of Rel relocs */
> > ```
> >
> > Why is this missing for the symtab?
> >
> > The only viable alternative seems to be to mmap the file completely to access
> > the Elf header and then iterate over the Elf sections to query the size of the
> > SHT_DYNSYM section. This is pretty complicated, and costly. Does anyone have a
> > better solution that would allow me to validate symbol indices?
>
> I don't know why it is missing, but it is indeed a tricky issue. You
> really want to know the number of elements (or the size) of the symbol
> table, but it takes a little gymnastics to get that.
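
For reference, one commonly used form of such gymnastics is to derive the
number of `.dynsym` entries from the hash tables in the dynamic segment, which
are mapped at runtime. In a `DT_HASH` table the second word (`nchain`) equals
the number of symbol table entries. A `DT_GNU_HASH` table does not store the
count, but it can be recovered by finding the highest bucket start index and
walking that chain to its end marker. The following is a minimal sketch only:
the function names are made up for illustration (they are not heaptrack or
elfutils code), and it assumes the hash table itself is intact, which a broken
DSO does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from heaptrack or elfutils. */

/* DT_HASH layout: nbucket, nchain, bucket[nbucket], chain[nchain].
 * nchain equals the number of symbol table entries. */
static size_t sysv_hash_symbol_count(const uint32_t *hash)
{
    return hash[1];
}

/* DT_GNU_HASH layout: nbuckets, symoffset, bloom_size, bloom_shift,
 * bloom[bloom_size] (address-sized words), buckets[nbuckets], chain[].
 * chain[i] belongs to symbol index symoffset + i; bit 0 marks the end
 * of a chain. bloom_word_bytes is 4 for ELF32 and 8 for ELF64. */
static size_t gnu_hash_symbol_count(const uint32_t *gnu_hash,
                                    size_t bloom_word_bytes)
{
    assert(bloom_word_bytes == 4 || bloom_word_bytes == 8);

    const uint32_t nbuckets = gnu_hash[0];
    const uint32_t symoffset = gnu_hash[1];
    const uint32_t bloom_size = gnu_hash[2];
    /* gnu_hash[3] is bloom_shift, which is not needed for counting. */

    const uint32_t *buckets = gnu_hash + 4 + bloom_size * (bloom_word_bytes / 4);
    const uint32_t *chain = buckets + nbuckets;

    /* Each bucket stores the lowest symbol index of its chain (0 if empty),
     * so the largest bucket value starts the last chain in the table. */
    uint32_t last = 0;
    for (uint32_t i = 0; i < nbuckets; ++i) {
        if (buckets[i] > last)
            last = buckets[i];
    }

    /* No hashed symbols at all: only the unhashed ones below symoffset exist. */
    if (last < symoffset)
        return symoffset;

    /* Walk the last chain until the entry with the end-of-chain bit set. */
    while ((chain[last - symoffset] & 1) == 0)
        ++last;

    return (size_t)last + 1;
}
```

A relocation's symbol index, e.g. `ELF64_R_SYM(rela->r_info)`, could then be
rejected when it is not smaller than the returned count. Note that this only
moves the trust from the relocation table to the hash table.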

Thanks for confirming that this isn't available currently. Would it be possible
to add this? What's the process for standardization here? I guess it would take
a very long time, yet it seems to me that it would be beneficial in the long
term.

> Di Chen recently
> (or actually not that recently, I just still haven't reviewed, sorry!)
> posted a patch for
> https://sourceware.org/bugzilla/show_bug.cgi?id=28873 to print out the
> symbols from the dynamic segment
> https://sourceware.org/pipermail/elfutils-devel/2022q2/005086.html

Interesting. But from what I can tell, this patch has access to the full Elf
object and thus can access sections which are not normally loaded at runtime?

> > PS: eu-elflint reports this for the broken DSOs e.g.:
> > ```
> > $ eu-elflint libQt5Qml.so.5.12
> > section [ 3] '.dynsym': symbol 1272: st_value out of bounds
> > section [ 3] '.dynsym': symbol 3684: st_value out of bounds
> > section [29] '.symtab': _GLOBAL_OFFSET_TABLE_ symbol size 0 does not match .got section size 18340
> > section [29] '.symtab': _DYNAMIC symbol size 0 does not match dynamic segment size 336
> > section [29] '.symtab': symbol 25720: st_value out of bounds
> > section [29] '.symtab': symbol 27227: st_value out of bounds
> > ```
> >
> > Does anyone know how this can happen? Is this a bug in the toolchain?
>
> Try with eu-elflint --gnu which suppresses some known issues.

Indeed, with `--gnu` the tool reports `No errors`.

> Also could you show those symbol values (1272, 3684, 25720, 27227) they
> might have a special type, so their st_value isn't really an address?

```
$ eu-readelf -s libQt5Qml.so.5.12.0 | grep -E "^\s*(1272|3684|25720|27227):"
 1272: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start__@@Qt_5
 3684: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start@@Qt_5
 1272: 003ccc4c      0 NOTYPE  LOCAL  DEFAULT       17 $d
 3684: 003cbfec      0 NOTYPE  LOCAL  DEFAULT       17 $d
25720: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start
27227: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start__
```

The first two matches come from the `.dynsym`, the last four come from `.symtab`.
Can anyone tell me how `eu-readelf` resolves these symbol names?

Thanks
-- 
Milian Wolff
mail@milianw.de
http://milianw.de
--nextPart4374081.9GJbTIGYf8
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part.
Content-Transfer-Encoding: 7Bit

-----BEGIN PGP SIGNATURE-----

iQIzBAABCAAdFiEEezawi1aUvUGg3A1+8zYW/HGdOX8FAmLhI6AACgkQ8zYW/HGd
OX+WHw/+JSJW20ONo3T1/ySmN8iP2m8jq/RKLrUTIS5C4dAdkQJr/+Bf5Es0ZiDy
ghjIwQhu19M9fn3Z7MN+/fbyir94TxEScd5GGHFsSc7WfBw9/mdhQwXL9yPxk1rj
GHgxEr8Cei5yB0VEOtLdg+DkBfBCMExEXlIPhxZW3yHT1PjDIHTbontO8gKnfOgx
DtA9K04kyyIMqzS3dSwgr7tkhkp9BLldHcIp8j/qfU3iYcj8YvS+GMM8TsCeW/a3
HvbnjQPde/07aVb4AvLKXFkVKiUbcwXQTtAXK10s1CUgjLmHlEUppahZnP1vOgWo
yLB1WYbi/QqrXINalWCnU4o+PjbJVfXrTBgay/4FFaVnAeR6BkOS+u823kzM85bF
NztFL+N549irw+BcWVat4qMihvv/ZXBCyGdl0ldgkmXsG+iKa0LbJOyRDUfKuzlz
OO/nZvJm5twSmew5HHbcoQvVh4daOT9FBDG8aAuhG/OOjA8Q14fLn4eQFmbYxB1a
foJGkgosUrd80YhF1I9bGu8oKR1ciHMZarfRtNpYH7XozxtmSInMNWggxxAc/IQ7
TKtNK8pF1v/cDtZG8CFDyk8iW3X1LShzvd/0eDqRizxaDg4ePQa7iNecpV8FqdCY
k97l1dLa58pOFi5RC8x2kfvPaJfCxCbHbtr1OuowEHHJqbagxZ4=
=jCJg
-----END PGP SIGNATURE-----

--nextPart4374081.9GJbTIGYf8--