public inbox for binutils@sourceware.org
* Purpose of NULL SHN_ABS symbols for version script nodes?
@ 2021-10-15 14:07 Raul Tambre
  2021-10-17 13:54 ` Alan Modra
  0 siblings, 1 reply; 6+ messages in thread
From: Raul Tambre @ 2021-10-15 14:07 UTC (permalink / raw)
  To: binutils

[-- Attachment #1: Type: text/plain, Size: 1755 bytes --]

What is the reasoning behind GNU ld and gold generating NULL SHN_ABS symbols for 
version script nodes?

test.sym:

     LIBTEST_1
     {
     };

clang -shared /dev/null -o example.so -Wl,--version-script=test.sym -fuse-ld=ld
llvm-nm -D example.so:

     0000000000000000 A LIBTEST_1@@LIBTEST_1
                      w _ITM_deregisterTMCloneTable
                      w _ITM_registerTMCloneTable
                      w __cxa_finalize
                      w __gmon_start__

LLVM LLD does not. I stumbled upon this when building a Debian package (systemd)
that uses version scripts and whose packaging tracks symbols using
dpkg-gensymbols.

As a result I ended up reporting Debian bug #992796 
(<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796>) and developing a 
patch for dpkg-gensymbols to ignore these symbols. However the dpkg maintainer 
has requested I clarify this with binutils.

Whether the version node symbol exists can be detected by checking whether
dlerror() is non-NULL after a dlsym() lookup, but I'm not sure what this could
realistically be useful for. I contacted one of the LLD maintainers, Fangrui
Song, who was of the opinion that these symbols are useless. I have attached the
thread for reference with their permission.
They noted that they'd have expected to have received reports if this was made 
use of somewhere, since quite a few large projects use LLD these days.
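
The dlerror() check mentioned above needs some care. The version node symbol
has value 0, so dlsym() would be expected to return NULL whether or not the
symbol exists, and only dlerror() tells the two cases apart. A minimal sketch,
assuming glibc and a lookup in the global scope via RTLD_DEFAULT;
has_dynamic_symbol is a hypothetical helper name, not something from the thread
or from binutils:

```c
#define _GNU_SOURCE /* for RTLD_DEFAULT on glibc */

#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `name` resolves in the global scope,
 * 0 if the lookup fails. The pointer returned by dlsym() is ignored on
 * purpose, because a symbol whose value is 0 yields NULL even on success. */
static int has_dynamic_symbol(const char *name)
{
	assert(name != NULL);

	dlerror();                       /* clear any stale error state */
	(void)dlsym(RTLD_DEFAULT, name); /* only the error state matters here */
	return dlerror() == NULL;        /* no error means the symbol was found */
}
```

Called as has_dynamic_symbol("LIBTEST_1") in a program linked with the version
script from this message, this would be expected to report 1 with GNU ld or
gold and 0 with LLD. Older glibc versions may need -ldl when linking.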

The generation of these symbols seems to have been added in 
dbe717effbdf31236088837f4686fd5ad5e71893 along with the rest of version script 
support. Currently it's done here 
(<https://github.com/bminor/binutils-gdb/blob/f9ebf60b6ff54fddf7e34b1bd96775b62fb89a22/gold/symtab.cc#L1623>).
I haven't been able to locate the responsible piece of code in GNU ld.

[-- Attachment #2: Re  Version script node symbols ld vs lld - Fangrui Song (i@maskray.me) - 2021-09-19 0027.eml --]
[-- Type: message/rfc822, Size: 4973 bytes --]

From: Fangrui Song <i@maskray.me>
To: Raul Tambre <raul@tambre.ee>
Subject: Re: Version script node symbols ld vs lld
Date: Sat, 18 Sep 2021 14:27:49 -0700
Message-ID: <20210918212749.rhsrf5j2rtzfgjmp@gmail.com>

On 2021-09-18, Raul Tambre wrote:
>The dpkg maintainer hasn't quite agreed with the reasoning in my bug report.
>
>    <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796>
>
>I have attached an example that can be compiled using:
>clang dlsym.c -o dlsym -rdynamic -ldl -Wl,--version-script=dlsym.sym -fuse-ld=ld
>
>dlerror() returns an undefined symbol error for LIBTEST_1@@LIBTEST_1 
>if using LLD. While unlikely to ever cause any trouble, I think this 
>can be considered an ABI breaking difference. I'm forced to agree with 
>the maintainer that the current reasoning for ignoring them in 
>dpkg-gensymbols is invalid.

>They also found some relevant code in binutils.
>
>
><https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gold/symtab.cc;h=5a21ddc8cc2f560be109f755c3d2bf8a9ef9763c;hb=HEAD#l1618>

This is a .dynsym difference (SHN_ABS) between LLD and GNU linkers, but
I don't think any package relies on the property described by the comment.
(If they did, I think I'd have already received a report from
FreeBSD/Android/Chrome OS.)

The comment seems incorrect to me: -u is used to force archive member
extraction. However, archive files (unlinked) do not have SHN_ABS
definitions.

>I'm not too familiar with linker internals, so I don't comprehend, but 
>this behaviour does seem to be used for something. Would you be 
>willing/able to offer any further insights?
>
>May I also attach our email thread, once concluded, to the Debian bug 
>report for reference?

Sure:)

>#define _GNU_SOURCE
>
>#include <dlfcn.h>
>#include <stdio.h>
>
>void example()
>{
>}
>
>int main(int argc, char** argv)
>{
>	void* sym = dlsym(0, "LIBTEST_1");
>	void* vsym = dlvsym(0, "LIBTEST_1", "LIBTEST_1");
>	printf("LIBTEST_1 sym=%p vsym=%p error=%s\n", sym, vsym, dlerror());
>
>	sym = dlsym(0, "example");
>	vsym = dlvsym(0, "example", "LIBTEST_1");
>	printf("example sym=%p vsym=%p error=%s\n", sym, vsym, dlerror());
>}

>LIBTEST_1
>{
>global:
>  example;
>
>local:
>  *;
>};


[-- Attachment #3: Re  Version script node symbols ld vs lld - Raul Tambre (raul@tambre.ee) - 2021-08-30 1205.eml --]
[-- Type: message/rfc822, Size: 1282 bytes --]

From: Raul Tambre <raul@tambre.ee>
To: Fangrui Song <i@maskray.me>
Subject: Re: Version script node symbols ld vs lld
Date: Mon, 30 Aug 2021 12:05:12 +0300
Message-ID: <93bb4a94-b018-2e4c-f345-bb93f8a0e6b4@tambre.ee>

Thanks for the detailed answer and all the extra information!

> Thanks for the fix:)

I cleaned up and posted the fix for review on the dpkg-dev mailing list:
https://lists.debian.org/debian-dpkg/2021/08/msg00003.html

> [This appears to be unrelated to the SHN_ABS symbols.]
> 
> Technically the software can, but that won't make much sense. A new version,
> if it introduces new dynamic symbols, should attach them to the current
> version node, rather than an arbitrary old version node, which would confuse
> users.

That part was rather me trying to come up with possible uses for these. But I
agree, it'd be quite far-fetched and there's likely no actual software doing
such trickery.

[-- Attachment #4: Re  Version script node symbols ld vs lld - Raul Tambre (raul@tambre.ee) - 2021-09-18 2035.eml --]
[-- Type: message/rfc822, Size: 2909 bytes --]

[-- Attachment #4.1.1: Type: text/plain, Size: 1041 bytes --]

The dpkg maintainer hasn't quite agreed with the reasoning in my bug report.

     <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796>

I have attached an example that can be compiled using:
clang dlsym.c -o dlsym -rdynamic -ldl -Wl,--version-script=dlsym.sym -fuse-ld=ld

dlerror() returns an undefined symbol error for LIBTEST_1@@LIBTEST_1 if using 
LLD. While unlikely to ever cause any trouble, I think this can be considered an 
ABI breaking difference. I'm forced to agree with the maintainer that the 
current reasoning for ignoring them in dpkg-gensymbols is invalid.

They also found some relevant code in binutils.

 
<https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gold/symtab.cc;h=5a21ddc8cc2f560be109f755c3d2bf8a9ef9763c;hb=HEAD#l1618>

I'm not too familiar with linker internals, so I don't comprehend, but this 
behaviour does seem to be used for something. Would you be willing/able to offer 
any further insights?

May I also attach our email thread, once concluded, to the Debian bug report for 
reference?

[-- Attachment #4.1.2: dlsym.c --]
[-- Type: text/plain, Size: 413 bytes --]

#define _GNU_SOURCE

#include <dlfcn.h>
#include <stdio.h>

void example()
{
}

int main(int argc, char** argv)
{
	void* sym = dlsym(0, "LIBTEST_1");
	void* vsym = dlvsym(0, "LIBTEST_1", "LIBTEST_1");
	printf("LIBTEST_1 sym=%p vsym=%p error=%s\n", sym, vsym, dlerror());

	sym = dlsym(0, "example");
	vsym = dlvsym(0, "example", "LIBTEST_1");
	printf("example sym=%p vsym=%p error=%s\n", sym, vsym, dlerror());
}

[-- Attachment #4.1.3: dlsym.sym --]
[-- Type: text/plain, Size: 47 bytes --]

LIBTEST_1
{
global:
  example;

local:
  *;
};

[-- Attachment #5: Version script node symbols ld vs lld - Raul Tambre (raul@tambre.ee) - 2021-08-27 1455.eml --]
[-- Type: message/rfc822, Size: 2061 bytes --]

From: Raul Tambre <raul@tambre.ee>
To: i@maskray.me
Subject: Version script node symbols ld vs lld
Date: Fri, 27 Aug 2021 14:55:42 +0300
Message-ID: <bc403c7c-a77d-e389-8481-dd6eb6fd67ed@tambre.ee>

Given a version script "test.sym":

     LIBTEST_1
     {
     };

clang -shared /dev/null -o example.so -Wl,--version-script=test.sym -fuse-ld=ld
llvm-nm -D example.so:

     0000000000000000 A LIBTEST_1@@LIBTEST_1
                      w _ITM_deregisterTMCloneTable
                      w _ITM_registerTMCloneTable
                      w __cxa_finalize
                      w __gmon_start__

LLD however doesn't generate the "LIBTEST_1@@LIBTEST_1" symbol.

Debian's dpkg-gensymbols tracks such symbols. When Debian packages that have
symbols files, and whose upstream uses version scripts, are built using LLD,
there's typically a bunch of symbols that have gone "missing" (i.e. an ABI
break).

I eagerly filed Debian bug #992796 
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796> and prototyped a 
"fix" for dpkg 
<https://gitlab.com/cleveron/debian/dpkg/-/commit/1bbef85e82b355a87c549ba193149ba8350e04e1>.

However, technically functions may be added to older version nodes in newer 
versions. Say, in an alternative independently-versioned binary-compatible 
implementation of a library.
In such cases it would make sense and be useful to track the first appearance of 
a version node.

Would it be right to rather consider this a deficiency in LLD?

I trawled the binutils source code, but didn't manage to find the piece of code
responsible for generating those, never mind the reasoning or history.
I didn't read too thoroughly, but your symbol versioning blog post 
<https://maskray.me/blog/2020-11-26-all-about-symbol-versioning> doesn't seem to 
touch on this specific behaviour either.

Thanks in advance,
Raul

[-- Attachment #6: Re  Version script node symbols ld vs lld - Fangrui Song (i@maskray.me) - 2021-08-29 0454.eml --]
[-- Type: message/rfc822, Size: 5141 bytes --]

From: Fangrui Song <i@maskray.me>
To: Raul Tambre <raul@tambre.ee>
Subject: Re: Version script node symbols ld vs lld
Date: Sat, 28 Aug 2021 18:54:06 -0700
Message-ID: <20210829015406.odftzbmcbqhfvu2t@gmail.com>

On 2021-08-27, Raul Tambre wrote:
>Given a version script "test.sym":
>
>    LIBTEST_1
>    {
>    };
>
>clang -shared /dev/null -o example.so -Wl,--version-script=test.sym -fuse-ld=ld
>llvm-nm -D example.so:
>
>    0000000000000000 A LIBTEST_1@@LIBTEST_1
>                     w _ITM_deregisterTMCloneTable
>                     w _ITM_registerTMCloneTable
>                     w __cxa_finalize
>                     w __gmon_start__
>
>LLD however doesn't generate the "LIBTEST_1@@LIBTEST_1" symbol.

Right. The SHN_ABS symbol associated with a version node seems completely
useless to me. Its order is pretty arbitrary (readelf -W --dyn-syms
/lib/x86_64-linux-gnu/libc.so.6: GLIBC_2.3.3).
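
For readers who want to see where these entries sit without readelf, the
following is a rough sketch that prints the index, value and name of every
SHN_ABS entry in .dynsym. It assumes a 64-bit ELF file in the host's byte
order and trusts the section headers without bounds checks.
list_abs_dynsyms is a hypothetical name used only for illustration:

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: prints every SHN_ABS symbol in .dynsym.
 * Returns the number printed, or -1 if the file is not a readable ELF64. */
static int list_abs_dynsyms(const char *path)
{
	assert(path != NULL);

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	struct stat st;
	if (fstat(fd, &st) != 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
		close(fd);
		return -1;
	}

	unsigned char *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);
	if (base == MAP_FAILED)
		return -1;

	int count = -1;
	const Elf64_Ehdr *eh = (const Elf64_Ehdr *)base;
	if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
	    eh->e_ident[EI_CLASS] == ELFCLASS64) {
		const Elf64_Shdr *sh = (const Elf64_Shdr *)(base + eh->e_shoff);
		count = 0;
		for (int i = 0; i < eh->e_shnum; i++) {
			if (sh[i].sh_type != SHT_DYNSYM)
				continue;
			/* sh_link of .dynsym names its string table (.dynstr). */
			const char *strtab = (const char *)base + sh[sh[i].sh_link].sh_offset;
			const Elf64_Sym *syms = (const Elf64_Sym *)(base + sh[i].sh_offset);
			size_t n = sh[i].sh_size / sizeof(Elf64_Sym);
			for (size_t j = 1; j < n; j++) { /* entry 0 is the null symbol */
				if (syms[j].st_shndx != SHN_ABS)
					continue;
				printf("%zu: %016llx %s\n", j,
				       (unsigned long long)syms[j].st_value,
				       strtab + syms[j].st_name);
				count++;
			}
		}
	}

	munmap(base, st.st_size);
	return count;
}
```

The version itself is not printed here: it lives in .gnu.version, indexed in
parallel with .dynsym, which this sketch does not read.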

>Debian's dpkg-gensymbols tracks such symbols. When Debian packages
>that have symbols files, and whose upstream uses version scripts, are
>built using LLD, there's typically a bunch of symbols that have gone
>"missing" (i.e. an ABI break).
>I eagerly filed Debian bug #992796 
><https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796> and 
>prototyped a "fix" for dpkg <https://gitlab.com/cleveron/debian/dpkg/-/commit/1bbef85e82b355a87c549ba193149ba8350e04e1>.

Thanks for the fix:)

The SHN_ABS symbols don't convey extra information.
In the output of readelf -V, .gnu.version_d lists the used versions.
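
To illustrate that the version names are recoverable without the SHN_ABS
symbols, here is a rough sketch that walks .gnu.version_d (section type
SHT_GNU_verdef) and prints each defined version name. It assumes a 64-bit ELF
file in the host's byte order and does no bounds checking.
list_version_definitions is a hypothetical name used only for illustration:

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: prints the name of every version definition.
 * Returns the number printed, or -1 if the file is not a readable ELF64. */
static int list_version_definitions(const char *path)
{
	assert(path != NULL);

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	struct stat st;
	if (fstat(fd, &st) != 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
		close(fd);
		return -1;
	}

	unsigned char *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);
	if (base == MAP_FAILED)
		return -1;

	int count = -1;
	const Elf64_Ehdr *eh = (const Elf64_Ehdr *)base;
	if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
	    eh->e_ident[EI_CLASS] == ELFCLASS64) {
		const Elf64_Shdr *sh = (const Elf64_Shdr *)(base + eh->e_shoff);
		count = 0;
		for (int i = 0; i < eh->e_shnum; i++) {
			if (sh[i].sh_type != SHT_GNU_verdef)
				continue;
			/* sh_link names the string table, sh_info the entry count. */
			const char *strtab = (const char *)base + sh[sh[i].sh_link].sh_offset;
			const unsigned char *p = base + sh[i].sh_offset;
			for (Elf64_Word n = 0; n < sh[i].sh_info; n++) {
				const Elf64_Verdef *vd = (const Elf64_Verdef *)p;
				/* The first auxiliary entry holds the version's own name. */
				const Elf64_Verdaux *aux = (const Elf64_Verdaux *)(p + vd->vd_aux);
				printf("%s%s\n", strtab + aux->vda_name,
				       (vd->vd_flags & VER_FLG_BASE) ? " (base)" : "");
				count++;
				if (vd->vd_next == 0)
					break;
				p += vd->vd_next;
			}
		}
	}

	munmap(base, st.st_size);
	return count;
}
```

Run against the example.so from the first message, this would be expected to
list the base definition plus LIBTEST_1 regardless of which linker produced
the file.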

>However, technically functions may be added to older version nodes in 
>newer versions. Say, in an alternative independently-versioned 
>binary-compatible implementation of a library.
>In such cases it would make sense and be useful to track the first 
>appearance of a version node.

[This appears to be unrelated to the SHN_ABS symbols.]

Technically the software can, but that won't make much sense.
A new version, if it introduces new dynamic symbols, should attach them
to the current version node, rather than an arbitrary old version node,
which would confuse users.

>Would it be right to rather consider this a deficiency in LLD?

I don't think so. LLD just doesn't support the SHN_ABS symbols, which are
rather useless.

>I trawled the binutils source code, but didn't manage to find the
>piece of code responsible for generating those, never mind the
>reasoning or history.

Wow! I tried a bit but gave up.
GNU ld's version symbol code is rather messy...
gold has more compatibility problems than LLD...

>I didn't read too thoroughly, but your symbol versioning blog post 
><https://maskray.me/blog/2020-11-26-all-about-symbol-versioning> 
>doesn't seem to touch on this specific behaviour either.

Thanks! I'll mention such SHN_ABS symbols.

>Thanks in advance,
>Raul


Thread overview: 6+ messages
2021-10-15 14:07 Purpose of NULL SHN_ABS symbols for version script nodes? Raul Tambre
2021-10-17 13:54 ` Alan Modra
2021-10-18 15:00   ` H.J. Lu
2021-10-27 17:43   ` Raul Tambre
2021-10-27 17:57     ` H.J. Lu
2021-10-27 22:10     ` Alan Modra
