public inbox for binutils@sourceware.org
* Purpose of NULL SHN_ABS symbols for version script nodes?
@ 2021-10-15 14:07 Raul Tambre
  2021-10-17 13:54 ` Alan Modra
  0 siblings, 1 reply; 6+ messages in thread
From: Raul Tambre @ 2021-10-15 14:07 UTC
  To: binutils

[-- Attachment #1: Type: text/plain, Size: 1755 bytes --]

What is the reasoning behind GNU ld and gold generating NULL SHN_ABS symbols for 
version script nodes?

test.sym:

     LIBTEST_1
     {
     };

clang -shared /dev/null -o example.so -Wl,--version-script=test.sym -fuse-ld=ld
llvm-nm -D example.so:

     0000000000000000 A LIBTEST_1@@LIBTEST_1
                      w _ITM_deregisterTMCloneTable
                      w _ITM_registerTMCloneTable
                      w __cxa_finalize
                      w __gmon_start__

LLVM LLD does not. I stumbled upon this when building a Debian package (systemd) 
that uses version scripts and whose packaging tracks symbols using 
dpkg-gensymbols.

As a result I ended up reporting Debian bug #992796 
(<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796>) and developing a 
patch for dpkg-gensymbols to ignore these symbols. However the dpkg maintainer 
has requested I clarify this with binutils.
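
For readers unfamiliar with the symbols in question, the kind of filter a
symbol-tracking tool would need can be sketched in C. This is only an
illustration and not the actual dpkg patch: the helper name
is_version_node_symbol and its criteria (absolute section index, zero value,
symbol name equal to its version name) are assumptions derived from the
llvm-nm output above.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a dynamic symbol looks like the marker
   that GNU ld and gold emit for a version script node, 0 otherwise.
   `name` and `version` are the strings a tool would already have resolved
   from .dynstr and the version tables. */
int is_version_node_symbol(const Elf64_Sym *sym,
                           const char *name, const char *version)
{
    assert(sym != NULL && name != NULL && version != NULL);

    return sym->st_shndx == SHN_ABS      /* shown as 'A' by nm */
        && sym->st_value == 0            /* the "NULL" address */
        && strcmp(name, version) == 0;   /* e.g. LIBTEST_1@@LIBTEST_1 */
}
```

A tool could skip every symbol for which this returns 1 and would then see the
same symbol set regardless of which linker produced the library.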

Whether the version node symbol exists can be detected at runtime: after calling 
dlsym() on the node name, dlerror() is NULL if the symbol exists and non-NULL if 
it does not. I'm not sure what this could realistically be useful for, though. I 
contacted one of the LLD maintainers, Fangrui Song, who was of the opinion that 
these symbols are useless. I have attached the thread for reference with their 
permission.
They noted that they'd have expected to have received reports if this was made 
use of somewhere, since quite a few large projects use LLD these days.
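
The runtime detection mentioned above can be sketched as follows. This is a
minimal sketch under stated assumptions: the helper name has_dynamic_symbol is
made up for illustration, and the handle is expected to come from dlopen().
Because the version node symbols have the value 0, dlsym() returns NULL for
them even when they exist, so only dlerror() can tell the two cases apart.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `name` resolves through `handle`,
   0 otherwise.  The pointer returned by dlsym() is deliberately ignored,
   since a symbol whose value is 0 also yields NULL. */
int has_dynamic_symbol(void *handle, const char *name)
{
    assert(name != NULL);

    (void)dlerror();            /* clear any stale error state */
    (void)dlsym(handle, name);  /* the returned address is irrelevant here */
    return dlerror() == NULL;   /* no pending error means the lookup succeeded */
}
```

Given a handle for example.so and the name "LIBTEST_1", this would be expected
to return 1 for a library linked with GNU ld or gold and 0 for one linked with
LLD, going by the behaviour described in this message.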

The generation of these symbols seems to have been added in 
dbe717effbdf31236088837f4686fd5ad5e71893 along with the rest of version script 
support. Currently it's done here 
(<https://github.com/bminor/binutils-gdb/blob/f9ebf60b6ff54fddf7e34b1bd96775b62fb89a22/gold/symtab.cc#L1623>).
I haven't been able to locate the responsible piece of code in GNU ld.

[-- Attachment #2: Re  Version script node symbols ld vs lld - Fangrui Song (i@maskray.me) - 2021-09-19 0027.eml --]
[-- Type: message/rfc822, Size: 4973 bytes --]

From: Fangrui Song <i@maskray.me>
To: Raul Tambre <raul@tambre.ee>
Subject: Re: Version script node symbols ld vs lld
Date: Sat, 18 Sep 2021 14:27:49 -0700
Message-ID: <20210918212749.rhsrf5j2rtzfgjmp@gmail.com>

On 2021-09-18, Raul Tambre wrote:
>The dpkg maintainer hasn't quite agreed with the reasoning in my bug report.
>
>    <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796>
>
>I have attached an example that can be compiled using:
>clang dlsym.c -o dlsym -rdynamic -ldl -Wl,--version-script=dlsym.sym -fuse-ld=ld
>
>dlerror() returns an undefined symbol error for LIBTEST_1@@LIBTEST_1 
>if using LLD. While unlikely to ever cause any trouble, I think this 
>can be considered an ABI breaking difference. I'm forced to agree with 
>the maintainer that the current reasoning for ignoring them in 
>dpkg-gensymbols is invalid.

>They also found some relevant code in binutils.
>
>
><https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gold/symtab.cc;h=5a21ddc8cc2f560be109f755c3d2bf8a9ef9763c;hb=HEAD#l1618>

This is a .dynsym difference (SHN_ABS) between LLD and GNU linkers, but
I don't think any package leverages the property described by the comment.
(If they did, I think I'd have already received a report from FreeBSD/Android/Chrome OS.)

The comment seems incorrect to me: -u is used to force archive member
extraction. However, archive files (unlinked) do not have SHN_ABS
definitions.

>I'm not too familiar with linker internals, so I don't comprehend, but 
>this behaviour does seem to be used for something. Would you be 
>willing/able to offer any further insights?
>
>May I also attach our email thread, once concluded, to the Debian bug 
>report for reference?

Sure:)

>#define _GNU_SOURCE
>
>#include <dlfcn.h>
>#include <stdio.h>
>
>void example()
>{
>}
>
>int main(int argc, char** argv)
>{
>	void* sym = dlsym(0, "LIBTEST_1");
>	void* vsym = dlvsym(0, "LIBTEST_1", "LIBTEST_1");
>	printf("LIBTEST_1 sym=%p vsym=%p error=%s\n", sym, vsym, dlerror());
>
>	sym = dlsym(0, "example");
>	vsym = dlvsym(0, "example", "LIBTEST_1");
>	printf("example sym=%p vsym=%p error=%s\n", sym, vsym, dlerror());
>}

>LIBTEST_1
>{
>global:
>  example;
>
>local:
>  *;
>};


[-- Attachment #3: Re  Version script node symbols ld vs lld - Raul Tambre (raul@tambre.ee) - 2021-08-30 1205.eml --]
[-- Type: message/rfc822, Size: 1282 bytes --]

From: Raul Tambre <raul@tambre.ee>
To: Fangrui Song <i@maskray.me>
Subject: Re: Version script node symbols ld vs lld
Date: Mon, 30 Aug 2021 12:05:12 +0300
Message-ID: <93bb4a94-b018-2e4c-f345-bb93f8a0e6b4@tambre.ee>

Thanks for the detailed answer and all the extra information!

> Thanks for the fix:)

I cleaned up and posted the fix for review on the dpkg-dev mailing list:
https://lists.debian.org/debian-dpkg/2021/08/msg00003.html

> [This appears to be unrelated to the SHN_ABS symbols.]
> 
> Technically the software can, but that won't make much sense. A new version, 
> if it introduces new dynamic symbols, should attach them to the current version 
> node, rather than to an arbitrary old version node, which would confuse users.

That part was rather me trying to come up with possible uses for these. But I
agree, it'd be quite farfetched and there's likely no actual software doing
such trickery.

[-- Attachment #4: Re  Version script node symbols ld vs lld - Raul Tambre (raul@tambre.ee) - 2021-09-18 2035.eml --]
[-- Type: message/rfc822, Size: 2909 bytes --]

[-- Attachment #4.1.1: Type: text/plain, Size: 1041 bytes --]

The dpkg maintainer hasn't quite agreed with the reasoning in my bug report.

     <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796>

I have attached an example that can be compiled using:
clang dlsym.c -o dlsym -rdynamic -ldl -Wl,--version-script=dlsym.sym -fuse-ld=ld

dlerror() returns an undefined symbol error for LIBTEST_1@@LIBTEST_1 if using 
LLD. While unlikely to ever cause any trouble, I think this can be considered an 
ABI breaking difference. I'm forced to agree with the maintainer that the 
current reasoning for ignoring them in dpkg-gensymbols is invalid.

They also found some relevant code in binutils.

 
<https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gold/symtab.cc;h=5a21ddc8cc2f560be109f755c3d2bf8a9ef9763c;hb=HEAD#l1618>

I'm not too familiar with linker internals, so I don't comprehend, but this 
behaviour does seem to be used for something. Would you be willing/able to offer 
any further insights?

May I also attach our email thread, once concluded, to the Debian bug report for 
reference?

[-- Attachment #4.1.2: dlsym.c --]
[-- Type: text/plain, Size: 413 bytes --]

#define _GNU_SOURCE

#include <dlfcn.h>
#include <stdio.h>

void example()
{
}

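/* dlerror() returns NULL when no error is pending, and passing NULL to %s
   is undefined behaviour, so map that case to a printable string. */
static const char* last_error(void)
{
	const char* err = dlerror();
	return err ? err : "(none)";
}

int main(void)
{
	void* sym = dlsym(RTLD_DEFAULT, "LIBTEST_1");
	void* vsym = dlvsym(RTLD_DEFAULT, "LIBTEST_1", "LIBTEST_1");
	printf("LIBTEST_1 sym=%p vsym=%p error=%s\n", sym, vsym, last_error());

	sym = dlsym(RTLD_DEFAULT, "example");
	vsym = dlvsym(RTLD_DEFAULT, "example", "LIBTEST_1");
	printf("example sym=%p vsym=%p error=%s\n", sym, vsym, last_error());

	return 0;
}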

[-- Attachment #4.1.3: dlsym.sym --]
[-- Type: text/plain, Size: 47 bytes --]

LIBTEST_1
{
global:
  example;

local:
  *;
};

[-- Attachment #5: Version script node symbols ld vs lld - Raul Tambre (raul@tambre.ee) - 2021-08-27 1455.eml --]
[-- Type: message/rfc822, Size: 2061 bytes --]

From: Raul Tambre <raul@tambre.ee>
To: i@maskray.me
Subject: Version script node symbols ld vs lld
Date: Fri, 27 Aug 2021 14:55:42 +0300
Message-ID: <bc403c7c-a77d-e389-8481-dd6eb6fd67ed@tambre.ee>

Given a version script "test.sym":

     LIBTEST_1
     {
     };

clang -shared /dev/null -o example.so -Wl,--version-script=test.sym -fuse-ld=ld
llvm-nm -D example.so:

     0000000000000000 A LIBTEST_1@@LIBTEST_1
                      w _ITM_deregisterTMCloneTable
                      w _ITM_registerTMCloneTable
                      w __cxa_finalize
                      w __gmon_start__

LLD however doesn't generate the "LIBTEST_1@@LIBTEST_1" symbol.

Debian's dpkg-gensymbols tracks such symbols. When Debian packages that have 
symbols files, and whose upstream uses version scripts, are built using LLD, 
there's typically a bunch of symbols that have gone "missing" (i.e. an ABI break).

I eagerly filed Debian bug #992796 
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796> and prototyped a 
"fix" for dpkg 
<https://gitlab.com/cleveron/debian/dpkg/-/commit/1bbef85e82b355a87c549ba193149ba8350e04e1>.

However, technically functions may be added to older version nodes in newer 
versions. Say, in an alternative independently-versioned binary-compatible 
implementation of a library.
In such cases it would make sense and be useful to track the first appearance of 
a version node.

Would it be right to rather consider this a deficiency in LLD?

I trawled the binutils source code, but didn't manage to find the piece of code 
responsible for generating those, nevermind the reasoning or history.
I didn't read too thoroughly, but your symbol versioning blog post 
<https://maskray.me/blog/2020-11-26-all-about-symbol-versioning> doesn't seem to 
touch on this specific behaviour either.

Thanks in advance,
Raul

[-- Attachment #6: Re  Version script node symbols ld vs lld - Fangrui Song (i@maskray.me) - 2021-08-29 0454.eml --]
[-- Type: message/rfc822, Size: 5141 bytes --]

From: Fangrui Song <i@maskray.me>
To: Raul Tambre <raul@tambre.ee>
Subject: Re: Version script node symbols ld vs lld
Date: Sat, 28 Aug 2021 18:54:06 -0700
Message-ID: <20210829015406.odftzbmcbqhfvu2t@gmail.com>

On 2021-08-27, Raul Tambre wrote:
>Given a version script "test.sym":
>
>    LIBTEST_1
>    {
>    };
>
>clang -shared /dev/null -o example.so -Wl,--version-script=test.sym -fuse-ld=ld
>llvm-nm -D example.so:
>
>    0000000000000000 A LIBTEST_1@@LIBTEST_1
>                     w _ITM_deregisterTMCloneTable
>                     w _ITM_registerTMCloneTable
>                     w __cxa_finalize
>                     w __gmon_start__
>
>LLD however doesn't generate the "LIBTEST_1@@LIBTEST_1" symbol.

Right. The SHN_ABS symbol associated to a version node seems completely
useless to me. Its order is pretty arbitrary (readelf -W --dyn-syms
/lib/x86_64-linux-gnu/libc.so.6: GLIBC_2.3.3).

>Debian's dpkg-gensymbols tracks such symbols. When building Debian 
>packages with symbols files and whose upstream uses version scripts 
>files using LLD there's typically a bunch of symbols that have gone 
>"missing" (i.e. an ABI break).
>I eagerly filed Debian bug #992796 
><https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992796> and 
>prototyped a "fix" for dpkg <https://gitlab.com/cleveron/debian/dpkg/-/commit/1bbef85e82b355a87c549ba193149ba8350e04e1>.

Thanks for the fix:)

The SHN_ABS symbols don't convey extra information.
In the output of readelf -V, .gnu.version_d lists the used versions.
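
To illustrate that the version list is available without the SHN_ABS symbols,
here is a rough C sketch that scans the contents of .gnu.version_d for a given
version name. The helper name verdef_has_version and its parameters are
assumptions made for this example; it expects the raw section bytes, the
definition count (DT_VERDEFNUM) and the matching string table (.dynstr) to
have been located already, and it omits bounds checking.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a version definition named `version`
   appears in the .gnu.version_d contents at `verdef`, 0 otherwise.
   Note that the base definition (VER_FLG_BASE) carries the file name and
   is compared like any other entry. */
int verdef_has_version(const unsigned char *verdef, size_t count,
                       const char *strtab, const char *version)
{
    assert(verdef != NULL && strtab != NULL && version != NULL);

    const unsigned char *p = verdef;
    for (size_t i = 0; i < count; i++) {
        Elf64_Verdef def;
        Elf64_Verdaux aux;

        /* memcpy avoids alignment assumptions about the input buffer. */
        memcpy(&def, p, sizeof def);

        /* The first auxiliary entry holds the name of the version itself. */
        if (def.vd_cnt > 0) {
            memcpy(&aux, p + def.vd_aux, sizeof aux);
            if (strcmp(strtab + aux.vda_name, version) == 0)
                return 1;
        }

        if (def.vd_next == 0)   /* 0 marks the end of the chain */
            break;
        p += def.vd_next;
    }
    return 0;
}
```

For the example.so from the start of this thread, a lookup of "LIBTEST_1"
through such a helper should succeed with either linker, since the version
definition itself does not depend on the extra symbol.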

>However, technically functions may be added to older version nodes in 
>newer versions. Say, in an alternative independently-versioned 
>binary-compatible implementation of a library.
>In such cases it would make sense and be useful to track the first 
>appearance of a version node.

[This appears to be unrelated to the SHN_ABS symbols.]

Technically the software can, but that won't make much sense.
A new version, if it introduces new dynamic symbols, should attach them
to the current version node, rather than to an arbitrary old version node,
which would confuse users.

>Would it be right to rather consider this a deficiency in LLD?

I think no. LLD just doesn't support the SHN_ABS symbols which are
rather useless.

>I trawled the binutils source code, but didn't manage to find the 
>piece of code responsible for generating those, nevermind the 
>reasoning or history.

Wow! I tried a bit but gave up.
GNU ld's version symbol code is rather messy...
gold has more compatibility problems than LLD...

>I didn't read too thoroughly, but your symbol versioning blog post 
><https://maskray.me/blog/2020-11-26-all-about-symbol-versioning> 
>doesn't seem to touch on this specific behaviour either.

Thanks! I'll mention such SHN_ABS symbols.

>Thanks in advance,
>Raul


* Re: Purpose of NULL SHN_ABS symbols for version script nodes?
  2021-10-15 14:07 Purpose of NULL SHN_ABS symbols for version script nodes? Raul Tambre
@ 2021-10-17 13:54 ` Alan Modra
  2021-10-18 15:00   ` H.J. Lu
  2021-10-27 17:43   ` Raul Tambre
  0 siblings, 2 replies; 6+ messages in thread
From: Alan Modra @ 2021-10-17 13:54 UTC
  To: Raul Tambre; +Cc: binutils

On Fri, Oct 15, 2021 at 05:07:48PM +0300, Raul Tambre via Binutils wrote:
> What is the reasoning behind GNU ld and gold generating NULL SHN_ABS symbols
> for version script nodes?

Ulrich Drepper and Eric Youngdale were the designers of the GNU symbol
versioning scheme.  I didn't have anything to do with the design, so I
can only guess.  I suspect they are there so that a program can ask
"Does this library support version XYZ symbols?" at runtime without
knowing which set of symbols have a version XYZ.

> > I trawled the binutils source code, but didn't manage to find the piece
> > of code responsible for generating those, nevermind the reasoning or
> > history.

Search for BSF_GLOBAL in git commit d044b40a4095.  There isn't any
reasoning given there, just a comment
	/* Add a symbol representing this version.  */ 

-- 
Alan Modra
Australia Development Lab, IBM


* Re: Purpose of NULL SHN_ABS symbols for version script nodes?
  2021-10-17 13:54 ` Alan Modra
@ 2021-10-18 15:00   ` H.J. Lu
  2021-10-27 17:43   ` Raul Tambre
  1 sibling, 0 replies; 6+ messages in thread
From: H.J. Lu @ 2021-10-18 15:00 UTC
  To: Alan Modra; +Cc: Raul Tambre, Binutils

On Sun, Oct 17, 2021 at 6:54 AM Alan Modra via Binutils
<binutils@sourceware.org> wrote:
>
> On Fri, Oct 15, 2021 at 05:07:48PM +0300, Raul Tambre via Binutils wrote:
> > What is the reasoning behind GNU ld and gold generating NULL SHN_ABS symbols
> > for version script nodes?
>
> Ulrich Drepper and Eric Youngdale were the designers of the GNU symbol
> versioning scheme.  I didn't have anything to do with the design, so I
> can only guess.  I suspect they are there so that a program can ask
> "Does this library support version XYZ symbols?" at runtime without
> knowing which set of symbols have a version XYZ.

This may be useful to implement the glibc version check for:

https://sourceware.org/bugzilla/show_bug.cgi?id=27923

> > > I trawled the binutils source code, but didn't manage to find the piece
> > > of code responsible for generating those, nevermind the reasoning or
> > > history.
>
> Search for BSF_GLOBAL in git commit d044b40a4095.  There isn't any
> reasoning given there, just a comment
>         /* Add a symbol representing this version.  */
>


-- 
H.J.


* Re: Purpose of NULL SHN_ABS symbols for version script nodes?
  2021-10-17 13:54 ` Alan Modra
  2021-10-18 15:00   ` H.J. Lu
@ 2021-10-27 17:43   ` Raul Tambre
  2021-10-27 17:57     ` H.J. Lu
  2021-10-27 22:10     ` Alan Modra
  1 sibling, 2 replies; 6+ messages in thread
From: Raul Tambre @ 2021-10-27 17:43 UTC
  To: Alan Modra; +Cc: binutils

The LLD side is unwilling to implement this feature due to the lack of real-world 
use and because its purpose has to be guessed at even by a binutils maintainer.

Would there be willingness to accept a patch deprecating/removing this behaviour?

The alternatives, either me trying to convince the Debian side (dpkg) to accept 
deprecating this there, or many packages remaining unbuildable with LLD, 
don't seem great.


* Re: Purpose of NULL SHN_ABS symbols for version script nodes?
  2021-10-27 17:43   ` Raul Tambre
@ 2021-10-27 17:57     ` H.J. Lu
  2021-10-27 22:10     ` Alan Modra
  1 sibling, 0 replies; 6+ messages in thread
From: H.J. Lu @ 2021-10-27 17:57 UTC
  To: Raul Tambre; +Cc: Alan Modra, Binutils

On Wed, Oct 27, 2021 at 10:44 AM Raul Tambre via Binutils
<binutils@sourceware.org> wrote:
>
> LLD side is unwilling to implement this feature due to lack of real-world use
> and its purpose having to be guessed at even by a binutils maintainer.
>
> Would there be willingness to accept a patch deprecating/removing this behaviour?
>
> The alternative of me trying to convince the Debian side (dpkg) to accept
> deprecation of this there or many packages remaining unbuildable with LLD
> doesn't seem great.

I believe this is a useful feature.

-- 
H.J.


* Re: Purpose of NULL SHN_ABS symbols for version script nodes?
  2021-10-27 17:43   ` Raul Tambre
  2021-10-27 17:57     ` H.J. Lu
@ 2021-10-27 22:10     ` Alan Modra
  1 sibling, 0 replies; 6+ messages in thread
From: Alan Modra @ 2021-10-27 22:10 UTC
  To: Raul Tambre; +Cc: binutils

On Wed, Oct 27, 2021 at 08:43:42PM +0300, Raul Tambre wrote:
> LLD side is unwilling to implement this feature due to lack of real-world
> use and its purpose having to be guessed at even by a binutils maintainer.

Lack of real world use is an assertion that is yet to be proven true.

> Would there be willingness to accept a patch deprecating/removing this behaviour?

At some time in the future perhaps.  I'm not going to drop them just
now.  I'm perfectly happy for lld to find out whether they are really
needed.

-- 
Alan Modra
Australia Development Lab, IBM

