From: Fangrui Song <maskray@google.com>
To: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Florian Weimer <fweimer@redhat.com>, libc-alpha@sourceware.org
Subject: Re: [PATCH] dlsym: Make RTLD_NEXT prefer default version definition [#BZ #14932]
Date: Fri, 27 May 2022 15:04:08 -0700 [thread overview]
Message-ID: <20220527220408.djhp3nz4aalpnsua@google.com> (raw)
In-Reply-To: <fd2ff169-536d-9af2-5bfa-c7a9d69dba23@linaro.org>
On 2022-05-27, Adhemerval Zanella wrote:
>
>
>On 27/05/2022 18:03, Fangrui Song via Libc-alpha wrote:
>> On 2022-05-27, Florian Weimer wrote:
>>> * Fangrui Song:
>>>
>>>> I suspect the code (from 199x) has a hidden assumption that for a stem
>>>> symbol (say foo), the .dynsym entries (say foo@v1, foo@v2, foo@@v3) are
>>>> ordered by the version node indices (in .gnu.version_d).
>>>
>>> Right, the code clearly has this assumption. There has to be some
>>> ordering because we don't want to do a topological sort to find the
>>> lowest and highest versions.
>>>
>>>> If so:
>>>>
>>>> * (flags & DL_LOOKUP_RETURN_NEWEST) == 0 => the symbol with
>>>> verstab[symidx]==2 will skip the check and be returned
>>>> * (flags & DL_LOOKUP_RETURN_NEWEST) != 0 => only the default version
>>>> symbol gets assigned in `*versioned_sym = sym`
>>>>
>>>> Unfortunately, this assumption hold for none of GNU ld, gold, and
>>>> ld.lld (seems unrelated to --hash-style=):
>>>>
>>>> ```
>>>> % ld.bfd @response.txt && readelf -W --dyn-syms nextmod3.so | grep foo@
>>>> 5: 0000000000001100 6 FUNC GLOBAL DEFAULT 13 foo@v1
>>>> 8: 0000000000001120 6 FUNC GLOBAL DEFAULT 13 foo@@v3
>>>> 9: 0000000000001110 6 FUNC GLOBAL DEFAULT 13 foo@v2
>>>> % gold @response.txt && readelf -W --dyn-syms nextmod3.so | grep foo@
>>>> 7: 00000000000007c0 6 FUNC GLOBAL DEFAULT 14 foo@@v3
>>>> 9: 00000000000007a0 6 FUNC GLOBAL DEFAULT 14 foo@v1
>>>> 10: 00000000000007b0 6 FUNC GLOBAL DEFAULT 14 foo@v2
>>>> % ld.lld @response.txt && readelf -W --dyn-syms nextmod3.so | grep foo@
>>>> 7: 0000000000001740 6 FUNC GLOBAL DEFAULT 13 foo@@v3
>>>> 8: 0000000000001720 6 FUNC GLOBAL DEFAULT 13 foo@v1
>>>> 9: 0000000000001730 6 FUNC GLOBAL DEFAULT 13 foo@v2
>>>
>>> The order of symbols in the symbol table is not what counts here, it's
>>> the order in the list for the hash bucket.
>>>
>>> I'm not sure if it's possible to dump this with readelf. I'm writing a
>>> custom dumper for this.
>>>
>>> Thanks,
>>> Florian
>>>
>> llvm-readelf --gnu-hash-table can dump the chain array (`Values:`), with
>> the size equal to the number of defined symbols. The odd numbers
>> indicate the end of a chain.
>>
>> I believe the .gnu.hash lookup order is in increasing symbol index (in
>> all of musl, glibc, FreeBSD rtld).
>> So the glibc (flags & DL_LOOKUP_RETURN_NEWEST) == 0 code path (for
>> relocations) is wrong...
OK, I was wrong. Re-reading, I think the code is correct but confusing.
The following example tests a relocation (flags = DL_LOOKUP_ADD_DEPENDENCY |
DL_LOOKUP_FOR_RELOCATE; note there is no DL_LOOKUP_RETURN_NEWEST):
printf > ./a.c %s '
#define _GNU_SOURCE // workaround before 748df8126ac69e68e0b94e236ea3c2e11b1176cb
#include <dlfcn.h>
#include <stdio.h>
int foo(int a);
int main(void) {
int res = foo (0);
printf ("foo (0) = %d, %s\n", res, res == 0 ? "ok" : "wrong");
/* Resolve to foo@@v3 in nextmod3.so, instead of
foo@v1 or foo@v2. */
int (*fp) (int) = dlsym (RTLD_DEFAULT, "foo");
res = fp (0);
printf ("foo (0) = %d, %s\n", res, res == 3 ? "ok" : "wrong");
/* Resolve to foo@@v3 in nextmod3.so, instead of
foo@v1 or foo@v2. */
fp = dlsym (RTLD_NEXT, "foo");
res = fp (0);
printf ("foo (0) = %d, %s\n", res, res == 3 ? "ok" : "wrong");
}
'
echo 'int foo(int a) { return -1; }' > ./b0.c
printf > ./b.c %s '
int foo_va(int a) { return 0; } asm(".symver foo_va, foo@va, remove");
int foo_v1(int a) { return 1; } asm(".symver foo_v1, foo@v1, remove");
int foo_v2(int a) { return 2; } asm(".symver foo_v2, foo@v2, remove");
int foo(int a) { return 3; } asm(".symver foo, foo@@@v3");
'
echo 'va {}; v1 {} va; v2 {} v1; v3 {} v2;' > ./b.ver
sed 's/^ /\t/g' > ./Makefile <<'eof'
.MAKE.MODE := meta curdirOk=1
CFLAGS = -fpic -g
a: a.o b0.so b.so
$(CC) -Wl,--no-as-needed a.o b0.so -ldl -Wl,-rpath=$$PWD -o $@
b0.so: b0.o
$(CC) $> -shared -Wl,--soname=b.so -o $@
b.so: b.o
$(CC) $> -shared -Wl,--soname=b.so,--version-script=b.ver -o $@
clean:
rm -f a *.so *.o *.meta
eof
Use `bmake` to build (meta mode ensures changing the command line will cause a re-build).
It will be nice the enhance tst-next-ver.c with the `foo` relocation
test but my glibc-Makefile-fu isn't great...
>
>My understanding is (flags & DL_LOOKUP_RETURN_NEWEST) == 0 is only used for
>dlvsym and I think we never documented properly hwo dlvsym (handle, symbol, "")
>should behave. Currently it is different than dlsym (handle, symbol) (it will
>return the a versioned symbol), while with your patch it will be similar to dlsym.
It's for both dlvsym and relocation resolving.
https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic.html
"10.7.6 Symbol Resolution" says:
> The object with the reference does not use versioning, while the object
> with the defin-
> itions does. In this instance, only the definitions with index numbers 1
> and 2 will be
> used in the reference match, the same identified by the static linker as
> the base defini-
> tion. In cases where the static linker was not used, such as in calls to
> dlopen(), a
> version that does not have the base definition index shall be acceptable
> if it is the only
> version for which the symbol is defined.
% readelf -WV b.so
...
Version definition section '.gnu.version_d' contains 5 entries:
Addr: 0x0000000000000374 Offset: 0x000374 Link: 6 (.dynstr)
000000: Rev: 1 Flags: BASE Index: 1 Cnt: 1 Name: b.so
0x001c: Rev: 1 Flags: none Index: 2 Cnt: 1 Name: va
0x0038: Rev: 1 Flags: none Index: 3 Cnt: 1 Name: v1
0x0054: Rev: 1 Flags: none Index: 4 Cnt: 1 Name: v2
0x0070: Rev: 1 Flags: none Index: 5 Cnt: 1 Name: v3
https://maskray.me/blog/2020-11-26-all-about-symbol-versioning#rtld-behavior
has my updated understanding of the behavior.
prev parent reply other threads:[~2022-05-27 22:04 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-20 8:35 Fangrui Song
2022-05-20 11:27 ` Adhemerval Zanella
2022-05-20 14:22 ` Florian Weimer
2022-05-22 18:54 ` Fāng-ruì Sòng
2022-05-27 16:59 ` Adhemerval Zanella
2022-05-27 11:03 ` Florian Weimer
2022-05-27 19:24 ` Fangrui Song
2022-05-27 20:32 ` Fangrui Song
2022-05-27 20:38 ` Florian Weimer
2022-05-27 21:03 ` Fangrui Song
2022-05-27 21:15 ` Adhemerval Zanella
2022-05-27 22:04 ` Fangrui Song [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220527220408.djhp3nz4aalpnsua@google.com \
--to=maskray@google.com \
--cc=adhemerval.zanella@linaro.org \
--cc=fweimer@redhat.com \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).