* [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  From: Khem Raj @ 2021-12-09 23:53 UTC
  To: libc-alpha; +Cc: Mark Hatle, Mark Hatle, Khem Raj

From: Mark Hatle <mark.hatle@windriver.com>

According to the ELF specification:

  When resolving symbolic references, the dynamic linker examines the symbol
  tables with a breadth-first search.

This function was using a depth-first search.  By doing so, the conflict
resolution reported to the prelinker (when LD_TRACE_PRELINKING=1 is set)
was incorrect.  This caused problems when there were various circular
dependencies between libraries.  The problem usually manifested itself as
the wrong IFUNC being executed.

A similar issue has been reported here [1].

[BZ #20488]

[1] https://sourceware.org/legacy-ml/libc-alpha/2016-05/msg00034.html

Signed-off-by: Mark Hatle <mark.hatle@kernel.crashing.org>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
---
 elf/dl-deps.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/elf/dl-deps.c b/elf/dl-deps.c
index 237d9636c5..e15f7f83d8 100644
--- a/elf/dl-deps.c
+++ b/elf/dl-deps.c
@@ -73,13 +73,19 @@ _dl_build_local_scope (struct link_map **list, struct link_map *map)
 {
   struct link_map **p = list;
   struct link_map **q;
+  struct link_map **r;
 
   *p++ = map;
   map->l_reserved = 1;
-  if (map->l_initfini)
-    for (q = map->l_initfini + 1; *q; ++q)
-      if (! (*q)->l_reserved)
-        p += _dl_build_local_scope (p, *q);
+
+  for (r = list; r < p; ++r)
+    if ((*r)->l_initfini)
+      for (q = (*r)->l_initfini + 1; *q; ++q)
+        if (! (*q)->l_reserved)
+          {
+            *p++ = *q;
+            (*q)->l_reserved = 1;
+          }
   return p - list;
 }
-- 
2.34.1
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  From: Adhemerval Zanella @ 2022-01-11 19:26 UTC
  To: Khem Raj, libc-alpha; +Cc: Mark Hatle, Mark Hatle

On 09/12/2021 20:53, Khem Raj via Libc-alpha wrote:
> From: Mark Hatle <mark.hatle@windriver.com>
>
> According to the ELF specification:
>
>   When resolving symbolic references, the dynamic linker examines the symbol
>   tables with a breadth-first search.
>
> This function was using a depth-first search.  By doing so, the conflict
> resolution reported to the prelinker (when LD_TRACE_PRELINKING=1 is set)
> was incorrect.  This caused problems when there were various circular
> dependencies between libraries.  The problem usually manifested itself as
> the wrong IFUNC being executed.
>
> A similar issue has been reported here [1].
>
> [BZ #20488]
>
> [1] https://sourceware.org/legacy-ml/libc-alpha/2016-05/msg00034.html
>
> Signed-off-by: Mark Hatle <mark.hatle@kernel.crashing.org>
> Signed-off-by: Khem Raj <raj.khem@gmail.com>

I am trying to understand why this is only an issue for LD_TRACE_PRELINKING=1;
do we have a testcase that stresses it for a default use case?

When you say 'the wrong IFUNC being executed', what exactly do you mean here?
Could we use a testcase based on this?

> ---
>  elf/dl-deps.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> [...]
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  From: Mark Hatle @ 2022-01-12 19:08 UTC
  To: Adhemerval Zanella, Khem Raj, libc-alpha; +Cc: Mark Hatle

On 1/11/22 1:26 PM, Adhemerval Zanella wrote:
> On 09/12/2021 20:53, Khem Raj via Libc-alpha wrote:
>> [...]
>
> I am trying to understand why this is only an issue for LD_TRACE_PRELINKING=1;
> do we have a testcase that stresses it for a default use case?

The underlying issue here is that resolution is happening depth first and not
breadth first.  According to the ELF spec, all resolution should be
breadth-first.

As noted above, the prelinker just happens to be a way to actually show that
the behavior is incorrect.  (There even appears to be a related defect with a
reproducer.)

The output of LD_TRACE_PRELINKING=1 specifies various addresses and conflict
resolutions.  When you compare what is reported, what actually happens, and
what the spec says, they don't align as they should.

> When you say 'the wrong IFUNC being executed', what exactly do you mean here?
> Could we use a testcase based on this?

With the prelinker (and possibly just in general), the IFUNC address used is
the one from the wrong library scope.  I personally have never tried to
reproduce this outside of the prelinking use case, but based on the referenced
report and the code at the time of the change, it is believed this could
happen (but rarely) without a prelinked system, in a very complex case with
multiple libraries providing the same functions.

The main issue (my memory is sketchy, sorry, this might be wrong) is that if
one or more functions is an IFUNC and one or more functions is NOT, then you
can get into a situation where the wrong function is called using the wrong
mechanism.

Definitely in the prelink case, the resolver gave the address of a regular
function which was then placed into an IFUNC (or maybe it was the other way
around), triggering the runtime segfault.

--Mark

>> ---
>>  elf/dl-deps.c | 14 ++++++++++----
>> [...]
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  From: Adhemerval Zanella @ 2022-01-12 20:12 UTC
  To: Mark Hatle, Khem Raj, Libc-alpha, Carlos O'Donell

On 12/01/2022 16:08, Mark Hatle wrote:
> On 1/11/22 1:26 PM, Adhemerval Zanella wrote:
>> [...]
>
> The underlying issue here is that resolution is happening depth first and
> not breadth first.  According to the ELF spec, all resolution should be
> breadth-first.
>
> [...]
>
> The output of LD_TRACE_PRELINKING=1 specifies various addresses and conflict
> resolutions.  When you compare what is reported, what actually happens, and
> what the spec says, they don't align as they should.

That was pretty clear from the bug report; what I am trying to understand is
why it seems not to have been reported before in default, non-prelinked usage.

Also, the patch only changes the semantics for prelinked binaries, which is at
least troublesome: this semantic difference in symbol resolution is a
potential source of issues, where users see different behavior depending on
whether prelink was used or not.

I am also worried because making such a change might potentially trigger some
hard-to-diagnose breakage.

> With the prelinker (and possibly just in general), the IFUNC address used is
> the one from the wrong library scope.  [...]
>
> Definitely in the prelink case, the resolver gave the address of a regular
> function which was then placed into an IFUNC (or maybe it was the other way
> around), triggering the runtime segfault.

I asked about ifunc because this should not really depend on whether ifunc is
used or not; what might be happening is that this issue triggers the
long-standing ifunc ordering issue more often.

To summarize, although this change only affects prelinked binaries, I think it
would be better to change the default semantics as well.  We can make it a
tunable for the transition, like the new DSO sorting algorithm with
glibc.rtld.dynamic_sort, evaluate it on Fedora rawhide, and then hit the
switch to make it the default.

We will also need regression testcases, although I am not sure what kind of
coverage we will need to provide (our own testsuite currently does not trigger
any issue, though).

> --Mark
>
>> [...]
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  From: Mark Hatle @ 2022-01-12 20:41 UTC
  To: Adhemerval Zanella, Khem Raj, Libc-alpha, Carlos O'Donell

On 1/12/22 2:12 PM, Adhemerval Zanella wrote:
> On 12/01/2022 16:08, Mark Hatle wrote:
>> [...]
>
> I asked about ifunc because this should not really depend on whether ifunc
> is used or not; what might be happening is that this issue triggers the
> long-standing ifunc ordering issue more often.

IFUNC is the only place we ever reproduced it.  At least one of the functions
had to be an ifunc.  (Or maybe a better way to say it is that at least one
function was an ifunc and one wasn't, but they both had the same name.)

> To summarize, although this change only affects prelinked binaries, I think
> it would be better to change the default semantics as well.  We can make it
> a tunable for the transition, like the new DSO sorting algorithm with
> glibc.rtld.dynamic_sort, evaluate it on Fedora rawhide, and then hit the
> switch to make it the default.

It absolutely affects prelinked binaries.  At the time we believed it COULD
affect regular binaries, but we had never seen a failure in the wild.

> We will also need regression testcases, although I am not sure what kind of
> coverage we will need to provide (our own testsuite currently does not
> trigger any issue, though).

This is the test case that was developed at the time for including inside of
the prelinker:

https://git.yoctoproject.org/prelink-cross/commit/testsuite?id=8f55afd84b3580b1f1d6af904e8c2a39221055b7

It essentially makes three libraries, with two functions (two of the libraries
have the same function).  The resulting value expects the breadth-first
loading behavior in order to generate the correct output.

The main purpose of the prelink test cases was to ensure the same behavior
before and after prelinking.  The same should be true with this change to
glibc: everything should work the same before and after the change, unless the
behavior before was an error (wrong function used).

We have also been using this patch in the Yocto Project since 2016, and we
have never had a report of an incompatibility/failure.  So it really is quite
low risk, but I'll never say it's "no risk".

--Mark

>>>> [...]
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  From: Adhemerval Zanella @ 2022-01-13 11:52 UTC
  To: Mark Hatle, Khem Raj, Libc-alpha, Carlos O'Donell

On 12/01/2022 17:41, Mark Hatle wrote:
> On 1/12/22 2:12 PM, Adhemerval Zanella wrote:
>> [...]
>
> IFUNC is the only place we ever reproduced it.  [...]
>
> This is the test case that was developed at the time for including inside
> of the prelinker:
>
> https://git.yoctoproject.org/prelink-cross/commit/testsuite?id=8f55afd84b3580b1f1d6af904e8c2a39221055b7
>
> It essentially makes three libraries, with two functions (two of the
> libraries have the same function).  The resulting value expects the
> breadth-first loading behavior in order to generate the correct output.
>
> The main purpose of the prelink test cases was to ensure the same behavior
> before and after prelinking.  The same should be true with this change to
> glibc: everything should work the same before and after the change, unless
> the behavior before was an error (wrong function used).

The test does not really exercise IFUNC; however, from the patch description
it seems it actually tries to fix an upstream bug that we have already
fixed [1].

And checking prelink output with this test, the patch does not really seem to
be changing the binding of the 'value' function:

* master

$ LD_TRACE_PRELINKING=1 [...] ./order
[...]
        ./order => ./order (0x00007f10eca07000, 0x00007f10eca07000)
        orderlib.so => ./orderlib.so (0x00007f10eca00000, 0x00007f10eca00000) [***]
        orderlib3.so => ./orderlib3.so (0x00007f10ec9fb000, 0x00007f10ec9fb000)
        libc.so.6 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/libc.so.6 (0x00007f10ec7db000, 0x00007f10ec7db000) TLS(0x1, 0x0000000000000090)
        orderlib1.so => ./orderlib1.so (0x00007f10ec7d4000, 0x00007f10ec7d4000)
        orderlib2.so => ./orderlib2.so (0x00007f10ec7cf000, 0x00007f10ec7cf000)
        /lib64/ld-linux-x86-64.so.2 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/elf/ld-linux-x86-64.so.2 (0x00007f10eca0c000, 0x00007f10eca0c000)
[...]
lookup 0x00007f10eca07000 0x0000000000000468 -> 0x00007f10ec9fb000 0x0000000000001100 /1 value
[...]

* patch

$ LD_TRACE_PRELINKING=1 [...] ./order
[...]
        ./order => ./order (0x00007f30cf770000, 0x00007f30cf770000)
        orderlib.so => ./orderlib.so (0x00007f30cf769000, 0x00007f30cf769000) [***]
        orderlib3.so => ./orderlib3.so (0x00007f30cf764000, 0x00007f30cf764000) *
        libc.so.6 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/libc.so.6 (0x00007f30cf544000, 0x00007f30cf544000) TLS(0x1, 0x0000000000000090)
        orderlib1.so => ./orderlib1.so (0x00007f30cf53d000, 0x00007f30cf53d000)
        orderlib2.so => ./orderlib2.so (0x00007f30cf538000, 0x00007f30cf538000)
        /lib64/ld-linux-x86-64.so.2 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/elf/ld-linux-x86-64.so.2 (0x00007f30cf775000, 0x00007f30cf775000)
[...]
lookup 0x00007f30cf770000 0x0000000000000468 -> 0x00007f30cf764000 0x0000000000001100 /1 value
[...]

So in both cases 'value' binds to orderlib3.so, which is the expected value.
I really think we need to come up with a testcase to actually stress it.  At
least _dl_build_local_scope is localized to prelink support, which I think
should be safer to change.

> We have also been using this patch in the Yocto Project since 2016, and we
> have never had a report of an incompatibility/failure.  So it really is
> quite low risk, but I'll never say it's "no risk".

Recently Florian asked about prelink support [2], and Joseph's answer suggests
that Yocto still provides support for it [3].  IMHO I would just deprecate it,
so we can eventually clean this up: it requires a lot of hacks within the
loader, its support is not straightforward for newer ports, and it makes even
less sense with ASLR and PIE executables.  It would be good to know whether
the performance optimization it should bring pays off the clutter and
complexity it adds to the loader code.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=19861
[2] https://sourceware.org/pipermail/libc-alpha/2021-August/130404.html
[3] https://git.yoctoproject.org/prelink-cross/
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  From: Mark Hatle @ 2022-01-13 16:33 UTC
  To: Adhemerval Zanella, Khem Raj, Libc-alpha, Carlos O'Donell

On 1/13/22 5:52 AM, Adhemerval Zanella wrote:
> On 12/01/2022 17:41, Mark Hatle wrote:
>> [...]
>>
>> The main purpose of the prelink test cases was to ensure the same behavior
>> before and after prelinking.  The same should be true with this change to
>> glibc: everything should work the same before and after the change, unless
>> the behavior before was an error (wrong function used).
>
> The test does not really exercise IFUNC; however, from the patch description
> it seems it actually tries to fix an upstream bug that we have already
> fixed [1].
>
> And checking prelink output with this test, the patch does not really seem
> to be changing the binding of the 'value' function:
>
> * master

The master branch is for conventional prelink; patches are cross-ported to
the cross_prelink branch.  cross_prelink is where this was developed, so the
chunk of the commit message about changing the linker has nothing to do with
the bit I was showing you.  (Find the corresponding patch in cross_prelink,
and you'll see the RTLD side of things.)

> $ LD_TRACE_PRELINKING=1 [...] ./order
> [...]
>         ./order => ./order (0x00007f10eca07000, 0x00007f10eca07000)
>         orderlib.so => ./orderlib.so (0x00007f10eca00000, 0x00007f10eca00000) [***]
>         orderlib3.so => ./orderlib3.so (0x00007f10ec9fb000, 0x00007f10ec9fb000)
>         libc.so.6 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/libc.so.6 (0x00007f10ec7db000, 0x00007f10ec7db000) TLS(0x1, 0x0000000000000090)
>         orderlib1.so => ./orderlib1.so (0x00007f10ec7d4000, 0x00007f10ec7d4000)
>         orderlib2.so => ./orderlib2.so (0x00007f10ec7cf000, 0x00007f10ec7cf000)
>         /lib64/ld-linux-x86-64.so.2 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/elf/ld-linux-x86-64.so.2 (0x00007f10eca0c000, 0x00007f10eca0c000)
> [...]
> lookup 0x00007f10eca07000 0x0000000000000468 -> 0x00007f10ec9fb000 0x0000000000001100 /1 value
> [...]
>
> * patch
>
> $ LD_TRACE_PRELINKING=1 [...] ./order
> [...]
> ./order => ./order (0x00007f30cf770000, 0x00007f30cf770000) > orderlib.so => ./orderlib.so (0x00007f30cf769000, 0x00007f30cf769000) > [***] orderlib3.so => ./orderlib3.so (0x00007f30cf764000, 0x00007f30cf764000) * > libc.so.6 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/libc.so.6 (0x00007f30cf544000, 0x00007f30cf544000) TLS(0x1, 0x0000000000000090) > orderlib1.so => ./orderlib1.so (0x00007f30cf53d000, 0x00007f30cf53d000) > orderlib2.so => ./orderlib2.so (0x00007f30cf538000, 0x00007f30cf538000) > /lib64/ld-linux-x86-64.so.2 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/elf/ld-linux-x86-64.so.2 (0x00007f30cf775000, 0x00007f30cf775000) > [...] > lookup 0x00007f30cf770000 0x0000000000000468 -> 0x00007f30cf764000 0x0000000000001100 /1 value > [...] > > So on both cases 'value' binds to orderlib3.so, which is the expected value. I really > think we need to came with a testcase to actually stress it. It's more then just LD_TRACE_PRELINKING=1, the environment used is usually similar to: LD_TRACE_PRELINKING=1 LD_BIND_NOW=1 LD_TRACE_LOADED_OBJECTS=1 Then for each library and executable it runs the ld.so and captures this information. > At least _dl_build_local_scope is localized to prelink support, which I think it should > be safer to change. Due to relocation and other changes, I don't know if you can run the prelinker for the test cases, but if you do -- it's pretty easy to show the issue in this case. >> >> We have also been using this patch in the Yocto Project since 2016. And we've never had a report of an incompatibility/failure. So it really is quite low risk, but I'll never say it's "no risk". >> > > Recently Florian has asked about prelinked support [1], and Joseph > answer seems that Yocto still provide support for it [3]. 
IMHO I > would just deprecate it, so we can eventually clean this up: it > requires a lot of hacks within loader, its support is not straightforward > for newer ports, and it even make less sense with ASLR and PIE executable. I have stopped maintaining the prelinker at this time, as I don't have the knowledge for all of the latest fixups. Over the last 3 or so years, the prelinker itself has fallen into a state of disrepair as additional relocations, functions, etc have been added to the system. I don't have the time or more importantly knowledge to deal with the recent changes. I would be happy to resume some of the maintenance activities if others would provide the fixes necessary for the latest compiler/library implementations -- but the number of people who understand ELF at that level AND have any desire for prelinking can probably be counted on one hand. Even trying to use the prelinker with a somewhat modern Ubuntu system reports errors like: ../src/prelink: ./order: Could not parse `lookup 0x00007fffd8b0a000 0xfffffffffffcd180 -> 0x00007fffd8b0a000 0x00000000000008e0 /0 __vdso_clock_gettime' While I'm sure that one wouldn't be too hard to fix, then there is another and another.. (I believe the issue is that the VDSO items are presenting BEFORE the list of libraries which is change in behavior....) (I did try to just skip those lookups, and it results in the prelinked code crashing..) > It would be good to know if the performance optimization it should bring > does pay off the clutter and complexity it adds on loader code. When I last profiled this, roughly 3 1/2 years ago, the run-time linking speedup was huge. There were two main advantages to this: * Run Time linking speedup - primarily helped initial application loads. System boot times went from 10-15 seconds down to 4-5 seconds. For embedded systems this was massive. * Memory usage. The COW page usage for runtime linking can be significant on memory constrained systems. 
Prelinking dropped the COW page usage in the systems I was looking at to about 10% of what it was prior. This is believed to have further contributed to the boot time optimizations. Last I looked at this, only about 20-30% of the system is capable of prelinking anymore due to the evolutionary changes in the various toolchain elements, introducing new relocations and related components. Even things like re-ordering sections (done a couple years ago in binutils) has broken the prelinker in mysterious ways. Add to this the IMHO mistaken belief that ASLR is some magic security device. I see it more as a component of security that needs to be part of a broader system that you can decide to trade off against load performance (including memory). But the broad "if you don't use ASLR your device if vulnerable" mentality has seriously restricted the desire for people to use, improve and contribute back to the prelinker. > > [1] https://sourceware.org/bugzilla/show_bug.cgi?id=19861 > [2] https://sourceware.org/pipermail/libc-alpha/2021-August/130404.html > [3] https://git.yoctoproject.org/prelink-cross/ > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first 2022-01-13 16:33 ` Mark Hatle @ 2022-01-13 17:20 ` Adhemerval Zanella 2022-01-13 18:00 ` Mark Hatle 0 siblings, 1 reply; 12+ messages in thread From: Adhemerval Zanella @ 2022-01-13 17:20 UTC (permalink / raw) To: Mark Hatle, Khem Raj, Libc-alpha, Carlos O'Donell On 13/01/2022 13:33, Mark Hatle wrote: > > > On 1/13/22 5:52 AM, Adhemerval Zanella wrote: >> >> >> On 12/01/2022 17:41, Mark Hatle wrote: >>> >>> >>> On 1/12/22 2:12 PM, Adhemerval Zanella wrote: >>>> >>>> >>>> On 12/01/2022 16:08, Mark Hatle wrote: >>>>> >>>>> >>>>> On 1/11/22 1:26 PM, Adhemerval Zanella wrote: >>>>>> >>>>>> >>>>>> On 09/12/2021 20:53, Khem Raj via Libc-alpha wrote: >>>>>>> From: Mark Hatle <mark.hatle@windriver.com> >>>>>>> >>>>>>> According to the ELF specification: >>>>>>> >>>>>>> When resolving symbolic references, the dynamic linker examines the symbol >>>>>>> tables with a breadth-first search. >>>>>>> >>>>>>> This function was using a depth first search. By doing so the conflict >>>>>>> resolution reported to the prelinker (when LD_TRACE_PRELINKING=1 is set) >>>>>>> was incorrect. This caused problems when their were various circular >>>>>>> dependencies between libraries. The problem usually manifested itself by >>>>>>> the wrong IFUNC being executed. >>>>>>> >>>>>>> Similar issue has been reported here [1] >>>>>>> >>>>>>> [BZ# 20488] >>>>>>> >>>>>>> [1] https://sourceware.org/legacy-ml/libc-alpha/2016-05/msg00034.html >>>>>>> >>>>>>> Signed-off-by: Mark Hatle <mark.hatle@kernel.crashing.org> >>>>>>> Signed-off-by: Khem Raj <raj.khem@gmail.com> >>>>>> >>>>>> I am trying to understand why it this only an issue for LD_TRACE_PRELINKING=1, >>>>>> do we have a testcase that stress it for a default usercase? >>>>> >>>>> The underlying issue here is that resolution is happening depth first and not breadth first. According to the ELF spec, all resolution should be breadth-first. 
>>>>> >>>>> >>>>> As noted in item in above, the prelinker just happens to be a way to actually show that the behavior is incorrect. (There even appears to be a related defect with a reproducer.) >>>>> >>>>> >>>>> When taking the values from LD_TRACE_PRELINKING=1, various addresses and conflict resolutions are specified. When you compare what is reported, vs what happens, vs what the spec says they don't align as they should. >>>> >>>> That was pretty clear from bug report, what I am trying to understand is >>>> why it seems to not being reported before in default non-prelinked usage. >>>> >>>> Also, the patch only changes the semantic to prelinked binaries which is >>>> at least troublesome: this semantic difference at symbol resolution is >>>> a potential source of issues where user see different behavior depending >>>> whether prelinked was used or not. >>>> >>>> I am also worried because making such change might potentially trigger >>>> some hard to diagnostic breakage. >>>> >>>>> >>>>> >>>>>> When you say the 'wrong IFUNC being executed' what exactly you mean here? >>>>>> Could we use a testcase based on this? >>>>> >>>>> >>>>> The prelinker (and possibly just in general), the IFUNC address used is the one from the wrong library scope. I personally have never tried to reproduce this outside of the prelinking use-case, but based on the referenced report and the code at the time of the change, it is believed this could happen (but rarely) without a prelinked system in a very complex case with multiple libraries providing the same functions. >>>>> >>>>> The main issue (my memory is sketchy sorry this might be wrong) is that if one or more functions is an IFUNC and one or more functions is NOT, then the you can get into a situation with a conflict of the wrong function being called using the wrong mechanism. 
>>>>> >>>>> Definitely in the prelink case, the resolver gave the address of a regular function which was then placed into an IFUNC (or maybe it was the other way around) triggering the runtime segfault. >>>> >>>> I asked about ifunc because it should not really be dependent where ifunc >>>> is used or not, what might be happening is this issue triggers the long >>>> standing ifunc ordering issue more often. >>> >>> IFUNC is the only place we ever reproduced it. At least one of the functions had to be an ifunc. (Or maybe a better way to say it is at least one function was an ifunc and one wasn't, but they both had the same name.) >>> >>>> To summarize, although this change only affects prelinked binaries I think >>>> it would be better to change for default semantic as well. We can make >>>> it a tunable for the transition, like new DSO sorting algorithm with >>>> glibc.rtld.dynamic_sort, and evaluate on Fedora rawhide, and then hit >>>> the switch to make it the default. >>> >>> It absolutely affects prelinked binaries. At the time we believed it COULD affect regular binaries, but we had never seen a failure in the wild. >>> >>>> We will also need regression testcases, although not sure what kind >>>> of coverage we will need to provide (our own testsuite currently does not >>>> trigger any issue though). >>> >>> This is the test case that was developed at the time for including inside of the prelinker: >>> >>> https://git.yoctoproject.org/prelink-cross/commit/testsuite?id=8f55afd84b3580b1f1d6af904e8c2a39221055b7 >>> >>> It essentially makes three libraries, with two functions (two libraries have the same function). The resulting value expects the breadth-first loading behavior in order to generate the correct output. >>> >>> The main purpose of the prelink test cases was to ensure the same behavior before and after the prelinking. 
The same should be true with this change to glibc, everything should work the same before and after the change, unless the stuff before was an error (wrong function used). >> >> The test does not really exercises IFUNC, however on the patch description it >> seems that actually tries to fix a upstream bug that we already fixed [1]. >> >> And checking prelink output with this test, the patch does not really seem >> to be changing the binding of the 'value' function: >> >> * master > > Master branch is for conventional prelink, batches are cross-ported with the cross_prelink branch. cross_prelink is where this was developed, so the chunk of the commit message about changing the linker has nothing to do with the bit I was showing you. (Find the corresponding patch in cross_prelink, and you'll see the RTLD side of things.) I was not using your branch or repo, sorry if I was not clear. This is glibc master and the patch is glibc master plus your patch. > >> $ LD_TRACE_PRELINKING=1 [...] ./order >> [...] >> ./order => ./order (0x00007f10eca07000, 0x00007f10eca07000) >> orderlib.so => ./orderlib.so (0x00007f10eca00000, 0x00007f10eca00000) >> [***] orderlib3.so => ./orderlib3.so (0x00007f10ec9fb000, 0x00007f10ec9fb000) >> libc.so.6 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/libc.so.6 (0x00007f10ec7db000, 0x00007f10ec7db000) TLS(0x1, 0x0000000000000090) >> orderlib1.so => ./orderlib1.so (0x00007f10ec7d4000, 0x00007f10ec7d4000) >> orderlib2.so => ./orderlib2.so (0x00007f10ec7cf000, 0x00007f10ec7cf000) >> /lib64/ld-linux-x86-64.so.2 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/elf/ld-linux-x86-64.so.2 (0x00007f10eca0c000, 0x00007f10eca0c000) >> [...] >> lookup 0x00007f10eca07000 0x0000000000000468 -> 0x00007f10ec9fb000 0x0000000000001100 /1 value >> [...] >> >> * patch >> >> $ LD_TRACE_PRELINKING=1 [...] ./order >> [...] 
>> ./order => ./order (0x00007f30cf770000, 0x00007f30cf770000) >> orderlib.so => ./orderlib.so (0x00007f30cf769000, 0x00007f30cf769000) >> [***] orderlib3.so => ./orderlib3.so (0x00007f30cf764000, 0x00007f30cf764000) * >> libc.so.6 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/libc.so.6 (0x00007f30cf544000, 0x00007f30cf544000) TLS(0x1, 0x0000000000000090) >> orderlib1.so => ./orderlib1.so (0x00007f30cf53d000, 0x00007f30cf53d000) >> orderlib2.so => ./orderlib2.so (0x00007f30cf538000, 0x00007f30cf538000) >> /lib64/ld-linux-x86-64.so.2 => /home/azanella/Projects/glibc/build/x86_64-linux-gnu/elf/ld-linux-x86-64.so.2 (0x00007f30cf775000, 0x00007f30cf775000) >> [...] >> lookup 0x00007f30cf770000 0x0000000000000468 -> 0x00007f30cf764000 0x0000000000001100 /1 value >> [...] >> >> So on both cases 'value' binds to orderlib3.so, which is the expected value. I really >> think we need to came with a testcase to actually stress it. > > It's more then just LD_TRACE_PRELINKING=1, the environment used is usually similar to: > > LD_TRACE_PRELINKING=1 > LD_BIND_NOW=1 > LD_TRACE_LOADED_OBJECTS=1 > > Then for each library and executable it runs the ld.so and captures this information. LD_TRACE_PRELINKING implies in LD_TRACE_LOADED_OBJECTS: 2773 case 16: 2774 /* The mode of the dynamic linker can be set. */ 2775 if (memcmp (envline, "TRACE_PRELINKING", 16) == 0) 2776 { 2777 state->mode = rtld_mode_trace; 2778 GLRO(dl_verbose) = 1; 2779 GLRO(dl_debug_mask) |= DL_DEBUG_PRELINK; 2780 GLRO(dl_trace_prelink) = &envline[17]; 2781 } 2782 break; 2783 2784 case 20: 2785 /* The mode of the dynamic linker can be set. */ 2786 if (memcmp (envline, "TRACE_LOADED_OBJECTS", 20) == 0) 2787 state->mode = rtld_mode_trace; 2788 break; That's why it is used on glibc own prelink tests (tst-prelink for instance). And LD_BIND_NOW=1 does not change the test outcome also. > >> At least _dl_build_local_scope is localized to prelink support, which I think it should >> be safer to change. 
> > Due to relocation and other changes, I don't know if you can run the prelinker for the test cases, but if you do -- it's pretty easy to show the issue in this case. It would not be straightforward (nor desirable) to add prelink as a test prerequisite, but AFAIU LD_TRACE_PRELINKING is the loader output that prelink consumes. > >>> >>> We have also been using this patch in the Yocto Project since 2016. And we've never had a report of an incompatibility/failure. So it really is quite low risk, but I'll never say it's "no risk". >>> >> >> Recently Florian has asked about prelinked support [1], and Joseph >> answer seems that Yocto still provide support for it [3]. IMHO I >> would just deprecate it, so we can eventually clean this up: it >> requires a lot of hacks within loader, its support is not straightforward >> for newer ports, and it even make less sense with ASLR and PIE executable. > I have stopped maintaining the prelinker at this time, as I don't have the knowledge for all of the latest fixups. Over the last 3 or so years, the prelinker itself has fallen into a state of disrepair as additional relocations, functions, etc have been added to the system. I don't have the time or more importantly knowledge to deal with the recent changes. > > I would be happy to resume some of the maintenance activities if others would provide the fixes necessary for the latest compiler/library implementations -- but the number of people who understand ELF at that level AND have any desire for prelinking can probably be counted on one hand. > > Even trying to use the prelinker with a somewhat modern Ubuntu system reports errors like: > > ../src/prelink: ./order: Could not parse `lookup 0x00007fffd8b0a000 0xfffffffffffcd180 -> 0x00007fffd8b0a000 0x00000000000008e0 /0 __vdso_clock_gettime' > > While I'm sure that one wouldn't be too hard to fix, then there is another and another..
(I believe the issue is that the VDSO items are presenting BEFORE the list of libraries which is change in behavior....) > > (I did try to just skip those lookups, and it results in the prelinked code crashing..) > >> It would be good to know if the performance optimization it should bring >> does pay off the clutter and complexity it adds on loader code. > > When I last profiled this, roughly 3 1/2 years ago, the run-time linking speedup was huge. There were two main advantages to this: > > * Run Time linking speedup - primarily helped initial application loads. System boot times went from 10-15 seconds down to 4-5 seconds. For embedded systems this was massive. Right, this is interesting. Any profile data on where exactly the speedup is coming from? I wonder if we could get any gain by optimizing the normal path without the need to resort to prelink. > > * Memory usage. The COW page usage for runtime linking can be significant on memory constrained systems. Prelinking dropped the COW page usage in the systems I was looking at to about 10% of what it was prior. This is believed to have further contributed to the boot time optimizations. Interesting, why exactly does prelinking help with COW usage? I would expect memory utilization to be roughly the same; is prelinking aligning the segments in a better way? > > Last I looked at this, only about 20-30% of the system is capable of prelinking anymore due to the evolutionary changes in the various toolchain elements, introducing new relocations and related components. Even things like re-ordering sections (done a couple years ago in binutils) has broken the prelinker in mysterious ways. Yes, and it is even harder for a project that depends on both the static and the dynamic linker to do out-of-tree development without a proper ABI definition. That's why I think prelink is currently a hackish solution with niche usage that adds a lot of complexity to the code base.
For instance, we are aiming to support DT_RELR, which would help to decrease the relocation segment size for PIE binaries. It would probably be another feature that prelink will lack support for. In fact, this information you provided that only 20-30% of all binaries are supported makes me even more willing to really deprecate prelink. > > Add to this the IMHO mistaken belief that ASLR is some magic security device. I see it more as a component of security that needs to be part of a broader system that you can decide to trade off against load performance (including memory). But the broad "if you don't use ASLR your device if vulnerable" mentality has seriously restricted the desire for people to use, improve and contribute back to the prelinker. ASLR/PIE is not a silver bullet, especially with limited mmap entropy on 32 bit systems. But it is a gradual improvement over the multiple security features we support (from generic ones such as relro and malloc safe-linking to arch-specific ones such as x86_64 CET or aarch64 BTI and PAC/RET). My point is more that what we usually see is generic distributions adopting the broader security features. I am not sure about embedded though. > >> >> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=19861 >> [2] https://sourceware.org/pipermail/libc-alpha/2021-August/130404.html >> [3] https://git.yoctoproject.org/prelink-cross/ >> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first 2022-01-13 17:20 ` Adhemerval Zanella @ 2022-01-13 18:00 ` Mark Hatle 2022-01-13 18:37 ` Adhemerval Zanella 0 siblings, 1 reply; 12+ messages in thread From: Mark Hatle @ 2022-01-13 18:00 UTC (permalink / raw) To: Adhemerval Zanella, Khem Raj, Libc-alpha, Carlos O'Donell On 1/13/22 11:20 AM, Adhemerval Zanella wrote: >> When I last profiled this, roughly 3 1/2 years ago, the run-time linking speedup was huge. There were two main advantages to this: >> >> * Run Time linking speedup - primarily helped initial application loads. System boot times went from 10-15 seconds down to 4-5 seconds. For embedded systems this was massive. > > Right, this is interesting. Any profile data where exactly the speed is coming > from? I wonder if we could get any gain by optimizing the normal patch without > the need to resort to prelink. glibc's runtime linker is very efficient, I don't honestly expect many speedups at this point. This is partially from memory, so I may have a few details wrong... 
but LD_DEBUG=statistics On my ubuntu machine, just setting then and running /bin/bash results in: 334067: 334067: runtime linker statistics: 334067: total startup time in dynamic loader: 252415 cycles 334067: time needed for relocation: 119006 cycles (47.1%) 334067: number of relocations: 412 334067: number of relocations from cache: 3 334067: number of relative relocations: 5100 334067: time needed to load objects: 92655 cycles (36.7%) 334068: 334068: runtime linker statistics: 334068: total startup time in dynamic loader: 125018 cycles 334068: time needed for relocation: 40554 cycles (32.4%) 334068: number of relocations: 176 334068: number of relocations from cache: 3 334068: number of relative relocations: 1534 334068: time needed to load objects: 45882 cycles (36.7%) 334069: 334069: runtime linker statistics: 334069: total startup time in dynamic loader: 121500 cycles 334069: time needed for relocation: 39067 cycles (32.1%) 334069: number of relocations: 136 334069: number of relocations from cache: 3 334069: number of relative relocations: 1274 334069: time needed to load objects: 47505 cycles (39.0%) 334071: 334071: runtime linker statistics: 334071: total startup time in dynamic loader: 111850 cycles 334071: time needed for relocation: 35089 cycles (31.3%) 334071: number of relocations: 135 334071: number of relocations from cache: 3 334071: number of relative relocations: 1272 334071: time needed to load objects: 45746 cycles (40.8%) 334072: 334072: runtime linker statistics: 334072: total startup time in dynamic loader: 109827 cycles 334072: time needed for relocation: 34863 cycles (31.7%) 334072: number of relocations: 145 334072: number of relocations from cache: 3 334072: number of relative relocations: 1351 334072: time needed to load objects: 45565 cycles (41.4%) (why so many, because it's running through the profile and other bash startups which end up running additional items.) When prelinker worked... 
the number of relocations (and especially cycles) required dropped to about 1-10% of the original application. This compounded by the large number of executables loaded at boot (think of sysvinit with all of the shells started and destroyed) turned into a massive speedup during early boot process. As a normal "user" behavior, the speedup is negligible, because the amount of time spent loading vs running is nothing.... but in automated processing where something, like bash, is started runs for a fraction of a second, exits.. "repeat" 1000s of times.. it really becomes a massive part of the time scale. So back to the above, I know that in one instance that bash would end up with about 4 relocations, with 400+ from cache with the prelinker. Resulting in the cycles required for relocations to be in the 10% of overall load time, with time needed to load objects being roughly 90%. >> >> * Memory usage. The COW page usage for runtime linking can be significant on memory constrained systems. Prelinking dropped the COW page usage in the systems I was looking at to about 10% of what it was prior. This is believed to have further contributed to the boot time optimizations. > > Interesting, why exactly is prelinking help in COW usage? I would expect memory > utilization to be rough the same, is prelinking helping in aligning the segment > in a better way? Each time a relocation occurs, the runtime linker needs to write into a page with that address. No relocation, no runtime write, no COW page created. Add to this mmap usage between applications, and you can run say 100 bash sessions and each session would use a fraction of the COW pages that it would without prelinking. At one point I had statistics on this, but I don't even remember how this was calculated or done anymore. (I had help from some kernel people to show me kernel memory use, contiguous pages, etc..) 
>> >> Last I looked at this, only about 20-30% of the system is capable of prelinking anymore due to the evolutionary changes in the various toolchain elements, introducing new relocations and related components. Even things like re-ordering sections (done a couple years ago in binutils) has broken the prelinker in mysterious ways. > > Yes and it is even harder to have a project that is dependent of both > static in dynamic linker to have out-of-tree developement without a > proper ABI definition. That's why I think currently prelink is hackish > solution with a niche usage that adds a lot complexity to the code base. > > For instance, we are aiming to support DT_RELR which would help to > decrease the relocation segment size for PIE binaries. It would be > probably another feature that prelink will lack support. > > In fact this information you provided that only 20-30% of all binaries > are supported makes even more willing to really deprecate prelink. prelink has a huge advantage on embedded systems -- but it hasn't worked well for about 3 years now... I was hoping other then life support someone would step up and contribute, and it never really happened. There were a few bugs/fixes sent by Mentor that kept things going on a few platforms -- but even that eventually dried up. (This is meant to thank them for the code and contributions they did!) >> >> Add to this the IMHO mistaken belief that ASLR is some magic security device. I see it more as a component of security that needs to be part of a broader system that you can decide to trade off against load performance (including memory). But the broad "if you don't use ASLR your device if vulnerable" mentality has seriously restricted the desire for people to use, improve and contribute back to the prelinker. > > ASLR/PIE is not a silver bullet, specially with limited mmap entropy on > 32 bit systems. 
But it is a gradual improvement over the multiple security > features we support (from generic ones such as relro and malloc safe-linking > to arch-specific ones such as x86_64 CET or aarch64 BTI and PAC/RET). Exactly, it's multiple security features working together for a purpose. But everyone got convinced ASLR was a silver bullet and that is what started the final death spiral of the prelinker (as it is today). > My point is more that what we usually see is generic distributions adopting > the broader security features. I am not sure about embedded though. Embedded needs security, no doubt.. but with the limited entropy (even on 64-bit, the entropy is truly limited.. great, I now have to run my attack 15 times instead of 5.. that really isn't much of an improvement!) ASLR has become a checklist item for some security consultant to approve a product release. Things like CET and BTI / PAC/RET have a much larger real-world security impact, IMHO. So in the end the embedded development that I've been involved with has always had a series of "these are our options, in a perfect world we'd use them all -- but we don't have the memory (prelink helped), we've got disk space limits (can't use PAC/RET, binaries get bigger), we need to be able to upgrade the software (prelink on the device? send pre-prelinked to all devices), we've got industry requirements (not all devices should have the same memory map, prelink randomize addresses?), we've got maximum boot time requirements, etc." discussions. It's not cut and dried what combination of those requirements, and which technologies (such as the prelinker), should be used to meet them. As we have fewer operating system engineers, the preference is going away from using tools like prelink and lots of simple utilities into alternatives like "jumbo do it all" binaries that only get loaded once.
Avoiding initscript systems and packing system initialization into those binaries, or even moving to other libc's that have less relocation pressure (due to smaller libraries, feature sets, etc.) If you declare prelink dead, then it's dead.. nobody will be bringing it back. But I do still believe, technology-wise, it's a good technology for embedded systems (remember, embedded doesn't mean "small") to help them meet specific integration needs. But without help from people with the appropriate knowledge to implement new features, like DT_RELR, in the prelinker -- there is little chance that it is anything but on life support. --Mark >> >>> >>> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=19861 >>> [2] https://sourceware.org/pipermail/libc-alpha/2021-August/130404.html >>> [3] https://git.yoctoproject.org/prelink-cross/ >>> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  2022-01-13 18:00 ` Mark Hatle
@ 2022-01-13 18:37 ` Adhemerval Zanella
  2022-01-13 19:01   ` Carlos O'Donell
  0 siblings, 1 reply; 12+ messages in thread
From: Adhemerval Zanella @ 2022-01-13 18:37 UTC
To: Mark Hatle, Khem Raj, Libc-alpha, Carlos O'Donell

On 13/01/2022 15:00, Mark Hatle wrote:
>
> On 1/13/22 11:20 AM, Adhemerval Zanella wrote:
>>> When I last profiled this, roughly 3 1/2 years ago, the run-time
>>> linking speedup was huge.  There were two main advantages to this:
>>>
>>> * Run-time linking speedup - primarily helped initial application
>>>   loads.  System boot times went from 10-15 seconds down to 4-5
>>>   seconds.  For embedded systems this was massive.
>>
>> Right, this is interesting.  Any profile data showing where exactly the
>> speedup is coming from?  I wonder if we could get any gain by optimizing
>> the normal path without the need to resort to prelink.
>
> glibc's runtime linker is very efficient; I don't honestly expect many
> speedups at this point.
>
> This is partially from memory, so I may have a few details wrong...
> but
>
> LD_DEBUG=statistics
>
> On my Ubuntu machine, just setting this and running /bin/bash results in:
>
>   334067:
>   334067:     runtime linker statistics:
>   334067:       total startup time in dynamic loader: 252415 cycles
>   334067:                 time needed for relocation: 119006 cycles (47.1%)
>   334067:                      number of relocations: 412
>   334067:           number of relocations from cache: 3
>   334067:             number of relative relocations: 5100
>   334067:              time needed to load objects: 92655 cycles (36.7%)
>   334068:
>   334068:     runtime linker statistics:
>   334068:       total startup time in dynamic loader: 125018 cycles
>   334068:                 time needed for relocation: 40554 cycles (32.4%)
>   334068:                      number of relocations: 176
>   334068:           number of relocations from cache: 3
>   334068:             number of relative relocations: 1534
>   334068:              time needed to load objects: 45882 cycles (36.7%)
>   334069:
>   334069:     runtime linker statistics:
>   334069:       total startup time in dynamic loader: 121500 cycles
>   334069:                 time needed for relocation: 39067 cycles (32.1%)
>   334069:                      number of relocations: 136
>   334069:           number of relocations from cache: 3
>   334069:             number of relative relocations: 1274
>   334069:              time needed to load objects: 47505 cycles (39.0%)
>   334071:
>   334071:     runtime linker statistics:
>   334071:       total startup time in dynamic loader: 111850 cycles
>   334071:                 time needed for relocation: 35089 cycles (31.3%)
>   334071:                      number of relocations: 135
>   334071:           number of relocations from cache: 3
>   334071:             number of relative relocations: 1272
>   334071:              time needed to load objects: 45746 cycles (40.8%)
>   334072:
>   334072:     runtime linker statistics:
>   334072:       total startup time in dynamic loader: 109827 cycles
>   334072:                 time needed for relocation: 34863 cycles (31.7%)
>   334072:                      number of relocations: 145
>   334072:           number of relocations from cache: 3
>   334072:             number of relative relocations: 1351
>   334072:              time needed to load objects: 45565 cycles (41.4%)
>
> (Why so many?  Because it's running through the profile and other bash
> startups, which end up
> running additional items.)
>
> When the prelinker worked, the number of relocations (and especially
> cycles) required dropped to about 1-10% of the original application.
> This, compounded by the large number of executables loaded at boot
> (think of sysvinit with all of the shells started and destroyed),
> turned into a massive speedup during the early boot process.
>
> As normal "user" behavior, the speedup is negligible, because the
> amount of time spent loading vs running is nothing... but in automated
> processing where something like bash is started, runs for a fraction of
> a second, exits, "repeat" 1000s of times, it really becomes a massive
> part of the time scale.
>
> So back to the above: I know that in one instance bash would end up
> with about 4 relocations, with 400+ from cache with the prelinker,
> resulting in the cycles required for relocations being in the 10% range
> of overall load time, with the time needed to load objects being
> roughly 90%.

Right, the compound improvements over all binaries make sense.

>>>
>>> * Memory usage.  The COW page usage for runtime linking can be
>>>   significant on memory-constrained systems.  Prelinking dropped the
>>>   COW page usage in the systems I was looking at to about 10% of what
>>>   it was prior.  This is believed to have further contributed to the
>>>   boot time optimizations.
>>
>> Interesting, why exactly does prelinking help with COW usage?  I would
>> expect memory utilization to be roughly the same; is prelinking helping
>> to align the segments in a better way?
>
> Each time a relocation occurs, the runtime linker needs to write into
> the page with that address.  No relocation, no runtime write, no COW
> page created.
>
> Add to this mmap usage between applications, and you can run, say, 100
> bash sessions, and each session would use a fraction of the COW pages
> that it would without prelinking.
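The COW effect being described can be measured directly on Linux; a minimal sketch (illustrative only, not from the thread) that sums a process's private-dirty pages from /proc, which is where pages written during relocation processing (GOT, writable data) end up:

```shell
# Sum the Private_Dirty accounting for the current shell process.
# Fewer startup relocations mean fewer pages dirtied at load time and
# therefore fewer private copies per process instance.
pid=$$
kb=$(awk '/^Private_Dirty:/ { kb += $2 } END { print kb + 0 }' "/proc/$pid/smaps")
echo "pid $pid: $kb kB private-dirty"
```

Comparing this figure for the same binary with and without prelinking, across many concurrent instances, would give a rough version of the comparison described above.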
Yes, but not taking into consideration TEXTREL or a writable PLT, I am not seeing how COW would help, since the GOT (where most, if not all, relocations would happen) is an anonymous mapping.

>
> At one point I had statistics on this, but I don't even remember how
> this was calculated or done anymore.  (I had help from some kernel
> people to show me kernel memory use, contiguous pages, etc.)
>
>>>
>>> Last I looked at this, only about 20-30% of the system is capable of
>>> prelinking anymore, due to the evolutionary changes in the various
>>> toolchain elements introducing new relocations and related components.
>>> Even things like re-ordering sections (done a couple of years ago in
>>> binutils) have broken the prelinker in mysterious ways.
>>
>> Yes, and it is even harder for a project that depends on both the
>> static and dynamic linker to have out-of-tree development without a
>> proper ABI definition.  That's why I think prelink is currently a
>> hackish solution with a niche usage that adds a lot of complexity to
>> the code base.
>>
>> For instance, we are aiming to support DT_RELR, which would help
>> decrease the relocation segment size for PIE binaries.  It would
>> probably be another feature that prelink lacks support for.
>>
>> In fact, this information you provided, that only 20-30% of all
>> binaries are supported, makes me even more willing to really deprecate
>> prelink.
>
> prelink has a huge advantage on embedded systems -- but it hasn't
> worked well for about 3 years now...  I was hoping that, rather than
> life support, someone would step up and contribute, and it never really
> happened.  There were a few bugs/fixes sent by Mentor that kept things
> going on a few platforms -- but even that eventually dried up.  (This
> is meant to thank them for the code and contributions they did!)
>
>>>
>>> Add to this the IMHO mistaken belief that ASLR is some magic security
>>> device.
>>> I see it more as a component of security that needs to be part of a
>>> broader system, one that you can decide to trade off against load
>>> performance (including memory).  But the broad "if you don't use ASLR
>>> your device is vulnerable" mentality has seriously restricted the
>>> desire for people to use, improve, and contribute back to the
>>> prelinker.
>>
>> ASLR/PIE is not a silver bullet, especially with limited mmap entropy
>> on 32-bit systems.  But it is a gradual improvement over the multiple
>> security features we support (like generic ones such as relro and
>> malloc safe-linking, up to arch-specific ones such as x86_64 CET or
>> aarch64 BTI or PAC/RET).
>
> Exactly: multiple security features work together for a purpose.  But
> everyone got convinced ASLR was a silver bullet, and that is what
> started the final death spiral of the prelinker (as it is today).
>
>> My point is more that what we usually see in generic distributions is
>> the use of broader security features.  I am not sure about embedded
>> though.
>
> Embedded needs security, no doubt... but with the limited entropy (even
> on 64-bit, the entropy is truly limited: great, now I have to run my
> attack 15 times instead of 5 -- that really isn't much of an
> improvement!) ASLR has become a checklist item for some security
> consultant to approve a product release.
>
> Things like CET and BTI / PAC/RET have a much larger real-world
> security impact, IMHO.
>
> So in the end, the embedded development that I've been involved with
> has always had a series of "these are our options; in a perfect world
> we'd use them all, but we don't have the memory (prelink helped), we've
> got disk space limits (can't use PAC/RET, binaries get bigger), we need
> to be able to upgrade the software (prelink on the device? send
> pre-prelinked images to all devices?), we've got industry requirements
> (not all devices should have the same memory map; have prelink
> randomize addresses?), we've got maximum boot time requirements, etc."
> It's not cut and dried which combination of those requirements, and
> which technologies (such as the prelinker), should be used to meet
> them.  As we have fewer operating system engineers, the preference is
> moving away from tools like prelink and lots of simple utilities toward
> alternatives like "jumbo do-it-all" binaries that only get loaded once:
> avoiding initscript systems and packing system initialization into
> those binaries, or even moving to other libcs that have less relocation
> pressure (due to smaller libraries, feature sets, etc.).
>
> If you declare prelink dead, then it's dead; nobody will be bringing it
> back.  But I do still believe that, technology-wise, it's a good fit
> for embedded systems (remember, embedded doesn't mean "small") to help
> them meet specific integration needs.  But without help from people
> with the appropriate knowledge to implement new features, like DT_RELR,
> in the prelinker, there is little chance that it is anything but on
> life support.

More and more, I see that proper static linking is a *much* simpler solution than prelink, with the advantage that it also decreases code complexity and attack surface, and can keep up with ABI extensions far more easily.  For instance, static PIE is now supported on both glibc and musl.

And it is not that I declare it dead, but it will become dead-weight support that we will need to provide for the sake of a handful of specific usages which, due to lack of maintenance, will have subtle issues and missing support for new ABI extensions.
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  2022-01-13 18:37 ` Adhemerval Zanella
@ 2022-01-13 19:01 ` Carlos O'Donell
  0 siblings, 0 replies; 12+ messages in thread
From: Carlos O'Donell @ 2022-01-13 19:01 UTC
To: Adhemerval Zanella, Mark Hatle, Khem Raj, Libc-alpha

On 1/13/22 13:37, Adhemerval Zanella wrote:
> And it is not that I declare it dead, but it will become dead-weight
> support that we will need to provide for the sake of a handful of
> specific usages which, due to lack of maintenance, will have subtle
> issues and missing support for new ABI extensions.

*I* declare prelink dead because there is no interest in prelink.  Interest in prelink would entail IHVs, ISVs, or volunteers supporting the feature upstream in *all* impacted components.  Lack of interest means the feature is going to be removed to reduce the maintenance burden overall.

-- 
Cheers,
Carlos.
* Re: [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first
  2021-12-09 23:53 [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first Khem Raj
  2022-01-11 19:26 ` Adhemerval Zanella
@ 2022-01-13 19:02 ` Carlos O'Donell
  1 sibling, 0 replies; 12+ messages in thread
From: Carlos O'Donell @ 2022-01-13 19:02 UTC
To: Khem Raj, libc-alpha; +Cc: Mark Hatle, Mark Hatle

On 12/9/21 18:53, Khem Raj via Libc-alpha wrote:
> From: Mark Hatle <mark.hatle@windriver.com>
>
> According to the ELF specification:
>
>   When resolving symbolic references, the dynamic linker examines the
>   symbol tables with a breadth-first search.
>
> This function was using a depth-first search.  By doing so, the conflict
> resolution reported to the prelinker (when LD_TRACE_PRELINKING=1 is set)
> was incorrect.  This caused problems when there were various circular
> dependencies between libraries.  The problem usually manifested itself
> by the wrong IFUNC being executed.
>
> A similar issue has been reported here [1]
>
> [BZ #20488]
>
> [1] https://sourceware.org/legacy-ml/libc-alpha/2016-05/msg00034.html
>
> Signed-off-by: Mark Hatle <mark.hatle@kernel.crashing.org>
> Signed-off-by: Khem Raj <raj.khem@gmail.com>

This needs much more discussion before inclusion.  At the very least we need clear test cases that show the problem.  The alternative patch I would support is prelink support removal.

-- 
Cheers,
Carlos.
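The scope ordering the patch is concerned with can be observed without any prelink tooling; a minimal sketch (not from the thread) using glibc's loader debugging, where any dynamically linked binary works in place of /bin/true:

```shell
# Print the symbol-search scope lists the dynamic linker builds for each
# loaded object; 'scopes' is one of the categories listed by LD_DEBUG=help.
# The order of libraries within each scope is the symbol resolution order.
LD_DEBUG=scopes /bin/true 2>&1 | head -n 20
```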
end of thread, other threads: [~2022-01-13 19:03 UTC | newest]

Thread overview: 12+ messages
2021-12-09 23:53 [PATCH] elf/dl-deps.c: Make _dl_build_local_scope breadth first Khem Raj
2022-01-11 19:26 ` Adhemerval Zanella
2022-01-12 19:08   ` Mark Hatle
2022-01-12 20:12     ` Adhemerval Zanella
2022-01-12 20:41       ` Mark Hatle
2022-01-13 11:52         ` Adhemerval Zanella
2022-01-13 16:33           ` Mark Hatle
2022-01-13 17:20             ` Adhemerval Zanella
2022-01-13 18:00               ` Mark Hatle
2022-01-13 18:37                 ` Adhemerval Zanella
2022-01-13 19:01                   ` Carlos O'Donell
2022-01-13 19:02 ` Carlos O'Donell