* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
@ 2021-11-22 17:46 jma14
2021-11-23 13:58 ` Adhemerval Zanella
0 siblings, 1 reply; 15+ messages in thread
From: jma14 @ 2021-11-22 17:46 UTC (permalink / raw)
To: Florian Weimer, Adhemerval Zanella
Cc: John Mellor-Crummey, libc-alpha, Mark W. Krentel, Xiaozhu Meng
On 11/19/21 14:31, Florian Weimer wrote:
> * Adhemerval Zanella:
>>> "" for the main executable is widely known. Usually code uses it to
>>> implement a fallback on argv[0] or /proc/self/exe, though.
If this is widely known, it would be helpful if this was added to the
man pages. Currently the man pages define l_name as the "absolute
pathname where object was found." That sounds (to me at least) like
dlinfo(dlopen(NULL)) is a portable alternative to the Linux-specific
AT_EXECFN, contrary to reality.
>> There are still the issue where audit interface does not have direct
>> access to argv[0] from the audited process and '/proc' might also not
>> be accessible. I am still not convinced that provided argv[0] for
>> l_name for main executable is worse than "", specially because the
>> fallback might not work.
>
> I think it's better to give the auditor a chance to figure out whether
> they want to use program_invocation_name (if that's not available in the
> inner libc, that's for sure a bug we must fix), AT_EXECFN, or
> /proc/self/exe. If we pick one of these for the auditor, we make it
> more difficult to make the appropriate choice.
Apologies, but I don't understand the logic here. The main executable
is easy to identify in la_objopen as `l_prev == NULL && lmid ==
LM_ID_BASE`, the same (effectively) is done in gdb [1]. Applications
have dlinfo(dlopen(NULL)) to directly obtain the main executable's
link_map. I fail to see how setting l_name to "" is superior to either
of those options, especially given the downsides.
I agree with Adhemerval. There is no portable way for an auditor to
acquire the resolved path to the main executable, AT_EXECFN and /proc
are Linux-specific and dladdr (currently) often crashes from the
auditor. An l_name of "" means auditors *cannot* make the appropriate
choice.
[1]
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/solib-svr4.c;hb=d3de0860104b8bb8d496527fbb042c3b4c5c82dc#l1230
>>> […] […]
>>> # define __RELOC_POINTER(ptr, base) ((ptr) + (base))
>>> […]
>>> _Unwind_Ptr load_base;
>>> […]
>>> load_base = info->dlpi_addr;
>>> […]
>>> /* See if PC falls into one of the loaded segments. Find the eh_frame
>>> segment at the same time. */
>>> for (n = info->dlpi_phnum; --n >= 0; phdr++)
>>> {
>>> if (phdr->p_type == PT_LOAD)
>>> {
>>> _Unwind_Ptr vaddr = (_Unwind_Ptr)
>>> __RELOC_POINTER (phdr->p_vaddr, load_base);
>>> if (data->pc >= vaddr && data->pc < vaddr + phdr->p_memsz)
>>> {
>>> match = 1;
>>> pc_low = vaddr;
>>> pc_high = vaddr + phdr->p_memsz;
>>> }
>>> }
>>> […]
>>> Changing l_addr will break the libgcc unwinder. It uses l_addr to
>>> relocate the program header (see the code I quoted previously). Not
>>> everyone uses the platform unwinder, and the libgcc unwinder is
>>> sometimes linked statically. This is different from the l_name
>>> change: The l_addr would definitely cause widespread breakage.
Based on the quoted code snippet, the libgcc unwinder uses dlpi_addr
(from dl_iterate_phdr) instead of l_addr. To clarify, my proposal is
that *only* the publicly visible l_addr changes. In short replacing
the relation:
dlpi_addr (from dl_iterate_phdr) == l_addr != dli_fbase (from
dladdr) == (private) l_map_start
with the relation:
dlpi_addr (from dl_iterate_phdr) != l_addr == dli_fbase (from
dladdr) == (private) l_map_start
Any Glibc code using l_addr (including dl_iterate_phdr) would instead
use a newly added private field retaining the old semantic, and l_addr
would carry the value of l_map_start.
I don't expect users to use dl_iterate_phdr and link_map in tandem
(and then fail when l_addr != dlpi_addr), the former (+
dladdr(dlpi_phdr)) gives strictly superior information. The quoted
code snippet supports that expectation.
-Jonathon
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-22 17:46 Fwd: [PATCH v5 00/22] Some rtld-audit fixes jma14
@ 2021-11-23 13:58 ` Adhemerval Zanella
2021-11-23 14:02 ` Florian Weimer
0 siblings, 1 reply; 15+ messages in thread
From: Adhemerval Zanella @ 2021-11-23 13:58 UTC (permalink / raw)
To: jma14, Florian Weimer
Cc: John Mellor-Crummey, libc-alpha, Mark W. Krentel, Xiaozhu Meng
On 22/11/2021 14:46, jma14 wrote:
>
> On 11/19/21 14:31, Florian Weimer wrote:
>
>> * Adhemerval Zanella:
>>>> "" for the main executable is widely known. Usually code uses it to
>>>> implement a fallback on argv[0] or /proc/self/exe, though.
>
> If this is widely known, it would be helpful if this was added to the man pages. Currently the man pages define l_name as the "absolute pathname where object was found." That sounds (to me at least) like dlinfo(dlopen(NULL)) is a portable alternative to the Linux-specific AT_EXECFN, contrary to reality.
To provide an absolute pathname it would require to use realpath() or similar
strategy either before issuing la_objopen() or while parsing the arguments
or reading AT_EXECFN. I am not sure if this best approach, specially if the
executable is executed through a namespace or any kernel abstraction where
obtaining the full path though filesystem walk is not possible or restricted.
>
>>> There are still the issue where audit interface does not have direct
>>> access to argv[0] from the audited process and '/proc' might also not
>>> be accessible. I am still not convinced that provided argv[0] for
>>> l_name for main executable is worse than "", specially because the
>>> fallback might not work.
>>
>> I think it's better to give the auditor a chance to figure out whether
>> they want to use program_invocation_name (if that's not available in the
>> inner libc, that's for sure a bug we must fix), AT_EXECFN, or
>> /proc/self/exe. If we pick one of these for the auditor, we make it
>> more difficult to make the appropriate choice.
>
> Apologies, but I don't understand the logic here. The main executable is easy to identify in la_objopen as `l_prev == NULL && lmid == LM_ID_BASE`, the same (effectively) is done in gdb [1]. Applications have dlinfo(dlopen(NULL)) to directly obtain the main executable's link_map. I fail to see how setting l_name to "" is superior to either of those options, especially given the downsides.
>
> I agree with Adhemerval. There is no portable way for an auditor to acquire the resolved path to the main executable, AT_EXECFN and /proc are Linux-specific and dladdr (currently) often crashes from the auditor. An l_name of "" means auditors *cannot* make the appropriate choice.
In fact I think rather than using the argv[0], since it passing the executable
path is just a convention; I think it would be better to use AT_EXECFN. On
recent kernel it is always passed to userland and kernel should be responsible
to hide any filesystem information if it is required.
And it also helps for the case where execve() is called with bogus initial
argument.
>
> [1] https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/solib-svr4.c;hb=d3de0860104b8bb8d496527fbb042c3b4c5c82dc#l1230
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-23 13:58 ` Adhemerval Zanella
@ 2021-11-23 14:02 ` Florian Weimer
2021-11-23 16:25 ` Adhemerval Zanella
0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2021-11-23 14:02 UTC (permalink / raw)
To: Adhemerval Zanella
Cc: jma14, John Mellor-Crummey, libc-alpha, Mark W. Krentel, Xiaozhu Meng
* Adhemerval Zanella:
> In fact I think rather than using the argv[0], since it passing the
> executable path is just a convention; I think it would be better to
> use AT_EXECFN. On recent kernel it is always passed to userland and
> kernel should be responsible to hide any filesystem information if it
> is required.
It's still a relative path to an unknown directory, I think. I expect
(but have not checked) that it is the pathname argument to execveat,
which may not be meaningful to the new process image.
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-23 14:02 ` Florian Weimer
@ 2021-11-23 16:25 ` Adhemerval Zanella
2021-11-23 16:50 ` Florian Weimer
0 siblings, 1 reply; 15+ messages in thread
From: Adhemerval Zanella @ 2021-11-23 16:25 UTC (permalink / raw)
To: Florian Weimer
Cc: jma14, John Mellor-Crummey, libc-alpha, Mark W. Krentel, Xiaozhu Meng
On 23/11/2021 11:02, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> In fact I think rather than using the argv[0], since it passing the
>> executable path is just a convention; I think it would be better to
>> use AT_EXECFN. On recent kernel it is always passed to userland and
>> kernel should be responsible to hide any filesystem information if it
>> is required.
>
> It's still a relative path to an unknown directory, I think. I expect
> (but have not checked) that it is the pathname argument to execveat,
> which may not be meaningful to the new process image.
Yes, but it better than _dl_argv[0] and/or an empty string. Without
proper kernel support we can not reliable get the path, in fact the
kernel might in fact hides it on purpose.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-23 16:25 ` Adhemerval Zanella
@ 2021-11-23 16:50 ` Florian Weimer
2021-11-23 21:13 ` Jonathon Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2021-11-23 16:50 UTC (permalink / raw)
To: Adhemerval Zanella
Cc: jma14, John Mellor-Crummey, libc-alpha, Mark W. Krentel, Xiaozhu Meng
* Adhemerval Zanella:
> On 23/11/2021 11:02, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> In fact I think rather than using the argv[0], since it passing the
>>> executable path is just a convention; I think it would be better to
>>> use AT_EXECFN. On recent kernel it is always passed to userland and
>>> kernel should be responsible to hide any filesystem information if it
>>> is required.
>>
>> It's still a relative path to an unknown directory, I think. I expect
>> (but have not checked) that it is the pathname argument to execveat,
>> which may not be meaningful to the new process image.
>
> Yes, but it better than _dl_argv[0] and/or an empty string. Without
> proper kernel support we can not reliable get the path, in fact the
> kernel might in fact hides it on purpose.
I'm worried that if we put a path-looking string there, programmers will
assume it is THE path to the executable. With SUID programs, that looks
like a recipe for security vulnerabilities. These users need to use
/proc/self/exe and give up if that's not available.
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-23 16:50 ` Florian Weimer
@ 2021-11-23 21:13 ` Jonathon Anderson
2021-11-25 17:56 ` Adhemerval Zanella
0 siblings, 1 reply; 15+ messages in thread
From: Jonathon Anderson @ 2021-11-23 21:13 UTC (permalink / raw)
To: Florian Weimer, Adhemerval Zanella
Cc: jma14, John Mellor-Crummey, libc-alpha, Mark W. Krentel, Xiaozhu Meng
> On 22/11/2021 14:46, jma14 wrote:
>> On 11/19/21 14:31, Florian Weimer wrote:
>>
>>> * Adhemerval Zanella:
>>>>> "" for the main executable is widely known. Usually code uses it to
>>>>> implement a fallback on argv[0] or /proc/self/exe, though.
>> If this is widely known, it would be helpful if this was added to the man pages. Currently the man pages define l_name as the "absolute pathname where object was found." That sounds (to me at least) like dlinfo(dlopen(NULL)) is a portable alternative to the Linux-specific AT_EXECFN, contrary to reality.
> To provide an absolute pathname it would require to use realpath() or similar
> strategy either before issuing la_objopen() or while parsing the arguments
> or reading AT_EXECFN. I am not sure if this best approach, specially if the
> executable is executed through a namespace or any kernel abstraction where
> obtaining the full path though filesystem walk is not possible or restricted.
I had interpreted this as l_name may not necessarily be a canonical
path. To me, the wording "was found" indicates that this is the path
resulting from library search, in effect .../libfoo.so.1 instead of the
canonical .../libfoo.so.1.2.3. Currently l_name is not always absolute
if the search involved a relative path, which I also consider an issue
either in documentation or behavior (also affects dladdr).
Under this interpretation I don't see a need to use realpath() (at most
getcwd()), if a user requires a canonical path they can call realpath()
themselves. They already need to be robust against the possibility that
the file was deleted/replaced right after it was loaded,
deleted/replaced symlinks are a reasonable extension to that.
On 11/23/21 10:50, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> On 23/11/2021 11:02, Florian Weimer wrote:
>>> * Adhemerval Zanella:
>>>
>>>> In fact I think rather than using the argv[0], since it passing the
>>>> executable path is just a convention; I think it would be better to
>>>> use AT_EXECFN. On recent kernel it is always passed to userland and
>>>> kernel should be responsible to hide any filesystem information if it
>>>> is required.
>>> It's still a relative path to an unknown directory, I think. I expect
>>> (but have not checked) that it is the pathname argument to execveat,
>>> which may not be meaningful to the new process image.
execveat seems to generate an AT_EXECFN under /dev/fd, it has no meaning
in the new process image if O_CLOEXEC is used on the directory file
descriptor. Under the interpretation above I think this is reasonable.
>>>
>>> Yes, but it better than _dl_argv[0] and/or an empty string. Without
>>> proper kernel support we can not reliable get the path, in fact the
>>> kernel might in fact hides it on purpose.
> I'm worried that if we put a path-looking string there, programmers will
> assume it is THE path to the executable. With SUID programs, that looks
> like a recipe for security vulnerabilities. These users need to use
> /proc/self/exe and give up if that's not available.
I do not think l_name is unique in this concern, AT_EXECFN and dladdr
expose the same security risk and AFAICT this is not noted in either of
their man pages.
I'm not familiar with all the security constraints surrounding SUID, but
as a simple but effective resolution I would consider setting the
executable's l_name to the string "/proc/self/exe" in secure-execution
mode. A similar change could be done in getauxval() to transparently
modify AT_EXECFN. Those two changes would force all unwitting SUID users
to (semi-gracefully) fail when /proc is unavailable.
-Jonathon
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-23 21:13 ` Jonathon Anderson
@ 2021-11-25 17:56 ` Adhemerval Zanella
0 siblings, 0 replies; 15+ messages in thread
From: Adhemerval Zanella @ 2021-11-25 17:56 UTC (permalink / raw)
To: Jonathon Anderson, Florian Weimer
Cc: jma14, John Mellor-Crummey, libc-alpha, Mark W. Krentel, Xiaozhu Meng
On 23/11/2021 18:13, Jonathon Anderson wrote:
>> On 22/11/2021 14:46, jma14 wrote:
>>> On 11/19/21 14:31, Florian Weimer wrote:
>>>
>>>> * Adhemerval Zanella:
>>>>>> "" for the main executable is widely known. Usually code uses it to
>>>>>> implement a fallback on argv[0] or /proc/self/exe, though.
>>> If this is widely known, it would be helpful if this was added to the man pages. Currently the man pages define l_name as the "absolute pathname where object was found." That sounds (to me at least) like dlinfo(dlopen(NULL)) is a portable alternative to the Linux-specific AT_EXECFN, contrary to reality.
>> To provide an absolute pathname it would require to use realpath() or similar
>> strategy either before issuing la_objopen() or while parsing the arguments
>> or reading AT_EXECFN. I am not sure if this best approach, specially if the
>> executable is executed through a namespace or any kernel abstraction where
>> obtaining the full path though filesystem walk is not possible or restricted.
> I had interpreted this as l_name may not necessarily be a canonical path. To me, the wording "was found" indicates that this is the path resulting from library search, in effect .../libfoo.so.1 instead of the canonical .../libfoo.so.1.2.3. Currently l_name is not always absolute if the search involved a relative path, which I also consider an issue either in documentation or behavior (also affects dladdr).
>
> Under this interpretation I don't see a need to use realpath() (at most getcwd()), if a user requires a canonical path they can call realpath() themselves. They already need to be robust against the possibility that the file was deleted/replaced right after it was loaded, deleted/replaced symlinks are a reasonable extension to that.
Linux at least appends '(deleted)' if the pathname has been unlinked.
>
> On 11/23/21 10:50, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> On 23/11/2021 11:02, Florian Weimer wrote:
>>>> * Adhemerval Zanella:
>>>>
>>>>> In fact I think rather than using the argv[0], since it passing the
>>>>> executable path is just a convention; I think it would be better to
>>>>> use AT_EXECFN. On recent kernel it is always passed to userland and
>>>>> kernel should be responsible to hide any filesystem information if it
>>>>> is required.
>>>> It's still a relative path to an unknown directory, I think. I expect
>>>> (but have not checked) that it is the pathname argument to execveat,
>>>> which may not be meaningful to the new process image.
> execveat seems to generate an AT_EXECFN under /dev/fd, it has no meaning in the new process image if O_CLOEXEC is used on the directory file descriptor. Under the interpretation above I think this is reasonable.
>>>> Yes, but it better than _dl_argv[0] and/or an empty string. Without
>>>> proper kernel support we can not reliable get the path, in fact the
>>>> kernel might in fact hides it on purpose.
>> I'm worried that if we put a path-looking string there, programmers will
>> assume it is THE path to the executable. With SUID programs, that looks
>> like a recipe for security vulnerabilities. These users need to use
>> /proc/self/exe and give up if that's not available.
> I do not think l_name is unique in this concern, AT_EXECFN and dladdr expose the same security risk and AFAICT this is not noted in either of their man pages.
>
> I'm not familiar with all the security constraints surrounding SUID, but as a simple but effective resolution I would consider setting the executable's l_name to the string "/proc/self/exe" in secure-execution mode. A similar change could be done in getauxval() to transparently modify AT_EXECFN. Those two changes would force all unwitting SUID users to (semi-gracefully) fail when /proc is unavailable.
I think this is fair assumption.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-19 20:31 ` Florian Weimer
@ 2021-11-23 16:36 ` Adhemerval Zanella
0 siblings, 0 replies; 15+ messages in thread
From: Adhemerval Zanella @ 2021-11-23 16:36 UTC (permalink / raw)
To: Florian Weimer
Cc: Jonathon Anderson, John Mellor-Crummey, libc-alpha,
Mark W. Krentel, Xiaozhu Meng
On 19/11/2021 17:31, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>>> "" for the main executable is widely known. Usually code uses it to
>>> implement a fallback on argv[0] or /proc/self/exe, though.
>>
>> There are still the issue where audit interface does not have direct
>> access to argv[0] from the audited process and '/proc' might also not
>> be accessible. I am still not convinced that provided argv[0] for
>> l_name for main executable is worse than "", specially because the
>> fallback might not work.
>
> I think it's better to give the auditor a chance to figure out whether
> they want to use program_invocation_name (if that's not available in the
> inner libc, that's for sure a bug we must fix), AT_EXECFN, or
> /proc/self/exe. If we pick one of these for the auditor, we make it
> more difficult to make the appropriate choice.
>
>>> Changing l_addr will break the libgcc unwinder. It uses l_addr to
>>> relocate the program header (see the code I quoted previously). Not
>>> everyone uses the platform unwinder, and the libgcc unwinder is
>>> sometimes linked statically. This is different from the l_name change:
>>> The l_addr would definitely cause widespread breakage.
>>>
>>>> - Every use case I can think of for obtaining a link_map from the dl*
>>>> functions (dlinfo and dladdr1) will either already have the special
>>>> handling, or won't operate on the main executable, or likely won't opt
>>>> to use l_addr (vs. dlsym or dli_fbase) or l_name (vs. dli_fname).
>>>
>>> Some special-case the main executable based on l_name, I expect, which
>>> is why I'm so reluctant to change l_name. The GDB comment is actually
>>> hinting strongly towards a "" convention (that Solaris broke).
>>
>> So I take that Solaris does provide the application name to l_name? And
>> what kind of breakage it has done on gdb?
>
> Solaris seems to use the pathname argument to execve as l_name,
> via AT_SUN_EXECNAME. I do not know if it is an absolute name.
> The documentation for getexecname suggests it may not be:
>
> | Normally this is an absolute pathname, as the majority of commands are
> | executed by the shells that append the command name to the user's PATH
> | components. If this is not an absolute path, the output of getcwd(3C)
> | can be prepended to it to create an absolute path, unless the process
> | or one of its ancestors has changed its root directory or current
> | working directory since the last successful call to one of the exec
> | family of functions.
>
> <https://docs.oracle.com/cd/E36784_01/html/E36874/getexecname-3c.html>
>
> If that's accurate, it would be just like our AT_EXECFN. I suspect this
> won't work for fexecve, so they are just going to fall back to argv[0]
> in that case.
I seems to return a relative path used for process execution:
azanella@gcc-solaris11:~$ uname -a
SunOS gcc-solaris11 5.11 11.3 sun4u sparc SUNW,SPARC-Enterprise
azanella@gcc-solaris11:~$ cat lname.c
#include <link.h>
#include <dlfcn.h>
#include <stdio.h>
#include <assert.h>
int main (int argc, char *argv[])
{
void *h = dlopen (NULL, RTLD_NOW);
assert (h != NULL);
struct link_map *l;
assert (dlinfo (h, RTLD_DI_LINKMAP, &l) == 0);
printf ("%s\n", l->l_name);
return 0;
}
azanella@gcc-solaris11:~$ gcc -Wall lname.c -o lname
azanella@gcc-solaris11:~$ ./lname
lname
azanella@gcc-solaris11:~$ /export/home/azanella/lname
/export/home/azanella/lname
>
> Still I think we have 20+ years of doing things our way (the BSD way),
> and I do worry about the backwards compatibility impact of such a
> change.
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
@ 2021-11-22 17:46 jma14
0 siblings, 0 replies; 15+ messages in thread
From: jma14 @ 2021-11-22 17:46 UTC (permalink / raw)
To: Florian Weimer
Cc: John Mellor-Crummey, Xiaozhu Meng, Mark W. Krentel, libc-alpha
On 11/19/21 13:18, Florian Weimer wrote:
> * Jonathon Anderson:
>>>> Right now, we only require the program headers which we can obtain
>>>> from getauxval(AT_PHDR), however this technique has questionable
>>>> portability and robustness (getauxval returns an unsigned long, not a
>>>> pointer).
>
>>> A glibc port to an architecture where a long value cannot hold all
>>> pointer values will have to provide an alternative interface similar
>>> to getauxval, but that returns pointer values.
>
>> I would go one step farther and say getauxval is already broken for any
>> 64-bit architecture, unsigned long is only required to support 32 bits
>> as per the C standards. One of my greater fears is that some exotic
>> compiler will cleverly allocate only 4 bytes of stack space for the
>> return value, and we wouldn't know except for a subtle bug (dependent
>> on optimization flags!) that crashes our entire tool with SEGVs in the
>> auditor (where GDB doesn't give properly unwound call stacks).
>
> If ported to such an architecture, glibc would need several changes to
> accomodate this. Newer architectures take this into account and do not
> do funny things.
Ah, right, I forgot the sizes of the standard types are part of the
GNU ABI. That resolves my concern then.
>> This is a much clearer description of the semantic, it would be very
>> helpful the man pages used that sentence (or one like it) wherever the
>> l_addr value is exposed in the API (link_map->l_addr and
>> dl_phdr_info->dlpi_addr). It would also be very helpful if
>> Dl_info.dli_fbase was clearly documented as *not* l_addr but instead
>> byte 0/ELF header in the process image.
>
> I've made a note to update the manual pages.
Thanks!
>>>> dladdr gets its value from link_map->l_map_start instead of l_addr,
>>>> so the semantic we want is already present in a private field. It
>>>> seems to me these two fields could be swapped with little issue, if
>>>> altering the public semantic is not acceptable we could also be sated
>>>> if l_map_start was made public.
>>>
>>> Applications which know about the current semantics of l_addr will
>>> break, though. l_addr is also exposed to debuggers via the _r_debug
>>> interface. I really do not think we can make changes to l_addr. We
>>> have a similar issue around l_name being "" for the main program, and
>>> unfortuantely I will have to argue quite strongly against changing
>>> that.
>
>> Is adding new public fields completely off the table?
>
> To struct link_map? We could probably pull it off, but it would be
> years until such a change will be in the hands of the users. There is
> an internal structure that overlaps with the public struct link_map, and
> some applications poke at the private bits at fixed offsets. We've
> started not to strip ld.so downstream, so that these applications can
> switch to DWARF data to avoid dependencies on fixed offsets, but that
> has been a very recent change.
I would consider poking at private fields a last resort, portability
across (versions of) GNU/Linux distributions is something we fight
with on occasion. It also doesn't help the l_name issue.
>> I am keeping my hopes low for changing the public semantics, but I hope
>> that adding new fields for the missing data at least could fall into
>> the "bug fixes" category.
>
> We can in theory add new flags to dladdr1 or new requests to dlinfo.
> These things do not impact ABI and are backportable.
These options would not help the issue. dlinfo (currently) crashes in
the auditor, and using it relies on the (undocumented) fact that
dlopen handles are in fact just struct link_map*. dladdr already
provides the corrected values, adding a new option to dladdr1 isn't
needed.
> Changing struct link_map is challenging, as I explained. Adding new
> functions definitely triggers deployment delays because those aren't
> considered backportable. Adding fields to existing structs is usually
> impossible because these structs may have been used to define
> application types, and new fields cause subsequent offsets to change.
Ah, right. Can the new fields be added in a future release? If so,
then with backported fixes to the dladdr crashes dladdr(l_ld) can be
used as a portable workaround, provided it works as intended on all
Glibc targets (it works on x86-64/Linux but I'll need confirmation for
others). There is a performance cost to dladdr so I still think the
new fields are required, it also gives the workaround a clear lifetime.
If new fields are completely impossible (now and forever), then as an
LD_AUDIT user the only option (brought up so far) I feel comfortable
with is a change to the l_addr/l_name semantics. It's ABI-compatible,
I have a hard time believing there are applications that will regress
because of it, and it fixes an interface that was *never* quite right
(and differed from its documentation).
-Jonathon
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-19 19:56 ` Adhemerval Zanella
@ 2021-11-19 20:31 ` Florian Weimer
2021-11-23 16:36 ` Adhemerval Zanella
0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2021-11-19 20:31 UTC (permalink / raw)
To: Adhemerval Zanella
Cc: Jonathon Anderson, John Mellor-Crummey, libc-alpha,
Mark W. Krentel, Xiaozhu Meng
* Adhemerval Zanella:
>> "" for the main executable is widely known. Usually code uses it to
>> implement a fallback on argv[0] or /proc/self/exe, though.
>
> There are still the issue where audit interface does not have direct
> access to argv[0] from the audited process and '/proc' might also not
> be accessible. I am still not convinced that provided argv[0] for
> l_name for main executable is worse than "", specially because the
> fallback might not work.
I think it's better to give the auditor a chance to figure out whether
they want to use program_invocation_name (if that's not available in the
inner libc, that's for sure a bug we must fix), AT_EXECFN, or
/proc/self/exe. If we pick one of these for the auditor, we make it
more difficult to make the appropriate choice.
>> Changing l_addr will break the libgcc unwinder. It uses l_addr to
>> relocate the program header (see the code I quoted previously). Not
>> everyone uses the platform unwinder, and the libgcc unwinder is
>> sometimes linked statically. This is different from the l_name change:
>> The l_addr would definitely cause widespread breakage.
>>
>>> - Every use case I can think of for obtaining a link_map from the dl*
>>> functions (dlinfo and dladdr1) will either already have the special
>>> handling, or won't operate on the main executable, or likely won't opt
>>> to use l_addr (vs. dlsym or dli_fbase) or l_name (vs. dli_fname).
>>
>> Some special-case the main executable based on l_name, I expect, which
>> is why I'm so reluctant to change l_name. The GDB comment is actually
>> hinting strongly towards a "" convention (that Solaris broke).
>
> So I take that Solaris does provide the application name to l_name? And
> what kind of breakage it has done on gdb?
Solaris seems to use the pathname argument to execve as l_name,
via AT_SUN_EXECNAME. I do not know if it is an absolute name.
The documentation for getexecname suggests it may not be:
| Normally this is an absolute pathname, as the majority of commands are
| executed by the shells that append the command name to the user's PATH
| components. If this is not an absolute path, the output of getcwd(3C)
| can be prepended to it to create an absolute path, unless the process
| or one of its ancestors has changed its root directory or current
| working directory since the last successful call to one of the exec
| family of functions.
<https://docs.oracle.com/cd/E36784_01/html/E36874/getexecname-3c.html>
If that's accurate, it would be just like our AT_EXECFN. I suspect this
won't work for fexecve, so they are just going to fall back to argv[0]
in that case.
Still I think we have 20+ years of doing things our way (the BSD way),
and I do worry about the backwards compatibility impact of such a
change.
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-19 19:18 ` Florian Weimer
@ 2021-11-19 19:56 ` Adhemerval Zanella
2021-11-19 20:31 ` Florian Weimer
0 siblings, 1 reply; 15+ messages in thread
From: Adhemerval Zanella @ 2021-11-19 19:56 UTC (permalink / raw)
To: Florian Weimer, Jonathon Anderson
Cc: John Mellor-Crummey, libc-alpha, Mark W. Krentel, Xiaozhu Meng
On 19/11/2021 16:18, Florian Weimer via Libc-alpha wrote:
> * Jonathon Anderson:
>
>>>> Right now, we
>>>> only require the program headers which we can obtain from
>>>> getauxval(AT_PHDR), however this technique has questionable
>>>> portability and robustness (getauxval returns an unsigned long, not a
>>>> pointer).
>
>>> A glibc port to an architecture where a long value cannot hold all
>>> pointer values will have to provide an alternative interface similar to
>>> getauxval, but that returns pointer values.
>
>> I would go one step farther and say getauxval is already broken for
>> any 64-bit architecture, unsigned long is only required to support 32
>> bits as per the C standards. One of my greater fears is that some
>> exotic compiler will cleverly allocate only 4 bytes of stack space for
>> the return value, and we wouldn't know except for a subtle bug
>> (dependent on optimization flags!) that crashes our entire tool with
>> SEGVs in the auditor (where GDB doesn't give properly unwound call
>> stacks).
>
> If ported to such an architecture, glibc would need several changes to
> accomodate this. Newer architectures take this into account and do not
> do funny things. But Morello, as a capabilities-based architecture,
> does not have this luxury, so they have to do something about this
> interface. But that is (still) an outlier.
>
> I think the important point is that glibc interfaces do not need to be
> fully API-compatible with future architecture requirements. We can
> change APIs for future ports.
>
>>> Of course that's not the
>>> only interface with this problem (ElfW(Addr) is an integer as well).
>
>> AFAICT ElfW(Addr) is fine, it should always be an integer large enough
>> to store a pointer on the host architecture (i.e. a uintptr_t). Unless
>> I missed some specific arch where this doesn't work out to be the
>> case?
>
> Morello and other capabilities-based architectures. Pointers need to
> pointers there. Weird architectures do not have uintptr_t.
>
>>> makes the Morello glibc port quite interesting. So I think *something*
>>> like getauxval (AT_PHDR) will always be available, with pretty much
>>> identical semantics.
>>>
>>>> From an outside perspective the current l_addr semantic is fairly
>>>> undocumented, the dladdr and dlinfo man pages define it vaguely as
>>>> the "difference between the address in the ELF file and the address
>>>> in memory." That sounds (to me at least) like l_addr should point to
>>>> byte 0 in the file (the ELF header), and that seems to be correct in
>>>> all but the non-PIE case.
>>> I have struggled with this in the past. I agree that it is confusing.
>>> l_addr is the offset between virtual addresses in the program header of
>>> the ELF object and the actual addresses in the process image. This
>>> offset happens to be 0 for ET_EXEC objects, and only there.
>
>> This is a much clearer description of the semantic, it would be very
>> helpful the man pages used that sentence (or one like it) wherever the
>> l_addr value is exposed in the API (link_map->l_addr and
>> dl_phdr_info->dlpi_addr). It would also be very helpful if
>> Dl_info.dli_fbase was clearly documented as *not* l_addr but instead
>> byte 0/ELF header in the process image.
>
> I've made a note to update the manual pages.
>
>> That does sound like the "correct" way out, but dl_iterate_phdr
>> operates on the caller's namespace so one would need to inject a shim
>> library to do the actual call.
>
> Ugh, you are right. It means we currently can't unwind across dlmopen
> boundaries.
>
> So please use getauxval (AT_PHDR) for now. It is fully portable across
> all present glibc targets.
>
>>>> dladdr gets its value from link_map->l_map_start instead of l_addr,
>>>> so the semantic we want is already present in a private field. It
>>>> seems to me these two fields could be swapped with little issue, if
>>>> altering the public semantic is not acceptable we could also be sated
>>>> if l_map_start was made public.
>>> Applications which know about the current semantics of l_addr will
>>> break, though. l_addr is also exposed to debuggers via the _r_debug
>>> interface. I really do not think we can make changes to l_addr.
>>> We have a similar issue around l_name being "" for the main program, and
>>> unfortuantely I will have to argue quite strongly against changing that.
>
>> Is adding new public fields completely off the table?
>
> To struct link_map? We could probably pull it off, but it would be
> years until such a change will be in the hands of the users. There is
> an internal structure that overlaps with the public struct link_map,
> and some applications poke at the private bits at fixed offsets.
>
> We've started not to strip ld.so downstream, so that these applications
> can switch to DWARF data to avoid dependencies on fixed offsets, but
> that has been a very recent change.
>
>> If I can humor the impossible for a few moments longer, I personally
>> have a difficult time believing that anyone actually uses
>> link_map->l_addr or link_map->l_name in a way that would break by
>> changing their semantics for the main executable:
>
>> - The documentation hasn't improved for years so there can't be many
>> users that care about (or even noticed) this case in particular.
>
> "" for the main executable is widely known. Usually code uses it to
> implement a fallback on argv[0] or /proc/self/exe, though.
There are still the issue where audit interface does not have direct
access to argv[0] from the audited process and '/proc' might also not
be accessible. I am still not convinced that provided argv[0] for
l_name for main executable is worse than "", specially because the
fallback might not work.
>
> Changing l_addr will break the libgcc unwinder. It uses l_addr to
> relocate the program header (see the code I quoted previously). Not
> everyone uses the platform unwinder, and the libgcc unwinder is
> sometimes linked statically. This is different from the l_name change:
> The l_addr would definitely cause widespread breakage.
>
>> - Every use case I can think of for obtaining a link_map from the dl*
>> functions (dlinfo and dladdr1) will either already have the special
>> handling, or won't operate on the main executable, or likely won't opt
>> to use l_addr (vs. dlsym or dli_fbase) or l_name (vs. dli_fname).
>
> Some special-case the main executable based on l_name, I expect, which
> is why I'm so reluctant to change l_name. The GDB comment is actually
> hinting strongly towards a "" convention (that Solaris broke).
So I take that Solaris does provide the application name to l_name? And
what kind of breakage it has done on gdb?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-18 21:55 ` Jonathon Anderson
@ 2021-11-19 19:18 ` Florian Weimer
2021-11-19 19:56 ` Adhemerval Zanella
0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2021-11-19 19:18 UTC (permalink / raw)
To: Jonathon Anderson
Cc: John Mellor-Crummey, Xiaozhu Meng, Mark W. Krentel, libc-alpha
* Jonathon Anderson:
>>> Right now, we
>>> only require the program headers which we can obtain from
>>> getauxval(AT_PHDR), however this technique has questionable
>>> portability and robustness (getauxval returns an unsigned long, not a
>>> pointer).
>> A glibc port to an architecture where a long value cannot hold all
>> pointer values will have to provide an alternative interface similar to
>> getauxval, but that returns pointer values.
> I would go one step farther and say getauxval is already broken for
> any 64-bit architecture, unsigned long is only required to support 32
> bits as per the C standards. One of my greater fears is that some
> exotic compiler will cleverly allocate only 4 bytes of stack space for
> the return value, and we wouldn't know except for a subtle bug
> (dependent on optimization flags!) that crashes our entire tool with
> SEGVs in the auditor (where GDB doesn't give properly unwound call
> stacks).
If ported to such an architecture, glibc would need several changes to
accomodate this. Newer architectures take this into account and do not
do funny things. But Morello, as a capabilities-based architecture,
does not have this luxury, so they have to do something about this
interface. But that is (still) an outlier.
I think the important point is that glibc interfaces do not need to be
fully API-compatible with future architecture requirements. We can
change APIs for future ports.
>> Of course that's not the
>> only interface with this problem (ElfW(Addr) is an integer as well).
> AFAICT ElfW(Addr) is fine, it should always be an integer large enough
> to store a pointer on the host architecture (i.e. a uintptr_t). Unless
> I missed some specific arch where this doesn't work out to be the
> case?
Morello and other capabilities-based architectures. Pointers need to
pointers there. Weird architectures do not have uintptr_t.
>> makes the Morello glibc port quite interesting. So I think *something*
>> like getauxval (AT_PHDR) will always be available, with pretty much
>> identical semantics.
>>
>>> From an outside perspective the current l_addr semantic is fairly
>>> undocumented, the dladdr and dlinfo man pages define it vaguely as
>>> the "difference between the address in the ELF file and the address
>>> in memory." That sounds (to me at least) like l_addr should point to
>>> byte 0 in the file (the ELF header), and that seems to be correct in
>>> all but the non-PIE case.
>> I have struggled with this in the past. I agree that it is confusing.
>> l_addr is the offset between virtual addresses in the program header of
>> the ELF object and the actual addresses in the process image. This
>> offset happens to be 0 for ET_EXEC objects, and only there.
> This is a much clearer description of the semantic, it would be very
> helpful the man pages used that sentence (or one like it) wherever the
> l_addr value is exposed in the API (link_map->l_addr and
> dl_phdr_info->dlpi_addr). It would also be very helpful if
> Dl_info.dli_fbase was clearly documented as *not* l_addr but instead
> byte 0/ELF header in the process image.
I've made a note to update the manual pages.
> That does sound like the "correct" way out, but dl_iterate_phdr
> operates on the caller's namespace so one would need to inject a shim
> library to do the actual call.
Ugh, you are right. It means we currently can't unwind across dlmopen
boundaries.
So please use getauxval (AT_PHDR) for now. It is fully portable across
all present glibc targets.
>>> dladdr gets its value from link_map->l_map_start instead of l_addr,
>>> so the semantic we want is already present in a private field. It
>>> seems to me these two fields could be swapped with little issue, if
>>> altering the public semantic is not acceptable we could also be sated
>>> if l_map_start was made public.
>> Applications which know about the current semantics of l_addr will
>> break, though. l_addr is also exposed to debuggers via the _r_debug
>> interface. I really do not think we can make changes to l_addr.
>> We have a similar issue around l_name being "" for the main program, and
>> unfortuantely I will have to argue quite strongly against changing that.
> Is adding new public fields completely off the table?
To struct link_map? We could probably pull it off, but it would be
years until such a change will be in the hands of the users. There is
an internal structure that overlaps with the public struct link_map,
and some applications poke at the private bits at fixed offsets.
We've started not to strip ld.so downstream, so that these applications
can switch to DWARF data to avoid dependencies on fixed offsets, but
that has been a very recent change.
> If I can humor the impossible for a few moments longer, I personally
> have a difficult time believing that anyone actually uses
> link_map->l_addr or link_map->l_name in a way that would break by
> changing their semantics for the main executable:
> - The documentation hasn't improved for years so there can't be many
> users that care about (or even noticed) this case in particular.
"" for the main executable is widely known. Usually code uses it to
implement a fallback on argv[0] or /proc/self/exe, though.
Changing l_addr will break the libgcc unwinder. It uses l_addr to
relocate the program header (see the code I quoted previously). Not
everyone uses the platform unwinder, and the libgcc unwinder is
sometimes linked statically. This is different from the l_name change:
The l_addr would definitely cause widespread breakage.
> - Every use case I can think of for obtaining a link_map from the dl*
> functions (dlinfo and dladdr1) will either already have the special
> handling, or won't operate on the main executable, or likely won't opt
> to use l_addr (vs. dlsym or dli_fbase) or l_name (vs. dli_fname).
Some special-case the main executable based on l_name, I expect, which
is why I'm so reluctant to change l_name. The GDB comment is actually
hinting strongly towards a "" convention (that Solaris broke).
> I am keeping my hopes low for changing the public semantics, but I
> hope that adding new fields for the missing data at least could fall
> into the "bug fixes" category.
We can in theory add new flags to dladdr1 or new requests to dlinfo.
These things do not impact ABI and are backportable.
Changing struct link_map is challenging, as I explained. Adding new
functions definitely triggers deployment delays because those aren't
considered backportable. Adding fields to existing structs is usually
impossible because these structs may have been used to define
application types, and new fields cause subsequent offsets to change.
Thanks,
Flroian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-17 20:42 ` Florian Weimer
@ 2021-11-18 21:55 ` Jonathon Anderson
2021-11-19 19:18 ` Florian Weimer
0 siblings, 1 reply; 15+ messages in thread
From: Jonathon Anderson @ 2021-11-18 21:55 UTC (permalink / raw)
To: Florian Weimer
Cc: John Mellor-Crummey, Xiaozhu Meng, Mark W. Krentel, libc-alpha
On 11/17/21 14:42, Florian Weimer wrote:
> Thank you for the feedback. At this time, I want to comment on the
> l_addr aspect.
Thank you for the comments. I think I understand your points, below are
my own opinions on this matter.
>> An alternative would be to re-open the file from its path
>> (link_map->l_name), however this is a serious performance concern for
>> large-scale executions (metadata servers are known to be a bottleneck
>> of parallel filesystems).
> It also has its security issues because you might get back the binary
> you expect.
Agreed!
>> Right now, we
>> only require the program headers which we can obtain from
>> getauxval(AT_PHDR), however this technique has questionable
>> portability and robustness (getauxval returns an unsigned long, not a
>> pointer).
> A glibc port to an architecture where a long value cannot hold all
> pointer values will have to provide an alternative interface similar to
> getauxval, but that returns pointer values.
I would go one step farther and say getauxval is already broken for any
64-bit architecture, unsigned long is only required to support 32 bits
as per the C standards. One of my greater fears is that some exotic
compiler will cleverly allocate only 4 bytes of stack space for the
return value, and we wouldn't know except for a subtle bug (dependent on
optimization flags!) that crashes our entire tool with SEGVs in the
auditor (where GDB doesn't give properly unwound call stacks).
> Of course that's not the
> only interface with this problem (ElfW(Addr) is an integer as well).
AFAICT ElfW(Addr) is fine, it should always be an integer large enough
to store a pointer on the host architecture (i.e. a uintptr_t). Unless I
missed some specific arch where this doesn't work out to be the case?
> It
> makes the Morello glibc port quite interesting. So I think *something*
> like getauxval (AT_PHDR) will always be available, with pretty much
> identical semantics.
>
>> From an outside perspective the current l_addr semantic is fairly
>> undocumented, the dladdr and dlinfo man pages define it vaguely as
>> the "difference between the address in the ELF file and the address
>> in memory." That sounds (to me at least) like l_addr should point to
>> byte 0 in the file (the ELF header), and that seems to be correct in
>> all but the non-PIE case.
> I have struggled with this in the past. I agree that it is confusing.
> l_addr is the offset between virtual addresses in the program header of
> the ELF object and the actual addresses in the process image. This
> offset happens to be 0 for ET_EXEC objects, and only there.
This is a much clearer description of the semantic, it would be very
helpful the man pages used that sentence (or one like it) wherever the
l_addr value is exposed in the API (link_map->l_addr and
dl_phdr_info->dlpi_addr). It would also be very helpful if
Dl_info.dli_fbase was clearly documented as *not* l_addr but instead
byte 0/ELF header in the process image.
> I think the sort-of-official way out of this conundrum is to call
> dl_iterate_phdr and then look at the program headers. This is what
> _Unwind_Find_FDE in libgcc does to find the object boundaries and the
> PT_GNU_EH_FRAME segment, see libgcc/unwind-dw2-fde-dip.c in the GCC
> sources:
>
> […]
> # define __RELOC_POINTER(ptr, base) ((ptr) + (base))
> […]
> _Unwind_Ptr load_base;
> […]
> load_base = info->dlpi_addr;
> […]
> /* See if PC falls into one of the loaded segments. Find the eh_frame
> segment at the same time. */
> for (n = info->dlpi_phnum; --n >= 0; phdr++)
> {
> if (phdr->p_type == PT_LOAD)
> {
> _Unwind_Ptr vaddr = (_Unwind_Ptr)
> __RELOC_POINTER (phdr->p_vaddr, load_base);
> if (data->pc >= vaddr && data->pc < vaddr + phdr->p_memsz)
> {
> match = 1;
> pc_low = vaddr;
> pc_high = vaddr + phdr->p_memsz;
> }
> }
> […]
>
> This is not even glibc-specific, other systems have dl_iterate_phdr as
> well.
That does sound like the "correct" way out, but dl_iterate_phdr operates
on the caller's namespace so one would need to inject a shim library to
do the actual call.
Pulling this off is not trivial, I have a number of concerns with the
approach:
- It seems to be fixed upstream, but with 2.32 and prior dlmopen will
assert when used within the la_activity(CONSISTENT) for the target
namespace. So the shim library needs to be loaded via other means
(LD_PRELOAD), leaving the (much earlier) auditor without options until
the shim gets loaded.
- The above also means that dlsym can't be used to resolve the shim
function symbol, so it needs to be manually derived from the dynamic
symbol table. Which is awkward and a bit dangerous, AFAIK there isn't a
simple way to get the size of the dynamic symbol table so if the symbol
isn't there it usually SEGVs in the auditor (running off the end).
- The above also means that init constructors can't be "promoted," so
dl_iterate_phdr will be called before init constructors (Glibc and shim)
have run for the target namespace. My mini-experiments on x86-64 pass so
it doesn't seem like an issue, but it still makes me anxious.
- The above *also* means that you can only shim into the main
namespace (with upstream Glibc, only into non-auditor namespaces), so
this (complicated!) trick is only useful for getting data for the main
executable. Which really is the only binary that needs to be worked
around, but it still leaves oddly diverging control flow.
- dl_iterate_phdr will iterate the whole namespace and can't be
stopped in the middle, so there are mild performance concerns with
applications pulling in hundreds of libraries. A single call might be
acceptable, but more than that probably will be a problem.
In summary, ugly as heck but it probably works. With all these caveats
though I have difficulty considering this workaround "official" by any
stretch of the imagination.
>> dladdr gets its value from link_map->l_map_start instead of l_addr,
>> so the semantic we want is already present in a private field. It
>> seems to me these two fields could be swapped with little issue, if
>> altering the public semantic is not acceptable we could also be sated
>> if l_map_start was made public.
> Applications which know about the current semantics of l_addr will
> break, though. l_addr is also exposed to debuggers via the _r_debug
> interface. I really do not think we can make changes to l_addr.
> We have a similar issue around l_name being "" for the main program, and
> unfortuantely I will have to argue quite strongly against changing that.
Is adding new public fields completely off the table? HPCToolkit (and
other users in need) can use a simple configure test to check whether
the fields are present, and AFAICT adding new fields shouldn't break
backwards compatibility for older projects or binaries. This would also
give a place to resolve the concerns around returning argv[0] instead of
the actual binary path.
The motivation for these requests is improved usability, the auditor
interface (and _r_debug) are unique in that they *only* give the
link_map and everything else must be derived from it. dladdr isn't
always an option for auditors (another of our 'Tier 2' issues),
dl_iterate_phdr is painful to use as noted above, the remaining dl*
functions require dlopen handles (and/or crash in the auditor, another
'Tier 2' issue). In HPCToolkit we managed to scrounge up alternative
paths for the missing data (which is why this isn't a 'Tier 1' issue),
but the interface itself is still unacceptably unusable and has been
since its first implementation. Any change to fill in this data is
strictly an improvement.
If I can humor the impossible for a few moments longer, I personally
have a difficult time believing that anyone actually uses
link_map->l_addr or link_map->l_name in a way that would break by
changing their semantics for the main executable:
- The documentation hasn't improved for years so there can't be many
users that care about (or even noticed) this case in particular.
- Every use case I can think of for obtaining a link_map from the dl*
functions (dlinfo and dladdr1) will either already have the special
handling, or won't operate on the main executable, or likely won't opt
to use l_addr (vs. dlsym or dli_fbase) or l_name (vs. dli_fname).
- dl_iterate_phdr is a preferable alternative to manually iterating
the link_map chain and provides l_addr itself, and the "better"
dli_fname from dladdr(dlpi_phdr) is far easier to get than l_name.
- The users of the auditor interface can be counted on one hand (IIRC
HPCToolkit, Spindle and sotruss), at least for HPCToolkit we are
generally fine with changes if it means we get a *usable* audit
interface out of the deal.
- And my cursory examination of the logic in GDB [1] and LLDB [2] (and
immediately surrounding code) makes it seem plausible that debuggers
treat the main executable very differently and generally ignore it in
the link_map.
I can't say with absolute confidence that nothing would break, but I
currently consider it very rare and most likely those that do were
already intentionally abusing the dl* interface.
[1]
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/solib-svr4.c;hb=d3de0860104b8bb8d496527fbb042c3b4c5c82dc#l1225
[2]
https://github.com/llvm/llvm-project/blob/ebf8d74e929d908829eda4ad8548ec21e2dbc6ae/lldb/source/Plugins/Process/POSIX/NativeProcessELF.cpp#L175-L176
> We should perhaps collect these grievances and work on better interfaces
> for the future. However, that will be quite a long-term investment
> because it will take years until these new interfaces will be available
> on your users' systems. Bug fixes we can roll out more quickly, but
> glibc interface changes typically happen at major distribution release
> boundaries only. For the time being, it has to be workarounds for
> missing interfaces, I'm afraid.
I am keeping my hopes low for changing the public semantics, but I hope
that adding new fields for the missing data at least could fall into the
"bug fixes" category. From my perspective the public link_map interface
was *never* actually usable, any change to fix it is strictly an
(ABI-compatible) improvement. :)
-Jonathon
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
2021-11-17 18:08 ` John Mellor-Crummey
@ 2021-11-17 20:42 ` Florian Weimer
2021-11-18 21:55 ` Jonathon Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2021-11-17 20:42 UTC (permalink / raw)
To: Jonathon Anderson
Cc: John Mellor-Crummey, Xiaozhu Meng, Mark W. Krentel, libc-alpha
Thank you for the feedback. At this time, I want to comment on the
l_addr aspect.
> > 3. For non-PIE executables the base address listed in link_map->l_addr
> > for the main application binary is 0, even though dladdr is able to
> > recover the correct offset. La_objopen is affected by this.
> >
> > This would require to change an internal semantic for link_map->l_addr.
> > This is not straighfoward and I am not sure about the direct gains.
>
> Again, we are wholly sympathetic to the difficulty of refactoring
> complex code!
>
> The motivation for providing a consistent link_map->l_addr value is
> to unify the handling for the main executable with any other binary
> and to allow access to the ELF header of the main executable (which
> provides fields not available anywhere else: type, ABI, entry
> point...). An alternative would be to re-open the file from its path
> (link_map->l_name), however this is a serious performance concern for
> large-scale executions (metadata servers are known to be a bottleneck
> of parallel filesystems).
It also has its security issues because you might get back the binary
you expect.
> dladdr is not always an option for
> auditors, as noted by another of our 'Tier 2' issues. Right now, we
> only require the program headers which we can obtain from
> getauxval(AT_PHDR), however this technique has questionable
> portability and robustness (getauxval returns an unsigned long, not a
> pointer).
A glibc port to an architecture where a long value cannot hold all
pointer values will have to provide an alternative interface similar to
getauxval, but that returns pointer values. Of course that's not the
only interface with this problem (ElfW(Addr) is an integer as well). It
makes the Morello glibc port quite interesting. So I think *something*
like getauxval (AT_PHDR) will always be available, with pretty much
identical semantics.
> From an outside perspective the current l_addr semantic is fairly
> undocumented, the dladdr and dlinfo man pages define it vaguely as
> the "difference between the address in the ELF file and the address
> in memory." That sounds (to me at least) like l_addr should point to
> byte 0 in the file (the ELF header), and that seems to be correct in
> all but the non-PIE case.
I have struggled with this in the past. I agree that it is confusing.
l_addr is the offset between virtual addresses in the program header of
the ELF object and the actual addresses in the process image. This
offset happens to be 0 for ET_EXEC objects, and only there.
I think the sort-of-official way out of this conundrum is to call
dl_iterate_phdr and then look at the program headers. This is what
_Unwind_Find_FDE in libgcc does to find the object boundaries and the
PT_GNU_EH_FRAME segment, see libgcc/unwind-dw2-fde-dip.c in the GCC
sources:
[…]
# define __RELOC_POINTER(ptr, base) ((ptr) + (base))
[…]
_Unwind_Ptr load_base;
[…]
load_base = info->dlpi_addr;
[…]
/* See if PC falls into one of the loaded segments. Find the eh_frame
segment at the same time. */
for (n = info->dlpi_phnum; --n >= 0; phdr++)
{
if (phdr->p_type == PT_LOAD)
{
_Unwind_Ptr vaddr = (_Unwind_Ptr)
__RELOC_POINTER (phdr->p_vaddr, load_base);
if (data->pc >= vaddr && data->pc < vaddr + phdr->p_memsz)
{
match = 1;
pc_low = vaddr;
pc_high = vaddr + phdr->p_memsz;
}
}
[…]
This is not even glibc-specific, other systems have dl_iterate_phdr as
well.
getauxval (AT_PHDR) is definitely the more direct, Linux-specific
approach. It's also much faster because it does not involve locking.
> dladdr gets its value from link_map->l_map_start instead of l_addr,
> so the semantic we want is already present in a private field. It
> seems to me these two fields could be swapped with little issue, if
> altering the public semantic is not acceptable we could also be sated
> if l_map_start was made public.
Applications which know about the current semantics of l_addr will
break, though. l_addr is also exposed to debuggers via the _r_debug
interface. I really do not think we can make changes to l_addr.
We have a similar issue around l_name being "" for the main program, and
unfortuantely I will have to argue quite strongly against changing that.
We should perhaps collect these grievances and work on better interfaces
for the future. However, that will be quite a long-term investment
because it will take years until these new interfaces will be available
on your users' systems. Bug fixes we can roll out more quickly, but
glibc interface changes typically happen at major distribution release
boundaries only. For the time being, it has to be workarounds for
missing interfaces, I'm afraid.
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Fwd: [PATCH v5 00/22] Some rtld-audit fixes
[not found] <EA69A62D-7C01-4536-B551-2609226053F2@rice.edu>
@ 2021-11-17 18:08 ` John Mellor-Crummey
2021-11-17 20:42 ` Florian Weimer
0 siblings, 1 reply; 15+ messages in thread
From: John Mellor-Crummey @ 2021-11-17 18:08 UTC (permalink / raw)
To: John Mellor-Crummey via Libc-alpha
Cc: John Mellor-Crummey, Jonathon Anderson, Mark W. Krentel, Xiaozhu Meng
We are grateful to the members of the libc-alpha team who have been working on authoring (Adhemerval Zanella) and reviewing (Florian Weimer) fixes for the auditor problems that we previously reported to this list.
Jonathon Anderson, the auditor expert on my team, has reviewed the patches and offers some feedback about them.
--
John Mellor-Crummey Professor
Dept of Computer Science Rice University
email: johnmc@rice.edu phone: 713-348-5179
> Begin forwarded message:
>
> From: Jonathon Anderson <janderson@rice.edu>
> Subject: Re: [PATCH v5 00/22] Some rtld-audit fixes
> Date: November 16, 2021 at 5:38:28 PM CST
>
>
> === Reply to [00/20]: ===
>
> Many thanks to Adhemerval Zanella for authoring and maintaining these fixes and to Florian Weimer for reviewing them! We are very happy to see progress towards getting our bugs resolved.
>
> I have reviewed the tests included with these patches, presuming all these tests pass all of our reported 'Tier 1' issues are resolved by this patchset. I have also tested this patch series in an x86-64 Fedora rawhide container, all of our 'Tier 1' issue reproducers pass (except for the ARM-specific reproducer which I skipped). Not including the ones Adhemerval raised questions about, our 'Tier 2' issues that are not yet resolved in upstream + these patches are as follows (see [1] for full details):
>
> - "Various Glibc functions cannot be called from an auditor": Several Glibc functions segfault when called from an auditor, many dependent on the initialization of the Glibc in the application namespace (LM_ID_BASE). Patch [13/20] fixes and tests the isspace case. gethostid does not have a test but our reproducer passes. dlopen/dladdr are also known to have this issue, fixes for these are not included in this patchset.
>
> - "La_activity calls are missing or mis-ordered with respect to la_obj* calls": The calls to la_activity seen in practice often differ from our presumed interpretation of the rtld-audit man page. Patch [17/20] fixes case 2 (la_activity is not called during application exit, even though la_objclose is). A fix for case 4 (la_activity(LA_ACT_DELETE) is only called after la_objclose, and immediately followed by la_activity(LA_ACT_CONSISTENT)) is not included in this patchset.
>
> Overall, we are very pleased with these patches, and would love to see them included in official releases of Glibc and RHEL in the near future.
>
> In addition, since the time of the initial development of these patches we have uncovered a pair of new 'Tier 2' issues (see [1] for full details):
>
> - "Related recursive dl*open calls can crash or cause inconsistent state": Dlopen and dlmopen when called recursively (under very specific conditions) will crash uncatchably or return prior to the init constructors of the requested binaries (the difference is presumed to be based on whether loader assertions were enabled during the build).
>
> - "Auditor namespaces (undocumentedly) differ from normal namespaces": The separate namespace used to load an auditor behaves very differently from a "normal" namespace created by the application, however there is no indication in the auditor callbacks that a namespace falls into this "special" case. Since LD_AUDIT can load auditors arbitrarily *all* auditors must be robust against the idiosyncrasies of auditor namespaces, complicating already delicate auditor code.
>
> Of course, we would also like our 'Tier 2' issues to also be resolved soon. Below are our responses to the questions Adhemerval raised concerning a few of our issues:
>
> > There is also some point brough by John Melloc-Crummey documents that
> > I don't have a straighforward answer so I haven't added on this
> > patchset:
> >
> > 1 la_activity(LA_ACT_ADD) is never called for auditor namespaces,
> > even though la_objopen and la_activity(LA_ACT_CONSISTENT) are.
> >
> > There is no easy solution for this: we need at least to load the
> > *first* auditor to actually issue the la_activity(LA_ACT_ADD). It
> > means that it would *only* work for subsequent audit modules, and
> > adding this specific semantic is confusing and does not really
> > improve things (it only helps when multiple audit modules are used).
>
> Our reproducer for this issue passes even though there is no explicit test, so this question may be resolved in part already. To provide some of our motivation for this issue:
>
> The intended use case is in fact multiple audit modules. As detailed in the introduction to our issue document [1], the available interfaces aside from LD_AUDIT for intercepting functions are insufficient for libraries intended to accelerate applications and for performance tools. We predict a near-future scenario where large-scale applications are accelerated by auditors, in order to provide suitable performance measurements in this environment we need first-class support for multiple auditors. We see this already taking place with Spindle [2], an auditor which accelerates dynamic library loading using cross-node caching and communication. If Spindle is helping or hurting the performance of an application, we want to be able to see that in our performance measurements. Degraded callbacks from Glibc for auditors increases the burden and complexity on our own code to work around/defend against such idiosyncrasies. If there are multiple auditors other than HPCToolkit, e.g. spindle and any other auditor, each of the auditors may need to address these issues.
>
> > 2. la_objopen is called for the main binary and for ld.so before the
> > first la_activity(LA_ACT_ADD) call. This contradicts the pattern
> > found in a successful dlopen (where la_activity(LA_ACT_ADD)
> > precedes la_objopen).
> >
> > The constrain here is we need to handle DT_AUDIT and DT_DEPAUDIT
> > dynamic tags, which means we need to first load the executable in
> > memory to parse the required audit modules. So we need to first parse
> > the dynamic audit tags, load the audit modules, and then load the
> > object itself.
>
> We are wholly sympathetic to the difficulty of adjusting complex and delicate code!
>
> From an outside perspective this constraint does not seem to be an issue, consider that the executable's la_objopen must occur *after* the DT_*AUDIT auditors have been loaded so that it can be reported to the newly-loaded auditors, this also matches what happens in miniature experiments. I would think then that there is (hopefully one) code path where la_activity(ADD) could be called just before the la_objopen for the executable or just after DT_*AUDIT auditors have been loaded.
>
> > 3. For non-PIE executables the base address listed in link_map->l_addr
> > for the main application binary is 0, even though dladdr is able to
> > recover the correct offset. La_objopen is affected by this.
> >
> > This would require to change an internal semantic for link_map->l_addr.
> > This is not straighfoward and I am not sure about the direct gains.
>
> Again, we are wholly sympathetic to the difficulty of refactoring complex code!
>
> The motivation for providing a consistent link_map->l_addr value is to unify the handling for the main executable with any other binary and to allow access to the ELF header of the main executable (which provides fields not available anywhere else: type, ABI, entry point...). An alternative would be to re-open the file from its path (link_map->l_name), however this is a serious performance concern for large-scale executions (metadata servers are known to be a bottleneck of parallel filesystems). dladdr is not always an option for auditors, as noted by another of our 'Tier 2' issues. Right now, we only require the program headers which we can obtain from getauxval(AT_PHDR), however this technique has questionable portability and robustness (getauxval returns an unsigned long, not a pointer).
>
> From an outside perspective the current l_addr semantic is fairly undocumented, the dladdr and dlinfo man pages define it vaguely as the "difference between the address in the ELF file and the address in memory." That sounds (to me at least) like l_addr should point to byte 0 in the file (the ELF header), and that seems to be correct in all but the non-PIE case. It makes little sense to expose a value without a clear (documented!) interpretation, and knowing this deviation makes it unclear to me as a user how I should be using this value (and makes me wonder how many other users have made the same mistake).
>
> dladdr gets its value from link_map->l_map_start instead of l_addr, so the semantic we want is already present in a private field. It seems to me these two fields could be swapped with little issue, if altering the public semantic is not acceptable we could also be sated if l_map_start was made public.
>
> [1] https://docs.google.com/document/d/1dVaDBdzySecxQqD6hLLzDrEF18M1UtjDna9gL5BWWI0/edit# <https://docs.google.com/document/d/1dVaDBdzySecxQqD6hLLzDrEF18M1UtjDna9gL5BWWI0/edit#>
> [2] https://github.com/hpc/Spindle <https://github.com/hpc/Spindle>
>
> === Reply to [17/20]: ===
>
> > la_activity() is not called during application exit, even though
> > la_objclose is.
> >
> > Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
>
> Many thanks for the patch!
>
> As an outsider's review, the included test does not appear to test the new functionality noted in the commit message and fixed in the patch. It merely tests that la_objopen/close pairs are properly generated for all binaries, not that an la_activity(DELETE) call is made at program termination.
>
> Since dlclose is not called in the test body, I believe all that should be needed is an additional check that la_activity(DELETE) is called exactly once.
>
> -Jonathon
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2021-11-25 17:56 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-22 17:46 Fwd: [PATCH v5 00/22] Some rtld-audit fixes jma14
2021-11-23 13:58 ` Adhemerval Zanella
2021-11-23 14:02 ` Florian Weimer
2021-11-23 16:25 ` Adhemerval Zanella
2021-11-23 16:50 ` Florian Weimer
2021-11-23 21:13 ` Jonathon Anderson
2021-11-25 17:56 ` Adhemerval Zanella
-- strict thread matches above, loose matches on Subject: below --
2021-11-22 17:46 jma14
[not found] <EA69A62D-7C01-4536-B551-2609226053F2@rice.edu>
2021-11-17 18:08 ` John Mellor-Crummey
2021-11-17 20:42 ` Florian Weimer
2021-11-18 21:55 ` Jonathon Anderson
2021-11-19 19:18 ` Florian Weimer
2021-11-19 19:56 ` Adhemerval Zanella
2021-11-19 20:31 ` Florian Weimer
2021-11-23 16:36 ` Adhemerval Zanella
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).