From: jma14 <jma14@rice.edu>
To: Florian Weimer <fweimer@redhat.com>,
Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: John Mellor-Crummey <johnmc@rice.edu>,
libc-alpha@sourceware.org, "Mark W. Krentel" <krentel@rice.edu>,
Xiaozhu Meng <xm13@rice.edu>
Subject: Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
Date: Mon, 22 Nov 2021 11:46:29 -0600
Message-ID: <20211122114629.Horde._rW0tgxbf_wCwsiLyKcms3g@webmail.rice.edu>
On 11/19/21 14:31, Florian Weimer wrote:
> * Adhemerval Zanella:
>>> "" for the main executable is widely known. Usually code uses it to
>>> implement a fallback on argv[0] or /proc/self/exe, though.
If this is widely known, it would be helpful to add it to the man
pages. Currently the man pages define l_name as the "absolute
pathname where object was found." That sounds (to me at least) as if
dlinfo(dlopen(NULL)) were a portable alternative to the Linux-specific
AT_EXECFN, which in reality it is not.
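To make the concern concrete, here is a minimal sketch (not taken from
glibc or the patch series) of how an application might look up the
main executable's path today. The helper name main_executable_path is
hypothetical. It assumes glibc on Linux, where l_name for the main
program is typically "", so the Linux-specific fallback is what
usually answers. Note that AT_EXECFN is the pathname used to execute
the program, not a resolved absolute path.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>
#include <sys/auxv.h>

/* Hypothetical helper: best-effort path of the main executable.
   Returns NULL if neither source yields a non-empty string. */
const char *main_executable_path(void)
{
    struct link_map *map = NULL;
    const char *name = NULL;
    void *handle = dlopen(NULL, RTLD_LAZY);  /* handle for the main program */

    if (handle != NULL) {
        if (dlinfo(handle, RTLD_DI_LINKMAP, &map) == 0 && map != NULL)
            name = map->l_name;
        /* The main program is never unloaded, so l_name stays valid. */
        dlclose(handle);
    }
    if (name != NULL && name[0] != '\0')
        return name;                         /* l_name was actually filled in */

    /* Linux-specific fallback: the pathname used to execute the program.
       It is not resolved and may be relative. */
    unsigned long execfn = getauxval(AT_EXECFN);
    return execfn != 0 ? (const char *) execfn : NULL;
}
```

On glibc older than 2.34 this needs to be linked with -ldl.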
>> There are still the issue where audit interface does not have direct
>> access to argv[0] from the audited process and '/proc' might also not
>> be accessible. I am still not convinced that provided argv[0] for
>> l_name for main executable is worse than "", specially because the
>> fallback might not work.
>
> I think it's better to give the auditor a chance to figure out whether
> they want to use program_invocation_name (if that's not available in the
> inner libc, that's for sure a bug we must fix), AT_EXECFN, or
> /proc/self/exe. If we pick one of these for the auditor, we make it
> more difficult to make the appropriate choice.
Apologies, but I don't understand the logic here. The main executable
is easy to identify in la_objopen as `l_prev == NULL && lmid ==
LM_ID_BASE`; gdb does effectively the same [1]. Applications
have dlinfo(dlopen(NULL)) to directly obtain the main executable's
link_map. I fail to see how setting l_name to "" is superior to either
of those options, especially given the downsides.
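A minimal sketch of that la_objopen check, assuming a glibc auditor
built as a shared object and loaded via LD_AUDIT. The helper
is_main_executable and the variable main_map are hypothetical names
introduced only for illustration:

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: remembers the main executable's link_map once seen. */
static struct link_map *main_map;

/* The main executable is the head of the base namespace's list. */
static int is_main_executable(const struct link_map *map, Lmid_t lmid)
{
    return lmid == LM_ID_BASE && map->l_prev == NULL;
}

/* Required entry point: accept the audit interface version in use. */
unsigned int la_version(unsigned int version)
{
    (void) version;
    return LAV_CURRENT;
}

/* Called by the dynamic linker for every object it loads. */
unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void) cookie;
    if (is_main_executable(map, lmid))
        main_map = map;  /* map->l_name may still be "" here */
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}
```

The check itself needs nothing from l_name, which is the point: the
name is not needed to recognize the main executable, only to report
where it lives.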
I agree with Adhemerval. There is no portable way for an auditor to
acquire the resolved path to the main executable: AT_EXECFN and /proc
are Linux-specific, and dladdr (currently) often crashes when called
from the auditor. An l_name of "" means auditors *cannot* make the
appropriate choice.
[1] https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/solib-svr4.c;hb=d3de0860104b8bb8d496527fbb042c3b4c5c82dc#l1230
>>> […] […]
>>> # define __RELOC_POINTER(ptr, base) ((ptr) + (base))
>>> […]
>>> _Unwind_Ptr load_base;
>>> […]
>>> load_base = info->dlpi_addr;
>>> […]
>>> /* See if PC falls into one of the loaded segments. Find the eh_frame
>>> segment at the same time. */
>>> for (n = info->dlpi_phnum; --n >= 0; phdr++)
>>> {
>>> if (phdr->p_type == PT_LOAD)
>>> {
>>> _Unwind_Ptr vaddr = (_Unwind_Ptr)
>>> __RELOC_POINTER (phdr->p_vaddr, load_base);
>>> if (data->pc >= vaddr && data->pc < vaddr + phdr->p_memsz)
>>> {
>>> match = 1;
>>> pc_low = vaddr;
>>> pc_high = vaddr + phdr->p_memsz;
>>> }
>>> }
>>> […]
>>> Changing l_addr will break the libgcc unwinder. It uses l_addr to
>>> relocate the program header (see the code I quoted previously). Not
>>> everyone uses the platform unwinder, and the libgcc unwinder is
>>> sometimes linked statically. This is different from the l_name
>>> change: The l_addr would definitely cause widespread breakage.
Based on the quoted code snippet, the libgcc unwinder uses dlpi_addr
(from dl_iterate_phdr) rather than l_addr. To clarify, my proposal is
that *only* the publicly visible l_addr changes. In short, it replaces
the relation:
    dlpi_addr (from dl_iterate_phdr) == l_addr
        != dli_fbase (from dladdr) == (private) l_map_start
with the relation:
    dlpi_addr (from dl_iterate_phdr)
        != l_addr == dli_fbase (from dladdr) == (private) l_map_start
Any glibc code using l_addr (including dl_iterate_phdr) would instead
use a newly added private field retaining the old semantics, and the
public l_addr would carry the value of l_map_start.
I don't expect users to use dl_iterate_phdr and link_map in tandem
(and then fail when l_addr != dlpi_addr); the former (plus
dladdr(dlpi_phdr)) gives strictly superior information. The quoted
code snippet supports that expectation.
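As an illustration of the dl_iterate_phdr + dladdr(dlpi_phdr)
combination, the following sketch walks the loaded objects and counts
how often dli_fbase differs from dlpi_addr. The names base_stats,
compare_bases and survey_load_bases are hypothetical. The two values
differ whenever the lowest PT_LOAD segment has a nonzero p_vaddr, as
is typical for non-PIE executables; for most shared objects they
coincide. Treat it as a sketch under those assumptions, not as code
from glibc or libgcc.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

struct base_stats {
    size_t objects;    /* objects visited by dl_iterate_phdr */
    size_t resolved;   /* objects for which dladdr succeeded */
    size_t differing;  /* objects where dli_fbase != dlpi_addr */
};

/* Callback: compare the relocation base with the mapping start. */
static int compare_bases(struct dl_phdr_info *info, size_t size, void *data)
{
    struct base_stats *stats = data;
    Dl_info dli;
    (void) size;

    stats->objects++;
    /* dlpi_phdr normally points into the mapped object, so dladdr can
       resolve it to that object and report its dli_fbase. */
    if (info->dlpi_phdr != NULL && dladdr(info->dlpi_phdr, &dli) != 0) {
        stats->resolved++;
        if ((uintptr_t) dli.dli_fbase != (uintptr_t) info->dlpi_addr)
            stats->differing++;
    }
    return 0;  /* zero means: continue with the next object */
}

struct base_stats survey_load_bases(void)
{
    struct base_stats stats = {0, 0, 0};
    dl_iterate_phdr(compare_bases, &stats);
    return stats;
}
```

Code written this way never touches link_map, so it would not notice
a change in the public l_addr. On glibc older than 2.34, link with
-ldl for dladdr.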
-Jonathon