public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Jonathon Anderson <janderson@rice.edu>
To: Florian Weimer <fweimer@redhat.com>
Cc: John Mellor-Crummey <johnmc@rice.edu>,
	Xiaozhu Meng <xm13@rice.edu>,
	"Mark W. Krentel" <krentel@rice.edu>,
	libc-alpha@sourceware.org
Subject: Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
Date: Thu, 18 Nov 2021 15:55:11 -0600	[thread overview]
Message-ID: <b3c03539-42e1-1548-39b7-55fc6e46dc7e@rice.edu> (raw)
In-Reply-To: <87bl2i345m.fsf@oldenburg.str.redhat.com>


On 11/17/21 14:42, Florian Weimer wrote:
> Thank you for the feedback.  At this time, I want to comment on the
> l_addr aspect.
Thank you for the comments. I think I understand your points, below are 
my own opinions on this matter.
>> An alternative would be to re-open the file from its path
>> (link_map->l_name), however this is a serious performance concern for
>> large-scale executions (metadata servers are known to be a bottleneck
>> of parallel filesystems).
> It also has its security issues because you might get back the binary
> you expect.
Agreed!
>> Right now, we
>> only require the program headers which we can obtain from
>> getauxval(AT_PHDR), however this technique has questionable
>> portability and robustness (getauxval returns an unsigned long, not a
>> pointer).
> A glibc port to an architecture where a long value cannot hold all
> pointer values will have to provide an alternative interface similar to
> getauxval, but that returns pointer values.
I would go one step farther and say getauxval is already broken for any 
64-bit architecture, unsigned long is only required to support 32 bits 
as per the C standards. One of my greater fears is that some exotic 
compiler will cleverly allocate only 4 bytes of stack space for the 
return value, and we wouldn't know except for a subtle bug (dependent on 
optimization flags!) that crashes our entire tool with SEGVs in the 
auditor (where GDB doesn't give properly unwound call stacks).
> Of course that's not the
> only interface with this problem (ElfW(Addr) is an integer as well).
AFAICT ElfW(Addr) is fine, it should always be an integer large enough 
to store a pointer on the host architecture (i.e. a uintptr_t). Unless I 
missed some specific arch where this doesn't work out to be the case?
> It
> makes the Morello glibc port quite interesting.  So I think *something*
> like getauxval (AT_PHDR) will always be available, with pretty much
> identical semantics.
>
>>  From an outside perspective the current l_addr semantic is fairly
>> undocumented, the dladdr and dlinfo man pages define it vaguely as
>> the "difference between the address in the ELF file and the address
>> in memory." That sounds (to me at least) like l_addr should point to
>> byte 0 in the file (the ELF header), and that seems to be correct in
>> all but the non-PIE case.
> I have struggled with this in the past.  I agree that it is confusing.
> l_addr is the offset between virtual addresses in the program header of
> the ELF object and the actual addresses in the process image.  This
> offset happens to be 0 for ET_EXEC objects, and only there.
This is a much clearer description of the semantic, it would be very 
helpful the man pages used that sentence (or one like it) wherever the 
l_addr value is exposed in the API (link_map->l_addr and 
dl_phdr_info->dlpi_addr). It would also be very helpful if 
Dl_info.dli_fbase was clearly documented as *not* l_addr but instead 
byte 0/ELF header in the process image.
> I think the sort-of-official way out of this conundrum is to call
> dl_iterate_phdr and then look at the program headers.  This is what
> _Unwind_Find_FDE in libgcc does to find the object boundaries and the
> PT_GNU_EH_FRAME segment, see libgcc/unwind-dw2-fde-dip.c in the GCC
> sources:
>
>    […]
>    # define __RELOC_POINTER(ptr, base) ((ptr) + (base))
>    […]
>    _Unwind_Ptr load_base;
>    […]
>    load_base = info->dlpi_addr;
>    […]
>    /* See if PC falls into one of the loaded segments.  Find the eh_frame
>       segment at the same time.  */
>    for (n = info->dlpi_phnum; --n >= 0; phdr++)
>      {
>        if (phdr->p_type == PT_LOAD)
>          {
>            _Unwind_Ptr vaddr = (_Unwind_Ptr)
>              __RELOC_POINTER (phdr->p_vaddr, load_base);
>            if (data->pc >= vaddr && data->pc < vaddr + phdr->p_memsz)
>              {
>                match = 1;
>                pc_low = vaddr;
>                pc_high =  vaddr + phdr->p_memsz;
>              }
>          }
>    […]
>
> This is not even glibc-specific, other systems have dl_iterate_phdr as
> well.
That does sound like the "correct" way out, but dl_iterate_phdr operates 
on the caller's namespace so one would need to inject a shim library to 
do the actual call.

Pulling this off is not trivial, I have a number of concerns with the 
approach:
  - It seems to be fixed upstream, but with 2.32 and prior dlmopen will 
assert when used within the la_activity(CONSISTENT) for the target 
namespace. So the shim library needs to be loaded via other means 
(LD_PRELOAD), leaving the (much earlier) auditor without options until 
the shim gets loaded.
  - The above also means that dlsym can't be used to resolve the shim 
function symbol, so it needs to be manually derived from the dynamic 
symbol table. Which is awkward and a bit dangerous, AFAIK there isn't a 
simple way to get the size of the dynamic symbol table so if the symbol 
isn't there it usually SEGVs in the auditor (running off the end).
  - The above also means that init constructors can't be "promoted," so 
dl_iterate_phdr will be called before init constructors (Glibc and shim) 
have run for the target namespace. My mini-experiments on x86-64 pass so 
it doesn't seem like an issue, but it still makes me anxious.
  - The above *also* means that you can only shim into the main 
namespace (with upstream Glibc, only into non-auditor namespaces), so 
this (complicated!) trick is only useful for getting data for the main 
executable. Which really is the only binary that needs to be worked 
around, but it still leaves oddly diverging control flow.
  - dl_iterate_phdr will iterate the whole namespace and can't be 
stopped in the middle, so there are mild performance concerns with 
applications pulling in hundreds of libraries. A single call might be 
acceptable, but more than that probably will be a problem.

In summary, ugly as heck but it probably works. With all these caveats 
though I have difficulty considering this workaround "official" by any 
stretch of the imagination.
>> dladdr gets its value from link_map->l_map_start instead of l_addr,
>> so the semantic we want is already present in a private field. It
>> seems to me these two fields could be swapped with little issue, if
>> altering the public semantic is not acceptable we could also be sated
>> if l_map_start was made public.
> Applications which know about the current semantics of l_addr will
> break, though.  l_addr is also exposed to debuggers via the _r_debug
> interface.  I really do not think we can make changes to l_addr.
> We have a similar issue around l_name being "" for the main program, and
> unfortuantely I will have to argue quite strongly against changing that.
Is adding new public fields completely off the table? HPCToolkit (and 
other users in need) can use a simple configure test to check whether 
the fields are present, and AFAICT adding new fields shouldn't break 
backwards compatibility for older projects or binaries. This would also 
give a place to resolve the concerns around returning argv[0] instead of 
the actual binary path.

The motivation for these requests is improved usability, the auditor 
interface (and _r_debug) are unique in that they *only* give the 
link_map and everything else must be derived from it. dladdr isn't 
always an option for auditors (another of our 'Tier 2' issues), 
dl_iterate_phdr is painful to use as noted above, the remaining dl* 
functions require dlopen handles (and/or crash in the auditor, another 
'Tier 2' issue). In HPCToolkit we managed to scrounge up alternative 
paths for the missing data (which is why this isn't a 'Tier 1' issue), 
but the interface itself is still unacceptably unusable and has been 
since its first implementation. Any change to fill in this data is 
strictly an improvement.

If I can humor the impossible for a few moments longer, I personally 
have a difficult time believing that anyone actually uses 
link_map->l_addr or link_map->l_name in a way that would break by 
changing their semantics for the main executable:
  - The documentation hasn't improved for years so there can't be many 
users that care about (or even noticed) this case in particular.
  - Every use case I can think of for obtaining a link_map from the dl* 
functions (dlinfo and dladdr1) will either already have the special 
handling, or won't operate on the main executable, or likely won't opt 
to use l_addr (vs. dlsym or dli_fbase) or l_name (vs. dli_fname).
  - dl_iterate_phdr is a preferable alternative to manually iterating 
the link_map chain and provides l_addr itself, and the "better" 
dli_fname from dladdr(dlpi_phdr) is far easier to get than l_name.
  - The users of the auditor interface can be counted on one hand (IIRC 
HPCToolkit, Spindle and sotruss), at least for HPCToolkit we are 
generally fine with changes if it means we get a *usable* audit 
interface out of the deal.
  - And my cursory examination of the logic in GDB [1] and LLDB [2] (and 
immediately surrounding code) makes it seem plausible that debuggers 
treat the main executable very differently and generally ignore it in 
the link_map.

I can't say with absolute confidence that nothing would break, but I 
currently consider it very rare and most likely those that do were 
already intentionally abusing the dl* interface.

[1] 
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/solib-svr4.c;hb=d3de0860104b8bb8d496527fbb042c3b4c5c82dc#l1225
[2] 
https://github.com/llvm/llvm-project/blob/ebf8d74e929d908829eda4ad8548ec21e2dbc6ae/lldb/source/Plugins/Process/POSIX/NativeProcessELF.cpp#L175-L176
> We should perhaps collect these grievances and work on better interfaces
> for the future.  However, that will be quite a long-term investment
> because it will take years until these new interfaces will be available
> on your users' systems.  Bug fixes we can roll out more quickly, but
> glibc interface changes typically happen at major distribution release
> boundaries only.  For the time being, it has to be workarounds for
> missing interfaces, I'm afraid.
I am keeping my hopes low for changing the public semantics, but I hope 
that adding new fields for the missing data at least could fall into the 
"bug fixes" category. From my perspective the public link_map interface 
was *never* actually usable, any change to fix it is strictly an 
(ABI-compatible) improvement. :)

-Jonathon

  reply	other threads:[~2021-11-18 21:55 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <EA69A62D-7C01-4536-B551-2609226053F2@rice.edu>
2021-11-17 18:08 ` John Mellor-Crummey
2021-11-17 20:42   ` Florian Weimer
2021-11-18 21:55     ` Jonathon Anderson [this message]
2021-11-19 19:18       ` Florian Weimer
2021-11-19 19:56         ` Adhemerval Zanella
2021-11-19 20:31           ` Florian Weimer
2021-11-23 16:36             ` Adhemerval Zanella
2021-11-22 17:46 jma14
2021-11-23 13:58 ` Adhemerval Zanella
2021-11-23 14:02   ` Florian Weimer
2021-11-23 16:25     ` Adhemerval Zanella
2021-11-23 16:50       ` Florian Weimer
2021-11-23 21:13         ` Jonathon Anderson
2021-11-25 17:56           ` Adhemerval Zanella
2021-11-22 17:46 jma14

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3c03539-42e1-1548-39b7-55fc6e46dc7e@rice.edu \
    --to=janderson@rice.edu \
    --cc=fweimer@redhat.com \
    --cc=johnmc@rice.edu \
    --cc=krentel@rice.edu \
    --cc=libc-alpha@sourceware.org \
    --cc=xm13@rice.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).