public inbox for libc-alpha@sourceware.org
From: jma14 <jma14@rice.edu>
To: Florian Weimer <fweimer@redhat.com>,
	Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: John Mellor-Crummey <johnmc@rice.edu>,
	libc-alpha@sourceware.org, "Mark W. Krentel" <krentel@rice.edu>,
	Xiaozhu Meng <xm13@rice.edu>
Subject: Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
Date: Mon, 22 Nov 2021 11:46:29 -0600
Message-ID: <20211122114629.Horde._rW0tgxbf_wCwsiLyKcms3g@webmail.rice.edu>


On 11/19/21 14:31, Florian Weimer wrote:

> * Adhemerval Zanella:
>>> "" for the main executable is widely known.  Usually code uses it to
>>> implement a fallback on argv[0] or /proc/self/exe, though.

If this is widely known, it would be helpful to add it to the man
pages. Currently they define l_name as the "absolute pathname where
object was found." That makes it sound (to me at least) as if
dlinfo(dlopen(NULL)) were a portable alternative to the Linux-specific
AT_EXECFN, which it is not.

>> There are still the issue where audit interface does not have direct
>> access to argv[0] from the audited process and '/proc' might also not
>> be accessible.  I am still not convinced that provided argv[0] for
>> l_name for main executable is worse than "", specially because the
>> fallback might not work.
>
> I think it's better to give the auditor a chance to figure out whether
> they want to use program_invocation_name (if that's not available in the
> inner libc, that's for sure a bug we must fix), AT_EXECFN, or
> /proc/self/exe.  If we pick one of these for the auditor, we make it
> more difficult to make the appropriate choice.

Apologies, but I don't understand the logic here. The main executable
is easy to identify in la_objopen as `l_prev == NULL && lmid ==
LM_ID_BASE`; gdb does effectively the same [1]. Applications can call
dlinfo(dlopen(NULL)) to obtain the main executable's link_map
directly. I fail to see how setting l_name to "" is superior to either
of those options, especially given the downsides.
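
For illustration, a minimal sketch of both routes mentioned above: the
la_objopen check on the auditor side and the dlinfo(dlopen(NULL)) lookup
on the application side. The names is_main_executable, main_link_map and
audited_main_map are hypothetical helpers, not glibc interfaces, and the
sketch assumes glibc on Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a glibc API): the check described above.
   The main executable is the head of the base namespace's link_map list. */
static int is_main_executable(const struct link_map *map, Lmid_t lmid)
{
    assert(map != NULL);
    return map->l_prev == NULL && lmid == LM_ID_BASE;
}

/* Auditor side: remember the main executable's link_map when it is opened.
   audited_main_map is an illustrative global, not part of any interface. */
static struct link_map *audited_main_map;

unsigned int la_version(unsigned int version)
{
    (void) version;
    return LAV_CURRENT;
}

unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void) cookie;
    if (is_main_executable(map, lmid))
        audited_main_map = map;
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

/* Application side: obtain the main executable's link_map directly.
   Returns NULL on failure. */
static struct link_map *main_link_map(void)
{
    struct link_map *map = NULL;
    void *self = dlopen(NULL, RTLD_LAZY);

    if (self == NULL)
        return NULL;
    if (dlinfo(self, RTLD_DI_LINKMAP, &map) != 0)
        map = NULL;
    dlclose(self);  /* the main program stays loaded; map remains valid */
    return map;
}
```

Built as a shared object and loaded through LD_AUDIT, the two la_*
functions would form a (very small) auditor; main_link_map() is meant to
be called from ordinary application code.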

I agree with Adhemerval. There is no portable way for an auditor to
acquire the resolved path to the main executable: AT_EXECFN and /proc
are Linux-specific, and dladdr (currently) often crashes when called
from the auditor. An l_name of "" means auditors *cannot* make the
appropriate choice.
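
To make the fallback concrete, here is a rough sketch of the kind of
Linux-only chain code ends up writing when l_name is "". The function
name main_executable_path and the order of the fallbacks are my own
assumptions, and whether each call is usable from inside an auditor is
exactly the open question in this thread. Note also that AT_EXECFN is
the path as passed to execve, so it is not necessarily absolute or
resolved.

```c
#define _GNU_SOURCE
#include <elf.h>        /* AT_EXECFN */
#include <stddef.h>
#include <sys/auxv.h>   /* getauxval (Linux/glibc specific) */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* readlink */

/* Hypothetical helper: return a path for the main executable.
   Uses l_name when it is non-empty; otherwise falls back to
   /proc/self/exe (written into buf) and then to AT_EXECFN.
   Returns NULL when every option fails. */
static const char *main_executable_path(const char *l_name,
                                        char *buf, size_t len)
{
    if (l_name != NULL && l_name[0] != '\0')
        return l_name;

    /* Fallback 1: /proc may not be mounted or accessible. */
    if (buf != NULL && len > 1) {
        ssize_t n = readlink("/proc/self/exe", buf, len - 1);
        if (n > 0) {
            buf[n] = '\0';  /* readlink does not NUL-terminate */
            return buf;
        }
    }

    /* Fallback 2: the auxiliary vector entry set by the kernel. */
    const char *execfn = (const char *) getauxval(AT_EXECFN);
    if (execfn != NULL && execfn[0] != '\0')
        return execfn;

    return NULL;
}
```

A caller would pass map->l_name from the link_map plus a scratch buffer,
for example `main_executable_path(map->l_name, buf, sizeof buf)`.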

[1]  
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/solib-svr4.c;hb=d3de0860104b8bb8d496527fbb042c3b4c5c82dc#l1230

>>>   […]
>>>   # define __RELOC_POINTER(ptr, base) ((ptr) + (base))
>>>   […]
>>>   _Unwind_Ptr load_base;
>>>   […]
>>>   load_base = info->dlpi_addr;
>>>   […]
>>>   /* See if PC falls into one of the loaded segments.  Find the eh_frame
>>>      segment at the same time.  */
>>>   for (n = info->dlpi_phnum; --n >= 0; phdr++)
>>>     {
>>>       if (phdr->p_type == PT_LOAD)
>>>         {
>>>           _Unwind_Ptr vaddr = (_Unwind_Ptr)
>>>             __RELOC_POINTER (phdr->p_vaddr, load_base);
>>>           if (data->pc >= vaddr && data->pc < vaddr + phdr->p_memsz)
>>>             {
>>>               match = 1;
>>>               pc_low = vaddr;
>>>               pc_high =  vaddr + phdr->p_memsz;
>>>             }
>>>         }
>>>   […]

>>> Changing l_addr will break the libgcc unwinder.  It uses l_addr to
>>> relocate the program header (see the code I quoted previously).  Not
>>> everyone uses the platform unwinder, and the libgcc unwinder is
>>> sometimes linked statically.  This is different from the l_name
>>> change: The l_addr would definitely cause widespread breakage.

Based on the quoted code snippet, the libgcc unwinder uses dlpi_addr
(from dl_iterate_phdr) instead of l_addr. To clarify, my proposal is
that *only* the publicly visible l_addr changes. In short, it replaces
the relation:
    dlpi_addr (from dl_iterate_phdr) == l_addr != dli_fbase (from dladdr) == (private) l_map_start
with the relation:
    dlpi_addr (from dl_iterate_phdr) != l_addr == dli_fbase (from dladdr) == (private) l_map_start

Any glibc code using l_addr (including dl_iterate_phdr) would instead
use a newly added private field retaining the old semantics, and
l_addr would carry the value of l_map_start.

I don't expect users to use dl_iterate_phdr and link_map in tandem
(and then fail when l_addr != dlpi_addr), since the former (plus
dladdr(dlpi_phdr)) gives strictly superior information. The quoted
code snippet supports that expectation.
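
For anyone who wants to look at the three values side by side, a small
sketch that collects them for the main program. The struct base_addrs
and both function names are hypothetical. It relies on the main program
being the first object dl_iterate_phdr visits, and it assumes that
dladdr accepts the program header address, which may not hold for every
binary (dli_fbase is left NULL in that case).

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical container for the values compared above. */
struct base_addrs {
    ElfW(Addr) dlpi_addr;  /* from dl_iterate_phdr */
    ElfW(Addr) l_addr;     /* from the public link_map */
    void *dli_fbase;       /* from dladdr, NULL if dladdr failed */
    int seen;              /* set once the callback has run */
};

/* Callback for dl_iterate_phdr: record the first object only. */
static int first_object_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct base_addrs *out = data;
    Dl_info dl;

    (void) size;
    out->dlpi_addr = info->dlpi_addr;
    out->dli_fbase = dladdr(info->dlpi_phdr, &dl) != 0 ? dl.dli_fbase : NULL;
    out->seen = 1;
    return 1;  /* non-zero stops the iteration after the first object */
}

/* Fill *out for the main program; returns 0 on success, -1 on failure. */
static int collect_main_base_addrs(struct base_addrs *out)
{
    struct link_map *map = NULL;
    void *self = dlopen(NULL, RTLD_LAZY);

    out->dlpi_addr = 0;
    out->l_addr = 0;
    out->dli_fbase = NULL;
    out->seen = 0;

    if (self == NULL || dlinfo(self, RTLD_DI_LINKMAP, &map) != 0 || map == NULL)
        return -1;
    out->l_addr = map->l_addr;

    dl_iterate_phdr(first_object_cb, out);
    return out->seen ? 0 : -1;
}
```

With current glibc I would expect dlpi_addr and l_addr to compare equal
here, while dli_fbase can differ from both (for example for a non-PIE
executable); under the proposal it is l_addr that would move to match
dli_fbase.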

-Jonathon


