public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: jma14 <jma14@rice.edu>
To: Florian Weimer <fweimer@redhat.com>
Cc: John Mellor-Crummey <johnmc@rice.edu>,
	Xiaozhu Meng <xm13@rice.edu>,
	"Mark W. Krentel" <krentel@rice.edu>,
	libc-alpha@sourceware.org
Subject: Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
Date: Mon, 22 Nov 2021 11:46:30 -0600	[thread overview]
Message-ID: <20211122114630.Horde.vByjLyp4gX9-IpeJ-NBFLp8@webmail.rice.edu> (raw)


On 11/19/21 13:18, Florian Weimer wrote:

> * Jonathon Anderson:
>>>> Right now, we only require the program headers which we can obtain
>>>> from getauxval(AT_PHDR), however this technique has questionable
>>>> portability and robustness (getauxval returns an unsigned long, not a
>>>> pointer).
>
>>> A glibc port to an architecture where a long value cannot hold all
>>> pointer values will have to provide an alternative interface similar
>>> to getauxval, but that returns pointer values.
>
>> I would go one step farther and say getauxval is already broken for any
>> 64-bit architecture, unsigned long is only required to support 32 bits
>> as per the C standards. One of my greater fears is that some exotic
>> compiler will cleverly allocate only 4 bytes of stack space for the
>> return value, and we wouldn't know except for a subtle bug (dependent
>> on optimization flags!) that crashes our entire tool with SEGVs in the
>> auditor (where GDB doesn't give properly unwound call stacks).
>
> If ported to such an architecture, glibc would need several changes to
> accomodate this.  Newer architectures take this into account and do not
> do funny things.

Ah, right, I forgot the sizes of the standard types are part of the  
GNU ABI. That resolves my concern then.

>> This is a much clearer description of the semantic, it would be very
>> helpful the man pages used that sentence (or one like it) wherever the
>> l_addr value is exposed in the API (link_map->l_addr and
>> dl_phdr_info->dlpi_addr). It would also be very helpful if
>> Dl_info.dli_fbase was clearly documented as *not* l_addr but instead
>> byte 0/ELF header in the process image.
>
> I've made a note to update the manual pages.

Thanks!

>>>> dladdr gets its value from link_map->l_map_start instead of l_addr,
>>>> so the semantic we want is already present in a private field. It
>>>> seems to me these two fields could be swapped with little issue, if
>>>> altering the public semantic is not acceptable we could also be sated
>>>> if l_map_start was made public.
>>>
>>> Applications which know about the current semantics of l_addr will
>>> break, though.  l_addr is also exposed to debuggers via the _r_debug
>>> interface.  I really do not think we can make changes to l_addr. We
>>> have a similar issue around l_name being "" for the main program, and
>>> unfortuantely I will have to argue quite strongly against changing
>>> that.
>
>> Is adding new public fields completely off the table?
>
> To struct link_map?  We could probably pull it off, but it would be
> years until such a change will be in the hands of the users.  There is
> an internal structure that overlaps with the public struct link_map, and
> some applications poke at the private bits at fixed offsets.  We've
> started not to strip ld.so downstream, so that these applications can
> switch to DWARF data to avoid dependencies on fixed offsets, but that
> has been a very recent change.

I would consider poking at private fields a last resort, portability  
across (versions of) GNU/Linux distributions is something we fight  
with on occasion. It also doesn't help the l_name issue.

>> I am keeping my hopes low for changing the public semantics, but I hope
>> that adding new fields for the missing data at least could fall into
>> the "bug fixes" category.
>
> We can in theory add new flags to dladdr1 or new requests to dlinfo.
> These things do not impact ABI and are backportable.

These options would not help the issue. dlinfo (currently) crashes in  
the auditor, and using it relies on the (undocumented) fact that  
dlopen handles are in fact just struct link_map*. dladdr already  
provides the corrected values, adding a new option to dladdr1 isn't  
needed.

> Changing struct link_map is challenging, as I explained.  Adding new
> functions definitely triggers deployment delays because those aren't
> considered backportable.  Adding fields to existing structs is usually
> impossible because these structs may have been used to define
> application types, and new fields cause subsequent offsets to change.

Ah, right. Can the new fields be added in a future release? If so,  
then with backported fixes to the dladdr crashes dladdr(l_ld) can be  
used as a portable workaround, provided it works as intended on all  
Glibc targets (it works on x86-64/Linux but I'll need confirmation for  
others). There is a performance cost to dladdr so I still think the  
new fields are required, it also gives the workaround a clear lifetime.

If new fields are completely impossible (now and forever), then as an  
LD_AUDIT user the only option (brought up so far) I feel comfortable  
with is a change to the l_addr/l_name semantics. It's ABI-compatible,  
I have a hard time believing there are applications that will regress  
because of it, and it fixes an interface that was *never* quite right  
(and differed from its documentation).

-Jonathon



             reply	other threads:[~2021-11-22 17:46 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-22 17:46 jma14 [this message]
  -- strict thread matches above, loose matches on Subject: below --
2021-11-22 17:46 jma14
2021-11-23 13:58 ` Adhemerval Zanella
2021-11-23 14:02   ` Florian Weimer
2021-11-23 16:25     ` Adhemerval Zanella
2021-11-23 16:50       ` Florian Weimer
2021-11-23 21:13         ` Jonathon Anderson
2021-11-25 17:56           ` Adhemerval Zanella
     [not found] <EA69A62D-7C01-4536-B551-2609226053F2@rice.edu>
2021-11-17 18:08 ` John Mellor-Crummey
2021-11-17 20:42   ` Florian Weimer
2021-11-18 21:55     ` Jonathon Anderson
2021-11-19 19:18       ` Florian Weimer
2021-11-19 19:56         ` Adhemerval Zanella
2021-11-19 20:31           ` Florian Weimer
2021-11-23 16:36             ` Adhemerval Zanella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211122114630.Horde.vByjLyp4gX9-IpeJ-NBFLp8@webmail.rice.edu \
    --to=jma14@rice.edu \
    --cc=fweimer@redhat.com \
    --cc=johnmc@rice.edu \
    --cc=krentel@rice.edu \
    --cc=libc-alpha@sourceware.org \
    --cc=xm13@rice.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).