public inbox for libc-help@sourceware.org
* Augmenting ld.so Bootstrap Relocations for Unikernel Linux
@ 2024-05-09 12:20 Vance Raiti
  2024-05-09 17:39 ` Carlos O'Donell
  2024-05-09 18:01 ` Florian Weimer
  0 siblings, 2 replies; 6+ messages in thread
From: Vance Raiti @ 2024-05-09 12:20 UTC (permalink / raw)
  To: libc-help

Hello,

I am writing today to ask for guidance on augmenting the bootstrap
relocations that ld.so performs.

My name is Vance Raiti. I’m an undergraduate research assistant working
with the Red Hat Collaboratory at Boston University on developing dynamic
linkage support for Unikernel Linux (https://arxiv.org/pdf/2206.00789). The
goal of this project is to allow any Linux application to run unmodified as
a unikernel application, executing in supervisor mode and gaining access to
kernel symbols.

In order to support this, we use a modified glibc (
https://github.com/unikernelLinux/glibc/tree/ukl-dynamic) that, among other
things, uses a special system call ABI: rather than issuing syscall
instructions, glibc calls the kernel entry point (entry_SYSCALL_64)
directly. For example, the following assembly is used in the
internal_syscall# macros:

        asm volatile (
        "call entry_SYSCALL_64@PLT\n\t"
        : "=a" (resultvar)
        : "0" (number)
        : "memory", REGISTERS_CLOBBERED_BY_SYSCALL);

This symbol is provided to glibc by a shared object, libuklsyms.so. glibc
is linked with this shared object by appending a few options to the
build-shlib-helper and build-shlib definitions in glibc’s top-level
Makerules file:

define build-shlib-helper
$(LINK.o) -shared -static-libgcc -Wl,-O1 $(sysdep-LDFLAGS) \
          $(if $($(@F)-no-z-defs)$(no-z-defs),,-Wl,-z,defs) $(rtld-LDFLAGS) \
          $(extra-B-$(@F:lib%.so=%).so) -B$(csu-objpfx) \
          $(extra-B-$(@F:lib%.so=%).so) $(load-map-file) \
          -Wl,-soname=lib$(libprefix)$(@F:lib%.so=%).so$($(@F)-version) \
          $(LDFLAGS.so) $(LDFLAGS-lib.so) $(LDFLAGS-$(@F:lib%.so=%).so) \
          -L$(subst :, -L,$(rpath-link)) -Wl,-rpath-link=$(rpath-link):$(common-objpfx).. \
          -L$(common-objpfx).. -Wl,-rpath=/data
endef

…

define build-shlib
$(build-shlib-helper) -o $@ $(shlib-lds-flags) \
          $(csu-objpfx)abi-note.o $(build-shlib-objlist) \
          -luklsyms
endef

We chose this approach because we can use a kernel module to update
libuklsyms.so after the kernel has booted, allowing us to account for KASLR.

This works well for the normal glibc shared objects (libc.so,
libpthread.so, etc.); however, it causes complications for the dynamic
linker, ld.so. Since our system calls use the PLT, ld.so needs to relocate
itself before issuing any. I’ve made it so that ld.so knows the address of
entry_SYSCALL_64 ($r15 is initialized with its value by the kernel), so it
is possible, in theory, for ld.so to relocate itself. I noticed that in
_dl_start, one of the first things that ld.so does is perform bootstrap
relocation on itself using bootstrap_map. Would it be possible to add the
entry_SYSCALL_64 address to bootstrap_map such that ld.so could properly
relocate entry_SYSCALL_64@PLT and make calls to it? If so, how could that
be done?

Thanks,

Vance


* Re: Augmenting ld.so Bootstrap Relocations for Unikernel Linux
  2024-05-09 12:20 Augmenting ld.so Bootstrap Relocations for Unikernel Linux Vance Raiti
@ 2024-05-09 17:39 ` Carlos O'Donell
  2024-05-09 18:01 ` Florian Weimer
  1 sibling, 0 replies; 6+ messages in thread
From: Carlos O'Donell @ 2024-05-09 17:39 UTC (permalink / raw)
  To: Vance Raiti, libc-help, Florian Weimer, DJ Delorie,
	Arjun Shankar, Patsy Griffin, libc-alpha

On 5/9/24 08:20, Vance Raiti via Libc-help wrote:
> I am writing today to ask for guidance on augmenting the bootstrap
> relocations that ld.so performs.

Hello Vance!

I'm going to add libc-alpha (developer list) because this topic crosses boundaries
between help and net-new development.

Let me introduce you to the team at Red Hat that works on glibc (Florian, DJ, Arjun,
Patsy) for Red Hat Enterprise Linux (and the layered products that use RHEL).

> My name is Vance Raiti. I’m an undergraduate research assistant working
> with the Red Hat Collaboratory at Boston University on developing dynamic
> linkage support for Unikernel Linux (https://arxiv.org/pdf/2206.00789). The
> goal of this project is to allow any Linux application to run unmodified as
> a unikernel application, executing in supervisor mode and gaining access to
> kernel symbols.

Nice!

> In order to support this, we use a modified glibc (
> https://github.com/unikernelLinux/glibc/tree/ukl-dynamic) that, among other
> things, uses a special system call ABI: instead of issuing syscall
> instructions, glibc instead calls the kernel entry point (entry_SYSCALL_64)
> directly. For example, the following assembly is used in the
> internal_syscall# macros:
> 
>         asm volatile (
>         "call entry_SYSCALL_64@PLT\n\t"
>         : "=a" (resultvar)
>         : "0" (number)
>         : "memory", REGISTERS_CLOBBERED_BY_SYSCALL);
> 
> This symbol is provided to glibc by a shared object, libuklsyms.so. glibc
> is linked with this shared object by appending a few options to the
> build-shlib-helper and build-shlib definitions in glibc’s top-level
> Makerules file:

I assume that call doesn't follow the normal procedure call standard?

> define build-shlib-helper
> $(LINK.o) -shared -static-libgcc -Wl,-O1 $(sysdep-LDFLAGS) \
>           $(if $($(@F)-no-z-defs)$(no-z-defs),,-Wl,-z,defs) $(rtld-LDFLAGS) \
>           $(extra-B-$(@F:lib%.so=%).so) -B$(csu-objpfx) \
>           $(extra-B-$(@F:lib%.so=%).so) $(load-map-file) \
>           -Wl,-soname=lib$(libprefix)$(@F:lib%.so=%).so$($(@F)-version) \
>           $(LDFLAGS.so) $(LDFLAGS-lib.so) $(LDFLAGS-$(@F:lib%.so=%).so) \
>           -L$(subst :, -L,$(rpath-link)) -Wl,-rpath-link=$(rpath-link):$(common-objpfx).. \
>           -L$(common-objpfx).. -Wl,-rpath=/data
> endef
>
> …
>
> define build-shlib
> $(build-shlib-helper) -o $@ $(shlib-lds-flags) \
>           $(csu-objpfx)abi-note.o $(build-shlib-objlist) \
>           -luklsyms
> endef
> 
> We choose this approach because we can use a kernel module to update
> libuklsyms.so after the kernel has booted, allowing us to account for KASLR.

Makes sense.

> This works well for the normal glibc shared objects (libc.so,
> libpthread.so, etc.), however, it causes complications for the dynamic
> linker, ld.so. Since our system calls use the PLT, ld.so needs to relocate
> itself before issuing any. I’ve made it so that ld.so knows the address of
> entry_SYSCALL_64 ($r15 is initialized with its value by the kernel), so it
> is possible, in theory, for ld.so to relocate itself. I noticed that in
> _dl_start, one of the first things that ld.so does is perform bootstrap
> relocation on itself using bootstrap_map. Would it be possible to add the
> entry_SYSCALL_64 address to bootstrap_map such that ld.so could properly
> relocate entry_SYSCALL_64@PLT and make calls to it? If so, how could that
> be done?

Not easily IMO.

The problem is that you want to make syscalls before you can dlopen shared objects?

Is there any reason you can't extend the existing vDSO framework?

Is vDSO setup too late?

Do you need syscalls earlier?

You would likely have to create a fake link_map with all the information about libuklsyms.so,
make all the ld.so syscalls use PLT relocations, and then in the bootstrap map processing
the PLT relocations would be updated using the fake link_map. Then once you're past that point
you probably want to upgrade the fake link_map with any missing libuklsyms.so data.

If you don't provide the fake link_map then you have to load libuklsyms.so... but I assume
it's already in memory somewhere. You'd want to create an internal-only fdlopen()-like interface
(dlopen from a memory-backed fd à la memfd_create()).

This seems more complicated than just providing the required minimal bootstrap syscall set via
the vDSO interface whose information is passed via AT_SYSINFO/AT_SYSINFO_EHDR?
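
For reference, a minimal sketch of how ordinary user-space code can find the
vDSO that the kernel advertises through the auxiliary vector. It assumes a
Linux target with glibc's getauxval(); the helper name demo_vdso_base is made
up for illustration and is not part of glibc or UKL. ld.so itself parses the
auxiliary vector directly during startup rather than calling getauxval(), so
this only shows which value is being passed, not how ld.so consumes it.

```c
#include <elf.h>        /* AT_SYSINFO_EHDR */
#include <stddef.h>
#include <sys/auxv.h>   /* getauxval */

/* Hypothetical helper: returns the address of the vDSO's ELF header as
   passed by the kernel in the auxiliary vector, or NULL if the entry
   is absent (getauxval returns 0 in that case).  */
const void *demo_vdso_base(void)
{
    unsigned long base = getauxval(AT_SYSINFO_EHDR);
    return (const void *) base;
}
```

A UKL-specific bootstrap syscall set exposed this way would still need its
symbols looked up in that ELF image, which is the extra work being weighed
here.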

-- 
Cheers,
Carlos.



* Re: Augmenting ld.so Bootstrap Relocations for Unikernel Linux
  2024-05-09 12:20 Augmenting ld.so Bootstrap Relocations for Unikernel Linux Vance Raiti
  2024-05-09 17:39 ` Carlos O'Donell
@ 2024-05-09 18:01 ` Florian Weimer
  2024-05-09 18:39   ` Carlos O'Donell
  1 sibling, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2024-05-09 18:01 UTC (permalink / raw)
  To: Vance Raiti via Libc-help; +Cc: Vance Raiti

* Vance Raiti via Libc-help:

> In order to support this, we use a modified glibc (
> https://github.com/unikernelLinux/glibc/tree/ukl-dynamic) that, among other
> things, uses a special system call ABI: instead of issuing syscall
> instructions, glibc instead calls the kernel entry point (entry_SYSCALL_64)
> directly. For example, the following assembly is used in the
> internal_syscall# macros:
>
>         asm volatile (
>         "call entry_SYSCALL_64@PLT\n\t"
>         : "=a" (resultvar)
>         : "0" (number)
>         : "memory", REGISTERS_CLOBBERED_BY_SYSCALL);

Note that this clobbers the red zone, so unless you build with
-mno-red-zone, you'll run into subtle bugs.  There could also be
unwinding issues.  REGISTERS_CLOBBERED_BY_SYSCALL could be wrong as
well, but that depends on what entry_SYSCALL_64 does.

> This works well for the normal glibc shared objects (libc.so,
> libpthread.so, etc.), however, it causes complications for the dynamic
> linker, ld.so. Since our system calls use the PLT, ld.so needs to relocate
> itself before issuing any. I’ve made it so that ld.so knows the address of
> entry_SYSCALL_64 ($r15 is initialized with its value by the kernel), so it
> is possible, in theory, for ld.so to relocate itself. I noticed that in
> _dl_start, one of the first things that ld.so does is perform bootstrap
> relocation on itself using bootstrap_map. Would it be possible to add the
> entry_SYSCALL_64 address to bootstrap_map such that ld.so could properly
> relocate entry_SYSCALL_64@PLT and make calls to it? If so, how could that
> be done?

You shouldn't use a PLT call, but an indirect call, something like:

  call *_dl_entry_SYSCALL_64(%rip)

And then have a zero-initialized global variable

  void *_dl_entry_SYSCALL_64 attribute_hidden;

that is set to the correct value (passed in $r15 based on your
explanation?) very early in the RTLD_START fragment in
sysdeps/x86_64/dl-machine.h.
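
The shape of this can be sketched in plain C, under loud assumptions: every
demo_* name below is hypothetical, the function-pointer signature is invented
(the real entry_SYSCALL_64 is not a C-callable function), and an ordinary
function call stands in for both the inline-assembly indirect call and the
RTLD_START store from the register.

```c
#include <assert.h>
#include <stddef.h>

/* Signature invented for this demo only.  */
typedef long (*demo_entry_fn)(long number, long arg0);

/* Zero-initialized and hidden: code in the same object reaches it
   PC-relatively, so no symbol lookup or PLT slot is involved.  */
__attribute__((visibility("hidden"))) demo_entry_fn demo_entry_ptr;

/* Stand-in for the very early startup code that would store the
   address handed over by the kernel.  */
void demo_early_init(demo_entry_fn entry)
{
    demo_entry_ptr = entry;
}

/* Every "system call" goes through the pointer instead of the PLT.  */
long demo_syscall(long number, long arg0)
{
    assert(demo_entry_ptr != NULL);  /* must not run before early init */
    return demo_entry_ptr(number, arg0);
}

/* Fake entry point, present only so the sketch can be exercised.  */
long demo_fake_entry(long number, long arg0)
{
    return number * 1000 + arg0;
}
```

Calling demo_early_init(demo_fake_entry) once and then demo_syscall(39, 2)
yields 39002 with this fake entry. The point is only that the call site
depends on a data store done at startup, not on any relocation processing.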

Thanks,
Florian



* Re: Augmenting ld.so Bootstrap Relocations for Unikernel Linux
  2024-05-09 18:01 ` Florian Weimer
@ 2024-05-09 18:39   ` Carlos O'Donell
  2024-05-12 16:41     ` Vance Raiti
  0 siblings, 1 reply; 6+ messages in thread
From: Carlos O'Donell @ 2024-05-09 18:39 UTC (permalink / raw)
  To: Florian Weimer, libc-help, libc-alpha; +Cc: Vance Raiti

On 5/9/24 14:01, Florian Weimer via Libc-help wrote:
> * Vance Raiti via Libc-help:
> 
>> In order to support this, we use a modified glibc (
>> https://github.com/unikernelLinux/glibc/tree/ukl-dynamic) that, among other
>> things, uses a special system call ABI: instead of issuing syscall
>> instructions, glibc instead calls the kernel entry point (entry_SYSCALL_64)
>> directly. For example, the following assembly is used in the
>> internal_syscall# macros:
>>
>>         asm volatile (
>>         "call entry_SYSCALL_64@PLT\n\t"
>>         : "=a" (resultvar)
>>         : "0" (number)
>>         : "memory", REGISTERS_CLOBBERED_BY_SYSCALL);
> 
> Note that this clobbers the red zone, so unless you build with
> -mno-red-zone, you'll run into subtle bugs.  There could also be
> unwinding issues.  REGISTERS_CLOBBERED_BY_SYSCALL could be wrong as
> well, but that depends on what entry_SYSCALL_64 does.
> 
>> This works well for the normal glibc shared objects (libc.so,
>> libpthread.so, etc.), however, it causes complications for the dynamic
>> linker, ld.so. Since our system calls use the PLT, ld.so needs to relocate
>> itself before issuing any. I’ve made it so that ld.so knows the address of
>> entry_SYSCALL_64 ($r15 is initialized with its value by the kernel), so it
>> is possible, in theory, for ld.so to relocate itself. I noticed that in
>> _dl_start, one of the first things that ld.so does is perform bootstrap
>> relocation on itself using bootstrap_map. Would it be possible to add the
>> entry_SYSCALL_64 address to bootstrap_map such that ld.so could properly
>> relocate entry_SYSCALL_64@PLT and make calls to it? If so, how could that
>> be done?
> 
> You shouldn't use a PLT call, but an indirect call, something like:
> 
>   call *_dl_entry_SYSCALL_64(%rip)
> 
> And then have a zero-initialized global variable
> 
>   void *_dl_entry_SYSCALL_64 attribute_hidden;
> 
> that is set to the correct value (passed in $r15 based on your
> explanation?) very early in the RTLD_START fragment in
> sysdeps/x86_64/dl-machine.h.
 
An indirect call to one location is a very good suggestion.

This is going to be much simpler than anything I suggested.

-- 
Cheers,
Carlos.



* Re: Augmenting ld.so Bootstrap Relocations for Unikernel Linux
  2024-05-09 18:39   ` Carlos O'Donell
@ 2024-05-12 16:41     ` Vance Raiti
  2024-05-14  7:54       ` Florian Weimer
  0 siblings, 1 reply; 6+ messages in thread
From: Vance Raiti @ 2024-05-12 16:41 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: Florian Weimer, libc-help, libc-alpha

The indirect call worked. I had tried it before and thought it wouldn't be
possible because the indirect pointer to entry_SYSCALL_64 must be
initialized by each library that uses it, and I was not aware of the
initialization routines in the regular glibc shared objects. But after
reading your email I took another look over the ABI and found the DT_INIT
dynamic tag, leading me to find the `_init` routine where I could do my
initialization.

Thanks for the help,
Vance



* Re: Augmenting ld.so Bootstrap Relocations for Unikernel Linux
  2024-05-12 16:41     ` Vance Raiti
@ 2024-05-14  7:54       ` Florian Weimer
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Weimer @ 2024-05-14  7:54 UTC (permalink / raw)
  To: Vance Raiti; +Cc: Carlos O'Donell, libc-help, libc-alpha

* Vance Raiti:

> The indirect call worked. I had tried it before and thought it
> wouldn't be possible because the indirect pointer to entry_SYSCALL_64
> must be initialized by each library that uses it, and I was not aware
> of the initialization routines in the regular glibc shared
> objects. But after reading your email I took another look over the ABI
> and found the DT_INIT dynamic tag, leading me to find the `_init`
> routine where I could do my initialization.

I think using a relocation for anything else but ld.so is actually a
fine approach.  DT_INIT has ordering issues.  With a relocation, you can
ensure that user code never encounters an uninitialized pointer.
Alternatively, if it's just for libc.so, you can initialize the pointer
in __libc_early_init.
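
To illustrate the ordering concern, here is a small sketch that uses GCC
constructor priorities as an analogy. The assumptions are significant: all
order_* names are hypothetical, compiler-generated constructors typically run
from DT_INIT_ARRAY rather than DT_INIT itself, and the two initializers live
in one object here, whereas the real hazard arises between initializers of
different objects.

```c
#include <stddef.h>

/* Pointer that is only valid after its initializer has run.  */
static long (*order_entry_ptr)(long);

/* Set to 1 if some initializer observed the pointer while still NULL.  */
int order_saw_null;

/* Fake target, present only so the sketch can be exercised.  */
static long order_fake_entry(long n)
{
    return n + 1;
}

/* Lower priority number runs first: this plays the role of user code
   in an initializer that executes before the pointer is set.  */
__attribute__((constructor(101)))
static void order_early_user(void)
{
    order_saw_null = (order_entry_ptr == NULL);
}

/* Runs second: the initialization step that sets the pointer.  */
__attribute__((constructor(102)))
static void order_init_pointer(void)
{
    order_entry_ptr = order_fake_entry;
}

/* Safe wrapper: returns -1 if called before initialization.  */
long order_call(long n)
{
    return order_entry_ptr != NULL ? order_entry_ptr(n) : -1;
}
```

By the time main runs the pointer is valid, but order_saw_null records that an
earlier initializer saw it unset. A relocation processed before any
initializer runs does not have this window.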

Thanks,
Florian


