public inbox for libffi-discuss@sourceware.org
From: Jay K <jayk123@hotmail.com>
To: Florian Weimer <fw@deneb.enyo.de>,
	"Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
Cc: "kernel-hardening@lists.openwall.com"
	<kernel-hardening@lists.openwall.com>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"oleg@redhat.com" <oleg@redhat.com>,
	"linux-security-module@vger.kernel.org"
	<linux-security-module@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-integrity@vger.kernel.org"
	<linux-integrity@vger.kernel.org>,
	"libffi-discuss@sourceware.org" <libffi-discuss@sourceware.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"Jay (work)" <jaykrell@microsoft.com>
Subject: Re: [PATCH v2 0/4] [RFC] Implement Trampoline File Descriptor
Date: Wed, 23 Sep 2020 02:50:23 +0000
Message-ID: <MWHPR1401MB1951CF31B40DF202D3BCE2EEE6380@MWHPR1401MB1951.namprd14.prod.outlook.com>
In-Reply-To: <96ea02df-4154-5888-1669-f3beeed60b33@linux.microsoft.com>

  > As mentioned before, if the ISA supports PC-relative data references 
  > (e.g., X86 64-bit platforms support RIP-relative data references) 
  > then we can pass data to that code by placing the code and data in  
  > adjacent pages. So, you can implement the trampoline table for X64.  
  > i386 does not support it. 

i386 does not need this either.

You make a PC-relative call, read the return address into a register, and then do register-relative data access.

either: 
  call get_pc ; PC-relative call  
  mov  eax, [eax+x] 
 
get_pc: 
  mov eax, [esp] 
  ret 

or if you don't mind disrupting the return address predictor:

 call +0  
 pop eax   
 mov eax, [eax+x]  
 
where x is a displacement computed by the static linker, and any register can be used in place of eax.
This is the same way PIC code normally works on i386, I think.
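
A rough C analogue of the get_pc variant, as a sketch only. It assumes GCC or Clang, since __builtin_return_address and the noinline attribute are compiler extensions. The names thunk_data and load_relative_to_pc are made up for illustration. The displacement x is computed at run time here just to keep the sketch self-contained; in the scheme above it is a constant fixed by the static linker.

```c
#include <stdint.h>

/* Hypothetical data that a thunk would want to reach. */
static const int thunk_data[4] = { 10, 20, 30, 40 };

/* C analogue of "get_pc: mov eax, [esp]; ret": return the caller's
 * return address, i.e. an address inside the calling code. */
__attribute__((noinline))
static uintptr_t get_pc(void)
{
    return (uintptr_t)__builtin_return_address(0);
}

/* Register-relative data access: data address = pc + x. */
static int load_relative_to_pc(int index)
{
    uintptr_t pc = get_pc();

    /* Assumption for the sketch: x is derived at run time. With real
     * PIC code the static linker bakes this displacement in. */
    intptr_t x = (intptr_t)((uintptr_t)&thunk_data[index] - pc);

    const int *p = (const int *)(pc + (uintptr_t)x);
    return *p;
}
```

Calling load_relative_to_pc(2) reads thunk_data[2] through the pc-plus-displacement address rather than through an absolute one.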

Also the data and code do not have to be on adjacent pages in this scheme.

You can just map an entire .dll/.so additional times.
A little wasteful, yes, but quite convenient. Factor the thunks/trampolines
into their own .so/.dll to make it not very wasteful.


The functions do not even have to be a fixed distance from their array element either.
Architectures that are "naturally" position independent (amd64, arm64) do not even need any assembly to do this.
Just use C and stamp out multiple copies with the C preprocessor.
But arm32 and x86 do tend to need some assembly, depending on compilation model, etc. (i.e. on Windows at least).
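
One way the preprocessor approach might look, as a minimal sketch. All names here (thunk_slots, DEFINE_THUNK, bind_thunk, add_ctx) and the handler signature are hypothetical, not taken from libffi. Each stamped-out function reads its own array element; when built as position-independent code on amd64 or arm64, the compiler typically emits that access PC-relative, so no hand-written assembly is involved.

```c
#include <stddef.h>

/* Hypothetical per-thunk data: what to call and with which context. */
typedef int (*handler_fn)(void *ctx, int arg);

struct thunk_slot {
    handler_fn handler;
    void *ctx;
};

#define THUNK_COUNT 4
static struct thunk_slot thunk_slots[THUNK_COUNT];

/* Stamp out one thunk per slot. The code is fixed at build time;
 * only the data in thunk_slots[n] changes at run time. */
#define DEFINE_THUNK(n)                                          \
    static int thunk_##n(int arg)                                \
    {                                                            \
        return thunk_slots[n].handler(thunk_slots[n].ctx, arg);  \
    }

DEFINE_THUNK(0)
DEFINE_THUNK(1)
DEFINE_THUNK(2)
DEFINE_THUNK(3)

typedef int (*thunk_fn)(int);
static const thunk_fn thunk_table[THUNK_COUNT] = {
    thunk_0, thunk_1, thunk_2, thunk_3
};

/* Fill in the data for thunk `index` and return its code address,
 * or NULL if the index is out of range. */
static thunk_fn bind_thunk(size_t index, handler_fn handler, void *ctx)
{
    if (index >= THUNK_COUNT)
        return NULL;
    thunk_slots[index].handler = handler;
    thunk_slots[index].ctx = ctx;
    return thunk_table[index];
}

/* Example handler: adds the int stored in ctx to the argument. */
static int add_ctx(void *ctx, int arg)
{
    return *(int *)ctx + arg;
}
```

Binding slot 1 to add_ctx with a pointer to some int and then calling the returned function pointer behaves like a closure over that int, without generating any code at run time.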


Is there any architecture that lacks both PC-relative data access and PC-relative call, with
ability to materialize the return address into a register?


Given codegen that is not "arbitrary", you make it "data driven" and you don't need
kernel support, unless there really exist architectures that cannot reasonably
synthesize PC-relative data access?


This works as long as you can use mmap or similar to map a .so/.dll any number of times,
to produce any number of thunks.

On Windows this is CreateFileMapping(SEC_IMAGE) + MapViewOfFile,
i.e. not dlopen and not LoadLibrary; those just increment a reference count
and return the original mapping.
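
A minimal POSIX sketch of the "map it again" idea. To stay runnable anywhere it maps a small temporary file read-only instead of a real thunk .so with PROT_EXEC; that substitution, the /tmp path, and the names map_file_again and demo_two_mappings are assumptions for illustration. The point is only that each mmap of the same file yields a separate mapping at its own address.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the whole file at `path` one more time.
 * Returns MAP_FAILED on error; stores the length on success. */
static void *map_file_again(const char *path, size_t *len_out)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return MAP_FAILED;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return MAP_FAILED;
    }
    /* A real thunk library would be mapped PROT_READ | PROT_EXEC. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping outlives the descriptor */
    if (p != MAP_FAILED)
        *len_out = (size_t)st.st_size;
    return p;
}

/* Returns 0 if two mappings of one file are distinct but identical. */
static int demo_two_mappings(void)
{
    static const char payload[] = "stand-in for a page of thunks";
    char path[] = "/tmp/thunk-demo-XXXXXX";
    size_t len1 = 0, len2 = 0;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, payload, sizeof payload) != (ssize_t)sizeof payload) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    void *a = map_file_again(path, &len1);
    void *b = map_file_again(path, &len2);
    unlink(path);

    int ok = a != MAP_FAILED && b != MAP_FAILED && a != b &&
             len1 == len2 && memcmp(a, b, len1) == 0;

    if (a != MAP_FAILED)
        munmap(a, len1);
    if (b != MAP_FAILED)
        munmap(b, len2);
    return ok ? 0 : -1;
}
```

With a real thunk library, each extra mapping would supply another batch of thunks, each batch finding its own data relative to where it was mapped.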

 - Jay


From: Libffi-discuss <libffi-discuss-bounces@sourceware.org> on behalf of Madhavan T. Venkataraman via Libffi-discuss <libffi-discuss@sourceware.org>
Sent: Thursday, September 17, 2020 3:36 PM
To: Florian Weimer <fw@deneb.enyo.de>
Cc: kernel-hardening@lists.openwall.com <kernel-hardening@lists.openwall.com>; linux-api@vger.kernel.org <linux-api@vger.kernel.org>; x86@kernel.org <x86@kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; oleg@redhat.com <oleg@redhat.com>; linux-security-module@vger.kernel.org <linux-security-module@vger.kernel.org>; linux-fsdevel@vger.kernel.org <linux-fsdevel@vger.kernel.org>; linux-integrity@vger.kernel.org <linux-integrity@vger.kernel.org>; libffi-discuss@sourceware.org <libffi-discuss@sourceware.org>; linux-arm-kernel@lists.infradead.org <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v2 0/4] [RFC] Implement Trampoline File Descriptor 
 


On 9/16/20 8:04 PM, Florian Weimer wrote:
> * madvenka:
> 
>> Examples of trampolines
>> =======================
>>
>> libffi (A Portable Foreign Function Interface Library):
>>
>> libffi allows a user to define functions with an arbitrary list of
>> arguments and return value through a feature called "Closures".
>> Closures use trampolines to jump to ABI handlers that handle calling
>> conventions and call a target function. libffi is used by a lot
>> of different applications. To name a few:
>>
>>       - Python
>>       - Java
>>       - Javascript
>>       - Ruby FFI
>>       - Lisp
>>       - Objective C
> 
> libffi does not actually need this.  It currently collocates
> trampolines and the data they need on the same page, but that's
> actually unnecessary.  It's possible to avoid doing this just by
> changing libffi, without any kernel changes.
> 
> I think this has already been done for the iOS port.
> 

The trampoline table that has been implemented for the iOS port (MACH)
is based on PC-relative data referencing. That is, the code and data
are placed in adjacent pages so that the code can access the data using
an address relative to the current PC.

This is an ISA feature that is not supported on all architectures.

Now, if it is a performance feature, we can include some architectures
and exclude others. But this is a security feature. IMO, we cannot
exclude any architecture even if it is a legacy one as long as Linux
is running on the architecture. So, we need a solution that does
not assume any specific ISA feature.

>> The code for trampoline X in the trampoline table is:
>>
>>       load    &code_table[X], code_reg
>>       load    (code_reg), code_reg
>>       load    &data_table[X], data_reg
>>       load    (data_reg), data_reg
>>       jump    code_reg
>>
>> The addresses &code_table[X] and &data_table[X] are baked into the
>> trampoline code. So, PC-relative data references are not needed. The user
>> can modify code_table[X] and data_table[X] dynamically.
> 
> You can put this code into the libffi shared object and map it from
> there, just like the rest of the libffi code.  To get more
> trampolines, you can map the page containing the trampolines multiple
> times, each instance preceded by a separate data page with the control
> information.
> 

If you put the code in the libffi shared object, how do you pass data to
the code at runtime? If the code we are talking about is a function, then
there is an ABI defined way to pass data to the function. But if the
code we are talking about is some arbitrary code such as a trampoline,
there is no ABI defined way to pass data to it except in a couple of
platforms such as HP PA-RISC that have support for function descriptors
in the ABI itself.

As mentioned before, if the ISA supports PC-relative data references
(e.g., X86 64-bit platforms support RIP-relative data references)
then we can pass data to that code by placing the code and data in
adjacent pages. So, you can implement the trampoline table for X64.
i386 does not support it.


> I think the previous patch submission has also resulted in several
> comments along those lines, so I'm not sure why you are reposting
> this.

IIRC, I have answered all of those comments by mentioning the point
that we need to support all architectures without requiring special
ISA features. Taking the kernel's help in this is one solution.


> 
>> libffi
>> ======
>>
>> I have implemented my solution for libffi and provided the changes for
>> X86 and ARM, 32-bit and 64-bit. Here is the reference patch:
>>
>> https://nam10.safelinks.protection.outlook.com/?url=http:%2F%2Flinux.microsoft.com%2F~madvenka%2Flibffi%2Flibffi.v2.txt&amp;data=02%7C01%7C%7C25b693de3de342e1e02c08d85b1f6af5%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637359537776320186&amp;sdata=b%2BqpgrUoSy%2FrprtE4xgd0%2FhPiFxTOh69yYjlTkgSQoc%3D&amp;reserved=0
> 
> The URL does not appear to work, I get a 403 error.

I apologize for that. That site is supposed to be accessible publicly.
I will contact the administrator and get this resolved.

Sorry for the annoyance.

> 
>> If the trampfd patchset gets accepted, I will send the libffi changes
>> to the maintainers for a review. BTW, I have also successfully executed
>> the libffi self tests.
> 
> I have not seen your libffi changes, but I expect that the complexity
> is about the same as a userspace-only solution.
> 
> 

I agree. The complexity is about the same. But the support is for all
architectures. Once the common code is in place, the changes for each
architecture are trivial.

Madhavan

> Cc:ing libffi upstream for awareness.  The start of the thread is
> here:
> 
> <https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flinux-api%2F20200916150826.5990-1-madvenka%40linux.microsoft.com%2F&amp;data=02%7C01%7C%7C25b693de3de342e1e02c08d85b1f6af5%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637359537776320186&amp;sdata=nIIDBh6F%2Fit%2BklEWLzuy0iiKCCf%2BxRf4JNZS8LbFkOY%3D&amp;reserved=0>
> 
