RFC: malloc and secure memory.

public inbox for libc-alpha@sourceware.org

* RFC: malloc and secure memory.
@ 2020-09-24 20:56 Carlos O'Donell
  2020-09-25  6:39 ` Florian Weimer
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Carlos O'Donell @ 2020-09-24 20:56 UTC (permalink / raw)
  To: libc-alpha

In reviewing this discussion:
https://github.com/systemd/systemd/pull/14213

The request is for a way to mark some allocations as "secure"
and to give them special properties.

I wonder if we can't do this in some generic way:

- Make arenas a first class construct.

/* Get arena with special properties.  */
malloc_arena *secure_arena = NULL;
/* Get a handle to an arena that has secure heaps.  If glibc can make this
   kind of arena and heap then it does, otherwise it returns NULL.  */
secure_arena = malloc_arena_get (HEAP_SECURE);
/* Does this glibc support this kind of arena?  */
if (secure_arena == NULL)
  abort();

- Bind the malloc call site to a specific arena with specific properties.

For example:

  /* malloc_arena takes an opaque arena pointer that is a global
     variable that the implementation provides, a function pointer
     to the memory allocator routine, e.g. malloc, and a size.  */
  password_storage = malloc_arena (secure_arena, malloc, size);
  ...
  /* Completely different TU, or scope... */
  free (password_storage);

At the call site you bind:
* The arena.
* The allocator routine.
* The parameters.

At runtime you can:
* Verify if the allocator is one that is part of your own
  implementation, and if it isn't abort (fixes partial
  interposition problem).
* Call the allocator but with the new arena as active
  with the given size.
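
A minimal sketch of the call-site binding described above, under loud
assumptions: every name here is hypothetical and none of it is an
existing glibc interface. The wrapper is called malloc_arena_call
because C cannot use malloc_arena for both the type and the function,
impl_malloc stands in for the implementation's own allocator, and the
sketch returns NULL for a foreign allocator where the text above
suggests aborting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types; not an existing glibc API.  */
typedef struct malloc_arena { int kind; } malloc_arena;
typedef void *(*alloc_fn) (size_t);

/* Arena the allocator should use for the current thread.  */
static _Thread_local malloc_arena *active_arena;

/* Stand-in for the implementation's own allocator entry point.  */
static void *
impl_malloc (size_t size)
{
  /* In this sketch the allocator only runs with an arena bound.  */
  assert (active_arena != NULL);
  /* A real allocator would carve the chunk out of active_arena.  */
  return malloc (size);
}

/* Bind the arena, the allocator routine and the size at the call site.  */
static void *
malloc_arena_call (malloc_arena *arena, alloc_fn fn, size_t size)
{
  /* Refuse allocators that are not part of this implementation.  */
  if (arena == NULL || fn != impl_malloc)
    return NULL;

  /* Make the arena active for the duration of the call, then restore.  */
  malloc_arena *saved = active_arena;
  active_arena = arena;
  void *p = fn (size);
  active_arena = saved;
  return p;
}
```

Calling malloc_arena_call (&arena, impl_malloc, 32) returns a chunk,
while passing any other function, e.g. an interposed malloc, returns
NULL. Whether a plain pointer comparison is reliable in the presence of
PLT entries is an open question that this sketch does not settle.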

- Specific arenas have special properties.

For example:
- All allocations from a secure arena use mmap.
- All frees from the secure arena unmap the memory.
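
As a rough illustration of these two properties, assuming Linux-style
anonymous mmap: secure_alloc and secure_free are hypothetical helpers,
not glibc functions, and the size header is one possible way for the
free side to recover the mapping length.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Header at the start of each mapping.  The union pads it so that the
   payload that follows stays suitably aligned for any object type.  */
union secure_hdr
{
  size_t map_len;
  max_align_t align;
};

/* Every allocation gets its own anonymous mapping.  */
static void *
secure_alloc (size_t size)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  if (size > SIZE_MAX - sizeof (union secure_hdr) - page)
    return NULL;
  /* Round header plus payload up to a whole number of pages.  */
  size_t len = (size + sizeof (union secure_hdr) + page - 1) / page * page;
  void *map = mmap (NULL, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (map == MAP_FAILED)
    return NULL;
  union secure_hdr *hdr = map;
  hdr->map_len = len;
  return hdr + 1;
}

/* Every free returns the whole mapping to the kernel.  */
static void
secure_free (void *ptr)
{
  if (ptr == NULL)
    return;
  union secure_hdr *hdr = (union secure_hdr *) ptr - 1;
  int rc = munmap (hdr, hdr->map_len);
  assert (rc == 0);
  (void) rc;
}
```

Note that a pointer from secure_alloc must go back through secure_free.
Letting plain free accept it is exactly the part that needs allocator
support.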

Notes:
- Callers must know the memory may contain "secure" information and
  that the eventual user of the memory may place "secure" data in that
  memory. The alternative is significantly more complex in that the actual
  chunk will have to carry the arena type information e.g. pointer to
  an arena or index value. That is to say that the binding of the context
  happens at the *use* site of the memory in question and spreads to the
  entire chunk.

- This concept that an arena should service specific types of memory
  is something that DJ and I have been tossing about for a while.
  Particularly since it might be useful, at an implementation level,
  to actually have the properties be specific to the logical heap. For
  example an arena could have a heap that services all large requests.
  So large requests always go via a specific heap in the same arena.
  This creates size-classes which might help alleviate some of the
  workload mixing issues we see with size growth. Likewise the arena
  could have a secure heap that follows these different rules. So
  while the user requests a HEAP_SECURE arena, they will just get
  their local arena but pointing to the secure heap within that arena.

  e.g.
  struct malloc_arena
  {
    heap_info heap; /* Heap selected by the type.  */
    /* ... other info ...  */
  };

- Could be used to specifically request "tagged (coloured) memory"
  like that being offered by aarch64's MTE, and bound to a specific
  allocation only e.g. malloc_arena_get (HEAP_TAGGED); and then
  deal with the consequences of specific chunks that are always
  MTE-enabled.

- I don't want to design a fully pluggable arena interface in C
  with callbacks, just something where we can extend the existing
  interface with a new API.

- Should we consider reducing the heap size to less than 64MiB
  on 64-bit processes?

-- 
Cheers,
Carlos.


* Re: RFC: malloc and secure memory.
  2020-09-24 20:56 RFC: malloc and secure memory Carlos O'Donell
@ 2020-09-25  6:39 ` Florian Weimer
  2020-09-25 16:10   ` Carlos O'Donell
  2020-09-25 23:39 ` Stefan O'Rear
  2020-09-26 17:07 ` Rich Felker
  2 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2020-09-25  6:39 UTC (permalink / raw)
  To: Carlos O'Donell via Libc-alpha

* Carlos O'Donell via Libc-alpha:

> I wonder if we can't do this in some generic way:
>
> - Make arenas a first class construct.
>
> /* Get arena with special properties.  */
> malloc_arena *secure_arena = NULL;
> /* Get a handle to an arena that has secure heaps.  If glibc can make this
>    kind of arena and heap then it does, otherwise it returns NULL.  */
> secure_arena = malloc_arena_get (HEAP_SECURE);
> /* Does this glibc support this kind of arena?  */
> if (secure_arena == NULL)
>   abort();
>
> - Bind the malloc call site to a specific arena with specific properties.
>
> For example:
>
>   /* malloc_arena takes an opaque arena pointer that is a global
>      variable that the implementation provides, a function pointer
>      to the memory allocator routine, e.g. malloc, and a size.  */
>   password_storage = malloc_arena (secure_arena, malloc, size);
>   ...
>   /* Completely different TU, or scope... */
>   free (password_storage);

How is this going to work with existing out-of-tree mallocs?  Do you
want them all to change?  Why would these implementations want to add
the overhead to support memory they have not allocated?  How would they
discover the actual implementation of free to call?

If we want to add new allocator interfaces, they need to have completely
separate names, and should follow an existing, well-understood design
(e.g., the APR pool interfaces, libtalloc with its pointers-as-pools,
the Windows 2.x heap interfaces with its handles).

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


* Re: RFC: malloc and secure memory.
  2020-09-25  6:39 ` Florian Weimer
@ 2020-09-25 16:10   ` Carlos O'Donell
  0 siblings, 0 replies; 8+ messages in thread
From: Carlos O'Donell @ 2020-09-25 16:10 UTC (permalink / raw)
  To: Florian Weimer, Carlos O'Donell via Libc-alpha

On 9/25/20 2:39 AM, Florian Weimer wrote:
> * Carlos O'Donell via Libc-alpha:
> 
>> I wonder if we can't do this in some generic way:
>>
>> - Make arenas a first class construct.
>>
>> /* Get arena with special properties.  */
>> malloc_arena *secure_arena = NULL;
>> /* Get a handle to an arena that has secure heaps.  If glibc can make this
>>    kind of arena and heap then it does, otherwise it returns NULL.  */
>> secure_arena = malloc_arena_get (HEAP_SECURE);
>> /* Does this glibc support this kind of arena?  */
>> if (secure_arena == NULL)
>>   abort();
>>
>> - Bind the malloc call site to a specific arena with specific properties.
>>
>> For example:
>>
>>   /* malloc_arena takes an opaque arena pointer that is a global
>>      variable that the implementation provides, a function pointer
>>      to the memory allocator routine, e.g. malloc, and a size.  */
>>   password_storage = malloc_arena (secure_arena, malloc, size);
>>   ...
>>   /* Completely different TU, or scope... */
>>   free (password_storage);
> 
> How is this going to work with existing out-of-tree mallocs?  Do you
> want them all to change?  Why would these implementations want to add
> the overhead to support memory they have not allocated?  How would they
> discover the actual implementation of free to call?

The call to malloc_arena is passed a function pointer for malloc, and my intent
was to compare it against our own malloc to detect an interposed allocator that
doesn't implement the call. Granted, this would have to be tested to verify that
it even works, e.g. address of the PLT entry vs. canonical address, etc. If
interposition is detected we would fail the call and return an error to show
that the mixed-allocator use case is not supported. This opens a big problem
though in that you now have the following scenarios:

(1) Uninterposed: Works fine.
(2) Interposed: Works sometimes depending on interposer and testing.

Which is not a great situation, but supports the "I don't control the point
of free" requirement in the design.

> If we want to add new allocator interfaces, they need to have completely
> separate names, and should follow an existing, well-understood design
> (e.g., the APR pool interfaces, libtalloc with its pointers-as-pools,
> the Windows 2.x heap interfaces with its handles).

This is the other way to make the decision, and it has better scenarios:

(1) Before using APIs: works fine.
(2) Using new APIs: works fine.

There is no potentially tricky case where you have to check for the new
allocator and if it fails have a fallback.

If we agree that a new API should be used, then it doesn't need to be
solved in glibc. Other runtimes can provide "secure memory" handling
and the applications have to use those APIs.

In summary:

(a) Is calling free with this memory important enough to design an API
    around that use case?

(b) Should applications develop new APIs to solve this problem?

My feeling is (a) "No" because it's a lot of design and maintenance
overhead for a limited problem with bad runtime scenarios.

My feeling is (b) "Yes" because it yields a better runtime scenario.

Thoughts?

-- 
Cheers,
Carlos.

* Re: RFC: malloc and secure memory.
  2020-09-24 20:56 RFC: malloc and secure memory Carlos O'Donell
  2020-09-25  6:39 ` Florian Weimer
@ 2020-09-25 23:39 ` Stefan O'Rear
  2020-09-27 12:39   ` Carlos O'Donell
  2020-09-26 17:07 ` Rich Felker
  2 siblings, 1 reply; 8+ messages in thread
From: Stefan O'Rear @ 2020-09-25 23:39 UTC (permalink / raw)
  To: Stefan O'Rear via Libc-alpha

On Thu, Sep 24, 2020, at 4:56 PM, Carlos O'Donell via Libc-alpha wrote:
> In reviewing this discussion:
> https://github.com/systemd/systemd/pull/14213
> 
> The request is for a way to mark some allocations as "secure"
> and to give them special properties.
> 
> I wonder if we can't do this in some generic way:
> 
> - Make arenas a first class construct.
> 
> /* Get arena with special properties.  */
> malloc_arena *secure_arena = NULL;
> /* Get a handle to an arena that has secure heaps.  If glibc can make this
>    kind of arena and heap then it does, otherwise it returns NULL.  */
> secure_arena = malloc_arena_get (HEAP_SECURE);
> /* Does this glibc support this kind of arena?  */
> if (secure_arena == NULL)
>   abort();

This is a bit late and I apologize, but is there any possibility of choosing
a more descriptive name than SECURE for this?  It's extremely vague, will mean
something different for everybody, and because security is a situational and 
global property of systems, I would generally consider it incorrect to use
"secure" to describe local and binary properties of subsystems.

-s

* Re: RFC: malloc and secure memory.
  2020-09-24 20:56 RFC: malloc and secure memory Carlos O'Donell
  2020-09-25  6:39 ` Florian Weimer
  2020-09-25 23:39 ` Stefan O'Rear
@ 2020-09-26 17:07 ` Rich Felker
  2020-09-27 14:04   ` Carlos O'Donell
  2020-09-27 20:29   ` Florian Weimer
  2 siblings, 2 replies; 8+ messages in thread
From: Rich Felker @ 2020-09-26 17:07 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: libc-alpha

On Thu, Sep 24, 2020 at 04:56:59PM -0400, Carlos O'Donell via Libc-alpha wrote:
> In reviewing this discussion:
> https://github.com/systemd/systemd/pull/14213
> 
> The request is for a way to mark some allocations as "secure"
> and to give them special properties.
> 
> I wonder if we can't do this in some generic way:
> 
> - Make arenas a first class construct.
> 
> /* Get arena with special properties.  */
> malloc_arena *secure_arena = NULL;
> /* Get a handle to an arena that has secure heaps.  If glibc can make this
>    kind of arena and heap then it does, otherwise it returns NULL.  */
> secure_arena = malloc_arena_get (HEAP_SECURE);
> /* Does this glibc support this kind of arena?  */
> if (secure_arena == NULL)
>   abort();
> 
> - Bind the malloc call site to a specific arena with specific properties.

I don't see any plausible motivation for why a caller would want to do
this rather than just calling mmap. It's far less portable, more
error-prone (since it doesn't look like it would catch use of one type
where you meant the other, as opposed to with direct mmap where
passing it to free or vice versa would trap at some point), and more
complex to program for.

Is the answer just "for systemd reasons"?

> For example:
> 
>   /* malloc_arena takes an opaque arena pointer that is a global
>      variable that the implementation provides, a function pointer
>      to the memory allocator routine, e.g. malloc, and a size.  */
>   password_storage = malloc_arena (secure_arena, malloc, size);

Is the function pointer merely being used as a lookup key here? Or is
the intent that it would be implemented by changing some thread-local
context, calling the pointed-to function, then restoring it? I don't
see what you get by having this weird interface since the signature
has to match to make the call, and there are no other malloc-family
functions that match malloc's signature anyway.

Rich

* Re: RFC: malloc and secure memory.
  2020-09-25 23:39 ` Stefan O'Rear
@ 2020-09-27 12:39   ` Carlos O'Donell
  0 siblings, 0 replies; 8+ messages in thread
From: Carlos O'Donell @ 2020-09-27 12:39 UTC (permalink / raw)
  To: Stefan O'Rear, Stefan O'Rear via Libc-alpha

On 9/25/20 7:39 PM, Stefan O'Rear via Libc-alpha wrote:
> On Thu, Sep 24, 2020, at 4:56 PM, Carlos O'Donell via Libc-alpha wrote:
>> In reviewing this discussion:
>> https://github.com/systemd/systemd/pull/14213
>>
>> The request is for a way to mark some allocations as "secure"
>> and to give them special properties.
>>
>> I wonder if we can't do this in some generic way:
>>
>> - Make arenas a first class construct.
>>
>> /* Get arena with special properties.  */
>> malloc_arena *secure_arena = NULL;
>> /* Get a handle to an arena that has secure heaps.  If glibc can make this
>>    kind of arena and heap then it does, otherwise it returns NULL.  */
>> secure_arena = malloc_arena_get (HEAP_SECURE);
>> /* Does this glibc support this kind of arena?  */
>> if (secure_arena == NULL)
>>   abort();
> 
> This is a bit late and I apologize, but is there any possibility of choosing
> a more descriptive name than SECURE for this?  It's extremely vague, will mean
> something different for everybody, and because security is a situational and 
> global property of systems, I would generally consider it incorrect to use
> "secure" to describe local and binary properties of subsystems.

Sure. This is an RFC and largely a proposal to simply spark some conversation
and elucidate opinions.

I like Florian's suggestion that this is largely something that could be handled
with a new API to avoid problematic use cases. And if you have a new API it need
not be a part of glibc.

-- 
Cheers,
Carlos.


* Re: RFC: malloc and secure memory.
  2020-09-26 17:07 ` Rich Felker
@ 2020-09-27 14:04   ` Carlos O'Donell
  2020-09-27 20:29   ` Florian Weimer
  1 sibling, 0 replies; 8+ messages in thread
From: Carlos O'Donell @ 2020-09-27 14:04 UTC (permalink / raw)
  To: Rich Felker; +Cc: libc-alpha

On 9/26/20 1:07 PM, Rich Felker wrote:
> On Thu, Sep 24, 2020 at 04:56:59PM -0400, Carlos O'Donell via Libc-alpha wrote:
>> In reviewing this discussion:
>> https://github.com/systemd/systemd/pull/14213
>>
>> The request is for a way to mark some allocations as "secure"
>> and to give them special properties.
>>
>> I wonder if we can't do this in some generic way:
>>
>> - Make arenas a first class construct.
>>
>> /* Get arena with special properties.  */
>> malloc_arena *secure_arena = NULL;
>> /* Get a handle to an arena that has secure heaps.  If glibc can make this
>>    kind of arena and heap then it does, otherwise it returns NULL.  */
>> secure_arena = malloc_arena_get (HEAP_SECURE);
>> /* Does this glibc support this kind of arena?  */
>> if (secure_arena == NULL)
>>   abort();
>>
>> - Bind the malloc call site to a specific arena with specific properties.
> 
> I don't see any plausible motivation for why a caller would want to do
> this rather than just calling mmap. It's far less portable, more
> error-prone (since it doesn't look like it would catch use of one type
> where you meant the other, as opposed to with direct mmap where
> passing it to free or vice versa would trap at some point), and more
> complex to program for.

I agree with all of your points.

However, if you have existing legacy APIs and by their semantics the
caller can free some allocations later with free() and you want to
transparently apply some kind of additional semantics, then you need:

* At the call site of the allocation the extra properties need to be
  applied.

* At the call site of the deallocation the caller may no longer be
  aware of the extra properties and the deallocation should proceed
  as expected.

Don't get me wrong, applying a design like this yields problems, and
you and I point them out. In summary:

* New API is not as portable as mmap.

* New API yields problems for interposed allocators.

* New API could yield problems if down-stack callers imply that they
  can free/malloc (realloc might work) the allocation without losing
  the additional properties.

> Is the answer just "for systemd reasons"?

Having a public conversation about something and coming up with valid
reasons NOT to do it is just as important as having a conversation for
all the things you WOULD do.

The only way I see to have such a conversation is to put together an
RFC, discuss it, and reject the proposal with rationale given.

I might also have raised this issue on libc-coord actually, now that
you mention "reasons."

>> For example:
>>
>>   /* malloc_arena takes an opaque arena pointer that is a global
>>      variable that the implementation provides, a function pointer
>>      to the memory allocator routine, e.g. malloc, and a size.  */
>>   password_storage = malloc_arena (secure_arena, malloc, size);
> 
> Is the function pointer merely being used as a lookup key here? Or is
> the intent that it would be implemented by changing some thread-local
> context, calling the pointed-to function, then restoring it? I don't
> see what you get by having this weird interface since the signature
> has to match to make the call, and there are no other malloc-family
> functions that match malloc's signature anyway.

I mention the primary rationale here in the downthread post, but let
me elucidate the problem a bit more:

(1) glibc implements a general-purpose system allocator.

(2) Various APIs all use the allocator, and as such their
    returned pointers are compatible with the allocator's unadorned
    free() API.

(3) Other developers implement interposed allocators and as such
    there may be a mix of new/old interposers and new/old glibc.

(4) Users expect interposed allocators to work.

It turns out that over time supporting (4) creates a strong incentive
to avoid changing the implemented allocator APIs.

Why? If you add any API to (1) and (3) doesn't support it, then
when you add in code that uses a new allocator API and attempt (4)
it will likely fail as chunks cross allocator boundaries because
(3) didn't interpose the new APIs.

Today it basically means we can't extend the allocator APIs anymore
because of prevalent use of special-purpose allocators designed to
optimize for specific workloads e.g. jemalloc, tcmalloc, new-tcmalloc.

To avoid problematic scenarios we need to provide entirely new APIs
that users would never mix with the original allocator APIs.

One suggestion is to provide a way to bind the allocation and
deallocation sides together at compile time, and I have no solutions
for that today.

My last idea is, as proposed above, to pass in the pointer to the
expected malloc, and if the allocator can prove that this malloc
does not match e.g. detect interposition, then it can declare that
the free is equally unsafe and return NULL for the allocation
to indicate that such memory with such properties cannot be allocated.

In summary:
- Supporting allocator interposition has created a composability
  problem with adding new allocation/deallocation APIs.
- glibc is effectively unable to add new APIs easily that aren't
  just in the form of hints to mallopt().
- If we have composability problems then we should just use new
  APIs from other allocator libraries and leave glibc's malloc
  for applications that need those exact semantics.
- New libraries can provide a superset of glibc's APIs and that
  would be safe because the application would require that specific
  library to operate.

Do you agree with that summary?

-- 
Cheers,
Carlos.

* Re: RFC: malloc and secure memory.
  2020-09-26 17:07 ` Rich Felker
  2020-09-27 14:04   ` Carlos O'Donell
@ 2020-09-27 20:29   ` Florian Weimer
  1 sibling, 0 replies; 8+ messages in thread
From: Florian Weimer @ 2020-09-27 20:29 UTC (permalink / raw)
  To: Rich Felker; +Cc: Carlos O'Donell, libc-alpha

* Rich Felker:

> On Thu, Sep 24, 2020 at 04:56:59PM -0400, Carlos O'Donell via Libc-alpha wrote:
>> In reviewing this discussion:
>> https://github.com/systemd/systemd/pull/14213
>> 
>> The request is for a way to mark some allocations as "secure"
>> and to give them special properties.
>> 
>> I wonder if we can't do this in some generic way:
>> 
>> - Make arenas a first class construct.
>> 
>> /* Get arena with special properties.  */
>> malloc_arena *secure_arena = NULL;
>> /* Get a handle to an arena that has secure heaps.  If glibc can make this
>>    kind of arena and heap then it does, otherwise it returns NULL.  */
>> secure_arena = malloc_arena_get (HEAP_SECURE);
>> /* Does this glibc support this kind of arena?  */
>> if (secure_arena == NULL)
>>   abort();
>> 
>> - Bind the malloc call site to a specific arena with specific properties.
>
> I don't see any plausible motivation for why a caller would want to do
> this rather than just calling mmap. It's far less portable, more
> error-prone (since it doesn't look like it would catch use of one type
> where you meant the other, as opposed to with direct mmap where
> passing it to free or vice versa would trap at some point), and more
> complex to program for.

Many systems only allow 64 KiB of non-swappable memory per process, so
calling mmap (followed by mlock) would be quite wasteful and restrict
the number of active allocations to 16 with a 4 KiB page size.
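
The arithmetic behind that figure, as a small sketch: with one page per
allocation, the locked-memory budget divided by the page size bounds the
number of live allocations. The helper names are hypothetical. The
second function reads the actual budget from RLIMIT_MEMLOCK, which may
differ from 64 KiB on a given system.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <unistd.h>

/* Number of page-granular locked mappings that fit into LIMIT_BYTES
   when each allocation occupies at least one page.  */
static size_t
max_locked_mappings (size_t limit_bytes, size_t page_size)
{
  assert (page_size != 0);
  return limit_bytes / page_size;
}

/* Same figure for the running process.  Returns (size_t) -1 when the
   limit is unlimited or cannot be queried.  */
static size_t
locked_mappings_for_this_process (void)
{
  struct rlimit rl;
  long page = sysconf (_SC_PAGESIZE);
  if (page <= 0
      || getrlimit (RLIMIT_MEMLOCK, &rl) != 0
      || rl.rlim_cur == RLIM_INFINITY)
    return (size_t) -1;
  return max_locked_mappings ((size_t) rl.rlim_cur, (size_t) page);
}
```

For the numbers in the text, max_locked_mappings (64 * 1024, 4096)
gives 16.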

There are various competing ideas about what “secure” should mean in this
context, so maybe it's supposed to be about something else, and the 64
KiB wouldn't apply.  It would be tough to get multiple libraries to
coordinate the best use of this space even with a finer-grained
allocator—who wants to use non-secure memory for their data?

But I think the general idea of requiring that programmers open-code
matching allocator and deallocator calls is not too onerous.  If it is,
the next village down the road has std::allocator on offer.
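
One way such an open-coded pair could look, as a sketch only: the
secret_buf type and both functions are hypothetical, the mlock call is
best effort, and wiping the buffer before unmapping is an assumption
about what callers would want rather than something the thread
specifies.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical handle type.  The caller keeps the length, so the
   deallocator needs no help from malloc's chunk metadata.  */
typedef struct secret_buf
{
  void *data;
  size_t len;     /* Mapping length, a whole number of pages.  */
  int locked;     /* Nonzero if mlock succeeded.  */
} secret_buf;

/* Returns 0 on success, -1 on failure.  */
static int
secret_buf_open (secret_buf *b, size_t size)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  if (size == 0 || size > (size_t) -1 - page)
    return -1;
  size_t len = (size + page - 1) / page * page;
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  b->data = p;
  b->len = len;
  /* Best effort: mlock can fail once RLIMIT_MEMLOCK is exhausted.  */
  b->locked = mlock (p, len) == 0;
  return 0;
}

/* The matching deallocator; safe to call twice on the same handle.  */
static void
secret_buf_close (secret_buf *b)
{
  if (b->data == NULL)
    return;
  /* Wipe through a volatile pointer so the stores are not elided.  */
  volatile unsigned char *v = b->data;
  for (size_t i = 0; i < b->len; i++)
    v[i] = 0;
  int rc = munmap (b->data, b->len);   /* Also drops the lock.  */
  assert (rc == 0);
  (void) rc;
  b->data = NULL;
  b->len = 0;
  b->locked = 0;
}
```

The caller pairs secret_buf_open with secret_buf_close explicitly, so
plain free never sees the memory and interposed allocators are not
involved.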
