public inbox for libc-alpha@sourceware.org
* memcpy is leaking secret data through ZMM vector registers
@ 2024-04-19 14:07 Mikulas Patocka
  2024-04-19 14:19 ` H.J. Lu
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Mikulas Patocka @ 2024-04-19 14:07 UTC (permalink / raw)
  To: libc-alpha; +Cc: Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

Hi

As a part of LVM2, we are developing the libdevmapper library. The library 
may be used to load cryptographic keys to the kernel, so it avoids leaking 
the data to kernel memory and to the swap partition.

After the use of cryptographic data, the libdevmapper library clears them 
with memset and frees them afterwards. It executes __asm__ volatile("" ::: 
"memory") to thwart some compiler optimization regarding writing to 
to-be-freed memory.
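
A minimal sketch of the clearing pattern described above, assuming GCC or Clang. The helper name wipe_secret is illustrative and not taken from libdevmapper, and passing the pointer to the asm statement is a common strengthening of the plain "memory" clobber rather than something stated in this message. It scrubs only the buffer itself and does nothing about copies left in registers:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name is illustrative): overwrite a secret in place. */
static void wipe_secret(void *buf, size_t len)
{
    memset(buf, 0, len);
    /* Compiler barrier: the asm statement takes the pointer as an input and
       clobbers "memory", so the compiler has to assume the zeroed bytes may
       be read and cannot discard the memset as a dead store, even if the
       buffer is freed right afterwards. */
    __asm__ volatile("" : : "r"(buf) : "memory");
}
```

A caller would invoke wipe_secret(key, key_len) immediately before free(key).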

We have a test, "dmsecuretest.sh", that loads cryptographic keys into the 
kernel and dumps a core; the core file is then analyzed, and if it 
contains the key, the test fails.
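
The check that such a test performs can be sketched as a plain byte search over the dumped image. This is only an illustration of the idea, not the actual dmsecuretest.sh logic; the helper name contains_bytes is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if `needle` occurs anywhere in `haystack`,
   for example a key inside the contents of a core file read into memory. */
static int contains_bytes(const unsigned char *haystack, size_t hlen,
                          const unsigned char *needle, size_t nlen)
{
    if (nlen == 0 || nlen > hlen)
        return 0;
    for (size_t i = 0; i + nlen <= hlen; i++) {
        if (memcmp(haystack + i, needle, nlen) == 0)
            return 1;   /* the secret is present in the dump */
    }
    return 0;
}
```

A test in this style would fail whenever contains_bytes reports the key in the dump.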

This test fails on AMD Zen 4 - the reason for the failure is that the 
"memcpy" function uses ZMM registers for data copying. When memcpy exits, 
the encryption key is present in the ZMM registers and the key remains 
there even after both source and destination buffers of memcpy were 
cleared.

When we perform dynamic symbol lookup, the ZMM registers are spilled on 
the stack and they remain there forever - this is the reason why the core 
file contains the encryption key and the test fails.

I'd like to ask what to do about it. We could use LD_BIND_NOW=1 (or 
-Wl,-z,now) - it mostly works, but not entirely - the key may still be 
present on the stack even if we use LD_BIND_NOW=1.

When I hack the file glibc/sysdeps/x86_64/multiarch/ifunc-memmove.h so 
that it always selects the ERMS variant of memcpy, the problem goes away.

Would it be possible to add a switch to glibc that security-sensitive 
programs could turn on and that would prevent glibc from using the 
vector registers? Or do you suggest another solution?

Mikulas



* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 14:07 memcpy is leaking secret data through ZMM vector registers Mikulas Patocka
@ 2024-04-19 14:19 ` H.J. Lu
  2024-04-19 14:24   ` Mikulas Patocka
  2024-04-21  1:20 ` Andreas K. Huettel
  2024-04-22  9:33 ` Szabolcs Nagy
  2 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2024-04-19 14:19 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024 at 7:08 AM Mikulas Patocka <mpatocka@redhat.com> wrote:
>
> Hi
>
> As a part of LVM2, we are developing the libdevmapper library. The library
> may be used to load cryptographic keys to the kernel, so it avoids leaking
> the data to kernel memory and to the swap partition.
>
> After the use of cryptographic data, the libdevmapper library clears them
> with memset and frees them afterwards. It executes __asm__ volatile("" :::
> "memory") to thwart some compiler optimization regarding writing to
> to-be-freed memory.
>
> We have a test "dmsecuretest.sh" that loads cryptographic keys into the
> kernel, dumps a core, the core file is analyzed and if it contains the
> key, the test fails.
>
> This test fails on AMD Zen 4 - the reason for the failure is that the
> "memcpy" function uses ZMM registers for data copying. When memcpy exits,
> the encryption key is present in the ZMM registers and the key remains
> there even after both source and destination buffers of memcpy were
> cleared.
>
> When we perform dynamic symbol lookup, the ZMM registers are spilled on
> the stack and they remain there forever - this is the reason why the core
> file contains the encryption key and the test fails.
>
> I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or
> -Wl,-z,now) - it mostly works, but not entirely - the key may still be
> present on the stack even if we use LD_BIND_NOW=1.

Since vector registers are saved on stack only during symbol lookup,
shouldn't disabling lazy binding solve this issue?

> When I hack the file glibc/sysdeps/x86_64/multiarch/ifunc-memmove.h so
> that it always selects the ERMS variant of memcpy, the problem goes away.
>
> Could it be possible to add some switch to glibc, that could be turned on
> by security-sensitive programs and that would prevent glibc from using the
> vector registers? Or, do you suggest another solution?
>
> Mikulas
>


-- 
H.J.


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 14:19 ` H.J. Lu
@ 2024-04-19 14:24   ` Mikulas Patocka
  2024-04-19 14:37     ` H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Mikulas Patocka @ 2024-04-19 14:24 UTC (permalink / raw)
  To: H.J. Lu; +Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel




On Fri, 19 Apr 2024, H.J. Lu wrote:

> On Fri, Apr 19, 2024 at 7:08 AM Mikulas Patocka <mpatocka@redhat.com> wrote:
> >
> > I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or
> > -Wl,-z,now) - it mostly works, but not entirely - the key may still be
> > present on the stack even if we use LD_BIND_NOW=1.
> 
> Since vector registers are saved on stack only during symbol lookup,
> shouldn't disabling lazy binding solve this issue?

It should, but it doesn't fix this problem entirely.

If I set "GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F,-AVX2" "LD_BIND_NOW=1", 
I still get a failure (I don't get the failure if I don't set 
GLIBC_TUNABLES and set only LD_BIND_NOW).

So, even if we use plain SSE, the data somehow end up on the stack.

Mikulas


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 14:24   ` Mikulas Patocka
@ 2024-04-19 14:37     ` H.J. Lu
  2024-04-19 18:04       ` Mikulas Patocka
  0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2024-04-19 14:37 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024 at 7:24 AM Mikulas Patocka <mpatocka@redhat.com> wrote:
>
>
>
> On Fri, 19 Apr 2024, H.J. Lu wrote:
>
> > On Fri, Apr 19, 2024 at 7:08 AM Mikulas Patocka <mpatocka@redhat.com> wrote:
> > >
> > > I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or
> > > -Wl,-z,now) - it mostly works, but not entirely - the key may still be
> > > present on the stack even if we use LD_BIND_NOW=1.
> >
> > Since vector registers are saved on stack only during symbol lookup,
> > shouldn't disabling lazy binding solve this issue?
>
> It should, but it doesn't fix this problem entirely.
>
> If I set "GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F,-AVX2" "LD_BIND_NOW=1",
> I still get a failure (I don't get the failure if I don't set
> GLIBC_TUNABLES and set only LD_BIND_NOW).
>
> So, even if we use plain SSE, the data somehow end up on the stack.
>

You should write your own memory copy function and compile it with
-fzero-call-used-regs if possible.
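
A minimal sketch of this suggestion, assuming a compiler that provides the zero_call_used_regs function attribute (the per-function counterpart of -fzero-call-used-regs). The names SCRUB_REGS and secret_copy are hypothetical, and the "used" setting is one possible choice; "all" would zero every call-used register instead:

```c
#include <stddef.h>

/* Enable register zeroing on return only where the attribute exists. */
#if defined(__has_attribute)
# if __has_attribute(zero_call_used_regs)
#  define SCRUB_REGS __attribute__((noinline, zero_call_used_regs("used")))
# endif
#endif
#ifndef SCRUB_REGS
# define SCRUB_REGS  /* attribute unavailable: registers are not zeroed */
#endif

/* Hypothetical private copy routine that does not call libc memcpy.
   noinline keeps the function boundary, since the zeroing happens when
   the function returns. */
SCRUB_REGS
static void secret_copy(void *dst, const void *src, size_t len)
{
    /* volatile accesses stop the compiler from turning the loop back
       into a memcpy call or vectorizing it. */
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;
    while (len--)
        *d++ = *s++;
}
```

The byte loop is slower than the libc routine, which is usually acceptable for key-sized buffers.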

-- 
H.J.


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 14:37     ` H.J. Lu
@ 2024-04-19 18:04       ` Mikulas Patocka
  2024-04-19 18:45         ` Paul Eggert
  0 siblings, 1 reply; 16+ messages in thread
From: Mikulas Patocka @ 2024-04-19 18:04 UTC (permalink / raw)
  To: H.J. Lu; +Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel



On Fri, 19 Apr 2024, H.J. Lu wrote:

> You should write your own memory copy function and compile it with
> -fzero-call-used-regs if possible.
> 
> -- 
> H.J.

This would work - but I looked at OpenSSL and it seems to suffer from the 
same problem as libdevmapper.

OpenSSL uses plain memcpy, it overwrites memory before freeing it, but it 
doesn't overwrite the YMM and ZMM registers.

So, it seems like overkill to add a special memcpy implementation to every 
library that manipulates sensitive data. It may be better to have some 
general solution. There's already "explicit_bzero", so maybe we could add 
"explicit_memcpy" or "secure_memcpy"?

Mikulas



* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 18:04       ` Mikulas Patocka
@ 2024-04-19 18:45         ` Paul Eggert
  2024-04-19 18:47           ` Zack Weinberg
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Eggert @ 2024-04-19 18:45 UTC (permalink / raw)
  To: Mikulas Patocka, H.J. Lu
  Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On 4/19/24 11:04, Mikulas Patocka wrote:
> There's already "explicit_bzero", so maybe we could add
> "explicit_memcpy"

Where would this stop? Wouldn't we also need explicit_memcmp, 
explicit_memmove, explicit_mempcpy, etc.? Pretty much any function that 
looks at memory could have the problem. Even C source code that doesn't 
invoke any C library function could have the problem.

On the library side, shouldn't this sort of thing be handled by 
_FORTIFY_SOURCE or something similar? And don't we need a compiler 
option saying "don't cache anything in registers"?


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 18:45         ` Paul Eggert
@ 2024-04-19 18:47           ` Zack Weinberg
  2024-04-19 18:53             ` Alexander Monakov
  0 siblings, 1 reply; 16+ messages in thread
From: Zack Weinberg @ 2024-04-19 18:47 UTC (permalink / raw)
  To: Paul Eggert, Mikulas Patocka, H . J . Lu
  Cc: GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz,
	dm-devel

On Fri, Apr 19, 2024, at 2:45 PM, Paul Eggert wrote:
> On 4/19/24 11:04, Mikulas Patocka wrote:
>> There's already "explicit_bzero", so maybe we could add
>> "explicit_memcpy"
>
> Where would this stop? Wouldn't we also need explicit_memcmp, 
> explicit_memmove, explicit_mempcpy, etc.? Pretty much any function that 
> looks at memory could have the problem. Even C source code that doesn't 
> invoke any C library function could have the problem.

As I recall, one of the arguments for _not_ adding explicit_bzero to glibc
was that we couldn't guarantee copies of the secret data wouldn't hang
around in registers.

Is a hypothetical function __attribute__((clear_call_clobbered_regs_on_exit))
what we need here instead, maybe?

zw


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 18:47           ` Zack Weinberg
@ 2024-04-19 18:53             ` Alexander Monakov
  2024-04-19 19:11               ` Zack Weinberg
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2024-04-19 18:53 UTC (permalink / raw)
  To: Zack Weinberg
  Cc: Paul Eggert, Mikulas Patocka, H . J . Lu, GNU libc development,
	Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel


On Fri, 19 Apr 2024, Zack Weinberg wrote:

> On Fri, Apr 19, 2024, at 2:45 PM, Paul Eggert wrote:
> > On 4/19/24 11:04, Mikulas Patocka wrote:
> >> There's already "explicit_bzero", so maybe we could add
> >> "explicit_memcpy"
> >
> > Where would this stop? Wouldn't we also need explicit_memcmp, 
> > explicit_memmove, explicit_mempcpy, etc.? Pretty much any function that 
> > looks at memory could have the problem. Even C source code that doesn't 
> > invoke any C library function could have the problem.
> 
> As I recall, one of the arguments for _not_ adding explicit_bzero to glibc
> was that we couldn't guarantee copies of the secret data wouldn't hang
> around in registers.

bzero and memset have no reason to read data from memory, they only need
to overwrite that memory. This makes them different from memcpy.

In the caller of memset/memcpy, sure, copies of that data may be present
in registers.

> Is a hypothetical function __attribute__((clear_call_clobbered_regs_on_exit))
> what we need here instead, maybe?

As indicated upthread, there's a non-hypothetical
__attribute__((zero_call_used_regs)), unless you mean something else?

Alexander


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 18:53             ` Alexander Monakov
@ 2024-04-19 19:11               ` Zack Weinberg
  2024-04-19 20:15                 ` Mikulas Patocka
  0 siblings, 1 reply; 16+ messages in thread
From: Zack Weinberg @ 2024-04-19 19:11 UTC (permalink / raw)
  To: Alexander Monakov
  Cc: Paul Eggert, Mikulas Patocka, H . J . Lu, GNU libc development,
	Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024, at 2:53 PM, Alexander Monakov wrote:
> On Fri, 19 Apr 2024, Zack Weinberg wrote:
>
>> On Fri, Apr 19, 2024, at 2:45 PM, Paul Eggert wrote:
>> > On 4/19/24 11:04, Mikulas Patocka wrote:
>> >> There's already "explicit_bzero", so maybe we could add
>> >> "explicit_memcpy"
>> >
>> > Where would this stop? Wouldn't we also need explicit_memcmp, 
>> > explicit_memmove, explicit_mempcpy, etc.? Pretty much any function that 
>> > looks at memory could have the problem. Even C source code that doesn't 
>> > invoke any C library function could have the problem.
>> 
>> As I recall, one of the arguments for _not_ adding explicit_bzero to glibc
>> was that we couldn't guarantee copies of the secret data wouldn't hang
>> around in registers.
>
> bzero and memset have no reason to read data from memory, they only need
> to overwrite that memory. This makes them different from memcpy.

Yes, but the compiler does not know that bzero/explicit_bzero/memset only write
and do not read, which means if you have something like

void aes256_encrypt_in_place(const uint8_t *key, const uint8_t *iv,
                             uint8_t *data, size_t len)
{
    __m128 round_keys[AES256_N_ROUND_KEYS];
    aes256_expand_key(key, round_keys);
    aes256_do_cbc(round_keys, iv, data, len);
    explicit_bzero(round_keys, sizeof round_keys);
}

and aes256_expand_key and aes256_do_cbc get inlined, the compiler might
be able to keep the entire key schedule in the vector registers *until*
the call to explicit_bzero.  But right before calling explicit_bzero,
it will have to copy the round_keys array onto the stack!  And the copy
of round_keys in the vector registers *won't* get erased -- the exact
problem being discussed in this thread.

>> Is a hypothetical function __attribute__((clear_call_clobbered_regs_on_exit))
>> what we need here instead, maybe?
>
> As indicated upthread, there's a non-hypothetical
> __attribute__((zero_call_used_regs)), unless you mean something else?

I didn't know whether "call used" meant what I mean by "call clobbered".
Also, it's not clear to me whether this is bulletproof (under whatever name).

zw


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 19:11               ` Zack Weinberg
@ 2024-04-19 20:15                 ` Mikulas Patocka
  2024-04-19 20:31                   ` Zack Weinberg
  0 siblings, 1 reply; 16+ messages in thread
From: Mikulas Patocka @ 2024-04-19 20:15 UTC (permalink / raw)
  To: Zack Weinberg
  Cc: Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development,
	Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel



On Fri, 19 Apr 2024, Zack Weinberg wrote:

> Yes, but the compiler does not know that bzero/explicit_bzero/memset only write
> and do not read, which means if you have something like
> 
> void aes256_encrypt_in_place(const uint8_t *key, const uint8_t *iv,
>                              uint8_t *data, size_t len)
> {
>     __m128 round_keys[AES256_N_ROUND_KEYS];
>     aes256_expand_key(key, round_keys);
>     aes256_do_cbc(round_keys, iv, data, len);
>     explicit_bzero(round_keys, sizeof round_keys);
> }
> 
> and aes256_expand_key and aes256_do_cbc get inlined, the compiler might
> be able to keep the entire key schedule in the vector registers *until*
> the call to explicit_bzero.  But right before calling explicit_bzero,
> it will have to copy the round_keys array onto the stack!  And the copy
> of round_keys in the vector registers *won't* get erased -- the exact
> problem being discussed in this thread.

On the SYSV ABI, all the vector registers are volatile, so you can erase 
them in explicit_bzero.

On Windows 64-bit ABI, it is more problematic, because some of the vector 
registers must be preserved.
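
A rough sketch of what such an erasing routine could look like on x86-64 with the SYSV ABI, assuming GCC or Clang inline assembly. The helper name is hypothetical and this is not how glibc implements explicit_bzero. As far as I know, vzeroall clears only YMM0-YMM15 (and the matching low ZMM registers), so ZMM16-ZMM31 would need separate handling that is not shown here:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero the buffer, then wipe the low vector registers.
   Returns 1 if the vector registers were wiped, 0 otherwise. */
static int wipe_buffer_and_vector_regs(void *buf, size_t len)
{
    memset(buf, 0, len);
    /* Keep the compiler from discarding the memset as a dead store. */
    __asm__ volatile("" : : "r"(buf) : "memory");

#if defined(__x86_64__) && !defined(_WIN32) && !defined(__CYGWIN__)
    if (__builtin_cpu_supports("avx")) {
        /* All of these registers are call-clobbered in the SYSV ABI, so
           zeroing them here cannot break the caller. */
        __asm__ volatile("vzeroall" :::
                         "xmm0", "xmm1", "xmm2", "xmm3",
                         "xmm4", "xmm5", "xmm6", "xmm7",
                         "xmm8", "xmm9", "xmm10", "xmm11",
                         "xmm12", "xmm13", "xmm14", "xmm15");
        return 1;
    }
#endif
    return 0;   /* no AVX or a different ABI: registers left untouched */
}
```

The ABI guard matters because, as noted above, the Windows 64-bit convention requires some vector registers to be preserved across calls.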

Mikulas



* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 20:15                 ` Mikulas Patocka
@ 2024-04-19 20:31                   ` Zack Weinberg
  2024-04-19 21:11                     ` Mikulas Patocka
  0 siblings, 1 reply; 16+ messages in thread
From: Zack Weinberg @ 2024-04-19 20:31 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development,
	Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
> On Fri, 19 Apr 2024, Zack Weinberg wrote:
>> ... the copy
>> of round_keys in the vector registers *won't* get erased -- the exact
>> problem being discussed in this thread.
>
> On the SYSV ABI, all the vector registers are volatile, so you can erase 
> them in explicit_bzero.
>
> On Windows 64-bit ABI, it is more problematic, because some of the vector 
> registers must be preserved.

Oh, huh. Yes, that would work. Call-preserved registers are not a
problem, because any function that puts secret data in a call-preserved
register in the first place, must erase it again (by restoring the old
value) before returning. Therefore, if we made explicit_bzero wipe *all*
the call-clobbered registers before returning, my example function would
be safe.

There's still a place secrets could leak to and not get erased, though:
register spill slots on the stack. Only the compiler could plug this
leak. Long term, I think what we want is something like
__attribute__((sensitive)), which can only be applied to variables with
automatic storage duration, and which means "erase all copies of this
variable's value, wherever they wound up, at the end of its lifetime."
Note that such variables must not be put in call-preserved registers in
non-leaf functions, because then they might get spilled to the stack by
a callee, which has no way of knowing that it's just leaked a secret.
And I suppose we might also want to worry about signal frames. Nobody
said this was gonna be easy ;-)

zw


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 20:31                   ` Zack Weinberg
@ 2024-04-19 21:11                     ` Mikulas Patocka
  2024-04-19 23:27                       ` Florian Weimer
  0 siblings, 1 reply; 16+ messages in thread
From: Mikulas Patocka @ 2024-04-19 21:11 UTC (permalink / raw)
  To: Zack Weinberg
  Cc: Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development,
	Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel



On Fri, 19 Apr 2024, Zack Weinberg wrote:

> On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
> > On Fri, 19 Apr 2024, Zack Weinberg wrote:
> >> ... the copy
> >> of round_keys in the vector registers *won't* get erased -- the exact
> >> problem being discussed in this thread.
> >
> > On the SYSV ABI, all the vector registers are volatile, so you can erase 
> > them in explicit_bzero.
> >
> > On Windows 64-bit ABI, it is more problematic, because some of the vector 
> > registers must be preserved.
> 
> Oh, huh. Yes, that would work.

I've just realized that this wouldn't work - if the function 
explicit_bzero is lazily resolved, the dynamic linker would spill the 
vector registers to the stack prior to calling explicit_bzero.

> Call-preserved registers are not a 
> problem, because any function that puts secret data in a call-preserved 
> register in the first place, must erase it again (by restoring the old 
> value) before returning. Therefore, if we made explicit_bzero wipe *all* 
> the call-clobbered registers before returning, my example function would 
> be safe.
> 
> There's still a place secrets could leak to and not get erased, though: 
> register spill slots on the stack. Only the compiler could plug this 
> leak. Long term, I think what we want is something like 
> __attribute__((sensitive)), which can only be applied to variables with 
> automatic storage duration, and which means "erase all copies of this 
> variable's value, wherever they wound up, at the end of its lifetime." 
> Note that such variables must not be put in call-preserved registers in 
> non-leaf functions, because then they might get spilled to the stack by 
> a callee, which has no way of knowing that it's just leaked a secret. 
> And I suppose we might also want to worry about signal frames. Nobody 
> said this was gonna be easy ;-)
> 
> zw

Yes.

Another problem is varargs - if there is at least one floating point 
argument, the compiler will store 8 XMM registers on the stack regardless 
of whether they are used or not.

In the past it didn't do that (it made an indirect jump based on the 
value in the %AL register to save only the used registers), but someone 
probably found out that indirect jumps are expensive and that storing 
all 8 registers is faster.

Mikulas



* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 21:11                     ` Mikulas Patocka
@ 2024-04-19 23:27                       ` Florian Weimer
  2024-04-20  3:29                         ` Zack Weinberg
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2024-04-19 23:27 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Zack Weinberg, Alexander Monakov, Paul Eggert, H . J . Lu,
	GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz,
	dm-devel

* Mikulas Patocka:

> On Fri, 19 Apr 2024, Zack Weinberg wrote:
>
>> On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
>> > On Fri, 19 Apr 2024, Zack Weinberg wrote:
>> >> ... the copy
>> >> of round_keys in the vector registers *won't* get erased -- the exact
>> >> problem being discussed in this thread.
>> >
>> > On the SYSV ABI, all the vector registers are volatile, so you can erase 
>> > them in explicit_bzero.
>> >
>> > On Windows 64-bit ABI, it is more problematic, because some of the vector 
>> > registers must be preserved.
>> 
>> Oh, huh. Yes, that would work.
>
> I've just realized that this wouldn't work - if the function 
> explicit_bzero is lazily resolved, the dynamic linker would spill the 
> vector registers to the stack prior to calling explicit_bzero.

No, the dynamic linker makes a tail call to explicit_bzero.  There's no
register restore on the return path, all that happens before the tail
call.

Thanks,
Florian



* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 23:27                       ` Florian Weimer
@ 2024-04-20  3:29                         ` Zack Weinberg
  0 siblings, 0 replies; 16+ messages in thread
From: Zack Weinberg @ 2024-04-20  3:29 UTC (permalink / raw)
  To: Florian Weimer, Mikulas Patocka
  Cc: Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development,
	Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel



On Fri, Apr 19, 2024, at 7:27 PM, Florian Weimer wrote:
> * Mikulas Patocka:
>
>> On Fri, 19 Apr 2024, Zack Weinberg wrote:
>>
>>> On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
>>> > On Fri, 19 Apr 2024, Zack Weinberg wrote:
>>> >> ... the copy
>>> >> of round_keys in the vector registers *won't* get erased -- the exact
>>> >> problem being discussed in this thread.
>>> >
>>> > On the SYSV ABI, all the vector registers are volatile, so you can erase 
>>> > them in explicit_bzero.
>>> >
>>> > On Windows 64-bit ABI, it is more problematic, because some of the vector 
>>> > registers must be preserved.
>>> 
>>> Oh, huh. Yes, that would work.
>>
>> I've just realized that this wouldn't work - if the function 
>> explicit_bzero is lazily resolved, the dynamic linker would spill the 
>> vector registers to the stack prior to calling explicit_bzero.
>
> No, the dynamic linker makes a tail call to explicit_bzero.  There's no
> register restore on the return path, all that happens before the tail
> call.

Doesn't help — if the vector registers get spilled at all, we lose.

zw


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 14:07 memcpy is leaking secret data through ZMM vector registers Mikulas Patocka
  2024-04-19 14:19 ` H.J. Lu
@ 2024-04-21  1:20 ` Andreas K. Huettel
  2024-04-22  9:33 ` Szabolcs Nagy
  2 siblings, 0 replies; 16+ messages in thread
From: Andreas K. Huettel @ 2024-04-21  1:20 UTC (permalink / raw)
  To: libc-alpha
  Cc: Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel, Mikulas Patocka


> We have a test "dmsecuretest.sh" that loads cryptographic keys into the 
> kernel, dumps a core, the core file is analyzed and if it contains the 
> key, the test fails.
> 
> This test fails on AMD Zen 4 - the reason for the failure is that the 
> "memcpy" function uses ZMM registers for data copying. When memcpy exits, 
> the encryption key is present in the ZMM registers and the key remains 
> there even after both source and destination buffers of memcpy were 
> cleared.
> 
> When we perform dynamic symbol lookup, the ZMM registers are spilled on 
> the stack and they remain there forever - this is the reason why the core 
> file contains the encryption key and the test fails.

So let me ask a few obvious questions, as someone without (yet) deep
insight into the problem.

* Shouldn't this be treated as a security issue?

* Are the expectations on where the (key) data may end up defined 
  somewhere?

* If yes, which component is at fault?

* If no, who needs to be involved in making the specs?


-- 
Andreas K. Hüttel
dilfridge@gentoo.org
Gentoo Linux developer 
(council, comrel, toolchain, base-system, perl, libreoffice)
https://wiki.gentoo.org/wiki/User:Dilfridge


* Re: memcpy is leaking secret data through ZMM vector registers
  2024-04-19 14:07 memcpy is leaking secret data through ZMM vector registers Mikulas Patocka
  2024-04-19 14:19 ` H.J. Lu
  2024-04-21  1:20 ` Andreas K. Huettel
@ 2024-04-22  9:33 ` Szabolcs Nagy
  2 siblings, 0 replies; 16+ messages in thread
From: Szabolcs Nagy @ 2024-04-22  9:33 UTC (permalink / raw)
  To: Mikulas Patocka, libc-alpha
  Cc: Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

The 04/19/2024 16:07, Mikulas Patocka wrote:
> Hi
> 
> As a part of LVM2, we are developing the libdevmapper library. The library 
> may be used to load cryptographic keys to the kernel, so it avoids leaking 
> the data to kernel memory and to the swap partition.
> 
> After the use of cryptographic data, the libdevmapper library clears them 
> with memset and frees them afterwards. It executes __asm__ volatile("" ::: 
> "memory") to thwart some compiler optimization regarding writing to 
> to-be-freed memory.

instead of

 crypto_foo(key);
 dont_optimize_me_memset(key, 0, sizeof key);

can you do

 crypto_foo(key);
 memcpy(key, dummykey, sizeof key);
 crypto_foo(key);
 memcpy(key, dummykey, sizeof key);

if there is no sensitive data based conditional in
the code (which there should not be in crypto logic
nor in memcpy) the exact same registers and
instructions should be exercised twice. i.e. you
clobber all state in a portable way, no arch
specific magic hack is needed nor new compiler flag.
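
a minimal sketch of that sequence, assuming a fixed key length. crypto_foo here is a toy stand-in and not real cryptography, and KEY_LEN, sink and use_key_then_scrub are hypothetical names. the result of the dummy run is stored to a volatile so the compiler cannot drop the second call:

```c
#include <stddef.h>
#include <string.h>

#define KEY_LEN 32

/* Keeps the dummy run observable so it is not optimized away. */
static volatile unsigned char sink;

/* Toy stand-in for the real operation (hypothetical, not cryptography). */
static unsigned char crypto_foo(const unsigned char key[KEY_LEN])
{
    unsigned char acc = 0;
    for (size_t i = 0; i < KEY_LEN; i++)
        acc ^= key[i];              /* no branch depends on key contents */
    return acc;
}

/* Use the real key, then repeat the same steps with a dummy key so the
   same registers and stack slots are overwritten with non-secret data. */
static unsigned char use_key_then_scrub(unsigned char key[KEY_LEN],
                                        const unsigned char dummykey[KEY_LEN])
{
    unsigned char result = crypto_foo(key);
    memcpy(key, dummykey, KEY_LEN); /* replace the secret in memory */
    sink = crypto_foo(key);         /* same code path, dummy data */
    memcpy(key, dummykey, KEY_LEN); /* memcpy's registers now hold dummy data */
    return result;
}
```

whether the second pass really touches the same registers is up to the compiler and the libc, so this is best effort only.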

technically this can still leak information in all
sorts of ways (c is a high level language, internally
the implementation can do whatever with the secrets),
but this is pretty much how far you can go within c
(pretending otherwise with random weird compiler or
libc extensions is a mistake imho).


> 
> We have a test "dmsecuretest.sh" that loads cryptographic keys into the 
> kernel, dumps a core, the core file is analyzed and if it contains the 
> key, the test fails.
> 
> This test fails on AMD Zen 4 - the reason for the failure is that the 
> "memcpy" function uses ZMM registers for data copying. When memcpy exits, 
> the encryption key is present in the ZMM registers and the key remains 
> there even after both source and destination buffers of memcpy were 
> cleared.
> 
> When we perform dynamic symbol lookup, the ZMM registers are spilled on 
> the stack and they remain there forever - this is the reason why the core 
> file contains the encryption key and the test fails.
> 
> I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or 
> -Wl,-z,now) - it mostly works, but not entirely - the key may still be 
> present on the stack even if we use LD_BIND_NOW=1.
> 
> When I hack the file glibc/sysdeps/x86_64/multiarch/ifunc-memmove.h so 
> that it always selects the ERMS variant of memcpy, the problem goes away.
> 
> Could it be possible to add some switch to glibc, that could be turned on 
> by security-sensitive programs and that would prevent glibc from using the 
> vector registers? Or, do you suggest another solution?
> 
> Mikulas
> 

