public inbox for libc-alpha@sourceware.org
* explicit_bzero documentation
@ 2016-11-18 18:11 Zack Weinberg
  2016-11-19  9:34 ` Rical Jasan
  2016-11-19  9:51 ` Michael Kerrisk (man-pages)
  0 siblings, 2 replies; 4+ messages in thread
From: Zack Weinberg @ 2016-11-18 18:11 UTC
  To: GNU C Library
  Cc: Paul Eggert, Michael Kerrisk (man-pages), Joseph Myers, Roland McGrath

I believe that the only remaining concern with the explicit_bzero
patch *itself* (as opposed to the question of whether new public
symbols should be REDIRECTed into the implementation namespace, which
is best addressed separately) is the documentation, and specifically,
the precise wording of the warnings about its limitations.  So, rather
than resend the entire patch, I'm just sending a revision of the
"Erasing Sensitive Data" manual section.  This hopefully strikes a
good balance between making sure that all the important caveats are
documented and not scaring programmers off of using the function at
all.  I'm not super happy with "is designed to be used in this
situation", but I'm stuck for more fluent wording and I have other
stuff I need to do today :)

cc:ed are everyone who's commented on previous iterations of the
documentation and also Roland, who I believe wrote the string.h
section of the manual originally.

Paul: This revision doesn't specifically mention the _address_ of
sensitive data as an exposure, but it does talk about increased risk
of exposure in general if a variable only has its address taken to
call explicit_bzero; I think this is the right level of detail.

zw

----

@node Erasing Sensitive Data
@section Erasing Sensitive Data

Sensitive data, such as cryptographic keys, should be erased from
memory after use, to reduce the risk that a bug will expose it to the
outside world.  However, often the compiler can deduce that an erasure
operation is ``unnecessary'' because no @emph{correct} program could
access the variable or heap object containing the sensitive data after
it's deallocated.  Since erasure is a precaution against bugs, it
should be done even though it is ``unnecessary'' in this sense.  The
function @code{explicit_bzero} is designed to be used in this
situation.

@smallexample
@group
#include <string.h>

extern void encrypt (const char *key, const char *in,
                     char *out, size_t n);
extern void genkey (const char *phrase, char *key);

void encrypt_with_phrase (const char *phrase, const char *in,
                          char *out, size_t n)
@{
  char key[16];
  genkey (phrase, key);
  encrypt (key, in, out, n);
  explicit_bzero (key, 16);
@}
@end group
@end smallexample

@noindent
The call to @code{explicit_bzero} in @code{encrypt_with_phrase} erases
sensitive data that is about to be deallocated.  If @code{memset},
@code{bzero}, or a hand-written loop had been used, the compiler
might treat the erasure as ``unnecessary'' and remove it, but it will
not do this with @code{explicit_bzero}.
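
The difference described above can be sketched with two otherwise
identical functions.  This is an illustration only: @code{fill_key} and
both @code{checksum_then_*} names are hypothetical, and whether a given
compiler really deletes the @code{memset} call depends on the compiler,
its version, and the optimization level.

```c
#define _DEFAULT_SOURCE   /* ask glibc to declare explicit_bzero */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills KEY with stand-in "secret" bytes 1..N.  */
static void
fill_key (unsigned char *key, size_t n)
{
  for (size_t i = 0; i < n; i++)
    key[i] = (unsigned char) (i + 1);
}

/* KEY is never read after the memset, so the call is a dead store
   that an optimizing compiler is permitted to delete.  */
unsigned int
checksum_then_memset (void)
{
  unsigned char key[16];
  unsigned int sum = 0;

  fill_key (key, sizeof key);
  for (size_t i = 0; i < sizeof key; i++)
    sum += key[i];
  memset (key, 0, sizeof key);        /* may be removed */
  return sum;
}

/* Same logic, but the erasure is requested with explicit_bzero,
   which the compiler is not supposed to remove.  */
unsigned int
checksum_then_explicit_bzero (void)
{
  unsigned char key[16];
  unsigned int sum = 0;

  fill_key (key, sizeof key);
  for (size_t i = 0; i < sizeof key; i++)
    sum += key[i];
  explicit_bzero (key, sizeof key);   /* kept */
  return sum;
}
```

Compiling this with optimization enabled (for example @code{gcc -O2 -S})
and comparing the assembly for the two functions is one way to see what
a particular compiler does.  For the @code{memset} variant the outcome
is not guaranteed either way.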

@strong{Warning:} Using @code{explicit_bzero} does not guarantee that
sensitive data is @emph{completely} erased from the computer's memory.
There may be copies in temporary storage areas, such as registers and
``scratch'' stack space; since these copies are invisible to the
source code, there is no way to arrange for them to be erased.

Also, @code{explicit_bzero} only operates on RAM.  If a sensitive data
object never needs to have its address taken other than to call
@code{explicit_bzero}, it might be stored entirely in CPU registers
@emph{until} the call to @code{explicit_bzero}.  At that point, it
will be copied into RAM, the copy will be erased, and the original
will remain intact.  Data in RAM is more likely to be exposed by a bug
than data in registers, so using @code{explicit_bzero} can create a
brief window where sensitive data is at greater risk of exposure than
it would have been if the program didn't try to erase it at all.
@Theglibc{}'s implementation of @code{explicit_bzero} contains a hack
that can prevent the data from being copied to RAM in this situation,
but it is not guaranteed to work and it doesn't do anything about the
data in registers.

In both cases, declaring sensitive variables as @code{volatile} will
make the problem @emph{worse}; a @code{volatile} variable will be
stored in memory for its entire lifetime, and the compiler will make
@emph{more} copies of it than it would otherwise have.  Attempting to
erase a normal variable ``by hand'' through a
@code{volatile}-qualified pointer doesn't work at all---because the
variable itself is not @code{volatile}, the compiler may ignore the
qualification on the pointer and remove the erasure anyway.
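
For reference, the hand-written pattern that this paragraph warns
against usually looks like the sketch below.  The name
@code{erase_by_hand} is hypothetical.  The function does clear a buffer
whose contents are read afterwards; the concern raised above is the
case where the buffer is dead after the call, in which the compiler may
ignore the qualification on the pointer.

```c
#include <stddef.h>

/* Erasure "by hand" through a volatile-qualified pointer.  BLOCK
   itself is an ordinary (non-volatile) object, which is why the text
   above does not consider this a reliable way to erase it.  */
void
erase_by_hand (void *block, size_t len)
{
  volatile unsigned char *p = block;

  while (len--)
    *p++ = 0;
}
```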

Having said all that, in most situations, using @code{explicit_bzero}
is better than not using it.  At present, the only way to do a more
thorough job is to write the entire sensitive operation in assembly
language.  We anticipate that future compilers will recognize calls to
@code{explicit_bzero} and take appropriate steps to erase all the
copies of the affected data, wherever they may be.

@comment string.h
@comment BSD
@deftypefun void explicit_bzero (void *@var{block}, size_t @var{len})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}

@code{explicit_bzero} writes zero into @var{len} bytes of memory
beginning at @var{block}, just as @code{bzero} would.  The zeroes are
always written, even if the compiler could prove that this is
``unnecessary'' because no correct program could read them back.

@strong{Note:} The @emph{only} optimization that @code{explicit_bzero}
disables is removal of ``unnecessary'' writes to memory.  The compiler
can perform all the other optimizations that it could for a call to
@code{memset}.  For instance, it may replace the function call with
inline memory writes, and it may deduce that @var{block} cannot be a
null pointer.

@strong{Portability Note:} This function first appeared in OpenBSD 5.5
and has not been standardized.  Other systems may provide the same
functionality under a different name, such as @code{explicit_memset},
@code{memset_s}, or @code{SecureZeroMemory}.

@Theglibc{} declares this function in @file{string.h}, but on other
systems it may be in @file{strings.h} instead.
@end deftypefun
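
Given the portability note, a program that must also build where
@code{explicit_bzero} is missing might wrap the call, roughly as
sketched below.  Everything here is an assumption made for
illustration: the wrapper name @code{erase_sensitive} and the macro
@code{USE_EXPLICIT_BZERO} are hypothetical, the version test assumes
the function ships in glibc 2.25 or later, and only glibc is detected
(other systems that provide the function fall through to the
fallbacks).  The fallbacks are best-effort substitutes and do not carry
the same guarantee.

```c
#define _DEFAULT_SOURCE   /* ask glibc to declare explicit_bzero */
#include <stddef.h>
#include <string.h>

/* Assumption: glibc 2.25 or later provides explicit_bzero.  The
   nested #if avoids expanding __GLIBC_PREREQ where it is undefined.  */
#if defined __GLIBC__ && defined __GLIBC_PREREQ
# if __GLIBC_PREREQ (2, 25)
#  define USE_EXPLICIT_BZERO 1
# endif
#endif

/* Hypothetical wrapper: zero LEN bytes at BLOCK, trying to keep the
   compiler from discarding the writes.  */
void
erase_sensitive (void *block, size_t len)
{
#ifdef USE_EXPLICIT_BZERO
  explicit_bzero (block, len);
#elif defined __GNUC__
  /* Best effort for GCC-compatible compilers: an ordinary memset
     followed by an empty asm that claims to read BLOCK and to
     clobber memory, so the stores cannot be treated as dead.  */
  memset (block, 0, len);
  __asm__ __volatile__ ("" : : "r" (block) : "memory");
#else
  /* Last resort: the volatile-pointer loop, which the manual text
     above describes as unreliable.  */
  volatile unsigned char *p = block;
  while (len--)
    *p++ = 0;
#endif
}
```

A caller would use it exactly like @code{explicit_bzero}, for instance
@code{erase_sensitive (key, sizeof key)} just before @code{key} goes
out of scope.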


* Re: explicit_bzero documentation
From: Rical Jasan @ 2016-11-19  9:34 UTC
  To: GNU C Library
  Cc: Zack Weinberg, Paul Eggert, Michael Kerrisk (man-pages),
	Joseph Myers, Roland McGrath

FWIW, this looks good to me, and does not seem scary, but informative.
I think it conveys, "Know what you're doing, please.", along with enough
to help you decide whether you think you do or not.  (Assuming "enough"
is what all the previous discussion has established.  I can only speak
to the tone and impression of the documentation.)

Rical


* Re: explicit_bzero documentation
From: Paul Eggert @ 2016-11-19  9:38 UTC
  To: Rical Jasan, GNU C Library
  Cc: Zack Weinberg, Michael Kerrisk (man-pages), Joseph Myers, Roland McGrath

Rical Jasan wrote:
> FWIW, this looks good to me, and does not seem scary, but informative.

Thanks, it looks good to me too. (Anyone who wants the scary details can just 
read this thread. :-)


* Re: explicit_bzero documentation
From: Michael Kerrisk (man-pages) @ 2016-11-19  9:51 UTC
  To: Zack Weinberg; +Cc: GNU C Library, Paul Eggert, Joseph Myers, Roland McGrath

Hello Zack,

Just a couple of minor wording suggestions below. Otherwise,
this definitely looks better to me.

On 18 November 2016 at 19:11, Zack Weinberg <zackw@panix.com> wrote:
> I believe that the only remaining concern with the explicit_bzero
> patch *itself* (as opposed to the question of whether new public
> symbols should be REDIRECTed into the implementation namespace, which
> is best addressed separately) is the documentation, and specifically,
> the precise wording of the warnings about its limitations.  So, rather
> than resend the entire patch, I'm just sending a revision of the
> "Erasing Sensitive Data" manual section.  This hopefully strikes a
> good balance between making sure that all the important caveats are
> documented and not scaring programmers off of using the function at
> all.  I'm not super happy with "is designed to be used in this
> situation", but I'm stuck for more fluent wording and I have other
> stuff I need to do today :)
>
> cc:ed are everyone who's commented on previous iterations of the
> documentation and also Roland, who I believe wrote the string.h
> section of the manual originally.
>
> Paul: This revision doesn't specifically mention the _address_ of
> sensitive data as an exposure, but it does talk about increased risk
> of exposure in general if a variable only has its address taken to
> call explicit_bzero; I think this is the right level of detail.
>
> zw
>
> ----
>
> @node Erasing Sensitive Data
> @section Erasing Sensitive Data
>
> Sensitive data, such as cryptographic keys, should be erased from
> memory after use, to reduce the risk that a bug will expose it to the
> outside world.  However, often the compiler can deduce that an erasure

Maybe: s/the compiler/compiler optimizations/

> operation is ``unnecessary'' because no @emph{correct} program could
> access the variable or heap object containing the sensitive data after
> it's deallocated.  Since erasure is a precaution against bugs, it
> should be done even though it is ``unnecessary'' in this sense.  The
> function @code{explicit_bzero} is designed to be used in this
> situation.
>
> @smallexample
> @group
> #include <string.h>
>
> extern void encrypt (const char *key, const char *in,
>                      char *out, size_t n);
> extern void genkey (const char *phrase, char *key);
>
> void encrypt_with_phrase (const char *phrase, const char *in,
>                           char *out, size_t n)
> @{
>   char key[16];
>   genkey (phrase, key);
>   encrypt (key, in, out, n);
>   explicit_bzero (key, 16);
> @}
> @end group
> @end smallexample
>
> @noindent
> The call to @code{explicit_bzero} in @code{encrypt_with_phrase} erases
> sensitive data that is about to be deallocated.  If @code{memset},
> @code{bzero}, or a hand-written loop had been used, the compiler
> might treat the erasure as ``unnecessary'' and remove it, but it will
> not do this with @code{explicit_bzero}.
>
> @strong{Warning:} Using @code{explicit_bzero} does not guarantee that
> sensitive data is @emph{completely} erased from the computer's memory.
> There may be copies in temporary storage areas, such as registers and
> ``scratch'' stack space; since these copies are invisible to the
> source code, there is no way to arrange for them to be erased.
>
> Also, @code{explicit_bzero} only operates on RAM.  If a sensitive data
> object never needs to have its address taken other than to call
> @code{explicit_bzero}, it might be stored entirely in CPU registers
> @emph{until} the call to @code{explicit_bzero}.  At that point, it
> will be copied into RAM, the copy will be erased, and the original
> will remain intact.  Data in RAM is more likely to be exposed by a bug
> than data in registers, so using @code{explicit_bzero} can create a
> brief window where sensitive data is at greater risk of exposure than
> it would have been if the program didn't try to erase it at all.
> @Theglibc{}'s implementation of @code{explicit_bzero} contains a hack
> that can prevent the data from being copied to RAM in this situation,
> but it is not guaranteed to work and it doesn't do anything about the
> data in registers.
>
> In both cases, declaring sensitive variables as @code{volatile} will
> make the problem @emph{worse}; a @code{volatile} variable will be
> stored in memory for its entire lifetime, and the compiler will make
> @emph{more} copies of it than it would otherwise have.  Attempting to
> erase a normal variable ``by hand'' through a
> @code{volatile}-qualified pointer doesn't work at all---because the
> variable itself is not @code{volatile}, the compiler may ignore the
> qualification on the pointer and remove the erasure anyway.
>
> Having said all that, in most situations, using @code{explicit_bzero}
> is better than not using it.  At present, the only way to do a more
> thorough job is to write the entire sensitive operation in assembly
> language.  We anticipate that future compilers will recognize calls to
> @code{explicit_bzero} and take appropriate steps to erase all the
> copies of the affected data, wherever they may be.
>
> @comment string.h
> @comment BSD
> @deftypefun void explicit_bzero (void *@var{block}, size_t @var{len})
> @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>
> @code{explicit_bzero} writes zero into @var{len} bytes of memory
> beginning at @var{block}, just as @code{bzero} would.  The zeroes are
> always written, even if the compiler could prove that this is

Maybe: s/compiler could prove/compiler can determine/

> ``unnecessary'' because no correct program could read them back.
>
> @strong{Note:} The @emph{only} optimization that @code{explicit_bzero}
> disables is removal of ``unnecessary'' writes to memory.  The compiler
> can perform all the other optimizations that it could for a call to
> @code{memset}.  For instance, it may replace the function call with
> inline memory writes, and it may deduce that @var{block} cannot be a
> null pointer.
>
> @strong{Portability Note:} This function first appeared in OpenBSD 5.5
> and has not been standardized.  Other systems may provide the same
> functionality under a different name, such as @code{explicit_memset},
> @code{memset_s}, or @code{SecureZeroMemory}.
>
> @Theglibc{} declares this function in @file{string.h}, but on other
> systems it may be in @file{strings.h} instead.
> @end deftypefun

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

