RFC: attributes for marking security boundaries (system calls/ioctls, user vs kernel pointers etc)

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* RFC: attributes for marking security boundaries (system calls/ioctls, user vs kernel pointers etc)
@ 2021-04-29 17:18 David Malcolm
  2021-04-29 18:26 ` Martin Sebor
  2021-04-30 21:51 ` Andi Kleen
  0 siblings, 2 replies; 3+ messages in thread
From: David Malcolm @ 2021-04-29 17:18 UTC (permalink / raw)
  To: msebor, gcc

I've been going through old Linux kernel CVEs, trying to prototype some
possible new warnings for -fanalyzer in GCC 12 (and, alas, finding
places where the analyzer internals need work...)

I think I want a way for the user to be able to mark security
boundaries in their code: for example:
* in the Linux kernel the boundary between untrusted user-space data
and kernel data, or,
* for a user-space daemon, the boundary between data coming from the
network and the data of daemon itself

The analyzer could then make use of this, for example:

(a) marking untrusted incoming data as "tainted" and prioritizing
analysis of paths that make use of it (e.g. a "might overflow a buffer
when N is really large" goes from being a noisy false positive when we
simply have no knowledge of N (or the buffer's size) to being a serious
issue if N is under the control of an attacker

(b) copying uninitialized data back to the untrusted region becomes a
potential disclosure of sensitive information

I think I also want a way to mark system calls and ioctl
implementations, so that I mark all of the parameters as being
potentially hostile.

Specifically, the Linux kernel uses functions like this:
  #define __user
  extern long copy_to_user(void __user *to, const void *from, unsigned long n);
  extern long copy_from_user(void *to, const void __user *from, long n);

in various places, so I want a way to mark the "to" and "from" params
as being a security boundary.

I've been experimenting with implementing (b) for CVE-2011-1078 (in
which a copy_to_user is passed a pointer to an on-stack buffer that
isn't fully initialized, hence a disclosure of information to user-
space).

Martin: I believe you added __attribute__((access)) in GCC 9.

I was thinking of extending it to allow something like:

#define __user
extern long copy_to_user(void __user *to, const void *from, unsigned long n)
  __attribute__((access (untrusted_write, 1, 3),
		 access (read_only, 2, 3)
		 ));

extern long copy_from_user(void *to, const void __user *from, long n)
  __attribute__((access (write_only, 1, 3),
		 access (untrusted_read, 2, 3)
		 ));

so that to "to" and "from" and marked as being writes and reads of up
to size n, but they are flagged as "untrusted" as appropriate, so the
analyzer can pay particular attention as described above.

Does the above idea sound like a sane extension of the access
attribute?

I tried implementing it, but "access" seems to get converted to its own
microformat for expressing these things as strings (created via
append_access_attr, and parsed in e.g. init_attr_rdwr_indices), which
seems to make it much harder than I was expecting.

Any thoughts about how to mark system calls/ioctls?  The simplest would
be an attribute that marks all parameters as being untrusted, and the
return value, somehow.

Thanks
Dave

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: RFC: attributes for marking security boundaries (system calls/ioctls, user vs kernel pointers etc)
  2021-04-29 17:18 RFC: attributes for marking security boundaries (system calls/ioctls, user vs kernel pointers etc) David Malcolm
@ 2021-04-29 18:26 ` Martin Sebor
  2021-04-30 21:51 ` Andi Kleen
  1 sibling, 0 replies; 3+ messages in thread
From: Martin Sebor @ 2021-04-29 18:26 UTC (permalink / raw)
  To: David Malcolm, gcc

On 4/29/21 11:18 AM, David Malcolm wrote:
> I've been going through old Linux kernel CVEs, trying to prototype some
> possible new warnings for -fanalyzer in GCC 12 (and, alas, finding
> places where the analyzer internals need work...)
> 
> I think I want a way for the user to be able to mark security
> boundaries in their code: for example:
> * in the Linux kernel the boundary between untrusted user-space data
> and kernel data, or,
> * for a user-space daemon, the boundary between data coming from the
> network and the data of daemon itself
> 
> The analyzer could then make use of this, for example:
> 
> (a) marking untrusted incoming data as "tainted" and prioritizing
> analysis of paths that make use of it (e.g. a "might overflow a buffer
> when N is really large" goes from being a noisy false positive when we
> simply have no knowledge of N (or the buffer's size) to being a serious
> issue if N is under the control of an attacker
> 
> (b) copying uninitialized data back to the untrusted region becomes a
> potential disclosure of sensitive information
> 
> I think I also want a way to mark system calls and ioctl
> implementations, so that I mark all of the parameters as being
> potentially hostile.
> 
> Specifically, the Linux kernel uses functions like this:
>    #define __user
>    extern long copy_to_user(void __user *to, const void *from, unsigned long n);
>    extern long copy_from_user(void *to, const void __user *from, long n);
> 
> in various places, so I want a way to mark the "to" and "from" params
> as being a security boundary.
> 
> I've been experimenting with implementing (b) for CVE-2011-1078 (in
> which a copy_to_user is passed a pointer to an on-stack buffer that
> isn't fully initialized, hence a disclosure of information to user-
> space).
> 
> Martin: I believe you added __attribute__((access)) in GCC 9.
> 
> I was thinking of extending it to allow something like:
> 
> #define __user
> extern long copy_to_user(void __user *to, const void *from, unsigned long n)
>    __attribute__((access (untrusted_write, 1, 3),
> 		 access (read_only, 2, 3)
> 		 ));
> 
> extern long copy_from_user(void *to, const void __user *from, long n)
>    __attribute__((access (write_only, 1, 3),
> 		 access (untrusted_read, 2, 3)
> 		 ));
> 
> so that to "to" and "from" and marked as being writes and reads of up
> to size n, but they are flagged as "untrusted" as appropriate, so the
> analyzer can pay particular attention as described above.
> 
> Does the above idea sound like a sane extension of the access
> attribute?

The trust aspect seems orthogonal to the access mode, so my initial
thought is if it should be a separate attribute.  That would also
make it applicable to other sources of data such as function return
values, or namespace scope variables.

At the same time, I have been thinking of extending attribute access
to these contexts as well: e.g., to denote that a function like
getenv() returns a pointer to a non-modifiable string, or that
the array pointed to by a variable like environ should not be
directly modified by a program.  So maybe it's not unreasonable
to integrate the two.

> 
> I tried implementing it, but "access" seems to get converted to its own
> microformat for expressing these things as strings (created via
> append_access_attr, and parsed in e.g. init_attr_rdwr_indices), which
> seems to make it much harder than I was expecting.

The translation was done to address a concern about the cost of
parsing of the new attribute with all its variations.  I suspect
Richard (who raised the concern) might have been envisioning
a format similar to attribute fn spec.  That's what I aimed
for in any event.  But attribute access is more general and
cab be dynamically extended and so it's more complex, and
so the result is probably not quite what he had in mind.

The translation from the human readable form to the condensed
internal form is tricky, a source of bugs, and a royal PITA to
work with.  I'd be happy to go back to the original design (but
I'm not terribly motivated to do the work myself to change it).

That said. I wouldn't expect adding a new mode to be too bad.
Just add the handling to handle_access_attribute and the names
and the corresponding letters to attr_access.

> 
> Any thoughts about how to mark system calls/ioctls?  The simplest would
> be an attribute that marks all parameters as being untrusted, and the
> return value, somehow.

I'm not sure I understand the question but if you're wondering
about what to do about data coming in and out of variadic functions
with arbitrary orders of arguments of arbitrary types I don't really
have any thoughts on how to deal with those in general.

Martin

> 
> Thanks
> Dave
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: RFC: attributes for marking security boundaries (system calls/ioctls, user vs kernel pointers etc)
  2021-04-29 17:18 RFC: attributes for marking security boundaries (system calls/ioctls, user vs kernel pointers etc) David Malcolm
  2021-04-29 18:26 ` Martin Sebor
@ 2021-04-30 21:51 ` Andi Kleen
  1 sibling, 0 replies; 3+ messages in thread
From: Andi Kleen @ 2021-04-30 21:51 UTC (permalink / raw)
  To: David Malcolm via Gcc; +Cc: msebor, David Malcolm

David Malcolm via Gcc <gcc@gcc.gnu.org> writes:

> I think I want a way for the user to be able to mark security
> boundaries in their code: for example:
> * in the Linux kernel the boundary between untrusted user-space data
> and kernel data, or,
> * for a user-space daemon, the boundary between data coming from the
> network and the data of daemon itself
>
> The analyzer could then make use of this, for example:
>
> (a) marking untrusted incoming data as "tainted" and prioritizing
> analysis of paths that make use of it (e.g. a "might overflow a buffer
> when N is really large" goes from being a noisy false positive when we
> simply have no knowledge of N (or the buffer's size) to being a serious
> issue if N is under the control of an attacker

It would be great if gcc supported address spaces, like sparse
(__attribute__((address_space(id)))
The Linux kernel already supports this syntax for multiple purposes
(user pointers, io memory) and it's nicely extensible to more usages.
One possibility would be detection of Spectre gadgets.

It would also need sparse like force casts.

Of course sparse already does these checks, but it would be great
if it was integrated into a standard gcc build.

Perhaps it could be used for your analyzer purposes too.

-Andi

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-04-30 21:51 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-29 17:18 RFC: attributes for marking security boundaries (system calls/ioctls, user vs kernel pointers etc) David Malcolm
2021-04-29 18:26 ` Martin Sebor
2021-04-30 21:51 ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).