public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Qing Zhao <qing.zhao@oracle.com>
To: Kees Cook <keescook@chromium.org>,
	Siddhesh Poyarekar <siddhesh@gotplt.org>
Cc: "joseph@codesourcery.com" <joseph@codesourcery.com>,
	"richard.guenther@gmail.com" <richard.guenther@gmail.com>,
	"jakub@redhat.com" <jakub@redhat.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	"uecker@tugraz.at" <uecker@tugraz.at>,
	"isanbard@gmail.com" <isanbard@gmail.com>
Subject: Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
Date: Mon, 31 Jul 2023 20:14:42 +0000	[thread overview]
Message-ID: <72AF1253-564C-46C1-9FBC-5A53871CB701@oracle.com> (raw)
In-Reply-To: <A41FC9A6-D73D-4FCB-8B69-3FA6908C6FA8@oracle.com>

Hi,

After some detailed study and consideration on how to use the new attribute “counted_by”
 in __builtin_dynamic_object_size, I came up with the following example with detailed explanation 
on the expected behavior from GCC on using this new attribute. 

Please take a look on this example and  the explanation embedded, and let me know if you have further
Comments or suggestions.

Thanks a lot.

Qing

===========================================
#include <stdint.h>
#include <malloc.h>
#include <string.h>
#include <stdio.h>

struct annotated {
        size_t foo;
        int array[] __attribute__((counted_by (foo)));
};

#define expect(p, _v) do { \
    size_t v = _v; \
    if (p == v) \
        __builtin_printf ("ok:  %s == %zd\n", #p, p); \
    else \
        {  \
          __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
        } \
} while (0);

#define noinline __attribute__((__noinline__))
#define SIZE_BUMP 2 

/* In general, Due to type casting, the type for the pointee of a pointer
   does not say anything about the object it points to,
   So, __builtin_object_size can not directly use the type of the pointee
   to decide the size of the object the pointer points to.

   there are only two reliable ways:
   A. observed allocations  (call to the allocation functions in the routine)
   B. observed accesses     (read or write access to the location of the 
                             pointer points to)

   that provide information about the type/existence of an object at
   the corresponding address.

   for A, we use the "alloc_size" attribute for the corresponding allocation
   functions to determine the object size;

   For B, we use the SIZE info of the TYPE attached to the corresponding access.
   (We treat counted_by attribute as a complement to the SIZE info of the TYPE
    for FMA) 

   The only other way in C which ensures that a pointer actually points
   to an object of the correct type is 'static':

   void foo(struct P *p[static 1]);   

   See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
   for more details.  */

/* in the following function, malloc allocated more space than the value
   of counted_by attribute.  Then what's the correct behavior we expect
   the __builtin_dynamic_object_size should have for each of the cases?  */ 

static struct annotated * noinline alloc_buf (int index)
{
  struct annotated *p;
  p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int));
  p->foo = index;

  /*when checking the observed access p->array, we have info on both
    observered allocation and observed access, 
    A. from observed allocation: (index + SIZE_BUMP) * sizeof (int)
    B. from observed access: p->foo * sizeof (int)

    in the above, p->foo = index.
   */
   
  /* for MAXIMUM sub-object size: chose the smaller of A and B.
   * Please see https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625891.html
   * for details on why.  */
  expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int));
  expect(__builtin_dynamic_object_size(p->array, 0), sizeof (*p) + (p->foo) * sizeof(int));

  /* for MINIMUM sub-object size: chose the smaller of A and B too.  */
  expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int));
  expect(__builtin_dynamic_object_size(p->array, 2), sizeof (*p) + p->foo * sizeof(int));

  /*when checking the pointer p, we only have info on the observed allocation.
    So, the object size info can only been obtained from the call to malloc.
    for both MAXIMUM and MINIMUM: A = (index + SIZE_BUMP) * sizeof (int)  */ 
  expect(__builtin_dynamic_object_size(p, 1), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int));
  expect(__builtin_dynamic_object_size(p, 0), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int));
  expect(__builtin_dynamic_object_size(p, 3), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int));
  expect(__builtin_dynamic_object_size(p, 2), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int));
  return p;
}


int main ()
{
  struct annotated *p; 
  p = alloc_buf (10);
  /*when checking the observed access p->array, we only have info on the
    observed access, i.e, the TYPE_SIZE info from the access. We don't have
    info on the whole object.  */

  /*For MAXIMUM size, We know the SIZE info of the TYPE from the access to the
    sub-object p->array.
    but don't know the whole object the pointer p points to.  */ 
  expect(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(int));
  expect(__builtin_dynamic_object_size(p->array, 0), -1);

  /*for MINIMUM size, We know the TYPE_SIZE from the access to the sub-oject
    p->array.
    but don't know the whole object the pointer p points to.  */
  expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int));
  expect(__builtin_dynamic_object_size(p->array, 2), 0);

  /*when checking the pointer p, we have no observed allocation nor observed access.
    therefore, we cannot determine the size info here.  */
  expect(__builtin_dynamic_object_size(p, 1), -1);
  expect(__builtin_dynamic_object_size(p, 0), -1);
  expect(__builtin_dynamic_object_size(p, 3), 0);
  expect(__builtin_dynamic_object_size(p, 2), 0);

  return 0;
}





> On Jul 19, 2023, at 2:52 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> 
>>> 
>>> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement:
>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
>> 
>> Right, I'm saying it would be nice if __alloc_size was checked as well,
>> in the sense that if it is available, it knows without question what the
>> size of the allocation is. If __alloc_size and __counted_by conflict,
>> the smaller of the two should be the truth.
> 
> I don’t think that  “if __alloc_size and __counted_by conflict, the smaller of the two should be the truth” will work correctly.
> 
> When __alloc_size is larger than the value of __counted_by, it’s okay. 
> But when the value of __counted_by is larger than the __alloc_size, the array bound check or object size sanitizer might not work correctly.
> 
> 
> Please see the following example:
> 
> struct grows {
> 	int alloc_count;
> 	int valid_count;
> 	int  item[] __counted_by(valid_count);
> } *p;
> 
> void __attribute__((__noinline__)) something (int n)
> {
> 	p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
> 	p->alloc_count = 100;
> 	p->valid_count = 102;
> 	p->item[n] = 10;		// both _alloc_size and the value of __counted_by are available in this routine, the smaller one is , 100;
> 
> }
> 
> void __attribute__((__noinline__))  something_2 (int n)
> {
>   p->item[n] = 10;   // only the value of  __counted_by is available in this routine, which is 102;  
> }
> 
> Int main
> {
>   Something (101);
>   Something_2 (101);
> }
> 
> 
> For the above example, the out-of-bound array access in routine “something” should be able to be caught by the compiler.
> However, the out-of-bound array access in the routine “something_2” will NOT be able to be caught by the compiler.
> 
> Since in the routine “something_2” , the compiler don’t know the alloc_size, the only available info is the counted_by value
> through the attribute.  But this value is bigger than the REAL size of the array. Therefore the compiler cannot detect the 
> out-of-bound array access in the routine something_2
> 
> 
> Based on the above observation, I think we should add the following requirement: 
> 
> The value of “counted_by” should be equal or SMALLER than the real alloc_size for the flexible array member. 
> 
> This is the same requirement as the LLVM RFC. 
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> 
> "the compiler inserts additional checks to ensure the new buf has at least as many elements as the new count indicates.”
> LLVM has additional requirement in addition to this, we might need to consider those requirement too. 
> 
> Qing
> 
>> But, as I said, if there is some need to explicitly ignore __alloc_size
>> when __counted_by is present, I can live with it; we just need to
>> document it.
>> 
>> If the RFC and you agree that the __counted_by variable can only ever be
>> (re)assigned after the flex array has been (re)allocated, then I guess
>> we'll see how it goes. :) I think most places in the kernel using
>> __counted_by will be fine, but I suspect we may have cases where we need
>> to update it like in the loop I described above. If that's true, we can
>> revisit the requirement then. :)
>> 
>> -Kees
>> 
>> -- 
>> Kees Cook
> 


  reply	other threads:[~2023-07-31 20:14 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-25 16:14 Qing Zhao
2023-05-25 16:14 ` [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896) Qing Zhao
2023-05-25 21:02   ` Joseph Myers
2023-05-26 13:32     ` Qing Zhao
2023-05-26 18:15       ` Joseph Myers
2023-05-26 19:09         ` Qing Zhao
2023-06-07 19:59         ` Qing Zhao
2023-06-07 20:53           ` Joseph Myers
2023-06-07 21:32             ` Qing Zhao
2023-06-07 22:05               ` Joseph Myers
2023-06-08 13:06                 ` Qing Zhao
2023-06-15 15:09                 ` Qing Zhao
2023-06-15 16:55                   ` Joseph Myers
2023-06-15 19:54                     ` Qing Zhao
2023-06-15 22:48                       ` Joseph Myers
2023-06-16 15:01                         ` Qing Zhao
2023-06-16  7:21                     ` Martin Uecker
2023-06-16 15:14                       ` Qing Zhao
2023-06-16 16:21                       ` Joseph Myers
2023-06-16 17:07                         ` Martin Uecker
2023-06-16 20:20                           ` Qing Zhao
2023-06-16 21:35                             ` Joseph Myers
2023-06-20 19:40                               ` Qing Zhao
2023-06-27 15:44                                 ` Qing Zhao
2023-05-25 16:14 ` [V1][PATCH 2/3] Use the element_count atribute info in builtin object size [PR108896] Qing Zhao
2023-05-27 10:20   ` Martin Uecker
2023-05-30 16:08     ` Qing Zhao
2023-05-25 16:14 ` [V1][PATCH 3/3] Use the element_count attribute information in bound sanitizer[PR108896] Qing Zhao
2023-05-26 16:12 ` [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Kees Cook
2023-05-30 21:44   ` Qing Zhao
2023-05-26 20:40 ` Kees Cook
2023-05-30 15:43   ` Qing Zhao
2023-07-06 18:56   ` Qing Zhao
2023-07-06 21:10     ` Martin Uecker
2023-07-07 15:47       ` Qing Zhao
2023-07-07 20:21         ` Qing Zhao
2023-07-13 20:31     ` Kees Cook
2023-07-17 21:17       ` Qing Zhao
2023-07-17 23:40         ` Kees Cook
2023-07-18 15:37           ` Qing Zhao
2023-07-18 16:03             ` Martin Uecker
2023-07-18 16:25               ` Qing Zhao
2023-07-18 16:50                 ` Martin Uecker
2023-07-18 18:53             ` Qing Zhao
2023-07-19  8:41           ` Martin Uecker
2023-07-19 16:16           ` Qing Zhao
2023-07-19 18:52           ` Qing Zhao
2023-07-31 20:14             ` Qing Zhao [this message]
2023-08-01 22:45               ` Kees Cook
2023-08-02  6:25                 ` Martin Uecker
2023-08-02 15:02                   ` Qing Zhao
2023-08-02 15:09                 ` Qing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=72AF1253-564C-46C1-9FBC-5A53871CB701@oracle.com \
    --to=qing.zhao@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=isanbard@gmail.com \
    --cc=jakub@redhat.com \
    --cc=joseph@codesourcery.com \
    --cc=keescook@chromium.org \
    --cc=richard.guenther@gmail.com \
    --cc=siddhesh@gotplt.org \
    --cc=uecker@tugraz.at \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).