public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Qing Zhao <qing.zhao@oracle.com>
To: David Malcolm <dmalcolm@redhat.com>
Cc: Richard Biener <rguenther@suse.de>,
	Joseph Myers <josmyers@redhat.com>,
	Siddhesh Poyarekar <siddhesh@gotplt.org>,
	"uecker@tugraz.at" <uecker@tugraz.at>,
	kees Cook <keescook@chromium.org>,
	"isanbard@gmail.com" <isanbard@gmail.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: "counted_by" and -fanalyzer
Date: Wed, 5 Jun 2024 19:54:16 +0000	[thread overview]
Message-ID: <9474C9DC-D02A-401A-A81D-102F0B8259E5@oracle.com> (raw)
In-Reply-To: <8a42e01a6bb4c86704c473f709c9ac0ffb8c571d.camel@redhat.com>



> On Jun 5, 2024, at 09:49, David Malcolm <dmalcolm@redhat.com> wrote:
> 
> On Tue, 2024-06-04 at 22:09 +0000, Qing Zhao wrote:
>> 
>> 
>>> On Jun 4, 2024, at 17:55, David Malcolm <dmalcolm@redhat.com>
>>> wrote:
>>> 
>>> On Fri, 2024-05-31 at 13:11 +0000, Qing Zhao wrote:
>>>> 
>>>> 
> 
> [...]
> 
>>>> 
>>>> 
>>>> Thanks a lot for the review.
>>>> Will commit the patch set soon.
>>> 
>>> [...snip...]
>>> 
>>> Congratulations on getting this merged.
>>> 
>>> FWIW I've started investigating adding support for the new
>>> attribute to
>>> -fanalyzer (and am tracked this as PR analyzer/111567
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111567 ).
>> 
>> Thank you for starting looking at this.
>>> 
>>> The docs for the attribute speak of the implied relationship
>>> between
>>> the count field and size of the flex array, and say that: "It's the
>>> user's responsibility to make sure the above requirements to be
>>> kept
>>> all the time.  Otherwise the compiler *reports warnings*, at the
>>> same
>>> time, the results of the array bound sanitizer and the
>>> '__builtin_dynamic_object_size' is undefined." (my emphasis).
>>> 
>>> What are these warnings that are reported?  I looked through 
>>> r15-944-gf824acd0e80754 through r15-948-g4c5bea7def1361 and I
>>> didn't
>>> see any new warnings or test coverage for warnings (beyond misuing
>>> the
>>> attribute).  Sorry if I'm missing something obvious here.
>> 
>> These warnings will be in the remaining work (I listed the remaining
>> work in all versions except the last one):
>> 
>>>>>> ******Remaining works: 
>>>>>> 
>>>>>> 6  Improve __bdos to use the counted_by info in whole-object
>>>>>> size for the structure with FAM.
>>>>>> 7  Emit warnings when the user breaks the requirments for the
>>>>>> new counted_by attribute
>>>>>> compilation time: -Wcounted-by
>>>>>> run time: -fsanitizer=counted-by
>>>>>>    * The initialization to the size field should be done
>>>>>> before the first reference to the FAM field.
>>>>>>    * the array has at least # of elements specified by the
>>>>>> size field all the time during the program.
> 
> Aha - thanks.  Sorry for missing this, I confess I haven't been paying
> close attention to this thread.
> 
>> 
>> With the current patches that have been committed, the warnings are
>> not emitted. 
>> I believe that more analysis and more information are needed for
>> these warnings to be effective, it might not
>> be a trivial patch.  More discussion is needed for emitting such
>> warnings.
>> 
>>> 
>>> Does anyone have examples of cases that -fanalyzer ought to warn
>>> for?
>> 
>> At this moment, I don’t have concrete testing cases for this yet, but
>> I can come up with several small examples and share with you in a
>> later email.
> 
> FWIW I did some brainstorming and put together the following .c file,
> am posting it inline here for the sake of discussion; does this look
> like the kind of thing to test for (in terms of how users are expected
> to use the attribute, and the kinds of mistake they'd want warnings
> about) ?

From my understanding, there are two parts of work to support “counted-by” in -fanalyzer:

1. Use this new attribute to improve out-of-bound, buffer overflow detection in -fanalyzer (maybe -Wanalyzer-out-of-bounds can be improved with this new attribute?)
2. Report user errors that breaks the following 2 requirements for the new counted-by attribute:
    -fsanitizer=counted-by
     * The initialization to the size field should be done before the first reference to the FAM field.
     * the array has at least # of elements specified by the size field all the time during the program.    

So, the testing cases might include the above 1 and 2. 
From my understanding of the below testings, mostly belong to 2. 
Some more comments inlined below:

> 
> /* TODO:
>   Some ideas for dimensions of test matrix:
>   (a) concrete value vs symbolic value for "count"
>   (b) concrete value vs symbolic value for size of array
>   (c) dynamic vs static allocation of buffer (and malloc vs alloca)
>   (d) relative size of array and of count
>       - same size (not an issue)
>       - array is too small compared to "count"
>         - off by one
> - off by more than one
> - size is zero (but not negative)
>         - negative size (which the docs say is OK)
>       - array is too large compared to "count" (not an issue)
>   (e) type of flex array:
>       - char
>       - non-char
>       - type requiring padding
>   (f) type/size/signedness of count field; what about overflow
>       in (count * sizeof (type of array element)) ?
>   ... etc: ideas?
> 
>    Other ideas for test coverage:
>    - realloc
>      - growing object
>      - shrinking object
>      - symbolic sizes where could be growth or shrinkage
>      - failing realloc
>    - ...etc: ideas?  */
> 
> #include <stddef.h>
> #include <stdlib.h>
> #include <stdint.h>
> 
> /* Example from the docs.  */
> 
> struct P {
>  size_t count;
>  char other;
>  char array[] __attribute__ ((counted_by (count)));
> } *p;
> 
> struct P *
> test_malloc_with_correct_symbolic (size_t n)
> {
>  struct P *p = malloc (sizeof (struct P) + n);

The size of the malloc might not be computed very accurately here, you might want:

malloc (MAX (sizeof (struct P), (offsetof (struct P, array[0]) + n))

>  if (!p)
>    return NULL;
>  p->count = n; // don't warn here
>  return p;  
> }
> 
> struct P *
> test_malloc_with_correct_count_concrete (void)
> {
>  struct P *p = malloc (sizeof (struct P) + 42);
>  if (!p)
>    return NULL;
>  p->count = 42; // don't warn here
>  return p;  
> }
> 
> struct P *
> test_malloc_with_array_smaller_than_count_concrete (void)
> {
>  struct P *p = malloc (sizeof (struct P) + 42);
>  if (!p)
>    return NULL;
>  p->count = 80; // TODO: warn here
>  return p;
> }
> 
> struct P *
> test_malloc_with_array_larger_than_count_concrete (void)
> {
>  struct P *p = malloc (sizeof (struct P) + 80);
>  if (!p)
>    return NULL;
>  p->count = 42; // don't warn here
>  return p;  
> }

Okay (except the malloc size). You can take a look at: 
gcc/testsuite/gcc.dg/flex-array-counted-by-4.c

alloc_buf_more and alloc_buf_less

> struct P *
> test_malloc_with_array_access_before_count_init_concrete_1 (void)
> {
>  struct P *p = malloc (sizeof (struct P) + 42);
>  if (!p)
>    return NULL;
>  /* Forgetting to set count altogether.  */
>  __builtin_memset (p->array, 0, 42); // TODO: should warn here
>  return p;  
> }
> 

> struct P *
> test_malloc_with_array_access_before_count_init_concrete_2 (void)
> {
>  struct P *p = malloc (sizeof (struct P) + 42);
>  if (!p)
>    return NULL;
>  /* Erroneously touching array before setting count.  */
>  __builtin_memset (p->array, 0, 42); // TODO: should warn here
>  p->count = 42;
>  return p;  
> }
> 
> struct P *
> test_malloc_with_array_access_before_count_init_symbolic_1 (size_t n)
> {
>  struct P *p = malloc (sizeof (struct P) + n);
>  if (!p)
>    return NULL;
>  /* Forgetting to set count altogether.  */
>  __builtin_memset (p->array, 0, n); // TODO: should warn here
>  return p;  
> }
> 
> struct P *
> test_malloc_with_array_access_before_count_init_symbolic_2 (size_t n)
> {
>  struct P *p = malloc (sizeof (struct P) + n);
>  if (!p)
>    return NULL;
>  /* Erroneously touching array before setting count.  */
>  __builtin_memset (p->array, 0, n); // TODO: should warn here
>  p->count = n;
>  return p;  
> }

Yes, the above are good for: The initialization to the size field should be done before the first reference to the FAM field.

/* Example where sizeof array element != 1.  */
> 
> 
> struct Q
> {
>  size_t count;
>  int32_t array[] __attribute__ ((counted_by (count)));
> };
> 
> struct Q *
> test_malloc_of_non_char_array_valid_symbolic (size_t n)
> {
>  size_t alloc_sz = sizeof (struct Q) + (sizeof (int32_t) * n);
>  struct Q *q = malloc (alloc_sz);
>  if (!q)
>    return NULL;
>  // Don't warn for this:
>  q->count = n;
>  __builtin_memset (q->array, 0,  sizeof (int32_t) * n);
>  return q;
> }
> 
> struct Q *
> test_malloc_of_non_char_array_bad_size_symbolic (size_t n)
> {
>  /* Allocation size is too small: forgetting to multiply
>     count by sizeof (array element).  */
>  size_t alloc_sz = sizeof (struct Q) + n;
>  struct Q *q = malloc (alloc_sz);
>  if (!q)
>    return NULL;
> 
>  /* TODO: should we warn here?
>     Allocated size of flex array is too small relative
>     to implicit size of accesses.  */
>  q->count = n;


If the real # of elements of the array "q->array”  is smaller than that is specified by “q->count”, we should issue warning here. 
> 
>  /* TODO: should we warn here?
>     This initializes the buffer we allocated,
>     but only the first quarter of the flex array.  */
>  __builtin_memset (q->array, 0,  n);
I think that no warning is needed for -fanalyzer=counted-by here, since “q->count” has been initialized before the reference to “q->array”.  This is correct.

 (The error of the initialization is bigger than the real array has been issued in the above warning already). 
> 
>  /* TODO: should we warn here?
>     This initializes the full flex array as specified by
>     "count", but is out-of-bounds relative to our heap
>     allocation.  */
>  __builtin_memset (q->array, 0,  sizeof (int32_t) * n);

I think that no warning is needed for -fanalyzer=counted-by here. 
But whether there are warnings for -Wanalyzer-out-of-bounds is another question, I think when the user use the “counted-by” attribute incorrectly in their source code, the behavior of out-of-bounds detection is undefined as we mentioned in the documentation:

"It's the user's responsibility to make sure the above requirements to
be kept all the time.  Otherwise the compiler reports warnings,
at the same time, the results of the array bound sanitizer and the
@code{__builtin_dynamic_object_size} is undefined”

Qing


>  return q;
> }
> 
> 
> 
> 
> 
> 
> 
> 


  reply	other threads:[~2024-06-05 19:54 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-30 12:26 [PATCH v10 0/5] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
2024-05-30 12:26 ` [PATCH v10 1/5] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
2024-05-30 12:26 ` [PATCH v10 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE Qing Zhao
2024-05-30 19:43   ` Joseph Myers
2024-05-30 20:03     ` Qing Zhao
2024-05-31 12:58   ` Richard Biener
2024-05-31 13:11     ` Qing Zhao
2024-06-04 21:55       ` "counted_by" and -fanalyzer (was Re: [PATCH v10 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.) David Malcolm
2024-06-04 22:09         ` Qing Zhao
2024-06-05 13:49           ` "counted_by" and -fanalyzer David Malcolm
2024-06-05 19:54             ` Qing Zhao [this message]
2024-05-30 12:26 ` [PATCH v10 3/5] Use the .ACCESS_WITH_SIZE in builtin object size Qing Zhao
2024-05-30 12:26 ` [PATCH v10 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer Qing Zhao
2024-05-30 12:27 ` [PATCH v10 5/5] Add the 6th argument to .ACCESS_WITH_SIZE Qing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9474C9DC-D02A-401A-A81D-102F0B8259E5@oracle.com \
    --to=qing.zhao@oracle.com \
    --cc=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=isanbard@gmail.com \
    --cc=josmyers@redhat.com \
    --cc=keescook@chromium.org \
    --cc=rguenther@suse.de \
    --cc=siddhesh@gotplt.org \
    --cc=uecker@tugraz.at \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).