public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Qing Zhao <qing.zhao@oracle.com>
To: Martin Uecker <uecker@tugraz.at>,
	"josmyers@redhat.com" <josmyers@redhat.com>,
	Richard Biener <richard.guenther@gmail.com>
Cc: Kees Cook <keescook@chromium.org>,
	"jakub@redhat.com" <jakub@redhat.com>,
	"siddhesh@gotplt.org" <siddhesh@gotplt.org>,
	"isanbard@gmail.com" <isanbard@gmail.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Date: Mon, 29 Jan 2024 15:09:44 +0000	[thread overview]
Message-ID: <E7395E71-3C6D-4118-8B1D-1A423A19CAC5@oracle.com> (raw)
In-Reply-To: <98f8a862e65dfc1ddd8227266ed41724f9422adb.camel@tugraz.at>

Thank you!

Joseph and Richard,  could you also comment on this?

> On Jan 28, 2024, at 5:09 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao:
>> 
>>> On Jan 26, 2024, at 3:04 AM, Martin Uecker <uecker@tugraz.at> wrote:
>>> 
>>> 
>>> I haven't looked at the patch, but it sounds you give the result
>>> the wrong type. Then patching up all use cases instead of the
>>> type seems wrong.
>> 
>> Yes, this is for resolving a very early gimplification issue as I reported last Nov:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
>> 
>> Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
>> With a pointer indirection:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
>> 
>> The reason for such change is:  return a flexible array member TYPE is not allowed
>> by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 
>> 
>> ******The new internal function
>> 
>> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)
>> 
>> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
>> 
>> which returns the "REF_TO_OBJ" same as the 1st argument;
>> 
>> Both the return type and the type of the first argument of this function have been converted from 
>> the incomplete array type to the corresponding pointer type.
>> 
>> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost
>> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added
>> The 6th argument “INDEX”.
>> 
>> What’s your comment and suggestion on this solution?
> 
> I am not entirely sure but changing types in the FE seems
> problematic because this breaks language semantics. And
> then adding special code everywhere to treat it specially
> in the FE does not seem a good way forward.
> 
> If I understand correctly, returning an incomplete array 
> type is not allowed and then fails during gimplification.

Yes, this is the problem in gimplification. 

> So I would suggest to make it return a pointer to the 
> incomplete array (and not the element type)


for the following:

struct annotated {
  unsigned int size;
  int array[] __attribute__((counted_by (size)));
};

  struct annotated * p = ….
  p->array[9] = 0;

The IL for the above array reference p->array[9] is:

1. If the return type is the original incomplete array type, 

.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;

(this triggered the gimplification failure since the return type cannot be a complete type).

2. When the return type is changed to a pointer to the element type of the incomplete array, (the current patch)
Then the original array reference naturally becomes an indirect reference through the pointer

*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;

Since the original array reference becomes an indirect reference through the pointer to the element array, the INDEX info 
is mixed into the OFFSET of the indirect reference and lost, so, I added the 6th argument to the routine .ACCESS_WITH_SIZE
to record the INDEX. 

3. With your suggestion, the return type is changed to a pointer to the incomplete array, 
I just tried this to change the result type :


--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
                                       tree counted_by_type)
 {
   gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
-  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
+  tree result_type = build_pointer_type (TREE_TYPE (ref));

Then, I got the following FE errors:

test.c:10:11: error: invalid use of flexible array member
   10 |   p->array[9] = 0;

The reason for the error is: when the original array_ref becomes an indirect_ref through the pointer to the incomplete array,
During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT (type) is invalid since the type is an incomplete array. 
As a result, the OFFSET cannot computed for the indirect_ref.

Looks like even more issues with this approach.


> but then wrap
> it with an indirection when inserting this code in the FE
> so that the full replacement has the correct type again
> (of the incomplete array).

I don’t quite understand the above, could you please explain this in more details? (If possible, could you please use the above small example?)
thanks.

> 
> 
> Alternatively, one could allow this during gimplification
> or add some conversion.

Allowing this in gimplification might trigger some other issues.  I guess that adding conversion 
in the end of the FE or in the beginning of gimplification might be better.

i.e,  in FE, still keep the original incomplete array type as the return type for the routine .ACCESS_WITH_SIZE, i.e

.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;

But add a conversion from the above array_ref to an indirect_ref in the end of FE or in the beginning of gimplification:

*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1) + 36) = 0;

With this approach,  during FE, the original ARRAY TYPE and the INDEX can be kept, the array bound sanitizer instrumentation
Will be much easier than my current approach. 

Is this approach reasonable?

If so, where is better to add this conversion, in the end of FE or in the beginning of gimplification?

Thanks.

Qing


> 
> Martin
> 
> 
>> 
>> Thanks.
>> 
>> Qing
>> 
>> 
>>> 
>>> Martin
>>> 
>>> 
>>> Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
>>>> Thanks a lot for the testing.
>>>> 
>>>> Yes, I can repeat the issue with the following small example:
>>>> 
>>>> #include <stdlib.h>
>>>> #include <stddef.h>
>>>> #include <stdio.h>
>>>> 
>>>> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
>>>> 
>>>> struct untracked {
>>>>      int size;
>>>>      int array[] __attribute__((counted_by (size)));
>>>> } *a;
>>>> struct untracked * alloc_buf (int index)
>>>> {
>>>> struct untracked *p;
>>>> p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>>>>                                       (offsetof (struct untracked, array[0])
>>>>                                        + (index) * sizeof (int))));
>>>> p->size = index;
>>>> return p;
>>>> }
>>>> 
>>>> int main()
>>>> {
>>>> a = alloc_buf(10);
>>>>    printf ("same_type is %d\n",
>>>> (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
>>>> return 0;
>>>> }
>>>> 
>>>> 
>>>> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
>>>> same_type is 1
>>>> 
>>>> Looks like that the “typeof” operator need to be handled specially in C FE
>>>> for the new internal function .ACCESS_WITH_SIZE. 
>>>> 
>>>> (I have specially handle the operator “offsetof” in C FE already).
>>>> 
>>>> Will fix this issue.
>>>> 
>>>> Thanks.
>>>> 
>>>> Qing
>>>> 
>>>>> On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
>>>>> 
>>>>> On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
>>>>>> This is the 4th version of the patch.
>>>>> 
>>>>> Thanks very much for this!
>>>>> 
>>>>> I tripped over an unexpected behavioral change that the Linux kernel
>>>>> depends on:
>>>>> 
>>>>> __builtin_types_compatible_p() no longer treats an array marked with
>>>>> counted_by as different from that array's decayed pointer. Specifically,
>>>>> the kernel uses these macros:
>>>>> 
>>>>> 
>>>>> /*
>>>>> * Force a compilation error if condition is true, but also produce a
>>>>> * result (of value 0 and type int), so the expression can be used
>>>>> * e.g. in a structure initializer (or where-ever else comma expressions
>>>>> * aren't permitted).
>>>>> */
>>>>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>>>>> 
>>>>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
>>>>> 
>>>>> /* &a[0] degrades to a pointer: a different type from an array */
>>>>> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
>>>>> 
>>>>> 
>>>>> This gets used in various places to make sure we're dealing with an
>>>>> array for a macro:
>>>>> 
>>>>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
>>>>> 
>>>>> 
>>>>> So this builds:
>>>>> 
>>>>> struct untracked {
>>>>>      int size;
>>>>>      int array[];
>>>>> } *a;
>>>>> 
>>>>> __must_be_array(a->array)
>>>>> => 0 (as expected)
>>>>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
>>>>> => 0 (as expected, array vs decayed array pointer)
>>>>> 
>>>>> 
>>>>> But if counted_by is added, we get a build failure:
>>>>> 
>>>>> struct tracked {
>>>>>      int size;
>>>>>      int array[] __counted_by(size);
>>>>> } *b;
>>>>> 
>>>>> __must_be_array(b->array)
>>>>> => build failure (not expected)
>>>>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
>>>>> => 1 (not expected, both pointers?)
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Kees Cook



  reply	other threads:[~2024-01-29 15:09 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-24  0:29 Qing Zhao
2024-01-24  0:29 ` [PATCH v4 1/4] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
2024-01-24  0:29 ` [PATCH v4 2/4] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE Qing Zhao
2024-01-24  0:29 ` [PATCH v4 3/4] Use the .ACCESS_WITH_SIZE in builtin object size Qing Zhao
2024-01-24  0:29 ` [PATCH v4 4/4] Use the .ACCESS_WITH_SIZE in bound sanitizer Qing Zhao
2024-01-25  0:51 ` [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
2024-01-25 20:11   ` Qing Zhao
2024-01-26  8:04     ` Martin Uecker
2024-01-26 14:33       ` Qing Zhao
2024-01-28 10:09         ` Martin Uecker
2024-01-29 15:09           ` Qing Zhao [this message]
2024-01-29 15:50             ` Martin Uecker
2024-01-29 16:19               ` Qing Zhao
2024-01-29 20:35             ` Joseph Myers
2024-01-29 22:20               ` Qing Zhao
2024-01-29 16:00     ` Qing Zhao
2024-01-29 17:25       ` Kees Cook
2024-01-29 19:32         ` Qing Zhao
2024-01-29 20:19           ` Kees Cook
2024-01-29 22:45             ` Qing Zhao
2024-01-30  5:41               ` Kees Cook
2024-01-30 15:43                 ` Qing Zhao
2024-01-30 16:04 ` Qing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E7395E71-3C6D-4118-8B1D-1A423A19CAC5@oracle.com \
    --to=qing.zhao@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=isanbard@gmail.com \
    --cc=jakub@redhat.com \
    --cc=josmyers@redhat.com \
    --cc=keescook@chromium.org \
    --cc=richard.guenther@gmail.com \
    --cc=siddhesh@gotplt.org \
    --cc=uecker@tugraz.at \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).