public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Uecker <uecker@tugraz.at>
To: Qing Zhao <qing.zhao@oracle.com>,
	"josmyers@redhat.com" <josmyers@redhat.com>,
	Richard Biener <richard.guenther@gmail.com>
Cc: Kees Cook <keescook@chromium.org>,
	"jakub@redhat.com" <jakub@redhat.com>,
	 "siddhesh@gotplt.org" <siddhesh@gotplt.org>,
	"isanbard@gmail.com" <isanbard@gmail.com>,
	 "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Date: Mon, 29 Jan 2024 16:50:36 +0100	[thread overview]
Message-ID: <30a9f93477b9f9745f4b12c5c25fa1e5aa9a8daa.camel@tugraz.at> (raw)
In-Reply-To: <E7395E71-3C6D-4118-8B1D-1A423A19CAC5@oracle.com>

Am Montag, dem 29.01.2024 um 15:09 +0000 schrieb Qing Zhao:
> Thank you!
> 
> Joseph and Richard,  could you also comment on this?
> 
> > On Jan 28, 2024, at 5:09 AM, Martin Uecker <uecker@tugraz.at> wrote:
> > 
> > Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao:
> > > 
> > > > On Jan 26, 2024, at 3:04 AM, Martin Uecker <uecker@tugraz.at> wrote:
> > > > 
> > > > 
> > > > I haven't looked at the patch, but it sounds you give the result
> > > > the wrong type. Then patching up all use cases instead of the
> > > > type seems wrong.
> > > 
> > > Yes, this is for resolving a very early gimplification issue as I reported last Nov:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> > > 
> > > Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
> > > With a pointer indirection:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> > > 
> > > The reason for such change is:  return a flexible array member TYPE is not allowed
> > > by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 
> > > 
> > > ******The new internal function
> > > 
> > > .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)
> > > 
> > > INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> > > 
> > > which returns the "REF_TO_OBJ" same as the 1st argument;
> > > 
> > > Both the return type and the type of the first argument of this function have been converted from 
> > > the incomplete array type to the corresponding pointer type.
> > > 
> > > As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost
> > > when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added
> > > The 6th argument “INDEX”.
> > > 
> > > What’s your comment and suggestion on this solution?
> > 
> > I am not entirely sure but changing types in the FE seems
> > problematic because this breaks language semantics. And
> > then adding special code everywhere to treat it specially
> > in the FE does not seem a good way forward.
> > 
> > If I understand correctly, returning an incomplete array 
> > type is not allowed and then fails during gimplification.
> 
> Yes, this is the problem in gimplification. 
> 
> > So I would suggest to make it return a pointer to the 
> > incomplete array (and not the element type)
> 
> 
> for the following:
> 
> struct annotated {
>   unsigned int size;
>   int array[] __attribute__((counted_by (size)));
> };
> 
>   struct annotated * p = ….
>   p->array[9] = 0;
> 
> The IL for the above array reference p->array[9] is:
> 
> 1. If the return type is the original incomplete array type, 
> 
> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
> 
> (this triggered the gimplification failure since the return type cannot be a complete type).
> 
> 2. When the return type is changed to a pointer to the element type of the incomplete array, (the current patch)
> Then the original array reference naturally becomes an indirect reference through the pointer
> 
> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;
> 
> Since the original array reference becomes an indirect reference through the pointer to the element array, the INDEX info 
> is mixed into the OFFSET of the indirect reference and lost, so, I added the 6th argument to the routine .ACCESS_WITH_SIZE
> to record the INDEX. 
> 
> 3. With your suggestion, the return type is changed to a pointer to the incomplete array, 
> I just tried this to change the result type :
> 
> 
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
>                                        tree counted_by_type)
>  {
>    gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
> -  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
> +  tree result_type = build_pointer_type (TREE_TYPE (ref));
> 
> Then, I got the following FE errors:
> 
> test.c:10:11: error: invalid use of flexible array member
>    10 |   p->array[9] = 0;
> 
> The reason for the error is: when the original array_ref becomes an indirect_ref through the pointer to the incomplete array,
> During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT (type) is invalid since the type is an incomplete array. 
> As a result, the OFFSET cannot computed for the indirect_ref.
> 
> Looks like even more issues with this approach.

Yes, but only because the following is missing:

> 
> 
> > but then wrap
> > it with an indirection when inserting this code in the FE
> > so that the full replacement has the correct type again
> > (of the incomplete array).
> 
> I don’t quite understand the above, could you please explain this in more details? (If possible, could you please use the above small example?)
> thanks.

You would need to add an indirection:

(*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)))[9] = 0;

if .ACCESS_WITH_SIZE has type    T (*)[], i.e. pointer to incomplete
array of type T, then (*(.ACCESS_WITH_SIZE (...))) has type T[], i.e.
incomplete array of type.   

And you shouldn't even consider array derefencing part at all,
but replace the p->array when the component ref is constructed. 


Martin



> 
> > 
> > 
> > Alternatively, one could allow this during gimplification
> > or add some conversion.
> 
> Allowing this in gimplification might trigger some other issues.  I guess that adding conversion 
> in the end of the FE or in the beginning of gimplification might be better.
> 
> i.e,  in FE, still keep the original incomplete array type as the return type for the routine .ACCESS_WITH_SIZE, i.e
> 
> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
> 
> But add a conversion from the above array_ref to an indirect_ref in the end of FE or in the beginning of gimplification:
> 
> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1) + 36) = 0;
> 
> With this approach,  during FE, the original ARRAY TYPE and the INDEX can be kept, the array bound sanitizer instrumentation
> Will be much easier than my current approach. 
> 
> Is this approach reasonable?
> 
> If so, where is better to add this conversion, in the end of FE or in the beginning of gimplification?
> 
> Thanks.
> 
> Qing
> 
> 
> > 
> > Martin
> > 
> > 
> > > 
> > > Thanks.
> > > 
> > > Qing
> > > 
> > > 
> > > > 
> > > > Martin
> > > > 
> > > > 
> > > > Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
> > > > > Thanks a lot for the testing.
> > > > > 
> > > > > Yes, I can repeat the issue with the following small example:
> > > > > 
> > > > > #include <stdlib.h>
> > > > > #include <stddef.h>
> > > > > #include <stdio.h>
> > > > > 
> > > > > #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
> > > > > 
> > > > > struct untracked {
> > > > >      int size;
> > > > >      int array[] __attribute__((counted_by (size)));
> > > > > } *a;
> > > > > struct untracked * alloc_buf (int index)
> > > > > {
> > > > > struct untracked *p;
> > > > > p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
> > > > >                                       (offsetof (struct untracked, array[0])
> > > > >                                        + (index) * sizeof (int))));
> > > > > p->size = index;
> > > > > return p;
> > > > > }
> > > > > 
> > > > > int main()
> > > > > {
> > > > > a = alloc_buf(10);
> > > > >    printf ("same_type is %d\n",
> > > > > (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
> > > > > return 0;
> > > > > }
> > > > > 
> > > > > 
> > > > > /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
> > > > > same_type is 1
> > > > > 
> > > > > Looks like that the “typeof” operator need to be handled specially in C FE
> > > > > for the new internal function .ACCESS_WITH_SIZE. 
> > > > > 
> > > > > (I have specially handle the operator “offsetof” in C FE already).
> > > > > 
> > > > > Will fix this issue.
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > > Qing
> > > > > 
> > > > > > On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
> > > > > > 
> > > > > > On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
> > > > > > > This is the 4th version of the patch.
> > > > > > 
> > > > > > Thanks very much for this!
> > > > > > 
> > > > > > I tripped over an unexpected behavioral change that the Linux kernel
> > > > > > depends on:
> > > > > > 
> > > > > > __builtin_types_compatible_p() no longer treats an array marked with
> > > > > > counted_by as different from that array's decayed pointer. Specifically,
> > > > > > the kernel uses these macros:
> > > > > > 
> > > > > > 
> > > > > > /*
> > > > > > * Force a compilation error if condition is true, but also produce a
> > > > > > * result (of value 0 and type int), so the expression can be used
> > > > > > * e.g. in a structure initializer (or where-ever else comma expressions
> > > > > > * aren't permitted).
> > > > > > */
> > > > > > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
> > > > > > 
> > > > > > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
> > > > > > 
> > > > > > /* &a[0] degrades to a pointer: a different type from an array */
> > > > > > #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
> > > > > > 
> > > > > > 
> > > > > > This gets used in various places to make sure we're dealing with an
> > > > > > array for a macro:
> > > > > > 
> > > > > > #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
> > > > > > 
> > > > > > 
> > > > > > So this builds:
> > > > > > 
> > > > > > struct untracked {
> > > > > >      int size;
> > > > > >      int array[];
> > > > > > } *a;
> > > > > > 
> > > > > > __must_be_array(a->array)
> > > > > > => 0 (as expected)
> > > > > > __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
> > > > > > => 0 (as expected, array vs decayed array pointer)
> > > > > > 
> > > > > > 
> > > > > > But if counted_by is added, we get a build failure:
> > > > > > 
> > > > > > struct tracked {
> > > > > >      int size;
> > > > > >      int array[] __counted_by(size);
> > > > > > } *b;
> > > > > > 
> > > > > > __must_be_array(b->array)
> > > > > > => build failure (not expected)
> > > > > > __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
> > > > > > => 1 (not expected, both pointers?)
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > -- 
> > > > > > Kees Cook
> 
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging



  reply	other threads:[~2024-01-29 15:50 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-24  0:29 Qing Zhao
2024-01-24  0:29 ` [PATCH v4 1/4] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
2024-01-24  0:29 ` [PATCH v4 2/4] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE Qing Zhao
2024-01-24  0:29 ` [PATCH v4 3/4] Use the .ACCESS_WITH_SIZE in builtin object size Qing Zhao
2024-01-24  0:29 ` [PATCH v4 4/4] Use the .ACCESS_WITH_SIZE in bound sanitizer Qing Zhao
2024-01-25  0:51 ` [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
2024-01-25 20:11   ` Qing Zhao
2024-01-26  8:04     ` Martin Uecker
2024-01-26 14:33       ` Qing Zhao
2024-01-28 10:09         ` Martin Uecker
2024-01-29 15:09           ` Qing Zhao
2024-01-29 15:50             ` Martin Uecker [this message]
2024-01-29 16:19               ` Qing Zhao
2024-01-29 20:35             ` Joseph Myers
2024-01-29 22:20               ` Qing Zhao
2024-01-29 16:00     ` Qing Zhao
2024-01-29 17:25       ` Kees Cook
2024-01-29 19:32         ` Qing Zhao
2024-01-29 20:19           ` Kees Cook
2024-01-29 22:45             ` Qing Zhao
2024-01-30  5:41               ` Kees Cook
2024-01-30 15:43                 ` Qing Zhao
2024-01-30 16:04 ` Qing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=30a9f93477b9f9745f4b12c5c25fa1e5aa9a8daa.camel@tugraz.at \
    --to=uecker@tugraz.at \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=isanbard@gmail.com \
    --cc=jakub@redhat.com \
    --cc=josmyers@redhat.com \
    --cc=keescook@chromium.org \
    --cc=qing.zhao@oracle.com \
    --cc=richard.guenther@gmail.com \
    --cc=siddhesh@gotplt.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).