From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mailrelay.tugraz.at (mailrelay.tugraz.at [129.27.2.202]) by sourceware.org (Postfix) with ESMTPS id 0F69F3858D20 for ; Mon, 29 Jan 2024 15:50:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0F69F3858D20 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=tugraz.at Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tugraz.at ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 0F69F3858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=129.27.2.202 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1706543445; cv=none; b=LFjdohGuhGn7Bv4cSqiq6tUyAXfBSy0QSM4g5wpZxfbx329CA9aKiFigJ1Ksg1IRitgcmIuTNKhQAnJWALpjFM5ut5VSrTjNubGKOY80IjtNjT8r6jgcIDcZqXASs0686eGNW5XJfyur597h46KUMNbpfeQox51b+j0T38tnXnA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1706543445; c=relaxed/simple; bh=98GuyXv7sJmloGsDF7So70lHUicgbgqTUcvEOECbUOg=; h=DKIM-Signature:Message-ID:Subject:From:To:Date:MIME-Version; b=Ei4MECg+bNQUOvHGvsRRKb+RrhRymndZW/0G2TABTxBxbXp6HQKcsortYl0p4+exFtY74Gfc9Mj3yiXnfxWzWEX+/zhl7IfSjlG7TEJuuca9+KMwhk4qT7MidhOKzYgMlwBH1wuWN+bi31vXOK//eHEy7XzwEYiDyXbvOa0jnkc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from fbmtpc21.tugraz.at (fbmtpc21.tugraz.at [129.27.144.40]) by mailrelay.tugraz.at (Postfix) with ESMTPSA id 4TNt701n7Kz1LM0J; Mon, 29 Jan 2024 16:50:36 +0100 (CET) DKIM-Filter: OpenDKIM Filter v2.11.0 mailrelay.tugraz.at 4TNt701n7Kz1LM0J DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tugraz.at; s=mailrelay; t=1706543436; bh=KNWZrahTDzV8mRBZiyD1Ce07FRpx5StNklbvsZA51gc=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=kMqxTsy3gHRRmcxPsfKKe8AGHG2vcK36Xh4X3Dxf69hEVHdv5fH0pgvhuPM8aPBLR I5/xvo4/qGHfAFonfMRrhDsgShTJHuq8sebdBwe4oG19el+iNUzPJ71PcclQDUYUcV NuKCh/wkhkujKPAwz8SgvO9HayF6KcXWCdGRUW+4= Message-ID: <30a9f93477b9f9745f4b12c5c25fa1e5aa9a8daa.camel@tugraz.at> Subject: Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) From: Martin Uecker To: Qing Zhao , "josmyers@redhat.com" , Richard Biener Cc: Kees Cook , "jakub@redhat.com" , "siddhesh@gotplt.org" , "isanbard@gmail.com" , "gcc-patches@gcc.gnu.org" Date: Mon, 29 Jan 2024 16:50:36 +0100 In-Reply-To: References: <20240124002955.3387096-1-qing.zhao@oracle.com> <202401241624.4DC3D829@keescook> <08c3913ee0e75beb1fdfbce5487418fbd50de9e3.camel@tugraz.at> <848FCA7D-1CBC-435E-8B4A-8136B13CF318@oracle.com> <98f8a862e65dfc1ddd8227266ed41724f9422adb.camel@tugraz.at> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.46.4-2 MIME-Version: 1.0 X-TUG-Backscatter-control: G/VXY7/6zeyuAY/PU2/0qw X-Spam-Scanner: SpamAssassin 3.003001 X-Spam-Score-relay: -0.4 X-Scanned-By: MIMEDefang 2.74 on 129.27.10.116 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,KAM_SHORT,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Am Montag, dem 29.01.2024 um 15:09 +0000 schrieb Qing Zhao: > Thank you! >=20 > Joseph and Richard, could you also comment on this? >=20 > > On Jan 28, 2024, at 5:09 AM, Martin Uecker wrote: > >=20 > > Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao: > > >=20 > > > > On Jan 26, 2024, at 3:04 AM, Martin Uecker wrote= : > > > >=20 > > > >=20 > > > > I haven't looked at the patch, but it sounds you give the result > > > > the wrong type. Then patching up all use cases instead of the > > > > type seems wrong. > > >=20 > > > Yes, this is for resolving a very early gimplification issue as I rep= orted last Nov: > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html > > >=20 > > > Since no-one responded at that time, I fixed the issue by replacing t= he ARRAY_REF > > > With a pointer indirection: > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html > > >=20 > > > The reason for such change is: return a flexible array member TYPE i= s not allowed > > > by C language (our gimplification follows this rule), so, we have to = return a pointer TYPE instead.=20 > > >=20 > > > ******The new internal function > > >=20 > > > .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SI= ZE, ACCESS_MODE, INDEX) > > >=20 > > > INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) > > >=20 > > > which returns the "REF_TO_OBJ" same as the 1st argument; > > >=20 > > > Both the return type and the type of the first argument of this funct= ion have been converted from=20 > > > the incomplete array type to the corresponding pointer type. > > >=20 > > > As a result, the original ARRAY_REF was converted to an INDIRECT_REF,= the original INDEX of the ARRAY_REF was lost > > > when converting from ARRAY_REF to INDIRECT_REF, in order to keep the = INDEX for bound sanitizer instrumentation, I added > > > The 6th argument =E2=80=9CINDEX=E2=80=9D. > > >=20 > > > What=E2=80=99s your comment and suggestion on this solution? > >=20 > > I am not entirely sure but changing types in the FE seems > > problematic because this breaks language semantics. And > > then adding special code everywhere to treat it specially > > in the FE does not seem a good way forward. > >=20 > > If I understand correctly, returning an incomplete array=20 > > type is not allowed and then fails during gimplification. >=20 > Yes, this is the problem in gimplification.=20 >=20 > > So I would suggest to make it return a pointer to the=20 > > incomplete array (and not the element type) >=20 >=20 > for the following: >=20 > struct annotated { > unsigned int size; > int array[] __attribute__((counted_by (size))); > }; >=20 > struct annotated * p =3D =E2=80=A6. > p->array[9] =3D 0; >=20 > The IL for the above array reference p->array[9] is: >=20 > 1. If the return type is the original incomplete array type,=20 >=20 > .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] =3D 0; >=20 > (this triggered the gimplification failure since the return type cannot b= e a complete type). >=20 > 2. When the return type is changed to a pointer to the element type of th= e incomplete array, (the current patch) > Then the original array reference naturally becomes an indirect reference= through the pointer >=20 > *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) =3D= 0; >=20 > Since the original array reference becomes an indirect reference through = the pointer to the element array, the INDEX info=20 > is mixed into the OFFSET of the indirect reference and lost, so, I added = the 6th argument to the routine .ACCESS_WITH_SIZE > to record the INDEX.=20 >=20 > 3. With your suggestion, the return type is changed to a pointer to the i= ncomplete array,=20 > I just tried this to change the result type : >=20 >=20 > --- a/gcc/c/c-typeck.cc > +++ b/gcc/c/c-typeck.cc > @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t l= oc, tree ref, > tree counted_by_type) > { > gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref))); > - tree result_type =3D build_pointer_type (TREE_TYPE (TREE_TYPE (ref))); > + tree result_type =3D build_pointer_type (TREE_TYPE (ref)); >=20 > Then, I got the following FE errors: >=20 > test.c:10:11: error: invalid use of flexible array member > 10 | p->array[9] =3D 0; >=20 > The reason for the error is: when the original array_ref becomes an indir= ect_ref through the pointer to the incomplete array, > During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT (= type) is invalid since the type is an incomplete array.=20 > As a result, the OFFSET cannot computed for the indirect_ref. >=20 > Looks like even more issues with this approach. Yes, but only because the following is missing: >=20 >=20 > > but then wrap > > it with an indirection when inserting this code in the FE > > so that the full replacement has the correct type again > > (of the incomplete array). >=20 > I don=E2=80=99t quite understand the above, could you please explain this= in more details? (If possible, could you please use the above small exampl= e?) > thanks. You would need to add an indirection: (*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)))[9] =3D 0; if .ACCESS_WITH_SIZE has type T (*)[], i.e. pointer to incomplete array of type T, then (*(.ACCESS_WITH_SIZE (...))) has type T[], i.e. incomplete array of type. =20 And you shouldn't even consider array derefencing part at all, but replace the p->array when the component ref is constructed.=20 Martin >=20 > >=20 > >=20 > > Alternatively, one could allow this during gimplification > > or add some conversion. >=20 > Allowing this in gimplification might trigger some other issues. I guess= that adding conversion=20 > in the end of the FE or in the beginning of gimplification might be bette= r. >=20 > i.e, in FE, still keep the original incomplete array type as the return = type for the routine .ACCESS_WITH_SIZE, i.e >=20 > .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] =3D 0; >=20 > But add a conversion from the above array_ref to an indirect_ref in the e= nd of FE or in the beginning of gimplification: >=20 > *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1) + 36) =3D 0; >=20 > With this approach, during FE, the original ARRAY TYPE and the INDEX can= be kept, the array bound sanitizer instrumentation > Will be much easier than my current approach.=20 >=20 > Is this approach reasonable? >=20 > If so, where is better to add this conversion, in the end of FE or in the= beginning of gimplification? >=20 > Thanks. >=20 > Qing >=20 >=20 > >=20 > > Martin > >=20 > >=20 > > >=20 > > > Thanks. > > >=20 > > > Qing > > >=20 > > >=20 > > > >=20 > > > > Martin > > > >=20 > > > >=20 > > > > Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao: > > > > > Thanks a lot for the testing. > > > > >=20 > > > > > Yes, I can repeat the issue with the following small example: > > > > >=20 > > > > > #include > > > > > #include > > > > > #include > > > > >=20 > > > > > #define MAX(a, b) ((a) > (b) ? (a) : (b)) > > > > >=20 > > > > > struct untracked { > > > > > int size; > > > > > int array[] __attribute__((counted_by (size))); > > > > > } *a; > > > > > struct untracked * alloc_buf (int index) > > > > > { > > > > > struct untracked *p; > > > > > p =3D (struct untracked *) malloc (MAX (sizeof (struct untracked)= , > > > > > (offsetof (struct untracked= , array[0]) > > > > > + (index) * sizeof (int)))= ); > > > > > p->size =3D index; > > > > > return p; > > > > > } > > > > >=20 > > > > > int main() > > > > > { > > > > > a =3D alloc_buf(10); > > > > > printf ("same_type is %d\n", > > > > > (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->arr= ay)[0])))); > > > > > return 0; > > > > > } > > > > >=20 > > > > >=20 > > > > > /home/opc/Install/latest-d/bin/gcc -O2 btcp.c > > > > > same_type is 1 > > > > >=20 > > > > > Looks like that the =E2=80=9Ctypeof=E2=80=9D operator need to be = handled specially in C FE > > > > > for the new internal function .ACCESS_WITH_SIZE.=20 > > > > >=20 > > > > > (I have specially handle the operator =E2=80=9Coffsetof=E2=80=9D = in C FE already). > > > > >=20 > > > > > Will fix this issue. > > > > >=20 > > > > > Thanks. > > > > >=20 > > > > > Qing > > > > >=20 > > > > > > On Jan 24, 2024, at 7:51 PM, Kees Cook = wrote: > > > > > >=20 > > > > > > On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote: > > > > > > > This is the 4th version of the patch. > > > > > >=20 > > > > > > Thanks very much for this! > > > > > >=20 > > > > > > I tripped over an unexpected behavioral change that the Linux k= ernel > > > > > > depends on: > > > > > >=20 > > > > > > __builtin_types_compatible_p() no longer treats an array marked= with > > > > > > counted_by as different from that array's decayed pointer. Spec= ifically, > > > > > > the kernel uses these macros: > > > > > >=20 > > > > > >=20 > > > > > > /* > > > > > > * Force a compilation error if condition is true, but also prod= uce a > > > > > > * result (of value 0 and type int), so the expression can be us= ed > > > > > > * e.g. in a structure initializer (or where-ever else comma exp= ressions > > > > > > * aren't permitted). > > > > > > */ > > > > > > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)= ); }))) > > > > > >=20 > > > > > > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a= ), typeof(b)) > > > > > >=20 > > > > > > /* &a[0] degrades to a pointer: a different type from an array = */ > > > > > > #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a),= &(a)[0])) > > > > > >=20 > > > > > >=20 > > > > > > This gets used in various places to make sure we're dealing wit= h an > > > > > > array for a macro: > > > > > >=20 > > > > > > #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __mus= t_be_array(arr)) > > > > > >=20 > > > > > >=20 > > > > > > So this builds: > > > > > >=20 > > > > > > struct untracked { > > > > > > int size; > > > > > > int array[]; > > > > > > } *a; > > > > > >=20 > > > > > > __must_be_array(a->array) > > > > > > =3D> 0 (as expected) > > > > > > __builtin_types_compatible_p(typeof(a->array), typeof(&(a->arra= y)[0])) > > > > > > =3D> 0 (as expected, array vs decayed array pointer) > > > > > >=20 > > > > > >=20 > > > > > > But if counted_by is added, we get a build failure: > > > > > >=20 > > > > > > struct tracked { > > > > > > int size; > > > > > > int array[] __counted_by(size); > > > > > > } *b; > > > > > >=20 > > > > > > __must_be_array(b->array) > > > > > > =3D> build failure (not expected) > > > > > > __builtin_types_compatible_p(typeof(b->array), typeof(&(b->arra= y)[0])) > > > > > > =3D> 1 (not expected, both pointers?) > > > > > >=20 > > > > > >=20 > > > > > >=20 > > > > > >=20 > > > > > > --=20 > > > > > > Kees Cook >=20 >=20 --=20 Univ.-Prof. Dr. rer. nat. Martin Uecker Graz University of Technology Institute of Biomedical Imaging