public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Qing Zhao <qing.zhao@oracle.com>
To: Martin Uecker <uecker@tugraz.at>
Cc: Richard Biener <richard.guenther@gmail.com>,
	Siddhesh Poyarekar <siddhesh@gotplt.org>,
	Joseph Myers <joseph@codesourcery.com>,
	Jakub Jelinek <jakub@redhat.com>,
	gcc Patches <gcc-patches@gcc.gnu.org>,
	kees Cook <keescook@chromium.org>,
	"isanbard@gmail.com" <isanbard@gmail.com>
Subject: Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Date: Thu, 26 Oct 2023 16:41:14 +0000	[thread overview]
Message-ID: <8EF8A68F-CCE2-4746-A95E-F6C0A87B3C29@oracle.com> (raw)
In-Reply-To: <4f9d20fbf2d3891a8488c24916ad8b3a3aa4f7cc.camel@tugraz.at>



> On Oct 26, 2023, at 5:20 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker <uecker@tugraz.at> wrote:
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>>>> 
>>>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker <uecker@tugraz.at>:
>>>>> 
>>>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>>>>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>>>>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>>>>>> 
>>>>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker <uecker@tugraz.at>:
>>>>>>>>> 
>>>>>>>>> Am Dienstag, dem 24.10.2023 um 20:30 +0000 schrieb Qing Zhao:
>>>>>>>>>> Hi, Sid,
>>>>>>>>>> 
>>>>>>>>>> Really appreciate for your example and detailed explanation. Very helpful.
>>>>>>>>>> I think that this example is an excellent example to show (almost) all the issues we need to consider.
>>>>>>>>>> 
>>>>>>>>>> I slightly modified this example to make it to be compilable and run-able, as following:
>>>>>>>>>> (but I still cannot make the incorrect reordering or DSE happening, anyway, the potential reordering possibility is there…)
>>>>>>>>>> 
>>>>>>>>>> 1 #include <malloc.h>
>>>>>>>>>> 2 struct A
>>>>>>>>>> 3 {
>>>>>>>>>> 4  size_t size;
>>>>>>>>>> 5  char buf[] __attribute__((counted_by(size)));
>>>>>>>>>> 6 };
>>>>>>>>>> 7
>>>>>>>>>> 8 static size_t
>>>>>>>>>> 9 get_size_from (void *ptr)
>>>>>>>>>> 10 {
>>>>>>>>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>>>>>>>>> 12 }
>>>>>>>>>> 13
>>>>>>>>>> 14 void
>>>>>>>>>> 15 foo (size_t sz)
>>>>>>>>>> 16 {
>>>>>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>>>>>>>>> 18  obj->size = sz;
>>>>>>>>>> 19  obj->buf[0] = 2;
>>>>>>>>>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>>>>>>>>>> 21  return;
>>>>>>>>>> 22 }
>>>>>>>>>> 23
>>>>>>>>>> 24 int main ()
>>>>>>>>>> 25 {
>>>>>>>>>> 26  foo (20);
>>>>>>>>>> 27  return 0;
>>>>>>>>>> 28 }
>>>>>>>>>> 
>>>>>> 
>>>>>> <snip>
>>>>>> 
>>>>>>>> When it’s set I suppose.  Turn
>>>>>>>> 
>>>>>>>> X.l = n;
>>>>>>>> 
>>>>>>>> Into
>>>>>>>> 
>>>>>>>> X.l = __builtin_with_size (x.buf, n);
>>>>>>> 
>>>>>>> It would turn
>>>>>>> 
>>>>>>> some_variable = (&) x.buf
>>>>>>> 
>>>>>>> into
>>>>>>> 
>>>>>>> some_variable = __builtin_with_size ( (&) x.buf. x.len)
>>>>>>> 
>>>>>>> 
>>>>>>> So the later access to x.buf and not the initialization
>>>>>>> of a member of the struct (which is too early).
>>>>>>> 
>>>>>> 
>>>>>> Hmm, so with Qing's example above, are you suggesting the transformation
>>>>>> be to foo like so:
>>>>>> 
>>>>>> 14 void
>>>>>> 15 foo (size_t sz)
>>>>>> 16 {
>>>>>> 16.5  void * _1;
>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>>>>> 18  obj->size = sz;
>>>>>> 19  obj->buf[0] = 2;
>>>>>> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
>>>>>> 20  __builtin_printf (“%d\n", get_size_from (_1));
>>>>>> 21  return;
>>>>>> 22 }
>>>>>> 
>>>>>> If yes then this could indeed work.  I think I got thrown off by the
>>>>>> reference to __bdos.
>>>>> 
>>>>> Yes. I think it is important not to evaluate the size at the
>>>>> access to buf and not the allocation, because the point is to
>>>>> recover it from the size member even when the compiler can't
>>>>> see the original allocation.
>>>> 
>>>> But if the access is through a pointer without the attribute visible
>>>> even the Frontend cannot recover?
>>> 
>>> Yes, if the access is using a struct-with-FAM without the attribute
>>> the FE would not be insert the builtin.  BDOS could potentially
>>> still see the original allocation but if it doesn't, then there is
>>> no information.
>>> 
>>>> We’d need to force type correctness and give up on indirecting
>>>> through an int * when it can refer to two diffenent container types.
>>>> The best we can do I think is mark allocation sites and hope for
>>>> some basic code hygiene (not clobbering size or array pointer
>>>> through pointers without the appropriately attributed type)
>>> 
>>> I am do not fully understand what you are referring to.
>> 
>> struct A { int n; int data[n]; };
>> struct B { long n; int data[n]; };
>> 
>> int *p = flag ? a->data : b->data;
>> 
>> access *p;
>> 
>> Since we need to allow interoperability of pointers (a->data is
>> convertible to a non-fat pointer of type int *) this leaves us with
>> ambiguity we need to conservatively handle to avoid false positives.
> 
> For BDOS, I would expect this to work exactly like:
> 
> char aa[n1];
> char bb[n2];
> char *p = flag ? aa : bb;
> 
> (or similar code with malloc). In fact it does:
> 
> https://godbolt.org/z/bK68YKqhe
> (cheating a bit and also the sub-object version of
> BDOS does not seem to work)
> 
>> 
>> We _might_ want to diagnose decay of a->data to int *, but IIRC
>> there's no way (or proposal) to allow declaring a corresponding
>> fat pointer, so it's not a good designed feature.
> 
> As a language feature, I fully agree.  I see the
> counted_by attribute has a makeshift solution.

The “counted_by” attribute is necessary at this moment since
 it will be much easier to be adopted by the existing source code,
 for example, the Linux Kernel. 

Though I agree that embedding the bound information into TYPE 
system  should be the ultimate goal. 

> 
> But we can already do:
> 
> auto p = flag ? &aa : &bb;
> 
> and this already works perfectly:
> 
> https://godbolt.org/z/rvb6xWWPj
> 
> We can also name the variably-modifed type: 
> 
> char (*p)[flag ? n1 : n2] = flag ? &aa : &bb;
> https://godbolt.org/z/13cTT1vGP
> 
> The problem with this version is that consistency
> is not checked. (I have patch for adding run-time
> checks).
> 
> And then the next step would be to allow
> 
> char (*p)[:] = flag ? &aa : &bb;
> 
> or similar.  Dennis Ritchie proposed this himself
> a long time ago.
> 
> So far this seems straightfoward.
> 
> If we then want to allow such wide pointers as
> function arguments or in structs, we would need
> to define an ABI. But the ABI could just be
> 
> struct { char (*p)[.s]; size_t s; };
> 
> Maybe we could try to make the following
> ABI compatible:
> 
> int foo(int p[s], size_t s);
> int foo(int p[:]);
> 
> 
>> Having __builtin_with_size at allocation would possibly make
>> the BOS use-def walk discover both objects.
> 
> Yes. But I do not think this there is any fundamental
> difference to discovering allocation functions.
> 
>>  I think you can't
>> insert __builtin_with_size at the access to *p, but in practice
>> that would be very much needed.
> 
> Usually the access to *p would follow directly the
> access x.buf, so BDOS should find it.
> 
> But yes, to get full bounds safety, the pointer type 
> has to change to a variably-modified type (which would work
> today) or a fat pointer type.

By variable-modified type, you mean the VLA?

There is one major difference between VLA and (FAM or Pointer array):

For VLA, the compiler is responsible for allocating the memory for it, 
the size assignment and the memory allocation are both done by the
 compiler at the same time and tied together. 

But for FAM and pointer arrays, right now, users allocate the memory for them
In the source code, so, when we add the “counted_by” attribute, we need to
specify the additional requirement for the order of size assignment and memory
allocation into the source code, and specify this requirement in the user documentation.

Later, if we try to make the bound information of FAM/pointer array into TYPE 
system, similar as the current VLA, should we also need to move the memory allocation 
of the FAM/pointer arrays into compiler (similar as VLA too)? 
> The later can be built on
> vm-types easily because all the FE semantics already
> exists.

Except the memory allocation part…

Do I miss anything here?

Qing
> 
> Martin
> 
>> 
>> Richard.
>> 
>>> But yes,
>>> for full bounds safety we would need the language feature.
>>> In C people should start to variably-modified types
>>> more.  I think we can build perfect bounds safety on top of
>>> them in a very good way with only FE changes.
>>> 
>>> All these attributes are just a best effort.  But for a while,
>>> this will be necessary.
>>> 
>>> Martin
>>> 
>>>> 
>>>>> Evaluating at this point requires that the size is correctly set
>>>>> before the access to the FAM and the user has to make sure
>>>>> this is the case. But to me this requirement would make sense.
>>>>> 
>>>>> Semantically, it could aöso make sense to evaluate the size at a
>>>>> later time.  But then the reordering becomes problematic again.
>>>>> 
>>>>> Also I think this would make this feature generally more useful.
>>>>> For example, it could work also for others pointers in the struct
>>>>> and not just for FAMs.  In this case, the struct may already be
>>>>> freed when  BDOS is called, so it might also not possible to
>>>>> access the size member at a later time.
>>>>> 
>>>>> Martin
>>>>> 
>>>>> 
>>>>>> 
>>>>> 
>>> 
> 


  parent reply	other threads:[~2023-10-26 16:41 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-25 15:24 Qing Zhao
2023-08-25 15:24 ` [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
2023-09-08 14:12   ` Qing Zhao
2023-09-20 13:44   ` Ping * 2: " Qing Zhao
2023-10-05 18:51   ` Siddhesh Poyarekar
2023-10-05 19:31     ` Siddhesh Poyarekar
2023-10-18 14:51       ` Qing Zhao
2023-10-18 15:18         ` Siddhesh Poyarekar
2023-10-18 15:37           ` Qing Zhao
2023-10-18 14:41     ` Qing Zhao
2023-08-25 15:24 ` [V3][PATCH 2/3] Use the counted_by atribute info in builtin object size [PR108896] Qing Zhao
2023-09-08 14:12   ` Qing Zhao
2023-09-20 13:44   ` PING *2: " Qing Zhao
2023-10-05 20:01   ` Siddhesh Poyarekar
2023-10-18 20:39     ` Qing Zhao
2023-08-25 15:24 ` [V3][PATCH 3/3] Use the counted_by attribute information in bound sanitizer[PR108896] Qing Zhao
2023-09-08 14:12   ` Qing Zhao
2023-09-20 13:45   ` PING * 2: " Qing Zhao
2023-08-25 19:51 ` [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
2023-09-08 14:11 ` Qing Zhao
2023-09-20 13:43 ` PING * 2: " Qing Zhao
2023-10-05 20:08 ` Siddhesh Poyarekar
2023-10-05 22:35   ` Kees Cook
2023-10-06  5:11     ` Martin Uecker
2023-10-06 10:50       ` Siddhesh Poyarekar
2023-10-06 20:01         ` Martin Uecker
2023-10-18 15:37           ` Siddhesh Poyarekar
2023-10-18 19:35           ` Qing Zhao
2023-10-18 21:11   ` Qing Zhao
2023-10-19 23:33     ` Kees Cook
2023-10-20  9:50       ` Martin Uecker
2023-10-20 18:34         ` Kees Cook
2023-10-20 18:48           ` Qing Zhao
2023-10-20 19:54             ` Martin Uecker
2023-10-23 18:17               ` Qing Zhao
2023-10-23 19:52               ` Kees Cook
2023-10-23 19:57                 ` Martin Uecker
2023-10-23 22:03                   ` Kees Cook
2023-10-20 17:08     ` HELP: Will the reordering happen? " Qing Zhao
2023-10-20 18:22       ` Richard Biener
2023-10-20 18:38         ` Qing Zhao
2023-10-20 19:10           ` Siddhesh Poyarekar
2023-10-20 20:41             ` Qing Zhao
2023-10-23  7:57               ` Richard Biener
2023-10-23 11:27                 ` Siddhesh Poyarekar
2023-10-23 12:34                   ` Richard Biener
2023-10-23 13:23                     ` Siddhesh Poyarekar
2023-10-23 15:14                     ` Qing Zhao
2023-10-23 14:56                 ` Qing Zhao
2023-10-23 15:57                   ` Richard Biener
2023-10-23 16:37                     ` Qing Zhao
2023-10-23 18:06                       ` Martin Uecker
2023-10-23 18:31                         ` Martin Uecker
2023-10-23 19:00                           ` Qing Zhao
2023-10-23 19:37                             ` Martin Uecker
2023-10-23 20:33                               ` Qing Zhao
2023-10-23 18:33                         ` Qing Zhao
2023-10-23 18:43                         ` Siddhesh Poyarekar
2023-10-23 18:55                           ` Martin Uecker
2023-10-23 19:43                           ` Qing Zhao
2023-10-23 22:48                             ` Siddhesh Poyarekar
2023-10-24 20:30                               ` Qing Zhao
2023-10-24 20:38                                 ` Martin Uecker
2023-10-24 21:09                                   ` Siddhesh Poyarekar
2023-10-24 22:51                                   ` Qing Zhao
2023-10-24 23:56                                     ` Siddhesh Poyarekar
2023-10-25 13:27                                       ` Qing Zhao
2023-10-25 14:50                                         ` Siddhesh Poyarekar
2023-10-25 15:38                                           ` Richard Biener
2023-10-25 19:03                                             ` Qing Zhao
2023-10-26  5:21                                               ` Jakub Jelinek
2023-10-26  8:56                                                 ` Richard Biener
2023-10-26 14:58                                                   ` Qing Zhao
2023-10-26 15:48                                                     ` Richard Biener
2023-10-26 16:16                                                       ` Martin Uecker
2023-10-26 14:41                                                 ` Qing Zhao
2023-10-25 18:44                                           ` Qing Zhao
2023-10-25 22:06                                         ` Kees Cook
2023-10-25 22:27                                           ` Qing Zhao
2023-10-25 22:32                                             ` Kees Cook
2023-10-26  8:15                                               ` Martin Uecker
2023-10-26 16:13                                                 ` Kees Cook
2023-10-26 16:45                                                   ` Martin Uecker
2023-10-26 19:57                                                     ` Qing Zhao
2023-10-27  7:21                                                       ` Martin Uecker
2023-10-27 14:32                                                         ` Qing Zhao
2023-10-27 14:53                                                           ` Martin Uecker
2023-10-27 15:10                                                             ` Qing Zhao
2023-10-27 17:19                                                               ` Kees Cook
2023-10-27 18:13                                                                 ` Qing Zhao
2023-10-25  5:26                                     ` Martin Uecker
2023-10-25  6:43                                   ` Richard Biener
2023-10-25  8:16                                     ` Martin Uecker
2023-10-25 10:25                                       ` Siddhesh Poyarekar
2023-10-25 10:47                                         ` Martin Uecker
2023-10-25 11:13                                           ` Richard Biener
2023-10-25 18:16                                             ` Martin Uecker
2023-10-26  8:45                                               ` Richard Biener
2023-10-26  9:20                                                 ` Martin Uecker
2023-10-26 10:14                                                   ` Martin Uecker
2023-10-26 14:05                                                     ` Richard Biener
2023-10-26 18:54                                                       ` Qing Zhao
2023-10-27 16:43                                                         ` Qing Zhao
2023-10-26 16:41                                                   ` Qing Zhao [this message]
2023-10-26 17:05                                                     ` Martin Uecker
2023-10-26 17:35                                                       ` Richard Biener
2023-10-26 19:20                                                       ` Qing Zhao
2023-10-25 18:17                                             ` Qing Zhao
2023-10-25 10:25                                       ` Richard Biener
2023-10-25 10:39                                         ` Martin Uecker
2023-10-25 18:06                                           ` Qing Zhao
2023-10-24 21:03                                 ` Siddhesh Poyarekar
2023-10-24 22:41                                   ` Qing Zhao
2023-10-24 23:51                                     ` Siddhesh Poyarekar
2023-10-25 21:59                                       ` Kees Cook
2023-10-23 18:10                       ` Joseph Myers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8EF8A68F-CCE2-4746-A95E-F6C0A87B3C29@oracle.com \
    --to=qing.zhao@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=isanbard@gmail.com \
    --cc=jakub@redhat.com \
    --cc=joseph@codesourcery.com \
    --cc=keescook@chromium.org \
    --cc=richard.guenther@gmail.com \
    --cc=siddhesh@gotplt.org \
    --cc=uecker@tugraz.at \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).