public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Jason Merrill <jason@redhat.com>, gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PING][PATCH] correct handling of indices into arrays with elements larger than 1 (PR c++/96511)
Date: Mon, 12 Oct 2020 09:21:13 -0600	[thread overview]
Message-ID: <077f124c-e7ec-63ac-1d62-6437dbd4e0c2@gmail.com> (raw)
In-Reply-To: <a5a6637c-6993-9cdb-2921-213cff40b243@redhat.com>

On 10/11/20 9:44 PM, Jason Merrill wrote:
> On 10/11/20 6:45 PM, Martin Sebor wrote:
>> On 10/9/20 9:13 AM, Jason Merrill wrote:
>>> On 10/9/20 10:51 AM, Martin Sebor wrote:
>>>> On 10/8/20 1:40 PM, Jason Merrill wrote:
>>>>> On 10/8/20 3:18 PM, Martin Sebor wrote:
>>>>>> On 10/7/20 3:01 PM, Jason Merrill wrote:
>>>>>>> On 10/7/20 4:11 PM, Martin Sebor wrote:
>>>>>> ...
>>>>>>
>>>>>>>>>>>>>>> For the various member functions, please include the 
>>>>>>>>>>>>>>> comments with the definition as well as the in-class 
>>>>>>>>>>>>>>> declaration.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Only one access_ref member function is defined 
>>>>>>>>>>>>>> out-of-line: offset_bounded().  I've adjusted the comment 
>>>>>>>>>>>>>> and copied it above
>>>>>>>>>>>>>> the function definition.
>>>>>>>
>>>>>>> And size_remaining, as quoted above?
>>>>>>
>>>>>> I have this in my tree:
>>>>>>
>>>>>> /* Return the maximum amount of space remaining and if non-null, set
>>>>>>     argument to the minimum.  */
>>>>>>
>>>>>> I'll add it when I commit the patch.
>>>>>>
>>>>>>>
>>>>>>> I also don't see a comment above the definition of offset_bounded 
>>>>>>> in the new patch?
>>>>>>
>>>>>> There is a comment in the latest patch.
>>>>>>
>>>>>> ...
>>>>>>>>>>>>>> The goal of conditionals is to avoid overwhelming the user 
>>>>>>>>>>>>>> with
>>>>>>>>>>>>>> excessive numbers that may not be meaningful or even relevant
>>>>>>>>>>>>>> to the warning.  I've corrected the function body, tweaked 
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> renamed the get_range function to get_offset_range to do a 
>>>>>>>>>>>>>> better
>>>>>>>>>>>>>> job of extracting ranges from the types of some nonconstant
>>>>>>>>>>>>>> expressions the front end passes it, and added a new test for
>>>>>>>>>>>>>> all this.  Attached is the new revision.
>>>>>>>
>>>>>>> offset_bounded looks unchanged in the new patch.  It still 
>>>>>>> returns true iff either the range is a single value or one of the 
>>>>>>> bounds are unrepresentable in ptrdiff_t.  I'm still unclear how 
>>>>>>> this corresponds to "Return true if OFFRNG is bounded to a 
>>>>>>> subrange of possible offset values."
>>>>>>
>>>>>> I don't think you're looking at the latest patch.  It has this:
>>>>>>
>>>>>> +/* Return true if OFFRNG is bounded to a subrange of offset values
>>>>>> +   valid for the largest possible object.  */
>>>>>> +
>>>>>>   bool
>>>>>>   access_ref::offset_bounded () const
>>>>>>   {
>>>>>> -  if (offrng[0] == offrng[1])
>>>>>> -    return false;
>>>>>> -
>>>>>>     tree min = TYPE_MIN_VALUE (ptrdiff_type_node);
>>>>>>     tree max = TYPE_MAX_VALUE (ptrdiff_type_node);
>>>>>> -  return offrng[0] <= wi::to_offset (min) || offrng[1] >= 
>>>>>> wi::to_offset (max);
>>>>>> +  return wi::to_offset (min) <= offrng[0] && offrng[1] <= 
>>>>>> wi::to_offset (max);
>>>>>>   }
>>>>>>
>>>>>> Here's a link to it in the archive:
>>>>>>
>>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/555019.html
>>>>>> https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200928/9026783a/attachment-0003.bin 
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Ah, yes, there are two patches in that email; the first introduces 
>>>>> the broken offset_bounded, and the second one fixes it without 
>>>>> mentioning that in the ChangeLog.  How about moving the fix to the 
>>>>> first patch?
>>>>
>>>> Sure, I can do that.  Anything else or is the final version okay
>>>> to commit with this adjustment?
>>>
>>> OK with that adjustment.
>>
>> I've done more testing and found a bug in the second patch: adding
>> an offset in an inverted range to an existing offset range isn't as
>> simple as adding up the bounds because they mean different things:
>> like an anti-range, an inverted range is a union of two subranges.
>> Instead, the upper bound needs to be extended to PTRDIFF_MAX because
>> that is the maximum being added, and the lower bound either reset to
>> zero if the absolute value of the maximum being added is less than
>> it, or incremented by the absolute value otherwise.
>>
>> For example, given:
>>
>>    char a[8];
>>    char *pa = a;
>>    char *p1 = pa + i;   // i's range is [3, 5]
>>    char *p2 = p1 + j;   // j's range is [1, -4]
>>
>> the range of p2's offset isn't [4, 1] but [4, PTRDIFF_MAX] (or more
>> precisely [4, 8] if we assume it's valid).  But the range of p3's
>> valid offset in this last pointer
>>
>>    char *p3 = p2 + k;   // k's range is [5, -4]
>>
>> is all of [0, PTRDIFF_MAX] (or, more accurately, [0, 8]).
>>
>> This may seem obvious but it took me a while at first to wrap my head
>> around.
> 
> It makes sense, but doesn't seem obvious; a bit more comment might be nice.

I just now noticed this suggestion, after pushing both patches.
I'll keep it in mind and add something later.

> 
>> I've tweaked access_ref::add_offset in the patch to handle this
>> correctly.  The function now ensures that every offset is in
>> a regular range (and not an inverted one).  That in turn simplifies
>> access_ref::size_remaining.  Since an inverted range is the same as
>> an anti-range, there's no reason to exclude the latter anymore(*).
>> The diff on top of the approved patch is attached.
>>
>> I've retested this new revision of the patch with Glibc and GDB/
>> Binutils, (the latter fails due to PR 97360), and the Linux kernel.
>>
>> Please let me know if you have any questions or concerns with
>> this change.  If not, I'd like to commit it sometime tomorrow.
>>
>> Martin
>>
>> [*] I was curious how often these inverted ranges/anti-ranges come
>> up in pointer arithmetic to see if handling them is worthwhile.  I
>> instrumented GCC to print them in get_range() on master where they
>> are only looked at in calls to built-in functions, and in another
>> patch I'm working on where they are looked at for every pointer
>> addition.  They account for 16% to 23% (GCC and Glibc, respectively)
>> and 22% to 32% (Glibc and GCC).  The detailed results are below.
>>
>> GCC
>>    Builtin pointer arithmetic:
>>      kind         ordinary      inverted
>>      RANGE        636 (38%)     150 ( 9%)
>>      ANTI_RANGE    28 ( 1%)      99 ( 6%)
>>      VARYING      733 (44%)
>>      total       1397 (84%)     249 (15%)
>>
>>    All pointer arithmetic:
>>      kind         ordinary      inverted
>>      RANGE        4663 (45%)   2945 (28%)
>>      ANTI_RANGE    134 ( 1%)    119 ( 1%)
>>      VARYING      2344 (22%)
>>      total        7141 (69%)   3064 (30%)
>>
>> Glibc
>>    Builtin pointer arithmetic:
>>      kind         ordinary      inverted
>>      RANGE        488 (37%)     191 (14%)
>>      ANTI_RANGE    18 ( 1%)     112 ( 8%)
>>      VARYING      509 (38%)
>>      total       1015 (77%)     303 (22%)
>>
>>    All pointer arithmetic:
>>      kind         ordinary      inverted
>>      RANGE       1941 (51%)     636 (16%)
>>      ANTI_RANGE    41 ( 1%)     202 ( 5%)
>>      VARYING      952 (25%)
>>      total       2934 (77%)     838 (22%)
> 
>> +  /* When referencing a known object check to see if the offset computed
>> +     so far is in bounds... */
> 
> ...but you don't do anything if it isn't in bounds?  That could use a 
> comment, at least.

When the offset is out of bounds it has to stay unchanged so that
the access to the object can be diagnosed and the invalid offset
mentioned in the note after the warning.  This also means that
subsequent pointer addition can make an invalid offset valid
again, preventing the access warning.  A different warning will
have to detect that at the point of the pointer addition (I'm
about to submit a change that implements that).

I added a brief comment mentioning this.

Martin

  reply	other threads:[~2020-10-12 15:21 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-11 16:19 [PATCH] " Martin Sebor
2020-08-19 15:00 ` [PING][PATCH] " Martin Sebor
2020-08-28 15:42   ` [PING 2][PATCH] " Martin Sebor
2020-09-01 19:22 ` [PATCH] " Jason Merrill
2020-09-03 18:44   ` Martin Sebor
2020-09-04 17:14     ` Jason Merrill
2020-09-14 22:01       ` Martin Sebor
2020-09-21 21:17         ` [PING][PATCH] " Martin Sebor
2020-09-22 20:05           ` Martin Sebor
2020-09-26  5:17             ` Jason Merrill
2020-09-28 22:01               ` Martin Sebor
2020-10-05 16:37                 ` Martin Sebor
2020-10-07 14:26                 ` Jason Merrill
2020-10-07 14:42                   ` Martin Sebor
2020-10-07 15:07                     ` Jason Merrill
2020-10-07 15:19                       ` Martin Sebor
2020-10-07 19:28                         ` Jason Merrill
2020-10-07 20:11                           ` Martin Sebor
2020-10-07 21:01                             ` Jason Merrill
2020-10-08 19:18                               ` Martin Sebor
2020-10-08 19:40                                 ` Jason Merrill
2020-10-09 14:51                                   ` Martin Sebor
2020-10-09 15:13                                     ` Jason Merrill
2020-10-11 22:45                                       ` Martin Sebor
2020-10-12  3:44                                         ` Jason Merrill
2020-10-12 15:21                                           ` Martin Sebor [this message]
2020-10-13  9:46                 ` Christophe Lyon
2020-10-13 16:59                   ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=077f124c-e7ec-63ac-1d62-6437dbd4e0c2@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jason@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).