public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jason Merrill <jason@redhat.com>
To: Martin Sebor <msebor@gmail.com>, gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PING][PATCH] correct handling of indices into arrays with elements larger than 1 (PR c++/96511)
Date: Sun, 11 Oct 2020 23:44:58 -0400	[thread overview]
Message-ID: <a5a6637c-6993-9cdb-2921-213cff40b243@redhat.com> (raw)
In-Reply-To: <455d95e7-46bf-6a16-b25b-e038fd72f6a7@gmail.com>

On 10/11/20 6:45 PM, Martin Sebor wrote:
> On 10/9/20 9:13 AM, Jason Merrill wrote:
>> On 10/9/20 10:51 AM, Martin Sebor wrote:
>>> On 10/8/20 1:40 PM, Jason Merrill wrote:
>>>> On 10/8/20 3:18 PM, Martin Sebor wrote:
>>>>> On 10/7/20 3:01 PM, Jason Merrill wrote:
>>>>>> On 10/7/20 4:11 PM, Martin Sebor wrote:
>>>>> ...
>>>>>
>>>>>>>>>>>>>> For the various member functions, please include the 
>>>>>>>>>>>>>> comments with the definition as well as the in-class 
>>>>>>>>>>>>>> declaration.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Only one access_ref member function is defined out-of-line: 
>>>>>>>>>>>>> offset_bounded().  I've adjusted the comment and copied it 
>>>>>>>>>>>>> above
>>>>>>>>>>>>> the function definition.
>>>>>>
>>>>>> And size_remaining, as quoted above?
>>>>>
>>>>> I have this in my tree:
>>>>>
>>>>> /* Return the maximum amount of space remaining and if non-null, set
>>>>>     argument to the minimum.  */
>>>>>
>>>>> I'll add it when I commit the patch.
>>>>>
>>>>>>
>>>>>> I also don't see a comment above the definition of offset_bounded 
>>>>>> in the new patch?
>>>>>
>>>>> There is a comment in the latest patch.
>>>>>
>>>>> ...
>>>>>>>>>>>>> The goal of conditionals is to avoid overwhelming the user 
>>>>>>>>>>>>> with
>>>>>>>>>>>>> excessive numbers that may not be meaningful or even relevant
>>>>>>>>>>>>> to the warning.  I've corrected the function body, tweaked and
>>>>>>>>>>>>> renamed the get_range function to get_offset_range to do a 
>>>>>>>>>>>>> better
>>>>>>>>>>>>> job of extracting ranges from the types of some nonconstant
>>>>>>>>>>>>> expressions the front end passes it, and added a new test for
>>>>>>>>>>>>> all this.  Attached is the new revision.
>>>>>>
>>>>>> offset_bounded looks unchanged in the new patch.  It still returns 
>>>>>> true iff either the range is a single value or one of the bounds 
>>>>>> are unrepresentable in ptrdiff_t.  I'm still unclear how this 
>>>>>> corresponds to "Return true if OFFRNG is bounded to a subrange of 
>>>>>> possible offset values."
>>>>>
>>>>> I don't think you're looking at the latest patch.  It has this:
>>>>>
>>>>> +/* Return true if OFFRNG is bounded to a subrange of offset values
>>>>> +   valid for the largest possible object.  */
>>>>> +
>>>>>   bool
>>>>>   access_ref::offset_bounded () const
>>>>>   {
>>>>> -  if (offrng[0] == offrng[1])
>>>>> -    return false;
>>>>> -
>>>>>     tree min = TYPE_MIN_VALUE (ptrdiff_type_node);
>>>>>     tree max = TYPE_MAX_VALUE (ptrdiff_type_node);
>>>>> -  return offrng[0] <= wi::to_offset (min) || offrng[1] >= 
>>>>> wi::to_offset (max);
>>>>> +  return wi::to_offset (min) <= offrng[0] && offrng[1] <= 
>>>>> wi::to_offset (max);
>>>>>   }
>>>>>
>>>>> Here's a link to it in the archive:
>>>>>
>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/555019.html
>>>>> https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200928/9026783a/attachment-0003.bin 
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Ah, yes, there are two patches in that email; the first introduces 
>>>> the broken offset_bounded, and the second one fixes it without 
>>>> mentioning that in the ChangeLog.  How about moving the fix to the 
>>>> first patch?
>>>
>>> Sure, I can do that.  Anything else or is the final version okay
>>> to commit with this adjustment?
>>
>> OK with that adjustment.
> 
> I've done more testing and found a bug in the second patch: adding
> an offset in an inverted range to an existing offset range isn't as
> simple as adding up the bounds because they mean different things:
> like an anti-range, an inverted range is a union of two subranges.
> Instead, the upper bound needs to be extended to PTRDIFF_MAX because
> that is the maximum being added, and the lower bound either reset to
> zero if the absolute value of the maximum being added is less than
> it, or incremented by the absolute value otherwise.
> 
> For example, given:
> 
>    char a[8];
>    char *pa = a;
>    char *p1 = pa + i;   // i's range is [3, 5]
>    char *p2 = p1 + j;   // j's range is [1, -4]
> 
> the range of p2's offset isn't [4, 1] but [4, PTRDIFF_MAX] (or more
> precisely [4, 8] if we assume it's valid).  But the range of p3's
> valid offset in this last pointer
> 
>    char *p3 = p2 + k;   // k's range is [5, -4]
> 
> is all of [0, PTRDIFF_MAX] (or, more accurately, [0, 8]).
> 
> This may seem obvious but it took me a while at first to wrap my head
> around.

It makes sense, but doesn't seem obvious; a bit more comment might be nice.

> I've tweaked access_ref::add_offset in the patch to handle this
> correctly.  The function now ensures that every offset is in
> a regular range (and not an inverted one).  That in turn simplifies
> access_ref::size_remaining.  Since an inverted range is the same as
> an anti-range, there's no reason to exclude the latter anymore(*).
> The diff on top of the approved patch is attached.
> 
> I've retested this new revision of the patch with Glibc and GDB/
> Binutils, (the latter fails due to PR 97360), and the Linux kernel.
> 
> Please let me know if you have any questions or concerns with
> this change.  If not, I'd like to commit it sometime tomorrow.
> 
> Martin
> 
> [*] I was curious how often these inverted ranges/anti-ranges come
> up in pointer arithmetic to see if handling them is worthwhile.  I
> instrumented GCC to print them in get_range() on master where they
> are only looked at in calls to built-in functions, and in another
> patch I'm working on where they are looked at for every pointer
> addition.  They account for 16% to 23% (GCC and Glibc, respectively)
> and 22% to 32% (Glibc and GCC).  The detailed results are below.
> 
> GCC
>    Builtin pointer arithmetic:
>      kind         ordinary      inverted
>      RANGE        636 (38%)     150 ( 9%)
>      ANTI_RANGE    28 ( 1%)      99 ( 6%)
>      VARYING      733 (44%)
>      total       1397 (84%)     249 (15%)
> 
>    All pointer arithmetic:
>      kind         ordinary      inverted
>      RANGE        4663 (45%)   2945 (28%)
>      ANTI_RANGE    134 ( 1%)    119 ( 1%)
>      VARYING      2344 (22%)
>      total        7141 (69%)   3064 (30%)
> 
> Glibc
>    Builtin pointer arithmetic:
>      kind         ordinary      inverted
>      RANGE        488 (37%)     191 (14%)
>      ANTI_RANGE    18 ( 1%)     112 ( 8%)
>      VARYING      509 (38%)
>      total       1015 (77%)     303 (22%)
> 
>    All pointer arithmetic:
>      kind         ordinary      inverted
>      RANGE       1941 (51%)     636 (16%)
>      ANTI_RANGE    41 ( 1%)     202 ( 5%)
>      VARYING      952 (25%)
>      total       2934 (77%)     838 (22%)

> +  /* When referencing a known object check to see if the offset computed
> +     so far is in bounds... */

...but you don't do anything if it isn't in bounds?  That could use a 
comment, at least.

Jason


  reply	other threads:[~2020-10-12  3:45 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-11 16:19 [PATCH] " Martin Sebor
2020-08-19 15:00 ` [PING][PATCH] " Martin Sebor
2020-08-28 15:42   ` [PING 2][PATCH] " Martin Sebor
2020-09-01 19:22 ` [PATCH] " Jason Merrill
2020-09-03 18:44   ` Martin Sebor
2020-09-04 17:14     ` Jason Merrill
2020-09-14 22:01       ` Martin Sebor
2020-09-21 21:17         ` [PING][PATCH] " Martin Sebor
2020-09-22 20:05           ` Martin Sebor
2020-09-26  5:17             ` Jason Merrill
2020-09-28 22:01               ` Martin Sebor
2020-10-05 16:37                 ` Martin Sebor
2020-10-07 14:26                 ` Jason Merrill
2020-10-07 14:42                   ` Martin Sebor
2020-10-07 15:07                     ` Jason Merrill
2020-10-07 15:19                       ` Martin Sebor
2020-10-07 19:28                         ` Jason Merrill
2020-10-07 20:11                           ` Martin Sebor
2020-10-07 21:01                             ` Jason Merrill
2020-10-08 19:18                               ` Martin Sebor
2020-10-08 19:40                                 ` Jason Merrill
2020-10-09 14:51                                   ` Martin Sebor
2020-10-09 15:13                                     ` Jason Merrill
2020-10-11 22:45                                       ` Martin Sebor
2020-10-12  3:44                                         ` Jason Merrill [this message]
2020-10-12 15:21                                           ` Martin Sebor
2020-10-13  9:46                 ` Christophe Lyon
2020-10-13 16:59                   ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a5a6637c-6993-9cdb-2921-213cff40b243@redhat.com \
    --to=jason@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=msebor@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).