public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Tom de Vries <tdevries@suse.de>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>,
	"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Cc: Guinevere Larsen <blarsen@redhat.com>
Subject: Re: [PATCH] Fix an out of bounds array access in, find_epilogue_using_linetable
Date: Fri, 5 Apr 2024 17:11:04 +0200	[thread overview]
Message-ID: <a0a847b7-1f11-41db-9c5e-9376fb4464a9@suse.de> (raw)
In-Reply-To: <AS8P193MB1285A45A142E6CFE6B1D832EE43F2@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM>

On 4/1/24 18:41, Bernd Edlinger wrote:
> On 4/1/24 11:07, Tom de Vries wrote:
>> On 3/31/24 12:45, Bernd Edlinger wrote:
>>> This causes random test failures like these:
>>>
>>
>> Hi Bernd,
>>
>> thanks for the analysis and the patch.
>>
>> I think with "this" you mean $subject, which was not immediately clear to me when reading it on the mailing list where there's a whole bunch of text inbetween, so please consider spelling it out or using $subject or some such.
>>
> 
> ok will do.
> 
>>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $fn_fba
>>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: check frame-id matches
>>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: bt 2
>>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: up
>>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $sp_value == $::main_sp
>>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $::main_fba
>>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: [string equal $fid $::main_fid]
>>>
>>> Here the read happens below the first element of the line
>>> table, and the test failure depends on the value that is
>>> read from there.
>>>
>>
>> Using this description, I managed to minimally rewrite the loop into a form where I could easily add an assert, and then add an assert that reliably detects the problem:
>> ...
>>        while (1)
>>          {
>>            gdb_assert (it >= &linetable->item[0]);
>>            if (it->unrelocated_pc () >= unrel_start)
>>              ;
>>            else
>>              break;
>>            if (it->epilogue_begin)
>>              return {it->pc (objfile)};
>>            it --;
>>          }
>> ...
>>
>>> Theoretically it is also possible that std::lower_bound
>>> returns a pointer exactly at the upper bound of the line table,
>>> also here the read value is undefined, that happens in this test:
>>>
>>> FAIL: gdb.dwarf2/dw2-epilogue-begin.exp: confirm watchpoint doesn't trigger
>>>
>>
>> Ack, thanks for finding and reporting that.
>>
>>> Fixes: 528b729be1a2 ("gdb/dwarf2: Add support for DW_LNS_set_epilogue_begin in line-table")
>>> ---
>>>    gdb/symtab.c | 3 +--
>>>    1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/gdb/symtab.c b/gdb/symtab.c
>>> index 86603dfebc3..2fc8e819932 100644
>>> --- a/gdb/symtab.c
>>> +++ b/gdb/symtab.c
>>> @@ -4177,11 +4177,10 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>>           return lte.unrelocated_pc () < pc;
>>>         });
>>>    -      while (it->unrelocated_pc () >= unrel_start)
>>> +      while (it > linetable->item && (--it)->unrelocated_pc () >= unrel_start)
>>>          {
>>>        if (it->epilogue_begin)
>>>          return {it->pc (objfile)};
>>> -    it --;
>>>          }
>>>        }
>>>      return {};
>>
>> I think this makes the while condition fairly convoluted.
>>
>> Also, AFAIU the problem you mention related to "pointer exactly at the upper bound of the line table", may or may not happen, and the new --it placement that seems intended to address that is executed unconditionally, so wouldn't this sometimes skip an item?
>>
> 
> No, because the upper limit is exactly where the next function starts,
> so it will evaluate the first line table entry of the following function,
> probably harmless, since that should not be the epilogue of the function.
> 

Looking at the "pointer exactly at the upper bound of the line table" 
problem a bit more, I think it can happen that it->epilogue_begin is 
evaluated for an end_sequence entry.  My guess is that this is harmless, 
but it's better to avoid it.

Atm it's hard to reason about what happens because we try to handle 
several initial scenarios in the loop.  It's best to untangle this:
- handle exceptional situations before the loop, and
- reserve the loop to handle the steady state: iterating over
   proper line-table entries of the current function.

>> I think things can be solved simpler by:
>> - using a trivial for loop that expresses the boundary of the line
>>    table, and
>> - moving the while condition into the loop, while
>> - handling the exceptional case explictly, outside the loop:
>> ...
>>        if (it == &linetable->item[linetable->nitems])
>>          it--;
>>
> 
> no, that is also wrong if linetable->nitems == 0, `it` is
> &linetable->item[-1], and the comparing that value to
> &linetable->item[0] yields an undefined boolean value.

I see, I forgot about that, thanks for pointing that out.  It's good to 
make explicit in the implementation that and where we're avoid this.

Btw, if "linetable->nitems == 0" indeed can happen, it's best handled 
using an early-out.

> And as I said above looking at the line table entry
> that belongs to the next function is asking for trouble.
> 
>>        for (; it >= &linetable->item[0]; it--)
>>          {
>>            if (it->unrelocated_pc () < unrel_start)
>>              break;
>>            if (it->epilogue_begin)
>>              return {it->pc (objfile)};
>>          }
>> ...
>>
>> WDYT?
>>
> 
> Even decrementing `it` below the first element of an array
> is undefined behavior, avoiding that is the main reason
> why I wrote the if condition in this way.
> 
> I would agree to add a comment here which explains how the
> loop works.
> 
> OK?
> 

I've written a version that avoids the undefined behaviour decrement, 
clearly advertises this, handles cornercases explicitly and adds and 
reorganizes comments, as well as adds some asserts.  I've submitted it 
as a v3, which also adds an additional test-case.

Thanks,
- Tom


  reply	other threads:[~2024-04-05 15:11 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-31 10:45 Bernd Edlinger
2024-04-01  9:07 ` Tom de Vries
2024-04-01 16:41   ` Bernd Edlinger
2024-04-05 15:11     ` Tom de Vries [this message]
2024-04-05 18:13       ` Bernd Edlinger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a0a847b7-1f11-41db-9c5e-9376fb4464a9@suse.de \
    --to=tdevries@suse.de \
    --cc=bernd.edlinger@hotmail.de \
    --cc=blarsen@redhat.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).