public inbox for gdb-patches@sourceware.org
From: Tom de Vries <tdevries@suse.de>
To: Simon Marchi <simon.marchi@polymtl.ca>, gdb-patches@sourceware.org
Cc: Pedro Alves <pedro@palves.net>
Subject: Re: [PATCH v2] [gdb] Fix heap-use-after-free in select_event_lwp
Date: Tue, 23 Jan 2024 18:52:59 +0100
Message-ID: <f9f7937d-91fc-4ad7-84ab-1a0c92d4092e@suse.de>
In-Reply-To: <e1f9f082-032c-4d16-a1d1-f23a250e2128@polymtl.ca>

On 1/23/24 17:08, Simon Marchi wrote:
> 
> 
> On 2024-01-23 06:48, Tom de Vries wrote:
>> When building gdb with -O0 -fsanitize=thread, and running test-case
>> gdb.base/vfork-follow-parent.exp, 5 times out of 10 I run into:
>> ...
>> WARNING: ThreadSanitizer: heap-use-after-free (pid=249653)
>>    Write of size 4 at 0xffffee83055c by main thread:
>>      #0 select_event_lwp gdb/linux-nat.c:2809 (gdb+0xb0a65c)
>>      #1 linux_nat_wait_1 gdb/linux-nat.c:3389 (gdb+0xb0c470)
>>      #2 linux_nat_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/linux-nat.c:3560 (gdb+0xb0cfc8)
>>      #3 thread_db_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/linux-thread-db.c:1402 (gdb+0xb35958)
>>      #4 target_wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/target.c:2571 (gdb+0xfb6c34)
>>      #5 do_target_wait_1 gdb/infrun.c:4120 (gdb+0xa99dc4)
>>      #6 operator() gdb/infrun.c:4179 (gdb+0xa99f70)
>>      #7 do_target_wait gdb/infrun.c:4198 (gdb+0xa9a2bc)
>>      #8 fetch_inferior_event() gdb/infrun.c:4629 (gdb+0xa9b658)
>>      #9 inferior_event_handler(inferior_event_type) gdb/inf-loop.c:42 (gdb+0xa6b0c8)
>>      #10 handle_target_event gdb/linux-nat.c:4357 (gdb+0xb0f694)
>>      #11 handle_file_event gdbsupport/event-loop.cc:573 (gdb+0x1cfc03c)
>>      #12 gdb_wait_for_event gdbsupport/event-loop.cc:694 (gdb+0x1cfc700)
>>      #13 gdb_do_one_event(int) gdbsupport/event-loop.cc:217 (gdb+0x1cfa8ac)
>>      #14 start_event_loop gdb/main.c:408 (gdb+0xb7be9c)
>>      #15 captured_command_loop gdb/main.c:472 (gdb+0xb7c0cc)
>>      #16 captured_main gdb/main.c:1342 (gdb+0xb7e4e4)
>>      #17 gdb_main(captured_main_args*) gdb/main.c:1361 (gdb+0xb7e594)
>>      #18 main gdb/gdb.c:39 (gdb+0x423ce8)
>>
>>    Previous write of size 8 at 0xffffee830558 by main thread:
>>      #0 operator delete(void*, unsigned long) <null> (libtsan.so.2+0x8fb14)
>>      #1 delete_lwp gdb/linux-nat.c:849 (gdb+0xb0384c)
>>      #2 exit_lwp gdb/linux-nat.c:924 (gdb+0xb03c4c)
>>      #3 wait_lwp gdb/linux-nat.c:2224 (gdb+0xb08404)
>>      #4 stop_wait_callback gdb/linux-nat.c:2458 (gdb+0xb092a8)
>>      #5 gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const gdb/../gdbsupport/function-view.h:326 (gdb+0xb15ab0)
>>      #6 gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) gdb/../gdbsupport/function-view.h:320 (gdb+0xb15b18)
>>      #7 gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const gdb/../gdbsupport/function-view.h:289 (gdb+0xb13e90)
>>      #8 iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) gdb/linux-nat.c:879 (gdb+0xb03a18)
>>      #9 linux_nat_wait_1 gdb/linux-nat.c:3382 (gdb+0xb0c3f8)
>>      #10 linux_nat_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/linux-nat.c:3560 (gdb+0xb0cfc8)
>>      #11 thread_db_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/linux-thread-db.c:1402 (gdb+0xb35958)
>>      #12 target_wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/target.c:2571 (gdb+0xfb6c34)
>>      #13 do_target_wait_1 gdb/infrun.c:4120 (gdb+0xa99dc4)
>>      #14 operator() gdb/infrun.c:4179 (gdb+0xa99f70)
>>      #15 do_target_wait gdb/infrun.c:4198 (gdb+0xa9a2bc)
>>      #16 fetch_inferior_event() gdb/infrun.c:4629 (gdb+0xa9b658)
>>      #17 inferior_event_handler(inferior_event_type) gdb/inf-loop.c:42 (gdb+0xa6b0c8)
>>      #18 handle_target_event gdb/linux-nat.c:4357 (gdb+0xb0f694)
>>      #19 handle_file_event gdbsupport/event-loop.cc:573 (gdb+0x1cfc03c)
>>      #20 gdb_wait_for_event gdbsupport/event-loop.cc:694 (gdb+0x1cfc700)
>>      #21 gdb_do_one_event(int) gdbsupport/event-loop.cc:217 (gdb+0x1cfa8ac)
>>      #22 start_event_loop gdb/main.c:408 (gdb+0xb7be9c)
>>      #23 captured_command_loop gdb/main.c:472 (gdb+0xb7c0cc)
>>      #24 captured_main gdb/main.c:1342 (gdb+0xb7e4e4)
>>      #25 gdb_main(captured_main_args*) gdb/main.c:1361 (gdb+0xb7e594)
>>      #26 main gdb/gdb.c:39 (gdb+0x423ce8)
>>
>> SUMMARY: ThreadSanitizer: heap-use-after-free gdb/linux-nat.c:2809 in select_event_lwp
>> ...
>>
>> Since heap-use-after-free is essentially an address sanitizer complaint, I
>> also tried building gdb with -O0 -fsanitize=address, but with this setup it
>> doesn't seem to trigger (0 times out of 10).
>>
>> The heap-use-after-free happens during the following scenario:
>> - linux_nat_wait_1 selects an LWP thread T1 with a status to report.
>> - it sets variable lp to point to the corresponding lwp_info.
>> - it calls stop_callback and stop_wait_callback for all threads
>>    (because !target_is_non_stop_p ()).
>> - it calls select_event_lwp to maybe pick another thread than T1, to prevent
>>    starvation.
>>
>> The problem seems to be the following:
>> - while calling stop_wait_callback for all threads, it also does this for T1.
>>    While doing so, the corresponding lwp_info is deleted (callstack
>>    stop_wait_callback -> wait_lwp -> exit_lwp -> delete_lwp), leaving variable
>>    lp as a dangling pointer.
>> - variable lp is passed to select_event_lwp, which dereferences it, causing
>>    the heap-use-after-free.
>>
>> Note that the comment here mentions "all other LWP's":
>> ...
>>        /* Now stop all other LWP's ...  */
>>        iterate_over_lwps (minus_one_ptid, stop_callback);
>>        /* ... and wait until all of them have reported back that
>>          they're no longer running.  */
>>        iterate_over_lwps (minus_one_ptid, stop_wait_callback);
>> ...
>> which presumably means other than the one in lp, but the iterators
>> don't skip lp.
>>
>> Fix this by making the code match the comment, and skipping stop_callback and
>> stop_wait_callback for lp.
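
As a rough illustration of the fix described above, here is a minimal
standalone sketch. It is not gdb code: toy_lwp, toy_lwp_list,
toy_stop_wait and toy_stop_wait_others are hypothetical stand-ins for
lwp_info, the LWP list, stop_wait_callback and the patched loop. The
only assumption carried over from the description is that waiting on an
exited LWP frees its entry, so the event LWP has to be skipped for lp to
stay valid.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <memory>

// Hypothetical stand-in for gdb's lwp_info; only what this sketch needs.
struct toy_lwp
{
  int id;
  bool exited;	// pretend this thread's exit is already pending
};

using toy_lwp_list = std::list<std::unique_ptr<toy_lwp>>;

// Toy analogue of stop_wait_callback -> wait_lwp -> exit_lwp -> delete_lwp:
// waiting on an exited LWP removes and frees its entry.
// Returns true if LP was deleted; LP must not be used afterwards.
static bool
toy_stop_wait (toy_lwp_list &lwps, toy_lwp *lp)
{
  if (!lp->exited)
    return false;

  auto it = std::find_if (lwps.begin (), lwps.end (),
			  [lp] (const std::unique_ptr<toy_lwp> &p)
			  { return p.get () == lp; });
  if (it == lwps.end ())
    return false;

  lwps.erase (it);
  return true;
}

// Wait on every LWP except EVENT_LP, so EVENT_LP cannot be freed here.
// Returns the number of entries deleted.
static int
toy_stop_wait_others (toy_lwp_list &lwps, toy_lwp *event_lp)
{
  int deleted = 0;

  for (auto it = lwps.begin (); it != lwps.end ();)
    {
      // Save the successor first: erasing *it must not break the walk.
      auto next = std::next (it);
      toy_lwp *other_lp = it->get ();

      if (other_lp != event_lp && toy_stop_wait (lwps, other_lp))
	deleted++;

      it = next;
    }

  return deleted;
}
```

With three entries where the first two have a pending exit and the first
one is the event LWP, only the second entry is deleted, and the event
pointer can still be dereferenced afterwards. Without the
other_lp != event_lp check, the first entry would be freed as well and
the caller's pointer would dangle.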
>>
>> Tested on aarch64-linux.
>>
>> PR gdb/31259
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31259
>> ---
>>   gdb/linux-nat.c | 16 ++++++++++++++--
>>   1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
>> index e91c57ba239..8bfae8555fc 100644
>> --- a/gdb/linux-nat.c
>> +++ b/gdb/linux-nat.c
>> @@ -3375,11 +3375,23 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
>>     if (!target_is_non_stop_p ())
>>       {
>>         /* Now stop all other LWP's ...  */
>> -      iterate_over_lwps (minus_one_ptid, stop_callback);
>> +      for (lwp_info *other_lp : all_lwps_safe ())
>> +	{
>> +	  if (other_lp == lp)
>> +	    continue;
>> +
>> +	  stop_callback (other_lp);
>> +	}
>>   
>>         /* ... and wait until all of them have reported back that
>>   	 they're no longer running.  */
>> -      iterate_over_lwps (minus_one_ptid, stop_wait_callback);
>> +      for (lwp_info *other_lp : all_lwps_safe ())
>> +	{
>> +	  if (other_lp == lp)
>> +	    continue;
>> +
>> +	  stop_wait_callback (other_lp);
>> +	}
> 
> I did a bit of archeology to see how this code evolved, and I noticed
> this change in commit 9c02b52532 ("linux-nat.c: better starvation
> avoidance, handle non-stop mode too"):
> 
> https://gitlab.com/gnutools/binutils-gdb/-/commit/9c02b52532ac?view=parallel#a360e5f37ff035d1ed6814cb60de9f2826b55788_3373_3362
> 
> Previously, we had:
> 
>    lp->stopped = 1;
> 
> just before the snippet you modify.  Both stop_callback and
> stop_wait_callback are no-ops if `lp->stopped` is true, so that would
> make them skip over the event thread.  The commit removed that, so I
> suppose that the problem was introduced in that commit.
> 
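
To make the older behavior concrete, here is a small sketch of the guard
Simon describes. It is a toy model, not the real callbacks: guard_lwp,
guard_stop_callback and guard_stop_all are hypothetical names, and the
stop_requests counter only exists to make the effect observable. The
assumption taken from the review is that the callbacks return early
when the stopped flag is set.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for lwp_info with just the flag under discussion.
struct guard_lwp
{
  int id;
  int stopped;		// non-zero: already known to be stopped
  int stop_requests;	// counts how often we asked this LWP to stop
};

// Toy analogue of stop_callback: a no-op for an LWP already marked stopped.
static int
guard_stop_callback (guard_lwp *lp)
{
  if (lp->stopped)
    return 0;

  lp->stop_requests++;
  return 0;
}

// Toy analogue of iterating the callback over all LWPs.
static void
guard_stop_all (std::vector<guard_lwp> &lwps)
{
  for (guard_lwp &lp : lwps)
    guard_stop_callback (&lp);
}
```

If the event LWP has stopped set to 1 before guard_stop_all runs, it is
left untouched while every other entry gets one stop request, which is
how the iteration used to skip the event thread implicitly.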

Hi Simon,

thanks for the review.

> Ideally, Pedro should look at this, but in the meantime:
> 

cc-ing Pedro.

Thanks,
- Tom

> Reviewed-By: Simon Marchi <simon.marchi@efficios.com>
> 
> Simon

