From: Simon Marchi <simon.marchi@polymtl.ca>
To: Tom de Vries <tdevries@suse.de>, gdb-patches@sourceware.org
Subject: Re: [PATCH v2] [gdb] Fix heap-use-after-free in select_event_lwp
Date: Tue, 23 Jan 2024 11:08:19 -0500
Message-ID: <e1f9f082-032c-4d16-a1d1-f23a250e2128@polymtl.ca>
In-Reply-To: <20240123114830.20253-1-tdevries@suse.de>
On 2024-01-23 06:48, Tom de Vries wrote:
> When building gdb with -O0 -fsanitize=thread, and running test-case
> gdb.base/vfork-follow-parent.exp, 5 times out of 10 I run into:
> ...
> WARNING: ThreadSanitizer: heap-use-after-free (pid=249653)
> Write of size 4 at 0xffffee83055c by main thread:
> #0 select_event_lwp gdb/linux-nat.c:2809 (gdb+0xb0a65c)
> #1 linux_nat_wait_1 gdb/linux-nat.c:3389 (gdb+0xb0c470)
> #2 linux_nat_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/linux-nat.c:3560 (gdb+0xb0cfc8)
> #3 thread_db_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/linux-thread-db.c:1402 (gdb+0xb35958)
> #4 target_wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/target.c:2571 (gdb+0xfb6c34)
> #5 do_target_wait_1 gdb/infrun.c:4120 (gdb+0xa99dc4)
> #6 operator() gdb/infrun.c:4179 (gdb+0xa99f70)
> #7 do_target_wait gdb/infrun.c:4198 (gdb+0xa9a2bc)
> #8 fetch_inferior_event() gdb/infrun.c:4629 (gdb+0xa9b658)
> #9 inferior_event_handler(inferior_event_type) gdb/inf-loop.c:42 (gdb+0xa6b0c8)
> #10 handle_target_event gdb/linux-nat.c:4357 (gdb+0xb0f694)
> #11 handle_file_event gdbsupport/event-loop.cc:573 (gdb+0x1cfc03c)
> #12 gdb_wait_for_event gdbsupport/event-loop.cc:694 (gdb+0x1cfc700)
> #13 gdb_do_one_event(int) gdbsupport/event-loop.cc:217 (gdb+0x1cfa8ac)
> #14 start_event_loop gdb/main.c:408 (gdb+0xb7be9c)
> #15 captured_command_loop gdb/main.c:472 (gdb+0xb7c0cc)
> #16 captured_main gdb/main.c:1342 (gdb+0xb7e4e4)
> #17 gdb_main(captured_main_args*) gdb/main.c:1361 (gdb+0xb7e594)
> #18 main gdb/gdb.c:39 (gdb+0x423ce8)
>
> Previous write of size 8 at 0xffffee830558 by main thread:
> #0 operator delete(void*, unsigned long) <null> (libtsan.so.2+0x8fb14)
> #1 delete_lwp gdb/linux-nat.c:849 (gdb+0xb0384c)
> #2 exit_lwp gdb/linux-nat.c:924 (gdb+0xb03c4c)
> #3 wait_lwp gdb/linux-nat.c:2224 (gdb+0xb08404)
> #4 stop_wait_callback gdb/linux-nat.c:2458 (gdb+0xb092a8)
> #5 gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const gdb/../gdbsupport/function-view.h:326 (gdb+0xb15ab0)
> #6 gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) gdb/../gdbsupport/function-view.h:320 (gdb+0xb15b18)
> #7 gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const gdb/../gdbsupport/function-view.h:289 (gdb+0xb13e90)
> #8 iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) gdb/linux-nat.c:879 (gdb+0xb03a18)
> #9 linux_nat_wait_1 gdb/linux-nat.c:3382 (gdb+0xb0c3f8)
> #10 linux_nat_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/linux-nat.c:3560 (gdb+0xb0cfc8)
> #11 thread_db_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/linux-thread-db.c:1402 (gdb+0xb35958)
> #12 target_wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) gdb/target.c:2571 (gdb+0xfb6c34)
> #13 do_target_wait_1 gdb/infrun.c:4120 (gdb+0xa99dc4)
> #14 operator() gdb/infrun.c:4179 (gdb+0xa99f70)
> #15 do_target_wait gdb/infrun.c:4198 (gdb+0xa9a2bc)
> #16 fetch_inferior_event() gdb/infrun.c:4629 (gdb+0xa9b658)
> #17 inferior_event_handler(inferior_event_type) gdb/inf-loop.c:42 (gdb+0xa6b0c8)
> #18 handle_target_event gdb/linux-nat.c:4357 (gdb+0xb0f694)
> #19 handle_file_event gdbsupport/event-loop.cc:573 (gdb+0x1cfc03c)
> #20 gdb_wait_for_event gdbsupport/event-loop.cc:694 (gdb+0x1cfc700)
> #21 gdb_do_one_event(int) gdbsupport/event-loop.cc:217 (gdb+0x1cfa8ac)
> #22 start_event_loop gdb/main.c:408 (gdb+0xb7be9c)
> #23 captured_command_loop gdb/main.c:472 (gdb+0xb7c0cc)
> #24 captured_main gdb/main.c:1342 (gdb+0xb7e4e4)
> #25 gdb_main(captured_main_args*) gdb/main.c:1361 (gdb+0xb7e594)
> #26 main gdb/gdb.c:39 (gdb+0x423ce8)
>
> SUMMARY: ThreadSanitizer: heap-use-after-free gdb/linux-nat.c:2809 in select_event_lwp
> ...
>
> Since heap-use-after-free is essentially an address sanitizer complaint, I
> also tried building gdb with -O0 -fsanitize=address, but with this setup it
> doesn't seem to trigger (0 times out of 10).
>
> The heap-use-after-free happens during the following scenario:
> - linux_nat_wait_1 selects an LWP thread T1 with a status to report.
> - it sets variable lp to point to the corresponding lwp_info.
> - it calls stop_callback and stop_wait_callback for all threads
> (because !target_is_non_stop_p ()).
> - it calls select_event_lwp to maybe pick another thread than T1, to prevent
> starvation.
>
> The problem seems to be the following:
> - while calling stop_wait_callback for all threads, it also does this for T1.
> While doing so, the corresponding lwp_info is deleted (callstack
> stop_wait_callback -> wait_lwp -> exit_lwp -> delete_lwp), leaving variable
> lp as a dangling pointer.
> - variable lp is passed to select_event_lwp, which dereferences it, which
> causes the heap-use-after-free.
>
> Note that the comment here mentions "all other LWP's":
> ...
> /* Now stop all other LWP's ... */
> iterate_over_lwps (minus_one_ptid, stop_callback);
> /* ... and wait until all of them have reported back that
> they're no longer running. */
> iterate_over_lwps (minus_one_ptid, stop_wait_callback);
> ...
> which presumably means other than the one in lp, but the iterators
> don't skip lp.
>
> Fix this by making the code match the comment, and skipping stop_callback and
> stop_wait_callback for lp.
>
> Tested on aarch64-linux.
>
> PR gdb/31259
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31259
> ---
> gdb/linux-nat.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index e91c57ba239..8bfae8555fc 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -3375,11 +3375,23 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
> if (!target_is_non_stop_p ())
> {
> /* Now stop all other LWP's ... */
> - iterate_over_lwps (minus_one_ptid, stop_callback);
> + for (lwp_info *other_lp : all_lwps_safe ())
> + {
> + if (other_lp == lp)
> + continue;
> +
> + stop_callback (other_lp);
> + }
>
> /* ... and wait until all of them have reported back that
> they're no longer running. */
> - iterate_over_lwps (minus_one_ptid, stop_wait_callback);
> + for (lwp_info *other_lp : all_lwps_safe ())
> + {
> + if (other_lp == lp)
> + continue;
> +
> + stop_wait_callback (other_lp);
> + }
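
A minimal sketch of the pattern in the hunk above, assuming a much simplified
model rather than the real linux-nat.c code: `Lwp`, `LwpList`, `wait_for_stop`,
`stop_others` and `is_listed` are hypothetical stand-ins (for lwp_info, the LWP
list, stop_wait_callback and the patched loop). It shows a callback that may
delete the element it visits, and a deletion-safe walk that skips the event LWP
so that the caller's pointer stays valid.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <memory>

// Hypothetical stand-in for lwp_info; only what the sketch needs.
struct Lwp
{
  int id;
  bool exited;  // models "found to have exited while waiting for its stop"
};

using LwpList = std::list<std::unique_ptr<Lwp>>;

// Models stop_wait_callback -> wait_lwp -> exit_lwp -> delete_lwp: an
// exited LWP is erased from the list (and freed).  Returns true if so.
bool
wait_for_stop (LwpList &lwps, LwpList::iterator it)
{
  if (!(*it)->exited)
    return false;
  lwps.erase (it);
  return true;
}

// Deletion-safe walk over all LWPs, skipping SKIP (may be null, which
// models the unpatched loop).  Returns the number of LWPs deleted.
int
stop_others (LwpList &lwps, const Lwp *skip)
{
  int deleted = 0;
  for (auto it = lwps.begin (); it != lwps.end ();)
    {
      auto next = std::next (it);  // saved before IT may be erased
      if (it->get () != skip && wait_for_stop (lwps, it))
        ++deleted;
      it = next;
    }
  return deleted;
}

// True if an LWP with this id is still in the list.
bool
is_listed (const LwpList &lwps, int id)
{
  for (const auto &p : lwps)
    if (p->id == id)
      return true;
  return false;
}

// Event LWP 1 has exited; skipping it keeps the caller's pointer valid.
void
demo_skip_event_lwp ()
{
  LwpList lwps;
  lwps.push_back (std::make_unique<Lwp> (Lwp{1, true}));
  lwps.push_back (std::make_unique<Lwp> (Lwp{2, false}));
  const Lwp *lp = lwps.front ().get ();
  int deleted = stop_others (lwps, lp);
  assert (deleted == 0 && is_listed (lwps, 1));
  (void) deleted;
}
```

Passing a null SKIP to `stop_others` in the same setup would erase LWP 1,
which is the situation in which a pointer saved before the walk is left
dangling.
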
I did a bit of archeology to see how this code evolved, and I noticed
this change in commit 9c02b52532 ("linux-nat.c: better starvation
avoidance, handle non-stop mode too"):
https://gitlab.com/gnutools/binutils-gdb/-/commit/9c02b52532ac?view=parallel#a360e5f37ff035d1ed6814cb60de9f2826b55788_3373_3362
Previously, we had:

    lp->stopped = 1;

just before the snippet you modify. Both stop_callback and
stop_wait_callback are no-ops if `lp->stopped` is true, so that would
make them skip over the event thread. The commit removed that, so I
suppose that the problem was introduced in that commit.
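
To make the effect of that assignment concrete, here is a small model of a
callback guarded by a `stopped` flag. It is only a sketch under assumptions:
`GuardedLwp`, `stop_cb` and `stop_all` are hypothetical names, not the actual
linux-nat.c definitions. Once the flag is set on the event LWP, walking over
every LWP leaves that one untouched.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for lwp_info with just the fields used here.
struct GuardedLwp
{
  int id;
  bool stopped;       // models lp->stopped
  int stop_requests;  // how many times the callback actually acted
};

// Models a callback that is a no-op for an LWP already marked stopped.
int
stop_cb (GuardedLwp &lp)
{
  if (lp.stopped)
    return 0;
  ++lp.stop_requests;
  lp.stopped = true;
  return 0;
}

// Models iterating over every LWP, the event LWP included.
void
stop_all (std::vector<GuardedLwp> &lwps)
{
  for (GuardedLwp &lp : lwps)
    stop_cb (lp);
}

// Marking the event LWP as stopped first makes the walk skip it.
void
demo_stopped_guard ()
{
  std::vector<GuardedLwp> lwps = { {1, false, 0}, {2, false, 0} };
  lwps[0].stopped = true;  // the assignment that used to precede the walk
  stop_all (lwps);
  assert (lwps[0].stop_requests == 0);
  assert (lwps[1].stop_requests == 1);
}
```
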
Ideally, Pedro should look at this, but in the meantime:
Reviewed-By: Simon Marchi <simon.marchi@efficios.com>
Simon