public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] gdb: fix dealloc function not being called for frame 0
@ 2023-02-09 19:50 Simon Marchi
  2023-02-10  3:10 ` Simon Marchi
  2023-02-15 18:01 ` Tom de Vries
  0 siblings, 2 replies; 7+ messages in thread
From: Simon Marchi @ 2023-02-09 19:50 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Tom de Vries

Tom de Vries reported [1] a regression in gdb.btrace/record_goto.exp
caused by 6d3717d4c4 ("gdb: call frame unwinders' dealloc_cache methods
through destroying the frame cache").  This issue is caught by ASan.  On
a non-ASan build, it may or may not cause a crash or some other issue, I
haven't tried.

I managed to narrow it down to:

    $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.btrace/record_goto/record_goto -ex "start" -ex "record btrace" -ex "next"

... and then doing repeatedly "record goto 19" and "record goto 27".
Eventually, I get:

    (gdb) record goto 27
    =================================================================
    ==1527735==ERROR: AddressSanitizer: heap-use-after-free on address 0x6210003392a8 at pc 0x55e4c26eef86 bp 0x7ffd229f24e0 sp 0x7ffd229f24d8
    READ of size 8 at 0x6210003392a8 thread T0
        #0 0x55e4c26eef85 in bfcache_eq /home/simark/src/binutils-gdb/gdb/record-btrace.c:1639
        #1 0x55e4c37cdeff in htab_find_slot_with_hash /home/simark/src/binutils-gdb/libiberty/hashtab.c:659
        #2 0x55e4c37ce24a in htab_find_slot /home/simark/src/binutils-gdb/libiberty/hashtab.c:703
        #3 0x55e4c26ef0c6 in bfcache_new /home/simark/src/binutils-gdb/gdb/record-btrace.c:1653
        #4 0x55e4c26f1242 in record_btrace_frame_sniffer /home/simark/src/binutils-gdb/gdb/record-btrace.c:1820
        #5 0x55e4c1b926a1 in frame_unwind_try_unwinder /home/simark/src/binutils-gdb/gdb/frame-unwind.c:136
        #6 0x55e4c1b930d7 in frame_unwind_find_by_frame(frame_info_ptr, void**) /home/simark/src/binutils-gdb/gdb/frame-unwind.c:196
        #7 0x55e4c1bb867f in get_frame_type(frame_info_ptr) /home/simark/src/binutils-gdb/gdb/frame.c:2925
        #8 0x55e4c2ae6798 in print_frame_info(frame_print_options const&, frame_info_ptr, int, print_what, int, int) /home/simark/src/binutils-gdb/gdb/stack.c:1049
        #9 0x55e4c2ade3e1 in print_stack_frame(frame_info_ptr, int, print_what, int) /home/simark/src/binutils-gdb/gdb/stack.c:367
        #10 0x55e4c26fda03 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2779
        #11 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
        #12 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
        #13 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
        #14 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383

    0x6210003392a8 is located 424 bytes inside of 4064-byte region [0x621000339100,0x62100033a0e0)
    freed by thread T0 here:
        #0 0x7f6ca34a5b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
        #1 0x55e4c38a4c17 in rpl_free /home/simark/src/binutils-gdb/gnulib/import/free.c:44
        #2 0x55e4c1bbd378 in xfree<void> /home/simark/src/binutils-gdb/gdb/../gdbsupport/gdb-xfree.h:37
        #3 0x55e4c37d1b63 in call_freefun /home/simark/src/binutils-gdb/libiberty/obstack.c:103
        #4 0x55e4c37d25a2 in _obstack_free /home/simark/src/binutils-gdb/libiberty/obstack.c:280
        #5 0x55e4c1bad701 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2112
        #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564
        #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573
        #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772
        #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
        #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
        #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
        #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383

    previously allocated by thread T0 here:
        #0 0x7f6ca34a5e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
        #1 0x55e4c0b55c60 in xmalloc /home/simark/src/binutils-gdb/gdb/alloc.c:57
        #2 0x55e4c37d1a6d in call_chunkfun /home/simark/src/binutils-gdb/libiberty/obstack.c:94
        #3 0x55e4c37d1c20 in _obstack_begin_worker /home/simark/src/binutils-gdb/libiberty/obstack.c:141
        #4 0x55e4c37d1ed7 in _obstack_begin /home/simark/src/binutils-gdb/libiberty/obstack.c:164
        #5 0x55e4c1bad728 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2113
        #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564
        #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573
        #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772
        #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
        #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
        #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
        #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383

The problem is a stale entry in the bfcache hash table (in
record-btrace.c), left across a reinit_frame_cache.  This entry points
to something that used to be allocated on the frame obstack, that has
since been wiped by reinit_frame_cache.

Before the aforementioned, unwinder deallocation functions were called
by iterating on the frame chain, starting with the sentinel frame, like
so:

  /* Tear down all frame caches.  */
  for (frame_info *fi = sentinel_frame; fi != NULL; fi = fi->prev)
    {
      if (fi->prologue_cache && fi->unwind->dealloc_cache)
	fi->unwind->dealloc_cache (fi, fi->prologue_cache);
      if (fi->base_cache && fi->base->unwind->dealloc_cache)
	fi->base->unwind->dealloc_cache (fi, fi->base_cache);
    }

After that patch, we relied on the fact that all frames are (supposedly)
in the frame_stash.  A deletion function was added to the frame_stash
hash table, so that dealloc functions would be called when emptying the
frame stash.  There is one case, however, where a frame_info is not in
the frame stash.  That is when we create the frame_info for the current
frame (level 0, unwound from the sentinel frame), but don't computed its
frame id.  The computation of the frame id for that frame (and only that
frame, AFAIK) is done lazily.  And putting a frame_info in the frame stash
requires knowing its id.  So a frame 0 whose frame id is not computed
yet is necessarily not in the frame stash.

When replaying with btrace, record_btrace_frame_sniffer insert entries
corresponding to frames in the "bfcache" hash table.  It then relies on
record_btrace_frame_dealloc_cache being called for each frame to remove
all those entries when the frames get invalidated.  If a frame reinit
happens while frame 0's id is not computed  (and thefore that frame is
not in frame_stash), record_btrace_frame_dealloc_cache does not get
called for it, and it leaves a stale entry in bfcache.  That then leads
to a use-after-free when that entry is accessed later, which ASan
catches.

The proposed solution is to explicitly call frame_info_del on frame 0,
if it exists, and if its frame id is not computed.  If it's frame id is
computed, it is expected that it will be in the frame stash, so it will
be "deleted" through that.

Reported-By: Tom de Vries <tdevries@suse.de>
Change-Id: I2351882dd511f3bbc01e4152e9db13b69b3ba384
---
 gdb/frame.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/gdb/frame.c b/gdb/frame.c
index 9235a2ceb38c..5c7e1fecd946 100644
--- a/gdb/frame.c
+++ b/gdb/frame.c
@@ -265,10 +265,8 @@ frame_addr_hash_eq (const void *a, const void *b)
 /* Deletion function for the frame cache hash table.  */
 
 static void
-frame_info_del (void *frame_v)
+frame_info_del (frame_info *frame)
 {
-  frame_info *frame = (frame_info *) frame_v;
-
   if (frame->prologue_cache != nullptr
       && frame->unwind->dealloc_cache != nullptr)
     frame->unwind->dealloc_cache (frame, frame->prologue_cache);
@@ -284,10 +282,13 @@ frame_info_del (void *frame_v)
 static void
 frame_stash_create (void)
 {
-  frame_stash = htab_create (100,
-			     frame_addr_hash,
-			     frame_addr_hash_eq,
-			     frame_info_del);
+  frame_stash = htab_create
+    (100, frame_addr_hash, frame_addr_hash_eq,
+     [] (void *p)
+       {
+	 auto frame = static_cast<frame_info *> (p);
+	 frame_info_del (frame);
+       });
 }
 
 /* Internal function to add a frame to the frame_stash hash table.
@@ -2105,7 +2106,23 @@ reinit_frame_cache (void)
   invalidate_selected_frame ();
 
   /* Invalidate cache.  */
-  sentinel_frame = NULL;
+  if (sentinel_frame != nullptr)
+    {
+      /* If frame 0's id is not computed, it is not in the frame stash, so its
+	 dealloc functions will not be called when emptying the frash stash.
+	 Call frame_info_del manually in that case.  */
+      frame_info *current_frame = sentinel_frame->prev;
+      if (current_frame != nullptr)
+	{
+	  gdb_assert (current_frame->this_id.p != frame_id_status::COMPUTING);
+
+	  if (current_frame->this_id.p == frame_id_status::NOT_COMPUTED)
+	    frame_info_del (current_frame);
+	}
+
+      sentinel_frame = nullptr;
+    }
+
   frame_stash_invalidate ();
 
   /* Since we can't really be sure what the first object allocated was.  */
-- 
2.39.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb: fix dealloc function not being called for frame 0
  2023-02-09 19:50 [PATCH] gdb: fix dealloc function not being called for frame 0 Simon Marchi
@ 2023-02-10  3:10 ` Simon Marchi
  2023-02-15  9:39   ` Tom de Vries
  2023-02-15 18:01 ` Tom de Vries
  1 sibling, 1 reply; 7+ messages in thread
From: Simon Marchi @ 2023-02-10  3:10 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Tom de Vries


> @@ -2105,7 +2106,23 @@ reinit_frame_cache (void)
>    invalidate_selected_frame ();
>  
>    /* Invalidate cache.  */
> -  sentinel_frame = NULL;
> +  if (sentinel_frame != nullptr)
> +    {
> +      /* If frame 0's id is not computed, it is not in the frame stash, so its
> +	 dealloc functions will not be called when emptying the frash stash.
> +	 Call frame_info_del manually in that case.  */
> +      frame_info *current_frame = sentinel_frame->prev;
> +      if (current_frame != nullptr)
> +	{
> +	  gdb_assert (current_frame->this_id.p != frame_id_status::COMPUTING);

Well, this gdb_assert happens to cause a failure in
gdb.server/server-kill.exp:

    (gdb) PASS: gdb.server/server-kill.exp: kill_pid_of=inferior: test_unwind_syms: p server_pid
    Executing on target: kill -9 1567851    (timeout = 300)
    builtin_spawn -ignore SIGHUP kill -9 1567851
    bt
    /home/simark/src/binutils-gdb/gdb/frame.c:2117: internal-error: reinit_frame_cache: Assertion `current_frame->this_id.p != frame_id_status::COMPUTING' failed.
    A problem internal to GDB has been detected, further debugging may prove unreliable.
    FAIL: gdb.server/server-kill.exp: kill_pid_of=inferior: test_unwind_syms: bt (GDB internal error)

This happens because the remote connection breaks while computing a
frame id.  The stack looks roughly like this (many frames removed for
readability):

0x55cd4623396c internal_error_loc(char const*, int, char const*, ...)
        /home/simark/src/binutils-gdb/gdbsupport/errors.cc:58
0x55cd4434f5a1 reinit_frame_cache()
        /home/simark/src/binutils-gdb/gdb/frame.c:2117
0x55cd457e1bce switch_to_no_thread()
        /home/simark/src/binutils-gdb/gdb/thread.c:1330
0x55cd44618270 switch_to_inferior_no_thread(inferior*)
        /home/simark/src/binutils-gdb/gdb/inferior.c:671
0x55cd451467c9 remote_unpush_target
        /home/simark/src/binutils-gdb/gdb/remote.c:5903
0x55cd45183b88 unpush_and_perror
        /home/simark/src/binutils-gdb/gdb/remote.c:9612
0x55cd451840ce remote_target::readchar(int)
        /home/simark/src/binutils-gdb/gdb/remote.c:9652
0x55cd45188e91 remote_target::getpkt_or_notif_sane_1(std::__debug::vector<char, gdb::default_init_allocator<char, std::allocator<char> > >*, int, int, int*)
        /home/simark/src/binutils-gdb/gdb/remote.c:10118
0x55cd4518a2fe remote_target::getpkt_sane(std::__debug::vector<char, gdb::default_init_allocator<char, std::allocator<char> > >*, int)
        /home/simark/src/binutils-gdb/gdb/remote.c:10220
0x55cd451889d6 remote_target::getpkt(std::__debug::vector<char, gdb::default_init_allocator<char, std::allocator<char> > >*, int)
        /home/simark/src/binutils-gdb/gdb/remote.c:10062
0x55cd45181137 remote_target::remote_read_bytes_1(unsigned long, unsigned char*, unsigned long, int, unsigned long*)
        /home/simark/src/binutils-gdb/gdb/remote.c:9379
0x55cd45182772 remote_target::remote_read_bytes(unsigned long, unsigned char*, unsigned long, int, unsigned long*)
        /home/simark/src/binutils-gdb/gdb/remote.c:9503
0x55cd4519cd2d remote_target::xfer_partial(target_object, char const*, unsigned char*, unsigned char const*, unsigned long, unsigned long, unsigned long*)
        /home/simark/src/binutils-gdb/gdb/remote.c:11421
...
0x55cd4433d812 compute_frame_id
        /home/simark/src/binutils-gdb/gdb/frame.c:606

So, I will remove that gdb_assert and simplify the code to:

      /* If frame 0's id is not computed, it is not in the frame stash, so its
	 dealloc functions will not be called when emptying the frash stash.
	 Call frame_info_del manually in that case.  */
      frame_info *current_frame = sentinel_frame->prev;
      if (current_frame != nullptr
	  && current_frame->this_id.p == frame_id_status::NOT_COMPUTED)
	frame_info_del (current_frame);

With that change, the test passes again.

If the state of the id of the current frame is "COMPUTING", I think it's
better not to call frame_info_del and the dealloc functions, because the
state of the per-unwinder cache object is maybe not in a state expected
that the dealloc functions expect.  If something goes wrong during the
computation of the frame id, I think it's best to let the unwinder make
sure to clean up anything it has started.

Simon

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb: fix dealloc function not being called for frame 0
  2023-02-10  3:10 ` Simon Marchi
@ 2023-02-15  9:39   ` Tom de Vries
  2023-02-15 16:42     ` Simon Marchi
  0 siblings, 1 reply; 7+ messages in thread
From: Tom de Vries @ 2023-02-15  9:39 UTC (permalink / raw)
  To: Simon Marchi, Simon Marchi, gdb-patches

[-- Attachment #1: Type: text/plain, Size: 4984 bytes --]

On 2/10/23 04:10, Simon Marchi wrote:
> 
>> @@ -2105,7 +2106,23 @@ reinit_frame_cache (void)
>>     invalidate_selected_frame ();
>>   
>>     /* Invalidate cache.  */
>> -  sentinel_frame = NULL;
>> +  if (sentinel_frame != nullptr)
>> +    {
>> +      /* If frame 0's id is not computed, it is not in the frame stash, so its
>> +	 dealloc functions will not be called when emptying the frash stash.
>> +	 Call frame_info_del manually in that case.  */
>> +      frame_info *current_frame = sentinel_frame->prev;
>> +      if (current_frame != nullptr)
>> +	{
>> +	  gdb_assert (current_frame->this_id.p != frame_id_status::COMPUTING);
> 
> Well, this gdb_assert happens to cause a failure in
> gdb.server/server-kill.exp:
> 
>      (gdb) PASS: gdb.server/server-kill.exp: kill_pid_of=inferior: test_unwind_syms: p server_pid
>      Executing on target: kill -9 1567851    (timeout = 300)
>      builtin_spawn -ignore SIGHUP kill -9 1567851
>      bt
>      /home/simark/src/binutils-gdb/gdb/frame.c:2117: internal-error: reinit_frame_cache: Assertion `current_frame->this_id.p != frame_id_status::COMPUTING' failed.
>      A problem internal to GDB has been detected, further debugging may prove unreliable.
>      FAIL: gdb.server/server-kill.exp: kill_pid_of=inferior: test_unwind_syms: bt (GDB internal error)
> 
> This happens because the remote connection breaks while computing a
> frame id.  The stack looks roughly like this (many frames removed for
> readability):
> 
> 0x55cd4623396c internal_error_loc(char const*, int, char const*, ...)
>          /home/simark/src/binutils-gdb/gdbsupport/errors.cc:58
> 0x55cd4434f5a1 reinit_frame_cache()
>          /home/simark/src/binutils-gdb/gdb/frame.c:2117
> 0x55cd457e1bce switch_to_no_thread()
>          /home/simark/src/binutils-gdb/gdb/thread.c:1330
> 0x55cd44618270 switch_to_inferior_no_thread(inferior*)
>          /home/simark/src/binutils-gdb/gdb/inferior.c:671
> 0x55cd451467c9 remote_unpush_target
>          /home/simark/src/binutils-gdb/gdb/remote.c:5903
> 0x55cd45183b88 unpush_and_perror
>          /home/simark/src/binutils-gdb/gdb/remote.c:9612
> 0x55cd451840ce remote_target::readchar(int)
>          /home/simark/src/binutils-gdb/gdb/remote.c:9652
> 0x55cd45188e91 remote_target::getpkt_or_notif_sane_1(std::__debug::vector<char, gdb::default_init_allocator<char, std::allocator<char> > >*, int, int, int*)
>          /home/simark/src/binutils-gdb/gdb/remote.c:10118
> 0x55cd4518a2fe remote_target::getpkt_sane(std::__debug::vector<char, gdb::default_init_allocator<char, std::allocator<char> > >*, int)
>          /home/simark/src/binutils-gdb/gdb/remote.c:10220
> 0x55cd451889d6 remote_target::getpkt(std::__debug::vector<char, gdb::default_init_allocator<char, std::allocator<char> > >*, int)
>          /home/simark/src/binutils-gdb/gdb/remote.c:10062
> 0x55cd45181137 remote_target::remote_read_bytes_1(unsigned long, unsigned char*, unsigned long, int, unsigned long*)
>          /home/simark/src/binutils-gdb/gdb/remote.c:9379
> 0x55cd45182772 remote_target::remote_read_bytes(unsigned long, unsigned char*, unsigned long, int, unsigned long*)
>          /home/simark/src/binutils-gdb/gdb/remote.c:9503
> 0x55cd4519cd2d remote_target::xfer_partial(target_object, char const*, unsigned char*, unsigned char const*, unsigned long, unsigned long, unsigned long*)
>          /home/simark/src/binutils-gdb/gdb/remote.c:11421
> ...
> 0x55cd4433d812 compute_frame_id
>          /home/simark/src/binutils-gdb/gdb/frame.c:606
> 
> So, I will remove that gdb_assert and simplify the code to:
> 
>        /* If frame 0's id is not computed, it is not in the frame stash, so its
> 	 dealloc functions will not be called when emptying the frash stash.
> 	 Call frame_info_del manually in that case.  */
>        frame_info *current_frame = sentinel_frame->prev;
>        if (current_frame != nullptr
> 	  && current_frame->this_id.p == frame_id_status::NOT_COMPUTED)
> 	frame_info_del (current_frame);
> 
> With that change, the test passes again.
> 
> If the state of the id of the current frame is "COMPUTING", I think it's
> better not to call frame_info_del and the dealloc functions, because the
> state of the per-unwinder cache object is maybe not in a state expected
> that the dealloc functions expect.  If something goes wrong during the
> computation of the frame id, I think it's best to let the unwinder make
> sure to clean up anything it has started.

Hi Simon,

I've:
- applied the initial patch
- verified that it fixes the gdb.btrace/record_goto.exp test-case
- verified that it breaks the gdb.server/server-kill.exp test-case
- applied your proposed change (attached in patch form for clarity)
- verified that it fixes the gdb.server/server-kill.exp test-case
- verified that it still fixes the gdb.btrace/record_goto.exp test-case
- done a complete build and test run, no issues found (in other words,
   I'm back to just one FAIL, in gdb.server/monitor-exit-quit.exp,
   PR29833)

Thanks,
- Tom

[-- Attachment #2: 0001-fixup.patch --]
[-- Type: text/x-patch, Size: 1073 bytes --]

From a63fe09438deba286464ac79685587890ea8ae39 Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@efficios.com>
Date: Wed, 15 Feb 2023 09:43:35 +0100
Subject: [PATCH] fixup

---
 gdb/frame.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/gdb/frame.c b/gdb/frame.c
index 69730979703..6629f444afe 100644
--- a/gdb/frame.c
+++ b/gdb/frame.c
@@ -2112,13 +2112,9 @@ reinit_frame_cache (void)
 	 dealloc functions will not be called when emptying the frash stash.
 	 Call frame_info_del manually in that case.  */
       frame_info *current_frame = sentinel_frame->prev;
-      if (current_frame != nullptr)
-	{
-	  gdb_assert (current_frame->this_id.p != frame_id_status::COMPUTING);
-
-	  if (current_frame->this_id.p == frame_id_status::NOT_COMPUTED)
-	    frame_info_del (current_frame);
-	}
+      if (current_frame != nullptr
+	  && current_frame->this_id.p == frame_id_status::NOT_COMPUTED)
+	frame_info_del (current_frame);
 
       sentinel_frame = nullptr;
     }

base-commit: b6daa77d0d91216266bccbe182cc7dc689591692
-- 
2.35.3


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb: fix dealloc function not being called for frame 0
  2023-02-15  9:39   ` Tom de Vries
@ 2023-02-15 16:42     ` Simon Marchi
  2023-02-15 17:56       ` Tom de Vries
  0 siblings, 1 reply; 7+ messages in thread
From: Simon Marchi @ 2023-02-15 16:42 UTC (permalink / raw)
  To: Tom de Vries, Simon Marchi, gdb-patches

> Hi Simon,
> 
> I've:
> - applied the initial patch
> - verified that it fixes the gdb.btrace/record_goto.exp test-case
> - verified that it breaks the gdb.server/server-kill.exp test-case
> - applied your proposed change (attached in patch form for clarity)
> - verified that it fixes the gdb.server/server-kill.exp test-case
> - verified that it still fixes the gdb.btrace/record_goto.exp test-case
> - done a complete build and test run, no issues found (in other words,
>   I'm back to just one FAIL, in gdb.server/monitor-exit-quit.exp,
>   PR29833)

Thanks for testing.  Should I add just your Tested-By, or did you review
more in-depth?

Simon


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb: fix dealloc function not being called for frame 0
  2023-02-15 16:42     ` Simon Marchi
@ 2023-02-15 17:56       ` Tom de Vries
  0 siblings, 0 replies; 7+ messages in thread
From: Tom de Vries @ 2023-02-15 17:56 UTC (permalink / raw)
  To: Simon Marchi, Simon Marchi, gdb-patches

On 2/15/23 17:42, Simon Marchi wrote:
>> Hi Simon,
>>
>> I've:
>> - applied the initial patch
>> - verified that it fixes the gdb.btrace/record_goto.exp test-case
>> - verified that it breaks the gdb.server/server-kill.exp test-case
>> - applied your proposed change (attached in patch form for clarity)
>> - verified that it fixes the gdb.server/server-kill.exp test-case
>> - verified that it still fixes the gdb.btrace/record_goto.exp test-case
>> - done a complete build and test run, no issues found (in other words,
>>    I'm back to just one FAIL, in gdb.server/monitor-exit-quit.exp,
>>    PR29833)
> 
> Thanks for testing.  Should I add just your Tested-By, or did you review
> more in-depth?
> 

Hi Simon,

just tested-by.

Unfortunately, the frame cache is just one of those things in gdb I'm 
wholly unfamiliar with, so I don't feel myself capable of reviewing 
without investing serious amounts of time.

I've looked through the commit message, and it sounds reasonable and 
well-explained to me, I just don't understand the full extent of it, 
nevermind the code change.

I did see a few typo's, I'll reply to the submission email to note those.

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb: fix dealloc function not being called for frame 0
  2023-02-09 19:50 [PATCH] gdb: fix dealloc function not being called for frame 0 Simon Marchi
  2023-02-10  3:10 ` Simon Marchi
@ 2023-02-15 18:01 ` Tom de Vries
  2023-02-15 18:45   ` Simon Marchi
  1 sibling, 1 reply; 7+ messages in thread
From: Tom de Vries @ 2023-02-15 18:01 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 2/9/23 20:50, Simon Marchi wrote:
> Tom de Vries reported [1] a regression in gdb.btrace/record_goto.exp
> caused by 6d3717d4c4 ("gdb: call frame unwinders' dealloc_cache methods
> through destroying the frame cache").  This issue is caught by ASan.  On
> a non-ASan build, it may or may not cause a crash or some other issue, I
> haven't tried.
> 
> I managed to narrow it down to:
> 
>      $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.btrace/record_goto/record_goto -ex "start" -ex "record btrace" -ex "next"
> 
> ... and then doing repeatedly "record goto 19" and "record goto 27".
> Eventually, I get:
> 
>      (gdb) record goto 27
>      =================================================================
>      ==1527735==ERROR: AddressSanitizer: heap-use-after-free on address 0x6210003392a8 at pc 0x55e4c26eef86 bp 0x7ffd229f24e0 sp 0x7ffd229f24d8
>      READ of size 8 at 0x6210003392a8 thread T0
>          #0 0x55e4c26eef85 in bfcache_eq /home/simark/src/binutils-gdb/gdb/record-btrace.c:1639
>          #1 0x55e4c37cdeff in htab_find_slot_with_hash /home/simark/src/binutils-gdb/libiberty/hashtab.c:659
>          #2 0x55e4c37ce24a in htab_find_slot /home/simark/src/binutils-gdb/libiberty/hashtab.c:703
>          #3 0x55e4c26ef0c6 in bfcache_new /home/simark/src/binutils-gdb/gdb/record-btrace.c:1653
>          #4 0x55e4c26f1242 in record_btrace_frame_sniffer /home/simark/src/binutils-gdb/gdb/record-btrace.c:1820
>          #5 0x55e4c1b926a1 in frame_unwind_try_unwinder /home/simark/src/binutils-gdb/gdb/frame-unwind.c:136
>          #6 0x55e4c1b930d7 in frame_unwind_find_by_frame(frame_info_ptr, void**) /home/simark/src/binutils-gdb/gdb/frame-unwind.c:196
>          #7 0x55e4c1bb867f in get_frame_type(frame_info_ptr) /home/simark/src/binutils-gdb/gdb/frame.c:2925
>          #8 0x55e4c2ae6798 in print_frame_info(frame_print_options const&, frame_info_ptr, int, print_what, int, int) /home/simark/src/binutils-gdb/gdb/stack.c:1049
>          #9 0x55e4c2ade3e1 in print_stack_frame(frame_info_ptr, int, print_what, int) /home/simark/src/binutils-gdb/gdb/stack.c:367
>          #10 0x55e4c26fda03 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2779
>          #11 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
>          #12 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
>          #13 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
>          #14 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383
> 
>      0x6210003392a8 is located 424 bytes inside of 4064-byte region [0x621000339100,0x62100033a0e0)
>      freed by thread T0 here:
>          #0 0x7f6ca34a5b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>          #1 0x55e4c38a4c17 in rpl_free /home/simark/src/binutils-gdb/gnulib/import/free.c:44
>          #2 0x55e4c1bbd378 in xfree<void> /home/simark/src/binutils-gdb/gdb/../gdbsupport/gdb-xfree.h:37
>          #3 0x55e4c37d1b63 in call_freefun /home/simark/src/binutils-gdb/libiberty/obstack.c:103
>          #4 0x55e4c37d25a2 in _obstack_free /home/simark/src/binutils-gdb/libiberty/obstack.c:280
>          #5 0x55e4c1bad701 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2112
>          #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564
>          #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573
>          #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772
>          #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
>          #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
>          #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
>          #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383
> 
>      previously allocated by thread T0 here:
>          #0 0x7f6ca34a5e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
>          #1 0x55e4c0b55c60 in xmalloc /home/simark/src/binutils-gdb/gdb/alloc.c:57
>          #2 0x55e4c37d1a6d in call_chunkfun /home/simark/src/binutils-gdb/libiberty/obstack.c:94
>          #3 0x55e4c37d1c20 in _obstack_begin_worker /home/simark/src/binutils-gdb/libiberty/obstack.c:141
>          #4 0x55e4c37d1ed7 in _obstack_begin /home/simark/src/binutils-gdb/libiberty/obstack.c:164
>          #5 0x55e4c1bad728 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2113
>          #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564
>          #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573
>          #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772
>          #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
>          #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
>          #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
>          #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383
> 
> The problem is a stale entry in the bfcache hash table (in
> record-btrace.c), left across a reinit_frame_cache.  This entry points
> to something that used to be allocated on the frame obstack, that has
> since been wiped by reinit_frame_cache.
> 
> Before the aforementioned, unwinder deallocation functions were called
> by iterating on the frame chain, starting with the sentinel frame, like
> so:
> 
>    /* Tear down all frame caches.  */
>    for (frame_info *fi = sentinel_frame; fi != NULL; fi = fi->prev)
>      {
>        if (fi->prologue_cache && fi->unwind->dealloc_cache)
> 	fi->unwind->dealloc_cache (fi, fi->prologue_cache);
>        if (fi->base_cache && fi->base->unwind->dealloc_cache)
> 	fi->base->unwind->dealloc_cache (fi, fi->base_cache);
>      }
> 
> After that patch, we relied on the fact that all frames are (supposedly)
> in the frame_stash.  A deletion function was added to the frame_stash
> hash table, so that dealloc functions would be called when emptying the
> frame stash.  There is one case, however, where a frame_info is not in
> the frame stash.  That is when we create the frame_info for the current
> frame (level 0, unwound from the sentinel frame), but don't computed its

computed -> compute

> frame id.  The computation of the frame id for that frame (and only that
> frame, AFAIK) is done lazily.  And putting a frame_info in the frame stash
> requires knowing its id.  So a frame 0 whose frame id is not computed
> yet is necessarily not in the frame stash.
> 
> When replaying with btrace, record_btrace_frame_sniffer insert entries
> corresponding to frames in the "bfcache" hash table.  It then relies on
> record_btrace_frame_dealloc_cache being called for each frame to remove
> all those entries when the frames get invalidated.  If a frame reinit
> happens while frame 0's id is not computed  (and thefore that frame is

thefore -> therefore

> not in frame_stash), record_btrace_frame_dealloc_cache does not get
> called for it, and it leaves a stale entry in bfcache.  That then leads
> to a use-after-free when that entry is accessed later, which ASan
> catches.
> 
> The proposed solution is to explicitly call frame_info_del on frame 0,
> if it exists, and if its frame id is not computed.  If it's frame id is

it's -> its

Thanks,
- Tom

> computed, it is expected that it will be in the frame stash, so it will
> be "deleted" through that.
> 
> Reported-By: Tom de Vries <tdevries@suse.de>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] gdb: fix dealloc function not being called for frame 0
  2023-02-15 18:01 ` Tom de Vries
@ 2023-02-15 18:45   ` Simon Marchi
  0 siblings, 0 replies; 7+ messages in thread
From: Simon Marchi @ 2023-02-15 18:45 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 2/15/23 13:01, Tom de Vries wrote:
> On 2/9/23 20:50, Simon Marchi wrote:
>> Tom de Vries reported [1] a regression in gdb.btrace/record_goto.exp
>> caused by 6d3717d4c4 ("gdb: call frame unwinders' dealloc_cache methods
>> through destroying the frame cache").  This issue is caught by ASan.  On
>> a non-ASan build, it may or may not cause a crash or some other issue, I
>> haven't tried.
>>
>> I managed to narrow it down to:
>>
>>      $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.btrace/record_goto/record_goto -ex "start" -ex "record btrace" -ex "next"
>>
>> ... and then doing repeatedly "record goto 19" and "record goto 27".
>> Eventually, I get:
>>
>>      (gdb) record goto 27
>>      =================================================================
>>      ==1527735==ERROR: AddressSanitizer: heap-use-after-free on address 0x6210003392a8 at pc 0x55e4c26eef86 bp 0x7ffd229f24e0 sp 0x7ffd229f24d8
>>      READ of size 8 at 0x6210003392a8 thread T0
>>          #0 0x55e4c26eef85 in bfcache_eq /home/simark/src/binutils-gdb/gdb/record-btrace.c:1639
>>          #1 0x55e4c37cdeff in htab_find_slot_with_hash /home/simark/src/binutils-gdb/libiberty/hashtab.c:659
>>          #2 0x55e4c37ce24a in htab_find_slot /home/simark/src/binutils-gdb/libiberty/hashtab.c:703
>>          #3 0x55e4c26ef0c6 in bfcache_new /home/simark/src/binutils-gdb/gdb/record-btrace.c:1653
>>          #4 0x55e4c26f1242 in record_btrace_frame_sniffer /home/simark/src/binutils-gdb/gdb/record-btrace.c:1820
>>          #5 0x55e4c1b926a1 in frame_unwind_try_unwinder /home/simark/src/binutils-gdb/gdb/frame-unwind.c:136
>>          #6 0x55e4c1b930d7 in frame_unwind_find_by_frame(frame_info_ptr, void**) /home/simark/src/binutils-gdb/gdb/frame-unwind.c:196
>>          #7 0x55e4c1bb867f in get_frame_type(frame_info_ptr) /home/simark/src/binutils-gdb/gdb/frame.c:2925
>>          #8 0x55e4c2ae6798 in print_frame_info(frame_print_options const&, frame_info_ptr, int, print_what, int, int) /home/simark/src/binutils-gdb/gdb/stack.c:1049
>>          #9 0x55e4c2ade3e1 in print_stack_frame(frame_info_ptr, int, print_what, int) /home/simark/src/binutils-gdb/gdb/stack.c:367
>>          #10 0x55e4c26fda03 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2779
>>          #11 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
>>          #12 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
>>          #13 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
>>          #14 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383
>>
>>      0x6210003392a8 is located 424 bytes inside of 4064-byte region [0x621000339100,0x62100033a0e0)
>>      freed by thread T0 here:
>>          #0 0x7f6ca34a5b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>>          #1 0x55e4c38a4c17 in rpl_free /home/simark/src/binutils-gdb/gnulib/import/free.c:44
>>          #2 0x55e4c1bbd378 in xfree<void> /home/simark/src/binutils-gdb/gdb/../gdbsupport/gdb-xfree.h:37
>>          #3 0x55e4c37d1b63 in call_freefun /home/simark/src/binutils-gdb/libiberty/obstack.c:103
>>          #4 0x55e4c37d25a2 in _obstack_free /home/simark/src/binutils-gdb/libiberty/obstack.c:280
>>          #5 0x55e4c1bad701 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2112
>>          #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564
>>          #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573
>>          #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772
>>          #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
>>          #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
>>          #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
>>          #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383
>>
>>      previously allocated by thread T0 here:
>>          #0 0x7f6ca34a5e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
>>          #1 0x55e4c0b55c60 in xmalloc /home/simark/src/binutils-gdb/gdb/alloc.c:57
>>          #2 0x55e4c37d1a6d in call_chunkfun /home/simark/src/binutils-gdb/libiberty/obstack.c:94
>>          #3 0x55e4c37d1c20 in _obstack_begin_worker /home/simark/src/binutils-gdb/libiberty/obstack.c:141
>>          #4 0x55e4c37d1ed7 in _obstack_begin /home/simark/src/binutils-gdb/libiberty/obstack.c:164
>>          #5 0x55e4c1bad728 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2113
>>          #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564
>>          #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573
>>          #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772
>>          #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
>>          #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
>>          #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
>>          #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383
>>
>> The problem is a stale entry in the bfcache hash table (in
>> record-btrace.c), left across a reinit_frame_cache.  This entry points
>> to something that used to be allocated on the frame obstack, that has
>> since been wiped by reinit_frame_cache.
>>
>> Before the aforementioned, unwinder deallocation functions were called
>> by iterating on the frame chain, starting with the sentinel frame, like
>> so:
>>
>>    /* Tear down all frame caches.  */
>>    for (frame_info *fi = sentinel_frame; fi != NULL; fi = fi->prev)
>>      {
>>        if (fi->prologue_cache && fi->unwind->dealloc_cache)
>>     fi->unwind->dealloc_cache (fi, fi->prologue_cache);
>>        if (fi->base_cache && fi->base->unwind->dealloc_cache)
>>     fi->base->unwind->dealloc_cache (fi, fi->base_cache);
>>      }
>>
>> After that patch, we relied on the fact that all frames are (supposedly)
>> in the frame_stash.  A deletion function was added to the frame_stash
>> hash table, so that dealloc functions would be called when emptying the
>> frame stash.  There is one case, however, where a frame_info is not in
>> the frame stash.  That is when we create the frame_info for the current
>> frame (level 0, unwound from the sentinel frame), but don't computed its
> 
> computed -> compute

Fixed.

>> frame id.  The computation of the frame id for that frame (and only that
>> frame, AFAIK) is done lazily.  And putting a frame_info in the frame stash
>> requires knowing its id.  So a frame 0 whose frame id is not computed
>> yet is necessarily not in the frame stash.
>>
>> When replaying with btrace, record_btrace_frame_sniffer insert entries
>> corresponding to frames in the "bfcache" hash table.  It then relies on
>> record_btrace_frame_dealloc_cache being called for each frame to remove
>> all those entries when the frames get invalidated.  If a frame reinit
>> happens while frame 0's id is not computed  (and thefore that frame is
> 
> thefore -> therefore

Fixed.

> 
>> not in frame_stash), record_btrace_frame_dealloc_cache does not get
>> called for it, and it leaves a stale entry in bfcache.  That then leads
>> to a use-after-free when that entry is accessed later, which ASan
>> catches.
>>
>> The proposed solution is to explicitly call frame_info_del on frame 0,
>> if it exists, and if its frame id is not computed.  If it's frame id is
> 
> it's -> its

Fixed.

I will push the patch shortly, thanks.

Simon

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-02-15 18:45 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-09 19:50 [PATCH] gdb: fix dealloc function not being called for frame 0 Simon Marchi
2023-02-10  3:10 ` Simon Marchi
2023-02-15  9:39   ` Tom de Vries
2023-02-15 16:42     ` Simon Marchi
2023-02-15 17:56       ` Tom de Vries
2023-02-15 18:01 ` Tom de Vries
2023-02-15 18:45   ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).