public inbox for gdb-patches@sourceware.org
From: Bruno Larsen <blarsen@redhat.com>
To: "Aktemur, Tankut Baris" <tankut.baris.aktemur@intel.com>,
	"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [PATCH v2 2/2] gdb: raise and handle NOT_AVAILABLE_ERROR when accessing frame PC
Date: Wed, 23 Mar 2022 10:34:56 -0300
Message-ID: <59a6639f-4873-66b9-71f9-733fa656c4ca@redhat.com>
In-Reply-To: <MWHPR1101MB2271B92929814AB965001AA4C4189@MWHPR1101MB2271.namprd11.prod.outlook.com>

On 3/23/22 09:55, Aktemur, Tankut Baris wrote:
> On Wednesday, March 16, 2022 6:27 PM, Bruno Larsen wrote:
>> Hello Tankut,
>>
>> On 2/8/22 06:15, Tankut Baris Aktemur via Gdb-patches wrote:
>>> This patch can be considered a continuation of
>>>
>>>     commit 4778a5f87d253399083565b4919816f541ebe414
>>>     Author: Tom de Vries <tdevries@suse.de>
>>>     Date:   Tue Apr 21 15:45:57 2020 +0200
>>>
>>>       [gdb] Fix hang after ext sigkill
>>>
>>> and
>>>
>>>     commit 47f1aceffa02be4726b854082d7587eb259136e0
>>>     Author: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
>>>     Date:   Thu May 14 13:59:54 2020 +0200
>>>
>>>       gdb/infrun: handle already-exited threads when attempting to stop
>>>
>>> If a process dies before GDB reports the exit error to the user, we
>>> may see the "Couldn't get registers: No such process." error message
>>> in various places.  For instance:
>>>
>>>     (gdb) start
>>>     ...
>>>     (gdb) info inferior
>>>       Num  Description       Connection           Executable
>>>     * 1    process 31943     1 (native)           /tmp/a.out
>>>     (gdb) shell kill -9 31943
>>>     (gdb) maintenance flush register-cache
>>>     Register cache flushed.
>>>     Couldn't get registers: No such process.
>>>     (gdb) info threads
>>>       Id   Target Id              Frame
>>>     * 1    process 31943 "a.out" Couldn't get registers: No such process.
>>>     (gdb) backtrace
>>>     Python Exception <class 'gdb.error'>: Couldn't get registers: No such process.
>>>     Couldn't get registers: No such process.
>>>     (gdb) inferior 1
>>>     Couldn't get registers: No such process.
>>>     (gdb) thread
>>>     [Current thread is 1 (process 31943)]
>>>     Couldn't get registers: No such process.
>>>     (gdb)
>>>
>>> The gdb.threads/killed-outside.exp, gdb.multi/multi-kill.exp, and
>>> gdb.multi/multi-exit.exp tests also check related scenarios.
>>>
>>> To improve the situation,
>>>
>>> 1. when printing the frame info, catch and process a NOT_AVAILABLE_ERROR.
>>>
>>> 2. when accessing the target to fetch registers, if the operation
>>>      fails, raise a NOT_AVAILABLE_ERROR instead of a generic error, so
>>>      that clients can attempt to recover accordingly.  This patch updates
>>>      the amd64_linux_nat_target and remote_target in this direction.
>>>
>>> With this patch, we obtain the following behavior:
>>>
>>>     (gdb) start
>>>     ...
>>>     (gdb) info inferior
>>>       Num  Description       Connection           Executable
>>>     * 1    process 748       1 (native)           /tmp/a.out
>>>     (gdb) shell kill -9 748
>>>     (gdb) maintenance flush register-cache
>>>     Register cache flushed.
>>>     (gdb) info threads
>>>       Id   Target Id           Frame
>>>     * 1    process 748 "a.out" <PC register is not available>
>>>     (gdb) backtrace
>>>     #0  <PC register is not available>
>>>     Backtrace stopped: not enough registers or memory available to unwind further
>>>     (gdb) inferior 1
>>>     [Switching to inferior 1 [process 748] (/tmp/a.out)]
>>>     [Switching to thread 1 (process 748)]
>>>     #0  <PC register is not available>
>>>     (gdb) thread
>>>     [Current thread is 1 (process 748)]
>>>     (gdb)
>>>
>>> Here is another "before/after" case.  Suppose we have two inferiors,
>>> each having its own remote target underneath.  Before this patch, we
>>> get the following output:
>>>
>>>     # Create two inferiors on two remote targets, resume both until
>>>     # termination.  Exit event from one of them is shown first, but the
>>>     # other also exited -- just not yet shown.
>>>     (gdb) maint set target-non-stop on
>>>     (gdb) target remote | gdbserver - ./a.out
>>>     (gdb) add-inferior -no-connection
>>>     (gdb) inferior 2
>>>     (gdb) target remote | gdbserver - ./a.out
>>>     (gdb) set schedule-multiple on
>>>     (gdb) continue
>>>     ...
>>>     [Inferior 2 (process 22127) exited normally]
>>>     (gdb) inferior 1
>>>     [Switching to inferior 1 [process 22111] (target:/tmp/a.out)]
>>>     [Switching to thread 1.1 (Thread 22111.22111)]
>>>     Could not read registers; remote failure reply 'E01'
>>>     (gdb) info threads
>>>       Id   Target Id                  Frame
>>>     * 1.1  Thread 22111.22111 "a.out" Could not read registers; remote failure reply 'E01'
>>>     (gdb) backtrace
>>>     Python Exception <class 'gdb.error'>: Could not read registers; remote failure reply 'E01'
>>>     Could not read registers; remote failure reply 'E01'
>>>     (gdb) thread
>>>     [Current thread is 1.1 (Thread 22111.22111)]
>>>     Could not read registers; remote failure reply 'E01'
>>>     (gdb)
>>>
>>> With this patch, it becomes:
>>>
>>>     ...
>>>     [Inferior 1 (process 11759) exited normally]
>>>     (gdb) inferior 2
>>>     [Switching to inferior 2 [process 13440] (target:/path/to/a.out)]
>>>     [Switching to thread 2.1 (Thread 13440.13440)]
>>>     #0  <unavailable> in ?? ()
>>>     (gdb) info threads
>>>       Id   Target Id                   Frame
>>>     * 2.1  Thread 13440.13440 "a.out" <unavailable> in ?? ()
>>>     (gdb) backtrace
>>>     #0  <unavailable> in ?? ()
>>>     Backtrace stopped: not enough registers or memory available to unwind further
>>>     (gdb) thread
>>>     [Current thread is 2.1 (Thread 13440.13440)]
>>>     (gdb)
>>>
>>> Finally, together with its predecessor, this patch also fixes PR gdb/26877.
>>
>> While I think this is a good idea, it doesn't seem to fix the root cause of the bug you
>> mentioned. It does stop the crash that the bug reports, but I would say the actual issue is
>> that GDB is not noticing that the second inferior is also finished. My 2 cents, for what
>> they're worth.
> 
> The root cause was an unhandled error in a destructor.  The 2-inferior setup was
> just one way to expose it.  From https://sourceware.org/bugzilla/show_bug.cgi?id=26877#c0:
> 
> 	The problem is at:
> 
> 	#20 0x00005555561128b9 in program_space::~program_space (this=0x55555830a070, __in_chrg=<optimized out>) at gdb/progspace.c:153
> 
> 	While inside a destructor, GDB wanted to access the frame information
> 	of Inferior 2 in a series of calls.  But because the process is dead, its
> 	registers cannot be read.  This raises an error inside a destructor, leading
> 	to termination of GDB.
> 
> From that perspective, I think the root cause is fixed.
> 

That is the root cause of the crash, but GDB is still unaware that the
second inferior has finished, which remains a problem, IMHO.
Regardless, this looks like a good cleanup: even if we fix GDB so that
it notices the process finishing, the same crash could happen in other
situations.
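
To make the destructor point concrete, here is a minimal standalone
sketch, not GDB code.  It assumes a hypothetical not_available_error
type and hypothetical read_pc / describe_frame / space names, none of
which come from the patch.  Since C++11 destructors are implicitly
noexcept, so an exception that escapes one calls std::terminate.
Catching the typed error while describing the frame keeps the
destructor safe:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a typed "not available" error.
// This is NOT GDB's actual exception type.
struct not_available_error : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Hypothetical register read: fails when the process is gone.
long
read_pc (bool process_alive)
{
  if (!process_alive)
    throw not_available_error ("PC register is not available");
  return 0x401000;
}

// Describe the frame, recovering from the typed error instead of
// letting it propagate to the caller.
std::string
describe_frame (bool process_alive)
{
  try
    {
      return "pc=" + std::to_string (read_pc (process_alive));
    }
  catch (const not_available_error &e)
    {
      return std::string ("<") + e.what () + ">";
    }
}

// Hypothetical object whose destructor needs frame information.
struct space
{
  space (bool alive, std::string *log) : m_alive (alive), m_log (log) {}

  ~space ()
  {
    // Implicitly noexcept: if describe_frame let the error escape,
    // the program would terminate here.
    *m_log = describe_frame (m_alive);
  }

  bool m_alive;
  std::string *m_log;
};

// Create and destroy a space, returning what its destructor logged.
std::string
destroy_space (bool alive)
{
  std::string log;
  {
    space s (alive, &log);
  } // ~space runs here.
  return log;
}
```

Calling destroy_space (false) returns normally with the placeholder
text.  If the try/catch in describe_frame were removed, the same call
would terminate the program.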


>> The explanation in the commit message is great! It explains the problem quite well, I just
>> don't understand why you only changed amd64_linux_nat_target and remote. I imagine this
>> issue happens with all targets. I'd ask at least that some of the most common ones be
>> changed and validated.
> 
> Those two targets are the ones I can test reliably.  For the others,
> unfortunately I don't have a reliable way of regression-testing.
> 

Ah, then I guess we should keep this fix in mind if someone reports a similar bug on another target.
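
For whoever picks up another target later, the target-side half of the
idea might look roughly like the standalone sketch below.  Everything
in it is an assumption for illustration only: the not_available_error
type, the throw_fetch_error and classify_fetch helpers, and in
particular the choice to treat only ESRCH ("No such process") as "not
available".  The patch description says only that a failed fetch should
raise NOT_AVAILABLE_ERROR instead of a generic error.

```cpp
#include <cerrno>
#include <stdexcept>
#include <string>

// Hypothetical typed error; NOT GDB's actual exception type.
struct not_available_error : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Hypothetical helper a target could call after a failed register
// fetch.  Assumption: ESRCH means the process is gone, so the
// registers are "not available"; anything else stays a generic error.
[[noreturn]] void
throw_fetch_error (int saved_errno)
{
  const std::string msg = "Couldn't get registers";
  if (saved_errno == ESRCH)
    throw not_available_error (msg + ": No such process.");
  throw std::runtime_error (msg);
}

enum class fetch_result { ok, unavailable, failed };

// Client side: tell the recoverable case apart from the generic one.
fetch_result
classify_fetch (int saved_errno)
{
  fetch_result result = fetch_result::ok;
  if (saved_errno != 0)
    {
      try
        {
          throw_fetch_error (saved_errno);
        }
      catch (const not_available_error &)
        {
          // The derived type must be caught before its base class.
          result = fetch_result::unavailable;
        }
      catch (const std::runtime_error &)
        {
          result = fetch_result::failed;
        }
    }
  return result;
}
```

The point is only that a distinct error type lets callers choose to
recover, which a generic error does not allow.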

>> Also, some extra testing revealed that the previous patch is not
>> actually necessary to fix the crash.
> 
> That's possible.  The bug report in PR/26877 was just a starting point
> for the submitted patches, which aim at addressing a more general problem.
>   
>> As for technical review, I don't have any questions or comments, but I can't approve
>> patches.
>   
> Thanks for your comments!
> 
> Regards
> -Baris
> 


-- 
Cheers!
Bruno Larsen


Thread overview: 28+ messages
2022-02-08  9:15 [PATCH v2 0/2] Querying registers of already-exited processes Tankut Baris Aktemur
2022-02-08  9:15 ` [PATCH v2 1/2] gdb/regcache: return REG_UNAVAILABLE if raw_update raises NOT_AVAILABLE_ERROR Tankut Baris Aktemur
2022-03-16 15:18   ` Bruno Larsen
2022-03-23 12:55     ` Aktemur, Tankut Baris
2022-02-08  9:15 ` [PATCH v2 2/2] gdb: raise and handle NOT_AVAILABLE_ERROR when accessing frame PC Tankut Baris Aktemur
2022-03-16 17:26   ` Bruno Larsen
2022-03-23 12:55     ` Aktemur, Tankut Baris
2022-03-23 13:34       ` Bruno Larsen [this message]
2022-03-24  8:46         ` Aktemur, Tankut Baris
2022-02-22  7:31 ` [PATCH v2 0/2] Querying registers of already-exited processes Aktemur, Tankut Baris
2022-03-07  8:00 ` Aktemur, Tankut Baris
2022-03-23 13:05 ` [PATCH v3 " Tankut Baris Aktemur
2022-03-23 13:05   ` [PATCH v3 1/2] gdb/regcache: return REG_UNAVAILABLE in raw_read if NOT_AVAILABLE_ERROR is seen Tankut Baris Aktemur
2022-03-23 13:05   ` [PATCH v3 2/2] gdb: raise and handle NOT_AVAILABLE_ERROR when accessing frame PC Tankut Baris Aktemur
2022-05-04  7:19   ` [PATCH v3 0/2] Querying registers of already-exited processes Aktemur, Tankut Baris
2022-12-23 17:10     ` Aktemur, Tankut Baris
2023-01-17 20:40       ` Aktemur, Tankut Baris
2023-01-24 10:35       ` Aktemur, Tankut Baris
2023-01-31 20:14       ` Aktemur, Tankut Baris
2023-02-20 13:07       ` Aktemur, Tankut Baris
2023-03-03  7:46       ` Aktemur, Tankut Baris
2023-03-28 13:40       ` Aktemur, Tankut Baris
2023-12-18 14:40   ` [PATCH v4 " Tankut Baris Aktemur
2023-12-18 14:40     ` [PATCH v4 1/2] gdb/regcache: return REG_UNAVAILABLE in raw_read if NOT_AVAILABLE_ERROR is seen Tankut Baris Aktemur
2023-12-18 14:40     ` [PATCH v4 2/2] gdb: raise and handle NOT_AVAILABLE_ERROR when accessing frame PC Tankut Baris Aktemur
2023-12-20 22:00       ` John Baldwin
2023-12-21  6:41         ` Eli Zaretskii
2023-12-27 18:41           ` Aktemur, Tankut Baris
