From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id D4B5D3858412 for ; Mon, 24 Oct 2022 15:50:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D4B5D3858412 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca Received: from [172.16.0.64] (192-222-180-24.qc.cable.ebox.net [192.222.180.24]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 0B8791E0CB; Mon, 24 Oct 2022 11:50:03 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail; t=1666626604; bh=8f4KR4WoXVfyl/BK3t/DNEgQVR+229l6+FxMOzeM2p8=; h=Date:Subject:To:References:From:In-Reply-To:From; b=AJ/69xABKXyYscuQkGBsAP9nPbHBD5EpTNwST+9PMa4Hr8ZvnsBSAjYD468p2HG+K Ofc3E9SDKhFYpYqfibF/d64NUZnilg+UFk9V0wTVwayoB3+ODg4oLq+eTUtw81LtX4 bZAs3g7/izRiro0cGqltI+lrlwqJ/t+wIJ4YHiE4= Message-ID: <499ecf26-0f57-2376-a617-9b6214319b4a@simark.ca> Date: Mon, 24 Oct 2022 11:50:03 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.2 Subject: Re: [PATCH] gdb/python: avoid throwing an exception over libopcodes code To: Andrew Burgess , gdb-patches@sourceware.org References: <20221024125016.2823862-1-aburgess@redhat.com> Content-Language: fr From: Simon Marchi In-Reply-To: <20221024125016.2823862-1-aburgess@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 10/24/22 08:50, Andrew Burgess via Gdb-patches wrote: > Bug gdb/29712 identifies a problem with the Python disassembler API. > In some cases GDB will try to throw an exception through the > libopcodes disassembler code, however, not all targets include > exception unwind information when compiling C code, for targets that > don't include this information GDB will terminate when trying to pass > the exception through libopcodes. > > To explain what GDB is trying to do, consider the following trivial > use of the Python disassembler API: > > class ExampleDisassembler(gdb.disassembler.Disassembler): > > class MyInfo(gdb.disassembler.DisassembleInfo): > def __init__(self, info): > super().__init__(info) > > def read_memory(self, length, offset): > return super().read_memory(length, offset) > > def __init__(self): > super().__init__("ExampleDisassembler") > > def __call__(self, info): > info = self.MyInfo(info) > return gdb.disassembler.builtin_disassemble(info) > > This disassembler doesn't add any value, it defers back to GDB to do > all the actual work, but it serves to allow us to discuss the problem. > > The problem occurs when a Python exception is raised by the > MyInfo.read_memory method. The MyInfo.read_memory method is called > from the C++ function gdbpy_disassembler::read_memory_func. The C++ > stack at the point this function is called looks like this: > > #0 gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510 > #1 0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305 > #2 0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571 > #3 0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548 > #4 0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949 > #5 0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033 > #6 0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072 > #7 0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351 > #8 0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324 > > ### ... snip lots of frames as we pass through Python itself ... > > #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783 > #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939 > #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078 > #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101 > > So gdbpy_disassembler::read_memory_func is called from the libopcodes > disassembler to read memory, this C++ function then calls into user > supplied Python code to do the work. > > If the user supplied Python code raises an gdb.MemoryError exception > indicating the memory read failed, this this is fine. The C++ code > converts this exception back into a return value that libopcodes can > understand, and returns to libopcodes. > > However, if the user supplied Python code raises some other exception, > what we want is for this exception to propagate through GDB and appear > as if raised by the call to gdb.disassembler.builtin_disassemble. To > achieve this, when gdbpy_disassembler::read_memory_func spots an > unknown Python exception, we must pass the information about this > exception from frame #0 to frame #8 in the above backtrace. Frame #8 > is the C++ implementation of gdb.disassembler.builtin_disassemble, and > so it is this function that we want to re-raise the unknown Python > exception, so the user can, if they want, catch the exception in their > code. > > The previous mechanism by which the exception was passed was to pack > the details of the Python exception into a C++ exception, then throw > the exception from frame #0, and catch the exception in frame #8, > unpack the details of the Python exception, and re-raise it. > > However, this relies on the exception passing through frames #1 to #7, > some of which are in libopcodes, which is C code, and so, might not be > compiled with exception support. Can we make gdbpy_disassembler::read_memory_func "noexcept", so that if an exception escapes that function, regardless of the architecture, there will be an apparent GDB crash? > > This commit proposes an alternative solution that does not rely on > throwing a C++ exception. > > When we spot an unhandled Python exception in frame #0, we will store > the details of this exception within the gdbpy_disassembler object > currently in use. Then we return to libopcodes a value indicating > that the memory_read failed. > > libopcodes will now continue to disassemble as though that memory read > failed (with one special case described below), then, when we > eventually return to disasmpy_builtin_disassemble we check to see if > there is an exception stored in the gdbpy_disassembler object. If > there is then this exception can immediately be installed, and then we > return back to Python, when the user will be able to catch the > exception. > > There is one extra change in gdbpy_disassembler::read_memory_func. > After the first call that results in an exception being stored on the > gdbpy_disassembler object, any future calls to the ::read_memory_func > function will immediately return as if the read failed. This avoids > any additional calls into user supplied Python code. > > My thinking here is that should the first call fail with some unknown > error, GDB should not keep trying with any additional calls. This > maintains the illusion that the exception raised from > MyInfo.read_memory is immediately raised by > gdb.disassembler.builtin_disassemble. I have no tests for this change > though - to trigger this issue would rely on a libopcodes disassembler > that will try to read further memory even after the first failed > read. I'm not aware of any such disassembler that currently does > this, but that doesn't mean such a disassembler couldn't exist in the > future. I don't really understand the need for this change. The read_memory_func interface should mostly be stateless, if you try to read an invalid address, you get an error. If you then try to read a valid address, I don't see why that wouldn't work. Apart from that design decision, the code itself looks good to me. > @@ -138,6 +160,23 @@ struct gdbpy_disassembler : public gdb_printing_disassembler > memory source object then a pointer to the object is placed in here, > otherwise, this field is nullptr. */ > PyObject *m_memory_source; > + > + /* Move the exception EX into this disassembler object. */ > + void store_exception (gdbpy_err_fetch &&ex) > + { > + if (!m_stored_exception.has_value ()) > + m_stored_exception.emplace (std::move (ex)); When would m_stored_exception already have a value? I'm thinking this should be an assertion instead: gdb_assert (!m_stored_exception.has_value ()); > @@ -513,7 +555,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff, > exception and throw it, this will then be caught in > disasmpy_builtin_disassemble, at which point the exception will be > restored. */ > - throw gdbpy_err_fetch (); > + dis->store_exception (std::move (gdbpy_err_fetch ())); I believe these std::moves are unnecessary. The result of gdbpy_err_fetch() is already a temporary that will be moved from. Simon