From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id D6A153845188 for ; Mon, 28 Nov 2022 14:39:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D6A153845188 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca Received: from [10.0.0.11] (unknown [217.28.27.60]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id CBFEC1E0CB; Mon, 28 Nov 2022 09:39:04 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail; t=1669646344; bh=lEWbzNJCGV2uGkE5Q81+HlvMva9w2YMJuNH87XmZToc=; h=Date:Subject:To:References:From:In-Reply-To:From; b=o5hIyhb8FU2GA0T/9/1o5laGcXEQu7uzkQ8bFYMq6muMMkuf50zb97f6QygACNbd1 hOdF5kpvzXPmyn5CJGE31sX2eqDWsx1+4tu3yTeEUEBKewuGnGPqMR0J1DFkX8g8fY pkvwp82w1pnjLrKIy+qUu2EZDZ5cZuKyoyA1hbKc= Message-ID: Date: Mon, 28 Nov 2022 09:39:04 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2 Subject: Re: [PATCHv2 1/3] gdb/python: avoid throwing an exception over libopcodes code Content-Language: en-US To: Andrew Burgess , gdb-patches@sourceware.org References: <20221024125016.2823862-1-aburgess@redhat.com> <2924e27d733e1b398654bea3ae1e04b875e5fe43.1666884855.git.aburgess@redhat.com> From: Simon Marchi In-Reply-To: <2924e27d733e1b398654bea3ae1e04b875e5fe43.1666884855.git.aburgess@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 10/27/22 11:38, Andrew Burgess wrote: > Bug gdb/29712 identifies a problem with the Python disassembler API. > In some cases GDB will try to throw an exception through the > libopcodes disassembler code, however, not all targets include > exception unwind information when compiling C code, for targets that > don't include this information GDB will terminate when trying to pass > the exception through libopcodes. > > To explain what GDB is trying to do, consider the following trivial > use of the Python disassembler API: > > class ExampleDisassembler(gdb.disassembler.Disassembler): > > class MyInfo(gdb.disassembler.DisassembleInfo): > def __init__(self, info): > super().__init__(info) > > def read_memory(self, length, offset): > return super().read_memory(length, offset) > > def __init__(self): > super().__init__("ExampleDisassembler") > > def __call__(self, info): > info = self.MyInfo(info) > return gdb.disassembler.builtin_disassemble(info) > > This disassembler doesn't add any value, it defers back to GDB to do > all the actual work, but it serves to allow us to discuss the problem. > > The problem occurs when a Python exception is raised by the > MyInfo.read_memory method. The MyInfo.read_memory method is called > from the C++ function gdbpy_disassembler::read_memory_func. The C++ > stack at the point this function is called looks like this: > > #0 gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510 > #1 0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305 > #2 0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571 > #3 0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548 > #4 0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949 > #5 0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033 > #6 0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072 > #7 0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351 > #8 0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324 > > ### ... snip lots of frames as we pass through Python itself ... > > #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783 > #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939 > #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078 > #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101 > > So gdbpy_disassembler::read_memory_func is called from the libopcodes > disassembler to read memory, this C++ function then calls into user > supplied Python code to do the work. > > If the user supplied Python code raises an gdb.MemoryError exception > indicating the memory read failed, this this is fine. The C++ code "this this" > converts this exception back into a return value that libopcodes can > understand, and returns to libopcodes. > > However, if the user supplied Python code raises some other exception, > what we want is for this exception to propagate through GDB and appear > as if raised by the call to gdb.disassembler.builtin_disassemble. To > achieve this, when gdbpy_disassembler::read_memory_func spots an > unknown Python exception, we must pass the information about this > exception from frame #0 to frame #8 in the above backtrace. Frame #8 > is the C++ implementation of gdb.disassembler.builtin_disassemble, and > so it is this function that we want to re-raise the unknown Python > exception, so the user can, if they want, catch the exception in their > code. > > The previous mechanism by which the exception was passed was to pack > the details of the Python exception into a C++ exception, then throw > the exception from frame #0, and catch the exception in frame #8, > unpack the details of the Python exception, and re-raise it. > > However, this relies on the exception passing through frames #1 to #7, > some of which are in libopcodes, which is C code, and so, might not be > compiled with exception support. > > This commit proposes an alternative solution that does not rely on > throwing a C++ exception. > > When we spot an unhandled Python exception in frame #0, we will store > the details of this exception within the gdbpy_disassembler object > currently in use. Then we return to libopcodes a value indicating > that the memory_read failed. > > libopcodes will now continue to disassemble as though that memory read > failed (with one special case described below), then, when we > eventually return to disasmpy_builtin_disassemble we check to see if > there is an exception stored in the gdbpy_disassembler object. If > there is then this exception can immediately be installed, and then we > return back to Python, when the user will be able to catch the > exception. > > There is one extra change in gdbpy_disassembler::read_memory_func. > After the first call that results in an exception being stored on the > gdbpy_disassembler object, any future calls to the ::read_memory_func > function will immediately return as if the read failed. This avoids > any additional calls into user supplied Python code. > > My thinking here is that should the first call fail with some unknown > error, GDB should not keep trying with any additional calls. This > maintains the illusion that the exception raised from > MyInfo.read_memory is immediately raised by > gdb.disassembler.builtin_disassemble. I have no tests for this change > though - to trigger this issue would rely on a libopcodes disassembler > that will try to read further memory even after the first failed > read. I'm not aware of any such disassembler that currently does > this, but that doesn't mean such a disassembler couldn't exist in the > future. After understanding this, I'm fine with the idea. You can add: Approved-By: Simon Marchi Simon