public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] gdb/python: avoid throwing an exception over libopcodes code
@ 2022-10-24 12:50 Andrew Burgess
  2022-10-24 15:50 ` Simon Marchi
  2022-10-27 15:38 ` [PATCHv2 0/3] " Andrew Burgess
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-10-24 12:50 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Bug gdb/29712 identifies a problem with the Python disassembler API.
In some cases GDB will try to throw an exception through the
libopcodes disassembler code, however, not all targets include
exception unwind information when compiling C code, for targets that
don't include this information GDB will terminate when trying to pass
the exception through libopcodes.

To explain what GDB is trying to do, consider the following trivial
use of the Python disassembler API:

  class ExampleDisassembler(gdb.disassembler.Disassembler):

      class MyInfo(gdb.disassembler.DisassembleInfo):
          def __init__(self, info):
              super().__init__(info)

          def read_memory(self, length, offset):
              return super().read_memory(length, offset)

      def __init__(self):
          super().__init__("ExampleDisassembler")

      def __call__(self, info):
          info = self.MyInfo(info)
          return gdb.disassembler.builtin_disassemble(info)

This disassembler doesn't add any value, it defers back to GDB to do
all the actual work, but it serves to allow us to discuss the problem.

The problem occurs when a Python exception is raised by the
MyInfo.read_memory method.  The MyInfo.read_memory method is called
from the C++ function gdbpy_disassembler::read_memory_func.  The C++
stack at the point this function is called looks like this:

  #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
  #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
  #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
  #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
  #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
  #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
  #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
  #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
  #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324

  ### ... snip lots of frames as we pass through Python itself ...

  #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
  #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
  #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
  #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101

So gdbpy_disassembler::read_memory_func is called from the libopcodes
disassembler to read memory, this C++ function then calls into user
supplied Python code to do the work.

If the user supplied Python code raises an gdb.MemoryError exception
indicating the memory read failed, this this is fine.  The C++ code
converts this exception back into a return value that libopcodes can
understand, and returns to libopcodes.

However, if the user supplied Python code raises some other exception,
what we want is for this exception to propagate through GDB and appear
as if raised by the call to gdb.disassembler.builtin_disassemble.  To
achieve this, when gdbpy_disassembler::read_memory_func spots an
unknown Python exception, we must pass the information about this
exception from frame #0 to frame #8 in the above backtrace.  Frame #8
is the C++ implementation of gdb.disassembler.builtin_disassemble, and
so it is this function that we want to re-raise the unknown Python
exception, so the user can, if they want, catch the exception in their
code.

The previous mechanism by which the exception was passed was to pack
the details of the Python exception into a C++ exception, then throw
the exception from frame #0, and catch the exception in frame #8,
unpack the details of the Python exception, and re-raise it.

However, this relies on the exception passing through frames #1 to #7,
some of which are in libopcodes, which is C code, and so, might not be
compiled with exception support.

This commit proposes an alternative solution that does not rely on
throwing a C++ exception.

When we spot an unhandled Python exception in frame #0, we will store
the details of this exception within the gdbpy_disassembler object
currently in use.  Then we return to libopcodes a value indicating
that the memory_read failed.

libopcodes will now continue to disassemble as though that memory read
failed (with one special case described below), then, when we
eventually return to disasmpy_builtin_disassemble we check to see if
there is an exception stored in the gdbpy_disassembler object.  If
there is then this exception can immediately be installed, and then we
return back to Python, when the user will be able to catch the
exception.

There is one extra change in gdbpy_disassembler::read_memory_func.
After the first call that results in an exception being stored on the
gdbpy_disassembler object, any future calls to the ::read_memory_func
function will immediately return as if the read failed.  This avoids
any additional calls into user supplied Python code.

My thinking here is that should the first call fail with some unknown
error, GDB should not keep trying with any additional calls.  This
maintains the illusion that the exception raised from
MyInfo.read_memory is immediately raised by
gdb.disassembler.builtin_disassemble.  I have no tests for this change
though - to trigger this issue would rely on a libopcodes disassembler
that will try to read further memory even after the first failed
read.  I'm not aware of any such disassembler that currently does
this, but that doesn't mean such a disassembler couldn't exist in the
future.

With this change in place the gdb.python/py-disasm.exp test should now
pass on AArch64.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29712
---
 gdb/python/py-disasm.c | 77 +++++++++++++++++++++++++++++++++---------
 1 file changed, 61 insertions(+), 16 deletions(-)

diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
index c37452fcf72..819679bd604 100644
--- a/gdb/python/py-disasm.c
+++ b/gdb/python/py-disasm.c
@@ -122,6 +122,28 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
     return m_string_file.release ();
   }
 
+  /* If there is a Python exception stored in this disassembler then
+     restore it (i.e. set the PyErr_* state), clear the exception within
+     this disassembler, and return true.  There must be no current
+     exception set (i.e. !PyErr_Occurred()) when this function is called,
+     as any such exception might get lost.
+
+     Otherwise, there is no exception stored in this disassembler, return
+     false.  */
+  bool restore_exception ()
+  {
+    gdb_assert (!PyErr_Occurred ());
+    if (m_stored_exception.has_value ())
+      {
+	gdbpy_err_fetch ex = std::move (*m_stored_exception);
+	m_stored_exception.reset ();
+	ex.restore ();
+	return true;
+      }
+
+    return false;
+  }
+
 private:
 
   /* Where the disassembler result is written.  */
@@ -138,6 +160,23 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
      memory source object then a pointer to the object is placed in here,
      otherwise, this field is nullptr.  */
   PyObject *m_memory_source;
+
+  /* Move the exception EX into this disassembler object.  */
+  void store_exception (gdbpy_err_fetch &&ex)
+  {
+    if (!m_stored_exception.has_value ())
+      m_stored_exception.emplace (std::move (ex));
+  }
+
+  /* Return true if there is an exception stored in this disassembler.  */
+  bool has_stored_exception () const
+  {
+    return m_stored_exception.has_value ();
+  }
+
+  /* Store a single exception.  This is used to pass Python exceptions back
+     from ::memory_read to disasmpy_builtin_disassemble.  */
+  gdb::optional<gdbpy_err_fetch> m_stored_exception;
 };
 
 /* Return true if OBJ is still valid, otherwise, return false.  A valid OBJ
@@ -288,20 +327,15 @@ disasmpy_builtin_disassemble (PyObject *self, PyObject *args, PyObject *kw)
      the disassembled instruction, or -1 if there was a memory-error
      encountered while disassembling.  See below more more details on
      handling of -1 return value.  */
-  int length;
-  try
-    {
-      length = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address,
+  int length = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address,
 				   disassembler.disasm_info ());
-    }
-  catch (gdbpy_err_fetch &pyerr)
-    {
-      /* Reinstall the Python exception held in PYERR.  This clears to
-	 pointers held in PYERR, hence the need to catch as a non-const
-	 reference.  */
-      pyerr.restore ();
-      return nullptr;
-    }
+
+  /* It is possible that, while calling a user overridden memory read
+     function, a Python exception was raised that couldn't be
+     translated into a standard memory-error.  In this case the first such
+     exception is stored in the disassembler and restored here.  */
+  if (disassembler.restore_exception ())
+    return nullptr;
 
   if (length == -1)
     {
@@ -483,6 +517,14 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
     = static_cast<gdbpy_disassembler *> (info->application_data);
   disasm_info_object *obj = dis->py_disasm_info ();
 
+  /* If a previous read attempt resulted in an exception, then we don't
+     allow any further reads to succeed.  We only do this check for the
+     read_memory_func as this is the only one the user can hook into,
+     thus, this check prevents us calling back into user code if a
+     previous call has already thrown an error.  */
+  if (dis->has_stored_exception ())
+    return -1;
+
   /* The DisassembleInfo.read_memory method expects an offset from the
      address stored within the DisassembleInfo object; calculate that
      offset here.  */
@@ -513,7 +555,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 	 exception and throw it, this will then be caught in
 	 disasmpy_builtin_disassemble, at which point the exception will be
 	 restored.  */
-      throw gdbpy_err_fetch ();
+      dis->store_exception (std::move (gdbpy_err_fetch ()));
+      return -1;
     }
 
   /* Convert the result to a buffer.  */
@@ -523,7 +566,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
     {
       PyErr_Format (PyExc_TypeError,
 		    _("Result from read_memory is not a buffer"));
-      throw gdbpy_err_fetch ();
+      dis->store_exception (std::move (gdbpy_err_fetch ()));
+      return -1;
     }
 
   /* Wrap PY_BUFF so that it is cleaned up correctly at the end of this
@@ -536,7 +580,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
       PyErr_Format (PyExc_ValueError,
 		    _("Buffer returned from read_memory is sized %d instead of the expected %d"),
 		    py_buff.len, len);
-      throw gdbpy_err_fetch ();
+      dis->store_exception (std::move (gdbpy_err_fetch ()));
+      return -1;
     }
 
   /* Copy the data out of the Python buffer and return success.  */
-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] gdb/python: avoid throwing an exception over libopcodes code
  2022-10-24 12:50 [PATCH] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
@ 2022-10-24 15:50 ` Simon Marchi
  2022-10-24 17:22   ` Andrew Burgess
  2022-10-27 15:38 ` [PATCHv2 0/3] " Andrew Burgess
  1 sibling, 1 reply; 18+ messages in thread
From: Simon Marchi @ 2022-10-24 15:50 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On 10/24/22 08:50, Andrew Burgess via Gdb-patches wrote:
> Bug gdb/29712 identifies a problem with the Python disassembler API.
> In some cases GDB will try to throw an exception through the
> libopcodes disassembler code, however, not all targets include
> exception unwind information when compiling C code, for targets that
> don't include this information GDB will terminate when trying to pass
> the exception through libopcodes.
> 
> To explain what GDB is trying to do, consider the following trivial
> use of the Python disassembler API:
> 
>   class ExampleDisassembler(gdb.disassembler.Disassembler):
> 
>       class MyInfo(gdb.disassembler.DisassembleInfo):
>           def __init__(self, info):
>               super().__init__(info)
> 
>           def read_memory(self, length, offset):
>               return super().read_memory(length, offset)
> 
>       def __init__(self):
>           super().__init__("ExampleDisassembler")
> 
>       def __call__(self, info):
>           info = self.MyInfo(info)
>           return gdb.disassembler.builtin_disassemble(info)
> 
> This disassembler doesn't add any value, it defers back to GDB to do
> all the actual work, but it serves to allow us to discuss the problem.
> 
> The problem occurs when a Python exception is raised by the
> MyInfo.read_memory method.  The MyInfo.read_memory method is called
> from the C++ function gdbpy_disassembler::read_memory_func.  The C++
> stack at the point this function is called looks like this:
> 
>   #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
>   #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
>   #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
>   #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
>   #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
>   #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
>   #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
>   #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
>   #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324
> 
>   ### ... snip lots of frames as we pass through Python itself ...
> 
>   #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
>   #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
>   #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
>   #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101
> 
> So gdbpy_disassembler::read_memory_func is called from the libopcodes
> disassembler to read memory, this C++ function then calls into user
> supplied Python code to do the work.
> 
> If the user supplied Python code raises an gdb.MemoryError exception
> indicating the memory read failed, this this is fine.  The C++ code
> converts this exception back into a return value that libopcodes can
> understand, and returns to libopcodes.
> 
> However, if the user supplied Python code raises some other exception,
> what we want is for this exception to propagate through GDB and appear
> as if raised by the call to gdb.disassembler.builtin_disassemble.  To
> achieve this, when gdbpy_disassembler::read_memory_func spots an
> unknown Python exception, we must pass the information about this
> exception from frame #0 to frame #8 in the above backtrace.  Frame #8
> is the C++ implementation of gdb.disassembler.builtin_disassemble, and
> so it is this function that we want to re-raise the unknown Python
> exception, so the user can, if they want, catch the exception in their
> code.
> 
> The previous mechanism by which the exception was passed was to pack
> the details of the Python exception into a C++ exception, then throw
> the exception from frame #0, and catch the exception in frame #8,
> unpack the details of the Python exception, and re-raise it.
> 
> However, this relies on the exception passing through frames #1 to #7,
> some of which are in libopcodes, which is C code, and so, might not be
> compiled with exception support.

Can we make gdbpy_disassembler::read_memory_func "noexcept", so that if
an exception escapes that function, regardless of the architecture,
there will be an apparent GDB crash?

> 
> This commit proposes an alternative solution that does not rely on
> throwing a C++ exception.
> 
> When we spot an unhandled Python exception in frame #0, we will store
> the details of this exception within the gdbpy_disassembler object
> currently in use.  Then we return to libopcodes a value indicating
> that the memory_read failed.
> 
> libopcodes will now continue to disassemble as though that memory read
> failed (with one special case described below), then, when we
> eventually return to disasmpy_builtin_disassemble we check to see if
> there is an exception stored in the gdbpy_disassembler object.  If
> there is then this exception can immediately be installed, and then we
> return back to Python, when the user will be able to catch the
> exception.
> 
> There is one extra change in gdbpy_disassembler::read_memory_func.
> After the first call that results in an exception being stored on the
> gdbpy_disassembler object, any future calls to the ::read_memory_func
> function will immediately return as if the read failed.  This avoids
> any additional calls into user supplied Python code.
> 
> My thinking here is that should the first call fail with some unknown
> error, GDB should not keep trying with any additional calls.  This
> maintains the illusion that the exception raised from
> MyInfo.read_memory is immediately raised by
> gdb.disassembler.builtin_disassemble.  I have no tests for this change
> though - to trigger this issue would rely on a libopcodes disassembler
> that will try to read further memory even after the first failed
> read.  I'm not aware of any such disassembler that currently does
> this, but that doesn't mean such a disassembler couldn't exist in the
> future.

I don't really understand the need for this change.  The
read_memory_func interface should mostly be stateless, if you try to
read an invalid address, you get an error.  If you then try to read a
valid address, I don't see why that wouldn't work.

Apart from that design decision, the code itself looks good to me.

> @@ -138,6 +160,23 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
>       memory source object then a pointer to the object is placed in here,
>       otherwise, this field is nullptr.  */
>    PyObject *m_memory_source;
> +
> +  /* Move the exception EX into this disassembler object.  */
> +  void store_exception (gdbpy_err_fetch &&ex)
> +  {
> +    if (!m_stored_exception.has_value ())
> +      m_stored_exception.emplace (std::move (ex));

When would m_stored_exception already have a value?  I'm thinking this
should be an assertion instead:

  gdb_assert (!m_stored_exception.has_value ());

> @@ -513,7 +555,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
>  	 exception and throw it, this will then be caught in
>  	 disasmpy_builtin_disassemble, at which point the exception will be
>  	 restored.  */
> -      throw gdbpy_err_fetch ();
> +      dis->store_exception (std::move (gdbpy_err_fetch ()));

I believe these std::moves are unnecessary.  The result of
gdbpy_err_fetch() is already a temporary that will be moved from.

Simon

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] gdb/python: avoid throwing an exception over libopcodes code
  2022-10-24 15:50 ` Simon Marchi
@ 2022-10-24 17:22   ` Andrew Burgess
  2022-10-24 17:45     ` [PATCH] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
  2022-10-24 18:20     ` [PATCH] gdb/python: avoid throwing an exception over libopcodes code Simon Marchi
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-10-24 17:22 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

Simon Marchi <simark@simark.ca> writes:

> On 10/24/22 08:50, Andrew Burgess via Gdb-patches wrote:
>> Bug gdb/29712 identifies a problem with the Python disassembler API.
>> In some cases GDB will try to throw an exception through the
>> libopcodes disassembler code, however, not all targets include
>> exception unwind information when compiling C code, for targets that
>> don't include this information GDB will terminate when trying to pass
>> the exception through libopcodes.
>> 
>> To explain what GDB is trying to do, consider the following trivial
>> use of the Python disassembler API:
>> 
>>   class ExampleDisassembler(gdb.disassembler.Disassembler):
>> 
>>       class MyInfo(gdb.disassembler.DisassembleInfo):
>>           def __init__(self, info):
>>               super().__init__(info)
>> 
>>           def read_memory(self, length, offset):
>>               return super().read_memory(length, offset)
>> 
>>       def __init__(self):
>>           super().__init__("ExampleDisassembler")
>> 
>>       def __call__(self, info):
>>           info = self.MyInfo(info)
>>           return gdb.disassembler.builtin_disassemble(info)
>> 
>> This disassembler doesn't add any value, it defers back to GDB to do
>> all the actual work, but it serves to allow us to discuss the problem.
>> 
>> The problem occurs when a Python exception is raised by the
>> MyInfo.read_memory method.  The MyInfo.read_memory method is called
>> from the C++ function gdbpy_disassembler::read_memory_func.  The C++
>> stack at the point this function is called looks like this:
>> 
>>   #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
>>   #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
>>   #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
>>   #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
>>   #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
>>   #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
>>   #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
>>   #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
>>   #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324
>> 
>>   ### ... snip lots of frames as we pass through Python itself ...
>> 
>>   #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
>>   #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
>>   #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
>>   #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101
>> 
>> So gdbpy_disassembler::read_memory_func is called from the libopcodes
>> disassembler to read memory, this C++ function then calls into user
>> supplied Python code to do the work.
>> 
>> If the user supplied Python code raises an gdb.MemoryError exception
>> indicating the memory read failed, this this is fine.  The C++ code
>> converts this exception back into a return value that libopcodes can
>> understand, and returns to libopcodes.
>> 
>> However, if the user supplied Python code raises some other exception,
>> what we want is for this exception to propagate through GDB and appear
>> as if raised by the call to gdb.disassembler.builtin_disassemble.  To
>> achieve this, when gdbpy_disassembler::read_memory_func spots an
>> unknown Python exception, we must pass the information about this
>> exception from frame #0 to frame #8 in the above backtrace.  Frame #8
>> is the C++ implementation of gdb.disassembler.builtin_disassemble, and
>> so it is this function that we want to re-raise the unknown Python
>> exception, so the user can, if they want, catch the exception in their
>> code.
>> 
>> The previous mechanism by which the exception was passed was to pack
>> the details of the Python exception into a C++ exception, then throw
>> the exception from frame #0, and catch the exception in frame #8,
>> unpack the details of the Python exception, and re-raise it.
>> 
>> However, this relies on the exception passing through frames #1 to #7,
>> some of which are in libopcodes, which is C code, and so, might not be
>> compiled with exception support.
>
> Can we make gdbpy_disassembler::read_memory_func "noexcept", so that if
> an exception escapes that function, regardless of the architecture,
> there will be an apparent GDB crash?

Yes!  This was my first thought once I spotted this bug.

So, what I originally tried was this, in gdb/disasm.h, I applied this:

diff --git a/gdb/disasm.h b/gdb/disasm.h
index b7d16744c20..7a217dfad12 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -52,7 +52,9 @@ struct gdb_disassemble_info
 protected:
 
   /* Types for the function callbacks within m_di.  */
-  using read_memory_ftype = decltype (disassemble_info::read_memory_func);
+  using read_memory_ftype
+    = int (*) (bfd_vma, bfd_byte *, unsigned int,
+              struct disassemble_info *) noexcept;
   using memory_error_ftype = decltype (disassemble_info::memory_error_func);
   using print_address_ftype = decltype (disassemble_info::print_address_func);
   using fprintf_ftype = decltype (disassemble_info::fprintf_func);

What I was trying to do was require that _every_ read memory function
that we have in GDB _must_ be noexcept.  This makes sense I think - the
same issue I ran into here could just as easily crop up elsewhere if we
tried to throw an exception.

Unfortunately, the above doesn't build, I run into this warning/error:

  In file included from ../../src/gdb/disasm.c:25:
  ../../src/gdb/disasm.h: In constructor ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’:
  ../../src/gdb/disasm.h:116:3: error: mangled name for ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’ will change in C++17 because the exception specification is part of a function type [-Werror=noexcept-type]
    116 |   gdb_printing_disassembler (struct gdbarch *gdbarch,
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~

At this point I figured I couldn't use noexcept in the API like this, so
gave up.

But I think this was a mistake.  I just tried, and turns out I can add
noexcept to gdbpy_disassembler::read_memory_func, and everything
compiles fine, which, thinking about it more makes perfect sense.  The
noexcept function is more restrictive, so passing it to a function
pointer type that doesn't include noexcept can't hurt.

What I'd like to do though it create a single patch that adds noexcept
to all the disassembler callbacks throughout GDB in one go.  I'll reply
to this message once I have that patch ready, but if it's OK, I'll leave
that as a separate change to the original patch?

>
>> 
>> This commit proposes an alternative solution that does not rely on
>> throwing a C++ exception.
>> 
>> When we spot an unhandled Python exception in frame #0, we will store
>> the details of this exception within the gdbpy_disassembler object
>> currently in use.  Then we return to libopcodes a value indicating
>> that the memory_read failed.
>> 
>> libopcodes will now continue to disassemble as though that memory read
>> failed (with one special case described below), then, when we
>> eventually return to disasmpy_builtin_disassemble we check to see if
>> there is an exception stored in the gdbpy_disassembler object.  If
>> there is then this exception can immediately be installed, and then we
>> return back to Python, when the user will be able to catch the
>> exception.
>> 
>> There is one extra change in gdbpy_disassembler::read_memory_func.
>> After the first call that results in an exception being stored on the
>> gdbpy_disassembler object, any future calls to the ::read_memory_func
>> function will immediately return as if the read failed.  This avoids
>> any additional calls into user supplied Python code.
>> 
>> My thinking here is that should the first call fail with some unknown
>> error, GDB should not keep trying with any additional calls.  This
>> maintains the illusion that the exception raised from
>> MyInfo.read_memory is immediately raised by
>> gdb.disassembler.builtin_disassemble.  I have no tests for this change
>> though - to trigger this issue would rely on a libopcodes disassembler
>> that will try to read further memory even after the first failed
>> read.  I'm not aware of any such disassembler that currently does
>> this, but that doesn't mean such a disassembler couldn't exist in the
>> future.
>
> I don't really understand the need for this change.  The
> read_memory_func interface should mostly be stateless, if you try to
> read an invalid address, you get an error.  If you then try to read a
> valid address, I don't see why that wouldn't work.

If the error/exception that occurs the first time is a gdb.MemoryError
then I agree with you completely.

The question is about any other exception type.  Imagine this totally
made up nonsense example:

  class MyInfo(gdb.disassembler.DisassembleInfo):
      def __init__(self, info):
          super().__init__(info)

      def read_memory(self, length, offset):
          if read_resource_from_somewhere_remote():
            return super().read_memory(length, offset)
          else:
            raise RemoteResourceUnavailableError()

If the disassembler tries to read memory, and for some reason the
read_resource_from_somewhere_remote() call fails, we raise a
RemoteResourceUnavailableError exception.

My expectation (and my desire for the API) is that the
RemoteResourceUnavailableError will _immediately_ be re-raised by the
corresponding builtin_disassemble call.

Without the line you are commenting on though, it is possible that the
disassembler will repeatedly call the read_memory method, and so
repeatedly raise the RemoteResourceUnavailableError error.

I don't know if I'm doing a good job of arguing my position :/ does the
above make any sense?

>
> Apart from that design decision, the code itself looks good to me.
>
>> @@ -138,6 +160,23 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
>>       memory source object then a pointer to the object is placed in here,
>>       otherwise, this field is nullptr.  */
>>    PyObject *m_memory_source;
>> +
>> +  /* Move the exception EX into this disassembler object.  */
>> +  void store_exception (gdbpy_err_fetch &&ex)
>> +  {
>> +    if (!m_stored_exception.has_value ())
>> +      m_stored_exception.emplace (std::move (ex));
>
> When would m_stored_exception already have a value?  I'm thinking this
> should be an assertion instead:
>
>   gdb_assert (!m_stored_exception.has_value ());

Good spot.  This was originally added before I added the:

  if (dis->has_stored_exception ())
    return -1;

guard to gdbpy_disassembler::read_memory_func that we are discussing
above, and ensured only the first Python exception was captured.  Then I
started worrying about whether we should even allow the function to be
run multiple times, added the guard ... and never updated this.

I'll change this to an assert as you suggest.

>
>> @@ -513,7 +555,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
>>  	 exception and throw it, this will then be caught in
>>  	 disasmpy_builtin_disassemble, at which point the exception will be
>>  	 restored.  */
>> -      throw gdbpy_err_fetch ();
>> +      dis->store_exception (std::move (gdbpy_err_fetch ()));
>
> I believe these std::moves are unnecessary.  The result of
> gdbpy_err_fetch() is already a temporary that will be moved from.

Ah, OK.  I'll remove these then.  Thanks.

Updated patch below has the assert, and less std::move.  I'll follow up
with an extra patch adding the noexcept keywords.

Thanks,
Andrew

---

commit f8a254c1d1c14d2b6b934e300e5b48158f15f23b
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Fri Oct 21 16:20:58 2022 +0100

    gdb/python: avoid throwing an exception over libopcodes code
    
    Bug gdb/29712 identifies a problem with the Python disassembler API.
    In some cases GDB will try to throw an exception through the
    libopcodes disassembler code, however, not all targets include
    exception unwind information when compiling C code, for targets that
    don't include this information GDB will terminate when trying to pass
    the exception through libopcodes.
    
    To explain what GDB is trying to do, consider the following trivial
    use of the Python disassembler API:
    
      class ExampleDisassembler(gdb.disassembler.Disassembler):
    
          class MyInfo(gdb.disassembler.DisassembleInfo):
              def __init__(self, info):
                  super().__init__(info)
    
              def read_memory(self, length, offset):
                  return super().read_memory(length, offset)
    
          def __init__(self):
              super().__init__("ExampleDisassembler")
    
          def __call__(self, info):
              info = self.MyInfo(info)
              return gdb.disassembler.builtin_disassemble(info)
    
    This disassembler doesn't add any value, it defers back to GDB to do
    all the actual work, but it serves to allow us to discuss the problem.
    
    The problem occurs when a Python exception is raised by the
    MyInfo.read_memory method.  The MyInfo.read_memory method is called
    from the C++ function gdbpy_disassembler::read_memory_func.  The C++
    stack at the point this function is called looks like this:
    
      #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
      #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
      #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
      #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
      #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
      #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
      #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
      #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
      #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324
    
      ### ... snip lots of frames as we pass through Python itself ...
    
      #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
      #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
      #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
      #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101
    
    So gdbpy_disassembler::read_memory_func is called from the libopcodes
    disassembler to read memory, this C++ function then calls into user
    supplied Python code to do the work.
    
    If the user supplied Python code raises an gdb.MemoryError exception
    indicating the memory read failed, this this is fine.  The C++ code
    converts this exception back into a return value that libopcodes can
    understand, and returns to libopcodes.
    
    However, if the user supplied Python code raises some other exception,
    what we want is for this exception to propagate through GDB and appear
    as if raised by the call to gdb.disassembler.builtin_disassemble.  To
    achieve this, when gdbpy_disassembler::read_memory_func spots an
    unknown Python exception, we must pass the information about this
    exception from frame #0 to frame #8 in the above backtrace.  Frame #8
    is the C++ implementation of gdb.disassembler.builtin_disassemble, and
    so it is this function that we want to re-raise the unknown Python
    exception, so the user can, if they want, catch the exception in their
    code.
    
    The previous mechanism by which the exception was passed was to pack
    the details of the Python exception into a C++ exception, then throw
    the exception from frame #0, and catch the exception in frame #8,
    unpack the details of the Python exception, and re-raise it.
    
    However, this relies on the exception passing through frames #1 to #7,
    some of which are in libopcodes, which is C code, and so, might not be
    compiled with exception support.
    
    This commit proposes an alternative solution that does not rely on
    throwing a C++ exception.
    
    When we spot an unhandled Python exception in frame #0, we will store
    the details of this exception within the gdbpy_disassembler object
    currently in use.  Then we return to libopcodes a value indicating
    that the memory_read failed.
    
    libopcodes will now continue to disassemble as though that memory read
    failed (with one special case described below), then, when we
    eventually return to disasmpy_builtin_disassemble we check to see if
    there is an exception stored in the gdbpy_disassembler object.  If
    there is then this exception can immediately be installed, and then we
    return back to Python, when the user will be able to catch the
    exception.
    
    There is one extra change in gdbpy_disassembler::read_memory_func.
    After the first call that results in an exception being stored on the
    gdbpy_disassembler object, any future calls to the ::read_memory_func
    function will immediately return as if the read failed.  This avoids
    any additional calls into user supplied Python code.
    
    My thinking here is that should the first call fail with some unknown
    error, GDB should not keep trying with any additional calls.  This
    maintains the illusion that the exception raised from
    MyInfo.read_memory is immediately raised by
    gdb.disassembler.builtin_disassemble.  I have no tests for this change
    though - to trigger this issue would rely on a libopcodes disassembler
    that will try to read further memory even after the first failed
    read.  I'm not aware of any such disassembler that currently does
    this, but that doesn't mean such a disassembler couldn't exist in the
    future.
    
    With this change in place the gdb.python/py-disasm.exp test should now
    pass on AArch64.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29712

diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
index c37452fcf72..1d460997831 100644
--- a/gdb/python/py-disasm.c
+++ b/gdb/python/py-disasm.c
@@ -122,6 +122,28 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
     return m_string_file.release ();
   }
 
+  /* If there is a Python exception stored in this disassembler then
+     restore it (i.e. set the PyErr_* state), clear the exception within
+     this disassembler, and return true.  There must be no current
+     exception set (i.e. !PyErr_Occurred()) when this function is called,
+     as any such exception might get lost.
+
+     Otherwise, there is no exception stored in this disassembler, return
+     false.  */
+  bool restore_exception ()
+  {
+    gdb_assert (!PyErr_Occurred ());
+    if (m_stored_exception.has_value ())
+      {
+	gdbpy_err_fetch ex = std::move (*m_stored_exception);
+	m_stored_exception.reset ();
+	ex.restore ();
+	return true;
+      }
+
+    return false;
+  }
+
 private:
 
   /* Where the disassembler result is written.  */
@@ -138,6 +160,25 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
      memory source object then a pointer to the object is placed in here,
      otherwise, this field is nullptr.  */
   PyObject *m_memory_source;
+
+  /* Move the exception EX into this disassembler object.  */
+  void store_exception (gdbpy_err_fetch &&ex)
+  {
+    /* The only calls to store_exception are from read_memory_func, which
+       will return early if there's already an exception stored.  */
+    gdb_assert (!m_stored_exception.has_value ());
+    m_stored_exception.emplace (std::move (ex));
+  }
+
+  /* Return true if there is an exception stored in this disassembler.  */
+  bool has_stored_exception () const
+  {
+    return m_stored_exception.has_value ();
+  }
+
+  /* Store a single exception.  This is used to pass Python exceptions back
+     from ::memory_read to disasmpy_builtin_disassemble.  */
+  gdb::optional<gdbpy_err_fetch> m_stored_exception;
 };
 
 /* Return true if OBJ is still valid, otherwise, return false.  A valid OBJ
@@ -288,20 +329,15 @@ disasmpy_builtin_disassemble (PyObject *self, PyObject *args, PyObject *kw)
      the disassembled instruction, or -1 if there was a memory-error
      encountered while disassembling.  See below more more details on
      handling of -1 return value.  */
-  int length;
-  try
-    {
-      length = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address,
+  int length = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address,
 				   disassembler.disasm_info ());
-    }
-  catch (gdbpy_err_fetch &pyerr)
-    {
-      /* Reinstall the Python exception held in PYERR.  This clears to
-	 pointers held in PYERR, hence the need to catch as a non-const
-	 reference.  */
-      pyerr.restore ();
-      return nullptr;
-    }
+
+  /* It is possible that, while calling a user overridden memory read
+     function, a Python exception was raised that couldn't be
+     translated into a standard memory-error.  In this case the first such
+     exception is stored in the disassembler and restored here.  */
+  if (disassembler.restore_exception ())
+    return nullptr;
 
   if (length == -1)
     {
@@ -483,6 +519,14 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
     = static_cast<gdbpy_disassembler *> (info->application_data);
   disasm_info_object *obj = dis->py_disasm_info ();
 
+  /* If a previous read attempt resulted in an exception, then we don't
+     allow any further reads to succeed.  We only do this check for the
+     read_memory_func as this is the only one the user can hook into,
+     thus, this check prevents us calling back into user code if a
+     previous call has already thrown an error.  */
+  if (dis->has_stored_exception ())
+    return -1;
+
   /* The DisassembleInfo.read_memory method expects an offset from the
      address stored within the DisassembleInfo object; calculate that
      offset here.  */
@@ -513,7 +557,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 	 exception and throw it, this will then be caught in
 	 disasmpy_builtin_disassemble, at which point the exception will be
 	 restored.  */
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Convert the result to a buffer.  */
@@ -523,7 +568,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
     {
       PyErr_Format (PyExc_TypeError,
 		    _("Result from read_memory is not a buffer"));
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Wrap PY_BUFF so that it is cleaned up correctly at the end of this
@@ -536,7 +582,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
       PyErr_Format (PyExc_ValueError,
 		    _("Buffer returned from read_memory is sized %d instead of the expected %d"),
 		    py_buff.len, len);
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Copy the data out of the Python buffer and return success.  */


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] gdb/disasm: mark functions passed to the disassembler noexcept
  2022-10-24 17:22   ` Andrew Burgess
@ 2022-10-24 17:45     ` Andrew Burgess
  2022-10-24 18:24       ` Simon Marchi
  2022-10-24 18:20     ` [PATCH] gdb/python: avoid throwing an exception over libopcodes code Simon Marchi
  1 sibling, 1 reply; 18+ messages in thread
From: Andrew Burgess @ 2022-10-24 17:45 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches


I realise I made a bit of a mess, including the updated patch in the
previous email.  I should have just posted both of these as a new V2
series.

Oh well!  The patch below applies on top of the first patch (in the
parent email), and adds the noexcept keyword in a few places.

Thanks,
Andrew

---

commit b1e3243296084566baf443d9c5280918a9d784f3
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Mon Oct 24 18:35:41 2022 +0100

    gdb/disasm: mark functions passed to the disassembler noexcept
    
    While working on another patch, Simon pointed out that GDB could be
    improved by marking the functions passed to the disassembler as
    noexcept.
    
      https://sourceware.org/pipermail/gdb-patches/2022-October/193084.html
    
    The reason this is important is the on some hosts, libopcodes, being C
    code, will not be compiled with support for handling exceptions.  As
    such, an attempt to throw an exception over libopcodes code will cause
    GDB to terminate.
    
    See bug gdb/29712 for an example of when this happened.
    
    In this commit all the functions that are passed to the disassembler,
    and which might be used as callbacks by libopcodes are marked
    noexcept.
    
    Ideally, I would have liked to change these typedefs:
    
      using read_memory_ftype = decltype (disassemble_info::read_memory_func);
      using memory_error_ftype = decltype (disassemble_info::memory_error_func);
      using print_address_ftype = decltype (disassemble_info::print_address_func);
      using fprintf_ftype = decltype (disassemble_info::fprintf_func);
      using fprintf_styled_ftype = decltype (disassemble_info::fprintf_styled_func);
    
    which are declared in disasm.h, as including the noexcept keyword.
    However, when I tried this, I ran into this warning/error:
    
      In file included from ../../src/gdb/disasm.c:25:
      ../../src/gdb/disasm.h: In constructor ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’:
      ../../src/gdb/disasm.h:116:3: error: mangled name for ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’ will change in C++17 because the exception specification is part of a function type [-Werror=noexcept-type]
        116 |   gdb_printing_disassembler (struct gdbarch *gdbarch,
            |   ^~~~~~~~~~~~~~~~~~~~~~~~~
    
    So I've left that change out.  This does mean that if somebody adds a
    new use of the disassembler classes in the future, and forgets to mark
    the callbacks as noexcept, this will compile fine.  We'll just have to
    manually check for that during review.

diff --git a/gdb/disasm.c b/gdb/disasm.c
index de44aac3263..da2fb89e5c3 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -191,7 +191,7 @@ line_has_code_p (htab_t table, struct symtab *symtab, int line)
 int
 gdb_disassembler_memory_reader::dis_asm_read_memory
   (bfd_vma memaddr, gdb_byte *myaddr, unsigned int len,
-   struct disassemble_info *info)
+   struct disassemble_info *info) noexcept
 {
   return target_read_code (memaddr, myaddr, len);
 }
@@ -199,8 +199,8 @@ gdb_disassembler_memory_reader::dis_asm_read_memory
 /* Wrapper of memory_error.  */
 
 void
-gdb_disassembler::dis_asm_memory_error (int err, bfd_vma memaddr,
-					struct disassemble_info *info)
+gdb_disassembler::dis_asm_memory_error
+  (int err, bfd_vma memaddr, struct disassemble_info *info) noexcept
 {
   gdb_disassembler *self
     = static_cast<gdb_disassembler *>(info->application_data);
@@ -211,8 +211,8 @@ gdb_disassembler::dis_asm_memory_error (int err, bfd_vma memaddr,
 /* Wrapper of print_address.  */
 
 void
-gdb_disassembler::dis_asm_print_address (bfd_vma addr,
-					 struct disassemble_info *info)
+gdb_disassembler::dis_asm_print_address
+  (bfd_vma addr, struct disassemble_info *info) noexcept
 {
   gdb_disassembler *self
     = static_cast<gdb_disassembler *>(info->application_data);
@@ -256,7 +256,7 @@ gdb_printing_disassembler::stream_from_gdb_disassemble_info (void *dis_info)
 
 int
 gdb_printing_disassembler::fprintf_func (void *dis_info,
-					 const char *format, ...)
+					 const char *format, ...) noexcept
 {
   ui_file *stream = stream_from_gdb_disassemble_info (dis_info);
 
@@ -272,9 +272,9 @@ gdb_printing_disassembler::fprintf_func (void *dis_info,
 /* See disasm.h.  */
 
 int
-gdb_printing_disassembler::fprintf_styled_func (void *dis_info,
-						enum disassembler_style style,
-						const char *format, ...)
+gdb_printing_disassembler::fprintf_styled_func
+  (void *dis_info, enum disassembler_style style,
+   const char *format, ...) noexcept
 {
   ui_file *stream = stream_from_gdb_disassemble_info (dis_info);
   gdb_printing_disassembler *dis = (gdb_printing_disassembler *) dis_info;
@@ -1220,8 +1220,8 @@ gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR addr)
 /* See disasm.h.  */
 
 int
-gdb_non_printing_disassembler::null_fprintf_func (void *stream,
-						  const char *format, ...)
+gdb_non_printing_disassembler::null_fprintf_func
+  (void *stream, const char *format, ...) noexcept
 {
   return 0;
 }
@@ -1230,7 +1230,8 @@ gdb_non_printing_disassembler::null_fprintf_func (void *stream,
 
 int
 gdb_non_printing_disassembler::null_fprintf_styled_func
-  (void *stream, enum disassembler_style style, const char *format, ...)
+  (void *stream, enum disassembler_style style,
+   const char *format, ...) noexcept
 {
   return 0;
 }
diff --git a/gdb/disasm.h b/gdb/disasm.h
index b7d16744c20..989f287c9e1 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -127,7 +127,7 @@ struct gdb_printing_disassembler : public gdb_disassemble_info
   /* Callback used as the disassemble_info's fprintf_func callback.  The
      DIS_INFO pointer is a pointer to a gdb_printing_disassembler object.
      Content is written to the m_stream extracted from DIS_INFO.  */
-  static int fprintf_func (void *dis_info, const char *format, ...)
+  static int fprintf_func (void *dis_info, const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(2,3);
 
   /* Callback used as the disassemble_info's fprintf_styled_func callback.
@@ -135,7 +135,7 @@ struct gdb_printing_disassembler : public gdb_disassemble_info
      object.  Content is written to the m_stream extracted from DIS_INFO.  */
   static int fprintf_styled_func (void *dis_info,
 				  enum disassembler_style style,
-				  const char *format, ...)
+				  const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(3,4);
 
   /* Return true if the disassembler is considered inside a comment, false
@@ -187,14 +187,14 @@ struct gdb_non_printing_disassembler : public gdb_disassemble_info
 
   /* Callback used as the disassemble_info's fprintf_func callback, this
      doesn't write anything to STREAM, but just returns 0.  */
-  static int null_fprintf_func (void *stream, const char *format, ...)
+  static int null_fprintf_func (void *stream, const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(2,3);
 
   /* Callback used as the disassemble_info's fprintf_styled_func callback,
      , this doesn't write anything to STREAM, but just returns 0.  */
   static int null_fprintf_styled_func (void *stream,
 				       enum disassembler_style style,
-				       const char *format, ...)
+				       const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(3,4);
 };
 
@@ -208,7 +208,7 @@ struct gdb_disassembler_memory_reader
   /* Implements the read_memory_func disassemble_info callback.  */
   static int dis_asm_read_memory (bfd_vma memaddr, gdb_byte *myaddr,
 				  unsigned int len,
-				  struct disassemble_info *info);
+				  struct disassemble_info *info) noexcept;
 };
 
 /* A non-printing disassemble_info management class.  The disassemble_info
@@ -281,9 +281,9 @@ struct gdb_disassembler : public gdb_printing_disassembler,
   static bool use_ext_lang_colorization_p;
 
   static void dis_asm_memory_error (int err, bfd_vma memaddr,
-				    struct disassemble_info *info);
+				    struct disassemble_info *info) noexcept;
   static void dis_asm_print_address (bfd_vma addr,
-				     struct disassemble_info *info);
+				     struct disassemble_info *info) noexcept;
 
   /* Return true if we should use the extension language to apply
      disassembler styling.  This requires disassembler styling to be on
diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
index 1d460997831..a25252b4306 100644
--- a/gdb/python/py-disasm.c
+++ b/gdb/python/py-disasm.c
@@ -101,12 +101,12 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
 
   /* Callbacks used by disassemble_info.  */
   static void memory_error_func (int status, bfd_vma memaddr,
-				 struct disassemble_info *info);
+				 struct disassemble_info *info) noexcept;
   static void print_address_func (bfd_vma addr,
-				  struct disassemble_info *info);
+				  struct disassemble_info *info) noexcept;
   static int read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 			       unsigned int len,
-			       struct disassemble_info *info);
+			       struct disassemble_info *info) noexcept;
 
   /* Return a reference to an optional that contains the address at which a
      memory error occurred.  The optional will only have a value if a
@@ -513,7 +513,7 @@ disasmpy_info_progspace (PyObject *self, void *closure)
 int
 gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 				      unsigned int len,
-				      struct disassemble_info *info)
+				      struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);
@@ -658,7 +658,7 @@ disasmpy_result_init (PyObject *self, PyObject *args, PyObject *kwargs)
 
 void
 gdbpy_disassembler::memory_error_func (int status, bfd_vma memaddr,
-				       struct disassemble_info *info)
+				       struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);
@@ -669,7 +669,7 @@ gdbpy_disassembler::memory_error_func (int status, bfd_vma memaddr,
 
 void
 gdbpy_disassembler::print_address_func (bfd_vma addr,
-					struct disassemble_info *info)
+					struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] gdb/python: avoid throwing an exception over libopcodes code
  2022-10-24 17:22   ` Andrew Burgess
  2022-10-24 17:45     ` [PATCH] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
@ 2022-10-24 18:20     ` Simon Marchi
  2022-10-27 10:38       ` Andrew Burgess
  1 sibling, 1 reply; 18+ messages in thread
From: Simon Marchi @ 2022-10-24 18:20 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On 10/24/22 13:22, Andrew Burgess via Gdb-patches wrote:
>> Can we make gdbpy_disassembler::read_memory_func "noexcept", so that if
>> an exception escapes that function, regardless of the architecture,
>> there will be an apparent GDB crash?
> 
> Yes!  This was my first thought once I spotted this bug.
> 
> So, what I originally tried was this, in gdb/disasm.h, I applied this:
> 
> diff --git a/gdb/disasm.h b/gdb/disasm.h
> index b7d16744c20..7a217dfad12 100644
> --- a/gdb/disasm.h
> +++ b/gdb/disasm.h
> @@ -52,7 +52,9 @@ struct gdb_disassemble_info
>  protected:
>  
>    /* Types for the function callbacks within m_di.  */
> -  using read_memory_ftype = decltype (disassemble_info::read_memory_func);
> +  using read_memory_ftype
> +    = int (*) (bfd_vma, bfd_byte *, unsigned int,
> +              struct disassemble_info *) noexcept;
>    using memory_error_ftype = decltype (disassemble_info::memory_error_func);
>    using print_address_ftype = decltype (disassemble_info::print_address_func);
>    using fprintf_ftype = decltype (disassemble_info::fprintf_func);
> 
> What I was trying to do was require that _every_ read memory function
> that we have in GDB _must_ be noexcept.  This makes sense I think - the
> same issue I ran into here could just as easily crop up elsewhere if we
> tried to throw an exception.
> 
> Unfortunately, the above doesn't build, I run into this warning/error:
> 
>   In file included from ../../src/gdb/disasm.c:25:
>   ../../src/gdb/disasm.h: In constructor ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’:
>   ../../src/gdb/disasm.h:116:3: error: mangled name for ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’ will change in C++17 because the exception specification is part of a function type [-Werror=noexcept-type]
>     116 |   gdb_printing_disassembler (struct gdbarch *gdbarch,
>         |   ^~~~~~~~~~~~~~~~~~~~~~~~~
> 
> At this point I figured I couldn't use noexcept in the API like this, so
> gave up.

I think this warning could safely be silenced, since we don't care about
the mangling name change.

> But I think this was a mistake.  I just tried, and turns out I can add
> noexcept to gdbpy_disassembler::read_memory_func, and everything
> compiles fine, which, thinking about it more makes perfect sense.  The
> noexcept function is more restrictive, so passing it to a function
> pointer type that doesn't include noexcept can't hurt.

That's what I tried to, but your idea of putting it on read_memory_ftype
would be stronger, if it's possible.

> What I'd like to do though it create a single patch that adds noexcept
> to all the disassembler callbacks throughout GDB in one go.  I'll reply
> to this message once I have that patch ready, but if it's OK, I'll leave
> that as a separate change to the original patch?

Yeah sure.

>>
>>>
>>> This commit proposes an alternative solution that does not rely on
>>> throwing a C++ exception.
>>>
>>> When we spot an unhandled Python exception in frame #0, we will store
>>> the details of this exception within the gdbpy_disassembler object
>>> currently in use.  Then we return to libopcodes a value indicating
>>> that the memory_read failed.
>>>
>>> libopcodes will now continue to disassemble as though that memory read
>>> failed (with one special case described below), then, when we
>>> eventually return to disasmpy_builtin_disassemble we check to see if
>>> there is an exception stored in the gdbpy_disassembler object.  If
>>> there is then this exception can immediately be installed, and then we
>>> return back to Python, when the user will be able to catch the
>>> exception.
>>>
>>> There is one extra change in gdbpy_disassembler::read_memory_func.
>>> After the first call that results in an exception being stored on the
>>> gdbpy_disassembler object, any future calls to the ::read_memory_func
>>> function will immediately return as if the read failed.  This avoids
>>> any additional calls into user supplied Python code.
>>>
>>> My thinking here is that should the first call fail with some unknown
>>> error, GDB should not keep trying with any additional calls.  This
>>> maintains the illusion that the exception raised from
>>> MyInfo.read_memory is immediately raised by
>>> gdb.disassembler.builtin_disassemble.  I have no tests for this change
>>> though - to trigger this issue would rely on a libopcodes disassembler
>>> that will try to read further memory even after the first failed
>>> read.  I'm not aware of any such disassembler that currently does
>>> this, but that doesn't mean such a disassembler couldn't exist in the
>>> future.
>>
>> I don't really understand the need for this change.  The
>> read_memory_func interface should mostly be stateless, if you try to
>> read an invalid address, you get an error.  If you then try to read a
>> valid address, I don't see why that wouldn't work.
> 
> If the error/exception that occurs the first time is a gdb.MemoryError
> then I agree with you completely.
> 
> The question is about any other exception type.  Imagine this totally
> made up nonsense example:
> 
>   class MyInfo(gdb.disassembler.DisassembleInfo):
>       def __init__(self, info):
>           super().__init__(info)
> 
>       def read_memory(self, length, offset):
>           if read_resource_from_somewhere_remote():
>             return super().read_memory(length, offset)
>           else:
>             raise RemoteResourceUnavailableError()
> 
> If the disassembler tries to read memory, and for some reason the
> read_resource_from_somewhere_remote() call fails, we raise a
> RemoteResourceUnavailableError exception.
> 
> My expectation (and my desire for the API) is that the
> RemoteResourceUnavailableError will _immediately_ be re-raised by the
> corresponding builtin_disassemble call.
> 
> Without the line you are commenting on though, it is possible that the
> disassembler will repeatedly call the read_memory method, and so
> repeatedly raise the RemoteResourceUnavailableError error.
> 
> I don't know if I'm doing a good job of arguing my position :/ does the
> above make any sense?

I don't really understand the problem with raising
RemoteResourceUnavailableError repeatedly, if the disassembler calls
read_memory repeatedly.

Ah, I might get it now that I read your comment below.  Is that you have
these stack frames:

 0. gdb Python read_memory stack frames
 1. opcodes disassembler stack frames
 2. other gdb stack frames

The Python exception is stored in frames 0, and normally restored in
frames 2.  However, it could be that frames 0 store an exception, then
control goes back to frames 1, which handle the error return code, but
somehow think they can recover and so continue and call read_memory
again.  And so we in frames 0 again, but the packaged Python exception
from last time is still there.

If so, I think it would be fine to drop the original exception upon
entering frames 0 again.  It would be as if frames 1 caught the
exception (if it were a C++ exception) and decided it was safe to
ignore.  It would be surprising thought for a disassembler to do that.
After getting an error that can't be explained, you'd think they would
return immediately with an error.

Simon

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] gdb/disasm: mark functions passed to the disassembler noexcept
  2022-10-24 17:45     ` [PATCH] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
@ 2022-10-24 18:24       ` Simon Marchi
  0 siblings, 0 replies; 18+ messages in thread
From: Simon Marchi @ 2022-10-24 18:24 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On 10/24/22 13:45, Andrew Burgess via Gdb-patches wrote:
> 
> I realise I made a bit of a mess, including the updated patch in the
> previous email.  I should have just posted both of these as a new V2
> series.
> 
> Oh well!  The patch below applies on top of the first patch (in the
> parent email), and adds the noexcept keyword in a few places.
> 
> Thanks,
> Andrew
> 
> ---
> 
> commit b1e3243296084566baf443d9c5280918a9d784f3
> Author: Andrew Burgess <aburgess@redhat.com>
> Date:   Mon Oct 24 18:35:41 2022 +0100
> 
>     gdb/disasm: mark functions passed to the disassembler noexcept
>     
>     While working on another patch, Simon pointed out that GDB could be
>     improved by marking the functions passed to the disassembler as
>     noexcept.
>     
>       https://sourceware.org/pipermail/gdb-patches/2022-October/193084.html
>     
>     The reason this is important is the on some hosts, libopcodes, being C
>     code, will not be compiled with support for handling exceptions.  As
>     such, an attempt to throw an exception over libopcodes code will cause
>     GDB to terminate.
>     
>     See bug gdb/29712 for an example of when this happened.
>     
>     In this commit all the functions that are passed to the disassembler,
>     and which might be used as callbacks by libopcodes are marked
>     noexcept.
>     
>     Ideally, I would have liked to change these typedefs:
>     
>       using read_memory_ftype = decltype (disassemble_info::read_memory_func);
>       using memory_error_ftype = decltype (disassemble_info::memory_error_func);
>       using print_address_ftype = decltype (disassemble_info::print_address_func);
>       using fprintf_ftype = decltype (disassemble_info::fprintf_func);
>       using fprintf_styled_ftype = decltype (disassemble_info::fprintf_styled_func);
>     
>     which are declared in disasm.h, as including the noexcept keyword.
>     However, when I tried this, I ran into this warning/error:
>     
>       In file included from ../../src/gdb/disasm.c:25:
>       ../../src/gdb/disasm.h: In constructor ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’:
>       ../../src/gdb/disasm.h:116:3: error: mangled name for ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’ will change in C++17 because the exception specification is part of a function type [-Werror=noexcept-type]
>         116 |   gdb_printing_disassembler (struct gdbarch *gdbarch,
>             |   ^~~~~~~~~~~~~~~~~~~~~~~~~
>     
>     So I've left that change out.  This does mean that if somebody adds a
>     new use of the disassembler classes in the future, and forgets to mark
>     the callbacks as noexcept, this will compile fine.  We'll just have to
>     manually check for that during review.

Like I said in my other email, I think this warning could safely be
ignored (silenced).  We are not writing a library with a stable ABI, we
don't care if the mangling is different in two different builds.

Simon

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] gdb/python: avoid throwing an exception over libopcodes code
  2022-10-24 18:20     ` [PATCH] gdb/python: avoid throwing an exception over libopcodes code Simon Marchi
@ 2022-10-27 10:38       ` Andrew Burgess
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-10-27 10:38 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

Simon Marchi <simark@simark.ca> writes:

> On 10/24/22 13:22, Andrew Burgess via Gdb-patches wrote:
>>> Can we make gdbpy_disassembler::read_memory_func "noexcept", so that if
>>> an exception escapes that function, regardless of the architecture,
>>> there will be an apparent GDB crash?
>> 
>> Yes!  This was my first thought once I spotted this bug.
>> 
>> So, what I originally tried was this, in gdb/disasm.h, I applied this:
>> 
>> diff --git a/gdb/disasm.h b/gdb/disasm.h
>> index b7d16744c20..7a217dfad12 100644
>> --- a/gdb/disasm.h
>> +++ b/gdb/disasm.h
>> @@ -52,7 +52,9 @@ struct gdb_disassemble_info
>>  protected:
>>  
>>    /* Types for the function callbacks within m_di.  */
>> -  using read_memory_ftype = decltype (disassemble_info::read_memory_func);
>> +  using read_memory_ftype
>> +    = int (*) (bfd_vma, bfd_byte *, unsigned int,
>> +              struct disassemble_info *) noexcept;
>>    using memory_error_ftype = decltype (disassemble_info::memory_error_func);
>>    using print_address_ftype = decltype (disassemble_info::print_address_func);
>>    using fprintf_ftype = decltype (disassemble_info::fprintf_func);
>> 
>> What I was trying to do was require that _every_ read memory function
>> that we have in GDB _must_ be noexcept.  This makes sense I think - the
>> same issue I ran into here could just as easily crop up elsewhere if we
>> tried to throw an exception.
>> 
>> Unfortunately, the above doesn't build, I run into this warning/error:
>> 
>>   In file included from ../../src/gdb/disasm.c:25:
>>   ../../src/gdb/disasm.h: In constructor ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’:
>>   ../../src/gdb/disasm.h:116:3: error: mangled name for ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’ will change in C++17 because the exception specification is part of a function type [-Werror=noexcept-type]
>>     116 |   gdb_printing_disassembler (struct gdbarch *gdbarch,
>>         |   ^~~~~~~~~~~~~~~~~~~~~~~~~
>> 
>> At this point I figured I couldn't use noexcept in the API like this, so
>> gave up.
>
> I think this warning could safely be silenced, since we don't care about
> the mangling name change.
>
>> But I think this was a mistake.  I just tried, and turns out I can add
>> noexcept to gdbpy_disassembler::read_memory_func, and everything
>> compiles fine, which, thinking about it more makes perfect sense.  The
>> noexcept function is more restrictive, so passing it to a function
>> pointer type that doesn't include noexcept can't hurt.
>
> That's what I tried to, but your idea of putting it on read_memory_ftype
> would be stronger, if it's possible.
>
>> What I'd like to do though it create a single patch that adds noexcept
>> to all the disassembler callbacks throughout GDB in one go.  I'll reply
>> to this message once I have that patch ready, but if it's OK, I'll leave
>> that as a separate change to the original patch?
>
> Yeah sure.
>
>>>
>>>>
>>>> This commit proposes an alternative solution that does not rely on
>>>> throwing a C++ exception.
>>>>
>>>> When we spot an unhandled Python exception in frame #0, we will store
>>>> the details of this exception within the gdbpy_disassembler object
>>>> currently in use.  Then we return to libopcodes a value indicating
>>>> that the memory_read failed.
>>>>
>>>> libopcodes will now continue to disassemble as though that memory read
>>>> failed (with one special case described below), then, when we
>>>> eventually return to disasmpy_builtin_disassemble we check to see if
>>>> there is an exception stored in the gdbpy_disassembler object.  If
>>>> there is then this exception can immediately be installed, and then we
>>>> return back to Python, when the user will be able to catch the
>>>> exception.
>>>>
>>>> There is one extra change in gdbpy_disassembler::read_memory_func.
>>>> After the first call that results in an exception being stored on the
>>>> gdbpy_disassembler object, any future calls to the ::read_memory_func
>>>> function will immediately return as if the read failed.  This avoids
>>>> any additional calls into user supplied Python code.
>>>>
>>>> My thinking here is that should the first call fail with some unknown
>>>> error, GDB should not keep trying with any additional calls.  This
>>>> maintains the illusion that the exception raised from
>>>> MyInfo.read_memory is immediately raised by
>>>> gdb.disassembler.builtin_disassemble.  I have no tests for this change
>>>> though - to trigger this issue would rely on a libopcodes disassembler
>>>> that will try to read further memory even after the first failed
>>>> read.  I'm not aware of any such disassembler that currently does
>>>> this, but that doesn't mean such a disassembler couldn't exist in the
>>>> future.
>>>
>>> I don't really understand the need for this change.  The
>>> read_memory_func interface should mostly be stateless, if you try to
>>> read an invalid address, you get an error.  If you then try to read a
>>> valid address, I don't see why that wouldn't work.
>> 
>> If the error/exception that occurs the first time is a gdb.MemoryError
>> then I agree with you completely.
>> 
>> The question is about any other exception type.  Imagine this totally
>> made up nonsense example:
>> 
>>   class MyInfo(gdb.disassembler.DisassembleInfo):
>>       def __init__(self, info):
>>           super().__init__(info)
>> 
>>       def read_memory(self, length, offset):
>>           if read_resource_from_somewhere_remote():
>>             return super().read_memory(length, offset)
>>           else:
>>             raise RemoteResourceUnavailableError()
>> 
>> If the disassembler tries to read memory, and for some reason the
>> read_resource_from_somewhere_remote() call fails, we raise a
>> RemoteResourceUnavailableError exception.
>> 
>> My expectation (and my desire for the API) is that the
>> RemoteResourceUnavailableError will _immediately_ be re-raised by the
>> corresponding builtin_disassemble call.
>> 
>> Without the line you are commenting on though, it is possible that the
>> disassembler will repeatedly call the read_memory method, and so
>> repeatedly raise the RemoteResourceUnavailableError error.
>> 
>> I don't know if I'm doing a good job of arguing my position :/ does the
>> above make any sense?
>
> I don't really understand the problem with raising
> RemoteResourceUnavailableError repeatedly, if the disassembler calls
> read_memory repeatedly.
>
> Ah, I might get it now that I read your comment below.  Is that you have
> these stack frames:
>
>  0. gdb Python read_memory stack frames
>  1. opcodes disassembler stack frames
>  2. other gdb stack frames

More like:

  0. User Python code for read_memory,
  1. gdb frame that calls into Python read_memory,
  2. libopcodes disassembler stack frames,
  3. gdb frame that implemented Python builtin_disassemble call,
  4. User Python code that calls builtin_disassemble,
  5. gdb frame that calls into Python disassembler,
  6. other gdb stack frames.

What used to happen:

  - Python exception raised in frame 0,
  - Python exception converted to C++ exception in 1, and thrown,
  - C++ exception caught in 3 and converted back to a Python exception,
  - Python exception appears in frame 4 for the user to handle.

Now what happens:

  - Python exception raised in frame 0,
  - Python exception stored in frame 1, control returns to frame 2 with
    an error code,
  - Control might re-enter frame 1 again, but the stored exception means
    we immediately return with an error code,
  - Eventually the disassembler gives up and returns to frame 3,
  - We spot the stored exception and reinstall it in frame 3,
  - Python exception appears in frame 4 for the user to handle.

This gives the _illusion_ to the user that the exception raised in
read_memory immediately propagated up the stack and was raised as if
from the builtin_disassemble call.

Currently (thought this could change as this stuff has not yet been
released) the documentation says that the _only_ exception frame 1/2
know how to handle is gdb.MemoryError.  Anything else is not handled by
the disassembler.

If we allow frame 2 to keep calling to frame 1/0 even after an unknown
exception has been raised then this just feels weird to me.  Hopefully
the user code will just keep raising the same, or similar exception each
time.  If it doesn't though, then there's the possibility that the
disassembler results will be undefined.  Like you say, it gives the
_impression_ that the disassembler has handled the exception ... but it
really hasn't.

>
> The Python exception is stored in frames 0, and normally restored in
> frames 2.  However, it could be that frames 0 store an exception, then
> control goes back to frames 1, which handle the error return code, but
> somehow think they can recover and so continue and call read_memory
> again.  And so we in frames 0 again, but the packaged Python exception
> from last time is still there.
>
> If so, I think it would be fine to drop the original exception upon
> entering frames 0 again.  It would be as if frames 1 caught the
> exception (if it were a C++ exception) and decided it was safe to
> ignore.  It would be surprising thought for a disassembler to do that.

I agree that it would be surprising for a disassembler to claim to
handle an error that it has, in fact, not handled.

> After getting an error that can't be explained, you'd think they would
> return immediately with an error.

Remember, libopcodes doesn't have a concept of "unexplained" errors.
There's only one error type in libopcode, memory errors.  This whole
mechanism is me attempting to implement the concept of "unexplained"
errors on top of libopcodes.

Initially I did this by bypassing libopcodes completely, but that
doesn't work because libopcode is C code.

So now, I'm trying to overload the memory-error mechanism, i.e. store
the unexplained error, then return the memory-error value.  In most
cases, the disassembler will return as soon as it sees the first
memory-error, so for (I'm guessing) 99% of cases, this whole discussion
is moot.  I'm just keen to lock down the actual behaviour and not leave
it at the whim of the disassembler, just in case we hit that 1% case.

I'll take another pass at patch 2/2 and see if I can silence the
compiler warning about using noexcept in the function type.

Thanks for your continued reviews,

Andrew


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv2 0/3] gdb/python: avoid throwing an exception over libopcodes code
  2022-10-24 12:50 [PATCH] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
  2022-10-24 15:50 ` Simon Marchi
@ 2022-10-27 15:38 ` Andrew Burgess
  2022-10-27 15:38   ` [PATCHv2 1/3] " Andrew Burgess
                     ` (3 more replies)
  1 sibling, 4 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-10-27 15:38 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Andrew Burgess

Newest version of this series, bringing everything together into a
nice V2 series.

Changes since v1:

  - Patch #1 is largely unchanged.  I'm still in discussion with Simon
    as to whether this is the correct solution or not.

  - Patch #2 has grown a little since the first version I posted, I
    found a few extra callbacks which can be marked as noexcept.

  - Patch #3 is new, this adds the noexcept keyword to the typedefs
    used to handle the libopcode callbacks, but only when we compile
    for C++17 or later.

---

Andrew Burgess (3):
  gdb/python: avoid throwing an exception over libopcodes code
  gdb/disasm: mark functions passed to the disassembler noexcept
  gdb: mark disassembler function callback types as noexcept

 gdb/disasm-selftests.c |  5 ++-
 gdb/disasm.c           | 25 ++++++------
 gdb/disasm.h           | 48 ++++++++++++++++------
 gdb/guile/scm-disasm.c |  2 +-
 gdb/python/py-disasm.c | 91 ++++++++++++++++++++++++++++++++----------
 5 files changed, 121 insertions(+), 50 deletions(-)

-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv2 1/3] gdb/python: avoid throwing an exception over libopcodes code
  2022-10-27 15:38 ` [PATCHv2 0/3] " Andrew Burgess
@ 2022-10-27 15:38   ` Andrew Burgess
  2022-11-28 14:39     ` Simon Marchi
  2022-10-27 15:38   ` [PATCHv2 2/3] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Andrew Burgess @ 2022-10-27 15:38 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Andrew Burgess

Bug gdb/29712 identifies a problem with the Python disassembler API.
In some cases GDB will try to throw an exception through the
libopcodes disassembler code, however, not all targets include
exception unwind information when compiling C code, for targets that
don't include this information GDB will terminate when trying to pass
the exception through libopcodes.

To explain what GDB is trying to do, consider the following trivial
use of the Python disassembler API:

  class ExampleDisassembler(gdb.disassembler.Disassembler):

      class MyInfo(gdb.disassembler.DisassembleInfo):
          def __init__(self, info):
              super().__init__(info)

          def read_memory(self, length, offset):
              return super().read_memory(length, offset)

      def __init__(self):
          super().__init__("ExampleDisassembler")

      def __call__(self, info):
          info = self.MyInfo(info)
          return gdb.disassembler.builtin_disassemble(info)

This disassembler doesn't add any value, it defers back to GDB to do
all the actual work, but it serves to allow us to discuss the problem.

The problem occurs when a Python exception is raised by the
MyInfo.read_memory method.  The MyInfo.read_memory method is called
from the C++ function gdbpy_disassembler::read_memory_func.  The C++
stack at the point this function is called looks like this:

  #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
  #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
  #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
  #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
  #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
  #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
  #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
  #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
  #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324

  ### ... snip lots of frames as we pass through Python itself ...

  #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
  #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
  #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
  #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101

So gdbpy_disassembler::read_memory_func is called from the libopcodes
disassembler to read memory, this C++ function then calls into user
supplied Python code to do the work.

If the user supplied Python code raises an gdb.MemoryError exception
indicating the memory read failed, this this is fine.  The C++ code
converts this exception back into a return value that libopcodes can
understand, and returns to libopcodes.

However, if the user supplied Python code raises some other exception,
what we want is for this exception to propagate through GDB and appear
as if raised by the call to gdb.disassembler.builtin_disassemble.  To
achieve this, when gdbpy_disassembler::read_memory_func spots an
unknown Python exception, we must pass the information about this
exception from frame #0 to frame #8 in the above backtrace.  Frame #8
is the C++ implementation of gdb.disassembler.builtin_disassemble, and
so it is this function that we want to re-raise the unknown Python
exception, so the user can, if they want, catch the exception in their
code.

The previous mechanism by which the exception was passed was to pack
the details of the Python exception into a C++ exception, then throw
the exception from frame #0, and catch the exception in frame #8,
unpack the details of the Python exception, and re-raise it.

However, this relies on the exception passing through frames #1 to #7,
some of which are in libopcodes, which is C code, and so, might not be
compiled with exception support.

This commit proposes an alternative solution that does not rely on
throwing a C++ exception.

When we spot an unhandled Python exception in frame #0, we will store
the details of this exception within the gdbpy_disassembler object
currently in use.  Then we return to libopcodes a value indicating
that the memory_read failed.

libopcodes will now continue to disassemble as though that memory read
failed (with one special case described below), then, when we
eventually return to disasmpy_builtin_disassemble we check to see if
there is an exception stored in the gdbpy_disassembler object.  If
there is then this exception can immediately be installed, and then we
return back to Python, when the user will be able to catch the
exception.

There is one extra change in gdbpy_disassembler::read_memory_func.
After the first call that results in an exception being stored on the
gdbpy_disassembler object, any future calls to the ::read_memory_func
function will immediately return as if the read failed.  This avoids
any additional calls into user supplied Python code.

My thinking here is that should the first call fail with some unknown
error, GDB should not keep trying with any additional calls.  This
maintains the illusion that the exception raised from
MyInfo.read_memory is immediately raised by
gdb.disassembler.builtin_disassemble.  I have no tests for this change
though - to trigger this issue would rely on a libopcodes disassembler
that will try to read further memory even after the first failed
read.  I'm not aware of any such disassembler that currently does
this, but that doesn't mean such a disassembler couldn't exist in the
future.

With this change in place the gdb.python/py-disasm.exp test should now
pass on AArch64.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29712
---
 gdb/python/py-disasm.c | 79 +++++++++++++++++++++++++++++++++---------
 1 file changed, 63 insertions(+), 16 deletions(-)

diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
index c37452fcf72..1d460997831 100644
--- a/gdb/python/py-disasm.c
+++ b/gdb/python/py-disasm.c
@@ -122,6 +122,28 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
     return m_string_file.release ();
   }
 
+  /* If there is a Python exception stored in this disassembler then
+     restore it (i.e. set the PyErr_* state), clear the exception within
+     this disassembler, and return true.  There must be no current
+     exception set (i.e. !PyErr_Occurred()) when this function is called,
+     as any such exception might get lost.
+
+     Otherwise, there is no exception stored in this disassembler, return
+     false.  */
+  bool restore_exception ()
+  {
+    gdb_assert (!PyErr_Occurred ());
+    if (m_stored_exception.has_value ())
+      {
+	gdbpy_err_fetch ex = std::move (*m_stored_exception);
+	m_stored_exception.reset ();
+	ex.restore ();
+	return true;
+      }
+
+    return false;
+  }
+
 private:
 
   /* Where the disassembler result is written.  */
@@ -138,6 +160,25 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
      memory source object then a pointer to the object is placed in here,
      otherwise, this field is nullptr.  */
   PyObject *m_memory_source;
+
+  /* Move the exception EX into this disassembler object.  */
+  void store_exception (gdbpy_err_fetch &&ex)
+  {
+    /* The only calls to store_exception are from read_memory_func, which
+       will return early if there's already an exception stored.  */
+    gdb_assert (!m_stored_exception.has_value ());
+    m_stored_exception.emplace (std::move (ex));
+  }
+
+  /* Return true if there is an exception stored in this disassembler.  */
+  bool has_stored_exception () const
+  {
+    return m_stored_exception.has_value ();
+  }
+
+  /* Store a single exception.  This is used to pass Python exceptions back
+     from ::memory_read to disasmpy_builtin_disassemble.  */
+  gdb::optional<gdbpy_err_fetch> m_stored_exception;
 };
 
 /* Return true if OBJ is still valid, otherwise, return false.  A valid OBJ
@@ -288,20 +329,15 @@ disasmpy_builtin_disassemble (PyObject *self, PyObject *args, PyObject *kw)
      the disassembled instruction, or -1 if there was a memory-error
      encountered while disassembling.  See below more more details on
      handling of -1 return value.  */
-  int length;
-  try
-    {
-      length = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address,
+  int length = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address,
 				   disassembler.disasm_info ());
-    }
-  catch (gdbpy_err_fetch &pyerr)
-    {
-      /* Reinstall the Python exception held in PYERR.  This clears to
-	 pointers held in PYERR, hence the need to catch as a non-const
-	 reference.  */
-      pyerr.restore ();
-      return nullptr;
-    }
+
+  /* It is possible that, while calling a user overridden memory read
+     function, a Python exception was raised that couldn't be
+     translated into a standard memory-error.  In this case the first such
+     exception is stored in the disassembler and restored here.  */
+  if (disassembler.restore_exception ())
+    return nullptr;
 
   if (length == -1)
     {
@@ -483,6 +519,14 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
     = static_cast<gdbpy_disassembler *> (info->application_data);
   disasm_info_object *obj = dis->py_disasm_info ();
 
+  /* If a previous read attempt resulted in an exception, then we don't
+     allow any further reads to succeed.  We only do this check for the
+     read_memory_func as this is the only one the user can hook into,
+     thus, this check prevents us calling back into user code if a
+     previous call has already thrown an error.  */
+  if (dis->has_stored_exception ())
+    return -1;
+
   /* The DisassembleInfo.read_memory method expects an offset from the
      address stored within the DisassembleInfo object; calculate that
      offset here.  */
@@ -513,7 +557,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 	 exception and throw it, this will then be caught in
 	 disasmpy_builtin_disassemble, at which point the exception will be
 	 restored.  */
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Convert the result to a buffer.  */
@@ -523,7 +568,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
     {
       PyErr_Format (PyExc_TypeError,
 		    _("Result from read_memory is not a buffer"));
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Wrap PY_BUFF so that it is cleaned up correctly at the end of this
@@ -536,7 +582,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
       PyErr_Format (PyExc_ValueError,
 		    _("Buffer returned from read_memory is sized %d instead of the expected %d"),
 		    py_buff.len, len);
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Copy the data out of the Python buffer and return success.  */
-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv2 2/3] gdb/disasm: mark functions passed to the disassembler noexcept
  2022-10-27 15:38 ` [PATCHv2 0/3] " Andrew Burgess
  2022-10-27 15:38   ` [PATCHv2 1/3] " Andrew Burgess
@ 2022-10-27 15:38   ` Andrew Burgess
  2022-10-27 15:38   ` [PATCHv2 3/3] gdb: mark disassembler function callback types as noexcept Andrew Burgess
  2022-11-18 16:57   ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
  3 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-10-27 15:38 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Andrew Burgess

While working on another patch, Simon pointed out that GDB could be
improved by marking the functions passed to the disassembler as
noexcept.

  https://sourceware.org/pipermail/gdb-patches/2022-October/193084.html

The reason this is important is the on some hosts, libopcodes, being C
code, will not be compiled with support for handling exceptions.  As
such, an attempt to throw an exception over libopcodes code will cause
GDB to terminate.

See bug gdb/29712 for an example of when this happened.

In this commit all the functions that are passed to the disassembler,
and which might be used as callbacks by libopcodes are marked
noexcept.

Ideally, I would have liked to change these typedefs:

  using read_memory_ftype = decltype (disassemble_info::read_memory_func);
  using memory_error_ftype = decltype (disassemble_info::memory_error_func);
  using print_address_ftype = decltype (disassemble_info::print_address_func);
  using fprintf_ftype = decltype (disassemble_info::fprintf_func);
  using fprintf_styled_ftype = decltype (disassemble_info::fprintf_styled_func);

which are declared in disasm.h, as including the noexcept keyword.
However, when I tried this, I ran into this warning/error:

  In file included from ../../src/gdb/disasm.c:25:
  ../../src/gdb/disasm.h: In constructor ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’:
  ../../src/gdb/disasm.h:116:3: error: mangled name for ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’ will change in C++17 because the exception specification is part of a function type [-Werror=noexcept-type]
    116 |   gdb_printing_disassembler (struct gdbarch *gdbarch,
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~

So I've left that change out.  This does mean that if somebody adds a
new use of the disassembler classes in the future, and forgets to mark
the callbacks as noexcept, this will compile fine.  We'll just have to
manually check for that during review.
---
 gdb/disasm-selftests.c |  5 +++--
 gdb/disasm.c           | 25 +++++++++++++------------
 gdb/disasm.h           | 22 ++++++++++++++--------
 gdb/guile/scm-disasm.c |  2 +-
 gdb/python/py-disasm.c | 12 ++++++------
 5 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/gdb/disasm-selftests.c b/gdb/disasm-selftests.c
index db2d1e0ac59..10df1a1980d 100644
--- a/gdb/disasm-selftests.c
+++ b/gdb/disasm-selftests.c
@@ -234,7 +234,8 @@ print_one_insn_test (struct gdbarch *gdbarch)
     size_t m_len;
 
     static int read_memory (bfd_vma memaddr, gdb_byte *myaddr,
-			    unsigned int len, struct disassemble_info *info)
+			    unsigned int len,
+			    struct disassemble_info *info) noexcept
     {
       gdb_disassembler_test *self
 	= static_cast<gdb_disassembler_test *>(info->application_data);
@@ -296,7 +297,7 @@ memory_error_test (struct gdbarch *gdbarch)
 
     static int read_memory (bfd_vma memaddr, gdb_byte *myaddr,
 			    unsigned int len,
-			    struct disassemble_info *info)
+			    struct disassemble_info *info) noexcept
     {
       /* Always return an error.  */
       return -1;
diff --git a/gdb/disasm.c b/gdb/disasm.c
index 60f95c398a9..e64cf69b250 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -191,7 +191,7 @@ line_has_code_p (htab_t table, struct symtab *symtab, int line)
 int
 gdb_disassembler_memory_reader::dis_asm_read_memory
   (bfd_vma memaddr, gdb_byte *myaddr, unsigned int len,
-   struct disassemble_info *info)
+   struct disassemble_info *info) noexcept
 {
   return target_read_code (memaddr, myaddr, len);
 }
@@ -199,8 +199,8 @@ gdb_disassembler_memory_reader::dis_asm_read_memory
 /* Wrapper of memory_error.  */
 
 void
-gdb_disassembler::dis_asm_memory_error (int err, bfd_vma memaddr,
-					struct disassemble_info *info)
+gdb_disassembler::dis_asm_memory_error
+  (int err, bfd_vma memaddr, struct disassemble_info *info) noexcept
 {
   gdb_disassembler *self
     = static_cast<gdb_disassembler *>(info->application_data);
@@ -211,8 +211,8 @@ gdb_disassembler::dis_asm_memory_error (int err, bfd_vma memaddr,
 /* Wrapper of print_address.  */
 
 void
-gdb_disassembler::dis_asm_print_address (bfd_vma addr,
-					 struct disassemble_info *info)
+gdb_disassembler::dis_asm_print_address
+  (bfd_vma addr, struct disassemble_info *info) noexcept
 {
   gdb_disassembler *self
     = static_cast<gdb_disassembler *>(info->application_data);
@@ -256,7 +256,7 @@ gdb_printing_disassembler::stream_from_gdb_disassemble_info (void *dis_info)
 
 int
 gdb_printing_disassembler::fprintf_func (void *dis_info,
-					 const char *format, ...)
+					 const char *format, ...) noexcept
 {
   ui_file *stream = stream_from_gdb_disassemble_info (dis_info);
 
@@ -272,9 +272,9 @@ gdb_printing_disassembler::fprintf_func (void *dis_info,
 /* See disasm.h.  */
 
 int
-gdb_printing_disassembler::fprintf_styled_func (void *dis_info,
-						enum disassembler_style style,
-						const char *format, ...)
+gdb_printing_disassembler::fprintf_styled_func
+  (void *dis_info, enum disassembler_style style,
+   const char *format, ...) noexcept
 {
   ui_file *stream = stream_from_gdb_disassemble_info (dis_info);
   gdb_printing_disassembler *dis = (gdb_printing_disassembler *) dis_info;
@@ -1220,8 +1220,8 @@ gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR addr)
 /* See disasm.h.  */
 
 int
-gdb_non_printing_disassembler::null_fprintf_func (void *stream,
-						  const char *format, ...)
+gdb_non_printing_disassembler::null_fprintf_func
+  (void *stream, const char *format, ...) noexcept
 {
   return 0;
 }
@@ -1230,7 +1230,8 @@ gdb_non_printing_disassembler::null_fprintf_func (void *stream,
 
 int
 gdb_non_printing_disassembler::null_fprintf_styled_func
-  (void *stream, enum disassembler_style style, const char *format, ...)
+  (void *stream, enum disassembler_style style,
+   const char *format, ...) noexcept
 {
   return 0;
 }
diff --git a/gdb/disasm.h b/gdb/disasm.h
index b7d16744c20..58c6c623098 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -51,7 +51,13 @@ struct gdb_disassemble_info
 
 protected:
 
-  /* Types for the function callbacks within m_di.  */
+  /* Types for the function callbacks within m_di.  It would be nice if
+     these function types were all defined to include the noexcept
+     keyword, as every implementation of these must be noexcept.  However,
+     using noexcept within a function typedef like this is a C++17
+     feature, trying to do this for earlier C++ versions results in a
+     warning from GCC/Clang, and the noexcept isn't checked.  After we
+     move to C++17 these should be updated to add noexcept.  */
   using read_memory_ftype = decltype (disassemble_info::read_memory_func);
   using memory_error_ftype = decltype (disassemble_info::memory_error_func);
   using print_address_ftype = decltype (disassemble_info::print_address_func);
@@ -127,7 +133,7 @@ struct gdb_printing_disassembler : public gdb_disassemble_info
   /* Callback used as the disassemble_info's fprintf_func callback.  The
      DIS_INFO pointer is a pointer to a gdb_printing_disassembler object.
      Content is written to the m_stream extracted from DIS_INFO.  */
-  static int fprintf_func (void *dis_info, const char *format, ...)
+  static int fprintf_func (void *dis_info, const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(2,3);
 
   /* Callback used as the disassemble_info's fprintf_styled_func callback.
@@ -135,7 +141,7 @@ struct gdb_printing_disassembler : public gdb_disassemble_info
      object.  Content is written to the m_stream extracted from DIS_INFO.  */
   static int fprintf_styled_func (void *dis_info,
 				  enum disassembler_style style,
-				  const char *format, ...)
+				  const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(3,4);
 
   /* Return true if the disassembler is considered inside a comment, false
@@ -187,14 +193,14 @@ struct gdb_non_printing_disassembler : public gdb_disassemble_info
 
   /* Callback used as the disassemble_info's fprintf_func callback, this
      doesn't write anything to STREAM, but just returns 0.  */
-  static int null_fprintf_func (void *stream, const char *format, ...)
+  static int null_fprintf_func (void *stream, const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(2,3);
 
   /* Callback used as the disassemble_info's fprintf_styled_func callback,
      , this doesn't write anything to STREAM, but just returns 0.  */
   static int null_fprintf_styled_func (void *stream,
 				       enum disassembler_style style,
-				       const char *format, ...)
+				       const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(3,4);
 };
 
@@ -208,7 +214,7 @@ struct gdb_disassembler_memory_reader
   /* Implements the read_memory_func disassemble_info callback.  */
   static int dis_asm_read_memory (bfd_vma memaddr, gdb_byte *myaddr,
 				  unsigned int len,
-				  struct disassemble_info *info);
+				  struct disassemble_info *info) noexcept;
 };
 
 /* A non-printing disassemble_info management class.  The disassemble_info
@@ -281,9 +287,9 @@ struct gdb_disassembler : public gdb_printing_disassembler,
   static bool use_ext_lang_colorization_p;
 
   static void dis_asm_memory_error (int err, bfd_vma memaddr,
-				    struct disassemble_info *info);
+				    struct disassemble_info *info) noexcept;
   static void dis_asm_print_address (bfd_vma addr,
-				     struct disassemble_info *info);
+				     struct disassemble_info *info) noexcept;
 
   /* Return true if we should use the extension language to apply
      disassembler styling.  This requires disassembler styling to be on
diff --git a/gdb/guile/scm-disasm.c b/gdb/guile/scm-disasm.c
index 992a64b9331..c0957559136 100644
--- a/gdb/guile/scm-disasm.c
+++ b/gdb/guile/scm-disasm.c
@@ -106,7 +106,7 @@ gdbscm_disasm_read_memory_worker (void *datap)
 static int
 gdbscm_disasm_read_memory (bfd_vma memaddr, bfd_byte *myaddr,
 			   unsigned int length,
-			   struct disassemble_info *dinfo)
+			   struct disassemble_info *dinfo) noexcept
 {
   gdbscm_disassembler *self
     = static_cast<gdbscm_disassembler *> (dinfo->application_data);
diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
index 1d460997831..a25252b4306 100644
--- a/gdb/python/py-disasm.c
+++ b/gdb/python/py-disasm.c
@@ -101,12 +101,12 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
 
   /* Callbacks used by disassemble_info.  */
   static void memory_error_func (int status, bfd_vma memaddr,
-				 struct disassemble_info *info);
+				 struct disassemble_info *info) noexcept;
   static void print_address_func (bfd_vma addr,
-				  struct disassemble_info *info);
+				  struct disassemble_info *info) noexcept;
   static int read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 			       unsigned int len,
-			       struct disassemble_info *info);
+			       struct disassemble_info *info) noexcept;
 
   /* Return a reference to an optional that contains the address at which a
      memory error occurred.  The optional will only have a value if a
@@ -513,7 +513,7 @@ disasmpy_info_progspace (PyObject *self, void *closure)
 int
 gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 				      unsigned int len,
-				      struct disassemble_info *info)
+				      struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);
@@ -658,7 +658,7 @@ disasmpy_result_init (PyObject *self, PyObject *args, PyObject *kwargs)
 
 void
 gdbpy_disassembler::memory_error_func (int status, bfd_vma memaddr,
-				       struct disassemble_info *info)
+				       struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);
@@ -669,7 +669,7 @@ gdbpy_disassembler::memory_error_func (int status, bfd_vma memaddr,
 
 void
 gdbpy_disassembler::print_address_func (bfd_vma addr,
-					struct disassemble_info *info)
+					struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);
-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv2 3/3] gdb: mark disassembler function callback types as noexcept
  2022-10-27 15:38 ` [PATCHv2 0/3] " Andrew Burgess
  2022-10-27 15:38   ` [PATCHv2 1/3] " Andrew Burgess
  2022-10-27 15:38   ` [PATCHv2 2/3] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
@ 2022-10-27 15:38   ` Andrew Burgess
  2022-11-18 16:57   ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
  3 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-10-27 15:38 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Andrew Burgess

In disasm.h we define a set of types that are used by the various
disassembler classes to hold callback functions before passing the
callbacks into libopcodes.

Because libopcodes is C code, and on some (many?) targets, C code is
compiled without exception support, it is important that GDB not try
to throw an exception over libopcode code.

In the previous commit all the existing callbacks were marked as
noexcept, however, this doesn't protect us from a future change to GDB
either adding a new callback that is not noexcept, or removing the
noexcept keyword from an existing callback.

In this commit I mark all the callback types as noexcept.  This means
that GDB's disassembler classes will no longer compile if we try to
pass a callback that is not marked as noexcept.

At least, that's the idea.  Unfortunately, it's not that easy.

Before C++17, the noexcept keyword on a function typedef would be
ignored, thus:

  using func_type = void (*) (void) noexcept;

  void
  a_func ()
  {
    throw 123;
  }

  void
  some_func (func_type f)
  {
    f ();
  }

  int
  main ()
  {
    some_func (a_func);
    return 0;
  }

Will compile just fine for C++11 and C++14 with GCC.  Clang on the
other hand complains that 'noexcept' should not appear on function
types, but then does appear to correctly complain that passing a_func
is a mismatch in the set of exceptions that could be thrown.

Switching to C++17 and both GCC and Clang correctly point out that
passing a_func is an invalid conversion relating to the noexcept
keyword.  Changing a_func to:

  void
  a_func () noexcept
  { /* Nothing.  */ }

And for C++17 both GCC and Clang compile this just fine.

My conclusion then is that adding the noexcept keyword to the function
types is pointless while GDB is not compiled as C++17, and silencing
the warnings would require us to jump through a bunch of hoops.

And so, in this commit, I define a macro LIBOPCODE_CALLBACK_NOEXCEPT,
this macro expands to noexcept when compiling for C++17, but otherwise
expands to nothing.  I then add this macro to the function types.

I've compiled GDB as the default C++11 and also forced the compile to
C++17.  When compiling as C++17 I spotted a few additional places
where callbacks needed to be marked noexcept (these fixes were merged
into the previous commit, but this confirmed to be that the macro is
working as expected).
---
 gdb/disasm.h | 40 ++++++++++++++++++++++++++++------------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/gdb/disasm.h b/gdb/disasm.h
index 58c6c623098..d6aec9a4705 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -26,6 +26,12 @@ struct gdbarch;
 struct ui_out;
 struct ui_file;
 
+#if __cplusplus >= 201703L
+#define LIBOPCODE_CALLBACK_NOEXCEPT noexcept
+#else
+#define LIBOPCODE_CALLBACK_NOEXCEPT
+#endif
+
 /* A wrapper around a disassemble_info and a gdbarch.  This is the core
    set of data that all disassembler sub-classes will need.  This class
    doesn't actually implement the disassembling process, that is something
@@ -51,18 +57,28 @@ struct gdb_disassemble_info
 
 protected:
 
-  /* Types for the function callbacks within m_di.  It would be nice if
-     these function types were all defined to include the noexcept
-     keyword, as every implementation of these must be noexcept.  However,
-     using noexcept within a function typedef like this is a C++17
-     feature, trying to do this for earlier C++ versions results in a
-     warning from GCC/Clang, and the noexcept isn't checked.  After we
-     move to C++17 these should be updated to add noexcept.  */
-  using read_memory_ftype = decltype (disassemble_info::read_memory_func);
-  using memory_error_ftype = decltype (disassemble_info::memory_error_func);
-  using print_address_ftype = decltype (disassemble_info::print_address_func);
-  using fprintf_ftype = decltype (disassemble_info::fprintf_func);
-  using fprintf_styled_ftype = decltype (disassemble_info::fprintf_styled_func);
+  /* Types for the function callbacks within m_di.  The actual function
+     signatures here are taken from include/dis-asm.h.  The noexcept macro
+     expands to 'noexcept' for C++17 and later, otherwise, it expands to
+     nothing.  This is because including noexcept was ignored for function
+     types before C++17, but both GCC and Clang warn that the noexcept
+     will become relevant when you switch to C++17, and this warning
+     causes the build to fail.  */
+  using read_memory_ftype
+    = int (*) (bfd_vma, bfd_byte *, unsigned int, struct disassemble_info *)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
+  using memory_error_ftype
+    = void (*) (int, bfd_vma, struct disassemble_info *)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
+  using print_address_ftype
+    = void (*) (bfd_vma, struct disassemble_info *)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
+  using fprintf_ftype
+    = int (*) (void *, const char *, ...)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
+  using fprintf_styled_ftype
+    = int (*) (void *, enum disassembler_style, const char *, ...)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
 
   /* Constructor, many fields in m_di are initialized from GDBARCH.  The
      remaining arguments are function callbacks that are written into m_di.
-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code
  2022-10-27 15:38 ` [PATCHv2 0/3] " Andrew Burgess
                     ` (2 preceding siblings ...)
  2022-10-27 15:38   ` [PATCHv2 3/3] gdb: mark disassembler function callback types as noexcept Andrew Burgess
@ 2022-11-18 16:57   ` Andrew Burgess
  2022-11-18 16:57     ` [PATCHv3 1/3] " Andrew Burgess
                       ` (3 more replies)
  3 siblings, 4 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-11-18 16:57 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Changes since v2:

  - Rebase onto current upstream/master, no merged conflicts.

Changes since v1:

  - Patch #1 is largely unchanged.  I'm still in discussion with Simon
    as to whether this is the correct solution or not.

  - Patch #2 has grown a little since the first version I posted, I
    found a few extra callbacks which can be marked as noexcept.

  - Patch #3 is new, this adds the noexcept keyword to the typedefs
    used to handle the libopcode callbacks, but only when we compile
    for C++17 or later.

---

Andrew Burgess (3):
  gdb/python: avoid throwing an exception over libopcodes code
  gdb/disasm: mark functions passed to the disassembler noexcept
  gdb: mark disassembler function callback types as noexcept

 gdb/disasm-selftests.c |  5 ++-
 gdb/disasm.c           | 25 ++++++------
 gdb/disasm.h           | 48 ++++++++++++++++------
 gdb/guile/scm-disasm.c |  2 +-
 gdb/python/py-disasm.c | 91 ++++++++++++++++++++++++++++++++----------
 5 files changed, 121 insertions(+), 50 deletions(-)


base-commit: f9f88aede3bb84efd088a59a5f6bccb3a6bb6516
-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv3 1/3] gdb/python: avoid throwing an exception over libopcodes code
  2022-11-18 16:57   ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
@ 2022-11-18 16:57     ` Andrew Burgess
  2022-11-18 16:57     ` [PATCHv3 2/3] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-11-18 16:57 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Bug gdb/29712 identifies a problem with the Python disassembler API.
In some cases GDB will try to throw an exception through the
libopcodes disassembler code, however, not all targets include
exception unwind information when compiling C code, for targets that
don't include this information GDB will terminate when trying to pass
the exception through libopcodes.

To explain what GDB is trying to do, consider the following trivial
use of the Python disassembler API:

  class ExampleDisassembler(gdb.disassembler.Disassembler):

      class MyInfo(gdb.disassembler.DisassembleInfo):
          def __init__(self, info):
              super().__init__(info)

          def read_memory(self, length, offset):
              return super().read_memory(length, offset)

      def __init__(self):
          super().__init__("ExampleDisassembler")

      def __call__(self, info):
          info = self.MyInfo(info)
          return gdb.disassembler.builtin_disassemble(info)

This disassembler doesn't add any value, it defers back to GDB to do
all the actual work, but it serves to allow us to discuss the problem.

The problem occurs when a Python exception is raised by the
MyInfo.read_memory method.  The MyInfo.read_memory method is called
from the C++ function gdbpy_disassembler::read_memory_func.  The C++
stack at the point this function is called looks like this:

  #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
  #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
  #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
  #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
  #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
  #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
  #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
  #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
  #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324

  ### ... snip lots of frames as we pass through Python itself ...

  #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
  #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
  #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
  #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101

So gdbpy_disassembler::read_memory_func is called from the libopcodes
disassembler to read memory, this C++ function then calls into user
supplied Python code to do the work.

If the user supplied Python code raises an gdb.MemoryError exception
indicating the memory read failed, this this is fine.  The C++ code
converts this exception back into a return value that libopcodes can
understand, and returns to libopcodes.

However, if the user supplied Python code raises some other exception,
what we want is for this exception to propagate through GDB and appear
as if raised by the call to gdb.disassembler.builtin_disassemble.  To
achieve this, when gdbpy_disassembler::read_memory_func spots an
unknown Python exception, we must pass the information about this
exception from frame #0 to frame #8 in the above backtrace.  Frame #8
is the C++ implementation of gdb.disassembler.builtin_disassemble, and
so it is this function that we want to re-raise the unknown Python
exception, so the user can, if they want, catch the exception in their
code.

The previous mechanism by which the exception was passed was to pack
the details of the Python exception into a C++ exception, then throw
the exception from frame #0, and catch the exception in frame #8,
unpack the details of the Python exception, and re-raise it.

However, this relies on the exception passing through frames #1 to #7,
some of which are in libopcodes, which is C code, and so, might not be
compiled with exception support.

This commit proposes an alternative solution that does not rely on
throwing a C++ exception.

When we spot an unhandled Python exception in frame #0, we will store
the details of this exception within the gdbpy_disassembler object
currently in use.  Then we return to libopcodes a value indicating
that the memory_read failed.

libopcodes will now continue to disassemble as though that memory read
failed (with one special case described below), then, when we
eventually return to disasmpy_builtin_disassemble we check to see if
there is an exception stored in the gdbpy_disassembler object.  If
there is then this exception can immediately be installed, and then we
return back to Python, when the user will be able to catch the
exception.

There is one extra change in gdbpy_disassembler::read_memory_func.
After the first call that results in an exception being stored on the
gdbpy_disassembler object, any future calls to the ::read_memory_func
function will immediately return as if the read failed.  This avoids
any additional calls into user supplied Python code.

My thinking here is that should the first call fail with some unknown
error, GDB should not keep trying with any additional calls.  This
maintains the illusion that the exception raised from
MyInfo.read_memory is immediately raised by
gdb.disassembler.builtin_disassemble.  I have no tests for this change
though - to trigger this issue would rely on a libopcodes disassembler
that will try to read further memory even after the first failed
read.  I'm not aware of any such disassembler that currently does
this, but that doesn't mean such a disassembler couldn't exist in the
future.

With this change in place the gdb.python/py-disasm.exp test should now
pass on AArch64.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29712
---
 gdb/python/py-disasm.c | 79 +++++++++++++++++++++++++++++++++---------
 1 file changed, 63 insertions(+), 16 deletions(-)

diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
index c37452fcf72..1d460997831 100644
--- a/gdb/python/py-disasm.c
+++ b/gdb/python/py-disasm.c
@@ -122,6 +122,28 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
     return m_string_file.release ();
   }
 
+  /* If there is a Python exception stored in this disassembler then
+     restore it (i.e. set the PyErr_* state), clear the exception within
+     this disassembler, and return true.  There must be no current
+     exception set (i.e. !PyErr_Occurred()) when this function is called,
+     as any such exception might get lost.
+
+     Otherwise, there is no exception stored in this disassembler, return
+     false.  */
+  bool restore_exception ()
+  {
+    gdb_assert (!PyErr_Occurred ());
+    if (m_stored_exception.has_value ())
+      {
+	gdbpy_err_fetch ex = std::move (*m_stored_exception);
+	m_stored_exception.reset ();
+	ex.restore ();
+	return true;
+      }
+
+    return false;
+  }
+
 private:
 
   /* Where the disassembler result is written.  */
@@ -138,6 +160,25 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
      memory source object then a pointer to the object is placed in here,
      otherwise, this field is nullptr.  */
   PyObject *m_memory_source;
+
+  /* Move the exception EX into this disassembler object.  */
+  void store_exception (gdbpy_err_fetch &&ex)
+  {
+    /* The only calls to store_exception are from read_memory_func, which
+       will return early if there's already an exception stored.  */
+    gdb_assert (!m_stored_exception.has_value ());
+    m_stored_exception.emplace (std::move (ex));
+  }
+
+  /* Return true if there is an exception stored in this disassembler.  */
+  bool has_stored_exception () const
+  {
+    return m_stored_exception.has_value ();
+  }
+
+  /* Store a single exception.  This is used to pass Python exceptions back
+     from ::memory_read to disasmpy_builtin_disassemble.  */
+  gdb::optional<gdbpy_err_fetch> m_stored_exception;
 };
 
 /* Return true if OBJ is still valid, otherwise, return false.  A valid OBJ
@@ -288,20 +329,15 @@ disasmpy_builtin_disassemble (PyObject *self, PyObject *args, PyObject *kw)
      the disassembled instruction, or -1 if there was a memory-error
      encountered while disassembling.  See below more more details on
      handling of -1 return value.  */
-  int length;
-  try
-    {
-      length = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address,
+  int length = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address,
 				   disassembler.disasm_info ());
-    }
-  catch (gdbpy_err_fetch &pyerr)
-    {
-      /* Reinstall the Python exception held in PYERR.  This clears to
-	 pointers held in PYERR, hence the need to catch as a non-const
-	 reference.  */
-      pyerr.restore ();
-      return nullptr;
-    }
+
+  /* It is possible that, while calling a user overridden memory read
+     function, a Python exception was raised that couldn't be
+     translated into a standard memory-error.  In this case the first such
+     exception is stored in the disassembler and restored here.  */
+  if (disassembler.restore_exception ())
+    return nullptr;
 
   if (length == -1)
     {
@@ -483,6 +519,14 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
     = static_cast<gdbpy_disassembler *> (info->application_data);
   disasm_info_object *obj = dis->py_disasm_info ();
 
+  /* If a previous read attempt resulted in an exception, then we don't
+     allow any further reads to succeed.  We only do this check for the
+     read_memory_func as this is the only one the user can hook into,
+     thus, this check prevents us calling back into user code if a
+     previous call has already thrown an error.  */
+  if (dis->has_stored_exception ())
+    return -1;
+
   /* The DisassembleInfo.read_memory method expects an offset from the
      address stored within the DisassembleInfo object; calculate that
      offset here.  */
@@ -513,7 +557,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 	 exception and throw it, this will then be caught in
 	 disasmpy_builtin_disassemble, at which point the exception will be
 	 restored.  */
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Convert the result to a buffer.  */
@@ -523,7 +568,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
     {
       PyErr_Format (PyExc_TypeError,
 		    _("Result from read_memory is not a buffer"));
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Wrap PY_BUFF so that it is cleaned up correctly at the end of this
@@ -536,7 +582,8 @@ gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
       PyErr_Format (PyExc_ValueError,
 		    _("Buffer returned from read_memory is sized %d instead of the expected %d"),
 		    py_buff.len, len);
-      throw gdbpy_err_fetch ();
+      dis->store_exception (gdbpy_err_fetch ());
+      return -1;
     }
 
   /* Copy the data out of the Python buffer and return success.  */
-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv3 2/3] gdb/disasm: mark functions passed to the disassembler noexcept
  2022-11-18 16:57   ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
  2022-11-18 16:57     ` [PATCHv3 1/3] " Andrew Burgess
@ 2022-11-18 16:57     ` Andrew Burgess
  2022-11-18 16:57     ` [PATCHv3 3/3] gdb: mark disassembler function callback types as noexcept Andrew Burgess
  2022-11-28  8:35     ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Tom de Vries
  3 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-11-18 16:57 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

While working on another patch, Simon pointed out that GDB could be
improved by marking the functions passed to the disassembler as
noexcept.

  https://sourceware.org/pipermail/gdb-patches/2022-October/193084.html

The reason this is important is the on some hosts, libopcodes, being C
code, will not be compiled with support for handling exceptions.  As
such, an attempt to throw an exception over libopcodes code will cause
GDB to terminate.

See bug gdb/29712 for an example of when this happened.

In this commit all the functions that are passed to the disassembler,
and which might be used as callbacks by libopcodes are marked
noexcept.

Ideally, I would have liked to change these typedefs:

  using read_memory_ftype = decltype (disassemble_info::read_memory_func);
  using memory_error_ftype = decltype (disassemble_info::memory_error_func);
  using print_address_ftype = decltype (disassemble_info::print_address_func);
  using fprintf_ftype = decltype (disassemble_info::fprintf_func);
  using fprintf_styled_ftype = decltype (disassemble_info::fprintf_styled_func);

which are declared in disasm.h, as including the noexcept keyword.
However, when I tried this, I ran into this warning/error:

  In file included from ../../src/gdb/disasm.c:25:
  ../../src/gdb/disasm.h: In constructor ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’:
  ../../src/gdb/disasm.h:116:3: error: mangled name for ‘gdb_printing_disassembler::gdb_printing_disassembler(gdbarch*, ui_file*, gdb_disassemble_info::read_memory_ftype, gdb_disassemble_info::memory_error_ftype, gdb_disassemble_info::print_address_ftype)’ will change in C++17 because the exception specification is part of a function type [-Werror=noexcept-type]
    116 |   gdb_printing_disassembler (struct gdbarch *gdbarch,
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~

So I've left that change out.  This does mean that if somebody adds a
new use of the disassembler classes in the future, and forgets to mark
the callbacks as noexcept, this will compile fine.  We'll just have to
manually check for that during review.
---
 gdb/disasm-selftests.c |  5 +++--
 gdb/disasm.c           | 25 +++++++++++++------------
 gdb/disasm.h           | 22 ++++++++++++++--------
 gdb/guile/scm-disasm.c |  2 +-
 gdb/python/py-disasm.c | 12 ++++++------
 5 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/gdb/disasm-selftests.c b/gdb/disasm-selftests.c
index db2d1e0ac59..10df1a1980d 100644
--- a/gdb/disasm-selftests.c
+++ b/gdb/disasm-selftests.c
@@ -234,7 +234,8 @@ print_one_insn_test (struct gdbarch *gdbarch)
     size_t m_len;
 
     static int read_memory (bfd_vma memaddr, gdb_byte *myaddr,
-			    unsigned int len, struct disassemble_info *info)
+			    unsigned int len,
+			    struct disassemble_info *info) noexcept
     {
       gdb_disassembler_test *self
 	= static_cast<gdb_disassembler_test *>(info->application_data);
@@ -296,7 +297,7 @@ memory_error_test (struct gdbarch *gdbarch)
 
     static int read_memory (bfd_vma memaddr, gdb_byte *myaddr,
 			    unsigned int len,
-			    struct disassemble_info *info)
+			    struct disassemble_info *info) noexcept
     {
       /* Always return an error.  */
       return -1;
diff --git a/gdb/disasm.c b/gdb/disasm.c
index 60f95c398a9..e64cf69b250 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -191,7 +191,7 @@ line_has_code_p (htab_t table, struct symtab *symtab, int line)
 int
 gdb_disassembler_memory_reader::dis_asm_read_memory
   (bfd_vma memaddr, gdb_byte *myaddr, unsigned int len,
-   struct disassemble_info *info)
+   struct disassemble_info *info) noexcept
 {
   return target_read_code (memaddr, myaddr, len);
 }
@@ -199,8 +199,8 @@ gdb_disassembler_memory_reader::dis_asm_read_memory
 /* Wrapper of memory_error.  */
 
 void
-gdb_disassembler::dis_asm_memory_error (int err, bfd_vma memaddr,
-					struct disassemble_info *info)
+gdb_disassembler::dis_asm_memory_error
+  (int err, bfd_vma memaddr, struct disassemble_info *info) noexcept
 {
   gdb_disassembler *self
     = static_cast<gdb_disassembler *>(info->application_data);
@@ -211,8 +211,8 @@ gdb_disassembler::dis_asm_memory_error (int err, bfd_vma memaddr,
 /* Wrapper of print_address.  */
 
 void
-gdb_disassembler::dis_asm_print_address (bfd_vma addr,
-					 struct disassemble_info *info)
+gdb_disassembler::dis_asm_print_address
+  (bfd_vma addr, struct disassemble_info *info) noexcept
 {
   gdb_disassembler *self
     = static_cast<gdb_disassembler *>(info->application_data);
@@ -256,7 +256,7 @@ gdb_printing_disassembler::stream_from_gdb_disassemble_info (void *dis_info)
 
 int
 gdb_printing_disassembler::fprintf_func (void *dis_info,
-					 const char *format, ...)
+					 const char *format, ...) noexcept
 {
   ui_file *stream = stream_from_gdb_disassemble_info (dis_info);
 
@@ -272,9 +272,9 @@ gdb_printing_disassembler::fprintf_func (void *dis_info,
 /* See disasm.h.  */
 
 int
-gdb_printing_disassembler::fprintf_styled_func (void *dis_info,
-						enum disassembler_style style,
-						const char *format, ...)
+gdb_printing_disassembler::fprintf_styled_func
+  (void *dis_info, enum disassembler_style style,
+   const char *format, ...) noexcept
 {
   ui_file *stream = stream_from_gdb_disassemble_info (dis_info);
   gdb_printing_disassembler *dis = (gdb_printing_disassembler *) dis_info;
@@ -1220,8 +1220,8 @@ gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR addr)
 /* See disasm.h.  */
 
 int
-gdb_non_printing_disassembler::null_fprintf_func (void *stream,
-						  const char *format, ...)
+gdb_non_printing_disassembler::null_fprintf_func
+  (void *stream, const char *format, ...) noexcept
 {
   return 0;
 }
@@ -1230,7 +1230,8 @@ gdb_non_printing_disassembler::null_fprintf_func (void *stream,
 
 int
 gdb_non_printing_disassembler::null_fprintf_styled_func
-  (void *stream, enum disassembler_style style, const char *format, ...)
+  (void *stream, enum disassembler_style style,
+   const char *format, ...) noexcept
 {
   return 0;
 }
diff --git a/gdb/disasm.h b/gdb/disasm.h
index b7d16744c20..58c6c623098 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -51,7 +51,13 @@ struct gdb_disassemble_info
 
 protected:
 
-  /* Types for the function callbacks within m_di.  */
+  /* Types for the function callbacks within m_di.  It would be nice if
+     these function types were all defined to include the noexcept
+     keyword, as every implementation of these must be noexcept.  However,
+     using noexcept within a function typedef like this is a C++17
+     feature, trying to do this for earlier C++ versions results in a
+     warning from GCC/Clang, and the noexcept isn't checked.  After we
+     move to C++17 these should be updated to add noexcept.  */
   using read_memory_ftype = decltype (disassemble_info::read_memory_func);
   using memory_error_ftype = decltype (disassemble_info::memory_error_func);
   using print_address_ftype = decltype (disassemble_info::print_address_func);
@@ -127,7 +133,7 @@ struct gdb_printing_disassembler : public gdb_disassemble_info
   /* Callback used as the disassemble_info's fprintf_func callback.  The
      DIS_INFO pointer is a pointer to a gdb_printing_disassembler object.
      Content is written to the m_stream extracted from DIS_INFO.  */
-  static int fprintf_func (void *dis_info, const char *format, ...)
+  static int fprintf_func (void *dis_info, const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(2,3);
 
   /* Callback used as the disassemble_info's fprintf_styled_func callback.
@@ -135,7 +141,7 @@ struct gdb_printing_disassembler : public gdb_disassemble_info
      object.  Content is written to the m_stream extracted from DIS_INFO.  */
   static int fprintf_styled_func (void *dis_info,
 				  enum disassembler_style style,
-				  const char *format, ...)
+				  const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(3,4);
 
   /* Return true if the disassembler is considered inside a comment, false
@@ -187,14 +193,14 @@ struct gdb_non_printing_disassembler : public gdb_disassemble_info
 
   /* Callback used as the disassemble_info's fprintf_func callback, this
      doesn't write anything to STREAM, but just returns 0.  */
-  static int null_fprintf_func (void *stream, const char *format, ...)
+  static int null_fprintf_func (void *stream, const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(2,3);
 
   /* Callback used as the disassemble_info's fprintf_styled_func callback,
      , this doesn't write anything to STREAM, but just returns 0.  */
   static int null_fprintf_styled_func (void *stream,
 				       enum disassembler_style style,
-				       const char *format, ...)
+				       const char *format, ...) noexcept
     ATTRIBUTE_PRINTF(3,4);
 };
 
@@ -208,7 +214,7 @@ struct gdb_disassembler_memory_reader
   /* Implements the read_memory_func disassemble_info callback.  */
   static int dis_asm_read_memory (bfd_vma memaddr, gdb_byte *myaddr,
 				  unsigned int len,
-				  struct disassemble_info *info);
+				  struct disassemble_info *info) noexcept;
 };
 
 /* A non-printing disassemble_info management class.  The disassemble_info
@@ -281,9 +287,9 @@ struct gdb_disassembler : public gdb_printing_disassembler,
   static bool use_ext_lang_colorization_p;
 
   static void dis_asm_memory_error (int err, bfd_vma memaddr,
-				    struct disassemble_info *info);
+				    struct disassemble_info *info) noexcept;
   static void dis_asm_print_address (bfd_vma addr,
-				     struct disassemble_info *info);
+				     struct disassemble_info *info) noexcept;
 
   /* Return true if we should use the extension language to apply
      disassembler styling.  This requires disassembler styling to be on
diff --git a/gdb/guile/scm-disasm.c b/gdb/guile/scm-disasm.c
index 992a64b9331..c0957559136 100644
--- a/gdb/guile/scm-disasm.c
+++ b/gdb/guile/scm-disasm.c
@@ -106,7 +106,7 @@ gdbscm_disasm_read_memory_worker (void *datap)
 static int
 gdbscm_disasm_read_memory (bfd_vma memaddr, bfd_byte *myaddr,
 			   unsigned int length,
-			   struct disassemble_info *dinfo)
+			   struct disassemble_info *dinfo) noexcept
 {
   gdbscm_disassembler *self
     = static_cast<gdbscm_disassembler *> (dinfo->application_data);
diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
index 1d460997831..a25252b4306 100644
--- a/gdb/python/py-disasm.c
+++ b/gdb/python/py-disasm.c
@@ -101,12 +101,12 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
 
   /* Callbacks used by disassemble_info.  */
   static void memory_error_func (int status, bfd_vma memaddr,
-				 struct disassemble_info *info);
+				 struct disassemble_info *info) noexcept;
   static void print_address_func (bfd_vma addr,
-				  struct disassemble_info *info);
+				  struct disassemble_info *info) noexcept;
   static int read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 			       unsigned int len,
-			       struct disassemble_info *info);
+			       struct disassemble_info *info) noexcept;
 
   /* Return a reference to an optional that contains the address at which a
      memory error occurred.  The optional will only have a value if a
@@ -513,7 +513,7 @@ disasmpy_info_progspace (PyObject *self, void *closure)
 int
 gdbpy_disassembler::read_memory_func (bfd_vma memaddr, gdb_byte *buff,
 				      unsigned int len,
-				      struct disassemble_info *info)
+				      struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);
@@ -658,7 +658,7 @@ disasmpy_result_init (PyObject *self, PyObject *args, PyObject *kwargs)
 
 void
 gdbpy_disassembler::memory_error_func (int status, bfd_vma memaddr,
-				       struct disassemble_info *info)
+				       struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);
@@ -669,7 +669,7 @@ gdbpy_disassembler::memory_error_func (int status, bfd_vma memaddr,
 
 void
 gdbpy_disassembler::print_address_func (bfd_vma addr,
-					struct disassemble_info *info)
+					struct disassemble_info *info) noexcept
 {
   gdbpy_disassembler *dis
     = static_cast<gdbpy_disassembler *> (info->application_data);
-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv3 3/3] gdb: mark disassembler function callback types as noexcept
  2022-11-18 16:57   ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
  2022-11-18 16:57     ` [PATCHv3 1/3] " Andrew Burgess
  2022-11-18 16:57     ` [PATCHv3 2/3] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
@ 2022-11-18 16:57     ` Andrew Burgess
  2022-11-28  8:35     ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Tom de Vries
  3 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-11-18 16:57 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

In disasm.h we define a set of types that are used by the various
disassembler classes to hold callback functions before passing the
callbacks into libopcodes.

Because libopcodes is C code, and on some (many?) targets, C code is
compiled without exception support, it is important that GDB not try
to throw an exception over libopcode code.

In the previous commit all the existing callbacks were marked as
noexcept, however, this doesn't protect us from a future change to GDB
either adding a new callback that is not noexcept, or removing the
noexcept keyword from an existing callback.

In this commit I mark all the callback types as noexcept.  This means
that GDB's disassembler classes will no longer compile if we try to
pass a callback that is not marked as noexcept.

At least, that's the idea.  Unfortunately, it's not that easy.

Before C++17, the noexcept keyword on a function typedef would be
ignored, thus:

  using func_type = void (*) (void) noexcept;

  void
  a_func ()
  {
    throw 123;
  }

  void
  some_func (func_type f)
  {
    f ();
  }

  int
  main ()
  {
    some_func (a_func);
    return 0;
  }

Will compile just fine for C++11 and C++14 with GCC.  Clang on the
other hand complains that 'noexcept' should not appear on function
types, but then does appear to correctly complain that passing a_func
is a mismatch in the set of exceptions that could be thrown.

Switching to C++17 and both GCC and Clang correctly point out that
passing a_func is an invalid conversion relating to the noexcept
keyword.  Changing a_func to:

  void
  a_func () noexcept
  { /* Nothing.  */ }

And for C++17 both GCC and Clang compile this just fine.

My conclusion then is that adding the noexcept keyword to the function
types is pointless while GDB is not compiled as C++17, and silencing
the warnings would require us to jump through a bunch of hoops.

And so, in this commit, I define a macro LIBOPCODE_CALLBACK_NOEXCEPT,
this macro expands to noexcept when compiling for C++17, but otherwise
expands to nothing.  I then add this macro to the function types.

I've compiled GDB as the default C++11 and also forced the compile to
C++17.  When compiling as C++17 I spotted a few additional places
where callbacks needed to be marked noexcept (these fixes were merged
into the previous commit, but this confirmed to be that the macro is
working as expected).
---
 gdb/disasm.h | 40 ++++++++++++++++++++++++++++------------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/gdb/disasm.h b/gdb/disasm.h
index 58c6c623098..d6aec9a4705 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -26,6 +26,12 @@ struct gdbarch;
 struct ui_out;
 struct ui_file;
 
+#if __cplusplus >= 201703L
+#define LIBOPCODE_CALLBACK_NOEXCEPT noexcept
+#else
+#define LIBOPCODE_CALLBACK_NOEXCEPT
+#endif
+
 /* A wrapper around a disassemble_info and a gdbarch.  This is the core
    set of data that all disassembler sub-classes will need.  This class
    doesn't actually implement the disassembling process, that is something
@@ -51,18 +57,28 @@ struct gdb_disassemble_info
 
 protected:
 
-  /* Types for the function callbacks within m_di.  It would be nice if
-     these function types were all defined to include the noexcept
-     keyword, as every implementation of these must be noexcept.  However,
-     using noexcept within a function typedef like this is a C++17
-     feature, trying to do this for earlier C++ versions results in a
-     warning from GCC/Clang, and the noexcept isn't checked.  After we
-     move to C++17 these should be updated to add noexcept.  */
-  using read_memory_ftype = decltype (disassemble_info::read_memory_func);
-  using memory_error_ftype = decltype (disassemble_info::memory_error_func);
-  using print_address_ftype = decltype (disassemble_info::print_address_func);
-  using fprintf_ftype = decltype (disassemble_info::fprintf_func);
-  using fprintf_styled_ftype = decltype (disassemble_info::fprintf_styled_func);
+  /* Types for the function callbacks within m_di.  The actual function
+     signatures here are taken from include/dis-asm.h.  The noexcept macro
+     expands to 'noexcept' for C++17 and later, otherwise, it expands to
+     nothing.  This is because including noexcept was ignored for function
+     types before C++17, but both GCC and Clang warn that the noexcept
+     will become relevant when you switch to C++17, and this warning
+     causes the build to fail.  */
+  using read_memory_ftype
+    = int (*) (bfd_vma, bfd_byte *, unsigned int, struct disassemble_info *)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
+  using memory_error_ftype
+    = void (*) (int, bfd_vma, struct disassemble_info *)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
+  using print_address_ftype
+    = void (*) (bfd_vma, struct disassemble_info *)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
+  using fprintf_ftype
+    = int (*) (void *, const char *, ...)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
+  using fprintf_styled_ftype
+    = int (*) (void *, enum disassembler_style, const char *, ...)
+	LIBOPCODE_CALLBACK_NOEXCEPT;
 
   /* Constructor, many fields in m_di are initialized from GDBARCH.  The
      remaining arguments are function callbacks that are written into m_di.
-- 
2.25.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code
  2022-11-18 16:57   ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
                       ` (2 preceding siblings ...)
  2022-11-18 16:57     ` [PATCHv3 3/3] gdb: mark disassembler function callback types as noexcept Andrew Burgess
@ 2022-11-28  8:35     ` Tom de Vries
  3 siblings, 0 replies; 18+ messages in thread
From: Tom de Vries @ 2022-11-28  8:35 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On 11/18/22 17:57, Andrew Burgess via Gdb-patches wrote:
> Changes since v2:
> 
>    - Rebase onto current upstream/master, no merged conflicts.
> 
> Changes since v1:
> 
>    - Patch #1 is largely unchanged.  I'm still in discussion with Simon
>      as to whether this is the correct solution or not.
> 
>    - Patch #2 has grown a little since the first version I posted, I
>      found a few extra callbacks which can be marked as noexcept.
> 
>    - Patch #3 is new, this adds the noexcept keyword to the typedefs
>      used to handle the libopcode callbacks, but only when we compile
>      for C++17 or later.
> 
> ---
> 
> Andrew Burgess (3):
>    gdb/python: avoid throwing an exception over libopcodes code
>    gdb/disasm: mark functions passed to the disassembler noexcept
>    gdb: mark disassembler function callback types as noexcept
> 

Hi,

I ran into the same problem on powerpc64le, applied the patch series and 
rebuild gdb, reran the test-case gdb.python/py-disasm.exp and it passes now.

Thanks,
- Tom

>   gdb/disasm-selftests.c |  5 ++-
>   gdb/disasm.c           | 25 ++++++------
>   gdb/disasm.h           | 48 ++++++++++++++++------
>   gdb/guile/scm-disasm.c |  2 +-
>   gdb/python/py-disasm.c | 91 ++++++++++++++++++++++++++++++++----------
>   5 files changed, 121 insertions(+), 50 deletions(-)
> 
> 
> base-commit: f9f88aede3bb84efd088a59a5f6bccb3a6bb6516

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCHv2 1/3] gdb/python: avoid throwing an exception over libopcodes code
  2022-10-27 15:38   ` [PATCHv2 1/3] " Andrew Burgess
@ 2022-11-28 14:39     ` Simon Marchi
  2022-11-28 19:26       ` Andrew Burgess
  0 siblings, 1 reply; 18+ messages in thread
From: Simon Marchi @ 2022-11-28 14:39 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches



On 10/27/22 11:38, Andrew Burgess wrote:
> Bug gdb/29712 identifies a problem with the Python disassembler API.
> In some cases GDB will try to throw an exception through the
> libopcodes disassembler code, however, not all targets include
> exception unwind information when compiling C code, for targets that
> don't include this information GDB will terminate when trying to pass
> the exception through libopcodes.
> 
> To explain what GDB is trying to do, consider the following trivial
> use of the Python disassembler API:
> 
>   class ExampleDisassembler(gdb.disassembler.Disassembler):
> 
>       class MyInfo(gdb.disassembler.DisassembleInfo):
>           def __init__(self, info):
>               super().__init__(info)
> 
>           def read_memory(self, length, offset):
>               return super().read_memory(length, offset)
> 
>       def __init__(self):
>           super().__init__("ExampleDisassembler")
> 
>       def __call__(self, info):
>           info = self.MyInfo(info)
>           return gdb.disassembler.builtin_disassemble(info)
> 
> This disassembler doesn't add any value, it defers back to GDB to do
> all the actual work, but it serves to allow us to discuss the problem.
> 
> The problem occurs when a Python exception is raised by the
> MyInfo.read_memory method.  The MyInfo.read_memory method is called
> from the C++ function gdbpy_disassembler::read_memory_func.  The C++
> stack at the point this function is called looks like this:
> 
>   #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
>   #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
>   #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
>   #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
>   #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
>   #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
>   #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
>   #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
>   #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324
> 
>   ### ... snip lots of frames as we pass through Python itself ...
> 
>   #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
>   #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
>   #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
>   #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101
> 
> So gdbpy_disassembler::read_memory_func is called from the libopcodes
> disassembler to read memory, this C++ function then calls into user
> supplied Python code to do the work.
> 
> If the user supplied Python code raises an gdb.MemoryError exception
> indicating the memory read failed, this this is fine.  The C++ code

"this this"

> converts this exception back into a return value that libopcodes can
> understand, and returns to libopcodes.
> 
> However, if the user supplied Python code raises some other exception,
> what we want is for this exception to propagate through GDB and appear
> as if raised by the call to gdb.disassembler.builtin_disassemble.  To
> achieve this, when gdbpy_disassembler::read_memory_func spots an
> unknown Python exception, we must pass the information about this
> exception from frame #0 to frame #8 in the above backtrace.  Frame #8
> is the C++ implementation of gdb.disassembler.builtin_disassemble, and
> so it is this function that we want to re-raise the unknown Python
> exception, so the user can, if they want, catch the exception in their
> code.
> 
> The previous mechanism by which the exception was passed was to pack
> the details of the Python exception into a C++ exception, then throw
> the exception from frame #0, and catch the exception in frame #8,
> unpack the details of the Python exception, and re-raise it.
> 
> However, this relies on the exception passing through frames #1 to #7,
> some of which are in libopcodes, which is C code, and so, might not be
> compiled with exception support.
> 
> This commit proposes an alternative solution that does not rely on
> throwing a C++ exception.
> 
> When we spot an unhandled Python exception in frame #0, we will store
> the details of this exception within the gdbpy_disassembler object
> currently in use.  Then we return to libopcodes a value indicating
> that the memory_read failed.
> 
> libopcodes will now continue to disassemble as though that memory read
> failed (with one special case described below), then, when we
> eventually return to disasmpy_builtin_disassemble we check to see if
> there is an exception stored in the gdbpy_disassembler object.  If
> there is then this exception can immediately be installed, and then we
> return back to Python, when the user will be able to catch the
> exception.
> 
> There is one extra change in gdbpy_disassembler::read_memory_func.
> After the first call that results in an exception being stored on the
> gdbpy_disassembler object, any future calls to the ::read_memory_func
> function will immediately return as if the read failed.  This avoids
> any additional calls into user supplied Python code.
> 
> My thinking here is that should the first call fail with some unknown
> error, GDB should not keep trying with any additional calls.  This
> maintains the illusion that the exception raised from
> MyInfo.read_memory is immediately raised by
> gdb.disassembler.builtin_disassemble.  I have no tests for this change
> though - to trigger this issue would rely on a libopcodes disassembler
> that will try to read further memory even after the first failed
> read.  I'm not aware of any such disassembler that currently does
> this, but that doesn't mean such a disassembler couldn't exist in the
> future.

After understanding this, I'm fine with the idea.   You can add:

Approved-By: Simon Marchi <simon.marchi@efficios.com>

Simon

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCHv2 1/3] gdb/python: avoid throwing an exception over libopcodes code
  2022-11-28 14:39     ` Simon Marchi
@ 2022-11-28 19:26       ` Andrew Burgess
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2022-11-28 19:26 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

Simon Marchi <simark@simark.ca> writes:

> On 10/27/22 11:38, Andrew Burgess wrote:
>> Bug gdb/29712 identifies a problem with the Python disassembler API.
>> In some cases GDB will try to throw an exception through the
>> libopcodes disassembler code, however, not all targets include
>> exception unwind information when compiling C code, for targets that
>> don't include this information GDB will terminate when trying to pass
>> the exception through libopcodes.
>> 
>> To explain what GDB is trying to do, consider the following trivial
>> use of the Python disassembler API:
>> 
>>   class ExampleDisassembler(gdb.disassembler.Disassembler):
>> 
>>       class MyInfo(gdb.disassembler.DisassembleInfo):
>>           def __init__(self, info):
>>               super().__init__(info)
>> 
>>           def read_memory(self, length, offset):
>>               return super().read_memory(length, offset)
>> 
>>       def __init__(self):
>>           super().__init__("ExampleDisassembler")
>> 
>>       def __call__(self, info):
>>           info = self.MyInfo(info)
>>           return gdb.disassembler.builtin_disassemble(info)
>> 
>> This disassembler doesn't add any value, it defers back to GDB to do
>> all the actual work, but it serves to allow us to discuss the problem.
>> 
>> The problem occurs when a Python exception is raised by the
>> MyInfo.read_memory method.  The MyInfo.read_memory method is called
>> from the C++ function gdbpy_disassembler::read_memory_func.  The C++
>> stack at the point this function is called looks like this:
>> 
>>   #0  gdbpy_disassembler::read_memory_func (memaddr=4198805, buff=0x7fff9ab9d2a8 "\220ӹ\232\377\177", len=1, info=0x7fff9ab9d558) at ../../src/gdb/python/py-disasm.c:510
>>   #1  0x000000000104ba06 in fetch_data (info=0x7fff9ab9d558, addr=0x7fff9ab9d2a9 "ӹ\232\377\177") at ../../src/opcodes/i386-dis.c:305
>>   #2  0x000000000104badb in ckprefix (ins=0x7fff9ab9d100) at ../../src/opcodes/i386-dis.c:8571
>>   #3  0x000000000104e28e in print_insn (pc=4198805, info=0x7fff9ab9d558, intel_syntax=-1) at ../../src/opcodes/i386-dis.c:9548
>>   #4  0x000000000104f4d4 in print_insn_i386 (pc=4198805, info=0x7fff9ab9d558) at ../../src/opcodes/i386-dis.c:9949
>>   #5  0x00000000004fa7ea in default_print_insn (memaddr=4198805, info=0x7fff9ab9d558) at ../../src/gdb/arch-utils.c:1033
>>   #6  0x000000000094fe5e in i386_print_insn (pc=4198805, info=0x7fff9ab9d558) at ../../src/gdb/i386-tdep.c:4072
>>   #7  0x0000000000503d49 in gdbarch_print_insn (gdbarch=0x5335560, vma=4198805, info=0x7fff9ab9d558) at ../../src/gdb/gdbarch.c:3351
>>   #8  0x0000000000bcc8c6 in disasmpy_builtin_disassemble (self=0x7f2ab07f54d0, args=0x7f2ab0789790, kw=0x0) at ../../src/gdb/python/py-disasm.c:324
>> 
>>   ### ... snip lots of frames as we pass through Python itself ...
>> 
>>   #22 0x0000000000bcd860 in gdbpy_print_insn (gdbarch=0x5335560, memaddr=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/python/py-disasm.c:783
>>   #23 0x00000000008995a5 in ext_lang_print_insn (gdbarch=0x5335560, address=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/extension.c:939
>>   #24 0x0000000000741aaa in gdb_print_insn_1 (gdbarch=0x5335560, vma=0x401195, info=0x7fff9ab9e3c8) at ../../src/gdb/disasm.c:1078
>>   #25 0x0000000000741bab in gdb_disassembler::print_insn (this=0x7fff9ab9e3c0, memaddr=0x401195, branch_delay_insns=0x0) at ../../src/gdb/disasm.c:1101
>> 
>> So gdbpy_disassembler::read_memory_func is called from the libopcodes
>> disassembler to read memory, this C++ function then calls into user
>> supplied Python code to do the work.
>> 
>> If the user supplied Python code raises an gdb.MemoryError exception
>> indicating the memory read failed, this this is fine.  The C++ code
>
> "this this"
>
>> converts this exception back into a return value that libopcodes can
>> understand, and returns to libopcodes.
>> 
>> However, if the user supplied Python code raises some other exception,
>> what we want is for this exception to propagate through GDB and appear
>> as if raised by the call to gdb.disassembler.builtin_disassemble.  To
>> achieve this, when gdbpy_disassembler::read_memory_func spots an
>> unknown Python exception, we must pass the information about this
>> exception from frame #0 to frame #8 in the above backtrace.  Frame #8
>> is the C++ implementation of gdb.disassembler.builtin_disassemble, and
>> so it is this function that we want to re-raise the unknown Python
>> exception, so the user can, if they want, catch the exception in their
>> code.
>> 
>> The previous mechanism by which the exception was passed was to pack
>> the details of the Python exception into a C++ exception, then throw
>> the exception from frame #0, and catch the exception in frame #8,
>> unpack the details of the Python exception, and re-raise it.
>> 
>> However, this relies on the exception passing through frames #1 to #7,
>> some of which are in libopcodes, which is C code, and so, might not be
>> compiled with exception support.
>> 
>> This commit proposes an alternative solution that does not rely on
>> throwing a C++ exception.
>> 
>> When we spot an unhandled Python exception in frame #0, we will store
>> the details of this exception within the gdbpy_disassembler object
>> currently in use.  Then we return to libopcodes a value indicating
>> that the memory_read failed.
>> 
>> libopcodes will now continue to disassemble as though that memory read
>> failed (with one special case described below), then, when we
>> eventually return to disasmpy_builtin_disassemble we check to see if
>> there is an exception stored in the gdbpy_disassembler object.  If
>> there is then this exception can immediately be installed, and then we
>> return back to Python, when the user will be able to catch the
>> exception.
>> 
>> There is one extra change in gdbpy_disassembler::read_memory_func.
>> After the first call that results in an exception being stored on the
>> gdbpy_disassembler object, any future calls to the ::read_memory_func
>> function will immediately return as if the read failed.  This avoids
>> any additional calls into user supplied Python code.
>> 
>> My thinking here is that should the first call fail with some unknown
>> error, GDB should not keep trying with any additional calls.  This
>> maintains the illusion that the exception raised from
>> MyInfo.read_memory is immediately raised by
>> gdb.disassembler.builtin_disassemble.  I have no tests for this change
>> though - to trigger this issue would rely on a libopcodes disassembler
>> that will try to read further memory even after the first failed
>> read.  I'm not aware of any such disassembler that currently does
>> this, but that doesn't mean such a disassembler couldn't exist in the
>> future.
>
> After understanding this, I'm fine with the idea.   You can add:
>
> Approved-By: Simon Marchi <simon.marchi@efficios.com>

Thanks.  I pushed this series.

Andrew


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-11-28 19:26 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-24 12:50 [PATCH] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
2022-10-24 15:50 ` Simon Marchi
2022-10-24 17:22   ` Andrew Burgess
2022-10-24 17:45     ` [PATCH] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
2022-10-24 18:24       ` Simon Marchi
2022-10-24 18:20     ` [PATCH] gdb/python: avoid throwing an exception over libopcodes code Simon Marchi
2022-10-27 10:38       ` Andrew Burgess
2022-10-27 15:38 ` [PATCHv2 0/3] " Andrew Burgess
2022-10-27 15:38   ` [PATCHv2 1/3] " Andrew Burgess
2022-11-28 14:39     ` Simon Marchi
2022-11-28 19:26       ` Andrew Burgess
2022-10-27 15:38   ` [PATCHv2 2/3] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
2022-10-27 15:38   ` [PATCHv2 3/3] gdb: mark disassembler function callback types as noexcept Andrew Burgess
2022-11-18 16:57   ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Andrew Burgess
2022-11-18 16:57     ` [PATCHv3 1/3] " Andrew Burgess
2022-11-18 16:57     ` [PATCHv3 2/3] gdb/disasm: mark functions passed to the disassembler noexcept Andrew Burgess
2022-11-18 16:57     ` [PATCHv3 3/3] gdb: mark disassembler function callback types as noexcept Andrew Burgess
2022-11-28  8:35     ` [PATCHv3 0/3] gdb/python: avoid throwing an exception over libopcodes code Tom de Vries

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).