On 10/5/18 6:43 AM, Simon Marchi wrote: > On 2018-10-04 05:11 PM, Tom de Vries wrote: >> Hi, >> >> [ Submitted earlier here ( >> https://sourceware.org/ml/gdb-patches/2016-02/msg00124.html ). ] >> >> This patch adds a new gdb.MinSymbol object to export the minimal_symbol >> interface. >> >> Build and reg-tested on x86_64-linux. >> >> OK for trunk? >> >> Thanks, >> - Tom > > Hi Tom, > > I think I have read (probably from Phil or Tom Tromey) that the intention was to > expose minsyms and full symbols using the same Symbol class, to avoid exposing > the fact that GDB represents symbols in different ways internally. I don't know > if there was some concrete plans for that or if it was just at the idea stage. > > Otherwise, I'd be fine with exposing gdb.MinSyms and documenting that they map to > the binary file format symbols (ELF, PE, mach-O, etc) while gdb.Symbols map to > debug info symbols. I think it's a reality that will not change any time soon, > and it could be useful for users of the Python API to make the distinction > between the two. > > Maybe I'm missing something that has already been discussed that makes exposing > minsyms a bad idea, in that case Phil and Tom are probably going to be able to > shed som light on that. In the mean time here are a bunch of random > comments/suggestions. > > * A general GDB-specific code style comment, every time you compare a pointer > to know if it's NULL or not, it should be explicitly "ptr == NULL" or "ptr != NULL". Done. > * Please make sure that all functions are documented (with an introductory comment). > Done. >> >> [gdb/python] Add interface to access minimal_symbols >> >> 2018-09-27 Jeff Mahoney >> Tom de Vries >> >> * Makefile.in (SUBDIR_PYTHON_SRCS): Add python/py-minsymbol.c. >> * python/py-minsymbol.c: New file. >> * python/py-objfile.c (objfpy_object_to_objfile): New function. >> * python/python-internal.h (minsym_object_type) >> (gdbpy_lookup_minimal_symbol, objfpy_object_to_objfile): >> (gdbpy_initialize_minsymbols): Declare. >> * python/python.c (do_start_initialization): Call >> gdbpy_initialize_minsymbols. >> (python_GdbMethods): Add lookup_minimal_symbol entry. >> >> * python.texi (@node Python API): Add "Minimal Symbols In Python" menu >> entry. >> (@node Minimal Symbols In Python): New node. >> >> * gdb.python/py-minsymbol.c: New test. >> * gdb.python/py-minsymbol.exp: New file. >> >> --- >> gdb/Makefile.in | 1 + >> gdb/doc/python.texi | 140 +++++++++ >> gdb/python/py-minsymbol.c | 492 ++++++++++++++++++++++++++++++ >> gdb/python/py-objfile.c | 9 + >> gdb/python/python-internal.h | 7 + >> gdb/python/python.c | 5 + >> gdb/testsuite/gdb.python/py-minsymbol.c | 38 +++ >> gdb/testsuite/gdb.python/py-minsymbol.exp | 67 ++++ >> 8 files changed, 759 insertions(+) >> >> diff --git a/gdb/Makefile.in b/gdb/Makefile.in >> index 8d780ac758..489dab5ca1 100644 >> --- a/gdb/Makefile.in >> +++ b/gdb/Makefile.in >> @@ -377,6 +377,7 @@ SUBDIR_PYTHON_SRCS = \ >> python/py-instruction.c \ >> python/py-lazy-string.c \ >> python/py-linetable.c \ >> + python/py-minsymbol.c \ >> python/py-newobjfileevent.c \ >> python/py-objfile.c \ >> python/py-param.c \ >> diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi >> index 1035be33f0..d91f6e35c5 100644 >> --- a/gdb/doc/python.texi >> +++ b/gdb/doc/python.texi >> @@ -158,6 +158,7 @@ optional arguments while skipping others. Example: >> * Frames In Python:: Accessing inferior stack frames from Python. >> * Blocks In Python:: Accessing blocks from Python. >> * Symbols In Python:: Python representation of symbols. >> +* Minimal Symbols In Python:: Python representation of minimal symbols. >> * Symbol Tables In Python:: Python representation of symbol tables. >> * Line Tables In Python:: Python representation of line tables. >> * Breakpoints In Python:: Manipulating breakpoints using Python. >> @@ -4878,6 +4879,145 @@ The value does not actually exist in the program. >> The value's address is a computed location. >> @end vtable >> >> +@node Minimal Symbols In Python >> +@subsubsection Python representation of Minimal Symbols. >> + >> +@cindex minsymbols in python >> +@tindex gdb.MinSymbol >> + >> +@value{GDBN} represents every variable, function and type as an >> +entry in a symbol table. @xref{Symbols, ,Examining the Symbol Table}. >> +Typical symbols like functions, variables, etc are represented by >> +gdb.Symbol objects in Python. Some symbols are defined with less >> +information associated with them, like linker script variables >> +or assembly labels. Python represents these minimal symbols in @value{GDBN} >> +with the @code{gdb.MinSymbol} object. > > Here for example, I would make it clear that a MinSym == a symbol provided by the > binary file format, and would cite ELF, PE and mach-O as examples. I think that > would make it more obvious to the reader. > Done. >> + >> +The following minimal symbol-related functions are available in the @code{gdb} >> +module: >> + >> +@findex gdb.lookup_minimal_symbol >> +@defun gdb.lookup_minimal_symbol (name @r{[}, sfile@r{]}, objfile@r{[}) > > The square bracket look wrong. Since sfile and objfile are keyword arguments, > I guess you want them to look like this? > > gdb.lookup_minimal_symbol (name [, sfile][, objfile]) > > ? > Done. > Not in this patch, but I think we should document the default value for keyword > arguments, like the official Python doc does, which would mean something like: > > gdb.lookup_minimal_symbol (name, sfile=None, objfile=None) > > >> +This function searches for a minimal symbol by name. >> +The search scope can be restricted by the sfile and objfile arguments. >> + >> +@var{name} is the name of the minimal symbol. It must be a string. >> +The optional @var{sfile} argument restricts the search to the source file >> +in which the minimal symbol was defined. >> +The @var{sfile} argument must be a string. The optional @var{objfile} >> +restricts the search to the objfile that contains the minimal symbol. >> +The @var{objfile} argument must be a @code{gdb.Objfile} object. > > Can you try to split these in some kind of bullet, or at least one parameter > per paragraph? I think we have a tendency to document parameters in a big > paragraph, which does not make it easy to look up an individual parameter. > > I think a format like this makes it really easy to read: > > https://nodejs.org/dist/latest-v10.x/docs/api/buffer.html#buffer_class_method_buffer_from_arraybuffer_byteoffset_length > Done. > Also, would it be possible to precise how the search is made, especially > when multiple symbols match a search? If I have a C++ program with this: > > void allo(int x) > { > } > > void allo(float f) > { > } > > then doing gdb.lookup_minimal_symbol('allo') will only return one of them. > Done. > It would also be good to mention how the sfile argument is used, does it have > to be an exact match, just the base name, any substring? > Done. >> +The result is a @code{gdb.MinSymbol} object or @code{None} if the symbol >> +is not found. >> +@end defun >> + >> +A @code{gdb.MinSymbol} object has the following attributes: >> + >> +@defvar MinSymbol.name >> +The name of the symbol as a string. This attribute is not writable. >> +@end defvar >> + >> +@defvar MinSymbol.linkage_name >> +The name of the symbol, as used by the linker (i.e., may be mangled). >> +This attribute is not writable. >> +@end defvar >> + >> +@defvar MinSymbol.print_name >> +The name of the symbol in a form suitable for output. This is either >> +@code{name} or @code{linkage_name}, depending on whether the user >> +asked @value{GDBN} to display demangled or mangled names. >> +@end defvar >> + >> +@defvar MinSymbol.filename >> +The file name of the source file where the minimal symbol is defined. This >> +value may represent filenames used internally by the compiler rather >> +than a viewable/editable source file. >> +@end defvar >> + >> +@defvar MinSymbol.section >> +The name of the binary section containing this minimal symbol. >> +@end defvar >> + >> +@defvar MinSymbol.is_code >> +@code{True} if the minimal symbol is a function or a method. >> +@end defvar >> + >> +@defvar MinSymbol.is_data >> +@code{True} if the symbol is a variable or other data. >> +@end defvar >> + >> +A @code{gdb.MinSymbol} object has the following methods: >> + >> +@defun MinSymbol.is_valid () >> +Returns @code{True} if the @code{gdb.MinSymbol} object is valid, >> +@code{False} if not. A @code{gdb.MinSymbol} object can become invalid if >> +the symbol it refers to does not exist in @value{GDBN} any longer. >> +All other @code{gdb.MinSymbol} methods will throw an exception if it is >> +invalid at the time the method is called. >> +@end defun >> + >> +@defun MinSymbol.value () >> +Compute the value of the minimal symbol, as a @code{gdb.Value}. The value >> +returned represents the address of the minimal symbol. Since minimal symbols >> +represent objects without rich type information, the @code{gdb.Type} >> +associated with the @code{gdb.Value} objects will be limited to whether >> +the minimal symbol describes executable code or data. >> +@end defun >> + >> +The available types for @code{gdb.MinSymbol} are represented >> +as constants in the @code{gdb} module. They are distinctly separate from the >> +types represented by the @code{gdb.Type} object. > > I think the "type" method, that returns one of these, is not documented. > Done. >> +int >> +gdbpy_initialize_minsymbols (void) >> +{ >> + if (PyType_Ready (&minsym_object_type) < 0) >> + return -1; >> + >> + msympy_objfile_data_key >> + = register_objfile_data_with_cleanup (NULL, del_objfile_msymbols); >> + >> + if (PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_UNKNOWN", >> + mst_unknown) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_TEXT", >> + mst_text) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_TEXT_GNU_IFUNC", >> + mst_text_gnu_ifunc) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_SLOT_GOT_PLT", >> + mst_slot_got_plt) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_DATA", >> + mst_data) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_BSS", mst_bss) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_ABS", mst_abs) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_SOLIB_TRAMPOLINE", >> + mst_solib_trampoline) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_FILE_TEXT", >> + mst_file_text) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_FILE_DATA", >> + mst_file_data) < 0 >> + || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_FILE_BSS", >> + mst_file_bss) < 0) > > I was a bit worried about exposing the internal enum values, but it seems like > we do this in other places (like gdb.COMMAND_* constants). If scripts use the > labels, they don't really care what the numerical value is. > >> + return -1; >> + >> + return gdb_pymodule_addobject (gdb_module, "MinSymbol", >> + (PyObject *) &minsym_object_type); >> +} >> + >> + >> + >> +static gdb_PyGetSetDef minsym_object_getset[] = { >> + { "name", msympy_get_name, NULL, >> + "Name of the minimal symbol, as it appears in the source code.", NULL }, >> + { "linkage_name", msympy_get_linkage_name, NULL, >> + "Name of the minimal symbol, as used by the linker (i.e., may be mangled).", >> + NULL }, >> + { "filename", msympy_get_file_name, NULL, >> + "Name of source file that contains this minimal symbol. Only applies for" >> + " mst_file_*.", >> + NULL }, >> + { "print_name", msympy_get_print_name, NULL, >> + "Name of the minimal symbol in a form suitable for output.\n\ >> +This is either name or linkage_name, depending on whether the user asked GDB\n\ >> +to display demangled or mangled names.", NULL }, >> + { "section", msympy_get_section, NULL, >> + "Section that contains this minimal symbol, if any", NULL, }, > > I would suggest naming this "section_name". It would leave the "section" property > available so that ff we ever have a Python type to represent binary file sections, > we can use it to return that. > Done. >> + { "type", msympy_get_type, NULL, >> + "Type that this minimal symbol represents." }, >> + { NULL } /* Sentinel */ >> +}; >> + >> +static PyMethodDef minsym_object_methods[] = { >> + { "is_valid", msympy_is_valid, METH_NOARGS, >> + "is_valid () -> Boolean.\n\ >> +Return true if this minimal symbol is valid, false if not." }, >> + { "is_code", msympy_is_code, METH_NOARGS, >> + "is_code () -> Boolean.\n\ >> +Return true if this minimal symbol represents code." }, >> + { "is_data", msympy_is_data, METH_NOARGS, >> + "is_data () -> Boolean.\n\ >> +Return true if this minimal symbol represents data." }, >> + { "value", msympy_value, METH_VARARGS, >> + "value ([frame]) -> gdb.Value\n\ >> +Return the value of the minimal symbol." }, >> + {NULL} /* Sentinel */ >> +}; >> + >> +PyTypeObject minsym_object_type = { >> + PyVarObject_HEAD_INIT (NULL, 0) >> + "gdb.MinSymbol", /*tp_name*/ >> + sizeof (minsym_object), /*tp_basicsize*/ >> + 0, /*tp_itemsize*/ >> + msympy_dealloc, /*tp_dealloc*/ >> + 0, /*tp_print*/ >> + 0, /*tp_getattr*/ >> + 0, /*tp_setattr*/ >> + 0, /*tp_compare*/ >> + 0, /*tp_repr*/ >> + 0, /*tp_as_number*/ >> + 0, /*tp_as_sequence*/ >> + 0, /*tp_as_mapping*/ >> + 0, /*tp_hash */ >> + 0, /*tp_call*/ >> + msympy_str, /*tp_str*/ > > I would suggest implementing repr instead, as we have done for > gdb.Objfile and others. Also, I would suggest using the output > style > > > > to be somewhat consistent. Done. AFAIU, the purpose of repr is to print a unique representation, and distinct symbols can have the same name, so I went for: ... ... for local symbols, dropping the filename=%s part for local symbols. > I think it helps when developing to > have the type of the object printed. Plus, we make sure people > don't rely on the output of repr/str to format the output the > way they want :). > Retested as attached. Thanks, - Tom