public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
@ 2022-12-04 19:46 gustaf.waldemarson at gmail dot com
  2022-12-04 20:04 ` [Bug libstdc++/107965] " pinskia at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: gustaf.waldemarson at gmail dot com @ 2022-12-04 19:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

            Bug ID: 107965
           Summary: libstdc++ Python Pretty-Printers: Many Exceptions From
                    Uninitialized Structures
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gustaf.waldemarson at gmail dot com
  Target Milestone: ---

Created attachment 54008
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54008&action=edit
Python stack-traces and memory errors

Hello,

I've run into an odd issue with the Python pretty-printers bundled
with libstdc++(-v3). Truthfully, I'm not sure whether this is a bug,
but here goes:

Given the following simple `gdbinit` that only initializes the pretty
printers and enables Python stack-traces:

```
python
import os
import re
import sys
import os.path
import textwrap
import gdb
import gdb.types
import gdb.printing

home          = os.environ.get("HOME", "~")
default_path  = os.path.join(home, "git", "installs")
objects_dir   = os.environ.get("objects_dir", default_path)
python_addons = os.path.join(objects_dir, "gcc", "libstdc++-v3", "python")

if os.path.isdir(python_addons):
  print("Installing libstdcxx printers...")
  sys.path.insert(0, python_addons)
  from libstdcxx.v6.printers import register_libstdcxx_printers
  register_libstdcxx_printers(None)
end

set python print-stack full
```

and this simple C++ file:

    #include <vector>
    #include <string>
    #include <iostream>

    using namespace std;

    int main(void)
    {
        vector<string> test{"test", "test2"};
        string blabla = "hello";
        int b = 2;

        std::cout << blabla << " "
                  << b << " "
                  << test[0] << " " << test[1] << std::endl;

        return 0;
    }

Compile it and start debugging:

```
g++ -g3 test.cpp
gdb -q a.out
(gdb) start
(gdb) info locals
```

At `info locals` I get a lot of memory-related errors, presumably
because none of the local variables have been initialized yet, but I
also get a large number of Python exceptions. Following the Python
stack trace reveals a possible error here:

```
  File "/home/xaldew/git/installs/gcc/libstdc++-v3/python/libstdcxx/v6/printers.py", line 971, in to_string
    return ptr.lazy_string (length = length)
OverflowError: int too big to convert
```

I had a look at that file but could not find any obvious errors. Is this
behavior intended for uninitialized local variables?

(The complete log-file of the error is attached)


* [Bug libstdc++/107965] libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
  2022-12-04 19:46 [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures gustaf.waldemarson at gmail dot com
@ 2022-12-04 20:04 ` pinskia at gcc dot gnu.org
  2022-12-05  8:24 ` redi at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-12-04 20:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=59170

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Related to PR 59170.


* [Bug libstdc++/107965] libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
  2022-12-04 19:46 [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures gustaf.waldemarson at gmail dot com
  2022-12-04 20:04 ` [Bug libstdc++/107965] " pinskia at gcc dot gnu.org
@ 2022-12-05  8:24 ` redi at gcc dot gnu.org
  2022-12-05  8:45 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: redi at gcc dot gnu.org @ 2022-12-05  8:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
There's nothing the printers can do. You're asking to print them out before
they are initialized, so they try to interpret garbage as values. The
OverflowError is just because some uninitialized std::string cannot be printed.

This should really be reported as a gdb bug. Gdb knows if the object's
initialization has finished, so it should not try to print variables at all
before their lifetime has begun, especially not via python printers.

It might make sense to display the variable name with a value like <before
lifetime>, but even that is debatable. The C++ standard is very clear that none
of those variables exists yet at your breakpoint, and gdb contradicts its own
documentation:

"These are all variables (declared either static or automatic) accessible at
the point of execution of the selected frame."


* [Bug libstdc++/107965] libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
  2022-12-04 19:46 [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures gustaf.waldemarson at gmail dot com
  2022-12-04 20:04 ` [Bug libstdc++/107965] " pinskia at gcc dot gnu.org
  2022-12-05  8:24 ` redi at gcc dot gnu.org
@ 2022-12-05  8:45 ` rguenth at gcc dot gnu.org
  2022-12-05  8:48 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-12-05  8:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jonathan Wakely from comment #2)
> There's nothing the printers can do. You're asking to print them out before
> they are initialized, so they try to interpret garbage as values. The
> OverflowError is just because some uninitialized std::string cannot be
> printed.
> 
> This should really be reported as a gdb bug. Gdb knows if the object's
> initialization has finished, so it should not try to print variables at all
> before their lifetime has begun, especially not via python printers.
> 
> It might make sense to display the variable name with a value like <before
> lifetime>, but even that is debatable. The C++ standard is very clear that
> none of those variables exists yet at your breakpoint, and gdb contradicts
> its own documentation:
> 
> "These are all variables (declared either static or automatic) accessible at
> the point of execution of the selected frame."

I'm not so sure.  For

struct X { X(); int i; };

X::X () { i = 42; }

int main()
{
  X x;
  return 0;
}

GCC emits

 <1><6e>: Abbrev Number: 9 (DW_TAG_subprogram)
    <6f>   DW_AT_external    : 1
    <6f>   DW_AT_name        : (indirect string, offset: 0xf): main
    <73>   DW_AT_decl_file   : 1
    <74>   DW_AT_decl_line   : 5
    <75>   DW_AT_decl_column : 5
    <76>   DW_AT_type        : <0x67>
    <7a>   DW_AT_low_pc      : 0x15
    <82>   DW_AT_high_pc     : 0x1b
    <8a>   DW_AT_frame_base  : 1 byte block: 9c         (DW_OP_call_frame_cfa)
    <8c>   DW_AT_GNU_all_tail_call_sites: 1
    <8c>   DW_AT_sibling     : <0x9e>
 <2><90>: Abbrev Number: 10 (DW_TAG_variable)
    <91>   DW_AT_name        : x
    <93>   DW_AT_decl_file   : 1
    <94>   DW_AT_decl_line   : 7
    <95>   DW_AT_decl_column : 5
    <96>   DW_AT_type        : <0x2d>
    <9a>   DW_AT_location    : 2 byte block: 91 6c      (DW_OP_fbreg: -20)

so gdb has no idea that x only becomes live after the call to the CTOR
(or during that).  Instead GCC says it lives throughout the whole
function on the frame.  Even the original IL from the frontend has no
hint that would allow the middle-end to emit different DWARF.


* [Bug libstdc++/107965] libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
  2022-12-04 19:46 [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures gustaf.waldemarson at gmail dot com
                   ` (2 preceding siblings ...)
  2022-12-05  8:45 ` rguenth at gcc dot gnu.org
@ 2022-12-05  8:48 ` rguenth at gcc dot gnu.org
  2022-12-05  9:42 ` [Bug debug/107965] " redi at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-12-05  8:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
I would suggest making the pretty-printers more robust with respect to memory
errors (can those errors be caught and the printing avoided somehow?)


* [Bug debug/107965] libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
  2022-12-04 19:46 [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures gustaf.waldemarson at gmail dot com
                   ` (3 preceding siblings ...)
  2022-12-05  8:48 ` rguenth at gcc dot gnu.org
@ 2022-12-05  9:42 ` redi at gcc dot gnu.org
  2022-12-05 10:37 ` redi at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: redi at gcc dot gnu.org @ 2022-12-05  9:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
          Component|libstdc++                   |debug
   Last reconfirmed|                            |2022-12-05
                 CC|                            |jason at gcc dot gnu.org

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> so gdb has no idea that x only becomes live after the call to the CTOR
> (or during that).  Instead GCC says it lives throughout the whole
> function on the frame.

That's clearly a GCC bug then, because that is not how C++ works.

> Even the original IL from the frontend has no
> hint that would allow the middle-end to emit different DWARF.

So let's change component to 'debug'. There's definitely nothing the libstdc++
printers can do here.

(In reply to Richard Biener from comment #4)
> I would suggest making the pretty-printers more robust with respect to
> memory errors (can those errors be caught and the printing avoided
> somehow?)

There's no single type of error reported in such cases (e.g. the OverflowError
shown above depends on the specific values of the uninitialized bytes, even the
same printer won't fail with the same error every time). The only way to be
more robust is to catch *all* exceptions, and swallow all errors from any point
in the printers. That then hides real errors, and makes them impossible to
develop/debug. It would mean changes in at least 50 places, which would all
have to be individually disabled to allow real errors to propagate when trying
to debug or improve the printers. I don't want to do that.


* [Bug debug/107965] libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
  2022-12-04 19:46 [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures gustaf.waldemarson at gmail dot com
                   ` (4 preceding siblings ...)
  2022-12-05  9:42 ` [Bug debug/107965] " redi at gcc dot gnu.org
@ 2022-12-05 10:37 ` redi at gcc dot gnu.org
  2023-01-17 19:37 ` gustaf.waldemarson at gmail dot com
  2023-01-17 20:25 ` jason at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: redi at gcc dot gnu.org @ 2022-12-05 10:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Reading symbols from p...
(gdb) start
Temporary breakpoint 1 at 0x4022ea: file p.cc, line 18.
Starting program: /tmp/p 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Temporary breakpoint 1, main () at p.cc:18
18          }

Why is the location for the 'start' breakpoint the closing brace of main() ?!

(gdb) n
9               vector<string> test{"test", "test2"};
(gdb) info locals
test = std::vector of length 5, capacity 38 = {" ", <error: Cannot access memory at address 0x2e00000000>, "", 
  <error: Cannot access memory at address 0x400030200000000>, <error: Cannot access memory at address 0x3500000034>}
blabla = ""
b = 32767

The errors here seem reasonable, but they depend entirely on the garbage that
happens to be in the uninitialized variables. Different garbage will produce
worse errors.


(gdb) disable pretty-printer
200 printers disabled
0 of 200 printers enabled
(gdb) info locals
test = {<std::_Vector_base<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >,
std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > > >> = {
    _M_impl = {<std::allocator<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > >> =
{<std::__new_allocator<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > >> = {<No data fields>}, <No data fields>},
<std::_Vector_base<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > > >::_Vector_impl_data> =
{_M_start = 0x7ffff7e2e9a0 <(anonymous namespace)::moneypunct_cache_wf>, 
        _M_finish = 0x7ffff7e2ea40 <(anonymous
namespace)::moneypunct_cache_wt>, 
        _M_end_of_storage = 0x7ffff7e2ee60 <(anonymous
namespace)::moneypunct_cache_ct>}, <No data fields>}}, <No data fields>}

Even with the printers disabled (which is what would happen if the printers
caught all exceptions and didn't show anything) we get nonsense.
moneypunct_cache_wf, moneypunct_cache_wt, and moneypunct_cache_ct are global
variables inside libstdc++.so, but the uninitialized pointers in the
std::vector just happen to point at them.

blabla = {_M_dataplus = {<std::allocator<char>> = {<std::__new_allocator<char>>
= {<No data fields>}, <No data fields>}, 
    _M_p = 0x7ffff7e2ede0 <(anonymous namespace)::moneypunct_cache_cf>
"x9\342\367\377\177"}, _M_string_length = 140737352232672, {
    _M_local_buf =
"\300\357\342\367\377\177\000\000\002\316\314\367\377\177\000",
_M_allocated_capacity = 140737352232896}}

This is all garbage.

b = 32767

At least with an uninitialized int we just get an arbitrary int value, not
nonsense.

But none of these variables should be live yet, so gdb shouldn't even try to
show their values.
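
For reference, the OverflowError from the original report can be reproduced
outside gdb. This is a minimal sketch in plain Python, assuming the garbage
length ends up being converted to a fixed-width signed integer somewhere below
`lazy_string`. The helper name `fits_signed`, the 8-byte width, and the all-ones
garbage pattern are illustrative assumptions, not taken from gdb's sources.

```python
import struct


def fits_signed(value, width=8):
    """Return True if value fits in a signed integer of `width` bytes."""
    try:
        value.to_bytes(width, "little", signed=True)
        return True
    except OverflowError:
        return False


# Eight uninitialized bytes, simulated here as all-ones garbage (an assumption).
garbage = b"\xff" * 8
# Read them the way a printer would read an unsigned 64-bit length field.
(length,) = struct.unpack("<Q", garbage)

print(length)               # the bogus "string length"
print(fits_signed(length))  # False: too large for a signed 64-bit integer
print(fits_signed(5))       # True: a sane length such as len("hello")
```

Whether a given run fails therefore depends only on which bytes happen to be
on the stack, which matches the observation that the same printer does not
fail with the same error every time.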


* [Bug debug/107965] libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
  2022-12-04 19:46 [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures gustaf.waldemarson at gmail dot com
                   ` (5 preceding siblings ...)
  2022-12-05 10:37 ` redi at gcc dot gnu.org
@ 2023-01-17 19:37 ` gustaf.waldemarson at gmail dot com
  2023-01-17 20:25 ` jason at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: gustaf.waldemarson at gmail dot com @ 2023-01-17 19:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

--- Comment #7 from Gustaf Waldemarson <gustaf.waldemarson at gmail dot com> ---
Very interesting to see so many people chime in on this. I agree that this
looks more like a GDB bug than an error in the printers. To that end, I went
ahead and registered this
[ticket](https://sourceware.org/bugzilla/show_bug.cgi?id=30018). Please
feel free to add more details there if you have time, as some of the details
presented here are admittedly a bit beyond my skills.

All that said, I still think that the pretty printers should be a bit more
defensive against errors and more conservative with what they output. E.g.,
presenting an error message once for something like this is fine in my
opinion, but displaying hundreds of lines with the same error just because
some function uses a hash-map is clearly excessive.

To that end, perhaps it would make sense to defend the `to_string` calls (and
possibly others) with something like this:

    seen_errors = set()
    # ...
    try:
        to_string(...)
    except Exception as ex:
        # Key on type and message: two exception objects never compare equal.
        key = (type(ex), str(ex))
        if key not in seen_errors:
            seen_errors.add(key)
            raise

Admittedly, I don't know all the details here, so perhaps this isn't feasible?
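
The report-once idea above could be packaged as a decorator so that it does not
have to be repeated at every call site. The following is only a sketch under
assumptions: `report_once`, `FakeStringPrinter`, and the placeholder text are
hypothetical names for illustration, and nothing like this exists in
printers.py as far as this thread shows.

```python
import functools

_seen_errors = set()  # (exception type, message) pairs already reported


def report_once(func):
    """Re-raise the first occurrence of each distinct error; for repeats,
    return a short placeholder string instead of raising again."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as ex:
            key = (type(ex), str(ex))
            if key in _seen_errors:
                return "<error: %s>" % type(ex).__name__
            _seen_errors.add(key)
            raise
    return wrapper


class FakeStringPrinter:
    """Hypothetical stand-in for a printer; not the libstdc++ class."""

    def __init__(self, length):
        self.length = length

    @report_once
    def to_string(self):
        # Mimic the failure mode from the report with a bogus length.
        self.length.to_bytes(8, "little", signed=True)
        return "string of length %d" % self.length
```

With this, the first `FakeStringPrinter(2**64 - 1).to_string()` call raises
OverflowError with a full stack trace, and later calls that fail the same way
return the one-line placeholder. Real errors are still shown once, although
this does not address the concern that the output is garbage either way.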


* [Bug debug/107965] libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures
  2022-12-04 19:46 [Bug libstdc++/107965] New: libstdc++ Python Pretty-Printers: Many Exceptions From Uninitialized Structures gustaf.waldemarson at gmail dot com
                   ` (6 preceding siblings ...)
  2023-01-17 19:37 ` gustaf.waldemarson at gmail dot com
@ 2023-01-17 20:25 ` jason at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jason at gcc dot gnu.org @ 2023-01-17 20:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107965

--- Comment #8 from Jason Merrill <jason at gcc dot gnu.org> ---
The compiler could represent this with a location list instead of a single
location.  I guess implementing that would involve using NOTE_INSN_VAR_LOCATION
instead of giving the variable DECL_RTL?  This sounds complicated.
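
As a rough illustration of the difference between a single location and a
location list, here is a toy model in Python. It is not a DWARF parser: the
class names are hypothetical, and the addresses are made up (only the
`DW_OP_fbreg: -20` expression is taken from the dump in comment #3). It assumes
a debugger would treat a variable as unavailable when no list entry covers the
current PC.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SingleLocation:
    """One location expression, valid at every PC in the function."""
    expr: str

    def lookup(self, pc: int) -> Optional[str]:
        return self.expr


@dataclass
class LocationList:
    """Location expressions valid only inside [start, end) PC ranges."""
    entries: List[Tuple[int, int, str]]

    def lookup(self, pc: int) -> Optional[str]:
        for start, end, expr in self.entries:
            if start <= pc < end:
                return expr
        return None  # no entry covers this PC: variable not available


# Hypothetical layout: function spans 0x15..0x30, constructor returns at 0x25.
single = SingleLocation("DW_OP_fbreg: -20")
loclist = LocationList([(0x25, 0x30, "DW_OP_fbreg: -20")])

print(single.lookup(0x16))   # a location even before construction
print(loclist.lookup(0x16))  # None before the constructor has run
print(loclist.lookup(0x28))  # the location once x is live
```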

