From: Paul Smith <psmith@gnu.org>
To: gdb@sourceware.org
Subject: Repro case!  Re: GDB 10.1: Backtrace goes into infinite loop
Date: Mon, 01 Feb 2021 22:21:27 -0500
Message-ID: <f9ac93d9fe0062a870c2ebe53d21b435c04cf4d1.camel@gnu.org>
In-Reply-To: <503bd54a619aa2781d6b1385cbd3db20634addaa.camel@gnu.org>

Hi all.  I'm back!  I still haven't seen this myself, but someone
(finally!) gave me a test case that fails reliably (see the quoted
message below for the original details).  After examining it I was
able to reduce it to a small repro case.

I've discovered that there's something wonky with the embedded Python
interpreter when a pretty-printer lists threads while stack variables
are being printed.  If I don't load our custom Python macros (so that
none of our pretty-printers are defined) then GDB works fine.  If I
load our macros, Badness Happens.

FYI I'm linking Python 2.7.18 statically with GDB if that matters.

After spelunking my macros I've discovered that I can reproduce the
problem by replacing all our macros with one that does nothing but show
all threads.

Here is the stupid code I'm using:

// ----- foo.cpp
#include <stdlib.h>
#include <iostream>

class Foo
{
public:
    const char* s;
};

void foo(Foo& f)
{
    std::cout << f.s << "\n";
    abort();
}

int main(int argc, const char** argv)
{
    Foo f;
    f.s = argv[1] ? argv[1] : "hi";
    foo(f);
    return 0;
}
// -----

Then I do this:

$ g++ -g -ggdb3 -pthread -o foo foo.cpp

(-pthread is required!!)

Then I run it and get a core:

$ ./foo hiya
hiya
Aborted (core dumped)

Then if I run a simple GDB bt with no Python macros it's fine:

$ gdb -q -batch -ex 'bt' -c core.* foo
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./foo hiya'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f28af56a18b in raise () from /lib/x86_64-linux-gnu/libc.so.6
#0  0x00007f28af56a18b in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f28af549859 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x000055e3b4dc4203 in foo (f=...) at foo.cpp:13
#3  0x000055e3b4dc4256 in main (argc=2, argv=0x7ffe1045c1c8) at foo.cpp:20

Now I add in a pretty-printer for the Foo class, with this file:

# ----- foo.py
import gdb
import gdb.printing

class SyncPrinter(object):
    def __init__(self, val): self.val = val

    def to_string(self):
        gdb.selected_inferior().threads()  # <---- NOTE!!
        return 'hello'

    def display_hint(self): return "Foo"

def mkpp():
    pp = gdb.printing.RegexpCollectionPrettyPrinter("test")
    pp.add_printer('Foo', r'^Foo$', SyncPrinter)
    return pp

gdb.printing.register_pretty_printer(gdb.current_objfile(),
                                     mkpp(), replace=True)
# ----- foo.py

Then I run GDB with this:

$ gdb -q -x foo.py -batch -ex 'bt' -c core.* foo

Now the backtrace loops quite a few times; sometimes it eventually
finishes, and sometimes GDB dumps core with:

gdb/frame.c:2467: internal-error: bool get_frame_pc_if_available(frame_info*,
CORE_ADDR*): Assertion `frame->next != NULL' failed.
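
For anyone who wants to check a captured backtrace for this looping
without eyeballing it, here is a rough sketch.  It assumes the GDB
output has been saved to a string; the helper name count_restarts is
made up for illustration and is not part of GDB.

```python
import re

# A frame line in "bt" output starts with "#<number>" followed by whitespace.
FRAME_RE = re.compile(r'^#(\d+)\s')

def count_restarts(bt_output):
    """Count how often the frame number fails to increase between frame lines."""
    restarts = 0
    prev = None
    for line in bt_output.splitlines():
        m = FRAME_RE.match(line)
        if not m:
            continue  # banner lines and wrapped continuation lines
        n = int(m.group(1))
        if prev is not None and n <= prev:
            restarts += 1
        prev = n
    return restarts
```

In the good -batch run above, frame #0 is printed once before the
backtrace itself, so one restart is expected there; anything more than
that suggests the backtrace started over.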

If I modify the pretty-printer to run gdb.selected_inferior() without
the threads() call, then it works again.
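
For concreteness, the variant that behaves can be sketched as below.
The class name FooPrinterNoThreads and the import guard are only for
illustration (the guard just lets the file load outside GDB);
everything else mirrors foo.py above.

```python
try:
    import gdb
    import gdb.printing
except ImportError:
    gdb = None  # not running inside GDB; assumption made for illustration

class FooPrinterNoThreads(object):
    def __init__(self, val):
        self.val = val

    def to_string(self):
        if gdb is not None:
            gdb.selected_inferior()  # same call as before, but no .threads()
        return 'hello'

def mkpp():
    pp = gdb.printing.RegexpCollectionPrettyPrinter("test")
    pp.add_printer('Foo', r'^Foo$', FooPrinterNoThreads)
    return pp

if gdb is not None:
    gdb.printing.register_pretty_printer(gdb.current_objfile(),
                                         mkpp(), replace=True)
```

Loaded with -x in place of foo.py, this is the version for which the
backtrace completes normally in my testing as described above.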

If anyone has any advice or thoughts I'm interested!!  Thx.


On Fri, 2020-11-13 at 17:16 -0500, Paul Smith via Gdb wrote:
> Hi all;
> 
> I just upgraded our users from a toolset using GCC 8.1.0, binutils
> 2.30, and GDB 8.2.1, to a new one using GCC 10.2, binutils 2.35.1, and
> GDB 10.1 (on GNU/Linux x86_64).
> 
> Now some of my users are running into a problem where they run the "bt"
> command and it shows some subset of the stack frames, then jumps back
> and starts over printing from frame 0, and does this forever until you
> use ^C to stop it.
> 
> Apparently this doesn't happen every time, and the number of frames
> that are shown is variable (but usually a small number like 2 to 5
> frames).  By "not every time" I mean after a breakpoint sometimes we
> get a good bt and sometimes it recurses, but if it recurses for a given
> bt it will always recurse (that is if you use ^C to stop then "bt"
> again it recurses again).
> 
> If we do the same thing with the older GDB (keeping the newer
> compiler/binutils) then we don't see this behavior.
> 
> FWIW, the code in question is C++ code and was compiled with -ggdb3 and
> no optimization.
> 
> Just wondering if anyone has seen something like this, and/or how to
> try to collect more details.
> 

