public inbox for gdb@sourceware.org
* GDB 10.1: Backtrace goes into infinite loop
@ 2020-11-13 22:16 Paul Smith
  2020-11-16  1:04 ` Simon Marchi
  2021-02-02  3:21 ` Repro case! " Paul Smith
  0 siblings, 2 replies; 7+ messages in thread
From: Paul Smith @ 2020-11-13 22:16 UTC (permalink / raw)
  To: gdb

Hi all;

I just upgraded our users from a toolset using GCC 8.1.0, binutils
2.30, and GDB 8.2.1, to a new one using GCC 10.2, binutils 2.35.1, and
GDB 10.1 (on GNU/Linux x86_64).

Now some of my users are running into a problem where they run the "bt"
command and it shows some subset of the stack frames, then jumps back
and starts over printing from frame 0, and does this forever until you
use ^C to stop it.

Apparently this doesn't happen every time, and the number of frames
shown before it starts over varies (usually a small number, like 2 to 5
frames).  By "not every time" I mean that after a breakpoint we
sometimes get a good bt and sometimes a looping one, but if a given bt
loops it will always loop (that is, if you use ^C to stop it and then
run "bt" again, it loops again).
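
A minimal sketch of how the symptom described above could be detected
in captured "bt" output.  The helper name count_restarts and the sample
strings are made up for illustration; the only assumption about GDB is
that frame lines start with "#<number>".

```python
import re

# Frame number at the start of a backtrace line, e.g. "#3  0x... in main ()".
FRAME_RE = re.compile(r"^#(\d+)\s")


def count_restarts(bt_output):
    """Count how often the numbering jumps back to #0 after a deeper frame."""
    restarts = 0
    previous = None
    for line in bt_output.splitlines():
        match = FRAME_RE.match(line)
        if match is None:
            continue  # not a frame line (banner, wrapped text, ...)
        number = int(match.group(1))
        # "#0" directly after "#0" is not counted: GDB prints the current
        # frame when it loads a core, and "bt" then prints frame 0 again.
        if number == 0 and previous is not None and previous > 0:
            restarts += 1
        previous = number
    return restarts


# Hypothetical sample outputs, shortened for illustration.
healthy = "#0  raise ()\n#0  raise ()\n#1  abort ()\n#2  foo (f=...)\n#3  main ()\n"
looping = "#0  raise ()\n#1  abort ()\n#2  foo (f=...)\n#0  raise ()\n#1  abort ()\n#0  raise ()\n"
print(count_restarts(healthy))  # prints 0
print(count_restarts(looping))  # prints 2
```

A non-zero count means the output went back to frame 0 after printing
deeper frames, which a healthy backtrace should not do.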

If we do the same thing with the older GDB (keeping the newer
compiler/binutils) then we don't see this behavior.

FWIW, the code in question is C++ code and was compiled with -ggdb3 and
no optimization.

Just wondering if anyone has seen something like this, and/or how to
try to collect more details.



* Re: GDB 10.1: Backtrace goes into infinite loop
  2020-11-13 22:16 GDB 10.1: Backtrace goes into infinite loop Paul Smith
@ 2020-11-16  1:04 ` Simon Marchi
  2020-11-21 18:48   ` Paul Smith
  2021-02-02  3:21 ` Repro case! " Paul Smith
  1 sibling, 1 reply; 7+ messages in thread
From: Simon Marchi @ 2020-11-16  1:04 UTC (permalink / raw)
  To: psmith, gdb


Hi Paul,

When you see the problem happening, you could save a core file (just run
the generate-core-file command).  Then, try to reproduce the problem
using the core file.  If it reproduces, you could upload it to a bug
report.

Simon


* Re: GDB 10.1: Backtrace goes into infinite loop
  2020-11-16  1:04 ` Simon Marchi
@ 2020-11-21 18:48   ` Paul Smith
  2020-11-21 20:33     ` Aurelian Melinte
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Smith @ 2020-11-21 18:48 UTC (permalink / raw)
  To: gdb

On Sun, 2020-11-15 at 20:04 -0500, Simon Marchi wrote:
> When you see the problem happening, you could save a core file (just
> run the generate-core-file command).  Then, try to reproduce the
> problem using the core file.  If it reproduces, you could upload it
> to a bug report.

Hi Simon; thanks for the reply.  Just to say that this is still
happening and I've now heard from two people on my team who see this
behavior.

I have not seen it though.  I'm asking them to provide me with
reproduction core files and binaries.  As soon as I have this I will
investigate further.

Cheers!



* Re: GDB 10.1: Backtrace goes into infinite loop
  2020-11-21 18:48   ` Paul Smith
@ 2020-11-21 20:33     ` Aurelian Melinte
  0 siblings, 0 replies; 7+ messages in thread
From: Aurelian Melinte @ 2020-11-21 20:33 UTC (permalink / raw)
  To: psmith, gdb

I remember having seen this with a C++ binary compiled with clang. At
the time I thought it was looping over static data members in classes:
for example, a class C with a static instance s of itself, and a member
function in C that returns a reference or pointer to s.

HTH



* Repro case!  Re: GDB 10.1: Backtrace goes into infinite loop
  2020-11-13 22:16 GDB 10.1: Backtrace goes into infinite loop Paul Smith
  2020-11-16  1:04 ` Simon Marchi
@ 2021-02-02  3:21 ` Paul Smith
  2021-02-02  4:27   ` Paul Smith
  1 sibling, 1 reply; 7+ messages in thread
From: Paul Smith @ 2021-02-02  3:21 UTC (permalink / raw)
  To: gdb

Hi all.  I'm back!  I've still not seen this myself but someone
(finally!) provided me with a reproducible test case (see the quoted
message below for all details).  After examining it I am able to come
up with a repro case.

I've discovered that there's something wonky with the embedded Python
interpreter and checking threads while pretty-printing stack variables.
If I don't load our custom Python macros (so that none of our
pretty-printers are defined) then GDB works fine.  If I load our macros,
Badness Happens.

FYI I'm linking Python 2.7.18 statically with GDB if that matters.

After spelunking my macros I've discovered that I can reproduce the
problem by replacing all our macros with one that does nothing but show
all threads.

Here is the stupid code I'm using:

// ----- foo.cpp
#include <stdlib.h>
#include <iostream>

class Foo
{
public:
    const char* s;
};

void foo(Foo& f)
{
    std::cout << f.s << "\n";
    abort();
}

int main(int argc, const char** argv)
{
    Foo f;
    f.s = argv[1] ? argv[1] : "hi";
    foo(f);
    return 0;
}
// -----

Then I do this:

$ g++ -g -ggdb3 -pthread -o foo foo.cpp

(-pthread is required!!)

Then I run it and get a core:

$ ./foo hiya
hiya
Aborted (core dumped)

Then if I run a simple GDB bt with no Python macros it's fine:

$ gdb -q -batch -ex 'bt' -c core.* foo
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./foo hiya'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f28af56a18b in raise () from /lib/x86_64-linux-gnu/libc.so.6
#0  0x00007f28af56a18b in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f28af549859 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x000055e3b4dc4203 in foo (f=...) at foo.cpp:13
#3  0x000055e3b4dc4256 in main (argc=2, argv=0x7ffe1045c1c8) at foo.cpp:20

Now I add in a pretty-printer for the Foo class, with this file:

# ----- foo.py
import gdb
import gdb.printing  # explicit import of the submodule used below

class SyncPrinter(object):
    def __init__(self, val): self.val = val

    def to_string(self):
        gdb.selected_inferior().threads()  # <---- NOTE!!
        return 'hello'

    def display_hint(self): return "Foo"

def mkpp():
    pp = gdb.printing.RegexpCollectionPrettyPrinter("test")
    pp.add_printer('Foo', r'^Foo$', SyncPrinter)
    return pp

gdb.printing.register_pretty_printer(gdb.current_objfile(),
                                     mkpp(), replace=True)
# ----- foo.py

Then I run GDB with this:

$ gdb -q -x foo.py -batch -ex 'bt' -c core.* foo

Now the backtrace loops quite a few times, then sometimes it finishes,
sometimes it dumps core with:

gdb/frame.c:2467: internal-error: bool get_frame_pc_if_available(frame_info*,
CORE_ADDR*): Assertion `frame->next != NULL' failed.
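
Since the looping run may never finish on its own, one way to script
the two invocations above is to wrap them in a timeout.  This is only a
sketch: the function names and the 30 second default are my own
choices, and the gdb flags are exactly the ones from the two command
lines shown earlier.

```python
import subprocess


def gdb_bt_argv(binary, core, script=None):
    """Build a batch-mode gdb command line that loads a core and runs 'bt'."""
    argv = ["gdb", "-q"]
    if script is not None:
        argv += ["-x", script]  # load the pretty-printer file, e.g. foo.py
    argv += ["-batch", "-ex", "bt", "-c", core, binary]
    return argv


def run_bt(binary, core, script=None, timeout_s=30):
    """Run gdb and return its combined output, or None on timeout."""
    try:
        done = subprocess.run(
            gdb_bt_argv(binary, core, script),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_s,  # the child is killed if the backtrace never ends
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout
```

Calling run_bt("foo", "core.1234") and run_bt("foo", "core.1234",
"foo.py") (the core file name here is a placeholder) and comparing the
results would show whether loading the pretty-printer changes the
backtrace.  A None result means the run hit the timeout.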

If I modify the pretty-printer to run gdb.selected_inferior() without
the threads() call, then it works again.

If anyone has any advice or thoughts I'm interested!!  Thx.


On Fri, 2020-11-13 at 17:16 -0500, Paul Smith via Gdb wrote:
> Hi all;
> 
> I just upgraded our users from a toolset using GCC 8.1.0, binutils
> 2.30, and GDB 8.2.1, to a new one using GCC 10.2, binutils 2.35.1, and
> GDB 10.1 (on GNU/Linux x86_64).
> 
> Now some of my users are running into a problem where they run the "bt"
> command and it shows some subset of the stack frames, then jumps back
> and starts over printing from frame 0, and does this forever until you
> use ^C to stop it.
> 
> Apparently this doesn't happen every time, and the number of frames
> that are shown are variable (but usually a smaller number like 2 to 5
> frames).  By "not every time" I mean after a breakpoint sometimes we
> get a good bt and sometimes it recurses, but if it recurses for a given
> bt it will always recurse (that is if you use ^C to stop then "bt"
> again it recurses again).
> 
> If we do the same thing with the older GDB (keeping the newer
> compiler/binutils) then we don't see this behavior.
> 
> FWIW, the code in question is C++ code and was compiled with -ggdb3 and
> no optimization.
> 
> Just wondering if anyone has seen something like this, and/or how to
> try to collect more details.
> 



* Re: Repro case!  Re: GDB 10.1: Backtrace goes into infinite loop
  2021-02-02  3:21 ` Repro case! " Paul Smith
@ 2021-02-02  4:27   ` Paul Smith
  2021-02-02 13:45     ` Paul Smith
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Smith @ 2021-02-02  4:27 UTC (permalink / raw)
  To: gdb

On Mon, 2021-02-01 at 22:21 -0500, Paul Smith via Gdb wrote:
> I've discovered that there's something wonky with the embedded Python
> interpreter and checking threads while pretty-printing stack
> variables.

More proof:

I rebuilt GDB 10.1 with this patch applied:

--- a/gdb/python/py-inferior.c
+++ b/gdb/python/py-inferior.c
@@ -394,7 +394,7 @@ infpy_threads (PyObject *self, PyObject *args)

   try
     {
-      update_thread_list ();
+      // update_thread_list ();
     }
   catch (const gdb_exception &except)
     {

I did this just to see what would happen, and this version of GDB works
fine.

There's something unpleasant about invoking update_thread_list() while
we are attempting to pretty-print stack variables; it must mess with
some global state somewhere that doesn't expect to be messed with, at
that time.
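
If that diagnosis is right, a possible workaround on the Python side is
to keep the threads() call out of to_string() and read a snapshot taken
earlier.  The sketch below is untested against GDB 10.1; ThreadSnapshot,
SNAPSHOT and FooPrinter are hypothetical names, and whether refreshing
from a stop event is itself safe is an assumption.  The snapshot can
also go stale between refreshes.

```python
try:
    import gdb  # only importable inside GDB's embedded interpreter
except ImportError:
    gdb = None


class ThreadSnapshot(object):
    """Hypothetical helper: thread numbers collected outside to_string()."""

    def __init__(self):
        self.thread_nums = ()

    def refresh(self, inferior):
        # threads() is the call that reaches update_thread_list() in GDB,
        # so it is made here and not while a value is being printed.
        self.thread_nums = tuple(t.num for t in inferior.threads())


SNAPSHOT = ThreadSnapshot()


class FooPrinter(object):
    def __init__(self, val):
        self.val = val

    def to_string(self):
        # Reads the snapshot only; no thread-list update during "bt".
        return "hello (%d threads)" % len(SNAPSHOT.thread_nums)


if gdb is not None:
    import gdb.printing

    # Take one snapshot at load time and another after every stop
    # (assumption: both points are outside any backtrace in progress).
    SNAPSHOT.refresh(gdb.selected_inferior())
    gdb.events.stop.connect(
        lambda event: SNAPSHOT.refresh(gdb.selected_inferior()))

    pp = gdb.printing.RegexpCollectionPrettyPrinter("test")
    pp.add_printer("Foo", r"^Foo$", FooPrinter)
    gdb.printing.register_pretty_printer(gdb.current_objfile(), pp,
                                         replace=True)
```

The file would be loaded with "gdb -x" in the same way as foo.py.
Outside GDB the import fails and only the two classes are defined, so
the snapshot logic can be exercised with a stand-in inferior object.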



* Re: Repro case!  Re: GDB 10.1: Backtrace goes into infinite loop
  2021-02-02  4:27   ` Paul Smith
@ 2021-02-02 13:45     ` Paul Smith
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Smith @ 2021-02-02 13:45 UTC (permalink / raw)
  To: gdb

On Mon, 2021-02-01 at 23:27 -0500, Paul Smith via Gdb wrote:
> There's something unpleasant about invoking update_thread_list()
> while we are attempting to pretty-print stack variables; it must mess
> with some global state somewhere that doesn't expect to be messed
> with, at that time.

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=27315 for this.

It would be great if anyone could confirm they can see the issue as
well (or if there's a proposed fix even better!!)

