[Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard

public inbox for gdb-prs@sourceware.org

* [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard
@ 2021-06-25 23:43 rincebrain at gmail dot com
  2021-06-26  1:20 ` [Bug threads/28014] " simark at simark dot ca
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: rincebrain at gmail dot com @ 2021-06-25 23:43 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

            Bug ID: 28014
           Summary: gdb coredumps when remote+kgdbing a system that OOMs
                    too hard
           Product: gdb
           Version: 10.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: threads
          Assignee: unassigned at sourceware dot org
          Reporter: rincebrain at gmail dot com
  Target Milestone: ---

I was running gdb as a mips64el binary under qemu-user, attached remotely to a
qemu mips64el system via kgdb.  Everything was going fine (from gdb's
perspective; the system was in the process of eating all the RAM), and then
things went south.

The kernel reported, after a whole bunch of other screaming:
[59702.844702] Out of memory and no killable processes...
[59702.845130] Kernel panic - not syncing: System is deadlocked on memory

Meanwhile, when I tabbed back over, gdb said:
[New Thread 172398]
[New Thread 172401]
[New Thread 172402]
[New Thread 172403]
[New Thread 172404]
[New Thread 172405]
--Type <RET> for more, q to quit, c to continue without paging--

Thread 2 received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 1]
/build/gdb-OSO7kB/gdb-10.1/gdb/thread.c:95: internal-error: thread_info*
inferior_thread(): Assertion `current_thread_ != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

/build/gdb-OSO7kB/gdb-10.1/gdb/thread.c:95: internal-error: thread_info*
inferior_thread(): Assertion `current_thread_ != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) y
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

And so I'm here.

(I won't be astonished if you just close this with "everything was on fire,
what are you expecting", but since gdb did request I open a bug, here I am.)

Since the core is 80M compressed and 236M uncompressed, you can find it:
https://www.dropbox.com/s/u7erpk1x6fj4ds6/qemu_gdb_20210625-192829_18450.core.zst?dl=0

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
@ 2021-06-26  1:20 ` simark at simark dot ca
  2021-06-26  2:15 ` rincebrain at gmail dot com
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: simark at simark dot ca @ 2021-06-26  1:20 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

Simon Marchi <simark at simark dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |simark at simark dot ca

--- Comment #1 from Simon Marchi <simark at simark dot ca> ---
Hi Rich,

(In reply to Rich from comment #0)
> I was using gdb on mips64el under qemu-user to remote a qemu of a mips64el
> system with kgdb, everything was going fine (from gdb's perspective - the
> system was in the process of eating all the RAM), then things went south.

Unrelated to the reported problem, but do you run GDB as a mips64el program
inside qemu-user only so that you can debug your remote mips64el program?  It
might be easier to run an x86-64 GDB (or whatever your host system is) to
connect to your mips64el remote.  That GDB just needs to be built to include
mips support, using --target=<your-triplet>, or
--enable-targets=<your-triplet>, or --enable-targets=all.

> The kernel reported, after a whole bunch of other screaming:
> [59702.844702] Out of memory and no killable processes...
> [59702.845130] Kernel panic - not syncing: System is deadlocked on memory
> 
> Meanwhile, when I tabbed back over, gdb said:
> [New Thread 172398]
> [New Thread 172401]
> [New Thread 172402]
> [New Thread 172403]
> [New Thread 172404]
> [New Thread 172405]
> --Type <RET> for more, q to quit, c to continue without paging--
> 
> Thread 2 received signal SIGTRAP, Trace/breakpoint trap.
> [Switching to Thread 1]
> /build/gdb-OSO7kB/gdb-10.1/gdb/thread.c:95: internal-error: thread_info*
> inferior_thread(): Assertion `current_thread_ != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) y
> 
> This is a bug, please report it.  For instructions, see:
> <https://www.gnu.org/software/gdb/bugs/>.
> 
> /build/gdb-OSO7kB/gdb-10.1/gdb/thread.c:95: internal-error: thread_info*
> inferior_thread(): Assertion `current_thread_ != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Create a core file of GDB? (y or n) y
> qemu: uncaught target signal 6 (Aborted) - core dumped
> Aborted
> 
> 
> And so I'm here.
> 
> (I won't be astonished if you just close this with "everything was on fire,
> what are you expecting", but since gdb did request I open a bug, here I am.)

I presume that the remote connection (what kind of connection was that, tcp?)
closed at an unexpected time.  GDB is not very good at handling that
gracefully.  

It might not be a bug somebody will immediately jump to fix, but every time you
hit a GDB internal error, you can consider it a valid GDB bug.  It should never
be possible for a user to hit an internal error, whatever the bad input they
provide (including a remote connection breaking at an unexpected moment).  

> Since the core is 80M compressed and 236M uncompressed, you can find it:
> https://www.dropbox.com/s/u7erpk1x6fj4ds6/qemu_gdb_20210625-192829_18450.core.zst?dl=0

There's little we can do without the gdb executable that goes along with the
core file.  Can you provide that too?  If we can get a good backtrace of GDB,
at least we'll be able to get a rough idea.


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
  2021-06-26  1:20 ` [Bug threads/28014] " simark at simark dot ca
@ 2021-06-26  2:15 ` rincebrain at gmail dot com
  2021-06-26  2:29 ` simark at simark dot ca
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rincebrain at gmail dot com @ 2021-06-26  2:15 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

--- Comment #2 from Rich <rincebrain at gmail dot com> ---
(In reply to Simon Marchi from comment #1)
> Hi Rich,
> 
> (In reply to Rich from comment #0)
> > I was using gdb on mips64el under qemu-user to remote a qemu of a mips64el
> > system with kgdb, everything was going fine (from gdb's perspective - the
> > system was in the process of eating all the RAM), then things went south.
> 
> Unrelated to the reported problem, but do you run GDB as a mips64el program
> inside qemu-user only so that you can debug your remote mips64el program? 
> It might be easier to run an x86-64 GDB (or whatever your host system is) to
> connect to your mips64el remote.  That GDB just needs to be built to include
> mips support, using --target=<your-triplet>, or
> --enable-targets=<your-triplet>, or --enable-targets=all.

Yeah, but recompiling gdb with all the arches I might need is a timesink when I
already had a chroot right there, and Debian's definitely doesn't have other
targets enabled OOTB.


> It might not be a bug somebody will immediately jump to fix, but every time
> you hit a GDB internal error, you can consider it a valid GDB bug.  It
> should never be possible for a user to hit an internal error, whatever the
> bad input they provide (including a remote connection breaking at an
> unexpected moment).  

Duly noted. That's nice; I've encountered all sorts of philosophies on that,
and while the one you describe aligns with my own opinions on the matter, not
everyone agrees.

> > Since the core is 80M compressed and 236M uncompressed, you can find it:
> > https://www.dropbox.com/s/u7erpk1x6fj4ds6/qemu_gdb_20210625-192829_18450.core.zst?dl=0
> 
> There's little we can do without the gdb executable that goes along with the
> core file.  Can you provide that too?  If we can get a good backtrace of
> GDB, at least we'll be able to get a rough idea.

It's just the gdb binary from Debian bullseye at this precise moment:
https://www.dropbox.com/s/orf6tmcbctpjx16/gdb?dl=0

You may find this pretty useless, as none of my gdbs find the core useful, and
so far nobody in #qemu has known how to make use of it either.


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
  2021-06-26  1:20 ` [Bug threads/28014] " simark at simark dot ca
  2021-06-26  2:15 ` rincebrain at gmail dot com
@ 2021-06-26  2:29 ` simark at simark dot ca
  2021-06-26  2:41 ` rincebrain at gmail dot com
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: simark at simark dot ca @ 2021-06-26  2:29 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

--- Comment #3 from Simon Marchi <simark at simark dot ca> ---
(In reply to Rich from comment #2)
> (In reply to Simon Marchi from comment #1)
> > Hi Rich,
> > 
> > (In reply to Rich from comment #0)
> > > I was using gdb on mips64el under qemu-user to remote a qemu of a mips64el
> > > system with kgdb, everything was going fine (from gdb's perspective - the
> > > system was in the process of eating all the RAM), then things went south.
> > 
> > Unrelated to the reported problem, but do you run GDB as a mips64el program
> > inside qemu-user only so that you can debug your remote mips64el program? 
> > It might be easier to run an x86-64 GDB (or whatever your host system is) to
> > connect to your mips64el remote.  That GDB just needs to be built to include
> > mips support, using --target=<your-triplet>, or
> > --enable-targets=<your-triplet>, or --enable-targets=all.
> 
> Yeah, but recompiling gdb with all the arches I might need is a timesink
> when I already had a chroot right there, and Debian's definitely doesn't
> have other targets enabled OOTB.

No problem, I mentioned it just in case.

IIRC, this package is GDB built with --enable-targets=all:
https://packages.debian.org/bullseye/gdb-multiarch.  So you could use it on a
host Debian.  But if your setup works, it works.

> It's just the gdb binary from Debian bullseye at this precise moment:
> https://www.dropbox.com/s/orf6tmcbctpjx16/gdb?dl=0
> 
> You may find this pretty useless, as none of my gdbs find the core useful,
> and so far nobody in #qemu has known how to make use of it either.

Hmm, no success here either.  And I couldn't find debug info for that build in
Debian's repos.

$ ./gdb -nx --data-directory=data-directory -q /tmp/gdb /tmp/qemu_gdb_20210625-192829_18450.core
Reading symbols from /tmp/gdb...
(No debugging symbols found in /tmp/gdb)

warning: core file may not match specified executable file.
[New LWP 18450]
[New LWP 18479]
[New LWP 18480]
Core was generated by ``/@   d/@   h/@   u/@   y/@  '.
#0  0x0000004003148b4c in ?? ()
[Current thread is 1 (LWP 18450)]
(gdb) bt
warning: GDB can't find the start of the function at 0x4003148b4c.

    GDB is unable to find the start of the function at 0x4003148b4c
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
    This problem is most likely caused by an invalid program counter or
stack pointer.
    However, if you think GDB should simply search farther back
from 0x4003148b4c for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
#0  0x0000004003148b4c in ?? ()
(gdb)


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
                   ` (2 preceding siblings ...)
  2021-06-26  2:29 ` simark at simark dot ca
@ 2021-06-26  2:41 ` rincebrain at gmail dot com
  2021-06-26  4:27 ` simark at simark dot ca
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rincebrain at gmail dot com @ 2021-06-26  2:41 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

--- Comment #4 from Rich <rincebrain at gmail dot com> ---
(In reply to Simon Marchi from comment #3)
> (In reply to Rich from comment #2)
> > (In reply to Simon Marchi from comment #1)
> > > Hi Rich,
> > > 
> > > (In reply to Rich from comment #0)
> > > > I was using gdb on mips64el under qemu-user to remote a qemu of a mips64el
> > > > system with kgdb, everything was going fine (from gdb's perspective - the
> > > > system was in the process of eating all the RAM), then things went south.
> > > 
> > > Unrelated to the reported problem, but do you run GDB as a mips64el program
> > > inside qemu-user only so that you can debug your remote mips64el program? 
> > > It might be easier to run an x86-64 GDB (or whatever your host system is) to
> > > connect to your mips64el remote.  That GDB just needs to be built to include
> > > mips support, using --target=<your-triplet>, or
> > > --enable-targets=<your-triplet>, or --enable-targets=all.
> > 
> > Yeah, but recompiling gdb with all the arches I might need is a timesink
> > when I already had a chroot right there, and Debian's definitely doesn't
> > have other targets enabled OOTB.
> 
> No problem, I mentioned it just in case.
> 
> IIRC, this package is GDB built with --enable-targets=all:
> https://packages.debian.org/bullseye/gdb-multiarch.  So you could use it on
> a host Debian.  But if your setup works, it works.

That was what I assumed as well, but for at least this case, it errored in
precisely the same way as non-multiarch Debian. It's possible it would work for
local core dumps if not for, well, the other difficulties.

> > It's just the gdb binary from Debian bullseye at this precise moment:
> > https://www.dropbox.com/s/orf6tmcbctpjx16/gdb?dl=0
> > 
> > You may find this pretty useless, as none of my gdbs find the core useful,
> > and so far nobody in #qemu has known how to make use of it either.
> 
> Hmm, no success here either.  And I couldn't find debug info for that build
> in Debian's repos.
> 
> $ ./gdb -nx --data-directory=data-directory -q /tmp/gdb /tmp/qemu_gdb_20210625-192829_18450.core
> Reading symbols from /tmp/gdb...
> (No debugging symbols found in /tmp/gdb)
> 
> warning: core file may not match specified executable file.
> [New LWP 18450]
> [New LWP 18479]
> [New LWP 18480]
> Core was generated by ``/@   d/@   h/@   u/@   y/@  '.
> #0  0x0000004003148b4c in ?? ()
> [Current thread is 1 (LWP 18450)]
> (gdb) bt
> warning: GDB can't find the start of the function at 0x4003148b4c.
> 
>     GDB is unable to find the start of the function at 0x4003148b4c
> and thus can't determine the size of that function's stack frame.
> This means that GDB may be unable to access that stack frame, or
> the frames below it.
>     This problem is most likely caused by an invalid program counter or
> stack pointer.
>     However, if you think GDB should simply search farther back
> from 0x4003148b4c for code which looks like the beginning of a
> function, you can increase the range of the search using the `set
> heuristic-fence-post' command.
> #0  0x0000004003148b4c in ?? ()
> (gdb)

Yeah, that's about what I get.

I had no trouble finding the debug symbols from the debug symbol repo, though:

# apt install gdb-dbgsym
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following package was automatically installed and is no longer required:
  linux-image-5.10.0-6-5kc-malta
Use 'sudo apt autoremove' to remove it.
The following NEW packages will be installed:
  gdb-dbgsym
0 upgraded, 1 newly installed, 0 to remove and 8 not upgraded.
Need to get 21.3 MB of archives.
After this operation, 23.2 MB of additional disk space will be used.
Get:1 http://debug.mirrors.debian.org/debian-debug bullseye-debug/main mips64el gdb-dbgsym mips64el 10.1-1.7 [21.3 MB]
Fetched 21.3 MB in 2s (9,234 kB/s)

Not that it helped...

# gdb `which gdb` qemu_gdb_20210625-192829_18450.core
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "mips64el-linux-gnuabi64".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/gdb...
Reading symbols from /usr/lib/debug/.build-id/80/af60f6adb52cb6a14118a4fea3f9270ea1a923.debug...
[New LWP 18450]
[New LWP 18479]
[New LWP 18480]
Core was generated by ``�/@   d�/@   h�/@   u�/@   y�/@  '.
#0  0x0000004003148b4c in ?? (
warning: GDB can't find the start of the function at 0x4003148b4c.

    GDB is unable to find the start of the function at 0x4003148b4c
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
    This problem is most likely caused by an invalid program counter or
stack pointer.
    However, if you think GDB should simply search farther back
from 0x4003148b4c for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
)
[Current thread is 1 (LWP 18450)]
(gdb)


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
                   ` (3 preceding siblings ...)
  2021-06-26  2:41 ` rincebrain at gmail dot com
@ 2021-06-26  4:27 ` simark at simark dot ca
  2021-06-26 10:55 ` rincebrain at gmail dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: simark at simark dot ca @ 2021-06-26  4:27 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

--- Comment #5 from Simon Marchi <simark at simark dot ca> ---
I dug a little bit.  I checked why GDB couldn't see any shared library, and I
think there's a little bug in the MIPS-specific code where GDB uses a
MIPS-specific auxv entry to locate the base of the runtime loader.  It doesn't
take into account the main executable's runtime address.

With the hack below, I can get a shared library list.  I then identified that
the current PC was in libc, so I downloaded and extracted
libc6_2.31-12_mips64el.deb in my sysroot.  The backtrace becomes a bit better:

(gdb) bt
#0  0x0000004003148b4c in raise () from /tmp/investigation/lib/mips64el-linux-gnuabi64/libc.so.6
#1  0x000000400312fa50 in abort () from /tmp/investigation/lib/mips64el-linux-gnuabi64/libc.so.6
Backtrace stopped: frame did not save the PC

To get further, we would need libc's debug info.  I tried to find debug info
for libc.so.6 but can't find it.  It's supposed to be in the libc6 package,
which I think should be found here:

http://debug.mirrors.debian.org/debian-debug/pool/main/g/glibc/

But I can't find the debug package corresponding to libc6_2.31-12_mips64el.deb.
The libc's build-id is 59aa6ff984aff00883acda7feef7613cce475991, I can't find
that in any of the packages.  Do you have an idea?
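
As an aside on looking up debug info by build-id: separate debug files are
conventionally stored under a .build-id directory, split after the first two
hex digits of the build-id.  That is the layout visible in the
/usr/lib/debug/.build-id/80/af60...debug path GDB printed in comment #4.  A
minimal Python sketch of that mapping follows.  The function name
build_id_debug_path and the default /usr/lib/debug root are illustrative
assumptions, not code taken from GDB.

```python
from pathlib import PurePosixPath


def build_id_debug_path(build_id: str, debug_root: str = "/usr/lib/debug") -> str:
    """Map a hex build-id to the conventional separate-debug-file path.

    Hypothetical helper for illustration: the first two hex digits name a
    subdirectory of <debug_root>/.build-id, the rest name the .debug file.
    """
    build_id = build_id.strip().lower()
    if len(build_id) < 3 or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError("build-id must be a hex string of at least 3 digits")
    path = PurePosixPath(debug_root) / ".build-id" / build_id[:2] / (build_id[2:] + ".debug")
    return str(path)


# The libc build-id quoted in this message.
print(build_id_debug_path("59aa6ff984aff00883acda7feef7613cce475991"))
```

Whether a file exists at the resulting path depends on which debug package
is installed, so this only tells you where to look.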

From a52228d3bb10cc6d1f6d748d22ee8a4cabff3304 Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@polymtl.ca>
Date: Sat, 26 Jun 2021 00:16:55 -0400
Subject: [PATCH] fix

Change-Id: I0d90e5432a5a998840dae7446bfcdb8995cf0297
---
 gdb/solib-svr4.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index a8a7d1171dc6..d60427cd46bc 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -798,7 +798,7 @@ elf_locate_base (void)
       pbuf = (gdb_byte *) alloca (pbuf_size);
       /* DT_MIPS_RLD_MAP_REL contains an offset from the address of the
         DT slot to the address of the dynamic link structure.  */
-      if (target_read_memory (dyn_ptr + dyn_ptr_addr, pbuf, pbuf_size))
+      if (target_read_memory (dyn_ptr + dyn_ptr_addr + 0x4000000000, pbuf, pbuf_size))
        return 0;
       return extract_typed_address (pbuf, ptr_type);
     }
-- 
2.32.0


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
                   ` (4 preceding siblings ...)
  2021-06-26  4:27 ` simark at simark dot ca
@ 2021-06-26 10:55 ` rincebrain at gmail dot com
  2021-06-26 13:04 ` simark at simark dot ca
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rincebrain at gmail dot com @ 2021-06-26 10:55 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

--- Comment #6 from Rich <rincebrain at gmail dot com> ---
(In reply to Simon Marchi from comment #5)
> I dug a little bit.  I checked why GDB couldn't see any shared library, I
> think there's a little bug in the MIPS-specific code where GDB uses a
> MIPS-specific auxv entry to locate the base of the runtime loader.  It
> doesn't take into account the main executable's runtime address.

Lovely.

> To get further, we would need libc's debug info.  I tried to find debug info
> for libc.so.6 but can't find it.  It's supposed to be in the libc6 package,
> which I think should be found here:
> 
> http://debug.mirrors.debian.org/debian-debug/pool/main/g/glibc/
> 
> But I can't find the debug package corresponding to
> libc6_2.31-12_mips64el.deb. The libc's build-id is
> 59aa6ff984aff00883acda7feef7613cce475991, I can't find that in any of the
> packages.  Do you have an idea?

I think you're getting burned because glibc is one of the ones that (for legacy
reasons, now that $RELEASE-debug is a thing? I'm unsure.) gets shoved in a -dbg
package in the main repos. gdb finds those symbols and loads them for me, even
as everything else horfs; I haven't tried recompiling gdb with that patch,
though.


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
                   ` (5 preceding siblings ...)
  2021-06-26 10:55 ` rincebrain at gmail dot com
@ 2021-06-26 13:04 ` simark at simark dot ca
  2021-06-27  5:27 ` simark at simark dot ca
  2021-06-27 12:24 ` rincebrain at gmail dot com
  8 siblings, 0 replies; 10+ messages in thread
From: simark at simark dot ca @ 2021-06-26 13:04 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

--- Comment #7 from Simon Marchi <simark at simark dot ca> ---
(In reply to Rich from comment #6)
> (In reply to Simon Marchi from comment #5)
> > I dug a little bit.  I checked why GDB couldn't see any shared library, I
> > think there's a little bug in the MIPS-specific code where GDB uses a
> > MIPS-specific auxv entry to locate the base of the runtime loader.  It
> > doesn't take into account the main executable's runtime address.
>  
> Lovely.

Actually, the bug is in svr4_exec_displacement, which doesn't find the
executable displacement correctly.  I now use the following hack instead:

From fa346c1961e206a67ccad84c13c0d9f3a1217bfc Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@polymtl.ca>
Date: Sat, 26 Jun 2021 00:16:55 -0400
Subject: [PATCH] fix

Change-Id: I0d90e5432a5a998840dae7446bfcdb8995cf0297
---
 gdb/solib-svr4.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index a8a7d1171dc6..215d3f3a2250 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -2574,6 +2574,9 @@ svr4_exec_displacement (CORE_ADDR *displacementp)
      a call to gdbarch_convert_from_func_ptr_addr.  */
   CORE_ADDR entry_point, exec_displacement;

+  *displacementp = 0x0000004000000000;
+  return 1;
+
   if (current_program_space->exec_bfd () == NULL)
     return 0;

-- 
2.32.0



> > To get further, we would need libc's debug info.  I tried to find debug info
> > for libc.so.6 but can't find it.  It's supposed to be in the libc6 package,
> > which I think should be found here:
> > 
> > http://debug.mirrors.debian.org/debian-debug/pool/main/g/glibc/
> > 
> > But I can't find the debug package corresponding to
> > libc6_2.31-12_mips64el.deb. The libc's build-id is
> > 59aa6ff984aff00883acda7feef7613cce475991, I can't find that in any of the
> > packages.  Do you have an idea?
> 
> I think you're getting burned because glibc is one of the ones that (for
> legacy reasons, now that $RELEASE-debug is a thing? I'm unsure.) gets shoved
> in a -dbg package in the main repos. gdb finds those symbols and loads them
> for me, even as everything else horfs; I haven't tried recompiling gdb with
> that patch, though.

Ah, got it.  libc6-dbg has the right file.  So I got this backtrace:

(gdb) bt
#0  __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000000400312fa50 in __GI_abort () at abort.c:79
#2  0x00000040005ce440 in dump_core () at /build/gdb-OSO7kB/gdb-10.1/gdb/utils.c:204
#3  0x00000040005d4ea0 in internal_vproblem (problem=<optimized out>, file=<optimized out>, line=<optimized out>, fmt=<optimized out>, ap=<optimized out>)
    at /build/gdb-OSO7kB/gdb-10.1/gdb/utils.c:424
#4  0x00000040005d5234 in internal_verror (file=<optimized out>, line=<optimized out>, fmt=<optimized out>, ap=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/utils.c:439
#5  0x00000040007b3000 in internal_error (file=<optimized out>, line=<optimized out>, fmt=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdbsupport/errors.cc:55
#6  0x0000004000586ebc in inferior_thread () at /build/gdb-OSO7kB/gdb-10.1/gdb/thread.c:93
#7  inferior_thread () at /build/gdb-OSO7kB/gdb-10.1/gdb/thread.c:93
#8  0x000000400039c748 in print_stop_event (uiout=0x4000b795f0, displays=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/infrun.c:8136
#9  0x00000040005b1288 in tui_on_normal_stop (bs=<optimized out>, print_frame=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/tui/tui-interp.c:98
#10 0x00000040003a00f4 in std::function<void (bpstats*, int)>::operator()(bpstats*, int) const (__args#1=<optimized out>, __args#0=0x0, this=0x4000bfd530)
    at /usr/include/c++/10/bits/std_function.h:622
#11 gdb::observers::observable<bpstats*, int>::notify (args#1=1, args#0=0x0, this=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/../gdbsupport/observable.h:106
#12 normal_stop () at /build/gdb-OSO7kB/gdb-10.1/gdb/infrun.c:8407
#13 0x00000040003a9674 in fetch_inferior_event () at /build/gdb-OSO7kB/gdb-10.1/gdb/infrun.c:3967
#14 0x00000040003858ac in inferior_event_handler (event_type=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/inf-loop.c:42
#15 0x00000040004d0664 in remote_async_serial_handler (scb=<optimized out>, context=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/remote.c:14160
#16 0x0000004000503afc in run_async_handler_and_reschedule (scb=0x4025424390) at /build/gdb-OSO7kB/gdb-10.1/gdb/ser-base.c:137
#17 0x00000040007b37f8 in gdb_wait_for_event (block=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdbsupport/event-loop.cc:673
#18 0x00000040007b3c18 in gdb_wait_for_event (block=1) at /build/gdb-OSO7kB/gdb-10.1/gdbsupport/event-loop.cc:569
#19 gdb_do_one_event () at /build/gdb-OSO7kB/gdb-10.1/gdbsupport/event-loop.cc:215
#20 0x00000040003eb1e0 in start_event_loop () at /build/gdb-OSO7kB/gdb-10.1/gdb/main.c:356
#21 captured_command_loop () at /build/gdb-OSO7kB/gdb-10.1/gdb/main.c:416
#22 0x00000040003efa6c in captured_main (data=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/main.c:1253
#23 gdb_main (args=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/main.c:1268
#24 0x00000040001b6d10 in main (argc=<optimized out>, argv=<optimized out>) at /build/gdb-OSO7kB/gdb-10.1/gdb/gdb.c:32


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
                   ` (6 preceding siblings ...)
  2021-06-26 13:04 ` simark at simark dot ca
@ 2021-06-27  5:27 ` simark at simark dot ca
  2021-06-27 12:24 ` rincebrain at gmail dot com
  8 siblings, 0 replies; 10+ messages in thread
From: simark at simark dot ca @ 2021-06-27  5:27 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

--- Comment #8 from Simon Marchi <simark at simark dot ca> ---
I investigated a little bit why GDB fails to get the exec displacement for your
program, and found that program header validation here fails:

https://gitlab.com/gnutools/gdb/-/blob/f1fa7a3d88561cef54dd5cf9422c29a802af6ce3/gdb/solib-svr4.c#L2625-2628

The program headers read from the core are all zeroes.  The address of the PHDR
in your core is:

    $ eu-readelf -n qemu_gdb_20210625-192829_18450.core | grep PHDR
        PHDR: 0x4000000040
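
For illustration, this PHDR address is consistent with the constant
hard-coded in the hack from comment #7.  The load displacement of a
position-independent executable is the runtime address of something minus its
link-time address.  The sketch below assumes the executable's program headers
sit at link-time address 0x40, right after the 64-byte ELF64 header; that is
typical for a PIE but has not been checked against this binary.  The function
name is hypothetical, and this is not how svr4_exec_displacement itself is
written.

```python
def load_displacement(runtime_addr: int, link_time_addr: int) -> int:
    """Return the load bias: where something ended up minus where the file says it is."""
    return runtime_addr - link_time_addr


AT_PHDR = 0x4000000040     # PHDR address reported by eu-readelf -n above
ASSUMED_PHDR_VADDR = 0x40  # assumption: program headers follow the 64-byte ELF64 header

displacement = load_displacement(AT_PHDR, ASSUMED_PHDR_VADDR)
print(hex(displacement))
```

Under that assumption the result equals the 0x0000004000000000 value forced
into *displacementp by the hack.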

The corresponding LOAD in the core, for that address, is:

    $ eu-readelf -l qemu_gdb_20210625-192829_18450.core
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      LOAD           0x005000 0x0000004000000000 0x0000000000000000 0x000000 0xa04000 R E 0x1000

FileSiz == 0 hints that this section of memory was not dumped in the core.
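
The same check can be scripted.  Below is a rough Python sketch that walks
the program headers of a little-endian ELF64 image and reports PT_LOAD
entries whose FileSiz is zero while MemSiz is not.  The names undumped_loads
and make_image are hypothetical, the image is synthetic (the first entry
copies the LOAD line above, the second is made up), and the sketch ignores
cores with more than 65534 program headers (PN_XNUM).

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian (64 bytes)
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header, little-endian (56 bytes)
PT_LOAD = 1


def undumped_loads(data: bytes):
    """Return (vaddr, memsz) for PT_LOAD segments that have no bytes in the file."""
    fields = EHDR.unpack_from(data, 0)
    ident = fields[0]
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    result = []
    for i in range(e_phnum):
        p_type, _flags, _offset, vaddr, _paddr, filesz, memsz, _align = PHDR.unpack_from(
            data, e_phoff + i * e_phentsize
        )
        if p_type == PT_LOAD and filesz == 0 and memsz > 0:
            result.append((vaddr, memsz))
    return result


def make_image(segments):
    """Build a synthetic header-only ELF64 core from (vaddr, filesz, memsz) tuples."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    # e_type 4 is ET_CORE, e_machine 8 is EM_MIPS; program headers follow the file header.
    ehdr = EHDR.pack(ident, 4, 8, 1, 0, EHDR.size, 0, 0, EHDR.size, PHDR.size,
                     len(segments), 0, 0, 0)
    phdrs = b"".join(
        PHDR.pack(PT_LOAD, 5, 0x5000, vaddr, 0, filesz, memsz, 0x1000)
        for vaddr, filesz, memsz in segments
    )
    return ehdr + phdrs


image = make_image([(0x4000000000, 0, 0xA04000), (0x4000A04000, 0x1000, 0x1000)])
for vaddr, memsz in undumped_loads(image):
    print(f"LOAD at {vaddr:#x} ({memsz:#x} bytes in memory) has no file contents")
```

To try it on a real core, pass the file's bytes (for example from
open(path, "rb").read()) to undumped_loads instead of the synthetic image.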

I then found that bit 4 of /proc/$pid/coredump_filter controls that, as
documented in the core(5) man page.  I tried flipping that bit off, generated a
core, and also got a core with FileSiz == 0.  Is it possible that your system
does not include that information in the core dumps?

Simon


* [Bug threads/28014] gdb coredumps when remote+kgdbing a system that OOMs too hard
  2021-06-25 23:43 [Bug threads/28014] New: gdb coredumps when remote+kgdbing a system that OOMs too hard rincebrain at gmail dot com
                   ` (7 preceding siblings ...)
  2021-06-27  5:27 ` simark at simark dot ca
@ 2021-06-27 12:24 ` rincebrain at gmail dot com
  8 siblings, 0 replies; 10+ messages in thread
From: rincebrain at gmail dot com @ 2021-06-27 12:24 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=28014

--- Comment #9 from Rich <rincebrain at gmail dot com> ---
FWIW, I SIGABRTed an ordinary, no qemu-user while(1) { sleep } program, and:

$ eu-readelf -n core | grep PHDR
    PHDR: 0x5623c369a040
$ eu-readelf -l core
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
[...]
  LOAD           0x002000 0x00005623c369a000 0x0000000000000000 0x001000 0x001000 R   0x1000
[...]

And gdb was perfectly content to read it.

coredump_filter seems to be 0x00000033 on processes by default on my system.
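
To spell out what that mask means, here is a small Python sketch that decodes
a coredump_filter value using the bit assignments listed in the core(5) man
page.  The table and function names are illustrative.

```python
# Bit meanings as documented in core(5).
COREDUMP_FILTER_BITS = {
    0: "anonymous private mappings",
    1: "anonymous shared mappings",
    2: "file-backed private mappings",
    3: "file-backed shared mappings",
    4: "ELF headers",
    5: "private huge pages",
    6: "shared huge pages",
    7: "private DAX pages",
    8: "shared DAX pages",
}


def decode_coredump_filter(value: int):
    """Return the names of the mapping types enabled in a coredump_filter mask."""
    return [name for bit, name in sorted(COREDUMP_FILTER_BITS.items()) if value & (1 << bit)]


for name in decode_coredump_filter(0x33):
    print(name)
```

For 0x33 this lists bits 0, 1, 4 and 5, so bit 4 (ELF headers) is set on a
process with that value.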

