* GDB cannot access memory after Emacs abort [not found] <87r6j6rvn3.fsf@escher.local.home> @ 2007-11-10 23:50 ` Stephen Berman 2007-11-11 6:46 ` Michael Snyder 0 siblings, 1 reply; 32+ messages in thread From: Stephen Berman @ 2007-11-10 23:50 UTC (permalink / raw) To: gdb I recently experienced an abort in CVS Emacs and was unable to get a backtrace from GDB. The Emacs bug causing the abort was fixed, but Richard Stallman responded to the lack of a backtrace with this comment: > That could be a serious problem in GDB. It would be good to talk > with GDB developers about how to investigate it. Since the bug's > cause is known, they could focus on figuring out why GDB fails > to give a backtrace. Here is the GDB-relevant part of my bug report about the abort (Emacs was built using the GTK+ toolkit): > I attempted to debug in gdb; here's the output when the abort occurs: > > (emacs:19177): Gtk-WARNING **: Can't set a parent on widget which has a parent > > > Gtk-ERROR **: file gtkcontainer.c: line 2641 (gtk_container_propagate_expose): assertion failed: (child->parent == GTK_WIDGET (container)) > aborting... > [Switching to Thread 0xb6ec66c0 (LWP 19177)] > > Breakpoint 1, abort () at emacs.c:431 > 431 kill (getpid (), SIGABRT); > > At this point my desktop (I tried in KDE, GNOME and twm, same behavior > in all) is totally locked up, but I can switch to a virtual tty and > there kill emacs with SIGKILL (kill -9); SIGTERM (kill -15) does not do > the job. This releases the desktop, but gdb delivers no further > feedback: > > (gdb) xbacktrace > Cannot access memory at address 0x8321b6c > (gdb) bt > #0 abort () at emacs.c:431 > Cannot access memory at address 0xbfd6836c > Cannot access memory at address 0x8321b6c > > Setting a break point at abort() in emacs.c makes no difference. I > don't have the GTK+ source code so I cannot debug it there. Does anyone have an idea why there was no backtrace? Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-10 23:50 ` GDB cannot access memory after Emacs abort Stephen Berman @ 2007-11-11 6:46 ` Michael Snyder 2007-11-11 7:44 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 32+ messages in thread From: Michael Snyder @ 2007-11-11 6:46 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb Hi Stephen, See questions inline: On Sun, 2007-11-11 at 00:42 +0100, Stephen Berman wrote: > I recently experienced an abort in CVS Emacs and was unable to get a > backtrace from GDB. The Emacs bug causing the abort was fixed, but > Richard Stallman responded to the lack of a backtrace with this comment: > > > That could be a serious problem in GDB. It would be good to talk > > with GDB developers about how to investigate it. Since the bug's > > cause is known, they could focus on figuring out why GDB fails > > to give a backtrace. Out of curiosity, and because RMS seems to think it's relevant -- what is the bug's cause? > Here is the GDB-relevant part of my bug report about the abort (Emacs > was built using the GTK+ toolkit): What's your host architecture? OS? How is gdb configured (host-target tuple)? > > I attempted to debug in gdb; here's the output when the abort occurs: > > > > (emacs:19177): Gtk-WARNING **: Can't set a parent on widget which has a parent > > > > > > Gtk-ERROR **: file gtkcontainer.c: line 2641 (gtk_container_propagate_expose): assertion failed: (child->parent == GTK_WIDGET (container)) > > aborting... > > [Switching to Thread 0xb6ec66c0 (LWP 19177)] > > > > Breakpoint 1, abort () at emacs.c:431 > > 431 kill (getpid (), SIGABRT); > > > > At this point my desktop (I tried in KDE, GNOME and twm, same behavior > > in all) is totally locked up, but I can switch to a virtual tty and > > there kill emacs with SIGKILL (kill -9); SIGTERM (kill -15) does not do > > the job. 
Making sure that I understand -- you ran emacs under gdb, you set a breakpoint at abort, you hit the breakpoint -- and your desktop is locked up? That seems unusual -- do you have any idea of the cause? Is it possible that emacs is in an infinite recursion and has consumed all of virtual memory, or something of the sort? > This releases the desktop, but gdb delivers no further > > feedback: > > > > (gdb) xbacktrace OK, I'm not familiar with that command. "xbacktrace"? Grepping my gdb source tree for "xbacktrace" produces no results. > > Cannot access memory at address 0x8321b6c Is that a valid address for your architecture? > > (gdb) bt > > #0 abort () at emacs.c:431 > > Cannot access memory at address 0xbfd6836c > > Cannot access memory at address 0x8321b6c > > > > Setting a break point at abort() in emacs.c makes no difference. I > > don't have the GTK+ source code so I cannot debug it there. > > Does anyone have an idea why there was no backtrace? > > Steve Berman > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 6:46 ` Michael Snyder @ 2007-11-11 7:44 ` Eli Zaretskii 2007-11-11 23:05 ` Stephen Berman 2007-11-11 19:22 ` Daniel Jacobowitz 2007-11-11 23:01 ` Stephen Berman 2 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2007-11-11 7:44 UTC (permalink / raw)
To: Michael Snyder; +Cc: Stephen.Berman, gdb

> From: Michael Snyder <msnyder@specifix.com>
> Cc: gdb@sources.redhat.com
> Date: Sat, 10 Nov 2007 22:38:14 -0800
>
> Making sure that I understand -- you ran emacs under gdb,
> you set a breakpoint at abort, you hit the breakpoint --

The .gdbinit file in the Emacs src directory always sets a breakpoint at `abort', because Emacs calls `abort' when it encounters a ``can't happen'' situation.

> > > (gdb) xbacktrace
>
> OK, I'm not familiar with that command. "xbacktrace"?

It's defined in Emacs's src/.gdbinit. I reproduce it below, in case you are interested, but the upshot of all this is that `bt' doesn't work, as shown below:

> > > (gdb) bt
> > > #0 abort () at emacs.c:431
> > > Cannot access memory at address 0xbfd6836c
> > > Cannot access memory at address 0x8321b6c

Stack overflow, maybe?

    define xbacktrace
      set $bt = backtrace_list
      while $bt
        xgettype (*$bt->function)
        if $type == Lisp_Symbol
          xprintsym (*$bt->function)
          printf " (0x%x)\n", *$bt->args
        else
          printf "0x%x ", *$bt->function
          if $type == Lisp_Vectorlike
            xgetptr (*$bt->function)
            set $size = ((struct Lisp_Vector *) $ptr)->size
            output ($size & PVEC_FLAG) ? (enum pvec_type) ($size & PVEC_TYPE_MASK) : $size & ~gdb_array_mark_flag
          else
            printf "Lisp type %d", $type
          end
          echo \n
        end
        set $bt = $bt->next
      end
    end
    document xbacktrace
      Print a backtrace of Lisp function calls from backtrace_list.
      Set a breakpoint at Fsignal and call this to see from where
      an error was signaled.
    end

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 7:44 ` Eli Zaretskii @ 2007-11-11 23:05 ` Stephen Berman 2007-11-12 4:18 ` Eli Zaretskii 2007-11-12 5:24 ` Michael Snyder 0 siblings, 2 replies; 32+ messages in thread From: Stephen Berman @ 2007-11-11 23:05 UTC (permalink / raw) To: gdb On Sun, 11 Nov 2007 09:44:23 +0200 Eli Zaretskii <eliz@gnu.org> wrote: > the upshot of all this is that `bt' doesn't > work, as shown below: > >> > > (gdb) bt >> > > #0 abort () at emacs.c:431 >> > > Cannot access memory at address 0xbfd6836c >> > > Cannot access memory at address 0x8321b6c > > Stack overflow, maybe? Due to an infinite loop in Emacs? (I don't know if the bug I reported caused this, maybe Jan D. can answer that.) But as I mentioned in my other followup, I've never experienced an infinite loop in Emacs that locked up X. If it was due to a stack overflow, does that mean GDB is above suspicion in this case? Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 23:05 ` Stephen Berman @ 2007-11-12 4:18 ` Eli Zaretskii 2007-11-12 5:24 ` Michael Snyder 1 sibling, 0 replies; 32+ messages in thread From: Eli Zaretskii @ 2007-11-12 4:18 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb > From: Stephen Berman <Stephen.Berman@gmx.net> > Date: Mon, 12 Nov 2007 00:01:14 +0100 > > >> > > (gdb) bt > >> > > #0 abort () at emacs.c:431 > >> > > Cannot access memory at address 0xbfd6836c > >> > > Cannot access memory at address 0x8321b6c > > > > Stack overflow, maybe? > > Due to an infinite loop in Emacs? Due to infinite recursion, or some complex regexp, who knows? Perhaps the Linux folks here can suggest how to check for a stack overflow: it has to do with comparing the value of $sp with some platform-dependent symbol that holds the top of the stack. > If it was due to a stack overflow, does that mean GDB is above > suspicion in this case? Yes, if we prove that it's a stack overflow. ^ permalink raw reply [flat|nested] 32+ messages in thread
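One way to act on the stack-overflow check suggested above, on Linux, is to compare the stack pointer against the `[stack]` line of `/proc/<pid>/maps` rather than against a platform-dependent symbol. This is a swapped-in technique, not something proposed in the thread, and it only works while the process still exists. The sketch below is a minimal illustration under those assumptions: the function names are hypothetical, and the address ranges in `SAMPLE_MAPS` are invented for the example (they are not taken from the reporter's Emacs process).

```python
# Hypothetical sample in the format of /proc/<pid>/maps on 32-bit x86 Linux.
# The ranges are made up for illustration only.
SAMPLE_MAPS = """\
08048000-0824c000 r-xp 00000000 08:01 123456 /usr/bin/emacs
0824c000-08700000 rw-p 00204000 08:01 123456 /usr/bin/emacs
bfd55000-bfd6a000 rw-p 00000000 00:00 0 [stack]
"""


def parse_maps(maps_text):
    """Yield (start, end, label) for each mapping line in maps-style text."""
    for line in maps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start_s, end_s = fields[0].split("-")
        # The sixth field is the pathname or a tag such as [stack];
        # anonymous mappings have no sixth field.
        label = fields[5] if len(fields) > 5 else ""
        yield int(start_s, 16), int(end_s, 16), label


def find_mapping(maps_text, addr):
    """Return the label of the mapping containing addr, or None if unmapped."""
    for start, end, label in parse_maps(maps_text):
        if start <= addr < end:
            return label
    return None


def sp_in_stack(maps_text, sp):
    """True if the stack pointer lies inside the [stack] mapping."""
    return find_mapping(maps_text, sp) == "[stack]"
```

With a live process, the text would come from reading `/proc/<pid>/maps` and the stack pointer from `p $sp` at the (gdb) prompt. A stack pointer that falls below the `[stack]` range would be consistent with an overflow; one inside it would argue against that explanation.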
* Re: GDB cannot access memory after Emacs abort 2007-11-11 23:05 ` Stephen Berman 2007-11-12 4:18 ` Eli Zaretskii @ 2007-11-12 5:24 ` Michael Snyder 2007-11-13 22:40 ` Stephen Berman 1 sibling, 1 reply; 32+ messages in thread
From: Michael Snyder @ 2007-11-12 5:24 UTC (permalink / raw)
To: Stephen Berman; +Cc: gdb

On Mon, 2007-11-12 at 00:01 +0100, Stephen Berman wrote:
> On Sun, 11 Nov 2007 09:44:23 +0200 Eli Zaretskii <eliz@gnu.org> wrote:
>
> > the upshot of all this is that `bt' doesn't
> > work, as shown below:
> >
> >> > > (gdb) bt
> >> > > #0 abort () at emacs.c:431
> >> > > Cannot access memory at address 0xbfd6836c
> >> > > Cannot access memory at address 0x8321b6c
> >
> > Stack overflow, maybe?
>
> Due to an infinite loop in Emacs? (I don't know if the bug I reported
> caused this, maybe Jan D. can answer that.) But as I mentioned in my
> other followup, I've never experienced an infinite loop in Emacs that
> locked up X. If it was due to a stack overflow, does that mean GDB is
> above suspicion in this case?

No, it just means we don't yet have enough information to diagnose it. Stack overflow could potentially produce a state that was too corrupt for gdb to decipher, but we don't know yet if that's the case.

Now that I know that the target is x86-linux, I can reasonably speculate that 0xbfd6836c looks like a stack address, and 0x8321b6c looks like a code or data address. But so far those are only guesses, and it isn't yet clear why gdb would be unable to access memory at those addresses.

I wonder -- after the above happens, what do you get if you type the following at the (gdb) prompt:

    x /i $eip

If you get the same error (Cannot access memory at ...), then perhaps gdb has lost contact with the child process entirely, and cannot access *any* memory. If not, then some child memory is accessible and some is not (which is not entirely surprising) -- and the question becomes, why is gdb trying to read from memory that is not accessible?
^ permalink raw reply [flat|nested] 32+ messages in thread
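The "lost contact with the child process" case can also be checked from outside gdb. The following is a minimal sketch, assuming a Linux `/proc` filesystem: it reads the one-letter state from `/proc/<pid>/stat` and treats a missing entry, a zombie (`Z`), or a dead process (`X`) as having no address space left for a debugger to read. The function names are hypothetical, and the mapping from process state to "gdb cannot read any memory" is a simplification, not a description of gdb's internals.

```python
def parse_state(stat_text):
    """Extract the one-letter process state from /proc/<pid>/stat text."""
    # The command name is parenthesised and may itself contain spaces or ')',
    # so find the last ')' and take the field that follows it.
    return stat_text[stat_text.rindex(")") + 2]


def has_address_space(state):
    """False for a vanished, zombie, or dead process; True otherwise."""
    return state is not None and state not in ("Z", "X")


def read_state(pid):
    """Return the state letter for pid, or None if the pid no longer exists."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            return parse_state(f.read())
    except (FileNotFoundError, ProcessLookupError):
        return None
```

Running `has_address_space(read_state(pid))` from another terminal before typing `x /i $eip` would distinguish a stopped but intact process from one that has already been killed, in which case every memory read is expected to fail.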
* Re: GDB cannot access memory after Emacs abort 2007-11-12 5:24 ` Michael Snyder @ 2007-11-13 22:40 ` Stephen Berman 2007-11-13 23:20 ` Michael Snyder 2007-11-13 23:57 ` Andreas Schwab 0 siblings, 2 replies; 32+ messages in thread From: Stephen Berman @ 2007-11-13 22:40 UTC (permalink / raw) To: gdb On Sun, 11 Nov 2007 21:15:55 -0800 Michael Snyder <msnyder@specifix.com> wrote: > On Mon, 2007-11-12 at 00:01 +0100, Stephen Berman wrote: >> On Sun, 11 Nov 2007 09:44:23 +0200 Eli Zaretskii <eliz@gnu.org> wrote: >> >> > the upshot of all this is that `bt' doesn't >> > work, as shown below: >> > >> >> > > (gdb) bt >> >> > > #0 abort () at emacs.c:431 >> >> > > Cannot access memory at address 0xbfd6836c >> >> > > Cannot access memory at address 0x8321b6c >> > [...] > I wonder -- after the above happens, what do you get if you > type the following at the (gdb) prompt: > > x /i $eip After the abort occurs, the desktop locks up, I switch to a virtual tty and kill -9 the emacs process, releasing the desktop, then type what you said at the gdb prompt and get this: 0x80f9e56 <abort+6>: Cannot access memory at address 0x80f9e56 In this case, bt returned this: Cannot access memory at address 0xbf855e4c Cannot access memory at address 0x8322c0c > If you get the same error (Cannot access memory at ...), > then perhaps gdb has lost contact with the child process > entirely, and cannot access *any* memory. Yes, this is also what Jim Blandy surmised. But, as I ask in my response to Blandy, why does the desktop lock up only happen when the emacs abort is induced while running under gdb? Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-13 22:40 ` Stephen Berman @ 2007-11-13 23:20 ` Michael Snyder 2007-11-13 23:28 ` Dan Nicolaescu 2007-11-13 23:57 ` Andreas Schwab 1 sibling, 1 reply; 32+ messages in thread From: Michael Snyder @ 2007-11-13 23:20 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb On Tue, 2007-11-13 at 23:28 +0100, Stephen Berman wrote: > On Sun, 11 Nov 2007 21:15:55 -0800 Michael Snyder <msnyder@specifix.com> wrote: > > > On Mon, 2007-11-12 at 00:01 +0100, Stephen Berman wrote: > >> On Sun, 11 Nov 2007 09:44:23 +0200 Eli Zaretskii <eliz@gnu.org> wrote: > >> > >> > the upshot of all this is that `bt' doesn't > >> > work, as shown below: > >> > > >> >> > > (gdb) bt > >> >> > > #0 abort () at emacs.c:431 > >> >> > > Cannot access memory at address 0xbfd6836c > >> >> > > Cannot access memory at address 0x8321b6c > >> > > [...] > > I wonder -- after the above happens, what do you get if you > > type the following at the (gdb) prompt: > > > > x /i $eip > > After the abort occurs, the desktop locks up, I switch to a virtual tty > and kill -9 the emacs process, releasing the desktop, then type what you > said at the gdb prompt and get this: > > 0x80f9e56 <abort+6>: Cannot access memory at address 0x80f9e56 Oh yes. I understand that now, thanks. What we need, I guess, is to get back into control of the gdb without killing the emacs. Otherwise it is kind of hard to debug this gdb problem further. > > If you get the same error (Cannot access memory at ...), > > then perhaps gdb has lost contact with the child process > > entirely, and cannot access *any* memory. > > Yes, this is also what Jim Blandy surmised. Well, as I now understand, that's because you killed the emacs. > But, as I ask in my > response to Blandy, why does the desktop lock up only happen when the > emacs abort is induced while running under gdb? I believe this is now well understood, and not a gdb problem. 
In a nutshell, your emacs process has a lock on a shared resource (the X keyboard-and-mouse "focus"). It is intended to keep that lock only briefly, but while in possession of the lock, it aborts. Normally an abort would result in the freeing of the lock, but since gdb stops the emacs process from exiting, the lock is not freed, resulting in a deadlock when some other process (e.g. xterm) needs the lock.

This is a problem, but a normal and predictable one. GDB cannot tell when a debugged process is in possession of a lock that will cause other processes to deadlock, and it has no way of freeing such locks. This could happen with any shared, lockable resource.

^ permalink raw reply [flat|nested] 32+ messages in thread
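The deadlock described above can be modelled in a few lines. This is a toy simulation only: `FakeDisplayServer` and its methods are hypothetical names, a `threading.Lock` stands in for the X grab, and nothing here talks to a real X server or to gdb. The point it illustrates is that the lock is released by cleanup on process exit, so a process frozen at a breakpoint keeps it indefinitely.

```python
import threading


class FakeDisplayServer:
    """Toy stand-in for the X server: one grab, held by at most one client."""

    def __init__(self):
        self._grab = threading.Lock()
        self.owner = None

    def grab(self, client, timeout=0.1):
        """Try to take the grab; give up after `timeout` seconds."""
        if self._grab.acquire(timeout=timeout):
            self.owner = client
            return True
        return False

    def client_exited(self, client):
        """Model the server cleaning up after a client whose process is gone."""
        if self.owner == client:
            self.owner = None
            self._grab.release()


def demo():
    server = FakeDisplayServer()
    results = []
    # emacs takes the grab, intending to hold it only briefly.
    results.append(server.grab("emacs"))
    # emacs hits the breakpoint at abort: it is stopped, not dead,
    # so no cleanup runs and the grab stays held. xterm times out.
    results.append(server.grab("xterm"))
    # kill -9 from a virtual tty: the process is really gone now.
    server.client_exited("emacs")
    results.append(server.grab("xterm"))
    return results
```

Calling `demo()` returns `[True, False, True]`: the second grab fails while the stopped process still owns the resource and succeeds only after the owner is gone, which mirrors the desktop unfreezing after `kill -9`.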
* Re: GDB cannot access memory after Emacs abort 2007-11-13 23:20 ` Michael Snyder @ 2007-11-13 23:28 ` Dan Nicolaescu 2007-11-14 10:00 ` Stephen Berman 0 siblings, 1 reply; 32+ messages in thread
From: Dan Nicolaescu @ 2007-11-13 23:28 UTC (permalink / raw)
To: Michael Snyder; +Cc: Stephen Berman, gdb

Michael Snyder <msnyder@specifix.com> writes:

> On Tue, 2007-11-13 at 23:28 +0100, Stephen Berman wrote:
> > On Sun, 11 Nov 2007 21:15:55 -0800 Michael Snyder <msnyder@specifix.com> wrote:
> >
> > > On Mon, 2007-11-12 at 00:01 +0100, Stephen Berman wrote:
> > >> On Sun, 11 Nov 2007 09:44:23 +0200 Eli Zaretskii <eliz@gnu.org> wrote:
> > >>
> > >> > the upshot of all this is that `bt' doesn't
> > >> > work, as shown below:
> > >> >
> > >> >> > > (gdb) bt
> > >> >> > > #0 abort () at emacs.c:431
> > >> >> > > Cannot access memory at address 0xbfd6836c
> > >> >> > > Cannot access memory at address 0x8321b6c
> > >> >
> > [...]
> > > I wonder -- after the above happens, what do you get if you
> > > type the following at the (gdb) prompt:
> > >
> > > x /i $eip
> >
> > After the abort occurs, the desktop locks up, I switch to a virtual tty
> > and kill -9 the emacs process, releasing the desktop, then type what you
> > said at the gdb prompt and get this:
> >
> > 0x80f9e56 <abort+6>: Cannot access memory at address 0x80f9e56
>
> Oh yes. I understand that now, thanks.
>
> What we need, I guess, is to get back into control of
> the gdb without killing the emacs. Otherwise it is
> kind of hard to debug this gdb problem further.

You can do that by using the power of emacs :-)

Run emacs from CVS
M-x server-start RET
M-x gdb
from this gdb start another emacs session and do whatever you need to
induce the crash (just make sure that in the second instance of emacs
you don't run `server-start')

switch to a console and run
emacsclient -t

this should connect to the first emacs instance and give you access to
gdb, you can run all the gdb commands there ...

Hope this helps.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-13 23:28 ` Dan Nicolaescu @ 2007-11-14 10:00 ` Stephen Berman 0 siblings, 0 replies; 32+ messages in thread From: Stephen Berman @ 2007-11-14 10:00 UTC (permalink / raw) To: gdb On Tue, 13 Nov 2007 15:26:33 -0800 Dan Nicolaescu <dann@ics.uci.edu> wrote: > Michael Snyder <msnyder@specifix.com> writes: [...] > > What we need, I guess, is to get back into control of > > the gdb without killing the emacs. Otherwise it is > > kind of hard to debug this gdb problem further. > > You can do that by using the power of emacs :-) > > Run emacs from CVS > M-x server-start RET > M-x gdb > from this gdb start another emacs session and do whatever you need to > induce the crash (just make sure that in the second instance of emacs > you don't run `server-start') > > switch to a console and run > emacsclient -t > > this should connect to the first emacs instance and give you access to > gdb, you can run all the gdb commands there ... > > Hope this helps. Indeed it does, thanks. (I actually combined your suggestion to use Emacs's multi-tty capability with Michael Snyder's to attach the emacs process to gdb: I did the latter from within the Emacs client. This had the advantage over running gdb directly from the shell that I could copy and paste the backtrace from the shell buffer to my followup buffer in Gnus.) (Before reading your post I had tried sort of the inverse strategy: I started emacs under gdb from a virtual console, switched to X, invoked emacsclient -c from an X terminal and induced the abort. However, when I switched back to the virtual console the gdb process was dead. In hindsight this was obviously a flawed strategy, but I'm still learning how to use the multi-tty capability. When will the doc be ready? ;-) ) Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-13 22:40 ` Stephen Berman 2007-11-13 23:20 ` Michael Snyder @ 2007-11-13 23:57 ` Andreas Schwab 1 sibling, 0 replies; 32+ messages in thread From: Andreas Schwab @ 2007-11-13 23:57 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb Stephen Berman <Stephen.Berman@gmx.net> writes: > After the abort occurs, the desktop locks up, I switch to a virtual tty > and kill -9 the emacs process, releasing the desktop, then type what you > said at the gdb prompt and get this: > > 0x80f9e56 <abort+6>: Cannot access memory at address 0x80f9e56 Of course, after forcefully killing the process it does not exist any more, so there is nothing the debugger can inspect. I'd suggest running the debugger inside screen, then you can attach to the screen instance from any other tty and inspect the still running emacs process. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 6:46 ` Michael Snyder 2007-11-11 7:44 ` Eli Zaretskii @ 2007-11-11 19:22 ` Daniel Jacobowitz 2007-11-11 23:10 ` Stephen Berman 2007-11-12 7:39 ` Vladimir Prus 2007-11-11 23:01 ` Stephen Berman 2 siblings, 2 replies; 32+ messages in thread From: Daniel Jacobowitz @ 2007-11-11 19:22 UTC (permalink / raw) To: Michael Snyder; +Cc: Stephen Berman, gdb On Sat, Nov 10, 2007 at 10:38:14PM -0800, Michael Snyder wrote: > > > At this point my desktop (I tried in KDE, GNOME and twm, same behavior > > > in all) is totally locked up, but I can switch to a virtual tty and > > > there kill emacs with SIGKILL (kill -9); SIGTERM (kill -15) does not do > > > the job. > > Making sure that I understand -- you ran emacs under gdb, > you set a breakpoint at abort, you hit the breakpoint -- > and your desktop is locked up? > > That seems unusual -- do you have any idea of the cause? This is pretty common when debugging X programs, IIRC. I believe there's some ways in which an application can "own" a display while something is in progress. That's just from observation, I don't know much about X programming. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 19:22 ` Daniel Jacobowitz @ 2007-11-11 23:10 ` Stephen Berman 2007-11-12 0:39 ` Daniel Jacobowitz 2007-11-12 17:47 ` Jim Blandy 2007-11-12 7:39 ` Vladimir Prus 1 sibling, 2 replies; 32+ messages in thread From: Stephen Berman @ 2007-11-11 23:10 UTC (permalink / raw) To: gdb On Sun, 11 Nov 2007 14:22:37 -0500 Daniel Jacobowitz <drow@false.org> wrote: > On Sat, Nov 10, 2007 at 10:38:14PM -0800, Michael Snyder wrote: >> > > At this point my desktop (I tried in KDE, GNOME and twm, same behavior >> > > in all) is totally locked up, but I can switch to a virtual tty and >> > > there kill emacs with SIGKILL (kill -9); SIGTERM (kill -15) does not do >> > > the job. >> >> Making sure that I understand -- you ran emacs under gdb, >> you set a breakpoint at abort, you hit the breakpoint -- >> and your desktop is locked up? >> >> That seems unusual -- do you have any idea of the cause? > > This is pretty common when debugging X programs, IIRC. I believe > there's some ways in which an application can "own" a display while > something is in progress. That's interesting; do you have any pointers to further information about this? Yet, as I mentioned in my other followups, this has never happened to me before when running Emacs under gdb, even when it's in an infinite loop. It sounds like you, too, don't suspect a bug in GDB that prevented getting a backtrace. Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 23:10 ` Stephen Berman @ 2007-11-12 0:39 ` Daniel Jacobowitz 2007-11-12 17:47 ` Jim Blandy 1 sibling, 0 replies; 32+ messages in thread From: Daniel Jacobowitz @ 2007-11-12 0:39 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb On Mon, Nov 12, 2007 at 12:01:33AM +0100, Stephen Berman wrote: > That's interesting; do you have any pointers to further information > about this? Nope, sorry. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 23:10 ` Stephen Berman 2007-11-12 0:39 ` Daniel Jacobowitz @ 2007-11-12 17:47 ` Jim Blandy 2007-11-12 19:44 ` Andreas Schwab 2007-11-13 22:34 ` Stephen Berman 1 sibling, 2 replies; 32+ messages in thread From: Jim Blandy @ 2007-11-12 17:47 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb Stephen Berman <Stephen.Berman at gmx.net> writes: > On Sun, 11 Nov 2007 14:22:37 -0500 Daniel Jacobowitz <drow@false.org> wrote: > >> On Sat, Nov 10, 2007 at 10:38:14PM -0800, Michael Snyder wrote: >>> > > At this point my desktop (I tried in KDE, GNOME and twm, same behavior >>> > > in all) is totally locked up, but I can switch to a virtual tty and >>> > > there kill emacs with SIGKILL (kill -9); SIGTERM (kill -15) does not do >>> > > the job. >>> >>> Making sure that I understand -- you ran emacs under gdb, >>> you set a breakpoint at abort, you hit the breakpoint -- >>> and your desktop is locked up? >>> >>> That seems unusual -- do you have any idea of the cause? >> >> This is pretty common when debugging X programs, IIRC. I believe >> there's some ways in which an application can "own" a display while >> something is in progress. > > That's interesting; do you have any pointers to further information > about this? Yet, as I mentioned in my other followups, this has never > happened to me before when running Emacs under gdb, even when it's in an > infinite loop. It sounds like you, too, don't suspect a bug in GDB that > prevented getting a backtrace. Actually, these are legit X Windows behavior; they're called 'server grabs'. They're supposed to be rare (for obvious reasons), but if Emacs died while it had the server grabbed, you'd certainly not be able to interact with the debugger in another window. 
Am I correct in understanding that: - your X session locks up, and all your windows are unresponsive, not just GDB's and Emacs - you kill Emacs via some other means, which unfreezes your X session - now you can interact with GDB again, but GDB can't get the backtrace. GDB produces backtraces by reading memory from the process. So if my sequence above is correct, once you have killed the Emacs process in another window, then it's expected that GDB won't be able to get its backtrace. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-12 17:47 ` Jim Blandy @ 2007-11-12 19:44 ` Andreas Schwab 2007-11-13 22:36 ` Stephen Berman 2007-11-13 22:34 ` Stephen Berman 1 sibling, 1 reply; 32+ messages in thread From: Andreas Schwab @ 2007-11-12 19:44 UTC (permalink / raw) To: Jim Blandy; +Cc: Stephen Berman, gdb Jim Blandy <jimb@codesourcery.com> writes: > Actually, these are legit X Windows behavior; they're called 'server > grabs'. They're supposed to be rare (for obvious reasons), but if > Emacs died while it had the server grabbed, you'd certainly not be > able to interact with the debugger in another window. You can ungrab the server by typing XF86_Ungrab (assigned to C-A-kp-/ by default). Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-12 19:44 ` Andreas Schwab @ 2007-11-13 22:36 ` Stephen Berman 0 siblings, 0 replies; 32+ messages in thread From: Stephen Berman @ 2007-11-13 22:36 UTC (permalink / raw) To: gdb On Mon, 12 Nov 2007 20:44:29 +0100 Andreas Schwab <schwab@suse.de> wrote: > Jim Blandy <jimb@codesourcery.com> writes: > >> Actually, these are legit X Windows behavior; they're called 'server >> grabs'. They're supposed to be rare (for obvious reasons), but if >> Emacs died while it had the server grabbed, you'd certainly not be >> able to interact with the debugger in another window. > > You can ungrab the server by typing XF86_Ungrab (assigned to C-A-kp-/ by > default). I tried this after the desktop lock-up, but it did not release the desktop. I also tried XF86_ClearGrab (C-A-kp-*) but this also had no effect. Before these attempts I had confirmed with xev that those were the correct bindings for these commands. Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-12 17:47 ` Jim Blandy 2007-11-12 19:44 ` Andreas Schwab @ 2007-11-13 22:34 ` Stephen Berman 2007-11-13 23:14 ` Michael Snyder 2007-11-13 23:51 ` Andreas Schwab 1 sibling, 2 replies; 32+ messages in thread From: Stephen Berman @ 2007-11-13 22:34 UTC (permalink / raw) To: gdb On Mon, 12 Nov 2007 09:46:43 -0800 Jim Blandy <jimb@codesourcery.com> wrote: > Stephen Berman <Stephen.Berman at gmx.net> writes: >> On Sun, 11 Nov 2007 14:22:37 -0500 Daniel Jacobowitz <drow@false.org> wrote: >> >>> On Sat, Nov 10, 2007 at 10:38:14PM -0800, Michael Snyder wrote: >>>> > > At this point my desktop (I tried in KDE, GNOME and twm, same behavior >>>> > > in all) is totally locked up, but I can switch to a virtual tty and >>>> > > there kill emacs with SIGKILL (kill -9); SIGTERM (kill -15) does not do >>>> > > the job. >>>> >>>> Making sure that I understand -- you ran emacs under gdb, >>>> you set a breakpoint at abort, you hit the breakpoint -- >>>> and your desktop is locked up? >>>> >>>> That seems unusual -- do you have any idea of the cause? >>> >>> This is pretty common when debugging X programs, IIRC. I believe >>> there's some ways in which an application can "own" a display while >>> something is in progress. >> >> That's interesting; do you have any pointers to further information >> about this? Yet, as I mentioned in my other followups, this has never >> happened to me before when running Emacs under gdb, even when it's in an >> infinite loop. It sounds like you, too, don't suspect a bug in GDB that >> prevented getting a backtrace. > > Actually, these are legit X Windows behavior; they're called 'server > grabs'. They're supposed to be rare (for obvious reasons), but if > Emacs died while it had the server grabbed, you'd certainly not be > able to interact with the debugger in another window. 
Note, however, that this only happens when running emacs under gdb; if I start emacs directly from the shell, induce the abort, then the Emacs window vanishes, but the desktop remains responsive and no other problems are apparent. > Am I correct in understanding that: > - your X session locks up, and all your windows are unresponsive, not > just GDB's and Emacs Yes, all open X apps are unresponsive to keyboard or mouse input, but the apps continue to operate normally, e.g., the newsticker scrolls, the displays in gkrellm (time, CPU activity, free memory, etc.) continue to be updated. > - you kill Emacs via some other means, which unfreezes your X session Yes, by switching to a virtual tty. > - now you can interact with GDB again, but GDB can't get the backtrace. Yes. > GDB produces backtraces by reading memory from the process. So if my > sequence above is correct, once you have killed the Emacs process in > another window, then it's expected that GDB won't be able to get its > backtrace. So again the (or a) question is, why does this only happen when running emacs under gdb? Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-13 22:34 ` Stephen Berman @ 2007-11-13 23:14 ` Michael Snyder 2007-11-14 9:48 ` Stephen Berman 2007-11-13 23:51 ` Andreas Schwab 1 sibling, 1 reply; 32+ messages in thread From: Michael Snyder @ 2007-11-13 23:14 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb On Tue, 2007-11-13 at 23:28 +0100, Stephen Berman wrote: > On Mon, 12 Nov 2007 09:46:43 -0800 Jim Blandy <jimb@codesourcery.com> wrote: > > > Stephen Berman <Stephen.Berman at gmx.net> writes: > >> On Sun, 11 Nov 2007 14:22:37 -0500 Daniel Jacobowitz <drow@false.org> wrote: > >> > >>> On Sat, Nov 10, 2007 at 10:38:14PM -0800, Michael Snyder wrote: > >>>> > > At this point my desktop (I tried in KDE, GNOME and twm, same behavior > >>>> > > in all) is totally locked up, but I can switch to a virtual tty and > >>>> > > there kill emacs with SIGKILL (kill -9); SIGTERM (kill -15) does not do > >>>> > > the job. > >>>> > >>>> Making sure that I understand -- you ran emacs under gdb, > >>>> you set a breakpoint at abort, you hit the breakpoint -- > >>>> and your desktop is locked up? > >>>> > >>>> That seems unusual -- do you have any idea of the cause? > >>> > >>> This is pretty common when debugging X programs, IIRC. I believe > >>> there's some ways in which an application can "own" a display while > >>> something is in progress. > >> > >> That's interesting; do you have any pointers to further information > >> about this? Yet, as I mentioned in my other followups, this has never > >> happened to me before when running Emacs under gdb, even when it's in an > >> infinite loop. It sounds like you, too, don't suspect a bug in GDB that > >> prevented getting a backtrace. > > > > Actually, these are legit X Windows behavior; they're called 'server > > grabs'. They're supposed to be rare (for obvious reasons), but if > > Emacs died while it had the server grabbed, you'd certainly not be > > able to interact with the debugger in another window. 
> > Note, however, that this only happens when running emacs under gdb; if I > start emacs directly from the shell, induce the abort, then the Emacs > window vanishes, but the desktop remains responsive and no other > problems are apparent. Well that's understandable. emacs has grabbed the focus, presumably something you are only supposed to do for a brief interval. If emacs aborts, it will let go of the focus when it dies, but if gdb stops it from dying, it can't let go of the focus. This would happen with any resource that a debugged program had a lock on. You can create deadlocks when you debug things that have locks on resources. That's a fact of life. > > Am I correct in understanding that: > > - your X session locks up, and all your windows are unresponsive, not > > just GDB's and Emacs > > Yes, all open X apps are unresponsive to keyboard or mouse input, but > the apps continue to operate normally, e.g., the newsticker scrolls, the > displays in gkrellm (time, CPU activity, free memory, etc.) continue to > be updated. I would prefer if we could set this issue aside. I don't believe it is a gdb issue. The backtrace issue is what we should discuss. > > > - you kill Emacs via some other means, which unfreezes your X session > > Yes, by switching to a virtual tty. > > > - now you can interact with GDB again, but GDB can't get the backtrace. > > Yes. > > > GDB produces backtraces by reading memory from the process. So if my > > sequence above is correct, once you have killed the Emacs process in > > another window, then it's expected that GDB won't be able to get its > > backtrace. > > So again the (or a) question is, why does this only happen when running > emacs under gdb? If by "this" you mean the backtrace failure, at this point I don't feel we have enough information to answer. But I have not yet read your three other emails. ;-) ^ permalink raw reply [flat|nested] 32+ messages in thread
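The general point made above (a stopped process keeps whatever it holds, a dead one gives it back) can be tried without X at all. The following is a minimal Python sketch, assuming a POSIX system with flock(2). An exclusive file lock stands in for the X grab, SIGSTOP stands in for gdb holding the inferior at a breakpoint, and SIGKILL stands in for the kill -9 from the virtual tty. It is only an analogy: nothing here talks to an X server, and every name in it is illustrative.

```python
import fcntl
import os
import signal
import subprocess
import sys
import tempfile

# Child program: take an exclusive lock, announce it, then idle.
CHILD = """
import fcntl, sys, time
f = open(sys.argv[1], "a")
fcntl.flock(f, fcntl.LOCK_EX)
print("locked", flush=True)
time.sleep(60)
"""


def lock_is_free(path):
    """Return True if an exclusive flock on path can be taken right now."""
    with open(path, "a") as f:
        try:
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False
        fcntl.flock(f, fcntl.LOCK_UN)
        return True


def stopped_vs_killed():
    """Check the lock while the holder is stopped, then after it is killed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "resource.lock")
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD, path],
            stdout=subprocess.PIPE,
            text=True,
        )
        try:
            child.stdout.readline()             # wait until the child holds the lock
            os.kill(child.pid, signal.SIGSTOP)  # suspended, but still alive
            free_while_stopped = lock_is_free(path)
        finally:
            child.kill()                        # SIGKILL also works on a stopped process
            child.wait()
            child.stdout.close()
        free_after_kill = lock_is_free(path)
    return free_while_stopped, free_after_kill
```

On Linux, stopped_vs_killed() should return (False, True): the lock stays taken while the holder is merely stopped and becomes available once the holder is gone, which mirrors the desktop staying frozen until emacs is killed.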
* Re: GDB cannot access memory after Emacs abort 2007-11-13 23:14 ` Michael Snyder @ 2007-11-14 9:48 ` Stephen Berman 0 siblings, 0 replies; 32+ messages in thread From: Stephen Berman @ 2007-11-14 9:48 UTC (permalink / raw) To: gdb On Tue, 13 Nov 2007 15:05:36 -0800 Michael Snyder <msnyder@specifix.com> wrote: > On Tue, 2007-11-13 at 23:28 +0100, Stephen Berman wrote: [...] >> Note, however, that this only happens when running emacs under gdb; if I >> start emacs directly from the shell, induce the abort, then the Emacs >> window vanishes, but the desktop remains responsive and no other >> problems are apparent. > > Well that's understandable. emacs has grabbed the focus, presumably > something you are only supposed to do for a brief interval. If emacs > aborts, it will let go of the focus when it dies, but if gdb stops it > from dying, it can't let go of the focus. > > This would happen with any resource that a debugged program > had a lock on. You can create deadlocks when you debug things > that have locks on resources. > > That's a fact of life. Ok, I understand this now. >> > Am I correct in understanding that: >> > - your X session locks up, and all your windows are unresponsive, not >> > just GDB's and Emacs >> >> Yes, all open X apps are unresponsive to keyboard or mouse input, but >> the apps continue to operate normally, e.g., the newsticker scrolls, the >> displays in gkrellm (time, CPU activity, free memory, etc.) continue to >> be updated. > > I would prefer if we could set this issue aside. > I don't believe it is a gdb issue. The backtrace > issue is what we should discuss. See my followup to your suggestion to attach the emacs process to gdb. Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-13 22:34 ` Stephen Berman 2007-11-13 23:14 ` Michael Snyder @ 2007-11-13 23:51 ` Andreas Schwab 1 sibling, 0 replies; 32+ messages in thread From: Andreas Schwab @ 2007-11-13 23:51 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb Stephen Berman <Stephen.Berman@gmx.net> writes: > Note, however, that this only happens when running emacs under gdb; if I > start emacs directly from the shell, induce the abort, then the Emacs > window vanishes, but the desktop remains responsive and no other > problems are apparent. Of course, when an X client disappears any grab held by it is automatically cancelled. > So again the (or a) question is, why does this only happen when running > emacs under gdb? Because the process is _stopped_, not killed. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 19:22 ` Daniel Jacobowitz 2007-11-11 23:10 ` Stephen Berman @ 2007-11-12 7:39 ` Vladimir Prus 2007-11-13 22:36 ` Stephen Berman 1 sibling, 1 reply; 32+ messages in thread From: Vladimir Prus @ 2007-11-12 7:39 UTC (permalink / raw) To: gdb Daniel Jacobowitz wrote: > On Sat, Nov 10, 2007 at 10:38:14PM -0800, Michael Snyder wrote: >> >> Making sure that I understand -- you ran emacs under gdb, >> you set a breakpoint at abort, you hit the breakpoint -- >> and your desktop is locked up? >> >> That seems unusual -- do you have any idea of the cause? > > This is pretty common when debugging X programs, IIRC. I believe > there's some ways in which an application can "own" a display while > something is in progress. I believe it's XGrabPointer, which causes all mouse and/or keyboard events to be delivered to a specific application. If that application is stopped by gdb, it means, in effect, that events are not processed at all. I have no idea if that's the cause, since the emacs diff posted in another email does not have any apparent locking thing, but it's very likely. >> > > At this point my desktop (I tried in KDE, GNOME and twm, same >> > > behavior in all) I believe this is dependent on what toolkit the application being debugged uses, not what other applications are running. - Volodya ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-12 7:39 ` Vladimir Prus @ 2007-11-13 22:36 ` Stephen Berman 2007-11-13 23:24 ` Michael Snyder 0 siblings, 1 reply; 32+ messages in thread From: Stephen Berman @ 2007-11-13 22:36 UTC (permalink / raw) To: gdb On Mon, 12 Nov 2007 10:38:36 +0300 Vladimir Prus <ghost@cs.msu.su> wrote: > Daniel Jacobowitz wrote: > >> On Sat, Nov 10, 2007 at 10:38:14PM -0800, Michael Snyder wrote: >>> >>> Making sure that I understand -- you ran emacs under gdb, >>> you set a breakpoint at abort, you hit the breakpoint -- >>> and your desktop is locked up? >>> >>> That seems unusual -- do you have any idea of the cause? >> >> This is pretty common when debugging X programs, IIRC. I believe >> there's some ways in which an application can "own" a display while >> something is in progress. > > I believe it's XGrabPointer, which causes all mouse and/or keyboard > event to be delivered to specific application. If that application is > stopped by gdb, it means, in effect, that events are not processed at all. > > I have no idea if that's the cause, since the emacs diff posted in another > email does not have any apparent locking thing, but it very likely. But as I pointed out in my followup to Jim Blandy, the lockup only happens when emacs aborts while running under gdb; starting emacs directly from the shell and inducing the abort does not lock up the desktop. Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-13 22:36 ` Stephen Berman @ 2007-11-13 23:24 ` Michael Snyder 2007-11-14 9:50 ` Stephen Berman 0 siblings, 1 reply; 32+ messages in thread From: Michael Snyder @ 2007-11-13 23:24 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb On Tue, 2007-11-13 at 23:29 +0100, Stephen Berman wrote: > On Mon, 12 Nov 2007 10:38:36 +0300 Vladimir Prus <ghost@cs.msu.su> wrote: > > > Daniel Jacobowitz wrote: > > > >> On Sat, Nov 10, 2007 at 10:38:14PM -0800, Michael Snyder wrote: > >>> > >>> Making sure that I understand -- you ran emacs under gdb, > >>> you set a breakpoint at abort, you hit the breakpoint -- > >>> and your desktop is locked up? > >>> > >>> That seems unusual -- do you have any idea of the cause? > >> > >> This is pretty common when debugging X programs, IIRC. I believe > >> there's some ways in which an application can "own" a display while > >> something is in progress. > > > > I believe it's XGrabPointer, which causes all mouse and/or keyboard > > event to be delivered to specific application. If that application is > > stopped by gdb, it means, in effect, that events are not processed at all. > > > > I have no idea if that's the cause, since the emacs diff posted in another > > email does not have any apparent locking thing, but it very likely. > > But as I pointed out in my followup to Jim Blandy, the lockup only > happens when emacs aborts while running under gdb; starting emacs > directly from the shell and inducing the abort does not lock up the > desktop. Hoping you have seen my other replies, I believe we understand why this happens. I do have a suggestion. Run gdb from a non-GUI terminal, eg. a virtual console. For example, launch emacs normally, WITHOUT gdb, from gnome/kde/gtk/X. Then go to a virtual console, use 'ps' to determine the process id of emacs, then start gdb and use the "attach" command to get control of emacs from gdb. Now return to the GUI console, and make emacs crash. 
When you return to the virtual console, you should be able to debug. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-13 23:24 ` Michael Snyder @ 2007-11-14 9:50 ` Stephen Berman 2007-11-14 12:00 ` Michael Snyder 0 siblings, 1 reply; 32+ messages in thread From: Stephen Berman @ 2007-11-14 9:50 UTC (permalink / raw) To: gdb

On Tue, 13 Nov 2007 15:14:58 -0800 Michael Snyder <msnyder@specifix.com> wrote:

> On Tue, 2007-11-13 at 23:29 +0100, Stephen Berman wrote:
[...]
>> But as I pointed out in my followup to Jim Blandy, the lockup only
>> happens when emacs aborts while running under gdb; starting emacs
>> directly from the shell and inducing the abort does not lock up the
>> desktop.
>
> Hoping you have seen my other replies, I believe we understand
> why this happens.
>
> I do have a suggestion.
>
> Run gdb from a non-GUI terminal, eg. a virtual console.
>
> For example, launch emacs normally, WITHOUT gdb, from gnome/kde/gtk/X.
> Then go to a virtual console, use 'ps' to determine the process id of
> emacs, then start gdb and use the "attach" command to get control of
> emacs from gdb.
>
> Now return to the GUI console, and make emacs crash.
>
> When you return to the virtual console, you should be able to debug.

Thanks for this suggestion, it worked. Here's the backtrace:

#0 abort () at emacs.c:431
#1 0xb798526a in g_logv () from /usr/lib/libglib-2.0.so.0
#2 0xb79852a9 in g_log () from /usr/lib/libglib-2.0.so.0
#3 0xb7985320 in g_assert_warning () from /usr/lib/libglib-2.0.so.0
#4 0xb7c7b195 in gtk_container_propagate_expose () from /usr/lib/libgtk-x11-2.0.so.0
#5 0xb7c7b1c1 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#6 0x085c2d00 in ?? ()
#7 0x086c0a08 in ?? ()
#8 0x087c31f0 in ?? ()
#9 0xb7d23b6a in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#10 0xb7f3aff4 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#11 0x085c2d00 in ?? ()
#12 0xbfef82a8 in ?? ()
#13 0xb7cf4b42 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#14 0x086c0a08 in ?? ()
#15 0xbfef82e8 in ?? ()
#16 0xb7c7b1a0 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#17 0xbfef82e8 in ?? ()
#18 0x0000001e in ?? ()
#19 0x40000036 in ?? ()
#20 0xbfef82b8 in ?? ()
#21 0xb7f3aff4 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#22 0x085c2d00 in ?? ()
#23 0x085c2d00 in ?? ()
#24 0xbfef82c8 in ?? ()
#25 0xb7c7bbe7 in gtk_container_forall () from /usr/lib/libgtk-x11-2.0.so.0
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

I don't know if this is useful to you or any other gdb hacker. I don't
have the GTK+ sources installed. Maybe someone who does can reproduce
the abort and get a more informative backtrace. In any case, the fact
that I have gotten a backtrace now evidently absolves GDB of the
suspicion that it had a bug preventing a backtrace :-). And since the
Emacs bug causing the abort was already fixed, and the issue of the
desktop lockup has been explained, I guess we can declare this issue
closed, unless someone thinks the above backtrace is still reason for
concern.

Steve Berman

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-14 9:50 ` Stephen Berman @ 2007-11-14 12:00 ` Michael Snyder 2007-11-14 19:24 ` Stephen Berman 0 siblings, 1 reply; 32+ messages in thread From: Michael Snyder @ 2007-11-14 12:00 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb On Wed, 2007-11-14 at 10:48 +0100, Stephen Berman wrote: > Thanks for this suggestion, it worked. Here's the backtrace: OK, this is great! See below. > #0 abort () at emacs.c:431 > #1 0xb798526a in g_logv () from /usr/lib/libglib-2.0.so.0 > #2 0xb79852a9 in g_log () from /usr/lib/libglib-2.0.so.0 > #3 0xb7985320 in g_assert_warning () from /usr/lib/libglib-2.0.so.0 > #4 0xb7c7b195 in gtk_container_propagate_expose () from /usr/lib/libgtk-x11-2.0.so.0 > #5 0xb7c7b1c1 in ?? () from /usr/lib/libgtk-x11-2.0.so.0 > #6 0x085c2d00 in ?? () > #7 0x086c0a08 in ?? () > #8 0x087c31f0 in ?? () [...] > I don't know if this is useful to you or any other gdb hacker. I don't > have the GTK+ sources installed. Maybe someone who does can reproduce > the abort and get a more informative backtrace. You don't need to have the sources installed, but it appears as if GDB can't find symbols for the shared libraries. Are these libraries installed in an unusual location? Is LD_LIBRARY_PATH set correctly (in the gdb shell)? Is there a location (other than /lib, /usr/lib etc) where you could tell gdb to find the libraries? See the built in help for "set solib-search-path" and "set solib-absolute-prefix". One more thing that might help is if you can install the debuggable versions of these libraries (the ones compiled with -g for debug symbols, and/or the ones that have not been stripped). > In any case, the fact > that I have gotten a backtrace now evidently absolves GDB of the > suspicion that it had a bug preventing a backtrace :-). 
And since the > Emacs bug causing the abort was already fixed, and the issue of the > desktop lockup has been explained, I guess we can declare this issue > closed, unless someone thinks the above backtrace is still reason for > concern. I think we're in accord -- my suggestions were just to help you if you need to debug further (now or in future). Michael ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-14 12:00 ` Michael Snyder @ 2007-11-14 19:24 ` Stephen Berman 2007-11-15 1:00 ` Michael Snyder 0 siblings, 1 reply; 32+ messages in thread From: Stephen Berman @ 2007-11-14 19:24 UTC (permalink / raw) To: gdb

On Wed, 14 Nov 2007 03:50:53 -0800 Michael Snyder <msnyder@specifix.com> wrote:

> On Wed, 2007-11-14 at 10:48 +0100, Stephen Berman wrote:
>
>> Thanks for this suggestion, it worked. Here's the backtrace:
>
> OK, this is great! See below.
>
>> #0 abort () at emacs.c:431
>> #1 0xb798526a in g_logv () from /usr/lib/libglib-2.0.so.0
>> #2 0xb79852a9 in g_log () from /usr/lib/libglib-2.0.so.0
>> #3 0xb7985320 in g_assert_warning () from /usr/lib/libglib-2.0.so.0
>> #4 0xb7c7b195 in gtk_container_propagate_expose () from /usr/lib/libgtk-x11-2.0.so.0
>> #5 0xb7c7b1c1 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
>> #6 0x085c2d00 in ?? ()
>> #7 0x086c0a08 in ?? ()
>> #8 0x087c31f0 in ?? ()
> [...]
>
>> I don't know if this is useful to you or any other gdb hacker. I don't
>> have the GTK+ sources installed. Maybe someone who does can reproduce
>> the abort and get a more informative backtrace.
>
> You don't need to have the sources installed, but
> it appears as if GDB can't find symbols for the shared libraries.

What does it mean that in frame #4 of the backtrace the symbol
gtk_container_propagate_expose from /usr/lib/libgtk-x11-2.0.so.0 is
referenced but starting in the next stack frame there is only ?? with
reference to the same library? That gdb is finding some but not all
symbols? Note that when I attached the emacs process to gdb, it
returned a slew of messages like this:

Reading symbols from /usr/lib/libgtk-x11-2.0.so.0...done.
Loaded symbols for /usr/lib/libgtk-x11-2.0.so.0
Reading symbols from /usr/lib/libgdk-x11-2.0.so.0...done.
Loaded symbols for /usr/lib/libgdk-x11-2.0.so.0
Reading symbols from /usr/lib/libatk-1.0.so.0...done.
Loaded symbols for /usr/lib/libatk-1.0.so.0
Reading symbols from /usr/lib/libgdk_pixbuf-2.0.so.0...done.
...

> Are these libraries installed in an unusual location?

They are all in /usr/lib AFAICT; I didn't install them myself, that's
where the distro I use (openSUSE) put them.

> Is LD_LIBRARY_PATH set correctly (in the gdb shell)?

How do I determine this (show env does not list LD_LIBRARY_PATH)?

> Is there a location (other than /lib, /usr/lib etc)
> where you could tell gdb to find the libraries?

Not that I know of.

Here's an additional datapoint, FWIW: I induced the abort again, and
this time the backtrace was slightly different from the one I posted
before:

#0 abort () at emacs.c:431
#1 0xb790f26a in g_logv () from /usr/lib/libglib-2.0.so.0
#2 0xb790f2a9 in g_log () from /usr/lib/libglib-2.0.so.0
#3 0xb790f320 in g_assert_warning () from /usr/lib/libglib-2.0.so.0
#4 0xb7c05195 in gtk_container_propagate_expose () from /usr/lib/libgtk-x11-2.0.so.0
#5 0xb7c051c1 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#6 0x0833e1b0 in ceil ()
#7 0x086c35c0 in ?? ()
#8 0x087c59f8 in ?? ()
#9 0xb7cadb6a in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#10 0xb7ec4ff4 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#11 0x0833e1b0 in ceil ()
#12 0xbfea1228 in ?? ()
#13 0xb7c7eb42 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#14 0x086c35c0 in ?? ()
#15 0xbfea1268 in ?? ()
#16 0xb7c051a0 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#17 0xbfea1268 in ?? ()
#18 0x0000001e in ?? ()
#19 0x40000036 in ?? ()
#20 0xbfea1238 in ?? ()
#21 0xb7ec4ff4 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#22 0x0833e1b0 in ceil ()
#23 0x0833e1b0 in ceil ()
#24 0xbfea1248 in ?? ()
#25 0xb7c05be7 in gtk_container_forall () from /usr/lib/libgtk-x11-2.0.so.0
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

What is the significance of the stack frames
#6 0x0833e1b0 in ceil ()
and so on (note it's always the same address)?

Thanks again for your helpful comments.

Steve Berman

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-14 19:24 ` Stephen Berman @ 2007-11-15 1:00 ` Michael Snyder 0 siblings, 0 replies; 32+ messages in thread From: Michael Snyder @ 2007-11-15 1:00 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb

On Wed, 2007-11-14 at 20:22 +0100, Stephen Berman wrote:
> On Wed, 14 Nov 2007 03:50:53 -0800 Michael Snyder <msnyder@specifix.com> wrote:
>
> > On Wed, 2007-11-14 at 10:48 +0100, Stephen Berman wrote:
> >
> >> Thanks for this suggestion, it worked. Here's the backtrace:
> >
> > OK, this is great! See below.
> >
> >> #0 abort () at emacs.c:431
> >> #1 0xb798526a in g_logv () from /usr/lib/libglib-2.0.so.0
> >> #2 0xb79852a9 in g_log () from /usr/lib/libglib-2.0.so.0
> >> #3 0xb7985320 in g_assert_warning () from /usr/lib/libglib-2.0.so.0
> >> #4 0xb7c7b195 in gtk_container_propagate_expose () from /usr/lib/libgtk-x11-2.0.so.0
> >> #5 0xb7c7b1c1 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
> >> #6 0x085c2d00 in ?? ()
> >> #7 0x086c0a08 in ?? ()
> >> #8 0x087c31f0 in ?? ()
> > [...]
> >
> >> I don't know if this is useful to you or any other gdb hacker. I don't
> >> have the GTK+ sources installed. Maybe someone who does can reproduce
> >> the abort and get a more informative backtrace.
> >
> > You don't need to have the sources installed, but
> > it appears as if GDB can't find symbols for the shared libraries.
>
> What does it mean that in frame #4 of the backtrace the symbol
> gtk_container_propagate_expose from /usr/lib/libgtk-x11-2.0.so.0 is
> referenced but starting in the next stack frame there is only ?? with
> reference to the same library?

It means that:

For frame #4, gdb found a PC address that fit inside the bounds of
the text section of libgtk-x11-2.0.so, and corresponded with the
function "gtk_container_propagate_expose" from that library.

For frame #5, gdb found a PC address that also fit the address range
of that library, but did not correspond with any function symbol.

For frame #6, gdb found a PC address that did not match with any
known section of an object file (eg. shared library), and could not
be matched against any known function symbol.

> That gdb is finding some but not all
> symbols? Note that when I attached the emacs process to gdb, it
> returned a slew of messages like this:
>
> Reading symbols from /usr/lib/libgtk-x11-2.0.so.0...done.
> Loaded symbols for /usr/lib/libgtk-x11-2.0.so.0
> Reading symbols from /usr/lib/libgdk-x11-2.0.so.0...done.
> Loaded symbols for /usr/lib/libgdk-x11-2.0.so.0
> Reading symbols from /usr/lib/libatk-1.0.so.0...done.
> Loaded symbols for /usr/lib/libatk-1.0.so.0
> Reading symbols from /usr/lib/libgdk_pixbuf-2.0.so.0...done.

OK -- do you know whether these messages accounted for all
known libraries used by emacs? Could there be some libraries
that are missing (therefore gdb does not have symbols for)?

...

> What is the significance of the stack frames
> #6 0x0833e1b0 in ceil ()
> and so on (note it's always the same address)?

It may be garbage. The stack unwind may have become corrupt
by that point (I can't tell). If we assume it is meaningful,
it means that gdb found a PC for the stack frame for which the
nearest corresponding function symbol is "ceil".

^ permalink raw reply [flat|nested] 32+ messages in thread
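The three cases described for frames #4, #5 and #6 can be illustrated with a small Python sketch of address-to-symbol lookup. This is a toy model only, not GDB's actual implementation. The load range of the library and the start addresses and sizes of the two functions are hypothetical values, chosen so that the PC values from the first backtrace fall into each case; the real layout is not known from this thread.

```python
from bisect import bisect_right

# Hypothetical address range for the loaded library (real range unknown).
OBJECTS = [
    ("/usr/lib/libgtk-x11-2.0.so.0", 0xB7C00000, 0xB7F40000),
]

# Hypothetical (start, size, name) entries, sorted by start address.
SYMBOLS = sorted([
    (0xB7C7B0F0, 0xC0, "gtk_container_propagate_expose"),
    (0xB7C7BB80, 0x100, "gtk_container_forall"),
])
STARTS = [start for start, _size, _name in SYMBOLS]


def find_object(pc):
    """Return the object file whose address range contains pc, if any."""
    for name, start, end in OBJECTS:
        if start <= pc < end:
            return name
    return None


def find_function(pc):
    """Return the nearest preceding symbol, provided pc lies within its size."""
    i = bisect_right(STARTS, pc) - 1
    if i < 0:
        return None
    start, size, name = SYMBOLS[i]
    return name if pc < start + size else None


def describe_frame(pc):
    """Format one frame roughly the way the backtrace above prints it."""
    func = find_function(pc) or "??"
    obj = find_object(pc)
    line = f"0x{pc:08x} in {func} ()"
    return line + (f" from {obj}" if obj else "")


for pc in (0xB7C7B195, 0xB7C7B1C1, 0x085C2D00):
    print(describe_frame(pc))
```

With these assumed tables the first address resolves to a function and a library, the second to the library only, and the third to neither. In a lookup of this kind, a symbol with no usable size bound would claim every later address, which is one way an otherwise meaningless value could end up labelled with a name such as ceil; whether that is what happened here cannot be determined from the thread.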
* Re: GDB cannot access memory after Emacs abort 2007-11-11 6:46 ` Michael Snyder 2007-11-11 7:44 ` Eli Zaretskii 2007-11-11 19:22 ` Daniel Jacobowitz @ 2007-11-11 23:01 ` Stephen Berman 2007-11-12 5:15 ` Michael Snyder 2 siblings, 1 reply; 32+ messages in thread From: Stephen Berman @ 2007-11-11 23:01 UTC (permalink / raw) To: gdb On Sat, 10 Nov 2007 22:38:14 -0800 Michael Snyder <msnyder@specifix.com> wrote: > Hi Stephen, > > See questions inline: > > On Sun, 2007-11-11 at 00:42 +0100, Stephen Berman wrote: >> I recently experienced an abort in CVS Emacs and was unable to get a >> backtrace from GDB. The Emacs bug causing the abort was fixed, but >> Richard Stallman responded to the lack of a backtrace with this comment: >> >> > That could be a serious problem in GDB. It would be good to talk >> > with GDB developers about how to investigate it. Since the bug's >> > cause is known, they could focus on figuring out why GDB fails >> > to give a backtrace. > > Out of curiosity, and because RMS seems to think it's > relevant -- what is the bug's cause? It had to do with the handling of named icons in GTK+ tool bars. I don't know the code well enough to say more, but the diff is here: http://cvs.savannah.gnu.org/viewvc/emacs/emacs/src/gtkutil.c?r1=1.120&r2=1.121&pathrev=MAIN >> Here is the GDB-relevant part of my bug report about the abort (Emacs >> was built using the GTK+ toolkit): > > What's your host architecture? OS? How is gdb configured (host-target > tuple)? Linux escher 2.6.22.12-0.1-default i686 athlon i386 GNU/Linux (openSUSE 10.3) GNU gdb 6.6.50.20070726-cvs This GDB was configured as "i586-suse-linux". > Making sure that I understand -- you ran emacs under gdb, > you set a breakpoint at abort, you hit the breakpoint -- > and your desktop is locked up? Yes (as Eli Zaretskii pointed out, Emacs set a breakpoint at abort). > That seems unusual -- do you have any idea of the cause? No! I'm hoping someone here might have some insight. 
> Is it possible that emacs is in an infinite recursion > and has consumed all of virtual memory, or something > of the sort? This has happened on (rare) occasion, but it never locked up the desktop, I could always at least kill -9 the emacs process from within the X window system (in the case under discussion, I was able to switch to a virtual tty and kill -9 the emacs process from there, but X was locked up solid). >> > Cannot access memory at address 0x8321b6c > > Is that a valid address for your architecture? How can I determine that? Anyway, it sounds like you don't suspect a bug in GDB that prevented getting a backtrace, or is that still a possibility? Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-11 23:01 ` Stephen Berman @ 2007-11-12 5:15 ` Michael Snyder 2007-11-14 9:55 ` Stephen Berman 0 siblings, 1 reply; 32+ messages in thread From: Michael Snyder @ 2007-11-12 5:15 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb On Mon, 2007-11-12 at 00:01 +0100, Stephen Berman wrote: > On Sat, 10 Nov 2007 22:38:14 -0800 Michael Snyder <msnyder@specifix.com> wrote: > > > Hi Stephen, > > > > See questions inline: > > > > On Sun, 2007-11-11 at 00:42 +0100, Stephen Berman wrote: > >> I recently experienced an abort in CVS Emacs and was unable to get a > >> backtrace from GDB. The Emacs bug causing the abort was fixed, but > >> Richard Stallman responded to the lack of a backtrace with this comment: > >> > >> > That could be a serious problem in GDB. It would be good to talk > >> > with GDB developers about how to investigate it. Since the bug's > >> > cause is known, they could focus on figuring out why GDB fails > >> > to give a backtrace. > > > > Out of curiosity, and because RMS seems to think it's > > relevant -- what is the bug's cause? > > It had to do with the handling of named icons in GTK+ tool bars. I > don't know the code well enough to say more, but the diff is here: > http://cvs.savannah.gnu.org/viewvc/emacs/emacs/src/gtkutil.c?r1=1.120&r2=1.121&pathrev=MAIN Fair enough -- I think that's out of my depth. ;-) > >> Here is the GDB-relevant part of my bug report about the abort (Emacs > >> was built using the GTK+ toolkit): > > > > What's your host architecture? OS? How is gdb configured (host-target > > tuple)? > > Linux escher 2.6.22.12-0.1-default i686 athlon i386 GNU/Linux (openSUSE > 10.3) > > GNU gdb 6.6.50.20070726-cvs > This GDB was configured as "i586-suse-linux". Thanks, that's all very helpful. > > > Making sure that I understand -- you ran emacs under gdb, > > you set a breakpoint at abort, you hit the breakpoint -- > > and your desktop is locked up? 
> > Yes (as Eli Zaretskii pointed out, Emacs set a breakpoint at abort). > > > That seems unusual -- do you have any idea of the cause? > > No! I'm hoping someone here might have some insight. OK -- just hoping for more info. > > Is it possible that emacs is in an infinite recursion > > and has consumed all of virtual memory, or something > > of the sort? > > This has happened on (rare) occasion, but it never locked up the > desktop, I could always at least kill -9 the emacs process from within > the X window system (in the case under discussion, I was able to switch > to a virtual tty and kill -9 the emacs process from there, but X was > locked up solid). Yeah, that's pretty rare in my experience -- but I'm not a GUI or X hacker. Seems like something that shouldn't happen, unless thru a bug in X. No client app should be able to take control of the entire windowing system and prevent anything else from getting access. Please note, I'm not at all trying to say "this isn't our (gdb's) problem". > >> > Cannot access memory at address 0x8321b6c > > > > Is that a valid address for your architecture? > > How can I determine that? Just thought you might know. Forget about it. > Anyway, it sound like you don't suspect a bug in GDB that prevented > getting a backtrace, or is that still a possibility? Oh, I didn't mean to imply that at all. Just asking for more information. So just to define the problem ... 1) emacs calls abort. This is apparently due to a bug in emacs, for which you already have a patch. 2) While emacs is held at the abort by gdb, your X system is frozen and you can't get any other X window client to work. We don't know the cause of this, but it probably isn't gdb, so let's forget about it in this context. 3) Within gdb, when you're at the breakpoint at abort, backtrace doesn't work. This part is within our domain. Now 3 could well be a bug in gdb, but there are other possibilities. 
Something could have corrupted the stack, so badly that gdb can't unwind it. Personally I don't see how to decide that question, based on the information we have so far. Maybe Daniel and/or Eli might have an idea? To pursue it further, we can go one of two ways: A) maybe you can provide us with enough information and context to reproduce the problem ourselves? This seems unlikely, but maybe, for instance, you know that with a certain released version of emacs and a certain released version of linux, you can give a fixed sequence of commands and reliably reproduce the crash? or B) we can keep asking you for more information, question and answer style. For instance, I'd like to know the output that you get from the following gdb commands when you're at the breakpoint: i) info registers ii) info target iii) x /64x $esp ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-12 5:15 ` Michael Snyder @ 2007-11-14 9:55 ` Stephen Berman 2007-11-14 12:00 ` Michael Snyder 0 siblings, 1 reply; 32+ messages in thread From: Stephen Berman @ 2007-11-14 9:55 UTC (permalink / raw) To: gdb On Sun, 11 Nov 2007 21:06:28 -0800 Michael Snyder <msnyder@specifix.com> wrote: [...] > To pursue it further, we can go one of two ways: > > A) maybe you can provide us with enough information and > context to reproduce the problem ourselves? This seems > unlikely, but maybe, for instance, you know that with > a certain released version of emacs and a certain released > version of linux, you can give a fixed sequence of commands > and reliably reproduce the crash? > > or > > B) we can keep asking you for more information, question > and answer style. > > For instance, I'd like to know the output that you get > from the following gdb commands when you're at the breakpoint: > > i) info registers > ii) info target > iii) x /64x $esp I posted a followup providing the requested information but it hasn't yet appeared on the list, perhaps because it was too long (the output of `info target' was 1815 lines long). Since I've gotten a backtrace in the mean time (see my followup to your suggestion to attach the emacs process to gdb), do you want me to repost or email you this information, or is it no longer relevant? Steve Berman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: GDB cannot access memory after Emacs abort 2007-11-14 9:55 ` Stephen Berman @ 2007-11-14 12:00 ` Michael Snyder 0 siblings, 0 replies; 32+ messages in thread From: Michael Snyder @ 2007-11-14 12:00 UTC (permalink / raw) To: Stephen Berman; +Cc: gdb On Wed, 2007-11-14 at 10:50 +0100, Stephen Berman wrote: > On Sun, 11 Nov 2007 21:06:28 -0800 Michael Snyder <msnyder@specifix.com> wrote: > [...] > > To pursue it further, we can go one of two ways: > > > > A) maybe you can provide us with enough information and > > context to reproduce the problem ourselves? This seems > > unlikely, but maybe, for instance, you know that with > > a certain released version of emacs and a certain released > > version of linux, you can give a fixed sequence of commands > > and reliably reproduce the crash? > > > > or > > > > B) we can keep asking you for more information, question > > and answer style. > > > > For instance, I'd like to know the output that you get > > from the following gdb commands when you're at the breakpoint: > > > > i) info registers > > ii) info target > > iii) x /64x $esp > > I posted a followup providing the requested information but it hasn't > yet appeared on the list, perhaps because it was too long (the output of > `info target' was 1815 lines long). Since I've gotten a backtrace in > the mean time (see my followup to your suggestion to attach the emacs > process to gdb), do you want me to repost or email you this information, > or is it no longer relevant? I think it's moot now. Good luck! ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2007-11-15 1:00 UTC | newest]

Thread overview: 32+ messages
[not found] <87r6j6rvn3.fsf@escher.local.home>
2007-11-10 23:50 ` GDB cannot access memory after Emacs abort Stephen Berman
2007-11-11 6:46 ` Michael Snyder
2007-11-11 7:44 ` Eli Zaretskii
2007-11-11 23:05 ` Stephen Berman
2007-11-12 4:18 ` Eli Zaretskii
2007-11-12 5:24 ` Michael Snyder
2007-11-13 22:40 ` Stephen Berman
2007-11-13 23:20 ` Michael Snyder
2007-11-13 23:28 ` Dan Nicolaescu
2007-11-14 10:00 ` Stephen Berman
2007-11-13 23:57 ` Andreas Schwab
2007-11-11 19:22 ` Daniel Jacobowitz
2007-11-11 23:10 ` Stephen Berman
2007-11-12 0:39 ` Daniel Jacobowitz
2007-11-12 17:47 ` Jim Blandy
2007-11-12 19:44 ` Andreas Schwab
2007-11-13 22:36 ` Stephen Berman
2007-11-13 22:34 ` Stephen Berman
2007-11-13 23:14 ` Michael Snyder
2007-11-14 9:48 ` Stephen Berman
2007-11-13 23:51 ` Andreas Schwab
2007-11-12 7:39 ` Vladimir Prus
2007-11-13 22:36 ` Stephen Berman
2007-11-13 23:24 ` Michael Snyder
2007-11-14 9:50 ` Stephen Berman
2007-11-14 12:00 ` Michael Snyder
2007-11-14 19:24 ` Stephen Berman
2007-11-15 1:00 ` Michael Snyder
2007-11-11 23:01 ` Stephen Berman
2007-11-12 5:15 ` Michael Snyder
2007-11-14 9:55 ` Stephen Berman
2007-11-14 12:00 ` Michael Snyder