* Can't debug x86_64 C++ programs.
@ 2008-09-18 20:53 John Fine
2008-09-18 21:19 ` Keith Seitz
2008-09-19 2:17 ` Keith Seitz
0 siblings, 2 replies; 16+ messages in thread
From: John Fine @ 2008-09-18 20:53 UTC (permalink / raw)
To: insight
Actually X86_64 C++ programs are the only things I've even tried to
debug with Insight. So I don't have any good guess of what I might be
trying that differs from what those who can use Insight are doing.
When I try to use it, Insight has so many bugs, I couldn't begin to list
them here and other bugs interfere with any attempt to understand or
work around any specific bug.
If there is any chance of getting any support, I'd be glad to do
specific tests or give much more detail of the failures.
I really need a GUI debugger for x86_64 C++ programs. I've tried kdbg,
which has many of the same bugs (which implies they are in gdb itself,
not in the GUI), but some of those don't happen when I try gdb alone.
Also kdbg seems to be missing features needed for practical debugging,
features Insight has, if only it worked. If you know of a GUI debugger
that works, please tell me.
Some of the bugs:
The step over buttons (both "Next" and "Next Asm") usually do a step
into. They sometimes step over, so they must be connected to the right
thing, but usually they step into.
The Finish button usually does nothing, sometimes does the step out of
operation that I want and sometimes causes the program being debugged to
seg fault. If I restart the program and set a breakpoint where a
correct step out of would reach and get into the same place as before
and just continue, the breakpoint will be hit, but even with the
breakpoint there a Finish will make the program seg fault.
If I try to view the registers window anytime after pressing Run, the
whole debugger crashes. If I view the register window first, it
appears, then when I press run it populates, then a moment later the
whole debugger crashes.
I normally want to work in SRC+ASM mode. The compiler has often put asm
instructions in a strange order relative to source lines and I'm used to
that and (in Windows debuggers) know how to work around it with the dual
view. Insight keeps changing its mind about which view is on top (opens
with source on top, then changes back and forth for reasons I can't
begin to guess). That is very distracting.
If I set a breakpoint on a source line, it shows on an asm line that is
probably a plausible choice given the debug info the compiler generated
(but often isn't a usable choice for actual debugging). If I set a
breakpoint on an asm line, it only sometimes gets marked on the correct
source line. But I've used objdump to verify that the debug info
generated by the compiler is correct enough that the asm line in
question can be traced to the correct source line. When it goes to a
wrong source line, I'm pretty sure that wrong line is in a totally wrong
file, not just the wrong line of the current file.
If it hits a breakpoint set in the asm view, even if it originally did
mark that correctly in the source view, it almost never opens the right
source view. It seems to pick an hpp from which the compiler inlined
code elsewhere in the same function. I used objdump to see if the debug
info was wrong, and it wasn't. Anyway, the dual view then shows the
correct asm code and incorrect source code. If I step, the source code
stays in the same wrong file, but I think it uses the source line number
that would be correct if it were in the right source file.
I expect someone will tell me I should have completely unoptimized code
when debugging. That usually isn't practical. I know how to deal with
all the strange sequence and inlining effects using dual view in a
working GUI debugger. The failures I've described above happen even in
functions where the optimization did nothing strange.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
2008-09-18 20:53 Can't debug x86_64 C++ programs John Fine
@ 2008-09-18 21:19 ` Keith Seitz
2008-09-18 21:37 ` John Fine
` (2 more replies)
2008-09-19 2:17 ` Keith Seitz
1 sibling, 3 replies; 16+ messages in thread
From: Keith Seitz @ 2008-09-18 21:19 UTC (permalink / raw)
To: John Fine; +Cc: insight
John Fine wrote:
> When I try to use it, Insight has so many bugs, I couldn't begin to list
> them here and other bugs interfere with any attempt to understand or
> work around any specific bug.
Unfortunately, insight is quickly approaching EOL. I have been about the
only person to keep it hobbling along for quite some years, and I am
just about ready to spend my copious free time on some other project.
As you've discovered, there are a ton of problems. Unfortunately, it
takes an experienced eye to recognize the difference between a problem
with the GUI and gdb. Gdb is pretty much the worst debugger on the
planet when it comes to C++. Sad but true.
Red Hat and other companies (and interested parties) are attempting to
make gdb a better C++ debugger. You might want to keep an eye on the
archer project at sourceware.org. It's probably quite a while out,
though, before anything substantially useable is ready.
> If there is any chance of getting any support, I'd be glad to do
> specific tests or give much more detail of the failures.
Isolated test cases are very useful. I don't often use x86_64, but I do
now have access to such a box. For right now, all my comments below
apply to running insight on x86. I will try to verify all of your issues
on x86_64 by the end of the week.
> If you know of a GUI debugger that works, please tell me.
You might try the CDT project in Eclipse or even some of the debuggers
mentioned off of the insight homepage hosted on sourceware.org. I have
no experience with those unfortunately, and they are all based on gdb,
so it is unlikely that C++ debugging is going to be any better than what
you are experiencing now.
> The step over buttons (both "Next" and "Next Asm") usually do a step
> into. They sometimes step over, so they must be connected to the right
> thing, but usually they step into.
That is very odd. I certainly have never experienced this, and I have
actually been using insight almost daily now for several weeks. Can you
give me a test case? By the way, what version of insight are you
attempting to use? [insight -v or "show version" in a console window]
What compiler?
> The Finish button usually does nothing, sometimes does the step out of
> operation that I want and sometimes causes the program being debugged to
> seg fault. If I restart the program and set a breakpoint where a
> correct step out of would reach and get into the same place as before
> and just continue, the breakpoint will be hit, but even with the
> breakpoint there a Finish will make the program seg fault.
I am sorry, but I also have not encountered these problems, and believe
me when I say I've been using "finish" an awful lot of late, debugging
gdb's symbol code with C++. Can you provide me with a test case?
> If I try to view the registers window anytime after pressing Run, the
> whole debugger crashes. If I view the register window first, it
> appears, then when I press run it populates, then a moment later the
> whole debugger crashes.
Once again, I am sorry, but I cannot reproduce this (on x86).
> I normally want to work in SRC+ASM mode. The compiler has often put asm
> instructions in a strange order relative to source lines and I'm used to
> that and (in Windows debuggers) know how to work around it with the dual
> view. Insight keeps changing its mind about which view is on top (opens
> with source on top, then changes back and forth for reasons I can't
> begin to guess). That is very distracting.
Holy moly! Do they really do that? I admit that I seldom use SRC+ASM,
but the way that code is arranged in insight, I would have thought that
it was impossible for the SRC and ASM panes to swap back and forth. I
would definitely like to see a test case for this.
> If I set a breakpoint on a source line, it shows on an asm line that is
> probably a plausible choice given the debug info the compiler generated
> (but often isn't a usable choice for actual debugging). If I set a
> breakpoint on an asm line, it only sometimes gets marked on the correct
> source line. But I've used objdump to verify that the debug info
> generated by the compiler is correct enough that the asm line in
> question can be traced to the correct source line. When it goes to a
> wrong source line, I'm pretty sure that wrong line is in a totally wrong
> file, not just the wrong line of the current file.
Insight does not do anything with breakpoints other than display them.
This is almost certainly a gdb problem. Nonetheless, given that I've
been working on gdb and C++, I would appreciate it if you could send me
a test case for this so that I can integrate it into the test suite.
> I expect someone will tell me I should have completely unoptimized code
> when debugging. That usually isn't practical. I know how to deal with
> all the strange sequence and inlining effects using dual view in a
> working GUI debugger. The failures I've described above happen even in
> functions where the optimization did nothing strange.
It is not possible for me to say whether anything is misbehaving. If you
are correct, it certainly sounds like there is a problem with
gdb/insight. Optimization is a strange beast, but it sounds like you
already are well-aware of the pitfalls of debugging optimized code.
Once again, I must apologize and ask for a test case. I simply have not
seen the issues you are reporting _on x86_. I will attempt to
double-check your findings against x86_64 this week, but if you have
test cases for any of these, it would certainly make things a whole lot
easier.
Keith
* Re: Can't debug x86_64 C++ programs.
2008-09-18 21:19 ` Keith Seitz
@ 2008-09-18 21:37 ` John Fine
2008-09-19 7:26 ` Keith Seitz
2008-09-18 22:11 ` John Fine
2008-09-18 22:27 ` John Fine
2 siblings, 1 reply; 16+ messages in thread
From: John Fine @ 2008-09-18 21:37 UTC (permalink / raw)
To: Keith Seitz; +Cc: insight
Keith Seitz wrote:
> Unfortunately, insight is quickly approaching EOL. I have been about
> the only person to keep it hobbling along for quite some years, and I
> am just about ready to spend my copious free time on some other project.
Can you give me some hints where in the GDB/Insight source code certain
operations are done? I think I might be able to fix or at least
understand some of these bugs myself.
I'd especially like to have the nexti button actually work (rather than
usually step in). I can't imagine how that bug could be the GUI's
fault. It only makes sense if the GDB code for nexti can't recognize a
call instruction in x86_64 asm code. I'd really like to look at the
code in gdb that tries to determine whether the next instruction is a
call. (Maybe it is confused by the possible encodings of call or maybe
it is confused by the 64-bit virtual address and sometimes looking in
the wrong place.) But I don't know how to find that code.
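For reference, the near-call encodings in question are few. The sketch below is a hypothetical helper (`looks_like_near_call` is not a gdb function), and it assumes one wanted to recognize a call by decoding the bytes directly; whether gdb's nexti works that way is not established here. It handles only an optional REX prefix, `E8` (call rel32) and `FF /2` (indirect call), and ignores legacy prefixes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from gdb: returns 1 if the bytes at
   insn look like a near CALL on x86-64, 0 otherwise. */
static int looks_like_near_call(const uint8_t *insn, size_t len)
{
    size_t i = 0;

    assert(insn != NULL);

    /* Skip one optional REX prefix (0x40..0x4f). */
    if (i < len && (insn[i] & 0xf0) == 0x40)
        i++;
    if (i >= len)
        return 0;

    /* E8 cd: CALL rel32. */
    if (insn[i] == 0xe8)
        return 1;

    /* FF /2: CALL r/m64. The opcode extension is the ModRM reg field. */
    if (insn[i] == 0xff && i + 1 < len) {
        unsigned reg = (insn[i + 1] >> 3) & 7;
        return reg == 2;
    }
    return 0;
}
```

For example, the bytes `ff d0` (call *%rax) are accepted while `ff e0` (jmp *%rax) are not, since the two differ only in the ModRM reg field.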
> Isolated test cases are very useful.
I'll see what I can find time to create. So far I haven't tried to
debug anything small, so I don't know whether Insight would behave
differently. I haven't found any target programs in which insight
doesn't have these malfunctions. There are calls in each program for
which gdb/Insight can correctly step over or out. There are lines in
the source code of each for which gdb/Insight seems to understand the
correspondence between source and asm. But in every program, there are
lines where it doesn't.
A lot of the debugging I attempted (and some of the worst gdb/Insight
behavior I saw) was with the program opannotate that is part of the oprofile
package.
It is Oprofile version 0.9.3 (because the build for the newer Oprofile
was unhappy with some older libraries on our Centos system). It was
built with gcc version 3.4.6 with switches -g -O2.
I tried Insight versions 6.7.1 and 6.8. In 6.7.1 I made the obvious
changes to the several casts that caused the build to fail.
In Opannotate, I just wanted to understand some of the basic flow of the
program that I couldn't figure out from the source code. So I just
wanted to do a bunch of step over and a few step into operations. But I
spent all my time trying to get out of the functions that I stepped into
using the step over button and never reached the parts I wanted to see.
Most of what I'm attempting to debug is in a large closed source project
compiled with the Intel 10.0 compiler. I tried using GCC 3.4.6 for the
parts of that which I wanted to debug, but that just made debugging
harder. Neither Intel nor GCC is great about correctly identifying the
source line for each asm line. But GCC is by far the worse of the two
at that. Most of the errors in the correspondence between source and
asm are made in the debugger (I'm trusting Objdump as the arbiter), but
starting with lots of errors already made by the compiler still makes
things worse.
Another place I'd really like some hints about gdb/Insight source code
location is in the construction of the text that goes into the asm
window. I probably won't understand the code well enough to make the
change I'd like, but I want to look and see if I can. On each line of
asm code, I'd like to display the line number of the source line that the
debugger thinks the debug info tells it corresponds. If I really
understood how to change things, I'd also color code that, to flag the
places where the source filename is different. If practical, I would
also drop the display of the function+offset version of the address. In
C++ code function names get too messy and the plain virtual address
should be good enough.
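The annotation described above amounts to an address-to-line lookup for each asm line. A minimal sketch, assuming a table sorted by address; `struct line_entry`, `lookup_line` and `file_changed` are hypothetical names, not gdb or Insight internals, and a real line table would also carry end-of-sequence markers that this ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row of a simplified line table (hypothetical layout). */
struct line_entry {
    uint64_t addr;      /* first address covered by this row */
    const char *file;   /* source file name */
    int line;           /* source line number */
};

/* Return the row covering pc, or NULL if pc lies before the table.
   The table must be sorted by ascending addr. */
static const struct line_entry *
lookup_line(const struct line_entry *tab, size_t n, uint64_t pc)
{
    size_t lo = 0, hi = n;

    assert(tab != NULL || n == 0);

    /* Binary search for the first row whose addr is greater than pc. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tab[mid].addr <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo == 0 ? NULL : &tab[lo - 1];
}

/* Nonzero if two rows come from different source files; this is the
   condition the color coding would flag. */
static int file_changed(const struct line_entry *a, const struct line_entry *b)
{
    return strcmp(a->file, b->file) != 0;
}
```

Each asm line would then be prefixed with the `line` of the row found for its address, and highlighted when `file_changed` is true relative to the previous row.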
>
>> If I try to view the registers window anytime after pressing Run, the
>> whole debugger crashes. If I view the register window first, it
>> appears, then when I press run it populates, then a moment later the
>> whole debugger crashes.
>
> Once again, I am sorry, but I cannot reproduce this (on x86).
I haven't checked yet whether I can get a core file. If I can, then I
can reload that in gdb and get a backtrace.
>
> Holy moly! Do they really do that? I admit that I seldom use SRC+ASM,
> but the way that code is arranged in insight, I would have thought
> that it was impossible for the SRC and ASM panes to swap back and
> forth. I would definitely like to see a test case for this.
I'm not certain, but I think it happens when the debugger thinks the
source file has changed. I do a step across an asm instruction and
(because of inlining) the source file might change. The compiler's
rules (especially GCC) for identifying the line number in cases of
inlining seem to be seriously bogus. The debug info often stays at the
highest level (so the source file wouldn't change) but sometimes digs in
a layer or two. By any reasonable definition, the source file
identified by the debug info differs from reality and the understanding
by the debugger differs again from that debug info. But at some points
the debugger understands the source file has changed on a step of an
ordinary instruction within a function. Then the contents of the source
pane change to the other position and if the source pane was on top, it
moves to the bottom. I'm not sure that case can also move the source
pane back to the top, but something can, because it doesn't stay on the
bottom.
If it shouldn't swap, where should it stay (top or bottom)?
> Insight does not do anything with breakpoints other than display them.
> This is almost certainly a gdb problem. Nonetheless, given that I've
> been working on gdb and C++, I would appreciate it if you could send
> me a test case for this so that I can integrate it into the test suite.
When you set a breakpoint in asm view, Insight sometimes displays the
red mark in both asm view and source view. Is it really gdb code that
tells it where to display the red mark in source view?
I hope I wasn't unclear about that pair of bugs. When I set a
breakpoint in asm view gdb/Insight seem to be rock solid about setting
the breakpoint on the correct asm instruction and when you press
continue actually stopping on the correct asm instruction. The problems
are that it usually doesn't (but sometimes does) mark a line in the
source view when you set the breakpoint (even though the Objdump output
from the same code identifies a source line for that asm line that is
within view). The other problem is when it hits the breakpoint, it
displays the wrong file in the source pane.
* Re: Can't debug x86_64 C++ programs.
2008-09-18 21:19 ` Keith Seitz
2008-09-18 21:37 ` John Fine
@ 2008-09-18 22:11 ` John Fine
2008-09-18 22:27 ` John Fine
2 siblings, 0 replies; 16+ messages in thread
From: John Fine @ 2008-09-18 22:11 UTC (permalink / raw)
To: Keith Seitz; +Cc: insight
Keith Seitz wrote:
>
>> If I try to view the registers window anytime after pressing Run, the
>> whole debugger crashes. If I view the register window first, it
>> appears, then when I press run it populates, then a moment later the
>> whole debugger crashes.
>
> Once again, I am sorry, but I cannot reproduce this (on x86).
>
I got the core file, then used gdb to get a backtrace. The beginning of
that is
#0  get_register (regnum=2, arg=<value optimized out>)
    at ../../insight-6.8/gdb/gdbtk/generic/gdbtk-register.c:331
#1  0x00000000004addd8 in gdb_register_info (clientData=<value optimized out>, interp=0xa93ab0,
    objc=<value optimized out>, objv=0xa96148)
    at ../../insight-6.8/gdb/gdbtk/generic/gdbtk-register.c:426
#2  0x00000000004a8ff7 in wrapped_call (opaque_args=0x7fbfff5cf0)
    at ../../insight-6.8/gdb/gdbtk/generic/gdbtk-cmds.c:423
#3  0x0000000000505f3b in catch_errors (func=0x4a8fe0 <wrapped_call>, func_args=0x7fbfff5cf0,
    errstring=0x77ecc2 "", mask=<value optimized out>)
    at ../../insight-6.8/gdb/exceptions.c:513
#4  0x00000000004a8ecb in gdbtk_call_wrapper (clientData=0x4adb90, interp=0xa93ab0, objc=3,
    objv=0xa96138)
    at ../../insight-6.8/gdb/gdbtk/generic/gdbtk-cmds.c:354
#5  0x00000000007069b1 in TclEvalObjvInternal (interp=0xa93ab0, objc=3, objv=0xa96138, command=0x0,
    length=0, flags=0)
    at ../../../insight-6.8/tcl/unix/../generic/tclBasic.c:3048
#6  0x000000000072a414 in TclExecuteByteCode (interp=0xa93ab0, codePtr=0x196f350)
    at ../../../insight-6.8/tcl/unix/../generic/tclExecute.c:1431
#7  0x000000000072d548 in TclCompEvalObj (interp=0xa93ab0, objPtr=0x16559e0)
    at ../../../insight-6.8/tcl/unix/../generic/tclExecute.c:1008
#8  0x00000000007089e7 in Tcl_EvalObjEx (interp=0x3e30031640, objPtr=0x51, flags=0)
    at ../../../insight-6.8/tcl/unix/../generic/tclBasic.c:3944
#9  0x0000000000669f41 in Itcl_EvalMemberCode (interp=0xa93ab0, mfunc=0x16d6b40, member=0x16d6b70,
    contextObj=0x16d0f20, objc=2, objv=0xa96128)
    at /home/fine/insight-6.8/itcl/itcl/generic/itcl_methods.c:1003
#10 0x000000000066a9e6 in Itcl_ExecMethod (clientData=<value optimized out>, interp=0xa93ab0,
    objc=2, objv=0xa96128)
    at /home/fine/insight-6.8/itcl/itcl/generic/itcl_methods.c:1516
#11 0x00000000007069b1 in TclEvalObjvInternal (interp=0xa93ab0, objc=2, objv=0xa96128, command=0x0,
    length=0, flags=0)
    at ../../../insight-6.8/tcl/unix/../generic/tclBasic.c:3048
#12 0x000000000072a414 in TclExecuteByteCode (interp=0xa93ab0, codePtr=0x197c8a0)
    at ../../../insight-6.8/tcl/unix/../generic/tclExecute.c:1431
I'll try to dig into gdbtk-register.c myself and see if I can understand
the problem. But meanwhile, I thought the backtrace might help you tell
me what to look for.
* Re: Can't debug x86_64 C++ programs.
2008-09-18 21:19 ` Keith Seitz
2008-09-18 21:37 ` John Fine
2008-09-18 22:11 ` John Fine
@ 2008-09-18 22:27 ` John Fine
2008-09-18 23:04 ` Keith Seitz
2 siblings, 1 reply; 16+ messages in thread
From: John Fine @ 2008-09-18 22:27 UTC (permalink / raw)
To: Keith Seitz; +Cc: insight
Keith Seitz wrote:
>> If I try to view the registers window anytime after pressing Run, the
>> whole debugger crashes. If I view the register window first, it
>> appears, then when I press run it populates, then a moment later the
>> whole debugger crashes.
>
> Once again, I am sorry, but I cannot reproduce this (on x86).
>
Can you give me an expert opinion on these lines of code in gdbtk-register.c
regformat = (int *)xcalloc (numregs, sizeof(int));
regtype = (struct type **)xcalloc (numregs, sizeof(struct type **));
especially that sizeof(int)
I know that whatever object lies directly in physical ram before the
allocation of regtype has overflowed and corrupted regtype.
I think the above allocations happen early enough in initialization that
they would sequentially grab new memory (rather than reuse chunks that
would tend to be distant from each other).
So (barely more than a guess) I think regformat is the object using more
memory than was allocated for it and overflowing into regtype.
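One way to test that guess is to make every write into the per-register array go through a bounds check. The sketch below is an assumption-laden illustration, not the gdbtk-register.c code: `struct regformat_table`, `table_init` and `table_set` are hypothetical names, and plain `calloc` stands in for `xcalloc`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper around a per-register format array. */
struct regformat_table {
    int *fmt;    /* one display format per register */
    int count;   /* number of slots actually allocated */
};

/* Allocate zeroed storage for numregs formats. Returns 0 on success. */
static int table_init(struct regformat_table *t, int numregs)
{
    assert(t != NULL && numregs > 0);
    t->fmt = calloc((size_t)numregs, sizeof *t->fmt);
    t->count = t->fmt ? numregs : 0;
    return t->fmt ? 0 : -1;
}

/* Store a format, refusing any register number outside the allocation
   instead of silently writing past the end of the array. */
static int table_set(struct regformat_table *t, int regno, int fmt)
{
    if (regno < 0 || regno >= t->count)
        return -1;
    t->fmt[regno] = fmt;
    return 0;
}
```

If `table_set` ever returns -1 in a debugging build, the register count seen by the caller has grown past what was allocated, which is the overflow suspected above.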
* Re: Can't debug x86_64 C++ programs.
2008-09-18 22:27 ` John Fine
@ 2008-09-18 23:04 ` Keith Seitz
2008-09-18 23:25 ` John Fine
0 siblings, 1 reply; 16+ messages in thread
From: Keith Seitz @ 2008-09-18 23:04 UTC (permalink / raw)
To: John Fine; +Cc: insight
John Fine wrote:
> Can you give me an expert opinion on these lines of code in
> gdbtk-register.c
>
> regformat = (int *)xcalloc (numregs, sizeof(int));
> regtype = (struct type **)xcalloc (numregs, sizeof(struct type **));
>
> especially that sizeof(int)
The file global regformat simply holds a bunch of ints which are passed
to gdb to specify the output format for the register -- one per register.
> So (barely more than a guess) I think regformat is the object using more
> memory than was allocated for it and overflowing into regtype.
That sounds like a plausible explanation. You might try to see if
numregs is changing or if the register number is greater than the amount
that has been allocated.
It shouldn't, but who knows. There has been a fair amount of churn in
this area in gdb over the last year.
Keith
* Re: Can't debug x86_64 C++ programs.
2008-09-18 23:04 ` Keith Seitz
@ 2008-09-18 23:25 ` John Fine
2008-09-19 8:20 ` Keith Seitz
0 siblings, 1 reply; 16+ messages in thread
From: John Fine @ 2008-09-18 23:25 UTC (permalink / raw)
To: Keith Seitz; +Cc: insight
Keith Seitz wrote:
>
> That sounds like a plausible explanation. You might try to see if
> numregs is changing or if the register number is greater than the
> amount that has been allocated.
>
> It shouldn't, but who knows. There has been a fair amount of churn in
> this area in gdb over the last year.
>
This is a situation in which I really could use a decent 64-bit debugger
(to find the bug in gdb/Insight). But I don't know how to use gdb
beyond the simplest operations. I wouldn't even know how to try to use
gdb to debug itself.
So I wasted lots of time trying to use
fprintf_unfiltered (gdb_stdlog,
Then I realized most of those were just trying to pop up in a window
when Insight crashed, so I would never get to see them.
So I tried ordinary printf and verified what I expected. But I still
don't know enough about gdb/Insight internals to know exactly where the
bug is. I'm not even sure yet whether it is gdb or insight (though I'm
looking at code that is apparently gdb).
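A minimal sketch of the kind of "ordinary printf" tracing meant here, assuming stderr is still attached to the terminal the debugger was started from. `trace` is a hypothetical helper, not part of gdb; it writes to stderr and flushes at once so the last line survives a crash.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical tracing helper: formats like printf, writes to stderr,
   and flushes immediately. Returns the number of characters written,
   or a negative value on error (the vfprintf result). */
static int trace(const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);

    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);

    fflush(stderr);
    return n;
}
```

A call such as `trace("numregs=%d\n", numregs);` at the top of each function of interest is enough to produce a log like the one below.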
I put printf's in
architecture_changed_event
deprecated_current_gdbarch_select_hack
setup_architecture_data
and
gdb_regformat
The output I get from my printf's is
deprecated_current_gdbarch_select_hack() current_gdbarch=0xa89450
architecture_changed_event
setup_architecture_data() current_gdbarch=0xa89450 numregs=50 old_regs=0xc98ab0 regformat=0xc98de0 regtype=0xc98eb0
deprecated_current_gdbarch_select_hack() current_gdbarch=0xc9de00
architecture_changed_event
gdb_regformat() current_gdbarch=c9de00 numregs=58 regno=0 fm=0x78
gdb_regformat() current_gdbarch=c9de00 numregs=58 regno=1 fm=0x78
...
gdb_regformat() current_gdbarch=c9de00 numregs=58 regno=55 fm=0x78
gdb_regformat() current_gdbarch=c9de00 numregs=58 regno=56 fm=0x78
Notice something calls deprecated_current_gdbarch_select_hack, which
calls architecture_changed_event, then something calls
setup_architecture_data.
I think architecture_changed_event OUGHT to call
setup_architecture_data, but I can't tell from the source code what it
actually does or who calls setup_architecture_data.
Then notice the second time that deprecated_current_gdbarch_select_hack
is called it still calls architecture_changed_event, but nothing calls
setup_architecture_data, so the data structures are still allocated for
50 registers, but now there are 58.
The registers 0 through 56 get their formats set, trashing regtype,
which as I expected does follow regformat in memory.
The question of what is supposed to call setup_architecture_data
on/after that second call to deprecated_current_gdbarch_select_hack is
way beyond both my understanding of the gdb source code and my ability
to debug things by adding printf's.
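The addresses in the log allow a quick check of where the overrun begins. The sketch below assumes `sizeof(int)` is 4 and that the logged pointers are the starts of the two arrays; `first_index_reaching` is a hypothetical helper written for this calculation only.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the first element of an array starting at base (elements of
   elem_size bytes) that lies at or beyond the address next. */
static int64_t first_index_reaching(uint64_t base, uint64_t next,
                                    uint64_t elem_size)
{
    assert(elem_size > 0 && next >= base);
    /* Round up so a partially overlapping element is not counted early. */
    return (int64_t)((next - base + elem_size - 1) / elem_size);
}
```

With `regformat=0xc98de0` and `regtype=0xc98eb0` the gap is 0xd0 = 208 bytes, so `first_index_reaching(0xc98de0, 0xc98eb0, 4)` gives 52. Under these assumptions, slots 50 and 51 fall in the gap between the two allocations, and the writes for regno 52 through 56 are the ones that land on regtype.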
* Re: Can't debug x86_64 C++ programs.
2008-09-18 20:53 Can't debug x86_64 C++ programs John Fine
2008-09-18 21:19 ` Keith Seitz
@ 2008-09-19 2:17 ` Keith Seitz
1 sibling, 0 replies; 16+ messages in thread
From: Keith Seitz @ 2008-09-19 2:17 UTC (permalink / raw)
To: John Fine; +Cc: insight
I've finally built x86_64 insight from CVS HEAD. And the news isn't good...
I'm going to omit the entirety of your message, because my response is
the same on all of them: I could not reproduce the problems you reported
-- not one of them. :-(
The only problem I was able to reproduce is the
pending-breakpoints-crash problem, but there is a patch floating around
for that (which I will check in before EOD).
I know that's not what you want to hear, but that tells us something: we
have a chance at resolving this because it does work for me.
I'll respond to your other messages with more detail shortly.
Keith
* Re: Can't debug x86_64 C++ programs.
2008-09-18 21:37 ` John Fine
@ 2008-09-19 7:26 ` Keith Seitz
2008-09-19 14:27 ` John Fine
0 siblings, 1 reply; 16+ messages in thread
From: Keith Seitz @ 2008-09-19 7:26 UTC (permalink / raw)
To: John Fine; +Cc: insight
John Fine wrote:
> Can you give me some hints where in the GDB/Insight source code certain
> operations are done? I think I might be able to fix or at least
> understand some of these bugs myself.
Yes, I can provide some pointers about where to look for certain things,
but my knowledge of gdb is rusty. The good news is that it is now my job
to make gdb a better c++ debugger, so my knowledge grows every day.
> I'd especially like to have the nexti button actually work (rather than
> usually step in). I can't imagine how that bug could be the GUI's
> fault. It only makes sense if the GDB code for nexti can't recognize a
> call instruction in x86_64 asm code. I'd really like to look at the
> code in gdb that tries to determine whether the next instruction is a
> call. (Maybe it is confused by the possible encodings of call or maybe
> it is confused by the 64-bit virtual address and sometimes looking in
> the wrong place.) But I don't know how to find that code.
First things first: verify that this is a problem with core gdb (which
it would have to be IMO). Run the command-line gdb (insight -nw or
insight --i=console), load your file (pass as argument or "file MYAPP"
at gdb prompt), set a break somewhere ("break SOMEWHERE"), and try to
nexti. If it doesn't work here, it is definitely a gdb problem.
Debugging this, though, is going to be a pain. A real pain. You should
set a breakpoint at "nexti_command" in infcmd.c and step from there. I
haven't looked at this code in almost a decade, but I do remember it
being particularly nasty.
>> Isolated test cases are very useful.
> I'll see what I can find time to create. So far I haven't tried to
> debug anything small, so I don't know whether Insight would behave
> differently. I haven't found any target programs in which insight
> doesn't have these malfunctions. There are calls in each program for
> which gdb/Insight can correctly step over or out. There are lines in
> the source code of each for which gdb/Insight seems to understand the
> correspondence between source and asm. But in every program, there are
> lines where it doesn't.
If you can find a test case -- ANY test case -- that you can send me,
that would greatly expedite this process.
> A lot of the debugging I attempted (and some of the worst gdb/Insight
> behavior I saw) was with the program opannotate that is part of the oprofile
> package.
I am quite familiar with opannotate (having written the eclipse-oprofile
RPM), so I'll take a look at this and see if I cannot get any further
with your problems.
> It is Oprofile version 0.9.3 (because the build for the newer Oprofile
> was unhappy with some older libraries on our Centos system). It was
> built with gcc version 3.4.6 with switches -g -O2.
gcc 3.4.6? Wow, that's over two and one-half years old -- ancient by
gcc/gdb standards *for C++*. I will try to grab a copy of this and see
how executables built by this behave with gdb. I am going to guess that
this is a big part of the problem.
I don't suppose you could try gcc 4.1 or 4.4?
> In Opannotate, I just wanted to understand some of the basic flow of the
> program that I couldn't figure out from the source code. So I just
> wanted to do a bunch of step over and a few step into operations. But I
> spent all my time trying to get out of the functions that I stepped into
> using the step over button and never reached the parts I wanted to see.
I'll give this a try.
> Most of what I'm attempting to debug is in a large closed source project
> compiled with the Intel 10.0 compiler.
I don't know much about the Intel 10.0 compiler or how well gdb works
with it. I'll ask around.
> Another place I'd really like some hints about gdb/Insight source code
> location is in the construction of the text that goes into the asm
> window. I probably won't understand the code well enough to make the
> change I'd like, but I want to look and see if I can. On each line of
> asm code, I'd like to display the line number of the source line that the
> debugger thinks the debug info tells it corresponds. If I really
> understood how to change things, I'd also color code that, to flag the
> places where the source filename is different. If practical, I would
> also drop the display of the function+offset version of the address. In
> C++ code function names get too messy and the plain virtual address
> should be good enough.
That occurs in gdb_load_assembly in src/gdb/gdbtk/generic/gdbtk-cmds.c.
> I'm not certain, but I think it happens when the debugger thinks the
> source file has changed. I do a step across an asm instruction and
> (because of inlining) the source file might change.
If I am reading this right, the executable has changed and the asm pane
is now obsolete. There could be a bug here w.r.t. updating the contents
of the asm pane (a bug I've squashed several times over the years), but
I cannot think how this could cause the panes to swap places off the top
of my head. Again, I'll take a look and see if I can reproduce this
given the new information.
> If it shouldn't swap, where should it stay (top or bottom)?
SRC pane should always be on top, ASM always on the bottom.
> When you set a breakpoint in asm view, Insight sometimes displays the
> red mark in both asm view and source view. Is it really gdb code that
> tells it where to display the red mark in source view?
Yes, because only the backend (gdb) knows how to translate between the
two. Insight is actually pretty stupid most of the time: it was designed
this way so that getting a new architecture to run on it would require
getting gdb running. Even today, with every new arch added to gdb, the
only time insight needs modification is if a new target is introduced.
So, there we are: more investigation is necessary. I hope to get to some
of this tonight.
Keith
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
2008-09-18 23:25 ` John Fine
@ 2008-09-19 8:20 ` Keith Seitz
2008-09-19 13:27 ` John Fine
0 siblings, 1 reply; 16+ messages in thread
From: Keith Seitz @ 2008-09-19 8:20 UTC (permalink / raw)
To: John Fine; +Cc: insight
John Fine wrote:
> So I wasted lots of time trying to use
> fprintf_unfiltered (gdb_stdlog,
>
> Then I realized most of those were just trying to pop up in a window
> when Insight crashed, so I would never get to see them.
Yeah, the *_unfiltered and *_filtered functions all go through uiout,
which is redirected to insight and tcl/tk. As you discovered, you need
to use "normal" printfs.
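A minimal sketch of such a "normal" printf helper, assuming plain C
stdio. The name debug_trace is hypothetical and is not a gdb or Insight
function; it writes directly to a stdio stream and flushes at once, so
the text should still reach the terminal if the debugger crashes right
afterwards.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not part of gdb or Insight): format a trace
   message and write it straight to OUT, bypassing any GUI-redirected
   output path.  Returns the number of characters written, or a
   negative value on error, as vfprintf does.  */
int
debug_trace (FILE *out, const char *fmt, ...)
{
  va_list ap;
  int n;

  assert (out != NULL && fmt != NULL);

  va_start (ap, fmt);
  n = vfprintf (out, fmt, ap);
  va_end (ap);

  /* Flush immediately so the message is not lost in a stdio buffer
     if the process dies before the next write.  */
  fflush (out);
  return n;
}
```

For instance, a call such as debug_trace (stderr, "numregs=%d\n", n)
near the top of the function under suspicion should print to the
terminal that launched Insight rather than to a GUI window.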
> I think architecture_changed_event OUGHT to call
> setup_architecture_data, but I can't tell from the source code what it
> actually does or who calls setup_architecture_data.
Here's what's going on. When you load your x86_64 executable, gdb notices
that this is different from the current i386 architecture, so it changes
the current architecture to i386:x86_64. When this is changed, an
observer notification is fired off from
deprecated_current_gdbarch_select_hack (deprecated? I guess gdb doesn't
want anyone to know that the architecture has changed??).
Insight registers gdbtk_architecture_changed as the observer callback.
This then calls the Tcl procedure gdbtk_tcl_architecture_changed, which
dispatches this event to the windows. The register window is currently
the only window that does anything with this event.
It calls the proc gdb_reg_arch_changed, which is aliased to the C
function setup_architecture_data. [Phew!] So, alas,
setup_architecture_data *is* being called.
This will require further investigation.
Keith
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
2008-09-19 8:20 ` Keith Seitz
@ 2008-09-19 13:27 ` John Fine
2008-09-19 16:38 ` Keith Seitz
0 siblings, 1 reply; 16+ messages in thread
From: John Fine @ 2008-09-19 13:27 UTC (permalink / raw)
To: Keith Seitz; +Cc: insight
Keith Seitz wrote:
>
> Insight registers gdbtk_architecture_changed as the observer callback.
That's the part I didn't have a clue about, so you've given me at least
the name I can grep for to get a clue.
> It calls the proc gdb_reg_arch_changed, which is aliased to the C
> function setup_architecture_data. [Phew!] So, alas,
> setup_architecture_data *is* being called.
But I had a printf in setup_architecture_data, so I can be quite sure it
is only called the first time (for the architecture with 50 registers)
and it is not called the second time (for the architecture with 58
registers).
I'm a bit confused that i386 has as many as 50 registers and much more
confused that AMD64 has only 8 more than i386. I know for sure about 16
registers that are not in i386 but are part of AMD64 (and are working
correctly in the registers window, now that I kludged around the memory
clobber). But that 58 vs. 50 issue is just idle curiosity. The big
question is how you avoid the memory clobber.
Is setup_architecture_data really called twice for 50 registers the
first time and 58 the second? Or does your copy start in AMD64
architecture? Or does your memory allocation land differently, so the
memory clobber happens without symptom? Or what?
Is the fact that my copy of Insight was built with gcc 3.4.6 more
significant than the fact that some of my target programs were built
with gcc 3.4.6?
The group with which my project needs to keep gcc version compatibility
is about to switch versions, I think to 4.1.2. We have that version of
gcc already installed here, but not as the default. I'm not very expert
in Linux. So I know how to pick a non default version of gcc for our
own projects built with bjam, but I don't know how to do it for a source
package like Insight, with its rather complicated makefile. Tomorrow
I'll look into building both Insight and Oprofile with the newer gcc.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
2008-09-19 7:26 ` Keith Seitz
@ 2008-09-19 14:27 ` John Fine
2008-09-19 18:47 ` Keith Seitz
0 siblings, 1 reply; 16+ messages in thread
From: John Fine @ 2008-09-19 14:27 UTC (permalink / raw)
To: Keith Seitz; +Cc: insight
Keith Seitz wrote:
> The good news is that it is now my job to make gdb a better c++
> debugger, so my knowledge grows every day.
Can I put in an early request (maybe you've heard this one already) to
do something about the absurdly long function names?
In viewing disassembly, you get function name + offset as one of the
items on the line, but in typical templated code the function name is so
long that the actual disassembly is pushed off the right edge of the
display.
I can't think of a decent method to make a more concise display of
function name (maybe you can). But failing that, I think you need an
option to drop it entirely. Or is there such an option already that I
just haven't found? I'm not very good at using gdb. function name +
offset was never such valuable information that we really needed it and
when it is too bulky to use, it should go away.
For Insight's SRC+ASM view, I think putting the source line number on
each line in both panels would help a lot (again if that option is
already there, I'm not expert). While that is less critical for gdb
itself than for a GUI, I think it would be a very useful option in gdb
itself (disassemble with the source line number shown on each line of
disassembly). Hopefully other GUIs layered on gdb would give the user
access to that feature.
If you're not totally turning off optimization, any user looking at a
disassembly spends a lot of time figuring out which source line is
connected to each asm line. It would make more sense for the debugger
to just display the compiler's version of that information.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
2008-09-19 13:27 ` John Fine
@ 2008-09-19 16:38 ` Keith Seitz
2008-09-20 21:22 ` John Fine
0 siblings, 1 reply; 16+ messages in thread
From: Keith Seitz @ 2008-09-19 16:38 UTC (permalink / raw)
To: John Fine; +Cc: insight
John Fine wrote:
> But I had a printf in setup_architecture_data, so I can be quite sure it
> is only called the first time (for the architecture with 50 registers)
> and it is not called the second time (for the architecture with 58
> registers).
setup_architecture_data is called twice, once on startup (in which case
the architecture is i386 w/50 registers) and then once after "file"
command is issued (when the arch changes to i386:x86_64 w/58 registers)
-- at least it is in my copy. You might be seeing an old bug?
From what I can tell, the change to add the call to
setup_architecture_data was committed on 28 Jun 2007:
2007-06-27 Keith Seitz <keiths@redhat.com>
* generic/gdbtk-register.c (Gdbtk_Register_Init): Remove
calls to deprecated_register_gdbarch_swap.
Add "gdb_reg_arch_changed" command.
* library/regwin.itb (arch_changed): Call gdb_reg_arch_changed.
Is this in your copy of the sources?
> I'm a bit confused that i386 has as many as 50 registers and much more
> confused that AMD64 has only 8 more than i386. I know for sure about 16
> registers that are not in i386 but are part of AMD64 (and are working
> correctly in the registers window, now that I kludged around the memory
> clobber). But that 58 vs. 50 issue is just idle curiosity. The big
> question is how you avoid the memory clobber.
The register names all come from gdb. Insight has no specific knowledge
about them. [Remember I said earlier that insight was really ignorant of
architecture-specific things in order to facilitate bring-up of new
architectures? That applies here, too.] Insight gets the list of
registers (and register groups) from gdb directly.
If your copy of the sources contains the patch mentioned above, there
must be another reason for the clobber. Have you tried to "watch" the
memory involved? [If you know the location of a specific clobber, you
can do "watch *0xADDR" to find out what clobbered the memory.] AFAICT,
there is no memory clobber: setup_architecture_data is called on the
architecture_changed_event (albeit in a very obfuscated way for some
reason), it releases the old data and callocs new space.
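The release-and-reallocate step can be sketched as follows. This is only
an illustration under assumptions: struct reg_tables, its fields, and
reg_tables_resize are hypothetical names, not the actual declarations in
gdbtk-register.c. The point is that every per-register array is resized
together whenever the register count changes, so that indexing register
57 cannot run past an array that was sized for 50.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-architecture tables, one slot per register.  */
struct reg_tables
{
  size_t numregs;   /* number of slots in each array below */
  int *format;      /* display format chosen for each register */
  void **type;      /* opaque type handle for each register */
};

/* Replace both arrays with zeroed arrays sized for NUMREGS registers.
   Returns 0 on success, or -1 if allocation fails, in which case the
   old tables are left untouched.  */
int
reg_tables_resize (struct reg_tables *t, size_t numregs)
{
  int *format;
  void **type;

  assert (t != NULL && numregs > 0);

  format = calloc (numregs, sizeof *format);
  type = calloc (numregs, sizeof *type);
  if (format == NULL || type == NULL)
    {
      free (format);
      free (type);
      return -1;
    }

  /* Release the old data only after the new space is in hand.  */
  free (t->format);
  free (t->type);
  t->format = format;
  t->type = type;
  t->numregs = numregs;
  return 0;
}
```

If a resize like this were skipped on the change from 50 to 58
registers, writes to the last eight format slots would land outside the
first array, which is consistent with one table corrupting its
neighbour.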
> Is setup_architecture_data really called twice for 50 registers the
> first time and 58 the second? Or does your copy start in AMD64
> architecture? Or does your memory allocation land differently, so the
> memory clobber happens without symptom? Or what?
Yes. The first call is from Gdbtk_Register_Init on startup (arch=i386,
numregs=50). The second time is after the "file" command to load the
x86_64 executable (when arch changes to i386:x86_64, numregs=58).
> Is the fact that my copy of Insight was built with gcc 3.4.6 more
> significant than the fact that some of my target programs were built
> with gcc 3.4.6?
I would guess that the former is not a problem, gdb/insight is, after
all, just a simple C program. Your target code, though, in C++ w/gcc
3.4.6 is what I would be most suspicious of.
> Tomorrow
> I'll look into building both Insight and Oprofile with the newer gcc.
If you are just building insight from sources, you can simply set CC in
your environment to use a different compiler from the default one on
your PATH. For example,
$ which gcc
/usr/bin/gcc
$ CC=/home/keiths/work/gcj/built/bin/gcc ../src/configure
[...]
$ make all-gdb
This forces the insight build to use the specified compiler
(/home/keiths/work/gcj/built/bin/gcc) instead of the one in your path
(/usr/bin/gcc).
I don't recall how to do it with oprofile (since oprofile uses C++), but
normally the procedure is similar (although the C++ compiler is usually
selected with a different env var, CXX -- try passing "--help" to
oprofile's configure for a list of options).
Keith
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
2008-09-19 14:27 ` John Fine
@ 2008-09-19 18:47 ` Keith Seitz
0 siblings, 0 replies; 16+ messages in thread
From: Keith Seitz @ 2008-09-19 18:47 UTC (permalink / raw)
To: John Fine; +Cc: insight
John Fine wrote:
> I can't think of a decent method to make a more concise display of
> function name (maybe you can). But failing that, I think you need an
> option to drop it entirely. Or is there such an option already that I
> just haven't found? I'm not very good at using gdb. function name +
> offset was never such valuable information that we really needed it and
> when it is too bulky to use, it should go away.
Perhaps the name could be truncated to, say, N leading characters,
"...", and M trailing characters, but I think you are right: we need to
offer the option to drop it entirely. I don't know if gdb will let us do
that, but I can certainly give it a go.
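A minimal sketch of the truncation idea, assuming plain C strings. The
function shorten_name and its parameters are hypothetical and do not
correspond to any existing gdb or Insight routine; lead and trail play
the roles of N and M above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write NAME into OUT (capacity OUTSZ).  If NAME
   is longer than LEAD + TRAIL + 3 characters, keep only the first LEAD
   characters, then "...", then the last TRAIL characters.  Otherwise
   copy NAME unchanged, since shortening would not save any space.
   Returns what snprintf returns: the length the full result needs.  */
int
shorten_name (char *out, size_t outsz, const char *name,
              size_t lead, size_t trail)
{
  size_t len;

  assert (out != NULL && name != NULL);

  len = strlen (name);
  if (len <= lead + trail + 3)
    return snprintf (out, outsz, "%s", name);

  /* "%.*s" prints at most LEAD characters from the start of NAME;
     NAME + len - trail points at the last TRAIL characters.  */
  return snprintf (out, outsz, "%.*s...%s",
                   (int) lead, name, name + len - trail);
}
```

With lead = 3 and trail = 2, the name "abcdefghijklmnopqrstuvwxyz"
comes out as "abc...yz", while a short name such as "main" passes
through untouched.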
> For Insight's SRC+ASM view, I think putting the source line number on
> each line in both panels would help a lot (again if that option is
> already there, I'm not expert). While that is less critical for gdb
> itself than for a GUI, I think it would be a very useful option in gdb
> itself (disassemble with the source line number shown on each line of
> disassembly). Hopefully other GUIs layered on gdb would give the user
> access to that feature.
I don't really ever use SRC+ASM mode. I usually used MIXED, which
intersperses source and assembly. For me, it is much, much easier to
read. Another option would be to show the two side-by-side with callouts
(as is common in many diffing programs), or some other scheme.
Keith
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
2008-09-19 16:38 ` Keith Seitz
@ 2008-09-20 21:22 ` John Fine
2008-09-22 18:35 ` John Fine
0 siblings, 1 reply; 16+ messages in thread
From: John Fine @ 2008-09-20 21:22 UTC (permalink / raw)
To: Keith Seitz; +Cc: insight
I really appreciate all the time you've spent helping me, especially as
it is getting more and more likely that something is wrong with the
environment in which I compiled Insight, rather than with Insight's
source code.
But I still don't know how to find out what was wrong.
Keith Seitz wrote:
> setup_architecture_data is called twice, once on startup (in which
> case the architecture is i386 w/50 registers) and then once after
> "file" command is issued (when the arch changes to i386:x86_64 w/58
> registers) -- at least it is in my copy. You might be seeing an old bug?
I have the lines you quoted in insight-6.8\gdb\gdbtk\ChangeLog-2007. I
don't know how to be certain I have the correction that goes with it.
But in regwin.itb, RegWin::arch_changed does call gdb_reg_arch_changed.
However, I am quite sure:
1) setup_architecture_data is called only once (directly from
Gdbtk_Register_Init)
2) (Before my kludge to allocate excess memory in
setup_architecture_data) regformat did overflow corrupting regtype
I'm still not sure I understand how the call to setup_architecture_data
when the architecture changes is supposed to occur.
Much to my surprise, debugging the broken Insight inside the broken
Insight is moderately practical, so ...
I set a breakpoint inside architecture_changed_event (in gdb-events.c)
on the line
if (!current_event_hooks->architecture_changed)
and I verified that current_event_hooks->architecture_changed is equal
to zero, so architecture_changed_event does nothing.
I expect that is where the second call to setup_architecture_data is
supposed to happen, so current_event_hooks->architecture_changed is not
supposed to be zero, so now I have to search/guess for the code that is
supposed to set current_event_hooks->architecture_changed to something
nonzero.
Can you confirm that current_event_hooks->architecture_changed is
supposed to be nonzero at that point and would be the path to
setup_architecture_data if it were working?
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
2008-09-20 21:22 ` John Fine
@ 2008-09-22 18:35 ` John Fine
0 siblings, 0 replies; 16+ messages in thread
From: John Fine @ 2008-09-22 18:35 UTC (permalink / raw)
To: Keith Seitz; +Cc: insight
John Fine wrote:
> I set a breakpoint inside architecture_changed_event (in gdb-events.c)
> on the line
> if (!current_event_hooks->architecture_changed)
> and I verified that current_event_hooks->architecture_changed is equal
> to zero, so architecture_changed_event does nothing.
Oops. That was sloppy debugging on my part.
current_event_hooks->architecture_changed is equal to zero the first
time architecture_changed_event is called. But
architecture_changed_event is called again later and
current_event_hooks->architecture_changed then points to
gdbtk_architecture_changed, which calls Tcl_Eval(gdbtk_interp,
"gdbtk_tcl_architecture_changed");
That still doesn't end up calling setup_architecture_data. I don't know
whether I can debug well enough to find out why.
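The sequence observed here (hook unset on the first event, set by the
time of a later one) can be modelled in a few lines of C. Every name
below is a hypothetical stand-in for the gdb/Insight piece mentioned in
this thread, and the Tcl step is collapsed into a direct call. The model
only shows how the path is meant to work; it does not reproduce the
failure.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*arch_changed_fn) (void);

/* Hypothetical stand-in for gdb's table of event hooks.  */
struct model_event_hooks
{
  arch_changed_fn architecture_changed;
};

struct model_event_hooks model_hooks;  /* zero-initialized: no observer */
int model_setup_calls;                 /* counts simulated table rebuilds */

/* Stand-in for setup_architecture_data.  */
void
model_setup_architecture_data (void)
{
  model_setup_calls++;
}

/* Stand-in for the GUI callback; in Insight this goes through Tcl.  */
void
model_gui_architecture_changed (void)
{
  model_setup_architecture_data ();
}

/* Install the GUI callback, as the GUI would during its own startup.  */
void
model_install_hook (arch_changed_fn cb)
{
  assert (cb != NULL);
  model_hooks.architecture_changed = cb;
}

/* Stand-in for architecture_changed_event.  Returns 1 if an observer
   ran, 0 if the event was dropped because no hook was installed.  */
int
model_architecture_changed_event (void)
{
  if (!model_hooks.architecture_changed)
    return 0;
  model_hooks.architecture_changed ();
  return 1;
}
```

In this model an event fired before model_install_hook is silently
dropped, and one fired afterwards reaches the setup routine. A printf at
each stage of the real chain would show which link is not being reached.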
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2008-09-19 14:27 UTC | newest]
Thread overview: 16+ messages
2008-09-18 20:53 Can't debug x86_64 C++ programs John Fine
2008-09-18 21:19 ` Keith Seitz
2008-09-18 21:37 ` John Fine
2008-09-19 7:26 ` Keith Seitz
2008-09-19 14:27 ` John Fine
2008-09-19 18:47 ` Keith Seitz
2008-09-18 22:11 ` John Fine
2008-09-18 22:27 ` John Fine
2008-09-18 23:04 ` Keith Seitz
2008-09-18 23:25 ` John Fine
2008-09-19 8:20 ` Keith Seitz
2008-09-19 13:27 ` John Fine
2008-09-19 16:38 ` Keith Seitz
2008-09-20 21:22 ` John Fine
2008-09-22 18:35 ` John Fine
2008-09-19 2:17 ` Keith Seitz