* Can't debug x86_64 C++ programs. @ 2008-09-18 20:53 John Fine 2008-09-18 21:19 ` Keith Seitz 2008-09-19 2:17 ` Keith Seitz 0 siblings, 2 replies; 16+ messages in thread From: John Fine @ 2008-09-18 20:53 UTC (permalink / raw) To: insight Actually X86_64 C++ programs are the only things I've even tried to debug with Insight. So I don't have any good guess of what I might be trying that differs from what those who can use Insight are doing. When I try to use it, Insight has so many bugs, I couldn't begin to list them here and other bugs interfere with any attempt to understand or work around any specific bug. If there is any chance of getting any support, I'd be glad to do specific tests or give much more detail of the failures. I really need a GUI debugger for x86_64 C++ programs. I've tried kdbg, which has many of the same bugs (which implies they are in gdb itself, not in the GUI), but some of those don't happen when I try gdb alone. Also kdbg seems to be missing features needed for practical debugging, features Insight has, if only it worked. If you know of a GUI debugger that works, please tell me. Some of the bugs: The step over buttons (both "Next" and "Next Asm") usually do a step into. They sometimes step over, so they must be connected to the right thing, but usually they step into. The Finish button usually does nothing, sometimes does the step out of operation that I want and sometimes causes the program being debugged to seg fault. If I restart the program and set a breakpoint where a correct step out of would reach and get into the same place as before and just continue, the breakpoint will be hit, but even with the breakpoint there a Finish will make the program seg fault. If I try to view the registers window anytime after pressing Run, the whole debugger crashes. If I view the register window first, it appears, then when I press run it populates, then a moment later the whole debugger crashes. I normally want to work in SRC+ASM mode. 
The compiler has often put asm instructions in a strange order relative to source lines and I'm used to that and (in Windows debuggers) know how to work around it with the dual view. Insight keeps changing its mind about which view is on top (opens with source on top, then changes back and forth for reasons I can't begin to guess). That is very distracting. If I set a breakpoint on a source line, it shows on an asm line that is probably a plausible choice given the debug info the compiler generated (but often isn't a usable choice for actual debugging). If I set a breakpoint on an asm line, it only sometimes gets marked on the correct source line. But I've used objdump to verify that the debug info generated by the compiler is correct enough that the asm line in question can be traced to the correct source line. When it goes to a wrong source line, I'm pretty sure that wrong line is in a totally wrong file, not just the wrong line of the current file. If it hits a breakpoint set in the asm view, even if it originally did mark that correctly in the source view, it almost never opens the right source view. It seems to pick an hpp from which the compiler inlined code elsewhere in the same function. I used objdump to see if the debug info was wrong, and it wasn't. Anyway, the dual view then shows the correct asm code and incorrect source code. If I step, the source code stays in the same wrong file, but I think it uses the source line number that would be correct if it were in the right source file. I expect someone will tell me I should have completely unoptimized code when debugging. That usually isn't practical. I know how to deal with all the strange sequence and inlining effects using dual view in a working GUI debugger. The failures I've described above happen even in functions where the optimization did nothing strange. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs. 2008-09-18 20:53 Can't debug x86_64 C++ programs John Fine @ 2008-09-18 21:19 ` Keith Seitz 2008-09-18 21:37 ` John Fine ` (2 more replies) 2008-09-19 2:17 ` Keith Seitz 1 sibling, 3 replies; 16+ messages in thread From: Keith Seitz @ 2008-09-18 21:19 UTC (permalink / raw) To: John Fine; +Cc: insight John Fine wrote: > When I try to use it, Insight has so many bugs, I couldn't begin to list > them here and other bugs interfere with any attempt to understand or > work around any specific bug. Unfortunately, insight is quickly approaching EOL. I have been about the only person to keep it hobbling along for quite some years, and I am just about ready to spend my copious free time on some other project. As you've discovered, there are a ton of problems. Unfortunately, it takes an experienced eye to recognize the difference between a problem with the GUI and gdb. Gdb is pretty much the worst debugger on the planet when it comes to C++. Sad but true. Red Hat and other companies (and interested parties) are attempting to make gdb a better C++ debugger. You might want to keep an eye on the archer project at sourceware.org. It's probably quite a while out, though, before anything substantially useable is ready. > If there is any chance of getting any support, I'd be glad to do > specific tests or give much more detail of the failures. Isolated test cases are very useful. I don't often use x86_64, but I do now have access to such a box. For right now, all my comments below apply to running insight on x86. I will try to verify all of your issues on x86_64 by the end of the week. > If you know of a GUI debugger that works, please tell me. You might try the CDT project in Eclipse or even some of the debuggers mentioned off of the insight homepage hosted on sourceware.org. 
I have no experience with those unfortunately, and they are all based on gdb, so it is unlikely that C++ debugging is going to be any better than what you are experiencing now. > The step over buttons (both "Next" and "Next Asm") usually do a step > into. They sometimes step over, so they must be connected to the right > thing, but usually they step into. That is very odd. I certainly have never experienced this, and I have actually been using insight almost daily now for several weeks. Can you give me a test case? By the way, what version of insight are you attempting to use? [insight -v or "show version" in a console window] What compiler? > The Finish button usually does nothing, sometimes does the step out of > operation that I want and sometimes causes the program being debugged to > seg fault. If I restart the program and set a breakpoint where a > correct step out of would reach and get into the same place as before > and just continue, the breakpoint will be hit, but even with the > breakpoint there a Finish will make the program seg fault. I am sorry, but I also have not encountered these problems, and believe me when I say I've been using "finish" an awful lot of late, debugging gdb's symbol code with C++. Can you provide me with a test case? > If I try to view the registers window anytime after pressing Run, the > whole debugger crashes. If I view the register window first, it > appears, then when I press run it populates, then a moment later the > whole debugger crashes. Once again, I am sorry, but I cannot reproduce this (on x86). > I normally want to work in SRC+ASM mode. The compiler has often put asm > instructions in a strange order relative to source lines and I'm used to > that and (in Windows debuggers) know how to work around it with the dual > view. Insight keeps changing its mind about which view is on top (opens > with source on top, then changes back and forth for reasons I can't > begin to guess). That is very distracting. Holy moly! 
Do they really do that? I admit that I seldom use SRC+ASM, but the way that code is arranged in insight, I would have thought that it was impossible for the SRC and ASM panes to swap back and forth. I would definitely like to see a test case for this. > If I set a breakpoint on a source line, it shows on an asm line that is > probably a plausible choice given the debug info the compiler generated > (but often isn't a usable choice for actual debugging). If I set a > breakpoint on an asm line, it only sometimes gets marked on the correct > source line. But I've used objdump to verify that the debug info > generated by the compiler is correct enough that the asm line in > question can be traced to the correct source line. When it goes to a > wrong source line, I'm pretty sure that wrong line is in a totally wrong > file, not just the wrong line of the current file. Insight does not do anything with breakpoints other than display them. This is almost certainly a gdb problem. Nonetheless, given that I've been working on gdb and C++, I would appreciate it if you could send me a test case for this so that I can integrate it into the test suite. > I expect someone will tell me I should have completely unoptimized code > when debugging. That usually isn't practical. I know how to deal with > all the strange sequence and inlining effects using dual view in a > working GUI debugger. The failures I've described above happen even in > functions where the optimization did nothing strange. It is not possible for me to say whether anything is misbehaving. If you are correct, it certainly sounds like there is a problem with gdb/insight. Optimization is a strange beast, but it sounds like you already are well-aware of the pitfalls of debugging optimized code. Once again, I must apologize and ask for a test case. I simply have not seen the issues you are reporting _on x86_. 
I will attempt to double-check your findings against x86_64 this week, but if you have test cases for any of these, it would certainly make things a whole lot easier. Keith ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs. 2008-09-18 21:19 ` Keith Seitz @ 2008-09-18 21:37 ` John Fine 2008-09-19 7:26 ` Keith Seitz 2008-09-18 22:11 ` John Fine 2008-09-18 22:27 ` John Fine 2 siblings, 1 reply; 16+ messages in thread From: John Fine @ 2008-09-18 21:37 UTC (permalink / raw) To: Keith Seitz; +Cc: insight Keith Seitz wrote: > Unfortunately, insight is quickly approaching EOL. I have been about > the only person to keep it hobbling along for quite some years, and I > am just about ready to spend my copious free time on some other project. Can you give me some hints where in the GDB/Insight source code certain operations are done? I think I might be able to fix or at least understand some of these bugs myself. I'd especially like to have the nexti button actually work (rather than usually step in). I can't imagine how that bug could be the GUI's fault. It only makes sense if the GDB code for nexti can't recognize a call instruction in x86_64 asm code. I'd really like to look at the code in gdb that tries to determine whether the next instruction is a call. (Maybe it is confused by the possible encodings of call or maybe it is confused by the 64-bit virtual address and sometimes looking in the wrong place.) But I don't know how to find that code. > Isolated test cases are very useful. I'll see what I can find time to create. So far I haven't tried to debug anything small, so I don't know whether Insight would behave differently. I haven't found any target programs in which insight doesn't have these malfunctions. There are calls in each program for which gdb/Insight can correctly step over or out. There are lines in the source code of each for which gdb/Insight seems to understand the correspondence between source and asm. But in every program, there are lines where it doesn't. A lot of the debugging I attempted (and some of the worst gdb/Insight behavior I saw) was with the program opannotate that is part of the oprofile package. 
It is Oprofile version 0.9.3 (because the build for the newer Oprofile was unhappy with some older libraries on our Centos system). It was built with gcc version 3.4.6 with switches -g -O2. I tried Insight versions 6.7.1 and 6.8. In 6.7.1 I made the obvious changes to the several casts that caused the build to fail. In Opannotate, I just wanted to understand some of the basic flow of the program that I couldn't figure out from the source code. So I just wanted to do a bunch of step over and a few step into operations. But I spent all my time trying to get out of the functions that I stepped into using the step over button and never reached the parts I wanted to see. Most of what I'm attempting to debug is in a large closed source project compiled with the Intel 10.0 compiler. I tried using GCC 3.4.6 for the parts of that which I wanted to debug, but that just made debugging harder. Neither Intel nor GCC is great about correctly identifying the source line for each asm line. But GCC is by far the worse of the two at that. Most of the errors in the correspondence between source and asm are made in the debugger (I'm trusting Objdump as the arbiter), but starting from debug info in which the compiler has already made lots of errors still makes things worse. Another place I'd really like some hints about gdb/Insight source code location is in the construction of the text that goes into the asm window. I probably won't understand the code well enough to make the change I'd like, but I want to look and see if I can. On each line of asm code, I'd like to display the line number of the source line that the debugger thinks the debug info tells it corresponds. If I really understood how to change things, I'd also color code that, to flag the places where the source filename is different. If practical, I would also drop the display of the function+offset version of the address. In C++ code function names get too messy and the plain virtual address should be good enough. 
> >> If I try to view the registers window anytime after pressing Run, the >> whole debugger crashes. If I view the register window first, it >> appears, then when I press run it populates, then a moment later the >> whole debugger crashes. > > Once again, I am sorry, but I cannot reproduce this (on x86). I haven't checked yet whether I can get a core file. If I can, then I can reload that in gdb and get a backtrace. > > Holy moly! Do they really do that? I admit that I seldom use SRC+ASM, > but the way that code is arranged in insight, I would have thought > that it was impossible for the SRC and ASM panes to swap back and > forth. I would definitely like to see a test case for this. I'm not certain, but I think it happens when the debugger thinks the source file has changed. I do a step across an asm instruction and (because of inlining) the source file might change. The compiler's rules (especially GCC) for identifying the line number in cases of inlining seem to be seriously bogus. The debug info often stays at the highest level (so the source file wouldn't change) but sometimes digs in a layer or two. By any reasonable definition, the source file identified by the debug info differs from reality and the understanding by the debugger differs again from that debug info. But at some points the debugger understands the source file has changed on a step of an ordinary instruction within a function. Then the contents of the source pane change to the other position and if the source pane was on top, it moves to the bottom. I'm not sure that case can also move the source pane back to the top, but something can, because it doesn't stay on the bottom. If it shouldn't swap, where should it stay (top or bottom)? > Insight does not do anything with breakpoints other than display them. > This is almost certainly a gdb problem. 
Nonetheless, given that I've > been working on gdb and C++, I would appreciate it if you could send > me a test case for this so that I can integrate it into the test suite. When you set a breakpoint in asm view, Insight sometimes displays the red mark in both asm view and source view. Is it really gdb code that tells it where to display the red mark in source view? I hope I wasn't unclear about that pair of bugs. When I set a breakpoint in asm view gdb/Insight seem to be rock solid about setting the breakpoint on the correct asm instruction and when you press continue actually stopping on the correct asm instruction. The problems are that it usually doesn't (but sometimes does) mark a line in the source view when you set the breakpoint (even though the Objdump output from the same code identifies a source line for that asm line that is within view). The other problem is when it hits the breakpoint, it displays the wrong file in the source pane. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs. 2008-09-18 21:37 ` John Fine @ 2008-09-19 7:26 ` Keith Seitz 2008-09-19 14:27 ` John Fine 0 siblings, 1 reply; 16+ messages in thread From: Keith Seitz @ 2008-09-19 7:26 UTC (permalink / raw) To: John Fine; +Cc: insight John Fine wrote: > Can you give me some hints where in the GDB/Insight source code certain > operations are done? I think I might be able to fix or at least > understand some of these bugs myself. Yes, I can provide some pointers about where to look for certain things, but my knowledge of gdb is rusty. The good news is that it is now my job to make gdb a better c++ debugger, so my knowledge grows every day. > I'd especially like to have the nexti button actually work (rather than > usually step in). I can't imagine how that bug could be the GUI's > fault. It only makes sense if the GDB code for nexti can't recognize a > call instruction in x86_64 asm code. I'd really like to look at the > code in gdb that tries to determine whether the next instruction is a > call. (Maybe it is confused by the possible encodings of call or maybe > it is confused by the 64-bit virtual address and sometimes looking in > the wrong place.) But I don't know how to find that code. First things first: verify that this is a problem with core gdb (which it would have to be IMO). Run the command-line gdb (insight -nw or insight --i=console), load your file (pass as argument or "file MYAPP" at gdb prompt), set a break somewhere ("break SOMEWHERE"), and try to nexti. If it doesn't work here, it is definitely a gdb problem. Debugging this, though, is going to be a pain. A real pain. You should set a breakpoint at "nexti_command" in infcmd.c and step from there. I haven't looked at this code in almost a decade, but I do remember it being particularly nasty. >> Isolated test cases are very useful. > I'll see what I can find time to create. 
So far I haven't tried to > debug anything small, so I don't know whether Insight would behave > differently. I haven't found any target programs in which insight > doesn't have these malfunctions. There are calls in each program for > which gdb/Insight can correctly step over or out. There are lines in > the source code of each for which gdb/Insight seems to understand the > correspondence between source and asm. But in every program, there are > lines where it doesn't. If you can find a test case -- ANY test case -- that you can send me, that would greatly expedite this process. > A lot of the debugging I attempted, and some of the worst gdb/Insight > behavior I saw) was with the program opannotate that is part of oprofile > package. I am quite familiar with opannotate (having written the eclipse-oprofile RPM), so I'll take a look at this and see if I cannot get any further with your problems. > It is Oprofile version 0.9.3 (because the build for the newer Oprofile > was unhappy with some older libraries on our Centos system). It was > built with gcc version 3.4.6 with switches -g O2. gcc 3.4.6? Wow, that's over two and one-half years old -- ancient by gcc/gdb standards *for C++*. I will try to grab a copy of this and see how executables built by this behave with gdb. I am going to guess that this is a big part of the problem. I don't suppose you could try gcc 4.1 or 4.4? > In Opannotate, I just wanted to understand some of the basic flow of the > program that I couldn't figure out from the source code. So I just > wanted to do a bunch of step over and a few step into operations. But I > spent all my time trying to get out of the functions that I stepped into > using the step over button and never reached the parts I wanted to see. I'll give this a try. > Most of what I'm attempting to debug is in a large closed source project > compiled with the Intel 10.0 compiler. I don't know much about the Intel 10.0 compiler or how well gdb works with it. I'll ask around. 
> Another place I'd really like some hints about gdb/Insight source code > location is in the construction of the text that goes into the asm > window. I probably won't understand the code well enough to make the > change I'd like, but I want to look and see if I can. On each line of > asm code, I'd like display the line number of the source line that the > debugger thinks the debug info tells it corresponds. If I really > understood how to change things, I'd also color code that, to flag the > places where the source filename is different. If practical, I would > also drop the display of the function+offset version of the address. In > C++ code function names get too messy and the plain virtual address > should be good enough. That occurs in gdb_load_assembly in src/gdb/gdbtk/generic/gdbtk-cmds.c. > I'm not certain, but I think it happens when the debugger thinks the > source file has changed. I do a step across an asm instruction and > (because of inlining) the source file might change. If I am reading this right, the executable has changed and the asm pane is now obsolete. There could be a bug here w.r.t. updating the contents of the asm pane (a bug I've squashed several times over the years), but I cannot think how this could cause the panes to swap places off the top of my head. Again, I'll take a look and see if I can reproduce this given the new information. > If it shouldn't swap, where should it stay (top or bottom)? SRC pane should always be on top, ASM always on the bottom. > When you set a breakpoint in asm view, Insight sometimes displays the > red mark in both asm view and source view. Is it really gdb code that > tells it where to display the red mark in source view? Yes, because only the backend (gdb) knows how to translate between the two. Insight is actually pretty stupid most of the time: it was designed this way so that getting a new architecture to run on it would require getting gdb running. 
Even today, with every new arch added to gdb, the only time insight needs modification is if a new target is introduced. So, there we are: more investigation is necessary. I hope to get to some of this tonight. Keith ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs. 2008-09-19 7:26 ` Keith Seitz @ 2008-09-19 14:27 ` John Fine 2008-09-19 18:47 ` Keith Seitz 0 siblings, 1 reply; 16+ messages in thread From: John Fine @ 2008-09-19 14:27 UTC (permalink / raw) To: Keith Seitz; +Cc: insight Keith Seitz wrote: > The good news is that it is now my job to make gdb a better c++ > debugger, so my knowledge grows every day. Can I put in an early request (maybe you've heard this one already) to do something about the absurdly long function names. In viewing disassembly, you get function name + offset as one of the items on the line, but in typical templated code the function name is so long that the actual disassembly is pushed off the right edge of the display. I can't think of a decent method to make a more concise display of function name (maybe you can). But failing that, I think you need an option to drop it entirely. Or is there such an option already that I just haven't found? I'm not very good at using gdb. function name + offset was never such valuable information that we really needed it and when it is too bulky to use, it should go away. For Insight's SRC+ASM view, I think putting the source line number on each line in both panels would help a lot (again if that option is already there, I'm not expert). While that is less critical for gdb itself than for a GUI, I think it would be a very useful option in gdb itself (disassemble with the source line number shown on each line of disassembly). Hopefully other GUI's layered on gdb would give the user access to that feature. If you're not totally turning off optimization, any user looking at a disassembly spends a lot of time figuring out which source line is connected to each asm line. It would make more sense for the debugger to just display the compiler's version of that information. ^ permalink raw reply [flat|nested] 16+ messages in thread
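The per-line annotation requested above (plain address, source line number, a flag where the source file changes, and no function+offset) could look roughly like the following. This is only a sketch: format_asm_line and its parameters are hypothetical and are not part of gdb or of gdb_load_assembly, and a '*' marker stands in for the color coding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter, not a gdb function.  Writes one disassembly
   line into OUT as: plain virtual address, a marker column, the source
   line number, then the instruction text.  The marker is '*' when FILE
   differs from PREV_FILE (the file of the previous asm line), otherwise
   a blank.  Returns the value of snprintf.  */
static int
format_asm_line (char *out, size_t out_size, unsigned long long addr,
                 int src_line, const char *file, const char *prev_file,
                 const char *insn)
{
  char marker;

  assert (out != NULL && file != NULL && prev_file != NULL && insn != NULL);

  marker = strcmp (file, prev_file) != 0 ? '*' : ' ';
  return snprintf (out, out_size, "0x%016llx %c%5d  %s",
                   addr, marker, src_line, insn);
}
```

A caller would keep the file name of the previous line and pass it as PREV_FILE, so that a switch into inlined code from a header shows up as a '*' in the marker column.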
* Re: Can't debug x86_64 C++ programs. 2008-09-19 14:27 ` John Fine @ 2008-09-19 18:47 ` Keith Seitz 0 siblings, 0 replies; 16+ messages in thread From: Keith Seitz @ 2008-09-19 18:47 UTC (permalink / raw) To: John Fine; +Cc: insight John Fine wrote: > I can't think of a decent method to make a more concise display of > function name (maybe you can). But failing that, I think you need an > option to drop it entirely. Or is there such an option already that I > just haven't found? I'm not very good at using gdb. function name + > offset was never such valuable information that we really needed it and > when it is too bulky to use, it should go away. Perhaps the name could be truncated to, say, N leading characters, "...", and M trailing characters, but I think you are right: we need to offer the option to drop it entirely. I don't know if gdb will let us do that, but I can certainly give it a go. > For Insight's SRC+ASM view, I think putting the source line number on > each line in both panels would help a lot (again if that option is > already there, I'm not expert). While that is less critical for gdb > itself than for a GUI, I think it would be a very useful option in gdb > itself (disassemble with the source line number shown on each line of > disassembly). Hopefully other GUI's layered on gdb would give the user > access to that feature. I don't really ever use SRC+ASM mode. I usually use MIXED, which intersperses source and assembly. For me, it is much, much easier to read. Or perhaps having the two side-by-side with callouts (as is common in many diffing programs) or some other scheme. Keith ^ permalink raw reply [flat|nested] 16+ messages in thread
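The truncation idea above (N leading characters, "...", M trailing characters) is simple enough to sketch. The helper below is hypothetical and is not an existing gdb or Insight function; it assumes the name is an ordinary NUL-terminated string and leaves short names untouched.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a gdb function.  Copies NAME into OUT,
   keeping at most LEAD leading and TRAIL trailing characters with
   "..." between them.  Names that already fit within
   LEAD + 3 + TRAIL characters are copied unchanged.  Returns OUT.  */
static char *
abbreviate_name (const char *name, size_t lead, size_t trail,
                 char *out, size_t out_size)
{
  size_t len;

  assert (name != NULL && out != NULL && out_size > 0);

  len = strlen (name);
  if (len <= lead + 3 + trail)
    snprintf (out, out_size, "%s", name);
  else
    snprintf (out, out_size, "%.*s...%s",
              (int) lead, name, name + len - trail);
  return out;
}
```

Keeping the tail as well as the head matters for templated C++ names, where the distinguishing part (the method name) usually sits at the end.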
* Re: Can't debug x86_64 C++ programs. 2008-09-18 21:19 ` Keith Seitz 2008-09-18 21:37 ` John Fine @ 2008-09-18 22:11 ` John Fine 2008-09-18 22:27 ` John Fine 2 siblings, 0 replies; 16+ messages in thread From: John Fine @ 2008-09-18 22:11 UTC (permalink / raw) To: Keith Seitz; +Cc: insight Keith Seitz wrote: > >> If I try to view the registers window anytime after pressing Run, the >> whole debugger crashes. If I view the register window first, it >> appears, then when I press run it populates, then a moment later the >> whole debugger crashes. > > Once again, I am sorry, but I cannot reproduce this (on x86). > I got the core file, then used gdb to get a backtrace. The beginning of that is

#0 get_register (regnum=2, arg=<value optimized out>) at ../../insight-6.8/gdb/gdbtk/generic/gdbtk-register.c:331
#1 0x00000000004addd8 in gdb_register_info (clientData=<value optimized out>, interp=0xa93ab0, objc=<value optimized out>, objv=0xa96148) at ../../insight-6.8/gdb/gdbtk/generic/gdbtk-register.c:426
#2 0x00000000004a8ff7 in wrapped_call (opaque_args=0x7fbfff5cf0) at ../../insight-6.8/gdb/gdbtk/generic/gdbtk-cmds.c:423
#3 0x0000000000505f3b in catch_errors (func=0x4a8fe0 <wrapped_call>, func_args=0x7fbfff5cf0, errstring=0x77ecc2 "", mask=<value optimized out>) at ../../insight-6.8/gdb/exceptions.c:513
#4 0x00000000004a8ecb in gdbtk_call_wrapper (clientData=0x4adb90, interp=0xa93ab0, objc=3, objv=0xa96138) at ../../insight-6.8/gdb/gdbtk/generic/gdbtk-cmds.c:354
#5 0x00000000007069b1 in TclEvalObjvInternal (interp=0xa93ab0, objc=3, objv=0xa96138, command=0x0, length=0, flags=0) at ../../../insight-6.8/tcl/unix/../generic/tclBasic.c:3048
#6 0x000000000072a414 in TclExecuteByteCode (interp=0xa93ab0, codePtr=0x196f350) at ../../../insight-6.8/tcl/unix/../generic/tclExecute.c:1431
#7 0x000000000072d548 in TclCompEvalObj (interp=0xa93ab0, objPtr=0x16559e0) at ../../../insight-6.8/tcl/unix/../generic/tclExecute.c:1008
#8 0x00000000007089e7 in Tcl_EvalObjEx (interp=0x3e30031640, objPtr=0x51, flags=0) at ../../../insight-6.8/tcl/unix/../generic/tclBasic.c:3944
#9 0x0000000000669f41 in Itcl_EvalMemberCode (interp=0xa93ab0, mfunc=0x16d6b40, member=0x16d6b70, contextObj=0x16d0f20, objc=2, objv=0xa96128) at /home/fine/insight-6.8/itcl/itcl/generic/itcl_methods.c:1003
#10 0x000000000066a9e6 in Itcl_ExecMethod (clientData=<value optimized out>, interp=0xa93ab0, objc=2, objv=0xa96128) at /home/fine/insight-6.8/itcl/itcl/generic/itcl_methods.c:1516
#11 0x00000000007069b1 in TclEvalObjvInternal (interp=0xa93ab0, objc=2, objv=0xa96128, command=0x0, length=0, flags=0) at ../../../insight-6.8/tcl/unix/../generic/tclBasic.c:3048
#12 0x000000000072a414 in TclExecuteByteCode (interp=0xa93ab0, codePtr=0x197c8a0) at ../../../insight-6.8/tcl/unix/../generic/tclExecute.c:1431

I'll try to dig into gdbtk-register.c myself and see if I can understand the problem. But meanwhile, I thought the backtrace might help you tell me what to look for. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs. 2008-09-18 21:19 ` Keith Seitz 2008-09-18 21:37 ` John Fine 2008-09-18 22:11 ` John Fine @ 2008-09-18 22:27 ` John Fine 2008-09-18 23:04 ` Keith Seitz 2 siblings, 1 reply; 16+ messages in thread From: John Fine @ 2008-09-18 22:27 UTC (permalink / raw) To: Keith Seitz; +Cc: insight Keith Seitz wrote: >> If I try to view the registers window anytime after pressing Run, the >> whole debugger crashes. If I view the register window first, it >> appears, then when I press run it populates, then a moment later the >> whole debugger crashes. > > Once again, I am sorry, but I cannot reproduce this (on x86). > Can you give me an expert opinion on these lines of code in gdbtk-register.c

    regformat = (int *)xcalloc (numregs, sizeof(int));
    regtype = (struct type **)xcalloc (numregs, sizeof(struct type **));

especially that sizeof(int) I know that whatever object lies directly in physical ram before the allocation of regtype has overflowed and corrupted regtype. I think the above allocations happen early enough in initialization that they would sequentially grab new memory (rather than reuse chunks that would tend to be distant from each other). So (barely more than guess) I think regformat is the object using more memory than was allocated for it and overflowing into regtype. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs. 2008-09-18 22:27 ` John Fine @ 2008-09-18 23:04 ` Keith Seitz 2008-09-18 23:25 ` John Fine 0 siblings, 1 reply; 16+ messages in thread From: Keith Seitz @ 2008-09-18 23:04 UTC (permalink / raw) To: John Fine; +Cc: insight John Fine wrote: > Can you give me an expert opinion on these lines of code in > gdbtk-register.c > > regformat = (int *)xcalloc (numregs, sizeof(int)); > regtype = (struct type **)xcalloc (numregs, sizeof(struct type **)); > > especially that sizeof(int) The file global regformat simply holds a bunch of ints which are passed to gdb to specify the output format for the register -- one per register. > So (barely more than guess) I think regformat is the object using more > memory than was allocated for it and overflowing into regtype. That sounds like a plausible explanation. You might try to see if numregs is changing or if the register number is greater than the amount that has been allocated. It shouldn't, but who knows. There has been a fair amount of churn in this area in gdb over the last year. Keith ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Can't debug x86_64 C++ programs.
From: John Fine @ 2008-09-18 23:25 UTC
To: Keith Seitz; +Cc: insight

Keith Seitz wrote:
>
> That sounds like a plausible explanation. You might try to see if
> numregs is changing or if the register number is greater than the
> amount that has been allocated.
>
> It shouldn't, but who knows. There has been a fair amount of churn in
> this area in gdb over the last year.
>

This is a situation in which I really could use a decent 64-bit debugger (to find the bug in gdb/Insight). But I don't know how to use gdb beyond the simplest operations. I wouldn't even know how to try to use gdb to debug itself.

So I wasted lots of time trying to use

  fprintf_unfiltered (gdb_stdlog,

Then I realized most of those were just trying to pop up in a window when Insight crashed, so I would never get to see them. So I tried ordinary printf and verified what I expected. But I still don't know enough about gdb/Insight internals to know exactly where the bug is. I'm not even sure yet whether it is gdb or Insight (though I'm looking at code that is apparently gdb).

I put printf's in

  architecture_changed_event
  deprecated_current_gdbarch_select_hack
  setup_architecture_data

and

  gdb_regformat

The output I get from my printf's is

  deprecated_current_gdbarch_select_hack() current_gdbarch=0xa89450
  architecture_changed_event
  setup_architecture_data() current_gdbarch=0xa89450 numregs=50 old_regs=0xc98ab0 regformat=0xc98de0 regtype=0xc98eb0
  deprecated_current_gdbarch_select_hack() current_gdbarch=0xc9de00
  architecture_changed_event
  gdb_regformat() current_gdbarch=c9de00 numregs=58 regno=0 fm=0x78
  gdb_regformat() current_gdbarch=c9de00 numregs=58 regno=1 fm=0x78
  ...
  gdb_regformat() current_gdbarch=c9de00 numregs=58 regno=55 fm=0x78
  gdb_regformat() current_gdbarch=c9de00 numregs=58 regno=56 fm=0x78

Notice something calls deprecated_current_gdbarch_select_hack, which calls architecture_changed_event, then something calls setup_architecture_data. I think architecture_changed_event OUGHT to call setup_architecture_data, but I can't tell from the source code what it actually does or who calls setup_architecture_data.

Then notice the second time that deprecated_current_gdbarch_select_hack is called it still calls architecture_changed_event, but nothing calls setup_architecture_data, so the data structures are still allocated for 50 registers, but now there are 58. The registers 0 through 56 get their formats set, trashing regtype, which as I expected does follow regformat in memory.

The question of what is supposed to call setup_architecture_data on/after that second call to deprecated_current_gdbarch_select_hack is way beyond both my understanding of the gdb source code and my ability to debug things by adding printf's.
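The addresses in the printf output above are enough to check the overflow theory by hand. The sketch below does that arithmetic, assuming a 4-byte int (as on x86_64 Linux) and using only the numbers printed above; the function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Given the start addresses of two heap blocks (lower < upper) and the
   element size of the array in the lower block, return the first index
   of that array whose storage reaches into the upper block. */
size_t first_overlapping_index(uintptr_t lower, uintptr_t upper,
                               size_t elem_size)
{
    return (size_t)((upper - lower) / elem_size);
}

/* Number of bytes written into the upper block when indices 0..last_index
   of the lower array are all stored to. Returns 0 if none reach it. */
size_t bytes_clobbered(uintptr_t lower, uintptr_t upper,
                       size_t elem_size, size_t last_index)
{
    uintptr_t end = lower + (uintptr_t)((last_index + 1) * elem_size);
    return end > upper ? (size_t)(end - upper) : 0;
}
```

For regformat=0xc98de0 and regtype=0xc98eb0 the blocks are 0xd0 = 208 bytes apart, so `first_overlapping_index(0xc98de0, 0xc98eb0, 4)` is 52. That index is past the 50 allocated entries but within the 0 through 56 range that was written, and `bytes_clobbered(0xc98de0, 0xc98eb0, 4, 56)` gives 20 bytes landing at the start of regtype, which is consistent with the report.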
* Re: Can't debug x86_64 C++ programs.
From: Keith Seitz @ 2008-09-19 8:20 UTC
To: John Fine; +Cc: insight

John Fine wrote:
> So I wasted lots of time trying to use
> fprintf_unfiltered (gdb_stdlog,
>
> Then I realized most of those were just trying to pop up in a window
> when Insight crashed, so I would never get to see them.

Yeah, the *_unfiltered and *_filtered functions all go through uiout, which is redirected to insight and tcl/tk. As you discovered, you need to use "normal" printfs.

> I think architecture_changed_event OUGHT to call
> setup_architecture_data, but I can't tell from the source code what it
> actually does or who calls setup_architecture_data.

Here's what's going on. When you load your x86_64 executable, gdb notices that this is different from the current i386 architecture, so it changes the current architecture to i386:x86_64. When this is changed, an observer notification is fired off from deprecated_current_gdbarch_select_hack (deprecated? I guess gdb doesn't want anyone to know that the architecture has changed??).

Insight registers gdbtk_architecture_changed as the observer callback. This then calls the Tcl procedure gdbtk_tcl_architecture_changed, which dispatches this event to the windows. The register window is currently the only window that does anything with this event. It calls the proc gdb_reg_arch_changed, which is aliased to the C function setup_architecture_data. [Phew!]

So, alas, setup_architecture_data *is* being called. This will require further investigation.

Keith
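The dispatch chain described above is an observer pattern: a notification fires on an architecture change and a registered callback re-allocates the per-register data. A stripped-down sketch in plain C follows. All names here are hypothetical stand-ins; this is not gdb's observer API or Insight's Tcl layer, only the shape of the mechanism.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void (*arch_observer)(int new_numregs);

#define MAX_OBSERVERS 8
static arch_observer observers[MAX_OBSERVERS];
static size_t n_observers;

/* Register a callback to run whenever the architecture changes.
   Returns 0 on success, -1 if the table is full. */
int attach_arch_observer(arch_observer cb)
{
    if (n_observers == MAX_OBSERVERS)
        return -1;
    observers[n_observers++] = cb;
    return 0;
}

/* Stand-in for the notification fired on an architecture change. */
void notify_arch_changed(int new_numregs)
{
    size_t i;
    for (i = 0; i < n_observers; i++)
        observers[i](new_numregs);
}

static int *regformat;      /* per-register data owned by the "window" */
static int regformat_len;

/* Stand-in for setup_architecture_data: release old data, calloc new. */
void resize_register_data(int new_numregs)
{
    free(regformat);
    regformat = calloc((size_t)new_numregs, sizeof *regformat);
    regformat_len = regformat ? new_numregs : 0;
}

int register_data_len(void)
{
    return regformat_len;
}
```

If the callback is attached before the second notification, the array tracks the register count (50, then 58). If any link in the chain is missing, the array keeps its old size, which is the failure being discussed in this thread.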
* Re: Can't debug x86_64 C++ programs.
From: John Fine @ 2008-09-19 13:27 UTC
To: Keith Seitz; +Cc: insight

Keith Seitz wrote:
>
> Insight registers gdbtk_architecture_changed as the observer callback.

That's the part I didn't have a clue about, so you've given me at least the name I can grep for to get a clue.

> It calls the proc gdb_reg_arch_changed, which is aliased to the C
> function setup_architecture_data. [Phew!] So, alas,
> setup_architecture_data *is* being called.

But I had a printf in setup_architecture_data, so I can be quite sure it is only called the first time (for the architecture with 50 registers) and it is not called the second time (for the architecture with 58 registers).

I'm a bit confused that i386 has as many as 50 registers and much more confused that AMD64 has only 8 more than i386. I know for sure about 16 registers that are not in i386 but are part of AMD64 (and are working correctly in the registers window, now that I kludged around the memory clobber). But that 58 vs. 50 issue is just idle curiosity. The big question is how you avoid the memory clobber.

Is setup_architecture_data really called twice for 50 registers the first time and 58 the second? Or does your copy start in AMD64 architecture? Or does your memory allocation land differently, so the memory clobber happens without symptom? Or what?

Is the fact that my copy of Insight was built with gcc 3.4.6 more significant than the fact that some of my target programs were built with gcc 3.4.6?

The group with which my project needs to keep gcc version compatibility is about to switch versions, I think to 4.1.2. We have that version of gcc already installed here, but not as the default. I'm not very expert in Linux. So I know how to pick a non-default version of gcc for our own projects built with bjam, but I don't know how to do it for a source package like Insight, with its rather complicated makefile. Tomorrow I'll look into building both Insight and Oprofile with the newer gcc.
* Re: Can't debug x86_64 C++ programs.
From: Keith Seitz @ 2008-09-19 16:38 UTC
To: John Fine; +Cc: insight

John Fine wrote:
> But I had a printf in setup_architecture_data, so I can be quite sure it
> is only called the first time (for the architecture with 50 registers)
> and it is not called the second time (for the architecture with 58
> registers).

setup_architecture_data is called twice, once on startup (in which case the architecture is i386 w/50 registers) and then once after "file" command is issued (when the arch changes to i386:x86_64 w/58 registers) -- at least it is in my copy. You might be seeing an old bug?

From what I can tell, the change to add the call to setup_architecture_data was committed on 28 Jun 2007:

  2007-06-27  Keith Seitz  <keiths@redhat.com>

      * generic/gdbtk-register.c (Gdbtk_Register_Init): Remove calls to
      deprecated_register_gdbarch_swap. Add "gdb_reg_arch_changed" command.
      * library/regwin.itb (arch_changed): Call gdb_reg_arch_changed.

Is this in your copy of the sources?

> I'm a bit confused that i386 has as many as 50 registers and much more
> confused that AMD64 has only 8 more than i386. I know for sure about 16
> registers that are not in i386 but are part of AMD64 (and are working
> correctly in the registers window, now that I kludged around the memory
> clobber). But that 58 vs. 50 issue is just idle curiosity. The big
> question is how you avoid the memory clobber.

The register names all come from gdb. Insight has no specific knowledge about them. [Remember I said earlier that insight was really ignorant of architecture-specific things in order to facilitate bring-up of new architectures? That applies here, too.] Insight gets the list of registers (and register groups) from gdb directly.

If your copy of the sources contains the patch mentioned above, there must be another reason for the clobber. Have you tried to "watch" the memory involved? [If you know the location of a specific clobber, you can do "watch *0xADDR" to find out what clobbered the memory.]

AFAICT, there is no memory clobber: setup_architecture_data is called on the architecture_changed_event (albeit in a very obfuscated way for some reason); it releases the old data and callocs new space.

> Is setup_architecture_data really called twice for 50 registers the
> first time and 58 the second? Or does your copy start in AMD64
> architecture? Or does your memory allocation land differently, so the
> memory clobber happens without symptom? Or what?

Yes. The first call is from Gdbtk_Register_Init on startup (arch=i386, numregs=50). The second time is after the "file" command to load the x86_64 executable (when arch changes to i386:x86_64, numregs=58).

> Is the fact that my copy of Insight was built with gcc 3.4.6 more
> significant than the fact that some of my target programs were built
> with gcc 3.4.6?

I would guess that the former is not a problem; gdb/insight is, after all, just a simple C program. Your target code, though, in C++ w/gcc 3.4.6 is what I would be most suspicious of.

> Tomorrow
> I'll look into building both Insight and Oprofile with the newer gcc.

If you are just building insight from sources, you can simply set CC in your environment to use a different compiler from the default one on your PATH. For example,

  $ which gcc
  /usr/bin/gcc
  $ CC=/home/keiths/work/gcj/built/bin/gcc ../src/configure
  [...]
  $ make all-gdb

This forces the insight build to use the specified compiler (/home/keiths/work/gcj/built/bin/gcc) instead of the one in your path (/usr/bin/gcc).

I don't recall how to do it with oprofile (since oprofile uses C++), but normally the procedure is similar (although C++ uses a different env var -- try passing "--help" to oprofile's configure for a list of options).

Keith
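A hardware watchpoint, as suggested above, is the direct way to catch the offending write. Where that is awkward, a different technique, a guard word placed right after the array, can at least confirm that an overflow happened. The sketch below is a generic illustration with made-up names; nothing in it comes from gdb or Insight.

```c
#include <stdlib.h>

/* Arbitrary bit pattern that real data is unlikely to match. */
#define GUARD_VALUE 0x5aa55aa5

/* Allocate n zeroed ints plus one trailing guard word.
   Returns NULL on allocation failure. */
int *guarded_alloc(size_t n)
{
    int *p = calloc(n + 1, sizeof *p);
    if (p)
        p[n] = GUARD_VALUE;
    return p;
}

/* Returns 1 if the guard word after the n-element array is intact,
   0 if something has overwritten it. */
int guard_intact(const int *p, size_t n)
{
    return p[n] == GUARD_VALUE;
}
```

Checking `guard_intact` at a few points between allocation and the crash narrows down when the overflow occurs; it does not say who did the write, which is what the watchpoint is for.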
* Re: Can't debug x86_64 C++ programs.
From: John Fine @ 2008-09-20 21:22 UTC
To: Keith Seitz; +Cc: insight

I really appreciate all the time you've spent helping me, especially as it is getting more and more likely that something is wrong with the environment in which I compiled Insight, rather than with Insight's source code. But I still don't know how to find out what was wrong.

Keith Seitz wrote:
> setup_architecture_data is called twice, once on startup (in which
> case the architecture is i386 w/50 registers) and then once after
> "file" command is issued (when the arch changes to i386:x86_64 w/58
> registers) -- at least it is in my copy. You might be seeing an old bug?

I have the lines you quoted in insight-6.8\gdb\gdbtk\ChangeLog-2007. I don't know how to be certain I have the correction that goes with it. But in regwin.itb, in RegWin::arch_changed it does call gdb_reg_arch_changed

However, I am quite sure:
1) setup_architecture_data is called only once (directly from Gdbtk_Register_Init)
2) (Before my kludge to allocate excess memory in setup_architecture_data) regformat did overflow corrupting regtype

I'm still not sure I understand how the call to setup_architecture_data when the architecture changes is supposed to occur. Much to my surprise, debugging the broken Insight inside the broken Insight is moderately practical, so ...

I set a breakpoint inside architecture_changed_event (in gdb-events.c) on the line

  if (!current_event_hooks->architecture_changed)

and I verified that current_event_hooks->architecture_changed is equal to zero, so architecture_changed_event does nothing.

I expect that is where the second call to setup_architecture_data is supposed to happen, so current_event_hooks->architecture_changed is not supposed to be zero, so now I have to search/guess for the code that is supposed to set current_event_hooks->architecture_changed to something nonzero.

Can you confirm that current_event_hooks->architecture_changed is supposed to be nonzero at that point and would be the path to setup_architecture_data if it were working?
* Re: Can't debug x86_64 C++ programs.
From: John Fine @ 2008-09-22 18:35 UTC
To: Keith Seitz; +Cc: insight

John Fine wrote:
> I set a breakpoint inside architecture_changed_event (in gdb-events.c)
> on the line
> if (!current_event_hooks->architecture_changed)
> and I verified that current_event_hooks->architecture_changed is equal
> to zero, so architecture_changed_event does nothing.

Oops. That was sloppy debugging on my part. current_event_hooks->architecture_changed is equal to zero the first time architecture_changed_event is called.

But architecture_changed_event is called again later and current_event_hooks->architecture_changed then points to gdbtk_architecture_changed, which calls

  Tcl_Eval(gdbtk_interp, "gdbtk_tcl_architecture_changed");

That still doesn't end up calling setup_architecture_data. I don't know whether I can debug well enough to find out why.
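What John observed, a hook that is null on the first event and set on a later one, is the usual shape of an optional-hook table: events fired before the hook is installed are silently dropped. A minimal sketch with hypothetical names follows; the real structures live in gdb-events.c and are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical hook table with a single optional callback. */
struct event_hooks {
    void (*architecture_changed)(void);
};

static struct event_hooks hooks;   /* static storage: hook starts as NULL */
static int calls;                  /* how many times the hook actually ran */

static void count_call(void)
{
    calls++;
}

/* Fire the event; do nothing if no hook has been installed yet. */
void fire_architecture_changed(void)
{
    if (!hooks.architecture_changed)
        return;
    hooks.architecture_changed();
}

/* Install the hook, as a GUI layer would once it has initialized. */
void install_counting_hook(void)
{
    hooks.architecture_changed = count_call;
}

int hook_calls(void)
{
    return calls;
}
```

A breakpoint on the null check therefore sees zero on an early event and a real function pointer on a later one, so only the later hit says anything about whether the chain below the hook works.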
* Re: Can't debug x86_64 C++ programs.
From: Keith Seitz @ 2008-09-19 2:17 UTC
To: John Fine; +Cc: insight

I've finally built x86_64 insight from CVS HEAD. And the news isn't good...

I'm going to omit the entirety of your message, because my response is the same on all of them: I could not reproduce the problems you reported -- not one of them. :-(

The only problem I was able to reproduce is the pending-breakpoints-crash problem, but there is a patch floating around for that (which I will check in before EOD).

I know that's not what you want to hear, but that tells us something: we have a chance at resolving this because it does work for me.

I'll respond to your other messages with more detail shortly.

Keith
Thread overview: 16+ messages
2008-09-18 20:53 Can't debug x86_64 C++ programs John Fine
2008-09-18 21:19 ` Keith Seitz
2008-09-18 21:37 ` John Fine
2008-09-19 7:26 ` Keith Seitz
2008-09-19 14:27 ` John Fine
2008-09-19 18:47 ` Keith Seitz
2008-09-18 22:11 ` John Fine
2008-09-18 22:27 ` John Fine
2008-09-18 23:04 ` Keith Seitz
2008-09-18 23:25 ` John Fine
2008-09-19 8:20 ` Keith Seitz
2008-09-19 13:27 ` John Fine
2008-09-19 16:38 ` Keith Seitz
2008-09-20 21:22 ` John Fine
2008-09-22 18:35 ` John Fine
2008-09-19 2:17 ` Keith Seitz