* [vta] don't let debug insns get in the way of simple vect reduction
From: Alexandre Oliva @ 2007-11-05  8:28 UTC
To: gcc-patches

libgfortran had some vectorization cases that wouldn't be applied in
the presence of debug stmts referencing the same variables.  Fixed
with the patch below, to be installed shortly.

[-- Attachment: vta-vectorizer-reduction.patch --]

for gcc/ChangeLog.vta
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-vectorizer.c (vect_is_simple_reduction): Disregard uses in
	debug insns.

Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c.orig	2007-09-17 15:31:48.000000000 -0300
+++ gcc/tree-vectorizer.c	2007-11-03 01:44:55.000000000 -0200
@@ -2199,6 +2199,8 @@ vect_is_simple_reduction (loop_vec_info
   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, name)
     {
       tree use_stmt = USE_STMT (use_p);
+      if (IS_DEBUG_STMT (use_stmt))
+	continue;
       if (flow_bb_inside_loop_p (loop, bb_for_stmt (use_stmt))
 	  && vinfo_for_stmt (use_stmt)
 	  && !is_pattern_stmt_p (vinfo_for_stmt (use_stmt)))
@@ -2241,6 +2243,8 @@ vect_is_simple_reduction (loop_vec_info
   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, name)
     {
       tree use_stmt = USE_STMT (use_p);
+      if (IS_DEBUG_STMT (use_stmt))
+	continue;
       if (flow_bb_inside_loop_p (loop, bb_for_stmt (use_stmt))
 	  && vinfo_for_stmt (use_stmt)
 	  && !is_pattern_stmt_p (vinfo_for_stmt (use_stmt)))

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}
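For readers who don't have tree-vectorizer.c at hand, here is a minimal stand-alone sketch of what the two added checks amount to. It is not GCC code: `struct use_stmt`, `count_relevant_uses` and `has_single_relevant_use` are hypothetical names, and the rule that a reduction operand may have only one in-loop use is an assumption made for illustration. The point is only that a debug statement referencing a name must not be counted as a real use of it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement that uses some SSA name.  */
struct use_stmt {
    bool is_debug;     /* true for a debug annotation, false for real code */
    bool inside_loop;  /* true if the statement lives inside the loop */
};

/* Count the uses that matter for the check: uses inside the loop that
   are not debug annotations.  */
static size_t
count_relevant_uses (const struct use_stmt *uses, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (uses[i].is_debug)
            continue;           /* plays the role of IS_DEBUG_STMT in the patch */
        if (uses[i].inside_loop)
            count++;
    }
    return count;
}

/* Assumed rule for this sketch: the operand qualifies only if it has
   exactly one relevant use.  */
static bool
has_single_relevant_use (const struct use_stmt *uses, size_t n)
{
    return count_relevant_uses (uses, n) == 1;
}
```

Without the `is_debug` test, a name with one real use and one debug use would be counted twice and the loop would be rejected, which matches the missed vectorization described in the message.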
* Re: [vta] don't let debug insns get in the way of simple vect reduction
From: Richard Guenther @ 2007-11-05 11:27 UTC
To: Alexandre Oliva; +Cc: gcc-patches

On 11/5/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> libgfortran had some vectorization cases that wouldn't be applied in
> the presence of debug stmts referencing the same variables.  Fixed
> with the patch below, to be installed shortly.

(I'm just picking a random patch of this kind for this mail)

I see you have to touch lots of places to teach them about debug insns. This hints at a design error (as I believe), namely that you encode the debug information in the IL. I believe in the long long thread earlier this year people suggested to use an on-the-side representation for the extra information. Unfortunately nobody has (apparently) looked at the vta code yet and nobody made comments so far.

With the different approach I and Matz started (and on which we haven't yet spent enough time to get debug information actually output - but I hope we'll get there soon), on the tree level the extra information is stored in a bitmap per SSA_NAME (where necessary). On the RTL level we chose to extend the SET insn by a bitmap argument (referring to those bitmaps). With that approach we only need to touch places where debug information is lost (at those places we just propagate this information to the bitmaps).

I realize that the GCC development model does not really support development in the open or steering of technical approaches. Probably due to lack of time and interest. Still I'd ask people to actually look at both approaches and give advice to us implementors.

(And IMHO, debug insns in the IL are the wrong way to go and I would be very unhappy seeing this code get in GCC - no personal offense intended)

Thanks,
Richard.
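To make the on-the-side idea in the message above easier to picture, here is a rough sketch of a per-SSA-name bitmap that is only touched at the places where a pass would otherwise lose information. Everything in it is hypothetical: the table `user_vars`, the fixed size, and the helper names are invented for illustration and are not taken from the Guenther/Matz code, which is not shown in this thread.

```c
#include <stdint.h>

#define MAX_SSA_NAMES 16

/* Hypothetical side table: bit V set in user_vars[N] means that SSA
   name N currently holds the value of user variable number V.  */
static uint32_t user_vars[MAX_SSA_NAMES];

/* Record that SSA name SSA carries the value of user variable VAR.  */
static void
note_user_var (unsigned ssa, unsigned var)
{
    user_vars[ssa] |= UINT32_C (1) << var;
}

/* To be called by a pass that replaces every use of DEAD with KEPT and
   then deletes DEAD (copy propagation, for instance): the information
   attached to DEAD is propagated to KEPT instead of being dropped.  */
static void
merge_on_replace (unsigned kept, unsigned dead)
{
    user_vars[kept] |= user_vars[dead];
    user_vars[dead] = 0;
}

/* Query: does SSA name SSA hold the value of user variable VAR?  */
static int
holds_user_var (unsigned ssa, unsigned var)
{
    return (int) ((user_vars[ssa] >> var) & 1u);
}
```

Note that such a table records which variables a name stands for, but not at which point in the program each variable acquired that value.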
* Designs for better debug info in GCC (was: Re: [vta] don't let debug insns get in the way of simple vect reduction)
From: Alexandre Oliva @ 2007-11-07  7:52 UTC
To: Richard Guenther; +Cc: gcc-patches, gcc

On Nov  5, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:

> On 11/5/07, Alexandre Oliva <aoliva@redhat.com> wrote:
>> libgfortran had some vectorization cases that wouldn't be applied in
>> the presence of debug stmts referencing the same variables.  Fixed
>> with the patch below, to be installed shortly.

> (I'm just picking a random patch of this kind for this mail)

> I see you have to touch lots of places to teach them about debug
> insns.

Yes. There's no escaping that. There are two options:

- keep them separate, and modify the code that manipulates the IL so as to update them as needed, or

- keep them in the IL, and modify the code to disregard them as needed.

I've pondered both alternatives, and decided that the latter was the only testable path. If we had a reliable debug information tester, we could proceed incrementally with the first alternative; it might be viable, but I don't really see that it would make things any simpler. If anything, you'd need to introduce a lot of new code to manipulate the separate representation, unless this separate representation was very similar in structure to the existing representation, and in any case you'd have to add code all over the place to keep it up to date.

With the approach I've taken, there's something that's testable: as long as there are codegen changes, something needs to be fixed.
Besides, the information is encoded in a form that is automatically handled by most compilation passes, so updates for pretty much all transformations are already in place, without any additional code. The only additional code is what's needed to detect missing updates and to ensure the debug notes don't interfere with code generation.

I've managed to implement these such that they don't take any additional memory unless you actually request the additional debug information, and such that they almost never bring any compile-time performance hit. That's one of the reasons that guided the placement of DEBUG_INSN just next to the other INSNs: such that INSN_P is optimized to a range test (as it was before, but now with a different boundary), and INSN_P && !DEBUG_INSN_P is optimized to the original range test. In most other places, it's just yet another entry in a switch table, so again it's zero-cost in terms of performance. And at points where it would be more costly, there's a test guarding the complex processing to tell whether the feature is enabled that requires that additional processing. Hard to beat that.

> I believe in the long long thread earlier this year people suggested
> to use an on-the-side representation for the extra information.

Yes. And I thought I'd already made it clear why this on-the-side representation won't get you as far as I needed to go. Basically, it leads to a situation in which you can't possibly represent correct debug information, or you end up adding annotations to the instruction flow anyway, which means you have to deal with them or give up correct debug information.

Since one of the requirements I was given was that debug information be correct (as in, if I don't know where a variable is, debug information must say so, rather than say the variable is somewhere it really isn't), going without additional annotations just wouldn't work.
Therefore, I figured I'd have to bite the bullet and take the longer path, even though I don't dispute that it is possible to achieve many improvements with the simpler approach. However, eventually the simpler approach runs into a wall, and I couldn't afford to get to that point and then backtrack to the complete approach, because the wall couldn't be surpassed.

> With the different approach I and Matz started (and to which we
> didn't yet spend enough time to get debug information actually
> output - but I hope we'll get there soon), on the tree level the
> extra information is stored in a bitmap per SSA_NAME (where
> necessary).

This will fail on a very fundamental level. Consider code such as:

  f(int x, int y) {
    int c;
    /* other vars */

    c = x;
    do_something_with(c, ...); // doesn't touch x or y
    c = y;
    do_something_else_with(c, ...); // doesn't touch x or y
  }

where do_something_*with are actually complex computations, be that explicit code, be it macros or inlined functions.

This can (and should) be trivially optimized to:

  f(int x, int y) {
    /* other vars */

    do_something_with(x, ...); // doesn't touch x or y
    do_something_else_with(y, ...); // doesn't touch x or y
  }

But now, if I 'print c' in a debugger in the middle of one of the do_something_*with expansions, what do I get?

With the approach I'm implementing, you should get x and y at the appropriate points, even though variable c doesn't really exist any more.

With your approach, what will you get?

There isn't any assignment to x or y you could hook your notes to.

Even if you were to set up side representations to model the additional variables that end up mapped to the incoming arguments, you'd have 'c' in both, and at the entry point. How would you tell?

Sure, you could hand-wave that both assignments were effectively moved to the entry point of the function, and that only the last one prevails. I guess this wouldn't be wrong per se. But would it be the best we could do for the users?
Say, if do_something_with is a loop, and you're monitoring some condition that depends on c and other variables at a point in the middle of an iteration, would you be happy if that didn't work because the compiler told you 'c' evaluated to 'y' rather than 'x'?

Do you realize that the only way you could possibly make the above work as expected by the user is by adding notes at the point of the assignment? And that, once you add such notes, they'd have to map back to the value the variable was supposed to gain at that point, and that thus you must keep them accurate as further optimizations mess with that value? E.g., when f is inlined into another function, its x and y are certainly going to disappear, like they do now, because of copy propagation, and then you'd have to update the notes for the original assignments to c that you've already discarded.

And, well, if you're going to have to add and update notes anyway, why not just bite the bullet and use them all over?

Maybe to save on memory? I guess this could be a good reason for that. We could indeed add a bitmap to gimple assignments indicating which user variables, if any, they modify. And then, if we move them out of place, we can drop the bitmap in the new location, and replace the original location with a note. But this has a number of problems:

1. every single gimple assignment grows by one word, to hold the pointer to the bitmap. But most gimple assignments are to temporary variables, and these don't need annotations. Creating different kinds of gimple assignments would make things quite complex, so I'd rather not go down that path. So, you'd use a lot more memory, even when the feature is not in use at all, and you might likely use more memory than adding separate notes for user assignments like I do. And this doesn't even count the actual bitmaps.

2. this can't possibly handle assignments to parts of large variables.
My current implementation only tracks gimple regs, but there's no reason why it can't be easily extended to handle component refs of largish variables that end up in registers rather than memory, and for other SRA-able variables, even when they aren't fully scalarized. How'd you handle this with a look-aside bitmap? I guess you could generate uids or even decls for other expressions, but it seems to me that this would waste even more space, and get things even more complicated, no?

3. the marked assignment doesn't (can't possibly) denote the correct point at which all variables in its bitmap were assigned to. It only marks the earliest such point for all the variables. Overlapping ranges and incorrect debug info are a consequence of this.

> On the RTL level we chose to extend the SET insn by a
> bitmap argument (referring to those bitmaps).

Same problems here. The memory problem becomes even more critical: even for compilation without debug info, you grow by 33% the most common RTX element that, again, most often assigns to temporaries, and this is without counting the space for the actual bitmaps.

> I realize that the GCC development model does not really support
> development in the open or steering of technical approaches.
> Probably due to lack of time and interest.  Still I'd ask people to
> actually look at both approaches and give advice to us implementors.

+1

> (And IMHO, debug insns in the IL are the wrong way to go and I would
> be very unhappy seeing this code get in GCC - no personal offense
> intended)

No offense taken. I hope I've shown why it can't be helped to add debug annotations to the IL if we're to have any hope of getting correct debug information throughout compiler transformations, and that the decision I've made to accomplish this is one that stands a chance of being reasonably validated automatically.
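To make the argument about annotations at the point of assignment concrete, here is a toy model of the optimized body of f from this message. It is only a sketch under simplifying assumptions: the stream is straight-line (real code needs dataflow across branches and loops), and `struct insn`, `DEBUG_BIND`, `value_at` and `f_body` are hypothetical names, not GCC's representation. A debugger query for c at some position is answered by the most recent annotation before that position, and by "unknown" if there is none.

```c
#include <stddef.h>
#include <string.h>

enum insn_kind { REAL_INSN, DEBUG_BIND };

struct insn {
    enum insn_kind kind;
    const char *var;    /* user variable being bound (DEBUG_BIND only) */
    const char *value;  /* where its value lives; NULL means unknown */
};

/* Return where VAR lives just before the insn at index POS executes,
   or NULL if that is unknown: scan backwards for the most recent
   DEBUG_BIND of VAR.  */
static const char *
value_at (const struct insn *stream, size_t pos, const char *var)
{
    for (size_t i = pos; i-- > 0; ) {
        if (stream[i].kind == DEBUG_BIND && strcmp (stream[i].var, var) == 0)
            return stream[i].value;
    }
    return NULL;
}

/* Optimized body of f: both assignments to c are gone, only the
   annotations left behind at their original positions remain.  */
static const struct insn f_body[] = {
    { DEBUG_BIND, "c", "x" },   /* was: c = x; */
    { REAL_INSN, NULL, NULL },  /* do_something_with (x, ...) */
    { DEBUG_BIND, "c", "y" },   /* was: c = y; */
    { REAL_INSN, NULL, NULL },  /* do_something_else_with (y, ...) */
};
```

In this model `value_at (f_body, 1, "c")` yields "x" and `value_at (f_body, 3, "c")` yields "y", whereas a scheme that attaches c to both incoming arguments at function entry has no position information with which to tell the two apart.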
I'm somewhat sick of seeing debug information being treated as a second-class citizen, just because it doesn't cause the main program to crash or to produce incorrect results. The more people use monitoring tools that are based on standards-defined debug information, the more important it is that such information be reliable and accurate. Otherwise, even if the program is compiled into code that runs perfectly, the systems built using this program, its debug information and the monitoring tools that use this information will fail just as severely as if we had generated incorrect code for the program in the first place.

I believe that adding debug annotations to the instruction stream, as if they were references to the appropriate values, but subject to the condition that debug information must not alter the generated code, is the most viable approach to get to reliable and accurate debug information.

That said, I do realize it's a long path, and certainly much longer than other approaches that could get you say 80% there, but no further than that and, worse, without any way for the compiler to tell when it hasn't got there so as to inform the user about it.

I just hope having something that gets us 80% there won't become an impediment for a solution that gets us 95% there, while setting foundations that will enable us to get to 100% over time.

Debug information correctness ought to be treated no differently from code generation correctness. Which is not to say that debug information needs to be absolutely perfect and complete. Having an optimization pass that improves code while keeping its behavior correct is a good thing, even if the optimization could be further improved, just like having debug information that reports the location of a variable as unknown at some points, and accurately at all others, even if the debug info generator could be further improved so as to find out where the variable is at some of the points where it lost track of it.
But having the debug info generator report an incorrect location for a variable is as bad as having the optimization pass change the meaning of the program.

I believe that keeping debug annotations as part of the IL is the simplest and most effective way to ensure that optimization passes do deal with them, rather than disregarding them as something unimportant. Besides, the way I've designed them, most of the passes deal with them in the very same fashion as they deal with all other expressions; it's just that some of them need minor tweaks to avoid codegen changes when debug info is being generated.

We could avoid the risk of codegen changes with -g by always generating the annotations, even when not generating debug info, and discarding them at the end, when they could no longer affect the generated code. But then, if there are codegen changes (most often missed optimizations), they will be present both with -g and -g0, so this is undesirable.

Nevertheless, it's an option I've considered adding, and it would be quite easy and helpful to introduce for field tests of this new debug info infrastructure, such that users aren't left out in the cold if their code works with -O2 -g but not with -O2, or vice-versa. I haven't added it yet, for I've had no need for it, but it's a matter of minutes. It shouldn't be default, though, for it would use more memory than needed for -g0.

-- 
Alexandre Oliva
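Earlier in this message the placement of DEBUG_INSN next to the other INSN codes is said to keep INSN_P a single range test, with INSN_P && !DEBUG_INSN_P reducing to the original range test. The following sketch shows why the ordering makes that work. The enumeration is a deliberately simplified, hypothetical one (the real list of codes is much longer), and the helper names are invented for this example.

```c
#include <stdbool.h>

/* Simplified code ordering: DEBUG_INSN sits immediately before the
   contiguous block of real instruction codes.  */
enum insn_code { NOTE, DEBUG_INSN, INSN, JUMP_INSN, CALL_INSN, BARRIER };

static bool
debug_insn_p (enum insn_code c)
{
    return c == DEBUG_INSN;
}

/* With the debug code adjacent to the real ones, "any insn" is still a
   single range test; only the lower bound moved.  */
static bool
insn_p (enum insn_code c)
{
    return c >= DEBUG_INSN && c <= CALL_INSN;
}

/* "Real insn" is the range test as it was before the debug code was
   added, equivalent to insn_p (c) && !debug_insn_p (c).  */
static bool
nondebug_insn_p (enum insn_code c)
{
    return c >= INSN && c <= CALL_INSN;
}
```

Had the debug code been placed elsewhere in the enumeration, `insn_p` would need an extra comparison on every call, which is the compile-time cost the placement avoids.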
* Re: Designs for better debug info in GCC (was: Re: [vta] don't let debug insns get in the way of simple vect reduction)
From: Ian Lance Taylor @ 2007-11-07 16:16 UTC
To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, gcc

Alexandre Oliva <aoliva@redhat.com> writes:

> I've pondered both alternatives, and decided that the latter was the
> only testable path.  If we had a reliable debug information tester, we
> could proceed incrementally with the first alternative; it might be
> viable, but I don't really see that it would make things any simpler.

It seems to me that this is a reason to write a reliable debug information tester.

Your approach gives you a point solution--did anything change today--but it doesn't give us a maintenance solution--did anything change over time?

> Since one of the requirements I was given was that debug information
> be correct (as in, if I don't know where a variable is, debug
> information must say so, rather than say the variable is somewhere it
> really isn't), going without additional annotations just wouldn't
> work.  Therefore, I figured I'd have to bite the bullet and take the
> longer path, even though I don't dispute that it is possible to
> achieve many improvements with the simpler approach.

While I understand that you were given certain requirements, for the purposes of mainline gcc we need to weigh costs and benefits. How many of our users are looking for precise debugging of optimized code, and how much are they willing to pay for that? Will our users overall be better served by the 90% solution?

> 1. every single gimple assignment grows by one word, to hold the
> pointer to the bitmap.  But most gimple assignments are to temporary
> variables, and these don't need annotations.  Creating different kinds
> of gimple assignments would make things quite complex, so I'd rather
> not go down that path.  So, you'd use a lot more memory, even when the
> feature is not in use at all, and you might likely use more memory
> than adding separate notes for user assignments like I do.  And this
> doesn't even count the actual bitmaps.

I expect that most compilations are with -g, so I think we need to compare memory usage between the two approaches with -g.

I don't know what the best approach is for improving debug information. But I think we've learned over time that explicit NOTEs in the RTL were not, in general, a good idea. They complicate optimizations and they tend to get left behind when moving code. We've fixed many many bugs and misoptimizations over the years due to NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake we've made in the past.

Ian
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-11-07 19:11 UTC
To: Ian Lance Taylor; +Cc: Richard Guenther, gcc-patches, gcc

On Nov  7, 2007, Ian Lance Taylor <iant@google.com> wrote:

> Alexandre Oliva <aoliva@redhat.com> writes:
>> I've pondered both alternatives, and decided that the latter was the
>> only testable path.  If we had a reliable debug information tester, we
>> could proceed incrementally with the first alternative; it might be
>> viable, but I don't really see that it would make things any simpler.

> It seems to me that this is a reason to write a reliable debug
> information tester.

Yep. This is in the roadmap. But it's not something that can be done with GCC alone. It's more of a "system" test, that will involve debuggers or monitoring tools. gdb, frysk, systemtap or some such come to mind.

> Your approach gives you a point solution--did anything change
> today--but it doesn't give us a maintenance solution--did anything
> change over time?

Actually, no, your assessment is incorrect.

What I'm providing gives us means to test, at any point in time, that enabling debug information won't cause changes to the generated code. So far, code in the trunk only performs these comparisons within the GCC directory. And, nevertheless, patches that correct obvious divergences have been lingering for months.

I have recently-posted patches that introduce means to test other host and target libraries. I still haven't written testsuite code to enable us to verify that debug information doesn't affect the generated code for existing tests, or for additional tests introduced for this very purpose, but this is in the roadmap.

Of course, none of this guarantees that debug information is accurate or complete, it just helps ensure that -g won't change code generation.
Testing more than this requires a tool that can not only interpret debug information, but also the generated code, and verify that they match. The plan is to use the actual processors (or simulators) to understand the generated code, and existing debug info consumers that are debugging or monitoring tools to verify that debug info reflects the behavior observed by the processor.

> While I understand that you were given certain requirements, for the
> purposes of mainline gcc we need to weigh costs and benefits.  How
> many of our users are looking for precise debugging of optimized code,
> and how much are they willing to pay for that?  Will our users overall
> be better served by the 90% solution?

Does it really matter? Do we compromise standards compliance (and so violently, while at that) in any aspect of the compiler?

What do we tell the growing number of users who don't regard debug information as something useless except for occasional debugging? That GCC cares about standards compliance except for debug information, and they should write their own Free Software compiler if they want a correct, standards-compliant compiler?

Do we accept taking shortcuts for optimizations or other code generation issues when they cause incorrect code to be produced? Why should the mantra "must not sacrifice correctness" not be applicable to debug information standards in GCC?

At this point, debug information is so bad that it's a shame that most builds are done with -O2 -g: we're just wasting CPU cycles and disk space, contributing to accelerate the thermodynamic end of the universe (never mind the Kyoto protocol ;-), for information that is severely incomplete at best, and terribly broken at worst.

Yes, generating correct code may take some more memory and some more CPU cycles. Have we ever made a decision to use less memory or CPU cycles when the result is incorrect code? Why should standardized meta-information about the generated code be any different?

>> 1. every single gimple assignment grows by one word,

I take this back, I'd been misled by richi's description. It's really a side hashtable (which gets me worried about the re-emitted rather than modified gimple assignments in some locations), so it doesn't waste memory for gimple assignments that don't refer to user variables.

Unfortunately, this is not the case for rtx SETs, in this alternate approach.

> I don't know what the best approach is for improving debug
> information.

Your phrasing seems to indicate you're not concerned about fixing debug information, but rather only about making it less broken. With different goals, we can come to very different solutions.

> But I think we've learned over time that explicit NOTEs
> in the RTL were not, in general, a good idea.  They complicate
> optimizations and they tend to get left behind when moving code.

Being left behind is actually a feature. It's one of the reasons why I chose this representation. The debug annotation is not supposed to move along with the SET, because it would then no longer model the source code, it would rather be mangled, often beyond recognition, because of implementation details.

As for complicating optimizations, I can have some sympathy for that. Sure, generating code without preserving the information needed to map source-level concepts to implementation-level concepts is easier. But generating broken code is not an option, it's a bug, so why should it be an acceptable option just because the code we're talking about is meta-information about the executable code?

> We've fixed many many bugs and misoptimizations over the years due to
> NOTEs.  I'm concerned that adding DEBUG_INSN in RTL repeats a mistake
> we've made in the past.

That's a valid concern.
However, per this reasoning, we might as well push every operand in our IL to separate representations, because there have been so many bugs and misoptimizations over the years, especially when the representation didn't make transformations trivially correct.

However, the beauty of the representation I've chosen, which models the annotations as a weak USE of an expression that evaluates to the value of the variable at the point of assignment, is that most compiler passes *will* keep them accurate, where any other representation would have to be dealt with explicitly.

Sure, some passes need to compensate to make sure these weak USEs don't affect codegen or optimizations, and a few need special tweaks to keep notes accurate, to stop the safeguards in place that would discard the information that became inaccurate. But these are few.

I believe strongly that this is the correct trade-off.

-- 
Alexandre Oliva
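The testability property claimed in this message, that enabling debug information must not change the generated code, can be pictured with a toy comparison of two instruction streams. This is only a model of the idea: the messages do not say how the comparison is actually carried out, and `struct insn` and `same_codegen` are hypothetical names introduced for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical instruction: either a debug annotation or real code.  */
struct insn {
    bool is_debug;
    const char *text;
};

/* Return true if streams A and B contain the same real instructions in
   the same order once debug annotations are ignored.  */
static bool
same_codegen (const struct insn *a, size_t na,
              const struct insn *b, size_t nb)
{
    size_t i = 0, j = 0;
    for (;;) {
        /* Skip annotations on both sides.  */
        while (i < na && a[i].is_debug)
            i++;
        while (j < nb && b[j].is_debug)
            j++;
        /* If either stream is exhausted, both must be.  */
        if (i == na || j == nb)
            return i == na && j == nb;
        if (strcmp (a[i].text, b[j].text) != 0)
            return false;
        i++;
        j++;
    }
}
```

A mismatch reported by such a check points at a pass that let an annotation influence code generation, which is the kind of bug the patch at the top of this thread fixes.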
* Re: Designs for better debug info in GCC
From: Ian Lance Taylor @ 2007-11-07 22:57 UTC
To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, gcc

Alexandre Oliva <aoliva@redhat.com> writes:

> > Your approach gives you a point solution--did anything change
> > today--but it doesn't give us a maintenance solution--did anything
> > change over time?
>
> Actually, no, your assessment is incorrect.

Ah, you're right. I was wrong.

> > While I understand that you were given certain requirements, for the
> > purposes of mainline gcc we need to weigh costs and benefits.  How
> > many of our users are looking for precise debugging of optimized code,
> > and how much are they willing to pay for that?  Will our users overall
> > be better served by the 90% solution?
>
> Does it really matter?  Do we compromise standards compliance (and so
> violently, while at that) in any aspect of the compiler?

What standards are you talking about? I'm not aware of any standard for debuggability of optimized code.

At one time, gcc actually provided better debugging of optimized code than any other compiler, though I don't know if that is still true. Optimized gcc code is still debuggable today. I do it all the time. (For me poor support for debugging C++ is a much bigger issue, though I think that is an issue more with gdb than with gcc.)

gcc's users are definitely calling for a faster compiler. Are they calling for better debuggability of optimized code?

> >> 1. every single gimple assignment grows by one word,
>
> I take this back, I'd been misled by richi's description.  It's really
> a side hashtable (which gets me worried about the re-emitted rather
> than modified gimple assignments in some locations), so it doesn't
> waste memory for gimple assignments that don't refer to user
> variables.
>
> Unfortunately, this is not the case for rtx SETs, in this alternate
> approach.

Obviously the memory requirements of both approaches will need to be measured.

> > We've fixed many many bugs and misoptimizations over the years due to
> > NOTEs.  I'm concerned that adding DEBUG_INSN in RTL repeats a mistake
> > we've made in the past.
>
> That's a valid concern.  However, per this reasoning, we might as well
> push every operand in our IL to separate representations, because
> there have been so many bugs and misoptimizations over the years,
> especially when the representation didn't make transformations
> trivially correct.

Please don't use strawman arguments.

As I understand your proposal, it materializes variables which were otherwise omitted from the generated program. It doesn't address the other issues with debugging optimized code, like bouncing around between program lines. Is that correct? What else does your proposal do?

Ian
* Re: Designs for better debug info in GCC
From: Daniel Jacobowitz @ 2007-11-07 23:05 UTC
To: Ian Lance Taylor; +Cc: Alexandre Oliva, Richard Guenther, gcc-patches, gcc

On Wed, Nov 07, 2007 at 02:56:24PM -0800, Ian Lance Taylor wrote:

> At one time, gcc actually provided better debugging of optimized code
> than any other compiler, though I don't know if that is still true.
> Optimized gcc code is still debuggable today.  I do it all the time.
> (For me poor support for debugging C++ is a much bigger issue, though
> I think that is an issue more with gdb than with gcc.)

We're working on both of these on the GDB side.

> gcc's users are definitely calling for a faster compiler.  Are they
> calling for better debuggability of optimized code?

In my experience, yes. CodeSourcery has work currently being contributed to GDB that makes this quite a lot better; we also occasionally have customers ask us about further improvements. And I file bugs about this from time to time, most of which are still open.

> As I understand your proposal, it materializes variables which were
> otherwise omitted from the generated program.  It doesn't address the
> other issues with debugging optimized code, like bouncing around
> between program lines.  Is that correct?  What else does your proposal
> do?

I've been thinking about the bouncing problem quite a bit lately. I have some rough ideas, but I won't draw out this thread by sharing :-)

-- 
Daniel Jacobowitz
CodeSourcery
* Re: Designs for better debug info in GCC 2007-11-07 22:57 ` Ian Lance Taylor 2007-11-07 23:05 ` Daniel Jacobowitz @ 2007-11-08 0:00 ` Mark Mitchell 2007-11-08 0:15 ` David Edelsohn ` (2 more replies) 2007-11-08 5:01 ` Alexandre Oliva 2007-11-08 8:58 ` Paolo Bonzini 3 siblings, 3 replies; 150+ messages in thread From: Mark Mitchell @ 2007-11-08 0:00 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Alexandre Oliva, Richard Guenther, gcc-patches, gcc Ian Lance Taylor wrote: > At one time, gcc actually provided better debugging of optimized code > than any other compiler, though I don't know if that is still true. > Optimized gcc code is still debuggable today. I do it all the time. > (For me poor support for debugging C++ is a much bigger issue, though > I think that is an issue more with gdb than with gcc.) I think we all agree that providing better debugging of optimized code is a priori a good thing. So, as I see it, this thread is focused on what internal representation we might use for that. I don't know that there's an abstract right answer to whether something NOTE-like or something on the side is better. There are problems with both approaches. We know the NOTE/DEBUG_INSN thing is going to break, from experience; we also know the on-the-side thing is going to be hard to maintain. Alexandre has clearly thought about this a lot. I'd like to start by capturing the functional changes that we want to make to GCC's debug output -- not the changes that we want in the debug experience, or changes that we need in GDB, but the changes in the generated DWARF. For example, I'm thinking of a series of function test cases. Ignore the substance of this example -- I'm making it up! -- I'm just trying to capture the form. === int main () { int i; i = 3; return i; } When optimizing, "i" is optimized away. The debug info for "i" right before the return statement says "i has been optimized away", but not what its value is. I think it should say that the value is "3". 
To do that, we need to emit a DW_Now_My_Value_is_3 tag for "i". === Now, how is whatever representation we pick going to get us that? Is the Oliva representation sufficient? What about the Guenther/Matz representation? Independently of the representation, what algorithms are we going to use to track whatever we need to track as the optimizers remove, insert, duplicate, and reorder code? Until we all know what we're trying to do, I don't see how we can make a good decision about the representation. Clearly, in the abstract, we can represent data either on-the-side or in the instruction stream, but until we know what output we want, I'm not sure how we can pick. -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
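A minimal sketch of the test-case form described above, written as a standalone function so it can be compiled and stepped through. The name `constant_local` is hypothetical, and the comments record the debug output the message asks for, not what any particular GCC release emits. DWARF does define a `DW_AT_const_value` attribute for variables with a known constant value; whether that attribute or a location list is the right vehicle here is left open by the thread.

```c
/* Hypothetical test case: the function name is illustrative only. */
int constant_local(void)
{
    int i;      /* user variable that an optimizer is free to remove */

    i = 3;

    /* Desired behaviour at this point: the debug info states that
       the value of "i" is 3, rather than only reporting that "i"
       has been optimized away. */
    return i;
}
```

Compiling this with optimization and `-g`, then inspecting the variable's debug entry, is one way to check which of the two answers a given compiler produces.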
* Re: Designs for better debug info in GCC 2007-11-08 0:00 ` Mark Mitchell @ 2007-11-08 0:15 ` David Edelsohn 2007-11-08 0:35 ` Mark Mitchell 2007-11-08 5:15 ` Alexandre Oliva 2007-11-08 5:44 ` Alexandre Oliva 2007-11-08 9:54 ` Richard Guenther 2 siblings, 2 replies; 150+ messages in thread From: David Edelsohn @ 2007-11-08 0:15 UTC (permalink / raw) To: Mark Mitchell Cc: Ian Lance Taylor, Alexandre Oliva, Richard Guenther, gcc-patches, gcc >>>>> Mark Mitchell writes: Mark> I think we all agree that providing better debugging of optimized code Mark> is a priori a good thing. So, as I see it, this thread is focused on Mark> what internal representation we might use for that. Yes, it is a good thing, but not at any price. Regardless of the representation and implementation, there is a cost. This discussion should not start with the premise that better debugging of optimized code is better at any cost. Mark> I'd like to start by Mark> capturing the functional changes that we want to make to GCC's debug Mark> output -- not the changes that we want in the debug experience, or Mark> changes that we need in GDB, but the changes in the generated DWARF. Who is "we"? What better debugging are GCC users demanding? What debugging difficulties are they experiencing? Who is that set of users? What functional changes would improve those cases? What is the cost of those improvements in complexity, maintainability, compile time, object file size, GDB start-up time, etc.? David ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 0:15 ` David Edelsohn @ 2007-11-08 0:35 ` Mark Mitchell 2007-11-08 5:14 ` Alexandre Oliva 2007-11-22 23:07 ` Frank Ch. Eigler 2007-11-08 5:15 ` Alexandre Oliva 1 sibling, 2 replies; 150+ messages in thread From: Mark Mitchell @ 2007-11-08 0:35 UTC (permalink / raw) To: David Edelsohn Cc: Ian Lance Taylor, Alexandre Oliva, Richard Guenther, gcc-patches, gcc David Edelsohn wrote: >>>>>> Mark Mitchell writes: > > Mark> I think we all agree that providing better debugging of optimized code > Mark> is a priori a good thing. So, as I see it, this thread is focused on > Mark> what internal representation we might use for that. > > Yes, it is a good thing, but not at any price. Regardless of the > representation and implementation, there is a cost. This discussion > should not start with the premise that better debugging of optimized code > is better at any cost. I agree. You're right to state this explicitly, but I'd implicitly expected that we'd do cost/benefit analysis on this feature, as we would any other. > Mark> I'd like to start by > Mark> capturing the functional changes that we want to make to GCC's debug > Mark> output -- not the changes that we want in the debug experience, or > Mark> changes that we need in GDB, but the changes in the generated DWARF. > > Who is "we"? What better debugging are GCC users demanding? What > debugging difficulties are they experiencing? Who is that set of users? > What functional changes would improve those cases? What is the cost of > those improvements in complexity, maintainability, compile time, object > file size, GDB start-up time, etc.? That's what I'm asking. First and foremost, I want to know what, concretely, Alexandre is trying to achieve, beyond "better debugging info for optimized code". Until we understand that, I don't see how we can sensibly debate any methods of implementation, possible costs, etc. 
-- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 0:35 ` Mark Mitchell @ 2007-11-08 5:14 ` Alexandre Oliva 2007-11-08 18:28 ` Alexandre Oliva 2007-11-22 23:07 ` Frank Ch. Eigler 1 sibling, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 5:14 UTC (permalink / raw) To: Mark Mitchell Cc: David Edelsohn, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 7, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > First and foremost, I want to know what, concretely, Alexandre is > trying to achieve, beyond "better debugging info for optimized > code". I'm not really going for "better". I'm going for "correct" first, while making room for "better", and hopefully already getting better, in the process. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 0:35 ` Mark Mitchell 2007-11-08 5:14 ` Alexandre Oliva @ 2007-11-22 23:07 ` Frank Ch. Eigler 2007-11-22 23:13 ` Richard Guenther 1 sibling, 1 reply; 150+ messages in thread From: Frank Ch. Eigler @ 2007-11-22 23:07 UTC (permalink / raw) To: Mark Mitchell Cc: David Edelsohn, Ian Lance Taylor, Alexandre Oliva, Richard Guenther, gcc-patches, gcc Mark Mitchell <mark@codesourcery.com> writes: > [...] >> Who is "we"? What better debugging are GCC users demanding? What >> debugging difficulties are they experiencing? Who is that set of users? >> What functional changes would improve those cases? What is the cost of >> those improvements in complexity, maintainability, compile time, object >> file size, GDB start-up time, etc.? > > That's what I'm asking. First and foremost, I want to know what, > concretely, Alexandre is trying to achieve, beyond "better debugging > info for optimized code". Until we understand that, I don't see how we > can sensibly debate any methods of implementation, possible costs, etc. It may be asking to belabour the obvious. GCC users do not want to have to compile with "-O0 -g" just to debug during development (or during crash analysis *after deployment*!). Developers would like to be able to place breakpoints anywhere by reference to the source code, and would like to access any variables logically present there. Developers will accept that optimized code will by its nature make some of these fuzzy, but incorrect data must be and incomplete data should be minimized. That they put up with the status quo at all is a historical artifact of being told so long not to expect any better. - FChE ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-22 23:07 ` Frank Ch. Eigler @ 2007-11-22 23:13 ` Richard Guenther 2007-11-23 20:53 ` Frank Ch. Eigler 2007-11-24 15:02 ` Robert Dewar 0 siblings, 2 replies; 150+ messages in thread From: Richard Guenther @ 2007-11-22 23:13 UTC (permalink / raw) To: Frank Ch. Eigler Cc: Mark Mitchell, David Edelsohn, Ian Lance Taylor, Alexandre Oliva, gcc-patches, gcc On Nov 22, 2007 8:22 PM, Frank Ch. Eigler <fche@redhat.com> wrote: > > Mark Mitchell <mark@codesourcery.com> writes: > > > [...] > >> Who is "we"? What better debugging are GCC users demanding? What > >> debugging difficulties are they experiencing? Who is that set of users? > >> What functional changes would improve those cases? What is the cost of > >> those improvements in complexity, maintainability, compile time, object > >> file size, GDB start-up time, etc.? > > > > That's what I'm asking. First and foremost, I want to know what, > > concretely, Alexandre is trying to achieve, beyond "better debugging > > info for optimized code". Until we understand that, I don't see how we > > can sensibly debate any methods of implementation, possible costs, etc. > > It may be asking to belabour the obvious. GCC users do not want to > have to compile with "-O0 -g" just to debug during development (or > during crash analysis *after deployment*!). Developers would like to > be able to place breakpoints anywhere by reference to the source code, > and would like to access any variables logically present there. > Developers will accept that optimized code will by its nature make > some of these fuzzy, but incorrect data must be and incomplete data > should be minimized. > > That they put up with the status quo at all is a historical artifact > of being told so long not to expect any better. As it is (without serious overhead) impossible to do both, you either have to live with possibly incorrect but elaborate or incomplete but correct debug information for optimized code. 
Choose one ;) What we (Matz and myself) are trying to do is provide elaborate debug information with the chance of wrong (I'd call it superfluous, or extra) debug information. Alexandre seems to aim at the world-domination solution (with the serious overhead in terms of implementation and verboseness). Richard. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-22 23:13 ` Richard Guenther @ 2007-11-23 20:53 ` Frank Ch. Eigler 2007-11-24 1:53 ` Alexandre Oliva 2007-11-24 15:02 ` Robert Dewar 1 sibling, 1 reply; 150+ messages in thread From: Frank Ch. Eigler @ 2007-11-23 20:53 UTC (permalink / raw) To: Richard Guenther Cc: Mark Mitchell, David Edelsohn, Ian Lance Taylor, Alexandre Oliva, gcc-patches, gcc Hi - (BTW, sorry for reopening this old thread if people are sick & tired of it.) > > Mark Mitchell <mark@codesourcery.com> writes: > > > [...] > > > That's what I'm asking. First and foremost, I want to know what, > > > concretely, Alexandre is trying to achieve, beyond "better debugging > > > info for optimized code". [...] > > > > It may be asking to belabour the obvious. GCC users do not want to > > have to compile with "-O0 -g" just to debug during development [...] > > Developers will accept that optimized code will by its nature make > > some of these fuzzy, but incorrect data must be and incomplete data > > should be minimized. [...] > > As it is (without serious overhead) impossible to do both, you either have > to live with possibly incorrect but elaborate or incomplete but correct > debug information for optimized code. Choose one ;) I did say "minimized", not "eliminated". It needs to be good enough that a semi-knowledgeable person or a dumb but heuristic-laden program that processes debugging info can nevertheless extract reliable information. > What we (Matz and myself) are trying to do is provide elaborate > debug information with the chance of wrong (I'd call it superfluous, > or extra) debug information. (I will need to reread the thread to see what this extra information can do in terms of misleading users or tools, such as giving incorrect variable values/locations. I'd appreciate a link if you have one handy.) > Alexandre seems to aim at the world-domination solution (with the > serious overhead in terms of implementation and verboseness). 
That ("world-domination") seems an overly unkind characterization - we could simply say he's trying an exhaustive, straining-to-be-correct solution. It seems to me that we will shortly see the actual impacts of both of these approaches in terms of compiler complexity as well as any improvements in data quality. It does not seem to me like there is substantial disagreement over the ideal of correct and to a lesser extent complete information, so let's see the implementations and then compare. - FChE ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-23 20:53 ` Frank Ch. Eigler @ 2007-11-24 1:53 ` Alexandre Oliva 0 siblings, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 1:53 UTC (permalink / raw) To: Frank Ch. Eigler Cc: Richard Guenther, Mark Mitchell, David Edelsohn, Ian Lance Taylor, gcc-patches, gcc On Nov 23, 2007, "Frank Ch. Eigler" <fche@redhat.com> wrote: >> > It may be asking to belabour the obvious. GCC users do not want to >> > have to compile with "-O0 -g" just to debug during development [...] >> > Developers will accept that optimized code will by its nature make >> > some of these fuzzy, but incorrect data must be and incomplete data ^avoided? >> > should be minimized. [...] Richard Guenther replied: >> As it is (without serious overhead) impossible to do both, Is it? >> you either have to live with possibly incorrect but elaborate or >> incomplete but correct debug information for optimized code. You have proof of that? >> Choose one ;) As in, command line options? Or are we going to make a choice and impose that on all our users, as if it fit all? Frank followed up: >> What we (Matz and myself) are trying to do is provide elaborate >> debug information with the chance of wrong (I'd call it superfluous, >> or extra) debug information. It's not just superfluous or extra. Your approach actively regresses debug information for some cases, while it's arguable whether it actually improves others. > That ("world-domination") seems an overly unkind characterization +1 It would be like myself pointing out that, for every problem, there's a solution that's simple, elegant and wrong ;-) Given the problems with sequential live ranges being made parallel and conflicting, values subject to conditions being made unconditional, and overwritten values remaining noted as live, I wouldn't think the characterization above would be unfair, but I'd managed to resist it so far. 
I don't think pulling the blanket such that it covers your face while it uncovers your feet is the way to go. It's even worse, because then, with your face covered, you won't even see that your feet are uncovered ;-) Regressions are bad, and this proposed approach guarantees regressions, while it might fix a few trivial cases. This is not enough for me. I'm not just hacking up a quick fix for a poorly-worded problem. I'm doing actual software engineering here, trying to get GCC to comply with existing debug info standards. > It does not seem to me like there is > substantial disagreement over the ideal of correct Unfortunately, that is indeed up for debate. There are even those who dispute that there's any correctness issue involved. Most other approaches are actually overreaching in completeness, trading correctness for more information, as if more unreliable information was any better than no information at all. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
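One of the failure modes named in the message above, a value subject to a condition being reported as if it were unconditional, can be pictured with a small C fragment. This is only a sketch under assumptions: the function name `pick` is made up, and the comments describe what correct debug information would have to say, not the output of either proposed implementation.

```c
/* Hypothetical illustration of a conditionally assigned variable. */
int pick(int flag)
{
    int x = 1;          /* value on the path where flag == 0 */

    if (flag)
        x = 2;          /* value only on the path where flag != 0 */

    /* Correct debug info here must not claim x == 2 when flag is 0.
       If an optimizer turns the branch into straight-line code, an
       annotation that loses the condition would report a value the
       variable can never hold on that path. */
    return x * 10;
}
```

Calling `pick(0)` and `pick(1)` under a debugger and printing `x` just before the return is a simple way to see whether the reported value follows the path actually taken.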
* Re: Designs for better debug info in GCC 2007-11-22 23:13 ` Richard Guenther 2007-11-23 20:53 ` Frank Ch. Eigler @ 2007-11-24 15:02 ` Robert Dewar 1 sibling, 0 replies; 150+ messages in thread From: Robert Dewar @ 2007-11-24 15:02 UTC (permalink / raw) To: Richard Guenther Cc: Frank Ch. Eigler, Mark Mitchell, David Edelsohn, Ian Lance Taylor, Alexandre Oliva, gcc-patches, gcc Richard Guenther wrote: > On Nov 22, 2007 8:22 PM, Frank Ch. Eigler <fche@redhat.com> wrote: >> Mark Mitchell <mark@codesourcery.com> writes: >> >>> [...] >>>> Who is "we"? What better debugging are GCC users demanding? What >>>> debugging difficulties are they experiencing? Who is that set of users? >>>> What functional changes would improve those cases? What is the cost of >>>> those improvements in complexity, maintainability, compile time, object >>>> file size, GDB start-up time, etc.? >>> That's what I'm asking. First and foremost, I want to know what, >>> concretely, Alexandre is trying to achieve, beyond "better debugging >>> info for optimized code". Until we understand that, I don't see how we >>> can sensibly debate any methods of implementation, possible costs, etc. >> It may be asking to belabour the obvious. GCC users do not want to >> have to compile with "-O0 -g" just to debug during development (or >> during crash analysis *after deployment*!). Developers would like to >> be able to place breakpoints anywhere by reference to the source code, >> and would like to access any variables logically present there. >> Developers will accept that optimized code will by its nature make >> some of these fuzzy, but incorrect data must be and incomplete data >> should be minimized. >> >> That they put up with the status quo at all is a historical artifact >> of being told so long not to expect any better. 
> > As it is (without serious overhead) impossible to do both, you either have > to live with possibly incorrect but elaborate or incomplete but correct > debug information for optimized code. Choose one ;) I don't think you can use the phrase "serious overhead" without rather extensive statistics. To me, -O1 should be reasonably debuggable, as it always was back in earlier gcc days. It is nice that -O1 is somewhat more efficient than it was in those earlier days, but not nice enough to warrant a severe regression in debug capabilities. To me anyone who is so concerned about performance as to really appreciate this difference will likely be using -O2 anyway. The trouble is that we have set as the criterion for -O1 all the optimizations that are reasonably cheap in compile time. I think it is essential that there be an optimization level that means All the optimizations that are reasonably cheap to implement and that do not impact debugging information significantly (except I would say it is OK to impact the ability to change variables). For me it would be fine for -O1 to mean that but if there is a a consensus that an extra level (-Od or whatever) is worth while that's fine by me. I find working on the Ada front end that it used to be that I could always use -O1, OK for debugging, and OK for performance. Now I have to switch between -O0 for debugging, and then I use -O2 for performance (for me, the debuggability of -O1 and -O2 are equivalent in this context, both hopeless, so I might as well use -O2). So I no longer use -O1 at all (the extra compile time for -O2 is negligible on my fast note book). > > What we (Matz and myself) are trying to do is provide elaborate debug > information with the chance of wrong (I'd call it superflous, or extra) > debug information. Alexandre seems to aim at the world-domination > solution (with the serious overhead in terms of implementation and > verboseness). > > Richard. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 0:15 ` David Edelsohn 2007-11-08 0:35 ` Mark Mitchell @ 2007-11-08 5:15 ` Alexandre Oliva 2007-11-08 18:18 ` Alexandre Oliva 2007-11-08 19:46 ` Andrew Pinski 1 sibling, 2 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 5:15 UTC (permalink / raw) To: David Edelsohn Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 7, 2007, David Edelsohn <dje@watson.ibm.com> wrote: > Who is "we"? What better debugging are GCC users demanding? What > debugging difficulties are they experiencing? I, for one, miss the arguments of inlined functions, a lot. The reason for that is that arguments are currently optimized away to boot. Even if they weren't, since they're initialized with a trivial copy, at least their initial value (quite often preserved throughout compilation) would be gone to boot. On top of that, we currently regard arguments and variables of non-inlined functions as special, and we prevent a number of optimizations with them, in order to be able to generate slightly better debug information for them. (As for arguments and variables of inlined functions, we happily drop them on the floor right away.) This is not only inconsistent, it's also harmful, because we're trading performance and compile-time memory for slightly better but still incorrect, incomplete and unreliable debug information. > Who is that set of users? I'm personally getting numerous requests for debug information correctness and better completeness from debug info consumers such as gdb, frysk and systemtap. GCC's eagerness to inline functions, even ones never declared as inline, and its eagerness to corrupt the meta-information associated with them, causes these tools to malfunction in very many situations. And it's all GCC's fault, for generating code that is not standards-compliant in the meta-information sections of its output. > What functional changes would improve those cases? 
What is the cost of > those improvements in complexity, maintainability, compile time, object > file size, GDB start-up time, etc.? Before I spend hours describing the little I can foresee about this, how much of this really matters, given that it's mostly a matter of correctness, rather than mere trade offs? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
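The complaint about arguments of inlined functions can be made concrete with a short C sketch. The names `scale` and `caller` are hypothetical, and the comments describe the general situation the message refers to rather than the exact behaviour of any GCC version.

```c
/* Hypothetical example: function names are made up for illustration. */
static inline int scale(int factor, int value)
{
    /* Once this body is inlined, "factor" and "value" are just copies
       of the caller's expressions. If those copies are dropped without
       any record, a debugger stopped here cannot show either argument,
       even though 4 and n + 1 are still computable. */
    return factor * value;
}

int caller(int n)
{
    return scale(4, n + 1);
}
```

Setting a breakpoint inside `scale` in an optimized build and asking for `factor` and `value` shows whether the arguments survived in the debug information.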
* Re: Designs for better debug info in GCC 2007-11-08 5:15 ` Alexandre Oliva 2007-11-08 18:18 ` Alexandre Oliva @ 2007-11-08 19:46 ` Andrew Pinski 2007-11-08 20:39 ` Alexandre Oliva 2007-11-09 8:39 ` Robert Dewar 1 sibling, 2 replies; 150+ messages in thread From: Andrew Pinski @ 2007-11-08 19:46 UTC (permalink / raw) To: Alexandre Oliva Cc: David Edelsohn, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc First off I would like to say I did not want to reply but I guess I am going to because of some false information spreading around about what GCC as a compiler is. On 11/7/07, Alexandre Oliva <aoliva@redhat.com> wrote: > I'm personally getting numerous requests for debug information > correctness and better completeness from debug info consumers such as > gdb, frysk and systemtap. GCC's eagerness to inline functions, even > ones never declared as inline, and its eagerness to corrupt the > meta-information associated with them, causes these tools to > malfunction in very many situations. And it's all GCC's fault, for > generating code that is not standards-compliant in the > meta-information sections of its output. I have to ask, do you want an optimizing compiler or one which generates full debugging information???? Because there are trade-offs here really. The reason behind the extra inlining is that it improves code generation. I don't know about you but in some areas of coding, people need the extra speed/size reductions that inlining of non-user-marked functions provides. I have plenty of code which needs the speed help that the extra inlining gives (remember some developers don't want to change the code that much to have the optimizing compiler do its work). Remember dwarf3 is not really a standard about meta-information, it just mentions how it is represented if it exists. -- Pinski ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 19:46 ` Andrew Pinski @ 2007-11-08 20:39 ` Alexandre Oliva 2007-11-09 8:39 ` Robert Dewar 1 sibling, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 20:39 UTC (permalink / raw) To: Andrew Pinski Cc: David Edelsohn, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 8, 2007, "Andrew Pinski" <pinskia@gmail.com> wrote: > On 11/7/07, Alexandre Oliva <aoliva@redhat.com> wrote: >> I'm personally getting numerous requests for debug information >> correctness and better completeness from debug info consumers such as >> gdb, frysk and systemtap. GCC's eagerness to inline functions, even >> ones never declared as inline, and its eagerness to corrupt the >> meta-information associated with them, causes these tools to >> malfunction in very many situations. And it's all GCC's fault, for >> generating code that is not standards-compliant in the >> meta-information sections of its output. > I have to ask, do you want an optimizing compiler or one which > generates full debugging information???? I want both. That's the whole point of this project I'm in. > Because there are trade off here really. For a superficial look at the problem, they might look like trade-offs. But the assumption that it's impossible to get both is incorrect. It takes work, but it's not impossible. > The reason behind the extra inlining is because it > improves code generation. I don't see how you got the impression that I might be arguing against the inlining, as it looks like you did. I'm not. I'm arguing against the corruption of meta-information associated with them. That's just laziness on our part. > Remember dwarf3 is not really a standards about meta-information, it > just mentions how it represented if it exists. That's what meta-information is. One of the problems is that we often fail to represent information that does exist. 
A more serious problem is that we often represent such information incorrectly, making it seem like things that don't exist do, and that things are at different locations from those in which they actually are. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 19:46 ` Andrew Pinski 2007-11-08 20:39 ` Alexandre Oliva @ 2007-11-09 8:39 ` Robert Dewar 1 sibling, 0 replies; 150+ messages in thread From: Robert Dewar @ 2007-11-09 8:39 UTC (permalink / raw) To: Andrew Pinski Cc: Alexandre Oliva, David Edelsohn, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Andrew Pinski wrote: > I have to ask, do you want an optimizing compiler or one which > generates full debugging information???? Both! I would like modes which do the following: a) reasonable amount of optimization that does not interfere too much with debugging. The old GCC 3 -O1 was a close approximation to this (certainly a closer approximation than the current -O1). b) all possible optimizations even if debuggability is compromised. That's a perfectly reasonable request, and we used to be pretty close to having it, but now -O1 has really degraded as a solution to a). Yes, it's somewhat more efficient, but I suspect that the small minority of those interested in the last bit of performance are using -O2 anyway, so I doubt many people get much benefit from the improved performance of -O1 code. On the other hand lots of people are negatively affected by the degrading of debugging in -O1 mode. > Because there are trade off > here really. The reason behind the extra inlining is because it > improves code generation. I don't know about you but in some area of > coding, they need the extra speed/size reductions that inlining of non > user marked functions. I have plenty of code which needs the speed > help that the extra inlining helps (remember some developers don't want > to change the code that much to have the optimizing compiler do its > work). Obviously you don't want a lot of inlining unless the debugger can handle inlining properly if your interest is in being able to debug! > > Remember dwarf3 is not really a standards about meta-information, it > just mentions how it represented if it exists. 
But consumers want a debugger that works, without having to take the hit of huge volumes of code at -O0 > > -- Pinski ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 0:00 ` Mark Mitchell 2007-11-08 0:15 ` David Edelsohn @ 2007-11-08 5:44 ` Alexandre Oliva 2007-11-08 18:37 ` Alexandre Oliva 2007-11-08 19:13 ` Mark Mitchell 2007-11-08 9:54 ` Richard Guenther 2 siblings, 2 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 5:44 UTC (permalink / raw) To: Mark Mitchell; +Cc: Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 7, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > Until we all know what we're trying to do Here's what I am trying to do: 1. Ensure that, for every user variable for which we emit debug information, the information is correct, i.e., if it says the value of a variable at a certain instruction is at certain locations, or is a known constant, then the variable must not be at any other location at that point, and the locations or values must match reasonable expectations based on source code inspection. 2. Defining "reasonable expectations" is tricky, for code reordering typical of optimization can make room for numerous surprises. I don't have a precise definition for this yet, but very clearly to me saying that a variable holds a value that it couldn't possibly hold (e.g., because it is only assigned that value in a code path that is knowingly not taken) is a very clear indication that something is amiss. The general guiding rule is, if we aren't sure the information is correct (or we're sure it isn't), we shouldn't pretend that it is. 3. Try to ensure that, if the value of a variable is a known constant at a certain point in the program, this information is present in debug information. 4. Try to ensure that, if the value of a variable is available at any location at a certain point in the program, this information is present in debug information. 5. Stop missing optimizations for the sake of improving debug information. 6. 
Avoid using additional memory and CPU cycles that would be needed only for debug information when compiling without generating debug information -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 5:44 ` Alexandre Oliva @ 2007-11-08 18:37 ` Alexandre Oliva 2007-11-08 19:13 ` Mark Mitchell 1 sibling, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 18:37 UTC (permalink / raw) To: Mark Mitchell; +Cc: Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 7, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > Until we all know what we're trying to do Here's what I am trying to do: 1. Ensure that, for every user variable for which we emit debug information, the information is correct, i.e., if it says the value of a variable at a certain instruction is at certain locations, or is a known constant, then the variable must not be at any other location at that point, and the locations or values must match reasonable expectations based on source code inspection. 2. Defining "reasonable expectations" is tricky, for code reordering typical of optimization can make room for numerous surprises. I don't have a precise definition for this yet, but very clearly to me saying that a variable holds a value that it couldn't possibly hold (e.g., because it is only assigned that value in a code path that is knowingly not taken) is a very clear indication that something is amiss. The general guiding rule is, if we aren't sure the information is correct (or we're sure it isn't), we shouldn't pretend that it is. 3. Try to ensure that, if the value of a variable is a known constant at a certain point in the program, this information is present in debug information. 4. Try to ensure that, if the value of a variable is available at any location at a certain point in the program, this information is present in debug information. 5. Stop missing optimizations for the sake of improving debug information. 6. 
Avoid using additional memory and CPU cycles that would be needed only for debug information when compiling without generating debug information -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 5:44 ` Alexandre Oliva 2007-11-08 18:37 ` Alexandre Oliva @ 2007-11-08 19:13 ` Mark Mitchell 2007-11-08 19:13 ` David Daney 2007-11-09 2:09 ` Alexandre Oliva 1 sibling, 2 replies; 150+ messages in thread From: Mark Mitchell @ 2007-11-08 19:13 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Alexandre Oliva wrote: > On Nov 7, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > >> Until we all know what we're trying to do > > Here's what I am trying to do: I think these are laudable goals, but you didn't really provide the information I wanted. In particular, I'd like to drill down from goals (like "ensure that, for every user variable for which we emit debug information, the information is correct") to concrete problems. I think that most of the goals boil down to making sure that, at any point in the program, the debug information for a variable meets the following criteria: (a) if the variable has not been optimized away, gives the location where that variable's current value can be found, or (b) if the variable has been optimized away, and the value is not a constant, says that the value is not available, or (c) if the variable has been optimized away, but is a constant, says what the constant value is. Is that right? (Note "at any point" above; it might be that the variable is present in r0 for a while, and then optimized away, and then present at *0xdeadbeef for a while, and then has the constant value 7.) If so, how are you proposing to accomplish that? It's easy enough to design a representation (whether in the instruction stream, or on the side) that says "from instruction A to instruction B, the value is in this location". So, I don't think we need to worry about that just yet. But, how are we going to track this information? Algorithmically, what needs to change in the compiler to maintain this state? 
For example, we need some way for an optimization pass to tell the rest of the compiler that a variable was completely eliminated. (Perhaps, for example, because all uses of the variable were eliminated.) So, maybe we need a debug_var_eliminated API. Then, every pass that blows away variables can call this function, which can make whatever notations on the VAR_DECL are required. I'm not claiming that's the right approach, but I'd like to understand the plan at that kind of level. What changes will need to be made throughout the compiler to keep track of the state? -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
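The "from instruction A to instruction B, the value is in this location" representation mentioned in the message above can be sketched as a small range table. This is only an illustrative sketch, not GCC or DWARF code: the names `loc_kind`, `loc_range`, `lookup_location` and `example_ranges` are hypothetical, and the PC values are made up to mirror the r0 / optimized away / *0xdeadbeef / constant 7 sequence from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kinds of answer debug info may give for one variable.  */
enum loc_kind {
  LOC_UNAVAILABLE, /* (b) optimized away, value not known */
  LOC_REGISTER,    /* (a) value lives in a register */
  LOC_MEMORY,      /* (a) value lives at a memory address */
  LOC_CONSTANT     /* (c) optimized away, but a known constant */
};

struct loc_range {
  unsigned long start_pc; /* first covered instruction (inclusive) */
  unsigned long end_pc;   /* end of the range (exclusive) */
  enum loc_kind kind;
  unsigned long payload;  /* register number, address, or constant */
};

/* Return the range covering PC, or NULL if no range covers it.
   A NULL result should be treated like LOC_UNAVAILABLE.  */
static const struct loc_range *
lookup_location (const struct loc_range *ranges, size_t n, unsigned long pc)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (ranges[i].start_pc <= ranges[i].end_pc);
      if (pc >= ranges[i].start_pc && pc < ranges[i].end_pc)
        return &ranges[i];
    }
  return NULL;
}

/* Made-up ranges: in r0, then optimized away, then at *0xdeadbeef,
   then the constant 7.  */
static const struct loc_range example_ranges[] = {
  { 0x100, 0x110, LOC_REGISTER, 0 },
  { 0x110, 0x120, LOC_UNAVAILABLE, 0 },
  { 0x120, 0x130, LOC_MEMORY, 0xdeadbeefUL },
  { 0x130, 0x140, LOC_CONSTANT, 7 },
};
```

A consumer would call `lookup_location (example_ranges, 4, pc)` for the current PC. The open question in the thread is not this table but how the compiler keeps it accurate while optimizing.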
* Re: Designs for better debug info in GCC 2007-11-08 19:13 ` Mark Mitchell @ 2007-11-08 19:13 ` David Daney 2007-11-08 19:17 ` Mark Mitchell 2007-11-09 2:09 ` Alexandre Oliva 1 sibling, 1 reply; 150+ messages in thread From: David Daney @ 2007-11-08 19:13 UTC (permalink / raw) To: Mark Mitchell Cc: Alexandre Oliva, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Mark Mitchell wrote: > Alexandre Oliva wrote: >> On Nov 7, 2007, Mark Mitchell <mark@codesourcery.com> wrote: >> >>> Until we all know what we're trying to do >> Here's what I am trying to do: > > I think these are laudable goals, but you didn't really provide the > information I wanted. In particular, what I'd like to drill down from > goals (like "ensure that, for every user variable for which we emit > debug information, the information is correct") to concrete problems. > > I think that most of the goals boil down to making sure that, at any > point in the program, the debug information for a variable meets the > following criteria: > > (a) if the variable has not been optimized away, gives the location > where that variable's current value can be found, or > (b) if the variable has been optimized away, and the value is not a > constant, says that the value is not available, or Perhaps if the variable has been optimized away *but* it is possible to calculate its value by examining the state of the program, then we can emit the expression needed to calculate its value in the debugging information as well. I may be missing something, but it seems that may be part of Alexandre's plan as well. > (c) if the variable has been optimized away, but is a constant, says > what the constant value is ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 19:13 ` David Daney @ 2007-11-08 19:17 ` Mark Mitchell 0 siblings, 0 replies; 150+ messages in thread From: Mark Mitchell @ 2007-11-08 19:17 UTC (permalink / raw) To: David Daney Cc: Alexandre Oliva, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc David Daney wrote: >> (a) if the variable has not been optimized away, gives the location >> where that variable's current value can be found, or >> (b) if the variable has been optimized away, and the value is not a >> constant, says that the value is not available, or > > Perhaps if the variable has been optimized away *but* it is possible to > calculate its value by examining the state of the program, then we can > emit the expression needed to calculate its value in the debugging > information as well. Yes, that's a good addition. To be clear, I'm not trying to set the goals here; I'm just trying to make sure we have a clear set of objectives and a plan to get there. Thanks, -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 19:13 ` Mark Mitchell 2007-11-08 19:13 ` David Daney @ 2007-11-09 2:09 ` Alexandre Oliva 2007-11-12 4:49 ` Mark Mitchell 1 sibling, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-09 2:09 UTC (permalink / raw) To: Mark Mitchell; +Cc: Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 8, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > Alexandre Oliva wrote: >> On Nov 7, 2007, Mark Mitchell <mark@codesourcery.com> wrote: >> >>> Until we all know what we're trying to do >> >> Here's what I am trying to do: > I think these are laudable goals, but you didn't really provide the > information I wanted. Oh, you didn't want goals. Design and implementation plans more detailed than http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00160.html, I suppose. Ok, let's see... 1. introduce, early in compilation (when entering SSA), annotations that map user-level variables whose location may vary throughout their lifetime to implementation-level variables or expressions at every point of assignment and PHI joins. 2. keep those annotations accurate throughout compilation, without letting them interfere with optimizations, but making sure they are kept up-to-date or marked untrackable. 3. in var-tracking, starting from the expressions in the annotations and their equivalent expressions computed with a dataflow-globalized cse analysis, emit traditional var-tracking var_location notes for all variables. For variables that didn't start out as gimple regs, the current debug info behavior should be preserved. 
> I think that most of the goals boil down to making sure that, at any > point in the program, the debug information for a variable meets the > following criteria: > (a) if the variable has not been optimized away, gives the location > where that variable's current value can be found, or > (b) if the variable has been optimized away, and the value is not a > constant, says that the value is not available, or > (c) if the variable has been optimized away, but is a constant, says > what the constant value is Yes, except that instead of constant and constant value, I'd put it as 'computable expression from other live values'. And I'd say "locations" rather than just "location". > But, how are we going to track this information? Algorithmically, what > needs to change in the compiler to maintain this state? Most optimization passes must already update uses of gimple or pseudo regs they modify, so these will be taken care of automatically (which is why I chose this representation). Optimization passes that move assignments to an earlier point in the program don't need any modification. Those that move them to a later point will often move them past their debug notes. This means the debug notes need updating, but it also means that, in the absence of fixes, the debug notes most likely will stand in the way of the transformation, so testing that the debug notes don't change optimization behavior ought to catch these. Transformations that copy or move blocks will retain the annotations, so this should "just work". Transformations that delete blocks might be a bit of a problem, if they delete important debug annotations. So far, the only case I've noticed of such behavior is in ifcvt, in which an if-then-assign-else-assign set of blocks is turned into a single if-then-else assignment. This particular case is covered by the PHI statement that is placed in the entry point of the block that joins the then and the else. 
On architectures that support longer blocks with conditional-execution of arbitrary instructions (arm, ia64), I'm not sure how to handle the debug notes. It seems to me that, with the current design, the variable may be regarded as untrackable after the first conditional assignment within the combined blocks, but at the join point there will be the debug annotation corresponding to the PHI join that will take care of getting a correct location for the variable again. I don't have plans in place for any other kind of situation, but it appears to me that the notion of using assignments and joins as fixed points is solid, and I'm pretty sure any surprises can be overcome. Of course software pipelining and other kinds of loop transformations will yield debug information that's not exactly easy to grasp, but this would be true of any representation. When the compiler messes too much with the code, there's very little one can do to make execution resemble that of sequential execution. I'm also thinking debug info consumers would probably enjoy some means to tell a point at which all side effects present in a certain source line have been completed. But these are mostly orthogonal issues, so I won't delve into them right now. > For example, we need some way for an optimization pass to tell the > rest of the compiler that a variable was completely eliminated. In the design I'm proposing, there's no need for anything explicit like this. This would require global information, which is undesirable, especially for optimizers that operate locally. What they'd have to do when they throw away a value that a debug annotation relies on is to replace that value with something equivalent, if they can, or to mark that particular annotation as untrackable. Then, if all annotations associated with a variable are untrackable, we know it was completely optimized away. 
But if any assignments remained trackable, we can (and should, even though we don't have to) still issue debug information for that. Besides, optimization passes don't deal with user variables. They deal with implementation-level variables, that initially resemble user variables, but that quickly diverge. Optimization passes shouldn't have to care about user variables. In my proposal, all they have to do is to adjust expressions (that happen to be known to evaluate to what user variables are expected to hold) such that they retain the same value in spite of transformations they perform, or are marked as untrackable if that's impossible or too difficult. For the optimizers, all that matters is the expressions, and they already have to deal with these all over anyway. It's the debug info generator that deals with user-level variables, taking into account whatever the optimizers tell it about how to determine the location of user variables throughout the program. > What changes will need to be made throughout the compiler to keep > track of the state? Very few, so far. Pretty much all of the changes that I had to make were to prevent the notes from disabling optimizations; very few of them required updating of debug notes beyond whatever the optimization pass would have done by default. That said, I have no means to test automatically that updates to debug annotations are being performed correctly, but since optimizers as a rule have to update all uses of whatever they mess with, I have reasons to believe that they do it correctly, precisely because the debug notes look so much like regular uses to them. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
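The rule described in the message above (replace a discarded value with an equivalent one if possible, otherwise mark the annotation untrackable; a variable is completely optimized away only when all of its annotations are untrackable) can be sketched as follows. This is a toy model under stated assumptions, not the VTA implementation: `debug_annot`, `value_removed`, `completely_optimized_away` and the integer value ids are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker for an annotation whose value cannot be tracked.  */
#define NO_VALUE (-1)

/* One debug annotation: "user_var holds implementation-level value
   number `value' at this point", or NO_VALUE if untrackable.  */
struct debug_annot {
  const char *user_var;
  int value;
};

/* A pass is throwing away value DEAD.  Rewrite every annotation that
   relied on it to use EQUIVALENT instead; passing NO_VALUE as
   EQUIVALENT marks those annotations untrackable.  */
static void
value_removed (struct debug_annot *annots, size_t n, int dead, int equivalent)
{
  assert (dead != NO_VALUE);
  for (size_t i = 0; i < n; i++)
    if (annots[i].value == dead)
      annots[i].value = equivalent;
}

/* True if no annotation for VAR is still trackable.  Note that a
   variable with no annotations at all also counts as optimized away.  */
static bool
completely_optimized_away (const struct debug_annot *annots, size_t n,
                           const char *var)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (annots[i].user_var, var) == 0 && annots[i].value != NO_VALUE)
      return false;
  return true;
}
```

The point of the design is visible in the signatures: `value_removed` only needs local knowledge of the value being discarded, while the global question is answered later by scanning the annotations, so no pass has to announce that a variable was eliminated.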
* Re: Designs for better debug info in GCC 2007-11-09 2:09 ` Alexandre Oliva @ 2007-11-12 4:49 ` Mark Mitchell 2007-11-12 18:45 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Mark Mitchell @ 2007-11-12 4:49 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Alexandre Oliva wrote: > 1. introduce, early in compilation (when entering SSA), annotations > that map user-level variables whose location may vary throughout their > lifetime to implementation-level variables or expressions at every > point of assignment and PHI joins. > > 2. keep those annotations accurate throughout compilation, without > letting them interfere with optimizations, but making sure they are > kept up-to-date or marked untrackable. > > 3. in var-tracking, starting from the expressions in the annotations > and their equivalent expressions computed with a dataflow-globalized > cse analysis, emit traditional var-tracking var_location notes for all > variables. For variables that didn't start out as gimple regs, the > current debug info behavior should be preserved. > >> I think that most of the goals boil down to making sure that, at any >> point in the program, the debug information for a variable meets the >> following criteria: > >> (a) if the variable has not been optimized away, gives the location >> where that variable's current value can be found, or >> (b) if the variable has been optimized away, and the value is not a >> constant, says that the value is not available, or >> (c) if the variable has been optimized away, but is a constant, says >> what the constant value is > > yes, except that instead of constant and constant value, I'd put it as > 'computable expression from other live values'. > > And I'd say "locations" rather than just "location". I agree; those are generalizations, of which my bullets are a needlessly constrained special case. 
(Of course, we can gradually approach "computable" by starting with "constant", and then adding more and more refinement, if we like.)

>> But, how are we going to track this information? Algorithmically, what
>> needs to change in the compiler to maintain this state?
>
> Most optimization passes must already update uses of gimple or pseudo
> regs they modify, so these will be taken care of automatically (which
> is why I chose this representation).

For the purposes of this discussion, let's assume that upon exit from SSA we still have the information we need. In particular, we know which SSA names correspond to which user variables. That tells us how to get the values of user variables at the points where their values are available, and also tells us when those variables do not have their values available.

(We may already have lost some information, though. For example, given:

  i = 3;
  f(i);
  i = 7;
  i = 2;
  g(i);

we may well have lost the "i = 7" assignment, so "i" might appear to have the value "3" right before we assign "2" to it, if we were to generate debug information right then.)

The reason I want to make that assumption is that the part of this where the representation is in question is once we reach RTL, right? I guess I still don't really understand what you're doing at the RTL level. I understand the objectives. I understand some of the things you're claiming as virtues of DEBUG_INSN. What I don't understand is how it's actually going to work. What are the notes you're inserting? Do they just say "here is an RTL expression for computing the value of user-variable V at this point in the program"? Why does it make sense to have that, rather than notes on instructions that say what effect the instruction has on user variables? (For example, "this SET makes the value of V unavailable". Or "this SET makes the value of the V available in the destination register"?)

As a meta-question, have you or anyone else on the list looked at the literature (IEEE/ACM, etc.) 
or how other compilers handle these problems? -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 4:49 ` Mark Mitchell @ 2007-11-12 18:45 ` Alexandre Oliva 2007-11-12 18:49 ` Joe Buck ` (3 more replies) 0 siblings, 4 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-12 18:45 UTC (permalink / raw) To: Mark Mitchell; +Cc: Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On Nov 12, 2007, Mark Mitchell <mark@codesourcery.com> wrote:

> (We may already have lost some information, though. For example, given:
>   i = 3;
>   f(i);
>   i = 7;
>   i = 2;
>   g(i);
> we may well have lost the "i = 7" assignment, so "i" might appear to
> have the value "3" right before we assign "2" to it, if we were to
> generate debug information right then.)

Yup. And even if we could somehow preserve that information, there wouldn't be any code to attach that information to. There might be uses for empty-range locations in debug information, but I can't think of any. Can anyone? It's something we could try to preserve, and with my design it would be quite easy to do so, but unless it's useful for some purpose, I think we could just do away with it.

> The reason I want to make that assumption is that the part of this where
> the representation is in question is once we reach RTL, right?

I'm not sure what is in question at all. I've proposed a design to preserve debug information throughout compilation. Other designs on the table differ both in tree and rtl levels, and in the potential quality and correctness of the debug information they can produce.

> I guess I still don't really understand what you're doing at the RTL
> level.

It's no different, except that instead of a DEBUG_STMT it's a DEBUG_INSN, with the TREE expression converted to an RTL expression. /me mumbles something about the silliness of keeping two completely different yet nearly-isomorphic internal representations for statements/instructions.

> What I don't understand is how it's actually going to work. What
> are the notes you're inserting?
They're always of the form

  DEBUG user-variable = expression

where DEBUG stands for a DEBUG_STMT or a DEBUG_INSN, user-variable is a tree that represents the user variable, and expression is a TREE or RTL (depending on which representation we're in) that evaluates to the value the user-variable is expected to hold at that point in the program.

> Do they just say "here is an RTL expression for computing the value of
> user-variable V at this point in the program"?

In RTL, yes.

> Why does it make sense to have that, rather than notes on
> instructions that say what effect the instruction has on user
> variables?

Few instructions need such notes, so the proposal of growing SET by 33% doesn't quite appeal to me. And then, optimizations move instructions around, but I don't think they should move the assignment notes around, for they should reflect the structure of the source program, rather than the mangled representation that the optimizers turn it into. That said, growing SET to add to it a list of variables (or components thereof) that the value is assigned to could be made to work, to some extent. But when you optimize away such a set, you'd still have to keep the note around, so it's not clear to me that adding code all over to maintain the notes in place when the SETs go away or are juggled around would bring us any advantage. It would be just a redundant notation for what the note would already convey, so it just brings complexity for no actual advantage.

To make it concrete, consider that your example above could have become:

  (set (reg i) (const_int 3)) ;; assigns to i
  (set (reg P1) (reg i))
  (call (mem f))
  (set (reg i) (const_int 7)) ;; assigns to i
  (set (reg i) (const_int 2)) ;; assigns to i
  (set (reg P1) (reg i))
  (call (mem g))

which could then have been optimized to:

  (set (reg P1) (const_int 3))
  (call (mem f))
  (set (reg P1) (const_int 2))
  (call (mem g))

and then you wouldn't have any debug information left for variable i.
whereas with the notes I propose, you'd be left with:

  (debug i (const_int 3))
  (set (reg P1) (const_int 3))
  (call (mem f))
  (debug i (const_int 7)) ;; may be dropped, as discussed above
  (debug i (const_int 2))
  (set (reg P1) (const_int 2))
  (call (mem g))

even if no register at all ends up allocated for i. And if there were uses of i that followed the assignment to 7, to which the constant could be propagated, you'd still be left with the annotation to indicate that i has a new value at the correct point.

> As a meta-question, have you or anyone else on the list looked at the
> literature (IEEE/ACM, etc.) or how other compilers handle these problems?

I couldn't find much information about other compilers, but I've seen a number of (mostly dated) articles and US patents. In fact, I'm particularly concerned that US Patent 6091896 covers the design proposed by Richi, that involves annotating the instructions themselves. I believe the independent, stand-alone annotations I propose escape the patent claims. That said, if anyone knows of articles that could be of use, I'd love to hear about them. It's not like my research was exhaustive. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
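How a consumer could read the stand-alone notes in the optimized stream above can be sketched with a toy model. This is an assumption-laden illustration, not GCC's RTL or its var-tracking pass: `insn_kind`, `insn`, `optimized` and `user_var_value_before` are hypothetical names, and every bound expression is simplified to an integer constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy instruction kinds; DEBUG entries generate no code.  */
enum insn_kind { INSN_DEBUG, INSN_SET, INSN_CALL };

struct insn {
  enum insn_kind kind;
  const char *name; /* user variable (DEBUG), register (SET), callee (CALL) */
  long value;       /* constant bound or stored; unused for CALL */
};

/* The optimized stream from the message: no register is left for i,
   yet the notes still say what i is expected to hold.  */
static const struct insn optimized[] = {
  { INSN_DEBUG, "i", 3 },
  { INSN_SET, "P1", 3 },
  { INSN_CALL, "f", 0 },
  { INSN_DEBUG, "i", 7 }, /* empty range; may be dropped */
  { INSN_DEBUG, "i", 2 },
  { INSN_SET, "P1", 2 },
  { INSN_CALL, "g", 0 },
};

/* Find the value VAR is expected to hold just before the instruction
   at index POS executes, by scanning back to the nearest DEBUG note
   for VAR.  Returns false if no note precedes POS.  */
static bool
user_var_value_before (const struct insn *stream, size_t n,
                       const char *var, size_t pos, long *out)
{
  assert (pos <= n);
  for (size_t i = pos; i-- > 0;)
    if (stream[i].kind == INSN_DEBUG && strcmp (stream[i].name, var) == 0)
      {
        *out = stream[i].value;
        return true;
      }
  return false;
}
```

Asking for `i` just before the call to `f` (index 2) finds the note binding it to 3, and just before the call to `g` (index 6) finds 2. The note binding `i` to 7 is only visible at index 4, where no real instruction sits, which is the empty-range case discussed in the message. The backward scan is valid only for straight-line code like this example.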
* Re: Designs for better debug info in GCC 2007-11-12 18:45 ` Alexandre Oliva @ 2007-11-12 18:49 ` Joe Buck 2007-11-25 6:57 ` Alexandre Oliva 2007-11-12 18:53 ` Ian Lance Taylor ` (2 subsequent siblings) 3 siblings, 1 reply; 150+ messages in thread From: Joe Buck @ 2007-11-12 18:49 UTC (permalink / raw) To: Alexandre Oliva Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On Mon, Nov 12, 2007 at 03:52:01PM -0200, Alexandre Oliva wrote:
> On Nov 12, 2007, Mark Mitchell <mark@codesourcery.com> wrote:
>
> > (We may already have lost some information, though. For example, given:
> >   i = 3;
> >   f(i);
> >   i = 7;
> >   i = 2;
> >   g(i);
> > we may well have lost the "i = 7" assignment, so "i" might appear to
> > have the value "3" right before we assign "2" to it, if we were to
> > generate debug information right then.)
>
> Yup. And even if we could somehow preserve that information, there
> wouldn't be any code to attach that information to. There might be
> uses for empty-range locations in debug information, but I can't think
> of any. Can anyone? It's something we could try to preserve, and
> with my design it would be quite easy to do so, but unless it's useful
> for some purpose, I think we could just do away with it.

If we drop the "i = 7" assignment, then a debugger could have a consistent view of what is going on if, given

  i = 3; // line 10
  f(i);  // line 11
  i = 7; // line 12
  i = 2; // line 13
  g(i);  // line 14

"next" would step from line 10, to 11, to 12, to 14. We would not be able to stop after the execution of a no-longer-existing statement; if we could stop at the beginning of line 13, it would imply that line 12 has run and line 13 has not, which does not reflect what the optimized code is doing. We don't do it this way at the moment; we would be able to set a breakpoint at line 13. 
But perhaps the right way to think about your project, Alexandre, is to make things match up at the point where the gdb user can observe the state, and consider dropping observable points where the states will not match. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 18:49 ` Joe Buck @ 2007-11-25 6:57 ` Alexandre Oliva 2007-11-25 12:09 ` Richard Kenner 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-25 6:57 UTC (permalink / raw) To: Joe Buck Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 12, 2007, Joe Buck <Joe.Buck@synopsys.COM> wrote: > consider dropping observable points where the states will not match. We can't really do that. The line number mapping is from PC to line number, regardless of how far into the execution of earlier lines the code is. Omitting certain mappings from PC to line numbers would be wrong. The piece of the puzzle we're still missing is how to get debuggers clever enough to decide where to set a breakpoint. Nowadays, debuggers (at least those I'm familiar with) tend to set breakpoints at the lowest-numbered PC corresponding to a given source line number. While this is useful at times, at other times what you want is the lowest PC after all instructions corresponding to the previous line, because at that point you know all the state of the previous line should be stable and hopefully still observable. Or something along these lines. I don't have a complete solution for this problem. It's very far from trivial, and I don't see that debug information can carry enough information for the compiler to aid the debugger in selecting where to place breakpoints in this regard. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
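The two breakpoint placements contrasted in the message above can be sketched over a toy PC-to-line table. This is a simplified illustration, not how GDB or any other debugger is implemented: `line_entry`, `bp_lowest_pc` and `bp_after_earlier_lines` are hypothetical names, "the previous line" is approximated as "every entry with a lower line number", and the code is assumed to be straight-line (no loops or branches).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One row of a toy line table: the instruction at PC came from LINE.  */
struct line_entry {
  unsigned long pc;
  int line;
};

/* Common strategy: the lowest PC attributed to LINE.  */
static bool
bp_lowest_pc (const struct line_entry *t, size_t n, int line,
              unsigned long *pc)
{
  bool found = false;
  assert (pc != NULL);
  for (size_t i = 0; i < n; i++)
    if (t[i].line == line && (!found || t[i].pc < *pc))
      {
        *pc = t[i].pc;
        found = true;
      }
  return found;
}

/* Alternative: the lowest PC attributed to LINE that lies after every
   instruction attributed to a lower-numbered line.  */
static bool
bp_after_earlier_lines (const struct line_entry *t, size_t n, int line,
                        unsigned long *pc)
{
  bool have_earlier = false, found = false;
  unsigned long last_earlier = 0;
  assert (pc != NULL);
  for (size_t i = 0; i < n; i++)
    if (t[i].line < line && (!have_earlier || t[i].pc > last_earlier))
      {
        last_earlier = t[i].pc;
        have_earlier = true;
      }
  for (size_t i = 0; i < n; i++)
    if (t[i].line == line
        && (!have_earlier || t[i].pc > last_earlier)
        && (!found || t[i].pc < *pc))
      {
        *pc = t[i].pc;
        found = true;
      }
  return found;
}
```

With a scheduler-interleaved table such as `{0x10, 10}, {0x14, 11}, {0x18, 10}, {0x1c, 11}, {0x20, 12}`, the first strategy stops for line 11 at 0x14, while part of line 10 has yet to run; the second picks 0x1c, after the last instruction of line 10. The second strategy can also fail to find any PC, which hints at why the problem is described as far from trivial.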
* Re: Designs for better debug info in GCC 2007-11-25 6:57 ` Alexandre Oliva @ 2007-11-25 12:09 ` Richard Kenner 0 siblings, 0 replies; 150+ messages in thread From: Richard Kenner @ 2007-11-25 12:09 UTC (permalink / raw) To: aoliva; +Cc: Joe.Buck, gcc-patches, gcc, iant, mark, richard.guenther > The piece of the puzzle we're still missing is how to get debuggers > clever enough to decide where to set a breakpoint. Nowadays, debuggers > (at least those I'm familiar with) tend to set breakpoints at the > lowest-numbered PC corresponding to a given source line number. While > this is useful at times, at other times what you want is the lowest PC > after all instructions corresponding to the previous line, because at > that point you know all the state of the previous line should be stable > and hopefully still observable. Or something along these lines. I don't > have a complete solution for this problem. It's very far from trivial, > and I don't see that debug information can carry enough information for > the compiler to aid the debugger in selecting where to place breakpoints > in this regard. Or you want the first instruction of that line that shows the actual flow of control. Or sometimes other things, as you say. A few of us were discussing this issue in person last week and we strongly agree with your characterization that it's very far from trivial. The consensus we came to is that the compiler should continue associating the original line number with each instruction that came from it, but perhaps should also provide additional, not-yet-defined annotations to allow the debugger to be able to provide various different types of breakpoints, corresponding to various purposes the programmer is using the breakpoints for. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 18:45 ` Alexandre Oliva 2007-11-12 18:49 ` Joe Buck @ 2007-11-12 18:53 ` Ian Lance Taylor 2007-11-24 2:12 ` Alexandre Oliva 2007-11-13 10:30 ` Mark Mitchell 2007-11-13 15:30 ` Michael Matz 3 siblings, 1 reply; 150+ messages in thread From: Ian Lance Taylor @ 2007-11-12 18:53 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Mark Mitchell, Richard Guenther, gcc-patches, gcc Alexandre Oliva <aoliva@redhat.com> writes: > > Why does it make sense to have that, rather than notes on > > instructions that say what affect the instruction has on user > > variables? > > Few instructions need such notes, so the proposal of growing SET by > 33% doesn't quite appeal to me. We could add a note to the relevant instructions. We don't need to change the SET representation. That approach would only increase memory usage for relevant instructions. > And then, optimizations move > instructions around, but I don't think they should move the assignment > notes around, for they should reflect the structure of the source > program, rather than the mangled representation that the optimizers > turn it into. I'm not sure I follow this. If the equivalent of some source code line is hoisted out of a loop, shouldn't the user variable assignments follow it? After the scheduler has run over a large basic block, the structure of the source program is gone. Are we going to somehow try to retain it in the debugging information? Does that make sense? Side note: I think it would be unwise to discuss specific patents on this public mailing list. I think that where we have specific patent concerns, the steering committee should raise them on a telephone call with the FSF and/or the SFLC. If you have concerns about a specific patent, I recommend that you telephone some member of the SC, or send e-mail directly to that person. Ian ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 18:53 ` Ian Lance Taylor @ 2007-11-24 2:12 ` Alexandre Oliva 0 siblings, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 2:12 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Mark Mitchell, Richard Guenther, gcc-patches, gcc On Nov 12, 2007, Ian Lance Taylor <iant@google.com> wrote: > Alexandre Oliva <aoliva@redhat.com> writes: >> And then, optimizations move instructions around, but I don't think >> they should move the assignment notes around, for they should >> reflect the structure of the source program, rather than the >> mangled representation that the optimizers turn it into. > I'm not sure I follow this. If the equivalent of some source code > line is hoisted out of a loop, shouldn't the user variable assignments > follow it? Why should it? The user is entitled to expect the variable to be set to that value at the right point in the program, no earlier than that. Before the assignment point in the program, we ought to note that the variable holds its previous value, or that its previous value is no longer available. But noting it holds a value it should only hold at a later point doesn't seem right to me. Consider, again, the example: f(int x, int y) { int c; c = x; do_something_with_c(); c = y; do_something_with_c(); } If we optimize away the assignments c=x and c=y, and just use x and y instead (assume c is not otherwise modified), what should we note in debug info? Should we pretend that c is dead all over, just because it was optimized away? Should we note that it's live in both x and y registers/stack slots? Or should we vary its location between x and y, at the assignment points, as expected by the user? Now, what if f() is inlined into a loop, such that c could be versioned and the assignments to it could be hoisted, because x and y don't vary? Should this then change the debug information generated for variable c from the IMHO correct points to the loop entry points? 
> After the scheduler has run over a large basic block, the > structure of the source program is gone. The mapping becomes more difficult, yes. But the structure of the source program remains untouched, in the source program. And debug information is about mapping source concepts to implementation concepts. So we should try to map source concepts that remain in the implementation to the remaining implementation concepts. > Side note: I think it would be unwise to discuss specific patents on > this public mailing list. I think that where we have specific patent > concerns, the steering committee should raise them on a telephone call > with the FSF and/or the SFLC. If you have concerns about a specific > patent, I recommend that you telephone some member of the SC, or send > e-mail directly to that person. That makes sense. I hadn't actually seen that patent before the day I mentioned it, and I still haven't got 'round to reading it. I just thought it would be wise to inform people about the danger of going down that path, but now I realize it may not have been wise at all. Sorry for not thinking about it. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 18:45 ` Alexandre Oliva 2007-11-12 18:49 ` Joe Buck 2007-11-12 18:53 ` Ian Lance Taylor @ 2007-11-13 10:30 ` Mark Mitchell 2007-11-24 1:54 ` Alexandre Oliva 2007-11-13 15:30 ` Michael Matz 3 siblings, 1 reply; 150+ messages in thread From: Mark Mitchell @ 2007-11-13 10:30 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Alexandre Oliva wrote: >> What I don't understand is how it's actually going to work. What >> are the notes you're inserting? > > They're always of the form > > DEBUG user-variable = expression Good, I understand that now. Why is this better than associating user variables with assignments? In other words, if we have: X = E; where X is the location in which a user variable V is presently being stored, we could just put a note on the assignment that says "assigns to user variable V". If X is, for example, a hard register, and we're now clobbering the value of a user variable V (so that the value of the variable is no longer available there), we can add a note that says "clobbers user variable V". (The value might still be available somewhere else; we can figure that out by seeing if any instruction that is annotated as setting V dominates this instruction, without an intervening clobbering of that location.) > That said, growing SET to add to it a list of variables (or components > thereof) that the variable is assigned to could be made to work, to > some extent. But when you optimize away such a set, you'd still have > to keep the note around Why? It seems to me that if we're no longer doing the assignment, then the location where the value of the user variable can be found (if any) is not changing at this point. 
> (set (reg i) (const_int 3)) ;; assigns to i > (set (reg P1) (reg i)) > (call (mem f)) > (set (reg i) (const_int 7)) ;; assigns to i > (set (reg i) (const_int 2)) ;; assigns to i > (set (reg P1) (reg i)) > (call (mem g)) > > could have been optimized to: > > (set (reg P1) (const_int 3)) > (call (mem f)) > (set (reg P1) (const_int 2)) > (call (mem g)) > > and then you wouldn't have any debug information left for variable i. Actually, you would, in the method I'm making up. In particular, both of the first two lines in the top example (setting "i" and setting "P1") would be marked as providing the value of the user variable "i". The first line obviously has the value of "i", so we would have a "value of i" note. The second would also have a "value of i" note because its copying a value with such a note. What I'm suggesting is that this is something akin to a dataflow problem. We start by marking user variables, in the original TREE representation. Then, any time we copy the value of a user variable, we know that what we're doing is providing another place where we can find the value of that user variable. Then, when generating debug information, for every program region, we can find the location(s) where the value of the user variable is available, and we can output any one of those locations for the debugger. Now, of course, we can generate more compact information by trying to use the same location as often as possible, but that's just an optimization problem. This method gives us accurate debug information, in the sense that if we say that the value of V is at location X, then it is in fact there, and the value there is a value assigned to V. It does not necessarily give us complete information, though, in that there may be times when the value is somewhere and we don't realize it. 
Like, if: x = y + 3; f(x); is optimized to: f(y + 3) Then, right before the call to "f", we might not know that the value of "x" is available, or we might say that "x" has a previous value. As a special case of incompleteness, this fails utterly with respect to variables whose values are constants if those variables are then optimized away. If there's no location holding the constant, then the method I've proposed will say that the value is unavailable -- rather than cleverly telling the debugger that the value is a constant. I don't see that as an unreasonable limitation when debugging optimized code, but that's open for debate. I'm not claiming this is better than what you're suggesting. I'm just throwing it out there. -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
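[Editorial sketch, not part of the original thread.] The dataflow-style bookkeeping described in this message can be sketched for straight-line code as a map from locations to the set of user variables whose current value each location holds. The names (struct var_map, vm_assign, vm_copy, vm_clobber, vm_find), the fixed number of locations, and the rule that a fresh assignment invalidates older copies are all assumptions made for illustration; none of this is GCC code, and it ignores control flow entirely.

```c
#include <string.h>

enum { NUM_LOCS = 4, NO_LOC = -1 };

/* For each location (register or stack slot), a bitmask of the user
   variables whose current value is known to live there. */
struct var_map {
    unsigned vars_in[NUM_LOCS];
};

void vm_init(struct var_map *m)
{
    memset(m, 0, sizeof *m);
}

/* "assigns to user variable V": loc now holds a new value of var, so
   any older copies elsewhere no longer hold the current value. */
void vm_assign(struct var_map *m, int loc, int var)
{
    unsigned bit = 1u << var;
    for (int i = 0; i < NUM_LOCS; i++)
        m->vars_in[i] &= ~bit;
    m->vars_in[loc] = bit;
}

/* Copying a location also copies the "value of V" notes. */
void vm_copy(struct var_map *m, int dst, int src)
{
    m->vars_in[dst] = m->vars_in[src];
}

/* "clobbers user variable V": the location is overwritten with
   something unrelated to any user variable. */
void vm_clobber(struct var_map *m, int loc)
{
    m->vars_in[loc] = 0;
}

/* Any location currently holding var, or NO_LOC if the value is
   (as far as this bookkeeping knows) unavailable. */
int vm_find(const struct var_map *m, int var)
{
    for (int i = 0; i < NUM_LOCS; i++)
        if (m->vars_in[i] & (1u << var))
            return i;
    return NO_LOC;
}
```

Taking location 0 as (reg i), location 1 as (reg P1) and variable 0 as i: after vm_assign(&m, 0, 0) and vm_copy(&m, 1, 0), clobbering location 0 still leaves i findable in location 1. Clobbering location 1 as well makes vm_find report NO_LOC, which is the incomplete-but-accurate outcome the message describes.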
* Re: Designs for better debug info in GCC 2007-11-13 10:30 ` Mark Mitchell @ 2007-11-24 1:54 ` Alexandre Oliva 0 siblings, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 1:54 UTC (permalink / raw) To: Mark Mitchell; +Cc: Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 13, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > Alexandre Oliva wrote: >>> What I don't understand is how it's actually going to work. What >>> are the notes you're inserting? >> >> They're always of the form >> >> DEBUG user-variable = expression > Good, I understand that now. > Why is this better than associating user variables with assignments? I've already explained that, but let me try to sum it up again. If we annotate assignments, then not only do the annotations move around along with assignments (I don't think that's desirable), but when we optimize such assignments away, the annotations are either dropped or have to stand on their own. Since dropping annotations and moving them around are precisely opposed the goal of making debug information accurate, then keeping the annotations in place and enabling them to stand on their own is the right thing to do. Now, since we have to enable them to stand on their own, then we're faced with the following decision: either we make that the canonical annotation representation all the way from the beginning, or we piggyback the annotations on assignments until they're moved or removed, at which point they become stand-alone annotations. The former seems much more maintainable and simpler to deal with, and I don't see that there's a significant memory or performance penalty to this. >> That said, growing SET to add to it a list of variables (or components >> thereof) that the variable is assigned to could be made to work, to >> some extent. But when you optimize away such a set, you'd still have >> to keep the note around > Why? 
It seems to me that if we're no longer doing the assignment, then > the location where the value of the user variable can be found (if any) > is not changing at this point. The thing is that the *location* of the user variable is changing at that point. Either because its previous value was unavailable, or because it had remained only at a different location. Only at the point of the assignment should we associate the variable with the location that holds its current value. >> (set (reg i) (const_int 3)) ;; assigns to i >> (set (reg P1) (reg i)) >> (call (mem f)) >> (set (reg i) (const_int 7)) ;; assigns to i >> (set (reg i) (const_int 2)) ;; assigns to i >> (set (reg P1) (reg i)) >> (call (mem g)) >> >> could have been optimized to: >> >> (set (reg P1) (const_int 3)) >> (call (mem f)) >> (set (reg P1) (const_int 2)) >> (call (mem g)) >> >> and then you wouldn't have any debug information left for variable i. > Actually, you would, in the method I'm making up. In particular, both > of the first two lines in the top example (setting "i" and setting "P1") > would be marked as providing the value of the user variable "i". Yes, this works in this very simple case. But it doesn't when i is assigned, at different points, to the values of two separate variables, that are live and initialized much earlier in the program. Using the method you seem to be envisioning would extend the life of the binding of variable 'i' to the life of the two other variables, ending up with two overlapping and conflicting live ranges for i, or it would have to drop one in favor of the other. You can't possibly retain correct (non-overlapping) live ranges for both unless you keep notes at the points of assignment. 
To make the example clear, consider: (set (reg x [x]) ???1) (set (reg y [y]) ???2) (set (reg i [i]) (reg x [x])) (set (reg P1) (reg i)) (call (mem f)) (set (reg i [i]) (reg y [y])) (call (mem g)) (set (reg P1) (reg i)) (call (mem f)) if it gets optimized to: (set (reg P1 [x, i]) ???1) (set (reg y [y, i]) ???2) (call (mem f)) (call (mem g)) (set (reg P1) (reg y)) (call (mem f)) then we lose. There's no way you can emit debug information for i based on these annotations such that, at the call to g, the value of i is correct. Even if you annotate the copy from y to P1, you still won't have it right, and, worse, you won't even be able to tell that, before the call to g, i should have held a different value. So you'll necessarily emit incorrect debug information for this case: you'll state i still holds a value at a point in which it shouldn't hold that value any more. This is worse than stating you don't know what the value of i is. > What I'm suggesting is that this is something akin to a dataflow > problem. We start by marking user variables, in the original TREE > representation. Then, any time we copy the value of a user variable, we > know that what we're doing is providing another place where we can find > the value of that user variable. Then, when generating debug > information, for every program region, we can find the location(s) where > the value of the user variable is available, and we can output any one > of those locations for the debugger. That's exactly what I have in mind. > This method gives us accurate debug information, in the sense that if we > say that the value of V is at location X, then it is in fact there, and > the value there is a value assigned to V. It does not necessarily give > us complete information, though, in that there may be times when the > value is somewhere and we don't realize it. 
Like, if: > x = y + 3; > f(x); > is optimized to: > f(y + 3) > Then, right before the call to "f", we might not know that the value of > "x" is available, or we might say that "x" has a previous value. It's not just previous value. It can be arbitrarily wrong value too. Consider again the conditional case: foo (int x, int y, int z) { int c = z; whatever0(c); c = x; whatever1(); if (some_condition) { whatever2(); c = y; whatever3(); } whatever4(c); } In the tree representation, the assignments to c just go away, in favor of a PHI node that takes x from the !some_condition block and y from the some_condition block. So, you could recover the correct value for c at the PHI node, but since the other assignments are all dropped, you can at best figure out that you don't know the value held by c between whatever1() and the PHI node, and at worst claim that it's z or x or y, or even both x and y, depending on how you update the notes. > method I've proposed will say that the value is unavailable [when > it's a constant and the assignment is optimized away] I don't see how, unless you keep a note saying at least that the variable was modified to an unknown value at that point. > I don't see that as an unreasonable limitation when debugging > optimized code, but that's open for debate. If it did that reliably, then it would be a reasonable limitation, indeed, for it would be accurate, even if incomplete. It would no longer be a correctness issue, just a quality of implementation issue. But then, I'm yet to understand how you'd generate debug info to note that the value is unavailable if you don't keep notes around to indicate the point of the assignment that was optimized away. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 18:45 ` Alexandre Oliva ` (2 preceding siblings ...) 2007-11-13 10:30 ` Mark Mitchell @ 2007-11-13 15:30 ` Michael Matz 2007-11-24 2:00 ` Alexandre Oliva 3 siblings, 1 reply; 150+ messages in thread From: Michael Matz @ 2007-11-13 15:30 UTC (permalink / raw) To: Alexandre Oliva Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Hi, On Mon, 12 Nov 2007, Alexandre Oliva wrote: > > Why does it make sense to have that, rather than notes on instructions > > that say what affect the instruction has on user variables? > > Few instructions need such notes, so the proposal of growing SET by 33% > doesn't quite appeal to me. Though I haven't produced hard numbers yet, the fact that every SET now contains an additional pointer is less of an issue than one might think. There only ever exists one RTL body at each point in time, hence the memory use for RTL is vastly dominated by the memory use of GIMPLE, which exists for all functions at the same time. Having this annotation in the SET is just the aesthetically most pleasing place. If you do it with notes on insns you have issues with multi-set insns, and you have to move them around in case you change the insns. Putting them in the SET itself keeps them up-to-date nearly automatically (of course you still have to touch them once in a while). > That said, growing SET to add to it a list of variables (or components > thereof) that the variable is assigned to could be made to work, to > some extent. But when you optimize away such a set, you'd still have > to keep the note around, so it's not clear to me that adding code all > over to maintain the notes in place when the SETs go away or are > juggled around would bring us any advantage. The nice thing is that there are only a few places which really get rid of SETs: remove_insn. You have to tweak that to keep the information around, not much else (though that claim remains to be proven :) ). Ciao, Michael. 
^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-13 15:30 ` Michael Matz @ 2007-11-24 2:00 ` Alexandre Oliva 2007-11-26 21:01 ` Michael Matz 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 2:00 UTC (permalink / raw) To: Michael Matz Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 13, 2007, Michael Matz <matz@suse.de> wrote: > The nice thing is, that there are only few places which really get rid of > SETs: remove_insn. You have to tweak that to keep the information around, > not much else (though that claim remains to be proven :) ). And then, you have to tweak everything else to keep the note that replaced the set up to date as you further optimize the code. So what was the point of adding the note to the SET, again? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-24 2:00 ` Alexandre Oliva @ 2007-11-26 21:01 ` Michael Matz 2007-11-27 5:31 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Michael Matz @ 2007-11-26 21:01 UTC (permalink / raw) To: Alexandre Oliva Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Hi, On Fri, 23 Nov 2007, Alexandre Oliva wrote: > On Nov 13, 2007, Michael Matz <matz@suse.de> wrote: > > > The nice thing is, that there are only few places which really get rid of > > SETs: remove_insn. You have to tweak that to keep the information around, > > not much else (though that claim remains to be proven :) ). > > And then, you have to tweak everything else to keep the note that > replaced the set up to date as you further optimize the code. No. remove_insn() would replace the SET with a note. It would look for other SETs to which the information that would otherwise be lost could be attached. After all, there must have been a reason for the SET to be deleted: the destination is dead, hence whatever user-variables were associated with it also are dead. (If they also lie in other places, those are not affected.) So it's okay to completely get rid of the SET and decl associations. One special case of the above is when a SET is deleted which is a copy, where the LHS was associated with some variables, but the RHS was not. From that point on we can (under certain circumstances) associate the RHS with the decls (by changing its initial SET). Ciao, Michael. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-26 21:01 ` Michael Matz @ 2007-11-27 5:31 ` Alexandre Oliva 2007-11-27 20:31 ` Michael Matz 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-27 5:31 UTC (permalink / raw) To: Michael Matz Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 26, 2007, Michael Matz <matz@suse.de> wrote: > Hi, > On Fri, 23 Nov 2007, Alexandre Oliva wrote: >> On Nov 13, 2007, Michael Matz <matz@suse.de> wrote: >> >> > The nice thing is, that there are only few places which really get rid of >> > SETs: remove_insn. You have to tweak that to keep the information around, >> > not much else (though that claim remains to be proven :) ). >> >> And then, you have to tweak everything else to keep the note that >> replaced the set up to date as you further optimize the code. > No. remove_insn() would replace the SET with a note. What information would this note convey? > After all, there must have been a reason for the SET to be deleted: > the destination is dead, hence whatever user-variables were > associated with it also are dead. Not quite. The destination could be merely redundant. And the difference is crucial. If you delete a copy (or some other redundant computation, you don't seem to handle this case) that would install a value in a variable that is available elsewhere, and then adjust the uses of the variable such that they use the value elsewhere, you ought to note that the variable holds that value, and at that point. If you delete a computation because the result is completely unused, then you ought to note that you no longer know the value of the variable (or, ideally, that the variable would hold the result of that computation if there was code to compute it). In both cases, you ought to note that earlier values of the variable are no longer current at that point. In both cases, the notion of "at that point" is crucial, especially when you deal with conditional assignments. 
You don't want to make it seem like a conditional assignment applies when the condition doesn't hold. Consider: int foo(bool p, int x, int y) { int i = x; p1(); if (p) i = y; p2(); i++; p3(i); } int main() { foo (false, 3, 5); } At p1()'s caller's frame, you want i to hold the value 3. At p2()'s, you want i to still hold the value 3. At p3(int)'s, it should be 4. Now, if you change the program such that p is true, then at p1 i is still 3, but at p2 it ought to be 5, and at p3(int)'s it should be 6. How do you get that if you drop the assignments on the floor, or even if you replace the assignments with notes that don't keep the correct values associated not only with the names, but also with the points in the program? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
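[Editorial sketch, not part of the original thread.] The conditional-assignment example above can be made runnable by recording the value of i at the three points where a debugger user would inspect it. The struct observed type and the inlined recording in place of p1(), p2() and p3(i) are assumptions made for illustration. The recorded values are what correct debug information should let a debugger report at each call, regardless of how the optimizer rewrites the assignments.

```c
#include <stdbool.h>

/* Value of i that a user would expect to see in foo's frame while
   stopped in each of the three callees. */
struct observed {
    int at_p1;
    int at_p2;
    int at_p3;
};

struct observed foo(bool p, int x, int y)
{
    struct observed seen;
    int i = x;

    seen.at_p1 = i;     /* stands in for p1() */
    if (p)
        i = y;          /* must only appear to take effect when p holds */
    seen.at_p2 = i;     /* stands in for p2() */
    i++;
    seen.at_p3 = i;     /* stands in for p3(i) */
    return seen;
}
```

Calling foo(false, 3, 5) records 3, 3 and 4, and foo(true, 3, 5) records 3, 5 and 6, matching the values given in the message.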
* Re: Designs for better debug info in GCC 2007-11-27 5:31 ` Alexandre Oliva @ 2007-11-27 20:31 ` Michael Matz 2007-11-27 21:44 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Michael Matz @ 2007-11-27 20:31 UTC (permalink / raw) To: Alexandre Oliva Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Hi, On Mon, 26 Nov 2007, Alexandre Oliva wrote: > >> And then, you have to tweak everything else to keep the note that > >> replaced the set up to date as you further optimize the code. > > > No. remove_insn() would replace the SET with a note. > > What information would this note convey? Oh my, sorry for adding confusion to the topic: I meant to write "would _not_ replace the SET with a note". Ciao, Michael. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-27 20:31 ` Michael Matz @ 2007-11-27 21:44 ` Alexandre Oliva 0 siblings, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-27 21:44 UTC (permalink / raw) To: Michael Matz Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 27, 2007, Michael Matz <matz@suse.de> wrote: > Hi, > On Mon, 26 Nov 2007, Alexandre Oliva wrote: >> >> And then, you have to tweak everything else to keep the note that >> >> replaced the set up to date as you further optimize the code. >> >> > No. remove_insn() would replace the SET with a note. >> >> What information would this note convey? > Oh my, sorry for adding confusion to the topic: I meant to write "would > _not_ replace the SET with a note". Aah, ok. So, you do indeed completely lose track of the crucial differences between the two cases for the removal of a SET. And not only about their implications, but also about where they ought to take effect. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 0:00 ` Mark Mitchell 2007-11-08 0:15 ` David Edelsohn 2007-11-08 5:44 ` Alexandre Oliva @ 2007-11-08 9:54 ` Richard Guenther 2 siblings, 0 replies; 150+ messages in thread From: Richard Guenther @ 2007-11-08 9:54 UTC (permalink / raw) To: Mark Mitchell; +Cc: Ian Lance Taylor, Alexandre Oliva, gcc-patches, gcc On 11/8/07, Mark Mitchell <mark@codesourcery.com> wrote: > Ian Lance Taylor wrote: > > > At one time, gcc actually provided better debugging of optimized code > > than any other compiler, though I don't know if that is still true. > > Optimized gcc code is still debuggable today. I do it all the time. > > (For me poor support for debugging C++ is a much bigger issue, though > > I think that is an issue more with gdb than with gcc.) > > I think we all agree that providing better debugging of optimized code > is a priori a good thing. So, as I see it, this thread is focused on > what internal representation we might use for that. > > I don't know that there's an abstract right answer to whether something > NOTE-like or something on the side is better. There are problems with > both approaches. We know the NOTE/DEBUG_INSN thing is going to break, > from experience; we also know the on-the-side thing is going to be hard > to maintain. I think we're going to find out once both approaches are implemented up to a way that they reasonably to what they want to do. So I'm fine to defer this decision up to that point (or the point where we start the fighting on which approach will get merged). > Alexandre has clearly thought about this a lot. I'd like to start by > capturing the functional changes that we want to make to GCC's debug > output -- not the changes that we want in the debug experience, or > changes that we need in GDB, but the changes in the generated DWARF. > > For example, I'm thinking of a series of function test cases. Ignore > the substance of this example -- I'm making it up! 
-- I'm just trying to > capture the form. > > === > int main () { int i; i = 3; return i; } > > When optimizing, "i" is optimized away. The debug info for "i" right > before the return statement says "i has been optimized away", but not > what its value is. I think it should say that the value is "3". To do > that, we need to emit a DW_Now_My_Value_is_3 tag for "i". > === > > Now, how is whatever representation we pick going to get us that? Is > the Oliva representation sufficient? What about the Guenther/Matz > representation? Independently of the representation, what algorithms > are we going to use to track whatever we need to track as the optimizers > remove, insert, duplicate, and reorder code? For the example above, the representation we use on the tree level cannot attach a name to '3' (since obviously '3' is not an SSA_NAME). But this is fixable if we think it is worthwhile. > Until we all know what we're trying to do, I don't see how we can make a > good decision about the representation. Clearly, in the abstract, we > can represent data either on-the-side or in the instruction stream, but > until we know what output we want, I'm not sure how we can pick. That's true. I was also thinking about how to properly do testcases for both kinds of infrastructure. At the moment I scan tree/rtl dumps for the names I want to preserve, but ultimately it would be nice to be able to run gdb testcases in the gcc tree to also verify 'correctness' of the information we produce (and not just existence of some information). Richard. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-07 22:57 ` Ian Lance Taylor 2007-11-07 23:05 ` Daniel Jacobowitz 2007-11-08 0:00 ` Mark Mitchell @ 2007-11-08 5:01 ` Alexandre Oliva 2007-11-08 18:15 ` Alexandre Oliva 2007-11-08 19:13 ` Ian Lance Taylor 2007-11-08 8:58 ` Paolo Bonzini 3 siblings, 2 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 5:01 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Richard Guenther, gcc-patches, gcc On Nov 7, 2007, Ian Lance Taylor <iant@google.com> wrote: >> Does it really matter? Do we compromise standards compliance (and so >> violently, while at that) in any aspect of the compiler? > What standards are you talking about? Debug information standards such as DWARF-3. > I'm not aware of any standard for debuggability of optimized code. I'm talking about standards that specify how a compiler should encode meta-information about how source code concepts map to the code it generated. See, for example, section 2.6 in the Dwarf-3 specification. It talks very little about optimization, but it does discuss what a DW_AT_location, if present, means. It doesn't say anything like: "if a variable is available at a certain location most of the time, you can emit a DW_AT_location that refers to that location". It says: Debugging information must provide consumers a way to find the location of program variables, determine the bounds of dynamic arrays and strings, and possibly to find the base address of a subroutine’s stack frame or the return address of a subroutine See, it's not about debuggers, it's about consumers. It's an obligation, not really an option (that said, DW_AT_location *is* optional). 1. Location expressions, which are a language independent representation of addressing rules of arbitrary complexity built from DWARF expressions. 
They are sufficient for describing the location of any object as long as its lifetime is either static or the same as the lexical block that owns it, and it does not move throughout its lifetime. 2. Location lists, which are used to describe objects that have a limited lifetime or change their location throughout their lifetime. Nowhere does it state that, "if the compiler can't quite keep track of the location of a variable, it can be sloppy and emit just whatever is simpler or appears to make sense". Address ranges may overlap. When they do, they describe a situation in which an object exists simultaneously in more than one place. If all of the address ranges in a given location list do not collectively cover the entire range over which the object in question is defined, it is assumed that the object is not available for the portion of the range that is not covered. So, it does make room for *some* sloppiness, after all. That's what I refer to as "incompleteness of debug information". If we fail to keep track of where an object is, it's sort-of ok (although undesirable) to emit debug information that omits the location of the object in certain program regions where it might be live. However, it is not standard-compliant to emit information stating that the object is available at certain locations if it is NOT really there, or if it is available elsewhere, in addition to or instead of the locations we've emitted. That's what I refer to as "incorrectness of debug information". Incorrectness in the compiler output is always a bug. No matter how hard it is to implement, or how resource-intensive the solution is, arguing that we've made a trade-off and decided to generate wrong output for this case is a clever decision. Incompleteness is a completely different issue. This is where we *can* afford to make trade-offs. 
Just like we can decide to omit certain optimizations, or to not carry them out to the greatest possible extent, or to experiment with various different heuristics, we could afford to emit incomplete debug information, it's "just" a quality of implementation issue. But not incorrect debug information, that's just a bug. > gcc's users are definitely calling for a faster compiler. Are they > calling for better debuggability of optimized code? This is not just about debuggability, as I've tried to explain all the way from the beginning of the discussion, maybe a couple of months ago. Debug information is not just about debuggers any more. There are good reasons why the Dwarf-3 standard says "consumers" rather than "debuggers". It's no longer just a matter of convenience, recompile with -g0 if you want to debug it. It's a matter of correctness, for various monitoring tools now rely on this meta-information, and rightfully so. >> > We've fixed many many bugs and misoptimizations over the years due to >> > NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake >> > we've made in the past. >> >> That's a valid concern. However, per this reasoning, we might as well >> push every operand in our IL to separate representations, because >> there have been so many bugs and misoptimizations over the years, >> especially when the representation didn't make transformations >> trivially correct. > Please don't use strawman arguments. It's not, really. A reference to an object within a debug stmt or insn is very much like any other operand, in that most optimizer passes must keep them up to date. If you argue for pushing them outside the IL, why would any other operands be different? > As I understand your proposal, it materializes variables which were > otherwise omitted from the generated program. It doesn't address the > other issues with debugging optimized code, like bouncing around > between program lines. Is that correct? What else does your proposal > do? 
All it does is to try to carry information about what value the user is entitled to expect a variable to hold at each point in the program throughout compilation. Such that, even if the compiler doesn't retain something that represents only that variable through to the end of the compilation, we still have information about where, or at least what, its value is, if it is available anywhere, such that we can include this piece of data in the debug information. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
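The location-list rules quoted from the DWARF-3 text in the message above can be pictured with a small lookup routine. This is only a sketch under stated assumptions: `struct loc_entry`, `locations_at`, and the integer `where` field are hypothetical names invented for illustration, and the integer merely stands in for a real DWARF location expression. It shows the two properties the quoted text relies on: overlapping ranges mean the object is in several places at once, and a pc covered by no range means the object is not available there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one location-list entry: the object
   lives at `where` for every pc in [begin, end).  Real DWARF entries carry
   a location expression; an integer id stands in for it here. */
struct loc_entry {
    uint64_t begin;   /* first pc covered (inclusive) */
    uint64_t end;     /* first pc past the range (exclusive) */
    int where;        /* stand-in for a location expression */
};

/* Collect every location that is valid at `pc`.  Ranges may overlap, so
   more than one entry can match.  Returns the number of matches written
   to `out` (at most `cap`); 0 means "not available at this pc". */
size_t locations_at(const struct loc_entry *list, size_t n, uint64_t pc,
                    int *out, size_t cap)
{
    assert(list != NULL || n == 0);
    assert(out != NULL || cap == 0);

    size_t found = 0;
    for (size_t i = 0; i < n && found < cap; i++) {
        if (pc >= list[i].begin && pc < list[i].end)
            out[found++] = list[i].where;
    }
    return found;
}
```

In these terms, an "incomplete" list is one that returns 0 for a pc where the object really does live somewhere, while an "incorrect" list is one that returns a location that does not hold the object at that pc.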
* Re: Designs for better debug info in GCC 2007-11-08 5:01 ` Alexandre Oliva 2007-11-08 18:15 ` Alexandre Oliva @ 2007-11-08 19:13 ` Ian Lance Taylor 2007-11-08 20:27 ` Alexandre Oliva 1 sibling, 1 reply; 150+ messages in thread From: Ian Lance Taylor @ 2007-11-08 19:13 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, gcc Alexandre Oliva <aoliva@redhat.com> writes: > On Nov 7, 2007, Ian Lance Taylor <iant@google.com> wrote: > > >> Does it really matter? Do we compromise standards compliance (and so > >> violently, while at that) in any aspect of the compiler? > > > What standards are you talking about? > > Debug information standards such as DWARF-3. ... > Incorrectness in the compiler output is always a bug. No matter how > hard it is to implement, or how resource-intensive the solution is, > arguing that we've made a trade-off and decided to generate wrong > output for this case is a clever decision. I'm sorry, I've thought about it, but I don't buy this argument. I'm certainly willing to talk about improving debug information for optimized code, and clearly it is more important to more people than I initially thought. However, I don't think your arguments that this is an issue comparable to code correctness are valid. Incorrect generated code is a fatal problem in a compiler. Incorrect debugging information is a quality of implementation issue. > >> > We've fixed many many bugs and misoptimizations over the years due to > >> > NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake > >> > we've made in the past. > >> > >> That's a valid concern. However, per this reasoning, we might as well > >> push every operand in our IL to separate representations, because > >> there have been so many bugs and misoptimizations over the years, > >> especially when the representation didn't make transformations > >> trivially correct. > > > Please don't use strawman arguments. > > It's not, really. 
A reference to an object within a debug stmt or > insn is very much like any other operand, in that most optimizer > passes must keep them up to date. If you argue for pushing them > outside the IL, why would any other operands be different? I think you misread me. I didn't argue for pushing debugging information outside the IL. I argued against a specific implementation--DEBUG_INSN--based on our experience with similar implementations. Ian ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 19:13 ` Ian Lance Taylor @ 2007-11-08 20:27 ` Alexandre Oliva 2007-11-08 21:26 ` Ian Lance Taylor 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 20:27 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Richard Guenther, gcc-patches, gcc On Nov 8, 2007, Ian Lance Taylor <iant@google.com> wrote: > However, I don't think your arguments that this is > an issue comparable to code correctness are valid. It *is* code correctness. Say, if the linker emitted incorrect addresses in an executable, but the kernel and dynamic loader didn't rely on those addresses, would it not still be a bug in the linker? And then, if those tools started relying on those addresses and exposed the problem, would it be right to tell them they must not rely on them because they were broken in the past and we don't feel like correcting the linker? So... The compiler is outputting code that tells other tools where to look for certain variables at run time, but it's putting incorrect information there. How can you possibly argue that this is not a code correctness issue? > Incorrect generated code is a fatal problem in a compiler. > Incorrect debugging information is a quality of implementation > issue. Incomplete debugging information is a quality of implementation issue, just like missed optimizations. Incorrect compiler output is a bug. Claiming it's not just because tools you happen to rely on don't care about that piece of information won't make it any less of a bug. It may make it a less important bug for some time, but it's still a bug. >> >> > We've fixed many many bugs and misoptimizations over the years due to >> >> > NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake >> >> > we've made in the past. >> >> >> >> That's a valid concern.
However, per this reasoning, we might as well >> >> push every operand in our IL to separate representations, because >> >> there have been so many bugs and misoptimizations over the years, >> >> especially when the representation didn't make transformations >> >> trivially correct. >> >> > Please don't use strawman arguments. >> >> It's not, really. A reference to an object within a debug stmt or >> insn is very much like any other operand, in that most optimizer >> passes must keep them up to date. If you argue for pushing them >> outside the IL, why would any other operands be different? > I think you misread me. I didn't argue for pushing debugging > information outside the IL. I argued against a specific > implementation--DEBUG_INSN--based on our experience with similar > implementations. Do you remember any other notes that contained actual rtx expressions and expected optimization passes to keep them accurate? All notes (as in matching NOTE_P) I remember didn't really contain rtx expressions. The first exception I remember is VAR_LOCATION, and this one explicitly does *not* want to be updated, for it's generated so late in the process. Conversely, REG_NOTES do contain rtx, and they often have to be updated, so that's the right representation for them. Do you think we'd gain anything by moving them to a separate, out-of-line representation? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 20:27 ` Alexandre Oliva @ 2007-11-08 21:26 ` Ian Lance Taylor 2007-11-09 9:53 ` Robert Dewar 2007-11-09 9:55 ` Seongbae Park (박성배, 朴成培) 0 siblings, 2 replies; 150+ messages in thread From: Ian Lance Taylor @ 2007-11-08 21:26 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, gcc Alexandre Oliva <aoliva@redhat.com> writes: > So... The compiler is outputting code that tells other tools where to > look for certain variables at run time, but it's putting incorrect > information there. How can you possibly argue that this is not a code > correctness issue? I don't see any point to going around this point again, so I'll just note that I disagree. > >> >> > We've fixed many many bugs and misoptimizations over the years due to > >> >> > NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake > >> >> > we've made in the past. > >> >> > >> >> That's a valid concern. However, per this reasoning, we might as well > >> >> push every operand in our IL to separate representations, because > >> >> there have been so many bugs and misoptimizations over the years, > >> >> especially when the representation didn't make transformations > >> >> trivially correct. > >> > >> > Please don't use strawman arguments. > >> > >> It's not, really. A reference to an object within a debug stmt or > >> insn is very much like any other operand, in that most optimizer > >> passes must keep them up to date. If you argue for pushing them > >> outside the IL, why would any other operands be different? > > > I think you misread me. I didn't argue for pushing debugging > > information outside the IL. I argued against a specific > > implementation--DEBUG_INSN--based on our experience with similar > > implementations. > > Do you remember any other notes that contained actual rtx expressions > and expected optimization passes to keep them accurate? No. 
> Do you think > we'd gain anything by moving them to a separate, out-of-line > representation? I don't know. I don't see such a proposal on the table, and I don't have one myself, so I don't know how to evaluate it. Ian ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 21:26 ` Ian Lance Taylor @ 2007-11-09 9:53 ` Robert Dewar 2007-11-12 5:36 ` Mark Mitchell 2007-11-09 9:55 ` Seongbae Park (박성배, 朴成培) 1 sibling, 1 reply; 150+ messages in thread From: Robert Dewar @ 2007-11-09 9:53 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Alexandre Oliva, Richard Guenther, gcc-patches, gcc Ian Lance Taylor wrote: > Alexandre Oliva <aoliva@redhat.com> writes: > >> So... The compiler is outputting code that tells other tools where to >> look for certain variables at run time, but it's putting incorrect >> information there. How can you possibly argue that this is not a code >> correctness issue? > > I don't see any point to going around this point again, so I'll just > note that I disagree. Well I very much agree. If you are writing certified code, then a number of evidence producing tools rely on the debugging information, and it is a problem if this information is incorrect. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-09 9:53 ` Robert Dewar @ 2007-11-12 5:36 ` Mark Mitchell 2007-11-12 17:34 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Mark Mitchell @ 2007-11-12 5:36 UTC (permalink / raw) To: Robert Dewar Cc: Ian Lance Taylor, Alexandre Oliva, Richard Guenther, gcc-patches, gcc Robert Dewar wrote: > Ian Lance Taylor wrote: >> Alexandre Oliva <aoliva@redhat.com> writes: >> >>> So... The compiler is outputting code that tells other tools where to >>> look for certain variables at run time, but it's putting incorrect >>> information there. How can you possibly argue that this is not a code >>> correctness issue? >> >> I don't see any point to going around this point again, so I'll just >> note that I disagree. > > Well I very much agree. The trick is that we're being asked to give a binary answer ("is it a correctness issue?") when it's not really a binary issue. Clearly, for some users, incorrect debugging information on optimized code is not a terribly big deal. It's certainly less important to many users than that the program get the right answer. On the other hand, there are no doubt users where, whether for debugging, certification, or whatever, it's vitally important that the debugging information meet some standard of accuracy. Part of my concern with this whole discussion is that we seem to be saying we want the debugging information to be better, but not saying very clearly what the requirements on better are. Are we going to consider it a bug if the value of a variable is unavailable, but the debugging information says it is available? (Yes, this seems like a bug to me.) What if an old value is available, but a simple-minded reading of the program would have now assigned a new value? (No, I wouldn't consider this a bug.) What if the value is available in two places, and we only describe one of them? (No, I wouldn't consider this a bug.) 
What if the value is available, but we say that it isn't because we lost track of it at some point? (I would say "it depends".) We could certainly track user variables through SSA and RTL, at least insofar as knowing that some REGs refer to SSA names that refer to user VAR_DECLs. We can use dataflow analysis to compute where those values (might) die. Thus, we can probably do a reasonable job of guaranteeing that when we say a variable is somewhere, it is in fact in that place. I don't yet understand what else we're trying to do. -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
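The idea in the message above, that a location should only be reported while the value is known to still be there, can be illustrated on straight-line code. This is a deliberately tiny sketch and not how GCC tracks variables: `struct insn`, `track_var`, `NO_REG`, and `NO_VAR` are hypothetical names, there is no control flow, and a real implementation would need dataflow analysis over the whole flow graph.

```c
#include <assert.h>
#include <stddef.h>

#define NO_REG (-1)
#define NO_VAR (-1)

/* Hypothetical straight-line instruction: it writes register `dest`.  If
   the value written is the new value of a user variable, `var` names that
   variable; otherwise `var` is NO_VAR (e.g. a compiler temporary). */
struct insn {
    int dest;
    int var;
};

/* For user variable `v`, set where[i] to the register known to hold `v`
   after executing insns[0..i], or NO_REG when no such register is known. */
void track_var(const struct insn *insns, size_t n, int v, int *where)
{
    assert(insns != NULL || n == 0);
    assert(where != NULL || n == 0);

    int cur = NO_REG;
    for (size_t i = 0; i < n; i++) {
        if (insns[i].var == v)
            cur = insns[i].dest;   /* new binding for v */
        else if (insns[i].dest == cur)
            cur = NO_REG;          /* register overwritten: v's value dies here */
        where[i] = cur;
    }
}
```

Reporting NO_REG after the register is overwritten is the conservative choice: the result may be incomplete, but it never claims the variable is somewhere it is not.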
* Re: Designs for better debug info in GCC 2007-11-12 5:36 ` Mark Mitchell @ 2007-11-12 17:34 ` Alexandre Oliva 2007-11-12 17:54 ` Mark Mitchell 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-12 17:34 UTC (permalink / raw) To: Mark Mitchell Cc: Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 12, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > Clearly, for some users, incorrect debugging information on optimized > code is not a terribly big deal. It's certainly less important to many > users than that the program get the right answer. On the other hand, > there are no doubt users where, whether for debugging, certification, or > whatever, it's vitally important that the debugging information meet > some standard of accuracy. How is this different from a port of the compiler for a CPU that few people care about? That many users couldn't care less whether the compiler output on that port works at all doesn't make it any less of a correctness issue. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 17:34 ` Alexandre Oliva @ 2007-11-12 17:54 ` Mark Mitchell 2007-11-24 1:55 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Mark Mitchell @ 2007-11-12 17:54 UTC (permalink / raw) To: Alexandre Oliva Cc: Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Alexandre Oliva wrote: > On Nov 12, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > >> Clearly, for some users, incorrect debugging information on optimized >> code is not a terribly big deal. It's certainly less important to many >> users than that the program get the right answer. On the other hand, >> there are no doubt users where, whether for debugging, certification, or >> whatever, it's vitally important that the debugging information meet >> some standard of accuracy. > > How is this different from a port of the compiler for a CPU that few > people care about? That many users couldn't care less whether the > compiler output on that port works at all doesn't make it any less of > a correctness issue. You're again trying to make this a binary-value question. Why? Lots of things are "a correctness issue". But, some categories tend to be worse than others. There is certainly a qualitative difference in the severity of a defect that results in the compiler generating code that computes the wrong answer and a defect that results in the compiler generating wrong debugging information for optimized code. The impact on a user affected by the first problem is likely very severe: the application does not run correctly. The impact on a user affected by the second problem is likely less severe: the debugger doesn't work as well, or some other external tool doesn't work as well. Let's put it this way: if a user has to choose whether the compiler will (a) generate code that runs correctly for their application, or (b) generate debugging information that's accurate, which one will they choose? 
But what's the point of this argument? It sounds like you're trying to argue that debug info for optimized code is a correctness issue, and therefore we should work as hard on it as we would on code-generation bugs. I don't find that argument persuasive. I'd like better debugging for optimized code, but I'm certainly more concerned that (a) we generate correct, fast code when optimizing, and (b) we generate good debugging information when not optimizing. -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 17:54 ` Mark Mitchell @ 2007-11-24 1:55 ` Alexandre Oliva 2007-11-26 1:08 ` Mark Mitchell 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 1:55 UTC (permalink / raw) To: Mark Mitchell Cc: Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 12, 2007, Mark Mitchell <mark@codesourcery.com> wrote: > Alexandre Oliva wrote: >> On Nov 12, 2007, Mark Mitchell <mark@codesourcery.com> wrote: >> >>> Clearly, for some users, incorrect debugging information on optimized >>> code is not a terribly big deal. It's certainly less important to many >>> users than that the program get the right answer. On the other hand, >>> there are no doubt users where, whether for debugging, certification, or >>> whatever, it's vitally important that the debugging information meet >>> some standard of accuracy. >> >> How is this different from a port of the compiler for a CPU that few >> people care about? That many users couldn't care less whether the >> compiler output on that port works at all doesn't make it any less of >> a correctness issue. > You're again trying to make this a binary-value question. Why? Because in my mind, when we agree there is a bug, then a fix for it is easier to swallow even if it makes the compiler spend more resources, whereas a mere quality-of-implementation issue is subject to quite different standards. > Lots of things are "a correctness issue". But, some categories tend to > be worse than others. There is certainly a qualitative difference in > the severity of a defect that results in the compiler generating code > that computes the wrong answer and a defect that results in the compiler > generating wrong debugging information for optimized code. That depends a lot on whether your application uses the incorrect compiler output or not.
If the compiler produces incorrect code, but your application doesn't ever exercise that error, would you argue for leaving the bug unfixed? These days, applications are built that depend on the correctness of the compiler output in certain sections that historically weren't all that functionally essential, namely, the meta-information sections that we got used to calling debug information. I.e., these days, applications exercise the "code paths" that formerly weren't exercised. This exposes bugs in the compiler. Worse: bugs that we have no infrastructure to test, and that we don't even agree are actual bugs, because the standards that specify the "ISA and ABI" in which such code ought to be output are apparently regarded as irrelevant by some. Just because their perception is distorted by a single use of such information, which involves a high amount of human interaction, and humans are able to tolerate and adapt to error conditions. But as more and more uses of such information are actual production systems rather than humans behind debuggers, such errors can no longer be tolerated, because when the debug output is wrong, the system breaks. It's that simple. It's really no different from any other compiler bug. > Let's put it this way: if a user has to choose whether the compiler will > (a) generate code that runs correctly for their application, or (b) > generate debugging information that's accurate, which one will they choose? (a), for sure. But bear in mind that, when the application's correct execution depends on the correctness of debugging information, then a implies b. > But what's the point of this argument? It sounds like you're trying to > argue that debug info for optimized code is a correctness issue, and > therefore we should work as hard on it as we would on code-generation > bugs. I'm working hard on it. I'm not asking others to join me. 
I'm just asking people to understand how serious a problem it is, and that, even though fixing these bugs may have a cost, it's bugs we're talking about, it's incorrect compiler output that causes applications to break, not mere inconvenience for debuggers. > I'd like better debugging for optimized code, but I'm certainly more > concerned that (a) we generate correct, fast code when optimizing, > and (b) we generate good debugging information when not optimizing. This just goes to show that you're not concerned with the kind of application that *depends* on correct debug information for functioning. And it's not debuggers I'm talking about here. That's a reasonable point of view. Maybe the GCC community can decide that the debug information it produces is just for (poor) consumption by debug programs, and that we have no interest in *complying* with the debug information standards that document the debug information that other applications depend on. And I mean *complying* with the standards, rather than merely outputting whatever seems to be easy and approximately close to what the standard mandates. I just hope the GCC community doesn't make this decision, and that it accepts fixes to these bugs even when they impose some overhead, especially when such overhead can be easily avoided with command-line options, or even is disabled by default (because debug info is not emitted by default, after all). -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-24 1:55 ` Alexandre Oliva @ 2007-11-26 1:08 ` Mark Mitchell 2007-12-05 14:22 ` Diego Novillo 0 siblings, 1 reply; 150+ messages in thread From: Mark Mitchell @ 2007-11-26 1:08 UTC (permalink / raw) To: Alexandre Oliva Cc: Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

Alexandre Oliva wrote:

>> You're again trying to make this a binary-value question. Why?
>
> Because in my mind, when we agree there is a bug, then a fix for it is easier to swallow even if it makes the compiler spend more resources, whereas a mere quality-of-implementation issue is subject to quite different standards.

Unfortunately, not all questions are black-and-white. I don't think you're going to get consensus that this issue is as important to fix as wrong-code (in the traditional sense) problems. So, arguing about whether this is a "correctness issue" isn't very productive.

Neither is arguing that there is now some urgent need for machine-usable debugging information in a way that there wasn't before. Machines have been using debugging information for various purposes other than interactive debugging for ages. But, they've always had to deal with the kinds of problems that you're encountering, especially with optimized code.

I think that at this point you're doing research. I don't think we have a well-defined notion of what exactly debugging information should be for optimized code. Robert Dewar's definition of -O1 as doing optimizations that don't interfere with debugging is coherent (though informal, of course), but you're asking for something more: full optimization, and, somehow, accurate debugging information in the presence of that.

I'm all for research, and the thinking that you're doing is unquestionably valuable. But, you're pushing hard for a particular solution and that may be premature at this point.
Debugging information just isn't rich enough to describe the full complexity of the optimization transformations. There's no great way to assign a line number to an instruction that was created by the compiler when it inserted code on some flow-graph edge. You can't get exact information about variable lifetimes because the scope doesn't start at a particular point in the generated code in the same way that it does in the source code.

My suggestion (not as a GCC SC member or GCC RM, but just as a fellow GCC developer with an interest in improving the compiler in the same way that you're trying to do) is that you stop writing code and start writing a paper about what you're trying to do.

Ignore the implementation. Describe the problem in detail. Narrow its scope if necessary. Describe the success criteria in detail. Ideally, the success criteria are mechanically checkable properties: i.e., given a C program as input, and optimized code + debug information as output, it should be possible to algorithmically prove whether the output is correct.

For example, how do you define the correctness of debug information for a variable's location at a given PC? Perhaps we want to say that giving the answer "no information available" is always correct, but that saying "the value is here" when it's not is incorrect; that gives us a conservative fallback.

How do you define the point in the source program given a PC? If the value of "x" changes on line 100, and we're at an instruction which corresponds to line 101, are we guaranteed to see the changed value? Or is seeing the previous value OK? What about some intermediate value if "x" is being changed byte-by-byte? What about a garbage value if the compiler happens to optimize by throwing away the old value of "x" before assigning a new one?

--
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply [flat|nested] 150+ messages in thread
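The conservative criterion suggested in the message above ("no information available" is always correct, while "the value is here" is incorrect when it is not) could be made mechanically checkable roughly as follows. This is only a minimal sketch under stated assumptions: the `location` and `sample` types and all function names are hypothetical, nothing here is a GCC or DWARF data structure, and the `actual` field assumes some external oracle (for example an instrumented run) that knows where the value really lives at each PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a location claim; not a GCC or DWARF type. */
enum loc_kind { LOC_UNKNOWN, LOC_REGISTER, LOC_MEMORY };

struct location {
  enum loc_kind kind;
  long where; /* register number or address; ignored for LOC_UNKNOWN */
};

/* One observation: what the debug info claims at a PC, and where the
   value really is according to an assumed external oracle. */
struct sample {
  unsigned long pc;
  struct location claimed;
  struct location actual;
};

/* "No information" is always acceptable; a concrete claim must match. */
static bool claim_is_correct(const struct sample *s)
{
  if (s->claimed.kind == LOC_UNKNOWN)
    return true;
  return s->claimed.kind == s->actual.kind
         && s->claimed.where == s->actual.where;
}

/* Index of the first incorrect claim, or -1 if every claim is correct. */
static long first_incorrect_claim(const struct sample *samples, size_t n)
{
  assert(samples != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (!claim_is_correct(&samples[i]))
      return (long) i;
  return -1;
}

/* Completeness is a separate measure: how many claims say anything. */
static size_t count_known_claims(const struct sample *samples, size_t n)
{
  size_t known = 0;
  for (size_t i = 0; i < n; i++)
    if (samples[i].claimed.kind != LOC_UNKNOWN)
      known++;
  return known;
}
```

Keeping correctness (`first_incorrect_claim`) apart from completeness (`count_known_claims`) mirrors the distinction drawn in the thread: dropping information lowers the second number but never makes the first check fail.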
* Re: Designs for better debug info in GCC 2007-11-26 1:08 ` Mark Mitchell @ 2007-12-05 14:22 ` Diego Novillo 2007-12-05 22:10 ` Joe Buck 2007-12-15 21:41 ` Alexandre Oliva 0 siblings, 2 replies; 150+ messages in thread From: Diego Novillo @ 2007-12-05 14:22 UTC (permalink / raw) To: Mark Mitchell Cc: Alexandre Oliva, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On 11/25/07 3:43 PM, Mark Mitchell wrote: > My suggestion (not as a GCC SC member or GCC RM, but just as a fellow > GCC developer with an interest in improving the compiler in the same way > that you're trying to do) is that you stop writing code and start > writing a paper about what you're trying to do. > > Ignore the implementation. Describe the problem in detail. Narrow its > scope if necessary. Describe the success criteria in detail. Ideally, > the success criteria are mechanically checkable properties: i.e., given > a C program as input, and optimized code + debug information as output, > it should be possible to algorithmically prove whether the output is > correct. Yes, please. I would very much like to see an abstract design document on what you are trying to accomplish. I have been trying to follow this thread but I've gotten lost. It's full of implementation details, rhetoric and high-level discussion. I would like to see exactly what Mark is asking for. Perhaps a presentation in next year's Summit? I don't think I understand the goal of the project. "Correct debugging info" means little, particularly if you say that it's not debuggers that you are thinking about. It's certainly worrisome that your implementation seems to be intrusive to the point of brittleness. Will every new optimization need to think about debug information from scratch and refrain from doing certain transformations? 
In my simplistic view of this problem, I've always had the idea that -O0 -g means "full debugging bliss", -O1 -g means "tolerable debugging" (symbols shouldn't disappear, for instance, though they do now) and -O2 -g means "you can probably know what line+function you're executing". But you seem to be addressing other problems. And it even seems to me that you want debugging information that is capable of deconstructing arbitrary transformations done by the optimizers. But I think I'm just lost in this thread, so a high-level design document would be perfect to expose your ideas. Diego. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-05 14:22 ` Diego Novillo @ 2007-12-05 22:10 ` Joe Buck 2007-12-15 21:41 ` Alexandre Oliva 1 sibling, 0 replies; 150+ messages in thread From: Joe Buck @ 2007-12-05 22:10 UTC (permalink / raw) To: Diego Novillo Cc: Mark Mitchell, Alexandre Oliva, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Wed, Dec 05, 2007 at 09:05:33AM -0500, Diego Novillo wrote: > In my simplistic view of this problem, I've always had the idea that -O0 > -g means "full debugging bliss", -O1 -g means "tolerable debugging" > (symbols shouldn't disappear, for instance, though they do now) and -O2 > -g means "you can probably know what line+function you're executing". I'd be happy enough if the state of -O1 -g debugging were improved, perhaps using some of Alexandre's ideas so that it could be "full debugging bliss" with some optimization as well. Speeding up the compile/test/debug/modify cycle would result. We could then have fast but fully debuggable code at -O1, and even faster code at -O2 not constrained by the requirement of, as Diego says, "deconstructing arbitrary transformations done by the optimizers". ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-05 14:22 ` Diego Novillo 2007-12-05 22:10 ` Joe Buck @ 2007-12-15 21:41 ` Alexandre Oliva 2007-12-16 3:15 ` Daniel Berlin 2007-12-16 21:42 ` Mark Mitchell 1 sibling, 2 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-12-15 21:41 UTC (permalink / raw) To: Diego Novillo Cc: Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 5, 2007, Diego Novillo <dnovillo@google.com> wrote: > On 11/25/07 3:43 PM, Mark Mitchell wrote: >> My suggestion (not as a GCC SC member or GCC RM, but just as a fellow >> GCC developer with an interest in improving the compiler in the same way >> that you're trying to do) is that you stop writing code and start >> writing a paper about what you're trying to do. >> >> Ignore the implementation. Describe the problem in detail. Narrow its >> scope if necessary. Describe the success criteria in detail. Ideally, >> the success criteria are mechanically checkable properties: i.e., given >> a C program as input, and optimized code + debug information as output, >> it should be possible to algorithmically prove whether the output is >> correct. > Yes, please. I would very much like to see an abstract design > document on what you are trying to accomplish. Other than the ones I've already posted, here's one: http://dwarfstd.org/Dwarf3Std.php Seriously. There is a standard for this stuff. My ultimate goal in this project is that we comply with it, at least as far as emitting debug information for location of variables is concerned. 
Here are some relevant postings on design strategies, rationales and goals: http://gcc.gnu.org/ml/gcc/2007-11/msg00229.html (goals) http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00160.html (initial plan) http://gcc.gnu.org/ml/gcc/2007-11/msg00261.html (detailed plan) http://gcc.gnu.org/ml/gcc/2007-11/msg00317.html (example) http://gcc.gnu.org/ml/gcc/2007-11/msg00590.html (more example) http://gcc.gnu.org/ml/gcc/2007-11/msg00176.html (design rationale) http://gcc.gnu.org/ml/gcc/2007-11/msg00177.html (clarification) > I would like to see exactly what Mark is asking for. Perhaps a > presentation in next year's Summit? Sure, if there's interest, I could sure plan on doing that. I could use sponsors, BTW; I haven't discussed this with my employer, and writing articles and presenting speeches are not part of this assignment I was given. Anyhow, by the time of the next year's Summit, I hope this is mostly old news. > I don't think I understand the goal of the project. Follow the standard, as in (1) emit debug information that is correct (standard-compliant), as in, if we emit some piece of debug information, it reflects reality, rather than being a sometimes distant approximation of some past reality long destroyed by some optimization pass, and (2) emit debug information that is more complete, as in, we currently fail to emit a lot of debug information that we could, because we lose track of the location of variables as optimization passes fail to maintain the needed information to do so. > "Correct debugging info" means little, particularly if you say that > it's not debuggers that you are thinking about. Thinking of the debuggers is a mistake. We don't think of specific compilers when reading a programming language standard. We don't think of specific processors when reading an ISA or ABI specification. 
Even when we read documentation specific to a processor, we still don't think of its internal implementation details in order to write a compiler for it; even the scheduling properties are abstracted out in the design specification and optimization guidelines. When someone finds that the compiler deviates from one of these standards, we just cite chapter and verse of the relevant standard, and people see there's a bug. Why should debug information standards be treated any differently?

> It's certainly worrisome that your implementation seems to be intrusive to the point of brittleness.

What part of intrusiveness are you concerned about? The change of INSN_P such that it covers DEBUG_INSN_P too in the supported range? Or the few changes that revert to the original INSN_P, in the few exceptions in which DEBUG_INSN_P is not to be handled as an INSN?

I've heard this "intrusiveness" argument be pointed out so many times, by so many people that claim to not have been able to keep up with the thread, and who claim to have not looked at the patches at all, that I'm more and more convinced it's just fear of the unknown than any actual rational evaluation of the impact of the changes.

Seriously. Have a look at the patches and tell me what in them you regard as intrusive.

We're talking about infrastructure here, needed to fix GCC's carelessness about maintaining a mapping between source and implementation concepts that went on for years and years, while optimizations were added and debug information was degraded. At some point you have to face reality and see that such information isn't kept around by magic, it takes some effort, and this effort is needed at every location where there are changes that might affect debug information. And that's pretty much everywhere.
Even if we had consistent interfaces to make some changes, such as variable renaming, substitution, etc, this would only cover a small amount of the data a debug info generator would need: it needs higher-level information than that, especially in rtl, where transformations, for historical reasons, are messier than in the tree IL. So, the approach I've taken is to use the strength of the problem against itself: take advantage of the fact that optimizers already know how to perform transformations they need to do in order to keep things consistent, and represent debug information in a way that, to them, will look just like any other use, so they will adjust it likewise. And then, on top of that, handle the few exceptions, in which the optimizer needs to do something cleverer, because the transformation it performs wouldn't work when say there's more than one use or so. > Will every new optimization need to think about debug information > from scratch and refrain from doing certain transformations? Refraining from doing certain transformations would be wrong. We don't want debug information to affect code generation, and we don't want it to reduce the amount of optimization you can make. So, you optimize away, and if you find that you can't keep track of debug information, you mark stuff as unavailable, or, most likely, the safety nets in place will do that for you, rather than taking the current approach, in which we silently corrupt debug information. Sure, this might require a little bit more thinking in some optimizations. But in my experience fixing up the tree and rtl passes that needed tweaking, the additional thinking needed is a no-brainer in most cases; in a few, you have to work a bit harder to keep information around rather than simply noting it as unavailable. But it has never required optimizations to be disabled, and it must not do so. In fact, in a few cases, I noticed we were missing trivial optimizations and fixed them. 
> In my simplistic view of this problem, I've always had the idea that > -O0 -g means "full debugging bliss", -O1 -g means "tolerable > debugging" (symbols shouldn't disappear, for instance, though they do > now) and -O2 -g means "you can probably know what line+function you're > executing". I've never seen this documented as such, and we've never worked toward these stated goals. However, I see that, underlying all of this, we should be concerned about emitting debug information that is correct, i.e., never emit information that says the location of FOO is BAR while it's actually at BAZ. I've seen many people (including myself, in a distant past) claiming that imprecise information is better than no information. I've learned better. Debugger information consumers are often equipped with heuristics to fill in common gaps in debug information. But if the information is there, and wrong, the heuristics that might very well have worked are disabled in favor of the incorrect information, and then the whole system (debuggers, monitors, etc, along with the program) misbehaves. And then, even when heuristics don't exist and the information is gone, it's better to tell the user "I don't know how to get you that" than to hand it something other than it needs (e.g., an incorrect variable location). > But you seem to be addressing other problems. And it even seems to me > that you want debugging information that is capable of deconstructing > arbitrary transformations done by the optimizers. No. I don't see where this notion came from, but it appears to be quite widespread. Omitting certain pieces of debug information is almost always correct, since most debug info attributes are optional. But emitting information that doesn't reflect the program is always incorrect. So, if you perform an arbitrary transformation that is too hard to represent in debug information, that's fine, just throw the information away. 
The debug information might become less complete, and therefore less useful, but at least it won't induce errors elsewhere.

The parallel I draw is that emitting an optional piece of debug information is like applying an optional optimization. If it's correct, and it's not too expensive, go for it. But if it's going to get you the wrong output, it's broken, so don't do it.

--
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply [flat|nested] 150+ messages in thread
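The approach argued for in this message (debug annotations look like ordinary uses so that transformations update them, they never influence optimization decisions, and they are marked unavailable when the value they describe goes away) can be illustrated with a toy model. This is a hedged sketch only: the `stmt` structure and both functions are hypothetical and far simpler than GCC's tree or RTL representations. The `continue` on debug statements is loosely analogous to the `IS_DEBUG_STMT` check in the patch that opens this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR; not GCC's tree or RTL representation. */
enum stmt_kind { STMT_REAL, STMT_DEBUG };

struct stmt {
  enum stmt_kind kind;
  int uses_name;    /* id of the name this statement reads, -1 if none */
  bool value_known; /* STMT_DEBUG only: binding still describes a value */
};

/* Count the uses that matter for code generation.  Debug statements are
   skipped so that -g cannot change an optimization decision. */
static size_t count_real_uses(const struct stmt *stmts, size_t n, int name)
{
  assert(name >= 0);
  size_t count = 0;
  for (size_t i = 0; i < n; i++) {
    if (stmts[i].kind == STMT_DEBUG)
      continue;
    if (stmts[i].uses_name == name)
      count++;
  }
  return count;
}

/* Decide whether the definition of NAME is dead.  If it is, every debug
   binding that referred to it is reset to "unavailable" rather than left
   pointing at a value that no longer exists.  Returns true when the
   definition may be deleted. */
static bool try_delete_dead_def(struct stmt *stmts, size_t n, int name)
{
  if (count_real_uses(stmts, n, name) != 0)
    return false;
  for (size_t i = 0; i < n; i++) {
    if (stmts[i].kind == STMT_DEBUG && stmts[i].uses_name == name) {
      stmts[i].value_known = false;
      stmts[i].uses_name = -1;
    }
  }
  return true;
}
```

In this model the optimization proceeds exactly as it would without debug statements, and the only extra work is the reset loop, which trades completeness for correctness.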
* Re: Designs for better debug info in GCC 2007-12-15 21:41 ` Alexandre Oliva @ 2007-12-16 3:15 ` Daniel Berlin 2007-12-16 13:09 ` Alexandre Oliva 2007-12-16 21:42 ` Mark Mitchell 1 sibling, 1 reply; 150+ messages in thread From: Daniel Berlin @ 2007-12-16 3:15 UTC (permalink / raw) To: Alexandre Oliva Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On 12/15/07, Alexandre Oliva <aoliva@redhat.com> wrote: > On Dec 5, 2007, Diego Novillo <dnovillo@google.com> wrote: > > > On 11/25/07 3:43 PM, Mark Mitchell wrote: > > >> My suggestion (not as a GCC SC member or GCC RM, but just as a fellow > >> GCC developer with an interest in improving the compiler in the same way > >> that you're trying to do) is that you stop writing code and start > >> writing a paper about what you're trying to do. > >> > >> Ignore the implementation. Describe the problem in detail. Narrow its > >> scope if necessary. Describe the success criteria in detail. Ideally, > >> the success criteria are mechanically checkable properties: i.e., given > >> a C program as input, and optimized code + debug information as output, > >> it should be possible to algorithmically prove whether the output is > >> correct. > > > Yes, please. I would very much like to see an abstract design > > document on what you are trying to accomplish. > > Other than the ones I've already posted, here's one: > > http://dwarfstd.org/Dwarf3Std.php > > Seriously. There is a standard for this stuff. My ultimate goal in > this project is that we comply with it Comply with it how? There is no portion of the DWARF3 spec which requires you output information that is correct or useful. The same way the C standard does not require you to write correct programs, only valid ones, the DWARF3 spec does not require you to output correct information, only information that is encoded properly. 
It is certainly a goal of DWARF3 to allow producers to provide correct info (as witnessed by one of the listed goals: "Debugging information must provide consumers a way to find the location of program variables, determine the bounds of dynamic arrays and strings, and possibly to find the base address of a subroutine's stack frame or the return address of a subroutine. Furthermore, to meet the needs of recent computer architectures and optimization techniques, debugging information must be able to describe the location of an object whose location changes over the object's lifetime.").

If you search the entire spec for the word "correct", you will find it 3 times. If you search for "must", you will discover that the occurrences all relate to encoding or the goals of the standard.

It may be entirely useless to output incorrect information, and in fact, worse than useless. It is, however, compliant, as long as it is encoded properly.

I have to say, this is typical of the argumentation you have used thus far in this thread, and honestly, it's not winning you any points.

That said, nobody here believes we should output useless or incorrect info, even though we could. A lot of people appear to disagree with you about the best way to do it, and in fact, about what we should be trying to provide users in what cases.

> What part of intrusiveness are you concerned about? The change of INSN_P such that it covers DEBUG_INSN_P too in the supported range? Or the few changes that revert to the original INSN_P, in the few exceptions in which DEBUG_INSN_P is not to be handled as an INSN?
>
> I've heard this "intrusiveness" argument be pointed out so many times, by so many people that claim to not have been able to keep up with the thread, and who claim to have not looked at the patches at all, that I'm more and more convinced it's just fear of the unknown than any actual rational evaluation of the impact of the changes.

Well, no.
You yourself have shown it to be intrusive in the extreme, in the very next paragraphs!

"At some point you have to face reality and see that such information isn't kept around by magic, it takes some effort, and this effort is needed at every location where there are changes that might affect debug information. And that's pretty much everywhere."

So, everywhere needs to change. That's pretty intrusive, no?

"Sure, this might require a little bit more thinking in some optimizations. But in my experience fixing up the tree and rtl passes that needed tweaking, the additional thinking needed is a no-brainer in most cases; in a few, you have to work a bit harder to keep information around rather than simply noting it as unavailable."

Having to stop and think at every point in an optimization about the debug info, having to deal with debug info at every single point of change, and then your other patches: this is intrusive as well (having to stop and think about debug info at every single point of every single optimization).

You don't need to be this intrusive to stop outputting the incorrect info we do.

> I've never seen this documented as such, and we've never worked toward these stated goals.

Who is we? I certainly have worked exactly towards these goals. As have almost all the authors of the current debugging info framework. The reason it is the way it is is because these, in fact, *were exactly the goals we were working towards*.

As for not documented, a lot of gcc is not documented. If you look in the mailing list archives, you will even discover Diego is not the first one to have exactly the viewpoint about what should and should not be debuggable, and that the community has consistently worked towards exactly the viewpoint Diego describes.

Anyway, I give up on reading this thread. It has turned into a mess.
You really need to step back and see that you have not achieved any sort of consensus of what levels of optimization should be how debuggable, before you start telling everyone their approach isn't as good as yours. I certainly wouldn't agree that we should take such intrusive steps to make -O2 -g as debuggable as you want, I'd much rather see us do what we can easily, and drop any info that ends up being incorrect. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-16 3:15 ` Daniel Berlin @ 2007-12-16 13:09 ` Alexandre Oliva 2007-12-17 1:27 ` Daniel Berlin 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-16 13:09 UTC (permalink / raw) To: Daniel Berlin Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On Dec 16, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:

> There is no portion of the DWARF3 spec which requires you output information that is correct or useful. The same way the C standard does not require you to write correct programs, only valid ones, the DWARF3 spec does not require you to output correct information, only information that is encoded properly.

But if a C compiler translated programs to garbage, that would be wrong. By the same reasoning, if a Dwarf producer created garbage, that would be wrong.

It's true that most Dwarf 3 attributes are optional. But when it says "if you output this attribute, its operand must be such and such", if you output the attribute with operands that don't match the specification, that's a bug.

> It is certainly a goal of DWARF3 to allow producers to provide correct info

Exactly. And where's the permission to provide incorrect info, rather than merely leaving it out?

>> I've heard this "intrusiveness" argument be pointed out so many times, by so many people that claim to not have been able to keep up with the thread, and who claim to have not looked at the patches at all, that I'm more and more convinced it's just fear of the unknown than any actual rational evaluation of the impact of the changes.
>
> Well, no.
> You yourself have shown it to be intrusive in the extreme, in the very next paragraphs!
> " > At some point you have to face reality and see that such information > isn't kept around by magic, it takes some effort, and this effort is > needed at every location where there are changes that might affect > debug information. And that's pretty much everywhere. " > So, everywhere needs to change. That's pretty intrusiveness, no? No. Looks like selective attention, because you're reasoning out the part in which I discussed using the strength of the optimizers against the problem, by letting them do what they are already used to on the debug information too. If we add a new RTL code or a new TREE code, is that intrusive because now every optimization pass will deal with the new node types in very much the same way they've dealt with other similar node types forever? Of course not. And if we have to add a few exceptions here and there to deal with the specifics of this new node type, does that become too intrusive then? I don't think so. Then what's the fuss about the new node types? Do you want to count the number of places in which INSN_P remains there, lexically unchanged, and compare with the number of places in which I've added a !DEBUG_INSN_P after it? > Having to stop and think at every point in an optimization about the > debug info, Well, sorry, writing compilers is hard. You have to think about several things at the same time. Shall we just go shopping instead? I'm trying to make it as simple as possible. The fact that nearly 100% of the code is unchanged seems to indicate to me that it's not such a bad an approach, but if you want something that just magically works, you're up for much disappointment. > (having to stop and think about debug info at every single point of > every single optimization). Information doesn't come out of thin air, and thin air doesn't maintain information accurate just because we wish it does. 
We have to work to create and update the information throughout compilation, at every transformation, and my reasoning is precisely that optimizers already do this all the time, so why not use them for what we need?

> You don't need to be this intrusive to stop outputting the incorrect info we do.

What do you have to back your statement up? Let me help you: sure we don't. We can just refrain from outputting any debug information whatsoever. Then, it will be compliant with the standard. But it won't be useful.

>> I've never seen this documented as such, and we've never worked toward these stated goals.
>
> Who is we?
> I certainly have worked exactly towards these goals.
> As have almost all the authors of the current debugging info framework.

Oh, wow, I guess I just wasn't welcome into the club, because I didn't get the guidelines book. How unfortunate, now I have to give up my plan of doing better and abide by the unpublished and undocumented goals of some small cabal. Or do I?

> If you look in the mailing list archives, you will even discover Diego is not the first one to have exactly the viewpoint about what should and should not be debuggable, and that the community has consistently worked towards exactly the viewpoint Diego describes.

I've seen several different viewpoints from "the community".

> Anyway, I give up on reading this thread. It has turned into a mess.
> You really need to step back

Oh, do I? Why is that?

> and see that you have not achieved any sort of consensus of what levels of optimization should be how debuggable,

Why would I expect to get any consensus on that? I haven't even tried, and I won't. This is not what the issue is about. The issue is about not emitting incorrect information.
Better debuggability for all levels of optimization will be a side effect of achieving that, and it will be achievable incrementally once we have an actual framework that enables us to take steps in this direction without introducing further regressions. > I certainly wouldn't agree that we should take such intrusive steps to > make -O2 -g as debuggable as you want, It is obvious that you misunderstood what I want, and how intrusive the approach is. > I'd much rather see us do what we can easily, and drop any info that > ends up being incorrect. So what's your plan to find out what's incorrect? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-16 13:09 ` Alexandre Oliva @ 2007-12-17 1:27 ` Daniel Berlin 2007-12-17 4:20 ` Joe Buck 2007-12-17 17:59 ` Alexandre Oliva 0 siblings, 2 replies; 150+ messages in thread From: Daniel Berlin @ 2007-12-17 1:27 UTC (permalink / raw) To: Alexandre Oliva Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc > It is obvious that you misunderstood what I want, and how intrusive > the approach is. > Yes Alexandre, everyone who disagrees with you must not understand! That's really the problem here. None of us understand but you. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 1:27 ` Daniel Berlin @ 2007-12-17 4:20 ` Joe Buck 2007-12-17 8:13 ` Geert Bosch 2007-12-17 18:36 ` Alexandre Oliva 2007-12-17 17:59 ` Alexandre Oliva 1 sibling, 2 replies; 150+ messages in thread From: Joe Buck @ 2007-12-17 4:20 UTC (permalink / raw) To: Daniel Berlin Cc: Alexandre Oliva, Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Sun, Dec 16, 2007 at 08:12:07PM -0500, Daniel Berlin wrote: > > It is obvious that you misunderstood what I want, and how intrusive > > the approach is. > > > > Yes Alexandre, everyone who disagrees with you must not understand! > That's really the problem here. > None of us understand but you. I have some sympathy for going in Alexandre's direction, in that it would be nice to have a mode that provided optimization as well as accurate debugging. However, since preserving accurate debug information has a cost, I think it would be better to turn -O1, not -O2, into the mode that Alexandre wants, where debug information is preserved. Trying to rework all optimizations to keep perfect debug information is going to take forever and make the compiler worse. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 4:20 ` Joe Buck @ 2007-12-17 8:13 ` Geert Bosch 2007-12-18 1:24 ` Alexandre Oliva 2007-12-17 18:36 ` Alexandre Oliva 1 sibling, 1 reply; 150+ messages in thread From: Geert Bosch @ 2007-12-17 8:13 UTC (permalink / raw) To: Joe Buck Cc: Daniel Berlin, Alexandre Oliva, Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On Dec 16, 2007, at 20:27, Joe Buck wrote:

> I have some sympathy for going in Alexandre's direction, in that it would be nice to have a mode that provided optimization as well as accurate debugging. However, since preserving accurate debug information has a cost, I think it would be better to turn -O1, not -O2, into the mode that Alexandre wants, where debug information is preserved. Trying to rework all optimizations to keep perfect debug information is going to take forever and make the compiler worse.

Right, at the moment -O1 is far too much like -O2. There is room for an optimization mode that is mostly local, scales well for large programs and allows for high-quality debug information. Fortunately, these goals seem all to match.

We could conceptually have inspection points between each source statement and declaration, which would roughly correspond to a use of all memory and all source variables, whether in memory or in registers. These inspection points would be considered potentially trapping.

This approach would still allow some scheduling. For example, loads and arithmetic operations that are known not to trap could still be done early. On the other hand, when breaking at any statement, all variables can be printed.

Also, since no user-visible state can be modified by speculatively executed instructions such as loads, such instructions should not be tagged with their original source location information. This would prevent the very annoying and unhelpful jumping around the program during debugging.
The method I describe here, which roughly corresponds to the semantics of Ada's "pragma Inspection_Point", seems relatively easy to implement using an empty "asm" or similar. -Geert PS. For convenience, I'm including a snippet of the Ada 2005 standard, the full version of which is freely available on the web. H.3.2 Pragma Inspection_Point 1 An occurrence of a pragma Inspection_Point identifies a set of objects each of whose values is to be available at the point(s) during program execution corresponding to the position of the pragma in the compilation unit. The purpose of such a pragma is to facilitate code validation. Syntax 2 The form of a pragma Inspection_Point is as follows: 3 pragma Inspection_Point[(object_name {, object_name})]; Legality Rules 4 A pragma Inspection_Point is allowed wherever a declarative_item or statement is allowed. Each object_name shall statically denote the declaration of an object. Static Semantics 5/2 An inspection point is a point in the object code corresponding to the occurrence of a pragma Inspection_Point in the compilation unit. An object is inspectable at an inspection point if the corresponding pragma Inspection_Point either has an argument denoting that object, or has no arguments and the declaration of the object is visible at the inspection point. Dynamic Semantics 6 Execution of a pragma Inspection_Point has no effect. Implementation Requirements 7 Reaching an inspection point is an external interaction with respect to the values of the inspectable objects at that point (see 1.1.3). Documentation Requirements 8 For each inspection point, the implementation shall identify a mapping between each inspectable object and the machine resources (such as memory locations or registers) from which the object's value can be obtained. NOTES 9/2 7 The implementation is not allowed to perform "dead store elimination" on the last assignment to a variable prior to a point where the variable is inspectable. 
Thus an inspection point has the effect of an implicit read of each of its inspectable objects. 10 8 Inspection points are useful in maintaining a correspondence between the state of the program in source code terms, and the machine state during the program's execution. Assertions about the values of program objects can be tested in machine terms at inspection points. Object code between inspection points can be processed by automated tools to verify programs mechanically. 11 9 The identification of the mapping from source program objects to machine resources is allowed to be in the form of an annotated object listing, in human-readable or tool-processable form. ^ permalink raw reply [flat|nested] 150+ messages in thread
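The empty-"asm" idea mentioned in the message above could look roughly like the following in C. This is only a minimal sketch, assuming GCC's extended asm syntax; the macro name INSPECTION_POINT and the function sum_to are hypothetical and appear nowhere in the thread. The asm emits no instructions, but it claims to read its operand and to touch memory, so the compiler has to keep the variable's value available at that point. It approximates the "implicit read" of the Ada pragma and makes no attempt to model the "potentially trapping" part of the proposal.

```c
#include <assert.h>

/* Hypothetical macro (not part of GCC or of the proposal above): an
   empty asm that takes `var` as an input operand and clobbers memory.
   No instructions are emitted, but the compiler must treat it as a
   use of `var` and as a barrier for memory accesses.  */
#define INSPECTION_POINT(var) \
  __asm__ __volatile__ ("" : : "g" (var) : "memory")

/* Sum the integers 1..n, with an inspection point after each update,
   so that `total` has a well-defined home (register, memory or
   constant) at every such point.  */
int sum_to (int n)
{
  assert (n >= 0);
  int total = 0;
  for (int i = 1; i <= n; i++)
    {
      total += i;
      INSPECTION_POINT (total);
    }
  return total;
}
```

Because the asm is a use of the variable, the last store before it cannot be removed as dead, which matches note 9/2 of the quoted Ada text; the cost is that some code motion across the point is blocked.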
* Re: Designs for better debug info in GCC 2007-12-17 8:13 ` Geert Bosch @ 2007-12-18 1:24 ` Alexandre Oliva 2007-12-18 1:29 ` Joe Buck 2007-12-18 7:35 ` Robert Dewar 0 siblings, 2 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 1:24 UTC (permalink / raw) To: Geert Bosch Cc: Joe Buck, Daniel Berlin, Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 17, 2007, Geert Bosch <bosch@adacore.com> wrote: > We could conceptually have inspection points between each source > statement and declaration, which would roughly correspond to a > use of all memory and all source variables, whether in memory or > in registers. > These inspection points would be considered potentially trapping. Yes, I've considered something along these lines, but decided against it, for we can't afford for debug information to affect executable code generation in any way whatsoever, and we don't want to pessimize optimized code when compiling without -g just so that compiling with -g would get us the same code. > Also, since no user-visible state can be modified by speculatively > executed instructions such as loads, such instructions should not > be tagged with their original source location information. Line number information has a well-defined meaning: it ought to represent the source code line that best represents the source-code construct that ended up implemented using that instruction. To address what we have in mind, there's an additional annotation on top of line number information: the is_stmt flag. This is what we should use to tell debuggers what the best instruction is to set a breakpoint at a certain line number or so, and for debuggers to be able to step line by line more seamlessly. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
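To illustrate how the is_stmt flag discussed above can guide breakpoint placement, here is a minimal sketch in C. The struct line_row type, the breakpoint_address function and the example table are all hypothetical simplifications made up for this illustration; a real DWARF line-number program has more state than these three fields, and real debuggers may apply further heuristics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified row of a decoded line-number table.  */
struct line_row
{
  uintptr_t address;  /* address of the instruction */
  unsigned line;      /* source line attributed to it */
  int is_stmt;        /* nonzero: recommended breakpoint location */
};

/* Return the address at which to place a breakpoint for `line`: the
   first row for that line that has is_stmt set.  Rows are assumed to
   be sorted by address.  Returns 0 if the line has no such row.  */
uintptr_t breakpoint_address (const struct line_row *rows, size_t n,
                              unsigned line)
{
  assert (rows != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (rows[i].line == line && rows[i].is_stmt)
      return rows[i].address;
  return 0;
}

/* Example table: a load belonging to line 5 was scheduled early, so it
   keeps its line number but is not marked as a statement start.  */
const struct line_row example_rows[] = {
  { 0x1000, 3, 1 },
  { 0x1004, 5, 0 },  /* hoisted load from line 5 */
  { 0x1008, 4, 1 },
  { 0x100c, 5, 1 },  /* where line 5 "really" begins */
};
```

With this table, a breakpoint on line 5 lands at 0x100c rather than at the hoisted load at 0x1004, so stepping does not appear to jump to line 5 and back to line 4.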
* Re: Designs for better debug info in GCC 2007-12-18 1:24 ` Alexandre Oliva @ 2007-12-18 1:29 ` Joe Buck 2007-12-18 4:40 ` Alexandre Oliva 2007-12-18 7:35 ` Robert Dewar 1 sibling, 1 reply; 150+ messages in thread From: Joe Buck @ 2007-12-18 1:29 UTC (permalink / raw) To: Alexandre Oliva Cc: Geert Bosch, Daniel Berlin, Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Mon, Dec 17, 2007 at 11:11:46PM -0200, Alexandre Oliva wrote: > Line number information has a well-defined meaning: it ought to > represent the source code line that best represents the source-code > construct that ended up implemented using that instruction. You implicitly assume that such a source code line exists. Consider something like

    int func(bool cond, int a, int b, int c)
    {
        int out;
        if (cond)
            out = a + b;
        else
            out = a + b + c;
        return out;
    }

The optimizer might produce something that structurally resembles

    out = a + b;
    if (!cond)
        out += c;
    return out;

If you set a breakpoint on the addition of a and b, it will trigger regardless of the value of cond. Furthermore, there isn't a place to put a breakpoint that will trigger only for the case where cond is true, as you can on unoptimized code. So you need to choose between natural debugging and optimization. ^ permalink raw reply [flat|nested] 150+ messages in thread
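The example above can be made concrete with a small, self-contained C sketch. The names func_source, func_hoisted and check_forms_agree are hypothetical, and func_hoisted is a hand-written stand-in for what an optimizer might produce, not actual GCC output. The point is that both forms compute the same result, yet the hoisted form has no instruction that runs only when cond is true.

```c
#include <assert.h>
#include <stdbool.h>

/* The function as written in the source.  */
int func_source (bool cond, int a, int b, int c)
{
  int out;
  if (cond)
    out = a + b;        /* runs only when cond is true */
  else
    out = a + b + c;    /* runs only when cond is false */
  return out;
}

/* Hand-written stand-in for the optimized shape: the common
   computation a + b is hoisted above the test.  */
int func_hoisted (bool cond, int a, int b, int c)
{
  int out = a + b;      /* runs on both paths */
  if (!cond)
    out += c;           /* runs only when cond is false */
  return out;
}

/* Check that both forms agree over a small range of inputs.  */
void check_forms_agree (void)
{
  for (int cond = 0; cond <= 1; cond++)
    for (int a = -2; a <= 2; a++)
      for (int b = -2; b <= 2; b++)
        for (int c = -2; c <= 2; c++)
          assert (func_source (cond, a, b, c)
                  == func_hoisted (cond, a, b, c));
}
```

In func_hoisted, a breakpoint on the addition of a and b fires for either value of cond, which is the behavior described in the message.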
* Re: Designs for better debug info in GCC 2007-12-18 1:29 ` Joe Buck @ 2007-12-18 4:40 ` Alexandre Oliva 2007-12-18 7:42 ` Robert Dewar 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 4:40 UTC (permalink / raw) To: Joe Buck Cc: Geert Bosch, Daniel Berlin, Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 17, 2007, Joe Buck <Joe.Buck@synopsys.COM> wrote: > On Mon, Dec 17, 2007 at 11:11:46PM -0200, Alexandre Oliva wrote: >> Line number information has a well-defined meaning: it ought to >> represent the source code line that best represents the source-code >> construct that ended up implemented using that instruction. > You implicitly assume that such a source code line exists. Actually, no. I'm not sure where you got that impression, and how you came to the conclusion that I'd assign line numbers the way you have. To me, when you hoist something that is present in both blocks of a conditional, it probably makes more sense to give it the line number of the conditional, rather than that of either block. But I won't pretend to have thought very hard about this particular issue. For the time being, I'm focusing my efforts on local variable locations. Anyhow, very clearly you don't want to mark such hoisted-out computation as is_stmt. This should eliminate at least the solvable problem you're worried about. > out = a + b; > if (!cond) > out += c; > return out; > Furthermore, there isn't a place to put a breakpoint that will > trigger only for the case where cond is true, as you can on > unoptimized code. Yep. Sometimes code just is optimized away. Can't stop that without harming optimizations. If dwarf line number programs were smarter, we could perhaps encode multiple lines for the same instruction, along with conditions to tell when the instruction applies to such or such lines, and even more fancy stuff like that. But line number programs don't let us express this in Dwarf3. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 4:40 ` Alexandre Oliva @ 2007-12-18 7:42 ` Robert Dewar 2007-12-18 8:09 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Robert Dewar @ 2007-12-18 7:42 UTC (permalink / raw) To: Alexandre Oliva Cc: Joe Buck, Geert Bosch, Daniel Berlin, Diego Novillo, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Alexandre Oliva wrote: > Yep. Sometimes code just is optimized away. Can't stop that without > harming optimizations. OK, so you are agreeing that good debuggability is impossible with all the optimizations in place, so once again, let's have an optimization level that optimizes as far as possible without harming debuggability. > > If dwarf line number programs were smarter, we could perhaps encode > multiple lines for the same instruction, along with conditions to tell > when the instruction applies to such or such lines, and even more > fancy stuff like that. But line number programs don't let us express > this in Dwarf3. So, that's not an option. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 7:42 ` Robert Dewar @ 2007-12-18 8:09 ` Alexandre Oliva 2007-12-18 14:01 ` Robert Dewar 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 8:09 UTC (permalink / raw) To: Robert Dewar Cc: Joe Buck, Geert Bosch, Daniel Berlin, Diego Novillo, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 18, 2007, Robert Dewar <dewar@adacore.com> wrote: > Alexandre Oliva wrote: >> Yep. Sometimes code just is optimized away. Can't stop that without >> harming optimizations. > OK, so you are agreeing that good debuggability is impossible > with all the optimizations in place, so once again, let's have > an optimization level that optimizes as far as possible without > harming debuggability. I don't oppose such an optimization level, even though I don't know that we agree on what "good debuggability" stands for. It's just that changing optimizations is precisely *against* the goals of my current project. So, don't expect significant efforts to this end from me at this time. >> If dwarf line number programs were smarter, we could perhaps encode >> multiple lines for the same instruction, along with conditions to tell >> when the instruction applies to such or such lines, and even more >> fancy stuff like that. But line number programs don't let us express >> this in Dwarf3. > So, that's not an option. Yup. Best we can do right now is to emit the condition line number. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 8:09 ` Alexandre Oliva @ 2007-12-18 14:01 ` Robert Dewar 2007-12-18 21:20 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Robert Dewar @ 2007-12-18 14:01 UTC (permalink / raw) To: Alexandre Oliva Cc: Joe Buck, Geert Bosch, Daniel Berlin, Diego Novillo, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Alexandre Oliva wrote: > On Dec 18, 2007, Robert Dewar <dewar@adacore.com> wrote: >> OK, so you are agreeing that good debuggability is impossible >> with all the optimizations in place, so once again, let's have >> an optimization level that optimizes as far as possible without >> harming debuggability. > > I don't oppose such an optimization level, even though I don't know > that we agree on what "good debuggability" stands for. My definition is that it should be indistinguishable from -O0 except that I could live without being able to modify variables. > > It's just that changing optimizations is precisely *against* the goals > of my current project. So, don't expect significant efforts to this > end from me at this time. But you can't achieve the above criterion with your approach. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 14:01 ` Robert Dewar @ 2007-12-18 21:20 ` Alexandre Oliva 0 siblings, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 21:20 UTC (permalink / raw) To: Robert Dewar Cc: Joe Buck, Geert Bosch, Daniel Berlin, Diego Novillo, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 18, 2007, Robert Dewar <dewar@adacore.com> wrote: > Alexandre Oliva wrote: >> On Dec 18, 2007, Robert Dewar <dewar@adacore.com> wrote: >>> OK, so you are agreeing that good debuggability is impossible >>> with all the optimizations in place, so once again, let's have >>> an optimization level that optimizes as far as possible without >>> harming debuggability. >> It's just that changing optimizations is precisely *against* the goals >> of my current project. So, don't expect significant efforts to this >> end from me at this time. > But you can't achieve the above criterion with your approach. Actually, you can. My approach is about ensuring the mapping between the location of source and implementation variables is correct. This is orthogonal to how much optimization you make. If you optimize more, more values or locations may become unavailable, but this is not about correctness (what fraction of the annotations point at locations that hold the correct value), and it's not even about completeness (what fraction of the source variables are represented at all locations they are available), it's just about theoretical completeness (what fraction of the source variables are represented at all locations they would be available without optimization). -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 1:24 ` Alexandre Oliva 2007-12-18 1:29 ` Joe Buck @ 2007-12-18 7:35 ` Robert Dewar 2007-12-18 8:34 ` Alexandre Oliva 1 sibling, 1 reply; 150+ messages in thread From: Robert Dewar @ 2007-12-18 7:35 UTC (permalink / raw) To: Alexandre Oliva Cc: Geert Bosch, Joe Buck, Daniel Berlin, Diego Novillo, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Alexandre Oliva wrote: > Yes, I've considered something along these lines, but decided against > it, for we can't afford for debug information to affect executable > code generation in any way whatsoever, and we don't want to pessimize > optimized code when compiling without -g just so that compiling with > -g would get us the same code. I disagree, I think it would be fine to degrade -O1 slightly to achieve full debuggability, and of course -g cannot affect the generated code. If indeed a) it is possible to get perfect debuggability without any pessimization b) that includes unexpected jumping around c) everyone agrees on how to achieve a) and b) d) this is implemented then fine, but in the absence of these conditions, if we need to pessimize -O1 code slightly to achieve this, that's OK by me. If it really worries people, introduce a -Og that achieves this. In my experience people use -O1 not because they are very performance sensitive (those folk use -O2), but because -O0 is so horrible, that they need something better than that for production delivery. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 7:35 ` Robert Dewar @ 2007-12-18 8:34 ` Alexandre Oliva 0 siblings, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 8:34 UTC (permalink / raw) To: Robert Dewar Cc: Geert Bosch, Joe Buck, Daniel Berlin, Diego Novillo, Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 18, 2007, Robert Dewar <dewar@adacore.com> wrote: > Alexandre Oliva wrote: >> Yes, I've considered something along these lines, but decided against >> it, for we can't afford for debug information to affect executable >> code generation in any way whatsoever, and we don't want to pessimize >> optimized code when compiling without -g just so that compiling with >> -g would get us the same code. > I disagree, I think it would be fine to degrade -O1 slightly to achieve > full debuggability, Sure. But this is just not relevant to my project of getting GCC to emit correct (and, ideally, as complete as possible) variable location information, no matter what the optimization level. My goal is not so much about aiming at a perfect debugging experience, but rather at making sure that what the compiler encodes in debug information actually reflects the code it produced. This will surely benefit a future full debuggability project, of course. But, as much as I see value in perfect debuggability at some new optimization level, my current task is to get correct and more complete variable location information at vanilla-build optimization levels, i.e., at -O2 -g. It is possible to do much better than what we do now, and it appears to me that it's even possible to do much better than my current plan. But I need to get this task wrapped up before I can spend further time figuring out how to make it even better. In either case, it probably won't be like -O0, for optimizations are performed that make it impossible, and I'm not supposed to sacrifice them for the sake of better debug information. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 4:20 ` Joe Buck 2007-12-17 8:13 ` Geert Bosch @ 2007-12-17 18:36 ` Alexandre Oliva 1 sibling, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-12-17 18:36 UTC (permalink / raw) To: Joe Buck Cc: Daniel Berlin, Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 16, 2007, Joe Buck <Joe.Buck@synopsys.COM> wrote: > However, since preserving accurate debug information > has a cost, I think it would be better to turn -O1, not -O2, into the > mode that Alexandre wants, where debug information is preserved. In terms of memory, that's true, it does have a cost, for we have to keep more information around. That's one of the reasons why I'm implementing this all under the control of a command-line option: you can selectively enable or disable it, regardless of the level of optimization. If we want to make it default for -O1, but not for -O2, sure, that works. But this won't make much of a difference in terms of code change. Except for the fact that we could simply leave alone the passes that are only executed at -O2 or higher (which is not worth it, given that I've already done the small work needed for them to keep debug info accurate), most of the passes will still keep the information accurate, nearly all of them without any code changes whatsoever. So, doing this only for -O1 seems like a waste, given that -O2 is the most common optimization level, and it's most often accompanied by -g. > Trying to rework all optimizations to keep perfect debug information > is going to take forever and make the compiler worse. This statement is easy to make and to believe, but my approach is proving it false, given a design that took this concern into account. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 1:27 ` Daniel Berlin 2007-12-17 4:20 ` Joe Buck @ 2007-12-17 17:59 ` Alexandre Oliva 2007-12-17 18:02 ` Diego Novillo 1 sibling, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-17 17:59 UTC (permalink / raw) To: Daniel Berlin Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 16, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote: >> It is obvious that you misunderstood what I want, and how intrusive >> the approach is. > Yes Alexandre, everyone who disagrees with you must not understand! My conclusion is not based on disagreement, but rather on the faulty arguments presented during the discussion. For example, when you took the argument that every transformation had effects on debug information, and used that to conclude that every transformation would need difficult changes to generate correct debug information, you left out from your reasoning a major strength of the design, that I had mentioned in the e-mail you responded to: that the optimizers already perform the transformations we need to keep debug information accurate. So, by missing or misunderstanding an essential part of the thought process that went into the design, you came to a false conclusion about it. > That's really the problem here. > None of us understand but you. I guess I'm to blame, for having naïvely put the code out without as much as a design and goals document, such that people started looking at it without actually understanding what it was about, and at the same time taking conclusions about it based on hunches rather than on solid logical grounds. At this point, we have a scenario in which people have already jumped to their conclusions, and whatever I say requires a much higher threshold to be listened to and accepted. 
It's quite unfortunate that psychological factors take such a large role in the making of technical decisions, and I naïvely assumed this wouldn't raise so much rejection, for being such a simple and well thought-out design. Oh, well... Something to avoid next time... -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 17:59 ` Alexandre Oliva @ 2007-12-17 18:02 ` Diego Novillo 2007-12-17 20:34 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Diego Novillo @ 2007-12-17 18:02 UTC (permalink / raw) To: Alexandre Oliva Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On 12/17/07 12:51, Alexandre Oliva wrote: > I guess I'm to blame, for having naïvely put the code out without as > much as a design and goals document Yes, you are. You need to provide such a document now. I can't see how you'll be able to incorporate your implementation without a convincing design. The barrier is probably going to be higher. You raised too much controversy, so I have my doubts about your simplicity claims. Diego. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 18:02 ` Diego Novillo @ 2007-12-17 20:34 ` Alexandre Oliva 2007-12-17 20:45 ` Diego Novillo 2007-12-31 15:40 ` Richard Guenther 0 siblings, 2 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-12-17 20:34 UTC (permalink / raw) To: Diego Novillo Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 17, 2007, Diego Novillo <dnovillo@google.com> wrote: > On 12/17/07 12:51, Alexandre Oliva wrote: >> I guess I'm to blame, for having naïvely put the code out without as >> much as a design and goals document > Yes, you are. Wow, thanks. At least we agree on something! ;-) > You need to provide such a document now. Can't I instead provide it when it's ready? You know, it wasn't me who asked to have the thing developed in the open. I didn't push it out just so that people who didn't want to understand it could beat on it before it was ready to defend itself. I put it out because there was an offer for contribution. > I can't see how you'll be able to incorporate your implementation > without a convincing design. Agreed, I don't see how this would be doable for any but the most trivial patches. > The barrier is probably going to be higher. > You raised too much controversy, so I have my doubts about your > simplicity claims. Oh, nice! *I* raised too much controversy. So people first ask me to put the code out such that they can peek at it and help, then most refrain from peeking at it because it's not ready and some who do raise some concerns that are not reflected by the code, and then everyone doubts I've taken those concerns into account and demand a design document that will no more than just repeat the information that's already out there but that people fail to take into account. And then, this is a technical discussion, so historical controversy shouldn't play any role in it, if people were rational about it. 
Now, can you please explain to me how the efforts of repeating myself one more time, rather than completing the implementation, are going to make it any more likely that people who have already made up their minds based on groundless fears will be convinced? If you really think it would be worth it, can you point out at what you feel to be missing in the consolidated documentation I posted upthread, in response to your request? I'd be happy to fill in the blanks, if you're willing to listen. But I wouldn't be happy to waste more time. (This is not to say that the document won't ever be produced; it's to say that I'm not to work on it right now. I have other deliverables ahead of it.) -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 20:34 ` Alexandre Oliva @ 2007-12-17 20:45 ` Diego Novillo 2007-12-18 1:02 ` Alexandre Oliva 2007-12-31 15:40 ` Richard Guenther 1 sibling, 1 reply; 150+ messages in thread From: Diego Novillo @ 2007-12-17 20:45 UTC (permalink / raw) To: Alexandre Oliva Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On 12/17/07 15:28, Alexandre Oliva wrote: >> You need to provide such a document now. > > Can't I instead provide it when it's ready? Of course. Diego. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 20:45 ` Diego Novillo @ 2007-12-18 1:02 ` Alexandre Oliva 2007-12-18 1:14 ` Diego Novillo 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 1:02 UTC (permalink / raw) To: Diego Novillo Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 17, 2007, Diego Novillo <dnovillo@google.com> wrote: > On 12/17/07 15:28, Alexandre Oliva wrote: >>> You need to provide such a document now. >> >> Can't I instead provide it when it's ready? > Of course. Thanks, Now, since you're so interested in it and you've already read the various perspectives on the issue that I listed in my yesterday's e-mail to you, would you help me improve this document, by letting me know what you believe to be missing from the selected postings on design strategies, rationales and goals: http://gcc.gnu.org/ml/gcc/2007-11/msg00229.html (goals) http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00160.html (initial plan) http://gcc.gnu.org/ml/gcc/2007-11/msg00261.html (detailed plan) http://gcc.gnu.org/ml/gcc/2007-11/msg00317.html (example) http://gcc.gnu.org/ml/gcc/2007-11/msg00590.html (more example) http://gcc.gnu.org/ml/gcc/2007-11/msg00176.html (design rationale) http://gcc.gnu.org/ml/gcc/2007-11/msg00177.html (clarification) I could then focus on these missing aspects too, in addition to the ones I already have, while designing the best form to present the ideas. Thanks in advance, -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 1:02 ` Alexandre Oliva @ 2007-12-18 1:14 ` Diego Novillo 2007-12-18 5:21 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Diego Novillo @ 2007-12-18 1:14 UTC (permalink / raw) To: Alexandre Oliva Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On 12/17/07 19:50, Alexandre Oliva wrote: > Now, since you're so interested in it and you've already read the > various perspectives on the issue that I listed in my yesterday's > e-mail to you, would you help me improve this document, by letting me > know what you believe to be missing from the selected postings on > design strategies, rationales and goals: No. I am not interested in organizing your thoughts for you. I am interested in reading a single, concise and well organized design document that you produce for all of us to understand what you want to do. Take your time. It doesn't need to be now. Diego. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 1:14 ` Diego Novillo @ 2007-12-18 5:21 ` Alexandre Oliva 2007-12-18 9:10 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 5:21 UTC (permalink / raw) To: Diego Novillo Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 17, 2007, Diego Novillo <dnovillo@google.com> wrote: > On 12/17/07 19:50, Alexandre Oliva wrote: >> Now, since you're so interested in it and you've already read the >> various perspectives on the issue that I listed in my yesterday's >> e-mail to you, would you help me improve this document, by letting me >> know what you believe to be missing from the selected postings on >> design strategies, rationales and goals: > No. I am not interested in organizing your thoughts for you. Wow, nice shot! So tell me, what part of what you've read in the selected bibliography seemed not organized for you? Maybe that's what I have to work on first. > I am interested in reading a single, concise and well organized design > document that you produce for all of us to understand what you want to > do. You got that already, except now I'm no longer sure you've actually read it. Have you? You got the goals. You got the way I intend to get there, in two levels of detail. You got examples that show why the goals can't be achieved in other simpler ways. You got various justifications for the representation I've chosen. Would reformatting these and stamping a title on top make it worthy of your interest? I really don't see what else you might want, and if the above isn't enough, then my rephrasing it all into a single document still wouldn't be enough. I'd be just wasting my time, and yours. So, please do tell me, what is it that you're still missing? Note that I can't promise to deliver, but I can't possibly give you what you want unless you help me figure out what it is. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 5:21 ` Alexandre Oliva @ 2007-12-18 9:10 ` Alexandre Oliva 2007-12-18 13:20 ` Diego Novillo ` (2 more replies) 0 siblings, 3 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 9:10 UTC (permalink / raw) To: Diego Novillo Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc [-- Attachment #1: Type: text/plain, Size: 753 bytes --] On Dec 18, 2007, Alexandre Oliva <aoliva@redhat.com> wrote: > On Dec 17, 2007, Diego Novillo <dnovillo@google.com> wrote: >> On 12/17/07 19:50, Alexandre Oliva wrote: >>> Now, since you're so interested in it and you've already read the >>> various perspectives on the issue that I listed in my yesterday's >>> e-mail to you, would you help me improve this document, by letting me >>> know what you believe to be missing from the selected postings on >>> design strategies, rationales and goals: >> No. I am not interested in organizing your thoughts for you. > Wow, nice shot! Rats, this below-the-waistline attack really got me annoyed. So annoyed that I spent the night writing up this consolidated design document. So, what do you say now? [-- Attachment #2: debug-var-loc.txt --] [-- Type: text/plain, Size: 22558 bytes --] A plan to fix local variable debug information in GCC by Alexandre Oliva <aoliva@redhat.com> 2007-12-18 draft == Introduction The DWARF Debugging Information Format, version 3, determines the ways a compiler can communicate the location of user variables at run time to debug information consumers such as debuggers, program analysis tools, run-time monitors, etc. One possibility is that the location of a variable is fixed throughout the execution of a function. This is generally good enough for unoptimized programs. However, for optimized programs, the location of a variable can vary. The variable may be live for some parts of a function, even in multiple locations simultaneously. 
At other parts, it may be completely unavailable, or it may still be computable even if no location actually holds its value. The encoding, in these cases, can be a location list: tuples with possibly-overlapping ranges of instructions, and location expressions that determine a location or a value for the variable. Historically, GCC started with the simpler, fixed-location model. In fact, back then, there weren't debug information formats that could represent anything better than this. More recently, GCC gained code to keep track of varying locations, and to emit debug information accordingly. Unfortunately, very many optimization passes discard information that would be necessary to emit correct and complete variable location lists. Coalescing, scalarizing, substituting, propagating, and many other transformations prevent the late-running variable tracker from doing an accurate job. By the time it runs, many variables no longer show up in the retained annotations, although they're still conceptually available. The variable tracker can't tell when a user variable overlaps with another, and it can't tell when a variable is overwritten, if the assignment is optimized away. These limitations are inherent to a model based on inspecting actual code and trying to make inferences from that. In order to be able to represent not only what remained in the code, but also what was optimized, combined or otherwise apparently-removed, additional information needs to be kept around. This paper describes an approach to maintain this information. == Goals * Ensure that, for every user variable for which we emit debug information, the information is correct, i.e., if it says the value of a variable at a certain instruction is at certain locations, or is a known constant, then the variable must not be at any other location at that point, and the locations or values must match reasonable expectations based on source code inspection. 
* Defining "reasonable expectations" is tricky, for code reordering typical of optimization can make room for numerous surprises. I don't have a precise definition for this yet, but saying that a variable holds a value that it couldn't possibly hold (e.g., because it is only assigned that value in a code path that is known not to be taken) is a very clear indication that something is amiss. The general guiding rule is, if we aren't sure the information is correct (or we're sure it isn't), we shouldn't pretend that it is.

* Try to ensure that, if the value of a variable is a known constant at a certain point in the program, this information is present in debug information.

* Try to ensure that, if the value of a variable is available or computable at any location at a certain point in the program, this information is present in debug information.

* Stop missing optimizations for the sake of preserving debug information.

* Avoid using additional memory and CPU cycles that would be needed only for debug information when compiling without generating debug information.

== Internal Representation

For historical reasons, GCC has two completely different, even if nearly isomorphic, internal representations: trees and RTL. This decision has required a lot of code to be duplicated for low-level manipulation and simplification of each of these representations.

Since tracking variables and their values must start early to ensure correctness, and be carried throughout the complete optimization process, it might seem tempting to introduce yet another representation for debug information, decaying both isomorphic representations into a single debug information representation.

The drawbacks would be additional duplication of internal representation manipulation code, and the possibility of increasing memory use out of the need for representing information in yet another format.
Another concern is that even the simplest compiler transformations may need to be reflected in debug information. This might indicate a need for modifying every point of transformation in every optimization pass so as to propagate information into the debug information representation. This is undesirable, because it would be very intrusive. But then, keeping references to the correct values, expressions or variables, as transformations are made, is precisely what optimization passes have to do to perform their jobs correctly. Finding a way to take advantage of this is a very non-intrusive way of keeping debug information accurate. In fact, most transformations wouldn't need any changes whatsoever: uses of variables in debug information can, in most optimization passes, be handled just like any other uses. Once this is established, a possible representation becomes almost obvious: statements (in trees) or instructions (in rtl) that assert, to the variable tracker, that a user variable or member is represented by a given expression: # DEBUG var expr By var, we mean a tree expression that denotes a user variable, for now. We envision trivially extending it to support components of variables in the future. By expr, we mean a tree or rtl expression that computes the value of the variable at the point in which the statement or instruction appears in the program. A special value needs to be specified for each representation that denotes a location or value that cannot be determined or represented in debug information, for example, the location of a variable that was completely optimized away. It might be useful to represent the expression as a list of expressions, and to distinguish lvalues from rvalues, but for now let's keep this simple. == Generating debug information Generating initial annotations when entering SSA is early enough in the translation that the program will still reflect very reliably the original source code. 
Annotations are only generated for user variables that are GIMPLE registers, i.e., variables that represent scalar values and that never have their address taken. Other kinds of variables don't have varying locations, so we don't need to worry about them.

After every assignment to such a variable, we emit a DEBUG statement that will preserve, throughout compilation, the information that, at that point, the assigned variable was represented by that expression. So, after turning an assignment such as the following into SSA form, we emit the debug statement below right after it:

  x_1 = whatever;
  # DEBUG x x_1

Likewise, at control flow merge points, for each PHI node we introduce in the SSA representation, we emit an annotation:

  # x_4 = PHI <x_1(3), x_2(4), x_3(7)>;
  # DEBUG x x_4

Then, we let tree optimizers do their jobs. Whenever they rename, renumber, coalesce, combine or otherwise optimize a variable, they will automatically update debug statements that mention them as well.

In the rare cases in which the presence of such a statement might prevent an optimization, we need to adjust the optimizer code such that the optimization is not prevented. This most often amounts to skipping or otherwise ignoring debug statements. In a few very rare cases, special code might be needed to adjust debug statements manually.

After transformation to RTL, the representation needs translation, but conceptually it's still the same: a mapping from variable to expression. Again, optimizers will most often adjust debug instructions automatically.

The exceptions can be handled at no cost: the test for whether an element of the instruction stream is an instruction or some kind of note, that never needs updating, is a range test, in its optimized form. By placing the identifier for a debug instruction at one of the limits of this range, testing for both ranges requires identical code, except for the constants.
Since most code that tests for INSN_P and handles instructions can and should match debug instructions as well, in order to keep them up to date, we extend INSN_P so as to match debug instructions, and modify the exceptions, that need to skip debug instructions, by using an alternate test, with the same meaning as the original definition of INSN_P. These simple and non-intrusive changes are relatively common, but still, by far, the exception rather than the rule. When optimizations are completed, including register allocation and scheduling, it is time to pick up the debug instructions and emit debug information out of them. Conceptually, the debug instructions represent points of assignment, at which a user variable ought to evaluate to the annotated expression, maintained throughout compilation. However, when the value of a variable is live at more than one location, it is important to note it, such that, if a debugging session attempts to modify the variable, all copies are modified. The idea is to use some mechanism to determine equivalent expressions throughout a function (say some variant of Global Value Numbering). At debug instructions, we assert that the value of the named variable is in the equivalence class represented by the expression. As we scan basic blocks forward and find that expressions in an equivalence class are modified, we remove them from the equivalence class, and thus from the list of available locations for the variable. When such expressions are further copied, we add them to equivalence classes. At function calls and volatile asm statements, we remove non-function-private memory slots from equivalence classes. At function calls, we also remove call-clobbered registers from equivalence classes. When no live expression remains in the equivalence class that represents a variable, it is understood that its value is no longer available. 
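The forward scan over equivalence classes described above can be illustrated with a small model. This is only a sketch under stated assumptions, not code from the VTA branch: locations are modelled as bits in a 32-bit set, the names `loc_set`, `eq_bind`, `eq_copy`, `eq_clobber`, `eq_call` and `eq_available` are hypothetical, and treating locations 0 to 3 as the call-clobbered registers is an arbitrary choice made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: each of up to 32 locations (registers or stack
   slots) is one bit; a variable's equivalence class is the set of
   locations currently known to hold its value.  */
typedef uint32_t loc_set;

#define LOC(n) ((loc_set) 1u << (n))

/* Assumption for illustration: locations 0..3 are call-clobbered.  */
#define CALL_CLOBBERED (LOC (0) | LOC (1) | LOC (2) | LOC (3))

/* A debug annotation binds the variable to the value now held in LOC;
   whatever the class contained before is stale.  */
static inline loc_set
eq_bind (unsigned loc)
{
  return LOC (loc);
}

/* LOC is overwritten with an unrelated value: drop it from the class.  */
static inline loc_set
eq_clobber (loc_set cls, unsigned loc)
{
  return cls & ~LOC (loc);
}

/* DST = SRC.  If SRC holds the variable's value, DST now holds it too;
   otherwise DST has just been overwritten with something else.  */
static inline loc_set
eq_copy (loc_set cls, unsigned dst, unsigned src)
{
  if (dst == src)
    return cls;
  if (cls & LOC (src))
    return cls | LOC (dst);
  return cls & ~LOC (dst);
}

/* A function call invalidates every call-clobbered location.  */
static inline loc_set
eq_call (loc_set cls)
{
  return cls & ~CALL_CLOBBERED;
}

/* The variable is available while at least one location remains.  */
static inline bool
eq_available (loc_set cls)
{
  return cls != 0;
}
```

In this model, binding the variable to location 5, copying it to location 1, overwriting location 5 and then crossing a call leaves the class empty, so the variable would be reported as unavailable after the call.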
At basic block confluences, we combine information from the end states of the incoming blocks and the debug statements added as a side effect of PHI nodes. The end result is accurate debug information. Also, except for transformations that require special handling to update debug annotations properly, debug information should come out as complete as possible. == Testability Since debug annotations are added early, and, in most cases, maintained up-to-date by the same code that optimizers use to maintain executable code up-to-date, debug annotations are likely to remain accurate throughout compilation. The risk of this approach is that the annotations get in the way of optimizations, thus causing executable code to vary depending on whether or not debug information is to be generated. The risk of varying code could be removed at the expense of generating and maintaining debug annotations throughout compilation and just throwing them away at the end. This is undesirable, for it would slow down compilation without debug information and waste memory while at that. Therefore, we've built testing mechanisms into the compiler to detect cases in which the presence of debug annotations would cause code changes. The bootstrap-debug Makefile target, by default, compiles the second bootstrap stage without debug information, and the third bootstrap stage with it, and then compares all object files after stripping them, a process that discards all debug information. Furthermore, bootstrap4-debug, after bootstrap-debug and prepare-bootstrap4-debug-lib-g0, rebuilds all target libraries without debug information, and compares them with the stage3 target libraries, built with debug information. At the time of this writing, both tests pass on platforms x86_64-linux-gnu and i686-linux-gnu, and ppc64-linux-gnu and ia64-linux-gnu are getting close. 
Additional testing mechanisms should be built in, to exercise a wider range of internal GCC behaviors and extensions, for example, by comparing the compiler output with and without debug information while compiling all of its testsuite.

Even if testing mechanisms fail to catch an error, the generation of debug annotations is controlled by a command-line option, such that any code changes caused by it can be easily avoided, at the expense of the quality of the debug information.

Testing for accuracy and completeness of debug information can be best accomplished using a debugging environment. For example, writing programs of increasing complexity, adding function-call or asm probe points to stabilize the internal execution state, and then examining the state of the program at these probe points in a debugger, shall let us know how accurate and how complete variable location information is.

Measuring accuracy is easy: if you ask for the value of a variable, and get a value other than the expected one, there's a bug in the compiler. If you get "unavailable", this can still be regarded as accurate, for locations are always optional. However, it might be incomplete.

Telling whether the variable was indeed optimized away, or whether the value is available or computable but the information is missing, is a harder problem, though it belongs to the completeness test rather than to the accuracy test.

The completeness score for an unoptimized program might very often be unachievable for optimized programs, not because the compiler is doing a poor job at maintaining debug information, but rather because the compiler is doing a good job at optimizing it, to the point that it is no longer possible to determine the value of the inspected variable.

== Concerns

=== Memory consumption

Keeping more information around requires more memory; information theory tells us that there's only so much information you can fit in a bit.
In order to generate correct debug information, more information needs to be retained throughout compilation. The only way to arrange for debug information to not require any additional memory is to waste memory when not generating debug information. But this is undesirable. Therefore, the better debug information we want, the more memory overhead we're going to have to tolerate. Of course at times we can trade memory for efficiency, using more computationally expensive representations that are more compact. At other times, we may trade memory for maintainability. For example, instead of emitting annotations as soon as we enter SSA mode, we could emit them on demand, i.e., whenever we deleted, moved or significantly modified an SSA assignment for which we would have emitted a debug annotation. Additional memory would be needed to mark assignments that should have gained annotations but haven't, and care must be taken to make sure that transformations aren't made without leaving a correct debug statement in place. It is not clear that this would save significant memory, for a large fraction of relevant assignments are modified or moved anyway, so it might very well be a maintainability loss and a performance penalty for no measurable memory gains. Worst case, we may trade memory for debug information quality: if memory use of this scheme is too high for some scenario, one can disable debug information annotations through a command line option, or disable debug information altogether. === Intrusiveness Given that nearly all compiler transformations would require reflection in debug information, any solution that doesn't take advantage of this fact is bound to require changes all over the place. 
Perhaps not so much for Tree-SSA passes, that are relatively well-behaved and use a narrow API to make transformations, but very clearly so for RTL passes, that very often modify instructions in place, and at times even reuse locations assigned to user variables as temporaries. Even when we do use the strength of optimizers to maintain debug information up to date, there are exceptions in which detailed knowledge about the transformation taking place enables us to adjust the annotations properly, if possible, or to discard location information for the variable otherwise. It is just not possible to hope that information can be maintained accurate throughout compilation without any effort from optimizers, or even through a trivial API for a debug information generator. A number of the exceptions that require detailed knowledge about the ongoing transformation would be indistinguishable from other common transformations that would have very different effects on debug information. At this point, any expectations of lower intrusiveness by use of such an API vanish. By letting optimizers do their jobs on debug annotations, and handling exceptions only at the few locations where they are needed, trivially in most such cases, we keep intrusiveness at a minimum. Of course we could get even lower intrusiveness by accepting errors in debug information, or accepting to generate different code depending on debug information command-line options. But these options shouldn't be considered seriously. === Complexity The annotations are conceptually trivial and they can be immediately handled by optimizers. It is hard to imagine a simpler design that would still enable us to get right cases such as those in the examples below. Worrying about the representation of debug annotations as statements or instructions, rather than notes, is missing the fact that, most of the time, we do want them to be updated just like statements and instructions. 
Worrying about the representation of debug annotations in-line, rather than an on-the-side representation, is a valid concern, but it's addressed by the testability of the design, and the in-line representation is highly advantageous, not only for using optimizers to keep debug information accurate, but also for doing away with the need for yet another internal representation and all the efforts into maintaining it accurate. === Optimizations Correct and more complete debugging information isn't supposed to disable optimizations. Keep in mind that enabling debug information isn't supposed to modify the executable code in any way whatsoever. The goal is to ensure that whatever debug information the compiler generates actually matches the executable code, and that it is as complete as viable. The goal is not to disable optimizations so as to preserve variables or code, such that it can be represented in debug information and provide for a debugging experience more like that of code that is not optimized. If debug information disables any optimization, that's a bug that needs fixing. Now, while testing this design, a number of opportunities for optimization that GCC missed were detected and fixed, others were merely detected, and at least one optimization shortcoming kept in place in order to get better debug information could be removed, for the new debug information infrastructure enables the optimization to be applied in its fullest extent. 
== Examples

It is desirable to be able to represent constants and other optimized-away values, rather than stating variables have values they can no longer have:

  int
  x1 (int x)
  {
    int i;

    i = 2;
    f(i);
    i = x;
    h();
    i = 7;
    g(i);
  }

Even if variable i is completely optimized away, a debugger can still print the correct values for i if we keep annotations such as:

  (debug (var_location i (const_int 2)))
  (set (reg arg0) (const_int 2))
  (call (mem (symbol_ref f)))
  (debug (var_location i unknown))
  (call (mem (symbol_ref h)))
  (debug (var_location i (const_int 7)))
  (set (reg arg0) (const_int 7))
  (call (mem (symbol_ref g)))

In this case, before the call to h, not only the assignment to i was dead, but also the value of the incoming argument x had already been clobbered. If i had been assigned to another constant instead, debug information could easily represent this.

Another example that covers PHI nodes and conditionals:

  int
  x2 (int x, int y, int z)
  {
    int c = z;

    whatever0(c);
    c = x;
    whatever1();
    if (some_condition)
      {
        whatever2();
        c = y;
        whatever3();
      }
    whatever4(c);
  }

With SSA infrastructure, this program can be optimized to:

  int
  x2 (int x, int y, int z)
  {
    int c;

    # bb 1
    whatever0(z_0(D));
    whatever1();
    if (some_condition)
      {
        # bb 2
        whatever2();
        whatever3();
      }
    # bb 3
    # c_1 = PHI <x_2(D)(1), y_3(D)(2)>;
    whatever4(c_1);
  }

Note how, without debug annotations, c is only initialized just before the call to whatever4. At all other points, the value of c would be unavailable to the debugger, possibly even wrong.

If we were to annotate the SSA definitions forward-propagated into c versions as applying to c, we'd end up with all of x_2, y_3 and z_0 applied to c throughout the entire function, in the absence of additional markers.
Now, with the annotations proposed in this paper, what is initially:

  int
  x2 (int x, int y, int z)
  {
    int c;

    # bb 1
    c_4 = z_0(D);
    # DEBUG c c_4
    whatever0(c_4);
    c_5 = x_2(D);
    # DEBUG c c_5
    whatever1();
    if (some_condition)
      {
        # bb 2
        whatever2();
        c_6 = y_3(D);
        # DEBUG c c_6
        whatever3();
      }
    # bb 3
    # c_1 = PHI <c_5(1), c_6(2)>
    # DEBUG c c_1
    whatever4(c_1);
  }

is optimized into:

  int
  x2 (int x, int y, int z)
  {
    int c;

    # bb 1
    # DEBUG c z_0(D)
    whatever0(z_0(D));
    # DEBUG c x_2(D)
    whatever1();
    if (some_condition)
      {
        # bb 2
        whatever2();
        # DEBUG c y_3(D)
        whatever3();
      }
    # bb 3
    # c_1 = PHI <x_2(D)(1), y_3(D)(2)>;
    # DEBUG c c_1
    whatever4(c_1);
  }

and then, at every one of the inspection points, we get the correct value for variable c.

== Conclusion

This design enables a compiler to emit variable location debug information that complies with the DWARF version 3 standard, and that is likely to be as complete as theoretically possible, with an implementation that is conceptually simple, relatively easy to introduce, trivial to test and easy to maintain in the long run.

Not wasting memory or CPU cycles during compilation without debug information is a welcome bonus.

[-- Attachment #3: Type: text/plain, Size: 250 bytes --]

-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 9:10 ` Alexandre Oliva @ 2007-12-18 13:20 ` Diego Novillo 2007-12-18 15:42 ` Alexandre Oliva 2007-12-18 22:43 ` Daniel Berlin 2007-12-18 23:35 ` Daniel Berlin 2 siblings, 1 reply; 150+ messages in thread From: Diego Novillo @ 2007-12-18 13:20 UTC (permalink / raw) To: Alexandre Oliva Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On 12/18/07 03:07, Alexandre Oliva wrote: > Rats, this below-the-waistline attack really got me annoyed. I'm sorry you feel that way, it was not meant as a personal attack, though it was rather brusque. I was getting tired of asking for the same thing over and over again. > So, what do you say now? Thank you. Now I have something concrete to read and comment on. Diego. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 13:20 ` Diego Novillo @ 2007-12-18 15:42 ` Alexandre Oliva 0 siblings, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-12-18 15:42 UTC (permalink / raw) To: Diego Novillo Cc: Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 18, 2007, Diego Novillo <dnovillo@google.com> wrote: > On 12/18/07 03:07, Alexandre Oliva wrote: >> Rats, this below-the-waistline attack really got me annoyed. > I'm sorry you feel that way, it was not meant as a personal attack, > though it was rather brusque. I was getting tired of asking for the > same thing over and over again. >> So, what do you say now? > Thank you. Now I have something concrete to read and comment on. You already had it. Really. You just didn't feel like reading and commenting on it, for whatever reason I can't understand, which is why you kept asking for what you already had over and over again. Anyhow... I expect your feedback, err... "now" ;-P :-D -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-18 9:10 ` Alexandre Oliva 2007-12-18 13:20 ` Diego Novillo @ 2007-12-18 22:43 ` Daniel Berlin 2007-12-19 6:07 ` Alexandre Oliva 2007-12-18 23:35 ` Daniel Berlin 2 siblings, 1 reply; 150+ messages in thread From: Daniel Berlin @ 2007-12-18 22:43 UTC (permalink / raw) To: Alexandre Oliva Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On 12/18/07, Alexandre Oliva <aoliva@redhat.com> wrote:

> Then, we let tree optimizers do their jobs. Whenever they rename,
> renumber, coalesce, combine or otherwise optimize a variable, they
> will automatically update debug statements that mention them as well.
>

Speaking only about the tree level, in this entire email I make no representations about the RTL level ;)

This is much harder than you give it credit for, unless you plan on throwing out all the info at elimination points.

Consider PRE alone, which makes new statements that are combinations of old ones, and eliminates tons of variables in favor of it.

If your debug statement strategy is "move debug statements when we insert code that is equivalent", it won't work, because our equivalence is based on value equivalence, not location equivalence. We only guarantee it has the same value as the whatever it is a copy of at that point, not that it has the same location.

So you will lose info every time PRE makes an insertion, unless you make serious modifications to PRE.

This is not to mention the data you lose if you just throw it away at elimination points.

Let's take another problem.

How do i say debug info for some variable is now dead, we have no idea what it is right now?

How do I figure out which debug statements need to be modified when you introduce new memory operations?

When you pass something by address, you get vops. The vops are not variables, and have no relation to the original variable (they can be partitions containing more variables).
If i have

  DEBUG(x, x_3)
  x_3 = x;   // Read from global
  y = x_3;
  ....

If i insert a new call

  DEBUG(x, x_3): 1
  x_3 = x
  foo()   // May modify x and *&x)
  y = x_3

Now you have two problems.

It is no longer true that, at the point of y = x_3, DEBUG (x, x_3) is true. In fact, x_3 may no longer have any relation to x.

You have three choices:

1. Either destroy the DEBUG(x, x_3), losing valuable and correct info
2. Add a new DEBUG (x, unknown)
3. Figure out which debug statements are reached by your call

#3 is a dataflow problem, and not something you want to do every time you insert a call.

If your answer is #1 or #2, then what you are really doing is computing roughly the same dataflow problem var-location does, except on trees and with a different meet-operation.

var-location generates incorrect info not because it represents something fundamentally different than you are (it doesn't), it falls down because it uses union as the meet operation. It says "oh, i don't know which of these locations is right, it must be both of them".

If you changed the meet operation to "oh, i don't know which of these locations is right, it must be none of them", and did a little more work you would inference the same info as yours *at the tree level*

Nothing you have proposed is fundamentally going to give you better info. All you have done is annotated the IR in some places to make explicit some bits in the dataflow problem that you could inference anyway.

It is provable you can inference them with a simple lattice and associated value, *unless you are going to start guessing* (which you have said you don't want to do because it can generate incorrect info).

There is absolutely no reason what you are trying to do needs to modify the tree IR at all to achieve exactly the same accuracy of debug info as your design proposes at the tree level. You could simply compute the global dataflow problem.

The RTL level is harder, of course.

^ permalink raw reply [flat|nested] 150+ messages in thread
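The difference between the two meet operations discussed in the message above can be sketched in a few lines. This is an illustrative model only and makes no claim about what var-tracking actually implements: locations are modelled as bits in a 32-bit set, one set per incoming edge, and the names `loc_set`, `meet_union` and `meet_intersection` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per candidate location of a variable.  */
typedef uint32_t loc_set;

/* Union meet: a location survives the confluence if it is valid on
   ANY incoming edge ("it must be both of them").  */
static inline loc_set
meet_union (const loc_set *in, size_t n)
{
  loc_set out = 0;
  for (size_t i = 0; i < n; i++)
    out |= in[i];
  return out;
}

/* Intersection meet: a location survives only if it is valid on
   EVERY incoming edge; with no incoming edges nothing is known.  */
static inline loc_set
meet_intersection (const loc_set *in, size_t n)
{
  loc_set out = n ? ~(loc_set) 0 : 0;
  for (size_t i = 0; i < n; i++)
    out &= in[i];
  return out;
}
```

With incoming sets {0, 1} and {1, 2}, the union meet reports locations 0, 1 and 2 at the merge point, whereas the intersection meet reports only location 1, the one location that holds the value regardless of which edge was taken.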
* Re: Designs for better debug info in GCC 2007-12-18 22:43 ` Daniel Berlin @ 2007-12-19 6:07 ` Alexandre Oliva 2007-12-19 8:39 ` Daniel Berlin 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-19 6:07 UTC (permalink / raw) To: Daniel Berlin Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 18, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote: > Consider PRE alone, > If your debug statement strategy is "move debug statements when we > insert code that is equivalent" Move? Debug statements don't move, in general. I'm not sure what you have in mind, but I sense some disconnect here. > because our equivalence is based on value equivalence, not location > equivalence. We only guarantee it has the same value as the > whatever it is a copy of at that point, not that it has the same > location. This sounds perfect to me. I'm concerned about values. Locations are an implementation detail. The thing to keep in mind is that what was originally a single user variable may end up mangloptimized into multiple stack slots, registers, with multiple simultaneously-live versions. Trying to pretend that any of these represent the user variable sounds like a recipe for madness to me. So I focus on values instead, and then on trying to recover locations based on binding and sharing of values. > How do i say debug info for some variable is now dead, we have no idea > what it is right now? For annotations, look for VAR_DEBUG_VALUE_NOVALUE in tree.h and VAR_LOC_UNKNOWN_P in rtl.h, in the VTA branch. For dwarf location lists, you just refrain from emitting locations for a given range. > How do I figure out which debug statements need to be modified when > you introduce new memory operations? None. By definition, debug annotations are only about variables that are not addressable. Those that are are fixed at a single location, so there's no reason to track them in a fancy way. 
> If i insert a new call
> DEBUG(x, x_3): 1
> x_3 = x
> foo() // May modify x and *&x)
> y = x_3
> Now you have two problems.

You're talking about a real problem, but your example is misguided. Let me give you a real problem scenario.

  (set (reg <T>) (<whatever>))
  (var_location x (reg <T>))
  (set (mem <addr>) (reg <T>))
  (set (reg <T>) (<somethingelse>))
  (call (mem (symbol_ref foo)))

So, at the var_location debug_insn, we know that x is in reg <T>. That's stored at *addr, so now we might be able to use it as an additional location for x. And then, when reg is modified, we remove T from the equivalence class, and then the only location holding the value of x is *addr.

Then, a function call, that might modify *addr. So, do we decide that x is no longer available after the call, or do we hope *addr still represents it?

The thing to remember is that the annotations are only about gimple regs. This means calls don't modify them, ever. But we still have to decide whether *addr represents x or not.

My thoughts are leaning towards looking at the memory address or other memory attributes to tell whether it's an addressable stack slot or not. If it's addressable, remove it from the equivalence class at the call, so the equivalence class becomes empty, and the variable is regarded as dead. If it's not addressable (a pseudo assigned to memory), then we can keep it, even if x is actually dead past the call.

What we'll see is that, if x is not dead after the call, the compiler will arrange to preserve its value in one such local non-addressable stack slot, and it will probably extend the equivalence class again after the call, as the pseudo is restored. Or the pseudo will be temporarily assigned to a call-saved register, which, for being call-saved, won't be removed from equivalence classes at call instructions.

Whereas, if x is dead and its value was just copied to some random
memory location, then we may as well flag it as dead at the call site,
where the memory location may be modified.

So, it all works out nicely, because we know we're only dealing with
gimple regs.

Volatile asms make this slightly trickier, because they're totally
unpredictable. I'm thinking it's safest to simply remove addressable
memory locations from equivalence classes at them, but I don't have it
completely figured out.

> #3 is a dataflow problem, and not something you want to do every time
> you insert a call.

I'm not sure what you mean by "inserting calls". We don't do that.
Calls are present in the source code (even when implied by stuff like
TLS, OpenMP or builtins such as memcpy), and they're either kept
around, eliminated or inlined.

(digression intended to be funny: this "inserting a call" discussion
reminds me of those impossible initial conditions in electromagnetism
textbook exercises, such as uniform magnetic fields in which charged
particles suddenly appear ;-)

> If your answer is #1 or #2, then what you are really doing is
> computing roughly the same dataflow problem var-location does, except
> on trees and with a different meet-operation.

I am actually computing the same dataflow problem as var-tracking.
That's the whole point. But I'm giving it more information, to enable
it to track more variables. And it needs to deal with multiple
concurrent locations for the same variable, and multiple variables in
the same locations, which are "slight" complications. But you're
right, in the end it's the same problem.

But I'm not computing that in trees. I'm just collecting and
maintaining data points for var-tracking, all the way from the tree
level.

> var-location generates incorrect info not because it represents
> something fundamentally different than you are (it doesn't), it falls
> down because it uses union as the meet operation.
> It says "oh, i don't know which of these locations is right, it must
> be both of them".

However, it can't deal with parallel locations, so this is at odds
with your statement. I haven't got 'round to studying the exact
dataflow algorithm var-tracking uses, I just figured I needed to do
something along these lines. Maybe it does need tweaking, if I end up
using it. I'm not sure yet it's going to make sense to use it for the
more detailed tracking of copying that I'm going to have to do.

> If you changed the meet operation to "oh, i don't know which of these
> locations is right, it must be none of them", and did a little more
> work you would inference the same info as yours *at the tree level*

Intersection sounds like the right approach to me. I assumed
var-tracking did this, except for unknowns. It's a bit trickier than
this because var-tracking has to deal with a lot of incomplete
information. But at least for vta values, we are going to have a
complete picture, so we can be stricter when it comes to gimple reg
variables.

Now, whether the fact that we could infer the very same values at the
tree level is relevant, I don't know. The tree level is neither
source level nor the final executable code, so unless we can establish
useful mappings from the tree level to both source level and final
executable code, this information is of little use, no matter how true
it is.

> Nothing you have proposed is fundamentally going to give you better info.

Except for what tree transformations currently discard, such as the
points of the program in which variables are bound to values. This is
indeed one of the elements that the annotations are trying to
preserve, and that the compiler has not cared about preserving.
(The other being expressions that end up not computed at run time, but
that could still be computed by a debugger based on state available
elsewhere.)

> All you have done is annotated the IR in some places to make explicit
> some bits in the dataflow problem that you could inference anyway.

Now, this is not true. I could infer values, yes, but I couldn't
infer the variables they relate to, nor the point of binding. And
debug information is not just about the values, it's about mapping
variables to values and locations. So, we can't infer all the
information we need.

> There is absolutely no reason what you are trying to do needs to
> modify the tree IR at all to achieve exactly the same accuracy of
> debug info as your design proposes at the tree level.

So far these claims have been unconvincing. I still get the feeling
that you're missing some aspects of the problem, but I invite you to
show me how the information available in the current IR could be used
to generate accurate debug information for the two examples in the
design document. Even if we leave the RTL aspect of it aside for a
moment.

I certainly wouldn't mind having to generate annotations only when we
move from Trees to RTL, but I can't imagine how we'd reintroduce
bindings at points that are not marked in the tree level, for
variables that are (partially or entirely) gone from the tree IR.

--
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}
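
The equivalence-class bookkeeping described in the message above (a
variable bound to a register, copied to memory, the register
overwritten, then a call) can be modelled in a few lines. This is only
a minimal sketch under stated assumptions: the class and function names
are hypothetical and are not taken from the VTA branch, and
call-clobbered registers are deliberately left out so that only the
addressable versus non-addressable distinction is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loc:
    """A place that may hold a value; `addressable` only matters for memory."""
    name: str
    is_mem: bool = False
    addressable: bool = False


class LocationClasses:
    """Maps each user variable to the set of locations known to hold its value."""

    def __init__(self):
        self.holders = {}

    def bind(self, var, loc):
        # Models (var_location var loc): from here on, var's value lives in loc.
        self.holders[var] = {loc}

    def clobber(self, loc):
        # loc is overwritten with an unrelated value.
        for locs in self.holders.values():
            locs.discard(loc)

    def copy(self, src, dst):
        # Models (set dst src): dst loses its old value, then joins every
        # class that currently contains src.
        self.clobber(dst)
        for locs in self.holders.values():
            if src in locs:
                locs.add(dst)

    def call(self):
        # Assumption: a call may write any addressable memory, nothing else.
        for locs in self.holders.values():
            for loc in [l for l in locs if l.is_mem and l.addressable]:
                locs.discard(loc)

    def is_known(self, var):
        return bool(self.holders.get(var))


def scenario(slot):
    """Replay the five-insn example with `slot` standing in for (mem <addr>)."""
    state = LocationClasses()
    reg_t = Loc("T")
    state.bind("x", reg_t)     # (var_location x (reg <T>))
    state.copy(reg_t, slot)    # (set (mem <addr>) (reg <T>))
    state.clobber(reg_t)       # (set (reg <T>) (<somethingelse>))
    state.call()               # (call (mem (symbol_ref foo)))
    return state.is_known("x")


x_survives_in_escaped_memory = scenario(Loc("*addr", is_mem=True, addressable=True))
x_survives_in_spill_slot = scenario(Loc("spill0", is_mem=True, addressable=False))
```

With an addressable slot the class for x ends up empty after the call,
so x would be reported as unknown; with a non-addressable spill slot
the location is kept.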
* Re: Designs for better debug info in GCC
  2007-12-19  6:07 ` Alexandre Oliva
@ 2007-12-19  8:39 ` Daniel Berlin
  2007-12-19 16:12 ` Daniel Berlin
  2007-12-19 20:27 ` Alexandre Oliva
From: Daniel Berlin @ 2007-12-19 8:39 UTC
To: Alexandre Oliva
Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor,
    Richard Guenther, gcc-patches, gcc

On 12/19/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Dec 18, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:
>
> > Consider PRE alone,
>
> > If your debug statement strategy is "move debug statements when we
> > insert code that is equivalent"
>
> Move? Debug statements don't move, in general. I'm not sure what you
> have in mind, but I sense some disconnect here.

Okay, so if you aren't going to move them, you have to erase them when
you move statements around.

>
> > because our equivalence is based on value equivalence, not location
> > equivalence. We only guarantee it has the same value as the
> > whatever it is a copy of at that point, not that it has the same
> > location.

This is just a problem with an initial state and some propagation at
each statement.

How were you going to generate the initial set of debug annotations?
This is how you get your initial state for your dataflow problem.
How were you going to update it if you saw a statement was updated to
say x_5 = x_4 instead of x_5 = x_3 + x_2?

The same operation you perform to update your annotations when you see
x_5 = x_4 works whether you started with x_5 = x_3 + x_2 or not (it
better, or else your updating will give different results for the same
IR depending on how you got there, which is *incredibly* bad).

So then how will using your debug annotations and updating them come
out any different than say performing a value numbering pass where you
also associate user variables with the ssa names (IE alongside our
value numbers), and propagate them around as well?

If you want to associate multiple user variables with a single SSA
definition point, you can do that as well (use union instead of copy).
You can do whatever you think is best at phi nodes (empty set if user
var sets are not equal, or union them or intersect them).

At the end, you could emit DEBUG(user var, ssa name) right after each
SSA_NAME_DEF_STMT for all user vars in the user var set for ssa name.

The right DEBUG statements would then appear at the points you can
guarantee the user variable has the same *value* as the gimple
register you've said it does. From there, it is up to you to do what
you like with the result.

(it's late, so i may have described/calculated the dataflow problem
backwards, but you get the idea)

This is, after all, more or less what PRE does for its value
numbering. It computes which things have the same value at what
points in the program, then uses this after computing some more
dataflow problems that say where this implies reuse.

I don't see why you believe user variables/bindings are special and
can't be propagated in this manner, given that you can't depend on the
type of statement change that has occurred, only what the IR looks
like after the statement change. Otherwise, again, the same IR and
source may have different debug annotations depending on the set of
changes you applied to get that IR from the initial IR, which is not
good for the standard reasons [maintainability, determinism,
reproducibility, etc].

>
> > #3 is a dataflow problem, and not something you want to do every time
> > you insert a call.
>
> I'm not sure what you mean by "inserting calls". We don't do that.

Sure we do.
We will definitely insert new calls when we PRE const/pure calls, or
calls we determine to be movable to the point we want to move them
(using call clobbered results, etc).
This will insert calls in latch blocks, above loops, in branch
conditions.

This is not just movement.
It is insertion of calls that did not exist in the source code at a
given point, but are allowed to be executed at that point in the
source code anyway.

> Calls are present in the source code (even when implied by stuff like
> TLS, OpenMP or builtins such as memcpy), and they're either kept
> around, eliminated or inlined.

No, we can and will insert new calls.
Not just for PRE, but for profiling, devirtualization, struct reorg,
SRA, etc.

  struct reorg inserts new mallocs and frees
  profiling inserts profiling calls
  devirt will insert branches and new calls to replace virtual
    function calls
  SRA will insert memcpys to and from structures that were not there
    in user source before.

i could go on if you like.

I'm not sure why you believe all the calls that we end up with in the
IR are actually in the source (or even implied by it).

>
> But I'm not computing that in trees. I'm just collecting and
> maintaining data points for var-tracking, all the way from the tree
> level.

Okay, then for trees, why bother tracking it when you can compute it
right before translation with the same accuracy you can if you update
it every time you make statement changes?

>
> > All you have done is annotated the IR in some places to make explicit
> > some bits in the dataflow problem that you could inference anyway.
>
> Now, this is not true. I could infer values, yes, but I couldn't
> infer the variables they relate to, nor the point of binding

See above.

> And
> debug information is not just about the values, it's about mapping
> variables to values and locations.

You have no locations at the tree level, and i've explicitly said what
i said applies to the tree level :)

> So, we can't infer all the
> information we need.

Again, i believe we can at the tree level.
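
The three choices mentioned above for merging user-variable sets at a
phi node (union, intersection, or "empty unless all equal") behave
quite differently. The sketch below is illustrative only: the SSA names
and the `user_vars` table are made up, and it models just the meet
step, not a full value-numbering pass.

```python
def meet_union(incoming):
    """'It must be all of them': keep a variable if any incoming edge has it."""
    result = set()
    for names in incoming:
        result |= names
    return result


def meet_intersection(incoming):
    """'It must be none of them': keep a variable only if every edge agrees."""
    result = set(incoming[0])
    for names in incoming[1:]:
        result &= names
    return result


def meet_equal_or_empty(incoming):
    """Keep the set only when all incoming sets are identical."""
    first = incoming[0]
    return set(first) if all(names == first for names in incoming) else set()


# Hypothetical state just before a phi t_5 = PHI <t_2, t_4>:
# t_2 currently stands for user variables a and x, t_4 for b and x.
user_vars = {"t_2": {"a", "x"}, "t_4": {"b", "x"}}
incoming = [user_vars["t_2"], user_vars["t_4"]]

by_union = meet_union(incoming)
by_intersection = meet_intersection(incoming)
by_equality = meet_equal_or_empty(incoming)
```

Union claims that t_5 holds a and b as well as x, which is wrong on at
least one path; intersection keeps only x; the equality rule discards
everything, including the x binding that both edges agree on.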
* Re: Designs for better debug info in GCC
  2007-12-19  8:39 ` Daniel Berlin
@ 2007-12-19 16:12 ` Daniel Berlin
  2007-12-19 16:36 ` Andrew MacLeod
                   ` (2 more replies)
  2007-12-19 20:27 ` Alexandre Oliva
From: Daniel Berlin @ 2007-12-19 16:12 UTC
To: Alexandre Oliva
Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor,
    Richard Guenther, gcc-patches, gcc

On 12/19/07, Daniel Berlin <dberlin@dberlin.org> wrote:
> On 12/19/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> > On Dec 18, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:
> >
> > > Consider PRE alone,
> >
> > > If your debug statement strategy is "move debug statements when we
> > > insert code that is equivalent"
> >
> > Move? Debug statements don't move, in general. I'm not sure what you
> > have in mind, but I sense some disconnect here.
>
> Okay, so if you aren't going to move them, you have to erase them when
> you move statements around.
>

Besides this, how do you plan on handling the following situations
(both of which reassoc performs *right now*)? These are the
relatively easy ones.

Here is the easy one:

  z_5 = a_3 + b_3
  x_4 = z_5 + c_3
  DEBUG(x, x_4)

Reassoc may transform this into:

  z_5 = c_3 + b_3
  x_4 = z_5 + a_3
  DEBUG(x, x_4)

Now x has the wrong value.
At least in this case, you can tell which DEBUG statement to eliminate
easily (it is an immediate use of x_4).

It gets worse, however:

  c_3 = a_1 + b_2
  z_5 = c_3 + d_9
  x_4 = z_5 + e_10
  DEBUG(x, x_4)
  y_7 = x_4 + f_11
  z_8 = y_7 + g_12

->

  c_3 = a_1 + b_2
  z_5 = c_3 + g_12
  x_4 = z_5 + e_10
  DEBUG(x, x_4)
  y_7 = x_4 + f_11
  z_8 = y_7 + d_9

x_4 now no longer represents the value of x, but we haven't directly
changed x_4, its immediate users, or the statements that immediately
make up its defining values.

How do you propose we figure out which DEBUG statements we may have
affected without doing all kinds of walks?

(This is, of course, a more general problem of how do i find which
debug statements are reached by my transformation without doing linear
walks.)
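
The second reassociation example above can be checked by plain
arithmetic. The sketch below simply transcribes both statement
sequences into Python and records what DEBUG(x, x_4) would report; the
input values are arbitrary and chosen only to make the difference
visible.

```python
def original(a_1, b_2, d_9, e_10, f_11, g_12):
    """Statement chain before reassociation."""
    c_3 = a_1 + b_2
    z_5 = c_3 + d_9
    x_4 = z_5 + e_10
    debug_x = x_4          # DEBUG(x, x_4)
    y_7 = x_4 + f_11
    z_8 = y_7 + g_12
    return debug_x, z_8


def reassociated(a_1, b_2, d_9, e_10, f_11, g_12):
    """Same chain after reassoc swaps d_9 and g_12 between two statements."""
    c_3 = a_1 + b_2
    z_5 = c_3 + g_12
    x_4 = z_5 + e_10
    debug_x = x_4          # DEBUG(x, x_4), untouched by the pass
    y_7 = x_4 + f_11
    z_8 = y_7 + d_9
    return debug_x, z_8


inputs = dict(a_1=1, b_2=2, d_9=10, e_10=100, f_11=1000, g_12=20000)
x_before, result_before = original(**inputs)
x_after, result_after = reassociated(**inputs)
```

The final result z_8 is identical in both versions, which is why the
transformation is legal, while the value the debugger would show for x
differs even though neither the statement defining x_4 nor the DEBUG
statement was edited.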
* Re: Designs for better debug info in GCC
  2007-12-19 16:12 ` Daniel Berlin
@ 2007-12-19 16:36 ` Andrew MacLeod
  2007-12-19 19:49 ` Daniel Berlin
  2007-12-19 20:00 ` Andrew MacLeod
  2007-12-19 20:07 ` Alexandre Oliva
From: Andrew MacLeod @ 2007-12-19 16:36 UTC
To: Daniel Berlin
Cc: Alexandre Oliva, Diego Novillo, Mark Mitchell, Robert Dewar,
    Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

Daniel Berlin wrote:
>
> Here is the easy one:
>
> z_5 = a_3 + b_3
> x_4 = z_5 + c_3
>
> DEBUG(x, x_4)
>
>
> Reassoc may transform this into:
>
>
> z_5 = c_3 + b_3
> x_4 = z_5 + a_3
>
> DEBUG(x, x_4)
>
> Now x has the wrong value.
>

??

x_4 looks like it has the value 'a_3 + b_3 + c_3' in both examples to
me, although computed in different orders...

so isn't that still the right value?

Andrew
* Re: Designs for better debug info in GCC
  2007-12-19 16:36 ` Andrew MacLeod
@ 2007-12-19 19:49 ` Daniel Berlin
From: Daniel Berlin @ 2007-12-19 19:49 UTC
To: Andrew MacLeod
Cc: Alexandre Oliva, Diego Novillo, Mark Mitchell, Robert Dewar,
    Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On 12/19/07, Andrew MacLeod <amacleod@redhat.com> wrote:
> Daniel Berlin wrote:
> >
> > Here is the easy one:
> >
> > z_5 = a_3 + b_3
> > x_4 = z_5 + c_3
> >
> > DEBUG(x, x_4)
> >
> >
> > Reassoc may transform this into:
> >
> >
> > z_5 = c_3 + b_3
> > x_4 = z_5 + a_3
> >
> > DEBUG(x, x_4)
> >
> > Now x has the wrong value.
> >
> ??
>
> x_4 looks like it has the value 'a_3 + b_3 + c_3' in both examples to
> me, although computed in different orders...
>
> so isn't that still the right value?

Yes, sorry, you have to add one more set of adds below and move one so
you can make it have a different value.
You get the general idea though :)

Reassoc knows they are all only used in each other, and that it is
okay to change their intermediate value as long as the last thing in
the chain retains its value (which it does since they are all
commutative operations).

>
> Andrew
>
* Re: Designs for better debug info in GCC
  2007-12-19 16:12 ` Daniel Berlin
  2007-12-19 16:36 ` Andrew MacLeod
@ 2007-12-19 20:00 ` Andrew MacLeod
  2007-12-19 20:57 ` Daniel Berlin
  2007-12-19 20:07 ` Alexandre Oliva
From: Andrew MacLeod @ 2007-12-19 20:00 UTC
To: Daniel Berlin
Cc: Alexandre Oliva, Diego Novillo, Mark Mitchell, Robert Dewar,
    Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

> It gets worse, however
>
> c_3 = a_1 + b_2
> z_5 = c_3 + d_9
> x_4 = z_5 + e_10
> DEBUG(x, x_4)
> y_7 = x_4 + f_11
> z_8 = y_7 + g_12
> ->
>
> c_3 = a_1 + b_2
> z_5 = c_3 + g_12
> x_4 = z_5 + e_10
> DEBUG(x, x_4)
> y_7 = x_4 + f_11
> z_8 = y_7 + d_9
>
>
> x_4 now no longer represents the value of x, but we haven't directly
> changed x_4, its immediate users, or the statements that immediately
> make up its defining values.
>
>

This does seem more troublesome. Reassociation shuffles things around
without changing the LHS, presumably because it has looked at the uses
and knows there are no uses outside the expression, so it can
manipulate them however it wants. It elects not to create new temps
since it knows the old ones aren't being used elsewhere, so why waste
new entries.

So if it was aware of the debug stmt, there would be a use of x_4
outside the expression, and it would no longer do the same
reassociation.

Is that the gist of it?

Andrew
* Re: Designs for better debug info in GCC
  2007-12-19 20:00 ` Andrew MacLeod
@ 2007-12-19 20:57 ` Daniel Berlin
From: Daniel Berlin @ 2007-12-19 20:57 UTC
To: Andrew MacLeod
Cc: Alexandre Oliva, Diego Novillo, Mark Mitchell, Robert Dewar,
    Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On 12/19/07, Andrew MacLeod <amacleod@redhat.com> wrote:
>
> > It gets worse, however
> >
> > c_3 = a_1 + b_2
> > z_5 = c_3 + d_9
> > x_4 = z_5 + e_10
> > DEBUG(x, x_4)
> > y_7 = x_4 + f_11
> > z_8 = y_7 + g_12
> > ->
> >
> > c_3 = a_1 + b_2
> > z_5 = c_3 + g_12
> > x_4 = z_5 + e_10
> > DEBUG(x, x_4)
> > y_7 = x_4 + f_11
> > z_8 = y_7 + d_9
> >
> >
> > x_4 now no longer represents the value of x, but we haven't directly
> > changed x_4, its immediate users, or the statements that immediately
> > make up its defining values.
> >
> >
>
> This does seem more troublesome. Reassociation shuffles things around
> without changing the LHS presumably because it has looked at the uses
> and knows there are no uses outside the expression, so it can manipulate
> them however it wants. It elects not to create new temps since it knows
> the old ones aren't being used elsewhere, so why waste new entries.

Yes.

>
> So if it was aware of the debug stmt, there would be a use of x_4
> outside the expression, and it would no longer do the same reassociation.

Either that, or you would have to hunt all the uses of every single
thing in the chain to see if any were debug expressions, and if the
value is going to change.

>
> Is that the gist of it?

Yes

>
> Andrew
>
* Re: Designs for better debug info in GCC
  2007-12-19 16:12 ` Daniel Berlin
  2007-12-19 16:36 ` Andrew MacLeod
  2007-12-19 20:00 ` Andrew MacLeod
@ 2007-12-19 20:07 ` Alexandre Oliva
  2007-12-19 22:00 ` Daniel Berlin
From: Alexandre Oliva @ 2007-12-19 20:07 UTC
To: Daniel Berlin
Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor,
    Richard Guenther, gcc-patches, gcc

On Dec 19, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:

> Here is the easy one:

> z_5 = a_3 + b_3
> x_4 = z_5 + c_3

> DEBUG(x, x_4)

> Reassoc may transform this into:

> z_5 = c_3 + b_3
> x_4 = z_5 + a_3

> DEBUG(x, x_4)

> Now x has the wrong value.

As Andrew said, no, it doesn't.

Now, if z_5 were present in a debug expression, then it would need
adjusting. No different from the adjusting need for any other
instruction in which z_5 was present, though.

That's what I mean when I talk about letting the optimizers do their
job on debug instructions too.

--
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}
* Re: Designs for better debug info in GCC
  2007-12-19 20:07 ` Alexandre Oliva
@ 2007-12-19 22:00 ` Daniel Berlin
  2007-12-20  9:26 ` Alexandre Oliva
From: Daniel Berlin @ 2007-12-19 22:00 UTC
To: Alexandre Oliva
Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor,
    Richard Guenther, gcc-patches, gcc

On 12/19/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Dec 19, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:
>
> > Here is the easy one:
>
> > z_5 = a_3 + b_3
> > x_4 = z_5 + c_3
>
> > DEBUG(x, x_4)
>
>
> > Reassoc may transform this into:
>
>
> > z_5 = c_3 + b_3
> > x_4 = z_5 + a_3
>
> > DEBUG(x, x_4)
>
> > Now x has the wrong value.
>
> As Andrew said, no, it doesn't.
>

Yes, I corrected it later.
You didn't address the other one, which is much harder and does
require addressing by you.

> Now, if z_5 were present in a debug expression, then it would need
> adjusting. No different from the adjusting need for any other
> instruction in which z_5 was present, though.

uh, but if you don't adjust in the fixed examples, DEBUG(x, x_4) will
give an invalid value.
You can cause this value to change without ever changing x_4, and do
so legally.

How do i know i need to change this DEBUG expression?
* Re: Designs for better debug info in GCC
  2007-12-19 22:00 ` Daniel Berlin
@ 2007-12-20  9:26 ` Alexandre Oliva
  2007-12-20 17:04 ` Ian Lance Taylor
From: Alexandre Oliva @ 2007-12-20 9:26 UTC
To: Daniel Berlin
Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor,
    Richard Guenther, gcc-patches, gcc

On Dec 19, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:

>> Now, if z_5 were present in a debug expression, then it would need
>> adjusting. No different from the adjusting need for any other
>> instruction in which z_5 was present, though.

> uh, but if you don't adjust in the fixed examples, DEBUG(x, x_4) will
> give an invalid value.

My point was that optimizers already had to know how to adjust things
such that they don't break code.

Now, in this optimization, reassoc takes additional liberties with
existing variables because it sees they're only used within the
sequence. IMHO, it would be more appropriate to introduce alternate
temporaries, rather than reusing SSA names for different purposes, in
this case.

If this approach were taken, the debug annotations referring to a
no-longer-defined SSA name would be recognized as invalid, and the
variable binding would be removed (i.e., turned into a "value unknown"
annotation).

Or, if we left the definitions in place, even though they're dead, the
same code that cleans up undefined SSA names could recognize these SSA
names as unused except in debug information and replace them with
their values, maintaining accurate and complete debug information.

But can we do better without introducing more SSA names and keeping
assignments around that are known to be dead? Yes, with some
additional effort, see below.

> How do i know i need to change this DEBUG expression.

As reassoc looks for sets of variables it can freely mess with, it
should take note of variables that are used in debug annotations in
addition to the kind of single (?) non-debug uses it's interested in,
such that, when it modifies these variables, the annotations can be
compensated for.

OTOH, if the compiler performs reassoc on user variables today, it
means we do get mangled debug information for such variables already,
and they get incorrect values. So, even if we didn't address this
problem right away, it wouldn't be much of a regression. But, of
course, not dealing with it breaks the goal of having correct debug
information, so it ought to be dealt with properly.

Do you happen to have a yummy testcase handy that I could use to
trigger this kind of transformation in ways that affect the value of
user variables?

Thanks in advance,

--
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}
* Re: Designs for better debug info in GCC
  2007-12-20  9:26 ` Alexandre Oliva
@ 2007-12-20 17:04 ` Ian Lance Taylor
  2007-12-20 20:53 ` Alexandre Oliva
From: Ian Lance Taylor @ 2007-12-20 17:04 UTC
To: Alexandre Oliva
Cc: Daniel Berlin, Diego Novillo, Mark Mitchell, Robert Dewar,
    Richard Guenther, gcc-patches, gcc

Alexandre Oliva <aoliva@redhat.com> writes:

> > How do i know i need to change this DEBUG expression.
>
> As reassoc looks for sets of variables it can freely mess with, it
> should take note of variables that are used in debug annotations in
> addition to the kind of single (?) non-debug uses it's interested in,
> such that, when it modifies these variables, the annotations can be
> compensated for.

The question is how it finds them efficiently, without doing a scan of
all instructions.

Ian
* Re: Designs for better debug info in GCC
  2007-12-20 17:04 ` Ian Lance Taylor
@ 2007-12-20 20:53 ` Alexandre Oliva
From: Alexandre Oliva @ 2007-12-20 20:53 UTC
To: Ian Lance Taylor
Cc: Daniel Berlin, Diego Novillo, Mark Mitchell, Robert Dewar,
    Richard Guenther, gcc-patches, gcc

On Dec 20, 2007, Ian Lance Taylor <iant@google.com> wrote:

> Alexandre Oliva <aoliva@redhat.com> writes:
>> > How do i know i need to change this DEBUG expression.
>>
>> As reassoc looks for sets of variables it can freely mess with, it
>> should take note of variables that are used in debug annotations in
>> addition to the kind of single (?) non-debug uses it's interested in,
>> such that, when it modifies these variables, the annotations can be
>> compensated for.

> The question is how it finds them efficiently, without doing a scan of
> all instructions.

It must keep track of variables it can mess with, so it might as well
take notes about those it has to be more careful about.

*Or* it can just introduce new temporaries, rename the uses and leave
the original sets behind for "garbage collection" AKA dead code
elimination, like I said.

One is more implementation work, the other is potentially more
wasteful in terms of memory use. Neither looks particularly hard to
me.

--
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}
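
The "take notes" option above could look roughly like the sketch
below. It is a toy model, not GCC code: `Stmt`, `build_immediate_uses`
and the other names are hypothetical stand-ins for the immediate-use
lists that the tree SSA infrastructure maintains. It assumes that a
pass counts only non-debug uses when deciding what it may rewrite, and
that it resets to "value unknown" any debug statement using a name
whose value it changes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stmt:
    lhs: Optional[str]                # SSA name defined; None for DEBUG statements
    operands: List[str]               # SSA names used
    debug_var: Optional[str] = None   # user variable; set only for DEBUG statements

    @property
    def is_debug(self):
        return self.debug_var is not None


def build_immediate_uses(stmts):
    """Index every statement under each SSA name it uses (built once)."""
    uses = defaultdict(list)
    for stmt in stmts:
        for name in stmt.operands:
            uses[name].append(stmt)
    return uses


def has_single_real_use(name, uses):
    """Debug statements do not count against the single-use requirement."""
    return sum(1 for s in uses[name] if not s.is_debug) == 1


def reset_debug_uses(changed_names, uses):
    """Turn each DEBUG statement that uses a changed name into 'value unknown'."""
    reset = []
    for name in changed_names:
        for stmt in uses[name]:
            if stmt.is_debug and stmt.operands:
                stmt.operands = []    # empty operand list models "unknown"
                reset.append(stmt.debug_var)
    return reset


stmts = [
    Stmt("c_3", ["a_1", "b_2"]),
    Stmt("z_5", ["c_3", "d_9"]),
    Stmt("x_4", ["z_5", "e_10"]),
    Stmt(None, ["x_4"], debug_var="x"),   # DEBUG(x, x_4)
    Stmt("y_7", ["x_4", "f_11"]),
    Stmt("z_8", ["y_7", "g_12"]),
]
uses = build_immediate_uses(stmts)

# The chain is still eligible: the DEBUG use of x_4 is ignored when counting.
eligible = all(has_single_real_use(n, uses) for n in ["c_3", "z_5", "x_4", "y_7"])

# Swapping d_9 and g_12 changes the values of z_5, x_4 and y_7.
reset_vars = reset_debug_uses(["z_5", "x_4", "y_7"], uses)
```

Only the use lists of the names the pass already handles are visited,
so no scan of all instructions is needed beyond building the index.
Resetting to "unknown" is the conservative choice; rewriting the debug
operand to an expression that still yields the old value would
preserve more information.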
* Re: Designs for better debug info in GCC
  2007-12-19  8:39 ` Daniel Berlin
  2007-12-19 16:12 ` Daniel Berlin
@ 2007-12-19 20:27 ` Alexandre Oliva
From: Alexandre Oliva @ 2007-12-19 20:27 UTC
To: Daniel Berlin
Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor,
    Richard Guenther, gcc-patches, gcc

On Dec 19, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:

> On 12/19/07, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Dec 18, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:
>>
>> > Consider PRE alone,
>>
>> > If your debug statement strategy is "move debug statements when we
>> > insert code that is equivalent"
>>
>> Move? Debug statements don't move, in general. I'm not sure what you
>> have in mind, but I sense some disconnect here.

> Okay, so if you aren't going to move them, you have to erase them when
> you move statements around.

Why? They still represent the point of binding between user variable
and value.

> How were you going to generate the initial set of debug annotations?

It's in the document: after each assignment to a user variable, and at
PHI nodes for user variables. The debug statement means the variable
holds that value from that point on until conflicting information
arises (i.e., another debug statement for the same variable, or a
control flow merge with different values for the same variable).

> How were you going to update it if you saw a statement was updated to
> say x_5 = x_4 instead of x_5 = x_3 + x_2.

No update needed, if x_5 is the value of interest. I'm not sure
that's what you're asking, though.

> So then how will using your debug annotations and updating them come
> out any different than say performing a value numbering pass where you
> also associate user variables with the ssa names (IE alongside our
> value numbers), and propagate them around as well?

First, debug annotations may be at different points than the
corresponding SSA definitions, because the same SSA definition may be
bound to different variables at different ranges.

Second, debug annotations may contain more complex expressions than a
single SSA name, and there may not be any SSA name that represents the
value of these expressions left. For example, given:

  x_3 = a_1 + b_2;
  # DEBUG x => x_3
  foo();

if we find that x_3 is unused elsewhere, we can drop it without
discarding debug information about the value of x at that point:

  # DEBUG x => a_1 + b_2
  foo();

such that, if we stop at the call and print x, we get the expected
value, even though the actual variable was optimized away.

> At the end, you could emit DEBUG(user var, ssa name) right after each
> SSA_NAME_DEF_STMT for all user vars in the user var set for ssa name.

This doesn't work. Consider:

  a_2 = whatever1;
  b_4 = whatever2;
  x_1 = a_2;
  probe();
  if (condition) {
    probe();
    x_3 = b_4;
    probe();
  }
  x_5 = PHI <x_1(!condition), x_3(condition)>;
  probe();

Now, if you optimize it and apply the debug stmt generation technique
you suggested, this is what you get:

  T_2 = whatever1;
  # DEBUG a => T_2
  # DEBUG x => T_2
  T_4 = whatever2;
  # DEBUG b => T_4
  # DEBUG x => T_4
  probe();
  if (condition) {
    probe();
    probe();
  }
  T_5 = PHI <T_2(!condition), T_4(condition)>
  # DEBUG x => T_5
  probe();

What do you get if you print x at each of the probe points?

> I don't see why you believe user variables/bindings are special and
> can't be propagated in this manner,

It's not that I don't believe it, it's just that being able to
propagate them is not enough. We must also take the binding point
into account.

Now, as I wrote to Ian last night, if we just add a binding point
annotation to this mix, then we have sufficient information:

  T_2 = whatever1;
  # DEBUG a => T_2 here
  # DEBUG x => T_2 at P1
  T_4 = whatever2;
  # DEBUG b => T_4 here
  # DEBUG x => T_4 at P2
  probe();
  # DEBUG point P1
  if (condition) {
    probe();
    # DEBUG point P2
    probe();
  }
  T_5 = PHI <T_2(!condition), T_4(condition)>
  # DEBUG x => T_5
  probe();

I still don't see how, in this notation, we'd represent something like
"at this point, the value of this user variable is unknown". Any
ideas?

Also, this strategy works for the nice and well-behaved Tree SSA
optimization passes. For RTL, which is far less abstract, especially
after register allocation, I don't see that we can rely on such a
simple strategy. But, in a way, I hope I'm wrong ;-)

>> > #3 is a dataflow problem, and not something you want to do every time
>> > you insert a call.

>> I'm not sure what you mean by "inserting calls". We don't do that.

> Sure we do.
> We will definitely insert new calls when we PRE const/pure calls, or
> calls we determine to be movable to the point we want to move them

I think of that as moving, rather than inserting. That said, I still
don't quite see what you're getting at. Calls don't mess with gimple
registers of their callers, ever, so it appears to me that inserting a
call in the tree level is a NOP in terms of debug information
annotations.

> I'm not sure why you believe all the calls that we end up with in the
> IR are actually in the source (or even implied by it).

Conceptually, they are, kind-a sort of :-) Except perhaps for
profiling calls, which are meant to be fully transparent anyway.
Others are more akin to inlining, or using a call for convenience
rather than expanding a copy or something to that effect.

>> But I'm not computing that in trees. I'm just collecting and
>> maintaining data points for var-tracking, all the way from the tree
>> level.
> Okay, then for trees, why bother tracking it when you can compute it > right before translation with the same accuracy you can if you update > it every time you make statement changes? Just because we still haven't found a reliable way to do so that doesn't drop essential information for correct debug info. If we do, I'll be delighted to immediately drop the proposed debug annotations in the tree level. And in the RTL level as well. >> And debug information is not just about the values, it's about >> mapping variables to values and locations. > You have no locations at the tree level, ?!? Locations as in point of execution, rather than DWARF locations, is what I mean. > and i've explicitly said what > i said applies to the tree level :) Indeed ;-) >> So, we can't infer all the >> information we need. > Again, i believe we can at the tree level. Good, let's keep on it. How about you use something like the example above to explain how to accomplish it? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
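The probe() example in the message above can be replayed mechanically. What follows is only an illustrative sketch, written in Python rather than GCC's own C; the tuple encoding of statements and the helper `replay` are invented for this note and are not GCC data structures. It walks the path through the "condition is true" arm and records which SSA name a debugger would report for x at each probe, given only the DEBUG bindings seen so far on that path.

```python
# Toy statement encoding (hypothetical, for illustration only):
#   ("def", ssa)          an SSA definition
#   ("debug", var, ssa)   "# DEBUG var => ssa"
#   ("probe", label)      a point where the user prints x

def replay(path, var="x"):
    """Return {probe label: SSA name reported for var at that probe}."""
    current = None  # latest binding of var seen on this path
    seen = {}
    for stmt in path:
        if stmt[0] == "debug" and stmt[1] == var:
            current = stmt[2]
        elif stmt[0] == "probe":
            seen[stmt[1]] = current
    return seen

# Source semantics on this path: x is a at p0 and p1, b at p2,
# and the PHI result at p3.
expected = {"p0": "T_2", "p1": "T_2", "p2": "T_4", "p3": "T_5"}

# Bindings emitted right after each SSA definition (the suggested scheme).
at_defs = [
    ("def", "T_2"), ("debug", "a", "T_2"), ("debug", "x", "T_2"),
    ("def", "T_4"), ("debug", "b", "T_4"), ("debug", "x", "T_4"),
    ("probe", "p0"),
    ("probe", "p1"),
    ("probe", "p2"),
    ("def", "T_5"), ("debug", "x", "T_5"),
    ("probe", "p3"),
]

# Bindings kept at the points where the source assigned to x.
at_binding_points = [
    ("def", "T_2"), ("debug", "a", "T_2"),
    ("def", "T_4"), ("debug", "b", "T_4"),
    ("debug", "x", "T_2"),
    ("probe", "p0"),
    ("probe", "p1"),
    ("debug", "x", "T_4"),
    ("probe", "p2"),
    ("def", "T_5"), ("debug", "x", "T_5"),
    ("probe", "p3"),
]
```

Under these assumptions, `replay(at_defs)` reports T_4 for x at the first two probes, where the source says x still holds a, while `replay(at_binding_points)` matches `expected` at every probe. The sketch says nothing about how costly it is to keep such binding points up to date through real optimization passes.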
* Re: Designs for better debug info in GCC 2007-12-18 9:10 ` Alexandre Oliva 2007-12-18 13:20 ` Diego Novillo 2007-12-18 22:43 ` Daniel Berlin @ 2007-12-18 23:35 ` Daniel Berlin 2007-12-19 5:50 ` Alexandre Oliva 2 siblings, 1 reply; 150+ messages in thread From: Daniel Berlin @ 2007-12-18 23:35 UTC (permalink / raw) To: Alexandre Oliva Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc > > It is desirable to be able to represent constants and other > optimized-away values, rather than stating variables have values they > can no longer have: > > int > x1 (int x) > { > int i; > > i = 2; > f(i); > i = x; > h(); > i = 7; > g(i); > } > > Even if variable i is completely optimized away, a debugger can still > print the correct values for i if we keep annotations such as: > > (debug (var_location i (const_int 2))) > (set (reg arg0) (const_int 2)) > (call (mem (symbol_ref f))) > (debug (var_location i unknown)) > (call (mem (symbol_ref h))) > (debug (var_location i (const_int 7))) > (set (reg arg0) (const_int 7)) > (call (mem (symbol_ref g))) > > In this case, before the call to h, not only the assignment to i was > dead, but also the value of the incoming argument x had already been > clobbered. If i had been assigned to another constant instead, debug > information could easily represent this. > > Another example that covers PHI nodes and conditionals: > > int > x2 (int x, int y, int z) > { > int c = z; > whatever0(c); > c = x; > whatever1(); > if (some_condition) > { > whatever2(); > c = y; > whatever3(); > } > whatever4(c); > } > > With SSA infrastructure, this program can be optimized to: > > int > x2 (int x, int y, int z) > { > int c; > # bb 1 > whatever0(z_0(D)); > whatever1(); > if (some_condition) > { > # bb 2 > whatever2(); > whatever3(); > } > # bb 3 > # c_1 = PHI <x_2(D)(1), y_3(D)(2)>; > whatever4(c_1); > } > > Note how, without debug annotations, c is only initialized just before > the call to whatever4. 
At all other points, the value of c would be > unavailable to the debugger, possibly even wrong. > > If we were to annotate the SSA definitions forward-propagated into c > versions as applying to c, we'd end up with all of x_2, y_3 and z_0 If you forward propagate any annotations, ever, > applied to c throughout the entire function, in the absence of > additional markers. > > Now, with the annotations proposed in this paper, what is initially: > > int > x2 (int x, int y, int z) > { > int c; > # bb 1 > c_4 = z_0(D); > # DEBUG c z_0(D) > whatever0(z_0(D)); > # DEBUG c x_2(D) > whatever1(); > and then, at every one of the inspection points, we get the correct > value for variable c. Because you have added information you have no way of knowing. How exactly did you compute that the call *definitely sets c to the value of z_0*, and definitely sets the value of c to x_2. This must be "may-information", because we don't know what the call does. Ignoring this (the solution is to not assume anything at calls, because you run the risk of getting the wrong answer at meet points later on!) your scheme is sufficient to get correct values, but not correct locations. However, value equivalence does not imply location equivalence, and all of our debug formats deal with locations of variables, except for constants. IE If you translate this directly into DWARF3, as written, you will claim that c and x_4 has the same location (since dwarf does not let you say "it has the same value as x, but not the same location"), and thus incorrectly represent that p *x_4=5 modifies c if i were to do it in the debugger. Because of the may-problem, you will also claim the same value/location for c and x_2, which you can't prove is right, because you don't know what whatever1/2 actually does. if all you want is the values you compute above, on SSA, you can easily use a lattice to compute the same values you are going to compute as you update the annotations on the fly. 
(This is because it is a flow sensitive problem, and you want the flow answers at each unique definition point, which SSA neatly provides, except for calls, where you could hang it off the vops). Tracking which values *definitely represent user values* is actually quite easy at the tree level, and doesn't require any IR modification. It may be worth doing at the RTL level, however, where the solution requires making up program points at each definition site and computing the dataflow problem in terms of them. --Dan ^ permalink raw reply [flat|nested] 150+ messages in thread
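The lattice idea mentioned in the message above is not spelled out in the thread. One possible reading, given here only as a minimal sketch in Python, is a per-variable lattice with three levels: no information yet, exactly one known value, and conflicting values. The names `TOP`, `BOTTOM`, `meet` and `meet_states` are assumptions made for this note, and treating a variable that is missing from one path as TOP is a simplification, not something the thread states.

```python
TOP = "TOP"        # no information yet (for example, an unvisited path)
BOTTOM = "BOTTOM"  # conflicting or unknown value

def meet(a, b):
    """Combine the facts about one variable arriving from two paths."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOTTOM

def meet_states(s1, s2):
    """Combine two maps from user variable to SSA name at a join point."""
    names = sorted(set(s1) | set(s2))
    return {v: meet(s1.get(v, TOP), s2.get(v, TOP)) for v in names}

# The x2 example from the quoted text: c is x_2 when bb 1 falls through,
# and y_3 at the end of bb 2. A second variable, z, agrees on both paths.
end_of_bb1 = {"c": "x_2", "z": "z_0"}
end_of_bb2 = {"c": "y_3", "z": "z_0"}
entry_of_bb3 = meet_states(end_of_bb1, end_of_bb2)
```

Here `entry_of_bb3` keeps z bound to z_0 and drops c to BOTTOM, which is the point where the PHI node c_1 has to supply a name. Whether such an analysis can also recover the program points at which each binding starts and ends is the question the rest of the thread argues about.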
* Re: Designs for better debug info in GCC 2007-12-18 23:35 ` Daniel Berlin @ 2007-12-19 5:50 ` Alexandre Oliva 2007-12-19 16:35 ` Daniel Berlin 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-19 5:50 UTC (permalink / raw) To: Daniel Berlin Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 18, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote: >> int c = z; >> whatever0(c); >> c = x; > Because you have added information you have no way of knowing. > How exactly did you compute that the call *definitely sets c to the > value of z_0*, and definitely sets the value of c to x_2. Err... I guess you're thinking memory, global variables, alias analysis and that sort of stuff. None of this applies to gimple registers, which is all the annotations are about. Yes, aliasing, memory references and must- and may-alias do play a role at the time of turning the annotations into equivalence classes, when memory locations that are not stack slots allocated to gimple regs that couldn't get hardware registers show up in the equivalence classes. These don't seem too hard to handle conservatively (removing even may-alias assignment destinations from equivalence classes, as well as non-local memory references at function calls and volatile asms), at the expense of incompleteness in debug information, or in a more lax way, at the potential expense of correctness. I still don't know exactly where to draw the line here, this note-propagation algorithm is one that I haven't completely figured out yet. > However, value equivalene does not imply location equivalence, and all > of our debug formats deal with locations of variables, except for > constants. Dwarf enables arbitrary value expressions too. There's some discussion about lvalue vs rvalue in the document, and this is also something that will take some experimenting. 
I'm not entirely sure where to draw the line, and I'm not entirely sure there is a perfect answer. For example, consider that a variable's home is a stack slot, but for a loop in which it's not modified, it's held in a register. Clearly in this case the correct representation is for the variable to be in both locations, both as lvalues. But if the variable is further copied to other variables or locations, these additional locations probably shouldn't be regarded as the same variable any more; at most, as rvalues, but maybe not even that. And then, if for some particular instruction, the variable in the register needs to be copied to a different register class, then it is correct to state that, between the copy and the use, the variable is held in all three locations. I'm still trying to figure out how to deal with overlaps between variables, deciding whether locations are to be handled as lvalues or rvalues, this sort of stuff. It is indeed a difficult problem. > IE If you translate this directly into DWARF3, as written, you will > claim that c and x_4 has the same location (since dwarf does not let > you say "it has the same value as x, but not the same location), Yeah. The $1M question is, when two variables are coalesced into one, does this mean we now have two variables sharing the same location, or do we just use the rvalue of one (which?) for the other? Isn't this like talking about body and spirit of variables? After optimization, I'm not even sure that talking about location (body) of variables makes much sense. An important part of the design process was to distinguish between source-level variables and implementation-level variables. Our naming of stack slots or pseudos as variables is just a mnemonic artifact for us compiler engineers, to simplify debugging. Which variables they actually represent depends a lot on optimization decisions, perhaps even more than on the original code. 
So I talk about binding a source-level variable to a value, rather than to a location. Then, we figure out the locations that hold the value, what other variables do, how they overlap, maybe how they're used, and then figure out which locations should be assigned to each source variable. Tricky. The only certainty I have right now is that the annotations I've proposed enable us to keep track of values. Distributing locations in equivalence classes to different user variables is an open problem, and there are various possible solutions that could make sense, and that would be arguably correct. > if all you want is the values you compute above, on SSA, you can > easily use a lattice to compute the same values you are going to > compute as you update the annotations on the fly. This sounds interesting, but I don't quite follow what you mean. Can you elaborate, maybe give some examples? > Tracking which values *definitely represent user values* is actually > quite easy at the tree level, and doesn't require any IR modification. But is the binding of user variables to user values for specified ranges part of this representation too? I don't see that it is, and this is the gap I'm trying to fill with the debug annotations. > It may be worth doing at the RTL level, however, where the solution > requires making up program points at each definition site and > computing the dataflow problem in terms of them. /me mumbles something about RTL-SSA, that Jeff Law started working on before we took this turn into Tree-SSA. I'm sort of having to introduce some limited form of SSA in RTL to infer global equivalence classes out of the annotations, in the RTL var-tracking pass. Fun... If only we had sticked to a single IR... 
(No personal preference, I like both, but I'd rather not have to duplicate work so as to deal with both) -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-19 5:50 ` Alexandre Oliva @ 2007-12-19 16:35 ` Daniel Berlin 2007-12-19 19:46 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Daniel Berlin @ 2007-12-19 16:35 UTC (permalink / raw) To: Alexandre Oliva Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On 12/18/07, Alexandre Oliva <aoliva@redhat.com> wrote: > On Dec 18, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote: > > >> int c = z; > >> whatever0(c); > >> c = x; > > > Because you have added information you have no way of knowing. > > How exactly did you compute that the call *definitely sets c to the > > value of z_0*, and definitely sets the value of c to x_2. > > Err... I guess you're thinking memory, global variables, alias > analysis and that sort of stuff. > Yes, i mixed your examples up, i apologize. > None of this applies to gimple registers, which is all the annotations > are about. > > > > However, value equivalene does not imply location equivalence, and all > > of our debug formats deal with locations of variables, except for > > constants. > > Dwarf enables arbitrary value expressions too. Well, uh, no. The only way to directly specify the value of a variable is for constants. DW_AT_const_value does not allow location descriptions. "An entry describing a variable or formal parameter whose value is constant and not represented by an object in the address space of the program, or an entry describing a named constant, does not have a location attribute. Such entries have a DW_AT_const_value attribute, whose value may be a string or any of the constant data or data block forms, as appropriate for the representation of the variable's value. The value of this attribute is the actual constant value of the variable, represented as it would be on the target architecture. 
" There are no other provisions in DWARF for describing the value of a variable, it is expected you describe their locations using DW_AT_location (which gives you the full power of location descriptions, but requires you to return a location, not a value) > There's some > discussion about lvalue vs rvalue in the document, and this is also > something that will take some experimenting. I'm not entirely sure > where to draw the line, and I'm not entirely sure there is a perfect > answer. I'm still curious where you think it describes value expressions for variables other than constants (which again, can't use the location description language) Again, i'd support such an extension, but it does not currently exist. Rest answers in other message. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-19 16:35 ` Daniel Berlin @ 2007-12-19 19:46 ` Alexandre Oliva 2007-12-19 20:39 ` Daniel Jacobowitz 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-12-19 19:46 UTC (permalink / raw) To: Daniel Berlin Cc: Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Dec 19, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote: > On 12/18/07, Alexandre Oliva <aoliva@redhat.com> wrote: >> Dwarf enables arbitrary value expressions too. > Well, uh, no. > The only way to directly specify the value of a variable is for > constants. DW_AT_const_value does not allow location descriptions. DW_AT_const_value is irrelevant for location lists. It's DW_OP_* that I'm talking about. That said... I can't find any more the equivalent of DW_CFA_val_expression in DW_OP_*s that could be used in location expressions. I just *knew* it was there, but I guess I just imagined it. This is embarrassing. At this point, there are three options available: - go back to the drawing board - discard altogether expressions that don't represent lvalues (maybe don't even keep track of them) - introduce a DWARF extension that enables value expressions to be used in location lists (say DW_OP_value, DW_OP_temp_location, or even DW_OP_self_location (*)) (*) maps value to a virtual location that, if dereferenced, evaluates to the value. Could be "easily" implemented through a virtual out-of-range base address, plus the offset that represents the value on dereference, but there are many other ways to implement this in debug information consumers. > I'm still curious where you think it describes value expressions for > variables other than constants Me too :-) :-( Thanks for drawing my attention to this incorrect assumption I made about DWARF location lists. > i'd support such an extension Cool. Do you happen to know the procedure to propose DWARF standard extensions? 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
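The difference between a location expression and the value-expression extension discussed above can be illustrated with a toy stack machine. This is a sketch only: the operation names ("lit", "reg", "plus", "value") are invented for this note and are not DWARF opcodes, and `evaluate` is a hypothetical helper, not part of any debugger.

```python
def evaluate(expr, regs, memory):
    """Evaluate a toy postfix expression and return the variable's value.

    Without a ("value",) marker the result is an address that has to be
    dereferenced; with the marker the computed result is the value itself.
    """
    stack = []
    is_value = False
    for op in expr:
        kind = op[0]
        if kind == "lit":        # push a literal
            stack.append(op[1])
        elif kind == "reg":      # push the contents of a register
            stack.append(regs[op[1]])
        elif kind == "plus":     # add the two topmost entries
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        elif kind == "value":    # hypothetical "this is a value" marker
            is_value = True
        else:
            raise ValueError("unknown toy operation: %r" % (kind,))
    top = stack.pop()
    return top if is_value else memory[top]

regs = {"fp": 1000, "r1": 3, "r2": 4}
memory = {1008: 42}

# A variable that lives in a stack slot at fp + 8: a location.
in_stack_slot = [("reg", "fp"), ("lit", 8), ("plus",)]

# x = a + b was optimized away, but a is still in r1 and b in r2: a value.
optimized_away = [("reg", "r1"), ("reg", "r2"), ("plus",), ("value",)]
```

With these inputs the first expression reads the stack slot, while the second yields r1 + r2 without there being any memory address for x. That also shows why such a variable cannot be treated as an lvalue by the debugger.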
* Re: Designs for better debug info in GCC 2007-12-19 19:46 ` Alexandre Oliva @ 2007-12-19 20:39 ` Daniel Jacobowitz 0 siblings, 0 replies; 150+ messages in thread From: Daniel Jacobowitz @ 2007-12-19 20:39 UTC (permalink / raw) To: Alexandre Oliva Cc: Daniel Berlin, Diego Novillo, Mark Mitchell, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Wed, Dec 19, 2007 at 05:02:52PM -0200, Alexandre Oliva wrote: > That said... I can't find any more the equivalent of > DW_CFA_val_expression in DW_OP_*s that could be used in location > expressions. I just *knew* it was there, but I guess I just imagined > it. This is embarrassing. I am pretty sure such an extension has already been proposed. Might want to check with the committee (see dwarf.org). -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-17 20:34 ` Alexandre Oliva 2007-12-17 20:45 ` Diego Novillo @ 2007-12-31 15:40 ` Richard Guenther 1 sibling, 0 replies; 150+ messages in thread From: Richard Guenther @ 2007-12-31 15:40 UTC (permalink / raw) To: Alexandre Oliva Cc: Diego Novillo, Daniel Berlin, Mark Mitchell, Robert Dewar, Ian Lance Taylor, gcc-patches, gcc, Michael Matz On Dec 17, 2007 9:28 PM, Alexandre Oliva <aoliva@redhat.com> wrote: > On Dec 17, 2007, Diego Novillo <dnovillo@google.com> wrote: > > > On 12/17/07 12:51, Alexandre Oliva wrote: > >> I guess I'm to blame, for having naïvely put the code out without as > >> much as a design and goals document > > > Yes, you are. > > Wow, thanks. At least we agree on something! ;-) > > > You need to provide such a document now. > > Can't I instead provide it when it's ready? > > You know, it wasn't me who asked to have the thing developed in the > open. I didn't push it out just so that people who didn't want to > understand it could beat on it before it was ready to defend itself. > I put it out because there was an offer for contribution. Yeah - that was me... Fact is we had a discussion about debug information earlier this year from which I took the conclusion that most people would appreciate an on-the-side representation to address the most limiting design issue of GCCs tree representation (only one variable per SSA_NAME to track). So I had the impression you worked in that direction and offered help. Now, you seemed to have come to the conclusion that this approach would not help your goal and started on a different route. Now the "mistake" maybe was to before starting this not to revive the former discussion based on your findings and elaborate on your goals. 
(I realize this is the way development for GCC works most of the time, but this is not what I consider good practice for open source development) Now - I think your goal is valid, and the choice of implementation might even be the best one for it. But we (the GCC community) have not yet decided if the combination of "your goal" and "this best implementation" is what we want. (I haven't decided myself either ;)) So my suggestion for you is to continue with your implementation and produce a white paper about your design (which you ideally would present during the next GCC summit, where we should do a discussion on this topic in some form). We (myself and Matz) will continue to implement what is "our goal" (because we internally committed to it, and to see limitations or problems with the approach) and possibly also will present about its outcome at the summit. Thanks, Richard. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-12-15 21:41 ` Alexandre Oliva 2007-12-16 3:15 ` Daniel Berlin @ 2007-12-16 21:42 ` Mark Mitchell 1 sibling, 0 replies; 150+ messages in thread From: Mark Mitchell @ 2007-12-16 21:42 UTC (permalink / raw) To: Alexandre Oliva Cc: Diego Novillo, Robert Dewar, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc Alexandre Oliva wrote: >> Yes, please. I would very much like to see an abstract design >> document on what you are trying to accomplish. > > Other than the ones I've already posted, here's one: > > http://dwarfstd.org/Dwarf3Std.php > > Seriously. There is a standard for this stuff. That's the specification for the encoding format. I agree with you that emitting incorrect debugging information, in the sense of declaring that the location of a variable is in one place, even though its value is not available in that place, is bad. In -O0 code, I consider it a serious bug. In -O2 code, I think it's still a bug, but with our current infrastructure, we may have little choice: we either deny all knowledge of the variable's location, or give one that's sometimes incorrect. Which alternative is better depends on what you're trying to do with the information; for interactive debugging, mostly-right is probably better than nothing, whereas for some programmatic activities, the opposite may be true. If your goal is to avoid the information ever being wrong -- without worrying about whether it is complete -- there is of course a trivial solution: do not emit the information. That is not a serious suggestion, but it does provide a path to a serious suggestion, which I gave earlier: conservatively emit location information you provide based on what you can prove at the time you generate debugging information. For example, if the value of "x" is in a register, and you cross a call which might clobber that register value, then emit debugging information that says that at that point the value is unavailable. 
You could probably do this kind of thing with relatively few changes to the GCC internal representation; you would run a pass before debug-information generation that attempted to prove dataflow properties about variables and told you where values could reliably be found. Your earlier messages, however, suggest that you are trying to do something harder: emit information that is essentially both complete (in the sense of providing as much information as possible about the locations and values of variables) and correct (in the sense of never giving incorrect information). If you want to do that, you're going to have to answer the harder questions, like "what line number corresponds to this address?" and "what should the debugging information say that the value of a variable is when it has been optimized away?" If that's still your goal, then pointing at the DWARF3 specification doesn't help. Diego and I are asking you to confront these fundamental questions about what information you want to provide and what the correctness criteria are. -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 ^ permalink raw reply [flat|nested] 150+ messages in thread
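The conservative approach described above (report a location only where it can be proven, and report "unavailable" otherwise) can be sketched as a single forward walk. This is a minimal illustration in Python; the instruction encoding, the `CALL_CLOBBERED` register set and the function `conservative_locations` are all assumptions made for this note, not GCC internals, and the walk handles straight-line code only.

```python
# Registers assumed to be clobbered by any call (made up for the sketch).
CALL_CLOBBERED = {"r0", "r1", "r2", "r3"}

def conservative_locations(insns, var):
    """For each instruction, return where var can provably be found
    after it executes, or None when its value is unavailable."""
    where = None
    result = []
    for insn in insns:
        kind = insn[0]
        if kind == "bind" and insn[1] == var:
            where = insn[2]                 # var is now held in this register
        elif kind == "call" and where in CALL_CLOBBERED:
            where = None                    # the call may have clobbered it
        elif kind == "set" and insn[1] == where:
            where = None                    # the register was overwritten
        result.append(where)
    return result

insns = [
    ("bind", "x", "r1"),   # x is in a call-clobbered register
    ("call", "foo"),       # x becomes unavailable
    ("bind", "x", "r5"),   # x is recomputed into a call-saved register
    ("call", "bar"),       # x survives the call
    ("set", "r5"),         # r5 is reused for something else
]
locations = conservative_locations(insns, "x")
```

The result never claims a location that might be stale, at the price of gaps: after the call to foo and after r5 is overwritten, x is reported as unavailable even if its value could still be recovered some other way.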
* Re: Designs for better debug info in GCC 2007-11-08 21:26 ` Ian Lance Taylor 2007-11-09 9:53 ` Robert Dewar @ 2007-11-09 9:55 ` Seongbae Park (박성배, 朴成培) 2007-11-09 11:08 ` Robert Dewar 1 sibling, 1 reply; 150+ messages in thread From: Seongbae Park (박성배, 朴成培) @ 2007-11-09 9:55 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Alexandre Oliva, Richard Guenther, gcc-patches, gcc I think both sides are talking over each other, partially because two different goals are in mind. IMHO, there are two extremes when it comes to the so called debugging optimized code. One camp wants the full debuggability (let's call them debuggability crowd) - which means they want to know the value of any valid program state anywhere, and want to set a breakpoint anywhere and be able to even change the program state anywhere as if there was an assignment at the point the debugger stopped the program at. This camp still wants better performance (like everyone else) but they don't want to sacrifice the debuggability for performance, because they rely on these. The other camp is the performance crowd, where they want the absolute best performance but they still want as much debug information as possible. Most people fall in this camp and this is what gcc has implemented. This camp doesn't want to change the code so that they can get better debugging information. Of course, the real world is somewhere in between, but in practice, most people fall in the latter group (aka performance crowd). Alexandre's proposal would make it possible to make the debuggability crowd happy at some unknown compile-time/runtime cost and maintenance cost. Richard's proposal (from what I can understand) would make performance crowd happy, since it would be less costly to implement than Alexandre's and would provide incrementally better debugging information than current, but it doesn't seem that it would make the debuggability crowd happy (or at least the extremists among debuggability crowd). 
So I think the difference in the opinion isn't so much as Alexandre's proposal is good or bad, but rather whether we aim to make the debuggability crowd happy or the performance crowd happy or both. Ideally we should serve both groups of users, but there's non-trivial ongoing maintenance cost for having two different approaches. So I'd like to ask both Alexandre and Richard whether they each can satisfy the other camp, that is, Alexandre to come up with a way to tweak his proposal so that it is possible to keep the compile time cost comparable to what is right now with similar or better debug information, and with reasonable maintenance cost, and Richard whether his proposal can satisfy the debuggability crowd. Of course, another possible opinion would be to ignore the debuggability crowd on the ground that they are not important or big. I personally think it's a mistake to do so, but you may disagree on that point. Seongbae On 08 Nov 2007 12:50:17 -0800, Ian Lance Taylor <iant@google.com> wrote: > Alexandre Oliva <aoliva@redhat.com> writes: > > > So... The compiler is outputting code that tells other tools where to > > look for certain variables at run time, but it's putting incorrect > > information there. How can you possibly argue that this is not a code > > correctness issue? > > I don't see any point to going around this point again, so I'll just > note that I disagree. > > > > >> >> > We've fixed many many bugs and misoptimizations over the years due to > > >> >> > NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake > > >> >> > we've made in the past. > > >> >> > > >> >> That's a valid concern. However, per this reasoning, we might as well > > >> >> push every operand in our IL to separate representations, because > > >> >> there have been so many bugs and misoptimizations over the years, > > >> >> especially when the representation didn't make transformations > > >> >> trivially correct. > > >> > > >> > Please don't use strawman arguments. 
> > >> > > >> It's not, really. A reference to an object within a debug stmt or > > >> insn is very much like any other operand, in that most optimizer > > >> passes must keep them up to date. If you argue for pushing them > > >> outside the IL, why would any other operands be different? > > > > > I think you misread me. I didn't argue for pushing debugging > > > information outside the IL. I argued against a specific > > > implementation--DEBUG_INSN--based on our experience with similar > > > implementations. > > > > Do you remember any other notes that contained actual rtx expressions > > and expected optimization passes to keep them accurate? > > No. > > > Do you think > > we'd gain anything by moving them to a separate, out-of-line > > representation? > > I don't know. I don't see such a proposal on the table, and I don't > have one myself, so I don't know how to evaluate it. > > Ian > -- #pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com" ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-09 9:55 ` Seongbae Park (박성배, 朴成培) @ 2007-11-09 11:08 ` Robert Dewar 0 siblings, 0 replies; 150+ messages in thread From: Robert Dewar @ 2007-11-09 11:08 UTC (permalink / raw) To: "Seongbae Park (박성배, 朴成培)" Cc: Ian Lance Taylor, Alexandre Oliva, Richard Guenther, gcc-patches, gcc Seongbae Park (박성배, 朴成培) wrote: > Most people > fall in this camp > and this is what gcc has implemented. This camp doesn't want to change the code > so that they can get better debugging information. This is definitely not the case. At least among our users, very few fall into this camp. But in any case I think we all agree that there should be a mode in which this is the emphasis. > > Of course, the real world is somewhere in between, but in practice, > most people fall in the latter group > (aka performance crowd). You must live in a strange world, after all think about it, lots of people find Java quite fine, even though it throws away a lot of performance. > Of course, another possible opinion would be to ignore the debuggability crowd > on the ground that they are not important or big. Actually I think big serious users with programs in the millions of lines category are much more likely to be in the "debuggability" crowd. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-07 22:57 ` Ian Lance Taylor ` (2 preceding siblings ...) 2007-11-08 5:01 ` Alexandre Oliva @ 2007-11-08 8:58 ` Paolo Bonzini 3 siblings, 0 replies; 150+ messages in thread From: Paolo Bonzini @ 2007-11-08 8:58 UTC (permalink / raw) To: gcc-patches; +Cc: gcc > What standards are you talking about? I'm not aware of any standard > for debuggability of optimized code. As a developer of gcc, it would be *invaluable* in debugging for example bootstrap comparison failures. There I have to debug side-by-side the stage1 and the stage2 compiler, and no way I can compile the latter unoptimized... As a user more than a developer of gcc these days, definitely yes. I often have programs that run for say 1 minute, and I *know* the bug comes up after 50 seconds. It's already unnerving enough to debug programs like this (I often start ten gdbs at the same time, launch them to the magic point while I'm taking a coffee, and go back working!); and it's only worse if you're doing it on -O0 binaries that take 5 minutes to reach the point you're trying to debug. Backward debugging would also be a possibility for me, much more productive than debuggability of optimized code, but since backward debugging is pie-in-the-sky, debuggability of optimized code is also good. Paolo ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC (was: Re: [vta] don't let debug insns get in the way of simple vect reduction) 2007-11-07 7:52 ` Designs for better debug info in GCC (was: Re: [vta] don't let debug insns get in the way of simple vect reduction) Alexandre Oliva 2007-11-07 16:16 ` Ian Lance Taylor @ 2007-11-07 17:20 ` Michael Matz 2007-11-07 18:45 ` Designs for better debug info in GCC Alexandre Oliva 1 sibling, 1 reply; 150+ messages in thread From: Michael Matz @ 2007-11-07 17:20 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, gcc Hi, On Wed, 7 Nov 2007, Alexandre Oliva wrote: > > With the different approach I and Matz started (and to which we didn't > > yet spend enough time to get debug information actually output - but I > > hope we'll get there soon), on the tree level the extra information is > > stored in a bitmap per SSA_NAME (where necessary). > > This will fail on a very fundamental level. Consider code such as: > > f(int x, int y) { > int c; > /* other vars */ > > c = x; > do_something_with(c, ...); // doesn't touch x or y > > c = y; > do_something_else_with(c, ...); // doesn't touch x or y > } > > where do_something_*with are actually complex computations, be that > explicit code, be it macros or inlined functions. > > This can (and should) be trivially optimized to: > > f(int x, int y) { > /* other vars */ > > do_something_with(x, ...); // doesn't touch x or y > > do_something_else_with(y, ...); // doesn't touch x or y > } > > But now, if I 'print c' in a debugger in the middle of one of the > do_something_*with expansions, what do I get? > > With the approach I'm implementing, you should get x and y at the > appropriate points, even though variable c doesn't really exist any > more. > > With your approach, what will you get? x and y at the appropriate part. Whatever holds 'x' at a point (SSA name, pseudo or mem) will also mention that it holds 'c'. 
At a later point whichever holds 'y' will also mention that it holds 'c'. > There isn't any assignment to x or y you could hook your notes to. But there are _places_ for x and y. Those places can be, and are, also associated with c. > Even if you were to set up side representations to model the additional > variables that end up mapped to the incoming arguments, you'd have 'c' > in both, and at the entry point. How would you tell? I don't understand the question. Ciao, Michael. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-07 17:20 ` Designs for better debug info in GCC (was: Re: [vta] don't let debug insns get in the way of simple vect reduction) Michael Matz @ 2007-11-07 18:45 ` Alexandre Oliva 2007-11-08 10:23 ` Michael Matz 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-07 18:45 UTC (permalink / raw) To: Michael Matz; +Cc: Richard Guenther, gcc-patches, gcc On Nov 7, 2007, Michael Matz <matz@suse.de> wrote: > On Wed, 7 Nov 2007, Alexandre Oliva wrote: >> This will fail on a very fundamental level. Consider code such as: >> >> f(int x, int y) { int c; /* other vars */ >> c = x; do_something_with(c, ...); // doesn't touch x or y >> c = y; do_something_else_with(c, ...); // doesn't touch x or y >> This can (and should) be trivially optimized to: >> >> f(int x, int y) { /* other vars */ >> do_something_with(x, ...); // doesn't touch x or y >> do_something_else_with(y, ...); // doesn't touch x or y >> >> But now, if I 'print c' in a debugger in the middle of one of the >> do_something_*with expansions, what do I get? >> >> With the approach I'm implementing, you should get x and y at the >> appropriate points, even though variable c doesn't really exist any >> more. >> >> With your approach, what will you get? > x and y at the appropriate part. Whatever holds 'x' at a point (SSA name, > pseudo or mem) will also mention that it holds 'c'. At a later point > whichever holds 'y' will also mention in holds 'c' . I.e., there will be two parallel locations throughout the entire function that hold the value of 'c'. Something like: f(int x /* but also c */, int y /* but also c */) { /* other vars */ do_something_with(x, ...); // doesn't touch x or y do_something_else_with(y, ...); // doesn't touch x or y Now, what will you get if you 'print c' in the debugger (or if any other debug info evaluator needs to tell what the value of user variable c is) at a point within do_something_with(c,...) 
or do_something_else_with(c)? Now consider that f is inlined into the following code: int g(point2d p) { /* lots of code */ f(p.x, p.y); /* more code */ f(p.y, p.x); /* even more code */ } g gets fully scalarized, so, before inlining, we have: int g(point2d p) { int p$x = p.x, int p$y = p.y; /* lots of code */ f(p$x, p$y); /* more code */ f(p$y, p$x); /* even more code */ } after inlining of f, we end up with: int g(point2d p) { int p$x = p.x, int p$y = p.y; /* lots of code */ { int f()::x.1 /* but also f()::c.1 */ = p$x, f()::y.1 /* but also f()::c.1 */ = p$y; { /* other vars */ do_something_with(f()::x.1, ...); // doesn't touch x or y do_something_else_with(f()::y.1, ...); // doesn't touch x or y } } /* more code */ { int f()::x.2 /* but also f()::c.2 */ = p$x, f()::y.2 /* but also f()::c.2 */ = p$y; { /* other vars */ do_something_with(f()::x.2, ...); // doesn't touch x or y do_something_else_with(f()::y.2, ...); // doesn't touch x or y } } /* even more code */ } then, we further optimize g and get: int g(point2d p) { int p$x /* but also f()::x.1, f()::c.1, f()::y.2, f()::c.2 */ = p.x; int p$y /* but also f()::y.1, f()::c.1, f()::x.2, f()::c.2 */ = p.y; /* lots of code */ { { /* other vars */ do_something_with(p$x, ...); // doesn't touch x or y do_something_else_with(p$y, ...); // doesn't touch x or y } } /* more code */ { { /* other vars */ do_something_with(p$y, ...); // doesn't touch x or y do_something_else_with(p$x, ...); // doesn't touch x or y } } /* even more code */ } and now, if you try to resolve the variable name 'c' to a location or a value within any of the occurrences of do_something_*with(), what do you get? What ranges do you generate for each of the variables involved? >> There isn't any assignment to x or y you could hook your notes to. > But there are _places_ for x and y. Those places can and are also > associated with c. This just goes to show that there's a fundamental mistake in the mapping. 
Instead of mapping user-level concepts to implementation concepts, which is what debug information is meant to do, you're mapping implementation details to user-level concepts. Unfortunately, this mapping is not biunivocal. The chosen representation is fundamentally lossy. It can't possibly get you accurate debug information. And the above is just an initial example of the loss of information that will lead to *incorrect* debug information, which is far worse than *incomplete* information. >> Even if you were to set up side representations to model the additional >> variables that end up mapped to the incoming arguments, you'd have 'c' >> in both, and at the entry point. How would you tell? > I don't understand the question. See the discussion about resolving 'c' above. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
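A rough way to picture the question "what does 'c' resolve to at this point?" is a table of program-counter ranges, each naming where the value of 'c' can be found. The sketch below is only an illustration in plain C: the PC values, the enum and the function name locate_c are all made up for the example, and it does not reproduce any GCC or DWARF data structure. It models the un-inlined f() from the example above, where 'c' is found in x during do_something_with and in y during do_something_else_with, and nowhere outside those ranges.

```c
#include <stddef.h>

/* Hypothetical places where the value of user variable 'c' may live.  */
enum c_place { C_NOWHERE, C_IN_X, C_IN_Y };

/* One entry of an illustrative location table: for PCs in [lo, hi),
   the value of 'c' is found in 'place'.  */
struct c_range {
  unsigned lo, hi;
  enum c_place place;
};

/* Made-up PC ranges for the two halves of f().  */
static const struct c_range c_ranges[] = {
  { 0x10, 0x40, C_IN_X },   /* expansion of do_something_with */
  { 0x40, 0x80, C_IN_Y },   /* expansion of do_something_else_with */
};

/* Resolve 'c' at a given PC; C_NOWHERE models the empty location set.  */
enum c_place locate_c (unsigned pc)
{
  size_t n = sizeof c_ranges / sizeof c_ranges[0];
  for (size_t i = 0; i < n; i++)
    if (pc >= c_ranges[i].lo && pc < c_ranges[i].hi)
      return c_ranges[i].place;
  return C_NOWHERE;
}
```

With such a table the answer to 'print c' depends on the PC at which it is asked. Which ranges a given representation is able to produce after inlining is exactly what the messages here disagree about, and the sketch takes no side on that.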
* Re: Designs for better debug info in GCC 2007-11-07 18:45 ` Designs for better debug info in GCC Alexandre Oliva @ 2007-11-08 10:23 ` Michael Matz 2007-11-08 14:02 ` Robert Dewar 2007-11-08 16:32 ` Alexandre Oliva 0 siblings, 2 replies; 150+ messages in thread From: Michael Matz @ 2007-11-08 10:23 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, gcc Hi, On Wed, 7 Nov 2007, Alexandre Oliva wrote: > > x and y at the appropriate part. Whatever holds 'x' at a point (SSA > > name, pseudo or mem) will also mention that it holds 'c'. At a later > > point whichever holds 'y' will also mention in holds 'c' . > > I.e., there will be two parallel locations throughout the entire > function that hold the value of 'c'. No. For some PC locations the location of 'c' will happen to be the same as the one holding 'x', and for a different set of PC locations it will be the one also holding 'y'. The request "what's in 'c'" from a debugger only makes sense when done from a certain program counter. Depending on that the location of 'c' will be different. In the case from above both locations might exist in parallel throughout the entire function, but they don't hold 'c' in parallel. > Something like: > > f(int x /* but also c */, int y /* but also c */) { /* other vars */ "int x /* but also c */, int y /* but also c */" implies that x == y already, at which point the compiler will most probably have allocated just one place for x and y (and c) anyway ... > do_something_with(x, ...); // doesn't touch x or y > do_something_else_with(y, ...); // doesn't touch x or y > > Now, what will you get if you 'print c' in the debugger (or if any > other debug info evaluator needs to tell what the value of user > variable c is) at a point within do_something_with(c,...) or > do_something_else_with(c)? ... so the answer would be "whatever is in that common place for x,y and c". 
If the compiler did not allocate one place for x and y the answer still would be "whatever is in the place of 'y'", because that value is live, unlike 'x'. > Now consider that f is inlined into the following code: > > int g(point2d p) { > /* lots of code */ > f(p.x, p.y); > /* more code */ > f(p.y, p.x); > /* even more code */ > } > > g gets fully scalarized, so, before inlining, we have: > > int g(point2d p) { > int p$x = p.x, int p$y = p.y; > /* lots of code */ > f(p$x, p$y); > /* more code */ > f(p$y, p$x); > /* even more code */ > } > > after inlining of f, we end up with: > > int g(point2d p) { > int p$x = p.x, int p$y = p.y; > /* lots of code */ > { int f()::x.1 /* but also f()::c.1 */ = p$x, f()::y.1 /* but also f()::c.1 */ = p$y; Here you punt. How come that f::c is actually set to p$x? I don't see any assignment and in fact no declaration for c in f. If you had one _that_ would be the place where the connection between p$x and 'c' would have been made and everything would fall into place. 
> { /* other vars */ > do_something_with(f()::x.1, ...); // doesn't touch x or y > do_something_else_with(f()::y.1, ...); // doesn't touch x or y > } } > /* more code */ > { int f()::x.2 /* but also f()::c.2 */ = p$x, f()::y.2 /* but also f()::c.2 */ = p$y; > { /* other vars */ > do_something_with(f()::x.2, ...); // doesn't touch x or y > do_something_else_with(f()::y.2, ...); // doesn't touch x or y > } } > /* even more code */ > } > > then, we further optimize g and get: > > int g(point2d p) { > int p$x /* but also f()::x.1, f()::c.1, f()::y.2, f()::c.2 */ = p.x; > int p$y /* but also f()::y.1, f()::c.1, f()::x.2, f()::c.2 */ = p.y; > /* lots of code */ > { { /* other vars */ > do_something_with(p$x, ...); // doesn't touch x or y > do_something_else_with(p$y, ...); // doesn't touch x or y > } } > /* more code */ > { { /* other vars */ > do_something_with(p$y, ...); // doesn't touch x or y > do_something_else_with(p$x, ...); // doesn't touch x or y > } } > /* even more code */ > } > > and now, if you try to resolve the variable name 'c' to a location or > a value within any of the occurrences of do_something_*with(), what do > you get? What ranges do you generate for each of the variables > involved? It's not possible that p$x _and_ p$y are f()::c.1 at the same time, so the above examples are all somehow invalid. Except if p$x and p$y are somehow the same value, and if that's the case it's enough and exactly correct if the range of f()::c.1 covers the whole body of your function 'g' referring to exactly the one location of f()::c.1, f()::c.2, p$x and p$y. > Unfortunately, this mapping is not biunivocal. The chosen > representation is fundamentally lossy. What's fundamentally lossy are transformations done by the compiler. E.g. 
in this simple case: int f(int y) { int x = 2 * y; return x + 2; } If the compiler forward-props 2*y into the single use and simplifies: return (y+1)*2; then the value 2*y is never actually calculated anymore, not in any register, not in any local variable, nowhere. There's no way debug information could generally rectify this loss of information. As DWARF is capable of encoding complete expressions it would be possible in this case to express it, because the inverse of the above function is easily determined. In case of more complicated expressions that's not possible anymore and you lose. So, if the value is never ever computed anymore debug information won't help you. You either have to force the value you're interested in to be live, or live with the imprecision. Forcing some values live is possible, but is independent of generating debug information as exact as possible. It must be independent because forcing values live is going to change the code, something which mere generation of debug information is not allowed to do. So, our mapping is as accurate as yours. If a value is computed in some place which can be traced back to some user-declared variable then this will be expressed. If the value is not available then of course it also can't be reflected in the debug information (only as "optimized out"). It seems in your branch you also force some values live IIUC. That's okay but doesn't have to do with generating precise debug information as shown above. Even for forcing values live there are easier mechanisms. We for instance experimented with volatile asms, which simply refer to the values in question (and unsurprisingly we also were interested in formal arguments of inlined functions): int f (int x) { force_use (x); ... old body ... } You have to switch off any propagation into force_use(x), so that the original value of 'x' and the connection to the DECL of 'x' lives until the end of the compilation pipeline. 
That's a rather simple hack doing exactly what's necessary: it forces GCC to actually have a place for the value of 'x' at the function entry point, which also survives inlining. Ciao, Michael. ^ permalink raw reply [flat|nested] 150+ messages in thread
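For readers who want to see what a volatile-asm force_use might look like, here is a minimal sketch. It assumes GCC's extended asm syntax; the macro name force_use comes from the message above, but its definition here is a guess for illustration and not the code used in the experiments mentioned. It is applied to the small 2*y example rather than to a formal argument.

```c
/* Assumed definition, for illustration only: an empty volatile asm that
   takes x as an input operand.  The compiler has to have the value of x
   in a register or in memory at this point, and cannot delete the asm.  */
#define force_use(x) __asm__ __volatile__ ("" : : "g" (x))

/* Without the asm, the compiler is free to fold this to (y + 1) * 2,
   so 2*y may never be computed anywhere.  */
int f_plain (int y)
{
  int x = 2 * y;
  return x + 2;
}

/* With the asm, the value of x must exist at the force_use point.  */
int f_forced (int y)
{
  int x = 2 * y;
  force_use (x);
  return x + 2;
}
```

Both functions return the same result; the difference is only in the generated code, which is the point made above that forcing a value live changes code while generating debug information must not. Whether a debugger can then actually find x still depends on the debug information emitted for it.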
* Re: Designs for better debug info in GCC 2007-11-08 10:23 ` Michael Matz @ 2007-11-08 14:02 ` Robert Dewar 2007-11-08 15:13 ` H.J. Lu ` (2 more replies) 2007-11-08 16:32 ` Alexandre Oliva 1 sibling, 3 replies; 150+ messages in thread From: Robert Dewar @ 2007-11-08 14:02 UTC (permalink / raw) To: Michael Matz; +Cc: Alexandre Oliva, Richard Guenther, gcc-patches, gcc My general feelings on this subject: 1. I don't think we should care much about the ability to *SET* values of variables in optimized code. You can definitely do without that. So if a variable exists in two places, no problem, just register one of them. 2. It is much more important to have reasonable debugging for most users than the last mile of optimization. For me we should ensure that -O1 is still reasonably debuggable. The switch to GCC 4, at least in the Ada context, has significantly degraded -O1 debugging. I have found for instance that debugging the GNAT compiler itself, -O1 used to be perfectly fine, but now far too many arguments and variables disappear. 3. The quality of code at -O0 is really terrible compared to the competition (at least in the case of Ada), and large scale programs are just too big at -O0 to be practical (there is a big difference between a 50 megabyte image and a 100 megabyte image). So we really cannot rely on using -O0 for debugging. At -O1 we are more than competitive for performance with competing compilers. 4. In any case, most users really prefer to test and debug at the same optimization level that they will use for delivery. As noted above, -O0 is seldom practical for delivery (furthermore the voluminous extra code makes certification at the object level more work). -O1 is a fine compromise from a performance point of view, but needs to be debuggable. 5. Among our users we have relatively few who care about even a factor of 2 in performance, and VERY few who care about 10%. 
On the other hand we have lots of customers who definitely have severe problems with the lack of debuggability of -O1 code. 6. We have talked sometimes about a -Od level or somesuch that would be fully debuggable. That's an interesting idea, but I think in practice it is more reasonable to try to ensure good debugging at -O1. Optimizations that significantly interfere with debugging should be moved to -O2. I think it is fine for -O2 to mean "optimize the heck out of the program, I really care about the last ounce of optimization, and I know debuggability will suffer." ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 14:02 ` Robert Dewar @ 2007-11-08 15:13 ` H.J. Lu 2007-11-08 16:11 ` Michael Matz 2007-11-08 16:37 ` Alexandre Oliva 2 siblings, 0 replies; 150+ messages in thread From: H.J. Lu @ 2007-11-08 15:13 UTC (permalink / raw) To: Robert Dewar Cc: Michael Matz, Alexandre Oliva, Richard Guenther, gcc-patches, gcc On Thu, Nov 08, 2007 at 08:59:18AM -0500, Robert Dewar wrote: > 2. It is much more important to have reasonable debugging > for most users than the last mile of optimization. For me > we should ensure that -O1 is still reasonably debuggable. > The switch to GCC 4, at least in the Ada context, has > significantly degraded -O1 debugging. I have found for > instance that debugging the GNAT compiler itself, -O1 > used to be perfectly fine, but now far too many arguments > and variables disappear. > With gcc 3.4, I can debug binutils at -O1 and -O2 in some cases. But with gcc 4, I have to use -O0 if I want to do any serious debug on binutils. H.J. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 14:02 ` Robert Dewar 2007-11-08 15:13 ` H.J. Lu @ 2007-11-08 16:11 ` Michael Matz 2007-11-08 17:48 ` Alexandre Oliva 2007-11-08 16:37 ` Alexandre Oliva 2 siblings, 1 reply; 150+ messages in thread From: Michael Matz @ 2007-11-08 16:11 UTC (permalink / raw) To: Robert Dewar; +Cc: Alexandre Oliva, Richard Guenther, gcc-patches, gcc Hi, On Thu, 8 Nov 2007, Robert Dewar wrote: > significantly degraded -O1 debugging. I have found for > instance that debugging the GNAT compiler itself, -O1 > used to be perfectly fine, but now far too many arguments > and variables disappear. Yes. That problem is addressed by Alexandre's approach and by ours. If you want to be really sure no arguments disappear (necessary for instance for meaningful use of systemtap) you also need to inhibit some transformations, which can be done under a certain option (which might or might not be on by default for -O1). > 3. The quality of code at -O0 is really terrible compared > to the competition (at least in the case of Ada), and > large scale programs are just too big at -O0 to be > practical (there is a big difference between a 50 > megabyte image and a 100 megabyte image). This is a problem on its own. We're planning to work on this sometime during the next months, i.e. improve code quality at -O0 at least to a point it was in the 3.x line of GCC. Ciao, Michael. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 16:11 ` Michael Matz @ 2007-11-08 17:48 ` Alexandre Oliva 2007-11-09 12:46 ` Michael Matz 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 17:48 UTC (permalink / raw) To: Michael Matz; +Cc: Robert Dewar, Richard Guenther, gcc-patches, gcc On Nov 8, 2007, Michael Matz <matz@suse.de> wrote: > If you want to be really sure no arguments disappear (necessary for > instance for meaningful use of systemtap) you also need to inhibit > some transformations, I'm not aware of any situations in which we must force an argument not to disappear. All of the problems I'm aware of are those in which the argument is there, we're just missing debug information for it. If you have information about needs for preserving arguments that are actually dead, please send it my way. > This is a problem on its own. We're planning to work on this sometime > during the next months, i.e. improve code quality at -O0 at least to a > point it was in the 3.x line of GCC. Aah, I guess the problem here is all the gimple-introduced temps, right? That our current -O0 is more like -O-1? :-) -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 17:48 ` Alexandre Oliva @ 2007-11-09 12:46 ` Michael Matz 2007-11-12 18:31 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Michael Matz @ 2007-11-09 12:46 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Robert Dewar, Richard Guenther, gcc-patches, gcc Hi, On Thu, 8 Nov 2007, Alexandre Oliva wrote: > > If you want to be really sure no arguments disappear (necessary for > > instance for meaningful use of systemtap) you also need to inhibit > > some transformations, > > I'm not aware of any situations in which we must force an argument not > to disappear. All of the problems I'm aware of are those in which the > argument is there, we're just missing debug information for it. If you > have information about needs for preserving arguments that are actually > dead, please send it my way. ------------------------------------ static inline int foo(int i) { return i-1; } int foobar(int j) { return foo(j+2); } int main(int argc, char **argv) { return foobar(argc); } ------------------------------------ And similar examples. Depending on circumstances the formal argument 'i' of "foo" might be optimized away. If you want to use systemtap to show the actual arguments for all calls to foo, even the inlined ones, then you somehow have to make sure that the value of 'i' itself is not optimized away. Again, in this specific case, due to the simplicity of the involved expression, it would theoretically be possible to express this with just DWARF expressions (relating to the formal argument 'j' of foobar). In more complicated situations that's not possible anymore, at which point you have to force the value of 'i' being live, if you want to be sure that systemtap works in all cases. > > during the next months, i.e. improve code quality at -O0 at least to a > > point it was in the 3.x line of GCC. > > Aah, I guess the problem here is all the gimple-introduced temps, > right? 
That our current -O0 is more like -O-1? :-) Indeed :) Perhaps also doing a simple DCE and local regalloc, none of which inhibits debugging. Ciao, Michael. ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-09 12:46 ` Michael Matz @ 2007-11-12 18:31 ` Alexandre Oliva 2007-11-13 13:56 ` Michael Matz 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-12 18:31 UTC (permalink / raw) To: Michael Matz; +Cc: Robert Dewar, Richard Guenther, gcc-patches, gcc On Nov 9, 2007, Michael Matz <matz@suse.de> wrote: > static inline int foo(int i) > { > return i-1; > } > int foobar(int j) > { > return foo(j+2); > } > int main(int argc, char **argv) > { > return foobar(argc); > } > ------------------------------------ > And similar examples. Depending on circumstances the formal argument 'i' > of "foo" might be optimized away. With the design I've proposed, it is possible to compute the value of i, for the end result is live, which ensures that the inputs used to compute i are not completely optimized away. This means at any point in the execution of foo it is possible to compute i based on the inputs (argc or j) or the outputs (the return values of foo, foobar and main), no matter how much inlining takes place. Now, it is perfectly possible that foo is completely optimized away, such that no instruction remains in the scope in which i is live. In this case, it's debatable whether i still remains, but we could still emit debug information for it if we wanted to. > If you want to use systemtap to show the actual arguments for all > calls to foo, even the inlined ones, then you somehow have to make > sure that the value of 'i' itself is not optimized away. As I wrote before, I'm not aware of any systemtap bug report about a situation in which an argument was actually optimized away. I wouldn't go as far as stopping the optimization just so that systemtap can monitor the code. I'm not working on changing optimization to improve debugging, I'm working on fixing debug information such that it matches optimizations that occur. 
> at which point you have to force the value of 'i' being live, if you > want to be sure that systemtap works in all cases. I don't want to be sure of that. At least that was not the problem I was asked to solve. And, indeed, it's not solvable without disabling optimizations. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
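To make the "compute i from the inputs or the outputs" remark concrete for the foo/foobar example quoted above, here is a small sketch. The helper names i_from_input and i_from_output are hypothetical and stand in for what a location expression could evaluate; they are not part of any proposal in this thread, and integer overflow at the extremes is ignored.

```c
/* The example from the thread, minus main.  */
static inline int foo (int i)
{
  return i - 1;
}

int foobar (int j)
{
  return foo (j + 2);
}

/* Hypothetical: recover the inlined argument i from foobar's input j.  */
int i_from_input (int j)
{
  return j + 2;
}

/* Hypothetical: recover i from the value foo (and hence foobar) returns.  */
int i_from_output (int ret)
{
  return ret + 1;
}
```

Either j or the return value is enough here because i-1 and j+2 are invertible. This is the simple case only; it says nothing about expressions that cannot be inverted.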
* Re: Designs for better debug info in GCC 2007-11-12 18:31 ` Alexandre Oliva @ 2007-11-13 13:56 ` Michael Matz 2007-11-24 2:34 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Michael Matz @ 2007-11-13 13:56 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Robert Dewar, Richard Guenther, gcc-patches, gcc Hi, On Mon, 12 Nov 2007, Alexandre Oliva wrote: > With the design I've proposed, it is possible to compute the value of i, No. Only if the function is reversible. There are many which aren't: static inline int foo(int i) { return i % 10; } int foobar(int j) { return foo(j % 20); } int main(int argc, char **argv) { return foobar(argc); } If foo is inlined and foobar simplified (to return j%10), the value for 'i' (j % 20) cannot be recovered anymore. Hence for a 100% solution (and for systemtap you want that) you have no choice but to force the value to be live, e.g. by a volatile asm or the like. > As I wrote before, I'm not aware of any systemtap bug report about a > situation in which an argument was actually optimized away. I think it all started from PR23551. For us it also happened in the kernel in namei.c, where real_lookup is inlined sometimes, and its arguments are missing. That might or might not be reversible functions, so your scheme perhaps would have helped there. But generally it won't solve the problem for good. > I wouldn't go as far as stopping the optimization just so that systemtap > can monitor the code. Like I said, at some point you have to, or accept that some code remains non-introspectable. > > at which point you have to force the value of 'i' being live, if you > > want to be sure that systemtap works in all cases. > > I don't want to be sure of that. At least that was not the problem I > was asked to solve. Then I'm probably still confused what problem you're actually trying to solve. If you don't want to be sure you get precise location information 100% of the time, then what percentage are you required to get? 
And how do you measure this? Or is the task rather "emit better debug info"? But that can be done also in our scheme, so why is there a need for DEBUG_INSN if it can't solve the systemtap problem for good? > And, indeed, it's not solvable without disabling optimizations. Ciao, Michael. ^ permalink raw reply [flat|nested] 150+ messages in thread
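The non-reversible case can be checked with a few lines of C. The sketch below copies foo and foobar from the message above and adds two helpers whose names (foobar_simplified, i_of) are invented here for illustration: one is the simplified body the message describes, the other computes the value the inlined argument i would have had.

```c
/* The example from the message above, minus main.  */
static inline int foo (int i)
{
  return i % 10;
}

int foobar (int j)
{
  return foo (j % 20);
}

/* What foobar may become after inlining and simplification.  */
int foobar_simplified (int j)
{
  return j % 10;
}

/* The value the inlined argument i would have held for a given j.  */
int i_of (int j)
{
  return j % 20;
}
```

For j = 5 and j = 15 both versions return 5, yet i would have been 5 in one case and 15 in the other, so the result alone does not determine i. Whether i could still be described in terms of j depends on j itself remaining available at that point, which is a separate question.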
* Re: Designs for better debug info in GCC 2007-11-13 13:56 ` Michael Matz @ 2007-11-24 2:34 ` Alexandre Oliva 2007-11-26 20:56 ` Michael Matz 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 2:34 UTC (permalink / raw) To: Michael Matz; +Cc: Robert Dewar, Richard Guenther, gcc-patches, gcc On Nov 13, 2007, Michael Matz <matz@suse.de> wrote: > Hi, > On Mon, 12 Nov 2007, Alexandre Oliva wrote: >> With the design I've proposed, it is possible to compute the value of i, > No. Only if the function is reversible. Of course. I meant it for that particular case. The generalization is obvious, but I didn't mean it would be always possible. >> As I wrote before, I'm not aware of any systemtap bug report about a >> situation in which an argument was actually optimized away. > I think it all started from PR23551. Yep. Nowhere does that bug report request parameters to be forced live. What it does request is that parameters that are not completely optimized away be present in debug information. Now, consider these cases: 1. function is not inlined At its entry point, we bind the argument to the register or stack slot in which the argument is live. Worst case, it's clobbered at the entry point instruction itself, because it's entirely unused. By emitting a live range from the entry point to the death point, we're emitting accurate and complete debug information for the argument. We win. 2. function is inlined, the argument is unused and thus optimized away, but the function does some other useful computation At the inlined entry point, we have a note that binds the argument to its expected value. As we transform the program and optimize away the argument, we retain and update the note, such that we can still represent the value of the inlined argument for as long as it's available. 3. 
function is inlined and completely optimized away No instruction remains in which the argument is in scope, so we might as well refrain from emitting location information for it. Even though we can figure out where the value lives, there's no code to attach this information to. So there's no place to set a breakpoint on to inspect the variable location anyway. > For us it also happened in the kernel in namei.c, where real_lookup > is inlined sometimes, and its arguments are missing. That might or > might not be reversible functions, so your scheme perhaps would have > helped there. But generally it won't solve the problem for good. It looks like you're trying to solve a different problem. I'm not trying to find a way to ensure that arguments are live. I'm trying to get GCC to emit debug information that correctly matches the instructions it generated. If the value of a variable is completely optimized away at a point in the program, the correct representation for its location at that point is an empty set. >> I wouldn't go as far as stopping the optimization just so that systemtap >> can monitor the code. > Like I said, at some point you have to, or accept that some code remains > non-introspectable. Yep. It's easy enough to tweak the code to keep a variable live, if you absolutely need it. But this is not something I'm working to get the compiler to do by itself. Quite the opposite, in fact. I'm going to set the compiler free to perform some optimizations that it currently refrains from performing for the sake of debug information, when the conflict is only apparent because of past implementation decisions that I'm working to fix. > Then I'm probably still confused what problem you're actually trying to > solve. If you don't want to be sure you get precise location information > 100% of the time, then what percentage are you required to get? Accuracy comes first. 
If we ever emit debug information saying 'this variable is here' for a point in the program in which it's in fact elsewhere or unavailable, that's a bug to be fixed. Completeness comes second. If we could have emitted debug information saying 'the value of this variable is here' for a point in the program, and we instead claim the variable is unavailable at that point, that's an improvement that can be made. > And how do you measure this? Good question. The implementation approach I've taken, that exposes debug annotations as actual code, starts out with 100% accuracy (that's the theory, anyway, otherwise generated code would change, and, even though we still don't have a complete framework to ensure code doesn't change, if it does, then at least debug information will model the change accurately), and we can then grow completeness incrementally. > Or is the task rather "emit better debug info"? Nope. That's a secondary goal that will be achieved as we get accurate and sufficiently complete debug information. I don't have completeness goals set, but I have reasons to expect we're going to get much better results than we have now without too much additional effort. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-24 2:34 ` Alexandre Oliva @ 2007-11-26 20:56 ` Michael Matz 2007-11-27 5:30 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Michael Matz @ 2007-11-26 20:56 UTC (permalink / raw) To: Alexandre Oliva; +Cc: Robert Dewar, Richard Guenther, gcc-patches, gcc Hi, On Fri, 23 Nov 2007, Alexandre Oliva wrote: > Yep. Nowhere does that bug report request parameters to be forced live. Not in that bug report perhaps, but we got requests for exactly this, i.e. to be able to introspect all parameters of all functions, be they inlined or not, at all times. I think that's a reasonable request even (which in some situations comes at a cost). > 2. function is inlined, the argument is unused and thus optimized > away, but the function does some other useful computation > > At the inlined entry point, we have a note that binds the argument to > its expected value. As we transform the program and optimize away the > argument, we retain and update the note, As far as possible. If it's not possible you lose (with our requirements). > > For us it also happened in the kernel in namei.c, where real_lookup is > > inlined sometimes, and its arguments are missing. That might or > > might not be reversible functions, so your scheme perhaps would have > > helped there. But generally it won't solve the problem for good. > > It looks like you're trying to solve a different problem. We work on two fronts: 1) increasing the precision of debug information 2) forcing values live Our branch, and our ssa-name<->user-name map (and the SET<->decls association) is concerned with the first topic. The second topic can be implemented (or hacked) already now, but will potentially be more useful when we also have (1). So, as in your branch, we are not trying to limit optimizers to reach the goal, that's the concern of (2), and happens somewhere else. 
> I'm trying to get GCC to emit debug information that correctly matches
> the instructions it generated.
>
> If the value of a variable is completely optimized away at a point in
> the program, the correct representation for its location at that point
> is an empty set.

I think this is academic. If a value is dead, but happens to lie in a
place which isn't yet overwritten with something else, it is harmless to
reveal this value. It's the "last" value the variable had. If OTOH the
place _is_ already overwritten then it's important that we _don't_ say the
dead variable lies therein.

So, for me correctness is defined a bit differently than for you:

1) if location L contains value X, then debug info should say so (as much
as possible, i.e. here the quality of the info comes into play)
2) if location L does not contain value X, debug info should not say that
it does. This is the correctness part.

Where we differ in opinion (I think) is, when location L doesn't contain
value X anymore. For you it's when X becomes dead. For me it's when X is
dead and when location L is overwritten (with something different than X).
I think for users there is no practical difference between our approaches,
but there's a higher cost of implementation for your definition.

> > Then I'm probably still confused what problem you're actually trying to
> > solve. If you don't want to be sure you get precise location information
> > 100% of the time, then what percentage are you required to get?
>
> Accuracy comes first. If we ever emit debug information saying 'this
> variable is here' for a point in the program in which it's in fact
> elsewhere

I agree here ...

> or unavailable, that's a bug to be fixed.

... and disagree here. If a value is dead it's not necessarily
unavailable in my world. I think a world requiring this (and hence the
constraints you were given) is unreasonable.

Ciao,
Michael.
^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-26 20:56 ` Michael Matz @ 2007-11-27 5:30 ` Alexandre Oliva 0 siblings, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-27 5:30 UTC (permalink / raw) To: Michael Matz; +Cc: Robert Dewar, Richard Guenther, gcc-patches, gcc On Nov 26, 2007, Michael Matz <matz@suse.de> wrote: > Hi, > On Fri, 23 Nov 2007, Alexandre Oliva wrote: >> Yep. Nowhere does that bug report request parameters to be forced live. > Not in that bug report perhaps, but we got requests for exactly this, i.e. > to be able to introspect all parameters of all functions, be they inlined > or not, at all time. I think that's a reasonable request even (which in > some situations comes at a cost). Fair enough. And we agree this is not about debug info, it's about limiting optimizations, so this is indeed a different problem from the one I was asked to address. >> 2. function is inlined, the argument is unused and thus optimized >> away, but the function does some other useful computation >> >> At the inlined entry point, we have a note that binds the argument to >> its expected value. As we transform the program and optimize away the >> argument, we retain and update the note, > As far as possible. If it's not possible you loose (with our > requirements). If the argument is completely removed, yes, you won't be able to get to it by merely improving debug information. You actually have to change the generated code. >> If the value of a variable is completely optimized away at a point in >> the porogram, the correct representation for its location at that point >> is an empty set. > I think this is academic. If a value is dead, but happens to lie in a > place which isn't yet overwritten with something else, it is harmless to > reveal this value. It's the "last" value the variable had. If OTOH the > place _is_ already overwritten then it's important that we _don't_ say the > dead variable lies therein. Exactly. Full agreement. 
I wasn't talking about the *location* of the variable, or the variable
itself. I was talking about the value. And I wrote "completely
optimized away", not "dead". Liveness has very little to do with this
issue.

The only catch is that, once a variable should be *expected* to hold a
different value, if debug information still claims the variable still
holds the old value it shouldn't hold any more, just because the value
happens to be around and the assignment of the new value could be
optimized away, then I'd say debug information is incorrect.

> So, for me correctness is defined a bit different than for you:
> 1) if location L contains value X, then debug info should say so (as much
> as possible, i.e. here the quality of the info comes into play)
> 2) if location L does not contain value X, debug info should not say that
> it does. This is the correctness part.

Your definition is exactly what I've been trying to communicate. It
looks like we're in complete agreement as to the goals and the two
different metrics (1 being completeness, 2 being correctness). So
either there's some other underlying difference or you'll soon realize
that the simple SSA name<->variable mapping is insufficient to get you
correctness.

> Where we differ in opinion (I think) is, when location L doesn't contain
> value X anymore. For you it's when X becomes dead. For me it's when X is
> dead and when location L is overwritten (with something different than X).

For me, it's when X is overwritten. That's the point at which the
user is entitled to expect the variable to no longer hold its previous
value (assuming they're different).

Consider this program:

  int foo(int x) {
    int i;

    i = x;
    p1();
    i++;
    p2(i);
    i++;
    p3();
  }

  int main() {
    foo(1);
  }

If you set a breakpoint in p1(), go up one frame and print i, you
should ideally get 1 (although "unavailable" is always correct, even
if undesirable).
If you set a breakpoint in p2(int), you should get 2, but "unavailable" is quite likely in the presence of optimization, depending on the calling conventions. If you set a breakpoint in p3(), you should get 3, but "unavailable" is quite likely, given that the value is not even computed, and it's based on a value that is dead and thus may have been overwritten. Getting any other values at any of these points would be a bug in the compiler. Does this sound sound to you? Did you somehow get the impression that the SSA<->names mapping can get you correct results? >> Accuracy comes first. If we ever emit debug information saying 'this >> variable is here' for a point in the program in which it's in fact >> elsewhere > I agree here ... >> or unavailable, that's a bug to be fixed. > ... and disagree here. If a value is dead it's not necessarily > unavailable in my world. I never said "dead", you did. I said "unavailable", and by that I don't mean "dead", I really mean "unavailable". The value I'm talking about is not "whatever was last assigned to something that resembles the variable after numerous optimizations" but rather "a value the user might expect the variable to hold at that point in the program", given some user tolerance to reordering and other optimizations. One reason I use separate functions for the breakpoint locations is precisely because at those points users are entitled to expect the state of the program to be stable, i.e., there isn't a lot of reordering or other surprises that a compiler can introduce across function calls that are by themselves in a statement. Another reason is that I still don't have a good answer for breakpoint locations at other points in the program that are less stable across optimizations, and I can't quite describe what I think users are entitled to expect at such other points. 
But the infrastructure needed to bring great improvements even in this regard is being set in place by getting them correct at stable points such as function calls. That said, I'm putting some thought into getting better debug information in these less stable points, but making it completely unsurprising in spite of optimizations isn't the task I was assigned. Making it correct and far more complete is. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
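The two metrics from the foo(1) example in the message above can be written down as a small check. This is only a sketch under stated assumptions: `struct answer`, `reported`, `unavailable`, `expected_i`, `is_correct` and `is_complete` are hypothetical names invented for illustration, not anything from GCC or a debugger. It encodes the rule that a reported value must match what the user is entitled to expect at p1, p2 or p3, while "unavailable" is always correct but not complete.

```c
#include <assert.h>
#include <stdbool.h>

/* What a debugger reports for `i` in foo's frame: either a value or
   "unavailable".  Hypothetical type, for illustration only.  */
struct answer {
  bool available;
  int value;
};

struct answer reported(int value)
{
  struct answer a = { true, value };
  return a;
}

struct answer unavailable(void)
{
  struct answer a = { false, 0 };
  return a;
}

/* Value the user is entitled to expect for `i` when stopped in
   p1 (stop == 1), p2 (stop == 2) or p3 (stop == 3) during foo(x).  */
int expected_i(int x, int stop)
{
  assert(stop >= 1 && stop <= 3);
  return x + (stop - 1);
}

/* Correctness: a wrong value is never reported.  "Unavailable" is
   always correct, merely incomplete.  */
bool is_correct(int x, int stop, struct answer a)
{
  return !a.available || a.value == expected_i(x, stop);
}

/* Completeness: a value was actually reported.  */
bool is_complete(struct answer a)
{
  return a.available;
}
```

Under this model, reporting the stale value 2 at the p3 breakpoint fails `is_correct`, whereas reporting "unavailable" there passes `is_correct` and only fails `is_complete`.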
* Re: Designs for better debug info in GCC
2007-11-08 14:02 ` Robert Dewar
2007-11-08 15:13 ` H.J. Lu
2007-11-08 16:11 ` Michael Matz
@ 2007-11-08 16:37 ` Alexandre Oliva
2007-11-09 1:26 ` Joe Buck
2007-11-09 1:26 ` Robert Dewar
2 siblings, 2 replies; 150+ messages in thread
From: Alexandre Oliva @ 2007-11-08 16:37 UTC (permalink / raw)
To: Robert Dewar
Cc: Michael Matz, Richard Guenther, gcc-patches, gcc

On Nov 8, 2007, Robert Dewar <dewar@adacore.com> wrote:

> My general feelings on this subject:

> 1. I don't think we should care much about the ability to
> *SET* values of variables in optimized code.

Indeed. We should care about correctness of debug information, and
then this ability will come naturally ;-)

> 3. The quality of code at -O0 is really terrible

That's a feature, no?

--
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}
^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC
2007-11-08 16:37 ` Alexandre Oliva
@ 2007-11-09 1:26 ` Joe Buck
2007-11-09 14:53 ` Daniel Jacobowitz
2007-11-09 1:26 ` Robert Dewar
1 sibling, 1 reply; 150+ messages in thread
From: Joe Buck @ 2007-11-09 1:26 UTC (permalink / raw)
To: Alexandre Oliva
Cc: Robert Dewar, Michael Matz, Richard Guenther, gcc-patches, gcc

On Thu, Nov 08, 2007 at 02:36:57PM -0200, Alexandre Oliva wrote:
> > 3. The quality of code at -O0 is really terrible
>
> That's a feature, no?

Actually it's a misfeature, in that it's worse than it needs to be, and
it's worse in ways that increase the time required to produce it (since
a larger volume of code then has to be handled by the back end,
assembler, and linker).

Debugging would be just as easy and natural if -O0 only made sure that
values of variables are written out to memory at positions where the
user can set a breakpoint; the code doesn't need to preserve every
operation exactly as written, or read variables in from memory that
are already in registers. Kind of an -O0.5 would be more desirable
than what we have now.
^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC
2007-11-09 1:26 ` Joe Buck
@ 2007-11-09 14:53 ` Daniel Jacobowitz
2007-11-09 17:06 ` Robert Dewar
0 siblings, 1 reply; 150+ messages in thread
From: Daniel Jacobowitz @ 2007-11-09 14:53 UTC (permalink / raw)
To: gcc-patches, gcc

[Can we pick just gcc@ or just gcc-patches@ please?]

On Thu, Nov 08, 2007 at 05:11:24PM -0800, Joe Buck wrote:
> Debugging would be just as easy and natural if -O0 only made sure that
> values of variables are written out to memory at positions where the
> user can set a breakpoint; the code doesn't need to preserve every
> operation exactly as written, or read variables in from memory that
> are already in registers. Kind of an -O0.5 would be more desirable
> than what we have now.

Careful. Eliminating reads from memory messes up debugger
modification of variables, unless you can explain to the debugger that
the variable is currently in both locations - this has been discussed
but AFAIK there is no representation for it yet. Changing the memory
location won't change the next operation that thinks it's in the
register. Changing the register will be lost later.

--
Daniel Jacobowitz
CodeSourcery
^ permalink raw reply [flat|nested] 150+ messages in thread
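The two-locations problem described in the message above can be illustrated with a toy model. This is a sketch only: `struct cached_var` and the `debugger_*` helpers are hypothetical names, and the model assumes the generated code reads a register copy until it next reloads from memory.

```c
#include <assert.h>

/* Toy model: a variable that lives in memory, with a copy cached in a
   register so later code need not reload it.  Hypothetical type.  */
struct cached_var {
  int mem;  /* the memory home of the variable */
  int reg;  /* the copy the generated code reads next */
};

/* The compiled code's next use of the variable reads the register.  */
int next_use(const struct cached_var *v)
{
  return v->reg + 1;
}

/* Later on, the compiled code reloads the register from memory.  */
void reload(struct cached_var *v)
{
  v->reg = v->mem;
}

/* A debugger that was told only about the memory location.  */
void debugger_set_mem_only(struct cached_var *v, int value)
{
  v->mem = value;
}

/* A debugger that was told only about the register.  */
void debugger_set_reg_only(struct cached_var *v, int value)
{
  v->reg = value;
}

/* A debugger that was told the variable is in both places.  */
void debugger_set_both(struct cached_var *v, int value)
{
  v->mem = value;
  v->reg = value;
}
```

Writing only the memory copy leaves `next_use` computing from the old register value, and writing only the register is undone by the next `reload`; only the write to both locations behaves as the user expects.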
* Re: Designs for better debug info in GCC
2007-11-09 14:53 ` Daniel Jacobowitz
@ 2007-11-09 17:06 ` Robert Dewar
0 siblings, 0 replies; 150+ messages in thread
From: Robert Dewar @ 2007-11-09 17:06 UTC (permalink / raw)
To: gcc-patches, gcc

Daniel Jacobowitz wrote:

> Careful. Eliminating reads from memory messes up debugger
> modification of variables, unless you can explain to the debugger that
> the variable is currently in both locations - this has been discussed
> but AFAIK there is no representation for it yet. Changing the memory
> location won't change the next operation that thinks it's in the
> register. Changing the register will be lost later.

I still think that changing memory locations is a marginal capability
compared to reading them, and that it is fine if this capability is
impacted by even low level optimization.
^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC
2007-11-08 16:37 ` Alexandre Oliva
2007-11-09 1:26 ` Joe Buck
@ 2007-11-09 1:26 ` Robert Dewar
2007-11-12 16:56 ` Alexandre Oliva
1 sibling, 1 reply; 150+ messages in thread
From: Robert Dewar @ 2007-11-09 1:26 UTC (permalink / raw)
To: Alexandre Oliva
Cc: Michael Matz, Richard Guenther, gcc-patches, gcc

Alexandre Oliva wrote:
> On Nov 8, 2007, Robert Dewar <dewar@adacore.com> wrote:
>
>> My general feelings on this subject:
>
>> 1. I don't think we should care much about the ability to
>> *SET* values of variables in optimized code.
>
> Indeed. We should care about correctness of debug information, and
> then this ability will come naturally ;-)

Not really, there are optimizations that will still allow
reading the value of a variable, but not setting it, and
I think it is just fine to do these optimizations. For
instance if we have

  b = a;

the optimizer may not do a copy, it may simply know that
b and a values are in the same place. This does not stand
in the way of reading the value, but it does make it
impossible to write a or b. Similarly, if the optimizer
does test replacement, and knows that the value of a can
be obtained by evaluating some expression, the debugger
can read the value, but may not be able to set it.

>
>> 3. The quality of code at -O0 is really terrible
>
> That's a feature, no?
>
^ permalink raw reply [flat|nested] 150+ messages in thread
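The read-versus-set distinction in the message above, for the case where `b = a` is optimized into both variables sharing one location, can be sketched as follows. `struct variable`, `debugger_read` and `debugger_can_set` are hypothetical names for illustration, and the rule used (a write is only safe when no other variable is described by the same location) is an assumption of this sketch rather than any debugger's documented behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A source-level variable and the place debug info says it lives.
   Hypothetical type, for illustration only.  */
struct variable {
  const char *name;
  int *location;
};

/* Reading always works as long as the location holds the value.  */
int debugger_read(const struct variable *v)
{
  return *v->location;
}

/* Setting vars[which] is only safe if no other variable shares its
   location; otherwise the write would silently change that one too.  */
bool debugger_can_set(const struct variable *vars, size_t n, size_t which)
{
  for (size_t i = 0; i < n; i++)
    if (i != which && vars[i].location == vars[which].location)
      return false;
  return true;
}
```

With `a` and `b` pointing at the same slot, both read back the same value, yet neither can be set independently; a third variable with its own slot can.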
* Re: Designs for better debug info in GCC
2007-11-09 1:26 ` Robert Dewar
@ 2007-11-12 16:56 ` Alexandre Oliva
0 siblings, 0 replies; 150+ messages in thread
From: Alexandre Oliva @ 2007-11-12 16:56 UTC (permalink / raw)
To: Robert Dewar
Cc: Michael Matz, Richard Guenther, gcc-patches, gcc

On Nov 8, 2007, Robert Dewar <dewar@adacore.com> wrote:

> Alexandre Oliva wrote:
>>> 1. I don't think we should care much about the ability to
>>> *SET* values of variables in optimized code.
>>
>> Indeed. We should care about correctness of debug information, and
>> then this ability will come naturally ;-)

> Not really, there are optimizations that will still allow
> reading the value of a variable, but not setting it,

Indeed. I was thinking implementation-level variables, rather than
source-level variables.

--
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}
^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-08 10:23 ` Michael Matz 2007-11-08 14:02 ` Robert Dewar @ 2007-11-08 16:32 ` Alexandre Oliva 1 sibling, 0 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-08 16:32 UTC (permalink / raw) To: Michael Matz; +Cc: Richard Guenther, gcc-patches, gcc On Nov 8, 2007, Michael Matz <matz@suse.de> wrote: > Hi, > On Wed, 7 Nov 2007, Alexandre Oliva wrote: >> > x and y at the appropriate part. Whatever holds 'x' at a point (SSA >> > name, pseudo or mem) will also mention that it holds 'c'. At a later >> > point whichever holds 'y' will also mention in holds 'c' . >> >> I.e., there will be two parallel locations throughout the entire >> function that hold the value of 'c'. > No. For some PC locations the location of 'c' will happen to be the same > as the one holding 'x', and for a different set of PC locations it will be > the one also holding 'y'. So we're in agreement. What you say is how it ought to be done, what I did was to point out that the representation proposed by richi will be unable to do the right thing. >> f(int x /* but also c */, int y /* but also c */) { /* other vars */ > "int x /* but also c */, int y /* but also c */" implies that x == y > already No, per the posted design (assuming I understood it correctly) it just implies that, at some point in the program, an assignment 'c = x' was optimized away, and that at some other point in the program, an assignment 'c = y' was optimized away. >> do_something_with(x, ...); // doesn't touch x or y >> do_something_else_with(y, ...); // doesn't touch x or y >> >> Now, what will you get if you 'print c' in the debugger (or if any >> other debug info evaluator needs to tell what the value of user >> variable c is) at a point within do_something_with(c,...) or >> do_something_else_with(c)? > ... so the answer would be "whatever is in that common place for x,y and > c". And once we removed the incorrect assumption you made, that 'x == y', what do you get? 
> How come that f::c is actually set to p$x?

It was in the original source code, was it not? p$x was passed to f()
as x, and then x was copied to c.

> I don't see any assignment and in fact no declaration for c in f.
> If you had one _that_ would be the place were the connection between
> p$x and 'c' would have been made and everything would fall in place.

Since there is a declaration of c in the original source-level f (the
only one that matters, as far as debug information is concerned), can
you please expand on how you'd get everything to fall in place?

> It's not possible that p$x _and_ p$y are f()::c.1 at the same time,

Exactly

> so the above examples are all somehow invalid.

It's the bitmap debug info representation that makes them nonsensical.

> int f(int y) {
>   int x = 2 * y;
>   return x + 2;
> }

> If the compiler forward-props 2*y into the single use and simplifies:
>   return (y+1)*2;
> then the value 2*y is never actually calculated anymore, not in any
> register, not in any local variable, nowhere. There's no way debug
> information could generally rectify this loss of information.

Actually, while y is live, debug information could encode that x is
2*y, even if the value is not computed at run time. So your statement
is quite an exaggeration.

> In case of more complicated expressions that's not possible anymore
> and you lose.

Yep. If the value is unavailable, debug information should say so,
rather than pointing at something else.

> Forcing some values life is possible,

But undesirable. I'm not trying to do that. Actually, I'm working
hard to make sure it doesn't happen.

> So, our mapping is as accurate as your's.

Not at all, and you made that point yourself, twice, in a single
e-mail.

> It seems in your branch you also force some values life IIUC.

Nope. Any values that are forced live by debug annotations are bugs
to be fixed.
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
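The point made in the message above about `int x = 2 * y;` can be sketched in a toy form: even when the optimized code never computes x, a description of x as an expression over a still-available y lets a debugger evaluate it on demand. Everything below (`struct description`, `enum how`, `debugger_eval`, and the two versions of `f`) is a hypothetical illustration, not GCC's or any debug format's actual representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The function as written, and as simplified by the optimizer.  */
int f_source(int y)
{
  int x = 2 * y;
  return x + 2;
}

int f_optimized(int y)
{
  return (y + 1) * 2;  /* 2*y is never materialized here */
}

/* Hypothetical description of where a variable's value comes from.  */
enum how { UNAVAILABLE, SCALED };

struct description {
  enum how how;
  const int *base;  /* variable the value is derived from */
  int scale;        /* value == scale * *base when how == SCALED */
};

/* Evaluate the description; return false when the value cannot be
   recovered, rather than pointing at something else.  */
bool debugger_eval(const struct description *d, int *out)
{
  if (d->how != SCALED || d->base == NULL)
    return false;
  *out = d->scale * *d->base;
  return true;
}
```

While y is available, a description of x as `{ SCALED, &y, 2 }` yields 2*y; once y is gone, the description has to become `UNAVAILABLE` and evaluation reports failure instead of a stale value.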
* Re: Designs for better debug info in GCC
@ 2007-11-12 21:04 Steven Bosscher
2007-11-24 1:37 ` Alexandre Oliva
0 siblings, 1 reply; 150+ messages in thread
From: Steven Bosscher @ 2007-11-12 21:04 UTC (permalink / raw)
To: Mark Mitchell
Cc: Alexandre Oliva, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc
cf. http://gcc.gnu.org/ml/gcc/2007-11/msg00293.html
Mark Mitchell wrote:
> The reason I want to make that assumption is that the part of this where
> the representation is in question is once we reach RTL, right?
The representation in GIMPLE should also be discussed IMVHO. For
GIMPLE Alex has invented DEBUG_STMT, which has the same properties as
DEBUG_INSN in RTL (with one noteworthy difference, namely that having
note-like GIMPLE statements is a totally new concept while DEBUG_INSN
is just a wannabe-real-insn INSN_NOTE).
Gr.
Steven
^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-12 21:04 Steven Bosscher @ 2007-11-24 1:37 ` Alexandre Oliva 2007-11-24 2:35 ` Steven Bosscher 0 siblings, 1 reply; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 1:37 UTC (permalink / raw) To: Steven Bosscher Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 12, 2007, "Steven Bosscher" <stevenb.gcc@gmail.com> wrote: > DEBUG_INSN in RTL (with one noteworthy difference, namely that having > note-like GIPMLE statements is a totally new concept Not quite. There were codeless gimple constructs before (think labels, for one). Or empty asm statements. But then, I'm not sure what you mean by note-like; maybe it's something else. As I explained before, debug insns and debug stmts are more like code than like notes, because notes generally don't need adjusting as code is modified elsewhere, whereas code does. And debug insns and stmts definitely need adjusting like regular insns. > while DEBUG_INSN is just a wannabe-real-insn INSN_NOTE). Except for this tiny detail that INSN_NOTEs are never adjusted as code is modified, because in general they don't even contain RTL. VAR_LOCATION is a recent exception, and it used to be introduced so late precisely because there's no infrastructure to keep notes up-to-date as code transformations are performed. So, yes, debug stmts and insns are notes in the sense that they don't output code. Like USE insns, labels, empty asm insns and other UNSPECs. But wait, those are insns, not notes. And they do generate code, just not in the .text section, but rather in .debug sections. So, what's this prejudice against debug insns? Why do you regard them as notes rather than insns? 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-24 1:37 ` Alexandre Oliva @ 2007-11-24 2:35 ` Steven Bosscher 2007-11-24 15:08 ` Alexandre Oliva 0 siblings, 1 reply; 150+ messages in thread From: Steven Bosscher @ 2007-11-24 2:35 UTC (permalink / raw) To: Alexandre Oliva Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 23, 2007 9:45 PM, Alexandre Oliva <aoliva@redhat.com> wrote: > So, yes, debug stmts and insns are notes in the sense that they don't > output code. Like USE insns, labels, empty asm insns and other > UNSPECs. But wait, those are insns, not notes. And they do generate > code, just not in the .text section, but rather in .debug sections. All of them relate to code generation though. Without them, we create wrong code. I'm aware of how you feel about debug info and correctness and so on. > So, what's this prejudice against debug insns? Why do you regard them > as notes rather than insns? What worries me is that GCC will have to special-case DEBUG_INSN everywhere where it looks at INSNs. One can already see some of that happening on your branch. Apparently, you can't treat DEBUG_INSN just like any other normal insn. What I see happening with your DEBUG_INSN approach, is that all passes that use NEXT_INSN/PREV_INSN will have to special-case DEBUG_INSN in addition to the NOTE_P or INSN_P checks that they already have. I have seen too many bugs with passes who forgot to look through notes to feel comfortable about adding another not-a-note-but-also-not-an-insn like thing to the insn stream. The fact that DEBUG_INSN also has real operands that are not really real operands is bound to confuse the matter even more. Life with proper insn and operands iterators for RTL would be so much easier, but for the moment I fear you're just going to see a lot of duplication of ugly conditionals and bugs where such conditionals are forgotten/overlooked/missing. 
So to summarize: I'm just worried your approach is going to make GCC even slower, buggier, more difficult to maintain and more difficult to understand and modify. And the benefit, well, let's just say I'm not convinced that less elaborate efforts are not sufficient. (And to be perfectly honest, I think GCC has bigger issues to solve than getting perfect debug info -- such as getting compile times of a linux kernel down ;-)) Gr. Steven ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-24 2:35 ` Steven Bosscher @ 2007-11-24 15:08 ` Alexandre Oliva 2007-11-24 15:18 ` Richard Kenner 2007-11-24 16:45 ` Steven Bosscher 0 siblings, 2 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 15:08 UTC (permalink / raw) To: Steven Bosscher Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc On Nov 23, 2007, "Steven Bosscher" <stevenb.gcc@gmail.com> wrote: >> So, what's this prejudice against debug insns? Why do you regard them >> as notes rather than insns? > What worries me is that GCC will have to special-case DEBUG_INSN > everywhere where it looks at INSNs. This is just not true. Anywhere that simply wants to update insns for the effects of other transformations won't have to do that. Only places in which we need the weak-use semantics of debug_insns need to give them special treatment. Not because they're not insns, but because they're weak uses, i.e., uses that shouldn't interfere with optimizations. Yes, catching all such cases hasn't been trivial. If we miss some, then what happens is that -O2 -g -fvar-tracking-assignments outputs different executable code than -O2. Everything still works just fine, we eventually get a bug report, we fix it and move on. This is *much* better than starting out with notes, that nearly nothing cares about, and try to add code to update the notes as code transformations are performed. In this case, we get incorrect, non-functional compiler output unless we catch absolutely all bugs upfront. > Apparently, you can't treat DEBUG_INSN just like any other normal > insn. Obviously not. They're weaker uses than anything else. We haven't had any such thing in the compiler before. > but for the moment I fear you're just going to see a lot of > duplication of ugly conditionals Your fear is understandable but not justified. Go look at the patches. 
x86_64-linux-gnu now bootstraps and produces exactly the same code with and without -fvar-tracking-assignments. And no complex conditionals were needed. The most I've needed so far was to ignore debug insns at certain spots. It's true that in a number of situations this is an oversimplified course of action, and some additional effort might be needed to actually update the debug insns when they would have interfered with optimizations. Time will tell, I guess. So far, it doesn't look like it's been a problem, and I don't foresee these duplicated or ugly conditionals you fear. > and bugs where such conditionals are forgotten/overlooked/missing. See above. One of the reasons for the approach I've taken is that such cases will, in the worst case, cause missed optimizations, not incorrect compiler output. > And the benefit, well, let's just say I'm not convinced that less > elaborate efforts are not sufficient. Sufficient for what? Efforts towards what? Generating more incorrect debug information just for the sake of it? Adding more debug information while breaking some that's just fine now? Is that really progress? > (And to be perfectly honest, I think GCC has bigger issues to solve > than getting perfect debug info -- such as getting compile times of a > linux kernel down ;-)) Compile speed is a quality of implementation issue. Output correctness and standard compliance comes first in my book. And then, I'm supposed to fix this correctness problem, not other issues that others might find more important. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
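The "weak use" semantics discussed in the message above can be sketched with a toy statement list. This mirrors, in spirit only, the `if (IS_DEBUG_STMT (use_stmt)) continue;` check from the patch at the top of the thread; `struct stmt`, `count_real_uses` and `can_forward_propagate` are hypothetical names, and the single-use rule is just an example of an optimization decision, not GCC's actual heuristic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy statement: it uses one name, and may be a debug annotation.
   Hypothetical type, for illustration only.  */
struct stmt {
  bool is_debug;
  int used_name;
};

/* Count uses of `name`, treating debug statements as weak uses that
   must not influence optimization decisions.  */
size_t count_real_uses(const struct stmt *stmts, size_t n, int name)
{
  size_t uses = 0;
  for (size_t i = 0; i < n; i++) {
    if (stmts[i].is_debug)
      continue;  /* weak use: skip it */
    if (stmts[i].used_name == name)
      uses++;
  }
  return uses;
}

/* Example decision: propagate a definition only into a single use.  */
bool can_forward_propagate(const struct stmt *stmts, size_t n, int name)
{
  return count_real_uses(stmts, n, name) == 1;
}
```

Because debug statements are skipped, the decision is the same for a statement list with and without debug annotations, which is the property that keeps the generated code identical whether or not debug information is requested. Forgetting the skip in one place would change the decision only when annotations are present.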
* Re: Designs for better debug info in GCC
2007-11-24 15:08 ` Alexandre Oliva
@ 2007-11-24 15:18 ` Richard Kenner
2007-11-24 20:11 ` Alexandre Oliva
2007-11-24 16:45 ` Steven Bosscher
1 sibling, 1 reply; 150+ messages in thread
From: Richard Kenner @ 2007-11-24 15:18 UTC (permalink / raw)
To: aoliva
Cc: gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc

> Yes, catching all such cases hasn't been trivial. If we miss some,
> then what happens is that -O2 -g -fvar-tracking-assignments outputs
> different executable code than -O2.

But that's a very serious type of bug because it means you have
situations where a program fails and you can't debug it because when
you turn on debugging information, it doesn't fail anymore. We need
to make an absolute rule that this *cannot* happen and luckily this is
one of the easiest types of errors to protect against.
^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-24 15:18 ` Richard Kenner @ 2007-11-24 20:11 ` Alexandre Oliva 2007-11-24 20:46 ` Bernd Schmidt ` (2 more replies) 0 siblings, 3 replies; 150+ messages in thread From: Alexandre Oliva @ 2007-11-24 20:11 UTC (permalink / raw) To: Richard Kenner Cc: gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc On Nov 24, 2007, kenner@vlsi1.ultra.nyu.edu (Richard Kenner) wrote: >> Yes, catching all such cases hasn't been trivial. If we miss some, >> then what happens is that -O2 -g -fvar-tracking-assignments outputs >> different executable code than -O2. > But that's a very serious type of bug because it means you have > situations where a program fails and you can't debug it because when > you turn on debugging information, it doesn't fail anymore. We need > to make an absolute rule that this *cannot* happen and luckily this is > one of the easiest types of errors to project against. I agree completely. That's why I've gone to such great lengths to ensure these errors are easily testable in my implementation, and to put all my changes under control of a command-line option. Then, you can still get (poorer) debug information by disabling (or not enabling) this option. And then, despite the consensus that GCC must not generate different code with and without -g, the patch that fixes one such regression has been lingering for months, and the patch that introduced the regression hasn't been reverted either. Besides, the Ada RTS compiles differently with -g than without -g, such that compare-debug doesn't pass if you compare sysdep.o. Nobody but me seems to care. I'm sure I'm going to find other differences between -g and -g0 once I fix this and bootstrap4-debug gets past this point and builds other target libraries. 
I'm not looking forward to the discussions that will ensue if any fixes for these problems imply any costs whatsoever, given the experience I've had with the SSA-coalescing and the optimize-basic-blocks issues that are all about debug information versus optimization :-( -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC 2007-11-24 20:11 ` Alexandre Oliva @ 2007-11-24 20:46 ` Bernd Schmidt 2007-11-25 0:42 ` Alexandre Oliva 2007-11-24 20:48 ` Richard Kenner 2007-11-25 14:23 ` Robert Dewar 2 siblings, 1 reply; 150+ messages in thread From: Bernd Schmidt @ 2007-11-24 20:46 UTC (permalink / raw) To: Alexandre Oliva Cc: Richard Kenner, gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc Alexandre Oliva wrote: > And then, despite the consensus that GCC must not generate different > code with and without -g, the patch that fixes one such regression has > been lingering for months, and the patch that introduced the > regression hasn't been reverted either. Pointers? Bernd -- This footer brought to you by insane German lawmakers. Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368 Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif ^ permalink raw reply [flat|nested] 150+ messages in thread
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-11-25 0:42 UTC
To: Bernd Schmidt
Cc: Richard Kenner, gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc

On Nov 24, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> Alexandre Oliva wrote:
>> And then, despite the consensus that GCC must not generate different
>> code with and without -g, the patch that fixes one such regression has
>> been lingering for months, and the patch that introduced the
>> regression hasn't been reverted either.

> Pointers?

Regression introduced here:

http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01745.html

first reported here:

http://gcc.gnu.org/ml/gcc-patches/2007-08/msg00127.html

last proposed patch here:

http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00608.html
* Re: Designs for better debug info in GCC
From: Richard Guenther @ 2007-11-25 7:19 UTC
To: Alexandre Oliva
Cc: Bernd Schmidt, Richard Kenner, gcc-patches, gcc, iant, mark, stevenb.gcc

On Nov 24, 2007 9:19 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Nov 24, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>
> > Alexandre Oliva wrote:
> >> And then, despite the consensus that GCC must not generate different
> >> code with and without -g, the patch that fixes one such regression has
> >> been lingering for months, and the patch that introduced the
> >> regression hasn't been reverted either.
>
> > Pointers?
>
> Regression introduced here:
>
> http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01745.html
>
> first reported here:
>
> http://gcc.gnu.org/ml/gcc-patches/2007-08/msg00127.html
>
> last proposed patch here:
>
> http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00608.html

Well - it's a workaround for a bug that's elsewhere.  Generated code
shouldn't change if we allocate extra DECL_UIDs, but only possibly if
we change DECL_UID ordering.  (If that is the problem, as I remember
your analysis)

Richard.
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-11-25 14:30 UTC
To: Richard Guenther
Cc: Bernd Schmidt, Richard Kenner, gcc-patches, gcc, iant, mark, stevenb.gcc

On Nov 24, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:

> Generated code shouldn't change if we allocate extra DECL_UIDs, but
> only possibly if we change DECL_UID ordering.  (If that is the
> problem, as I remember your analysis)

That is indeed the problem, but I'm not sure your requirement is
feasible.  If we permit DECL_UID divergence, it means we can't use
DECL_UID for hashing any more.  Since they already stand for hashable
proxies for the decl pointers, I don't see what we'd gain by
introducing yet another hashable uid that's stable across -g.

What do you suggest we use for hashing?  Or do you suggest we do away
with hashing and use sorted set or map data structures?
* Re: Designs for better debug info in GCC
From: Richard Guenther @ 2007-11-25 14:46 UTC
To: Alexandre Oliva
Cc: Bernd Schmidt, Richard Kenner, gcc-patches, gcc, iant, mark, stevenb.gcc

On Nov 25, 2007 12:28 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Nov 24, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:
>
> > Generated code shouldn't change if we allocate extra DECL_UIDs, but
> > only possibly if we change DECL_UID ordering.  (If that is the
> > problem, as I remember your analysis)
>
> That is indeed the problem, but I'm not sure your requirement is
> feasible.  If we permit DECL_UID divergence, it means we can't use
> DECL_UID for hashing any more.  Since they already stand for hashable
> proxies for the decl pointers, I don't see what we'd gain by
> introducing yet another hashable uid that's stable across -g.
>
> What do you suggest us to use for hashing?  Or do you suggest us to do
> away with hashing and use sorted set or map data structures?

No, hashing is fine, but doing walks over a hashtable when your
algorithm depends on ordering is not.  I have patches to fix the
instance of walking over all referenced vars.  Which is in the case of
UIDs using bitmaps and a walk over a bitmap (which ensures walks in
UID order).

Richard.
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-11-26 10:11 UTC
To: Richard Guenther
Cc: Bernd Schmidt, Richard Kenner, gcc-patches, gcc, iant, mark, stevenb.gcc

On Nov 24, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:

> No, hashing is fine, but doing walks over a hashtable when your algorithm
> depends on ordering is not.

Point.

> I have patches to fix the instance of walking over all referenced
> vars.  Which is in the case of UIDs using bitmaps and a walk over a
> bitmap (which ensures walks in UID order).

Why is such memory and CPU overhead better than avoiding the
divergence of UIDs in the first place?
* Re: Designs for better debug info in GCC
From: Richard Guenther @ 2007-11-26 12:26 UTC
To: Alexandre Oliva
Cc: Bernd Schmidt, Richard Kenner, gcc-patches, gcc, iant, mark, stevenb.gcc

On Nov 26, 2007 7:57 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Nov 24, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:
>
> > No, hashing is fine, but doing walks over a hashtable when your algorithm
> > depends on ordering is not.
>
> Point.
>
> > I have patches to fix the instance of walking over all referenced
> > vars.  Which is in the case of UIDs using bitmaps and a walk over a
> > bitmap (which ensures walks in UID order).
>
> Why is such memory and CPU overhead better than avoiding the
> divergence of UIDs in the first place?

Actually my patches should be an overall memory savings.  But, as you
(and me and others) look at bugs that happen because of UID
divergence, it is easier to use UIDs in a way that guarantees that
generated code does not change in such cases.  Otherwise what's the
point in using UIDs?  If you later do hashtable walks anyway you can
hash on the pointer as well.

So, IMHO an algorithm should produce the same result if for an ordered
set of UIDs M { u1, u2, u3 } instead an ordered set M' { u1', u2', u3' }
is used where element correspondence is u1 : u1', u2 : u2', u3 : u3',
independent of the actual values uN or differences between values
uN - uM.  Anything else is a bug.  And compensating for those bugs in
other places by trying to preserve the exact values of UIDs is broken
(and in this case, as it delays memory optimization, actually bad).

Just my few euro-cents,
Richard.
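The invariance property described in the message above can be illustrated with a small, self-contained C sketch. It is not GCC code: `walk_sorted`, `walk_hashed` and `NBUCKETS` are hypothetical names, and the modulo hash is only an assumption chosen to keep the example short. An ordered walk visits elements by relative UID order, so an order-preserving renumbering leaves the visiting order unchanged, while a bucket-by-bucket hash table walk depends on the actual UID values.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 8u /* hypothetical hash table size, not a GCC constant */

/* Ordered walk: order[k] receives the index of the element holding the
   k-th smallest UID.  Assumes the UIDs are distinct.  Only the relative
   order of the UIDs matters, never their actual values.  */
void walk_sorted(const unsigned *uids, size_t n, size_t *order)
{
    for (size_t i = 0; i < n; i++) {
        size_t rank = 0;
        for (size_t j = 0; j < n; j++)
            if (uids[j] < uids[i])
                rank++;
        order[rank] = i;
    }
}

/* Hash-table-style walk: visit buckets 0..NBUCKETS-1 in turn, and within
   a bucket visit elements in insertion order.  The bucket is
   uid % NBUCKETS (an assumed toy hash), so the visiting order depends on
   the actual UID values.  */
void walk_hashed(const unsigned *uids, size_t n, size_t *order)
{
    size_t k = 0;
    for (unsigned b = 0; b < NBUCKETS; b++)
        for (size_t i = 0; i < n; i++)
            if (uids[i] % NBUCKETS == b)
                order[k++] = i;
}
```

For instance, with UIDs {3, 5, 7} and the order-preserving renumbering {3, 5, 12}, `walk_sorted` visits the elements in the same order in both cases, whereas `walk_hashed` visits the third element before the second once its UID becomes 12 (12 % 8 == 4).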
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-11-26 18:58 UTC
To: Richard Guenther
Cc: Bernd Schmidt, Richard Kenner, gcc-patches, gcc, iant, mark, stevenb.gcc

On Nov 26, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:

> On Nov 26, 2007 7:57 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Nov 24, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:
>>
>> > No, hashing is fine, but doing walks over a hashtable when your algorithm
>> > depends on ordering is not.
>>
>> Point.
>>
>> > I have patches to fix the instance of walking over all referenced
>> > vars.  Which is in the case of UIDs using bitmaps and a walk over a
>> > bitmap (which ensures walks in UID order).
>>
>> Why is such memory and CPU overhead better than avoiding the
>> divergence of UIDs in the first place?

> Actually my patches should be an overall memory savings.

Err...  I don't see how using a bitmap in addition to a hashtable can
save memory over using only a hashtable.  Or are you saying you do
away with the hashtables?  I can see that this is possible and
desirable.

> But, as you (and me and others) look at bugs that happen because of
> UID divergence, it is easier to use UIDs in a way that guarantees
> that generated code does not change in such cases.

Agreed, this property is desirable.  But I wouldn't say it is enough.

Ensuring UIDs remain constant across compilations has helped
tremendously in locating other compilation divergences, for comparing
debug dumps becomes much easier.  So, even if we use algorithms that
don't depend on UIDs remaining constant across compilations, I believe
it is highly desirable that we keep them constant across compilations.

> Otherwise what's the point in using UIDs?

There are several different reasons for having UIDs, some of which
could be having some unique identifier for an object, even in the
presence of a moving garbage collector; being able to create
fully-ordered sets of objects; being able to easily identify objects
across a single compilation; being able to easily identify objects
even across multiple compilations; and I'm sure it's possible to come
up with other reasons that would justify the idea of UIDs on their
own.
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-11-25 14:22 UTC
To: Bernd Schmidt
Cc: Richard Kenner, gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc

On Nov 24, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Nov 24, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>> Alexandre Oliva wrote:
>>> And then, despite the consensus that GCC must not generate different
>>> code with and without -g, the patch that fixes one such regression has
>>> been lingering for months, and the patch that introduced the
>>> regression hasn't been reverted either.

>> Pointers?

> Regression introduced here:
> http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01745.html
> first reported here:
> http://gcc.gnu.org/ml/gcc-patches/2007-08/msg00127.html
> last proposed patch here:
> http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00608.html

I take it back that this patch wasn't approved.  Mark had approved it
on Nov 5, I didn't want to check it in before going on a trip and,
when I returned, I forgot about the approval because it was in an
unrelated thread.

http://gcc.gnu.org/ml/gcc/2007-11/msg00139.html

I'll shortly check in that one and a bunch of others that also got
approval but that I deferred until my return.
* Re: Designs for better debug info in GCC
From: Richard Kenner @ 2007-11-24 20:48 UTC
To: aoliva
Cc: gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc

> Besides, the Ada RTS compiles differently with -g than without -g,
> such that compare-debug doesn't pass if you compare sysdep.o.  Nobody
> but me seems to care.

That's weird.  Except on Windows, VxWorks, and VMS, there's almost
no code in that file.
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-11-25 0:02 UTC
To: Richard Kenner
Cc: gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc

On Nov 24, 2007, kenner@vlsi1.ultra.nyu.edu (Richard Kenner) wrote:

>> Besides, the Ada RTS compiles differently with -g than without -g,
>> such that compare-debug doesn't pass if you compare sysdep.o.  Nobody
>> but me seems to care.

> That's weird.  Except on Windows, VxWorks, and VMS, there's almost
> no code in that file.

Yep.  On GNU/Linux, the difference is precisely that, when compiling
with -g, the variables that represent the file open modes are emitted
to the output, while compiling without -g they're completely optimized
away.
* Re: Designs for better debug info in GCC
From: Robert Dewar @ 2007-11-25 14:23 UTC
To: Alexandre Oliva
Cc: Richard Kenner, gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc

Alexandre Oliva wrote:

> Besides, the Ada RTS compiles differently with -g than without -g,
> such that compare-debug doesn't pass if you compare sysdep.o.  Nobody
> but me seems to care.

We certainly care about this, and appreciate efforts to fix it!

Robert Dewar.

We = all the GNAT folks.
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-12-15 20:32 UTC
To: Robert Dewar
Cc: Richard Kenner, gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc

On Nov 24, 2007, Robert Dewar <dewar@adacore.com> wrote:

> Alexandre Oliva wrote:
>> Besides, the Ada RTS compiles differently with -g than without -g,
>> such that compare-debug doesn't pass if you compare sysdep.o.  Nobody
>> but me seems to care.

> We certainly care about this, and appreciate efforts to fix it!

Should be fixed now, FWIW.
* Re: Designs for better debug info in GCC
From: Robert Dewar @ 2007-12-15 21:41 UTC
To: Alexandre Oliva
Cc: Richard Kenner, gcc-patches, gcc, iant, mark, richard.guenther, stevenb.gcc

Alexandre Oliva wrote:
> On Nov 24, 2007, Robert Dewar <dewar@adacore.com> wrote:
>
>> Alexandre Oliva wrote:
>
>>> Besides, the Ada RTS compiles differently with -g than without -g,
>>> such that compare-debug doesn't pass if you compare sysdep.o.  Nobody
>>> but me seems to care.
>
>> We certainly care about this, and appreciate efforts to fix it!
>
> Should be fixed now, FWIW.

Good to hear, definitely worthwhile!  That's an important invariant.
* Re: Designs for better debug info in GCC
From: Steven Bosscher @ 2007-11-24 16:45 UTC
To: Alexandre Oliva
Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On Nov 24, 2007 5:54 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> > Apparently, you can't treat DEBUG_INSN just like any other normal
> > insn.
>
> Obviously not.  They're weaker uses than anything else.  We haven't
> had any such thing in the compiler before.

So we get a "third way".  GCC has insns and notes, and now it gets a
third object to deal with in the insns stream.  And it has to handle
this new case everywhere.

To me it seems that your approach will not help to make GCC easier to
work with and understand.  Unless there are compelling reasons to do
this, I think this is a step in the wrong direction.

> > but for the moment I fear you're just going to see a lot of
> > duplication of ugly conditionals
>
> Your fear is understandable but not justified.  Go look at the
> patches.  x86_64-linux-gnu now bootstraps and produces exactly the
> same code with and without -fvar-tracking-assignments.  And no complex
> conditionals were needed.  The most I've needed so far was to ignore
> debug insns at certain spots.

I didn't say "complex conditionals" but ugly conditionals ;-)

I mean all the "INSN_P && ! DEBUG_INSN_P" conditionals.  There seem to
be a lot of those, and it's not immediately obvious where and when
you'd need them.

> > and bugs where such conditionals are forgotten/overlooked/missing.
>
> See above.  One of the reasons for the approach I've taken is that
> such cases will, in the worst case, cause missed optimizations, not
> incorrect compiler output.

Ah!  More on that later.

> > And the benefit, well, let's just say I'm not convinced that less
> > elaborate efforts are not sufficient.
>
> Sufficient for what?  Efforts towards what?  Generating more incorrect
> debug information just for the sake of it?  Adding more debug
> information while breaking some that's just fine now?  Is that really
> progress?

Ah, there you go again with this extremist pro-debug-info stance.  How
can one argue with you when you keep ridiculing other points of view
using ridiculous arguments?  Who said anything about "generating more
incorrect information just for the sake of it"?  I don't think anyone
did.  The "for the sake of it" part is just offensive.  You seem to
imply that people are arguing gcc should emit wrong debug information
on purpose.

Please step out of your own world of thoughts for a second, and try to
understand that other people can have a different but nevertheless
reasonable point of view.

I think it is impossible to get perfect debug info after very complex
code transformations.  And because of that, I also think it is
reasonable to not get perfect debug info in less complex cases.  Your
colleague expressed perfectly how I define "sufficiently good debug
info":

"It needs to be good enough
that a semi-knowledgeable person or a dumb but heuristic-laden program
that processes debugging info can nevertheless extract reliable
information."
(http://gcc.gnu.org/ml/gcc/2007-11/msg00581.html)

Note how this "good enough" does not imply "correctness at all cost".

Here is another "extremist" point of view:

Correctness for an optimization algorithm means that it does not miss
optimization opportunities that it is designed to catch.  Therefore if
an optimization algorithm implementation misses an optimization that
it should catch, then this is a correctness issue.

;-)

You said you now get the same code with and without
-fvar-tracking-assignments on your branch.  Can you also prove that
the branch does not introduce new missed optimizations wrt. the latest
revision that you merged from the trunk?

Gr.
Steven
* Re: Designs for better debug info in GCC
From: Alexandre Oliva @ 2007-11-24 18:50 UTC
To: Steven Bosscher
Cc: Mark Mitchell, Ian Lance Taylor, Richard Guenther, gcc-patches, gcc

On Nov 24, 2007, "Steven Bosscher" <stevenb.gcc@gmail.com> wrote:

> On Nov 24, 2007 5:54 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> > Apparently, you can't treat DEBUG_INSN just like any other normal
>> > insn.
>>
>> Obviously not.  They're weaker uses than anything else.  We haven't
>> had any such thing in the compiler before.

> So we get a "third way".  GCC has insns and notes, and now it gets a
> third object to deal with in the insns stream.

Not quite.  It's an insn.  But it is different in some ways.  It's
not unheard of.  Asm insns are also different in some ways.  USEs and
CLOBBERs too.  Delayed-branch instruction groups too.

It would be great if infrastructure for weak uses was already in
place, but if it's needed (we haven't determined that, but I'm
convinced there's no better way) and it isn't there, then it has to be
put in.

> And it has to handle this new case everywhere.

I've already explained why this isn't true.  It's not even close to
being true.  In fact, I've chosen this representation *precisely*
because I reasoned it would lead to the least global impact.  Of
course you can refuse to believe that and point at the changes I had
to make as alleged counter-proof, failing to notice how many other
locations I haven't had to change and that just work because adjusting
other instructions after transformations is precisely what all
transformation passes already do.

> I didn't say "complex conditionals" but ugly conditionals ;-)

> I mean all the "INSN_P && ! DEBUG_INSN_P" conditionals.

Oh, that's easy: NON_DEBUG_INSN_P can simplify that.

There are, what, a few dozen such tests in the compiler right now,
compared with the hundreds of tests for INSN_P and a few tens of tests
for DEBUG_INSN_P.  I didn't think it was worth creating yet another
macro, but if you find this so unacceptable, maybe I can rework it.
Would you prefer NON_DEBUG_INSN_P, or would you prefer the original
INSN_P and all uses thereof to be spelled differently, just to keep
the few objectionable INSN_P && ! DEBUG_INSN_P tests more beautiful?

>> Sufficient for what?  Efforts towards what?  Generating more incorrect
>> debug information just for the sake of it?  Adding more debug
>> information while breaking some that's just fine now?  Is that really
>> progress?

> Ah, there you go again with this extremist pro-debug-info stance.  How
> can one argue with you when you keep ridiculing other points of view
> using ridiculous arguments?  Who said anything about "generating more
> incorrect information just for the sake of it"?

Getting even the trivial cases wrong and dismissing those without
realizing how things would fall apart in the big picture looks like
"generating more incorrect information just for the sake of it" to me.

Now, maybe it's not.  Maybe it's just human behavior, a wish that some
simpler solution will take care of a problem and that the simple
counter-examples I've pointed out are rare situations.  I don't see
that they are.

I've put a lot of thought into this problem, I've been working on it
for quite a long time, and I've fallen in many of the traps that I
pointed out, and avoided several others.

I realize I come off as arrogant when I feel cornered by a majority
that obviously hasn't spent enough time on the issue to realize the
obvious-to-me major problems with the alternatives that are on the
table.  I realize in such situations I often react in ways that are
detrimental to the points I'm trying to make.  I realize this doesn't
help.  I hope people can see through the mess of proposal-name-calling
that this is turning into.

> The "for the sake of it" part is just offensive.

I agree, and I apologize for that.  It's been a very frustrating
debate.

> You seem to imply that people are arguing gcc should emit wrong debug
> information on purpose.

That's how it feels to me when the claims come up that it's not a
matter of correctness, or that it's not important to get it right.

> Your colleague expressed perfectly how I define "sufficiently good
> debug info":

> "It needs to be good enough
> that a semi-knowledgeable person or a dumb but heuristic-laden program
> that processes debugging info can nevertheless extract reliable
> information."
> (http://gcc.gnu.org/ml/gcc/2007-11/msg00581.html)

I'm very happy you agree with him.  Unfortunately, you appear to be
focusing on the sloppiness afforded by the wording "good enough", and
assuming that this can be pushed beyond the point of "extract
*reliable* information", which is the key operative qualifier here.

If it's "good enough" for other purposes, but it's not possible to
"extract reliable information from debugging info", then we don't
satisfy the predicate above.  That's why I'm aiming at correctness
(it's reliable) rather than completeness (optimizations can discard
stuff).

> Here is another "extremist" point of view:

> Correctness for an optimization algorithm means that it does not miss
> optimization opportunities that it is designed to catch.  Therefore if
> an optimization algorithm implementation misses an optimization that
> it should catch, then this is a correctness issue.

> ;-)

I happen to agree, indeed, but it's a correctness issue of the
implementation, not a correctness issue of the compiler output, which
is what I'm talking about when I speak of correctness issues.

> You said you now get the same code with and without
> -fvar-tracking-assignments on your branch.  Can you also prove that
> the branch does not introduce new missed optimizations wrt. the latest
> revision that you merged from the trunk?

I could, and that's a very good idea (thanks!), but it will be easier
to do that after my next merge, when there won't be fixes for missed
optimizations, that I detected with my testing, missing from the
baseline.

After all such missed optimizations are in the trunk, I intend to
merge that into the branch and compare mergepoint and branch for
compiler output changes other than in debug information.  If there are
any changes (extremely unlikely), these are bugs that I'll have to
fix.
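The NON_DEBUG_INSN_P idea mentioned in the message above can be sketched as follows. This is a toy model under stated assumptions, not GCC's rtl.h: the `toy_code` enum, the `toy_rtx` struct and `count_non_debug_insns` are hypothetical, and the macro bodies are simplified stand-ins that only mirror the "INSN_P && ! DEBUG_INSN_P" test quoted in the thread.

```c
#include <assert.h>

/* Toy stand-ins for RTL codes; hypothetical, not the real GCC enum.  */
enum toy_code {
    TOY_NOTE,
    TOY_INSN,
    TOY_JUMP_INSN,
    TOY_CALL_INSN,
    TOY_DEBUG_INSN
};

struct toy_rtx {
    enum toy_code code;
};

#define GET_CODE(X) ((X)->code)

/* In this toy model a debug insn is still an insn, as the message says.  */
#define INSN_P(X)                                                  \
    (GET_CODE(X) == TOY_INSN || GET_CODE(X) == TOY_JUMP_INSN       \
     || GET_CODE(X) == TOY_CALL_INSN || GET_CODE(X) == TOY_DEBUG_INSN)

#define DEBUG_INSN_P(X) (GET_CODE(X) == TOY_DEBUG_INSN)

/* The proposed shorthand for the repeated two-part conditional.  */
#define NON_DEBUG_INSN_P(X) (INSN_P(X) && !DEBUG_INSN_P(X))

/* Count the insns that can affect generated code, skipping notes and
   debug insns.  */
int count_non_debug_insns(const struct toy_rtx *stream, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (NON_DEBUG_INSN_P(&stream[i]))
            count++;
    return count;
}
```

A pass that must behave identically with and without debug insns would, in this sketch, test `NON_DEBUG_INSN_P` wherever it previously tested `INSN_P`, so that the presence of debug insns in the stream does not change what it counts or transforms.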
* Re: Designs for better debug info in GCC
From: Richard Guenther @ 2007-11-24 20:21 UTC
To: Alexandre Oliva
Cc: Steven Bosscher, Mark Mitchell, Ian Lance Taylor, gcc-patches, gcc

On Nov 24, 2007 4:00 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Nov 24, 2007, "Steven Bosscher" <stevenb.gcc@gmail.com> wrote:
>
> > And it has to handle this new case everywhere.
>
> I've already explained why this isn't true.  It's not even close to
> being true.  In fact, I've chosen this representation *precisely*
> because I reasoned it would lead to the least global impact.  Of
> course you can refuse to believe that and point at the changes I had
> to make as alleged counter-proof, failing to notice how many other
> locations I haven't had to change and that just work because adjusting
> other instructions after transformations is precisely what all
> transformation passes already do.

It also makes some things easier - for example during inlining of a
function body we re-map all DECLs in the inlined copy.  With an
on-the-side representation you have to make sure the same mapping is
applied explicitly; with DEBUG_INSNs the mapping is automatically done
during the copying of the IL.  A similar problem with using SSA_NAME
definition points to store information is using the renamer to rename
a variable that already has SSA_NAMEs (which is IMHO bogus, as we do
not detect the erroneous case of overlapping life-ranges - but ignore
that for now) - in this case you need some magic to transfer the
on-the-side debug information from the old SSA_NAMEs to the new ones
(where possible).

Just to mention a few problems we are running into ;)

Richard.