To: Alexandre Oliva
Cc: gcc@gcc.gnu.org
Subject: Re: Designs for better debug info in GCC
From: Ian Lance Taylor
Date: Wed, 19 Dec 2007 18:41:00 -0000

Alexandre Oliva writes:

> You snipped (skipped?) one aspect of the reasoning on why it is
> appropriate.  Of course this doesn't prove it's the best possibility,
> but I haven't seen evidence of why it isn't.

You will find it easier to demonstrate the worth of your proposal if
you act publicly as though your interlocutors are people of good will,
even when it doesn't seem that way to you, and omit interjections like
"(skipped?)".  Assuming the goal is to get this into mainline gcc, you
have to convince us, not the other way around.  The first step in
convincing people in this forum is not to irritate them.

> Now, if you tell me that information about i_0 and j_2 is
> backward-propagated to the top of the function, where x and y are set
> up, I introduce say zero-initialization for i and j before probe1()
> (an actual function call, mind you), and then this representation is
> provably broken.

To be sure we are on the same page, I think your argument here is that
with this code:

    int f(int x, int y) {
      int i = 0, j = 0;

      probe1();
      i = x;
      j = y;
      probe2();
      if (x < y)
        i += y;
      else
        j -= x;
      probe3();
      return g (i, j);
    }

if I set a breakpoint just before the call to probe2(), and I print the
values of 'i' and 'j', I should get the values of 'x' and 'y'.  That
is, you want to emit a DWARF variable note at that point that the value
of 'i' can be found in the location corresponding to 'x'.

Of course there are no actual instructions between the calls to
probe1() and probe2().  If I use gdb's "finish" command out of
probe1(), what values should I see for 'i' and 'j' at that point?
Arguably I am now before the assignment statements, and should see '0'
and '0', the values that 'i' and 'j' have before they are changed.  Of
course, this is the same location as the breakpoint before probe2(),
and we can't see both '0'/'0' and 'x'/'y'.

So it seems to me that this situation is actually somewhat ambiguous.
I don't see an obviously correct answer.  Setting that aside, seeing
the values 'x' and 'y' would probably be more useful in practice, even
if the other possibility is not wrong.

I think the general issue you are describing is how to handle an
assignment which appears in user code but which has been eliminated
during optimization.  You are certainly correct: the scheme I was
outlining does not address deleted assignments.

It seems to me that such eliminated assignments are inherently
ambiguous.  If the assignment is gone, then there is a point in the
generated code where the variable logically has both the old and the
new values.  I assume that the debugger can only display one value.
Which one should it be?

Your representation clearly makes a choice.  What makes it a principled
choice?  Consider a series of assignments to a local variable, and
suppose that all the assignments are deleted because they are unused.
Are there dependencies between the DEBUG notes which keep them in the
right order?

One way to make a principled choice is to consider the line notes we
are going to emit with the debugging information.  Presumably we do not
have the goal of emitting correct debug information in between line
notes--e.g., when using the "stepi" command in gdb.  Our goal is to
emit correct debug information at the points where a debugger would
naturally stop--the notes for where a line starts.

I wonder whether it would be feasible for the debug info generation to
work from the assignments in the source code as generated by the
frontend.  For each assignment, we would find the corresponding line
note.  Then we would look at the right hand side, and try to identify
where that value could be found at that point in the program.  This
would be a variant of our current variable tracking pass.  I haven't
thought about this enough to know whether it would really work.

> > It is of course true that optimized code will move around
> > unpredictably, and your proposal doesn't handle that.
>
> It handles that in that a variable will be regarded as being assigned
> to a value when execution crosses the debug stmt/insn originally
> inserted right after the assignment.
> This is by design, but I realize
> now I forgot to mention this in the design document.
>
> The idea is that, debug insns get high priority in scheduling.
> However, since they mention the assignment just before them, if the
> assignment is just moved earlier, without an intervening scheduling
> barrier, then the debug instruction will follow it.  If the assignment
> is removed, then the debug insn can be legitimately be move up to the
> point where the assignment, if remaining, might have been moved up to.
> However, if the assignment is moved to a separate basic block, say out
> of a loop or a conditional, then we don't want the debug insn to move
> with it: such that hoisting and commonizing are regarded as setting
> temporaries, and the value is only "committed" to the variable if we
> get to the point where the assignment would take place.

That will only work correctly if sched-deps.c introduces dependencies
between debug insns and real insns.  Otherwise, debug insns will move
ahead of real insns which change their values.  If you introduce those
dependencies, I don't understand how you will avoid changing the
scheduler's behaviour in the presence of debug insns.  How did you work
around that problem?

> >> Testing for accuracy and completeness of debug information can be best
> >> accomplished using a debugging environment.
>
> > Of course this is very unsatisfactory without an automated testsuite.
>
> Err... I didn't say the testing through a debugging environment
> wouldn't be automated.  My plan is to use something along the lines of
> the GDB testsuite scripts, but whether to use GDB or some other
> debugging or monitoring infrastructure is a tiny implementation detail
> that I haven't worried about at all.  The basic idea is to script the
> inspection of variables and verify that the obtained values are the
> expected ones, or that variables are defensibly unavailable at the
> inspection points.

Personally, I would like to see that testsuite first.
That will give us an operational definition to aim for, rather than a
theoretical discussion which I find to be ambiguous.  And it will avoid
the problem of turning the testsuite into a regression testsuite rather
than an accuracy testsuite.

But of course I'm not doing the work.

Ian