From: "Daniel Berlin"
To: "Alexandre Oliva"
Cc: "Diego Novillo", "Mark Mitchell", "Robert Dewar", "Ian Lance Taylor", "Richard Guenther", gcc-patches@gcc.gnu.org, gcc@gcc.gnu.org
Subject: Re: Designs for better debug info in GCC
Date: Tue, 18 Dec 2007 23:19:00 -0000
Message-ID: <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com>

On 12/18/07, Alexandre Oliva wrote:

> Then, we let tree optimizers do their jobs. Whenever they rename,
> renumber, coalesce, combine or otherwise optimize a variable, they
> will automatically update debug statements that mention them as well.
>

Speaking only about the tree level; in this entire email I make no
representations about the RTL level ;)

This is much harder than you give it credit for, unless you plan on
throwing out all the info at elimination points.

Consider PRE alone, which makes new statements that are combinations
of old ones, and eliminates tons of variables in favor of them.

If your debug statement strategy is "move debug statements when we
insert code that is equivalent", it won't work, because our
equivalence is based on value equivalence, not location equivalence.
We only guarantee that the new statement has the same value as
whatever it is a copy of at that point, not that it has the same
location.

So you will lose info every time PRE makes an insertion, unless you
make serious modifications to PRE.

This is not to mention the data you lose if you just throw it away at
elimination points.

Let's take another problem. How do I say that the debug info for some
variable is now dead, i.e. that we have no idea what it is right now?

How do I figure out which debug statements need to be modified when
you introduce new memory operations? When you pass something by
address, you get vops. The vops are not variables, and have no
relation to the original variable (they can be partitions containing
more variables).

If I have

    DEBUG(x, x_3)
    x_3 = x;  // Read from global
    y = x_3;
    ....

and I insert a new call

    DEBUG(x, x_3): 1
    x_3 = x
    foo()  // May modify x (and *&x)
    y = x_3

Now you have two problems.

It is no longer true that DEBUG(x, x_3) holds at the point of
y = x_3. In fact, x_3 may no longer have any relation to x.

You have three choices:

1. Destroy the DEBUG(x, x_3), losing valuable and correct info
2. Add a new DEBUG(x, unknown)
3. Figure out which debug statements are reached by your call

#3 is a dataflow problem, and not something you want to do every time
you insert a call.
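
To make the foo() example concrete, here is a minimal sketch of what
choice #2 amounts to on straight-line code. Everything in it is
hypothetical: the tuple-based mini-IR, the function name
bindings_before_each_stmt, and the use of None for "unknown" are made
up for illustration and are not GCC data structures or APIs.

```python
# Hypothetical mini-IR, not GCC's: each statement is a tuple.
#   ("debug", var, ssa_name)        -> DEBUG(var, ssa_name)
#   ("assign", dst, src)            -> dst = src
#   ("call", callee, may_modify)    -> callee(), may_modify is a set of vars
UNKNOWN = None  # stands in for DEBUG(var, unknown)


def bindings_before_each_stmt(stmts):
    """Return, for each statement, the var -> SSA-name bindings that
    still hold just before it executes (straight-line code only)."""
    current = {}
    result = []
    for stmt in stmts:
        result.append(dict(current))
        kind = stmt[0]
        if kind == "debug":
            _, var, name = stmt
            current[var] = name
        elif kind == "call":
            _, _callee, may_modify = stmt
            # A call that may write the variable invalidates its binding.
            for var in may_modify:
                if var in current:
                    current[var] = UNKNOWN
        # Plain assignments do not change bindings in this sketch.
    return result


original = [
    ("debug", "x", "x_3"),
    ("assign", "x_3", "x"),
    ("assign", "y", "x_3"),
]
with_call = [
    ("debug", "x", "x_3"),
    ("assign", "x_3", "x"),
    ("call", "foo", {"x"}),
    ("assign", "y", "x_3"),
]

# Binding seen at "y = x_3" in each version.
at_y_original = bindings_before_each_stmt(original)[2]
at_y_with_call = bindings_before_each_stmt(with_call)[3]
```

Even in this toy form, knowing which bindings a newly inserted call
kills means walking the statements it reaches, which is the dataflow
flavor of the problem.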
If your answer is #1 or #2, then what you are really doing is
computing roughly the same dataflow problem var-location does, except
on trees and with a different meet operation.

var-location generates incorrect info not because it represents
something fundamentally different from what you represent (it
doesn't). It falls down because it uses union as the meet operation.
It says "oh, I don't know which of these locations is right, it must
be both of them".

If you changed the meet operation to "oh, I don't know which of these
locations is right, it must be none of them", and did a little more
work, you would infer the same info as your design does *at the tree
level*.

Nothing you have proposed is fundamentally going to give you better
info. All you have done is annotate the IR in some places to make
explicit some bits of the dataflow problem that you could infer
anyway.

It is provable that you can infer them with a simple lattice and
associated value, *unless you are going to start guessing* (which you
have said you don't want to do because it can generate incorrect
info).

There is absolutely no reason that what you are trying to do needs to
modify the tree IR at all to achieve exactly the same accuracy of
debug info as your design proposes at the tree level.

You could simply compute the global dataflow problem.

The RTL level is harder, of course.
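
As a rough illustration of the two meet operations, here is a small
sketch of what happens at a control-flow join. The dictionaries, the
location names ("reg1", "stack+8", "reg2") and both function names are
assumptions made up for the example; this is not how var-location is
implemented, and "none of them" is modeled here simply as an empty
set whenever the incoming facts disagree.

```python
def meet_union(a, b):
    """Union meet: on disagreement, claim every incoming location."""
    return {v: a.get(v, set()) | b.get(v, set()) for v in a.keys() | b.keys()}


def meet_agree_or_none(a, b):
    """Conservative meet: keep a location set only if both sides agree,
    otherwise record no location at all for that variable."""
    merged = {}
    for v in a.keys() | b.keys():
        if v in a and v in b and a[v] == b[v]:
            merged[v] = set(a[v])
        else:
            merged[v] = set()
    return merged


# Facts flowing in from two predecessor blocks (hypothetical locations).
from_left = {"x": {"reg1"}, "y": {"reg2"}}
from_right = {"x": {"stack+8"}, "y": {"reg2"}}

union_result = meet_union(from_left, from_right)
strict_result = meet_agree_or_none(from_left, from_right)
```

With these inputs the union meet reports x in both "reg1" and
"stack+8", at least one of which is wrong on any given path, while the
conservative meet reports no location for x and still keeps the
agreed-upon location for y.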