From mboxrd@z Thu Jan 1 00:00:00 1970
To: Michael Matz
Cc: Richard Guenther, gcc-patches@gcc.gnu.org, gcc@gcc.gnu.org
Subject: Re: Designs for better debug info in GCC
References: <84fc9c000711050327x74845c78ya18a3329fcf9e4d2@mail.gmail.com>
From: Alexandre Oliva
Date: Wed, 07 Nov 2007 18:45:00 -0000
In-Reply-To: (Michael Matz's message of "Wed\, 7 Nov 2007 18\:20\:31 +0100 \(CET\)")
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source:
2007-11/txt/msg00348.txt.bz2

On Nov 7, 2007, Michael Matz wrote:

> On Wed, 7 Nov 2007, Alexandre Oliva wrote:

>> This will fail on a very fundamental level.  Consider code such as:
>>
>> f(int x, int y) {
>>   int c;
>>   /* other vars */
>>   c = x;
>>   do_something_with(c, ...); // doesn't touch x or y
>>   c = y;
>>   do_something_else_with(c, ...); // doesn't touch x or y
>>
>> This can (and should) be trivially optimized to:
>>
>> f(int x, int y) {
>>   /* other vars */
>>   do_something_with(x, ...); // doesn't touch x or y
>>   do_something_else_with(y, ...); // doesn't touch x or y
>>
>> But now, if I 'print c' in a debugger in the middle of one of the
>> do_something_*with expansions, what do I get?
>>
>> With the approach I'm implementing, you should get x and y at the
>> appropriate points, even though variable c doesn't really exist any
>> more.
>>
>> With your approach, what will you get?

> x and y at the appropriate part.  Whatever holds 'x' at a point (SSA name,
> pseudo or mem) will also mention that it holds 'c'.  At a later point
> whichever holds 'y' will also mention it holds 'c'.

I.e., there will be two parallel locations throughout the entire
function that hold the value of 'c'.  Something like:

f(int x /* but also c */, int y /* but also c */) {
  /* other vars */
  do_something_with(x, ...); // doesn't touch x or y
  do_something_else_with(y, ...); // doesn't touch x or y

Now, what will you get if you 'print c' in the debugger (or if any
other debug info evaluator needs to tell what the value of user
variable c is) at a point within do_something_with(c,...) or
do_something_else_with(c)?
Now consider that f is inlined into the following code:

int g(point2d p) {
  /* lots of code */
  f(p.x, p.y);
  /* more code */
  f(p.y, p.x);
  /* even more code */
}

g gets fully scalarized, so, before inlining, we have:

int g(point2d p) {
  int p$x = p.x, p$y = p.y;
  /* lots of code */
  f(p$x, p$y);
  /* more code */
  f(p$y, p$x);
  /* even more code */
}

after inlining of f, we end up with:

int g(point2d p) {
  int p$x = p.x, p$y = p.y;
  /* lots of code */
  {
    int f()::x.1 /* but also f()::c.1 */ = p$x,
        f()::y.1 /* but also f()::c.1 */ = p$y;
    {
      /* other vars */
      do_something_with(f()::x.1, ...); // doesn't touch x or y
      do_something_else_with(f()::y.1, ...); // doesn't touch x or y
    }
  }
  /* more code */
  {
    int f()::x.2 /* but also f()::c.2 */ = p$y,
        f()::y.2 /* but also f()::c.2 */ = p$x;
    {
      /* other vars */
      do_something_with(f()::x.2, ...); // doesn't touch x or y
      do_something_else_with(f()::y.2, ...); // doesn't touch x or y
    }
  }
  /* even more code */
}

then, we further optimize g and get:

int g(point2d p) {
  int p$x /* but also f()::x.1, f()::c.1, f()::y.2, f()::c.2 */ = p.x;
  int p$y /* but also f()::y.1, f()::c.1, f()::x.2, f()::c.2 */ = p.y;
  /* lots of code */
  {
    {
      /* other vars */
      do_something_with(p$x, ...); // doesn't touch x or y
      do_something_else_with(p$y, ...); // doesn't touch x or y
    }
  }
  /* more code */
  {
    {
      /* other vars */
      do_something_with(p$y, ...); // doesn't touch x or y
      do_something_else_with(p$x, ...); // doesn't touch x or y
    }
  }
  /* even more code */
}

and now, if you try to resolve the variable name 'c' to a location or
a value within any of the occurrences of do_something_*with(), what do
you get?  What ranges do you generate for each of the variables
involved?

>> There isn't any assignment to x or y you could hook your notes to.

> But there are _places_ for x and y.  Those places can and are also
> associated with c.

This just goes to show that there's a fundamental mistake in the
mapping.
Instead of mapping user-level concepts to implementation concepts,
which is what debug information is meant to do, you're mapping
implementation details to user-level concepts.  Unfortunately, this
mapping is not biunivocal.  The chosen representation is
fundamentally lossy.  It can't possibly get you accurate debug
information.  And the above is just an initial example of the loss of
information that will lead to *incorrect* debug information, which is
far worse than *incomplete* information.

>> Even if you were to set up side representations to model the additional
>> variables that end up mapped to the incoming arguments, you'd have 'c'
>> in both, and at the entry point.  How would you tell?

> I don't understand the question.

See the discussion about resolving 'c' above.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}