Subject: Re: Using reverse execution
From: Ramana Radhakrishnan
Reply-To: ramana.radhakrishnan@codito.com
To: Stan Shebs
Cc: gdb@sources.redhat.com
In-Reply-To: <432628AA.2040808@apple.com>
References: <432628AA.2040808@apple.com>
Date: Fri, 16 Sep 2005 12:03:00 -0000
Message-Id: <1126871774.9129.113.camel@localhost.localdomain>

Hi,

I remember us having quite a bit of discussion on this when LIZARD
was happening here at Codito. The features you are discussing here
resemble some of the work we did back then. It was an effort with gdb
and Linux to get gdb to do some reverse execution. I don't know if you
saw this too.
You can check the Lizard site at http://lizard.sourceforge.net/
The features are here: http://lizard.sourceforge.net/features.html
and there is a mailing list at
http://sourceforge.net/mailarchive/forum.php?forum=lizard-hackers

cheers
Ramana

On Mon, 2005-09-12 at 18:17 -0700, Stan Shebs wrote:
> Hi all, I've been spending some time recently on the reverse execution
> idea that's been getting some airplay since May or so
> (http://sources.redhat.com/ml/gdb/2005-05/msg00145.html and
> subsequent), and would like to bring people up to date on some of
> my thinking, plus solicit ideas.
>
> The context is Darwin aka Mac OS X native debugging (surprise), and
> the idea is to make it something that any application developer using
> the Xcode environment could use. There are of course a hundred
> practical difficulties (how do you un-execute an IP packet send? how
> do you reverse a billion instructions executed on multi-GHz
> hardware?), so this is more of a research project at this point; a
> real-life usable implementation probably entails extensive kernel
> hacking, but right now we don't know enough even to tell kernel people
> what facilities we want them to add. In this message, I'm going to
> focus on the user model, without trying to tie things down to a
> specific implementation.
>
> So my big question is: what is reverse execution good for? Thinking
> about some of the difficulties I allude to above, it's easy to
> conclude that maybe reverse execution is just "party tricks" - an
> impressive demo perhaps, but not a feature that real-life users would
> ever adopt. Since the May discussion, I've been watching myself while
> debugging other things (like a GDB merge :-) ), and asking "if I had
> reverse execution, when would I use it?".
>
> The thing that jumped out at me most strongly was reverse execution as
> an "undo" facility.
>
> For instance, when stepping through unfamiliar and complicated code,
> it's very common to "next" over an apparently innocuous function, then
> say "oh sh*t" - your data has magically changed for the worse and so
> the function you nexted over must be the guilty party. But it's often
> too late - you've passed by the key bit of wrong code, and need to
> re-run. Much of the time this is OK, and only takes a moment; but if
> your application is complicated (like an iTunes), or if you have a
> complicated sequence of breaks and continues and user input to get to
> the point of interest, re-running starts to get slow and
> error-prone. You may also have a situation where the bug is
> intermittent and not guaranteed to appear on rerunning (if that sounds
> familiar, hold the thought for a moment). So in these kinds of cases,
> what you really want is to undo that last "next" so you can do a "step"
> and descend into the function of unexpected interest.
>
> A similar case might occur with single-stepping through a series of
> calculations. I suspect everybody has at one time or another stepped
> over something like "a *= b;", printed the value of a only to get a
> bogus value, then either mentally calculated a / b or typed in "p a /
> b", as a quick way to recover the last value of a. It would have been
> easier and faster just to back up by one line and print (or watch)
> a. If the calculation is complicated, the manual un-calculate exposes
> the user to blind alleys if the calculation was mistaken. For instance,
> if you try to manually undo some pointer arithmetic, you might
> mentally adjust by chars when you should be adjusting by ints, and
> then be misled because you think that the bug is that the program is
> writing bad stuff into memory, when it's the pointer value that's
> mistaken.
>
> The key tradeoff for reverse execution as an undo facility is complexity
> of rerunning.
> If rerunning is a cheap part of the debugging session,
> then the undo facility is not going to seem that important.
>
> Another use for reverse execution is a more general form of zeroing in
> on a bug. Suppose you have a bogus pointer that was supposedly
> returned by malloc() somewhere earlier in the program's
> execution. That pointer may only sit in a named variable part of the
> time, and the rest of the time it is wandering around in various other
> data structures. There's
> no single thing to watch in this case; it's not the memory being
> pointed to that's the problem, it's that the pointer itself goes "off
> the reservation". So what you want to do is start from the point at
> which you've discovered the pointer is bad (segfault perhaps), watch
> the location holding the bad pointer, and run backwards until either
> a) somebody modifies the pointer in place, or b) the bogus pointer is
> copied from elsewhere, in which case you watch it and continue
> backwards. In many cases you'll get to the bad code sooner than by
> running forwards, where you have to anticipate which malloc will
> produce the later-to-be-scrambled pointer, and generally trace along
> trying to anticipate who's going to do what to it before the bad thing
> happens. (The display of ignored breakpoint hits was introduced as a
> way to speed this up some.) Again, as with undo, the efficiency of
> this process vs re-running depends on whether the actual bug occurs
> closer to the beginning of execution, or closer to the point of
> failure. One could make an argument that most root-cause bugs tend to
> occur closer to failure points than to the beginning of program
> execution, but that's kind of a philosophical point about program
> design for which I have no concrete evidence.
>
> Then there is stepping backwards by instructions to retrace what is
> happening at the machine level.
> I'm less inclined to say this is
> valuable; picking apart registers and raw memory is a rather
> painstaking activity, so slow (at the human level), that the time to
> re-run up to the line in question is usually negligible by
> comparison. Even so, I can see it becoming very natural for a user to
> do a step, see bogus data that simply can't be explained by the source
> line on the screen, do a reverse-step and then multiple stepi's to
> "slo-mo" the calculations of that line's compiled code.
>
> I touched on hard-to-repeat cases briefly above - GDB mavens will
> recognize this as one of the rationales for the tracepoint facility.
> Reverse execution is similar in that once you've gotten the program
> into a state where a problem manifests, you want to poke around in the
> program's immediate past states. Tracepoints however are designed such
> that the user needs to anticipate what data will be interesting;
> sensible in a decoupled remote debugging context, but not so good for
> the data-driven spur-of-the-moment experimentation that is part of a
> productive debugging session. So a working reverse execution gives the
> user freedom to look around a program's entire state while moving up
> and down along the flow of execution. (Ironically, this capability
> might work against good program design, in that it takes away some
> incentive to design a program with repeatable behavior. For instance,
> programs using random number generators often include machinery to
> display and input RNG seeds, one of the uses being to guarantee
> predictability while re-running under a debugger.)
>
> But will users actually use any of this in real life? "Undo" is pretty
> easy - everybody understands "undo", even nonprogrammers, with many
> GUIs giving it a dedicated keystroke. Tracking data backwards through
> a program is a powerful tool for a tough class of bugs, but as we know
> from past experience, powerful features that are at all hard to use
> are often ignored.
> Single-instruction reverse stepping is conceptually
> simpler, but likely to see more interest from the low-level
> developers, and may only be interesting if available for kernel
> debugging and the like. Reproducibility problems crop up regularly, so
> I can see people wanting to use reverse execution after a breakpoint
> sets them down in rarely-executed code.
>
> Once we have an idea of what we think users will want from the
> feature, we'll have a better idea of what characteristics and
> limitations might be acceptable in an implementation.
>
> Stan