From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-22559-listarch-gdb=sources.redhat.com@sources.redhat.com>
Received: (qmail 26045 invoked by alias); 16 Sep 2005 12:03:50 -0000
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb/>
List-Post: <mailto:gdb@sources.redhat.com>
List-Help: <mailto:gdb-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-owner@sources.redhat.com
Received: (qmail 25813 invoked by uid 22791); 16 Sep 2005 12:03:33 -0000
Received: from [220.225.32.98] (HELO calvin.codito.co.in) (220.225.32.98)
    by sourceware.org (qpsmtpd/0.30-dev) with ESMTP; Fri, 16 Sep 2005 12:03:33 +0000
Received: from zirakzigil.codito.co.in (zirakzigil.codito.co.in [192.168.100.106])
	by calvin.codito.co.in (8.13.2/8.13.2/Debian-1) with ESMTP id j8GC32M1017281;
	Fri, 16 Sep 2005 17:33:02 +0530
Subject: Re: Using reverse execution
From: Ramana Radhakrishnan <ramana.radhakrishnan@codito.com>
Reply-To: ramana.radhakrishnan@codito.com
To: Stan Shebs <shebs@apple.com>
Cc: gdb@sources.redhat.com
In-Reply-To: <432628AA.2040808@apple.com>
References: <432628AA.2040808@apple.com>
Content-Type: text/plain
Date: Fri, 16 Sep 2005 12:03:00 -0000
Message-Id: <1126871774.9129.113.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
X-SW-Source: 2005-09/txt/msg00114.txt.bz2

Hi,

I remember us having quite a bit of discussion on this when LIZARD was
happening here at Codito. The features you are discussing here resemble
some of the work we did back then. It was an effort with gdb and Linux
to get gdb to do some reverse execution.

I don't know if you saw this too. 

You can check the Lizard site at http://lizard.sourceforge.net/

The features are listed here:

http://lizard.sourceforge.net/features.html


and there is a mailing list at 
http://sourceforge.net/mailarchive/forum.php?forum=lizard-hackers

cheers
Ramana


On Mon, 2005-09-12 at 18:17 -0700, Stan Shebs wrote:
> Hi all, I've been spending some time recently on the reverse execution
> idea that's been getting some airplay since May or so
> (http://sources.redhat.com/ml/gdb/2005-05/msg00145.html and
> subsequent), and would like to bring people up to date on some of
> my thinking, plus solicit ideas.
> 
> The context is Darwin aka Mac OS X native debugging (surprise), and
> the idea is to make it something that any application developer using
> the Xcode environment could use. There are of course a hundred
> practical difficulties (how do you un-execute an IP packet send?  how
> do you reverse a billion instructions executed on multi-GHz
> hardware?), so this is more of a research project at this point; a
> real-life usable implementation probably entails extensive kernel
> hacking, but right now we don't know enough even to tell kernel people
> what facilities we want them to add. In this message, I'm going to
> focus on the user model, without trying to tie things down to a
> specific implementation.
> 
> So my big question is: what is reverse execution good for? Thinking
> about some of the difficulties I allude to above, it's easy to
> conclude that maybe reverse execution is just "party tricks" - an
> impressive demo perhaps, but not a feature that real-life users would
> ever adopt.  Since the May discussion, I've been watching myself while
> debugging other things (like a GDB merge :-) ), and asking "if I had
> reverse execution, when would I use it?".
> 
> The thing that jumped out at me most strongly was reverse execution as
> an "undo" facility.
> 
> For instance, when stepping through unfamiliar and complicated code,
> it's very common to "next" over an apparently innocuous function, then
> say "oh sh*t" - your data has magically changed for the worse and so
> the function you nexted over must be the guilty party. But it's often
> too late - you've passed by the key bit of wrong code, and need to
> re-run. Much of the time this is OK, and only takes a moment; but if
> your application is complicated (like an iTunes), or if you have a
> complicated sequence of breaks and continues and user input to get to
> the point of interest, re-running starts to get slow and
> error-prone. You may also have a situation where the bug is
> intermittent and not guaranteed to appear on rerunning (if that sounds
> familiar, hold the thought for a moment). So in these kinds of cases,
> what you really want is to undo that last "next" so you can do a "step"
> and descend into the function of unexpected interest.
> 
> A similar case might occur with single-stepping through a series of
> calculations. I suspect everybody has at one time or another stepped
> over something like "a *= b;", printed the value of a only to get a
> bogus value, then either mentally calculated a / b or typed in "p a /
> b", as a quick way to recover the last value of a.  It would have been
> easier and faster just to back up by one line and print (or watch)
> a. If the calculation is complicated, the manual un-calculate exposes
> the user to blind alleys if the calculation was mistaken. For instance,
> if you try to manually undo some pointer arithmetic, you might
> mentally adjust by chars when you should be adjusting by ints, and
> then be misled because you think that the bug is that the program is
> writing bad stuff into memory, when it's the pointer value that's
> mistaken.
> 
> The key tradeoff for reverse execution as an undo facility is the
> complexity of rerunning. If rerunning is a cheap part of the debugging session,
> then the undo facility is not going to seem that important.
> 
> Another use for reverse execution is a more general form of zeroing in
> on a bug. Suppose you have a bogus pointer that was supposedly
> returned by malloc() somewhere earlier in the program's
> execution. That pointer may sit in a named variable only briefly, and the
> rest of the time it is wandering around in various other data structures. There's
> no single thing to watch in this case; it's not the memory being
> pointed to that's the problem, it's that the pointer itself goes "off
> the reservation". So what you want to do is start from the point at
> which you've discovered the pointer is bad (segfault perhaps), watch
> the location holding the bad pointer, and run backwards until either
> a) somebody modifies the pointer in place, or b) the bogus pointer is
> copied from elsewhere, in which case you watch it and continue
> backwards. In many cases you'll get to the bad code sooner than by
> running forwards, where you have to anticipate which malloc will
> produce the later-to-be-scrambled pointer, and generally trace along
> trying to anticipate who's going to do what to it before the bad thing
> happens. (The display of ignored breakpoint hits was introduced as a
> way to speed this up some.) Again, as with undo, the efficiency of
> this process vs re-running depends on whether the actual bug occurs
> closer to the beginning of execution, or closer to the point of
> failure. One could make an argument that most root-cause bugs tend to
> occur closer to failure points than to the beginning of program
> execution, but that's kind of a philosophical point about program
> design for which I have no concrete evidence.
> 
> Then there is stepping backwards by instructions to retrace what is
> happening at the machine level. I'm less inclined to say this is
> valuable; picking apart registers and raw memory is a rather
> painstaking activity, so slow (at the human level) that the time to
> re-run up to the line in question is usually negligible by
> comparison. Even so, I can see it becoming very natural for a user to
> do a step, see bogus data that simply can't be explained by the source
> line on the screen, do a reverse-step and then multiple stepi's to
> "slo-mo" the calculations of that line's compiled code.
> 
> I touched on hard-to-repeat cases briefly above - GDB mavens will
> recognize this as one of the rationales for the tracepoint facility.
> Reverse execution is similar in that once you've gotten the program
> into a state where a problem manifests, you want to poke around in the
> program's immediate past states. Tracepoints however are designed such
> that the user needs to anticipate what data will be interesting;
> sensible in a decoupled remote debugging context, but not so good for
> the data-driven spur-of-the-moment experimentation that is part of a
> productive debugging session. So a working reverse execution gives the
> user freedom to look around a program's entire state while moving up
> and down along the flow of execution. (Ironically, this capability
> might work against good program design, in that it takes away some
> incentive to design a program with repeatable behavior. For instance,
> programs using a random number generator often include machinery to
> display and input RNG seeds, one of the uses being to guarantee
> predictability while re-running under a debugger.)
> 
> But will users actually use any of this in real life? "Undo" is pretty
> easy - everybody understands "undo", even nonprogrammers, with many
> GUIs giving it a dedicated keystroke. Tracking data backwards through
> a program is a powerful tool for a tough class of bugs, but as we know
> from past experience, powerful features that are at all hard to use
> are often ignored. Single-instruction reverse stepping is conceptually
> simpler, but likely to see more interest from low-level
> developers, and may only be interesting if available for kernel
> debugging and the like. Reproducibility problems crop up regularly, so
> I can see people wanting to use reverse execution after a breakpoint
> sets them down in rarely-executed code.
> 
> Once we have an idea of what we think users will want from the
> feature, we'll have a better idea of what characteristics and
> limitations might be acceptable in an implementation.
> 
> Stan
> 
>
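
The "a *= b" case from the quoted message can be sketched with a toy
undo log. This is only an illustration under stated assumptions: the
UndoLog class and its step and reverse_step methods are hypothetical
names invented here, not part of GDB or Lizard, and a real debugger
would have to log machine-level state rather than a Python dict. The
sketch picks b = 0 on purpose, since that is a case where typing
"p a / b" cannot recover the old value of a but a logged value can.

```python
class UndoLog:
    """Toy variable store that records old values so writes can be undone.

    Hypothetical illustration only; not how any real debugger is built.
    """

    def __init__(self, **initial):
        self.vars = dict(initial)
        self.history = []  # one list of (name, existed, old_value) per step

    def step(self, updates):
        """Apply one source line's worth of writes, logging what they overwrite."""
        record = []
        for name, value in updates.items():
            record.append((name, name in self.vars, self.vars.get(name)))
            self.vars[name] = value
        self.history.append(record)

    def reverse_step(self):
        """Undo the most recent step by restoring the logged values."""
        if not self.history:
            raise RuntimeError("no earlier state recorded")
        for name, existed, old in reversed(self.history.pop()):
            if existed:
                self.vars[name] = old
            else:
                del self.vars[name]


state = UndoLog(a=7, b=0)
state.step({"a": state.vars["a"] * state.vars["b"]})  # models "a *= b"
after = state.vars["a"]    # value seen after stepping over the line
state.reverse_step()       # back up by one line instead of computing a / b
before = state.vars["a"]   # the value a held before the multiplication
```

The design choice in this sketch is to log the overwritten values rather
than try to invert the arithmetic, so the undo works even when the
operation itself loses information.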