From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <frysk-return-2003-listarch-frysk=sources.redhat.com@sourceware.org>
Received: (qmail 22395 invoked by alias); 5 Jul 2007 18:37:59 -0000
Received: (qmail 22387 invoked by uid 22791); 5 Jul 2007 18:37:58 -0000
X-Spam-Status: No, hits=-2.5 required=5.0 	tests=AWL,BAYES_00,DK_POLICY_SIGNSOME,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (66.187.233.31)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Thu, 05 Jul 2007 18:37:53 +0000
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) 	by mx1.redhat.com (8.13.1/8.13.1) with ESMTP id l65IbnJU028530; 	Thu, 5 Jul 2007 14:37:49 -0400
Received: from pobox.corp.redhat.com (pobox.corp.redhat.com [10.11.255.20]) 	by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l65IbnkS016177; 	Thu, 5 Jul 2007 14:37:49 -0400
Received: from [127.0.0.1] (sebastian-int.corp.redhat.com [172.16.52.221]) 	by pobox.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l65Iblim016062; 	Thu, 5 Jul 2007 14:37:48 -0400
Message-ID: <468D3A87.4020205@redhat.com>
Date: Thu, 05 Jul 2007 18:37:00 -0000
From: Andrew Cagney <cagney@redhat.com>
User-Agent: Thunderbird 1.5.0.12 (X11/20070530)
MIME-Version: 1.0
To: Mark Wielaard <mark@klomp.org>
CC: frysk@sourceware.org
Subject: Re: Breakpoint stepping
References: <1183573205.3598.157.camel@dijkstra.wildebeest.org>
In-Reply-To: <1183573205.3598.157.camel@dijkstra.wildebeest.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact frysk-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <frysk.sourceware.org>
List-Subscribe: <mailto:frysk-subscribe@sourceware.org>
List-Post: <mailto:frysk@sourceware.org>
List-Help: <mailto:frysk-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: frysk-owner@sourceware.org
X-SW-Source: 2007-q3/txt/msg00038.txt.bz2

Mark,

Can I suggest adding this, or something based on it, to
frysk.proc.live.package.html?  It's this kind of useful information that
we should be accumulating in the source tree.

Andrew

Mark Wielaard wrote:
> As you might have noticed over the last week I have dropped in the new
> breakpoint stepping framework (or hopefully you didn't notice since I
> did try to do it in chunks that all had no detectable functional
> behavior changes). This is an update to my previous general stepping
> overview http://sourceware.org/ml/frysk/2007-q2/msg00283.html
> and a request for review of the next steps with regard to the
> instruction parsers, testframework and what to use as ssol area.
>
> The main difference with the previous stepping overview regarding the
> ptrace state machine is that there is now an explicit Stepping state,
> which is a sub-state of Running and that Running.sendContinue() now not
> only determines how to instruct ptrace to continue the task, but also
> returns a different Running state based on whether or not we asked for a
> continue or step-continue. This makes interpreting the cause of the trap
> event we subsequently expect to get easier. This was previously recorded
> as a field in the Task itself.
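
A minimal sketch of the idea in the paragraph above: sendContinue() picks the ptrace request and returns the state that matches it, so the later trap can be interpreted from the state alone. Only Running, Stepping and sendContinue() come from the text; the Request enum, the log parameter, the subclass modelling and the returned strings are assumptions made up for illustration, not frysk's actual code.

```java
// Hypothetical sketch; not frysk's real TaskState classes.
final class SteppingStateSketch {
    /** Stand-in for the ptrace request that would be issued. */
    enum Request { CONTINUE, STEP }

    /** The task was resumed with a plain continue. */
    static class Running {
        /**
         * Chooses how to resume the task and returns the state that
         * records that choice. The log stands in for the ptrace call.
         */
        Running sendContinue(boolean atBreakpoint, java.util.List<Request> log) {
            if (atBreakpoint) {
                log.add(Request.STEP);      // must step over the breakpoint
                return new Stepping();
            }
            log.add(Request.CONTINUE);
            return new Running();
        }

        /** A trap while plainly running is not the end of a step. */
        String handleTrapEvent() {
            return "breakpoint hit or signal";
        }
    }

    /** Sub-state of Running: the task was resumed with a single step. */
    static class Stepping extends Running {
        @Override
        String handleTrapEvent() {
            return "step completed";
        }
    }
}
```

With this shape no extra field on the Task is needed to remember whether a step was requested; the type of the current state carries it.
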
>
> With respect to breakpoints the ptrace task states have been adapted to
> all call setupSteppingBreakpoint() after a breakpoint hit has been
> detected, which makes sure the PC is set up correctly (which previously
> could be off by one on some architectures) and to mark the Task as being
> at that particular breakpoint (this is still kept as Task field for now)
> and to not actually call Breakpoint.prepareStep(), which is now handled
> in Running.sendContinue(). In the previous version when reset-breakpoint
> stepping was used another task could miss the breakpoint if the Task
> moved into a Blocked state before continuing.
>
> All the logic of how to breakpoint step has been moved into the
> Breakpoint class that checks the properties of a new Instruction class
> object which is created by the Isa through an instruction parser before
> the breakpoint is inserted at a given location (previously we
> represented instructions just as a byte[]). An Instruction knows how
> long the instruction is, which bytes it represents, whether it can be
> single stepped out of line and how to set that up given the original pc
> location and an alternative address, plus any fixups that are needed to
> the Task afterwards (and it has a notion of whether or not the
> Instruction can be simulated, but that isn't currently used, see below).
> A Breakpoint ties an Instruction to a particular address and Proc (and
> Tasks can have zero or more Breakpoints; they share the same Breakpoint
> on the same address with other Tasks of a Proc, and when no Task of a
> Proc has a Breakpoint at a particular address anymore the Breakpoint is
> removed).
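
The sharing rule at the end of the paragraph above (one Breakpoint per address per Proc, removed when the last Task lets go) could be sketched as below. The class name, the use of integer task ids and the boolean return values are all assumptions for illustration; frysk's own bookkeeping may look quite different.

```java
// Hypothetical sketch of per-Proc breakpoint sharing; not frysk's API.
final class BreakpointTableSketch {
    // address -> ids of the tasks currently using a breakpoint there
    private final java.util.Map<Long, java.util.Set<Integer>> users =
        new java.util.HashMap<>();

    /** Returns true when this is the first user, i.e. the breakpoint
     *  instruction would have to be inserted into memory now. */
    boolean add(long address, int taskId) {
        java.util.Set<Integer> tasks = users.get(address);
        boolean first = (tasks == null);
        if (first) {
            tasks = new java.util.HashSet<>();
            users.put(address, tasks);
        }
        tasks.add(taskId);
        return first;
    }

    /** Returns true when the last user is gone, i.e. the original
     *  instruction bytes would have to be restored now. */
    boolean remove(long address, int taskId) {
        java.util.Set<Integer> tasks = users.get(address);
        if (tasks == null || !tasks.remove(taskId)) {
            return false;               // this task had no breakpoint here
        }
        if (tasks.isEmpty()) {
            users.remove(address);
            return true;
        }
        return false;
    }
}
```
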
>
> For stepping the Breakpoint the Running.sendContinue() method first
> calls Breakpoint.prepareStep(), then signals ptrace to do a single step
> of the Task, putting the Task in Stepping state and then in
> handleTrapEvent() calls Breakpoint.stepDone(). prepareStep() queries the
> capabilities of the Instruction at the pc address and depending on that
> either sets things up for doing a step out of line, simulates the
> instruction (but none of the current Instructions have been set up to do
> simulation yet; see the comment in prepareStep() for what is
> needed to fully enable this option) or resets the current Instruction
> (removing the breakpoint instruction temporarily). Accordingly a
> Breakpoint can be in the state NOT_STEPPING, OUT_OF_LINE_STEPPING,
> SIMULATE_STEPPING or RESET_STEPPING.
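
A rough sketch of how prepareStep() might pick one of those states from the Instruction's capabilities. The four state names come from the text; the Instruction fields, the chooseStepState() helper and the order of preference (out of line, then simulate, then reset as the fallback) are assumptions for illustration only.

```java
// Hypothetical sketch; field and method names are made up.
final class StepModeSketch {
    enum StepState {
        NOT_STEPPING, OUT_OF_LINE_STEPPING, SIMULATE_STEPPING, RESET_STEPPING
    }

    /** Just the two capabilities that matter for the decision. */
    static final class Instruction {
        final boolean canExecuteOutOfLine;
        final boolean canSimulate;

        Instruction(boolean canExecuteOutOfLine, boolean canSimulate) {
            this.canExecuteOutOfLine = canExecuteOutOfLine;
            this.canSimulate = canSimulate;
        }
    }

    /** Assumed preference: out of line, then simulation, then reset. */
    static StepState chooseStepState(Instruction insn) {
        if (insn.canExecuteOutOfLine) {
            return StepState.OUT_OF_LINE_STEPPING;
        }
        if (insn.canSimulate) {
            return StepState.SIMULATE_STEPPING;
        }
        // Always possible, but other Tasks can miss the breakpoint.
        return StepState.RESET_STEPPING;
    }
}
```
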
>
> In the case of RESET_STEPPING (which was the only option we supported
> before) other Tasks might miss and just run past the Breakpoint during
> the brief period between the reset, step and reinstall.
> Breakpoint.prepareStep() just takes the Instruction bytes and puts them
> at the current pc address, and stepDone() reinstates the breakpoint
> instruction.
>
> When the Instruction supports single step out of line, the
> Breakpoint requests an address in the single step out of line area of
> the Proc and instructs the Instruction to install itself there for the
> current Task by calling Instruction.setupExecuteOutOfLine(). The default
> action of setupExecuteOutOfLine() is to set the pc to the given address
> and place a copy of the instruction bytes there (although this can be
> overridden if an Instruction wants to do something more fancy). When the
> task signals the Breakpoint that a step was taken by calling stepDone(),
> the Breakpoint calls Instruction.fixupExecuteOutOfLine() with the
> original pc and replacement address so any adjustments can be done to
> the Task registers. The default action is to just set the pc to the
> original pc plus the length of the Instruction just stepped. But
> Instructions can override that if more is needed. As an example the RET
> instruction doesn't do any fixup (since the only action is setting the
> pc to the right location in the first place) and the JMP instruction
> sets the pc to original pc plus/minus the difference of the alternate
> address and the pc after the single step. Afterward the Breakpoint
> returns the used address to the Proc so it can be used by other Tasks
> needing to do a single step out of line.
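
The three fixup cases described above reduce to a little address arithmetic, sketched here. The method names and the plain long addresses are assumptions for illustration; the real fixupExecuteOutOfLine() works on a Task's registers rather than returning a value.

```java
// Hypothetical sketch of the pc fixups after a step out of line.
final class FixupSketch {
    /** Default: continue right after the original instruction. */
    static long fixupDefault(long originalPc, int instructionLength) {
        return originalPc + instructionLength;
    }

    /** RET: the pc was loaded from the stack, so it is already right. */
    static long fixupRet(long pcAfterStep) {
        return pcAfterStep;
    }

    /**
     * Relative JMP: the displacement was applied relative to the
     * alternate address, so carry the same offset back to the original.
     */
    static long fixupRelativeJmp(long originalPc, long alternateAddress,
                                 long pcAfterStep) {
        return originalPc + (pcAfterStep - alternateAddress);
    }
}
```

For example, a 2-byte relative JMP with displacement +0x10, originally at 0x1000 and stepped at 0x2000, leaves the pc at 0x2012; the fixup moves it to 0x1012, which is where the jump would have landed in place.
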
>
> The Proc maintains a single step out of line area pool of addresses that
> point to locations that are at least as big as the largest instruction
> of the instruction set. The Proc gets this list from the Isa the first
> time an address is requested through getOutOfLineAddress(). Currently
> this is (for x86 and x86_64) just the address of the main function entry
> point (see below). The address is taken out of the pool and the
> Breakpoint is responsible for putting it back through doneOutOfLine (see
> above). If no address is currently available the call blocks until one
> is available (this was way easier than inventing yet another TaskState
> and getting the communication between Proc and Task about this right;
> contention is very low, and the longest it takes for an address to
> become available is one instruction single step).
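
The blocking pool described above could be sketched with a standard blocking queue. getOutOfLineAddress() and doneOutOfLine() are the names used in the text; the class name, the constructor, available() and the choice of java.util.concurrent (rather than whatever the Proc really uses) are assumptions for illustration.

```java
// Hypothetical sketch of the single step out of line address pool.
final class OutOfLinePoolSketch {
    private final java.util.concurrent.BlockingQueue<Long> free =
        new java.util.concurrent.LinkedBlockingQueue<>();

    /** In the text the addresses come from the Isa on first use. */
    OutOfLinePoolSketch(long... addresses) {
        for (long address : addresses) {
            free.add(address);
        }
    }

    /** Takes an address out of the pool, blocking until one is free. */
    long getOutOfLineAddress() {
        try {
            return free.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Puts an address back so another Task can step out of line. */
    void doneOutOfLine(long address) {
        free.add(address);
    }

    /** Number of addresses currently free (for inspection only). */
    int available() {
        return free.size();
    }
}
```
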
>
> All the above seems to work nicely for x86 and x86_64 (powerpc hasn't
> been updated, but as long as an Isa doesn't return Instructions through
> an instruction parser that returns single stepping out of line capable
> Instructions the old reset-breakpoint stepping is used) with all the old
> testcases and a couple of new ones but there are 3 areas of improvement
> needed:
>
> - Instruction Parser.  The framework is in place and works for the few
> Instructions that are known to the instruction parser, but they are all
> hand coded (see IA32InstructionParser which just handles NOP, INT3, RETQ
> and one JMP variant, the X8664Instruction just delegates to the IA32 for
> now). There don't seem to be libraries available to easily plug in that
> would give us the fixup instructions needed. The best available is the
> kprobes examples from the linux kernel which have as drawback that they
> are coded to be intimately tied to the kernel/C way of doing things and
> only handle instructions found in kernel space. For uprobes this should
> have been extended to handle every instruction that can occur in user
> space, but I haven't seen that work yet (and apparently it is only
> available for x86 and no other architecture at this time). Any
> alternatives to look at would be appreciated. Otherwise I need to sit
> down with the various instruction manuals and just code it up by hand.
> (Bonus points for finding something that would not just give us ssol
> fixups but also simulation of instructions when hooked to the registers
> and memory of a Task).
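
To give an idea of what "hand coded" means here, a tiny decoder for the four instructions mentioned might look like the sketch below. It is not IA32InstructionParser: the class and field names are made up, and the short JMP (opcode 0xEB with an 8-bit displacement) is only a guess at which JMP variant is meant. The opcode values themselves are the standard x86 encodings.

```java
// Hypothetical sketch of a hand-coded x86 instruction decoder.
final class TinyX86ParserSketch {
    static final class Insn {
        final String name;
        final byte[] bytes;     // the bytes the instruction occupies

        Insn(String name, byte[] code, int offset, int length) {
            this.name = name;
            this.bytes = java.util.Arrays.copyOfRange(code, offset, offset + length);
        }

        int length() {
            return bytes.length;
        }
    }

    /** Decodes one instruction at offset; null if it is not known. */
    static Insn parse(byte[] code, int offset) {
        int opcode = code[offset] & 0xff;
        switch (opcode) {
        case 0x90:
            return new Insn("NOP", code, offset, 1);
        case 0xCC:
            return new Insn("INT3", code, offset, 1);
        case 0xC3:
            return new Insn("RET", code, offset, 1);
        case 0xEB:
            // JMP rel8: opcode plus one displacement byte.
            if (offset + 1 >= code.length) {
                return null;    // truncated
            }
            return new Insn("JMP rel8", code, offset, 2);
        default:
            return null;        // caller falls back to reset stepping
        }
    }
}
```

Every further instruction needs another case plus its own fixup rule, which is why a reusable decoder library would be so welcome.
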
>
> - Testsuite.  The best I could come up with for now is TestInstructions
> and funit-instructions.c which is a nice little framework for having
> simple assembly instructions labeled in such a way that an 'instruction
> address path' can be communicated to the java side so it can do either a
> full step through the program or install breakpoints on every
> instruction and make sure that every step and/or breakpoint hit takes it
> to the right next address. But the assembly part has to be written
> completely by hand (and unsurprisingly the simple assembly program
> currently used works identically on x86 and x86_64). Trying to reuse the
> funit-asm.h generic assembly instructions to make it platform neutral
> was really hard, especially since the labels used on the C and inline
> assembly side have to be kept in sync and C defines and asm
> statements don't seem to mix very well. Also at the moment it is just
> using nops and jmps to keep things simple. I guess that in the end, if
> we have real instruction parsers we could try to generate them using
> that. Ideas or experiences with setting something up to track individual
> instructions and getting the right addresses listed appreciated.
>
> - Single Step Out Of Line Address Area.  Currently the Isa (for x86 and
> x86_64 at least) just provides one address: the address of the main()
> function entry point, taken by just doing:
>
>         Elf elf = new Elf(proc.getExe(), ElfCommand.ELF_C_READ);
>         Dwarf dwarf = new Dwarf(elf, DwarfCommand.READ, null);
>         DwarfDie die = DwarfDie.getDecl(dwarf, "main");
>         return die.getEntryBreakpoints();
>
> This works surprisingly well for a simple first approach, and programs
> generally don't reenter their own main() function. But it would be nice
> to either find an area that is guaranteed to never be used (again) by
> the process, or to map in an executable area in the inferior that is
> just used by us (maybe just making the inferior load a dummy shared
> library). Again any suggestions welcome.
>