From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <468D3A87.4020205@redhat.com>
Date: Thu, 05 Jul 2007 18:37:00 -0000
From: Andrew Cagney
To: Mark Wielaard
CC: frysk@sourceware.org
Subject: Re: Breakpoint stepping
References: <1183573205.3598.157.camel@dijkstra.wildebeest.org>
In-Reply-To: <1183573205.3598.157.camel@dijkstra.wildebeest.org>

Mark,

Can I suggest adding this, or something based on it, to
frysk.proc.live.package.html? It's this useful information we should be
accumulating in the source tree.

Andrew

Mark Wielaard wrote:
> As you might have noticed over the last week I have dropped in the new
> breakpoint stepping framework (or hopefully you didn't notice since I
> did try to do it in chunks that all had no detectable functional
> behavior changes). This is an update to my previous general stepping
> overview http://sourceware.org/ml/frysk/2007-q2/msg00283.html
> and a request for review of the next steps with regard to the
> instruction parsers, test framework and what to use as ssol area.
>
> The main difference with the previous stepping overview regarding the
> ptrace state machine is that there is now an explicit Stepping state,
> which is a sub-state of Running and that Running.sendContinue() now not
> only determines how to instruct ptrace to continue the task, but also
> returns a different Running state based on whether or not we asked for a
> continue or step-continue. This makes interpreting the cause of the trap
> event we subsequently expect to get easier. This was previously recorded
> as a field in the Task itself.
>
> With respect to breakpoints the ptrace task states have been adapted to
> all call setupSteppingBreakpoint() after a breakpoint hit has been
> detected, which makes sure the PC is set up correctly (which previously
> could be off by one on some architectures) and to mark the Task as being
> at that particular breakpoint (this is still kept as a Task field for
> now) and to not actually call Breakpoint.prepareStep(), which is now
> handled in Running.sendContinue(). In the previous version when
> reset-breakpoint stepping was used another task could miss the
> breakpoint if the Task moved into a Blocked state before continuing.
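
To make the sendContinue() change concrete, here is a minimal sketch of a resume call that returns the state to wait in. It is not the frysk code: StateSketch, FakePtrace, explainTrap() and the atBreakpoint parameter are names invented for this illustration, and only Running, Stepping and sendContinue() come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are the real frysk classes.
class StateSketch {
    /** Stand-in for the ptrace layer: it only records the requests made. */
    static class FakePtrace {
        final List<String> requests = new ArrayList<>();
        void cont() { requests.add("CONT"); }
        void singleStep() { requests.add("STEP"); }
    }

    /** Running state; Stepping is modelled as a sub-state of it. */
    static class Running {
        /** Resumes the task and returns the state in which to await the trap. */
        Running sendContinue(FakePtrace ptrace, boolean atBreakpoint) {
            if (atBreakpoint) {
                // Stepping over a breakpoint: ask for one instruction only.
                ptrace.singleStep();
                return new Stepping();
            }
            ptrace.cont();
            return new Running();
        }

        /** How a trap event is interpreted while in this state. */
        String explainTrap() { return "breakpoint hit or other trap"; }
    }

    static class Stepping extends Running {
        @Override
        String explainTrap() { return "single step completed"; }
    }

    public static void main(String[] args) {
        FakePtrace ptrace = new FakePtrace();
        Running next = new Running().sendContinue(ptrace, true);
        System.out.println(ptrace.requests + " -> " + next.explainTrap());
    }
}
```

Because the returned state carries the reason for the next trap, nothing has to be stored on the Task to remember whether a step or a continue was requested.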
>
> All the logic of how to breakpoint step has been moved into the
> Breakpoint class that checks the properties of a new Instruction class
> object which is created by the Isa through an instruction parser before
> the breakpoint is inserted at a given location (previously we
> represented instructions just as a byte[]). An Instruction knows how
> long the instruction is, which bytes it represents, whether it can be
> single stepped out of line and how to set that up given the original pc
> location and an alternative address, plus any fixups that are needed to
> the Task afterwards (and it has a notion of whether or not the
> Instruction can be simulated, but that isn't currently used, see below).
> A Breakpoint ties an Instruction to a particular address and Proc (and
> Tasks can have zero or more Breakpoints, they share the same Breakpoint
> on the same address with other Tasks of a Proc and when no Task of a
> Proc has a Breakpoint at a particular address anymore the Breakpoint is
> removed).
>
> For stepping the Breakpoint the Running.sendContinue() method first
> calls Breakpoint.prepareStep(), then signals ptrace to do a single step
> of the Task, putting the Task in Stepping state and then in
> handleTrapEvent() calls Breakpoint.stepDone(). prepareStep() queries the
> capabilities of the Instruction at the pc address and depending on that
> either sets things up for doing a step out of line, simulates the
> instruction (but none of the current Instructions have been set up to do
> simulation yet, and look at the comment in prepareStep() to see what is
> needed to fully enable this option) or resets the current Instruction
> (removing the breakpoint instruction temporarily). Accordingly a
> Breakpoint can be in the state NOT_STEPPING, OUT_OF_LINE_STEPPING,
> SIMULATE_STEPPING or RESET_STEPPING.
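
The capability check in prepareStep() could look roughly like the sketch below. The four state names are the ones listed above. Everything else is assumed for illustration: the PrepareStepSketch wrapper, the two boolean capability fields, and in particular the order of preference (out of line first, then simulation, then reset), which the description does not spell out.

```java
// Hypothetical sketch; not the real frysk Breakpoint or Instruction.
class PrepareStepSketch {
    enum StepState {
        NOT_STEPPING, OUT_OF_LINE_STEPPING, SIMULATE_STEPPING, RESET_STEPPING
    }

    /** Only the two capabilities that prepareStep() needs to ask about. */
    static class Instruction {
        final boolean canExecuteOutOfLine;
        final boolean canSimulate;

        Instruction(boolean canExecuteOutOfLine, boolean canSimulate) {
            this.canExecuteOutOfLine = canExecuteOutOfLine;
            this.canSimulate = canSimulate;
        }
    }

    static class Breakpoint {
        private final Instruction instruction;
        private StepState state = StepState.NOT_STEPPING;

        Breakpoint(Instruction instruction) {
            this.instruction = instruction;
        }

        /** Picks a stepping strategy; the preference order is an assumption. */
        StepState prepareStep() {
            if (state != StepState.NOT_STEPPING) {
                throw new IllegalStateException("already stepping: " + state);
            }
            if (instruction.canExecuteOutOfLine) {
                state = StepState.OUT_OF_LINE_STEPPING;
            } else if (instruction.canSimulate) {
                state = StepState.SIMULATE_STEPPING;
            } else {
                // Fallback: temporarily restore the original instruction bytes.
                state = StepState.RESET_STEPPING;
            }
            return state;
        }

        /** Called once the trap for the single step has arrived. */
        void stepDone() {
            state = StepState.NOT_STEPPING;
        }

        StepState state() {
            return state;
        }
    }

    public static void main(String[] args) {
        Breakpoint bp = new Breakpoint(new Instruction(false, false));
        System.out.println(bp.prepareStep());
        bp.stepDone();
    }
}
```

An Isa whose parser never reports out-of-line capable Instructions always ends up in the last branch, which matches the remark that such an Isa keeps using the old reset-breakpoint stepping.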
>
> In the case of RESET_STEPPING (which was the only option we supported
> before) other Tasks might miss and just run past the Breakpoint during
> the brief period between the reset, step and reinstall.
> Breakpoint.prepareStep() just takes the Instruction bytes and puts them
> at the current pc address, and stepDone() reinstates the breakpoint
> instruction.
>
> When the Instruction supports single step out of line then the
> Breakpoint requests an address in the single step out of line area of
> the Proc and instructs the Instruction to install itself there for the
> current Task by calling Instruction.setupExecuteOutOfLine(). The default
> action of setupExecuteOutOfLine() is to set the pc to the given address
> and place a copy of the instruction bytes there (although this can be
> overridden if an Instruction wants to do something more fancy). When the
> task signals the Breakpoint that a step was taken by calling stepDone(),
> the Breakpoint calls Instruction.fixupExecuteOutOfLine() with the
> original pc and replacement address so any adjustments can be done to
> the Task registers. The default action is to just set the pc to the
> original pc plus the length of the Instruction just stepped. But
> Instructions can override that if more is needed. As an example the RET
> instruction doesn't do any fixup (since the only action is setting the
> pc to the right location in the first place) and the JMP instruction
> sets the pc to original pc plus/minus the difference of the alternate
> address and the pc after the single step. Afterward the Breakpoint
> returns the used address to the Proc so it can be used by other Tasks
> needing to do a single step out of line.
>
> The Proc maintains a single step out of line area pool of addresses that
> point to locations that are at least as big as the largest instruction
> of the instruction set. The Proc gets this list from the Isa the first
> time an address is requested through getOutOfLineAddress().
> Currently this is (for x86 and x86_64) just the address of the main
> function entry point (see below). The address is taken out of the pool
> and the Breakpoint is responsible for putting it back through
> doneOutOfLine (see above). If no address is currently available the call
> blocks until one is available (this was way easier than inventing yet
> another TaskState and getting the communication between Proc and Task
> about this right; contention is very low and the longest it takes for an
> address to become available is one instruction single step).
>
> All the above seems to work nicely for x86 and x86_64 (powerpc hasn't
> been updated, but as long as an Isa doesn't return Instructions through
> an instruction parser that returns single stepping out of line capable
> Instructions the old reset-breakpoint stepping is used) with all the old
> testcases and a couple of new ones but there are 3 areas of improvement
> needed:
>
> - Instruction Parser. The framework is in place and works for the few
> Instructions that are known to the instruction parser, but they are all
> hand coded (see IA32InstructionParser which just handles NOP, INT3, RETQ
> and one JMP variant, the X8664Instruction just delegates to the IA32 for
> now). There don't seem to be libraries available to easily plug in that
> would give us the fixup instructions needed. The best available are the
> kprobes examples from the Linux kernel, which have the drawback that
> they are coded to be intimately tied to the kernel/C way of doing things
> and only handle instructions found in kernel space. For uprobes this
> should have been extended to handle every instruction that can occur in
> user space, but I haven't seen that work yet (and apparently it is only
> available for x86 and no other architecture at this time). Any
> alternatives to look at would be appreciated. Otherwise I need to sit
> down with the various instruction manuals and just code it up by hand.
> (Bonus points for finding something that would not just give us ssol
> fixups but also simulation of instructions when hooked to the registers
> and memory of a Task).
>
> - Testsuite. The best I could come up with for now is TestInstructions
> and funit-instructions.c which is a nice little framework for having
> simple assembly instructions labeled in such a way that an 'instruction
> address path' can be communicated to the Java side so it can do either a
> full step through the program or install breakpoints on every
> instruction and make sure that every step and/or breakpoint hit takes it
> to the right next address. But the assembly part has to be written
> completely by hand (and unsurprisingly the simple assembly program
> currently used works identically on x86 and x86_64). Trying to reuse the
> funit-asm.h generic assembly instructions to make it platform neutral
> was really hard especially since the labels used on the C and inline
> assembly side have to be kept in sync and C defines and asm statements
> don't seem to mix very well. Also at the moment it is just using nops
> and jmps to keep things simple. I guess that in the end, if we have real
> instruction parsers we could try to generate them using that. Ideas or
> experiences with setting something up to track individual instructions
> and getting the right addresses listed would be appreciated.
>
> - Single Step Out Of Line Address Area. Currently the Isa (for x86 and
> x86_64 at least) just provides one address: the address of the main()
> function entry point, taken by just doing:
>
>     Elf elf = new Elf(proc.getExe(), ElfCommand.ELF_C_READ);
>     Dwarf dwarf = new Dwarf(elf, DwarfCommand.READ, null);
>     DwarfDie die = DwarfDie.getDecl(dwarf, "main");
>     return die.getEntryBreakpoints();
>
> This works surprisingly well for a simple first approach, and programs
> generally don't reenter their own main() function.
> But it would be nice to either find an area that is guaranteed to never
> be used (again) by the process, or to map in an executable area in the
> inferior that is just used by us (maybe just making the inferior load a
> dummy shared library). Again any suggestions welcome.
>
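
For the package documentation it may help to show the blocking behaviour of the out-of-line address pool described earlier in the message. The sketch below is one possible way to get it, using a java.util.concurrent.BlockingQueue. This is an assumption, not a claim about how the Proc implements it. The class name OutOfLinePoolSketch, the available() helper and the address 0x400000 are made up, and the list is passed to the constructor rather than fetched lazily from the Isa on first use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a blocking pool of out-of-line addresses.
class OutOfLinePoolSketch {
    private final BlockingQueue<Long> free;

    OutOfLinePoolSketch(List<Long> addresses) {
        this.free = new LinkedBlockingQueue<>(addresses);
    }

    /** Takes an address out of the pool, blocking until one is free. */
    long getOutOfLineAddress() {
        try {
            return free.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Puts an address back so another Task can step out of line. */
    void doneOutOfLine(long address) {
        free.add(address);
    }

    /** Number of addresses currently free (only for the demo and tests). */
    int available() {
        return free.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // 0x400000 is a placeholder, not a real main() entry address.
        OutOfLinePoolSketch pool = new OutOfLinePoolSketch(Arrays.asList(0x400000L));
        long address = pool.getOutOfLineAddress();

        Thread secondTask = new Thread(() -> {
            long a = pool.getOutOfLineAddress(); // waits for the address below
            System.out.println("second task got 0x" + Long.toHexString(a));
            pool.doneOutOfLine(a);
        });
        secondTask.start();

        pool.doneOutOfLine(address); // first task finished its single step
        secondTask.join();
    }
}
```

With a single address in the pool a second Task simply waits for the duration of one single step, which is the low-contention case the message describes.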