From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <frysk-return-964-listarch-frysk=sources.redhat.com@sourceware.org>
Received: (qmail 25524 invoked by alias); 28 Sep 2006 14:26:15 -0000
Received: (qmail 25511 invoked by uid 22791); 28 Sep 2006 14:26:14 -0000
X-Spam-Status: No, hits=-1.0 required=5.0 	tests=AWL,BAYES_50,FORGED_RCVD_HELO
X-Spam-Check-By: sourceware.org
Received: from wildebeest.demon.nl (HELO gnu.wildebeest.org) (83.160.170.119)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Thu, 28 Sep 2006 14:26:09 +0000
Received: from dijkstra.wildebeest.org ([192.168.1.29]) 	by gnu.wildebeest.org with esmtp (Exim 3.36 #1 (Debian)) 	id 1GSwpx-0000IT-00 	for <frysk@sourceware.org>; Thu, 28 Sep 2006 16:26:05 +0200
Subject: From breakpoint addresses to source line stepping
From: Mark Wielaard <mark@klomp.org>
To: frysk@sourceware.org
Content-Type: text/plain
Date: Thu, 28 Sep 2006 15:54:00 -0000
Message-Id: <1159453562.3034.2.camel@dijkstra.wildebeest.org>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3 (2.6.3-1.fc5.5)
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact frysk-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:frysk-subscribe@sourceware.org>
List-Post: <mailto:frysk@sourceware.org>
List-Help: <mailto:frysk-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: frysk-owner@sourceware.org
X-SW-Source: 2006-q3/txt/msg00631.txt.bz2

Hi,

This is an incomplete overview of all the issues/tasks remaining to go
from the simple (single threaded) breakpoint address support we have now
towards full source line stepping for Frysk. Not everything has the same
priority and some are just listed as ideas that would be nice to work on
if we had infinite time :) And some issues are only listed because I
was thinking in the past that we would use breakpoint addresses to
implement them, but on further thought that might be the wrong
mechanism to use. Comments on how to prioritize issues and what else
can be or is being worked on are very welcome. And since it is a pretty
long list of issues I am sure I got some things wrong, so please
correct me where I am wrong. Especially my knowledge about the higher
level runtime/language model and the mapping to/from the low level
task/addresses is not yet complete.

= Current breakpoint address support implemented.

  - Proc shares breakpoints between Tasks. TaskState makes sure that a
    running Task gets suspended when a new Code TaskObserver is added
    for an address that doesn't have a breakpoint set yet in the Proc,
    and that it is resumed immediately afterwards (unless it was
    already blocked).
    Code: frysk.proc.Proc, frysk.proc.Task, frysk.proc.TaskState
    frysk.proc.BreakpointAddresses

  - Supported through int3 on x86 and x86_64 and through trapping on
    an illegal instruction on ppc64. ppc isn't supported at this moment.
    A simple instruction replacement is done on the original addresses
    which gets reset when stepping (not multi-task safe, see below).
    Code: frysk.proc.BreakPoint,
    frysk.proc.Isa[IA32|EMT64|PPC64].getBreakpointInstruction().

  - Code observers can be set. A Code observer can monitor multiple
    (related) addresses from multiple Tasks; its updateHit() method
    gets called when one of them is hit.
    Code: frysk.proc.TaskObserver.Code,
    frysk.proc.Task.requestAddCodeObserver()
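
  To make the "simple instruction replacement" above concrete, here is
  a minimal Java sketch of how one breakpoint could be shared between
  several observers on x86, where int3 is the single byte 0xCC. It
  works on a plain byte array standing in for tracee memory. The class
  and method names (BreakpointPatchSketch, add, remove) are made up
  for illustration and are not the frysk.proc API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; these names are not frysk classes.
final class BreakpointPatchSketch {
    static final byte INT3 = (byte) 0xCC; // x86 one-byte breakpoint opcode

    private final byte[] memory;                                  // stand-in for tracee memory
    private final Map<Integer, Byte> saved = new HashMap<>();     // address -> original byte
    private final Map<Integer, Integer> users = new HashMap<>();  // address -> observer count

    BreakpointPatchSketch(byte[] memory) {
        this.memory = memory;
    }

    // Registers one more observer for the address.
    // Returns true only when memory was actually patched (first observer).
    boolean add(int address) {
        int count = users.merge(address, 1, Integer::sum);
        if (count > 1) {
            return false; // breakpoint already in place, just share it
        }
        saved.put(address, memory[address]);
        memory[address] = INT3;
        return true;
    }

    // Unregisters one observer for the address.
    // Returns true only when the original byte was restored (last observer).
    boolean remove(int address) {
        Integer count = users.get(address);
        if (count == null) {
            throw new IllegalStateException("no breakpoint at " + address);
        }
        if (count > 1) {
            users.put(address, count - 1);
            return false;
        }
        users.remove(address);
        memory[address] = saved.remove(address);
        return true;
    }

    public static void main(String[] args) {
        byte[] mem = {(byte) 0x55, (byte) 0x90, (byte) 0xC3};
        BreakpointPatchSketch bp = new BreakpointPatchSketch(mem);
        System.out.println("patched: " + bp.add(1));
        System.out.println("shared, patched again: " + bp.add(1));
        bp.remove(1);
        System.out.println("restored: " + bp.remove(1));
    }
}
```

  The reference count is what lets several Tasks or observers share
  one patched address without restoring the original byte too early.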

= Bugs and Extensions (low level work to do)

  - exec call should clear all breakpoints
    We forget to clear and delete the Code observer in this case.
    http://sourceware.org/bugzilla/show_bug.cgi?id=3255

  - traps can be used by applications
    Some applications install their own trap handlers and might
    generate trap events themselves. Our sanity checks are too strict
    and crash and burn in such cases.
    http://sourceware.org/bugzilla/show_bug.cgi?id=3256

  - system call vs breakpoint stepping.  When setting a breakpoint on
    a system call entry point we cannot easily use ptrace for a single
    step (since it will 'disappear' into the system call). A solution
    is to monitor syscall exit or set another breakpoint after return.
    Or maybe utrace will give us a more flexible interface.
    Needs test.

  - Unsafe locations/instructions
    Some locations or instructions might be unsafe for setting a
    breakpoint since doing so interferes with the instruction
    semantics. In particular ppc lwarx/stwcx pairs, see
    http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=207287 for an
    example. Question: are there similar instructions (or instruction
    pairs) on other architectures?
    Needs test.

  - Multiple tasks
    The current setup is not multi-task safe. When a breakpoint
    address is hit and we want to continue or step over it, the
    original instruction stream is put back, a step is taken in the
    Task and the breakpoint instructions are put back. When other
    Tasks are running this means those Tasks might miss the breakpoint
    since they are seeing the original instruction stream. Or worse,
    they might see an invalid instruction stream with partial
    breakpoint and partial original instructions in place. This is a
    problem on architectures that have multibyte breakpoint
    instruction sequences, like on ppc.
    There are basically 2 ways to solve this issue:
    - Stop the world, step, resume world.
      Whenever a breakpoint address is updated all Tasks of the Proc
      are suspended first. The original instruction stream is
      restored. The Task that hit the breakpoint is stepped. The
      breakpoint instruction is put back. And all Tasks are restarted.
      This is mostly architecture independent.
    - Out of instruction stream stepping.
      To keep the other Tasks running (suspending/resuming has a lot
      of overhead) we can try to use 'out of instruction stream'
      stepping. A per Task local memory location is found to put the
      original instruction(s) on. We set the PC to this location, a
      step is performed and the PC is set back. On architectures that
      support variable length instructions we need to parse the
      original instruction stream. And for (jump, load or branch)
      instructions that are relative to the PC after the step we need
      to 'fixup' some of the registers before or after the step. The
      kprobe code in the linux kernel is an example of this approach.
      This is highly architecture dependent.
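
  The PC 'fixup' for out of instruction stream stepping can be
  sketched as plain arithmetic. The Java sketch below assumes a
  simplified x86-like model in which a relative branch target is
  computed from the address of the next instruction. All names
  (OutOfLineStepSketch, Kind, fixupPc) are hypothetical, and the
  sketch deliberately ignores calls (which push a return address) and
  PC-relative loads, both of which need extra handling.

```java
// Hypothetical illustration of the PC fixup after stepping a copied instruction.
final class OutOfLineStepSketch {
    enum Kind {
        // Falls through to the next instruction, or branches relative to the PC.
        SEQUENTIAL_OR_PC_RELATIVE,
        // Jumps to an absolute address (for example an indirect jump).
        ABSOLUTE_BRANCH
    }

    // originalAddress: where the instruction really lives (breakpoint address).
    // scratchAddress:  the per-Task location the instruction was copied to.
    // pcAfterStep:     the PC reported after single stepping at scratchAddress.
    // Returns the PC the Task should continue from.
    static long fixupPc(long originalAddress, long scratchAddress,
                        long pcAfterStep, Kind kind) {
        if (kind == Kind.ABSOLUTE_BRANCH) {
            return pcAfterStep; // target did not depend on where we executed
        }
        // The PC moved relative to the scratch slot; shift it back so that it
        // is relative to the original location instead.
        return pcAfterStep + (originalAddress - scratchAddress);
    }

    public static void main(String[] args) {
        // A 2-byte instruction copied from 0x1000 to a scratch slot at 0x7000.
        long fallThrough = fixupPc(0x1000, 0x7000, 0x7002,
                Kind.SEQUENTIAL_OR_PC_RELATIVE);
        System.out.println(Long.toHexString(fallThrough));
    }
}
```

  Deciding which Kind an instruction is requires decoding it, which is
  the architecture dependent part.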

  - Hardware breakpoints
    When available (often there are not many hardware breakpoint
    registers) we should use a hardware breakpoint to speed things up
    and simplify things (no code patching needed!). As an extension, an
    analysis of which breakpoints are hit the most can be done so we
    use the hardware registers for those and fall back to code
    patching for the less frequently hit addresses.
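
  The "most hit addresses get the hardware registers" idea could look
  roughly like the Java sketch below. It assumes hit counts are
  already being collected per address; the class and method names are
  hypothetical, and the number of slots is a parameter because it
  depends on the architecture (x86, for example, has four debug
  address registers).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of frysk.
final class HardwareSlotSketch {
    // hitCounts: breakpoint address -> number of times it was hit so far.
    // slots:     number of hardware breakpoint registers available.
    // Returns the addresses that should use hardware breakpoints, most hit
    // first. All other addresses keep using instruction patching.
    static Set<Long> pickHardwareAddresses(Map<Long, Long> hitCounts, int slots) {
        List<Map.Entry<Long, Long>> entries = new ArrayList<>(hitCounts.entrySet());
        entries.sort((a, b) -> {
            int byHits = Long.compare(b.getValue(), a.getValue()); // most hits first
            // Break ties by address so the result is deterministic.
            return byHits != 0 ? byHits : Long.compare(a.getKey(), b.getKey());
        });
        Set<Long> chosen = new LinkedHashSet<>();
        for (Map.Entry<Long, Long> entry : entries) {
            if (chosen.size() >= slots) {
                break;
            }
            chosen.add(entry.getKey());
        }
        return chosen;
    }

    public static void main(String[] args) {
        Map<Long, Long> hits = new java.util.HashMap<>();
        hits.put(0x1000L, 5L);
        hits.put(0x2000L, 50L);
        hits.put(0x3000L, 20L);
        for (long address : pickHardwareAddresses(hits, 2)) {
            System.out.println(Long.toHexString(address));
        }
    }
}
```

  Re-running the selection now and then would let breakpoints migrate
  between hardware registers and code patching as the counts change.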

= Even more stuff that would be nice (low level)

  - Alternative for simple function call tracing
    Often users will be interested in just function calls being
    hit. This can be built upon the low level breakpoint
    addresses. But a simpler way to add this might be to patch the PLT
    elf entries of libraries loaded. This is what ltrace does for
    example. For languages with alternative linking strategies (gcj)
    other entry point triggers might be used. This ties in with the
    runtime/language model used.

  - Pushing observers/trigger logic into tracee
    It would reduce overhead a lot if some of the observer logic could
    be pushed into the tracee so a trap event is only generated when
    some simple condition holds.  This would require elaborate code
    patching and/or loading of a support library in the
    executable. Very invasive. Would also need an overview of "simple
    logic" that is useful. Note that systemtap does something like
    this in kernel space through loading a kernel module that
    interacts with kprobes.

  - Better kernel support.
    Both utrace and user-kprobes will hopefully become available in
    the future and might ease some of the issues outlined above.
    utrace: http://people.redhat.com/roland/utrace/
    user-kprobes: http://lwn.net/Articles/176281/
    http://lwn.net/Articles/182910/

= Instruction stepping (work items)

  - Stopping the Task (BlockObserver)
    There are multiple TaskObservers that can stop a Task; the most
    appropriate in this case is the Code observer, which you can give
    an address. But there isn't a simple way to just stop the Task
    where it is currently executing. So a BlockObserver should be
    introduced whose only function is to put the Task in a suspended
    state. From there on you could inspect the Task and possibly
    initiate instruction stepping.
    Code: frysk.proc.TaskObserver.Block
          - Action blocked(Task);

  - Stepping the Task (StepObserver)
    This would enable instruction single stepping. On each step the
    observer would be called with the current pc value. The
    implementation would need to add a stepping flag to the running
    and blocked task states which indicates that instead of
    task.sendContinue() a task.sendStepInstruction() should be done.
    Note that it is the responsibility of higher level code to decide
    whether to instruction step or put a breakpoint when using source
    line stepping.
    Code: frysk.proc.TaskObserver.Step
          - Action stepped(Task,long);
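
  As a rough illustration of how a Step observer might be driven, here
  is a self-contained Java sketch. It only mirrors the proposed
  "Action stepped(Task,long)" callback shape. The Action enum,
  StepObserver interface and ToyTask class are hypothetical stand-ins
  written for this example, not the real frysk.proc types, and the toy
  task pretends every instruction is one address unit long.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are the real frysk.proc classes.
final class StepObserverSketch {
    enum Action { CONTINUE, BLOCK }

    // Mirrors the proposed "Action stepped(Task,long)" callback.
    interface StepObserver {
        Action stepped(ToyTask task, long pc);
    }

    // Toy task: every "instruction" is one address unit long.
    static final class ToyTask {
        long pc;
        StepObserver stepObserver; // non-null plays the role of the stepping flag

        ToyTask(long pc) {
            this.pc = pc;
        }

        // Runs until the observer blocks the task or maxInstructions is reached.
        // Returns the number of instructions executed.
        int resume(int maxInstructions) {
            int executed = 0;
            while (executed < maxInstructions) {
                pc++; // stand-in for a real single step of the tracee
                executed++;
                if (stepObserver != null
                        && stepObserver.stepped(this, pc) == Action.BLOCK) {
                    break;
                }
            }
            return executed;
        }
    }

    // Steps a task from start and records every pc seen until stopPc is reached.
    static List<Long> traceUntil(long start, long stopPc, int maxInstructions) {
        List<Long> seen = new ArrayList<>();
        ToyTask task = new ToyTask(start);
        task.stepObserver = (t, pc) -> {
            seen.add(pc);
            return pc == stopPc ? Action.BLOCK : Action.CONTINUE;
        };
        task.resume(maxInstructions);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(traceUntil(100, 103, 10));
    }
}
```

  Higher level code (such as source line stepping) would supply the
  observer and decide in stepped() when the Task should stay blocked.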

= Mapping addresses to lines and back

  - Mapping addresses to source lines
    Done through lib.dw (Dwfl, DwflLine). This uses dwarf information,
    so can only be done when debug info is available. In theory an
    address could belong to different source lines (when different
    contexts are optimized into common code). But in practice this
    seems to be ignored (unavailable?).
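
  Conceptually, the address to line lookup is a search for the last
  line table row at or below the address. The Java sketch below shows
  that idea with a TreeMap. It is not the lib.dw API; the class and
  method names are hypothetical, and real dwarf line tables carry more
  information (files, columns, end-of-sequence markers) that is
  ignored here.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of an address -> source line lookup.
final class LineTableSketch {
    // Start address of a row -> source line number for that row.
    private final TreeMap<Long, Integer> rows = new TreeMap<>();

    void addRow(long address, int line) {
        rows.put(address, line);
    }

    // Returns the line of the last row starting at or below the address,
    // or -1 when the address lies before the first row.
    int lineFor(long address) {
        Map.Entry<Long, Integer> row = rows.floorEntry(address);
        return row == null ? -1 : row.getValue();
    }

    public static void main(String[] args) {
        LineTableSketch table = new LineTableSketch();
        table.addRow(0x400500L, 10);
        table.addRow(0x400508L, 11);
        table.addRow(0x400510L, 13);
        System.out.println(table.lineFor(0x40050cL));
    }
}
```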

  - TagSets
    The set of source line tags that the gui is interested in is
    maintained in frysk.gui.srcwin.tags. Currently there isn't a way to
    define them (except loading them from the preferences). The
    concept seems useful outside the gui/srcwin package.

  - Mapping TagSets to Task addresses
    Given a TagSet we need a mechanism for mapping it to breakpoint
    addresses for each Proc we are interested in. Given the whole
    system approach that frysk takes, we need a way to do this mapping
    whenever a new Proc is being observed: map any core code that is
    mapped in back to sources, which can then be matched against the
    TagSets. We also need a way to monitor the loading (and unloading)
    of dynamic libraries.
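
  The reverse direction, from a tagged source line to the addresses
  where breakpoints would be set in one Proc, might be sketched in
  Java as follows. The Row class and addressesFor method are
  hypothetical and only stand in for whatever the debug info of the
  mapped code provides. Note that one source line can legitimately
  map to several addresses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of mapping a (file, line) tag to addresses.
final class TagToAddressSketch {
    // One row of a simplified line table for code mapped into a Proc.
    static final class Row {
        final long address;
        final String file;
        final int line;

        Row(long address, String file, int line) {
            this.address = address;
            this.file = file;
            this.line = line;
        }
    }

    // Returns every address whose row matches the given file and line,
    // in line table order. An empty list means the tag has no code here.
    static List<Long> addressesFor(List<Row> lineTable, String file, int line) {
        List<Long> result = new ArrayList<>();
        for (Row row : lineTable) {
            if (row.line == line && row.file.equals(file)) {
                result.add(row.address);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> table = new ArrayList<>();
        table.add(new Row(0x400500L, "main.c", 10));
        table.add(new Row(0x400508L, "main.c", 11));
        table.add(new Row(0x400520L, "main.c", 10)); // same line, second location
        System.out.println(addressesFor(table, "main.c", 10));
    }
}
```

  The same lookup would have to be repeated whenever a dynamic library
  is loaded, since that adds new rows for the Proc.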

= Source line stepping, step into, step out of...

  Given all of the above we can finally implement the functions a user
  would be interested in given a language model view of the sources.