From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <frysk-return-2414-listarch-frysk=sources.redhat.com@sourceware.org>
Received: (qmail 32329 invoked by alias); 10 Oct 2007 07:11:42 -0000
Received: (qmail 32319 invoked by uid 22791); 10 Oct 2007 07:11:41 -0000
X-Spam-Status: No, hits=-1.0 required=5.0 	tests=AWL,BAYES_50,DK_POLICY_SIGNSOME,FORGED_RCVD_HELO,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (66.187.233.31)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 10 Oct 2007 07:11:38 +0000
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) 	by mx1.redhat.com (8.13.8/8.13.1) with ESMTP id l9A7BaYZ000969 	for <frysk@sourceware.org>; Wed, 10 Oct 2007 03:11:36 -0400
Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) 	by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l9A7Baf1025563; 	Wed, 10 Oct 2007 03:11:36 -0400
Received: from localhost.localdomain (vpn-6-12.fab.redhat.com [10.33.6.12]) 	by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l9A7BYXl005046; 	Wed, 10 Oct 2007 03:11:35 -0400
Message-ID: <470C7B26.1000307@redhat.com>
Date: Wed, 10 Oct 2007 07:11:00 -0000
From: Phil Muldoon <pmuldoon@redhat.com>
User-Agent: Thunderbird 2.0.0.5 (X11/20070727)
MIME-Version: 1.0
To: Roland McGrath <roland@redhat.com>
CC: Frysk Hackers <frysk@sourceware.org>
Subject: Re: Optimizing watchpoints
References: <46FD7036.2010500@redhat.com> <20071001012529.D264A4D0325@magilla.localdomain>
In-Reply-To: <20071001012529.D264A4D0325@magilla.localdomain>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact frysk-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <frysk.sourceware.org>
List-Subscribe: <mailto:frysk-subscribe@sourceware.org>
List-Post: <mailto:frysk@sourceware.org>
List-Help: <mailto:frysk-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: frysk-owner@sourceware.org
X-SW-Source: 2007-q4/txt/msg00035.txt.bz2

Roland McGrath wrote:
> For the latter, that means an individual thread or a group of threads that
> share a set of watchpoints.  Right now, the implementation can only be done
> by setting each watchpoint individually on each thread.  But it is likely
> that future facilities will be able to share some low-level resources and
> interface overhead by treating uniformly an arbitrary subset of threads in
> the same address space.  

Ideally, from an API perspective, I'd like both. In the past, I have
always found it useful to watch every thread in a process to see which
one was clobbering a given memory address. However, I would still like
to preserve single-thread watchpoints in the user-facing (Frysk) API.
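
To make the "both" concrete, here is a rough sketch of how the bookkeeping
could keep a shared set of watchpoints for all threads alongside private
per-thread ones. Every name in it (Watch, WatchRegistry, watchAllThreads,
watchThread, effectiveFor) is made up for illustration and is not existing
Frysk code; thread ids are plain ints only to keep the example small.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are existing Frysk classes.
final class Watch {
    final long address;
    final int length;

    Watch(long address, int length) {
        this.address = address;
        this.length = length;
    }
}

final class WatchRegistry {
    // Watchpoints that apply to every thread in the address space.
    private final List<Watch> shared = new ArrayList<>();
    // Additional watchpoints private to one thread, keyed by thread id.
    private final Map<Integer, List<Watch>> perThread = new HashMap<>();

    void watchAllThreads(long address, int length) {
        shared.add(new Watch(address, length));
    }

    void watchThread(int tid, long address, int length) {
        perThread.computeIfAbsent(tid, k -> new ArrayList<>())
                 .add(new Watch(address, length));
    }

    // What has to be installed on one thread: the shared set plus
    // whatever is private to that thread.
    List<Watch> effectiveFor(int tid) {
        List<Watch> result = new ArrayList<>(shared);
        result.addAll(perThread.getOrDefault(tid, Collections.<Watch>emptyList()));
        return result;
    }
}
```

Keeping the shared set separate from the per-thread sets means a lower
layer could later install the shared part once for a whole group of
threads, if the kernel interface ever allows that.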

> It also likely to matter whether the chosen subset
> is in fact the whole set of all threads in the same address space, and
> whether a thread has only the breakpoints shared with its siblings in a
> chosen subset, or has those plus additional private breakpoints of its own.
> So it's worthwhile to think about how the structure of keeping track of
> watchpoints (and other kinds of low-level breakpoints) can reflect those
> groupings of threads from the high-level semantic control plane down to the
> lowest-level implementation, where the most important sharing can occur.
>   

Right now (correct me if I am wrong here, Mark), we do "software" code
breakpoints via single-stepping, and none of the limited debug registers
are used for hardware code breakpoints. I guess the question here is
whether we ever will use them, and whether any design should anticipate
and accommodate that, or whether we should just "rewrite as necessary".
For now I am going to take the latter approach and pretend the former
will never exist, at least in Frysk.

> There is one final aspect of organization to consider.  At the lowest
> level, there is a fixed-size hardware resource of watchpoint slots.  When
> you set them with ptrace, the operating system just context-switches them
> for each thread in the most straightforward way.  So the hardware resource
> is yours to decide how to allocate.  However, this is not what we expect to
> see in future facilities.  The coming model is that hardware watchpoints
> are a shared resource managed and virtualized to a certain degree by the
> operating system.  The debugger may be one among several noncooperating
> users of this resource, for both per-thread and system-wide uses.  Rather
> than having the hardware slots to allocate as you choose, you will specify
> what you want in a slot, and a priority, and can get dynamic feedback about
> the availability of a slot for your priority.  (For compatibility, ptrace
> itself will use that facility to virtualize the demands made by
> PTRACE_SET_DEBUGREG and the like.  ptrace uses a known priority number that
> is fairly high, so that some system-wide or other background tracing would
> have to knowingly intend to interfere with traditional user application use
> by choosing an even higher priority.)
>   

This is where I see the largest change in Frysk's implementation now,
and where it will change again in the future with utrace, so it makes
sense to put the setting and getting of debug registers in a fairly
abstract class that can be swapped out depending on the implementation.
This is where I have been spending a lot of my thinking time. Right
now, the debug registers will be populated via Frysk's register access
routines, which are themselves being refactored. The ptrace peek and
poke is abstracted away from the calling code, so just a simple set/get
via the Frysk functions will populate and read the debug registers. But
as you mention, it appears that in the utrace world this allocation will
be taken away from the (abstracted) ptrace user and managed by the
kernel. For the purposes of context on this list, is that hardware
watchpoint design set in stone with utrace now, and would it be safe to
lay plans based on that?
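
As a rough illustration of the kind of abstraction I have in mind, the
sketch below hides the debug-register set/get behind an interface so
that the backend can be replaced later. All names (DebugRegisterBackend,
InMemoryBackend, WatchpointSlots) are hypothetical and not Frysk's
actual API. The in-memory backend only stands in for the real register
access routines, and the slot count is a parameter rather than anything
the code assumes about a particular processor.

```java
// Hypothetical sketch; names are illustrative, not Frysk's actual API.
interface DebugRegisterBackend {
    int slotCount();
    void set(int slot, long value);
    long get(int slot);
}

// Stand-in for a backend that would go through the register access
// routines. Here it only stores the values in an array.
final class InMemoryBackend implements DebugRegisterBackend {
    private final long[] regs;

    InMemoryBackend(int slots) {
        regs = new long[slots];
    }

    public int slotCount() { return regs.length; }
    public void set(int slot, long value) { regs[slot] = value; }
    public long get(int slot) { return regs[slot]; }
}

// Tracks which slots are in use, independent of how they are written.
final class WatchpointSlots {
    private final DebugRegisterBackend backend;
    private final boolean[] used;

    WatchpointSlots(DebugRegisterBackend backend) {
        this.backend = backend;
        this.used = new boolean[backend.slotCount()];
    }

    // Returns the slot index used, or -1 if every slot is taken.
    int allocate(long address) {
        for (int i = 0; i < used.length; i++) {
            if (!used[i]) {
                used[i] = true;
                backend.set(i, address);
                return i;
            }
        }
        return -1;
    }

    void release(int slot) {
        used[slot] = false;
        backend.set(slot, 0L);
    }

    long addressIn(int slot) {
        return backend.get(slot);
    }
}
```

If the kernel ends up managing the slots itself, only the backend (and
possibly the allocate/release policy) should need replacing; callers
that just ask for a watchpoint would stay the same.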

> At one extreme you have single-step, i.e. software watchpoints by storing
> the old value, stepping an instruction, and checking if the value in memory
> changed.  This has few constraints on specification (only that you can't
> distinguish stored-same from no-store, and it's not a mechanism for data
> read breakpoints).  It has no resource contention issues at all.  It is
> inordinately expensive in CPU time (though a straightforward in-kernel
> implementation could easily be orders of magnitude faster than the
> traditional debugger experience of implementing this).
>   

Conceptually (again, correct me if I am wrong, Mark/Tim) this is what
we do with code breakpoints, so adding a software watchpoint would be a
modification of that code, and the hardware watchpoints - at least at
the engine level - would be a separate implementation. The user may or
may not know whether they are assigning a hardware or a software
watchpoint, depending on the tunability that is given to them. However,
I have no plans for software watchpoints at the moment.
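
Purely for reference, the single-step scheme quoted above (store the
old value, step one instruction, check whether memory changed) can be
modelled with a toy like the one below. It is not Frysk code: "memory"
is just an int array, and Instruction and SoftwareWatchpoint are
hypothetical names invented for the example.

```java
import java.util.List;

// Toy model, all names hypothetical: "memory" is an int array and an
// "instruction" is anything that may modify it when stepped.
interface Instruction {
    void execute(int[] memory);
}

final class SoftwareWatchpoint {
    // Steps through the program one instruction at a time. Returns the
    // index of the first instruction after which memory[address] differs
    // from the value saved at the start, or -1 if it never changes.
    static int run(int[] memory, int address, List<Instruction> program) {
        int old = memory[address];               // store the old value
        for (int pc = 0; pc < program.size(); pc++) {
            program.get(pc).execute(memory);     // single-step
            if (memory[address] != old) {        // compare after each step
                return pc;
            }
        }
        return -1;
    }
}
```

Because the check only compares values, an instruction that stores the
same value back into the watched location goes unnoticed, which is the
"stored-same versus no-store" limitation mentioned in the quote.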

> Hardware watchpoints have some precise constraints and they compete for a
> very limited dynamic resource, but they are extremely cheap in CPU time.
>   

Yes, and they seem to change between processor model revisions too. Fun!
Anyway, I'm still working on the bag of tricks for optimizing
watchpoints. I just wanted to respond to the first part of your email
to give the wider scope, and to open up my long-term intentions for
comments. I'll comment on the second part of your email later.

Regards

Phil