From: Phil Muldoon
To: Roland McGrath
CC: Frysk Hackers
Subject: Re: Optimizing watchpoints
Date: Wed, 10 Oct 2007 07:11:00 -0000
Message-ID: <470C7B26.1000307@redhat.com>
References: <46FD7036.2010500@redhat.com> <20071001012529.D264A4D0325@magilla.localdomain>
In-Reply-To: <20071001012529.D264A4D0325@magilla.localdomain>

Roland McGrath wrote:

> For the latter, that means an individual thread or a group of threads that
> share a set of watchpoints. Right now, the implementation can only be done
> by setting each watchpoint individually on each thread.
> But it is likely that future facilities will be able to share some
> low-level resources and interface overhead by treating uniformly an
> arbitrary subset of threads in the same address space.

Ideally, from an API perspective, I'd like both. In the past, I have
always found it useful to watch every thread in a process to see which
one was clobbering a given memory address. However, I would still like
to preserve single-thread watchpoints from a user (Frysk) API
perspective.

> It is also likely to matter whether the chosen subset
> is in fact the whole set of all threads in the same address space, and
> whether a thread has only the breakpoints shared with its siblings in a
> chosen subset, or has those plus additional private breakpoints of its own.
> So it's worthwhile to think about how the structure of keeping track of
> watchpoints (and other kinds of low-level breakpoints) can reflect those
> groupings of threads from the high-level semantic control plane down to the
> lowest-level implementation, where the most important sharing can occur.

Right now (correct me if I am wrong here, Mark), we do "software" code
breakpoints via single-stepping, and none of the limited debug registers
are used for hardware code breakpoints. I guess the question here is
whether we ever will, and whether any design should reflect and
accommodate that, or whether we should just "rewrite as necessary". For
now I am going to take the latter approach and pretend the former will
never exist, at least in Frysk.

> There is one final aspect of organization to consider. At the lowest
> level, there is a fixed-size hardware resource of watchpoint slots. When
> you set them with ptrace, the operating system just context-switches them
> for each thread in the most straightforward way. So the hardware resource
> is yours to decide how to allocate. However, this is not what we expect to
> see in future facilities.
> The coming model is that hardware watchpoints
> are a shared resource managed and virtualized to a certain degree by the
> operating system. The debugger may be one among several noncooperating
> users of this resource, for both per-thread and system-wide uses. Rather
> than having the hardware slots to allocate as you choose, you will specify
> what you want in a slot, and a priority, and can get dynamic feedback about
> the availability of a slot for your priority. (For compatibility, ptrace
> itself will use that facility to virtualize the demands made by
> PTRACE_SET_DEBUGREG and the like. ptrace uses a known priority number that
> is fairly high, so that some system-wide or other background tracing would
> have to knowingly intend to interfere with traditional user application use
> by choosing an even higher priority.)

This is where I see the largest change between Frysk's implementation
now and how it will look in the future with utrace, so it would be wise
to put the setting and getting of debug registers in a fairly abstract
class that can be swapped out depending on the implementation. This is
where I have been spending a lot of my thinking time. Right now, the
debug registers will be populated via Frysk's register access routines,
which are themselves being refactored. The ptrace peek and poke are
abstracted from the code, and a simple set/get will be performed via the
Frysk functions to populate and read the debug registers. But, as you
mention, it appears that in the utrace world this will be taken away
from the (abstracted) ptrace user and managed by the kernel. For the
purposes of context on this list, is that hardware watchpoint design set
in stone with utrace now, and would it be safe to lay plans based on
that?

> At one extreme you have single-step, i.e. software watchpoints by storing
> the old value, stepping an instruction, and checking if the value in memory
> changed.
> This has few constraints on specification (only that you can't
> distinguish stored-same from no-store, and it's not a mechanism for data
> read breakpoints). It has no resource contention issues at all. It is
> inordinately expensive in CPU time (though a straightforward in-kernel
> implementation could easily be orders of magnitude faster than the
> traditional debugger experience of implementing this).

Conceptually (again, correct me if I am wrong, Mark/Tim) this is what we
do with code breakpoints, so adding a software watchpoint would be a
modification of that code, and the hardware watchpoints, at least at the
engine level, would be a separate implementation. The user may or may
not know whether they are setting a hardware or a software watchpoint,
depending on the tunability that is given to them. However, I have no
plans for software watchpoints at the moment.

> Hardware watchpoints have some precise constraints and they compete for a
> very limited dynamic resource, but they are extremely cheap in CPU time.

Yes, and they seem to change across processor model revisions too. Fun!

Anyway, I'm still working on the bag of tricks for optimizing
watchpoints, but I wanted to respond to the first part of your email to
give the wider scope, and to open it up for comments about my long-term
intentions. I'll comment on the second part of your email later.

Regards

Phil
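
For anyone following the thread, the single-step software watchpoint
scheme quoted above (store the old value, step an instruction, check
whether the value in memory changed) can be sketched as a toy model.
This is only an illustration under stated assumptions, not Frysk or
kernel code: the "program" is a list of Python callables standing in for
machine instructions, memory is a plain dict, and the names
run_with_software_watchpoint, WatchHit, store and nop are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy stand-ins (assumptions, not a real debugger API): memory is a dict
# of address -> value, and an "instruction" is a callable that mutates it.
Memory = Dict[int, int]
Instruction = Callable[[Memory], None]


@dataclass
class WatchHit:
    step: int  # index of the instruction after which the change was seen
    old: int   # value stored before the step
    new: int   # value observed after the step


def run_with_software_watchpoint(
    program: List[Instruction], memory: Memory, address: int
) -> List[WatchHit]:
    """Single-step the program, comparing the watched value after each step."""
    hits: List[WatchHit] = []
    old = memory.get(address, 0)           # store the old value
    for step, instruction in enumerate(program):
        instruction(memory)                # "step" exactly one instruction
        new = memory.get(address, 0)       # re-read the watched location
        if new != old:                     # only a changed value is visible
            hits.append(WatchHit(step, old, new))
            old = new                      # re-arm with the new value
    return hits


def store(address: int, value: int) -> Instruction:
    """Build a toy instruction that writes value to address."""
    def instruction(memory: Memory) -> None:
        memory[address] = value
    return instruction


def nop(memory: Memory) -> None:
    """Toy instruction that touches nothing."""


memory = {0x1000: 7}
program = [nop, store(0x1000, 7), store(0x2000, 1), store(0x1000, 9)]
hits = run_with_software_watchpoint(program, memory, 0x1000)
```

In this model the second instruction writes 7 over an existing 7 and is
not reported, which is the "stored-same versus no-store" limitation
mentioned in the quoted text; only the final store of 9 produces a hit.
The per-instruction compare is also why the approach costs so much CPU
time compared with a hardware debug register.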