From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 4966 invoked by alias); 12 Feb 2006 01:26:32 -0000
Received: (qmail 4959 invoked by uid 22791); 12 Feb 2006 01:26:32 -0000
X-Spam-Status: No, hits=-2.5 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (66.187.233.31)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Sun, 12 Feb 2006 01:26:30 +0000
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254])
    by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id k1C1QRBk009037
    for ; Sat, 11 Feb 2006 20:26:27 -0500
Received: from pobox.corp.redhat.com (pobox.corp.redhat.com [172.16.52.156])
    by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id k1C1QR119271;
    Sat, 11 Feb 2006 20:26:27 -0500
Received: from vpn83-136.boston.redhat.com (vpn83-136.boston.redhat.com [172.16.83.136])
    by pobox.corp.redhat.com (8.12.8/8.12.8) with ESMTP id k1C1QQBi012341;
    Sat, 11 Feb 2006 20:26:27 -0500
Subject: Re: kprobe fault handling
From: Martin Hunt
To: "Frank Ch. Eigler"
Cc: "systemtap@sources.redhat.com"
In-Reply-To: <20060211004909.GE19238@redhat.com>
References: <1139522818.4127.15.camel@monkey2>
    <1139550026.4025.4.camel@monkey2>
    <1139608530.3947.12.camel@monkey2>
    <20060210221202.GA19238@redhat.com>
    <1139609899.3947.13.camel@monkey2>
    <20060210222051.GB19238@redhat.com>
    <1139611343.3947.16.camel@monkey2>
    <20060210224748.GC19238@redhat.com>
    <1139614648.3947.22.camel@monkey2>
    <20060211004909.GE19238@redhat.com>
Content-Type: text/plain
Organization: Red Hat Inc
Date: Sun, 12 Feb 2006 01:26:00 -0000
Message-Id: <1139707646.3943.12.camel@monkey2>
Mime-Version: 1.0
X-Mailer: Evolution 2.0.2 (2.0.2-25)
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Post:
List-Help: ,
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2006-q1/txt/msg00479.txt.bz2

On Fri, 2006-02-10 at 19:49 -0500, Frank Ch. Eigler wrote:
> Hi -
>
> > > Those kernel functions are similarly unsafe (for purposes of
> > > systemtap), since they can sleep (wait while page faults are being
> > > serviced).
> >
> > I started this whole thread to explain that my tests were now
> > showing that was indeed the case.
>
> Why was that news, given my repeated warning to this effect?

The problem with your warnings is that they are vague and lack any
analysis or data. Saying that there may be bugs now or in the future in
a system is not the same as reporting a specific bug.

> > However that was due to an easily fixed bug in the fault handler.
>
> Perhaps so, but:
>
> > You can't deem high-level functions unsafe to use because a bug in a
> > lower-level routine temporarily made them that way.
>
> Temporarily? And it's not just that routine. The larger problem is
> sleeping/rescheduling/locking, not just faulting. This lesson made an
> earlier appearance with printk.

And now we're off on a completely different course.
You have bad feelings that there are bugs involving
sleeping/rescheduling/locking somewhere. I agree there could be a few we
haven't detected; I haven't finished reviewing the code for them. But
not in the copy functions, which don't sleep, lock, or reschedule.

> > > This is why Roland went out of his way to collect
> > > alternatives in loc2c-runtime.h. This was explained at the time.
> >
> > IIRC, he explained to you why using __get_user_asm was safe. That is the
> > same function used by copy_from_user and get_user.
>
> It may be that even those are not sufficiently safe (i.e., not
> stressed enough on pessimistic cases such as valid user addresses that
> are paged out). Or maybe they are used just differently enough to
> have made them work. How much analysis went into your variant, beyond
> bypassing the might_sleep warning?

Certainly less time than I have spent arguing over it. I am writing a
quick summary of my analysis and will post it separately. We can
continue this debate there if you wish.

Martin