Message-ID: <52A16AD7.6040500@hitachi.com>
Date: Fri, 06 Dec 2013 06:13:00 -0000
From: Masami Hiramatsu
To: "Frank Ch. Eigler"
Cc: Ingo Molnar, Ananth N Mavinakayanahalli, Sandeepa Prabhu, x86@kernel.org, lkml, "Steven Rostedt (Red Hat)", systemtap@sourceware.org, "David S. Miller"
Subject: Re: [PATCH -tip v4 0/6] kprobes: introduce NOKPROBE_SYMBOL() and fixes crash bugs
References: <20131204012841.22118.82992.stgit@kbuild-fedora.novalocal> <20131204084551.GA31772@gmail.com> <529FBA71.6070107@hitachi.com>

(2013/12/05 23:49), Frank Ch. Eigler wrote:
>
> Hi, Masami -
>
> masami.hiramatsu.pt wrote:
>
>> [...]
>> For the safeness of kprobes, I have an idea; introduce a whitelist
>> for dynamic events. AFAICS, the biggest unstable issue of kprobes
>> comes from putting *many* probes on the functions called from tracers.
>
> Why do you think so?

Oh, because I actually hit this problem when enabling kprobe-events on
every *ftrace-related* function (ring buffer, trace filter, etc.).
It doesn't crash the kernel, but it slows down the machine very much,
and finally I have to reboot it forcibly.

But when I just enable a few probes on those functions, the system has
no problem. In this case, almost all probes are miss-hits because of
recursion, but each miss-hit still involves int3/debug interrupts, and it
increases the processing time of one event handled by ftrace, as below.

1. hit a kprobe outside of ftrace
2. kprobe calls event handler
3. the event handler calls ftrace-related functions to reserve buffer,
   check filter, commit buffer etc.
   3-1. each ftrace/ringbuffer function hits a kprobe
   3-2. the kprobe detects recursion, just does a single-step, and returns
4. do single stepping
5. return from kprobe

Note that the whole problem happens inside the event handler.

> We have had problems with single kprobes in the
> "wrong" spot. The main reason I showed spraying them widely is to get
> wide coverage with minimal information/effort, not to suggest that the
> number of concurrent probes per se is a problem. (We have had
> systemtap scripts probing some areas of the kernel with thousands of
> active kprobes, e.g.
> for statement-by-statement variable-watching
> jobs, and these have worked fine.)

Ah, sorry for the confusion. Agreed. I just tried to explain that kprobes
can cause a performance problem under a *very specific* operation.
So the whitelist is just for keeping people away from it.

>> It doesn't crash the kernel but slows down so much, because every
>> probes hit many other nested miss-hit probes.
>
> (kprobes does have code to detect & handle reentrancy.)

Right. :)

>> This gives us a big performance impact. [...]
>
> Sure, but I'd expect to see pure slowdowns show their impact with
> time-related problems like watchdogs firing or timeouts.

I doubt it can cause that, because each probe's processing time is still
small enough to slip through the watchdog.

>> [...] Then, I'd like to propose this new whitelist feature in
>> kprobe-tracer (not raw kprobe itself). And a sysctl knob for
>> disabling the whitelist. That knob will be
>> /proc/sys/debug/kprobe-event-whitelist and disabling it will mark
>> kernel tainted so that we can check it from bug reports.
>
> How would one assemble a reliable whitelist, if we haven't fully
> characterized the problems that make the blacklist necessary?

As I said, we can use the function graph tracer's list as the whitelist,
since it doesn't include any functions invoked from ftrace's event handler.
(Note that I am not talking about SystemTap or other users here.)

The whitelist is just for keeping people who only want to trace their own
subsystems (not ftrace) away from the quantitative issue.
For example, such people may try to probe every function
(e.g. perf probe --add '* $vars' : actually this is why I haven't released
wildcard support on perf probe yet).

Of course I can implement the whitelist feature in perf probe only; that
will allow me to support wildcards on perf probe. :)

For the long-term solution, I think we can introduce some kind of
performance gatekeeper as systemtap does.
It would count the miss-hit rate per second and, if it goes over a
threshold, disable the next miss-hit (or most miss-hit) probe (as the
OOM killer does).

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
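
The gatekeeper idea above (count miss-hits per second, and once a
threshold is exceeded disable the probe with the most miss-hits) could be
modelled roughly as below. This is only a user-space toy sketch in Python,
not kernel code and not anything from the patch series: the class name
MissHitGatekeeper, its methods, the probe names, and the one-second window
handling are all assumptions made for illustration.

```python
from collections import Counter


class MissHitGatekeeper:
    """Toy model (hypothetical): count recursion miss-hits per probe in
    one-second windows and disable the worst offender over a threshold."""

    def __init__(self, threshold_per_sec):
        self.threshold_per_sec = threshold_per_sec
        self.window_start = 0.0      # start time of the current window (seconds)
        self.miss_hits = Counter()   # miss-hit count per probe in this window
        self.disabled = set()        # probes the gatekeeper has switched off

    def record_miss_hit(self, probe, now):
        """Record one miss-hit of `probe` at time `now` (seconds).

        Returns the name of the probe that got disabled, or None.
        """
        if probe in self.disabled:
            return None  # a disabled probe no longer fires at all
        if now - self.window_start >= 1.0:
            # A new one-second window begins: forget the old counts.
            self.window_start = now
            self.miss_hits.clear()
        self.miss_hits[probe] += 1
        if sum(self.miss_hits.values()) > self.threshold_per_sec:
            # Pick the probe with the most miss-hits, OOM-killer style.
            victim, _ = self.miss_hits.most_common(1)[0]
            self.disabled.add(victim)
            del self.miss_hits[victim]
            return victim
        return None


# Usage with made-up probe names and timestamps.
gk = MissHitGatekeeper(threshold_per_sec=3)
events = [(0.0, "ringbuf_probe"), (0.1, "ringbuf_probe"),
          (0.2, "filter_probe"), (0.3, "ringbuf_probe")]
for t, probe in events:
    victim = gk.record_miss_hit(probe, t)
print(victim)  # ringbuf_probe
```

Timestamps are passed in explicitly so that the sketch stays deterministic;
a real implementation would have to do this accounting inside the kprobe
recursion path, which this model does not attempt to show.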