From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 8191 invoked by alias); 6 Dec 2013 23:19:50 -0000
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: systemtap-owner@sourceware.org
Received: (qmail 8183 invoked by uid 89); 6 Dec 2013 23:19:49 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.3 required=5.0
 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail9.hitachi.co.jp
Received: from Unknown (HELO mail9.hitachi.co.jp) (133.145.228.44)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
 Fri, 06 Dec 2013 23:19:47 +0000
Received: from mlsv6.hitachi.co.jp (unknown [133.144.234.166])
 by mail9.hitachi.co.jp (Postfix) with ESMTP id 26D4037C82;
 Sat, 7 Dec 2013 08:19:39 +0900 (JST)
Received: from mfilter05.hitachi.co.jp
 by mlsv6.hitachi.co.jp (8.13.1/8.13.1) id rB6NJdsf022043;
 Sat, 7 Dec 2013 08:19:39 +0900
Received: from vshuts04.hitachi.co.jp (vshuts04.hitachi.co.jp [10.201.6.86])
 by mfilter05.hitachi.co.jp (Switch-3.3.4/Switch-3.3.4) with ESMTP id rB6NJbIk010793;
 Sat, 7 Dec 2013 08:19:38 +0900
Received: from gxml20a.ad.clb.hitachi.co.jp (unknown [158.213.157.160])
 by vshuts04.hitachi.co.jp (Postfix) with ESMTP id 7560A14003B;
 Sat, 7 Dec 2013 08:19:37 +0900 (JST)
Received: from [10.198.194.91]
 by gxml20a.ad.clb.hitachi.co.jp (Switch-3.1.10/Switch-3.1.9) id 5B6N1J0NI000015D4;
 Sat, 07 Dec 2013 08:19:36 +0900
Message-ID: <52A25B71.3090108@hitachi.com>
Date: Fri, 06 Dec 2013 23:19:00 -0000
From: Masami Hiramatsu
User-Agent: Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20120614 Thunderbird/13.0.1
MIME-Version: 1.0
To: "Frank Ch. Eigler"
Cc: Ingo Molnar, Ananth N Mavinakayanahalli, Sandeepa Prabhu,
 x86@kernel.org, lkml, "Steven Rostedt (Red Hat)",
 systemtap@sourceware.org, "David S. Miller"
Subject: Re: [PATCH -tip v4 0/6] kprobes: introduce NOKPROBE_SYMBOL() and fixes crash bugs
References: <20131204012841.22118.82992.stgit@kbuild-fedora.novalocal>
 <20131204084551.GA31772@gmail.com> <529FBA71.6070107@hitachi.com>
 <52A16AD7.6040500@hitachi.com> <20131206190753.GA3201@redhat.com>
In-Reply-To: <20131206190753.GA3201@redhat.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2013-q4/txt/msg00350.txt.bz2

(2013/12/07 4:07), Frank Ch. Eigler wrote:
> Hi, Masami -
>
> masami.hiramatsu.pt wrote:
>
>> [...]
>>>> [...] Then, I'd like to propose this new whitelist feature in
>>>> kprobe-tracer (not raw kprobe itself). And a sysctl knob for
>>>> disabling the whitelist. That knob will be
>>>> /proc/sys/debug/kprobe-event-whitelist and disabling it will mark
>>>> kernel tainted so that we can check it from bug reports.
>>>
>>> How would one assemble a reliable whitelist, if we haven't fully
>>> characterized the problems that make the blacklist necessary?
>>
>> As I said, we can use function graph tracer's list as the whitelist,
>> since it doesn't include any functions invoked from ftrace's event
>> handler. (Note that I don't mention the Systemtap or other user here)
>>
>> Whitelist is just for keeping the people away from the quantitative
>> issue, who just want to trace their subsystems except for ftrace.
>> [...]
>
> Would you plan to limit kprobes (or just the perf-probe frontend) to
> only function-entries also?

Exactly, yes :). Currently I have a patch for the kprobe-tracer
implementation (so it is not only for perf-probe, but it doesn't limit
kprobes itself).

> If not, and if intra-function
> statement-granularity kprobes remain allowed within a
> function-granularity whitelist, then you might still have those
> "quantitative" problems.

Yes, but as far as I've tested, the performance overhead is not high,
especially when kprobes are put at the entry of those functions,
because of the ftrace-based optimization.

> Even worse, kprobes robustness problems can bite even with a small
> whitelist, unless you can test the countless subset selections
> cartesian-product the aggravating factors (like other tracing
> facilities being in use at the same time, limited memory, high irq
> rates, debugging sessions, architectures, whatever).

And also, what script will run on each probe, right? :)

>> [...] For the long term solution, I think we can introduce some
>> kind of performance gatekeeper as systemtap does. Counting the
>> miss-hit rate per second and if it go over a threshold, disable next
>> miss-hit (or most miss-hit) probe (as OOM killer does).
>
> That would make sense, but again it would not help deal with kprobes
> robustness (in the kernel-crashing rather than kernel-slowdown sense).

Why would you think so? Is there any hidden path for calling the
kprobes mechanism? The kernel crash problem just comes from bugs, not
from the quantitative issue.

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
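
For illustration only, the gatekeeper idea quoted above (count missed
hits per second and, once a threshold is exceeded, disable the probe
with the most misses, much as the OOM killer picks a victim) could be
sketched in plain userspace C roughly as follows. This is not kernel
code and not how systemtap or kprobes implement anything: struct
probe_stat, gatekeeper_tick() and the once-per-second call are all
hypothetical names and assumptions made for this sketch.

```c
#include <stddef.h>

/* Hypothetical per-probe bookkeeping; NOT the kernel's struct kprobe. */
struct probe_stat {
    const char *name;
    unsigned long missed;  /* missed hits in the current one-second window */
    int enabled;           /* nonzero while the probe is armed */
};

/*
 * Assumed to be called once per second. If the total number of missed
 * hits over all enabled probes exceeds `threshold`, disable the enabled
 * probe with the most misses. Counters are reset for the next window.
 * Returns the index of the disabled probe, or -1 if none was disabled.
 */
static int gatekeeper_tick(struct probe_stat *p, size_t n,
                           unsigned long threshold)
{
    unsigned long total = 0;
    int worst = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!p[i].enabled)
            continue;
        total += p[i].missed;
        if (worst < 0 || p[i].missed > p[worst].missed)
            worst = (int)i;
    }

    if (total <= threshold)
        worst = -1;            /* under the limit: leave everything armed */
    else
        p[worst].enabled = 0;  /* over the limit: disarm the worst offender */

    for (i = 0; i < n; i++)
        p[i].missed = 0;       /* start a fresh window */

    return worst;
}
```

With three enabled probes that missed 10, 900 and 50 hits in one window
and a threshold of 500, the call disables the second probe and returns
its index (1). A later window with few misses returns -1 and disables
nothing.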