From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [PATCH -tip v7 24/26] kprobes: Enlarge hash table to 4096 entries
From: Masami Hiramatsu
To: linux-kernel@vger.kernel.org, Ingo Molnar
Cc: Ananth N Mavinakayanahalli, Sandeepa Prabhu, Frederic Weisbecker, x86@kernel.org, Steven Rostedt, fche@redhat.com, mingo@redhat.com, systemtap@sourceware.org, "H.
 Peter Anvin", Thomas Gleixner
Date: Thu, 27 Feb 2014 07:34:00 -0000
Message-ID: <20140227073414.20992.16882.stgit@ltc230.yrl.intra.hitachi.co.jp>
In-Reply-To: <20140227073315.20992.6174.stgit@ltc230.yrl.intra.hitachi.co.jp>
References: <20140227073315.20992.6174.stgit@ltc230.yrl.intra.hitachi.co.jp>
User-Agent: StGit/0.17-dirty
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Currently, kprobes expects to be used with fewer than 100 probe
points, so its hash table has only 64 entries. This is too few to
handle several thousand probes.
Enlarge the table to 4096 entries, which consumes just 32KB (on
64-bit architectures), for better scalability.

Without this patch, enabling 17787 probes takes more than 2 hours!
(9428 sec, with a 1 min interval for every 2000 probes enabled)

  Enabling trace events: start at 1392782584
  0 1392782585 a2mp_chan_alloc_skb_cb_38556
  1 1392782585 a2mp_chan_close_cb_38555
  ....
  17785 1392792008 lookup_vport_34987
  17786 1392792010 loop_add_23485
  17787 1392792012 loop_attr_do_show_autoclear_23464

I profiled it and saw that more than 90% of cycles are consumed in
get_kprobe.

  Samples: 18K of event 'cycles', Event count (approx.): 37759714934
  +  95.90%  [k] get_kprobe
  +   0.76%  [k] ftrace_lookup_ip
  +   0.54%  [k] kprobe_trace_func

More than 60% of executed instructions were in get_kprobe as well.

  Samples: 17K of event 'instructions', Event count (approx.): 1321391290
  +  65.48%  [k] get_kprobe
  +   4.07%  [k] kprobe_trace_func
  +   2.93%  [k] optimized_callback

Annotating get_kprobe also shows that the hlist is too long and that
walking it takes most of the time.

         |      struct hlist_head *head;
         |      struct kprobe *p;
         |
         |      head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
         |      hlist_for_each_entry_rcu(p, head, hlist) {
   86.33 |      mov    (%rax),%rax
   11.24 |      test   %rax,%rax
         |      jne    60
         |              if (p->addr == addr)
         |                      return p;
         |      }

With this fix, enabling 20,000 probes takes just 40 min
(2303 sec, with a 1 min interval for every 2000 probes enabled)

  Enabling trace events: start at 1392794306
  0 1392794307 a2mp_chan_alloc_skb_cb_38556
  1 1392794307 a2mp_chan_close_cb_38555
  ....
  19997 1392796603 nfs4_negotiate_security_12119
  19998 1392796603 nfs4_open_confirm_done_11767
  19999 1392796603 nfs4_open_confirm_prepare_11779

It also reduced the cycles spent in get_kprobe (with 20,000 probes).

  Samples: 5K of event 'cycles', Event count (approx.): 4540269674
  +  68.77%  [k] get_kprobe
  +   8.56%  [k] ftrace_lookup_ip
  +   3.04%  [k] kprobe_trace_func

Signed-off-by: Masami Hiramatsu
---
 kernel/kprobes.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index abdede5..302ff42 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -54,7 +54,7 @@
 #include
 #include
 
-#define KPROBE_HASH_BITS 6
+#define KPROBE_HASH_BITS 12
 #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)