From: "guanglei at cn dot ibm dot com" <sourceware-bugzilla@sourceware.org>
To: systemtap@sources.redhat.com
Date: Tue, 18 Apr 2006 05:35:00 -0000
Message-ID: <20060418053524.7021.qmail@sourceware.org>
In-Reply-To: <20051216010933.2062.anil.s.keshavamurthy@intel.com>
References: <20051216010933.2062.anil.s.keshavamurthy@intel.com>
Subject: [Bug kprobes/2062] Return probes does not scale well on SMP box

------- Additional Comments From guanglei at cn dot ibm dot com  2006-04-18 05:35 -------

I ran some tests to find where SystemTap/kprobes performance could be improved. One result concerns the spin locks used by return probes.

Testing environment:
  RHEL 4 U2
  Kernel: 2.6.16.5
  SystemTap code: CVS checkout dated 04-17
  gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)
  ppc64, LPAR with 4 virtual CPUs (thus 8-way) and 2G memory

The oprofile data show that the spin lock routines used by kretprobe take up a lot of time. As Anil said, kretprobe_lock serializes return probes across all CPUs, and these tests show that this spin lock is really too expensive.
For these tests, I applied stpd-notReadRelayfs.patch to eliminate the influence of data logging:

stpd-notReadRelayfs.patch:
--- ori/src/runtime/stpd/librelay.c	2006-04-08 17:59:36.000000000 -0400
+++ src/runtime/stpd/librelay.c	2006-04-13 15:22:31.000000000 -0400
@@ -330,6 +330,8 @@ static void *reader_thread(void *data)
 				 strerror(errno));
 			rc = 0;
 		}
+		sleep(1);
+		continue;
 		rc = read(proc_file[cpu], &status[cpu].info,
 			  sizeof(struct buf_info));

*note* addevent.syscall.entry and addevent.syscall.return are two event hooks defined in LKET (the Linux Kernel Event Trace tool). These hooks log the timestamp, pid, ppid, tid, cpuid, syscall name, and hook id at the entry and return of all system calls.

=============== Testing of addevent.syscall.entry ===============

specjbb score: 2218

oprofile.all:
samples   %        app name                  symbol name
3709035   67.0087  java                      (no symbols)
 187750    3.3920  lglzy                     ._stp_vsnprintf
  87234    1.5760  vmlinux-2.6.16.5.debug    .strpbrk
  53092    0.9592  libjvm.so                 cacheAllocWithInitialization
  51339    0.9275  libc-2.3.4.so             memset
  49893    0.9014  vmlinux-2.6.16.5.debug    restore
  48598    0.8780  libjvm.so                 fastLocalMark
  47751    0.8627  lglzy                     ._stp_print_flush

vmstat:
---io---  --system--  ----cpu----
 bi  bo    in    cs   us sy id wa
  0   2    35   465   82 17  1  0
  0  50    37   463   80 18  3  0
  0  86    27   434   80 19  1  0
  0   0    10   410   80 20  0  0
  0  26    18   416   81 19  0  0

=============== Testing of addevent.syscall.return ==============

specjbb score: 1673

oprofile.all:
samples   %        app name                  symbol name
2701635   49.1351  java                      (no symbols)
 832730   15.1450  vmlinux-2.6.16.5.debug    .__spin_yield
 382374    6.9543  vmlinux-2.6.16.5.debug    ._spin_lock_irqsave
 127950    2.3270  lglzy                     ._stp_vsnprintf
  61853    1.1249  vmlinux-2.6.16.5.debug    .strpbrk
  61002    1.1095  vmlinux-2.6.16.5.debug    restore
  57893    1.0529  vmlinux-2.6.16.5.debug    .trampoline_probe_handler

vmstat:
----io---  --system--  ----cpu----
 bi  bo    in    cs   us sy id wa
  0  16    26   406   59 41  0  0
  0   8    29   425   57 43  0  0
  0   0    17   399   57 43  0  0

==========================================================

You can see that the
score of syscall.entry is 2218, while the score of syscall.return is only 1673: a drop of about 25%. vmstat shows that only ~19% of CPU time is spent in the kernel with syscall.entry, while with syscall.return ~42% of CPU time goes to the kernel. oprofile likewise shows that with the syscall.return probe the spin locks eat up ~21% of total CPU time and only ~49% is spent on specjbb, whereas with syscall.entry ~67% of CPU time goes to specjbb and the cost of the spin locks is negligible.

--
http://sourceware.org/bugzilla/show_bug.cgi?id=2062

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.