From: Aubrey Li
Date: Tue, 29 Nov 2016 03:31:00 -0000
Subject: Re: kernel function probe overhead
To: "Frank Ch. Eigler"
Cc: systemtap@sourceware.org

On Tue, Nov 29, 2016 at 3:19 AM, Frank Ch. Eigler wrote:
>
> aubreylee wrote:
>
>>> A couple of things come to mind.  The gettimeofday_ns() function uses
>>> a software-estimated real-time clock, which may not be steady enough
>>> for fine-grained measurements like this.
>>
>> Timestamp function: gettimeofday_ns()
>> [...]
>> Timestamp function: get_cycles()
>> [...]
>> Timestamp function: local_clock_ns()
>> [...]
>
> Thanks for testing!  The distribution of those three seems remarkably
> similar - a nice central peak, and a second small peak at 4-8x the
> first.  It suggests we can probably rule out clock problems.
>
>>> The average is pretty low.  Consider printing a histogram of t_end,
>>> or (if using git systemtap) its @variance.
>>>
>>> Another possibility is contention over the t_start[] variable, if
>>> multiple processors are running this region of code and so trigger
>>> the systemtap probes concurrently.  systemtap automatically locks
>>> such global variables, esp. if they contain non-aggregate values
>>> like the scalar timestamps.
>>
>> Do we have an approach to probe the function on CPU0 only?
>
> Not directly; the kernel does not give us such control.  The probe
> could include a
>
>     if (cpu() != 0) next;
>
> at the top of the probe handler body.  That way locks are still taken,
> but released quickly on the other CPUs.

Thanks for the suggestion.
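For reference, the measurement script with that filter added looks
roughly like this (a sketch, not necessarily the exact test2.stp; the
probe points follow the -t output further down, and local_clock_ns()
stands in for whichever timestamp function is under test):

  global t_start, t_end

  probe kernel.function("arch_cpu_idle_enter").call {
    if (cpu() != 0) next               # bail out early on every CPU but 0
    t_start[tid()] = local_clock_ns()
  }

  probe kernel.function("local_touch_nmi").call {
    if (cpu() != 0) next
    if (tid() in t_start) {
      t_end <<< local_clock_ns() - t_start[tid()]   # aggregate the delta
      delete t_start[tid()]
    }
  }

  probe end {
    if (@count(t_end))
      printf("t_end @count=%d @min=%d @max=%d @sum=%d @avg=%d\n",
             @count(t_end), @min(t_end), @max(t_end), @sum(t_end),
             @avg(t_end))
  }

Note that the locks around t_start[] are still taken on every CPU, as
you said; the filter only makes the non-CPU0 handlers release them
quickly.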
Original result:

t_end @count=4800 @min=740 @max=300281 @sum=90589877 @avg=18872

With "if (cpu() != 0) next" added in the probe handler:

t_end @count=161 @min=653 @max=114856 @sum=1780522 @avg=11059

And if I add "maxcpus=1" to the kernel cmdline:

t_end @count=1190 @min=573 @max=14529 @sum=1406437 @avg=1181

The last one is much better, so it looks like we have a scalability
opportunity here.

> Another solution would be to have per-thread global variables.  Since
> the kernel doesn't help us out with this, we'd have to emulate it
> somehow, and for this to be worthwhile, it'd have to work faster than
> ordinary locked/shared array[tid()] accesses.

Is this a suggestion for me?  How would I do this?

>>> Consider using "stap -t FOO.stp" to have the tool print internal
>>> estimates of its own probes' runtimes.  That would include lock
>>> contention, but exclude kernel-side overheads like the code involved
>>> in its kprobes machinery.
>>
>> "-t" option output.
>
> Ah, I note that we don't track occurrences of lock contention, even
> though we could.  Opened
> https://sourceware.org/bugzilla/show_bug.cgi?id=20879

Okay, thanks, I'll keep an eye on this ticket.

>> [...]
>> kernel.function("arch_cpu_idle_enter@arch/x86/kernel/process.c:277").call,
>> (test2.stp:13:1), hits: 2801, cycles: 560min/34290avg/230611max, from:
>> [...]
>> kernel.function("local_touch_nmi@arch/x86/kernel/nmi.c:561").call,
>> (test2.stp:18:1), hits: 2801, cycles: 396min/5118avg/160096max, from:
>> [...]
>
> That inconsistency suggests dramatic contention or caching or some
> other source of jitter.  Having the variance figure in there would be
> useful, though.  Would anyone like to give that a try?  (translate.cxx
> line 2283ish and runtime/linux/common_session_state.h _stp_stat_init
> area).

If there is a patch delivered against git head, I can apply it and
paste the output here.

Thanks,
-Aubrey

> - FChE
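P.S. For the earlier suggestion to print a histogram of t_end or its
@variance, the reporting probe in a sketch like the one above would
look roughly like this (@variance needs a git build of systemtap, per
the note in the thread):

  probe end {
    if (@count(t_end)) {
      print(@hist_log(t_end))          # base-2 log histogram of the deltas
      printf("t_end @avg=%d @variance=%d\n",
             @avg(t_end), @variance(t_end))
    }
  }

That would make the second peak you mentioned directly visible instead
of being folded into the average.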