From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21602 invoked by alias); 25 Sep 2009 22:52:49 -0000 Received: (qmail 21595 invoked by uid 22791); 25 Sep 2009 22:52:48 -0000 X-SWARE-Spam-Status: No, hits=-0.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_63,SPF_HELO_PASS,SPF_PASS X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 25 Sep 2009 22:52:41 +0000 Received: from int-mx05.intmail.prod.int.phx2.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.18]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n8PMqefp005865 for ; Fri, 25 Sep 2009 18:52:40 -0400 Received: from [10.3.224.244] (vpn-224-244.phx2.redhat.com [10.3.224.244]) by int-mx05.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id n8PMqdec003562; Fri, 25 Sep 2009 18:52:40 -0400 Message-ID: <4ABD49B7.60105@redhat.com> Date: Fri, 25 Sep 2009 22:52:00 -0000 From: Josh Stone User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.4pre) Gecko/20090922 Fedora/3.0-2.7.b4.fc11 Lightning/1.0pre Thunderbird/3.0b4 MIME-Version: 1.0 To: Kiran CC: systemtap@sources.redhat.com Subject: Re: [PATCH v2] Scheduler Tapset based on kernel tracepoints References: <1253715217.5193.115.camel@kiran-laptop> In-Reply-To: <1253715217.5193.115.camel@kiran-laptop> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-IsSubscribed: yes Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Post: List-Help: , Sender: systemtap-owner@sourceware.org X-SW-Source: 2009-q3/txt/msg00854.txt.bz2 Hi, On 09/23/2009 07:13 AM, Kiran wrote: > +probe scheduler.ctxswitch.tp = kernel.trace("sched_switch") These .tp and .kp variants are an internal detail -- please put them under __scheduler so they are in a more private namespace from what's exposed to the user. > +/** > + * probe scheduler.migrate_task - Traces the migration of the tasks across cpus by the scheduler. > + * @pid: pid of the task being migrated. > + * @priority: priority of the task being migrated. > + * @original_cpu: the original cpu > + * @destination_cpu: the destination cpu > + */ There is already a "scheduler.migrate" defined, so please merge with that instead of creating a new one. > +/** > + * probe scheduler.process_exit - Fires when a process exits > + * @pid: pid of the process exiting > + * @priority: priority of the process exiting > + */ This overlaps with the less well-known "kprocess.exit". I think in this case we can make the kprocess one point to yours. > +/** > + * probe scheduler.process_fork - Probes the tracepoint for forking a process > + * @parent_pid: PID of the parent process > + * @child_pid: PID of the child process > + */ The "kprocess.create" is also similar to this. > +/** > + * probe scheduler.signal_send - Probes the tracepoint for sending a signal > + * @pid: pid of the process sending signal > + * @signal_number: signal number > + */ Here we have the whole signal.stp tapset, which includes a "signal.send" probe point. It's not easy to decide how to resolve the overlaps in different tapsets, but please consider it. At a minimum, they should leverage each other so they're all using the best available probe points. > diff -Naur systemtap-orig/testsuite/systemtap.examples/profiling/sched_switch.stp systemtap/testsuite/systemtap.examples/profiling/sched_switch.stp > --- systemtap-orig/testsuite/systemtap.examples/profiling/sched_switch.stp 1969-12-31 19:00:00.000000000 -0500 > +++ systemtap/testsuite/systemtap.examples/profiling/sched_switch.stp 2009-09-22 02:29:16.000000000 -0400 > [...] > +probe scheduler.wakeup > +{ > + pids[task_pid]++ > + processes[task_pid] = $p; > + prev[task_pid] = task_current() > + > +} If this is just so you can print the "+" line of who wokeup whom, why not print that right away? Then you don't need those arrays at all. Saving task_struct pointers is messy business. > +probe scheduler.ctxswitch > +{ > + tid = next_tid > + tid1 = prev_tid > + state = prev_state > + state1 = next_state Why copy these values? The prev/next variable names already reflect more meaning. > + > + %( $# == 2 %? > + > + if(@1 == "pid") > + if (tid != $2 && tid1 != $2) > + next > + if(@1 == "name") > + if (task_execname(task_current()) != @2 && task_execname($next) != @2) > + next Note that task_execname(task_current()) is the same as simply execname(). But anyway, your tapset provides prev_task_name and next_task_name, so just use those. > + > + foreach (name in pids-) { > + if ((@1 == "pid" && (name == $2 || task_pid(prev[name]) == $2)) || > + (@1 == "name" && (task_execname(prev[name]) == @2 || task_execname(processes[name]) == @2))) > + printf("%s\t\t%d\t%d\t%d\t%d:%d:%s + %d:%d:%s %s\n", > + task_execname(prev[name]), task_pid(prev[name]), task_cpu(processes[name]), gettimeofday_ns(), > + task_pid(prev[name]), task_prio(prev[name]), state_calc(task_state(prev[name])), > + task_pid(processes[name]), task_prio(processes[name]), state_calc(task_state(processes[name])), > + task_execname(processes[name])) > + } %) Hmm, "name" is not a name, which is confusing. But if you're looking for the task which woke this one up, wouldn't that be prev[next_pid]? (That should really be indexed by the tid, actually.) Or, as I said above, it may be easier to print this right in the wakeup probe. > + > + old_cpu = task_cpu_old[tid] > + printf("%s\t\t%d\t%d\t%d\t%d:%d:%s ==> %d:%d:%s %s\n",task_execname(task_current()),tid1, > + old_cpu,gettimeofday_ns(),tid1,task_prio(task_current()),state_calc(state),next_pid, > + next_prio,state_calc(next_state),next_task_name ) > + task_cpu_old[next_tid] = cpu() > +} I think task_current() is probably always the same as prev_task, but it's probably best to use the latter. It would also be helpful if you spaced this out so we can better see what's being printed. I can see two "tid1"s in there, which is a bit weird. Thanks, Josh