public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* Script to measure resource usage based on process arguments
@ 2007-01-09  0:55 Mike Mason
  2007-01-09  1:27 ` Mike Mason
  2007-04-18 16:07 ` Frank Ch. Eigler
  0 siblings, 2 replies; 5+ messages in thread
From: Mike Mason @ 2007-01-09  0:55 UTC (permalink / raw)
  To: systemtap

[-- Attachment #1: Type: text/plain, Size: 2903 bytes --]

I wrote the attached script in response to a customer question. They wanted to know if SystemTap could show them which set of java processes were consuming the most resources, based on the process arguments.  They may have several thousand java instances running at one time and those instances can have very long argument strings.  I'm not sure the script gives them exactly what they want, but I thought I'd send this out for comments anyway.

The script lets you to specify process names and strings to look for in their arguments.  It then aggregates the resources used by all processes with those names and  argument strings.  In this case, it tracks cpu time, but you could modify it to track any resource value that's available in task_struct.  It prints the results at the end of an interval.  For example, I ran the script as shown below during a kernel build. 

# ./argtap -i 30 -n gcc -n as -a fs/ -a mm/ -a arch/ -a security/ -a crypto/ -a drivers/ 
======================= Mon Jan  8 15:59:05 2007 ========================
TIME(ms) #PROC  PROCNAME         ARG SEGMENT
1472     33     as               drivers/
975      25     as               fs/
91       35     gcc              drivers/
69       27     gcc              fs/

======================= Mon Jan  8 15:59:35 2007 ========================
TIME(ms) #PROC  PROCNAME         ARG SEGMENT
2174     55     as               fs/
683      17     as               drivers/
155      56     gcc              fs/
44       17     gcc              drivers/

The results show how many gcc and as processes with the specified strings in their arguments ran during a 30 second interval and how much total cpu time they used.

While writing this script I had to figure out how get the arguments from existing processes. That took awhile.  Since the arguments are stored in the processes' address space, we can't just access them whenever we want to.  The usual routines used in /proc to grab another processes' arguments won't work in systemtap because irqs are always disabled in probes, even begin probes. We can't call functions that might sleep in systemtap.  I eventually realized I could access the arguments when the process is "current" in the scheduler.cpu_off probe.  This is okay because the script doesn't care about processes that don't get scheduled.

Note that there's a lot of embedded C in this script, which I don't really like, but I couldn't think of any other way to do what needed to be done.  Also note I had to use a big MAXACTION value to get it to work in some cases.  Given that limitation, I'm beginning to think systemtap scripts should be used more for extracting raw data and data processing should be left to user scripts and programs, especially where large amounts of data are involved.

Anyway, please look this over and give me your comments.  I'm especially interested in ways I could improve it.

Mike Mason


[-- Attachment #2: argtap --]
[-- Type: text/plain, Size: 12018 bytes --]

#!/bin/bash

# Filter options
F_ARGSTR=""; F_ARGS=0  # Filter and group by process arguments
F_PIDSTR=""; F_PID=0   # Filter by process ID
F_NAMSTR=""; F_NAME=0  # Filter by process name
F_UIDSTR=""; F_UID=0   # Filter by user
FILTER=0               # Any filters specified?
ERROR=0

# Print options
P_INTERVAL=5 # default interval between output

DELIM=","

function usage {
	echo "USAGE: argtap [-a argstr]... [-i interval] [-n pname]..."
	echo "              [-p pid]... [-u username]... [-h]"
	echo "    -a argstr    # argument string to look for (default: all)"
	echo "    -i interval  # interval in seconds between printing (default: $P_INTERVAL)"
	echo "    -n pname     # this process name only (default: all)"
	echo "    -p pid       # this process ID only (default: all)"
	echo "    -u username  # this user only (default: all)"
	echo "    -h           # display this help test"
	echo ""
}

# Process options
while getopts a:i:n:p:u:h option; do
	case $option in
	a)      let "F_ARGS++"
	        F_ARGSTR=$OPTARG$DELIM$F_ARGSTR ;;
	i)      P_INTERVAL=$OPTARG ;;
	n)      let "F_NAME++"
	        F_NAMESTR=$OPTARG$DELIM$F_NAMESTR ;;
	p)      let "F_PID++"
	        F_PIDSTR=$OPTARG$DELIM$F_PIDSTR ;;
	u)      uid=`awk -F: '$1 == name {print $3}' name=$OPTARG /etc/passwd`
	        if [[ $uid != "" ]]; then
	            let "F_UID++"
	            F_UIDSTR=$uid$DELIM$F_UIDSTR
	        else
	            echo "ERROR: Unknown user:" $OPTARG
	            let "ERROR++"
	        fi ;;
	h|?|*)  usage
	        exit 1 ;;
	esac
done

if [[ $ERROR > 0 ]]; then
	exit 1
fi

if [[ $F_ARGS > 0 || $F_NAME > 0 || $F_PID > 0 || $F_UID > 0 ]]; then
	FILTER=1
fi

#
# Pass a timezone adjustment value to the stap script
#
TZ=`date "+%z"`
TZ_SIGN=`echo $TZ | cut -c1`
TZ_HOURS=`echo $TZ | cut -c2-3`
TZ_MINS=`echo $TZ | cut -c4-5`
TZ_ADJUST=$TZ_SIGN$((10#$TZ_HOURS*60*60+10#$TZ_MINS*60))

# Start the systemtap script
#
stap -g -DMAXACTION=50000 -DMAXSTRINGLEN=1024 -DMAXMAPENTRIES=2048 -e '
%{
#include <linux/sched.h>
#include <linux/rcupdate.h>

struct task_struct *__pid2task(int pid)
{
	struct task_struct *task = NULL;

	rcu_read_lock();
	task = find_task_by_pid (pid);
	rcu_read_unlock();
	return(task);
}
%}


function rcu_read_lock()
%{
	rcu_read_lock();
%}


function rcu_read_unlock()
%{
	rcu_read_unlock();
%}


function next_task:long(cur_task:long)
%{
        THIS->__retvalue = (long) next_task((struct task_struct *)THIS->cur_task);
%}


function inittask:long()
%{
        THIS->__retvalue = (long) &init_task;
%}


function ltostr:string (l:long)
{
	return sprintf("%d", l)
}


function tokenize:string(input:string, delim:string)
%{
	static char str[MAXSTRINGLEN];
	static char *str_start;
	char *token = NULL;

	if (THIS->input[0]) {
		strlcpy(str, THIS->input, MAXSTRINGLEN);
		str_start = &str[0];
	}
	token = strsep(&str_start, THIS->delim);
	if (token)
		strlcpy (THIS->__retvalue, token, MAXSTRINGLEN);
%}

function strtol:long(str:string)
%{
	THIS->__retvalue = simple_strtol(THIS->str, NULL, 10);
%}


function current_args:string(delimit:string)
{
	return __task_args(task_current(), delimit)
}


function __task_args:string(task:long, delimit:string)
%{
	struct task_struct *task;
	struct mm_struct *mm;
	char buffer[MAXSTRINGLEN];
	int len, i, left = 0;

	task = (struct task_struct *) deref (sizeof(struct task_struct *),
	                                     &(THIS->task));

	// This should only be called for the "current" task. Restriction may
	// be lifted later.
	if (!task || (task != current))
		goto end;

	mm = (struct mm_struct *) deref (sizeof(struct mm_struct *),
	                                 &(task->mm));
	if (!mm)
		goto end;

	len = mm->arg_end - mm->arg_start;
	if (len > MAXSTRINGLEN)
		len = MAXSTRINGLEN;
	left = _stp_copy_from_user (buffer, (char *)mm->arg_start, len);
	len = len - left;  /* left = number of bytes that could not be copied,
	                      silently ignore for now */

	/* Replace NULLs with the caller specified separator */
	for (i = 0; i < len; i++)
		if (buffer[i] == '\0')
			buffer[i] = THIS->delimit[0];
	/* Make sure the string is NULL terminated */
	buffer[len-1] = '\0';

	strlcpy(THIS->__retvalue, buffer, len);

end:
	if (0) {
deref_fault:
		CONTEXT->last_error = "pointer dereference fault";
	}
%}


// args_str()
// Concatenates the argv arguments, separated by a delimter,
// into one string, up to a maximum length of MAXSTRINGLEN.
function args_str:string(args    :long,     # ptr to argv array
                         delimit :string,   # Delimiter to use between
                                            # arguments in returned string,
                                            # useful for later tokenizing
                                            # of the arguments string
                         execname:long)     # Indicates whether to return the
                                            # execname (1st arg) in the
                                            # returned string
%{
	char **argv, *arg;
	int first = 1;

	argv = (char **) deref (sizeof(char **), &(THIS->args));
	if (argv == NULL) goto end;
	arg = (char *) deref (sizeof (char *), &(*argv));

	while (arg) {
		if (!first || (first && THIS->execname)) {
			strlcat (THIS->__retvalue, arg, MAXSTRINGLEN);
			strlcat (THIS->__retvalue, THIS->delimit, MAXSTRINGLEN);
			first = 0;
		}
		arg = (char *) deref (sizeof (char *), &(*++argv));
	}

	if (0) {
deref_fault:
		CONTEXT->last_error = "pointer dereference fault";
	}
end: ;
%}


// args_exec()
// Returns the 1st argument (exec name or path) from argv
function args_exec:string (args:long) # ptr to argv array
%{
	char **argv, *arg;

	argv = (char **) deref (sizeof(char **), &(THIS->args));
	if (argv == NULL) goto end;
	arg = (char *) deref (sizeof (char *), &(*argv));
	if (arg) strlcpy (THIS->__retvalue, arg, MAXSTRINGLEN);

	if (0) {
deref_fault:
		CONTEXT->last_error = "pointer dereference fault";
	}
end: ;
%}


function cmd_from_path:string (path:string)
%{
	char *p  = THIS->path + strlen(THIS->path);
	char sep[2] = "/";

	while ( (*p != *sep) && (p != THIS->path) ) p--;
	if (*p == *sep) p++;
	strlcpy(THIS->__retvalue, p, MAXSTRINGLEN);
%}


function pid_task:long (pid:long)
%{
	THIS->__retvalue = (long) __pid2task((int) THIS->pid);
%}


//
// Returns total cpu time in milliseconds based on pid
//
function pid_cputime:long(pid:long)
%{
	struct task_struct *task = __pid2task((int) THIS->pid);
	THIS->__retvalue = -1;
	task = (struct task_struct *)
	     deref (sizeof(struct task_struct *), &(task));

	if (task) {
		task_lock(task);
		THIS->__retvalue = (long) cputime_to_msecs(task->stime + task->utime);
		task_unlock(task);
	}

	if (0) {
deref_fault:
		CONTEXT->last_error = "pointer dereference fault";
	}
end: ;
%}

//
// Returns total cpu time in milliseconds based on task descriptor
//
function task_cputime:long(task:long)
%{
	struct task_struct *task = (struct task_struct *)
	      deref (sizeof(struct task_struct *), &(THIS->task));

	THIS->__retvalue = -1;

	if (task) {
		task_lock(task);
		THIS->__retvalue = (long) cputime_to_msecs(task->stime + task->utime);
		task_unlock(task);
	}

	if (0) {
deref_fault:
		CONTEXT->last_error = "pointer dereference fault";
	}
end: ;
%}

//
// Determines whether this is a process descriptor for a
// kernel thread (has no user address space).
//
function is_kthread:long (task:long)
%{
	struct task_struct *task;
	THIS->__retvalue = 0;

	task = (struct task_struct *) 
	            deref (sizeof(struct task_struct *), &(THIS->task));
	task_lock(task);
	if (task->mm == NULL)
		THIS->__retvalue = 1;
	task_unlock(task);

	if (0) {
deref_fault:
		CONTEXT->last_error = "pointer dereference fault";
	}
end: ;
%}


global uniq_args_cputime, uniq_args_cnt
global old, new
global initial_pids, current_pids, done_pids
global execpath, execname, cputime, user, args
global f_args, f_names, f_pids, f_uids
global arg_cputime, arg_cputime_cnt

probe begin(-1000)
{
	old = 0
	new = 1

	f_arg_str  = "'$F_ARGSTR'"
	f_name_str = "'$F_NAMESTR'"
	f_pid_str  = "'$F_PIDSTR'"
	f_uid_str  = "'$F_UIDSTR'"

	delim = "'$DELIM'"

	# Argument strings
	if ('$F_ARGS') {
		tok = tokenize(f_arg_str, delim)
		while (tok != "") {
			f_args[tok] = 1
			tok = tokenize("", delim)
		}
	}

	# Process names
	if ('$F_NAME') {
		tok = tokenize(f_name_str, delim)
		while (tok != "") {
			f_names[tok] = 1
			tok = tokenize("", delim)
		}
	}

	# Process IDs
	if ('$F_PID') {
		tok = tokenize(f_pid_str, delim)
		while (tok != "") {
			f_pids[strtol(tok)] = 1
			tok = tokenize("", delim)
		}
	}

	# User IDs
	if ('$F_UID') {
		tok = tokenize(f_uid_str, delim)
		while (tok != "") {
			f_uids[strtol(tok)] = 1
			tok = tokenize("", delim)
		}
	}

	// Grab the existing pids
	itask = inittask()
	rcu_read_lock()
	for (task = next_task(itask); task != itask; task = next_task(task)) {
		// Ignore kernel threads
		if (is_kthread(task)) continue
		p = task_pid(task)
		if (p > 0) {
			// NOTE: process arguments are kept in user address space.
			// All probes, including begin, run with irqs disabled and 
			// cannot sleep, so we cannot access user space for non-current
			// processes at this point. Thus, we wait until a process is
			// "current" in scheduler.cpu_off before grabbing the arguments.
			initial_pids[p] = 1
			user[p] = task_uid(task)
			cputime[p, old] = task_cputime(task)
		}
	}
	rcu_read_unlock()
}


probe scheduler.cpu_off
{
	// If this process existed when the script started, we have
	// get the process arguments when the process is "current".
	p = pid()
	if ([p] in initial_pids) {
		args[p] = current_args(" ")
		execpath[p] = tokenize(args[p], " ")
		execname[p] = cmd_from_path(execpath[p])
		delete initial_pids[p]
		match_filters(p) ? current_pids[p] = 1 : delete_pid(p)
	}
}


probe process.exec
{
	p = pid()
	args[p] = args_str($argv, " ", 1)
	execpath[p] = args_exec($argv)
	execname[p] = cmd_from_path(execpath[p])
	user[p] = uid()
	match_filters(p) ? current_pids[p] = 1 : delete_pid(p)
}


probe process.exit
{
	p = pid()

	if (!([p] in current_pids)) next

	// Save cpu time for exiting process
	cputime[p, new] = pid_cputime(p)

	// Mark for deletion
	done_pids[p] = 1
}

function delete_pid (p:long)
{
	delete current_pids[p]
	delete execname[p]
	delete execpath[p]
	delete user[p]
	delete args[p]
	delete cputime[p, new]
	delete cputime[p, old]
}

function match_filters:long (p:long)
{
	# Check filters
	if ('$FILTER') {
		if ('$F_NAME' && !([execname[p]] in f_names)) return 0
		if ('$F_PID'  && !([p]           in f_pids))  return 0
		if ('$F_UID'  && !([user[p]]     in f_uids))  return 0
		if ('$F_ARGS') {
			foreach ([str] in f_args)
				if (isinstr(args[p], str)) match++
			if (!match) return 0
		}
	}

	return 1
}

probe timer.s('$P_INTERVAL')
{
	foreach ([p] in current_pids) {
		if (!([p] in done_pids))
			cputime[p, new] = pid_cputime(p)

		if ('$F_ARGS') {
			// Argument segments were specified. Group by process name
			// and argument segment.
			foreach ([arg] in f_args)
				if (isinstr(args[p], arg)) {
					arg_cputime[execname[p], arg] += cputime[p, new] - cputime[p, old]
					arg_cputime_cnt[execname[p], arg]++
				}
		} else {
			// Argument segments were NOT specified. Group by process
			// name and entire argument sring. Note the argument string
			// can be quite long is some cases.
			arg_cputime[execname[p], args[p]] += cputime[p, new] - cputime[p, old]
			arg_cputime_cnt[execname[p], args[p]]++
		}
	}

	time = gettimeofday_s() + '$TZ_ADJUST'
	printf("======================= %s ========================\n", ctime(time))
	printf("%-8s %-6s %-16s %-s\n", "TIME(ms)", "#PROC", "PROCNAME", "ARG SEGMENT")
	foreach ([name, arg] in arg_cputime-) 
		printf ("%-8d %-6d %-16s %-s\n", arg_cputime[name, arg],
		        arg_cputime_cnt[name, arg], name, arg)

	printf("\n")

	delete arg_cputime
	delete arg_cputime_cnt

	// Clean up processes that have exited
	foreach ([p] in done_pids) delete_pid(p)
	delete done_pids

	// flip indexes
	tmp = new
	new = old
	old = tmp
}
'

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Script to measure resource usage based on process arguments
  2007-01-09  0:55 Script to measure resource usage based on process arguments Mike Mason
@ 2007-01-09  1:27 ` Mike Mason
  2007-01-16 15:42   ` Frank Ch. Eigler
  2007-04-18 16:07 ` Frank Ch. Eigler
  1 sibling, 1 reply; 5+ messages in thread
From: Mike Mason @ 2007-01-09  1:27 UTC (permalink / raw)
  To: Mike Mason; +Cc: systemtap

Forgot to mention one important thing... I've seen this script crash the system a few times.  The sequence of events that cause the crash are a bit strange.  I start the script, then start a kernel build.  The script and build both run to completion.  Then I start the script again and it crashes. I've attached an example of the console messages.  Do these messages give anybody a clue as to what's happening? 

BTW, I see the crash when I run the script in an xterm in Xvnc.  I don't see the crash when I run the script in a telnet session.  Might be irrelevant, but I thought I'd mention it.

Mike

Console messages
================

Error creating systemtap /proc entries.
Unable to handle kernel paging request at ffffffff8832e1dd RIP: 
 [<ffffffff8832e1dd>]
PGD 203027 PUD 205027 PMD 396e6067 PTE 0
Oops: 0010 [1] SMP 
last sysfs file: /module/scsi_mod/sections/.text
CPU 0 
Modules linked in: ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror dm_mod video sbs i2c_ec button battery asus_acpi ac lp parport_pc parport snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss sg e1000 snd_mixer_oss snd_pcm serio_raw snd_timer ehci_hcd snd uhci_hcd ide_cd i2c_i801 shpchp soundcore snd_page_alloc cdrom i2c_core pcspkr ext3 jbd ahci libata sd_mod scsi_mod
Pid: 2467, comm: Xvnc Not tainted 2.6.18-1.2257.fc5 #1
RIP: 0010:[<ffffffff8832e1dd>]  [<ffffffff8832e1dd>]
RSP: 0000:ffffffff80632f18  EFLAGS: 00010246
RAX: ffff81002d773fd8 RBX: 0000000000000000 RCX: ffff81001aae2378
RDX: 0000000000000000 RSI: 10369bb9244ede90 RDI: 0000000000000000
RBP: ffffffff80671b00 R08: 0000000000b31860 R09: 000000000000000a
R10: 00002aaaaadafa18 R11: 0000000000000004 R12: 0000000000000100
R13: ffffffff8832e1dd R14: 0000000000000027 R15: 0000000000000105
FS:  00002aaaaaacd140(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffff8832e1dd CR3: 000000002e048000 CR4: 00000000000006e0
Process Xvnc (pid: 2467, threadinfo ffff81002d772000, task ffff81003848b810)
Stack:  ffffffff80290ec5 ffffffff80632f20 ffffffff80632f20 ffffffff80287132
 0000000000000001 ffffffff805de110 000000000000000a 0000000000000000
 ffffffff80211cff ffff81002d773f58 0000000000000046 ffffffff80632f98
Call Trace:
Inexact backtrace:
 <IRQ>  [<ffffffff80290ec5>] run_timer_softirq+0x13b/0x1be
 [<ffffffff80287132>] scheduler_tick+0x31c/0x345
 [<ffffffff80211cff>] __do_softirq+0x5e/0xd5
 [<ffffffff8025d9f0>] call_softirq+0x1c/0x28
 [<ffffffff8026aab8>] do_softirq+0x2c/0x99
 [<ffffffff8025d382>] apic_timer_interrupt+0x66/0x6c
 <EOI> 

Code:  Bad RIP value.
RIP  [<ffffffff8832e1dd>]
 RSP <ffffffff80632f18>
CR2: ffffffff8832e1dd
 <1>Unable to handle kernel paging request<3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():1, irqs_disabled():1

Call Trace:
 [<ffffffff80269935>] show_trace+0x34/0x47
 [<ffffffff8026995a>] dump_stack+0x12/0x17
 [<ffffffff8029bb57>] down_read+0x15/0x23
 [<ffffffff8029414e>] blocking_notifier_call_chain+0x13/0x36
 [<ffffffff80214e81>] do_exit+0x20/0x928
 [<ffffffff80265023>] do_page_fault+0x79a/0x815
 [<ffffffff8025d4dd>] error_exit+0x0/0x84
DWARF2 unwinder stuck at error_exit+0x0/0x84
Leftover inexact backtrace:
 <IRQ>  [<ffffffff80290ec5>] run_timer_softirq+0x13b/0x1be
 [<ffffffff80287132>] scheduler_tick+0x31c/0x345
 [<ffffffff80211cff>] __do_softirq+0x5e/0xd5
 [<ffffffff8025d9f0>] call_softirq+0x1c/0x28
 [<ffffffff8026aab8>] do_softirq+0x2c/0x99
 [<ffffffff8025d382>] apic_timer_interrupt+0x66/0x6c
 <EOI> 
Kernel panic - not syncing: Aiee, killing interrupt handler!
  at ffffffff8832e1dd RIP: 
 [<ffffffff8832e1dd>]
PGD 203027 PUD 205027 PMD 396e6067 PTE 0
Oops: 0010 [2] SMP 
last sysfs file: /module/scsi_mod/sections/.text
CPU 1 
Modules linked in: ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror dm_mod video sbs i2c_ec button battery asus_acpi ac lp parport_pc parport snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss sg e1000 snd_mixer_oss snd_pcm serio_raw snd_timer ehci_hcd snd uhci_hcd ide_cd i2c_i801 shpchp soundcore snd_page_alloc cdrom i2c_core pcspkr ext3 jbd ahci libata sd_mod scsi_mod
Pid: 24434, comm: insmod Not tainted 2.6.18-1.2257.fc5 #1
RIP: 0010:[<ffffffff8832e1dd>]  [<ffffffff8832e1dd>]
RSP: 0018:ffff81003fe0bf18  EFLAGS: 00010246
RAX: ffff810004d7bfd8 RBX: 0000000000000000 RCX: ffff81001aae2678
RDX: 0000000000000001 RSI: 10369bb9244ede90 RDI: 0000000000000000
RBP: ffff810037ebc000 R08: 0000000000b31860 R09: 000000000000000a
R10: ffff810001182d90 R11: 0000000000000246 R12: 0000000000000100
R13: ffffffff8832e1dd R14: 0000000000000100 R15: 0000000000000003
FS:  0000000000000000(0000) GS:ffff810037e999c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffff8832e1dd CR3: 0000000006a2b000 CR4: 00000000000006e0
Process insmod (pid: 24434, threadinfo ffff810004d7a000, task ffff810039312080)
Stack:  ffffffff80290ec5 ffff81003fe0bf20 ffff81003fe0bf20 ffffffff80287132
 0000000000000011 ffffffff805de110 000000000000000a 0000000000000001
 ffffffff80211cff ffff810004d7bde8 0000000000000046 ffff81003fe0bf98
Call Trace:
Inexact backtrace:
 <IRQ>  [<ffffffff80290ec5>] run_timer_softirq+0x13b/0x1be
 [<ffffffff80287132>] scheduler_tick+0x31c/0x345
 [<ffffffff80211cff>] __do_softirq+0x5e/0xd5
 [<ffffffff8025d9f0>] call_softirq+0x1c/0x28
 [<ffffffff8026aab8>] do_softirq+0x2c/0x99
 [<ffffffff8025d382>] apic_timer_interrupt+0x66/0x6c
 <EOI>  [<ffffffff80262eac>] _spin_unlock_irq+0xb/0xc
 [<ffffffff8020cf32>] __pagevec_lru_add_active+0xb9/0xdc
 [<ffffffff80239e82>] exit_mmap+0x22/0xee
 [<ffffffff8023c058>] mmput+0x41/0x96
 [<ffffffff802150f4>] do_exit+0x293/0x928
 [<ffffffff8024816e>] cpuset_exit+0x0/0x6c
 [<ffffffff8025c80e>] system_call+0x7e/0x83


Code:  Bad RIP value.
RIP  [<ffffffff8832e1dd>]
 RSP <ffff81003fe0bf18>
CR2: ffffffff8832e1dd
BUG: warning at drivers/char/vt.c:3359/do_unblank_screen() (Not tainted)

Call Trace:
 [<ffffffff80269935>] show_trace+0x34/0x47
 [<ffffffff8026995a>] dump_stack+0x12/0x17
 [<ffffffff8038b5a6>] do_unblank_screen+0x67/0x142
 [<ffffffff8027c523>] bust_spinlocks+0x1c/0x46
 [<ffffffff802634ce>] oops_end+0x15/0x53
 [<ffffffff80264fd3>] do_page_fault+0x74a/0x815
 [<ffffffff8025d4dd>] error_exit+0x0/0x84
DWARF2 unwinder stuck at error_exit+0x0/0x84
Leftover inexact backtrace:
 <IRQ>  [<ffffffff80290ec5>] run_timer_softirq+0x13b/0x1be
 [<ffffffff80287132>] scheduler_tick+0x31c/0x345
 [<ffffffff80211cff>] __do_softirq+0x5e/0xd5
 [<ffffffff8025d9f0>] call_softirq+0x1c/0x28
 [<ffffffff8026aab8>] do_softirq+0x2c/0x99
 [<ffffffff8025d382>] apic_timer_interrupt+0x66/0x6c
 <EOI>  [<ffffffff80262eac>] _spin_unlock_irq+0xb/0xc
 [<ffffffff8020cf32>] __pagevec_lru_add_active+0xb9/0xdc
 [<ffffffff80239e82>] exit_mmap+0x22/0xee
 [<ffffffff8023c058>] mmput+0x41/0x96
 [<ffffffff802150f4>] do_exit+0x293/0x928
 [<ffffffff8024816e>] cpuset_exit+0x0/0x6c
 [<ffffffff8025c80e>] system_call+0x7e/0x83


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Script to measure resource usage based on process arguments
  2007-01-09  1:27 ` Mike Mason
@ 2007-01-16 15:42   ` Frank Ch. Eigler
  0 siblings, 0 replies; 5+ messages in thread
From: Frank Ch. Eigler @ 2007-01-16 15:42 UTC (permalink / raw)
  To: Mike Mason; +Cc: systemtap


mmlnx wrote:

> Forgot to mention one important thing... I've seen this script crash
> the system a few times.  The sequence of events that cause the crash
> are a bit strange.  I start the script, then start a kernel build.
> The script and build both run to completion.  Then I start the script
> again and it crashes. [...]

Interesting.  It's a little reminiscent of an old bug that did not
correctly unregister all kprobes under some failure exit conditions.

> Unable to handle kernel paging request at ffffffff8832e1dd RIP:
> [<ffffffff8832e1dd>]
> PGD 203027 PUD 205027 PMD 396e6067 PTE 0
> Oops: 0010 [1] SMP last sysfs file: /module/scsi_mod/sections/.text
> CPU 0 Modules linked in: ipv6 autofs4 hidp rfcomm l2cap bluetooth
> sunrpc dm_mirror dm_mod video sbs i2c_ec button battery asus_acpi ac
> lp parport_pc parport snd_hda_intel snd_hda_codec snd_seq_dummy
> snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss sg
> e1000 snd_mixer_oss snd_pcm serio_raw snd_timer ehci_hcd snd uhci_hcd
> ide_cd i2c_i801 shpchp soundcore snd_page_alloc cdrom i2c_core pcspkr
> ext3 jbd ahci libata sd_mod scsi_mod

Note that no stap_* module is listed here as loaded.  This could mean
that the problem occurred during initialization of the new copy of the
module. From the "bad RIP value" / fault traceback, it looks as if
there was still some kind of timer task left running in the system by
a prior systemtap run.  That module was then unloaded, which unmapped
the module executable region.  It would help if you followed the steps
in the HowToReportProblems wiki page, specifically to identify
probable stap module loading addresses.

We use timer type tasks in two contexts: probes on timer.ms and like,
and an I/O related widget in the runtime.  We should review both bits
of code to ensure that we always start up and clean up carefully.  It
is probably also helpful to have systemtap emit a few DEBUG-level
printk's during setup to note vital statistics about systemtap
modules: base addresses, number of probes, memory used, that sort of
thing.  Any volunteers?


- FChE

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Script to measure resource usage based on process arguments
  2007-01-09  0:55 Script to measure resource usage based on process arguments Mike Mason
  2007-01-09  1:27 ` Mike Mason
@ 2007-04-18 16:07 ` Frank Ch. Eigler
  2007-04-18 16:21   ` Mike Mason
  1 sibling, 1 reply; 5+ messages in thread
From: Frank Ch. Eigler @ 2007-04-18 16:07 UTC (permalink / raw)
  To: Mike Mason; +Cc: systemtap


Mike Mason <mmlnx@us.ibm.com> writes:

> I wrote the attached script [long ago!] in response to a customer
> question. They wanted to know if SystemTap could show them which set
> of java processes were consuming the most resources, based on the
> process arguments.  [...]

Reading over this old script again, I'd take away the following
observations:

- process argument fetching routines would be welcome the tapset library
- a syntax to initialize a systemtap array would be useful - both at
  compile time ("global array[maxsize] = [idx]=value, [idx2]=value2")
  and at run time (module option string "[idx]=value, ...")
- some systemtap policies/libraries have become more favourable to this
  kind of script since January.
- a new script that exploits these hypothetical features would be much
  more compact and may still make the shell script wrapping unnecessary


- FChE

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Script to measure resource usage based on process arguments
  2007-04-18 16:07 ` Frank Ch. Eigler
@ 2007-04-18 16:21   ` Mike Mason
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Mason @ 2007-04-18 16:21 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Agreed.  I need to revisit this script in light of the changes since it was written.  It's on my todo list.  

Thanks,
Mike

Frank Ch. Eigler wrote:
> Mike Mason <mmlnx@us.ibm.com> writes:
> 
>> I wrote the attached script [long ago!] in response to a customer
>> question. They wanted to know if SystemTap could show them which set
>> of java processes were consuming the most resources, based on the
>> process arguments.  [...]
> 
> Reading over this old script again, I'd take away the following
> observations:
> 
> - process argument fetching routines would be welcome the tapset library
> - a syntax to initialize a systemtap array would be useful - both at
>   compile time ("global array[maxsize] = [idx]=value, [idx2]=value2")
>   and at run time (module option string "[idx]=value, ...")
> - some systemtap policies/libraries have become more favourable to this
>   kind of script since January.
> - a new script that exploits these hypothetical features would be much
>   more compact and may still make the shell script wrapping unnecessary
> 
> 
> - FChE

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-04-18 16:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-01-09  0:55 Script to measure resource usage based on process arguments Mike Mason
2007-01-09  1:27 ` Mike Mason
2007-01-16 15:42   ` Frank Ch. Eigler
2007-04-18 16:07 ` Frank Ch. Eigler
2007-04-18 16:21   ` Mike Mason

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).