public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* Simplifying examples by improving vfs tapset
@ 2008-10-27 14:54 William Cohen
  2008-10-28 15:40 ` Jeff Moyer
  0 siblings, 1 reply; 4+ messages in thread
From: William Cohen @ 2008-10-27 14:54 UTC (permalink / raw)
  To: SystemTAP

[-- Attachment #1: Type: text/plain, Size: 1165 bytes --]

I am looking through the examples being used in the SystemTap
beginners guide and trying to streamline them a bit.  One
simplification is the minimize the use of raw kernel.function() and
module().function() probes by using tapset aliases instead.  Probing
the raw C functions is not so meaningful for average users. Plus they
don't know what arguments might be available at those functions. The
tapsets should minimize raw functions. When looking to convert some of
the probes to use the vfs tapsets, I noticed that a number of the vfs
tapsets don't have dev or devname defined, e.g. vfs.read{.return} and
vfs.write{.return}. This result in ugly code such as in disktop.stp
probe handlers:

		dev = __file_dev($file)
		devname = __find_bdevname(dev,__file_bdev($file))


The dev and devname should be consistently available for vfs
tapset. This would eliminate the need for internal systemtap tapset
functions to be used in the probe handlers and would make examples
such as disktop.stp simpler. There appears to be ugliness in
systemtap.examples/io/iostat-scsi.stp.

Attached is the suggested patch to for the vfs.stp stapset and the disktop.stp 
example.

-Will

[-- Attachment #2: vfs.diff --]
[-- Type: text/x-patch, Size: 3316 bytes --]

diff --git a/tapset/vfs.stp b/tapset/vfs.stp
index 7f2312d..2ac8256 100644
--- a/tapset/vfs.stp
+++ b/tapset/vfs.stp
@@ -800,6 +800,8 @@ probe vfs.read = kernel.function ("vfs_read")
 	pos = $pos
 	buf = $buf
 	bytes_to_read = $count
+	dev = __file_dev($file)
+	devname = __find_bdevname(dev, __file_bdev($file))
 }
 
 probe vfs.read.return = kernel.function ("vfs_read").return
@@ -808,6 +810,8 @@ probe vfs.read.return = kernel.function ("vfs_read").return
 	pos = $pos
 	buf = $buf
 	bytes_to_read = $count
+	dev = __file_dev($file)
+	devname = __find_bdevname(dev, __file_bdev($file))
 
 	ret = $return
 	bytes_read = ret > 0 ? ret : 0
@@ -844,6 +848,8 @@ probe vfs.write = kernel.function ("vfs_write")
 	pos = $pos
 	buf = $buf
 	bytes_to_write = $count
+	dev = __file_dev($file)
+	devname = __find_bdevname(dev, __file_bdev($file))
 }
 
 probe vfs.write.return = kernel.function ("vfs_write").return
@@ -852,6 +858,8 @@ probe vfs.write.return = kernel.function ("vfs_write").return
 	pos = $pos
 	buf = $buf
 	bytes_to_write = $count
+	dev = __file_dev($file)
+	devname = __find_bdevname(dev, __file_bdev($file))
 
 	ret = $return
 	bytes_written = ret > 0 ? ret : 0
diff --git a/testsuite/systemtap.examples/io/disktop.stp b/testsuite/systemtap.examples/io/disktop.stp
index e2c4fc3..20462f0 100755
--- a/testsuite/systemtap.examples/io/disktop.stp
+++ b/testsuite/systemtap.examples/io/disktop.stp
@@ -16,9 +16,6 @@ global read_bytes,write_bytes
 
 probe vfs.read.return {
 	if ($return>0) {
-		dev = __file_dev($file)
-		devname = __find_bdevname(dev,__file_bdev($file))
-
 		if (devname!="N/A") {/*skip read from cache*/
 			io_stat[pid(),execname(),uid(),ppid(),"R"] += $return
 			device[pid(),execname(),uid(),ppid(),"R"] = devname
@@ -29,9 +26,6 @@ probe vfs.read.return {
 
 probe vfs.write.return {
 	if ($return>0) {
-		dev = __file_dev($file)
-		devname = __find_bdevname(dev,__file_bdev($file))
-
 		if (devname!="N/A") { /*skip update cache*/
 			io_stat[pid(),execname(),uid(),ppid(),"W"] += $return
 			device[pid(),execname(),uid(),ppid(),"W"] = devname
@@ -43,16 +37,21 @@ probe vfs.write.return {
 probe timer.ms(5000) {
 	/* skip non-read/write disk */
 	if (read_bytes+write_bytes) {
-
-		printf("\n%-25s, %-8s%4dKb/sec, %-7s%6dKb, %-7s%6dKb\n\n",ctime(gettimeofday_s()),"Average:",
-      		((read_bytes+write_bytes)/1024)/5,"Read:",read_bytes/1024,"Write:",write_bytes/1024)
+		printf("\n%-25s, %-8s%4dKb/sec, %-7s%6dKb, %-7s%6dKb\n\n",
+				 ctime(gettimeofday_s()),"Average:",
+      		((read_bytes+write_bytes)/1024)/5,"Read:",
+				read_bytes/1024,"Write:",write_bytes/1024)
 
 		/* print header */
-		printf("%8s %8s %8s %25s %8s %4s %12s\n","UID","PID","PPID","CMD","DEVICE","T","BYTES")
+		printf("%8s %8s %8s %25s %8s %4s %12s\n",
+			    "UID","PID","PPID","CMD","DEVICE","T","BYTES")
 	}
 	/* print top ten I/O */
 	foreach ([process,cmd,userid,parent,action] in io_stat- limit 10)
-	  printf("%8d %8d %8d %25s %8s %4s %12d\n",userid,process,parent,cmd,device[process,cmd,userid,parent,action],action,io_stat[process,cmd,userid,parent,action])
+	  printf("%8d %8d %8d %25s %8s %4s %12d\n",
+	  	      userid,process,parent,cmd,
+		      device[process,cmd,userid,parent,action],
+		      action,io_stat[process,cmd,userid,parent,action])
 
 	/* clear data */
 	delete io_stat

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Simplifying examples by improving vfs tapset
  2008-10-27 14:54 Simplifying examples by improving vfs tapset William Cohen
@ 2008-10-28 15:40 ` Jeff Moyer
  2008-11-10 17:58   ` William Cohen
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Moyer @ 2008-10-28 15:40 UTC (permalink / raw)
  To: systemtap

William Cohen <wcohen@redhat.com> writes:

> I am looking through the examples being used in the SystemTap
> beginners guide and trying to streamline them a bit.  One
> simplification is the minimize the use of raw kernel.function() and
> module().function() probes by using tapset aliases instead.  Probing
> the raw C functions is not so meaningful for average users. Plus they
> don't know what arguments might be available at those functions. The
> tapsets should minimize raw functions. When looking to convert some of
> the probes to use the vfs tapsets, I noticed that a number of the vfs
> tapsets don't have dev or devname defined, e.g. vfs.read{.return} and
> vfs.write{.return}. This result in ugly code such as in disktop.stp
> probe handlers:
>
> 		dev = __file_dev($file)
> 		devname = __find_bdevname(dev,__file_bdev($file))
>
>
> The dev and devname should be consistently available for vfs
> tapset. This would eliminate the need for internal systemtap tapset
> functions to be used in the probe handlers and would make examples
> such as disktop.stp simpler. There appears to be ugliness in
> systemtap.examples/io/iostat-scsi.stp.
>
> Attached is the suggested patch to for the vfs.stp stapset and the
> disktop.stp example.
>
> -Will
> diff --git a/tapset/vfs.stp b/tapset/vfs.stp
> index 7f2312d..2ac8256 100644
> --- a/tapset/vfs.stp
> +++ b/tapset/vfs.stp
> @@ -800,6 +800,8 @@ probe vfs.read = kernel.function ("vfs_read")
>  	pos = $pos
>  	buf = $buf
>  	bytes_to_read = $count
> +	dev = __file_dev($file)
> +	devname = __find_bdevname(dev, __file_bdev($file))
>  }
>  
>  probe vfs.read.return = kernel.function ("vfs_read").return
> @@ -808,6 +810,8 @@ probe vfs.read.return = kernel.function ("vfs_read").return
>  	pos = $pos
>  	buf = $buf
>  	bytes_to_read = $count
> +	dev = __file_dev($file)
> +	devname = __find_bdevname(dev, __file_bdev($file))
>  
>  	ret = $return
>  	bytes_read = ret > 0 ? ret : 0
> @@ -844,6 +848,8 @@ probe vfs.write = kernel.function ("vfs_write")
>  	pos = $pos
>  	buf = $buf
>  	bytes_to_write = $count
> +	dev = __file_dev($file)
> +	devname = __find_bdevname(dev, __file_bdev($file))
>  }
>  
>  probe vfs.write.return = kernel.function ("vfs_write").return
> @@ -852,6 +858,8 @@ probe vfs.write.return = kernel.function ("vfs_write").return
>  	pos = $pos
>  	buf = $buf
>  	bytes_to_write = $count
> +	dev = __file_dev($file)
> +	devname = __find_bdevname(dev, __file_bdev($file))
>  
>  	ret = $return
>  	bytes_written = ret > 0 ? ret : 0

What about readv and writev?  I had modified my local copy of
disktop.stp to take these into account (hooking into do_readv_writev).
I will post those updates if you like, once I figure out why I ran into
a problem with the array assignments.  (I had to read from the io_stat
array before writing to it, otherwise the written values didn't show
up.)

One other improvement that could be made to the script is AIO support.

Cheers,

Jeff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Simplifying examples by improving vfs tapset
  2008-10-28 15:40 ` Jeff Moyer
@ 2008-11-10 17:58   ` William Cohen
  2008-11-10 18:23     ` Jeff Moyer
  0 siblings, 1 reply; 4+ messages in thread
From: William Cohen @ 2008-11-10 17:58 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: systemtap

Jeff Moyer wrote:

> What about readv and writev?  I had modified my local copy of
> disktop.stp to take these into account (hooking into do_readv_writev).
> I will post those updates if you like, once I figure out why I ran into
> a problem with the array assignments.  (I had to read from the io_stat
> array before writing to it, otherwise the written values didn't show
> up.)

I didn't modify the vfs.readv and vfs.writev in the patch. However, it looks 
like the same dev and ino variables could be made available for vfs_writev and 
vfs_readv.

> One other improvement that could be made to the script is AIO support.

There is an example that examines rescheduling during AIO operations:

http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=blob;f=testsuite/systemtap.examples/io/io_submit.stp;

What additional support do you see needed for AIO?

-Will

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Simplifying examples by improving vfs tapset
  2008-11-10 17:58   ` William Cohen
@ 2008-11-10 18:23     ` Jeff Moyer
  0 siblings, 0 replies; 4+ messages in thread
From: Jeff Moyer @ 2008-11-10 18:23 UTC (permalink / raw)
  To: William Cohen; +Cc: systemtap

William Cohen <wcohen@redhat.com> writes:

> Jeff Moyer wrote:
>
>> What about readv and writev?  I had modified my local copy of
>> disktop.stp to take these into account (hooking into do_readv_writev).
>> I will post those updates if you like, once I figure out why I ran into
>> a problem with the array assignments.  (I had to read from the io_stat
>> array before writing to it, otherwise the written values didn't show
>> up.)
>
> I didn't modify the vfs.readv and vfs.writev in the patch. However, it
> looks like the same dev and ino variables could be made available for
> vfs_writev and vfs_readv.
>
>> One other improvement that could be made to the script is AIO support.
>
> There is an example that examines rescheduling during AIO operations:
>
> http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=blob;f=testsuite/systemtap.examples/io/io_submit.stp;
>
> What additional support do you see needed for AIO?

Perhaps my message was off-topic, as you were focusing on the tapset and
I was focusing on the disktop.stp script.  Sorry if this caused
confusion.  Anyway, below are answers to your questions.

The disktop.stp script looks at the I/O sizes completed on behalf of a
process in order to get a good idea of which processes are issuing the
lion's share of I/O on the system.  I think the easiest way to cover the
AIO case is to instrument fs/aio.c:read_events, since it will be called
in the context of the process issuing the I/O (though not necessarily
the same thread).

This is very different from io_submit.stp, which tries to determine the
most frequent causes of scheduling inside of io_submit (a call that
processes expect to complete relatively quickly).

Cheers,

Jeff

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-11-10 18:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-27 14:54 Simplifying examples by improving vfs tapset William Cohen
2008-10-28 15:40 ` Jeff Moyer
2008-11-10 17:58   ` William Cohen
2008-11-10 18:23     ` Jeff Moyer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).