* Using the qemu tracepoints with SystemTap
@ 2011-09-12 15:33 William Cohen
  2011-09-13 10:04 ` [Qemu-devel] " Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: William Cohen @ 2011-09-12 15:33 UTC (permalink / raw)
  To: qemu-devel, SystemTAP

[-- Attachment #1: Type: text/plain, Size: 1291 bytes --]

Hi All,

The RHEL-6 version of qemu-kvm makes its tracepoints available to SystemTap. I have been working on useful examples for the SystemTap tracepoints in qemu. There don't seem to be many examples showing the utility of the tracepoints in diagnosing problems. However, I came across the following blog entry that had several examples:

http://blog.vmsplice.net/2011/03/how-to-write-trace-analysis-scripts-for.html

I reimplemented the VirtqueueRequestTracker example from the blog in SystemTap (the attached virtqueueleaks.stp). I can run it on RHEL-6's qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64 and get output like the following. When the script is stopped, it prints the pid and the address of each elem that leaked:

$ stap virtqueueleaks.stp 
^C
     pid     elem
   19503  1c4af28
   19503  1c56f88
   19503  1c62fe8
   19503  1c6f048
   19503  1c7b0a8
   19503  1c87108
   19503  1c93168
...

I am not that familiar with the internals of qemu. The script seems to indicate that qemu is leaking, but is that really the case?  If there are resource leaks, what output would help debug them? What enhancements could be made to this script to provide more useful information?

Are there other examples of qemu probing people would like to see?

-Will




[-- Attachment #2: virtqueueleaks.stp --]
[-- Type: text/plain, Size: 425 bytes --]

# virtqueueleaks.stp
#
# virtqueueleaks.stp is based on the VirtqueueRequestTracker from:
# http://blog.vmsplice.net/2011/03/how-to-write-trace-analysis-scripts-for.html

global elems
probe qemu.kvm.virtqueue_pop { elems[pid(),elem] = elem }    # element handed to qemu
probe qemu.kvm.virtqueue_fill { delete elems[pid(),elem] }   # element returned to the guest

probe end
{
  printf("\n%8s %8s\n", "pid", "elem")
  foreach([p+, elem] in elems) {
    printf("%8d %8x\n", p, elem)
  }
}


* Re: [Qemu-devel] Using the qemu tracepoints with SystemTap
  2011-09-12 15:33 Using the qemu tracepoints with SystemTap William Cohen
@ 2011-09-13 10:04 ` Stefan Hajnoczi
  2011-09-13 16:11   ` William Cohen
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2011-09-13 10:04 UTC (permalink / raw)
  To: William Cohen; +Cc: qemu-devel, SystemTAP

On Mon, Sep 12, 2011 at 4:33 PM, William Cohen <wcohen@redhat.com> wrote:
> The RHEL-6 version of qemu-kvm makes its tracepoints available to SystemTap. I have been working on useful examples for the SystemTap tracepoints in qemu. There don't seem to be many examples showing the utility of the tracepoints in diagnosing problems. However, I came across the following blog entry that had several examples:
>
> http://blog.vmsplice.net/2011/03/how-to-write-trace-analysis-scripts-for.html
>
> I reimplemented the VirtqueueRequestTracker example from the blog in SystemTap (the attached virtqueueleaks.stp). I can run it on RHEL-6's qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64 and get output like the following. When the script is stopped, it prints the pid and the address of each elem that leaked:
>
> $ stap virtqueueleaks.stp
> ^C
>     pid     elem
>   19503  1c4af28
>   19503  1c56f88
>   19503  1c62fe8
>   19503  1c6f048
>   19503  1c7b0a8
>   19503  1c87108
>   19503  1c93168
> ...
>
> I am not that familiar with the internals of qemu. The script seems to indicate that qemu is leaking, but is that really the case?  If there are resource leaks, what output would help debug them? What enhancements could be made to this script to provide more useful information?

Leak tracing always has corner cases :).

With virtio-blk this would indicate a leak because it uses a
request-response model where the guest initiates I/O and the host
responds.  A guest that cleanly shuts down before you exit your
SystemTap script should not leak requests for virtio-blk.

With virtio-net the guest actually hands the host receive buffers, and
they are held until we can receive packets into them and return them
to the guest.  We don't have a virtio_reset trace event, so we're not
accounting for clean shutdown (the guest driver resets the device to
clear all virtqueues).

I am submitting a patch to add virtio_reset() tracing.  This will
allow the script to delete all elements belonging to this virtio
device.
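
Until that lands, a coarse stopgap for virtqueueleaks.stp would be to
drop everything whenever any reset fires.  The probe name below is a
guess at what the new tracepoint would be called, so it is marked
optional:

# hypothetical name for the pending tracepoint; '?' tolerates its absence
probe qemu.kvm.virtio_reset ? { delete elems }  # coarse: forgets elements for all devices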

> Are there other examples of qemu probing people would like to see?

The malloc/realloc/memalign/vmalloc/free/vfree trace events can be
used for a few things:
 * Finding memory leaks.
 * Finding malloc/vfree or vmalloc/free mismatches.  The rules are:
malloc/realloc need free, memalign/vmalloc need vfree.  They cannot be
mixed.
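
A minimal sketch of a mismatch detector, assuming each of these events
exposes the pointer as an argument named ptr (unverified here) and
treating realloc like a fresh allocation for simplicity; the '?' suffix
keeps stap going when a probe is absent from a given build, and a large
MAXMAPENTRIES may be needed in practice:

global kind
probe qemu.kvm.qemu_malloc, qemu.kvm.qemu_realloc ? { kind[pid(), ptr] = "malloc" }
probe qemu.kvm.qemu_memalign ?, qemu.kvm.qemu_vmalloc ? { kind[pid(), ptr] = "vmalloc" }
probe qemu.kvm.qemu_free ? {
  if (kind[pid(), ptr] == "vmalloc")
    printf("free() of memalign/vmalloc'd 0x%x in pid %d\n", ptr, pid())
  delete kind[pid(), ptr]
}
probe qemu.kvm.qemu_vfree ? {
  if (kind[pid(), ptr] == "malloc")
    printf("vfree() of malloc/realloc'd 0x%x in pid %d\n", ptr, pid())
  delete kind[pid(), ptr]
}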

Stefan


* Re: [Qemu-devel] Using the qemu tracepoints with SystemTap
  2011-09-13 10:04 ` [Qemu-devel] " Stefan Hajnoczi
@ 2011-09-13 16:11   ` William Cohen
  2011-09-13 19:38     ` William Cohen
  2011-09-14 15:33     ` Stefan Hajnoczi
  0 siblings, 2 replies; 9+ messages in thread
From: William Cohen @ 2011-09-13 16:11 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, SystemTAP

On 09/13/2011 06:03 AM, Stefan Hajnoczi wrote:
> On Mon, Sep 12, 2011 at 4:33 PM, William Cohen <wcohen@redhat.com> wrote:
>> The RHEL-6 version of qemu-kvm makes its tracepoints available to SystemTap. I have been working on useful examples for the SystemTap tracepoints in qemu. There don't seem to be many examples showing the utility of the tracepoints in diagnosing problems. However, I came across the following blog entry that had several examples:
>>
>> http://blog.vmsplice.net/2011/03/how-to-write-trace-analysis-scripts-for.html
>>
>> I reimplemented the VirtqueueRequestTracker example from the blog in SystemTap (the attached virtqueueleaks.stp). I can run it on RHEL-6's qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64 and get output like the following. When the script is stopped, it prints the pid and the address of each elem that leaked:
>>
>> $ stap virtqueueleaks.stp
>> ^C
>>     pid     elem
>>   19503  1c4af28
>>   19503  1c56f88
>>   19503  1c62fe8
>>   19503  1c6f048
>>   19503  1c7b0a8
>>   19503  1c87108
>>   19503  1c93168
>> ...
>>
>> I am not that familiar with the internals of qemu. The script seems to indicate that qemu is leaking, but is that really the case?  If there are resource leaks, what output would help debug them? What enhancements could be made to this script to provide more useful information?
> 

Hi Stefan,

Thanks for the comments.

> Leak tracing always has corner cases :).
> 
> With virtio-blk this would indicate a leak because it uses a
> request-response model where the guest initiates I/O and the host
> responds.  A guest that cleanly shuts down before you exit your
> SystemTap script should not leak requests for virtio-blk.

I stopped the systemtap script while the guest vm was still running. So when the guest vm cleanly shuts down there should be a series of virtqueue_fill operations that will remove those elements?

Qemu uses a thread for each virtual processor, but a single thread to handle all IO. It seems like that might be a possible bottleneck. What would be the path of an io event from guest to host and back to the guest? Is there something that a script could do to gauge the delay due to the qemu io thread handling multiple processors?
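
For instance, would histogramming the time from virtqueue_pop to the
matching virtqueue_fill be a reasonable first proxy for that delay? A
rough sketch reusing the probes from virtqueueleaks.stp:

global start, lat
probe qemu.kvm.virtqueue_pop { start[pid(), elem] = gettimeofday_us() }
probe qemu.kvm.virtqueue_fill {
  if ([pid(), elem] in start) {
    lat <<< gettimeofday_us() - start[pid(), elem]   # us from pop to fill
    delete start[pid(), elem]
  }
}
probe end { if (@count(lat)) print(@hist_log(lat)) }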

> 
> With virtio-net the guest actually hands the host receive buffers, and
> they are held until we can receive packets into them and return them
> to the guest.  We don't have a virtio_reset trace event, so we're not
> accounting for clean shutdown (the guest driver resets the device to
> clear all virtqueues).
> 
> I am submitting a patch to add virtio_reset() tracing.  This will
> allow the script to delete all elements belonging to this virtio
> device.
> 
>> Are there other examples of qemu probing people would like to see?
> 
> The malloc/realloc/memalign/vmalloc/free/vfree trace events can be
> used for a few things:
>  * Finding memory leaks.
>  * Finding malloc/vfree or vmalloc/free mismatches.  The rules are:
> malloc/realloc need free, memalign/vmalloc need vfree.  They cannot be
> mixed.
> 
> Stefan

As a quick and simple experiment to see how often various probes are getting hit, I used the following script on RHEL-6 (the probe points are named a bit differently on Fedora):

global counts
# tally every firing of each qemu probe; '?' tolerates probes missing from a build
probe qemu.*.*? { counts[pn()]++ }
probe end { foreach (n+ in counts) printf("%s = %d\n", n, counts[n]) }

For starting up a fedora 14 guest virtual machine and shutting it down I got the following output:

$  stap ~/research/profiling/examples/qemu_count.s 
^Cqemu.kvm.balloon_event = 1
qemu.kvm.bdrv_aio_multiwrite = 155
qemu.kvm.bdrv_aio_readv = 13284
qemu.kvm.bdrv_aio_writev = 998
qemu.kvm.cpu_get_apic_base = 20
qemu.kvm.cpu_in = 94082
qemu.kvm.cpu_out = 165789
qemu.kvm.cpu_set_apic_base = 445752
qemu.kvm.multiwrite_cb = 654
qemu.kvm.paio_submit = 7141
qemu.kvm.qemu_free = 677704
qemu.kvm.qemu_malloc = 683031
qemu.kvm.qemu_memalign = 285
qemu.kvm.qemu_realloc = 47550
qemu.kvm.virtio_blk_handle_write = 504
qemu.kvm.virtio_blk_req_complete = 7146
qemu.kvm.virtio_blk_rw_complete = 7146
qemu.kvm.virtio_notify = 6574
qemu.kvm.virtio_queue_notify = 6680
qemu.kvm.virtqueue_fill = 7146
qemu.kvm.virtqueue_flush = 7146
qemu.kvm.virtqueue_pop = 7147
qemu.kvm.vm_state_notify = 1


There are a lot of qemu.kvm.qemu_malloc events. That is likely more live allocations than a script could hold in an associative array without raising SystemTap's default map size (e.g. stap -DMAXMAPENTRIES=100000). There are no qemu_vmalloc events because of https://bugzilla.redhat.com/show_bug.cgi?id=714773.

Should the qemu.kvm.cpu_in and qemu.kvm.cpu_out counts match up?  There are a lot more qemu.kvm.cpu_out events than qemu.kvm.cpu_in events.

Note that qemu.kvm.virtio_blk_req_complete, qemu.kvm.virtio_blk_rw_complete, qemu.kvm.virtqueue_fill, and qemu.kvm.virtqueue_flush all have the same count, 7146. The qemu.kvm.virtqueue_pop count is close, at 7147.

-Will









* Re: [Qemu-devel] Using the qemu tracepoints with SystemTap
  2011-09-13 16:11   ` William Cohen
@ 2011-09-13 19:38     ` William Cohen
  2011-09-14 14:52       ` Stefan Hajnoczi
  2011-09-14 15:33     ` Stefan Hajnoczi
  1 sibling, 1 reply; 9+ messages in thread
From: William Cohen @ 2011-09-13 19:38 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, SystemTAP

[-- Attachment #1: Type: text/plain, Size: 1070 bytes --]

On 09/13/2011 12:10 PM, William Cohen wrote:

> Should the qemu.kvm.cpu_in and qemu.kvm.cpu_out counts match up?  There are a lot more qemu.kvm.cpu_out events than qemu.kvm.cpu_in events.

I found that cpu_in and cpu_out refer to input and output instructions.  I wrote a little script to tally up the input and output operations on each port, to run against qemu on an fc15 machine.

It generates output like the following:


cpu_in
  port    count
0x01f7     3000
0x03d5      120
0xc000     2000
0xc002     3000

cpu_out
  port    count
0x0080      480
0x01f1     2000
0x01f2     2000
0x01f3     2000
0x01f4     2000
0x01f5     2000
0x01f6     2000
0x01f7     1000
0x03d4      480
0x03d5      120
0x03f6     1000
0xc000     3000
0xc002     2000
0xc004     1000
0xc090        4

Looks like lots of touching of the ide device ports (0x01f0-0x01ff) and some vga controller ports (0x03d0-0x03df). This is roughly what would be expected when the machine is doing a fsck and selinux relabel of the guest virtual machines. Looks like some pci device access (http://www.tech-pro.net/intro_pci.html) also.

-Will

[-- Attachment #2: qemu_io.stp --]
[-- Type: text/plain, Size: 412 bytes --]

global cpu_in, cpu_out
# tally the in/out port-io instructions executed by the guest, per port
probe qemu.system.x86_64.cpu_in { cpu_in[addr]++ }
probe qemu.system.x86_64.cpu_out { cpu_out[addr]++ }
probe end {
  # write out the data
  printf("\ncpu_in\n%6s %8s\n","port", "count")
  foreach (addr+ in cpu_in)
    printf("0x%04x %8d\n", addr, cpu_in[addr])
  printf("\ncpu_out\n%6s %8s\n","port", "count")
  foreach (addr+ in cpu_out)
    printf("0x%04x %8d\n", addr, cpu_out[addr])
}


* Re: [Qemu-devel] Using the qemu tracepoints with SystemTap
  2011-09-13 19:38     ` William Cohen
@ 2011-09-14 14:52       ` Stefan Hajnoczi
  2011-09-14 15:20         ` Frank Ch. Eigler
  2011-09-14 16:03         ` [Qemu-devel] " William Cohen
  0 siblings, 2 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2011-09-14 14:52 UTC (permalink / raw)
  To: William Cohen; +Cc: qemu-devel, SystemTAP

On Tue, Sep 13, 2011 at 8:38 PM, William Cohen <wcohen@redhat.com> wrote:
> On 09/13/2011 12:10 PM, William Cohen wrote:
>
>> Should the qemu.kvm.cpu_in and qemu.kvm.cpu_out counts match up?  There are a lot more qemu.kvm.cpu_out events than qemu.kvm.cpu_in events.
>
> I found that cpu_in and cpu_out refer to input and output instructions.  I wrote a little script to tally up the input and output operations on each port, to run against qemu on an fc15 machine.
>
> It generates output like the following:
>
>
> cpu_in
>  port    count
> 0x01f7     3000
> 0x03d5      120
> 0xc000     2000
> 0xc002     3000
>
> cpu_out
>  port    count
> 0x0080      480
> 0x01f1     2000
> 0x01f2     2000
> 0x01f3     2000
> 0x01f4     2000
> 0x01f5     2000
> 0x01f6     2000
> 0x01f7     1000
> 0x03d4      480
> 0x03d5      120
> 0x03f6     1000
> 0xc000     3000
> 0xc002     2000
> 0xc004     1000
> 0xc090        4
>
> Looks like lots of touching of the ide device ports (0x01f0-0x01ff) and some vga controller ports (0x03d0-0x03df). This is roughly what would be expected when the machine is doing a fsck and selinux relabel of the guest virtual machines. Looks like some pci device access (http://www.tech-pro.net/intro_pci.html) also.

I think the cpu_in/cpu_out tracing script can be useful in identifying
performance problems, especially when obscure guest OSes experience
poor performance due to weird ways of interacting with hardware.

Where are you putting your SystemTap scripts?  I suggest creating a
public git repo and adding a link from the QEMU wiki tracing page:
http://wiki.qemu.org/Features/Tracing

Perhaps we could even include them in a contrib/ or similar directory
in qemu.git so that distros can ship them.  But before we worry about
that we need a useful set of things that can be observed.

Stefan


* Re: Using the qemu tracepoints with SystemTap
  2011-09-14 14:52       ` Stefan Hajnoczi
@ 2011-09-14 15:20         ` Frank Ch. Eigler
  2011-09-14 16:03         ` [Qemu-devel] " William Cohen
  1 sibling, 0 replies; 9+ messages in thread
From: Frank Ch. Eigler @ 2011-09-14 15:20 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: William Cohen, qemu-devel, SystemTAP

Stefan Hajnoczi <stefanha@gmail.com> writes:

> [...]
> Where are you putting your SystemTap scripts?  I suggest creating a
> public git repo and adding a link from the QEMU wiki tracing page:
> http://wiki.qemu.org/Features/Tracing
>
> Perhaps we could even include them in a contrib/ or similar directory
> in qemu.git so that distros can ship them.  But before we worry about
> that we need a useful set of things that can be observed.

(We can also or instead ship them within systemtap's bundled sources.)

- FChE


* Re: [Qemu-devel] Using the qemu tracepoints with SystemTap
  2011-09-13 16:11   ` William Cohen
  2011-09-13 19:38     ` William Cohen
@ 2011-09-14 15:33     ` Stefan Hajnoczi
  1 sibling, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2011-09-14 15:33 UTC (permalink / raw)
  To: William Cohen; +Cc: qemu-devel, SystemTAP

On Tue, Sep 13, 2011 at 5:10 PM, William Cohen <wcohen@redhat.com> wrote:
> On 09/13/2011 06:03 AM, Stefan Hajnoczi wrote:
>> On Mon, Sep 12, 2011 at 4:33 PM, William Cohen <wcohen@redhat.com> wrote:
>>> The RHEL-6 version of qemu-kvm makes its tracepoints available to SystemTap. I have been working on useful examples for the SystemTap tracepoints in qemu. There don't seem to be many examples showing the utility of the tracepoints in diagnosing problems. However, I came across the following blog entry that had several examples:
>>>
>>> http://blog.vmsplice.net/2011/03/how-to-write-trace-analysis-scripts-for.html
>>>
>>> I reimplemented the VirtqueueRequestTracker example from the blog in SystemTap (the attached virtqueueleaks.stp). I can run it on RHEL-6's qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64 and get output like the following. When the script is stopped, it prints the pid and the address of each elem that leaked:
>>>
>>> $ stap virtqueueleaks.stp
>>> ^C
>>>     pid     elem
>>>   19503  1c4af28
>>>   19503  1c56f88
>>>   19503  1c62fe8
>>>   19503  1c6f048
>>>   19503  1c7b0a8
>>>   19503  1c87108
>>>   19503  1c93168
>>> ...
>>>
>>> I am not that familiar with the internals of qemu. The script seems to indicate that qemu is leaking, but is that really the case?  If there are resource leaks, what output would help debug them? What enhancements could be made to this script to provide more useful information?
>>
>
> Hi Stefan,
>
> Thanks for the comments.
>
>> Leak tracing always has corner cases :).
>>
>> With virtio-blk this would indicate a leak because it uses a
>> request-response model where the guest initiates I/O and the host
>> responds.  A guest that cleanly shuts down before you exit your
>> SystemTap script should not leak requests for virtio-blk.
>
> I stopped the systemtap script while the guest vm was still running. So when the guest vm cleanly shuts down there should be a series of virtqueue_fill operations that will remove those elements?

In the case of virtio-blk we only see pop and fill when the guest
tells the host to process an I/O request.  That means once the request
is complete we've done the fill and there is no outstanding virtqueue
element anymore.  When a guest shuts down cleanly it will have no
pending virtio-blk I/O requests and hence the pops and fills balance
out (there is no leak).

I tried to explain that with virtio-net the situation is different
since the guest gives us a bunch of virtqueue elements that point to
receive packet buffers.  When the guest shuts down it will reset the
virtio device, which will clear all virtqueues and hence the popped
elements are no longer relevant.  I CCed you on a patch that adds a
virtio_set_status() trace event so we see when the guest driver resets
the virtio device.

The upshot of this is that you can use these trace events to check
that QEMU is returning all virtqueue elements to the guest and not
holding on to (leaking) them.  It's easy for virtio-blk.

> Qemu uses a thread for each virtual processor, but a single thread to handle all IO. It seems like that might be a possible bottleneck. What would be the path of an io event from guest to host and back to the guest? Is there something that a script could do to gauge the delay due to the qemu io thread handling multiple processors?

I think it would be fantastic to look at the vm_exit and vm_entry
events.  These are the events that get raised each time a vcpu exits
guest mode and returns to the kvm.ko host kernel module and every time
we re-enter guest mode.  There are a couple of different ways to work
with this data:

1. Average, min, max, std deviation.  Answers what the general
vm_exit->vm_entry latency looks like.
2. Heat map of vm_exit->vm_entry latencies.  We can see more in a
heatmap, including how the latencies change over time and which are
most frequent.
3. Detailed vm_exit->vm_entry trace logs together with the events that
lead to the exit (e.g. I/O register access by guest).  This
information is useful for finding out why certain exits happened but
can already be captured today, so I would focus on 1 & 2 above which
can make use of SystemTap's power.
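
For 1 & 2 a rough host-side sketch might look like the following,
assuming the kvm_exit/kvm_entry tracepoints are visible to SystemTap as
kernel.trace probes on the host kernel:

global t, lat
probe kernel.trace("kvm_exit") { t[tid()] = gettimeofday_us() }
probe kernel.trace("kvm_entry") {
  if (tid() in t) {
    lat <<< gettimeofday_us() - t[tid()]   # time spent out of guest mode
    delete t[tid()]
  }
}
probe end {
  if (@count(lat)) {
    printf("exits=%d avg=%dus min=%dus max=%dus\n",
           @count(lat), @avg(lat), @min(lat), @max(lat))
    print(@hist_log(lat))   # crude heat-map substitute: log2 latency buckets
  }
}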

This sort of information can help track down instances where the vcpu
is being held up, either because QEMU's global mutex is held by
another thread or for various other reasons.

Stefan


* Re: [Qemu-devel] Using the qemu tracepoints with SystemTap
  2011-09-14 14:52       ` Stefan Hajnoczi
  2011-09-14 15:20         ` Frank Ch. Eigler
@ 2011-09-14 16:03         ` William Cohen
  2011-09-15 10:17           ` Stefan Hajnoczi
  1 sibling, 1 reply; 9+ messages in thread
From: William Cohen @ 2011-09-14 16:03 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, SystemTAP

On 09/14/2011 10:51 AM, Stefan Hajnoczi wrote:
> On Tue, Sep 13, 2011 at 8:38 PM, William Cohen <wcohen@redhat.com> wrote:
>> On 09/13/2011 12:10 PM, William Cohen wrote:
>>
>>> Should the qemu.kvm.cpu_in and qemu.kvm.cpu_out counts match up?  There are a lot more qemu.kvm.cpu_out events than qemu.kvm.cpu_in events.
>>
>> I found that cpu_in and cpu_out refer to input and output instructions.  I wrote a little script to tally up the input and output operations on each port, to run against qemu on an fc15 machine.
>>
>> It generates output like the following:
>>
>>
>> cpu_in
>>  port    count
>> 0x01f7     3000
>> 0x03d5      120
>> 0xc000     2000
>> 0xc002     3000
>>
>> cpu_out
>>  port    count
>> 0x0080      480
>> 0x01f1     2000
>> 0x01f2     2000
>> 0x01f3     2000
>> 0x01f4     2000
>> 0x01f5     2000
>> 0x01f6     2000
>> 0x01f7     1000
>> 0x03d4      480
>> 0x03d5      120
>> 0x03f6     1000
>> 0xc000     3000
>> 0xc002     2000
>> 0xc004     1000
>> 0xc090        4
>>
>> Looks like lots of touching of the ide device ports (0x01f0-0x01ff) and some vga controller ports (0x03d0-0x03df). This is roughly what would be expected when the machine is doing a fsck and selinux relabel of the guest virtual machines. Looks like some pci device access (http://www.tech-pro.net/intro_pci.html) also.
> 
> I think the cpu_in/cpu_out tracing script can be useful in identifying
> performance problems, especially when obscure guest OSes experience
> poor performance due to weird ways of interacting with hardware.

Glad this example looks useful.

> Where are you putting your SystemTap scripts?  I suggest creating a
> public git repo and adding a link from the QEMU wiki tracing page:
> http://wiki.qemu.org/Features/Tracing

Definitely need a more lasting place to put the QEMU examples. SystemTap has an examples directory that things like this go into (http://sourceware.org/systemtap/examples/), and those examples are run as part of the testsuite. These qemu examples are not yet ready for that kind of inclusion, so I have set up https://github.com/wcohen/qemu_systemtap for them.

> Perhaps we could even include them in a contrib/ or similar directory
> in qemu.git so that distros can ship them.  But before we worry about
> that we need a useful set of things that can be observed.

We definitely want to develop scripts that give people a better understanding of what is going on. I am just now learning how kvm and qemu work, so the early scripts just count events to give people an idea of which events are frequent. In some cases these could point to issues where assumptions about event frequency are incorrect. I would like to have scripts that examine sequences of events and indicate when long latencies occur, but I don't yet have enough understanding of how kvm and qemu work to implement those.

There are other userspace applications like python and java that have similar probes (http://sourceware.org/systemtap/wiki "Applications with built-in User-Space Markers"). Certainly making the systemtap scripts easily available would be good. Having the scripts ship with qemu is one way to do it; the other would be to include them in the systemtap examples. The advantage of shipping with qemu is that the scripts would match the version of qemu that is running, avoiding problems with changes in the qemu probe points. Having the scripts in systemtap would allow regular testing of them.

-Will


* Re: [Qemu-devel] Using the qemu tracepoints with SystemTap
  2011-09-14 16:03         ` [Qemu-devel] " William Cohen
@ 2011-09-15 10:17           ` Stefan Hajnoczi
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2011-09-15 10:17 UTC (permalink / raw)
  To: William Cohen; +Cc: qemu-devel, SystemTAP

On Wed, Sep 14, 2011 at 5:02 PM, William Cohen <wcohen@redhat.com> wrote:
> On 09/14/2011 10:51 AM, Stefan Hajnoczi wrote:
>> On Tue, Sep 13, 2011 at 8:38 PM, William Cohen <wcohen@redhat.com> wrote:
>>> On 09/13/2011 12:10 PM, William Cohen wrote:
>>>
>>>> Should the qemu.kvm.cpu_in and qemu.kvm.cpu_out counts match up?  There are a lot more qemu.kvm.cpu_out events than qemu.kvm.cpu_in events.
>>>
>>> I found that cpu_in and cpu_out refer to input and output instructions.  I wrote a little script to tally up the input and output operations on each port, to run against qemu on an fc15 machine.
>>>
>>> It generates output like the following:
>>>
>>>
>>> cpu_in
>>>  port    count
>>> 0x01f7     3000
>>> 0x03d5      120
>>> 0xc000     2000
>>> 0xc002     3000
>>>
>>> cpu_out
>>>  port    count
>>> 0x0080      480
>>> 0x01f1     2000
>>> 0x01f2     2000
>>> 0x01f3     2000
>>> 0x01f4     2000
>>> 0x01f5     2000
>>> 0x01f6     2000
>>> 0x01f7     1000
>>> 0x03d4      480
>>> 0x03d5      120
>>> 0x03f6     1000
>>> 0xc000     3000
>>> 0xc002     2000
>>> 0xc004     1000
>>> 0xc090        4
>>>
>>> Looks like lots of touching of the ide device ports (0x01f0-0x01ff) and some vga controller ports (0x03d0-0x03df). This is roughly what would be expected when the machine is doing a fsck and selinux relabel of the guest virtual machines. Looks like some pci device access (http://www.tech-pro.net/intro_pci.html) also.
>>
>> I think the cpu_in/cpu_out tracing script can be useful in identifying
>> performance problems, especially when obscure guest OSes experience
>> poor performance due to weird ways of interacting with hardware.
>
> Glad this example looks useful.
>
>> Where are you putting your SystemTap scripts?  I suggest creating a
>> public git repo and adding a link from the QEMU wiki tracing page:
>> http://wiki.qemu.org/Features/Tracing
>
> Definitely need a more lasting place to put the QEMU examples. SystemTap has an examples directory that things like this go into (http://sourceware.org/systemtap/examples/), and those examples are run as part of the testsuite. These qemu examples are not yet ready for that kind of inclusion, so I have set up https://github.com/wcohen/qemu_systemtap for them.

Added a link from http://wiki.qemu.org/Features/Tracing.

>> Perhaps we could even include them in a contrib/ or similar directory
>> in qemu.git so that distros can ship them.  But before we worry about
>> that we need a useful set of things that can be observed.
>
> We definitely want to develop scripts that give people a better understanding of what is going on. I am just now learning how kvm and qemu work, so the early scripts just count events to give people an idea of which events are frequent. In some cases these could point to issues where assumptions about event frequency are incorrect. I would like to have scripts that examine sequences of events and indicate when long latencies occur, but I don't yet have enough understanding of how kvm and qemu work to implement those.

The vmexit/entry things I mentioned in the mail above are events that
can be traced in the host kernel (kvm:kvm_exit and kvm:kvm_entry).
The QEMU-side equivalent is the ioctl(KVM_RUN) but we do not have any
trace events in kvm-all.c yet (!!).  A trace event before the ioctl
and one after could be used to trace vcpu code execution from QEMU's
perspective.
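
Until those exist, a script could approximate this from the outside by
timing the ioctl itself.  A sketch; 0xae80 is _IO(KVMIO, 0x80), i.e.
KVM_RUN, and the executable name is assumed to be qemu-kvm:

global t, in_guest
probe syscall.ioctl {
  if (execname() == "qemu-kvm" && request == 0xae80)   # KVM_RUN
    t[tid()] = gettimeofday_us()
}
probe syscall.ioctl.return {
  if (tid() in t) {
    in_guest <<< gettimeofday_us() - t[tid()]   # time the vcpu thread spent in KVM_RUN
    delete t[tid()]
  }
}
probe end { if (@count(in_guest)) print(@hist_log(in_guest)) }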

Stefan

