public inbox for systemtap@sourceware.org
* Improving virtualization performance as a use case for SystemTap
@ 2011-03-24 21:35 William Cohen
From: William Cohen @ 2011-03-24 21:35 UTC
  To: systemTAP

I am looking at using SystemTap to improve virtualization performance.
Broadly, this means things that will help people cram more guest VMs
onto a physical system, for example reducing wakeups due to polling and
reducing writes to shared pages. Below are just some examples of how
SystemTap could help.

Find polling processes in guest VMs

Every time a timer expires and wakes up an idle guest VM (times out)
there could be up to four switches between guest VM and host. This
takes additional CPU time that could be better spent doing useful
work.  Polling processes are a common cause of these timeouts.  The
timeout.stp script in the systemtap examples can be used in the guest
VM to identify the processes that are waking due to timeout.


Find page faulting processes

Page faults may require additional fixup by the host VM. Major page
faults may also require additional I/O operations to resolve the fault.

The pfaults.stp example script provides a compact trace of each major
and minor page fault on the system.  Each line of the log has the
format:

<time_stamp>:<pid>:<address>:<r/w>:<major/minor>:<elapsed_time_to_service>

If debug information for the executable is installed and the process
is still running, the faults for executable code can be mapped to
source file and line with:

eu-addr2line --pid=<pid> <address>
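Since the log is one colon-separated record per fault, it is easy to
post-process. Below is a minimal Python sketch that totals fault counts
and service time per process. It assumes six integer/text fields in the
order given above and an address printed in hexadecimal; the helper
names (Fault, parse_fault, summarize) and the sample lines are made up
for illustration and are not part of pfaults.stp.

```python
from collections import Counter
from typing import NamedTuple


class Fault(NamedTuple):
    timestamp: int  # assumed to be an integer time stamp
    pid: int
    address: int    # parsed from hexadecimal (assumption)
    access: str     # "r" or "w"
    kind: str       # "major" or "minor"
    elapsed: int    # time to service, same unit as timestamp


def parse_fault(line: str) -> Fault:
    """Parse one colon-separated log line into a Fault record."""
    ts, pid, addr, access, kind, elapsed = line.strip().split(":")
    return Fault(int(ts), int(pid), int(addr, 16), access, kind, int(elapsed))


def summarize(lines):
    """Return (counts, service) keyed by (pid, kind); blank lines are skipped."""
    counts = Counter()
    service = Counter()
    for line in lines:
        if not line.strip():
            continue
        fault = parse_fault(line)
        counts[(fault.pid, fault.kind)] += 1
        service[(fault.pid, fault.kind)] += fault.elapsed
    return counts, service


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "1000:4242:0x7f0000001000:r:minor:12",
        "1500:4242:0x7f0000002000:w:minor:9",
        "2000:4243:0x400000:r:major:830",
    ]
    counts, service = summarize(sample)
    for (pid, kind), n in sorted(counts.items()):
        total = service[(pid, kind)]
        print(f"pid {pid}: {n} {kind} fault(s), {total} total service time")
```

Processes with many major faults or a large total service time would be
the first ones to look at in the guest.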



Find processes writing to shared page

Writing to a shared page causes the kernel to copy the page then make
the change to the duplicated page. The host VM may need to assist with
the page table management. There is also the cost of copying data to
the newly created page.  Finally, there is the additional space taken
by the newly created page.

It looks like one would want to use the following, but the probe point
vm.write_shared_copy doesn't work in RHEL-6 (I have a patch to fix
this):

stap -e 'probe vm.write_shared_copy {
  printf("%s(%d) write_to_shared_page(%p)\n",
         execname(), pid(), address)
}'

RHEL-5 and RHEL-6 workaround:

stap -e 'probe kernel.function("cow_user_page") {
  printf("%s(%d) write_to_shared_page(%p)\n",
         execname(), pid(), $va)
}'
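
The one-liners above print one line per copy, so a busy system
produces a lot of output. A rough Python sketch for ranking processes
by number of shared-page writes follows. It assumes each line looks
like name(pid) write_to_shared_page(0x...), i.e. that %p is printed as
0x-prefixed hexadecimal; the function name count_shared_writes and the
sample lines are hypothetical.

```python
import re
from collections import Counter

# Matches the printf format used above: "%s(%d) write_to_shared_page(%p)".
LINE_RE = re.compile(
    r"^(?P<name>.+)\((?P<pid>\d+)\) "
    r"write_to_shared_page\((?P<addr>0x[0-9a-fA-F]+)\)$"
)


def count_shared_writes(lines):
    """Count shared-page writes per (execname, pid); skip unparsable lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("name"), int(match.group("pid")))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample output, for illustration only.
    sample = [
        "bash(100) write_to_shared_page(0x7f00deadb000)",
        "httpd(200) write_to_shared_page(0x7f00beef0000)",
        "bash(100) write_to_shared_page(0x7f00deadc000)",
    ]
    for (name, pid), n in count_shared_writes(sample).most_common():
        print(f"{name}({pid}): {n} write(s) to shared pages")
```

The same counting could be done inside the stap script with a global
array, which would avoid printing a line per event.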



Find processes that could use Transparent Huge Pages (THP)

The typical page size on x86 machines is 4096 bytes. Newer versions of
the Linux kernel support Transparent Huge Pages (THP), 2MB pages for
anonymous memory regions. THP can reduce the overhead due to page
table management and VM fixup of page tables. Something like the
following might show those large allocations:

stap -e 'probe vm.brk {
  if ((512*4096)<=length)
    printf("%s(%d) brk(%p, %d)\n",
           execname(), pid(), address, length)
}'
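
The threshold in the script is just 512 base pages of 4096 bytes,
which is one 2MB huge page. A small Python sketch of that arithmetic
follows; the names are made up for illustration, and it ignores
alignment, so it only gives an upper bound on how many huge pages a
region of a given length could hold.

```python
PAGE_SIZE = 4096           # typical x86 base page size in bytes
PAGES_PER_HUGE_PAGE = 512  # base pages covered by one 2MB huge page
HUGE_PAGE_SIZE = PAGES_PER_HUGE_PAGE * PAGE_SIZE  # 2097152 bytes


def is_thp_candidate(length: int) -> bool:
    """Same test as the stap script: is the region at least one huge page?"""
    return HUGE_PAGE_SIZE <= length


def whole_huge_pages(length: int) -> int:
    """Upper bound on huge pages that fit in length bytes (alignment ignored)."""
    return length // HUGE_PAGE_SIZE


if __name__ == "__main__":
    for length in (4096, 2 * 1024 * 1024, 5 * 1024 * 1024):
        print(length, is_thp_candidate(length), whole_huge_pages(length))
```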
