* Getting systemtap examples working with --bpf backend
From: William Cohen @ 2019-05-16 13:41 UTC
To: systemtap

I noticed that page 29 of https://elinux.org/images/d/dc/Kernel-Analysis-Using-eBPF-Daniel-Thompson-Linaro.pdf mentions that many of the systemtap examples did not work with the bpf back end, which led to frustration. Today I took a quick survey of how badly the examples are broken by adding the following line to the beginning of the run_command function in check.exp:

    set command [ string map {"stap " "stap --bpf "} $command ]

Most of the examples fail. The few that do appear to run are just doing things in probe begin or end handlers (with the exception of cachestat*):

PASS: systemtap.examples/general/ansi_colors run
PASS: systemtap.examples/general/ansi_colors2 run
PASS: systemtap.examples/general/helloworld run
PASS: systemtap.examples/memory/cachestat run
PASS: systemtap.examples/memory/cachestat_bpf run
PASS: systemtap.examples/memory/kmalloc-top run

The non-bpf cachestat works because it just probes raw functions. Most examples fail because of various missing syscall.*/syscall_any probe points and gettimeofday_*() functions. The time functions should be easy to get working in bpf, as there is already a nanosecond time function whose result could be scaled to microseconds, milliseconds, and seconds.

-Will
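For readers less familiar with Tcl, the `string map` substitution above can be modelled in Python as a plain substring replacement. This is only an illustrative sketch: the function name `add_bpf_flag` is made up here and is not part of check.exp.

```python
def add_bpf_flag(command: str) -> str:
    """Model of: string map {"stap " "stap --bpf "} $command

    Like Tcl's string map, str.replace rewrites every occurrence of the
    substring "stap " (with the trailing space), wherever it appears.
    """
    return command.replace("stap ", "stap --bpf ")


if __name__ == "__main__":
    # A direct stap invocation picks up the flag.
    print(add_bpf_flag("stap -p4 hugepage_cow_delays.stp"))
    # A command that never spells out "stap " is left untouched, so a
    # wrapper that launches stap internally is not switched to the bpf backend.
    print(add_bpf_flag("./some_wrapper_script"))
```

One consequence of matching on the substring is that a test whose command line does not contain "stap " can still report PASS without ever exercising the bpf backend.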
* Re: Getting systemtap examples working with --bpf backend
From: William Cohen @ 2019-05-20 19:52 UTC
To: systemtap

On 5/16/19 9:41 AM, William Cohen wrote:
> Most of the examples fail. There are a few that actually do appear to run are just doing things in probe begin or end handlers (with the exception of cachestat*) :
>
> PASS: systemtap.examples/general/ansi_colors run
> PASS: systemtap.examples/general/ansi_colors2 run
> PASS: systemtap.examples/general/helloworld run
> PASS: systemtap.examples/memory/cachestat run
> PASS: systemtap.examples/memory/cachestat_bpf run
> PASS: systemtap.examples/memory/kmalloc-top run

Hi,

The cachestat_bpf test has been folded into the cachestat test to minimize duplication. The helloworld test has been set to run with the bpf back end. The kmalloc-top test is actually a perl script and isn't running the generated code with the bpf backend. The ansi_colors and ansi_colors2 examples compile and run, but their output is not checked and the results do not have the proper color formatting of the regular systemtap version (due to octal escapes not being handled, PR23559).

The systemtap examples heavily leverage the systemtap tapsets. Currently, the bpf backend has few tapsets. Things like syscall.* probe points and gettimeofday_* functions are not available for systemtap bpf generation. However, we probably don't want to blindly duplicate all the tapsets in the tapsets/linux directory for tapsets/bpf.

There is already a bpf ktime_get_ns available. If a time offset were generated, it should be possible to provide gettimeofday_* functions for bpf, allowing some additional scripts to work. Alternatively, the offset could be ignored for the moment, as most of the examples take the difference between two gettimeofday_* function calls.

A number of examples use multi-dimensional arrays; this can also be implicit when using the @entry() operation. However, these examples are not going to work because of PR23478. Examples such as hugepage_cow_delays fail in the following manner:

attempting command stap --bpf -p4 hugepage_cow_delays.stp
OUT semantic error: unhandled multi-dimensional array: identifier 'gettimeofday_us' at hugepage_cow_delays.stp:8:37
        source: <<< (gettimeofday_us() - @entry(gettimeofday_us()))
                ^

Pass 4: compilation failed.  [man error::pass4]
child process exited abnormally
RC 1

-Will
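The remark that the offset can be ignored when a script only subtracts two gettimeofday_* readings can be checked with a small arithmetic sketch in Python. The function names, the sample readings, and the offset value below are all made up for illustration; they are not taken from any systemtap tapset.

```python
def gettimeofday_ns(ktime_ns: int, gtod_offset_ns: int) -> int:
    """Model: a monotonic nanosecond reading plus a fixed offset."""
    return ktime_ns + gtod_offset_ns


def gettimeofday_us(ktime_ns: int, gtod_offset_ns: int) -> int:
    """Scale to microseconds with integer division (readings are non-negative)."""
    return gettimeofday_ns(ktime_ns, gtod_offset_ns) // 1000


def elapsed(fn, start_ns: int, end_ns: int, gtod_offset_ns: int) -> int:
    """Difference between two readings taken with the same offset."""
    return fn(end_ns, gtod_offset_ns) - fn(start_ns, gtod_offset_ns)


if __name__ == "__main__":
    start_ns, end_ns = 1_999, 3_000                 # hypothetical ktime readings
    offset_ns = 1_558_000_000_000_000_001           # hypothetical epoch offset

    # In nanoseconds the offset cancels exactly.
    print(elapsed(gettimeofday_ns, start_ns, end_ns, 0),
          elapsed(gettimeofday_ns, start_ns, end_ns, offset_ns))

    # After integer scaling the two results can differ by one unit,
    # because the offset shifts where the truncation falls.
    print(elapsed(gettimeofday_us, start_ns, end_ns, 0),
          elapsed(gettimeofday_us, start_ns, end_ns, offset_ns))
```

So for interval measurements a zero offset gives the same answer in nanoseconds, and an answer within one unit after scaling; only scripts that print absolute timestamps need the real offset.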
* Re: Getting systemtap examples working with --bpf backend
From: William Cohen @ 2019-05-21 17:11 UTC
To: systemtap, Serhei Makarov

On 5/20/19 3:52 PM, William Cohen wrote:
> There is already a bpf ktime_get_ns available. If there was a time offset generated, then it should be possible to have gettimeofday_* functions for bpf, allowing some additional scripts to work. Alternatively, don't worry about the offset at the moment as most of the example are taking the difference between two gettimeofday_* function calls.

Hi,

Attached is a proposed tapset file for tapsets/bpf that provides the gettimeofday_* functions for scripts that use the time of day. It has a global variable holding a time offset that converts the ktime_get_ns value into gettimeofday_ns. I expect that some type of probe begin would set that up, but I don't have anything doing that yet.

Any thoughts or comments about this?
-Will

[-- Attachment #2: timestamp_gtod.stp --]
[-- Type: text/plain, Size: 1438 bytes --]

// timestamp tapset -- gettimeofday variants
// Copyright (C) 2005-2009 Red Hat Inc.
// Copyright (C) 2006 Intel Corporation.
//
// This file is part of systemtap, and is free software.  You can
// redistribute it and/or modify it under the terms of the GNU General
// Public License (GPL); either version 2, or (at your option) any
// later version.

global __gtod_offset = 0 /* FIXME need to set appropriately on startup */

/**
 * sfunction gettimeofday_ns - Number of nanoseconds since UNIX epoch
 *
 * Description: This function returns the number of nanoseconds
 * since the UNIX epoch.
 */
function gettimeofday_ns:long () { /* pure */ /* unprivileged */
  return (ktime_get_ns() + __gtod_offset)
}

/**
 * sfunction gettimeofday_us - Number of microseconds since UNIX epoch
 *
 * Description: This function returns the number of microseconds
 * since the UNIX epoch.
 */
function gettimeofday_us:long () {
  return gettimeofday_ns() / 1000;
}

/**
 * sfunction gettimeofday_ms - Number of milliseconds since UNIX epoch
 *
 * Description: This function returns the number of milliseconds
 * since the UNIX epoch.
 */
function gettimeofday_ms:long () {
  return gettimeofday_ns() / 1000000;
}

/**
 * sfunction gettimeofday_s - Number of seconds since UNIX epoch
 *
 * Description: This function returns the number of seconds since
 * the UNIX epoch.
 */
function gettimeofday_s:long () {
  return gettimeofday_ns() / 1000000000;
}
* Re: Getting systemtap examples working with --bpf backend
From: Frank Ch. Eigler @ 2019-05-22 20:51 UTC
To: William Cohen; +Cc: systemtap, Serhei Makarov

wcohen wrote:

> global __gtod_offset = 0 /* FIXME need to set appropriately on startup */

Is there a standard bpf function to get this value?  If so, it's trivial
to call it from a probe-begin and initialize this global.  If not, it's
a less trivial job for it to get a reserved spot in the same global
array where the bpf runtime communicates exit-ness with stapbpf.  Then
stapbpf could initialize this shared global at its startup.  Or a fake
stapbpf-special bpf function could provide this value, and again a
probe begin could save the value.

- FChE
* Re: Getting systemtap examples working with --bpf backend
From: William Cohen @ 2019-05-22 21:12 UTC
To: Frank Ch. Eigler; +Cc: systemtap, Serhei Makarov

On 5/22/19 4:51 PM, Frank Ch. Eigler wrote:
>
> wcohen wrote:
>
>> global __gtod_offset = 0 /* FIXME need to set appropriately on startup */
>
> Is there a standard bpf function to get this value?  If so, it's trivial
> to call it from a probe-begin and initialize this global.  If not, it's
> a less trivial job for it to get a reserved spot in the same global
> array where the bpf runtime communicates exit-ness with stapbpf.  Then
> stapbpf could initialize this shared global at its startup.  Or a fake
> stapbpf-special bpf function could provide this value, and again a
> probe begin could save the value.
>
> - FChE
>

Hi,

The BPF helper library has a ktime_get_ns function to get the time, but there isn't a helper function to get the offset between the start of the epoch and the time the machine powered on, which is what ktime_get_ns uses as the start of time. What might be feasible is to have a probe begin run in user space to compute that offset and put it in a bpf map. According to https://blogs.oracle.com/linux/notes-on-bpf-3 :

    Map actions

    We can create/update, delete and lookup map information, both in BPF programs and in user-space. User-space map interactions are done via the BPF syscall. Their function signatures are slightly different to those of their in-kernel BPF program equivalents. In tools/lib/bpf/bpf.c wrappers for these actions are present:

This technique might also be useful for initialization information like syscall number<->name tables and other constants, rather than trying to put everything into space-constrained bpf code. However, I am not sure how that is going to be managed if multiple systemtap scripts are kicked off.
-Will
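As a rough illustration of the user-space computation suggested above, the sketch below derives such an offset in Python from two clock reads. It rests on an assumption not stated in the thread: that the BPF nanosecond helper follows the same time base as CLOCK_MONOTONIC. The function names are hypothetical, and storing the result in a bpf map is deliberately left out.

```python
import time


def gtod_offset_ns() -> int:
    """Offset that turns a CLOCK_MONOTONIC reading into time since the epoch.

    Assumption: the BPF nanosecond helper shares CLOCK_MONOTONIC's time base.
    """
    realtime = time.clock_gettime_ns(time.CLOCK_REALTIME)
    monotonic = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    return realtime - monotonic


def gettimeofday_ns(offset_ns: int) -> int:
    """User-space stand-in for: ktime_get_ns() + __gtod_offset."""
    return time.clock_gettime_ns(time.CLOCK_MONOTONIC) + offset_ns


if __name__ == "__main__":
    offset = gtod_offset_ns()          # would be computed once at startup
    print("offset (ns):", offset)
    print("derived time of day (ns):", gettimeofday_ns(offset))
    print("system time of day (ns): ", time.time_ns())
```

An offset captured once at startup goes stale if the wall clock is stepped while the script is running, so the derived timestamps are only as good as that initial snapshot.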
* Re: Getting systemtap examples working with --bpf backend
From: Serhei Makarov @ 2019-05-22 21:39 UTC
To: William Cohen; +Cc: Frank Ch. Eigler, systemtap

On Wed, May 22, 2019 at 5:12 PM William Cohen <wcohen@redhat.com> wrote:
> We can create/update, delete and lookup map information, both in BPF programs and in user-space. User-space map interactions are done via the BPF syscall. Their function signatures are slightly different to those of their in-kernel BPF program equivalents. In tools/lib/bpf/bpf.c wrappers for these actions are present:

Yep. You can see in bpfinterp.cxx how the BPF-level helpers are then implemented in terms of the syscalls.

> This technique might also be useful for initialization information like syscall numbers<->names and other constants rather than trying to put everything into space constrained bpf code. However, not sure how that is going to be managed if multiple systemtap scripts are kicked off.

Each stapbpf invocation creates its own separate set of maps to track global variables, so there is no conflict. (What would be more difficult is if you wanted to share a map between different stapbpf processes for some reason.)

There was an upcoming BPF extension being discussed at LPC 2018 which would allow loading constant data sections into the BPF program's address space. Having that would simplify things a lot.
* Re: Getting systemtap examples working with --bpf backend
From: William Cohen @ 2019-06-03 15:51 UTC
To: systemtap

On 5/16/19 9:41 AM, William Cohen wrote:
> Most examples fail because of various missing syscall.*/syscall_any probe points and gettimeofday_*() functions.

Hi,

I have been looking at getting the syscall_any tapset working with bpf.
If the syscall_any and syscall_any.return worked then the following examples should work:

syscalls_by_pid.stp

There are other examples that use syscall_any and syscall_any.return, but they have other issues, such as multi-dimensional arrays, string concatenation operations, or for loops, that will prevent them from working with the bpf backend.

I have made some modifications to provide the syscall_name and syscall_num functions for bpf. However, the code doesn't handle 32-bit compat syscalls properly. It also gives warnings about cross-file global variable references. This is on the wcohen/bpf_syscall_any branch of the systemtap git repo. Below is an example running on x86_64:

$ ../install/bin/stap --bpf -k -e 'probe oneshot {printf("%s\n", syscall_name(10))}'
WARNING: cross-file global variable reference to identifier '__syscall_32_num2name' at /home/wcohen/research/profiling/systemtap_write/install/share/systemtap/tapset/x86_64/syscall_num.stp:3:8 from: identifier '__syscall_32_num2name' at /home/wcohen/research/profiling/systemtap_write/install/share/systemtap/tapset/syscall_table.stp:8:16
 source: return __syscall_32_num2name[num]
         ^
WARNING: cross-file global variable reference to identifier '__syscall_64_num2name' at /home/wcohen/research/profiling/systemtap_write/install/share/systemtap/tapset/x86_64/syscall_num.stp:5:8 from: identifier '__syscall_64_num2name' at :11:12
 source: return __syscall_64_num2name[num]
         ^
WARNING: instance of overloaded function will never be reached: identifier 'syscall_name' at :5:10
 source: function syscall_name(num) {
         ^
mprotect
Keeping temporary directory "/tmp/stapdkLJNc"

Taking a look at the syscall_any and syscall_any.return probes: syscall_any can work in the existing bpf environment, but syscall_any.return uses some machine-dependent C code, possibly from the kernel headers, to extract the syscall number from the pt_regs. The question is how to keep this portable and avoid needing the debuginfo installed.
With the incomplete syscall_any tapset:

$ ../install/bin/stap --bpf -k testsuite/systemtap.examples/process/syscalls_by_pid.stp -T 1
WARNING: cross-file global variable reference to identifier '__syscall_32_num2name' at /home/wcohen/research/profiling/systemtap_write/install/share/systemtap/tapset/x86_64/syscall_num.stp:3:8 from: identifier '__syscall_32_num2name' at /home/wcohen/research/profiling/systemtap_write/install/share/systemtap/tapset/syscall_table.stp:8:16
 source: return __syscall_32_num2name[num]
         ^
WARNING: cross-file global variable reference to identifier '__syscall_64_num2name' at /home/wcohen/research/profiling/systemtap_write/install/share/systemtap/tapset/x86_64/syscall_num.stp:5:8 from: identifier '__syscall_64_num2name' at :11:12
 source: return __syscall_64_num2name[num]
         ^
Collecting data... Type Ctrl-C to exit and display results
#SysCalls  PID
109        30875
117        27703
233        22764
317        20845
22         19624
1951       16890
219        14304
108        13864
548        13707
11         13665
110        13198
11         10789
11         10783
11         10780
11         10777
18         10520
1985       10372
17         10247
10         10014
19         9479
...

-Will
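The 32-bit compat limitation mentioned in this message comes down to the same syscall number naming different calls in the two x86 tables. The Python sketch below illustrates that with a few entries from each table; the dictionary and function names are hypothetical and only loosely mirror the `__syscall_64_num2name` and `__syscall_32_num2name` globals seen in the warnings, and the `"unknown"` fallback is an arbitrary choice for this example.

```python
# Small excerpts of the x86_64 and i386 syscall tables, for illustration only.
SYSCALL_64_NUM2NAME = {9: "mmap", 10: "mprotect", 11: "munmap"}
SYSCALL_32_NUM2NAME = {9: "link", 10: "unlink", 11: "execve"}


def syscall_name(num: int, compat: bool = False) -> str:
    """Look up a syscall name, choosing the table by task type.

    compat=True models a 32-bit task running on a 64-bit kernel.
    """
    table = SYSCALL_32_NUM2NAME if compat else SYSCALL_64_NUM2NAME
    return table.get(num, "unknown")


if __name__ == "__main__":
    print(syscall_name(10))               # 64-bit task
    print(syscall_name(10, compat=True))  # 32-bit compat task
```

A lookup that always consults the 64-bit table would therefore report the wrong name for any 32-bit process, which is why the probe needs some way to tell which kind of task made the call.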
end of thread

Thread overview: 7+ messages
2019-05-16 13:41 Getting systemtap examples working with --bpf backend William Cohen
2019-05-20 19:52 ` William Cohen
2019-05-21 17:11   ` William Cohen
2019-05-22 20:51     ` Frank Ch. Eigler
2019-05-22 21:12       ` William Cohen
2019-05-22 21:39         ` Serhei Makarov
2019-06-03 15:51 ` William Cohen