* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-10 21:12 ` David Long
@ 2015-12-11 4:19 ` Pratyush Anand
2015-12-11 4:43 ` David Long
2015-12-11 17:02 ` William Cohen
2015-12-11 20:59 ` William Cohen
2 siblings, 1 reply; 13+ messages in thread
From: Pratyush Anand @ 2015-12-11 4:19 UTC (permalink / raw)
To: David Long; +Cc: William Cohen, systemtap
On 10/12/2015:04:12:30 PM, David Long wrote:
> On 12/10/2015 03:24 PM, William Cohen wrote:
> >Hi All,
> >
> >Dave Long and Pratyush Anand have been working on kprobe and uprobe patches for aarch64. I have built a local version of the uprobe/upstream_arm64_devel branch of https://github.com/pratyushanand/linux which includes those patches in a linux-4.4.0-rc3 kernel.
> >
> >The tests seemed to run fairly well and the results have been uploaded to dejazilla:
> >
> >https://web.elastic.org/~dejazilla/viewsummary.php?summary=%3D%27%3C56698DCC.3090207%40redhat.com%3E%27
> >
> > === systemtap Summary ===
> >
> ># of expected passes 6096
> ># of unexpected failures 111
> ># of unexpected successes 2
> ># of expected failures 333
> ># of unknown successes 2
> ># of known failures 89
> ># of untested testcases 97
> ># of unsupported tests 27
> >runtest completed at Thu Dec 10 05:54:32 2015
> >
> >
> >There are still some areas needing work for aarch64, such as stack backtrace support.
> >
> >The following failure looks suspect because the child process died:
> >
> >
> >spawn stap -g ./systemtap.examples/process/threadstacks.stp -Gsize=65536 -c /root/systemtap_write/systemtap/testsuite/pthread_stacks.x 1024 0 -d /root/systemtap_write/systemtap/testsuite/pthread_stacks.x
> >
> >pthread_stacks.x: ./systemtap.base/pthread_stacks.c:67: main: Assertion `rc == 0' failed.
> >
> >WARNING: Child process exited with signal 6 (Aborted)
> >
> >pthread_stacks.[3567] overwrote __default_stacksize@0x3ffb3be4338 (8388608->65536)
> >
> >WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
> >
> >Pass 5: run failed. [man error::pass5]
> >
> >FAIL: pthread_stacks -Gsize (0 0)
> >
> >The fslatency-nd and fsslower-nd tests need further investigation:
> >
> >PASS: ./systemtap.examples/lwtools/fslatency-nd build
> >meta taglines 'test_installcheck: stap fslatency-nd.stp 1 1' tag 'test_installcheck' value 'stap fslatency-nd.stp 1 1'
> >attempting command stap fslatency-nd.stp 1 1
> >OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fslatency-nd.stp:66:15
> >Tracing FS sync reads and writes... Output every 1 secs.
> >WARNING: Number of errors: 1, skipped probes: 1
> >WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
> >Pass 5: run failed. [man error::pass5]
> >child process exited abnormally
> >RC 1
> >FAIL: ./systemtap.examples/lwtools/fslatency-nd run
> >
> >PASS: ./systemtap.examples/lwtools/fsslower-nd build
> >meta taglines 'test_installcheck: stap fsslower-nd.stp -c "sleep 1"' tag 'test_installcheck' value 'stap fsslower-nd.stp -c "sleep 1"'
> >attempting command stap fsslower-nd.stp -c "sleep 1"
> >OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fsslower-nd.stp:68:15
> >Tracing FS sync reads and writes slower than 10 ms... Hit Ctrl-C to end.
> >TIME PID COMM FUNC SIZE LAT(ms)
> >WARNING: Number of errors: 1, skipped probes: 1
> >WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
> >Pass 5: run failed. [man error::pass5]
> >child process exited abnormally
> >RC 1
> >FAIL: ./systemtap.examples/lwtools/fsslower-nd run
> >
> >
> >Also a number of network tests failed like the following
> >
> >
> >TEST PWD=/root/systemtap_write/systemtap/testsuite/systemtap.examples/network
> >meta taglines 'test_check: stap -g -p4 netfilter_drop.stp TCP 1' tag 'test_check' value 'stap -g -p4 netfilter_drop.stp TCP 1'
> >attempting command stap -g -p4 netfilter_drop.stp TCP 1
> >OUT /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2731:1: error: initialization from incompatible pointer type [-Werror]
> > .hook = enter_netfilter_probe_0,
> > ^
> >/tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2731:1: error: (near initialization for 'netfilter_opts_0.hook') [-Werror]
> >/tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: unknown field 'owner' specified in initializer
> > .owner = THIS_MODULE,
> > ^
> >/tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: initialization from incompatible pointer type [-Werror]
> >/tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: (near initialization for 'netfilter_opts_0.dev') [-Werror]
> >cc1: all warnings being treated as errors
> >make[4]: *** [/tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.o] Error 1
> >make[3]: *** [_module_/tmp/stapbIEqFl] Error 2
> >WARNING: kbuild exited with status: 2
> >Pass 4: compilation failed. [man error::pass4]
> >child process exited abnormally
> >RC 1
> >FAIL: ./systemtap.examples/network/netfilter_drop build
> >
> >-Will
>
>
> Cool. Wish I could make sense of systemtap error messages.
>
> At Will Deacon's suggestion I tested probing the instruction in
> __copy_to_user that can cause a captured kernel exception when an
> application passes in a bad buffer address. Unfortunately the result was a
> hang. So copy_to/from user is going to have to be blacklisted for now,
> unless there turns out to be a simple fix. I'm worried there might be other
> places in the kernel where an otherwise probeable instruction might be
> expected to generate an exception.
There are many arm64-specific functions which need blacklisting. I have them
here:
https://github.com/pratyushanand/linux/commit/4098b5ad2c67bf4c375981fc68793f44af005eb9
https://github.com/pratyushanand/linux/commit/df3e76cbf70a8e1af42951d4b30587f022d25938
I think uprobe_pre/post_sstep_notifier() should also be blacklisted.
https://github.com/pratyushanand/linux/commit/99c89512931a46582d2f026b7288c895b8ef320c
Certainly, there could be more functions which need kprobe blacklisting:
I remember placing a kprobe on every function in kallsyms on an x86
platform, and it crashed.
~Pratyush
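For checking which symbols a running kernel already refuses to probe, a minimal shell sketch follows. It assumes debugfs is mounted at /sys/kernel/debug and that the kernel exposes kprobes/blacklist there with one "start-end<TAB>symbol" entry per line. The helper name kprobe_blacklisted is made up for illustration and is not part of any of the patches linked above.

```shell
# kprobe_blacklisted SYMBOL [FILE]
# Succeeds if SYMBOL appears as the symbol column of the kprobe blacklist.
# FILE defaults to the debugfs path (an assumption about the mount point).
# Returns non-zero when the symbol is absent or the list cannot be read.
kprobe_blacklisted() {
    sym=$1
    list=${2:-/sys/kernel/debug/kprobes/blacklist}
    awk -v sym="$sym" '$2 == sym { found = 1 } END { exit !found }' "$list"
}
```

For example, running `kprobe_blacklisted __copy_to_user && echo "already blacklisted"` as root would report whether the copy routine discussed above is covered on that kernel.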
* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-11 4:19 ` Pratyush Anand
@ 2015-12-11 4:43 ` David Long
0 siblings, 0 replies; 13+ messages in thread
From: David Long @ 2015-12-11 4:43 UTC (permalink / raw)
To: Pratyush Anand; +Cc: William Cohen, systemtap
On 12/10/2015 11:19 PM, Pratyush Anand wrote:
> On 10/12/2015:04:12:30 PM, David Long wrote:
>> On 12/10/2015 03:24 PM, William Cohen wrote:
>>> Hi All,
>>>
>>> Dave Long and Pratyush Anand have been working on kprobe and uprobe patches for aarch64. I have built a local version of the uprobe/upstream_arm64_devel branch of https://github.com/pratyushanand/linux which includes those patches in a linux-4.4.0-rc3 kernel.
>>>
>>> The tests seemed to run fairly well and the results have been uploaded to dejazilla:
>>>
>>> https://web.elastic.org/~dejazilla/viewsummary.php?summary=%3D%27%3C56698DCC.3090207%40redhat.com%3E%27
>>>
>>> === systemtap Summary ===
>>>
>>> # of expected passes 6096
>>> # of unexpected failures 111
>>> # of unexpected successes 2
>>> # of expected failures 333
>>> # of unknown successes 2
>>> # of known failures 89
>>> # of untested testcases 97
>>> # of unsupported tests 27
>>> runtest completed at Thu Dec 10 05:54:32 2015
>>>
>>>
>>> There are still some areas needing work for aarch64, such as stack backtrace support.
>>>
>>> The following failure looks suspect because the child process died:
>>>
>>>
>>> spawn stap -g ./systemtap.examples/process/threadstacks.stp -Gsize=65536 -c /root/systemtap_write/systemtap/testsuite/pthread_stacks.x 1024 0 -d /root/systemtap_write/systemtap/testsuite/pthread_stacks.x
>>>
>>> pthread_stacks.x: ./systemtap.base/pthread_stacks.c:67: main: Assertion `rc == 0' failed.
>>>
>>> WARNING: Child process exited with signal 6 (Aborted)
>>>
>>> pthread_stacks.[3567] overwrote __default_stacksize@0x3ffb3be4338 (8388608->65536)
>>>
>>> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
>>>
>>> Pass 5: run failed. [man error::pass5]
>>>
>>> FAIL: pthread_stacks -Gsize (0 0)
>>>
>>> The fslatency-nd and fsslower-nd tests need further investigation:
>>>
>>> PASS: ./systemtap.examples/lwtools/fslatency-nd build
>>> meta taglines 'test_installcheck: stap fslatency-nd.stp 1 1' tag 'test_installcheck' value 'stap fslatency-nd.stp 1 1'
>>> attempting command stap fslatency-nd.stp 1 1
>>> OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fslatency-nd.stp:66:15
>>> Tracing FS sync reads and writes... Output every 1 secs.
>>> WARNING: Number of errors: 1, skipped probes: 1
>>> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
>>> Pass 5: run failed. [man error::pass5]
>>> child process exited abnormally
>>> RC 1
>>> FAIL: ./systemtap.examples/lwtools/fslatency-nd run
>>>
>>> PASS: ./systemtap.examples/lwtools/fsslower-nd build
>>> meta taglines 'test_installcheck: stap fsslower-nd.stp -c "sleep 1"' tag 'test_installcheck' value 'stap fsslower-nd.stp -c "sleep 1"'
>>> attempting command stap fsslower-nd.stp -c "sleep 1"
>>> OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fsslower-nd.stp:68:15
>>> Tracing FS sync reads and writes slower than 10 ms... Hit Ctrl-C to end.
>>> TIME PID COMM FUNC SIZE LAT(ms)
>>> WARNING: Number of errors: 1, skipped probes: 1
>>> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
>>> Pass 5: run failed. [man error::pass5]
>>> child process exited abnormally
>>> RC 1
>>> FAIL: ./systemtap.examples/lwtools/fsslower-nd run
>>>
>>>
>>> Also a number of network tests failed like the following
>>>
>>>
>>> TEST PWD=/root/systemtap_write/systemtap/testsuite/systemtap.examples/network
>>> meta taglines 'test_check: stap -g -p4 netfilter_drop.stp TCP 1' tag 'test_check' value 'stap -g -p4 netfilter_drop.stp TCP 1'
>>> attempting command stap -g -p4 netfilter_drop.stp TCP 1
>>> OUT /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2731:1: error: initialization from incompatible pointer type [-Werror]
>>> .hook = enter_netfilter_probe_0,
>>> ^
>>> /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2731:1: error: (near initialization for 'netfilter_opts_0.hook') [-Werror]
>>> /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: unknown field 'owner' specified in initializer
>>> .owner = THIS_MODULE,
>>> ^
>>> /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: initialization from incompatible pointer type [-Werror]
>>> /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: (near initialization for 'netfilter_opts_0.dev') [-Werror]
>>> cc1: all warnings being treated as errors
>>> make[4]: *** [/tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.o] Error 1
>>> make[3]: *** [_module_/tmp/stapbIEqFl] Error 2
>>> WARNING: kbuild exited with status: 2
>>> Pass 4: compilation failed. [man error::pass4]
>>> child process exited abnormally
>>> RC 1
>>> FAIL: ./systemtap.examples/network/netfilter_drop build
>>>
>>> -Will
>>
>>
>> Cool. Wish I could make sense of systemtap error messages.
>>
>> At Will Deacon's suggestion I tested probing the instruction in
>> __copy_to_user that can cause a captured kernel exception when an
>> application passes in a bad buffer address. Unfortunately the result was a
>> hang. So copy_to/from user is going to have to be blacklisted for now,
>> unless there turns out to be a simple fix. I'm worried there might be other
>> places in the kernel where an otherwise probeable instruction might be
>> expected to generate an exception.
>
> There are many arm64-specific functions which need blacklisting. I have them
> here:
>
> https://github.com/pratyushanand/linux/commit/4098b5ad2c67bf4c375981fc68793f44af005eb9
> https://github.com/pratyushanand/linux/commit/df3e76cbf70a8e1af42951d4b30587f022d25938
>
> I think uprobe_pre/post_sstep_notifier() should also be blacklisted.
>
> https://github.com/pratyushanand/linux/commit/99c89512931a46582d2f026b7288c895b8ef320c
>
> Certainly, there could be more functions which need kprobe blacklisting:
> I remember placing a kprobe on every function in kallsyms on an x86
> platform, and it crashed.
>
> ~Pratyush
>
OK, it sounds like this has been just a "best effort" on x86 then? I'll
blacklist the copy functions and move on.
-dl
* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-10 21:12 ` David Long
2015-12-11 4:19 ` Pratyush Anand
@ 2015-12-11 17:02 ` William Cohen
2015-12-16 5:22 ` Pratyush Anand
2015-12-11 20:59 ` William Cohen
2 siblings, 1 reply; 13+ messages in thread
From: William Cohen @ 2015-12-11 17:02 UTC (permalink / raw)
To: David Long, systemtap; +Cc: Pratyush Anand
On 12/10/2015 04:12 PM, David Long wrote:
> On 12/10/2015 03:24 PM, William Cohen wrote:
>> Hi All,
>>
>> Dave Long and Pratyush Anand have been working on kprobe and uprobe patches for aarch64. I have built a local version of the uprobe/upstream_arm64_devel branch of https://github.com/pratyushanand/linux which includes those patches in a linux-4.4.0-rc3 kernel.
>>
>> The tests seemed to run fairly well and the results have been uploaded to dejazilla:
>>
>> https://web.elastic.org/~dejazilla/viewsummary.php?summary=%3D%27%3C56698DCC.3090207%40redhat.com%3E%27
>>
>> === systemtap Summary ===
>>
>> # of expected passes 6096
>> # of unexpected failures 111
>> # of unexpected successes 2
>> # of expected failures 333
>> # of unknown successes 2
>> # of known failures 89
>> # of untested testcases 97
>> # of unsupported tests 27
>> runtest completed at Thu Dec 10 05:54:32 2015
>>
>>
>> There are still some areas needing work for aarch64, such as stack backtrace support.
>>
>> The following failure looks suspect because the child process died:
>>
>>
>> spawn stap -g ./systemtap.examples/process/threadstacks.stp -Gsize=65536 -c /root/systemtap_write/systemtap/testsuite/pthread_stacks.x 1024 0 -d /root/systemtap_write/systemtap/testsuite/pthread_stacks.x
>>
>> pthread_stacks.x: ./systemtap.base/pthread_stacks.c:67: main: Assertion `rc == 0' failed.
>>
>> WARNING: Child process exited with signal 6 (Aborted)
>>
>> pthread_stacks.[3567] overwrote __default_stacksize@0x3ffb3be4338 (8388608->65536)
>>
>> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
>>
>> Pass 5: run failed. [man error::pass5]
>>
>> FAIL: pthread_stacks -Gsize (0 0)
>>
>> The fslatency-nd and fsslower-nd tests need further investigation:
>>
>> PASS: ./systemtap.examples/lwtools/fslatency-nd build
>> meta taglines 'test_installcheck: stap fslatency-nd.stp 1 1' tag 'test_installcheck' value 'stap fslatency-nd.stp 1 1'
>> attempting command stap fslatency-nd.stp 1 1
>> OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fslatency-nd.stp:66:15
>> Tracing FS sync reads and writes... Output every 1 secs.
>> WARNING: Number of errors: 1, skipped probes: 1
>> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
>> Pass 5: run failed. [man error::pass5]
>> child process exited abnormally
>> RC 1
>> FAIL: ./systemtap.examples/lwtools/fslatency-nd run
>>
>> PASS: ./systemtap.examples/lwtools/fsslower-nd build
>> meta taglines 'test_installcheck: stap fsslower-nd.stp -c "sleep 1"' tag 'test_installcheck' value 'stap fsslower-nd.stp -c "sleep 1"'
>> attempting command stap fsslower-nd.stp -c "sleep 1"
>> OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fsslower-nd.stp:68:15
>> Tracing FS sync reads and writes slower than 10 ms... Hit Ctrl-C to end.
>> TIME PID COMM FUNC SIZE LAT(ms)
>> WARNING: Number of errors: 1, skipped probes: 1
>> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
>> Pass 5: run failed. [man error::pass5]
>> child process exited abnormally
>> RC 1
>> FAIL: ./systemtap.examples/lwtools/fsslower-nd run
>>
>>
>> Also a number of network tests failed like the following
>>
>>
>> TEST PWD=/root/systemtap_write/systemtap/testsuite/systemtap.examples/network
>> meta taglines 'test_check: stap -g -p4 netfilter_drop.stp TCP 1' tag 'test_check' value 'stap -g -p4 netfilter_drop.stp TCP 1'
>> attempting command stap -g -p4 netfilter_drop.stp TCP 1
>> OUT /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2731:1: error: initialization from incompatible pointer type [-Werror]
>> .hook = enter_netfilter_probe_0,
>> ^
>> /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2731:1: error: (near initialization for 'netfilter_opts_0.hook') [-Werror]
>> /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: unknown field 'owner' specified in initializer
>> .owner = THIS_MODULE,
>> ^
>> /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: initialization from incompatible pointer type [-Werror]
>> /tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.c:2732:1: error: (near initialization for 'netfilter_opts_0.dev') [-Werror]
>> cc1: all warnings being treated as errors
>> make[4]: *** [/tmp/stapbIEqFl/stap_0c0db8521e3ea3b2d26dcde208a77baf_21082_src.o] Error 1
>> make[3]: *** [_module_/tmp/stapbIEqFl] Error 2
>> WARNING: kbuild exited with status: 2
>> Pass 4: compilation failed. [man error::pass4]
>> child process exited abnormally
>> RC 1
>> FAIL: ./systemtap.examples/network/netfilter_drop build
>>
>> -Will
>
>
> Cool. Wish I could make sense of systemtap error messages.
Hi Dave,
This was a data dump, so I haven't made sense of some of it either. :) The ".hook=..." and ".owner=..." errors are problems in the systemtap code generation for newer kernels and don't concern the aarch64 kprobes/uprobes work. The read faults for fslatency-nd.stp and fsslower-nd.stp need to be checked more carefully, but they are likely issues with systemtap (they do work on linux-4.2.0 on x86_64).
The "FAIL: pthread_stacks -Gsize (0 0)" looks like it could be an issue with uprobes affecting the running of the program. Pratyush are you able to run this systemtap test locally?
>
> At Will Deacon's suggestion I tested probing the instruction in __copy_to_user that can cause a captured kernel exception when an application passes in a bad buffer address. Unfortunately the result was a hang. So copy_to/from user is going to have to be blacklisted for now, unless there turns out to be a simple fix. I'm worried there might be other places in the kernel where an otherwise probeable instruction might be expected to generate an exception.
>
> -dl
So the problem is nested exceptions during the single-step debug exception? Are there other places in the kernel where similar exceptions could occur, such as memory management code probing an address to verify it is valid?
-Will
* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-11 17:02 ` William Cohen
@ 2015-12-16 5:22 ` Pratyush Anand
2015-12-16 13:14 ` William Cohen
0 siblings, 1 reply; 13+ messages in thread
From: Pratyush Anand @ 2015-12-16 5:22 UTC (permalink / raw)
To: William Cohen; +Cc: David Long, systemtap
On 11/12/2015:12:02:21 PM, William Cohen wrote:
>
> The "FAIL: pthread_stacks -Gsize (0 0)" looks like it could be an issue with uprobes affecting the running of the program. Pratyush are you able to run this systemtap test locally?
Even when I run this test locally it does not work, but it fails very early in
my case. Maybe because of a different libpthread.so:
[root@amd-seattle-01 testsuite]# /root/bin/systemtap/bin/stap -gp4 ./systemtap.examples/process/threadstacks.stp -Gsize=65536 -d /root/systemtap/testsuite/pthread_stacks.x
semantic error: while resolving probe point: identifier 'process' at ./systemtap.examples/process/threadstacks.stp:17:7
source: probe process("/lib*/libpthread.so.*").function("allocate_stack") {
^
semantic error: no match
Pass 2: analysis failed. [man error::pass2]
[root@amd-seattle-01 testsuite]# ls /lib*/libpthread.so.*
/lib64/libpthread.so.0
[root@amd-seattle-01 testsuite]# ll /lib64/libpthread.so.0
lrwxrwxrwx. 1 root root 18 Dec 13 23:42 /lib64/libpthread.so.0 -> libpthread-2.17.so
[root@amd-seattle-01 testsuite]# objdump -d /lib64/libpthread.so.0 | grep allocate_stack
0000000000006a50 <__deallocate_stack>:
6a7c: 54000061 b.ne 6a88 <__deallocate_stack+0x38>
6a84: 35ffff83 cbnz w3, 6a74 <__deallocate_stack+0x24>
6a88: 540005e1 b.ne 6b44 <__deallocate_stack+0xf4>
6a90: 350005e0 cbnz w0, 6b4c <__deallocate_stack+0xfc>
6ac4: 350005e2 cbnz w2, 6b80 <__deallocate_stack+0x130>
6b14: 54000328 b.hi 6b78 <__deallocate_stack+0x128>
6b2c: 35ffffc2 cbnz w2, 6b24 <__deallocate_stack+0xd4>
6b34: 5400014c b.gt 6b5c <__deallocate_stack+0x10c>
6b48: 17ffffd1 b 6a8c <__deallocate_stack+0x3c>
6b58: 17ffffcf b 6a94 <__deallocate_stack+0x44>
6b74: 17fffff1 b 6b38 <__deallocate_stack+0xe8>
6b7c: 17ffffe7 b 6b18 <__deallocate_stack+0xc8>
6b8c: 17ffffe3 b 6b18 <__deallocate_stack+0xc8>
6c3c: 97ffff85 bl 6a50 <__deallocate_stack>
7ce4: 97fffb5b bl 6a50 <__deallocate_stack>
7f04: 97fffad3 bl 6a50 <__deallocate_stack>
894c: 97fff841 bl 6a50 <__deallocate_stack>
[root@amd-seattle-01 testsuite]#
~Pratyush
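One reading of the objdump output above (an assumption, not something established in the thread at this point) is that the grep only hits __deallocate_stack because "allocate_stack" is a substring of that name, and that no function label called exactly allocate_stack exists in the library's symbol table. A small shell sketch for making that distinction explicit follows. The helper name has_function_label is made up for illustration.

```shell
# has_function_label NAME
# Reads `objdump -d` output on stdin and succeeds only if it contains a
# function label line ("<address> <NAME>:") whose name is exactly NAME.
# Branch targets such as "bl 6a50 <__deallocate_stack>" are not counted,
# because there the label is not the second field and has no trailing colon.
has_function_label() {
    awk -v want="<$1>:" '$2 == want { found = 1 } END { exit !found }'
}
```

Usage would look like `objdump -d /lib64/libpthread.so.0 | has_function_label allocate_stack || echo "no such label"`. If the label is missing, stap would have to find the function some other way, for instance from separate debug information.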
* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-16 5:22 ` Pratyush Anand
@ 2015-12-16 13:14 ` William Cohen
2015-12-17 0:53 ` Pratyush Anand
0 siblings, 1 reply; 13+ messages in thread
From: William Cohen @ 2015-12-16 13:14 UTC (permalink / raw)
To: Pratyush Anand; +Cc: David Long, systemtap
On 12/16/2015 12:22 AM, Pratyush Anand wrote:
> On 11/12/2015:12:02:21 PM, William Cohen wrote:
>>
>> The "FAIL: pthread_stacks -Gsize (0 0)" looks like it could be an issue with uprobes affecting the running of the program. Pratyush are you able to run this systemtap test locally?
>
> Even when I run this test locally it does not work, but it fails very early in
> my case. Maybe because of a different libpthread.so:
>
> [root@amd-seattle-01 testsuite]# /root/bin/systemtap/bin/stap -gp4 ./systemtap.examples/process/threadstacks.stp -Gsize=65536 -d /root/systemtap/testsuite/pthread_stacks.x
> semantic error: while resolving probe point: identifier 'process' at ./systemtap.examples/process/threadstacks.stp:17:7
> source: probe process("/lib*/libpthread.so.*").function("allocate_stack") {
> ^
>
> semantic error: no match
You might need to install glibc-debuginfo. Below is some information from the machine I have set up, showing that the probe point is available and what glibc packages are installed on the machine:
[root@apm-mustang-ev3-01 systemtap]# ../install/bin/stap -L 'process("/lib*/libpthread.so.*").function("allocate_stack")'
process("/usr/lib64/libpthread-2.17.so").function("allocate_stack@/usr/src/debug/glibc-2.17-c758a686/nptl/allocatestack.c:344") $stack:void** $pdp:struct pthread** $attr:struct pthread_attr const*
[root@apm-mustang-ev3-01 systemtap]# rpm -qf /usr/lib64/libpthread-2.17.so
glibc-2.17-105.el7.aarch64
[root@apm-mustang-ev3-01 systemtap]# rpm -qa|grep glibc
glibc-common-2.17-105.el7.aarch64
glibc-devel-2.17-105.el7.aarch64
glibc-debuginfo-2.17-105.el7.aarch64
glibc-headers-2.17-105.el7.aarch64
glibc-2.17-105.el7.aarch64
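The same check can be scripted. The sketch below assumes an RPM-based system where `rpm -qa` prints one name-version-release.arch string per line, as in the listing above. The helper name debuginfo_matches is made up for illustration, and the rule that the debuginfo package must carry the same version string as glibc is an assumption about how the packages are expected to pair up.

```shell
# debuginfo_matches
# Reads a package list (as printed by `rpm -qa`) on stdin and succeeds only
# if glibc-debuginfo is present with the same version-release.arch as glibc.
debuginfo_matches() {
    awk '
        /^glibc-[0-9]/           { sub(/^glibc-/, "");           lib = $0 }
        /^glibc-debuginfo-[0-9]/ { sub(/^glibc-debuginfo-/, ""); dbg = $0 }
        END { exit !(lib != "" && lib == dbg) }
    '
}
```

For example: `rpm -qa | debuginfo_matches || echo "install the matching glibc-debuginfo first"`.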
-Will
>
> Pass 2: analysis failed. [man error::pass2]
> [root@amd-seattle-01 testsuite]# ls /lib*/libpthread.so.*
> /lib64/libpthread.so.0
> [root@amd-seattle-01 testsuite]# ll /lib64/libpthread.so.0
> lrwxrwxrwx. 1 root root 18 Dec 13 23:42 /lib64/libpthread.so.0 -> libpthread-2.17.so
> [root@amd-seattle-01 testsuite]# objdump -d /lib64/libpthread.so.0 | grep allocate_stack
> 0000000000006a50 <__deallocate_stack>:
> 6a7c: 54000061 b.ne 6a88 <__deallocate_stack+0x38>
> 6a84: 35ffff83 cbnz w3, 6a74 <__deallocate_stack+0x24>
> 6a88: 540005e1 b.ne 6b44 <__deallocate_stack+0xf4>
> 6a90: 350005e0 cbnz w0, 6b4c <__deallocate_stack+0xfc>
> 6ac4: 350005e2 cbnz w2, 6b80 <__deallocate_stack+0x130>
> 6b14: 54000328 b.hi 6b78 <__deallocate_stack+0x128>
> 6b2c: 35ffffc2 cbnz w2, 6b24 <__deallocate_stack+0xd4>
> 6b34: 5400014c b.gt 6b5c <__deallocate_stack+0x10c>
> 6b48: 17ffffd1 b 6a8c <__deallocate_stack+0x3c>
> 6b58: 17ffffcf b 6a94 <__deallocate_stack+0x44>
> 6b74: 17fffff1 b 6b38 <__deallocate_stack+0xe8>
> 6b7c: 17ffffe7 b 6b18 <__deallocate_stack+0xc8>
> 6b8c: 17ffffe3 b 6b18 <__deallocate_stack+0xc8>
> 6c3c: 97ffff85 bl 6a50 <__deallocate_stack>
> 7ce4: 97fffb5b bl 6a50 <__deallocate_stack>
> 7f04: 97fffad3 bl 6a50 <__deallocate_stack>
> 894c: 97fff841 bl 6a50 <__deallocate_stack>
> [root@amd-seattle-01 testsuite]#
>
> ~Pratyush
>
* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-16 13:14 ` William Cohen
@ 2015-12-17 0:53 ` Pratyush Anand
0 siblings, 0 replies; 13+ messages in thread
From: Pratyush Anand @ 2015-12-17 0:53 UTC (permalink / raw)
To: William Cohen; +Cc: David Long, systemtap
On 16/12/2015:08:14:01 AM, William Cohen wrote:
> On 12/16/2015 12:22 AM, Pratyush Anand wrote:
> > On 11/12/2015:12:02:21 PM, William Cohen wrote:
> >>
> >> The "FAIL: pthread_stacks -Gsize (0 0)" looks like it could be an issue with uprobes affecting the running of the program. Pratyush are you able to run this systemtap test locally?
> >
> > Even when I run this test locally it does not work, but it fails very early in
> > my case. Maybe because of a different libpthread.so:
> >
> > [root@amd-seattle-01 testsuite]# /root/bin/systemtap/bin/stap -gp4 ./systemtap.examples/process/threadstacks.stp -Gsize=65536 -d /root/systemtap/testsuite/pthread_stacks.x
> > semantic error: while resolving probe point: identifier 'process' at ./systemtap.examples/process/threadstacks.stp:17:7
> > source: probe process("/lib*/libpthread.so.*").function("allocate_stack") {
> > ^
> >
> > semantic error: no match
>
> You might need to install glibc-debuginfo. Below is some information from the machine I have set up, showing that the probe point is available and what glibc packages are installed on the machine:
Thanks. After installing glibc-debuginfo I see the test is passing locally.
PASS: ./systemtap.examples/process/thread-business run
meta taglines '' tag 'output' value ''
PRETEST PWD=/root/systemtap/testsuite
meta taglines '' tag 'test_support' value ''
TEST PWD=/root/systemtap/testsuite/systemtap.examples/process
sourcing threadstacks.tcl for ./systemtap.examples/process/threadstacks
meta taglines 'test_check: stap -gp4 threadstacks.stp -Gsize=65536 -d `which stap`' tag 'test_check' value 'stap -gp4 threadstacks.stp -Gsize=65536 -d `which stap`'
attempting command stap -gp4 threadstacks.stp -Gsize=65536 -d `which stap`
OUT /root/systemtap/testsuite/.systemtap-root/cache/71/stap_715823b07593f6d19206a3fa54071a9b_9758.ko
RC 0
PASS: ./systemtap.examples/process/threadstacks build
meta taglines 'test_installcheck: stap -g threadstacks.stp -Gsize=65536 -c "sleep 1" -d `which stap`' tag 'test_installcheck' value 'stap -g threadstacks.stp -Gsize=65536 -c "sleep 1" -d `which stap`'
attempting command stap -g threadstacks.stp -Gsize=65536 -c "sleep 1" -d `which stap`
OUT
RC 0
PASS: ./systemtap.examples/process/threadstacks run
~Pratyush
* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-10 21:12 ` David Long
2015-12-11 4:19 ` Pratyush Anand
2015-12-11 17:02 ` William Cohen
@ 2015-12-11 20:59 ` William Cohen
2015-12-16 11:55 ` Pratyush Anand
2 siblings, 1 reply; 13+ messages in thread
From: William Cohen @ 2015-12-11 20:59 UTC (permalink / raw)
To: David Long, systemtap; +Cc: Pratyush Anand
[-- Attachment #1: Type: text/plain, Size: 4811 bytes --]
On 12/10/2015 04:12 PM, David Long wrote:
> On 12/10/2015 03:24 PM, William Cohen wrote:
>> The fslatency-nd and fsslower-nd tests need further investigation:
>>
>> PASS: ./systemtap.examples/lwtools/fslatency-nd build
>> meta taglines 'test_installcheck: stap fslatency-nd.stp 1 1' tag 'test_installcheck' value 'stap fslatency-nd.stp 1 1'
>> attempting command stap fslatency-nd.stp 1 1
>> OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fslatency-nd.stp:66:15
>> Tracing FS sync reads and writes... Output every 1 secs.
>> WARNING: Number of errors: 1, skipped probes: 1
>> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
>> Pass 5: run failed. [man error::pass5]
>> child process exited abnormally
>> RC 1
>> FAIL: ./systemtap.examples/lwtools/fslatency-nd run
>>
>> PASS: ./systemtap.examples/lwtools/fsslower-nd build
>> meta taglines 'test_installcheck: stap fsslower-nd.stp -c "sleep 1"' tag 'test_installcheck' value 'stap fsslower-nd.stp -c "sleep 1"'
>> attempting command stap fsslower-nd.stp -c "sleep 1"
>> OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fsslower-nd.stp:68:15
>> Tracing FS sync reads and writes slower than 10 ms... Hit Ctrl-C to end.
>> TIME PID COMM FUNC SIZE LAT(ms)
>> WARNING: Number of errors: 1, skipped probes: 1
>> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
>> Pass 5: run failed. [man error::pass5]
>> child process exited abnormally
>> RC 1
>> FAIL: ./systemtap.examples/lwtools/fsslower-nd run
>
> Cool. Wish I could make sense of systemtap error messages.
>
> At Will Deacon's suggestion I tested probing the instruction in __copy_to_user that can cause a captured kernel exception when an application passes in a bad buffer address. Unfortunately the result was a hang. So copy_to/from user is going to have to be blacklisted for now, unless there turns out to be a simple fix. I'm worried there might be other places in the kernel where an otherwise probeable instruction might be expected to generate an exception.
>
> -dl
>
>
>
Hi Dave and Pratyush,
I did some more experimentation with the fslatency-nd and fsslower-nd tests to see what is going on. The problem seems to be related to the return probes. I have attached a small reproducer which runs fine on an x86_64 machine. However, on aarch64 it triggers the bogus read because some of the argument registers have changed value:
# ../install/bin/stap ./aarch64_retkprobe_issue2.stp
ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at ./aarch64_retkprobe_issue2.stp:13:7
pc : [<fffffe000021e37c>] lr : [<fffffe000021eb64>] pstate: 80000145
sp : fffffe00bad7be30
x29: fffffe00bad7be30 x28: fffffe00bad78000
x27: fffffe0000912000 x26: 000000000000003f
x25: 000000000000011d x24: 0000000000000015
x23: 0000000080000000 x22: 000003fff82b9760
x21: fffffe00bad7bec8 x20: 0000000000002004
x19: fffffe01b716e100 x18: 000003fff82b8160
x17: 000003ff849bf0a0 x16: fffffe000021f4a0
x15: 0000000000000004 x14: 000003fff82bb910
x13: 0000000000000001 x12: 000003ff7d75f200
x11: 00000000003d0f00 x10: 000003ff849b7af4
x9 : 0000000000000028 x8 : 0000000000000020
x7 : fffffe00bc5c3600 x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000000
x3 : fffffe00bad7bec8 x2 : 0000000000002004
x1 : 000003fff82b9760 x0 : fffffe01b716e100
pc : [<fffffe000021e37c>] lr : [<fffffe000009fbe0>] pstate: 60000145
sp : fffffe00bad7be30
x29: fffffe00bad7be30 x28: fffffe00bad78000
x27: fffffe0000912000 x26: 000000000000003f
x25: 000000000000011d x24: 0000000000000015
x23: 0000000080000000 x22: 000003fff82b9760
x21: fffffe00bad7bec8 x20: 0000000000002004
x19: fffffe01b716e100 x18: 000003fff82b8160
x17: 000003ff849bf0a0 x16: fffffe000021f4a0
x15: 0000000000000004 x14: 000003fff82bb910
x13: 0000000000000001 x12: 000003ff7d75f200
x11: 00000000003d0f00 x10: 000003ff849b7af4
x9 : 0000000000000028 x8 : 0000000000000020
x7 : fffffe00bc5c3600 x6 : 000003fff82b976c
x5 : 000003fff82b976c x4 : 0000000000000000
x3 : 0000000000000000 x2 : 0000000000000000
x1 : 0000000000000000 x0 : 000000000000000c
WARNING: Number of errors: 1, skipped probes: 1
WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
Pass 5: run failed. [man error::pass5]
Comment out the return probe by putting a '#' at the beginning of the line containing "kprobe.function("__vfs_read").return," and the script runs fine. The systemtap pointer_arg() function doesn't take into account that the register might be used as a scratch register and its value changed after entry into the function. This is an issue with the systemtap scripts. I have patched the systemtap scripts to address this issue.
-Will
[-- Attachment #2: aarch64_retkprobe_issue2.stp --]
[-- Type: text/plain, Size: 473 bytes --]
# The return probe appears to cause this reproducer to crash
# kprobe.function("__vfs_read").return causes a read fault
# comment out the kprobe.function("__vfs_read").return and the
# script runs without error
probe
  kprobe.function("__vfs_read").return,
  kprobe.function("__vfs_read")
{
  # Skip the call if new_sync_read() wouldn't be called.
  file = pointer_arg(1)
  if (file) {
    print_regs();
    if(@cast(file, "file")->f_op->read) {
      next
    }
  }
  exit()
}
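
One way to avoid reading a clobbered argument register in the return probe is to capture the argument in the entry probe and look it up again on return. The following is a minimal sketch of that pattern, not the actual change made to the example scripts. The global array name saved_file and the choice of tid() as the key are assumptions made for illustration only.

```systemtap
# Sketch: record the first argument at function entry, where the argument
# registers are still valid, and reuse the saved value in the return probe.
# "saved_file" is a hypothetical name used only for this illustration.
global saved_file

probe kprobe.function("__vfs_read")
{
  # At entry, pointer_arg(1) still refers to the caller-supplied argument.
  saved_file[tid()] = pointer_arg(1)
}

probe kprobe.function("__vfs_read").return
{
  # Do not call pointer_arg() here: by now the register may hold other data.
  file = saved_file[tid()]
  delete saved_file[tid()]
  if (file && @cast(file, "file")->f_op->read) {
    printf("%s(%d): __vfs_read returned %d\n", execname(), tid(), returnval())
  }
}
```

Keying the array by thread id keeps concurrent calls from different threads apart, and deleting the entry on return stops the array from growing without bound.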
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-11 20:59 ` William Cohen
@ 2015-12-16 11:55 ` Pratyush Anand
2015-12-16 13:10 ` William Cohen
0 siblings, 1 reply; 13+ messages in thread
From: Pratyush Anand @ 2015-12-16 11:55 UTC (permalink / raw)
To: William Cohen; +Cc: David Long, systemtap
On 11/12/2015:03:59:53 PM, William Cohen wrote:
> On 12/10/2015 04:12 PM, David Long wrote:
> > On 12/10/2015 03:24 PM, William Cohen wrote:
>
> >> The fslatency-nd and fsslower-nd tests need further investigation:
> >>
> >> PASS: ./systemtap.examples/lwtools/fslatency-nd build
> >> meta taglines 'test_installcheck: stap fslatency-nd.stp 1 1' tag 'test_installcheck' value 'stap fslatency-nd.stp 1 1'
> >> attempting command stap fslatency-nd.stp 1 1
> >> OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fslatency-nd.stp:66:15
> >> Tracing FS sync reads and writes... Output every 1 secs.
> >> WARNING: Number of errors: 1, skipped probes: 1
> >> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
> >> Pass 5: run failed. [man error::pass5]
> >> child process exited abnormally
> >> RC 1
> >> FAIL: ./systemtap.examples/lwtools/fslatency-nd run
> >>
> >> PASS: ./systemtap.examples/lwtools/fsslower-nd build
> >> meta taglines 'test_installcheck: stap fsslower-nd.stp -c "sleep 1"' tag 'test_installcheck' value 'stap fsslower-nd.stp -c "sleep 1"'
> >> attempting command stap fsslower-nd.stp -c "sleep 1"
> >> OUT ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at fsslower-nd.stp:68:15
> >> Tracing FS sync reads and writes slower than 10 ms... Hit Ctrl-C to end.
> >> TIME PID COMM FUNC SIZE LAT(ms)
> >> WARNING: Number of errors: 1, skipped probes: 1
> >> WARNING: /root/systemtap_write/install/bin/staprun exited with status: 1
> >> Pass 5: run failed. [man error::pass5]
> >> child process exited abnormally
> >> RC 1
> >> FAIL: ./systemtap.examples/lwtools/fsslower-nd run
>
> >
> > Cool. Wish I could make sense of systemtap error messages.
> >
> > At Will Deacon's suggestion I tested probing the instruction in __copy_to_user that can cause a captured kernel exception when an application passes in a bad buffer address. Unfortunately the result was a hang. So copy_to/from user is going to have to be blacklisted for now, unless there turns out to be a simple fix. I'm worried there might be other places in the kernel where an otherwise probeable instruction might be expected to generate an exception.
> >
> > -dl
> >
> >
> >
>
> Hi Dave and Pratyush,
>
> I did some more experimentation with the fslatency-nd and fsslower-nd tests to see what is going on. The problem seems to be related to the return probes. I have a small reproducer attached which runs fine on an x86_64 machine. However, on aarch64 it performs a bogus read because some of the argument registers have changed value:
>
> # ../install/bin/stap ./aarch64_retkprobe_issue2.stp
> ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at ./aarch64_retkprobe_issue2.stp:13:7
> pc : [<fffffe000021e37c>] lr : [<fffffe000021eb64>] pstate: 80000145
> sp : fffffe00bad7be30
> x29: fffffe00bad7be30 x28: fffffe00bad78000
> x27: fffffe0000912000 x26: 000000000000003f
> x25: 000000000000011d x24: 0000000000000015
> x23: 0000000080000000 x22: 000003fff82b9760
> x21: fffffe00bad7bec8 x20: 0000000000002004
> x19: fffffe01b716e100 x18: 000003fff82b8160
> x17: 000003ff849bf0a0 x16: fffffe000021f4a0
> x15: 0000000000000004 x14: 000003fff82bb910
> x13: 0000000000000001 x12: 000003ff7d75f200
> x11: 00000000003d0f00 x10: 000003ff849b7af4
> x9 : 0000000000000028 x8 : 0000000000000020
> x7 : fffffe00bc5c3600 x6 : 0000000000000000
> x5 : 0000000000000000 x4 : 0000000000000000
> x3 : fffffe00bad7bec8 x2 : 0000000000002004
> x1 : 000003fff82b9760 x0 : fffffe01b716e100
>
> pc : [<fffffe000021e37c>] lr : [<fffffe000009fbe0>] pstate: 60000145
> sp : fffffe00bad7be30
> x29: fffffe00bad7be30 x28: fffffe00bad78000
> x27: fffffe0000912000 x26: 000000000000003f
> x25: 000000000000011d x24: 0000000000000015
> x23: 0000000080000000 x22: 000003fff82b9760
> x21: fffffe00bad7bec8 x20: 0000000000002004
> x19: fffffe01b716e100 x18: 000003fff82b8160
> x17: 000003ff849bf0a0 x16: fffffe000021f4a0
> x15: 0000000000000004 x14: 000003fff82bb910
> x13: 0000000000000001 x12: 000003ff7d75f200
> x11: 00000000003d0f00 x10: 000003ff849b7af4
> x9 : 0000000000000028 x8 : 0000000000000020
> x7 : fffffe00bc5c3600 x6 : 000003fff82b976c
> x5 : 000003fff82b976c x4 : 0000000000000000
> x3 : 0000000000000000 x2 : 0000000000000000
> x1 : 0000000000000000 x0 : 000000000000000c

I am not sure, but this is what it seems to me:

The first argument (file) is in x0, which is 0xC in the kretprobe case. But can
x0 really be considered the 1st arg in a kretprobe?
I think x0 should hold the return value of __vfs_read() in a kretprobe, so
0xC could be the number of bytes read.

With perf I see:

# perf probe -k vmlinux __vfs_read_exit=__vfs_read%return file
Semantic error :You can't specify local variable for kretprobe.

So I am not sure what mechanism systemtap uses to get a local variable in a
kretprobe.

Moreover, on x86 I see that the script exits after the 1st print_regs(). That
means there was a valid file->f_op->read() for the 1st file itself. If I comment
out "kprobe.function("__vfs_read")", then there is no print at all. So on x86 we
are not hitting a case where the callback was called for the kretprobe with a
nonzero 1st argument.

~Pratyush
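
The guess that x0 carries the return value at a kretprobe could be checked with a short script along the following lines. This is only a sketch and has not been run against the kernel discussed in this thread. It assumes that register("x0") and ulong_arg() are available for aarch64 in the installed tapset, and the array name count_at_entry is hypothetical.

```systemtap
# Sketch: compare x0 at entry and at return for the same thread.
# "count_at_entry" is a hypothetical name used only for this illustration.
global count_at_entry

probe kprobe.function("__vfs_read")
{
  # The third argument of __vfs_read() is the requested byte count.
  count_at_entry[tid()] = ulong_arg(3)
  printf("entry:  tid=%d x0=0x%x count=%d\n", tid(), register("x0"), ulong_arg(3))
}

probe kprobe.function("__vfs_read").return
{
  # Only report a return whose matching entry was seen by this script.
  if (tid() in count_at_entry) {
    printf("return: tid=%d x0=0x%x (requested %d)\n",
           tid(), register("x0"), count_at_entry[tid()])
    delete count_at_entry[tid()]
    exit()
  }
}
```

If x0 at return is a small value no larger than the requested count, rather than the kernel pointer seen at entry, that would be consistent with x0 holding the number of bytes read.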
* Re: Recent aarch64 kprobes and uprobes patch systemtap testing
2015-12-16 11:55 ` Pratyush Anand
@ 2015-12-16 13:10 ` William Cohen
0 siblings, 0 replies; 13+ messages in thread
From: William Cohen @ 2015-12-16 13:10 UTC (permalink / raw)
To: Pratyush Anand; +Cc: David Long, systemtap
On 12/16/2015 06:55 AM, Pratyush Anand wrote:
> On 11/12/2015:03:59:53 PM, William Cohen wrote:
>> Hi Dave and Pratyush,
>>
>> I did some more experimentation with the fslatency-nd and fsslower-nd tests to see what is going on. The problem seems to be related to the return probes. I have a small reproducer attached which runs fine on an x86_64 machine. However, on aarch64 it performs a bogus read because some of the argument registers have changed value:
>>
>> # ../install/bin/stap ./aarch64_retkprobe_issue2.stp
>> ERROR: read fault [man error::fault] at 0x0000000000000034 (addr) near operator '@cast' at ./aarch64_retkprobe_issue2.stp:13:7
>> pc : [<fffffe000021e37c>] lr : [<fffffe000021eb64>] pstate: 80000145
>> sp : fffffe00bad7be30
>> x29: fffffe00bad7be30 x28: fffffe00bad78000
>> x27: fffffe0000912000 x26: 000000000000003f
>> x25: 000000000000011d x24: 0000000000000015
>> x23: 0000000080000000 x22: 000003fff82b9760
>> x21: fffffe00bad7bec8 x20: 0000000000002004
>> x19: fffffe01b716e100 x18: 000003fff82b8160
>> x17: 000003ff849bf0a0 x16: fffffe000021f4a0
>> x15: 0000000000000004 x14: 000003fff82bb910
>> x13: 0000000000000001 x12: 000003ff7d75f200
>> x11: 00000000003d0f00 x10: 000003ff849b7af4
>> x9 : 0000000000000028 x8 : 0000000000000020
>> x7 : fffffe00bc5c3600 x6 : 0000000000000000
>> x5 : 0000000000000000 x4 : 0000000000000000
>> x3 : fffffe00bad7bec8 x2 : 0000000000002004
>> x1 : 000003fff82b9760 x0 : fffffe01b716e100
>>
>> pc : [<fffffe000021e37c>] lr : [<fffffe000009fbe0>] pstate: 60000145
>> sp : fffffe00bad7be30
>> x29: fffffe00bad7be30 x28: fffffe00bad78000
>> x27: fffffe0000912000 x26: 000000000000003f
>> x25: 000000000000011d x24: 0000000000000015
>> x23: 0000000080000000 x22: 000003fff82b9760
>> x21: fffffe00bad7bec8 x20: 0000000000002004
>> x19: fffffe01b716e100 x18: 000003fff82b8160
>> x17: 000003ff849bf0a0 x16: fffffe000021f4a0
>> x15: 0000000000000004 x14: 000003fff82bb910
>> x13: 0000000000000001 x12: 000003ff7d75f200
>> x11: 00000000003d0f00 x10: 000003ff849b7af4
>> x9 : 0000000000000028 x8 : 0000000000000020
>> x7 : fffffe00bc5c3600 x6 : 000003fff82b976c
>> x5 : 000003fff82b976c x4 : 0000000000000000
>> x3 : 0000000000000000 x2 : 0000000000000000
>> x1 : 0000000000000000 x0 : 000000000000000c
>
> Although I am not sure, but this is what it seems to me:
>
> First argument (file) is in x0, and which is 0xC in case of kretprobe. But, can
> x0 really be considered as 1st arg in case of kretprobe?
> I think, x0 should have return value of __vfs_read() in case of kretprobe. So,
> 0xC could be the number of bytes read.
>
> With perf I see:
>
> # perf probe -k vmlinux __vfs_read_exit=__vfs_read%return file
> Semantic error :You can't specify local variable for kretprobe.
>
> So, I am not sure what mechanism systemtap uses to get local variable in case of
> kretprobe.
>
> Moreover, on x86 I see that loop exits after the 1st print_regs() only. So it
> means there was valid file->f_op->read() for the 1st file itself. If I comment
> "kprobe.function("__vfs_read")", then there is no print at all. It means, we are
> not hitting a case on x86 when callback was called for kretprobe and we had
> nonzero 1st argument.
>
> ~Pratyush
>

Hi Pratyush,

I found that the problem was that the systemtap scripts assumed x86_64 behavior, where the arguments passed into the function are in memory and are still around on return. For aarch64 (and other architectures that pass arguments via registers) the registers holding the values get clobbered. I checked a fix into the examples to address this problem:

https://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=commit;h=3d0c2f452f09a64b800aabe68508f8f0183f0ea1

I also filed a systemtap bug to look for this issue in other places in systemtap:

https://sourceware.org/bugzilla/show_bug.cgi?id=19360

-Will