* [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL
@ 2015-03-28 0:27 fche at redhat dot com
2015-03-28 0:31 ` [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, " fche at redhat dot com
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: fche at redhat dot com @ 2015-03-28 0:27 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=18171
Bug ID: 18171
Summary: setjmp/longjmp clashing with uprobes, causing SIGSEGV
or SIGILL
Product: systemtap
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: uprobes
Assignee: systemtap at sourceware dot org
Reporter: fche at redhat dot com
The testsuite/systemtap.base/bz5274* test case now regularly fails on modern
kernels such as 3.18.9-200.fc21.x86_64, on at least i686 & x86-64.
% gcc -g -o bz5274 .../testsuite/systemtap.base/bz5274.c
% cat > bz5274a.stp << EOF
probe never
# ,process("bz5274").function("funca")
,process("bz5274").function("funca").return
# ,process("bz5274").function("funcb")
,process("bz5274").function("funcb").return
,process("bz5274").function("funcc")
# ,process("bz5274").function("funcc").return
,process("bz5274").function("funcd")
# ,process("bz5274").function("funcd").return
# ,process("bz5274").function("main")
,process("bz5274").function("main").return # <-- out=SIGBUS vs in=SIGILL
{i++ }
global i
EOF
% sudo stap -t bz5274a.stp -c ./bz5274
In bz5274.c: funcb :20 : i=1. Calling funcc
In bz5274.c: funcc :13 : i=1. Calling funcd
In bz5274.c: funcd :7 : i=1. Calling longjmp
In bz5274.c: funcb :20 : i=2. Calling funcc
In bz5274.c: funcc :13 : i=2. Calling funcd
In bz5274.c: funcd :7 : i=2. Calling longjmp
In bz5274.c: funcb :20 : i=3. Calling funcc
In bz5274.c: funcc :13 : i=3. Calling funcd
In bz5274.c: funcd :7 : i=3. Calling longjmp
In bz5274.c: funcb :20 : i=4. Calling funcc
In bz5274.c: funcc :13 : i=4. Calling funcd
In bz5274.c: funcd :7 : i=4. Calling longjmp
WARNING: Child process exited with signal 11 (Segmentation fault)
i=0xa
----- probe hit report:
process(".../testsuite/bz5274").function("funcb@.../systemtap.base/bz5274.c:18").return,
(bz5274a.stp:1:1), hits: 2 [...]
process(".../testsuite/bz5274").function("funcc@.../systemtap.base/bz5274.c:11"),
(bz5274a.stp:1:1), hits: 4 [...]
process(".../testsuite/bz5274").function("funcd@.../systemtap.base/bz5274.c:5"),
(bz5274a.stp:1:1), hits: 4 [...]
end, (<synthetic>:1:1), hits: 1, cycles: 4608min/4608avg/4608max, from: end,
index: 6
----- refresh report:
WARNING: .../staprun exited with status: 1
Pass 5: run failed. [man error::pass5]
[same with -DSTP_ALIBI]
% coredumpctl (pick last $pid)
% sudo coredumpctl gdb $pid
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000400774 in funca (s=0xbffffffda1e80000 <error: Cannot access
memory at address 0xbffffffda1e80000>, env=0xb80000000abf)
at .../bz5274.c:34
(gdb) info registers
rax 0x4 4
rbx 0x0 0
rcx 0x7fffffe4 2147483620
rdx 0x400752 4196178
rsi 0x4 4
rdi 0x7ffd3654d590 140725514982800
rbp 0x41ff894156415741 0x41ff894156415741
rsp 0x4007d8 0x4007d8 <__libc_csu_init+8>
r8 0x7ffd3654d550 140725514982736
r9 0x7ffd3654d570 140725514982768
r10 0x0 0
r11 0x246 582
r12 0x400570 4195696
r13 0x7ffd3654d740 140725514983232
r14 0x0 0
r15 0x0 0
rip 0x400774 0x400774 <funca+62>
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
Note the corrupted $rsp value.
My attempts to reproduce this with "perf probe" failed, without exonerating the
kernel uprobes:
% stap -p2 bz5274a.stp) | sh .../scripts/stap2perf | sudo sh
[...]
probe:bz5274_main_return [Tracepoint event]
probe:bz5274_funcd [Tracepoint event]
probe:bz5274_funcc [Tracepoint event]
probe:bz5274_funcb_return [Tracepoint event]
probe:bz5274_funca_return [Tracepoint event]
% sudo perf record -e 'probe:*' -aR .../bz5274
[...]
In bz5274.c: funcc :13 : i=4. Calling funcd
In bz5274.c: funcd :7 : i=4. Calling longjmp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.888 MB perf.data (~38805 samples) ]
% sudo perf report
[...] "The perf.data file has no samples"
% sudo perf report -D
[...]
Aggregated stats:
TOTAL events: 7357
MMAP events: 96
COMM events: 648
EXIT events: 2
FORK events: 334
MMAP2 events: 6276
FINISHED_ROUND events: 1
probe:bz5274_main_return stats:
probe:bz5274_funcd stats:
probe:bz5274_funcc stats:
probe:bz5274_funcb_return stats:
probe:bz5274_funca_return stats:
... indicates zero uprobes hits of any sort
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, causing SIGSEGV or SIGILL
2015-03-28 0:27 [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL fche at redhat dot com
@ 2015-03-28 0:31 ` fche at redhat dot com
2015-03-30 14:17 ` mcermak at redhat dot com
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: fche at redhat dot com @ 2015-03-28 0:31 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=18171
Frank Ch. Eigler <fche at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mcermak at redhat dot com
Summary|setjmp/longjmp clashing |setjmp/longjmp clashing
|with uprobes, causing |with u[ret]probes, causing
|SIGSEGV or SIGILL |SIGSEGV or SIGILL
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, causing SIGSEGV or SIGILL
2015-03-28 0:27 [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL fche at redhat dot com
2015-03-28 0:31 ` [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, " fche at redhat dot com
@ 2015-03-30 14:17 ` mcermak at redhat dot com
2015-04-09 21:26 ` mjw at redhat dot com
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: mcermak at redhat dot com @ 2015-03-30 14:17 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=18171
--- Comment #1 from Martin Cermak <mcermak at redhat dot com> ---
Reproduced using perf. Problem is the order of commandline switches produced by
stap2perf: Flipping -a and -x seems to help:
=======
# perf probe -x /usr/share/systemtap/testsuite/bz5274 -a
bz5274_funca_return=funca%return
# perf probe -x /usr/share/systemtap/testsuite/bz5274 -a
bz5274_funcb_return=funcb%return
...
=======
After this I have:
=======
# cat //sys/kernel/debug/tracing/uprobe_events
r:probe_bz5274/bz5274_funca_return
/usr/share/systemtap/testsuite/bz5274:0x00000000000006e6
r:probe_bz5274/bz5274_funcb_return
/usr/share/systemtap/testsuite/bz5274:0x00000000000006a0
p:probe_bz5274/bz5274_funcc
/usr/share/systemtap/testsuite/bz5274:0x000000000000065a
p:probe_bz5274/bz5274_funcd
/usr/share/systemtap/testsuite/bz5274:0x0000000000000616
r:probe_bz5274/bz5274_main_return
/usr/share/systemtap/testsuite/bz5274:0x0000000000000725
=======
Now using -e 'probe_bz5274:*' instead of -e 'probe:*' I'm getting:
=======
# /root/bin/perf record -e 'probe_bz5274:*' -aR ./bz5274
In bz5274.c: funcb :20 : i=1. Calling funcc
In bz5274.c: funcc :13 : i=1. Calling funcd
In bz5274.c: funcd :7 : i=1. Calling longjmp
In bz5274.c: funcb :20 : i=2. Calling funcc
In bz5274.c: funcc :13 : i=2. Calling funcd
In bz5274.c: funcd :7 : i=2. Calling longjmp
In bz5274.c: funcb :20 : i=3. Calling funcc
In bz5274.c: funcc :13 : i=3. Calling funcd
In bz5274.c: funcd :7 : i=3. Calling longjmp
In bz5274.c: funcb :20 : i=4. Calling funcc
In bz5274.c: funcc :13 : i=4. Calling funcd
In bz5274.c: funcd :7 : i=4. Calling longjmp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.166 MB perf.data (10 samples) ]
Segmentation fault (core dumped)
[root@dhcp-1-102 testsuite]#
=======
But now I'm getting two corefiles:
=======
[root@dhcp-1-102 testsuite]# ls core*
core.2097 core.2098
[root@dhcp-1-102 testsuite]# file core.2200
core.2200: ELF 64-bit LSB core file x86-64, version 1 (SYSV), too many program
headers (163)
[root@dhcp-1-102 testsuite]# file core.2201
core.2201: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from
'./bz5274'
[root@dhcp-1-102 testsuite]#
[root@dhcp-1-102 testsuite]# gdb -q ./bz5274 core.2097
Reading symbols from ./bz5274...done.
warning: core file may not match specified executable file.
---------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^---------------------------------------------------
[New LWP 2097]
Core was generated by `/root/bin/perf record -e probe_bz5274:* -aR ./bz5274'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000003e38e0ff99 in ?? ()
(gdb) info registers
rax 0x0 0
rbx 0x0 0
rcx 0xffffffffffffffff -1
rdx 0xb 11
rsi 0x831 2097
rdi 0x831 2097
rbp 0x3e389b76a0 0x3e389b76a0
rsp 0x7fff705200f8 0x7fff705200f8
r8 0x7fff70520030 140735077810224
r9 0x7fff7051fe40 140735077809728
r10 0xa5 165
r11 0x202 514
r12 0x1 1
r13 0x3e389b8e80 267237691008
r14 0x8365d0 8611280
r15 0x0 0
rip 0x3e38e0ff99 0x3e38e0ff99
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) q
[root@dhcp-1-102 testsuite]#
[root@dhcp-1-102 testsuite]# gdb -q ./bz5274 core.2098
Reading symbols from ./bz5274...done.
[New LWP 2098]
Core was generated by `./bz5274'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000400724 in funca (s=0xffff95e800400876 <error: Cannot access
memory at address 0xffff95e800400876>, env=0xbfc68948ffffff30) at
systemtap.base/bz5274.c:34
34 }
(gdb) info registers
rax 0x4 4
rbx 0x0 0
rcx 0x7fffffe4 2147483620
rdx 0x400702 4196098
rsi 0x4 4
rdi 0x7fff359cc380 140734092854144
rbp 0x41ff894156415741 0x41ff894156415741
rsp 0x400768 0x400768 <__libc_csu_init+8>
r8 0x7fff359cc340 140734092854080
r9 0x7fff359cc360 140734092854112
r10 0x0 0
r11 0x246 582
r12 0x400520 4195616
r13 0x7fff359cc530 140734092854576
r14 0x0 0
r15 0x0 0
rip 0x400724 0x400724 <funca+62>
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
=======
So I assume only the second corefile is relevant.
The $rsp corruption seems to happen here too.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, causing SIGSEGV or SIGILL
2015-03-28 0:27 [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL fche at redhat dot com
2015-03-28 0:31 ` [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, " fche at redhat dot com
2015-03-30 14:17 ` mcermak at redhat dot com
@ 2015-04-09 21:26 ` mjw at redhat dot com
2015-05-22 20:20 ` fche at redhat dot com
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: mjw at redhat dot com @ 2015-04-09 21:26 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=18171
Mark Wielaard <mjw at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mjw at redhat dot com
--- Comment #2 from Mark Wielaard <mjw at redhat dot com> ---
Just a note that it seems the same is true for the golang runtime.
Although I haven't pinpointed it precisely it seems at least some code in the
garbage collector seems to get upset when we place return probe.
e.g.
$ cat hello.go
package main
import "fmt"
func main() {
fmt.Printf("Hello, world.\n")
}
$ go build hello.go
$ ./run-stap -e 'probe process.function("*").call {printf("%s->%s %s\n",
thread_indent (1), ppfunc (), $$parms)} probe process.function("*").return
{printf("%s<-%s %s\n", thread_indent (-1), ppfunc (), $$return)}' -c ./hello
[...]
fatal error: panic on m stack
runtime stack:
WARNING: Child process exited with status 2
[...]
Without the process.function("*").return probe things look fine (assuming you
have the fixes for PR17957, PR17958 and PR17959).
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, causing SIGSEGV or SIGILL
2015-03-28 0:27 [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL fche at redhat dot com
` (2 preceding siblings ...)
2015-04-09 21:26 ` mjw at redhat dot com
@ 2015-05-22 20:20 ` fche at redhat dot com
2017-10-11 10:44 ` mark at klomp dot org
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: fche at redhat dot com @ 2015-05-22 20:20 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=18171
--- Comment #3 from Frank Ch. Eigler <fche at redhat dot com> ---
Oleg's patches at http://www.spinics.net/lists/kernel/msg1981148.html look good
on x86-64.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, causing SIGSEGV or SIGILL
2015-03-28 0:27 [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL fche at redhat dot com
` (3 preceding siblings ...)
2015-05-22 20:20 ` fche at redhat dot com
@ 2017-10-11 10:44 ` mark at klomp dot org
2017-10-11 10:49 ` mjw at fedoraproject dot org
2018-10-18 18:39 ` fche at redhat dot com
6 siblings, 0 replies; 8+ messages in thread
From: mark at klomp dot org @ 2017-10-11 10:44 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=18171
Mark Wielaard <mark at klomp dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mark at klomp dot org
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, causing SIGSEGV or SIGILL
2015-03-28 0:27 [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL fche at redhat dot com
` (4 preceding siblings ...)
2017-10-11 10:44 ` mark at klomp dot org
@ 2017-10-11 10:49 ` mjw at fedoraproject dot org
2018-10-18 18:39 ` fche at redhat dot com
6 siblings, 0 replies; 8+ messages in thread
From: mjw at fedoraproject dot org @ 2017-10-11 10:49 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=18171
Mark Wielaard <mjw at fedoraproject dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC|mjw at fedoraproject dot org |
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, causing SIGSEGV or SIGILL
2015-03-28 0:27 [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL fche at redhat dot com
` (5 preceding siblings ...)
2017-10-11 10:49 ` mjw at fedoraproject dot org
@ 2018-10-18 18:39 ` fche at redhat dot com
6 siblings, 0 replies; 8+ messages in thread
From: fche at redhat dot com @ 2018-10-18 18:39 UTC (permalink / raw)
To: systemtap
https://sourceware.org/bugzilla/show_bug.cgi?id=18171
Frank Ch. Eigler <fche at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |WORKSFORME
--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> ---
Kernels much newer than these are working fine.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-10-18 18:39 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-28 0:27 [Bug uprobes/18171] New: setjmp/longjmp clashing with uprobes, causing SIGSEGV or SIGILL fche at redhat dot com
2015-03-28 0:31 ` [Bug uprobes/18171] setjmp/longjmp clashing with u[ret]probes, " fche at redhat dot com
2015-03-30 14:17 ` mcermak at redhat dot com
2015-04-09 21:26 ` mjw at redhat dot com
2015-05-22 20:20 ` fche at redhat dot com
2017-10-11 10:44 ` mark at klomp dot org
2017-10-11 10:49 ` mjw at fedoraproject dot org
2018-10-18 18:39 ` fche at redhat dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).