public inbox for systemtap@sourceware.org
* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
@ 2018-02-15  6:01 ` mysecondaccountabc at gmail dot com
  2018-02-15  6:09 ` mysecondaccountabc at gmail dot com
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: mysecondaccountabc at gmail dot com @ 2018-02-15  6:01 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #1 from Gustavo Moreira <mysecondaccountabc at gmail dot com> ---
The same happens using syscall.* and syscall.*.return.

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
  2018-02-15  6:01 ` [Bug runtime/22847] ARM OABI syscall tracing issues mysecondaccountabc at gmail dot com
@ 2018-02-15  6:09 ` mysecondaccountabc at gmail dot com
  2018-02-15 15:45 ` dsmith at redhat dot com
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: mysecondaccountabc at gmail dot com @ 2018-02-15  6:09 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #2 from Gustavo Moreira <mysecondaccountabc at gmail dot com> ---
Created attachment 10818
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10818&action=edit
Source to reproduce this issue

Cross-compile it using the ARMv4 uClibc toolchain (*):

cross-compiler-armv4l/bin/armv4l-gcc -static ex_socket.c -o ex_socket_OABI

(*)
https://www.uclibc.org/downloads/binaries/0.9.30.1/cross-compiler-armv4l.tar.bz2


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
  2018-02-15  6:01 ` [Bug runtime/22847] ARM OABI syscall tracing issues mysecondaccountabc at gmail dot com
  2018-02-15  6:09 ` mysecondaccountabc at gmail dot com
@ 2018-02-15 15:45 ` dsmith at redhat dot com
  2018-02-15 16:08 ` dsmith at redhat dot com
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: dsmith at redhat dot com @ 2018-02-15 15:45 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dsmith at redhat dot com

--- Comment #3 from David Smith <dsmith at redhat dot com> ---
(In reply to Gustavo Moreira from comment #0)
> The system is an ARMv7 (EABI) but the kernel is compiled with OABI
> compatibility so that it can execute both ABI binaries. Everything is
> working, both sort of binaries can be executed on it but SystemTap struggles
> to trace the OABI syscalls. 
> For instance, syscalls like "execve" and "exit" are traced but "connect" is
> completely ignored. Most likely many other syscalls are ignored as well.

Let's take the example of the "connect" syscall. The connect() system call is
complicated. On some systems, connect() is implemented via sys_connect(). On
other systems, it is implemented via sys_socketcall() (with $call set to
SYS_CONNECT). Systemtap tries to catch the syscall on first entry to the
kernel, and also tries to not give two probe hits on one actual syscall.

Doing this correctly is a bit complicated (see tapset/linux/sysc_connect.stp
for all the details). One technique we use is rejecting the call to
sys_connect() if the syscall number isn't __NR_connect. For your OABI binaries,
it might be possible that you've got a different __NR_connect for each ABI or
that systemtap isn't getting the syscall number correctly for that ABI.

You are going to have to debug this a bit to narrow it down. Some questions to
try to answer:

1) Is your connect syscall implemented via sys_connect() or through
sys_socketcall(), or perhaps through some arch-specific function? Run a test
binary, set a probe on both sys_connect() and sys_socketcall() and see what
gets hit. (If you need a test program, look in
testsuite/systemtap.syscall/connect.c.)

2) Are you getting the correct syscall number for both ABIs? Run your test
program (compiled once for each ABI) and see what _stp_syscall_nr() returns. Is
the number the same for both ABIs?

One more thing. The syscall testsuite should let you know exactly which
syscalls are not working. To run the tests, do the following:

# make installcheck RUNTESTFLAGS="systemtap.syscall/*.exp"


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2018-02-15 15:45 ` dsmith at redhat dot com
@ 2018-02-15 16:08 ` dsmith at redhat dot com
  2018-02-16  4:06 ` mysecondaccountabc at gmail dot com
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: dsmith at redhat dot com @ 2018-02-15 16:08 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #4 from David Smith <dsmith at redhat dot com> ---
(In reply to David Smith from comment #3)
> (In reply to Gustavo Moreira from comment #0)
> > The system is an ARMv7 (EABI) but the kernel is compiled with OABI
> > compatibility so that it can execute both ABI binaries. Everything is
> > working, both sort of binaries can be executed on it but SystemTap struggles
> > to trace the OABI syscalls. 
> > For instance, syscalls like "execve" and "exit" are traced but "connect" is
> > completely ignored. Most likely many other syscalls are ignored as well.
> 
> Let's take the example of the "connect" syscall. The connect() system call
> is complicated. On some systems, connect() is implemented via sys_connect().
> On other systems, it is implemented via sys_socketcall() (with $call set to
> SYS_CONNECT). Systemtap tries to catch the syscall on first entry to the
> kernel, and also tries to not give two probe hits on one actual syscall.
> 
> Doing this correctly is a bit complicated (see tapset/linux/sysc_connect.stp
> for all the details). One technique we use is rejecting the call to
> sys_connect() if the syscall number isn't __NR_connect. For your OABI
> binaries, it might be possible that you've got a different __NR_connect for
> each ABI or that systemtap isn't getting the syscall number correctly for
> that ABI.
> 
> You are going to have to debug this a bit to narrow it down. Some questions
> to try to answer:
> 
> 1) Is your connect syscall implemented via sys_connect() or through
> sys_socketcall(), or perhaps through some arch-specific function? Run a test
> binary, set a probe on both sys_connect() and sys_socketcall() and see what
> gets hit. (If you need a test program, look in
> testsuite/systemtap.syscall/connect.c.)

To be clear, try the following:

# stap -ve 'probe kernel.function("sys_connect").call,
kernel.function("sys_socketcall").call { printf("%s\n", ppfunc()) }' -c
test_program

> 2) Are you getting the correct syscall number for both ABIs? Run your test
> program (compiled once for each ABI) and see what _stp_syscall_nr() returns.
> Is the number the same for both ABIs?

To be clear, try the following:

# stap -ve 'probe kernel.function("sys_connect").call,
kernel.function("sys_socketcall").call { printf("%s - %d\n", ppfunc(),
_stp_syscall_nr()) }' -c test_program

> One more thing. The syscall testsuite should let you know exactly which
> syscalls are not working. To run the tests, do the following:
> 
> # make installcheck RUNTESTFLAGS="systemtap.syscall/*.exp"

I just realized the above really isn't going to help, since the testsuite won't
know that you've got that "extra" ABI and so won't compile for it. That
wouldn't be hard to change, but we'll have to come up with some way for the
testsuite to check for the ABI support.


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2018-02-15 16:08 ` dsmith at redhat dot com
@ 2018-02-16  4:06 ` mysecondaccountabc at gmail dot com
  2018-02-19  6:55 ` mysecondaccountabc at gmail dot com
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: mysecondaccountabc at gmail dot com @ 2018-02-16  4:06 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #5 from Gustavo Moreira <mysecondaccountabc at gmail dot com> ---
(In reply to David Smith from comment #4)
> > 1) Is your connect syscall implemented via sys_connect() or through
> > sys_socketcall(), or perhaps through some arch-specific function? Run a test
> > binary, set a probe on both sys_connect() and sys_socketcall() and see what
> > gets hit. (If you need a test program, look in
> > testsuite/systemtap.syscall/connect.c.)
> 
> To be clear here, that try the following:
> 
> # stap -ve 'probe kernel.function("sys_connect").call,
> kernel.function("sys_socketcall").call { printf("%s\n", ppfunc()) }' -c
> test_program
> 

It's implemented using sys_connect:
...
Pass 5: starting run.
SyS_connect
Connected
Pass 5: run completed in 320usr/890sys/2179real ms

However, for some reason, the syscall probe aliases syscall.*/nd_syscall.*
don't capture that.


> > 2) Are you getting the correct syscall number for both ABIs? Run your test
> > program (compiled once for each ABI) and see what _stp_syscall_nr() returns.
> > Is the number the same for both ABIs?
> 
> To be clear here, that try the following:
> 
> # stap -ve 'probe kernel.function("sys_connect").call,
> kernel.function("sys_socketcall").call { printf("%s - %d\n", ppfunc(),
> _stp_syscall_nr()) }' -c test_program
>  
The above returns:
SyS_connect - 32916

However, that is not correct, because _stp_syscall_nr() is apparently written
for EABI, where the syscall number is passed in R7.

systemtap/runtime/syscall.h:
...
#if defined(__arm__)
...
static inline long _stp_syscall_get_nr(struct task_struct *task, struct pt_regs
*regs)
{
        return regs->ARM_r7;
}

In OABI the syscall convention is svc 0x900000 + SYSCALL_NR.
For instance, for sys_exit() syscall:

EABI:
    mov r7, #0x01 ; sys_exit 
    svc #0x00 

OABI:
    svc #0x900001 ; sys_exit 

man syscall(2):
   arch/ABI   instruction          syscall #   retval Notes
   ──────────────────────────────────────────────────────────
   arm/OABI   swi NR               -           a1     NR is syscall #
   arm/EABI   swi 0x0              r7          r0

In the attached example:
$ objdump -d test_program  | grep -A2 "libc_connect>:"
00008830 <__libc_connect>:
    8830:       e92d4010        push    {r4, lr}
    8834:       ef90011b        svc     0x0090011b

Here 0x11b (283) is sys_connect.

In the kernel source
(https://elixir.bootlin.com/linux/v4.9.75/source/arch/arm/include/uapi/asm/unistd.h):
#define __NR_OABI_SYSCALL_BASE  0x900000
...
#if defined(__thumb__) || defined(__ARM_EABI__)
#define __NR_SYSCALL_BASE       0
#else
#define __NR_SYSCALL_BASE       __NR_OABI_SYSCALL_BASE
#endif
...
#define __NR_exit                       (__NR_SYSCALL_BASE+  1)
...
#define __NR_connect                    (__NR_SYSCALL_BASE+283)
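
The encoding described above can be checked with a small userspace sketch. It
is only an illustration: the helper names (svc_immediate, oabi_syscall_nr) and
the OABI_SYSCALL_BASE macro are made up for this example and are not part of
SystemTap or the kernel. It assumes an ARM-mode svc instruction, where the
immediate sits in the low 24 bits of the instruction word.

```c
#include <stdint.h>

/* Same value as __NR_OABI_SYSCALL_BASE in the unistd.h excerpt above. */
#define OABI_SYSCALL_BASE 0x900000u

/* An ARM-mode svc/swi instruction carries its immediate in the low 24 bits.
 * (Hypothetical helper, for illustration only.) */
static uint32_t svc_immediate(uint32_t insn)
{
    return insn & 0x00ffffffu;
}

/* Return the syscall number encoded in an OABI svc instruction, or -1 when
 * the immediate is not in OABI form (for example the EABI "svc 0"). */
static long oabi_syscall_nr(uint32_t insn)
{
    uint32_t imm = svc_immediate(insn);

    if (imm < OABI_SYSCALL_BASE)
        return -1;
    return (long)(imm - OABI_SYSCALL_BASE);
}
```

Fed the instruction word from the objdump output (0xef90011b), the sketch
yields 283, matching __NR_connect without the OABI base. The number never
passes through R7, which is why reading R7 gives an unrelated value.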


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2018-02-16  4:06 ` mysecondaccountabc at gmail dot com
@ 2018-02-19  6:55 ` mysecondaccountabc at gmail dot com
  2018-02-19  7:02 ` mysecondaccountabc at gmail dot com
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: mysecondaccountabc at gmail dot com @ 2018-02-19  6:55 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #6 from Gustavo Moreira <mysecondaccountabc at gmail dot com> ---
I've found the all_syscalls.stp script in SystemTap's source tree, which
shows the sys_socket and sys_connect syscalls correctly.

~/systemtap/testsuite/systemtap.syscall# ./all_syscalls.stp -c ~/ex_socket_OABI 
kernel.function("sys_sigreturn_wrapper")?
  kernel.function("sys_sigreturn@arch/arm/kernel/signal.c:189")?
kernel.function("SyS_rt_sigaction@kernel/signal.c:3323")?
kernel.function("SyS_rt_sigprocmask@kernel/signal.c:2543")?
kernel.function("SyS_setitimer@kernel/time/itimer.c:278")?
kernel.function("SyS_execve@fs/exec.c:1906")?
kernel.function("SyS_ioctl@fs/ioctl.c:685")?
kernel.function("SyS_ioctl@fs/ioctl.c:685")?
kernel.function("SyS_socket@net/socket.c:1218")?
kernel.function("sys_oabi_connect@arch/arm/kernel/sys_oabi-compat.c:387")?
  kernel.function("SyS_connect@net/socket.c:1529")?
Connected
kernel.function("SyS_write@fs/read_write.c:599")?
kernel.function("SyS_exit@kernel/exit.c:899")?

However, none of the probe aliases are triggered when, for instance,
sys_connect is called:

# stap -ve 'probe nd_syscall.*  { printf("%s %s\n", ppfunc(), name) }' -c
./ex_socket_OABI 2>&1 | grep -i connect
Connected

# stap -ve 'probe syscall.*  { printf("%s %s\n", ppfunc(), name) }' -c
./ex_socket_OABI 2>&1 | grep -i connect
Connected

I wonder why the following probe in tapset/linux/sysc_connect.stp doesn't do
the job:
probe __nd_syscall.connect = kprobe.function("sys_connect") ?
{
        @__syscall_gate(@const("__NR_connect"))
        asmlinkage()
        sockfd = int_arg(1)
        serv_addr_uaddr = pointer_arg(2)
        addrlen = uint_arg(3)
}
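
One possible explanation, sketched in plain C: comment #3 says the tapset
rejects the call to sys_connect() when the syscall number isn't __NR_connect.
Assuming @__syscall_gate boils down to that comparison (its real expansion is
not shown in this thread), the value reported in comment #5 would never pass.
The function and macro names below are hypothetical.

```c
#include <stdbool.h>

/* Values taken from this thread. */
#define NR_CONNECT_EABI  283L    /* __NR_connect with __NR_SYSCALL_BASE == 0 */
#define OBSERVED_NR_OABI 32916L  /* printed by _stp_syscall_nr() in comment #5 */

/* Hypothetical stand-in for the gate: the probe body only runs when the
 * observed syscall number equals the expected one. */
static bool syscall_gate_passes(long observed_nr, long expected_nr)
{
    return observed_nr == expected_nr;
}
```

With these numbers, syscall_gate_passes(OBSERVED_NR_OABI, NR_CONNECT_EABI) is
false, so the alias body would be skipped for the OABI binary even though the
kprobe on sys_connect itself fires.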


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2018-02-19  6:55 ` mysecondaccountabc at gmail dot com
@ 2018-02-19  7:02 ` mysecondaccountabc at gmail dot com
  2018-02-19 14:31 ` dsmith at redhat dot com
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: mysecondaccountabc at gmail dot com @ 2018-02-19  7:02 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #7 from Gustavo Moreira <mysecondaccountabc at gmail dot com> ---
Created attachment 10834
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10834&action=edit
Full stap output comment #6

Full stap output for the syscall.* probe alias used in comment #6, without
grepping the "connect" string.


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2018-02-19  7:02 ` mysecondaccountabc at gmail dot com
@ 2018-02-19 14:31 ` dsmith at redhat dot com
  2018-02-19 22:42 ` mysecondaccountabc at gmail dot com
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: dsmith at redhat dot com @ 2018-02-19 14:31 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #8 from David Smith <dsmith at redhat dot com> ---
(In reply to Gustavo Moreira from comment #5)
> (In reply to David Smith from comment #4)
> > > 1) Is your connect syscall implemented via sys_connect() or through
> > > sys_socketcall(), or perhaps through some arch-specific function? Run a test
> > > binary, set a probe on both sys_connect() and sys_socketcall() and see what
> > > gets hit. (If you need a test program, look in
> > > testsuite/systemtap.syscall/connect.c.)
> > 
> > To be clear here, that try the following:
> > 
> > # stap -ve 'probe kernel.function("sys_connect").call,
> > kernel.function("sys_socketcall").call { printf("%s\n", ppfunc()) }' -c
> > test_program
> > 
> 
> It's implemented using sys_connect:
> ...
> Pass 5: starting run.
> SyS_connect
> Connected
> Pass 5: run completed in 320usr/890sys/2179real ms
> 
> However, for some reason, the syscall probe alias syscall.*/nd_syscall.*
> don't capture that.
> 
> 
> > > 2) Are you getting the correct syscall number for both ABIs? Run your test
> > > program (compiled once for each ABI) and see what _stp_syscall_nr() returns.
> > > Is the number the same for both ABIs?
> > 
> > To be clear here, that try the following:
> > 
> > # stap -ve 'probe kernel.function("sys_connect").call,
> > kernel.function("sys_socketcall").call { printf("%s - %d\n", ppfunc(),
> > _stp_syscall_nr()) }' -c test_program
> >  
> The above returns:
> SyS_connect - 32916
> 
> However, that is not correct because apparently _stp_syscall_nr() is made
> for EABI where the syscall number is passed using R7.
> 
> systemtap/runtime/syscall.h:
> ...
> #if defined(__arm__)
> ...
> static inline long _stp_syscall_get_nr(struct task_struct *task, struct
> pt_regs *regs)
> {
>         return regs->ARM_r7;
> }
> 
> In OABI the syscall convention is svc 0x900000 + SYSCALL_NR.
> For instance, for sys_exit() syscall:
> 
> EABI:
>     mov r7, #0x01 ; sys_exit 
>     svc #0x00 
> 
> OABI:
>     svc #0x900001 ; sys_exit 
> 
> man syscall(2):
>    arch/ABI   instruction          syscall #   retval Notes
>    ──────────────────────────────────────────────────────────
>    arm/OABI   swi NR               -           a1     NR is syscall #
>    arm/EABI   swi 0x0              r7          r0
> 
> In the attached example:
> $ objdump -d test_program  | grep -A2 "libc_connect>:"
> 00008830 <__libc_connect>:
>     8830:	e92d4010 	push	{r4, lr}
>     8834:	ef90011b 	svc	0x0090011b
> 
> Where 0x11b (283) is sys_connect .
> 
> In the kernel source
> (https://elixir.bootlin.com/linux/v4.9.75/source/arch/arm/include/uapi/asm/
> unistd.h):
> #define __NR_OABI_SYSCALL_BASE	0x900000
> ...
> #if defined(__thumb__) || defined(__ARM_EABI__)
> #define __NR_SYSCALL_BASE	0
> #else
> #define __NR_SYSCALL_BASE	__NR_OABI_SYSCALL_BASE
> #endif
> ...
> #define __NR_exit			(__NR_SYSCALL_BASE+  1)
> ...
> #define __NR_connect			(__NR_SYSCALL_BASE+283)

OK, let's start small here and try to fix _stp_syscall_get_nr() for OABI. Try
the following patch (which tries to use the kernel's syscall_get_nr()):

====
diff --git a/runtime/syscall.h b/runtime/syscall.h
index 5ed019869..1f5552d78 100644
--- a/runtime/syscall.h
+++ b/runtime/syscall.h
@@ -166,11 +166,15 @@
  * returns 0 (since it was designed to be used with ftrace syscall
  * tracing, not called from any context). So, let's use our function
  * instead. */
+#if defined(__thumb__) || defined(__ARM_EABI__)
 static inline long
 _stp_syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
 {
        return regs->ARM_r7;
 }
+#else
+#define _stp_syscall_get_nr syscall_get_nr
+#endif

 #elif defined(__mips__)
 /* Define our own function as syscall_get_nr always returns 0 unless
====

With that patch added, does the following return the correct value?

# stap -ve 'kernel.function("sys_socketcall").call { printf("%s - %d\n",
ppfunc(), _stp_syscall_nr()) }' -c test_program


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2018-02-19 14:31 ` dsmith at redhat dot com
@ 2018-02-19 22:42 ` mysecondaccountabc at gmail dot com
  2018-02-19 22:53 ` dsmith at redhat dot com
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: mysecondaccountabc at gmail dot com @ 2018-02-19 22:42 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #9 from Gustavo Moreira <mysecondaccountabc at gmail dot com> ---
(In reply to David Smith from comment #8)
> (In reply to Gustavo Moreira from comment #5)
> > (In reply to David Smith from comment #4)
> > > > 1) Is your connect syscall implemented via sys_connect() or through
> > > > sys_socketcall(), or perhaps through some arch-specific function? Run a test
> > > > binary, set a probe on both sys_connect() and sys_socketcall() and see what
> > > > gets hit. (If you need a test program, look in
> > > > testsuite/systemtap.syscall/connect.c.)
> > > 
> > > To be clear here, that try the following:
> > > 
> > > # stap -ve 'probe kernel.function("sys_connect").call,
> > > kernel.function("sys_socketcall").call { printf("%s\n", ppfunc()) }' -c
> > > test_program
> > > 
> > 
> > It's implemented using sys_connect:
> > ...
> > Pass 5: starting run.
> > SyS_connect
> > Connected
> > Pass 5: run completed in 320usr/890sys/2179real ms
> > 
> > However, for some reason, the syscall probe alias syscall.*/nd_syscall.*
> > don't capture that.
> > 
> > 
> > > > 2) Are you getting the correct syscall number for both ABIs? Run your test
> > > > program (compiled once for each ABI) and see what _stp_syscall_nr() returns.
> > > > Is the number the same for both ABIs?
> > > 
> > > To be clear here, that try the following:
> > > 
> > > # stap -ve 'probe kernel.function("sys_connect").call,
> > > kernel.function("sys_socketcall").call { printf("%s - %d\n", ppfunc(),
> > > _stp_syscall_nr()) }' -c test_program
> > >  
> > The above returns:
> > SyS_connect - 32916
> > 
> > However, that is not correct because apparently _stp_syscall_nr() is made
> > for EABI where the syscall number is passed using R7.
> > 
> > systemtap/runtime/syscall.h:
> > ...
> > #if defined(__arm__)
> > ...
> > static inline long _stp_syscall_get_nr(struct task_struct *task, struct
> > pt_regs *regs)
> > {
> >         return regs->ARM_r7;
> > }
> > 
> > In OABI the syscall convention is svc 0x900000 + SYSCALL_NR.
> > For instance, for sys_exit() syscall:
> > 
> > EABI:
> >     mov r7, #0x01 ; sys_exit 
> >     svc #0x00 
> > 
> > OABI:
> >     svc #0x900001 ; sys_exit 
> > 
> > man syscall(2):
> >    arch/ABI   instruction          syscall #   retval Notes
> >    ──────────────────────────────────────────────────────────
> >    arm/OABI   swi NR               -           a1     NR is syscall #
> >    arm/EABI   swi 0x0              r7          r0
> > 
> > In the attached example:
> > $ objdump -d test_program  | grep -A2 "libc_connect>:"
> > 00008830 <__libc_connect>:
> >     8830:	e92d4010 	push	{r4, lr}
> >     8834:	ef90011b 	svc	0x0090011b
> > 
> > Where 0x11b (283) is sys_connect .
> > 
> > In the kernel source
> > (https://elixir.bootlin.com/linux/v4.9.75/source/arch/arm/include/uapi/asm/
> > unistd.h):
> > #define __NR_OABI_SYSCALL_BASE	0x900000
> > ...
> > #if defined(__thumb__) || defined(__ARM_EABI__)
> > #define __NR_SYSCALL_BASE	0
> > #else
> > #define __NR_SYSCALL_BASE	__NR_OABI_SYSCALL_BASE
> > #endif
> > ...
> > #define __NR_exit			(__NR_SYSCALL_BASE+  1)
> > ...
> > #define __NR_connect			(__NR_SYSCALL_BASE+283)
> 
> OK, let's start small here and try to fix _stp_syscall_get_nr() for OABI.
> Try the following patch (which tries to use the kernel's syscall_get_nr()):
> 
> ====
> diff --git a/runtime/syscall.h b/runtime/syscall.h
> index 5ed019869..1f5552d78 100644
> --- a/runtime/syscall.h
> +++ b/runtime/syscall.h
> @@ -166,11 +166,15 @@
>   * returns 0 (since it was designed to be used with ftrace syscall
>   * tracing, not called from any context). So, let's use our function
>   * instead. */
> +#if defined(__thumb__) || defined(__ARM_EABI__)
>  static inline long
>  _stp_syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
>  {
>  	return regs->ARM_r7;
>  }
> +#else
> +#define _stp_syscall_get_nr syscall_get_nr
> +#endif
>  
>  #elif defined(__mips__)
>  /* Define our own function as syscall_get_nr always returns 0 unless
> ====
> 
> With that patch added, does the following return the correct value?
> 
> # stap -ve 'kernel.function("sys_socketcall").call { printf("%s - %d\n",
> ppfunc(), _stp_syscall_nr()) }' -c test_program

I've added "probe" at the beginning and changed the syscall to "sys_connect"
because it doesn't use sys_socketcall. 

# stap -ve 'probe kernel.function("sys_connect").call { printf("%s - %d\n",
ppfunc(), _stp_syscall_nr()) }' -c ./ex_socket_OABI 
Pass 1: parsed user script and 452 library scripts using
40896virt/33624res/4948shr/28920data kb, in 4780usr/1090sys/5869real ms.
Pass 2: analyzed script: 1 probe, 2 functions, 97 embeds, 0 globals using
77520virt/70912res/5528shr/65544data kb, in 13480usr/11970sys/25472real ms.
Pass 3: translated to C into
"/tmp/stapSl5Rx9/stap_195c43dcde9908a38abbe97ece0f593b_53976_src.c" using
77520virt/71040res/5656shr/65544data kb, in 1670usr/10930sys/12604real ms.
Pass 4: compiled C into "stap_195c43dcde9908a38abbe97ece0f593b_53976.ko" in
58570usr/20570sys/73572real ms.
Pass 5: starting run.
SyS_connect - 32916
Connected
Pass 5: run completed in 370usr/1000sys/2316real ms.

However, the result seems to be the same. I've patched the file in the
installation directory (/usr/share/systemtap/runtime/syscall.h). I don't think
SystemTap needs to be completely recompiled again, right? The change should be
included when it compiles the LKM in the above stap execution.


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2018-02-19 22:42 ` mysecondaccountabc at gmail dot com
@ 2018-02-19 22:53 ` dsmith at redhat dot com
  2018-02-20  1:05 ` mysecondaccountabc at gmail dot com
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: dsmith at redhat dot com @ 2018-02-19 22:53 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #10 from David Smith <dsmith at redhat dot com> ---
(In reply to Gustavo Moreira from comment #9)

> > With that patch added, does the following return the correct value?
> > 
> > # stap -ve 'kernel.function("sys_socketcall").call { printf("%s - %d\n",
> > ppfunc(), _stp_syscall_nr()) }' -c test_program
> 
> I've added "probe" at the beginning and changed the syscall to "sys_connect"
> because it doesn't use sys_socketcall. 

Sorry, I misremembered how your kernel worked.

> # stap -ve 'probe kernel.function("sys_connect").call { printf("%s - %d\n",
> ppfunc(), _stp_syscall_nr()) }' -c ./ex_socket_OABI 
> Pass 1: parsed user script and 452 library scripts using
> 40896virt/33624res/4948shr/28920data kb, in 4780usr/1090sys/5869real ms.
> Pass 2: analyzed script: 1 probe, 2 functions, 97 embeds, 0 globals using
> 77520virt/70912res/5528shr/65544data kb, in 13480usr/11970sys/25472real ms.
> Pass 3: translated to C into
> "/tmp/stapSl5Rx9/stap_195c43dcde9908a38abbe97ece0f593b_53976_src.c" using
> 77520virt/71040res/5656shr/65544data kb, in 1670usr/10930sys/12604real ms.
> Pass 4: compiled C into "stap_195c43dcde9908a38abbe97ece0f593b_53976.ko" in
> 58570usr/20570sys/73572real ms.
> Pass 5: starting run.
> SyS_connect - 32916
> Connected
> Pass 5: run completed in 370usr/1000sys/2316real ms.
> 
> However, the result seems to be the same. I've patched the file in the
> installation directory (/usr/share/systemtap/runtime/syscall.h). I don't
> think SystemTap needs to be completely recompiled again, right? The change
> should be included when it compiles the LKM in the above stap execution.

Right. Hmm.

I wonder if we've got to handle both ABIs at once (more like a 32-bit ia32
executable on a x86_64 kernel). Is CONFIG_OABI_COMPAT defined in your config
file?
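
A likely reason the first patch changed nothing, sketched under the assumption
that the stap module is built with the same EABI compiler as the kernel: the
__thumb__/__ARM_EABI__ test is evaluated once, when the module is compiled,
and says nothing about the ABI of the process being traced. The macro
MODULE_BUILT_AS_EABI and struct fake_regs below are stand-ins invented for
this example.

```c
/* Hypothetical stand-in for the compiler-defined __ARM_EABI__ / __thumb__
 * macros: fixed at module build time, not per traced process. */
#define MODULE_BUILT_AS_EABI 1

/* Illustrative register block; not the kernel's struct pt_regs. */
struct fake_regs {
    long r7;
};

/* Compile-time selection, as in the first patch: only one branch exists in
 * the compiled module, whatever ABI the traced binary uses. */
static long get_nr_compile_time(const struct fake_regs *regs, long kernel_nr)
{
#if MODULE_BUILT_AS_EABI
    (void)kernel_nr;            /* the #else branch is never compiled in */
    return regs->r7;
#else
    return kernel_nr;
#endif
}
```

In this model an OABI process with an unrelated value in R7 still gets that
value back, which would match the unchanged "SyS_connect - 32916" output.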


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2018-02-19 22:53 ` dsmith at redhat dot com
@ 2018-02-20  1:05 ` mysecondaccountabc at gmail dot com
  2018-02-20 16:03 ` dsmith at redhat dot com
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: mysecondaccountabc at gmail dot com @ 2018-02-20  1:05 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #11 from Gustavo Moreira <mysecondaccountabc at gmail dot com> ---
(In reply to David Smith from comment #10)
> (In reply to Gustavo Moreira from comment #9)
> 
> > > With that patch added, does the following return the correct value?
> > > 
> > > # stap -ve 'kernel.function("sys_socketcall").call { printf("%s - %d\n",
> > > ppfunc(), _stp_syscall_nr()) }' -c test_program
> > 
> > I've added "probe" at the beginning and changed the syscall to "sys_connect"
> > because it doesn't use sys_socketcall. 
> 
> Sorry, I misremembered how your kernel worked.
> 
> > # stap -ve 'probe kernel.function("sys_connect").call { printf("%s - %d\n",
> > ppfunc(), _stp_syscall_nr()) }' -c ./ex_socket_OABI 
> > Pass 1: parsed user script and 452 library scripts using
> > 40896virt/33624res/4948shr/28920data kb, in 4780usr/1090sys/5869real ms.
> > Pass 2: analyzed script: 1 probe, 2 functions, 97 embeds, 0 globals using
> > 77520virt/70912res/5528shr/65544data kb, in 13480usr/11970sys/25472real ms.
> > Pass 3: translated to C into
> > "/tmp/stapSl5Rx9/stap_195c43dcde9908a38abbe97ece0f593b_53976_src.c" using
> > 77520virt/71040res/5656shr/65544data kb, in 1670usr/10930sys/12604real ms.
> > Pass 4: compiled C into "stap_195c43dcde9908a38abbe97ece0f593b_53976.ko" in
> > 58570usr/20570sys/73572real ms.
> > Pass 5: starting run.
> > SyS_connect - 32916
> > Connected
> > Pass 5: run completed in 370usr/1000sys/2316real ms.
> > 
> > However, the result seems to be the same. I've patched the file in the
> > installation directory (/usr/share/systemtap/runtime/syscall.h). I don't
> > think SystemTap needs to be completely recompiled again, right? The change
> > should be included when it compiles the LKM in the above stap execution.
> 
> Right. Hmm.
> 
> I wonder if we've got to handle both ABIs at once (more like a 32-bit ia32
> executable on a x86_64 kernel). Is CONFIG_OABI_COMPAT defined in your config
> file?

Exactly, CONFIG_OABI_COMPAT=y. I think so, because when that is enabled the
kernel is able to execute both sorts of ABI binaries.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2018-02-20  1:05 ` mysecondaccountabc at gmail dot com
@ 2018-02-20 16:03 ` dsmith at redhat dot com
  2018-02-22  0:04 ` gmoreira at gmail dot com
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: dsmith at redhat dot com @ 2018-02-20 16:03 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #12 from David Smith <dsmith at redhat dot com> ---
(In reply to Gustavo Moreira from comment #11)
> (In reply to David Smith from comment #10)
> > I wonder if we've got to handle both ABIs at once (more like a 32-bit ia32
> > executable on a x86_64 kernel). Is CONFIG_OABI_COMPAT defined in your config
> > file?
> 
> exactly, CONFIG_OABI_COMPAT=y. I think so, because when that is enabled the
> kernel is able to execute both sort of ABI binaries.

OK, that makes more sense - I should have realized that earlier. Try the
following patch (which tries to use the kernel's syscall_get_nr()) with *both*
ABIs and see what syscall numbers you get:

====
diff --git a/runtime/syscall.h b/runtime/syscall.h
index 5ed019869..2b551f16f 100644
--- a/runtime/syscall.h
+++ b/runtime/syscall.h
@@ -169,7 +169,11 @@
 static inline long
 _stp_syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
 {
+#ifdef CONFIG_OABI_COMPAT
+       return syscall_get_nr(task, regs);
+#else
        return regs->ARM_r7;
+#endif
 }

 #elif defined(__mips__)
====


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2018-02-20 16:03 ` dsmith at redhat dot com
@ 2018-02-22  0:04 ` gmoreira at gmail dot com
  2018-02-22 16:58 ` dsmith at redhat dot com
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: gmoreira at gmail dot com @ 2018-02-22  0:04 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #13 from Gustavo Moreira <gmoreira at gmail dot com> ---
(In reply to David Smith from comment #12)
> (In reply to Gustavo Moreira from comment #11)
> > (In reply to David Smith from comment #10)
> > > I wonder if we've got to handle both ABIs at once (more like a 32-bit ia32
> > > executable on a x86_64 kernel). Is CONFIG_OABI_COMPAT defined in your config
> > > file?
> > 
> > exactly, CONFIG_OABI_COMPAT=y. I think so, because when that is enabled the
> > kernel is able to execute both sort of ABI binaries.
> 
> OK, that makes more sense - I should have realized that earlier. Try the
> following patch (which tries to use the kernel's syscall_get_nr()) with
> *both* ABIs and see what syscall numbers you get:
> 
> ====
> diff --git a/runtime/syscall.h b/runtime/syscall.h
> index 5ed019869..2b551f16f 100644
> --- a/runtime/syscall.h
> +++ b/runtime/syscall.h
> @@ -169,7 +169,11 @@
>  static inline long
>  _stp_syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
>  {
> +#ifdef CONFIG_OABI_COMPAT
> +	return syscall_get_nr(task, regs);
> +#else
>  	return regs->ARM_r7;
> +#endif
>  }
>  
>  #elif defined(__mips__)
> ====

I've just realised that the ARM machine also had the systemtap-runtime and
systemtap-common 3.1-2 deb packages installed in /usr/share/systemtap, in
addition to the /usr/local/share/systemtap tree installed from the
systemtap-3.2 sources. Sorry about that.
Anyway, I'm back on the right track.

Without that patch, the results are
EABI:
 SyS_connect - 283
OABI:
 SyS_connect - 32916

With that patch:
EABI:
 SyS_connect - 0
OABI:
 SyS_connect - 0

So, that patch isn't working: with it applied, syscall_get_nr() is always
executed, and it seems to always return 0, as the comment just above that code
explains:

/* The syscall_get_nr() function on 3.17.1-302.fc21.armv7hl always
 * returns 0 (since it was designed to be used with ftrace syscall
 * tracing, not called from any context). So, let's use our function
 * instead. */

So, the first issue is that we are not correctly identifying the
appropriate constant to detect OABI compatibility mode.

I've tried all of the following conditions, but the guarded code is executed
for both the EABI and the OABI binary:
#if defined(__thumb__) || defined(__ARM_EABI__)
#if defined(CONFIG_OABI_COMPAT)
#if !defined(CONFIG_AEABI) || defined(CONFIG_OABI_COMPAT)
#if defined(__ARM_EABI__)

There should be a more dynamic way to detect that, I mean, instead of using
preprocessor directives. I'm trying to see how the kernel differentiates those
modes. Any thoughts?

On the other hand, once we can detect that, we need to extract the syscall
number from the instruction itself. I have some ideas and am testing whether
they would work.

Anyway, once this is done, what else do you think would be needed? Is that the
reason why the probe aliases are not catching, for instance, the sys_connect
trace point?


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (12 preceding siblings ...)
  2018-02-22  0:04 ` gmoreira at gmail dot com
@ 2018-02-22 16:58 ` dsmith at redhat dot com
  2018-04-18  6:50 ` gmoreira at gmail dot com
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: dsmith at redhat dot com @ 2018-02-22 16:58 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #14 from David Smith <dsmith at redhat dot com> ---
(In reply to Gustavo Moreira from comment #13)
> (In reply to David Smith from comment #12)
> > (In reply to Gustavo Moreira from comment #11)
> > > (In reply to David Smith from comment #10)
> > > > I wonder if we've got to handle both ABIs at once (more like a 32-bit ia32
> > > > executable on a x86_64 kernel). Is CONFIG_OABI_COMPAT defined in your config
> > > > file?
> > > 
> > > exactly, CONFIG_OABI_COMPAT=y. I think so, because when that is enabled the
> > > kernel is able to execute both sort of ABI binaries.
> > 
> > OK, that makes more sense - I should have realized that earlier. Try the
> > following patch (which tries to use the kernel's syscall_get_nr()) with
> > *both* ABIs and see what syscall numbers you get:
> > 
> > ====
> > diff --git a/runtime/syscall.h b/runtime/syscall.h
> > index 5ed019869..2b551f16f 100644
> > --- a/runtime/syscall.h
> > +++ b/runtime/syscall.h
> > @@ -169,7 +169,11 @@
> >  static inline long
> >  _stp_syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
> >  {
> > +#ifdef CONFIG_OABI_COMPAT
> > +	return syscall_get_nr(task, regs);
> > +#else
> >  	return regs->ARM_r7;
> > +#endif
> >  }
> >  
> >  #elif defined(__mips__)
> > ====
> 
> I've just realised that the ARM machine also had systemtap-runtime and
> systemtap-common 3.1-2 deb packages installed in /usr/share/systemtap, apart
> from the /usr/local/share/systemtap installed from the systemtap-3.2
> sources, sorry by that.
> Anyway, I'm back on the right track.
> 
> Without that patch, the results are
> EABI:
>  SyS_connect - 283
> OABI:
>  SyS_connect - 32916
> 
> With that patch:
> EABI:
>  SyS_connect - 0
> OABI:
>  SyS_connect - 0
> 
> So, that patch isn't working, it's always executing syscall_get_nr() which
> seems to also return 0 always as the comment just above that code explains:
> 
> /* The syscall_get_nr() function on 3.17.1-302.fc21.armv7hl always
>  * returns 0 (since it was designed to be used with ftrace syscall
>  * tracing, not called from any context). So, let's use our function
>  * instead. */

Right. I had hoped that was just a 3.17 limitation and that things had improved
since then. Unfortunately, I was wrong. For arm, the kernel seems to only save
the syscall number when it knows the executable is being ptrace'd.

> So, the first issue we have is that we are not correctly identifying the
> appropriate constant to detect when it's in OABI compatibility mode.
> 
> I've tried all the following combinations but all of them are being executed
> in both cases:
> #if defined(__thumb__) || defined(__ARM_EABI__)
> #if defined(CONFIG_OABI_COMPAT)
> #if !defined(CONFIG_AEABI) || defined(CONFIG_OABI_COMPAT)
> #if defined(__ARM_EABI__)
> 
> It should be a more dynamic way to detect that, I mean instead of using
> preprocessor directives. Trying to see how the kernel differentiates those
> modes, any thought?

You've got 2 separate, but related, problems.

1) How do we determine if we've got an EABI or an OABI executable? The kernel
knows that if the argument to 'swi' is 0, we've got an EABI executable (see
arch/arm/kernel/entry-common.S). Otherwise, we've got an OABI executable.

2) Once we know which ABI the executable is using, how do we get the syscall
number? Once again the kernel knows this based on the argument to 'swi'.

Unfortunately, systemtap can't get the swi argument (at least not in any way I
can think of).
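
For illustration only, the two steps above could be sketched in C as follows. This assumes the 32-bit ARM-mode instruction word of the 'swi' is somehow available, which, as noted, it is not from a SystemTap probe. The names decode_swi, struct decoded_syscall and OABI_SYSCALL_BASE are made up for this sketch; 0x900000 is the OABI base mentioned elsewhere in this bug, and the number is recovered by plain subtraction, which may not be exactly how entry-common.S does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the OABI syscall base (0x900000), as discussed in
 * this bug. The macro name is local to this sketch. */
#define OABI_SYSCALL_BASE 0x900000u

/* Hypothetical result type: which ABI made the call, and the
 * ABI-independent syscall number. */
struct decoded_syscall {
	bool is_oabi;
	uint32_t nr;
};

/* insn: the 32-bit ARM-mode swi/svc instruction word.
 * r7:   the value of register r7 at syscall entry. */
static struct decoded_syscall decode_swi(uint32_t insn, uint32_t r7)
{
	struct decoded_syscall d;
	uint32_t imm24;

	/* Precondition: bits 27..24 are 1111 for a swi/svc instruction. */
	assert((insn & 0x0f000000u) == 0x0f000000u);

	/* Drop the condition and opcode bits, keep the 24-bit immediate. */
	imm24 = insn & 0x00ffffffu;

	if (imm24 == 0) {
		/* EABI: immediate is 0, syscall number is passed in r7. */
		d.is_oabi = false;
		d.nr = r7;
	} else {
		/* OABI: immediate is base + syscall number. */
		d.is_oabi = true;
		d.nr = imm24 - OABI_SYSCALL_BASE;
	}
	return d;
}
```

With these assumptions, an instruction word of 0xef90011b would decode as an OABI call with number 283, and 0xef000000 with r7 == 283 as an EABI call with the same number.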

This situation is somewhat similar to running 32-bit executables on an x86_64
kernel (or 31-bit s390 executables on a 64-bit s390x kernel). In those cases,
we can test the TIF_32BIT thread flag to see which kind of executable we've
got. I don't see something similar here to test, although I'd love to be proved
wrong.

Your testing of the combinations of various CONFIG variables failed above
because this isn't a compile-time problem, this is a run-time problem.
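
A small C sketch of that distinction, under stated assumptions: CONFIG_OABI_COMPAT is defined locally as a stand-in for the kernel config symbol, and the per-task flag passed to abi_from_task() is hypothetical (no such flag has been identified in this bug). The function and enum names are invented for the example.

```c
#include <stdbool.h>

/* Stand-in for the kernel configuration symbol; defined here only so the
 * sketch compiles outside a kernel tree. */
#define CONFIG_OABI_COMPAT 1

enum abi { ABI_EABI = 0, ABI_OABI = 1 };

/* Compile-time selection: the answer is fixed when the module is built,
 * so it is the same for every binary traced by that module. */
static enum abi abi_from_config(void)
{
#ifdef CONFIG_OABI_COMPAT
	return ABI_OABI;
#else
	return ABI_EABI;
#endif
}

/* Run-time selection: decided per task, here from a hypothetical flag
 * saying whether the current task is an OABI executable. */
static enum abi abi_from_task(bool task_uses_oabi)
{
	return task_uses_oabi ? ABI_OABI : ABI_EABI;
}
```

abi_from_config() returns the same value no matter which executable is running, which matches what the #if experiments showed; only something shaped like abi_from_task() can tell the two binaries apart.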

I've been staring at arch/arm/kernel/entry-common.S, but I haven't had any
great ideas.


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (13 preceding siblings ...)
  2018-02-22 16:58 ` dsmith at redhat dot com
@ 2018-04-18  6:50 ` gmoreira at gmail dot com
  2018-04-18  6:52 ` gmoreira at gmail dot com
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: gmoreira at gmail dot com @ 2018-04-18  6:50 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #15 from Gustavo Moreira <gmoreira at gmail dot com> ---
Created attachment 10954
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10954&action=edit
Output staprun -vv ./stap_gcm.ko -c ../ex_socket_EABI


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (14 preceding siblings ...)
  2018-04-18  6:50 ` gmoreira at gmail dot com
@ 2018-04-18  6:52 ` gmoreira at gmail dot com
  2018-04-18  7:05 ` gmoreira at gmail dot com
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: gmoreira at gmail dot com @ 2018-04-18  6:52 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #16 from Gustavo Moreira <gmoreira at gmail dot com> ---
Created attachment 10955
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10955&action=edit
Output staprun -vv ./stap_gcm.ko -c ../ex_socket_OABI


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (15 preceding siblings ...)
  2018-04-18  6:52 ` gmoreira at gmail dot com
@ 2018-04-18  7:05 ` gmoreira at gmail dot com
  2018-04-18  7:26 ` gmoreira at gmail dot com
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: gmoreira at gmail dot com @ 2018-04-18  7:05 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #17 from Gustavo Moreira <gmoreira at gmail dot com> ---
Created attachment 10956
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10956&action=edit
Patch for tapset/linux/syscalls.stpm


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (16 preceding siblings ...)
  2018-04-18  7:05 ` gmoreira at gmail dot com
@ 2018-04-18  7:26 ` gmoreira at gmail dot com
  2018-04-30 17:16 ` dsmith at redhat dot com
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: gmoreira at gmail dot com @ 2018-04-18  7:26 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #18 from Gustavo Moreira <gmoreira at gmail dot com> ---
I ended up modifying the kernel to update thread_info struct with the syscall
number. Then I just call the original kernel syscall_get_nr() function from
SystemTap, which is working like a charm.

# stap -v -g -p4 -m stap_abi_test -e 'probe kernel.function("sys_connect").call
{ printf("%s - 0x%x\n", ppfunc(), _stp_syscall_nr()) }'

# staprun -c ../ex_socket_OABI ./stap_abi_test.ko
SyS_connect - 0x90011b
Connected

# staprun -c ../ex_socket_EABI ./stap_abi_test.ko
SyS_connect - 0x11b
Connected

However, when it's used with, for instance, your strace.stp, which uses probe
aliases, it doesn't work: it doesn't report the syscalls, even with an EABI
binary. (See staprun_output_eabi.log and staprun_output_oabi.log.)

I also noticed that, for instance in tapset/linux/sysc_connect.stp,
__syscall_gate() is called to filter the syscalls, so I've crafted some code
(see syscalls_stpm.patch) to avoid being filtered when the syscall number
doesn't match the constants.

I don't understand what is happening on the SystemTap side; it seems the
syscalls are being filtered somewhere. Could you please help me out?


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (17 preceding siblings ...)
  2018-04-18  7:26 ` gmoreira at gmail dot com
@ 2018-04-30 17:16 ` dsmith at redhat dot com
  2018-05-01  2:46 ` gmoreira at gmail dot com
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 22+ messages in thread
From: dsmith at redhat dot com @ 2018-04-30 17:16 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #19 from David Smith <dsmith at redhat dot com> ---
(sorry for the delay in responding)

(In reply to Gustavo Moreira from comment #18)
> I ended up modifying the kernel to update thread_info struct with the
> syscall number. Then I just call the original kernel syscall_get_nr()
> function from SystemTap, which is working like a charm.

Good deal. Have you tried getting the kernel patch upstream?

... stuff deleted ...

> However, for instance, when it's used with your strace.stp which uses probe
> alias, it doesn't work ... it doesn't report the syscalls. Even using an
> EABI binary it doesn't report the syscalls. (See staprun_output_eabi.log and
> staprun_output_oabi.log)
> 
> I also noticed that, for instance from tapset/linux/sysc_connect.stp,
> __syscall_gate() is called to filter the syscalls, so I've crafted some code
> (see syscalls_stpm.patch) to avoid to be filtered in case the syscall number
> doesn't match with the constants.
> 
> I'm not getting what is happening from the SystemTap side, it seems the
> syscalls are being filtered somewhere ... could you please help me out?

You'll need to break down the @__syscall_gate macro into smaller pieces and see
where it is calling "next". Another idea, perhaps simpler, would be to stick
printf calls in that macro (and all that it calls) to let you know which macro
is calling "next". My guess would be that the @__syscall_gate_compat_simple
macro is doing the filtering, but you'll need to test that theory.


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (18 preceding siblings ...)
  2018-04-30 17:16 ` dsmith at redhat dot com
@ 2018-05-01  2:46 ` gmoreira at gmail dot com
  2018-05-01 15:11 ` dsmith at redhat dot com
  2023-10-06 15:55 ` wcohen at redhat dot com
  21 siblings, 0 replies; 22+ messages in thread
From: gmoreira at gmail dot com @ 2018-05-01  2:46 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #20 from Gustavo Moreira <gmoreira at gmail dot com> ---
(In reply to David Smith from comment #19)
> (sorry for the delay in responding)
> 
> (In reply to Gustavo Moreira from comment #18)
> > I ended up modifying the kernel to update thread_info struct with the
> > syscall number. Then I just call the original kernel syscall_get_nr()
> > function from SystemTap, which is working like a charm.
> 
> Good deal. Have you tried getting the kernel patch upstream?


Not yet. Do you think they could be interested?

> 
> ... stuff deleted ...
> 
> > However, for instance, when it's used with your strace.stp which uses probe
> > alias, it doesn't work ... it doesn't report the syscalls. Even using an
> > EABI binary it doesn't report the syscalls. (See staprun_output_eabi.log and
> > staprun_output_oabi.log)
> > 
> > I also noticed that, for instance from tapset/linux/sysc_connect.stp,
> > __syscall_gate() is called to filter the syscalls, so I've crafted some code
> > (see syscalls_stpm.patch) to avoid to be filtered in case the syscall number
> > doesn't match with the constants.
> > 
> > I'm not getting what is happening from the SystemTap side, it seems the
> > syscalls are being filtered somewhere ... could you please help me out?
> 
> You'll need to break down the @__syscall_gate macro into smaller pieces and
> see where it is calling "next". Another idea, perhaps simpler, would be to
> stick printf calls in that macro (and all that it calls) to let you know
> which macro is calling "next". My guess would be that the
> @__syscall_gate_compat_simple macro is doing the filtering, but you'll need
> to test that theory.

Actually, the patches are fully working. The probes weren't being called
because the MAXSKIPPED limit was hit.
So, I've suppressed the time limit checks (--suppress-time-limits). I could
also raise the limit to a specific value, but I wonder why this started
happening after these changes.

What do you think about the changes in syscalls.stpm? Do they look good?

It also shows two warnings in the output:

WARNING: Skipped due to missed kretprobe/2 on
'kprobe.function("sys_readlink").return?': 1
WARNING: Skipped due to missed kprobe on 'kprobe.function("sys_readlink")?': 1

I don't think they are important, but it would be nice if we could fix them as
well. Any clue?


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (19 preceding siblings ...)
  2018-05-01  2:46 ` gmoreira at gmail dot com
@ 2018-05-01 15:11 ` dsmith at redhat dot com
  2023-10-06 15:55 ` wcohen at redhat dot com
  21 siblings, 0 replies; 22+ messages in thread
From: dsmith at redhat dot com @ 2018-05-01 15:11 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

--- Comment #21 from David Smith <dsmith at redhat dot com> ---
(In reply to Gustavo Moreira from comment #20)
> (In reply to David Smith from comment #19)
> > (sorry for the delay in responding)
> > 
> > (In reply to Gustavo Moreira from comment #18)
> > > I ended up modifying the kernel to update thread_info struct with the
> > > syscall number. Then I just call the original kernel syscall_get_nr()
> > > function from SystemTap, which is working like a charm.
> > 
> > Good deal. Have you tried getting the kernel patch upstream?

> 
> Not yet. Do you think they could be interested?

Yes. Without getting your patch into the upstream kernel, your work here will
only be useful to you.

> > ... stuff deleted ...
> > 
> > > However, for instance, when it's used with your strace.stp which uses probe
> > > alias, it doesn't work ... it doesn't report the syscalls. Even using an
> > > EABI binary it doesn't report the syscalls. (See staprun_output_eabi.log and
> > > staprun_output_oabi.log)
> > > 
> > > I also noticed that, for instance from tapset/linux/sysc_connect.stp,
> > > __syscall_gate() is called to filter the syscalls, so I've crafted some code
> > > (see syscalls_stpm.patch) to avoid to be filtered in case the syscall number
> > > doesn't match with the constants.
> > > 
> > > I'm not getting what is happening from the SystemTap side, it seems the
> > > syscalls are being filtered somewhere ... could you please help me out?
> > 
> > You'll need to break down the @__syscall_gate macro into smaller pieces and
> > see where it is calling "next". Another idea, perhaps simpler, would be to
> > stick printf calls in that macro (and all that it calls) to let you know
> > which macro is calling "next". My guess would be that the
> > @__syscall_gate_compat_simple macro is doing the filtering, but you'll need
> > to test that theory.
> 
> Actually, the patches are fully working. The probes wasn't being called due
> to the MAXSKIPPED limit:
> So, I've suppressed the time limits checks (--suppress-time-limits). I could
> also increase the limit to a specific value but anyway I wonder why it's
> happening now after these changes.
> 
> What do you think about the changes in syscalls.stpm? Do they look good?

I've got some problems with the changes to syscalls.stpm. Besides having debug
printf's present, your changes bypass the filtering if you've got an OABI
executable. You'll end up with syscall nesting that way, something we
definitely try to avoid. Also, you'd need similar changes in the other macros -
__syscall_gate2, __syscall_compat_gate, etc.

Earlier, you said: In OABI the syscall convention is svc 0x900000 + SYSCALL_NR.
If that is true, couldn't your changes be simplified to:

    %( CONFIG_OABI_COMPAT == "y" %?
        # If _stp_syscall_nr() fails, that means we aren't in user
        # context. So, skip this call.
        try { __nr = _stp_syscall_nr() } catch { next }

        # In ARM, if it is an OABI call, the syscalls are >
        # __NR_OABI_SYSCALL_BASE
        if (__nr > @const("__NR_OABI_SYSCALL_BASE")) {
             __nr = __nr - @const("__NR_OABI_SYSCALL_BASE")
        }
        if (__nr != @syscall_nr) next
    %:
...
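
To make the arithmetic of that gate concrete, here is a rough C model of it. It is not SystemTap code: the function names and the OABI_SYSCALL_BASE macro are invented for the sketch, and 0x900000 is taken from the "svc 0x900000 + SYSCALL_NR" convention quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value of __NR_OABI_SYSCALL_BASE (0x900000), as quoted in this
 * bug; the macro name is local to this sketch. */
#define OABI_SYSCALL_BASE 0x900000L

/* Strip the OABI base when present, mirroring the strict '>' comparison
 * in the snippet above. */
static long normalize_syscall_nr(long raw_nr)
{
	if (raw_nr > OABI_SYSCALL_BASE)
		return raw_nr - OABI_SYSCALL_BASE;
	return raw_nr;
}

/* Returns true when the probe should continue, false where the
 * SystemTap macro would call "next". */
static bool gate_passes(long raw_nr, long wanted_nr)
{
	/* The wanted number is a plain, ABI-independent syscall number. */
	assert(wanted_nr >= 0 && wanted_nr < OABI_SYSCALL_BASE);
	return normalize_syscall_nr(raw_nr) == wanted_nr;
}
```

With the values reported in comment #18, gate_passes(0x90011b, 283) and gate_passes(0x11b, 283) would both be true. One detail worth checking: with the strict '>' comparison, a raw value exactly equal to the base (OABI syscall number 0) is left unchanged, so '>=' may be what is wanted.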

And then the next thing I wonder is there has got to be more difference than
just syscall numbers between the two ABIs. I assume structures are laid out
differently along with perhaps other changes. You'll have to account for that.

Poking around the arch/arm directory I'd guess you might need to probe the
sys_oabi_* functions and implement a way of knowing if we're in an OABI
executable (like setting a thread flag).


> It also shows two warnings in the output:
> 
> WARNING: Skipped due to missed kretprobe/2 on
> 'kprobe.function("sys_readlink").return?': 1
> WARNING: Skipped due to missed kprobe on 'kprobe.function("sys_readlink")?':
> 1
> 
> I don't think it would be important but anyway it would be nice if we could
> fix it as well. Any clue?

Actually, it is important (and probably why MAXSKIPPED is being hit). Let's
start with the definition of MAXSKIPPED: "Maximum number of skipped probes
before an exit is triggered, default 100."

So, the first question to answer is "why are you getting so many skipped
probes?". You might start by seeing if the kernel outputs any messages when
this happens.


* [Bug runtime/22847] ARM OABI syscall tracing issues
       [not found] <bug-22847-6586@http.sourceware.org/bugzilla/>
                   ` (20 preceding siblings ...)
  2018-05-01 15:11 ` dsmith at redhat dot com
@ 2023-10-06 15:55 ` wcohen at redhat dot com
  21 siblings, 0 replies; 22+ messages in thread
From: wcohen at redhat dot com @ 2023-10-06 15:55 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=22847

William Cohen <wcohen at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wcohen at redhat dot com
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #22 from William Cohen <wcohen at redhat dot com> ---
OABI support has been deprecated for quite some time, so I'm going to close this.

https://docs.embeddedts.com/EABI_vs_OABI
https://forums.gentoo.org/viewtopic-t-925012-start-0.html

