public inbox for systemtap@sourceware.org
* [Bug runtime/30408] New: Always fail to read userland memory (read faults) inside perf event probes with 6.2 kernels
From: agentzh at gmail dot com @ 2023-04-30  6:28 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

            Bug ID: 30408
           Summary: Always fail to read userland memory (read faults)
                    inside perf event probes with 6.2 kernels
           Product: systemtap
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: agentzh at gmail dot com
  Target Milestone: ---

I've noticed a stap regression on Fedora 36 x86_64 with its latest 6.2.12
kernel: reading userland memory inside a perf event probe handler always
fails. The following minimal example demonstrates it.

First prepare a small C program as the tracing target:

```
#include <stdio.h>

unsigned long a = 0;

void foo (void) {
    FILE *f = fopen("/dev/null", "w");
    fputs("hello world!\n", f);
    fclose(f);
}

int main (void) {
    while (1) {
        a++;
        for (int i = 0; i < 100; i++) {
            foo();
            foo();
        }
    }
    return 0;
}
```

Then we compile it with debug symbols:

```
gcc -g a.c
```

And we copy the resulting `./a.out` program file to `/tmp/`:

```
cp a.out /tmp/
```

Now we prepare a small stap script file named c.stp:

```
global fails = 0;

probe perf.type(1).config(0).sample(1000000) {
    if (execname() == "a.out") {
        ok = 1;
        val = 0;
        try {
            val = @var("a", "/tmp/a.out");
        } catch {
            ok = 0;
            fails++;
        }
        if (ok) {
            printf("a = %d\n", val);
            exit();
        }
    }
}

probe begin {
    warn("Start tracing...");
}

probe timer.s(2) {
    printf("failed %ld times\n", fails);
}
```

And we run it like this:

```
$ /opt/stap/bin/stap -c /tmp/a.out c.stp
WARNING: Start tracing...
failed 3927 times
failed 7854 times
failed 11769 times
failed 15741 times
failed 19705 times
failed 23671 times
failed 27668 times
failed 31639 times
failed 35613 times
failed 39563 times
failed 43491 times
failed 47463 times
failed 51465 times
...
```

No matter how long we wait, it always fails to read anything from `@var()`.

For comparison, I also ran this example with an older kernel (5.0.16) on an
older Fedora system, and it worked fine:

```
$ /opt/stap/bin/stap -c ./a.out c.stp
WARNING: Start tracing...
a = 150746
```

So it's definitely a behavior regression with the new kernel.

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [Bug runtime/30408] Always fail to read userland memory (read faults) inside perf event probes with 6.2 kernels
From: agentzh at gmail dot com @ 2023-05-02 22:31 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

--- Comment #1 from agentzh <agentzh at gmail dot com> ---
The 6.1.18 kernel of Fedora 36 x86_64 also has this problem.


* [Bug runtime/30408] Always fail to read userland memory (read faults) inside perf event probes with 6.2/6.1 kernels
From: agentzh at gmail dot com @ 2023-05-02 22:32 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

agentzh <agentzh at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Always fail to read         |Always fail to read
                   |userland memory (read       |userland memory (read
                   |faults) inside perf event   |faults) inside perf event
                   |probes with 6.2 kernels     |probes with 6.2/6.1 kernels


* [Bug runtime/30408] Always fail to read userland memory (read faults) inside perf event probes with 6.2/6.1 kernels
From: agentzh at gmail dot com @ 2023-05-09 19:59 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

--- Comment #2 from agentzh <agentzh at gmail dot com> ---
I also noticed that reading userland stack memory fails as well. Tapset
functions like `sprint_ubacktrace()` can only return the top stack frame,
which is read directly from CPU registers (like RIP). Alas, it seems that only
register reads work here.


* [Bug runtime/30408] Always fail to read userland memory (read faults) inside perf event probes with 6.2/6.1 kernels
From: agentzh at gmail dot com @ 2023-05-09 20:04 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

--- Comment #3 from agentzh <agentzh at gmail dot com> ---
Different people have already reproduced it on 2 different machines, so I
think this regression is real.


* [Bug runtime/30408] Always fail to read userland memory (read faults) inside perf event probes with 6.2/6.1 kernels
From: agentzh at gmail dot com @ 2023-05-13  5:33 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

--- Comment #4 from agentzh <agentzh at gmail dot com> ---
OK, I tracked it down to the user_addr_max() macro, which has been missing
from the kernel since 5.18 and is used in the stap runtime's
lookup_bad_addr_user() function. In perf event probe handlers, the in_task()
macro always returns 0 (false). in_task() is a macro defined as

```
#define in_task()  (!(in_nmi() | in_hardirq() | in_serving_softirq()))
```

The kernel handles perf events like `perf.type(1).config(0).sample(100000)`
in hardirq context, so in_hardirq() returns 1 there.
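
To make the effect of that definition concrete, here is a minimal user-space
sketch in C. It is only a model: the real macros read the current CPU's
preempt_count, while the `model_*` helpers below are hypothetical names that
take the count as an argument, and the bit positions are assumptions modeled
on recent kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit layout, modeled on recent include/linux/preempt.h. */
#define SOFTIRQ_SHIFT   8
#define HARDIRQ_SHIFT   16
#define NMI_SHIFT       20

#define SOFTIRQ_OFFSET  (1UL << SOFTIRQ_SHIFT)    /* set while serving a softirq */
#define HARDIRQ_OFFSET  (1UL << HARDIRQ_SHIFT)    /* one hardirq nesting level */
#define HARDIRQ_MASK    (0xfUL << HARDIRQ_SHIFT)
#define NMI_MASK        (0xfUL << NMI_SHIFT)

/* Hypothetical stand-ins for in_nmi(), in_hardirq(), in_serving_softirq(). */
static inline bool model_in_nmi(unsigned long pc)
{
    return (pc & NMI_MASK) != 0;
}

static inline bool model_in_hardirq(unsigned long pc)
{
    return (pc & HARDIRQ_MASK) != 0;
}

static inline bool model_in_serving_softirq(unsigned long pc)
{
    return (pc & SOFTIRQ_OFFSET) != 0;
}

/* Same shape as the in_task() definition quoted above. */
static inline bool model_in_task(unsigned long pc)
{
    return !(model_in_nmi(pc) | model_in_hardirq(pc) |
             model_in_serving_softirq(pc));
}

/* Models entering one level of hardirq context. */
static inline unsigned long model_irq_enter(unsigned long pc)
{
    assert((pc & HARDIRQ_MASK) != HARDIRQ_MASK);  /* nesting field not full */
    return pc + HARDIRQ_OFFSET;
}
```

In this model, `model_in_task(0)` is true for plain task context, while
`model_in_task(model_irq_enter(0))` is false. The second case corresponds to
the perf event handler described above.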

For comparison, the kernel's own bpf_probe_read_user() function does not check
in_task(), in_hardirq(), or user_addr_max() (though it has had a separate
deadlock regression since 5.19 on the code path copy_from_user_nofault ->
check_object_size -> find_vmap_area(), around vmap_area_lock, but that is
another story).

The following patch seems to fix this for me:

https://gist.github.com/agentzh/948f77381c1f1e2cb7474c22c2c17c0e

So this regression has actually been present since 5.18.
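
For illustration only, and not taken from the patch linked above: the
analysis suggests the address check became sensitive to execution context.
The C sketch below models a bounds check that depends only on the address
range. `MODEL_USER_ADDR_MAX` and `model_user_range_ok()` are hypothetical
names, and the limit is an assumed value mimicking x86_64 with 4-level
paging.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound of the user address space (47-bit user space minus
 * one page). A real kernel would use its own constant here. */
#define MODEL_USER_ADDR_MAX ((uint64_t)0x00007ffffffff000ULL)

/* Hypothetical helper: true when [addr, addr + size) lies entirely below
 * MODEL_USER_ADDR_MAX. It consults no execution-context state, so the
 * result is the same in task, softirq, hardirq, or NMI context. */
static inline bool model_user_range_ok(uint64_t addr, size_t size)
{
    if (size == 0)
        return false;                   /* nothing to read */
    if (addr >= MODEL_USER_ADDR_MAX)
        return false;                   /* starts outside user space */
    /* Written as a subtraction so that addr + size cannot wrap around. */
    return size <= MODEL_USER_ADDR_MAX - addr;
}
```

With these assumptions, an 8-byte read at 0x400000 passes the check, while a
read at a kernel-half address such as 0xffffffff81000000 is rejected, whatever
context the caller runs in.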


* [Bug runtime/30408] Always fail to read userland memory (read faults) inside perf event probes with 6.2/6.1 kernels
From: agentzh at gmail dot com @ 2023-05-13  6:28 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

--- Comment #5 from agentzh <agentzh at gmail dot com> ---
This patch has been tested with 6.2 lockdep/kasan debug kernels and RHEL 7's
3.10 debug kernels (with lockdep).


* [Bug runtime/30408] Always fail to read userland memory (read faults) inside perf event probes with 6.2/6.1 kernels
From: agentzh at gmail dot com @ 2023-05-16 19:28 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

--- Comment #6 from agentzh <agentzh at gmail dot com> ---
OK, I dug deeper here and came up with this better V2 patch:

https://gist.github.com/agentzh/b712c10e859ef5cc08b8de48d3ab85c5


* [Bug runtime/30408] Always fail to read userland memory (read faults) inside perf event probes with 6.2/6.1 kernels
From: agentzh at gmail dot com @ 2023-05-16 19:51 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=30408

agentzh <agentzh at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from agentzh <agentzh at gmail dot com> ---
Pushed a slightly modified version of the V2 patch as commit ca60bb0c0 to
master. Thanks fche for the review.
