public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* [Bug translator/31018] New: Map operations might get no lock protections due to "pushdown lock" bugs
@ 2023-10-31 19:25 agentzh at gmail dot com
  2023-10-31 19:27 ` [Bug translator/31018] " agentzh at gmail dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: agentzh at gmail dot com @ 2023-10-31 19:25 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31018

            Bug ID: 31018
           Summary: Map operations might get no lock protections due to
                    "pushdown lock" bugs
           Product: systemtap
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: translator
          Assignee: systemtap at sourceware dot org
          Reporter: agentzh at gmail dot com
  Target Milestone: ---

The stap translator's "pushdown lock" optimization is buggy in that it may not
emit locking code for all the code branches in the first statement needing a
lock, making subsequent statements actually needing a lock go without a lock.
Consider the following minimal example:

```
global stats;

probe process.function("main") {
    my_pid = pid();
    v = @var("argc");
    if (v > 3) {
        delete stats;
    }
    stats[my_pid] <<< v;  // this line has no lock held
}
```

The translator emits locking code inside the `if` statement's "then" branch but
when the `if`'s condition is not met, the subsequent `stats[my_id] <<< v`
statement would not get any locks held at all.

Such bugs lead to corruption in the unprotected map data structure when the
probe handler runs in parallel on different CPUs. We saw panic messages like
these:

```
[ 1274.534162] list_del corruption. next->prev should be ffffc90002180b80, but
was ffffc90002180be0
[ 1274.534850] kernel BUG at lib/list_debug.c:54!
[ 1274.506139] list_add double add: new=ffffc90002180fd0,
prev=ffffc90002180fd0, next=ffffc90000eb9020.
```

The backtrace looks like this:

```
[ 1274.513560] Call Trace:
[ 1274.513700]  _new_map_create+0x158/0x540
[stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730]
[ 1274.514133]  _stp_map_set_ii+0x122/0x1e0
[stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730]
[ 1274.514574]  probe_6385+0xe20/0x1bd0
[stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730]
[ 1274.514990]  ? probe_6394+0x1560/0x1560
[stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730]
[ 1274.515424]  ? _stp_ctl_write_cmd+0x16f0/0x16f0
[stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730]
[ 1274.515891]  enter_kprobe_probe+0x529/0xcc0
[stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730]
```

The guilty code snippet in the translate is here, in C++ function
c_unparser::visit_block():

```
      // if needed, find the lock insertion site; instruct it to lock
      if (pushdown_lock_p(s))
        for (unsigned i=0; i<s->statements.size(); i++)
          if (locks_needed_p (s->statements[i]))
            { pushdown_lock.insert(s->statements[i]); pushed_lock_down = true;
break; }
```

Here the first "if" statement in the sample stap script above would would hit
the "if (locks_needed_p (s->statements[i]))" condition in this C++ code
snippet, shortcuiting the surrounding C++ "for" loop, leaving the subsequent
C++ statements like "stats[my_pid] <<< v;" go without "pushdown_lock" marking.
So the subsequent statements would not emit any locking code at all.

The correct fix is to always emit locking code for statements that might "push
down" the locks, until seeing a simple statements what don't (like an
assignment statement using "<<<").

I'm preparing a patch to fix this problem.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug translator/31018] Map operations might get no lock protections due to "pushdown lock" bugs
  2023-10-31 19:25 [Bug translator/31018] New: Map operations might get no lock protections due to "pushdown lock" bugs agentzh at gmail dot com
@ 2023-10-31 19:27 ` agentzh at gmail dot com
  2023-10-31 20:56 ` agentzh at gmail dot com
  2023-10-31 23:08 ` agentzh at gmail dot com
  2 siblings, 0 replies; 4+ messages in thread
From: agentzh at gmail dot com @ 2023-10-31 19:27 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31018

--- Comment #1 from agentzh <agentzh at gmail dot com> ---
I forgot to add that it might not necessarily be triggered by a leading if
statement needing a lock. Other statements that "push down" locks may also
trigger it, like a loop statement or nested code blocks.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug translator/31018] Map operations might get no lock protections due to "pushdown lock" bugs
  2023-10-31 19:25 [Bug translator/31018] New: Map operations might get no lock protections due to "pushdown lock" bugs agentzh at gmail dot com
  2023-10-31 19:27 ` [Bug translator/31018] " agentzh at gmail dot com
@ 2023-10-31 20:56 ` agentzh at gmail dot com
  2023-10-31 23:08 ` agentzh at gmail dot com
  2 siblings, 0 replies; 4+ messages in thread
From: agentzh at gmail dot com @ 2023-10-31 20:56 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31018

--- Comment #2 from agentzh <agentzh at gmail dot com> ---
fche asked for the ko C source snippets before and after the fix. Here is the
code before the fix:

https://gist.github.com/agentzh/c8dc388a7b362564e3e18c4428db9bd0

And here is the one after the fix:

https://gist.github.com/agentzh/928a2cda402fe6a0fee24fff941e68c4

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug translator/31018] Map operations might get no lock protections due to "pushdown lock" bugs
  2023-10-31 19:25 [Bug translator/31018] New: Map operations might get no lock protections due to "pushdown lock" bugs agentzh at gmail dot com
  2023-10-31 19:27 ` [Bug translator/31018] " agentzh at gmail dot com
  2023-10-31 20:56 ` agentzh at gmail dot com
@ 2023-10-31 23:08 ` agentzh at gmail dot com
  2 siblings, 0 replies; 4+ messages in thread
From: agentzh at gmail dot com @ 2023-10-31 23:08 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=31018

agentzh <agentzh at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from agentzh <agentzh at gmail dot com> ---
Fixed in commit 1c547184ca.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-10-31 23:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-31 19:25 [Bug translator/31018] New: Map operations might get no lock protections due to "pushdown lock" bugs agentzh at gmail dot com
2023-10-31 19:27 ` [Bug translator/31018] " agentzh at gmail dot com
2023-10-31 20:56 ` agentzh at gmail dot com
2023-10-31 23:08 ` agentzh at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).