From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 3F6563858D1E; Tue, 31 Oct 2023 19:25:16 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3F6563858D1E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1698780316; bh=7U5JN/Cjd/+Jf+eZCs4lXw8DnkWK3SaVI3RBjmMQpkk=; h=From:To:Subject:Date:From; b=SyP2lRURDXayozzAXyY/r8cxYi9jJE0iqRG5tGPGdiQc6o+7zRUxp23b4UBn4PX8J 5IfCKMm4kalxSL71Q5ttlx6y8KB5DUCS5Foe+KanLEMTJpFHezxkMKtUlJXfS1ce2S IiBZ6tu9Uq3o11dO2MMP7PtVCD1waClmSdCo5d70= From: "agentzh at gmail dot com" To: systemtap@sourceware.org Subject: [Bug translator/31018] New: Map operations might get no lock protections due to "pushdown lock" bugs Date: Tue, 31 Oct 2023 19:25:15 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: systemtap X-Bugzilla-Component: translator X-Bugzilla-Version: unspecified X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: agentzh at gmail dot com X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Resolution: X-Bugzilla-Priority: P2 X-Bugzilla-Assigned-To: systemtap at sourceware dot org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://sourceware.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://sourceware.org/bugzilla/show_bug.cgi?id=3D31018 Bug ID: 31018 Summary: Map operations might get no lock protections due to "pushdown lock" bugs Product: systemtap Version: unspecified Status: UNCONFIRMED Severity: normal Priority: P2 Component: translator Assignee: systemtap at sourceware dot org Reporter: agentzh at gmail dot com Target Milestone: --- The stap translator's "pushdown lock" optimization is buggy in that it may = not emit locking code for all the code branches in the first statement needing a lock, making subsequent statements actually needing a lock go without a loc= k. Consider the following minimal example: ``` global stats; probe process.function("main") { my_pid =3D pid(); v =3D @var("argc"); if (v > 3) { delete stats; } stats[my_pid] <<< v; // this line has no lock held } ``` The translator emits locking code inside the `if` statement's "then" branch= but when the `if`'s condition is not met, the subsequent `stats[my_id] <<< v` statement would not get any locks held at all. Such bugs lead to corruption in the unprotected map data structure when the probe handler runs in parallel on different CPUs. We saw panic messages like these: ``` [ 1274.534162] list_del corruption. next->prev should be ffffc90002180b80, = but was ffffc90002180be0 [ 1274.534850] kernel BUG at lib/list_debug.c:54! [ 1274.506139] list_add double add: new=3Dffffc90002180fd0, prev=3Dffffc90002180fd0, next=3Dffffc90000eb9020. ``` The backtrace looks like this: ``` [ 1274.513560] Call Trace: [ 1274.513700] _new_map_create+0x158/0x540 [stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730] [ 1274.514133] _stp_map_set_ii+0x122/0x1e0 [stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730] [ 1274.514574] probe_6385+0xe20/0x1bd0 [stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730] [ 1274.514990] ? probe_6394+0x1560/0x1560 [stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730] [ 1274.515424] ? _stp_ctl_write_cmd+0x16f0/0x16f0 [stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730] [ 1274.515891] enter_kprobe_probe+0x529/0xcc0 [stap_ac8469bf9b1b0dd44e159b94c8bba7f8_16730] ``` The guilty code snippet in the translate is here, in C++ function c_unparser::visit_block(): ``` // if needed, find the lock insertion site; instruct it to lock if (pushdown_lock_p(s)) for (unsigned i=3D0; istatements.size(); i++) if (locks_needed_p (s->statements[i])) { pushdown_lock.insert(s->statements[i]); pushed_lock_down =3D = true; break; } ``` Here the first "if" statement in the sample stap script above would would h= it the "if (locks_needed_p (s->statements[i]))" condition in this C++ code snippet, shortcuiting the surrounding C++ "for" loop, leaving the subsequent C++ statements like "stats[my_pid] <<< v;" go without "pushdown_lock" marki= ng. So the subsequent statements would not emit any locking code at all. The correct fix is to always emit locking code for statements that might "p= ush down" the locks, until seeing a simple statements what don't (like an assignment statement using "<<<"). I'm preparing a patch to fix this problem. --=20 You are receiving this mail because: You are the assignee for the bug.=