From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 81140 invoked by alias); 20 Feb 2020 14:26:54 -0000
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: systemtap-owner@sourceware.org
Received: (qmail 81102 invoked by uid 48); 20 Feb 2020 14:26:50 -0000
From: "fche at redhat dot com"
To: systemtap@sourceware.org
Subject: [Bug testsuite/23493] Test suite makes all CPU stuck forever on kernel 4.16.16 (Fedora 27)
Date: Thu, 20 Feb 2020 14:26:00 -0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: systemtap
X-Bugzilla-Component: testsuite
X-Bugzilla-Version: unspecified
X-Bugzilla-Severity: normal
X-Bugzilla-Who: fche at redhat dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: systemtap at sourceware dot org
X-Bugzilla-Target-Milestone: ---
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2020-q1/txt/msg00039.txt

https://sourceware.org/bugzilla/show_bug.cgi?id=23493

--- Comment #12 from Frank Ch. Eigler ---
(In reply to Mark Wielaard from comment #9)
> It might assume the current pid actually exists? Maybe
> current is NULL at this point and we should check for that first?

current is nonzero (I think by definition), but task_active_pid_ns(current) (called within find_get_pid->find_vpid) looks like it was 0. So when we get called for a schedule tracepoint, the invoking task might just be a mostly-dead one which doesn't even have a pid any more. I'm not sure why this should occur only under heavy load (installcheck-parallel), vs. all the time.
Maybe task garbage collection occurs more in the former case. I'm tempted to put in a probe-prologue detection of this (maybe via (current->flags & PF_EXITING)) and reject the probe hit entirely.
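
To make the proposed prologue check concrete, here is a minimal userspace sketch in C. It is not SystemTap runtime code: struct mock_task, struct mock_pid_ns, mock_ns_level(), probe_hit_allowed() and handle_probe_hit() are all hypothetical stand-ins invented for illustration. Only the PF_EXITING flag value is taken from the kernel's include/linux/sched.h. The sketch assumes that a mostly-dead task has PF_EXITING set by the time its pid namespace pointer has gone away, which is the premise of the idea above and not something verified here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag value as defined in the kernel's include/linux/sched.h. */
#define PF_EXITING 0x00000004

/* Hypothetical stand-in for a pid namespace. */
struct mock_pid_ns {
    int level;
};

/* Hypothetical stand-in for the few task_struct fields that matter here. */
struct mock_task {
    unsigned int flags;
    struct mock_pid_ns *active_ns; /* NULL once the task has lost its pid */
};

/*
 * Models a lookup that dereferences the namespace without checking it,
 * the way the find_vpid path is suspected to. The assert marks the spot
 * where a NULL namespace would be fatal.
 */
static int mock_ns_level(const struct mock_task *t)
{
    assert(t->active_ns != NULL);
    return t->active_ns->level;
}

/* Hypothetical prologue guard: refuse probe hits from exiting tasks. */
static bool probe_hit_allowed(const struct mock_task *t)
{
    return (t->flags & PF_EXITING) == 0;
}

/*
 * Hypothetical probe entry point: returns -1 if the hit was rejected,
 * otherwise the namespace level obtained from the lookup.
 */
static int handle_probe_hit(const struct mock_task *t)
{
    if (!probe_hit_allowed(t))
        return -1; /* rejected before any pid lookup is attempted */
    return mock_ns_level(t);
}
```

With a live task (flags 0, valid namespace) handle_probe_hit() proceeds to the lookup; with a task marked PF_EXITING whose namespace pointer is NULL it returns -1 without touching the pointer. Whether PF_EXITING alone covers every window in which task_active_pid_ns(current) can be 0 is an open question, so a real fix might also want an explicit NULL check on the namespace.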