Date: Mon, 13 Aug 2018 13:01:00 -0000
From: Andrew Burgess
To: Pedro Alves
Cc: Simon Marchi, gdb-patches@sourceware.org
Subject: Re: [PATCH] gdb: Fix instability in thread groups test
Message-ID: <20180813130125.GY3155@embecosm.com>
References: <20180810095750.13017-1-andrew.burgess@embecosm.com> <7da382e5-bd5e-25c2-b3f8-f38e692f35a1@redhat.com> <20180813114137.GX3155@embecosm.com> <2e47657d-b81b-497d-58bf-0463980dec24@redhat.com>
In-Reply-To: <2e47657d-b81b-497d-58bf-0463980dec24@redhat.com>

* Pedro Alves [2018-08-13 13:03:47 +0100]:

> On 08/13/2018 12:41 PM, Andrew Burgess wrote:
> > * Pedro Alves [2018-08-13 10:51:44 +0100]:
> >
> >> But shouldn't we make GDB handle this better?  Make the output
> >> more "atomic" in the sense that we either show a valid complete
> >> entry, or no entry?  There's an inherent race
> >> here, since we use multiple /proc accesses to fill up a process
> >> entry.  If we start fetching process info for a process, and the process
> >> disappears midway, I'd think it better to discard that process's entry,
> >> as-if we had not even seen it, i.e., as if we had listed the set of
> >> processes a tiny moment later.
> >
> > I agree.
> >
> > We also need to think about process reuse.  So with multiple accesses
> > to /proc we might start with one process, and end up with a completely
> > new process.
> >
> > I might be overthinking it, but my first guess at a reliable strategy
> > would be:
> >
> >   1. Find each /proc/PID directory.
> >   2. Read /proc/PID/stat and extract the start time.  Failure to read
> >      this causes the process to be abandoned.
> >   3. Read all of the other /proc/PID/XXX files as needed.  Any failure
> >      results in the process being abandoned.
> >   4. Reread /proc/PID/stat and confirm the start time hasn't changed,
> >      this would indicate a new process having slipped in.
> >
>
> My initial quick thought was just to drop the process entry if
> it turns out we end up with an empty core set.
>
> I wonder whether we can prevent PID reuse by keeping a descriptor
> for /proc/PID/ open while we open the other files.  Probably not.

That was my first thought too.  I tried:

  - chdir to /proc/PID
  - opendir on /proc/PID
  - Kill process PID
  - Read from the opendir handle, find nothing there.

Which didn't really surprise me, but was worth a try...

> Otherwise, your scheme sounds like the next best.
>
> > Given the system is still running, we can never be sure that we have
> > "all" processes, so throwing out anything that looks wrong seems like
> > the right strategy.
> >
> > Also in step #4 we know we've just missed a process - something new
> > has started, but we ignore it.  I think this is fine though given the
> > racy nature of this sort of thing...
> >
> > The only question is, could these thoughts be dropped into a bug
> > report,
> >
> Sure.
> >
> > and the original patch to remove the unstable result applied?
> > Or maybe the test updated to either PASS or KFAIL?
>
> I'd prefer the KFAIL option.  At the very least, a comment in
> the .exp file.

I'll put something together...

Thanks,
Andrew
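
For anyone picking this up from the bug report, the four-step strategy quoted above could be sketched roughly as follows.  This is only an illustration in Python, not GDB code: the function names, the `wanted` file list and the `proc_root` parameter are all made up for the example.  The one thing taken from Linux itself is that the start time is field 22 of /proc/PID/stat, and that field 2 (comm) is parenthesised and may contain spaces.

```python
import os


def parse_start_time(stat_text):
    """Return field 22 (starttime) from the text of /proc/PID/stat."""
    # Field 2 (comm) is parenthesised and may itself contain spaces or
    # ')' characters, so cut at the last ')' before splitting.
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 22 is rest[19].
    return int(rest[19])


def read_file(path):
    with open(path, "rb") as f:
        return f.read()


def read_start_time(base):
    raw = read_file(os.path.join(base, "stat"))
    return parse_start_time(raw.decode("ascii", "replace"))


def read_process_entry(pid, wanted=("cmdline", "status"), proc_root="/proc"):
    """Return a dict for PID, or None if the entry cannot be trusted.

    `wanted` and `proc_root` are hypothetical knobs for this sketch.
    """
    base = os.path.join(proc_root, str(pid))
    try:
        # Step 2: record the start time before reading anything else.
        start = read_start_time(base)
        # Step 3: read the other files; any failure abandons the process.
        entry = {name: read_file(os.path.join(base, name)) for name in wanted}
        # Step 4: a changed start time means a new process took the PID.
        if read_start_time(base) != start:
            return None
    except (OSError, ValueError, IndexError):
        return None
    entry["start_time"] = start
    return entry


def list_processes(proc_root="/proc"):
    """Step 1: treat every all-digit directory under proc_root as a PID."""
    entries = {}
    for name in os.listdir(proc_root):
        if name.isdigit():
            entry = read_process_entry(int(name), proc_root=proc_root)
            if entry is not None:
                entries[int(name)] = entry
    return entries
```

Every failure path returns None, so the caller sees either a complete entry or no entry at all, which is the "atomic" behaviour asked for earlier in the thread.  A process that reuses the PID within the same clock tick would still get through the start-time check, so this narrows the race rather than closing it.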