From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id 17A28385B835 for ; Thu, 16 Apr 2020 19:39:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 17A28385B835 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca Received: from [10.0.0.11] (unknown [192.222.164.54]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id EC34A1E5F8; Thu, 16 Apr 2020 15:39:29 -0400 (EDT) Subject: Re: [PATCH 28/28] Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412) To: Pedro Alves , gdb-patches@sourceware.org References: <20200414175434.8047-1-palves@redhat.com> <20200414175434.8047-29-palves@redhat.com> From: Simon Marchi Message-ID: Date: Thu, 16 Apr 2020 15:39:28 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 MIME-Version: 1.0 In-Reply-To: <20200414175434.8047-29-palves@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: fr Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Apr 2020 19:39:33 -0000 On 2020-04-14 1:54 p.m., Pedro Alves via Gdb-patches wrote: > In PR/25412, Simon noticed that after the multi-target series, the > tid-reuse.exp testcase manages to create a duplicate thread in the > thread list. Or rather, two threads with the same PTID. > > add_thread_silent has code in place to detect the case of a new thread > reusing some older thread's ptid, but it doesn't work correctly > anymore when the old thread is NOT the current thread and it has a > refcount higher than 0. Either condition prevents a thread from being > deleted, but the refcount case wasn't being considered. I think the > reason that case wasn't considered is that that code predates > thread_info refcounting. Back when it was originally written, > delete_thread always deleted the thread. > > That add_thread_silent code in question has some now-unnecessary > warts, BTW. For instance, this: > > /* Make switch_to_thread not read from the thread. */ > new_thr->state = THREAD_EXITED; > > ... used to be required because switch_to_thread would update > 'stop_pc' otherwise. I.e., it would read registers from an exited > thread otherwise. switch_to_thread no longer reads the stop_pc, since: > > commit f2ffa92bbce9dd5fbedc138ac2a3bc8a88327d09 > Author: Pedro Alves > AuthorDate: Thu Jun 28 20:18:24 2018 +0100 > > gdb: Eliminate the 'stop_pc' global > > Also, if the ptid of the now-gone current thread is reused, we > currently return from add_thread_silent with the current thread > pointing at the _new_ thread. Either pointing at the old thread, or > at no thread selected would be reasonable. But pointing at an > unrelated thread (the new thread that happens to reuse the ptid) is > just broken. Seems like I was the one who wrote it like that but I > have no clue why, FWIW. > > Currently, an exited thread kept in the thread list still holds its > original ptid. The idea was that we need the ptid to be able to > temporarily switch to another thread and then switch back to the > original thread, because thread switching is really inferior_ptid > switching. Switching back to the original thread requires a ptid > lookup. > > Now, in order to avoid exited threads with the same ptid as a live > thread in the same thread list, one thing I considered (and tried) was > to change an exited thread's ptid to minus_one_ptid. However, with > that, there's a case that we won't handle well, which is if we end up > with more than one exited thread in the list, since then all exited > threads will all have the same ptid. Since inferior_thread() relies > on inferior_ptid, may well return the wrong thread. > > My next attempt to address this, was to switch an exited thread's ptid > to a globally unique "exited" ptid, which is a ptid with pid == -1 and > tid == 'the thread's global GDB thread number'. Note that GDB assumes > that the GDB global thread number is monotonically increasing and > doesn't wrap around. (We should probably make GDB thread numbers > 64-bit to prevent that happening in practice; they're currently signed > 32-bit.) This attempt went a long way, but still ran into a number of > issues. It was a major hack too, obviously. > > My next attempt is the one that I'm proposing, which is to bite the > bullet and break the connection between inferior_ptid and > inferior_thread(), aka the current thread. I.e., make the current > thread be a global thread_info pointer that is written to directly by > switch_to_thread, etc., and making inferior_thread() return that > pointer, instead of having inferior_thread() lookup up the > inferior_ptid thread, by ptid_t. You can look at this as a > continuation of the effort of using more thread_info pointers instead > of ptids when possible. > > By making the current thread a global thread_info pointer, we can make > switch_to_thread simply write to the global thread pointer, which > makes scoped_restore_current_thread able to restore back to an exited > thread without relying on unrelyable ptid look ups. I.e., this makes > it not a real problem to have more than one thread with the same ptid > in the thread list. There will always be only one live thread with a > given ptid, so code that looks up a live thread by ptid will always be > able to find the right one. > > This change required auditing the whole codebase for places where we > were writing to inferior_ptid directly to change the current thread, > and change them to use switch_to_thread instead or one of its > siblings, because otherwise inferior_thread() would return a thread > unrelated to the changed-to inferior_ptid. That was all (hopefully) > done in previous patches. > > After this, inferior_ptid is mainly used by target backend code. It > is also relied on by a number of target methods. E.g., the > target_resume interface and the memory reading routines -- we still > need it there because we need to be able to access memory off of > processes for which we don't have a corresponding inferior/thread > object, like when handling forks. Maybe we could pass down a context > explicitly to target_read_memory, etc. > > gdb/ChangeLog: > yyyy-mm-dd Pedro Alves > > * gdbthread.h (delete_thread, delete_thread_silent) > (find_thread_ptid): Update comments. > * thread.c (current_thread_): New global. > (is_current_thread): Move higher, and reimplement. > (inferior_thread): Reimplement. > (set_thread_exited): Use bool. Add assertions. > (add_thread_silent): Simplify thread-reuse handling by always > calling delete_thread. > (delete_thread): Remove intro comment. > (find_thread_ptid): Skip exited threads. > (switch_to_thread_no_regs): Write to current_thread_. > (switch_to_no_thread): Check CURRENT_THREAD_ instead of > INFERIOR_PTID. Clear current_thread_. I checked the rest of the series, I didn't find any obvious problem. I have a patch [1] that I might consider sending upstream some day, which replaces the inferior thread lists with maps, using the ptid as the key. This was made for targets that have a lot of threads (like, thousands), where looking up a thread object from a ptid would take a bit of time, and that would add up. If there can be two threads in the list with the same ptid, that won't work with a map using the ptid as the key. So I was wondering, when we want to delete a thread but its refcount is not 0, does it need to stay in the thread list? What if we remove it from the list, and whatever decrements its refcount to 0 deletes it? Do you think that could work? It's possible, however, that with your series, the number of times we look up a thread from its ptid is reduced enough that it makes my patch not really useful. Simon [1] https://github.com/simark/binutils-gdb/commits/inferior-thread-map