From: Kevin Buettner
To: gdb-patches@sourceware.org
Subject: [PATCH 2/2] glibc-2.34: Fix internal error when running gdb.base/gdb-sigterm.exp
Date: Fri, 9 Jul 2021 22:51:29 -0400
Message-Id: <20210710025129.201884-3-kevinb@redhat.com>
In-Reply-To: <20210710025129.201884-1-kevinb@redhat.com>
References: <20210710025129.201884-1-kevinb@redhat.com>

This commit fixes an internal error that occurs when running
gdb.base/gdb-sigterm.exp on a machine with a prerelease of glibc-2.34.

This test case, gdb.base/gdb-sigterm.exp, turns on infrun debugging
(via "set debug infrun 1") and then attempts to do a step at the source
line "for (;;);".  Of course, this is an infinite loop, so there's no
way to get to the next line.  When run by hand, GDB busily prints out
the infrun debugging log messages without end.  After a while, a
SIGTERM is sent from an external source - when debugging by hand, I'm
sending it from another terminal, but when running the test, the test
case sends a SIGTERM shortly after one of the infrun debug messages is
seen.  Regardless, GDB is expected to shut down cleanly.  The test is
run 50 times; if it shuts down cleanly after 50 iterations, the test
passes; otherwise it's supposed to fail.

When running the test by hand, I'm doing:

  file testsuite/outputs/gdb.base/gdb-sigterm/gdb-sigterm
  set height 0
  set width 0
  b main
  run
  tbreak 29
  continue
  set range-stepping off
  set debug infrun 1
  step

At this point, GDB will start generating copious amounts of log
output.  I then send a SIGTERM to the gdb process from another
terminal.  The internal error occurs roughly once in every five tries.

The backtrace shows that a double internal error occurs - GDB detects
this recursion and bails out on the second one.  The first internal
error is caused by the assert in inferior_thread() in thread.c.
Further up the stack, I see:

  #34 0x0000000000756eaa in do_target_wait_1 (inf=0x1603a60, ptid=...,
      status=0x7ffca2d42578, options=...)
      at worktree-glibc234/gdb/infrun.c:3663

It's calling target_wait() at this point; there is a call chain
eventually leading to inferior_thread() where the assert occurs.
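
For readers who don't have thread.c open, the failure mode can be
modeled outside GDB.  The sketch below is only an illustration, not
GDB's code: thread_info is reduced to a hypothetical one-field struct,
inferior_thread() is a stand-in that asserts on a current-thread
pointer (named current_thread_ here to match the variable discussed in
this message), and pick_thread() is a made-up caller that checks the
pointer before using the asserting accessor.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; these are not GDB's real definitions.
struct thread_info
{
  int num;
};

// Models the "current thread" pointer; nullptr means no thread is selected.
static thread_info *current_thread_ = nullptr;

// Models an accessor that asserts that a current thread exists.
static thread_info *
inferior_thread ()
{
  assert (current_thread_ != nullptr);
  return current_thread_;
}

// Hypothetical caller: prefer the current thread when there is one,
// otherwise return FALLBACK instead of tripping the assert.
static thread_info *
pick_thread (thread_info *fallback)
{
  if (current_thread_ != nullptr)
    return inferior_thread ();
  return fallback;
}
```

In this model, calling inferior_thread() while current_thread_ is
nullptr aborts, whereas pick_thread() returns its fallback argument.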

One of the interesting things is that the call chain goes through the
QUIT mechanism (i.e. a call to maybe_quit()) in target_read().  If
global sync_quit_force_run is non-zero - which will be the case after
GDB receives a SIGTERM signal - a path through various calls is taken,
eventually culminating in the assert / internal error.

Now for the important bit: At the beginning of do_target_wait_1() is
the following code:

  /* We know that we are looking for an event in the target of inferior
     INF, but we don't know which thread the event might come from.  As
     such we want to make sure that INFERIOR_PTID is reset so that none of
     the wait code relies on it - doing so is always a mistake.  */
  switch_to_inferior_no_thread (inf);

This call causes the variable being tested by the assert to be NULL
(actually nullptr), which ultimately triggers the assert.  So... it
does seem that setting the variable in question (current_thread_) to
nullptr is intentional, but that path to the triggered assert was not
anticipated.

The reason that the problem occurs with glibc-2.34 is that
libthread_db will always be loaded now; thus the wait() machinery
found in linux-thread-db.c is used instead of linux_nat_target::wait
(which is found in linux-nat.c).

gdb/ChangeLog:

	* thread.c (any_thread_of_inferior): Don't call inferior_thread()
	when there is no current thread.
---
 gdb/thread.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gdb/thread.c b/gdb/thread.c
index f850f05ad48..3fe81e810c0 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -638,7 +638,8 @@ any_thread_of_inferior (inferior *inf)
   gdb_assert (inf->pid != 0);
 
   /* Prefer the current thread, if there's one.  */
-  if (inf == current_inferior () && inferior_ptid != null_ptid)
+  if (inf == current_inferior () && inferior_ptid != null_ptid
+      && !is_current_thread (nullptr))
     return inferior_thread ();
 
   for (thread_info *tp : inf->non_exited_threads ())
-- 
2.32.0