From: John Baldwin
To: Adrian Oltean, "gdb@sourceware.org"
Subject: Re: Context switch during stepping causes weird behavior
Date: Tue, 4 Oct 2022 09:57:09 -0700

On 10/4/22 5:19 AM, Adrian Oltean via Gdb wrote:
> Hi everyone,
>
> I'm currently facing an issue that occurs while stepping over code running in a
> kernel thread that gets moved by the target OS (Linux) to a different core.
>
> To give you a little bit of background on my setup:
> - I have a custom GDB server able to control ARMv8 targets;
> - I'm using GDB 7.11.1 and GDB 11.1 but seeing the same behavior;
> - I'm running GDB in all-stop mode;
> - A thread from GDB is actually associated to a physical core from target;
> - I'm actually debugging a Linux kernel 5.15 with GDB (a bare-metal debug
>   session but with an extra layer of Python scripts to help control the target
>   Linux kernel).
>
> What is the problem? I have a HW break somewhere inside the initialization
> function of a kernel module. Target stops in the breakpoint but the problem
> I face happens during a step over inside this init sequence. While GDB is performing
> all the step actions (single stepping, range stepping, resuming, setting temp
> breaks, etc.) the Linux kernel decides to move the execution to a different
> core (so a different thread in my debug model). As a result, temp breaks set
> during stepping are hit on a different thread than the one used for initiating
> the step over. This completely messes up the debug session.
> In other words,
> GDB ends up in an infinite loop trying to finish the step over by switching
> to the initial thread, resuming it, setting other breaks that are then hit by
> other threads, resuming from those temp breaks, etc. Also, once GDB loses
> control of the stepping, the target Linux enters the "idle" loop, making GDB's
> job even more complicated when it comes to resuming from pointless breaks
> set during stepping. Note that it has to deal with 16 threads (actual HW cores)
> that constantly loop inside the "idle" subsystem.
>
> I'm attaching below some logs. Maybe some trained eyes can help with some hints
> about how to avoid such an issue. Note that control is lost around address
> 0xffff80000945803c, when target is resumed and the actual kernel thread is moved
> from thread 3 to thread 4. Moreover, addresses 0xffff8000113dfb1c, 0xffff8000100a40a0
> or 0xffff8000113dfb20 are somewhere in the "idle" loop inside the Linux kernel.
>
> Any help would be appreciated.

Interrupts make single-stepping on bare-metal targets or OS kernels
hard. When debugging FreeBSD's kernel via bare-metal stubs (e.g. in
QEMU or in FreeBSD's hypervisor), I generally use 'until' and/or
breakpoints to work around this rather than normal stepping.

On the GDB server side (for the GDB stub in FreeBSD's hypervisor) I've
thought about doing odd things like trying to defer interrupts while
stepping, but there isn't a really good way to deal with that in
general.

(Most of the time when I try to step, I get a stop reported back for a
PC in the handler for the local APIC timer interrupt; GDB then sees
that the PC is "out of range" and just stops at that point.)

-- 
John Baldwin