From: Thiago Jung Bauermann
To: Pedro Alves
Cc: gdb-patches@sourceware.org, Christophe Lyon, Luis Machado
Subject: Re: [PATCH v3 0/5] Add support for AArch64 MOPS instructions
In-Reply-To: <83aee3f3-4f3f-4009-94cc-99d53318206b@palves.net>
 (Pedro Alves's message of "Fri, 10 May 2024 15:16:52 +0100")
References: <20240510052408.2173579-1-thiago.bauermann@linaro.org>
 <83aee3f3-4f3f-4009-94cc-99d53318206b@palves.net>
User-Agent: mu4e 1.12.4; emacs 29.3
Date: Tue, 21 May 2024 19:18:23 -0300
Message-ID:
 <87v836u7v4.fsf@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain

Pedro Alves writes:

> On 2024-05-10 06:24, Thiago Jung Bauermann wrote:
>> Hello,
>>
>> This version is to adapt to Luis' clarification that MOPS instructions
>> don't need to be treated as atomic sequences and can be single-stepped.
>> If the OS reschedules the inferior to a different CPU while a main or
>> epilogue instruction is executed, it will reset the sequence back to the
>> prologue instruction.
>
> Curious -- if you single step each of the instructions, then there will
> be kernel code executed on the CPU in between each of the instructions
> in the sequence, and other userspace code (of other tasks too, like the
> debugger itself, potentially).  So the kernel is free to context switch
> in between the instructions in the sequence, and _only_ restarts the sequence
> when the task is moved to another CPU?  Weird that it can context switch
> without losing state on the same CPU but not to a different CPU.

The kernel commits implementing this behaviour actually have a good
explanation on this:

$ git log --reverse -n2 8cd076a67dc8
commit 8536ceaa747174ded7983f13906b225e0c33ac51
Author:     Kristina Martsenko
AuthorDate: Tue May 9 15:22:31 2023 +0100
Commit:     Catalin Marinas
CommitDate: Mon Jun 5 17:05:41 2023 +0100

    arm64: mops: handle MOPS exceptions

    The memory copy/set instructions added as part of FEAT_MOPS can take an
    exception (e.g. page fault) part-way through their execution and resume
    execution afterwards.

    If however the task is re-scheduled and execution resumes on a different
    CPU, then the CPU may take a new type of exception to indicate this.
    This is because the architecture allows two options (Option A and Option
    B) to implement the instructions and a heterogeneous system can have
    different implementations between CPUs.

    In this case the OS has to reset the registers and restart execution
    from the prologue instruction. The algorithm for doing this is provided
    as part of the Arm ARM.

    Add an exception handler for the new exception and wire it up for
    userspace tasks.

    Reviewed-by: Catalin Marinas
    Signed-off-by: Kristina Martsenko
    Link: https://lore.kernel.org/r/20230509142235.3284028-8-kristina.martsenko@arm.com
    Signed-off-by: Catalin Marinas

commit 8cd076a67dc8eac5d613b3258f656efa7a54412e
Author:     Kristina Martsenko
AuthorDate: Tue May 9 15:22:32 2023 +0100
Commit:     Catalin Marinas
CommitDate: Mon Jun 5 17:05:41 2023 +0100

    arm64: mops: handle single stepping after MOPS exception

    When a MOPS main or epilogue instruction is being executed, the task may
    get scheduled on a different CPU and restart execution from the prologue
    instruction. If the main or epilogue instruction is being single stepped
    then it makes sense to finish the step and take the step exception
    before starting to execute the next (prologue) instruction. So
    fast-forward the single step state machine when taking a MOPS exception.

    This means that if a main or epilogue instruction is single stepped with
    ptrace, the debugger will sometimes observe the PC moving back to the
    prologue instruction. (As already mentioned, this should be rare as it
    only happens when the task is scheduled to another CPU during the step.)

    This also ensures that perf breakpoints count prologue instructions
    consistently (i.e. every time they are executed), rather than skipping
    them when there also happens to be a breakpoint on a main or epilogue
    instruction.

    Acked-by: Catalin Marinas
    Signed-off-by: Kristina Martsenko
    Link: https://lore.kernel.org/r/20230509142235.3284028-9-kristina.martsenko@arm.com
    Signed-off-by: Catalin Marinas

> But then again, I have no idea what the instructions themselves do. :-)

From the Arm ARM:

  "CPYP performs some preconditioning of the arguments suitable for using
  the CPYM instruction, and performs an IMPLEMENTATION DEFINED amount of
  the memory copy. CPYM performs an IMPLEMENTATION DEFINED amount of the
  memory copy. CPYE performs the last part of the memory copy."

Ditto for other kinds of prologue, main and epilogue instructions.

I would say that the prologue instruction copies some poorly aligned
bytes at the beginning of the memory region, then the main instruction
copies the memory in chunks that are convenient for the processor
implementation, then the epilogue instruction copies the remaining poorly
aligned bytes at the end.

-- 
Thiago
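
P.S. To make the prologue/main/epilogue split described above a bit more
concrete, here is a rough software model in C. It is only a sketch of my
reading of the three phases, not how any CPU implements them: the amount
each instruction copies is IMPLEMENTATION DEFINED, so the 16-byte CHUNK
is an assumption, and struct copy_state and the model_* functions are
hypothetical names that merely stand in for the destination, source and
size registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed "convenient" chunk size; real hardware is IMPLEMENTATION DEFINED. */
#define CHUNK 16

/* Hypothetical stand-in for the destination, source and size registers. */
struct copy_state {
    unsigned char *dst;
    const unsigned char *src;
    size_t n;
};

/* Prologue: copy leading bytes until dst is CHUNK-aligned or nothing is left. */
static void model_prologue(struct copy_state *s)
{
    while (s->n > 0 && ((uintptr_t)s->dst % CHUNK) != 0) {
        *s->dst++ = *s->src++;
        s->n--;
    }
}

/* Main: copy as many whole chunks as remain. */
static void model_main(struct copy_state *s)
{
    while (s->n >= CHUNK) {
        memcpy(s->dst, s->src, CHUNK);
        s->dst += CHUNK;
        s->src += CHUNK;
        s->n -= CHUNK;
    }
}

/* Epilogue: copy the tail that is shorter than one chunk. */
static void model_epilogue(struct copy_state *s)
{
    while (s->n > 0) {
        *s->dst++ = *s->src++;
        s->n--;
    }
}

/* Run the three phases back to back, like a prologue/main/epilogue sequence. */
static void model_copy(void *dst, const void *src, size_t n)
{
    struct copy_state s = { dst, src, n };

    model_prologue(&s);
    model_main(&s);
    model_epilogue(&s);
    assert(s.n == 0); /* the epilogue always finishes the copy */
}
```

Each phase leaves dst, src and n describing exactly what remains to be
copied, so in this model stopping between two phases (as a single step
does) loses nothing.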