From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx07-00178001.pphosted.com (mx08-00178001.pphosted.com [91.207.212.93]) by sourceware.org (Postfix) with ESMTPS id BF82B3858C2C for ; Mon, 3 Oct 2022 13:25:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BF82B3858C2C Received: from pps.filterd (m0046661.ppops.net [127.0.0.1]) by mx07-00178001.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2938p4we025949; Mon, 3 Oct 2022 15:25:39 +0200 Received: from beta.dmz-eu.st.com (beta.dmz-eu.st.com [164.129.1.35]) by mx07-00178001.pphosted.com (PPS) with ESMTPS id 3jxcehtfuc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 03 Oct 2022 15:25:39 +0200 Received: from euls16034.sgp.st.com (euls16034.sgp.st.com [10.75.44.20]) by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 04A8D10002A; Mon, 3 Oct 2022 15:25:34 +0200 (CEST) Received: from Webmail-eu.st.com (shfdag1node3.st.com [10.75.129.71]) by euls16034.sgp.st.com (STMicroelectronics) with ESMTP id EFB3C22D178; Mon, 3 Oct 2022 15:25:34 +0200 (CEST) Received: from [10.252.26.129] (10.75.127.120) by SHFDAG1NODE3.st.com (10.75.129.71) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2375.31; Mon, 3 Oct 2022 15:25:31 +0200 Message-ID: <096bac29-0836-ea55-cea2-12d0334bfbd6@foss.st.com> Date: Mon, 3 Oct 2022 15:25:30 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.3.0 Subject: Re: [PATCH v4] gdb/arm: Handle lazy FPU state preservation Content-Language: en-US To: Luis Machado , CC: , , , Yvan ROUX References: <20221002145138.2360215-1-torbjorn.svensson@foss.st.com> From: Torbjorn SVENSSON In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.75.127.120] X-ClientProxiedBy: GPXDAG2NODE4.st.com (10.75.127.68) To SHFDAG1NODE3.st.com (10.75.129.71) X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-10-03_02,2022-09-29_03,2022-06-22_01 X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Oct 2022 13:25:49 -0000 Hi, On 2022-10-03 13:27, Luis Machado wrote: > Hi, > > I think the current mechanism is better than the observer scheme. I agree. Don't know why I did not see this solution from the start. > > On 10/2/22 15:51, Torbjörn SVENSSON wrote: >> Read LSPEN, ASPEN and LSPACT bits from FPCCR and use them together >> with FPCAR to identify if lazy FPU state preservation is active for >> the current frame.  See "Lazy context save of FP state", in B1.5.7, >> also ARM AN298, supported by Cortex-M4F architecture for details on >> lazy FPU register stacking.  The same conditions are valid for other >> Cortex-M cores with FPU. >> >> This patch has been verified on a STM32F4-Discovery board by: >> a) writing a non-zero value (lets use 0x1122334455667788 as an >>     example) to all the D-registers in the main function >> b) configured the SysTick to fire >> c) in the SysTick_Handler, write some other value (lets use >>     0x0022446688aaccee as an example) to one of the D-registers (D0 as >>     an example) and then do "SVC #0" >> d) in the SVC_Handler, write some other value (lets use >>     0x0099aabbccddeeff) to one of the D-registers (D0 as an example) >> >> In GDB, suspend the execution in the SVC_Handler function and compare >> the value of the D-registers for the SVC_handler frame and the >> SysTick_Handler frame.  With the patch, the value of the modified >> D-register (D0) should be the new value (0x009..eff) on the >> SVC_Handler frame, and the intermediate value (0x002..cee) for the >> SysTick_Handler frame.  Now compare the D-register value for the >> SysTick_Handler frame and the main frame.  The main frame should >> have the initial value (0x112..788). >> >> >> Example GDB session: >> >> C:\>c:\dev\gdb\arm-none-eabi-gdb.exe >> C:\Users\foo\STM32Cube\Repository\STM32Cube_FW_F4_V1.27.0\Projects\STM32F4-Discovery\Examples\GPIO\GPIO_EXTI\SW4STM32\STM32F4-Discovery\Debug\STM32F4-Discovery.elf >> GNU gdb (GDB) 13.0.50.20221001-git >> Copyright (C) 2022 Free Software Foundation, Inc. >> License GPLv3+: GNU GPL version 3 or later >> >> This is free software: you are free to change and redistribute it. >> There is NO WARRANTY, to the extent permitted by law. >> Type "show copying" and "show warranty" for details. >> This GDB was configured as "--host=x86_64-w64-mingw32 >> --target=arm-none-eabi". >> Type "show configuration" for configuration details. >> For bug reporting instructions, please see: >> . >> Find the GDB manual and other documentation resources online at: >>      . >> >> For help, type "help". >> Type "apropos word" to search for commands related to "word"... >> Reading symbols from >> C:\Users\foo\STM32Cube\Repository\STM32Cube_FW_F4_V1.27.0\Projects\STM32F4-Discovery\Examples\GPIO\GPIO_EXTI\SW4STM32\STM32F4-Discovery\Debug\STM32F4-Discovery.elf... >> (gdb) set pagination off >> (gdb) set confirm off >> (gdb) define lazy_dump >> Type commands for definition of "lazy_dump". >> End with a line saying just "end". >>>   shell echo >>>   shell echo "Content at $arg0" >>>   set $i = 0 >>>   while $i <= $arg1 >>   >    f $i >>   >    p/x $d0 >>   >    p/x $d1 >>   >    shell echo >>   >    set $i = $i + 1 >>   >  end >>>   shell echo >>> end >> (gdb) >> (gdb) target remote :61234 >> Remote debugging using :61234 >> Reset_Handler () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/SW4STM32\startup_stm32f407xx.s:63 >> 63        ldr   sp, =_estack     /* set stack pointer */ >> (gdb) b main >> Breakpoint 1 at 0x80015cc: file >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/main.c, line 63. >> Note: automatically using hardware breakpoints for read-only addresses. >> (gdb) b SysTick_Handler >> Breakpoint 2 at 0x8001854: file >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c, line 150. >> (gdb) b SVC_Handler >> Breakpoint 3 at 0x8001808: file >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c, line 112. >> (gdb) c >> Continuing. >> >> Breakpoint 1, main () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/main.c:63 >> 63        HAL_Init(); >> (gdb) lazy_dump main 0 >> >> Content at main >> #0  main () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/main.c:63 >> 63        HAL_Init(); >> $1 = 0x0 >> $2 = 0x0 >> >> >> (gdb) c >> Continuing. >> >> Breakpoint 2, SysTick_Handler () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c:150 >> 150       HAL_IncTick(); >> (gdb) lazy_dump "SysTick_Handler before vmov" 2 >> >> Content at "SysTick_Handler before vmov" >> #0  SysTick_Handler () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c:150 >> 150       HAL_IncTick(); >> $3 = 0x1122334455667788 >> $4 = 0x1122334455667788 >> >> #1  >> $5 = 0x1122334455667788 >> $6 = 0x1122334455667788 >> >> #2  main () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/main.c:120 >> 120       while (1) >> $7 = 0x1122334455667788 >> $8 = 0x1122334455667788 >> >> >> (gdb) c >> Continuing. >> >> Breakpoint 3, SVC_Handler () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c:112 >> 112               asm("push {r0, r1}"); >> (gdb) lazy_dump "SVC_Handler before vmov" 4 >> >> Content at "SVC_Handler before vmov" >> #0  SVC_Handler () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c:112 >> 112               asm("push {r0, r1}"); >> $9 = 0x22446688aaccee >> $10 = 0x22446688aaccee >> >> #1  >> $11 = 0x22446688aaccee >> $12 = 0x22446688aaccee >> >> #2  SysTick_Handler () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c:162 >> 162     } >> $13 = 0x22446688aaccee >> $14 = 0x22446688aaccee >> >> #3  >> $15 = 0x22446688aaccee >> $16 = 0x22446688aaccee >> >> #4  main () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/main.c:120 >> 120       while (1) >> $17 = 0x1122334455667788 >> $18 = 0x1122334455667788 >> >> >> (gdb) s 8 >> 122       HAL_GPIO_EXTI_Callback(KEY_BUTTON_PIN); >> (gdb) lazy_dump "SVC_Handler after vmov" 4 >> >> Content at "SVC_Handler after vmov" >> #0  SVC_Handler () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c:122 >> 122       HAL_GPIO_EXTI_Callback(KEY_BUTTON_PIN); >> $19 = 0x99aabbccddeeff >> $20 = 0x99aabbccddeeff >> >> #1  >> $21 = 0x99aabbccddeeff >> $22 = 0x99aabbccddeeff >> >> #2  SysTick_Handler () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/stm32f4xx_it.c:162 >> 162     } >> $23 = 0x22446688aaccee >> $24 = 0x22446688aaccee >> >> #3  >> $25 = 0x22446688aaccee >> $26 = 0x22446688aaccee >> >> #4  main () at >> C:/Users/foo/STM32Cube/Repository/STM32Cube_FW_F4_V1.27.0/Projects/STM32F4-Discovery/Examples/GPIO/GPIO_EXTI/Src/main.c:120 >> 120       while (1) >> $27 = 0x1122334455667788 >> $28 = 0x1122334455667788 >> >> >> >> >> >> Signed-off-by: Torbjörn SVENSSON  >> Signed-off-by: Yvan ROUX  >> --- >>   gdb/arch/arm.h |  7 ++++++- >>   gdb/arm-tdep.c | 51 ++++++++++++++++++++++++++++++++++---------------- >>   2 files changed, 41 insertions(+), 17 deletions(-) >> >> diff --git a/gdb/arch/arm.h b/gdb/arch/arm.h >> index 36757493406..d384b952144 100644 >> --- a/gdb/arch/arm.h >> +++ b/gdb/arch/arm.h >> @@ -115,7 +115,12 @@ enum system_register_address : CORE_ADDR >>     /* M-profile Floating-Point Context Control Register address, >> defined in >>        ARMv7-M (Section B3.2.2) and ARMv8-M (Section D1.2.99) reference >>        manuals.  */ >> -  FPCCR = 0xe000ef34 >> +  FPCCR = 0xe000ef34, >> + >> +  /* M-profile Floating-Point Context Address Register address, >> defined in >> +     ARMv7-M (Section B3.2.2) and ARMv8-M (Section D1.2.98) reference >> +     manuals.  */ >> +  FPCAR = 0xe000ef38 >>   }; >>   /* Instruction condition field values.  */ >> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c >> index 2810232fcb8..b08ee096f97 100644 >> --- a/gdb/arm-tdep.c >> +++ b/gdb/arm-tdep.c >> @@ -3588,27 +3588,43 @@ arm_m_exception_cache (struct frame_info >> *this_frame) >>         if (extended_frame_used) >>       { >>         ULONGEST fpccr; >> +      ULONGEST fpcar; >>         /* Read FPCCR register.  */ >>         gdb_assert (safe_read_memory_unsigned_integer (FPCCR, >>                                ARM_INT_REGISTER_SIZE, >>                                byte_order, &fpccr)); >> -      bool fpccr_ts = bit (fpccr, 26); >> -      /* This code does not take into account the lazy stacking, see >> "Lazy >> -         context save of FP state", in B1.5.7, also ARM AN298, supported >> -         by Cortex-M4F architecture. >> -         To fully handle this the FPCCR register (Floating-point Context >> -         Control Register) needs to be read out and the bits ASPEN and >> -         LSPEN could be checked to setup correct lazy stacked FP >> registers. >> -         This register is located at address 0xE000EF34.  */ >> +      /* Read FPCAR register.  */ >> +      gdb_assert (safe_read_memory_unsigned_integer (FPCAR, >> +                             ARM_INT_REGISTER_SIZE, >> +                             byte_order, &fpcar)); > > I'm not sure about this assertion. Can we handle this gracefully without > crashing GDB? How do you want it to be handled "gracefully"? IMO, it's better to have GDB do a controlled "crash" than continuing and potentially showing some register content that does not align with the current target state. An alternative is to simply abort the unwinding by calling error() like we do for the FNC_RETURN pattern if tdep->have_sec_ext is false. Would this bee a good compromise? If so, what should the message contain? > >> +      bool fpccr_aspen = bit (fpccr, 31); >> +      bool fpccr_lspen = bit (fpccr, 30); >> +      bool fpccr_ts = bit (fpccr, 26); >> +      bool fpccr_lspact = bit (fpccr, 0); >> + >> +      /* The LSPEN and ASPEN bits indicate if the lazy state >> preservation >> +         for FP registers is enabled or disabled.  The LSPACT bit >> indicate, >> +         together with FPCAR, if the lazy state preservation feature is >> +         active for the current frame or for another frame. >> +         See "Lazy context save of FP state", in B1.5.7, also ARM AN298, >> +         supported by Cortex-M4F architecture for details.  */ >> +      bool fpcar_points_to_this_frame = ((unwound_sp + sp_r0_offset + >> 0x20) > > I'd define "addr = unwound_sp + sp_r0_offset + 0x20" and use it here... I thought of doing it like that, but I decided that sp_s0_offset was so similar to sp_r0_offset that I ended up with the full expression instead. Also, the offsets sort of points back to the ASCII art that explains the stack frame layout and using a different number might make it hard to see the connection. > >> +                         == (fpcar & ~0x7)); >> +      bool read_fp_regs_from_stack = (!(fpccr_aspen && fpccr_lspen >> +                        && fpccr_lspact >> +                        && fpcar_points_to_this_frame)); >>         /* Extended stack frame type used.  */ >> -      CORE_ADDR addr = unwound_sp + sp_r0_offset + 0x20; >> -      for (int i = 0; i < 8; i++) >> +      if (read_fp_regs_from_stack) >>           { >> -          cache->saved_regs[ARM_D0_REGNUM + i].set_addr (addr); >> -          addr += 8; >> +          CORE_ADDR addr = unwound_sp + sp_r0_offset + 0x20; > > ... and here. > >> +          for (int i = 0; i < 8; i++) >> +        { >> +          cache->saved_regs[ARM_D0_REGNUM + i].set_addr (addr); >> +          addr += 8; >> +        } >>           } >>         cache->saved_regs[ARM_FPSCR_REGNUM].set_addr (unwound_sp >>                               + sp_r0_offset + 0x60); >> @@ -3617,11 +3633,14 @@ arm_m_exception_cache (struct frame_info >> *this_frame) >>             && fpccr_ts) >>           { >>             /* Handle floating-point callee saved registers.  */ >> -          addr = unwound_sp + sp_r0_offset + 0x68; >> -          for (int i = 8; i < 16; i++) >> +          if (read_fp_regs_from_stack) >>           { >> -          cache->saved_regs[ARM_D0_REGNUM + i].set_addr (addr); >> -          addr += 8; >> +          CORE_ADDR addr = unwound_sp + sp_r0_offset + 0x68; > > Then adjust addr here if we are using a different address. > >> +          for (int i = 8; i < 16; i++) >> +            { >> +              cache->saved_regs[ARM_D0_REGNUM + i].set_addr (addr); >> +              addr += 8; >> +            } >>           } >>             arm_cache_set_active_sp_value (cache, tdep, > > Otherwise LGTM. > > Thanks for putting this together. Let me know what you think about the replies above. Kind regards, Torbjörn