From: Mark Wielaard
To: Andrew Cagney
Cc: frysk@sourceware.org
Subject: Roundtable, breakpoints and lots of unwinding (Was: meeting 2007-08-15 9:30 us east coast time)
Date: Tue, 21 Aug 2007 09:21:00 -0000
Message-Id: <1187688075.3852.32.camel@dijkstra.wildebeest.org>
In-Reply-To: <46C2FDB9.2090800@redhat.com>

Hi Andrew,

That was a nice summary and overview of the things people are working
on. Thanks. Maybe we can add a wiki somewhere to keep these overviews
up to date. Here are some additions and details on my current work.

> mjw: bug fixes for stepping;

Low-level stepping of breakpoints in particular. This all started with
the demo of TestUpdatingDisplayValue, which in the end produced a
reproducer for
http://sourceware.org/bugzilla/show_bug.cgi?id=4747
which was just one of the ways in which breakpoint stepping could break
while a signal becomes pending.
This is especially nasty when doing out-of-line stepping.

There is now a low-level instruction stepping test framework:
http://sourceware.org/bugzilla/show_bug.cgi?id=4763
http://sourceware.org/ml/frysk/2007-q3/msg00118.html

And there is now a low-level test framework for testing signals being
raised while breakpoint stepping (except for user-installed signal
handlers, for now):
http://sourceware.org/ml/frysk/2007-q3/msg00301.html

The work by Petr on fltrace exposed some extra issues, and Petr also
helped add some extra tests. It took more time than I had originally
thought, but I believe that with the above the code is a lot more
robust and better supports what Petr is doing. In particular, the
following bugs have now been closed:
http://sourceware.org/bugzilla/show_bug.cgi?id=4889
http://sourceware.org/bugzilla/show_bug.cgi?id=4894

There are still 3 issues I know of with low-level stepping of
breakpoints (none of which I am actively working on, but I expect
fltrace to hit them, so when that work progresses we might want to go
back to these issues):

http://sourceware.org/bugzilla/show_bug.cgi?id=4847
Stepping trap instructions (in particular inside a trap signal handler)
is broken (again) on newer kernels. Stepping of a trap instruction and
stepping of the trap handler need to be fully special cased. And the
kernel doesn't help, because it seems to change behavior every other
release. Not working on this until someone finds a real use case for
it.

http://sourceware.org/bugzilla/show_bug.cgi?id=4762
We don't have a real instruction parser (x86/x86_64) for the
single-stepping-out-of-line framework (see IA32InstructionParser),
which means that for almost all instructions we are actually using
reset breakpoint stepping, which misses breakpoints when used with
multiple threads.
Not currently working on this one either, but I suspect that the new
fltrace work that Petr is doing will soon hit this limitation, at which
time we should either finish the instruction parser or introduce
stop-the-world stepping to make things more robust.

http://sourceware.org/bugzilla/show_bug.cgi?id=4895
Low-level breakpoints are visible to all Tasks, not just the Task that
requested them. This seemed to be a good idea back when they were
added, since low-level breakpoints are essentially Proc based, but it
confuses some users because they have to check the Task argument in
their updateHit() handler. The workaround is easy, so this is not very
important, but it would be better if this was cleaned up.

> support for .debug_frame in libunwind
> for instance, lesson the need for asynchronous-unwind-tables in
> .eh_frame by using .debug_frame when available

The breakpoints took some more time than I had anticipated, so I am
still working on this bit. Since there are a lot of layers to unwinding
and the documentation is scattered all over the place, here is a
summary of unwinding as I found it, both to document my own research
and to structure the work a bit. If any of the below is wrong, please
let me know.

Unwinding the call stack used to be something only a debugger would do,
and it relied on the executable having a frame pointer in a dedicated
register that points to the bottom of the stack frame for the current
function, which also contained the return address [1]. Having a frame
pointer allows you to quickly walk the call stack and get all the
return addresses; if you can map those to the names of the functions
they are in, you have a nice backtrace for the user. If you want to get
more of the state, then you could rely on each function having a
prologue and epilogue that saved and restored the registers [2] of the
caller.
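
To make the frame-pointer walk described above concrete, here is a
minimal sketch in Python over a simulated stack. It assumes an
x86-style frame layout (the saved frame pointer of the caller at [fp],
the return address one word above it), 64-bit little-endian words and
a stack that grows down. All names in it (walk_frame_pointers,
build_demo_stack, the addresses) are made up for illustration; none of
this is frysk or libunwind code.

```python
import struct

WORD = 8  # assumed: 64-bit little-endian words


def read_word(memory, base, addr):
    """Read one word at absolute address addr from a memory snapshot."""
    return struct.unpack_from("<Q", memory, addr - base)[0]


def walk_frame_pointers(memory, base, pc, fp, max_frames=64):
    """Follow the chain of saved frame pointers, innermost frame first.

    Assumed layout: [fp] holds the caller's frame pointer and
    [fp + WORD] holds the return address into the caller.
    """
    pcs = [pc]
    end = base + len(memory)
    while len(pcs) < max_frames:
        # Stop when the frame record would fall outside the snapshot.
        if fp < base or fp + 2 * WORD > end:
            break
        saved_fp = read_word(memory, base, fp)
        return_address = read_word(memory, base, fp + WORD)
        if return_address == 0:
            break  # assumed convention: outermost frame has no caller
        pcs.append(return_address)
        # The stack grows down, so a caller's frame must sit at a
        # higher address; anything else means a corrupt chain.
        if saved_fp <= fp:
            break
        fp = saved_fp
    return pcs


def build_demo_stack():
    """Build a hypothetical three-frame stack snapshot at 0x1000."""
    base = 0x1000
    memory = bytearray(0x60)
    frames = [
        (0x1010, 0x1030, 0x400200),  # innermost frame
        (0x1030, 0x1050, 0x400100),  # its caller
        (0x1050, 0, 0),              # outermost frame, end of chain
    ]
    for fp, saved_fp, return_address in frames:
        struct.pack_into("<QQ", memory, fp - base, saved_fp, return_address)
    return bytes(memory), base
```

Calling walk_frame_pointers(memory, base, pc=0x400300, fp=0x1010) on
the demo stack yields the current pc followed by the return addresses
0x400200 and 0x400100. Note that this only recovers addresses; it says
nothing about where the other registers of each caller were saved.
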
Given a calling convention for a particular architecture, you could
use these prologues and epilogues to reliably find the original
registers on the stack, which in turn, with some debug info, would give
you the values of variables and arguments of the functions on the call
stack.

Unfortunately compilers got smart, and optimized code might not keep a
frame pointer (which frees up one more register) and might reschedule
the function prologue and epilogue instructions between the other
instructions in the function. All of this makes it pretty hard for an
unwinder to reconstruct the previous call frames on the stack. In
particular, x86_64 does away with a standard frame pointer. You can
still get some information back by conservatively approximating the
instructions in the function and guessing at the actual way the
various registers are stored [3], but this becomes pretty messy pretty
quickly.

To help debuggers still get all the information needed to unwind a
stack and restore all needed registers, the debugging information
(DWARF) generated by compilers was extended to include Call Frame
Information (CFI) [4], which allows a debugger to reconstruct the
calling pc and registers of a function. This information is stored in
the .debug_frame section of an ELF file. It uses a simplified version
of the DWARF instructions (not all operands are relevant for
reconstructing the registers). This section is not guaranteed to be
available, it is not necessarily loaded into memory, and in some
distributions it can even be split off into its own debug info file.

At the same time different languages got constructs (exceptions,
continuations, global gotos, asynchronous garbage collectors, etc.)
which required some sort of reliable unwinding (and in some cases
rewinding) of the call stack. Since some optimizations and some newer
architectures also did away with a standard frame pointer, another way
to reliably unwind the stack was needed.
This became the exception handler framework (eh_frame), which is based
on the DWARF CFI work but is slightly different. Unfortunately nobody
seems to have documented the precise differences between the formats,
so you will have to carefully read both the DWARF standard and the LSB
core specification on Exception Frames [5] side by side. Note that a
debugger that wants to walk a stack and recover all registers might
need more information than some of these language constructs, which
might only need unwind information for specific call sites. Depending
on optimizations, architecture and language compiled (and sometimes
specific distribution default choices), no, partial or full exception
handler unwind information and/or frame pointers are generated (see
the GCC options -funwind-tables and -fasynchronous-unwind-tables [6]).

Both the DWARF and the exception handler specs are architecture
neutral. But since you do need a mapping between the actual registers
and the specs, you also need to consult the relevant architecture ABI
that defines the actual mapping. Sometimes these architecture ABI specs
also define some DWARF/EH extensions. See for example the x86_64 ABI
spec [7]. Note that in practice what gcc generates overrides any of the
above specs, and if a discrepancy is found the spec usually gets
updated [8]. One should also be careful about bugs in the old DWARF2
spec [9] and about the extensions of DWARF specified by the LSB [10]
(which mostly augment DWARF2 to be like DWARF3, at least for the
exception handler sections).

If an .eh_frame section is available in an ELF file, it is guaranteed
to be loaded in memory. But depending on the architecture and the
language being compiled, it might not be available at all (and neither
might the frame pointer or the .debug_frame section).

So with that background, the work to be done consists of the
following:

- libunwind officially only supports the .eh_frame format, so it will
  have to be extended to also support the .debug_frame format.
  Luckily the differences, although very poorly documented, don't seem
  to be that large.

- libunwind has its own CFI EH/DWARF parser but doesn't come with an
  interface to feed/read the CFI information directly. This is the
  Gget_unwind_table.c (unw_get_unwind_table) support that Nurdin added,
  but which isn't upstream yet. Pushing this upstream would be very
  beneficial, since then we could use pure upstream in frysk. But see
  the next point: maybe the proposed interface should be changed a
  little.

- Currently we hook into this new unw_get_unwind_table through
  UnwindH.hxx (createProcInfoFromElfImage). This is called indirectly
  through the libunwind find_proc_info callback, which wants to see the
  unwind_info filled in. The ElfImage used is created in
  UnwindAddressSpace.findProcInfo() through the private method
  getElfImage(long address), by getting the MemoryMap of the address
  from the Task and then either mapping the map from the ELF file into
  memory or, if the section is the VDSO, creating an anonymous mmap and
  filling that by reading the address map. The result is then passed to
  the libunwind DWARF reader. Directly mmapping these sections seems
  wrong here, since the sections should already be available through
  the memory buffers of the proc we are inspecting (which might already
  have mapped in those sections). So it would be better to use the
  libunwind address space accessors that go through the ByteBuffers
  for this as well. This might mean another change in the libunwind
  interface so that all remote memory accesses go through the same
  hooks (although unw_get_unwind_table already takes an
  unw_address_space as argument, so I might be missing something).

- For the .debug_frame we cannot rely on it being available in the
  target address space (unlike .eh_frame, which always gets loaded),
  and it might not be in the ELF file directly, but in a separate
  debuginfo file. So there we need to locate the section first through
  libdwfl, load it (also through libdwfl?)
  and feed that to libunwind.

Cheers,

Mark

[1] http://en.wikipedia.org/wiki/Frame_pointer
[2] http://en.wikipedia.org/wiki/Function_prologue
[3] http://sourceware.org/gdb/current/onlinedocs/gdbint_3.html#SEC9
[4] http://wiki.dwarfstd.org/ (Dwarf 3 - section 6.4)
[5] http://refspecs.freestandards.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html
[6] http://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html
[7] http://www.x86-64.org/documentation/abi.pdf (Section 3.6 and 3.7)
[8] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32982
[9] http://wiki.dwarfstd.org/index.php?title=DWARF_FAQ#How_big_is_a_DW_FORM_ref_addr.3F
[10] http://refspecs.freestandards.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/dwarfext.html