* [PATCH 0/4] Fix BZ 25631 - core file memory access problem
@ 2020-03-05  0:43 Kevin Buettner
  0 siblings, 5 replies; 19+ messages in thread

From: Kevin Buettner @ 2020-03-05  0:43 UTC (permalink / raw)
  To: gdb-patches; +Cc: Kevin Buettner

This series fixes a bug with accessing memory described by core files.
The bug can be viewed here...

https://sourceware.org/bugzilla/show_bug.cgi?id=25631

...though I also provide similar details in part 3 of this series.

Kevin Buettner (4):
  Remove hack for GDB which sets the section size to 0
  Add function for partitioning/splitting a section table
  Provide access to non SEC_HAS_CONTENTS core file sections
  Test ability to access unwritten-to mmap data in core file

 bfd/elf.c                           |  8 ---
 gdb/corelow.c                       | 80 ++++++++++++++++++++++++++---
 gdb/exec.c                          | 55 ++++++++++++++++++++
 gdb/exec.h                          | 13 +++++
 gdb/testsuite/gdb.base/corefile.exp |  6 +++
 gdb/testsuite/gdb.base/coremaker.c  | 10 ++++
 6 files changed, 157 insertions(+), 15 deletions(-)

-- 
2.24.1

^ permalink raw reply	[flat|nested] 19+ messages in thread
* [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections
  2020-03-05  0:43 [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner
@ 2020-03-05  0:43 ` Kevin Buettner
  2020-03-25 17:25   ` Keith Seitz
  2020-03-29 13:18   ` Pedro Alves

From: Kevin Buettner @ 2020-03-05  0:43 UTC (permalink / raw)
  To: gdb-patches; +Cc: Kevin Buettner

Consider the following program:

Change-Id: I1adbb4e9047baad7cae7eab9c72e6d2b16f87d73

--- mkmmapcore.c ---

#include <stdlib.h>
#include <sys/mman.h>

static char *buf;

int
main (int argc, char **argv)
{
  buf = mmap (NULL, 8192, PROT_READ | PROT_WRITE,
	      MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
  abort ();
}
--- end mkmmapcore.c ---

Compile it like this:

gcc -g -o mkmmapcore mkmmapcore.c

Now let's run it from GDB.  I've already placed a breakpoint on the
line with the abort() call and have run to that breakpoint.

Breakpoint 1, main (argc=1, argv=0x7fffffffd678) at mkmmapcore.c:11
11	  abort ();
(gdb) x/x buf
0x7ffff7fcb000:	0x00000000

Note that we can examine the memory allocated via the call to mmap().

Now let's try debugging a core file created by running this program.
Depending on your system, in order to make a core file, you may have
to run the following as root (or using sudo):

echo core > /proc/sys/kernel/core_pattern

It may also be necessary to do:

ulimit -c unlimited

I'm using Fedora 31.  YMMV if you're using one of the BSDs or some
other (non-Linux) system.

This is what things look like when we debug the core file:

[kev@f31-1 tmp]$ gdb -q ./mkmmapcore core.304767
Reading symbols from ./mkmmapcore...
[New LWP 304767]
Core was generated by `/tmp/mkmmapcore'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	  return ret;
(gdb) x/x buf
0x7ffff7fcb000:	Cannot access memory at address 0x7ffff7fcb000

Note that we can no longer access the memory region allocated by mmap().

Back in 2007, a hack for GDB was added to
_bfd_elf_make_section_from_phdr() in bfd/elf.c:

	  /* Hack for gdb.  Segments that have not been modified do
	     not have their contents written to a core file, on the
	     assumption that a debugger can find the contents in the
	     executable.  We flag this case by setting the fake
	     section size to zero.  Note that "real" bss sections will
	     always have their contents dumped to the core file.  */
	  if (bfd_get_format (abfd) == bfd_core)
	    newsect->size = 0;

You can find the entire patch plus links to other discussion starting
here:

https://sourceware.org/ml/binutils/2007-08/msg00047.html

This hack sets the size of certain BFD sections to 0, which
effectively causes GDB to ignore them.  I think it's likely that the
bug described above existed even before this hack was added, but I
have no easy way to test this now.

The output from objdump -h shows the result of this hack:

 25 load13        00000000  00007ffff7fcb000  0000000000000000  00013000  2**12
                  ALLOC

(The first field, after load13, shows the size of 0.)

Once the hack is removed, the output from objdump -h shows the correct
size:

 25 load13        00002000  00007ffff7fcb000  0000000000000000  00013000  2**12
                  ALLOC

(This is a digression, but I think it's good that objdump will now
show the correct size.)

If we remove the hack from bfd/elf.c, but do nothing to GDB, we'll see
the following regression:

FAIL: gdb.base/corefile.exp: print coremaker_ro

The reason for this is that all sections which have the BFD flag
SEC_ALLOC set, but for which SEC_HAS_CONTENTS is not set, no longer
have zero size.  Some of these sections have data that can (and
should) be read from the executable.
But, due to the way that the target strata are traversed when
attempting to access memory, the non-SEC_HAS_CONTENTS sections will be
read as zeroes from the process_stratum (which in this case is the
core file stratum) without first checking the file stratum, which is
where the data might actually be found.

What we should be doing is this:

- Attempt to access core file data for SEC_HAS_CONTENTS sections.
- Attempt to access executable file data if the above fails.
- Attempt to access core file data for non SEC_HAS_CONTENTS sections,
  if both of the above fail.

That's what this commit does.  See the comments in the patch for
additional details.

gdb/ChangeLog:

	* corelow.c (class core_target): Add new field
	m_core_no_contents_section_table.
	(core_target::core_target): Initialize
	m_core_no_contents_section_table.
	(core_target::~core_target): Free data structure associated
	with m_core_no_contents_section_table.
	(core_target::files_info): Print section info associated with
	m_core_no_contents_section_table.
	(core_target::xfer_partial): Revise TARGET_OBJECT_MEMORY case
	to consider the stratum beneath the core target as well as
	m_core_no_contents_section_table.
---
 gdb/corelow.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 73 insertions(+), 7 deletions(-)

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 5cd058d599..7a71174062 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -122,9 +122,21 @@ private: /* per-core data */
      sections --- those should come only from pure executable or
      shared library bfds.  The core bfd sections are an implementation
      detail of the core target, just like ptrace is for unix child
-     targets.  */
+     targets.
+
+     Immediately after being read, sections with no contents will be
+     removed from this table and placed in the "no contents" table;
+     see below.  */
   target_section_table m_core_section_table {};
 
+  /* The core's "no contents" section table.  These sections represent
+     regions of memory which were not modified while the process was
+     live.  Data in these sections are either found in the exec file
+     or represent regions of memory which were allocated, but never
+     written to.  */
+
+  target_section_table m_core_no_contents_section_table {};
+
   /* The core_fns for a core file handler that is prepared to read the
      core file currently open on core_bfd.  */
   core_fns *m_core_vec = NULL;
@@ -147,11 +159,24 @@ core_target::core_target ()
 			  &m_core_section_table.sections_end))
     error (_("\"%s\": Can't find sections: %s"),
 	   bfd_get_filename (core_bfd), bfd_errmsg (bfd_get_error ()));
+
+  /* Place sections which have the BFD flag SEC_ALLOC set, but
+     for which SEC_HAS_CONTENTS is not set into a separate section
+     table.  */
+  split_section_table (&m_core_section_table,
+		       &m_core_no_contents_section_table,
+		       [] (struct target_section *s) -> bool
+			 {
+			   flagword flags = s->the_bfd_section->flags;
+			   return ((flags & SEC_ALLOC)
+				   && !(flags & SEC_HAS_CONTENTS));
+			 });
 }
 
 core_target::~core_target ()
 {
   xfree (m_core_section_table.sections);
+  xfree (m_core_no_contents_section_table.sections);
 }
 
 /* List of all available core_fns.  On gdb startup, each core file
@@ -731,6 +756,7 @@ void
 core_target::files_info ()
 {
   print_section_info (&m_core_section_table, core_bfd);
+  print_section_info (&m_core_no_contents_section_table, core_bfd);
 }
 \f
 enum target_xfer_status
@@ -741,12 +767,52 @@ core_target::xfer_partial (enum target_object object, const char *annex,
   switch (object)
     {
     case TARGET_OBJECT_MEMORY:
-      return (section_table_xfer_memory_partial
-	      (readbuf, writebuf,
-	       offset, len, xfered_len,
-	       m_core_section_table.sections,
-	       m_core_section_table.sections_end,
-	       NULL));
+      enum target_xfer_status xfer_status;
+
+      /* Try accessing memory contents from core file data,
+	 restricting consideration to those sections in
+	 m_core_section_table.  Due to the partitioning (splitting)
+	 which occurs after the core's section table has been read in,
+	 this table will consist only of sections for which the BFD
+	 section flag SEC_HAS_CONTENTS is set.  */
+      xfer_status = section_table_xfer_memory_partial
+		      (readbuf, writebuf,
+		       offset, len, xfered_len,
+		       m_core_section_table.sections,
+		       m_core_section_table.sections_end,
+		       NULL);
+      if (xfer_status == TARGET_XFER_OK)
+	return TARGET_XFER_OK;
+
+      /* If the above failed, we need to see if memory contents are
+	 available from exec file prior to examining those sections
+	 in the core file for which we know only the size of the
+	 section.  */
+      xfer_status = this->beneath ()->xfer_partial (object, annex, readbuf,
+						    writebuf, offset, len,
+						    xfered_len);
+      if (xfer_status == TARGET_XFER_OK)
+	return TARGET_XFER_OK;
+
+      /* Finally, attempt to access data in core file sections with
+	 no contents.  These will typically read as all zero, but we
+	 leave it to the underlying layers to decide what to do.  */
+      xfer_status = section_table_xfer_memory_partial
+		      (readbuf, writebuf,
+		       offset, len, xfered_len,
+		       m_core_no_contents_section_table.sections,
+		       m_core_no_contents_section_table.sections_end,
+		       NULL);
+
+      /* If none of the above attempts worked to access the memory in
+	 question, return TARGET_XFER_UNAVAILABLE.  Due to the fact
+	 that the exec file stratum has already been considered, we
+	 want to prevent it from being examined yet again (at a higher
+	 level).  */
+      if (xfer_status == TARGET_XFER_OK)
+	return TARGET_XFER_OK;
+      else
+	return TARGET_XFER_UNAVAILABLE;
 
     case TARGET_OBJECT_AUXV:
       if (readbuf)
-- 
2.24.1
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections
  2020-03-05  0:43 ` [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections Kevin Buettner
@ 2020-03-25 17:25   ` Keith Seitz
  2020-03-29 13:18   ` Pedro Alves

From: Keith Seitz @ 2020-03-25 17:25 UTC (permalink / raw)
  To: Kevin Buettner, gdb-patches

On 3/4/20 4:42 PM, Kevin Buettner wrote:

[snip excellent introduction to the problem]

> What we should be doing is this:
>
> - Attempt to access core file data for SEC_HAS_CONTENTS sections.
> - Attempt to access executable file data if the above fails.
> - Attempt to access core file data for non SEC_HAS_CONTENTS sections, if
>   both of the above fail.

This is also the path that I was investigating.  FWIW, I don't see
anything obviously `wrong' with the contents of the patch, but
low-level target operations are outside where I typically work.

Just one trivial request (again):

> gdb/ChangeLog:
> +	PR corefiles/25631
> 	* corelow.c (class core_target): Add new field
> 	m_core_no_contents_section_table.
> 	(core_target::core_target): Initialize
> 	m_core_no_contents_section_table.
> 	(core_target::~core_target): Free data structure associated
> 	with m_core_no_contents_section_table.
> 	(core_target::files_info): Print section info associated with
> 	m_core_no_contents_section_table.
> 	(core_target::xfer_partial): Revise TARGET_OBJECT_MEMORY case
> 	to consider the stratum beneath the core target as well as
> 	m_core_no_contents_section_table.

Keith
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections
  2020-03-05  0:43 ` [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections Kevin Buettner
  2020-03-25 17:25   ` Keith Seitz
@ 2020-03-29 13:18   ` Pedro Alves
  2020-05-03  7:21     ` Kevin Buettner
  2020-05-12  8:40     ` Kevin Buettner

From: Pedro Alves @ 2020-03-29 13:18 UTC (permalink / raw)
  To: Kevin Buettner, gdb-patches

Hi Kevin,

On 3/5/20 12:42 AM, Kevin Buettner wrote:
> Consider the following program:
>
> Change-Id: I1adbb4e9047baad7cae7eab9c72e6d2b16f87d73
>

This Change-Id line should be at the bottom of the commit log.  Or
removed entirely since we're not relying on it anymore.

> --- mkmmapcore.c ---
>
> static char *buf;
>
> int
> main (int argc, char **argv)
> {
>   buf = mmap (NULL, 8192, PROT_READ | PROT_WRITE,
>               MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>   abort ();
> }
> --- end mkmmapcore.c ---
>
> Compile it like this:
>
> gcc -g -o mkmmapcore mkmmapcore.c
>
> Now let's run it from GDB.  I've already placed a breakpoint on the
> line with the abort() call and have run to that breakpoint.
>
> Breakpoint 1, main (argc=1, argv=0x7fffffffd678) at mkmmapcore.c:11
> 11	  abort ();
> (gdb) x/x buf
> 0x7ffff7fcb000:	0x00000000
>
> Note that we can examine the memory allocated via the call to mmap().
>
> Now let's try debugging a core file created by running this program.
> Depending on your system, in order to make a core file, you may have to
> run the following as root (or using sudo):
>
> echo core > /proc/sys/kernel/core_pattern
>
> It may also be necessary to do:
>
> ulimit -c unlimited
>
> I'm using Fedora 31.  YMMV if you're using one of the BSDs or some other
> (non-Linux) system.
>
> This is what things look like when we debug the core file:
>
> [kev@f31-1 tmp]$ gdb -q ./mkmmapcore core.304767
> Reading symbols from ./mkmmapcore...
> [New LWP 304767]
> Core was generated by `/tmp/mkmmapcore'.
> Program terminated with signal SIGABRT, Aborted.
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> 50	  return ret;
> (gdb) x/x buf
> 0x7ffff7fcb000:	Cannot access memory at address 0x7ffff7fcb000
>
> Note that we can no longer access the memory region allocated by mmap().
>
> Back in 2007, a hack for GDB was added to _bfd_elf_make_section_from_phdr()
> in bfd/elf.c:
>
> 	  /* Hack for gdb.  Segments that have not been modified do
> 	     not have their contents written to a core file, on the
> 	     assumption that a debugger can find the contents in the
> 	     executable.  We flag this case by setting the fake
> 	     section size to zero.  Note that "real" bss sections will
> 	     always have their contents dumped to the core file.  */
> 	  if (bfd_get_format (abfd) == bfd_core)
> 	    newsect->size = 0;
>
> You can find the entire patch plus links to other discussion starting
> here:
>
> https://sourceware.org/ml/binutils/2007-08/msg00047.html
>
> This hack sets the size of certain BFD sections to 0, which
> effectively causes GDB to ignore them.  I think it's likely that the
> bug described above existed even before this hack was added, but I
> have no easy way to test this now.
>
> The output from objdump -h shows the result of this hack:
>
>  25 load13        00000000  00007ffff7fcb000  0000000000000000  00013000  2**12
>                   ALLOC
>
> (The first field, after load13, shows the size of 0.)
>
> Once the hack is removed, the output from objdump -h shows the correct
> size:
>
>  25 load13        00002000  00007ffff7fcb000  0000000000000000  00013000  2**12
>                   ALLOC
>
> (This is a digression, but I think it's good that objdump will now show
> the correct size.)
>
> If we remove the hack from bfd/elf.c,
> but do nothing to GDB, we'll
> see the following regression:
>
> FAIL: gdb.base/corefile.exp: print coremaker_ro
>
> The reason for this is that all sections which have the BFD flag
> SEC_ALLOC set, but for which SEC_HAS_CONTENTS is not set no longer
> have zero size.  Some of these sections have data that can (and should)
> be read from the executable.
Removing the bfd hack alone fixes your new test for me.

> But, due to the way that the target
> strata are traversed when attempting to access memory, the
> non-SEC_HAS_CONTENTS sections will be read as zeroes from the
> process_stratum (which in this case is the core file stratum) without
> first checking the file stratum, which is where the data might actually
> be found.

I've applied your patch #1 only, and ran the corefile.exp test, but
it still passes cleanly for me.  I don't see any "print coremaker_ro"
FAIL here.  :-/  That makes it a bit harder for me to understand all
of this.  I'm on Fedora 27.

Can you expand a bit more on this following part?

> Some of these sections have data that can (and should) be read
> from the executable.

I'd like to understand and explore this a little bit better.  Are
these cases truly indistinguishable from the cases where data
shouldn't be read from the executable?  I don't mean from the current
bfd data structures, but from the data in the core file and the
executable.  It's kind of fascinating that that's the case, and if so,
it would sound like a nasty bug in either the core format or in the
Linux kernel for producing such cores with which we have to apply
heuristics.

For the NON-split fake sections case (by split I mean the
loadXXXa/loadXXXb sections that map to a single segment), how come we
end up with such sections in the core in the first place if they
weren't modified at run time?

Diffing "objdump -h" results from before/after the hack removal, on
the corefile.exp core dump, I see this case, for example:

- 18 load6         00000000  00007fd61476a000  0000000000000000  00027000  2**12
+ 18 load6         001ff000  00007fd61476a000  0000000000000000  00027000  2**12
                   ALLOC, READONLY

This is a case of a segment that is not split in two sections like
some others (note no trailing "a" and "b").  So this is a "!split"
case in _bfd_elf_make_section_from_phdr.
Trying to disassemble that address, with the whole patch series
applied, results in:

(gdb) disassemble 0x00007fd61476a000,+10
Dump of assembler code from 0x7fd61476a000 to 0x7fd61476a00a:
   0x00007fd61476a000:	/home/pedro/gdb/binutils-gdb/src/gdb/target.c:1271: internal-error: target_xfer_status target_xfer_partial(target_ops*, target_object, const char*, gdb_byte*, const gdb_byte*, ULONGEST, ULONGEST, ULONGEST*): Assertion `*xfered_len > 0' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

It'd be good to also cover this in the testsuite somehow.

Using "info proc mappings" and "readelf -l" we can see that the
address belongs to a file-backed mapping with p_filesz=0.  I'm puzzled
about why we ended up with a p_filesz=0 load segment in the core for
this memory range (load6).

>
> What we should be doing is this:
>
> - Attempt to access core file data for SEC_HAS_CONTENTS sections.
> - Attempt to access executable file data if the above fails.
> - Attempt to access core file data for non SEC_HAS_CONTENTS sections, if
>   both of the above fail.
>

This seems to end up in line with Daniel's suggestion back in 2007 at:

https://sourceware.org/legacy-ml/binutils/2007-08/msg00045.html

Except it uses the section flags as a proxy for the p_filesz/p_memsz
checks.

I'm still not fully sure this is the right thing to do given I'm not
clear on all the details, but if there truly is no other way to
distinguish the segments that need to be read from the executable
compared to segments that need to be read from the core, I suppose
this is the way to go.

I'm not fully convinced on the splitting of the sections though,
compared to just walking the core sections list twice with a
predicate.  section_table_xfer_memory_partial already has a predicate
parameter, the 'section_name' parameter; we would just need to
generalize it to a gdb::function_view callback instead.
Or alternatively, if you prefer two separate lists, then I don't
understand why we build a single list and then split it in two right
after.  Wouldn't it be preferable to make core_target::core_target()
build the two separate lists from the get go, rather than build a
single list and then split it in two lists immediately after?

BTW, that TARGET_OBJECT_MEMORY case in core_target::xfer_partial is
getting largish; it might be worth it to move that chunk to a separate
core_target::xfer_memory method.

But really, just walking the single sections list in place would be
simpler, I think.  I don't think this is a bottleneck.

> enum target_xfer_status
> @@ -741,12 +767,52 @@ core_target::xfer_partial (enum target_object object, const char *annex,

> +      /* If none of the above attempts worked to access the memory in
> +	 question, return TARGET_XFER_UNAVAILABLE.  Due to the fact
> +	 that the exec file stratum has already been considered, we
> +	 want to prevent it from being examined yet again (at a higher
> +	 level).  */
> +      if (xfer_status == TARGET_XFER_OK)
> +	return TARGET_XFER_OK;
> +      else
> +	return TARGET_XFER_UNAVAILABLE;

This returning ...UNAVAILABLE seems like the wrong thing to do.  If we
want to prevent continuing to the next layer, then we could just make
core_target::has_all_memory() return true.  Effectively that would
mean we could eliminate that method, since it only exists for core
files, here, in raw_memory_xfer_partial:

      /* We want to continue past core files to executables, but
	 not past a running target's memory.  */
      if (ops->has_all_memory ())
	break;

At the very least, that comment should be updated.
Trying it out locally, like this, on top of your whole series:

diff --git c/gdb/corelow.c w/gdb/corelow.c
index 7a711740622..d449efb74b9 100644
--- c/gdb/corelow.c
+++ w/gdb/corelow.c
@@ -90,7 +90,7 @@ class core_target final : public process_stratum_target
 
   const char *thread_name (struct thread_info *) override;
 
-  bool has_all_memory () override { return false; }
+  bool has_all_memory () override { return true; }
   bool has_memory () override;
   bool has_stack () override;
   bool has_registers () override;
@@ -804,15 +804,7 @@ core_target::xfer_partial (enum target_object object, const char *annex,
 		       m_core_no_contents_section_table.sections_end,
 		       NULL);
 
-      /* If none of the above attempts worked to access the memory in
-	 question, return TARGET_XFER_UNAVAILABLE.  Due to the fact
-	 that the exec file stratum has already been considered, we
-	 want to prevent it from being examined yet again (at a higher
-	 level).  */
-      if (xfer_status == TARGET_XFER_OK)
-	return TARGET_XFER_OK;
-      else
-	return TARGET_XFER_UNAVAILABLE;
+      return xfer_status;
 
     case TARGET_OBJECT_AUXV:
       if (readbuf)

Fixes the assertion (different address since this was another core
dump from another test run):

- 17 load6         00000000  00007ffff7884000  0000000000000000  001d3000  2**12
+ 17 load6         001ff000  00007ffff7884000  0000000000000000  001d3000  2**12
                   ALLOC, READONLY

(gdb) disassemble /r 0x7ffff7884000,+10
Dump of assembler code from 0x7ffff7884000 to 0x7ffff788400a:
   0x00007ffff7884000:	00 00	add    %al,(%rax)
   0x00007ffff7884002:	00 00	add    %al,(%rax)
   0x00007ffff7884004:	00 00	add    %al,(%rax)
   0x00007ffff7884006:	00 00	add    %al,(%rax)
   0x00007ffff7884008:	00 00	add    %al,(%rax)
End of assembler dump.

But I'm not sure (yet anyway) whether reading that section as all
zeroes is really the right thing to do.

Running "info proc mappings" when debugging the core shows that this
address comes from libc.so.
It's the second libc-2.26.so mapping below, see "THIS ONE":

(gdb) info proc mappings
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x401000     0x1000        0x0 /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile
            0x600000           0x601000     0x1000        0x0 /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile
            0x601000           0x602000     0x1000     0x1000 /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile
      0x7ffff76d7000     0x7ffff7884000   0x1ad000        0x0 /usr/lib64/libc-2.26.so
      0x7ffff7884000     0x7ffff7a83000   0x1ff000   0x1ad000 /usr/lib64/libc-2.26.so   <<< THIS ONE
      0x7ffff7a83000     0x7ffff7a87000     0x4000   0x1ac000 /usr/lib64/libc-2.26.so
      0x7ffff7a87000     0x7ffff7a89000     0x2000   0x1b0000 /usr/lib64/libc-2.26.so
      0x7ffff7a8d000     0x7ffff7bd7000   0x14a000        0x0 /usr/lib64/libm-2.26.so
      0x7ffff7bd7000     0x7ffff7dd6000   0x1ff000   0x14a000 /usr/lib64/libm-2.26.so
      0x7ffff7dd6000     0x7ffff7dd7000     0x1000   0x149000 /usr/lib64/libm-2.26.so
      0x7ffff7dd7000     0x7ffff7dd8000     0x1000   0x14a000 /usr/lib64/libm-2.26.so
      0x7ffff7dd8000     0x7ffff7dfd000    0x25000        0x0 /usr/lib64/ld-2.26.so
      0x7ffff7ff5000     0x7ffff7ff7000     0x2000        0x0 /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/coremmap.data
      0x7ffff7ffc000     0x7ffff7ffd000     0x1000    0x24000 /usr/lib64/ld-2.26.so

So, I tried comparing a live process to a core dump one.  Since we
need to use a kernel-generated core, what I did was: load the corefile
program under GDB, let it run till before the abort() call, and then
do

  (gdb) print fork ()

This makes the program fork, and the fork child crashes and aborts.
Now I'm still debugging the parent, and I have a kernel-generated core
with the same memory map as the still-running inferior.

I loaded the core dump as a second inferior under gdb.
(add-inferior -no-connection; inferior 2; file ...; core ...)
Remember that this now works with multi-target.
(gdb) p fork ()
[Detaching after fork from child process 19488]
$2 = 19488
(gdb) add-inferior -no-connection
[New inferior 2]
Added inferior 2
(gdb) inferior 2
[Switching to inferior 2 [<null>] (<noexec>)]
(gdb) file /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile
Reading symbols from /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile...
(gdb) core core.19488
[New LWP 19488]
Core was generated by `/home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/co'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fffffffd23f in ?? ()

Find the address in question:

(gdb) shell objdump -h core.19488
...
 17 load6         00000000  00007ffff7884000  0000000000000000  001d3000  2**12
                  ALLOC, READONLY
...

Disassemble it in inferior 2, the core dump:

(gdb) disassemble /r 0x00007ffff7884000,+10
Dump of assembler code from 0x7ffff7884000 to 0x7ffff788400a:
   0x00007ffff7884000:	00 00	add    %al,(%rax)
   0x00007ffff7884002:	00 00	add    %al,(%rax)
   0x00007ffff7884004:	00 00	add    %al,(%rax)
   0x00007ffff7884006:	00 00	add    %al,(%rax)
   0x00007ffff7884008:	00 00	add    %al,(%rax)
End of assembler dump.

Now let's try disassembling the same address in the live inferior,
inferior 1:

(gdb) inferior 1
[Switching to inferior 1 [process 19451] (/home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile)]
[Switching to thread 1.1 (process 19451)]
#0  main (argc=1, argv=0x7fffffffd3b8) at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.base/coremaker.c:155
155	  func1 ();
(gdb) disassemble /r 0x00007ffff7884000,+10
Dump of assembler code from 0x7ffff7884000 to 0x7ffff788400a:
   0x00007ffff7884000:	20 c6	and    %al,%dh
   0x00007ffff7884002:	07	(bad)
   0x00007ffff7884003:	00 00	add    %al,(%rax)
   0x00007ffff7884005:	00 00	add    %al,(%rax)
   0x00007ffff7884007:	00 40 b8	add    %al,-0x48(%rax)
End of assembler dump.
(gdb)

They should result in the same contents, but clearly the core case
read all zeroes, while the live one didn't.

If we unwind all the patches in this series and try pristine master,
we hit the original:

(gdb) disassemble /r 0x00007ffff7884000,+10
Dump of assembler code from 0x7ffff7884000 to 0x7ffff788400a:
   0x00007ffff7884000:	Cannot access memory at address 0x7ffff7884000

So GDB doesn't find this section's contents in the executable or
shared libraries, even though the file-backed mappings suggest we
should be able to read it from libc-2.26.so.

Maybe on your system you'll have different results and gdb manages to
find the data in the executable somehow?

Thanks,
Pedro Alves
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections
  2020-03-29 13:18   ` Pedro Alves
@ 2020-05-03  7:21     ` Kevin Buettner
  2020-05-03 11:07       ` H.J. Lu
  2020-05-12  8:40     ` Kevin Buettner

From: Kevin Buettner @ 2020-05-03  7:21 UTC (permalink / raw)
  To: gdb-patches; +Cc: Pedro Alves

On Sun, 29 Mar 2020 14:18:46 +0100
Pedro Alves <palves@redhat.com> wrote:

> Removing the bfd hack alone fixes your new test for me.
>
> > But, due to the way that the target
> > strata are traversed when attempting to access memory, the
> > non-SEC_HAS_CONTENTS sections will be read as zeroes from the
> > process_stratum (which in this case is the core file stratum) without
> > first checking the file stratum, which is where the data might actually
> > be found.
>
> I've applied your patch #1 only, and ran the corefile.exp test, but
> it still passes cleanly for me.  I don't see any "print coremaker_ro"
> FAIL here.  :-/  That makes it a bit harder for me to understand all
> of this.  I'm on Fedora 27.

I'm still working through the rest of your comments, but I have
figured out what's going on with Fedora 27, so I'll address that now.

I've tested with Fedora 27, 28, 29, 31, and 32.  I am able to confirm
the lack of regression with only patch #1 applied using F27 and F28.
F29 onwards show the regression.  (I didn't test with F30, but I
assume that it too shows the regression.)

I ended up using F28 and F29 to try to figure out what's going on.
There's not much difference in the kernel versions:

[kev@f28-1 gdb]$ uname -a
Linux f28-1 5.0.16-100.fc28.x86_64 #1 SMP Tue May 14 18:22:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

[kev@f29-efi-1 gdb]$ uname -a
Linux f29-efi-1 5.0.17-200.fc29.x86_64 #1 SMP Mon May 20 15:39:10 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

The gcc versions seem to be identical:

[kev@f28-1 gdb]$ gcc --version
gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[kev@f29-efi-1 gdb]$ gcc --version
gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

There is a slight difference in the binutils versions:

[kev@f28-1 gdb]$ ld --version | head -1
GNU ld version 2.29.1-23.fc28

[kev@f29-efi-1 gdb]$ ld --version | head -1
GNU ld version 2.31.1-25.fc29

I wondered at first if there was some difference between the way that
the F28 and F29 kernels made core dumps.  I ran the F28 binary on F29
and vice versa and found that this didn't make any difference.  The
results were the same.  I.e., the F28 binary failed to show the
problem even when run/dumped using F29.  Likewise, the F29 binary
showed the problem when run/dumped using F28.

So, it seemed likely that the problem was intrinsic to the binary.
Looking at the output of...

readelf -a testsuite/outputs/gdb.base/corefile/corefile

...on F28 and F29 revealed the following:

F28:

  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  ...
  [11] .init             PROGBITS         00000000004005d0  000005d0
       0000000000000017  0000000000000000  AX       0     0     4
  [12] .plt              PROGBITS         00000000004005f0  000005f0
       00000000000000a0  0000000000000010  AX       0     0     16
  [13] .text             PROGBITS         0000000000400690  00000690
       0000000000000381  0000000000000000  AX       0     0     16
  [14] .fini             PROGBITS         0000000000400a14  00000a14
       0000000000000009  0000000000000000  AX       0     0     4
  [15] .rodata           PROGBITS         0000000000400a20  00000a20
       0000000000000067  0000000000000000   A       0     0     8

F29:

  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  ...
  [11] .init             PROGBITS         0000000000401000  00001000
       000000000000001b  0000000000000000  AX       0     0     4
  [12] .plt              PROGBITS         0000000000401020  00001020
       00000000000000a0  0000000000000010  AX       0     0     16
  [13] .text             PROGBITS         00000000004010c0  000010c0
       0000000000000395  0000000000000000  AX       0     0     16
  [14] .fini             PROGBITS         0000000000401458  00001458
       000000000000000d  0000000000000000  AX       0     0     4
  [15] .rodata           PROGBITS         0000000000402000  00002000
       0000000000000067  0000000000000000   A       0     0     8

The thing to observe here is that F28's .rodata address is 0x400a20.
Observe, too, that the addresses for .text and .fini aren't that far
away.  The address for .rodata on F29 is at 0x402000.  It's aligned on
a 4K boundary, which separates it quite a lot from the sections
preceding it.

Checking the kernel sources, I found that PAGE_SIZE and
ELF_EXEC_PAGESIZE are 4096 for the architecture in question.
(Actually, most (maybe all?) architectures have this setting.)  These
values are used to determine ELF_MIN_ALIGN in fs/binfmt_elf.c.

Moving on to the core files, I see:

F28:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  ...
  LOAD           0x0000000000002000 0x0000000000400000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  R E    0x1000

F29:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  ...
  LOAD           0x0000000000002000 0x0000000000400000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  R      0x1000
  LOAD           0x0000000000003000 0x0000000000401000 0x0000000000000000
                 0x0000000000000000 0x0000000000001000  R E    0x1000
  LOAD           0x0000000000003000 0x0000000000402000 0x0000000000000000
                 0x0000000000000000 0x0000000000001000  R      0x1000

The thing to observe here is that, on F29, .fini and .rodata get their
own headers.  On F28, a single header describes .init, .plt, .text,
.fini, and .rodata.  This is important because, on F28, the .rodata
data actually ended up in the core file.  But on F29, it doesn't,
since the header describing that region has no data (i.e. it's
zero-sized).
Making the following change to the test program causes the regression to also occur on F27/F28; the addition of filler_ro causes coremaker_ro to be placed far enough away that its data does not end up in the core file. diff --git a/gdb/testsuite/gdb.base/coremaker.c b/gdb/testsuite/gdb.base/coremaker.c index 55330fd3e8..4f6b5399e2 100644 --- a/gdb/testsuite/gdb.base/coremaker.c +++ b/gdb/testsuite/gdb.base/coremaker.c @@ -42,6 +42,7 @@ char *buf2; int coremaker_data = 1; /* In Data section */ int coremaker_bss; /* In BSS section */ +const unsigned char filler_ro[4096] = {1, 2, 3, 4, 5, 6, 7, 8}; const int coremaker_ro = 201; /* In Read-Only Data section */ /* Note that if the mapping fails for any reason, we set buf2 With that change made to the test case, this is what I see on F28: Binary: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align ... [11] .init PROGBITS 00000000004005d0 000005d0 0000000000000017 0000000000000000 AX 0 0 4 [12] .plt PROGBITS 00000000004005f0 000005f0 00000000000000a0 0000000000000010 AX 0 0 16 [13] .text PROGBITS 0000000000400690 00000690 0000000000000381 0000000000000000 AX 0 0 16 [14] .fini PROGBITS 0000000000400a14 00000a14 0000000000000009 0000000000000000 AX 0 0 4 [15] .rodata PROGBITS 0000000000400a20 00000a20 0000000000001077 0000000000000000 A 0 0 32 Core file: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align ... LOAD 0x0000000000002000 0x0000000000400000 0x0000000000000000 0x0000000000001000 0x0000000000002000 R E 0x1000 .init, .plt, .text, .fini, and .rodata still end up being described by a single LOAD header, but note that the file size is not equal to the memory size; part of .rodata will still be in the core file, but not coremaker_ro. So... hopefully, this answers the question of why part #1 alone is not sufficient even on F27. It'll also give you a way to evaluate future patch sets should you choose to stay on F27. Kevin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections 2020-05-03 7:21 ` Kevin Buettner @ 2020-05-03 11:07 ` H.J. Lu 2020-05-03 19:06 ` Kevin Buettner 0 siblings, 1 reply; 19+ messages in thread From: H.J. Lu @ 2020-05-03 11:07 UTC (permalink / raw) To: Kevin Buettner; +Cc: GDB, Pedro Alves On Sun, May 3, 2020 at 12:25 AM Kevin Buettner via Gdb-patches <gdb-patches@sourceware.org> wrote: > > On Sun, 29 Mar 2020 14:18:46 +0100 > Pedro Alves <palves@redhat.com> wrote: > > > Removing the bfd hack alone fixes your new test for me. > > > > > But, due to the way that the target > > > strata are traversed when attempting to access memory, the > > > non-SEC_HAS_CONTENTS sections will be read as zeroes from the > > > process_stratum (which in this case is the core file stratum) without > > > first checking the file stratum, which is where the data might actually > > > be found. > > > > I've applied your patch #1 only, and ran the corefile.exp test, but > > it still passes cleanly for me. I don't see any "print coremaker_ro" > > FAIL here. :-/ That makes it a bit harder for me to understand all > > of this. I'm on Fedora 27. > > I'm still working through the rest of your comments, but I have > figured out what's going on with Fedora 27, so I'll address that now. > > I've tested with Fedora 27, 28, 29, 31, and 32. I am able to confirm > the lack of regression with only patch #1 applied using F27 and F28. > F29 onwards show the regression. (I didn't test with F30, but I assume > that it too shows the regression.) > > I ended up using F28 and F29 to try to figure out what's going on. 
> > There's not much difference in the kernel versions: > > [kev@f28-1 gdb]$ uname -a > Linux f28-1 5.0.16-100.fc28.x86_64 #1 SMP Tue May 14 18:22:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux > > [kev@f29-efi-1 gdb]$ uname -a > Linux f29-efi-1 5.0.17-200.fc29.x86_64 #1 SMP Mon May 20 15:39:10 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux > > The gcc versions seem to be identical: > > [kev@f28-1 gdb]$ gcc --version > gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) > Copyright (C) 2018 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > [kev@f29-efi-1 gdb]$ gcc --version > gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) > Copyright (C) 2018 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > There is a slight difference in the binutils versions: > > [kev@f28-1 gdb]$ ld --version | head -1 > GNU ld version 2.29.1-23.fc28 > > [kev@f29-efi-1 gdb]$ ld --version | head -1 > GNU ld version 2.31.1-25.fc29 > > I wondered at first if there was some difference between the way that > F28 and F29 kernels made core dumps. I ran the F28 binary on F29 > and vice versa and found that this didn't make any difference. The > results were the same. I.e. the F28 binary failed to show the problem > even when run/dumped using F29. Likewise, the F29 binary > showed the problem when run/dumped using F28. So, it seemed likely > that the problem was intrinsic to the binary. > > Looking at the output of... > > readelf -a testsuite/outputs/gdb.base/corefile/corefile > > ...on F28 and F29 revealed the following: > > F28: > > [Nr] Name Type Address Offset > Size EntSize Flags Link Info Align > ... 
> [11] .init PROGBITS 00000000004005d0 000005d0 > 0000000000000017 0000000000000000 AX 0 0 4 > [12] .plt PROGBITS 00000000004005f0 000005f0 > 00000000000000a0 0000000000000010 AX 0 0 16 > [13] .text PROGBITS 0000000000400690 00000690 > 0000000000000381 0000000000000000 AX 0 0 16 > [14] .fini PROGBITS 0000000000400a14 00000a14 > 0000000000000009 0000000000000000 AX 0 0 4 > [15] .rodata PROGBITS 0000000000400a20 00000a20 > 0000000000000067 0000000000000000 A 0 0 8 > > F29: > > > [Nr] Name Type Address Offset > Size EntSize Flags Link Info Align > ... > [11] .init PROGBITS 0000000000401000 00001000 > 000000000000001b 0000000000000000 AX 0 0 4 > [12] .plt PROGBITS 0000000000401020 00001020 > 00000000000000a0 0000000000000010 AX 0 0 16 > [13] .text PROGBITS 00000000004010c0 000010c0 > 0000000000000395 0000000000000000 AX 0 0 16 > [14] .fini PROGBITS 0000000000401458 00001458 > 000000000000000d 0000000000000000 AX 0 0 4 > [15] .rodata PROGBITS 0000000000402000 00002000 > 0000000000000067 0000000000000000 A 0 0 8 > > The thing to observe here is that F28's .rodata address is 0x400a20. > Observe, too, that the addresses for .text and .fini aren't that far > away. > > The address for .rodata on F29 is at 0x402000. It's aligned on a 4K > boundary which separates it quite a lot from the sections preceding > it. > > Checking the kernel sources, I found that PAGE_SIZE and ELF_EXEC_PAGESIZE > are 4096 for the architecture in question. (Actually most (maybe all?) have > this setting.) These values are used to determine ELF_MIN_ALIGN in > fs/binfmt_elf.c. > > Moving onto the corefiles, I see: > > F28: > > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > ... > LOAD 0x0000000000002000 0x0000000000400000 0x0000000000000000 > 0x0000000000001000 0x0000000000001000 R E 0x1000 > > > F29: > > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > ... 
> LOAD 0x0000000000002000 0x0000000000400000 0x0000000000000000 > 0x0000000000001000 0x0000000000001000 R 0x1000 > LOAD 0x0000000000003000 0x0000000000401000 0x0000000000000000 > 0x0000000000000000 0x0000000000001000 R E 0x1000 > LOAD 0x0000000000003000 0x0000000000402000 0x0000000000000000 > 0x0000000000000000 0x0000000000001000 R 0x1000 > > The thing to observe here is that, on F29, .fini and .rodata get their > own headers. On F28, a single header describes .init, .plt, .text, > .fini, and .rodata. > It has nothing to do with kernel. It is a linker feature: 'separate-code' 'noseparate-code' Create separate code 'PT_LOAD' segment header in the object. This specifies a memory segment that should contain only instructions and must be in wholly disjoint pages from any other data. Don't create separate code 'PT_LOAD' segment if 'noseparate-code' is used. [hjl@gnu-cfl-2 ~]$ ld --help | grep separate -z separate-code Create separate code program header (default) -z noseparate-code Don't create separate code program header [hjl@gnu-cfl-2 ~]$ -- H.J. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections 2020-05-03 11:07 ` H.J. Lu @ 2020-05-03 19:06 ` Kevin Buettner 0 siblings, 0 replies; 19+ messages in thread From: Kevin Buettner @ 2020-05-03 19:06 UTC (permalink / raw) To: H.J. Lu; +Cc: gdb-patches, Pedro Alves On Sun, 3 May 2020 04:07:33 -0700 "H.J. Lu" <hjl.tools@gmail.com> wrote: > On Sun, May 3, 2020 at 12:25 AM Kevin Buettner via Gdb-patches > <gdb-patches@sourceware.org> wrote: > > > > On Sun, 29 Mar 2020 14:18:46 +0100 > > Pedro Alves <palves@redhat.com> wrote: > > > > > Removing the bfd hack alone fixes your new test for me. > > > > > > > But, due to the way that the target > > > > strata are traversed when attempting to access memory, the > > > > non-SEC_HAS_CONTENTS sections will be read as zeroes from the > > > > process_stratum (which in this case is the core file stratum) without > > > > first checking the file stratum, which is where the data might actually > > > > be found. > > > > > > I've applied your patch #1 only, and ran the corefile.exp test, but > > > it still passes cleanly for me. I don't see any "print coremaker_ro" > > > FAIL here. :-/ That makes it a bit harder for me to understand all > > > of this. I'm on Fedora 27. > > > > I'm still working through the rest of your comments, but I have > > figured out what's going on with Fedora 27, so I'll address that now. > > > > I've tested with Fedora 27, 28, 29, 31, and 32. I am able to confirm > > the lack of regression with only patch #1 applied using F27 and F28. > > F29 onwards show the regression. (I didn't test with F30, but I assume > > that it too shows the regression.) > > > > I ended up using F28 and F29 to try to figure out what's going on. 
> > > > There's not much difference in the kernel versions: > > > > [kev@f28-1 gdb]$ uname -a > > Linux f28-1 5.0.16-100.fc28.x86_64 #1 SMP Tue May 14 18:22:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux > > > > [kev@f29-efi-1 gdb]$ uname -a > > Linux f29-efi-1 5.0.17-200.fc29.x86_64 #1 SMP Mon May 20 15:39:10 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux > > > > The gcc versions seem to be identical: > > > > [kev@f28-1 gdb]$ gcc --version > > gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) > > Copyright (C) 2018 Free Software Foundation, Inc. > > This is free software; see the source for copying conditions. There is NO > > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > > > [kev@f29-efi-1 gdb]$ gcc --version > > gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) > > Copyright (C) 2018 Free Software Foundation, Inc. > > This is free software; see the source for copying conditions. There is NO > > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > > > There is a slight difference in the binutils versions: > > > > [kev@f28-1 gdb]$ ld --version | head -1 > > GNU ld version 2.29.1-23.fc28 > > > > [kev@f29-efi-1 gdb]$ ld --version | head -1 > > GNU ld version 2.31.1-25.fc29 > > > > I wondered at first if there was some difference between the way that > > F28 and F29 kernels made core dumps. I ran the F28 binary on F29 > > and vice versa and found that this didn't make any difference. The > > results were the same. I.e. the F28 binary failed to show the problem > > even when run/dumped using F29. Likewise, the F29 binary > > showed the problem when run/dumped using F28. So, it seemed likely > > that the problem was intrinsic to the binary. > > > > Looking at the output of... > > > > readelf -a testsuite/outputs/gdb.base/corefile/corefile > > > > ...on F28 and F29 revealed the following: > > > > F28: > > > > [Nr] Name Type Address Offset > > Size EntSize Flags Link Info Align > > ... 
> > [11] .init PROGBITS 00000000004005d0 000005d0 > > 0000000000000017 0000000000000000 AX 0 0 4 > > [12] .plt PROGBITS 00000000004005f0 000005f0 > > 00000000000000a0 0000000000000010 AX 0 0 16 > > [13] .text PROGBITS 0000000000400690 00000690 > > 0000000000000381 0000000000000000 AX 0 0 16 > > [14] .fini PROGBITS 0000000000400a14 00000a14 > > 0000000000000009 0000000000000000 AX 0 0 4 > > [15] .rodata PROGBITS 0000000000400a20 00000a20 > > 0000000000000067 0000000000000000 A 0 0 8 > > > > F29: > > > > > > [Nr] Name Type Address Offset > > Size EntSize Flags Link Info Align > > ... > > [11] .init PROGBITS 0000000000401000 00001000 > > 000000000000001b 0000000000000000 AX 0 0 4 > > [12] .plt PROGBITS 0000000000401020 00001020 > > 00000000000000a0 0000000000000010 AX 0 0 16 > > [13] .text PROGBITS 00000000004010c0 000010c0 > > 0000000000000395 0000000000000000 AX 0 0 16 > > [14] .fini PROGBITS 0000000000401458 00001458 > > 000000000000000d 0000000000000000 AX 0 0 4 > > [15] .rodata PROGBITS 0000000000402000 00002000 > > 0000000000000067 0000000000000000 A 0 0 8 > > > > The thing to observe here is that F28's .rodata address is 0x400a20. > > Observe, too, that the addresses for .text and .fini aren't that far > > away. > > > > The address for .rodata on F29 is at 0x402000. It's aligned on a 4K > > boundary which separates it quite a lot from the sections preceding > > it. > > > > Checking the kernel sources, I found that PAGE_SIZE and ELF_EXEC_PAGESIZE > > are 4096 for the architecture in question. (Actually most (maybe all?) have > > this setting.) These values are used to determine ELF_MIN_ALIGN in > > fs/binfmt_elf.c. > > > > Moving onto the corefiles, I see: > > > > F28: > > > > Program Headers: > > Type Offset VirtAddr PhysAddr > > FileSiz MemSiz Flags Align > > ... 
> > LOAD 0x0000000000002000 0x0000000000400000 0x0000000000000000 > > 0x0000000000001000 0x0000000000001000 R E 0x1000 > > > > > > F29: > > > > Program Headers: > > Type Offset VirtAddr PhysAddr > > FileSiz MemSiz Flags Align > > ... > > LOAD 0x0000000000002000 0x0000000000400000 0x0000000000000000 > > 0x0000000000001000 0x0000000000001000 R 0x1000 > > LOAD 0x0000000000003000 0x0000000000401000 0x0000000000000000 > > 0x0000000000000000 0x0000000000001000 R E 0x1000 > > LOAD 0x0000000000003000 0x0000000000402000 0x0000000000000000 > > 0x0000000000000000 0x0000000000001000 R 0x1000 > > > > The thing to observe here is that, on F29, .fini and .rodata get their > > own headers. On F28, a single header describes .init, .plt, .text, > > .fini, and .rodata. > > > > It has nothing to do with kernel. It is a linker feature: > > 'separate-code' > 'noseparate-code' > Create separate code 'PT_LOAD' segment header in the object. > This specifies a memory segment that should contain only > instructions and must be in wholly disjoint pages from any > other data. Don't create separate code 'PT_LOAD' segment if > 'noseparate-code' is used. > > [hjl@gnu-cfl-2 ~]$ ld --help | grep separate > -z separate-code Create separate code program header (default) > -z noseparate-code Don't create separate code program header > [hjl@gnu-cfl-2 ~]$ Thanks for this info. I had been wondering what change was made in between binutils 2.29 and 2.31 to cause the change in behavior described in my earlier post. With the info you provided above, I found it in bfd/ChangeLog-2018: 2018-02-27 H.J. Lu <hongjiu.lu@intel.com> * config.in: Regenerated. * configure: Likewise. * configure.ac: Add --enable-separate-code. (DEFAULT_LD_Z_SEPARATE_CODE): New AC_DEFINE_UNQUOTED. Default to 1 for Linux/x86 targets, * elf64-x86-64.c (ELF_MAXPAGESIZE): Set to 0x1000 if DEFAULT_LD_Z_SEPARATE_CODE is 1. Also, regarding version numbers, I see this: 2018-06-24 Nick Clifton <nickc@redhat.com> 2.31 branch created. 
So 2.31 would have been the first version for which --enable-separate-code was made the default for Linux/x86. As noted earlier, F28 used version 2.29 while F29 used version 2.31. Kevin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections 2020-03-29 13:18 ` Pedro Alves 2020-05-03 7:21 ` Kevin Buettner @ 2020-05-12 8:40 ` Kevin Buettner 2020-05-13 17:44 ` Kevin Buettner ` (2 more replies) 1 sibling, 3 replies; 19+ messages in thread From: Kevin Buettner @ 2020-05-12 8:40 UTC (permalink / raw) To: gdb-patches; +Cc: Pedro Alves On Sun, 29 Mar 2020 14:18:46 +0100 Pedro Alves <palves@redhat.com> wrote: > Hi Kevin, > > On 3/5/20 12:42 AM, Kevin Buettner wrote: > > Consider the following program: > > > > Change-Id: I1adbb4e9047baad7cae7eab9c72e6d2b16f87d73 > > > > This Change-Id line should be at the bottom of the commit log. > Or removed entirely since we're not relying on it anymore. I think the Change-Id might have been placed there due to the use of "---" as the scissors line demarcating the test program. Git may have thought it was the start of a diff. But that's just a guess. > > --- mkmmapcore.c --- I think I have this fixed now - In the v2 patch draft, I'm using "- - -" instead. [...] > Removing the bfd hack alone fixes your new test for me. I've addressed this point in earlier email. > > But, due to the way that the target > > strata are traversed when attempting to access memory, the > > non-SEC_HAS_CONTENTS sections will be read as zeroes from the > > process_stratum (which in this case is the core file stratum) without > > first checking the file stratum, which is where the data might actually > > be found. > > I've applied your patch #1 only, and ran the corefile.exp test, but > it still passes cleanly for me. I don't see any "print coremaker_ro" > FAIL here. :-/ That makes it a bit harder for me to understand all > of this. I'm on Fedora 27. > > Can you expand a bit more on this following part? > > > Some of these sections have data that can (and should) be read > > from the executable. > > I'd like to understand and explore this a little bit better. 
I've revised my draft commit log comments as follows: The reason for this is that all sections which have the BFD flag SEC_ALLOC set, but for which SEC_HAS_CONTENTS is not set no longer have zero size. Some of these sections have data that can (and should) be read from the executable. (Sections for which SEC_HAS_CONTENTS is set should be read from the core file; sections which do not have this flag set need to either be read from the executable or, failing that, from the core file using whatever BFD decides is the best value to present to the user - it uses zeros.) At present, due to the way that the target strata are traversed when attempting to access memory, the non-SEC_HAS_CONTENTS sections will be read as zeroes from the process_stratum (which in this case is the core file stratum) without first checking the file stratum, which is where the data might actually be found. I basically split the paragraph and added the parenthetical remark to the end of the first paragraph. > Are these cases truly indistinguishable from the cases where data > shouldn't be read from the executable? I don't mean from the current > bfd data structures, but from the data in the core file and the executable. > It's kind of fascinating that that's the case, and if so, it would sound > like a nasty bug in either the core format or in the Linux kernel for > producing such cores with which we have to apply heuristics. The core file shows a "FileSiz" of 0 for cases when we shouldn't look first at the core file. Here's an example using a core file produced by the corefile.exp test case. Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align ... LOAD 0x0000000000026000 0x00007f498a801000 0x0000000000000000 0x0000000000001000 0x00000000001b5000 R E 0x1000 LOAD 0x0000000000027000 0x00007f498a9b6000 0x0000000000000000 0x0000000000000000 0x00000000001ff000 0x1000 The first header shows a file size of 0x1000. 
For this one, GDB should first look at the contents provided by the core file. The second header has a size of 0. There's no data there to look at, so GDB needs to look at the exec file. If that fails, the memory in question was probably dynamically allocated, so we let BFD provide us with something reasonable to present to the user. It's possible (as I discuss later) that a live process has non-zero contents which aren't captured by the core file. This might be considered a bug, but in the case that I looked at the memory in question wasn't especially relevant to execution of the program once it was loaded. By the way, objdump -h (using the one with my hack-removing patch applied) shows the above headers / sections as follows: Idx Name Size VMA LMA File off Algn 16 load5a 00001000 00007f498a801000 0000000000000000 00026000 2**12 CONTENTS, ALLOC, LOAD, READONLY, CODE 17 load5b 001b4000 00007f498a802000 0000000000001000 00027000 2**12 ALLOC, READONLY, CODE Note that there's no longer any numerical indication that the size in the file is 0. But we can discern that by the lack of a CONTENTS flag. This is the objdump output without the hack removed: Idx Name Size VMA LMA File off Algn 16 load5a 00001000 00007f498a801000 0000000000000000 00026000 2**12 CONTENTS, ALLOC, LOAD, READONLY, CODE 17 load5b 00000000 00007f498a802000 0000000000001000 00027000 2**12 ALLOC, READONLY, CODE The "old" objdump shows a file size of 0, but contains no indication of the size of the memory region. The new (hack-removed) way is better because it tells us the size of the memory region. The lack of a CONTENTS flag lets us know that the file size was 0. I don't know the reason for the choice of names (load5a and load5b). According to the readelf output, these started out as separate program headers. As such, I'd have thought that they'd be given names not needing to be distinguished by an "a" or "b" suffix. 
> For the NON-split fake sections case (by split I mean the loadXXXa/loadXXXb > sections that map to a single segment), how come we end up with such sections > in the core in the first place if they weren't modified at run time? Well, we still need to know that they were there, right? It doesn't take much space to output a header that has no corresponding file contents. > Diffing "objdump -h" results from before/after the hack removal, on the corefile.exp > core dump, I see, this case for example: > > - 18 load6 00000000 00007fd61476a000 0000000000000000 00027000 2**12 > + 18 load6 001ff000 00007fd61476a000 0000000000000000 00027000 2**12 > ALLOC, READONLY > > This is a case of a segment that is not split in two sections like some > others (note no trailing "a" and "b"). So this is a "!split" case in > _bfd_elf_make_section_from_phdr. Trying to disassemble that address, with > the whole patch series applied, results in: > > (gdb) disassemble 0x00007fd61476a000,+10 > Dump of assembler code from 0x7fd61476a000 to 0x7fd61476a00a: > 0x00007fd61476a000: /home/pedro/gdb/binutils-gdb/src/gdb/target.c:1271: internal-error: target_xfer_status target_xfer_partial(target_ops*, target_object, const char*, gdb_byte*, const gdb_byte*, ULONGEST, ULONGEST, ULONGEST*): Assertion `*xfered_len > 0' failed. > A problem internal to GDB has been detected, > further debugging may prove unreliable. > Quit this debugging session? (y or n) Unfortunately, I wasn't able to reproduce this problem. I tried on F27, F28, and F31. However, I think I fixed the bug while fixing a testsuite regression after the rebase. > I'd be good to also cover this in the testsuite somehow. Hmmm... I'd need to reproduce it first, which might be done by undoing the bug fix mentioned earlier. It doesn't sound like an easy test case to write. Plus there's no guarantee that the test would fail as expected (for a buggy gdb) in modern Linux releases. (I.e. I'm not terribly enthusiastic about pursuing this.) 
> Using "info proc mappings" and "readelf -l" we can see that the address > belongs to a file-backed mapping with p_filesz=0. I'm puzzled about > why we ended up with a p_filesz=0 load segment in the core for > this memory range (load6). I don't think that the kernel tries to distinguish these cases; it just dumps them. > > What we should be doing is this: > > > > - Attempt to access core file data for SEC_HAS_CONTENTS sections. > > - Attempt to access executable file data if the above fails. > > - Attempt to access core file data for non SEC_HAS_CONTENTS sections, if > > both of the above fail. > > > > This seems to end up in line with Daniel's suggestion back in 2007 at: > > https://sourceware.org/legacy-ml/binutils/2007-08/msg00045.html > > Except it uses the section flags as proxy for the p_filesz/p_memsz > checks. Yes. I've added that link to my v2 commit remarks. I've also added your note about the section flags. > I'm still not fully sure this is the right thing to do given I'm not > clear on all the details, but if there truly is no other way to > distinguish the segments that need to be read from the executable > compared to segments that need to be read from the core, I suppose > this is the way to go. I don't know of a better way. > I'm not fully convinced on the splitting the sections though, compared to > just walking the core sections list twice with a predicate. > section_table_xfer_memory_partial already has a predicate parameter, > the 'section_name' parameter, we would just need to generalize > it to a gdb::function_view callback instead. I think that changing section_name to a predicate parameter is a good idea. This is what I've done for the v2 patch set. > Or alternatively, if you prefer two separate lists, then I don't > understand why build a single list, and then split in two right after. 
> Wouldn't it be preferable to make core_target::core_target() build the > two separate lists from the get go, rather that build a single list and > then split it in two lists immediately after? I've removed the splitting stuff entirely. > BTW, that TARGET_OBJECT_MEMORY case in core_target::xfer_partial > is getting largish, might be worth it to move that chunk to a > separate core_target::xfer_memory method. I didn't do this; it doesn't appear to me to be that much bigger than some of the other largish cases in the same function. > But really just walking the single sections list in place would > be simpler, I think. I don't think this is a bottleneck. Yes, I like the way it turned out after making the changes you suggested. > > enum target_xfer_status > > @@ -741,12 +767,52 @@ core_target::xfer_partial (enum target_object object, const char *annex, > > > + /* If none of the above attempts worked to access the memory in > > + question, return TARGET_XFER_UNAVAILABLE. Due to the fact > > + that the exec file stratum has already been considered, we > > + want to prevent it from being examined yet again (at a higher > > + level). */ > > + if (xfer_status == TARGET_XFER_OK) > > + return TARGET_XFER_OK; > > + else > > + return TARGET_XFER_UNAVAILABLE; > > This returning ...UNAVAILABLE seems like the wrong thing to do. If > we want to prevent continuing to the next layer, then we could > just make core_target::has_all_memory() return true. You're right. I ran into a regression while rebasing against current sources when I started looking at this again. I ended up writing most of the patch that you provided to me, all except for the has_all_memory bit. I've now added that line from your patch. > > Effectively that would mean we could eliminate that method, since it > only exists for core files, here, in raw_memory_xfer_partial: > > /* We want to continue past core files to executables, but not > past a running target's memory. 
*/ > if (ops->has_all_memory ()) > break; > > At the very least, that comment should be updated. I'll update the comment. remote-sim.c's implementation of has_all_memory can still return false. I think there's at least one other instance where false can be returned. [patch snipped] > But I'm not sure (yet anyway), whether reading that section > as all zeroes is really the right thing to do. > > Running "info proc mappings" when debugging the core shows that > this address comes from libc.so. It's the second libc-2.26.so > mapping below, see "THIS ONE": > > (gdb) info proc mappings > Mapped address spaces: > > Start Addr End Addr Size Offset objfile > 0x400000 0x401000 0x1000 0x0 /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile > 0x600000 0x601000 0x1000 0x0 /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile > 0x601000 0x602000 0x1000 0x1000 /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile > 0x7ffff76d7000 0x7ffff7884000 0x1ad000 0x0 /usr/lib64/libc-2.26.so > 0x7ffff7884000 0x7ffff7a83000 0x1ff000 0x1ad000 /usr/lib64/libc-2.26.so <<< THIS ONE > 0x7ffff7a83000 0x7ffff7a87000 0x4000 0x1ac000 /usr/lib64/libc-2.26.so > 0x7ffff7a87000 0x7ffff7a89000 0x2000 0x1b0000 /usr/lib64/libc-2.26.so > 0x7ffff7a8d000 0x7ffff7bd7000 0x14a000 0x0 /usr/lib64/libm-2.26.so > 0x7ffff7bd7000 0x7ffff7dd6000 0x1ff000 0x14a000 /usr/lib64/libm-2.26.so > 0x7ffff7dd6000 0x7ffff7dd7000 0x1000 0x149000 /usr/lib64/libm-2.26.so > 0x7ffff7dd7000 0x7ffff7dd8000 0x1000 0x14a000 /usr/lib64/libm-2.26.so > 0x7ffff7dd8000 0x7ffff7dfd000 0x25000 0x0 /usr/lib64/ld-2.26.so > 0x7ffff7ff5000 0x7ffff7ff7000 0x2000 0x0 /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/coremmap.data > 0x7ffff7ffc000 0x7ffff7ffd000 0x1000 0x24000 /usr/lib64/ld-2.26.so > > > So, I tried comparing a live process to a core dump one. 
Since we need > to use a kernel-generated core, what I did was, load the corefile program > under GDB, let it run till before the abort() call, and then do > > (gdb) print fork () > > This makes the program fork, and the fork child crashes and aborts. > Now I'm still debugging the parent, and I have a kernel-generated core > with the same same memory map as the still-running inferior. > > I loaded the core dump as a second inferior under gdb. > (add-inferior -no-connection; inferior 2; file ...; core ...) > Remember that this now works with multi-target. > > (gdb) p fork () > [Detaching after fork from child process 19488] > $2 = 19488 > (gdb) add-inferior -no-connection > [New inferior 2] > Added inferior 2 > (gdb) inferior 2 > [Switching to inferior 2 [<null>] (<noexec>)] > (gdb) file /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile > Reading symbols from /home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile... > (gdb) core core.19488 > [New LWP 19488] > Core was generated by `/home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/co'. > Program terminated with signal SIGSEGV, Segmentation fault. > #0 0x00007fffffffd23f in ?? () > > Find the address in question: > > (gdb) shell objdump -h core.19488 > ... > 17 load6 00000000 00007ffff7884000 0000000000000000 001d3000 2**12 > ALLOC, READONLY > ... > > Disassemble it in inferior 2, the core dump: > > (gdb) disassemble /r 0x00007ffff7884000,+10 > Dump of assembler code from 0x7ffff7884000 to 0x7ffff788400a: > 0x00007ffff7884000: 00 00 add %al,(%rax) > 0x00007ffff7884002: 00 00 add %al,(%rax) > 0x00007ffff7884004: 00 00 add %al,(%rax) > 0x00007ffff7884006: 00 00 add %al,(%rax) > 0x00007ffff7884008: 00 00 add %al,(%rax) > End of assembler dump. 
> > Now let's try disassembling the same address in the live inferior, > inferior 1: > > (gdb) inferior 1 > [Switching to inferior 1 [process 19451] (/home/pedro/brno/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.base/corefile/corefile)] > [Switching to thread 1.1 (process 19451)] > #0 main (argc=1, argv=0x7fffffffd3b8) at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.base/coremaker.c:155 > 155 func1 (); > (gdb) disassemble /r 0x00007ffff7884000,+10 > Dump of assembler code from 0x7ffff7884000 to 0x7ffff788400a: > 0x00007ffff7884000: 20 c6 and %al,%dh > 0x00007ffff7884002: 07 (bad) > 0x00007ffff7884003: 00 00 add %al,(%rax) > 0x00007ffff7884005: 00 00 add %al,(%rax) > 0x00007ffff7884007: 00 40 b8 add %al,-0x48(%rax) > End of assembler dump. > (gdb) > > They should result in the same contents, but clearly the > core case read all zeroes, while the live one didn't. > > If we unwind all the patches in this series and try pristine master, > we hit the original: > > (gdb) disassemble /r 0x00007ffff7884000,+10 > Dump of assembler code from 0x7ffff7884000 to 0x7ffff788400a: > 0x00007ffff7884000: Cannot access memory at address 0x7ffff7884000 > > So GDB doesn't find this section's contents in the executable or shared > libraries, even though the file-backed mappings suggest we should be able > to read it from libc-2.26.so. Maybe on your system you'll have different > results and gdb manages to find the data in the executable somehow? I wasn't able to reproduce this on F31. I was able to reproduce it on F28. It took a while to figure out what was happening. I used the fork trick as you suggested, but I ended up debugging with separate GDB instances so that I could place the windows side by side to do comparisons. However, it was cool seeing how the add-inferior stuff worked though, so thanks for that! 
Okay, so in my session(s), this is what I see:

(gdb) info proc mappings
process 23063
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x401000     0x1000        0x0 /mesquite2/sourceware-git/f28-bz25631/bld/gdb/testsuite/outputs/gdb.base/corefile/corefile
            0x600000           0x601000     0x1000        0x0 /mesquite2/sourceware-git/f28-bz25631/bld/gdb/testsuite/outputs/gdb.base/corefile/corefile
            0x601000           0x602000     0x1000     0x1000 /mesquite2/sourceware-git/f28-bz25631/bld/gdb/testsuite/outputs/gdb.base/corefile/corefile
            0x602000           0x623000    0x21000        0x0 [heap]
      0x7ffff7684000     0x7ffff7839000   0x1b5000        0x0 /usr/lib64/libc-2.27.so
      0x7ffff7839000     0x7ffff7a38000   0x1ff000   0x1b5000 /usr/lib64/libc-2.27.so
      0x7ffff7a38000     0x7ffff7a3c000     0x4000   0x1b4000 /usr/lib64/libc-2.27.so
      0x7ffff7a3c000     0x7ffff7a3e000     0x2000   0x1b8000 /usr/lib64/libc-2.27.so
      0x7ffff7a3e000     0x7ffff7a42000     0x4000        0x0
      0x7ffff7a42000     0x7ffff7bd4000   0x192000        0x0 /usr/lib64/libm-2.27.so
      0x7ffff7bd4000     0x7ffff7dd4000   0x200000   0x192000 /usr/lib64/libm-2.27.so
      0x7ffff7dd4000     0x7ffff7dd5000     0x1000   0x192000 /usr/lib64/libm-2.27.so
      0x7ffff7dd5000     0x7ffff7dd6000     0x1000   0x193000 /usr/lib64/libm-2.27.so
      0x7ffff7dd6000     0x7ffff7dfd000    0x27000        0x0 /usr/lib64/ld-2.27.so
      0x7ffff7fcb000     0x7ffff7fd0000     0x5000        0x0
      0x7ffff7ff4000     0x7ffff7ff6000     0x2000        0x0
      0x7ffff7ff6000     0x7ffff7ff8000     0x2000        0x0 /mesquite2/sourceware-git/f28-bz25631/bld/gdb/coremmap.data
      0x7ffff7ff8000     0x7ffff7ffb000     0x3000        0x0 [vvar]
      0x7ffff7ffb000     0x7ffff7ffc000     0x1000        0x0 [vdso]
      0x7ffff7ffc000     0x7ffff7ffd000     0x1000    0x26000 /usr/lib64/ld-2.27.so
      0x7ffff7ffd000     0x7ffff7ffe000     0x1000    0x27000 /usr/lib64/ld-2.27.so
      0x7ffff7ffe000     0x7ffff7fff000     0x1000        0x0
      0x7ffffffdd000     0x7ffffffff000    0x22000        0x0 [stack]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0 [vsyscall]

I'm going to focus on this mapping, which is in libc-2.27.so:

      0x7ffff7839000     0x7ffff7a38000   0x1ff000   0x1b5000 /usr/lib64/libc-2.27.so

It's interesting to look at the corresponding line from
/proc/23063/maps too:
7ffff7839000-7ffff7a38000 ---p 001b5000 fc:03 1446829    /usr/lib64/libc-2.27.so

Of interest here are the permissions for this mapping; they're "---p",
which indicates that it cannot be read, written, or executed.  The "p"
flag indicates that it's private.  So from this we might infer that
this region was not really in use by the process.

Also of great importance (for figuring out what's happening, anyway)
is the number 001b5000.  It's the offset for this region in the shared
library.

Examining memory in the live process:

(gdb) x/16x 0x7ffff7839000
0x7ffff7839000: 0x0007ddf0      0x00000000      0x0007dba0      0x00000000
0x7ffff7839010: 0x0007ec00      0x00000000      0x0007ec10      0x00000000
0x7ffff7839020: 0x0007ebe0      0x00000000      0x0007ddf0      0x00000000
0x7ffff7839030: 0x0007ebf0      0x00000000      0x0007ec20      0x00000000

And via the core file:

(gdb) x/16x 0x7ffff7839000
0x7ffff7839000: 0x00000000      0x00000000      0x00000000      0x00000000
0x7ffff7839010: 0x00000000      0x00000000      0x00000000      0x00000000
0x7ffff7839020: 0x00000000      0x00000000      0x00000000      0x00000000
0x7ffff7839030: 0x00000000      0x00000000      0x00000000      0x00000000

Okay, so they're different.  I've reproduced the problem you found.

Let's dump part of libc-2.27.so...

[kev@f28-1 gdb]$ od -A x -t x4 -j 0x001b5000 -N 176 /usr/lib64/libc-2.27.so
1b5000 0007ddf0 00000000 0007dba0 00000000
1b5010 0007ec00 00000000 0007ec10 00000000
1b5020 0007ebe0 00000000 0007ddf0 00000000
1b5030 0007ebf0 00000000 0007ec20 00000000
1b5040 0007ec30 00000000 00000000 00000000
1b5050 00000000 00000000 00000000 00000000
*
1b5070 00000000 00000000 00078ba0 00000000
1b5080 00000000 00000000 00000000 00000000
1b5090 00000000 00000000 00078ae0 00000000
1b50a0 00000000 00000000 00000000 00000000
1b50b0

Note that the initial four lines from the dump match the four lines
that I dumped via x/x for the live process.

Now, let's look at some of the objdump -h output for
/usr/lib64/libc-2.27.so:

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
...
 27 __libc_IO_vtables 00000d68  00000000003b4760  00000000003b4760  001b4760  2**5
                  CONTENTS, ALLOC, LOAD, DATA
 28 .data.rel.ro  000026a0  00000000003b54e0  00000000003b54e0  001b54e0  2**5
                  CONTENTS, ALLOC, LOAD, DATA

The offset range for __libc_IO_vtables is 0x1b4760 to 0x1b54c8.  So, I
think that what we're looking at there is some of the vtable data.

However, this is not where we actually find the vtables in memory.
This is what "info target" has to say about the matter:

        0x00007ffff7a38760 - 0x00007ffff7a394c8 is __libc_IO_vtables in /lib64/libc.so.6

Let's take the file offset corresponding to the start of the region
(start of region=0x7ffff7839000, file offset=0x1b5000) and subtract
the offset provided by objdump for __libc_IO_vtables (file
offset=001b4760):

(gdb) p/x 0x1b5000-0x1b4760
$28 = 0x8a0

Now let's use this offset to try to look at actual vtables memory
corresponding to our mystery region.

(gdb) x/16x 0x00007ffff7a38760+0x8a0
0x7ffff7a39000 <_IO_strn_jumps+96>:     0xf7701df0      0x00007fff      0xf7701ba0      0x00007fff
0x7ffff7a39010 <_IO_strn_jumps+112>:    0xf7702c00      0x00007fff      0xf7702c10      0x00007fff
0x7ffff7a39020 <_IO_strn_jumps+128>:    0xf7702be0      0x00007fff      0xf7701df0      0x00007fff
0x7ffff7a39030 <_IO_strn_jumps+144>:    0xf7702bf0      0x00007fff      0xf7702c20      0x00007fff

Okay, this looks nothing at all like the memory that we dumped
earlier.  But let's try something...

(gdb) p/x 0xf7701df0-0x0007ddf0
$29 = 0xf7684000
(gdb) p/x 0xf7701ba0-0x0007dba0
$30 = 0xf7684000
(gdb) p/x 0xf7702c00-0x0007ec00
$31 = 0xf7684000
(gdb) p/x 0xf7702c10-0x0007ec10
$32 = 0xf7684000

Here I took the first four non-zero values in the region starting at
0x7ffff7839000 and subtracted them from the corresponding values
starting at 0x7ffff7a39000 <_IO_strn_jumps+96>.  In each case, the
result of the subtraction is the same number, 0xf7684000, which I'm
guessing is a base address for something.  In any case, I think
there's enough of a correspondence here to suspect that one was
derived from the other.

So...
my best guess is that the memory in question was scratch storage for
the libc IO vtable data.  If that's really the case (which the mapping
permissions suggest), then this memory wasn't being actively used by
the program at that point in its execution.  In any case, because the
region is file backed, the kernel chose not to dump it.

The address ranges provided by "info target" know nothing about that
address (in this case 0x7ffff7839000), so that explains why GDB wasn't
able to go to the shared library to get it (even though it's there if
it were told where to look).  If we think it important enough, we
might be able to fix this in gdb and/or bfd.

I hope that's a satisfactory enough answer to move on...  I'll post a
v2 series soon; I just have a bit more polishing to do first.

Kevin

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections
  2020-05-12  8:40             ` Kevin Buettner
@ 2020-05-13 17:44               ` Kevin Buettner
  2020-05-20 12:05               ` Pedro Alves
  2020-05-20 16:04               ` Pedro Alves
  2 siblings, 0 replies; 19+ messages in thread
From: Kevin Buettner @ 2020-05-13 17:44 UTC (permalink / raw)
To: gdb-patches

On Tue, 12 May 2020 01:40:32 -0700
Kevin Buettner via Gdb-patches <gdb-patches@sourceware.org> wrote:

> > This Change-Id line should be at the bottom of the commit log.
> > Or removed entirely since we're not relying on it anymore.
>
> I think the Change-Id might have been placed there due to the use of
> "---" as the scissors line demarcating the test program.  Git may have
> thought it was the start of a diff.  But that's just a guess.

FWIW, I ended up removing the script which was added for gerrit.

Kevin

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections
  2020-05-12  8:40             ` Kevin Buettner
  2020-05-13 17:44               ` Kevin Buettner
@ 2020-05-20 12:05               ` Pedro Alves
  2020-05-20 16:04               ` Pedro Alves
  2 siblings, 0 replies; 19+ messages in thread
From: Pedro Alves @ 2020-05-20 12:05 UTC (permalink / raw)
To: Kevin Buettner, gdb-patches

On 5/12/20 9:40 AM, Kevin Buettner via Gdb-patches wrote:

> By the way, objdump -h (using the one with my hack-removing patch
> applied) shows the above headers / sections as follows:
>
> Idx Name          Size      VMA               LMA               File off  Algn
>  16 load5a        00001000  00007f498a801000  0000000000000000  00026000  2**12
>                   CONTENTS, ALLOC, LOAD, READONLY, CODE
>  17 load5b        001b4000  00007f498a802000  0000000000001000  00027000  2**12
>                   ALLOC, READONLY, CODE
...
> I don't know the reason for the choice of names (load5a and load5b).
> According to the readelf output, these started out as separate program
> headers.  As such, I'd have thought that they'd be given names not
> needing to be distinguished by an "a" or "b" suffix.

I take it you mean separate segments instead of separate program
headers.  But that's not correct -- the two sections load5a/load5b
come from a single segment, the first one below:

> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
> ...
>   LOAD           0x0000000000026000 0x00007f498a801000 0x0000000000000000
>                  0x0000000000001000 0x00000000001b5000  R E    0x1000
>   LOAD           0x0000000000027000 0x00007f498a9b6000 0x0000000000000000
>                  0x0000000000000000 0x00000000001ff000         0x1000

This segment has FileSiz == 0x1000 and MemSiz == 0x1b5000.  This means
that the segment starts with .data-like contents in the first 0x1000
bytes, and then is followed with (0x1b5000-0x1000) .bss-like zeroes.
The .bss zeroes don't need to be present in the file of course, which
is why FileSiz is smaller than MemSiz.
Note how load5a starts at 0x7f498a801000 and is 0x1000 bytes long,
while load5b starts at 0x7f498a802000 (0x7f498a801000+0x1000) and is
0x1b4000 (0x1b5000-0x1000) bytes long.

Note how the "split" variable is computed in
_bfd_elf_make_section_from_phdr:

bfd_boolean
_bfd_elf_make_section_from_phdr (bfd *abfd,
				 Elf_Internal_Phdr *hdr,
				 int hdr_index,
				 const char *type_name)
{
  int split;
...
  split = ((hdr->p_memsz > 0)
	   && (hdr->p_filesz > 0)
	   && (hdr->p_memsz > hdr->p_filesz));

If "split" is true, then you end up with two sections for the same
segment, as discussed above.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections
  2020-05-12  8:40             ` Kevin Buettner
  2020-05-13 17:44               ` Kevin Buettner
  2020-05-20 12:05               ` Pedro Alves
@ 2020-05-20 16:04               ` Pedro Alves
  2 siblings, 0 replies; 19+ messages in thread
From: Pedro Alves @ 2020-05-20 16:04 UTC (permalink / raw)
To: Kevin Buettner, gdb-patches

On 5/12/20 9:40 AM, Kevin Buettner via Gdb-patches wrote:

>> Are these cases truly indistinguishable from the cases where data
>> shouldn't be read from the executable?  I don't mean from the current
>> bfd data structures, but from the data in the core file and the executable.
>> It's kind of fascinating that that's the case, and if so, it would sound
>> like a nasty bug in either the core format or in the Linux kernel for
>> producing such cores with which we have to apply heuristics.
>
> The core file shows a "FileSiz" of 0 for cases when we shouldn't
> look first at the core file.

What I'm saying is that this "look first in the core" algorithm seems
like a heuristic.  As in, maybe it only fails in some corner cases,
but still, it's a heuristic that can fail sometimes.  And that to me
indicates a flaw in the design somewhere.  Maybe it's all we can do
with the available data, but I'd still call it a flaw.

Here's an example of the heuristic failing.  Say, a program uses
mmap(MAP_FIXED) to inject a new mapping at an address range that
overlaps some file-backed mappings, and then never touches that new
mapping's addresses.  In this scenario we end up with a core with a
FileSiz==0 section for the never-written-to mapping, and thus when
reading from that mapping's addresses, GDB reads the data from the
executable, while instead we should have read the data as 0, because
that's what you had at runtime.

This is actually easy to reproduce.  Like so:

$ cat mmap.c
#include <sys/mman.h>
#include <stdlib.h>

int
main ()
{
  abort ();
  return 0;
}

$ gcc mmap.c -o mmap -g3 -O0
$ gdb -q ~/tmp/mmap
Reading symbols from /home/pedro/tmp/mmap...
(gdb) start
Temporary breakpoint 1 at 0x4004db: file mmap.c, line 7.
Starting program: /home/pedro/tmp/mmap

Temporary breakpoint 1, main () at mmap.c:7
7         abort ();
(gdb) info proc mappings
process 4523
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x401000     0x1000        0x0 /home/pedro/tmp/mmap
            0x600000           0x601000     0x1000        0x0 /home/pedro/tmp/mmap
            0x601000           0x602000     0x1000     0x1000 /home/pedro/tmp/mmap
      0x7ffff7a22000     0x7ffff7bcf000   0x1ad000        0x0 /usr/lib64/libc-2.26.so
      0x7ffff7bcf000     0x7ffff7dce000   0x1ff000   0x1ad000 /usr/lib64/libc-2.26.so
      0x7ffff7dce000     0x7ffff7dd2000     0x4000   0x1ac000 /usr/lib64/libc-2.26.so
      0x7ffff7dd2000     0x7ffff7dd4000     0x2000   0x1b0000 /usr/lib64/libc-2.26.so
      0x7ffff7dd4000     0x7ffff7dd8000     0x4000        0x0
      0x7ffff7dd8000     0x7ffff7dfd000    0x25000        0x0 /usr/lib64/ld-2.26.so
      0x7ffff7fb8000     0x7ffff7fba000     0x2000        0x0
      0x7ffff7ff7000     0x7ffff7ffa000     0x3000        0x0 [vvar]
      0x7ffff7ffa000     0x7ffff7ffc000     0x2000        0x0 [vdso]
      0x7ffff7ffc000     0x7ffff7ffd000     0x1000    0x24000 /usr/lib64/ld-2.26.so
      0x7ffff7ffd000     0x7ffff7ffe000     0x1000    0x25000 /usr/lib64/ld-2.26.so
      0x7ffff7ffe000     0x7ffff7fff000     0x1000        0x0
      0x7ffffffdd000     0x7ffffffff000    0x22000        0x0 [stack]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0 [vsyscall]
(gdb) p mmap (0x400000, getpagesize (), 0, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, 0, 0)
$1 = (void *) 0x400000
(gdb) set code-cache off
(gdb) disassemble
Dump of assembler code for function main:
   0x00000000004004d7 <+0>:     add    %al,(%rax)
   0x00000000004004d9 <+2>:     add    %al,(%rax)
=> 0x00000000004004db <+4>:     add    %al,(%rax)
   0x00000000004004dd <+6>:     add    %al,(%rax)
   0x00000000004004df <+8>:     add    %al,(%rax)
End of assembler dump.
(gdb)
(gdb) detach
Detaching from program: /home/pedro/tmp/mmap, process 4523
[Inferior 1 (process 4523) detached]
(gdb) q

The process crashed at this point and the kernel generated a core
dump.
Loading the resulting core into gdb (this is a GDB with your v2
patches), we see:

$ gdb -q ~/tmp/mmap -c ./core.4523
Reading symbols from /home/pedro/tmp/mmap...
[New LWP 4523]
Core was generated by `/home/pedro/tmp/mmap'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  main () at mmap.c:7
7         abort ();
(gdb) set code-cache off
(gdb) disassemble
Dump of assembler code for function main:
   0x00000000004004d7 <+0>:     push   %rbp
   0x00000000004004d8 <+1>:     mov    %rsp,%rbp
=> 0x00000000004004db <+4>:     callq  0x4003f0 <abort@plt>
End of assembler dump.
(gdb)

Disassembling is still showing the .text read from the executable,
instead of zeroes (i.e., "add %al,(%rax)"), which is incorrect.

If you debug that gdb, you can see that gdb does not find a section
with contents for that address, and then tries reading from the
executable, which finds the data.  The next step of reading from
sections with no contents would succeed, since there's a matching
section in the core, but we never get there.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 19+ messages in thread
* [PATCH 1/4] Remove hack for GDB which sets the section size to 0
  2020-03-05  0:43 [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner
  2020-03-05  0:43 ` [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections Kevin Buettner
@ 2020-03-05  0:43 ` Kevin Buettner
  2020-03-18 16:28   ` Kevin Buettner
  2020-05-04 18:09   ` Jan Kratochvil
  2020-03-05  0:43 ` [PATCH 2/4] Add function for partitioning/splitting a section table Kevin Buettner
  ` (2 subsequent siblings)
  4 siblings, 2 replies; 19+ messages in thread
From: Kevin Buettner @ 2020-03-05  0:43 UTC (permalink / raw)
To: gdb-patches; +Cc: Kevin Buettner

[Note: This patch will require approval from binutils maintainers.
I'll send this patch to that list separately, but include it in this
series for completeness.]

This commit removes a hack for GDB which was introduced in 2007.
See:

    https://sourceware.org/ml/binutils/2007-08/msg00044.html

That hack mostly allowed GDB's handling of core files to continue to
work without any changes to GDB.

The problem with setting the section size to zero is that GDB won't
know how big that section is/was.  Often, this doesn't matter because
the data in question are found in the exec file.  But it can happen
that the section describes memory that had been allocated, but never
written to.  In this instance, the contents of that memory region are
not written to the core file.  Also, since the region in question was
dynamically allocated, it won't appear in the exec file.  We don't
want these regions to appear as inaccessible to GDB (since they *were*
accessible when the process was live), so it's important that GDB know
the size of the region.

I've made changes to GDB which correctly handle this case.  When
attempting to access memory, GDB will first consider core file data
for which both SEC_ALLOC and SEC_HAS_CONTENTS are set.  Next, if that
fails, GDB will attempt to find the data in the exec file.
Finally, if that also fails, GDB will attempt to access memory in the
sections which are flagged as SEC_ALLOC, but not SEC_HAS_CONTENTS.

bfd/ChangeLog:

	* elf.c (_bfd_elf_make_section_from_phdr): Remove hack for GDB.

Change-Id: I7cce707aa3c217addbc27589730a77620199843f
---
 bfd/elf.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/bfd/elf.c b/bfd/elf.c
index c4d6718aaa..89c61acc40 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -3007,14 +3007,6 @@ _bfd_elf_make_section_from_phdr (bfd *abfd,
   newsect->alignment_power = bfd_log2 (align);
   if (hdr->p_type == PT_LOAD)
     {
-      /* Hack for gdb.  Segments that have not been modified do
-	 not have their contents written to a core file, on the
-	 assumption that a debugger can find the contents in the
-	 executable.  We flag this case by setting the fake
-	 section size to zero.  Note that "real" bss sections will
-	 always have their contents dumped to the core file.  */
-      if (bfd_get_format (abfd) == bfd_core)
-	newsect->size = 0;
       newsect->flags |= SEC_ALLOC;
       if (hdr->p_flags & PF_X)
	newsect->flags |= SEC_CODE;
--
2.24.1

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH 1/4] Remove hack for GDB which sets the section size to 0 2020-03-05 0:43 ` [PATCH 1/4] Remove hack for GDB which sets the section size to 0 Kevin Buettner @ 2020-03-18 16:28 ` Kevin Buettner 2020-05-04 18:09 ` Jan Kratochvil 1 sibling, 0 replies; 19+ messages in thread From: Kevin Buettner @ 2020-03-18 16:28 UTC (permalink / raw) To: gdb-patches FYI, Nick Clifton has approved this patch. On Wed, 4 Mar 2020 17:42:40 -0700 Kevin Buettner <kevinb@redhat.com> wrote: > [Note: This patch will require approval from binutils maintainers. > I'll send this patch to that list separately, but include it in this > series for completeness.] > > This commit removes a hack for GDB which was introduced in 2007. > See: > > https://sourceware.org/ml/binutils/2007-08/msg00044.html > > That hack mostly allowed GDB's handling of core files to continue to > work without any changes to GDB. > > The problem with setting the section size to zero is that GDB won't > know how big that section is/was. Often, this doesn't matter because > the data in question are found in the exec file. But it can happen > that the section describes memory that had been allocated, but never > written to. In this instance, the contents of that memory region are > not written to the core file. Also, since the region in question was > dynamically allocated, it won't appear in the exec file. We don't > want these regions to appear as inaccessible to GDB (since they *were* > accessible when the process was live), so it's important that GDB know > the size of the region. > > I've made changes to GDB which correctly handles this case. When > attempting to access memory, GDB will first consider core file data > for which both SEC_ALLOC and SEC_HAS_CONTENTS is set. Next, if that > fails, GDB will attempt to find the data in the exec file. Finally, > if that also fails, GDB will attempt to access memory in the sections > which are flagged as SEC_ALLOC, but not SEC_HAS_CONTENTS. 
> > bfd/ChangeLog: > > * elf.c (_bfd_elf_make_section_from_phdr): Remove hack for GDB. > > Change-Id: I7cce707aa3c217addbc27589730a77620199843f > --- > bfd/elf.c | 8 -------- > 1 file changed, 8 deletions(-) > > diff --git a/bfd/elf.c b/bfd/elf.c > index c4d6718aaa..89c61acc40 100644 > --- a/bfd/elf.c > +++ b/bfd/elf.c > @@ -3007,14 +3007,6 @@ _bfd_elf_make_section_from_phdr (bfd *abfd, > newsect->alignment_power = bfd_log2 (align); > if (hdr->p_type == PT_LOAD) > { > - /* Hack for gdb. Segments that have not been modified do > - not have their contents written to a core file, on the > - assumption that a debugger can find the contents in the > - executable. We flag this case by setting the fake > - section size to zero. Note that "real" bss sections will > - always have their contents dumped to the core file. */ > - if (bfd_get_format (abfd) == bfd_core) > - newsect->size = 0; > newsect->flags |= SEC_ALLOC; > if (hdr->p_flags & PF_X) > newsect->flags |= SEC_CODE; > -- > 2.24.1 > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/4] Remove hack for GDB which sets the section size to 0 2020-03-05 0:43 ` [PATCH 1/4] Remove hack for GDB which sets the section size to 0 Kevin Buettner 2020-03-18 16:28 ` Kevin Buettner @ 2020-05-04 18:09 ` Jan Kratochvil 1 sibling, 0 replies; 19+ messages in thread From: Jan Kratochvil @ 2020-05-04 18:09 UTC (permalink / raw) To: Kevin Buettner; +Cc: gdb-patches On Thu, 05 Mar 2020 01:42:40 +0100, Kevin Buettner wrote: > This commit removes a hack for GDB which was introduced in 2007. > See: > > https://sourceware.org/ml/binutils/2007-08/msg00044.html > > That hack mostly allowed GDB's handling of core files to continue to > work without any changes to GDB. I believe this hack is no longer needed since GDB parses /proc/PID/smaps: Implement support for checking /proc/PID/coredump_filter https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=df8411da087dc05481926f4c4a82deabc5bc3859 Jan ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 2/4] Add function for partitioning/splitting a section table
  2020-03-05  0:43 [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner
  2020-03-05  0:43 ` [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections Kevin Buettner
  2020-03-05  0:43 ` [PATCH 1/4] Remove hack for GDB which sets the section size to 0 Kevin Buettner
@ 2020-03-05  0:43 ` Kevin Buettner
  2020-03-25 17:25   ` Keith Seitz
  2020-03-05  0:43 ` [PATCH 4/4] Test ability to access unwritten-to mmap data in core file Kevin Buettner
  2020-03-18 16:29 ` [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner
  4 siblings, 1 reply; 19+ messages in thread
From: Kevin Buettner @ 2020-03-05  0:43 UTC (permalink / raw)
To: gdb-patches; +Cc: Kevin Buettner

gdb/ChangeLog:

	* exec.h, exec.c (split_section_table): New function.

Change-Id: I8909174aed892727bdd6d704cfe43763ee8969c3
---
 gdb/exec.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 gdb/exec.h | 13 +++++++++++++
 2 files changed, 68 insertions(+)

diff --git a/gdb/exec.c b/gdb/exec.c
index 68bca1be17..e8cd471f5f 100644
--- a/gdb/exec.c
+++ b/gdb/exec.c
@@ -604,6 +604,61 @@ resize_section_table (struct target_section_table *table, int adjustment)
   return old_count;
 }

+/* See exec.h.  */
+
+void
+split_section_table (struct target_section_table *orig,
+		     struct target_section_table *splits,
+		     bool (*criterion) (struct target_section *))
+{
+  int count = 0;
+
+  for (struct target_section *sect = orig->sections;
+       sect < orig->sections_end;
+       sect++)
+    {
+      if (criterion (sect))
+	count++;
+    }
+
+  /* Handle the case of an empty SPLITS table.  */
+  if (count == 0)
+    {
+      splits->sections = nullptr;
+      splits->sections_end = nullptr;
+      return;
+    }
+
+  /* Handle case of an empty ORIG table.  */
+  if (count == orig->sections_end - orig->sections)
+    {
+      splits = orig;
+      orig->sections = nullptr;
+      orig->sections_end = nullptr;
+      return;
+    }
+
+  /* Okay, we actually have something to do.  Allocate a new table for
+     SPLITS.  We'll resize ORIG after copying.  */
+
+  splits->sections = XNEWVEC (struct target_section, count);
+  splits->sections_end = splits->sections + count;
+
+  for (struct target_section *sect = orig->sections,
+	 *osect = sect,
+	 *ssect = splits->sections;
+       sect < orig->sections_end;
+       sect++)
+    {
+      if (criterion (sect))
+	*ssect++ = *sect;
+      else
+	*osect++ = *sect;
+    }
+
+  resize_section_table (orig, -count);
+}
+
 /* Builds a section table, given args BFD, SECTABLE_PTR, SECEND_PTR.
    Returns 0 if OK, 1 on error.  */

diff --git a/gdb/exec.h b/gdb/exec.h
index 54e6ff4d9b..fc45e11898 100644
--- a/gdb/exec.h
+++ b/gdb/exec.h
@@ -44,6 +44,19 @@ extern int build_section_table (struct bfd *, struct target_section **,

 extern void clear_section_table (struct target_section_table *table);

+/* Split a target section table ORIG into two parts.  Sections meeting
+   CRITERION will be placed into SPLITS.  The original table ORIG will
+   be adjusted and resized as necessary, removing those sections which
+   were placed into SPLITS.  If no sections in ORIG satisfy CRITERION,
+   sections and sections_end in SPLITS will be set to nullptr,
+   representing an empty table.  Likewise, if all sections in ORIG
+   satisfy CRITERION, ORIG will be set to an empty table and SPLITS
+   end up with all of the sections originally in ORIG.  */
+
+extern void split_section_table (struct target_section_table *orig,
+				 struct target_section_table *splits,
+				 bool (*criterion) (struct target_section *));
+
 /* The current inferior is a child vforked and its program space is
    shared with its parent.  This pushes the exec target on the
    current/child inferior's target stack if there are sections in the
--
2.24.1

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] Add function for partitioning/splitting a section table 2020-03-05 0:43 ` [PATCH 2/4] Add function for partitioning/splitting a section table Kevin Buettner @ 2020-03-25 17:25 ` Keith Seitz 0 siblings, 0 replies; 19+ messages in thread From: Keith Seitz @ 2020-03-25 17:25 UTC (permalink / raw) To: Kevin Buettner, gdb-patches Hi, Kevin, Thank you for looking into this bug, which I have some familiarity with. Like you, the BFD hack made me a little nervous. I am relieved that I was apparently barking up the right proverbial tree. Comments/(rhetorical) questions inline. On 3/4/20 4:42 PM, Kevin Buettner wrote: > gdb/ChangeLog: > * exec.h, exec.c (split_section_table): New function. Mention "PR corefiles/25631"? > gdb/exec.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ > gdb/exec.h | 13 +++++++++++++ > 2 files changed, 68 insertions(+) > > diff --git a/gdb/exec.c b/gdb/exec.c > index 68bca1be17..e8cd471f5f 100644 > --- a/gdb/exec.c > +++ b/gdb/exec.c > @@ -604,6 +604,61 @@ resize_section_table (struct target_section_table *table, int adjustment) > return old_count; > } > > +/* See exec.h. */ > + > +void > +split_section_table (struct target_section_table *orig, > + struct target_section_table *splits, > + bool (*criterion) (struct target_section *)) > +{ > + int count = 0; > + > + for (struct target_section *sect = orig->sections; > + sect < orig->sections_end; > + sect++) > + { > + if (criterion (sect)) > + count++; > + } > + I admit, I am hesitant about looping over this `criterion' evaluation twice (once here, once below), especially given BFD's propensity to turn everything into a section, but I realize you're working with some pre-existing APIs that would need rewriting. [Rewriting, e.g., build_section_table doesn't seem so important in the grand scheme of things.] Still, I wonder if the double-looping might be avoidable using vectors (partially at least) or XRESIZEVEC? > + /* Handle the case of an empty SPLITS table. 
*/ > + if (count == 0) > + { > + splits->sections = nullptr; > + splits->sections_end = nullptr; > + return; > + } > + > + /* Handle case of an empty ORIG table. */ > + if (count == orig->sections_end - orig->sections) > + { > + splits = orig; ^^^^^^^^^^^^^^ Typo? > + orig->sections = nullptr; > + orig->sections_end = nullptr; > + return; > + } > + > + /* Okay, we actually have something to do. Allocate a new table for > + SPLITS. We'll resize ORIG after copying. */ > + > + splits->sections = XNEWVEC (struct target_section, count); > + splits->sections_end = splits->sections + count; > + > + for (struct target_section *sect = orig->sections, > + *osect = sect, > + *ssect = splits->sections; > + sect < orig->sections_end; > + sect++) > + { > + if (criterion (sect)) > + *ssect++ = *sect; > + else > + *osect++ = *sect; > + } > + > + resize_section_table (orig, -count); > +} > + > /* Builds a section table, given args BFD, SECTABLE_PTR, SECEND_PTR. > Returns 0 if OK, 1 on error. */ Rambling notwithstanding, LGTM. Thanks again for picking this up. Keith ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 4/4] Test ability to access unwritten-to mmap data in core file
  2020-03-05  0:43 [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner
                   ` (2 preceding siblings ...)
  2020-03-05  0:43 ` [PATCH 2/4] Add function for partitioning/splitting a section table Kevin Buettner
@ 2020-03-05  0:43 ` Kevin Buettner
  2020-03-25 17:25   ` Keith Seitz
  2020-03-18 16:29 ` [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner
  4 siblings, 1 reply; 19+ messages in thread
From: Kevin Buettner @ 2020-03-05  0:43 UTC (permalink / raw)
To: gdb-patches; +Cc: Kevin Buettner

gdb/testsuite/ChangeLog:

	* gdb.base/corefile.exp (accessing anonymous, unwritten-to mmap data):
	New test.
	* gdb.base/coremaker.c (buf3): New global.
	(mmapdata): Add mmap call which uses MAP_ANONYMOUS and MAP_PRIVATE
	flags.

Change-Id: Ifb8d77b06050e1220f33f83f1bec27533e4d9ead
---
 gdb/testsuite/gdb.base/corefile.exp |  6 ++++++
 gdb/testsuite/gdb.base/coremaker.c  | 10 ++++++++++
 2 files changed, 16 insertions(+)

diff --git a/gdb/testsuite/gdb.base/corefile.exp b/gdb/testsuite/gdb.base/corefile.exp
index 34b903b350..d46b38704c 100644
--- a/gdb/testsuite/gdb.base/corefile.exp
+++ b/gdb/testsuite/gdb.base/corefile.exp
@@ -175,6 +175,12 @@ gdb_test_multiple "x/8bd buf2" "$test" {
     }
 }

+# Test ability to read anonymous and, more importantly, unwritten-to
+# mmap'd data.
+
+gdb_test "x/wx buf3" "$hex:\[ \t\]+0x00000000" \
+	"accessing anonymous, unwritten-to mmap data"
+
 # test reinit_frame_cache

 gdb_load ${binfile}
diff --git a/gdb/testsuite/gdb.base/coremaker.c b/gdb/testsuite/gdb.base/coremaker.c
index 55330fd3e8..3a01c16405 100644
--- a/gdb/testsuite/gdb.base/coremaker.c
+++ b/gdb/testsuite/gdb.base/coremaker.c
@@ -38,6 +38,7 @@

 char *buf1;
 char *buf2;
+char *buf3;

 int coremaker_data = 1;	/* In Data section */
 int coremaker_bss;	/* In BSS section */
@@ -98,6 +99,15 @@ mmapdata ()
     }
   /* Touch buf2 so kernel writes it out into 'core'.  */
   buf2[0] = buf1[0];
+
+  /* Create yet another region which is allocated, but not written to.  */
+  buf3 = mmap (NULL, MAPSIZE, PROT_READ | PROT_WRITE,
+	       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+  if (buf3 == (char *) -1)
+    {
+      perror ("mmap failed");
+      return;
+    }
 }

 void
--
2.24.1

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] Test ability to access unwritten-to mmap data in core file 2020-03-05 0:43 ` [PATCH 4/4] Test ability to access unwritten-to mmap data in core file Kevin Buettner @ 2020-03-25 17:25 ` Keith Seitz 0 siblings, 0 replies; 19+ messages in thread From: Keith Seitz @ 2020-03-25 17:25 UTC (permalink / raw) To: Kevin Buettner, gdb-patches Hi, Kevin, Thank you for the test! I know my first thoughts of writing a test for this culminated in some head-scratching. On 3/4/20 4:42 PM, Kevin Buettner wrote: > gdb/testsuite/ChangeLog: > > * gdb.base/corefile.exp (accessing anonymous, unwritten-to mmap data): > New test. > * gdb.base/coremaker.c (buf3): New global. > (mmapdata): Add mmap call which uses MAP_ANONYMOUSE and MAP_PRIVATE > flags. The only comment I have (other than LGTM) is to mention the PR in the ChangeLog entry (again). Keith ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/4] Fix BZ 25631 - core file memory access problem
  2020-03-05  0:43 [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner
                     ` (3 preceding siblings ...)
  2020-03-05  0:43 ` [PATCH 4/4] Test ability to access unwritten-to mmap data in core file Kevin Buettner
@ 2020-03-18 16:29 ` Kevin Buettner
  4 siblings, 0 replies; 19+ messages in thread
From: Kevin Buettner @ 2020-03-18 16:29 UTC (permalink / raw)
  To: gdb-patches

Ping.

On Wed, 4 Mar 2020 17:42:39 -0700
Kevin Buettner <kevinb@redhat.com> wrote:

> This series fixes a bug with accessing memory from (described by) core
> files.
>
> The bug can be viewed here...
>
>   https://sourceware.org/bugzilla/show_bug.cgi?id=25631
>
> ...though I also provide similar details in part 3 of this series.
>
> Kevin Buettner (4):
>   Remove hack for GDB which sets the section size to 0
>   Add function for partitioning/splitting a section table
>   Provide access to non SEC_HAS_CONTENTS core file sections
>   Test ability to access unwritten-to mmap data in core file
>
>  bfd/elf.c                           |  8 ---
>  gdb/corelow.c                       | 80 ++++++++++++++++++++++++++---
>  gdb/exec.c                          | 55 ++++++++++++++++++++
>  gdb/exec.h                          | 13 +++++
>  gdb/testsuite/gdb.base/corefile.exp |  6 +++
>  gdb/testsuite/gdb.base/coremaker.c  | 10 ++++
>  6 files changed, 157 insertions(+), 15 deletions(-)
>
> --
> 2.24.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread
end of thread, other threads:[~2020-05-20 16:04 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-05  0:43 [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner
2020-03-05  0:43 ` [PATCH 3/4] Provide access to non SEC_HAS_CONTENTS core file sections Kevin Buettner
2020-03-25 17:25   ` Keith Seitz
2020-03-29 13:18   ` Pedro Alves
2020-05-03  7:21     ` Kevin Buettner
2020-05-03 11:07       ` H.J. Lu
2020-05-03 19:06         ` Kevin Buettner
2020-05-12  8:40       ` Kevin Buettner
2020-05-13 17:44         ` Kevin Buettner
2020-05-20 12:05           ` Pedro Alves
2020-05-20 16:04             ` Pedro Alves
2020-03-05  0:43 ` [PATCH 1/4] Remove hack for GDB which sets the section size to 0 Kevin Buettner
2020-03-18 16:28   ` Kevin Buettner
2020-05-04 18:09     ` Jan Kratochvil
2020-03-05  0:43 ` [PATCH 2/4] Add function for partitioning/splitting a section table Kevin Buettner
2020-03-25 17:25   ` Keith Seitz
2020-03-05  0:43 ` [PATCH 4/4] Test ability to access unwritten-to mmap data in core file Kevin Buettner
2020-03-25 17:25   ` Keith Seitz
2020-03-18 16:29 ` [PATCH 0/4] Fix BZ 25631 - core file memory access problem Kevin Buettner