* [PATCH] ld: ensure build-id is placed near ELF headers @ 2024-08-12 14:21 Andrew Burgess 2024-08-19 11:01 ` Nick Clifton 0 siblings, 1 reply; 17+ messages in thread From: Andrew Burgess @ 2024-08-12 14:21 UTC (permalink / raw) To: binutils; +Cc: Andrew Burgess After the commit: commit e8e10743f7b207b21a1efb0cc9e42487080db013 Date: Thu Jun 13 15:10:15 2024 +0100 Add --rosegment option to BFD linker to stop the '-z separate-code' from generating two read-only segments. When an executable is built with the --rosegment option GDB is no longer able to find the build-id of the executable from a core file. The problem is that GDB depends on what I guess was a lucky coincidence; that the build-id was placed close to the ELF headers. When a Linux kernel produces a core file there is specific support in place so that the kernel will emit the first page of any private file mapping that looks like it might contain ELF headers. If the build-id is placed close to the ELF headers then it will be included in this first page and GDB can find it. Using the build-id GDB can then validate that it has the correct executable loaded to debug the core file, or in some cases GDB can use the build-id to find the executable that matches the core file automatically. The Linux kernel support for this feature has existed since 2007, see this (kernel) commit: commit 82df39738ba9e02c057fa99b7461a56117d36119 Date: Tue Oct 16 23:27:02 2007 -0700 Add MMF_DUMP_ELF_HEADERS I ran into this issue while testing on the latest Fedora Rawhide build, currently the linker on this system enables --rosegment by default and as a result GDB is not able to find the build-id for any core file. What's happening is that with --rosegment the linker merges the .note.gnu.build-id sections with all the other data content and places this after the executable content. Without the --rosegment the .note.gnu.build-id section is placed pretty much first before the executable and data sections. As a consequence, without --rosegment the build-id will usually be on the same page as the ELF headers while with --rosegment on the executable content is first and this often fills the rest of the page after the ELF headers. This patch aims to fix this by placing the build-id first. The build-id will then be included within the same LOAD-able segment as the executable content, just as the ELF headers are. With this patch in place GDB is once again able to find the build-id from a core file. --- ld/scripttempl/elf.sc | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/ld/scripttempl/elf.sc b/ld/scripttempl/elf.sc index 54716110b61..17184037753 100644 --- a/ld/scripttempl/elf.sc +++ b/ld/scripttempl/elf.sc @@ -425,7 +425,6 @@ emit_early_ro() { cat <<EOF ${INITIAL_READONLY_SECTIONS} - .note.gnu.build-id ${RELOCATING-0}: { *(.note.gnu.build-id) } EOF } @@ -436,6 +435,11 @@ cat <<EOF ${CREATE_SHLIB-${CREATE_PIE-${RELOCATING+PROVIDE (__executable_start = ${TEXT_START_ADDR}); . = ${TEXT_BASE_ADDRESS};}}} ${CREATE_SHLIB+${RELOCATING+. = ${SHLIB_TEXT_START_ADDR}${SIZEOF_HEADERS_CODE};}} ${CREATE_PIE+${RELOCATING+PROVIDE (__executable_start = ${SHLIB_TEXT_START_ADDR}); . = ${SHLIB_TEXT_START_ADDR}${SIZEOF_HEADERS_CODE};}} + + /* Place build-id as close to the ELF headers as possible. This + maximises the chance the build-id will be present in core files, + which GDB can use to improve the debug experience. */ + .note.gnu.build-id ${RELOCATING-0}: { *(.note.gnu.build-id) } EOF } base-commit: 49e607f511c1fab82a0116990a72d1915c74bb4a -- 2.25.4 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-08-12 14:21 [PATCH] ld: ensure build-id is placed near ELF headers Andrew Burgess @ 2024-08-19 11:01 ` Nick Clifton 2024-08-21 10:57 ` Andrew Burgess ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Nick Clifton @ 2024-08-19 11:01 UTC (permalink / raw) To: Andrew Burgess, binutils Hi Andrew, [Sorry for being so late in responding to your email] > When an executable is built with the --rosegment option GDB is no > longer able to find the build-id of the executable from a core file. For problems like this it really helps if you create a bug report using the Sourceware bugzilla system. This allows us to track the problem and any discussion surrounding it. It also means that we can refer to the bug ID in comments in the code. In order to save you time however, I have gone ahead and created a bug report for you: https://sourceware.org/bugzilla/show_bug.cgi?id=32100 > This patch aims to fix this by placing the build-id first. The > build-id will then be included within the same LOAD-able segment as > the executable content, just as the ELF headers are. With this patch > in place GDB is once again able to find the build-id from a core > file. Unfortunately that patch breaks the intention of the --rosegment option by restoring two loadable, read-only segments to the executable. Here is an example: $ cat hello.c extern int printf (const char *, ...); int i = 42; const int * j = & i; int main (void) { return printf ("hello world %d\n", * j); } $ gcc -fPIC -Wl,-z,noseparate-code hello.c $ readelf -lW a.out | grep LOAD LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006d4 0x0006d4 R E 0x1000 LOAD 0x000df8 0x0000000000401df8 0x0000000000401df8 0x000228 0x000230 RW 0x1000 $ gcc -fPIX -Wl,-z,separate-code hello.c $ readelf -lW a.out | grep LOAD LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000510 0x000510 R 0x1000 LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000 LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0000dc 0x0000dc R 0x1000 LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000 $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c $ readelf -lW a.out | grep LOAD LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00115d 0x00115d R E 0x1000 LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0002d4 0x0002d4 R 0x1000 LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c -fuse-ld=patched-linker $ readelf -lW a.out | grep LOAD LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000 LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00015d 0x00015d R E 0x1000 LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000 LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 (I invented the "-fuse-ld=patched-linker" option to keep the presentation simple. In reality I used a full path to a linker built with your proposed patch applied). The section to segment mapping is changed by using --rosegment, but with your patch applied we still get an early read-only segment containing the note sections. I think that what is needed is a patch to move all of the read-only segments to before the read-execute segment. But this weill need testing. Cheers Nick ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-08-19 11:01 ` Nick Clifton @ 2024-08-21 10:57 ` Andrew Burgess 2024-08-21 20:45 ` Fangrui Song 2024-09-19 15:51 ` Nick Clifton 2024-09-02 10:26 ` Andrew Burgess 2024-09-19 20:53 ` Sam James 2 siblings, 2 replies; 17+ messages in thread From: Andrew Burgess @ 2024-08-21 10:57 UTC (permalink / raw) To: Nick Clifton, binutils Nick Clifton <nickc@redhat.com> writes: > Hi Andrew, > > [Sorry for being so late in responding to your email] > >> When an executable is built with the --rosegment option GDB is no >> longer able to find the build-id of the executable from a core file. > > For problems like this it really helps if you create a bug report > using the Sourceware bugzilla system. This allows us to track the > problem and any discussion surrounding it. It also means that we > can refer to the bug ID in comments in the code. > > In order to save you time however, I have gone ahead and created > a bug report for you: > > https://sourceware.org/bugzilla/show_bug.cgi?id=32100 > >> This patch aims to fix this by placing the build-id first. The >> build-id will then be included within the same LOAD-able segment as >> the executable content, just as the ELF headers are. With this patch >> in place GDB is once again able to find the build-id from a core >> file. > > Unfortunately that patch breaks the intention of the --rosegment > option by restoring two loadable, read-only segments to the executable. > > Here is an example: > > $ cat hello.c > > extern int printf (const char *, ...); > int i = 42; > const int * j = & i; > int main (void) { return printf ("hello world %d\n", * j); } > > $ gcc -fPIC -Wl,-z,noseparate-code hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006d4 0x0006d4 R E 0x1000 > LOAD 0x000df8 0x0000000000401df8 0x0000000000401df8 0x000228 0x000230 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000510 0x000510 R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0000dc 0x0000dc R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00115d 0x00115d R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0002d4 0x0002d4 R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c -fuse-ld=patched-linker > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00015d 0x00015d R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > (I invented the "-fuse-ld=patched-linker" option to keep the presentation simple. In > reality I used a full path to a linker built with your proposed patch applied). > > The section to segment mapping is changed by using --rosegment, but with your patch > applied we still get an early read-only segment containing the note sections. > > I think that what is needed is a patch to move all of the read-only segments to > before the read-execute segment. But this weill need testing. Thanks for taking the time to test this patch. Sorry that this failed. I had thought I'd seen different results, but retesting I can reproduce what you see, so clearly I was doing something wrong before[1]. I do have one question which you might be able to enlighten me on though: the initial loadable segment contains both the code, but also the ELF headers, right? Which are clearly not executable content. So it would seem that it is OK to have a limited set of data mapped at the start of the code segment. Clearly we don't want _all_ the data placed there as that would make the separate segments pointless. But something that is not usually read by the program itself (e.g. ELF headers, or the build-id) seemed OK. My next question then would be, I guess the linker is creating a new segment because the permissions of the build-id section doesn't match that of the code section. Instead of moving all the sections, can the linker be told/convinced to merge the build-id into the start of the code? I'm just trying to understand more about how this all works. Thanks, Andrew [1] I'm pretty sure I figured out what I was doing wrong. But it was a stupid error my side. > > Cheers > Nick ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-08-21 10:57 ` Andrew Burgess @ 2024-08-21 20:45 ` Fangrui Song 2024-09-19 15:51 ` Nick Clifton 1 sibling, 0 replies; 17+ messages in thread From: Fangrui Song @ 2024-08-21 20:45 UTC (permalink / raw) To: Andrew Burgess; +Cc: Nick Clifton, binutils On Wed, Aug 21, 2024 at 3:57 AM Andrew Burgess <aburgess@redhat.com> wrote: > > Nick Clifton <nickc@redhat.com> writes: > > > Hi Andrew, > > > > [Sorry for being so late in responding to your email] > > > >> When an executable is built with the --rosegment option GDB is no > >> longer able to find the build-id of the executable from a core file. > > > > For problems like this it really helps if you create a bug report > > using the Sourceware bugzilla system. This allows us to track the > > problem and any discussion surrounding it. It also means that we > > can refer to the bug ID in comments in the code. > > > > In order to save you time however, I have gone ahead and created > > a bug report for you: > > > > https://sourceware.org/bugzilla/show_bug.cgi?id=32100 > > > >> This patch aims to fix this by placing the build-id first. The > >> build-id will then be included within the same LOAD-able segment as > >> the executable content, just as the ELF headers are. With this patch > >> in place GDB is once again able to find the build-id from a core > >> file. > > > > Unfortunately that patch breaks the intention of the --rosegment > > option by restoring two loadable, read-only segments to the executable. > > > > Here is an example: > > > > $ cat hello.c > > > > extern int printf (const char *, ...); > > int i = 42; > > const int * j = & i; > > int main (void) { return printf ("hello world %d\n", * j); } > > > > $ gcc -fPIC -Wl,-z,noseparate-code hello.c > > $ readelf -lW a.out | grep LOAD > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006d4 0x0006d4 R E 0x1000 > > LOAD 0x000df8 0x0000000000401df8 0x0000000000401df8 0x000228 0x000230 RW 0x1000 > > > > $ gcc -fPIX -Wl,-z,separate-code hello.c > > $ readelf -lW a.out | grep LOAD > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000510 0x000510 R 0x1000 > > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000 > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0000dc 0x0000dc R 0x1000 > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000 > > > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c > > $ readelf -lW a.out | grep LOAD > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00115d 0x00115d R E 0x1000 > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0002d4 0x0002d4 R 0x1000 > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c -fuse-ld=patched-linker > > $ readelf -lW a.out | grep LOAD > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000 > > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00015d 0x00015d R E 0x1000 > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000 > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > > > (I invented the "-fuse-ld=patched-linker" option to keep the presentation simple. In > > reality I used a full path to a linker built with your proposed patch applied). > > > > The section to segment mapping is changed by using --rosegment, but with your patch > > applied we still get an early read-only segment containing the note sections. > > > > I think that what is needed is a patch to move all of the read-only segments to > > before the read-execute segment. But this weill need testing. > > Thanks for taking the time to test this patch. Sorry that this failed. > I had thought I'd seen different results, but retesting I can reproduce > what you see, so clearly I was doing something wrong before[1]. > > I do have one question which you might be able to enlighten me on > though: the initial loadable segment contains both the code, but also > the ELF headers, right? Which are clearly not executable content. > > So it would seem that it is OK to have a limited set of data mapped at > the start of the code segment. Clearly we don't want _all_ the data > placed there as that would make the separate segments pointless. But > something that is not usually read by the program itself (e.g. ELF > headers, or the build-id) seemed OK. > > My next question then would be, I guess the linker is creating a new > segment because the permissions of the build-id section doesn't match > that of the code section. Instead of moving all the sections, can the > linker be told/convinced to merge the build-id into the start of the > code? > > I'm just trying to understand more about how this all works. > > Thanks, > Andrew > > > [1] I'm pretty sure I figured out what I was doing wrong. But it was a > stupid error my side. > > > > > Cheers > > Nick Thanks for the .note.build-id use case (kernel core dumper). I think --rosegment needs to place read-only sections entirely before .text https://sourceware.org/pipermail/binutils/2024-June/134541.html > **Proposed Changes for GNU ld:** > > * Add an option to place read-only sections entirely before .text, eliminating the R segment after the RX segment. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-08-21 10:57 ` Andrew Burgess 2024-08-21 20:45 ` Fangrui Song @ 2024-09-19 15:51 ` Nick Clifton 2024-09-19 20:29 ` H.J. Lu 1 sibling, 1 reply; 17+ messages in thread From: Nick Clifton @ 2024-09-19 15:51 UTC (permalink / raw) To: Andrew Burgess, binutils Hi Andrew, I have now checked in the patch. (commit bf6d7087de0a7351fd1dfd5f41522a7f4f576180) In the end I went for a more wordy but also more likely to be correct syntax for the conditional for emitting the text segment alignment code: + # We only need the alignment if we have emitted some sections. + if test -z "${SEPARATE_CODE}"; then + align_text + elif test -z "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then + align_text + fi Cheers Nick ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-19 15:51 ` Nick Clifton @ 2024-09-19 20:29 ` H.J. Lu 0 siblings, 0 replies; 17+ messages in thread From: H.J. Lu @ 2024-09-19 20:29 UTC (permalink / raw) To: Nick Clifton; +Cc: Andrew Burgess, binutils On Thu, Sep 19, 2024 at 11:52 PM Nick Clifton <nickc@redhat.com> wrote: > > Hi Andrew, > > I have now checked in the patch. (commit bf6d7087de0a7351fd1dfd5f41522a7f4f576180) > > In the end I went for a more wordy but also more likely to be correct > syntax for the conditional for emitting the text segment alignment > code: > > + # We only need the alignment if we have emitted some sections. > + if test -z "${SEPARATE_CODE}"; then > + align_text > + elif test -z "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then > + align_text > + fi > > Cheers > Nick > This caused: https://sourceware.org/bugzilla/show_bug.cgi?id=32190 -- H.J. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-08-19 11:01 ` Nick Clifton 2024-08-21 10:57 ` Andrew Burgess @ 2024-09-02 10:26 ` Andrew Burgess 2024-09-10 9:01 ` Nick Clifton 2024-09-19 20:53 ` Sam James 2 siblings, 1 reply; 17+ messages in thread From: Andrew Burgess @ 2024-09-02 10:26 UTC (permalink / raw) To: Nick Clifton, binutils Nick Clifton <nickc@redhat.com> writes: > Hi Andrew, > > [Sorry for being so late in responding to your email] > >> When an executable is built with the --rosegment option GDB is no >> longer able to find the build-id of the executable from a core file. > > For problems like this it really helps if you create a bug report > using the Sourceware bugzilla system. This allows us to track the > problem and any discussion surrounding it. It also means that we > can refer to the bug ID in comments in the code. > > In order to save you time however, I have gone ahead and created > a bug report for you: > > https://sourceware.org/bugzilla/show_bug.cgi?id=32100 > >> This patch aims to fix this by placing the build-id first. The >> build-id will then be included within the same LOAD-able segment as >> the executable content, just as the ELF headers are. With this patch >> in place GDB is once again able to find the build-id from a core >> file. > > Unfortunately that patch breaks the intention of the --rosegment > option by restoring two loadable, read-only segments to the executable. > > Here is an example: > > $ cat hello.c > > extern int printf (const char *, ...); > int i = 42; > const int * j = & i; > int main (void) { return printf ("hello world %d\n", * j); } > > $ gcc -fPIC -Wl,-z,noseparate-code hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006d4 0x0006d4 R E 0x1000 > LOAD 0x000df8 0x0000000000401df8 0x0000000000401df8 0x000228 0x000230 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000510 0x000510 R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0000dc 0x0000dc R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00115d 0x00115d R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0002d4 0x0002d4 R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c -fuse-ld=patched-linker > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00015d 0x00015d R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > (I invented the "-fuse-ld=patched-linker" option to keep the presentation simple. In > reality I used a full path to a linker built with your proposed patch applied). > > The section to segment mapping is changed by using --rosegment, but with your patch > applied we still get an early read-only segment containing the note sections. > > I think that what is needed is a patch to move all of the read-only segments to > before the read-execute segment. But this weill need testing. Hi Nick, First, what you've written above is 100% correct, and the patch I posted did break the "only one r/o segment" goal. However, I think there are some additional, interesting observations to be made from the experiment you performed. I'll discuss these results below. Second, I've read the link that Fangrui Song posted[1], which is yet another voice arguing for r/o content to be placed before the text content. However, the initial suggestion is that this would be opt in, which would still leave GDB's build-id lookup broken by default when --rosegment was used, which is less than ideal. Third, I've read up on bug PR ld/30907[2] which seems to be the motivator for the addition of --rosegment. The initial complaint appears to be about file size rather than the number of segments. It's just that reducing the number of segments helps reduce the file size. The description of --rosegment does mention creating a single r/o segment, which is fine, but, I think, we should keep the original goal (file size) in mind. OK, so the analysis. I created four binaries using the same approach Nick took (this experiment was done on AArch64, not x86-64): $ cat hello.c extern int printf (const char *, ...); int i = 42; const int * j = & i; int main (void) { return printf ("hello world %d\n", * j); } $ cat Makefile CC := gcc LD_NEW := /tmp/binutils-gdb/install.patched/bin LD_OLD := /tmp/binutils-gdb/install.master/bin all: $(CC) -B$(LD_OLD) -fPIC -Wl,-z,noseparate-code hello.c -o hello.nosep-code.x $(CC) -B$(LD_OLD) -fPIC -Wl,-z,separate-code hello.c -o hello.sep-code.x $(CC) -B$(LD_OLD) -fPIC -Wl,-z,separate-code -Wl,--rosegment hello.c -o hello.roseg.x $(CC) -B$(LD_NEW) -fPIC -Wl,-z,separate-code -Wl,--rosegment hello.c -o hello.patched.x sizes: ls -l hello.*.x segments: for F in $$(ls -1 hello.*.x); do echo $$F; readelf -lW $$F | grep LOAD; done After building the binaries using `make`, I then looked at the results: $ make segments for F in $(ls -1 hello.*.x); do echo $F; readelf -lW $F | grep LOAD; done hello.nosep-code.x LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00083c 0x00083c R E 0x10000 LOAD 0x00fde0 0x000000000041fde0 0x000000000041fde0 0x000250 0x000258 RW 0x10000 hello.patched.x LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0002ec 0x0002ec R 0x10000 LOAD 0x010000 0x0000000000410000 0x0000000000410000 0x0001f4 0x0001f4 R E 0x10000 LOAD 0x020000 0x0000000000420000 0x0000000000420000 0x00039c 0x00039c R 0x10000 LOAD 0x02fde0 0x000000000043fde0 0x000000000043fde0 0x000250 0x000258 RW 0x10000 hello.roseg.x LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0101f4 0x0101f4 R E 0x10000 LOAD 0x020000 0x0000000000420000 0x0000000000420000 0x0003dc 0x0003dc R 0x10000 LOAD 0x02fde0 0x000000000043fde0 0x000000000043fde0 0x000250 0x000258 RW 0x10000 hello.sep-code.x LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000540 0x000540 R 0x10000 LOAD 0x010000 0x0000000000410000 0x0000000000410000 0x0001f4 0x0001f4 R E 0x10000 LOAD 0x020000 0x0000000000420000 0x0000000000420000 0x000144 0x000144 R 0x10000 LOAD 0x02fde0 0x000000000043fde0 0x000000000043fde0 0x000250 0x000258 RW 0x10000 This confirms what Nick saw, the patched binary does have an extra LOAD segment. Next I tried this: $ make sizes ls -l hello.*.x -rwxr-xr-x. 1 andrew andrew 71512 Sep 2 05:39 hello.nosep-code.x -rwxr-xr-x. 1 andrew andrew 202584 Sep 2 05:39 hello.patched.x -rwxr-xr-x. 1 andrew andrew 202584 Sep 2 05:39 hello.roseg.x -rwxr-xr-x. 1 andrew andrew 202584 Sep 2 05:39 hello.sep-code.x Which, I think, is interesting. Despite having the extra segment, the patched linker didn't trigger an increase in file size. So what's going on? Taking a look at the 'make segments' output for 'hello.patched.x' and 'hello.roseg.x' we see this, I've added some labels on the left to make it easier to reference the lines: hello.patched.x [1] LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0002ec 0x0002ec R 0x10000 [2] LOAD 0x010000 0x0000000000410000 0x0000000000410000 0x0001f4 0x0001f4 R E 0x10000 [3] LOAD 0x020000 0x0000000000420000 0x0000000000420000 0x00039c 0x00039c R 0x10000 [4] LOAD 0x02fde0 0x000000000043fde0 0x000000000043fde0 0x000250 0x000258 RW 0x10000 hello.roseg.x [5] LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0101f4 0x0101f4 R E 0x10000 [6] LOAD 0x020000 0x0000000000420000 0x0000000000420000 0x0003dc 0x0003dc R 0x10000 [7] LOAD 0x02fde0 0x000000000043fde0 0x000000000043fde0 0x000250 0x000258 RW 0x10000 Lines [3] and [6] are both the r/o data segment. Notice that they both start at file offset 0x20000. This means that by this point in the ELF the files have somehow synchronised. Now the size of [3] is slightly smaller than [6] as some content has moved into the new earlier segment [1], but despite this segments [4] and [7] (the r/w data) still start at the same file offset, and not surprisingly, are the same size. And so, overall, the files are the same size. Given that [4] and [7] don't appear to have any particular alignment, I'm a little surprised that [4] didn't move. I haven't looked into this at this point, because I don't think this is the most interesting part. What is interesting is why [3] and [6] start at the same file offset. If we look at [2] and [5] we can clearly see that [2] starts one page later than [5], however, if we look at the file size of [2] you'll notice that it is a page smaller than [5]! What's going on here? The answer is in the linker script. Here's the unpatched script used, on AArch64, when '-z separate-code --rosegment' is in use. I've wrapped some of the long lines at the ';' to make it more readable: SECTIONS { PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS; . = ALIGN(CONSTANT (MAXPAGESIZE)); .init : { KEEP (*(SORT_NONE(.init))) } =0x1f2003d5 .plt : ALIGN(16) { *(.plt) *(.iplt) } .text : { ... snip ... } =0x1f2003d5 ... snip ... } And here's the patched version of the same script: SECTIONS { PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS; /* Place build-id as close to the ELF headers as possible. This maximises the chance the build-id will be present in core files, which GDB can use to improve the debug experience. */ .note.gnu.build-id : { *(.note.gnu.build-id) } . = ALIGN(CONSTANT (MAXPAGESIZE)); .init : { KEEP (*(SORT_NONE(.init))) } =0x1f2003d5 .plt : ALIGN(16) { *(.plt) *(.iplt) } .text : { ... snip ... } =0x1f2003d5 ... snip ... } Notice that in both cases '.' is initialised to a non page aligned value, unless overridden by the user, it will be: 0x400000 + SIZEOF_HEADERS This is then aligned up to a page boundary. In both cases the linker is responsible for creating the section to segment map. In the unpatched case the linker will collect everything from the start of the ELF (so the ELF headers) into the first segment, which is the executable segment. This also includes the gap that is being left after the ELF headers and before the executable sections. In the patched case the build-id easily fits into the gap between the ELF headers and the aligned start of the code sections. Even on targets with 4k pages. Of course, in the patched case, as the linker builds the section to segment map, it figures that it should create two segments, the first being r/o which holds the ELF headers and the build-id, and the second being r/x to hold the code. The linker can do this because we still have the forced page alignment before the code. But this has no impact on file size .... which was the original goal of --rosegment. Now, when looking at this, I did wonder, is that initial alignment still needed? I guess that it made sense when we have actual data content before the text content. But if all we have before the text is the ELF headers, and if (apparently) we are OK having those headers mapped into the r/x segment, then surely we could drop the alignment? If we do that, but keep the rest of my patch (moving the build-id) then we have a linker script that looks like this: SECTIONS { PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS; /* Place build-id as close to the ELF headers as possible. This maximises the chance the build-id will be present in core files, which GDB can use to improve the debug experience. */ .note.gnu.build-id : { *(.note.gnu.build-id) } .init : { KEEP (*(SORT_NONE(.init))) } =0x1f2003d5 .plt : ALIGN(16) { *(.plt) *(.iplt) } .text : { ... snip ... } =0x1f2003d5 ... snip ... } And when we rebuild the test binaries, here are the two most interesting results ('hello.roseg.x' is still using the unpatched, vanilla, upstream linker): hello.patched.x LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0004b4 0x0004b4 R E 0x10000 LOAD 0x010000 0x0000000000410000 0x0000000000410000 0x00039c 0x00039c R 0x10000 LOAD 0x01fde0 0x000000000042fde0 0x000000000042fde0 0x000250 0x000258 RW 0x10000 hello.roseg.x LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0101f4 0x0101f4 R E 0x10000 LOAD 0x020000 0x0000000000420000 0x0000000000420000 0x0003dc 0x0003dc R 0x10000 LOAD 0x02fde0 0x000000000043fde0 0x000000000043fde0 0x000250 0x000258 RW 0x10000 Better, we still have just 3 segments. The first segment with permissions 'R E' does contain the ELF headers in both cases, and in the patched case it also contains the build-id note. And as an added bonus, the file sizes: $ make sizes ls -l hello.*.x -rwxr-xr-x. 1 andrew andrew 71512 Sep 2 06:13 hello.nosep-code.x -rwxr-xr-x. 1 andrew andrew 137048 Sep 2 06:13 hello.patched.x -rwxr-xr-x. 1 andrew andrew 202584 Sep 2 06:13 hello.roseg.x -rwxr-xr-x. 1 andrew andrew 202584 Sep 2 06:13 hello.sep-code.x The patched binary gets smaller due to the removal of the page alignment. Obviously, the removal of the alignment could be done without moving the build-id note, which would provide file size improvements. So, why am I still pushing this patch when "just moving the data before the text" is the obviously better solution? My concerns are mostly those discussed in [2], `ld` has had text before data for so long, I'm nervous proposing moving all that content just to fix this one issue. Especially if, as in [2] it was done opt-out be default, that would leave this feature broken in this case. Placing the build-id separately to the rest of the data content highlights its slightly special placement requirements, and hopefully will stop it getting broken again in the future. The patch below is an RFC rather than a full patch, I've not update the commit message to describe the changes to the alignment code yet, but I wanted to share this so we could discuss it. Thanks, Andrew [1] https://sourceware.org/pipermail/binutils/2024-June/134541.html [2] https://sourceware.org/bugzilla/show_bug.cgi?id=30907 --- diff --git a/ld/scripttempl/elf.sc b/ld/scripttempl/elf.sc index 54716110b61..b20c5ccc9a1 100644 --- a/ld/scripttempl/elf.sc +++ b/ld/scripttempl/elf.sc @@ -425,7 +425,6 @@ emit_early_ro() { cat <<EOF ${INITIAL_READONLY_SECTIONS} - .note.gnu.build-id ${RELOCATING-0}: { *(.note.gnu.build-id) } EOF } @@ -436,6 +435,11 @@ cat <<EOF ${CREATE_SHLIB-${CREATE_PIE-${RELOCATING+PROVIDE (__executable_start = ${TEXT_START_ADDR}); . = ${TEXT_BASE_ADDRESS};}}} ${CREATE_SHLIB+${RELOCATING+. = ${SHLIB_TEXT_START_ADDR}${SIZEOF_HEADERS_CODE};}} ${CREATE_PIE+${RELOCATING+PROVIDE (__executable_start = ${SHLIB_TEXT_START_ADDR}); . = ${SHLIB_TEXT_START_ADDR}${SIZEOF_HEADERS_CODE};}} + + /* Place build-id as close to the ELF headers as possible. This + maximises the chance the build-id will be present in core files, + which GDB can use to improve the debug experience. */ + .note.gnu.build-id ${RELOCATING-0}: { *(.note.gnu.build-id) } EOF } @@ -800,9 +804,12 @@ EOF if test -z "${ALL_TEXT_BEFORE_RO}"; then test -n "${SEPARATE_CODE}" || emit_early_ro test -n "${NON_ALLOC_DYN}${SEPARATE_CODE}" || emit_dyn + + if test -n "${SEPARATE_CODE}" -o -n "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then + align_text + fi fi - - align_text + emit_text align_rodata ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-02 10:26 ` Andrew Burgess @ 2024-09-10 9:01 ` Nick Clifton 2024-09-10 15:38 ` Andrew Burgess 0 siblings, 1 reply; 17+ messages in thread From: Nick Clifton @ 2024-09-10 9:01 UTC (permalink / raw) To: Andrew Burgess, binutils Hi Andrew, [Sorry for the delay in replying - I have been on PTO...] > Second, I've read the link that Fangrui Song posted[1], which is yet > another voice arguing for r/o content to be placed before the text > content. However, the initial suggestion is that this would be opt in, > which would still leave GDB's build-id lookup broken by default when > --rosegment was used, which is less than ideal. Just as a matter of interest, have you checked lld's current behaviour ? (Ideally with and without --rosegment and/or -z separate-code). > Third, I've read up on bug PR ld/30907[2] which seems to be the > motivator for the addition of --rosegment. The initial complaint > appears to be about file size rather than the number of segments. It's > just that reducing the number of segments helps reduce the file size. Right. > The description of --rosegment does mention creating a single r/o > segment, which is fine, but, I think, we should keep the original goal > (file size) in mind. I totally agree. > Now, when looking at this, I did wonder, is that initial alignment still > needed? I guess that it made sense when we have actual data content > before the text content. But if all we have before the text is the ELF > headers, and if (apparently) we are OK having those headers mapped into > the r/x segment, then surely we could drop the alignment? That makes sense. > The patch below is an RFC rather than a full patch, I've not update the > commit message to describe the changes to the alignment code yet, but I > wanted to share this so we could discuss it. I think that the patch looks OK, apart from one possible typo... > @@ -800,9 +804,12 @@ EOF > if test -z "${ALL_TEXT_BEFORE_RO}"; then > test -n "${SEPARATE_CODE}" || emit_early_ro > test -n "${NON_ALLOC_DYN}${SEPARATE_CODE}" || emit_dyn > + > + if test -n "${SEPARATE_CODE}" -o -n "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then > + align_text > + fi > fi > - > - align_text > + > emit_text > The intent of this change is to only invoke the align_text operation if some data has been emitted before the start of the emit_text operation, yes ? But emit_early_ro and emit_dyn are only invoked if the corresponding test *fails*. So for example: test -n "${SEPARATE_CODE}" || emit_early_ro translates to: if ${SEPARATE_CODE} is not empty do nothing, otherwise invoke emit_early_ro So the test to invoke align_text ought to be: if test -z "${SEPARATE_CODE}" -o -z "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then align_text fi ie align_text is called only if either of the previous tests encountered empty strings. Does this sound right to you ? (I confess that I find the bash shell's conditional syntax very confusing. I have been wondering if there is an easier way to code these operations). Cheers Nick ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-10 9:01 ` Nick Clifton @ 2024-09-10 15:38 ` Andrew Burgess 2024-09-10 23:03 ` Hans-Peter Nilsson 2024-09-11 14:11 ` Nick Clifton 0 siblings, 2 replies; 17+ messages in thread From: Andrew Burgess @ 2024-09-10 15:38 UTC (permalink / raw) To: Nick Clifton, binutils Nick Clifton <nickc@redhat.com> writes: > Hi Andrew, > > [Sorry for the delay in replying - I have been on PTO...] > >> Second, I've read the link that Fangrui Song posted[1], which is yet >> another voice arguing for r/o content to be placed before the text >> content. However, the initial suggestion is that this would be opt in, >> which would still leave GDB's build-id lookup broken by default when >> --rosegment was used, which is less than ideal. > > Just as a matter of interest, have you checked lld's current behaviour ? > (Ideally with and without --rosegment and/or -z separate-code). Here are the results using lld 17.0.6: $ gcc -fPIC -fuse-ld=lld -Wl,-z,noseparate-code -Wl,--no-rosegment hello.c -o hello.lld.nosep-code.x $ gcc -fPIC -fuse-ld=lld -Wl,-z,separate-code -Wl,--no-rosegment hello.c -o hello.lld.sep-code.x $ gcc -fPIC -fuse-ld=lld -Wl,-z,separate-code -Wl,--rosegment hello.c -o hello.lld.sep-code-roseg.x $ ld.lld --version LLD 17.0.6 (compatible with GNU linkers) $ readelf -lW hello.lld.nosep-code.x | grep LOAD\\\|NOTE LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000690 0x000690 R E 0x1000 LOAD 0x000690 0x0000000000201690 0x0000000000201690 0x0001a0 0x0001a0 RW 0x1000 LOAD 0x000830 0x0000000000202830 0x0000000000202830 0x000030 0x000031 RW 0x1000 NOTE 0x00028c 0x000000000020028c 0x000000000020028c 0x000038 0x000038 R 0x4 $ readelf -lW hello.lld.sep-code.x | grep LOAD\\\|NOTE LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000690 0x000690 R E 0x1000 LOAD 0x001000 0x0000000000201000 0x0000000000201000 0x0001a0 0x0001a0 RW 0x1000 LOAD 0x0011a0 0x00000000002021a0 0x00000000002021a0 0x000030 0x000031 RW 0x1000 NOTE 0x00028c 0x000000000020028c 0x000000000020028c 0x000038 0x000038 R 0x4 $ readelf -lW hello.lld.sep-code-roseg.x | grep LOAD\\\|NOTE LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000568 0x000568 R 0x1000 LOAD 0x001000 0x0000000000201000 0x0000000000201000 0x000160 0x000160 R E 0x1000 LOAD 0x002000 0x0000000000202000 0x0000000000202000 0x0001a0 0x0001a0 RW 0x1000 LOAD 0x0021a0 0x00000000002031a0 0x00000000002031a0 0x000030 0x000031 RW 0x1000 NOTE 0x0002c4 0x00000000002002c4 0x00000000002002c4 0x000038 0x000038 R 0x4 $ ls -l hello.lld.*.x -rwxr-xr-x. 1 andrew andrew 6064 Sep 10 16:00 hello.lld.nosep-code.x -rwxr-xr-x. 1 andrew andrew 12576 Sep 10 16:00 hello.lld.sep-code-roseg.x -rwxr-xr-x. 1 andrew andrew 8480 Sep 10 16:00 hello.lld.sep-code.x In every case the NOTE segment is placed towards the start of the ELF, which is good. In the first two cases the r/o data is merged with the code into a single segment. In the --rosegment case the r/o data is moved earlier, before the code, into a single r/o segment which also continues to hold the NOTE data. I've repeated this experiment using lld 18.1.8 and there's no change in how the segments are laid out, or the section to segment assignment. I think this aligns with what Fangrui Song said in their emails: https://inbox.sourceware.org/binutils/PH7PR06MB9078C0C0A8E4634A266A9436CB8E2@PH7PR06MB9078.namprd06.prod.outlook.com/ https://inbox.sourceware.org/binutils/20240605022136.pnkv3j5sgdx6waex@google.com/ Basically suggesting that the bfd linker should place r/o data before the code. My concern is that if this was made optional, especially if this wasn't the default in --rosegment mode, then this would leave some loss of functionality in place, which I'd really like to avoid if possible. > >> Third, I've read up on bug PR ld/30907[2] which seems to be the >> motivator for the addition of --rosegment. The initial complaint >> appears to be about file size rather than the number of segments. It's >> just that reducing the number of segments helps reduce the file size. > > Right. > >> The description of --rosegment does mention creating a single r/o >> segment, which is fine, but, I think, we should keep the original goal >> (file size) in mind. > > I totally agree. > >> Now, when looking at this, I did wonder, is that initial alignment still >> needed? I guess that it made sense when we have actual data content >> before the text content. But if all we have before the text is the ELF >> headers, and if (apparently) we are OK having those headers mapped into >> the r/x segment, then surely we could drop the alignment? > > That makes sense. > > >> The patch below is an RFC rather than a full patch, I've not update the >> commit message to describe the changes to the alignment code yet, but I >> wanted to share this so we could discuss it. > > I think that the patch looks OK, apart from one possible typo... > >> @@ -800,9 +804,12 @@ EOF >> if test -z "${ALL_TEXT_BEFORE_RO}"; then >> test -n "${SEPARATE_CODE}" || emit_early_ro >> test -n "${NON_ALLOC_DYN}${SEPARATE_CODE}" || emit_dyn >> + >> + if test -n "${SEPARATE_CODE}" -o -n "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then >> + align_text >> + fi >> fi >> - >> - align_text >> + >> emit_text >> > > The intent of this change is to only invoke the align_text operation if some > data has been emitted before the start of the emit_text operation, yes ? > > But emit_early_ro and emit_dyn are only invoked if the corresponding test > *fails*. So for example: > > test -n "${SEPARATE_CODE}" || emit_early_ro > > translates to: > > if ${SEPARATE_CODE} is not empty do nothing, otherwise invoke emit_early_ro > > So the test to invoke align_text ought to be: > > if test -z "${SEPARATE_CODE}" -o -z "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then > align_text > fi > Ugh! You are absolutely correct. I updated my patch and posted it below, just so we have something to discuss. Thanks, Andrew --- commit 33c4f1e786b5e782dae2e80af2a3f63d43ca0a39 Author: Andrew Burgess <aburgess@redhat.com> Date: Mon Aug 12 14:46:26 2024 +0100 ld: ensure build-id is placed near ELF headers After the commit: commit e8e10743f7b207b21a1efb0cc9e42487080db013 Date: Thu Jun 13 15:10:15 2024 +0100 Add --rosegment option to BFD linker to stop the '-z separate-code' from generating two read-only segments. When an executable is built with the --rosegment option GDB is no longer able to find the build-id of the executable from a core file. The problem is that GDB depends on what I guess was a lucky coincidence; that the build-id was placed close to the ELF headers, such that the build-id can be found in the first page of the ELF. When a Linux kernel produces a core file there is specific support in place so that the kernel will emit the first page of any private file mapping that looks like it might contain ELF headers. If the build-id is placed in this first page then this means the build-id will be included into the core file, and GDB can then find it. Using the build-id GDB can validate that it has the correct executable loaded to debug the core file, or in some cases GDB can use the build-id to fetch the executable that matches the core file automatically, e.g. using debuginfod. The Linux kernel support for this feature has existed since 2007, see this (kernel) commit: commit 82df39738ba9e02c057fa99b7461a56117d36119 Date: Tue Oct 16 23:27:02 2007 -0700 Add MMF_DUMP_ELF_HEADERS I ran into this issue while testing on the latest Fedora Rawhide build, currently the linker on this system enables --rosegment by default and as a result GDB is not able to find the build-id for any core file. What's happening is that with --rosegment the linker merges the .note.gnu.build-id sections with all the other data content and places this after the executable content. Without the --rosegment the .note.gnu.build-id section is placed pretty much first before the executable and data sections. As a consequence, without --rosegment the build-id will usually be on the same page as the ELF headers while with --rosegment on the executable content is first and due to alignment directives in the linker script the NOTE segment (including the build-id) will always appear outside the first page. In fact, when we look at the alignment directives we see that there is an unnecessary alignment being generated before the .text content. This means that when --rosegment is in use we leave an unnecessary gap after the ELF headers and before the .text content. I've fixed this in this commit by only emitting the alignment when we have first emitted any data content before the .text. With that fixed I have then placed the build-id immediately after the ELF headers. The build-id will then be included within the same LOAD-able segment as the executable content, just as the ELF headers are. This maybe isn't ideal as the goal of --rosegment is to avoid placing data into an executable segment, but we currently do include the ELF headers, so adding the NOTES seems like a necessary evil at this point. The other option is to move all read-only data to live before the .text content, thus creating a r/o segment as the first segment in the ELF. This segment would hold the ELF headers, the NOTES, and then all the other r/o data. This is what lld does. This might be a good long term goal for the bfd linker to work towards. With this patch in place GDB is once again able to find the build-id from a core file. diff --git a/ld/scripttempl/elf.sc b/ld/scripttempl/elf.sc index 54716110b61..97b96fae85e 100644 --- a/ld/scripttempl/elf.sc +++ b/ld/scripttempl/elf.sc @@ -425,7 +425,6 @@ emit_early_ro() { cat <<EOF ${INITIAL_READONLY_SECTIONS} - .note.gnu.build-id ${RELOCATING-0}: { *(.note.gnu.build-id) } EOF } @@ -436,6 +435,11 @@ cat <<EOF ${CREATE_SHLIB-${CREATE_PIE-${RELOCATING+PROVIDE (__executable_start = ${TEXT_START_ADDR}); . = ${TEXT_BASE_ADDRESS};}}} ${CREATE_SHLIB+${RELOCATING+. = ${SHLIB_TEXT_START_ADDR}${SIZEOF_HEADERS_CODE};}} ${CREATE_PIE+${RELOCATING+PROVIDE (__executable_start = ${SHLIB_TEXT_START_ADDR}); . = ${SHLIB_TEXT_START_ADDR}${SIZEOF_HEADERS_CODE};}} + + /* Place build-id as close to the ELF headers as possible. This + maximises the chance the build-id will be present in core files, + which GDB can use to improve the debug experience. */ + .note.gnu.build-id ${RELOCATING-0}: { *(.note.gnu.build-id) } EOF } @@ -800,9 +804,12 @@ EOF if test -z "${ALL_TEXT_BEFORE_RO}"; then test -n "${SEPARATE_CODE}" || emit_early_ro test -n "${NON_ALLOC_DYN}${SEPARATE_CODE}" || emit_dyn + + if test -z "${SEPARATE_CODE}" -o -z "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then + align_text + fi fi - - align_text + emit_text align_rodata ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-10 15:38 ` Andrew Burgess @ 2024-09-10 23:03 ` Hans-Peter Nilsson 2024-09-11 10:42 ` Nick Clifton 2024-09-11 14:11 ` Nick Clifton 1 sibling, 1 reply; 17+ messages in thread From: Hans-Peter Nilsson @ 2024-09-10 23:03 UTC (permalink / raw) To: Andrew Burgess; +Cc: Nick Clifton, binutils On Tue, 10 Sep 2024, Andrew Burgess wrote: > Nick Clifton <nickc@redhat.com> writes: > > But emit_early_ro and emit_dyn are only invoked if the corresponding test > > *fails*. So for example: > > > > test -n "${SEPARATE_CODE}" || emit_early_ro > > > > translates to: > > > > if ${SEPARATE_CODE} is not empty do nothing, otherwise invoke emit_early_ro > > > > So the test to invoke align_text ought to be: > > > > if test -z "${SEPARATE_CODE}" -o -z "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then > > align_text > > fi > > IIRC, "test A -o B" is frowned upon as non-portable, and IIRC mattered for some shell used even in today's day and age. I'd suggest to please use "test A || test B". brgds, H-P ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-10 23:03 ` Hans-Peter Nilsson @ 2024-09-11 10:42 ` Nick Clifton 2024-09-11 11:39 ` Hans-Peter Nilsson 0 siblings, 1 reply; 17+ messages in thread From: Nick Clifton @ 2024-09-11 10:42 UTC (permalink / raw) To: Hans-Peter Nilsson, Andrew Burgess; +Cc: binutils Hi Hans-Peter, >>> if test -z "${SEPARATE_CODE}" -o -z "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then >>> align_text >>> fi >>> > > IIRC, "test A -o B" is frowned upon as non-portable, and IIRC > mattered for some shell used even in today's day and age. > > I'd suggest to please use "test A || test B". Can you use || inside an if-expression (in bash) ? I thought not. Maybe it would be simpler, if somewhat wordier, to use: # We only need the alignment if we have emitted some sections. if test -z "${SEPARATE_CODE}"; then align_text elif test -z "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then align_text fi Cheers Nick ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-11 10:42 ` Nick Clifton @ 2024-09-11 11:39 ` Hans-Peter Nilsson 0 siblings, 0 replies; 17+ messages in thread From: Hans-Peter Nilsson @ 2024-09-11 11:39 UTC (permalink / raw) To: Nick Clifton; +Cc: Andrew Burgess, binutils On Wed, 11 Sep 2024, Nick Clifton wrote: > Hi Hans-Peter, > > > > > if test -z "${SEPARATE_CODE}" -o -z > > > > "${NON_ALLOC_DYN}${SEPARATE_CODE}"; then > > > > align_text > > > > fi > > > > > > > > IIRC, "test A -o B" is frowned upon as non-portable, and IIRC > > mattered for some shell used even in today's day and age. > > > > I'd suggest to please use "test A || test B". > > Can you use || inside an if-expression (in bash) ? I thought not. Those are separate commands, not arguments evaluated by "if", so to speak. (TL;DR: yes) brgds, H-P ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-10 15:38 ` Andrew Burgess 2024-09-10 23:03 ` Hans-Peter Nilsson @ 2024-09-11 14:11 ` Nick Clifton 1 sibling, 0 replies; 17+ messages in thread From: Nick Clifton @ 2024-09-11 14:11 UTC (permalink / raw) To: Andrew Burgess, binutils Hi Andrew, > Here are the results using lld 17.0.6: > In every case the NOTE segment is placed towards the start of the ELF, > which is good. Phew! > Basically suggesting that the bfd linker should place r/o data before > the code. My concern is that if this was made optional, especially if > this wasn't the default in --rosegment mode, then this would leave some > loss of functionality in place, which I'd really like to avoid if > possible. Understood. >>> The patch below is an RFC rather than a full patch, I've not update the >>> commit message to describe the changes to the alignment code yet, but I >>> wanted to share this so we could discuss it. Bash shell syntax issues aside I like the patch. My testing has shown no problems with it and it does still allow --rosegment to make a difference to the size of binaries when -z separate-code is used, so I am willing to accept it. I am however above to fly off to the GNU Tools Cauldron conference however, so if you could submit a updated version with the syntax changes suggested then I will review and apply it when I get back. Cheers Nick ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-08-19 11:01 ` Nick Clifton 2024-08-21 10:57 ` Andrew Burgess 2024-09-02 10:26 ` Andrew Burgess @ 2024-09-19 20:53 ` Sam James 2024-09-20 2:52 ` H.J. Lu 2 siblings, 1 reply; 17+ messages in thread From: Sam James @ 2024-09-19 20:53 UTC (permalink / raw) To: Nick Clifton; +Cc: Andrew Burgess, binutils Nick Clifton <nickc@redhat.com> writes: > Hi Andrew, > > [Sorry for being so late in responding to your email] > >> When an executable is built with the --rosegment option GDB is no >> longer able to find the build-id of the executable from a core file. > > For problems like this it really helps if you create a bug report > using the Sourceware bugzilla system. This allows us to track the > problem and any discussion surrounding it. It also means that we > can refer to the bug ID in comments in the code. > > In order to save you time however, I have gone ahead and created > a bug report for you: > > https://sourceware.org/bugzilla/show_bug.cgi?id=32100 > >> This patch aims to fix this by placing the build-id first. The >> build-id will then be included within the same LOAD-able segment as >> the executable content, just as the ELF headers are. With this patch >> in place GDB is once again able to find the build-id from a core >> file. > > Unfortunately that patch breaks the intention of the --rosegment > option by restoring two loadable, read-only segments to the executable. > > Here is an example: > > $ cat hello.c > > extern int printf (const char *, ...); > int i = 42; > const int * j = & i; > int main (void) { return printf ("hello world %d\n", * j); } > > $ gcc -fPIC -Wl,-z,noseparate-code hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006d4 0x0006d4 R E 0x1000 > LOAD 0x000df8 0x0000000000401df8 0x0000000000401df8 0x000228 0x000230 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000510 0x000510 R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0000dc 0x0000dc R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00115d 0x00115d R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0002d4 0x0002d4 R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c -fuse-ld=patched-linker > $ readelf -lW a.out | grep LOAD > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00015d 0x00015d R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000 > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > (I invented the "-fuse-ld=patched-linker" option to keep the presentation simple. In > reality I used a full path to a linker built with your proposed patch applied). > > The section to segment mapping is changed by using --rosegment, but with your patch > applied we still get an early read-only segment containing the note sections. > > I think that what is needed is a patch to move all of the read-only segments to > before the read-execute segment. But this weill need testing. I'm reading over the rest of the thread, so apologies if this got addressed later, but did we add a testcase like this? > > Cheers > Nick ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-19 20:53 ` Sam James @ 2024-09-20 2:52 ` H.J. Lu 2024-09-20 3:45 ` H.J. Lu 2024-09-20 13:20 ` Andrew Burgess 0 siblings, 2 replies; 17+ messages in thread From: H.J. Lu @ 2024-09-20 2:52 UTC (permalink / raw) To: Sam James; +Cc: Nick Clifton, Andrew Burgess, binutils On Fri, Sep 20, 2024 at 4:54 AM Sam James <sam@gentoo.org> wrote: > > Nick Clifton <nickc@redhat.com> writes: > > > Hi Andrew, > > > > [Sorry for being so late in responding to your email] > > > >> When an executable is built with the --rosegment option GDB is no > >> longer able to find the build-id of the executable from a core file. > > > > For problems like this it really helps if you create a bug report > > using the Sourceware bugzilla system. This allows us to track the > > problem and any discussion surrounding it. It also means that we > > can refer to the bug ID in comments in the code. > > > > In order to save you time however, I have gone ahead and created > > a bug report for you: > > > > https://sourceware.org/bugzilla/show_bug.cgi?id=32100 > > > >> This patch aims to fix this by placing the build-id first. The > >> build-id will then be included within the same LOAD-able segment as > >> the executable content, just as the ELF headers are. With this patch > >> in place GDB is once again able to find the build-id from a core > >> file. > > > > Unfortunately that patch breaks the intention of the --rosegment > > option by restoring two loadable, read-only segments to the executable. > > > > Here is an example: > > > > $ cat hello.c > > > > extern int printf (const char *, ...); > > int i = 42; > > const int * j = & i; > > int main (void) { return printf ("hello world %d\n", * j); } > > > > $ gcc -fPIC -Wl,-z,noseparate-code hello.c > > $ readelf -lW a.out | grep LOAD > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006d4 0x0006d4 R E 0x1000 > > LOAD 0x000df8 0x0000000000401df8 0x0000000000401df8 0x000228 0x000230 RW 0x1000 > > > > $ gcc -fPIX -Wl,-z,separate-code hello.c > > $ readelf -lW a.out | grep LOAD > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000510 0x000510 R 0x1000 > > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000 > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0000dc 0x0000dc R 0x1000 > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000 > > > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c > > $ readelf -lW a.out | grep LOAD > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00115d 0x00115d R E 0x1000 > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0002d4 0x0002d4 R 0x1000 > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c -fuse-ld=patched-linker > > $ readelf -lW a.out | grep LOAD > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000 > > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00015d 0x00015d R E 0x1000 > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000 > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > > > (I invented the "-fuse-ld=patched-linker" option to keep the presentation simple. In > > reality I used a full path to a linker built with your proposed patch applied). > > > > The section to segment mapping is changed by using --rosegment, but with your patch > > applied we still get an early read-only segment containing the note sections. > > > > I think that what is needed is a patch to move all of the read-only segments to > > before the read-execute segment. But this weill need testing. > > I'm reading over the rest of the thread, so apologies if this got > addressed later, but did we add a testcase like this? Hi Andrew, Why does GDB need the build-id to be placed close to the ELF header? Can't GDB read the program header and search for the build-id in PT_NOTE segment? If GDB needs help, we can add a PT_BUILD_ID segment. -- H.J. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-20 2:52 ` H.J. Lu @ 2024-09-20 3:45 ` H.J. Lu 2024-09-20 13:20 ` Andrew Burgess 1 sibling, 0 replies; 17+ messages in thread From: H.J. Lu @ 2024-09-20 3:45 UTC (permalink / raw) To: Sam James; +Cc: Nick Clifton, Andrew Burgess, binutils On Fri, Sep 20, 2024 at 10:52 AM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Fri, Sep 20, 2024 at 4:54 AM Sam James <sam@gentoo.org> wrote: > > > > Nick Clifton <nickc@redhat.com> writes: > > > > > Hi Andrew, > > > > > > [Sorry for being so late in responding to your email] > > > > > >> When an executable is built with the --rosegment option GDB is no > > >> longer able to find the build-id of the executable from a core file. > > > > > > For problems like this it really helps if you create a bug report > > > using the Sourceware bugzilla system. This allows us to track the > > > problem and any discussion surrounding it. It also means that we > > > can refer to the bug ID in comments in the code. > > > > > > In order to save you time however, I have gone ahead and created > > > a bug report for you: > > > > > > https://sourceware.org/bugzilla/show_bug.cgi?id=32100 > > > > > >> This patch aims to fix this by placing the build-id first. The > > >> build-id will then be included within the same LOAD-able segment as > > >> the executable content, just as the ELF headers are. With this patch > > >> in place GDB is once again able to find the build-id from a core > > >> file. > > > > > > Unfortunately that patch breaks the intention of the --rosegment > > > option by restoring two loadable, read-only segments to the executable. > > > > > > Here is an example: > > > > > > $ cat hello.c > > > > > > extern int printf (const char *, ...); > > > int i = 42; > > > const int * j = & i; > > > int main (void) { return printf ("hello world %d\n", * j); } > > > > > > $ gcc -fPIC -Wl,-z,noseparate-code hello.c > > > $ readelf -lW a.out | grep LOAD > > > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006d4 0x0006d4 R E 0x1000 > > > LOAD 0x000df8 0x0000000000401df8 0x0000000000401df8 0x000228 0x000230 RW 0x1000 > > > > > > $ gcc -fPIX -Wl,-z,separate-code hello.c > > > $ readelf -lW a.out | grep LOAD > > > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000510 0x000510 R 0x1000 > > > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000 > > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0000dc 0x0000dc R 0x1000 > > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000 > > > > > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c > > > $ readelf -lW a.out | grep LOAD > > > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00115d 0x00115d R E 0x1000 > > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0002d4 0x0002d4 R 0x1000 > > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > > > > > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c -fuse-ld=patched-linker > > > $ readelf -lW a.out | grep LOAD > > > > > > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000 > > > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00015d 0x00015d R E 0x1000 > > > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000 > > > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 > > > > > > (I invented the "-fuse-ld=patched-linker" option to keep the presentation simple. In > > > reality I used a full path to a linker built with your proposed patch applied). > > > > > > The section to segment mapping is changed by using --rosegment, but with your patch > > > applied we still get an early read-only segment containing the note sections. > > > > > > I think that what is needed is a patch to move all of the read-only segments to > > > before the read-execute segment. But this weill need testing. > > > > I'm reading over the rest of the thread, so apologies if this got > > addressed later, but did we add a testcase like this? > > Hi Andrew, > > Why does GDB need the build-id to be placed close to the ELF header? > Can't GDB read the program header and search for the build-id in PT_NOTE > segment? If GDB needs help, we can add a PT_BUILD_ID segment. > Why doesn't elfobj_grok_gnu_build_id work for GDB? -- H.J. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] ld: ensure build-id is placed near ELF headers 2024-09-20 2:52 ` H.J. Lu 2024-09-20 3:45 ` H.J. Lu @ 2024-09-20 13:20 ` Andrew Burgess 1 sibling, 0 replies; 17+ messages in thread From: Andrew Burgess @ 2024-09-20 13:20 UTC (permalink / raw) To: H.J. Lu, Sam James; +Cc: Nick Clifton, binutils "H.J. Lu" <hjl.tools@gmail.com> writes: > On Fri, Sep 20, 2024 at 4:54 AM Sam James <sam@gentoo.org> wrote: >> >> Nick Clifton <nickc@redhat.com> writes: >> >> > Hi Andrew, >> > >> > [Sorry for being so late in responding to your email] >> > >> >> When an executable is built with the --rosegment option GDB is no >> >> longer able to find the build-id of the executable from a core file. >> > >> > For problems like this it really helps if you create a bug report >> > using the Sourceware bugzilla system. This allows us to track the >> > problem and any discussion surrounding it. It also means that we >> > can refer to the bug ID in comments in the code. >> > >> > In order to save you time however, I have gone ahead and created >> > a bug report for you: >> > >> > https://sourceware.org/bugzilla/show_bug.cgi?id=32100 >> > >> >> This patch aims to fix this by placing the build-id first. The >> >> build-id will then be included within the same LOAD-able segment as >> >> the executable content, just as the ELF headers are. With this patch >> >> in place GDB is once again able to find the build-id from a core >> >> file. >> > >> > Unfortunately that patch breaks the intention of the --rosegment >> > option by restoring two loadable, read-only segments to the executable. >> > >> > Here is an example: >> > >> > $ cat hello.c >> > >> > extern int printf (const char *, ...); >> > int i = 42; >> > const int * j = & i; >> > int main (void) { return printf ("hello world %d\n", * j); } >> > >> > $ gcc -fPIC -Wl,-z,noseparate-code hello.c >> > $ readelf -lW a.out | grep LOAD >> > >> > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006d4 0x0006d4 R E 0x1000 >> > LOAD 0x000df8 0x0000000000401df8 0x0000000000401df8 0x000228 0x000230 RW 0x1000 >> > >> > $ gcc -fPIX -Wl,-z,separate-code hello.c >> > $ readelf -lW a.out | grep LOAD >> > >> > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000510 0x000510 R 0x1000 >> > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000 >> > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0000dc 0x0000dc R 0x1000 >> > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000 >> > >> > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c >> > $ readelf -lW a.out | grep LOAD >> > >> > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00115d 0x00115d R E 0x1000 >> > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x0002d4 0x0002d4 R 0x1000 >> > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 >> > >> > $ gcc -fPIX -Wl,-z,separate-code -Wl,--rosegment hello.c -fuse-ld=patched-linker >> > $ readelf -lW a.out | grep LOAD >> > >> > LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000 >> > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00015d 0x00015d R E 0x1000 >> > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000 >> > LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000220 0x000228 RW 0x1000 >> > >> > (I invented the "-fuse-ld=patched-linker" option to keep the presentation simple. In >> > reality I used a full path to a linker built with your proposed patch applied). >> > >> > The section to segment mapping is changed by using --rosegment, but with your patch >> > applied we still get an early read-only segment containing the note sections. >> > >> > I think that what is needed is a patch to move all of the read-only segments to >> > before the read-execute segment. But this weill need testing. >> >> I'm reading over the rest of the thread, so apologies if this got >> addressed later, but did we add a testcase like this? > > Hi Andrew, > > Why does GDB need the build-id to be placed close to the ELF header? This is specifically to help with core-file debugging. In the Linux kernel commit: commit 82df39738ba9e02c057fa99b7461a56117d36119 Date: Tue Oct 16 23:27:02 2007 -0700 Add MMF_DUMP_ELF_HEADERS When writing out a core-file the kernel will always include in the core file the first page of any file backed mapping that starts at offset zero within a file. The idea is that this mapping will (likely) include the section headers and (hopefully) the build-id. When loading the core-file GDB then uses the elf_backend_core_find_build_id elf backend hook to try and find the build-id within each file backed mapping that is included in the core file. What this means of course, is that GDB can magically "know" the build-id for both the main executable, and every shared library that was mapped into the inferior given just the core file. This means that GDB can help the user understand if say the executable or libraries have changed since the core file was created. Or GDB can even go and fetch the missing files (possibly) using things like debuginfod. > Can't GDB read the program header and search for the build-id in PT_NOTE > segment? That's exactly what we do. In a follow up you ask about elfobj_grok_gnu_build_id, and that is exactly what ends up being called. On the GDB side we start in linux-tdep.c with this call into bfd: if (sec->flags & SEC_LOAD && (get_elf_backend_data (cbfd)->elf_backend_core_find_build_id (cbfd, (bfd_vma) sec->filepos))) vma_map[sec->vma] = cbfd->build_id; This lands us (eventually) into elfcore.h in the function: NAME(_bfd_elf, core_find_build_id) Which ends up calling: elf_read_notes (abfd, offset + i_phdr->p_offset, i_phdr->p_filesz, i_phdr->p_align); to read the NOTES, which will, eventually, call elfobj_grok_gnu_build_id, at least, that's my understanding. > If GDB needs help, we can add a PT_BUILD_ID segment. I don't understand exactly what this would mean in relation to the above, so I'm unclear if this would help or not. I hope this all makes things a little clearer. Let me know if you still have questions. Thanks, Andrew ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-09-20 13:20 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-08-12 14:21 [PATCH] ld: ensure build-id is placed near ELF headers Andrew Burgess 2024-08-19 11:01 ` Nick Clifton 2024-08-21 10:57 ` Andrew Burgess 2024-08-21 20:45 ` Fangrui Song 2024-09-19 15:51 ` Nick Clifton 2024-09-19 20:29 ` H.J. Lu 2024-09-02 10:26 ` Andrew Burgess 2024-09-10 9:01 ` Nick Clifton 2024-09-10 15:38 ` Andrew Burgess 2024-09-10 23:03 ` Hans-Peter Nilsson 2024-09-11 10:42 ` Nick Clifton 2024-09-11 11:39 ` Hans-Peter Nilsson 2024-09-11 14:11 ` Nick Clifton 2024-09-19 20:53 ` Sam James 2024-09-20 2:52 ` H.J. Lu 2024-09-20 3:45 ` H.J. Lu 2024-09-20 13:20 ` Andrew Burgess
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).