From 236f81ce1bc90ea9f94771e49bd5990b3fdf8f64 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Vivek=20Das=C2=A0Mohapatra?=
Date: Wed, 25 Nov 2020 17:07:55 +0000
Subject: [PATCH 1/2] Incorporate feedback & clarifications from upstream

---
 program-loading-and-dynamic-linking.txt | 79 +++++++++++++++++--------
 1 file changed, 53 insertions(+), 26 deletions(-)

diff --git a/program-loading-and-dynamic-linking.txt b/program-loading-and-dynamic-linking.txt
index a37ac34..a07c195 100644
--- a/program-loading-and-dynamic-linking.txt
+++ b/program-loading-and-dynamic-linking.txt
@@ -10,8 +10,8 @@ PT_SUNW_EH_FRAME 0x6474e550
 Segment contains the EH_FRAME_HDR section (stack frame unwind information)

 NOTE: The virtual address range referred to by PT_GNU_EH_FRAME must be
- covered by a separate PT_LOAD header - PT_GNU_EH_FRAME on its own does
- not trigger the mapping/loading of any data into memory.
+ covered by a PT_LOAD entry - PT_GNU_EH_FRAME on its own does not trigger
+ the mapping/loading of any data.

 PT_SUNW_EH_FRAME is used by a non-GNU implementation for the same purpose,
 and has the same value (although this does not imply compatible contents).
@@ -42,7 +42,9 @@ PT_GNU_RELRO 0x6474e552
 The specified segment should be made read-only once run-time linking of
 this object has completed.

- DOCUMENTME: Interaction with PT_LOAD here
+ As with PT_GNU_EH_FRAME this header entry does NOT guarantee that the
+ range in question is loaded: That must be ensured via a PT_LOAD entry
+ which covers the range.

 Reference:
 https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic.html#PROGHEADER
@@ -139,7 +141,8 @@ The values are typically found in the ElfW(Dyn).d_tag member.
 DT_GNU_FLAGS_1 0x6ffffdf4

 Similar to DT_FLAGS and DT_FLAGS_1, but DT_FLAGS is generic and
- the DT_FLAGS_1 bit mask has been exhausted (last bit claimed by Solaris).
+ the DT_FLAGS_1 bit mask has been exhausted (last available bit
+ claimed by Solaris).

 Currently supports the following flag bit(s) in its d_val value:

@@ -165,16 +168,19 @@ DT_GNU_FLAGS_1 0x6ffffdf4

 This mechanism is the basis for isolation of LD_AUDIT libraries (for example).

- While this is generally desirable some libraries do not behave well under these
- conditions - in particular libc (malloc/free get upset when they interact with
- independent copies of themselves since they have no knowledge of one another's
- memory accounting) and libpthread (which tends to deadlock of two different
- namespaces attempt to initialise threadsafe locking). DF_GNU_1_UNIQUE is used to
- mark such libraries so that when they are loaded only one copy (which resides in
- LM_ID_BASE) is mapped, and all namespaces use that copy (unless such sharing is
- explicitly suppressed, such as for LD_AUDIT libraries).
+ While this is generally desirable some libraries do not behave well
+ under these conditions - in particular libc (malloc/free get upset
+ when they interact with independent copies of themselves since they
+ have no knowledge of one another's memory accounting) and libpthread
+ (which tends to deadlock of two different namespaces attempt to
+ initialise thread metadata).

- This behaviour can always be explicitly overridden by the caller of dlmopen(3).
+ DF_GNU_1_UNIQUE is used to mark such libraries so that when they are
+ loaded only one copy (which resides in LM_ID_BASE) is mapped, and
+ all namespaces use that copy (unless such sharing is explicitly
+ suppressed, such as for LD_AUDIT libraries).
+
+ This behaviour can be explicitly overridden by the caller of dlmopen(3).

 Reference: This document is canonical.

@@ -186,7 +192,7 @@ DT_GNU_PRELINKED 0x6ffffdf5
 The d_val field contains a time_t value giving the UTC time at which the
 object was (pre)linked.

- Reference: DOCUMENTME
+ Reference: See the accompanying prelink document for details.

 DT_GNU_CONFLICTSZ 0x6ffffdf6

@@ -233,12 +239,13 @@ DT_GNU_HASH 0x6ffffef5
 The d_ptr value gives the location of the GNU style symbol hash table.

 The GNU hash of a symbol is computed as follows:
- - extract the NAME of the symbol
- - examples: 'foo@version-info' becomes 'foo'; 'bar' remains 'bar'
+ - take the NAME of the symbol (WITHOUT any @version suffix)
 - unsigned long h ← 5381
 - for each unsigned character C in NAME, starting at position 0:
 - h ← (h << 5) + h + C;
- - HASH ← h & 0xffffffff // 32 bit value
+ OR
+ h ← (h * 33) + C;
+ - uint32_t HASH ← h

 Hash Table contents:

@@ -285,7 +292,7 @@ DT_GNU_HASH 0x6ffffef5

 - bloom : ElfW(Addr)[ bitmask-words ]

- For each symbol [name] S the following is carried out:
+ For each symbol [name] S the following is carried out (by the link editor):
 - C ← __ELF_NATIVE_CLASS /* ie 32 on ELF32, 64 on ELF64 */
 - H ← gnu-hash( S )
 - BWORD ← (H / C) & (bitmask-words - 1)
@@ -329,7 +336,15 @@ DT_GNU_HASH 0x6ffffef5
 - BUCKET ← CHASH % nbuckets
 - CINDEX ← position of the symbol _within_ its bucket
 0 for the first symbol, 1 for the second and so forth
- - if this is the last symbol in the bucket:
+
+ The chain data are stored as a single linear chunk with each
+ pseudo-hash value immediately following another. CINDEX gives the
+ position of a pseudo-hash inside the bucket to which it belongs,
+ rather than its position in the chain data area as a whole.
+
+ [ b0h0 | b0h1 | b0h3 | b1h0 | …
+
+ - if a pseudo-hash value is the last one in the bucket:
 - CHASH ← CHASH | 1 /* set the least bit */
 - else
 - CHASH ← CHASH & ~1 /* unset the least bit */
@@ -359,7 +374,7 @@ SHT_GNU_INCREMENTAL_INPUTS 0x6fff4700

 Section name: ".gnu_incremental_inputs"

- Currently used internally for incremental linking by gold.
+ Currently used internally during incremental linking by gold.

 SHT_GNU_ATTRIBUTES 0x6ffffff5

@@ -375,7 +390,7 @@ SHT_GNU_ATTRIBUTES 0x6ffffff5
 - starting with a \0 terminated name
 - at least 6 bytes of tag-byte+value
 - a tag byte
- - a 4 byte native integer size (including the tag byte and the size itself)
+ - a 4 byte native integer size (including the tag byte & the size itself)
 - if the tag is 2 or 3:
 a LEB128 encoded value stored in the remaining space
 - DOCUMENTME: some attribute bytes? reverse engineer from readelf?
@@ -447,7 +462,11 @@ SHT_GNU_verdef 0x6ffffffd
 VER_NDX_LOCAL 0 // private symbol
 VER_NDX_GLOBAL 1 // global symbol
 VER_NDX_LORESERVE 0xff00 // Beginning of reserved entries
- VER_NDX_ELIMINATE 0xff01 // DOCUMENTME: Symbol should be eliminated
+ VER_NDX_ELIMINATE 0xff01 //
+
+ DOCUMENTME: VER_NDX_ELIMINATE does not appear to be implemented
+ in glibc: If an implementation exists its semantics should be
+ reverse-engineered from there and explained here.

 ElfW(Verdaux):

@@ -499,9 +518,17 @@ SHT_GNU_verneed 0x6ffffffe
 Not fatal if this symbol+version is missing.

 vna_other:
- If bit 15 (0x8000) is set then this symbol is hidden.
- vna_other should therefore be bitwise-anded with 0x7fff before
- comparison with the value from SHT_GNU_versym.
+ This value is used to look up the symbol version hash: It gives
+ the position of the hash in the version lookup table.
+
+ Bit 15 (0x8000) is a flag bit and should be masked out of
+ this value before using it as an index (eg by bitwise-and-ing
+ its value with 0x7fff)
+
+ If bit 15 (0x8000) is set then this symbol is hidden and is
+ never an acceptable candidate for matching version criteria.
+
+ Reference: glibc: elf/dl-version.c; elf/dl-lookup.c

 SHT_GNU_versym 0x6fffffff

@@ -527,7 +554,7 @@ SHT_GNU_versym 0x6fffffff
 Two values are reserved:

 VER_NDX_LOCAL 0 - The symbol is private, and is not available outside this object.
- VER_NDX_GLOBAL 1 - The symbol is globally available (ie not versioned? DOCUMENTME).
+ VER_NDX_GLOBAL 1 - The symbol is globally available (ie the base or default version).

 Note section descriptors (SHT_NOTE extensions)
 ==============================================
-- 
2.20.1
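
The GNU hash computation described in the DT_GNU_HASH hunk of this patch can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the glibc or link-editor implementation: the function names gnu_hash and symbol_name are hypothetical, and splitting on the first '@' is an assumed way of dropping a @version suffix. Truncating to 32 bits at every step gives the same result as truncating once at the end, because only the low 32 bits of h ever influence the final value.

```python
def symbol_name(sym: str) -> str:
    """Return the symbol NAME without any @version suffix.

    Hypothetical helper: assumes the suffix starts at the first '@',
    so 'foo@version-info' becomes 'foo' and 'bar' stays 'bar'.
    """
    return sym.split("@", 1)[0]


def gnu_hash(name: str) -> int:
    """Compute the 32 bit GNU hash of NAME as described in the patch."""
    h = 5381
    for c in name.encode("utf-8"):       # iterating bytes yields unsigned values 0..255
        # h * 33 is the same as (h << 5) + h; keep only the low 32 bits
        h = (h * 33 + c) & 0xFFFFFFFF
    return h


def gnu_hash_shift(name: str) -> int:
    """Same hash written with the shift-and-add form, for comparison."""
    h = 5381
    for c in name.encode("utf-8"):
        h = ((h << 5) + h + c) & 0xFFFFFFFF
    return h
```

For example, gnu_hash(symbol_name("exit@GLIBC_2.2.5")) evaluates to 0x7c967e3f, and gnu_hash("") is the initial value 5381 since no characters are folded in. The version string in that call is only an example input.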