From: Florian Weimer
To: Vivek Das Mohapatra
Cc: Mark Wielaard , GNU gABI gnu-gabi
Subject: Re: ABI document
References: <87o8nypmpq.fsf@oldenburg2.str.redhat.com> <616c0f661732fd1021e5a5b13ef872927390004d.camel@klomp.org> <87ime6pl2q.fsf@oldenburg2.str.redhat.com> <1597402688538.b66a61e7629438@mozgaia>
Date: Mon, 31 Aug 2020 14:54:04 +0200
In-Reply-To: (Vivek Das Mohapatra's message of "Thu, 27 Aug 2020 17:52:55 +0100 (BST)")
Message-ID: <874kojj7rn.fsf@oldenburg2.str.redhat.com>
List-Id: Gnu-gabi mailing list

* Vivek Das Mohapatra:

> diff --git a/program-loading-and-dynamic-linking.txt b/program-loading-and-dynamic-linking.txt
> new file mode 100644
> index 0000000..751eaca
> --- /dev/null
> +++ b/program-loading-and-dynamic-linking.txt
> @@ -0,0 +1,272 @@
> +Program Headers
> +===============

Thanks for doing this, it's very much needed.

I can't tell whether the formatting is okay, and whether we need to add
markup (like ``) for program text.

> +These are GNU extended program header type values: They are typically
> +found in ElfW(Phdr).p_type.
> +
> +PT_GNU_EH_FRAME  0x6474e550
> +PT_SUNW_EH_FRAME 0x6474e550
> +
> +  Segment contains the EH_FRAME_HDR section (stack frame unwind information)
> +
> +  PT_SUNW_EH_FRAME is used by a non-GNU implementation for the same purpose,
> +  and has the same value (although this does not imply compatible contents).

Can you dig out a link to a document that describes the format of
the GNU_EH_FRAME segment?

I think this should also say that the virtual address range must be
covered by an earlier PT_LOAD segment.

> +PT_GNU_STACK 0x6474e551
> +
> +  The p_flags member of this ElfW(Phdr) structure apply to the stack.
> +
> +  If present AND p_flags DOES NOT contain PF_X (0x1) then the stack
> +  should _not_ be executable.
> +
> +  Otherwise the stack is executable (the default).

The default depends on the architecture, I think.

I think we have differing behavior in regards to the size of the
segment.  glibc ignores it, other implementations may use it to set the
stack size.
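To make the quoted PT_GNU_STACK rule concrete, here is a minimal sketch
in C.  It is only an illustration under stated assumptions: struct
phdr_sketch is a hypothetical, cut-down stand-in for ElfW(Phdr), and
the function and macro names are made up for this example; the numeric
values are the ones quoted above.  Since the default appears to depend
on the architecture, the caller passes it in.  The segment size is
ignored here, matching the glibc behavior mentioned above; other
implementations may treat it differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values quoted from the draft; the macro names are hypothetical.  */
#define PT_GNU_STACK_VALUE 0x6474e551u
#define PF_X_VALUE 0x1u

/* Hypothetical stand-in for ElfW(Phdr): only the fields read below.  */
struct phdr_sketch
{
  uint32_t p_type;
  uint32_t p_flags;
};

/* Return true if the stack should be executable.  ARCH_DEFAULT is the
   answer the caller assumes when no PT_GNU_STACK header is present.  */
bool
stack_is_executable (const struct phdr_sketch *phdr, size_t count,
                     bool arch_default)
{
  assert (phdr != NULL || count == 0);
  for (size_t i = 0; i < count; ++i)
    if (phdr[i].p_type == PT_GNU_STACK_VALUE)
      /* Header present: PF_X alone decides.  */
      return (phdr[i].p_flags & PF_X_VALUE) != 0;
  /* Header absent: fall back to the architecture default.  */
  return arch_default;
}
```

Passing the default as a parameter keeps the sketch neutral on the
question of which architectures default to an executable stack.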
> +PT_GNU_RELRO 0x6474e552
> +
> +  The specified segment should be made read-only once run-time linking
> +  has completed.

I think it's relocation of this object, not the entire linking
operation.  Also we should try to sketch the interaction with PT_LOAD.

> +PT_GNU_PROPERTY 0x6474e553
> +
> +  The Linux kernel uses this program header to locate the
> +  .note.gnu.property section.
> +
> +  If there is a program property that requires the kernel to perform
> +  some action before loading and ELF file (eg AArch64 BTI or intel CET)
> +  then this header MUST be present.

“Intel”

The requirement could be worded better.  It only has to be present if
these features are to be enabled.

> +  The contents are laid out as follows:
> +
> +  Field    | Length   | Contents
> +  n_namsz  | 4        | 4
> +  n_descsz | 4        | Size of n_desc (4 byte int, processor format)
> +  n_type   | 4        | NT_GNU_PROPERTY_TYPE_0 (0x5)
> +  n_name   | 4        | GNU\0
> +  n_desc   | n_descsz | property array
> +
> +  Each element of n_desc, in turn is:
> +
> +  typedef struct {
> +    Elf_Word pr_type;
> +    Elf_Word pr_datasz;
> +    unsigned char pr_data[PR_DATASZ];
> +    unsigned char pr_padding[PR_PADDING];
> +  } Elf_Prop;
> +
> +  Properties are sorted in ascending order of pr_type;
> +
> +  pr_data is aligned to 4 bytes in 32-bit objects and 8 bytes in 64-bit ones.

What's the overall alignment of the segment?  8 bytes on 64-bit?

This also has to say where the padding is inserted: before pr_data?
After Elf_Prop?  I think it's the latter, and that Elf_Prop is aligned
even if the pr_data member is absent.  This means that we should have
Elf32_Prop and Elf64_Prop with different alignment.  (We can avoid
mentioning the type name in the ABI document, I guess.)

> +  Defined properties are:
> +
> +  GNU_PROPERTY_STACK_SIZE 0x1
> +
> +  A native format & size integer specifying the minimum stack size.
> +  The linker should pick the highest instance of this from all relocatable
> +  objects in the link chain and ensure the stack is at least this big.

So this is an Elf*_Addr?

> +  GNU_PROPERTY_NO_COPY_ON_PROTECTED 0x2
> +
> +  The linker should treat protected data symbol as defined locally at
> +  run-time and copy this property to the output share object.
> +
> +  The linker should add this property to the output share object if
> +  any protected symbol is expected to be defined locally at run-time.
> +
> +  The run-time loader should disallow copy relocations against protected
> +  data symbols defined such objects.
> +
> +  This type has a PR_DATASZ of 0.

“pr_datasz field”?

> +DT_GNU_PRELINKED 0x6ffffdf5
> +
> +  The d_val field contains a time_t value giving the UTC time at which the
> +  object was (pre)linked.

Woah, I didn't know we had this.  Is this really the time when prelink
was run?  So running it multiple times does not always produce the same
results?

It seems it's this way indeed:

  if (! verify)
    info->ent->timestamp = (GElf_Word) time (NULL);
  dso->info_DT_GNU_PRELINKED = info->ent->timestamp;

That's not good for reproducibility (but then prelink results depend

> +DT_GNU_CONFLICTSZ 0x6ffffdf6
> +
> +  Used in prelinked objects.
> +  d_val contains the size of the conflict segment.
> +
> +DT_GNU_LIBLISTSZ 0x6ffffdf7
> +
> +  Used in prelinked objects.
> +  d_val contains the size of the library list.

It would be nice to add a link there to the prelink documentation.
(There's a prelink.tex file in the sources.)

> +DT_GNU_HASH 0x6ffffef5
> +
> +  The d_ptr value gives the location of the GNU style symbol hash table.

Do we have format documentation for those?

> +DT_GNU_CONFLICT 0x6ffffef8
> +
> +  Used in prelinked objects.
> +  The d_ptr value gives the location of the conflict segment.
> +  This will contain an array of ElfW(Rela) structs.
> +
> +  If DT_GNU_LIBLIST matches the library searchlist after loading
> +  then these relocation records are replayed immediately after
> +  run-time loading.
> +
> +DT_GNU_LIBLIST 0x6ffffef9
> +
> +  Used in prelinked objects.
> +  The d_ptr value gives the location of the ElfW(Lib) array giving the
> +  SONAME, checksum and timestamp or each library encountered at prelink time.
> +
> +  This is used to check that all required prelinked libraries are still
> +  present, loaded, and have the correct checksums at runtime.

Maybe group this with the earlier prelink items?

> +Section Headers
> +===============
> +SHT_GNU_verdef 0x6ffffffd
> +
> +SHT_GNU_verneed 0x6ffffffe
> +
> +SHT_GNU_versym 0x6fffffff

I think the canonical reference for these is:

> +Note section descriptors (SHT_NOTE extensions)
> +==============================================
> +NT_GNU_BUILD_ID 3
> +
> +  descsz bytes of build-id data.
> +  Typically presented as a hex string.

But stored in binary?  Maybe reference the ld documentation here, and
say that the actual computation mechanism is unspecified?

Thanks,
Florian
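
On the build-ID point above, the distinction between the stored form
and the presented form can be sketched as follows.  This is a minimal
illustration that assumes the note descriptor holds descsz raw bytes
and that the hex string is only a display convention; the function
name build_id_to_hex is hypothetical and not taken from any existing
tool.  It says nothing about how the ID is computed.

```c
#include <assert.h>
#include <stddef.h>

/* Render LEN raw descriptor bytes from DESC as lowercase hex in OUT.
   OUT needs room for 2 * LEN digits plus the terminating NUL.
   Returns 0 on success, -1 if OUT_SIZE is too small.  */
int
build_id_to_hex (const unsigned char *desc, size_t len,
                 char *out, size_t out_size)
{
  static const char digits[] = "0123456789abcdef";

  /* Written this way so that 2 * len + 1 cannot overflow.  */
  if (out_size == 0 || len > (out_size - 1) / 2)
    return -1;
  assert (out != NULL);
  assert (desc != NULL || len == 0);

  for (size_t i = 0; i < len; ++i)
    {
      out[2 * i] = digits[desc[i] >> 4];       /* high nibble */
      out[2 * i + 1] = digits[desc[i] & 0xf];  /* low nibble */
    }
  out[2 * len] = '\0';
  return 0;
}
```

A descriptor of N bytes therefore shows up as 2 * N hex digits, which
may be worth stating in the document so that descsz is not confused
with the length of the printed string.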