From mboxrd@z Thu Jan 1 00:00:00 1970
From: "H.J. Lu"
To: binutils@sourceware.org
Cc: goldstein.w.n@gmail.com, sam@gentoo.org, amodra@gmail.com
Subject: [PATCH v9 3/6] elf: Use mmap to map in symbol and relocation tables
Date: Wed, 13 Mar 2024 09:08:12 -0700
Message-ID: <20240313160815.665818-4-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240313160815.665818-1-hjl.tools@gmail.com>
References: <20240313160815.665818-1-hjl.tools@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Add _bfd_mmap_read_temporary to mmap in symbol tables and relocations
whose sizes are >= 4 * page size.  Don't cache external relocations when
mmap is used.

When mmap is used to map in all ELF sections, data to link the 3.5GB
clang executable in LLVM 17 debug build on Linux/x86-64 with 32GB RAM
is:

                stdio           mmap            improvement
user            84.28           85.04           -0.9%
system          12.46           10.16           14%
total           96              95.35           0.7%
page faults     4837944         4047667         16%

and data to link the 275M cc1plus executable in GCC 14 stage 1 build
is:

                stdio           mmap            improvement
user            5.22            5.27            -1%
system          0.94            0.84            11%
total           6.20            6.13            0.7%
page faults     361272          323377          10%

        * elf.c (bfd_elf_get_elf_syms): Replace bfd_read with
        _bfd_mmap_read_temporary.
        * elflink.c (elf_link_read_relocs_from_section): Add 2 arguments
        to return mmap memory address and size.
        (_bfd_elf_link_info_read_relocs): Replace bfd_read with
        _bfd_mmap_read_temporary.
        (bfd_elf_final_link): Don't cache external relocations when mmap
        is used.
        * libbfd.c (_bfd_mmap_read_temporary): New.
        * libbfd-in.h (_bfd_mmap_read_temporary): Likewise.
        * libbfd.h: Regenerated.
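For illustration only, the size rule described in this message can be sketched as a standalone C predicate.  The names minimum_mmap_size and should_use_mmap are hypothetical and are not BFD functions.  The sketch assumes that _bfd_minimum_mmap_size equals 4 * page size, as the message implies, and its plugin argument mirrors the BFD_PLUGIN test in the elf.c hunk.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of BFD: the smallest table size for
   which mapping is preferred, assumed here to be 4 * page size.  */
size_t
minimum_mmap_size (void)
{
  long page = sysconf (_SC_PAGESIZE);
  assert (page > 0);
  return (size_t) page * 4;
}

/* Hypothetical helper: return true if a table of SIZE bytes should be
   mapped rather than copied with malloc + read.  IS_PLUGIN stands in
   for the BFD_PLUGIN flag test, which always selects malloc.  */
bool
should_use_mmap (size_t size, bool is_plugin)
{
  return !is_plugin && size >= minimum_mmap_size ();
}
```

With a 4096-byte page, tables smaller than 16384 bytes would keep the malloc path under this assumption, so only larger tables take the mmap path.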
---
 bfd/elf.c       | 47 +++++++++++++++++++++++++++++++++++------------
 bfd/elflink.c   | 44 ++++++++++++++++++++++++++++++++------------
 bfd/libbfd-in.h |  3 +++
 bfd/libbfd.c    | 33 +++++++++++++++++++++++++++++++++
 bfd/libbfd.h    |  3 +++
 5 files changed, 106 insertions(+), 24 deletions(-)

diff --git a/bfd/elf.c b/bfd/elf.c
index 557c1ac5f4a..3a6c07af6c1 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -460,19 +460,30 @@ bfd_elf_get_elf_syms (bfd *ibfd,
       goto out;
     }
   pos = symtab_hdr->sh_offset + symoffset * extsym_size;
+  size_t alloc_ext_size = amt;
   if (extsym_buf == NULL)
     {
-      alloc_ext = bfd_malloc (amt);
-      extsym_buf = alloc_ext;
+#ifdef USE_MMAP
+      if ((ibfd->flags & BFD_PLUGIN) != 0
+          || amt < _bfd_minimum_mmap_size)
+        {
+#endif
+          alloc_ext = bfd_malloc (amt);
+          extsym_buf = alloc_ext;
+#ifdef USE_MMAP
+        }
+#endif
     }
-  if (extsym_buf == NULL
-      || bfd_seek (ibfd, pos, SEEK_SET) != 0
-      || bfd_read (extsym_buf, amt, ibfd) != amt)
+
+  if (bfd_seek (ibfd, pos, SEEK_SET) != 0
+      || !_bfd_mmap_read_temporary (&extsym_buf, &alloc_ext_size,
+                                    &alloc_ext, ibfd))
     {
       intsym_buf = NULL;
       goto out;
     }
 
+  size_t alloc_extshndx_size = 0;
   if (shndx_hdr == NULL || shndx_hdr->sh_size == 0)
     extshndx_buf = NULL;
   else
@@ -483,15 +494,27 @@ bfd_elf_get_elf_syms (bfd *ibfd,
           intsym_buf = NULL;
           goto out;
         }
+      alloc_extshndx_size = amt;
       pos = shndx_hdr->sh_offset + symoffset * sizeof (Elf_External_Sym_Shndx);
       if (extshndx_buf == NULL)
         {
-          alloc_extshndx = (Elf_External_Sym_Shndx *) bfd_malloc (amt);
-          extshndx_buf = alloc_extshndx;
+#ifdef USE_MMAP
+          if ((ibfd->flags & BFD_PLUGIN) != 0
+              || amt < _bfd_minimum_mmap_size)
+            {
+#endif
+              alloc_extshndx
+                = (Elf_External_Sym_Shndx *) bfd_malloc (amt);
+              extshndx_buf = alloc_extshndx;
+#ifdef USE_MMAP
+            }
+#endif
         }
-      if (extshndx_buf == NULL
-          || bfd_seek (ibfd, pos, SEEK_SET) != 0
-          || bfd_read (extshndx_buf, amt, ibfd) != amt)
+      if (bfd_seek (ibfd, pos, SEEK_SET) != 0
+          || !_bfd_mmap_read_temporary ((void **) &extshndx_buf,
+                                        &alloc_extshndx_size,
+                                        (void **) &alloc_extshndx,
+                                        ibfd))
         {
           intsym_buf = NULL;
           goto out;
@@ -530,8 +553,8 @@ bfd_elf_get_elf_syms (bfd *ibfd,
     }
 
  out:
-  free (alloc_ext);
-  free (alloc_extshndx);
+  _bfd_munmap_readonly_temporary (alloc_ext, alloc_ext_size);
+  _bfd_munmap_readonly_temporary (alloc_extshndx, alloc_extshndx_size);
 
   return intsym_buf;
 }
diff --git a/bfd/elflink.c b/bfd/elflink.c
index 216c124b207..2a0586722e8 100644
--- a/bfd/elflink.c
+++ b/bfd/elflink.c
@@ -2644,8 +2644,11 @@ _bfd_elf_link_assign_sym_version (struct elf_link_hash_entry *h, void *data)
    may be either a REL or a RELA section.  The relocations are
    translated into RELA relocations and stored in INTERNAL_RELOCS,
    which should have already been allocated to contain enough space.
-   The EXTERNAL_RELOCS are a buffer where the external form of the
-   relocations should be stored.
+   *EXTERNAL_RELOCS_ADDR is a buffer where the external form of the
+   relocations should be stored.  If *EXTERNAL_RELOCS_ADDR is NULL,
+   *EXTERNAL_RELOCS_ADDR and *EXTERNAL_RELOCS_SIZE return the mmap
+   memory address and size.  Otherwise, *EXTERNAL_RELOCS_ADDR is
+   unchanged and *EXTERNAL_RELOCS_SIZE returns 0.
 
    Returns FALSE if something goes wrong.  */
 
@@ -2653,7 +2656,8 @@ static bool
 elf_link_read_relocs_from_section (bfd *abfd,
                                    asection *sec,
                                    Elf_Internal_Shdr *shdr,
-                                   void *external_relocs,
+                                   void **external_relocs_addr,
+                                   size_t *external_relocs_size,
                                    Elf_Internal_Rela *internal_relocs)
 {
   const struct elf_backend_data *bed;
@@ -2663,13 +2667,17 @@ elf_link_read_relocs_from_section (bfd *abfd,
   Elf_Internal_Rela *irela;
   Elf_Internal_Shdr *symtab_hdr;
   size_t nsyms;
+  void *external_relocs = *external_relocs_addr;
 
   /* Position ourselves at the start of the section.  */
   if (bfd_seek (abfd, shdr->sh_offset, SEEK_SET) != 0)
     return false;
 
   /* Read the relocations.  */
-  if (bfd_read (external_relocs, shdr->sh_size, abfd) != shdr->sh_size)
+  *external_relocs_size = shdr->sh_size;
+  if (!_bfd_mmap_read_temporary (&external_relocs,
+                                 external_relocs_size,
+                                 external_relocs_addr, abfd))
     return false;
 
   symtab_hdr = &elf_tdata (abfd)->symtab_hdr;
@@ -2754,6 +2762,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd,
                                 bool keep_memory)
 {
   void *alloc1 = NULL;
+  size_t alloc1_size;
   Elf_Internal_Rela *alloc2 = NULL;
   const struct elf_backend_data *bed = get_elf_backend_data (abfd);
   struct bfd_elf_section_data *esdo = elf_section_data (o);
@@ -2791,17 +2800,26 @@ _bfd_elf_link_info_read_relocs (bfd *abfd,
       if (esdo->rela.hdr)
         size += esdo->rela.hdr->sh_size;
 
-      alloc1 = bfd_malloc (size);
-      if (alloc1 == NULL)
-        goto error_return;
-      external_relocs = alloc1;
+#ifdef USE_MMAP
+      if (size < _bfd_minimum_mmap_size)
+        {
+#endif
+          alloc1 = bfd_malloc (size);
+          if (alloc1 == NULL)
+            goto error_return;
+          external_relocs = alloc1;
+#ifdef USE_MMAP
+        }
+#endif
     }
+  else
+    alloc1 = external_relocs;
 
   internal_rela_relocs = internal_relocs;
   if (esdo->rel.hdr)
     {
       if (!elf_link_read_relocs_from_section (abfd, o, esdo->rel.hdr,
-                                              external_relocs,
+                                              &alloc1, &alloc1_size,
                                               internal_relocs))
         goto error_return;
       external_relocs = (((bfd_byte *) external_relocs)
@@ -2812,7 +2830,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd,
   if (esdo->rela.hdr
       && (!elf_link_read_relocs_from_section (abfd, o,
                                               esdo->rela.hdr,
-                                              external_relocs,
+                                              &alloc1, &alloc1_size,
                                               internal_rela_relocs)))
     goto error_return;
 
@@ -2820,7 +2838,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd,
   if (keep_memory)
     esdo->relocs = internal_relocs;
 
-  free (alloc1);
+  _bfd_munmap_readonly_temporary (alloc1, alloc1_size);
 
   /* Don't free alloc2, since if it was allocated we are passing it
      back (under the name of internal_relocs).  */
@@ -2828,7 +2846,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd,
   return internal_relocs;
 
  error_return:
-  free (alloc1);
+  _bfd_munmap_readonly_temporary (alloc1, alloc1_size);
   if (alloc2 != NULL)
     {
       if (keep_memory)
@@ -12741,12 +12759,14 @@ bfd_elf_final_link (bfd *abfd, struct bfd_link_info *info)
         goto error_return;
     }
 
+#ifndef USE_MMAP
   if (max_external_reloc_size != 0)
     {
       flinfo.external_relocs = bfd_malloc (max_external_reloc_size);
       if (flinfo.external_relocs == NULL)
         goto error_return;
     }
+#endif
 
   if (max_internal_reloc_count != 0)
     {
diff --git a/bfd/libbfd-in.h b/bfd/libbfd-in.h
index c5a79cf932c..9d5396a0354 100644
--- a/bfd/libbfd-in.h
+++ b/bfd/libbfd-in.h
@@ -905,6 +905,9 @@ extern void _bfd_munmap_readonly_temporary
 #define _bfd_munmap_readonly_temporary(ptr, rsize) free (ptr)
 #endif
 
+extern bool _bfd_mmap_read_temporary
+  (void **, size_t *, void **, bfd *) ATTRIBUTE_HIDDEN;
+
 static inline void *
 _bfd_malloc_and_read (bfd *abfd, bfd_size_type asize, bfd_size_type rsize)
 {
diff --git a/bfd/libbfd.c b/bfd/libbfd.c
index e5147a29d69..3a6bf4e35a2 100644
--- a/bfd/libbfd.c
+++ b/bfd/libbfd.c
@@ -1174,6 +1174,39 @@ _bfd_mmap_readonly_persistent (bfd *abfd, size_t rsize)
 }
 #endif
 
+/* Attempt to read *SIZE_P bytes from ABFD's iostream to *DATA_P.
+   Return true if the full amount has been read.  If *DATA_P is
+   NULL, mmap should be used, return the memory address at the
+   current offset in *DATA_P as well as return mmap address and size
+   in *MMAP_BASE and *SIZE_P.  Otherwise, return NULL in *MMAP_BASE
+   and 0 in *SIZE_P.  */
+
+bool
+_bfd_mmap_read_temporary (void **data_p, size_t *size_p,
+                          void **mmap_base, bfd *abfd)
+{
+  void *data = *data_p;
+  size_t size = *size_p;
+
+#ifdef USE_MMAP
+  if (data == NULL)
+    {
+      data = _bfd_mmap_readonly_temporary (abfd, size, mmap_base,
+                                           size_p);
+      if (data == NULL || data == MAP_FAILED)
+        abort ();
+      *data_p = data;
+      return true;
+    }
+  else
+#endif
+    {
+      *mmap_base = NULL;
+      *size_p = 0;
+      return bfd_read (data, size, abfd) == size;
+    }
+}
+
 /* Default implementation */
 
 bool
diff --git a/bfd/libbfd.h b/bfd/libbfd.h
index 47f40889a95..4b605c4ce6c 100644
--- a/bfd/libbfd.h
+++ b/bfd/libbfd.h
@@ -911,6 +911,9 @@ extern void _bfd_munmap_readonly_temporary
 #define _bfd_munmap_readonly_temporary(ptr, rsize) free (ptr)
 #endif
 
+extern bool _bfd_mmap_read_temporary
+  (void **, size_t *, void **, bfd *) ATTRIBUTE_HIDDEN;
+
 static inline void *
 _bfd_malloc_and_read (bfd *abfd, bfd_size_type asize, bfd_size_type rsize)
 {
-- 
2.44.0
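The ownership rule that _bfd_mmap_read_temporary and _bfd_munmap_readonly_temporary share (data pointer for the caller, separate base address and size for the release) can be sketched outside BFD with plain POSIX calls.  This is a rough illustration, not the BFD implementation: struct temp_buf, read_temporary and release_temporary are hypothetical names, the 4 * page size threshold is taken from the commit message, and the sketch returns false on a failed mmap where the patch calls abort.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a temporary read buffer.  DATA points at
   the requested bytes.  If MAP_BASE is non-NULL, the bytes live in a
   mapping of MAP_SIZE bytes; otherwise DATA came from malloc.  */
struct temp_buf
{
  void *data;
  void *map_base;
  size_t map_size;
};

/* Read SIZE bytes at OFFSET of FD into BUF, mapping large requests
   and copying small ones.  Return true on success.  */
bool
read_temporary (int fd, off_t offset, size_t size, struct temp_buf *buf)
{
  long page = sysconf (_SC_PAGESIZE);
  assert (page > 0 && offset >= 0);

  buf->data = NULL;
  buf->map_base = NULL;
  buf->map_size = 0;

  if (size >= (size_t) page * 4)
    {
      /* mmap needs a page-aligned file offset, so map from the start
         of the containing page and skip the leading slack.  */
      off_t slack = offset % page;
      size_t map_size = size + (size_t) slack;
      void *base = mmap (NULL, map_size, PROT_READ, MAP_PRIVATE,
                         fd, offset - slack);
      if (base == MAP_FAILED)
        return false;
      buf->map_base = base;
      buf->map_size = map_size;
      buf->data = (char *) base + slack;
      return true;
    }

  /* Small request: copy the bytes into a malloc'ed buffer.  */
  buf->data = malloc (size != 0 ? size : 1);
  if (buf->data == NULL)
    return false;
  if (pread (fd, buf->data, size, offset) != (ssize_t) size)
    {
      free (buf->data);
      buf->data = NULL;
      return false;
    }
  return true;
}

/* Release BUF with the call that matches how it was obtained.  */
void
release_temporary (struct temp_buf *buf)
{
  if (buf->map_base != NULL)
    munmap (buf->map_base, buf->map_size);
  else
    free (buf->data);
  buf->data = NULL;
  buf->map_base = NULL;
  buf->map_size = 0;
}
```

Keeping the mapping base and length apart from the data pointer is what lets one release call serve both paths, which appears to be why the patch threads the extra address and size arguments through elf_link_read_relocs_from_section.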