From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yb1-xb2e.google.com (mail-yb1-xb2e.google.com [IPv6:2607:f8b0:4864:20::b2e]) by sourceware.org (Postfix) with ESMTPS id 8426F385E001 for ; Fri, 8 Mar 2024 15:36:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8426F385E001 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8426F385E001 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::b2e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1709912214; cv=none; b=M8G51H48gWeZ/9id1GIkZ1AsomUEIvCGndHHHgsnuQa6H9LJfcvMwXsUx1FrdqItbG6bQyntSgQk/T22U5LS0ijQpnU9UQO+Ss3zswNf+oz4bsVl2/UE8crMqfgjn+JTmONMc7Az6VCQ36ggCy0MsJF0hpRBo7pt+PGc7uZ9pUM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1709912214; c=relaxed/simple; bh=TJBKfkh5liPftCu24XrkHV5/nAWyw2ixYGq41MeG9VA=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=eI9s6Si+S6E4h6fRtqRsdT4QwTHiIMh23P7WdLMPmQwsqfw5G3OmMOxd2JBAJGhP/uIyWH/Jaze/8rLXr87E9/6d5jhPWaE0CSKbxpNyY47kLuJYNPrS1L4mN8n8Usqp+aal7/9gLlfeDA5Hf7uBEWVsIjKpKwNQsfBr1X/Okag= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-yb1-xb2e.google.com with SMTP id 3f1490d57ef6-d9b9adaf291so993520276.1 for ; Fri, 08 Mar 2024 07:36:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1709912210; x=1710517010; darn=sourceware.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Fll0OA5uyvwLMZ2ycUCW0KswhxUPKYNG06UMfker5GU=; b=E7NAW5JP/gR15vW6HwYi5OVe/4g54laYPMl7trcnkePDpfJOvKr3i8zM58nfgd9Bw3 cLUz0ZrbRsvBA5aizUy72JOB+0t86d0/h1/m8pfH4hXqQ5izpgFPdXWFxRXwz24+apYb bGwPR4KAKN0Tg2uv4SgaYFehfvA0I/KA9wVKDl7a+0PULVZcyy6AOhOTlwv37XgjYgNg XRvyD+1nvy/Fv6thCDK/X38itDRRNCZ0EcRhcz0YqLTp0xwrMjOHz4VhBtvXqWdoj5vv ZXcKE9G6u/FcSJN1Nz16wPwupe1i8KZOz/s6OpbYpvpl+V3tINhWME5OdDK+uKusNRma 5mSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709912210; x=1710517010; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Fll0OA5uyvwLMZ2ycUCW0KswhxUPKYNG06UMfker5GU=; b=htUmSVcq4sxpFwW8DTYEmWTssgMllV1Al/v2ErVbwYqHswkLaxSZOLBPS01Hx4hk65 JS7njPwSBx15txh0R0KCRVRgVDunacdDdh0g6FCaaU53+MxWytqRKhvcVy73IZP0b+2T Mi97sSdlQUvAGWg0aG0K/cMS6/PwdBiEZNoXrmZlbXVnSynfyMUv6CVmIgQvYW2/C6+K b6yCNb37FGzEsx+X/knNjC+ThGOUq8WeNtoc1j30W8hBhp94Z3lry4YxR6JrPxfEvHB+ r0Ana45A1AEoKqXJkf9iHj//9hertj3IrFykfDKohScNmYrygQGcxv2nq2FXo8hY4S4u MhjQ== X-Gm-Message-State: AOJu0YxaVnSppfT7tQYp6WJJsPbhM1GHxdvFtQjlxj3r0BC7kST1vgX9 oxIyd0ktxoILqTExHFoO08XZcBAIe2xOc0jv7HtbYCLUa9KShQ1ub+xDB1oJ8WRJ4muBNaF3Ih5 A2l90awuPbRujqzOlYPJ5oXggroUGfg6GJIk= X-Google-Smtp-Source: AGHT+IGKpGhebC908pgh8zeQv/gd6IxiD6sH8GBRfYCRvVc5DA9jq6694MCM6ZXma+wyKBnj9c+RmhEzW40jkQwAhzw= X-Received: by 2002:a25:b47:0:b0:dc3:696f:167e with SMTP id 68-20020a250b47000000b00dc3696f167emr17638649ybl.64.1709912209945; Fri, 08 Mar 2024 07:36:49 -0800 (PST) MIME-Version: 1.0 References: <20240306232401.1408530-1-hjl.tools@gmail.com> <20240306232401.1408530-6-hjl.tools@gmail.com> In-Reply-To: From: "H.J. Lu" Date: Fri, 8 Mar 2024 07:36:13 -0800 Message-ID: Subject: Re: [PATCH v4 5/6] elf: Use mmap to map in symbol and relocation tables To: Alan Modra Cc: binutils@sourceware.org, goldstein.w.n@gmail.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-3019.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Mar 7, 2024 at 4:42=E2=80=AFPM Alan Modra wrote: > > On Wed, Mar 06, 2024 at 03:24:00PM -0800, H.J. Lu wrote: > > Add _bfd_mmap_read_untracked to mmap in symbol tables and relocations > > whose sizes >=3D 4 * page size. Don't cache external relocations when > > mmap is used. > > > > When mmap is used to map in all ELF sections, data to link the 3.5GB > > clang executable in LLVM 17 debug build on Linux/x86-64 with 32GB RAM > > is: > > > > stdio mmap improvement > > user 84.28 85.04 -0.9% > > system 12.46 10.16 14% > > total 96 95.35 0.7% > > page faults 4837944 4047667 16% > > > > and data to link the 275M cc1plus executable in GCC 14 stage 1 build > > is: > > > > user 5.22 5.27 -1% > > system 0.94 0.84 11% > > total 6.20 6.13 0.7% > > page faults 361272 323377 10% > > > > * elf.c (bfd_elf_get_elf_syms): Replace bfd_read with > > _bfd_mmap_read_untracked. > > * elflink.c (elf_link_read_relocs_from_section): Add 2 arguments > > to return mmap memory address and size. > > (_bfd_elf_link_info_read_relocs); Replace bfd_read with > > _bfd_mmap_read_untracked. > > (bfd_elf_final_link): Don't cache external relocations when mmap > > is used. > > * libbfd.c (_bfd_mmap_read_untracked ): New. > > * libbfd-in.h (_bfd_mmap_read_untracked): Likewise. > > * libbfd.h: Regenerated. > > --- > > bfd/elf.c | 47 +++++++++++++++++++++++++++++++++++------------ > > bfd/elflink.c | 44 ++++++++++++++++++++++++++++++++------------ > > bfd/libbfd-in.h | 3 +++ > > bfd/libbfd.c | 33 +++++++++++++++++++++++++++++++++ > > bfd/libbfd.h | 3 +++ > > 5 files changed, 106 insertions(+), 24 deletions(-) > > > > diff --git a/bfd/elf.c b/bfd/elf.c > > index e2e31a93950..7427dca2ba0 100644 > > --- a/bfd/elf.c > > +++ b/bfd/elf.c > > @@ -464,19 +464,30 @@ bfd_elf_get_elf_syms (bfd *ibfd, > > goto out; > > } > > pos =3D symtab_hdr->sh_offset + symoffset * extsym_size; > > + size_t alloc_ext_size =3D amt; > > if (extsym_buf =3D=3D NULL) > > { > > - alloc_ext =3D bfd_malloc (amt); > > - extsym_buf =3D alloc_ext; > > +#ifdef USE_MMAP > > + if ((ibfd->flags & BFD_PLUGIN) !=3D 0 > > + || amt < _bfd_minimum_mmap_size) > > + { > > +#endif > > + alloc_ext =3D bfd_malloc (amt); > > + extsym_buf =3D alloc_ext; > > +#ifdef USE_MMAP > > + } > > +#endif > > } > > - if (extsym_buf =3D=3D NULL > > - || bfd_seek (ibfd, pos, SEEK_SET) !=3D 0 > > - || bfd_read (extsym_buf, amt, ibfd) !=3D amt) > > + > > + if (bfd_seek (ibfd, pos, SEEK_SET) !=3D 0 > > + || !_bfd_mmap_read_untracked (&extsym_buf, &alloc_ext_size, > > + &alloc_ext, ibfd)) > > { > > intsym_buf =3D NULL; > > goto out; > > } > > > > + size_t alloc_extshndx_size =3D 0; > > if (shndx_hdr =3D=3D NULL || shndx_hdr->sh_size =3D=3D 0) > > extshndx_buf =3D NULL; > > else > > @@ -487,15 +498,27 @@ bfd_elf_get_elf_syms (bfd *ibfd, > > intsym_buf =3D NULL; > > goto out; > > } > > + alloc_extshndx_size =3D amt; > > pos =3D shndx_hdr->sh_offset + symoffset * sizeof (Elf_External_= Sym_Shndx); > > if (extshndx_buf =3D=3D NULL) > > { > > - alloc_extshndx =3D (Elf_External_Sym_Shndx *) bfd_malloc (amt); > > - extshndx_buf =3D alloc_extshndx; > > +#ifdef USE_MMAP > > + if ((ibfd->flags & BFD_PLUGIN) !=3D 0 > > + || amt < _bfd_minimum_mmap_size) > > + { > > +#endif > > + alloc_extshndx > > + =3D (Elf_External_Sym_Shndx *) bfd_malloc (amt); > > + extshndx_buf =3D alloc_extshndx; > > +#ifdef USE_MMAP > > + } > > +#endif > > } > > - if (extshndx_buf =3D=3D NULL > > - || bfd_seek (ibfd, pos, SEEK_SET) !=3D 0 > > - || bfd_read (extshndx_buf, amt, ibfd) !=3D amt) > > + if (bfd_seek (ibfd, pos, SEEK_SET) !=3D 0 > > + || !_bfd_mmap_read_untracked ((void **) &extshndx_buf, > > + &alloc_extshndx_size, > > + (void **) &alloc_extshndx, > > + ibfd)) > > { > > intsym_buf =3D NULL; > > goto out; > > @@ -534,8 +557,8 @@ bfd_elf_get_elf_syms (bfd *ibfd, > > } > > > > out: > > - free (alloc_ext); > > - free (alloc_extshndx); > > + _bfd_munmap_readonly_untracked (alloc_ext, alloc_ext_size); > > + _bfd_munmap_readonly_untracked (alloc_extshndx, alloc_extshndx_size)= ; > > > > return intsym_buf; > > } > > diff --git a/bfd/elflink.c b/bfd/elflink.c > > index 47fb890f94f..4602fb3d10f 100644 > > --- a/bfd/elflink.c > > +++ b/bfd/elflink.c > > @@ -2644,8 +2644,11 @@ _bfd_elf_link_assign_sym_version (struct elf_lin= k_hash_entry *h, void *data) > > may be either a REL or a RELA section. The relocations are > > translated into RELA relocations and stored in INTERNAL_RELOCS, > > which should have already been allocated to contain enough space. > > - The EXTERNAL_RELOCS are a buffer where the external form of the > > - relocations should be stored. > > + The *EXTERNAL_RELOCS_P are a buffer where the external form of the > > + relocations should be stored. If *EXTERNAL_RELOCS_P is NULL, > > + *EXTERNAL_RELOCS_P and *EXTERNAL_RELOCS_SIZE_P returns the mmap > > + memory address and size. Otherwise, *EXTERNAL_RELOCS_SIZE_P is > > + unchanged and EXTERNAL_RELOCS_SIZE_P returns 0. > > Comment doesn't match function params. Fixed. > > Returns FALSE if something goes wrong. */ > > > > @@ -2653,7 +2656,8 @@ static bool > > elf_link_read_relocs_from_section (bfd *abfd, > > asection *sec, > > Elf_Internal_Shdr *shdr, > > - void *external_relocs, > > + void **external_relocs_addr, > > + size_t *external_relocs_size_addr, > > I think external_relocs_size is a better name. Why the _addr suffix? Fixed. > > Elf_Internal_Rela *internal_relocs) > > { > > const struct elf_backend_data *bed; > > @@ -2663,13 +2667,17 @@ elf_link_read_relocs_from_section (bfd *abfd, > > Elf_Internal_Rela *irela; > > Elf_Internal_Shdr *symtab_hdr; > > size_t nsyms; > > + void *external_relocs =3D *external_relocs_addr; > > > > /* Position ourselves at the start of the section. */ > > if (bfd_seek (abfd, shdr->sh_offset, SEEK_SET) !=3D 0) > > return false; > > > > /* Read the relocations. */ > > - if (bfd_read (external_relocs, shdr->sh_size, abfd) !=3D shdr->sh_si= ze) > > + *external_relocs_size_addr =3D shdr->sh_size; > > + if (!_bfd_mmap_read_untracked (&external_relocs, > > + external_relocs_size_addr, > > + external_relocs_addr, abfd)) > > return false; > > > > symtab_hdr =3D &elf_tdata (abfd)->symtab_hdr; > > @@ -2754,6 +2762,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > bool keep_memory) > > { > > void *alloc1 =3D NULL; > > + size_t alloc1_size; > > Elf_Internal_Rela *alloc2 =3D NULL; > > const struct elf_backend_data *bed =3D get_elf_backend_data (abfd); > > struct bfd_elf_section_data *esdo =3D elf_section_data (o); > > @@ -2791,17 +2800,26 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > if (esdo->rela.hdr) > > size +=3D esdo->rela.hdr->sh_size; > > > > - alloc1 =3D bfd_malloc (size); > > - if (alloc1 =3D=3D NULL) > > - goto error_return; > > - external_relocs =3D alloc1; > > +#ifdef USE_MMAP > > + if (size < _bfd_minimum_mmap_size) > > + { > > +#endif > > + alloc1 =3D bfd_malloc (size); > > + if (alloc1 =3D=3D NULL) > > + goto error_return; > > + external_relocs =3D alloc1; > > +#ifdef USE_MMAP > > + } > > +#endif > > } > > + else > > + alloc1 =3D external_relocs; > > > > internal_rela_relocs =3D internal_relocs; > > if (esdo->rel.hdr) > > { > > if (!elf_link_read_relocs_from_section (abfd, o, esdo->rel.hdr, > > - external_relocs, > > + &alloc1, &alloc1_size, > > internal_relocs)) > > goto error_return; > > external_relocs =3D (((bfd_byte *) external_relocs) > > @@ -2812,7 +2830,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > > > if (esdo->rela.hdr > > && (!elf_link_read_relocs_from_section (abfd, o, esdo->rela.hdr, > > - external_relocs, > > + &alloc1, &alloc1_size, > > internal_rela_relocs))) > > goto error_return; > > > > @@ -2820,7 +2838,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > if (keep_memory) > > esdo->relocs =3D internal_relocs; > > > > - free (alloc1); > > + _bfd_munmap_readonly_untracked (alloc1, alloc1_size); > > > > /* Don't free alloc2, since if it was allocated we are passing it > > back (under the name of internal_relocs). */ > > @@ -2828,7 +2846,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > return internal_relocs; > > > > error_return: > > - free (alloc1); > > + _bfd_munmap_readonly_untracked (alloc1, alloc1_size); > > if (alloc2 !=3D NULL) > > { > > if (keep_memory) > > @@ -12741,12 +12759,14 @@ bfd_elf_final_link (bfd *abfd, struct bfd_lin= k_info *info) > > goto error_return; > > } > > > > +#ifndef USE_MMAP > > if (max_external_reloc_size !=3D 0) > > { > > flinfo.external_relocs =3D bfd_malloc (max_external_reloc_size); > > if (flinfo.external_relocs =3D=3D NULL) > > goto error_return; > > } > > +#endif > > > > if (max_internal_reloc_count !=3D 0) > > { > > diff --git a/bfd/libbfd-in.h b/bfd/libbfd-in.h > > index 7887fad9c92..effe1b86b53 100644 > > --- a/bfd/libbfd-in.h > > +++ b/bfd/libbfd-in.h > > @@ -905,6 +905,9 @@ extern void _bfd_munmap_readonly_untracked > > #define _bfd_munmap_readonly_untracked(ptr, rsize) free (ptr) > > #endif > > > > +extern bool _bfd_mmap_read_untracked > > + (void **, size_t *, void **, bfd *) ATTRIBUTE_HIDDEN; > > + > > static inline void * > > _bfd_malloc_and_read (bfd *abfd, bfd_size_type asize, bfd_size_type rs= ize) > > { > > diff --git a/bfd/libbfd.c b/bfd/libbfd.c > > index c847c4f0180..0a31f113999 100644 > > --- a/bfd/libbfd.c > > +++ b/bfd/libbfd.c > > @@ -1180,6 +1180,39 @@ _bfd_mmap_readonly_tracked (bfd *abfd, size_t rs= ize) > > } > > #endif > > > > +/* Attempt to read *SIZE_ADDR bytes from ABFD's iostream to *PTR_P. > > + Return true if the full the amount has been read. If *PTR_P is > > + NULL, mmap should be used, return the memory address at the > > + current offset in *PTR_P as well as return mmap address and size > > + in *PTR_ADDR and *SIZE_ADDR. Otherwise, return NULL in *PTR_ADDR > > + and 0 in *SIZE_ADDR. */ > > + > > +bool > > +_bfd_mmap_read_untracked (void **ptr_p, size_t *size_addr, > > + void **ptr_addr, bfd *abfd) > > I find these parameter names confusing. Perhaps > void **data, size_t *size, void **mmap_base, bfd *abfd Fixed. > > +{ > > + void *ptr =3D *ptr_p; > > + size_t size =3D *size_addr; > > + > > +#ifdef USE_MMAP > > + if (ptr =3D=3D NULL) > > + { > > + ptr =3D _bfd_mmap_readonly_untracked (abfd, size, ptr_addr, > > + size_addr); > > + if (ptr =3D=3D NULL || ptr =3D=3D (void *) -1) > > + abort (); > > + *ptr_p =3D ptr; > > + return true; > > + } > > + else > > +#endif > > + { > > + *ptr_addr =3D NULL; > > + *size_addr =3D 0; > > + return bfd_read (ptr, size, abfd) =3D=3D size; > > + } > > +} > > + > > /* Default implementation */ > > > > bool > > diff --git a/bfd/libbfd.h b/bfd/libbfd.h > > index 1515a03b093..c80f5a86ed1 100644 > > --- a/bfd/libbfd.h > > +++ b/bfd/libbfd.h > > @@ -911,6 +911,9 @@ extern void _bfd_munmap_readonly_untracked > > #define _bfd_munmap_readonly_untracked(ptr, rsize) free (ptr) > > #endif > > > > +extern bool _bfd_mmap_read_untracked > > + (void **, size_t *, void **, bfd *) ATTRIBUTE_HIDDEN; > > + > > static inline void * > > _bfd_malloc_and_read (bfd *abfd, bfd_size_type asize, bfd_size_type rs= ize) > > { > > -- > > 2.44.0 > > OK with these fixed. Here is the v5 patch: https://patchwork.sourceware.org/project/binutils/list/?series=3D31726 I will check it in tomorrow. Thanks. --=20 H.J.