From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yw1-x112a.google.com (mail-yw1-x112a.google.com [IPv6:2607:f8b0:4864:20::112a]) by sourceware.org (Postfix) with ESMTPS id 058E03858C60 for ; Sat, 9 Mar 2024 16:34:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 058E03858C60 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 058E03858C60 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::112a ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1710002100; cv=none; b=KL3XcqhJMCTyc6xvuyThLjueD+90z2tYzoyY28P12GecSxvIsEF9hZgECIL4Af7jD1OeypAO2MOZGEP8LSrZtEaRQ34SL7CFKEF8t3vRGkjARkKxzghQSnkElpDRasONHAz88q6EZJmI4AAbWWhhbvHY7XxuzFkNEb1i8RhXUkI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1710002100; c=relaxed/simple; bh=NjW3HrZ0Sg8GspFXKuuDlBTvVMstyFZQLSnlA1sVC6I=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=aH/MO6xsoj6U5TUWxzmDaSKbDn4xtLWwx3smtx9FO7+jWcYl3UfGjBNq/BXdmJB0ArSKeT9fxIvW4oLFjbtdCIIcELSvUQ1piQXm6JxlN2Cl2ed0q+ocpCVmAfaAyT/nczd0rAFViXMioNuSvavxNXkyLk/FJ59NPjNVdRljcuQ= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-yw1-x112a.google.com with SMTP id 00721157ae682-609f060cbafso31858997b3.0 for ; Sat, 09 Mar 2024 08:34:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1710002097; x=1710606897; darn=sourceware.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=0x8k4OxCC+cbbRfpUdWVWpY09Bxv8FKl3oIXVFR4UbI=; b=XGFt2LqaF7HVBrFE/LLGnNeZkm8hoyUd3HD92ajfHGL/WflUsTgVa4Kq2Ix/jBFnC9 tYe1MgmnVugg+2NTzhRl+QoqpfQcX+tauprbQcybTguBZAFXCVXJf4yz32NmpLC/aXmt Q1qmhVAAFlDOGPAz8npTYPXYLY72KqkRTnUM4ngx6xno29FKZAaiEfbHV5mftwLXuC22 29gOLXrYXp1UuLd3IJmC+Zb871fhf+A5I7qH2E8D99D/OBYI5wJTqX2xfekMITrqwXKV gpIJV6hbx/+tNwCsUYHsUV1HxhyiwzyA/+fTTBqo/qOyJDNZ7eqpU5MXRBy3T+KEDKzz STig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710002097; x=1710606897; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0x8k4OxCC+cbbRfpUdWVWpY09Bxv8FKl3oIXVFR4UbI=; b=sV6Oj/MhQ9ByCN1vyxvmHH983a2oxwbuV4wVOkJ5Z6aauwvRGOidGk4U64bPGsACmt ethDSZxzKQKJ2US8OYGsDIc0ipedWoVtsefMmIJxFcEnEmscN0V+8n4QBlW13Xy6jwwE i4WN/KFzJDYygwjBqVJ5+xmEm4CtucMcXa0tAfm6MDN1ps9m1jz45Dbp7wjll4vYxm2k xmzdpTDo1ZMCdH71oh6dDHl7oilG95DfUd+kMaopB9RUzJKM6r0KKWVZJsfv7vuYFzNa lF29dmz6HDG6UhktpIIld0NuvWXGrDQlcododrXzETlZHpb0pBmovGyMPz1WqmesIgWz FZIA== X-Gm-Message-State: AOJu0Yw8qnoBS29174tcR6L4TBzGFp/Wrm0lMmn+vwm0ljTGuGATSk4X zNYLkkfU8S0VWOgoZKpcdjsOaxHgP33cq5DijwP5Yes4IJo4ZjKUiroYfg3wlewMomEA7DbGj7T ruRnVnKVBzWmVvqdm+ovV62lGQ08= X-Google-Smtp-Source: AGHT+IFBgAdhXwN+WG/S0J0Rh6RjxoMHMfJUCjI+mDFvdJS5gVzouedA4BxHa4iiMZlD+smABl1pIEHxd5O4KUuP9Mc= X-Received: by 2002:a81:c30a:0:b0:609:fc7f:1e0d with SMTP id r10-20020a81c30a000000b00609fc7f1e0dmr2250785ywk.30.1710002097204; Sat, 09 Mar 2024 08:34:57 -0800 (PST) MIME-Version: 1.0 References: <20240306232401.1408530-1-hjl.tools@gmail.com> <20240306232401.1408530-6-hjl.tools@gmail.com> In-Reply-To: From: "H.J. Lu" Date: Sat, 9 Mar 2024 08:34:20 -0800 Message-ID: Subject: Re: [PATCH v4 5/6] elf: Use mmap to map in symbol and relocation tables To: Alan Modra Cc: binutils@sourceware.org, goldstein.w.n@gmail.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-3020.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, Mar 8, 2024 at 7:36=E2=80=AFAM H.J. Lu wrote: > > On Thu, Mar 7, 2024 at 4:42=E2=80=AFPM Alan Modra wrot= e: > > > > On Wed, Mar 06, 2024 at 03:24:00PM -0800, H.J. Lu wrote: > > > Add _bfd_mmap_read_untracked to mmap in symbol tables and relocations > > > whose sizes >=3D 4 * page size. Don't cache external relocations whe= n > > > mmap is used. > > > > > > When mmap is used to map in all ELF sections, data to link the 3.5GB > > > clang executable in LLVM 17 debug build on Linux/x86-64 with 32GB RAM > > > is: > > > > > > stdio mmap improvement > > > user 84.28 85.04 -0.9% > > > system 12.46 10.16 14% > > > total 96 95.35 0.7% > > > page faults 4837944 4047667 16% > > > > > > and data to link the 275M cc1plus executable in GCC 14 stage 1 build > > > is: > > > > > > user 5.22 5.27 -1% > > > system 0.94 0.84 11% > > > total 6.20 6.13 0.7% > > > page faults 361272 323377 10% > > > > > > * elf.c (bfd_elf_get_elf_syms): Replace bfd_read with > > > _bfd_mmap_read_untracked. > > > * elflink.c (elf_link_read_relocs_from_section): Add 2 argument= s > > > to return mmap memory address and size. > > > (_bfd_elf_link_info_read_relocs); Replace bfd_read with > > > _bfd_mmap_read_untracked. > > > (bfd_elf_final_link): Don't cache external relocations when mma= p > > > is used. > > > * libbfd.c (_bfd_mmap_read_untracked ): New. > > > * libbfd-in.h (_bfd_mmap_read_untracked): Likewise. > > > * libbfd.h: Regenerated. > > > --- > > > bfd/elf.c | 47 +++++++++++++++++++++++++++++++++++------------ > > > bfd/elflink.c | 44 ++++++++++++++++++++++++++++++++------------ > > > bfd/libbfd-in.h | 3 +++ > > > bfd/libbfd.c | 33 +++++++++++++++++++++++++++++++++ > > > bfd/libbfd.h | 3 +++ > > > 5 files changed, 106 insertions(+), 24 deletions(-) > > > > > > diff --git a/bfd/elf.c b/bfd/elf.c > > > index e2e31a93950..7427dca2ba0 100644 > > > --- a/bfd/elf.c > > > +++ b/bfd/elf.c > > > @@ -464,19 +464,30 @@ bfd_elf_get_elf_syms (bfd *ibfd, > > > goto out; > > > } > > > pos =3D symtab_hdr->sh_offset + symoffset * extsym_size; > > > + size_t alloc_ext_size =3D amt; > > > if (extsym_buf =3D=3D NULL) > > > { > > > - alloc_ext =3D bfd_malloc (amt); > > > - extsym_buf =3D alloc_ext; > > > +#ifdef USE_MMAP > > > + if ((ibfd->flags & BFD_PLUGIN) !=3D 0 > > > + || amt < _bfd_minimum_mmap_size) > > > + { > > > +#endif > > > + alloc_ext =3D bfd_malloc (amt); > > > + extsym_buf =3D alloc_ext; > > > +#ifdef USE_MMAP > > > + } > > > +#endif > > > } > > > - if (extsym_buf =3D=3D NULL > > > - || bfd_seek (ibfd, pos, SEEK_SET) !=3D 0 > > > - || bfd_read (extsym_buf, amt, ibfd) !=3D amt) > > > + > > > + if (bfd_seek (ibfd, pos, SEEK_SET) !=3D 0 > > > + || !_bfd_mmap_read_untracked (&extsym_buf, &alloc_ext_size, > > > + &alloc_ext, ibfd)) > > > { > > > intsym_buf =3D NULL; > > > goto out; > > > } > > > > > > + size_t alloc_extshndx_size =3D 0; > > > if (shndx_hdr =3D=3D NULL || shndx_hdr->sh_size =3D=3D 0) > > > extshndx_buf =3D NULL; > > > else > > > @@ -487,15 +498,27 @@ bfd_elf_get_elf_syms (bfd *ibfd, > > > intsym_buf =3D NULL; > > > goto out; > > > } > > > + alloc_extshndx_size =3D amt; > > > pos =3D shndx_hdr->sh_offset + symoffset * sizeof (Elf_Externa= l_Sym_Shndx); > > > if (extshndx_buf =3D=3D NULL) > > > { > > > - alloc_extshndx =3D (Elf_External_Sym_Shndx *) bfd_malloc (amt= ); > > > - extshndx_buf =3D alloc_extshndx; > > > +#ifdef USE_MMAP > > > + if ((ibfd->flags & BFD_PLUGIN) !=3D 0 > > > + || amt < _bfd_minimum_mmap_size) > > > + { > > > +#endif > > > + alloc_extshndx > > > + =3D (Elf_External_Sym_Shndx *) bfd_malloc (amt); > > > + extshndx_buf =3D alloc_extshndx; > > > +#ifdef USE_MMAP > > > + } > > > +#endif > > > } > > > - if (extshndx_buf =3D=3D NULL > > > - || bfd_seek (ibfd, pos, SEEK_SET) !=3D 0 > > > - || bfd_read (extshndx_buf, amt, ibfd) !=3D amt) > > > + if (bfd_seek (ibfd, pos, SEEK_SET) !=3D 0 > > > + || !_bfd_mmap_read_untracked ((void **) &extshndx_buf, > > > + &alloc_extshndx_size, > > > + (void **) &alloc_extshndx, > > > + ibfd)) > > > { > > > intsym_buf =3D NULL; > > > goto out; > > > @@ -534,8 +557,8 @@ bfd_elf_get_elf_syms (bfd *ibfd, > > > } > > > > > > out: > > > - free (alloc_ext); > > > - free (alloc_extshndx); > > > + _bfd_munmap_readonly_untracked (alloc_ext, alloc_ext_size); > > > + _bfd_munmap_readonly_untracked (alloc_extshndx, alloc_extshndx_siz= e); > > > > > > return intsym_buf; > > > } > > > diff --git a/bfd/elflink.c b/bfd/elflink.c > > > index 47fb890f94f..4602fb3d10f 100644 > > > --- a/bfd/elflink.c > > > +++ b/bfd/elflink.c > > > @@ -2644,8 +2644,11 @@ _bfd_elf_link_assign_sym_version (struct elf_l= ink_hash_entry *h, void *data) > > > may be either a REL or a RELA section. The relocations are > > > translated into RELA relocations and stored in INTERNAL_RELOCS, > > > which should have already been allocated to contain enough space. > > > - The EXTERNAL_RELOCS are a buffer where the external form of the > > > - relocations should be stored. > > > + The *EXTERNAL_RELOCS_P are a buffer where the external form of th= e > > > + relocations should be stored. If *EXTERNAL_RELOCS_P is NULL, > > > + *EXTERNAL_RELOCS_P and *EXTERNAL_RELOCS_SIZE_P returns the mmap > > > + memory address and size. Otherwise, *EXTERNAL_RELOCS_SIZE_P is > > > + unchanged and EXTERNAL_RELOCS_SIZE_P returns 0. > > > > Comment doesn't match function params. > > Fixed. > > > > Returns FALSE if something goes wrong. */ > > > > > > @@ -2653,7 +2656,8 @@ static bool > > > elf_link_read_relocs_from_section (bfd *abfd, > > > asection *sec, > > > Elf_Internal_Shdr *shdr, > > > - void *external_relocs, > > > + void **external_relocs_addr, > > > + size_t *external_relocs_size_addr, > > > > I think external_relocs_size is a better name. Why the _addr suffix? > > Fixed. > > > > Elf_Internal_Rela *internal_relocs) > > > { > > > const struct elf_backend_data *bed; > > > @@ -2663,13 +2667,17 @@ elf_link_read_relocs_from_section (bfd *abfd, > > > Elf_Internal_Rela *irela; > > > Elf_Internal_Shdr *symtab_hdr; > > > size_t nsyms; > > > + void *external_relocs =3D *external_relocs_addr; > > > > > > /* Position ourselves at the start of the section. */ > > > if (bfd_seek (abfd, shdr->sh_offset, SEEK_SET) !=3D 0) > > > return false; > > > > > > /* Read the relocations. */ > > > - if (bfd_read (external_relocs, shdr->sh_size, abfd) !=3D shdr->sh_= size) > > > + *external_relocs_size_addr =3D shdr->sh_size; > > > + if (!_bfd_mmap_read_untracked (&external_relocs, > > > + external_relocs_size_addr, > > > + external_relocs_addr, abfd)) > > > return false; > > > > > > symtab_hdr =3D &elf_tdata (abfd)->symtab_hdr; > > > @@ -2754,6 +2762,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > > bool keep_memory) > > > { > > > void *alloc1 =3D NULL; > > > + size_t alloc1_size; > > > Elf_Internal_Rela *alloc2 =3D NULL; > > > const struct elf_backend_data *bed =3D get_elf_backend_data (abfd)= ; > > > struct bfd_elf_section_data *esdo =3D elf_section_data (o); > > > @@ -2791,17 +2800,26 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > > if (esdo->rela.hdr) > > > size +=3D esdo->rela.hdr->sh_size; > > > > > > - alloc1 =3D bfd_malloc (size); > > > - if (alloc1 =3D=3D NULL) > > > - goto error_return; > > > - external_relocs =3D alloc1; > > > +#ifdef USE_MMAP > > > + if (size < _bfd_minimum_mmap_size) > > > + { > > > +#endif > > > + alloc1 =3D bfd_malloc (size); > > > + if (alloc1 =3D=3D NULL) > > > + goto error_return; > > > + external_relocs =3D alloc1; > > > +#ifdef USE_MMAP > > > + } > > > +#endif > > > } > > > + else > > > + alloc1 =3D external_relocs; > > > > > > internal_rela_relocs =3D internal_relocs; > > > if (esdo->rel.hdr) > > > { > > > if (!elf_link_read_relocs_from_section (abfd, o, esdo->rel.hdr= , > > > - external_relocs, > > > + &alloc1, &alloc1_size, > > > internal_relocs)) > > > goto error_return; > > > external_relocs =3D (((bfd_byte *) external_relocs) > > > @@ -2812,7 +2830,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > > > > > if (esdo->rela.hdr > > > && (!elf_link_read_relocs_from_section (abfd, o, esdo->rela.hd= r, > > > - external_relocs, > > > + &alloc1, &alloc1_size, > > > internal_rela_relocs))) > > > goto error_return; > > > > > > @@ -2820,7 +2838,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > > if (keep_memory) > > > esdo->relocs =3D internal_relocs; > > > > > > - free (alloc1); > > > + _bfd_munmap_readonly_untracked (alloc1, alloc1_size); > > > > > > /* Don't free alloc2, since if it was allocated we are passing it > > > back (under the name of internal_relocs). */ > > > @@ -2828,7 +2846,7 @@ _bfd_elf_link_info_read_relocs (bfd *abfd, > > > return internal_relocs; > > > > > > error_return: > > > - free (alloc1); > > > + _bfd_munmap_readonly_untracked (alloc1, alloc1_size); > > > if (alloc2 !=3D NULL) > > > { > > > if (keep_memory) > > > @@ -12741,12 +12759,14 @@ bfd_elf_final_link (bfd *abfd, struct bfd_l= ink_info *info) > > > goto error_return; > > > } > > > > > > +#ifndef USE_MMAP > > > if (max_external_reloc_size !=3D 0) > > > { > > > flinfo.external_relocs =3D bfd_malloc (max_external_reloc_size= ); > > > if (flinfo.external_relocs =3D=3D NULL) > > > goto error_return; > > > } > > > +#endif > > > > > > if (max_internal_reloc_count !=3D 0) > > > { > > > diff --git a/bfd/libbfd-in.h b/bfd/libbfd-in.h > > > index 7887fad9c92..effe1b86b53 100644 > > > --- a/bfd/libbfd-in.h > > > +++ b/bfd/libbfd-in.h > > > @@ -905,6 +905,9 @@ extern void _bfd_munmap_readonly_untracked > > > #define _bfd_munmap_readonly_untracked(ptr, rsize) free (ptr) > > > #endif > > > > > > +extern bool _bfd_mmap_read_untracked > > > + (void **, size_t *, void **, bfd *) ATTRIBUTE_HIDDEN; > > > + > > > static inline void * > > > _bfd_malloc_and_read (bfd *abfd, bfd_size_type asize, bfd_size_type = rsize) > > > { > > > diff --git a/bfd/libbfd.c b/bfd/libbfd.c > > > index c847c4f0180..0a31f113999 100644 > > > --- a/bfd/libbfd.c > > > +++ b/bfd/libbfd.c > > > @@ -1180,6 +1180,39 @@ _bfd_mmap_readonly_tracked (bfd *abfd, size_t = rsize) > > > } > > > #endif > > > > > > +/* Attempt to read *SIZE_ADDR bytes from ABFD's iostream to *PTR_P. > > > + Return true if the full the amount has been read. If *PTR_P is > > > + NULL, mmap should be used, return the memory address at the > > > + current offset in *PTR_P as well as return mmap address and size > > > + in *PTR_ADDR and *SIZE_ADDR. Otherwise, return NULL in *PTR_ADDR > > > + and 0 in *SIZE_ADDR. */ > > > + > > > +bool > > > +_bfd_mmap_read_untracked (void **ptr_p, size_t *size_addr, > > > + void **ptr_addr, bfd *abfd) > > > > I find these parameter names confusing. Perhaps > > void **data, size_t *size, void **mmap_base, bfd *abfd > > Fixed. > > > > +{ > > > + void *ptr =3D *ptr_p; > > > + size_t size =3D *size_addr; > > > + > > > +#ifdef USE_MMAP > > > + if (ptr =3D=3D NULL) > > > + { > > > + ptr =3D _bfd_mmap_readonly_untracked (abfd, size, ptr_addr, > > > + size_addr); > > > + if (ptr =3D=3D NULL || ptr =3D=3D (void *) -1) > > > + abort (); > > > + *ptr_p =3D ptr; > > > + return true; > > > + } > > > + else > > > +#endif > > > + { > > > + *ptr_addr =3D NULL; > > > + *size_addr =3D 0; > > > + return bfd_read (ptr, size, abfd) =3D=3D size; > > > + } > > > +} > > > + > > > /* Default implementation */ > > > > > > bool > > > diff --git a/bfd/libbfd.h b/bfd/libbfd.h > > > index 1515a03b093..c80f5a86ed1 100644 > > > --- a/bfd/libbfd.h > > > +++ b/bfd/libbfd.h > > > @@ -911,6 +911,9 @@ extern void _bfd_munmap_readonly_untracked > > > #define _bfd_munmap_readonly_untracked(ptr, rsize) free (ptr) > > > #endif > > > > > > +extern bool _bfd_mmap_read_untracked > > > + (void **, size_t *, void **, bfd *) ATTRIBUTE_HIDDEN; > > > + > > > static inline void * > > > _bfd_malloc_and_read (bfd *abfd, bfd_size_type asize, bfd_size_type = rsize) > > > { > > > -- > > > 2.44.0 > > > > OK with these fixed. > > Here is the v5 patch: > > https://patchwork.sourceware.org/project/binutils/list/?series=3D31726 > > I will check it in tomorrow. > Here is the v6 patch: https://patchwork.sourceware.org/project/binutils/list/?series=3D31752 with the fix for: https://sourceware.org/bugzilla/show_bug.cgi?id=3D31466 --=20 H.J.