public inbox for binutils@sourceware.org
* bfd: use less memory in string merging
From: Michael Matz @ 2023-11-07 16:51 UTC
  To: binutils

The offset-to-entry mappings are allocated in blocks, which becomes
wasteful when there are extremely many small input files or sections.
This made a large project (Qt5WebEngine) fail to build on 32-bit x86
due to address space limits.  It barely fit into the address space
before the new string merging, and was then pushed over the limit by
this.

So instead of leaving the waste in place, reallocate the maps to their
final size once it is known.  Now the link barely fits again.
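
To give a feel for the amount of slack involved, here is a rough
sketch of the arithmetic.  It assumes that the capacity of each map
array is always a whole number of 2048-element blocks (the block size
named in the patch comment); the helper names and the rounding model
are illustrative assumptions, not code from merge.c.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_ELEMS 2048  /* block size named in the patch comment */

/* Elements allocated for USED slots, assuming (hypothetically) that
   capacity is rounded up to a whole number of blocks.  */
static size_t
block_capacity (size_t used)
{
  return (used + BLOCK_ELEMS - 1) / BLOCK_ELEMS * BLOCK_ELEMS;
}

/* Bytes given back by trimming one array to USED slots of ELEM_SIZE
   bytes each.  USED includes the extra trailing slot, so it is at
   least 1.  */
static size_t
trimmed_bytes (size_t used, size_t elem_size)
{
  assert (used >= 1);
  return (block_capacity (used) - used) * elem_size;
}
```

Under these assumptions, a section that needs only 4 slots of 4 bytes
each gives back trimmed_bytes (4, 4) = 8176 bytes per array, which adds
up quickly over very many small sections.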

bfd/
    * merge.c (record_section): Reallocate offset maps to their
    final size.
---

regtested on many targets, okay for master?

 bfd/merge.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/bfd/merge.c b/bfd/merge.c
index 4aa2f838679..ccefb707c47 100644
--- a/bfd/merge.c
+++ b/bfd/merge.c
@@ -767,6 +767,18 @@ record_section (struct sec_merge_info *sinfo,
 
   free (contents);
   contents = NULL;
+
+  /* We allocate the ofsmap arrays in blocks of 2048 elements.
+     In case we have very many small input files/sections,
+     this might waste large amounts of memory, so reallocate these
+     arrays here to their true size.  */
+  amt = secinfo->noffsetmap + 1;
+  secinfo->map_ofs = bfd_realloc (secinfo->map_ofs,
+				  amt * sizeof(secinfo->map_ofs[0]));
+  BFD_ASSERT (secinfo->map_ofs);
+  secinfo->map = bfd_realloc (secinfo->map, amt * sizeof(secinfo->map[0]));
+  BFD_ASSERT (secinfo->map);
+
   /*printf ("ZZZ %s:%s %u entries\n", sec->owner->filename, sec->name,
 	  (unsigned)secinfo->noffsetmap);*/
 
-- 
2.42.0


* Re: bfd: use less memory in string merging
From: Jan Beulich @ 2023-11-08  7:41 UTC
  To: Michael Matz; +Cc: binutils

On 07.11.2023 17:51, Michael Matz wrote:
> --- a/bfd/merge.c
> +++ b/bfd/merge.c
> @@ -767,6 +767,18 @@ record_section (struct sec_merge_info *sinfo,
>  
>    free (contents);
>    contents = NULL;
> +
> +  /* We allocate the ofsmap arrays in blocks of 2048 elements.
> +     In case we have very many small input files/sections,
> +     this might waste large amounts of memory, so reallocate these
> +     arrays here to their true size.  */
> +  amt = secinfo->noffsetmap + 1;
> +  secinfo->map_ofs = bfd_realloc (secinfo->map_ofs,
> +				  amt * sizeof(secinfo->map_ofs[0]));
> +  BFD_ASSERT (secinfo->map_ofs);
> +  secinfo->map = bfd_realloc (secinfo->map, amt * sizeof(secinfo->map[0]));
> +  BFD_ASSERT (secinfo->map);

Re-use of the same block when shrinking isn't guaranteed, so depending
on the underlying allocator this may actually add memory pressure (and
the allocations may therefore also fail). I think it would be nice to
be independent of such an implementation detail of the underlying
library. (It may also be worthwhile then to shrink the larger of the
two arrays first. Otoh the comment ahead of mapofs_type already
indicates that this type may need widening at some point.)

Jan


* Re: bfd: use less memory in string merging
From: Michael Matz @ 2023-11-08 13:39 UTC
  To: Jan Beulich; +Cc: binutils

Hello,

On Wed, 8 Nov 2023, Jan Beulich wrote:

> > +  /* We allocate the ofsmap arrays in blocks of 2048 elements.
> > +     In case we have very many small input files/sections,
> > +     this might waste large amounts of memory, so reallocate these
> > +     arrays here to their true size.  */
> > +  amt = secinfo->noffsetmap + 1;
> > +  secinfo->map_ofs = bfd_realloc (secinfo->map_ofs,
> > +				  amt * sizeof(secinfo->map_ofs[0]));
> > +  BFD_ASSERT (secinfo->map_ofs);
> > +  secinfo->map = bfd_realloc (secinfo->map, amt * sizeof(secinfo->map[0]));
> > +  BFD_ASSERT (secinfo->map);
> 
> Re-use of the same block when shrinking isn't guaranteed, so depending
> on the underlying allocator this may actually add memory pressure (and
> the allocations may therefore also fail).

That's true, strictly speaking, but when this doesn't average out over
thousands of blocks then it's such a low quality malloc(3) that it will
have many other problems as well.  Certainly with the cases that led me
to the above (linking running nearly out of 32-bit address space).  So I
had that worry as well and rejected it.

> I think it would be nice to be independent of such an implementation 
> detail of the underlying library.

Yes.  But do you mean it as pre-requisite for the patch?  In that case 
I'll try something about a bucket allocator for the offsetmap blocks, 
though I think it's a bit on the extreme to work around lousy mallocs in 
current times.

> (It may also be worthwhile then to shrink the larger of the
> two arrays first. Otoh the comment ahead of mapofs_type already
> indicates that this type may need widening at some point.)

That is true nevertheless, so consider this changed in the patch.


Ciao,
Michael.


* Re: bfd: use less memory in string merging
From: Jan Beulich @ 2023-11-09  7:59 UTC
  To: Michael Matz; +Cc: binutils

On 08.11.2023 14:39, Michael Matz wrote:
> Hello,
> 
> On Wed, 8 Nov 2023, Jan Beulich wrote:
> 
>>> +  /* We allocate the ofsmap arrays in blocks of 2048 elements.
>>> +     In case we have very many small input files/sections,
>>> +     this might waste large amounts of memory, so reallocate these
>>> +     arrays here to their true size.  */
>>> +  amt = secinfo->noffsetmap + 1;
>>> +  secinfo->map_ofs = bfd_realloc (secinfo->map_ofs,
>>> +				  amt * sizeof(secinfo->map_ofs[0]));
>>> +  BFD_ASSERT (secinfo->map_ofs);
>>> +  secinfo->map = bfd_realloc (secinfo->map, amt * sizeof(secinfo->map[0]));
>>> +  BFD_ASSERT (secinfo->map);
>>
>> Re-use of the same block when shrinking isn't guaranteed, so depending
>> on the underlying allocator this may actually add memory pressure (and
>> the allocations may therefore also fail).
> 
> That's true, strictly speaking, but when this doesn't average out over
> thousands of blocks then it's such a low quality malloc(3) that it will
> have many other problems as well.  Certainly with the cases that led me
> to the above (linking running nearly out of 32-bit address space).  So I
> had that worry as well and rejected it.
> 
>> I think it would be nice to be independent of such an implementation 
>> detail of the underlying library.
> 
> Yes.  But do you mean it as pre-requisite for the patch?  In that case 
> I'll try something about a bucket allocator for the offsetmap blocks, 
> though I think it's a bit on the extreme to work around lousy mallocs in 
> current times.

I definitely wouldn't go as far as asking for such a rework. What I'd
like to see though is that the realloc() return values be latched into
a local and that, instead of failing when the function returns NULL,
the old pointers in the struct simply not be updated. Preferably with
that adjustment, okay to put in.
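
A minimal sketch of the pattern being asked for, written against plain
realloc() so that it stands alone.  The helper name shrink_or_keep is
hypothetical and not part of BFD; the real patch would call bfd_realloc
and assign to secinfo->map_ofs and secinfo->map.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: try to resize OLD to COUNT elements of
   ELEM_SIZE bytes each.  The realloc result is latched into a local;
   if it is NULL, the original block is still valid and is returned
   unchanged, so the caller never loses its pointer.  */
static void *
shrink_or_keep (void *old, size_t count, size_t elem_size)
{
  void *tmp;

  assert (elem_size != 0);
  /* A zero-sized request may free the block, and an overflowing size
     is meaningless, so keep the old block in both cases.  */
  if (count == 0 || count > SIZE_MAX / elem_size)
    return old;
  tmp = realloc (old, count * elem_size);
  return tmp != NULL ? tmp : old;
}
```

A caller would then write something like
p = shrink_or_keep (p, n, sizeof (p[0])); a failed shrink only means
the memory saving is skipped, not that the link fails.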

Jan




* Re: bfd: use less memory in string merging
From: Michael Matz @ 2023-11-09 16:45 UTC
  To: Jan Beulich; +Cc: binutils

Hey,

On Thu, 9 Nov 2023, Jan Beulich wrote:

> >> I think it would be nice to be independent of such an implementation 
> >> detail of the underlying library.
> > 
> > Yes.  But do you mean it as pre-requisite for the patch?  In that case 
> > I'll try something about a bucket allocator for the offsetmap blocks, 
> > though I think it's a bit on the extreme to work around lousy mallocs in 
> > current times.
> 
> I definitely wouldn't go as far as asking for such a rework. What I'd
> like to see though is that the realloc() return values be latched into
> a local and that, instead of failing when the function returns NULL,
> the old pointers in the struct simply not be updated. Preferably with
> that adjustment, okay to put in.

Oh, that makes sense, yes.  (The contract of the bfd_realloc wrapper is a
bit unhelpful here: it will invariably set bfd_error_no_memory when
returning NULL.  But I still agree with you that it's nicer not to
overwrite the existing pointer when realloc didn't work.)


Ciao,
Michael.

