public inbox for libc-alpha@sourceware.org
From: Patrick McGehearty <patrick.mcgehearty@oracle.com>
To: DJ Delorie <dj@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v2] Remove upper limit on tunable MALLOC_MMAP_THRESHOLD
Date: Wed, 24 Nov 2021 18:52:06 -0600
Message-ID: <4a54b7de-bf41-6550-0e9a-e7a17e9c3b8c@oracle.com>
In-Reply-To: <xnr1bo6e4o.fsf@greed.delorie.com>

After studying the main malloc code paths, I believe I can
summarize the malloc behavior when the request is below
MALLOC_MMAP_THRESHOLD and above HEAP_MAX_SIZE as follows
(omitting much code for failure paths, etc.):

malloc tries to find an arena with enough memory.
If it succeeds, it allocates from that arena and returns.
If no arena is usable, call sysmalloc to get more system memory.
[We are concerned with large allocations where no arena
has enough memory, so we continue with:]

sysmalloc does the following:

If there is no usable arena or (the request is at or above the mmap
threshold and we have not hit the limit on the number of mmapped
regions), then we "try_mmap".
[But we are asking about the case where we exceed HEAP_MAX_SIZE
but not mmap_threshold, meaning we will bypass try_mmap here.]

If try_mmap did not return a chunk and sysmalloc was called without an
arena, then we return 0 (i.e. fail).
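
A minimal sketch of that first test, written as a standalone model rather
than as the glibc source. The struct and function names are hypothetical
stand-ins for the internal malloc parameters, and the sizes in the usage
note assume a 64 MiB HEAP_MAX_SIZE, which is an assumption about a typical
64-bit build.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the malloc parameters that the first test in
   sysmalloc consults. These are not the real glibc declarations. */
struct mmap_params {
    size_t mmap_threshold;  /* requests at or above this prefer mmap */
    int    n_mmaps;         /* mmapped regions currently in use */
    int    n_mmaps_max;     /* limit on the number of mmapped regions */
};

/* Returns true when a request of nb bytes reaches try_mmap through the
   initial test: either there is no usable arena, or the request is at or
   above the threshold and the region limit has not been reached. */
bool enters_try_mmap(bool have_arena, size_t nb, const struct mmap_params *mp)
{
    return !have_arena
           || (nb >= mp->mmap_threshold && mp->n_mmaps < mp->n_mmaps_max);
}
```

With a threshold of 1 GiB and a 128 MiB request from a thread that has an
arena, enters_try_mmap() returns false, which is the case of interest here:
the request is above the assumed HEAP_MAX_SIZE but is not sent to mmap by
this first test.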

If the above cases do not apply, then we have two branches, one for
a non-main arena and one for the main arena.

[I believe the following is the case the reviewer was concerned about.]
If the arena is not the main arena (I'm guessing this case only
applies to multi-threaded apps), then we either try to grow_heap (but
not this time because our request is too large) or we call
new_heap. The new_heap call will fail because our request exceeds
HEAP_MAX_SIZE. At this point, if we have not yet tried mmap, we jump
back to try_mmap.

The try_mmap: label bypasses the test for mmap_threshold as the label
is below the if clause. As far as I can tell from reading the code,
following this path, the mmap call succeeds and a single chunk is
allocated to fulfill the request.

[This case is where we agree that all is as expected.]
If the arena is the main arena (i.e. single-threaded app), then
we call MORECORE (usually sbrk) to extend the arena region as desired.
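
The whole path described above can be sketched as a small control-flow
model. This is not the glibc code: the struct fields, the enum, and the
function name are all hypothetical, and the grow_heap and new_heap checks
are reduced to simple size comparisons. What it keeps from the real
structure is the try_mmap label sitting inside the if body, so that a goto
to it skips the threshold test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Outcomes of the modeled path. Illustrative names only. */
enum sys_path { SYS_FAIL, SYS_MMAP, SYS_GROW_HEAP, SYS_NEW_HEAP, SYS_MORECORE };

/* Hypothetical inputs standing in for glibc's internal state. */
struct model {
    bool   have_arena;      /* false models sysmalloc called without an arena */
    bool   is_main_arena;
    size_t mmap_threshold;
    size_t heap_max_size;   /* stands in for HEAP_MAX_SIZE */
    size_t heap_room;       /* bytes the current heap could still grow by */
    bool   mmap_allowed;    /* mmapped region count is under its limit */
    bool   mmap_succeeds;   /* whether the mmap call itself would succeed */
};

enum sys_path model_sysmalloc(const struct model *m, size_t nb)
{
    bool tried_mmap = false;

    if (!m->have_arena || (nb >= m->mmap_threshold && m->mmap_allowed)) {
    try_mmap:
        /* Reached through the test above, or through the goto below,
           which bypasses that test. */
        tried_mmap = true;
        if (m->mmap_succeeds)
            return SYS_MMAP;
    }

    if (!m->have_arena)
        return SYS_FAIL;            /* no arena and mmap did not succeed */

    if (!m->is_main_arena) {
        if (nb <= m->heap_room)
            return SYS_GROW_HEAP;   /* the current heap can be extended */
        if (nb <= m->heap_max_size)
            return SYS_NEW_HEAP;    /* a fresh heap can hold the request */
        if (!tried_mmap)
            goto try_mmap;          /* too large for any heap: fall back */
        return SYS_FAIL;
    }

    return SYS_MORECORE;            /* main arena: extend via MORECORE */
}
```

For a 128 MiB request with a 1 GiB threshold, a 64 MiB heap_max_size, and a
non-main arena, the model returns SYS_MMAP by way of the goto, matching the
reading above. With is_main_arena set, the same request returns
SYS_MORECORE.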

- patrick


On 11/9/2021 6:36 PM, DJ Delorie wrote:
> Patrick McGehearty <patrick.mcgehearty@oracle.com> writes:
>> If a chunk smaller than the mmap_threshold is requested,
>> then MORECORE [typically sbrk()] is called and HEAP_MAX
>> is not considered by the malloc code.  Heaps are only used
>> for mmap()ed allocations, not sbrk()'ed allocations,
>> so far as I can tell in reading the code.
> There are two types of arenas: the sbrk-based arena (limited in size by
> ulimit), and zero or more mmap-based arenas (limited by HEAP_MAX).  The
> sbrk-based one is used when the program is single threaded; the
> mmap-based ones are used when the program is multi-threaded.  Your
> original email made it sound like you were concerned with the
> multi-threaded case, where the mmap-based heaps are used.
>
> In either case, a malloc() request may be satisfied by pulling a free
> chunk out of either type of arena (possibly growing the arena if needed
> and possible), or by calling mmap() directly to satisfy that one
> request.
>
> I would think mmap_threshold should still apply if you're using the
> mmap'd heaps, so you can reserve the heaps for smaller chunks, but that
> is meaningless if mmap_threshold is larger than the heap size.  I could
> not find an obvious place in the code where mmap_threshold is used to
> bypass the mmap'd heaps, though.
>
> So while I have no problems with allowing larger mmap_threshold settings
> for the sbrk-based arena, I still wonder what happens to requests that
> go through an mmap-based arena that are larger than HEAP_MAX but still
> under the mmap_threshold.
>
> Of course, I've spent more time typing this response than it would take
> to write a test program and see what happens ;-)
>
>> It might be desirable to also allow HEAP_MAX to be set by
>> the user before the first call to malloc, but I see that
>> as a separate task.
> Our current implementation requires that the heap size be a compile-time
> constant, but... yeah.
>

