public inbox for libc-help@sourceware.org
* What are the advantages and disadvantages if we always use mmap to allocate memory instead of malloc ?
From: 孙世龙 sunshilong @ 2020-08-12  3:43 UTC
  To: 孙世龙 sunshilong via Libc-help

Hi, list

What are the advantages and disadvantages of always using mmap to
allocate memory instead of malloc?

For instance, TLSF uses such a method to allocate memory.
Could somebody shed some light on this matter?
Thank you for your attention to this matter.

Here is the related code snippet:

#if USE_SBRK || USE_MMAP
static __inline__ void *get_new_area(size_t * size)
{
    void *area;

#if USE_SBRK
    area = (void *)sbrk(0);
    if (((void *)sbrk(*size)) != ((void *) -1))
       return area;
#endif

#ifndef MAP_ANONYMOUS
/* https://dev.openwrt.org/ticket/322 */
# define MAP_ANONYMOUS MAP_ANON
#endif


#if USE_MMAP
    *size = ROUNDUP(*size, getpagesize());
    area = __STD(mmap(0, *size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
    if (area != MAP_FAILED)
       return area;
#endif
    return ((void *) ~0);
}
#endif


* Re: What are the advantages and disadvantages if we always use mmap to allocate memory instead of malloc ?
From: Florian Weimer @ 2020-08-12  7:58 UTC
  To: 孙世龙 sunshilong via Libc-help

* 孙世龙 sunshilong via Libc-help:

> What are the advantages and disadvantages if we always use mmap to
> allocate memory instead of malloc.

On Linux, the advantage of using mmap is that the kernel might find
usable address space in more cases.  There is also no problem when
multiple allocators each call mmap independently.  With sbrk, that
would be a problem, because every caller moves the same program break.
On the other hand, as long as address space is available, sbrk
maintains a single memory mapping, which can be more efficient.
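
A minimal sketch of the "multiple allocators each call mmap" point,
assuming Linux. The helper names (region_map, region_unmap,
independent_regions_ok) are hypothetical and only serve the
illustration: each caller gets its own mapping, so neither has to
know about the other.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: obtain `len` bytes of zero-filled anonymous memory
 * directly from the kernel.  Returns NULL on failure. */
static void *region_map(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hand a region obtained from region_map() back to the kernel. */
static int region_unmap(void *p, size_t len)
{
    return munmap(p, len);
}

/* Two "allocators" that know nothing about each other: each maps its own
 * region and fills it with its own pattern.  Returns 1 if the regions are
 * distinct and each kept its own contents, 0 otherwise. */
static int independent_regions_ok(size_t len)
{
    unsigned char *a = region_map(len);
    unsigned char *b = region_map(len);
    int ok = 0;

    if (a != NULL && b != NULL) {
        memset(a, 0xAA, len);
        memset(b, 0x55, len);
        ok = (a != b) && a[0] == 0xAA && a[len - 1] == 0xAA
                      && b[0] == 0x55 && b[len - 1] == 0x55;
    }
    if (a != NULL)
        region_unmap(a, len);
    if (b != NULL)
        region_unmap(b, len);
    return ok;
}
```

With sbrk, the two callers would instead be growing one shared program
break, so each would have to cope with the other moving it.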


* Re: What are the advantages and disadvantages if we always use mmap to allocate memory instead of malloc ?
From: tomas @ 2020-08-12  8:41 UTC
  To: Florian Weimer; +Cc: 孙世龙 sunshilong via Libc-help

On Wed, Aug 12, 2020 at 09:58:10AM +0200, Florian Weimer wrote:
> * 孙世龙 sunshilong via Libc-help:
> 
> > What are the advantages and disadvantages if we always use mmap to
> > allocate memory instead of malloc.
> 
> On Linux, the advantage of using mmap is that the kernel might find
> usable address space in more cases.  There is also no problem when
> multiple allocators each call mmap independently.  With sbrk, that
> would be a problem.  As long as address space is available, sbrk
> maintains a single memory mapping, which can be more efficient.

Talking from the peanut gallery: what Florian is saying is that mmap
can replace sbrk -- not malloc. What mmap and sbrk are good for
is to make available *big* chunks of memory, which are dealt out
in small chunks by some library via malloc and friends.

I had the impression this wasn't clear.

Cheers
 - t


* Re: What are the advantages and disadvantages if we always use mmap to allocate memory instead of malloc ?
From: 孙世龙 sunshilong @ 2020-08-12 11:34 UTC
  To: tomas; +Cc: Florian Weimer, 孙世龙 sunshilong via Libc-help

> What mmap and sbrk are good for
> is to make available *big* chunks of memory, which are dealt out
> in small chunks by some library via malloc and friends.

The Linux Programmer's Manual says:
For allocations greater than or equal to the limit specified (in bytes)
by M_MMAP_THRESHOLD that can't be satisfied from the free list,
the memory-allocation functions employ mmap(2) instead of
increasing the program break using sbrk(2).

From this I conclude that malloc(3) may invoke mmap(2) when
allocating a huge block of memory.
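
For reference, a small sketch of how that threshold can be adjusted,
assuming glibc (mallopt and M_MMAP_THRESHOLD are glibc-specific). The
wrapper name set_mmap_threshold is made up for the example:

```c
#include <malloc.h>   /* glibc-specific: mallopt(), M_MMAP_THRESHOLD */
#include <stddef.h>
#include <stdlib.h>

/* Ask glibc malloc to consider a dedicated mmap(2) mapping for requests of
 * `bytes` or more.  mallopt() returns 1 on success and 0 on error. */
static int set_mmap_threshold(size_t bytes)
{
    return mallopt(M_MMAP_THRESHOLD, (int)bytes);
}
```

After set_mmap_threshold(64 * 1024), a call such as malloc(1 << 20) is a
candidate for a mapping of its own, while small requests keep coming from
the heap. Per mallopt(3), setting the value explicitly also turns off
glibc's dynamic adjustment of the threshold.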

So I think there may be something wrong with your conclusion (i.e.,
that mmap calls malloc to allocate a big block).

What do you think about it? Am I missing something?
Thank you for your attention to this matter.


* Re: What are the advantages and disadvantages if we always use mmap to allocate memory instead of malloc ?
From: Siddhesh Poyarekar @ 2020-08-12 13:59 UTC
  To: 孙世龙 sunshilong
  Cc: tomas, 孙世龙 sunshilong via Libc-help, Florian Weimer

On Wed, 12 Aug 2020 at 17:05, 孙世龙 sunshilong via Libc-help
<libc-help@sourceware.org> wrote:
> As per the Linux Programmer Manual, which says:
> For allocations greater than or equal to the limit specified (in bytes)
> by M_MMAP_THRESHOLD that can't be satisfied from the free list,
> the memory-allocation functions employ mmap(2) instead of
> increasing the program break using sbrk(2).
>
> I can draw the conclusion that malloc(3) may invoke mmap(2) when
> allocating a huge block of memory.

That description is many years old.  It is kinda the same in
principle, i.e. allocation from the heap vs allocation of a single
block of memory using mmap can be controlled by M_MMAP_THRESHOLD, but
the definition of the "heap" has changed in glibc over the years.

The "heap" in question is called an arena in glibc parlance and it can
be expanded or contracted either by brk() or by mmap() depending on
various factors.  The key point here though is not which syscall is
used for allocation, but that a syscall is used at all.

> So I think there may be something wrong with your conclusion(i.e.
> mmap calls malloc to allocate a big block).

I think he said the opposite, i.e. malloc calls mmap to allocate a
single big block.

> What do you think about it? Am I missing something?

I think the point you're missing is that the key performance
benefit of using malloc is the possibility of getting allocations
without having to go through a syscall at all.  System calls are quite
expensive because they involve a context switch into the kernel and
back, so malloc economizes those calls by requesting large blocks of
memory at once and then giving out smaller blocks from them whenever a
user calls malloc.  For very large requests (> 128K by default), it is
generally more efficient to serve each one with a dedicated mmap'd
block of its own instead of giving it out from the cached blocks.
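
To make the economizing concrete, here is a toy sketch, assuming Linux.
It is not how glibc implements malloc: toy_pool, toy_alloc, POOL_CHUNK
and BIG_THRESHOLD are invented for the illustration, and nothing is ever
freed. It only counts how often the kernel is asked for memory.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define POOL_CHUNK    ((size_t)1 << 20)     /* assumed refill size: 1 MiB */
#define BIG_THRESHOLD ((size_t)128 * 1024)  /* mirrors the 128K figure */

typedef struct {
    char *cur;                      /* next free byte in the cached chunk */
    size_t left;                    /* bytes still free in that chunk */
    unsigned long kernel_requests;  /* how many times mmap() was called */
} toy_pool;

/* One trip into the kernel; counted so the effect can be observed. */
static void *kernel_map(toy_pool *p, size_t len)
{
    void *m = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return NULL;
    p->kernel_requests++;
    return m;
}

/* Small requests are carved out of a cached chunk; requests above the
 * threshold get a dedicated mapping rounded up to whole pages. */
static void *toy_alloc(toy_pool *p, size_t n)
{
    void *out;

    if (n == 0)
        n = 1;
    if (n > BIG_THRESHOLD) {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        return kernel_map(p, (n + page - 1) / page * page);
    }
    n = (n + 15) & ~(size_t)15;     /* keep results 16-byte aligned */
    if (n > p->left) {
        /* Refill; the unused tail of the old chunk is simply abandoned. */
        char *chunk = kernel_map(p, POOL_CHUNK);
        if (chunk == NULL)
            return NULL;
        p->cur = chunk;
        p->left = POOL_CHUNK;
    }
    out = p->cur;
    p->cur += n;
    p->left -= n;
    return out;
}
```

Starting from a zero-initialized toy_pool, ten thousand 16-byte requests
fit in the first chunk and cost a single mmap call, whereas calling mmap
per request would cost ten thousand.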

When you replace malloc with mmap, it would mean that *all* your
allocations will be served by mmap, meaning that every single
allocation call would result in a system call, thus potentially
reducing performance.  Further, even if you request 16 bytes from the
kernel using mmap, you'll get a minimum of a page (i.e. 4K or 64K
depending on the architecture), thus wasting the rest of the
allocation.
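
The page-granularity point can be checked with a short sketch, assuming
Linux; mapped_bytes_for and whole_page_is_usable are hypothetical names
chosen for the example:

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Bytes actually mapped for a request of `n` bytes: the request rounded
 * up to a whole number of pages. */
static size_t mapped_bytes_for(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (n + page - 1) / page * page;
}

/* Map `n` bytes (n > 0) and write to the very last byte of the final page,
 * which lies beyond the `n` bytes asked for but inside the mapping.
 * Returns 1 if the write and read-back work, 0 if the mapping failed. */
static int whole_page_is_usable(size_t n)
{
    size_t total = mapped_bytes_for(n);
    unsigned char *p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int ok;

    if (p == MAP_FAILED)
        return 0;
    p[total - 1] = 1;
    ok = (p[total - 1] == 1);
    munmap(p, total);
    return ok;
}
```

So mapped_bytes_for(16) equals the page size: a 16-byte request costs a
full page, and the rest of it goes unused unless something hands it out.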

The example application you cited could be a case where the program is
doing its own memory management, i.e. requesting large blocks from the
kernel and then handing out small blocks from it to different parts of
the program.

Siddhesh
-- 
http://siddhesh.in


* Re: What are the advantages and disadvantages if we always use mmap to allocate memory instead of malloc ?
From: 孙世龙 sunshilong @ 2020-08-13  1:19 UTC
  To: Siddhesh Poyarekar
  Cc: tomas, 孙世龙 sunshilong via Libc-help, Florian Weimer

Thank you for the clarification.

> > So I think there may be something wrong with your conclusion(i.e.
> > mmap calls malloc to allocate a big block).
> I think he said the opposite, i.e. malloc calls mmap to allocate a
> single big block.
I agree with you.

> The example application you cited could be a case where the program is
> doing its own memory management, i.e. requesting large blocks from the
> kernel and then handing out small blocks from it to different parts of
> the program.
Yes, TLSF does what you said.

> malloc economizes those calls by requesting large blocks of
> memory at once and then giving out smaller blocks from it whenever a
> user calls malloc.
How large are the blocks of memory?
Would memory be wasted if the entire program calls malloc only once, to
allocate a single byte, and never calls it again?

Thank you for your attention to this matter.
Best Regards
Sunshilong


* Re: What are the advantages and disadvantages if we always use mmap to allocate memory instead of malloc ?
From: Siddhesh Poyarekar @ 2020-08-13  8:42 UTC
  To: 孙世龙 sunshilong
  Cc: tomas, 孙世龙 sunshilong via Libc-help, Florian Weimer

On Thu, 13 Aug 2020 at 06:50, 孙世龙 sunshilong <sunshilong369@gmail.com> wrote:
> How large are the blocks of memory?
> Will it cause a waste of memory if the entire program calls malloc to allocate
> a total of 1 byte (and never calls it again)?

On the larger side***, the allocations will be of the order of 64MB,
but there's a catch.  The allocations do not have any physical pages
backing them, so while they reserve address space, they don't
consume any real memory beyond what the user requests: the 1 byte, or
rather ~32 bytes once alignment requirements and malloc's internal
metadata are counted, or rather the page containing those 32 bytes.
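
The lazy backing can be observed with plain mmap, as a rough sketch
assuming Linux. This does not exercise glibc's arena code; it only shows
the same kernel behaviour on a 64MB anonymous mapping. The helper names
resident_pages and touch_one_byte are made up, and the exact counts
depend on the system (transparent huge pages, for instance, can make
more than one page resident after a single write).

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS and mincore() */
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the pages of [addr, addr + len) that mincore(2) reports as
 * resident in memory.  `addr` must be page aligned.  Returns -1 on error. */
static long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    unsigned char *vec = malloc(npages);
    long count = 0;
    size_t i;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (i = 0; i < npages; i++)
        count += vec[i] & 1;        /* low bit set: page is resident */
    free(vec);
    return count;
}

/* Reserve `len` bytes of address space, write a single byte, and report
 * the resident page count before and after.  Returns 0 on success. */
static int touch_one_byte(size_t len, long *before, long *after)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    *before = resident_pages(p, len);
    p[0] = 1;                       /* first write faults a page in */
    *after = resident_pages(p, len);
    munmap(p, len);
    return (*before < 0 || *after < 0) ? -1 : 0;
}
```

Calling touch_one_byte((size_t)64 << 20, &before, &after) on a typical
system should leave `after` far below the 16384 pages (at 4K each) that
the mapping spans.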

Siddhesh

*** The larger side is typically for non-main arenas.  The main arena
is allocated using brk() and can be grown a page at a time.

-- 
http://siddhesh.in
