public inbox for glibc-bugs@sourceware.org
* [Bug libc/167] New: malloc() eats excess ram
From: bluefoxicy at linux dot net @ 2004-05-16 19:46 UTC (permalink / raw)
  To: glibc-bugs

This isn't really a 'bug', just a restriction due to the nature of the heap. 
The issue is that when using malloc(), memory fragmentation holds the heap in an
overly large state in many cases, which results in greater ram/swap usage and
excess swapping.  This can be fixed on Linux by using mmap() with private,
anonymous, read/write memory rather than brk().

The basic function of malloc() with brk(), to the best of my understanding, is
to grow and shrink the heap and manage the partitioning of the memory within. 
When there is not a large enough chunk of contiguous, free memory on the heap to
satisfy a malloc(), a brk() is called to get enough for the request to be satisfied:

* -- Used ram
- -- unused, but allocated ram
n -- Newly allocated ram in this example
N -- Used, newly allocated ram in this example

[*-**--*] malloc(3) ->
[*-**--*NNN]
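
The growth step in the diagram can be sketched in C roughly as follows.  This
is only an illustration, assuming Linux with glibc; it uses sbrk(), the libc
wrapper around brk(), and the helper names heap_grow() and heap_shrink() are
made up here and are not part of glibc's malloc.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* expose sbrk() in glibc's <unistd.h> */
#endif
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Move the program break up by `bytes`.  Returns the old break, which is
 * the start of the newly added region, or NULL if the kernel refuses. */
static void *heap_grow(size_t bytes)
{
    void *old_break = sbrk((intptr_t)bytes);
    return old_break == (void *)-1 ? NULL : old_break;
}

/* Move the program break down by `bytes`.  Only the very top of the heap
 * can be handed back this way; holes in the middle stay allocated. */
static int heap_shrink(size_t bytes)
{
    return sbrk(-(intptr_t)bytes) == (void *)-1 ? -1 : 0;
}
```

The second helper is the important one for this report: there is no brk()
operation that releases a hole below the top of the heap.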

This raises one problem that can be manifested in two ways.  The less extreme
problem is seen above:  There is unused, but allocated ram.  The more extreme
problem is best illustrated with a new example.

Let's say that your task is a file manager.  It indexes and displays many files,
sometimes thousands at once.  Let's say that in one particular directory, you
bring up 10,000 files, and rendering their thumbnail previews eats a grand total
of 500M of ram.

[*--*-] malloc() ->
[*****NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN]

(not to scale)

Now let's say that after this, you allocate one more piece of ram.  Let's say
that you're using Konqueror or Nautilus, so this could be for a web browser
activity or a new icon display on the desktop, or just one permanent allocation.


[**********************************************************
 **********************************************************
 **********************************************************
 **********************************************************] malloc() ->
[**********************************************************
 **********************************************************
 **********************************************************
 **********************************************************
 N]

Now, let's say that the directory is left.  All that ram is freed.  What's the
result?

[**********************************************************
 **********************************************************
 **********************************************************
 **********************************************************
 *] free() ->


[*--*-*----------------------------------------------------
 ----------------------------------------------------------
 ----------------------------------------------------------
 ----------------------------------------------------------
 *]

In the case of things like Nautilus, you could keep this large chunk of unused
memory allocated for days (nautilus runs as the desktop too), and still only
ever use a fraction of it again.  Even if it gets swapped out, it still uses
space in swap that could be used for other things; it becomes a target for the
OOM killer; and it causes an unnecessary disk write on swap out and an
unnecessary disk read on swap in.  Freeing the memory and reallocating it later
would be many times faster than a disk seek, unless the reallocation itself
triggers a swapping operation, which does not happen every time.

This may explain why many applications, especially X applications, eat more ram
than they're supposed to, and always seem to have memory leaks.

I propose that the Linux implementation of malloc() call mmap() for anonymous,
read-write pages instead of brk().  This would be functionally the same as the
current malloc(), except it would not use the heap and thus would not have the
above issues.

A small bit of code I made a while back just to play with mmap() is attached.
With a little modification, you can add it to your LD_PRELOAD list and drop it
in as a replacement for malloc().  It will need to be changed to have the proper
function names (instead of mmalloc(), mcalloc(), etc.) and to set errno on
errors.  I haven't gone over the code in a long time, so it may be a bit wonky.  :)
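
For readers without the attachment, here is a rough sketch of what an
mmap()-backed allocator with these names could look like.  It is not the
attached code: the one-mapping-per-allocation layout, the 16-byte header, and
the mfree() name are assumptions made for illustration, and it assumes Linux
with MAP_ANONYMOUS available.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* expose MAP_ANONYMOUS in <sys/mman.h> */
#endif
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Stored at the start of every mapping so mfree() knows how much to unmap. */
struct map_header {
    size_t mapped_len;
};

/* Reserve 16 bytes for the header so the returned pointer stays 16-byte aligned. */
#define HEADER_SIZE 16

void *mmalloc(size_t size)
{
    if (size > SIZE_MAX - HEADER_SIZE) {
        errno = ENOMEM;
        return NULL;
    }
    size_t len = size + HEADER_SIZE;
    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;             /* errno was already set by mmap() */
    ((struct map_header *)base)->mapped_len = len;
    return (char *)base + HEADER_SIZE;
}

void *mcalloc(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size) {
        errno = ENOMEM;          /* count * size would overflow */
        return NULL;
    }
    return mmalloc(count * size); /* fresh anonymous pages are zero-filled */
}

void mfree(void *ptr)
{
    if (ptr == NULL)
        return;
    struct map_header *hdr = (struct map_header *)((char *)ptr - HEADER_SIZE);
    munmap(hdr, hdr->mapped_len); /* the pages go straight back to the system */
}
```

Every allocation here costs at least one page, so this layout only shows the
system calls involved; it is not a practical general-purpose allocator.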

-- 
           Summary: malloc() eats excess ram
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: gotom at debian dot or dot jp
        ReportedBy: bluefoxicy at linux dot net
                CC: glibc-bugs at sources dot redhat dot com


http://sources.redhat.com/bugzilla/show_bug.cgi?id=167

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


* [Bug libc/167] malloc() eats excess ram
From: bluefoxicy at linux dot net @ 2004-05-16 19:48 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From bluefoxicy at linux dot net  2004-05-16 19:48 -------
Created an attachment (id=81)
 --> (http://sources.redhat.com/bugzilla/attachment.cgi?id=81&action=view)
mmalloc.tar -- mmap() malloc, needs work to become a drop-in testcase

This needs a bit of work, but it's a start for an LD_PRELOAD testcase.  Change
the code to be a viable drop-in and then test it.

* [Bug libc/167] malloc() eats excess ram
From: bluefoxicy at linux dot net @ 2004-05-16 20:37 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From bluefoxicy at linux dot net  2004-05-16 20:37 -------
Created an attachment (id=82)
 --> (http://sources.redhat.com/bugzilla/attachment.cgi?id=82&action=view)
msztest.c -- Testcase to show how to keep the heap really huge

Should upload this too.  This shows an example, albeit a controlled example, of
a program allocating a huge heap, then holding it for the sake of 100 bytes of
memory.  If it were mmap() rather than current malloc(), it would free the 100M
back to the system.

Run top, then run this in a terminal next to it, and watch the state of its
memory usage at every point where it sleeps.
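
The attachment itself is not reproduced in this thread.  As a rough sketch of
the same experiment (the function name, struct, and block sizes below are
assumptions, not the contents of msztest.c), the following allocates many
small blocks, pins the top of the heap with one 100-byte allocation, frees the
blocks, and samples the program break at each stage.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* expose sbrk() in glibc's <unistd.h> */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct break_samples {
    uintptr_t before;   /* program break before any allocation */
    uintptr_t peak;     /* break with every block plus the pin allocated */
    uintptr_t after;    /* break once only the 100-byte pin is still live */
};

static struct break_samples pin_heap(size_t nblocks, size_t block_size)
{
    struct break_samples s = {0, 0, 0};
    s.before = (uintptr_t)sbrk(0);

    char **blocks = malloc(nblocks * sizeof *blocks);
    if (blocks == NULL)
        return s;
    for (size_t i = 0; i < nblocks; i++) {
        blocks[i] = malloc(block_size);
        if (blocks[i] != NULL)
            memset(blocks[i], 0xAA, block_size);   /* touch the pages */
    }

    /* On a brk()-based heap this small block typically ends up above
     * everything allocated so far. */
    char *pin = malloc(100);
    s.peak = (uintptr_t)sbrk(0);

    for (size_t i = 0; i < nblocks; i++)
        free(blocks[i]);
    s.after = (uintptr_t)sbrk(0);   /* sampled while the pin is still live */

    free(pin);
    free(blocks);
    return s;
}
```

Calling pin_heap(100000, 1024) and comparing `after` with `peak` shows how much
of the heap was given back; block_size should stay below the mmap threshold so
the blocks come from the brk() heap rather than from separate mappings.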

* [Bug libc/167] malloc() eats excess ram
From: bluefoxicy at linux dot net @ 2004-05-16 20:52 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From bluefoxicy at linux dot net  2004-05-16 20:52 -------
Sorry, where I said

--
This raises one problem that can be manifested in two ways.  The less extreme
problem is seen above:  There is unused, but allocated ram.  The more extreme
problem is best illustrated with a new example.
--

I meant to say instead

--
This raises one problem that can be manifested in two ways.  The less extreme
problem is seen above:  There are intermittent segments of unused, but allocated
ram.  The more extreme problem is best illustrated with a new example, and is
simply a large scale manifestation of unused, allocated ram.
--

Sorry :)

* [Bug libc/167] malloc() eats excess ram
From: bluefoxicy at linux dot net @ 2004-05-18 21:01 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From bluefoxicy at linux dot net  2004-05-18 21:01 -------
Interesting thought.  mmap() lets you force it to fail if you can't map new ram
to a given address.  You could use this for realloc().  Other than that, you
could "try" to get mmap() to give you contiguous segments, but keep track of
them in your own table (on the heap or in a specially allocated mmap() block)
either way, so that you can effectively use whatever mmap() gives you as a
scattered heap.  The difference would be that when you have chunks in the middle
unallocated, you can free them back to the system.

This would be a crossbreed between what I understand to be the current
heap-based malloc() code in glibc/malloc/ and the current mmap() fallback in
glibc/malloc, right?  It might work excessively well.
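
One way the realloc() idea could be tried, sketched under assumptions: ask
mmap() for the pages directly after an existing mapping by passing that
address as a hint (without MAP_FIXED, so nothing already mapped is
overwritten), and treat any other result as failure.  The name
extend_in_place() is hypothetical, and old_len is assumed to be a multiple of
the page size.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* expose MAP_ANONYMOUS in <sys/mman.h> */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Try to map `extra` bytes directly after [base, base + old_len).
 * Returns 0 if the new pages are contiguous with the old ones, or -1 if
 * the kernel could not place them there (nothing stays mapped in that case). */
static int extend_in_place(void *base, size_t old_len, size_t extra)
{
    void *want = (char *)base + old_len;
    void *got = mmap(want, extra, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return -1;               /* errno was set by mmap() */
    if (got != want) {
        munmap(got, extra);      /* the hint was not honoured; undo */
        errno = ENOMEM;
        return -1;
    }
    return 0;
}
```

A realloc() built on this would fall back to allocate, copy, and free whenever
the call returns -1.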

* [Bug libc/167] malloc() eats excess ram
From: bluefoxicy at linux dot net @ 2004-05-27  5:09 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From bluefoxicy at linux dot net  2004-05-27 05:09 -------
Stuff about this can now be seen at

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=123965
http://bluefoxicy.blogspot.com/2004_05_01_bluefoxicy_archive.html#108507154064758877
http://sources.redhat.com/bugzilla/show_bug.cgi?id=167
http://sources.redhat.com/ml/libc-alpha/2004-05/msg00216.html

Here is a full pro/con outline from the current state of it.
Note that a "segment" in this context is a span of related
used or unused memory (such that, if used, calling free() on
its beginning would free it); and a mapping is a
physical->virtual page mapping.

Pros:

 - In cases where an area of unused, allocated memory touches or crosses
   a page boundary, that area can be used for any allocation, including
   those larger than itself; those areas are of infinite size.
 - In cases where an area of used memory needs to be expanded, i.e. via
   realloc(), and crosses a page boundary, that area can be partially
   expanded in place, lessening the copying overhead.  The least
   overhead is attained by finding the highest boundary and inserting
   mappings ahead of it, after mmap()ing the original pages of the
   segment to be realloc()ed elsewhere.  Once the new mapping is made,
   all of the original pages containing only data for that segment can
   be munmap()ed.
 - In cases where an allocated page exists and is spanned entirely by
   an unused area of memory, that page can be freed to the system.
 - The internals of the allocator can track allocated and freed
   segments, in two separate lists.  The free segment list could be
   arranged by size; and all segments could have pointers to the
   entries to the next and previous segments.  Thus, tracking the
   beginning and the actual size would allow quick access to the
   entire list.
 - Mappings could be easily tracked so that any free() or any
   realloc() resulting in a move could check a physical page to
   determine if it is free, and then unmap all of the mappings to
   it if so.

Cons:

 - In the normal operation of a program, many physical pages will be
   mapped to two or more areas of virtual memory.
 - Resource limits could be hit on some systems, which would be
   handled by falling back to the heap.
 - Some systems slow down after lots of mmap()ing.  Information I've
   gathered from kernel developers indicates that this would require
   several million physical->virtual mappings, at least, to be
   significant on Linux.
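
The third "pro" above (a page spanned entirely by a free segment can go back
to the system) comes down to a little address arithmetic.  The sketch below
is an illustration only: the names are hypothetical, and it uses
madvise(MADV_DONTNEED) on a private anonymous mapping instead of munmap(), so
the address range stays reserved while the pages themselves are dropped.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* expose madvise() and MADV_DONTNEED */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

struct page_span {
    uintptr_t first;   /* address of the first whole page, 0 if there is none */
    size_t count;      /* number of whole pages inside the segment */
};

/* Find the pages that lie entirely inside the free segment [start, end). */
static struct page_span whole_pages_within(uintptr_t start, uintptr_t end,
                                           size_t page_size)
{
    struct page_span span = {0, 0};
    uintptr_t first = (start + page_size - 1) / page_size * page_size; /* round up */
    uintptr_t last = end / page_size * page_size;                      /* round down */
    if (last > first) {
        span.first = first;
        span.count = (last - first) / page_size;
    }
    return span;
}

/* Drop the whole pages inside a free segment of a private anonymous mapping.
 * Returns the number of pages released, or -1 if madvise() fails. */
static long release_free_segment(void *start, void *end)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    struct page_span span =
        whole_pages_within((uintptr_t)start, (uintptr_t)end, page_size);
    if (span.count == 0)
        return 0;
    if (madvise((void *)span.first, span.count * page_size, MADV_DONTNEED) != 0)
        return -1;
    return (long)span.count;
}
```

Partial pages at either end of the segment are left alone, since they may
still hold live data from neighbouring segments.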

* [Bug libc/167] malloc() eats excess ram
From: pinskia at gcc dot gnu dot org @ 2004-05-31  2:09 UTC (permalink / raw)
  To: glibc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu dot
                   |                            |org


* [Bug libc/167] malloc() eats excess ram
From: pere at hungry dot com @ 2004-06-18  8:53 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From pere at hungry dot com  2004-06-18 08:53 -------
The glibc malloc() is based on Doug Lea's malloc(), and will use
mmap() for all requests with size above mmap_threshold.

You can change this threshold using
mallopt(M_MMAP_THRESHOLD, <new value>).  The default threshold
seems to be 128k.  You should test very carefully if you decide to
change this threshold.
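
A small sketch of that call, assuming glibc; the wrapper name
set_mmap_threshold() is made up for illustration.  mallopt() takes an int
value and returns 1 on success and 0 on error.

```c
#include <limits.h>
#include <malloc.h>
#include <stddef.h>

/* Ask glibc's malloc() to serve requests of at least `bytes` with mmap().
 * Returns 0 on success, -1 if the value is rejected. */
static int set_mmap_threshold(size_t bytes)
{
    if (bytes > INT_MAX)
        return -1;               /* mallopt() takes an int */
    return mallopt(M_MMAP_THRESHOLD, (int)bytes) == 1 ? 0 : -1;
}
```

For example, set_mmap_threshold(64 * 1024) early in main() would make a later
malloc(256 * 1024) come from its own mapping, which free() can return to the
system independently of the brk() heap.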


* [Bug libc/167] malloc() eats excess ram
From: bluefoxicy at linux dot net @ 2004-06-20 20:17 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From bluefoxicy at linux dot net  2004-06-20 20:17 -------
Petter:
  I appreciate your actual reply, but try reading.

I am not suggesting using mmap() for each allocation; I am suggesting using
mmap() to build a virtual segment of memory to use in place of the heap.  That
may allow more complex operations to be implemented which return free memory to
the system, reduce the impact of fragmentation, and possibly even increase the
speed of allocations and especially realloc().  The changes in the allocator
should be transparent to the application.

Try reading the blog, all of the text I've placed on both bugzillas, and the
message to libc-alpha.  Especially read comment 4 on this bugzilla:

    This would be a crossbreed between what I understand
    to be the current heap-based malloc() code in glibc/
    malloc and the current mmap() fallback in glibc/malloc,
    right?  It might work excessively well.

* [Bug libc/167] malloc() eats excess ram
From: drepper at redhat dot com @ 2004-08-09  7:42 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2004-08-09 07:42 -------
There is no way I'm replacing the malloc implementation with something new just
like that.  Work with Doug Lea and/or Wolfram Gloger to integrate changes into
the glibc malloc or have them endorse your implementation.  Until either happens
no code will be replaced.

Just use your implementation by adding the code to your programs or a separate
DSO.  All the malloc code in glibc can be overwritten by external code.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

