public inbox for glibc-bugs@sourceware.org
* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
@ 2011-08-27 21:45 ` heuler at infosim dot net
  2011-08-27 22:02 ` rich at testardi dot com
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: heuler at infosim dot net @ 2011-08-27 21:45 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

Marius Heuler <heuler at infosim dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |heuler at infosim dot net
         Resolution|WONTFIX                     |

--- Comment #8 from Marius Heuler <heuler at infosim dot net> 2011-08-27 21:45:04 UTC ---
We have exactly the same problem with the current implementation of malloc.

The solution suggested by Ulrich, using M_ARENA_MAX, does not work, since the
check on the number of arenas is not thread safe. In fact, the limit fails
precisely for the heavily threaded applications that need it!

Since core counts and thread usage will only keep increasing, there should be
a solution for this kind of application! If the arena limit worked as
described, we would have no problem.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
  2011-08-27 21:45 ` [Bug libc/11261] malloc uses excessive memory for multi-threaded applications heuler at infosim dot net
@ 2011-08-27 22:02 ` rich at testardi dot com
  2011-09-02  7:39 ` heuler at infosim dot net
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: rich at testardi dot com @ 2011-08-27 22:02 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

--- Comment #9 from Rich Testardi <rich at testardi dot com> 2011-08-27 22:02:03 UTC ---
Hi,

We ended up building our own memory allocator -- it's faster and more
efficient than glibc's, and it works equally fast with and without threads.

We used the "small block allocator" concept from HP-UX, where we request only
huge (32MB) allocations from the system (after setting M_MMAP_THRESHOLD
suitably small).

We then carve large *naturally aligned* 1MB blocks out of the huge allocation
(accepting 3% waste, since the allocation was only page aligned to begin
with, not naturally aligned).

And we carve each of those large blocks into small fixed-size buckets (which
are fractional powers of 2 -- like 16 bytes, 20, 24, 28, 32, 40, 48, 56, 64,
80, etc.).

Then we put the aligned addresses into a very fast hash and have a linked list
for each bucket size.

This means our allocate routine is, on average, just a lock, linked-list
remove, and unlock, and our free routine is, on average, just a hash lookup,
lock, linked-list insert, and unlock.

The trick here is that from any address being freed, you can get back to the
naturally aligned 1MB block that contains it with just a pointer mask, and from
there you can get the allocation's size as well as the head of the linked list
of free entries to which it should be returned...
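
For illustration, a minimal sketch of that pointer-mask trick (not our
actual code -- BLOCK_SIZE, block_hdr, free_node, and sba_free are made-up
names here, and the locking is elided):

#include <stdint.h>
#include <stddef.h>

#define BLOCK_SIZE (1u << 20)        /* naturally aligned 1MB blocks */

typedef struct free_node { struct free_node *next; } free_node;

typedef struct block_hdr {           /* placed at the start of each block */
    size_t     bucket_size;          /* fixed allocation size in this block */
    free_node *free_list;            /* head of this block's free list */
} block_hdr;

static void sba_free(void *p)
{
    /* Mask the low bits to find the header of the naturally aligned
       1MB block containing p; no size argument and no search needed.  */
    block_hdr *hdr = (block_hdr *)((uintptr_t)p & ~(uintptr_t)(BLOCK_SIZE - 1));
    free_node *node = p;

    node->next = hdr->free_list;     /* lock/unlock omitted for brevity */
    hdr->free_list = node;
}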

-- Rich

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
  2011-08-27 21:45 ` [Bug libc/11261] malloc uses excessive memory for multi-threaded applications heuler at infosim dot net
  2011-08-27 22:02 ` rich at testardi dot com
@ 2011-09-02  7:39 ` heuler at infosim dot net
  2011-09-02  7:45 ` heuler at infosim dot net
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: heuler at infosim dot net @ 2011-09-02  7:39 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

--- Comment #10 from Marius Heuler <heuler at infosim dot net> 2011-09-02 07:38:51 UTC ---
Created attachment 5917
  --> http://sourceware.org/bugzilla/attachment.cgi?id=5917
Memory consumption with glibc malloc and jeMalloc (straight line).

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2011-09-02  7:39 ` heuler at infosim dot net
@ 2011-09-02  7:45 ` heuler at infosim dot net
  2011-09-11 15:46 ` drepper.fsp at gmail dot com
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: heuler at infosim dot net @ 2011-09-02  7:45 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

--- Comment #11 from Marius Heuler <heuler at infosim dot net> 2011-09-02 07:44:31 UTC ---
Comment on attachment 5917
  --> http://sourceware.org/bugzilla/attachment.cgi?id=5917
Memory consumption with glibc malloc and jeMalloc (straight line).

We have now switched to another malloc implementation, jemalloc
(http://www.canonware.com/jemalloc/), which is an order of magnitude better
than the glibc malloc. A similar implementation is also used in the *BSD
variants!
Linux/glibc should really improve its malloc, since the current
implementation is not sufficient for large applications.
Why can't this implementation be used inside glibc? Is it a GPL <-> BSD
license problem?

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2011-09-02  7:45 ` heuler at infosim dot net
@ 2011-09-11 15:46 ` drepper.fsp at gmail dot com
  2011-09-11 21:32 ` rich at testardi dot com
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: drepper.fsp at gmail dot com @ 2011-09-11 15:46 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

Ulrich Drepper <drepper.fsp at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |WORKSFORME

--- Comment #12 from Ulrich Drepper <drepper.fsp at gmail dot com> 2011-09-11 15:46:13 UTC ---
Stop reopening.  There is a solution for people who are stupid enough to create
too many threads.  No implementation will be perfect for everyone.  The glibc
implementation is tuned for reasonable programs and will run much faster than
any other I tested.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2011-09-11 15:46 ` drepper.fsp at gmail dot com
@ 2011-09-11 21:32 ` rich at testardi dot com
  2012-07-29 10:10 ` zhannk at gmail dot com
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: rich at testardi dot com @ 2011-09-11 21:32 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

--- Comment #13 from Rich Testardi <rich at testardi dot com> 2011-09-11 21:31:37 UTC ---
Let's all not take things so personally -- nobody here is stupid (and I'm sure
some folks here are a *lot* smarter than other folks give them credit for)...

There are lots of reasons to create a half dozen threads, and that's all it
takes to make the glibc version perform absolutely horribly.

(And there can be no objective measurement that won't show my version of
malloc is faster than yours -- so this has been a win all around for us,
thanks...)

If you're not interested in improving glibc, you can just say so.

But stop the name-calling when you feel threatened -- my 5-year-old daughter
has already outgrown that.

-- Rich



-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2011-09-11 21:32 ` rich at testardi dot com
@ 2012-07-29 10:10 ` zhannk at gmail dot com
  2012-12-19 10:47 ` schwab@linux-m68k.org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: zhannk at gmail dot com @ 2012-07-29 10:10 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

zhannk at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
                 CC|                            |zhannk at gmail dot com
         Resolution|WORKSFORME                  |
     Ever Confirmed|1                           |0

--- Comment #14 from zhannk at gmail dot com 2012-07-29 10:09:47 UTC ---
Ulrich Drepper, this huge virtual memory allocation could be a potential
troublemaker on Linux6 with a 64-bit JVM.
There is already a Hadoop document regarding this issue, but their solution of
setting MALLOC_ARENA_MAX=4 has no effect; we still see JVMs reported with 30G
of virtual memory.
https://issues.apache.org/jira/browse/HADOOP-7154

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2012-07-29 10:10 ` zhannk at gmail dot com
@ 2012-12-19 10:47 ` schwab@linux-m68k.org
  2013-03-14 19:03 ` carlos at redhat dot com
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: schwab@linux-m68k.org @ 2012-12-19 10:47 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|drepper.fsp at gmail dot    |unassigned at sourceware
                   |com                         |dot org

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2012-12-19 10:47 ` schwab@linux-m68k.org
@ 2013-03-14 19:03 ` carlos at redhat dot com
  2013-12-12  0:22 ` neleai at seznam dot cz
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: carlos at redhat dot com @ 2013-03-14 19:03 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |carlos at redhat dot com
         Resolution|                            |FIXED

--- Comment #15 from Carlos O'Donell <carlos at redhat dot com> 2013-03-14 19:03:05 UTC ---
This should have been fixed by the following commit:

commit 41b81892f11fe1353123e892158b53de73863d62
Author: Ulrich Drepper <drepper@gmail.com>
Date:   Tue Jan 31 14:42:34 2012 -0500

    Handle ARENA_TEST correctly

I have verified that, using `mallopt (M_ARENA_MAX, 1)', memory usage is
bounded by the single arena.

creating 10 threads
allowing threads to contend to create preferred arenas
display preferred arenas
Arena 0:
system bytes     =     135168
in use bytes     =       2880
Total (incl. mmap):
system bytes     =     135168
in use bytes     =       2880
max mmap regions =          0
max mmap bytes   =          0
allowing threads to allocate 100MB each, sequentially in turn
thread 0 alloc 100MB
thread 0 free 100MB-20kB
thread 4 alloc 100MB
thread 4 free 100MB-20kB
thread 9 alloc 100MB
thread 9 free 100MB-20kB
thread 5 alloc 100MB
thread 5 free 100MB-20kB
thread 2 alloc 100MB
thread 2 free 100MB-20kB
thread 7 alloc 100MB
thread 7 free 100MB-20kB
thread 1 alloc 100MB
thread 1 free 100MB-20kB
thread 8 alloc 100MB
thread 8 free 100MB-20kB
thread 6 alloc 100MB
thread 6 free 100MB-20kB
thread 3 alloc 100MB
thread 3 free 100MB-20kB
Arena 0:
system bytes     =  100392960
in use bytes     =     201472
Total (incl. mmap):
system bytes     =  100392960
in use bytes     =     201472
max mmap regions =          0
max mmap bytes   =          0

Therefore the solution for a program with lots of threads is to limit the
number of arenas, trading allocation speed for memory.
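
A minimal sketch of applying that limit (mallopt returns 1 on success; the
threaded test driver used above is omitted):

#include <malloc.h>
#include <stdio.h>

int
main (void)
{
  /* Cap glibc malloc at a single arena before any threads exist.  */
  if (mallopt (M_ARENA_MAX, 1) != 1)
    fprintf (stderr, "mallopt (M_ARENA_MAX) failed\n");

  /* ... create threads; all allocations now share the one arena ... */
  return 0;
}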

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2013-03-14 19:03 ` carlos at redhat dot com
@ 2013-12-12  0:22 ` neleai at seznam dot cz
  2013-12-12  3:32 ` siddhesh at redhat dot com
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: neleai at seznam dot cz @ 2013-12-12  0:22 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Ondrej Bilka <neleai at seznam dot cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
   Last reconfirmed|                            |2013-12-12
                 CC|                            |neleai at seznam dot cz
         Resolution|FIXED                       |---
     Ever confirmed|0                           |1

--- Comment #16 from Ondrej Bilka <neleai at seznam dot cz> ---
> Therefore the solution for a program with lots of threads is to limit the
> number of arenas, trading allocation speed for memory.

That is a band-aid, not a solution. There is still no memory returned to the
system when one does a burst of allocations and then frees almost all of
them, as in:

#include <stdlib.h>

void *calculate (void)
{
  int i;
  void **ary = malloc (1000000 * sizeof (void *));
  for (i = 0; i < 1000000; i++) ary[i] = malloc (100);
  for (i = 0; i <  999999; i++) free (ary[i]);
  return ary[999999];   /* one straggler pins the whole range */
}

Once one acknowledges the bug, a solution is relatively simple. Add an
UNMAPPED flag for chunks, meaning that all pages completely contained in the
chunk have been discarded by madvise(s, n, MADV_DONTNEED).

You keep track of memory in use versus memory obtained from the system, and
when their ratio exceeds two you mark chunks UNMAPPED, starting from the
largest ones, to decrease the system charge.

This deals with the RSS problem. Virtual address space usage could still be
excessive, but that is the smaller problem.
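
As a sketch of just the madvise step (release_pages is a made-up helper
here; the UNMAPPED flag bookkeeping is not shown):

#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Discard the whole pages inside a free chunk.  The pages stay mapped
   but read back as zeros afterwards, so the system charge drops.  */
static void release_pages (void *chunk, size_t len)
{
  uintptr_t page = (uintptr_t) sysconf (_SC_PAGESIZE);
  uintptr_t lo = ((uintptr_t) chunk + page - 1) & ~(page - 1);
  uintptr_t hi = ((uintptr_t) chunk + len) & ~(page - 1);

  if (hi > lo)
    madvise ((void *) lo, hi - lo, MADV_DONTNEED);
}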

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2013-12-12  0:22 ` neleai at seznam dot cz
@ 2013-12-12  3:32 ` siddhesh at redhat dot com
  2013-12-12  8:41 ` neleai at seznam dot cz
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: siddhesh at redhat dot com @ 2013-12-12  3:32 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Siddhesh Poyarekar <siddhesh at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |siddhesh at redhat dot com

--- Comment #17 from Siddhesh Poyarekar <siddhesh at redhat dot com> ---
(In reply to Ondrej Bilka from comment #16)
> > Therefore the solution for a program with lots of threads is to limit the
> > number of arenas, trading allocation speed for memory.
> 
> That is a band-aid, not a solution. There is still no memory returned to
> the system when one does a burst of allocations and then frees almost all
> of them, as in:

You have not understood the bug report.

> #include <stdlib.h>
>
> void *calculate (void)
> {
>   int i;
>   void **ary = malloc (1000000 * sizeof (void *));
>   for (i = 0; i < 1000000; i++) ary[i] = malloc (100);
>   for (i = 0; i <  999999; i++) free (ary[i]);
>   return ary[999999];   /* one straggler pins the whole range */
> }

This is a different problem from the current bug report, which is about too
many arenas getting created, resulting in excessive address space usage, and
the MALLOC_ARENA_* variables not working to limit them.  Memory holes not
being freed has nothing to do with it.

> Once one acknowledges the bug, a solution is relatively simple. Add an
> UNMAPPED flag for chunks, meaning that all pages completely contained in
> the chunk have been discarded by madvise(s, n, MADV_DONTNEED).
> 
> You keep track of memory in use versus memory obtained from the system,
> and when their ratio exceeds two you mark chunks UNMAPPED, starting from
> the largest ones, to decrease the system charge.
> 
> This deals with the RSS problem. Virtual address space usage could still
> be excessive, but that is the smaller problem.

The problem you've described is different, and I'm sure there's a bug report
open for it too.  madvise is not sufficient to free up commit charge; there is
a mail thread on libc-alpha discussing this problem that you can search for
and read up on.  I think vm.overcommit_memory is one of the keywords to look
for.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2013-12-12  3:32 ` siddhesh at redhat dot com
@ 2013-12-12  8:41 ` neleai at seznam dot cz
  2013-12-12 10:48 ` siddhesh at redhat dot com
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: neleai at seznam dot cz @ 2013-12-12  8:41 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=11261

--- Comment #18 from Ondrej Bilka <neleai at seznam dot cz> ---
On Thu, Dec 12, 2013 at 03:31:58AM +0000, siddhesh at redhat dot com wrote:
> You have not understood the bug report.
>
If you read the discussion more carefully, there are earlier posts where
this problem is mentioned:


Ulrich Drepper:

 You don't understand the difference between address space and allocated
 memory.

Rich Testardi:

Actually, I totally understand the difference and that is why I mentioned the 
fragmentation of memory...  When each arena has just a few straggling 
allocations, the maximum *committed* RAM required for the program's *working 
set* using the thread-preferred arena model is, in fact, N times that required 
for a traditional model, where N is the number of threads.  This shows up in 
real-world thrashing that could actually be avoided.  Basically, if the 
program is doing small allocations, a small percentage of stragglers can pin 
the entire allocated space -- and the allocated space is, in fact, much larger 
than it needs to be (and larger than it is in other OS's).  But thank you for
your time...

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2013-12-12  8:41 ` neleai at seznam dot cz
@ 2013-12-12 10:48 ` siddhesh at redhat dot com
  2014-02-07  3:01 ` [Bug malloc/11261] " jsm28 at gcc dot gnu.org
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: siddhesh at redhat dot com @ 2013-12-12 10:48 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

--- Comment #19 from Siddhesh Poyarekar <siddhesh at redhat dot com> ---
(In reply to Ondrej Bilka from comment #18)
> When you read discussion more carefully there are following posts where
> this problem is mentioned:
> 
> 
> Ulrich Drepper:
> 
>  You don't understand the difference between address space and allocated
>  memory.
> 
> Rich Testardi:
> 
> Actually, I totally understand the difference and that is why I mentioned
> the 
> fragmentation of memory...  When each arena has just a few straggling 
> allocations, the maximum *committed* RAM required for the program's *working 
> set* using the thread-preferred arena model is, in fact, N times that
> required 
> for a traditional model, where N is the number of threads.  This shows up in 
> real-world thrashing that could actually be avoided.  Basically, if the 
> program is doing small allocations, a small percentage of stragglers can pin 
> the entire allocated space -- and the allocated space is, in fact, much
> larger 
> than it needs to be (and larger than it is in other OS's).  But thank you for

Right, but most comments on the bug report (and the resolution) are in the
context of malloc creating too many arenas and the switches not working.
Single allocations blocking an entire free region are not a multi-threaded
problem -- they occur in single-threaded programs too and are only compounded
by multiple arenas.  I'd suggest working with a fresh bug report, or an open
bug report that describes this problem exactly (I'm pretty sure one exists).

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug malloc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (12 preceding siblings ...)
  2013-12-12 10:48 ` siddhesh at redhat dot com
@ 2014-02-07  3:01 ` jsm28 at gcc dot gnu.org
  2014-02-16 19:42 ` jackie.rosen at hushmail dot com
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: jsm28 at gcc dot gnu.org @ 2014-02-07  3:01 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Joseph Myers <jsm28 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|libc                        |malloc

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug malloc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (13 preceding siblings ...)
  2014-02-07  3:01 ` [Bug malloc/11261] " jsm28 at gcc dot gnu.org
@ 2014-02-16 19:42 ` jackie.rosen at hushmail dot com
  2014-05-28 19:46 ` schwab at sourceware dot org
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: jackie.rosen at hushmail dot com @ 2014-02-16 19:42 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Jackie Rosen <jackie.rosen at hushmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jackie.rosen at hushmail dot com

--- Comment #20 from Jackie Rosen <jackie.rosen at hushmail dot com> ---
*** Bug 260998 has been marked as a duplicate of this bug. ***
Seen from the domain http://volichat.com
Page where seen: http://volichat.com/adult-chat-rooms
Marked for reference. Resolved as fixed @bugzilla.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug malloc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (15 preceding siblings ...)
  2014-05-28 19:46 ` schwab at sourceware dot org
@ 2014-05-28 19:46 ` schwab at sourceware dot org
  2014-06-30 18:50 ` fweimer at redhat dot com
  2015-02-12 20:04 ` carlos at redhat dot com
  18 siblings, 0 replies; 26+ messages in thread
From: schwab at sourceware dot org @ 2014-05-28 19:46 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Andreas Schwab <schwab at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|jackie.rosen at hushmail dot com   |

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug malloc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (14 preceding siblings ...)
  2014-02-16 19:42 ` jackie.rosen at hushmail dot com
@ 2014-05-28 19:46 ` schwab at sourceware dot org
  2014-05-28 19:46 ` schwab at sourceware dot org
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: schwab at sourceware dot org @ 2014-05-28 19:46 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Andreas Schwab <schwab at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|jackie.rosen at hushmail dot com   |

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug malloc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (16 preceding siblings ...)
  2014-05-28 19:46 ` schwab at sourceware dot org
@ 2014-06-30 18:50 ` fweimer at redhat dot com
  2015-02-12 20:04 ` carlos at redhat dot com
  18 siblings, 0 replies; 26+ messages in thread
From: fweimer at redhat dot com @ 2014-06-30 18:50 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug malloc/11261] malloc uses excessive memory for multi-threaded applications
       [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
                   ` (17 preceding siblings ...)
  2014-06-30 18:50 ` fweimer at redhat dot com
@ 2015-02-12 20:04 ` carlos at redhat dot com
  18 siblings, 0 replies; 26+ messages in thread
From: carlos at redhat dot com @ 2015-02-12 20:04 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #21 from Carlos O'Donell <carlos at redhat dot com> ---
I'm marking this fixed, since the tunnables that limit arena creation are
fixed. You can limit the number of arenas in your application at the cost of
thread contention during allocation (increased malloc latency). This does
however limit the total VA usage. This is particularly true of 32-bit
applications running close to the 32-bit VA limit.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
  2010-02-08 20:23 [Bug libc/11261] New: " rich at testardi dot com
                   ` (5 preceding siblings ...)
  2010-02-10 14:29 ` rich at testardi dot com
@ 2010-02-10 15:52 ` rich at testardi dot com
  6 siblings, 0 replies; 26+ messages in thread
From: rich at testardi dot com @ 2010-02-10 15:52 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rich at testardi dot com  2010-02-10 15:52 -------
Last mail...

It turns out the arena_max and arena_test numbers are "fuzzy" (I am sure by 
design), since no lock is held here:

static mstate
internal_function
arena_get2(mstate a_tsd, size_t size)
{
  mstate a;
#ifdef PER_THREAD
  if (__builtin_expect (use_per_thread, 0)) {
    if ((a = get_free_list ()) == NULL
        && (a = reused_arena ()) == NULL)
      /* Nothing immediately available, so generate a new arena.  */
      a = _int_new_arena(size);
    return a;
  }
#endif

Therefore, if narenas is less than the limit tested for in reused_arena(), and
N threads get into this code at once, narenas can then end up N-1 *above* the
limit.  The likelihood of this happening is proportional to the malloc arrival
rate and the time spent in _int_new_arena().

This is exactly what I am seeing.

So if you can live with 2 arenas, the critical thing to do is to make sure
narenas is exactly 2 before going heavily multi-threaded, and then it won't be
able to go above 2; otherwise, it can sneak up to 2+N-1, where N is the number
of threads contending for allocations.

If the "<=" in reused_arena() were changed to "<", then we could use this
mechanism to limit narenas to exactly 1 right from the get-go.  That would be
ideal for our kind of applications (which can't live with 2 arenas).
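
A sketch of that warm-up, assuming the PER_THREAD code path quoted above
(where a new thread's first malloc may create a fresh arena while narenas
is still at or under arena_test):

#include <malloc.h>
#include <pthread.h>
#include <stdlib.h>

static void *
warm(void *arg)
{
    (void)arg;                   // unused
    free(malloc(64));            // may create the second arena
    return NULL;
}

int
main(void)
{
    pthread_t t;

    mallopt(-8, 2);              // M_ARENA_MAX
    free(malloc(64));            // the main arena now exists
    pthread_create(&t, NULL, warm, NULL);
    pthread_join(&t, NULL);
    // narenas is now at its cap of 2 before the real thread pool starts,
    // so the unlocked check can no longer race it higher.
    return 0;
}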

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=11261

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
  2010-02-08 20:23 [Bug libc/11261] New: " rich at testardi dot com
                   ` (4 preceding siblings ...)
  2010-02-10 13:42 ` rich at testardi dot com
@ 2010-02-10 14:29 ` rich at testardi dot com
  2010-02-10 15:52 ` rich at testardi dot com
  6 siblings, 0 replies; 26+ messages in thread
From: rich at testardi dot com @ 2010-02-10 14:29 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rich at testardi dot com  2010-02-10 14:29 -------
And a comment for anyone else who might stumble this way...

I *can* reduce the total number of arenas to *2* (not low enough for our 
purposes) with the following sequence:

export MALLOC_PER_THREAD=1

    rv = mallopt(-7, 1);  // M_ARENA_TEST
    printf("%d\n", rv);
    rv = mallopt(-8, 1);  // M_ARENA_MAX
    printf("%d\n", rv);

*PLUS* I have to have a global pthread mutex around every malloc(3) and
free(3) call -- I can't figure out from the code why this is required, but
without it the number of arenas seems independent of the mallopt settings.

I cannot get to *1* arena because a) mallopt() won't allow you to set 
arena_test to 0:

#ifdef PER_THREAD
  case M_ARENA_TEST:
    if (value > 0)
      mp_.arena_test = value;
    break;

  case M_ARENA_MAX:
    if (value > 0)
      mp_.arena_max = value;
    break;
#endif

And b) reused_arena() returns NULL (forcing a new arena) while narenas is
still <= arena_test, rather than only while it is < arena_test:

static mstate
reused_arena (void)
{
  if (narenas <= mp_.arena_test)
    return NULL;




-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=11261

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
  2010-02-08 20:23 [Bug libc/11261] New: " rich at testardi dot com
                   ` (3 preceding siblings ...)
  2010-02-10 13:21 ` drepper at redhat dot com
@ 2010-02-10 13:42 ` rich at testardi dot com
  2010-02-10 14:29 ` rich at testardi dot com
  2010-02-10 15:52 ` rich at testardi dot com
  6 siblings, 0 replies; 26+ messages in thread
From: rich at testardi dot com @ 2010-02-10 13:42 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rich at testardi dot com  2010-02-10 13:41 -------
Hi Ulrich,

Agreed 100% no one size fits all...

Unfortunately, neither of the "tuning" settings, MALLOC_ARENA_MAX nor
MALLOC_ARENA_TEST, seems to work.  Neither do the mallopt() M_ARENA_MAX and
M_ARENA_TEST parameters. :-(

Part of the problem seems to stem from the fact that the global "narenas" is 
only incremented if MALLOC_PER_THREAD/use_per_thread is true...

#ifdef PER_THREAD
  if (__builtin_expect (use_per_thread, 0)) {
    ++narenas;

    (void)mutex_unlock(&list_lock);
  }
#endif

So the tests of those other variables in reused_arena() never limit anything.  
And setting MALLOC_PER_THREAD makes our problem much worse.

static mstate
reused_arena (void)
{
  if (narenas <= mp_.arena_test)
    return NULL;

  ...

  if (narenas < narenas_limit)
    return NULL;

I also tried all combinations I could imagine of MALLOC_PER_THREAD and the 
other variables, to no avail.  I also did the same with mallopt(), verifying 
at the assembly level that we got all the right values into mp_. :-(

Specifically, I tried things like:

export MALLOC_PER_THREAD=1
export MALLOC_ARENA_MAX=1
export MALLOC_ARENA_TEST=1

and:

    rv = mallopt(-7, 1);
    printf("%d\n", rv);
    rv = mallopt(-8, 1);
    printf("%d\n", rv);

Anyway, thank you.  You've already pointed me in all of the right directions.  
If I did something completely brain-dead, above, feel free to tell me and save 
me another few days of work! :-)

-- Rich

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=11261

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
  2010-02-08 20:23 [Bug libc/11261] New: " rich at testardi dot com
                   ` (2 preceding siblings ...)
  2010-02-10 13:10 ` rich at testardi dot com
@ 2010-02-10 13:21 ` drepper at redhat dot com
  2010-02-10 13:42 ` rich at testardi dot com
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: drepper at redhat dot com @ 2010-02-10 13:21 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2010-02-10 13:21 -------
I already described what you can do to limit the number of memory pools.  Just
use it.  If you don't like envvars use the appropriate mallopt() calls (using
M_ARENA_MAX and M_ARENA_TEST).

No malloc implementation is optimal for all situations.  This is why there are
customization knobs.
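
For example, a minimal sketch of the mallopt() form (M_ARENA_TEST is the
arena count up to which arenas are created freely, M_ARENA_MAX the hard
cap; the envvars set the same values):

#include <malloc.h>

static void limit_pools (void)
{
  mallopt (M_ARENA_TEST, 1);   /* check the limit once one arena exists */
  mallopt (M_ARENA_MAX, 2);    /* keep at most two arenas */
}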

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |WONTFIX


http://sourceware.org/bugzilla/show_bug.cgi?id=11261

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
  2010-02-08 20:23 [Bug libc/11261] New: " rich at testardi dot com
  2010-02-09 15:28 ` [Bug libc/11261] " drepper at redhat dot com
  2010-02-09 16:02 ` rich at testardi dot com
@ 2010-02-10 13:10 ` rich at testardi dot com
  2010-02-10 13:21 ` drepper at redhat dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: rich at testardi dot com @ 2010-02-10 13:10 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rich at testardi dot com  2010-02-10 13:10 -------
Hi Ulrich,

I apologize in advance and want you to know I will not reopen this bug again, 
but I felt I had to show you a new test program that clearly shows "The cost 
of large amounts of allocated address space is insignificant" can be 
exceedingly untrue for heavily threaded systems using large amounts of 
memory.  In our product, we require 2x the RAM on Linux vs other OS's because 
of this. :-(

I've reduced the problem to a program that you can invoke with no options and 
it runs fine, but with the "-x" option it thrashes wildly.  The only 
difference is that in the "-x" case we allow the threads to do some dummy 
malloc/frees up front to create thread-preferred arenas.

The program simply has a bunch of threads that, in turn (i.e., not 
concurrently), allocate a bunch of memory, and then free most (but not all!) 
of it.  The resulting allocations easily fit in RAM, even when fragmented.  It 
then attempts to memset the unfreed memory to 0.

The problem is that in the thread-preferred arena case, the fragmented 
allocations are now spread over 10x the virtual space, and when accessed, 
result in actual commitment of at least 2x the physical space -- enough to 
push us over the top of RAM and into thrashing.

So as a result, without the -x option, the program memset runs in two seconds 
or so on my system (8-way, 2GHz, 12GB RAM); with the -x option, the program 
memset can take hundreds to thousands of seconds.

I know this sounds contrived, but it was in fact *derived* from a real-life 
problem.

All I am hoping to convey is that there are memory intensive applications for 
which thread-preferred arenas actually hurt performance significantly.  
Furthermore, turning on MALLOC_PER_THREAD can actually have an even more 
devastating effect on these applications than the default behavior.  And 
unfortunately, neither MALLOC_ARENA_MAX nor MALLOC_ARENA_TEST can prevent the 
thread-preferred arena proliferation.

The test run output without and with "-x" option are below; the source code is 
below that.

Thank you for your time.  Like I said, I won't reopen this again, but I hope 
you'll consider giving applications like ours a "way out" of the thread-
preferred arenas in the future -- especially since it seems our future is even 
more bleak with MALLOC_PER_THREAD, and that's the way you are moving (and for 
certain applications, MALLOC_PER_THREAD makes sense!).

Anyway, I've already written a small block binned allocator that will live on 
top of mmap'd pages for us for Linux, so we're OK.  But I'd rather just use 
malloc(3).

-- Rich

[root@lab2-160 test_heap]# ./memx2
cpus = 8; pages = 3072694; pagesize = 4096
nallocs = 307200
--- creating 100 threads ---
--- waiting for threads to allocate memory ---
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 
30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 
56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 
82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
--- malloc_stats() ---
Arena 0:
system bytes     = 1557606400
in use bytes     =  743366944
Total (incl. mmap):
system bytes     = 1562529792
in use bytes     =  748290336
max mmap regions =          2
max mmap bytes   =    4923392
--- cat /proc/29565/status | grep -i vm ---
VmPeak:  9961304 kB
VmSize:  9951060 kB
VmLck:         0 kB
VmHWM:   2517656 kB
VmRSS:   2517656 kB
VmData:  9945304 kB
VmStk:        84 kB
VmExe:         8 kB
VmLib:      1532 kB
VmPTE:     19432 kB
--- accessing memory ---
--- done in 3 seconds ---


[root@lab2-160 test_heap]# ./memx2 -x
cpus = 8; pages = 3072694; pagesize = 4096
nallocs = 307200
--- creating 100 threads ---
--- allowing threads to create preferred arenas ---
--- waiting for threads to allocate memory ---
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 
30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 
56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 
82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
--- malloc_stats() ---
Arena 0:
system bytes     = 1264455680
in use bytes     =  505209392
Arena 1:
system bytes     = 1344937984
in use bytes     =  653695200
Arena 2:
system bytes     = 1396580352
in use bytes     =  705338800
Arena 3:
system bytes     = 1195057152
in use bytes     =  503815408
Arena 4:
system bytes     = 1295818752
in use bytes     =  604577136
Arena 5:
system bytes     = 1094295552
in use bytes     =  403053744
Arena 6:
system bytes     = 1245437952
in use bytes     =  554196272
Arena 7:
system bytes     = 1144676352
in use bytes     =  453434608
Arena 8:
system bytes     = 1346199552
in use bytes     =  654958000
Total (incl. mmap):
system bytes     = 2742448128
in use bytes     =  748234656
max mmap regions =          2
max mmap bytes   =    4923392
--- cat /proc/29669/status | grep -i vm ---
VmPeak: 49213720 kB
VmSize: 49182988 kB
VmLck:         0 kB
VmHWM:  12052384 kB
VmRSS:  11861284 kB
VmData: 49177232 kB
VmStk:        84 kB
VmExe:         8 kB
VmLib:      1532 kB
VmPTE:     95452 kB
--- accessing memory ---
60 secs... 120 secs... 180 secs... 240 secs... 300 secs... 360 secs... 420 
secs... 480 secs... 540 secs... 600 secs... 660 secs... 720 secs... 780 secs...
--- done in 818 seconds ---
[root@lab2-160 test_heap]#


[root@lab2-160 test_heap]# cat memx2.c
// ****************************************************************************

#include <stdio.h>
#include <errno.h>
#include <assert.h>
#include <limits.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <inttypes.h>
#include <time.h>       // nanosleep(), time()
#include <malloc.h>     // malloc_stats()
#include <sys/types.h>  // uint

#define NTHREADS  100
#define ALLOCSIZE  16384
#define STRAGGLERS  100

static uint cpus;
static uint pages;
static uint pagesize;

static uint nallocs;

static volatile int go;
static volatile int done;
static volatile int spin;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void **ps;  // allocations that are freed in turn by each thread
static int nps;
static void **ss;  // straggling allocations to prevent arena free
static int nss;

void
my_sleep(
    int ms
    )
{
    int rv;
    struct timespec ts;
    struct timespec rem;

    ts.tv_sec  = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000;
    for (;;) {
        rv = nanosleep(&ts, &rem);
        if (! rv) {
            break;
        }
        assert(errno == EINTR);
        ts = rem;
    }
}

void *
my_thread(
    void *context
    )
{
    int i;
    int n;
    int si;
    int rv;
    void *p;

    n = (int)(intptr_t)context;

    while (! go) {
        my_sleep(100);
    }

    // first we spin to get our own arena
    while (spin) {
        p = malloc(ALLOCSIZE);
        assert(p);
        if (rand()%20000 == 0) {
            my_sleep(10);
        }
        free(p);
    }

    my_sleep(1000);

    // then one thread at a time, do our big allocs
    rv = pthread_mutex_lock(&mutex);
    assert(! rv);
    for (i = 0; i < nallocs; i++) {
        assert(i < nps);
        ps[i] = malloc(ALLOCSIZE);
        assert(ps[i]);
    }
    // N.B. we leave 1 of every STRAGGLERS allocations straggling
    for (i = 0; i < nallocs; i++) {
        assert(i < nps);
        if (i%STRAGGLERS == 0) {
            si = nallocs/STRAGGLERS*n + i/STRAGGLERS;
            assert(si < nss);
            ss[si] = ps[i];
        } else {
            free(ps[i]);
        }
    }
    done++;
    printf("%d ", done);
    fflush(stdout);
    rv = pthread_mutex_unlock(&mutex);
    assert(! rv);
    return NULL;
}

int
main(int argc, char **argv)
{
    int i;
    int rv;
    time_t n;
    time_t t;
    time_t lt;
    pthread_t thread;
    char command[128];


    if (argc > 1) {
        if (! strcmp(argv[1], "-x")) {
            spin = 1;
            argc--;
            argv++;
        }
    }
    if (argc > 1) {
        printf("usage: memx2 [-x]\n");
        return 1;
    }

    cpus = sysconf(_SC_NPROCESSORS_CONF);
    pages = sysconf (_SC_PHYS_PAGES);
    pagesize = sysconf (_SC_PAGESIZE);
    printf("cpus = %d; pages = %d; pagesize = %d\n", cpus, pages, pagesize);

    nallocs = pages/10/STRAGGLERS*STRAGGLERS;
    assert(! (nallocs%STRAGGLERS));
    printf("nallocs = %d\n", nallocs);

    nps = nallocs;
    ps = malloc(nps*sizeof(*ps));
    assert(ps);
    nss = NTHREADS*nallocs/STRAGGLERS;
    ss = malloc(nss*sizeof(*ss));
    assert(ss);

    if (pagesize != 4096) {
        printf("WARNING -- this program expects 4096 byte pagesize!\n");
    }

    printf("--- creating %d threads ---\n", NTHREADS);
    for (i = 0; i < NTHREADS; i++) {
        rv = pthread_create(&thread, NULL, my_thread, (void *)(intptr_t)i);
        assert(! rv);
        rv = pthread_detach(thread);
        assert(! rv);
    }
    go = 1;

    if (spin) {
        printf("--- allowing threads to create preferred arenas ---\n");
        my_sleep(5000);
        spin = 0;
    }

    printf("--- waiting for threads to allocate memory ---\n");
    while (done != NTHREADS) {
        my_sleep(1000);
    }
    printf("\n");

    printf("--- malloc_stats() ---\n");
    malloc_stats();
    sprintf(command, "cat /proc/%d/status | grep -i vm", (int)getpid());
    printf("--- %s ---\n", command);
    (void)system(command);

    // access the stragglers
    printf("--- accessing memory ---\n");
    t = time(NULL);
    lt = t;
    for (i = 0; i < nss; i++) {
        memset(ss[i], 0, ALLOCSIZE);
        n = time(NULL);
        if (n-lt >= 60) {
            printf("%d secs... ", (int)(n-t));
            fflush(stdout);
            lt = n;
        }
    }
    if (lt != t) {
        printf("\n");
    }
    printf("--- done in %d seconds ---\n", (int)(time(NULL)-t));

    return 0;
}
[root@lab2-160 test_heap]#


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WONTFIX                     |


http://sourceware.org/bugzilla/show_bug.cgi?id=11261

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
  2010-02-08 20:23 [Bug libc/11261] New: " rich at testardi dot com
  2010-02-09 15:28 ` [Bug libc/11261] " drepper at redhat dot com
@ 2010-02-09 16:02 ` rich at testardi dot com
  2010-02-10 13:10 ` rich at testardi dot com
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: rich at testardi dot com @ 2010-02-09 16:02 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rich at testardi dot com  2010-02-09 16:01 -------
Actually, I totally understand the difference and that is why I mentioned the 
fragmentation of memory...  When each arena has just a few straggling 
allocations, the maximum *committed* RAM required for the program's *working 
set* using the thread-preferred arena model is, in fact, N times that required 
for a traditional model, where N is the number of threads.  This shows up in 
real-world thrashing that could actually be avoided.  Basically, if the 
program is doing small allocations, a small percentage of stragglers can pin 
the entire allocated space -- and the allocated space is, in fact, much larger 
than it needs to be (and larger than it is in other OS's).  But thank you for 
your time -- we all want the same thing here, an ever better Linux that is
more suited to heavily threaded applications. :-)

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=11261

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug libc/11261] malloc uses excessive memory for multi-threaded applications
  2010-02-08 20:23 [Bug libc/11261] New: " rich at testardi dot com
@ 2010-02-09 15:28 ` drepper at redhat dot com
  2010-02-09 16:02 ` rich at testardi dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: drepper at redhat dot com @ 2010-02-09 15:28 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2010-02-09 15:28 -------
You don't understand the difference between address space and allocated memory.
 The cost of large amounts of allocated address space is insignificant.

If you don't want it, control it using the MALLOC_ARENA_MAX and
MALLOC_ARENA_TEST envvars.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX


http://sourceware.org/bugzilla/show_bug.cgi?id=11261

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2015-02-12 20:04 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-11261-131@http.sourceware.org/bugzilla/>
2011-08-27 21:45 ` [Bug libc/11261] malloc uses excessive memory for multi-threaded applications heuler at infosim dot net
2011-08-27 22:02 ` rich at testardi dot com
2011-09-02  7:39 ` heuler at infosim dot net
2011-09-02  7:45 ` heuler at infosim dot net
2011-09-11 15:46 ` drepper.fsp at gmail dot com
2011-09-11 21:32 ` rich at testardi dot com
2012-07-29 10:10 ` zhannk at gmail dot com
2012-12-19 10:47 ` schwab@linux-m68k.org
2013-03-14 19:03 ` carlos at redhat dot com
2013-12-12  0:22 ` neleai at seznam dot cz
2013-12-12  3:32 ` siddhesh at redhat dot com
2013-12-12  8:41 ` neleai at seznam dot cz
2013-12-12 10:48 ` siddhesh at redhat dot com
2014-02-07  3:01 ` [Bug malloc/11261] " jsm28 at gcc dot gnu.org
2014-02-16 19:42 ` jackie.rosen at hushmail dot com
2014-05-28 19:46 ` schwab at sourceware dot org
2014-05-28 19:46 ` schwab at sourceware dot org
2014-06-30 18:50 ` fweimer at redhat dot com
2015-02-12 20:04 ` carlos at redhat dot com
2010-02-08 20:23 [Bug libc/11261] New: " rich at testardi dot com
2010-02-09 15:28 ` [Bug libc/11261] " drepper at redhat dot com
2010-02-09 16:02 ` rich at testardi dot com
2010-02-10 13:10 ` rich at testardi dot com
2010-02-10 13:21 ` drepper at redhat dot com
2010-02-10 13:42 ` rich at testardi dot com
2010-02-10 14:29 ` rich at testardi dot com
2010-02-10 15:52 ` rich at testardi dot com
