public inbox for glibc-bugs@sourceware.org
* [Bug libc/1541] New: Poor threaded application performance when using malloc
@ 2005-10-25 14:11 sjmunroe at us dot ibm dot com
  2005-10-25 15:32 ` [Bug libc/1541] " sjmunroe at us dot ibm dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: sjmunroe at us dot ibm dot com @ 2005-10-25 14:11 UTC (permalink / raw)
  To: glibc-bugs

Threaded applications that use malloc to allocate large buffer/work areas will
suffer significant performance degradation whenever the allocation size exceeds
the MMAP_THRESHOLD.

When a malloc allocation size exceeds the MMAP_THRESHOLD, the storage is
allocated via anonymous mmap instead of from brk storage. The mmap syscall only
allocates the region; no pages are allocated until first touch, so there is a
page fault for each page as it is touched for the first time. The kernel has a
semaphore around the "allocate zeroed page" operation, which serializes this
operation for threaded applications. These anonymous mmap regions are not reused
by malloc, so the "fault/zero page" bottleneck occurs for every large allocation.
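
The first-touch behaviour described above can be observed with a small sketch
like the following. It is only an illustration under assumptions: it assumes
Linux, uses getrusage() to read the process's minor-fault counter, and the
helper name count_first_touch_faults() is hypothetical, not part of glibc or of
the attached test.

```c
/* Sketch: count the minor page faults taken while first touching an
 * anonymous mmap region. Assumes Linux; the helper name is hypothetical. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Returns the number of minor faults observed, or -1 on error. */
long count_first_touch_faults(size_t len)
{
    struct rusage before, after;
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    /* mmap only sets up the region; no page is populated yet. */
    void *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    /* volatile keeps the compiler from dropping the stores below. */
    volatile unsigned char *p = region;
    long result = -1;

    if (getrusage(RUSAGE_SELF, &before) == 0) {
        /* One write per page makes the kernel supply a zeroed page. */
        for (size_t off = 0; off < len; off += (size_t)page)
            p[off] = 1;
        if (getrusage(RUSAGE_SELF, &after) == 0)
            result = after.ru_minflt - before.ru_minflt;
    }

    munmap(region, len);
    return result;
}
```

Calling count_first_touch_faults(1 << 20) from a test program should return a
count that grows with the number of pages touched, although kernel settings
such as transparent huge pages may change the exact number.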

This can be seen as a kernel problem, but it is also a glibc problem because for
some applications the default MMAP_THRESHOLD (normally 128K) is simply too small.
Changing the MMAP_THRESHOLD to a value large enough to handle most allocations
gives a significant speedup.

For 64-bit platforms it could be wise to bump up the default thresholds to a
more reasonable value (say 16M). Or we need a simple and effective way to change
the thresholds from outside the application. The mallopt API can be used to
change the default MMAP_THRESHOLD, but many customers are reluctant to change
their source "just for Linux". An environment-variable-based mechanism may be
more acceptable.
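
For reference, the source change in question is small. The sketch below
assumes glibc's <malloc.h>; set_mmap_threshold() is a hypothetical wrapper
introduced here for illustration, and 16M is simply the value suggested above.

```c
/* Sketch: raise glibc's mmap threshold from inside the program.
 * Assumes glibc; set_mmap_threshold() is a hypothetical helper. */
#include <malloc.h>

/* Returns 1 on success and 0 on failure, mirroring mallopt(). */
int set_mmap_threshold(int bytes)
{
    /* Requests of at least this size are served by mmap instead of
     * from the heap; call this before the large allocations start. */
    return mallopt(M_MMAP_THRESHOLD, bytes);
}
```

A program would call set_mmap_threshold(16 * 1024 * 1024) once at startup and
check the return value. The mallopt(3) manual page documents an upper limit for
this parameter that is lower on 32-bit systems, so a 16M request may be
rejected there. It also documents a MALLOC_MMAP_THRESHOLD_ environment
variable; whether that is honoured depends on the glibc version and on how the
program is run, so treat it as something to verify rather than rely on.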

-- 
           Summary: Poor threaded application performance when using malloc
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper at redhat dot com
        ReportedBy: sjmunroe at us dot ibm dot com
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=1541

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.



* [Bug libc/1541] Poor threaded application performance when using malloc
  2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
@ 2005-10-25 15:32 ` sjmunroe at us dot ibm dot com
  2005-10-25 15:53 ` sjmunroe at us dot ibm dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: sjmunroe at us dot ibm dot com @ 2005-10-25 15:32 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From sjmunroe at us dot ibm dot com  2005-10-25 15:32 -------
Created an attachment (id=724)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=724&action=view)
Threaded malloc test with MMAP_THRESHOLD options

To build use:
   gcc -g -O2 malloc-test.c -lpthread -o malloc-test
or
   gcc -g -O2 -DMAP_THRESHOLD=16777216 malloc-test.c -lpthread -o malloc-test_16M



* [Bug libc/1541] Poor threaded application performance when using malloc
  2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
  2005-10-25 15:32 ` [Bug libc/1541] " sjmunroe at us dot ibm dot com
@ 2005-10-25 15:53 ` sjmunroe at us dot ibm dot com
  2005-11-01  8:05 ` roland at gnu dot org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: sjmunroe at us dot ibm dot com @ 2005-10-25 15:53 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From sjmunroe at us dot ibm dot com  2005-10-25 15:53 -------
To run the testcase single-threaded:
   ./malloc-test 128000 10000
   ...
   Average : 0.718383 seconds for 10000 requests of 128000 bytes, 491MB concurrent.

To run with 16 threads:
   ./malloc-test 128000 10000 16
   ...
   Average : 1.280583 seconds for 10000 requests of 128000 bytes, 490MB concurrent.


These run quickly because 128000 is less than the default MMAP_THRESHOLD. Now
try with a malloc size larger than the MMAP_THRESHOLD:
   ./malloc-test 1280000 10000 16
   ...
   Average : 227.594933 seconds for 10000 requests of 421006 bytes, 488MB concurrent.

Notice the huge jump from 1.28 to 227 seconds while the total concurrent storage
remained constant at around 490MB!

Now try a version of malloc-test that changes the MMAP_THRESHOLD to 16M:

   ./malloc-test_16M 1280000 10000 16
   ...
   Average : 7.473701 seconds for 10000 requests of 421006 bytes, 488MB concurrent.

The time comes down to a more reasonable 7.47 seconds. Finally, to verify that a
larger MMAP_THRESHOLD does not negatively impact smaller allocations, try:

   ./malloc-test_16M 128000 10000 16
   ...
   1.066022 seconds for 10000 requests of 128000 bytes, 490MB concurrent.

Which in this case is faster than with the smaller default MMAP_THRESHOLD.

All runs were on my dual 2GHz G5 (PPC64/970) system, but I see similar results
on my dual Athlon system. So I suspect this is a common problem across SMP
platforms.
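
For readers without the attachment, the kind of loop being timed above can be
approximated with the sketch below. This is NOT the attached malloc-test.c: it
is a simplified stand-in that does not hold buffers concurrently, and every
name in it (struct job, worker, time_malloc_threads) is hypothetical.

```c
/* Sketch of a threaded malloc/touch/free timing loop. This is not the
 * attached malloc-test.c; all names here are hypothetical. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

struct job { size_t size; int iters; };

static void *worker(void *arg)
{
    const struct job *j = arg;
    for (int i = 0; i < j->iters; i++) {
        /* volatile stores keep the compiler from eliding the buffer. */
        volatile char *buf = malloc(j->size);
        if (buf == NULL)
            break;
        /* Touch the buffer at an assumed 4096-byte page stride. */
        for (size_t off = 0; off < j->size; off += 4096)
            buf[off] = 1;
        free((void *)buf);
    }
    return NULL;
}

/* Runs nthreads workers; returns elapsed wall-clock seconds or -1.0. */
double time_malloc_threads(int nthreads, size_t size, int iters)
{
    if (nthreads <= 0)
        return -1.0;

    struct job j = { size, iters };
    struct timespec t0, t1;
    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    if (tids == NULL)
        return -1.0;

    int started = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &j) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    free(tids);
    if (started != nthreads)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Built with something like gcc -O2 -pthread, comparing
time_malloc_threads(16, 128000, 10000) against
time_malloc_threads(16, 1280000, 10000) gives a rough analogue of the runs
above. On a glibc with a dynamically adjusted mmap threshold the gap may be
much smaller than the figures reported here.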


* [Bug libc/1541] Poor threaded application performance when using malloc
  2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
  2005-10-25 15:32 ` [Bug libc/1541] " sjmunroe at us dot ibm dot com
  2005-10-25 15:53 ` sjmunroe at us dot ibm dot com
@ 2005-11-01  8:05 ` roland at gnu dot org
  2005-11-01 17:12 ` sjmunroe at us dot ibm dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: roland at gnu dot org @ 2005-11-01  8:05 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From roland at gnu dot org  2005-11-01 08:05 -------
Have you done any profiling to substantiate your analysis of why it is slower?
I see nothing in the kernel to suggest that brk preallocates zero-fill pages.
Your test program preallocates them in its early iterations and then reuses
those pages by freeing and allocating repeatedly, I would suspect.  Profiling
would show the time spent in mmap/munmap syscalls vs spent faulting in pages,
for example.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roland at gnu dot org
             Status|NEW                         |WAITING



* [Bug libc/1541] Poor threaded application performance when using malloc
  2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
                   ` (2 preceding siblings ...)
  2005-11-01  8:05 ` roland at gnu dot org
@ 2005-11-01 17:12 ` sjmunroe at us dot ibm dot com
  2005-11-01 17:19 ` sjmunroe at us dot ibm dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: sjmunroe at us dot ibm dot com @ 2005-11-01 17:12 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From sjmunroe at us dot ibm dot com  2005-11-01 17:12 -------
Created an attachment (id=733)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=733&action=view)
Oprofile of malloc-test 128000 1000 8 on Dual PPC64 G5

This profile shows that when the MMAP_THRESHOLD is exceeded we see a big
increase in kernel time. The kernel time is associated with locking,
scheduling, and page faults.

I don't have access to an i386 SMP box at the moment, but I suspect the
profile there will be similar.


* [Bug libc/1541] Poor threaded application performance when using malloc
  2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
                   ` (3 preceding siblings ...)
  2005-11-01 17:12 ` sjmunroe at us dot ibm dot com
@ 2005-11-01 17:19 ` sjmunroe at us dot ibm dot com
  2005-11-01 17:27 ` sjmunroe at us dot ibm dot com
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: sjmunroe at us dot ibm dot com @ 2005-11-01 17:19 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From sjmunroe at us dot ibm dot com  2005-11-01 17:19 -------
Created an attachment (id=734)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=734&action=view)
profile from similar run but with MMAP_THRESHOLD increased to 16M

Increasing the MMAP_THRESHOLD improved performance, so I had to increase the
number of iterations to get the test to run long enough to profile. The profile
shows most of the time (92%) in the test application (run_test) and a few
percent in the malloc runtime. The first kernel contribution starts at 0.2% for
schedule.


* [Bug libc/1541] Poor threaded application performance when using malloc
  2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
                   ` (4 preceding siblings ...)
  2005-11-01 17:19 ` sjmunroe at us dot ibm dot com
@ 2005-11-01 17:27 ` sjmunroe at us dot ibm dot com
  2007-02-18  4:45 ` drepper at redhat dot com
  2010-06-01  3:30 ` pasky at suse dot cz
  7 siblings, 0 replies; 11+ messages in thread
From: sjmunroe at us dot ibm dot com @ 2005-11-01 17:27 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From sjmunroe at us dot ibm dot com  2005-11-01 17:27 -------
Yes, arenas allocated in brk storage page fault once but are efficiently reused.
The problem with large allocations is that the storage allocated with mmap is
unmapped by free(). So each new allocation that exceeds the MMAP_THRESHOLD
has to be faulted in again.

The mmap syscall does not do much work. Most of the effort of allocating the
page and zeroing it out is deferred until the page is actually touched for the
first time. This is reflected in the profiles attached above.


* [Bug libc/1541] Poor threaded application performance when using malloc
  2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
                   ` (5 preceding siblings ...)
  2005-11-01 17:27 ` sjmunroe at us dot ibm dot com
@ 2007-02-18  4:45 ` drepper at redhat dot com
  2010-06-01  3:30 ` pasky at suse dot cz
  7 siblings, 0 replies; 11+ messages in thread
From: drepper at redhat dot com @ 2007-02-18  4:45 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2007-02-18 04:45 -------
This should have been dealt with in a malloc patch which went in some time ago.
 Verify and close or elaborate.


* [Bug libc/1541] Poor threaded application performance when using malloc
  2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
                   ` (6 preceding siblings ...)
  2007-02-18  4:45 ` drepper at redhat dot com
@ 2010-06-01  3:30 ` pasky at suse dot cz
  7 siblings, 0 replies; 11+ messages in thread
From: pasky at suse dot cz @ 2010-06-01  3:30 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From pasky at suse dot cz  2010-06-01 03:30 -------
The adaptive mmap threshold should have fixed this; no response, so it's
probably safe to assume so.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED



* [Bug libc/1541] Poor threaded application performance when using malloc
       [not found] <bug-1541-131@http.sourceware.org/bugzilla/>
  2014-02-16 19:41 ` jackie.rosen at hushmail dot com
@ 2014-05-28 19:46 ` schwab at sourceware dot org
  1 sibling, 0 replies; 11+ messages in thread
From: schwab at sourceware dot org @ 2014-05-28 19:46 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=1541

Andreas Schwab <schwab at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|jackie.rosen at hushmail dot com   |


* [Bug libc/1541] Poor threaded application performance when using malloc
       [not found] <bug-1541-131@http.sourceware.org/bugzilla/>
@ 2014-02-16 19:41 ` jackie.rosen at hushmail dot com
  2014-05-28 19:46 ` schwab at sourceware dot org
  1 sibling, 0 replies; 11+ messages in thread
From: jackie.rosen at hushmail dot com @ 2014-02-16 19:41 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=1541

Jackie Rosen <jackie.rosen at hushmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jackie.rosen at hushmail dot com



end of thread, other threads:[~2014-05-28 19:45 UTC | newest]

Thread overview: 11+ messages
2005-10-25 14:11 [Bug libc/1541] New: Poor threaded application performance when using malloc sjmunroe at us dot ibm dot com
2005-10-25 15:32 ` [Bug libc/1541] " sjmunroe at us dot ibm dot com
2005-10-25 15:53 ` sjmunroe at us dot ibm dot com
2005-11-01  8:05 ` roland at gnu dot org
2005-11-01 17:12 ` sjmunroe at us dot ibm dot com
2005-11-01 17:19 ` sjmunroe at us dot ibm dot com
2005-11-01 17:27 ` sjmunroe at us dot ibm dot com
2007-02-18  4:45 ` drepper at redhat dot com
2010-06-01  3:30 ` pasky at suse dot cz
     [not found] <bug-1541-131@http.sourceware.org/bugzilla/>
2014-02-16 19:41 ` jackie.rosen at hushmail dot com
2014-05-28 19:46 ` schwab at sourceware dot org
