public inbox for cygwin@cygwin.com
* wrong performance of malloc/free under multi-threading
@ 2013-02-26  6:36 MITSUNARI Shigeo
  2013-02-26  9:14 ` Corinna Vinschen
  0 siblings, 1 reply; 7+ messages in thread
From: MITSUNARI Shigeo @ 2013-02-26  6:36 UTC (permalink / raw)
  To: cygwin

Hi.

I found that malloc/free performs very poorly under multi-threading.
The following test program reproduces the problem.

The program repeatedly calls malloc and free from multiple threads.
I measured the timing on Cygwin and Linux.

timing(sec)|      threadNum
-----------+----------+-------------
           | 1        |     2
-----------+----------+-------------
Linux      | 1.45     |     0.69
-----------+----------+-------------
Cygwin     | 2.059    |    53.165
-----------+----------+-------------

The timing under Linux scales well, but under Cygwin it gets far worse with two threads.
Is this intentional behavior, or am I using pthreads incorrectly?
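
To put numbers on the scaling, here is a minimal sketch that turns the wall-clock times from the table into speedup ratios. The helper names `speedup` and `print_scaling` are made up for illustration and are not part of the test program. With the figures above, Linux comes out at roughly 2.1x, while Cygwin is well below 1x, that is, about 26 times slower with two threads than with one.

```c
#include <assert.h>
#include <stdio.h>

/* Speedup of a multi-threaded run relative to the 1-thread run.
   Values above 1.0 mean the threaded run finished sooner. */
double speedup(double t_single, double t_multi)
{
    assert(t_single > 0.0 && t_multi > 0.0);
    return t_single / t_multi;
}

/* Print the ratios for the wall-clock times reported in the table. */
void print_scaling(void)
{
    printf("Linux : %.2fx\n", speedup(1.45, 0.69));
    printf("Cygwin: %.3fx\n", speedup(2.059, 53.165));
}
```

The Cygwin run with two threads also shows user + sys time (about 31.8 s) well below twice the real time, which suggests the threads spend much of their time waiting rather than computing.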

env : Core i7-2600 + Windows 7 Ultimate SP1(64bit) + 8GiB memory

% gcc malloc-free-pthread.c -lpthread -Wall -Wextra -ansi -pedantic -O2 -m32

// results for Linux
% time ./a.out 1
threadNum=1, n=120000
begin=0, end=120000
end
1.432u 0.016s 0:01.45 99.3%     0+0k 0+0io 0pf+0w

% time ./a.out 2
threadNum=2, n=120000
begin=0, end=60000
begin=60000, end=120000
end
1.384u 0.000s 0:00.69 200.0%    0+0k 0+0io 0pf+0w

// results for Cygwin
// Anti-virus software was disabled while measuring.

$ time ./a.exe 1
threadNum=1, n=120000
begin=0, end=120000
end

real    0m2.059s
user    0m2.059s
sys     0m0.000s

$ time ./a.exe 2
threadNum=2, n=120000
begin=0, end=60000
begin=60000, end=120000
end

real    0m53.165s
user    0m11.949s
sys     0m19.812s

// Linux
% uname -a
Linux i3 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
% gcc -v
4.7

// Cygwin
% uname -mrsv
CYGWIN_NT-6.1-WOW64 1.7.17(0.262/5/3) 2012-10-19 14:39 i686
% gcc -dumpversion
4.5.3

// test code
% cat malloc-free-pthread.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* memset */
#include <pthread.h>

/* Allocate, touch, and free blocks of 1..99 bytes. */
void task(int idx)
{
    const int size = 100;
    int i;
    for (i = 1; i < size; i++) {
        char *p = (char*)malloc(i);
        if (p == NULL) {
            continue;
        }
        memset(p, idx, i); /* keep malloc/free from being optimized away */
        free(p);
    }
}


typedef struct {
    int begin;
    int end;
} Range;

void* run(void *arg)
{
    const Range *range = (const Range*)arg;
    int begin = range->begin;
    int end = range->end;
    printf("begin=%d, end=%d\n", begin, end);
    while (begin != end) {
        task(begin);
        begin++;
    }
    return NULL;
}

#define MAX_THREAD_NUM 4

int main(int argc, char *argv[])
{
    const int threadNum = argc == 1 ? 1 : atoi(argv[1]);
    const int n = 1 * 2 * 3 * 4 * 5000;
    if (threadNum < 0 || threadNum > MAX_THREAD_NUM) {
        printf("threadNum = 0, 1, 2, 3, 4\n");
        return 1;
    }
    printf("threadNum=%d, n=%d\n", threadNum, n);
    if (threadNum == 0) {
        Range range;
        puts("no thread");
        range.begin = 0;
        range.end = n;
        run(&range);
    } else {
        const int dn = n / threadNum;
        Range range[MAX_THREAD_NUM];
        pthread_t pt[MAX_THREAD_NUM];
        int i;
        for (i = 0; i < threadNum; i++) {
            range[i].begin = i * dn;
            range[i].end = (i + 1) * dn;
            if (pthread_create(&pt[i], NULL, run, &range[i]) != 0) {
                printf("ERR create %d\n", i);
                return 1;
            }
        }
        for (i = 0; i < threadNum; i++) {
            if (pthread_join(pt[i], NULL) != 0) {
                printf("ERR join %d\n", i);
                return 1;
            }
        }
    }
    puts("end");
    return 0;
}
---
Yours,
 MITSUNARI Shigeo


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: wrong performance of malloc/free under multi-threading
  2013-02-26  6:36 wrong performance of malloc/free under multi-threading MITSUNARI Shigeo
@ 2013-02-26  9:14 ` Corinna Vinschen
  2013-02-26 11:22   ` Chris J. Breisch
  0 siblings, 1 reply; 7+ messages in thread
From: Corinna Vinschen @ 2013-02-26  9:14 UTC (permalink / raw)
  To: cygwin

On Feb 26 15:35, MITSUNARI Shigeo wrote:
> Hi.
> 
> I found that the performance of malloc/free is wrong under multi-threading.
> The following test program reproduces the problem.
> 
> The program repeats malloc and free under multi-thread.
> I measured the timing on Cygwin and Linux.
> 
> timing(sec)|      threadNum
> -----------+----------+-------------
>            | 1        |     2
> -----------+----------+-------------
> Linux      | 1.45     |     0.69
> -----------+----------+-------------
> Cygwin     | 2.059    |    53.165
> -----------+----------+-------------
> 
> The timing under Linux seems good scale but it is very wrong under Cygwin.
> Is it intentional behavior or do I use pthread in bad way?

No, you're right.  This is easily reproducible.  I just had a look and
it seems that our malloc is really slow in multi-threading scenarios.
We're using Doug Lea's malloc unchanged with just additional locks
surrounding the underlying malloc/free calls.
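
A minimal sketch of that arrangement, assuming a single process-wide mutex. This is not the actual Cygwin code; `alloc_lock`, `locked_malloc`, and `locked_free` are hypothetical names, and the system malloc/free stand in for the underlying dlmalloc routines. It only illustrates why every thread ends up queueing on the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One process-wide lock: allocations from all threads serialize here. */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper around an allocator that is not thread-safe.
   The system malloc stands in for the underlying dlmalloc call. */
void *locked_malloc(size_t size)
{
    void *p;
    int rc = pthread_mutex_lock(&alloc_lock);
    assert(rc == 0);
    p = malloc(size);
    rc = pthread_mutex_unlock(&alloc_lock);
    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
    return p;
}

/* Same pattern for free: take the lock, call through, release. */
void locked_free(void *p)
{
    int rc = pthread_mutex_lock(&alloc_lock);
    assert(rc == 0);
    free(p);
    rc = pthread_mutex_unlock(&alloc_lock);
    assert(rc == 0);
    (void)rc;
}
```

With a test loop that does almost nothing except allocate and free, nearly all of the run time falls inside the locked region, so a second thread adds contention instead of throughput.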

This appears to be a serious performance problem.  I just learned that
glibc uses another version of dlmalloc, called ptmalloc, which is a
derived version of dlmalloc optimized for multi-threading environments.

Perhaps we have to do the same, but I don't know how long it takes to
port ptmalloc to Cygwin and obviously I don't know how big the
performance gain might be.
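
For illustration only, the general idea behind ptmalloc-style allocators, as commonly described, is to keep several independent heaps (arenas), each with its own lock, so that threads rarely wait on one another. The sketch below shows just the arena selection step under that assumption. It is not ptmalloc's real code; `ARENA_COUNT`, `arena_acquire`, and `arena_release` are hypothetical names, and the actual heap management is left out.

```c
#include <assert.h>
#include <pthread.h>

#define ARENA_COUNT 4 /* hypothetical fixed pool size */

/* One lock per arena instead of one lock for the whole allocator. */
static pthread_mutex_t arena_lock[ARENA_COUNT] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* Lock and return the index of an uncontended arena, starting the search
   at `hint` (for example, the arena this thread used last). If every
   arena is busy, block on the hinted one. */
int arena_acquire(int hint)
{
    int i, rc;
    assert(hint >= 0 && hint < ARENA_COUNT);
    for (i = 0; i < ARENA_COUNT; i++) {
        int idx = (hint + i) % ARENA_COUNT;
        if (pthread_mutex_trylock(&arena_lock[idx]) == 0) {
            return idx;
        }
    }
    rc = pthread_mutex_lock(&arena_lock[hint]);
    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
    return hint;
}

/* Release an arena previously returned by arena_acquire. */
void arena_release(int idx)
{
    assert(idx >= 0 && idx < ARENA_COUNT);
    pthread_mutex_unlock(&arena_lock[idx]);
}
```

A caller would allocate from the arena whose index is returned and release it afterwards. The cost of this design is that free memory held in one arena is not directly available to another, so total memory use can be higher than with a single heap.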


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: wrong performance of malloc/free under multi-threading
  2013-02-26  9:14 ` Corinna Vinschen
@ 2013-02-26 11:22   ` Chris J. Breisch
  2013-02-26 13:26     ` Corinna Vinschen
  0 siblings, 1 reply; 7+ messages in thread
From: Chris J. Breisch @ 2013-02-26 11:22 UTC (permalink / raw)
  To: cygwin

On 2/26/2013 4:14 AM, Corinna Vinschen wrote:
> No, you're right.  This is easily reproducible.  I just had a look and
> it seems that our malloc is really slow in multi-threading scenarios.
> We're using Doug Lea's malloc unchanged with just additional locks
> surrounding the underlying malloc/free calls.
>
> This appears to be a serious performance problem.  I just learned that
> glibc uses another version of dlmalloc, called ptmalloc, which is a
> derived version of dlmalloc optimized for multi-threading environments.
>
> Perhaps we have to do the same, but I don't know how long it takes to
> port ptmalloc to Cygwin and obviously I don't know how big the
> performance gain might be.
>
>
> Corinna
>

Does any host using newlib suffer from this problem, or is it exclusive 
to Cygwin?

Chris

-- 
In theory, there's no difference between theory and practice. In 
practice, there is.


* Re: wrong performance of malloc/free under multi-threading
  2013-02-26 11:22   ` Chris J. Breisch
@ 2013-02-26 13:26     ` Corinna Vinschen
  2013-02-26 14:13       ` jojelino
  0 siblings, 1 reply; 7+ messages in thread
From: Corinna Vinschen @ 2013-02-26 13:26 UTC (permalink / raw)
  To: cygwin

On Feb 26 06:22, Chris J. Breisch wrote:
> Does any host using newlib suffer from this problem, or is it
> exclusive to Cygwin?

This is Cygwin only.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: wrong performance of malloc/free under multi-threading
  2013-02-26 13:26     ` Corinna Vinschen
@ 2013-02-26 14:13       ` jojelino
  2013-02-26 18:25         ` Reini Urban
  2013-02-26 19:17         ` Christopher Faylor
  0 siblings, 2 replies; 7+ messages in thread
From: jojelino @ 2013-02-26 14:13 UTC (permalink / raw)
  To: cygwin

On 2013-02-26 PM 10:25, Corinna Vinschen wrote:
>
> This is Cygwin only.
>
>
> Corinna
>
Cygwin's malloc is not reentrant according to malloc_wrapper.cc, so let's
not expect performance like Linux or native Windows until someone has
plenty of time to resolve this issue.
-- 
Regards.



* Re: wrong performance of malloc/free under multi-threading
  2013-02-26 14:13       ` jojelino
@ 2013-02-26 18:25         ` Reini Urban
  2013-02-26 19:17         ` Christopher Faylor
  1 sibling, 0 replies; 7+ messages in thread
From: Reini Urban @ 2013-02-26 18:25 UTC (permalink / raw)
  To: cygwin

ptmalloc3 would be even better.
http://www.malloc.de/en/

-- 
Reini Urban
http://cpanel.net/   http://www.perl-compiler.org/


* Re: wrong performance of malloc/free under multi-threading
  2013-02-26 14:13       ` jojelino
  2013-02-26 18:25         ` Reini Urban
@ 2013-02-26 19:17         ` Christopher Faylor
  1 sibling, 0 replies; 7+ messages in thread
From: Christopher Faylor @ 2013-02-26 19:17 UTC (permalink / raw)
  To: cygwin

On Tue, Feb 26, 2013 at 11:13:24PM +0900, jojelino wrote:
>On 2013-02-26 PM 10:25, Corinna Vinschen wrote:
>>This is Cygwin only.
>>
>Cygwin's malloc is not reentrant according to malloc_wrapper.cc, so let's
>not expect performance like Linux or native Windows until someone has
>plenty of time to resolve this issue.

It's a fairly simple matter to drop in a version of malloc which is
truly multi-threaded, removing the locks in the malloc wrapper.
Figuring out which malloc is the best and ensuring that we haven't
broken anything which relies on the subtle behavior of dlmalloc
isn't.  Maybe that's what you're saying.

One gotcha is that some multi-threaded mallocs use memory profligately.
So, that's something that we have to be aware of when we change mallocs.

cgf

