public inbox for cygwin@cygwin.com
From: Corinna Vinschen <corinna-cygwin@cygwin.com>
To: cygwin@cygwin.com
Subject: Re: Cygwin multithreading performance
Date: Sat, 21 Nov 2015 10:53:00 -0000	[thread overview]
Message-ID: <20151121105301.GE2755@calimero.vinschen.de> (raw)
In-Reply-To: <5650379B.4030405@maxrnd.com>


On Nov 21 01:21, Mark Geisert wrote:
> Kacper Michajlow wrote:
> >Thanks for the reply, and sorry for not being specific enough before.
> >'git gc' is a driver which runs various git commands to clean up a
> >repository, though I'm mostly concerned about the code I linked.
> >Instead of 'git gc' it is better to test 'git repack -a -f' directly,
> >preferably on a repository where it takes some time.
> >'git://sourceware.org/git/newlib-cygwin.git' is a good test case.
> >Although the performance hit is bigger with bigger repositories, this
> >is a good example to see what's going on.
> 
> I appreciate the more specific info on how you experience the issue.
> 
> >I'm well aware that forking on Windows is problematic, but I'm
> >explicitly interested in the parallelized part of execution. I don't
> >care about forks; while they slow things down too, they are not used in
> >the compression process, which is parallelized over all the CPU threads.
> >Each command is indeed forked, but I'm only interested in the
> >pack-objects part, hence the code I linked.
> 
> OK, we're on the same page now :).
> 
> >$ strace --mask=debug+syscall+thread -o git.strace git repack -a -f
> >Counting objects: 156690, done.
> >Delta compression using up to 12 threads.
> >Compressing objects: 100% (154730/154730), done.
> >Writing objects: 100% (156690/156690), done.
> >Total 156690 (delta 123449), reused 33146 (delta 0)
> >
> >$ grep "fork(" git.strace
> >   559   53728 [main] git 24340 fork: 24368 = fork()
> >   465   54022 [main] git 24368 fork: 0 = fork()
> >
> >Only two forks were created, while during compression only 25% CPU was
> >used (on a big repo like the Linux kernel it doesn't exceed 8%). With
> >native git the same workload easily uses 95-100% CPU and is therefore a
> >lot faster.
> 
> I was able to reproduce your issue using a cloned newlib-cygwin repo. On a
> 6-CPU machine I saw max 36% CPU utilization during the compression phase.
> ProcessExplorer showed all 6 threads were getting CPU time (to varying
> degrees) and when suspended they were always trying to acquire a mutex.  I'd
> like to run some more straces and perhaps investigate with some other tools
> before saying more.  This may take a while.
> 
> What I've done so far is install the git-debuginfo and cygwin-debuginfo
> packages so that I can convert hex RIP addresses to line numbers.  I've run
> the testcase under gdb so I can interrupt at random times and poke around.
> The straces from this testcase are ginormous so I hope I can figure out a
> better way to see why the compression threads aren't CPU-bound like they
> should be.  If you don't already know, 'strace --help' shows the available
> mask values.  The threads are each writing to disk, so I wonder if there's
> some unintentional serialization going on somewhere, but I don't know yet
> how I could verify that theory.

If I'm allowed to make an educated guess, the big serializers in Cygwin
are probably the calls to malloc, calloc, realloc, and free.  We desperately
need a new malloc implementation better suited to multi-threading.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

