From: Mark Geisert <mark@maxrnd.com>
To: cygwin@cygwin.com
Subject: Re: Cygwin multithreading performance
Date: Sat, 21 Nov 2015 09:21:00 -0000
Message-ID: <5650379B.4030405@maxrnd.com>
In-Reply-To: <CABPLASTLrH_udLuu2F-m5P6dkENW1Z4YHEudp4NG0-FGLJgPMg@mail.gmail.com>
Kacper Michajlow wrote:
> Thanks for reply. And sorry for being not specific enough before. 'git
> gc' is a driver which runs various git command to do cleanup in
> repository. Though I'm mostly concerned about the code I linked.
> Instead of 'git gc' it is better to test directly 'git repack -a -f'
> and possibly on repository where it takes some time.
> 'git://sourceware.org/git/newlib-cygwin.git' is good test case.
> Although with bigger repositories performance hit is bigger, this is
> good example to see what's going on.
I appreciate the more specific info on how you experience the issue.
> I'm well aware that forking on windows is problematic, but I
> explicitly interested in parallelized part of execution. I don't care
> about forks, while this slows things down too, they are not used in
> compression process which is parallelized over the all cpu threads.
> Each command is indeed forked, but I'm only interested about
> pack-objects part hence the code I linked.
OK, we're on the same page now :).
> $ strace --mask=debug+syscall+thread -o git.strace git repack -a -f
> Counting objects: 156690, done.
> Delta compression using up to 12 threads.
> Compressing objects: 100% (154730/154730), done.
> Writing objects: 100% (156690/156690), done.
> Total 156690 (delta 123449), reused 33146 (delta 0)
>
> $ grep "fork(" git.strace
> 559 53728 [main] git 24340 fork: 24368 = fork()
> 465 54022 [main] git 24368 fork: 0 = fork()
>
> Only two forks were created, while during compression only 25% cpu was
> used (on big repo like linux kernel it doesn't exceed 8%). With native
> git the same workload easily uses 95-100% cpu and therefor is a lot
> faster.
I was able to reproduce your issue using a cloned newlib-cygwin repo.
On a 6-CPU machine I saw max 36% CPU utilization during the compression
phase. Process Explorer showed all 6 threads were getting CPU time (to
varying degrees) and when suspended they were always trying to acquire a
mutex. I'd like to run some more straces and perhaps investigate with
some other tools before saying more. This may take a while.
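
As a rough sanity check on the figures in this thread, the utilization
percentages can be converted into an equivalent number of fully busy CPUs.
This is only a sketch: it assumes the percentages are averaged over all
logical CPUs, and that the machine in the quoted report has 12 logical CPUs
(inferred from the "up to 12 threads" line, not stated outright). The
function name busy_cpus is made up for illustration.

```python
def busy_cpus(utilization_percent, logical_cpus):
    """Convert a whole-machine CPU percentage into an equivalent count of fully busy CPUs."""
    return utilization_percent / 100.0 * logical_cpus

# Figures quoted in this thread; the 12-CPU count is an assumption.
cases = [
    ("newlib-cygwin, 12 threads", 25, 12),
    ("linux kernel, 12 threads", 8, 12),
    ("newlib-cygwin, 6 CPUs", 36, 6),
]
for label, pct, cpus in cases:
    print(f"{label}: about {busy_cpus(pct, cpus):.2f} of {cpus} CPUs busy")
```

If those assumptions hold, the kernel-repo case works out to roughly one
CPU's worth of work in total, which would be consistent with the compression
threads running almost entirely one at a time.
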
What I've done so far is install the git-debuginfo and cygwin-debuginfo
packages so that I can convert hex RIP addresses to line numbers. I've
run the testcase under gdb so I can interrupt at random times and poke
around. The straces from this testcase are ginormous so I hope I can
figure out a better way to see why the compression threads aren't
CPU-bound like they should be. If you don't already know, 'strace
--help' shows the available mask values. The threads are each writing
to disk, so I wonder if there's some unintentional serialization going
on somewhere, but I don't know yet how I could verify that theory.
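
One possible way to make a very large strace more manageable is to reduce it
to a count of trace lines per thread and function, then look at which calls
dominate. The sketch below assumes every trace line has the same layout as
the two fork lines quoted above (delta, elapsed time, [thread], program, pid,
"function:"); lines that do not fit are skipped. The script name and the
summarize helper are hypothetical, not part of Cygwin or git.

```python
import re
import sys
from collections import Counter

# Assumed line layout, based on the two fork lines quoted above:
#   <delta> <elapsed> [<thread>] <program> <pid> <function>: <message>
LINE_RE = re.compile(
    r"^\s*\d+\s+\d+\s+\[(?P<thread>[^\]]*)\]\s+\S+\s+\d+\s+(?P<func>\S+?):(?:\s|$)"
)

def summarize(lines):
    """Count trace lines per (thread name, function) pair, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("thread"), match.group("func"))] += 1
    return counts

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python strace_summary.py git.strace
    with open(sys.argv[1], errors="replace") as trace:
        for (thread, func), n in summarize(trace).most_common(20):
            print(f"{n:8d}  [{thread}] {func}")
```

A count like this says nothing about why a thread is waiting, but it may
help narrow down which part of the trace is worth reading in full.
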
..mark