public inbox for cygwin-apps@cygwin.com
 help / color / mirror / Atom feed
* Optimising cygwin fork performance
@ 2020-12-16 12:13 Hamish McIntyre-Bhatty
  2020-12-16 17:36 ` Marco Atzeri
  0 siblings, 1 reply; 5+ messages in thread
From: Hamish McIntyre-Bhatty @ 2020-12-16 12:13 UTC (permalink / raw)
  To: cygwin-apps


[-- Attachment #1.1.1: Type: text/plain, Size: 710 bytes --]

Hi,

So I know it's been mentioned a lot that fork is slow on Cygwin, but
compared to other people's machines, eg when building, it seems way
slower for me.

First I'd like to know if there's a good way to measure this that anyone
has found, because I'm not sure how to measure it. If I print multiple
lines with echo in a script, I can see it printing maybe 2-3 a second -
it's very slow.

I think this might be because I'm using a Virtual Machine with
VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus,
with exceptions for the cygwin install folders (C:\cygwin64, C:\cygwin).

It might be nice if we could so some comparisons so I can figure out
what's wrong.

Hamish


[-- Attachment #1.1.2: 0x87B761FE07F548D6.asc --]
[-- Type: application/pgp-keys, Size: 3235 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Optimising cygwin fork performance
  2020-12-16 12:13 Optimising cygwin fork performance Hamish McIntyre-Bhatty
@ 2020-12-16 17:36 ` Marco Atzeri
  2020-12-16 18:47   ` Christian Franke
  2020-12-16 20:37   ` Brian Inglis
  0 siblings, 2 replies; 5+ messages in thread
From: Marco Atzeri @ 2020-12-16 17:36 UTC (permalink / raw)
  To: cygwin-apps



On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote:
> Hi,
> 
> So I know it's been mentioned a lot that fork is slow on Cygwin, but
> compared to other people's machines, eg when building, it seems way
> slower for me.
> 
> First I'd like to know if there's a good way to measure this that anyone
> has found, because I'm not sure how to measure it. If I print multiple
> lines with echo in a script, I can see it printing maybe 2-3 a second -
> it's very slow.
> 
> I think this might be because I'm using a Virtual Machine with
> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus,
> with exceptions for the cygwin install folders (C:\cygwin64, C:\cygwin).
> 
> It might be nice if we could so some comparisons so I can figure out
> what's wrong.
> 
> Hamish

Same AV here, W10 64bit (no VM), 2 year old Laptop

model name      : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
4 cores

https://github.com/mondalaci/fork-benchmark
it seems there is a aging effect

$ ./fork-benchmark.exe 1000
Forked, executed and destroyed 1000 processes in 39.928576 seconds.

$ ./fork-benchmark.exe 1000
Forked, executed and destroyed 1000 processes in 42.701295 seconds.

$ ./fork-benchmark.exe 1000
Forked, executed and destroyed 1000 processes in 49.890909 seconds.

$ ./fork-benchmark.exe 1000
Forked, executed and destroyed 1000 processes in 61.657031 seconds.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Optimising cygwin fork performance
  2020-12-16 17:36 ` Marco Atzeri
@ 2020-12-16 18:47   ` Christian Franke
  2020-12-16 20:37   ` Brian Inglis
  1 sibling, 0 replies; 5+ messages in thread
From: Christian Franke @ 2020-12-16 18:47 UTC (permalink / raw)
  To: cygwin-apps

Marco Atzeri via Cygwin-apps wrote:
>
> On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote:
>> Hi,
>>
>> So I know it's been mentioned a lot that fork is slow on Cygwin, but
>> compared to other people's machines, eg when building, it seems way
>> slower for me.
>>
>> First I'd like to know if there's a good way to measure this that anyone
>> has found, because I'm not sure how to measure it. If I print multiple
>> lines with echo in a script, I can see it printing maybe 2-3 a second -
>> it's very slow.
>>
>> I think this might be because I'm using a Virtual Machine with
>> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus,
>> with exceptions for the cygwin install folders (C:\cygwin64, C:\cygwin).
>>
>> It might be nice if we could so some comparisons so I can figure out
>> what's wrong.
>>
>> Hamish
>
> Same AV here, W10 64bit (no VM), 2 year old Laptop
>
> model name      : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
> 4 cores
>
> https://github.com/mondalaci/fork-benchmark
> it seems there is a aging effect

Result of power management? Could possibly be prevented if another minor 
cpu load is run in parallel.


>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 39.928576 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 42.701295 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 49.890909 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 61.657031 seconds.
>
>


9 year old PC, W10 64bit, no VM, Intel i7-2600K CPU @ 3.4GHz, 4 cores / 
8 threads

AV: Windows Defender
Cygwin x86_64

Protection on:

$ ./fork-benchmark-64 1000
Forked, executed and destroyed 1000 processes in 16.651101 seconds.

$ ./fork-benchmark-64 1000
Forked, executed and destroyed 1000 processes in 16.674107 seconds.

Protection off:

$ ./fork-benchmark-64 1000
Forked, executed and destroyed 1000 processes in 14.281071 seconds.

$ ./fork-benchmark-64 1000
Forked, executed and destroyed 1000 processes in 14.426482 seconds.


An alternative which could be run out-of-the-box is the good old 
'date(s) per second' benchmark. Its results are comparable:

$ while true; do date; done | uniq -c
...
      53 Wed Dec 16 19:23:26 CET 2020   <== Protection on
      54 Wed Dec 16 19:23:27 CET 2020
      56 Wed Dec 16 19:23:28 CET 2020
      55 Wed Dec 16 19:23:29 CET 2020
      56 Wed Dec 16 19:23:30 CET 2020
...
      52 Wed Dec 16 19:23:37 CET 2020
      54 Wed Dec 16 19:23:38 CET 2020
      56 Wed Dec 16 19:23:39 CET 2020
      63 Wed Dec 16 19:23:40 CET 2020   <== Protection off
      63 Wed Dec 16 19:23:41 CET 2020
      62 Wed Dec 16 19:23:42 CET 2020
      64 Wed Dec 16 19:23:43 CET 2020
...
      63 Wed Dec 16 19:23:51 CET 2020
      64 Wed Dec 16 19:23:52 CET 2020
      63 Wed Dec 16 19:23:53 CET 2020
      55 Wed Dec 16 19:23:54 CET 2020   <== Protection on
      48 Wed Dec 16 19:23:55 CET 2020
      53 Wed Dec 16 19:23:56 CET 2020
      54 Wed Dec 16 19:23:57 CET 2020
      54 Wed Dec 16 19:23:58 CET 2020
...


Cygwin x86 is somewhat slower:

Protection on:

$ ./fork-benchmark-32.exe 1000
Forked, executed and destroyed 1000 processes in 19.231766 seconds.

Protection off:

$ ./fork-benchmark-32.exe 1000
Forked, executed and destroyed 1000 processes in 17.107739 seconds.

Regards,
Christian


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Optimising cygwin fork performance
  2020-12-16 17:36 ` Marco Atzeri
  2020-12-16 18:47   ` Christian Franke
@ 2020-12-16 20:37   ` Brian Inglis
  2021-01-08 15:24     ` Hamish McIntyre-Bhatty
  1 sibling, 1 reply; 5+ messages in thread
From: Brian Inglis @ 2020-12-16 20:37 UTC (permalink / raw)
  To: cygwin-apps

[-- Attachment #1: Type: text/plain, Size: 3621 bytes --]

On 2020-12-16 10:36, Marco Atzeri via Cygwin-apps wrote:
> On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote:
>> So I know it's been mentioned a lot that fork is slow on Cygwin, but
>> compared to other people's machines, eg when building, it seems way
>> slower for me.
>>
>> First I'd like to know if there's a good way to measure this that anyone
>> has found, because I'm not sure how to measure it. If I print multiple
>> lines with echo in a script, I can see it printing maybe 2-3 a second -
>> it's very slow.
>>
>> I think this might be because I'm using a Virtual Machine with
>> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus,
>> with exceptions for the cygwin install folders (C:\cygwin64, C:\cygwin).
>>
>> It might be nice if we could so some comparisons so I can figure out
>> what's wrong.

Running strace on your forking executable should give you accurate numbers in 
the output, with some work to extract the relevant values.

> Same AV here, W10 64bit (no VM), 2 year old Laptop
> 
> model name      : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
> 4 cores
> 
> https://github.com/mondalaci/fork-benchmark
> it seems there is a aging effect
> 
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 39.928576 seconds.
> 
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 42.701295 seconds.
> 
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 49.890909 seconds.
> 
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 61.657031 seconds.

It's all part of your current process tree.
Running cygserver may help this with appropriate process settings:

$ grep -C1 'kern\.srv\..*_.*' /etc/cygserver.conf

# kern.srv.cleanup_threads: No. of cygserver threads used for cleanup tasks.
# Default: 2, Min: 1, Max: 16, command line option -c, --cleanup-threads
#kern.srv.cleanup_threads 2
kern.srv.cleanup_threads 16

# kern.srv.request_threads: No. of cygserver threads used to serve
#                           application requests.
# Default: 10, Min: 1, Max: 310, command line option -r, --request-threads
#kern.srv.request_threads 10
kern.srv.request_threads 310

# kern.srv.process_cache_size: No. of concurrent processes which can be handled
#                              by Cygserver concurrently.
# Default: 62, Min: 1, Max: 310, command line option -p, --process-cache
#kern.srv.process_cache_size 62
kern.srv.process_cache_size 310

$ ./fork-benchmark 1000
Forked, executed and destroyed 1000 processes in 33.397727 seconds.
$ ./fork-benchmark 1000
Forked, executed and destroyed 1000 processes in 34.70389 seconds.
$ ./fork-benchmark 1000
Forked, executed and destroyed 1000 processes in 34.186709 seconds.
$ ./fork-benchmark 1000
Forked, executed and destroyed 1000 processes in 33.65649 seconds.
$ sed -En '/^model name|^cpu MHz/p;/MHz/q' /proc/cpuinfo
model name      : AMD A10-9700 RADEON R7, 10 COMPUTE CORES 4C+6G
cpu MHz         : 3500.000

$ strace -o fork-benchmark.strace ./fork-benchmark 10
$ egrep '^--- Process [0-9]+ (crea|\(pid: [0-9]+\) exi)ted|dwProcessId 
[0-9]+|ExitProcess n 0x' fork-benchmark.strace > fork-benchmark.log
$ awk '/ [0-9]+! pinfo::exit: /{t+=$2};END{print t/10000"ms"}' fork-benchmark.log
34.1855ms

Faster CPUs, faster memory, bigger caches, SSD drive may help.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]

[-- Attachment #2: fork-benchmark.log --]
[-- Type: text/plain, Size: 4926 bytes --]

--- Process 9704 created
   65   18385 [main] fork-benchmark 13592 pinfo::thisproc: myself dwProcessId 9704
--- Process 6712 created
   79    7102 [main] fork-benchmark 13593 pinfo::thisproc: myself dwProcessId 6712
--- Process 11804 created
   75    5642 [main] fork-benchmark 13593 pinfo::thisproc: myself dwProcessId 11804
   51   34992 [main] fork-benchmark 13593! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   41   11124 [main] fork-benchmark 13593 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 11804 (pid: 13593) exited with status 0x0
--- Process 4176 created
   68    6571 [main] fork-benchmark 13594 pinfo::thisproc: myself dwProcessId 4176
--- Process 1924 created
   77    5786 [main] fork-benchmark 13594 pinfo::thisproc: myself dwProcessId 1924
   51   33582 [main] fork-benchmark 13594! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   40   10465 [main] fork-benchmark 13594 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 1924 (pid: 13594) exited with status 0x0
--- Process 10576 created
   64    6030 [main] fork-benchmark 13595 pinfo::thisproc: myself dwProcessId 10576
--- Process 6932 created
   76    5798 [main] fork-benchmark 13595 pinfo::thisproc: myself dwProcessId 6932
   42   31674 [main] fork-benchmark 13595! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   41   10602 [main] fork-benchmark 13595 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 6932 (pid: 13595) exited with status 0x0
--- Process 11184 created
   71    8542 [main] fork-benchmark 13596 pinfo::thisproc: myself dwProcessId 11184
--- Process 1544 created
   75    5640 [main] fork-benchmark 13596 pinfo::thisproc: myself dwProcessId 1544
   68   34454 [main] fork-benchmark 13596! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   53   10774 [main] fork-benchmark 13596 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 1544 (pid: 13596) exited with status 0x0
--- Process 8404 created
   68    7283 [main] fork-benchmark 13597 pinfo::thisproc: myself dwProcessId 8404
--- Process 1908 created
   73    5514 [main] fork-benchmark 13597 pinfo::thisproc: myself dwProcessId 1908
   45   34388 [main] fork-benchmark 13597! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   53   11023 [main] fork-benchmark 13597 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 1908 (pid: 13597) exited with status 0x0
--- Process 3724 created
  123    6348 [main] fork-benchmark 13598 pinfo::thisproc: myself dwProcessId 3724
--- Process 1280 created
  156    6800 [main] fork-benchmark 13598 pinfo::thisproc: myself dwProcessId 1280
   46   35177 [main] fork-benchmark 13598! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   50   12248 [main] fork-benchmark 13598 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 1280 (pid: 13598) exited with status 0x0
--- Process 7452 created
   79    6782 [main] fork-benchmark 13599 pinfo::thisproc: myself dwProcessId 7452
--- Process 11212 created
   89    7645 [main] fork-benchmark 13599 pinfo::thisproc: myself dwProcessId 11212
   42   36209 [main] fork-benchmark 13599! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   41   13435 [main] fork-benchmark 13599 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 11212 (pid: 13599) exited with status 0x0
--- Process 8836 created
   65    7997 [main] fork-benchmark 13600 pinfo::thisproc: myself dwProcessId 8836
--- Process 9160 created
  131    6518 [main] fork-benchmark 13600 pinfo::thisproc: myself dwProcessId 9160
   44   35142 [main] fork-benchmark 13600! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
  318   12253 [main] fork-benchmark 13600 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 9160 (pid: 13600) exited with status 0x0
--- Process 2884 created
   64    7104 [main] fork-benchmark 13601 pinfo::thisproc: myself dwProcessId 2884
--- Process 9388 created
   72    5373 [main] fork-benchmark 13601 pinfo::thisproc: myself dwProcessId 9388
   48   33496 [main] fork-benchmark 13601! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   63   10481 [main] fork-benchmark 13601 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 9388 (pid: 13601) exited with status 0x0
--- Process 6292 created
   68    7492 [main] fork-benchmark 13602 pinfo::thisproc: myself dwProcessId 6292
--- Process 7344 created
   72    5412 [main] fork-benchmark 13602 pinfo::thisproc: myself dwProcessId 7344
   42   32741 [main] fork-benchmark 13602! pinfo::exit: Calling ExitProcess n 0x4000000, exitcode 0x0
   41   10880 [main] fork-benchmark 13602 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 7344 (pid: 13602) exited with status 0x0
   39  612762 [main] fork-benchmark 13592 pinfo::exit: Calling ExitProcess n 0x0, exitcode 0x0
--- Process 9704 (pid: 13592) exited with status 0x0

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Optimising cygwin fork performance
  2020-12-16 20:37   ` Brian Inglis
@ 2021-01-08 15:24     ` Hamish McIntyre-Bhatty
  0 siblings, 0 replies; 5+ messages in thread
From: Hamish McIntyre-Bhatty @ 2021-01-08 15:24 UTC (permalink / raw)
  To: cygwin-apps


[-- Attachment #1.1.1: Type: text/plain, Size: 4139 bytes --]

On 16/12/2020 20:37, Brian Inglis wrote:
> On 2020-12-16 10:36, Marco Atzeri via Cygwin-apps wrote:
>> On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote:
>>> So I know it's been mentioned a lot that fork is slow on Cygwin, but
>>> compared to other people's machines, eg when building, it seems way
>>> slower for me.
>>>
>>> First I'd like to know if there's a good way to measure this that
>>> anyone
>>> has found, because I'm not sure how to measure it. If I print multiple
>>> lines with echo in a script, I can see it printing maybe 2-3 a second -
>>> it's very slow.
>>>
>>> I think this might be because I'm using a Virtual Machine with
>>> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivurus,
>>> with exceptions for the cygwin install folders (C:\cygwin64,
>>> C:\cygwin).
>>>
>>> It might be nice if we could so some comparisons so I can figure out
>>> what's wrong.
>
> Running strace on your forking executable should give you accurate
> numbers in the output, with some work to extract the relevant values.
>
>> Same AV here, W10 64bit (no VM), 2 year old Laptop
>>
>> model name      : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
>> 4 cores
>>
>> https://github.com/mondalaci/fork-benchmark
>> it seems there is a aging effect
>>
>> $ ./fork-benchmark.exe 1000
>> Forked, executed and destroyed 1000 processes in 39.928576 seconds.
>>
>> $ ./fork-benchmark.exe 1000
>> Forked, executed and destroyed 1000 processes in 42.701295 seconds.
>>
>> $ ./fork-benchmark.exe 1000
>> Forked, executed and destroyed 1000 processes in 49.890909 seconds.
>>
>> $ ./fork-benchmark.exe 1000
>> Forked, executed and destroyed 1000 processes in 61.657031 seconds.
>
> It's all part of your current process tree.
> Running cygserver may help this with appropriate process settings:
>
> $ grep -C1 'kern\.srv\..*_.*' /etc/cygserver.conf
>
> # kern.srv.cleanup_threads: No. of cygserver threads used for cleanup
> tasks.
> # Default: 2, Min: 1, Max: 16, command line option -c, --cleanup-threads
> #kern.srv.cleanup_threads 2
> kern.srv.cleanup_threads 16
>
> # kern.srv.request_threads: No. of cygserver threads used to serve
> #                           application requests.
> # Default: 10, Min: 1, Max: 310, command line option -r,
> --request-threads
> #kern.srv.request_threads 10
> kern.srv.request_threads 310
>
> # kern.srv.process_cache_size: No. of concurrent processes which can
> be handled
> #                              by Cygserver concurrently.
> # Default: 62, Min: 1, Max: 310, command line option -p, --process-cache
> #kern.srv.process_cache_size 62
> kern.srv.process_cache_size 310
>
> $ ./fork-benchmark 1000
> Forked, executed and destroyed 1000 processes in 33.397727 seconds.
> $ ./fork-benchmark 1000
> Forked, executed and destroyed 1000 processes in 34.70389 seconds.
> $ ./fork-benchmark 1000
> Forked, executed and destroyed 1000 processes in 34.186709 seconds.
> $ ./fork-benchmark 1000
> Forked, executed and destroyed 1000 processes in 33.65649 seconds.
> $ sed -En '/^model name|^cpu MHz/p;/MHz/q' /proc/cpuinfo
> model name      : AMD A10-9700 RADEON R7, 10 COMPUTE CORES 4C+6G
> cpu MHz         : 3500.000
>
> $ strace -o fork-benchmark.strace ./fork-benchmark 10
> $ egrep '^--- Process [0-9]+ (crea|\(pid: [0-9]+\) exi)ted|dwProcessId
> [0-9]+|ExitProcess n 0x' fork-benchmark.strace > fork-benchmark.log
> $ awk '/ [0-9]+! pinfo::exit: /{t+=$2};END{print t/10000"ms"}'
> fork-benchmark.log
> 34.1855ms
>
> Faster CPUs, faster memory, bigger caches, SSD drive may help.

Just running in KVM seems to have helped me.

My times are more like 13 seconds for 1000 forks now. There appears to
be drop-off, but after a big package build and some idling, it then gets
back down to 13 seconds.

Is there a way for me to check that the cygserver options are working?

Either way, my package builds are about twice as quick as they were
before now, possibly just from a reinstall and using KVM.

Hamish


[-- Attachment #1.1.2: 0x87B761FE07F548D6.asc --]
[-- Type: application/pgp-keys, Size: 3235 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-01-08 15:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-16 12:13 Optimising cygwin fork performance Hamish McIntyre-Bhatty
2020-12-16 17:36 ` Marco Atzeri
2020-12-16 18:47   ` Christian Franke
2020-12-16 20:37   ` Brian Inglis
2021-01-08 15:24     ` Hamish McIntyre-Bhatty

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).