public inbox for cygwin-patches@cygwin.com
 help / color / mirror / Atom feed
* [PATCH] Cygwin: Speed up mkimport
@ 2020-11-26  9:56 Mark Geisert
  2020-11-26 10:07 ` Mark Geisert
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Mark Geisert @ 2020-11-26  9:56 UTC (permalink / raw)
  To: cygwin-patches

Cut mkimport elapsed time in half by forking each iteration of the two
time-consuming loops within.  Only do this if more than one CPU is
present.  In the second loop, combine the two 'objdump' calls into one
system() invocation to avoid a system() invocation per iteration.

---
 winsup/cygwin/mkimport | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/winsup/cygwin/mkimport b/winsup/cygwin/mkimport
index 2b08dfe3d..919dc305b 100755
--- a/winsup/cygwin/mkimport
+++ b/winsup/cygwin/mkimport
@@ -47,6 +47,9 @@ for my $sym (keys %replace) {
     $import{$fn} = $imp_sym;
 }
 
+my $ncpus = `grep -c ^processor /proc/cpuinfo`;
+my $forking = $ncpus > 1; # Decides if loops below should fork() each iteration
+
 for my $f (keys %text) {
     my $imp_sym = delete $import{$f};
     my $glob_sym = $text{$f};
@@ -56,25 +59,30 @@ for my $f (keys %text) {
 	$text{$f} = 0;
     } else {
 	$text{$f} = 1;
-	open my $as_fd, '|-', $as, '-o', "$dir/t-$f", "-";
-	if ($is64bit) {
-	    print $as_fd <<EOF;
+	if ($forking && fork) {
+	    ; # Testing shows sleep here is unneeded. 'as' runs very quickly.
+	} else {
+	    open my $as_fd, '|-', $as, '-o', "$dir/t-$f", "-";
+	    if ($is64bit) {
+		print $as_fd <<EOF;
 	.text
 	.extern	$imp_sym
 	.global	$glob_sym
 $glob_sym:
 	jmp	*$imp_sym(%rip)
 EOF
-	} else {
-	    print $as_fd <<EOF;
+	    } else {
+		print $as_fd <<EOF;
 	.text
 	.extern	$imp_sym
 	.global	$glob_sym
 $glob_sym:
 	jmp	*$imp_sym
 EOF
+	    }
+	    close $as_fd or exit 1;
+	    exit 0 if $forking;
 	}
-	close $as_fd or exit 1;
     }
 }
 
@@ -86,8 +94,18 @@ for my $f (keys %text) {
     if (!$text{$f}) {
 	unlink $f;
     } else {
-	system $objcopy, '-R', '.text', $f and exit 1;
-	system $objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;
+	if ($forking && fork) {
+	    # Testing shows parent does need to sleep a short time here,
+	    # otherwise system is inundated with hundreds of objcopy processes
+	    # and the forked perl processes that launched them.
+	    my $delay = 0.01; # NOTE: Slower systems may need to raise this
+	    select(undef, undef, undef, $delay); # Supports fractional seconds
+	} else {
+	    # Do two objcopy calls at once to avoid one system() call overhead
+	    system '(', $objcopy, '-R', '.text', $f, ')', '||',
+		$objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;
+	    exit 0 if $forking;
+	}
     }
 }
 
-- 
2.29.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-11-26  9:56 [PATCH] Cygwin: Speed up mkimport Mark Geisert
@ 2020-11-26 10:07 ` Mark Geisert
  2020-11-26 20:30 ` Achim Gratz
  2020-12-16 14:29 ` Jon Turney
  2 siblings, 0 replies; 10+ messages in thread
From: Mark Geisert @ 2020-11-26 10:07 UTC (permalink / raw)
  To: cygwin-patches

Previously, Mark Geisert wrote:
> Cut mkimport elapsed time in half by forking each iteration of the two
> time-consuming loops within.  Only do this if more than one CPU is
> present.  In the second loop, combine the two 'objdump' calls into one
                                                  ^^^^^^^
That should say objcopy.  The code is correct though.

..mark

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-11-26  9:56 [PATCH] Cygwin: Speed up mkimport Mark Geisert
  2020-11-26 10:07 ` Mark Geisert
@ 2020-11-26 20:30 ` Achim Gratz
  2020-11-27  9:56   ` Mark Geisert
  2020-12-16 14:29 ` Jon Turney
  2 siblings, 1 reply; 10+ messages in thread
From: Achim Gratz @ 2020-11-26 20:30 UTC (permalink / raw)
  To: cygwin-patches

Mark Geisert writes:
> +	    # Do two objcopy calls at once to avoid one system() call overhead
> +	    system '(', $objcopy, '-R', '.text', $f, ')', '||',
> +		$objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;

That doesn't do what you think it does.  It in fact increases the
overhead since it'll start a shell that runs those two commands sand
will even needlessly start the first objcopy in a subshell.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf rackAttack:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-11-26 20:30 ` Achim Gratz
@ 2020-11-27  9:56   ` Mark Geisert
  2020-11-27 18:37     ` Achim Gratz
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Geisert @ 2020-11-27  9:56 UTC (permalink / raw)
  To: cygwin-patches

Achim Gratz wrote:
> Mark Geisert writes:
>> +	    # Do two objcopy calls at once to avoid one system() call overhead
>> +	    system '(', $objcopy, '-R', '.text', $f, ')', '||',
>> +		$objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;
> 
> That doesn't do what you think it does.  It in fact increases the
> overhead since it'll start a shell that runs those two commands sand
> will even needlessly start the first objcopy in a subshell.

Still faster than two system commands :-).  But thanks for the comment; I thought 
I was merely grouping args, to get around Perl's greedy arg list building for the 
system command.  After more experimenting I ended up with:
             system '/bin/true', '||', $objcopy, '-R', '.text', $f, '||',
                 $objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;
Kind of ugly, but better?  It obviates the need for parent to pace itself so the 
enclosing loop runs a bit faster.

..mark

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-12-16 14:29 ` Jon Turney
@ 2020-11-27 10:07   ` Mark Geisert
  0 siblings, 0 replies; 10+ messages in thread
From: Mark Geisert @ 2020-11-27 10:07 UTC (permalink / raw)
  To: Cygwin Patches

Jon Turney wrote:
> On 26/11/2020 09:56, Mark Geisert wrote:
>> @@ -86,8 +94,18 @@ for my $f (keys %text) {
>>       if (!$text{$f}) {
>>       unlink $f;
>>       } else {
>> -    system $objcopy, '-R', '.text', $f and exit 1;
>> -    system $objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;
>> +    if ($forking && fork) {
>> +        # Testing shows parent does need to sleep a short time here,
>> +        # otherwise system is inundated with hundreds of objcopy processes
>> +        # and the forked perl processes that launched them.
>> +        my $delay = 0.01; # NOTE: Slower systems may need to raise this
>> +        select(undef, undef, undef, $delay); # Supports fractional seconds
>> +    } else {
>> +        # Do two objcopy calls at once to avoid one system() call overhead
>> +        system '(', $objcopy, '-R', '.text', $f, ')', '||',
>> +        $objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;
>> +        exit 0 if $forking;
>> +    }
>>       }
>>   }
> 
> Hmm... not so sure about this.  This seems racy, as nothing ensures that these 
> objcopies have finished before we combine all the produced .o files into a library.

Good point.  I've added a hash to track the forked pids, and after each of these 
two time-consuming loops finishes I loop over the pids list doing waitpid() on 
each pid.

> I'm pretty sure with more understanding, this whole thing could be done better:  
> For example, from a brief look, it seems that the t-*.o files are produced by gas, 
> and then we remove .bss and .data sections.  Could we not arrange to assemble 
> these objects without those sections in the first place?

I looked over as's options in its man page but could not see anything obvious.  I 
wonder if defining the sections explicitly as zero-length somehow in mkimport's 
assembler snippets would accomplish the same thing.  I'll try this next.

Note that mkimport operates both on those tiny object files it creates with as, 
but also on the object files created by the whole Cygwin build.  So adjusting the 
latter object files would need to be done somewhere else.
Thanks,

..mark


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-11-27  9:56   ` Mark Geisert
@ 2020-11-27 18:37     ` Achim Gratz
  2020-11-28  2:33       ` Brian Inglis
  2020-11-28 16:57       ` Achim Gratz
  0 siblings, 2 replies; 10+ messages in thread
From: Achim Gratz @ 2020-11-27 18:37 UTC (permalink / raw)
  To: cygwin-patches

Mark Geisert writes:
> Still faster than two system commands :-).  But thanks for the
> comment;

It still seems you are barking up the wrong tree.

> I thought I was merely grouping args, to get around Perl's
> greedy arg list building for the system command.

Wot?  It just takes a list which you can build any which way you desire.
The other option is to give it the full command line in a string, which
does work for this script (but not on Windows).  If it finds shell
metacharacters in the arguments it'll run a shell, otherwise the forked
perl just does an execve.

If it's really the forking that is causing the slowdown, why not do
either of those things:

a) Generate a complete shell script and fork once to run that.

b) Open up two pipes to an "xargs -P $ncpu/2 L 1 …" and feed in the file
names.

Getting the error codes back to the script and handling the error is
left as an exercise for the reader.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Q+, Q and microQ:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-11-27 18:37     ` Achim Gratz
@ 2020-11-28  2:33       ` Brian Inglis
  2020-11-28 16:57       ` Achim Gratz
  1 sibling, 0 replies; 10+ messages in thread
From: Brian Inglis @ 2020-11-28  2:33 UTC (permalink / raw)
  To: cygwin-patches

On 2020-11-27 11:37, Achim Gratz wrote:
> Mark Geisert writes:
>> Still faster than two system commands :-).  But thanks for the
>> comment;
> 
> It still seems you are barking up the wrong tree.
> 
>> I thought I was merely grouping args, to get around Perl's
>> greedy arg list building for the system command.
> 
> Wot?  It just takes a list which you can build any which way you desire.
> The other option is to give it the full command line in a string, which
> does work for this script (but not on Windows).  If it finds shell
> metacharacters in the arguments it'll run a shell, otherwise the forked
> perl just does an execve.
> 
> If it's really the forking that is causing the slowdown, why not do
> either of those things:
> 
> a) Generate a complete shell script and fork once to run that.
> 
> b) Open up two pipes to an "xargs -P $ncpu/2 L 1 …" and feed in the file
> names.
> 
> Getting the error codes back to the script and handling the error is
> left as an exercise for the reader.

Use explicit binary paths to avoid path search overhead; for portability: /bin/ 
for base system, dir, file, and net utils including compressors, grep, and sed; 
/usr/bin/ otherwise; {/usr,}/sbin/ for some admin utils not elsewhere.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-11-27 18:37     ` Achim Gratz
  2020-11-28  2:33       ` Brian Inglis
@ 2020-11-28 16:57       ` Achim Gratz
  2020-11-28 19:31         ` Achim Gratz
  1 sibling, 1 reply; 10+ messages in thread
From: Achim Gratz @ 2020-11-28 16:57 UTC (permalink / raw)
  To: cygwin-patches

Achim Gratz writes:
> b) Open up two pipes to an "xargs -P $ncpu/2 L 1 …" and feed in the file
> names.

That actually works, but the speedup is quite modest on my system
(4C/8T) even though I've allowed it to use unlimited resources.  So it
basically forks slower than the runtime for each of the invocations is.
Some more speedup can be had if the assembler is run on actual files in
the same way, but the best I've come up with goes from 93s to 47s and
runs at 150% CPU (up from 85%).  Most of that time is spent in system,
so forking and I/O.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-11-28 16:57       ` Achim Gratz
@ 2020-11-28 19:31         ` Achim Gratz
  0 siblings, 0 replies; 10+ messages in thread
From: Achim Gratz @ 2020-11-28 19:31 UTC (permalink / raw)
  To: cygwin-patches

Achim Gratz writes:
> That actually works, but the speedup is quite modest on my system
> (4C/8T) even though I've allowed it to use unlimited resources.  So it
> basically forks slower than the runtime for each of the invocations is.
> Some more speedup can be had if the assembler is run on actual files in
> the same way, but the best I've come up with goes from 93s to 47s and
> runs at 150% CPU (up from 85%).  Most of that time is spent in system,
> so forking and I/O.

Not that I really know what I'm doing, but creating a single .s file and
running as just once gets mkimport down to 21s / 110%.  Now the
resulting library doesn't actually link, because somehow the information
ends up in the wrong place…


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Terratec KOMPLEXER:
http://Synth.Stromeko.net/Downloads.html#KomplexerWaves

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Cygwin: Speed up mkimport
  2020-11-26  9:56 [PATCH] Cygwin: Speed up mkimport Mark Geisert
  2020-11-26 10:07 ` Mark Geisert
  2020-11-26 20:30 ` Achim Gratz
@ 2020-12-16 14:29 ` Jon Turney
  2020-11-27 10:07   ` Mark Geisert
  2 siblings, 1 reply; 10+ messages in thread
From: Jon Turney @ 2020-12-16 14:29 UTC (permalink / raw)
  To: Mark Geisert, Cygwin Patches

On 26/11/2020 09:56, Mark Geisert wrote:
> Cut mkimport elapsed time in half by forking each iteration of the two
> time-consuming loops within.  Only do this if more than one CPU is
> present.  In the second loop, combine the two 'objdump' calls into one
> system() invocation to avoid a system() invocation per iteration.

Nice.  Thanks for looking into this.

> @@ -86,8 +94,18 @@ for my $f (keys %text) {
>       if (!$text{$f}) {
>   	unlink $f;
>       } else {
> -	system $objcopy, '-R', '.text', $f and exit 1;
> -	system $objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;
> +	if ($forking && fork) {
> +	    # Testing shows parent does need to sleep a short time here,
> +	    # otherwise system is inundated with hundreds of objcopy processes
> +	    # and the forked perl processes that launched them.
> +	    my $delay = 0.01; # NOTE: Slower systems may need to raise this
> +	    select(undef, undef, undef, $delay); # Supports fractional seconds
> +	} else {
> +	    # Do two objcopy calls at once to avoid one system() call overhead
> +	    system '(', $objcopy, '-R', '.text', $f, ')', '||',
> +		$objcopy, '-R', '.bss', '-R', '.data', "t-$f" and exit 1;
> +	    exit 0 if $forking;
> +	}
>       }
>   }
>   

Hmm... not so sure about this.  This seems racy, as nothing ensures that 
these objcopies have finished before we combine all the produced .o 
files into a library.

I'm pretty sure with more understanding, this whole thing could be done 
better:  For example, from a brief look, it seems that the t-*.o files 
are produced by gas, and then we remove .bss and .data sections.  Could 
we not arrange to assemble these objects without those sections in the 
first place?

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-12-16 14:29 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-26  9:56 [PATCH] Cygwin: Speed up mkimport Mark Geisert
2020-11-26 10:07 ` Mark Geisert
2020-11-26 20:30 ` Achim Gratz
2020-11-27  9:56   ` Mark Geisert
2020-11-27 18:37     ` Achim Gratz
2020-11-28  2:33       ` Brian Inglis
2020-11-28 16:57       ` Achim Gratz
2020-11-28 19:31         ` Achim Gratz
2020-12-16 14:29 ` Jon Turney
2020-11-27 10:07   ` Mark Geisert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).