public inbox for cygwin@cygwin.com
* RE: Optimizing away "ReadFile" calls when Make calls stat()
@ 2001-02-14  4:46 Bernard Dautrevaux
  0 siblings, 0 replies; 50+ messages in thread
From: Bernard Dautrevaux @ 2001-02-14  4:46 UTC (permalink / raw)
  To: 'cygwin@cygwin.com'

> -----Original Message-----
> From: Christopher Faylor [ mailto:cgf@redhat.com ]
> Sent: Tuesday, February 13, 2001 11:29 PM
> To: cygwin@cygwin.com
> Subject: Re: Optimizing away "ReadFile" calls when Make calls stat()
> 
> 
> On Tue, Feb 13, 2001 at 05:13:49PM -0500, Puttkammer, Roman wrote:
> >
> >> -----Original Message-----
> >> From: jfaith@lineo.com [ mailto:jfaith@lineo.com ]
> >> ...
> >> script just did "make --version > /dev/null" one thousand times
> >> ...
> >> Linux: 3 sec.
> >> VMWare running Linux: 9 sec.
> >> DOS (batch file) 18 sec.
> >> Cygwin: 30 sec.
> >
> >AFAIK, fork() tends to be much slower on windows than on most unixes
> >such as solaris or linux.
> 
> There is no real fork on generic Win32.  Cygwin emulates the fork call and
> it is, as a result, very slow.
> 

AFAIK, cygwin is not the only one at fault here; the raw Win32 CreateProcess()
call is also quite slow. In our cross-development toolset we only use a
"spawn" call implemented directly on top of CreateProcess, and we see a more
than 10-times performance loss between "fork/exec" on Linux and "spawn" on
NT :-)
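
A minimal POSIX sketch of the two launch styles being compared, for
readers who want to time them on their own system. The helper names
run_spawn and run_fork_exec are made up for illustration; this is not the
toolset's code, and posix_spawn() merely stands in for a "spawn"-style
call.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Start argv[0] with posix_spawn() and wait for it.
   Returns the child's exit status, or -1 on failure. */
int run_spawn(char *const argv[])
{
	pid_t pid;
	int status;

	if (posix_spawn(&pid, argv[0], NULL, NULL, argv, environ) != 0)
		return -1;
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* The same job done the classic Unix way: fork(), then exec in the child. */
int run_fork_exec(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		execv(argv[0], argv);
		_exit(127);	/* only reached if execv() failed */
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling each helper in a loop of a thousand iterations and timing the loop
gives a rough per-process cost for each style.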

Regards,

	Bernard

--------------------------------------------
Bernard Dautrevaux
Microprocess Ingenierie
97 bis, rue de Colombes
92400 COURBEVOIE
FRANCE
Tel:	+33 (0) 1 47 68 80 80
Fax:	+33 (0) 1 47 88 97 85
e-mail:	dautrevaux@microprocess.com
		b.dautrevaux@usa.net
-------------------------------------------- 

--
Want to unsubscribe from this list?
Check out: http://cygwin.com/ml/#unsubscribe-simple


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-16  9:24 Bernard Dautrevaux
@ 2001-02-16 10:17 ` Christopher Faylor
  0 siblings, 0 replies; 50+ messages in thread
From: Christopher Faylor @ 2001-02-16 10:17 UTC (permalink / raw)
  To: 'cygwin@cygwin.com'

On Fri, Feb 16, 2001 at 06:06:53PM +0100, Bernard Dautrevaux wrote:
>>>Chris sent some email about this yesterday.  He's looking at the
>>>possibility of eliminating this problem without changing the API.
>>
>>Nice to see that *someone* is paying attention.
>
>
>I, for one, am more than just paying attention; I just wondered whether
>it was worth adding more traffic to an already quite long thread.
>
>I'm REALLY waiting to see your solution; I have put forward my own ideas
>on how to do that, but I trust you to have found a solution a lot better
>than mine :-)
>
>Thanks in advance for the good work,

You're welcome.  I had an idea on how to make this transparent a couple
of days ago and I've been struggling with the concept ever since.

I'm now in the "almost drifting off to sleep but then getting an idea
that proves to be fruitless in the morning but keeps me up all night
anyway" mode.

Yawn.

I'm still optimistic that I can do something, though.  It will require
recompilation of tools like make and bash, etc. but it should still be
ok.

cgf


* RE: Optimizing away "ReadFile" calls when Make calls stat()
@ 2001-02-16  9:24 Bernard Dautrevaux
  2001-02-16 10:17 ` Christopher Faylor
  0 siblings, 1 reply; 50+ messages in thread
From: Bernard Dautrevaux @ 2001-02-16  9:24 UTC (permalink / raw)
  To: 'cygwin@cygwin.com'

> -----Original Message-----
> From: Christopher Faylor [ mailto:cgf@redhat.com ]
> Sent: Friday, February 16, 2001 6:00 PM
> To: Cygwin-L
> Subject: Re: Optimizing away "ReadFile" calls when Make calls stat()
> 
> 
> On Fri, Feb 16, 2001 at 10:59:49AM -0500, Larry Hall (RFK Partners, Inc) wrote:
> >At 04:34 AM 2/16/2001, Warren Young wrote:
> >>"Charles S. Wilson" wrote:
> >> > 
> >> > If I were porting an old app from unix to cygwin, and wanted to tune
> >> > performance, I'd much rather do this:
> >>
> >>Both you and Jonathan have understood my intent perfectly.  
> >>
> >>Christopher, please do consider this proposal.  It's easy to implement
> >>-- probably just a few tweaks on Egor's patch -- and it makes it easy to
> >>gain performance with straightforward patches to affected programs. 
> >>It'd be nice if we can make Cygwin faster, but this proposal has an
> >>inherent advantage: the calling process _knows_ what it wants, whereas
> >>Cygwin can only guess or anticipate.
> >>
> >>Egor, Jonathan, maybe some benchmarks would help convince Christopher of
> >>the patch's utility.
> >
> >Chris sent some email about this yesterday.  He's looking at the possibility
> >of eliminating this problem without changing the API.
> 
> Nice to see that *someone* is paying attention.


I, for one, am more than just paying attention; I just wondered whether it
was worth adding more traffic to an already quite long thread.

I'm REALLY waiting to see your solution; I have put forward my own ideas on
how to do that, but I trust you to have found a solution a lot better than
mine :-)


Thanks in advance for the good work,

	Bernard
--------------------------------------------
Bernard Dautrevaux
Microprocess Ingenierie
97 bis, rue de Colombes
92400 COURBEVOIE
FRANCE
Tel:	+33 (0) 1 47 68 80 80
Fax:	+33 (0) 1 47 88 97 85
e-mail:	dautrevaux@microprocess.com
		b.dautrevaux@usa.net
-------------------------------------------- 


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-16  8:07                       ` Larry Hall (RFK Partners, Inc)
@ 2001-02-16  9:00                         ` Christopher Faylor
  0 siblings, 0 replies; 50+ messages in thread
From: Christopher Faylor @ 2001-02-16  9:00 UTC (permalink / raw)
  To: Cygwin-L

On Fri, Feb 16, 2001 at 10:59:49AM -0500, Larry Hall (RFK Partners, Inc) wrote:
>At 04:34 AM 2/16/2001, Warren Young wrote:
>>"Charles S. Wilson" wrote:
>> > 
>> > If I were porting an old app from unix to cygwin, and wanted to tune
>> > performance, I'd much rather do this:
>>
>>Both you and Jonathan have understood my intent perfectly.  
>>
>>Christopher, please do consider this proposal.  It's easy to implement
>>-- probably just a few tweaks on Egor's patch -- and it makes it easy to
>>gain performance with straightforward patches to affected programs. 
>>It'd be nice if we can make Cygwin faster, but this proposal has an
>>inherent advantage: the calling process _knows_ what it wants, whereas
>>Cygwin can only guess or anticipate.
>>
>>Egor, Jonathan, maybe some benchmarks would help convince Christopher of
>>the patch's utility.
>
>Chris sent some email about this yesterday.  He's looking at the possibility
>of eliminating this problem without changing the API.

Nice to see that *someone* is paying attention.

cgf


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-16  1:34                     ` Warren Young
@ 2001-02-16  8:07                       ` Larry Hall (RFK Partners, Inc)
  2001-02-16  9:00                         ` Christopher Faylor
  0 siblings, 1 reply; 50+ messages in thread
From: Larry Hall (RFK Partners, Inc) @ 2001-02-16  8:07 UTC (permalink / raw)
  To: Warren Young, Cygwin-L

At 04:34 AM 2/16/2001, Warren Young wrote:
>"Charles S. Wilson" wrote:
> > 
> > If I were porting an old app from unix to cygwin, and wanted to tune
> > performance, I'd much rather do this:
>
>Both you and Jonathan have understood my intent perfectly.  
>
>Christopher, please do consider this proposal.  It's easy to implement
>-- probably just a few tweaks on Egor's patch -- and it makes it easy to
>gain performance with straightforward patches to affected programs. 
>It'd be nice if we can make Cygwin faster, but this proposal has an
>inherent advantage: the calling process _knows_ what it wants, whereas
>Cygwin can only guess or anticipate.
>
>Egor, Jonathan, maybe some benchmarks would help convince Christopher of
>the patch's utility.



Chris sent some email about this yesterday.  He's looking at the possibility
of eliminating this problem without changing the API.



Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      http://www.rfk.com
118 Washington Street                   (508) 893-9779 - RFK Office
Holliston, MA 01746                     (508) 893-9889 - FAX




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-15 14:17                   ` Charles S. Wilson
@ 2001-02-16  1:34                     ` Warren Young
  2001-02-16  8:07                       ` Larry Hall (RFK Partners, Inc)
  0 siblings, 1 reply; 50+ messages in thread
From: Warren Young @ 2001-02-16  1:34 UTC (permalink / raw)
  To: Cygwin-L

"Charles S. Wilson" wrote:
> 
> If I were porting an old app from unix to cygwin, and wanted to tune
> performance, I'd much rather do this:

Both you and Jonathan have understood my intent perfectly.  

Christopher, please do consider this proposal.  It's easy to implement
-- probably just a few tweaks on Egor's patch -- and it makes it easy to
gain performance with straightforward patches to affected programs. 
It'd be nice if we can make Cygwin faster, but this proposal has an
inherent advantage: the calling process _knows_ what it wants, whereas
Cygwin can only guess or anticipate.

Egor, Jonathan, maybe some benchmarks would help convince Christopher of
the patch's utility.
--                                                   _
= 'Net Address: http://www.cyberport.com/~tangent | / \  ASCII Ribbon
= ICBM Address: 36.82740N, 108.02040W, alt. 1714m | \ /  Campaign
=                                                 |  X   Against
= Chance favors the prepared mind.                | / \  HTML Mail


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-16  1:14                 ` Egor Duda
@ 2001-02-16  1:29                   ` Warren Young
  0 siblings, 0 replies; 50+ messages in thread
From: Warren Young @ 2001-02-16  1:29 UTC (permalink / raw)
  To: Egor Duda

Egor Duda wrote:
> 
> WY> If this design is used, stat_lite() would be a misleading name.
> 
> sure.  inventing the name was the hardest part when i was implementing
> this function :)

Further proposals:

	sub_stat  -- my favorite, a play on "subset"  :)
	win_stat  -- not win32_stat; win64_stat would follow shortly
	partial_stat  -- STL-like

--                                                   _
= 'Net Address: http://www.cyberport.com/~tangent | / \  ASCII Ribbon
= ICBM Address: 36.82740N, 108.02040W, alt. 1714m | \ /  Campaign
=                                                 |  X   Against
= Chance favors the prepared mind.                | / \  HTML Mail


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-15 11:47               ` Warren Young
  2001-02-15 13:14                 ` Larry Hall (RFK Partners, Inc)
@ 2001-02-16  1:14                 ` Egor Duda
  2001-02-16  1:29                   ` Warren Young
  1 sibling, 1 reply; 50+ messages in thread
From: Egor Duda @ 2001-02-16  1:14 UTC (permalink / raw)
  To: Warren Young; +Cc: cygwin

Hi!

Thursday, 15 February, 2001 Warren Young warren@etr-usa.com wrote:

WY> If this design is used, stat_lite() would be a misleading name.

sure.  inventing the name was the hardest part when i was implementing
this function :)

WY> How about stat_select(), since it would behave like select(2)?

well, i would take the word "select" as an implication that this function
will block and wait for something to happen. "stat_selective" would be
more intuitive, but it goes against the custom of not using adjectives
in posix function names.

Egor.            mailto:deo@logos-m.ru ICQ 5165414 FidoNet 2:5020/496.19




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-15 13:14                 ` Larry Hall (RFK Partners, Inc)
  2001-02-15 14:17                   ` Christopher Faylor
  2001-02-15 14:17                   ` Charles S. Wilson
@ 2001-02-15 14:19                   ` Jonathan Kamens
  2 siblings, 0 replies; 50+ messages in thread
From: Jonathan Kamens @ 2001-02-15 14:19 UTC (permalink / raw)
  To: cygwin

>  Date: Thu, 15 Feb 2001 16:09:46 -0500
>  From: "Larry Hall (RFK Partners, Inc)" <lhall@rfk.com>
>  
>  >Example code:
>  >
>  >         struct stat st;
>  >         stat_select("foo", &st, ST_MODE | ST_MTIME);
>  >
>  >Cygwin.DLL:
>  >         int stat(const char* file, struct stat* pst) 
>  >         {
>  >                 return stat_select(file, pst, 0xFFFFFFFF);
>  >         }
>  
>  
>  Sure, this is an idea for new, Cygwin specific code.  I'm not quite 
>  sure why someone who was writing new code (or changing old) wouldn't just
>  use Win32 APIs to accomplish the required task though.

Because all you would have to do to modify your code to work with
Cygwin is to use stat_select with the correct bitmask.  You wouldn't
have to convert any of the information returned by the Win32 APIs into
the format the rest of your code expects, and in fact you could just
keep using the same "struct stat" as you were using before.

In other words, something like this is a lot simpler than writing a
lot of Win32 API calls in your code:

        struct stat st;
        #ifdef __CYGWIN__
        stat_select("foo", &st, <whatever>);
        #else
        stat("foo", &st);
        #endif

  jik


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-15 13:14                 ` Larry Hall (RFK Partners, Inc)
  2001-02-15 14:17                   ` Christopher Faylor
@ 2001-02-15 14:17                   ` Charles S. Wilson
  2001-02-16  1:34                     ` Warren Young
  2001-02-15 14:19                   ` Jonathan Kamens
  2 siblings, 1 reply; 50+ messages in thread
From: Charles S. Wilson @ 2001-02-15 14:17 UTC (permalink / raw)
  To: Larry Hall (RFK Partners, Inc); +Cc: cygwin

"Larry Hall (RFK Partners, Inc)" wrote:
>
> Sure, this is an idea for new, Cygwin specific code.  I'm not quite
> sure why someone who was writing new code (or changing old) wouldn't just
> use Win32 APIs to accomplish the required task though.  

If I were porting an old app from unix to cygwin, and wanted to tune
performance, I'd much rather do this:

#ifdef __CYGWIN__
	stat_select("foo", &st, ST_MODE | ST_MTIME);
#else
	stat("foo", &st);
#endif

Than this:

#ifdef __CYGWIN__
	lots of ugly Windows stuff here
	populate a fake st structure (if I don't do
		this, then I need lots more #ifdef 
		blocks in other parts of the code
		that references st->mode, etc.)
#else
	stat("foo", &st);
#endif

> I mean, so long as
> the change results in non-portable code, why not pick the less specific
> Win32 APIs (over some Cygwin-specific alternative)?  

I might not want to #include windows.h (which *REQUIRES* me to also
#define _WIN32 or compile with 'gcc -mwin32'.)  I might not want to
collude my native-win32-specific stuff with cygwin-specific stuff (which
will happen if I'm forced to #define _WIN32) 

> Still, if you want to
> implement such a patch and submit it, I'm sure it will get some thoughtful
> consideration.

I certainly hope so.

--Chuck


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-15 13:14                 ` Larry Hall (RFK Partners, Inc)
@ 2001-02-15 14:17                   ` Christopher Faylor
  2001-02-15 14:17                   ` Charles S. Wilson
  2001-02-15 14:19                   ` Jonathan Kamens
  2 siblings, 0 replies; 50+ messages in thread
From: Christopher Faylor @ 2001-02-15 14:17 UTC (permalink / raw)
  To: Egor Duda

On Thu, Feb 15, 2001 at 04:09:46PM -0500, Larry Hall (RFK Partners, Inc) wrote:
>At 02:47 PM 2/15/2001, Warren Young wrote:
>>Egor Duda wrote:
>> > 
>> > the only problem with this approach i can see is that if we introduce a
>> > new API and applications start to use it we become "bound" to it and
>> > it won't be too easy to deprecate and remove it afterwards. OTOH, we
>> > can always make stat_lite() a simple wrapper to stat() if the latter
>> > becomes fast enough.
>>
>>I like the idea of stat_lite(), and I don't see a reason to ever
>>deprecate it: it's simply a fact that stat() is a bad interface to Win32
>>functionality.  It exposes a Unix filesystem's inode element, and
>>therefore makes programs dependent on it.  To eliminate the need for a
>>stat_lite(), you'd have to redesign Win32, which is out of our hands.
>>
>>Here's how I think stat_lite() should be designed: give it one extra
>>parameter, a bitfield, over regular stat().  This declares what fields
>>are important to the caller.  
>>
>>All the code in the DLL's current stat() implementation would move to
>>stat_lite().  Then add 'if's checking the bitfield flags before making
>>Win32 calls to look up field values.  The DLL's stat() implementation
>>then becomes a wrapper around stat_lite(): it just sets all the bitfield
>>flags, telling it to look up every field value.
>>
>>If this design is used, stat_lite() would be a misleading name.  How
>>about stat_select(), since it would behave like select(2)?
>>
>>Example code:
>>
>>         struct stat st;
>>         stat_select("foo", &st, ST_MODE | ST_MTIME);
>>
>>Cygwin.DLL:
>>         int stat(const char* file, struct stat* pst) 
>>         {
>>                 return stat_select(file, pst, 0xFFFFFFFF);
>>         }
>
>
>
>Sure, this is an idea for new, Cygwin specific code.  I'm not quite 
>sure why someone who was writing new code (or changing old) wouldn't just
>use Win32 APIs to accomplish the required task though.  I mean, so long as 
>the change results in non-portable code, why not pick the less specific 
>Win32 APIs (over some Cygwin-specific alternative)?  Still, if you want to 
>implement such a patch and submit it, I'm sure it will get some thoughtful 
>consideration.

Egor has already submitted a patch like this and I've been mulling it
over for... a year maybe?

I think I can actually change Cygwin to do the right thing here without
any new API changes.

Stay tuned.

cgf


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-15 11:47               ` Warren Young
@ 2001-02-15 13:14                 ` Larry Hall (RFK Partners, Inc)
  2001-02-15 14:17                   ` Christopher Faylor
                                     ` (2 more replies)
  2001-02-16  1:14                 ` Egor Duda
  1 sibling, 3 replies; 50+ messages in thread
From: Larry Hall (RFK Partners, Inc) @ 2001-02-15 13:14 UTC (permalink / raw)
  To: Warren Young, Egor Duda

At 02:47 PM 2/15/2001, Warren Young wrote:
>Egor Duda wrote:
> > 
> > the only problem with this approach i can see is that if we introduce a
> > new API and applications start to use it we become "bound" to it and
> > it won't be too easy to deprecate and remove it afterwards. OTOH, we
> > can always make stat_lite() a simple wrapper to stat() if the latter
> > becomes fast enough.
>
>I like the idea of stat_lite(), and I don't see a reason to ever
>deprecate it: it's simply a fact that stat() is a bad interface to Win32
>functionality.  It exposes a Unix filesystem's inode element, and
>therefore makes programs dependent on it.  To eliminate the need for a
>stat_lite(), you'd have to redesign Win32, which is out of our hands.
>
>Here's how I think stat_lite() should be designed: give it one extra
>parameter, a bitfield, over regular stat().  This declares what fields
>are important to the caller.  
>
>All the code in the DLL's current stat() implementation would move to
>stat_lite().  Then add 'if's checking the bitfield flags before making
>Win32 calls to look up field values.  The DLL's stat() implementation
>then becomes a wrapper around stat_lite(): it just sets all the bitfield
>flags, telling it to look up every field value.
>
>If this design is used, stat_lite() would be a misleading name.  How
>about stat_select(), since it would behave like select(2)?
>
>Example code:
>
>         struct stat st;
>         stat_select("foo", &st, ST_MODE | ST_MTIME);
>
>Cygwin.DLL:
>         int stat(const char* file, struct stat* pst) 
>         {
>                 return stat_select(file, pst, 0xFFFFFFFF);
>         }



Sure, this is an idea for new, Cygwin specific code.  I'm not quite 
sure why someone who was writing new code (or changing old) wouldn't just
use Win32 APIs to accomplish the required task though.  I mean, so long as 
the change results in non-portable code, why not pick the less specific 
Win32 APIs (over some Cygwin-specific alternative)?  Still, if you want to 
implement such a patch and submit it, I'm sure it will get some thoughtful 
consideration.



Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      http://www.rfk.com
118 Washington Street                   (508) 893-9779 - RFK Office
Holliston, MA 01746                     (508) 893-9889 - FAX




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-14  0:12             ` Egor Duda
  2001-02-14  0:17               ` Robert Collins
@ 2001-02-15 11:47               ` Warren Young
  2001-02-15 13:14                 ` Larry Hall (RFK Partners, Inc)
  2001-02-16  1:14                 ` Egor Duda
  1 sibling, 2 replies; 50+ messages in thread
From: Warren Young @ 2001-02-15 11:47 UTC (permalink / raw)
  To: Egor Duda

Egor Duda wrote:
> 
> the only problem with this approach i can see is that if we introduce a
> new API and applications start to use it we become "bound" to it and
> it won't be too easy to deprecate and remove it afterwards. OTOH, we
> can always make stat_lite() a simple wrapper to stat() if the latter
> becomes fast enough.

I like the idea of stat_lite(), and I don't see a reason to ever
deprecate it: it's simply a fact that stat() is a bad interface to Win32
functionality.  It exposes a Unix filesystem's inode element, and
therefore makes programs dependent on it.  To eliminate the need for a
stat_lite(), you'd have to redesign Win32, which is out of our hands.

Here's how I think stat_lite() should be designed: give it one extra
parameter, a bitfield, over regular stat().  This declares what fields
are important to the caller.  

All the code in the DLL's current stat() implementation would move to
stat_lite().  Then add 'if's checking the bitfield flags before making
Win32 calls to look up field values.  The DLL's stat() implementation
then becomes a wrapper around stat_lite(): it just sets all the bitfield
flags, telling it to look up every field value.

If this design is used, stat_lite() would be a misleading name.  How
about stat_select(), since it would behave like select(2)?

Example code:

	struct stat st;
	stat_select("foo", &st, ST_MODE | ST_MTIME);

Cygwin.DLL:
	int stat(const char* file, struct stat* pst) 
	{
		return stat_select(file, pst, 0xFFFFFFFF);
	}
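
A minimal sketch of the design described above, written against plain
POSIX so it can be compiled anywhere. ST_MODE and ST_MTIME are the names
from the example; the bit values and ST_SIZE are assumptions made up for
illustration, and the host's own stat() stands in for the individual Win32
lookups that each 'if' would guard inside the DLL.

```c
#include <string.h>
#include <sys/stat.h>

/* Field-selection flags. The bit values are arbitrary choices. */
#define ST_MODE  0x01u
#define ST_MTIME 0x02u
#define ST_SIZE  0x04u

/* Fill in only the fields named in `fields`; everything else is zeroed.
   Each "lookup" here is just a copy from the host's stat() result. */
int stat_select(const char *file, struct stat *pst, unsigned fields)
{
	struct stat full;

	if (stat(file, &full) != 0)
		return -1;
	memset(pst, 0, sizeof *pst);
	if (fields & ST_MODE)
		pst->st_mode = full.st_mode;
	if (fields & ST_MTIME)
		pst->st_mtime = full.st_mtime;
	if (fields & ST_SIZE)
		pst->st_size = full.st_size;
	return 0;
}
```

A caller that only needs the timestamp passes ST_MTIME alone and must not
rely on any other field of the result.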

--                                                   _
= 'Net Address: http://www.cyberport.com/~tangent | / \  ASCII Ribbon
= ICBM Address: 36.82740N, 108.02040W, alt. 1714m | \ /  Campaign
=                                                 |  X   Against
= Chance favors the prepared mind.                | / \  HTML Mail


* RE: Optimizing away "ReadFile" calls when Make calls stat()
@ 2001-02-14  2:41 Bernard Dautrevaux
  0 siblings, 0 replies; 50+ messages in thread
From: Bernard Dautrevaux @ 2001-02-14  2:41 UTC (permalink / raw)
  To: 'DJ Delorie', jik-cygwin; +Cc: cygwin

> -----Original Message-----
> From: DJ Delorie [ mailto:dj@delorie.com ]
> Sent: Tuesday, February 13, 2001 8:54 PM
> To: jik-cygwin@curl.com
> Cc: cygwin@cygwin.com
> Subject: Re: Optimizing away "ReadFile" calls when Make calls stat()
> 
> 
> 
> > As I've noted separately, reading tens of thousands of files even once
> > incurs a significant performance penalty.
> 
> True, but reading them all once is better than reading them all twice.
> I'm trying to break the problem down into small enough changes that we
> actually have a chance of implementing them.
> 
> > The change I've proposed can eliminate reading them at all.
> 
> But not in a way that we can make it the default.  Perhaps you could
> propose a set of mount flags to optimize common situations?  We
> already have one to avoid the read-for-execute test, perhaps you could
> work on an assume-no-symlinks flag?  Then we wouldn't need a custom
> make.exe (or any other program).
> 
> > But it does nothing at all for the "usual case" I'm trying to
> > optimize, which is Make stat()ing a file but never reading it.
> 
> It does, because stat() reads the file twice, once to see if it's a
> symlink, and once to see if the executable bit needs to be set.
> 
> > >  These should be easier wins (thus, more doable) than a global cache,
> > >  which NT should be providing itself as part of the disk cache
> > >  subsystem (for local drives, at least).  I don't think it's
> > >  appropriate for cygwin to go beyond this anyway - too many race
> > >  conditions arise.
> > 
> > As far as I know, there are no race conditions in the change I
> > suggested.  In fact, it *removes* race conditions, since it reduces
> > the number of distinct OS operations that must be performed on a file
> > during stat().
> 
> Right, but others were suggesting a global cache of file bytes.
> *That* would introduce race conditions.
> 

Perhaps a solution would be to maintain what could be called a "partial"
stat() cache: maintain a global cache of ALL the results of the ReadFile()s
(which I think can easily be reduced to one) together with the
last-modified time.
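
As a rough illustration of how a single read could answer both questions
(symlink? executable?), here is a sketch that classifies a file from its
first bytes. The function and flag names are hypothetical, and the
"!<symlink>" cookie is an assumption about the on-disk symlink format, not
something taken from the Cygwin sources.

```c
#include <stddef.h>
#include <string.h>

#define HDR_SYMLINK 0x1u	/* starts with the symlink cookie */
#define HDR_EXEC    0x2u	/* "#!" script or "MZ" executable image */

/* Classify a file from the first `len` bytes of its content, so that
   one read serves both the symlink test and the executable test. */
unsigned classify_header(const char *buf, size_t len)
{
	static const char cookie[] = "!<symlink>";
	size_t clen = sizeof cookie - 1;
	unsigned flags = 0;

	if (len >= clen && memcmp(buf, cookie, clen) == 0)
		flags |= HDR_SYMLINK;
	if (len >= 2 && ((buf[0] == '#' && buf[1] == '!') ||
			 (buf[0] == 'M' && buf[1] == 'Z')))
		flags |= HDR_EXEC;
	return flags;
}
```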

stat() will then ALWAYS check the last-modified time (LMT) of the ACTUAL
file, then check the cache; if the cache is up to date, it returns the
execute/symlink flags found there. If the cache entry is stale or absent,
it just re-reads the file's content and saves the LMT/exec/symlink values
in the cache.

The only race condition will be when UPDATING the cache (no problem on
reading if we first change exec/symlink then update LMT); this should be
simple to handle.
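
A minimal sketch of such a cache, under several assumptions: it is a
fixed-size, process-local table rather than a truly global one, it does no
locking, and all names (cached_flags, demo_read_flags, the slot count) are
invented for illustration. The caller always supplies the file's current
modification time; the expensive read runs only on a miss or when the
cached timestamp is stale.

```c
#include <string.h>
#include <time.h>

#define CACHE_SLOTS 64
#define PATH_LEN    260

struct cache_entry {
	char     path[PATH_LEN];
	time_t   mtime;	/* modification time the flags were computed for */
	unsigned flags;	/* exec/symlink bits from the last read */
	int      used;
};

static struct cache_entry cache[CACHE_SLOTS];

/* Simple string hash to pick a slot; collisions just overwrite. */
static unsigned slot_for(const char *path)
{
	unsigned h = 5381;

	while (*path)
		h = h * 33u + (unsigned char)*path++;
	return h % CACHE_SLOTS;
}

/* Return the flags for `path`, calling `read_flags` only when needed. */
unsigned cached_flags(const char *path, time_t mtime,
		      unsigned (*read_flags)(const char *))
{
	struct cache_entry *e = &cache[slot_for(path)];

	if (e->used && e->mtime == mtime && strcmp(e->path, path) == 0)
		return e->flags;
	if (strlen(path) >= PATH_LEN)
		return read_flags(path);	/* too long to cache */
	e->flags = read_flags(path);	/* flags first ... */
	strcpy(e->path, path);
	e->mtime = mtime;		/* ... timestamp last */
	e->used = 1;
	return e->flags;
}

/* Stand-in for the expensive file read; counts how often it runs. */
int reads_done;

unsigned demo_read_flags(const char *path)
{
	(void)path;
	reads_done++;
	return 0x2u;
}
```

With demo_read_flags as the reader, two lookups of the same path with the
same timestamp perform one read; changing the timestamp forces a second.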

Regrettably I don't have time to look at this (and don't know how it is
actually implemented now), but this should provide quite a big win for
cygwin.

Regards,

	Bernard

--------------------------------------------
Bernard Dautrevaux
Microprocess Ingenierie
97 bis, rue de Colombes
92400 COURBEVOIE
FRANCE
Tel:	+33 (0) 1 47 68 80 80
Fax:	+33 (0) 1 47 88 97 85
e-mail:	dautrevaux@microprocess.com
		b.dautrevaux@usa.net
-------------------------------------------- 


* RE: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 15:28         ` Warren Young
@ 2001-02-14  0:48           ` Lothan
  0 siblings, 0 replies; 50+ messages in thread
From: Lothan @ 2001-02-14  0:48 UTC (permalink / raw)
  To: Cygwin-L

> From: cygwin-owner@sources.redhat.com
> [ mailto:cygwin-owner@sources.redhat.com]On Behalf Of Warren Young
> Sent: Tuesday, February 13, 2001 3:28 PM
> To: Cygwin-L
> Subject: Re: Optimizing away "ReadFile" calls when Make calls stat()
>
>
> jik-cygwin@curl.com wrote:
> >
> > As I've noted separately, reading tens of thousands of files even once
> > incurs a significant performance penalty.  The change I've proposed
> > can eliminate reading them at all.
>
> Even stat() under Linux does at least one disk read.  You can't
> completely optimize away disk I/O for stat().
>
> The main culprit is that this is one of many places where Unix doesn't
> map onto Win32 well at all.  The VC++ RTL doesn't use ReadFile() to
> implement _stat() at all.  When it checks for things like stat.st_mode
> == S_IEXEC, it simply checks the filename extension for .exe, .com or
> .bat.  Cygwin can't do that -- it must look at the file's magic bytes to
> see if it's an executable, or a #! style script.
>
> stat() on Unixen doesn't do either of these things; all the info stat()
> reports is in the inode.  The info in a prototypical Unix inode is
> scattered in many different places in Win32, which makes the emulator
> for a call like stat() slow.
>
> Maybe a better optimization strategy would be to patch GNU Make.
> Wherever it does a stat() to find the modification time, do something
> like this:
>
>         struct stat st;
> #ifdef CYGWIN
>         WIN32_FIND_DATA findinfo;
>         HANDLE h = FindFirstFile(filename, &findinfo);
>         st.st_mtime = findinfo.ftLastWriteTime;
>         FindClose(h);
> #else
>         stat(filename, &st);
> #endif

Your method is simpler, but I think it's actually faster to call GetFileTime
on an open file handle rather than go through the FindFirstFile API.

	struct stat st;
#ifdef CYGWIN
	HANDLE hFile;
	FILETIME CreationTime;
	FILETIME LastAccessTime;
	FILETIME LastWriteTime;
	hFile = CreateFile(filename, GENERIC_READ, FILE_SHARE_READ |
		FILE_SHARE_WRITE, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
	GetFileTime(hFile, &CreationTime, &LastAccessTime, &LastWriteTime);
	/* FILETIME is 100 ns ticks since 1601; convert to seconds since 1970 */
	st.st_mtime = (time_t)(((((unsigned long long)LastWriteTime.dwHighDateTime << 32)
		| LastWriteTime.dwLowDateTime) / 10000000ULL) - 11644473600ULL);
	CloseHandle(hFile);
#else
	stat(filename, &st);
#endif

In the case of make, it may be even simpler to roll an internal stat()
function that uses GetFileInformationByHandle(), since that seems to include
everything make needs (with mild translation of Windows attributes to *nix
attributes). Of course, one must remember that Windows API functions are not
aware of symbolic links. Caveat emptor.

> (ftLastWriteTime will probably need translation into a time_t, but
> that's a small matter.)
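
For the translation mentioned above, a small sketch of the arithmetic. A
FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC, split into two
32-bit halves. The helper names are made up for illustration, and the
functions take plain integers so they compile without windows.h.

```c
#include <stdint.h>
#include <time.h>

/* Seconds between 1601-01-01 (FILETIME epoch) and 1970-01-01 (Unix epoch). */
#define EPOCH_DIFF_SECS 11644473600ULL
/* FILETIME ticks (100 ns each) per second. */
#define TICKS_PER_SEC   10000000ULL

/* Join the two 32-bit halves of a FILETIME into one 64-bit tick count. */
uint64_t filetime_ticks(uint32_t high, uint32_t low)
{
	return ((uint64_t)high << 32) | low;
}

/* Convert a tick count to time_t, dropping the sub-second part.
   Times before 1970 are clamped to 0. */
time_t filetime_to_time_t(uint64_t ticks)
{
	uint64_t secs = ticks / TICKS_PER_SEC;

	if (secs < EPOCH_DIFF_SECS)
		return 0;
	return (time_t)(secs - EPOCH_DIFF_SECS);
}
```

On Windows the two halves would come from the dwHighDateTime and
dwLowDateTime members of the FILETIME structure.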



* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-14  0:12             ` Egor Duda
@ 2001-02-14  0:17               ` Robert Collins
  2001-02-15 11:47               ` Warren Young
  1 sibling, 0 replies; 50+ messages in thread
From: Robert Collins @ 2001-02-14  0:17 UTC (permalink / raw)
  To: cygwin

----- Original Message -----
From: "Egor Duda" <deo@logos-m.ru>
To: <cygwin@cygwin.com>
Sent: Wednesday, February 14, 2001 7:08 PM
Subject: Re: Optimizing away "ReadFile" calls when Make calls stat()


> Hi!
<SNIP>
> Not meaning to be too pushy ;), but I'd like to bring the thread
>
> http://sources.redhat.com/ml/cygwin-developers/2000-03/msg00077.html
>
> back to life. I have to say that not only ReadFile() is slowing things
> down, but CreateFile() too. I tend to think that cygwin- (or win32-)
> specific parts in ported applications are an unavoidable evil (and
> the "make" sources are already full of them).
>
<snip>
> Giving the porter a single universal tool to be more specific about what
> he wants to get from stat() has one more benefit. Otherwise, the porter
> will tend to use different native win32 calls such as GetFileTime(),
> GetFileAttributes(), GetFileSize() etc., which are harder to find in a
> large source tree when needed. With stat_lite he would just have to do a
> simple grep.
>
> The only problem with this approach I can see is that if we introduce a
> new API and applications start to use it, we become "bound" to it and
> it won't be easy to deprecate and remove it afterwards. OTOH, we
> can always make stat_lite() a simple wrapper around stat() if the latter
> becomes fast enough.
>

Perhaps a macro for each hint, which porters who have read up on cygwin
can use? Rather than changing the cygwin API, have a macro that calls
GetFileTime() etc. as appropriate... that way cygwin itself is not
altered or tied to anything?

Rob



* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 12:22           ` Christopher Faylor
  2001-02-13 12:50             ` DJ Delorie
@ 2001-02-14  0:12             ` Egor Duda
  2001-02-14  0:17               ` Robert Collins
  2001-02-15 11:47               ` Warren Young
  1 sibling, 2 replies; 50+ messages in thread
From: Egor Duda @ 2001-02-14  0:12 UTC (permalink / raw)
  To: cygwin

Hi!

Tuesday, 13 February, 2001 Christopher Faylor cgf@redhat.com wrote:

CF> On Tue, Feb 13, 2001 at 02:54:01PM -0500, DJ Delorie wrote:
>>> But it does nothing at all for the "usual case" I'm trying to
>>> optimize, which is Make stat()ing a file but never reading it.
>>
>>It does, because stat() reads the file twice, once to see if it's a
>>symlink, and once to see if the executable bit needs to be set.

CF> If it is actually doing this, then that's a bug.  I went to some effort
CF> a year or so ago to do away with the second read.

Not meaning to be too pushy ;), but I'd like to bring the thread

http://sources.redhat.com/ml/cygwin-developers/2000-03/msg00077.html

back to life. I have to say that not only ReadFile() is slowing things
down, but CreateFile() too. I tend to think that cygwin- (or win32-)
specific parts in ported applications are an unavoidable evil (and
the "make" sources are already full of them).

A quick grep through the "make" sources reveals that st_mode is needed
only to check for S_ISREG(m) or S_ISDIR(m). So an application porter
can use this info to give stat a hint, and stat won't need to read the
file, or even open it.

Giving the porter a single universal tool to be more specific about what
he wants to get from stat() has one more benefit. Otherwise, the porter
will tend to use different native win32 calls such as GetFileTime(),
GetFileAttributes(), GetFileSize() etc., which are harder to find in a
large source tree when needed. With stat_lite he would just have to do a
simple grep.

The only problem with this approach I can see is that if we introduce a
new API and applications start to use it, we become "bound" to it and
it won't be easy to deprecate and remove it afterwards. OTOH, we
can always make stat_lite() a simple wrapper around stat() if the latter
becomes fast enough.
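
To make the idea concrete, here is a minimal sketch of what the "simple
wrapper" fallback could look like. Nothing here is an existing cygwin
interface: the stat_lite() signature and the STAT_LITE_* hint flags are
assumptions made up for illustration.

```c
#include <sys/stat.h>

/* Hypothetical hint flags: which struct stat fields the caller needs. */
#define STAT_LITE_TYPE  0x1   /* only S_ISREG()/S_ISDIR() on st_mode */
#define STAT_LITE_MTIME 0x2   /* st_mtime */
#define STAT_LITE_SIZE  0x4   /* st_size */

/* Hypothetical stat_lite(): this is only the "simple wrapper" fallback.
   A cygwin-side implementation could use the hints to avoid opening or
   reading the file; this portable version ignores them and fills in
   every field, which is always a valid answer. */
int stat_lite(const char *path, struct stat *st, unsigned hints)
{
    (void)hints;   /* the fallback does not need the hints */
    return stat(path, st);
}
```

A porter would then write stat_lite(name, &st, STAT_LITE_TYPE) wherever only
S_ISREG()/S_ISDIR() are checked, and a grep for stat_lite finds every such
place.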

i'm ready to submit an appropriate patch against current sources.

Egor.            mailto:deo@logos-m.ru ICQ 5165414 FidoNet 2:5020/496.19




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:46       ` jik-cygwin
  2001-02-13 11:54         ` DJ Delorie
@ 2001-02-13 15:28         ` Warren Young
  2001-02-14  0:48           ` Lothan
  1 sibling, 1 reply; 50+ messages in thread
From: Warren Young @ 2001-02-13 15:28 UTC (permalink / raw)
  To: Cygwin-L

jik-cygwin@curl.com wrote:
> 
> As I've noted separately, reading tens of thousands of files even once
> incurs a significant performance penalty.  The change I've proposed
> can eliminate reading them at all.

Even stat() under Linux does at least one disk read.  You can't
completely optimize away disk I/O for stat().

The main culprit is that this is one of many places where Unix doesn't
map onto Win32 well at all.  The VC++ RTL doesn't use ReadFile() to
implement _stat() at all.  When it decides whether to set a bit like
S_IEXEC in st_mode, it simply checks the filename extension for .exe,
.com or .bat.  Cygwin can't do that -- it must look at the file's magic
bytes to see if it's an executable, or a #! style script.  

stat() on Unixen doesn't do either of these things; all the info stat()
reports is in the inode.  The info in a prototypical Unix inode is
scattered in many different places in Win32, which makes the emulator
for a call like stat() slow.

Maybe a better optimization strategy would be to patch GNU Make. 
Wherever it does a stat() to find the modification time, do something
like this:

        struct stat st;
#ifdef CYGWIN
        WIN32_FIND_DATA findinfo;
        HANDLE h = FindFirstFile(filename, &findinfo);
        st.st_mtime = findinfo.ftLastWriteTime;
        FindClose(h);
#else
        stat(filename, &st);
#endif

(ftLastWriteTime will probably need translation into a time_t, but
that's a small matter.)
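
For reference, a minimal sketch of that translation. A FILETIME holds a
64-bit count of 100 ns ticks since 1601-01-01, split into dwLowDateTime and
dwHighDateTime. The helper name filetime_to_unix() is made up, and it takes
the two halves as plain integers so the sketch builds without <windows.h>.

```c
#include <stdint.h>

/* 100 ns ticks per second, and seconds from 1601-01-01 to 1970-01-01. */
#define TICKS_PER_SECOND   10000000ULL
#define EPOCH_DIFF_SECONDS 11644473600ULL

/* Hypothetical helper: low and high are the dwLowDateTime and
   dwHighDateTime halves of a FILETIME. Returns whole seconds since the
   Unix epoch (negative for times before 1970). */
int64_t filetime_to_unix(uint32_t low, uint32_t high)
{
    uint64_t ticks = ((uint64_t)high << 32) | low;

    return (int64_t)(ticks / TICKS_PER_SECOND) - (int64_t)EPOCH_DIFF_SECONDS;
}
```

With a real FILETIME this would be called as
filetime_to_unix(ft.dwLowDateTime, ft.dwHighDateTime); the sub-second part
is simply dropped.
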
--                                                   _
= 'Net Address: http://www.cyberport.com/~tangent | / \  ASCII Ribbon
= ICBM Address: 36.82740N, 108.02040W, alt. 1714m | \ /  Campaign
=                                                 |  X   Against
= Chance favors the prepared mind.                | / \  HTML Mail


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 14:15 Puttkammer, Roman
@ 2001-02-13 14:28 ` Christopher Faylor
  0 siblings, 0 replies; 50+ messages in thread
From: Christopher Faylor @ 2001-02-13 14:28 UTC (permalink / raw)
  To: cygwin

On Tue, Feb 13, 2001 at 05:13:49PM -0500, Puttkammer, Roman wrote:
>
>> -----Original Message-----
>> From: jfaith@lineo.com [ mailto:jfaith@lineo.com ]
>> ...
>> script just did "make --version > /dev/null" one thousand times
>> ...
>> Linux: 3 sec.
>> VMWare running Linux: 9 sec.
>> DOS (batch file) 18 sec.
>> Cygwin: 30 sec.
>
>AFAIK, fork() tends to be much slower on windows than on most unixes
>such as solaris or linux.

There is no real fork on generic Win32.  Cygwin emulates the fork call and
it is, as a result, very slow.

cgf


* RE: Optimizing away "ReadFile" calls when Make calls stat()
@ 2001-02-13 14:15 Puttkammer, Roman
  2001-02-13 14:28 ` Christopher Faylor
  0 siblings, 1 reply; 50+ messages in thread
From: Puttkammer, Roman @ 2001-02-13 14:15 UTC (permalink / raw)
  To: cygwin

> -----Original Message-----
> From: jfaith@lineo.com [ mailto:jfaith@lineo.com ]
> ...
> script just did "make --version > /dev/null" one thousand times
> ...
> Linux: 3 sec.
> VMWare running Linux: 9 sec.
> DOS (batch file) 18 sec.
> Cygwin: 30 sec.

AFAIK, fork() tends to be much slower on windows than on most unixes
such as solaris or linux. Hence you'll always get bad performance
on windows when running this kind of test. I doubt, however, that you
can generalize these results; it's kind of like comparing pineapples
with carrots.

The reason for this is probably that unixes are optimized for server
applications using many heavy- and light-weight processes/threads.
Windows, however, seems to be optimized for running 10meg VB script
functions inside an excel spreadsheet - and it actually does a pretty
good job running those :-)

putt


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 13:50             ` Mumit Khan
@ 2001-02-13 14:13               ` DJ Delorie
  0 siblings, 0 replies; 50+ messages in thread
From: DJ Delorie @ 2001-02-13 14:13 UTC (permalink / raw)
  To: khan; +Cc: cygwin

> And I have seen results that show W2k/NTFS_5 to be at least as fast as
> some of the Unix counterparts, and I trust neither (at least not w/out
> more information). I also don't trust benchmark numbers of Linux/ext2fs,

At the time, I was testing simple things, like open(), system(), etc.
I just wanted a baseline to compare against so I could gauge the
improvements I was working on at the time; as long as the results were
consistent it didn't matter what the absolute numbers were.  For
example, if I could improve the performance from 10x as much as linux
to only 5x as much as linux, that would be great, but I shouldn't
expect that going from 5x to 2x would be as easy.

One of my other comparisons was against djgpp, for example, whose
performance is sensitive to 95 vs NT, but just knowing that cygwin was
less than 2x the time djgpp took was useful.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 12:50           ` Larry Hall (RFK Partners, Inc)
  2001-02-13 12:51             ` DJ Delorie
  2001-02-13 13:37             ` jfaith
@ 2001-02-13 13:50             ` Mumit Khan
  2001-02-13 14:13               ` DJ Delorie
  2 siblings, 1 reply; 50+ messages in thread
From: Mumit Khan @ 2001-02-13 13:50 UTC (permalink / raw)
  To: cygwin

On Tue, 13 Feb 2001, Larry Hall (RFK Partners, Inc) wrote:

> At 03:25 PM 2/13/2001, DJ Delorie wrote:
> >
> >Actually, it is.  I did some benchmarks using the native Win32 API
> >directly, and Linux is way faster.
> 
> 
> Any chance that you have a pointer to the results of such a test?  Just
> curious.

And I have seen results that show W2k/NTFS_5 to be at least as fast as
some of the Unix counterparts, and I trust neither (at least not w/out
more information). I also don't trust benchmark numbers of Linux/ext2fs,
because of the metadata issue, nor do I trust some of the other Unix and
W2k numbers because of other issues. *BSD and Linux folks don't believe 
each other's numbers either.  Argh, just can't win. Skeptical crisis all 
over again. 

Regards,
Mumit




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 12:50           ` Larry Hall (RFK Partners, Inc)
  2001-02-13 12:51             ` DJ Delorie
@ 2001-02-13 13:37             ` jfaith
  2001-02-13 13:50             ` Mumit Khan
  2 siblings, 0 replies; 50+ messages in thread
From: jfaith @ 2001-02-13 13:37 UTC (permalink / raw)
  To: Larry Hall (RFK Partners, Inc), cygwin

"Larry Hall (RFK Partners, Inc)" wrote:

> At 03:25 PM 2/13/2001, DJ Delorie wrote:
> > > Win32 is slower, but not THAT much slower.
> >
> >Actually, it is.  I did some benchmarks using the native Win32 API
> >directly, and Linux is way faster.
>
> Any chance that you have a pointer to the results of such a test?  Just
> curious.

Although not related to the performance of stat() ... to get a feel for
where time was being spent during a make on Cygwin, I did an experiment a
while ago to compare process launching on Linux versus Win32.  The test
script just did "make --version > /dev/null" one thousand times.  This is
how long it took, all done on the same hardware:

Linux: 3 sec.
VMWare running Linux: 9 sec.
DOS (batch file) 18 sec.
Cygwin: 30 sec.
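
For anyone who wants to repeat this kind of measurement without a shell
loop, here is a minimal sketch of the launch loop in C. The helper name
run_repeatedly() is made up, and unlike the original script it does not
redirect the child's output to /dev/null.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork and exec argv[0] `count` times, waiting for
   each child in turn. Returns how many runs exited with status 0. */
int run_repeatedly(char *const argv[], int count)
{
    int ok = 0;

    for (int i = 0; i < count; i++) {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            break;                   /* fork failed; stop early */
        if (pid == 0) {
            execvp(argv[0], argv);   /* only returns on failure */
            _exit(127);
        }
        if (waitpid(pid, &status, 0) == pid
            && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

Calling it with an argv of { "make", "--version", NULL } and a count of 1000
from a small main(), and running that under time, should give numbers
comparable in spirit to the ones above.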

One thing which can help build performance is to pass several C files at
once to gcc, if it's possible to work it into your build procedure.

--
John Faith
Lineo




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 12:50           ` Larry Hall (RFK Partners, Inc)
@ 2001-02-13 12:51             ` DJ Delorie
  2001-02-13 13:37             ` jfaith
  2001-02-13 13:50             ` Mumit Khan
  2 siblings, 0 replies; 50+ messages in thread
From: DJ Delorie @ 2001-02-13 12:51 UTC (permalink / raw)
  To: lhall; +Cc: cygwin

> Any chance that you have a pointer to the results of such a test?  Just
> curious.

Nope.  Long ago and a different machine.  Sorry.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 12:22           ` Christopher Faylor
@ 2001-02-13 12:50             ` DJ Delorie
  2001-02-14  0:12             ` Egor Duda
  1 sibling, 0 replies; 50+ messages in thread
From: DJ Delorie @ 2001-02-13 12:50 UTC (permalink / raw)
  To: cygwin

> If it is actually doing this, then that's a bug.  I went to some effort
> a year or so ago to do away with the second read.

See?  I broke it down to easy-to-do chunks, and someone did one of them ;-)


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 12:25         ` DJ Delorie
@ 2001-02-13 12:50           ` Larry Hall (RFK Partners, Inc)
  2001-02-13 12:51             ` DJ Delorie
                               ` (2 more replies)
  0 siblings, 3 replies; 50+ messages in thread
From: Larry Hall (RFK Partners, Inc) @ 2001-02-13 12:50 UTC (permalink / raw)
  To: DJ Delorie; +Cc: cygwin

At 03:25 PM 2/13/2001, DJ Delorie wrote:
> > Win32 is slower, but not THAT much slower.
>
>Actually, it is.  I did some benchmarks using the native Win32 API
>directly, and Linux is way faster.


Any chance that you have a pointer to the results of such a test?  Just
curious.


Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      http://www.rfk.com
118 Washington Street                   (508) 893-9779 - RFK Office
Holliston, MA 01746                     (508) 893-9889 - FAX




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:56           ` Jonathan Kamens
  2001-02-13 12:06             ` DJ Delorie
@ 2001-02-13 12:31             ` Larry Hall (RFK Partners, Inc)
  1 sibling, 0 replies; 50+ messages in thread
From: Larry Hall (RFK Partners, Inc) @ 2001-02-13 12:31 UTC (permalink / raw)
  To: Jonathan Kamens, dj; +Cc: cygwin

At 02:56 PM 2/13/2001, Jonathan Kamens wrote:
> >  Date: Tue, 13 Feb 2001 14:54:01 -0500
> >  From: DJ Delorie <dj@delorie.com>
> >  
> >  Perhaps you could
> >  propose a set of mount flags to optimize common situations?  We
> >  already have one to avoid the read-for-execute test,
>
>Could you elaborate on that?
>
> >  perhaps you could
> >  work on an assume-no-symlinks flag?  Then we wouldn't need a custom
> >  make.exe (or any other program).
>
>I think that's a very good idea.  If you tell me a little bit more
>about the read-for-execute mount flag, with which I'm not familiar, I
>can look into implementing a similar thing for symbolic links.



Take a look at mount --help.  You can then look at the mount code to see
how this works.



Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      http://www.rfk.com
118 Washington Street                   (508) 893-9779 - RFK Office
Holliston, MA 01746                     (508) 893-9889 - FAX




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:54       ` jik-cygwin
@ 2001-02-13 12:25         ` DJ Delorie
  2001-02-13 12:50           ` Larry Hall (RFK Partners, Inc)
  0 siblings, 1 reply; 50+ messages in thread
From: DJ Delorie @ 2001-02-13 12:25 UTC (permalink / raw)
  To: jik-cygwin; +Cc: cygwin

> if they speed up Cygwin significantly, why aren't the people who
> precompile Cygwin for download using those switches by default?

Because using those switches when you shouldn't results in incorrect
behavior.  The defaults are *always* right, at the expense of speed.

> Win32 is slower, but not THAT much slower.

Actually, it is.  I did some benchmarks using the native Win32 API
directly, and Linux is way faster.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:54         ` DJ Delorie
  2001-02-13 11:56           ` Jonathan Kamens
@ 2001-02-13 12:22           ` Christopher Faylor
  2001-02-13 12:50             ` DJ Delorie
  2001-02-14  0:12             ` Egor Duda
  1 sibling, 2 replies; 50+ messages in thread
From: Christopher Faylor @ 2001-02-13 12:22 UTC (permalink / raw)
  To: cygwin

On Tue, Feb 13, 2001 at 02:54:01PM -0500, DJ Delorie wrote:
>> But it does nothing at all for the "usual case" I'm trying to
>> optimize, which is Make stat()ing a file but never reading it.
>
>It does, because stat() reads the file twice, once to see if it's a
>symlink, and once to see if the executable bit needs to be set.

If it is actually doing this, then that's a bug.  I went to some effort
a year or so ago to do away with the second read.

cgf


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:48     ` Earnie Boyd
  2001-02-13 11:54       ` jik-cygwin
@ 2001-02-13 12:11       ` DJ Delorie
  1 sibling, 0 replies; 50+ messages in thread
From: DJ Delorie @ 2001-02-13 12:11 UTC (permalink / raw)
  To: cygwin

> It's like comparing a tortoise with a hare.

An insomniac hare on speed, that is.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:56           ` Jonathan Kamens
@ 2001-02-13 12:06             ` DJ Delorie
  2001-02-13 12:31             ` Larry Hall (RFK Partners, Inc)
  1 sibling, 0 replies; 50+ messages in thread
From: DJ Delorie @ 2001-02-13 12:06 UTC (permalink / raw)
  To: jik; +Cc: cygwin

> >  Perhaps you could
> >  propose a set of mount flags to optimize common situations?  We
> >  already have one to avoid the read-for-execute test,
> 
> Could you elaborate on that?

$ mount --h
...
-x  treat all files under mount point as executables.

> I think that's a very good idea.  If you tell me a little bit more
> about the read-for-execute mount flag, with which I'm not familiar, I
> can look into implementing a similar thing for symbolic links.

winsup/utils/mount.cc sets it.  <sys/mount.h> defines
MOUNT_CYGWIN_EXEC but look for PATH_ALL_EXEC to see where
it's relevant (in path.{cc,h}).


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:28   ` jik-cygwin
@ 2001-02-13 12:04     ` Eric M. Monsler
  0 siblings, 0 replies; 50+ messages in thread
From: Eric M. Monsler @ 2001-02-13 12:04 UTC (permalink / raw)
  To: jik-cygwin; +Cc: cygwin

jik-cygwin@curl.com wrote:
> 
(snip)
> 
> >  For everything in the system?  Is your proposed change MT-safe?
> 
> It would have to be made MT-safe, obviously.  While I am not
> intimately familiar with how to do such a thing in Cygwin DLL code, I
> am confident that the more knowledgeable maintainers of the code would
> be able to do so easily.

Do you think so?  Does the DLL maintain "state" for every process which
may make cygwin system calls?  Are there different levels of state for
processes vs. threads, such that if one thread was in the middle of a
set_stat_options()/stat()/set_stat_options() operation and another
thread attempted a stat(), the stat() in the other thread would get the
POSIX behavior, because inside the stat() call the DLL checked the
set_stat_options() state for this thread and found that it had not been
changed?

I am philosophically opposed to anything that changes system call
behavior.  I am doubtful that even experts can see all consequences. 
Even *fixing* behaviors can break code that had successfully worked
around the previous behavior.

If a DLL modification is really required, how about a new function,
substat(), which checks on the file and returns in the struct only those
fields that are set to non-null when the substat() is called?

struct stat statBuff;
struct stat *pStatBuff = &statBuff;
...

/* Identical to current "stat" */
memset(pStatBuff, 0xff, sizeof(*pStatBuff));
substat(pPath,pStatBuff);

/* A long and expensive NOP */
memset(pStatBuff, 0x0, sizeof(*pStatBuff));
substat(pPath,pStatBuff);

/* Get just last modification time */
memset(pStatBuff, 0x0, sizeof(*pStatBuff));
pStatBuff->st_mtime = 1;
substat(pPath,pStatBuff);
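
A minimal sketch of how such a substat() might behave, built on plain stat()
so it can be tried anywhere. This is only an assumption about the semantics
described above; it handles just three fields to stay short.

```c
#include <sys/stat.h>

/* Hypothetical substat(): fill in only the fields the caller marked as
   wanted by setting them non-zero beforehand. Only st_mode, st_size and
   st_mtime are handled in this sketch. */
int substat(const char *path, struct stat *req)
{
    struct stat full;
    int want_mode  = (req->st_mode  != 0);
    int want_size  = (req->st_size  != 0);
    int want_mtime = (req->st_mtime != 0);

    if (!want_mode && !want_size && !want_mtime)
        return 0;              /* nothing requested: skip the system call */

    /* Portable fallback: one full stat(). A Cygwin-side version could
       pick a cheaper native call based on the want_* flags. */
    if (stat(path, &full) != 0)
        return -1;

    if (want_mode)  req->st_mode  = full.st_mode;
    if (want_size)  req->st_size  = full.st_size;
    if (want_mtime) req->st_mtime = full.st_mtime;
    return 0;
}
```

In this version the all-zero case returns early rather than being a long and
expensive NOP, and unrequested fields are left exactly as the caller set
them.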


On the other hand, it sounds as if you have a working version of make
that suits your needs right now.  How many other applications are
expected to use substat(), or any other Cygwin-only system call like
set_stat_options()?


Eric M. Monsler


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:54         ` DJ Delorie
@ 2001-02-13 11:56           ` Jonathan Kamens
  2001-02-13 12:06             ` DJ Delorie
  2001-02-13 12:31             ` Larry Hall (RFK Partners, Inc)
  2001-02-13 12:22           ` Christopher Faylor
  1 sibling, 2 replies; 50+ messages in thread
From: Jonathan Kamens @ 2001-02-13 11:56 UTC (permalink / raw)
  To: dj; +Cc: cygwin

>  Date: Tue, 13 Feb 2001 14:54:01 -0500
>  From: DJ Delorie <dj@delorie.com>
>  
>  Perhaps you could
>  propose a set of mount flags to optimize common situations?  We
>  already have one to avoid the read-for-execute test,

Could you elaborate on that?

>  perhaps you could
>  work on an assume-no-symlinks flag?  Then we wouldn't need a custom
>  make.exe (or any other program).

I think that's a very good idea.  If you tell me a little bit more
about the read-for-execute mount flag, with which I'm not familiar, I
can look into implementing a similar thing for symbolic links.

Thanks,

  jik


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:48     ` Earnie Boyd
@ 2001-02-13 11:54       ` jik-cygwin
  2001-02-13 12:25         ` DJ Delorie
  2001-02-13 12:11       ` DJ Delorie
  1 sibling, 1 reply; 50+ messages in thread
From: jik-cygwin @ 2001-02-13 11:54 UTC (permalink / raw)
  To: cygwin

>  Date: Tue, 13 Feb 2001 14:48:48 -0500
>  From: Earnie Boyd <earnie_boyd@yahoo.com>
>  
>  You are looking for ways to "speed up execution".  I was suggesting you
>  try the suggested switches as another means to speed up execution.  I
>  can rebuild the Cygwin dll in ~10 minutes doing a `make clean && make'.

And I will try the switches you suggested, but my point is that if
they speed up Cygwin significantly, why aren't the people who
precompile Cygwin for download using those switches by default?

>  BTW, repetitive stats are already cached which can be seen by timing an
>  `ls /bin' repetitively.

Perhaps repeated "ls /bin" calls are faster because of local
filesystem caching, but that's irrelevant to our usage because (a) we
are frequently accessing dependency files on remote filesystems which
may not be cached and (b) we don't stat the same files over and over
again, we stat many different files, surely more than are allowed to
remain in the cache, so the cache doesn't help us.

>  Also, Cygwin will always be slower than Linux. 
>  Win32 is just slower.  It's like comparing a tortoise with a hare.

Win32 is slower, but not THAT much slower.  There is clearly much
overhead in Cygwin, and our goal here is to eliminate as much of that
overhead as possible.

  jik


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:46       ` jik-cygwin
@ 2001-02-13 11:54         ` DJ Delorie
  2001-02-13 11:56           ` Jonathan Kamens
  2001-02-13 12:22           ` Christopher Faylor
  2001-02-13 15:28         ` Warren Young
  1 sibling, 2 replies; 50+ messages in thread
From: DJ Delorie @ 2001-02-13 11:54 UTC (permalink / raw)
  To: jik-cygwin; +Cc: cygwin

> As I've noted separately, reading tens of thousands of files even once
> incurs a significant performance penalty.

True, but reading them all once is better than reading them all twice.
I'm trying to break the problem down into small enough changes that we
actually have a chance of implementing them.

> The change I've proposed can eliminate reading them at all.

But not in a way that we can make it the default.  Perhaps you could
propose a set of mount flags to optimize common situations?  We
already have one to avoid the read-for-execute test, perhaps you could
work on an assume-no-symlinks flag?  Then we wouldn't need a custom
make.exe (or any other program).

> But it does nothing at all for the "usual case" I'm trying to
> optimize, which is Make stat()ing a file but never reading it.

It does, because stat() reads the file twice, once to see if it's a
symlink, and once to see if the executable bit needs to be set.

> >  These should be easier wins (thus, more doable) than a global cache,
> >  which NT should be providing itself as part of the disk cache
> >  subsystem (for local drives, at least).  I don't think it's
> >  appropriate for cygwin to go beyond this anyway - too many race
> >  conditions arise.
> 
> As far as I know, there are no race conditions in the change I
> suggested.  In fact, it *removes* race conditions, since it reduces
> the number of distinct OS operations that must be performed on a file
> during stat().

Right, but others were suggesting a global cache of file bytes.
*That* would introduce race conditions.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:15   ` jik-cygwin
@ 2001-02-13 11:48     ` Earnie Boyd
  2001-02-13 11:54       ` jik-cygwin
  2001-02-13 12:11       ` DJ Delorie
  0 siblings, 2 replies; 50+ messages in thread
From: Earnie Boyd @ 2001-02-13 11:48 UTC (permalink / raw)
  To: jik-cygwin; +Cc: cygwin

jik-cygwin@curl.com wrote:
> 
> >  Date: Tue, 13 Feb 2001 14:09:48 -0500
> >  From: Earnie Boyd <earnie_boyd@yahoo.com>
> >
> >  This sounds very interesting but I believe work to eliminate TWO
> >  ReadFiles would be best; but, I don't know if this is possible.
> 

I stated this sloppily.  My intention was as you've discussed already with
Larry.

> I don't understand what you mean.  The experiment I did yesterday
> *did* eliminate both ReadFiles.  However, I don't think Make can
> eliminate both ReadFiles *by default* because Make can't assume that
> the user doesn't use any symlinks unless the user tells it to assume
> that.
> 
> >  I have found the following set of GCC flags to have great impact
> >  with the speed with which Cygwin flies.
> 
> How gcc should be called when compiling Cygwin is an interesting
> question, but it's not the one I'm asking here.  I hope the people
> from RedHat who compile the Cygwin packages that go up on the Web
> sites consider your suggestion, though :-).
> 

You are looking for ways to "speed up execution".  I was suggesting you
try the suggested switches as another means to speed up execution.  I
can rebuild the Cygwin dll in ~10 minutes doing a `make clean && make'.

BTW, repetitive stats are already cached which can be seen by timing an
`ls /bin' repetitively.  Also, Cygwin will always be slower than Linux. 
Win32 is just slower.  It's like comparing a tortoise with a hare.

Earnie.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:35     ` DJ Delorie
@ 2001-02-13 11:46       ` jik-cygwin
  2001-02-13 11:54         ` DJ Delorie
  2001-02-13 15:28         ` Warren Young
  0 siblings, 2 replies; 50+ messages in thread
From: jik-cygwin @ 2001-02-13 11:46 UTC (permalink / raw)
  To: dj; +Cc: cygwin

>  Date: Tue, 13 Feb 2001 14:35:08 -0500
>  From: DJ Delorie <dj@delorie.com>
>  
>  I think an easier win would be to cache the bytes read within the
>  fhandler, not globally, so while each fhandler (i.e. open, stat,
>  whatnot) would still need to read the file itself, it would never need
>  to read it more than once.

As I've noted separately, reading tens of thousands of files even once
incurs a significant performance penalty.  The change I've proposed
can eliminate reading them at all.

>  If we still need more performance, perhaps a one-entry cache of the
>  most recent file accessed (and when) could be added, and ignored if
>  the data is more than, say, one second old.  Even if it were local to
>  one process, it would hit most of the usual cases (stat a file, alloc
>  a buffer, then open/read the file).

But it does nothing at all for the "usual case" I'm trying to
optimize, which is Make stat()ing a file but never reading it.
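
For reference, the one-entry cache proposed in the quoted text can be
sketched roughly as below. The function name stat_cached(), the
CACHE_MAX_AGE constant and the cache_hits counter are made up for the
sketch, and it wraps plain stat() rather than anything inside the Cygwin
DLL.

```c
#include <string.h>
#include <sys/stat.h>
#include <time.h>

#define CACHE_MAX_AGE 1          /* seconds before an entry is ignored */

static char        cached_path[4096];
static struct stat cached_stat;
static time_t      cached_at;
static int         cache_valid;
int                cache_hits;   /* exposed only so the sketch can be checked */

/* Hypothetical stat_cached(): like stat(), but reuses the previous result
   when asked about the same path within CACHE_MAX_AGE seconds. */
int stat_cached(const char *path, struct stat *st)
{
    time_t now = time(NULL);

    if (cache_valid && now - cached_at <= CACHE_MAX_AGE
        && strcmp(path, cached_path) == 0) {
        *st = cached_stat;
        cache_hits++;
        return 0;
    }

    cache_valid = 0;
    if (stat(path, st) != 0)
        return -1;

    if (strlen(path) < sizeof cached_path) {   /* skip overlong paths */
        strcpy(cached_path, path);
        cached_stat = *st;
        cached_at = now;
        cache_valid = 1;
    }
    return 0;
}
```

The sketch only pays off when the same path is touched twice in a row; a
pass that stats thousands of different files once each never gets a hit.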

>  These should be easier wins (thus, more doable) than a global cache,
>  which NT should be providing itself as part of the disk cache
>  subsystem (for local drives, at least).  I don't think it's
>  appropriate for cygwin to go beyond this anyway - too many race
>  conditions arise.

As far as I know, there are no race conditions in the change I
suggested.  In fact, it *removes* race conditions, since it reduces
the number of distinct OS operations that must be performed on a file
during stat().

  jik


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 10:56 ` Larry Hall (RFK Partners, Inc)
  2001-02-13 11:01   ` jik-cygwin
  2001-02-13 11:12   ` Earnie Boyd
@ 2001-02-13 11:46   ` Christopher Faylor
  2 siblings, 0 replies; 50+ messages in thread
From: Christopher Faylor @ 2001-02-13 11:46 UTC (permalink / raw)
  To: cygwin

On Tue, Feb 13, 2001 at 01:51:50PM -0500, Larry Hall (RFK Partners, Inc) wrote:
>At 01:36 PM 2/13/2001, Jonathan Kamens wrote:
>>Please comment.
>
>
>
>I know this would only address half the problem but I wonder if it would make
>sense to cache the results of ReadFile() so that separate checks for symbolic
>links and executables would result in only 1 ReadFile() call.  This seems
>like a nice general optimization which wouldn't be so "gross"...

Actually, I wonder if just mounting the directory with the execute bit
set ("mount -x c:\bin /bin") would solve this.  It would cause every
file to be considered executable but it may bypass the ReadFile.

cgf


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:01   ` jik-cygwin
  2001-02-13 11:14     ` Larry Hall (RFK Partners, Inc)
@ 2001-02-13 11:35     ` DJ Delorie
  2001-02-13 11:46       ` jik-cygwin
  1 sibling, 1 reply; 50+ messages in thread
From: DJ Delorie @ 2001-02-13 11:35 UTC (permalink / raw)
  To: jik-cygwin; +Cc: cygwin

> I see three problems with that:
> 
> 1) The cache would have to be automatically invalidated whenever the
> 2) Many of our dependencies are checked over and over again in many
> 3) The implementation of such a global cache would be much more

I think an easier win would be to cache the bytes read within the
fhandler, not globally, so while each fhandler (i.e. open, stat,
whatnot) would still need to read the file itself, it would never need
to read it more than once.

If we still need more performance, perhaps a one-entry cache of the
most recent file accessed (and when) could be added, and ignored if
the data is more than, say, one second old.  Even if it were local to
one process, it would hit most of the usual cases (stat a file, alloc
a buffer, then open/read the file).
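
To make the one-entry idea concrete, here is a minimal sketch in plain
C.  It assumes POSIX stat() as a stand-in for whatever the fhandler
would really call; cached_stat, last_stat and real_stat_calls are
hypothetical names for illustration, not anything in the Cygwin
sources.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

#define CACHE_MAX_AGE_SEC 1   /* "say, one second" from the paragraph above */

/* Hypothetical one-entry, per-process cache of the last stat() result. */
static struct {
    char path[4096];
    struct stat st;
    time_t stored_at;
    int valid;
} last_stat;

static unsigned long real_stat_calls;  /* how often the OS was actually asked */

int cached_stat(const char *path, struct stat *out)
{
    assert(path != NULL && out != NULL);
    time_t now = time(NULL);

    /* Hit: same path, and the entry is no more than one second old. */
    if (last_stat.valid && strcmp(last_stat.path, path) == 0 &&
        now - last_stat.stored_at <= CACHE_MAX_AGE_SEC) {
        *out = last_stat.st;
        return 0;
    }

    /* Miss: ask the OS, and remember the answer if the path fits. */
    real_stat_calls++;
    int rc = stat(path, out);
    last_stat.valid = 0;
    if (rc == 0 && strlen(path) < sizeof last_stat.path) {
        strcpy(last_stat.path, path);
        last_stat.st = *out;
        last_stat.stored_at = now;
        last_stat.valid = 1;
    }
    return rc;
}
```

Two back-to-back calls on the same path reach the OS only once.  A
cache like this only helps when the same file is touched twice in
quick succession by the same process.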

These should be easier wins (thus, more doable) than a global cache,
which NT should be providing itself as part of the disk cache
subsystem (for local drives, at least).  I don't think it's
appropriate for cygwin to go beyond this anyway - too many race
conditions arise.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:24 ` Eric M. Monsler
@ 2001-02-13 11:28   ` jik-cygwin
  2001-02-13 12:04     ` Eric M. Monsler
  0 siblings, 1 reply; 50+ messages in thread
From: jik-cygwin @ 2001-02-13 11:28 UTC (permalink / raw)
  To: emonsler; +Cc: cygwin

>  Date: Tue, 13 Feb 2001 11:25:46 -0800
>  From: "Eric M. Monsler" <emonsler@beamreachnetworks.com>
>  
>  I don't understand the change that you are proposing, unless it is to
>  change the API for stat() to include two more flags.

No.

>  Not to mention that you would have also forked GNU Make, or else added a
>  compilation dependency, that would need to get folded back in.

There are already plenty of Cygwin-specific changes in GNU Make.  This
would simply be one more of them.

>  On re-reading your post, it appears that you are not proposing an API
>  change to stat, but rather another call that will set/unset that part of
>  stat()'s behavior in the DLL.

Yes.

>  For everything in the system?  Is your proposed change MT-safe?

It would have to be made MT-safe, obviously.  While I am not
intimately familiar with how to do such a thing in Cygwin DLL code, I
am confident that the more knowledgeable maintainers of the code would
be able to do so easily.

>  I believe that the proposal to cache the results of ReadFile() was
>  intended to suggest that inside stat(), only one ReadFile might be
>  required.  This seems like a good idea, performance enhancing and
>  standards preserving.  I don't know ReadFile, and so don't know if this
>  would be possible.

Yes, it does seem like a good idea, but it doesn't go far enough for
our needs.  This would still result in thousands (or perhaps even tens
of thousands) of unnecessary ReadFile calls in our builds and would
have a significant performance impact.

  jik


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:18       ` jik-cygwin
@ 2001-02-13 11:26         ` Larry Hall (RFK Partners, Inc)
  0 siblings, 0 replies; 50+ messages in thread
From: Larry Hall (RFK Partners, Inc) @ 2001-02-13 11:26 UTC (permalink / raw)
  To: jik-cygwin; +Cc: cygwin

At 02:17 PM 2/13/2001, jik-cygwin@curl.com wrote:
> >  Date: Tue, 13 Feb 2001 14:08:58 -0500
> >  From: "Larry Hall (RFK Partners, Inc)" <lhall@rfk.com>
> >  
> >  >1) The cache would have to be automatically invalidated whenever the
> >  >    file is changed, and you'd thus need to check if a file has changed
> >  >    before using the cache, and those checks would themselves take
> >  >    time.
> >  
> >  I'd submit that while this may be true in general, it shouldn't be in the
> >  case of symbolic links and executables.  These attributes don't really change
> >  or, if they do, they change in a very defined way which should make it 
> >  possible to track.
>
>Yikes, I don't agree with that at all.
>
>Non-deterministic behavior is *much* worse than slow behavior.  It
>would be *impossible* for Cygwin to track every single change to
>symbolic links and/or executable files, since people can modify such
>files without going through Cygwin at all (e.g., modifying them
>directly with a Windows app, or modifying such a file on a SAMBA mount
>through Linux).
>
>Since it's impossible for Cygwin to know when such files are changed
>out from under it, it *must* check them, or at least check if they
>have been changed, each time they are accessed.


Again, in general, I agree with you.  Still, it depends on what attributes
are being cached.  If we limit ourselves to checking for executables and
symbolic links, I don't see a major problem.  The only thing this won't
catch is the case where someone changes an executable file to a
non-executable one or vice versa (same for symbolic link files).  Assuming the
cache ages relatively quickly, this shouldn't be an issue.  Your point is
taken however.



Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      http://www.rfk.com
118 Washington Street                   (508) 893-9779 - RFK Office
Holliston, MA 01746                     (508) 893-9889 - FAX




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 10:36 Jonathan Kamens
  2001-02-13 10:56 ` Larry Hall (RFK Partners, Inc)
  2001-02-13 11:09 ` Earnie Boyd
@ 2001-02-13 11:24 ` Eric M. Monsler
  2001-02-13 11:28   ` jik-cygwin
  2 siblings, 1 reply; 50+ messages in thread
From: Eric M. Monsler @ 2001-02-13 11:24 UTC (permalink / raw)
  To: Jonathan Kamens; +Cc: cygwin

I don't understand the change that you are proposing, unless it is to
change the API for stat() to include two more flags.

The Cygwin-API reference lists stat() as being compatible with POSIX.1,
so the API is not really changeable, without a significant chance of
breaking every other application under Cygwin that uses stat.

Not to mention that you would have also forked GNU Make, or else added a
compilation dependency, that would need to get folded back in.

On re-reading your post, it appears that you are not proposing an API
change to stat, but rather another call that will set/unset that part of
stat()'s behavior in the DLL.  For everything in the system?  Is your
proposed change MT-safe?  


I believe that the proposal to cache the results of ReadFile() was
intended to suggest that inside stat(), only one ReadFile might be
required.  This seems like a good idea, performance enhancing and
standards preserving.  I don't know ReadFile, and so don't know if this
would be possible.


Eric Monsler


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:14     ` Larry Hall (RFK Partners, Inc)
@ 2001-02-13 11:18       ` jik-cygwin
  2001-02-13 11:26         ` Larry Hall (RFK Partners, Inc)
  0 siblings, 1 reply; 50+ messages in thread
From: jik-cygwin @ 2001-02-13 11:18 UTC (permalink / raw)
  To: lhall; +Cc: cygwin

>  Date: Tue, 13 Feb 2001 14:08:58 -0500
>  From: "Larry Hall (RFK Partners, Inc)" <lhall@rfk.com>
>  
>  >1) The cache would have to be automatically invalidated whenever the
>  >    file is changed, and you'd thus need to check if a file has changed
>  >    before using the cache, and those checks would themselves take
>  >    time.
>  
>  I'd submit that while this may be true in general, it shouldn't be in the
>  case of symbolic links and executables.  These attributes don't really change
>  or, if they do, they change in a very defined way which should make it 
>  possible to track.

Yikes, I don't agree with that at all.

Non-deterministic behavior is *much* worse than slow behavior.  It
would be *impossible* for Cygwin to track every single change to
symbolic links and/or executable files, since people can modify such
files without going through Cygwin at all (e.g., modifying them
directly with a Windows app, or modifying such a file on a SAMBA mount
through Linux).

Since it's impossible for Cygwin to know when such files are changed
out from under it, it *must* check them, or at least check if they
have been changed, each time they are accessed.

  jik


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:09 ` Earnie Boyd
@ 2001-02-13 11:15   ` jik-cygwin
  2001-02-13 11:48     ` Earnie Boyd
  0 siblings, 1 reply; 50+ messages in thread
From: jik-cygwin @ 2001-02-13 11:15 UTC (permalink / raw)
  To: cygwin; +Cc: cygwin

>  Date: Tue, 13 Feb 2001 14:09:48 -0500
>  From: Earnie Boyd <earnie_boyd@yahoo.com>
>  
>  This sounds very interesting but I believe work to eliminate TWO
>  ReadFiles would be best; but, I don't know if this is possible.

I don't understand what you mean.  The experiment I did yesterday
*did* eliminate both ReadFiles.  However, I don't think Make can
eliminate both ReadFiles *by default* because Make can't assume that
the user doesn't use any symlinks unless the user tells it to assume
that.

>  I have found the following set of GCC flags to have great impact
>  on the speed with which Cygwin flies.

How gcc should be called when compiling Cygwin is an interesting
question, but it's not the one I'm asking here.  I hope the people
from RedHat who compile the Cygwin packages that go up on the Web
sites consider your suggestion, though :-).

  jik


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 11:01   ` jik-cygwin
@ 2001-02-13 11:14     ` Larry Hall (RFK Partners, Inc)
  2001-02-13 11:18       ` jik-cygwin
  2001-02-13 11:35     ` DJ Delorie
  1 sibling, 1 reply; 50+ messages in thread
From: Larry Hall (RFK Partners, Inc) @ 2001-02-13 11:14 UTC (permalink / raw)
  To: jik-cygwin; +Cc: cygwin

At 02:01 PM 2/13/2001, jik-cygwin@curl.com wrote:
> >  Date: Tue, 13 Feb 2001 13:51:50 -0500
> >  From: "Larry Hall (RFK Partners, Inc)" <lhall@rfk.com>
> >  
> >  I know this would only address half the problem but I wonder if it would make
> >  sense to cache the results of ReadFile() so that separate checks for symbolic
> >  links and executables would result in only 1 ReadFile() call.  This seems
> >  like a nice general optimization which wouldn't be so "gross"...
>
>I see three problems with that:
>
>1) The cache would have to be automatically invalidated whenever the
>    file is changed, and you'd thus need to check if a file has changed
>    before using the cache, and those checks would themselves take
>    time.


I'd submit that while this may be true in general, it shouldn't be in the
case of symbolic links and executables.  These attributes don't really change
or, if they do, they change in a very defined way which should make it 
possible to track.


>2) Many of our dependencies are checked over and over again in many
>    Makefiles by many different Make processes.  Thus, either the cache
>    you propose would have to be global in the Cygwin shared memory
>    segment, or the checks would still happen over and over for us.


Right.  Having it in shared memory would be almost a requirement as far
as I can see.


>3) The implementation of such a global cache would be much more
>    complex than the simple changes I implemented in only a couple of
>    hours, and I would argue that this complexity would in fact make
>    such a cache *more* "gross" than the changes I'm suggesting.


I'm not necessarily suggesting this as a replacement for your changes.
I'm just trying to think of how this problem could be attacked in general.
Whether or not your changes are accepted into the baseline, a general 
solution would benefit every app.  Mostly, I'm just thinking out loud (with
the intent of having people poke holes in the idea...)



Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      http://www.rfk.com
118 Washington Street                   (508) 893-9779 - RFK Office
Holliston, MA 01746                     (508) 893-9889 - FAX




* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 10:56 ` Larry Hall (RFK Partners, Inc)
  2001-02-13 11:01   ` jik-cygwin
@ 2001-02-13 11:12   ` Earnie Boyd
  2001-02-13 11:46   ` Christopher Faylor
  2 siblings, 0 replies; 50+ messages in thread
From: Earnie Boyd @ 2001-02-13 11:12 UTC (permalink / raw)
  To: Larry Hall (RFK Partners, Inc); +Cc: cygwin

"Larry Hall (RFK Partners, Inc)" wrote:
> 
> I know this would only address half the problem but I wonder if it would make
> sense to cache the results of ReadFile() so that separate checks for symbolic
> links and executables would result in only 1 ReadFile() call.  This seems
> like a nice general optimization which wouldn't be so "gross"...
> 

Great minds must think alike.

Earnie.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 10:36 Jonathan Kamens
  2001-02-13 10:56 ` Larry Hall (RFK Partners, Inc)
@ 2001-02-13 11:09 ` Earnie Boyd
  2001-02-13 11:15   ` jik-cygwin
  2001-02-13 11:24 ` Eric M. Monsler
  2 siblings, 1 reply; 50+ messages in thread
From: Earnie Boyd @ 2001-02-13 11:09 UTC (permalink / raw)
  To: Jonathan Kamens; +Cc: cygwin

Jonathan Kamens wrote:
> 
> I realize that this is a bit gross.  However, (a) surely it isn't much
> more gross than storing symbolic links inside files and reading files
> to determine whether they should look executable :-), and (b) it
> really does give a drastic performance improvement for the small price
> of not using symbolic links in your source or build tree.
> 
> Please comment.
> 

This sounds very interesting but I believe work to eliminate TWO
ReadFiles would be best; but, I don't know if this is possible.  I have
found the following set of GCC flags to have great impact on the speed
with which Cygwin flies.  Set both the CFLAGS variable and the CXXFLAGS
variable before configuring Cygwin.

-O3 -s -fnative-struct -foptimize-register-move -fgnu-linker -ffast-math
-fnew-exceptions -frerun-loop-opt -frerun-cse-after-loop -fgcse
-fpeephole -fstrength-reduce -fthread-jumps -fexpensive-optimizations
-fvolatile -mpreferred-stack-boundary=8 -march=i686

Earnie.


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 10:56 ` Larry Hall (RFK Partners, Inc)
@ 2001-02-13 11:01   ` jik-cygwin
  2001-02-13 11:14     ` Larry Hall (RFK Partners, Inc)
  2001-02-13 11:35     ` DJ Delorie
  2001-02-13 11:12   ` Earnie Boyd
  2001-02-13 11:46   ` Christopher Faylor
  2 siblings, 2 replies; 50+ messages in thread
From: jik-cygwin @ 2001-02-13 11:01 UTC (permalink / raw)
  To: lhall; +Cc: cygwin

>  Date: Tue, 13 Feb 2001 13:51:50 -0500
>  From: "Larry Hall (RFK Partners, Inc)" <lhall@rfk.com>
>  
>  I know this would only address half the problem but I wonder if it would make
>  sense to cache the results of ReadFile() so that separate checks for symbolic
>  links and executables would result in only 1 ReadFile() call.  This seems
>  like a nice general optimization which wouldn't be so "gross"...

I see three problems with that:

1) The cache would have to be automatically invalidated whenever the
   file is changed, and you'd thus need to check if a file has changed
   before using the cache, and those checks would themselves take
   time.

2) Many of our dependencies are checked over and over again in many
   Makefiles by many different Make processes.  Thus, either the cache
   you propose would have to be global in the Cygwin shared memory
   segment, or the checks would still happen over and over for us.

3) The implementation of such a global cache would be much more
   complex than the simple changes I implemented in only a couple of
   hours, and I would argue that this complexity would in fact make
   such a cache *more* "gross" than the changes I'm suggesting.

jik


* Re: Optimizing away "ReadFile" calls when Make calls stat()
  2001-02-13 10:36 Jonathan Kamens
@ 2001-02-13 10:56 ` Larry Hall (RFK Partners, Inc)
  2001-02-13 11:01   ` jik-cygwin
                     ` (2 more replies)
  2001-02-13 11:09 ` Earnie Boyd
  2001-02-13 11:24 ` Eric M. Monsler
  2 siblings, 3 replies; 50+ messages in thread
From: Larry Hall (RFK Partners, Inc) @ 2001-02-13 10:56 UTC (permalink / raw)
  To: Jonathan Kamens, cygwin

At 01:36 PM 2/13/2001, Jonathan Kamens wrote:
>We use Cygwin to develop a large product (running a build and the test
>suites takes about two hours on a very fast machine); our builds are
>driven by GNU Make.  We compile and test the same product under Linux.
>We've found that builds under Cygwin run several times slower than
>builds under Linux, even on machines of comparable speed, RAM, etc.
>The slowness is seriously impacting the productivity of our developers
>who work on Windows, so we're searching for any way we can to speed up
>Cygwin builds.
>
>We've found that one of the biggest culprits in slowing down the
>Cygwin builds is Make.  The problem is that every time Make does
>stat() to find out the modification time on a dependency to determine
>whether or not its dependents need to be rebuilt, Cygwin calls
>ReadFile on the file twice -- once to determine whether it's a
>symbolic link, and a second time to determine whether it should appear
>to be executable according to stat().  We have thousands of
>dependencies in our Makefiles, and many of those dependencies
>frequently live on network drives, so these calls to ReadFile
>seriously slow things down.
>
>We don't use symbolic links anywhere in our source tree or build
>tree, and Make doesn't really care whether a file is executable when
>deciding whether it is newer than one of its dependents, so both of
>these calls to ReadFile are totally unnecessary to us.  As an
>experiment, I added code to the Cygwin DLL to allow these ReadFile
>calls to be temporarily disabled, and then I compiled a modified
>version of Make which disables the ReadFile calls before calling
>stat() and then turns them back on.
>
>To measure the effect of these changes, I ran "make all" in a build
>tree that was already completely built, so that I would be timing only
>the work Make does to check dependencies, rather than timing actual
>build work.  With the unmodified Make, "make all" takes around six
>minutes; with the modified Make, it takes around three.  We consider
>this a significant improvement.  (However, note that on Linux, "make
>all" when nothing needs to be done takes only 17 seconds, so clearly
>there's still a lot of room for improvement under Cygwin.)
>
>I'm wondering if the maintainers of Cygwin would be willing to
>consider incorporating these changes, if I submit them, into the
>Cygwin DLL and the Cygwin version of Make.  I'm thinking that the DLL
>changes would actually need to be split into two flags -- one to say,
>"Don't call ReadFile to find out whether a file is executable, because
>I don't care about that," and the other to say, "Don't call ReadFile
>to find out if a file is a symbolic link, because I know I'm not using
>any symbolic links."  Then, GNU Make on Cygwin could always set the
>first flag, and it could set the second flag if the user specified
>"--nosymlinks" or something like that.
>
>I realize that this is a bit gross.  However, (a) surely it isn't much
>more gross than storing symbolic links inside files and reading files
>to determine whether they should look executable :-), and (b) it
>really does give a drastic performance improvement for the small price
>of not using symbolic links in your source or build tree.
>
>Please comment.



I know this would only address half the problem but I wonder if it would make
sense to cache the results of ReadFile() so that separate checks for symbolic
links and executables would result in only 1 ReadFile() call.  This seems
like a nice general optimization which wouldn't be so "gross"...
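
Roughly what I have in mind, as a sketch only: read the start of the
file once and run both checks on the same bytes.  The magic values
below ("!<symlink>", "#!", "MZ") are illustrative assumptions and may
not match what the DLL actually tests; fopen/fread stand in for
CreateFile/ReadFile, and all function names are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative magic value; the real check in the Cygwin DLL may differ. */
#define SYMLINK_COOKIE "!<symlink>"

struct header_info {
    int is_symlink;
    int is_executable;
};

/* Classify a header buffer that was read once; both checks share it. */
struct header_info classify_header(const unsigned char *buf, size_t len)
{
    struct header_info info = {0, 0};
    size_t cookie_len = strlen(SYMLINK_COOKIE);
    assert(buf != NULL || len == 0);

    if (len >= cookie_len && memcmp(buf, SYMLINK_COOKIE, cookie_len) == 0)
        info.is_symlink = 1;
    if (len >= 2 && ((buf[0] == '#' && buf[1] == '!') ||
                     (buf[0] == 'M' && buf[1] == 'Z')))
        info.is_executable = 1;
    return info;
}

/* Read the header a single time, then run both checks on the same bytes. */
int probe_file(const char *path, struct header_info *out)
{
    unsigned char buf[64];
    FILE *f;
    size_t n;

    assert(path != NULL && out != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    *out = classify_header(buf, n);
    return 0;
}
```

This halves the reads per stat() but still leaves one read per file.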



Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      http://www.rfk.com
118 Washington Street                   (508) 893-9779 - RFK Office
Holliston, MA 01746                     (508) 893-9889 - FAX




* Optimizing away "ReadFile" calls when Make calls stat()
@ 2001-02-13 10:36 Jonathan Kamens
  2001-02-13 10:56 ` Larry Hall (RFK Partners, Inc)
                   ` (2 more replies)
  0 siblings, 3 replies; 50+ messages in thread
From: Jonathan Kamens @ 2001-02-13 10:36 UTC (permalink / raw)
  To: cygwin

We use Cygwin to develop a large product (running a build and the test
suites takes about two hours on a very fast machine); our builds are
driven by GNU Make.  We compile and test the same product under Linux.
We've found that builds under Cygwin run several times slower than
builds under Linux, even on machines of comparable speed, RAM, etc.
The slowness is seriously impacting the productivity of our developers
who work on Windows, so we're searching for any way we can to speed up
Cygwin builds.

We've found that one of the biggest culprits in slowing down the
Cygwin builds is Make.  The problem is that every time Make does
stat() to find out the modification time on a dependency to determine
whether or not its dependents need to be rebuilt, Cygwin calls
ReadFile on the file twice -- once to determine whether it's a
symbolic link, and a second time to determine whether it should appear
to be executable according to stat().  We have thousands of
dependencies in our Makefiles, and many of those dependencies
frequently live on network drives, so these calls to ReadFile
seriously slow things down.

We don't use symbolic links anywhere in our source tree or build
tree, and Make doesn't really care whether a file is executable when
deciding whether it is newer than one of its dependents, so both of
these calls to ReadFile are totally unnecessary to us.  As an
experiment, I added code to the Cygwin DLL to allow these ReadFile
calls to be temporarily disabled, and then I compiled a modified
version of Make which disables the ReadFile calls before calling
stat() and then turns them back on.

To measure the effect of these changes, I ran "make all" in a build
tree that was already completely built, so that I would be timing only
the work Make does to check dependencies, rather than timing actual
build work.  With the unmodified Make, "make all" takes around six
minutes; with the modified Make, it takes around three.  We consider
this a significant improvement.  (However, note that on Linux, "make
all" when nothing needs to be done takes only 17 seconds, so clearly
there's still a lot of room for improvement under Cygwin.)

I'm wondering if the maintainers of Cygwin would be willing to
consider incorporating these changes, if I submit them, into the
Cygwin DLL and the Cygwin version of Make.  I'm thinking that the DLL
changes would actually need to be split into two flags -- one to say,
"Don't call ReadFile to find out whether a file is executable, because
I don't care about that," and the other to say, "Don't call ReadFile
to find out if a file is a symbolic link, because I know I'm not using
any symbolic links."  Then, GNU Make on Cygwin could always set the
first flag, and it could set the second flag if the user specified
"--nosymlinks" or something like that.

I realize that this is a bit gross.  However, (a) surely it isn't much
more gross than storing symbolic links inside files and reading files
to determine whether they should look executable :-), and (b) it
really does give a drastic performance improvement for the small price
of not using symbolic links in your source or build tree.
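
For concreteness, the two flags described above might look something
like the sketch below.  This is not the code from my experiment; the
flag names, set_stat_probe_flags and sketch_stat are all hypothetical,
and the header read is simulated with fopen/fread rather than
ReadFile so that the example builds anywhere.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical flag names; nothing like this exists in the Cygwin DLL. */
enum {
    SKIP_EXEC_PROBE    = 1 << 0,  /* caller does not care about the execute bit */
    SKIP_SYMLINK_PROBE = 1 << 1   /* caller promises the tree has no symlinks */
};

static unsigned stat_probe_flags;   /* per-process setting, not MT-safe as written */
static unsigned long header_reads;  /* counts simulated ReadFile calls */

/* Stand-in for the content read the DLL performs on each probe. */
static void read_header(const char *path)
{
    char buf[16];
    FILE *f = fopen(path, "rb");
    if (f != NULL) {
        size_t n = fread(buf, 1, sizeof buf, f);
        (void)n;
        fclose(f);
    }
    header_reads++;
}

void set_stat_probe_flags(unsigned flags)
{
    stat_probe_flags = flags;
}

/* stat() wrapper that performs only the probes the caller has not disabled. */
int sketch_stat(const char *path, struct stat *st)
{
    assert(path != NULL && st != NULL);
    if (!(stat_probe_flags & SKIP_SYMLINK_PROBE))
        read_header(path);          /* symlink check */
    if (!(stat_probe_flags & SKIP_EXEC_PROBE))
        read_header(path);          /* executable check */
    return stat(path, st);
}
```

Make would set SKIP_EXEC_PROBE unconditionally and add
SKIP_SYMLINK_PROBE only when the user asks for it, so that with both
set a stat() performs no content reads at all.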

Please comment.

Thanks,

  Jonathan Kamens


Thread overview: 50+ messages
2001-02-14  4:46 Optimizing away "ReadFile" calls when Make calls stat() Bernard Dautrevaux
  -- strict thread matches above, loose matches on Subject: below --
2001-02-16  9:24 Bernard Dautrevaux
2001-02-16 10:17 ` Christopher Faylor
2001-02-14  2:41 Bernard Dautrevaux
2001-02-13 14:15 Puttkammer, Roman
2001-02-13 14:28 ` Christopher Faylor
2001-02-13 10:36 Jonathan Kamens
2001-02-13 10:56 ` Larry Hall (RFK Partners, Inc)
2001-02-13 11:01   ` jik-cygwin
2001-02-13 11:14     ` Larry Hall (RFK Partners, Inc)
2001-02-13 11:18       ` jik-cygwin
2001-02-13 11:26         ` Larry Hall (RFK Partners, Inc)
2001-02-13 11:35     ` DJ Delorie
2001-02-13 11:46       ` jik-cygwin
2001-02-13 11:54         ` DJ Delorie
2001-02-13 11:56           ` Jonathan Kamens
2001-02-13 12:06             ` DJ Delorie
2001-02-13 12:31             ` Larry Hall (RFK Partners, Inc)
2001-02-13 12:22           ` Christopher Faylor
2001-02-13 12:50             ` DJ Delorie
2001-02-14  0:12             ` Egor Duda
2001-02-14  0:17               ` Robert Collins
2001-02-15 11:47               ` Warren Young
2001-02-15 13:14                 ` Larry Hall (RFK Partners, Inc)
2001-02-15 14:17                   ` Christopher Faylor
2001-02-15 14:17                   ` Charles S. Wilson
2001-02-16  1:34                     ` Warren Young
2001-02-16  8:07                       ` Larry Hall (RFK Partners, Inc)
2001-02-16  9:00                         ` Christopher Faylor
2001-02-15 14:19                   ` Jonathan Kamens
2001-02-16  1:14                 ` Egor Duda
2001-02-16  1:29                   ` Warren Young
2001-02-13 15:28         ` Warren Young
2001-02-14  0:48           ` Lothan
2001-02-13 11:12   ` Earnie Boyd
2001-02-13 11:46   ` Christopher Faylor
2001-02-13 11:09 ` Earnie Boyd
2001-02-13 11:15   ` jik-cygwin
2001-02-13 11:48     ` Earnie Boyd
2001-02-13 11:54       ` jik-cygwin
2001-02-13 12:25         ` DJ Delorie
2001-02-13 12:50           ` Larry Hall (RFK Partners, Inc)
2001-02-13 12:51             ` DJ Delorie
2001-02-13 13:37             ` jfaith
2001-02-13 13:50             ` Mumit Khan
2001-02-13 14:13               ` DJ Delorie
2001-02-13 12:11       ` DJ Delorie
2001-02-13 11:24 ` Eric M. Monsler
2001-02-13 11:28   ` jik-cygwin
2001-02-13 12:04     ` Eric M. Monsler
