public inbox for cygwin-developers@cygwin.com
 help / color / mirror / Atom feed
* Re: src/winsup/cygwin ChangeLog fhandler_proc.cc f ...
       [not found] <20141009132437.7650.qmail@sourceware.org>
@ 2014-10-09 14:31 ` Eric Blake
  2014-10-09 14:51   ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Blake @ 2014-10-09 14:31 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1677 bytes --]

On 10/09/2014 07:24 AM, corinna@cygwin.com wrote:

> Log message:
> 	* fhandler_proc.cc (fhandler_proc::readdir): Set dirent d_type.
> 	* fhandler_process.cc (fhandler_process::readdir): Ditto.
> 	* fhandler_procnet.cc (fhandler_procnet::readdir): Ditto.
> 	* fhandler_procsys.cc (fhandler_procsys::readdir): Ditto.
> 	* fhandler_procsysvipc.cc (fhandler_procsysvipc::readdir): Ditto.
> 	*  fhandler_virtual.h (virt_ftype_to_dtype): Define new inline function
> 	to generate dirent d_type from virtual_ftype_t.

Most of these look okay; but I have a problem with fhandler_procsys.cc:

> @@ -357,10 +358,17 @@
>  	res = ENMFILE;
>        else
>  	{
> +	  struct stat st;
> +	  char *file = tp.c_get ();
> +
>  	  sys_wcstombs (de->d_name, NAME_MAX + 1, f.dbi.ObjectName.Buffer,
>  			f.dbi.ObjectName.Length / sizeof (WCHAR));
>  	  de->d_ino = hash_path_name (get_ino (), de->d_name);
> -	  de->d_type = 0;
> +	  stpcpy (stpcpy (stpcpy (file, get_name ()), "/"), de->d_name);
> +	  if (!lstat64 (file, &st))
> +	    de->d_type = IFTODT (st.st_mode);
> +	  else
> +	    de->d_type = DT_UNKNOWN;

The whole point of d_type is for optimization, to tell a process when it
can avoid the overhead of an lstat() because the system was able to
obtain the information in a cheaper manner.  But if you have to resort
to an lstat() to get the information, then you are wasting cycles on the
case of a user that doesn't care about d_type.  I'd rather we always
return DT_UNKNOWN if the only way we'd get a better type is by calling
lstat().

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: src/winsup/cygwin ChangeLog fhandler_proc.cc f ...
  2014-10-09 14:31 ` src/winsup/cygwin ChangeLog fhandler_proc.cc f Eric Blake
@ 2014-10-09 14:51   ` Corinna Vinschen
  2014-10-09 15:07     ` Eric Blake
  0 siblings, 1 reply; 6+ messages in thread
From: Corinna Vinschen @ 2014-10-09 14:51 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2313 bytes --]

On Oct  9 08:31, Eric Blake wrote:
> On 10/09/2014 07:24 AM, corinna@cygwin.com wrote:
> 
> > Log message:
> > 	* fhandler_proc.cc (fhandler_proc::readdir): Set dirent d_type.
> > 	* fhandler_process.cc (fhandler_process::readdir): Ditto.
> > 	* fhandler_procnet.cc (fhandler_procnet::readdir): Ditto.
> > 	* fhandler_procsys.cc (fhandler_procsys::readdir): Ditto.
> > 	* fhandler_procsysvipc.cc (fhandler_procsysvipc::readdir): Ditto.
> > 	*  fhandler_virtual.h (virt_ftype_to_dtype): Define new inline function
> > 	to generate dirent d_type from virtual_ftype_t.
> 
> Most of these look okay; but I have a problem with fhandler_procsys.cc:
> 
> > @@ -357,10 +358,17 @@
> >  	res = ENMFILE;
> >        else
> >  	{
> > +	  struct stat st;
> > +	  char *file = tp.c_get ();
> > +
> >  	  sys_wcstombs (de->d_name, NAME_MAX + 1, f.dbi.ObjectName.Buffer,
> >  			f.dbi.ObjectName.Length / sizeof (WCHAR));
> >  	  de->d_ino = hash_path_name (get_ino (), de->d_name);
> > -	  de->d_type = 0;
> > +	  stpcpy (stpcpy (stpcpy (file, get_name ()), "/"), de->d_name);
> > +	  if (!lstat64 (file, &st))
> > +	    de->d_type = IFTODT (st.st_mode);
> > +	  else
> > +	    de->d_type = DT_UNKNOWN;
> 
> The whole point of d_type is for optimization, to tell a process when it
> can avoid the overhead of an lstat() because the system was able to
> obtain the information in a cheaper manner.  But if you have to resort
> to an lstat() to get the information, then you are wasting cycles on the
> case of a user that doesn't care about d_type.  I'd rather we always
> return DT_UNKNOWN if the only way we'd get a better type is by calling
> lstat().

I see.  The idea here was to try and, at least on my machine, it
was still *very* fast, likely because the whole thing occurs only
in globally allocated memory and there's no disk access or paging
involved.

The question is, what exactly do we lose?  /proc/sys isn't often
accessed at all (I guess) and what could be gained?  Yaakov asked
for setting d_type under /proc, so he might enlighten us which
tools make heavy use of the stuff, so the net gain is > 0...


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: src/winsup/cygwin ChangeLog fhandler_proc.cc f ...
  2014-10-09 14:51   ` Corinna Vinschen
@ 2014-10-09 15:07     ` Eric Blake
  2014-10-09 16:22       ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Blake @ 2014-10-09 15:07 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1792 bytes --]

On 10/09/2014 08:51 AM, Corinna Vinschen wrote:
>> The whole point of d_type is for optimization, to tell a process when it
>> can avoid the overhead of an lstat() because the system was able to
>> obtain the information in a cheaper manner.  But if you have to resort
>> to an lstat() to get the information, then you are wasting cycles on the
>> case of a user that doesn't care about d_type.  I'd rather we always
>> return DT_UNKNOWN if the only way we'd get a better type is by calling
>> lstat().
> 
> I see.  The idea here was to try and, at least on my machine, it
> was still *very* fast, likely because the whole thing occurs only
> in globally allocated memory and there's no disk access or paging
> involved.
> 
> The question is, what exactly do we lose?  /proc/sys isn't often
> accessed at all (I guess) and what could be gained?  Yaakov asked
> for setting d_type under /proc, so he might enlighten us which
> tools make heavy use of the stuff, so the net gain is > 0...

Some modes of 'find' and 'ls' (such as ls -F) are faster if d_type is
accurate (because they avoided an lstat); there, returning DT_UNKNOWN
requires the lstat.  In other cases (like ls -l) an lstat is always
required.  Anywhere that lstat is slow, embedding an lstat into d_type
determination as well as a followup lstat is going to make directory
traversal twice as slow (well, maybe the second call is faster because
of caching effects); conversely, anywhere that lstat is not required by
the caller, it is wasted effort during the readdir.  But as you say,
lstat in /proc/sys is mostly stuff in memory and already fast, so maybe
it doesn't hurt to leave it in.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: src/winsup/cygwin ChangeLog fhandler_proc.cc f ...
  2014-10-09 15:07     ` Eric Blake
@ 2014-10-09 16:22       ` Corinna Vinschen
  2014-10-09 17:47         ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Corinna Vinschen @ 2014-10-09 16:22 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2246 bytes --]

On Oct  9 09:07, Eric Blake wrote:
> On 10/09/2014 08:51 AM, Corinna Vinschen wrote:
> >> The whole point of d_type is for optimization, to tell a process when it
> >> can avoid the overhead of an lstat() because the system was able to
> >> obtain the information in a cheaper manner.  But if you have to resort
> >> to an lstat() to get the information, then you are wasting cycles on the
> >> case of a user that doesn't care about d_type.  I'd rather we always
> >> return DT_UNKNOWN if the only way we'd get a better type is by calling
> >> lstat().
> > 
> > I see.  The idea here was to try and, at least on my machine, it
> > was still *very* fast, likely because the whole thing occurs only
> > in globally allocated memory and there's no disk access or paging
> > involved.
> > 
> > The question is, what exactly do we lose?  /proc/sys isn't often
> > accessed at all (I guess) and what could be gained?  Yaakov asked
> > for setting d_type under /proc, so he might enlighten us which
> > tools make heavy use of the stuff, so the net gain is > 0...
> 
> Some modes of 'find' and 'ls' (such as ls -F) are faster if d_type is
> accurate (because they avoided an lstat); there, returning DT_UNKNOWN
> requires the lstat.  In other cases (like ls -l) an lstat is always
> required.  Anywhere that lstat is slow, embedding an lstat into d_type
> determination as well as a followup lstat is going to make directory
> traversal twice as slow (well, maybe the second call is faster because
> of caching effects); conversely, anywhere that lstat is not required by
> the caller, it is wasted effort during the readdir.  But as you say,
> lstat in /proc/sys is mostly stuff in memory and already fast, so maybe
> it doesn't hurt to leave it in.

I made a quick test on 64 bit W8.1 and a non-opimized Cygwin DLL.

  time ls -l --color=always /proc/sys/Device/

takes a constant 0.53 secs without my patch, and a constant 0.83 secs
with my patch.  So it's actually rather noticable, being more than 50%
slower.  It's hard to justify such a hit...


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: src/winsup/cygwin ChangeLog fhandler_proc.cc f ...
  2014-10-09 16:22       ` Corinna Vinschen
@ 2014-10-09 17:47         ` Corinna Vinschen
  2014-10-09 18:07           ` Eric Blake
  0 siblings, 1 reply; 6+ messages in thread
From: Corinna Vinschen @ 2014-10-09 17:47 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2617 bytes --]

On Oct  9 18:22, Corinna Vinschen wrote:
> On Oct  9 09:07, Eric Blake wrote:
> > On 10/09/2014 08:51 AM, Corinna Vinschen wrote:
> > >> The whole point of d_type is for optimization, to tell a process when it
> > >> can avoid the overhead of an lstat() because the system was able to
> > >> obtain the information in a cheaper manner.  But if you have to resort
> > >> to an lstat() to get the information, then you are wasting cycles on the
> > >> case of a user that doesn't care about d_type.  I'd rather we always
> > >> return DT_UNKNOWN if the only way we'd get a better type is by calling
> > >> lstat().
> > > 
> > > I see.  The idea here was to try and, at least on my machine, it
> > > was still *very* fast, likely because the whole thing occurs only
> > > in globally allocated memory and there's no disk access or paging
> > > involved.
> > > 
> > > The question is, what exactly do we lose?  /proc/sys isn't often
> > > accessed at all (I guess) and what could be gained?  Yaakov asked
> > > for setting d_type under /proc, so he might enlighten us which
> > > tools make heavy use of the stuff, so the net gain is > 0...
> > 
> > Some modes of 'find' and 'ls' (such as ls -F) are faster if d_type is
> > accurate (because they avoided an lstat); there, returning DT_UNKNOWN
> > requires the lstat.  In other cases (like ls -l) an lstat is always
> > required.  Anywhere that lstat is slow, embedding an lstat into d_type
> > determination as well as a followup lstat is going to make directory
> > traversal twice as slow (well, maybe the second call is faster because
> > of caching effects); conversely, anywhere that lstat is not required by
> > the caller, it is wasted effort during the readdir.  But as you say,
> > lstat in /proc/sys is mostly stuff in memory and already fast, so maybe
> > it doesn't hurt to leave it in.
> 
> I made a quick test on 64 bit W8.1 and a non-opimized Cygwin DLL.
> 
>   time ls -l --color=always /proc/sys/Device/
> 
> takes a constant 0.53 secs without my patch, and a constant 0.83 secs
> with my patch.  So it's actually rather noticable, being more than 50%
> slower.  It's hard to justify such a hit...

I applied another technique which has no noticable performance hit.  It
doesn't recognize all objects, only directories, symlinks, and partially
character devices, but on the upside it uses only information which has
already been provided anyway.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: src/winsup/cygwin ChangeLog fhandler_proc.cc f ...
  2014-10-09 17:47         ` Corinna Vinschen
@ 2014-10-09 18:07           ` Eric Blake
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Blake @ 2014-10-09 18:07 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 533 bytes --]

On 10/09/2014 11:47 AM, Corinna Vinschen wrote:

> I applied another technique which has no noticable performance hit.  It
> doesn't recognize all objects, only directories, symlinks, and partially
> character devices, but on the upside it uses only information which has
> already been provided anyway.

Yep, that looks a lot cleaner - reusing stuff we have for free is the
whole point of d_type optimization :)

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-10-09 18:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20141009132437.7650.qmail@sourceware.org>
2014-10-09 14:31 ` src/winsup/cygwin ChangeLog fhandler_proc.cc f Eric Blake
2014-10-09 14:51   ` Corinna Vinschen
2014-10-09 15:07     ` Eric Blake
2014-10-09 16:22       ` Corinna Vinschen
2014-10-09 17:47         ` Corinna Vinschen
2014-10-09 18:07           ` Eric Blake

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).