* Bug in stat()?
@ 2002-04-30 14:23 Eric Blake
2002-04-30 16:49 ` Patch: " Eric Blake
0 siblings, 1 reply; 3+ messages in thread
From: Eric Blake @ 2002-04-30 14:23 UTC (permalink / raw)
To: cygwin
I am running into weird behavior with stat(). I am getting the same
st_ino number for two distinct directories. When using the jikes
compiler on the GNU Classpath project (the upstream source of libjava in
gcc), jikes is keying off of the inode number to determine where to
write .class files. Because the inode number is a duplicate, jikes is
making the wrong choice, and then failing to compile.
I've simplified the demonstration of the problem as follows:
$ cd /tmp
$ cat blah.cpp
#include <sys/stat.h>
#include <stdio.h>
int main(int argc, char** argv)
{
  struct stat status;
  int result;
  /* Cast before printing: the widths of dev_t and ino_t vary by platform. */
  result = stat("./java/net", &status);
  printf("net (%d): %ld, %ld\n", result,
         (long) status.st_dev, (long) status.st_ino);
  result = stat("./java/nio", &status);
  printf("nio (%d): %ld, %ld\n", result,
         (long) status.st_dev, (long) status.st_ino);
  return 0;
}
$ g++ blah.cpp -o blah
$ cd ~/cp/lib
$ rm -Rf java
$ mkdir java java/net
$ /tmp/blah
net (0): 4096, 547532427
nio (-1): 0, 0
$ mkdir java/nio
$ /tmp/blah
net (0): 4096, 547532427
nio (0): 4096, 547532427
$ cd /tmp
$ rm -Rf java
$ mkdir java java/net java/nio
$ /tmp/blah
net (0): 4096, 314387057
nio (0): 4096, 311437853
Notice that in the /tmp directory, stat() correctly gave different
st_ino values for the two newly created directories. However, in
~/cp/lib, BOTH directories are given the inode of 547532427, even though
they are distinct objects.
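For context, the usual POSIX way to decide whether two paths name the same object is to compare st_dev and st_ino together, which is why a duplicated st_ino on one device makes two directories look identical. A minimal sketch of that check follows; same_object() is a hypothetical helper written for illustration and is not taken from the jikes sources.

```c
#include <sys/stat.h>

/* Hypothetical helper (not from jikes): returns 1 if both paths name the
   same file system object, 0 if they differ, -1 if either stat() fails. */
int same_object(const char *a, const char *b)
{
  struct stat sa, sb;

  if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
    return -1;
  /* POSIX identifies an object by the (st_dev, st_ino) pair. */
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

With the values printed above, a check like this would report ./java/net and ./java/nio under ~/cp/lib as the same object, since both st_dev and st_ino match.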
Is there a bug in Cygwin's implementation of stat()? I am running the
latest version of cygwin (1.3.10-1) and gcc (2.95.3-5) unmodified, with
the root directory located at d:\cygwin on a Win98 box.
--
This signature intentionally left boring.
Eric Blake ebb9@email.byu.edu
BYU student, free software programmer
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
* Patch: Re: Bug in stat()?
From: Eric Blake @ 2002-04-30 16:49 UTC (permalink / raw)
To: cygwin; +Cc: Eric Blake
Eric Blake wrote:
>
> I am running into weird behavior with stat(). I am getting the same
> st_ino number for two distinct directories. When using the jikes
> compiler on the GNU Classpath project (the upstream source of libjava in
> gcc), jikes is keying off of the inode number to determine where to
> write .class files. Because the inode number is a duplicate, jikes is
> making the wrong choice, and then failing to compile.
>
After some poking around, I see that stat() fills in st_ino using a hash
function on the absolute file name (since Windows does not understand the
concept of inodes). I confirmed that there is a bug in the hashing function:
the hash is not strong enough, so distinct paths can collide. The culprit is
hash_path_name() in winsup/cygwin/path.cc.
My updated demonstration follows. Notice that it is sensitive to the
directory you run it in: the collision depends on the hash of the current
directory, so the bug does not surface everywhere, although other
directories would certainly show it too.
$ cygpath -aw `pwd`
d:\cygwin\home\eblake\cp\lib
$ cat blah.cpp
#include <sys/stat.h>
#include <stdio.h>
int main(int argc, char** argv)
{
  /* Print results obtained from cygwin */
  /* Cast before printing: the widths of dev_t and ino_t vary by platform. */
  struct stat status;
  int result;
  result = stat("./java", &status);
  printf("./java (%d): %lx, %lx\n", result,
         (unsigned long) status.st_dev, (unsigned long) status.st_ino);
  result = stat("./java/net", &status);
  printf("net (%d): %lx, %lx\n", result,
         (unsigned long) status.st_dev, (unsigned long) status.st_ino);
  result = stat("./java/nio", &status);
  printf("nio (%d): %lx, %lx\n", result,
         (unsigned long) status.st_dev, (unsigned long) status.st_ino);
  /* morph from "./java" to "./java/net" */
  unsigned int hash = 0x210fd907;
  printf("%x\n", hash);
  int ch = '\\';
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  printf("%x\n", hash);
  ch = 'n';
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  printf("%x\n", hash);
  ch = 'e';
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  printf("%x\n", hash);
  ch = 't';
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  printf("%x\n\n", hash);
  /* morph from "./java" to "./java/nio" */
  hash = 0x210fd907;
  printf("%x\n", hash);
  ch = '\\';
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  printf("%x\n", hash);
  ch = 'n';
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  printf("%x\n", hash);
  ch = 'i';
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  printf("%x\n", hash);
  ch = 'o';
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  printf("%x\n", hash);
  return 0;
}
$ ./blah.exe
./java (0): 1000, 210fd907
net (0): 1000, 20a2ae8b
nio (0): 1000, 20a2ae8b
210fd907
29b62f3b
2036a443
29408d82
20a2ae8b
210fd907
29b62f3b
2036a443
294a8d87
20a2ae8b
Notice that, starting from the inode hash of ./java with a cwd of
d:\cygwin\home\eblake\cp\lib, I was able to reproduce the inode hash of both
./java/net and ./java/nio using the two lines
    hash += ch + (ch << 17);
    hash ^= hash >> 2;
from path.cc. The two hashes are different after "\\ne" and "\\ni", but
converge again when appending 't' and 'o' respectively.
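The convergence can be checked by hand: 0x29408d82 + 't' + ('t' << 17) and 0x294a8d87 + 'o' + ('o' << 17) both come to 0x2a288df6, so the final XOR step sees identical input. Below is a small sketch that wraps the two quoted lines in a helper so the trace can be replayed for any suffix. The names old_hash_step() and old_hash_extend() are mine, not from path.cc, and the sketch assumes a 32-bit unsigned int and skips the cyg_tolower() call (the characters here are already lowercase).

```c
/* One step of the old hash: the two lines quoted from path.cc.
   Assumes unsigned int is 32 bits wide. */
unsigned int old_hash_step(unsigned int hash, int ch)
{
  hash += ch + (ch << 17);
  hash ^= hash >> 2;
  return hash;
}

/* Hypothetical wrapper: feed each character of suffix through
   old_hash_step(), starting from an existing hash value. */
unsigned int old_hash_extend(unsigned int hash, const char *suffix)
{
  while (*suffix != '\0')
    hash = old_hash_step(hash, (unsigned char) *suffix++);
  return hash;
}
```

Starting from 0x210fd907, old_hash_extend() returns 0x20a2ae8b for both "\\net" and "\\nio", matching the output above.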
I'm not a hashing expert, but I suggest that you try the hashing algorithm
used in the Java programming language for java.lang.String.hashCode(), as
shown in my patch below. I think it is a stronger hash, and I know that it
would solve my problem, with less computation per character.
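As a rough check of that claim, here is a sketch of the String.hashCode()-style step, hash * 31 + ch, written with the same shift-and-subtract form as the patch. The names new_hash_step() and new_hash_extend() are mine, a 32-bit unsigned int is assumed, and cyg_tolower() is omitted. The starting value 0x210fd907 is reused only for comparison; under the patched function the hash of the cwd itself would be a different number.

```c
/* One step of the proposed hash: hash * 31 + ch, as in Java's
   String.hashCode(). Unsigned arithmetic wraps modulo 2^32. */
unsigned int new_hash_step(unsigned int hash, int ch)
{
  return (hash << 5) - hash + ch;
}

/* Hypothetical wrapper: feed each character of suffix through
   new_hash_step(), starting from an existing hash value. */
unsigned int new_hash_extend(unsigned int hash, const char *suffix)
{
  while (*suffix != '\0')
    hash = new_hash_step(hash, (unsigned char) *suffix++);
  return hash;
}
```

From any common starting value, the results for "\\net" and "\\nio" differ by 31 * ('i' - 'e') + ('o' - 't') = 119, so this pair can no longer collide. Other pairs still can, as with any 32-bit hash.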
2002-04-30 Eric Blake <ebb9@email.byu.edu>
* path.cc (hash_path_name): Improve hash function strength.
$ diff -u path.cc.bak path.cc
--- path.cc.bak Tue Apr 30 16:32:52 2002
+++ path.cc Tue Apr 30 16:40:14 2002
@@ -3136,7 +3136,7 @@
hash = cygheap->cwd.get_hash ();
if (name[0] == '.' && name[1] == '\0')
return hash;
- hash += hash_path_name (hash, "\\");
+ hash = (hash << 5) - hash + '\\';
}
}
@@ -3146,8 +3146,7 @@
do
{
int ch = cyg_tolower(*name);
- hash += ch + (ch << 17);
- hash ^= hash >> 2;
+ hash = (hash << 5) - hash + ch;
}
while (*++name != '\0' &&
!(*name == '\\' && (!name[1] || (name[1] == '.' && !name[2]))));
--
This signature intentionally left boring.
Eric Blake ebb9@email.byu.edu
BYU student, free software programmer
* Re: Patch: Re: Bug in stat()?
From: Corinna Vinschen @ 2002-05-02 9:46 UTC (permalink / raw)
To: cygwin
On Tue, Apr 30, 2002 at 04:54:18PM -0600, Eric Blake wrote:
> Eric Blake wrote:
> > [...]
> 2002-04-30 Eric Blake <ebb9@email.byu.edu>
>
> * path.cc (hash_path_name): Improve hash function strength.
>
> $ diff -u path.cc.bak path.cc
> --- path.cc.bak Tue Apr 30 16:32:52 2002
> +++ path.cc Tue Apr 30 16:40:14 2002
> @@ -3136,7 +3136,7 @@
> hash = cygheap->cwd.get_hash ();
> if (name[0] == '.' && name[1] == '\0')
> return hash;
> - hash += hash_path_name (hash, "\\");
> + hash = (hash << 5) - hash + '\\';
> }
> }
>
> @@ -3146,8 +3146,7 @@
> do
> {
> int ch = cyg_tolower(*name);
> - hash += ch + (ch << 17);
> - hash ^= hash >> 2;
> + hash = (hash << 5) - hash + ch;
> }
> while (*++name != '\0' &&
> !(*name == '\\' && (!name[1] || (name[1] == '.' && !name[2]))));
>
Thanks for that patch. I've applied it. Check out the next
developer snapshot.
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Developer mailto:cygwin@cygwin.com
Red Hat, Inc.