public inbox for java-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Patch: rewrite File.toCanonicalPath() for GNU/Posix systems
@ 2006-04-03 12:32 Gary Benson
  2006-04-05 22:00 ` Tom Tromey
  2006-04-07 15:53 ` Patch: rewrite File.toCanonicalPath() take 2 Gary Benson
  0 siblings, 2 replies; 14+ messages in thread
From: Gary Benson @ 2006-04-03 12:32 UTC (permalink / raw)
  To: java-patches

[-- Attachment #1: Type: text/plain, Size: 788 bytes --]

Hi all,

In order for FilePermission checks to work properly it is necessary to
convert all paths to canonical form before comparing them.  I recently
fixed this in Classpath, but the fix relies on File.toCanonicalPath()
handling symbolic links as other JVMs.  GCJ's current implementation
has a number of drawbacks, which this patch fixes.

The diff of the method itself is not particularly legible, so I've
attached a copy of the new method as well.

Cheers,
Gary

2006-04-03  Gary Benson  <gbenson@redhat.com>

	Partial fix for PR classpath/24895
	java/io/natFilePosix.cc (getCanonicalPath): Rewrite.
	java/io/File.java (getCanonicalPath): Javadoc fix.
	configure.ac: Replace check for realpath with checks for
	lstat and readlink.
	configure: Rebuilt.
	include/config.h.in: Likewise.

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 8763 bytes --]

Index: libjava/java/io/natFilePosix.cc
===================================================================
--- libjava/java/io/natFilePosix.cc	(revision 112529)
+++ libjava/java/io/natFilePosix.cc	(working copy)
@@ -104,91 +104,128 @@
 jstring
 java::io::File::getCanonicalPath (void)
 {
-  // We use `+2' here because we might need to use `.' for our special
-  // case.
-  char *buf = (char *) __builtin_alloca (JvGetStringUTFLength (path) + 2);
-  char buf2[MAXPATHLEN];
-  jsize total = JvGetStringUTFRegion (path, 0, path->length(), buf);
+  jstring path = getAbsolutePath ();
 
-  // Special case: treat "" the same as ".".
-  if (total == 0)
-    buf[total++] = '.';
+  int len = JvGetStringUTFLength (path);
+  if (len >= MAXPATHLEN)
+    throw new IOException (JvNewStringLatin1 ("Path too long"));
 
-  buf[total] = '\0';
+  char *src = (char *) _Jv_Malloc (MAXPATHLEN * 3);
+  char *dst = src + MAXPATHLEN;
+  char *tmp = dst + MAXPATHLEN;
 
-#ifdef HAVE_REALPATH
-  if (realpath (buf, buf2) == NULL)
+  JvGetStringUTFRegion (path, 0, path->length(), src);
+  src[len] = '\0';
+  int srci = 1;
+
+  dst[0] = '/';
+  int dsti = 1;
+
+  bool fschecks = true;
+
+  while (src[srci] != '\0')
     {
-      // If realpath failed, we have to come up with a canonical path
-      // anyway.  We do this with purely textual manipulation.
-      // FIXME: this isn't perfect.  You can construct a case where
-      // we get a different answer from the JDK:
-      // mkdir -p /tmp/a/b/c
-      // ln -s /tmp/a/b /tmp/a/z
-      // ... getCanonicalPath("/tmp/a/z/c/nosuchfile")
-      // We will give /tmp/a/z/c/nosuchfile, while the JDK will
-      // give /tmp/a/b/c/nosuchfile.
-      int out_idx;
-      if (buf[0] != '/')
+      // Skip slashes.
+      while (src[srci] == '/')
+	srci++;
+      int tmpi = srci;
+      // Find next slash.
+      while (src[srci] != '/' && src[srci] != '\0')
+	srci++;
+      if (srci == tmpi)
+	// We hit the end.
+	break;
+      len = srci - tmpi;
+
+      // Handle "." and "..".
+      if (len == 1 && src[tmpi] == '.')
+	continue;
+      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
 	{
-	  // Not absolute, so start with current directory.
-	  if (getcwd (buf2, sizeof (buf2)) == NULL)
-	    throw new IOException ();
-	  out_idx = strlen (buf2);
+	  if (dsti == 1)
+	    {
+	      // Unlike other JVMs we do not rewind past the root
+	      // directory.  I can't see any legitimate reason why you
+	      // would want this, and chopping off bits of path seems
+	      // like a sure-fire way to introduce vulnerabilities.
+	      _Jv_Free (src);
+	      throw new IOException (
+		JvNewStringLatin1 ("Too many up-level references"));
+	    }
+	  while (dsti > 1 && dst[dsti - 1] != '/')
+	    dsti--;
+	  if (dsti != 1)
+	    dsti--;
+	  // Reenable filesystem checking if disabled, as we might
+	  // have reversed over whatever caused the problem before.
+	  // At least one proprietary JVM has inconsistencies because
+	  // it does not do this.
+	  fschecks = true;
+	  continue;
 	}
-      else
+
+      // Handle real path components.
+      if (dsti + len + 1 >= MAXPATHLEN)
 	{
-	  buf2[0] = '/';
-	  out_idx = 1;
-	} 
-      int in_idx = 0;
-      while (buf[in_idx] != '\0')
+	  _Jv_Free (src);
+	  throw new IOException (JvNewStringLatin1 ("Path too long"));
+	}
+      int dsti_save = dsti;
+      if (dsti > 1)
+	dst[dsti++] = '/';
+      strncpy (&dst[dsti], &src[tmpi], len);
+      dsti += len;
+      if (fschecks == false)
+	continue;
+
+#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
+      struct stat sb;
+      dst[dsti] = '\0';
+      if (::lstat (dst, &sb) == 0)
 	{
-	  // Skip '/'s.
-	  while (buf[in_idx] == '/')
-	    ++in_idx;
-	  int elt_start = in_idx;
-	  // Find next '/' or end of path.
-	  while (buf[in_idx] != '\0' && buf[in_idx] != '/')
-	    ++in_idx;
-	  if (in_idx == elt_start)
+	  if (S_ISLNK (sb.st_mode))
 	    {
-	      // An empty component means we've reached the end.
-	      break;
+	      tmpi = ::readlink (dst, tmp, MAXPATHLEN);
+	      if (tmpi < 1 || tmpi == MAXPATHLEN)
+		{
+		  _Jv_Free (src);
+		  throw new IOException (JvNewStringLatin1 ("Path too long"));
+		}
+
+	      // Prepend the link's path to src.
+	      if (tmpi + strlen (&src[srci]) >= MAXPATHLEN)
+		{
+		  _Jv_Free (src);
+		  throw new IOException (JvNewStringLatin1 ("Path too long"));
+		}
+	      while (src[srci] != '\0')
+		tmp[tmpi++] = src[srci++];
+	      tmp[tmpi] = '\0';
+	      strcpy (src, tmp);
+	      srci = 0;
+
+	      // Either replace or append dst depending on whether the
+	      // link is relative or absolute.
+	      dsti = tmp[0] == '/' ? 1 : dsti_save;
 	    }
-	  int len = in_idx - elt_start;
-	  if (len == 1 && buf[in_idx] == '.')
-	    continue;
-	  if (len == 2 && buf[in_idx] == '.' && buf[in_idx + 1] == '.')
-	    {
-	      // Found ".." component, lop off last part from existing
-	      // buffer.
-	      --out_idx;
-	      while (out_idx > 0 && buf2[out_idx] != '/')
-		--out_idx;
-	      // Can't go up past "/".
-	      if (out_idx == 0)
-		++out_idx;
-	    }
-	  else
-	    {
-	      // Append a real path component to the output.
-	      if (out_idx > 1)
-		buf2[out_idx++] = '/';
-	      strncpy (&buf2[out_idx], &buf[elt_start], len);
-	      out_idx += len;
-	    }
 	}
-
-      buf2[out_idx] = '\0';
+      else
+	{
+	  // Something doesn't exist, or we don't have permission to
+	  // read it, or a previous path component is a directory, or
+	  // a symlink is looped.  Whatever, we can't check the
+	  // filesystem any more.
+	  fschecks = false;
+	}
+#endif // HAVE_LSTAT && HAVE_READLINK
     }
+  dst[dsti] = '\0';
 
   // FIXME: what encoding to assume for file names?  This affects many
   // calls.
-  return JvNewStringUTF (buf2);
-#else
-  return JvNewStringUTF (buf);
-#endif
+  path = JvNewStringUTF (dst);
+  _Jv_Free (src);
+  return path;
 }
 
 jboolean
Index: libjava/java/io/File.java
===================================================================
--- libjava/java/io/File.java	(revision 112529)
+++ libjava/java/io/File.java	(working copy)
@@ -508,9 +508,9 @@
   /**
    * This method returns a canonical representation of the pathname of
    * this file.  The actual form of the canonical representation is
-   * different.  On the GNU system, the canonical form differs from the
-   * absolute form in that all relative file references to "." and ".."
-   * are resolved and removed.
+   * system-dependent.  On the GNU system, conversion to canonical
+   * form involves the removal of redundant separators, references to
+   * "." and "..", and symbolic links.
    * <p>
    * Note that this method, unlike the other methods which return path
    * names, can throw an IOException.  This is because native method 
Index: libjava/configure.ac
===================================================================
--- libjava/configure.ac	(revision 112529)
+++ libjava/configure.ac	(working copy)
@@ -895,7 +895,7 @@
 else
    AC_CHECK_FUNCS([strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate])
Index: libjava/configure
===================================================================
--- libjava/configure	(revision 112529)
+++ libjava/configure	(working copy)
@@ -9388,9 +9388,9 @@
 
 
 
 for ac_func in strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate
Index: libjava/include/config.h.in
===================================================================
--- libjava/include/config.h.in	(revision 112529)
+++ libjava/include/config.h.in	(working copy)
@@ -172,6 +172,9 @@
 /* Define to 1 if you have the `localtime_r' function. */
 #undef HAVE_LOCALTIME_R
 
+/* Define to 1 if you have the `lstat' function. */
+#undef HAVE_LSTAT
+
 /* Define to 1 if you have the `memcpy' function. */
 #undef HAVE_MEMCPY
 
@@ -229,8 +232,8 @@
 /* Define to 1 if you have the `readdir_r' function. */
 #undef HAVE_READDIR_R
 
-/* Define to 1 if you have the `realpath' function. */
-#undef HAVE_REALPATH
+/* Define to 1 if you have the `readlink' function. */
+#undef HAVE_READLINK
 
 /* Define to 1 if you have the `rename' function. */
 #undef HAVE_RENAME

[-- Attachment #3: canon.cc --]
[-- Type: text/plain, Size: 3280 bytes --]

jstring
java::io::File::getCanonicalPath (void)
{
  jstring path = getAbsolutePath ();

  int len = JvGetStringUTFLength (path);
  if (len >= MAXPATHLEN)
    throw new IOException (JvNewStringLatin1 ("Path too long"));

  char *src = (char *) _Jv_Malloc (MAXPATHLEN * 3);
  char *dst = src + MAXPATHLEN;
  char *tmp = dst + MAXPATHLEN;

  JvGetStringUTFRegion (path, 0, path->length(), src);
  src[len] = '\0';
  int srci = 1;

  dst[0] = '/';
  int dsti = 1;

  bool fschecks = true;

  while (src[srci] != '\0')
    {
      // Skip slashes.
      while (src[srci] == '/')
	srci++;
      int tmpi = srci;
      // Find next slash.
      while (src[srci] != '/' && src[srci] != '\0')
	srci++;
      if (srci == tmpi)
	// We hit the end.
	break;
      len = srci - tmpi;

      // Handle "." and "..".
      if (len == 1 && src[tmpi] == '.')
	continue;
      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
	{
	  if (dsti == 1)
	    {
	      // Unlike other JVMs we do not rewind past the root
	      // directory.  I can't see any legitimate reason why you
	      // would want this, and chopping off bits of path seems
	      // like a sure-fire way to introduce vulnerabilities.
	      _Jv_Free (src);
	      throw new IOException (
		JvNewStringLatin1 ("Too many up-level references"));
	    }
	  while (dsti > 1 && dst[dsti - 1] != '/')
	    dsti--;
	  if (dsti != 1)
	    dsti--;
	  // Reenable filesystem checking if disabled, as we might
	  // have reversed over whatever caused the problem before.
	  // At least one proprietary JVM has inconsistencies because
	  // it does not do this.
	  fschecks = true;
	  continue;
	}

      // Handle real path components.
      if (dsti + len + 1 >= MAXPATHLEN)
	{
	  _Jv_Free (src);
	  throw new IOException (JvNewStringLatin1 ("Path too long"));
	}
      int dsti_save = dsti;
      if (dsti > 1)
	dst[dsti++] = '/';
      strncpy (&dst[dsti], &src[tmpi], len);
      dsti += len;
      if (fschecks == false)
	continue;

#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
      struct stat sb;
      dst[dsti] = '\0';
      if (::lstat (dst, &sb) == 0)
	{
	  if (S_ISLNK (sb.st_mode))
	    {
	      tmpi = ::readlink (dst, tmp, MAXPATHLEN);
	      if (tmpi < 1 || tmpi == MAXPATHLEN)
		{
		  _Jv_Free (src);
		  throw new IOException (JvNewStringLatin1 ("Path too long"));
		}

	      // Prepend the link's path to src.
	      if (tmpi + strlen (&src[srci]) >= MAXPATHLEN)
		{
		  _Jv_Free (src);
		  throw new IOException (JvNewStringLatin1 ("Path too long"));
		}
	      while (src[srci] != '\0')
		tmp[tmpi++] = src[srci++];
	      tmp[tmpi] = '\0';
	      strcpy (src, tmp);
	      srci = 0;

	      // Either replace or append dst depending on whether the
	      // link is relative or absolute.
	      dsti = tmp[0] == '/' ? 1 : dsti_save;
	    }
	}
      else
	{
	  // Something doesn't exist, or we don't have permission to
	  // read it, or a previous path component is a directory, or
	  // a symlink is looped.  Whatever, we can't check the
	  // filesystem any more.
	  fschecks = false;
	}
#endif // HAVE_LSTAT && HAVE_READLINK
    }
  dst[dsti] = '\0';

  // FIXME: what encoding to assume for file names?  This affects many
  // calls.
  path = JvNewStringUTF (dst);
  _Jv_Free (src);
  return path;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Patch: rewrite File.toCanonicalPath() for GNU/Posix systems
  2006-04-03 12:32 Patch: rewrite File.toCanonicalPath() for GNU/Posix systems Gary Benson
@ 2006-04-05 22:00 ` Tom Tromey
  2006-04-06 10:03   ` Gary Benson
  2006-04-07 15:53 ` Patch: rewrite File.toCanonicalPath() take 2 Gary Benson
  1 sibling, 1 reply; 14+ messages in thread
From: Tom Tromey @ 2006-04-05 22:00 UTC (permalink / raw)
  To: Gary Benson; +Cc: java-patches

>>>>> "Gary" == Gary Benson <gbenson@redhat.com> writes:

Gary> The diff of the method itself is not particularly legible, so I've
Gary> attached a copy of the new method as well.

This is looking good.

I had a couple of questions though.

First:

-  // Special case: treat "" the same as ".".
-  if (total == 0)
-    buf[total++] = '.';

Is this now handled some other way?  I didn't try to trace through the
logic.  I forget if there is a test for this... if not there ought to
be.

Gary> 	      // Unlike other JVMs we do not rewind past the root
Gary> 	      // directory.  I can't see any legitimate reason why you
Gary> 	      // would want this, and chopping off bits of path seems
Gary> 	      // like a sure-fire way to introduce vulnerabilities.

I'm curious about this.  It seems a bit weird that it would be valid
to try to open "../../../foo" but not to get the canonical path name
of that same file.

Tom

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Patch: rewrite File.toCanonicalPath() for GNU/Posix systems
  2006-04-05 22:00 ` Tom Tromey
@ 2006-04-06 10:03   ` Gary Benson
  0 siblings, 0 replies; 14+ messages in thread
From: Gary Benson @ 2006-04-06 10:03 UTC (permalink / raw)
  To: java-patches

Tom Tromey wrote:
> >>>>> "Gary" == Gary Benson <gbenson@redhat.com> writes:
> 
> Gary> The diff of the method itself is not particularly legible,
> Gary> so I've attached a copy of the new method as well.
> 
> This is looking good.
> 
> I had a couple of questions though.
> 
> First:
> 
> -  // Special case: treat "" the same as ".".
> -  if (total == 0)
> -    buf[total++] = '.';
> 
> Is this now handled some other way?  I didn't try to trace through
> the logic.  I forget if there is a test for this... if not there
> ought to be.

The very first call in the method ensures the path is absolute:

+  jstring path = getAbsolutePath ();

So the smallest possible path is "/".  The previous implementation did
not ensure paths were absolute, which was an error.

> Gary>       // Unlike other JVMs we do not rewind past the root
> Gary>       // directory.  I can't see any legitimate reason why you
> Gary>       // would want this, and chopping off bits of path seems
> Gary>       // like a sure-fire way to introduce vulnerabilities.
> 
> I'm curious about this.  It seems a bit weird that it would be valid
> to try to open "../../../foo" but not to get the canonical path name
> of that same file.

It'll give you a canonical path for "../../../foo" so long as your
current directory has at least three elements in it.  It's paths like
"/../foo" that will cause the exception to be thrown.

I added this check because I managed to confuse a JVM into not
resolving symlinks this way:

  File.getCanonicalPath("/bin/view") => "/bin/vi"
  File.getCanonicalPath("/home/../bin/view") => "/bin/vi"
  File.getCanonicalPath("/../bin/view") => "/bin/view"
                                                 ^^^^

I originally thought this couldn't be avoided if you allowed rewinding
past root, but some later changes I made mean my implementation will
cope with this just fine, so the check could be removed.  It all
depends whether you think "/../foo" is a valid path... personally I'm
not sure either way.

Cheers,
Gary

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Patch: rewrite File.toCanonicalPath() take 2
  2006-04-03 12:32 Patch: rewrite File.toCanonicalPath() for GNU/Posix systems Gary Benson
  2006-04-05 22:00 ` Tom Tromey
@ 2006-04-07 15:53 ` Gary Benson
  2006-04-12 10:32   ` Andrew Haley
  1 sibling, 1 reply; 14+ messages in thread
From: Gary Benson @ 2006-04-07 15:53 UTC (permalink / raw)
  To: java-patches

[-- Attachment #1: Type: text/plain, Size: 1134 bytes --]

Hi all,

Here's a second draft of my rewritten File.toCanonicalPath().  If you
missed my previous email, a working File.toCanonicalPath() is required
in order for FilePermission checks to work properly.  GCJ's present
implementation of this has a number of drawbacks, which this patch
fixes.

This patch improves upon my previous one in two ways:

 1. Following comments on the Classpath mailing list, storage for
    paths is now allocated dynamically.
 2. Following comments on this list, the "no-rewind-past-root" check
    has been removed.

As before the diff of the method itself is not very legible so I've
attached a copy of it as well as the patch.

One question I forgot to ask last time, concerning the very first line
of the method.  Do I need to free path somehow, or will it be GCd?

Cheers,
Gary


2006-04-07  Gary Benson  <gbenson@redhat.com>

	Partial fix for PR classpath/24895
	java/io/natFilePosix.cc (getCanonicalPath): Rewrite.
	java/io/File.java (getCanonicalPath): Javadoc fix.
	configure.ac: Replace check for realpath with checks for
	lstat and readlink.
	configure: Rebuilt.
	include/config.h.in: Likewise.

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 8462 bytes --]

Index: libjava/java/io/natFilePosix.cc
===================================================================
--- libjava/java/io/natFilePosix.cc	(revision 112529)
+++ libjava/java/io/natFilePosix.cc	(working copy)
@@ -104,91 +104,121 @@
 jstring
 java::io::File::getCanonicalPath (void)
 {
-  // We use `+2' here because we might need to use `.' for our special
-  // case.
-  char *buf = (char *) __builtin_alloca (JvGetStringUTFLength (path) + 2);
-  char buf2[MAXPATHLEN];
-  jsize total = JvGetStringUTFRegion (path, 0, path->length(), buf);
+  jstring path = getAbsolutePath ();
 
-  // Special case: treat "" the same as ".".
-  if (total == 0)
-    buf[total++] = '.';
+  int len = JvGetStringUTFLength (path);
+  char *src = (char *) _Jv_Malloc (len + 1);
+  JvGetStringUTFRegion (path, 0, path->length(), src);
+  src[len] = '\0';
+  int srci = 1;
 
-  buf[total] = '\0';
+  char *dst = (char *) _Jv_Malloc (2);
+  dst[0] = '/';
+  int dsti = 1;
 
-#ifdef HAVE_REALPATH
-  if (realpath (buf, buf2) == NULL)
+  bool fschecks = true;
+
+  while (src[srci] != '\0')
     {
-      // If realpath failed, we have to come up with a canonical path
-      // anyway.  We do this with purely textual manipulation.
-      // FIXME: this isn't perfect.  You can construct a case where
-      // we get a different answer from the JDK:
-      // mkdir -p /tmp/a/b/c
-      // ln -s /tmp/a/b /tmp/a/z
-      // ... getCanonicalPath("/tmp/a/z/c/nosuchfile")
-      // We will give /tmp/a/z/c/nosuchfile, while the JDK will
-      // give /tmp/a/b/c/nosuchfile.
-      int out_idx;
-      if (buf[0] != '/')
+      // Skip slashes.
+      while (src[srci] == '/')
+	srci++;
+      int tmpi = srci;
+      // Find next slash.
+      while (src[srci] != '/' && src[srci] != '\0')
+	srci++;
+      if (srci == tmpi)
+	// We hit the end.
+	break;
+      len = srci - tmpi;
+
+      // Handle "." and "..".
+      if (len == 1 && src[tmpi] == '.')
+	continue;
+      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
 	{
-	  // Not absolute, so start with current directory.
-	  if (getcwd (buf2, sizeof (buf2)) == NULL)
-	    throw new IOException ();
-	  out_idx = strlen (buf2);
+	  while (dsti > 1 && dst[dsti - 1] != '/')
+	    dsti--;
+	  if (dsti != 1)
+	    dsti--;
+	  dst = (char *) _Jv_Realloc (dst, dsti + 1);
+	  // Reenable filesystem checking if disabled, as we might
+	  // have reversed over whatever caused the problem before.
+	  // At least one proprietary JVM has inconsistencies because
+	  // it does not do this.
+	  fschecks = true;
+	  continue;
 	}
-      else
+
+      // Handle real path components.
+      dst = (char *) _Jv_Realloc (dst, dsti + (dsti > 1 ? 1 : 0) + len + 1);
+      int dsti_save = dsti;
+      if (dsti > 1)
+	dst[dsti++] = '/';
+      strncpy (&dst[dsti], &src[tmpi], len);
+      dsti += len;
+      if (fschecks == false)
+	continue;
+
+#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
+      struct stat sb;
+      dst[dsti] = '\0';
+      if (::lstat (dst, &sb) == 0)
 	{
-	  buf2[0] = '/';
-	  out_idx = 1;
-	} 
-      int in_idx = 0;
-      while (buf[in_idx] != '\0')
-	{
-	  // Skip '/'s.
-	  while (buf[in_idx] == '/')
-	    ++in_idx;
-	  int elt_start = in_idx;
-	  // Find next '/' or end of path.
-	  while (buf[in_idx] != '\0' && buf[in_idx] != '/')
-	    ++in_idx;
-	  if (in_idx == elt_start)
+	  if (S_ISLNK (sb.st_mode))
 	    {
-	      // An empty component means we've reached the end.
-	      break;
+	      int step = 4096;
+	      int size = step;
+	      char *tmp = (char *) _Jv_Malloc (size);
+
+	      while (1)
+		{
+		  tmpi = ::readlink (dst, tmp, size);
+		  if (tmpi < 1)
+		    {
+		      _Jv_Free (src);
+		      _Jv_Free (dst);
+		      _Jv_Free (tmp);
+		      throw new IOException (
+			JvNewStringLatin1 ("readlink failed"));
+		    }
+		  if (tmpi < size)
+		    break;
+		  size += step;
+		  tmp = (char *) _Jv_Realloc (tmp, size);
+		}
+
+	      // Prepend the link's path to src.
+	      tmp = (char *) _Jv_Realloc (tmp, tmpi + strlen (&src[srci]) + 1);
+	      strcpy(&tmp[tmpi], &src[srci]);
+	      _Jv_Free (src);
+	      src = tmp;
+	      srci = 0;
+
+	      // Either replace or append dst depending on whether the
+	      // link is relative or absolute.
+	      dsti = src[0] == '/' ? 1 : dsti_save;
+	      dst = (char *) _Jv_Realloc (dst, dsti + 1);
 	    }
-	  int len = in_idx - elt_start;
-	  if (len == 1 && buf[in_idx] == '.')
-	    continue;
-	  if (len == 2 && buf[in_idx] == '.' && buf[in_idx + 1] == '.')
-	    {
-	      // Found ".." component, lop off last part from existing
-	      // buffer.
-	      --out_idx;
-	      while (out_idx > 0 && buf2[out_idx] != '/')
-		--out_idx;
-	      // Can't go up past "/".
-	      if (out_idx == 0)
-		++out_idx;
-	    }
-	  else
-	    {
-	      // Append a real path component to the output.
-	      if (out_idx > 1)
-		buf2[out_idx++] = '/';
-	      strncpy (&buf2[out_idx], &buf[elt_start], len);
-	      out_idx += len;
-	    }
 	}
-
-      buf2[out_idx] = '\0';
+      else
+	{
+	  // Something doesn't exist, or we don't have permission to
+	  // read it, or a previous path component is a directory, or
+	  // a symlink is looped.  Whatever, we can't check the
+	  // filesystem any more.
+	  fschecks = false;
+	}
+#endif // HAVE_LSTAT && HAVE_READLINK
     }
+  dst[dsti] = '\0';
 
   // FIXME: what encoding to assume for file names?  This affects many
   // calls.
-  return JvNewStringUTF (buf2);
-#else
-  return JvNewStringUTF (buf);
-#endif
+  path = JvNewStringUTF (dst);
+  _Jv_Free (src);
+  _Jv_Free (dst);
+  return path;
 }
 
 jboolean
Index: libjava/java/io/File.java
===================================================================
--- libjava/java/io/File.java	(revision 112529)
+++ libjava/java/io/File.java	(working copy)
@@ -508,9 +508,9 @@
   /**
    * This method returns a canonical representation of the pathname of
    * this file.  The actual form of the canonical representation is
-   * different.  On the GNU system, the canonical form differs from the
-   * absolute form in that all relative file references to "." and ".."
-   * are resolved and removed.
+   * system-dependent.  On the GNU system, conversion to canonical
+   * form involves the removal of redundant separators, references to
+   * "." and "..", and symbolic links.
    * <p>
    * Note that this method, unlike the other methods which return path
    * names, can throw an IOException.  This is because native method 
Index: libjava/configure.ac
===================================================================
--- libjava/configure.ac	(revision 112529)
+++ libjava/configure.ac	(working copy)
@@ -895,7 +895,7 @@
 else
    AC_CHECK_FUNCS([strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate])
Index: libjava/configure
===================================================================
--- libjava/configure	(revision 112529)
+++ libjava/configure	(working copy)
@@ -9388,9 +9388,10 @@
 
 
 
+
 for ac_func in strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate
Index: libjava/include/config.h.in
===================================================================
--- libjava/include/config.h.in	(revision 112529)
+++ libjava/include/config.h.in	(working copy)
@@ -172,6 +172,9 @@
 /* Define to 1 if you have the `localtime_r' function. */
 #undef HAVE_LOCALTIME_R
 
+/* Define to 1 if you have the `lstat' function. */
+#undef HAVE_LSTAT
+
 /* Define to 1 if you have the `memcpy' function. */
 #undef HAVE_MEMCPY
 
@@ -229,8 +232,8 @@
 /* Define to 1 if you have the `readdir_r' function. */
 #undef HAVE_READDIR_R
 
-/* Define to 1 if you have the `realpath' function. */
-#undef HAVE_REALPATH
+/* Define to 1 if you have the `readlink' function. */
+#undef HAVE_READLINK
 
 /* Define to 1 if you have the `rename' function. */
 #undef HAVE_RENAME

[-- Attachment #3: canon.cc --]
[-- Type: text/plain, Size: 2979 bytes --]

jstring
java::io::File::getCanonicalPath (void)
{
  jstring path = getAbsolutePath ();

  int len = JvGetStringUTFLength (path);
  char *src = (char *) _Jv_Malloc (len + 1);
  JvGetStringUTFRegion (path, 0, path->length(), src);
  src[len] = '\0';
  int srci = 1;

  char *dst = (char *) _Jv_Malloc (2);
  dst[0] = '/';
  int dsti = 1;

  bool fschecks = true;

  while (src[srci] != '\0')
    {
      // Skip slashes.
      while (src[srci] == '/')
	srci++;
      int tmpi = srci;
      // Find next slash.
      while (src[srci] != '/' && src[srci] != '\0')
	srci++;
      if (srci == tmpi)
	// We hit the end.
	break;
      len = srci - tmpi;

      // Handle "." and "..".
      if (len == 1 && src[tmpi] == '.')
	continue;
      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
	{
	  while (dsti > 1 && dst[dsti - 1] != '/')
	    dsti--;
	  if (dsti != 1)
	    dsti--;
	  dst = (char *) _Jv_Realloc (dst, dsti + 1);
	  // Reenable filesystem checking if disabled, as we might
	  // have reversed over whatever caused the problem before.
	  // At least one proprietary JVM has inconsistencies because
	  // it does not do this.
	  fschecks = true;
	  continue;
	}

      // Handle real path components.
      dst = (char *) _Jv_Realloc (dst, dsti + (dsti > 1 ? 1 : 0) + len + 1);
      int dsti_save = dsti;
      if (dsti > 1)
	dst[dsti++] = '/';
      strncpy (&dst[dsti], &src[tmpi], len);
      dsti += len;
      if (fschecks == false)
	continue;

#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
      struct stat sb;
      dst[dsti] = '\0';
      if (::lstat (dst, &sb) == 0)
	{
	  if (S_ISLNK (sb.st_mode))
	    {
	      int step = 4096;
	      int size = step;
	      char *tmp = (char *) _Jv_Malloc (size);

	      while (1)
		{
		  tmpi = ::readlink (dst, tmp, size);
		  if (tmpi < 1)
		    {
		      _Jv_Free (src);
		      _Jv_Free (dst);
		      _Jv_Free (tmp);
		      throw new IOException (
			JvNewStringLatin1 ("readlink failed"));
		    }
		  if (tmpi < size)
		    break;
		  size += step;
		  tmp = (char *) _Jv_Realloc (tmp, size);
		}

	      // Prepend the link's path to src.
	      tmp = (char *) _Jv_Realloc (tmp, tmpi + strlen (&src[srci]) + 1);
	      strcpy(&tmp[tmpi], &src[srci]);
	      _Jv_Free (src);
	      src = tmp;
	      srci = 0;

	      // Either replace or append dst depending on whether the
	      // link is relative or absolute.
	      dsti = src[0] == '/' ? 1 : dsti_save;
	      dst = (char *) _Jv_Realloc (dst, dsti + 1);
	    }
	}
      else
	{
	  // Something doesn't exist, or we don't have permission to
	  // read it, or a previous path component is a directory, or
	  // a symlink is looped.  Whatever, we can't check the
	  // filesystem any more.
	  fschecks = false;
	}
#endif // HAVE_LSTAT && HAVE_READLINK
    }
  dst[dsti] = '\0';

  // FIXME: what encoding to assume for file names?  This affects many
  // calls.
  path = JvNewStringUTF (dst);
  _Jv_Free (src);
  _Jv_Free (dst);
  return path;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Patch: rewrite File.toCanonicalPath() take 2
  2006-04-07 15:53 ` Patch: rewrite File.toCanonicalPath() take 2 Gary Benson
@ 2006-04-12 10:32   ` Andrew Haley
  2006-04-19 13:55     ` Patch: rewrite File.toCanonicalPath() take 3 Gary Benson
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Haley @ 2006-04-12 10:32 UTC (permalink / raw)
  To: Gary Benson; +Cc: java-patches

Gary Benson writes:
 > Hi all,
 > 
 > Here's a second draft of my rewritten File.toCanonicalPath().  If you
 > missed my previous email, a working File.toCanonicalPath() is required
 > in order for FilePermission checks to work properly.  GCJ's present
 > implementation of this has a number of drawbacks, which this patch
 > fixes.
 > 
 > This patch improves upon my previous one in two ways:
 > 
 >  1. Following comments on the Classpath mailing list, storage for
 >     paths is now allocated dynamically.
 >  2. Following comments on this list, the "no-rewind-past-root" check
 >     has been removed.
 > 
 > As before the diff of the method itself is not very legible so I've
 > attached a copy of it as well as the patch.
 > 
 > One question I forgot to ask last time, concerning the very first line
 > of the method.  Do I need to free path somehow, or will it be GCd?

It'll be gcd. no problem about that.

The repeated use of realloc() in this method is rather nasty.  It must
surely be possible to grow the buffer in some way that doesn't require
this.  Just pre-allocate the buffer and grow it iff it's insufficient.
256 bytes or so will be adequate in most cases.

I know tromey said that realloc() is not slow when compared with
readlink(), and this is surely true, but I can't accept that we should
unnecessarily make may calls to realloc() for that reason.

Andrew.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Patch: rewrite File.toCanonicalPath() take 3
  2006-04-12 10:32   ` Andrew Haley
@ 2006-04-19 13:55     ` Gary Benson
  2006-05-12  8:16       ` PING: rewrite File.toCanonicalPath() Gary Benson
  0 siblings, 1 reply; 14+ messages in thread
From: Gary Benson @ 2006-04-19 13:55 UTC (permalink / raw)
  To: java-patches

[-- Attachment #1: Type: text/plain, Size: 1262 bytes --]

Andrew Haley wrote:
> Gary Benson writes:
> > Here's a second draft of my rewritten File.toCanonicalPath().  If
> > you missed my previous email, a working File.toCanonicalPath() is
> > required in order for FilePermission checks to work properly.
> > GCJ's present implementation of this has a number of drawbacks,
> > which this patch fixes.
> > 
> > This patch improves upon my previous one in two ways:
> > 
> >  1. Following comments on the Classpath mailing list, storage for
> >     paths is now allocated dynamically.
> >  2. Following comments on this list, the "no-rewind-past-root"
> >     check has been removed.
> > 
> > As before the diff of the method itself is not very legible so
> > I've attached a copy of it as well as the patch.
> 
> The repeated use of realloc() in this method is rather nasty.  It
> must surely be possible to grow the buffer in some way that doesn't
> require this.  Just pre-allocate the buffer and grow it iff it's
> insufficient.  256 bytes or so will be adequate in most cases.
> 
> I know tromey said that realloc() is not slow when compared with
> readlink(), and this is surely true, but I can't accept that we
> should unnecessarily make may calls to realloc() for that reason.

How does this look?

Cheers,
Gary

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 8850 bytes --]

Index: libjava/java/io/natFilePosix.cc
===================================================================
--- libjava/java/io/natFilePosix.cc	(revision 112529)
+++ libjava/java/io/natFilePosix.cc	(working copy)
@@ -101,94 +101,144 @@
 #endif
 }
 
+#define CHUNKLOG 8
+#define CHUNKSIZ (1 << CHUNKLOG)
+
+static int
+nextChunkSize (int size)
+{
+  return ((size >> CHUNKLOG) + ((size & (CHUNKSIZ - 1)) ? 1 : 0)) << CHUNKLOG;
+}
+
+static char *
+maybeGrowBuf (char *buf, int *size, int required)
+{
+  if (required > *size)
+    {
+      *size = nextChunkSize (required);
+      buf = (char *) _Jv_Realloc (buf, *size);
+    }
+  return buf;
+}
+
 jstring
 java::io::File::getCanonicalPath (void)
 {
-  // We use `+2' here because we might need to use `.' for our special
-  // case.
-  char *buf = (char *) __builtin_alloca (JvGetStringUTFLength (path) + 2);
-  char buf2[MAXPATHLEN];
-  jsize total = JvGetStringUTFRegion (path, 0, path->length(), buf);
+  jstring path = getAbsolutePath ();
 
-  // Special case: treat "" the same as ".".
-  if (total == 0)
-    buf[total++] = '.';
+  int len = JvGetStringUTFLength (path);
+  int srcl = nextChunkSize (len + 1);
+  char *src = (char *) _Jv_Malloc (srcl);
+  JvGetStringUTFRegion (path, 0, path->length(), src);
+  src[len] = '\0';
+  int srci = 1;
 
-  buf[total] = '\0';
+  int dstl = nextChunkSize (2);  
+  char *dst = (char *) _Jv_Malloc (dstl);
+  dst[0] = '/';
+  int dsti = 1;
 
-#ifdef HAVE_REALPATH
-  if (realpath (buf, buf2) == NULL)
+  bool fschecks = true;
+
+  while (src[srci] != '\0')
     {
-      // If realpath failed, we have to come up with a canonical path
-      // anyway.  We do this with purely textual manipulation.
-      // FIXME: this isn't perfect.  You can construct a case where
-      // we get a different answer from the JDK:
-      // mkdir -p /tmp/a/b/c
-      // ln -s /tmp/a/b /tmp/a/z
-      // ... getCanonicalPath("/tmp/a/z/c/nosuchfile")
-      // We will give /tmp/a/z/c/nosuchfile, while the JDK will
-      // give /tmp/a/b/c/nosuchfile.
-      int out_idx;
-      if (buf[0] != '/')
+      // Skip slashes.
+      while (src[srci] == '/')
+	srci++;
+      int tmpi = srci;
+      // Find next slash.
+      while (src[srci] != '/' && src[srci] != '\0')
+	srci++;
+      if (srci == tmpi)
+	// We hit the end.
+	break;
+      len = srci - tmpi;
+
+      // Handle "." and "..".
+      if (len == 1 && src[tmpi] == '.')
+	continue;
+      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
 	{
-	  // Not absolute, so start with current directory.
-	  if (getcwd (buf2, sizeof (buf2)) == NULL)
-	    throw new IOException ();
-	  out_idx = strlen (buf2);
+	  while (dsti > 1 && dst[dsti - 1] != '/')
+	    dsti--;
+	  if (dsti != 1)
+	    dsti--;
+	  // Reenable filesystem checking if disabled, as we might
+	  // have reversed over whatever caused the problem before.
+	  // At least one proprietary JVM has inconsistencies because
+	  // it does not do this.
+	  fschecks = true;
+	  continue;
 	}
-      else
+
+      // Handle real path components.
+      dst = maybeGrowBuf (dst, &dstl, dsti + (dsti > 1 ? 1 : 0) + len + 1);
+      int dsti_save = dsti;
+      if (dsti > 1)
+	dst[dsti++] = '/';
+      strncpy (&dst[dsti], &src[tmpi], len);
+      dsti += len;
+      if (fschecks == false)
+	continue;
+
+#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
+      struct stat sb;
+      dst[dsti] = '\0';
+      if (::lstat (dst, &sb) == 0)
 	{
-	  buf2[0] = '/';
-	  out_idx = 1;
-	} 
-      int in_idx = 0;
-      while (buf[in_idx] != '\0')
-	{
-	  // Skip '/'s.
-	  while (buf[in_idx] == '/')
-	    ++in_idx;
-	  int elt_start = in_idx;
-	  // Find next '/' or end of path.
-	  while (buf[in_idx] != '\0' && buf[in_idx] != '/')
-	    ++in_idx;
-	  if (in_idx == elt_start)
+	  if (S_ISLNK (sb.st_mode))
 	    {
-	      // An empty component means we've reached the end.
-	      break;
+	      int tmpl = CHUNKSIZ;
+	      char *tmp = (char *) _Jv_Malloc (tmpl);
+
+	      while (1)
+		{
+		  tmpi = ::readlink (dst, tmp, tmpl);
+		  if (tmpi < 1)
+		    {
+		      _Jv_Free (src);
+		      _Jv_Free (dst);
+		      _Jv_Free (tmp);
+		      throw new IOException (
+			JvNewStringLatin1 ("readlink failed"));
+		    }
+		  if (tmpi < tmpl)
+		    break;
+		  tmpl += CHUNKSIZ;
+		  tmp = (char *) _Jv_Realloc (tmp, tmpl);
+		}
+
+	      // Prepend the link's path to src.
+	      tmp = maybeGrowBuf (tmp, &tmpl, tmpi + strlen (&src[srci]) + 1);
+	      strcpy(&tmp[tmpi], &src[srci]);
+	      _Jv_Free (src);
+	      src = tmp;
+	      srcl = tmpl;
+	      srci = 0;
+
+	      // Either replace or append dst depending on whether the
+	      // link is relative or absolute.
+	      dsti = src[0] == '/' ? 1 : dsti_save;
 	    }
-	  int len = in_idx - elt_start;
-	  if (len == 1 && buf[in_idx] == '.')
-	    continue;
-	  if (len == 2 && buf[in_idx] == '.' && buf[in_idx + 1] == '.')
-	    {
-	      // Found ".." component, lop off last part from existing
-	      // buffer.
-	      --out_idx;
-	      while (out_idx > 0 && buf2[out_idx] != '/')
-		--out_idx;
-	      // Can't go up past "/".
-	      if (out_idx == 0)
-		++out_idx;
-	    }
-	  else
-	    {
-	      // Append a real path component to the output.
-	      if (out_idx > 1)
-		buf2[out_idx++] = '/';
-	      strncpy (&buf2[out_idx], &buf[elt_start], len);
-	      out_idx += len;
-	    }
 	}
-
-      buf2[out_idx] = '\0';
+      else
+	{
+	  // Something doesn't exist, or we don't have permission to
+	  // read it, or a previous path component is a directory, or
+	  // a symlink is looped.  Whatever, we can't check the
+	  // filesystem any more.
+	  fschecks = false;
+	}
+#endif // HAVE_LSTAT && HAVE_READLINK
     }
+  dst[dsti] = '\0';
 
   // FIXME: what encoding to assume for file names?  This affects many
   // calls.
-  return JvNewStringUTF (buf2);
-#else
-  return JvNewStringUTF (buf);
-#endif
+  path = JvNewStringUTF (dst);
+  _Jv_Free (src);
+  _Jv_Free (dst);
+  return path;
 }
 
 jboolean
Index: libjava/java/io/File.java
===================================================================
--- libjava/java/io/File.java	(revision 112529)
+++ libjava/java/io/File.java	(working copy)
@@ -508,9 +508,9 @@
   /**
    * This method returns a canonical representation of the pathname of
    * this file.  The actual form of the canonical representation is
-   * different.  On the GNU system, the canonical form differs from the
-   * absolute form in that all relative file references to "." and ".."
-   * are resolved and removed.
+   * system-dependent.  On the GNU system, conversion to canonical
+   * form involves the removal of redundant separators, references to
+   * "." and "..", and symbolic links.
    * <p>
    * Note that this method, unlike the other methods which return path
    * names, can throw an IOException.  This is because native method 
Index: libjava/configure.ac
===================================================================
--- libjava/configure.ac	(revision 112529)
+++ libjava/configure.ac	(working copy)
@@ -895,7 +895,7 @@
 else
    AC_CHECK_FUNCS([strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate])
Index: libjava/configure
===================================================================
--- libjava/configure	(revision 112529)
+++ libjava/configure	(working copy)
@@ -9388,9 +9388,10 @@
 
 
 
+
 for ac_func in strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate
Index: libjava/include/config.h.in
===================================================================
--- libjava/include/config.h.in	(revision 112529)
+++ libjava/include/config.h.in	(working copy)
@@ -172,6 +172,9 @@
 /* Define to 1 if you have the `localtime_r' function. */
 #undef HAVE_LOCALTIME_R
 
+/* Define to 1 if you have the `lstat' function. */
+#undef HAVE_LSTAT
+
 /* Define to 1 if you have the `memcpy' function. */
 #undef HAVE_MEMCPY
 
@@ -229,8 +232,8 @@
 /* Define to 1 if you have the `readdir_r' function. */
 #undef HAVE_READDIR_R
 
-/* Define to 1 if you have the `realpath' function. */
-#undef HAVE_REALPATH
+/* Define to 1 if you have the `readlink' function. */
+#undef HAVE_READLINK
 
 /* Define to 1 if you have the `rename' function. */
 #undef HAVE_RENAME

[-- Attachment #3: canon.cc --]
[-- Type: text/plain, Size: 3334 bytes --]

#define CHUNKLOG 8
#define CHUNKSIZ (1 << CHUNKLOG)

static int
nextChunkSize (int size)
{
  return ((size >> CHUNKLOG) + ((size & (CHUNKSIZ - 1)) ? 1 : 0)) << CHUNKLOG;
}

static char *
maybeGrowBuf (char *buf, int *size, int required)
{
  if (required > *size)
    {
      *size = nextChunkSize (required);
      buf = (char *) _Jv_Realloc (buf, *size);
    }
  return buf;
}

jstring
java::io::File::getCanonicalPath (void)
{
  jstring path = getAbsolutePath ();

  int len = JvGetStringUTFLength (path);
  int srcl = nextChunkSize (len + 1);
  char *src = (char *) _Jv_Malloc (srcl);
  JvGetStringUTFRegion (path, 0, path->length(), src);
  src[len] = '\0';
  int srci = 1;

  int dstl = nextChunkSize (2);  
  char *dst = (char *) _Jv_Malloc (dstl);
  dst[0] = '/';
  int dsti = 1;

  bool fschecks = true;

  while (src[srci] != '\0')
    {
      // Skip slashes.
      while (src[srci] == '/')
	srci++;
      int tmpi = srci;
      // Find next slash.
      while (src[srci] != '/' && src[srci] != '\0')
	srci++;
      if (srci == tmpi)
	// We hit the end.
	break;
      len = srci - tmpi;

      // Handle "." and "..".
      if (len == 1 && src[tmpi] == '.')
	continue;
      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
	{
	  while (dsti > 1 && dst[dsti - 1] != '/')
	    dsti--;
	  if (dsti != 1)
	    dsti--;
	  // Reenable filesystem checking if disabled, as we might
	  // have reversed over whatever caused the problem before.
	  // At least one proprietary JVM has inconsistencies because
	  // it does not do this.
	  fschecks = true;
	  continue;
	}

      // Handle real path components.
      dst = maybeGrowBuf (dst, &dstl, dsti + (dsti > 1 ? 1 : 0) + len + 1);
      int dsti_save = dsti;
      if (dsti > 1)
	dst[dsti++] = '/';
      strncpy (&dst[dsti], &src[tmpi], len);
      dsti += len;
      if (fschecks == false)
	continue;

#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
      struct stat sb;
      dst[dsti] = '\0';
      if (::lstat (dst, &sb) == 0)
	{
	  if (S_ISLNK (sb.st_mode))
	    {
	      int tmpl = CHUNKSIZ;
	      char *tmp = (char *) _Jv_Malloc (tmpl);

	      while (1)
		{
		  tmpi = ::readlink (dst, tmp, tmpl);
		  if (tmpi < 1)
		    {
		      _Jv_Free (src);
		      _Jv_Free (dst);
		      _Jv_Free (tmp);
		      throw new IOException (
			JvNewStringLatin1 ("readlink failed"));
		    }
		  if (tmpi < tmpl)
		    break;
		  tmpl += CHUNKSIZ;
		  tmp = (char *) _Jv_Realloc (tmp, tmpl);
		}

	      // Prepend the link's path to src.
	      tmp = maybeGrowBuf (tmp, &tmpl, tmpi + strlen (&src[srci]) + 1);
	      strcpy(&tmp[tmpi], &src[srci]);
	      _Jv_Free (src);
	      src = tmp;
	      srcl = tmpl;
	      srci = 0;

	      // Either replace or append dst depending on whether the
	      // link is relative or absolute.
	      dsti = src[0] == '/' ? 1 : dsti_save;
	    }
	}
      else
	{
	  // Something doesn't exist, or we don't have permission to
	  // read it, or a previous path component is a directory, or
	  // a symlink is looped.  Whatever, we can't check the
	  // filesystem any more.
	  fschecks = false;
	}
#endif // HAVE_LSTAT && HAVE_READLINK
    }
  dst[dsti] = '\0';

  // FIXME: what encoding to assume for file names?  This affects many
  // calls.
  path = JvNewStringUTF (dst);
  _Jv_Free (src);
  _Jv_Free (dst);
  return path;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* PING: rewrite File.toCanonicalPath()
  2006-04-19 13:55     ` Patch: rewrite File.toCanonicalPath() take 3 Gary Benson
@ 2006-05-12  8:16       ` Gary Benson
  2006-05-31  8:21         ` Second ping: " Gary Benson
  0 siblings, 1 reply; 14+ messages in thread
From: Gary Benson @ 2006-05-12  8:16 UTC (permalink / raw)
  To: java-patches

[-- Attachment #1: Type: text/plain, Size: 406 bytes --]

Hi all,

A few weeks ago I posted a patch that rewrote File.toCanonicalPath()
on GNU/Posix systems.  Is anything happening with it?  A correct
File.toCanonicalPath() is required in order for FilePermission checks
to work properly, so much of the security framework relies on it.

As before the diff of the method itself is not very legible so I've
attached a copy of it as well as the patch.

Cheers,
Gary

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 8850 bytes --]

Index: libjava/java/io/natFilePosix.cc
===================================================================
--- libjava/java/io/natFilePosix.cc	(revision 112529)
+++ libjava/java/io/natFilePosix.cc	(working copy)
@@ -101,94 +101,144 @@
 #endif
 }
 
+#define CHUNKLOG 8
+#define CHUNKSIZ (1 << CHUNKLOG)
+
+static int
+nextChunkSize (int size)
+{
+  return ((size >> CHUNKLOG) + ((size & (CHUNKSIZ - 1)) ? 1 : 0)) << CHUNKLOG;
+}
+
+static char *
+maybeGrowBuf (char *buf, int *size, int required)
+{
+  if (required > *size)
+    {
+      *size = nextChunkSize (required);
+      buf = (char *) _Jv_Realloc (buf, *size);
+    }
+  return buf;
+}
+
 jstring
 java::io::File::getCanonicalPath (void)
 {
-  // We use `+2' here because we might need to use `.' for our special
-  // case.
-  char *buf = (char *) __builtin_alloca (JvGetStringUTFLength (path) + 2);
-  char buf2[MAXPATHLEN];
-  jsize total = JvGetStringUTFRegion (path, 0, path->length(), buf);
+  jstring path = getAbsolutePath ();
 
-  // Special case: treat "" the same as ".".
-  if (total == 0)
-    buf[total++] = '.';
+  int len = JvGetStringUTFLength (path);
+  int srcl = nextChunkSize (len + 1);
+  char *src = (char *) _Jv_Malloc (srcl);
+  JvGetStringUTFRegion (path, 0, path->length(), src);
+  src[len] = '\0';
+  int srci = 1;
 
-  buf[total] = '\0';
+  int dstl = nextChunkSize (2);  
+  char *dst = (char *) _Jv_Malloc (dstl);
+  dst[0] = '/';
+  int dsti = 1;
 
-#ifdef HAVE_REALPATH
-  if (realpath (buf, buf2) == NULL)
+  bool fschecks = true;
+
+  while (src[srci] != '\0')
     {
-      // If realpath failed, we have to come up with a canonical path
-      // anyway.  We do this with purely textual manipulation.
-      // FIXME: this isn't perfect.  You can construct a case where
-      // we get a different answer from the JDK:
-      // mkdir -p /tmp/a/b/c
-      // ln -s /tmp/a/b /tmp/a/z
-      // ... getCanonicalPath("/tmp/a/z/c/nosuchfile")
-      // We will give /tmp/a/z/c/nosuchfile, while the JDK will
-      // give /tmp/a/b/c/nosuchfile.
-      int out_idx;
-      if (buf[0] != '/')
+      // Skip slashes.
+      while (src[srci] == '/')
+	srci++;
+      int tmpi = srci;
+      // Find next slash.
+      while (src[srci] != '/' && src[srci] != '\0')
+	srci++;
+      if (srci == tmpi)
+	// We hit the end.
+	break;
+      len = srci - tmpi;
+
+      // Handle "." and "..".
+      if (len == 1 && src[tmpi] == '.')
+	continue;
+      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
 	{
-	  // Not absolute, so start with current directory.
-	  if (getcwd (buf2, sizeof (buf2)) == NULL)
-	    throw new IOException ();
-	  out_idx = strlen (buf2);
+	  while (dsti > 1 && dst[dsti - 1] != '/')
+	    dsti--;
+	  if (dsti != 1)
+	    dsti--;
+	  // Reenable filesystem checking if disabled, as we might
+	  // have reversed over whatever caused the problem before.
+	  // At least one proprietary JVM has inconsistencies because
+	  // it does not do this.
+	  fschecks = true;
+	  continue;
 	}
-      else
+
+      // Handle real path components.
+      dst = maybeGrowBuf (dst, &dstl, dsti + (dsti > 1 ? 1 : 0) + len + 1);
+      int dsti_save = dsti;
+      if (dsti > 1)
+	dst[dsti++] = '/';
+      strncpy (&dst[dsti], &src[tmpi], len);
+      dsti += len;
+      if (fschecks == false)
+	continue;
+
+#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
+      struct stat sb;
+      dst[dsti] = '\0';
+      if (::lstat (dst, &sb) == 0)
 	{
-	  buf2[0] = '/';
-	  out_idx = 1;
-	} 
-      int in_idx = 0;
-      while (buf[in_idx] != '\0')
-	{
-	  // Skip '/'s.
-	  while (buf[in_idx] == '/')
-	    ++in_idx;
-	  int elt_start = in_idx;
-	  // Find next '/' or end of path.
-	  while (buf[in_idx] != '\0' && buf[in_idx] != '/')
-	    ++in_idx;
-	  if (in_idx == elt_start)
+	  if (S_ISLNK (sb.st_mode))
 	    {
-	      // An empty component means we've reached the end.
-	      break;
+	      int tmpl = CHUNKSIZ;
+	      char *tmp = (char *) _Jv_Malloc (tmpl);
+
+	      while (1)
+		{
+		  tmpi = ::readlink (dst, tmp, tmpl);
+		  if (tmpi < 1)
+		    {
+		      _Jv_Free (src);
+		      _Jv_Free (dst);
+		      _Jv_Free (tmp);
+		      throw new IOException (
+			JvNewStringLatin1 ("readlink failed"));
+		    }
+		  if (tmpi < tmpl)
+		    break;
+		  tmpl += CHUNKSIZ;
+		  tmp = (char *) _Jv_Realloc (tmp, tmpl);
+		}
+
+	      // Prepend the link's path to src.
+	      tmp = maybeGrowBuf (tmp, &tmpl, tmpi + strlen (&src[srci]) + 1);
+	      strcpy(&tmp[tmpi], &src[srci]);
+	      _Jv_Free (src);
+	      src = tmp;
+	      srcl = tmpl;
+	      srci = 0;
+
+	      // Either replace or append dst depending on whether the
+	      // link is relative or absolute.
+	      dsti = src[0] == '/' ? 1 : dsti_save;
 	    }
-	  int len = in_idx - elt_start;
-	  if (len == 1 && buf[in_idx] == '.')
-	    continue;
-	  if (len == 2 && buf[in_idx] == '.' && buf[in_idx + 1] == '.')
-	    {
-	      // Found ".." component, lop off last part from existing
-	      // buffer.
-	      --out_idx;
-	      while (out_idx > 0 && buf2[out_idx] != '/')
-		--out_idx;
-	      // Can't go up past "/".
-	      if (out_idx == 0)
-		++out_idx;
-	    }
-	  else
-	    {
-	      // Append a real path component to the output.
-	      if (out_idx > 1)
-		buf2[out_idx++] = '/';
-	      strncpy (&buf2[out_idx], &buf[elt_start], len);
-	      out_idx += len;
-	    }
 	}
-
-      buf2[out_idx] = '\0';
+      else
+	{
+	  // Something doesn't exist, or we don't have permission to
+	  // read it, or a previous path component is a directory, or
+	  // a symlink is looped.  Whatever, we can't check the
+	  // filesystem any more.
+	  fschecks = false;
+	}
+#endif // HAVE_LSTAT && HAVE_READLINK
     }
+  dst[dsti] = '\0';
 
   // FIXME: what encoding to assume for file names?  This affects many
   // calls.
-  return JvNewStringUTF (buf2);
-#else
-  return JvNewStringUTF (buf);
-#endif
+  path = JvNewStringUTF (dst);
+  _Jv_Free (src);
+  _Jv_Free (dst);
+  return path;
 }
 
 jboolean
Index: libjava/java/io/File.java
===================================================================
--- libjava/java/io/File.java	(revision 112529)
+++ libjava/java/io/File.java	(working copy)
@@ -508,9 +508,9 @@
   /**
    * This method returns a canonical representation of the pathname of
    * this file.  The actual form of the canonical representation is
-   * different.  On the GNU system, the canonical form differs from the
-   * absolute form in that all relative file references to "." and ".."
-   * are resolved and removed.
+   * system-dependent.  On the GNU system, conversion to canonical
+   * form involves the removal of redundant separators, references to
+   * "." and "..", and symbolic links.
    * <p>
    * Note that this method, unlike the other methods which return path
    * names, can throw an IOException.  This is because native method 
Index: libjava/configure.ac
===================================================================
--- libjava/configure.ac	(revision 112529)
+++ libjava/configure.ac	(working copy)
@@ -895,7 +895,7 @@
 else
    AC_CHECK_FUNCS([strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate])
Index: libjava/configure
===================================================================
--- libjava/configure	(revision 112529)
+++ libjava/configure	(working copy)
@@ -9388,9 +9388,10 @@
 
 
 
+
 for ac_func in strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate
Index: libjava/include/config.h.in
===================================================================
--- libjava/include/config.h.in	(revision 112529)
+++ libjava/include/config.h.in	(working copy)
@@ -172,6 +172,9 @@
 /* Define to 1 if you have the `localtime_r' function. */
 #undef HAVE_LOCALTIME_R
 
+/* Define to 1 if you have the `lstat' function. */
+#undef HAVE_LSTAT
+
 /* Define to 1 if you have the `memcpy' function. */
 #undef HAVE_MEMCPY
 
@@ -229,8 +232,8 @@
 /* Define to 1 if you have the `readdir_r' function. */
 #undef HAVE_READDIR_R
 
-/* Define to 1 if you have the `realpath' function. */
-#undef HAVE_REALPATH
+/* Define to 1 if you have the `readlink' function. */
+#undef HAVE_READLINK
 
 /* Define to 1 if you have the `rename' function. */
 #undef HAVE_RENAME

[-- Attachment #3: canon.cc --]
[-- Type: text/plain, Size: 3334 bytes --]

#define CHUNKLOG 8
#define CHUNKSIZ (1 << CHUNKLOG)

static int
nextChunkSize (int size)
{
  return ((size >> CHUNKLOG) + ((size & (CHUNKSIZ - 1)) ? 1 : 0)) << CHUNKLOG;
}

static char *
maybeGrowBuf (char *buf, int *size, int required)
{
  if (required > *size)
    {
      *size = nextChunkSize (required);
      buf = (char *) _Jv_Realloc (buf, *size);
    }
  return buf;
}

jstring
java::io::File::getCanonicalPath (void)
{
  jstring path = getAbsolutePath ();

  int len = JvGetStringUTFLength (path);
  int srcl = nextChunkSize (len + 1);
  char *src = (char *) _Jv_Malloc (srcl);
  JvGetStringUTFRegion (path, 0, path->length(), src);
  src[len] = '\0';
  int srci = 1;

  int dstl = nextChunkSize (2);  
  char *dst = (char *) _Jv_Malloc (dstl);
  dst[0] = '/';
  int dsti = 1;

  bool fschecks = true;

  while (src[srci] != '\0')
    {
      // Skip slashes.
      while (src[srci] == '/')
	srci++;
      int tmpi = srci;
      // Find next slash.
      while (src[srci] != '/' && src[srci] != '\0')
	srci++;
      if (srci == tmpi)
	// We hit the end.
	break;
      len = srci - tmpi;

      // Handle "." and "..".
      if (len == 1 && src[tmpi] == '.')
	continue;
      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
	{
	  while (dsti > 1 && dst[dsti - 1] != '/')
	    dsti--;
	  if (dsti != 1)
	    dsti--;
	  // Reenable filesystem checking if disabled, as we might
	  // have reversed over whatever caused the problem before.
	  // At least one proprietary JVM has inconsistencies because
	  // it does not do this.
	  fschecks = true;
	  continue;
	}

      // Handle real path components.
      dst = maybeGrowBuf (dst, &dstl, dsti + (dsti > 1 ? 1 : 0) + len + 1);
      int dsti_save = dsti;
      if (dsti > 1)
	dst[dsti++] = '/';
      strncpy (&dst[dsti], &src[tmpi], len);
      dsti += len;
      if (fschecks == false)
	continue;

#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
      struct stat sb;
      dst[dsti] = '\0';
      if (::lstat (dst, &sb) == 0)
	{
	  if (S_ISLNK (sb.st_mode))
	    {
	      int tmpl = CHUNKSIZ;
	      char *tmp = (char *) _Jv_Malloc (tmpl);

	      while (1)
		{
		  tmpi = ::readlink (dst, tmp, tmpl);
		  if (tmpi < 1)
		    {
		      _Jv_Free (src);
		      _Jv_Free (dst);
		      _Jv_Free (tmp);
		      throw new IOException (
			JvNewStringLatin1 ("readlink failed"));
		    }
		  if (tmpi < tmpl)
		    break;
		  tmpl += CHUNKSIZ;
		  tmp = (char *) _Jv_Realloc (tmp, tmpl);
		}

	      // Prepend the link's path to src.
	      tmp = maybeGrowBuf (tmp, &tmpl, tmpi + strlen (&src[srci]) + 1);
	      strcpy(&tmp[tmpi], &src[srci]);
	      _Jv_Free (src);
	      src = tmp;
	      srcl = tmpl;
	      srci = 0;

	      // Either replace or append dst depending on whether the
	      // link is relative or absolute.
	      dsti = src[0] == '/' ? 1 : dsti_save;
	    }
	}
      else
	{
	  // Something doesn't exist, or we don't have permission to
	  // read it, or a previous path component is a directory, or
	  // a symlink is looped.  Whatever, we can't check the
	  // filesystem any more.
	  fschecks = false;
	}
#endif // HAVE_LSTAT && HAVE_READLINK
    }
  dst[dsti] = '\0';

  // FIXME: what encoding to assume for file names?  This affects many
  // calls.
  path = JvNewStringUTF (dst);
  _Jv_Free (src);
  _Jv_Free (dst);
  return path;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Second ping: rewrite File.toCanonicalPath()
  2006-05-12  8:16       ` PING: rewrite File.toCanonicalPath() Gary Benson
@ 2006-05-31  8:21         ` Gary Benson
  2006-06-01  7:01           ` Bryce McKinlay
  2006-06-01  7:54           ` Second ping: " Bryce McKinlay
  0 siblings, 2 replies; 14+ messages in thread
From: Gary Benson @ 2006-05-31  8:21 UTC (permalink / raw)
  To: java-patches

[-- Attachment #1: Type: text/plain, Size: 519 bytes --]

Hi all,

A couple of months ago I posted a patch that rewrote File.toCanonicalPath()
on GNU/Posix systems in order for FilePermission checks to work properly,
so much of the security framework relies on it.  I rewrote it a couple of
times at various people's requests, but this (final?) version met with no
response whatsoever.

Can someone please approve it or tell me what's wrong with it?

As before the diff of the method itself is not very legible so I've
attached a copy of it as well as the patch.

Cheers,
Gary

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 8850 bytes --]

Index: libjava/java/io/natFilePosix.cc
===================================================================
--- libjava/java/io/natFilePosix.cc	(revision 112529)
+++ libjava/java/io/natFilePosix.cc	(working copy)
@@ -101,94 +101,144 @@
 #endif
 }
 
+#define CHUNKLOG 8
+#define CHUNKSIZ (1 << CHUNKLOG)
+
+static int
+nextChunkSize (int size)
+{
+  return ((size >> CHUNKLOG) + ((size & (CHUNKSIZ - 1)) ? 1 : 0)) << CHUNKLOG;
+}
+
+static char *
+maybeGrowBuf (char *buf, int *size, int required)
+{
+  if (required > *size)
+    {
+      *size = nextChunkSize (required);
+      buf = (char *) _Jv_Realloc (buf, *size);
+    }
+  return buf;
+}
+
 jstring
 java::io::File::getCanonicalPath (void)
 {
-  // We use `+2' here because we might need to use `.' for our special
-  // case.
-  char *buf = (char *) __builtin_alloca (JvGetStringUTFLength (path) + 2);
-  char buf2[MAXPATHLEN];
-  jsize total = JvGetStringUTFRegion (path, 0, path->length(), buf);
+  jstring path = getAbsolutePath ();
 
-  // Special case: treat "" the same as ".".
-  if (total == 0)
-    buf[total++] = '.';
+  int len = JvGetStringUTFLength (path);
+  int srcl = nextChunkSize (len + 1);
+  char *src = (char *) _Jv_Malloc (srcl);
+  JvGetStringUTFRegion (path, 0, path->length(), src);
+  src[len] = '\0';
+  int srci = 1;
 
-  buf[total] = '\0';
+  int dstl = nextChunkSize (2);  
+  char *dst = (char *) _Jv_Malloc (dstl);
+  dst[0] = '/';
+  int dsti = 1;
 
-#ifdef HAVE_REALPATH
-  if (realpath (buf, buf2) == NULL)
+  bool fschecks = true;
+
+  while (src[srci] != '\0')
     {
-      // If realpath failed, we have to come up with a canonical path
-      // anyway.  We do this with purely textual manipulation.
-      // FIXME: this isn't perfect.  You can construct a case where
-      // we get a different answer from the JDK:
-      // mkdir -p /tmp/a/b/c
-      // ln -s /tmp/a/b /tmp/a/z
-      // ... getCanonicalPath("/tmp/a/z/c/nosuchfile")
-      // We will give /tmp/a/z/c/nosuchfile, while the JDK will
-      // give /tmp/a/b/c/nosuchfile.
-      int out_idx;
-      if (buf[0] != '/')
+      // Skip slashes.
+      while (src[srci] == '/')
+	srci++;
+      int tmpi = srci;
+      // Find next slash.
+      while (src[srci] != '/' && src[srci] != '\0')
+	srci++;
+      if (srci == tmpi)
+	// We hit the end.
+	break;
+      len = srci - tmpi;
+
+      // Handle "." and "..".
+      if (len == 1 && src[tmpi] == '.')
+	continue;
+      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
 	{
-	  // Not absolute, so start with current directory.
-	  if (getcwd (buf2, sizeof (buf2)) == NULL)
-	    throw new IOException ();
-	  out_idx = strlen (buf2);
+	  while (dsti > 1 && dst[dsti - 1] != '/')
+	    dsti--;
+	  if (dsti != 1)
+	    dsti--;
+	  // Reenable filesystem checking if disabled, as we might
+	  // have reversed over whatever caused the problem before.
+	  // At least one proprietary JVM has inconsistencies because
+	  // it does not do this.
+	  fschecks = true;
+	  continue;
 	}
-      else
+
+      // Handle real path components.
+      dst = maybeGrowBuf (dst, &dstl, dsti + (dsti > 1 ? 1 : 0) + len + 1);
+      int dsti_save = dsti;
+      if (dsti > 1)
+	dst[dsti++] = '/';
+      strncpy (&dst[dsti], &src[tmpi], len);
+      dsti += len;
+      if (fschecks == false)
+	continue;
+
+#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
+      struct stat sb;
+      dst[dsti] = '\0';
+      if (::lstat (dst, &sb) == 0)
 	{
-	  buf2[0] = '/';
-	  out_idx = 1;
-	} 
-      int in_idx = 0;
-      while (buf[in_idx] != '\0')
-	{
-	  // Skip '/'s.
-	  while (buf[in_idx] == '/')
-	    ++in_idx;
-	  int elt_start = in_idx;
-	  // Find next '/' or end of path.
-	  while (buf[in_idx] != '\0' && buf[in_idx] != '/')
-	    ++in_idx;
-	  if (in_idx == elt_start)
+	  if (S_ISLNK (sb.st_mode))
 	    {
-	      // An empty component means we've reached the end.
-	      break;
+	      int tmpl = CHUNKSIZ;
+	      char *tmp = (char *) _Jv_Malloc (tmpl);
+
+	      while (1)
+		{
+		  tmpi = ::readlink (dst, tmp, tmpl);
+		  if (tmpi < 1)
+		    {
+		      _Jv_Free (src);
+		      _Jv_Free (dst);
+		      _Jv_Free (tmp);
+		      throw new IOException (
+			JvNewStringLatin1 ("readlink failed"));
+		    }
+		  if (tmpi < tmpl)
+		    break;
+		  tmpl += CHUNKSIZ;
+		  tmp = (char *) _Jv_Realloc (tmp, tmpl);
+		}
+
+	      // Prepend the link's path to src.
+	      tmp = maybeGrowBuf (tmp, &tmpl, tmpi + strlen (&src[srci]) + 1);
+	      strcpy(&tmp[tmpi], &src[srci]);
+	      _Jv_Free (src);
+	      src = tmp;
+	      srcl = tmpl;
+	      srci = 0;
+
+	      // Either replace or append dst depending on whether the
+	      // link is relative or absolute.
+	      dsti = src[0] == '/' ? 1 : dsti_save;
 	    }
-	  int len = in_idx - elt_start;
-	  if (len == 1 && buf[in_idx] == '.')
-	    continue;
-	  if (len == 2 && buf[in_idx] == '.' && buf[in_idx + 1] == '.')
-	    {
-	      // Found ".." component, lop off last part from existing
-	      // buffer.
-	      --out_idx;
-	      while (out_idx > 0 && buf2[out_idx] != '/')
-		--out_idx;
-	      // Can't go up past "/".
-	      if (out_idx == 0)
-		++out_idx;
-	    }
-	  else
-	    {
-	      // Append a real path component to the output.
-	      if (out_idx > 1)
-		buf2[out_idx++] = '/';
-	      strncpy (&buf2[out_idx], &buf[elt_start], len);
-	      out_idx += len;
-	    }
 	}
-
-      buf2[out_idx] = '\0';
+      else
+	{
+	  // Something doesn't exist, or we don't have permission to
+	  // read it, or a previous path component is a directory, or
+	  // a symlink is looped.  Whatever, we can't check the
+	  // filesystem any more.
+	  fschecks = false;
+	}
+#endif // HAVE_LSTAT && HAVE_READLINK
     }
+  dst[dsti] = '\0';
 
   // FIXME: what encoding to assume for file names?  This affects many
   // calls.
-  return JvNewStringUTF (buf2);
-#else
-  return JvNewStringUTF (buf);
-#endif
+  path = JvNewStringUTF (dst);
+  _Jv_Free (src);
+  _Jv_Free (dst);
+  return path;
 }
 
 jboolean
Index: libjava/java/io/File.java
===================================================================
--- libjava/java/io/File.java	(revision 112529)
+++ libjava/java/io/File.java	(working copy)
@@ -508,9 +508,9 @@
   /**
    * This method returns a canonical representation of the pathname of
    * this file.  The actual form of the canonical representation is
-   * different.  On the GNU system, the canonical form differs from the
-   * absolute form in that all relative file references to "." and ".."
-   * are resolved and removed.
+   * system-dependent.  On the GNU system, conversion to canonical
+   * form involves the removal of redundant separators, references to
+   * "." and "..", and symbolic links.
    * <p>
    * Note that this method, unlike the other methods which return path
    * names, can throw an IOException.  This is because native method 
Index: libjava/configure.ac
===================================================================
--- libjava/configure.ac	(revision 112529)
+++ libjava/configure.ac	(working copy)
@@ -895,7 +895,7 @@
 else
    AC_CHECK_FUNCS([strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate])
Index: libjava/configure
===================================================================
--- libjava/configure	(revision 112529)
+++ libjava/configure	(working copy)
@@ -9388,9 +9388,10 @@
 
 
 
+
 for ac_func in strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate
Index: libjava/include/config.h.in
===================================================================
--- libjava/include/config.h.in	(revision 112529)
+++ libjava/include/config.h.in	(working copy)
@@ -172,6 +172,9 @@
 /* Define to 1 if you have the `localtime_r' function. */
 #undef HAVE_LOCALTIME_R
 
+/* Define to 1 if you have the `lstat' function. */
+#undef HAVE_LSTAT
+
 /* Define to 1 if you have the `memcpy' function. */
 #undef HAVE_MEMCPY
 
@@ -229,8 +232,8 @@
 /* Define to 1 if you have the `readdir_r' function. */
 #undef HAVE_READDIR_R
 
-/* Define to 1 if you have the `realpath' function. */
-#undef HAVE_REALPATH
+/* Define to 1 if you have the `readlink' function. */
+#undef HAVE_READLINK
 
 /* Define to 1 if you have the `rename' function. */
 #undef HAVE_RENAME

[-- Attachment #3: canon.cc --]
[-- Type: text/plain, Size: 3334 bytes --]

#define CHUNKLOG 8
#define CHUNKSIZ (1 << CHUNKLOG)

static int
nextChunkSize (int size)
{
  return ((size >> CHUNKLOG) + ((size & (CHUNKSIZ - 1)) ? 1 : 0)) << CHUNKLOG;
}

static char *
maybeGrowBuf (char *buf, int *size, int required)
{
  if (required > *size)
    {
      *size = nextChunkSize (required);
      buf = (char *) _Jv_Realloc (buf, *size);
    }
  return buf;
}

jstring
java::io::File::getCanonicalPath (void)
{
  jstring path = getAbsolutePath ();

  int len = JvGetStringUTFLength (path);
  int srcl = nextChunkSize (len + 1);
  char *src = (char *) _Jv_Malloc (srcl);
  JvGetStringUTFRegion (path, 0, path->length(), src);
  src[len] = '\0';
  int srci = 1;

  int dstl = nextChunkSize (2);  
  char *dst = (char *) _Jv_Malloc (dstl);
  dst[0] = '/';
  int dsti = 1;

  bool fschecks = true;

  while (src[srci] != '\0')
    {
      // Skip slashes.
      while (src[srci] == '/')
	srci++;
      int tmpi = srci;
      // Find next slash.
      while (src[srci] != '/' && src[srci] != '\0')
	srci++;
      if (srci == tmpi)
	// We hit the end.
	break;
      len = srci - tmpi;

      // Handle "." and "..".
      if (len == 1 && src[tmpi] == '.')
	continue;
      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
	{
	  while (dsti > 1 && dst[dsti - 1] != '/')
	    dsti--;
	  if (dsti != 1)
	    dsti--;
	  // Reenable filesystem checking if disabled, as we might
	  // have reversed over whatever caused the problem before.
	  // At least one proprietary JVM has inconsistencies because
	  // it does not do this.
	  fschecks = true;
	  continue;
	}

      // Handle real path components.
      dst = maybeGrowBuf (dst, &dstl, dsti + (dsti > 1 ? 1 : 0) + len + 1);
      int dsti_save = dsti;
      if (dsti > 1)
	dst[dsti++] = '/';
      strncpy (&dst[dsti], &src[tmpi], len);
      dsti += len;
      if (fschecks == false)
	continue;

#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
      struct stat sb;
      dst[dsti] = '\0';
      if (::lstat (dst, &sb) == 0)
	{
	  if (S_ISLNK (sb.st_mode))
	    {
	      int tmpl = CHUNKSIZ;
	      char *tmp = (char *) _Jv_Malloc (tmpl);

	      while (1)
		{
		  tmpi = ::readlink (dst, tmp, tmpl);
		  if (tmpi < 1)
		    {
		      _Jv_Free (src);
		      _Jv_Free (dst);
		      _Jv_Free (tmp);
		      throw new IOException (
			JvNewStringLatin1 ("readlink failed"));
		    }
		  if (tmpi < tmpl)
		    break;
		  tmpl += CHUNKSIZ;
		  tmp = (char *) _Jv_Realloc (tmp, tmpl);
		}

	      // Prepend the link's path to src.
	      tmp = maybeGrowBuf (tmp, &tmpl, tmpi + strlen (&src[srci]) + 1);
	      strcpy(&tmp[tmpi], &src[srci]);
	      _Jv_Free (src);
	      src = tmp;
	      srcl = tmpl;
	      srci = 0;

	      // Either replace or append dst depending on whether the
	      // link is relative or absolute.
	      dsti = src[0] == '/' ? 1 : dsti_save;
	    }
	}
      else
	{
	  // Something doesn't exist, or we don't have permission to
	  // read it, or a previous path component is a directory, or
	  // a symlink is looped.  Whatever, we can't check the
	  // filesystem any more.
	  fschecks = false;
	}
#endif // HAVE_LSTAT && HAVE_READLINK
    }
  dst[dsti] = '\0';

  // FIXME: what encoding to assume for file names?  This affects many
  // calls.
  path = JvNewStringUTF (dst);
  _Jv_Free (src);
  _Jv_Free (dst);
  return path;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Second ping: rewrite File.toCanonicalPath()
  2006-05-31  8:21         ` Second ping: " Gary Benson
@ 2006-06-01  7:01           ` Bryce McKinlay
  2006-06-05 12:07             ` Gary Benson
  2006-06-06 15:18             ` Patch: FYI: " Gary Benson
  2006-06-01  7:54           ` Second ping: " Bryce McKinlay
  1 sibling, 2 replies; 14+ messages in thread
From: Bryce McKinlay @ 2006-06-01  7:01 UTC (permalink / raw)
  To: java-patches

Gary Benson wrote:
> A couple of months ago I posted a patch that rewrote File.toCanonicalPath()
> on GNU/Posix systems in order for FilePermission checks to work properly,
> so much of the security framework relies on it.  I rewrote it a couple of
> times at various people's requests, but this (final?) version met with no
> response whatsoever.
>
> Can someone please approve it or tell me what's wrong with it?
>
> As before the diff of the method itself is not very legible so I've
> attached a copy of it as well as the patch.
>   

Hi Gary,

Sorry for the delay in reviewing this. This patch looks pretty good to 
me, but I think a few extra comments would help to make it more readable 
- could you add one to nextChunkSize explaining what the algorithm does? 
Also, I think a comment in getCanonicalPath itself briefly explaining 
the approach and why it is necessary would be very helpful. A ChangeLog 
entry is needed as well.

Otherwise, OK - please go ahead and check it in to trunk with these 
changes (you can post the final patch to the list as an FYI).

Bryce

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Second ping: rewrite File.toCanonicalPath()
  2006-05-31  8:21         ` Second ping: " Gary Benson
  2006-06-01  7:01           ` Bryce McKinlay
@ 2006-06-01  7:54           ` Bryce McKinlay
  2006-06-01  8:12             ` Gary Benson
  1 sibling, 1 reply; 14+ messages in thread
From: Bryce McKinlay @ 2006-06-01  7:54 UTC (permalink / raw)
  To: java-patches

Gary Benson wrote:
> A couple of months ago I posted a patch that rewrote File.toCanonicalPath()
> on GNU/Posix systems in order for FilePermission checks to work properly,
> so much of the security framework relies on it.  I rewrote it a couple of
> times at various people's requests, but this (final?) version met with no
> response whatsoever.
>   

I forgot to ask - do you have a test case that could be added to mauve? 
It looks like there is very basic getCanonicalPath testing already, but 
no testing for proper behavior with symlinks, etc.

Bryce

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Second ping: rewrite File.toCanonicalPath()
  2006-06-01  7:54           ` Second ping: " Bryce McKinlay
@ 2006-06-01  8:12             ` Gary Benson
  0 siblings, 0 replies; 14+ messages in thread
From: Gary Benson @ 2006-06-01  8:12 UTC (permalink / raw)
  To: java-patches

Bryce McKinlay wrote:
> Gary Benson wrote:
> > A couple of months ago I posted a patch that rewrote
> > File.toCanonicalPath() on GNU/Posix systems in order for
> > FilePermission checks to work properly, so much of the
> > security framework relies on it.  I rewrote it a couple
> > of times at various people's requests, but this (final?)
> > version met with no response whatsoever.
> 
> I forgot to ask - do you have a test case that could be added
> to mauve?  It looks like there is very basic getCanonicalPath
> testing already, but no testing for proper behavior with symlinks,
> etc.

There's no specific testcase for it, but the traversal tests for
FilePermission cover pretty much every edge case I could think of.

Cheers,
Gary

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Second ping: rewrite File.toCanonicalPath()
  2006-06-01  7:01           ` Bryce McKinlay
@ 2006-06-05 12:07             ` Gary Benson
  2006-06-05 15:13               ` Andrew Haley
  2006-06-06 15:18             ` Patch: FYI: " Gary Benson
  1 sibling, 1 reply; 14+ messages in thread
From: Gary Benson @ 2006-06-05 12:07 UTC (permalink / raw)
  To: java-patches

[-- Attachment #1: Type: text/plain, Size: 868 bytes --]

Bryce McKinlay wrote:
> Gary Benson wrote:
> > A couple of months ago I posted a patch that rewrote
> > File.toCanonicalPath() on GNU/Posix systems in order
> > for FilePermission checks to work properly...
> 
> Sorry for the delay in reviewing this. This patch looks
> pretty good to me, but I think a few extra comments would
> help to make it more readable - could you add one to
> nextChunkSize explaining what the algorithm does?  Also,
> I think a comment in getCanonicalPath itself briefly
> explaining the approach and why it is necessary would be
> very helpful. A ChangeLog entry is needed as well.
> 
> Otherwise, OK - please go ahead and check it in to trunk
> with these changes (you can post the final patch to the
> list as an FYI).

I made the changes but don't have commit access.  Could
someone please commit the attached patch for me?

Cheers,
Gary

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 10513 bytes --]

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 114387)
+++ ChangeLog	(working copy)
@@ -1,3 +1,12 @@
+2006-06-05  Gary Benson  <gbenson@redhat.com>
+
+	* java/io/natFilePosix.cc (getCanonicalPath): Rewritten.
+	* configure.ac: Remove realpath check and add checks for
+	lstat and readlink.
+	* configure: Rebuilt.
+	* include/config.h.in: Likewise.
+	* java/io/File.java: Javadoc fix.
+
 2006-06-03  Paolo Bonzini  <bonzini@gnu.org>
 
 	* scripts/jar.in: Ensure return with argument has non-empty argument.
Index: java/io/natFilePosix.cc
===================================================================
--- java/io/natFilePosix.cc	(revision 114387)
+++ java/io/natFilePosix.cc	(working copy)
@@ -101,94 +101,167 @@
 #endif
 }
 
+// These two methods are used to maintain dynamically allocated
+// buffers for getCanonicalPath without the overhead of calling
+// realloc every time a buffer is modified.  Buffers are sized
+// at the smallest multiple of CHUNKSIZ that is greater than or
+// equal to the desired length.  The default CHUNKSIZ is 256,
+// longer than most paths, so in most cases a getCanonicalPath
+// will require only one malloc per buffer.
+
+#define CHUNKLOG 8
+#define CHUNKSIZ (1 << CHUNKLOG)
+
+static int
+nextChunkSize (int size)
+{
+  return ((size >> CHUNKLOG) + ((size & (CHUNKSIZ - 1)) ? 1 : 0)) << CHUNKLOG;
+}
+
+static char *
+maybeGrowBuf (char *buf, int *size, int required)
+{
+  if (required > *size)
+    {
+      *size = nextChunkSize (required);
+      buf = (char *) _Jv_Realloc (buf, *size);
+    }
+  return buf;
+}
+
+// Return a canonical representation of the pathname of this file.  On
+// the GNU system this involves the removal of redundant separators,
+// references to "." and "..", and symbolic links.
+//
+// The conversion proceeds on a component-by-component basis: symbolic
+// links and references to ".."  are resolved as and when they occur.
+// This means that if "/foo/bar" is a symbolic link to "/baz" then the
+// canonical form of "/foo/bar/.." is "/" and not "/foo".
+//
+// In order to mimic the behaviour of proprietary JVMs, non-existant
+// path components are allowed (a departure from the normal GNU system
+// convention).  This means that if "/foo/bar" is a symbolic link to
+// "/baz", the canonical form of "/non-existant-directory/../foo/bar"
+// is "/baz".
+
 jstring
 java::io::File::getCanonicalPath (void)
 {
-  // We use `+2' here because we might need to use `.' for our special
-  // case.
-  char *buf = (char *) __builtin_alloca (JvGetStringUTFLength (path) + 2);
-  char buf2[MAXPATHLEN];
-  jsize total = JvGetStringUTFRegion (path, 0, path->length(), buf);
+  jstring path = getAbsolutePath ();
 
-  // Special case: treat "" the same as ".".
-  if (total == 0)
-    buf[total++] = '.';
+  int len = JvGetStringUTFLength (path);
+  int srcl = nextChunkSize (len + 1);
+  char *src = (char *) _Jv_Malloc (srcl);
+  JvGetStringUTFRegion (path, 0, path->length(), src);
+  src[len] = '\0';
+  int srci = 1;
 
-  buf[total] = '\0';
+  int dstl = nextChunkSize (2);  
+  char *dst = (char *) _Jv_Malloc (dstl);
+  dst[0] = '/';
+  int dsti = 1;
 
-#ifdef HAVE_REALPATH
-  if (realpath (buf, buf2) == NULL)
+  bool fschecks = true;
+
+  while (src[srci] != '\0')
     {
-      // If realpath failed, we have to come up with a canonical path
-      // anyway.  We do this with purely textual manipulation.
-      // FIXME: this isn't perfect.  You can construct a case where
-      // we get a different answer from the JDK:
-      // mkdir -p /tmp/a/b/c
-      // ln -s /tmp/a/b /tmp/a/z
-      // ... getCanonicalPath("/tmp/a/z/c/nosuchfile")
-      // We will give /tmp/a/z/c/nosuchfile, while the JDK will
-      // give /tmp/a/b/c/nosuchfile.
-      int out_idx;
-      if (buf[0] != '/')
+      // Skip slashes.
+      while (src[srci] == '/')
+	srci++;
+      int tmpi = srci;
+      // Find next slash.
+      while (src[srci] != '/' && src[srci] != '\0')
+	srci++;
+      if (srci == tmpi)
+	// We hit the end.
+	break;
+      len = srci - tmpi;
+
+      // Handle "." and "..".
+      if (len == 1 && src[tmpi] == '.')
+	continue;
+      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
 	{
-	  // Not absolute, so start with current directory.
-	  if (getcwd (buf2, sizeof (buf2)) == NULL)
-	    throw new IOException ();
-	  out_idx = strlen (buf2);
+	  while (dsti > 1 && dst[dsti - 1] != '/')
+	    dsti--;
+	  if (dsti != 1)
+	    dsti--;
+	  // Reenable filesystem checking if disabled, as we might
+	  // have reversed over whatever caused the problem before.
+	  // At least one proprietary JVM has inconsistencies because
+	  // it does not do this.
+	  fschecks = true;
+	  continue;
 	}
-      else
+
+      // Handle real path components.
+      dst = maybeGrowBuf (dst, &dstl, dsti + (dsti > 1 ? 1 : 0) + len + 1);
+      int dsti_save = dsti;
+      if (dsti > 1)
+	dst[dsti++] = '/';
+      strncpy (&dst[dsti], &src[tmpi], len);
+      dsti += len;
+      if (fschecks == false)
+	continue;
+
+#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
+      struct stat sb;
+      dst[dsti] = '\0';
+      if (::lstat (dst, &sb) == 0)
 	{
-	  buf2[0] = '/';
-	  out_idx = 1;
-	} 
-      int in_idx = 0;
-      while (buf[in_idx] != '\0')
-	{
-	  // Skip '/'s.
-	  while (buf[in_idx] == '/')
-	    ++in_idx;
-	  int elt_start = in_idx;
-	  // Find next '/' or end of path.
-	  while (buf[in_idx] != '\0' && buf[in_idx] != '/')
-	    ++in_idx;
-	  if (in_idx == elt_start)
+	  if (S_ISLNK (sb.st_mode))
 	    {
-	      // An empty component means we've reached the end.
-	      break;
+	      int tmpl = CHUNKSIZ;
+	      char *tmp = (char *) _Jv_Malloc (tmpl);
+
+	      while (1)
+		{
+		  tmpi = ::readlink (dst, tmp, tmpl);
+		  if (tmpi < 1)
+		    {
+		      _Jv_Free (src);
+		      _Jv_Free (dst);
+		      _Jv_Free (tmp);
+		      throw new IOException (
+			JvNewStringLatin1 ("readlink failed"));
+		    }
+		  if (tmpi < tmpl)
+		    break;
+		  tmpl += CHUNKSIZ;
+		  tmp = (char *) _Jv_Realloc (tmp, tmpl);
+		}
+
+	      // Prepend the link's path to src.
+	      tmp = maybeGrowBuf (tmp, &tmpl, tmpi + strlen (&src[srci]) + 1);
+	      strcpy(&tmp[tmpi], &src[srci]);
+	      _Jv_Free (src);
+	      src = tmp;
+	      srcl = tmpl;
+	      srci = 0;
+
+	      // Either replace or append dst depending on whether the
+	      // link is relative or absolute.
+	      dsti = src[0] == '/' ? 1 : dsti_save;
 	    }
-	  int len = in_idx - elt_start;
-	  if (len == 1 && buf[in_idx] == '.')
-	    continue;
-	  if (len == 2 && buf[in_idx] == '.' && buf[in_idx + 1] == '.')
-	    {
-	      // Found ".." component, lop off last part from existing
-	      // buffer.
-	      --out_idx;
-	      while (out_idx > 0 && buf2[out_idx] != '/')
-		--out_idx;
-	      // Can't go up past "/".
-	      if (out_idx == 0)
-		++out_idx;
-	    }
-	  else
-	    {
-	      // Append a real path component to the output.
-	      if (out_idx > 1)
-		buf2[out_idx++] = '/';
-	      strncpy (&buf2[out_idx], &buf[elt_start], len);
-	      out_idx += len;
-	    }
 	}
-
-      buf2[out_idx] = '\0';
+      else
+	{
+	  // Something doesn't exist, or we don't have permission to
+	  // read it, or a previous path component is a directory, or
+	  // a symlink is looped.  Whatever, we can't check the
+	  // filesystem any more.
+	  fschecks = false;
+	}
+#endif // HAVE_LSTAT && HAVE_READLINK
     }
+  dst[dsti] = '\0';
 
   // FIXME: what encoding to assume for file names?  This affects many
   // calls.
-  return JvNewStringUTF (buf2);
-#else
-  return JvNewStringUTF (buf);
-#endif
+  path = JvNewStringUTF (dst);
+  _Jv_Free (src);
+  _Jv_Free (dst);
+  return path;
 }
 
 jboolean
Index: configure.ac
===================================================================
--- configure.ac	(revision 114387)
+++ configure.ac	(working copy)
@@ -897,7 +897,7 @@
 else
    AC_CHECK_FUNCS([strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate mmap])
Index: configure
===================================================================
--- configure	(revision 114387)
+++ configure	(working copy)
@@ -10396,7 +10396,7 @@
 
 for ac_func in strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate mmap
Index: include/config.h.in
===================================================================
--- include/config.h.in	(revision 114387)
+++ include/config.h.in	(working copy)
@@ -169,6 +169,9 @@
 /* Define to 1 if you have the `localtime_r' function. */
 #undef HAVE_LOCALTIME_R
 
+/* Define to 1 if you have the `lstat' function. */
+#undef HAVE_LSTAT
+
 /* Define to 1 if you have the `memcpy' function. */
 #undef HAVE_MEMCPY
 
@@ -226,8 +229,8 @@
 /* Define to 1 if you have the `readdir_r' function. */
 #undef HAVE_READDIR_R
 
-/* Define to 1 if you have the `realpath' function. */
-#undef HAVE_REALPATH
+/* Define to 1 if you have the `readlink' function. */
+#undef HAVE_READLINK
 
 /* Define to 1 if you have the `rename' function. */
 #undef HAVE_RENAME
Index: java/io/File.java
===================================================================
--- java/io/File.java	(revision 114387)
+++ java/io/File.java	(working copy)
@@ -508,9 +508,9 @@
   /**
    * This method returns a canonical representation of the pathname of
    * this file.  The actual form of the canonical representation is
-   * different.  On the GNU system, the canonical form differs from the
-   * absolute form in that all relative file references to "." and ".."
-   * are resolved and removed.
+   * system-dependent.  On the GNU system, conversion to canonical
+   * form involves the removal of redundant separators, references to
+   * "." and "..", and symbolic links.
    * <p>
    * Note that this method, unlike the other methods which return path
    * names, can throw an IOException.  This is because native method 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Second ping: rewrite File.toCanonicalPath()
  2006-06-05 12:07             ` Gary Benson
@ 2006-06-05 15:13               ` Andrew Haley
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Haley @ 2006-06-05 15:13 UTC (permalink / raw)
  To: Gary Benson; +Cc: java-patches

Gary Benson writes:
 > Bryce McKinlay wrote:
 > > Gary Benson wrote:
 > > > A couple of months ago I posted a patch that rewrote
 > > > File.toCanonicalPath() on GNU/Posix systems in order
 > > > for FilePermission checks to work properly...
 > > 
 > > Sorry for the delay in reviewing this. This patch looks
 > > pretty good to me, but I think a few extra comments would
 > > help to make it more readable - could you add one to
 > > nextChunkSize explaining what the algorithm does?  Also,
 > > I think a comment in getCanonicalPath itself briefly
 > > explaining the approach and why it is necessary would be
 > > very helpful. A ChangeLog entry is needed as well.
 > > 
 > > Otherwise, OK - please go ahead and check it in to trunk
 > > with these changes (you can post the final patch to the
 > > list as an FYI).
 > 
 > I made the changes but don't have commit access.  Could
 > someone please commit the attached patch for me?

They could, but I suggest you apply for commit access.  You'll need it
if you're going to do much gcj security work.

http://gcc.gnu.org/svnwrite.html

Andrew.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Patch: FYI: rewrite File.toCanonicalPath()
  2006-06-01  7:01           ` Bryce McKinlay
  2006-06-05 12:07             ` Gary Benson
@ 2006-06-06 15:18             ` Gary Benson
  1 sibling, 0 replies; 14+ messages in thread
From: Gary Benson @ 2006-06-06 15:18 UTC (permalink / raw)
  To: java-patches

[-- Attachment #1: Type: text/plain, Size: 450 bytes --]

Hi all,

I just checked in a rewrite of File.toCanonicalPath() for GNU/Posix
systems to make it correctly handle symbolic links as a partial fix
for PR 24895.

Cheers,
Gary

2006-06-06  Gary Benson  <gbenson@redhat.com>

	* java/io/natFilePosix.cc (getCanonicalPath): Rewritten.
	* configure.ac: Remove realpath check and add checks for
	lstat and readlink.
	* configure: Rebuilt.
	* include/config.h.in: Likewise.
	* java/io/File.java: Javadoc fix.

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 9944 bytes --]

Index: java/io/natFilePosix.cc
===================================================================
--- java/io/natFilePosix.cc	(revision 114425)
+++ java/io/natFilePosix.cc	(working copy)
@@ -101,94 +101,167 @@
 #endif
 }
 
+// These two methods are used to maintain dynamically allocated
+// buffers for getCanonicalPath without the overhead of calling
+// realloc every time a buffer is modified.  Buffers are sized
+// at the smallest multiple of CHUNKSIZ that is greater than or
+// equal to the desired length.  The default CHUNKSIZ is 256,
+// longer than most paths, so in most cases a getCanonicalPath
+// will require only one malloc per buffer.
+
+#define CHUNKLOG 8
+#define CHUNKSIZ (1 << CHUNKLOG)
+
+static int
+nextChunkSize (int size)
+{
+  return ((size >> CHUNKLOG) + ((size & (CHUNKSIZ - 1)) ? 1 : 0)) << CHUNKLOG;
+}
+
+static char *
+maybeGrowBuf (char *buf, int *size, int required)
+{
+  if (required > *size)
+    {
+      *size = nextChunkSize (required);
+      buf = (char *) _Jv_Realloc (buf, *size);
+    }
+  return buf;
+}
+
+// Return a canonical representation of the pathname of this file.  On
+// the GNU system this involves the removal of redundant separators,
+// references to "." and "..", and symbolic links.
+//
+// The conversion proceeds on a component-by-component basis: symbolic
+// links and references to ".."  are resolved as and when they occur.
+// This means that if "/foo/bar" is a symbolic link to "/baz" then the
+// canonical form of "/foo/bar/.." is "/" and not "/foo".
+//
+// In order to mimic the behaviour of proprietary JVMs, non-existant
+// path components are allowed (a departure from the normal GNU system
+// convention).  This means that if "/foo/bar" is a symbolic link to
+// "/baz", the canonical form of "/non-existant-directory/../foo/bar"
+// is "/baz".
+
 jstring
 java::io::File::getCanonicalPath (void)
 {
-  // We use `+2' here because we might need to use `.' for our special
-  // case.
-  char *buf = (char *) __builtin_alloca (JvGetStringUTFLength (path) + 2);
-  char buf2[MAXPATHLEN];
-  jsize total = JvGetStringUTFRegion (path, 0, path->length(), buf);
+  jstring path = getAbsolutePath ();
 
-  // Special case: treat "" the same as ".".
-  if (total == 0)
-    buf[total++] = '.';
+  int len = JvGetStringUTFLength (path);
+  int srcl = nextChunkSize (len + 1);
+  char *src = (char *) _Jv_Malloc (srcl);
+  JvGetStringUTFRegion (path, 0, path->length(), src);
+  src[len] = '\0';
+  int srci = 1;
 
-  buf[total] = '\0';
+  int dstl = nextChunkSize (2);  
+  char *dst = (char *) _Jv_Malloc (dstl);
+  dst[0] = '/';
+  int dsti = 1;
 
-#ifdef HAVE_REALPATH
-  if (realpath (buf, buf2) == NULL)
+  bool fschecks = true;
+
+  while (src[srci] != '\0')
     {
-      // If realpath failed, we have to come up with a canonical path
-      // anyway.  We do this with purely textual manipulation.
-      // FIXME: this isn't perfect.  You can construct a case where
-      // we get a different answer from the JDK:
-      // mkdir -p /tmp/a/b/c
-      // ln -s /tmp/a/b /tmp/a/z
-      // ... getCanonicalPath("/tmp/a/z/c/nosuchfile")
-      // We will give /tmp/a/z/c/nosuchfile, while the JDK will
-      // give /tmp/a/b/c/nosuchfile.
-      int out_idx;
-      if (buf[0] != '/')
+      // Skip slashes.
+      while (src[srci] == '/')
+	srci++;
+      int tmpi = srci;
+      // Find next slash.
+      while (src[srci] != '/' && src[srci] != '\0')
+	srci++;
+      if (srci == tmpi)
+	// We hit the end.
+	break;
+      len = srci - tmpi;
+
+      // Handle "." and "..".
+      if (len == 1 && src[tmpi] == '.')
+	continue;
+      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
 	{
-	  // Not absolute, so start with current directory.
-	  if (getcwd (buf2, sizeof (buf2)) == NULL)
-	    throw new IOException ();
-	  out_idx = strlen (buf2);
+	  while (dsti > 1 && dst[dsti - 1] != '/')
+	    dsti--;
+	  if (dsti != 1)
+	    dsti--;
+	  // Reenable filesystem checking if disabled, as we might
+	  // have reversed over whatever caused the problem before.
+	  // At least one proprietary JVM has inconsistencies because
+	  // it does not do this.
+	  fschecks = true;
+	  continue;
 	}
-      else
+
+      // Handle real path components.
+      dst = maybeGrowBuf (dst, &dstl, dsti + (dsti > 1 ? 1 : 0) + len + 1);
+      int dsti_save = dsti;
+      if (dsti > 1)
+	dst[dsti++] = '/';
+      strncpy (&dst[dsti], &src[tmpi], len);
+      dsti += len;
+      if (fschecks == false)
+	continue;
+
+#if defined (HAVE_LSTAT) && defined (HAVE_READLINK)
+      struct stat sb;
+      dst[dsti] = '\0';
+      if (::lstat (dst, &sb) == 0)
 	{
-	  buf2[0] = '/';
-	  out_idx = 1;
-	} 
-      int in_idx = 0;
-      while (buf[in_idx] != '\0')
-	{
-	  // Skip '/'s.
-	  while (buf[in_idx] == '/')
-	    ++in_idx;
-	  int elt_start = in_idx;
-	  // Find next '/' or end of path.
-	  while (buf[in_idx] != '\0' && buf[in_idx] != '/')
-	    ++in_idx;
-	  if (in_idx == elt_start)
+	  if (S_ISLNK (sb.st_mode))
 	    {
-	      // An empty component means we've reached the end.
-	      break;
+	      int tmpl = CHUNKSIZ;
+	      char *tmp = (char *) _Jv_Malloc (tmpl);
+
+	      while (1)
+		{
+		  tmpi = ::readlink (dst, tmp, tmpl);
+		  if (tmpi < 1)
+		    {
+		      _Jv_Free (src);
+		      _Jv_Free (dst);
+		      _Jv_Free (tmp);
+		      throw new IOException (
+			JvNewStringLatin1 ("readlink failed"));
+		    }
+		  if (tmpi < tmpl)
+		    break;
+		  tmpl += CHUNKSIZ;
+		  tmp = (char *) _Jv_Realloc (tmp, tmpl);
+		}
+
+	      // Prepend the link's path to src.
+	      tmp = maybeGrowBuf (tmp, &tmpl, tmpi + strlen (&src[srci]) + 1);
+	      strcpy(&tmp[tmpi], &src[srci]);
+	      _Jv_Free (src);
+	      src = tmp;
+	      srcl = tmpl;
+	      srci = 0;
+
+	      // Either replace or append dst depending on whether the
+	      // link is relative or absolute.
+	      dsti = src[0] == '/' ? 1 : dsti_save;
 	    }
-	  int len = in_idx - elt_start;
-	  if (len == 1 && buf[in_idx] == '.')
-	    continue;
-	  if (len == 2 && buf[in_idx] == '.' && buf[in_idx + 1] == '.')
-	    {
-	      // Found ".." component, lop off last part from existing
-	      // buffer.
-	      --out_idx;
-	      while (out_idx > 0 && buf2[out_idx] != '/')
-		--out_idx;
-	      // Can't go up past "/".
-	      if (out_idx == 0)
-		++out_idx;
-	    }
-	  else
-	    {
-	      // Append a real path component to the output.
-	      if (out_idx > 1)
-		buf2[out_idx++] = '/';
-	      strncpy (&buf2[out_idx], &buf[elt_start], len);
-	      out_idx += len;
-	    }
 	}
-
-      buf2[out_idx] = '\0';
+      else
+	{
+	  // Something doesn't exist, or we don't have permission to
+	  // read it, or a previous path component is a directory, or
+	  // a symlink is looped.  Whatever, we can't check the
+	  // filesystem any more.
+	  fschecks = false;
+	}
+#endif // HAVE_LSTAT && HAVE_READLINK
     }
+  dst[dsti] = '\0';
 
   // FIXME: what encoding to assume for file names?  This affects many
   // calls.
-  return JvNewStringUTF (buf2);
-#else
-  return JvNewStringUTF (buf);
-#endif
+  path = JvNewStringUTF (dst);
+  _Jv_Free (src);
+  _Jv_Free (dst);
+  return path;
 }
 
 jboolean
Index: configure.ac
===================================================================
--- configure.ac	(revision 114425)
+++ configure.ac	(working copy)
@@ -897,7 +897,7 @@
 else
    AC_CHECK_FUNCS([strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate mmap])
Index: configure
===================================================================
--- configure	(revision 114425)
+++ configure	(working copy)
@@ -10396,7 +10396,7 @@
 
 for ac_func in strerror ioctl select fstat open fsync sleep opendir \
                    gmtime_r localtime_r readdir_r getpwuid_r getcwd \
-		   access stat mkdir rename rmdir unlink realpath utime chmod \
+		   access stat lstat mkdir rename rmdir unlink utime chmod readlink \
 		   nl_langinfo setlocale \
 		   inet_pton uname inet_ntoa \
 		   fork execvp pipe sigaction ftruncate mmap
Index: include/config.h.in
===================================================================
--- include/config.h.in	(revision 114425)
+++ include/config.h.in	(working copy)
@@ -169,6 +169,9 @@
 /* Define to 1 if you have the `localtime_r' function. */
 #undef HAVE_LOCALTIME_R
 
+/* Define to 1 if you have the `lstat' function. */
+#undef HAVE_LSTAT
+
 /* Define to 1 if you have the `memcpy' function. */
 #undef HAVE_MEMCPY
 
@@ -226,8 +229,8 @@
 /* Define to 1 if you have the `readdir_r' function. */
 #undef HAVE_READDIR_R
 
-/* Define to 1 if you have the `realpath' function. */
-#undef HAVE_REALPATH
+/* Define to 1 if you have the `readlink' function. */
+#undef HAVE_READLINK
 
 /* Define to 1 if you have the `rename' function. */
 #undef HAVE_RENAME
Index: java/io/File.java
===================================================================
--- java/io/File.java	(revision 114425)
+++ java/io/File.java	(working copy)
@@ -508,9 +508,9 @@
   /**
    * This method returns a canonical representation of the pathname of
    * this file.  The actual form of the canonical representation is
-   * different.  On the GNU system, the canonical form differs from the
-   * absolute form in that all relative file references to "." and ".."
-   * are resolved and removed.
+   * system-dependent.  On the GNU system, conversion to canonical
+   * form involves the removal of redundant separators, references to
+   * "." and "..", and symbolic links.
    * <p>
    * Note that this method, unlike the other methods which return path
    * names, can throw an IOException.  This is because native method 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2006-06-06 15:18 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-04-03 12:32 Patch: rewrite File.toCanonicalPath() for GNU/Posix systems Gary Benson
2006-04-05 22:00 ` Tom Tromey
2006-04-06 10:03   ` Gary Benson
2006-04-07 15:53 ` Patch: rewrite File.toCanonicalPath() take 2 Gary Benson
2006-04-12 10:32   ` Andrew Haley
2006-04-19 13:55     ` Patch: rewrite File.toCanonicalPath() take 3 Gary Benson
2006-05-12  8:16       ` PING: rewrite File.toCanonicalPath() Gary Benson
2006-05-31  8:21         ` Second ping: " Gary Benson
2006-06-01  7:01           ` Bryce McKinlay
2006-06-05 12:07             ` Gary Benson
2006-06-05 15:13               ` Andrew Haley
2006-06-06 15:18             ` Patch: FYI: " Gary Benson
2006-06-01  7:54           ` Second ping: " Bryce McKinlay
2006-06-01  8:12             ` Gary Benson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).