On 16/09/15 11:36 -0600, Martin Sebor wrote:
>>It turns out I was wrong about BSD traditionally supporting
>>realpath(path, NULL), it first appeared in these relatively recent
>>versions:
>>
>>FreeBSD 9.0
>>OpenBSD 5.0
>>NetBSD 6.1
>>
>>Like Solaris 11, some of them still define _POSIX_VERSION=200112L even
>>though they support passing NULL,  so we could hardcode the targets
>>that are known to support it, but we'll still need a fallback for lots
>>of slightly older targets anyway.
>>
>>So here's a new implementation of canonical() which only uses
>>realpath() when this autoconf compile-or-link test passes:
>>
>>#if _XOPEN_VERSION < 500
>>#error
>>#elif _XOPEN_VERSION >= 700 || defined(PATH_MAX)
>>char *tmp = realpath((const char*)NULL, (char*)NULL);
>>#else
>>#error
>>#endif
>>
>>i.e. I use realpath() for Issue 7, or for Issue 6 if PATH_MAX is
>>defined.
>
>I probably wouldn't code the PATH_MAX test quite the same way.
>I would expect it to be mainly Issue 6 implementations that don't
>define the macro to want to provide the null extension since there
>would otherwise be no safe way to use the function.

OK.

>I didn't test it at all but I'd be inclined to write the conditional
>to look more along these lines:
>
>  #if _XOPEN_VERSION >= 700
>    // Issue 7 and better -- null resolved_name means allocate
>    char *tmp = realpath (file_name, (char*)NULL);
>  #elif _XOPEN_VERSION == 600
>     // Issue 6 -- null resolved_name is implementation-defined
>  #  if FreeBSD 9.0 or OpenBSD 5.0 or NetBSD 6.1
>    char *tmp = realpath (file_name, (char*)NULL);
>  #  elif PATH_MAX
>    char *tmp = realpath (file_name, malloc (PATH_MAX));
>  #  else
>  #    error No safe way to call realpath
>  #  endif
>  #elif _XOPEN_VERSION && PATH_MAX
>    // SUSv2 -- null resolved_name is an error
>    char *tmp = realpath (file_name, malloc (PATH_MAX));
>  #else
>  #  error realpath not supported or no safe way to call it
>  #endif

That would allow us to use realpath() on more targets, so would be a
good improvement, but I think it can wait for a later date. I think we
might also want two macros, USE_REALPATH and USE_REALPATH_NULL,
otherwise we just have to repeat the conditions in
src/filesystem/ops.cc to decide whether to use malloc() or not.

But I think the best solution for now is to not define any *_SOURCE
macros in the configure checks or in the source code (due to the
problem on NetBSD when _XOPEN_SOURCE is defined). If a new enough
_XOPEN_VERSION happens to be defined anyway (maybe due to an implicit
_GNU_SOURCE, or just a target where it's the default) then we'll use
realpath(), otherwise we use the fallback C++ implementation.

Here's a patch to do that, it's substantially the same as the last one
but doesn't define _XOPEN_SOURCE, and also tweaks some tests.

Tested powerpc64le-linux, x86_64-dragonfly4.1 and x86_64-netbsd5.1,
do you see any reason not to commit this for now?

Any improvements such as hardcoding checks for specific versions of
Solaris or the BSDs are QoI, and this is only an experimental TS, so I
don't want to spend the rest of stage 1 working on one function :-)


>>Then in filesystem::canonical() if _GLIBCXX_USE_REALPATH is set I use
>>it, passing NULL for Issue 7, or allocating a buffer of PATH_MAX
>>otherwise. If realpath isn't supported or fails with ENAMETOOLONG then
>>I do the path traversal by hand (which can be done entirely using the
>>std::experimental::filesystem operations).
>
>FWIW, to work across filesystems with different _PC_PATH_MAX, I
>suspect the operations might need to use readlinkat. I.e., they
>might need to descend into each individual subdirectory to avoid
>forming a temporary pathname that's too long for the file system
>being traversed, even though both the initial and the final
>pathnames are under the limit. (I haven't actually tested this
>but I don't see where GLIBC handles this case so it might not
>do the right thing either.)


Yes, I was planning to do it that way originally, then realised that
it can be done purely in C++ with the new filesystem API, which was
much easier!

It would be better to use openat() for each dir and use the *at()
functions, but I'd have to get my copy of Stevens out and read lots of
man pages, where as I already know the C++ API because I've recently
implemented the entire thing myself :-)

In an ideal world, with infinite time, it might be nice to create a
hybrid of the C++ Filesystem API and the *at() functions, even if only
for use internally in the library. It might reduce the number of
stat() calls, or at least the number of similar path traversals, if we
used openat(), fstatat() etc.

The Filesystem TS deals entirely in paths, because that's the only way
to be portable to both POSIX and Windows, but it would be really nice
to have a similar API based on file descriptors. It would certainly
make some things a lot more efficient.

A quick grep tells me that Boost.Filesystem doesn't use openat()
anywhere, and users have been happy with that implementation for many
years, so again I think improvements like this can wait (but feel free
to add something to Bugzilla suggesting it, or maybe a wiki page where
we can collect suggestions like this).

>>Only passing NULL for Issue 7 is quite conservative. It means we don't
>>do it for targets that support it as an implementation-defined
>>extension to Issue 6, which includes Solaris 11, the BSDs and even
>>older GNU systems (including RHEL6). But that's OK, we have a fallback
>>now so it means no loss of functionality, just efficiency.  We can
>>tweak the config later for targets known to handle NULL.
>
>Does the config test actually run? If not, I don't see the point
>(it doesn't tell us anything the POSIX feature tests macros don't).
>If it did run, it would fail since the first argument is null.

No, it's only a compile-or-link test, so only tells us if realpath()
is declared with a suitable signature, not that calling it works.
That's why I didn't bother passing a real argument, just used null
pointers of the right type.

We can't do config tests that need to be executed because you can't do
them for a cross-compiler, and we don't want the library config to
differ depending on whether you build native or cross.

>>
>>The new testcase is not very thorough. I've run a few more involved
>>tests that aren't suitable to check in until I figure out a good way
>>of running filesystem tests that can create/remove arbitrary files and
>>symlinks.
>
>Yeah, writing good tests is always the hard part. If you need help
>I can try to put some together in my spare time.

If you have suggestions for edge cases to test that would be great,
even if we can't actually test them yet.

My main obstacle to writing good tests right now is having some way to
create and destroy files safely in the tests. It's hard to test
functions like is_symlink() without first creating a symlink in a
known location, and also removing it again cleanly so the next
testsuite run doesn't fail if the file is already present.

One option would be to have libstdc++-v3/testsuite/Makefile create a
new sub-directory as a sandbox for filesystem tests, removing it if it
already exists. Then the tests can put anything they like in that new
dir without fear of trashing the user's files elsewhere on the FS!

That shouldn't be too hard to do, it's just a Simple Matter Of
Programming. And finding time.