public inbox for cygwin@cygwin.com
From: Houder <houder@xs4all.nl>
To: cygwin@cygwin.com
Subject: Re: Odd, is it not? mkdir 'e:\' cannot be undone by rmdir 'e:\' ...
Date: Sun, 22 Sep 2019 07:34:00 -0000
Message-ID: <22effcaa7390026126e1edb46155299b@smtp-cloud8.xs4all.net>
In-Reply-To: <8b163d0f-680f-e4ea-098c-703d0fac87fd@cornell.edu>

On Sat, 21 Sep 2019 15:42:36, Ken Brown wrote:
[snip]

> I'll fix this and then look at your patches to mkdir and rmdir.  It would
> be very helpful if you would write these as a patch series with cover letter,
> using git format-patch, and send them to the cygwin-patches list.

Hi Ken,

I think this will be my last post. To help you understand my thinking, I
describe below (tersely, as in a telegram) what I think path_conv::check()
in path.cc does (or should do) with regard to "Unix path resolution" and,
as a _consequence_ of that understanding, what mkdir() and rmdir() in
dir.cc must do.
(also included are URLs of the "standards" I have been studying)

 - 1. - Farewell

I will be hospitalized soon, and I do not think I will be back here (any
time soon?).

Therefore, if you believe that the rationale behind my modifications is
valid (and you would like to do all this), you are welcome to carry them
out yourself.

I will not be able to carry them out myself.

 - 2. - rmdir(2) not in agreement w/ Posix (and Linux)

64-++ uname -a
CYGWIN_NT-6.1 Seven 3.1.0(0.340/5/3) 2019-09-15 17:57 x86_64 Cygwin
64-++ ln -s bar foo
64-++ ls -l foo
lrwxrwxrwx 1 Henri None 3 Sep 22 07:28 foo -> bar
64-++ mkdir bar

64-++ rmdir foo
rmdir: failed to remove 'foo': Not a directory <==== Correct!

64-++ rmdir foo/ # directory bar has been deleted -- Posix does not
                 # allow this to happen!
64-++ ls -l bar
ls: cannot access 'bar': No such file or directory

The same applies to mkdir(2).

Eric Blake fixed mkdir() in winsup/cygwin/dir.cc in 2009, but he did
not fix rmdir() in winsup/cygwin/dir.cc at the same time. Why he did
not, I am unable to tell.
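
The transcript above can be turned into a small probe that reports the errno
each call produces on whatever system runs it. This is only a sketch, not part
of the proposed patches: it is written in Python rather than shell, the names
probe() and run_probe() are made up for illustration, and no outcome is claimed
for the trailing-slash cases, which are exactly where implementations differ.

```python
import errno
import os
import tempfile


def probe(call, path):
    """Run call(path); return 'ok' or the symbolic errno name on failure."""
    try:
        call(path)
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


def run_probe():
    """Recreate the foo -> bar setup in a scratch directory and probe it."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        foo = os.path.join(tmp, "foo")
        bar = os.path.join(tmp, "bar")
        os.symlink("bar", foo)              # ln -s bar foo
        os.mkdir(bar)                       # mkdir bar

        results["rmdir foo"] = probe(os.rmdir, foo)
        results["rmdir foo/"] = probe(os.rmdir, foo + "/")
        results["bar survives rmdir foo/"] = os.path.isdir(bar)

        # Make foo a dangling symlink before probing mkdir.
        if os.path.isdir(bar):
            os.rmdir(bar)
        results["mkdir foo"] = probe(os.mkdir, foo)
        results["mkdir foo/"] = probe(os.mkdir, foo + "/")
        results["bar created by mkdir foo/"] = os.path.isdir(bar)
    return results


if __name__ == "__main__":
    for case, outcome in run_probe().items():
        print(f"{case:28} {outcome}")
```

Running it on Cygwin and on Linux side by side shows whether "rmdir foo/"
leaves bar in place and whether "mkdir foo/" creates the symlink's target.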

 - 3. - Path Resolution

Call flow:

mkdir() (and rmdir() ) in winsup/cygwin/dir.cc
  path_conv::check() in winsup/cygwin/path.cc <==== path resolution
  mkdir() (or rmdir() ) in winsup/cygwin/fhandler_disk_file.cc

To simplify what happens when either mkdir(1) or rmdir(1) is called:

 - mkdir() (and rmdir() ) in dir.cc are "the system call entries"
 - path_conv::check() performs the "Unix path resolution" (and a lot
   of other things that I do not care about at the moment)
 - mkdir() (likewise rmdir() ) in fhandler_disk_file.cc is called
   by mkdir() (likewise rmdir() ) in dir.cc, but only if the latter is
   "satisfied" with the result of path resolution
 - mkdir() (and rmdir() ) in fhandler_disk_file.cc do not perform
   path resolution -- these functions use the path as "computed" by
   path_conv::check()

Path Resolution (summarized):

pathname = /prefix/final[/]
(in general, there is a difference between a pathname that does not end w/
 a slash and a pathname that does)

 - if final/ is a symlnk, it is followed and the target must exist and must
   be a directory
  - the exceptions to this rule are mkdir(2) and rmdir(2) (in principle, both
    these system calls ignore (strip!) the trailing slash)
 - if final is a symlnk, it is followed by default (but the target does not
   have to exist, let alone be a directory)
  - however a system call can specify that the symlnk must NOT be followed;
    mkdir(2) and rmdir(2) are examples of these system calls
    (because, again, both mkdir(2) and rmdir(2) do _not_ accept a symlnk as
     argument, according to specification)

path_conv::check() is "common code" to all system calls w/ arguments that
specify a pathname, i.e. this method does NOT know about the system calls
mkdir() and rmdir(). This method only knows whether the pathname ended w/
a slash, and whether the system call asked for a symlnk to be followed
(only relevant in case the pathname did NOT end w/ a slash).

Because both mkdir(2) and rmdir(2) should not accept a symlnk as argument,
they must both strip trailing slashes AND specify "do not follow".
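
The rules summarized above can be sketched as a toy model. It is only a
reading of the text, not Cygwin code: the filesystem is a plain dict of
single-component names, a symlnk is followed for one hop only, and
resolve_final(), rmdir_model() and mkdir_model() are hypothetical names
(Python is used for brevity; the real code is C++ in path.cc and dir.cc).

```python
import errno

# Toy filesystem: name -> (kind, target), where kind is "dir", "file" or
# "symlink", and target is only used for symlinks.


def resolve_final(fs, path, follow=True):
    """Apply the final-component rules; return the name the call acts on."""
    trailing_slash = path.endswith("/")
    name = path.rstrip("/")
    kind, target = fs.get(name, (None, None))

    if trailing_slash:
        # final/ : a symlink is always followed, and the result must be
        # an existing directory.
        if kind == "symlink":
            name = target
            kind, _ = fs.get(name, (None, None))
        if kind is None:
            raise OSError(errno.ENOENT, "no such file or directory", path)
        if kind != "dir":
            raise OSError(errno.ENOTDIR, "not a directory", path)
        return name

    # final : a symlink is followed only if the caller asked for it; the
    # target does not have to exist.
    if kind == "symlink" and follow:
        return target
    return name


def rmdir_model(fs, path):
    """rmdir as argued above: strip trailing slashes AND do not follow."""
    name = resolve_final(fs, path.rstrip("/"), follow=False)
    kind, _ = fs.get(name, (None, None))
    if kind is None:
        raise OSError(errno.ENOENT, "no such file or directory", path)
    if kind != "dir":                       # this includes a symlink
        raise OSError(errno.ENOTDIR, "not a directory", path)
    del fs[name]


def mkdir_model(fs, path):
    """mkdir as argued above: strip trailing slashes AND do not follow."""
    name = resolve_final(fs, path.rstrip("/"), follow=False)
    if name in fs:                          # this includes a symlink
        raise OSError(errno.EEXIST, "file exists", path)
    fs[name] = ("dir", None)
```

With fs = {"foo": ("symlink", "bar"), "bar": ("dir", None)}, the generic
resolver follows "foo/" to bar, while rmdir_model(fs, "foo/") is rejected
with ENOTDIR and bar stays in place.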

My understanding of what "path resolution" must do (that is, what method
path_conv::check() must do wrt Unix path resolution) is based on studying
the URLs I include below.

Regards,
Henri

-----
file: references.txt

Bzzt. Posix goes over the top in an attempt to make the difference between
 final and final/ as general as is possible.
 According to Posix (with regard to "path resolution"), both mkdir(newdir/)
 and rmdir(existing-dir/) obey the rules. [1]
-
 However the argument to both mkdir(2) and rmdir(2) must be a directory; a
 symbolic link is NOT allowed in case of these system calls ("functions").
 Said differently, there is NO need to specify the argument WITH a trailing
 slash: confusion does not exist: the argument must be a directory.
 Moreover, if a symbolic link is specified as an argument in these cases,
 "path resolution" should NOT follow this symbolic link.
 That is why both mkdir(2) and rmdir(2) will strip a trailing slash; they
 also specify "do not follow" for the same reason.

[1] however, my interpretation is that rmdir(existing-dir/) should succeed
if the target of "existing-dir" (a symbolic link) is indeed a directory ...
But that is wrong! (a symlnk as argument should be rejected by rmdir(2)! )
Similar reasoning applies to mkdir(newdir/) -- the call should be rejected
if "newdir" is a symlnk.

Linux ignores the difference between final and final/ in case of mkdir(2)
and rmdir(2) - the trailing slash is irrelevant here - and silently strips
the trailing slash (any implementation will do the same thing in order to
be "posix-compliant").

References:

Posix:

  The Open Group Base Specifications Issue 7, 2018 edition
  IEEE Std 1003.1-2017 (Revision of IEEE Std 1003.1-2008)
  Copyright (c) 2001-2018 IEEE and The Open Group

 - https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13
   ( 4.13 Pathname Resolution -- general concepts )
 - https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap04.html#tag_21_04_13
   ( A.4.13 Pathname Resolution -- appendix )
"POSIX.1-2017 requires that a pathname with a trailing <slash> be rejected unless it refers to a file that
 is a directory or to a file that is to be created as a directory."
Henri: this would allow rmdir(existing-dir/), where existing-dir is a symlnk that refers to an _existing_
Henri: directory to succeed; but that is wrong! rmdir(2) does not allow a symlnk as an argument!
Henri: similar reasoning applies to mkdir(newdir/) -- mkdir(2) must reject the call if newdir is a symlnk.

-
 - https://pubs.opengroup.org/onlinepubs/9699919799/functions/mkdir.html
   ( mkdir(2) -- functions )
"If path names a symbolic link, mkdir() shall fail and set errno to [EEXIST]."

 - https://pubs.opengroup.org/onlinepubs/9699919799/functions/rmdir.html
   ( rmdir(2) -- functions )
"If path names a symbolic link, then rmdir() shall fail and set errno to [ENOTDIR]."

Linux:

 - http://man7.org/linux/man-pages/man7/path_resolution.7.html
   ( man 7 path_resolution )
"Step 3: find the final entry -- Henri: (Linux!) the final entry is WITHOUT a trailing slash
       Henri: (Linux!) final/ is not considered the final entry: it must be
       a directory (or if a symlink, the target must be a directory) and that
       directory must exist.

       The lookup of the final component of the pathname goes just like that
       of all other components, as described in the previous step, with two
Henri: that is, a symlnk is followed BY DEFAULT!

       differences: (i) the final component need not be a directory (at
       least as far as the path resolution process is concerned—it may have
       to be a directory, or a nondirectory, because of the requirements of
       the specific system call), and (ii) it is not necessarily an error if
       the component is not found—maybe we are just creating it.  The
       details on the treatment of the final entry are described in the
       manual pages of the specific system calls."
Henri: in case of final (i.e. w/o trailing slash), a system call can direct
Henri: "path resolution" NOT to follow a symlnk (my interpretation)

"Final symlink
Henri: (Linux!) again, the final symlink is WITHOUT a trailing slash!
       If the last component of a pathname is a symbolic link, then it
       depends on the system call whether the file referred to will be the
       symbolic link or the result of path resolution on its contents.  For
       example, the system call lstat(2) will operate on the symlink, while
       stat(2) operates on the file pointed to by the symlink."
----------
Henri: my conclusion wrt "path resolution":
 - if final/ is a symlnk, it is followed and the target must exist and must
   be a directory
  - the exceptions to this rule are mkdir(2) and rmdir(2) (in principle, both
    these system calls ignore (strip!) the trailing slash)
 - if final is a symlnk, it is followed by default (but the target does not
   have to exist, let alone be a directory)
  - however a system call can specify that the symlnk must NOT be followed;
    mkdir(2) and rmdir(2) are examples of these system calls
    (because, again, both mkdir(2) and rmdir(2) do _not_ accept a symlnk as
     argument, according to specification)
----------

 - http://man7.org/linux/man-pages/man7/symlink.7.html
   ( man 7 symlink )
"System calls
       The first area is symbolic links used as filename arguments for
       system calls.
       Except as noted below, all system calls follow symbolic links.  For
       example, if there were a symbolic link slink which pointed to a file
       named afile, the system call open("slink" ...) would return a file
       descriptor referring to the file afile."

 - http://man7.org/linux/man-pages/man2/mkdir.2.html
   ( man 2 mkdir )
"EEXIST: pathname already exists (not necessarily as a directory). This includes the case where
 pathname is a symbolic link, dangling or not."

 - http://man7.org/linux/man-pages/man2/rmdir.2.html
   ( man 2 rmdir )
"ENOTDIR: pathname, or a component used as a directory in pathname, is NOT, in fact, a directory."

-
 - https://lwn.net/Articles/649115/
   ( Pathname lookup in Linux -- June, 2015, by Neil Brown )
 - https://www.kernel.org/doc/html/latest/filesystems/path-lookup.html
   ( Pathname lookup )

=====

