From: Alexandre Oliva <oliva@adacore.com>
To: Jonathan Wakely <jwakely@redhat.com>
Cc: gcc Patches <gcc-patches@gcc.gnu.org>,
"libstdc++" <libstdc++@gcc.gnu.org>
Subject: Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
Date: Wed, 22 Jun 2022 22:02:08 -0300 [thread overview]
Message-ID: <orh74c6vlb.fsf@lxoliva.fsfla.org> (raw)
In-Reply-To: <CACb0b4m5fZ3HDi3GdvCFgSD1ZOLzfcz1CmLTbkFcpv36zE+ndA@mail.gmail.com> (Jonathan Wakely's message of "Wed, 22 Jun 2022 10:45:13 +0100")
On Jun 22, 2022, Jonathan Wakely <jwakely@redhat.com> wrote:
> It looks like it would be possible for this to livelock.
True
> The current
> implementation will fail with an error in that case, rather than
> getting stuck forever in a loop.
In the equivalent livelock scenario, newly-added dir entries are added
to the end of the directory, and get visited in the same readdir
iteration, so you never reach the end as long as the entry-creator runs
faster than the remover.
> It would be possible to add a __rewind member to directory iterators
> too, to call rewinddir after each modification to the directory.
That would likely lead to O(n^2) behavior in some implementations, in
which remove entries get rescanned over and over, whereas the approach I
proposed cuts that down to O(nlogn). Unless you rewind once you hit the
end after successful removals, even before trying to remove the dir.
That's still a little wasteful on POSIX-compliant targets, because you
rewind and rescan a dir that should already be empty. I aimed for
minimizing the overhead on compliant targets, but that was before
remove_all switched to recursive iterators.
With recursive iterators, rewinding might be better done in a custom
iterator, tuned for recursive removal, that knows to rewind a dir if
there were removals in it or something. Rewinding the entire recursive
removal is not something I realized my rewritten patch did. I still had
the recursive remove_all implementation in cache. Oops ;-)
That said, I'm not sure it makes much of a difference in the end. As
long as the recursion and removal doesn't treat symlinks as dirs (which
IIUC requires openat and unlinkat, so that's a big if to boot), the
rewinding seems to only change the nature of filesystem race conditions
that recursive removal enables. E.g., consider that you start removing
the entries in a dir, and then the dir you're visiting is moved out of
the subtree you're removing, and other dirs are moved into it: the
recursive removal with openat and unlinkat will happily attempt to wipe
out everything moved in there, even if it wasn't within that subtree at
the time of the remove_all request, and even if the newly-moved dirs
were never part of the subtree whose removal was requested. To make it
clearer:
c++::std::filesystem::remove_all foo/bar &
mv foo/bar/temp temp
mv foo temp
wait
ls -d temp/foo
temp/foo might be removed if you happened to be iterating in temp when
the 'mv' commands run. Is that another kind of race that needs to be
considered? If a dir is moved while we're visiting it, should we stop
visiting it? What about its parent?
--
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts. Ask me about <https://stallmansupport.org>
next prev parent reply other threads:[~2022-06-23 1:02 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-22 6:19 Alexandre Oliva
2022-06-22 9:45 ` Jonathan Wakely
2022-06-23 1:02 ` Alexandre Oliva [this message]
2022-06-27 13:27 ` Alexandre Oliva
2022-06-29 5:16 ` Alexandre Oliva
2022-06-30 7:52 ` Alexandre Oliva
2022-06-30 8:19 ` Sebastian Huber
2022-07-05 17:39 ` Alexandre Oliva
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=orh74c6vlb.fsf@lxoliva.fsfla.org \
--to=oliva@adacore.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jwakely@redhat.com \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).