From: Alexandre Oliva <oliva@adacore.com>
To: Jonathan Wakely <jwakely@redhat.com>
Cc: gcc Patches <gcc-patches@gcc.gnu.org>,
	"libstdc++" <libstdc++@gcc.gnu.org>
Subject: Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
Date: Wed, 22 Jun 2022 22:02:08 -0300
Message-ID: <orh74c6vlb.fsf@lxoliva.fsfla.org>
In-Reply-To: <CACb0b4m5fZ3HDi3GdvCFgSD1ZOLzfcz1CmLTbkFcpv36zE+ndA@mail.gmail.com> (Jonathan Wakely's message of "Wed, 22 Jun 2022 10:45:13 +0100")

On Jun 22, 2022, Jonathan Wakely <jwakely@redhat.com> wrote:

> It looks like it would be possible for this to livelock.

True.

> The current
> implementation will fail with an error in that case, rather than
> getting stuck forever in a loop.

In the equivalent livelock scenario, newly-added dir entries are added
to the end of the directory, and get visited in the same readdir
iteration, so you never reach the end as long as the entry-creator runs
faster than the remover.

>     It would be possible to add a __rewind member to directory iterators
>     too, to call rewinddir after each modification to the directory.

That would likely lead to O(n^2) behavior in some implementations, in
which removed entries get rescanned over and over, whereas the approach I
proposed cuts that down to O(n log n).  The exception is if you rewind
only once, when you hit the end after successful removals, even before
trying to remove the dir.  That's still a little wasteful on
POSIX-compliant targets, because you rewind and rescan a dir that should
already be empty.  I aimed to minimize the overhead on compliant
targets, but that was before remove_all switched to recursive iterators.

With recursive iterators, rewinding might be better done in a custom
iterator, tuned for recursive removal, that knows to rewind a dir if
there were removals in it or something.  Rewinding the entire recursive
removal is not something I realized my rewritten patch did.  I still had
the recursive remove_all implementation in cache.  Oops ;-)

That said, I'm not sure it makes much of a difference in the end.  As
long as the recursion and removal don't treat symlinks as dirs (which
IIUC requires openat and unlinkat, so that's a big if to boot), the
rewinding seems to only change the nature of the filesystem race
conditions that recursive removal enables.  E.g., suppose you start
removing the entries in a dir, and then the dir you're visiting is moved
out of the subtree you're removing, and other dirs are moved into it.
The recursive removal with openat and unlinkat will happily attempt to
wipe out everything moved in there, even if it wasn't within that
subtree at the time of the remove_all request, and even if the
newly-moved dirs were never part of the subtree whose removal was
requested.  To make it clearer:

  c++::std::filesystem::remove_all foo/bar &
  mv foo/bar/temp temp
  mv foo temp
  wait
  ls -d temp/foo

temp/foo might be removed if the removal happens to be iterating in temp
when the 'mv' commands run.  Is that another kind of race that needs to
be considered?  If a dir is moved while we're visiting it, should we stop
visiting it?  What about its parent?

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>
