public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
From: Alexandre Oliva <oliva@adacore.com>
To: Jonathan Wakely <jwakely@redhat.com>
Cc: gcc Patches <gcc-patches@gcc.gnu.org>,
	"libstdc++" <libstdc++@gcc.gnu.org>
Subject: Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
Date: Mon, 27 Jun 2022 10:27:25 -0300	[thread overview]
Message-ID: <orpmiuxmma.fsf@lxoliva.fsfla.org> (raw)
In-Reply-To: <CACb0b4m5fZ3HDi3GdvCFgSD1ZOLzfcz1CmLTbkFcpv36zE+ndA@mail.gmail.com> (Jonathan Wakely's message of "Wed, 22 Jun 2022 10:45:13 +0100")

On Jun 22, 2022, Jonathan Wakely <jwakely@redhat.com> wrote:

> I haven't properly reviewed it yet

Nevermind that one, it's broken because I hadn't realized the recursive
iteration.  It fails and throws/errors out when we attempt to __erase a
subdir that wasn't successfully emptied because some of its entries were
skipped.

Here's an improved version that appears to work, despite the few(er)
remaining fails.  I've convinced myself it can't possibly introduce
symlink races because, if it doesn't follow symlinks in the first try,
it won't follow them in retries.  Potential races related with moving
directories into or out of the remove_all root remain.

On POSIX-compliant filesystems, the first name that fails to be removed
is most likely to be the first to be encountered in the retry, so
removal will terminate with the failure immediately.  There's a
potential for retries to end up removing other dirs moved into the
remove_all root, but moving them in while remove_all is still running
may remove them, so I don't see that it changes anything.

On RTEMS, the first name that would have failed to be removed, say
because of permissions, may be skipped by the iterator, so we may
proceed to remove later dir entries under the same parent before failing
to remove the parent and starting a retry.  There may thus be an
unbounded number of retries, one for each subdirectory with more than
one entry in the remove_all tree.  I see two potential ways to avoid
this: (i) call remove_all recursively upon failure to remove an entry,
instead of restarting iteration; or (ii) arrange for
recursive_directory_iterator to rewind a dir from which entries have
been _erase()d before returning to the parent dir.  I have not
implemented either of these alternatives, though.

This one is regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?



libstdc++: retry removal of dir entries if dir removal fails

From: Alexandre Oliva <oliva@adacore.com>

On some target systems (e.g. rtems6.0), removing directory components
while iterating over directory entries may cause some of the directory
entries to be skipped, which prevents the removal of the parent
directory from succeeding.

Advancing the iterator before removing a member proved not to be
enough, so I've instead arranged for remove_all to retry the removal
of components if the removal of the parent dir fails after removing at
least one entry.  The fail will be permanent only if no components got
removed in the current try.


for  libstdc++-v3/ChangeLog

	* src/c++17/fs_ops.cc (remove_all): Retry removal of
	directory entries.
---
 libstdc++-v3/src/c++17/fs_ops.cc |   40 ++++++++++++++++++++++++++++++++++----
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 435368fa5c5ff..de99e02af4c34 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -1286,6 +1286,8 @@ fs::remove_all(const path& p)
 {
   error_code ec;
   uintmax_t count = 0;
+ retry:
+  uintmax_t init_count = count;
   recursive_directory_iterator dir(p, directory_options{64|128}, ec);
   switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
   {
@@ -1295,7 +1297,16 @@ fs::remove_all(const path& p)
       const recursive_directory_iterator end;
       while (dir != end)
 	{
-	  dir.__erase(); // throws on error
+	  /* Avoid exceptions if we may retry on fail on systems that
+	     miss dir entries when removing while iterating.  */
+	  if (count > init_count)
+	    {
+	      dir.__erase(&ec);
+	      if (ec)
+		goto retry;
+	    }
+	  else
+	    dir.__erase(); // throws on error
 	  ++count;
 	}
     }
@@ -1303,7 +1314,7 @@ fs::remove_all(const path& p)
     break;
   case ENOENT:
     // Our work here is done.
-    return 0;
+    return count;
   case ENOTDIR:
   case ELOOP:
     // Not a directory, will remove below.
@@ -1313,6 +1324,18 @@ fs::remove_all(const path& p)
     _GLIBCXX_THROW_OR_ABORT(filesystem_error("cannot remove all", p, ec));
   }
 
+  if (count > init_count)
+    {
+      if (int last = fs::remove(p, ec); !ec)
+	return count + last;
+      else
+	// Some systems seem to skip entries in the dir iteration if
+	// you remove dir entries while iterating, so if we removed
+	// anything in the dir in this round, and failed to remove
+	// the dir (presumably because it wasn't empty), retry.
+	goto retry;
+    }
+
   // Remove p itself, which is either a non-directory or is now empty.
   return count + fs::remove(p);
 }
@@ -1321,6 +1344,8 @@ std::uintmax_t
 fs::remove_all(const path& p, error_code& ec)
 {
   uintmax_t count = 0;
+ retry:
+  uintmax_t init_count = count;
   recursive_directory_iterator dir(p, directory_options{64|128}, ec);
   switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
   {
@@ -1332,7 +1357,12 @@ fs::remove_all(const path& p, error_code& ec)
 	{
 	  dir.__erase(&ec);
 	  if (ec)
-	    return -1;
+	    {
+	      if (count > init_count)
+		goto retry;
+	      else
+		return -1;
+	    }
 	  ++count;
 	}
     }
@@ -1341,7 +1371,7 @@ fs::remove_all(const path& p, error_code& ec)
   case ENOENT:
     // Our work here is done.
     ec.clear();
-    return 0;
+    return count;
   case ENOTDIR:
   case ELOOP:
     // Not a directory, will remove below.
@@ -1354,6 +1384,8 @@ fs::remove_all(const path& p, error_code& ec)
   // Remove p itself, which is either a non-directory or is now empty.
   if (int last = fs::remove(p, ec); !ec)
     return count + last;
+  if (count > init_count)
+    goto retry;
   return -1;
 }
 


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>

  parent reply	other threads:[~2022-06-27 13:27 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-22  6:19 Alexandre Oliva
2022-06-22  9:45 ` Jonathan Wakely
2022-06-23  1:02   ` Alexandre Oliva
2022-06-27 13:27   ` Alexandre Oliva [this message]
2022-06-29  5:16     ` Alexandre Oliva
2022-06-30  7:52     ` Alexandre Oliva
2022-06-30  8:19       ` Sebastian Huber
2022-07-05 17:39         ` Alexandre Oliva

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=orpmiuxmma.fsf@lxoliva.fsfla.org \
    --to=oliva@adacore.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jwakely@redhat.com \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).