public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] libstdc++: retry removal of dir entries if dir removal fails
@ 2022-06-22  6:19 Alexandre Oliva
  2022-06-22  9:45 ` Jonathan Wakely
  0 siblings, 1 reply; 8+ messages in thread
From: Alexandre Oliva @ 2022-06-22  6:19 UTC (permalink / raw)
  To: gcc-patches, libstdc++


On some target systems (e.g. rtems6.0), removing directory components
while iterating over directory entries may cause some of the directory
entries to be skipped, which prevents the removal of the parent
directory from succeeding.

Advancing the iterator before removing a member proved not to be
enough, so I've instead arranged for remove_all to retry the removal
of components if the removal of the parent dir fails after removing at
least one entry.  The fail will be permanent only if no components got
removed in the current try.

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?

PS: The implementation of remove_all has changed completely, compared
with the gcc-11 environment in which the need for this patch came up.  I
have reimplemented it for mainline, and I have attempted to test it in
this environment, but new filesystem tests and subtests that fail on
rtems6.0 have impaired testing and prevented the full pass rate I got
for them with a similar patchset on gcc-11.


for  libstdc++-v3/ChangeLog

	* src/c++17/fs_ops.cc (remove_all): Retry removal of
	directory entries.
---
 libstdc++-v3/src/c++17/fs_ops.cc |   22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 435368fa5c5ff..b3390310132b4 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -1286,6 +1286,8 @@ fs::remove_all(const path& p)
 {
   error_code ec;
   uintmax_t count = 0;
+ retry:
+  uintmax_t init_count = count;
   recursive_directory_iterator dir(p, directory_options{64|128}, ec);
   switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
   {
@@ -1303,7 +1305,7 @@ fs::remove_all(const path& p)
     break;
   case ENOENT:
     // Our work here is done.
-    return 0;
+    return count;
   case ENOTDIR:
   case ELOOP:
     // Not a directory, will remove below.
@@ -1313,6 +1315,18 @@ fs::remove_all(const path& p)
     _GLIBCXX_THROW_OR_ABORT(filesystem_error("cannot remove all", p, ec));
   }
 
+  if (count > init_count)
+    {
+      if (int last = fs::remove(p, ec); !ec)
+	return count + last;
+      else
+	// Some systems seem to skip entries in the dir iteration if
+	// you remove dir entries while iterating, so if we removed
+	// anything in the dir in this round, and failed to remove
+	// the dir (presumably because it wasn't empty), retry.
+	goto retry;
+    }
+
   // Remove p itself, which is either a non-directory or is now empty.
   return count + fs::remove(p);
 }
@@ -1321,6 +1335,8 @@ std::uintmax_t
 fs::remove_all(const path& p, error_code& ec)
 {
   uintmax_t count = 0;
+ retry:
+  uintmax_t init_count = count;
   recursive_directory_iterator dir(p, directory_options{64|128}, ec);
   switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
   {
@@ -1341,7 +1357,7 @@ fs::remove_all(const path& p, error_code& ec)
   case ENOENT:
     // Our work here is done.
     ec.clear();
-    return 0;
+    return count;
   case ENOTDIR:
   case ELOOP:
     // Not a directory, will remove below.
@@ -1354,6 +1370,8 @@ fs::remove_all(const path& p, error_code& ec)
   // Remove p itself, which is either a non-directory or is now empty.
   if (int last = fs::remove(p, ec); !ec)
     return count + last;
+  if (count > init_count)
+    goto retry;
   return -1;
 }
 

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
  2022-06-22  6:19 [PATCH] libstdc++: retry removal of dir entries if dir removal fails Alexandre Oliva
@ 2022-06-22  9:45 ` Jonathan Wakely
  2022-06-23  1:02   ` Alexandre Oliva
  2022-06-27 13:27   ` Alexandre Oliva
  0 siblings, 2 replies; 8+ messages in thread
From: Jonathan Wakely @ 2022-06-22  9:45 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc Patches, libstdc++

On Wed, 22 Jun 2022 at 07:20, Alexandre Oliva via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
>
> On some target systems (e.g. rtems6.0), removing directory components
> while iterating over directory entries may cause some of the directory
> entries to be skipped, which prevents the removal of the parent
> directory from succeeding.

This probably comes as no surprise to you, but for the record I don't
think that's POSIX-confirming. The spec for readdir says:

"If a file is removed from or added to the directory after the most
recent call to opendir() or rewinddir(), whether a subsequent call to
readdir() returns an entry for that file is unspecified."

This implies to me that all existing directory entries should get
visited, even if you remove some.


>
> Advancing the iterator before removing a member proved not to be
> enough, so I've instead arranged for remove_all to retry the removal
> of components if the removal of the parent dir fails after removing at
> least one entry.  The fail will be permanent only if no components got
> removed in the current try.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?

I haven't properly reviewed it yet, just commenting for now.

It looks like it would be possible for this to livelock. If another
process keeps writing to the directory tree you're trying to remove,
it will keep removing some entries (increasing the count) but also
keep failing to remove a non-empty directory. The current
implementation will fail with an error in that case, rather than
getting stuck forever in a loop.

We could add a max-retries limit, although that would presumably mean
the original problem remains for rtems if trying to remove a directory
with a huge number of entries.

>
> PS: The implementation of remove_all has changed completely, compared

That's PR104161, and in the r12-7062 commit I wrote:

    It would be possible to add a __rewind member to directory iterators
    too, to call rewinddir after each modification to the directory. That
    would make it more likely for filesystem::remove_all to successfully
    remove everything even if files are being written to the directory tree
    while removing it. It's unclear if that is actually prefereable, or if
    it's better to fail and report an error at the first opportunity.

I was thinking of cases where there are concurrent modifications to
the directory while we're trying to remove it, but the RTEMS case of
skipping subsequent entries is more motivating.

The rewind operation would reopen the current directory stream, but
without creating a new directory iterator and starting again from the
very top of the directory tree. For the non-recursive directory
iterator there's no real difference, but for the recursive one it
would only rewind at the current nesting depth, not restart from the
top of the tree. I also need to consider the interaction with symlink
races, and whether retrying can bring back the vulnerability that
PR104161 addressed (I think the way you've done it doesn't have the
same race condition, but would be susceptible to a different race
condition where the top of the directory tree to be removed is
different when you retry).



> with the gcc-11 environment in which the need for this patch came up.  I
> have reimplemented it for mainline, and I have attempted to test it in
> this environment, but new filesystem tests and subtests that fail on
> rtems6.0 have impaired testing and prevented the full pass rate I got
> for them with a similar patchset on gcc-11.
>
>
> for  libstdc++-v3/ChangeLog
>
>         * src/c++17/fs_ops.cc (remove_all): Retry removal of
>         directory entries.
> ---
>  libstdc++-v3/src/c++17/fs_ops.cc |   22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
> index 435368fa5c5ff..b3390310132b4 100644
> --- a/libstdc++-v3/src/c++17/fs_ops.cc
> +++ b/libstdc++-v3/src/c++17/fs_ops.cc
> @@ -1286,6 +1286,8 @@ fs::remove_all(const path& p)
>  {
>    error_code ec;
>    uintmax_t count = 0;
> + retry:
> +  uintmax_t init_count = count;
>    recursive_directory_iterator dir(p, directory_options{64|128}, ec);
>    switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
>    {
> @@ -1303,7 +1305,7 @@ fs::remove_all(const path& p)
>      break;
>    case ENOENT:
>      // Our work here is done.
> -    return 0;
> +    return count;
>    case ENOTDIR:
>    case ELOOP:
>      // Not a directory, will remove below.
> @@ -1313,6 +1315,18 @@ fs::remove_all(const path& p)
>      _GLIBCXX_THROW_OR_ABORT(filesystem_error("cannot remove all", p, ec));
>    }
>
> +  if (count > init_count)
> +    {
> +      if (int last = fs::remove(p, ec); !ec)
> +       return count + last;
> +      else
> +       // Some systems seem to skip entries in the dir iteration if
> +       // you remove dir entries while iterating, so if we removed
> +       // anything in the dir in this round, and failed to remove
> +       // the dir (presumably because it wasn't empty), retry.
> +       goto retry;
> +    }
> +
>    // Remove p itself, which is either a non-directory or is now empty.
>    return count + fs::remove(p);
>  }
> @@ -1321,6 +1335,8 @@ std::uintmax_t
>  fs::remove_all(const path& p, error_code& ec)
>  {
>    uintmax_t count = 0;
> + retry:
> +  uintmax_t init_count = count;
>    recursive_directory_iterator dir(p, directory_options{64|128}, ec);
>    switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
>    {
> @@ -1341,7 +1357,7 @@ fs::remove_all(const path& p, error_code& ec)
>    case ENOENT:
>      // Our work here is done.
>      ec.clear();
> -    return 0;
> +    return count;
>    case ENOTDIR:
>    case ELOOP:
>      // Not a directory, will remove below.
> @@ -1354,6 +1370,8 @@ fs::remove_all(const path& p, error_code& ec)
>    // Remove p itself, which is either a non-directory or is now empty.
>    if (int last = fs::remove(p, ec); !ec)
>      return count + last;
> +  if (count > init_count)
> +    goto retry;
>    return -1;
>  }
>
>
> --
> Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
>    Free Software Activist                       GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about <https://stallmansupport.org>
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
  2022-06-22  9:45 ` Jonathan Wakely
@ 2022-06-23  1:02   ` Alexandre Oliva
  2022-06-27 13:27   ` Alexandre Oliva
  1 sibling, 0 replies; 8+ messages in thread
From: Alexandre Oliva @ 2022-06-23  1:02 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc Patches, libstdc++

On Jun 22, 2022, Jonathan Wakely <jwakely@redhat.com> wrote:

> It looks like it would be possible for this to livelock.

True

> The current
> implementation will fail with an error in that case, rather than
> getting stuck forever in a loop.

In the equivalent livelock scenario, newly-added dir entries are added
to the end of the directory, and get visited in the same readdir
iteration, so you never reach the end as long as the entry-creator runs
faster than the remover.

>     It would be possible to add a __rewind member to directory iterators
>     too, to call rewinddir after each modification to the directory.

That would likely lead to O(n^2) behavior in some implementations, in
which remove entries get rescanned over and over, whereas the approach I
proposed cuts that down to O(nlogn).  Unless you rewind once you hit the
end after successful removals, even before trying to remove the dir.
That's still a little wasteful on POSIX-compliant targets, because you
rewind and rescan a dir that should already be empty.  I aimed for
minimizing the overhead on compliant targets, but that was before
remove_all switched to recursive iterators.

With recursive iterators, rewinding might be better done in a custom
iterator, tuned for recursive removal, that knows to rewind a dir if
there were removals in it or something.  Rewinding the entire recursive
removal is not something I realized my rewritten patch did.  I still had
the recursive remove_all implementation in cache.  Oops ;-)

That said, I'm not sure it makes much of a difference in the end.  As
long as the recursion and removal doesn't treat symlinks as dirs (which
IIUC requires openat and unlinkat, so that's a big if to boot), the
rewinding seems to only change the nature of filesystem race conditions
that recursive removal enables.  E.g., consider that you start removing
the entries in a dir, and then the dir you're visiting is moved out of
the subtree you're removing, and other dirs are moved into it: the
recursive removal with openat and unlinkat will happily attempt to wipe
out everything moved in there, even if it wasn't within that subtree at
the time of the remove_all request, and even if the newly-moved dirs
were never part of the subtree whose removal was requested.  To make it
clearer:

  c++::std::filesystem::remove_all foo/bar &
  mv foo/bar/temp temp
  mv foo temp
  wait
  ls -d temp/foo

temp/foo might be removed if you happened to be iterating in temp when
the 'mv' commands run.  Is that another kind of race that needs to be
considered?  If a dir is moved while we're visiting it, should we stop
visiting it?  What about its parent?

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
  2022-06-22  9:45 ` Jonathan Wakely
  2022-06-23  1:02   ` Alexandre Oliva
@ 2022-06-27 13:27   ` Alexandre Oliva
  2022-06-29  5:16     ` Alexandre Oliva
  2022-06-30  7:52     ` Alexandre Oliva
  1 sibling, 2 replies; 8+ messages in thread
From: Alexandre Oliva @ 2022-06-27 13:27 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc Patches, libstdc++

On Jun 22, 2022, Jonathan Wakely <jwakely@redhat.com> wrote:

> I haven't properly reviewed it yet

Nevermind that one, it's broken because I hadn't realized the recursive
iteration.  It fails and throws/errors out when we attempt to __erase a
subdir that wasn't successfully emptied because some of its entries were
skipped.

Here's an improved version that appears to work, despite the few(er)
remaining fails.  I've convinced myself it can't possibly introduce
symlink races because, if it doesn't follow symlinks in the first try,
it won't follow them in retries.  Potential races related with moving
directories into or out of the remove_all root remain.

On POSIX-compliant filesystems, the first name that fails to be removed
is most likely to be the first to be encountered in the retry, so
removal will terminate with the failure immediately.  There's a
potential for retries to end up removing other dirs moved into the
remove_all root, but moving them in while remove_all is still running
may remove them, so I don't see that it changes anything.

On RTEMS, the first name that would have failed to be removed, say
because of permissions, may be skipped by the iterator, so we may
proceed to remove later dir entries under the same parent before failing
to remove the parent and starting a retry.  There may thus be an
unbounded number of retries, one for each subdirectory with more than
one entry in the remove_all tree.  I see two potential ways to avoid
this: (i) call remove_all recursively upon failure to remove an entry,
instead of restarting iteration; or (ii) arrange for
recursive_directory_iterator to rewind a dir from which entries have
been _erase()d before returning to the parent dir.  I have not
implemented either of these alternatives, though.

This one is regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?



libstdc++: retry removal of dir entries if dir removal fails

From: Alexandre Oliva <oliva@adacore.com>

On some target systems (e.g. rtems6.0), removing directory components
while iterating over directory entries may cause some of the directory
entries to be skipped, which prevents the removal of the parent
directory from succeeding.

Advancing the iterator before removing a member proved not to be
enough, so I've instead arranged for remove_all to retry the removal
of components if the removal of the parent dir fails after removing at
least one entry.  The fail will be permanent only if no components got
removed in the current try.


for  libstdc++-v3/ChangeLog

	* src/c++17/fs_ops.cc (remove_all): Retry removal of
	directory entries.
---
 libstdc++-v3/src/c++17/fs_ops.cc |   40 ++++++++++++++++++++++++++++++++++----
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 435368fa5c5ff..de99e02af4c34 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -1286,6 +1286,8 @@ fs::remove_all(const path& p)
 {
   error_code ec;
   uintmax_t count = 0;
+ retry:
+  uintmax_t init_count = count;
   recursive_directory_iterator dir(p, directory_options{64|128}, ec);
   switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
   {
@@ -1295,7 +1297,16 @@ fs::remove_all(const path& p)
       const recursive_directory_iterator end;
       while (dir != end)
 	{
-	  dir.__erase(); // throws on error
+	  /* Avoid exceptions if we may retry on fail on systems that
+	     miss dir entries when removing while iterating.  */
+	  if (count > init_count)
+	    {
+	      dir.__erase(&ec);
+	      if (ec)
+		goto retry;
+	    }
+	  else
+	    dir.__erase(); // throws on error
 	  ++count;
 	}
     }
@@ -1303,7 +1314,7 @@ fs::remove_all(const path& p)
     break;
   case ENOENT:
     // Our work here is done.
-    return 0;
+    return count;
   case ENOTDIR:
   case ELOOP:
     // Not a directory, will remove below.
@@ -1313,6 +1324,18 @@ fs::remove_all(const path& p)
     _GLIBCXX_THROW_OR_ABORT(filesystem_error("cannot remove all", p, ec));
   }
 
+  if (count > init_count)
+    {
+      if (int last = fs::remove(p, ec); !ec)
+	return count + last;
+      else
+	// Some systems seem to skip entries in the dir iteration if
+	// you remove dir entries while iterating, so if we removed
+	// anything in the dir in this round, and failed to remove
+	// the dir (presumably because it wasn't empty), retry.
+	goto retry;
+    }
+
   // Remove p itself, which is either a non-directory or is now empty.
   return count + fs::remove(p);
 }
@@ -1321,6 +1344,8 @@ std::uintmax_t
 fs::remove_all(const path& p, error_code& ec)
 {
   uintmax_t count = 0;
+ retry:
+  uintmax_t init_count = count;
   recursive_directory_iterator dir(p, directory_options{64|128}, ec);
   switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
   {
@@ -1332,7 +1357,12 @@ fs::remove_all(const path& p, error_code& ec)
 	{
 	  dir.__erase(&ec);
 	  if (ec)
-	    return -1;
+	    {
+	      if (count > init_count)
+		goto retry;
+	      else
+		return -1;
+	    }
 	  ++count;
 	}
     }
@@ -1341,7 +1371,7 @@ fs::remove_all(const path& p, error_code& ec)
   case ENOENT:
     // Our work here is done.
     ec.clear();
-    return 0;
+    return count;
   case ENOTDIR:
   case ELOOP:
     // Not a directory, will remove below.
@@ -1354,6 +1384,8 @@ fs::remove_all(const path& p, error_code& ec)
   // Remove p itself, which is either a non-directory or is now empty.
   if (int last = fs::remove(p, ec); !ec)
     return count + last;
+  if (count > init_count)
+    goto retry;
   return -1;
 }
 


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
  2022-06-27 13:27   ` Alexandre Oliva
@ 2022-06-29  5:16     ` Alexandre Oliva
  2022-06-30  7:52     ` Alexandre Oliva
  1 sibling, 0 replies; 8+ messages in thread
From: Alexandre Oliva @ 2022-06-29  5:16 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc Patches, libstdc++

On Jun 27, 2022, Alexandre Oliva <oliva@adacore.com> wrote:

> (ii) arrange for recursive_directory_iterator to rewind a dir from
> which entries have been _erase()d before returning to the parent dir

Here's an implementation of the above.  I kind of like it; it's far more
elegant than the earlier patch, and if it starts removing stuff from one
subdir, it won't remove stuff from other sibling subdirs before removing
that one subdir, and it won't visit any directory more than once.  I
don't think POSIX imposes any such restriction (AFAICT one could launch
parallel removes for each subdir and still be compliant), but it's less
surprising this way.


libstdc++: auto-rewind dirs for remove_all

On some target systems (e.g. rtems6.0), removing directory components
while iterating over directory entries may cause some of the directory
entries to be skipped, which prevents the removal of the parent
directory from succeeding.

Advancing the iterator before removing a member proved not to be
enough, so I've arranged the directory reading implementation to
implicitly rewind a directory when reaching the end, as long as there
were any entries and they have all been removed.  Rewinding will only
ever take place for users of recursive_directory_iterator::__erase,
and thus of _Dir::do_unlink, and since __erase is a private member
function, it can only be used by the remove_all functions because
they're granted friendship.

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?


for  libstdc++-v3/ChangeLog

	* src/filesystem/dir-common.h (__gnu_posix::rewinddir):
	Define.
	* src/c++17/fs_dir.cc (fs::_Dir::auto_rewind): New enum.
	(fs::_Dir::rewind): New data member.
	(fs::_Dir::advance): Update rewind, rewinddir if found entries
	have all been removed.
	(fs::_Dir::do_unlink): Drop const.  Update rewind.
	(fs::_Dir::unlink, fs::_Dir::rmdir): Drop const.
---
 libstdc++-v3/src/c++17/fs_dir.cc         |   75 +++++++++++++++++++++++++++++-
 libstdc++-v3/src/filesystem/dir-common.h |    3 +
 2 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fs_dir.cc b/libstdc++-v3/src/c++17/fs_dir.cc
index 2258399da2587..6f535c95a32fb 100644
--- a/libstdc++-v3/src/c++17/fs_dir.cc
+++ b/libstdc++-v3/src/c++17/fs_dir.cc
@@ -44,6 +44,30 @@ template class std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>;
 
 struct fs::_Dir : _Dir_base
 {
+  // On some systems, removing a directory entry causes readdir to
+  // skip the subsequent entry.  This state machine enables remove_all
+  // to not miss entries, by arranging for directories to
+  // automatically rewind iff every visited entry got removed, hitting
+  // the end only when no entries are found.  We start with
+  // no_entries, and advance moves from that to found_entry, that
+  // do_unlink changes to removed_entry.  If advance is called again
+  // without unlink, it moves to remaining_entry instead, so that we
+  // will know not to rewind (otherwise we'd visit the same entry
+  // again).  OTOH, if advance doesn't find any entry and the state is
+  // removed_entry, it resets to no_entries and rewinds; at other
+  // states, no rewind takes place.  Skipped entries (dot and dotdot
+  // and permission-denied) do not affect the state machine: they're
+  // skipped every time anyway.  It is envisioned that, with a
+  // reliable detection mechanism for systems that don't need
+  // rewinding, rewind could be initialized to remaining_entry, that
+  // is a final state that prevents rewinding.
+  enum auto_rewind {
+    no_entries,
+    found_entry,
+    removed_entry,
+    remaining_entry
+  };
+
   _Dir(const fs::path& p, bool skip_permission_denied, bool nofollow,
        [[maybe_unused]] bool filename_only, error_code& ec)
   : _Dir_base(p.c_str(), skip_permission_denied, nofollow, ec)
@@ -67,6 +91,21 @@ struct fs::_Dir : _Dir_base
   {
     if (const auto entp = _Dir_base::advance(skip_permission_denied, ec))
       {
+	switch (rewind)
+	  {
+	  case no_entries:
+	  case removed_entry:
+	    rewind = found_entry;
+	    break;
+	  case found_entry:
+	    rewind = remaining_entry;
+	    break;
+	  case remaining_entry:
+	    break;
+	  default:
+	    __builtin_unreachable ();
+	    break;
+	  }
 	auto name = path;
 	name /= entp->d_name;
 	file_type type = file_type::none;
@@ -80,6 +119,22 @@ struct fs::_Dir : _Dir_base
       }
     else if (!ec)
       {
+	switch (rewind)
+	  {
+	  case removed_entry:
+	    rewind = no_entries;
+	    posix::rewinddir(this->dirp);
+	    return advance (skip_permission_denied, ec);
+	  case found_entry:
+	    rewind = remaining_entry;
+	    break;
+	  case no_entries:
+	  case remaining_entry:
+	    break;
+	  default:
+	    __builtin_unreachable ();
+	    break;
+	  }
 	// reached the end
 	entry = {};
       }
@@ -144,8 +199,21 @@ struct fs::_Dir : _Dir_base
   }
 
   bool
-  do_unlink(bool is_directory, error_code& ec) const noexcept
+  do_unlink(bool is_directory, error_code& ec) noexcept
   {
+    switch (rewind)
+      {
+      case no_entries:
+      case removed_entry:
+      default:
+	__builtin_unreachable ();
+	break;
+      case found_entry:
+	rewind = removed_entry;
+	break;
+      case remaining_entry:
+	break;
+      }
 #if _GLIBCXX_HAVE_UNLINKAT
     const auto atp = current();
     if (::unlinkat(atp.dir(), atp.path_at_dir(),
@@ -166,16 +234,17 @@ struct fs::_Dir : _Dir_base
 
   // Remove the non-directory that this->entry refers to.
   bool
-  unlink(error_code& ec) const noexcept
+  unlink(error_code& ec) noexcept
   { return do_unlink(/* is_directory*/ false, ec); }
 
   // Remove the directory that this->entry refers to.
   bool
-  rmdir(error_code& ec) const noexcept
+  rmdir(error_code& ec) noexcept
   { return do_unlink(/* is_directory*/ true, ec); }
 
   fs::path		path; // Empty if only using unlinkat with file descr.
   directory_entry	entry;
+  enum auto_rewind	rewind = no_entries;
 };
 
 namespace
diff --git a/libstdc++-v3/src/filesystem/dir-common.h b/libstdc++-v3/src/filesystem/dir-common.h
index 228fab55afbcf..7eb39ae74378d 100644
--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -55,6 +55,7 @@ using char_type = wchar_t;
 using DIR = ::_WDIR;
 using dirent = _wdirent;
 inline DIR* opendir(const wchar_t* path) { return ::_wopendir(path); }
+inline void rewinddir(DIR* dir) { ::_wrewinddir(dir); }
 inline dirent* readdir(DIR* dir) { return ::_wreaddir(dir); }
 inline int closedir(DIR* dir) { return ::_wclosedir(dir); }
 #elif defined _GLIBCXX_HAVE_DIRENT_H
@@ -64,11 +65,13 @@ typedef struct ::dirent dirent;
 using ::opendir;
 using ::readdir;
 using ::closedir;
+using ::rewinddir;
 #else
 using char_type = char;
 struct dirent { const char* d_name; };
 struct DIR { };
 inline DIR* opendir(const char*) { return nullptr; }
+inline void rewinddir(DIR*) { }
 inline dirent* readdir(DIR*) { return nullptr; }
 inline int closedir(DIR*) { return -1; }
 #undef _GLIBCXX_HAVE_DIRFD


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
  2022-06-27 13:27   ` Alexandre Oliva
  2022-06-29  5:16     ` Alexandre Oliva
@ 2022-06-30  7:52     ` Alexandre Oliva
  2022-06-30  8:19       ` Sebastian Huber
  1 sibling, 1 reply; 8+ messages in thread
From: Alexandre Oliva @ 2022-06-30  7:52 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc Patches, libstdc++

On Jun 27, 2022, Alexandre Oliva <oliva@adacore.com> wrote:

> I see two potential ways to avoid this:

Another possibility occurred to me: seeking back to the entry we're
about to remove, before removing it.  Then, POSIX-compliant
implementations will just skip the removed entry and find the next one,
while RTEMS will find the next entry at the spot where the removed entry
used to be.

It is syscall-heavier, and it may invoke O(n^2) behavior for each
directory in remove_all, since prev_pos is quite likely to always hold
the initial offset, requiring scanning past more and more removed
entries after each removal, so I don't submit this formally for
inclusion, but post it FTR.  I've only confirmed that it solves the
problem on RTEMS, passing libstdc++ filesystem test, but I haven't
tested it further.


diff --git a/libstdc++-v3/src/c++17/fs_dir.cc b/libstdc++-v3/src/c++17/fs_dir.cc
index 2258399da2587..43e2d9678eae5 100644
--- a/libstdc++-v3/src/c++17/fs_dir.cc
+++ b/libstdc++-v3/src/c++17/fs_dir.cc
@@ -65,6 +65,7 @@ struct fs::_Dir : _Dir_base
   // Reports errors by setting ec.
   bool advance(bool skip_permission_denied, error_code& ec) noexcept
   {
+    prev_pos = posix::telldir(dirp);
     if (const auto entp = _Dir_base::advance(skip_permission_denied, ec))
       {
 	auto name = path;
@@ -146,6 +147,12 @@ struct fs::_Dir : _Dir_base
   bool
   do_unlink(bool is_directory, error_code& ec) const noexcept
   {
+    // On some systems, removing the just-read entry causes the next
+    // readdir to skip the entry that comes after it.  That's not
+    // POSIX-compliant, but we can work around this problem by moving
+    // back to the position of the last-read entry, as if it was to be
+    // read again next, before removing it.
+    posix::seekdir(dirp, prev_pos);
 #if _GLIBCXX_HAVE_UNLINKAT
     const auto atp = current();
     if (::unlinkat(atp.dir(), atp.path_at_dir(),
@@ -176,6 +183,7 @@ struct fs::_Dir : _Dir_base
 
   fs::path		path; // Empty if only using unlinkat with file descr.
   directory_entry	entry;
+  long			prev_pos;
 };
 
 namespace
diff --git a/libstdc++-v3/src/filesystem/dir-common.h b/libstdc++-v3/src/filesystem/dir-common.h
index 228fab55afbcf..6174a8ef3c228 100644
--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -55,6 +55,8 @@ using char_type = wchar_t;
 using DIR = ::_WDIR;
 using dirent = _wdirent;
 inline DIR* opendir(const wchar_t* path) { return ::_wopendir(path); }
+inline long telldir(DIR* dir) { ::_wtelldir(dir); }
+inline void seekdir(DIR *dir, long loc) { ::_wseekdir(dir, loc); }
 inline dirent* readdir(DIR* dir) { return ::_wreaddir(dir); }
 inline int closedir(DIR* dir) { return ::_wclosedir(dir); }
 #elif defined _GLIBCXX_HAVE_DIRENT_H
@@ -64,6 +66,8 @@ typedef struct ::dirent dirent;
 using ::opendir;
 using ::readdir;
 using ::closedir;
+using ::telldir;
+using ::seekdir;
 #else
 using char_type = char;
 struct dirent { const char* d_name; };
@@ -71,6 +75,8 @@ struct DIR { };
 inline DIR* opendir(const char*) { return nullptr; }
 inline dirent* readdir(DIR*) { return nullptr; }
 inline int closedir(DIR*) { return -1; }
+inline long telldir(DIR *) { return -1; }
+inline void seekdir(DIR *, long) { }
 #undef _GLIBCXX_HAVE_DIRFD
 #undef _GLIBCXX_HAVE_UNLINKAT
 #endif


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
  2022-06-30  7:52     ` Alexandre Oliva
@ 2022-06-30  8:19       ` Sebastian Huber
  2022-07-05 17:39         ` Alexandre Oliva
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Huber @ 2022-06-30  8:19 UTC (permalink / raw)
  To: Alexandre Oliva, Jonathan Wakely; +Cc: libstdc++, gcc Patches, RTEMS



On 30/06/2022 09:52, Alexandre Oliva via Gcc-patches wrote:
> On Jun 27, 2022, Alexandre Oliva<oliva@adacore.com>  wrote:
> 
>> I see two potential ways to avoid this:
> Another possibility occurred to me: seeking back to the entry we're
> about to remove, before removing it.  Then, POSIX-compliant
> implementations will just skip the removed entry and find the next one,
> while RTEMS will find the next entry at the spot where the removed entry
> used to be.
> 
> It is syscall-heavier, and it may invoke O(n^2) behavior for each
> directory in remove_all, since prev_pos is quite likely to always hold
> the initial offset, requiring scanning past more and more removed
> entries after each removal, so I don't submit this formally for
> inclusion, but post it FTR.  I've only confirmed that it solves the
> problem on RTEMS, passing libstdc++ filesystem test, but I haven't
> tested it further.

 From my point of view this is behaviour is an RTEMS bug. Instead of 
adding tweaks for RTEMS, it would be better to report the issues and fix 
them in RTEMS. It could be also a Newlib issue.

-- 
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails
  2022-06-30  8:19       ` Sebastian Huber
@ 2022-07-05 17:39         ` Alexandre Oliva
  0 siblings, 0 replies; 8+ messages in thread
From: Alexandre Oliva @ 2022-07-05 17:39 UTC (permalink / raw)
  To: Sebastian Huber; +Cc: Jonathan Wakely, libstdc++, gcc Patches, RTEMS

On Jun 30, 2022, Sebastian Huber <sebastian.huber@embedded-brains.de> wrote:

> From my point of view this is behaviour is an RTEMS bug. Instead of
> adding tweaks for RTEMS, it would be better to report the issues and
> fix them in RTEMS. It could be also a Newlib issue.

Thanks, I've just filed https://devel.rtems.org/ticket/4674

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-07-05 17:39 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-22  6:19 [PATCH] libstdc++: retry removal of dir entries if dir removal fails Alexandre Oliva
2022-06-22  9:45 ` Jonathan Wakely
2022-06-23  1:02   ` Alexandre Oliva
2022-06-27 13:27   ` Alexandre Oliva
2022-06-29  5:16     ` Alexandre Oliva
2022-06-30  7:52     ` Alexandre Oliva
2022-06-30  8:19       ` Sebastian Huber
2022-07-05 17:39         ` Alexandre Oliva

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).