public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Jonathon Anderson <janderson@rice.edu>
To: libc-alpha@sourceware.org
Cc: John Mellor-Crummey <johnmc@rice.edu>
Subject: Re: [PATCH v6 4/5] io: Add closefrom [BZ #10353]
Date: Sat, 3 Jul 2021 09:45:12 -0500	[thread overview]
Message-ID: <b3473d33-fa2a-78e4-f51c-415f39858aea@rice.edu> (raw)
In-Reply-To: <20210623185115.395812-5-adhemerval.zanella@linaro.org>

Hello,

I'm part of a team at Rice University developing HPCToolkit, a 
fine-grained sampling-based performance analysis tool. Recently we have 
been sharing our experiences using the LD_AUDIT framework with the 
community, this patch series came to our attention as the subject of a 
long-standing issue we have run into multiple times with our tool.

The performance measurement portion of our tool runs as a library 
preloaded into the application, sharing the address space and (more 
importantly) file descriptors. As mentioned previously in this thread 
there are a number of projects which at times close all open file 
descriptors not known to the application (often via iteration over 
/proc/self/fd). We have found that in most cases this will also close 
our own file descriptors with no warning, causing us to corrupt and lose 
valuable measurement data, especially for highly multi-threaded 
applications.

In response to this problem we have considered reserving file 
descriptors above the RLIMIT_NOFILE limit, a technique used by Valgrind 
[7,8] where file descriptors are opened early in the process and then 
the limit is lowered below them (Valgrind intercepts getrlimit to report 
modified limits to the application, we would use setrlimit to lower the 
limit instead). While this does not remove their entries from 
/proc/self/fd (they are still valid open file descriptors), a number of 
projects that close all open file descriptors support this technique by 
only closing descriptors within the RLIMIT_NOFILE bound (obtained 
directly or indirectly through getdtablesize) [1][3][4][6].

Two of the three implementations of closefrom provided by this patch *do 
not* support this technique (specific locations marked below). Not only 
is this inconsistent, it directly affects in-process tools like 
HPCToolkit as applications shift to using modern Glibc extensions.

We recommend that this be adjusted before this patch is accepted into 
upstream Glibc, and that support for this technique be tested in 
tst-closefrom.c.

-Jonathon

[1] 
https://github.com/systemd/systemd/blob/5238e9575906297608ff802a27e2ff9effa3b338/src/basic/fd-util.c#L203-L204 

[3] 
https://github.com/python/cpython/blob/9e4f2f3a6b8ee995c365e86d976937c141d867f8/Modules/_posixsubprocess.c#L196-L200 

[4] 
https://github.com/rust-lang/rust/blob/5f47c0613ed4eb46fca3633c1297364c09e5e451/src/libstd/sys/unix/process2.rs#L161-L164 

[6] 
https://hg.mozilla.org/mozilla-central/file/tip/ipc/chromium/src/base/process_util_posix.cc#l152 

[7] 
https://sourceware.org/git/?p=valgrind.git;a=blob;f=coregrind/m_main.c;hb=393732dda164c1cc0fc511eadc0b8f06008ade4f#l1125 

[8] 
https://sourceware.org/git/?p=valgrind.git;a=blob;f=coregrind/m_syswrap/syswrap-generic.c;hb=393732dda164c1cc0fc511eadc0b8f06008ade4f#l3609

On 6/23/21 1:51 PM, Adhemerval Zanella via Libc-alpha wrote:
> diff --git a/sysdeps/unix/sysv/linux/closefrom.c b/sysdeps/unix/sysv/linux/closefrom.c
> new file mode 100644
> index 0000000000..f5d7342c2c
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/closefrom.c
> @@ -0,0 +1,35 @@
> +/* Close a range of file descriptors.  Linux version.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <stdio.h>
> +#include <sys/param.h>
> +#include <unistd.h>
> +
> +void
> +__closefrom (int lowfd)
> +{
> +  int l = MAX (0, lowfd);
> +
> +  int r = __close_range (l, ~0U, 0);
The upper limit `~0U` should be changed to `__getdtablesize()` or similar.
> +  if (r == 0)
> +    return;
> +
> +  if (!__closefrom_fallback (l))
> +    __fortify_fail ("closefrom failed to close a file descriptor");
> +}
> +weak_alias (__closefrom, closefrom)
> diff --git a/sysdeps/unix/sysv/linux/closefrom_fallback.c b/sysdeps/unix/sysv/linux/closefrom_fallback.c
> new file mode 100644
> index 0000000000..61e71d388d
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/closefrom_fallback.c
> @@ -0,0 +1,97 @@
> +/* Close a range of file descriptors.  Linux version.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <arch-fd_to_filename.h>
> +#include <dirent.h>
> +#include <not-cancel.h>
> +#include <stdbool.h>
> +
> +/* Fallback code: iterates over /proc/self/fd, closing each file descriptor
> +   that fall on the criteria.  */
> +_Bool
> +__closefrom_fallback (int from)
> +{
> +  bool ret = false;
> +
> +  int dirfd = __open_nocancel (FD_TO_FILENAME_PREFIX, O_RDONLY | O_DIRECTORY,
> +                               0);
> +  if (dirfd == -1)
> +    {
> +      /* The closefrom should work even when process can't open new files.  */
> +      if (errno == ENOENT)
> +        goto err;
> +
> +      for (int i = from; i < INT_MAX; i++)
> +        {
> +          int r = __close_nocancel (i);
> +          if (r == 0 || (r == -1 && errno != EBADF))
> +            break;
> +        }
> +
> +      dirfd = __open_nocancel (FD_TO_FILENAME_PREFIX, O_RDONLY | O_DIRECTORY,
> +                               0);
> +      if (dirfd == -1)
> +        goto err;
> +    }
> +
> +  char buffer[1024];
> +  while (true)
> +    {
> +      ssize_t ret = __getdents64 (dirfd, buffer, sizeof (buffer));
> +      if (ret == -1)
> +        goto err;
> +      else if (ret == 0)
> +        break;
> +
> +      /* If any file descriptor is closed it resets the /proc/self position
> +         read again from the start (to obtain any possible kernel update).  */
> +      bool closed = false;
> +      char *begin = buffer, *end = buffer + ret;
> +      while (begin != end)
> +        {
> +          unsigned short int d_reclen;
> +          memcpy (&d_reclen, begin + offsetof (struct dirent64, d_reclen),
> +                  sizeof (d_reclen));
> +          const char *dname = begin + offsetof (struct dirent64, d_name);
> +          begin += d_reclen;
> +
> +          if (dname[0] == '.')
> +            continue;
> +
> +          int fd = 0;
> +          for (const char *s = dname; (unsigned int) (*s) - '0' < 10; s++)
> +            fd = 10 * fd + (*s - '0');
> +
> +          if (fd == dirfd || fd < from)
> +            continue;
An extra condition here should check that `fd < __getdtablesize()` or 
similar.
> +
> +          /* We ignore close errors because EBADF, EINTR, and EIO means the
> +             descriptor has been released.  */
> +          __close_nocancel (fd);
> +          closed = true;
> +        }
> +
> +      if (closed && __lseek (dirfd, 0, SEEK_SET) < 0)
> +        goto err;
> +    }
> +
> +  ret = true;
> +err:
> +  __close_nocancel (dirfd);
> +  return ret;
> +}

  reply	other threads:[~2021-07-03 14:45 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-23 18:51 [PATCH v6 0/5] Add close_range, closefrom, and posix_spawn_file_actions_closefrom_np Adhemerval Zanella
2021-06-23 18:51 ` [PATCH v6 1/5] support: Add support_stack_alloc Adhemerval Zanella
2021-06-24  9:15   ` Florian Weimer
2021-06-24 11:33     ` Adhemerval Zanella
2021-06-23 18:51 ` [PATCH v6 2/5] support: Add xclone Adhemerval Zanella
2021-06-24  9:15   ` Florian Weimer
2021-06-23 18:51 ` [PATCH v6 3/5] linux: Add close_range Adhemerval Zanella
2021-06-27 17:23   ` Florian Weimer
2021-06-23 18:51 ` [PATCH v6 4/5] io: Add closefrom [BZ #10353] Adhemerval Zanella
2021-07-03 14:45   ` Jonathon Anderson [this message]
2021-07-05 20:26     ` Adhemerval Zanella
2021-07-06 11:37       ` Florian Weimer
2021-06-23 18:51 ` [PATCH v6 5/5] posix: Add posix_spawn_file_actions_closefrom_np Adhemerval Zanella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3473d33-fa2a-78e4-f51c-415f39858aea@rice.edu \
    --to=janderson@rice.edu \
    --cc=johnmc@rice.edu \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).