public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Martin Buchholz <martinrb@google.com>, Zack Weinberg <zackw@panix.com>
Cc: GNU C Library <libc-alpha@sourceware.org>
Subject: Re: system and popen fail in case of big application
Date: Thu, 13 Sep 2018 01:27:00 -0000	[thread overview]
Message-ID: <01ffd458-794b-6d05-0635-f13437bec096@linaro.org> (raw)
In-Reply-To: <CA+kOe0-CvUou3Au4y+niES8OnspG==mVE6YKbQXpAZ-wmcPPDw@mail.gmail.com>



On 12/09/2018 16:45, Martin Buchholz wrote:
> On Wed, Sep 12, 2018 at 9:29 AM, Zack Weinberg <zackw@panix.com> wrote:
>> I don't want to encourage people to use posix_spawn because
>> posix_spawn is a badly designed API.  It's difficult to use correctly.
>> It's *tedious* to use correctly, which means people won't bother.  It
>> can't do everything you might want to do on the child side (witness
>> the discussion of adding an extension to let it do chdir()).  Its
>> behavior in case of errors is underspecified.  And it might be
>> implemented in terms of fork, which means it doesn't guarantee to
>> solve Sergey's problem.
> 
> I seem to be in agreement with Zack's eloquent post.  From java's
> point of view, we need to avoid the momentary doubling of memory
> caused by fork()ing a huge process, and we need to be able to specify
> a working directory for the child.  One strategy is to use posix_spawn
> to start a small helper process which then in turn execs the real
> child after necessary set up (including chdir).
> 

The code you linked [1] contains some outdated information regarding
vfork and posix_spawn:

  - vfork would be supported on Linux not because its a good interface,
    but rather than Linux policy is to not broke userspace in any case.
    And it would be supported only on architectures that already provide
    it, new ports do not automatically export vfork and they are not 
    encouraged to do so. Currently ARC, alpha, s390, microblame, x86, sH,
    powerpc, hexagen, parisc, m68k, arm, and aarch64 do so (and aarch64 
    only if compat mode is enabled, a 64-bit kernel only won't have vfork
    syscall). For remaining arches, glibc will just use a clone plus
    required flags (CLONE_VM | CLONE_VFORK | SIGCHLD).

  - Almost sure the clone issues described is due the fact of old tid
    caching due CLONE_VM and this semantic has been changed in recent
    glibc version.  clone now should not touch any shared internal 
    structures, the only constraint is it still requires a stack argument.

Also, on vfork oath I am not very confident openjdk implementation works
around correctly on the issue that posix_spawn has fixed recently:

  - inherent usage issue with signal handlers (BZ#14750): I don't see
    any signal handling in neither Java_java_lang_ProcessImpl_forkAndExec,
    startChild, vforkChild, or childProcess (maybe it is handle by the
    Java_java_lang_ProcessImpl_forkAndExec caller?).

  - cancellation state being shared with the parent (BZ#10354): at least
    two cancellation entrypoints are called on childProcess (write and
    close), which in a vfork state is undefined behaviour if cancellation
    in pending and act upon it.

  - vfork limitation of possible parent clobber due stack spilling:
    although the call stack does have some care of assuming a shared
    memory environment, it does not have any treatment of possible
    compiler spilling the parent's stack (glibc's posix_spawn uses a
    mmaped stack for this issue).

  - Another issue is the case where the helper process itself
    is killed somehow (there were some discussion about it on BZ#22273).

  - The closeDescriptors is inherent racy in the way it is implemented
    for vfork case: either by iterating over proc/%d/fd or _SC_OPEN_MAX
    you can't guarantee that another thread won't open another file. 
    Worse, taking locks in the vfork helper case can't be quite
    troublesome for the aforementioned case where helper process is
    killed before calling either _exit or _execv*.

You may have not crossed any of this issue due the fact pthread cancellation
is not widely used, some compiler luck with code generation, and with runtime
not issuing misguided signals.  In any case I would say it would be better to
use the posix_spawn strategy by issuing a helper process for Linux as well.

And what I would really like is to see what prevents openjdk and other programs
to use posix_spawn directly. It seems now that you need an extension to change 
the working directory and to close the inherited file descriptor. We are 
discussing the former and I think it would be feasible to raise it to Austin 
group as an extension.

For former I see there is no easy way to provide a similar closefrom
function without either being racy (as *BSD and openjdk implementations)
or to remove some scalability by adding serialization (for open* syscall
to avoid create new FD while closeall is closing them). A better alternative
would to request for some kernel helper, but I am not convinced that
current way of explicit open all file descriptors with O_CLOEXEC is not
the better option. Unfortunately it does not help run external code
(through shared libraries) that still call open in default mode.

[1] http://hg.openjdk.java.net/jdk/jdk/file/tip/src/java.base/unix/native/libjava/ProcessImpl_md.c

  reply	other threads:[~2018-09-13  1:27 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-09 22:16 Sergey Melnikov
2018-09-09 22:44 ` Zack Weinberg
2018-09-09 22:47   ` Zack Weinberg
2018-09-09 23:27     ` Sergey Melnikov
2018-09-09 23:49       ` Zack Weinberg
2018-09-10  2:50         ` Martin Buchholz
2018-09-10 15:38       ` Joseph Myers
2018-09-10 15:42         ` Zack Weinberg
2018-09-10 15:43           ` Joseph Myers
     [not found]       ` <CA+kOe0_1k_PbJ-pjHznP4AmTJUgziAdT+4vCcCRSb7GGdvbv7Q@mail.gmail.com>
2018-09-10 15:59         ` Zack Weinberg
2018-09-10 16:55           ` Martin Buchholz
2018-09-11 20:16   ` Adhemerval Zanella
2018-09-12 16:30     ` Zack Weinberg
2018-09-12 19:46       ` Martin Buchholz
2018-09-13  1:27         ` Adhemerval Zanella [this message]
2018-09-13  6:31           ` Andreas Schwab
2018-09-13 12:30             ` Adhemerval Zanella
2018-09-18  1:18           ` Martin Buchholz
2018-09-18  1:59             ` Adhemerval Zanella
2018-09-18  2:18               ` Martin Buchholz
2018-09-18  4:31               ` Rich Felker
2018-09-12 23:08       ` Adhemerval Zanella
2018-09-13 12:11       ` Szabolcs Nagy
2018-09-14 18:42         ` Zack Weinberg
2018-09-17 10:32           ` Szabolcs Nagy
2018-09-17 18:27             ` Adhemerval Zanella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=01ffd458-794b-6d05-0635-f13437bec096@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=libc-alpha@sourceware.org \
    --cc=martinrb@google.com \
    --cc=zackw@panix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).