From mboxrd@z Thu Jan  1 00:00:00 1970
From: Florian Weimer
To: Andrei Vagin
Cc: Christian Brauner, Adhemerval Zanella, libc-alpha@sourceware.org,
 Alexey Izbyshev, "Carlos O'Donell"
Subject: Re: [PATCH v4 0/3] Linux: Fix posix_spawn when user with time namespaces
References: <20220510191155.1998575-1-adhemerval.zanella@linaro.org>
 <877d6tb3hl.fsf@oldenburg.str.redhat.com>
 <20220511092119.ke4zlm2dkazasmva@wittgenstein>
 <87h75dyf3p.fsf@oldenburg.str.redhat.com>
Date: Mon, 30 May 2022 14:58:01 +0200
In-Reply-To: (Andrei Vagin's message of "Fri, 27 May 2022 08:53:54 -0700")
Message-ID: <87sfori3dy.fsf@oldenburg.str.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
List-Id: Libc-alpha mailing list

* Andrei Vagin:

[CLONE_NEWTIME and vfork]

>> Breaking vfork is really a bit of a hassle for us, and the workaround
>> code is quite non-trivial and will have to be implemented across many
>> projects (not just glibc).  An unshare request that takes effect on
>> execve only would really help.
>
> Is the problem that vfork fails if a process has half-entered a time
> namespace?

Exactly.  Anything that implements a general-purpose process launching
facility on top of vfork now needs to implement fork fallback (after
vfork failure), so that launching processes still works if the original
process has called unshare(CLONE_NEWTIME).  In glibc, this affects
posix_spawn, but other libcs are also impacted, and so is any custom
posix_spawn-like interface that uses vfork internally.  Without fork
fallback, they become unusable if anything in the process has
previously called unshare(CLONE_NEWTIME).

The fallback implementation tends to be complicated if it's necessary
to report execve and other errors to the caller.  The choice is between
the O_CLOEXEC pipe hack (which has become more complex to implement due
to close_range support) and a shared mapping created using MAP_SHARED,
to which the subprocess writes error information (which adds more
potentially costly MM updates).  MAP_SHARED is probably easier to
implement than the pipe approach (no interference possible from file
actions), but for glibc, Adhemerval wrote something based on the pipe
approach.

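To make the pipe variant concrete, here is a minimal sketch of the fork
path that a wrapper might take after vfork has failed.  It is not the
glibc patch: spawn_via_fork is a hypothetical name, and the sketch
assumes Linux with pipe2 and ignores file actions entirely (which is
where the close_range complication mentioned above comes in).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not a glibc interface.  Launches PATH with fork
   and reports an execve failure through an O_CLOEXEC pipe.  Returns 0
   and stores the child PID in *PID_OUT on success, or an errno value
   on failure.  */
int
spawn_via_fork (pid_t *pid_out, const char *path,
                char *const argv[], char *const envp[])
{
  int pipefd[2];
  if (pipe2 (pipefd, O_CLOEXEC) != 0)
    return errno;

  pid_t pid = fork ();
  if (pid < 0)
    {
      int saved = errno;
      close (pipefd[0]);
      close (pipefd[1]);
      return saved;
    }

  if (pid == 0)
    {
      /* Child.  A successful execve closes pipefd[1] via O_CLOEXEC,
         so the code after execve only runs on failure.  */
      close (pipefd[0]);
      execve (path, argv, envp);
      int err = errno;
      ssize_t w;
      do
        w = write (pipefd[1], &err, sizeof err);
      while (w < 0 && errno == EINTR);
      _exit (127);
    }

  /* Parent.  Close our write end so that read sees EOF after execve.  */
  close (pipefd[1]);
  int err = 0;
  ssize_t n;
  do
    n = read (pipefd[0], &err, sizeof err);
  while (n < 0 && errno == EINTR);
  close (pipefd[0]);

  if (n == (ssize_t) sizeof err)
    {
      /* execve failed: reap the child and report its errno.  */
      while (waitpid (pid, NULL, 0) < 0 && errno == EINTR)
        ;
      return err;
    }

  /* EOF: execve succeeded.  A read error is treated the same way in
     this sketch.  */
  *pid_out = pid;
  return 0;
}
```

The parent blocks in read until the child has either called execve
successfully or exited, which is what lets the caller see the execve
error synchronously even though fork, unlike vfork, does not suspend
the parent.
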
But the key point is that any general-purpose wrapper around vfork now
has to implement fork fallback.

Regarding the patch you sketched, we'd probably have to introduce a new
flag (not CLONE_NEWTIME) for this because the difference in behavior is
quite visible.

Thanks,
Florian