public inbox for libc-alpha@sourceware.org
* [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2()
@ 2021-02-25 19:47 Petr Vorel
  2021-02-25 22:38 ` Dmitry V. Levin
  0 siblings, 1 reply; 7+ messages in thread
From: Petr Vorel @ 2021-02-25 19:47 UTC (permalink / raw)
  To: libc-alpha
  Cc: Petr Vorel, Florian Weimer, Adhemerval Zanella, Andreas Schwab,
	Aleksa Sarai, Fabian Vogt, Kir Kolyshkin, Ladislav Slezak

3d3ab573a5 ("Linux: Use faccessat2 to implement faccessat (bug 18683)")
started to use faccessat2(), which breaks docker/podman/... containers
when the guest runs glibc 2.33 on a host with an older kernel and the
runtime is built with an older libseccomp.

See also: https://bugzilla.opensuse.org/show_bug.cgi?id=1182451#c17

Signed-off-by: Petr Vorel <pvorel@suse.cz>
---
Hi,

I admit that this is a very ugly workaround, and I wouldn't be surprised
if you just don't care about seccomp() incompatibilities. But it'd be nice
to have a unified approach for this incompatibility, as it hits any distro
with glibc 2.33 (currently openSUSE Tumbleweed, Arch Linux, Fedora
rawhide). And after some time (when old LTS distros reach EOL) this crap
could be removed.

More info:
https://github.com/opencontainers/runc/pull/2750
https://github.com/seccomp/libseccomp/issues/314

Kind regards,
Petr

 sysdeps/unix/sysv/linux/faccessat.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/sysdeps/unix/sysv/linux/faccessat.c b/sysdeps/unix/sysv/linux/faccessat.c
index 13160d3249..f01c59b6e7 100644
--- a/sysdeps/unix/sysv/linux/faccessat.c
+++ b/sysdeps/unix/sysv/linux/faccessat.c
@@ -30,9 +30,22 @@ __faccessat (int fd, const char *file, int mode, int flag)
 #if __ASSUME_FACCESSAT2
   return ret;
 #else
-  if (ret == 0 || errno != ENOSYS)
+  if (ret == 0 || (errno != ENOSYS && errno != EPERM))
     return ret;
 
+  /*
+   * Probe for a seccomp() filter blocking faccessat2(): EPERM from the probe
+   * means a filter is in use; ENOSYS or EBADF means the first EPERM was real.
+   */
+  if (errno == EPERM) {
+    int backup = errno;
+    INLINE_SYSCALL_CALL (faccessat2, -2, ".", 0, 0);
+    int err = errno;
+    errno = backup;
+    if (err != EPERM)
+      return ret;
+  }
+
   if (flag & ~(AT_SYMLINK_NOFOLLOW | AT_EACCESS))
     return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
 
-- 
2.30.1
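
For readers who want to try the probe outside glibc, here is a minimal
standalone sketch of the idea in the hunk above. It is not the glibc code:
the names eperm_origin, classify_probe_errno, probe_faccessat2 and
check_classification are hypothetical, and the fallback value 439 for
SYS_faccessat2 is an assumption that holds on most architectures but not
all. The probe passes an invalid descriptor (-2) with a relative path, so
an unfiltered kernel should answer EBADF (or ENOSYS if it lacks the
syscall), while a filter that rejects the syscall answers EPERM.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 439 is the faccessat2 number on most architectures; older
   kernel headers may not define SYS_faccessat2 at all.  */
#ifndef SYS_faccessat2
# define SYS_faccessat2 439
#endif

/* Hypothetical names for this sketch; they are not part of glibc.  */
enum eperm_origin { EPERM_GENUINE, EPERM_FROM_FILTER };

/* Classify the errno produced by the probe call: only EPERM points at a
   seccomp filter, anything else means the first EPERM was genuine.  */
static enum eperm_origin
classify_probe_errno (int probe_errno)
{
  return probe_errno == EPERM ? EPERM_FROM_FILTER : EPERM_GENUINE;
}

/* Issue the probe with an invalid descriptor and a relative path.  The
   caller's errno is saved and restored around the probe.  */
static enum eperm_origin
probe_faccessat2 (void)
{
  int saved = errno;
  long int r = syscall (SYS_faccessat2, -2, ".", 0, 0);
  int probe_errno = r == -1 ? errno : 0;
  errno = saved;
  return classify_probe_errno (probe_errno);
}

/* Usage example for the classification step.  */
static void
check_classification (void)
{
  assert (classify_probe_errno (EPERM) == EPERM_FROM_FILTER);
  assert (classify_probe_errno (EBADF) == EPERM_GENUINE);
  assert (classify_probe_errno (ENOSYS) == EPERM_GENUINE);
}
```

Note that the probe costs one extra syscall on every EPERM result, which
is part of why the approach is described as ugly.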



* Re: [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2()
  2021-02-25 19:47 [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2() Petr Vorel
@ 2021-02-25 22:38 ` Dmitry V. Levin
  2021-02-26  4:11   ` Petr Vorel
  2021-02-28  7:56   ` Aleksa Sarai
  0 siblings, 2 replies; 7+ messages in thread
From: Dmitry V. Levin @ 2021-02-25 22:38 UTC (permalink / raw)
  To: Petr Vorel
  Cc: libc-alpha, Florian Weimer, Fabian Vogt, Andreas Schwab,
	Kir Kolyshkin, Aleksa Sarai, Ladislav Slezak

Hi,

On Thu, Feb 25, 2021 at 08:47:02PM +0100, Petr Vorel wrote:
> 3d3ab573a5 ("Linux: Use faccessat2 to implement faccessat (bug 18683)")
> started to use faccessat2() which breaks docker/podman/... containers
> with guest running glibc 2.33 running on host with older kernel and are
> built with older libseccomp.
> 
> See also: https://bugzilla.opensuse.org/show_bug.cgi?id=1182451#c17
> 
> Signed-off-by: Petr Vorel <pvorel@suse.cz>
> ---
> Hi,
> 
> I admit that this is a very ugly workaround and wouldn't be surprised if
> you just don't care about seccomp() incompatibilities. But it'd be nice
> to have unified approach for this incompatibility, as it hits any distro
> with glibc 2.33 (currently openSUSE Tumbleweed, Arch Linux, Fedora
> rawhide). And after some time (when old LTS distros EOL) this crap could be removed.
> 
> More info:
> https://github.com/opencontainers/runc/pull/2750
> https://github.com/seccomp/libseccomp/issues/314
> 
> Kind regards,
> Petr

Petr, you must have missed the whole discussion on this subject [1][2];
the consensus was that problematic container runtimes need to be fixed
to make their seccomp filters return ENOSYS for unknown syscalls.

[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119955.html
[2] https://lore.kernel.org/linux-api/87lfer2c0b.fsf@oldenburg2.str.redhat.com/T/#u
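
As an illustration of what "return ENOSYS for unknown syscalls" can look
like at the BPF level, here is a minimal sketch. It is not the runc or
libseccomp implementation: LAST_KNOWN_NR, enosys_filter and
install_enosys_filter are hypothetical names, the threshold 438 is an
assumption chosen so that faccessat2 (439 on most architectures) counts as
unknown, and a real filter must also check seccomp_data.arch and apply an
actual policy to the known syscalls instead of allowing them all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Assumption: highest syscall number the (hypothetical) profile knows.  */
#define LAST_KNOWN_NR 438

/* The errno value must fit in the 16-bit data field of the return value.  */
static_assert ((ENOSYS & ~SECCOMP_RET_DATA) == 0, "errno out of range");

/* Syscalls newer than the profile get ENOSYS, so libc fallbacks such as
   faccessat2 -> faccessat keep working.  Everything else is allowed here
   only to keep the sketch short.  */
static struct sock_filter enosys_filter[] = {
  /* A = syscall number.  */
  BPF_STMT (BPF_LD | BPF_W | BPF_ABS, offsetof (struct seccomp_data, nr)),
  /* If A > LAST_KNOWN_NR fall through to ENOSYS, else skip one insn.  */
  BPF_JUMP (BPF_JMP | BPF_JGT | BPF_K, LAST_KNOWN_NR, 0, 1),
  BPF_STMT (BPF_RET | BPF_K, SECCOMP_RET_ERRNO | ENOSYS),
  BPF_STMT (BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Install the filter for the calling thread.  Returns 0 on success and
   -1 with errno set on failure.  */
static int
install_enosys_filter (void)
{
  struct sock_fprog prog = {
    .len = (unsigned short) (sizeof enosys_filter / sizeof enosys_filter[0]),
    .filter = enosys_filter,
  };
  if (prctl (PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
    return -1;
  return prctl (PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

With a filter shaped like this, a libc that tries a new syscall first sees
ENOSYS and takes its normal fallback path, with no libc-side workaround.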


-- 
ldv


* Re: [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2()
  2021-02-25 22:38 ` Dmitry V. Levin
@ 2021-02-26  4:11   ` Petr Vorel
  2021-02-28  6:03     ` Mike Frysinger
  2021-02-28  7:56   ` Aleksa Sarai
  1 sibling, 1 reply; 7+ messages in thread
From: Petr Vorel @ 2021-02-26  4:11 UTC (permalink / raw)
  To: Dmitry V. Levin
  Cc: libc-alpha, Florian Weimer, Fabian Vogt, Andreas Schwab,
	Kir Kolyshkin, Aleksa Sarai, Ladislav Slezak

Hi Dmitry,

> Petr, you must have missed the whole discussion on this subject [1][2],
> the consensus was that problematic container runtimes need to be fixed
> to make their seccomp filters return ENOSYS for unknown syscalls.

Thanks for the info, and sorry for the spam then.

Kind regards,
Petr

> [1] https://sourceware.org/pipermail/libc-alpha/2020-November/119955.html
> [2] https://lore.kernel.org/linux-api/87lfer2c0b.fsf@oldenburg2.str.redhat.com/T/#u


* Re: [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2()
  2021-02-26  4:11   ` Petr Vorel
@ 2021-02-28  6:03     ` Mike Frysinger
  0 siblings, 0 replies; 7+ messages in thread
From: Mike Frysinger @ 2021-02-28  6:03 UTC (permalink / raw)
  To: Petr Vorel
  Cc: Dmitry V. Levin, Florian Weimer, libc-alpha, Andreas Schwab,
	Kir Kolyshkin, Fabian Vogt, Aleksa Sarai, Ladislav Slezak

On 26 Feb 2021 05:11, Petr Vorel wrote:
> Hi Dmitry,
> > Petr, you must have missed the whole discussion on this subject [1][2],
> > the consensus was that problematic container runtimes need to be fixed
> > to make their seccomp filters return ENOSYS for unknown syscalls.
> 
> Thanks for info and sorry for spam then.

no need to apologize.  can't expect everyone to read everything all the time.
-mike


* Re: [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2()
  2021-02-25 22:38 ` Dmitry V. Levin
  2021-02-26  4:11   ` Petr Vorel
@ 2021-02-28  7:56   ` Aleksa Sarai
  2021-03-01 11:54     ` Florian Weimer
  1 sibling, 1 reply; 7+ messages in thread
From: Aleksa Sarai @ 2021-02-28  7:56 UTC (permalink / raw)
  To: Dmitry V. Levin
  Cc: Petr Vorel, libc-alpha, Florian Weimer, Fabian Vogt,
	Andreas Schwab, Kir Kolyshkin, Ladislav Slezak


On 2021-02-26, Dmitry V. Levin <ldv@altlinux.org> wrote:
> On Thu, Feb 25, 2021 at 08:47:02PM +0100, Petr Vorel wrote:
> > 3d3ab573a5 ("Linux: Use faccessat2 to implement faccessat (bug 18683)")
> > started to use faccessat2() which breaks docker/podman/... containers
> > with guest running glibc 2.33 running on host with older kernel and are
> > built with older libseccomp.
> > 
> > See also: https://bugzilla.opensuse.org/show_bug.cgi?id=1182451#c17
> > 
> > Signed-off-by: Petr Vorel <pvorel@suse.cz>
> > ---
> > Hi,
> > 
> > I admit that this is a very ugly workaround and wouldn't be surprised if
> > you just don't care about seccomp() incompatibilities. But it'd be nice
> > to have unified approach for this incompatibility, as it hits any distro
> > with glibc 2.33 (currently openSUSE Tumbleweed, Arch Linux, Fedora
> > rawhide). And after some time (when old LTS distros EOL) this crap could be removed.
> > 
> > More info:
> > https://github.com/opencontainers/runc/pull/2750
> > https://github.com/seccomp/libseccomp/issues/314
> > 
> > Kind regards,
> > Petr
> 
> Petr, you must have missed the whole discussion on this subject [1][2],
> the consensus was that problematic container runtimes need to be fixed
> to make their seccomp filters return ENOSYS for unknown syscalls.

It should also be noted that we fixed this in runc a month ago[1], which
means that it's up to distributions and cloud vendors to update their
runc packages to the latest version or backport the patch.

Docker's packaging hasn't been updated to use the latest runc yet
(that'll happen in the next patch release), but distributions can ship
newer runc versions -- that's what we do in openSUSE.

[1]: https://github.com/opencontainers/runc/pull/2750

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>


* Re: [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2()
  2021-02-28  7:56   ` Aleksa Sarai
@ 2021-03-01 11:54     ` Florian Weimer
  2021-03-04  8:27       ` Petr Vorel
  0 siblings, 1 reply; 7+ messages in thread
From: Florian Weimer @ 2021-03-01 11:54 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Dmitry V. Levin, Petr Vorel, libc-alpha, Fabian Vogt,
	Andreas Schwab, Kir Kolyshkin, Ladislav Slezak

* Aleksa Sarai:

> It should also be noted that we fixed this in runc a month ago[1], which
> means that it's up to distributions and cloud vendors to update their
> runc packages to the latest version or backport the patch.
>
> Docker's packaging hasn't been updated to use the latest runc yet
> (that'll happen in the next patch release), but distributions can ship
> newer runc versions -- that's what we do in openSUSE.
>
> [1]: https://github.com/opencontainers/runc/pull/2750

There are some indications that not all container runtimes will pick up
the runc kludge (thanks for developing that by the way).  So it's likely
that the general issue will be with us for a while longer.  Maybe the
competitive pressure from other working container runtimes will
encourage others to re-evaluate their approach, I don't know.

We still don't plan to throw in downstream-only glibc patches to paper
over this (given that it's been rejected by kernel and glibc developers
alike, I really think it's the wrong way to go).  So far management
isn't breathing down our necks.

Thanks,
Florian



* Re: [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2()
  2021-03-01 11:54     ` Florian Weimer
@ 2021-03-04  8:27       ` Petr Vorel
  0 siblings, 0 replies; 7+ messages in thread
From: Petr Vorel @ 2021-03-04  8:27 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Aleksa Sarai, Dmitry V. Levin, libc-alpha, Fabian Vogt,
	Andreas Schwab, Kir Kolyshkin, Ladislav Slezak

Hi all,

> There are some indications that not all container runtimes will pick up
> the runc kludge (thanks for developing that by the way).  So it's likely
> that the general issue will be with us for a while longer.  Maybe the
> competitive pressure from other working container runtimes will
> encourage other re-evaluate their approach, I don't know.
Hopefully.

> We still don't plan to throw in downstream-only glibc patches to paper
> over this (given that it's been rejected by kernel and glibc developers
> alike, I really think it's the wrong way to go).  So far management
> isn't breathing down our necks.
As a workaround exists (for openSUSE, using podman with the newest runc,
v1.0.0-rc93), I understand the reluctance to accept a glibc workaround. It
just reminds me of musl's occasional approach of being correct no matter
what problems that brings to users.

Kind regards,
Petr

> Thanks,
> Florian

