public inbox for gdb@sourceware.org
 help / color / mirror / Atom feed
* Re: [PATCH 0/2] wait/ptrace: always assume __WALL if the child is traced
       [not found] <20151020171740.GA29290@redhat.com>
@ 2015-10-22 14:40 ` Pedro Alves
  2015-10-25 14:46   ` Oleg Nesterov
       [not found] ` <20151020171754.GA29304@redhat.com>
  1 sibling, 1 reply; 6+ messages in thread
From: Pedro Alves @ 2015-10-22 14:40 UTC (permalink / raw)
  To: Oleg Nesterov, Andrew Morton, Dmitry Vyukov
  Cc: Alexander Potapenko, Denys Vlasenko, Eric Dumazet,
	Jan Kratochvil, Julien Tinnes, Kees Cook, Kostya Serebryany,
	Linus Torvalds, Michael Kerrisk (man-pages),
	Robert Swiecki, Roland McGrath, syzkaller, linux-kernel, gdb

On 10/20/2015 06:17 PM, Oleg Nesterov wrote:

> Jan, Pedro, could you please confirm this won't break gdb? I tried
> to look into gdb-7.1, and at first glance gdb uses __WCLONE only
> because __WALL doesn't work on older kernels, iow it seems to me
> that gdb actually wants __WALL so this change should be fine.

Right, gdb actually wants __WALL, but it doesn't use it to keep
compatibility with kernels that predate it.

gdb nowadays has an __WALL emulation waitpid wrapper
(alternates __WCLONE+WNOHANG with 0+WNOHANG, blocks
on sigsuspend/SIGCHLD):

 https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/nat/linux-waitpid.c;h=cbcdd95afa9c664993542b0b3851e79fbae4e1df;hb=HEAD#l77

Though it's not used everywhere.  Some older code in the
ptrace backend open codes the "try __WCLONE, then try !__WCLONE."
dance.

Seems like __WALL was added in Linux 2.4; gdb could probably
assume it's available nowadays...

In any case, to make sure existing gdb binaries would still work
with your kernel change, I ran GDB's testsuite with this:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/gdb/nat/linux-waitpid.c b/gdb/nat/linux-waitpid.c
index cbcdd95..864ba2e 100644
--- a/gdb/nat/linux-waitpid.c
+++ b/gdb/nat/linux-waitpid.c
@@ -149,3 +149,17 @@ my_waitpid (int pid, int *status, int flags)
   errno = out_errno;
   return ret;
 }
+
+#include <dlfcn.h>
+
+pid_t
+waitpid (pid_t pid, int *status, int options)
+{
+  static pid_t (*waitpid2) (pid_t pid, int *status, int options) = NULL;
+
+  if (waitpid2 == NULL)
+    waitpid2 = dlsym (RTLD_NEXT, "waitpid");
+
+  options |= __WALL;
+  return waitpid2 (pid, status, options);
+}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and got no regressions.  So seems like all would be well from
GDB's perspective.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] wait/ptrace: always assume __WALL if the child is traced
  2015-10-22 14:40 ` [PATCH 0/2] wait/ptrace: always assume __WALL if the child is traced Pedro Alves
@ 2015-10-25 14:46   ` Oleg Nesterov
  0 siblings, 0 replies; 6+ messages in thread
From: Oleg Nesterov @ 2015-10-25 14:46 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Andrew Morton, Dmitry Vyukov, Alexander Potapenko,
	Denys Vlasenko, Eric Dumazet, Jan Kratochvil, Julien Tinnes,
	Kees Cook, Kostya Serebryany, Linus Torvalds,
	Michael Kerrisk (man-pages),
	Robert Swiecki, Roland McGrath, syzkaller, linux-kernel, gdb

On 10/22, Pedro Alves wrote:
>
> In any case, to make sure existing gdb binaries would still work
> with your kernel change, I ran GDB's testsuite with this:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> diff --git a/gdb/nat/linux-waitpid.c b/gdb/nat/linux-waitpid.c
> index cbcdd95..864ba2e 100644
> --- a/gdb/nat/linux-waitpid.c
> +++ b/gdb/nat/linux-waitpid.c
> @@ -149,3 +149,17 @@ my_waitpid (int pid, int *status, int flags)
>    errno = out_errno;
>    return ret;
>  }
> +
> +#include <dlfcn.h>
> +
> +pid_t
> +waitpid (pid_t pid, int *status, int options)
> +{
> +  static pid_t (*waitpid2) (pid_t pid, int *status, int options) = NULL;
> +
> +  if (waitpid2 == NULL)
> +    waitpid2 = dlsym (RTLD_NEXT, "waitpid");
> +
> +  options |= __WALL;
> +  return waitpid2 (pid, status, options);
> +}
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Thanks a lot Pedro!

So gdb should be fine, strace too. Perhaps we should change the kernel
this way and forget about /sbin/init fixes.

Oleg.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] wait/ptrace: always assume __WALL if the child is traced
       [not found]             ` <20151025155440.GB2043@redhat.com>
@ 2015-10-26 12:09               ` Pedro Alves
  2015-10-28 15:15                 ` Oleg Nesterov
  0 siblings, 1 reply; 6+ messages in thread
From: Pedro Alves @ 2015-10-26 12:09 UTC (permalink / raw)
  To: Oleg Nesterov, Denys Vlasenko
  Cc: Denys Vlasenko, Andrew Morton, Dmitry Vyukov,
	Alexander Potapenko, Eric Dumazet, Jan Kratochvil, Julien Tinnes,
	Kees Cook, Kostya Serebryany, Linus Torvalds,
	Michael Kerrisk (man-pages),
	Robert Swiecki, Roland McGrath, syzkaller,
	Linux Kernel Mailing List, gdb

On 10/25/2015 03:54 PM, Oleg Nesterov wrote:
> On 10/22, Denys Vlasenko wrote:
>>
>> On Wed, Oct 21, 2015 at 11:47 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>>> On 10/21, Denys Vlasenko wrote:
>>>>
>>>> On 10/21/2015 09:59 PM, Denys Vlasenko wrote:
>>>>> On 10/21/2015 12:31 AM, Andrew Morton wrote:
>>>>>> Well, to fix this a distro needs to roll out a new kernel.  Or a new
>>>>>> init(8).  Is there any reason to believe that distributing/deploying a
>>>>>> new kernel is significantly easier for everyone?  Because fixing init
>>>>>> sounds like a much preferable solution to this problem.
>>>>>
>>>>> People will continue to write new init(8) implementations,
>>>>> and they will miss this obscure case.
>>>>>
>>>>> Before this bug was found, it was considered possible to use
>>>>> a shell script as init process. What now, every shell needs to add
>>>>> __WALL to its waitpids?
>>>
>>> Why not? I think it can safely use __WALL too.
>>
>> Because having any userspace program which can happen to be init,
>> which includes all shells out there in the wild
>> (bash, dash, ash, ksh, zsh, msh, hush,...)
>> learn about __WALL is wrong: apart from this wart, they do not have
>> to use any Linux-specific code. It can all be perfectly legitimate POSIX.
> 
> Yes, this is true. I meant that they could safely use __WALL to, but I
> understand that this change can be painful.
> 
>>> Sure. But people do the things which were never intended to be
>>> used all the time. We simply can not know if this "feature"
>>> already has a creative user or not.
>>
>> It won't be unfixable: they will just have to switch from PTRACE_TRACEME
>> to PTRACE_ATTACH.
>>
>> As of now we do not know any people craz^W creative enough
>> to create a cross between init and strace. If such specimens would
>> materialize, don't they deserve to have to make that change?
> 
> This also applies to people who use bash/whatever as /sbin/init and allow
> the untrusted users to run the exploits ;) I do not know who is more crazy.
> 
> In any case, the real question is whether we should change the kernel to
> fix the problem, or ask the distros to fix their init's. In the former
> case 1/2 looks simpler/safer to me than the change in ptrace_traceme(),
> and you seem to agree that 1/2 is not that bad.

A risk here seems to be that waitpid will start returning unexpected
(thread) PIDs to parent processes, and it's not unreasonable to assume
that e.g., a program asserts that waitpid either returns error or a
known (process) PID.

That's not an init-only issue, but something that might affect any
process that runs a child that happens to decide to
call PTRACE_TRACEME.

The ptrace man page says:

 "A process can initiate a trace by calling fork(2) and having the resulting
 child do a PTRACE_TRACEME, followed (typically) by an execve(2)."

Given that, can we instead make the kernel error out on PTRACE_TRACEME issued
from a non-leader thread?  Then between PTRACE_TRACEME and the parent's
waitpid, __WALL or !__WALL should make no difference.

(Also, in the original test case, if the child gets/raises a signal or execs
before exiting, the bash/init/whatever process won't be issuing PTRACE_CONT,
and the child will thus end up stuck (though should be SIGKILLable,
I believe).  All this because PTRACE_TRACEME is broken by design by making it
be the child's choice whether to be traced...)

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] wait/ptrace: always assume __WALL if the child is traced
  2015-10-26 12:09               ` [PATCH 1/2] " Pedro Alves
@ 2015-10-28 15:15                 ` Oleg Nesterov
  2015-10-28 15:43                   ` Pedro Alves
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2015-10-28 15:15 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Denys Vlasenko, Denys Vlasenko, Andrew Morton, Dmitry Vyukov,
	Alexander Potapenko, Eric Dumazet, Jan Kratochvil, Julien Tinnes,
	Kees Cook, Kostya Serebryany, Linus Torvalds,
	Michael Kerrisk (man-pages),
	Robert Swiecki, Roland McGrath, syzkaller,
	Linux Kernel Mailing List, gdb

On 10/26, Pedro Alves wrote:
>
> On 10/25/2015 03:54 PM, Oleg Nesterov wrote:
> >
> > In any case, the real question is whether we should change the kernel to
> > fix the problem, or ask the distros to fix their init's. In the former
> > case 1/2 looks simpler/safer to me than the change in ptrace_traceme(),
> > and you seem to agree that 1/2 is not that bad.
>
> A risk here seems to be that waitpid will start returning unexpected
> (thread) PIDs to parent processes,

I don't see how this change can make the things worse,

> and it's not unreasonable to assume
> that e.g., a program asserts that waitpid either returns error or a
> known (process) PID.

Well. /sbin/init can never assume this, obviously.

> That's not an init-only issue,

Yes. Because we have CLONE_PARENT. So "waitpid either returns error or a
known (process) PID" is only true if you trust your children.

> but something that might affect any
> process that runs a child that happens to decide to
> call PTRACE_TRACEME.
>
> The ptrace man page says:
>
>  "A process can initiate a trace by calling fork(2) and having the resulting
>  child do a PTRACE_TRACEME, followed (typically) by an execve(2)."
>
> Given that, can we instead make the kernel error out on PTRACE_TRACEME issued
> from a non-leader thread?  Then between PTRACE_TRACEME and the parent's
> waitpid, __WALL or !__WALL should make no difference.

Afaics not really. A group leader can do PTRACE_TRACEME and then
clone(CLONE_THREAD | CLONE_PTRACE) with the same effect.

> (Also, in the original test case, if the child gets/raises a signal or execs
> before exiting, the bash/init/whatever process won't be issuing PTRACE_CONT,
> and the child will thus end up stuck (though should be SIGKILLable,

Oh, but if it is killable everything is fine. How does this differ from the
case when, say, you jusr reparent to init and do kill(getpid(), SIGSTOP) ?

> All this because PTRACE_TRACEME is broken by design

Heh. I agree. But we can't fix it now.

Oleg.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] wait/ptrace: always assume __WALL if the child is traced
  2015-10-28 15:15                 ` Oleg Nesterov
@ 2015-10-28 15:43                   ` Pedro Alves
  2015-10-28 18:06                     ` Oleg Nesterov
  0 siblings, 1 reply; 6+ messages in thread
From: Pedro Alves @ 2015-10-28 15:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Denys Vlasenko, Andrew Morton, Dmitry Vyukov,
	Alexander Potapenko, Eric Dumazet, Jan Kratochvil, Julien Tinnes,
	Kees Cook, Kostya Serebryany, Linus Torvalds,
	Michael Kerrisk (man-pages),
	Robert Swiecki, Roland McGrath, syzkaller,
	Linux Kernel Mailing List, gdb

On 10/28/2015 04:11 PM, Oleg Nesterov wrote:
> On 10/26, Pedro Alves wrote:
>>
>> On 10/25/2015 03:54 PM, Oleg Nesterov wrote:
>>>
>>> In any case, the real question is whether we should change the kernel to
>>> fix the problem, or ask the distros to fix their init's. In the former
>>> case 1/2 looks simpler/safer to me than the change in ptrace_traceme(),
>>> and you seem to agree that 1/2 is not that bad.
>>
>> A risk here seems to be that waitpid will start returning unexpected
>> (thread) PIDs to parent processes,
> 
> I don't see how this change can make the things worse,
> 
>> and it's not unreasonable to assume
>> that e.g., a program asserts that waitpid either returns error or a
>> known (process) PID.
> 
> Well. /sbin/init can never assume this, obviously.

Right.  I was actually thinking of !init processes -- basically code
that spawns helper processes, keeps a data structure indexed by pid, then
discards the structure when the child exits.  Something like:

 pid = waitpid(-1, &status, 0);
 if (pid > 0)
 {
   struct child_process *child = find_process(pid);
   assert (child != NULL);
 }

As in, before your change, the child could get stuck forever, but after your
change, the parent could die/assert instead.

But ...

> 
>> That's not an init-only issue,
> 
> Yes. Because we have CLONE_PARENT. So "waitpid either returns error or a
> known (process) PID" is only true if you trust your children.

... OK, that's indeed a good point.

>> (Also, in the original test case, if the child gets/raises a signal or execs
>> before exiting, the bash/init/whatever process won't be issuing PTRACE_CONT,
>> and the child will thus end up stuck (though should be SIGKILLable,
>
> Oh, but if it is killable everything is fine. How does this differ from the
> case when, say, you jusr reparent to init and do kill(getpid(), SIGSTOP) ?

The difference is that if the child called PTRACE_TRACEME, then it goes
to ptrace-stop instead and no amount of SIGCONT unstucks it -- the only way
out is force killing.  I agree it's not a major issue as there's a way out
(and thus made it a parens), but I wouldn't call it nice either.

>> All this because PTRACE_TRACEME is broken by design
>
> Heh. I agree. But we can't fix it now.

Perhaps the man page could document it as deprecated, suggesting
PTRACE_ATTACH/PTRACE_SEIZE instead?

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] wait/ptrace: always assume __WALL if the child is traced
  2015-10-28 15:43                   ` Pedro Alves
@ 2015-10-28 18:06                     ` Oleg Nesterov
  0 siblings, 0 replies; 6+ messages in thread
From: Oleg Nesterov @ 2015-10-28 18:06 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Denys Vlasenko, Denys Vlasenko, Andrew Morton, Dmitry Vyukov,
	Alexander Potapenko, Eric Dumazet, Jan Kratochvil, Julien Tinnes,
	Kees Cook, Kostya Serebryany, Linus Torvalds,
	Michael Kerrisk (man-pages),
	Robert Swiecki, Roland McGrath, syzkaller,
	Linux Kernel Mailing List, gdb

On 10/28, Pedro Alves wrote:
>
> On 10/28/2015 04:11 PM, Oleg Nesterov wrote:
> > On 10/26, Pedro Alves wrote:
> >>
> >> (Also, in the original test case, if the child gets/raises a signal or execs
> >> before exiting, the bash/init/whatever process won't be issuing PTRACE_CONT,
> >> and the child will thus end up stuck (though should be SIGKILLable,
> >
> > Oh, but if it is killable everything is fine. How does this differ from the
> > case when, say, you jusr reparent to init and do kill(getpid(), SIGSTOP) ?
>
> The difference is that if the child called PTRACE_TRACEME, then it goes
> to ptrace-stop instead and no amount of SIGCONT unstucks it -- the only way
> out is force killing.  I agree it's not a major issue as there's a way out
> (and thus made it a parens), but I wouldn't call it nice either.

IOW, the difference is that it is TASK_TRACED, not TASK_STOPPED. I agree,
this is not nice. But this is not nice simply because PTRACE_TRACEME is
not nice.

> >> All this because PTRACE_TRACEME is broken by design
> >
> > Heh. I agree. But we can't fix it now.
>
> Perhaps the man page could document it as deprecated, suggesting
> PTRACE_ATTACH/PTRACE_SEIZE instead?

I don't know. but I won't mind if you mark PTRACE_ATTACH as deprecated
too ;) PTRACE_SEIZE can be used instead and it doesn't abuse SIGSTOP.

Oleg.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-10-28 18:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20151020171740.GA29290@redhat.com>
2015-10-22 14:40 ` [PATCH 0/2] wait/ptrace: always assume __WALL if the child is traced Pedro Alves
2015-10-25 14:46   ` Oleg Nesterov
     [not found] ` <20151020171754.GA29304@redhat.com>
     [not found]   ` <20151020153155.e03f4219da4014efe6f810b0@linux-foundation.org>
     [not found]     ` <5627EE9E.8040600@redhat.com>
     [not found]       ` <5627F607.4050506@redhat.com>
     [not found]         ` <20151021214703.GA1810@redhat.com>
     [not found]           ` <CAK1hOcP027FAWg5uVDjiFDC+0=dGu5JFJcy9Jij4dY8SBnT4MA@mail.gmail.com>
     [not found]             ` <20151025155440.GB2043@redhat.com>
2015-10-26 12:09               ` [PATCH 1/2] " Pedro Alves
2015-10-28 15:15                 ` Oleg Nesterov
2015-10-28 15:43                   ` Pedro Alves
2015-10-28 18:06                     ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).