public inbox for archer@sourceware.org
 help / color / mirror / Atom feed
* ptrace improvement ideas
@ 2011-02-03 22:39 Roland McGrath
  2011-02-04 18:56 ` Oleg Nesterov
                   ` (3 more replies)
  0 siblings, 4 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-03 22:39 UTC (permalink / raw)
  To: Project Archer; +Cc: Oleg Nesterov

I've been considering ideas for incremental improvements to the Linux
ptrace interface to make life better for debuggers.  This is far less
ambitious than big ideas about replacing ptrace with a good interface.
But the focus is on what is practical to get accepted by the kernel
community without months or years of wrangling and delay.

I'm trying to concentrate on things that both are of immediate help to
GDB, and are fairly straightforward and noninvasive in the kernel
implementation.  Some things that sound simple in the abstract from
the userland point of view are in fact substantially difficult to
implement in the kernel given the current structure of things.  So
this will not be one giant win to solve all your pain points.  It will
be a series of small, incremental improvements that address
significant pain points while also being moderately low-hanging fruit
in the kernel implementation.

I don't want to propose any specific changes to the kernel community
off the cuff.  We need to work out details with GDB hackers so that we
have things that really improve life in GDB, and that GDB hackers will
really make use of quite soon.  It's worse than nothing to add or
change things in the kernel interfaces before GDB folks are ready to
really work on using them, so that we can be quite sure that the
details and corner cases are thoroughly-specified and done so in ways
that really work well for GDB in practice.

We're open to your requests, of course.  But we need to keep a tight focus
on the things that we can implement fairly quickly and simply in the
kernel as it stands today, that make a substantial difference in the
correctness, performance, or ease of maintenance of GDB for real-world
debugging cases, and that GDB will actually make use of quite soon.

I have a few ideas to start with.  Some of these are quite simple to
implement in the kernel and thus we can expect to get them in without
controversy.  Others require more investigation by me and Oleg to be sure
we can really do them well in the kernel without making too many waves.
Everything will have to be done in an incremental fashion.  That means we
will have to get the easy changes done first and have GDB really using
them and getting observable benefits, before we can propose another round
of changes in the kernel.  If something is a nice idea, and even seems
simple to do from the kernel perspective, but GDB is not really going to
start using it right away, then we won't do it.  Ideally we will have
proven our draft interfaces and their implementations with GDB work and
gotten comfortable with them in all their corners, before we try to submit
the changes to the kernel.

So, here are my first few ideas.

* PTRACE_ATTACH_NOSTOP

This is a new request that differs from PTRACE_ATTACH in two ways.

First, it does not generate a SIGSTOP.  On return from ptrace, the
tracee thread is attached for ptrace, but may be running uninterrupted
or may be stopped, or however it was.  If traceable events happen
right afterward, then it may be in a ptrace stop by the time you look.

Second, it uses the other arguments to the ptrace call.  One of these
is reserved for future use (unless you have an idea), meaning the call
with fail with EINVAL if it's nonzero.  The other is a set of flags as
now used with PTRACE_SETOPTIONS.  The options will be set atomically
with the attach.  So there is no window during which you are attached
but the event-reporting behavior is not yet configured exactly as you
want it.

* PTRACE_O_INHERIT

This is a new option bit for PTRACE_SETOPTIONS or the options argument of
PTRACE_ATTACH_NOSTOP.  Its effect is that clones of the tracee inherit the
ptrace attachedness and option settings of their parent.  This applies to
all kinds of clones, which in userland are known as thread creations,
forks, and vforks.  This has no other effects, meaning it does not cause
either the parent or the child to stop for any event.  There's no point in
using this along with all of PTRACE_O_TRACECLONE, PTRACE_O_TRACEFORK, and
PTRACE_O_TRACEVFORK, because those already imply the inheritance behavior.
The point of PTRACE_O_INHERIT would be to attach newly-created threads and
children without causing an event stop and the attendant overhead.

This means that you would have no notice that the new thread was your
tracee until you got some event report for it.  This being the case, it
appears as a spontaneous wait result for a PID you hadn't heard of before.
To help keep track of what that's about, the siginfo_t for SIGCHLD would
be extended with a new field si_tgid.  To get this information reliably,
the debugger needs to use the waitid call instead of waitpid/wait4.  Thus,
for a new thread (i.e. CLONE_THREAD clone), you would see a new PID you
didn't know about, and the siginfo_t from that waitid would show the CLD_*
status as normal, with si_pid being the individual thread's ID and si_tgid
being the ID of the thread-group (PID in userland terms).

Because of this spontaneous report aspect, it could be difficult to figure
out what's going on with any new thread that is a fork/vfork, or other use
of clone (oddball applications, or old linuxthreads), rather than a
CLONE_THREAD case (NPTL pthread_create).  In those cases, si_tgid and
si_pid are the same and neither matches any process you already know you
are tracing.  In general, it can be impossible to figure out whose child
this is, because its parent could exit so its ppid (as seen in
/proc/pid/status et al) becomes 1.  So perhaps it would be better to have
this be just PTRACE_O_THREAD_INHERIT, where it only applies to CLONE_THREAD
clones.

* PTRACE_O_NO_ZOMBIE_THREAD

This is another new option.  It applies to the behavior on the death of a
CLONE_THREAD clone (i.e. an NPTL pthread_create thread).  The thread would
not become a zombie and not cause any wait result.  Instead, it would just
die and disappear silently, as they do when not traced.  If you want to
notice individual threads dying, you can already use PTRACE_O_TRACEEXIT
for that instead.  

Another subtle issue is when the initial thread (the one whose thread ID
matches the tgid) exits while other threads are live.  It's already the
case (I'm pretty sure, anyway) that you don't get a wait result when this
happens.  That's because that wait result is reserved for when the whole
group exits, i.e. the entire process is dead and there are no threads left
in it at all.  That would remain so with this option, but now it would be
the obvious and consistent thing to see rather than being a subtle
difference between this particular thread dying and any other thread dying.

One nonobvious issue here is that of PID reuse.  It's unlikely, but
possible, that the individual thread ID of a dead thread is reused for a
different new thread.  With this option, there would be no notification to
the debugger that the old thread using this ID has died.  If you are also
using PTRACE_O_INHERIT, and the same thread ID is reused for a new thread
in the same process (same tgid), then all you would see is some new event
for a thread ID you already knew about.  It would appear that the same
thread remained alive and something new happened, when in fact what
happened was that the old thread died and a new thread came along and
happened to get the same ID.  So without other new feature aspects, you
would have to assume this could happen, and be sure not to be confused by
it or to tell the user something misleading.

* PTRACE_INTERRUPT

This is a new request, with an attendant new PTRACE_EVENT_* type.
I have not thought out all the details of this yet.  I think it is
viable to implement it without too much trouble, but it is certainly
more involved than the first three ideas above.

This request asks to make a given tracee thread stop and give a
PTRACE_EVENT_INTERRUPT wait result.  Unlike other ptrace requests, you can
make this request on a tracee that is not already stopped.  It is similar
to sending a signal with tkill, but it does not interfere with any real
signals, is not affected by the blocked signal mask, etc.

If the tracee is already stopped for a ptrace stop, then this would return
EALREADY.  If the tracee is already stopped for job control, then it would
morph that into a ptrace stop (so that SIGCONT cannot resume it), and
likewise return EALREADY.

One major use for this would be to clean up the cold-attach procedure to
avoid the races and bad side effects it has now.  First, the debugger
would use PTRACE_ATTACH_NOSTOP to establish tracing but not perturb the
thread at all.  As soon as that returns success, the debugger can use
PTRACE_INTERRUPT on it.  This will yield EALREADY if the tracee is already
stopped, telling you that you can safely inspect it with other ptrace
requests (or read /proc/pid/mem, or whatever).  If it instead returns 0,
that means that the tracee will stop soon, telling you that you can safely
do a blocking wait* call on it and not worry about any races or long blocks.


These are a few ideas to get the discussion started.  We will need to hash
everything out in detail before we commit to feature proposals for the
kernel.  These are certainly not the only things we can do with the
constraints I described above.  But these are examples of some things that
are fairly easy to do in the kernel.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas
  2011-02-03 22:39 ptrace improvement ideas Roland McGrath
@ 2011-02-04 18:56 ` Oleg Nesterov
  2011-02-04 18:58   ` Roland McGrath
  2011-02-07 21:11 ` Jan Kratochvil
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-04 18:56 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer

On 02/03, Roland McGrath wrote:
>
> I don't want to propose any specific changes to the kernel community
> off the cuff.  We need to work out details with GDB hackers so that we
> have things that really improve life in GDB, and that GDB hackers will
> really make use of quite soon.  It's worse than nothing to add or
> change things in the kernel interfaces before GDB folks are ready to
> really work on using them, so that we can be quite sure that the
> details and corner cases are thoroughly-specified and done so in ways
> that really work well for GDB in practice.

Yes.

Right now I have nothing to add ;) I'll try to think more, but every
proposal looks as "obviously makes sense" to me. However, I do not
know what is useful for GDB.

Only one note,

> * PTRACE_ATTACH_NOSTOP

How about PTRACE_DETACH_NOSTOP? Like PTRACE_DETACH, but doesn't
require the stopped tracee.

I was really amazed when I looked into strace sources. The code
which handles detach is soooooo complicated (and I do not blame
strace). But, again, I do not know if this makes sense for GDB.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas
  2011-02-04 18:56 ` Oleg Nesterov
@ 2011-02-04 18:58   ` Roland McGrath
  0 siblings, 0 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-04 18:58 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Project Archer

> > * PTRACE_ATTACH_NOSTOP
> 
> How about PTRACE_DETACH_NOSTOP? Like PTRACE_DETACH, but doesn't
> require the stopped tracee.

Yes, that is a good idea too.

> I was really amazed when I looked into strace sources. The code
> which handles detach is soooooo complicated (and I do not blame
> strace). But, again, I do not know if this makes sense for GDB.

It is dismal, I know--I wrote it.  But that was very long ago, before a
whole lot of kernel fixes in that area.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas
  2011-02-03 22:39 ptrace improvement ideas Roland McGrath
  2011-02-04 18:56 ` Oleg Nesterov
@ 2011-02-07 21:11 ` Jan Kratochvil
  2011-02-08  1:58   ` Roland McGrath
  2011-02-10 20:00 ` ptrace improvement: PTRACE_O_INHERIT Oleg Nesterov
  2011-02-16 11:31 ` ptrace improvement ideas (QPassSignals) Jan Kratochvil
  3 siblings, 1 reply; 57+ messages in thread
From: Jan Kratochvil @ 2011-02-07 21:11 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer, Oleg Nesterov

Hi Roland,

the majority of the overhead should be solved by PTRACE_O_INHERIT, an idea for
a next optimization is to replace reading+parsing of "/proc/PID/xxx" text
files by a single syscall for that small binary value such as for:

	linux_proc_pending_signals (/proc/PID/status) -> now 2x sigset_t
	-> PTRACE_IS_SIGNUM_PENDING or PTRACE_GET_PENDING_SIGSET etc.
	(GDB is only interested in SIGINT from that sigset_t now.)

	linux_nat_core_of_thread_1 (/proc/%d/task/%ld/stat)
	-> PTRACE_GET_CPUCORE -> long return value as the CPU #


Also there could be PTRACE_SET_TGID_DEBUGREG to set debug registers for all
the TIDs of a PID (=TGID), even for those that are not stopped.

gdbserver now has to stop+resume each TID to change the hw watchpoints:
http://sourceware.org/bugzilla/show_bug.cgi?id=10729
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gdb/gdbserver/linux-x86-low.c.diff?cvsroot=src&r1=1.20&r2=1.21

(issue 1) But what if some thread hits a hw watchpoint?  Changing the debug
registers on x86* will make the result no longer readable.

(issue 2) Also it must set atomically at least both the DR_CONTROL and one of
DR addresses, best to set atomically all the 6 writable DR registers.  But not
to set regular registers.  PTRACE_POKEUSER writes a single register and other
block writes cannot write to DR registers.

There is already AFAIK some abstraction of DR regiters inside kernel so maybe
userland could get access to this abstraction to resolve these two issues.



On Thu, 03 Feb 2011 23:39:05 +0100, Roland McGrath wrote:
[...]
> * PTRACE_O_INHERIT
[...]
> Its effect is that clones of the tracee inherit the
> ptrace attachedness and option settings of their parent.

It must explicitly require debug registers (hw watchpoints) inheritance.
Which happened before but it no longer happens in recent upstream kernels
(NOTABUG RH#660003).


[...]
> To get this information reliably,
> the debugger needs to use the waitid call instead of waitpid/wait4.

[nitpick] or PTRACE_GETSIGINFO after waitpid, as GDB does.



Thanks,
Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas
  2011-02-07 21:11 ` Jan Kratochvil
@ 2011-02-08  1:58   ` Roland McGrath
  2011-02-08 20:41     ` Jan Kratochvil
  2011-02-08 21:07     ` Oleg Nesterov
  0 siblings, 2 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-08  1:58 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Project Archer, Oleg Nesterov

> the majority of the overhead should be solved by PTRACE_O_INHERIT,

Ok, good.  That leaves the questions of all the possible corner issues that
using it would entail.  So let's try to think of all those.

> an idea for
> a next optimization is to replace reading+parsing of "/proc/PID/xxx" text
> files by a single syscall for that small binary value such as for:
> 
> 	linux_proc_pending_signals (/proc/PID/status) -> now 2x sigset_t
> 	-> PTRACE_IS_SIGNUM_PENDING or PTRACE_GET_PENDING_SIGSET etc.
> 	(GDB is only interested in SIGINT from that sigset_t now.)

I'm surprised that GDB wants to know such a thing.  It makes me suspect
there is a deeper problem to which looking that up is a workaround.
Can you explain?

> 	linux_nat_core_of_thread_1 (/proc/%d/task/%ld/stat)
> 	-> PTRACE_GET_CPUCORE -> long return value as the CPU #

I had no idea GDB cared about such a thing at all.  Why does it?

> Also there could be PTRACE_SET_TGID_DEBUGREG to set debug registers for all
> the TIDs of a PID (=TGID), even for those that are not stopped.

IMHO the legacy style of directly setting the (now virtual anyway) hardware
registers for debugging features should just die, and not be given any new
crutches.

> There is already AFAIK some abstraction of DR regiters inside kernel so maybe
> userland could get access to this abstraction to resolve these two issues.

Indeed, there is now a layer called hw_breakpoint, with in-kernel APIs that
are largely machine-independent.  The legacy arch-specific ptrace interface
for x86 is implemented on top of that (not purely so, but rather supported
by that infrastructure as special cases).  I think Oleg knows the details
of that stuff better than I do at this point.

I think the desireable approach is to figure out a new interface (ptrace
extension, presumably) to use those new facilities directly from user
space.  It should be possible for such a new interface to be largely
machine-independent too.  Perhaps Oleg can make some suggestions for this.

> > * PTRACE_O_INHERIT
> [...]
> > Its effect is that clones of the tracee inherit the
> > ptrace attachedness and option settings of their parent.
> 
> It must explicitly require debug registers (hw watchpoints) inheritance.
> Which happened before but it no longer happens in recent upstream kernels
> (NOTABUG RH#660003).

That is a good point and one I had not been thinking of, though now that
you remind me, I was aware of the issue before.  It's my feeling that the
right way to approach this is to focus on a new set of interfaces built on
the kernel's hw_breakpoint facility (that infrastructure may well need
extending to deal well with userland better).  Those can be defined with
the inheritance and process-wide sharing issues in mind, so that if we then
add PTRACE_O_INHERIT they would mesh well to serve GDB's needs nicely.  I
think that trying to define PTRACE_O_INHERIT first in a way that has new
specific semantics interacting with fuzzily-defined arch-dependent issues
like the x86's legacy debug register behavior would be a bad route.

> > To get this information reliably,
> > the debugger needs to use the waitid call instead of waitpid/wait4.
> 
> [nitpick] or PTRACE_GETSIGINFO after waitpid, as GDB does.

This is a misunderstanding.  PTRACE_GETSIGINFO relates to the information
about a signal being delivered to a tracee.  What I'm talking about is the
information that goes with the SIGCHLD that gdb itself receives when a
tracee event/stop occurs.  This has nothing to do with any tracee's signal
details.  It has to do with how the kernel reports the tracee event to gdb.
Since multiple SIGCHLD signals do not queue with independent siginfo_t
(only SIGRTMIN+n signals do), using waitid is the only reliable way to get
that information.  It is the same information that arrives with the SIGCHLD
associated with a tracee event, but it is detail about the wait result you
are getting, not detail about the tracee status.

The point of using waitid is that you can see the new si_tgid field, and
hence receive both the thread-specific ID and the process-wide TGID for a
tracee that you haven't seen before.  Otherwise, PTRACE_O_INHERIT results
in spontaneous reports for new IDs that you know nothing about and have no
way to associate with where they came from.  The si_tgid idea addresses the
problem for the case of new threads (standard NPTL threads, that
is--CLONE_THREAD thread creations in the kernel's terms).  It doesn't help
at all for other kinds of creations, such as a normal fork or vfork.  That
is why I suggested it might actually not be desireable to have
PTRACE_O_INHERIT apply to all new creations, but instead make it limited to
CLONE_THREAD creations.  I'm interested in your thoughts on the issue of
how GDB deals with the first report of an ID it hasn't seen before.  With
current kernels, the only such situation that's possible is the brief race
between a new PTRACE_O_TRACE{CLONE,FORK,VFORK} child reporting its first
stop, and its parent reporting the PTRACE_EVENT_{CLONE,FORK,VFORK} stop for
that child's creation (at which time PTRACE_GETEVENTMSG tells you the
association between that parent's creation attempt and the new child).


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas
  2011-02-08  1:58   ` Roland McGrath
@ 2011-02-08 20:41     ` Jan Kratochvil
  2011-02-09  2:48       ` Roland McGrath
  2011-02-08 21:07     ` Oleg Nesterov
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Kratochvil @ 2011-02-08 20:41 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer, Oleg Nesterov

On Tue, 08 Feb 2011 02:58:44 +0100, Roland McGrath wrote:
> > 	linux_proc_pending_signals (/proc/PID/status) -> now 2x sigset_t
> > 	-> PTRACE_IS_SIGNUM_PENDING or PTRACE_GET_PENDING_SIGSET etc.
> > 	(GDB is only interested in SIGINT from that sigset_t now.)
> 
> I'm surprised that GDB wants to know such a thing.  It makes me suspect
> there is a deeper problem to which looking that up is a workaround.
> Can you explain?

-> linux_nat_has_pending_sigint -> maybe_clear_ignore_sigint

http://sourceware.org/ml/gdb-patches/2005-04/msg00041.html

without it single CTRL-C gets reported for each process in a process group:

echo 'main(){ fork(); pause(); }' | gcc -x c -; patched-gdb -nx ./a.out 
(gdb) set detach-on-fork off
(gdb) set target-async on
(gdb) set non-stop on
(gdb) run
Starting program: .../a.out 
[New process 19546]
^C <----------------------------------------------------
Program received signal SIGINT, Interrupt.
0x00000032a1cadb40 in __pause_nocancel () at ../sysdeps/unix/syscall-template.S:82
[...]
Program received signal SIGINT, Interrupt.
0x00000032a1cadb40 in __pause_nocancel () at ../sysdeps/unix/syscall-template.S:82

(Not reproducible with NPTL threads.)


> > 	linux_nat_core_of_thread_1 (/proc/%d/task/%ld/stat)
> > 	-> PTRACE_GET_CPUCORE -> long return value as the CPU #
> 
> I had no idea GDB cared about such a thing at all.  Why does it?

It may be generally useless on normal platforms but Ericsson exchanges use
some CPU-affinities of tasks and Eclipse displays it for that purpose.
I failed to display the CPU core # now in Eclipse.

The MI protocol has to provide it as FE (Front End) does not explicitly ask
for it.


> > [nitpick] or PTRACE_GETSIGINFO after waitpid, as GDB does.
> 
> This is a misunderstanding.

OK, sorry, got it now.


> The point of using waitid is that you can see the new si_tgid field, and
> hence receive both the thread-specific ID and the process-wide TGID for a
> tracee that you haven't seen before.

GDB has linux_proc_get_tgid :-) but only to workaround nptl/5983 fixed by you,
again reading "/proc/%d/status", for "Tgid".


> Otherwise, PTRACE_O_INHERIT results
> in spontaneous reports for new IDs that you know nothing about and have no
> way to associate with where they came from.  The si_tgid idea addresses the
> problem for the case of new threads (standard NPTL threads, that
> is--CLONE_THREAD thread creations in the kernel's terms).  It doesn't help
> at all for other kinds of creations, such as a normal fork or vfork.


> That is why I suggested it might actually not be desireable to have
> PTRACE_O_INHERIT apply to all new creations, but instead make it limited to
> CLONE_THREAD creations.

PTRACE_O_INHERIT cannot be used for fork/vfork with `set detach-on-fork on'
(the default mode, so that it is not multi-inferior) as GDB needs to remove
software breakpoints.  But the goal is multi-inferior `set detach-on-fork off'
anyway so in that mode PTRACE_O_INHERIT could apply even for fork/vfork.


> I'm interested in your thoughts on the issue of
> how GDB deals with the first report of an ID it hasn't seen before.  With
> current kernels, the only such situation that's possible is the brief race
> between a new PTRACE_O_TRACE{CLONE,FORK,VFORK} child reporting its first
> stop, and its parent reporting the PTRACE_EVENT_{CLONE,FORK,VFORK} stop for
> that child's creation (at which time PTRACE_GETEVENTMSG tells you the
> association between that parent's creation attempt and the new child).

`stopped_pids' tracks tasks which got reported as stopped before the parent's
PTRACE_EVENT_{CLONE,FORK,VFORK} has been seen.


With multi-inferior `set detach-on-fork off', with no
PTRACE_O_TRACE{CLONE,FORK,VFORK} and using PTRACE_O_INHERIT it could apply for
both clones and forks/vforks, couldn't it?  There is no longer any need to
track `stopped_pids'.


PTRACE_O_TRACEEXEC still needs to stop the inferior as GDB needs to reinsert
all the software breakpoints to their newly computed addresses.


Thanks,
Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas
  2011-02-08  1:58   ` Roland McGrath
  2011-02-08 20:41     ` Jan Kratochvil
@ 2011-02-08 21:07     ` Oleg Nesterov
  2011-02-08 23:18       ` hw_breakpoint userland interface Roland McGrath
  1 sibling, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-08 21:07 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/07, Roland McGrath wrote:
>
> > There is already AFAIK some abstraction of DR regiters inside kernel so maybe
> > userland could get access to this abstraction to resolve these two issues.
>
> Indeed, there is now a layer called hw_breakpoint, with in-kernel APIs that
> are largely machine-independent.  The legacy arch-specific ptrace interface
> for x86 is implemented on top of that (not purely so, but rather supported
> by that infrastructure as special cases).  I think Oleg knows the details
> of that stuff better than I do at this point.

Yes, and hw_breakpoint is implemented on top of perf counters.

> I think the desireable approach is to figure out a new interface (ptrace
> extension, presumably) to use those new facilities directly from user
> space.  It should be possible for such a new interface to be largely
> machine-independent too.

Probably yes... but it is not clear to me how exactly the new interface
should look. And it should coexist with the current ->ptrace_bps[] code.

And while register_user_hw_breakpoint() itself is arch independent, there
are still some details. Say, if the tracee traps, the tracer should know
somehow which bp was the reason.

But I agree it would be nice to have the simple "abstract" interface.
Jan, if you have any idea about how it should look - please tell us.
I _think_ that the actual implementation shoudn't be very very difficult,
although I can't say I understand this code in details.

> > It must explicitly require debug registers (hw watchpoints) inheritance.
> > Which happened before but it no longer happens in recent upstream kernels
> > (NOTABUG RH#660003).
>
> That is a good point and one I had not been thinking of,

Oh, yes, I missed this too.

> It's my feeling that the
> right way to approach this is to focus on a new set of interfaces built on
> the kernel's hw_breakpoint facility (that infrastructure may well need
> extending to deal well with userland better).  Those can be defined with
> the inheritance and process-wide sharing issues in mind,

Hmm. This is also not clear to me...

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* hw_breakpoint userland interface
  2011-02-08 21:07     ` Oleg Nesterov
@ 2011-02-08 23:18       ` Roland McGrath
  2011-02-10 21:03         ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-08 23:18 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

There are indeed many things to be figured out about a userland interface
for hw_breakpoint.  It's certainly quite possible that something entirely
separate from ptrace is the right approach for that.  I haven't given it
any thought, really.  It should be worked out with Jan and other GDB folks
as well as whoever in the kernel sphere cares about hw_breakpoint stuff.
It might be a ptrace-based interface, or it might be a separate interface
that interacts with ptrace (perhaps just generating some new PTRACE_EVENT_*
type for its stops).

If the hw_breakpoint stuff is well-integrated with perf event stuff, then
perhaps a good approach involves a more general-purpose interlock between
ptrace and perf event stuff.  That is, a way for GDB to make use of all
that stuff in a somewhat clean fashion for its case, that being of
unprivileged all-userland control that ties into ptrace stops somehow.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas
  2011-02-08 20:41     ` Jan Kratochvil
@ 2011-02-09  2:48       ` Roland McGrath
  0 siblings, 0 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-09  2:48 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Project Archer, Oleg Nesterov

> -> linux_nat_has_pending_sigint -> maybe_clear_ignore_sigint
> 
> http://sourceware.org/ml/gdb-patches/2005-04/msg00041.html
> 
> without it single CTRL-C gets reported for each process in a process group:

That is a correct and true report of what actually happens.  When you hit
C-c, each process in the process group does get a SIGINT and can handle it
independently.  I can't see wanting anything different for the test case
you show (except perhaps more sophisticated 'handle' commands to say I
wanted something more automagic).

The old mailing list thread you cited talks about the LinuxThreads case.
That's a case where the user perception of the program is that it's one
process with multiple threads, but the implementation has the kernel
semantics of many processes that just share the memory address space and
not much else.  That is quite a different situation than several normal
processes in a process group, as we encounter today (where the only kind of
multithreadedness is NPTL).

> > The point of using waitid is that you can see the new si_tgid field, and
> > hence receive both the thread-specific ID and the process-wide TGID for a
> > tracee that you haven't seen before.
> 
> GDB has linux_proc_get_tgid :-) but only to workaround nptl/5983 fixed by you,
> again reading "/proc/%d/status", for "Tgid".

Sure, there is old stuff.  The point of the si_tgid/waitid hack is to
relieve GDB of the need to do that ungainly open+read+parse work when just
dealing with the "new normal" of low-overhead multithread tracing.  What
you need to tell us is whether it is worthwhile to GDB that we add such
features or not.  If it's not a problem for GDB to use the crufty old /proc
mechanism when faced with the wait result of an unknown ID, then we won't
bother with the si_tgid changes in the kernel.  I had been thinking it
would be another scaling constraint and/or unpleasant corner case to have
to deal with that.  Perhaps it's really not a problem, since you only have
to do that work when a thread actually stops and thus everything is slowing
down anyway.  You tell us.

> PTRACE_O_INHERIT cannot be used for fork/vfork with `set detach-on-fork on'
> (the default mode, so that it is not multi-inferior) as GDB needs to remove
> software breakpoints.  But the goal is multi-inferior `set detach-on-fork off'
> anyway so in that mode PTRACE_O_INHERIT could apply even for fork/vfork.

Ok.  So that sounds like PTRACE_O_INHERIT as I described (applying to all
kinds of clones) is indeed the useful way to define the feature.

> `stopped_pids' tracks tasks which got reported as stopped before the parent's
> PTRACE_EVENT_{CLONE,FORK,VFORK} has been seen.
> 
> With multi-inferior `set detach-on-fork off', with no
> PTRACE_O_TRACE{CLONE,FORK,VFORK} and using PTRACE_O_INHERIT it could apply for
> both clones and forks/vforks, couldn't it?  There is no longer any need to
> track `stopped_pids'.

This seems to be a question about gdb internals and I can't answer it for
you.  In the status quo, the only case where you can see an unexpected
tracee is the race between parent and new child reporting, when the child
reports first.  There you can immediately do another blocking wait and be
sure that you will see the parent report soon and can make the correlation.

With the proposed new situation, you could see an unexpected new tracee
that is a fork/vfork (or any non-CLONE_THREAD clone) and have no reliable
way whatsoever to know where that new child came from.  You can attempt to
look up its ppid in /proc/pid/status and correlate that with some other
tracee, but it's possible that its ppid is already 1 when the parent has
died, so you won't know.  (It's also possible that its ppid is another
tracee you don't yet know about, so you'd have to iterate on the /proc
lookup to follow the lineage back to one you know.)

The question is, do you care?  I am guessing that you have to care, because
you need to know what known tracee was the proximate ancestor to know what
breakpoints are already inserted in the new child.  But it's for you to
tell us.

> PTRACE_O_TRACEEXEC still needs to stop the inferior as GDB needs to reinsert
> all the software breakpoints to their newly computed addresses.

Indeed.  The PTRACE_O_INHERIT idea only attempts to relieve the need to use
PTRACE_O_TRACE{CLONE,FORK,VFORK} when you don't want those events to be
reported with stops.  It has no bearing on other kinds of event stops you
do want.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-03 22:39 ptrace improvement ideas Roland McGrath
  2011-02-04 18:56 ` Oleg Nesterov
  2011-02-07 21:11 ` Jan Kratochvil
@ 2011-02-10 20:00 ` Oleg Nesterov
  2011-02-11 19:24   ` Roland McGrath
  2011-02-16 11:31 ` ptrace improvement ideas (QPassSignals) Jan Kratochvil
  3 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-10 20:00 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer

On 02/03, Roland McGrath wrote:
>
> * PTRACE_O_INHERIT
>
> This is a new option bit for PTRACE_SETOPTIONS or the options argument of
> PTRACE_ATTACH_NOSTOP.  Its effect is that clones of the tracee inherit the
> ptrace attachedness and option settings of their parent.

Jan has already reminded about about debug registers.

What else? Say, should TIF_SYSCALL_TRACE be copied? Suppose that we
try adapt strace to this new option.

> This has no other effects, meaning it does not cause
> either the parent or the child to stop for any event.

Somehow I am starting to feel uncomfortable. The debugger traces the
thread and doesn't even know about this fact. And can't know until the
first report. But once again, if this is fine for gdb - please ignore.

> The point of PTRACE_O_INHERIT would be to attach newly-created threads and
> children without causing an event stop and the attendant overhead.

OK, PTRACE_EVENT_FORK stops both, parent and child. But we can implement
PTRACE_O_INHERIT so that only the new child/thread stops. Or this doesn't
help enough?


Or. Suppose that clone() under PTRACE_O_INHERIT notifies the tracer
(sends SIGCHLD), and the new tracee gets the new PTRACE_O_INHERITed
mark. Then we can implement wait(W_WHO_WAS_CLONNED) which clears
PTRACE_O_INHERITed and reports the new tracee (just in case, this
doesn't need the stopped tracee).

Not sure this makes any sense, but how "info treads" can work otherwise?

> Because of this spontaneous report aspect, it could be difficult to figure
> out what's going on with any new thread that is a fork/vfork, or other use
> of clone (oddball applications, or old linuxthreads), rather than a
> CLONE_THREAD case (NPTL pthread_create).  In those cases, si_tgid and
> si_pid are the same and neither matches any process you already know you
> are tracing.  In general, it can be impossible to figure out whose child
> this is, because its parent could exit so its ppid (as seen in
> /proc/pid/status et al) becomes 1.

Not sure I really understand why this is the problem by itself. Yes, if
its ppid == 1, we know that the original parent was traced by us and then
it has gone.

Suppose that we (to simplify the discussion) introduce task->original_ppid,
then we can "solve" the problem and report this pid to gdb. But given that
it has already died, how this can help?

Nevermind,

> So perhaps it would be better to have
> this be just PTRACE_O_THREAD_INHERIT, where it only applies to CLONE_THREAD
> clones.

Or we can have both.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: hw_breakpoint userland interface
  2011-02-08 23:18       ` hw_breakpoint userland interface Roland McGrath
@ 2011-02-10 21:03         ` Oleg Nesterov
  2011-02-10 21:14           ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-10 21:03 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/08, Roland McGrath wrote:
>
> There are indeed many things to be figured out about a userland interface
> for hw_breakpoint.  It's certainly quite possible that something entirely
> separate from ptrace is the right approach for that.  I haven't given it
> any thought, really.

The same. And yes, perhaps it should be more perf_event oriented. Afaics, it
is already possible to create the bp via sys_perf_open(PERF_TYPE_BREAKPOINT),
but I didn't verify this.

> It should be worked out with Jan and other GDB folks

Yes. Firstly we need to know what would be more convenient for those
who will actually use the new interface.


But there is one thing which is even less clear to me. From you
previous email,

> > Those can be defined with
> > the inheritance and process-wide sharing issues in mind,

(to remind, you were talking about watchpoints/PTRACE_O_INHERIT problems).

Do you have any plans to introduce something like per-process
breakpoints, or (apart from inheriting) everything continues to
be per-thread?

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: hw_breakpoint userland interface
  2011-02-10 21:03         ` Oleg Nesterov
@ 2011-02-10 21:14           ` Roland McGrath
  2011-02-11 20:17             ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-10 21:14 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> Yes. Firstly we need to know what would be more convenient for those
> who will actually use the new interface.

Indeed.

> Do you have any plans to introduce something like per-process
> breakpoints, or (apart from inheriting) everything continues to
> be per-thread?

I think we should look at doing what makes most sense for the users,
i.e. GDB.  It's my impression that what is really desired is both options.

In fact, I'm not altogether sure that anyone wants per-thread watchpoints
at all, that being data watchpoints.  When the issue is finding how some
memory gets touched, then it doesn't make much sense to confine it to a
particular thread--you want to know when the memory changed, whatever
thread did it.

If the hw_breakpoint interfaces also support (or will support) the
hardware-assisted code breakpoints, then there it is probably desired to
have thread-specific breakpoint support.  But it's not clear to me that
serious userland users like GDB would really make use of any hardware-based
code breakpoints any time soon.  They are so limited that you always need
to fall back to traditional text-inserted breakpoints and I'm not sure it's
deemed worthwhile to put effort into hardware-based code breakpoints for
the small number of cases they can address.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-10 20:00 ` ptrace improvement: PTRACE_O_INHERIT Oleg Nesterov
@ 2011-02-11 19:24   ` Roland McGrath
  2011-02-11 20:46     ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-11 19:24 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Project Archer

> What else? Say, should TIF_SYSCALL_TRACE be copied?
> Suppose that we try adapt strace to this new option.

Let's talk about the userland interface features here, rather than
implementation details.  The only interface today is PTRACE_SYSCALL, which
is not an ongoing state but only a one-shot action.  I don't think it makes
sense to talk about "inheriting" the fact that PTRACE_SYSCALL is in flight.

For strace as it is, I don't think there is really any benefit to using
PTRACE_O_INHERIT.  An strace tracee is going to stop for every syscall
anyway, so the addition child-startup stop and clone stops are not really a
significantly greater overhead than what it's already always going to have.

Another feature that has made obvious sense for a long time is to have
PTRACE_O_TRACESYSCALL instead of just PTRACE_SYSCALL (or perhaps separate
PTRACE_O_TRACE_SYSCALL_{ENTRY,EXIT} flags).  That would be a permanent
state flag like the other PTRACE_O_TRACE* flags, meaning that syscall
entry/exit events would cause event stops regardless of which flavor of
PTRACE_CONT/SINGLESTEP/etc was last used to resume the thread.  (For any
such new feature it would also make sense to have those use a clean
PTRACE_EVENT_SYSCALL_{ENTRY,EXIT} stop code instead of the silly
TRACESYSGOOD magic format.)  

If we had that, then it would follow naturally that these option flags
would be inherited like all the others.  But AFAIK there has not really
been any practical demand for this cleanup from ptrace users.  It is
straightforward enough to add, and a nice cleanup IMHO.  So perhaps we
should consider doing this anyway.  I can imagine that gdb might like to
use it, anyway.

> > This has no other effects, meaning it does not cause
> > either the parent or the child to stop for any event.
> 
> Somehow I am starting to feel uncomfortable. The debugger traces the
> thread and doesn't even know about this fact. And can't know until the
> first report. But once again, if this is fine for gdb - please ignore.

That's why I'm asking Jan et al to direct us more about these details.

> > The point of PTRACE_O_INHERIT would be to attach newly-created threads and
> > children without causing an event stop and the attendant overhead.
> 
> OK, PTRACE_EVENT_FORK stops both, parent and child. But we can implement
> PTRACE_O_INHERIT so that only the new child/thread stops. Or this doesn't
> help enough?

My impression of what Jan asked for was "no slowdown" on thread creations
without other interesting events.  For normal debugging scenarios at the
macro level, it's really not an interesting event that a new thread was
created, only when a breakpoint/crash/etc happens.

> Or. Suppose that clone() under PTRACE_O_INHERIT notifies the tracer
> (sends SIGCHLD), and the new tracee gets the new PTRACE_O_INHERITed
> mark. Then we can implement wait(W_WHO_WAS_CLONNED) which clears
> PTRACE_O_INHERITed and reports the new tracee (just in case, this
> doesn't need the stopped tracee).

I don't really follow this idea at all, sorry.

> Not sure this makes any sense, but how "info treads" can work otherwise?

It can always look as /proc/PID/task/ to enumerate threads.  (And there are
also the libthread_db means, though AIUI those are now deprecated.)

> Not sure I really understand why this is the problem by itself. Yes, if
> its ppid == 1, we know that the original parent was traced by us and then
> it has gone.

GDB needs to know which of its tracees was the parent (or grandparent or
great-grandparent, etc.) to know what set of text-insertion breakpoints it
has in its memory, inherited watchpoints its threads have, etc.  Otherwise
at a SIGTRAP on an unrecognized new task it has no way to line up what it's
talking about at all.

> Suppose that we (to simplify the discussion) introduce task->original_ppid,
> then we can "solve" the problem and report this pid to gdb. But given that
> it has already died, how this can help?

If it has already died but GDB knows exactly what process that was from the
PPID, then it knows what tracing state, breakpoints, etc. were inherited.

But you may need the full ancestry, not just the immediate parent.  If it's
already dead and GDB had never known about it, then GDB still doesn't know
anything.  However, if GDB is still getting all death reports, possibly
that could be enough for the ancestry to add up, though there are races and
corners there I think.

> > So perhaps it would be better to have
> > this be just PTRACE_O_THREAD_INHERIT, where it only applies to CLONE_THREAD
> > clones.
> 
> Or we can have both.

Indeed.  But I want to add things that GDB is really going to use, not go
hog-wild on speculative features that might never be used or might wind up
having subtly different semantics than what is really worthwhile for GDB.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: hw_breakpoint userland interface
  2011-02-10 21:14           ` Roland McGrath
@ 2011-02-11 20:17             ` Oleg Nesterov
  0 siblings, 0 replies; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-11 20:17 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/10, Roland McGrath wrote:
>
> > Do you have any plans to introduce something like per-process
> > breakpoints, or (apart from inheriting) everything continues to
> > be per-thread?
>
> I think we should look at doing what makes most sense for the users,
> i.e. GDB.  It's my impression that what is really desired is both options.
>
> In fact, I'm not altogether sure that anyone wants per-thread watchpoints
> at all, that being data watchpoints.  When the issue is finding how some
> memory gets touched, then it doesn't make much sense to confine it to a
> particular thread--you want to know when the memory changed, whatever
> thread did it.

Yes, agreed. But then we have some implementation problems, perf_event's
are always per-thread (although they have the "group" notion).

clone() should always inherit bps, probably not too hard to implement.

There real problem is attach-the-new-bp-to-the-process, we don't have the
sleeping lock to process all threads and protect the race with clone().

And ptrace_detach/release_task, we can't destroy perf_event under tasklist.
(and the current task->ptrace_bps[] code is buggy and imho ugly).

Of course, I am not saying this is impossible, I am just trying to
understand the possible problems.


Now, let's forget about the implementation details. First of all, we
can't really make the per-process-bp API, bp shouldn't affect the threads
we do not ptrace. So, per-process should mean the threads we do ptrace.
In this case it is not clear what should represent the process. OK,
I think this is minor.

> If the hw_breakpoint interfaces also support (or will support) the
> hardware-assisted code breakpoints,

Looking at the code, I think it does... X86_BREAKPOINT_EXECUTE. But
again, I didn't verify.

> They are so limited that you always need
> to fall back to traditional text-inserted breakpoints

I forgot everything I learned when I reviewed uprobes patches. But iirc,
they will have perf_event interface as well.


To summarize: I still have no idea about how this API should look ;)

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-11 19:24   ` Roland McGrath
@ 2011-02-11 20:46     ` Oleg Nesterov
  2011-02-12  0:59       ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-11 20:46 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer

On 02/11, Roland McGrath wrote:
>
> > What else? Say, should TIF_SYSCALL_TRACE be copied?
> > Suppose that we try adapt strace to this new option.
>
> Let's talk about the userland interface features here,

Ah, I didn't mean implementation details, sorry.

> For strace as it is, I don't think there is really any benefit to using
> PTRACE_O_INHERIT.

OK.

> Another feature that has made obvious sense for a long time is to have
> PTRACE_O_TRACESYSCALL instead of just PTRACE_SYSCALL

Yes, and that is why I asked about TIF_SYSCALL_TRACE. Because to me
PTRACE_SYSCALL looks like a property/option regardless of implementation
details.

But I agree, this was a bit off-topic.

> > Or. Suppose that clone() under PTRACE_O_INHERIT notifies the tracer
> > (sends SIGCHLD), and the new tracee gets the new PTRACE_O_INHERITed
> > mark. Then we can implement wait(W_WHO_WAS_CLONNED) which clears
> > PTRACE_O_INHERITed and reports the new tracee (just in case, this
> > doesn't need the stopped tracee).
>
> I don't really follow this idea at all, sorry.

I meant, we can intoduce the new W*** flag for do_wait(). If the new
tracee was PTRACE_O_INHERIT'ed, do_wait() returns its pid.

> > Not sure this makes any sense, but how "info treads" can work otherwise?
>
> It can always look as /proc/PID/task/ to enumerate threads.

Well yes, but /proc/PID/task/ is not convenient and reliable.
Especially if we do not trace all threads.

But, again, if this is fine to gdb - forget.

> > Not sure I really understand why this is the problem by itself. Yes, if
> > its ppid == 1, we know that the original parent was traced by us and then
> > it has gone.
>
> [... snip ...]

Hmm... Still can't understans right now, but this doesn't matter:

> > > So perhaps it would be better to have
> > > this be just PTRACE_O_THREAD_INHERIT, where it only applies to CLONE_THREAD
> > > clones.
> >
> > Or we can have both.
>
> Indeed.  But I want to add things that GDB is really going to use, not go
> hog-wild on speculative features

Yes, yes, agreed.

Damn. I was so unclear. I tried to say: yes, I think PTRACE_O_THREAD_INHERIT
makes more sense. If nothing else, the tracer doesn't necessarily wants to
follow forks, this can create the unneeded overhead if it only wants to ptrace
the sub-threads.

IOW, I think that we should have both, or PTRACE_O_THREAD_INHERIT only.
Or, this option should take "clone_flags" as an argument.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-11 20:46     ` Oleg Nesterov
@ 2011-02-12  0:59       ` Roland McGrath
  2011-02-12 19:11         ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-12  0:59 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Project Archer

> Yes, and that is why I asked about TIF_SYSCALL_TRACE. Because to me
> PTRACE_SYSCALL looks like a property/option regardless of implementation
> details.

But it's only if you're looking at implementation details rather than the
userland interface that you'd think any such thing.

> > > Or. Suppose that clone() under PTRACE_O_INHERIT notifies the tracer
> > > (sends SIGCHLD), and the new tracee gets the new PTRACE_O_INHERITed
> > > mark. Then we can implement wait(W_WHO_WAS_CLONNED) which clears
> > > PTRACE_O_INHERITed and reports the new tracee (just in case, this
> > > doesn't need the stopped tracee).
> >
> > I don't really follow this idea at all, sorry.
> 
> I meant, we can intoduce the new W*** flag for do_wait(). If the new
> tracee was PTRACE_O_INHERIT'ed, do_wait() returns its pid.

I still don't understand the proposal.

> Well yes, but /proc/PID/task/ is not convenient and reliable.
> Especially if we do not trace all threads.

Tracing some threads but not all is really an artifact of the ptrace
interface and not something that any real userland debugger-like thing
ever wants to do.

> Damn. I was so unclear. I tried to say: yes, I think PTRACE_O_THREAD_INHERIT
> makes more sense. If nothing else, the tracer doesn't necessarily wants to
> follow forks, this can create the unneeded overhead if it only wants to ptrace
> the sub-threads.
> 
> IOW, I think that we should have both, or PTRACE_O_THREAD_INHERIT only.

That sounds reasonable (which is why I suggested it in the first place ;-).
But, again, we want to see what GDB really wants to use and only add that.

> Or, this option should take "clone_flags" as an argument.

That is far less straightforward to implement in the vanilla kernel as it
stands today, and so really doesn't fall under the "low hanging fruit" rubric.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-12  0:59       ` Roland McGrath
@ 2011-02-12 19:11         ` Oleg Nesterov
  2011-02-14 19:31           ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-12 19:11 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer

On 02/11, Roland McGrath wrote:
>
> > > > Or. Suppose that clone() under PTRACE_O_INHERIT notifies the tracer
> > > > (sends SIGCHLD), and the new tracee gets the new PTRACE_O_INHERITed
> > > > mark. Then we can implement wait(W_WHO_WAS_CLONNED) which clears
> > > > PTRACE_O_INHERITed and reports the new tracee (just in case, this
> > > > doesn't need the stopped tracee).
> > >
> > > I don't really follow this idea at all, sorry.
> >
> > I meant, we can intoduce the new W*** flag for do_wait(). If the new
> > tracee was PTRACE_O_INHERIT'ed, do_wait() returns its pid.
>
> I still don't understand the proposal.

To simplify the explanation, suppose we add task_struct->unknown_tracee
boolean.

if tracehook_finish_clone()->ptrace_init_task() does __ptrace_link()
because of PTRACE_O_INHERIT, it also sets child->unknown_tracee and
notifies the tracee via do_notify_parent_cldstop().

Then we add WCLONNED and modify wait_consider_task(),

	-	if (likely(!ptrace) && unlikely(task_ptrace(p))) {
	-		/*
	-		 * This child is hidden by ptrace.
	-		 * We aren't allowed to see it now, but eventually we will.
	-		 */
	-		wo->notask_error = 0;
	-		return 0;
	-	}
	+	if (unlikely(ptrace) {
	+		if (unlikely(p->unknown_tracee) && (wo->wo_flags & WCLONNED)) {
	+			// of course, this is racy
	+			p->unknown_tracee = 0;
	+
	+			// we need wait_task_ptrace_inherited(wo, p);
	+			read_unlock(&tasklist_lock);
	+			return p->pid;
	+		}
	+
	+	} else if (unlikely(task_ptrace(p))) {
	+		/*
	+		 * This child is hidden by ptrace.
	+		 * We aren't allowed to see it now, but eventually we will.
	+		 */
	+		wo->notask_error = 0;
	+		return 0;
	+	}

Of course this is just incomplete pseudo-code to explain what I mean.

> > Well yes, but /proc/PID/task/ is not convenient and reliable.
> > Especially if we do not trace all threads.
>
> Tracing some threads but not all is really an artifact of the ptrace
> interface and not something that any real userland debugger-like thing
> ever wants to do.

Off-topic note: I disagree very much, but this doesn't matter. I agree
that ptrace nterface should not be per-thread, and gdb always traces all
threads.


> But, again, we want to see what GDB really wants to use and only add that.

Yes, yes, agreed.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-12 19:11         ` Oleg Nesterov
@ 2011-02-14 19:31           ` Roland McGrath
  2011-02-14 19:46             ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-14 19:31 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Project Archer

> > > I meant, we can intoduce the new W*** flag for do_wait(). If the new
> > > tracee was PTRACE_O_INHERIT'ed, do_wait() returns its pid.
> >
> > I still don't understand the proposal.
> 
> To simplify the explanation, suppose we add task_struct->unknown_tracee
> boolean.
> 
> if tracehook_finish_clone()->ptrace_init_task() does __ptrace_link()
> because of PTRACE_O_INHERIT, it also sets child->unknown_tracee and
> notifies the tracee via do_notify_parent_cldstop().

So the suggestion is to have the tracer see a wait report that simply says
"here is a new implicit tracee".  I don't see how that is useful at all.
It supplies no new information, only mentions the tracee earlier.  That
makes it a short race window in which it's possible for a tracer to have no
possible means of identifying the lineage of an implicit tracee.  But it
does not solve the problem in the general case.

> > Tracing some threads but not all is really an artifact of the ptrace
> > interface and not something that any real userland debugger-like thing
> > ever wants to do.
> 
> Off-topic note: I disagree very much, but this doesn't matter. I agree
> that ptrace nterface should not be per-thread, and gdb always traces all
> threads.

Then I don't understand at all what you are disagreeing with.
You think the interface should not be per-thread, but you don't agree
that a per-thread interface is not something debuggers really want?


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-14 19:31           ` Roland McGrath
@ 2011-02-14 19:46             ` Oleg Nesterov
  2011-02-15  0:36               ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-14 19:46 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer

On 02/14, Roland McGrath wrote:
>
> > > > I meant, we can intoduce the new W*** flag for do_wait(). If the new
> > > > tracee was PTRACE_O_INHERIT'ed, do_wait() returns its pid.
> > >
> > > I still don't understand the proposal.
> >
> > To simplify the explanation, suppose we add task_struct->unknown_tracee
> > boolean.
> >
> > if tracehook_finish_clone()->ptrace_init_task() does __ptrace_link()
> > because of PTRACE_O_INHERIT, it also sets child->unknown_tracee and
> > notifies the tracee via do_notify_parent_cldstop().
>
> So the suggestion is to have the tracer see a wait report that simply says
> "here is a new implicit tracee".  I don't see how that is useful at all.

OK, lets forget then.

> > > Tracing some threads but not all is really an artifact of the ptrace
> > > interface and not something that any real userland debugger-like thing
> > > ever wants to do.
> >
> > Off-topic note: I disagree very much, but this doesn't matter. I agree
> > that ptrace nterface should not be per-thread, and gdb always traces all
> > threads.
>
> Then I don't understand at all what you are disagreeing with.
> You think the interface should not be per-thread, but you don't agree
> that a per-thread interface is not something debuggers really want?

I meant, I do not agree that it never makes sense to trace, say, a
single thread from the thread group. But since we are talking about
gdb my noncompliance is off-topic.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-14 19:46             ` Oleg Nesterov
@ 2011-02-15  0:36               ` Roland McGrath
  2011-02-15 13:16                 ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-15  0:36 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Project Archer

> I meant, I do not agree that it never makes sense to trace, say, a
> single thread from the thread group. But since we are talking about
> gdb my noncompliance is off-topic.

We are also talking about what makes sense for a kernel interface, so it's
certainly apropos to share our thoughts on the general subject.  Please
elaborate on why you think this makes sense.  It's certainly true that it
makes sense to have your tracing details different for one thread than for
others (stopping on more events, etc.) but that's quite a different thing
from saying it really makes sense to have a debugging interface structured
to consider tracing an individual thread as unrelated to tracing all the
threads in the same process, as Linux's ptrace is today.  I can't really
imagine what you have in mind, so please tell me.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15  0:36               ` Roland McGrath
@ 2011-02-15 13:16                 ` Oleg Nesterov
  2011-02-15 21:43                   ` Jan Kratochvil
  2011-02-15 22:17                   ` Roland McGrath
  0 siblings, 2 replies; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-15 13:16 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer

On 02/14, Roland McGrath wrote:
>
> > I meant, I do not agree that it never makes sense to trace, say, a
> > single thread from the thread group. But since we are talking about
> > gdb my noncompliance is off-topic.
>
> We are also talking about what makes sense for a kernel interface, so it's
> certainly apropos to share our thoughts on the general subject.  Please
> elaborate on why you think this makes sense.  It's certainly true that it
> makes sense to have your tracing details different for one thread than for
> others (stopping on more events, etc.)

No, I didn't mean this (but I agree this is true).

> but that's quite a different thing
> from saying it really makes sense to have a debugging interface structured
> to consider tracing an individual thread as unrelated to tracing all the
> threads in the same process,

Yes, and I can't understand why this doesn't make sense.

I do not have any good example in mind. But if I want to know what this
particular thread does, why should I trace the whole process? Not to
mention, with the current ptrace interface it is never "free" to ptrace
the threads you do not care about.

If nothing else. I like the fact "strace -p" can trace the individual
thread without disturbing the whole thread group.


And in fact I don't really understand why WCLONNED doesn't make sense
in general. OK, probably gdb doesn't need it. As you said,

	It supplies no new information, only mentions the tracee earlier.

sure. IOW, the new thread doesn't exist (from gdb pov) until it reports
something interesting.

OK, but I think PTRACE_O_INHERIT is useful in general, not only for gdb.
And it looks a bit strange the tracer can't know what it traces. Especially
because the current API is per-thread.

Suppose that debugger uses PTRACE_O_INHERIT and then it decides to detach
gracefully. It should detach per-thread, but how? /proc is very unconvenient.
Even if we traced all threads, we do not trace them all after we detach
the first thread, and that thread can clone the new threads after detach.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 13:16                 ` Oleg Nesterov
@ 2011-02-15 21:43                   ` Jan Kratochvil
  2011-02-15 21:56                     ` Roland McGrath
                                       ` (2 more replies)
  2011-02-15 22:17                   ` Roland McGrath
  1 sibling, 3 replies; 57+ messages in thread
From: Jan Kratochvil @ 2011-02-15 21:43 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Roland McGrath, Project Archer

On Tue, 15 Feb 2011 14:08:05 +0100, Oleg Nesterov wrote:
> And it looks a bit strange the tracer can't know what it traces.

With PTRACE_O_INHERIT GDB can no longer know as any thread can disappear any
time.  But even GDB does not care much, it has the list stale and updates it
by update_thread_list() during `info threads' and similar commands.
GDB also never trusts the found threads list, it does tkill (TID, 0) to verify
if any TID is still alive.


> Suppose that debugger uses PTRACE_O_INHERIT and then it decides to detach
> gracefully. It should detach per-thread, but how? /proc is very unconvenient.
> Even if we traced all threads, we do not trace them all after we detach
> the first thread, and that thread can clone the new threads after detach.

Maybe it is obvious but it has not been heard yet.
With PTRACE_O_THREAD_INHERIT the debugger no longer has to enumerate threads
for ptrace operations - except for attach+detach.

Could PTRACE_ATTACH_NOSTOP+PTRACE_DETACH_NOSTOP? act on the whole PID at once
if PTRACE_O_THREAD_INHERIT is applied - already during PTRACE_ATTACH_NOSTOP?
It is a bit overloading the *_NOSTOP meaning but it is a new PTRACE_* op.


Thanks,
Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 21:43                   ` Jan Kratochvil
@ 2011-02-15 21:56                     ` Roland McGrath
  2011-02-16 19:42                       ` Oleg Nesterov
  2011-02-15 22:02                     ` Roland McGrath
  2011-02-16 19:38                     ` Oleg Nesterov
  2 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-15 21:56 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Oleg Nesterov, Project Archer

> GDB also never trusts the found threads list, it does tkill (TID, 0) to verify
> if any TID is still alive.

Btw, it should use tgkill to avoid false-positive corner cases of PID reuse.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 21:43                   ` Jan Kratochvil
  2011-02-15 21:56                     ` Roland McGrath
@ 2011-02-15 22:02                     ` Roland McGrath
  2011-02-16 16:02                       ` Jan Kratochvil
  2011-02-16 19:48                       ` Oleg Nesterov
  2011-02-16 19:38                     ` Oleg Nesterov
  2 siblings, 2 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-15 22:02 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Oleg Nesterov, Project Archer

> Could PTRACE_ATTACH_NOSTOP+PTRACE_DETACH_NOSTOP? act on the whole PID at once
> if PTRACE_O_THREAD_INHERIT is applied - already during PTRACE_ATTACH_NOSTOP?
> It is a bit overloading the *_NOSTOP meaning but it is a new PTRACE_* op.

This is an obviously desireable feature.  I considered it when making my
suggestions, but stopped short of suggesting it because I think it may have
substantially difficult implementation issues in the kernel.  It may well
be that that detach-group is not really very hard to do at all (that
probably being asynchronous detach, not requiring stopped tracees, and not
supporting any parting-signal feature).  Perhaps attach-group is also more
tractable than I thought from a brief consideration of the implementation
issues.  Oleg will have to study it and give his opinions.  It was my guess
that even if it's pretty doable, the attach-group implementation will be
sufficiently hairy that the kernel community would demand that we do some
simpler incremental changes first before attempting it (such as what I
proposed for ATTACH_NOSTOP, which eliminates the SIGSTOP and rolls in
atomic option-setting).


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 13:16                 ` Oleg Nesterov
  2011-02-15 21:43                   ` Jan Kratochvil
@ 2011-02-15 22:17                   ` Roland McGrath
  2011-02-16 20:48                     ` Oleg Nesterov
  1 sibling, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-15 22:17 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Project Archer

> Yes, and I can't understand why this doesn't make sense.
> 
> I do not have any good example in mind. But if I want to know what this
> particular thread does, why should I trace the whole process? Not to
> mention, with the current ptrace interface it is never "free" to ptrace
> the threads you do not care about.

On this subject we are no longer talking about ptrace, since it is a
per-thread interface.  Quite simply, the notion of considering a thread in
isolation from its process just does not make any sense to me at all from
the perspective of a user or a debugger or anything in userland, or in the
abstract.  It only appears sensical at all because of the Linux kernel's
implementation details.

As I already mentioned, it is certainly sensical that any sane interface
would not require you to do anything that perturbs or slows all threads in
a process merely by the structure of the interface.  Naturally, just to be
"attached" to a process doesn't imply noticing any events that happen in it
except for the process going away entirely.

> If nothing else. I like the fact "strace -p" can trace the individual
> thread without disturbing the whole thread group.

If that's all you want to see, that's all you'd ask for.
I don't see what that has to do with structuring an interface
such that you pretend that a thread is an independent agent that
has no relationship with its containing process.

> And in fact I don't really understand why WCLONNED doesn't make sense
> in general. OK, probably gdb doesn't need it. As you said,
> 
> 	It supplies no new information, only mentions the tracee earlier.

(There would be only one N in that, btw.)  I don't think I said it didn't
make any sense at all.  I said it doesn't really address the problem of
tracking lineage.  That's the subject I was considering, since that's the
context in which you proposed it.

> OK, but I think PTRACE_O_INHERIT is useful in general, not only for gdb.
> And it looks a bit strange the tracer can't know what it traces. Especially
> because the current API is per-thread.
> 
> Suppose that debugger uses PTRACE_O_INHERIT and then it decides to detach
> gracefully. It should detach per-thread, but how? /proc is very unconvenient.
> Even if we traced all threads, we do not trace them all after we detach
> the first thread, and that thread can clone the new threads after detach.

That is a good point, and relevant to GDB's use.  If this were the
interface, then certainly it would need a way to know what to detach.
This highlights how it's all a highly unnatural interface for what a
debugger really wants to do.  But in considering small incremental
improvements to ptrace, we need to contemplate such issues.

This is an example of how the clean, new, non-ptrace, high-level interface
that I always imagined would be far more natural.  In that notion, the
debugger would define a "tracing group" (always needed a better term for
that) that is its handle on referring to what it's doing in the interface.
Things like automatic attaching of new threads or children would be in the
context of a particular tracing group, and graceful detach could also
operate on the basis of releasing an entire tracing group.  But, all that
gets us away into the land of "good ideas" rather than "ptrace improvement".


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas (QPassSignals)
  2011-02-03 22:39 ptrace improvement ideas Roland McGrath
                   ` (2 preceding siblings ...)
  2011-02-10 20:00 ` ptrace improvement: PTRACE_O_INHERIT Oleg Nesterov
@ 2011-02-16 11:31 ` Jan Kratochvil
  2011-02-16 18:36   ` Roland McGrath
  3 siblings, 1 reply; 57+ messages in thread
From: Jan Kratochvil @ 2011-02-16 11:31 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer, Oleg Nesterov

On Thu, 03 Feb 2011 23:39:05 +0100, Roland McGrath wrote:
> These are a few ideas to get the discussion started.

An idea about another simple ptrace extension - to set a set of signals which
are not interesting to GDB, like what exists in the gdbserver protocol:
	info '(gdb)QPassSignals'
	`QPassSignals: SIGNAL [;SIGNAL]...'
	Each listed SIGNAL should be passed directly to the inferior process.
	Signals are numbered identically to continue packets and

which applies to 'nostop noprint pass' signals such as SIGCHLD by default.


Thanks,
Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 22:02                     ` Roland McGrath
@ 2011-02-16 16:02                       ` Jan Kratochvil
  2011-02-16 18:28                         ` Roland McGrath
  2011-02-16 19:48                       ` Oleg Nesterov
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Kratochvil @ 2011-02-16 16:02 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Oleg Nesterov, Project Archer

On Tue, 15 Feb 2011 23:01:53 +0100, Roland McGrath wrote:
> (that probably being asynchronous detach, not requiring stopped tracees, and
> not supporting any parting-signal feature).

That may be too asynchronous.  After GDB/gcore/etc. finishes new PTRACE_ATTACH
must complete successfully.  I never know if it is already guaranteed or not
but in practice it works now.

If there is no PTRACE_DETACH_ALL_TIDS(pid) and PTRACE_O_THREAD_INHERIT is in
use GDB probably has to repeatedly iterate /proc/PID/task/*/status till all
have TracerPid == 0.


> the attach-group implementation will be sufficiently hairy that the kernel
> community would demand that we do some simpler incremental changes first
> before attempting it (such as what I proposed for ATTACH_NOSTOP, which
> eliminates the SIGSTOP and rolls in atomic option-setting).

thread_db_find_new_threads_2 already does an ugly loop of repeated threads
finding.  There are also some failures that can be probably fixed by changing
td_ta_thr_iter to readdir(/proc/PID/task) (RH#677654).  There are some pending
bugs in GDB about it.


Thanks,
Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 16:02                       ` Jan Kratochvil
@ 2011-02-16 18:28                         ` Roland McGrath
  2011-02-16 20:00                           ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 18:28 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Oleg Nesterov, Project Archer

> That may be too asynchronous.  After GDB/gcore/etc. finishes new PTRACE_ATTACH
> must complete successfully.  I never know if it is already guaranteed or not
> but in practice it works now.

Sorry if I was unclear, that is not the kind of asynchrony I was talking
about.  It is guaranteed now that when PTRACE_DETACH succeeds, the tracee
is detached and can be attached anew.  That would be true of what I
suggested also.  The asynchrony I mean is the good kind: that it doesn't
have to be stopped for you to detach it.  

There is nothing special for a debugger to worry about with this
asynchrony unless it uses multiple threads where one thread calls wait*
while another thread calls ptrace.  In that case, the debugger's wait*
thread could see a stop result that appears to be after its other thread
detached the same tracee.  (That is already true now with PTRACE_DETACH.)
If the debugger's own wait* call is strictly ordered after its detaching
ptrace call, there can be no such confusion.

> If there is no PTRACE_DETACH_ALL_TIDS(pid) and PTRACE_O_THREAD_INHERIT is in
> use GDB probably has to repeatedly iterate /proc/PID/task/*/status till all
> have TracerPid == 0.

Indeed.  That seems undesireable.

> > the attach-group implementation will be sufficiently hairy that the kernel
> > community would demand that we do some simpler incremental changes first
> > before attempting it (such as what I proposed for ATTACH_NOSTOP, which
> > eliminates the SIGSTOP and rolls in atomic option-setting).
> 
> thread_db_find_new_threads_2 already does an ugly loop of repeated threads
> finding.  There are also some failures that can be probably fixed by changing
> td_ta_thr_iter to readdir(/proc/PID/task) (RH#677654).  There are some pending
> bugs in GDB about it.

I take that to mean that you do not see a strong demand for a
group-attach feature, though having one would simplify things.
(But since GDB will never get rid of its code to support older
kernels, perhaps such simplification doesn't really help much
in practice.)


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas (QPassSignals)
  2011-02-16 11:31 ` ptrace improvement ideas (QPassSignals) Jan Kratochvil
@ 2011-02-16 18:36   ` Roland McGrath
  2011-02-16 20:21     ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 18:36 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Project Archer, Oleg Nesterov

> An idea about another simple ptrace extension - to set a set of signals which
> are not interesting to GDB, like what exists in the gdbserver protocol:

That is another obvious feature.  It is straightforward to implement in the
kernel, but it requires some extra storage per-thread, which is clumsy in
the current kernel implementation.  As things are structured today, it's
hard to avoid bloating the kernel's main per-thread data structure by at
least a word (cost paid on every thread in the system even when ptrace is
not in use).  I'm not sure what the kernel community reaction to that would
be, though it might be acceptable.  It's already desireable to restructure
the ptrace implementation details so that ptrace-related storage is not
allocated when ptrace is not being used, but it's nontrivial to make such
changes.  They might happen in the future anyway.  But for those reasons
I did not include this as obvious low-hanging fruit.

It's also not clear to me how important that feature is in practice to GDB
users.  It's an obvious thing that we've all known about forever, but in my
recollections it is not very common to hear about a user complaining about
overhead of intercepting and passing on signals.  I suppose it is very
important to any application that is designed to generate and handle lots
of signals, but I'm not sure how often that comes up in practice.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 21:43                   ` Jan Kratochvil
  2011-02-15 21:56                     ` Roland McGrath
  2011-02-15 22:02                     ` Roland McGrath
@ 2011-02-16 19:38                     ` Oleg Nesterov
  2011-02-16 19:40                       ` Roland McGrath
  2 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 19:38 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Roland McGrath, Project Archer

On 02/15, Jan Kratochvil wrote:
>
> On Tue, 15 Feb 2011 14:08:05 +0100, Oleg Nesterov wrote:
>
> > Suppose that debugger uses PTRACE_O_INHERIT and then it decides to detach
> > gracefully. It should detach per-thread, but how? /proc is very unconvenient.
> > Even if we traced all threads, we do not trace them all after we detach
> > the first thread, and that thread can clone the new threads after detach.
>
> Maybe it is obvious but it has not been heard yet.
> With PTRACE_O_THREAD_INHERIT the debugger no longer has to enumerate threads
> for ptrace operations - except for attach+detach.
>
> Could PTRACE_ATTACH_NOSTOP+PTRACE_DETACH_NOSTOP? act on the whole PID at once
> if PTRACE_O_THREAD_INHERIT is applied - already during PTRACE_ATTACH_NOSTOP?
> It is a bit overloading the *_NOSTOP meaning but it is a new PTRACE_* op.

Yes, I thought about PTRACE_ATTACH_TG / PTRACE_DETACH_TG too.

This is nontrivial to implement. We should obviously avoid the races with
clone(). Now that cred_guard_mutex is per-process, this is probably possible...
But I can't understand how we can check __ptrace_may_access() for each thread.
We should refactor this code, task_lock() is needed to pin ->mm, and ->mm
should be the same. Anyway, looks possible.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 19:38                     ` Oleg Nesterov
@ 2011-02-16 19:40                       ` Roland McGrath
  0 siblings, 0 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 19:40 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> This is nontrivial to implement. We should obviously avoid the races with
> clone(). Now that cred_guard_mutex is per-process, this is probably possible...
> But I can't understand how we can check __ptrace_may_access() for each thread.
> We should refactor this code, task_lock() is needed to pin ->mm, and ->mm
> should be the same. Anyway, looks possible.

Right, that matches what I was thinking when I looked at it.
It seems like it should be doable, but looking at the details
of the locking got hairy fast.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 21:56                     ` Roland McGrath
@ 2011-02-16 19:42                       ` Oleg Nesterov
  2011-02-16 19:45                         ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 19:42 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/15, Roland McGrath wrote:
>
> > GDB also never trusts the found threads list, it does tkill (TID, 0) to verify
> > if any TID is still alive.
>
> Btw, it should use tgkill to avoid false-positive corner cases of PID reuse.

Even tgkill() can't 100% protect against PID reuse, but I agree it is better.

But I failed to understand why gdb needs tkill/tgkill currently. A thread
can't go away silently? This also means its pid can't be reused.

Confused.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 19:42                       ` Oleg Nesterov
@ 2011-02-16 19:45                         ` Roland McGrath
  2011-02-16 20:09                           ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 19:45 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> Even tgkill() can't 100% protect against PID reuse, but I agree it is better.

If the process-wide PID (tgid) is not reused, which it never is silently
when ptrace'd today, then tgkill does protect completely against TID reuse.
With ptrace as it is today, there is only a silent TID reuse risk in the
case of MT exec.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 22:02                     ` Roland McGrath
  2011-02-16 16:02                       ` Jan Kratochvil
@ 2011-02-16 19:48                       ` Oleg Nesterov
  2011-02-16 20:02                         ` Roland McGrath
  1 sibling, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 19:48 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/15, Roland McGrath wrote:
>
> It was my guess
> that even if it's pretty doable, the attach-group implementation will be
> sufficiently hairy

At first glance, this should be more or less straightforward. I think.

(btw, this is obviously much more difficult with utrace, we can't
 utrace_attach_task() under tasklist).

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 18:28                         ` Roland McGrath
@ 2011-02-16 20:00                           ` Oleg Nesterov
  2011-02-16 20:07                             ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 20:00 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/16, Roland McGrath wrote:
>
> > That may be too asynchronous.  After GDB/gcore/etc. finishes new PTRACE_ATTACH
> > must complete successfully.  I never know if it is already guaranteed or not
> > but in practice it works now.
>
> Sorry if I was unclear, that is not the kind of asynchrony I was talking
> about.  It is guaranteed now that when PTRACE_DETACH succeeds, the tracee
> is detached and can be attached anew.  That would be true of what I
> suggested also.  The asynchrony I mean is the good kind: that it doesn't
> have to be stopped for you to detach it.
>
> There is nothing special for a debugger to worry about with this
> asynchrony

Yes, but I can' understand the next part:

> unless it uses multiple threads where one thread calls wait*
> while another thread calls ptrace.  In that case, the debugger's wait*
> thread could see a stop result that appears to be after its other thread
> detached the same tracee.  (That is already true now with PTRACE_DETACH.)
> If the debugger's own wait* call is strictly ordered after its detaching
> ptrace call, there can be no such confusion.

Could you spell please?

Just curious.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 19:48                       ` Oleg Nesterov
@ 2011-02-16 20:02                         ` Roland McGrath
  2011-02-16 20:15                           ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 20:02 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> At first glance, this should be more or less straightforward. I think.
> 
> (btw, this is obviously much more difficult with utrace, we can't
>  utrace_attach_task() under tasklist).

Indeed.  We might have to resort to a loop chasing untraced clones or
something.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 20:00                           ` Oleg Nesterov
@ 2011-02-16 20:07                             ` Roland McGrath
  2011-02-16 20:32                               ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 20:07 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> > unless it uses multiple threads where one thread calls wait*
> > while another thread calls ptrace.  In that case, the debugger's wait*
> > thread could see a stop result that appears to be after its other thread
> > detached the same tracee.  (That is already true now with PTRACE_DETACH.)
> > If the debugger's own wait* call is strictly ordered after its detaching
> > ptrace call, there can be no such confusion.
> 
> Could you spell please?

Say a debugger has two threads.

Thread A calls wait*.
Thread B calls PTRACE_DETACH.

Thread A can perceive a wait result from tracee T to be simultaneous with,
or after, thread B has detached from T.

The debugger has to do some work to be sure it doesn't get confused in its
own bookkeeping of what tracees it thinks are attached.

That's all.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 19:45                         ` Roland McGrath
@ 2011-02-16 20:09                           ` Oleg Nesterov
  2011-02-16 20:16                             ` Roland McGrath
  2011-02-19 19:48                             ` Jan Kratochvil
  0 siblings, 2 replies; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 20:09 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/16, Roland McGrath wrote:
>
> > Even tgkill() can't 100% protect against PID reuse, but I agree it is better.
>
> If the process-wide PID (tgid) is not reused
> ...
> then tgkill does protect completely against TID reuse.

How? sooner or later the process which creates/destroys a thread
in a loop will reuse some pid number.

But probably you meant that tgkill() guarantees it is still in
the same process, in this case I agree.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 20:02                         ` Roland McGrath
@ 2011-02-16 20:15                           ` Oleg Nesterov
  2011-02-16 20:31                             ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 20:15 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/16, Roland McGrath wrote:
>
> > At first glance, this should be more or less straightforward. I think.
> >
> > (btw, this is obviously much more difficult with utrace, we can't
> >  utrace_attach_task() under tasklist).
>
> Indeed.  We might have to resort to a loop chasing untraced clones or
> something.

Or, attach can use a mutex in engine->data and synchronize with its
own report_clone/report_death callbacks. This makes attach O(n).
ugdb does this.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 20:09                           ` Oleg Nesterov
@ 2011-02-16 20:16                             ` Roland McGrath
  2011-02-19 19:48                             ` Jan Kratochvil
  1 sibling, 0 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 20:16 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> How? sooner or later the process which creates/destroys a thread
> in a loop will reuse some pid number.
> 
> But probably you meant that tgkill() guarantees it is still in
> the same process, in this case I agree.

Yes, that is what I meant.  Sorry.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas (QPassSignals)
  2011-02-16 18:36   ` Roland McGrath
@ 2011-02-16 20:21     ` Oleg Nesterov
  2011-02-18 20:24       ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 20:21 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/16, Roland McGrath wrote:
>
> > An idea about another simple ptrace extension - to set a set of signals which
> > are not interesting to GDB, like what exists in the gdbserver protocol:
>
> That is another obvious feature.  It is straightforward to implement in the
> kernel, but it requires some extra storage per-thread,

Yes, all we need is the simple bitmask.

> It's already desireable to restructure
> the ptrace implementation details so that ptrace-related storage is not
> allocated when ptrace is not being used,

Yes. IIRC, we already had the patches, but right now I can't recall
if they were complete (most probably yes, except I am not sure about
task->ptrace member). I'll try to find this series.

Probably this is the first thing we should do.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 20:15                           ` Oleg Nesterov
@ 2011-02-16 20:31                             ` Roland McGrath
  2011-02-16 21:04                               ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 20:31 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> Or, attach can use a mutex in engine->data and synchronize with its
> own report_clone/report_death callbacks. This makes attach O(n).
> ugdb does this.

Sure, you'd do that too.  But for attaching to live threads, you still have
to chase the new clones just done by the unattached threads before you
attached them.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 20:07                             ` Roland McGrath
@ 2011-02-16 20:32                               ` Oleg Nesterov
  0 siblings, 0 replies; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 20:32 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/16, Roland McGrath wrote:
>
> > Could you spell please?
>
> Say a debugger has two threads.
>
> Thread A calls wait*.
> Thread B calls PTRACE_DETACH.
>
> Thread A can perceive a wait result from tracee T to be simultaneous with,
> or after, thread B has detached from T.

Ah, yes, sure.

I thought you meant that do_wait() (in kernel) can somehow see the
tracee after it was already detached.

Thanks.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-15 22:17                   ` Roland McGrath
@ 2011-02-16 20:48                     ` Oleg Nesterov
  0 siblings, 0 replies; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 20:48 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Project Archer

On 02/15, Roland McGrath wrote:
>
> Quite simply, the notion of considering a thread in
> isolation from its process just does not make any sense to me at all from
> the perspective of a user or a debugger or anything in userland,

And this is what I can't understand. Despite the fact I have read
your email many times.

> This is an example of how the clean, new, non-ptrace, high-level interface
> that I always imagined would be far more natural.  In that notion, the
> debugger would define a "tracing group" (always needed a better term for
> that) that is its handle on referring to what it's doing in the interface.

Yes, agreed.

The sane interface should use something, say, fd which (at least) allows
to detach all tracees or set some options for all of them. And of course
attach-the-whole-process makes a lot of sense. But at the same time
attach-this-particular-thread looks very natural to me.

> But, all that
> gets us away into the land of "good ideas" rather than "ptrace improvement".

Yes.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 20:31                             ` Roland McGrath
@ 2011-02-16 21:04                               ` Oleg Nesterov
  2011-02-16 21:51                                 ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-16 21:04 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/16, Roland McGrath wrote:
>
> > Or, attach can use a mutex in engine->data and synchronize with its
> > own report_clone/report_death callbacks. This makes attach O(n).
> > ugdb does this.
>
> Sure, you'd do that too.  But for attaching to live threads, you still have
> to chase the new clones just done by the unattached threads before you
> attached them.

Actually no, or I misunderstood.

To simplify, lets ignore the problems with the dead group_leader.
Now suppose (again, to simplify) that report_clone/report_death
take global_mutex before anything else, and attach-to-all takes
this mutex too.

This means we can always use next_thread(last_attached_thread).
The next attach can fail if it races with exit, in this case
we should just re-read next_thread() and retry.

We should see all sub-threads, copy_process() does list_add_tail().
Once next_thread() returns the group leader, everything is done.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 21:04                               ` Oleg Nesterov
@ 2011-02-16 21:51                                 ` Roland McGrath
  0 siblings, 0 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-16 21:51 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> We should see all sub-threads, copy_process() does list_add_tail().
> Once next_thread() returns the group leader, everything is done.

I see.  You are more clever than I am. ;-)

Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement ideas (QPassSignals)
  2011-02-16 20:21     ` Oleg Nesterov
@ 2011-02-18 20:24       ` Oleg Nesterov
  0 siblings, 0 replies; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-18 20:24 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/16, Oleg Nesterov wrote:
>
> On 02/16, Roland McGrath wrote:
>
> > It's already desireable to restructure
> > the ptrace implementation details so that ptrace-related storage is not
> > allocated when ptrace is not being used,
>
> Yes. IIRC, we already had the patches, but right now I can't recall
> if they were complete (most probably yes, except I am not sure about
> task->ptrace member). I'll try to find this series.

It turns out, it was complete but it should be changed a bit.

> Probably this is the first thing we should do.

Yes, I think this can help to implement the new features.

But there is a problem. If we want PTRACE_ATTCH_ALL_THREADS (and iiuc
we do want) we have to use GFP_ATOMIC to allocate ptrace_state under
tasklist_lock.

I wonder if we can invent a sleepable lock. Actually it is also needed
for cgroups, the guys from google are going to add the rw_semaphore into
signal_struct. But it depends on CONFIG_CGROUPS and doesn't cover exit.

Yes, yes, I am talking about implementation details again...

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-16 20:09                           ` Oleg Nesterov
  2011-02-16 20:16                             ` Roland McGrath
@ 2011-02-19 19:48                             ` Jan Kratochvil
  2011-02-19 20:37                               ` Oleg Nesterov
  2011-02-21 19:44                               ` Roland McGrath
  1 sibling, 2 replies; 57+ messages in thread
From: Jan Kratochvil @ 2011-02-19 19:48 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Roland McGrath, Project Archer

On Wed, 16 Feb 2011 21:01:10 +0100, Oleg Nesterov wrote:
> On 02/16, Roland McGrath wrote:
> > If the process-wide PID (tgid) is not reused
> > ...
> > then tgkill does protect completely against TID reuse.
> 
> How? sooner or later the process which creates/destroys a thread
> in a loop will reuse some pid number.

(a) GDB tracks separately  threads (libpthread-managed) and LWPs (TIDs).
    (So the libthread_db TD_DEATH notification is offtopic for this mail.)

(b) For LWPs it does not use PTRACE_O_TRACEEXIT.
    /* Do not enable PTRACE_O_TRACEEXIT until GDB is more prepared to support
       read-only process state.  */
    (I do not know more regarding this comment.)

Therefore it occasionally uses tkill (TID, 0) instead of PTRACE_O_TRACEEXIT.

It uses PTRACE_O_TRACECLONE so it should see a reusal of thread TID in the same
PID, although it seems to me it will just create a duplicate {PID, TID}.

Just currently there are enough bugreports about users hitting threading races
of GDB in real world so we do not have to think about more race possibilities.


Thanks,
Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-19 19:48                             ` Jan Kratochvil
@ 2011-02-19 20:37                               ` Oleg Nesterov
  2011-02-20  8:18                                 ` Jan Kratochvil
  2011-02-21 19:54                                 ` Roland McGrath
  2011-02-21 19:44                               ` Roland McGrath
  1 sibling, 2 replies; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-19 20:37 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Roland McGrath, Project Archer

On 02/19, Jan Kratochvil wrote:
>
> On Wed, 16 Feb 2011 21:01:10 +0100, Oleg Nesterov wrote:
> > On 02/16, Roland McGrath wrote:
> > > If the process-wide PID (tgid) is not reused
> > > ...
> > > then tgkill does protect completely against TID reuse.
> >
> > How? sooner or later the process which creates/destroys a thread
> > in a loop will reuse some pid number.
>
> (a) GDB tracks separately  threads (libpthread-managed) and LWPs (TIDs).
>     (So the libthread_db TD_DEATH notification is offtopic for this mail.)
>
> (b) For LWPs it does not use PTRACE_O_TRACEEXIT.
>     /* Do not enable PTRACE_O_TRACEEXIT until GDB is more prepared to support
>        read-only process state.  */
>     (I do not know more regarding this comment.)
>
> Therefore it occasionally uses tkill (TID, 0) instead of PTRACE_O_TRACEEXIT.

Cough. I still can't understand why gdb needs tkill(TID, 0). Please
ignore, I know nothing about gdb implementation.

However. With or without PTRACE_O_TRACEEXIT, the thread can't go away
until ptracer does do_wait(WEXITED). And until it does do_wait() tkill()
succeeds even if the tracee is dead/zombie.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-19 20:37                               ` Oleg Nesterov
@ 2011-02-20  8:18                                 ` Jan Kratochvil
  2011-02-20 21:05                                   ` Oleg Nesterov
  2011-02-21 19:54                                 ` Roland McGrath
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Kratochvil @ 2011-02-20  8:18 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Roland McGrath, Project Archer

On Sat, 19 Feb 2011 21:29:21 +0100, Oleg Nesterov wrote:
> Cough. I still can't understand why gdb needs tkill(TID, 0). Please
> ignore, I know nothing about gdb implementation.
> 
> However. With or without PTRACE_O_TRACEEXIT, the thread can't go away
> until ptracer does do_wait(WEXITED). And until it does do_wait() tkill()
> succeeds even if the tracee is dead/zombie.

Maybe:
          /* The thread has previously exited.  We need to delete it
             now because, for some vendor 2.4 kernels with NPTL
             support backported, there won't be an exit event unless
             it is the main thread.  2.6 kernels will report an exit
             event for each thread that exits, as expected.  */

There probably was a reason why it was once implemented.  Unfortunately GDB is
in use also on embedded targets with obsolete and patched Linux kernels so it
is not welcome to remove such code.


Thanks,
Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-20  8:18                                 ` Jan Kratochvil
@ 2011-02-20 21:05                                   ` Oleg Nesterov
  0 siblings, 0 replies; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-20 21:05 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Roland McGrath, Project Archer

On 02/20, Jan Kratochvil wrote:
>
> On Sat, 19 Feb 2011 21:29:21 +0100, Oleg Nesterov wrote:
> > Cough. I still can't understand why gdb needs tkill(TID, 0). Please
> > ignore, I know nothing about gdb implementation.
> >
> > However. With or without PTRACE_O_TRACEEXIT, the thread can't go away
> > until ptracer does do_wait(WEXITED). And until it does do_wait() tkill()
> > succeeds even if the tracee is dead/zombie.
>
> Maybe:
>           /* The thread has previously exited.  We need to delete it
>              now because, for some vendor 2.4 kernels with NPTL
>              support backported, there won't be an exit event unless
>              it is the main thread.  2.6 kernels will report an exit
>              event for each thread that exits, as expected.  */
>
> There probably was a reason why it was once implemented.

Probably yes... I tried to understand this code, but failed. I am
not suprised, gdb is not trivial.

But. This makes me think that Roland was right, gdb needs tgkill().
Otherwise, if the old kernel does not report the exit via WIFEXITED,
then tkill(TID, 0) can succeed while the tracee has gone away.

> Unfortunately GDB is
> in use also on embedded targets with obsolete and patched Linux kernels so it
> is not welcome to remove such code.

Heh, I see. I always knew that the kernel hacking is simple compared
to user-space development ;)

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-19 19:48                             ` Jan Kratochvil
  2011-02-19 20:37                               ` Oleg Nesterov
@ 2011-02-21 19:44                               ` Roland McGrath
  1 sibling, 0 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-21 19:44 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Oleg Nesterov, Project Archer

> (a) GDB tracks separately  threads (libpthread-managed) and LWPs (TIDs).
>     (So the libthread_db TD_DEATH notification is offtopic for this mail.)

Then the scalability issues with thread creation and death will not be
algorithmically improved (only halved at best) by making the automatic
kernel notifications less eager.  GDB will have to stop enabling the
libthread_db event breakpoints for thread creation and death before the
stuff we are discussing really matters.

> (b) For LWPs it does not use PTRACE_O_TRACEEXIT.
>     /* Do not enable PTRACE_O_TRACEEXIT until GDB is more prepared to support
>        read-only process state.  */
>     (I do not know more regarding this comment.)

I don't know from the GDB side what the details are, but I might be able to
offer a little insight here.  At the PTRACE_EVENT_EXIT stop, you have what
looks like any other ptrace stop, where you can fetch and set the registers
and so forth.  But the tracee thread has already started exiting and it
will never return to user mode, no matter what you do.  So it's impossible
to do "jump", an inferior function call, etc.--for changing registers to
have any kind of effect (except on the contents of a core dump initiated by
another thread).


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-19 20:37                               ` Oleg Nesterov
  2011-02-20  8:18                                 ` Jan Kratochvil
@ 2011-02-21 19:54                                 ` Roland McGrath
  2011-02-22 19:39                                   ` Oleg Nesterov
  1 sibling, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-21 19:54 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> However. With or without PTRACE_O_TRACEEXIT, the thread can't go away
> until ptracer does do_wait(WEXITED). And until it does do_wait() tkill()
> succeeds even if the tracee is dead/zombie.

That is true except in the case of MT exec (de_thread).  The other threads
live in the process at the time when another does exec will be immediately
reaped without the chance for the debugger to see its death report.  I
think it's even a race to happen to see the death reports for these.  I'm
not managing to see how the immediate reaping happens in the source just
now, though I remember clearly that this is what happens.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-21 19:54                                 ` Roland McGrath
@ 2011-02-22 19:39                                   ` Oleg Nesterov
  2011-02-22 20:49                                     ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-22 19:39 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/21, Roland McGrath wrote:
>
> > However. With or without PTRACE_O_TRACEEXIT, the thread can't go away
> > until ptracer does do_wait(WEXITED). And until it does do_wait() tkill()
> > succeeds even if the tracee is dead/zombie.
>
> That is true except in the case of MT exec (de_thread).

Ah! Indeed you are right, I forgot about this case again.

> The other threads
> live in the process at the time when another does exec will be immediately
> reaped without the chance for the debugger to see its death report.

Unless I missed something again, other threads are visible to wait().

But, in case de_thread() changes the leader, it does release_task(old_leader).
However, from ptrace pov it looks as if the execing thread goes away, it
changes its pid.

This reminds me. PTRACE_O_TRACEEXIT is not reliable if
arch_ptrace_stop_needed(). This is only sparc/ia64 so far, but we were
going to change ptrace_stop() so that it never stops if the process
was really killed by SIGKILL...

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-22 19:39                                   ` Oleg Nesterov
@ 2011-02-22 20:49                                     ` Roland McGrath
  2011-02-22 21:10                                       ` Oleg Nesterov
  0 siblings, 1 reply; 57+ messages in thread
From: Roland McGrath @ 2011-02-22 20:49 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> Unless I missed something again, other threads are visible to wait().

That's what it looks like in the code to me as well.  But my recollection
is otherwise.  And IMHO it should be otherwise, that is that MT exec cannot
be arbitrarily delayed by any userland act.  As long as the de_thread delay
is an uninterruptible wait not even affected by SIGKILL, it looks like a
DoS sort of vector if userland can prevent the other threads from dying
quickly.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-22 20:49                                     ` Roland McGrath
@ 2011-02-22 21:10                                       ` Oleg Nesterov
  2011-02-22 22:16                                         ` Roland McGrath
  0 siblings, 1 reply; 57+ messages in thread
From: Oleg Nesterov @ 2011-02-22 21:10 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Jan Kratochvil, Project Archer

On 02/22, Roland McGrath wrote:
>
> > Unless I missed something again, other threads are visible to wait().
>
> That's what it looks like in the code to me as well.  But my recollection
> is otherwise.  And IMHO it should be otherwise, that is that MT exec cannot
> be arbitrarily delayed by any userland act.

perhaps, I am not sure right now...  but

> As long as the de_thread delay
> is an uninterruptible wait not even affected by SIGKILL, it looks like a
> DoS sort of vector

Aaahhh. good point, and I never thought about this... this is like vfork().

At first glance it is simple to turn this wait into TASK_KILLABLE, if we
stay with the current behaviour.

Oleg.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: ptrace improvement: PTRACE_O_INHERIT
  2011-02-22 21:10                                       ` Oleg Nesterov
@ 2011-02-22 22:16                                         ` Roland McGrath
  0 siblings, 0 replies; 57+ messages in thread
From: Roland McGrath @ 2011-02-22 22:16 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Project Archer

> Aaahhh. good point, and I never thought about this... this is like vfork().
> 
> At first glance it is simple to turn this wait into TASK_KILLABLE, if we
> stay with the current behaviour.

Before considering changes, we should look into what the actual behavior is
now and was in the past.  I could have sworn that other threads were
automatically reaped before.  Perhaps I was wrong, or perhaps this got
changed at some point.  Or perhaps they really are now and it's happening
by a means that is escaping us when we read the source today.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2011-02-22 22:16 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-03 22:39 ptrace improvement ideas Roland McGrath
2011-02-04 18:56 ` Oleg Nesterov
2011-02-04 18:58   ` Roland McGrath
2011-02-07 21:11 ` Jan Kratochvil
2011-02-08  1:58   ` Roland McGrath
2011-02-08 20:41     ` Jan Kratochvil
2011-02-09  2:48       ` Roland McGrath
2011-02-08 21:07     ` Oleg Nesterov
2011-02-08 23:18       ` hw_breakpoint userland interface Roland McGrath
2011-02-10 21:03         ` Oleg Nesterov
2011-02-10 21:14           ` Roland McGrath
2011-02-11 20:17             ` Oleg Nesterov
2011-02-10 20:00 ` ptrace improvement: PTRACE_O_INHERIT Oleg Nesterov
2011-02-11 19:24   ` Roland McGrath
2011-02-11 20:46     ` Oleg Nesterov
2011-02-12  0:59       ` Roland McGrath
2011-02-12 19:11         ` Oleg Nesterov
2011-02-14 19:31           ` Roland McGrath
2011-02-14 19:46             ` Oleg Nesterov
2011-02-15  0:36               ` Roland McGrath
2011-02-15 13:16                 ` Oleg Nesterov
2011-02-15 21:43                   ` Jan Kratochvil
2011-02-15 21:56                     ` Roland McGrath
2011-02-16 19:42                       ` Oleg Nesterov
2011-02-16 19:45                         ` Roland McGrath
2011-02-16 20:09                           ` Oleg Nesterov
2011-02-16 20:16                             ` Roland McGrath
2011-02-19 19:48                             ` Jan Kratochvil
2011-02-19 20:37                               ` Oleg Nesterov
2011-02-20  8:18                                 ` Jan Kratochvil
2011-02-20 21:05                                   ` Oleg Nesterov
2011-02-21 19:54                                 ` Roland McGrath
2011-02-22 19:39                                   ` Oleg Nesterov
2011-02-22 20:49                                     ` Roland McGrath
2011-02-22 21:10                                       ` Oleg Nesterov
2011-02-22 22:16                                         ` Roland McGrath
2011-02-21 19:44                               ` Roland McGrath
2011-02-15 22:02                     ` Roland McGrath
2011-02-16 16:02                       ` Jan Kratochvil
2011-02-16 18:28                         ` Roland McGrath
2011-02-16 20:00                           ` Oleg Nesterov
2011-02-16 20:07                             ` Roland McGrath
2011-02-16 20:32                               ` Oleg Nesterov
2011-02-16 19:48                       ` Oleg Nesterov
2011-02-16 20:02                         ` Roland McGrath
2011-02-16 20:15                           ` Oleg Nesterov
2011-02-16 20:31                             ` Roland McGrath
2011-02-16 21:04                               ` Oleg Nesterov
2011-02-16 21:51                                 ` Roland McGrath
2011-02-16 19:38                     ` Oleg Nesterov
2011-02-16 19:40                       ` Roland McGrath
2011-02-15 22:17                   ` Roland McGrath
2011-02-16 20:48                     ` Oleg Nesterov
2011-02-16 11:31 ` ptrace improvement ideas (QPassSignals) Jan Kratochvil
2011-02-16 18:36   ` Roland McGrath
2011-02-16 20:21     ` Oleg Nesterov
2011-02-18 20:24       ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).