From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <archer-return-1312-listarch-archer=sourceware.org@sourceware.org>
Received: (qmail 20871 invoked by alias); 10 May 2009 14:59:05 -0000
Mailing-List: contact archer-help@sourceware.org; run by ezmlm
Sender: <archer@sourceware.org>
Precedence: bulk
List-Post: <mailto:archer@sourceware.org>
List-Help: <mailto:archer-help@sourceware.org>
List-Subscribe: <mailto:archer-subscribe@sourceware.org>
List-Id: <archer.sourceware.org>
Received: (qmail 20818 invoked by uid 22791); 10 May 2009 14:59:04 -0000
X-SWARE-Spam-Status: No, hits=-2.1 required=5.0
	tests=AWL,BAYES_00,J_CHICKENPOX_92,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Date: Sun, 10 May 2009 14:59:00 -0000
From: Jan Kratochvil <jan.kratochvil@redhat.com>
To: Pedro Alves <pedro@codesourcery.com>
Cc: archer@sourceware.org, Chris Moller <cmoller@redhat.com>
Subject: Re: Proof-of-concept on fd-connected linux-nat.c server
Message-ID: <20090510145549.GA4932@host0.dyn.jankratochvil.net>
References: <20090509151556.GA17252@host0.dyn.jankratochvil.net> <200905101125.24435.pedro@codesourcery.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200905101125.24435.pedro@codesourcery.com>
User-Agent: Mutt/1.5.19 (2009-01-05)
X-SW-Source: 2009-q2/txt/msg00083.txt.bz2

On Sun, 10 May 2009 12:25:24 +0200, Pedro Alves wrote:
> You're reinventing a remote protocol, and, at the wrong layer, IMO.

I cannot say much about the gdbserver layer placement, but thanks for the
confirmation about gdbserver.


> The single most important time sensitive operation when debugging is
> single-stepping speed, and that's mostly dominated by roundtrips.
> 
> > What do you think about implementing gdbserver.ko? 
> 
> What would this be solving?

The ptrace interface is wrong:  Its binding to waitpid()/SIGCHLD/SIGSTOP is
a nightmare that conflicts with regular uses of waitpid/SIGCHLD/SIGSTOP.  Also
its TID vs. PID handling is not right; just attaching to a whole process
takes many screens of code in GDB.
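
To illustrate the TID vs. PID point, here is a rough one-pass sketch of what
attaching to a whole process means with plain ptrace.  The helper names
list_tids() and attach_all_threads() are made up for this mail, they are not
GDB functions.  A real debugger also has to rescan /proc/PID/task until no new
TIDs show up, which this sketch skips.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: collect the TIDs currently listed in
   /proc/PID/task.  Returns the number stored in TIDS (at most MAX),
   or -1 if the directory cannot be opened.  */
int
list_tids (pid_t pid, pid_t *tids, int max)
{
  char path[64];
  DIR *dir;
  struct dirent *ent;
  int n = 0;

  snprintf (path, sizeof path, "/proc/%d/task", (int) pid);
  dir = opendir (path);
  if (dir == NULL)
    return -1;

  while (n < max && (ent = readdir (dir)) != NULL)
    {
      char *end;
      long tid = strtol (ent->d_name, &end, 10);

      /* Only purely numeric names are TIDs; this skips "." and "..".  */
      if (end != ent->d_name && *end == '\0')
        tids[n++] = (pid_t) tid;
    }
  closedir (dir);
  return n;
}

/* Hypothetical helper: attach to each thread of PID, one TID at a time.
   Returns the number of threads attached and stopped, or -1 if the
   task list cannot be read.  Single pass only (an assumption of this
   sketch): threads created meanwhile are missed.  */
int
attach_all_threads (pid_t pid)
{
  pid_t tids[256];
  int n = list_tids (pid, tids, 256);
  int attached = 0;

  if (n < 0)
    return -1;

  for (int i = 0; i < n; i++)
    {
      int status;

      if (ptrace (PTRACE_ATTACH, tids[i], NULL, NULL) == -1)
        continue;               /* The thread may have exited meanwhile.  */

      /* __WALL is needed to wait for non-leader (clone) tasks.  */
      if (waitpid (tids[i], &status, __WALL) == tids[i]
          && WIFSTOPPED (status))
        attached++;
    }
  return attached;
}
```

Even this ignores new threads, already-stopped threads and the SIGSTOP that
PTRACE_ATTACH injects, each of which needs more code on top.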

One benefit of dropping ptrace() would be a faster "step".  While
PTRACE_SINGLEBLOCK (== DEBUGCTLMSR_BTF) exists now, the kernel could still
perform "step"/"next" faster without the userland/kernelland switching, even
for an inferior code loop on a single line of code (which we want to step
over).

PTRACE_SYSCALL is too general; it should be possible to stop, for example,
only when the inferior calls a syscall related to the specified file
descriptor(s).

PTRACE_O_TRACE{FORK,VFORK,CLONE} works, but it is a nightmare with new child
tasks appearing before the original call reports the new task TID.
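
One way to cope with that ordering is to remember stops for TIDs nobody has
announced yet and to match them up once the fork/clone event names the child.
A minimal sketch of just that bookkeeping follows; struct early_stop and both
function names are hypothetical, and the waitpid()/PTRACE_GETEVENTMSG calls
around it are left out.

```c
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical record of a stop that waitpid() reported for a TID
   which no fork/vfork/clone event has announced yet.  */
struct early_stop
{
  pid_t tid;
  int status;
  struct early_stop *next;
};

/* Remember an early stop.  Returns 0 on success, -1 if out of memory.  */
int
remember_early_stop (struct early_stop **list, pid_t tid, int status)
{
  struct early_stop *e = malloc (sizeof *e);

  if (e == NULL)
    return -1;
  e->tid = tid;
  e->status = status;
  e->next = *list;
  *list = e;
  return 0;
}

/* Called once the parent's event finally names NEW_TID.  If its stop
   was already seen, remove the record, store the wait status in
   *STATUS and return 1.  Otherwise return 0, meaning the caller still
   has to wait for the child itself.  */
int
take_early_stop (struct early_stop **list, pid_t new_tid, int *status)
{
  for (struct early_stop **p = list; *p != NULL; p = &(*p)->next)
    if ((*p)->tid == new_tid)
      {
        struct early_stop *e = *p;

        *status = e->status;
        *p = e->next;
        free (e);
        return 1;
      }
  return 0;
}
```

The point is only that the debugger has to carry such state at all; a saner
kernel interface would report the new task in a defined order.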

Also the kernel (already?) contains debugging infrastructure for systemtap,
such as for placing/stepping over breakpoints, which could be reused by GDB if
the corresponding GDB parts get disabled on the enhanced kernels.

(As I do not work on this project I may not know it all.)


> In all seriousness, I think that you're going the wrong direction
> entirely.

Just to make it clear - my patch was just a "proof-of-concept", nothing
I would ever want to get accepted anywhere.

Still it is not clear to me whether gdbserver.ko is the right way to get rid
of the ptrace/waitpid/SIGCHLD/SIGSTOP pain.


> I really suggest you get acquainted with the remote
> protocol and gdbserver, before coming up with a new solution.

May I ask why you put effort into both linux-nat.c and gdbserver?  Isn't it
cheaper to unify the codebase and start (transparently) using gdbserver even
for local operations?


> > * Removing local queue (waitpid_queue) would be IMHO good even for current FSF
> >   GDB HEAD - 
> 
> It's going to happen:

> http://sourceware.org/ml/gdb-patches/2009-04/msg00523.html
> pselect/ppoll would be another fancy way, but that is unuseable, as you can
> read on the BUGS section of the pselect man page.

Since ppoll() is a way to remove the local pipe entirely, what about
a workaround for non-ppoll systems using O_ASYNC/SIGIO/sigsuspend()?
Or do systems exist that have neither ppoll() nor O_ASYNC?
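
A rough sketch of the workaround I have in mind, written against Linux only
and not checked on any non-ppoll system: with O_ASYNC the fd readiness becomes
a SIGIO, so one sigsuspend() can wait for both SIGCHLD and input without the
local pipe.  The names enable_sigio() and wait_demo() are made up, and the
socketpair plus the forked child are just stand-ins for a real event fd and
a real inferior.

```c
#define _GNU_SOURCE             /* For O_ASYNC on glibc.  */
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigchld, got_sigio;

static void on_sigchld (int sig) { (void) sig; got_sigchld = 1; }
static void on_sigio (int sig) { (void) sig; got_sigio = 1; }

/* Hypothetical helper: have the kernel send SIGIO to this process
   whenever FD becomes readable.  Returns -1 on error.  */
int
enable_sigio (int fd)
{
  int flags;

  if (fcntl (fd, F_SETOWN, getpid ()) == -1)
    return -1;
  flags = fcntl (fd, F_GETFL);
  if (flags == -1)
    return -1;
  return fcntl (fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK);
}

/* Hypothetical demo.  Returns a bitmask: 1 = child exit seen,
   2 = fd input seen; -1 on a setup error.  */
int
wait_demo (void)
{
  int sv[2], seen = 0;
  sigset_t block, old, wait_mask;
  struct sigaction sa;
  pid_t child;

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) == -1
      || enable_sigio (sv[0]) == -1)
    return -1;

  memset (&sa, 0, sizeof sa);
  sigemptyset (&sa.sa_mask);
  sa.sa_handler = on_sigchld;
  sigaction (SIGCHLD, &sa, NULL);
  sa.sa_handler = on_sigio;
  sigaction (SIGIO, &sa, NULL);

  /* Keep both signals blocked so that they can be delivered only
     while we sit inside sigsuspend().  */
  sigemptyset (&block);
  sigaddset (&block, SIGCHLD);
  sigaddset (&block, SIGIO);
  sigprocmask (SIG_BLOCK, &block, &old);
  wait_mask = old;
  sigdelset (&wait_mask, SIGCHLD);
  sigdelset (&wait_mask, SIGIO);

  child = fork ();
  if (child == -1)
    return -1;
  if (child == 0)
    {
      /* Stand-in inferior: produce one byte of input, then exit.  */
      _exit (write (sv[1], "x", 1) == 1 ? 0 : 1);
    }

  while (seen != 3)
    {
      /* Atomically unblock SIGCHLD/SIGIO and sleep until one arrives.  */
      sigsuspend (&wait_mask);

      if (got_sigchld)
        {
          got_sigchld = 0;
          if (waitpid (child, NULL, WNOHANG) == child)
            seen |= 1;
        }
      if (got_sigio)
        {
          char c;

          got_sigio = 0;
          if (read (sv[0], &c, 1) == 1)
            seen |= 2;
        }
    }

  close (sv[0]);
  close (sv[1]);
  sigprocmask (SIG_SETMASK, &old, NULL);
  return seen;
}
```

Because the flags can change only inside sigsuspend(), there is no window
between checking them and going to sleep, which is the same race ppoll() is
meant to close.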


Regards,
Jan