From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-13426-listarch-gdb=sources.redhat.com@sources.redhat.com>
Received: (qmail 5755 invoked by alias); 16 Apr 2003 14:38:46 -0000
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb/>
List-Post: <mailto:gdb@sources.redhat.com>
List-Help: <mailto:gdb-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-owner@sources.redhat.com
Received: (qmail 5744 invoked from network); 16 Apr 2003 14:38:46 -0000
Received: from unknown (HELO mx1.redhat.com) (66.187.233.31)
  by sources.redhat.com with SMTP; 16 Apr 2003 14:38:46 -0000
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254])
	by mx1.redhat.com (8.11.6/8.11.6) with ESMTP id h3GEckD02767
	for <gdb@sources.redhat.com>; Wed, 16 Apr 2003 10:38:46 -0400
Received: from pobox.corp.redhat.com (pobox.corp.redhat.com [172.16.52.156])
	by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id h3GEckq14870
	for <gdb@sources.redhat.com>; Wed, 16 Apr 2003 10:38:46 -0400
Received: from localhost.redhat.com (romulus-int.sfbay.redhat.com [172.16.27.46])
	by pobox.corp.redhat.com (8.11.6/8.11.6) with ESMTP id h3GEcig22300;
	Wed, 16 Apr 2003 10:38:44 -0400
Received: by localhost.redhat.com (Postfix, from userid 469)
	id 3F6912C43E; Wed, 16 Apr 2003 10:43:12 -0400 (EDT)
From: Elena Zannoni <ezannoni@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16029.27648.37989.683217@localhost.redhat.com>
Date: Wed, 16 Apr 2003 14:38:00 -0000
To: Daniel Jacobowitz <drow@mvista.com>
Cc: Elena Zannoni <ezannoni@redhat.com>, gdb@sources.redhat.com,
   roland@redhat.com
Subject: Re: Linux kernel problem -- food for thoughts
In-Reply-To: <20030416142811.GA9574@nevyn.them.org>
References: <16029.26499.985342.118733@localhost.redhat.com>
	<20030416142811.GA9574@nevyn.them.org>
X-SW-Source: 2003-04/txt/msg00158.txt.bz2

Daniel Jacobowitz writes:
 > On Wed, Apr 16, 2003 at 10:24:03AM -0400, Elena Zannoni wrote:
 > > 
 > > GDB currently has a 'little problem' backtracing out of system
 > > calls on x86 kernels that support NPTL. I think the current public
 > > 2.5 kernel would make this problem show up.
 > > 
 > > Right now, if you are stopped in a system call, the backtrace shows up as:
 > > 
 > >  0xffffe002 in ??
 > 
 > I was just thinking about this.  My reaction is:
 >   - the page needs to be readable; I vaguely remember badgering Linus
 > about this and getting it fixed, but it might have been someone else,
 > or it might not have gotten fixed.
 >   - GDB needs to get the location of the EH information from glibc
 > somehow.  My instinct is to make glibc export this in a global symbol,
 > just like the way we get signal numbers from linuxthreads.
 > 
 > How does that sound?

Roland (but I'll let him speak) has had a thought about creating a
/proc/pid/vsyscall file, which gdb could then read with add-symbol-file...

The page is readable right now in 2.5, and the patch for the .eh_frame
has been integrated.

Core files will also need to be addressed.

 > 
 > 
 > Note that we don't use eh information on i386 yet.  We need to fix
 > that.  I tried once and got distracted by another project, I think :)

Yep, of course.

elena


 > 
 > > 
 > > Here is an explanation of the problem that Roland has provided:
 > > 
 > > ---------------
 > > Previously asm or C code in libc entered the kernel by setting some
 > > registers and using the "int $0x80" instruction.  e.g.
 > > 
 > > 00000000 <__getpid>:
 > >    0:	b8 14 00 00 00       	mov    $0x14,%eax
 > >    5:	cd 80                	int    $0x80
 > >    7:	c3                   	ret    
 > > 
 > > That is the function called __getpid in libc, the pre-NPTL build.  (In the
 > > shared library you will see this if you've run with LD_ASSUME_KERNEL=2.4.1
 > > so that /lib/i686/libc.so.6 is what you are using.)
 > > 
 > > In the new libc (/lib/tls/libc.so.6), that function looks like this:
 > > 
 > > 00000000 <__getpid>:
 > >    0:	b8 14 00 00 00       	mov    $0x14,%eax
 > >    5:	65 ff 15 10 00 00 00 	call   *%gs:0x10
 > >    c:	c3                   	ret    
 > > 
 > > %gs:0x10 is a location that has been initialized to a kernel-supplied
 > > special entry point address.  In the current kernels, that address is
 > > always 0xffffe000.  But that is not part of the ABI, which is why it's
 > > indirect instead of a literal "call 0xffffe000".  The kernel supplies the
 > > actual entry point address to libc at startup time, and nothing in the
 > > kernel-user interface prevents it from using a different address in each
 > > process if it chose to.
 > > 
 > > The reason for this is that there can be multiple ways to enter the kernel,
 > > not just the "int $0x80" trap instruction.  Some kernels on some hardware
 > > may use a different method that performs better.  By using this
 > > kernel-supplied entry point address, no user code has to be changed to
 > > select the method.  It's entirely the kernel's choice.
 > > 
 > > In all the RH kernels we have right now, the entry point page contains:
 > > 
 > > 	0xffffe000:	int $0x80
 > > 	0xffffe002:	ret
 > > 
 > > But user code cannot presume what this code sequence looks like exactly.
 > > It will be some sequence of register and stack moves and special trap
 > > instructions, but you have to disassemble to know exactly.  In the case
 > > above, the PC value seen while a thread is in the kernel is 0xffffe002.
 > > You can disassemble the "ret" there and see that you have to pop the PC off
 > > the stack to recover the caller's frame.  
 > > 
 > > Another example of what this code might look like when you disassemble it is:
 > > 
 > > 	0xffffe000:	push   %ecx
 > > 	0xffffe001:	push   %edx
 > > 	0xffffe002:	push   %ebp
 > > 	0xffffe003:	mov    %esp,%ebp
 > > 	0xffffe005:	sysenter
 > > 	0xffffe007:	nop
 > > 	0xffffe008:	nop
 > > 	0xffffe009:	nop
 > > 	0xffffe00a:	nop
 > > 	0xffffe00b:	nop
 > > 	0xffffe00c:	nop
 > > 	0xffffe00d:	nop
 > > 	0xffffe00e:	jmp    0xffffe003
 > > 	0xffffe010:	pop    %ebp
 > > 	0xffffe011:	pop    %edx
 > > 	0xffffe012:	pop    %ecx
 > > 	0xffffe013:	ret
 > > 
 > > In this example, depending on what happened inside the kernel, the PC you
 > > usually see may be either 0xffffe00e or 0xffffe010.  If the process gets a
 > > signal, or you attach asynchronously, and so forth, the PC might be at any
 > > of the earlier instructions as well.  You cannot rely on exactly what the
 > > sequence is, so you must be able to disassemble from where you are and
 > > cope.  In this case you will most often see 0xffffe010, in which case you
 > > need to pop those three registers and the PC off the stack to restore the
 > > caller's frame.
 > > 
 > > So, these cases are like a leaf function with no debugging info.  The
 > > first solution idea was interpreting the epilogue code.  It will
 > > probably be safe to assume that it looks like epilogue code normally
 > > does, i.e. register pops and not any arbitrary instructions.
 > > 
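To make the "interpret the epilogue" idea concrete, here is a minimal
sketch in C.  It is not gdb code: the function names and the simulated
stack array are hypothetical, and it assumes the epilogue consists only
of single-byte "pop %reg" opcodes (0x58..0x5f) followed by a near "ret"
(0xc3), as in the two sequences quoted above.  Anything else makes it
give up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: starting at code[off], accept only "pop %reg"
   opcodes followed by "ret".  Returns the number of 32-bit words that
   sit on the stack in front of the return address, or -1 if any other
   byte is seen first.  "pop %esp" (0x5c) is rejected because it would
   rewrite the stack pointer itself.  */
static int
count_epilogue_pops (const uint8_t *code, size_t len, size_t off)
{
  int pops = 0;

  assert (code != NULL);
  for (; off < len; off++)
    {
      uint8_t op = code[off];

      if (op == 0xc3)                   /* ret */
        return pops;
      if (op < 0x58 || op > 0x5f || op == 0x5c)
        return -1;                      /* not a plain register pop */
      pops++;
    }
  return -1;                            /* ran off the end without a ret */
}

/* Hypothetical helper: stack[0] is the word that the stopped thread's
   %esp points at.  On success, stores the caller's PC and the value
   %esp will have after the "ret", and returns 0.  */
static int
unwind_epilogue (const uint8_t *code, size_t len, size_t off,
                 const uint32_t *stack, size_t nwords, uint32_t esp,
                 uint32_t *caller_pc, uint32_t *caller_esp)
{
  int pops = count_epilogue_pops (code, len, off);

  if (pops < 0 || (size_t) pops >= nwords)
    return -1;
  *caller_pc = stack[pops];             /* return address follows the pops */
  *caller_esp = esp + 4u * (uint32_t) (pops + 1);
  return 0;
}
```

With the bytes of the sysenter example starting at 0xffffe010 (5d 5a 59 c3),
a PC at offset 0 yields three pops before the return address; with the
"int $0x80; ret" page (cd 80 c3), a PC at offset 2 yields none.  A PC at
0xffffe00e sits on the jmp, which this sketch refuses to interpret, and
that is exactly the weakness of the approach.
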
 > > Another solution I was considering is to have the system somewhere provide
 > > DWARF unwind info matching the possible PC addresses in the vsyscall page.
 > > I am now pretty sure this is the way to go.  The recent development is that
 > > NPTL now needs .eh_frame information for these PCs as well, and Ulrich has
 > > made a kernel change to provide it.  The .eh_frame info for the vsyscall
 > > PCs is on the same read-only kernel page.  The C library now uses this as
 > > if the vsyscall page were a DSO with .eh_frame info to register, so that
 > > exception-style unwinding from any valid PC in a magic entry point works.
 > > 
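For comparison, here is a sketch of the information that unwind info
boils down to for the sysenter-style sequence quoted above: for each PC
range, how far the canonical frame address (CFA) lies above %esp.  This
is a flat illustrative table, not the kernel's actual .eh_frame contents
(real .eh_frame data encodes the same facts as a CIE/FDE with
DW_CFA_advance_loc and DW_CFA_def_cfa_offset operations), and the
addresses are simply the ones from the example, which are not part of
the ABI.  All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row: from start_pc up to the next row, CFA = %esp + cfa_offset.
   The return address is stored at CFA - 4.  */
struct cfa_row
{
  uint32_t start_pc;
  uint32_t cfa_offset;
};

/* Illustrative rows for the example sequence at 0xffffe000.  */
static const struct cfa_row vsyscall_rows[] = {
  { 0xffffe000u,  4 },  /* at entry: only the return address is pushed */
  { 0xffffe001u,  8 },  /* after push %ecx */
  { 0xffffe002u, 12 },  /* after push %edx */
  { 0xffffe003u, 16 },  /* after push %ebp; holds through sysenter/jmp */
  { 0xffffe011u, 12 },  /* after pop %ebp */
  { 0xffffe012u,  8 },  /* after pop %edx */
  { 0xffffe013u,  4 },  /* after pop %ecx, sitting on the ret */
};

/* First address past the "ret" in the example.  */
#define VSYSCALL_END 0xffffe014u

/* Find the CFA offset for PC.  Returns 0 on success, -1 if PC is
   outside the example sequence.  */
static int
lookup_cfa_offset (uint32_t pc, uint32_t *cfa_offset)
{
  size_t i = sizeof vsyscall_rows / sizeof vsyscall_rows[0];

  assert (cfa_offset != NULL);
  if (pc < vsyscall_rows[0].start_pc || pc >= VSYSCALL_END)
    return -1;
  while (i-- > 0)
    if (pc >= vsyscall_rows[i].start_pc)
      {
        *cfa_offset = vsyscall_rows[i].cfa_offset;
        return 0;
      }
  return -1;
}
```

With such a table the unwinder needs no disassembly: at either
0xffffe00e or 0xffffe010 the offset is 16, so the caller's PC is read
from %esp + 12 and the caller's %esp is %esp + 16.
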
 > > So, there is a .eh_frame section available for this code, and getting it
 > > from where it is into gdb can be done by hook or by crook.  I have the
 > > impression that gdb turning an available .eh_frame section into happy
 > > backtraces is something that might be expected real soon now.  
 > > Sounds like a winner.
 > > 
 > > I think that elucidates all but the dreariest bits of the technical issues.
 > > Now the practical questions.  Oh, one dreary bit: 83172 mostly talks about
 > > the fact that ptrace refuses to read the 0xffffe000 page for you, which is
 > > presumed a prerequisite for dealing with the real can of worms (unwinding).
 > > 
 > > --------------------
 > > 
 > > 
 > > I think right now the public 2.5 kernel has a fix to make the page
 > > readable, and another one to provide the .eh_frame information. There
 > > is no mechanism yet to make that debug info accessible to gdb.
 > > 
 > > 
 > > elena
 > > 
 > 
 > -- 
 > Daniel Jacobowitz
 > MontaVista Software                         Debian GNU/Linux Developer