From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 10167 invoked by alias); 28 Mar 2006 15:20:29 -0000
Received: (qmail 10158 invoked by uid 22791); 28 Mar 2006 15:20:28 -0000
X-Spam-Check-By: sourceware.org
Received: from balabit.hu (HELO balabit.hu) (195.70.34.196)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 28 Mar 2006 15:20:27 +0000
Subject: Re: thread register state information invalid in core files
From: Balazs Scheidler
To: Daniel Jacobowitz
Cc: gdb@sourceware.org
In-Reply-To: <20060328143647.GB30581@nevyn.them.org>
References: <1143542626.8742.12.camel@bzorp.balabit> <20060328143647.GB30581@nevyn.them.org>
Content-Type: text/plain
Date: Tue, 28 Mar 2006 21:18:00 -0000
Message-Id: <1143559222.16757.11.camel@bzorp.balabit>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: gdb-owner@sourceware.org
X-SW-Source: 2006-03/txt/msg00187.txt.bz2

On Tue, 2006-03-28 at 09:36 -0500, Daniel Jacobowitz wrote:
> On Tue, Mar 28, 2006 at 12:43:45PM +0200, Balazs Scheidler wrote:
> > Anything else:
> > (gdb) thread 2
> > [Switching to thread 2 (process 26119)]#0  0x00010202 in ?? ()
> > (gdb) bt
> > #0  0x00010202 in ?? ()
> > Cannot access memory at address 0x0
> > (gdb) info registers
> > eax            0xc010007b       -1072693125
> > ecx            0x243948         2373960
> > edx            0x0              0
> > ebx            0x1f8            504
> > esp            0x0              0x0
> > ebp            0x7b             0x7b
> > esi            0x409272c        67708716
> > edi            0x243900         2373888
> > eip            0x10202          0x10202
> > eflags         0x7b             123
> > cs             0x26f4           9972
> > ss             0x0              0
> > ds             0xffff           65535
> > es             0x3965           14693
> > fs             0x0              0
> > gs             0x33             51
> >
> > Looking at the value of ESP and EBP it is possible that gdb incorrectly
> > reads the stack-frame information.
>
> It looks to me like the core file is just corrupt.
>
> These registers are in the pseudo-sections you saw in objdump, in the
> order the header files describe for an elf_gregset_t.
> You may want to check the core file by hand; you can dump the sections
> using objdump -s -j "sectionname".
>
> I remember having various problems with threaded core dumps in recent
> kernels.

This is the content of .reg/31158 (same as .reg):

Contents of section .reg/31158:
 0000 68ee1008 05000000 bbb70000 00000000  h...............
 0010 402f2400 28f7ffbf fcffffff 7b0010c0  @/$.(.......{...
 0020 7b000000 00000000 33000000 a8000000  {.......3.......
 0030 23051e00 73000000 46020000 1cf7ffbf  #...s...F.......
 0040 7b000000                             {...

and .reg2/31158 (same as .reg2):

Contents of section .reg2/31158:
 0000 7f032000 0000c901 c8c41500 73000000  .. .........s...
 0010 9ce2ffbf 7b000000 801f0000 bd6f0200  ....{........o..
 0020 00000000 ffffffff 01000000 0000ffff  ................
 0030 af3fffff f5130000 ffff818a feffffff  .?..............
 0040 0100ffff 00000000 000000e0 00400080  .............@..
 0050 4a14f145 51882440 e0da89ea 3a9d5188  J..EQ.$@....:.Q.
 0060 1d4000d8 89ea3a9d 51881d40           .@....:.Q..@

If I understand your hint correctly, the registers should be read as
follows:

#define ELF_CORE_COPY_REGS(pr_reg, regs)        \
        pr_reg[0] = regs->ebx;                  \
        pr_reg[1] = regs->ecx;                  \
        pr_reg[2] = regs->edx;                  \
        pr_reg[3] = regs->esi;                  \
        pr_reg[4] = regs->edi;                  \
        pr_reg[5] = regs->ebp;                  \
        pr_reg[6] = regs->eax;                  \
        pr_reg[7] = regs->xds;                  \
        pr_reg[8] = regs->xes;                  \
        savesegment(fs,pr_reg[9]);              \
        savesegment(gs,pr_reg[10]);             \
        pr_reg[11] = regs->orig_eax;            \
        pr_reg[12] = regs->eip;                 \
        pr_reg[13] = regs->xcs;                 \
        pr_reg[14] = regs->eflags;              \
        pr_reg[15] = regs->esp;                 \
        pr_reg[16] = regs->xss;

This does seem to be the case ("info registers" output from gdb):

eax            0xfffffffc       -4
ecx            0x5              5
edx            0xb7bb           47035
ebx            0x810ee68        135327336
esp            0xbffff71c       0xbffff71c
ebp            0xbffff728       0xbffff728
esi            0x0              0
edi            0x242f40         2371392
eip            0x1e0523         0x1e0523
eflags         0x246            582
cs             0x73             115
ss             0x7b             123
ds             0xc010007b       -1072693125
es             0x7b             123
fs             0x0              0
gs             0x33             51

However the values are bogus.
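For anyone repeating this check by hand, the mapping from the objdump bytes to register names can be sketched in C roughly as follows. This is only an illustration, assuming a little-endian i386 core file and the slot order of the ELF_CORE_COPY_REGS macro quoted above. The names I386_NGREG, greg_names, reg_31158, read_le32, greg_index and print_gregs are made up for this sketch; they are not kernel or gdb identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Number of 32-bit slots in the i386 general register set (assumed: 17,
 * matching pr_reg[0]..pr_reg[16] in the macro quoted above). */
#define I386_NGREG 17

/* Slot names in ELF_CORE_COPY_REGS order (xds/xes/xcs/xss shown as
 * ds/es/cs/ss, the way gdb prints them). */
const char *const greg_names[I386_NGREG] = {
    "ebx", "ecx", "edx", "esi", "edi", "ebp", "eax", "ds", "es",
    "fs", "gs", "orig_eax", "eip", "cs", "eflags", "esp", "ss"
};

/* The 68 bytes of .reg/31158, copied from the objdump -s output above. */
const unsigned char reg_31158[I386_NGREG * 4] = {
    0x68, 0xee, 0x10, 0x08,  0x05, 0x00, 0x00, 0x00,
    0xbb, 0xb7, 0x00, 0x00,  0x00, 0x00, 0x00, 0x00,
    0x40, 0x2f, 0x24, 0x00,  0x28, 0xf7, 0xff, 0xbf,
    0xfc, 0xff, 0xff, 0xff,  0x7b, 0x00, 0x10, 0xc0,
    0x7b, 0x00, 0x00, 0x00,  0x00, 0x00, 0x00, 0x00,
    0x33, 0x00, 0x00, 0x00,  0xa8, 0x00, 0x00, 0x00,
    0x23, 0x05, 0x1e, 0x00,  0x73, 0x00, 0x00, 0x00,
    0x46, 0x02, 0x00, 0x00,  0x1c, 0xf7, 0xff, 0xbf,
    0x7b, 0x00, 0x00, 0x00
};

/* Assemble one little-endian 32-bit value from four raw bytes. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Return the slot index for a register name, or -1 if it is unknown. */
int greg_index(const char *name)
{
    for (int i = 0; i < I386_NGREG; i++)
        if (strcmp(greg_names[i], name) == 0)
            return i;
    return -1;
}

/* Print every slot of a raw .reg section as "name = 0x........". */
void print_gregs(const unsigned char *buf, size_t len)
{
    assert(len >= (size_t)I386_NGREG * 4);  /* need all 17 slots */
    for (int i = 0; i < I386_NGREG; i++)
        printf("%-8s = 0x%08lx\n", greg_names[i],
               (unsigned long)read_le32(buf + 4 * i));
}
```

Calling print_gregs(reg_31158, sizeof reg_31158) from a small main() should reproduce the gdb values listed above, for example ebx = 0x0810ee68 and ebp = 0xbffff728. That would indicate gdb is decoding the section as written and the bad context is already in the core file.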
The valid ebp value for the crashing thread is 0x0409272c.

So it seems to be a kernel bug. Any hints on where this was fixed, or
whether it was fixed at all?

>
> > The funny part that the segfault
> > itself occurred in the PID number 31158 (not the main thread for sure),
> > but gdb lists pid 31158 as the main thread with the main thread's stack.
>
> The kernel always dumps the faulting thread first.

Sure, but it has the context of the main thread.

-- 
Bazsi