public inbox for gdb@sourceware.org
* Re: backtrace through 'sleep', (1255 and 1253)
@ 2003-08-04 16:35 Michael Elizabeth Chastain
From: Michael Elizabeth Chastain @ 2003-08-04 16:35 UTC
  To: ezannoni; +Cc: drow, gdb

eza> yes. How did the prologue analyzer change between 5.3 and now?

The prologue analyzer got refactored, but it looks like basically
the same code.  There's nothing that understands 'xor %ecx, %ecx'
in the 5.3 code.

In 5.3, i386_frame_chain looks for a frameless function.  It has some
simple tests and it doesn't call the prologue analyzer.  If it can't
decide whether the function is frameless or not, then i386_frame_chain
assumes that it is framed.  That works for 'sleep'.

In gdb HEAD, i386_frame_chain directly calls the prologue analyzer.
If the prologue analyzer can't handle it, then i386_frame_chain assumes
that it is frameless (the code that I quoted).

From a user point of view, the workaround for today would be:
use the debugging version of glibc, or use the static version of glibc.
I really don't want to ship gdb 6.0 like that but if we don't fix gdb
then I'll write something like that for PROBLEMS.

I looked at the 'abstract interpretation' code in s390-tdep.c.
I think this is the right solution for gdb HEAD.  For gdb 6.0, I think
that shoving 'xor %reg, %reg' into the existing prologue reader is
probably good enough to get us through the release.  I will make a patch
for this later this week if no one beats me to it.
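
As a rough illustration of the stopgap, here is a self-contained C
sketch of a prologue scan that steps over 'xor %reg, %reg' between
'push %ebp' and 'mov %esp, %ebp'.  It is not the actual patch and does
not use gdb's internal API; scan_prologue, is_xor_same_reg and struct
prologue_info are hypothetical names, and the scan works on a plain
byte buffer rather than on target memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only.  */
struct prologue_info
{
  int has_frame;      /* 1 if 'push %ebp ... mov %esp, %ebp' was seen.  */
  size_t frame_end;   /* Offset just past 'mov %esp, %ebp' if has_frame.  */
};

/* Return 1 if the two bytes at P encode a 32-bit 'xor %reg, %reg' with
   the same register as both operands (opcode 0x31 or 0x33, ModRM with
   mod == 11).  %esp is rejected, since zeroing it would change what
   the following 'mov %esp, %ebp' copies.  */
static int
is_xor_same_reg (const uint8_t *p)
{
  unsigned reg, rm;

  if (p[0] != 0x31 && p[0] != 0x33)
    return 0;
  if ((p[1] & 0xc0) != 0xc0)
    return 0;
  reg = (p[1] >> 3) & 7;
  rm = p[1] & 7;
  return reg == rm && reg != 4;
}

struct prologue_info
scan_prologue (const uint8_t *code, size_t len)
{
  struct prologue_info info = { 0, 0 };
  size_t pos = 0;

  assert (code != NULL);

  if (len < 1 || code[pos] != 0x55)   /* push %ebp */
    return info;
  pos++;

  /* Skip register-zeroing instructions scheduled into the prologue.  */
  while (pos + 2 <= len && is_xor_same_reg (code + pos))
    pos += 2;

  /* 'mov %esp, %ebp' has two encodings: 89 e5 and 8b ec.  */
  if (pos + 2 <= len
      && ((code[pos] == 0x89 && code[pos + 1] == 0xe5)
          || (code[pos] == 0x8b && code[pos + 1] == 0xec)))
    {
      info.has_frame = 1;
      info.frame_end = pos + 2;
    }
  return info;
}
```

Fed the first bytes of the 'sleep' prologue (55 31 c9 89 e5 ...), this
reports a frame.  It only handles the one pattern, so any other
instruction scheduled into that slot would still defeat it, which is
why the abstract-interpretation approach looks like the better
long-term answer.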

The sinking feeling I have is that this problem will affect a lot
of architectures, not just i386.

Michael C


* Re: backtrace through 'sleep', (1255 and 1253)
  2003-08-04 15:33 ` Elena Zannoni
@ 2003-08-04 15:36   ` Daniel Jacobowitz
From: Daniel Jacobowitz @ 2003-08-04 15:36 UTC
  To: Elena Zannoni; +Cc: Michael Elizabeth Chastain, gdb

On Mon, Aug 04, 2003 at 11:40:47AM -0400, Elena Zannoni wrote:
> Unlikely to happen, I am afraid :-(
> 
>  > . Ask the gcc guys directly to not schedule any instructions between
>  >   'push %ebp' and 'mov %esp, %ebp'.
>  > 
> 
> more likely.

Not very, I think.

>  > . Change gdb so that the prologue reader is more powerful.  It doesn't
>  >   take much to get through the 'xor %ecx, %ecx' instruction.  The
>  >   trouble is that there could be a billion different instructions
>  >   in there ('mov any-register, immediate').  The advantage is that
>  >   this would work without any changes to external software.
>  > 
> 
> yes. How did the prologue analyzer change between 5.3 and now?

Mark dramatically enhanced it at the same time he added dwarf2 CFI
unwinding.  Unfortunately, the higher accuracy is causing it to assume
some functions are frameless which aren't.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: backtrace through 'sleep', (1255 and 1253)
  2003-08-02 15:18 Michael Elizabeth Chastain
@ 2003-08-04 15:33 ` Elena Zannoni
  2003-08-04 15:36   ` Daniel Jacobowitz
From: Elena Zannoni @ 2003-08-04 15:33 UTC
  To: Michael Elizabeth Chastain; +Cc: gdb

Michael Elizabeth Chastain writes:
 > Here's what I've learned so far.
 > 
 > This is the code for 'sleep' in /lib/i686/libc.so.6:
 > 
 >   push %ebp
 >   xor  %ecx, %ecx
 >   mov  %esp, %ebp
 >   push %edi
 >   xor  %edx, %edx
 >   ...
 >   call __i686.get_pc_thunk.bx
 >   add  $0x7bfab, %ebx
 >   sub  $0x1cc, %esp
 >   ...
 > 
 > This is on a Red Hat Linux 8 system, native i686-pc-linux-gnu.
 > 
 > This is C code, not hand-coded assembler!  The "xor" instructions have been
 > mixed into the prologue.  They are just setting some variables to zero.
 > The call to __i686.get_pc_thunk.bx comes from gcc -fpic.
 > 
 > Here is the code in i386_frame_cache:
 > 
 >   frame_unwind_register (next_frame, I386_EBP_REGNUM, buf);
 >   cache->base = extract_unsigned_integer (buf, 4);
 >   if (cache->base == 0)
 >     return cache;
 > 
 >   cache->save_regs[I386_EIP_REGNUM] = 4;
 > 
 >   cache->pc = frame_func_unwind (next_frame);
 >   if (cache->pc != 0)
 >     i386_analyze_prologue (cache->pc, frame_pc_unwind (next_frame), cache);
 > 
 >   if (cache->locals < 0)
 >     {
 >       /* We didn't find a valid frame, which means that CACHE->base
 >          currently holds the frame pointer for our calling frame.  If
 >          we're at the start of a function, or somewhere half-way its
 >          prologue, the function's frame probably hasn't been fully
 >          setup yet.  Try to reconstruct the base address for the stack
 >          frame by looking at the stack pointer.  For truly "frameless"
 >          functions this might work too.  */
 > 
 >       frame_unwind_register (next_frame, I386_ESP_REGNUM, buf);
 >       cache->base = extract_unsigned_integer (buf, 4) + cache->sp_offset;
 >     }
 > 
 > The etiology is:
 > 
 >   The prologue analyzer fails on this function because of the 
 >   'xor %ecx, %ecx'.
 > 
 >   So cache->locals == -1.
 > 
 >   /* We didn't find a valid frame ... */
 > 
 >   So the code behaves like it's in a frameless function.  It grabs
 >   the stack pointer and adds an offset to it and uses that for a frame.
 > 
 > Whereas, in reality, the pc is in the middle of 'sleep' (well past the
 > prologue), and there is a perfectly good frame.  In fact if I undo the
 > bogus re-assignment to cache->base in this case then the stack trace
 > works fine.
 > 
 > Now, what to do about it ...
 > 
 > Red Hat Linux 8 has an rpm for a debug version of glibc.  The
 > glibc-debug rpm installs libraries in /usr/lib/debug, rather than
 > overwriting /lib/i686.  I installed glibc-debug and set LD_LIBRARY_PATH
 > to /usr/lib/debug, and it worked!  The test cases in gdb/1253 and
 > gdb/1255 both backtraced just fine!

FWIW, in general the Red Hat debug rpms contain only debug info, and a
section in the /lib/i686 libraries contains a pointer to them.  You
shouldn't need to set LD_LIBRARY_PATH at all; just do a 'set
debug-file-directory /usr/lib/debug' and gdb should be able to
integrate the two together.  For glibc, though, they provide three
flavors of rpms: one without debug info (glibc); one which includes the
debug info and the rest (glibc-debug), which is the one you installed;
and one which includes only the debug info (glibc-debuginfo), for which
you can do what I described.  The glibc-debug stuff gets installed in
/usr/lib/debug/.  The glibc-debuginfo stuff gets installed in
/usr/lib/debug/lib/.


 > 
 > Also, static-linking with glibc works, because the static version
 > of 'sleep' has different code (no -fpic) with a prologue that gdb
 > can digest.
 > 
 > So we can either:
 > 
 > . Document the problem and tell people to use a debugging glibc or
 >   static-link their program.  Also send a message to vendors that they may
 >   want to make the debugging glibc the default glibc.  Vendors may even
 >   want to patch their gcc to not mix other instructions into the prologue,
 >   because gdb is a lot more sensitive to un-analyzable prologues now.
 > 

Unlikely to happen, I am afraid :-(

 > . Ask the gcc guys directly to not schedule any instructions between
 >   'push %ebp' and 'mov %esp, %ebp'.
 > 

more likely.

 > . Change gdb so that the prologue reader is more powerful.  It doesn't
 >   take much to get through the 'xor %ecx, %ecx' instruction.  The
 >   trouble is that there could be a billion different instructions
 >   in there ('mov any-register, immediate').  The advantage is that
 >   this would work without any changes to external software.
 > 

yes. How did the prologue analyzer change between 5.3 and now?

elena


 > . Do nothing, let the users suffer.
 > 
 > . Something else?
 > 
 > Michael C


* backtrace through 'sleep', (1255 and 1253)
@ 2003-08-02 15:18 Michael Elizabeth Chastain
  2003-08-04 15:33 ` Elena Zannoni
From: Michael Elizabeth Chastain @ 2003-08-02 15:18 UTC
  To: gdb

Here's what I've learned so far.

This is the code for 'sleep' in /lib/i686/libc.so.6:

  push %ebp
  xor  %ecx, %ecx
  mov  %esp, %ebp
  push %edi
  xor  %edx, %edx
  ...
  call __i686.get_pc_thunk.bx
  add  $0x7bfab, %ebx
  sub  $0x1cc, %esp
  ...

This is on a Red Hat Linux 8 system, native i686-pc-linux-gnu.

This is C code, not hand-coded assembler!  The "xor" instructions have been
mixed into the prologue.  They are just setting some variables to zero.
The call to __i686.get_pc_thunk.bx comes from gcc -fpic.

Here is the code in i386_frame_cache:

  frame_unwind_register (next_frame, I386_EBP_REGNUM, buf);
  cache->base = extract_unsigned_integer (buf, 4);
  if (cache->base == 0)
    return cache;

  cache->save_regs[I386_EIP_REGNUM] = 4;

  cache->pc = frame_func_unwind (next_frame);
  if (cache->pc != 0)
    i386_analyze_prologue (cache->pc, frame_pc_unwind (next_frame), cache);

  if (cache->locals < 0)
    {
      /* We didn't find a valid frame, which means that CACHE->base
         currently holds the frame pointer for our calling frame.  If
         we're at the start of a function, or somewhere half-way its
         prologue, the function's frame probably hasn't been fully
         setup yet.  Try to reconstruct the base address for the stack
         frame by looking at the stack pointer.  For truly "frameless"
         functions this might work too.  */

      frame_unwind_register (next_frame, I386_ESP_REGNUM, buf);
      cache->base = extract_unsigned_integer (buf, 4) + cache->sp_offset;
    }

The etiology is:

  The prologue analyzer fails on this function because of the 
  'xor %ecx, %ecx'.

  So cache->locals == -1.

  /* We didn't find a valid frame ... */

  So the code behaves like it's in a frameless function.  It grabs
  the stack pointer and adds an offset to it and uses that for a frame.

Whereas, in reality, the pc is in the middle of 'sleep' (well past the
prologue), and there is a perfectly good frame.  In fact if I undo the
bogus re-assignment to cache->base in this case then the stack trace
works fine.

Now, what to do about it ...

Red Hat Linux 8 has an rpm for a debug version of glibc.  The
glibc-debug rpm installs libraries in /usr/lib/debug, rather than
overwriting /lib/i686.  I installed glibc-debug and set LD_LIBRARY_PATH
to /usr/lib/debug, and it worked!  The test cases in gdb/1253 and
gdb/1255 both backtraced just fine!

Also, static-linking with glibc works, because the static version
of 'sleep' has different code (no -fpic) with a prologue that gdb
can digest.

So we can either:

. Document the problem and tell people to use a debugging glibc or
  static-link their program.  Also send a message to vendors that they may
  want to make the debugging glibc the default glibc.  Vendors may even
  want to patch their gcc to not mix other instructions into the prologue,
  because gdb is a lot more sensitive to un-analyzable prologues now.

. Ask the gcc guys directly to not schedule any instructions between
  'push %ebp' and 'mov %esp, %ebp'.

. Change gdb so that the prologue reader is more powerful.  It doesn't
  take much to get through the 'xor %ecx, %ecx' instruction.  The
  trouble is that there could be a billion different instructions
  in there ('mov any-register, immediate').  The advantage is that
  this would work without any changes to external software.

. Do nothing, let the users suffer.

. Something else?

Michael C

