public inbox for gdb@sourceware.org
* Strange stack trace on Windows
@ 2007-09-29 22:01 Gordon Prieur
  2007-09-30  3:00 ` Daniel Jacobowitz
  2007-09-30 18:13 ` Eli Zaretskii
  0 siblings, 2 replies; 16+ messages in thread
From: Gordon Prieur @ 2007-09-29 22:01 UTC (permalink / raw)
  To: gdb

Hi,

    When I interrupt the debuggee on Windows I almost never get a stack trace
with the debuggee information in it. I get similar traces with both MinGW and
Cygwin gdb commands:

> 115where
> 115&"where\n"
> 115~"#0  0x7c90eb94 in ntdll!LdrAccessResource ()\n"
> 115~"   from C:\\WINDOWS\\system32\\ntdll.dll\n"
> 115~"#1  0x7c90e3ed in ntdll!ZwRequestWaitReplyPort ()\n"
> 115~"   from C:\\WINDOWS\\system32\\ntdll.dll\n"
> 115~"#2  0x7c9132f8 in ntdll!CsrProbeForWrite () from C:\\WINDOWS\\system32\\ntdll.dll\n"
> 115~"#3  0x00003fec in ?? ()\n"
> 115~"#4  0x0022fa70 in ?? ()\n"
> 115~"#5  0x0022fa70 in ?? ()\n"
> 115~"#6  0x00000000 in ?? ()\n"
> 115^done

    It's not that I'm looking at the wrong thread; I've checked all threads
(and set the current thread to the user thread). Can anybody explain this and
tell me how to get to user data?

Thanks,
Gordon


* Re: Strange stack trace on Windows
  2007-09-29 22:01 Strange stack trace on Windows Gordon Prieur
@ 2007-09-30  3:00 ` Daniel Jacobowitz
  2007-09-30 18:13 ` Eli Zaretskii
  1 sibling, 0 replies; 16+ messages in thread
From: Daniel Jacobowitz @ 2007-09-30  3:00 UTC (permalink / raw)
  To: Gordon Prieur; +Cc: gdb

On Sat, Sep 29, 2007 at 02:45:02PM -0700, Gordon Prieur wrote:
> Hi,
> 
>    When I interrupt the debuggee on Windows I almost never get a stack trace
> with the debuggee information in it. I get similar traces with both MinGW and
> Cygwin gdb commands:

This is a known (hard) problem.  GDB does not know how to get symbol
information for Windows system DLLs.  It approximates it by using the
export tables, but they do not contain enough information for a
sensible backtrace.
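
One way to picture why export tables fall short: a lookup that only knows
exported entry points will attribute any address to the nearest export at or
below it, so a name such as ntdll!LdrAccessResource in the trace at the start
of this thread may simply be the closest export rather than the function that
is actually executing. The following is a toy sketch of that kind of lookup,
not GDB's code; the struct, the function name and the table contents are all
invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One exported symbol: start address and name.  */
struct export_sym
{
  uint32_t addr;
  const char *name;
};

/* Return the export with the highest address <= PC, or NULL if PC lies
   below every export.  TAB must be sorted by ascending address.  */
static const struct export_sym *
nearest_export (const struct export_sym *tab, size_t n, uint32_t pc)
{
  const struct export_sym *best = NULL;
  size_t i;

  for (i = 0; i < n && tab[i].addr <= pc; i++)
    best = &tab[i];
  return best;
}

/* Hypothetical export table; names and addresses are made up.  */
static const struct export_sym toy_exports[] = {
  { 0x7c90e000u, "ExportedA" },
  { 0x7c90e800u, "ExportedB" },
};
```

With this table, any address above 0x7c90e800 is reported as "ExportedB",
even if it really belongs to an unexported routine that happens to follow it
in memory. The export table also says nothing about how each function lays
out its stack frame, which is what an unwinder needs.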

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Strange stack trace on Windows
  2007-09-29 22:01 Strange stack trace on Windows Gordon Prieur
  2007-09-30  3:00 ` Daniel Jacobowitz
@ 2007-09-30 18:13 ` Eli Zaretskii
  2007-10-01 14:03   ` Gordon Prieur
  1 sibling, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2007-09-30 18:13 UTC (permalink / raw)
  To: Gordon Prieur; +Cc: gdb

> Date: Sat, 29 Sep 2007 14:45:02 -0700
> From: Gordon Prieur <Gordon.Prieur@Sun.COM>
> 
>     When I interrupt the debuggee on Windows I almost never get a stack trace
> with the debuggee information in it. I get similar traces with both MinGW and
> Cygwin gdb commands:
> 
> > 115where
> > 115&"where\n"
> > 115~"#0  0x7c90eb94 in ntdll!LdrAccessResource ()\n"
> > 115~"   from C:\\WINDOWS\\system32\\ntdll.dll\n"
> > 115~"#1  0x7c90e3ed in ntdll!ZwRequestWaitReplyPort ()\n"
> > 115~"   from C:\\WINDOWS\\system32\\ntdll.dll\n"
> > 115~"#2  0x7c9132f8 in ntdll!CsrProbeForWrite () from C:\\WINDOWS\\system32\\ntdll.dll\n"
> > 115~"#3  0x00003fec in ?? ()\n"
> > 115~"#4  0x0022fa70 in ?? ()\n"
> > 115~"#5  0x0022fa70 in ?? ()\n"
> > 115~"#6  0x00000000 in ?? ()\n"
> > 115^done

If you type "step" repeatedly, do you eventually get to a frame that
is in your program?  If you do, you can get a valid stack trace at
that point.


* Re: Strange stack trace on Windows
  2007-09-30 18:13 ` Eli Zaretskii
@ 2007-10-01 14:03   ` Gordon Prieur
  2007-10-01 14:39     ` Joel Brobecker
  0 siblings, 1 reply; 16+ messages in thread
From: Gordon Prieur @ 2007-10-01 14:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb


Eli Zaretskii wrote:
>> Date: Sat, 29 Sep 2007 14:45:02 -0700
>> From: Gordon Prieur <Gordon.Prieur@Sun.COM>
>>
>>     When I interrupt the debuggee on Windows I almost never get a stack trace
>> with the debuggee information in it. I get similar traces with both MinGW and
>> Cygwin gdb commands:
>>
>>     
>>> 115where
>>> 115&"where\n"
>>> 115~"#0  0x7c90eb94 in ntdll!LdrAccessResource ()\n"
>>> 115~"   from C:\\WINDOWS\\system32\\ntdll.dll\n"
>>> 115~"#1  0x7c90e3ed in ntdll!ZwRequestWaitReplyPort ()\n"
>>> 115~"   from C:\\WINDOWS\\system32\\ntdll.dll\n"
>>> 115~"#2  0x7c9132f8 in ntdll!CsrProbeForWrite () from C:\\WINDOWS\\system32\\ntdll.dll\n"
>>> 115~"#3  0x00003fec in ?? ()\n"
>>> 115~"#4  0x0022fa70 in ?? ()\n"
>>> 115~"#5  0x0022fa70 in ?? ()\n"
>>> 115~"#6  0x00000000 in ?? ()\n"
>>> 115^done
>>>       
>
> If you type "step" repeatedly, do you eventually get to a frame that
> is in your program?  If you do, you can get a valid stack trace at
> that point.
>   

Sometimes yes, sometimes no. I implemented that solution about 6 months ago
but backed it out because it was just as likely to crash gdb :-( or hang my
IDE (NetBeans).

Gordon


* Re: Strange stack trace on Windows
  2007-10-01 14:03   ` Gordon Prieur
@ 2007-10-01 14:39     ` Joel Brobecker
  0 siblings, 0 replies; 16+ messages in thread
From: Joel Brobecker @ 2007-10-01 14:39 UTC (permalink / raw)
  To: Gordon Prieur; +Cc: Eli Zaretskii, gdb

> Sometimes yes, sometimes no. I implemented that solution about 6 months
> ago but backed it out because it was just as likely to crash gdb:-(
> or hang my IDE (netbeans).

We have experienced the same type of problem at AdaCore, and decided
to make some compromises: We decided to trust the %ebp register when
unwinding frameless functions from a DLL. This comes with a price:
We miss a frame in the backtrace. But because tasking is so important
in Ada, we felt it was a better compromise than not being able to unwind
from tasks that are blocked waiting for a rendez-vous.
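
To see why trusting %ebp drops exactly one frame, here is a toy model (not
GDB code). It assumes the usual i386 layout where, at a frame base, [ebp]
holds the caller's saved %ebp and [ebp+4] holds the return address. The
"stack" is a small array, and every address, return PC and helper name
below is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy 32-bit stack: word I lives at "address" I * 4.  */
static uint32_t stack_mem[STACK_WORDS];

static uint32_t
read_word (uint32_t addr)
{
  assert (addr / 4 < STACK_WORDS);
  return stack_mem[addr / 4];
}

/* Follow the %ebp chain starting at EBP, storing the return address
   found at each frame base in OUT.  Returns the number of entries.  */
static size_t
walk_ebp_chain (uint32_t ebp, uint32_t *out, size_t max)
{
  size_t n = 0;

  while (ebp != 0 && n < max)
    {
      out[n++] = read_word (ebp + 4);
      ebp = read_word (ebp);
    }
  return n;
}

/* Call chain: startup code -> main -> app_func -> frameless DLL routine.  */
static void
setup_toy_stack (void)
{
  /* main's frame base at 0x30: outermost frame, returns to 0x1000.  */
  stack_mem[0x30 / 4] = 0;
  stack_mem[0x34 / 4] = 0x1000;
  /* app_func's frame base at 0x20: returns to 0x2000 (inside main).  */
  stack_mem[0x20 / 4] = 0x30;
  stack_mem[0x24 / 4] = 0x2000;
  /* The frameless routine only has its return address on the stack,
     0x3000 (inside app_func); %ebp still holds 0x20.  */
  stack_mem[0x18 / 4] = 0x3000;
}
```

Walking from %ebp = 0x20 while stopped in the frameless routine yields the
return addresses 0x2000 and 0x1000. The address 0x3000, which identifies
app_func as the direct caller, is never read, so app_func is the frame that
goes missing from the backtrace.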

Which compromise is best actually depends on the user, which is why
this code, or a variation of it, never made it to the FSF tree.

This is what our i386_frame_cache() does in case of frameless routines:

  if (cache->locals < 0)
    {
      /* We didn't find a valid frame, which means that CACHE->base
         currently holds the frame pointer for our calling frame.  If
         we're at the start of a function, or somewhere half-way its
         prologue, the function's frame probably hasn't been fully
         setup yet.  Try to reconstruct the base address for the stack
         frame by looking at the stack pointer.  For truly "frameless"
         functions this might work too.  */

      if (i386_in_dll (cache->pc)
          && !i386_function_has_frame (cache->pc))
        {
          /* Functions in a DLL that do not seem to create a standard
             frame are unwound using %ebp.  This is actually the caller's
             frame base instead of our own, but there are some functions
             such as WaitForSingleObjectEx in one of the Windows system
             DLLs for which the frame base cannot possibly be determined
             from the stack pointer.  As a consequence, our caller will be
             missing from the backtrace, but this is better than having
             an aborted backtrace due to a bogus frame base.
             
             We use this approach only for functions in DLLs because
             this is the only place where we have seen the type of
             highly optimized code that causes us trouble.  In other
             cases, we expect the code to come with frame debugging
             information, making prologue scanning unnecessary.
             
             We also avoid blindly following %ebp if we are midway through
             setting up a standard frame.  In that case, we know how to
             determine the frame base using the stack pointer.  */

          cache->saved_regs[I386_EBP_REGNUM] = 0;
        }
      else
        {
          i386_frameless_adjust_cache_hack (cache, frame_pc_unwind (next_frame));

          if (cache->stack_align)
            {
              /* We're halfway aligning the stack.  */
              cache->base = ((cache->saved_sp - 4) & 0xfffffff0) - 4;
              cache->saved_regs[I386_EIP_REGNUM] = cache->saved_sp - 4;

              /* This will be added back below.  */ 
              cache->saved_regs[I386_EIP_REGNUM] -= cache->base;
            }
          else
            {
              frame_unwind_register (next_frame, I386_ESP_REGNUM, buf);
              cache->base = extract_unsigned_integer (buf, 4) + cache->sp_offset;
            }
        }
    }

And the two helper functions are defined as:

/* Return non-zero if the function starting at START_PC has a prologue
   that sets up a standard frame.  */

static int
i386_function_has_frame (CORE_ADDR start_pc)
{
  struct i386_frame_cache cache;

  cache.locals = -1;
  i386_analyze_prologue (start_pc, 0xffffffff, &cache);

  return (cache.locals >= 0);
}

/* Return non-zero if PC is inside one of the inferior's DLLs.  */

static int
i386_in_dll (CORE_ADDR pc)
{
   char *so_name = solib_address (pc);
   int len;

   if (so_name == NULL)
     return 0;

   len = strlen (so_name);
   if (len < 5)
     return 0;

   return ((so_name[len - 1] == 'l' || so_name[len - 1] == 'L')
           && (so_name[len - 2] == 'l' || so_name[len - 2] == 'L')
           && (so_name[len - 3] == 'd' || so_name[len - 3] == 'D')
           && so_name[len - 4] == '.');
}
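
The character-by-character test in i386_in_dll can also be written as a
case-insensitive suffix comparison. This is only a standalone sketch of the
same check on a plain string; name_has_dll_suffix is a hypothetical helper,
and the lookup of the library name from a PC (solib_address above) is left
out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return non-zero if NAME ends in ".dll" (ignoring case) and has at
   least one character before the extension.  */
static int
name_has_dll_suffix (const char *name)
{
  static const char suffix[] = ".dll";
  size_t len, i;

  if (name == NULL)
    return 0;

  len = strlen (name);
  if (len < 5)
    return 0;

  for (i = 0; i < 4; i++)
    if (tolower ((unsigned char) name[len - 4 + i]) != suffix[i])
      return 0;

  return 1;
}
```

Like the original, this only looks at the file name, so it treats every
loaded DLL the same way, whether it is a system library or one built with
full debug information.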



-- 
Joel


* Re: Strange stack trace on Windows
@ 2009-03-23 13:12 Roland Schwingel
  0 siblings, 0 replies; 16+ messages in thread
From: Roland Schwingel @ 2009-03-23 13:12 UTC (permalink / raw)
  To: Joel Brobecker, gdb

Hi Joel...
 
Thanks for your reply...

gdb-owner@sourceware.org wrote on 19.03.2009 15:18:06:

 > [...]
 > The idea is that, during a function call made during single-stepping,
 > you'll stop at the first instruction of the function.  At this point,
 > we want to use the standard method of computing the frame cache rather
 > than using the alternative method of trusting the %ebp register.
 > This is what the check that I added was about.
 >
 > The patch that I sent was to be made on top of the first patch
 > that I sent long ago. Did you do that?
Sure. I made my changes on top of your older patch. I studied your old and new
patches over and over. I had to adjust the new one slightly, as it does not
100% match the current CVS code. If you like, I can send you my full
i386-tdep.c (it is quite large, so I have not attached it here).

 >
 > > In my tests both cache->pc and current_pc are ALWAYS identical.
 >
 > They should be identical when you step into a function during
 > your "next" operation, but otherwise they should be different. If this is not
 > the case, then I missed something (maybe something obvious).

Would it help if I made a simple plain C example (source + executable)
that you can step through on your own? If you have the time to do so...

Roland


* Re: Strange stack trace on Windows
  2009-03-18  9:26 Roland Schwingel
@ 2009-03-19 14:18 ` Joel Brobecker
  0 siblings, 0 replies; 16+ messages in thread
From: Joel Brobecker @ 2009-03-19 14:18 UTC (permalink / raw)
  To: Roland Schwingel; +Cc: gdb

> Unfortunately it does not work.
> cache->pc is set from get_frame_func(this_frame).
> current_pc is set from get_frame_pc(this_frame)

I am not sure why it doesn't work. Maybe it's one of these things that
are so obvious that you don't see them anymore... In any case, my
reasoning was that:

  - get_frame_func(this_frame) returns the address of the function
    corresponding to this_frame

  - get_frame_pc(this_frame) returns the current PC in this frame.

The idea is that, during a function call made during single-stepping,
you'll stop at the first instruction of the function.  At this point,
we want to use the standard method of computing the frame cache rather
than using the alternative method of trusting the %ebp register.
This is what the check that I added was about.

The patch that I sent was to be made on top of the first patch
that I sent long ago. Did you do that?

> In my tests both cache->pc and current_pc are ALWAYS identical.

They should be identical when you step into a function during
your "next" operation, but otherwise they should be different. If this is not
the case, then I missed something (maybe something obvious).

-- 
Joel


* Re: Strange stack trace on Windows
@ 2009-03-18  9:26 Roland Schwingel
  2009-03-19 14:18 ` Joel Brobecker
  0 siblings, 1 reply; 16+ messages in thread
From: Roland Schwingel @ 2009-03-18  9:26 UTC (permalink / raw)
  To: Joel Brobecker, gdb

Hi Joel,

Joel Brobecker wrote on 17.03.2009 20:42:56:
 > > That sounds interesting... :-)
 > > Could you outline that a bit more? Where and how can I do that?
 > > (I am digging in gdb's source only for a few days now).
 >
 > You can try the attached patch. What it does is that it matches
 > the "current_pc" with the start address of the associated function
 > (if any). If they are identical, then we're at the beginning of
 > the function.   In that case, we know that the function will appear
 > frameless since the frame hasn't been set up, but we also know how to
 > unwind properly from it.  I can't test the patch right now, so let
 > me know how it goes.
First thank you for your patch and time! I really appreciate that!

I adapted your diff (hopefully correctly) to match the i386_frame_cache()
function from gdb's CVS head. Obviously your i386-tdep.c is quite
different from the one in CVS head. I do not think I made any mistake
in adding the comparison.

Unfortunately it does not work.
cache->pc is set from get_frame_func(this_frame).
current_pc is set from get_frame_pc(this_frame)
(BTW: In the current CVS head there is no current_pc anymore
 in i386_frame_cache(). I added it on my own.)

In my tests both cache->pc and current_pc are ALWAYS identical.
Maybe it has something to do with your different i386-tdep.c?

Would it help if I created sample code and sent it to you?

Again, thank you for the patch,

Roland


* Re: Strange stack trace on Windows
  2009-03-17 15:39 Roland Schwingel
@ 2009-03-17 19:43 ` Joel Brobecker
  0 siblings, 0 replies; 16+ messages in thread
From: Joel Brobecker @ 2009-03-17 19:43 UTC (permalink / raw)
  To: Roland Schwingel; +Cc: gdb

[-- Attachment #1: Type: text/plain, Size: 980 bytes --]

> That sounds interesting... :-)
> Could you outline that a bit more? Where and how can I do that?
> (I am digging in gdb's source only for a few days now).

You can try the attached patch. What it does is that it matches
the "current_pc" with the start address of the associated function
(if any). If they are identical, then we're at the beginning of
the function.   In that case, we know that the function will appear
frameless since the frame hasn't been set up, but we also know how to
unwind properly from it.  I can't test the patch right now, so let
me know how it goes.
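
The decision described above can be sketched as a small standalone function.
This follows the prose description (use the standard method when stopped at
the function's first instruction); it is not the code in i386-tdep.c. The
enum, the function name and the parameters are hypothetical: in_dll and
has_frame stand for the results of i386_in_dll and i386_function_has_frame.

```c
#include <assert.h>
#include <stdint.h>

enum unwind_method
{
  UNWIND_FROM_SP,   /* Standard method: rebuild the base from %esp.  */
  UNWIND_TRUST_EBP  /* Heuristic: treat %ebp as a usable frame base.  */
};

/* Choose how to unwind a function whose prologue analysis found no
   frame.  CURRENT_PC is the PC in this frame, FUNC_START the address
   of the function containing it.  */
static enum unwind_method
choose_unwind_method (uint32_t current_pc, uint32_t func_start,
                      int in_dll, int has_frame)
{
  /* At the first instruction nothing has been pushed yet, so the
     return address is still at the top of the stack.  */
  if (current_pc == func_start)
    return UNWIND_FROM_SP;

  if (in_dll && !has_frame)
    return UNWIND_TRUST_EBP;

  return UNWIND_FROM_SP;
}
```

The entry check matters for "next": right after stepping into a call, GDB
has to recognise the caller, and that is the case where the %ebp heuristic
would hide it.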

> Hmm.. Are so few people using gdb on Windows? I think there should be
> way more interest in getting gdb to handle the MS debugging
> format properly, so that debugging with frameless functions works too.

Not sure. I suspect that most people don't debug programs using
threads, and so don't need AdaCore's patch. And if you don't install
the patch, then the "next" problem goes away.

-- 
Joel

[-- Attachment #2: unwind.diff --]
[-- Type: text/x-diff, Size: 546 bytes --]

diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 0d77fab..073721e 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -1561,7 +1561,8 @@ i386_frame_cache (struct frame_info *this_frame, void **this_cache)
 	 functions this might work too.  */
 
       current_pc = get_frame_pc (this_frame);
-      if (i386_in_dll (current_pc)
+      if (current_pc == cache->pc
+          && i386_in_dll (current_pc)
           && !i386_function_has_frame (current_pc))
         {
           /* Functions in DLL for which do not seem to create a standard


* Re: Strange stack trace on Windows
@ 2009-03-17 15:39 Roland Schwingel
  2009-03-17 19:43 ` Joel Brobecker
  0 siblings, 1 reply; 16+ messages in thread
From: Roland Schwingel @ 2009-03-17 15:39 UTC (permalink / raw)
  To: Joel Brobecker, gdb

Hi...

Joel Brobecker wrote on 17.03.2009 16:08:35:
 > > Wouldn't it be possible to also adjust gdb's stepping code to work with
 > > your patch? Or (also not really nice, but maybe cleaner) to
 > > only use your patch for stack dumping. For stepping rely on the
 > > "other" (means current) frame code.
 >
 > Unfortunately, not. Everything is related: In order to determine whether
 > we stepped into a function during the "next", we do the equivalent of
 > a 2-frame backtrace.
 >
 > However, thinking about this a little more, perhaps there is a way out
 > for the "next" case: Try checking the PC against the start address
 > of the function. If the PC is at the start address of your function,
 > then the function prologue hasn't had time to adjust the stack in such
 > a way that we can't unwind from it, and thus the normal processing
 > should work.
That sounds interesting... :-)
Could you outline that a bit more? Where and how can I do that?
(I am digging in gdb's source only for a few days now).

 > > It is quite painful to use gdb on windows for quite a while now.
 > > Windows, whether one may like it or not, is a major platform
 > > and gdb should also operate well here. I am fighting for a long time
 > > with these problems now.
 >
 > As I hinted earlier, I think that this has to do with the fact that
 > you debug code that makes calls to functions that live inside DLLs.
 > This is a relatively specific condition...
Is it really so specific? I don't think so. Having DLLs is quite common
on Windows. The operating system itself is mostly a bunch of DLLs, and
anyone who wants to use those functions has to call them, as in my
example with setbuf() from msvcrt.dll.
 
 > > Isn't there a general solution thinkable?  GDB is cool piece of
 > > software. For me it is "THE" debugger but this problem transforms more
 > > and more into killer problem here.
 >
 > Pedro answered exactly what I would have answered: The real proper
 > way to fix the problem is to teach GDB how to read the Windows "debug"
 > info. So far, no one has had enough interest in this project to see
 > it through.
Hmm.. Are so few people using gdb on Windows? I think there should be
way more interest in getting gdb to handle the MS debugging
format properly, so that debugging with frameless functions works too.

Roland


* Re: Strange stack trace on Windows
@ 2009-03-17 15:26 Roland Schwingel
  0 siblings, 0 replies; 16+ messages in thread
From: Roland Schwingel @ 2009-03-17 15:26 UTC (permalink / raw)
  To: Pedro Alves, gdb

Hi Pedro...

Thanks for your reply.

Pedro Alves wrote on 17.03.2009 15:27:09:
 > On Tuesday 17 March 2009 13:48:24, Roland Schwingel wrote:
 > > It is quite painful to use gdb on windows for quite a while now.
 > > Windows, whether one may like it or not, is a major platform
 > > and gdb should also operate well here. I am fighting for a long time
 > > with these problems now. Isn't there a general solution thinkable?
 >
 > Sure there is.  :-)  Teach GDB about MSFT's debug info, e.g., PDB files,
 > and about the FPO (frame pointer omission) information in them.
 >
 > See e.g., <http://www.debuginfo.com/articles/gendebuginfo.html>.
Sigh... This is the answer I feared. Anyone doing that would also have
to implement some kind of multi-format debug info handling: the code
compiled with gcc carries stabs (or dwarf2) debugging information,
while the system shared libraries use the MS format.

Maybe someone will have to take up this challenge one day...

Roland


* Re: Strange stack trace on Windows
  2009-03-17 13:49 Roland Schwingel
  2009-03-17 14:27 ` Pedro Alves
@ 2009-03-17 15:08 ` Joel Brobecker
  1 sibling, 0 replies; 16+ messages in thread
From: Joel Brobecker @ 2009-03-17 15:08 UTC (permalink / raw)
  To: Roland Schwingel; +Cc: gdb

> This list could get very long. I already considered doing
> something like that, but only as a last resort, as the list
> could get very long. It would have to contain nearly every Windows
> function. Or match complete DLL names like msvcrt.dll

Right - this is one of the reasons why AdaCore decided not to go
that route. In all fairness, most of our users debug Ada, and thus
do not face the same issue as you do, since they rarely write code
that contains calls to Windows functions. So the compromise does
not impact us as much as it impacts you.

> Wouldn't it be possible to also adjust gdb's stepping code to work with
> your patch? Or (also not really nice, but maybe cleaner) to
> only use your patch for stack dumping. For stepping rely on the
> "other" (means current) frame code.

Unfortunately, not. Everything is related: In order to determine whether
we stepped into a function during the "next", we do the equivalent of
a 2-frame backtrace.

However, thinking about this a little more, perhaps there is a way out
for the "next" case: Try checking the PC against the start address
of the function. If the PC is at the start address of your function,
then the function prologue hasn't had time to adjust the stack in such
a way that we can't unwind from it, and thus the normal processing
should work.

> It is quite painful to use gdb on windows for quite a while now.
> Windows, whether one may like it or not, is a major platform
> and gdb should also operate well here. I am fighting for a long time
> with these problems now.

As I hinted earlier, I think that this has to do with the fact that
you debug code that makes calls to functions that live inside DLLs.
This is a relatively specific condition...

> Isn't there a general solution thinkable?  GDB is cool piece of
> software. For me it is "THE" debugger but this problem transforms more
> and more into killer problem here.

Pedro answered exactly what I would have answered: The real proper
way to fix the problem is to teach GDB how to read the Windows "debug"
info. So far, no one has had enough interest in this project to see
it through.

-- 
Joel


* Re: Strange stack trace on Windows
  2009-03-17 13:49 Roland Schwingel
@ 2009-03-17 14:27 ` Pedro Alves
  2009-03-17 15:08 ` Joel Brobecker
  1 sibling, 0 replies; 16+ messages in thread
From: Pedro Alves @ 2009-03-17 14:27 UTC (permalink / raw)
  To: gdb; +Cc: Roland Schwingel, Joel Brobecker

On Tuesday 17 March 2009 13:48:24, Roland Schwingel wrote:
> It is quite painful to use gdb on windows for quite a while now.
> Windows, whether one may like it or not, is a major platform
> and gdb should also operate well here. I am fighting for a long time
> with these problems now. Isn't there a general solution thinkable?

Sure there is.  :-)  Teach GDB about MSFT's debug info, e.g., PDB files,
and about the FPO (frame pointer omission) information in them.

See e.g., <http://www.debuginfo.com/articles/gendebuginfo.html>.

-- 
Pedro Alves


* Re: Strange stack trace on Windows
@ 2009-03-17 13:49 Roland Schwingel
  2009-03-17 14:27 ` Pedro Alves
  2009-03-17 15:08 ` Joel Brobecker
  0 siblings, 2 replies; 16+ messages in thread
From: Roland Schwingel @ 2009-03-17 13:49 UTC (permalink / raw)
  To: Joel Brobecker, gdb

Hi Joel...

Thanks for your reply....
gdb-owner@sourceware.org wrote on 17.03.2009 14:19:49:
 > > With the patch from Joel I get really good stack traces back in this case.
 >
 > Glad to hear that you were able to put the patch to good use :)
It is very useful. If only I could get the stepping right now... ;-)

 > > As soon as I type next on one of the setbuf() functions gdb steps  
 > > directly to the assembly code of setbuf not over it as I would like to
 > > have it...
 >
 > I suspect that this is because we fail to detect that the caller of
 > setbuf is "function", most likely because the setbuf function does
 > not set up a frame. You can confirm this by requesting a backtrace
 > after you landed inside "setbuf".  If your "function" has disappeared
 > from the backtrace, you know.
Yes... :-) This is true... I already observed this... It has disappeared.
 
 > As I explained back then, this patch is a compromise: You'll win some,
 > and lose some. But you can change a bit the compromise by deciding that
 > certain routines should be excluded from the heuristics. For instance,
 > you can expand i386_in_dll to not only check whether the PC is inside
 > a DLL, but also check the name of the function associated to that PC.
 > One possibility is to only match the name of the routines you know are
 > causing trouble. Another way is to exclude all frameless routines that
 > you know GDB can actually unwind from.
This list could get very long. I already considered doing something
like that, but only as a last resort, as the list could get very
long. It would have to contain nearly every Windows function. Or match
complete DLL names like msvcrt.dll

Wouldn't it be possible to also adjust gdb's stepping code to work with
your patch? Or (also not really nice, but maybe cleaner) to
only use your patch for stack dumping. For stepping rely on the
"other" (means current) frame code.

Using gdb on Windows has been quite painful for a while now.
Windows, whether one likes it or not, is a major platform
and gdb should also work well here. I have been fighting these
problems for a long time. Isn't a general solution conceivable?
GDB is a cool piece of software. For me it is "THE" debugger, but
this problem is turning more and more into a killer problem here.

 > PS: It looks like I'll have to implement the same type of patch for
 >     x86_64-windows.
Might be useful, too... At least for me.... ;-)

Roland


* Re: Strange stack trace on Windows
  2009-03-17 11:58 Roland Schwingel
@ 2009-03-17 13:19 ` Joel Brobecker
  0 siblings, 0 replies; 16+ messages in thread
From: Joel Brobecker @ 2009-03-17 13:19 UTC (permalink / raw)
  To: Roland Schwingel; +Cc: gdb

> With the patch from Joel I get really good stack traces back in this case.

Glad to hear that you were able to put the patch to good use :)

> But there may be a side effect of this patch when stepping through an
> application.
[...]
> As soon as I type next on one of the setbuf() functions gdb steps  
> directly to the assembly code of setbuf not over it as I would like to
> have it...

I suspect that this is because we fail to detect that the caller of
setbuf is "function", most likely because the setbuf function does
not set up a frame. You can confirm this by requesting a backtrace
after you landed inside "setbuf".  If your "function" has disappeared
from the backtrace, you know.

As I explained back then, this patch is a compromise: You'll win some,
and lose some. But you can change a bit the compromise by deciding that
certain routines should be excluded from the heuristics. For instance,
you can expand i386_in_dll to not only check whether the PC is inside
a DLL, but also check the name of the function associated to that PC.
One possibility is to only match the name of the routines you know are
causing trouble. Another way is to exclude all frameless routines that
you know GDB can actually unwind from.

At AdaCore, we decided to stay away from that, because it's very hacky
and OS version dependent.
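
For anyone who wants to experiment with that kind of exclusion anyway, a
minimal sketch could look like the following. It is a standalone
illustration, not part of the patch: both function names are hypothetical,
and the list contains only setbuf because that is the one routine named in
this thread. How to obtain the function name for a PC inside GDB is left
out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of DLL routines that GDB can unwind normally and
   that should therefore NOT get the %ebp heuristic.  */
static const char *const excluded_routines[] = { "setbuf" };

/* Return non-zero if FUNC_NAME is on the exclusion list.  */
static int
routine_is_excluded (const char *func_name)
{
  size_t i;

  if (func_name == NULL)
    return 0;

  for (i = 0; i < sizeof excluded_routines / sizeof excluded_routines[0]; i++)
    if (strcmp (func_name, excluded_routines[i]) == 0)
      return 1;

  return 0;
}

/* Apply the heuristic only to DLL code that is not excluded.  IN_DLL
   stands for the result of a check such as i386_in_dll.  */
static int
use_ebp_heuristic (int in_dll, const char *func_name)
{
  return in_dll && !routine_is_excluded (func_name);
}
```

The drawback mentioned above still applies: such a list has to be maintained
by hand and may differ from one Windows version to the next.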

PS: It looks like I'll have to implement the same type of patch for
    x86_64-windows.

-- 
Joel


* Re: Strange stack trace on Windows
@ 2009-03-17 11:58 Roland Schwingel
  2009-03-17 13:19 ` Joel Brobecker
  0 siblings, 1 reply; 16+ messages in thread
From: Roland Schwingel @ 2009-03-17 11:58 UTC (permalink / raw)
  To: Joel Brobecker, gdb

Hi....

I am following up on an old post from Joel Brobecker from October 2007
regarding strange stack traces on Windows triggered by frameless functions
(see below). This is a real pain when using gdb on Windows. Since gdb 6.0
you see close to NOTHING when you try to debug a crashing application on
Windows, especially when the executable is multithreaded.

With the patch from Joel I get really good stack traces back in this case.
This is really good news. I can also live with the missing frame. It is way
better than the alternative.

But there may be a side effect of this patch when stepping through an
application. GDB sometimes steps into functions I don't want to step into.
Maybe Joel has a new version ready?

What happens - my scenario:
I have a tiny application (just outlined here to keep it short) that does
nothing but load a certain DLL and continue execution there...
int main (int argc, char **argv)
{
    setbuf(stdout,NULL);
    setbuf(stderr,NULL);
   
    // Load shared library and continue there
    anotherDLLHandle  = LoadLibrary("another.dll");
    anotherDLLFunction = GetProcAddress(anotherDLLHandle ,"function");
    anotherDLLFunction();
    ..
}

Code from another.dll:
void function(void)
{
    ...
    setbuf(stdout,NULL);
    setbuf(stderr,NULL);
    ...
}

When I step through the app from main() without Joel's patch, stepping works
fine. With Joel's patch the following happens:
I can step correctly through the app until I reach function() in another.dll.
As soon as I type next on one of the setbuf() calls, gdb steps directly into
the assembly code of setbuf instead of over it, as I would like...

So what's wrong here? I would like to have correct stack traces AND correct
stepping through my app on Windows. IMHO the stack trace issue should be
addressed in gdb in general. In my eyes GDB should work as well on Windows
as it does on e.g. Linux.
(BTW: I am using GDB from current CVS sources).

I am attaching Joel's initial post containing the patch to this mail to
give the right context.

Thanks in advance for your help!

Roland

Joel Brobecker wrote on 01.10.2007 16:39:06:
 > > Sometimes yes, sometimes no. I implemented that solution about 6 months
 > > ago but backed it out because it was just as likely to crash gdb:-(
 > > or hang my IDE (netbeans).
 >
 > We have experienced the same type of problem at AdaCore, and decided
 > to make some compromises: We decided to trust the %ebp register when
 > unwinding frameless functions from a DLL. This comes with a price:
 > We miss a frame in the backtrace. But because tasking is so important
 > in Ada, we felt it was a better compromise than not being able to unwind
 > from tasks that are blocked waiting for a rendez-vous.
 >
 > Which compromise is best actually depends on the user, which is why
 > this code, or a variation of it, never made it to the FSF tree.
 >
 > This is what our i386_frame_cache() does in case of frameless routines:
 >
 >   if (cache->locals < 0)
 >     {
 >       /* We didn't find a valid frame, which means that CACHE->base
 >          currently holds the frame pointer for our calling frame.  If
 >          we're at the start of a function, or somewhere half-way its
 >          prologue, the function's frame probably hasn't been fully
 >          setup yet.  Try to reconstruct the base address for the stack
 >          frame by looking at the stack pointer.  For truly "frameless"
 >          functions this might work too.  */
 >
 >       if (i386_in_dll (cache->pc)
 >           && !i386_function_has_frame (cache->pc))
 >         {
 >           /* Functions in DLL for which do not seem to create a standard
 >              frame are unwound using %ebp.  This is actually the caller's
 >              frame base instead of our own, but there are some functions
 >              such as WaitForSingleObjectEx in one of the Windows system
 >              DLLs for which the frame base cannot possibly be determined
 >              from the stack pointer.  As a consequence, our caller will be
 >              missing from the backtrace, but this is better than having
 >              an aborted backtrace due to a bogus frame base.
 >              
 >              We use this approach only for functions in DLLs because
 >              this is the only place where we have seen the type of
 >              highly optimized code that cause us trouble.  In other
 >              cases, we expect the code to come with frame debugging
 >              information, making prologue scanning unnecessary.
 >              
 >              We also avoid blindly following %ebp if we are midway through
 >              setting up a standard frame.  In that case, we know how to
 >              determine the frame base using the stack pointer.  */
 >
 >           cache->saved_regs[I386_EBP_REGNUM] = 0;
 >         }
 >       else
 >         {
 >           i386_frameless_adjust_cache_hack (cache, frame_pc_unwind (next_frame));
 >
 >           if (cache->stack_align)
 >             {
 >               /* We're halfway aligning the stack.  */
 >               cache->base = ((cache->saved_sp - 4) & 0xfffffff0) - 4;
 >               cache->saved_regs[I386_EIP_REGNUM] = cache->saved_sp - 4;
 >
 >               /* This will be added back below.  */
 >               cache->saved_regs[I386_EIP_REGNUM] -= cache->base;
 >             }
 >           else
 >             {
 >               frame_unwind_register (next_frame, I386_ESP_REGNUM, buf);
 >               cache->base = extract_unsigned_integer (buf, 4) + cache->sp_offset;
 >             }
 >         }
 >     }
 >
 > And the two helper functions are defined as:
 >
 > /* Return non-zero if the function starting at START_PC has a prologue
 >    that sets up a standard frame.  */
 >
 > static int
 > i386_function_has_frame (CORE_ADDR start_pc)
 > {
 >   struct i386_frame_cache cache;
 >
 >   cache.locals = -1;
 >   i386_analyze_prologue (start_pc, 0xffffffff, &cache);
 >
 >   return (cache.locals >= 0);
 > }
 >
 > /* Return non-zero if PC is inside one of the inferior's DLLs.  */
 >
 > static int
 > i386_in_dll (CORE_ADDR pc)
 > {
 >    char *so_name = solib_address (pc);
 >    int len;
 >
 >    if (so_name == NULL)
 >      return 0;
 >
 >    len = strlen (so_name);
 >    if (len < 5)
 >      return 0;
 >
 >    return ((so_name[len - 1] == 'l' || so_name[len - 1] == 'L')
 >            && (so_name[len - 2] == 'l' || so_name[len - 2] == 'L')
 >            && (so_name[len - 3] == 'd' || so_name[len - 3] == 'D')
 >            && so_name[len - 4] == '.');
 > }
 >
 >
 >
 > --
 > Joel
 >


Thread overview: 16+ messages
2007-09-29 22:01 Strange stack trace on Windows Gordon Prieur
2007-09-30  3:00 ` Daniel Jacobowitz
2007-09-30 18:13 ` Eli Zaretskii
2007-10-01 14:03   ` Gordon Prieur
2007-10-01 14:39     ` Joel Brobecker
2009-03-17 11:58 Roland Schwingel
2009-03-17 13:19 ` Joel Brobecker
2009-03-17 13:49 Roland Schwingel
2009-03-17 14:27 ` Pedro Alves
2009-03-17 15:08 ` Joel Brobecker
2009-03-17 15:26 Roland Schwingel
2009-03-17 15:39 Roland Schwingel
2009-03-17 19:43 ` Joel Brobecker
2009-03-18  9:26 Roland Schwingel
2009-03-19 14:18 ` Joel Brobecker
2009-03-23 13:12 Roland Schwingel
