public inbox for cygwin@cygwin.com
* Analyzing a SEG FAULT that gdb doesn't help with
@ 2015-07-29 23:16 Michael Enright
  2015-07-30 14:40 ` Jon TURNEY
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Enright @ 2015-07-29 23:16 UTC
  To: cygwin

I've got a program that used to run but suddenly no longer does.

I've been trying to debug in the usual way but all I get is a stackdump.

I searched online for advice on how to use a stackdump, and it was
pretty clear. I also brought ldd into the mix.

The IP register when the SEGV occurs is at 6155d363. Below are the
base addresses per DLL that ldd prints for my system (updated today,
by the way).

        ntdll.dll => /cygdrive/c/Windows/SysWOW64/ntdll.dll (0x77840000)
        kernel32.dll => /cygdrive/c/Windows/syswow64/kernel32.dll (0x754a0000)
        KERNELBASE.dll => /cygdrive/c/Windows/syswow64/KERNELBASE.dll (0x75820000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x61000000)
        cygexpat-1.dll => /usr/bin/cygexpat-1.dll (0x6f630000)
        cygmozjs185-1.0.dll => /usr/bin/cygmozjs185-1.0.dll (0x6e590000)
        cyggcc_s-1.dll => /usr/bin/cyggcc_s-1.dll (0x6f580000)
        cygstdc++-6.dll => /usr/bin/cygstdc++-6.dll (0x6d000000)
        cygffi-6.dll => /usr/bin/cygffi-6.dll (0x6f600000)
        cygnspr4.dll => /usr/bin/cygnspr4.dll (0x6df70000)

So I tried to addr2line /usr/bin/cygwin1.dll 6155d363 and I got
nothing (?? at ??:?). I then reviewed in setup-x86 the possible cygwin
packages to see if there was a missing package I could use to enable
cygwin1.dll's addresses to be translated, but I didn't recognize
anything.

I also tried strace. I got very little information regarding whatever
the programming failure is:
-----------------8 cut here 8---------------------------------------
Installation of CLOCK_SYNC_GET_CAPS_EX.txt successful
  107 1561415 [main] cdiclient 4536 write: 54 = write(1, 0x801BA9F8, 54)
15458 1576873 [main] cdiclient 4536 fhandler_pty_slave::write: pty0, write(0x801BA9F8, 17)
  121 1576994 [main] cdiclient 4536 fhandler_pty_common::process_opost_output: (1852): pty output_mutex (0x150): waiting -1 ms
   94 1577088 [main] cdiclient 4536 fhandler_pty_common::process_opost_output: (1852): pty output_mutex: acquired
  118 1577206 [main] cdiclient 4536 fhandler_pty_common::process_opost_output: (1891): pty output_mutex(0x150) released
----------------
   99 1577305 [main] cdiclient 4536 write: 17 = write(1, 0x801BA9F8, 17)
--- Process 4536, exception c0000005 at 6115D363
77604 1654909 [main] cdiclient 4536 exception::handle: In cygwin_except_handler exception 0xC0000005 at 0x6115D363 sp 0x28BFA4
  146 1655055 [main] cdiclient 4536 exception::handle: In cygwin_except_handler signal 11 at 0x6115D363
-----------------8 cut here 8---------------------------------------

There is a write to stdout immediately preceding the crash. I would
not guess that that is causing the problem. The write is using the
same buffer as the one just before it and any others I found and the
return value is equal to the byte count argument.

The write to stdout precedes the part of the program where I instruct
the JavaScript interpreter to call a function defined by the file that
has been interpreted already. It's possible that in the course of
executing that JavaScript it called into one of my JavaScript
extensions and that the problem lies there. But I can't even identify
where within cygwin1 or any other executable the SEGV occurred.


1) Why is it not the case that gdb handles this SEGV in the usual
manner? It too just allows the stackdump to be made and lets me know
that the inferior has run its course.

2) What tools have I overlooked in debugging this?

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

* Re: Analyzing a SEG FAULT that gdb doesn't help with
  2015-07-29 23:16 Analyzing a SEG FAULT that gdb doesn't help with Michael Enright
@ 2015-07-30 14:40 ` Jon TURNEY
  2015-07-30 17:46   ` Michael Enright
  2015-07-31 12:51   ` Jon TURNEY
  0 siblings, 2 replies; 9+ messages in thread
From: Jon TURNEY @ 2015-07-30 14:40 UTC
  To: cygwin; +Cc: mike

On 30/07/2015 00:16, Michael Enright wrote:
> So I tried to addr2line /usr/bin/cygwin1.dll 6155d363 and I got
> nothing (?? at ??:?) I then reviewed in setup-x86 the possible cygwin
> packages to see if there was a missing package I could use to enable
> cygwin1.dll's addresses to be translated but I didn't recognize
> anything.

You need to install the 'cygwin-debuginfo' package for debug symbols for 
cygwin1.dll

You also need to point addr2line at those detached debug symbols, as 
(unlike gdb) it doesn't follow .gnu_debuglink pointers.

(I'm assuming you've typoed 6155d363 here and it should be 0x6115D363 as 
the strace output says)

# addr2line -e /usr/lib/debug/usr/bin/cygwin1.dbg 0x6115D363
/usr/src/debug/cygwin-2.1.0-1/newlib/libc/machine/i386/strlen.S:64

> 1) Why is it not the case that gdb handles this SEGV in the usual
> manner? It too just allows the stackdump to be made and lets me know
> that the inferior has run its course.

This shouldn't happen.  Are you sure the crashing process is the direct 
inferior of gdb, and not some wrapper process which runs it? 
(uninstalled libtool generated binaries do this, for e.g.)



* Re: Analyzing a SEG FAULT that gdb doesn't help with
  2015-07-30 14:40 ` Jon TURNEY
@ 2015-07-30 17:46   ` Michael Enright
  2015-07-30 18:48     ` Michael Enright
  2015-07-31 12:51   ` Jon TURNEY
  1 sibling, 1 reply; 9+ messages in thread
From: Michael Enright @ 2015-07-30 17:46 UTC
  To: cygwin

On Thu, Jul 30, 2015 at 7:39 AM, Jon TURNEY  wrote:
> You need to install the 'cygwin-debuginfo' package for debug symbols for
> cygwin1.dll

I missed this in my searches. I see now that I should have used the
"debug" category.

>
> You also need to point addr2line at those detached debug symbols, as (unlike
> gdb) it doesn't follow .gnu_debuglink pointers.
>
> (I'm assuming you've typoed 6155d363 here and it should be 0x6115D363 as the
> strace output says)

I've been having trouble getting that number right

>
> # addr2line -e /usr/lib/debug/usr/bin/cygwin1.dbg 0x6115D363
> /usr/src/debug/cygwin-2.1.0-1/newlib/libc/machine/i386/strlen.S:64

Another problem is that there's only one stack frame in the stack
dump, so knowing that the crash is in strlen just means I have to grep
for the strlen call sites and hope that one of them is vulnerable to a
special case that now happens all the time.

>
> Are you sure the crashing process is the direct
> inferior of gdb, and not some wrapper process which runs it? (uninstalled
> libtool generated binaries do this, for e.g.)
>

The crashing executable is just a client of SpiderMonkey (via
libmozjs185) with several extensions to JavaScript in order to emulate
some of the Windows cscript utility and to emulate another environment
that happens to be a massive annoyance to run scripts in. The
executable is built using a textbook Makefile.

* Re: Analyzing a SEG FAULT that gdb doesn't help with
  2015-07-30 17:46   ` Michael Enright
@ 2015-07-30 18:48     ` Michael Enright
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Enright @ 2015-07-30 18:48 UTC
  To: cygwin

On Thu, Jul 30, 2015 at 10:46 AM, Michael Enright  wrote:
> On Thu, Jul 30, 2015 at 7:39 AM, Jon TURNEY  wrote:
>> You need to install the 'cygwin-debuginfo' package for debug symbols for
>> cygwin1.dll
>
> I missed this in my searches. I see now that I should have used the
> "debug" category.
>
>>
>> You also need to point addr2line at those detached debug symbols, as (unlike
>> gdb) it doesn't follow .gnu_debuglink pointers.
>>
>> (I'm assuming you've typoed 6155d363 here and it should be 0x6115D363 as the
>> strace output says)
>
> I've been having trouble getting that number right
>
>>
>> # addr2line -e /usr/lib/debug/usr/bin/cygwin1.dbg 0x6115D363
>> /usr/src/debug/cygwin-2.1.0-1/newlib/libc/machine/i386/strlen.S:64
>

So I set a breakpoint at that location. After hitting it a couple
dozen times I referred back to the stackdump for some register values.

I replaced the original breakpoint with a conditional one:

b *0x6115D363 if $ecx==0

When that hit, I did a backtrace:

(gdb) bt
#0  strlen () at /usr/src/debug/cygwin-2.1.0-1/newlib/libc/machine/i386/strlen.S:64
#1  0x6115bbc6 in __strftime (s=s@entry=0x28c1c8 "(\274\a\200\300\t`\377\210\340`\377\001", maxsize=maxsize@entry=100, format=0x6e7e86da <js_Null_str+9683> "Z)",
    format@entry=0x6e7e86d8 <js_Null_str+9681> "(%Z)", tim_p=tim_p@entry=0x28c0ec, era_info=era_info@entry=0x28c078, alt_digits=alt_digits@entry=0x28c07c)
    at /usr/src/debug/cygwin-2.1.0-1/newlib/libc/time/strftime.c:1344
#2  0x6115d1bd in strftime (s=0x28c1c8 "(\274\a\200\300\t`\377\210\340`\377\001", maxsize=100, format=0x6e7e86d8 <js_Null_str+9681> "(%Z)", tim_p=0x28c0ec)
    at /usr/src/debug/cygwin-2.1.0-1/newlib/libc/time/strftime.c:673
#3  0x610e9925 in _sigfe () at sigfe.s:38
#4  0x6e7e86d8 in js_Null_str () from /usr/bin/cygmozjs185-1.0.dll
#5  0x0028c0ec in ?? ()
#6  0x800392ec in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Then ...

(gdb) up
#1  0x6115bbc6 in __strftime (s=s@entry=0x28c1c8 "(\274\a\200\300\t`\377\210\340`\377\001", maxsize=maxsize@entry=100, format=0x6e7e86da <js_Null_str+9683> "Z)",
    format@entry=0x6e7e86d8 <js_Null_str+9681> "(%Z)", tim_p=tim_p@entry=0x28c0ec, era_info=era_info@entry=0x28c078, alt_digits=alt_digits@entry=0x28c07c)
    at /usr/src/debug/cygwin-2.1.0-1/newlib/libc/time/strftime.c:1344
1344                  size = strlen (tznam);

Well, let's just kibitz myself:

(gdb) print tznam
$3 = 0xc07a4000 <error: Cannot access memory at address 0xc07a4000>
(gdb) list
1339                    tznam = _tzname[tim_p->tm_isdst > 0];
1340                  /* Note that in case of wcsftime this loop only works for
1341                     timezone abbreviations using the portable codeset (aka ASCII).
1342                     This seems to be the case, but if that ever changes, this
1343                     loop needs revisiting. */
1344                  size = strlen (tznam);
1345                  for (i = 0; i < size; i++)
1346                    {
1347                      if (count < maxsize - 1)
1348                        s[count++] = tznam[i];
(gdb) print _tzname
$4 = {0x800cfc48 "\200", <incomplete sequence \356\066>, 0x800cfc44 "PDT"}
(gdb) print _tzname[0]
$5 = 0x800cfc48 "\200", <incomplete sequence \356\066>
(gdb) print _tzname[1]
$6 = 0x800cfc44 "PDT"

So "something happened" when libmozjs tried to convert a time to a
string, and whatever happened has to do with the time zone.

* Re: Analyzing a SEG FAULT that gdb doesn't help with
  2015-07-30 14:40 ` Jon TURNEY
  2015-07-30 17:46   ` Michael Enright
@ 2015-07-31 12:51   ` Jon TURNEY
  2015-07-31 18:46     ` Michael Enright
  1 sibling, 1 reply; 9+ messages in thread
From: Jon TURNEY @ 2015-07-31 12:51 UTC
  To: cygwin; +Cc: mike

On 30/07/2015 15:39, Jon TURNEY wrote:
> On 30/07/2015 00:16, Michael Enright wrote:
>> 1) Why is it not the case that gdb handles this SEGV in the usual
>> manner? It too just allows the stackdump to be made and lets me know
>> that the inferior has run its course.
>
> This shouldn't happen.  Are you sure the crashing process is the direct
> inferior of gdb, and not some wrapper process which runs it?
> (uninstalled libtool generated binaries do this, for e.g.)

Oh, I just remembered something :)

I think you need to use the gdb command 'set cygwin-exceptions on' to 
tell gdb to break on exceptions inside the cygwin DLL (by default they 
are ignored, as they may be generated during normal operation when 
checking a pointer is valid)

I shall have to see if I can find a place for these last couple of 
answers in the FAQ or documentation somewhere.  It's rather too obscure 
at the moment.


* Re: Analyzing a SEG FAULT that gdb doesn't help with
  2015-07-31 12:51   ` Jon TURNEY
@ 2015-07-31 18:46     ` Michael Enright
  2015-07-31 20:12       ` Michael Enright
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Enright @ 2015-07-31 18:46 UTC
  To: cygwin

On Fri, Jul 31, 2015 at 5:51 AM, Jon TURNEY wrote:
>
> I think you need to use the gdb command 'set cygwin-exceptions on' to tell
> gdb to break on exceptions inside the cygwin DLL (by default they are
> ignored, as they may be generated during normal operation when checking a
> pointer is valid)
>
> I shall have to see if I can find a place for these last couple of answers
> in the FAQ or documentation somewhere.  It's rather too obscure at the
> moment.
>

This is going to help. I have another application (I didn't write it,
so I don't even know yet whether it uses strftime) that is falling
over in a similar fashion, with a different 0x61xxxxxx address
involved.

* Re: Analyzing a SEG FAULT that gdb doesn't help with
  2015-07-31 18:46     ` Michael Enright
@ 2015-07-31 20:12       ` Michael Enright
  2015-08-01 20:28         ` Brian Inglis
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Enright @ 2015-07-31 20:12 UTC
  To: cygwin

On Fri, Jul 31, 2015 at 11:46 AM, Michael Enright wrote:
> On Fri, Jul 31, 2015 at 5:51 AM, Jon TURNEY wrote:
>>
>> I think you need to use the gdb command 'set cygwin-exceptions on' to tell
>> gdb to break on exceptions <...>
>
> This is going to help, I have another application (which I don't even
> know yet if it uses strftime because I didn't write it) that is
> falling over in a similar fashion, with a different 0x61xxxxxx address
> involved.

The program in question is passing strings to printf that (a) end with
"% " or (b) in the middle have "% S". To be clear these strings are
the sole argument so they are format strings. This happens tons of
times during a run but eventually it crashes in printf, generating a
stackdump unless the magic setting is set.

As I read the posix spec, % can be followed by flags and space is
actually a flag. This flag affects how signs are handled for numeric
output. So it could be that the code is trying to deal with
%<flag><conversion-char> and S is not a valid conversion char. My
attempts to reproduce this outside the evil program have not worked.
The output is a little crazy when you printf("something % Something")
but in my test program it doesn't crash. I tried printing the strings
that the real program might have to deal with but this didn't cause a
crash either.

I have modified the evil program so that in at least this one spot,
lines from the input file are not passed to printf to be output.

So there might be something wrong in printf itself, because an
internal SEGV that actually halts the program is bad, but I haven't
got a good test case. I have always disagreed with both
printf(sometext) and printf("%s", sometext) as wastes of cycles, but I
wasn't the one making the choices when the evil program was written.

* Re: Analyzing a SEG FAULT that gdb doesn't help with
  2015-07-31 20:12       ` Michael Enright
@ 2015-08-01 20:28         ` Brian Inglis
  2015-08-02  1:55           ` Michael Enright
  0 siblings, 1 reply; 9+ messages in thread
From: Brian Inglis @ 2015-08-01 20:28 UTC
  To: cygwin

Michael Enright <mike <at> kmcardiff.com> writes:
> On Fri, Jul 31, 2015 at 11:46 AM, Michael Enright wrote:
> > On Fri, Jul 31, 2015 at 5:51 AM, Jon TURNEY wrote:
> The program in question is passing strings to printf that (a) end with
> "% " or (b) in the middle have "% S". To be clear these strings are
> the sole argument so they are format strings. This happens tons of
> times during a run but eventually it crashes in printf, generating a
> stackdump unless the magic setting is set.
> 
> As I read the posix spec, % can be followed by flags and space is
> actually a flag. This flag affects how signs are handled for numeric
> output. So it could be that the code is trying to deal with
> %<flag><conversion-char> and S is not a valid conversion char. My
> attempts to reproduce this outside the evil program have not worked.
> The output is a little crazy when you printf("something % Something")
> but in my test program it doesn't crash. I tried printing the strings
> that the real program might have to deal with but this didn't cause a
> crash either.

Seems like the problem may be developer confusion between strftime and
printf conversion flag prefixes. The strftime space conversion flag
character is _ so space filled seconds should use %_S, whereas the printf
conversion flag character is space ' ', though I can't recall ever using
that, as it is normally the default. 



* Re: Analyzing a SEG FAULT that gdb doesn't help with
  2015-08-01 20:28         ` Brian Inglis
@ 2015-08-02  1:55           ` Michael Enright
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Enright @ 2015-08-02  1:55 UTC
  To: cygwin

On Sat, Aug 1, 2015 at 1:28 PM, Brian Inglis  wrote:
>
> Seems like the problem may be developer confusion between strftime and
> printf conversion flag prefixes. The strftime space conversion flag
> character is _ so space filled seconds should use %_S, whereas the printf
> conversion flag character is space ' ', though I can't recall ever using
> that, as it is normally the default.
>
Well this code is not related to the strftime code in my other thread.
In this case it was just that a programmer didn't understand why
people do printf("%s",string) instead of printf(string). The program
takes lines out of log files, snarfs information out of them and makes
a report. Sometimes what belongs in the report is a copy of the entire
line from the log, so printf(line). But that would seem unwise in
general. In particular these log lines have prompts in them, and as
you know prompts often have '%' characters in them. It was only a
matter of time before this habit became a bad one.

So my email above is really the conclusion of the investigation, and
the fix is for me to switch the code to fputs in such cases.
