* Analyzing a SEG FAULT that gdb doesn't help with
From: Michael Enright @ 2015-07-29 23:16 UTC
To: cygwin

I've got a program which was running but suddenly does not run. I've
been trying to debug in the usual way but all I get is a stackdump. I
consulted the internet for advice on how to use a stackdump and it was
pretty clear. I also brought LDD into the mix.

The IP register when the SEGV occurs is at 6155d363. Below are the
ranges per DLL that LDD prints for my system (updated today by the
way).

        ntdll.dll => /cygdrive/c/Windows/SysWOW64/ntdll.dll (0x77840000)
        kernel32.dll => /cygdrive/c/Windows/syswow64/kernel32.dll (0x754a0000)
        KERNELBASE.dll => /cygdrive/c/Windows/syswow64/KERNELBASE.dll (0x75820000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x61000000)
        cygexpat-1.dll => /usr/bin/cygexpat-1.dll (0x6f630000)
        cygmozjs185-1.0.dll => /usr/bin/cygmozjs185-1.0.dll (0x6e590000)
        cyggcc_s-1.dll => /usr/bin/cyggcc_s-1.dll (0x6f580000)
        cygstdc++-6.dll => /usr/bin/cygstdc++-6.dll (0x6d000000)
        cygffi-6.dll => /usr/bin/cygffi-6.dll (0x6f600000)
        cygnspr4.dll => /usr/bin/cygnspr4.dll (0x6df70000)

So I tried to addr2line /usr/bin/cygwin1.dll 6155d363 and I got
nothing (?? at ??:?) I then reviewed in setup-x86 the possible cygwin
packages to see if there was a missing package I could use to enable
cygwin1.dll's addresses to be translated but I didn't recognize
anything.

I also tried strace. I got very little information regarding whatever
the programming failure is:

-----------------8 cut here 8---------------------------------------
Installation of CLOCK_SYNC_GET_CAPS_EX.txt successful
  107 1561415 [main] cdiclient 4536 write: 54 = write(1, 0x801BA9F8, 54)
15458 1576873 [main] cdiclient 4536 fhandler_pty_slave::write: pty0, write(0x801BA9F8, 17)
  121 1576994 [main] cdiclient 4536 fhandler_pty_common::process_opost_output: (1852): pty output_mutex (0x150): waiting -1 ms
   94 1577088 [main] cdiclient 4536 fhandler_pty_common::process_opost_output: (1852): pty output_mutex: acquired
  118 1577206 [main] cdiclient 4536 fhandler_pty_common::process_opost_output: (1891): pty output_mutex(0x150) released
----------------
   99 1577305 [main] cdiclient 4536 write: 17 = write(1, 0x801BA9F8, 17)
--- Process 4536, exception c0000005 at 6115D363
77604 1654909 [main] cdiclient 4536 exception::handle: In cygwin_except_handler exception 0xC0000005 at 0x6115D363 sp 0x28BFA4
  146 1655055 [main] cdiclient 4536 exception::handle: In cygwin_except_handler signal 11 at 0x6115D363
-----------------8 cut here 8---------------------------------------

There is a write to stdout immediately preceding the crash. I would
not guess that that is causing the problem. The write is using the
same buffer as the one just before it and any others I found and the
return value is equal to the byte count argument.

The write to stdout precedes the part of the program where I instruct
the JavaScript interpreter to call a function defined by the file that
has been interpreted already. It's possible that in the course of
executing that JavaScript it called into one of my JavaScript
extensions and that the problem lies there. But I can't even identify
where within cygwin1 or any other executable the SEGV occurred.

1) Why is it not the case that gdb handles this SEGV in the usual
manner? It too just allows the stackdump to be made and lets me know
that the inferior has run its course.

2) What tools have I overlooked in debugging this?

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
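
[Editorial sketch, not part of the original message.] The lookup this
message describes, matching a faulting address against the load bases
that ldd prints, can be written down in a few lines of C. The names
struct module, modules, module_for and offset_in are hypothetical.
Because ldd reports only a base address per DLL and not its size,
"highest base at or below the address" is a heuristic that assumes the
address really lies inside one of the listed modules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of ldd output: DLL name and load base (32-bit process). */
struct module {
    const char *name;
    uint32_t base;
};

/* Load bases copied from the ldd output quoted in the message. */
static const struct module modules[] = {
    {"ntdll.dll",           0x77840000u},
    {"kernel32.dll",        0x754a0000u},
    {"KERNELBASE.dll",      0x75820000u},
    {"cygwin1.dll",         0x61000000u},
    {"cygexpat-1.dll",      0x6f630000u},
    {"cygmozjs185-1.0.dll", 0x6e590000u},
    {"cyggcc_s-1.dll",      0x6f580000u},
    {"cygstdc++-6.dll",     0x6d000000u},
    {"cygffi-6.dll",        0x6f600000u},
    {"cygnspr4.dll",        0x6df70000u},
};

/* Return the module with the highest base that is <= addr, or NULL if
 * addr is below every base. Heuristic: module sizes are unknown here. */
const struct module *module_for(uint32_t addr)
{
    const struct module *best = NULL;
    size_t i;

    for (i = 0; i < sizeof modules / sizeof modules[0]; i++) {
        if (modules[i].base <= addr &&
            (best == NULL || modules[i].base > best->base))
            best = &modules[i];
    }
    return best;
}

/* Offset of addr from the module's load base. */
uint32_t offset_in(const struct module *m, uint32_t addr)
{
    assert(m != NULL && addr >= m->base);
    return addr - m->base;
}
```

With this table, 0x6115D363 (the address in the strace output) falls
under cygwin1.dll at offset 0x15D363.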

* Re: Analyzing a SEG FAULT that gdb doesn't help with
From: Jon TURNEY @ 2015-07-30 14:40 UTC
To: cygwin; +Cc: mike

On 30/07/2015 00:16, Michael Enright wrote:
> So I tried to addr2line /usr/bin/cygwin1.dll 6155d363 and I got
> nothing (?? at ??:?) I then reviewed in setup-x86 the possible cygwin
> packages to see if there was a missing package I could use to enable
> cygwin1.dll's addresses to be translated but I didn't recognize
> anything.

You need to install the 'cygwin-debuginfo' package for debug symbols for
cygwin1.dll

You also need to point addr2line at those detached debug symbols, as
(unlike gdb) it doesn't follow .gnu_debuglink pointers.

(I'm assuming you've typoed 6155d363 here and it should be 0x6115D363 as
the strace output says)

# addr2line -e /usr/lib/debug/usr/bin/cygwin1.dbg 0x6115D363
/usr/src/debug/cygwin-2.1.0-1/newlib/libc/machine/i386/strlen.S:64

> 1) Why is it not the case that gdb handles this SEGV in the usual
> manner? It too just allows the stackdump to be made and lets me know
> that the inferior has run its course.

This shouldn't happen. Are you sure the crashing process is the direct
inferior of gdb, and not some wrapper process which runs it?
(uninstalled libtool generated binaries do this, for e.g.)

* Re: Analyzing a SEG FAULT that gdb doesn't help with
From: Michael Enright @ 2015-07-30 17:46 UTC
To: cygwin

On Thu, Jul 30, 2015 at 7:39 AM, Jon TURNEY wrote:
> You need to install the 'cygwin-debuginfo' package for debug symbols for
> cygwin1.dll

I missed this in my searches. I see now that I should have used the
"debug" category.

>
> You also need to point addr2line at those detached debug symbols, as (unlike
> gdb) it doesn't follow .gnu_debuglink pointers.
>
> (I'm assuming you've typoed 6155d363 here and it should be 0x6115D363 as the
> strace output says)

I've been having trouble getting that number right

>
> # addr2line -e /usr/lib/debug/usr/bin/cygwin1.dbg 0x6115D363
> /usr/src/debug/cygwin-2.1.0-1/newlib/libc/machine/i386/strlen.S:64

Another problem is that there's only one stack frame in the stack
dump, so knowing that it's a strlen just means I have to crank out
some grep commands and hopefully one of them is vulnerable to a
special case that now happens all the time.

>
> Are you sure the crashing process is the direct
> inferior of gdb, and not some wrapper process which runs it? (uninstalled
> libtool generated binaries do this, for e.g.)
>

The crashing executable is just a client of SpiderMonkey (via
libmozjs185) with several extensions to JavaScript in order to emulate
some of the Windows cscript utility and to emulate another environment
that happens to be a massive annoyance to run scripts in. The
executable is built using a textbook Makefile.

* Re: Analyzing a SEG FAULT that gdb doesn't help with
From: Michael Enright @ 2015-07-30 18:48 UTC
To: cygwin

On Thu, Jul 30, 2015 at 10:46 AM, Michael Enright wrote:
> On Thu, Jul 30, 2015 at 7:39 AM, Jon TURNEY wrote:
>> You need to install the 'cygwin-debuginfo' package for debug symbols for
>> cygwin1.dll
>
> I missed this in my searches. I see now that I should have used the
> "debug" category.
>
>>
>> You also need to point addr2line at those detached debug symbols, as (unlike
>> gdb) it doesn't follow .gnu_debuglink pointers.
>>
>> (I'm assuming you've typoed 6155d363 here and it should be 0x6115D363 as the
>> strace output says)
>
> I've been having trouble getting that number right
>
>>
>> # addr2line -e /usr/lib/debug/usr/bin/cygwin1.dbg 0x6115D363
>> /usr/src/debug/cygwin-2.1.0-1/newlib/libc/machine/i386/strlen.S:64
>

So I set a breakpoint at that location. After hitting it a couple
dozen times I referred back to the stackdump for some register values.
I replaced the original breakpoint with

b *0x6115D363 if $ecx==0

When that hit, I did a backtrace:

(gdb) bt
#0  strlen () at /usr/src/debug/cygwin-2.1.0-1/newlib/libc/machine/i386/strlen.S:64
#1  0x6115bbc6 in __strftime (s=s@entry=0x28c1c8 "(\274\a\200\300\t`\377\210\340`\377\001", maxsize=maxsize@entry=100, format=0x6e7e86da <js_Null_str+9683> "Z)", format@entry=0x6e7e86d8 <js_Null_str+9681> "(%Z)", tim_p=tim_p@entry=0x28c0ec, era_info=era_info@entry=0x28c078, alt_digits=alt_digits@entry=0x28c07c) at /usr/src/debug/cygwin-2.1.0-1/newlib/libc/time/strftime.c:1344
#2  0x6115d1bd in strftime (s=0x28c1c8 "(\274\a\200\300\t`\377\210\340`\377\001", maxsize=100, format=0x6e7e86d8 <js_Null_str+9681> "(%Z)", tim_p=0x28c0ec) at /usr/src/debug/cygwin-2.1.0-1/newlib/libc/time/strftime.c:673
#3  0x610e9925 in _sigfe () at sigfe.s:38
#4  0x6e7e86d8 in js_Null_str () from /usr/bin/cygmozjs185-1.0.dll
#5  0x0028c0ec in ?? ()
#6  0x800392ec in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Then ...

(gdb) up
#1  0x6115bbc6 in __strftime (s=s@entry=0x28c1c8 "(\274\a\200\300\t`\377\210\340`\377\001", maxsize=maxsize@entry=100, format=0x6e7e86da <js_Null_str+9683> "Z)", format@entry=0x6e7e86d8 <js_Null_str+9681> "(%Z)", tim_p=tim_p@entry=0x28c0ec, era_info=era_info@entry=0x28c078, alt_digits=alt_digits@entry=0x28c07c) at /usr/src/debug/cygwin-2.1.0-1/newlib/libc/time/strftime.c:1344
1344              size = strlen (tznam);

Well, let's just kibitz myself:

(gdb) print tznam
$3 = 0xc07a4000 <error: Cannot access memory at address 0xc07a4000>
(gdb) list
1339              tznam = _tzname[tim_p->tm_isdst > 0];
1340              /* Note that in case of wcsftime this loop only works for
1341                 timezone abbreviations using the portable codeset (aka ASCII).
1342                 This seems to be the case, but if that ever changes, this
1343                 loop needs revisiting. */
1344              size = strlen (tznam);
1345              for (i = 0; i < size; i++)
1346                {
1347                  if (count < maxsize - 1)
1348                    s[count++] = tznam[i];
(gdb) print _tzname
$4 = {0x800cfc48 "\200", <incomplete sequence \356\066>, 0x800cfc44 "PDT"}
(gdb) print _tzname[0]
$5 = 0x800cfc48 "\200", <incomplete sequence \356\066>
(gdb) print _tzname[1]
$6 = 0x800cfc44 "PDT"

So "something happened" when libmozjs tried to convert a time to a
string, and whatever happened has to do with the time zone.
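
[Editorial sketch, not part of the original message.] The backtrace
shows strftime being called with a 100-byte buffer and the format
"(%Z)". A small standalone harness that makes the same call may help
check whether the fault follows the time zone data or the calling
program. The helper name format_zone is hypothetical, and nothing here
is known to reproduce the crash; it only exercises the same library
path outside libmozjs.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Format the local time zone abbreviation for time t using the same
 * format string seen in the backtrace, "(%Z)". Returns the number of
 * characters written (excluding the terminator), or 0 on failure. */
size_t format_zone(char *buf, size_t size, time_t t)
{
    struct tm *tm_local;

    assert(buf != NULL && size > 0);

    /* localtime() picks up the current time zone settings itself. */
    tm_local = localtime(&t);
    if (tm_local == NULL)
        return 0;

    return strftime(buf, size, "(%Z)", tm_local);
}
```

Calling format_zone(buf, 100, time(NULL)) from a tiny main() under gdb,
with the same TZ environment as the failing program, would show whether
strftime alone trips over the zone names.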

* Re: Analyzing a SEG FAULT that gdb doesn't help with
From: Jon TURNEY @ 2015-07-31 12:51 UTC
To: cygwin; +Cc: mike

On 30/07/2015 15:39, Jon TURNEY wrote:
> On 30/07/2015 00:16, Michael Enright wrote:
>> 1) Why is it not the case that gdb handles this SEGV in the usual
>> manner? It too just allows the stackdump to be made and lets me know
>> that the inferior has run its course.
>
> This shouldn't happen. Are you sure the crashing process is the direct
> inferior of gdb, and not some wrapper process which runs it?
> (uninstalled libtool generated binaries do this, for e.g.)

Oh, I just remembered something :)

I think you need to use the gdb command 'set cygwin-exceptions on' to
tell gdb to break on exceptions inside the cygwin DLL (by default they
are ignored, as they may be generated during normal operation when
checking a pointer is valid)

I shall have to see if I can find a place for these last couple of
answers in the FAQ or documentation somewhere. It's rather too obscure
at the moment.

* Re: Analyzing a SEG FAULT that gdb doesn't help with
From: Michael Enright @ 2015-07-31 18:46 UTC
To: cygwin

On Fri, Jul 31, 2015 at 5:51 AM, Jon TURNEY wrote:
>
> I think you need to use the gdb command 'set cygwin-exceptions on' to tell
> gdb to break on exceptions inside the cygwin DLL (by default they are
> ignored, as they may be generated during normal operation when checking a
> pointer is valid)
>
> I shall have to see if I can find a place for these last couple of answers
> in the FAQ or documentation somewhere. It's rather too obscure at the
> moment.
>

This is going to help, I have another application (which I don't even
know yet if it uses strftime because I didn't write it) that is
falling over in a similar fashion, with a different 0x61xxxxxx address
involved.

* Re: Analyzing a SEG FAULT that gdb doesn't help with
From: Michael Enright @ 2015-07-31 20:12 UTC
To: cygwin

On Fri, Jul 31, 2015 at 11:46 AM, Michael Enright wrote:
> On Fri, Jul 31, 2015 at 5:51 AM, Jon TURNEY wrote:
>>
>> I think you need to use the gdb command 'set cygwin-exceptions on' to tell
>> gdb to break on exceptions <...>
>
> This is going to help, I have another application (which I don't even
> know yet if it uses strftime because I didn't write it) that is
> falling over in a similar fashion, with a different 0x61xxxxxx address
> involved.

The program in question is passing strings to printf that (a) end with
"% " or (b) in the middle have "% S". To be clear these strings are
the sole argument so they are format strings. This happens tons of
times during a run but eventually it crashes in printf, generating a
stackdump unless the magic setting is set.

As I read the posix spec, % can be followed by flags and space is
actually a flag. This flag affects how signs are handled for numeric
output. So it could be that the code is trying to deal with
%<flag><conversion-char> and S is not a valid conversion char. My
attempts to reproduce this outside the evil program have not worked.
The output is a little crazy when you printf("something % Something")
but in my test program it doesn't crash. I tried printing the strings
that the real program might have to deal with but this didn't cause a
crash either.

I have modified the evil program so that in at least this one spot,
lines from the input file are not passed to printf to be output.

So there might be something, because an internal SEGV that actually
halts the program is bad, but I haven't got a good test case. I have
always disagreed with both printf(sometext) and printf("%s", sometext)
as wastes of cycles but I wasn't the one making the choices when the
evil program was written.
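
[Editorial sketch, not part of the original message.] To see how printf
would read a stray '%' in a log line, the C fragment below walks the
first directive the way this message reasons about it: skip flag
characters (space is one of them), then an optional width and
precision, and report the character that lands in the conversion
position. The helper name first_conversion is hypothetical, and the
parser is deliberately simplified: it ignores '*' widths and length
modifiers such as 'l' or 'h'.

```c
#include <assert.h>
#include <string.h>

/* Inspect the first '%' directive in fmt.
 * Returns -1 if fmt contains no '%',
 *          0 if the directive runs off the end of the string,
 *          otherwise the character in the conversion position.
 * Simplified: no '*' width/precision, no length modifiers. */
int first_conversion(const char *fmt)
{
    const char *p;

    assert(fmt != NULL);

    p = strchr(fmt, '%');
    if (p == NULL)
        return -1;
    p++;                                  /* step past '%' */

    p += strspn(p, "-+ #0");              /* flags, including space */
    p += strspn(p, "0123456789");         /* field width */
    if (*p == '.') {
        p++;
        p += strspn(p, "0123456789");     /* precision */
    }
    return (unsigned char)*p;
}
```

For "something % Something" this reports 'S', and for a line ending in
"% " it reports 0. Either way printf would be treating log text as a
directive. On implementations that accept the XSI 'S' conversion (a
wide-character string), that would mean reading a pointer argument that
was never passed, which may explain why the failure is intermittent.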

* Re: Analyzing a SEG FAULT that gdb doesn't help with
From: Brian Inglis @ 2015-08-01 20:28 UTC
To: cygwin

Michael Enright <mike <at> kmcardiff.com> writes:
> On Fri, Jul 31, 2015 at 11:46 AM, Michael Enright wrote:
> > On Fri, Jul 31, 2015 at 5:51 AM, Jon TURNEY wrote:

> The program in question is passing strings to printf that (a) end with
> "% " or (b) in the middle have "% S". To be clear these strings are
> the sole argument so they are format strings. This happens tons of
> times during a run but eventually it crashes in printf, generating a
> stackdump unless the magic setting is set.
>
> As I read the posix spec, % can be followed by flags and space is
> actually a flag. This flag affects how signs are handled for numeric
> output. So it could be that the code is trying to deal with
> %<flag><conversion-char> and S is not a valid conversion char. My
> attempts to reproduce this outside the evil program have not worked.
> The output is a little crazy when you printf("something % Something")
> but in my test program it doesn't crash. I tried printing the strings
> that the real program might have to deal with but this didn't cause a
> crash either.

Seems like the problem may be developer confusion between strftime and
printf conversion flag prefixes. The strftime space conversion flag
character is _ so space filled seconds should use %_S, whereas the
printf conversion flag character is space ' ', though I can't recall
ever using that, as it is normally the default.

* Re: Analyzing a SEG FAULT that gdb doesn't help with
From: Michael Enright @ 2015-08-02 1:55 UTC
To: cygwin

On Sat, Aug 1, 2015 at 1:28 PM, Brian Inglis wrote:
>
> Seems like the problem may be developer confusion between strftime and
> printf conversion flag prefixes. The strftime space conversion flag
> character is _ so space filled seconds should use %_S, whereas the printf
> conversion flag character is space ' ', though I can't recall ever using
> that, as it is normally the default.
>

Well this code is not related to the strftime code in my other thread.
In this case it was just that a programmer didn't understand why
people do printf("%s",string) instead of printf(string). The program
takes lines out of log files, snarfs information out of them and makes
a report. Sometimes what belongs in the report is a copy of the entire
line from the log, so printf(line). But that would seem unwise in
general. In particular these log lines have prompts in them, and as
you know prompts often have '%' characters in them. It was only a
matter of time before this habit became a bad one.

So my email above is really the conclusion of the investigation, and
the fix is for me to switch the code to fputs in such cases.
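
[Editorial sketch, not part of the original message.] The fix described
here, writing the log line as data rather than as a format string,
might look like the following. This is not the program's actual code:
emit_line is a hypothetical name, and appending a newline when the line
lacks one is an assumption about the desired report layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write line to out verbatim; '%' has no special meaning to fputs().
 * Adds a trailing newline if the line does not already end with one
 * (an assumption made for this sketch).
 * Returns 0 on success, -1 on a write error. */
int emit_line(FILE *out, const char *line)
{
    size_t len;

    assert(out != NULL && line != NULL);

    if (fputs(line, out) == EOF)
        return -1;

    len = strlen(line);
    if (len == 0 || line[len - 1] != '\n') {
        if (fputc('\n', out) == EOF)
            return -1;
    }
    return 0;
}
```

Where a format string is still wanted, printf("%s", line) is the
equivalent safe form; printf(line) is the one to avoid.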