public inbox for gcc@gcc.gnu.org
* gcc 3.3.6 - stack corruption questions
@ 2005-07-25 14:55 Louis LeBlanc
From: Louis LeBlanc @ 2005-07-25 14:55 UTC
  To: gcc

Hey folks.  I'm having some trouble with a process compiled with gcc
3.3.6.  This code is pretty complex and has several features that are not
typically in use because they involve non-production test cases.

The problem is that I'm getting core dumps (SEGV) that appear to come
from this code when I know it shouldn't be in the execution path.  The
code in question is switched on by a command line argument only, and the
process is managed by a parent process that monitors and manages its
execution, reporting crashes and restarting it if necessary.

Here's my environment:
gcc 3.3.6 built on SunOS 5.8 sun4u sparc SUNW,Ultra-60;
app built on the same platform and executed on SunOS 5.8 sun4u sparc
  SUNW,UltraSPARC-IIi-cEngine.

The entire codebase is written in C, and is compiled as follows:
/usr/local/gcc-3.3.6/bin/gcc -ggdb -g3 -Wall -D_REENTRANT
-Wno-multichar -Wno-unused-function -D_SOLARIS -DUSE_DEV_POLL
-mcpu=ultrasparc -O2 -DTIMING=1 -DDB_TIMING=1  -Icommon/include
-I/opt/oracle/8.1.7/include -I/opt/oracle/8.1.7/rdbms/public  -c -o
store.o store.c

These problems have popped up time and again over the last 6 years,
going as far back as gcc 2.95, but gdb has never been able to tell me
any more than where the problem came from (the Solaris pstack utility
always agrees with gdb).  The crashes only show up after long run
times, and only after some thousands or even millions
of transactions.  The application is supposed to provide 99.97%
availability, so having this happen 12 times over the course of a
weekend is a bit concerning.  Sometimes a build will prove wonderfully
stable, but then a very small code change made to tweak some behavior
will completely destabilize it.

Recently, I added a handler to catch segfaults and bus errors to try
to extract more info through the ucontext interface.  It gives me a
little more explicit detail, but not much new information.  The
drawback is that the originating stack is not preserved as well as it
is without the handler.

At this point, I'm at a loss as to where to start.  This is a pretty
important codebase (to my employer, anyway) and the frequency of these
inexplicable problems is starting to cause some concern.

Any suggestions as to where to go next?  If I've forgotten any
potentially useful information please don't hesitate to request it.
Please CC me directly, as I am not on the dev list.

Thanks for your time.

Lou
-- 
Louis LeBlanc                                 leblanc@keyslapper.net
Fully Funded Hobbyist,                   KeySlapper Extrordinaire :þ
http://www.keyslapper.net                                       Ô¿Ô¬
Key fingerprint = C5E7 4762 F071 CE3B ED51  4FB8 AF85 A2FE 80C8 D9A2

Flugg's Law:
  When you need to knock on wood is when you realize
  that the world is composed of vinyl, naugahyde and aluminum.
