Message-ID: <4A0D3327.3080308@redhat.com>
Date: Fri, 15 May 2009 09:17:00 -0000
From: Andrew Haley
To: Ben Gardiner
CC: GCJ
Subject: Re: libSegFault.so and gcj
References: <4A0C38A1.4010300@nanometrics.ca> <4A0C40E7.6080907@redhat.com>
In-Reply-To: <4A0C40E7.6080907@redhat.com>
Mailing-List: contact java-help@gcc.gnu.org; run by ezmlm

Andrew Haley wrote:
> Ben Gardiner wrote:
>
>> We are running a gcj-compiled application on an embedded platform
>> (MPC852T). For reference our versions are gcc-4.0.1, glibc-2.3.3 and
>> linux-2.4.24 -- I know these versions are ancient, but please don't stop
>> reading here.
>>
>> We sometimes encounter segfaults in our application; that is to say that
>> it will terminate with 'Segmentation fault' on the console and return
>> 139. These occur rather infrequently, and we have yet to find a reliable
>> way to reproduce them. To make things more difficult, we do not have
>> room for core dumps on our filesystem.
>>
>> I thought that we could get some information about these segfaults
>> by using the preload library libSegFault.so; I tested it and integrated
>> it with our init scripts and let it loose into our releases hoping that
>> a backtrace or two would come back to me. None did; there was no output
>> produced by libSegFault.so at all.
>>
>> I think that since gcj registers its own segfault handler which
>> translates segv signals into NullPointerExceptions, the original signals
>> never make it to libSegFault's handler. Gcj registers its handler,
>> catch_segv (from prims.cc:146 in our version of gcj), in INIT_SEGV
>> (powerpc-signal.h:62) called from _Jv_CreateJavaVM (prims.cc:1211). Here
>> is a snippet of INIT_SEGV:
>>
>> #define INIT_SEGV \
>> do \
>>   { \
>>     struct kernel_old_sigaction kact; \
>>     kact.k_sa_handler = catch_segv; \
>>     kact.k_sa_mask = 0; \
>>     kact.k_sa_flags = 0; \
>>     if (syscall (SYS_sigaction, SIGSEGV, &kact, NULL) != 0) \
>>       __asm__ __volatile__ (".long 0"); \
>>   } \
>> while (0)
>>
>> and of catch_segv:
>>
>> SIGNAL_HANDLER (catch_segv)
>> {
>>   java::lang::NullPointerException *nullp
>>     = new java::lang::NullPointerException;
>>   unblock_signal (SIGSEGV);
>>   MAKE_THROW_FRAME (nullp);
>>   throw nullp;
>> }
>>
>> I don't know a whole lot about signal handlers -- please correct me if
>> I'm wrong: I think that since the syscall (SYS_sigaction,...) passes
>> NULL as the fourth argument, that gcj is disregarding the presence of
>> any previously registered signal handlers.
>
> Correct. gcj treats all segfaults as null pointer exceptions.
>
>> I also think that since the
>> flags are zero that catch_segv is executed on the same stack as the
>> process that threw the signal instead of the alternate stack.
>
> Also correct.
>
>> I reason from this that the segfaults are likely stack overflows. Could
>> anyone confirm this?
>
> That's quite possible. Do you not have a debugger?
>
> Clearly if it really is a stack overflow then you're not going to be
> able to call the null pointer handler. There is a way around this,
> though. If you use the -fstack-check option gcc generates a probe
> at the start of every method that writes a zero some 12kbytes below
> the stack pointer. This will give you enough stack space for the
> catch_segv handler to run.

Although you'll have to make very sure that the catch_segv handler is
built *without* the -fstack-check option!

One other thing that you can use to detect stack overflow: compile with
`-finstrument-functions'. This might be the easiest way to do it.

Andrew.
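
For illustration, the two points confirmed above (the NULL old-action
argument and the zero flags) can be sketched in plain C. This is a
minimal sketch and not gcj's code: it assumes a POSIX system, uses the
libc sigaction() wrapper rather than the raw syscall, and the names
demo_segv_handler, install_segv_handler_on_alt_stack, alt_stack and
ALT_STACK_SIZE are hypothetical. It sets SA_ONSTACK so the handler runs
on a stack registered with sigaltstack(), and it passes a non-NULL
old-action pointer so the previously installed handler is recorded
instead of being discarded.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700          /* expose sigaltstack() and SA_ONSTACK */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024) /* hypothetical size, chosen for the demo */

/* Memory reserved for the alternate signal stack. */
static char alt_stack[ALT_STACK_SIZE];

/* Filled in by sigaction(): whatever handler was installed before ours. */
struct sigaction previous_action;

/* -1 until the handler has run; then 1 if it ran on alt_stack, else 0. */
volatile sig_atomic_t ran_on_alt_stack = -1;

/* Demo handler: only records which stack it is executing on. */
void demo_segv_handler(int signo)
{
    char marker;                           /* lives on the current stack */
    uintptr_t here = (uintptr_t)&marker;
    uintptr_t base = (uintptr_t)alt_stack;

    (void)signo;
    ran_on_alt_stack = (here >= base && here < base + ALT_STACK_SIZE);
}

/* Registers the alternate stack, then installs the handler with
   SA_ONSTACK. Returns 0 on success, -1 on failure. */
int install_segv_handler_on_alt_stack(void)
{
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = alt_stack;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = demo_segv_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;              /* INIT_SEGV uses 0 here */

    /* Third argument is non-NULL, so the old disposition is saved. */
    return sigaction(SIGSEGV, &sa, &previous_action);
}
```

Calling install_segv_handler_on_alt_stack() and then raise (SIGSEGV)
should leave ran_on_alt_stack set to 1. A real handler for a genuine
fault could not simply return as this one does; it would have to
report and exit, or hand over to previous_action.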