From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4A0C40E7.6080907@redhat.com>
Date: Thu, 14 May 2009 16:04:00 -0000
From: Andrew Haley
To: Ben Gardiner
CC: GCJ
Subject: Re: libSegFault.so and gcj
References: <4A0C38A1.4010300@nanometrics.ca>
In-Reply-To: <4A0C38A1.4010300@nanometrics.ca>
Mailing-List: contact java-help@gcc.gnu.org; run by ezmlm

Ben Gardiner wrote:
> We are running a gcj-compiled application on an embedded platform
> (MPC852T). For reference our versions are gcc-4.0.1, glibc-2.3.3 and
> linux-2.4.24 -- I know these versions are ancient, but please don't stop
> reading here.
>
> We sometimes encounter segfaults in our application; that is to say that
> it will terminate with 'Segmentation fault' on the console and return
> 139. These occur rather infrequently, and we have yet to find a reliable
> way to reproduce them. To make things more difficult, we do not have
> room for core dumps on our filesystem.
>
> I thought that we could get some information about these segfaults
> by using the preload library libSegFault.so; I tested it and integrated
> it with our init scripts and let it loose into our releases hoping that
> a backtrace or two would come back to me. None did; there was no output
> produced by libSegFault.so at all.
>
> I think that since gcj registers its own segfault handler which
> translates segv signals into NullPointerExceptions, the original signals
> never make it to libSegFault's handler. Gcj registers its handler,
> catch_segv (from prims.cc:146 in our version of gcj), in INIT_SEGV
> (powerpc-signal.h:62) called from _Jv_CreateJavaVM (prims.cc:1211). Here
> is a snippet of INIT_SEGV:
>
> #define INIT_SEGV \
> do \
>   { \
>     struct kernel_old_sigaction kact; \
>     kact.k_sa_handler = catch_segv; \
>     kact.k_sa_mask = 0; \
>     kact.k_sa_flags = 0; \
>     if (syscall (SYS_sigaction, SIGSEGV, &kact, NULL) != 0) \
>       __asm__ __volatile__ (".long 0"); \
>   } \
> while (0)
>
> and of catch_segv:
>
> SIGNAL_HANDLER (catch_segv)
> {
>   java::lang::NullPointerException *nullp
>     = new java::lang::NullPointerException;
>   unblock_signal (SIGSEGV);
>   MAKE_THROW_FRAME (nullp);
>   throw nullp;
> }
>
> I don't know a whole lot about signal handlers -- please correct me if
> I'm wrong: I think that since the syscall (SYS_sigaction,...) passes
> NULL as the fourth argument, that gcj is disregarding the presence of
> any previously registered signal handlers.

Correct. gcj treats all segfaults as null pointer exceptions.

> I also think that since the
> flags are zero that catch_segv is executed on the same stack as the
> process that threw the signal instead of the alternate stack.

Also correct.

> I reason from this that the segfaults are likely stack overflows. Could
> anyone confirm this?

That's quite possible. Do you not have a debugger?

Clearly if it really is a stack overflow then you're not going to be
able to call the null pointer handler. There is a way around this,
though. If you use the -fstack-check option, gcc generates a probe at
the start of every method that writes a zero some 12 kbytes below the
stack pointer. This will give you enough stack space for the
catch_segv handler to run.

> Could we patch INIT_SEGV somehow so that signals not caught by
> catch_segv will be passed up so that libSegFault.so can catch them?

No. They're all caught.

Andrew.
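
For reference, the two points confirmed above (the NULL old-action
argument and the zero flags) correspond to two standard POSIX
facilities: sigaction() can hand back the previously installed action,
and SA_ONSTACK together with sigaltstack() runs the handler on a
separate stack. The following is only a minimal sketch of those two
facilities in plain C. It is not what libgcj does, and since catch_segv
throws rather than returns it is not a drop-in patch for INIT_SEGV. All
function and variable names are hypothetical, and earlier_handler merely
stands in for a handler installed first, such as one from a preload
library.

```c
#define _XOPEN_SOURCE 700   /* for sigaltstack, SA_ONSTACK, siginfo_t */
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; nothing here is taken from libgcj. */
static struct sigaction previous_segv_action;  /* action installed before ours */
static volatile sig_atomic_t segv_seen;        /* set when our handler runs */
static volatile sig_atomic_t earlier_seen;     /* set when the earlier handler runs */

/* Stand-in for a handler that was registered before ours. */
void earlier_handler(int sig)
{
    (void)sig;
    earlier_seen = 1;
}

/* Our handler: note the fault, then pass it on to the saved action. */
void chained_segv_handler(int sig, siginfo_t *info, void *context)
{
    segv_seen = 1;

    if (previous_segv_action.sa_flags & SA_SIGINFO) {
        previous_segv_action.sa_sigaction(sig, info, context);
    } else if (previous_segv_action.sa_handler == SIG_DFL ||
               previous_segv_action.sa_handler == SIG_IGN) {
        /* No earlier handler function: put the old action back, so that a
           real fault terminates the process when the instruction is retried. */
        sigaction(sig, &previous_segv_action, NULL);
    } else {
        previous_segv_action.sa_handler(sig);
    }
}

/* Returns 0 on success, -1 on failure. */
int install_chained_segv_handler(void)
{
    stack_t alt_stack;
    struct sigaction action;

    /* Give the handler its own stack so it can still run after the
       normal stack has overflowed. */
    alt_stack.ss_sp = malloc(SIGSTKSZ);
    if (alt_stack.ss_sp == NULL)
        return -1;
    alt_stack.ss_size = SIGSTKSZ;
    alt_stack.ss_flags = 0;
    if (sigaltstack(&alt_stack, NULL) != 0)
        return -1;

    memset(&action, 0, sizeof action);
    action.sa_sigaction = chained_segv_handler;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_SIGINFO | SA_ONSTACK;

    /* A non-NULL third argument saves the old action instead of
       discarding it. */
    return sigaction(SIGSEGV, &action, &previous_segv_action);
}
```

A caller would run install_chained_segv_handler() once during startup,
after any handler it wants to chain to has been registered. Whether the
earlier handler can do anything useful by then depends on how much of
the faulting context is still intact.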