From: David Daney
To: Ben Gardiner
CC: Andrew Haley, GCJ
Subject: Re: libSegFault.so and gcj
Date: Fri, 15 May 2009 20:49:00 -0000
Message-ID: <4A0DD4C4.3050006@caviumnetworks.com>
In-Reply-To: <4A0DCF4D.1030506@nanometrics.ca>
References: <4A0C38A1.4010300@nanometrics.ca> <4A0C40E7.6080907@redhat.com> <4A0DCF4D.1030506@nanometrics.ca>

Ben Gardiner wrote:
> Andrew Haley wrote:
>>> I reason from this that the segfaults are likely stack overflows. Could
>>> anyone confirm this?
>>>
>> That's quite possible. Do you not have a debugger?
>>
>> Clearly if it really is a stack overflow then you're not going to be
>> able to call the null pointer handler. There is a way around this,
>> though. If you use the -fstack-check option gcc generates a probe
>> at the start of every method that writes a zero some 12kbytes below
>> the stack pointer. This will give you enough stack space for the
>> catch_segv handler to run.
> David Daney wrote:
>> Usually if you die with a SIGSEGV, it is due to stack overflow.
>> Probably for one reason or another you are getting a fault during the
>> NullPointerException processing which causes the signal handler to be
>> reentered recursively. This goes on until the stack overflows and the
>> kernel then kills the process. If you could attach a debugger to the
>> process, that might shed some light on exactly what is happening.
>> Assuming that it is not normal for your application to take
>> NullPointerExceptions it shouldn't be too tedious.
> Andrew and David, thank you for your insights and for the speed with
> which they were provided.
>
> About the debugger; I agree it would be the easiest way to figure out
> what's going on here. Since I don't know how to reproduce the problem I
> was hoping to get some information from our devices in the field if and
> when they die of a segmentation fault.
>
> Would it be possible -- and if so, are there any significant drawbacks
> -- to store the previous handler in INIT_SEGV and register it when
> catch_segv is entered, then re-register catch_segv on the way out? Would
> this allow the segv signal to be passed up to libSegFault.so's handler
> when it would have otherwise resulted in a recursive dead-end?
>

I think it would be very difficult to get that to work.

You would have to restore the handler in catch_segv, but where would you
reregister catch_segv? Actually I am not sure, but it is in either the
java personality routine or the unwinder in libgcc.

The problem is: what do you do for a multi-threaded application? Thread
A has to be able to handle SIGSEGV while thread B is unwinding its own
exceptions. Making sure that there were no race conditions could be
difficult.

David Daney