From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4A0C38A1.4010300@nanometrics.ca>
Date: Thu, 14 May 2009 15:28:00 -0000
From: Ben Gardiner
To: GCJ
Subject: libSegFault.so and gcj
Mailing-List: contact java-help@gcc.gnu.org; run by ezmlm
X-SW-Source: 2009-05/txt/msg00046.txt.bz2

Hello all,

We are running a gcj-compiled application on an embedded platform (MPC852T). For reference, our versions are gcc-4.0.1, glibc-2.3.3 and linux-2.4.24 -- I know these versions are ancient, but please don't stop reading here.

We sometimes encounter segfaults in our application: it terminates with 'Segmentation fault' on the console and exits with status 139 (128 + 11, i.e. killed by SIGSEGV).
These occur rather infrequently, and we have yet to find a reliable way to reproduce them. To make things more difficult, we do not have room for core dumps on our filesystem.

I thought that we could get some information about these segfaults by using the preload library libSegFault.so. I tested it, integrated it with our init scripts and let it loose in our releases, hoping that a backtrace or two would come back to me. None did; libSegFault.so produced no output at all.

I think that since gcj registers its own segfault handler, which translates SIGSEGV signals into NullPointerExceptions, the original signals never make it to libSegFault's handler. Gcj registers its handler, catch_segv (from prims.cc:146 in our version of gcj), in INIT_SEGV (powerpc-signal.h:62), which is called from _Jv_CreateJavaVM (prims.cc:1211). Here is a snippet of INIT_SEGV:

    #define INIT_SEGV                                               \
    do                                                              \
      {                                                             \
        struct kernel_old_sigaction kact;                           \
        kact.k_sa_handler = catch_segv;                             \
        kact.k_sa_mask = 0;                                         \
        kact.k_sa_flags = 0;                                        \
        if (syscall (SYS_sigaction, SIGSEGV, &kact, NULL) != 0)     \
          __asm__ __volatile__ (".long 0");                         \
      }                                                             \
    while (0)

and of catch_segv:

    SIGNAL_HANDLER (catch_segv)
    {
      java::lang::NullPointerException *nullp
        = new java::lang::NullPointerException;
      unblock_signal (SIGSEGV);
      MAKE_THROW_FRAME (nullp);
      throw nullp;
    }

I don't know a whole lot about signal handlers, so please correct me if I'm wrong. I think that since the syscall (SYS_sigaction, ...) passes NULL as its fourth argument (the pointer that would receive the old action), gcj never records any previously registered handler and so disregards it. I also think that since the flags are zero (SA_ONSTACK is not set), catch_segv is executed on the same stack as the code that raised the signal rather than on an alternate stack. I reason from this that the segfaults are likely stack overflows. Could anyone confirm this?

Could we patch INIT_SEGV somehow so that signals not handled by catch_segv are passed on, so that libSegFault.so can catch them?
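A minimal sketch of that kind of chaining, written against plain POSIX sigaction() rather than the raw syscall used in INIT_SEGV. None of this is gcj code: the names previous_segv, chaining_segv_handler, install_chaining_segv_handler, earlier_handler and demo_chain are hypothetical, and the sketch assumes the other handler (libSegFault.so's, in our case) was installed before ours. It also leaves out the hard part, which is deciding whether a given fault is a Java null dereference or something else; here every signal is simply treated as "not ours" and forwarded.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical: the action that was registered before ours. */
static struct sigaction previous_segv;

/* Hypothetical stand-in for an earlier handler such as libSegFault's. */
static volatile sig_atomic_t earlier_handler_ran;
static void earlier_handler(int sig)
{
    (void)sig;
    earlier_handler_ran = 1;
}

/* Forwards every SIGSEGV to whatever was registered before us. */
static void chaining_segv_handler(int sig, siginfo_t *info, void *ctx)
{
    if (previous_segv.sa_flags & SA_SIGINFO) {
        /* Previous handler uses the three-argument form. */
        previous_segv.sa_sigaction(sig, info, ctx);
    } else if (previous_segv.sa_handler == SIG_DFL) {
        /* Nothing was registered: restore the default and re-raise,
           so the process still dies with SIGSEGV. */
        signal(sig, SIG_DFL);
        raise(sig);
    } else if (previous_segv.sa_handler != SIG_IGN) {
        /* Previous handler uses the one-argument form. */
        previous_segv.sa_handler(sig);
    }
}

/* Installs our handler and, unlike passing NULL, saves the old action. */
int install_chaining_segv_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = chaining_segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, &previous_segv);
}

/* Self-check: register an earlier handler, then ours, then send a
   synthetic SIGSEGV. Returns 1 if the earlier handler was reached,
   -1 if installation failed. */
int demo_chain(void)
{
    struct sigaction earlier;
    memset(&earlier, 0, sizeof earlier);
    earlier.sa_handler = earlier_handler;
    sigemptyset(&earlier.sa_mask);
    if (sigaction(SIGSEGV, &earlier, NULL) != 0)
        return -1;
    if (install_chaining_segv_handler() != 0)
        return -1;
    raise(SIGSEGV); /* synthetic signal, so returning from the handlers is safe */
    return earlier_handler_ran;
}
```

The demo uses raise() only so that the handlers can return; for a real fault, returning normally from the handler re-executes the faulting instruction, so the chained handler has to terminate the process or otherwise not return.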
Is there another way to catch the cause of these segfaults?

Regards,
Ben Gardiner

Nanometrics Seismological Instruments
250 Herzberg Rd
Kanata ON CA K2K 2A1
613 592 6776 x239