From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 22305 invoked by alias); 15 Apr 2002 23:46:02 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 22290 invoked by uid 71); 15 Apr 2002 23:46:01 -0000
Date: Mon, 15 Apr 2002 16:46:00 -0000
Message-ID: <20020415234601.22289.qmail@sources.redhat.com>
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org,
From: "Boehm, Hans"
Subject: RE: java/6092: sparc-sun-solaris2.7 has hundreds of libjava failures with -m64
Reply-To: "Boehm, Hans"
X-SW-Source: 2002-04/txt/msg00807.txt.bz2
List-Id:

The following reply was made to PR java/6092; it has been noted by GNATS.

From: "Boehm, Hans"
To: "'tromey@redhat.com'" , "Boehm, Hans"
Cc: "Kaveh R. Ghazi" , gcc-gnats@gcc.gnu.org
Subject: RE: java/6092: sparc-sun-solaris2.7 has hundreds of libjava failures with -m64
Date: Mon, 15 Apr 2002 16:42:33 -0700

 > From: Tom Tromey [mailto:tromey@redhat.com]
 > Hans> A possible cause of that is confusion about where the roots are.
 > Hans> If you can check GC_stackbottom, check that the collector and
 > Hans> libgcj configuration agree on threads, get GC_dump() output, and
 > Hans> check the root locations against the nm output, that might
 > Hans> identify something.  Otherwise we need better debug information.
 >
 > In order to reduce the number of variables a bit, I tried gctest.  I
 > got the same problem.  Here's some info.
 >
 > (gdb) call GC_dump()
 > ***Static roots:
 > From 0x100106000 to 0x10012d8b8
 > Total size: 161976
 >
 > ***Heap sections:
 > Total heap size: 131072
 > Section 0 from 0x100178000 to 0x100198000 0/16 blacklisted
 >
 > ***Free blocks:
 > Free list 16 (Total size 131072):
 >     0x100178000 size 131072 not black listed
 > Total of 131072 bytes on free list
 >
 > ***Blocks in use:
 > (kind(0=ptrfree,1=normal,2=unc.,3=stubborn):size_in_bytes, #_marks_set)
 >
 > blocks = 0, bytes = 0

I hadn't appreciated the fact that this was dying during initialization.
That's further strong evidence that it in fact does have the root set
wrong.  Otherwise this looks plausible.

 > (gdb) p GC_stackbottom
 > $3 = 0xffffffff80000000
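A rough standalone cross-check of that value, as a sketch: print the
address of a local in main() next to the collector's notion of the stack
base; on SPARC the stack grows downward, so the local should sit just
below GC_stackbottom.  Note that GC_stackbottom is a collector-internal
variable rather than documented gc.h API, so the extern declaration below
is an assumption.

    /* Sketch (assumptions noted above): compare a stack address from
     * main() with the collector's recorded stack base. */
    #include <stdio.h>
    #include <gc.h>

    extern char *GC_stackbottom;   /* collector-internal, not documented gc.h API */

    int main(void)
    {
        int probe;                 /* a local near the top of the main stack */
        GC_INIT();                 /* let the collector compute its stack base */
        printf("&probe         = %p\n", (void *)&probe);
        printf("GC_stackbottom = %p\n", (void *)GC_stackbottom);
        return 0;
    }

Build it against the same collector, e.g. something like
cc -m64 probe.c -lgc, and compare the two printed addresses.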
 > I dug through the source a bit.  I think this machine doesn't define
 > USERLIMIT (a simple test program fails), so we must be using
 > HEURISTIC2.

I agree.  Can you print the stack pointer with gdb right after startup,
and verify that it's just below that value?

 > When using nm what am I looking for?  I looked at one global,
 > GC_gc_no.  Its address falls into the range that GC_dump prints for
 > the static roots.  Should I do this comparison for all the globals?

You want to get the symbols sorted in numeric order (nm -n on Linux; you
may need nm -p | sort on Solaris).  I would then check that:

a) It looks like all data and bss symbols are included in the "Static
   roots" interval, and

b) It doesn't look like there are any obvious unmapped holes in the
   "Static roots" interval.

I would guess that (b) is the problem here.  It may be that the data
segment in fact starts at a higher address than the collector is
assuming.  The collector currently assumes the same spacing between etext
and the start of the data segment as with the 32-bit ABI.  I think this
doesn't work well if the page size exceeds 64K, which may have been
perceived as a problem by the designers of the 64-bit ABI.

If this doesn't lead to a diagnosis of the problem, it would also be very
helpful to extract the fault address somehow.  If you can determine the
faulting instruction and the corresponding register contents, that should
be fairly easy.  It wasn't clear to me whether gdb is sufficiently
functional to do that.

Hans
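To check guess (b) concretely, a small sketch like the following prints
the linker-defined segment boundaries so they can be compared against the
"Static roots" interval reported by GC_dump().  It assumes the
traditional etext/edata/end symbols, which both Solaris and Linux
provide; the in_data/in_bss globals are just illustrative markers.

    /* Sketch: print where the text segment ends and where data and bss
     * actually live, for comparison with the collector's root interval. */
    #include <stdio.h>

    extern char etext, edata, end;   /* end of text, initialized data, and bss */
    static int in_data = 42;         /* lands in the data segment */
    static int in_bss;               /* lands in bss */

    int main(void)
    {
        printf("etext   = %p\n", (void *)&etext);
        printf("edata   = %p\n", (void *)&edata);
        printf("end     = %p\n", (void *)&end);
        printf("in_data = %p\n", (void *)&in_data);
        printf("in_bss  = %p\n", (void *)&in_bss);
        return 0;
    }

If in_data and in_bss turn out to sit well above where the collector's
32-bit-ABI spacing from etext would place the start of the data segment,
that would be consistent with the unmapped-hole guess in (b).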