GC_descr_obj_sz miscompilation (was RE: GC_enable_incremental())

public inbox for gcc@gcc.gnu.org

* GC_descr_obj_sz miscompilation (was RE: GC_enable_incremental())
@ 2004-09-10  0:20 Boehm, Hans
  2004-09-10  6:15 ` Ranjit Mathew
  2004-09-10  6:32 ` Richard Henderson
  0 siblings, 2 replies; 5+ messages in thread
From: Boehm, Hans @ 2004-09-10  0:20 UTC
  To: Boehm, Hans, Bryce McKinlay
  Cc: 'DHollenbeck', 'gcc@gcc.gnu.org',
	'java@gcc.gnu.org'

I've been spending some time trying to debug the "make check"
failure in the Java garbage collector on my IA64 machine in a
tree that's about 2 weeks old.  As far as I can tell, there
are two issues, neither of which turns out to be related to
incremental GC:

1) The multithreaded GC doesn't run under gdb 6.1.  I suspect
some signals are being misdirected, or the like.  (I mention
this here only because it might frustrate others, too.)

2) As far as I can tell, GC_descr_obj_size in typd_mlc.c
appears to be getting badly miscompiled at -O1 and higher.
Details follow, but I haven't had time to try to track
this down completely.  (3.3.4 generates correct code which
also contains less other silliness than the CVS version,
at least for this function.) 

Does (2) ring any bells?  I'm having trouble building the
current CVS tree to verify that this problem is there as well.

Do we know that on X86 and X86_64, the "make check" failure
in boehm-gc is due to incremental mode?  Or might that be a
similar issue?

Hans

Source:

word GC_descr_obj_size(d)
register complex_descriptor *d;
{
    switch(d -> TAG) {
      case LEAF_TAG:
      	return(d -> ld.ld_nelements * d -> ld.ld_size);
      case ARRAY_TAG:
        return(d -> ad.ad_nelements
               * GC_descr_obj_size(d -> ad.ad_element_descr));
      case SEQUENCE_TAG:
        return(GC_descr_obj_size(d -> sd.sd_first)
               + GC_descr_obj_size(d -> sd.sd_second));
      default:
        ABORT("Bad complex descriptor");
        /*NOTREACHED*/ return 0; /*NOTREACHED*/
    }
}

Abbreviated annotated assembly code at -O1 for the LEAF_TAG (= 1) case:

	.global GC_descr_obj_size#
	.proc GC_descr_obj_size#
GC_descr_obj_size:
	.prologue 12, 33
	.save ar.pfs, r34
	alloc r34 = ar.pfs, 1, 3, 1, 0
	adds r16 = -16, r12
	.fframe 32
	adds r12 = -32, r12		// Reserve frame to save fp registers.
						// Should have used scratch registers, but ...
	mov r35 = r1
	.save rp, r33
	mov r33 = b0
	;;
	.save.f 0x1
	stf.spill [r16] = f2, 16	// Save fp registers.
	;;
	.save.f 0x2
	stf.spill [r16] = f3
	.body
	addl r14 = 1, r0
	;;
	setf.sig f2 = r14			// f2.sig = 1; f3.sig = 0;
	setf.sig f3 = r0
.L129:
	ld8 r14 = [r32]
	;;
	cmp.eq p6, p7 = 2, r14
	(p6) br.cond.dpnt .L121
	;;
	cmp.eq p6, p7 = 3, r14
	(p6) br.cond.dptk .L122
	;;
	cmp.eq p6, p7 = 1, r14
	(p7) br.cond.dptk .L119
	br .L123				// For LEAF_TAG (1), we go here.
	;;
.L121:...
.L122:...
.L119:...
.L123:
	getf.sig r14 = f3		// Unconditionally zero
	;;
	add r8 = r32, r14		// Adds argument, which is a pointer!
					// The necessary multiplication seems
					// to have disappeared completely,
					// as did the reason for using an fp
					// register in this path.
	adds r17 = 16, r12
	mov ar.pfs = r34
	mov b0 = r33
	;;
	ldf.fill f2 = [r17], 16
	;;
	ldf.fill f3 = [r17]
	.restore sp
	adds r12 = 32, r12
	br.ret.sptk.many b0	// Result is pointer value in r8.  Oops.
	;;
	.endp GC_descr_obj_size#
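
For a bug report, a reduced standalone version of the function may be
useful.  The following is only a sketch under stated assumptions: the
descr type, its field names, and descr_obj_size() are hypothetical
simplifications, not the real complex_descriptor layout from
typd_mlc.c, and the ARRAY_TAG and SEQUENCE_TAG values are inferred
from the comparisons in the assembly above.  It is not known whether
this reduced form triggers the same miscompilation.

```c
#include <stdlib.h>

typedef unsigned long word;

/* LEAF_TAG = 1 is stated above; the other two values are assumptions
   inferred from the cmp.eq instructions in the listing. */
enum { LEAF_TAG = 1, ARRAY_TAG = 2, SEQUENCE_TAG = 3 };

/* Hypothetical, simplified stand-in for complex_descriptor. */
typedef struct descr {
    word tag;
    union {
        struct { word size; word nelements; } leaf;
        struct { word nelements; const struct descr *element; } array;
        struct { const struct descr *first; const struct descr *second; } seq;
    } u;
} descr;

/* Same recursion as GC_descr_obj_size: a leaf is count * size, an
   array is count * element size, a sequence is the sum of both parts. */
word descr_obj_size(const descr *d)
{
    switch (d->tag) {
    case LEAF_TAG:
        return d->u.leaf.nelements * d->u.leaf.size;
    case ARRAY_TAG:
        return d->u.array.nelements * descr_obj_size(d->u.array.element);
    case SEQUENCE_TAG:
        return descr_obj_size(d->u.seq.first)
               + descr_obj_size(d->u.seq.second);
    default:
        abort();    /* stands in for ABORT("Bad complex descriptor") */
    }
}
```

With a leaf of 3 elements of size 8, a correct build returns 24.  A
build showing the problem described above would instead return
something derived from the pointer argument.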


> -----Original Message-----
> From: Boehm, Hans 
> Sent: Wednesday, September 08, 2004 8:34 AM
> To: Bryce McKinlay
> Cc: Boehm, Hans; 'DHollenbeck'; java@gcc.gnu.org
> Subject: Re: GC_enable_incremental()
> 
> 
> 
> 
> On Tue, 7 Sep 2004, Bryce McKinlay wrote:
> 
> > Boehm, Hans wrote:
> >
> > >The real-time thread is running Java code?
> > >
> > >There are several issues with getting incremental collection to work
> > >in 3.4:
> > >
> > >1) There is a bug in pthread_support.c which causes the thread stopping
> > >signal handler to disable signals and then write to a heap object.  This
> > >is bad if that object is write protected, something that can happen with
> > >the incremental collector.  Hence the "killed" message.  This should be
> > >fixed in the CVS trunk.  The fix should probably be backported to 3.4,
> > >but it hasn't been.  The problem seems to show up only with 2.6 kernels.
> > >
> > >
> > Hans, I still see a problem with incremental mode using CVS trunk, on a
> > 2.6 kernel both on x86 and x86_64:
> >
> > $ GC_ENABLE_INCREMENTAL=1 ./gctest
> > Segmentation fault
> >
> > (gdb) r
> > Starting program: /home/mckinlay/tests/gctest
> > [Thread debugging using libthread_db enabled]
> > [New Thread -151066912 (LWP 8129)]
> > [New Thread 33545136 (LWP 8132)]
> > [New Thread 87886768 (LWP 8133)]
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 33545136 (LWP 8132)]
> > GC_restart_handler (sig=24) at pthread_stop_world.c:184
> > 184         me->stop_info.signal = SIG_THR_RESTART;
> >
> > I haven't looked at this extensively. Perhaps this segfault is "normal"
> > for incremental mode? The segfault location seems a bit random - I've
> > seen it sometimes in GC_finalize, sometimes in GC_restart_handler. It
> > does appear to be caused by the GC writing to write-protected heap
> > objects, because I can read them fine in gdb.
> I'm now also seeing a problem with "make check" in the GC directory
> on IA64.  I'll investigate further.  The segfault there may indeed be
> normal, but it should be caught.
> 
> >
> > >2) There hasn't been a systematic effort to get incremental GC debugged
> > >and tested in the context of gcj.  This is nontrivial, since it can break
> > >if a system call writes to a pointer-containing section of the heap.
> > >Hence the library has to play by a stricter set of rules.
> > >
> > >
> >
> > Is this really a problem? I can't think of any situation where a system
> > call would write to pointers on the heap. Unless perhaps
> > System.arraycopy() were implemented by a system call on some platform,
> > but I don't think this is the case usually.
> 
> The problem isn't really a system call writing pointers.  The problem
> is that if a system call writes into the heap, the GC needs to have
> been told in advance that that section of memory is pointer-free.
> Otherwise the system call may fail, since the memory may not be writable.
> And there is not always a way to recover.
> 
> It's possible that we already do this more or less right.  But
> I'm not sure it has been carefully explored.  To do this completely
> right, you should check GC_incremental_protection_needs() before
> calling GC_enable_incremental().  Even writing to pointer-free sections
> of the heap from a system call is unsafe if the physical page size is
> larger than the collector's block size.  But on X86 it should be fine.
> >
> > Regards
> >
> > Bryce
> >
> 
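
The check recommended in the quoted text above (query
GC_incremental_protection_needs() before calling
GC_enable_incremental()) could be wrapped roughly as follows.  This is
a sketch, not the collector's own code: the PROTECTS_* constants are
hypothetical stand-ins for the GC_PROTECTS_* bits that gc.h defines,
and both helper functions are invented for illustration.  Real code
should include gc.h and use its constants.

```c
/* Hypothetical stand-ins for the GC_PROTECTS_* bits from gc.h;
   the values here are assumptions made for this sketch. */
enum {
    PROTECTS_NONE         = 0,
    PROTECTS_POINTER_HEAP = 1,  /* pointer-containing heap may be protected */
    PROTECTS_PTRFREE_HEAP = 2   /* pointer-free heap may be protected too   */
};

/* Nonzero if a system call may safely write into heap memory that the
   collector was told is pointer-free, given the protection mask. */
int syscall_write_to_ptrfree_ok(int needs)
{
    return (needs & PROTECTS_PTRFREE_HEAP) == 0;
}

/* Nonzero if a system call may safely write into heap memory that may
   contain pointers, given the protection mask. */
int syscall_write_to_pointer_heap_ok(int needs)
{
    return (needs & PROTECTS_POINTER_HEAP) == 0;
}
```

In a real program the needs argument would be the value returned by
GC_incremental_protection_needs(), obtained before
GC_enable_incremental() is called, so that the caller can decide
whether incremental mode is safe for its I/O buffers.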


* Re: GC_descr_obj_sz miscompilation (was RE: GC_enable_incremental())
  2004-09-10  0:20 GC_descr_obj_sz miscompilation (was RE: GC_enable_incremental()) Boehm, Hans
@ 2004-09-10  6:15 ` Ranjit Mathew
  2004-09-10 17:24   ` Bryce McKinlay
  2004-09-10  6:32 ` Richard Henderson
  1 sibling, 1 reply; 5+ messages in thread
From: Ranjit Mathew @ 2004-09-10  6:15 UTC
  To: Boehm, Hans; +Cc: 'gcc@gcc.gnu.org', 'java@gcc.gnu.org'

Boehm, Hans wrote:
> 
> 1) The multithreaded GC doesn't run under gdb 6.1.  I suspect
> some signals are being misdirected, or the like.  (I mention
> this here only because it might frustrate others, too.)

You might have better luck with GDB 6.2/6.2.1. From
the NEWS file for 6.2:
--------------------------- 8< ---------------------------
* Fix for ``many threads''

On GNU/Linux systems that use the NPTL threads library, a program
rapidly creating and deleting threads would confuse GDB leading to the
error message:

        ptrace: No such process.
        thread_db_get_info: cannot get thread info: generic error

This problem has been fixed.

[...]

* Signal trampoline code overhauled

Many generic problems with GDB's signal handling code have been fixed.
These include: backtraces through non-contiguous stacks; recognition
of sa_sigaction signal trampolines; backtrace from a NULL pointer
call; backtrace through a signal trampoline; step into and out of
signal handlers; and single-stepping in the signal trampoline.

Please note that kernel bugs are a limiting factor here.  These
features have been shown to work on an s390 GNU/Linux system that
include a 2.6.8-rc1 kernel.  Ref PR breakpoints/1702.
--------------------------- 8< ---------------------------

HTH,
Ranjit.

-- 
Ranjit Mathew          Email: rmathew AT gmail DOT com

Bangalore, INDIA.      Web: http://ranjitmathew.tripod.com/


* Re: GC_descr_obj_sz miscompilation (was RE: GC_enable_incremental())
  2004-09-10  0:20 GC_descr_obj_sz miscompilation (was RE: GC_enable_incremental()) Boehm, Hans
  2004-09-10  6:15 ` Ranjit Mathew
@ 2004-09-10  6:32 ` Richard Henderson
  2004-09-10  6:38   ` Ranjit Mathew
  1 sibling, 1 reply; 5+ messages in thread
From: Richard Henderson @ 2004-09-10  6:32 UTC
  To: Boehm, Hans
  Cc: Bryce McKinlay, 'DHollenbeck', 'gcc@gcc.gnu.org',
	'java@gcc.gnu.org'

On Thu, Sep 09, 2004 at 04:36:00PM -0700, Boehm, Hans wrote:
> Does (2) ring any bells?  I'm having trouble building the
> current CVS tree to verify that this problem is there as well.

Nope, no bells rung.  Do file a bug report so this doesn't
get lost.


r~


* Re: GC_descr_obj_sz miscompilation (was RE: GC_enable_incremental())
  2004-09-10  6:32 ` Richard Henderson
@ 2004-09-10  6:38   ` Ranjit Mathew
  0 siblings, 0 replies; 5+ messages in thread
From: Ranjit Mathew @ 2004-09-10  6:38 UTC
  To: Richard Henderson; +Cc: 'gcc@gcc.gnu.org', 'java@gcc.gnu.org'

Richard Henderson wrote:
> On Thu, Sep 09, 2004 at 04:36:00PM -0700, Boehm, Hans wrote:
> 
>>Does (2) ring any bells?  I'm having trouble building the
>>current CVS tree to verify that this problem is there as well.
> 
> 
> Nope, no bells rung.  Do file a bug report so this doesn't
> get lost.

...and do put "wrong-code" in the Keywords and
"[3.5 Regression]" in the bug summary.

Ranjit.

-- 
Ranjit Mathew          Email: rmathew AT gmail DOT com

Bangalore, INDIA.      Web: http://ranjitmathew.tripod.com/


* Re: GC_descr_obj_sz miscompilation (was RE: GC_enable_incremental())
  2004-09-10  6:15 ` Ranjit Mathew
@ 2004-09-10 17:24   ` Bryce McKinlay
  0 siblings, 0 replies; 5+ messages in thread
From: Bryce McKinlay @ 2004-09-10 17:24 UTC
  To: Ranjit Mathew
  Cc: Boehm, Hans, 'gcc@gcc.gnu.org', 'java@gcc.gnu.org'

Ranjit Mathew wrote:

>Boehm, Hans wrote:
>
>>1) The multithreaded GC doesn't run under gdb 6.1.  I suspect
>>some signals are being misdirected, or the like.  (I mention
>>this here only because it might frustrate others, too.)
>
>You might have better luck with GDB 6.2/6.2.1. From
>the NEWS file for 6.2:
>--------------------------- 8< ---------------------------
>* Fix for ``many threads''

I still have problems with heavily threaded Java programs freezing under
gdb 6.2 (and current CVS). A workaround is to set
LD_ASSUME_KERNEL=2.4.19 in the environment, which forces the use of
LinuxThreads rather than NPTL.

Regards

Bryce

