public inbox for java@gcc.gnu.org
 help / color / mirror / Atom feed
* JVMTI events system and _jvmtiEnvironments uninitialised memory problem?  [was Re: [PATCH PR42811,4.5 regression] java.lang.ExceptionInInitializerError  in ecj1]
       [not found] ` <4B6D0C67.5030500@gmail.com>
@ 2010-02-14 22:23   ` Dave Korn
       [not found]   ` <4B78BA6D.5000603@gmail.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Dave Korn @ 2010-02-14 22:23 UTC (permalink / raw)
  To: Dave Korn; +Cc: GCC Java


    Good evening all,

  I've been trying to understand why my patch to try and fix PR42811 caused
failures in the "events" test from jvmti.exp:

On 06/02/2010 06:29, Dave Korn wrote:
> On 05/02/2010 22:26, Dave Korn wrote:

>> 	PR java/42811
>> 	* configure.ac (libgcj_spec_lgcj_override): Add undefined reference
>> 	to arbitrary symbol in noncore library.
>> 	(libgcj_spec_lgcj_bc_override): Likewise.

  I'm using configure flags:  '--enable-libjava' '--enable-interpreter'
'--disable-libgcj-debug' '--with-ecj-jar=/usr/share/java/ecj.jar', DW2 EH,
native target i686-pc-cygwin, languages c/c++/java, graphite enabled but no
LTO.  Don't suppose those last two are relevant.

>   It caused a regression:
> 
>> FAIL: events output
> 
>   For some reason, a couple of extra jvm event callbacks appeared:
> 
>> $ diff -pu events.out .e.o
>> --- events.out  2009-03-14 06:47:02.984375000 +0000
>> +++ .e.o        2010-02-06 06:04:50.218750000 +0000
>> @@ -4,9 +4,9 @@ created JVMTI environment #0
>>  created JVMTI environment #1
>>  created JVMTI environment #2
>>  setting callbacks for envs
>> -RequestedEvents:
>> +RequestedEvents: VMInit,VMDeath,
>>  enable VM_INIT for env0, env1, env2
>> -RequestedEvents: VMInit,
>> +RequestedEvents: VMInit,VMDeath,
>>  enable VM_DEATH for env1,env2
>>  RequestedEvents: VMInit,VMDeath,
>>  enable THREAD_END for env2

  Well, I've been tracking it down under the debugger (and it doesn't show up
if you launch the program directly under the debugger, you have to launch it
naturally and intercept it at an early breakpoint, which smells like a clue)
and it looks like it's caused by use of uninitialised memory.  Valgrind
doesn't run on windows, which is the only platform where the sublib-enabled
build currently works, so I don't have mechanical verification, but here's
what appears to be the sequence of events:

  The testcase creates three VM environments, and tests enabling and disabling
their VM event callbacks:

> static void
> do_enable_tests ()
> {
>   printf ("- enable tests -\n");
>   JavaVM *vm = _Jv_GetJavaVM ();
>   jvmtiEnv *env[3];
>   int i;
>   for (i = 0; i < 3; ++i)
>     {
>       vm->GetEnv (reinterpret_cast<void **> (&env[i]), JVMTI_VERSION_1_0);
>       printf ("created JVMTI environment #%d\n", i);
>     }
> 
>   jvmtiEventCallbacks callbacks;
>   memset (&callbacks, 0, sizeof (jvmtiEventCallbacks));
> 
>   printf ("setting callbacks for envs\n");
>   callbacks.VMInit = VMInitCB;
>   env[0]->SetEventCallbacks (&callbacks, sizeof (callbacks));
>   callbacks.VMDeath = VMDeathCB;
>   env[1]->SetEventCallbacks (&callbacks, sizeof (callbacks));
>   callbacks.ThreadEnd = ThreadEndCB;
>   env[2]->SetEventCallbacks (&callbacks, sizeof (callbacks));
>   print_events ();

  At this point in the test, none of the callbacks should actually be enabled;
even though some of the event callbacks have been set, the test hasn't called
SetEventNotificationMode to enable them yet; but as the failure diff above
shows, print_events(), which is where the "RequestedEvents:" trace comes from,
thinks that some of the callbacks are indeed enabled.  For reference, the
contents of the environment struct look like this, from include/jvmti_md.h:

> /* Contents of the jvmtiEnv; but only inside the implementation. */
> #define _CLASSPATH_JVMTIENV_CONTENTS					\
>   /* Event handlers registered via SetEventCallbacks */			\
>   jvmtiEventCallbacks callbacks;					\
> 									\
>   /* Array of event thread for which to report event. */		\
>   /* NULL means all threads. One for each callback.   */		\
>   jthread thread[EVENT_SLOTS];						\
>   									\
>   /* Array of notification modes for callbacks. */			\
>   /* One for each callback.                     */			\
>   bool enabled[EVENT_SLOTS];

  The vm->GetEnv() call that allocates the VMs passes through _Jv_JNI_GetEnv()
and ends up in _Jv_GetJVMTIEnv() in libjava/jvmti.cc:

> _Jv_JVMTIEnv *
> _Jv_GetJVMTIEnv (void)
> {
>   _Jv_JVMTIEnv *env
>     = (_Jv_JVMTIEnv *) _Jv_MallocUnchecked (sizeof (_Jv_JVMTIEnv));
>   env->p = &_Jv_JVMTI_Interface;
>   struct jvmti_env_list *element
>     = (struct jvmti_env_list *) _Jv_MallocUnchecked (sizeof (struct jvmti_env_list));
>   element->env = env;
>   element->next = NULL;
> 
>   _envListLock->writeLock ()->lock ();
>   if (_jvmtiEnvironments == NULL)
>     _jvmtiEnvironments = element;
>   else
>     {
>       struct jvmti_env_list *e;
>       for (e = _jvmtiEnvironments; e->next != NULL; e = e->next)
> 	;
>       e->next = element;
>     }
>   _envListLock->writeLock ()->unlock ();

  As far as I can see, that just mallocs a bunch of heap space, then
immediately shoves it live onto the global list of environments - still full
of uninitialised heap crud!  Sure enough, that's also how it's returned to the
calling code from do_enable_tests(), and although the calls to
SetEventCallbacks() fully initialise the callbacks member, the thread[] and
enabled[] arrays remain full of rubbish by the time we get to this loop in
libjava/jvmti.cc#check_enabled_event():

>   if (_jvmtiEnvironments != NULL)
>     {
>       _envListLock->readLock ()->lock ();
>       struct jvmti_env_list *e;
>       FOREACH_ENVIRONMENT (e)
> 	{
> 	  char *addr
> 	    = reinterpret_cast<char *> (&e->env->callbacks) + offset;
> 	  void **callback = reinterpret_cast<void **> (addr);
> 	  if (e->env->enabled[index] && *callback != NULL)
> 	    {
> 	      *enabled = true;
> 	      _envListLock->readLock ()->unlock ();
> 	      return;
> 	    }
> 	}
> 
>       _envListLock->readLock ()->unlock ();
>     }

and as a result it believes various events are enabled even though we haven't
yet called SetEventNotificationMode() to turn them on.

  I set a breakpoint in _Jv_GetJVMTIEnv(), just after the call to
_Jv_MallocUnchecked(), and manually called memset to zero out the
freshly-returned space, and it fixed the bug, so I'm just bootstrapping and
testing a patch to add an explicit memset call there (will post separately),
but I just wondered if there is supposed to be an assumption that the
_Jv_Malloc* routines return zeroed memory, or if possibly there should be a
_Jv_Calloc interface added?  Or have I somehow missed some place that should
have zeroed out the thread[] and enabled[] arrays?

  BTW, assuming this proposed patch passes testing and is agreed to be the
right thing to do, I'll be resurrecting my proposed fix for PR42811(*).  Heads
up, and maybe a little pre-emptive "ping!" in advance :)

    cheers,
      DaveK
-- 
(*) - http://gcc.gnu.org/ml/gcc-patches/2010-02/msg00261.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Two quick GC questions [was Re: [REVIVED: PATCH PR42811,4.5 regression]  java.lang.ExceptionInInitializerError in ecj1]
       [not found]         ` <4B7D631B.8030401@gmail.com>
@ 2010-02-18 20:41           ` Dave Korn
  2010-02-18 22:06             ` Boehm, Hans
                               ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Dave Korn @ 2010-02-18 20:41 UTC (permalink / raw)
  To: GCC Java; +Cc: Dave Korn

On 18/02/2010 15:56, Dave Korn wrote:

>>> Running /gnu/gcc/gcc/libjava/testsuite/libjava.loader/loader.exp ...
>>> FAIL: /gnu/gcc/obj-pr42811-3/i686-pc-cygwin/libjava/testsuite/TestEarlyGC.exe output - /gnu/gcc/obj-pr42811-3/i686-pc-cygwin/libjava/testsuite/TestEarlyGC.exe
>> *sigh*, and it turns out to be because some memory is getting released by the
>> garbage collector when its still in use.  I'm busy trying to learn how the GC
>> works real fast so I can figure out what's going wrong.

  Well, it turns out to be fairly straightforward: there is no code to
register the .data and .bss sections of the main executable as static data
regions with the GC, so it has no idea there's a live pointer to an object(*)
in there, and happily frees it up.

  This leads me to two quick questions about the GC:

1- There's no call to GC_INIT anywhere under libjava/, and I can't find
anything in boehm.cc that even looks suitable for the purpose.  Does anyone
know how the main exe's data/bss sections are supposed to get registered on a
posixy system?

2- Libgcj statically links its own personal private copy of boehm-gc, rather
than using a shared library; does anyone know why it was designed this way?

>   I'm now very sure that the problem is a side-effect of the libgcj DLL being
> rebased at runtime, and not related to the clearing of memory in
> _Jv_GetJVMTIEnv (which isn't even called in this testcase).

  This remains the case, btw.

    cheers,
      DaveK
-- 
(*) - The object in question happens to be the jstring "C" passed as a name to
defineClass, and next time through the loop it's been replaced by a hashmap
object and defineClass throws when it finds it has the wrong object type as an
argument.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: Two quick GC questions [was Re: [REVIVED: PATCH PR42811,4.5  regression]  java.lang.ExceptionInInitializerError in ecj1]
  2010-02-18 20:41           ` Two quick GC questions [was Re: [REVIVED: PATCH " Dave Korn
@ 2010-02-18 22:06             ` Boehm, Hans
  2010-02-19  7:04               ` Dave Korn
  2010-02-18 22:13             ` Bryce McKinlay
  2010-02-20 18:22             ` Dave Korn
  2 siblings, 1 reply; 7+ messages in thread
From: Boehm, Hans @ 2010-02-18 22:06 UTC (permalink / raw)
  To: Dave Korn, GCC Java

 

> -----Original Message-----
> From: java-owner@gcc.gnu.org [mailto:java-owner@gcc.gnu.org] 
> On Behalf Of Dave Korn
> Sent: Thursday, February 18, 2010 12:59 PM
> To: GCC Java
> Cc: Dave Korn
> Subject: Two quick GC questions [was Re: [REVIVED: PATCH 
> PR42811,4.5 regression] java.lang.ExceptionInInitializerError in ecj1]
> 
> On 18/02/2010 15:56, Dave Korn wrote:
> 
> >>> Running 
> /gnu/gcc/gcc/libjava/testsuite/libjava.loader/loader.exp ...
> >>> FAIL: 
> >>> 
> /gnu/gcc/obj-pr42811-3/i686-pc-cygwin/libjava/testsuite/TestEarlyGC.
> >>> exe output - 
> >>> 
> /gnu/gcc/obj-pr42811-3/i686-pc-cygwin/libjava/testsuite/TestEarlyGC.
> >>> exe
> >> *sigh*, and it turns out to be because some memory is getting 
> >> released by the garbage collector when its still in use.  I'm busy 
> >> trying to learn how the GC works real fast so I can figure 
> out what's going wrong.
> 
>   Well, it turns out to be fairly straightforward: there is 
> no code to register the .data and .bss sections of the main 
> executable as static data regions with the GC, so it has no 
> idea there's a live pointer to an object(*) in there, and 
> happily frees it up.
> 
>   This leads me to two quick questions about the GC:
> 
> 1- There's no call to GC_INIT anywhere under libjava/, and I 
> can't find anything in boehm.cc that even looks suitable for 
> the purpose.  Does anyone know how the main exe's data/bss 
> sections are supposed to get registered on a posixy system?
The GC normally does that internally, whether or not GC_INIT is called.  GC_init_inner() does not need to get called, but the collector tries to do that automatically when it can, though IIRC gcj should call GC_init()directly, possibly not through the GC_INIT() macro.

In the simplest case, GC_init_inner calls GC_register_data_segments.  In other cases, GC_register_dynamic_libraries does it, and GC_register_main_static_data() returns false,
avoiding the call the GC_register_data_segments.


> 
> 2- Libgcj statically links its own personal private copy of 
> boehm-gc, rather than using a shared library; does anyone 
> know why it was designed this way?
Way back when, it needed a custon GC configuration.  But those options have generally become runtime configurable, so I suspect that's no longer true, though it may need to make some initialization calls to get the right configuration.

Hans
> 
> >   I'm now very sure that the problem is a side-effect of the libgcj 
> > DLL being rebased at runtime, and not related to the clearing of 
> > memory in _Jv_GetJVMTIEnv (which isn't even called in this 
> testcase).
> 
>   This remains the case, btw.
> 
>     cheers,
>       DaveK
> --
> (*) - The object in question happens to be the jstring "C" 
> passed as a name to defineClass, and next time through the 
> loop it's been replaced by a hashmap object and defineClass 
> throws when it finds it has the wrong object type as an argument.
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Two quick GC questions [was Re: [REVIVED: PATCH PR42811,4.5   regression] java.lang.ExceptionInInitializerError in ecj1]
  2010-02-18 20:41           ` Two quick GC questions [was Re: [REVIVED: PATCH " Dave Korn
  2010-02-18 22:06             ` Boehm, Hans
@ 2010-02-18 22:13             ` Bryce McKinlay
  2010-02-19  3:22               ` Boehm, Hans
  2010-02-20 18:22             ` Dave Korn
  2 siblings, 1 reply; 7+ messages in thread
From: Bryce McKinlay @ 2010-02-18 22:13 UTC (permalink / raw)
  To: Dave Korn; +Cc: GCC Java

On Thu, Feb 18, 2010 at 8:58 PM, Dave Korn
<dave.korn.cygwin@googlemail.com> wrote:
> On 18/02/2010 15:56, Dave Korn wrote:
>
>>>> Running /gnu/gcc/gcc/libjava/testsuite/libjava.loader/loader.exp ...
>>>> FAIL: /gnu/gcc/obj-pr42811-3/i686-pc-cygwin/libjava/testsuite/TestEarlyGC.exe output - /gnu/gcc/obj-pr42811-3/i686-pc-cygwin/libjava/testsuite/TestEarlyGC.exe
>>> *sigh*, and it turns out to be because some memory is getting released by the
>>> garbage collector when its still in use.  I'm busy trying to learn how the GC
>>> works real fast so I can figure out what's going wrong.
>
>  Well, it turns out to be fairly straightforward: there is no code to
> register the .data and .bss sections of the main executable as static data
> regions with the GC, so it has no idea there's a live pointer to an object(*)
> in there, and happily frees it up.
>
>  This leads me to two quick questions about the GC:
>
> 1- There's no call to GC_INIT anywhere under libjava/, and I can't find
> anything in boehm.cc that even looks suitable for the purpose.  Does anyone
> know how the main exe's data/bss sections are supposed to get registered on a
> posixy system?

GC_init_gcj_malloc() is called from boehm.cc:_Jv_InitGC(), and that in
turn calls GC_init() if required.

> 2- Libgcj statically links its own personal private copy of boehm-gc, rather
> than using a shared library; does anyone know why it was designed this way?

Early on, the GC actually was in its own shared library (as was
libffi). The main reason to put these into libgcj was that it avoids
weird, hard-to-debug problems if you accidentally end up with the
wrong version of one library or the other. That's probably not such a
big deal now, but (windows limitations aside) I think its generally
better to have less shared libraries rather than more.

In theory, there could also be a performance advantage to having a
single library because a smart linker / linker script could optimize
internal calls to things like GC_malloc, speeding up the allocation
path by avoiding PLT indirection.

Bryce

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: Two quick GC questions [was Re: [REVIVED: PATCH PR42811,4.5    regression] java.lang.ExceptionInInitializerError in ecj1]
  2010-02-18 22:13             ` Bryce McKinlay
@ 2010-02-19  3:22               ` Boehm, Hans
  0 siblings, 0 replies; 7+ messages in thread
From: Boehm, Hans @ 2010-02-19  3:22 UTC (permalink / raw)
  To: Bryce McKinlay, Dave Korn; +Cc: GCC Java

> From: Bryce McKinlay
>
> In theory, there could also be a performance advantage to
> having a single library because a smart linker / linker
> script could optimize internal calls to things like
> GC_malloc, speeding up the allocation path by avoiding PLT
> indirection.
>
> Bryce
>
Also in theory, in this case, you should be able to have your cake and eat it, too.  The idea was that we should be able to inline the fast path that looks only at a small collection of the thread-local free-lists, but then calls into the GC library on the slow path.  GC7.x generally has the infratsructure for that.  But it's not clear anyone has the time to do this.  Last time I looked into this, the current gcj allocation path is still much slower than one would like, so this should help.

Hans

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Two quick GC questions [was Re: [REVIVED: PATCH PR42811,4.5 regression]   java.lang.ExceptionInInitializerError in ecj1]
  2010-02-18 22:06             ` Boehm, Hans
@ 2010-02-19  7:04               ` Dave Korn
  0 siblings, 0 replies; 7+ messages in thread
From: Dave Korn @ 2010-02-19  7:04 UTC (permalink / raw)
  To: Boehm, Hans; +Cc: Dave Korn, GCC Java

On 18/02/2010 22:05, Boehm, Hans wrote:
>> From: Dave Korn

>> Well, it turns out to be fairly straightforward: there is no code to
>> register the .data and .bss sections of the main executable as static
>> data regions with the GC, so it has no idea there's a live pointer to an
>> object(*) in there, and happily frees it up.

>> 1- There's no call to GC_INIT anywhere under libjava/, and I can't find
>> anything in boehm.cc that even looks suitable for the purpose.  Does
>> anyone know how the main exe's data/bss sections are supposed to get
>> registered on a posixy system?

> The GC normally does that internally, whether or not GC_INIT is called.
> GC_init_inner() does not need to get called, but the collector tries to do
> that automatically when it can, though IIRC gcj should call
> GC_init()directly, possibly not through the GC_INIT() macro.
> 
> In the simplest case, GC_init_inner calls GC_register_data_segments.  In
> other cases, GC_register_dynamic_libraries does it, and
> GC_register_main_static_data() returns false, avoiding the call the
> GC_register_data_segments.

  Thanks Hans, I think I've got it now.  The cygwin target ends up using the
generic GC_register_data_segments() implementation, which registers the minmax
range of the data and bss segments.  That did the job when cygwin was only
able to use libjava statically linked and everything was ending up in one
monolithic executable, but now we can build a shared libjava it only registers
the static data areas in the libgcj DLL, naturally.

  There's no definition for GC_register_dynamic_libraries at all, either, it
uses the empty definition that defines GC_no_dynamic_loading.  I could either
adapt the "parse /proc/self/maps file" strategy, or try making the win32 stuff
work; either way, it'll have to take care of registering the main exe's data
sections - which I guess is how things probably end up working on *nix hosts
too, when they parse the maps file.

>> 2- Libgcj statically links its own personal private copy of boehm-gc,
>> rather than using a shared library; does anyone know why it was designed
>> this way?
> Way back when, it needed a custon GC configuration.  But those options have
> generally become runtime configurable, so I suspect that's no longer true,
> though it may need to make some initialization calls to get the right
> configuration.

  Ah.  Well, when I'm building libjava as two separate sublibraries, I now
have the problem that each gets its own separate copy of the GC; that's
obviously liable to be disastrous.

  One solution would be to build a single shared GC library for them both to
reference, but I don't know what I'd have to do to get the runtime config correct.

  Another alternative would be to export the GC functions from the main libgcj
dll, for the benefit of the libgcj-noncore dll.  That's a little ugly because
we don't really want to expose those APIs to the user, but they shouldn't be
linking anything against them anyway, so we could just call it a private
interface between the two dlls and not worry about breaking binary
compatibility (or even removing altogether it in favour of an external shared
lib gc) in future.  This would probably be a safer change this close to the
release of 4.5.0, so unless any of the java maintainers have a fundamental
objection or can see a serious flaw in the suggestion, I'll go ahead and
prepare a patch that resolves the clash that way.

    cheers,
      DaveK

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Two quick GC questions [was Re: [REVIVED: PATCH PR42811,4.5 regression]  java.lang.ExceptionInInitializerError in ecj1]
  2010-02-18 20:41           ` Two quick GC questions [was Re: [REVIVED: PATCH " Dave Korn
  2010-02-18 22:06             ` Boehm, Hans
  2010-02-18 22:13             ` Bryce McKinlay
@ 2010-02-20 18:22             ` Dave Korn
  2 siblings, 0 replies; 7+ messages in thread
From: Dave Korn @ 2010-02-20 18:22 UTC (permalink / raw)
  To: Dave Korn; +Cc: GCC Java, Boehm, Hans, Bryce McKinlay

On 18/02/2010 20:58, Dave Korn wrote:
>  [ snip ]


  Thank you for all your help, Bruce and Hans; you gave me everything I needed
to dig into the GC and figure out how to enable the support for cygwin, and
I've posted the resulting patch(es):

http://gcc.gnu.org/ml/java-patches/2010-q1/msg00039.html

    cheers,
      DaveK

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-02-20 18:22 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <4B6C9B26.3010106@gmail.com>
     [not found] ` <4B6D0C67.5030500@gmail.com>
2010-02-14 22:23   ` JVMTI events system and _jvmtiEnvironments uninitialised memory problem? [was Re: [PATCH PR42811,4.5 regression] java.lang.ExceptionInInitializerError in ecj1] Dave Korn
     [not found]   ` <4B78BA6D.5000603@gmail.com>
     [not found]     ` <7230133d1002150504r23738e28if92303426c349661@mail.gmail.com>
     [not found]       ` <4B7CC61E.1050709@gmail.com>
     [not found]         ` <4B7D631B.8030401@gmail.com>
2010-02-18 20:41           ` Two quick GC questions [was Re: [REVIVED: PATCH " Dave Korn
2010-02-18 22:06             ` Boehm, Hans
2010-02-19  7:04               ` Dave Korn
2010-02-18 22:13             ` Bryce McKinlay
2010-02-19  3:22               ` Boehm, Hans
2010-02-20 18:22             ` Dave Korn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).