public inbox for gdb@sourceware.org
* break jmisc.main
@ 2003-03-13 20:39 David Carlton
  2003-03-13 20:54 ` David Carlton
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: David Carlton @ 2003-03-13 20:39 UTC (permalink / raw)
  To: gdb; +Cc: Tom Tromey, Michael Elizabeth Chastain

Here's the scoop with the FAILs on "break jmisc.main" and "break
jmisc.main(java.lang.String[])".

1) For "break jmisc.main", decode_line_1 calls decode_compound (which
   handles C++ and Java compound data structures).  That notices that
   there is a class called 'jmisc', and then looks for a member in it
   called 'main'.

   Unfortunately, GDB thinks the member in question is called
   'jmisc.main(java.lang.String[])' instead of just 'main': the debug
   info says:

	.long	.LC2	# DW_AT_name: "jmisc.main(java.lang.String[])"

   Sigh.  GCJ should get fixed.

2) For "break jmisc.main(java.lang.String[])", decode_compound gets
   bypassed, and decode_variable gets called, looking for a symbol of
   that name.  Unfortunately, it doesn't find one: the symbol that it
   finds is called something strange like
   "jmisc::main(Jaray<java::lang::String*>*)".  (I'm pretty sure
   that's right, though I'd have to check this at home to be sure;
   that's what c++filt demangles the name to.)

   Something weird is going on here; at first, I'd assumed this was a
   bug in cplus_demangle, but c++filt -s java gets the name demangled
   correctly.  So my guess is that, somewhere, a demangler is getting
   called in a situation where the symbol isn't yet identified as a
   Java symbol, so the C++ demangler gets used.  Do the minsym readers
   reliably know the language of the minsyms they're creating?  If
   not, then we could be getting the bad value there and caching it
   with the new demangling code, so the bad value remains when the
   symbol table is setting the symbol's name.
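
A minimal sketch of the mismatch in point 1, in plain C rather than
GDB's real decode_compound: the helper member_matches is hypothetical
and only models "split at the last '.', then compare the member part
with the recorded DW_AT_name".

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of GDB.  Split USER_INPUT (for
   example "jmisc.main") at its last '.', then report whether the
   member part equals RECORDED_NAME, the DW_AT_name string found in
   the debug info.  */
bool
member_matches (const char *user_input, const char *recorded_name)
{
  const char *dot = strrchr (user_input, '.');
  const char *member = (dot != NULL) ? dot + 1 : user_input;

  return strcmp (member, recorded_name) == 0;
}
```

With the name GCJ emits, member_matches ("jmisc.main",
"jmisc.main(java.lang.String[])") is false; it would be true if the
debug info recorded just "main".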

So, we have two things to do: submit a bug report to the GCJ people,
and track down where the symbol name is getting demangled
incorrectly.  (And a third thing: convince somebody who knows more
about GCJ to become GDB's Java maintainer.)

David Carlton
carlton@math.stanford.edu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: break jmisc.main
  2003-03-13 20:39 break jmisc.main David Carlton
@ 2003-03-13 20:54 ` David Carlton
  2003-03-13 20:57 ` Daniel Jacobowitz
  2003-03-13 22:59 ` Tom Tromey
  2 siblings, 0 replies; 14+ messages in thread
From: David Carlton @ 2003-03-13 20:54 UTC (permalink / raw)
  To: gdb; +Cc: Tom Tromey, Michael Elizabeth Chastain

On 13 Mar 2003 12:39:03 -0800, David Carlton <carlton@math.stanford.edu> said:

> So my guess is that, somewhere, a demangler is getting
> called in a situation where the symbol isn't yet identified as a
> Java symbol, so the C++ demangler gets used.  Do the minsym readers
> reliably know the language of the minsyms they're creating?  If
> not, then we could be getting the bad value there and caching it
> with the new demangling code, so the bad value remains when the
> symbol table is setting the symbol's name.

To be specific, in prim_record_minimal_symbol_and_info, we see:

  SYMBOL_LANGUAGE (msymbol) = language_auto;
  SYMBOL_SET_NAMES (msymbol, (char *)name, strlen (name), objfile);

Oops.
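
A toy model of the suspected effect, assuming the cache is keyed by
the mangled name alone.  Everything here (toy_cache, toy_demangle,
toy_set_names, the fixed strings) is made up for illustration; it is
not GDB's hash table or demangler.

```c
#include <stddef.h>
#include <string.h>

enum toy_language { TOY_AUTO, TOY_JAVA };

/* One-entry stand-in for the per-objfile demangled-name cache.  */
struct toy_cache
{
  const char *mangled;    /* Key: the linkage name only, no language.  */
  const char *demangled;  /* Value: whatever was computed first.  */
};

/* Stand-in for a demangler: returns a fixed string per style.  */
const char *
toy_demangle (enum toy_language lang)
{
  return (lang == TOY_JAVA
          ? "jmisc.main(java.lang.String[])"
          : "jmisc::main(JArray<java::lang::String*>*)");
}

/* Look MANGLED up in CACHE, demangling only on a miss.  */
const char *
toy_set_names (struct toy_cache *cache, const char *mangled,
               enum toy_language lang)
{
  if (cache->mangled == NULL || strcmp (cache->mangled, mangled) != 0)
    {
      cache->mangled = mangled;
      cache->demangled = toy_demangle (lang);
    }
  return cache->demangled;
}
```

If the minimal symbol is recorded first with TOY_AUTO, a later call
for the same key with TOY_JAVA gets the cached C++-style string back.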

David Carlton
carlton@math.stanford.edu


* Re: break jmisc.main
  2003-03-13 20:39 break jmisc.main David Carlton
  2003-03-13 20:54 ` David Carlton
@ 2003-03-13 20:57 ` Daniel Jacobowitz
  2003-03-13 21:07   ` Daniel Jacobowitz
                     ` (2 more replies)
  2003-03-13 22:59 ` Tom Tromey
  2 siblings, 3 replies; 14+ messages in thread
From: Daniel Jacobowitz @ 2003-03-13 20:57 UTC (permalink / raw)
  To: David Carlton; +Cc: gdb, Tom Tromey, Michael Elizabeth Chastain

On Thu, Mar 13, 2003 at 12:39:03PM -0800, David Carlton wrote:
> Here's the scoop with the FAILs on "break jmisc.main" and "break
> jmisc.main(java.lang.String[])".
> 
> 1) For "break jmisc.main", decode_line_1 calls decode_compound (which
>    handles C++ and Java compound data structures).  That notices that
>    there is a class called 'jmisc', and then looks for a member in it
>    called 'main'.
> 
>    Unfortunately, GDB thinks the member in question is called
>    'jmisc.main(java.lang.String[])' instead of just 'main': the debug
>    info says:
> 
> 	.long	.LC2	# DW_AT_name: "jmisc.main(java.lang.String[])"
> 
>    Sigh.  GCJ should get fixed.

Yep.

> 2) For "break jmisc.main(java.lang.String[])", decode_compound gets
>    bypassed, and decode_variable gets called, looking for a symbol of
>    that name.  Unfortunately, it doesn't find one: the symbol that it
>    finds is called something strange like
>    "jmisc::main(Jaray<java::lang::String*>*)".  (I'm pretty sure
>    that's right, though I'd have to check this at home to be sure;
>    that's what c++filt demangles the name to.)
> 
>    Something weird is going on here; at first, I'd assumed this was a
>    bug in cplus_demangle, but c++filt -s java gets the name demangled
>    correctly.  So my guess is that, somewhere, a demangler is getting
>    called in a situation where the symbol isn't yet identified as a
>    Java symbol, so the C++ demangler gets used.  Do the minsym readers
>    reliably know the language of the minsyms they're creating?  If
>    not, then we could be getting the bad value there and caching it
>    with the new demangling code, so the bad value remains when the
>    symbol table is setting the symbol's name.

Do you know if this actually broke with my caching patch, or if it was
broken before?  I checked, and nowhere in GDB do we ever set the
demangling style to Java.  Not that I could find, at least.

FYI, if you "set demangle-style java" and then "file ./jmisc", this
test passes.  I really don't know what we can do about it.  My
instincts tell me that we need to either:
 - not demangle at all until we know the language; doesn't help for
stabs anyway
 - transform between the Java and C++ demanglings.  Converting from the
C++ output to the Java output looks doable, although exceedingly
annoying:
    - different names for some types (bool vs boolean, char vs wchar_t)
    - All '*' characters are removed
    - JArray<TYPE> becomes TYPE[].
  (That's an exhaustive list.)
  Going the other way, Java -> C++, would probably be impossible
  because of the removed '*'s.
 - Re-demangle if we discover that the symbol is a Java symbol. 
Ewwwwww.
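
A rough sketch of the C++ -> Java direction, assuming only the rules
listed above plus turning "::" into ".".  The function names are
hypothetical, the type renames (bool/boolean and friends) are left
out, and no attempt is made to handle operator names or identifiers
that merely end in "JArray".

```c
#include <stddef.h>
#include <string.h>

#define CONVERT_FAILED ((size_t) -1)

/* Append the Java-style spelling of SRC[0..LEN) to DST at offset POS.
   DST holds CAP bytes in total.  Returns the new offset, or
   CONVERT_FAILED on overflow or unbalanced angle brackets.  */
static size_t
convert_range (const char *src, size_t len, char *dst, size_t pos,
               size_t cap)
{
  size_t i = 0;

  while (i < len)
    {
      if (len - i >= 7 && strncmp (src + i, "JArray<", 7) == 0)
        {
          /* Find the '>' that closes this JArray<...>.  */
          size_t start = i + 7;
          size_t j = start;
          int depth = 1;

          while (j < len && depth > 0)
            {
              if (src[j] == '<')
                depth++;
              else if (src[j] == '>')
                depth--;
              j++;
            }
          if (depth != 0)
            return CONVERT_FAILED;

          /* JArray<TYPE> becomes TYPE[]; TYPE is converted too.  */
          pos = convert_range (src + start, j - 1 - start, dst, pos, cap);
          if (pos == CONVERT_FAILED || pos + 2 >= cap)
            return CONVERT_FAILED;
          dst[pos++] = '[';
          dst[pos++] = ']';
          i = j;
        }
      else if (src[i] == '*')
        i++;                    /* All '*' characters are removed.  */
      else
        {
          if (pos + 1 >= cap)
            return CONVERT_FAILED;
          if (src[i] == ':' && i + 1 < len && src[i + 1] == ':')
            {
              dst[pos++] = '.'; /* "::" becomes ".".  */
              i += 2;
            }
          else
            dst[pos++] = src[i++];
        }
    }
  return pos;
}

/* Convert the C++-style demangled name CXX into DST (CAP bytes).
   Returns 0 on success, -1 on failure.  */
int
cxx_to_java_name (const char *cxx, char *dst, size_t cap)
{
  size_t end;

  if (cap == 0)
    return -1;
  end = convert_range (cxx, strlen (cxx), dst, 0, cap);
  if (end == CONVERT_FAILED)
    return -1;
  dst[end] = '\0';
  return 0;
}
```

Fed "jmisc::main(JArray<java::lang::String*>*)", this produces
"jmisc.main(java.lang.String[])".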

> So, we have two things to do: submit a bug report to the GCJ people,
> and track down where the symbol name is getting demangled
> incorrectly.  (And a third thing: convince somebody who knows more
> about GCJ to become GDB's Java maintainer.)
> 
> David Carlton
> carlton@math.stanford.edu
> 

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: break jmisc.main
  2003-03-13 20:57 ` Daniel Jacobowitz
@ 2003-03-13 21:07   ` Daniel Jacobowitz
  2003-03-13 21:16   ` David Carlton
  2003-03-13 23:04   ` Tom Tromey
  2 siblings, 0 replies; 14+ messages in thread
From: Daniel Jacobowitz @ 2003-03-13 21:07 UTC (permalink / raw)
  To: David Carlton, gdb, Tom Tromey, Michael Elizabeth Chastain

On Thu, Mar 13, 2003 at 03:56:39PM -0500, Daniel Jacobowitz wrote:
> On Thu, Mar 13, 2003 at 12:39:03PM -0800, David Carlton wrote:
> > Here's the scoop with the FAILs on "break jmisc.main" and "break
> > jmisc.main(java.lang.String[])".
> > 
> > 1) For "break jmisc.main", decode_line_1 calls decode_compound (which
> >    handles C++ and Java compound data structures).  That notices that
> >    there is a class called 'jmisc', and then looks for a member in it
> >    called 'main'.
> > 
> >    Unfortunately, GDB thinks the member in question is called
> >    'jmisc.main(java.lang.String[])' instead of just 'main': the debug
> >    info says:
> > 
> > 	.long	.LC2	# DW_AT_name: "jmisc.main(java.lang.String[])"
> > 
> >    Sigh.  GCJ should get fixed.
> 
> Yep.
> 
> > 2) For "break jmisc.main(java.lang.String[])", decode_compound gets
> >    bypassed, and decode_variable gets called, looking for a symbol of
> >    that name.  Unfortunately, it doesn't find one: the symbol that it
> >    finds is called something strange like
> >    "jmisc::main(Jaray<java::lang::String*>*)".  (I'm pretty sure
> >    that's right, though I'd have to check this at home to be sure;
> >    that's what c++filt demangles the name to.)
> > 
> >    Something weird is going on here; at first, I'd assumed this was a
> >    bug in cplus_demangle, but c++filt -s java gets the name demangled
> >    correctly.  So my guess is that, somewhere, a demangler is getting
> >    called in a situation where the symbol isn't yet identified as a
> >    Java symbol, so the C++ demangler gets used.  Do the minsym readers
> >    reliably know the language of the minsyms they're creating?  If
> >    not, then we could be getting the bad value there and caching it
> >    with the new demangling code, so the bad value remains when the
> >    symbol table is setting the symbol's name.
> 
> Do you know if this actually broke with my caching patch, or if it was
> broken before?  I checked, and nowhere in GDB do we ever set the
> demangling style to Java.  Not that I could find, at least.

Never mind, I've found where it happened; there was an explicit
DMGL_JAVA.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: break jmisc.main
  2003-03-13 20:57 ` Daniel Jacobowitz
  2003-03-13 21:07   ` Daniel Jacobowitz
@ 2003-03-13 21:16   ` David Carlton
  2003-03-13 21:22     ` Daniel Jacobowitz
  2003-03-13 23:32     ` David Carlton
  2003-03-13 23:04   ` Tom Tromey
  2 siblings, 2 replies; 14+ messages in thread
From: David Carlton @ 2003-03-13 21:16 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb, Tom Tromey, Michael Elizabeth Chastain

On Thu, 13 Mar 2003 15:56:39 -0500, Daniel Jacobowitz <drow@mvista.com> said:

> Do you know if this actually broke with my caching patch, or if it was
> broken before?

The dates that Michael gave for the regression make that quite likely.

> I checked, and nowhere in GDB do we ever set the demangling style to
> Java.  Not that I could find, at least.

Right now, symbol_find_demangled_name says:

  if (gsymbol->language == language_java)
    {
      demangled =
        cplus_demangle (mangled,
                        DMGL_PARAMS | DMGL_ANSI | DMGL_JAVA);
      if (demangled != NULL)
	{
	  gsymbol->language = language_java;
	  return demangled;
	}
    }

There was a similar case in whatever it replaced.

> FYI, if you "set demangle-style java" and then "file ./jmisc", this
> test passes.  I really don't know what we can do about it.  My
> instincts tell me that we need to either:
>  - not demangle at all until we know the language; doesn't help for
> stabs anyway

Is there any way for the minsym readers to guess the language based on
a file name, or something like that?

>  - transform between the Java and C++ demanglings.  Converting from the
> C++ output to the Java output looks doable, although exceedingly
> annoying:
>     - different names for some types (bool vs boolean, char vs wchar_t)
>     - All '*' characters are removed
>     - JArray<TYPE> becomes TYPE[].
>   (That's an exhaustive list.)
>   Going the other way, Java -> C++, would probably be impossible
>   because of the removed '*'s.

Getting that to work well doesn't sound much fun at all to me.  If
there were a Java maintainer to help, I'd be willing to chip this in,
but I'd really rather not.

>  - Re-demangle if we discover that the symbol is a Java symbol. 
> Ewwwwww.

Ewwww indeed.  As a temporary solution, though, we could modify
symbol_set_names and friends to use a different cache to store Java
demangled names.  (Well, use the same cache, but if the current
symbol's language is Java, then look up ##JAVA##demangled_name
instead of just demangled_name, or something like that.)  The minsyms
will still have the wrong names, but that must always have been
broken.  And it means that Java code won't be able to share memory
between names of partial symbols and names of minimal symbols; given
the absence of a Java maintainer, I don't care about that in the
slightest.
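
A small sketch of the key-building step of that idea.  The helper
make_cache_key and the macro name are hypothetical, and the "##JAVA##"
marker is just the placeholder from the paragraph above.

```c
#include <stddef.h>
#include <string.h>

#define JAVA_KEY_PREFIX "##JAVA##"

/* Write into KEY (capacity CAP) the hash-table key for the LEN-byte
   linkage name NAME: the name itself, with the marker prefix
   prepended when IS_JAVA is nonzero.  NAME need not be
   NUL-terminated.  Returns the key length, or (size_t) -1 if KEY is
   too small.  */
size_t
make_cache_key (const char *name, size_t len, int is_java,
                char *key, size_t cap)
{
  size_t prefix_len = is_java ? strlen (JAVA_KEY_PREFIX) : 0;

  if (prefix_len + len + 1 > cap)
    return (size_t) -1;
  memcpy (key, JAVA_KEY_PREFIX, prefix_len);
  memcpy (key + prefix_len, name, len);
  key[prefix_len + len] = '\0';
  return prefix_len + len;
}
```

The same linkage name then yields two distinct keys, so a name that
was demangled C++-style for a minimal symbol cannot be handed back to
a symbol known to be Java.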

I'll think about this, but if there's no easy way to get a good guess
at the current language when building minimal symbol tables, I suppose
I'll reluctantly take a stab at the temporary solution.

David Carlton
carlton@math.stanford.edu


* Re: break jmisc.main
  2003-03-13 21:16   ` David Carlton
@ 2003-03-13 21:22     ` Daniel Jacobowitz
  2003-03-13 23:32     ` David Carlton
  1 sibling, 0 replies; 14+ messages in thread
From: Daniel Jacobowitz @ 2003-03-13 21:22 UTC (permalink / raw)
  To: David Carlton; +Cc: gdb, Tom Tromey, Michael Elizabeth Chastain

On Thu, Mar 13, 2003 at 01:15:59PM -0800, David Carlton wrote:
> I'll think about this, but if there's no easy way to get a good guess
> at the current language when building minimal symbol tables, I suppose
> I'll reluctantly take a stab at the temporary solution.

There's no way.  Minsyms are read at the objfile granularity; Java can
(does) occur at the compilation unit granularity.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: break jmisc.main
  2003-03-13 20:39 break jmisc.main David Carlton
  2003-03-13 20:54 ` David Carlton
  2003-03-13 20:57 ` Daniel Jacobowitz
@ 2003-03-13 22:59 ` Tom Tromey
  2003-03-13 23:03   ` Daniel Jacobowitz
  2 siblings, 1 reply; 14+ messages in thread
From: Tom Tromey @ 2003-03-13 22:59 UTC (permalink / raw)
  To: David Carlton; +Cc: gdb, Michael Elizabeth Chastain

>>>>> "David" == David Carlton <carlton@math.stanford.edu> writes:

David> Here's the scoop with the FAILs on "break jmisc.main" and "break
David> jmisc.main(java.lang.String[])".

Thanks a lot for looking at this.

David> 	.long	.LC2	# DW_AT_name: "jmisc.main(java.lang.String[])"
David>    Sigh.  GCJ should get fixed.

I really don't know anything about debug info.  How should this read?

In the above `jmisc' is just a class.  However, `java.lang' is a
namespace.  In the past at least there wasn't namespace support in
gdb...?

David>    Unfortunately, it doesn't find one: the symbol that it
David>    finds is called something strange like
David>    "jmisc::main(Jaray<java::lang::String*>*)".  (I'm pretty sure
David>    that's right, though I'd have to check this at home to be sure;
David>    that's what c++filt demangles the name to.)

Should be `JArray', but other than that it looks ok.

gcj uses the same mangling as C++.  That is an important part of the
whole "CNI" approach to writing native methods -- you can just write
them in C++ with basically zero overhead.

I don't think there is any way to tell a Java symbol from a C++
symbol.  Which one you want to use depends more on context -- if I'm
debugging the Java code, I like to see the Java symbols.  If I'm
debugging the C++ code, it is probably more convenient to see the C++
form.  Likewise for entering breakpoints and the like.

David> (And a third thing: convince somebody who knows more about GCJ
David> to become GDB's Java maintainer.)

I would love for anybody to become an active gdb/java maintainer.
The only inducement I have is the future possibility of a cool gcj
t-shirt (assuming I ever print more).  That plus gratitude.

Tom


* Re: break jmisc.main
  2003-03-13 22:59 ` Tom Tromey
@ 2003-03-13 23:03   ` Daniel Jacobowitz
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Jacobowitz @ 2003-03-13 23:03 UTC (permalink / raw)
  To: Tom Tromey; +Cc: David Carlton, gdb, Michael Elizabeth Chastain

On Thu, Mar 13, 2003 at 03:53:26PM -0700, Tom Tromey wrote:
> >>>>> "David" == David Carlton <carlton@math.stanford.edu> writes:
> 
> David> Here's the scoop with the FAILs on "break jmisc.main" and "break
> David> jmisc.main(java.lang.String[])".
> 
> Thanks a lot for looking at this.
> 
> David> 	.long	.LC2	# DW_AT_name: "jmisc.main(java.lang.String[])"
> David>    Sigh.  GCJ should get fixed.
> 
> I really don't know anything about debug info.  How should this read?
> 
> In the above `jmisc' is just a class.  However, `java.lang' is a
> namespace.  In the past at least there wasn't namespace support in
> gdb...?

Still isn't, but David is making great progress on that front.

The field should probably read just "main".  Try readelf -wi on a C++
binary to see what GDB expects.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: break jmisc.main
  2003-03-13 20:57 ` Daniel Jacobowitz
  2003-03-13 21:07   ` Daniel Jacobowitz
  2003-03-13 21:16   ` David Carlton
@ 2003-03-13 23:04   ` Tom Tromey
  2 siblings, 0 replies; 14+ messages in thread
From: Tom Tromey @ 2003-03-13 23:04 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: David Carlton, gdb, Michael Elizabeth Chastain

>>>>> "Daniel" == Daniel Jacobowitz <drow@mvista.com> writes:

Daniel>  - transform between the Java and C++ demanglings.  Converting from the
Daniel> C++ output to the Java output looks doable, although exceedingly
Daniel> annoying:
Daniel>     - different names for some types (bool vs boolean, char vs wchar_t)
Daniel>     - All '*' characters are removed
Daniel>     - JArray<TYPE> becomes TYPE[].
Daniel>   (That's an exhaustive list.)

You also convert `::' to `.'.

Daniel>   Going the other way, Java -> C++, would probably be impossible
Daniel>   because of the removed '*'s.

In Java there are only a small number of primitive types, and
references.  If it isn't a primitive type, it is a reference and you
add a `*'.  So I think it should be possible.
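
A sketch of that rule for a single Java type name, going Java -> C++.
The function names are hypothetical, primitive types are copied
through unchanged (the bool/boolean style renames are not attempted),
and treating T[] as JArray<T>* is an assumption based on the
demangled form quoted earlier in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EMIT_FAILED ((size_t) -1)

/* Return true if TYPE[0..LEN) names one of Java's primitive types.  */
static bool
is_java_primitive (const char *type, size_t len)
{
  static const char *const primitives[] = {
    "boolean", "byte", "char", "short", "int", "long", "float", "double"
  };
  size_t i;

  for (i = 0; i < sizeof primitives / sizeof primitives[0]; i++)
    if (strlen (primitives[i]) == len
        && strncmp (primitives[i], type, len) == 0)
      return true;
  return false;
}

/* Copy N bytes of S to DST at POS, leaving room for a terminator in
   a CAP-byte buffer.  Propagates an earlier failure.  */
static size_t
append (char *dst, size_t pos, size_t cap, const char *s, size_t n)
{
  if (pos == EMIT_FAILED || n >= cap || pos >= cap - n)
    return EMIT_FAILED;
  memcpy (dst + pos, s, n);
  return pos + n;
}

/* Emit the C++-style spelling of the Java type SRC[0..LEN).  */
static size_t
emit_type (const char *src, size_t len, char *dst, size_t pos, size_t cap)
{
  size_t i;

  if (len >= 2 && src[len - 2] == '[' && src[len - 1] == ']')
    {
      /* An array is itself a reference: T[] becomes JArray<T>*.  */
      pos = append (dst, pos, cap, "JArray<", 7);
      pos = emit_type (src, len - 2, dst, pos, cap);
      return append (dst, pos, cap, ">*", 2);
    }

  if (is_java_primitive (src, len))
    return append (dst, pos, cap, src, len);

  /* Anything else is a reference: '.' becomes "::", then add '*'.  */
  for (i = 0; i < len; i++)
    {
      if (src[i] == '.')
        pos = append (dst, pos, cap, "::", 2);
      else
        pos = append (dst, pos, cap, src + i, 1);
    }
  return append (dst, pos, cap, "*", 1);
}

/* Convert the Java type name JAVA into DST (CAP bytes).  Returns 0
   on success, -1 on failure.  */
int
java_type_to_cxx (const char *java, char *dst, size_t cap)
{
  size_t end;

  if (*java == '\0')
    return -1;
  end = emit_type (java, strlen (java), dst, 0, cap);
  if (end == EMIT_FAILED)
    return -1;
  dst[end] = '\0';
  return 0;
}
```

For "java.lang.String[]" this yields "JArray<java::lang::String*>*",
while "int" stays "int".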

Tom


* Re: break jmisc.main
  2003-03-13 21:16   ` David Carlton
  2003-03-13 21:22     ` Daniel Jacobowitz
@ 2003-03-13 23:32     ` David Carlton
  2003-03-13 23:36       ` Daniel Jacobowitz
  1 sibling, 1 reply; 14+ messages in thread
From: David Carlton @ 2003-03-13 23:32 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb, Tom Tromey, Michael Elizabeth Chastain

On 13 Mar 2003 13:15:59 -0800, David Carlton <carlton@math.stanford.edu> said:

> I'll think about this, but if there's no easy way to get a good
> guess at the current language when building minimal symbol tables, I
> suppose I'll reluctantly take a stab at the temporary solution.

Actually, it wasn't so bad: Daniel did a good job of making the new
symbol name initialization functions modular, so I only had to touch
one of them.  Here's a patch that might work; it doesn't cause any
non-Java regressions, but this machine doesn't have gcj 3.2 on it, so
I have no idea if it actually fixes the problem.  I'll try to test it
tonight when I get home, but if anybody reading this wants to test it
before then, that would be great.  The symtab.h part is purely
cosmetic: just apply the symtab.c part, if you don't want to have to
recompile every file that depends on symtab.h.

David Carlton
carlton@math.stanford.edu

2003-03-13  David Carlton  <carlton@math.stanford.edu>

	* symtab.c (symbol_set_names): Add prefix when storing Java names
	in hash table.  Fix for PR java/1039.
	* symtab.h: Change 'name' argument in declaration of
	symbol_set_names to 'linkage_name'.
	(SYMBOL_SET_NAMES): Change 'name' argument to 'linkage_name'.

Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.99
diff -u -p -r1.99 symtab.c
--- symtab.c	4 Mar 2003 17:06:21 -0000	1.99
+++ symtab.c	13 Mar 2003 22:59:00 -0000
@@ -484,61 +484,103 @@ symbol_find_demangled_name (struct gener
   return NULL;
 }
 
-/* Set both the mangled and demangled (if any) names for GSYMBOL based on
-   NAME and LEN.  The hash table corresponding to OBJFILE is used, and the
-   memory comes from that objfile's symbol_obstack.  NAME is copied, so the
-   pointer can be discarded after calling this function.  */
+/* Set both the mangled and demangled (if any) names for GSYMBOL based
+   on LINKAGE_NAME and LEN.  The hash table corresponding to OBJFILE
+   is used, and the memory comes from that objfile's symbol_obstack.
+   LINKAGE_NAME is copied, so the pointer can be discarded after
+   calling this function.  */
+
+/* We have to be careful when dealing with Java names: when we run
+   into a Java minimal symbol, we don't know it's a Java symbol, so it
+   gets demangled as a C++ name.  This is unfortunate, but there's not
+   much we can do about it: but when demangling partial symbols and
+   regular symbols, we'd better not reuse the wrong demangled name.
+   (See PR gdb/1039.)  We solve this by putting a distinctive prefix
+   on Java names when storing them in the hash table.  */
+
+#define JAVA_PREFIX "##JAVA$$"
+#define JAVA_PREFIX_LEN 8
 
 void
 symbol_set_names (struct general_symbol_info *gsymbol,
-		  const char *name, int len, struct objfile *objfile)
+		  const char *linkage_name, int len, struct objfile *objfile)
 {
   char **slot;
-  const char *tmpname;
+  /* A 0-terminated copy of the linkage name.  */
+  const char *linkage_name_copy;
+  /* A copy of the linkage name that might have a special Java prefix
+     added to it, for use when looking names up in the hash table.  */
+  const char *lookup_name;
+  /* The length of lookup_name.  */
+  int lookup_len;
 
   if (objfile->demangled_names_hash == NULL)
     create_demangled_names_hash (objfile);
 
-  /* The stabs reader generally provides names that are not NULL-terminated;
-     most of the other readers don't do this, so we can just use the given
-     copy.  */
-  if (name[len] != 0)
+  /* The stabs reader generally provides names that are not
+     NUL-terminated; most of the other readers don't do this, so we
+     can just use the given copy, unless we're in the Java case.  */
+  if (gsymbol->language == language_java)
     {
-      char *alloc_name = alloca (len + 1);
-      memcpy (alloc_name, name, len);
-      alloc_name[len] = 0;
-      tmpname = alloc_name;
+      char *alloc_name;
+      lookup_len = len + JAVA_PREFIX_LEN;
+
+      alloc_name = alloca (lookup_len + 1);
+      memcpy (alloc_name, JAVA_PREFIX, JAVA_PREFIX_LEN);
+      memcpy (alloc_name + JAVA_PREFIX_LEN, linkage_name, len);
+      alloc_name[lookup_len] = '\0';
+
+      lookup_name = alloc_name;
+      linkage_name_copy = alloc_name + JAVA_PREFIX_LEN;
+    }
+  else if (linkage_name[len] != '\0')
+    {
+      char *alloc_name;
+      lookup_len = len;
+
+      alloc_name = alloca (lookup_len + 1);
+      memcpy (alloc_name, linkage_name, len);
+      alloc_name[lookup_len] = '\0';
+
+      lookup_name = alloc_name;
+      linkage_name_copy = alloc_name;
     }
   else
-    tmpname = name;
+    {
+      lookup_len = len;
+      lookup_name = linkage_name;
+      linkage_name_copy = linkage_name;
+    }
 
-  slot = (char **) htab_find_slot (objfile->demangled_names_hash, tmpname, INSERT);
+  slot = (char **) htab_find_slot (objfile->demangled_names_hash,
+				   lookup_name, INSERT);
 
   /* If this name is not in the hash table, add it.  */
   if (*slot == NULL)
     {
-      char *demangled_name = symbol_find_demangled_name (gsymbol, tmpname);
+      char *demangled_name = symbol_find_demangled_name (gsymbol,
+							 linkage_name_copy);
       int demangled_len = demangled_name ? strlen (demangled_name) : 0;
 
       /* If there is a demangled name, place it right after the mangled name.
 	 Otherwise, just place a second zero byte after the end of the mangled
 	 name.  */
       *slot = obstack_alloc (&objfile->symbol_obstack,
-			     len + demangled_len + 2);
-      memcpy (*slot, tmpname, len + 1);
-      if (demangled_name)
+			     lookup_len + demangled_len + 2);
+      memcpy (*slot, lookup_name, lookup_len + 1);
+      if (demangled_name != NULL)
 	{
-	  memcpy (*slot + len + 1, demangled_name, demangled_len + 1);
+	  memcpy (*slot + lookup_len + 1, demangled_name, demangled_len + 1);
 	  xfree (demangled_name);
 	}
       else
-	(*slot)[len + 1] = 0;
+	(*slot)[lookup_len + 1] = '\0';
     }
 
-  gsymbol->name = *slot;
-  if ((*slot)[len + 1])
+  gsymbol->name = *slot + lookup_len - len;
+  if ((*slot)[lookup_len + 1] != '\0')
     gsymbol->language_specific.cplus_specific.demangled_name
-      = &(*slot)[len + 1];
+      = &(*slot)[lookup_len + 1];
   else
     gsymbol->language_specific.cplus_specific.demangled_name = NULL;
 }
Index: symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.65
diff -u -p -r1.65 symtab.h
--- symtab.h	3 Mar 2003 18:34:12 -0000	1.65
+++ symtab.h	13 Mar 2003 22:59:05 -0000
@@ -156,10 +156,10 @@ extern void symbol_init_language_specifi
 extern void symbol_init_demangled_name (struct general_symbol_info *symbol,
 					struct obstack *obstack);
 
-#define SYMBOL_SET_NAMES(symbol,name,len,objfile) \
-  symbol_set_names (&(symbol)->ginfo, name, len, objfile)
+#define SYMBOL_SET_NAMES(symbol,linkage_name,len,objfile) \
+  symbol_set_names (&(symbol)->ginfo, linkage_name, len, objfile)
 extern void symbol_set_names (struct general_symbol_info *symbol,
-			      const char *name, int len,
+			      const char *linkage_name, int len,
 			      struct objfile *objfile);
 
 /* Now come lots of name accessor macros.  Short version as to when to
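
For anyone reading the pointer arithmetic at the end of the symtab.c
hunk, here is a stand-alone sketch of the storage layout it relies
on: prefix, linkage name, NUL, demangled name, NUL.  It uses malloc
instead of the objfile obstack, and struct packed_names and
pack_names are hypothetical names, not part of the patch.

```c
#include <stdlib.h>
#include <string.h>

struct packed_names
{
  char *block;            /* Owns the storage; free it when done.  */
  const char *linkage;    /* Points past the prefix, into BLOCK.  */
  const char *demangled;  /* Points after the first NUL, or NULL.  */
};

/* Pack PREFIX + LINKAGE and DEMANGLED (which may be NULL) into one
   allocation laid out like the hash-table slot in the patch.
   Returns 0 on success, -1 if allocation fails.  */
int
pack_names (const char *prefix, const char *linkage,
            const char *demangled, struct packed_names *out)
{
  size_t prefix_len = strlen (prefix);
  size_t len = strlen (linkage);
  size_t lookup_len = prefix_len + len;
  size_t demangled_len = demangled != NULL ? strlen (demangled) : 0;
  char *block = malloc (lookup_len + demangled_len + 2);

  if (block == NULL)
    return -1;

  memcpy (block, prefix, prefix_len);
  memcpy (block + prefix_len, linkage, len + 1);
  if (demangled != NULL)
    memcpy (block + lookup_len + 1, demangled, demangled_len + 1);
  else
    block[lookup_len + 1] = '\0';

  out->block = block;
  /* Same arithmetic as "*slot + lookup_len - len" in the patch.  */
  out->linkage = block + lookup_len - len;
  out->demangled = (block[lookup_len + 1] != '\0'
                    ? block + lookup_len + 1 : NULL);
  return 0;
}
```

With prefix "##JAVA$$", block reads "##JAVA$$foo" as a C string while
linkage reads just "foo"; with an empty prefix the two pointers
coincide, which is the non-Java case.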


* Re: break jmisc.main
  2003-03-13 23:32     ` David Carlton
@ 2003-03-13 23:36       ` Daniel Jacobowitz
  2003-03-14  0:17         ` David Carlton
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Jacobowitz @ 2003-03-13 23:36 UTC (permalink / raw)
  To: David Carlton; +Cc: gdb, Tom Tromey, Michael Elizabeth Chastain

On Thu, Mar 13, 2003 at 03:32:27PM -0800, David Carlton wrote:
> On 13 Mar 2003 13:15:59 -0800, David Carlton <carlton@math.stanford.edu> said:
> 
> > I'll think about this, but if there's no easy way to get a good
> > guess at the current language when building minimal symbol tables, I
> > suppose I'll reluctantly take a stab at the temporary solution.
> 
> Actually, it wasn't so bad: Daniel did a good job of making the new
> symbol name initialization functions modular, so I only had to touch
> one of them.  Here's a patch that might work; it doesn't cause any
> non-Java regressions, but this machine doesn't have gcj 3.2 on it, so
> I have no idea if it actually fixes the problem.  I'll try to test it
> tonight when I get home, but if anybody reading this wants to test it
> before then, that would be great.  The symtab.h part is purely
> cosmetic: just apply the symtab.c part, if you don't want to have to
> recompile every file that depends on symtab.h.

Hmm, I like this in principle.  Could it have a more prominent FIXME on
it though?  It's really not a good long term solution.  We shouldn't
need both demangled copies... or if we do, then perhaps both should be
associated with the minsym.

> 
> David Carlton
> carlton@math.stanford.edu
> 
> 2003-03-13  David Carlton  <carlton@math.stanford.edu>
> 
> 	* symtab.c (symbol_set_names): Add prefix when storing Java names
> 	in hash table.  Fix for PR java/1039.
> 	* symtab.h: Change 'name' argument in declaration of
> 	symbol_set_names to 'linkage_name'.
> 	(SYMBOL_SET_NAMES): Change 'name' argument to 'linkage_name'.
> 
> Index: symtab.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/symtab.c,v
> retrieving revision 1.99
> diff -u -p -r1.99 symtab.c
> --- symtab.c	4 Mar 2003 17:06:21 -0000	1.99
> +++ symtab.c	13 Mar 2003 22:59:00 -0000
> @@ -484,61 +484,103 @@ symbol_find_demangled_name (struct gener
>    return NULL;
>  }
>  
> -/* Set both the mangled and demangled (if any) names for GSYMBOL based on
> -   NAME and LEN.  The hash table corresponding to OBJFILE is used, and the
> -   memory comes from that objfile's symbol_obstack.  NAME is copied, so the
> -   pointer can be discarded after calling this function.  */
> +/* Set both the mangled and demangled (if any) names for GSYMBOL based
> +   on LINKAGE_NAME and LEN.  The hash table corresponding to OBJFILE
> +   is used, and the memory comes from that objfile's symbol_obstack.
> +   LINKAGE_NAME is copied, so the pointer can be discarded after
> +   calling this function.  */
> +
> +/* We have to be careful when dealing with Java names: when we run
> +   into a Java minimal symbol, we don't know it's a Java symbol, so it
> +   gets demangled as a C++ name.  This is unfortunate, but there's not
> +   much we can do about it: but when demangling partial symbols and
> +   regular symbols, we'd better not reuse the wrong demangled name.
> +   (See PR gdb/1039.)  We solve this by putting a distinctive prefix
> +   on Java names when storing them in the hash table.  */
> +
> +#define JAVA_PREFIX "##JAVA$$"
> +#define JAVA_PREFIX_LEN 8
>  
>  void
>  symbol_set_names (struct general_symbol_info *gsymbol,
> -		  const char *name, int len, struct objfile *objfile)
> +		  const char *linkage_name, int len, struct objfile *objfile)
>  {
>    char **slot;
> -  const char *tmpname;
> +  /* A 0-terminated copy of the linkage name.  */
> +  const char *linkage_name_copy;
> +  /* A copy of the linkage name that might have a special Java prefix
> +     added to it, for use when looking names up in the hash table.  */
> +  const char *lookup_name;
> +  /* The length of lookup_name.  */
> +  int lookup_len;
>  
>    if (objfile->demangled_names_hash == NULL)
>      create_demangled_names_hash (objfile);
>  
> -  /* The stabs reader generally provides names that are not NULL-terminated;
> -     most of the other readers don't do this, so we can just use the given
> -     copy.  */
> -  if (name[len] != 0)
> +  /* The stabs reader generally provides names that are not
> +     NUL-terminated; most of the other readers don't do this, so we
> +     can just use the given copy, unless we're in the Java case.  */
> +  if (gsymbol->language == language_java)
>      {
> -      char *alloc_name = alloca (len + 1);
> -      memcpy (alloc_name, name, len);
> -      alloc_name[len] = 0;
> -      tmpname = alloc_name;
> +      char *alloc_name;
> +      lookup_len = len + JAVA_PREFIX_LEN;
> +
> +      alloc_name = alloca (lookup_len + 1);
> +      memcpy (alloc_name, JAVA_PREFIX, JAVA_PREFIX_LEN);
> +      memcpy (alloc_name + JAVA_PREFIX_LEN, linkage_name, len);
> +      alloc_name[lookup_len] = '\0';
> +
> +      lookup_name = alloc_name;
> +      linkage_name_copy = alloc_name + JAVA_PREFIX_LEN;
> +    }
> +  else if (linkage_name[len] != '\0')
> +    {
> +      char *alloc_name;
> +      lookup_len = len;
> +
> +      alloc_name = alloca (lookup_len + 1);
> +      memcpy (alloc_name, linkage_name, len);
> +      alloc_name[lookup_len] = '\0';
> +
> +      lookup_name = alloc_name;
> +      linkage_name_copy = alloc_name;
>      }
>    else
> -    tmpname = name;
> +    {
> +      lookup_len = len;
> +      lookup_name = linkage_name;
> +      linkage_name_copy = linkage_name;
> +    }
>  
> -  slot = (char **) htab_find_slot (objfile->demangled_names_hash, tmpname, INSERT);
> +  slot = (char **) htab_find_slot (objfile->demangled_names_hash,
> +				   lookup_name, INSERT);
>  
>    /* If this name is not in the hash table, add it.  */
>    if (*slot == NULL)
>      {
> -      char *demangled_name = symbol_find_demangled_name (gsymbol, tmpname);
> +      char *demangled_name = symbol_find_demangled_name (gsymbol,
> +							 linkage_name_copy);
>        int demangled_len = demangled_name ? strlen (demangled_name) : 0;
>  
>        /* If there is a demangled name, place it right after the mangled name.
>  	 Otherwise, just place a second zero byte after the end of the mangled
>  	 name.  */
>        *slot = obstack_alloc (&objfile->symbol_obstack,
> -			     len + demangled_len + 2);
> -      memcpy (*slot, tmpname, len + 1);
> -      if (demangled_name)
> +			     lookup_len + demangled_len + 2);
> +      memcpy (*slot, lookup_name, lookup_len + 1);
> +      if (demangled_name != NULL)
>  	{
> -	  memcpy (*slot + len + 1, demangled_name, demangled_len + 1);
> +	  memcpy (*slot + lookup_len + 1, demangled_name, demangled_len + 1);
>  	  xfree (demangled_name);
>  	}
>        else
> -	(*slot)[len + 1] = 0;
> +	(*slot)[lookup_len + 1] = '\0';
>      }
>  
> -  gsymbol->name = *slot;
> -  if ((*slot)[len + 1])
> +  gsymbol->name = *slot + lookup_len - len;
> +  if ((*slot)[lookup_len + 1] != '\0')
>      gsymbol->language_specific.cplus_specific.demangled_name
> -      = &(*slot)[len + 1];
> +      = &(*slot)[lookup_len + 1];
>    else
>      gsymbol->language_specific.cplus_specific.demangled_name = NULL;
>  }
> Index: symtab.h
> ===================================================================
> RCS file: /cvs/src/src/gdb/symtab.h,v
> retrieving revision 1.65
> diff -u -p -r1.65 symtab.h
> --- symtab.h	3 Mar 2003 18:34:12 -0000	1.65
> +++ symtab.h	13 Mar 2003 22:59:05 -0000
> @@ -156,10 +156,10 @@ extern void symbol_init_language_specifi
>  extern void symbol_init_demangled_name (struct general_symbol_info *symbol,
>  					struct obstack *obstack);
>  
> -#define SYMBOL_SET_NAMES(symbol,name,len,objfile) \
> -  symbol_set_names (&(symbol)->ginfo, name, len, objfile)
> +#define SYMBOL_SET_NAMES(symbol,linkage_name,len,objfile) \
> +  symbol_set_names (&(symbol)->ginfo, linkage_name, len, objfile)
>  extern void symbol_set_names (struct general_symbol_info *symbol,
> -			      const char *name, int len,
> +			      const char *linkage_name, int len,
>  			      struct objfile *objfile);
>  
>  /* Now come lots of name accessor macros.  Short version as to when to
> 

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: break jmisc.main
  2003-03-13 23:36       ` Daniel Jacobowitz
@ 2003-03-14  0:17         ` David Carlton
  2003-03-14  4:18           ` Daniel Jacobowitz
  0 siblings, 1 reply; 14+ messages in thread
From: David Carlton @ 2003-03-14  0:17 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb, Tom Tromey, Michael Elizabeth Chastain

On Thu, 13 Mar 2003 18:36:11 -0500, Daniel Jacobowitz <drow@mvista.com> said:

> Hmm, I like this in principle.  Could it have a more prominent FIXME on
> it though?

Sure, will do.

> It's really not a good long term solution.  We shouldn't need both
> demangled copies... or if we do, then perhaps both should be
> associated with the minsym.

Well, this would be another argument in favor of coalescing minimal
symbols and partial symbols (and possibly even regular symbols) into a
single data structure: it might make it possible to update incorrect
information like this.

David Carlton
carlton@math.stanford.edu


* Re: break jmisc.main
  2003-03-14  0:17         ` David Carlton
@ 2003-03-14  4:18           ` Daniel Jacobowitz
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Jacobowitz @ 2003-03-14  4:18 UTC (permalink / raw)
  To: David Carlton; +Cc: gdb

On Thu, Mar 13, 2003 at 04:17:03PM -0800, David Carlton wrote:
> On Thu, 13 Mar 2003 18:36:11 -0500, Daniel Jacobowitz <drow@mvista.com> said:
> 
> > Hmm, I like this in principle.  Could it have a more prominent FIXME on
> > it though?
> 
> Sure, will do.
> 
> > It's really not a good long term solution.  We shouldn't need both
> > demangled copies... or if we do, then perhaps both should be
> > associated with the minsym.
> 
> Well, this would be another argument in favor of coalescing minimal
> symbols and partial symbols (and possibly even regular symbols) into a
> single data structure: it might make it possible to update incorrect
> information like this.

Yes, I agree that this is the way to go.  Particularly, once we've read
a symtab in we should never need the corresponding psymtab again.  Of
course, there are more partial symbols than minimal symbols; and
symbols take more memory than partial symbols.  So it's not trivial.

By the way, I noticed something very interesting today.  SGI apparently
had DWARF-2 extensions including a .debug_typenames section (and var,
func, weak names) to expand upon the concept of .debug_pubnames.  We
could make GCC generate those and then use them to build psymtabs, I
bet.  That would speed up load time a lot.

Oh well, something else for the List.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: break jmisc.main
@ 2003-03-14 15:09 Michael Elizabeth Chastain
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Elizabeth Chastain @ 2003-03-14 15:09 UTC (permalink / raw)
  To: carlton, drow; +Cc: gdb, tromey

carlton> 2) For "break jmisc.main(java.lang.String[])", decode_compound gets
carlton>    bypassed, and decode_variable gets called, looking for a symbol of
carlton>    that name.  Unfortunately, it doesn't find one: the symbol that it
carlton>    finds is called something strange like
carlton>    "jmisc::main(Jaray<java::lang::String*>*)".  (I'm pretty sure
carlton>    that's right, though I'd have to check this at home to be sure;
carlton>    that's what c++filt demangles the name to.)

drow> Do you know if this actually broke with my caching patch, or if it was
drow> broken before?  I checked, and nowhere in GDB do we ever set the
drow> demangling style to Java.  Not that I could find, at least.

My testbed says that this worked in gdb HEAD on 2003-02-01 and failed
in gdb HEAD on 2003-02-05.

Michael C



Thread overview: 14+ messages
2003-03-13 20:39 break jmisc.main David Carlton
2003-03-13 20:54 ` David Carlton
2003-03-13 20:57 ` Daniel Jacobowitz
2003-03-13 21:07   ` Daniel Jacobowitz
2003-03-13 21:16   ` David Carlton
2003-03-13 21:22     ` Daniel Jacobowitz
2003-03-13 23:32     ` David Carlton
2003-03-13 23:36       ` Daniel Jacobowitz
2003-03-14  0:17         ` David Carlton
2003-03-14  4:18           ` Daniel Jacobowitz
2003-03-13 23:04   ` Tom Tromey
2003-03-13 22:59 ` Tom Tromey
2003-03-13 23:03   ` Daniel Jacobowitz
2003-03-14 15:09 Michael Elizabeth Chastain
