From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <archer-commits-return-284-listarch-archer-commits=sourceware.org@sourceware.org>
Received: (qmail 18157 invoked by alias); 12 Jan 2009 20:20:50 -0000
Mailing-List: contact archer-commits-help@sourceware.org; run by ezmlm
Sender: <archer-commits@sourceware.org>
Precedence: bulk
List-Post: <mailto:archer-commits@sourceware.org>
List-Help: <mailto:archer-commits-help@sourceware.org>
List-Subscribe: <mailto:archer-commits-subscribe@sourceware.org>
Received: (qmail 18132 invoked by uid 306); 12 Jan 2009 20:20:49 -0000
Date: Mon, 12 Jan 2009 20:20:00 -0000
Message-ID: <20090112202049.18114.qmail@sourceware.org>
From: tromey@sourceware.org
To: archer-commits@sourceware.org
Subject: [SCM]  archer-tromey-charset: wrote function comments and ChangeLog entry
X-Git-Refname: refs/heads/archer-tromey-charset
X-Git-Reftype: branch
X-Git-Oldrev: bb8d53e80b7e8c3e4ccfeac09c91135c44c1a669
X-Git-Newrev: 53cfd74cde0b5ce17d1c0dd0183858dcf6099710
X-SW-Source: 2009-q1/txt/msg00034.txt.bz2
List-Id: <archer-commits.sourceware.org>

The branch, archer-tromey-charset has been updated
       via  53cfd74cde0b5ce17d1c0dd0183858dcf6099710 (commit)
      from  bb8d53e80b7e8c3e4ccfeac09c91135c44c1a669 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email.

- Log -----------------------------------------------------------------
commit 53cfd74cde0b5ce17d1c0dd0183858dcf6099710
Author: Tom Tromey <tromey@redhat.com>
Date:   Mon Jan 12 13:20:15 2009 -0700

    wrote function comments and ChangeLog entry

-----------------------------------------------------------------------

Summary of changes:
 gdb/ChangeLog     |  178 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 gdb/c-exp.y       |    6 ++
 gdb/c-lang.c      |  131 +++++++++++++++++++++++++++++----------
 gdb/c-valprint.c  |   32 +++-------
 gdb/charset.c     |   44 +++++--------
 gdb/doc/ChangeLog |    4 +
 gdb/parse.c       |   13 ++++
 7 files changed, 325 insertions(+), 83 deletions(-)

First 500 lines of diff:
diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index f37ec17..9e9927e 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,8 +1,182 @@
-2008-12-23  Tom Tromey  <tromey@redhat.com>
-
+2009-01-12  Tom Tromey  <tromey@redhat.com>
+
+	* value.h (value_typed_string): Declare.
+	(val_print_string): Update.
+	* valprint.h (print_char_chars): Update.
+	* valprint.c (print_char_chars): Add type argument.  Update.
+	(val_print_string): Likewise.
+	* valops.c (value_typed_string): New function.
+	* utils.c (do_obstack_free): New function.
+	(make_cleanup_obstack_free): Likewise.
+	(host_char_to_target): New function.
+	(parse_escape): Use host_char_to_target, host_hex_value.  Update.
+	* typeprint.c (print_type_scalar): Update.
+	* scm-valprint.c (scm_scmval_print): Update.
+	* scm-lang.h (scm_printchar, scm_printstr): Update.
+	* scm-lang.c (scm_printchar): Add type argument.
+	(scm_printstr): Likewise.
+	* printcmd.c (print_formatted): Update.
+	(print_scalar_formatted): Update.
+	(printf_command) <wide_string_arg, wide_char_arg>: New constants.
+	Handle '%lc' and '%ls'.
+	* parser-defs.h (struct typed_stoken): New type.
+	(struct stoken_vector): Likewise.
+	(write_exp_string_vector): Declare.
+	* parse.c (write_exp_string_vector): New function.
+	* p-valprint.c (pascal_val_print): Update.
+	* p-lang.h (is_pascal_string_type, pascal_printchar,
+	pascal_printstr): Update.
+	* p-lang.c (is_pascal_string_type): Remove 'char_size' argument.
+	Add 'char_type' argument.
+	(pascal_emit_char): Add type argument.
+	(pascal_printchar): Likewise.
+	(pascal_printstr): Likewise.
+	* objc-lang.c (objc_emit_char): Add type argument.
+	(objc_printchar): Likewise.
+	(objc_printstr): Likewise.
+	* macroexp.c (get_character_constant): Handle unicode characters.
+	(get_string_literal): Handle unicode strings.
+	* m2-valprint.c (print_unpacked_pointer): Update.
+	(m2_print_array_contents): Update.
+	(m2_val_print): Update.
+	* m2-lang.c (m2_emit_char): Add type argument.
+	(m2_printchar): Likewise.
+	(m2_printstr): Likewise.
+	* language.h (struct language_defn) <la_printchar>: Add type
+	argument.
+	<la_printstr, la_emitchar>: Likewise.
+	(LA_PRINT_CHAR): Likewise.
+	(LA_PRINT_STRING): Likewise.
+	(LA_EMIT_CHAR): Likewise.
+	* language.c (unk_lang_emit_char): Add type argument.
+	(unk_lang_printchar): Likewise.
+	(unk_lang_printstr): Likewise.
+	* jv-valprint.c (java_val_print): Update.
+	* jv-lang.c (java_emit_char): Add type argument.
+	* f-valprint.c (f_val_print): Update.
+	* f-lang.c (f_emit_char): Add type argument.
+	(f_printchar): Likewise.
+	(f_printstr): Likewise.
+	* expprint.c (print_subexp_standard): Update.
+	* defs.h (make_cleanup_obstack_free): Declare.
+	* charset.h (target_wide_charset): Declare.
+	(c_target_char_has_backslash_escape, c_parse_backslash,
+	host_char_print_literally, host_char_to_target,
+	target_char_to_host, target_char_to_control_char): Remove.
+	(enum transliterations): New type.
+	(convert_between_encodings): Declare.
+	(HOST_ESCAPE_CHAR): New define.
+	(host_letter_to_control_character, host_hex_value): Declare.
+	* charset-list.h: New file.
+	* c-valprint.c (textual_name): New function.
+	(textual_element_type): Handle wide character types.
+	(c_val_print): Pass original type to textual_element_type.  Handle
+	wide character types.
+	* c-lang.h (enum c_string_type): New type.
+	(c_printchar, c_printstr): Update.
+	* c-lang.c (classify_type): New function.
+	(print_wchar): Likewise.
+	(c_emit_char): Add type argument.  Handle wide characters.
+	(c_printchar): Likewise.
+	(c_printstr): Add type argument.  Handle wide and multibyte
+	character sets.
+	(convert_ucn): New function.
+	(emit_numeric_character): Likewise.
+	(convert_octal): Likewise.
+	(convert_hex): Likewise.
+	(ADVANCE): New macro.
+	(convert_escape): New function.
+	(parse_one_string): Likewise.
+	(evaluate_subexp_c): Likewise.
+	(exp_descriptor_c): New global.
+	(c_language_defn): Use exp_descriptor_c.
+	(cplus_language_defn): Likewise.
+	(asm_language_defn): Likewise.
+	(minimal_language_defn): Likewise.
+	(charset_for_string_type): New function.
+	* c-exp.y (%union): Add 'svec' and 'tsval'.
+	(CHAR): New token.
+	(exp): Add CHAR production.
+	(string_exp): Rewrite.
+	(exp) <string_exp>: Rewrite.
+	(tempbuf): Now global.
+	(tempbuf_init): New global.
+	(parse_string_or_char): New function.
+	(yylex) <tempbuf>: Now global.
+	<tokptr, tempbufindex, tempbufsize, token_string, class_prefix>:
+	Remove.
+	Handle 'u', 'U', and 'L' prefixes.  Call parse_string_or_char.
+	* auxv.c (fprint_target_auxv): Update.
+	* ada-valprint.c (ada_emit_char): Add type argument.
+	(ada_printchar): Likewise.
+	(ada_print_scalar): Update.
+	(printstr): Add type argument.  Update calls to ada_emit_char.
+	(ada_printstr): Add type argument.
+	(ada_val_print_array): Update.
+	(ada_val_print_1): Likewise.
+	* ada-lang.c (emit_char): Add type argument.
+	* ada-lang.h (ada_emit_char, ada_printchar, ada_printstr): Add
+	type arguments.
 	* gdb_locale.h: Include langinfo.h.
 	* charset.c (_initialize_charset): Set default host charset from
-	the locale.
+	the locale.  Don't register charsets.  Add target-wide-charset
+	commands.
+	(struct charset, struct translation): Remove.
+	(GDB_DEFAULT_TARGET_WIDE_CHARSET): New define.
+	(target_wide_charset_name): New global.
+	(show_target_wide_charset_name): New function.
+	(host_charset_enum): Rewrite.
+	(target_charset_enum): Likewise.
+	(target_wide_charset_enum): Likewise.
+	(all_charsets, register_charset, lookup_charset, all_translations,
+	register_translation, lookup_translation): Remove.
+	(simple_charset, ascii_print_literally, ascii_to_control): Remove.
+	(iso_8859_print_literally, iso_8859_to_control,
+	iso_8859_family_charset): Remove.
+	(ebcdic_print_literally, ebcdic_to_control,
+	ebcdic_family_charset): Remove.
+	(struct cached_iconv, check_iconv_cache, cached_iconv_convert,
+	register_iconv_charsets): Remove.
+	(target_wide_charset_be_name, target_wide_charset_le_name): New
+	globals.
+	(identity_either_char_to_other): Remove.
+	(set_be_le_names, validate): New functions.
+	(backslashable, backslashed, represented): Remove.
+	(default_c_target_char_has_backslash_escape): Remove.
+	(default_c_parse_backslash, iconv_convert): Remove.
+	(ascii_to_iso_8859_1_table, ascii_to_ebcdic_us_table,
+	ascii_to_ibm1047_table, iso_8859_1_to_ascii_table,
+	iso_8859_1_to_ebcdic_us_table, iso_8859_1_to_ibm1047_table,
+	ebcdic_us_to_ascii_table, ebcdic_us_to_iso_8859_1_table,
+	ebcdic_us_to_ibm1047_table, ibm1047_to_ascii_table,
+	ibm1047_to_iso_8859_1_table, ibm1047_to_ebcdic_us_table): Remove.
+	(table_convert_char, table_translation, simple_table_translation):
+	Remove.
+	(current_host_charset, current_target_charset,
+	c_target_char_has_backslash_escape_func,
+	c_target_char_has_backslash_escape_baton): Remove.
+	(c_parse_backslash_func, c_parse_backslash_baton): Remove.
+	(host_char_to_target_func, host_char_to_target_baton): Remove.
+	(target_char_to_host_func, target_char_to_host_baton): Remove.
+	(cached_iconv_host_to_target, cached_iconv_target_to_host):
+	Remove.
+	(lookup_charset_or_error, check_valid_host_charset): Remove.
+	(set_host_and_target_charsets): Remove.
+	(set_host_charset, set_target_charset): Remove.
+	(set_host_charset_sfunc, set_target_charset_sfunc): Rewrite.
+	(set_target_wide_charset_sfunc): New function.
+	(show_charset): Print target wide character set.
+	(host_charset, target_charset): Rewrite.
+	(target_wide_charset): New function.
+	(c_target_char_has_backslash_escape): Remove.
+	(c_parse_backslash): Remove.
+	(host_letter_to_control_character): New function.
+	(host_char_print_literally): Remove.
+	(host_hex_value): New function.
+	(target_char_to_control_char): Remove.
+	(cleanup_iconv): New function.
+	(convert_between_encodings): New function.
+	(target_char_to_host): Remove.
 	* aclocal.m4, config.in, configure: Rebuild.
 	* configure.ac: Call AM_LANGINFO_CODESET.
 	* acinclude.m4: Include codeset.m4.
diff --git a/gdb/c-exp.y b/gdb/c-exp.y
index 1a32b67..62d36a0 100644
--- a/gdb/c-exp.y
+++ b/gdb/c-exp.y
@@ -1394,6 +1394,12 @@ parse_number (p, len, parsed_float, putithere)
 static struct obstack tempbuf;
 static int tempbuf_init;
 
+/* Parse a string or character literal from TOKPTR.  The string or
+   character may be wide or unicode.  *OUTPTR is set to just after the
+   end of the literal in the input string.  The resulting token is
+   stored in VALUE.  This returns a token value, either STRING or
+   CHAR, depending on what was parsed.  *HOST_CHARS is set to the
+   number of host characters in the literal.  */
 static int
 parse_string_or_char (char *tokptr, char **outptr, struct typed_stoken *value,
 		      int *host_chars)
diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index 37f2bf4..f3da24b 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -40,10 +40,45 @@
 
 extern void _initialize_c_language (void);
 
+/* Given a C string type, STR_TYPE, return the corresponding target
+   character set name.  */
+
+static const char *
+charset_for_string_type (enum c_string_type str_type)
+{
+  switch (str_type & ~C_CHAR)
+    {
+    case C_STRING:
+      return target_charset ();
+    case C_WIDE_STRING:
+      return target_wide_charset ();
+    case C_STRING_16:
+      /* FIXME: UCS-2 is not always correct.  */
+      if (gdbarch_byte_order (current_gdbarch) == BFD_ENDIAN_BIG)
+	return "UCS-2BE";
+      else
+	return "UCS-2LE";
+    case C_STRING_32:
+      /* FIXME: UCS-4 is not always correct.  */
+      if (gdbarch_byte_order (current_gdbarch) == BFD_ENDIAN_BIG)
+	return "UCS-4BE";
+      else
+	return "UCS-4LE";
+    }
+  internal_error (__FILE__, __LINE__, "unhandled c_string_type");
+}
+
+/* Classify ELTTYPE according to what kind of character it is.  Return
+   the enum constant representing the character type.  Also set
+   *ENCODING to the name of the character set to use when converting
+   characters of this type to the host character set.  */
+
 static enum c_string_type
 classify_type (struct type *elttype, const char **encoding)
 {
   struct type *saved_type = elttype;
+  enum c_string_type result;
+
   /* We do one or two passes -- one on ELTTYPE, and then maybe a
      second one on a typedef target.  */
   do
@@ -52,34 +87,26 @@ classify_type (struct type *elttype, const char **encoding)
 
       if (TYPE_CODE (elttype) == TYPE_CODE_CHAR || !name)
 	{
-	  *encoding = target_charset ();
-	  return C_CHAR;
+	  result = C_CHAR;
+	  goto done;
 	}
 
       if (!strcmp (name, "wchar_t"))
 	{
-	  *encoding = target_wide_charset ();
-	  return C_WIDE_CHAR;
+	  result = C_WIDE_CHAR;
+	  goto done;
 	}
 
       if (!strcmp (name, "char16_t"))
 	{
-	  /* FIXME: UCS-2 is not always correct.  */
-	  if (gdbarch_byte_order (current_gdbarch) == BFD_ENDIAN_BIG)
-	    *encoding = "UCS-2BE";
-	  else
-	    *encoding = "UCS-2LE";
-	  return C_CHAR_16;
+	  result = C_CHAR_16;
+	  goto done;
 	}
 
       if (!strcmp (name, "char32_t"))
 	{
-	  /* FIXME: UCS-4 is not always correct.  */
-	  if (gdbarch_byte_order (current_gdbarch) == BFD_ENDIAN_BIG)
-	    *encoding = "UCS-4BE";
-	  else
-	    *encoding = "UCS-4LE";
-	  return C_CHAR_32;
+	  result = C_CHAR_32;
+	  goto done;
 	}
 
       CHECK_TYPEDEF (elttype);
@@ -87,10 +114,16 @@ classify_type (struct type *elttype, const char **encoding)
   while (elttype != saved_type);
 
   /* Punt.  */
-  *encoding = target_charset ();
-  return C_CHAR;
+  result = C_CHAR;
+
+ done:
+  *encoding = charset_for_string_type (result);
+  return result;
 }
 
+/* A helper function that resets STATE to the initial state, writing
+   any output to OUTPUT.  */
+
 static void
 reset_state (struct obstack *output, mbstate_t *state)
 {
@@ -104,6 +137,10 @@ reset_state (struct obstack *output, mbstate_t *state)
     }
 }
 
+/* Print a wide character W, using the multi-byte state STATE, to
+   OUTPUT.  QUOTER is a (narrow) character indicating the style of
+   quotes surrounding the character to be printed.  */
+
 static void
 print_wchar (wchar_t w, mbstate_t *state, struct obstack *output,
 	     int quoter)
@@ -416,6 +453,12 @@ c_printstr (struct ui_file *stream, struct type *type, const gdb_byte *string,
 
 /* Evaluating C and C++ expressions.  */
 
+/* Convert a UCN.  The digits of the UCN start at P and extend no
+   farther than LIMIT.  DEST_CHARSET is the name of the character set
+   into which the UCN should be converted.  The results are written to
+   OUTPUT.  LENGTH is the maximum length of the UCN, either 4 or 8.
+   Returns a pointer to just after the final digit of the UCN.  */
+
 static char *
 convert_ucn (char *p, char *limit, const char *dest_charset,
 	     struct obstack *output, int length)
@@ -439,6 +482,9 @@ convert_ucn (char *p, char *limit, const char *dest_charset,
   return p;
 }
 
+/* Emit a character, VALUE, which was specified numerically to OUTPUT.
+   TYPE is the target character type.  */
+
 static void
 emit_numeric_character (struct type *type, unsigned long value,
 			struct obstack *output)
@@ -450,10 +496,13 @@ emit_numeric_character (struct type *type, unsigned long value,
   obstack_grow (output, buffer, TYPE_LENGTH (type));
 }
 
+/* Convert an octal escape sequence.  TYPE is the target character
+   type.  The digits of the escape sequence begin at P and extend no
+   farther than LIMIT.  The result is written to OUTPUT.  Returns a
+   pointer to just after the final digit of the escape sequence.  */
+
 static char *
-convert_octal (struct type *type, const char *dest_charset,
-	       char *p, char *limit,
-	       struct obstack *output)
+convert_octal (struct type *type, char *p, char *limit, struct obstack *output)
 {
   unsigned long value = 0;
 
@@ -468,10 +517,13 @@ convert_octal (struct type *type, const char *dest_charset,
   return p;
 }
 
+/* Convert a hex escape sequence.  TYPE is the target character type.
+   The digits of the escape sequence begin at P and extend no farther
+   than LIMIT.  The result is written to OUTPUT.  Returns a pointer to
+   just after the final digit of the escape sequence.  */
+
 static char *
-convert_hex (struct type *type, const char *dest_charset,
-	     char *p, char *limit,
-	     struct obstack *output)
+convert_hex (struct type *type, char *p, char *limit, struct obstack *output)
 {
   unsigned long value = 0;
 
@@ -493,10 +545,16 @@ convert_hex (struct type *type, const char *dest_charset,
       error (_("Malformed escape sequence"));	\
   } while (0)
 
+/* Convert an escape sequence to a target format.  TYPE is the target
+   character type to use, and DEST_CHARSET is the name of the target
+   character set.  The backslash of the escape sequence is at *P, and
+   the escape sequence will not extend past LIMIT.  The results are
+   written to OUTPUT.  Returns a pointer to just past the final
+   character of the escape sequence.  */
+
 static char *
 convert_escape (struct type *type, const char *dest_charset,
-		char *p, char *limit,
-		struct obstack *output)
+		char *p, char *limit, struct obstack *output)
 {
   /* Skip the backslash.  */
   ADVANCE;
@@ -512,7 +570,7 @@ convert_escape (struct type *type, const char *dest_charset,
       ADVANCE;
       if (!isxdigit (*p))
 	error (_("\\x used with no following hex digits."));
-      p = convert_hex (type, dest_charset, p, limit, output);
+      p = convert_hex (type, p, limit, output);
       break;
 
     case '0':
@@ -523,7 +581,7 @@ convert_escape (struct type *type, const char *dest_charset,
     case '5':
     case '6':
     case '7':
-      p = convert_octal (type, dest_charset, p, limit, output);
+      p = convert_octal (type, p, limit, output);
       break;
 
     case 'u':
@@ -540,6 +598,12 @@ convert_escape (struct type *type, const char *dest_charset,
   return p;
 }
 
+/* Given a single string from a (C-specific) OP_STRING list, convert
+   it to a target string, handling escape sequences specially.  The
+   output is written to OUTPUT.  DATA is the input string, which has
+   length LEN.  DEST_CHARSET is the name of the target character set,
+   and TYPE is the type of target character to use.  */
+
 static void
 parse_one_string (struct obstack *output, char *data, int len,
 		  const char *dest_charset, struct type *type)
@@ -565,6 +629,10 @@ parse_one_string (struct obstack *output, char *data, int len,
     }
 }
 
+/* Expression evaluator for the C language family.  Most operations
+   are delegated to evaluate_subexp_standard; see that function for a
+   description of the arguments.  */
+
 static struct value *
 evaluate_subexp_c (struct type *expect_type, struct expression *exp,
 		   int *pos, enum noside noside)
@@ -596,23 +664,20 @@ evaluate_subexp_c (struct type *expect_type, struct expression *exp,
 	switch (dest_type & ~C_CHAR)
 	  {
 	  case C_STRING:
-	    dest_charset = target_charset ();
 	    type = language_string_char_type (current_language,
 					      current_gdbarch);
 	    break;
 	  case C_WIDE_STRING:
-	    dest_charset = target_wide_charset ();
 	    type = lookup_typename ("wchar_t", NULL, 0);
 	    break;
 	  case C_STRING_16:
-	    dest_charset = "UCS-2";
 	    type = lookup_typename ("char16_t", NULL, 0);
 	    break;
 	  case C_STRING_32:
-	    dest_charset = "UCS-4";
 	    type = lookup_typename ("char32_t", NULL, 0);
 	    break;
 	  }
+	dest_charset = charset_for_string_type (dest_type);
 
 	while (*pos < limit)
 	  {
@@ -760,7 +825,7 @@ c_language_arch_info (struct gdbarch *gdbarch,
   lai->bool_type_default = builtin->builtin_int;
 }
 
-const struct exp_descriptor exp_descriptor_c = 
+static const struct exp_descriptor exp_descriptor_c = 
 {
   print_subexp_standard,
   operator_length_standard,
diff --git a/gdb/c-valprint.c b/gdb/c-valprint.c
index 3571005..9e661c5 100644
--- a/gdb/c-valprint.c
+++ b/gdb/c-valprint.c
@@ -58,6 +58,7 @@ print_function_pointer_address (CORE_ADDR address, struct ui_file *stream,
 /* A helper for textual_element_type.  This checks the name of the
    typedef.  This is bogus but it isn't apparent that the compiler
    provides us the help we may need.  */
+
 static int
 textual_name (const char *name)
 {
@@ -163,28 +164,15 @@ c_val_print (struct type *type, const gdb_byte *valaddr, int embedded_offset,
 		{
 		  unsigned int temp_len;
 
-		  temp_len = 0;
-		  while (temp_len < len && temp_len < options->print_max)


hooks/post-receive
--
Repository for Project Archer.
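
The comments added in this commit describe two small pieces of logic that
can be sketched outside of GDB: choosing an encoding name from a string
type and byte order (as charset_for_string_type does), and scanning the
digits of an octal escape between P and LIMIT (as the convert_octal comment
describes). The sketch below is only an illustration under stated
assumptions. Every sketch_* and SK_* name is hypothetical. The narrow and
wide charset names are passed in as parameters rather than taken from
target_charset () and target_wide_charset (). The body of convert_octal is
not shown in the diff, so the three-digit cap here follows the C rule for
octal escapes and may not match what GDB does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GDB's enum c_string_type and for the BFD
   byte-order constants; the real definitions may differ.  */
enum sketch_string_type
{
  SK_STRING,
  SK_WIDE_STRING,
  SK_STRING_16,
  SK_STRING_32
};

enum sketch_byte_order
{
  SK_BIG_ENDIAN,
  SK_LITTLE_ENDIAN
};

/* Return the encoding name for TYPE.  NARROW and WIDE are assumed to be
   the caller's target charset and target wide charset names.  The 16-
   and 32-bit cases pick a byte-order-specific UCS name, mirroring the
   switch in the diff above.  */
static const char *
sketch_charset_for_string_type (enum sketch_string_type type,
				enum sketch_byte_order order,
				const char *narrow, const char *wide)
{
  switch (type)
    {
    case SK_STRING:
      return narrow;
    case SK_WIDE_STRING:
      return wide;
    case SK_STRING_16:
      return order == SK_BIG_ENDIAN ? "UCS-2BE" : "UCS-2LE";
    case SK_STRING_32:
      return order == SK_BIG_ENDIAN ? "UCS-4BE" : "UCS-4LE";
    }
  return NULL;			/* Unhandled type.  */
}

/* Scan an octal escape whose digits begin at P and extend no farther
   than LIMIT.  At most three digits are consumed (an assumption taken
   from the C rule for octal escapes).  The numeric value is stored in
   *VALUE, and a pointer to just after the final digit is returned.  */
static const char *
sketch_convert_octal (const char *p, const char *limit,
		      unsigned long *value)
{
  int digits = 0;

  assert (p <= limit);
  *value = 0;
  while (p < limit && digits < 3 && *p >= '0' && *p <= '7')
    {
      *value = 8 * *value + (unsigned long) (*p - '0');
      ++p;
      ++digits;
    }
  return p;
}
```

With the input "1017", sketch_convert_octal consumes "101", stores 65, and
returns a pointer to the trailing '7'. GDB's own function writes the
converted character to an obstack instead of returning the value; the
sketch returns it through VALUE only to stay self-contained.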