From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 18157 invoked by alias); 12 Jan 2009 20:20:50 -0000
Mailing-List: contact archer-commits-help@sourceware.org; run by ezmlm
Sender:
Precedence: bulk
List-Post:
List-Help:
List-Subscribe:
Received: (qmail 18132 invoked by uid 306); 12 Jan 2009 20:20:49 -0000
Date: Mon, 12 Jan 2009 20:20:00 -0000
Message-ID: <20090112202049.18114.qmail@sourceware.org>
From: tromey@sourceware.org
To: archer-commits@sourceware.org
Subject: [SCM] archer-tromey-charset: wrote function comments and ChangeLog entry
X-Git-Refname: refs/heads/archer-tromey-charset
X-Git-Reftype: branch
X-Git-Oldrev: bb8d53e80b7e8c3e4ccfeac09c91135c44c1a669
X-Git-Newrev: 53cfd74cde0b5ce17d1c0dd0183858dcf6099710
X-SW-Source: 2009-q1/txt/msg00034.txt.bz2
List-Id:

The branch, archer-tromey-charset has been updated
       via  53cfd74cde0b5ce17d1c0dd0183858dcf6099710 (commit)
      from  bb8d53e80b7e8c3e4ccfeac09c91135c44c1a669 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email.

- Log -----------------------------------------------------------------
commit 53cfd74cde0b5ce17d1c0dd0183858dcf6099710
Author: Tom Tromey
Date:   Mon Jan 12 13:20:15 2009 -0700

    wrote function comments and ChangeLog entry

-----------------------------------------------------------------------

Summary of changes:
 gdb/ChangeLog     |  178 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 gdb/c-exp.y       |    6 ++
 gdb/c-lang.c      |  131 +++++++++++++++++++++++++++++----------
 gdb/c-valprint.c  |   32 +++-------
 gdb/charset.c     |   44 +++++--------
 gdb/doc/ChangeLog |    4 +
 gdb/parse.c       |   13 ++++
 7 files changed, 325 insertions(+), 83 deletions(-)

First 500 lines of diff:
diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index f37ec17..9e9927e 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,8 +1,182 @@
-2008-12-23 Tom Tromey
-
+2009-01-12 Tom Tromey
+
+	* value.h (value_typed_string): Declare.
+	(val_print_string): Update.
+	* valprint.h (print_char_chars): Update.
+	* valprint.c (print_char_chars): Add type argument. Update.
+	(val_print_string): Likewise.
+	* valops.c (value_typed_string): New function.
+	* utils.c (do_obstack_free): New function.
+	(make_cleanup_obstack_free): Likewise.
+	(host_char_to_target): New function.
+	(parse_escape): Use host_char_to_target, host_hex_value. Update.
+	* typeprint.c (print_type_scalar): Update.
+	* scm-valprint.c (scm_scmval_print): Update.
+	* scm-lang.h (scm_printchar, scm_printstr): Update.
+	* scm-lang.c (scm_printchar): Add type argument.
+	(scm_printstr): Likewise.
+	* printcmd.c (print_formatted): Update.
+	(print_scalar_formatted): Update.
+	(printf_command) : New constants.
+	Handle '%lc' and '%ls'.
+	* parser-defs.h (struct typed_stoken): New type.
+	(struct stoken_vector): Likewise.
+	(write_exp_string_vector): Declare.
+	* parse.c (write_exp_string_vector): New function.
+	* p-valprint.c (pascal_val_print): Update.
+	* p-lang.h (is_pascal_string_type, pascal_printchar,
+	pascal_printstr): Update.
+	* p-lang.c (is_pascal_string_type): Remove 'char_size' argument.
+	Add 'char_type' argument.
+	(pascal_emit_char): Add type argument.
+	(pascal_printchar): Likewise.
+	(pascal_printstr): Likewise.
+	* objc-lang.c (objc_emit_char): Add type argument.
+	(objc_printchar): Likewise.
+	(objc_printstr): Likewise.
+	* macroexp.c (get_character_constant): Handle unicode characters.
+	(get_string_literal): Handle unicode strings.
+	* m2-valprint.c (print_unpacked_pointer): Update.
+	(m2_print_array_contents): Update.
+	(m2_val_print): Update.
+	* m2-lang.c (m2_emit_char): Add type argument.
+	(m2_printchar): Likewise.
+	(m2_printstr): Likewise.
+	* language.h (struct language_defn) : Add type
+	argument.
+	: Likewise.
+	(LA_PRINT_CHAR): Likewise.
+	(LA_PRINT_STRING): Likewise.
+	(LA_EMIT_CHAR): Likewise.
+	* language.c (unk_lang_emit_char): Add type argument.
+	(unk_lang_printchar): Likewise.
+	(unk_lang_printstr): Likewise.
+	* jv-valprint.c (java_val_print): Update.
+	* jv-lang.c (java_emit_char): Add type argument.
+	* f-valprint.c (f_val_print): Update.
+	* f-lang.c (f_emit_char): Add type argument.
+	(f_printchar): Likewise.
+	(f_printstr): Likewise.
+	* expprint.c (print_subexp_standard): Update.
+	* defs.h (make_cleanup_obstack_free): Declare.
+	* charset.h (target_wide_charset): Declare.
+	(c_target_char_has_backslash_escape, c_parse_backslash,
+	host_char_print_literally, host_char_to_target,
+	target_char_to_host, target_char_to_control_char): Remove.
+	(enum transliterations): New type.
+	(convert_between_encodings): Declare.
+	(HOST_ESCAPE_CHAR): New define.
+	(host_letter_to_control_character, host_hex_value): Declare.
+	* charset-list.h: New file.
+	* c-valprint.c (textual_name): New function.
+	(textual_element_type): Handle wide character types.
+	(c_val_print): Pass original type to textual_element_type. Handle
+	wide character types.
+	* c-lang.h (enum c_string_type): New type.
+	(c_printchar, c_printstr): Update.
+	* c-lang.c (classify_type): New function.
+	(print_wchar): Likewise.
+	(c_emit_char): Add type argument. Handle wide characters.
+	(c_printchar): Likewise.
+	(c_printstr): Add type argument. Handle wide and multibyte
+	character sets.
+	(convert_ucn): New function.
+	(emit_numeric_character): Likewise.
+	(convert_octal): Likewise.
+	(convert_hex): Likewise.
+	(ADVANCE): New macro.
+	(convert_escape): New function.
+	(parse_one_string): Likewise.
+	(evaluate_subexp_c): Likewise.
+	(exp_descriptor_c): New global.
+	(c_language_defn): Use exp_descriptor_c.
+	(cplus_language_defn): Likewise.
+	(asm_language_defn): Likewise.
+	(minimal_language_defn): Likewise.
+	(charset_for_string_type): New function.
+	* c-exp.y (%union): Add 'svec' and 'tsval'.
+	(CHAR): New token.
+	(exp): Add CHAR production.
+	(string_exp): Rewrite.
+	(exp) : Rewrite.
+	(tempbuf): Now global.
+	(tempbuf_init): New global.
+	(parse_string_or_char): New function.
+	(yylex) : Now global.
+	:
+	Remove.
+	Handle 'u', 'U', and 'L' prefixes. Call parse_string_or_char.
+	* auxv.c (fprint_target_auxv): Update.
+	* ada-valprint.c (ada_emit_char): Add type argument.
+	(ada_printchar): Likewise.
+	(ada_print_scalar): Update.
+	(printstr): Add type argument. Update calls to ada_emit_char.
+	(ada_printstr): Add type argument.
+	(ada_val_print_array): Update.
+	(ada_val_print_1): Likewise.
+	* ada-lang.c (emit_char): Add type argument.
+	* ada-lang.h (ada_emit_char, ada_printchar, ada_printstr): Add
+	type arguments.
 	* gdb_locale.h: Include langinfo.h.
 	* charset.c (_initialize_charset): Set default host charset from
-	the locale.
+	the locale. Don't register charsets. Add target-wide-charset
+	commands.
+	(struct charset, struct translation): Remove.
+	(GDB_DEFAULT_TARGET_WIDE_CHARSET): New define.
+	(target_wide_charset_name): New global.
+	(show_target_wide_charset_name): New function.
+	(host_charset_enum): Rewrite.
+	(target_charset_enum): Likewise.
+	(target_wide_charset_enum): Likewise.
+	(all_charsets, register_charset, lookup_charset, all_translations,
+	register_translation, lookup_translation): Remove.
+	(simple_charset, ascii_print_literally, ascii_to_control): Remove.
+	(iso_8859_print_literally, iso_8859_to_control,
+	iso_8859_family_charset): Remove.
+	(ebcdic_print_literally, ebcdic_to_control,
+	ebcdic_family_charset): Remove.
+	(struct cached_iconv, check_iconv_cache, cached_iconv_convert,
+	register_iconv_charsets): Remove.
+	(target_wide_charset_be_name, target_wide_charset_le_name): New
+	globals.
+	(identity_either_char_to_other): Remove.
+	(set_be_le_names, validate): New functions.
+	(backslashable, backslashed, represented): Remove.
+	(default_c_target_char_has_backslash_escape): Remove.
+	(default_c_parse_backslash, iconv_convert): Remove.
+	(ascii_to_iso_8859_1_table, ascii_to_ebcdic_us_table,
+	ascii_to_ibm1047_table, iso_8859_1_to_ascii_table,
+	iso_8859_1_to_ebcdic_us_table, iso_8859_1_to_ibm1047_table,
+	ebcdic_us_to_ascii_table, ebcdic_us_to_iso_8859_1_table,
+	ebcdic_us_to_ibm1047_table, ibm1047_to_ascii_table,
+	ibm1047_to_iso_8859_1_table, ibm1047_to_ebcdic_us_table): Remove.
+	(table_convert_char, table_translation, simple_table_translation):
+	Remove.
+	(current_host_charset, current_target_charset,
+	c_target_char_has_backslash_escape_func,
+	c_target_char_has_backslash_escape_baton): Remove.
+	(c_parse_backslash_func, c_parse_backslash_baton): Remove.
+	(host_char_to_target_func, host_char_to_target_baton): Remove.
+	(target_char_to_host_func, target_char_to_host_baton): Remove.
+	(cached_iconv_host_to_target, cached_iconv_target_to_host):
+	Remove.
+	(lookup_charset_or_error, check_valid_host_charset): Remove.
+	(set_host_and_target_charsets): Remove.
+	(set_host_charset, set_target_charset): Remove.
+	(set_host_charset_sfunc, set_target_charset_sfunc): Rewrite.
+	(set_target_wide_charset_sfunc): New function.
+	(show_charset): Print target wide character set.
+	(host_charset, target_charset): Rewrite.
+	(target_wide_charset): New function.
+	(c_target_char_has_backslash_escape): Remove.
+	(c_parse_backslash): Remove.
+	(host_letter_to_control_character): New function.
+	(host_char_print_literally): Remove.
+	(host_hex_value): New function.
+	(target_char_to_control_char): Remove.
+	(cleanup_iconv): New function.
+	(convert_between_encodings): New function.
+	(target_char_to_host): Remove.
 	* aclocal.m4, config.in, configure: Rebuild.
 	* configure.ac: Call AM_LANGINFO_CODESET.
 	* acinclude.m4: Include codeset.m4.
diff --git a/gdb/c-exp.y b/gdb/c-exp.y
index 1a32b67..62d36a0 100644
--- a/gdb/c-exp.y
+++ b/gdb/c-exp.y
@@ -1394,6 +1394,12 @@ parse_number (p, len, parsed_float, putithere)
 static struct obstack tempbuf;
 static int tempbuf_init;
 
+/* Parse a string or character literal from TOKPTR. The string or
+   character may be wide or unicode. *OUTPTR is set to just after the
+   end of the literal in the input string. The resulting token is
+   stored in VALUE. This returns a token value, either STRING or
+   CHAR, depending on what was parsed. *HOST_CHARS is set to the
+   number of host characters in the literal. */
 static int
 parse_string_or_char (char *tokptr, char **outptr, struct typed_stoken *value,
                       int *host_chars)
diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index 37f2bf4..f3da24b 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -40,10 +40,45 @@
 
 extern void _initialize_c_language (void);
 
+/* Given a C string type, STR_TYPE, return the corresponding target
+   character set name. */
+
+static const char *
+charset_for_string_type (enum c_string_type str_type)
+{
+  switch (str_type & ~C_CHAR)
+    {
+    case C_STRING:
+      return target_charset ();
+    case C_WIDE_STRING:
+      return target_wide_charset ();
+    case C_STRING_16:
+      /* FIXME: UCS-2 is not always correct. */
+      if (gdbarch_byte_order (current_gdbarch) == BFD_ENDIAN_BIG)
+        return "UCS-2BE";
+      else
+        return "UCS-2LE";
+    case C_STRING_32:
+      /* FIXME: UCS-4 is not always correct. */
+      if (gdbarch_byte_order (current_gdbarch) == BFD_ENDIAN_BIG)
+        return "UCS-4BE";
+      else
+        return "UCS-4LE";
+    }
+  internal_error (__FILE__, __LINE__, "unhandled c_string_type");
+}
+
+/* Classify ELTTYPE according to what kind of character it is. Return
+   the enum constant representing the character type. Also set
+   *ENCODING to the name of the character set to use when converting
+   characters of this type to the host character set. */
+
 static enum c_string_type
 classify_type (struct type *elttype, const char **encoding)
 {
   struct type *saved_type = elttype;
+  enum c_string_type result;
+
   /* We do one or two passes -- one on ELTTYPE, and then maybe a
      second one on a typedef target. */
   do
@@ -52,34 +87,26 @@ classify_type (struct type *elttype, const char **encoding)
 
       if (TYPE_CODE (elttype) == TYPE_CODE_CHAR || !name)
         {
-          *encoding = target_charset ();
-          return C_CHAR;
+          result = C_CHAR;
+          goto done;
         }
 
       if (!strcmp (name, "wchar_t"))
         {
-          *encoding = target_wide_charset ();
-          return C_WIDE_CHAR;
+          result = C_WIDE_CHAR;
+          goto done;
         }
 
       if (!strcmp (name, "char16_t"))
         {
-          /* FIXME: UCS-2 is not always correct. */
-          if (gdbarch_byte_order (current_gdbarch) == BFD_ENDIAN_BIG)
-            *encoding = "UCS-2BE";
-          else
-            *encoding = "UCS-2LE";
-          return C_CHAR_16;
+          result = C_CHAR_16;
+          goto done;
         }
 
       if (!strcmp (name, "char32_t"))
         {
-          /* FIXME: UCS-4 is not always correct. */
-          if (gdbarch_byte_order (current_gdbarch) == BFD_ENDIAN_BIG)
-            *encoding = "UCS-4BE";
-          else
-            *encoding = "UCS-4LE";
-          return C_CHAR_32;
+          result = C_CHAR_32;
+          goto done;
         }
 
       CHECK_TYPEDEF (elttype);
@@ -87,10 +114,16 @@ classify_type (struct type *elttype, const char **encoding)
   while (elttype != saved_type);
 
   /* Punt. */
-  *encoding = target_charset ();
-  return C_CHAR;
+  result = C_CHAR;
+
+ done:
+  *encoding = charset_for_string_type (result);
+  return result;
 }
 
+/* A helper function that resets STATE to the initial state, writing
+   any output to OUTPUT. */
+
 static void
 reset_state (struct obstack *output, mbstate_t *state)
 {
@@ -104,6 +137,10 @@ reset_state (struct obstack *output, mbstate_t *state)
     }
 }
 
+/* Print a wide character W, using the multi-byte state STATE, to
+   OUTPUT. QUOTER is a (narrow) character indicating the style of
+   quotes surrounding the character to be printed. */
+
 static void
 print_wchar (wchar_t w, mbstate_t *state, struct obstack *output,
              int quoter)
@@ -416,6 +453,12 @@ c_printstr (struct ui_file *stream, struct type *type, const gdb_byte *string,
 /* Evaluating C and C++ expressions. */
 
+/* Convert a UCN. The digits of the UCN start at P and extend no
+   farther than LIMIT. DEST_CHARSET is the name of the character set
+   into which the UCN should be converted. The results are written to
+   OUTPUT. LENGTH is the maximum length of the UCN, either 4 or 8.
+   Returns a pointer to just after the final digit of the UCN. */
+
 static char *
 convert_ucn (char *p, char *limit, const char *dest_charset,
              struct obstack *output, int length)
@@ -439,6 +482,9 @@ convert_ucn (char *p, char *limit, const char *dest_charset,
   return p;
 }
 
+/* Emit a character, VALUE, which was specified numerically to OUTPUT.
+   TYPE is the target character type. */
+
 static void
 emit_numeric_character (struct type *type, unsigned long value,
                         struct obstack *output)
@@ -450,10 +496,13 @@ emit_numeric_character (struct type *type, unsigned long value,
   obstack_grow (output, buffer, TYPE_LENGTH (type));
 }
 
+/* Convert an octal escape sequence. TYPE is the target character
+   type. The digits of the escape sequence begin at P and extend no
+   farther than LIMIT. The result is written to OUTPUT. Returns a
+   pointer to just after the final digit of the escape sequence. */
+
 static char *
-convert_octal (struct type *type, const char *dest_charset,
-               char *p, char *limit,
-               struct obstack *output)
+convert_octal (struct type *type, char *p, char *limit, struct obstack *output)
 {
   unsigned long value = 0;
 
@@ -468,10 +517,13 @@ convert_octal (struct type *type, const char *dest_charset,
   return p;
 }
 
+/* Convert a hex escape sequence. TYPE is the target character type.
+   The digits of the escape sequence begin at P and extend no farther
+   than LIMIT. The result is written to OUTPUT. Returns a pointer to
+   just after the final digit of the escape sequence. */
+
 static char *
-convert_hex (struct type *type, const char *dest_charset,
-             char *p, char *limit,
-             struct obstack *output)
+convert_hex (struct type *type, char *p, char *limit, struct obstack *output)
 {
   unsigned long value = 0;
 
@@ -493,10 +545,16 @@ convert_hex (struct type *type, const char *dest_charset,
       error (_("Malformed escape sequence")); \
   } while (0)
 
+/* Convert an escape sequence to a target format. TYPE is the target
+   character type to use, and DEST_CHARSET is the name of the target
+   character set. The backslash of the escape sequence is at *P, and
+   the escape sequence will not extend past LIMIT. The results are
+   written to OUTPUT. Returns a pointer to just past the final
+   character of the escape sequence. */
+
 static char *
 convert_escape (struct type *type, const char *dest_charset,
-                char *p, char *limit,
-                struct obstack *output)
+                char *p, char *limit, struct obstack *output)
 {
   /* Skip the backslash. */
   ADVANCE;
@@ -512,7 +570,7 @@ convert_escape (struct type *type, const char *dest_charset,
       ADVANCE;
       if (!isxdigit (*p))
         error (_("\\x used with no following hex digits."));
-      p = convert_hex (type, dest_charset, p, limit, output);
+      p = convert_hex (type, p, limit, output);
       break;
 
     case '0':
@@ -523,7 +581,7 @@ convert_escape (struct type *type, const char *dest_charset,
     case '5':
     case '6':
     case '7':
-      p = convert_octal (type, dest_charset, p, limit, output);
+      p = convert_octal (type, p, limit, output);
       break;
 
     case 'u':
@@ -540,6 +598,12 @@ convert_escape (struct type *type, const char *dest_charset,
   return p;
 }
 
+/* Given a single string from a (C-specific) OP_STRING list, convert
+   it to a target string, handling escape sequences specially. The
+   output is written to OUTPUT. DATA is the input string, which has
+   length LEN. DEST_CHARSET is the name of the target character set,
+   and TYPE is the type of target character to use. */
+
 static void
 parse_one_string (struct obstack *output, char *data, int len,
                   const char *dest_charset, struct type *type)
@@ -565,6 +629,10 @@ parse_one_string (struct obstack *output, char *data, int len,
     }
 }
 
+/* Expression evaluator for the C language family. Most operations
+   are delegated to evaluate_subexp_standard; see that function for a
+   description of the arguments. */
+
 static struct value *
 evaluate_subexp_c (struct type *expect_type, struct expression *exp,
                    int *pos, enum noside noside)
@@ -596,23 +664,20 @@ evaluate_subexp_c (struct type *expect_type, struct expression *exp,
         switch (dest_type & ~C_CHAR)
           {
           case C_STRING:
-            dest_charset = target_charset ();
             type = language_string_char_type (current_language,
                                               current_gdbarch);
             break;
           case C_WIDE_STRING:
-            dest_charset = target_wide_charset ();
             type = lookup_typename ("wchar_t", NULL, 0);
             break;
           case C_STRING_16:
-            dest_charset = "UCS-2";
             type = lookup_typename ("char16_t", NULL, 0);
             break;
           case C_STRING_32:
-            dest_charset = "UCS-4";
             type = lookup_typename ("char32_t", NULL, 0);
             break;
           }
+        dest_charset = charset_for_string_type (dest_type);
 
         while (*pos < limit)
           {
@@ -760,7 +825,7 @@ c_language_arch_info (struct gdbarch *gdbarch,
   lai->bool_type_default = builtin->builtin_int;
 }
 
-const struct exp_descriptor exp_descriptor_c =
+static const struct exp_descriptor exp_descriptor_c =
 {
   print_subexp_standard,
   operator_length_standard,
diff --git a/gdb/c-valprint.c b/gdb/c-valprint.c
index 3571005..9e661c5 100644
--- a/gdb/c-valprint.c
+++ b/gdb/c-valprint.c
@@ -58,6 +58,7 @@ print_function_pointer_address (CORE_ADDR address, struct ui_file *stream,
 /* A helper for textual_element_type. This checks the name of the
    typedef. This is bogus but it isn't apparent that the compiler
    provides us the help we may need. */
+
 static int
 textual_name (const char *name)
 {
@@ -163,28 +164,15 @@ c_val_print (struct type *type, const gdb_byte *valaddr, int embedded_offset,
             {
               unsigned int temp_len;
 
-              temp_len = 0;
-              while (temp_len < len && temp_len < options->print_max)


hooks/post-receive
--
Repository for Project Archer.