From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tom Tromey
To: gdb-patches@sourceware.org
Cc: Tom Tromey
Subject: [PATCH v2 12/18] Add a default encoding to generic_emit_char and generic_printstr
Date: Thu, 17 Feb 2022 15:05:40 -0700
Message-Id: <20220217220547.3874030-13-tom@tromey.com>
In-Reply-To: <20220217220547.3874030-1-tom@tromey.com>
References: <20220217220547.3874030-1-tom@tromey.com>

This adds a default encoding to generic_emit_char and
generic_printstr.  The default is pretty basic: use the target charset
for single-byte characters, use the wide charset for wchar_t, and
assume UTF-16/32 for the appropriately-sized other characters.
Languages for which these do not hold can be modified to do something
else if need be.
---
 gdb/valprint.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/gdb/valprint.c b/gdb/valprint.c
index ecb9b3c9871..39c75e82a71 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -2245,6 +2245,37 @@ default_emit_wchar (obstack_wide_file *stream,
     }
 }
 
+/* Helper function to get the default encoding, given a type.  */
+static const char *
+get_default_encoding (struct type *chtype)
+{
+  const char *encoding;
+  if (TYPE_LENGTH (chtype) == 1)
+    encoding = target_charset (chtype->arch ());
+  else if (streq (chtype->name (), "wchar_t"))
+    encoding = target_wide_charset (chtype->arch ());
+  else if (TYPE_LENGTH (chtype) == 2)
+    {
+      if (type_byte_order (chtype) == BFD_ENDIAN_BIG)
+        encoding = "UTF-16BE";
+      else
+        encoding = "UTF-16LE";
+    }
+  else if (TYPE_LENGTH (chtype) == 4)
+    {
+      if (type_byte_order (chtype) == BFD_ENDIAN_BIG)
+        encoding = "UTF-32BE";
+      else
+        encoding = "UTF-32LE";
+    }
+  else
+    {
+      /* No idea.  */
+      encoding = target_charset (chtype->arch ());
+    }
+  return encoding;
+}
+
 /* Print the character C on STREAM as part of the contents of a
    literal string whose delimiter is QUOTER.  ENCODING names the
    encoding of C.  */
@@ -2254,6 +2285,8 @@ generic_emit_char (int c, struct type *type, struct ui_file *stream,
                    int quoter, const char *encoding,
                    emit_char_ftype emitter)
 {
+  if (encoding == nullptr)
+    encoding = get_default_encoding (type);
   enum bfd_endian byte_order
     = type_byte_order (type);
   gdb_byte *c_buf;
@@ -2590,6 +2623,8 @@ generic_printstr (struct ui_file *stream, struct type *type,
                   const struct value_print_options *options,
                   emit_char_ftype emitter)
 {
+  if (encoding == nullptr)
+    encoding = get_default_encoding (type);
   enum bfd_endian byte_order = type_byte_order (type);
   unsigned int i;
   int width = TYPE_LENGTH (type);
-- 
2.31.1