From: Tom Tromey
To: Patrick Monnerat
Cc: Tom Tromey, Patrick Monnerat via Gdb-patches
Subject: Re: [PATCH] gdb: add UTF16/UTF32 target charsets in phony_iconv
Date: Sat, 08 Oct 2022 12:55:36 -0600
Message-ID: <874jwejgbb.fsf@tromey.com>
In-Reply-To: <0a978271-3085-8bf3-f5fd-6a0b3f9f3ea2@monnerat.net> (Patrick Monnerat's message of "Sat, 8 Oct 2022 02:12:02 +0200")
References: <20221002140010.106238-1-patrick@monnerat.net> <87k05bs8c5.fsf@tromey.com> <0a978271-3085-8bf3-f5fd-6a0b3f9f3ea2@monnerat.net>
List-Id: Gdb-patches mailing list

>> I don't recall the outcome from this, but is there no way to improve gdb
>> to use this iconv?  If it truly works well enough then it seems like it
>> would be the better approach.

Patrick> There is a fuzzy source comment about an iconv problem in
Patrick> Solaris. Your bugzilla comment
Patrick> https://sourceware.org/bugzilla/show_bug.cgi?id=29315#c2 also goes in
Patrick> this direction: however we have no details about what the problem is,
Patrick> if we should still support it and/or if there are other systems
Patrick> affected by such an iconv problem.

Ok.  Thank you for the link, I knew we had discussed this somewhere in
the past...

The comments at the top of gdb_wchar.h describe the situation somewhat,
though they don't really explain what was wrong with Solaris.

My recollection, though, is that the Solaris wchar_t doesn't have any
ordinary encoding but is instead a weird hybrid thing, and furthermore
that the Solaris iconv doesn't accept "wchar_t" as an encoding name.
So, on Solaris, there's no convenient way to do the conversions (it's
possible to convert wchar_t to/from the locale's multi-byte encoding,
but I didn't implement that since it seemed like a pain).

All of this is based on the idea that it's convenient to work in a wide
character representation at some points in the code.  At the time, I
figured relying on wchar_t would be good for this because (presumably)
hosts would support that reasonably well and we wouldn't have to do
extra work in gdb.

However, it seems to me that it doesn't really have to be done this
way.  We could use UTF-32 instead, by making our own tables (along the
lines of ada-unicode.py) for "isdigit" and "isprint".

In addition to this, I suppose we could simply require iconv.  Probably
any host that has iconv will support UTF-32 (if not, what good is it
really).  And libiconv exists and can even be conveniently dropped into
the source tree if there are any hosts that don't have it.

This may not be a good plan if there are active host platforms where
this would be a pain to deal with.

Anyway, what do you think of this plan?
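
To make the table idea above a bit more concrete, here is a rough
sketch in Python of how range tables for "isdigit" and "isprint" might
be generated and then looked up by UTF-32 code point.  Everything in it
is an assumption for illustration: the names build_ranges and in_ranges
are hypothetical, "digit" is taken to mean Unicode category Nd,
"printable" is taken to mean whatever Python's str.isprintable says,
and none of it is taken from ada-unicode.py.  A real generator would
presumably emit the tables as C++ source for gdb to search.

```python
import bisect
import struct
import sys
import unicodedata


def build_ranges(predicate):
    """Collapse every code point satisfying predicate into sorted (lo, hi) pairs."""
    ranges = []
    start = None
    for cp in range(sys.maxunicode + 1):
        if predicate(chr(cp)):
            if start is None:
                start = cp
        elif start is not None:
            ranges.append((start, cp - 1))
            start = None
    if start is not None:
        ranges.append((start, sys.maxunicode))
    return ranges


def in_ranges(ranges, cp):
    """Binary-search a (lo, hi) table for the code point cp."""
    # Index of the last range whose lower bound is <= cp, or -1 if none.
    i = bisect.bisect_right(ranges, (cp, sys.maxunicode)) - 1
    return i >= 0 and ranges[i][0] <= cp <= ranges[i][1]


# Assumed definitions of the two classes (see the note above).
DIGITS = build_ranges(lambda c: unicodedata.category(c) == "Nd")
PRINTABLE = build_ranges(str.isprintable)

# With text held as UTF-32, each character is one 32-bit unit, so
# classifying a character is a single table lookup per unit.
data = "7\u00e9\n".encode("utf-32-le")
code_points = struct.unpack("<%dI" % (len(data) // 4), data)
for cp in code_points:
    print("U+%04X digit=%s printable=%s"
          % (cp, in_ranges(DIGITS, cp), in_ranges(PRINTABLE, cp)))
```

Sorted ranges plus a binary search keep the tables small compared with
one flag per code point, and the lookup does not depend on the host's
wchar_t or locale.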

Patrick> Please note I had the intention to determine the Solaris problem and
Patrick> tried to install the last available OpenSolaris (dated 2009) in a VM
Patrick> without success: I gave up.

Yeah, that's fine, I wouldn't have done it myself.

Tom