Date: Sun, 16 Oct 2022 09:24:31 +0300
Message-Id: <837d105lr4.fsf@gnu.org>
From: Eli Zaretskii
To: Tom Tromey
Cc: gdb-patches@sourceware.org
In-Reply-To: <878rlgedug.fsf@tromey.com> (message from Tom Tromey on Sat, 15 Oct 2022 19:50:31 -0600)
Subject: Re: [PATCH] gdb: add UTF16/UTF32 target charsets in phony_iconv

> From: Tom Tromey
> Date: Sat, 15 Oct 2022 19:50:31 -0600
> Cc: Patrick Monnerat via Gdb-patches
>
> However, I think there's a better way to fix all this.  It's very simple
> and offhand I don't know why I didn't think of it before... the use of
> wchar_t depends on knowing the encoding of wchar_t -- and I think we do
> know this on mingw, as it's a form of UTF-16.

Beware: wchar_t on MS-Windows _is_ indeed UTF-16, but that means a
single wchar_t character can only represent characters within the BMP;
anything beyond the BMP will need a 'wchar_t *' string whose length is
at least 2.  So if the code converts single characters, on Windows it
can only do that with BMP codepoints.

(Ignore me if what I say makes no sense or is not useful: I wasn't
tracking this discussion.)
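
To illustrate the BMP limitation, here is a minimal sketch of UTF-16 encoding for a single codepoint. It is not code from GDB or from the patch under discussion: the function name utf16_encode is hypothetical, and it uses uint16_t in place of the 16-bit Windows wchar_t so that it behaves the same on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Encode one Unicode codepoint CP as UTF-16 code units in OUT.
   Returns the number of units written (1 or 2), or 0 if CP is not a
   valid scalar value (a surrogate, or above U+10FFFF).  */
static size_t
utf16_encode (uint32_t cp, uint16_t out[2])
{
  assert (out != NULL);

  if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;

  if (cp < 0x10000)
    {
      /* Within the BMP: one 16-bit unit holds the codepoint.  */
      out[0] = (uint16_t) cp;
      return 1;
    }

  /* Beyond the BMP: split the 20-bit offset into a surrogate pair.  */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 + (cp >> 10));    /* high surrogate */
  out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));  /* low surrogate */
  return 2;
}
```

With this sketch, U+20AC (within the BMP) yields one unit, while U+1F600 (beyond the BMP) yields the two units 0xD83D 0xDE00, so a conversion routine that hands back a single 16-bit unit per character cannot represent the latter.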