public inbox for gcc-bugs@sourceware.org
From: "simon at pushface dot org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ada/95959] New: Error in conversion from UTF16 to UTF8
Date: Mon, 29 Jun 2020 11:21:40 +0000
Message-ID: <bug-95959-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95959

            Bug ID: 95959
           Summary: Error in conversion from UTF16 to UTF8
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ada
          Assignee: unassigned at gcc dot gnu.org
          Reporter: simon at pushface dot org
  Target Milestone: ---

Created attachment 48799
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48799&action=edit
Demonstration

There's an error in converting from UTF16 to UTF8 for code points in
U+10000 to U+10FFFF (which require 4 UTF8 bytes).

The attached demonstration shows this by taking a UTF8 character (Clef,
U+1D11E), converting to UTF16, and converting back to UTF8, which should
round-trip back to the same character, but doesn't. The third byte of the
final UTF8 is wrong:

$ ./utftest
Codepoint: 16#1D11E#
UTF-8:   4: 2#11110000# 2#10011101# 2#10000100# 2#10011110#
UTF-16:  2: 2#1101100000110100# 2#1101110100011110#
UTF-8:   4: 2#11110000# 2#10011101# 2#10010000# 2#10011110# Bug

The attached patch corrects the problem.
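For reference, the expected UTF-16 to UTF-8 conversion for a code point above U+FFFF can be sketched independently of GNAT. The Python below is only an illustration of the standard encoding rules, not the attached patch or the Ada runtime code; the names utf16_pair_to_utf8 and as_binary are hypothetical helpers introduced here. It recombines the surrogate pair from the report into the code point and splits that into the four UTF-8 bytes, printing them in the same 2#...# notation as the demonstration.

```python
def utf16_pair_to_utf8(high: int, low: int) -> bytes:
    """Encode one UTF-16 surrogate pair as four UTF-8 bytes (illustrative helper)."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Recombine the two 10-bit halves and restore the 0x10000 offset.
    cp = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    return bytes([
        0xF0 | (cp >> 18),           # 11110xxx: code point bits 20..18
        0x80 | ((cp >> 12) & 0x3F),  # 10xxxxxx: bits 17..12
        0x80 | ((cp >> 6) & 0x3F),   # 10xxxxxx: bits 11..6
        0x80 | (cp & 0x3F),          # 10xxxxxx: bits 5..0
    ])


def as_binary(data: bytes) -> str:
    """Format bytes as Ada-style based literals, matching the report's output."""
    return " ".join(f"2#{b:08b}#" for b in data)


# Surrogate pair for U+1D11E (Clef), taken from the UTF-16 line of the report.
clef = utf16_pair_to_utf8(0xD834, 0xDD1E)
print(as_binary(clef))
```

Under these rules the third byte carries bits 11..6 of the code point, which for U+1D11E is 2#10000100#, the value shown in the first UTF-8 line of the report rather than the 2#10010000# produced after the round trip.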