public inbox for glibc-bugs@sourceware.org
From: "carlos at redhat dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug localedata/31370] wcwidth() does not treat DEFAULT_IGNORABLE_CODE_POINTs as zero-width
Date: Mon, 12 Feb 2024 13:45:07 +0000
Message-ID: <bug-31370-131-ufjLuIBZy3@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-31370-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=31370

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-12
                 CC|                            |carlos at redhat dot com

--- Comment #2 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Jules Bertholet from comment #0)
> Unicode specifies (https://www.unicode.org/faq/unsup_char.html#3) that
> characters with the `Default_Ignorable_Code_Point` property
>
> > should be rendered as completely invisible (and non advancing, i.e. “zero width”), if not explicitly supported in rendering.
>
> Hence, `wcwidth()` should give them all a width of 0, with two exceptions:

Please provide a patch to libc-alpha@sourceware.org following:
https://sourceware.org/glibc/wiki/Contribution%20checklist

Please also provide justification for the zero width by quoting another
implementation that also provides zero width e.g. CLDR.

The goal is for glibc to harmonize closer to CLDR.

It seems sensible to me that they would be zero width if they are
non-advancing, but that isn't always what an end user needs (as seen below).

> - the soft hyphen (U+00AD SOFT HYPHEN) is assigned width 1 by longstanding
> precedent

We use 1 in UTF-8 (default width). So this matches.

The expectation is that the system is trying to determine a width where the
hyphen is chosen during the display process.

> - U+115F HANGUL CHOSEONG FILLER combines with jungseong and jongseong jamo
> to form a width-2 syllable block, and should therefore keep its width 2

We use 2 in UTF-8. So this matches.

<U1100>...<U115F> 2

> However, `wcwidth()` currently also incorrectly assigns non-zero width to
> U+3164 HANGUL FILLER and U+FFA0 HALFWIDTH HANGUL FILLER.

This needs justification by highlighting that we are harmonizing the
implementation with CLDR.

Currently we have:

<U3131>...<U318E> 2

While U+FFA0 is default 1.

Thanks for filing this issue.

--
You are receiving this mail because:
You are on the CC list for the bug.
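
For readers following the width values discussed above, here is a minimal
sketch in Python of how range entries such as `<U1100>...<U115F> 2` map a
code point to a width, and how the rule proposed in comment #0 would change
the result for the two Hangul fillers. It is an illustration only, not
glibc's implementation: it contains just the two ranges quoted in this
message, assumes the default width of 1 mentioned above for everything else,
and the names `current_width`, `proposed_width`, `DEFAULT_IGNORABLE_SUBSET`
and `KEEP_CURRENT` are hypothetical. The ignorable set is a hand-picked
subset limited to the four code points named in the thread.

```python
import re

# Only the two width ranges quoted in the message. The real table has many
# more entries; anything not listed here falls back to the default of 1.
WIDTH_LINES = """\
<U1100>...<U115F> 2
<U3131>...<U318E> 2
"""

# Matches "<Uxxxx> N" or "<Uxxxx>...<Uyyyy> N".
ENTRY = re.compile(r"<U([0-9A-F]+)>(?:\.\.\.<U([0-9A-F]+)>)?\s+(\d+)")


def parse_width_lines(text):
    """Turn range lines into a list of (first, last, width) tuples."""
    ranges = []
    for line in text.splitlines():
        m = ENTRY.fullmatch(line.strip())
        if m:
            first = int(m.group(1), 16)
            last = int(m.group(2), 16) if m.group(2) else first
            ranges.append((first, last, int(m.group(3))))
    return ranges


RANGES = parse_width_lines(WIDTH_LINES)


def current_width(cp, default=1):
    """Width from the quoted ranges, else the default width."""
    for first, last, width in RANGES:
        if first <= cp <= last:
            return width
    return default


# Assumption: a subset of Default_Ignorable_Code_Point restricted to the
# four code points named in the thread, not the full Unicode property.
DEFAULT_IGNORABLE_SUBSET = {0x00AD, 0x115F, 0x3164, 0xFFA0}

# The two exceptions listed in comment #0 keep their existing width.
KEEP_CURRENT = {0x00AD, 0x115F}


def proposed_width(cp):
    """Width under the rule proposed in comment #0 (sketch only)."""
    if cp in DEFAULT_IGNORABLE_SUBSET and cp not in KEEP_CURRENT:
        return 0
    return current_width(cp)


if __name__ == "__main__":
    for cp in (0x00AD, 0x115F, 0x3164, 0xFFA0):
        print(f"U+{cp:04X}: current={current_width(cp)} "
              f"proposed={proposed_width(cp)}")
```

Under these assumptions only U+3164 and U+FFA0 differ between the two
functions, which is the change the report asks for.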