From: "carlos at redhat dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug localedata/31370] wcwidth() does not treat DEFAULT_IGNORABLE_CODE_POINTs as zero-width
Date: Mon, 12 Feb 2024 13:45:07 +0000
Message-ID: <bug-31370-131-ufjLuIBZy3@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-31370-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=31370

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-12
                 CC|                            |carlos at redhat dot com

--- Comment #2 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Jules Bertholet from comment #0)
> Unicode specifies (https://www.unicode.org/faq/unsup_char.html#3) that
> characters with the `Default_Ignorable_Code_Point` property
> 
> > should be rendered as completely invisible (and non advancing, i.e. “zero width”), if not explicitly supported in rendering.
> 
> Hence, `wcwidth()` should give them all a width of 0, with two exceptions:

Please provide a patch to libc-alpha@sourceware.org following:
https://sourceware.org/glibc/wiki/Contribution%20checklist

Please also provide justification for the zero width by quoting another
implementation that also provides zero width, e.g. CLDR.

The goal is for glibc to harmonize closer to CLDR.

It seems sensible to me that they would be zero width if they are
non-advancing, but that isn't always what an end user needs (as seen below).

> - the soft hyphen (U+00AD SOFT HYPHEN) is assigned width 1 by longstanding
> precedent

We use 1 in UTF-8 (the default width), so this matches. The expectation is that
the caller wants the width for the case where the hyphen is actually chosen for
display.

> - U+115F HANGUL CHOSEONG FILLER combines with jungseong and jongseong jamo
> to form a width-2 syllable block, and should therefore keep its width 2

We use 2 in UTF-8. So this matches.
<U1100>...<U115F>       2

> However, `wcwidth()` currently also incorrectly assigns non-zero width to
> U+3164 HANGUL FILLER and U+FFA0 HALFWIDTH HANGUL FILLER.

This needs justification by highlighting that we are harmonizing the
implementation with CLDR.

Currently we have:
<U3131>...<U318E>       2

U+FFA0, meanwhile, falls back to the default width of 1.

Thanks for filing this issue.

