* [Bug localedata/19852] [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
@ 2016-03-22 11:19 ` egmont at gmail dot com
2016-03-22 18:33 ` vapier at gentoo dot org
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: egmont at gmail dot com @ 2016-03-22 11:19 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
--- Comment #1 from Egmont Koblinger <egmont at gmail dot com> ---
Forwarding the VTE maintainer's observation here:
The bug was introduced in glibc commit
4a4839c94a4c93ffc0d5b95c69a08b02a57007f2. It is due to a bug in the Unicode
generation scripts; see
https://sourceware.org/bugzilla/show_bug.cgi?id=14094#c18, where the problem was
mentioned but the wrong choice was made. The script needs to be smarter.
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A
@ 2016-03-22 11:19 egmont at gmail dot com
2016-03-22 11:19 ` [Bug localedata/19852] " egmont at gmail dot com
` (13 more replies)
0 siblings, 14 replies; 15+ messages in thread
From: egmont at gmail dot com @ 2016-03-22 11:19 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Bug ID: 19852
Summary: [2.22 Regression] Incorrect wcwidth for U+3099 and
U+309A
Product: glibc
Version: 2.23
Status: NEW
Severity: normal
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: egmont at gmail dot com
CC: libc-locales at sourceware dot org
Target Milestone: ---
(After running setlocale() with en_US.UTF-8 or something similar)
wcwidth() for U+3099 and U+309A (and presumably a few others) returns:
· 0 in glibc up to 2.21,
· 2 in glibc 2.22 & 2.23.
Quoting from Unicode 8.0:
http://unicode.org/reports/tr11/
"ED4. East Asian Wide (W): All other characters that are always wide."
"6.2 Combining Marks [...] nonspacing marks used only with wide characters are
given a W."
http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
"3099..309A;W # Mn [2] COMBINING [...]"
According to these, I believe the correct return value would be 0 (it's a
non-spacing mark).
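The conflict between the two properties quoted above can be inspected with Python's standard `unicodedata` module. This is only a sketch of the Unicode properties involved, not of glibc's wcwidth() or its generation scripts; the helper name `describe` is made up for illustration, and the values reported depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def describe(cp):
    """Return (general category, East Asian Width, combining class) for a code point."""
    ch = chr(cp)
    return (
        unicodedata.category(ch),          # e.g. "Mn" for a non-spacing mark
        unicodedata.east_asian_width(ch),  # e.g. "W" for East Asian Wide
        unicodedata.combining(ch),         # canonical combining class, 0 if none
    )


# The two code points named in this report.
for cp in (0x3099, 0x309A):
    category, eaw, ccc = describe(cp)
    print(f"U+{cp:04X}: category={category} east_asian_width={eaw} combining_class={ccc}")
```

A code point that is both a non-spacing mark and East Asian Wide is exactly the case where a width table has to decide which property wins.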
Markus Kuhn's wcwidth (https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c) also
returns 0.
This was originally reported against VTE (the terminal emulation widget behind
gnome-terminal and others), where it causes incorrect rendering:
https://bugzilla.gnome.org/show_bug.cgi?id=762052. The conclusion there
(beginning at comment 22) was the same: wcwidth() should return 0.
* [Bug localedata/19852] [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
2016-03-22 11:19 ` [Bug localedata/19852] " egmont at gmail dot com
2016-03-22 18:33 ` vapier at gentoo dot org
@ 2016-03-22 18:33 ` vapier at gentoo dot org
2016-04-07 18:17 ` vapier at gentoo dot org
` (10 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: vapier at gentoo dot org @ 2016-03-22 18:33 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Mike Frysinger <vapier at gentoo dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also| |https://sourceware.org/bugzilla/show_bug.cgi?id=14094
* [Bug localedata/19852] [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
2016-03-22 11:19 ` [Bug localedata/19852] " egmont at gmail dot com
@ 2016-03-22 18:33 ` vapier at gentoo dot org
2016-03-22 18:33 ` vapier at gentoo dot org
` (11 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: vapier at gentoo dot org @ 2016-03-22 18:33 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Mike Frysinger <vapier at gentoo dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |aoliva at redhat dot com
* [Bug localedata/19852] [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (2 preceding siblings ...)
2016-03-22 18:33 ` vapier at gentoo dot org
@ 2016-04-07 18:17 ` vapier at gentoo dot org
2016-04-07 18:18 ` [Bug localedata/19852] charmaps/UTF-8: incorrect " vapier at gentoo dot org
` (9 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: vapier at gentoo dot org @ 2016-04-07 18:17 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Mike Frysinger <vapier at gentoo dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also| |https://sourceware.org/bugzilla/show_bug.cgi?id=19919
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (3 preceding siblings ...)
2016-04-07 18:17 ` vapier at gentoo dot org
@ 2016-04-07 18:18 ` vapier at gentoo dot org
2016-04-22 5:18 ` vapier at gentoo dot org
` (8 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: vapier at gentoo dot org @ 2016-04-07 18:18 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Mike Frysinger <vapier at gentoo dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mfabian at suse dot de
Summary|[2.22 Regression] Incorrect |charmaps/UTF-8: incorrect
|wcwidth for U+3099 and |wcwidth for U+3099 and
|U+309A |U+309A
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (4 preceding siblings ...)
2016-04-07 18:18 ` [Bug localedata/19852] charmaps/UTF-8: incorrect " vapier at gentoo dot org
@ 2016-04-22 5:18 ` vapier at gentoo dot org
2016-04-22 9:36 ` egmont at gmail dot com
` (7 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: vapier at gentoo dot org @ 2016-04-22 5:18 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
--- Comment #2 from Mike Frysinger <vapier at gentoo dot org> ---
Isn't the issue fundamentally that the official Unicode data is wrong? If so,
once this is fixed at unicode.org, will glibc pick up the fix automatically?
They have a form for it:
http://unicode.org/reporting.html
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (5 preceding siblings ...)
2016-04-22 5:18 ` vapier at gentoo dot org
@ 2016-04-22 9:36 ` egmont at gmail dot com
2016-04-22 22:02 ` egmont at gmail dot com
` (6 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: egmont at gmail dot com @ 2016-04-22 9:36 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
--- Comment #3 from Egmont Koblinger <egmont at gmail dot com> ---
I cannot tell whether it's a bug or an unfortunate design in the Unicode
database, sorry.
Either way, even if it's a Unicode bug, glibc used to contain a workaround for
it, which was accidentally removed and should probably be restored for the
time being.
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (6 preceding siblings ...)
2016-04-22 9:36 ` egmont at gmail dot com
@ 2016-04-22 22:02 ` egmont at gmail dot com
2016-04-22 22:02 ` vapier at gentoo dot org
` (5 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: egmont at gmail dot com @ 2016-04-22 22:02 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
--- Comment #5 from Egmont Koblinger <egmont at gmail dot com> ---
(In reply to Mike Frysinger from comment #4)
> seems like bug 4335 is also related ...
Not too much, I think.
That one is about defining locales where ambiguous-width characters take up 2
cells instead of 1.
This one is about the width of the combining accents themselves, which are
intended to be applied on top of double-wide (not ambiguous, but clearly
double-wide) characters.
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (7 preceding siblings ...)
2016-04-22 22:02 ` egmont at gmail dot com
@ 2016-04-22 22:02 ` vapier at gentoo dot org
2017-07-11 14:23 ` tg at mirbsd dot de
` (4 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: vapier at gentoo dot org @ 2016-04-22 22:02 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Mike Frysinger <vapier at gentoo dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also| |https://sourceware.org/bugzilla/show_bug.cgi?id=4335
--- Comment #4 from Mike Frysinger <vapier at gentoo dot org> ---
I think we should get this clarified/documented before we continue to stumble
blindly, hoping for the best :)
seems like bug 4335 is also related ...
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (8 preceding siblings ...)
2016-04-22 22:02 ` vapier at gentoo dot org
@ 2017-07-11 14:23 ` tg at mirbsd dot de
2017-08-15 9:34 ` maiku.fabian at gmail dot com
` (3 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: tg at mirbsd dot de @ 2017-07-11 14:23 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Thorsten Glaser <tg at mirbsd dot de> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |tg at mirbsd dot de
--- Comment #6 from Thorsten Glaser <tg at mirbsd dot de> ---
I’ve filed https://sourceware.org/bugzilla/show_bug.cgi?id=21750 noting _all_
differences from Markus Kuhn’s xterm code (updated for Unicode 10) to the
current glibc localedata.
For this particular problem, the fix is easy (interestingly enough, I had a
similar bug in MirBSD when redoing the wcwidth code): read EastAsianWidth
before, not after, UnicodeData, so the NSM bidi class overrides the width set
by the former.
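The ordering argument can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the code in unicode-gen/utf8_gen.py: the names `EAST_ASIAN_WIDTH`, `UNICODE_DATA` and `build_widths` are hypothetical, and the inline tuples stand in for a handful of entries from EastAsianWidth.txt and UnicodeData.txt.

```python
# Hypothetical miniature inputs standing in for the real Unicode data files.
EAST_ASIAN_WIDTH = [
    # (first code point, last code point, East Asian Width property)
    (0x3042, 0x3042, "W"),  # HIRAGANA LETTER A
    (0x3099, 0x309A, "W"),  # the two combining sound marks from this bug
]
UNICODE_DATA = [
    # (code point, general category, bidi class)
    (0x3042, "Lo", "L"),
    (0x3099, "Mn", "NSM"),
    (0x309A, "Mn", "NSM"),
]


def build_widths(east_asian_width, unicode_data):
    """Build a code point -> width map in which UnicodeData has precedence."""
    widths = {}
    # Pass 1: East Asian Wide / Fullwidth ranges get width 2.
    for first, last, prop in east_asian_width:
        if prop in ("W", "F"):
            for cp in range(first, last + 1):
                widths[cp] = 2
    # Pass 2: runs afterwards, so non-spacing marks overwrite the width
    # assigned in pass 1 instead of being overwritten by it.
    for cp, category, bidi in unicode_data:
        if category in ("Mn", "Me") or bidi == "NSM":
            widths[cp] = 0
    return widths


widths = build_widths(EAST_ASIAN_WIDTH, UNICODE_DATA)
for cp in sorted(widths):
    print(f"U+{cp:04X} -> width {widths[cp]}")
```

Swapping the two passes in this toy model leaves U+3099 and U+309A at width 2, which matches the regression described in this bug.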
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (9 preceding siblings ...)
2017-07-11 14:23 ` tg at mirbsd dot de
@ 2017-08-15 9:34 ` maiku.fabian at gmail dot com
2017-08-17 9:07 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-08-15 9:34 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |maiku.fabian at gmail dot com
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (10 preceding siblings ...)
2017-08-15 9:34 ` maiku.fabian at gmail dot com
@ 2017-08-17 9:07 ` cvs-commit at gcc dot gnu.org
2017-08-17 13:52 ` maiku.fabian at gmail dot com
2017-08-17 15:56 ` maiku.fabian at gmail dot com
13 siblings, 0 replies; 15+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2017-08-17 9:07 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
--- Comment #7 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".
The branch, master has been updated
via bb6274ee1293a6bc76d9d7c889783303de181295 (commit)
via c14b84baae83bfb73f7cd00ba7c24964ad1c712c (commit)
via 7a79e321c6f85b204036c33d85f6b2aa794e7c76 (commit)
via 267ee5d7ab57591a6b1bc2d2a010c88188427063 (commit)
via 41b6f0ce85d98c62739b04863e8c38a1f4154e80 (commit)
via 580be3035d2e0f479c4ac955bf719b0bf936f5cf (commit)
from 038d1cafafb3094a9fbebd35f4aa8d0ebae0e55b (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=bb6274ee1293a6bc76d9d7c889783303de181295
commit bb6274ee1293a6bc76d9d7c889783303de181295
Author: Akhilesh Kumar <akhilesh.k@samsung.com>
Date: Wed Aug 16 15:33:58 2017 +0530
Fix abmon for bem_ZM
Until now the abbreviated month names were in English.
[BZ #21960]
* locales/bem_ZM (LC_TIME): Fix abmon, make it agree with CLDR.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c14b84baae83bfb73f7cd00ba7c24964ad1c712c
commit c14b84baae83bfb73f7cd00ba7c24964ad1c712c
Author: Akhilesh Kumar <akhilesh.k@samsung.com>
Date: Wed Aug 16 18:01:53 2017 +0530
Fix country name for xh_ZA
[BZ #21959]
* locales/xh_ZA (LC_ADDRESS): Fix country name.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7a79e321c6f85b204036c33d85f6b2aa794e7c76
commit 7a79e321c6f85b204036c33d85f6b2aa794e7c76
Author: Thorsten Glaser <tg@mirbsd.de>
Date: Fri Jul 14 14:02:50 2017 +0200
Refresh generated charmap data and ChangeLog
[BZ #21750]
* charmaps/UTF-8: Refresh.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=267ee5d7ab57591a6b1bc2d2a010c88188427063
commit 267ee5d7ab57591a6b1bc2d2a010c88188427063
Author: Thorsten Glaser <tg@mirbsd.de>
Date: Fri Jul 14 14:02:46 2017 +0200
Resolve some historically special cases of ambiguous width
[BZ #21750]
* unicode-gen/utf8_gen.py (U+00AD): Set width to 1.
* unicode-gen/utf8_gen.py (U+1160..U+11FF): Set width to 0.
* unicode-gen/utf8_gen.py (U+3248..U+324F): Set width to 2.
* unicode-gen/utf8_gen.py (U+4DC0..U+4DFF): Likewise.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=41b6f0ce85d98c62739b04863e8c38a1f4154e80
commit 41b6f0ce85d98c62739b04863e8c38a1f4154e80
Author: Thorsten Glaser <tg@mirbsd.de>
Date: Fri Jul 14 14:02:44 2017 +0200
Handle more cases of combining characters
[BZ #21750]
* unicode-gen/utf8_gen.py: Treat category Me and Mn as combining.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=580be3035d2e0f479c4ac955bf719b0bf936f5cf
commit 580be3035d2e0f479c4ac955bf719b0bf936f5cf
Author: Thorsten Glaser <tg@mirbsd.de>
Date: Fri Jul 14 14:02:37 2017 +0200
UnicodeData has precedence over EastAsianWidth
[BZ #19852]
[BZ #21750]
* unicode-gen/utf8_gen.py: Process EastAsianWidth lines before
UnicodeData lines so the latter have precedence; remove hack
to group output by EastAsianWidth ranges.
-----------------------------------------------------------------------
Summary of changes:
localedata/ChangeLog | 24 +
localedata/charmaps/UTF-8 |111468 +++++++++++++++++++++++++++++++++++-
localedata/locales/bem_ZM | 25 +-
localedata/locales/xh_ZA | 5 +-
localedata/unicode-gen/utf8_gen.py | 38 +-
5 files changed, 111400 insertions(+), 160 deletions(-)
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (11 preceding siblings ...)
2017-08-17 9:07 ` cvs-commit at gcc dot gnu.org
@ 2017-08-17 13:52 ` maiku.fabian at gmail dot com
2017-08-17 15:56 ` maiku.fabian at gmail dot com
13 siblings, 0 replies; 15+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-08-17 13:52 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at sourceware dot org |maiku.fabian at gmail dot com
Target Milestone|--- |2.27
* [Bug localedata/19852] charmaps/UTF-8: incorrect wcwidth for U+3099 and U+309A
2016-03-22 11:19 [Bug localedata/19852] New: [2.22 Regression] Incorrect wcwidth for U+3099 and U+309A egmont at gmail dot com
` (12 preceding siblings ...)
2017-08-17 13:52 ` maiku.fabian at gmail dot com
@ 2017-08-17 15:56 ` maiku.fabian at gmail dot com
13 siblings, 0 replies; 15+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-08-17 15:56 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=19852
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #8 from Mike FABIAN <maiku.fabian at gmail dot com> ---
FIXED thanks to Thorsten Glaser.