public inbox for libc-locales@sourceware.org
* [Bug localedata/14010] New: Serious omissions in alphabetic character class
@ 2012-04-23  4:22 bugdal at aerifal dot cx
  2012-09-21 23:46 ` [Bug localedata/14010] " bugdal at aerifal dot cx
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: bugdal at aerifal dot cx @ 2012-04-23  4:22 UTC (permalink / raw)
  To: libc-locales

http://sourceware.org/bugzilla/show_bug.cgi?id=14010

             Bug #: 14010
           Summary: Serious omissions in alphabetic character class
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
        AssignedTo: unassigned@sourceware.org
        ReportedBy: bugdal@aerifal.cx
                CC: libc-locales@sources.redhat.com
    Classification: Unclassified


The localedata generation code defines is_alpha based on Unicode categories L*,
plus Nl, Nd, and a moderate number of special cases, mostly to fix Thai language
support (that is, to stop is_alpha returning false for letters in category Mn).
However, Thai is not the only language affected; any language that uses
non-spacing letters is broken by glibc's deficient is_alpha definition. As a
particular example, all of the Tibetan subjoined letters are considered
non-alphabetic (and thus punctuation) by glibc.

Unicode addresses this issue by defining the Other_Alphabetic property in
PropList.txt and the Alphabetic derived property in DerivedCoreProperties.txt,
the latter of which consists of Lu+Ll+Lt+Lm+Lo+Nl + Other_Alphabetic. This
subsumes all special-case hacks for Thai in glibc's gen-unicode-ctype.c and
fixes the issue (at least approximately) for all other languages/scripts at the
same time.

glibc's localedata should adopt the definition of Alphabetic from Unicode's 
DerivedCoreProperties.txt (and still add Nd and the special cases from So).
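
A minimal sketch of the definition described above, in Python rather than the C
of gen-unicode-ctype.c, and not glibc's actual generator code. The names
parse_property, is_alphabetic and is_alpha_proposed are hypothetical, and the
two data lines are hand-written in the PropList.txt format for illustration; a
real generator would read the full file from the Unicode Character Database.
The So special cases are omitted.

```python
import unicodedata

# Hand-written sample lines in the PropList.txt format (an assumption made
# for this sketch; a real generator would read the complete UCD file).
PROPLIST_EXCERPT = """\
0E31          ; Other_Alphabetic # Mn       THAI CHARACTER MAI HAN-AKAT
0F90..0F97    ; Other_Alphabetic # Mn   [8] TIBETAN SUBJOINED LETTER KA..TIBETAN SUBJOINED LETTER JA
"""

# General categories that make up the Alphabetic derived property,
# before Other_Alphabetic is added.
LETTER_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}


def parse_property(text, prop):
    """Return the set of code points that `text` assigns to property `prop`."""
    points = set()
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        span, name = (field.strip() for field in data.split(";"))
        if name != prop:
            continue
        first, _, last = span.partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        points.update(range(lo, hi + 1))
    return points


def is_alphabetic(cp, other_alphabetic):
    """Alphabetic = Lu+Ll+Lt+Lm+Lo+Nl + Other_Alphabetic."""
    category = unicodedata.category(chr(cp))
    return category in LETTER_CATEGORIES or cp in other_alphabetic


def is_alpha_proposed(cp, other_alphabetic):
    """Proposed is_alpha: Alphabetic plus Nd (So special cases not modeled)."""
    return (is_alphabetic(cp, other_alphabetic)
            or unicodedata.category(chr(cp)) == "Nd")


other = parse_property(PROPLIST_EXCERPT, "Other_Alphabetic")
# U+0F90 TIBETAN SUBJOINED LETTER KA is in category Mn, so a test based on
# general categories alone rejects it; Other_Alphabetic brings it in.
print(is_alphabetic(0x0F90, set()), is_alphabetic(0x0F90, other))
```

Passing an empty set for other_alphabetic approximates a definition based only
on general categories, which makes it easy to list the characters whose
classification would change.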

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
@ 2012-09-21 23:46 ` bugdal at aerifal dot cx
  2012-09-23 22:39 ` joseph at codesourcery dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugdal at aerifal dot cx @ 2012-09-21 23:46 UTC (permalink / raw)
  To: libc-locales


http://sourceware.org/bugzilla/show_bug.cgi?id=14010

--- Comment #1 from Rich Felker <bugdal at aerifal dot cx> 2012-09-21 23:07:28 UTC ---
Ping. Has anybody looked at this?

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
  2012-09-21 23:46 ` [Bug localedata/14010] " bugdal at aerifal dot cx
@ 2012-09-23 22:39 ` joseph at codesourcery dot com
  2013-10-25 20:32 ` myllynen at redhat dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: joseph at codesourcery dot com @ 2012-09-23 22:39 UTC (permalink / raw)
  To: libc-locales


http://sourceware.org/bugzilla/show_bug.cgi?id=14010

--- Comment #2 from joseph at codesourcery dot com <joseph at codesourcery dot com> 2012-09-23 19:34:48 UTC ---
We know that there are over 500 open bugs and bugs are filed faster than 
they are fixed.  Constructive responses on libc-alpha to 
<http://sourceware.org/ml/libc-alpha/2012-08/msg00611.html> regarding how 
to get more people actively fixing more bugs would be more useful, towards 
the goal of getting down to maybe 100 bugs that are genuinely hard, than 
pinging individual bugs (unless the ping is for something like reminding 
someone to submit a patch or test whether a commit has fixed the bug for 
them - where there is clear in-progress work that may have been forgotten 
about).

There's plenty of room for an interested person to become glibc's 
character set expert and address this bug, bug 14094 and bug 14095 (only 
14095 is particularly likely to be hard) and probably other bugs as well.

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
  2012-09-21 23:46 ` [Bug localedata/14010] " bugdal at aerifal dot cx
  2012-09-23 22:39 ` joseph at codesourcery dot com
@ 2013-10-25 20:32 ` myllynen at redhat dot com
  2013-10-25 20:34 ` bugdal at aerifal dot cx
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: myllynen at redhat dot com @ 2013-10-25 20:32 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=14010

Marko Myllynen <myllynen at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |myllynen at redhat dot com

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
                   ` (4 preceding siblings ...)
  2013-10-25 20:34 ` bugdal at aerifal dot cx
@ 2013-10-25 20:34 ` joseph at codesourcery dot com
  2014-06-25 12:09 ` fweimer at redhat dot com
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: joseph at codesourcery dot com @ 2013-10-25 20:34 UTC (permalink / raw)
  To: libc-locales

http://sourceware.org/bugzilla/show_bug.cgi?id=14010

--- Comment #4 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
On Fri, 25 Oct 2013, bugdal at aerifal dot cx wrote:

> Joseph, thanks for acknowledging this bug. Issue 14094 looks related (as in,
> both could be resolved at the same time, if desired), but 14095 is a completely
> separate matter and I don't think it's helpful to tie them together.

The connection is that they all (and bug 16061) need someone to act as 
glibc's character set / Unicode expert and do a proper analysis of the 
issues involved and the current state of this data in glibc.

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
                   ` (3 preceding siblings ...)
  2013-10-25 20:34 ` bugdal at aerifal dot cx
@ 2013-10-25 20:34 ` bugdal at aerifal dot cx
  2013-10-25 20:34 ` joseph at codesourcery dot com
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugdal at aerifal dot cx @ 2013-10-25 20:34 UTC (permalink / raw)
  To: libc-locales

http://sourceware.org/bugzilla/show_bug.cgi?id=14010

--- Comment #5 from Rich Felker <bugdal at aerifal dot cx> ---
On Fri, Oct 25, 2013 at 03:19:28PM +0000, joseph at codesourcery dot com wrote:
> The connection is that they all (and bug 16061) need someone to act as 
> glibc's character set / Unicode expert and do a proper analysis of the 
> issues involved and the current state of this data in glibc.

My view is that I don't think it requires a collation expert to handle
the fixing of the alphabetic class and/or updating the character class
data to latest Unicode. Collation is a much more specialized expertise
requirement.

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
                   ` (2 preceding siblings ...)
  2013-10-25 20:32 ` myllynen at redhat dot com
@ 2013-10-25 20:34 ` bugdal at aerifal dot cx
  2013-10-25 20:34 ` bugdal at aerifal dot cx
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugdal at aerifal dot cx @ 2013-10-25 20:34 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=14010

--- Comment #3 from Rich Felker <bugdal at aerifal dot cx> ---
Joseph, thanks for acknowledging this bug. Issue 14094 looks related (as in,
both could be resolved at the same time, if desired), but 14095 is a completely
separate matter and I don't think it's helpful to tie them together.

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
                   ` (5 preceding siblings ...)
  2013-10-25 20:34 ` joseph at codesourcery dot com
@ 2014-06-25 12:09 ` fweimer at redhat dot com
  2014-12-04 10:34 ` maiku.fabian at gmail dot com
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: fweimer at redhat dot com @ 2014-06-25 12:09 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=14010

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
                   ` (6 preceding siblings ...)
  2014-06-25 12:09 ` fweimer at redhat dot com
@ 2014-12-04 10:34 ` maiku.fabian at gmail dot com
  2020-04-15 13:51 ` meave390 at gmail dot com
  2020-04-15 14:02 ` meave390 at gmail dot com
  9 siblings, 0 replies; 11+ messages in thread
From: maiku.fabian at gmail dot com @ 2014-12-04 10:34 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=14010

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |maiku.fabian at gmail dot com
         Resolution|---                         |DUPLICATE

--- Comment #6 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Rich Felker from comment #5)
> On Fri, Oct 25, 2013 at 03:19:28PM +0000, joseph at codesourcery dot com
> wrote:
> > The connection is that they all (and bug 16061) need someone to act as 
> > glibc's character set / Unicode expert and do a proper analysis of the 
> > issues involved and the current state of this data in glibc.
> 
> My view is that I don't think it requires a collation expert to handle
> the fixing of the alphabetic class and/or updating the character class
> data to latest Unicode. Collation is a much more specialized expertise
> requirement.

https://sourceware.org/bugzilla/show_bug.cgi?id=14094#c33 and the following
comments address the problem with the alphabetic class and updating the
character classes to the latest Unicode.

So I think we can mark this bug as a duplicate of bug 14094.

*** This bug has been marked as a duplicate of bug 14094 ***

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
                   ` (7 preceding siblings ...)
  2014-12-04 10:34 ` maiku.fabian at gmail dot com
@ 2020-04-15 13:51 ` meave390 at gmail dot com
  2020-04-15 14:02 ` meave390 at gmail dot com
  9 siblings, 0 replies; 11+ messages in thread
From: meave390 at gmail dot com @ 2020-04-15 13:51 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=14010

jack <meave390 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |meave390 at gmail dot com

* [Bug localedata/14010] Serious omissions in alphabetic character class
  2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
                   ` (8 preceding siblings ...)
  2020-04-15 13:51 ` meave390 at gmail dot com
@ 2020-04-15 14:02 ` meave390 at gmail dot com
  9 siblings, 0 replies; 11+ messages in thread
From: meave390 at gmail dot com @ 2020-04-15 14:02 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=14010

Thread overview: 11+ messages
2012-04-23  4:22 [Bug localedata/14010] New: Serious omissions in alphabetic character class bugdal at aerifal dot cx
2012-09-21 23:46 ` [Bug localedata/14010] " bugdal at aerifal dot cx
2012-09-23 22:39 ` joseph at codesourcery dot com
2013-10-25 20:32 ` myllynen at redhat dot com
2013-10-25 20:34 ` bugdal at aerifal dot cx
2013-10-25 20:34 ` bugdal at aerifal dot cx
2013-10-25 20:34 ` joseph at codesourcery dot com
2014-06-25 12:09 ` fweimer at redhat dot com
2014-12-04 10:34 ` maiku.fabian at gmail dot com
2020-04-15 13:51 ` meave390 at gmail dot com
2020-04-15 14:02 ` meave390 at gmail dot com
