From: "oscar.gustafsson at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: libc-locales@sourceware.org
Subject: [Bug localedata/31205] New: Inconsistent (mon_)grouping formats
Date: Tue, 02 Jan 2024 13:28:25 +0000
Message-ID: <bug-31205-716@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=31205

            Bug ID: 31205
           Summary: Inconsistent (mon_)grouping formats
           Product: glibc
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: oscar.gustafsson at gmail dot com
                CC: libc-locales at sourceware dot org
  Target Milestone: ---

I was looking into using number grouping for a project and realized that
the formats used are not consistent. For reference, here is the documentation:

https://sourceware.org/glibc/manual/html_node/General-Numeric.html

These are the two issues I've found:

* Many locales have the same digit repeated, e.g., en_US
https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/en_US;h=5cc518dff2fc1309e5cddd86950d6e9898a2d7e1;hb=refs/heads/master#l75
As far as I can tell, a single 3 should be enough there, as is the case for,
e.g., en_HK
https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/en_HK;h=5f797e076099c4972d3c74fe92e5a6607c3bae95;hb=refs/heads/master#l84

* Some locales have 0;0 as grouping, e.g. el_GR
https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/el_GR;h=285e1e009276476f2aa2d2745177944c7b34a78b;hb=HEAD
Not sure what this is supposed to mean, but, e.g., POSIX has -1 to indicate
"no grouping"
https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/POSIX;h=7ec7f1c5774ab1fb011c08e2e17d435923e48fe2;hb=refs/heads/master#l262 

Note that the documentation says "The last member is either 0, in which case
the previous member is used over and over again for all the remaining
groups...", i.e., the 0 is the string terminator. With 0;0, however, the string
consists of three string termination characters, with no previous member to
repeat.
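
To make the quoted rule concrete, here is a minimal Python sketch of how a
consumer might apply the grouping bytes. The function name group_digits, the
"," separator, and the choice to treat a leading 0 as "no grouping" are
assumptions made for illustration; this is not glibc's implementation.
CHAR_MAX is taken to be 127, i.e., a signed char.

```python
CHAR_MAX = 127  # assumption: char is signed, so -1 in a locale file ends up as CHAR_MAX


def group_digits(digits, grouping, sep=","):
    """Insert sep into a digit string according to localeconv()-style grouping.

    grouping holds the byte values that precede the terminating NUL,
    e.g. [3] for "3" and [3, 3] for "3;3".
    """
    groups = []
    rest = digits
    size = None  # most recent real group size, if any
    i = 0
    while rest:
        # Running off the end of the list models hitting the NUL terminator.
        value = grouping[i] if i < len(grouping) else 0
        if value == CHAR_MAX:
            break  # no further grouping
        if value == 0:
            if size is None:
                break  # no previous member to repeat (assumed: no grouping)
            # otherwise keep repeating the previous size
        else:
            size = value
            i += 1
        if len(rest) <= size:
            break  # remaining digits form the leftmost group
        groups.append(rest[-size:])
        rest = rest[:-size]
    groups.append(rest)
    return sep.join(reversed(groups))
```

Under these assumptions, group_digits("1234567", [3]) and
group_digits("1234567", [3, 3]) produce the same string, while [0, 0] and
[CHAR_MAX] both leave the digits ungrouped.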

To some extent this also applies to mon_grouping, at least the first issue.

I guess the impact of this issue depends on the situation. The first one will
just waste a few bytes (and lead to confusion), but the second may lead to
weird results, at least in code using the raw localedata information without
noticing this.
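
As an illustration of what code using the raw data could do defensively, here
is a small Python sketch that maps both variants onto one canonical form. The
name normalize_grouping and the decision to map an empty or leading-0 value to
[CHAR_MAX] are assumptions for this example, not something glibc provides.

```python
CHAR_MAX = 127  # assumption: char is signed, so -1 in a locale file ends up as CHAR_MAX


def normalize_grouping(raw):
    """Return a canonical list of the grouping bytes that precede the NUL."""
    values = []
    for v in raw:
        if v == 0:
            break  # NUL terminator: anything after it is never read
        values.append(v)
        if v == CHAR_MAX:
            break  # nothing after CHAR_MAX is meaningful
    if not values:
        # Empty or leading 0: no member to repeat, treated here as no grouping.
        return [CHAR_MAX]
    # A trailing repeat such as 3;3 is redundant because the last size
    # is repeated implicitly anyway.
    while len(values) >= 2 and values[-1] == values[-2] and values[-1] != CHAR_MAX:
        values.pop()
    return values
```

With this helper, [3, 3] and [3] both normalize to [3], and [0, 0] normalizes
to [CHAR_MAX]. A value such as [3, 3, CHAR_MAX] is left alone, since it means
"group twice, then stop" and cannot be shortened.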

If people agree that this should be made consistent (it is not obvious what
to replace 0;0 with, probably -1?), I'd be happy to provide a patch. (I'd be
even happier to do that using standard git access; I can provide some
credentials showing that I know how to use it, etc.)

-- 
You are receiving this mail because:
You are on the CC list for the bug.

