From: Marko Myllynen
Subject: Locales: Thousands separator
To: GNU C Library
Cc: Mike Fabian, Stanislav Brabec, Carlos O'Donell
Date: Wed, 20 Jun 2018 08:11:00 -0000
Message-ID: <5e0e7fec-59b1-8af9-5711-4509975e8f29@redhat.com>

Hi,

Commit 70a6707 [1] changed many locales to use U+202F NARROW NO-BREAK
SPACE (NNBSP) as the thousands separator instead of U+00A0 NO-BREAK
SPACE (NBSP). Neither the patch submission nor the follow-up discussion
[2] cited any standards or references as rationale for this change.

CLDR defines the thousands separator as NBSP for many of the languages
affected by the change (the presentation on the page below is not
optimal; you'll probably need to check the source code to see that it
is indeed U+00A0 which is in use):

https://www.unicode.org/cldr/charts/33/by_type/numbers.symbols.html#a1ef41eaeb6982d

This means that, due to the glibc change, there is now an inconsistency
between the affected glibc locales and any CLDR-using platform. As a
concrete example, Java 9 enabled CLDR locale data by default [3], so
the inconsistency is not limited to differences between systems:
applications running on the same recent GNU/Linux system may use
different thousands separators.

I have been under the impression that the long-term plan for glibc
locales is to use CLDR data as the source to the extent possible, so
this change seems to be at odds with that plan. I found no indication
in CLDR Trac that CLDR would be switching to NNBSP.

This inconsistency also presents a dilemma for keymap definitions when
there is only one feasible key combination available for producing a
non-breaking space: which variant to choose, the glibc one or the CLDR
one?

Given the considerations above, what do the glibc maintainers think
about the current situation? Is this inconsistency seen as an issue?

[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=70a6707fa15e63591d991761be025e26e8d02bb6
[2] https://sourceware.org/ml/libc-alpha/2016-11/msg00062.html
[3] https://docs.oracle.com/javase/9/intl/internationalization-enhancements-jdk-9.htm

Thanks,

-- 
Marko Myllynen
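
As an illustration of the inconsistency described in the mail above, here is a minimal Python sketch. It is not taken from glibc, CLDR, or Java; the helper names (`group_thousands`, `describe_separators`, `system_thousands_sep`) are made up for this example. It groups the same number with each of the two separators and shows that the results look alike but do not compare equal. The last helper reads whatever separator the installed C library reports for a given locale name; which locales are installed and what they return depends on the system.

```python
import locale
import unicodedata

NBSP = "\u00a0"   # U+00A0 NO-BREAK SPACE (the CLDR value, per the mail)
NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE (glibc after commit 70a6707)


def group_thousands(n: int, sep: str) -> str:
    """Group the digits of n in threes, joined by sep (hypothetical helper)."""
    return f"{n:,}".replace(",", sep)


def describe_separators(text: str) -> list[str]:
    """Return the code point and Unicode name of each distinct non-digit."""
    seen = sorted({c for c in text if not c.isdigit()})
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in seen]


def system_thousands_sep(locale_name: str) -> str:
    """Ask the C library for a locale's thousands separator.

    Raises locale.Error if locale_name is not installed. The previous
    LC_NUMERIC setting is restored afterwards.
    """
    previous = locale.setlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.localeconv()["thousands_sep"]
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)


if __name__ == "__main__":
    glibc_style = group_thousands(1234567, NNBSP)
    cldr_style = group_thousands(1234567, NBSP)
    # The two strings render almost identically but are different text.
    print(glibc_style == cldr_style)
    print(describe_separators(glibc_style))
    print(describe_separators(cldr_style))
```

To check a real system, call `system_thousands_sep` with the name of an installed locale and pass the result to `describe_separators`. Output produced under one convention will not round-trip through a parser that expects the other unless the parser accepts both characters.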