Subject: Re: de_DE has been using the wrong group separator for over 18 years
To: kdex, libc-alpha@sourceware.org
From: Florian Weimer
Date: Wed, 18 Apr 2018 07:14:00 -0000
Message-ID: <3e1607ab-e44e-9b28-5fd2-541b3313906d@redhat.com>
In-Reply-To: <7224816.qpMlRvYOtE@punchy>

On 04/18/2018 12:24 AM, kdex wrote:
> To give some context: I have previously posted the following on libc-locales
> and was asked to bring this to the attention of senior developers on this
> list who speak German.
>
> I have noticed that the locale `de_DE` has erroneously been using a full stop
> (U+002E) for the thousands (group) separator in `mon_thousands_sep` and
> `thousands_sep` ever since 2000. The usage of a full stop to group thousands
> has (to my knowledge) never been standardized.
>
> As per DIN 1333, DIN 5008, and DIN EN ISO 80000, the separator should have
> been a thin space (U+2009).
>
> In fact, DIN 1333 even explicitly forbids the usage of U+002E to group
> thousands, and DIN EN ISO 80000 explicitly excludes all characters other
> than a thin space.

These standards are simply not universally used. They aren't exactly
wrong, either, because some typesetters actually use a (thin) space.
It's just that adoption is poor. U+002E is perfectly acceptable and
widely used, especially if U+2009 is not available (and U+0020 risks
introducing a line break).

Here's a recent example:

»Die Finanzkontrolle Schwarzarbeit überprüfte im Jahr 2017 mehr als
52.000 Arbeitgeber und leitete fast 108.000 Strafverfahren ein. Die
Anzahl der eingeleiteten Ermittlungsverfahren wegen der Nichtgewährung
des gesetzlichen Mindestlohns nach dem Mindestlohngesetz stieg auf
2.522 Verfahren (2016: 1.651; 2015: 705).«

(Also look at the date at the top of the page; it doesn't follow
DIN ISO 8601, either.)

I don't think the locales need to change. Using characters from the
ASCII range for printing numbers has its advantages.

Thanks,
Florian
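
To illustrate the ASCII point above, here is a minimal Python sketch. The
helper `group_digits` is hypothetical and is not a glibc or locale API; it
only inserts a separator between groups of three digits, whereas a real
program would take the separator from the locale's `thousands_sep` field.
It compares a full stop (U+002E) with a thin space (U+2009) on an
ASCII-only channel.

```python
def group_digits(value: int, sep: str) -> str:
    """Hypothetical helper: put `sep` between groups of three digits."""
    # Use Python's built-in "," grouping, then swap in the wanted separator.
    return f"{value:,}".replace(",", sep)


ascii_style = group_digits(108000, ".")     # full stop, U+002E
din_style = group_digits(108000, "\u2009")  # thin space, U+2009

# The full-stop form passes through an ASCII-only encoder unchanged.
ascii_bytes = ascii_style.encode("ascii")

# The thin-space form cannot be represented in ASCII at all.
try:
    din_style.encode("ascii")
    thin_space_is_ascii = True
except UnicodeEncodeError:
    thin_space_is_ascii = False

# In UTF-8 the thin space takes three bytes, the full stop only one.
utf8_len = len(din_style.encode("utf-8"))
```

With these inputs the full-stop variant stays within ASCII, while the
thin-space variant needs a Unicode-capable encoding and output device.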