* [PATCH 0/2] Make C/POSIX and C.UTF-8 consistent.
@ 2022-01-31 5:34 Carlos O'Donell
2022-01-31 5:34 ` [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point Carlos O'Donell
2022-01-31 5:34 ` [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX Carlos O'Donell
0 siblings, 2 replies; 15+ messages in thread
From: Carlos O'Donell @ 2022-01-31 5:34 UTC (permalink / raw)
To: libc-alpha, fweimer, michael.hudson
We had a recent report from Michael Hudson-Doyle that he had seen a
problem with C.UTF-8 when running tests for rrdtool. The report
prompted Florian to place this on the glibc 2.35 blocker for review.
Upon review I decided to haromize C.UTF-8 closer to C/POSIX and I
worked with Florian to fix the discrepancies between the C.UTF-8
locale and the builtin C/POSIX locale. The work uncoverd a problem
in the parsing of LC_MONETARY by localedef which needed fixing in order
to make C/POSIX and C.UTF-8 consistent. The first commit fixes the
mon_decimal_point handling in localedef parsing, while the second commit
fixes C.UTF-8 and adds a new test to check for consistency beween
C/POSIX and C.UTF-8. The test is based on work that Florian Weimer did
to help me identify the inconsistencies between the locales.
Carlos O'Donell (2):
localedef: Fix handling of empty mon_decimal_point
localedata: Adjust C.UTF-8 to align with C/POSIX.
locale/programs/ld-monetary.c | 4 +-
localedata/Makefile | 30 +-
localedata/locales/C | 22 +-
localedata/tst-c-utf8-consistency.c | 539 ++++++++++++++++++++++++++++
4 files changed, 579 insertions(+), 16 deletions(-)
create mode 100644 localedata/tst-c-utf8-consistency.c
--
2.31.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-01-31 5:34 [PATCH 0/2] Make C/POSIX and C.UTF-8 consistent Carlos O'Donell
@ 2022-01-31 5:34 ` Carlos O'Donell
2022-01-31 15:26 ` Florian Weimer
2022-02-01 11:47 ` Florian Weimer
2022-01-31 5:34 ` [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX Carlos O'Donell
1 sibling, 2 replies; 15+ messages in thread
From: Carlos O'Donell @ 2022-01-31 5:34 UTC (permalink / raw)
To: libc-alpha, fweimer, michael.hudson
The handling of mon_decimal_point is incorrect when it comes to
handling the empty "" value. The existing parser in monetary_read()
will correctly handle setting the non-wide-character value and the
wide-character value e.g. STR_ELEM_WC(mon_decimal_point) if they are
set in the locale definition. However, in monetary_finish() we have
conflicting TEST_ELEM() which sets a default value (if the locale
definition doesn't include one), and subsequent code which looks for
mon_decimal_point to be NULL to issue a specific error message and set
the defaults. The latter is unused because TEST_ELEM() always sets a
default. The simplest solution is to remove the TEST_ELEM() check,
and allow the existing check to look to see if mon_decimal_point is
NULL and set an appropriate default. The final fix is to move the
setting of mon_decimal_point_wc so it occurs only when
mon_decimal_point is being set to a default, keeping both values
consistent. There is no way to tell the difference between
mon_decimal_point_wc having been set to the empty string and not
having been defined at all, for that distinction we must use
mon_decimal_point being NULL or "", and so we must logically set
the default together with mon_decimal_point.
Lastly, there are more fixes similar to this that could be made to
ld-monetary.c, but we avoid that in order to fix just the code
required for mon_decimal_point, which impacts the ability for C.UTF-8
to set mon_decimal_point to "", since without this fix we end up with
an inconsistent setting of mon_decimal_point set to "", but
mon_decimal_point_wc set to "." which is incorrect.
Tested on x86_64 and i686 without regression.
---
locale/programs/ld-monetary.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/locale/programs/ld-monetary.c b/locale/programs/ld-monetary.c
index 277b9ff042..3b0412b405 100644
--- a/locale/programs/ld-monetary.c
+++ b/locale/programs/ld-monetary.c
@@ -207,7 +207,6 @@ No definition for %s category found"), "LC_MONETARY");
TEST_ELEM (int_curr_symbol, "");
TEST_ELEM (currency_symbol, "");
- TEST_ELEM (mon_decimal_point, ".");
TEST_ELEM (mon_thousands_sep, "");
TEST_ELEM (positive_sign, "");
TEST_ELEM (negative_sign, "");
@@ -257,6 +256,7 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
record_error (0, 0, _("%s: field `%s' not defined"),
"LC_MONETARY", "mon_decimal_point");
monetary->mon_decimal_point = ".";
+ monetary->mon_decimal_point_wc = L'.';
}
else if (monetary->mon_decimal_point[0] == '\0' && ! be_quiet && ! nothing)
{
@@ -264,8 +264,6 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
%s: value for field `%s' must not be an empty string"),
"LC_MONETARY", "mon_decimal_point");
}
- if (monetary->mon_decimal_point_wc == L'\0')
- monetary->mon_decimal_point_wc = L'.';
if (monetary->mon_grouping_len == 0)
{
--
2.31.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX.
2022-01-31 5:34 [PATCH 0/2] Make C/POSIX and C.UTF-8 consistent Carlos O'Donell
2022-01-31 5:34 ` [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point Carlos O'Donell
@ 2022-01-31 5:34 ` Carlos O'Donell
2022-01-31 8:47 ` Andreas Schwab
2022-02-01 12:05 ` Florian Weimer
1 sibling, 2 replies; 15+ messages in thread
From: Carlos O'Donell @ 2022-01-31 5:34 UTC (permalink / raw)
To: libc-alpha, fweimer, michael.hudson
We have had one downstream report from Canonical [1] that
an rrdtool test was broken by the differences in LC_TIME
that we had in the non-builtin C locale (C.UTF-8). If one
application has an issue there are going to be others, and
so with this commit we review and fix all the issues that
cause the builtin C locale to be different from C.UTF-8,
which includes:
* mon_decimal_point should be empty e.g. ""
- Depends on mon_decimal_point_wc fix.
* negative_sign should be empty e.g. ""
* week should be aligned with ISO 30112 default e.g. 7;19971130;4
* d_fmt corrected with escaped slashes e.g. "%m//%d//%y"
* yesstr and nostr should be empty e.g. ""
* country_ab2 and country_ab3 should be empty e.g. ""
We bump LC_IDENTIFICATION version and adjust the date to
indicate the change in the locale.
A new tst-c-utf8-consistency test is added to ensure
consistency between C/POSIX and C.UTF-8.
Tested on x86_64 and i686 without regression.
[1] https://sourceware.org/pipermail/libc-alpha/2022-January/135703.html
Co-authored-by: Florian Weimer <fweimer@redhat.com>
---
localedata/Makefile | 30 +-
localedata/locales/C | 22 +-
localedata/tst-c-utf8-consistency.c | 539 ++++++++++++++++++++++++++++
3 files changed, 578 insertions(+), 13 deletions(-)
create mode 100644 localedata/tst-c-utf8-consistency.c
diff --git a/localedata/Makefile b/localedata/Makefile
index 79db713925..9ae2e5c161 100644
--- a/localedata/Makefile
+++ b/localedata/Makefile
@@ -155,11 +155,31 @@ locale_test_suite := tst_iswalnum tst_iswalpha tst_iswcntrl \
tst_wcsxfrm tst_wctob tst_wctomb tst_wctrans \
tst_wctype tst_wcwidth
-tests = $(locale_test_suite) tst-digits tst-setlocale bug-iconv-trans \
- tst-leaks tst-mbswcs1 tst-mbswcs2 tst-mbswcs3 tst-mbswcs4 tst-mbswcs5 \
- tst-mbswcs6 tst-xlocale1 tst-xlocale2 bug-usesetlocale \
- tst-strfmon1 tst-sscanf bug-setlocale1 tst-setlocale2 tst-setlocale3 \
- tst-wctype tst-iconv-math-trans
+tests = \
+ $(locale_test_suite) \
+ bug-iconv-trans \
+ bug-setlocale1 \
+ bug-usesetlocale \
+ tst-c-utf8-consistency \
+ tst-digits \
+ tst-iconv-math-trans \
+ tst-leaks \
+ tst-mbswcs1 \
+ tst-mbswcs2 \
+ tst-mbswcs3 \
+ tst-mbswcs4 \
+ tst-mbswcs5 \
+ tst-mbswcs6 \
+ tst-setlocale \
+ tst-setlocale2 \
+ tst-setlocale3 \
+ tst-sscanf \
+ tst-strfmon1 \
+ tst-wctype \
+ tst-xlocale1 \
+ tst-xlocale2 \
+ # tests
+
tests-static = bug-setlocale1-static
tests += $(tests-static)
ifeq (yes,$(build-shared))
diff --git a/localedata/locales/C b/localedata/locales/C
index ca801c79cf..fb647ccc4b 100644
--- a/localedata/locales/C
+++ b/localedata/locales/C
@@ -12,8 +12,8 @@ tel ""
fax ""
language ""
territory ""
-revision "2.0"
-date "2020-06-28"
+revision "2.1"
+date "2022-01-30"
category "i18n:2012";LC_IDENTIFICATION
category "i18n:2012";LC_CTYPE
category "i18n:2012";LC_COLLATE
@@ -68,11 +68,11 @@ LC_MONETARY
% glibc/locale/C-monetary.c.).
int_curr_symbol ""
currency_symbol ""
-mon_decimal_point "."
+mon_decimal_point ""
mon_thousands_sep ""
mon_grouping -1
positive_sign ""
-negative_sign "-"
+negative_sign ""
int_frac_digits -1
frac_digits -1
p_cs_precedes -1
@@ -121,7 +121,9 @@ mon "January";"February";"March";"April";"May";"June";"July";/
%
% ISO 8601 conforming applications should use the values 7, 19971201 (a
% Monday), and 4 (Thursday), respectively.
-week 7;19971201;4
+%
+% This field is consciously aligned with ISO 30112 and the C/POSIX locale.
+week 7;19971130;4
first_weekday 1
first_workday 2
@@ -129,7 +131,7 @@ first_workday 2
d_t_fmt "%a %b %e %H:%M:%S %Y"
% Appropriate date representation (%x)
-d_fmt "%m/%d/%y"
+d_fmt "%m//%d//%y"
% Appropriate time representation (%X)
t_fmt "%H:%M:%S"
@@ -150,8 +152,8 @@ LC_MESSAGES
%
yesexpr "^[yY]"
noexpr "^[nN]"
-yesstr "Yes"
-nostr "No"
+yesstr ""
+nostr ""
END LC_MESSAGES
LC_PAPER
@@ -175,6 +177,10 @@ LC_ADDRESS
% the LC_ADDRESS category.
% (also used in the built in C/POSIX locale in glibc/locale/C-address.c)
postal_fmt "%a%N%f%N%d%N%b%N%s %h %e %r%N%C-%z %T%N%c%N"
+% The abbreviated 2 char and 3 char should be set to empty strings to
+% match the C/POSIX locale.
+country_ab2 ""
+country_ab3 ""
END LC_ADDRESS
LC_TELEPHONE
diff --git a/localedata/tst-c-utf8-consistency.c b/localedata/tst-c-utf8-consistency.c
new file mode 100644
index 0000000000..50feed3090
--- /dev/null
+++ b/localedata/tst-c-utf8-consistency.c
@@ -0,0 +1,539 @@
+/* Test that C/POSIX and C.UTF-8 are consistent.
+ Copyright (C) 2022 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <langinfo.h>
+#include <locale.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <support/check.h>
+
+/* Initialized by do_test using newlocale. */
+static locale_t c_utf8;
+
+/* Set to true for second pass. */
+static bool use_nl_langinfo_l;
+
+static void
+switch_to_c (void)
+{
+ if (setlocale (LC_ALL, "C") == NULL)
+ FAIL_EXIT1 ("setlocale (LC_ALL, \"C\")");
+}
+
+static void
+switch_to_c_utf8 (void)
+{
+ if (setlocale (LC_ALL, "C.UTF-8") == NULL)
+ FAIL_EXIT1 ("setlocale (LC_ALL, \"C.UTF-8\")");
+}
+
+static char *
+str (nl_item item)
+{
+ if (!use_nl_langinfo_l)
+ switch_to_c ();
+ return nl_langinfo (item);
+}
+
+static char *
+str_utf8 (nl_item item)
+{
+ if (use_nl_langinfo_l)
+ return nl_langinfo_l (item, c_utf8);
+ else
+ {
+ switch_to_c_utf8 ();
+ return nl_langinfo (item);
+ }
+}
+
+static wchar_t *
+wstr (nl_item item)
+{
+ return (wchar_t *) str (item);
+}
+
+static wchar_t *
+wstr_utf8 (nl_item item)
+{
+ return (wchar_t *) str_utf8 (item);
+}
+
+static int
+byte (nl_item item)
+{
+ return (signed char) *str (item);
+}
+
+static int
+byte_utf8 (nl_item item)
+{
+ return (signed char) *str_utf8 (item);
+}
+
+static int
+word (nl_item item)
+{
+ union
+ {
+ char *ptr;
+ int word;
+ } u;
+ u.ptr = str (item);
+ return u.word;
+}
+
+static int
+word_utf8 (nl_item item)
+{
+ union
+ {
+ char *ptr;
+ int word;
+ } u;
+ u.ptr = str_utf8 (item);
+ return u.word;
+}
+
+static void
+one_pass (void)
+{
+ /* LC_TIME. */
+ TEST_COMPARE_STRING (str (ABDAY_1), str_utf8 (ABDAY_1));
+ TEST_COMPARE_STRING (str (ABDAY_2), str_utf8 (ABDAY_2));
+ TEST_COMPARE_STRING (str (ABDAY_3), str_utf8 (ABDAY_3));
+ TEST_COMPARE_STRING (str (ABDAY_4), str_utf8 (ABDAY_4));
+ TEST_COMPARE_STRING (str (ABDAY_5), str_utf8 (ABDAY_5));
+ TEST_COMPARE_STRING (str (ABDAY_6), str_utf8 (ABDAY_6));
+ TEST_COMPARE_STRING (str (ABDAY_7), str_utf8 (ABDAY_7));
+
+ TEST_COMPARE_STRING (str (DAY_1), str_utf8 (DAY_1));
+ TEST_COMPARE_STRING (str (DAY_2), str_utf8 (DAY_2));
+ TEST_COMPARE_STRING (str (DAY_3), str_utf8 (DAY_3));
+ TEST_COMPARE_STRING (str (DAY_4), str_utf8 (DAY_4));
+ TEST_COMPARE_STRING (str (DAY_5), str_utf8 (DAY_5));
+ TEST_COMPARE_STRING (str (DAY_6), str_utf8 (DAY_6));
+ TEST_COMPARE_STRING (str (DAY_7), str_utf8 (DAY_7));
+
+ TEST_COMPARE_STRING (str (ABMON_1), str_utf8 (ABMON_1));
+ TEST_COMPARE_STRING (str (ABMON_2), str_utf8 (ABMON_2));
+ TEST_COMPARE_STRING (str (ABMON_3), str_utf8 (ABMON_3));
+ TEST_COMPARE_STRING (str (ABMON_4), str_utf8 (ABMON_4));
+ TEST_COMPARE_STRING (str (ABMON_5), str_utf8 (ABMON_5));
+ TEST_COMPARE_STRING (str (ABMON_6), str_utf8 (ABMON_6));
+ TEST_COMPARE_STRING (str (ABMON_7), str_utf8 (ABMON_7));
+ TEST_COMPARE_STRING (str (ABMON_8), str_utf8 (ABMON_8));
+ TEST_COMPARE_STRING (str (ABMON_9), str_utf8 (ABMON_9));
+ TEST_COMPARE_STRING (str (ABMON_10), str_utf8 (ABMON_10));
+ TEST_COMPARE_STRING (str (ABMON_11), str_utf8 (ABMON_11));
+ TEST_COMPARE_STRING (str (ABMON_12), str_utf8 (ABMON_12));
+
+ TEST_COMPARE_STRING (str (MON_1), str_utf8 (MON_1));
+ TEST_COMPARE_STRING (str (MON_2), str_utf8 (MON_2));
+ TEST_COMPARE_STRING (str (MON_3), str_utf8 (MON_3));
+ TEST_COMPARE_STRING (str (MON_4), str_utf8 (MON_4));
+ TEST_COMPARE_STRING (str (MON_5), str_utf8 (MON_5));
+ TEST_COMPARE_STRING (str (MON_6), str_utf8 (MON_6));
+ TEST_COMPARE_STRING (str (MON_7), str_utf8 (MON_7));
+ TEST_COMPARE_STRING (str (MON_8), str_utf8 (MON_8));
+ TEST_COMPARE_STRING (str (MON_9), str_utf8 (MON_9));
+ TEST_COMPARE_STRING (str (MON_10), str_utf8 (MON_10));
+ TEST_COMPARE_STRING (str (MON_11), str_utf8 (MON_11));
+ TEST_COMPARE_STRING (str (MON_12), str_utf8 (MON_12));
+
+ TEST_COMPARE_STRING (str (AM_STR), str_utf8 (AM_STR));
+ TEST_COMPARE_STRING (str (PM_STR), str_utf8 (PM_STR));
+
+ TEST_COMPARE_STRING (str (D_T_FMT), str_utf8 (D_T_FMT));
+ TEST_COMPARE_STRING (str (D_FMT), str_utf8 (D_FMT));
+ TEST_COMPARE_STRING (str (T_FMT), str_utf8 (T_FMT));
+ TEST_COMPARE_STRING (str (T_FMT_AMPM),
+ str_utf8 (T_FMT_AMPM));
+
+ TEST_COMPARE_STRING (str (ERA), str_utf8 (ERA));
+ TEST_COMPARE_STRING (str (ERA_YEAR), str_utf8 (ERA_YEAR));
+ TEST_COMPARE_STRING (str (ERA_D_FMT), str_utf8 (ERA_D_FMT));
+ TEST_COMPARE_STRING (str (ALT_DIGITS), str_utf8 (ALT_DIGITS));
+ TEST_COMPARE_STRING (str (ERA_D_T_FMT), str_utf8 (ERA_D_T_FMT));
+ TEST_COMPARE_STRING (str (ERA_T_FMT), str_utf8 (ERA_T_FMT));
+ TEST_COMPARE (word (_NL_TIME_ERA_NUM_ENTRIES),
+ word_utf8 (_NL_TIME_ERA_NUM_ENTRIES));
+ /* No array elements, so nothing to compare for _NL_TIME_ERA_ENTRIES. */
+ TEST_COMPARE (word (_NL_TIME_ERA_NUM_ENTRIES), 0);
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABDAY_1), wstr_utf8 (_NL_WABDAY_1));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABDAY_2), wstr_utf8 (_NL_WABDAY_2));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABDAY_3), wstr_utf8 (_NL_WABDAY_3));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABDAY_4), wstr_utf8 (_NL_WABDAY_4));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABDAY_5), wstr_utf8 (_NL_WABDAY_5));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABDAY_6), wstr_utf8 (_NL_WABDAY_6));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABDAY_7), wstr_utf8 (_NL_WABDAY_7));
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WDAY_1), wstr_utf8 (_NL_WDAY_1));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WDAY_2), wstr_utf8 (_NL_WDAY_2));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WDAY_3), wstr_utf8 (_NL_WDAY_3));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WDAY_4), wstr_utf8 (_NL_WDAY_4));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WDAY_5), wstr_utf8 (_NL_WDAY_5));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WDAY_6), wstr_utf8 (_NL_WDAY_6));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WDAY_7), wstr_utf8 (_NL_WDAY_7));
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_1), wstr_utf8 (_NL_WABMON_1));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_2), wstr_utf8 (_NL_WABMON_2));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_3), wstr_utf8 (_NL_WABMON_3));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_4), wstr_utf8 (_NL_WABMON_4));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_5), wstr_utf8 (_NL_WABMON_5));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_6), wstr_utf8 (_NL_WABMON_6));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_7), wstr_utf8 (_NL_WABMON_7));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_8), wstr_utf8 (_NL_WABMON_8));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_9), wstr_utf8 (_NL_WABMON_9));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_10), wstr_utf8 (_NL_WABMON_10));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_11), wstr_utf8 (_NL_WABMON_11));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABMON_12), wstr_utf8 (_NL_WABMON_12));
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_1), wstr_utf8 (_NL_WMON_1));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_2), wstr_utf8 (_NL_WMON_2));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_3), wstr_utf8 (_NL_WMON_3));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_4), wstr_utf8 (_NL_WMON_4));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_5), wstr_utf8 (_NL_WMON_5));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_6), wstr_utf8 (_NL_WMON_6));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_7), wstr_utf8 (_NL_WMON_7));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_8), wstr_utf8 (_NL_WMON_8));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_9), wstr_utf8 (_NL_WMON_9));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_10), wstr_utf8 (_NL_WMON_10));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_11), wstr_utf8 (_NL_WMON_11));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WMON_12), wstr_utf8 (_NL_WMON_12));
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WAM_STR), wstr_utf8 (_NL_WAM_STR));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WPM_STR), wstr_utf8 (_NL_WPM_STR));
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WD_T_FMT), wstr_utf8 (_NL_WD_T_FMT));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WD_FMT), wstr_utf8 (_NL_WD_FMT));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WT_FMT), wstr_utf8 (_NL_WT_FMT));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WT_FMT_AMPM),
+ wstr_utf8 (_NL_WT_FMT_AMPM));
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WERA_YEAR), wstr_utf8 (_NL_WERA_YEAR));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WERA_D_FMT), wstr_utf8 (_NL_WERA_D_FMT));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALT_DIGITS),
+ wstr_utf8 (_NL_WALT_DIGITS));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WERA_D_T_FMT),
+ wstr_utf8 (_NL_WERA_D_T_FMT));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WERA_T_FMT), wstr_utf8 (_NL_WERA_T_FMT));
+
+ /* This is somewhat inconsistent, but see locale/categories.def. */
+ TEST_COMPARE (byte (_NL_TIME_WEEK_NDAYS), byte_utf8 (_NL_TIME_WEEK_NDAYS));
+ TEST_COMPARE (word (_NL_TIME_WEEK_1STDAY),
+ word_utf8 (_NL_TIME_WEEK_1STDAY));
+ TEST_COMPARE (byte (_NL_TIME_WEEK_1STWEEK),
+ byte_utf8 (_NL_TIME_WEEK_1STWEEK));
+ TEST_COMPARE (byte (_NL_TIME_FIRST_WEEKDAY),
+ byte_utf8 (_NL_TIME_FIRST_WEEKDAY));
+ TEST_COMPARE (byte (_NL_TIME_FIRST_WORKDAY),
+ byte_utf8 (_NL_TIME_FIRST_WORKDAY));
+ TEST_COMPARE (byte (_NL_TIME_CAL_DIRECTION),
+ byte_utf8 (_NL_TIME_CAL_DIRECTION));
+ TEST_COMPARE_STRING (str (_NL_TIME_TIMEZONE), str_utf8 (_NL_TIME_TIMEZONE));
+
+ TEST_COMPARE_STRING (str (_DATE_FMT), str_utf8 (_DATE_FMT));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_W_DATE_FMT), wstr_utf8 (_NL_W_DATE_FMT));
+
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_TIME_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_TIME_CODESET), "UTF-8");
+
+ TEST_COMPARE_STRING (str (ALTMON_1), str_utf8 (ALTMON_1));
+ TEST_COMPARE_STRING (str (ALTMON_2), str_utf8 (ALTMON_2));
+ TEST_COMPARE_STRING (str (ALTMON_3), str_utf8 (ALTMON_3));
+ TEST_COMPARE_STRING (str (ALTMON_4), str_utf8 (ALTMON_4));
+ TEST_COMPARE_STRING (str (ALTMON_5), str_utf8 (ALTMON_5));
+ TEST_COMPARE_STRING (str (ALTMON_6), str_utf8 (ALTMON_6));
+ TEST_COMPARE_STRING (str (ALTMON_7), str_utf8 (ALTMON_7));
+ TEST_COMPARE_STRING (str (ALTMON_8), str_utf8 (ALTMON_8));
+ TEST_COMPARE_STRING (str (ALTMON_9), str_utf8 (ALTMON_9));
+ TEST_COMPARE_STRING (str (ALTMON_10), str_utf8 (ALTMON_10));
+ TEST_COMPARE_STRING (str (ALTMON_11), str_utf8 (ALTMON_11));
+ TEST_COMPARE_STRING (str (ALTMON_12), str_utf8 (ALTMON_12));
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_1), wstr_utf8 (_NL_WALTMON_1));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_2), wstr_utf8 (_NL_WALTMON_2));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_3), wstr_utf8 (_NL_WALTMON_3));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_4), wstr_utf8 (_NL_WALTMON_4));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_5), wstr_utf8 (_NL_WALTMON_5));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_6), wstr_utf8 (_NL_WALTMON_6));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_7), wstr_utf8 (_NL_WALTMON_7));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_8), wstr_utf8 (_NL_WALTMON_8));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_9), wstr_utf8 (_NL_WALTMON_9));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_10), wstr_utf8 (_NL_WALTMON_10));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_11), wstr_utf8 (_NL_WALTMON_11));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WALTMON_12), wstr_utf8 (_NL_WALTMON_12));
+
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_1), str_utf8 (_NL_ABALTMON_1));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_2), str_utf8 (_NL_ABALTMON_2));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_3), str_utf8 (_NL_ABALTMON_3));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_4), str_utf8 (_NL_ABALTMON_4));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_5), str_utf8 (_NL_ABALTMON_5));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_6), str_utf8 (_NL_ABALTMON_6));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_7), str_utf8 (_NL_ABALTMON_7));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_8), str_utf8 (_NL_ABALTMON_8));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_9), str_utf8 (_NL_ABALTMON_9));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_10), str_utf8 (_NL_ABALTMON_10));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_11), str_utf8 (_NL_ABALTMON_11));
+ TEST_COMPARE_STRING (str (_NL_ABALTMON_12), str_utf8 (_NL_ABALTMON_12));
+
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_1),
+ wstr_utf8 (_NL_WABALTMON_1));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_2),
+ wstr_utf8 (_NL_WABALTMON_2));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_3),
+ wstr_utf8 (_NL_WABALTMON_3));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_4),
+ wstr_utf8 (_NL_WABALTMON_4));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_5),
+ wstr_utf8 (_NL_WABALTMON_5));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_6),
+ wstr_utf8 (_NL_WABALTMON_6));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_7),
+ wstr_utf8 (_NL_WABALTMON_7));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_8),
+ wstr_utf8 (_NL_WABALTMON_8));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_9),
+ wstr_utf8 (_NL_WABALTMON_9));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_10),
+ wstr_utf8 (_NL_WABALTMON_10));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_11),
+ wstr_utf8 (_NL_WABALTMON_11));
+ TEST_COMPARE_STRING_WIDE (wstr (_NL_WABALTMON_12),
+ wstr_utf8 (_NL_WABALTMON_12));
+
+ /* LC_COLLATE. Mostly untested, only expected differences. */
+ TEST_COMPARE_STRING (str (_NL_COLLATE_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_COLLATE_CODESET), "UTF-8");
+
+ /* LC_CTYPE. Mostly untested, only expected differences. */
+ TEST_COMPARE_STRING (str (CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (CODESET), "UTF-8");
+
+ /* LC_MONETARY. */
+ TEST_COMPARE_STRING (str (INT_CURR_SYMBOL), str_utf8 (INT_CURR_SYMBOL));
+ TEST_COMPARE_STRING (str (CURRENCY_SYMBOL), str_utf8 (CURRENCY_SYMBOL));
+ TEST_COMPARE_STRING (str (MON_DECIMAL_POINT), str_utf8 (MON_DECIMAL_POINT));
+ TEST_COMPARE_STRING (str (MON_THOUSANDS_SEP), str_utf8 (MON_THOUSANDS_SEP));
+ TEST_COMPARE_STRING (str (MON_GROUPING), str_utf8 (MON_GROUPING));
+ TEST_COMPARE_STRING (str (POSITIVE_SIGN), str_utf8 (POSITIVE_SIGN));
+ TEST_COMPARE_STRING (str (NEGATIVE_SIGN), str_utf8 (NEGATIVE_SIGN));
+ TEST_COMPARE (byte (INT_FRAC_DIGITS), byte_utf8 (INT_FRAC_DIGITS));
+ TEST_COMPARE (byte (FRAC_DIGITS), byte_utf8 (FRAC_DIGITS));
+ TEST_COMPARE (byte (P_CS_PRECEDES), byte_utf8 (P_CS_PRECEDES));
+ TEST_COMPARE (byte (P_SEP_BY_SPACE), byte_utf8 (P_SEP_BY_SPACE));
+ TEST_COMPARE (byte (N_CS_PRECEDES), byte_utf8 (N_CS_PRECEDES));
+ TEST_COMPARE (byte (N_SEP_BY_SPACE), byte_utf8 (N_SEP_BY_SPACE));
+ TEST_COMPARE (byte (P_SIGN_POSN), byte_utf8 (P_SIGN_POSN));
+ TEST_COMPARE (byte (N_SIGN_POSN), byte_utf8 (N_SIGN_POSN));
+ TEST_COMPARE_STRING (str (CRNCYSTR), str_utf8 (CRNCYSTR));
+ TEST_COMPARE (byte (INT_P_CS_PRECEDES), byte_utf8 (INT_P_CS_PRECEDES));
+ TEST_COMPARE (byte (INT_P_SEP_BY_SPACE), byte_utf8 (INT_P_SEP_BY_SPACE));
+ TEST_COMPARE (byte (INT_N_CS_PRECEDES), byte_utf8 (INT_N_CS_PRECEDES));
+ TEST_COMPARE (byte (INT_N_SEP_BY_SPACE), byte_utf8 (INT_N_SEP_BY_SPACE));
+ TEST_COMPARE (byte (INT_P_SIGN_POSN), byte_utf8 (INT_P_SIGN_POSN));
+ TEST_COMPARE (byte (INT_N_SIGN_POSN), byte_utf8 (INT_N_SIGN_POSN));
+ TEST_COMPARE_STRING (str (_NL_MONETARY_DUO_INT_CURR_SYMBOL),
+ str_utf8 (_NL_MONETARY_DUO_INT_CURR_SYMBOL));
+ TEST_COMPARE_STRING (str (_NL_MONETARY_DUO_CURRENCY_SYMBOL),
+ str_utf8 (_NL_MONETARY_DUO_CURRENCY_SYMBOL));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_FRAC_DIGITS),
+ byte_utf8 (_NL_MONETARY_DUO_INT_FRAC_DIGITS));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_FRAC_DIGITS),
+ byte_utf8 (_NL_MONETARY_DUO_FRAC_DIGITS));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_P_CS_PRECEDES),
+ byte_utf8 (_NL_MONETARY_DUO_P_CS_PRECEDES));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_P_SEP_BY_SPACE),
+ byte_utf8 (_NL_MONETARY_DUO_P_SEP_BY_SPACE));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_N_CS_PRECEDES),
+ byte_utf8 (_NL_MONETARY_DUO_N_CS_PRECEDES));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_N_SEP_BY_SPACE),
+ byte_utf8 (_NL_MONETARY_DUO_N_SEP_BY_SPACE));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_P_CS_PRECEDES),
+ byte_utf8 (_NL_MONETARY_DUO_INT_P_CS_PRECEDES));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_P_SEP_BY_SPACE),
+ byte_utf8 (_NL_MONETARY_DUO_INT_P_SEP_BY_SPACE));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_N_CS_PRECEDES),
+ byte_utf8 (_NL_MONETARY_DUO_INT_N_CS_PRECEDES));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_N_SEP_BY_SPACE),
+ byte_utf8 (_NL_MONETARY_DUO_INT_N_SEP_BY_SPACE));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_P_SIGN_POSN),
+ byte_utf8 (_NL_MONETARY_DUO_INT_P_SIGN_POSN));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_N_SIGN_POSN),
+ byte_utf8 (_NL_MONETARY_DUO_INT_N_SIGN_POSN));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_P_SIGN_POSN),
+ byte_utf8 (_NL_MONETARY_DUO_P_SIGN_POSN));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_N_SIGN_POSN),
+ byte_utf8 (_NL_MONETARY_DUO_N_SIGN_POSN));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_P_SIGN_POSN),
+ byte_utf8 (_NL_MONETARY_DUO_INT_P_SIGN_POSN));
+ TEST_COMPARE (byte (_NL_MONETARY_DUO_INT_N_SIGN_POSN),
+ byte_utf8 (_NL_MONETARY_DUO_INT_N_SIGN_POSN));
+ TEST_COMPARE (word (_NL_MONETARY_UNO_VALID_FROM),
+ word_utf8 (_NL_MONETARY_UNO_VALID_FROM));
+ TEST_COMPARE (word (_NL_MONETARY_UNO_VALID_TO),
+ word_utf8 (_NL_MONETARY_UNO_VALID_TO));
+ TEST_COMPARE (word (_NL_MONETARY_DUO_VALID_FROM),
+ word_utf8 (_NL_MONETARY_DUO_VALID_FROM));
+ TEST_COMPARE (word (_NL_MONETARY_DUO_VALID_TO),
+ word_utf8 (_NL_MONETARY_DUO_VALID_TO));
+ /* _NL_MONETARY_CONVERSION_RATE cannot be tested (word array). */
+ TEST_COMPARE (word (_NL_MONETARY_DECIMAL_POINT_WC),
+ word_utf8 (_NL_MONETARY_DECIMAL_POINT_WC));
+ TEST_COMPARE (word (_NL_MONETARY_THOUSANDS_SEP_WC),
+ word_utf8 (_NL_MONETARY_THOUSANDS_SEP_WC));
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_MONETARY_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_MONETARY_CODESET), "UTF-8");
+
+ /* LC_NUMERIC. */
+
+ TEST_COMPARE_STRING (str (DECIMAL_POINT), str_utf8 (DECIMAL_POINT));
+ TEST_COMPARE_STRING (str (RADIXCHAR), str_utf8 (RADIXCHAR));
+ TEST_COMPARE_STRING (str (THOUSANDS_SEP), str_utf8 (THOUSANDS_SEP));
+ TEST_COMPARE_STRING (str (THOUSEP), str_utf8 (THOUSEP));
+ TEST_COMPARE_STRING (str (GROUPING), str_utf8 (GROUPING));
+ TEST_COMPARE (word (_NL_NUMERIC_DECIMAL_POINT_WC),
+ word_utf8 (_NL_NUMERIC_DECIMAL_POINT_WC));
+ TEST_COMPARE (word (_NL_NUMERIC_THOUSANDS_SEP_WC),
+ word_utf8 (_NL_NUMERIC_THOUSANDS_SEP_WC));
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_NUMERIC_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_NUMERIC_CODESET), "UTF-8");
+
+ /* LC_MESSAGES. */
+
+ TEST_COMPARE_STRING (str (YESEXPR), str_utf8 (YESEXPR));
+ TEST_COMPARE_STRING (str (NOEXPR), str_utf8 (NOEXPR));
+ TEST_COMPARE_STRING (str (YESSTR), str_utf8 (YESSTR));
+ TEST_COMPARE_STRING (str (NOSTR), str_utf8 (NOSTR));
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_MESSAGES_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_MESSAGES_CODESET), "UTF-8");
+
+ /* LC_PAPER. */
+
+ TEST_COMPARE (word (_NL_PAPER_HEIGHT), word_utf8 (_NL_PAPER_HEIGHT));
+ TEST_COMPARE (word (_NL_PAPER_WIDTH), word_utf8 (_NL_PAPER_WIDTH));
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_PAPER_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_PAPER_CODESET), "UTF-8");
+
+ /* LC_NAME. */
+
+ TEST_COMPARE_STRING (str (_NL_NAME_NAME_FMT),
+ str_utf8 (_NL_NAME_NAME_FMT));
+ TEST_COMPARE_STRING (str (_NL_NAME_NAME_GEN),
+ str_utf8 (_NL_NAME_NAME_GEN));
+ TEST_COMPARE_STRING (str (_NL_NAME_NAME_MR),
+ str_utf8 (_NL_NAME_NAME_MR));
+ TEST_COMPARE_STRING (str (_NL_NAME_NAME_MRS),
+ str_utf8 (_NL_NAME_NAME_MRS));
+ TEST_COMPARE_STRING (str (_NL_NAME_NAME_MISS),
+ str_utf8 (_NL_NAME_NAME_MISS));
+ TEST_COMPARE_STRING (str (_NL_NAME_NAME_MS),
+ str_utf8 (_NL_NAME_NAME_MS));
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_NAME_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_NAME_CODESET), "UTF-8");
+
+ /* LC_ADDRESS. */
+
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_POSTAL_FMT),
+ str_utf8 (_NL_ADDRESS_POSTAL_FMT));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_COUNTRY_NAME),
+ str_utf8 (_NL_ADDRESS_COUNTRY_NAME));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_COUNTRY_POST),
+ str_utf8 (_NL_ADDRESS_COUNTRY_POST));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_COUNTRY_AB2),
+ str_utf8 (_NL_ADDRESS_COUNTRY_AB2));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_COUNTRY_AB3),
+ str_utf8 (_NL_ADDRESS_COUNTRY_AB3));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_COUNTRY_CAR),
+ str_utf8 (_NL_ADDRESS_COUNTRY_CAR));
+ TEST_COMPARE (word (_NL_ADDRESS_COUNTRY_NUM),
+ word_utf8 (_NL_ADDRESS_COUNTRY_NUM));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_COUNTRY_ISBN),
+ str_utf8 (_NL_ADDRESS_COUNTRY_ISBN));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_LANG_NAME),
+ str_utf8 (_NL_ADDRESS_LANG_NAME));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_LANG_AB),
+ str_utf8 (_NL_ADDRESS_LANG_AB));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_LANG_TERM),
+ str_utf8 (_NL_ADDRESS_LANG_TERM));
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_LANG_LIB),
+ str_utf8 (_NL_ADDRESS_LANG_LIB));
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_ADDRESS_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_ADDRESS_CODESET), "UTF-8");
+
+ /* LC_TELEPHONE. */
+
+ TEST_COMPARE_STRING (str (_NL_TELEPHONE_TEL_INT_FMT),
+ str_utf8 (_NL_TELEPHONE_TEL_INT_FMT));
+ TEST_COMPARE_STRING (str (_NL_TELEPHONE_TEL_DOM_FMT),
+ str_utf8 (_NL_TELEPHONE_TEL_DOM_FMT));
+ TEST_COMPARE_STRING (str (_NL_TELEPHONE_INT_SELECT),
+ str_utf8 (_NL_TELEPHONE_INT_SELECT));
+ TEST_COMPARE_STRING (str (_NL_TELEPHONE_INT_PREFIX),
+ str_utf8 (_NL_TELEPHONE_INT_PREFIX));
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_TELEPHONE_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_TELEPHONE_CODESET), "UTF-8");
+
+ /* LC_MEASUREMENT. */
+
+ TEST_COMPARE (byte (_NL_MEASUREMENT_MEASUREMENT),
+ byte_utf8 (_NL_MEASUREMENT_MEASUREMENT));
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_MEASUREMENT_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_MEASUREMENT_CODESET), "UTF-8");
+
+ /* LC_IDENTIFICATION is skipped since C.UTF-8 is distinct from C. */
+
+ /* _NL_IDENTIFICATION_CATEGORY cannot be tested because it is a
+ string array. */
+ /* Expected difference. */
+ TEST_COMPARE_STRING (str (_NL_IDENTIFICATION_CODESET), "ANSI_X3.4-1968");
+ TEST_COMPARE_STRING (str_utf8 (_NL_IDENTIFICATION_CODESET), "UTF-8");
+}
+
+static int
+do_test (void)
+{
+ puts ("info: using setlocale and nl_langinfo");
+ one_pass ();
+
+ puts ("info: using nl_langinfo_l");
+
+ c_utf8 = newlocale (LC_ALL_MASK, "C.UTF-8", (locale_t) 0);
+ TEST_VERIFY_EXIT (c_utf8 != (locale_t) 0);
+
+ switch_to_c ();
+ use_nl_langinfo_l = true;
+ one_pass ();
+
+ freelocale (c_utf8);
+
+ return 0;
+}
+
+#include <support/test-driver.c>
--
2.31.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX.
2022-01-31 5:34 ` [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX Carlos O'Donell
@ 2022-01-31 8:47 ` Andreas Schwab
2022-01-31 16:07 ` Carlos O'Donell
2022-02-01 12:05 ` Florian Weimer
1 sibling, 1 reply; 15+ messages in thread
From: Andreas Schwab @ 2022-01-31 8:47 UTC (permalink / raw)
To: Carlos O'Donell via Libc-alpha
Cc: fweimer, michael.hudson, Carlos O'Donell
On Jan 31 2022, Carlos O'Donell via Libc-alpha wrote:
> +%
> +% This field is consciously aligned with ISO 30112 and the C/POSIX locale.
> +week 7;19971130;4
The copy of ISO 30112 that I could find says 7;19971201;4.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-01-31 5:34 ` [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point Carlos O'Donell
@ 2022-01-31 15:26 ` Florian Weimer
2022-01-31 16:09 ` Andreas Schwab
2022-02-01 11:47 ` Florian Weimer
1 sibling, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2022-01-31 15:26 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: libc-alpha, michael.hudson
* Carlos O'Donell:
> diff --git a/locale/programs/ld-monetary.c b/locale/programs/ld-monetary.c
> index 277b9ff042..3b0412b405 100644
> --- a/locale/programs/ld-monetary.c
> +++ b/locale/programs/ld-monetary.c
> @@ -207,7 +207,6 @@ No definition for %s category found"), "LC_MONETARY");
>
> TEST_ELEM (int_curr_symbol, "");
> TEST_ELEM (currency_symbol, "");
> - TEST_ELEM (mon_decimal_point, ".");
> TEST_ELEM (mon_thousands_sep, "");
> TEST_ELEM (positive_sign, "");
> TEST_ELEM (negative_sign, "");
> @@ -257,6 +256,7 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
> record_error (0, 0, _("%s: field `%s' not defined"),
> "LC_MONETARY", "mon_decimal_point");
> monetary->mon_decimal_point = ".";
> + monetary->mon_decimal_point_wc = L'.';
> }
> else if (monetary->mon_decimal_point[0] == '\0' && ! be_quiet && ! nothing)
> {
> @@ -264,8 +264,6 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
> %s: value for field `%s' must not be an empty string"),
> "LC_MONETARY", "mon_decimal_point");
> }
> - if (monetary->mon_decimal_point_wc == L'\0')
> - monetary->mon_decimal_point_wc = L'.';
>
> if (monetary->mon_grouping_len == 0)
> {
There's an existing comment
/* The decimal point must not be empty. This is not said explicitly
in POSIX but ANSI C (ISO/IEC 9899) says in 4.4.2.1 it has to be
!= "". */
that says that empty strings/null characters are invalid.
The comment was clearly copied from locale/programs/ld-numeric.c.
*However* we have got this code in stdio-common/printf_fp.c:
decimal = _nl_lookup (loc, LC_MONETARY, MON_DECIMAL_POINT);
if (*decimal == '\0')
decimal = _nl_lookup (loc, LC_NUMERIC, DECIMAL_POINT);
decimalwc = _nl_lookup_word (loc, LC_MONETARY,
_NL_MONETARY_DECIMAL_POINT_WC);
if (decimalwc == L'\0')
decimalwc = _nl_lookup_word (loc, LC_NUMERIC,
_NL_NUMERIC_DECIMAL_POINT_WC);
So we use LC_NUMERIC as the fallback, and our strfmon implementation is
okay with it. But our localeconv implementation lacks this fallback,
which looks like a bug because the built-in C locale uses an empty
string/a null character.
Still I think simplifying the locale data is the right direction here.
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX.
2022-01-31 8:47 ` Andreas Schwab
@ 2022-01-31 16:07 ` Carlos O'Donell
0 siblings, 0 replies; 15+ messages in thread
From: Carlos O'Donell @ 2022-01-31 16:07 UTC (permalink / raw)
To: Andreas Schwab, Carlos O'Donell via Libc-alpha
Cc: fweimer, michael.hudson
On 1/31/22 03:47, Andreas Schwab wrote:
> On Jan 31 2022, Carlos O'Donell via Libc-alpha wrote:
>
>> +%
>> +% This field is consciously aligned with ISO 30112 and the C/POSIX locale.
>> +week 7;19971130;4
>
> The copy of ISO 30112 that I could find says 7;19971201;4.
>
First and foremost we try to align with C/POSIX builtin locale.
Aligning with C/POSIX ensures that application developers can seamlessly
change from "C" to "C.UTF-8" without breaking existing tests.
e.g. locale/C-time.c
135 { .string = "\7" },
136 { .word = 19971130 },
137 { .string = "\4" },
I'm using ISO 30112 WD12 (2018-02-12), which I have access to as part
of my SC22 involvement. This version would go on to become the final
published standard 2020-09, though I don't yet have a copy of this
version.
Section 4.7 "LC_TIME" under week:
~~~
If the keyword is not specified the values are taken as 7,
19971130 (a Sunday), and 7 (Saturday), respectively.
ISO 8601 conforming applications should use the values
7, 19971201 (a Monday), and 4 (Thursday),
respectively.
~~~
This matches the ISO 30112 WD10 [2014] that was used when creating
the defaults in ld-time.c:
482 /* Set up defaults based on ISO 30112 WD10 [2014]. */
483 if (time->week_ndays == 0)
484 time->week_ndays = 7;
485
486 if (time->week_1stday == 0)
487 time->week_1stday = 19971130;
488
489 if (time->week_1stweek == 0)
490 time->week_1stweek = 7;
This also matches the legacy withdrawn ISO/IEC 14652:2002.
I could change the comment to read:
% This field is consciously aligned with the builtin C/POSIX locale.
Would a new comment resolve your review?
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-01-31 15:26 ` Florian Weimer
@ 2022-01-31 16:09 ` Andreas Schwab
2022-01-31 16:20 ` Florian Weimer
0 siblings, 1 reply; 15+ messages in thread
From: Andreas Schwab @ 2022-01-31 16:09 UTC (permalink / raw)
To: Florian Weimer via Libc-alpha; +Cc: Carlos O'Donell, Florian Weimer
On Jan 31 2022, Florian Weimer via Libc-alpha wrote:
> There's an existing comment
>
> /* The decimal point must not be empty. This is not said explicitly
> in POSIX but ANSI C (ISO/IEC 9899) says in 4.4.2.1 it has to be
> != "". */
>
> that says that empty strings/null characters are invalid.
This is only about decimal_point, mon_decimal_point can be empty.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-01-31 16:09 ` Andreas Schwab
@ 2022-01-31 16:20 ` Florian Weimer
2022-01-31 16:30 ` Andreas Schwab
0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2022-01-31 16:20 UTC (permalink / raw)
To: Andreas Schwab; +Cc: Florian Weimer via Libc-alpha, Carlos O'Donell
* Andreas Schwab:
> On Jan 31 2022, Florian Weimer via Libc-alpha wrote:
>
>> There's an existing comment
>>
>> /* The decimal point must not be empty. This is not said explicitly
>> in POSIX but ANSI C (ISO/IEC 9899) says in 4.4.2.1 it has to be
>> != "". */
>>
>> that says that empty strings/null characters are invalid.
>
> This is only about decimal_point, mon_decimal_point can be empty.
Hmm, I'll take your word for it.
So the comment should definitely go, and the Carlos' change is the right
way to do it?
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-01-31 16:20 ` Florian Weimer
@ 2022-01-31 16:30 ` Andreas Schwab
2022-01-31 16:37 ` Florian Weimer
0 siblings, 1 reply; 15+ messages in thread
From: Andreas Schwab @ 2022-01-31 16:30 UTC (permalink / raw)
To: Florian Weimer; +Cc: Florian Weimer via Libc-alpha, Carlos O'Donell
On Jan 31 2022, Florian Weimer wrote:
> * Andreas Schwab:
>
>> On Jan 31 2022, Florian Weimer via Libc-alpha wrote:
>>
>>> There's an existing comment
>>>
>>> /* The decimal point must not be empty. This is not said explicitly
>>> in POSIX but ANSI C (ISO/IEC 9899) says in 4.4.2.1 it has to be
>>> != "". */
>>>
>>> that says that empty strings/null characters are invalid.
>>
>> This is only about decimal_point, mon_decimal_point can be empty.
>
> Hmm, I'll take your word for it.
See 7.11.2.1, paragraph 3 and 10.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-01-31 16:30 ` Andreas Schwab
@ 2022-01-31 16:37 ` Florian Weimer
0 siblings, 0 replies; 15+ messages in thread
From: Florian Weimer @ 2022-01-31 16:37 UTC (permalink / raw)
To: Andreas Schwab; +Cc: Florian Weimer via Libc-alpha
* Andreas Schwab:
> On Jan 31 2022, Florian Weimer wrote:
>
>> * Andreas Schwab:
>>
>>> On Jan 31 2022, Florian Weimer via Libc-alpha wrote:
>>>
>>>> There's an existing comment
>>>>
>>>> /* The decimal point must not be empty. This is not said explicitly
>>>> in POSIX but ANSI C (ISO/IEC 9899) says in 4.4.2.1 it has to be
>>>> != "". */
>>>>
>>>> that says that empty strings/null characters are invalid.
>>>
>>> This is only about decimal_point, mon_decimal_point can be empty.
>>
>> Hmm, I'll take your word for it.
>
> See 7.11.2.1, paragraph 3 and 10.
That is fairly conclusive indeed (numbers match C11).
Are you okay with Carlos' patch with a comment update?
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-01-31 5:34 ` [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point Carlos O'Donell
2022-01-31 15:26 ` Florian Weimer
@ 2022-02-01 11:47 ` Florian Weimer
2022-02-01 16:00 ` Carlos O'Donell
1 sibling, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2022-02-01 11:47 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: libc-alpha, michael.hudson
* Carlos O'Donell:
> The handling of mon_decimal_point is incorrect when it comes to
> handling the empty "" value. The existing parser in monetary_read()
> will correctly handle setting the non-wide-character value and the
> wide-character value e.g. STR_ELEM_WC(mon_decimal_point) if they are
> set in the locale definition. However, in monetary_finish() we have
> conflicting TEST_ELEM() which sets a default value (if the locale
> definition doesn't include one), and subsequent code which looks for
> mon_decimal_point to be NULL to issue a specific error message and set
> the defaults. The latter is unused because TEST_ELEM() always sets a
> default. The simplest solution is to remove the TEST_ELEM() check,
> and allow the existing check to look to see if mon_decimal_point is
> NULL and set an appropriate default. The final fix is to move the
> setting of mon_decimal_point_wc so it occurs only when
> mon_decimal_point is being set to a default, keeping both values
> consistent. There is no way to tell the difference between
> mon_decimal_point_wc having been set to the empty string and not
> having been defined at all, for that distinction we must use
> mon_decimal_point being NULL or "", and so we must logically set
> the default together with mon_decimal_point.
>
> Lastly, there are more fixes similar to this that could be made to
> ld-monetary.c, but we avoid that in order to fix just the code
> required for mon_decimal_point, which impacts the ability for C.UTF-8
> to set mon_decimal_point to "", since without this fix we end up with
> an inconsistent setting of mon_decimal_point set to "", but
> mon_decimal_point_wc set to "." which is incorrect.
>
> Tested on x86_64 and i686 without regression.
> ---
> locale/programs/ld-monetary.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/locale/programs/ld-monetary.c b/locale/programs/ld-monetary.c
> index 277b9ff042..3b0412b405 100644
> --- a/locale/programs/ld-monetary.c
> +++ b/locale/programs/ld-monetary.c
> @@ -207,7 +207,6 @@ No definition for %s category found"), "LC_MONETARY");
>
> TEST_ELEM (int_curr_symbol, "");
> TEST_ELEM (currency_symbol, "");
> - TEST_ELEM (mon_decimal_point, ".");
> TEST_ELEM (mon_thousands_sep, "");
> TEST_ELEM (positive_sign, "");
> TEST_ELEM (negative_sign, "");
> @@ -257,6 +256,7 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
> record_error (0, 0, _("%s: field `%s' not defined"),
> "LC_MONETARY", "mon_decimal_point");
> monetary->mon_decimal_point = ".";
> + monetary->mon_decimal_point_wc = L'.';
> }
> else if (monetary->mon_decimal_point[0] == '\0' && ! be_quiet && ! nothing)
> {
> @@ -264,8 +264,6 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
> %s: value for field `%s' must not be an empty string"),
> "LC_MONETARY", "mon_decimal_point");
> }
> - if (monetary->mon_decimal_point_wc == L'\0')
> - monetary->mon_decimal_point_wc = L'.';
>
> if (monetary->mon_grouping_len == 0)
> {
I have verified that this does not change the localedef output for the
existing locales created by install-locale-files.
I think we need further cleanups in the comments and checks (which were
coped from LC_NUMERIC, but should not apply to LC_MONETARY). But I
think we can release with this version.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX.
2022-01-31 5:34 ` [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX Carlos O'Donell
2022-01-31 8:47 ` Andreas Schwab
@ 2022-02-01 12:05 ` Florian Weimer
2022-02-01 16:13 ` Carlos O'Donell
1 sibling, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2022-02-01 12:05 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: libc-alpha, michael.hudson
* Carlos O'Donell:
> We have had one downstream report from Canonical [1] that
> an rrdtool test was broken by the differences in LC_TIME
> that we had in the non-builtin C locale (C.UTF-8). If one
> application has an issue there are going to be others, and
> so with this commit we review and fix all the issues that
> cause the builtin C locale to be different from C.UTF-8,
> which includes:
> * mon_decimal_point should be empty e.g. ""
> - Depends on mon_decimal_point_wc fix.
> * negative_sign should be empty e.g. ""
> * week should be aligned with ISO 30112 default e.g. 7;19971130;4
> * d_fmt corrected with escaped slashes e.g. "%m//%d//%y"
> * yesstr and nostr should be empty e.g. ""
> * country_ab2 and country_ab3 should be empty e.g. ""
>
> We bump LC_IDENTIFICATION version and adjust the date to
> indicate the change in the locale.
>
> A new tst-c-utf8-consistency test is added to ensure
> consistency between C/POSIX and C.UTF-8.
>
> Tested on x86_64 and i686 without regression.
>
> [1] https://sourceware.org/pipermail/libc-alpha/2022-January/135703.html
>
> Co-authored-by: Florian Weimer <fweimer@redhat.com>
> ---
> localedata/Makefile | 30 +-
> localedata/locales/C | 22 +-
> localedata/tst-c-utf8-consistency.c | 539 ++++++++++++++++++++++++++++
> 3 files changed, 578 insertions(+), 13 deletions(-)
> create mode 100644 localedata/tst-c-utf8-consistency.c
This looks broadly okay to me. Dropping the ISO standard reference
seems prudent if we can't check its contents.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Thanks,
Florian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-02-01 11:47 ` Florian Weimer
@ 2022-02-01 16:00 ` Carlos O'Donell
2022-02-01 16:14 ` Carlos O'Donell
0 siblings, 1 reply; 15+ messages in thread
From: Carlos O'Donell @ 2022-02-01 16:00 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha, michael.hudson
On 2/1/22 06:47, Florian Weimer wrote:
> * Carlos O'Donell:
>
>> The handling of mon_decimal_point is incorrect when it comes to
>> handling the empty "" value. The existing parser in monetary_read()
>> will correctly handle setting the non-wide-character value and the
>> wide-character value e.g. STR_ELEM_WC(mon_decimal_point) if they are
>> set in the locale definition. However, in monetary_finish() we have
>> conflicting TEST_ELEM() which sets a default value (if the locale
>> definition doesn't include one), and subsequent code which looks for
>> mon_decimal_point to be NULL to issue a specific error message and set
>> the defaults. The latter is unused because TEST_ELEM() always sets a
>> default. The simplest solution is to remove the TEST_ELEM() check,
>> and allow the existing check to look to see if mon_decimal_point is
>> NULL and set an appropriate default. The final fix is to move the
>> setting of mon_decimal_point_wc so it occurs only when
>> mon_decimal_point is being set to a default, keeping both values
>> consistent. There is no way to tell the difference between
>> mon_decimal_point_wc having been set to the empty string and not
>> having been defined at all, for that distinction we must use
>> mon_decimal_point being NULL or "", and so we must logically set
>> the default together with mon_decimal_point.
>>
>> Lastly, there are more fixes similar to this that could be made to
>> ld-monetary.c, but we avoid that in order to fix just the code
>> required for mon_decimal_point, which impacts the ability for C.UTF-8
>> to set mon_decimal_point to "", since without this fix we end up with
>> an inconsistent setting of mon_decimal_point set to "", but
>> mon_decimal_point_wc set to "." which is incorrect.
>>
>> Tested on x86_64 and i686 without regression.
>> ---
>> locale/programs/ld-monetary.c | 4 +---
>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/locale/programs/ld-monetary.c b/locale/programs/ld-monetary.c
>> index 277b9ff042..3b0412b405 100644
>> --- a/locale/programs/ld-monetary.c
>> +++ b/locale/programs/ld-monetary.c
>> @@ -207,7 +207,6 @@ No definition for %s category found"), "LC_MONETARY");
>>
>> TEST_ELEM (int_curr_symbol, "");
>> TEST_ELEM (currency_symbol, "");
>> - TEST_ELEM (mon_decimal_point, ".");
>> TEST_ELEM (mon_thousands_sep, "");
>> TEST_ELEM (positive_sign, "");
>> TEST_ELEM (negative_sign, "");
>> @@ -257,6 +256,7 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
>> record_error (0, 0, _("%s: field `%s' not defined"),
>> "LC_MONETARY", "mon_decimal_point");
>> monetary->mon_decimal_point = ".";
>> + monetary->mon_decimal_point_wc = L'.';
>> }
>> else if (monetary->mon_decimal_point[0] == '\0' && ! be_quiet && ! nothing)
>> {
>> @@ -264,8 +264,6 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
>> %s: value for field `%s' must not be an empty string"),
>> "LC_MONETARY", "mon_decimal_point");
>> }
>> - if (monetary->mon_decimal_point_wc == L'\0')
>> - monetary->mon_decimal_point_wc = L'.';
>>
>> if (monetary->mon_grouping_len == 0)
>> {
>
> I have verified that this does not change the localedef output for the
> existing locales created by install-locale-files.
>
> I think we need further cleanups in the comments and checks (which were
> coped from LC_NUMERIC, but should not apply to LC_MONETARY). But I
> think we can release with this version.
I filed this bug to track that:
Bug 28845 - ld-monetary.c should be updated to match ISO C and other standards.
https://sourceware.org/bugzilla/show_bug.cgi?id=28845
Thanks for the review!
> Reviewed-by: Florian Weimer <fweimer@redhat.com>
>
> Thanks,
> Florian
>
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX.
2022-02-01 12:05 ` Florian Weimer
@ 2022-02-01 16:13 ` Carlos O'Donell
0 siblings, 0 replies; 15+ messages in thread
From: Carlos O'Donell @ 2022-02-01 16:13 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha, michael.hudson
On 2/1/22 07:05, Florian Weimer wrote:
> * Carlos O'Donell:
>
>> We have had one downstream report from Canonical [1] that
>> an rrdtool test was broken by the differences in LC_TIME
>> that we had in the non-builtin C locale (C.UTF-8). If one
>> application has an issue there are going to be others, and
>> so with this commit we review and fix all the issues that
>> cause the builtin C locale to be different from C.UTF-8,
>> which includes:
>> * mon_decimal_point should be empty e.g. ""
>> - Depends on mon_decimal_point_wc fix.
>> * negative_sign should be empty e.g. ""
>> * week should be aligned with ISO 30112 default e.g. 7;19971130;4
>> * d_fmt corrected with escaped slashes e.g. "%m//%d//%y"
>> * yesstr and nostr should be empty e.g. ""
>> * country_ab2 and country_ab3 should be empty e.g. ""
>>
>> We bump LC_IDENTIFICATION version and adjust the date to
>> indicate the change in the locale.
>>
>> A new tst-c-utf8-consistency test is added to ensure
>> consistency between C/POSIX and C.UTF-8.
>>
>> Tested on x86_64 and i686 without regression.
>>
>> [1] https://sourceware.org/pipermail/libc-alpha/2022-January/135703.html
>>
>> Co-authored-by: Florian Weimer <fweimer@redhat.com>
>> ---
>> localedata/Makefile | 30 +-
>> localedata/locales/C | 22 +-
>> localedata/tst-c-utf8-consistency.c | 539 ++++++++++++++++++++++++++++
>> 3 files changed, 578 insertions(+), 13 deletions(-)
>> create mode 100644 localedata/tst-c-utf8-consistency.c
>
> This looks broadly okay to me. Dropping the ISO standard reference
> seems prudent if we can't check its contents.
>
> Reviewed-by: Florian Weimer <fweimer@redhat.com>
Thanks. I'll drop the ISO references from the commit message.
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point
2022-02-01 16:00 ` Carlos O'Donell
@ 2022-02-01 16:14 ` Carlos O'Donell
0 siblings, 0 replies; 15+ messages in thread
From: Carlos O'Donell @ 2022-02-01 16:14 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha, michael.hudson
On 2/1/22 11:00, Carlos O'Donell wrote:
> On 2/1/22 06:47, Florian Weimer wrote:
>> * Carlos O'Donell:
>>
>>> The handling of mon_decimal_point is incorrect when it comes to
>>> handling the empty "" value. The existing parser in monetary_read()
>>> will correctly handle setting the non-wide-character value and the
>>> wide-character value e.g. STR_ELEM_WC(mon_decimal_point) if they are
>>> set in the locale definition. However, in monetary_finish() we have
>>> conflicting TEST_ELEM() which sets a default value (if the locale
>>> definition doesn't include one), and subsequent code which looks for
>>> mon_decimal_point to be NULL to issue a specific error message and set
>>> the defaults. The latter is unused because TEST_ELEM() always sets a
>>> default. The simplest solution is to remove the TEST_ELEM() check,
>>> and allow the existing check to look to see if mon_decimal_point is
>>> NULL and set an appropriate default. The final fix is to move the
>>> setting of mon_decimal_point_wc so it occurs only when
>>> mon_decimal_point is being set to a default, keeping both values
>>> consistent. There is no way to tell the difference between
>>> mon_decimal_point_wc having been set to the empty string and not
>>> having been defined at all, for that distinction we must use
>>> mon_decimal_point being NULL or "", and so we must logically set
>>> the default together with mon_decimal_point.
>>>
>>> Lastly, there are more fixes similar to this that could be made to
>>> ld-monetary.c, but we avoid that in order to fix just the code
>>> required for mon_decimal_point, which impacts the ability for C.UTF-8
>>> to set mon_decimal_point to "", since without this fix we end up with
>>> an inconsistent setting of mon_decimal_point set to "", but
>>> mon_decimal_point_wc set to "." which is incorrect.
>>>
>>> Tested on x86_64 and i686 without regression.
>>> ---
>>> locale/programs/ld-monetary.c | 4 +---
>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>
>>> diff --git a/locale/programs/ld-monetary.c b/locale/programs/ld-monetary.c
>>> index 277b9ff042..3b0412b405 100644
>>> --- a/locale/programs/ld-monetary.c
>>> +++ b/locale/programs/ld-monetary.c
>>> @@ -207,7 +207,6 @@ No definition for %s category found"), "LC_MONETARY");
>>>
>>> TEST_ELEM (int_curr_symbol, "");
>>> TEST_ELEM (currency_symbol, "");
>>> - TEST_ELEM (mon_decimal_point, ".");
>>> TEST_ELEM (mon_thousands_sep, "");
>>> TEST_ELEM (positive_sign, "");
>>> TEST_ELEM (negative_sign, "");
>>> @@ -257,6 +256,7 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
>>> record_error (0, 0, _("%s: field `%s' not defined"),
>>> "LC_MONETARY", "mon_decimal_point");
>>> monetary->mon_decimal_point = ".";
>>> + monetary->mon_decimal_point_wc = L'.';
>>> }
>>> else if (monetary->mon_decimal_point[0] == '\0' && ! be_quiet && ! nothing)
>>> {
>>> @@ -264,8 +264,6 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"),
>>> %s: value for field `%s' must not be an empty string"),
>>> "LC_MONETARY", "mon_decimal_point");
>>> }
>>> - if (monetary->mon_decimal_point_wc == L'\0')
>>> - monetary->mon_decimal_point_wc = L'.';
>>>
>>> if (monetary->mon_grouping_len == 0)
>>> {
>>
>> I have verified that this does not change the localedef output for the
>> existing locales created by install-locale-files.
>>
>> I think we need further cleanups in the comments and checks (which were
>> coped from LC_NUMERIC, but should not apply to LC_MONETARY). But I
>> think we can release with this version.
>
> I filed this bug to track that:
> Bug 28845 - ld-monetary.c should be updated to match ISO C and other standards.
> https://sourceware.org/bugzilla/show_bug.cgi?id=28845
And I filed one more bug to track the original bug, which I'll close after push:
https://sourceware.org/bugzilla/show_bug.cgi?id=28847
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2022-02-01 16:14 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-31 5:34 [PATCH 0/2] Make C/POSIX and C.UTF-8 consistent Carlos O'Donell
2022-01-31 5:34 ` [PATCH 1/2] localedef: Fix handling of empty mon_decimal_point Carlos O'Donell
2022-01-31 15:26 ` Florian Weimer
2022-01-31 16:09 ` Andreas Schwab
2022-01-31 16:20 ` Florian Weimer
2022-01-31 16:30 ` Andreas Schwab
2022-01-31 16:37 ` Florian Weimer
2022-02-01 11:47 ` Florian Weimer
2022-02-01 16:00 ` Carlos O'Donell
2022-02-01 16:14 ` Carlos O'Donell
2022-01-31 5:34 ` [PATCH 2/2] localedata: Adjust C.UTF-8 to align with C/POSIX Carlos O'Donell
2022-01-31 8:47 ` Andreas Schwab
2022-01-31 16:07 ` Carlos O'Donell
2022-02-01 12:05 ` Florian Weimer
2022-02-01 16:13 ` Carlos O'Donell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).