From: Carlos O'Donell <carlos@redhat.com>
To: Michael Hudson-Doyle <michael.hudson@canonical.com>
Cc: libc-alpha@sourceware.org, Florian Weimer <fweimer@redhat.com>
Subject: Re: [PATCH v12 2/2] Add generic C.UTF-8 locale (Bug 17318)
Date: Fri, 28 Jan 2022 11:42:52 -0500
Message-ID: <a2ff9a8e-60cf-b791-80a3-6ef145c608ad@redhat.com>
In-Reply-To: <CAJ8wqtftKJL2veMSeQRri+5tqmSMFY5VSunAkmu0dupToG7REQ@mail.gmail.com>
On 1/25/22 21:44, Michael Hudson-Doyle wrote:
> On Tue, 7 Sept 2021 at 03:45, Carlos O'Donell via Libc-alpha <
> libc-alpha@sourceware.org> wrote:
>
>> diff --git a/localedata/locales/C b/localedata/locales/C
>> new file mode 100644
>> index 0000000000..ca801c79cf
>> --- /dev/null
>> +++ b/localedata/locales/C
>
>
> [...]
>
>
>>
>>
>> +LC_TIME
>> +% This is the POSIX Locale definition for the LC_TIME category with the
>> +% exception that time is per ISO 8601 and 24-hour.
>> +%
>> +% Abbreviated weekday names (%a)
>> +abday "Sun";"Mon";"Tue";"Wed";"Thu";"Fri";"Sat"
>> +
>> +% Full weekday names (%A)
>> +day "Sunday";"Monday";"Tuesday";"Wednesday";"Thursday";/
>> + "Friday";"Saturday"
>> +
>> +% Abbreviated month names (%b)
>> +abmon "Jan";"Feb";"Mar";"Apr";"May";"Jun";"Jul";"Aug";"Sep";/
>> + "Oct";"Nov";"Dec"
>> +
>> +% Full month names (%B)
>> +mon "January";"February";"March";"April";"May";"June";"July";/
>> + "August";"September";"October";"November";"December"
>> +
>> +% Week description, consists of three fields:
>> +% 1. Number of days in a week.
>> +% 2. Gregorian date that is a first weekday (19971130 for Sunday, 19971201 for Monday).
>> +% 3. The weekday number to be contained in the first week of the year.
>> +%
>> +% ISO 8601 conforming applications should use the values 7, 19971201 (a
>> +% Monday), and 4 (Thursday), respectively.
>> +week 7;19971201;4
>>
>
> It's obviously a bit late, but this is a difference from the Debian/Ubuntu
It is never too late! Thank you for raising this.
Given that you've had problems with one application, other applications will have problems too.
I think we should probably keep C == C.UTF-8 and not change any of the existing LC_TIME properties.
> C.UTF-8 locale, which has:
>
> week 7;19971130;4
This is the default value from ISO 30112.
This data matches the internal C/POSIX locale.
e.g.
{ .string = "\7" },
7 days in the week.
{ .word = 19971130 },
Week start Sunday. This matches the ISO 30112 default when week is not specified.
{ .string = "\4" },
And Thursday needs to be included in the week for it to be considered a "first week."
{ .string = "\1" },
{ .string = "\2" },
And ld-time.c follows defaults from ISO 30112 also.
  /* Set up defaults based on ISO 30112 WD10 [2014]. */
  if (time->week_ndays == 0)
    time->week_ndays = 7;

  if (time->week_1stday == 0)
    time->week_1stday = 19971130;

  if (time->week_1stweek == 0)
    time->week_1stweek = 7;
> (confusingly, this is preceded by this comment:
>
> % ISO 8601 conforming applications should use the values 7, 19971130 (a
> % Monday), and 4 (Thursday), respectively.
>
> but 19971130 is a Sunday).
The above comment is wrong, as you note: 19971130 is a Sunday.
The verbatim comment from the ISO 30112 standard is:
~~~
ISO 8601 conforming applications should use the values 7, 19971201 (a
Monday), and 4 (Thursday), respectively.
~~~
Note the corrected YYYYMMDD value, i.e. 19971201.
In our upstream C.UTF-8 locale we are consciously aligning with ISO 8601 in more cases.
% Week description, consists of three fields:
% 1. Number of days in a week.
% 2. Gregorian date that is a first weekday (19971130 for Sunday, 19971201 for Monday).
% 3. The weekday number to be contained in the first week of the year.
%
% ISO 8601 conforming applications should use the values 7, 19971201 (a
% Monday), and 4 (Thursday), respectively.
week 7;19971201;4
first_weekday 1
first_workday 2
So C and C.UTF-8 differ: they have a different first weekday.
> The locale(5) page from the man-pages project also says:
>
> "For compatibility reasons, all glibc locales should set the value of the
> second week list item to 19971130 (Sunday) and base the abday and day lists
> appropriately,".
This is to align with ISO 30112, which is an older standard.
> I found this because it breaks a test of rrdtool (which is probably buggy!
> It sets LC_TIME but needs to clear LC_ALL for that to take any effect) and
> I just wanted to check that this was truly the intended value before (even
> if only just) the release.
In this case for C.UTF-8 we have aligned week with ISO 8601.
There are other parts of C.UTF-8's LC_TIME which are not aligned with ISO 8601.
However, this choice is arguably inconsistent with the intent of C.UTF-8, so I think it is
actually a bug. Florian also found a real bug in d_fmt (it needs double slashes).
I'm going to post a patch to fix this and make it consistent with C.
--
Cheers,
Carlos.