public inbox for libc-locales@sourceware.org
From: "dwmw2 at infradead dot org" <sourceware-bugzilla@sourceware.org>
To: libc-locales@sourceware.org
Subject: [Bug localedata/4628] Provide rump locales with ISO 8601 variants for use with LC_TIME
Date: Fri, 07 Jan 2022 10:58:07 +0000
Message-ID: <bug-4628-716-Dx07yHfhAW@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-4628-716@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=4628

--- Comment #24 from David Woodhouse <dwmw2 at infradead dot org> ---
Thanks, Carlos.

(In reply to Carlos O'Donell from comment #22) 
> I feel that ISO 8601 is a red-herring here, and the vast majority of users
> are simply talking about d_fmt and the default becoming YYYY-MM-DD for their
> particular use case.

Agreed.

> There are three positions that can be taken on default value of d_fmt:

It seems to me that these positions are retroactively (re)defining the very
semantics of d_fmt, and the purposes for which e.g. strftime("%x") should be
used. Yet applications have been using it in its current form for decades, and
it has been received wisdom that "well-behaved" applications will use the
locale format for the date.

We can change exactly which date format applications get when they use
d_fmt; that is precisely the *point* of it, after all. But I'd suggest that it
doesn't make much sense to talk about when/whether applications should use it
at all. That ship has sailed.

> (a) A locale represents the information for an application to use to present
> such information to someone local in that locale. Thus d_fmt should be used
> when exchanging information locally (whatever that means). An application
> may make a choice that in the context of the application's use of the data
> choose to present the data in a more interoperable unambiguous format, a
> 21st century way, e.g. YYYY-MM-DD for d_fmt.

It took me a while to parse the difference between (a) and (b) here, and the
distinction between "for local only viewing" in (a) vs. "default" in (b).

I think the final sentence of (a) should say that an application may explicitly
use e.g. YYYY-MM-DD (%Y-%m-%d) *instead* of using d_fmt (%x).

Of course, in today's interconnected world the only application that can know
its output will only be consumed locally is Snapchat. Anything that outputs
text which can be shared via email/screenshots/files/databases would need to
eschew the legacy d_fmt and explicitly use %Y-%m-%d instead.

On top of which, even if an application *could* know that it's displaying only
to a local user, the archaic dd/mm/yy form of en_GB is *still* the wrong thing
to display. The world has changed, with computers now being considered "broken"
if they cannot instantly communicate with others on a different continent, and
the legacy dd/mm form is a poor choice even for *local* communication because
users are inundated with dates in the 'wrong' mm/dd form, which makes any such
date ambiguous.

Applications should just use %Y-%m-%d everywhere, unconditionally.

So if we choose (a) then we should change the strftime(3) man page to make it
clear that the %x format is deprecated and should never be used, much like the
warning we already have on %D. And we should patch all the applications in the
world, something like this...

-    strftime(buf, sizeof(buf), _("The date is %x"), tm);
+    strftime(buf, sizeof(buf), _("The date is %Y-%m-%d"), tm);
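
To make the difference concrete, here is a minimal sketch that puts the two calls side by side. It uses only standard strftime(3); the helper names (format_locale_date, format_iso_date, locale_date_is_iso, make_date) are invented for illustration and do not come from any existing application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Format a date with the locale's d_fmt (%x). The result depends on
 * whichever LC_TIME locale is in effect when this is called. */
size_t format_locale_date(char *buf, size_t len, const struct tm *tm)
{
    assert(buf != NULL && tm != NULL);
    return strftime(buf, len, "%x", tm);
}

/* Format a date as YYYY-MM-DD, independent of the locale. */
size_t format_iso_date(char *buf, size_t len, const struct tm *tm)
{
    assert(buf != NULL && tm != NULL);
    return strftime(buf, len, "%Y-%m-%d", tm);
}

/* Return 1 if the current locale's %x already yields YYYY-MM-DD
 * for this date, 0 otherwise (including on formatting failure). */
int locale_date_is_iso(const struct tm *tm)
{
    char from_locale[64];
    char iso[64];

    if (format_locale_date(from_locale, sizeof from_locale, tm) == 0)
        return 0;
    if (format_iso_date(iso, sizeof iso, tm) == 0)
        return 0;
    return strcmp(from_locale, iso) == 0;
}

/* Hypothetical convenience helper: build a struct tm for a calendar
 * date. Only the fields needed by %x and %Y-%m-%d are filled in. */
struct tm make_date(int year, int month, int day)
{
    struct tm tm = {0};

    tm.tm_year = year - 1900;  /* tm_year counts from 1900 */
    tm.tm_mon = month - 1;     /* tm_mon is zero-based */
    tm.tm_mday = day;
    return tm;
}
```

Under choice (a) every caller would have to move from the first function to the second; under choice (b) callers keep using %x and locale_date_is_iso() simply starts returning 1 once the locale data is updated.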


> (b) A locale represents the default way information should be presented, and
> in the 21st century, this means d_fmt should be YYYY-MM-DD.

This is my understanding of the current position. Well-behaved applications
*should* use d_fmt, and it *should* do something appropriate based on which
country, and which millennium, the user is living in.

Instead of deprecating '%x' and declaring that decent applications will change
to manually use %Y-%m-%d, which is the logical conclusion of choice (a), choice
(b) would simply use the existing flexibility to make existing applications do
the right thing seamlessly. It seems like the better choice to me.

> (c) The framework should allow for easy customization of the pattern used
> for d_fmt, without losing the language-specific localization in the rest of
> LC_TIME.

I don't even know that this needs to be selectable at run time. A build-time
choice would be perfectly sufficient, wouldn't it?

Let's switch viewpoints and look at the user/application experience rather than
the implementation side.

Let's also take a slightly less contentious example which really is just a bug.
Poland *officially* adopted the YYYY-MM-DD format in 2002, yet 'LC_TIME=pl_PL
date +%x' still seems to output the legacy 07.01.2022 format. Shouldn't that
one just be fixed in the next glibc release? Should we file a separate bug for
it?
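
For anyone who wants to check what a given locale currently defines without going through date(1), the pattern can be read directly. This is a rough sketch using standard setlocale(3) and nl_langinfo(3) with the D_FMT item; the function name d_fmt_for_locale is made up here, and a locale such as "pl_PL.UTF-8" has to be installed on the system for the lookup to succeed.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Copy the d_fmt pattern of the named locale into out.
 * Returns 0 on success, -1 if the locale cannot be selected or the
 * pattern does not fit. The previous LC_TIME setting is restored
 * before returning. Not thread-safe: setlocale() is process-global. */
int d_fmt_for_locale(const char *name, char *out, size_t len)
{
    char saved[256];
    const char *current;
    int rc = -1;

    assert(name != NULL && out != NULL);

    /* Save the current LC_TIME name; the returned string may be
     * overwritten by the next setlocale() call, so copy it. */
    current = setlocale(LC_TIME, NULL);
    if (current == NULL || strlen(current) >= sizeof saved)
        return -1;
    strcpy(saved, current);

    if (setlocale(LC_TIME, name) != NULL) {
        const char *fmt = nl_langinfo(D_FMT);

        if (strlen(fmt) < len) {
            strcpy(out, fmt);
            rc = 0;
        }
    }

    /* Put things back the way we found them. */
    setlocale(LC_TIME, saved);
    return rc;
}
```

Calling d_fmt_for_locale("pl_PL.UTF-8", buf, sizeof buf) and printing buf shows the raw pattern that strftime("%x") will expand, which is the value a fix to the pl_PL locale source would change.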

A *distribution* might then want to revert that change, perhaps if shipping the
new glibc as an update to an existing system (to avoid breaking some
admittedly-already-broken screenscraping scripts). So making it a build-time
option would be useful.

But the next major release of the distribution would probably just use the
"new" post-2002 d_fmt for pl_PL.

For the individual user, the experience would just be that in a new version,
the behaviour gets updated. And if they *really* object to the fixed behaviour,
they still have the (runtime) option of building their own locale to regress it
for themselves. It's not trivial, but it doesn't *need* to be.

And if we can do it for pl_PL (which we absolutely should), then why shouldn't
we do the same for *all* locales? Because this far into the 21st century,
%Y-%m-%d is the only sane way to represent numeric dates.

> I see maybe another way forward:
> 
> (4) New pattern with override selection.
> 
> Following the Mozilla and ECMA Script recommendations it might be more
> possible to define variants of d_fmt, and allow users to pick a variant e.g.
> 
> In environment:
> LC_TIME=en_US,d_fmt={iso8601}
> 
> In locale sources:
> d_fmt            "%m//%d//%Y"
> % Variant {iso8601} pattern for d_fmt.
> d_fmt_iso8601    "%Y-%m-%d"

If you do proceed with runtime options, please ensure that the *default* is
YYYY-MM-DD and that the suffix is required to go back to the legacy form. Or
define *both* 'iso8601' and 'legacy' suffixes and allow the system default to
be configured at build time (and *that* should default to iso8601).
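
Purely as an illustration of the proposed environment syntax, the following sketch splits a value such as "en_US,d_fmt={iso8601}" into a locale name and a variant name. Everything here is hypothetical: nothing in glibc parses this syntax today, the function name parse_lc_time_variant is invented, and the exact grammar is an assumption based on the single example quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser for "LOCALE,d_fmt={VARIANT}".
 * On success returns 0, stores LOCALE in locale and VARIANT in
 * variant (an empty string if no variant was given).
 * Returns -1 on malformed input or if a buffer is too small. */
int parse_lc_time_variant(const char *value,
                          char *locale, size_t locale_len,
                          char *variant, size_t variant_len)
{
    static const char key[] = ",d_fmt={";
    const char *sep;
    const char *open;
    const char *close;
    size_t n;

    assert(value != NULL && locale != NULL && variant != NULL);
    assert(locale_len > 0 && variant_len > 0);

    /* Everything before ",d_fmt={" (or the whole string) is the
     * locale name. */
    sep = strstr(value, key);
    n = sep != NULL ? (size_t)(sep - value) : strlen(value);
    if (n == 0 || n >= locale_len)
        return -1;
    memcpy(locale, value, n);
    locale[n] = '\0';

    variant[0] = '\0';
    if (sep == NULL)
        return 0;  /* no variant requested; caller picks the default */

    /* The variant name runs up to a closing brace that must end
     * the string. */
    open = sep + sizeof key - 1;
    close = strchr(open, '}');
    if (close == NULL || close[1] != '\0')
        return -1;
    n = (size_t)(close - open);
    if (n == 0 || n >= variant_len)
        return -1;
    memcpy(variant, open, n);
    variant[n] = '\0';
    return 0;
}
```

The empty-variant case is where the default matters: the caller would map it to 'iso8601' or 'legacy' according to the build-time setting, which is the point argued above.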

