public inbox for newlib@sourceware.org
From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca>
To: newlib@sourceware.org
Subject: Re: Support non-POSIX TZ strings
Date: Mon, 14 Feb 2022 10:10:23 -0700
Message-ID: <758cfb47-ac13-fb88-877e-63a1d4327429@SystematicSw.ab.ca>
In-Reply-To: <D43932B0-8505-4782-A130-984AA87FC29A@jdoubleu.de>

On 2022-02-14 06:21, jdoubleu wrote:
> Hello,
> 
> I stumbled upon an issue with some TZ strings not handled as expected by newlib's tzset() function.
> The tzset function expects the string stored in the TZ environment variable to follow the POSIX format as described here: https://sourceware.org/newlib/libc.html#tzset (or https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html).
> 
> However, the glibc implementation extends the format and additionally allows '<[+|-]hh[:mm[:ss]]>' in the format (compare https://www.man7.org/linux/man-pages/man3/tzset.3.html). It seems like the timezone database (zoneinfo) provided by the IANA (https://www.iana.org/time-zones) adopted that format; or at least the zic compiler generates these strings in the zoneinfo files for most systems.
> 
> That leads to the timezone for "America/Argentina/Buenos_Aires" being "<-03>3", as can be seen in this dump https://raw.githubusercontent.com/nayarsystems/posix_tz_db/master/zones.csv or on a Linux system: `tail -n 1 /usr/share/zoneinfo/America/Argentina/Buenos_Aires`.
> 
> Some more background information can be found here https://github.com/esp8266/Arduino/issues/8423 and here https://github.com/esp8266/Arduino/issues/7690.
> 
> One way to approach this is for the user to just replace the incompatible part of the string with a valid timezone identifier, as proposed by https://github.com/esp8266/Arduino/pull/7699.
> Since the timezone identifier (e.g. `PST`, `PDT`, `CET`, …) is not really used elsewhere by newlib, this should not be a problem, as far as I can imagine.
> 
> On the other hand, some ports implemented a proper parsing: https://github.com/earlephilhower/newlib-xtensa/pull/14.
> 
> Now my question is whether the extended format should be supported by newlib. Is this desired behaviour, and would you accept code contributions for that matter?

Not sure what point you are trying to make and your terminology is 
non-standard, but we should start with the actual POSIX spec under TZ:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03 


and the current implementation:

https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c

which does not handle "<" ">" quoted POSIX +/-numeric time zone 
*abbreviations*, now common in the TZ database.
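
As a rough sketch of the missing piece (not the newlib or tzcode code), the
abbreviation field could be parsed along the lines below. The helper name
parse_tz_abbr and its signature are made up for illustration. It assumes
the POSIX rules that an unquoted abbreviation is alphabetic, a quoted one
may contain alphanumerics plus '+' and '-', and either is at least three
characters long.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of newlib: parse one POSIX TZ abbreviation
   starting at s, copy it (without the angle brackets) into name, and return
   a pointer to the character following it, or NULL if it is malformed or
   does not fit in name.  */
static const char *
parse_tz_abbr (const char *s, char *name, size_t size)
{
  size_t n = 0;

  if (*s == '<')
    {
      /* Quoted form: alphanumerics plus '+' and '-' up to the closing '>'.  */
      for (++s; *s != '>'; ++s)
        {
          /* This test also rejects the terminating NUL (missing '>').  */
          if (!isalnum ((unsigned char) *s) && *s != '+' && *s != '-')
            return NULL;
          if (n + 1 >= size)
            return NULL;
          name[n++] = *s;
        }
      ++s;  /* Skip the closing '>'.  */
    }
  else
    {
      /* Unquoted form: alphabetic characters only.  */
      for (; isalpha ((unsigned char) *s); ++s)
        {
          if (n + 1 >= size)
            return NULL;
          name[n++] = *s;
        }
    }

  if (n < 3)  /* POSIX requires at least three characters.  */
    return NULL;

  name[n] = '\0';
  return s;
}
```

Given "<-03>3", this stores "-03" in name and returns a pointer to the
offset "3", which the existing offset parsing could then consume.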

The BSD or TZcode implementations could probably be adapted to update 
newlib tzset to avoid reinvention e.g.

	https://github.com/eggert/tz/blob/main/newtzset.3
	https://github.com/eggert/tz/blob/main/localtime.c#L1081
through
	https://github.com/eggert/tz/blob/main/localtime.c#L1400

[The original (American) English language time zone abbreviations were 
often made up by the (American) TZ database maintainers and mailing list 
users. They were never used or published in the locales concerned, only 
by (American) English speakers and mailing list users: e.g. Germany used 
German language time zone abbreviations like MEZ/MESZ, not MET, and 
similarly for other European countries (see CLDR time zone abbreviations).

These made-up (American) English language time zone abbreviations were 
tracked down and replaced with numeric ones by the current TZ database 
maintainers after the POSIX spec was expanded, but none are considered 
canonical, and CLDR locale time zone abbreviations, as supported by ICU, 
are preferred (see announcements on the home page https://unicode.org/).

ICU4X (https://github.com/unicode-org/icu4x) is being developed to 
support "resource constrained" environments, but its language bindings 
include Rust, Objective-C, and C++, so whether it will be usable with 
embedded libraries such as newlib, musl, uclibc, dietlibc, or picolibc 
might best be ascertained by starting a discussion, as encouraged on 
the project site.]

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]

Thread overview: 7+ messages
2022-02-14 13:21 jdoubleu
2022-02-14 17:10 ` Brian Inglis [this message]
2022-02-14 19:58   ` jdoubleu
2022-02-14 20:45     ` Brian Inglis
2022-02-14 21:33       ` Jeff Johnston
2022-02-15 22:02         ` Brian Inglis
2022-02-15 22:36     ` Brian Inglis
